Most llama-index RAG examples are built around English models such as Llama, and there are relatively few walkthroughs for the major Chinese models. This post implements a llama-index RAG pipeline using the Qwen model.
llama-index pulls in a large number of dependencies, so first, here is the full pip package list from my working environment, as captured in requirements.txt (an install command follows the list):
- absl-py==1.4.0
- accelerate==0.27.2
- aiohttp==3.9.3
- aiosignal==1.3.1
- aliyun-python-sdk-core==2.13.36
- aliyun-python-sdk-kms==2.16.1
- annotated-types==0.6.0
- anyio==3.7.1
- apphub @ file:///environment/apps/apphub/dist/apphub-1.0.0.tar.gz#sha256=260f99c0de4c575b19ab913aa134877e9efd81b820b97511fc8379674643c253
- argon2-cffi==21.3.0
- argon2-cffi-bindings==21.2.0
- asgiref==3.7.2
- asttokens==2.2.1
- astunparse==1.6.3
- async-timeout==4.0.3
- attrs==23.1.0
- Babel==2.12.1
- backcall==0.2.0
- backoff==2.2.1
- bcrypt==4.1.2
- beautifulsoup4==4.12.3
- bleach==6.0.0
- boltons @ file:///croot/boltons_1677628692245/work
- brotlipy==0.7.0
- bs4==0.0.2
- build==1.1.1
- cachetools==5.3.1
- certifi @ file:///croot/certifi_1690232220950/work/certifi
- cffi @ file:///croot/cffi_1670423208954/work
- chardet==3.0.4
- charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
- chroma-hnswlib==0.7.3
- chromadb==0.4.24
- click==7.1.2
- cmake==3.25.0
- coloredlogs==15.0.1
- comm==0.1.4
- conda @ file:///croot/conda_1690494963117/work
- conda-content-trust @ file:///tmp/abs_5952f1c8-355c-4855-ad2e-538535021ba5h26t22e5/croots/recipe/conda-content-trust_1658126371814/work
- conda-libmamba-solver @ file:///croot/conda-libmamba-solver_1685032319139/work/src
- conda-package-handling @ file:///croot/conda-package-handling_1685024767917/work
- conda_package_streaming @ file:///croot/conda-package-streaming_1685019673878/work
- contourpy==1.2.0
- crcmod==1.7
- cryptography @ file:///croot/cryptography_1686613057838/work
- cycler==0.12.1
- dataclasses-json==0.6.4
- debugpy==1.6.7
- decorator==5.1.1
- defusedxml==0.7.1
- Deprecated==1.2.14
- dirtyjson==1.0.8
- distro==1.9.0
- ecdsa==0.18.0
- exceptiongroup==1.1.2
- executing==1.2.0
- fastapi==0.104.1
- fastjsonschema==2.18.0
- featurize==0.0.24
- filelock==3.9.0
- flatbuffers==23.5.26
- fonttools==4.44.0
- frozenlist==1.4.1
- fsspec==2024.2.0
- gast==0.4.0
- google-auth==2.22.0
- google-auth-oauthlib==1.0.0
- google-pasta==0.2.0
- googleapis-common-protos==1.62.0
- greenlet==3.0.3
- grpcio==1.62.0
- gunicorn==21.2.0
- h11==0.14.0
- h5py==3.9.0
- httpcore==0.17.3
- httptools==0.6.1
- httpx==0.24.1
- huggingface-hub==0.20.3
- humanfriendly==10.0
- idna==2.10
- imageio==2.32.0
- importlib-metadata==6.11.0
- importlib_resources==6.1.3
- ipykernel==6.25.0
- ipython==8.14.0
- ipython-genutils==0.2.0
- ipywidgets==8.1.2
- jedi==0.19.0
- Jinja2==3.1.2
- jmespath==0.10.0
- joblib==1.3.2
- json5==0.9.14
- jsonpatch @ file:///tmp/build/80754af9/jsonpatch_1615747632069/work
- jsonpointer==2.1
- jsonschema==4.18.6
- jsonschema-specifications==2023.7.1
- jupyter-server==1.24.0
- jupyter_client==8.3.0
- jupyter_core==5.3.1
- jupyterlab==3.2.9
- jupyterlab-pygments==0.2.2
- jupyterlab_server==2.24.0
- jupyterlab_widgets==3.0.10
- keras==2.13.1
- kiwisolver==1.4.5
- kubernetes==29.0.0
- lazy_loader==0.3
- libclang==16.0.6
- libmambapy @ file:///croot/mamba-split_1685993156657/work/libmambapy
- lit==15.0.7
- llama-index==0.10.17
- llama-index-agent-openai==0.1.5
- llama-index-cli==0.1.8
- llama-index-core==0.10.17
- llama-index-embeddings-huggingface==0.1.4
- llama-index-embeddings-openai==0.1.6
- llama-index-indices-managed-llama-cloud==0.1.3
- llama-index-legacy==0.9.48
- llama-index-llms-huggingface==0.1.3
- llama-index-llms-openai==0.1.7
- llama-index-multi-modal-llms-openai==0.1.4
- llama-index-program-openai==0.1.4
- llama-index-question-gen-openai==0.1.3
- llama-index-readers-file==0.1.8
- llama-index-readers-llama-parse==0.1.3
- llama-index-vector-stores-chroma==0.1.5
- llama-parse==0.3.8
- llamaindex-py-client==0.1.13
- Markdown==3.4.4
- MarkupSafe==2.1.2
- marshmallow==3.21.1
- matplotlib==3.8.1
- matplotlib-inline==0.1.6
- mistune==3.0.1
- mmh3==4.1.0
- monotonic==1.6
- mpmath==1.2.1
- multidict==6.0.4
- mypy-extensions==1.0.0
- nbclassic==0.2.8
- nbclient==0.8.0
- nbconvert==7.7.3
- nbformat==5.9.2
- nest-asyncio==1.6.0
- networkx==3.0
- nltk==3.8.1
- notebook==6.4.12
- numpy==1.24.1
- nvidia-cublas-cu12==12.1.3.1
- nvidia-cuda-cupti-cu12==12.1.105
- nvidia-cuda-nvrtc-cu12==12.1.105
- nvidia-cuda-runtime-cu12==12.1.105
- nvidia-cudnn-cu12==8.9.2.26
- nvidia-cufft-cu12==11.0.2.54
- nvidia-curand-cu12==10.3.2.106
- nvidia-cusolver-cu12==11.4.5.107
- nvidia-cusparse-cu12==12.1.0.106
- nvidia-nccl-cu12==2.19.3
- nvidia-nvjitlink-cu12==12.4.99
- nvidia-nvtx-cu12==12.1.105
- oauthlib==3.2.2
- onnxruntime==1.17.1
- openai==1.13.3
- opencv-python==4.8.1.78
- opentelemetry-api==1.23.0
- opentelemetry-exporter-otlp-proto-common==1.23.0
- opentelemetry-exporter-otlp-proto-grpc==1.23.0
- opentelemetry-instrumentation==0.44b0
- opentelemetry-instrumentation-asgi==0.44b0
- opentelemetry-instrumentation-fastapi==0.44b0
- opentelemetry-proto==1.23.0
- opentelemetry-sdk==1.23.0
- opentelemetry-semantic-conventions==0.44b0
- opentelemetry-util-http==0.44b0
- opt-einsum==3.3.0
- orjson==3.9.15
- oss2==2.18.1
- overrides==7.7.0
- packaging @ file:///croot/packaging_1678965309396/work
- pandas==2.1.2
- pandocfilters==1.5.0
- parso==0.8.3
- pexpect==4.8.0
- pickleshare==0.7.5
- Pillow==9.3.0
- platformdirs==3.10.0
- pluggy @ file:///tmp/build/80754af9/pluggy_1648024709248/work
- posthog==3.5.0
- prometheus-client==0.17.1
- prompt-toolkit==3.0.39
- protobuf==4.23.4
- psutil==5.9.5
- ptyprocess==0.7.0
- pulsar-client==3.4.0
- pure-eval==0.2.2
- pyasn1==0.5.0
- pyasn1-modules==0.3.0
- pycosat @ file:///croot/pycosat_1666805502580/work
- pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
- pycryptodome==3.18.0
- pydantic==2.4.2
- pydantic_core==2.10.1
- Pygments==2.15.1
- PyMuPDF==1.23.26
- PyMuPDFb==1.23.22
- pyOpenSSL @ file:///croot/pyopenssl_1677607685877/work
- pyparsing==3.1.1
- pypdf==4.1.0
- PyPika==0.48.9
- pyproject_hooks==1.0.0
- PySocks @ file:///home/builder/ci_310/pysocks_1640793678128/work
- python-dateutil==2.8.2
- python-dotenv==1.0.0
- pytz==2023.3.post1
- PyYAML==6.0.1
- pyzmq==25.1.0
- referencing==0.30.0
- regex==2023.12.25
- requests==2.31.0
- requests-oauthlib==1.3.1
- rpds-py==0.9.2
- rsa==4.9
- ruamel.yaml @ file:///croot/ruamel.yaml_1666304550667/work
- ruamel.yaml.clib @ file:///croot/ruamel.yaml.clib_1666302247304/work
- safetensors==0.4.2
- scikit-image==0.22.0
- scikit-learn==1.3.2
- scipy==1.11.3
- seaborn==0.13.0
- Send2Trash==1.8.2
- six @ file:///tmp/build/80754af9/six_1644875935023/work
- sniffio==1.3.0
- socksio==1.0.0
- soupsieve==2.4.1
- SQLAlchemy==2.0.28
- sshpubkeys==3.3.1
- stack-data==0.6.2
- starlette==0.27.0
- sympy==1.11.1
- tabulate==0.8.7
- tenacity==8.2.3
- tensorboard==2.13.0
- tensorboard-data-server==0.7.1
- tensorflow==2.13.0
- tensorflow-estimator==2.13.0
- tensorflow-io-gcs-filesystem==0.33.0
- termcolor==2.3.0
- terminado==0.17.1
- threadpoolctl==3.2.0
- tifffile==2023.9.26
- tiktoken==0.6.0
- tinycss2==1.2.1
- tokenizers==0.15.2
- tomli==2.0.1
- toolz @ file:///croot/toolz_1667464077321/work
- torch==2.2.1
- torchaudio==2.0.2+cu118
- torchvision==0.15.2+cu118
- tornado==6.3.2
- tqdm==4.66.2
- traitlets==5.9.0
- transformers==4.38.2
- triton==2.2.0
- typer==0.9.0
- typing-inspect==0.9.0
- typing_extensions==4.8.0
- tzdata==2023.3
- urllib3==1.25.11
- uvicorn==0.23.2
- uvloop==0.19.0
- watchfiles==0.21.0
- wcwidth==0.2.5
- webencodings==0.5.1
- websocket-client==1.2.1
- websockets==12.0
- Werkzeug==2.3.6
- widgetsnbextension==4.0.10
- workspace @ file:///home/featurize/work/workspace/dist/workspace-0.1.0.tar.gz#sha256=b292beb3599f79d3791771eff9dc422cc37c58c1fc8daadeafbf025a2e7ea986
- wrapt==1.15.0
- yarl==1.9.2
- zipp==3.17.0
- zstandard @ file:///croot/zstandard_1677013143055/work
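If you save the list above as requirements.txt, the environment can be reproduced in one step. This is a minimal sketch: the file name is an assumption, and the `@ file:///...` entries point at local builds, so they may need to be removed before installing on another machine.

```python
# Reproduce the captured environment (notebook cell; strip file:/// entries first)
!pip install -r requirements.txt
```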
If you only need the core packages, install them directly and download the Qwen chat model and the BGE Chinese embedding model from ModelScope (these are Jupyter notebook cells):

```python
!pip install llama-index
!pip install llama-index-llms-huggingface
!pip install llama-index-embeddings-huggingface
!pip install llama-index ipywidgets
!pip install torch

# Download the chat model and the Chinese embedding model from ModelScope
!git clone https://www.modelscope.cn/AI-ModelScope/bge-small-zh-v1.5.git
!git clone https://www.modelscope.cn/qwen/Qwen1.5-4B-Chat.git
```
Next, load Qwen1.5-4B-Chat as the LLM. The original notebook reused variable names and a prompt wrapper from a Llama-2 example; the variables are renamed here, and the wrapper is switched from Llama-2's `[INST]` format to the ChatML format that Qwen1.5 chat models actually use:

```python
import torch
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core import PromptTemplate
import os

os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

# Local path of the chat model cloned from ModelScope above
selected_model = "/home/featurize/Qwen1.5-4B-Chat"

SYSTEM_PROMPT = """You are an AI assistant that answers questions in a friendly manner, based on the given source documents. Here are some rules you always follow:
- Generate human readable output, avoid creating output with gibberish text.
- Generate only the requested output, don't include any other language before or after the requested output.
- Never say thank you, that you are happy to help, that you are an AI agent, etc. Just answer directly.
- Generate professional language typically used in business documents in North America.
- Never generate offensive or foul language.
"""

# Qwen1.5 chat models use the ChatML prompt format
query_wrapper_prompt = PromptTemplate(
    "<|im_start|>system\n" + SYSTEM_PROMPT + "<|im_end|>\n"
    "<|im_start|>user\n{query_str}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

llm = HuggingFaceLLM(
    context_window=4096,
    max_new_tokens=2048,
    # do_sample=False gives deterministic (greedy) decoding
    generate_kwargs={"do_sample": False},
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name=selected_model,
    model_name=selected_model,
    device_map="auto",
)
```
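Before wiring the model into an index, a quick sanity check confirms the weights loaded correctly (the prompt here is just an illustrative example):

```python
# Ask the raw LLM a question directly, bypassing retrieval
print(llm.complete("请用一句话介绍一下你自己。"))
```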
Then load the Chinese embedding model and register both models as the llama-index defaults:

```python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

# bge-small-zh-v1.5 is a Chinese-oriented embedding model
embed_model = HuggingFaceEmbedding(model_name="/home/featurize/bge-small-zh-v1.5")

# Make these the global defaults for all llama-index components
Settings.llm = llm
Settings.embed_model = embed_model
```
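Chunking defaults can also be set globally; for dense Chinese text a smaller chunk size is often worth trying (the values below are illustrative, not tuned):

```python
# Optional: control how documents are split before embedding
Settings.chunk_size = 512
Settings.chunk_overlap = 50
```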
Finally, load the documents, build a vector index over them, and query it:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load all documents from the knowledge-base directory
documents = SimpleDirectoryReader("./data/").load_data()

# Embed the documents and build an in-memory vector index
index = VectorStoreIndex.from_documents(documents)
index  # display the index object (notebook cell output)

# Set logging to DEBUG for more detailed outputs
query_engine = index.as_query_engine()
response = query_engine.query("小额贷款咋规定的?")  # "What are the rules on microloans?"
print(response)
```
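By default the query engine retrieves only a couple of top-scoring chunks per question; retrieval breadth is tunable when building the engine (a sketch, assuming the in-memory index above):

```python
# Retrieve more context chunks per question
query_engine = index.as_query_engine(similarity_top_k=3)
```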
A key piece of any llama-index RAG setup is the knowledge base, which is simply a collection of documents of various types. The knowledge base here is a single PDF file, whose contents are as follows.
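Since SimpleDirectoryReader picks up every file in the directory, it can be worth restricting it to PDFs when the folder holds mixed content (a sketch using the reader's required_exts filter):

```python
# Only load PDF files from the knowledge-base directory
documents = SimpleDirectoryReader("./data/", required_exts=[".pdf"]).load_data()
```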
As the code above shows, locally downloaded Qwen and bge-zh models are enough to build a working RAG pipeline, and the knowledge base supports question answering in Chinese. This makes the approach well suited to on-premises, private deployments, which we can then extend with further functionality.
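For a private deployment you typically don't want to re-embed the corpus on every start; the index can be persisted to disk and reloaded (a minimal sketch, with ./storage as an assumed location):

```python
from llama_index.core import StorageContext, load_index_from_storage

# Save the built index to disk once...
index.storage_context.persist(persist_dir="./storage")

# ...and reload it on later runs instead of re-embedding
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
```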