
LangChain + Chroma: Question Answering over a Local Knowledge Base

This walkthrough embeds documents with a local model, stores the embeddings in Chroma, and then uses LangChain to call the OpenAI API and answer questions over the local knowledge base.
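The pipeline's first step is splitting documents into fixed-size chunks. As a conceptual illustration of what `chunk_size` and `chunk_overlap` mean, here is a naive stand-in splitter (LangChain's `RecursiveCharacterTextSplitter` is smarter: it prefers to break on paragraph and sentence boundaries before falling back to a hard cut):

```python
def split_text(text, chunk_size=1000, chunk_overlap=0):
    """Naive fixed-size splitter: each chunk starts
    (chunk_size - chunk_overlap) characters after the previous one,
    so consecutive chunks share chunk_overlap characters."""
    chunks = []
    step = chunk_size - chunk_overlap
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

With `chunk_overlap=0` (as in the code below) the chunks simply tile the text; a nonzero overlap repeats the tail of each chunk at the head of the next, which helps keep a sentence that straddles a boundary retrievable.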

from langchain.vectorstores import Chroma
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chat_models import ChatOpenAI
from langchain.chains import VectorDBQA
from langchain.document_loaders import TextLoader

import os
os.environ["OPENAI_API_KEY"] = 'sk-xxx'


# Load the raw text file and convert it into Document objects
loader = TextLoader('state_of_the_union.txt')
documents = loader.load()
# Split the documents into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
texts = text_splitter.split_documents(documents)


# Embed the chunks with a local sentence-transformers model and
# store the vectors under the `db` directory
persist_directory = 'db'
embedding = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
vectordb = Chroma.from_documents(documents=texts, embedding=embedding,
                                 persist_directory=persist_directory)


# Flush the data to disk and release the in-memory store
vectordb.persist()
vectordb = None


# Reload the persisted vector store from disk
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embedding)
# Initialize the qa object with LangChain's VectorDBQA chain.
# gpt-3.5-turbo is a chat model, so use ChatOpenAI rather than the
# completion-style OpenAI class. (Note: newer LangChain releases
# deprecate VectorDBQA in favor of RetrievalQA.)
qa = VectorDBQA.from_chain_type(llm=ChatOpenAI(model_name="gpt-3.5-turbo"),
                                chain_type="stuff", vectorstore=vectordb)


# Run a query against the knowledge base
query = "What did the president say about Ketanji Brown Jackson"
qa.run(query)
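When `qa.run(query)` executes, Chroma embeds the query with the same local model and returns the chunks whose vectors are most similar; those chunks are then "stuffed" into the prompt sent to the LLM. The retrieval step boils down to top-k cosine similarity, which can be sketched in plain Python (illustrative only; Chroma's actual index is far more efficient than this brute-force scan):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k document vectors most similar
    to the query vector, best match first."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

In the real pipeline, the vectors come from the sentence-transformers model and the returned indices map back to the text chunks stored alongside them.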