
Notes on setting up a local Langchain-Chatchat knowledge base on Windows 10


1. Clone the source code

git clone https://github.com/chatchat-space/Langchain-Chatchat.git


2. Prepare the environment

conda create -n Chatchat python=3.10
conda activate Chatchat

3. Model configuration

In model_config.py:

# Name of the embedding model to use
EMBEDDING_MODEL = "m3e-base"

# LLMs to start; here only the online Zhipu API
LLM_MODELS = ["zhipu-api"]

ONLINE_LLM_MODEL = {
    # Register and obtain an API key at http://open.bigmodel.cn
    "zhipu-api": {
        "api_key": "your own Zhipu API key",
        "version": "chatglm_turbo",  # options include "chatglm_turbo"
        "provider": "ChatGLMWorker",
    },
}

MODEL_PATH = {
    "embed_model": {
        "zhipu-api": "lucidrains/GLM-130B",
        # Raw string so the backslashes in the Windows path are kept literally
        "m3e-base": r"G:\AIGC\Langchain\m3e-base-main",
    },
    "llm_model": {
        "zhipu-api": "lucidrains/GLM-130B",
    },
}
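A quick sanity check can catch a wrong embedding path before the services are started. The sketch below is not part of Langchain-Chatchat: `check_embed_model` is a hypothetical helper, and treating a value as a local path only when it is absolute or starts with a drive letter is an assumption made for illustration.

```python
import os


def check_embed_model(embedding_model, model_path):
    """Return a list of problems found for the selected embedding model."""
    embed_paths = model_path.get("embed_model", {})
    if embedding_model not in embed_paths:
        return [f"{embedding_model!r} has no entry in MODEL_PATH['embed_model']"]

    path = embed_paths[embedding_model]
    # Assumption: an absolute path or a drive letter means a local directory;
    # anything else (e.g. "BAAI/bge-large-zh") is treated as a hub repo id.
    looks_local = os.path.isabs(path) or (len(path) > 1 and path[1] == ":")
    if looks_local and not os.path.isdir(path):
        return [f"local model directory does not exist: {path}"]
    return []


if __name__ == "__main__":
    MODEL_PATH = {"embed_model": {"m3e-base": r"G:\AIGC\Langchain\m3e-base-main"}}
    for problem in check_embed_model("m3e-base", MODEL_PATH):
        print(problem)
```

An empty result only means the path exists; it does not prove that the directory holds usable model weights.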


4. Errors encountered

Initializing the database with python init_database.py --recreate-vs failed:

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>
(langchain) G:\AIGC\Langchain\Langchain-Chatchat>python init_database.py --recreate-vs
recreating all vector stores
2023-12-19 17:02:47,732 - faiss_cache.py[line:80] - INFO: loading vector store in 'samples/vector_store/bge-large-zh' from disk.
2023-12-19 17:02:51,277 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: BAAI/bge-large-zh
2023-12-19 17:03:33,432 - embeddings_api.py[line:39] - ERROR: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/BAAI/bge-large-zh (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000001F06C868BE0>, 'Connection to huggingface.co timed out. (connect timeout=None)'))"), '(Request ID: 149213c1-2ec8-4340-90cd-f6d60fdde1da)')
AttributeError: 'NoneType' object has no attribute 'conjugate'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "G:\AIGC\Langchain\Langchain-Chatchat\init_database.py", line 108, in <module>
    folder2db(kb_names=args.kb_name, mode="recreate_vs", embed_model=args.embed_model)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\migrate.py", line 121, in folder2db
    kb.create_kb()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 81, in create_kb
    self.do_create_kb()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\faiss_kb_service.py", line 47, in do_create_kb
    self.load_vector_store()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\faiss_kb_service.py", line 28, in load_vector_store
    return kb_faiss_pool.load_vector_store(kb_name=self.kb_name,
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_cache\faiss_cache.py", line 90, in load_vector_store
    vector_store = self.new_vector_store(embed_model=embed_model, embed_device=embed_device)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_cache\faiss_cache.py", line 48, in new_vector_store
    vector_store = FAISS.from_documents([doc], embeddings, normalize_L2=True)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\vectorstores.py", line 510, in from_documents
    return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\vectorstores\faiss.py", line 911, in from_texts
    embeddings = embedding.embed_documents(texts)
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 399, in embed_documents
    return normalize(embeddings).tolist()
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\knowledge_base\kb_service\base.py", line 38, in normalize
    norm = np.linalg.norm(embeddings, axis=1)
  File "<__array_function__ internals>", line 200, in norm
  File "F:\Anaconda3\envs\langchain\lib\site-packages\numpy\linalg\linalg.py", line 2541, in norm
    s = (x.conj() * x).real
TypeError: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method


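Solution:

The ERROR line in the log shows the root cause: the embedding model BAAI/bge-large-zh could not be fetched from huggingface.co (the connection timed out), so the embedding step produced None and np.linalg.norm failed on it.

Set EMBEDDING_MODEL to bge-large-zh.

Then empty knowledge_base and re-initialize the vector store.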
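According to the traceback, the final TypeError comes from normalize() passing None to np.linalg.norm, which hides the real failure further up in the log. The following plain-Python stand-in (not the project's code, which uses NumPy; the error message and the handling of all-zero rows are illustrative choices) sketches a guard that would report the problem directly:

```python
import math


def normalize(embeddings):
    """L2-normalize each row; fail loudly if the embedding call returned nothing."""
    if embeddings is None:
        raise ValueError(
            "embedding model returned None; check that the model loaded "
            "(see the earlier ERROR line in the log)"
        )
    result = []
    for row in embeddings:
        norm = math.sqrt(sum(x * x for x in row))
        # Leave all-zero rows unchanged to avoid dividing by zero.
        result.append([x / norm for x in row] if norm else list(row))
    return result


if __name__ == "__main__":
    print(normalize([[3.0, 4.0]]))
    try:
        normalize(None)
    except ValueError as exc:
        print("error:", exc)
```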

Starting startup.py:

python startup.py -a

2023-12-19 15:44:46,117 - utils.py[line:24] - ERROR: object of type 'NoneType' has no len()
Traceback (most recent call last):
  File "G:\AIGC\Langchain\Langchain-Chatchat\server\utils.py", line 22, in wrap_done
    await fn
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\base.py", line 381, in acall
    raise e
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\base.py", line 375, in acall
    await self._acall(inputs, run_manager=run_manager)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\llm.py", line 275, in _acall
    response = await self.agenerate([inputs], run_manager=run_manager)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain\chains\llm.py", line 142, in agenerate
    return await self.llm.agenerate_prompt(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 501, in agenerate_prompt
    return await self.agenerate(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 461, in agenerate
    raise exceptions[0]
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 564, in _agenerate_with_cache
    return await self._agenerate(
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_community\chat_models\openai.py", line 518, in _agenerate
    return await agenerate_from_stream(stream_iter)
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_core\language_models\chat_models.py", line 81, in agenerate_from_stream
    async for chunk in stream:
  File "F:\Anaconda3\envs\langchain\lib\site-packages\langchain_community\chat_models\openai.py", line 489, in _astream
    if len(chunk["choices"]) == 0:
TypeError: object of type 'NoneType' has no len()
2023-12-19 15:44:46,122 - utils.py[line:27] - ERROR: TypeError: Caught exception: object of type 'NoneType' has no len()
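The traceback ends at len(chunk["choices"]), so a streamed chunk arrived whose "choices" value was None. The sketch below only illustrates that failure mode with a defensive loop; `iter_choices` is a hypothetical helper, not a patch for langchain, and it does not explain why the online API returned such a chunk.

```python
def iter_choices(chunks):
    """Yield every choice from streamed chunks, skipping chunks without any."""
    for chunk in chunks:
        # chunk["choices"] may be None, as in the traceback; treat it as empty.
        choices = chunk.get("choices") or []
        for choice in choices:
            yield choice


if __name__ == "__main__":
    stream = [{"choices": None}, {"choices": [{"delta": {"content": "hi"}}]}]
    print(list(iter_choices(stream)))
```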


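Starting the webui (the repeated log message 由于目标计算机积极拒绝,无法连接 is Windows error 10061, "connection actively refused by the target machine"):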

streamlit run webui.py

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>streamlit run webui.py

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.43.195:8501

2023-12-19 14:21:27,722 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:29,726 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:31,032 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:31,729 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:33,035 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:33,838 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,041 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,503 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:35,843 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:36,099 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37,519 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37,857 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:37.859 Uncaught app exception
Traceback (most recent call last):
  File "F:\Anaconda3\envs\langchain\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui.py", line 64, in <module>
    pages[selected_page]["func"](api=api, is_lite=is_lite)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui_pages\dialogue\dialogue.py", line 165, in dialogue_page
    running_models = list(api.list_running_models())
TypeError: 'NoneType' object is not iterable
2023-12-19 14:21:38,116 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:39,526 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:40,131 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:41,635 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:42,240 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:43,641 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:44,248 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:45,647 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:45.647 Uncaught app exception
Traceback (most recent call last):
  File "F:\Anaconda3\envs\langchain\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui.py", line 64, in <module>
    pages[selected_page]["func"](api=api, is_lite=is_lite)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui_pages\dialogue\dialogue.py", line 165, in dialogue_page
    running_models = list(api.list_running_models())
TypeError: 'NoneType' object is not iterable
2023-12-19 14:21:46,262 - utils.py[line:95] - ERROR: ConnectError: error when post /llm_model/list_running_models: [WinError 10061] 由于目标计算机积极拒绝,无法连接。
2023-12-19 14:21:46.262 Uncaught app exception
Traceback (most recent call last):
  File "F:\Anaconda3\envs\langchain\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui.py", line 64, in <module>
    pages[selected_page]["func"](api=api, is_lite=is_lite)
  File "G:\AIGC\Langchain\Langchain-Chatchat\webui_pages\dialogue\dialogue.py", line 165, in dialogue_page
    running_models = list(api.list_running_models())
TypeError: 'NoneType' object is not iterable
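WinError 10061 means nothing was listening on the address the webui tried to reach, and list_running_models() then returned None. A small check like the one below can confirm whether the API server is up before streamlit is launched. It is a sketch under assumptions: `is_port_open` is a hypothetical helper, and port 7861 is taken from the "Chatchat API Server" line in the startup output of section 5.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 7861 is the Chatchat API Server port shown in the startup output.
    if is_port_open("127.0.0.1", 7861):
        print("API server is reachable; the webui can be started")
    else:
        print("API server is not reachable; start it first (python startup.py -a)")
```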


Creating a knowledge base failed (the log message 创建知识库出错 means "error creating the knowledge base"):

2023-12-20 10:43:16,728 - SentenceTransformer.py[line:66] - INFO: Load pretrained SentenceTransformer: G:\AIGC\Langchain\m3e-base-main
2023-12-20 10:43:21,466 - embeddings_api.py[line:39] - ERROR: Error while deserializing header: HeaderTooLarge
2023-12-20 10:43:21,483 - kb_api.py[line:34] - ERROR: TypeError: 创建知识库出错: loop of ufunc does not support argument 0 of type NoneType which has no callable conjugate method


Solution:

Set EMBEDDING_MODEL to bge-large-zh and clone the model with Git LFS:

$ git lfs install
$ git clone https://huggingface.co/BAAI/bge-large-zh


Then empty knowledge_base and run python init_database.py --recreate-vs to re-initialize the vector store. This resolved all of the problems above.
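HeaderTooLarge is reported while the header of a weight file is being read. One cause that would fit the fix above (installing Git LFS before cloning) is a model directory that contains Git LFS pointer files instead of the real weights. Whether that was the case here is an assumption, since the log does not say. The sketch below, with a hypothetical helper `find_lfs_pointers`, lists such pointer files in a model directory:

```python
import os

# Git LFS pointer files are small text files that start with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are Git LFS pointers, not real data."""
    pointers = []
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as fh:
                head = fh.read(len(LFS_POINTER_PREFIX))
            if head == LFS_POINTER_PREFIX:
                pointers.append(path)
    return sorted(pointers)


if __name__ == "__main__":
    for path in find_lfs_pointers(r"G:\AIGC\Langchain\m3e-base-main"):
        print("LFS pointer (weights not downloaded):", path)
```

If any file is listed, running git lfs pull inside the cloned repository should replace the pointers with the actual files.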

5. Startup output

(langchain) G:\AIGC\Langchain\Langchain-Chatchat>python startup.py -a

==============================Langchain-Chatchat Configuration==============================
操作系统:Windows-10-10.0.18363-SP0.
python版本:3.10.12 | packaged by Anaconda, Inc. | (main, Jul  5 2023, 19:01:18) [MSC v.1916 64 bit (AMD64)]
项目版本:v0.2.8
langchain版本:0.0.344. fastchat版本:0.2.34

当前使用的分词器:ChineseRecursiveTextSplitter
当前启动的LLM模型:['zhipu-api'] @ cpu
{'api_key': '你自己的apikey',
 'device': 'cpu',
 'host': '127.0.0.1',
 'infer_turbo': False,
 'model_path': 'lucidrains/GLM-130B',
 'online_api': True,
 'port': 21001,
 'provider': 'ChatGLMWorker',
 'version': 'chatglm_turbo',
 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
当前Embbedings模型: m3e-base @ cpu
==============================Langchain-Chatchat Configuration==============================

2023-12-20 10:09:39,873 - startup.py[line:650] - INFO: 正在启动服务:
2023-12-20 10:09:39,873 - startup.py[line:651] - INFO: 如需查看 llm_api 日志,请前往 G:\AIGC\Langchain\Langchain-Chatchat\logs
2023-12-20 10:09:52 | INFO | model_worker | Register to controller
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Started server process [27468]
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Waiting for application startup.
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Application startup complete.
2023-12-20 10:09:54 | ERROR | stderr | INFO:     Uvicorn running on http://127.0.0.1:20000 (Press CTRL+C to quit)
INFO:     Started server process [25024]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:7861 (Press CTRL+C to quit)

==============================Langchain-Chatchat Configuration==============================
操作系统:Windows-10-10.0.18363-SP0.
python版本:3.10.12 | packaged by Anaconda, Inc. | (main, Jul  5 2023, 19:01:18) [MSC v.1916 64 bit (AMD64)]
项目版本:v0.2.8
langchain版本:0.0.344. fastchat版本:0.2.34

当前使用的分词器:ChineseRecursiveTextSplitter
当前启动的LLM模型:['zhipu-api'] @ cpu
{'api_key': '你自己的apikey',
 'device': 'cpu',
 'host': '127.0.0.1',
 'infer_turbo': False,
 'model_path': 'lucidrains/GLM-130B',
 'online_api': True,
 'port': 21001,
 'provider': 'ChatGLMWorker',
 'version': 'chatglm_turbo',
 'worker_class': <class 'server.model_workers.zhipu.ChatGLMWorker'>}
当前Embbedings模型: m3e-base @ cpu

服务端运行信息:
    OpenAI API Server: http://127.0.0.1:20000/v1
    Chatchat  API  Server: http://127.0.0.1:7861
    Chatchat WEBUI Server: http://127.0.0.1:8501
==============================Langchain-Chatchat Configuration==============================

  You can now view your Streamlit app in your browser.

  URL: http://127.0.0.1:8501


The start page looks like this:

[Screenshot: start page (启动信息.PNG)]

6. Notes

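The name of a new knowledge base cannot contain Chinese characters, and parsing imported PDFs is slow:

[Screenshot: knowledge base name (知识库名称.PNG)]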
