赞
踩
RasaGPT 结合了 Rasa 和 Langchain 这 2 个开源项目,当超出 Rasa 现有意图(out_of_scope
)的时候,就会执行 ActionGPTFallback
,本质上就是利用 Langchain 做了一个 RAG,调用 LLM API。RasaGPT 涉及的技术栈比较多而复杂,包括 Rasa、Langchain、LlamaIndex、Telegram、PostgresSQL、PGVector、Ngrok、FastAPI、Docker、docker-compose、Dozzle 等。尽管对项目做了简化[3],删除了不容易实现的部分,但仍是一次失败的实践,各种原因没有完整运行起来。不过 RasaGPT 为结合 Rasa 和 Langchain 提供了一种思路,接下来重点是把 Rasa 和 Langchain-Chatchat 进行对接。
RasaGPT 是一个建立在 Rasa 和 Langchain 之上的无头 LLM chatbot 平台(无头简单理解就是没有界面)。它是 Rasa 和 Telegram 的样板文件(Boilerplate)和参考实现,利用 LLM 库(如 Langchain)进行索引、检索和上下文注入。RasaGPT 可以直接投入使用,很多实施上的麻烦事都已经被解决了,这样就不必亲自处理,包括:
Rasa 是一个开源 (Python) 机器学习框架,用于自动执行基于文本和语音的对话:NLU、对话管理、连接到 Slack、Facebook 等,创建聊天机器人和语音助手。LangChain 主要功能是围绕 LLM 快速构建应用程序和管道,通过与外部数据和知识源的结合,可以提高 LLM 的应用效率和范围。
使用 docker-compose.yml 文件快速启动,如果使用的是 Linux 或 Windows,则需要将 docker-compose.yml 文件中的 Dockerfilekhalosa/rasa-aarch64:3.5.2 名称修改为 rasa/rasa:latest。
# 获取源码
git clone https://github.com/paulpierre/RasaGPT.git
cd RasaGPT
# 根据需要配置.env文件
cp .env-example .env
# 自动安装和运行RasaGPT
make install
说明:可以输入 make 查看更多选项列表,安装完成后再次执行命令 make run 运行。完整安装日志参考: https://app.warp.dev/block/vflua6Eue29EPk8EVvW8Kd
chat_api
chat_ngrok
chat_rasa_core
chat_rasa_actions
chat_rasa_credentials
chat_db
chat_pgadmin
chat_dozzle
简化后的 RasaGPT 项目[3]包括:chat_rasa_core(模型)、chat_api(LLM 接口)、chat_db(数据库)、chat_rasa_actions(action 服务)、chat_rasa_credentials(凭证)。如下所示:
可以通过访问 https://localhost:9999/查看所有日志,它将显示所有 Docker 容器的实时日志。如下所示:
可以通过访问 https://localhost:8888/docs 查看 API 端点文档。在这个页面上,可以创建和更新实体,以及将文档上传到知识库。如下所示:
这可以被认为是 SaaS/多租户中的客户公司。默认情况下,已提供虚拟组织列表。如下所示:
[
{
"id": 1,
"uuid": "d2a642e6-c81a-4a43-83e2-22cee3562452",
"display_name": "Pepe Corp.",
"namespace": "pepe",
"bot_url": null,
"created_at": "2023-05-05T10:42:45.933976",
"updated_at": "2023-05-05T10:42:45.933979"
},
{
"id": 2,
"uuid": "7d574f88-6c0b-4c1f-9368-367956b0e90f",
"display_name": "Umbrella Corp",
"namespace": "acme",
"bot_url": null,
"created_at": "2023-05-05T10:43:03.555484",
"updated_at": "2023-05-05T10:43:03.555488"
},
{
"id": 3,
"uuid": "65105a15-2ef0-4898-ac7a-8eafee0b283d",
"display_name": "Cyberdine Systems",
"namespace": "cyberdine",
"bot_url": null,
"created_at": "2023-05-05T10:43:04.175424",
"updated_at": "2023-05-05T10:43:04.175428"
},
{
"id": 4,
"uuid": "b7fb966d-7845-4581-a537-818da62645b5",
"display_name": "Bluth Companies",
"namespace": "bluth",
"bot_url": null,
"created_at": "2023-05-05T10:43:04.697801",
"updated_at": "2023-05-05T10:43:04.697804"
},
{
"id": 5,
"uuid": "9283d017-b24b-4ecd-bf35-808b45e258cf",
"display_name": "Evil Corp",
"namespace": "evil",
"bot_url": null,
"created_at": "2023-05-05T10:43:05.102546",
"updated_at": "2023-05-05T10:43:05.102549"
}
]
这可以被视为属于公司的产品。可以查看属于组织的项目列表。如下所示:
[
{
"id": 1,
"documents": [
{
"id": 1,
"uuid": "92604623-e37c-4935-bf08-0e9efa8b62f7",
"display_name": "project-pepetamine.md",
"node_count": 3
}
],
"document_count": 1,
"uuid": "44a4b60b-9280-4b21-a676-00612be9aa87",
"display_name": "Pepetamine",
"created_at": "2023-05-05T10:42:46.060930",
"updated_at": "2023-05-05T10:42:46.060934"
},
{
"id": 2,
"documents": [
{
"id": 2,
"uuid": "b408595a-3426-4011-9b9b-8e260b244f74",
"display_name": "project-frogonil.md",
"node_count": 3
}
],
"document_count": 1,
"uuid": "5ba6b812-de37-451d-83a3-8ccccadabd69",
"display_name": "Frogonil",
"created_at": "2023-05-05T10:42:48.043936",
"updated_at": "2023-05-05T10:42:48.043940"
},
{
"id": 3,
"documents": [
{
"id": 3,
"uuid": "b99d373a-3317-4699-a89e-90897ba00db6",
"display_name": "project-kekzal.md",
"node_count": 3
}
],
"document_count": 1,
"uuid": "1be4360c-f06e-4494-bf20-e7c73a56f003",
"display_name": "Kekzal",
"created_at": "2023-05-05T10:42:49.092675",
"updated_at": "2023-05-05T10:42:49.092678"
},
{
"id": 4,
"documents": [
{
"id": 4,
"uuid": "94da307b-5993-4ddd-a852-3d8c12f95f3f",
"display_name": "project-memetrex.md",
"node_count": 3
}
],
"document_count": 1,
"uuid": "1fd7e772-365c-451b-a7eb-4d529b0927f0",
"display_name": "Memetrex",
"created_at": "2023-05-05T10:42:50.184817",
"updated_at": "2023-05-05T10:42:50.184821"
},
{
"id": 5,
"documents": [
{
"id": 5,
"uuid": "6deff180-3e3e-4b09-ae5a-6502d031914a",
"display_name": "project-pepetrak.md",
"node_count": 4
}
],
"document_count": 1,
"uuid": "a389eb58-b504-48b4-9bc3-d3c93d2fbeaa",
"display_name": "PepeTrak",
"created_at": "2023-05-05T10:42:51.293352",
"updated_at": "2023-05-05T10:42:51.293355"
},
{
"id": 6,
"documents": [
{
"id": 6,
"uuid": "2e3c2155-cafa-4c6b-b7cc-02bb5156715b",
"display_name": "project-memegen.md",
"node_count": 5
}
],
"document_count": 1,
"uuid": "cec4154f-5d73-41a5-a764-eaf62fc3db2c",
"display_name": "MemeGen",
"created_at": "2023-05-05T10:42:52.562037",
"updated_at": "2023-05-05T10:42:52.562040"
},
{
"id": 7,
"documents": [
{
"id": 7,
"uuid": "baabcb6f-e14c-4d59-a019-ce29973b9f5c",
"display_name": "project-neurokek.md",
"node_count": 5
}
],
"document_count": 1,
"uuid": "4a1a0542-e314-4ae7-9961-720c2d092f04",
"display_name": "Neuro-kek",
"created_at": "2023-05-05T10:42:53.689537",
"updated_at": "2023-05-05T10:42:53.689539"
},
{
"id": 8,
"documents": [
{
"id": 8,
"uuid": "5be007ec-5c89-4bc4-8bfd-448a3659c03c",
"display_name": "org-about_the_company.md",
"node_count": 5
},
{
"id": 9,
"uuid": "c2b3fb39-18c0-4f3e-9c21-749b86942cba",
"display_name": "org-board_of_directors.md",
"node_count": 3
},
{
"id": 10,
"uuid": "41aa81a9-13a9-4527-a439-c2ac0215593f",
"display_name": "org-company_story.md",
"node_count": 4
},
{
"id": 11,
"uuid": "91c59eb8-8c05-4f1f-b09d-fcd9b44b5a20",
"display_name": "org-corporate_philosophy.md",
"node_count": 4
},
{
"id": 12,
"uuid": "631fc3a9-7f5f-4415-8283-78ff582be483",
"display_name": "org-customer_support.md",
"node_count": 3
},
{
"id": 13,
"uuid": "d4c3d3db-6f24-433e-b2aa-52a70a0af976",
"display_name": "org-earnings_fy2023.md",
"node_count": 5
},
{
"id": 14,
"uuid": "08dd478b-414b-46c4-95c0-4d96e2089e90",
"display_name": "org-management_team.md",
"node_count": 3
}
],
"document_count": 7,
"uuid": "1d2849b4-2715-4dcf-aa68-090a221942ba",
"display_name": "Pepe Corp. (company)",
"created_at": "2023-05-05T10:42:55.258902",
"updated_at": "2023-05-05T10:42:55.258904"
}
]
这可以被视为与产品相关的工件,例如 FAQ 页面或财务报表收益 PDF。可以查看与组织项目相关的所有文档。如下所示:
{
"id": 1,
"uuid": "44a4b60b-9280-4b21-a676-00612be9aa87",
"organization": {
"id": 1,
"uuid": "d2a642e6-c81a-4a43-83e2-22cee3562452",
"display_name": "Pepe Corp.",
"bot_url": null,
"status": 2,
"created_at": "2023-05-05T10:42:45.933976",
"updated_at": "2023-05-05T10:42:45.933979",
"namespace": "pepe"
},
"document_count": 1,
"documents": [
{
"id": 1,
"uuid": "92604623-e37c-4935-bf08-0e9efa8b62f7",
"organization_id": 1,
"project_id": 1,
"display_name": "project-pepetamine.md",
"url": "",
"data": "# Pepetamine\n\nProduct Name: Pepetamine\n\nPurpose: Increases cognitive focus just like the Limitless movie\n\n**How to Use**\n\nPepetamine is available in the form of rare Pepe-coated tablets. The recommended dosage is one tablet per day, taken orally with a glass of water, preferably while browsing your favorite meme forum for maximum cognitive enhancement. For optimal results, take Pepetamine 30 minutes before engaging in mentally demanding tasks, such as decoding ancient Pepe hieroglyphics or creating your next viral meme masterpiece.\n\n**Side Effects**\n\nSome potential side effects of Pepetamine may include:\n\n1. Uncontrollable laughter and a sudden appreciation for dank memes\n2. An inexplicable desire to collect rare Pepes\n3. Enhanced meme creation skills, potentially leading to internet fame\n4. Temporary green skin pigmentation, resembling the legendary Pepe himself\n5. Spontaneously speaking in \"feels good man\" language\n\nWhile most side effects are generally harmless, consult your memologist if side effects persist or become bothersome.\n\n**Precautions**\n\nBefore taking Pepetamine, please consider the following precautions:\n\n1. Do not use Pepetamine if you have a known allergy to rare Pepes or dank memes.\n2. Pepetamine may not be suitable for individuals with a history of humor deficiency or meme intolerance.\n3. Exercise caution when driving or operating heavy machinery, as Pepetamine may cause sudden fits of laughter or intense meme ideation.\n\n**Interactions**\n\nPepetamine may interact with other substances, including:\n\n1. Normie supplements: Combining Pepetamine with normie supplements may result in meme conflicts and a decreased sense of humor.\n2. Caffeine: The combination of Pepetamine and caffeine may cause an overload of energy, resulting in hyperactive meme creation and potential internet overload.\n\nConsult your memologist if you are taking any other medications or substances to ensure compatibility with Pepetamine.\n\n**Overdose**\n\nIn case of an overdose, symptoms may include:\n\n1. Uncontrollable meme creation\n2. Delusions of grandeur as the ultimate meme lord\n3. Time warps into the world of Pepe\n\nIf you suspect an overdose, contact your local meme emergency service or visit the nearest meme treatment facility. Remember, the key to enjoying Pepetamine is to use it responsibly, and always keep in mind the wise words of our legendary Pepe: \"Feels good man.\"",
"hash": "fdee6da2b5441080dd78e7850d3d2e1403bae71b9e0526b9dcae4c0782d95a78",
"version": 1,
"status": 2,
"created_at": "2023-05-05T10:42:46.755428",
"updated_at": "2023-05-05T10:42:46.755431"
}
],
"display_name": "Pepetamine",
"created_at": "2023-05-05T10:42:46.060930",
"updated_at": "2023-05-05T10:42:46.060934"
}
尽管在 API 中没有公开,但节点是为其生成 embedding 的文档块。节点用于检索搜索以及上下文注入。节点属于文档。
用户代表与机器人交谈的人。用户不一定属于组织或产品,但这种关系在下面的 ChatSession 中被捕获。
不通过 API 公开,但这表示用户和机器人之间的问答。这些对象中的每一个都可以由自动生成的 session_id
灵活识别。Chat Session 包含丰富的 metadata,可用于训练和优化。通过 /chat
端点的 ChatSession 实际上与组织相关联(出于多租户安全目的)。
这个机器人只是一个概念验证,并且尚未进行检索方面的优化。目前,它使用 1000 个字符长度的分块进行索引,使用基本的欧几里得距离进行检索,质量时好时坏。可以在 RESULTS.MD 文件中查看机器人的示例命中和未命中。总体而言,估计通过进行索引优化和 LLM 配置更改,输出质量可以提高超过 70%。单击查看演示数据的 Q&A 结果在 RESULTS.MD 文件中。
Rasa 处理与通信 channel 的集成,在本例中为 Telegram。
/webhooks/{channel}/webhook
Rasa 有两个组件,核心Rasa app和单独运行的actions server。
Rasa 必须通过一些 yaml 文件进行配置(已经完成):
FallbackClassifier
阈值rasa-credentials
通过app/rasa-credentials/main.py进行更新action_gpt_fallback
操作,它将触发actions serverout_of_scope
的地方action_gpt_fallback
ActionGPTFallback
定义和表达 action 的地方。方法名返回为上面意图定义的动作Rasa 的 NLU 模型必须经过训练,这可以通过 CLI 命令 rasa train
来完成。当运行 make install
时,这会自动完成。
训练后必须通过 rasa run
运行 Rasa Core。
Rasa 的 action server 必须用 rasa run actions
单独运行。
rasa-credentials
服务会处理这个过程。Ngrok 作为一个服务运行,一旦准备就绪,rasa-credentials 将调用本地的 Ngrok API 来检索隧道 URL,并更新 credentials.yml 文件,然后重新启动 Rasa。actions.py
中运行的 fallback action。PGVector 是 Postgres 的插件,可以自动安装,能够存储和计算向量数据类型。有自己的实现,因为 Langchain PGVector 类不灵活,无法适应模式,需要灵活性。
/docker-entry-initdb.d
中的任何文件都运行。在postgres Dockerfile中,复制 create_db.sh 创建数据库和用户。models.py
,该容器从 models 中创建表。GPTSimpleVectorIndex
查找相关数据并将其注入 prompt。用户将在 Telegram 中聊天,消息将被过滤为现有意图
如果它检测到没有意图匹配,而是匹配 out_of_scope
,它将根据 rule.yml 触发 action_gpt_fallback
操作
使用 LlamaIndex 的 API 会找到相关的索引内容,并将其注入到 prompt 中发送给 OpenAI 进行推理
prompt 包含对话护栏,包括:
[1] https://github.com/paulpierre/RasaGPT
[2] https://youtu.be/GAPnQ0qf1-E
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。