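    FastAPI makes it simpler and more efficient to deploy AI interaction services. It lets developers build APIs quickly: you declare the input and output formats the model needs, then write the corresponding business logic.

    FastAPI's asynchronous, high-performance architecture can handle a large number of concurrent prediction requests and give users a smooth interactive experience. FastAPI applications are also easy to containerize: developers can package the service and its AI model into a Docker image to deploy and scale across environments.

    In short, FastAPI can noticeably improve both development efficiency and user experience for AI applications, and it covers most of what is needed to deploy and interact with an AI model.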




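    FastAPI is a modern, fast (high-performance) Python web framework for building APIs. It is an ASGI (Asynchronous Server Gateway Interface) framework built on standard Python type hints.

FastAPI has the following main features:

  1. Fast: FastAPI is built on the Starlette framework and runs on an ASGI server, and it performs well in benchmarks. It is commonly served with Uvicorn for very high throughput.

  2. Simple: FastAPI uses Python type hints, which makes API definitions simple and intuitive. Developers only need to define the input and output models, and FastAPI generates the API documentation automatically.

  3. Modern: FastAPI supports the OpenAPI standard and can generate both API documentation and interactive documentation automatically. It also supports JSON Schema and data validation.

  4. Full-featured: FastAPI provides routing, dependency injection, data validation, security, testing, and more, which makes it a complete web framework.

  5. Extensible: FastAPI is designed to be extended. Developers can easily integrate other libraries and components, such as databases and authentication.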


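    WebSocket is a computer communication protocol that provides full-duplex communication over a single TCP connection. It emerged alongside HTML5 and is standardized as RFC 6455.

The WebSocket protocol has the following main characteristics:

  1. Full-duplex communication: WebSocket allows two-way, real-time communication between client and server, so data can flow in both directions at the same time. This differs from the traditional HTTP request-response model, where the server can only answer a request the client has sent.

  2. Persistent connection: once established, a WebSocket connection stays open until the client or the server closes it. This differs from HTTP, where an exchange ends once the response is delivered.

  3. Low overhead: after the initial handshake, each WebSocket message carries only a few bytes of framing instead of a full set of HTTP headers, so the network overhead is small.

  4. Real-time: because the connection is persistent and data flows in both directions, WebSocket suits applications that need real-time, low-latency data exchange, such as chat applications, real-time games, and stock tickers.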
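
The handshake and the low per-message overhead can be sketched with the standard library alone. The code below follows RFC 6455: it derives the `Sec-WebSocket-Accept` value the server returns during the HTTP upgrade, and it encodes an unmasked server-to-client text frame. The function names are my own, and this is an illustration of the wire format, not a usable WebSocket implementation.

```python
import base64
import hashlib

# GUID fixed by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def encode_text_frame(message: str) -> bytes:
    """Encode an unmasked (server-to-client) text frame."""
    payload = message.encode("utf-8")
    first = 0x81  # FIN=1, opcode=0x1 (text)
    n = len(payload)
    if n < 126:
        header = bytes([first, n])
    elif n < 65536:
        header = bytes([first, 126]) + n.to_bytes(2, "big")
    else:
        header = bytes([first, 127]) + n.to_bytes(8, "big")
    return header + payload


# Sample values taken from RFC 6455.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
print(encode_text_frame("Hello"))              # b'\x81\x05Hello'
```

A short message therefore costs two header bytes on the wire from server to client, which is where the low-overhead claim comes from.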


    A Tool is an auxiliary capability designed to extend what a language model can do and to make it more useful in practice. Code Interpreter and Knowledge Retrieval, for example, are both tools of this kind.
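
Conceptually, a tool is just a named function that the model can ask the application to run. The sketch below is a standard-library illustration of that idea, not LangChain's implementation: the `TOOLS` registry, the `register` decorator, and the `word_count` tool are hypothetical. The JSON shape with `action` and `action_input` keys mirrors the convention used by the prompt template later in this article.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping tool names to plain Python callables.
TOOLS: Dict[str, Callable[[str], str]] = {}


def register(name: str):
    """Decorator that adds a function to the registry under the given name."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = func
        return func
    return decorator


@register("word_count")
def word_count(text: str) -> str:
    """Toy tool: count whitespace-separated words."""
    return str(len(text.split()))


def dispatch(blob: str) -> str:
    """Run the tool named in a JSON blob, or return the final answer."""
    call = json.loads(blob)
    action = call["action"]
    if action == "Final Answer":
        return call["action_input"]
    if action not in TOOLS:
        return f"unknown tool: {action}"
    return TOOLS[action](call["action_input"])
```

In an agent loop, the return value of `dispatch` would be fed back to the model as the observation for its next step.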





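3.1. Create a virtual environment & install dependencies

  Compared with a plain FastAPI setup, this adds the dependency package for Google Search. The langchain package is also installed because the server code in 4.2 imports from langchain.agents.

    conda create -n fastapi_test python=3.10
    conda activate fastapi_test
    pip install fastapi websockets uvicorn
    pip install --quiet langchain langchain-core langchain-community langchain-openai
    pip install google-search-results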
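
Before moving on, it can help to confirm that the environment actually contains what the later code imports. The following is a small standard-library sketch; the module list is my reading of the imports used in this article (note that the google-search-results package is imported as `serpapi`), so adjust it if your code differs.

```python
import importlib.util

# Import names (not pip package names) used by the code in this article.
REQUIRED_MODULES = [
    "fastapi",
    "websockets",
    "uvicorn",
    "langchain",            # imported by the server code in 4.2
    "langchain_core",
    "langchain_community",
    "langchain_openai",
    "serpapi",              # provided by the google-search-results package
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```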

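3.2. Register a Google Search API account

1. Enter the registration information


2. Verify your email address

3. Verify your phone number

4. Verification succeeds

3.3. Generate the Google Search API key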
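
Once the key exists, one option is to keep it out of the source code and read it from the environment at startup. This is a minimal sketch with a hypothetical helper name, `require_env`; the examples in this article set the key inline instead, which is fine for a quick test but easy to leak through version control.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value
```

With `SERPAPI_API_KEY` exported in the shell, calling `require_env("SERPAPI_API_KEY")` before creating the search wrapper turns a missing key into an immediate, readable error.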


4.1. A quick Google Search test

    # -*- coding: utf-8 -*-
    import os

    from langchain_community.utilities.serpapi import SerpAPIWrapper

    # Replace the placeholder with the key generated in 3.3.
    os.environ["SERPAPI_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

    serp = SerpAPIWrapper()
    result = serp.run("What is the current temperature in Guangzhou?")
    print("Live search result:", result)


4.2. Non-streaming output



    import os
    from typing import Annotated

    import uvicorn
    from fastapi import (
        Depends,
        FastAPI,
        WebSocket,
        WebSocketDisconnect,
        WebSocketException,
        status,
    )
    from langchain.agents import AgentExecutor, create_structured_chat_agent
    from langchain_community.utilities import SerpAPIWrapper
    from langchain_core.prompts import (
        ChatPromptTemplate,
        HumanMessagePromptTemplate,
        SystemMessagePromptTemplate,
    )
    from langchain_core.tools import tool
    from langchain_openai import ChatOpenAI

    os.environ["OPENAI_API_KEY"] = 'sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'  # your OpenAI key
    os.environ["SERPAPI_API_KEY"] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"


    class ConnectionManager:
        """Tracks the WebSocket connections that are currently open."""

        def __init__(self):
            self.active_connections: list[WebSocket] = []

        async def connect(self, websocket: WebSocket):
            await websocket.accept()
            self.active_connections.append(websocket)

        def disconnect(self, websocket: WebSocket):
            self.active_connections.remove(websocket)

        async def send_personal_message(self, message: str, websocket: WebSocket):
            await websocket.send_text(message)

        async def broadcast(self, message: str):
            for connection in self.active_connections:
                await connection.send_text(message)


    manager = ConnectionManager()
    app = FastAPI()


    async def authenticate(
        websocket: WebSocket,
        userid: str,
        secret: str,
    ):
        """Check the userid/secret query parameters; return 'pass' or 'fail'."""
        if userid is None or secret is None:
            raise WebSocketException(code=status.WS_1008_POLICY_VIOLATION)
        print(f'userid: {userid}')  # the secret is deliberately not logged
        if '12345' == userid and 'xxxxxxxxxxxxxxxxxxxxxxxxxx' == secret:
            return 'pass'
        else:
            return 'fail'


    @tool
    def search(query: str):
        """Use this tool only when you need real-time information or do not know the answer; pass in the content to search for."""
        serp = SerpAPIWrapper()
        result = serp.run(query)
        print("Live search result:", result)
        return result


    def get_prompt():
        """Build the structured-chat prompt (system message plus human message)."""
        template = '''
    Respond to the human as helpfully and accurately as possible. You have access to the following tools:
    {tools}
    Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).
    Valid "action" values: "Final Answer" or {tool_names}
    Provide only ONE action per $JSON_BLOB, as shown:
    ```
    {{
    "action": $TOOL_NAME,
    "action_input": $INPUT
    }}
    ```
    Follow this format:
    Question: input question to answer
    Thought: consider previous and subsequent steps
    Action:
    ```
    $JSON_BLOB
    ```
    Observation: action result
    ... (repeat Thought/Action/Observation N times)
    Thought: I know what to respond
    Action:
    ```
    {{
    "action": "Final Answer",
    "action_input": "Final response to human"
    }}
    ```
    Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation
    '''
        system_message_prompt = SystemMessagePromptTemplate.from_template(template)
        human_template = '''
    {input}
    {agent_scratchpad}
    (reminder to respond in a JSON blob no matter what)
    '''
        human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
        prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])
        return prompt


    # Defined at module level so they also exist when the app is started
    # with the uvicorn command line instead of running this file directly.
    tools = [search]
    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, max_tokens=512)


    async def chat(query):
        """Run the agent once and yield its final answer (non-streaming)."""
        agent = create_structured_chat_agent(llm, tools, get_prompt())
        agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
        # ainvoke keeps the event loop free while the model and the tool run.
        result = await agent_executor.ainvoke({"input": query})
        print(result['output'])
        yield result['output']


    @app.websocket("/ws")
    async def websocket_endpoint(
        *,
        websocket: WebSocket,
        userid: str,
        permission: Annotated[str, Depends(authenticate)],
    ):
        await manager.connect(websocket)
        try:
            while True:
                text = await websocket.receive_text()
                if 'fail' == permission:
                    await manager.send_personal_message("authentication failed", websocket)
                else:
                    if text is not None and len(text) > 0:
                        async for msg in chat(text):
                            await manager.send_personal_message(msg, websocket)
        except WebSocketDisconnect:
            manager.disconnect(websocket)
            print(f"Client #{userid} left the chat")
            await manager.broadcast(f"Client #{userid} left the chat")


    if __name__ == '__main__':
        uvicorn.run(app, host='', port=7777)
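
The `authenticate` dependency above compares the secret with `==` against a value written into the source. As one possible hardening step, the comparison can be done in constant time with the standard library. This is a sketch under assumptions: the `CREDENTIALS` dictionary and the `check_credentials` name are hypothetical, and a real service would load secrets from a database or secret manager.

```python
import hmac

# Hypothetical credential store used only for this illustration.
CREDENTIALS = {"12345": "change-me"}


def check_credentials(userid: str, secret: str) -> str:
    """Return 'pass' or 'fail', comparing secrets in constant time."""
    expected = CREDENTIALS.get(userid)
    if expected is None:
        return "fail"
    # compare_digest avoids leaking how many leading characters matched.
    ok = hmac.compare_digest(secret.encode("utf-8"), expected.encode("utf-8"))
    return "pass" if ok else "fail"
```

The return values match what the endpoint already expects, so the body of `authenticate` could delegate to a function like this without changing the WebSocket loop.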


    <!DOCTYPE html>
    <html>
    <head>
        <title>Chat</title>
    </head>
    <body>
        <h1>WebSocket Chat</h1>
        <form action="" onsubmit="sendMessage(event)">
            <label>USERID: <input type="text" id="userid" autocomplete="off" value="12345"/></label>
            <label>SECRET: <input type="text" id="secret" autocomplete="off" value="xxxxxxxxxxxxxxxxxxxxxxxxxx"/></label>
            <br/>
            <button onclick="connect(event)">Connect</button>
            <hr>
            <label>Message: <input type="text" id="messageText" autocomplete="off"/></label>
            <button>Send</button>
        </form>
        <ul id='messages'>
        </ul>
        <script>
            var ws = null;
            function connect(event) {
                var userid = document.getElementById("userid")
                var secret = document.getElementById("secret")
                // Encode the values so special characters survive the query string.
                ws = new WebSocket("ws://localhost:7777/ws?userid=" + encodeURIComponent(userid.value) + "&secret=" + encodeURIComponent(secret.value));
                ws.onmessage = function(event) {
                    var messages = document.getElementById('messages')
                    var message = document.createElement('li')
                    var content = document.createTextNode(event.data)
                    message.appendChild(content)
                    messages.appendChild(message)
                };
                event.preventDefault()
            }
            function sendMessage(event) {
                event.preventDefault()
                // Ignore the submit if Connect has not been clicked yet.
                if (ws === null || ws.readyState !== WebSocket.OPEN) {
                    return
                }
                var input = document.getElementById("messageText")
                ws.send(input.value)
                input.value = ''
            }
        </script>
    </body>
    </html>
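
The same endpoint can also be exercised from Python instead of a browser. The sketch below assumes the server is listening on localhost:7777 as in the page above; `build_ws_url` and `ask` are hypothetical helper names. The client uses the `websockets` package installed in 3.1, imported inside the function so the URL helper stays usable on its own.

```python
from urllib.parse import urlencode


def build_ws_url(host: str, port: int, userid: str, secret: str) -> str:
    """Build the /ws URL with userid and secret safely encoded as query parameters."""
    query = urlencode({"userid": userid, "secret": secret})
    return f"ws://{host}:{port}/ws?{query}"


async def ask(url: str, question: str) -> str:
    """Send one question over the WebSocket and return the first reply."""
    import websockets  # third-party client library

    async with websockets.connect(url) as ws:
        await ws.send(question)
        return await ws.recv()
```

With the server running, `asyncio.run(ask(build_ws_url("localhost", 7777, "12345", "your-secret"), "your question"))` should return the agent's answer as a single message, since the output is not streamed.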







Model output: The current weather in Guangzhou is partly cloudy with a temperature of 95°F, 66% chance of precipitation, 58% humidity, and wind speed of 16 mph. This information was last updated on Monday at 1:00 PM.


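1. LangChain is not required for AI interaction; it is used here only to simplify the interaction with OpenAI.

2. The page styling can be adjusted to suit your needs; the version here is for demonstration only.

3. Two issues remain open: how to implement streaming output, and how to maintain the prompt template more cleanly. Space is limited, so both are left for the next installment.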


5.1. How to keep the model from replying in English

Add "Remember to answer in Chinese." to the prompt template to nudge the model to always reply in Chinese.


    Respond to the human as helpfully and accurately as possible. You have access to the following tools:
    {tools}
    Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).
    Valid "action" values: "Final Answer" or {tool_names}
    Provide only ONE action per $JSON_BLOB, as shown:
    ```
    {{
    "action": $TOOL_NAME,
    "action_input": $INPUT
    }}
    ```
    Follow this format:
    Question: input question to answer
    Thought: consider previous and subsequent steps
    Action:
    ```
    $JSON_BLOB
    ```
    Observation: action result
    ... (repeat Thought/Action/Observation N times)
    Thought: I know what to respond
    Action:
    ```
    {{
    "action": "Final Answer",
    "action_input": "Final response to human"
    }}
    ```
    Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Remember to answer in Chinese. Format is Action:```$JSON_BLOB```then Observation
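
Two details of this template are easy to trip over: the doubled braces, and where the language reminder goes. LangChain prompt templates default to f-string-style formatting, in which `{{` and `}}` render as literal braces while `{tool_names}` is a variable. The sketch below reproduces that behaviour with plain `str.format` on a shortened, hypothetical version of the template; `add_language_hint` is an illustrative helper, not part of LangChain.

```python
# Shortened stand-in for the full template, without the code-fence lines.
TEMPLATE = "\n".join([
    'Valid "action" values: "Final Answer" or {tool_names}',
    "{{",
    '  "action": $TOOL_NAME,',
    '  "action_input": $INPUT',
    "}}",
    "Begin! Respond directly if appropriate.",
])


def add_language_hint(template: str, hint: str = "Remember to answer in Chinese.") -> str:
    """Append a reply-language reminder to the end of a prompt template."""
    return template.rstrip() + " " + hint


rendered = add_language_hint(TEMPLATE).format(tool_names="search")
print(rendered)
```

The rendered text contains single braces around the JSON example and ends with the reminder. The `$TOOL_NAME` and `$INPUT` markers pass through untouched because `$` has no special meaning in this format style.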

