- Deploy ChatGLM3-6B standalone and expose an HTTP API.
- Wrap access to ChatGLM3-6B in a custom LLM.
- Create a simple Agent that uses the custom LLM.
The previous article, LLM大语言模型(九):LangChain封装自定义的LLM-CSDN博客, already covered how to wrap a custom LLM in LangChain.
This article applies the same approach to a locally deployed ChatGLM3-6B with a simple wrapper.
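Before wiring up LangChain, the service can be smoke-tested directly. A minimal sketch, assuming the locally deployed service exposes an OpenAI-compatible /v1/chat/completions endpoint on port 8000 (as ChatGLM3's OpenAI-style demo server does):

```python
import requests

# Send one chat turn to the local ChatGLM3-6B service and print the reply.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={"content-type": "application/json"},
    json={
        "model": "chatglm3-6b",
        "messages": [{"role": "user", "content": "hello"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```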
````python
import requests
import json
from typing import Any, List, Optional
from langchain.llms.base import LLM
from langchain_core.callbacks import CallbackManagerForLLMRun


class MyChatGLM(LLM):
    model: str = "chatglm3-6b"
    url: str = "http://localhost:8000/v1/chat/completions"

    @property
    def _llm_type(self) -> str:
        return "MyChatGLM"

    def _resp_process_mock(self, input: str, resp: str) -> str:
        # Wrap the raw model response in a "Final Answer" action blob so the
        # structured chat agent terminates after a single round (see below).
        final_answer_json = {
            "action": "Final Answer",
            "action_input": resp
        }
        return f"""
Action:
```
{json.dumps(final_answer_json, ensure_ascii=False)}
```"""

    def _call(self, prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any) -> str:
        data = {
            "model": self.model,
            "messages": [{"role": "user", "content": prompt}],
        }
        resp = self.doRequest(data)
        return self._resp_process_mock(prompt, resp)

    def doRequest(self, payload: dict) -> str:
        # Send the request body as JSON.
        headers = {"content-type": "application/json"}
        res = requests.post(self.url, json=payload, headers=headers)
        return res.text


mllm = MyChatGLM()
print(mllm._llm_type)
# mllm._llm_type = "haha"  # _llm_type is a read-only property and cannot be assigned
print(mllm("hello world!"))
````
This example is a simple QA flow and does not keep any chat history.
Inside _call(), the locally deployed ChatGLM3-6B service is invoked via an HTTP POST request.
_resp_process_mock() wraps the raw model response into a fixed format that returns an action: Final Answer directly; the reason for this is explained below.
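As a side note, if you want the wrapper to return just the assistant's reply rather than the raw body, a minimal sketch (assuming the service follows the OpenAI-style chat/completions response schema visible in the log at the end of this article) could parse it first:

```python
import json

def _extract_content(resp: str) -> str:
    # Pull the assistant message out of an OpenAI-style response body;
    # fall back to the raw text if the shape is unexpected.
    try:
        return json.loads(resp)["choices"][0]["message"]["content"]
    except (json.JSONDecodeError, KeyError, IndexError):
        return resp
```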
The Agent is LangChain's structured chat Agent type, which drives a structured chat loop.
A structured chat Agent supports multiple tools as input; this example does not introduce any tools, so _resp_process_mock() returns an action: Final Answer directly.
action: Final Answer signals that the tool-using reasoning loop of the chat has finished (see below).
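To illustrate the contract, here is a simplified sketch (not LangChain's actual output parser) of how the agent side reads the LLM output: it finds the fenced JSON blob and stops the loop when its action is Final Answer:

````python
import json
import re

def extract_action(text: str) -> dict:
    # Simplified sketch of what a structured chat output parser does:
    # locate the fenced JSON blob in the LLM output and decode it.
    match = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
    if match is None:
        raise ValueError("no action blob found in LLM output")
    return json.loads(match.group(1))

blob = extract_action(MyChatGLM()._resp_process_mock("hi", "a canned answer"))
assert blob["action"] == "Final Answer"  # this is what ends the reasoning loop
````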
```python
from langchain import hub
from langchain.agents import AgentExecutor, create_structured_chat_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from my_chatglm3 import MyChatGLM

if __name__ == "__main__":
    # tools = [TavilySearchResults(max_results=1)]
    tools = []
    prompt = hub.pull("hwchase17/structured-chat-agent")
    # Choose the LLM to use
    llm = MyChatGLM()

    # Construct the agent
    agent = create_structured_chat_agent(llm, tools, prompt)
    # Create an agent executor by passing in the agent and tools
    agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)
    agent_executor.invoke({"input": "我心情不好,给我讲个笑话逗我开心"})
```
The prompt is hwchase17/structured-chat-agent; rendered with an empty tool list, its structure is as follows:
````
System: Respond to the human as helpfully and accurately as possible. You have access to the following tools:



Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).

Valid "action" values: "Final Answer" or 

Provide only ONE action per $JSON_BLOB, as shown:

```
{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}
```

Follow this format:

Question: input question to answer
Thought: consider previous and subsequent steps
Action:
```
$JSON_BLOB
```
Observation: action result
... (repeat Thought/Action/Observation N times)
Thought: I know what to respond
Action:
```
{
  "action": "Final Answer",
  "action_input": "Final response to human"
}
```

Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB``` then Observation

Human: 我心情不好,给我讲个笑话逗我开心

(reminder to respond in a JSON blob no matter what)
````
Segment 1: the System message, analogous to the prompt engineering we do in ordinary LLM chats; it sets the role and other context.
Segment 2: declares how to use tools and how the reasoning loop works (tools are out of scope here and will be covered in a later article). Note the constraint on the reasoning process: the loop ends on action: Final Answer. If the LLM's replies to the Agent never contain action: Final Answer, the Agent keeps reasoning and effectively enters an infinite loop (see the guard sketch after this list).
This is exactly why _resp_process_mock() returns action: Final Answer directly: the conversation finishes in a single round.
Segment 3: declares the output format.
Segment 4: the Human user's input.
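As a practical safeguard against that infinite loop, AgentExecutor can cap the number of reasoning rounds. A minimal sketch (the parameter values here are illustrative):

```python
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5,               # stop after at most 5 Thought/Action rounds
    early_stopping_method="force",  # return a canned answer instead of raising
)
```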
````
> Entering new AgentExecutor chain...

Action:
```
{"action": "Final Answer",
 "action_input": "{\"model\":\"chatglm3-6b\",\"id\":\"\",\"object\":\"chat.completion\",\"choices\":[{\"index\":0,\"message\":{\"role\":\"assistant\",\"content\":\"{\\n \\\"action\\\": \\\"Final Answer\\\",\\n \\\"action_input\\\": \\\"A joke for you: Why was the math book sad? Because it had too many problems.\\\"\\n}\",\"name\":null,\"function_call\":null},\"finish_reason\":\"stop\"}],\"created\":1712419428,\"usage\":{\"prompt_tokens\":297,\"total_tokens\":339,\"completion_tokens\":42}}"}
```

> Finished chain.
````
You can see that the reasoning trace contains exactly one action: Final Answer.
Here action_input carries the LLM's raw response verbatim.
The embedded "A joke for you: Why was the math book sad? Because it had too many problems." is the LLM's actual answer.
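Because action_input is the raw server response, the model's actual reply ends up doubly nested. A small sketch to dig it out, assuming result is the dict returned by agent_executor.invoke() above:

```python
import json

result = agent_executor.invoke({"input": "我心情不好,给我讲个笑话逗我开心"})
raw = json.loads(result["output"])               # the raw chat/completions body
inner = raw["choices"][0]["message"]["content"]  # the model's own action blob
print(json.loads(inner)["action_input"])         # the joke text itself
```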