赞
踩
【第六节】: 【Lagent & AgentLego 智能体应用搭建】
1、Agent 理论及 Lagent&AgentLego 开源产品介绍
2、Lagent 调用已有 Arxiv 论文搜索工具实战
3、Lagent 新增自定义工具实战(以查询天气的工具为例)
4、AgentLego 新增 MagicMaker 文生图工具实战
【视频地址】:https://www.bilibili.com/video/BV1Xt4217728/
【课程文档】:https://github.com/InternLM/Tutorial/tree/camp2/agent
【课程作业】:https://github.com/InternLM/Tutorial/blob/camp2/agent/homework.md
【操作平台】:https://studio.intern-ai.org.cn/console/instance/
【lagent文档】: https://github.com/InternLM/Tutorial/blob/camp2/agent/lagent.md
【agentlego文档】:https://github.com/InternLM/Tutorial/blob/camp2/agent/agentlego.md
【lagent自定义工具】: https://lagent.readthedocs.io/zh-cn/latest/tutorials/action.html
【agentlego自定义工具】: https://agentlego.readthedocs.io/zh-cn/latest/modules/tool.html
大语言模型的局限性:幻觉、时效性,可靠性
智能体定义:感知、决策、行动
智能体组成:感知、大脑、动作
智能体类型:ReAct(选择工具),ReWoo(计划拆分),AutoGPT(人工干预)
Lagent一个轻量级开源智能体框架,旨在让用户可以高效地构建基于大语言模型的智能体。支持多种智能体范式(ReAct,ReWoo,AutoGPT),支持多种工具(如谷歌搜索、python解释器等)
AgentLego一个多模态工具包,旨在像乐高积木,可以快速简便地拓展自定义工具,从而组装出自己的智能体。支持多个智能体框架(如lagent,LangChain,Transformers Agents),提供大量视觉、多模态领域前沿算法
mkdir -p /root/agent
studio-conda -t agent -o pytorch-2.1.2
cd /root/agent
conda activate agent
git clone https://gitee.com/internlm/lagent.git
cd lagent && git checkout 581d9fb && pip install -e . && cd ..
git clone https://gitee.com/internlm/agentlego.git
cd agentlego && git checkout 7769e0d && pip install -e . && cd ..
conda activate agent
pip install lmdeploy==0.3.0
cd /root/agent
git clone -b camp2 https://gitee.com/internlm/Tutorial.git
问题:
请帮我搜索 InternLM2 Technical Report
资源:
部署:
#vscode terminal:Imdeploy的api server
conda activate agent
lmdeploy serve api_server /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-7b \
--server-name 127.0.0.1 \
--model-name internlm2-chat-7b \
--cache-max-entry-count 0.1
#new vscode terminal:streamlit web demo
conda activate agent
cd /root/agent/lagent/examples
streamlit run internlm2_agent_web_demo.py --server.address 127.0.0.1 --server.port 7860
#终端映射powershell
ssh -CNg -L 7860:127.0.0.1:7860 -L 23333:127.0.0.1:23333 root@ssh.intern-ai.org.cn -p 你的 ssh 端口号
使用:
浏览器: http://localhost:7860
Web页面设置:
模型IP:127.0.0.1:23333
插件选择:ArxivSearch
效果:
问题:
请帮我查询上海的天气
资源:
#自定义天气查询工具touch /root/agent/lagent/lagent/actions/weather.py import json import os import requests from typing import Optional, Type from lagent.actions.base_action import BaseAction, tool_api from lagent.actions.parser import BaseParser, JsonParser from lagent.schema import ActionReturn, ActionStatusCode class WeatherQuery(BaseAction): """Weather plugin for querying weather information.""" def __init__(self, key: Optional[str] = None, description: Optional[dict] = None, parser: Type[BaseParser] = JsonParser, enable: bool = True) -> None: super().__init__(description, parser, enable) key = os.environ.get('WEATHER_API_KEY', key) if key is None: raise ValueError( 'Please set Weather API key either in the environment ' 'as WEATHER_API_KEY or pass it as `key`') self.key = key self.location_query_url = 'https://geoapi.qweather.com/v2/city/lookup' self.weather_query_url = 'https://devapi.qweather.com/v7/weather/now' @tool_api def run(self, query: str) -> ActionReturn: """一个天气查询API。可以根据城市名查询天气信息。 Args: query (:class:`str`): The city name to query. """ tool_return = ActionReturn(type=self.name) status_code, response = self._search(query) if status_code == -1: tool_return.errmsg = response tool_return.state = ActionStatusCode.HTTP_ERROR elif status_code == 200: parsed_res = self._parse_results(response) tool_return.result = [dict(type='text', content=str(parsed_res))] tool_return.state = ActionStatusCode.SUCCESS else: tool_return.errmsg = str(status_code) tool_return.state = ActionStatusCode.API_ERROR return tool_return def _parse_results(self, results: dict) -> str: """Parse the weather results from QWeather API. Args: results (dict): The weather content from QWeather API in json format. Returns: str: The parsed weather results. """ now = results['now'] data = [ f'数据观测时间: {now["obsTime"]}', f'温度: {now["temp"]}°C', f'体感温度: {now["feelsLike"]}°C', f'天气: {now["text"]}', f'风向: {now["windDir"]},角度为 {now["wind360"]}°', f'风力等级: {now["windScale"]},风速为 {now["windSpeed"]} km/h', f'相对湿度: {now["humidity"]}', f'当前小时累计降水量: {now["precip"]} mm', f'大气压强: {now["pressure"]} 百帕', f'能见度: {now["vis"]} km', ] return '\n'.join(data) def _search(self, query: str): # get city_code try: city_code_response = requests.get( self.location_query_url, params={'key': self.key, 'location': query} ) except Exception as e: return -1, str(e) if city_code_response.status_code != 200: return city_code_response.status_code, city_code_response.json() city_code_response = city_code_response.json() if len(city_code_response['location']) == 0: return -1, '未查询到城市' city_code = city_code_response['location'][0]['id'] # get weather try: weather_response = requests.get( self.weather_query_url, params={'key': self.key, 'location': city_code} ) except Exception as e: return -1, str(e) return weather_response.status_code, weather_response.json()
#获取天气查询API key: https://dev.qweather.com/docs/api/
部署:
#vscode terminal:Imdeploy的api serve
conda activate agent
lmdeploy serve api_server /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-7b \
--server-name 127.0.0.1 \
--model-name internlm2-chat-7b \
--cache-max-entry-count 0.1
#new vscode terminal:启动并部署web demo
export WEATHER_API_KEY=在2.2节获取的API KEY
# 比如 export WEATHER_API_KEY=1234567890abcdef
conda activate agent
cd /root/agent/Tutorial/agent
streamlit run internlm2_weather_web_demo.py --server.address 127.0.0.1 --server.port 7860
#终端映射powershell
ssh -CNg -L 7860:127.0.0.1:7860 -L 23333:127.0.0.1:23333 root@ssh.intern-ai.org.cn -p 你的 ssh 端口号
使用:
浏览器: http://localhost:7860
Web页面设置:
模型IP:127.0.0.1:23333
插件选择:WeatherQuery
效果:
问题:
请检测图中的物体
资源:
cd /root/agent
wget http://download.openmmlab.com/agentlego/road.jpg
conda activate agent
pip install openmim==0.3.9
mim install mmdet==3.3.0
#新建目标检测工具文件:touch /root/agent/direct_use.py import re import cv2 from agentlego.apis import load_tool # load tool tool = load_tool('ObjectDetection', device='cuda') # apply tool visualization = tool('/root/agent/road.jpg') print(visualization) # visualize image = cv2.imread('/root/agent/road.jpg') preds = visualization.split('\n') pattern = r'(\w+) \((\d+), (\d+), (\d+), (\d+)\), score (\d+)' for pred in preds: name, x1, y1, x2, y2, score = re.match(pattern, pred).groups() x1, y1, x2, y2, score = int(x1), int(y1), int(x2), int(y2), int(score) cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 1) cv2.putText(image, f'{name} {score}', (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 1) cv2.imwrite('/root/agent/road_detection_direct.jpg', image)
#修改相关文件:/root/agent/agentlego/webui/modules/agents/lagent_agent.py 文件的第 105行位置,将 internlm2-chat-20b 修改为 internlm2-chat-7b,
def llm_internlm2_lmdeploy(cfg):
url = cfg['url'].strip()
llm = LMDeployClient(
- model_name='internlm2-chat-20b',
+ model_name='internlm2-chat-7b',
url=url,
meta_template=INTERNLM2_META,
top_p=0.8,
top_k=100,
temperature=cfg.get('temperature', 0.7),
repetition_penalty=1.0,
stop_words=['<|im_end|>'])
return llm
部署:
#vscode terminal:Imdeploy的api serve
conda activate agent
lmdeploy serve api_server /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-7b \
--server-name 127.0.0.1 \
--model-name internlm2-chat-7b \
--cache-max-entry-count 0.1
#new vscode terminal:启动 AgentLego WebUI
conda activate agent
cd /root/agent/agentlego/webui
python one_click.py
#终端映射powershell
ssh -CNg -L 7860:127.0.0.1:7860 -L 23333:127.0.0.1:23333 root@ssh.intern-ai.org.cn -p 你的 ssh 端口号
使用:
浏览器: http://localhost:7860
agent配置、加载,tools配置,chat选择
点击上方 Agent 进入 Agent 配置页面。(如①所示)
点击 Agent 下方框,选择 New Agent。(如②所示)
选择 Agent Class 为 lagent.InternLM2Agent。(如③所示)
输入模型 URL 为 http://127.0.0.1:23333 。(如④所示)
输入 Agent name,自定义即可,图中输入了 internlm2。(如⑤所示)
点击 save to 以保存配置,这样在下次使用时只需在第2步时选择 Agent 为 internlm2 后点击 load 以加载就可以了。(如⑥所示)
点击 load 以加载配置。(如⑦所示)
点击上方 Tools 页面进入工具配置页面。(如①所示)
点击 Tools 下方框,选择 New Tool 以加载新工具。(如②所示)
选择 Tool Class 为 ObjectDetection。(如③所示)
点击 save 以保存配置。(如④所示)
等待工具加载完成后,点击上方 Chat 以进入对话页面。在页面下方选择工具部分只选择 ObjectDetection 工具,如下图所示。为了确保调用工具的成功率,请在使用时确保仅有这一个工具启用。
效果:
问题:
请帮我生成一幅山水画
资源:
#新建目标检测工具文件: #touch /root/agent/agentlego/agentlego/tools/magicmaker_image_generation.py import json import requests import numpy as np from agentlego.types import Annotated, ImageIO, Info from agentlego.utils import require from .base import BaseTool class MagicMakerImageGeneration(BaseTool): default_desc = ('This tool can call the api of magicmaker to ' 'generate an image according to the given keywords.') styles_option = [ 'dongman', # 动漫 'guofeng', # 国风 'xieshi', # 写实 'youhua', # 油画 'manghe', # 盲盒 ] aspect_ratio_options = [ '16:9', '4:3', '3:2', '1:1', '2:3', '3:4', '9:16' ] @require('opencv-python') def __init__(self, style='guofeng', aspect_ratio='4:3'): super().__init__() if style in self.styles_option: self.style = style else: raise ValueError(f'The style must be one of {self.styles_option}') if aspect_ratio in self.aspect_ratio_options: self.aspect_ratio = aspect_ratio else: raise ValueError(f'The aspect ratio must be one of {aspect_ratio}') def apply(self, keywords: Annotated[str, Info('A series of Chinese keywords separated by comma.')] ) -> ImageIO: import cv2 response = requests.post( url='https://magicmaker.openxlab.org.cn/gw/edit-anything/api/v1/bff/sd/generate', data=json.dumps({ "official": True, "prompt": keywords, "style": self.style, "poseT": False, "aspectRatio": self.aspect_ratio }), headers={'content-type': 'application/json'} ) image_url = response.json()['data']['imgUrl'] image_response = requests.get(image_url) image = cv2.imdecode(np.frombuffer(image_response.content, np.uint8), cv2.IMREAD_COLOR) return ImageIO(image)
#注册新工具,修改 /root/agent/agentlego/agentlego/tools/__init__.py 文件,将我们的工具注册在工具列表中。 from .base import BaseTool from .calculator import Calculator from .func import make_tool from .image_canny import CannyTextToImage, ImageToCanny from .image_depth import DepthTextToImage, ImageToDepth from .image_editing import ImageExpansion, ImageStylization, ObjectRemove, ObjectReplace from .image_pose import HumanBodyPose, HumanFaceLandmark, PoseToImage from .image_scribble import ImageToScribble, ScribbleTextToImage from .image_text import ImageDescription, TextToImage from .imagebind import AudioImageToImage, AudioTextToImage, AudioToImage, ThermalToImage from .object_detection import ObjectDetection, TextToBbox from .ocr import OCR from .scholar import * # noqa: F401, F403 from .search import BingSearch, GoogleSearch from .segmentation import SegmentAnything, SegmentObject, SemanticSegmentation from .speech_text import SpeechToText, TextToSpeech from .translation import Translation from .vqa import VQA + from .magicmaker_image_generation import MagicMakerImageGeneration __all__ = [ 'CannyTextToImage', 'ImageToCanny', 'DepthTextToImage', 'ImageToDepth', 'ImageExpansion', 'ObjectRemove', 'ObjectReplace', 'HumanFaceLandmark', 'HumanBodyPose', 'PoseToImage', 'ImageToScribble', 'ScribbleTextToImage', 'ImageDescription', 'TextToImage', 'VQA', 'ObjectDetection', 'TextToBbox', 'OCR', 'SegmentObject', 'SegmentAnything', 'SemanticSegmentation', 'ImageStylization', 'AudioToImage', 'ThermalToImage', 'AudioImageToImage', 'AudioTextToImage', 'SpeechToText', 'TextToSpeech', 'Translation', 'GoogleSearch', 'Calculator', - 'BaseTool', 'make_tool', 'BingSearch' + 'BaseTool', 'make_tool', 'BingSearch', 'MagicMakerImageGeneration' ]
部署:
#vscode terminal:Imdeploy的api serve
conda activate agent
lmdeploy serve api_server /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-7b \
--server-name 127.0.0.1 \
--model-name internlm2-chat-7b \
--cache-max-entry-count 0.1
#new vscode terminal:启动 AgentLego WebUI
conda activate agent
cd /root/agent/agentlego/webui
python one_click.py
#终端映射powershell
ssh -CNg -L 7860:127.0.0.1:7860 -L 23333:127.0.0.1:23333 root@ssh.intern-ai.org.cn -p 你的 ssh 端口号
使用:
浏览器: http://localhost:7860
agent配置、加载,tools配置,chat选择
效果:
https://github.com/InternLM/lagent/blob/main/lagent/actions/ipython_interpreter.py
有人研究过这4个工具定义文件功能有什么不同吗?
ipythoninteraactive
ipythoninterpreter
ipythonmanager(暂时不知)
pythoninterpreter
这里加插件,然后model部分改成你要的
备注:来自网址https://boiled-ginger-cbc.notion.site/Lagent-80ac842782f54171a6443db3e385e846
看看本次实验使用了哪些 包
import copy import os import streamlit as st from streamlit.logger import get_logger # 这里应该可以可以总共支持多少的 actions from lagent.actions import ActionExecutor, GoogleSearch, PythonInterpreter from lagent.agents.react import ReAct ''' ReAct 使用到的地方 -- 看上去就是初始化要用的 model 和 action def initialize_chatbot(self, model, plugin_action): """Initialize the chatbot with the given model and plugin actions.""" return ReAct( llm=model, action_executor=ActionExecutor(actions=plugin_action)) ''' # 这就是字面意思 -- GPT的API from lagent.llms import GPTAPI ''' 看下使用到的地方 -- 这里就是初始化 model 有多少个选择的地方 -- 没有看到输入 api key 的地方 想要增加 model 看这里 def init_model(self, option): """Initialize the model based on the selected option.""" if option not in st.session_state['model_map']: if option.startswith('gpt'): st.session_state['model_map'][option] = GPTAPI( model_type=option) else: st.session_state['model_map'][option] = HFTransformerCasualLM( '/root/model/Shanghai_AI_Laboratory/internlm-chat-7b') return st.session_state['model_map'][option] ''' # 加载 hf 上的 model 的方式 from lagent.llms.huggingface import HFTransformerCasualLM
在当前代码的父目录下初始化一个 tmp_dir 的文件夹,然后调用 main 函数
if __name__ == '__main__':
root_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
root_dir = os.path.join(root_dir, 'tmp_dir')
os.makedirs(root_dir, exist_ok=True)
main()
我们也能看到它哈,但是是空的,估计有一些要保存什么东西的指令才会保存吧
ok 进入 main 函数
def main(): # 初始化日志 logger = get_logger(__name__) # Initialize Streamlit UI and setup sidebar # 初始化Streamlit UI if 'ui' not in st.session_state: session_state = SessionState() # 初始化一个会话状态 session_state.init_state() st.session_state['ui'] = StreamlitUI(session_state) # 初始UI else: # 设置 UI 参数:这里可以修改 显示title 和图片 st.set_page_config( layout='wide', page_title='lagent-web', page_icon='./docs/imgs/lagent_icon.png') # st.header(':robot_face: :blue[Lagent] Web Demo ', divider='rainbow') # 获取 UI 中的设置:选择了什么模型、插件、上传文件 # -- 这里没有循环调用,重新选择后可能需要刷新一下网页 model_name, model, plugin_action, uploaded_file = st.session_state[ 'ui'].setup_sidebar() # Initialize chatbot if it is not already initialized # or if the model has changed # 修改模型这里要重新加载 if 'chatbot' not in st.session_state or model != st.session_state[ 'chatbot']._llm: st.session_state['chatbot'] = st.session_state[ 'ui'].initialize_chatbot(model, plugin_action) # 获取 user 和 assistant 的对话,并且显示到界面 for prompt, agent_return in zip(st.session_state['user'], st.session_state['assistant']): st.session_state['ui'].render_user(prompt) st.session_state['ui'].render_assistant(agent_return) # User input form at the bottom (this part will be at the bottom) # with st.form(key='my_form', clear_on_submit=True): # ":=" 是一个赋值运算符, # `st.chat_input('')` 是一个函数调用,它会创建一个文本框,并等待用户在文本框中输入值。 # 所以这句话是 检测是否用户有输入 if user_input := st.chat_input(''): st.session_state['ui'].render_user(user_input) st.session_state['user'].append(user_input) # Add file uploader to sidebar # 如果有上传的文件:则读取 -- 支持图像、视频、音频 if uploaded_file: file_bytes = uploaded_file.read() file_type = uploaded_file.type if 'image' in file_type: st.image(file_bytes, caption='Uploaded Image') elif 'video' in file_type: st.video(file_bytes, caption='Uploaded Video') elif 'audio' in file_type: st.audio(file_bytes, caption='Uploaded Audio') # Save the file to a temporary location and get the path # 上传文件才保存 -- 所以咱们的 tmpdir 是空 file_path = os.path.join(root_dir, uploaded_file.name) with open(file_path, 'wb') as tmpfile: tmpfile.write(file_bytes) st.write(f'File saved at: {file_path}') # 修改 prompt user_input = '我上传了一个图像,路径为: {file_path}. {user_input}'.format( file_path=file_path, user_input=user_input) # 这里就是和 Model 交互了 agent_return = st.session_state['chatbot'].chat(user_input) st.session_state['assistant'].append(copy.deepcopy(agent_return)) logger.info(agent_return.inner_steps) st.session_state['ui'].render_assistant(agent_return)
修改图片的话,在这个路径下放自己的图就可以。
OK 这基本就是主要的 脉络了 里面两个重点 类 的实现 SessionState, StreamlitUI
if 'ui' not in st.session_state:
session_state = SessionState() # 初始化一个会话状态
session_state.init_state()
st.session_state['ui'] = StreamlitUI(session_state) # 初始UI
可以自己打开看看
我关心的部分都在 import 环节看到了 这里暂时不做展开,有空再搞
把简单的先搞了: 唔 感觉没啥说的 — 就是建立和清除状态
class SessionState: def init_state(self): """Initialize session state variables.""" st.session_state['assistant'] = [] st.session_state['user'] = [] #action_list = [PythonInterpreter(), GoogleSearch()] action_list = [PythonInterpreter()] st.session_state['plugin_map'] = { action.name: action for action in action_list } st.session_state['model_map'] = {} st.session_state['model_selected'] = None st.session_state['plugin_actions'] = set() def clear_state(self): """Clear the existing session state.""" st.session_state['assistant'] = [] st.session_state['user'] = [] st.session_state['model_selected'] = None if 'chatbot' in st.session_state: st.session_state['chatbot']._session_history = []
这里做个尝试:
我觉得在环境变量里添加 api key 应该就可以,我手里恰好有个 Key
嘶 看这个意思是从 环境变量获得 API key 呀
pytho 代码里看下
>>> import os
>>> os.getenv('OPENAI_API_KEY')
是正确的
那就是 api key 用不了 — 在我自己的平台测试 api key 可以正常使用。
在公司提供的平台不能访问,应该是没有代理的原因吧 — 是的
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。