爱喝兽奶帝天荒

这个屌丝很懒，什么也没留下！

热门标签

大模型应用开发框架:autoGen初体验与原理

作者：爱喝兽奶帝天荒 | 2024-08-15 13:47:36

踩

autogen

在体验和学习autogen的原理前，先来看看官网的两段话：在这里插入图片描述
总的来说，autoGen是为了复杂的工作流而生的LLM应用开发框架，通过可定制可对话的agent与LLM交互，简化LLM工作流的编排、优化和自动化。
原文地址：AutoGen: Enabling next-generation large language model applications
docs地址：Getting Started

本文将从autogen简单使用，functioncall，代码生成与执行、Groupchat等几个方面对autogen原理做简要分析。

一、环境准备：

1、安装pyautogen
创建Python的应用工程后，需要依赖pyautogen

pip install pyautogen
1

注意别安装成autogen了，这里是带py开头的

2、模型准备
能直接对接openai那体验效果应该会更好，但如果没有那也没关系，在测试上，千问也是够用的，qwen-14b/qwen-72b都是开源的，如果有资源可以直接下载部署，另外直接在阿里云平台申请开通千问的API接口访问也是可选的（按字数计费，有一定的免费额度）

3、llmconfig配置

llm_config = {
            "config_list": [
                {
                    "model": "qwen-72b",
                    "base_url": "NULL",#补充自己地址
                    "api_key": "NULL" #补充自己密钥
                    "cache_seed":None #如果想要每次都访问llm，不希望直接取缓存结果可配置为None
                }]
        }
1
2
3
4
5
6
7
8
9

二、入门demo与交互原理

上demo代码

import unittest
from autogen import AssistantAgent, UserProxyAgent

class MyTestCase(unittest.TestCase):
    def test_something(self):
        assistant = AssistantAgent(
            name="assistant",
            llm_config=llm_config
        )

        user_proxy = UserProxyAgent(
            name="user_proxy",
            max_consecutive_auto_reply=0,
            human_input_mode="NEVER",
            # human_input_mode="ALWAYS",
            llm_config=llm_config,
            code_execution_config={
                "work_dir": "coding",
                "use_docker": False
            }
        )
        user_proxy.initiate_chat(assistant, message="中国的首都是哪里，到北极大约多少公里")
        
if __name__ == '__main__':
    unittest.main()
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

在这里插入图片描述
分析流程前，先来大概看下UserProxyAgent、AssistantAgent的关系，顺便也看看后面将介绍的群聊功能GroupChatManager的关系：

基本上，了解了ConversableAgent的每个属性的作用，基本就可以autogen的整个功能框架有个大概得掌握，比如，何时停止继续询问，何时与LLM交互，何时调用工具，何时执行代码等。
接下来就能更好理解agent间的交互流程：
在这里插入图片描述

步骤6中，当配置human_input_mode为ALWAYS或TERMINATE，在执行check_termination_and_human_reply时会与用户交互，可定制get_human_input实现定制用户输入方式：
user_proxy.get_human_input = custom_get_human_input
源码默认是通过标准输入stdin的方式接收用户输入。

二、工具调用

上demo代码

import unittest
from typing import Literal, Annotated

from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager, agentchat
CurrencySymbol = Literal["USD", "EUR"]

def exchange_rate(base_currency: CurrencySymbol, quote_currency: CurrencySymbol) -> float:
    if base_currency == quote_currency:
        return 1.0
    elif base_currency == "USD" and quote_currency == "EUR":
        return 1 / 5
    elif base_currency == "EUR" and quote_currency == "USD":
        return 5
    else:
        raise ValueError(f"Unknown currencies {base_currency}, {quote_currency}")

# # NOTE: for Azure OpenAI, please use API version 2023-12-01-preview or later as
# # support for earlier versions will be deprecated.
# # For API versions 2023-10-01-preview or earlier you may
# # need to set `api_style="function"` in the decorator if the default value does not work:
# # `register_for_llm(description=..., api_style="function")`.
# @user_proxy.register_for_execution()
# @chatbot.register_for_llm(description="Currency exchange calculator.")
def currency_calculator(
    base_amount: Annotated[float, "Amount of currency in base_currency"],
    base_currency: Annotated[CurrencySymbol, "Base currency"] = "USD",
    quote_currency: Annotated[CurrencySymbol, "Quote currency"] = "EUR",
) -> str:
    quote_amount = exchange_rate(base_currency, quote_currency) * base_amount
    return f"{quote_amount} {quote_currency}"

def sayHello(name:Annotated[str,"向 name打招呼"])->str:
    return f"hello {name}"

class MyTestCase(unittest.TestCase):
    def testTools(self):
        chatbot = AssistantAgent(
            name="chatbot",
            system_message="用于计算美元欧元转换任务，只允许通过提供的functions 完成转换计算，请从问题中提取参数完成对应function的调用，在得到最终结果后，回复结果中拼接上TERMINATE",
            # system_message="For currency exchange tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done.",
            llm_config=llm_config,
        )
        hellobot = AssistantAgent(
            name="hellobot",
            system_message="用于向某个人say hello打招呼，只允许通过提供的functions 完成，请从问题中提取参数完成对应function的调用，在得到最终结果后，回复结果中拼接上TERMINATE",
            # system_message="For currency exchange tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done.",
            llm_config=llm_config,
        )

        # create a UserProxyAgent instance named "user_proxy"
        user_proxy = UserProxyAgent(
            name="user_proxy",
            system_message="用户代理，用于向其他的角色请求处理问题",
            # system_message="A human admin, who talk with the Product_Manager first to discuss the plan and approve it..",
            is_termination_msg=lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
            human_input_mode="NEVER",
            max_consecutive_auto_reply=10,
            code_execution_config={"work_dir": "coding",
                                   "use_docker": False}
        )
        # # Register the function with the chatbot's llm_config.
        # currency_calculator = chatbot.register_for_llm(description="Currency exchange calculator.")(currency_calculator)
        #
        # # Register the function with the user_proxy's function_map.
        # user_proxy.register_for_execution()(currency_calculator)

        # agentchat.register_function(
        #     currency_calculator,
        #     caller=chatbot,
        #     executor=user_proxy,
        #     description="计算美元欧元转换",
        # )
        f = chatbot.register_for_llm(name="currency_calculator", description="计算美元欧元转换",api_style="function")(currency_calculator)
        user_proxy.register_for_execution(name="currency_calculator")(f)

        fhello = hellobot.register_for_llm(name="sayhello", description="用于向某人sayhello 打招呼",api_style="function")(sayHello)
        user_proxy.register_for_execution(name="sayhello")(fhello)

        user_proxy.initiate_chat(
            chatbot,
            message=" 123.45美元（ USD） 等于多少 欧元（EUR）?",
        )
        user_proxy.initiate_chat(
            hellobot,
            message=" 向eshin打招呼吧",
        )

if __name__ == '__main__':
    unittest.main()
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89

在这里插入图片描述
工具调用用到模型层提供的functioncall功能，方法注册的原理大致如下：

注意register_for_llm设置参数时，api_style有function和tool两种，默认是tool，function是openai过时的接口字段,但目前千问还是只支持function，千问通过openai_api.py实现了functioncall功能，如果通过vllm部署千问，目前vllm本身不支持functioncall。

工具调用流程如下： 在这里插入图片描述

从交互的信息可以看出：在向LLM请求时，发送的消息，通过functions(tools) 字段描述方法name，用途，参数要求等方法元数据 信息供LLM分析匹配选择，LLM响应时，通过function_call（tool_calls） 告知请求方LLM选择的需要执行的工具方法。最终由user_proxy通过generate_function_call_reply(generate_tool_calls_reply) 完成对应工具方法的调用。

因此是通过assistant向LLM传递方法元数据信息，真正执行还是由userproxy调用执行

完整请求响应参考如下：
在这里插入图片描述

请求：{'context': None, 'messages': [{'content': '用于计算美元欧元转换任务，只允许通过提供的functions 完成转换计算，请从问题中提取参数完成对应function的调用，在得到最终结果后，回复结果中拼接上TERMINATE', 'role': 'system'}, {'content': ' 123.45美元（ USD） 等于多少 欧元（EUR）?', 'role': 'user'}], 'cache': None, 'temperature': 0, 'functions': [{'description': '计算美元欧元转换', 'name': 'currency_calculator', 'parameters': {'type': 'object', 'properties': {'base_amount': {'type': 'number', 'description': 'Amount of currency in base_currency'}, 'base_currency': {'enum': ['USD', 'EUR'], 'type': 'string', 'default': 'USD', 'description': 'Base currency'}, 'quote_currency': {'enum': ['USD', 'EUR'], 'type': 'string', 'default': 'EUR', 'description': 'Quote currency'}}, 'required': ['base_amount']}}], 'model': 'qwen-14b'}

响应：function_call=FunctionCall(arguments='{"base_amount": 123.45, "base_currency": "USD", "quote_currency": "EUR"}', name='currency_calculator'), tool_calls=None))], created=1708878949, model='qwen-14b', object='chat.completion', system_fingerprint=None, usage=None, cost=0)
1
2
3

三、代码生成与执行
按上面的方式创建unittest类，添加如下单元测试方法：

 def testConversation(self):
        assistant = AssistantAgent("assistant", llm_config=llm_config)
        user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding",
                                                                         "use_docker": False})  # IMPORTANT: set to True to run code in docker, recommended
        user_proxy.initiate_chat(assistant, message="计算2023的立方，然后加上5")
        # user_proxy.initiate_chat(assistant, message="I want to create a simple front-end html page with the text Hello World.")

1
2
3
4
5
6
7

在这里插入图片描述

生成的代码保存在work_dir配置的位置

代码生成的流程与工具调用的的LLM交互流程并没有太大区别，不同的是在user_proxy中执行的是generate_code_execution_reply 。
在assistantagent的默认system_message配置中：在这里插入图片描述
提到，如果有浏览查找网站，下载文件，获取时间，检查操作系统、或其他LLM认为需要通过代码块执行解决等方面的问题，LLM可以生成以python或者shell标注（目前从源码看，只支持这两种）的代码块让调用方执行并获得结果，并通过print打印的数据，作为代码块执行的输出：
在这里插入图片描述
当然，我们也可以参考上面system_message 的内容自己编写提示信息。

从执行线索generate_code_execution_reply->execute_code_blocks()->run_code()->execute_code() 看：代码块最终通过subprocess.run() 执行，并将标准输出stdout 作为正常输出结果，因此在提示词里需要要求生成的代码通过print 输出结果，有兴趣可以跑下一下的例子：

    def test_subprocess(self):
    	#windows命令
        # result = subprocess.run(["dir","coding"], capture_output=True, text=True,shell=True)
        #Linux bash命令
        # result = subprocess.run(["ls","-l"], capture_output=True, text=True,shell=True)
        result = subprocess.run(["python", "coding/compare.py"], capture_output=True, text=True)

        # 检查命令执行结果
        print("returnCode:",result.returncode)  # 返回码
        print("stdout:",result.stdout)  # 标准输出
        print("stderr:",result.stderr)  # 标准错误
1
2
3
4
5
6
7
8
9
10
11

四、群聊模式GroupChat

    def testGroup(self):
        user_proxy = UserProxyAgent(
            name="user_proxy",
            system_message="用户代理，用于向其他的角色请求处理问题",
            is_termination_msg=lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE"),
            human_input_mode="TERMINATE",
            max_consecutive_auto_reply=10,
            code_execution_config={"last_n_messages": 2,"work_dir": "groupchat","use_docker": False}
        )
        chatbot = AssistantAgent(
            name="美元欧元转换",
            system_message="用于计算美元欧元转换任务，在得到最终结果后，回复结果中拼接上TERMINATE",
            # system_message="For currency exchange tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done.",
            llm_config=llm_config,
        )
        hellobot = AssistantAgent(
            name="打招呼",
            system_message="用于向某个人say hello打招呼，在得到最终结果后，回复结果中拼接上TERMINATE",
            # system_message="For currency exchange tasks, only use the functions you have been provided with. Reply TERMINATE when the task is done.",
            llm_config=llm_config,
        )

        groupchat = GroupChat(agents=[user_proxy, hellobot, chatbot], messages=[], max_round=20)
        manager = GroupChatManager(
            groupchat=groupchat,
            code_execution_config={"use_docker": False},
            llm_config=llm_config,
            is_termination_msg=lambda x: x.get("content", "") and x.get("content", "").rstrip().endswith("TERMINATE")
        )
        user_proxy.initiate_chat(manager, message="123.45美元（ USD） 等于多少 欧元（EUR）?")
        user_proxy.initiate_chat(manager, message="向eshin打招呼")
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31

在这里插入图片描述

先来看看GroupChatManager的结构：在这里插入图片描述
接下来看看群聊的流程原理：

在通过LLM选择匹配时，GroupChatManager向LLM发送消息：在这里插入图片描述

请求：{'context': None, 'messages': [{'content': "You are in a role play game. The following roles are available:\nuser_proxy: A user that can run Python code or input command line commands at a Linux terminal and report back the execution results.\n打招呼: 用于向某个人say hello打招呼，在得到最终结果后，回复结果中拼接上TERMINATE\n美元欧元转换: 用于计算美元欧元转换任务，在得到最终结果后，回复结果中拼接上TERMINATE.\n\nRead the following conversation.\nThen select the next role from ['user_proxy', '打招呼', '美元欧元转换'] to play. Only return the role.", 'role': 'system'}, {'content': '123.45美元（ USD） 等于多少 欧元（EUR）?', 'role': 'user', 'name': 'user_proxy'}, {'role': 'system', 'content': "Read the above conversation. Then select the next role from ['user_proxy', '打招呼', '美元欧元转换'] to play. Only return the role."}], 'cache': None, 'temperature': 0, 'model': 'qwen-72b'}

响应：(content='美元欧元转换', role='assistant', function_call=None, tool_calls=None))]
1
2
3

可以看到GroupChatManager将多个agent的描述信息及相关问题一同发往LLM，LLM通过分析匹配对应的agent，返回对应的agent名和对应的角色给到GroupChatManager，GroupChatManager再speak to对应的agent。

目前玩到的功能就这些，后续如果有玩其他的功能，再做笔记记录
to be continue…

声明：本文内容由网友自发贡献，转载请注明出处：【wpsshop】