https://www.bilibili.com/video/BV11K421h7dY/
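[xinference] (7): Deploying embedding, rerank, and Qwen models in one go with Xinference on AutoDL, compatible with the OpenAI API
Xorbits Inference (Xinference) is an open-source platform that simplifies running and integrating a wide range of AI models. With Xinference you can run inference with any open-source LLM, embedding model, or multimodal model, in the cloud or on your own machines, and build AI applications on top of them.
Official site: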
https://inference.readthedocs.io/zh-cn/latest/index.html
Starting the Xinference service
https://gitee.com/fly-llm/xinference-run-llm
Download the project on AutoDL. The main step is the installation:
pip3 install "xinference[all]"
# https://hf-mirror.com/
export HF_ENDPOINT=https://hf-mirror.com
export XINFERENCE_MODEL_SRC=modelscope
export XINFERENCE_HOME=/root/autodl-tmp
# Start xinference-local first:
nohup xinference-local --host 0.0.0.0 --port 9997 > xinference-local.log 2>&1 &
Once the service is up and reachable, you can launch the individual models.
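Because the service is started in the background with nohup, it can take a moment before it accepts requests. The sketch below polls the /v1/models endpoint (the same one used at the end of this post) until it answers. The address 127.0.0.1, the helper names, and the retry values are assumptions made for illustration, not part of Xinference.

```python
import json
import time
import urllib.request

# Assumption: the script runs on the same machine as xinference-local (port 9997 as above).
BASE_URL = "http://127.0.0.1:9997"


def list_model_ids(base_url: str = BASE_URL, timeout: float = 5.0) -> list:
    """Return the ids of the models currently served (GET /v1/models)."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["id"] for model in payload.get("data", [])]


def wait_until_ready(fetch=list_model_ids, retries: int = 30, delay: float = 2.0) -> bool:
    """Call fetch() until it stops raising a network error or the retries run out."""
    for _ in range(retries):
        try:
            fetch()
            return True
        except OSError:  # URLError, ConnectionRefusedError and timeouts are all OSError
            time.sleep(delay)
    return False
```

Calling wait_until_ready() before the first xinference launch avoids racing the server startup. If it returns False, xinference-local.log is the place to look.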
https://inference.readthedocs.io/zh-cn/latest/models/builtin/embedding/bge-large-zh-v1.5.html
xinference launch --model-name bge-large-zh-v1.5 --model-type embedding
Test the endpoint:
curl http://0.0.0.0:9997/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": "测试embeddings",
"model": "bge-large-zh-v1.5"
}'
If the response contains an array of vectors, the embedding model is working.
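The same request can be sent from Python with only the standard library. This is a minimal sketch: the post does not show the embeddings response, so the parsing assumes the OpenAI-style shape (a "data" list whose items carry "index" and "embedding"). The address 127.0.0.1 and the function names are likewise assumptions.

```python
import json
import urllib.request

# Assumption: client runs on the same machine as xinference-local.
BASE_URL = "http://127.0.0.1:9997"


def build_embedding_request(text, model="bge-large-zh-v1.5", base_url=BASE_URL):
    """Build the POST request that mirrors the curl call above."""
    body = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_vectors(payload):
    """Return the embedding vectors in input order.

    Assumes the OpenAI-style response: {"data": [{"index": i, "embedding": [...]}]}.
    """
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(text, model="bge-large-zh-v1.5"):
    """Send the request and return the list of vectors (needs a running server)."""
    with urllib.request.urlopen(build_embedding_request(text, model)) as resp:
        return extract_vectors(json.load(resp))
```

With the server running, embed("测试embeddings") should return one vector. Its length can be compared with the "dimensions" value that /v1/models reports for the model.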
https://inference.readthedocs.io/zh-cn/latest/models/builtin/rerank/bge-reranker-large.html
xinference launch --model-name bge-reranker-large --model-type rerank
Test the endpoint:
curl -X 'POST' 'http://0.0.0.0:9997/v1/rerank' \
-H 'Content-Type: application/json' \
-d '{
"model": "bge-reranker-large",
"query": "A man is eating pasta.",
"documents": [
"A man is eating food.",
"A man is eating a piece of bread.",
"The girl is carrying a baby.",
"A man is riding a horse.",
"A woman is playing violin."
]
}'
Response:
{"id":"48f0e1ec-f0fd-11ee-be64-0242ac110004","results":[{"index":0,"relevance_score":0.9999258518218994,"document":null},{"index":1,"relevance_score":0.04828396067023277,"document":null},{"index":2,"relevance_score":0.00007636439841007814,"document":null},{"index":4,"relevance_score":0.00007636331429239362,"document":null},{"index":3,"relevance_score":0.00007617334631504491,"document":null}]}
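The response carries "document": null, so each score has to be matched to its text through "index". The sketch below does that for the exact response shown above. The function name rank_documents is made up for this example.

```python
import json

# The documents sent in the request above, in the same order.
DOCUMENTS = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
    "A woman is playing violin.",
]

# The rerank response shown above, copied verbatim.
RESPONSE = json.loads(
    '{"id":"48f0e1ec-f0fd-11ee-be64-0242ac110004","results":['
    '{"index":0,"relevance_score":0.9999258518218994,"document":null},'
    '{"index":1,"relevance_score":0.04828396067023277,"document":null},'
    '{"index":2,"relevance_score":0.00007636439841007814,"document":null},'
    '{"index":4,"relevance_score":0.00007636331429239362,"document":null},'
    '{"index":3,"relevance_score":0.00007617334631504491,"document":null}]}'
)


def rank_documents(documents, response):
    """Pair each document with its score, highest relevance first."""
    ranked = sorted(
        response["results"], key=lambda r: r["relevance_score"], reverse=True
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in ranked]


for doc, score in rank_documents(DOCUMENTS, RESPONSE):
    print(f"{score:.6f}  {doc}")
```

Sorting explicitly rather than trusting the order of "results" keeps the helper correct even if a server returns the entries unsorted.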
https://inference.readthedocs.io/zh-cn/latest/models/builtin/llm/qwen-chat.html
xinference launch --model-name qwen-chat --size-in-billions 1_8 --model-format gptq --quantization Int8
Test the endpoint:
curl -X 'POST' 'http://0.0.0.0:9997/v1/chat/completions' -H 'Content-Type: application/json' -d '{
"model": "qwen-chat",
"messages": [
{
"role": "user",
"content": "北京景点?"
}
],
"max_tokens": 512,
"temperature": 0.7
}'
Response:
{"id":"chat8967be76-f0f8-11ee-af26-0242ac110004","object":"chat.completion","created":1712066074,"model":"qwen-chat","choices":[{"index":0,"message":{"role":"assistant","content":"1、故宫:位于北京市中心,是中国明清两代的皇家宫殿,也是世界上现存规模最大、保存最完整的木质结构古建筑之一。2、天安门广场:是国家象征性的建筑群,也是中华人民共和国的重要标志。3、颐和园:世界文化遗产,以昆明湖和万寿山为主,是中国古代园林艺术的瑰宝。4、长城:是中国古代劳动人民智慧的结晶,被誉为“万里长城”。5、圆明园:中国清朝时期的皇家园林,有“万园之园”之称。6、八达岭长城:世界文化遗产,是中国古代劳动人民智慧的结晶。7、北海公园:是中国四大名园之一,以其美丽的湖光山色和众多的历史文物而闻名于世。8、南锣鼓巷:有着上千年的历史,是北京最具有特色的胡同之一,充满了老北京的生活气息。9、天坛公园:是中国明清两代皇帝祭天祈谷的地方,也是中国最大的古代祭祀场所之一。10、颐和园附近的西山风景区:是中国著名的风景名胜区,以其秀美的自然风光和丰富的历史文化内涵而闻名。11、北京动物园:是中国最大的动物园之一,拥有大量的珍稀动物,如大熊猫、金丝猴等。12、北京植物园:是中国著名的人造园林,汇集了世界各地的名贵花卉,是游览赏花的好去处。13、北京欢乐谷:是一个大型的主题公园,集休闲、娱乐、文化于一体,游客可以在这里尽情享受各种游乐设施。14、北京野生动物园:是中国最大的野生动物饲养繁育基地之一,提供了大量野生动物供游客观赏。15、北京水立方:是中国最大的游泳馆之一,也是北京城市名片之一。"},"finish_reason":"stop"}],"usage":{"prompt_tokens":22,"completion_tokens":364,"total_tokens":386}}
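A client usually needs only the assistant text, the finish reason, and the token usage from this response. The sketch below builds the same request body as the curl call and pulls those fields out. SAMPLE_RESPONSE is the response above with the content cut to its first item; the address 127.0.0.1 and the function names are assumptions for illustration.

```python
import json
import urllib.request

# Assumption: client runs on the same machine as xinference-local.
BASE_URL = "http://127.0.0.1:9997"

# The response above, with "content" abbreviated to its first item.
SAMPLE_RESPONSE = {
    "id": "chat8967be76-f0f8-11ee-af26-0242ac110004",
    "object": "chat.completion",
    "created": 1712066074,
    "model": "qwen-chat",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "1、故宫:位于北京市中心"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 22, "completion_tokens": 364, "total_tokens": 386},
}


def build_chat_payload(question, model="qwen-chat", max_tokens=512, temperature=0.7):
    """Same JSON body as the curl call above."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def extract_reply(response):
    """Return (assistant text, finish reason, total tokens)."""
    choice = response["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


def chat(question, base_url=BASE_URL):
    """Send the question to /v1/chat/completions (needs a running server)."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(question)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return extract_reply(json.load(resp))


print(extract_reply(SAMPLE_RESPONSE))
```

A finish_reason of "length" instead of "stop" would indicate that the answer was cut off by max_tokens, which is worth checking before using the text.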
Several models can be served side by side. List them with:
curl http://0.0.0.0:9997/v1/models
{
"object": "list",
"data": [
{
"id": "bge-large-zh-v1.5",
"object": "model",
"created": 0,
"owned_by": "xinference",
"model_type": "embedding",
"address": "0.0.0.0:34327",
"accelerators": [
"0"
],
"model_name": "bge-large-zh-v1.5",
"dimensions": 1024,
"max_tokens": 512,
"language": [
"zh"
],
"model_revision": "v0.0.1",
"replica": 1
},
{
"id": "bge-reranker-large",
"object": "model",
"created": 0,
"owned_by": "xinference",
"model_type": "rerank",
"address": "0.0.0.0:37947",
"accelerators": [
"0"
],
"model_name": "bge-reranker-large",
"language": [
"en",
"zh"
],
"model_revision": "v0.0.1",
"replica": 1
},
{
"id": "qwen-chat",
"object": "model",
"created": 0,
"owned_by": "xinference",
"model_type": "LLM",
"address": "0.0.0.0:37003",
"accelerators": [
"0"
],
"model_name": "qwen-chat",
"model_lang": [
"en",
"zh"
],
"model_ability": [
"chat",
"tools"
],
"model_description": "Qwen-chat is a fine-tuned version of the Qwen LLM trained with alignment techniques, specializing in chatting.",
"model_format": "gptq",
"model_size_in_billions": "1_8",
"model_family": "qwen-chat",
"quantization": "Int8",
"model_hub": "modelscope",
"revision": "master",
"context_length": 32768,
"replica": 1
}
]
}
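When one service hosts an embedding model, a reranker, and an LLM together, it helps to know which model id belongs to which endpoint. The sketch below groups the listing by "model_type". SAMPLE_LISTING keeps only the "id" and "model_type" fields from the output above, and group_by_type is a made-up helper name.

```python
from collections import defaultdict

# Reduced copy of the /v1/models output above (only the fields used here).
SAMPLE_LISTING = {
    "object": "list",
    "data": [
        {"id": "bge-large-zh-v1.5", "model_type": "embedding"},
        {"id": "bge-reranker-large", "model_type": "rerank"},
        {"id": "qwen-chat", "model_type": "LLM"},
    ],
}


def group_by_type(listing):
    """Map each model_type to the list of model ids of that type."""
    groups = defaultdict(list)
    for model in listing["data"]:
        groups[model["model_type"]].append(model["id"])
    return dict(groups)


print(group_by_type(SAMPLE_LISTING))
```

The resulting mapping tells a client which id to send to /v1/embeddings, /v1/rerank, and /v1/chat/completions respectively.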