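MindIE is a high-performance deep learning inference framework for Ascend hardware, covering runtime acceleration, debugging and tuning, and fast migration and deployment. It includes components such as MindIE-Service, MindIE-Torch, and MindIE-RT. I mainly use MindIE-Service, which is the counterpart of large language model inference frameworks such as vLLM.
First, pull the image (check the official site for the latest image version):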
docker pull swr.cn-central-221.ovaijisuan.com/dxy/mindie:1.0.RC1-800I-A2-aarch64
Then start the container. Here I map the first two NPU accelerator cards into the container:
docker run --name my_mindie -it -d --net=host --shm-size=500g \
    --device=/dev/davinci0 \
    --device=/dev/davinci1 \
    -w /home \
    --device=/dev/davinci_manager \
    --device=/dev/hisi_hdc \
    --device=/dev/devmm_svm \
    --entrypoint=bash \
    -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/sbin:/usr/local/sbin \
    -v /root/xxx/mindformer_share/:/home/xxx_share \
    -v /tmp:/tmp \
    -v /etc/hccn.conf:/etc/hccn.conf \
    -v /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime \
    -e http_proxy=$http_proxy \
    -e https_proxy=$https_proxy \
    swr.cn-central-221.ovaijisuan.com/dxy/mindie:1.0.RC1-800I-A2-aarch64
The -v /root/xxx/mindformer_share/:/home/xxx_share option above mounts my own disk into the container; change it to match your environment.
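To map a different set of cards, the --device flags can be generated instead of typed by hand. A minimal Python sketch, assuming the per-card device nodes follow the /dev/davinciN naming seen in the command above; the helper name davinci_device_flags is hypothetical:

```python
def davinci_device_flags(card_ids):
    """Return the docker --device flags for the given NPU card indices."""
    # Per-card device nodes, assuming the /dev/davinciN naming from the command above.
    flags = [f"--device=/dev/davinci{card_id}" for card_id in card_ids]
    # Shared management devices, copied verbatim from the command above.
    flags += [
        "--device=/dev/davinci_manager",
        "--device=/dev/hisi_hdc",
        "--device=/dev/devmm_svm",
    ]
    return flags


# Print the flags for the first two cards, ready to paste into docker run.
print(" ".join(davinci_device_flags([0, 1])))
```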
Enter the container:
docker exec -it my_mindie bash
Once inside, set up the environment:
source /usr/local/Ascend/ascend-toolkit/set_env.sh
source /usr/local/Ascend/mindie/set_env.sh
With that done, you can edit the MindIE-Service configuration file, located at /usr/local/Ascend/mindie/latest/mindie-service/conf/config.json.
"ipAddress" : "0.0.0.0",
"port" : 1025,
"ModelDeployParam": {
    "maxSeqLen" : 4096,
    "npuDeviceIds" : [[0,1]],
    "ModelParam" : [
        {
            "modelName" : "baichuan2",
            "modelWeightPath" : "/home/xxxx/baichuan-inc/Baichuan2-13B-Chat/",
            "worldSize" : 2,
            "cpuMemSize" : 5,
            "npuMemSize" : 10,
            "backendType": "atb"
        }
    ]
},
The fields listed above are the ones I care about.
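A quick consistency check can catch a mismatched card count before the service is started. The sketch below is only an illustration: it assumes that worldSize should equal the number of IDs in the matching npuDeviceIds group, which is my reading of the sample values (worldSize 2, two device IDs) rather than a rule stated in this post. The function name check_model_deploy_param is hypothetical.

```python
import json

# The ModelDeployParam object from the config above, wrapped as a standalone JSON document.
SAMPLE = """
{
    "maxSeqLen": 4096,
    "npuDeviceIds": [[0, 1]],
    "ModelParam": [
        {
            "modelName": "baichuan2",
            "modelWeightPath": "/home/xxxx/baichuan-inc/Baichuan2-13B-Chat/",
            "worldSize": 2,
            "cpuMemSize": 5,
            "npuMemSize": 10,
            "backendType": "atb"
        }
    ]
}
"""


def check_model_deploy_param(deploy):
    """Return a list of problems found in a ModelDeployParam dict (empty if none)."""
    problems = []
    device_groups = deploy.get("npuDeviceIds", [])
    models = deploy.get("ModelParam", [])
    # Assumption: the i-th model runs on the i-th device group.
    for model, group in zip(models, device_groups):
        if model.get("worldSize") != len(group):
            problems.append(
                f"{model.get('modelName')}: worldSize={model.get('worldSize')} "
                f"but {len(group)} device id(s) in {group}"
            )
    return problems


print(check_model_deploy_param(json.loads(SAMPLE)))
```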
After saving the configuration, start the service:
cd /usr/local/Ascend/mindie/latest/mindie-service/
bin/mindieservice_daemon
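Loading the model weights can take a while, so it may help to wait until the port accepts connections before sending requests. A minimal sketch using only the Python standard library; the helper name wait_for_port and the timeout values are my own choices, and an open port only shows that something is listening, not that the model has finished loading:

```python
import socket
import time


def wait_for_port(host, port, timeout_s=300.0, interval_s=2.0):
    """Poll until a TCP connection to host:port succeeds; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused or timed out: wait a little and try again.
            time.sleep(interval_s)
    return False


# Example (port 1025 is the value from the config above):
# wait_for_port("127.0.0.1", 1025)
```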
You can call the HTTP service with Postman or from Python.
POST http://223.106.234.6:2250/generate
{
"prompt": "你是谁?\n",
"max_tokens": 1024,
"repetition_penalty": 1.03,
"presence_penalty": 1.2,
"frequency_penalty": 1.2,
"temperature": 0.5,
"top_k": 10,
"top_p": 0.95,
"stream": false
}
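The same request can be sent from Python. This is a minimal sketch using only the standard library; the function names build_generate_request and generate are hypothetical, and because the post does not show the response format, the response body is returned as raw text rather than parsed.

```python
import json
import urllib.request


def build_generate_request(url, prompt, **sampling):
    """Build a POST request whose JSON body mirrors the example above."""
    payload = {"prompt": prompt, "stream": False, **sampling}
    # ensure_ascii=False keeps non-ASCII prompts readable in the encoded body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(url, prompt, timeout_s=120.0, **sampling):
    """Send the request and return the raw response text."""
    request = build_generate_request(url, prompt, **sampling)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read().decode("utf-8")


# Example, using the address and a few of the parameters from the request above:
# print(generate("http://223.106.234.6:2250/generate", "你是谁?\n",
#                max_tokens=1024, temperature=0.5, top_k=10, top_p=0.95))
```

Keeping the request construction separate from the network call makes it possible to inspect the payload without a running service.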
MindIE supports OpenAI-, Triton-, and vLLM-style interfaces, among others. See the official documentation for details.