0、Environment
1)Ubuntu 20.04
2)Docker
3)CUDA 11.5
4)JetPack 4.6.1 (Jetson)
5)T4 GPU and driver
1、quickstart:
1)NVIDIA Container Toolkit
curl https://get.docker.com | sh \
&& sudo systemctl --now enable docker
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
If "Unsupported distribution!" appears, set distribution=ubuntu18.04 manually; the reason is that the repository did not yet have a list file for 20.04.
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
2)server code
git clone https://github.com/triton-inference-server/server.git
cd server/docs/examples
./fetch_models.sh
model_repository=$(pwd)/model_repository
3)server docker
docker pull nvcr.io/nvidia/tritonserver:22.03-py3
docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v$model_repository:/models nvcr.io/nvidia/tritonserver:22.03-py3 tritonserver --model-repository=/models
4)test health
curl -v localhost:8000/v2/health/ready
Output: HTTP/1.1 200 OK
5)client examples
docker pull nvcr.io/nvidia/tritonserver:22.03-py3-sdk
docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:22.03-py3-sdk
/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Prints the classification results for the image
2、model repository
1)model management
model control modes:
NONE (default): all models in the repository are loaded at startup
POLL: --model-control-mode=poll --repository-poll-secs=100, the repository is rescanned periodically for changes
EXPLICIT: models are loaded/unloaded on demand via the model control protocol (HTTP/REST or GRPC)
tritonserver --model-repository=<model-repository-path> --model-control-mode=none
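In EXPLICIT mode the server exposes load/unload endpoints under /v2/repository/models/. A minimal sketch of driving them from Python with only the standard library (the server address and model name below are placeholders, not taken from this setup):

```python
import urllib.request

def repo_action_url(base, model, action):
    """Build a Triton model-control URL; action is 'load' or 'unload'."""
    return f"{base}/v2/repository/models/{model}/{action}"

def control_model(base, model, action):
    """POST to the load/unload endpoint of a server started with
    --model-control-mode=explicit; raises on HTTP errors."""
    req = urllib.request.Request(repo_action_url(base, model, action),
                                 data=b"", method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Example (requires a running server in explicit mode):
# control_model("http://localhost:8000", "densenet_onnx", "load")
```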
2)repository layout:
<model-repository-path>/
  <model-name>/
    [config.pbtxt]
    [<output-labels-file> ...]
    <version>/
      <model-definition-file>
    <version>/
      <model-definition-file>
    ...
  <model-name>/
    [config.pbtxt]
    [<output-labels-file> ...]
    <version>/
      <model-definition-file>
    <version>/
      <model-definition-file>
    ...
  ...
Only subdirectories whose name is a number greater than 0 are treated as valid model versions
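The version rule above can be sketched as a small directory scan (the directory names below are purely illustrative):

```python
import os, tempfile

def valid_versions(model_dir):
    """Return the version subdirectories that count as valid versions:
    numerically named, with a value greater than 0."""
    versions = []
    for entry in os.listdir(model_dir):
        if os.path.isdir(os.path.join(model_dir, entry)) \
                and entry.isdigit() and int(entry) > 0:
            versions.append(int(entry))
    return sorted(versions)

# Build a toy model directory and scan it
root = tempfile.mkdtemp()
for name in ("1", "3", "0", "checkpoints"):
    os.makedirs(os.path.join(root, name))
print(valid_versions(root))  # [1, 3] -- "0" and "checkpoints" are ignored
```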
eg: TensorRT model
<model-repository-path>/
  <model-name>/
    config.pbtxt
    1/
      model.plan
eg: ONNX Models
<model-repository-path>/
  <model-name>/
    config.pbtxt
    1/
      model.onnx
eg: Python Models
<model-repository-path>/
  <model-name>/
    config.pbtxt
    1/
      model.py
3)Model Configuration
config.pbtxt
curl localhost:8000/v2/models/<model name>/config
max_batch_size > 0: the full shape is formed as [ -1 ] + dims (the leading -1 is the variable batch dimension)
max_batch_size == 0: the full shape is exactly dims (no implicit batch dimension)
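A minimal hand-written config.pbtxt for the densenet_onnx example above might look like the following. The tensor names and dims here are assumptions based on the public DenseNet ONNX model; verify them against your actual model (e.g. with the curl command above) before copying:

```
name: "densenet_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 0
input [
  {
    name: "data_0"
    data_type: TYPE_FP32
    dims: [ 1, 3, 224, 224 ]
  }
]
output [
  {
    name: "fc6_1"
    data_type: TYPE_FP32
    dims: [ 1, 1000, 1, 1 ]
  }
]
```

Because max_batch_size == 0 here, the full tensor shapes are exactly the dims as written, with no implicit batch dimension.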
Auto-Generated Model Configuration
Start the server with --strict-model-config=false and Triton will generate a minimal configuration automatically for backends that can supply one