
NVIDIA Triton Server quick-start notes

tritonserver --model-control-mode=poll --repository-poll-secs=1

0、Environment

1)Ubuntu 20.04
2)Docker
3)CUDA 11.5
4)Jetson (JetPack 4.6.1)
5)T4 GPU and its driver

1、Quickstart

1)NVIDIA Container Toolkit

curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
      && curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add - \
      && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# experimental branch (optional):
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

If this prints "Unsupported distribution!", set distribution=ubuntu18.04; at the time of writing, a list for 20.04 had not been published yet.
      
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

2)server code

git clone https://github.com/triton-inference-server/server.git
cd server/docs/examples
./fetch_models.sh
model_repository=$(pwd)/model_repository

3)server docker 

docker pull nvcr.io/nvidia/tritonserver:22.03-py3
docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v$model_repository:/models nvcr.io/nvidia/tritonserver:22.03-py3 tritonserver --model-repository=/models

4)test health

curl -v localhost:8000/v2/health/ready
Expected output: HTTP/1.1 200 OK

5)client examples

docker pull nvcr.io/nvidia/tritonserver:22.03-py3-sdk
docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:22.03-py3-sdk
/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
Prints the top-3 classification results.

2、model repository

1)model management
  model control:    NONE (default)
                    POLL     --model-control-mode=poll --repository-poll-secs=100
                    EXPLICIT supports the model control protocol over HTTP/REST and GRPC
 
  tritonserver --model-repository=<model-repository-path> --model-control-mode=none
  
2)repository layout:
  <model-repository-path>/
    <model-name>/
      [config.pbtxt]
      [<output-labels-file> ...]
      <version>/
        <model-definition-file>
      <version>/
        <model-definition-file>
      ...
    <model-name>/
      [config.pbtxt]
      [<output-labels-file> ...]
      <version>/
        <model-definition-file>
      <version>/
        <model-definition-file>
      ...
    ...
    
    Numerically named version subdirectories with a value greater than 0 are treated as valid model versions.
    
    eg: TensorRT model     
      <model-repository-path>/
            <model-name>/
                config.pbtxt
                1/
                    model.plan
                    
    eg: ONNX Models
        <model-repository-path>/
            <model-name>/
                config.pbtxt
                1/
                    model.onnx
                    
    eg: Python Models 
        <model-repository-path>/
            <model-name>/
                config.pbtxt
                1/
                    model.py
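The version rule above (numeric directory names greater than 0 are valid versions) can be sketched as a small scan routine. A minimal sketch; the repository contents and the helper name are illustrative, not a Triton API:

```python
import os
import tempfile

def valid_versions(model_dir):
    """Return the version subdirectories Triton would consider valid:
    integer-named directories whose value is greater than 0."""
    versions = []
    for name in os.listdir(model_dir):
        path = os.path.join(model_dir, name)
        if os.path.isdir(path) and name.isdigit() and int(name) > 0:
            versions.append(int(name))
    return sorted(versions)

# Build a throwaway repository to demonstrate the rule.
with tempfile.TemporaryDirectory() as repo:
    model = os.path.join(repo, "densenet_onnx")
    for sub in ("0", "1", "3", "notes"):   # "0" and "notes" are not valid versions
        os.makedirs(os.path.join(model, sub))
    open(os.path.join(model, "config.pbtxt"), "w").close()
    print(valid_versions(model))           # → [1, 3]
```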

3)Model Configuration    
    config.pbtxt
    curl localhost:8000/v2/models/<model name>/config
    
    max_batch_size > 0:  the full shape is formed as [ -1 ] + dims
    max_batch_size == 0: the full shape is formed as dims
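The shape rule above can be written down directly. A minimal sketch; full_shape is my name for the rule, not a Triton API:

```python
def full_shape(max_batch_size, dims):
    """Full tensor shape derived from a config.pbtxt entry: a variable
    batch dimension (-1) is prepended only when the model supports
    batching (max_batch_size > 0)."""
    return ([-1] + list(dims)) if max_batch_size > 0 else list(dims)

print(full_shape(8, [3, 224, 224]))   # → [-1, 3, 224, 224]
print(full_shape(0, [3, 224, 224]))   # → [3, 224, 224]
```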
    
    Auto-Generated Model Configuration
    --strict-model-config=false lets Triton derive a minimal config.pbtxt automatically for backends that support it
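For reference, a hand-written config.pbtxt for an ONNX model might look like the following; the model name, tensor names, and dims here are illustrative, not taken from the quickstart models:

```
name: "my_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

With max_batch_size: 8, the full shape of input__0 becomes [ -1, 3, 224, 224 ] by the rule above.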
    
    
