1. Card reader
2. SD card
3. Ethernet cable
4. Keyboard
5. Mouse
Definition: think of this as formatting a computer. Our Jetson Nano had been used by someone before, so the SD card needs to be wiped first.
When you plug in the card reader, several drive letters will pop up; formatting just one of them is enough.
Step 1:
Download the image to be flashed; the link is given below!
JetPack SDK 4.4.1 archive | NVIDIA Developer
Unzip it once the download finishes.
Step 2:
We also need the tool that flashes the SD card; the link is below! Get Started With Jetson Nano Developer Kit | NVIDIA Developer
The download is an installer package (NVIDIA's guide uses balenaEtcher); just run it.
Once it is installed, you can start flashing!
Start the flash; the tool automatically makes two passes (it writes the image, then validates it).
When flashing is done, insert the card and power on.
1. Configure CUDA
Open a terminal and run:
sudo gedit ~/.bashrc
A document pops up; scroll to the very bottom, append the following lines, then save and exit:
- export CUDA_HOME=/usr/local/cuda-10.2
- export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH
- export PATH=/usr/local/cuda-10.2/bin:$PATH
Reload the configuration with source ~/.bashrc (or open a new terminal), then verify that it worked:
nvcc -V
2. Configure the conda environment
The Jetson Nano B01 uses the aarch64 architecture, unlike a typical x86 Windows or Linux machine, so Anaconda itself cannot be installed; Archiconda is a drop-in replacement for it.
Run:
wget https://github.com/Archiconda/build-tools/releases/download/0.2.3/Archiconda3-0.2.3-Linux-aarch64.sh
Once the download finishes, run:
bash Archiconda3-0.2.3-Linux-aarch64.sh
After installation, check whether it succeeded:
conda -V
With that installed, configure the environment:
sudo gedit ~/.bashrc
A document pops up; same procedure as before, append one line at the bottom, then save and exit:
export PATH=~/archiconda3/bin:$PATH
3. Create your own virtual environment
- conda create -n xxx python=3.6   # create a Python 3.6 virtual environment (replace xxx with your environment name)
- conda activate xxx   # enter the virtual environment
- conda deactivate   # leave the virtual environment
On some setups conda activate fails to enter the environment; in that case this command gets you in:
source activate xxx
4. Switch apt sources
First back up the sources.list file (the command prints nothing on success):
sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
Open the file:
sudo gedit /etc/apt/sources.list
Delete all of its contents, replace them with the following, then save and exit:
- deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic main multiverse restricted universe
- deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic-security main multiverse restricted universe
- deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic-updates main multiverse restricted universe
- deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic-backports main multiverse restricted universe
- deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic main multiverse restricted universe
- deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic-security main multiverse restricted universe
- deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic-updates main multiverse restricted universe
- deb-src http://mirrors.tuna.tsinghua.edu.cn/ubuntu-ports/ bionic-backports main multiverse restricted universe
Then refresh the package index so the new sources take effect:
sudo apt-get update
Upgrade the installed packages:
sudo apt-get upgrade
Upgrade all installed packages and resolve dependencies:
sudo apt-get dist-upgrade
5. Install pip
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
Upgrade pip to the latest version:
pip3 install --upgrade pip   # skip this if pip is already current
6. Download torch and torchvision
Download the matching builds from the NVIDIA forum; I used 1.8.0 here.
Link: PyTorch for Jetson - version 1.10 now available - Jetson Nano - NVIDIA Developer Forums
Downloading torch and torchvision directly on the Jetson Nano tends to be so slow that the page never loads, so we recommend a tool (MobaXterm) for transferring files between a PC and the board; you can also just download to a USB drive and copy the files over.
This walkthrough covers installing and using MobaXterm:
MobaXterm(终端工具)下载&安装&使用教程_蜗牛也不慢......的博客-CSDN博客
7. Install torch
First open a terminal, activate the virtual environment you created, and run:
sudo apt-get install python3-pip libopenblas-base libopenmpi-dev
pip install torch-1.8.0-cp36-cp36m-linux_aarch64.whl
numpy needs to be installed as well, otherwise the torch test will not report success:
sudo apt install python3-numpy
To test whether torch installed correctly, first start Python, then import it:
- python
- import torch
- print(torch.__version__)
If the torch test fails with "Illegal instruction (core dumped)", the fix is:
export OPENBLAS_CORETYPE=ARMV8
(Append that line to ~/.bashrc to make the fix permanent.)
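Once the import works, it is worth confirming that CUDA is actually usable from this wheel, not just that it imports. A minimal sanity-check sketch (run inside the same Python session; the tensor size is arbitrary):
- import torch
-
- print("torch:", torch.__version__)
- print("CUDA available:", torch.cuda.is_available())
-
- if torch.cuda.is_available():
-     # run a tiny matrix multiply on the GPU to prove the CUDA path end to end
-     a = torch.rand(64, 64, device="cuda")
-     b = torch.rand(64, 64, device="cuda")
-     print("GPU matmul OK:", (a @ b).sum().item())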
8. Install torchvision. Run the following commands one by one (if the third command errors, ignore it and continue with the fourth and fifth; if it does not error, just run cd .. directly):
- cd torchvision
-
- export BUILD_VERSION=0.9.0
-
- sudo python setup.py install
-
- python setup.py build
-
- python setup.py install
-
- cd ..
Test torchvision the same way as above:
- python
-
- import torchvision
-
- print(torchvision.__version__)
If the test raises a PIL error, install pillow with the commands below (if the second command hits a permission error, prefix it with sudo or append --user):
- sudo apt-get install libjpeg8 libjpeg62-dev libfreetype6 libfreetype6-dev
-
- python3 -m pip install -i https://mirrors.aliyun.com/pypi/simple pillow
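A quick combined check that torchvision and Pillow cooperate, a sketch assuming both installed into the active environment:
- import torchvision
- from PIL import Image
-
- print("torchvision:", torchvision.__version__)   # e.g. 0.9.0
-
- # build a dummy RGB image and push it through a standard transform pipeline
- img = Image.new("RGB", (640, 480), color=(0, 128, 255))
- t = torchvision.transforms.Compose([
-     torchvision.transforms.Resize((224, 224)),
-     torchvision.transforms.ToTensor(),
- ])
- print("transformed shape:", tuple(t(img).shape))   # expect (3, 224, 224)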
1. Download the version you need from the official repo; I used v5.0 here. Make sure to download the matching weights at the same time. Repo:
ultralytics/yolov5 at v5.0 (github.com)
Weights are here:
Releases · ultralytics/yolov5 (github.com)
The YOLOv5 code and the weights must be the same version, otherwise it will error out; I fell into that pit myself.
Then move the YOLOv5 source and the model to the home directory with MobaXterm, open a terminal in the YOLOv5 folder, and run:
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
This may fail with a numpy-related error.
The cause is that the numpy version is too high; dropping it to 1.19.4 fixes it:
pip install numpy==1.19.4 -i https://pypi.tuna.tsinghua.edu.cn/simple
Another kind of error is just a network hiccup; rerun the download and it goes away.
Installation then continues. When it reaches opencv you will see "Building wheel for opencv-python (pyproject.toml)"; that means it is close to done, but it is very slow, so don't panic!!
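Before running detection, a quick import check avoids a half-finished run; this sketch just confirms that the key dependencies resolve (the package list is my own pick of the critical ones, not the full requirements.txt):
- import importlib
-
- # the modules detect.py needs at import time (my selection, not exhaustive)
- for name in ("torch", "torchvision", "cv2", "numpy", "PIL", "yaml"):
-     try:
-         mod = importlib.import_module(name)
-         print(f"{name}: OK", getattr(mod, "__version__", ""))
-     except Exception as e:
-         print(f"{name}: FAILED -> {e}")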
Once all the dependencies are installed, drop the weight file into the root of the yolov5 folder, open a terminal there, and run:
python3 detect.py --weights yolov5s.pt
You will hit an error about a missing SPPF module (the v5.0 code predates it).
The fix is to add one more block of code to common.py:
- import warnings
- class SPPF(nn.Module):
- # Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
- def __init__(self, c1, c2, k=5): # equivalent to SPP(k=(5, 9, 13))
- super().__init__()
- c_ = c1 // 2 # hidden channels
- self.cv1 = Conv(c1, c_, 1, 1)
- self.cv2 = Conv(c_ * 4, c2, 1, 1)
- self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
-
- def forward(self, x):
- x = self.cv1(x)
- with warnings.catch_warnings():
- warnings.simplefilter('ignore') # suppress torch 1.9.0 max_pool2d() warning
- y1 = self.m(x)
- y2 = self.m(y1)
- return self.cv2(torch.cat([x, y1, y2, self.m(y2)], 1))
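To confirm the patch took, a quick hedged check run from the yolov5 root directory (assuming you added the class to models/common.py as above):
- from models.common import SPPF
- print(SPPF)   # should print the class, not raise an ImportError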
1. Download tensorrtx from the repo below, and make sure you get the version matching YOLOv5 v5.0:
链接:https://gitcode.net/mirrors/wang-xinyu/tensorrtx?utm_source=csdn_github_accelerator
2. Transfer it to the Jetson Nano with MobaXterm. Find gen_wts.py in tensorrtx's yolov5 folder, copy it into the YOLOv5 folder you just ran, right-click to open a terminal there, and run the command below to generate the wts file:
python3 gen_wts.py --w yolov5s.pt
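For context, the .wts file is just a plain-text dump of the PyTorch weights that the C++ side can parse without Python. A rough sketch of what gen_wts.py does (simplified from memory of the tensorrtx script, so treat it as illustrative, not a replacement):
- import struct
- import torch
-
- model = torch.load("yolov5s.pt", map_location="cpu")["model"].float()
- with open("yolov5s.wts", "w") as f:
-     f.write(f"{len(model.state_dict())}\n")
-     for name, tensor in model.state_dict().items():
-         values = tensor.reshape(-1).cpu().numpy()
-         # each line: tensor name, element count, then every float as big-endian hex
-         f.write(f"{name} {len(values)}")
-         for v in values:
-             f.write(" " + struct.pack(">f", float(v)).hex())
-         f.write("\n")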
3. Find the yololayer.h file in tensorrtx's yolov5 folder and check its parameters, e.g. CLASS_NUM and the input width/height (for the stock COCO model you basically don't need to change anything).
4. Create a build folder (here called new) in the current directory and build the project:
- mkdir new        # create the new folder
-
- cd new           # enter it
-
- cmake ..         # configure the project
-
- # copy the .wts file generated above into the new folder
-
- make
5. Make sure the yolov5s.wts file generated above sits next to the freshly built yolov5 binary under tensorrtx/yolov5, right-click to open a terminal there, and serialize the engine:
sudo ./yolov5 -s yolov5s.wts yolov5s.engine s
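Serialization takes a while on the Nano. To confirm that yolov5s.engine actually deserializes before wiring up the camera, here is a hedged check using the TensorRT Python bindings that ship with JetPack (API names are from the TensorRT 7.x Python API; adjust if your JetPack version differs):
- import tensorrt as trt
-
- logger = trt.Logger(trt.Logger.WARNING)
- with open("yolov5s.engine", "rb") as f, trt.Runtime(logger) as runtime:
-     engine = runtime.deserialize_cuda_engine(f.read())
-
- # a valid engine should expose the "data" input and "prob" output bindings
- for i in range(engine.num_bindings):
-     kind = "input" if engine.binding_is_input(i) else "output"
-     print(kind, engine.get_binding_name(i), engine.get_binding_shape(i))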
6. Create an IMG folder under tensorrtx-yolov5-v5.0\yolov5, put an image you want to test inside, and run the command below (point the last argument at the folder holding your test images):
sudo ./yolov5 -d yolov5s.engine ../sample
The test produces no visible window; you will only see the effect once the camera is hooked up!!!
7. Use the camera
Modify the yolov5.cpp file, replacing its contents with the following code:
- #include <iostream>
- #include <chrono>
- #include "cuda_utils.h"
- #include "logging.h"
- #include "common.hpp"
- #include "utils.h"
- #include "calibrator.h"
-
- #define USE_FP16 // set USE_INT8 or USE_FP16 or USE_FP32
- #define DEVICE 0 // GPU id
- #define NMS_THRESH 0.4
- #define CONF_THRESH 0.5
- #define BATCH_SIZE 1
-
- // stuff we know about the network and the input/output blobs
- static const int INPUT_H = Yolo::INPUT_H;
- static const int INPUT_W = Yolo::INPUT_W;
- static const int CLASS_NUM = Yolo::CLASS_NUM;
- static const int OUTPUT_SIZE = Yolo::MAX_OUTPUT_BBOX_COUNT * sizeof(Yolo::Detection) / sizeof(float) + 1; // we assume the yololayer outputs no more than MAX_OUTPUT_BBOX_COUNT boxes that conf >= 0.1
- const char* INPUT_BLOB_NAME = "data";
- const char* OUTPUT_BLOB_NAME = "prob";
- static Logger gLogger;
-
- // Replace these with your own class names if you trained a custom model
- char *my_classes[]={ "person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
- "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
- "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
- "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard","surfboard",
- "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
- "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
- "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
- "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
- "hair drier", "toothbrush" };
-
- static int get_width(int x, float gw, int divisor = 8) {
- //return math.ceil(x / divisor) * divisor
- if (int(x * gw) % divisor == 0) {
- return int(x * gw);
- }
- return (int(x * gw / divisor) + 1) * divisor;
- }
-
- static int get_depth(int x, float gd) {
- if (x == 1) {
- return 1;
- }
- else {
- return round(x * gd) > 1 ? round(x * gd) : 1;
- }
- }
- // Create the engine and network
- ICudaEngine* build_engine(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, float& gd, float& gw, std::string& wts_name) {
- INetworkDefinition* network = builder->createNetworkV2(0U);
-
- // Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
- ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
- assert(data);
-
- std::map<std::string, Weights> weightMap = loadWeights(wts_name);
-
- /* ------ yolov5 backbone------ */
- auto focus0 = focus(network, weightMap, *data, 3, get_width(64, gw), 3, "model.0");
- auto conv1 = convBlock(network, weightMap, *focus0->getOutput(0), get_width(128, gw), 3, 2, 1, "model.1");
- auto bottleneck_CSP2 = C3(network, weightMap, *conv1->getOutput(0), get_width(128, gw), get_width(128, gw), get_depth(3, gd), true, 1, 0.5, "model.2");
- auto conv3 = convBlock(network, weightMap, *bottleneck_CSP2->getOutput(0), get_width(256, gw), 3, 2, 1, "model.3");
- auto bottleneck_csp4 = C3(network, weightMap, *conv3->getOutput(0), get_width(256, gw), get_width(256, gw), get_depth(9, gd), true, 1, 0.5, "model.4");
- auto conv5 = convBlock(network, weightMap, *bottleneck_csp4->getOutput(0), get_width(512, gw), 3, 2, 1, "model.5");
- auto bottleneck_csp6 = C3(network, weightMap, *conv5->getOutput(0), get_width(512, gw), get_width(512, gw), get_depth(9, gd), true, 1, 0.5, "model.6");
- auto conv7 = convBlock(network, weightMap, *bottleneck_csp6->getOutput(0), get_width(1024, gw), 3, 2, 1, "model.7");
- auto spp8 = SPP(network, weightMap, *conv7->getOutput(0), get_width(1024, gw), get_width(1024, gw), 5, 9, 13, "model.8");
-
- /* ------ yolov5 head ------ */
- auto bottleneck_csp9 = C3(network, weightMap, *spp8->getOutput(0), get_width(1024, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.9");
- auto conv10 = convBlock(network, weightMap, *bottleneck_csp9->getOutput(0), get_width(512, gw), 1, 1, 1, "model.10");
-
- auto upsample11 = network->addResize(*conv10->getOutput(0));
- assert(upsample11);
- upsample11->setResizeMode(ResizeMode::kNEAREST);
- upsample11->setOutputDimensions(bottleneck_csp6->getOutput(0)->getDimensions());
-
- ITensor* inputTensors12[] = { upsample11->getOutput(0), bottleneck_csp6->getOutput(0) };
- auto cat12 = network->addConcatenation(inputTensors12, 2);
- auto bottleneck_csp13 = C3(network, weightMap, *cat12->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.13");
- auto conv14 = convBlock(network, weightMap, *bottleneck_csp13->getOutput(0), get_width(256, gw), 1, 1, 1, "model.14");
-
- auto upsample15 = network->addResize(*conv14->getOutput(0));
- assert(upsample15);
- upsample15->setResizeMode(ResizeMode::kNEAREST);
- upsample15->setOutputDimensions(bottleneck_csp4->getOutput(0)->getDimensions());
-
- ITensor* inputTensors16[] = { upsample15->getOutput(0), bottleneck_csp4->getOutput(0) };
- auto cat16 = network->addConcatenation(inputTensors16, 2);
-
- auto bottleneck_csp17 = C3(network, weightMap, *cat16->getOutput(0), get_width(512, gw), get_width(256, gw), get_depth(3, gd), false, 1, 0.5, "model.17");
-
- // yolo layer 0
- IConvolutionLayer* det0 = network->addConvolutionNd(*bottleneck_csp17->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.0.weight"], weightMap["model.24.m.0.bias"]);
- auto conv18 = convBlock(network, weightMap, *bottleneck_csp17->getOutput(0), get_width(256, gw), 3, 2, 1, "model.18");
- ITensor* inputTensors19[] = { conv18->getOutput(0), conv14->getOutput(0) };
- auto cat19 = network->addConcatenation(inputTensors19, 2);
- auto bottleneck_csp20 = C3(network, weightMap, *cat19->getOutput(0), get_width(512, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.20");
- //yolo layer 1
- IConvolutionLayer* det1 = network->addConvolutionNd(*bottleneck_csp20->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.1.weight"], weightMap["model.24.m.1.bias"]);
- auto conv21 = convBlock(network, weightMap, *bottleneck_csp20->getOutput(0), get_width(512, gw), 3, 2, 1, "model.21");
- ITensor* inputTensors22[] = { conv21->getOutput(0), conv10->getOutput(0) };
- auto cat22 = network->addConcatenation(inputTensors22, 2);
- auto bottleneck_csp23 = C3(network, weightMap, *cat22->getOutput(0), get_width(1024, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.23");
- IConvolutionLayer* det2 = network->addConvolutionNd(*bottleneck_csp23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.24.m.2.weight"], weightMap["model.24.m.2.bias"]);
-
- auto yolo = addYoLoLayer(network, weightMap, "model.24", std::vector<IConvolutionLayer*>{det0, det1, det2});
- yolo->getOutput(0)->setName(OUTPUT_BLOB_NAME);
- network->markOutput(*yolo->getOutput(0));
-
- // Build engine
- builder->setMaxBatchSize(maxBatchSize);
- config->setMaxWorkspaceSize(16 * (1 << 20)); // 16MB
- #if defined(USE_FP16)
- config->setFlag(BuilderFlag::kFP16);
- #elif defined(USE_INT8)
- std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
- assert(builder->platformHasFastInt8());
- config->setFlag(BuilderFlag::kINT8);
- Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
- config->setInt8Calibrator(calibrator);
- #endif
-
- std::cout << "Building engine, please wait for a while..." << std::endl;
- ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
- std::cout << "Build engine successfully!" << std::endl;
-
- // Don't need the network any more
- network->destroy();
-
- // Release host memory
- for (auto& mem : weightMap)
- {
- free((void*)(mem.second.values));
- }
-
- return engine;
- }
-
- ICudaEngine* build_engine_p6(unsigned int maxBatchSize, IBuilder* builder, IBuilderConfig* config, DataType dt, float& gd, float& gw, std::string& wts_name) {
- INetworkDefinition* network = builder->createNetworkV2(0U);
-
- // Create input tensor of shape {3, INPUT_H, INPUT_W} with name INPUT_BLOB_NAME
- ITensor* data = network->addInput(INPUT_BLOB_NAME, dt, Dims3{ 3, INPUT_H, INPUT_W });
- assert(data);
-
- std::map<std::string, Weights> weightMap = loadWeights(wts_name);
-
- /* ------ yolov5 backbone------ */
- auto focus0 = focus(network, weightMap, *data, 3, get_width(64, gw), 3, "model.0");
- auto conv1 = convBlock(network, weightMap, *focus0->getOutput(0), get_width(128, gw), 3, 2, 1, "model.1");
- auto c3_2 = C3(network, weightMap, *conv1->getOutput(0), get_width(128, gw), get_width(128, gw), get_depth(3, gd), true, 1, 0.5, "model.2");
- auto conv3 = convBlock(network, weightMap, *c3_2->getOutput(0), get_width(256, gw), 3, 2, 1, "model.3");
- auto c3_4 = C3(network, weightMap, *conv3->getOutput(0), get_width(256, gw), get_width(256, gw), get_depth(9, gd), true, 1, 0.5, "model.4");
- auto conv5 = convBlock(network, weightMap, *c3_4->getOutput(0), get_width(512, gw), 3, 2, 1, "model.5");
- auto c3_6 = C3(network, weightMap, *conv5->getOutput(0), get_width(512, gw), get_width(512, gw), get_depth(9, gd), true, 1, 0.5, "model.6");
- auto conv7 = convBlock(network, weightMap, *c3_6->getOutput(0), get_width(768, gw), 3, 2, 1, "model.7");
- auto c3_8 = C3(network, weightMap, *conv7->getOutput(0), get_width(768, gw), get_width(768, gw), get_depth(3, gd), true, 1, 0.5, "model.8");
- auto conv9 = convBlock(network, weightMap, *c3_8->getOutput(0), get_width(1024, gw), 3, 2, 1, "model.9");
- auto spp10 = SPP(network, weightMap, *conv9->getOutput(0), get_width(1024, gw), get_width(1024, gw), 3, 5, 7, "model.10");
- auto c3_11 = C3(network, weightMap, *spp10->getOutput(0), get_width(1024, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.11");
-
- /* ------ yolov5 head ------ */
- auto conv12 = convBlock(network, weightMap, *c3_11->getOutput(0), get_width(768, gw), 1, 1, 1, "model.12");
- auto upsample13 = network->addResize(*conv12->getOutput(0));
- assert(upsample13);
- upsample13->setResizeMode(ResizeMode::kNEAREST);
- upsample13->setOutputDimensions(c3_8->getOutput(0)->getDimensions());
- ITensor* inputTensors14[] = { upsample13->getOutput(0), c3_8->getOutput(0) };
- auto cat14 = network->addConcatenation(inputTensors14, 2);
- auto c3_15 = C3(network, weightMap, *cat14->getOutput(0), get_width(1536, gw), get_width(768, gw), get_depth(3, gd), false, 1, 0.5, "model.15");
-
- auto conv16 = convBlock(network, weightMap, *c3_15->getOutput(0), get_width(512, gw), 1, 1, 1, "model.16");
- auto upsample17 = network->addResize(*conv16->getOutput(0));
- assert(upsample17);
- upsample17->setResizeMode(ResizeMode::kNEAREST);
- upsample17->setOutputDimensions(c3_6->getOutput(0)->getDimensions());
- ITensor* inputTensors18[] = { upsample17->getOutput(0), c3_6->getOutput(0) };
- auto cat18 = network->addConcatenation(inputTensors18, 2);
- auto c3_19 = C3(network, weightMap, *cat18->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.19");
-
- auto conv20 = convBlock(network, weightMap, *c3_19->getOutput(0), get_width(256, gw), 1, 1, 1, "model.20");
- auto upsample21 = network->addResize(*conv20->getOutput(0));
- assert(upsample21);
- upsample21->setResizeMode(ResizeMode::kNEAREST);
- upsample21->setOutputDimensions(c3_4->getOutput(0)->getDimensions());
- ITensor* inputTensors21[] = { upsample21->getOutput(0), c3_4->getOutput(0) };
- auto cat22 = network->addConcatenation(inputTensors21, 2);
- auto c3_23 = C3(network, weightMap, *cat22->getOutput(0), get_width(512, gw), get_width(256, gw), get_depth(3, gd), false, 1, 0.5, "model.23");
-
- auto conv24 = convBlock(network, weightMap, *c3_23->getOutput(0), get_width(256, gw), 3, 2, 1, "model.24");
- ITensor* inputTensors25[] = { conv24->getOutput(0), conv20->getOutput(0) };
- auto cat25 = network->addConcatenation(inputTensors25, 2);
- auto c3_26 = C3(network, weightMap, *cat25->getOutput(0), get_width(1024, gw), get_width(512, gw), get_depth(3, gd), false, 1, 0.5, "model.26");
-
- auto conv27 = convBlock(network, weightMap, *c3_26->getOutput(0), get_width(512, gw), 3, 2, 1, "model.27");
- ITensor* inputTensors28[] = { conv27->getOutput(0), conv16->getOutput(0) };
- auto cat28 = network->addConcatenation(inputTensors28, 2);
- auto c3_29 = C3(network, weightMap, *cat28->getOutput(0), get_width(1536, gw), get_width(768, gw), get_depth(3, gd), false, 1, 0.5, "model.29");
-
- auto conv30 = convBlock(network, weightMap, *c3_29->getOutput(0), get_width(768, gw), 3, 2, 1, "model.30");
- ITensor* inputTensors31[] = { conv30->getOutput(0), conv12->getOutput(0) };
- auto cat31 = network->addConcatenation(inputTensors31, 2);
- auto c3_32 = C3(network, weightMap, *cat31->getOutput(0), get_width(2048, gw), get_width(1024, gw), get_depth(3, gd), false, 1, 0.5, "model.32");
-
- /* ------ detect ------ */
- IConvolutionLayer* det0 = network->addConvolutionNd(*c3_23->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.0.weight"], weightMap["model.33.m.0.bias"]);
- IConvolutionLayer* det1 = network->addConvolutionNd(*c3_26->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.1.weight"], weightMap["model.33.m.1.bias"]);
- IConvolutionLayer* det2 = network->addConvolutionNd(*c3_29->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.2.weight"], weightMap["model.33.m.2.bias"]);
- IConvolutionLayer* det3 = network->addConvolutionNd(*c3_32->getOutput(0), 3 * (Yolo::CLASS_NUM + 5), DimsHW{ 1, 1 }, weightMap["model.33.m.3.weight"], weightMap["model.33.m.3.bias"]);
-
- auto yolo = addYoLoLayer(network, weightMap, "model.33", std::vector<IConvolutionLayer*>{det0, det1, det2, det3});
- yolo->getOutput(0)->setName(OUTPUT_BLOB_NAME);
- network->markOutput(*yolo->getOutput(0));
-
- // Build engine
- builder->setMaxBatchSize(maxBatchSize);
- config->setMaxWorkspaceSize(16 * (1 << 20)); // 16MB
- #if defined(USE_FP16)
- config->setFlag(BuilderFlag::kFP16);
- #elif defined(USE_INT8)
- std::cout << "Your platform support int8: " << (builder->platformHasFastInt8() ? "true" : "false") << std::endl;
- assert(builder->platformHasFastInt8());
- config->setFlag(BuilderFlag::kINT8);
- Int8EntropyCalibrator2* calibrator = new Int8EntropyCalibrator2(1, INPUT_W, INPUT_H, "./coco_calib/", "int8calib.table", INPUT_BLOB_NAME);
- config->setInt8Calibrator(calibrator);
- #endif
-
- std::cout << "Building engine, please wait for a while..." << std::endl;
- ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
- std::cout << "Build engine successfully!" << std::endl;
-
- // Don't need the network any more
- network->destroy();
-
- // Release host memory
- for (auto& mem : weightMap)
- {
- free((void*)(mem.second.values));
- }
-
- return engine;
- }
-
- void APIToModel(unsigned int maxBatchSize, IHostMemory** modelStream, float& gd, float& gw, std::string& wts_name) {
- // Create builder
- IBuilder* builder = createInferBuilder(gLogger);
- IBuilderConfig* config = builder->createBuilderConfig();
-
- // Create model to populate the network, then set the outputs and create an engine
- ICudaEngine* engine = build_engine(maxBatchSize, builder, config, DataType::kFLOAT, gd, gw, wts_name);
- assert(engine != nullptr);
-
- // Serialize the engine
- (*modelStream) = engine->serialize();
-
- // Close everything down
- engine->destroy();
- builder->destroy();
- config->destroy();
- }
-
- void doInference(IExecutionContext& context, cudaStream_t& stream, void** buffers, float* input, float* output, int batchSize) {
- // DMA input batch data to device, infer on the batch asynchronously, and DMA output back to host
- CUDA_CHECK(cudaMemcpyAsync(buffers[0], input, batchSize * 3 * INPUT_H * INPUT_W * sizeof(float), cudaMemcpyHostToDevice, stream));
- context.enqueue(batchSize, buffers, stream, nullptr);
- CUDA_CHECK(cudaMemcpyAsync(output, buffers[1], batchSize * OUTPUT_SIZE * sizeof(float), cudaMemcpyDeviceToHost, stream));
- cudaStreamSynchronize(stream);
- }
-
- bool parse_args(int argc, char** argv, std::string& engine) {
- if (argc < 3) return false;
- if (std::string(argv[1]) == "-v" && argc == 3) {
- engine = std::string(argv[2]);
- }
- else {
- return false;
- }
- return true;
- }
-
- int main(int argc, char** argv) {
- cudaSetDevice(DEVICE);
-
- //std::string wts_name = "";
- std::string engine_name = "";
- //float gd = 0.0f, gw = 0.0f;
- //std::string img_dir;
-
- if (!parse_args(argc, argv, engine_name)) {
- std::cerr << "arguments not right!" << std::endl;
- std::cerr << "./yolov5 -v [.engine] // run inference with camera" << std::endl;
- return -1;
- }
-
- std::ifstream file(engine_name, std::ios::binary);
- if (!file.good()) {
- std::cerr << " read " << engine_name << " error! " << std::endl;
- return -1;
- }
- char* trtModelStream{ nullptr };
- size_t size = 0;
- file.seekg(0, file.end);
- size = file.tellg();
- file.seekg(0, file.beg);
- trtModelStream = new char[size];
- assert(trtModelStream);
- file.read(trtModelStream, size);
- file.close();
-
-
- // prepare input data ---------------------------
- static float data[BATCH_SIZE * 3 * INPUT_H * INPUT_W];
- //for (int i = 0; i < 3 * INPUT_H * INPUT_W; i++)
- // data[i] = 1.0;
- static float prob[BATCH_SIZE * OUTPUT_SIZE];
- IRuntime* runtime = createInferRuntime(gLogger);
- assert(runtime != nullptr);
- ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream, size);
- assert(engine != nullptr);
- IExecutionContext* context = engine->createExecutionContext();
- assert(context != nullptr);
- delete[] trtModelStream;
- assert(engine->getNbBindings() == 2);
- void* buffers[2];
- // In order to bind the buffers, we need to know the names of the input and output tensors.
- // Note that indices are guaranteed to be less than IEngine::getNbBindings()
- const int inputIndex = engine->getBindingIndex(INPUT_BLOB_NAME);
- const int outputIndex = engine->getBindingIndex(OUTPUT_BLOB_NAME);
- assert(inputIndex == 0);
- assert(outputIndex == 1);
- // Create GPU buffers on device
- CUDA_CHECK(cudaMalloc(&buffers[inputIndex], BATCH_SIZE * 3 * INPUT_H * INPUT_W * sizeof(float)));
- CUDA_CHECK(cudaMalloc(&buffers[outputIndex], BATCH_SIZE * OUTPUT_SIZE * sizeof(float)));
- // Create stream
- cudaStream_t stream;
- CUDA_CHECK(cudaStreamCreate(&stream));
-
- // To read a local video file instead:
- //cv::VideoCapture capture("/home/nano/Videos/video.mp4");
- // Open the local USB camera. Mine is index 1; if 1 errors, change it to 0.
- cv::VideoCapture capture(1);
- if (!capture.isOpened()) {
- std::cout << "Error opening video stream or file" << std::endl;
- return -1;
- }
-
- int key;
- int fcount = 0;
- while (1)
- {
- cv::Mat frame;
- capture >> frame;
- if (frame.empty())
- {
- std::cout << "Fail to read image from camera!" << std::endl;
- break;
- }
- fcount++;
- //if (fcount < BATCH_SIZE && f + 1 != (int)file_names.size()) continue;
- for (int b = 0; b < fcount; b++) {
- //cv::Mat img = cv::imread(img_dir + "/" + file_names[f - fcount + 1 + b]);
- cv::Mat img = frame;
- if (img.empty()) continue;
- cv::Mat pr_img = preprocess_img(img, INPUT_W, INPUT_H); // letterbox BGR to RGB
- int i = 0;
- for (int row = 0; row < INPUT_H; ++row) {
- uchar* uc_pixel = pr_img.data + row * pr_img.step;
- for (int col = 0; col < INPUT_W; ++col) {
- data[b * 3 * INPUT_H * INPUT_W + i] = (float)uc_pixel[2] / 255.0;
- data[b * 3 * INPUT_H * INPUT_W + i + INPUT_H * INPUT_W] = (float)uc_pixel[1] / 255.0;
- data[b * 3 * INPUT_H * INPUT_W + i + 2 * INPUT_H * INPUT_W] = (float)uc_pixel[0] / 255.0;
- uc_pixel += 3;
- ++i;
- }
- }
- }
-
- // Run inference
- auto start = std::chrono::system_clock::now(); // inference start time
- doInference(*context, stream, buffers, data, prob, BATCH_SIZE);
- auto end = std::chrono::system_clock::now(); // inference end time
- //std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
- int fps = 1000.0 / std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
- std::vector<std::vector<Yolo::Detection>> batch_res(fcount);
- for (int b = 0; b < fcount; b++) {
- auto& res = batch_res[b];
- nms(res, &prob[b * OUTPUT_SIZE], CONF_THRESH, NMS_THRESH);
- }
- for (int b = 0; b < fcount; b++) {
- auto& res = batch_res[b];
- //std::cout << res.size() << std::endl;
- //cv::Mat img = cv::imread(img_dir + "/" + file_names[f - fcount + 1 + b]);
- for (size_t j = 0; j < res.size(); j++) {
- cv::Rect r = get_rect(frame, res[j].bbox);
- cv::rectangle(frame, r, cv::Scalar(0x27, 0xC1, 0x36), 2);
- std::string label = my_classes[(int)res[j].class_id];
- cv::putText(frame, label, cv::Point(r.x, r.y - 1), cv::FONT_HERSHEY_PLAIN, 1.2, cv::Scalar(0xFF, 0xFF, 0xFF), 2);
- std::string jetson_fps = "FPS: " + std::to_string(fps);
- cv::putText(frame, jetson_fps, cv::Point(11, 80), cv::FONT_HERSHEY_PLAIN, 3, cv::Scalar(0, 0, 255), 2, cv::LINE_AA);
- }
- //cv::imwrite("_" + file_names[f - fcount + 1 + b], img);
- }
- cv::imshow("yolov5", frame);
- key = cv::waitKey(1);
- if (key == 'q') {
- break;
- }
- fcount = 0;
- }
-
- capture.release();
- // Release stream and buffers
- cudaStreamDestroy(stream);
- CUDA_CHECK(cudaFree(buffers[inputIndex]));
- CUDA_CHECK(cudaFree(buffers[outputIndex]));
- // Destroy the engine
- context->destroy();
- engine->destroy();
- runtime->destroy();
-
- return 0;
- }
If the file is opened the wrong way you may be unable to delete its contents; note: open it as plain text.
After the modification, open a terminal and run:
- cd new
- make
- sudo ./yolov5 -v yolov5s.engine
The camera comes up and detection runs successfully.
That completes deploying YOLOv5 object detection on the Jetson Nano with TensorRT acceleration.