Debugging with batch_size greater than 1:
TensorRT YOLOv5 study notes - jacke121's CSDN blog
Repository that works after debugging:
https://github.com/enazoe/yolo-tensorrt
Support for yolov5-6 has been added.
It also requires TensorRT 7.1.
../configs/yolov5-5.0/yolov5s6.weights
Download link:
https://github.com/enazoe/yolo-tensorrt/blob/master/yolov5_tutorial.md
The results are correct:
yolo-tensorrt_dll reported an error with no useful message; after the exe generated the engine file, the dll worked as well.
yolo-tensorrt_exe results:
Python YOLOv5 training code (not verified):
F:\project\detect\yolov5\yolov5-5.0_voc
This project:
https://github.com/ttanzhiqiang/onnx_tensorrt_project
1. CUDA version:
dll_detector.vcxproj
test_dll.vcxproj
Possibly belongs to another project:
tiny_tensorrt_onnx.vcxproj
Search the project files for 11.1 and replace it with your installed CUDA version; it must be 10.2 or newer (10.2, 11.0, 11.1). A find-and-replace sketch is below.
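As a convenience, here is a minimal sketch of that find-and-replace, assuming the version shows up in the .vcxproj files as the literal string "CUDA 11.1" (e.g. in the CUDA build-customization import lines); the project names are the three listed above, and new_version should match the toolkit you actually installed:

import pathlib

old_version, new_version = 'CUDA 11.1', 'CUDA 11.0'  # adjust to your installed toolkit
projects = ['dll_detector.vcxproj', 'test_dll.vcxproj', 'tiny_tensorrt_onnx.vcxproj']

for name in projects:
    path = pathlib.Path(name)
    text = path.read_text(encoding='utf-8')
    if old_version in text:
        path.write_text(text.replace(old_version, new_version), encoding='utf-8')
        print('patched', name)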
Supported combinations:
1. CUDA 10.2 + TensorRT-7.2.1.6
2. CUDA 11 + TensorRT-7.2.3.4
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\TensorRT-7.2.3.4\include
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\opencv3.4.0\include\opencv
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\opencv3.4.0\include
Library directories:
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\opencv3.4.0\lib\vc15
F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project\3rdparty\TensorRT-7.2.3.4\lib
compute_30,sm_30: the GPU compute-capability flags may also need to be changed for your GPU (CUDA 11 no longer supports compute_30).
Rename int main_yolov5() to int main().
In yolo_detector.cpp, rename the existing int main() to int main_yolov5() so the two entry points do not clash.
3. Change the TensorRT version.
2. Error:
Severity  Code  Description  Project  File  Line  Suppression State
Error  LNK2001  unresolved external symbol createNvOnnxParser_INTERNAL  tiny_tensorrt_onnx  F:\project\detect\yolov5\tensorrt\onnx_tensorrt_project-main\tiny_tensorrt_onnx\Trt.obj  1
Fix:
TensorRT version: 7.2.3.4
Add the following libs to the linker inputs:
nvonnxparser.lib
nvparsers.lib
https://github.com/ttanzhiqiang/onnx_tensorrt_project
With CUDA 11.1 and TensorRT 7.2.3, I switched to version 6.0 and gave up on INT8; the build succeeded and the ONNX conversion succeeded, but building the engine failed.
Code where the error is reported:
if (!parser->parseFromFile(onnxModel.c_str(), static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
    spdlog::error("error: could not parse onnx engine");
    return false;
}
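Not part of the original project, but a quick way to see the parser's actual error messages is to push the same ONNX file through TensorRT's Python ONNX parser, assuming the tensorrt Python wheel matching your 7.2.x install is available and model.onnx is a hypothetical path to the exported model:

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
# explicit-batch network, as required for ONNX models
network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open('model.onnx', 'rb') as f:  # hypothetical path to the exported model
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i).desc())  # reason for each parse failure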
With yolov5-4.0, the predicted values differ slightly from the Python results:
cfg downloads for the various YOLOv5 versions:
yolo-tensorrt/yolov5_tutorial.md at master · enazoe/yolo-tensorrt · GitHub
yolov5 tensorrt - jacke121's CSDN blog
get_weights.py before the change:
import torch
import struct
from utils.torch_utils import select_device

# Initialize
device = select_device('cpu')
# Load model
# model = torch.load('yolov5s.pt', map_location=device)['model'].float()  # load to FP32
model = torch.load('_0.9437_5997.pt', map_location=device)['model'].float()  # load to FP32
model.to(device).eval()

f = open('yolov5s.weights', 'w')
f.write('{}\n'.format(len(model.state_dict().keys())))
for k, v in model.state_dict().items():
    vr = v.reshape(-1).cpu().numpy()
    f.write('{} {} '.format(k, len(vr)))
    for vv in vr:
        f.write(' ')
        f.write(struct.pack('>f', float(vv)).hex())
    f.write('\n')
get_weights.py after fixing the error:
The modified line: f.write('{} '.format(k))
import torch
import struct
from utils.torch_utils import select_device

# Initialize
device = select_device('cpu')
# Load model
# model = torch.load('yolov5s.pt', map_location=device)['model'].float()  # load to FP32
model = torch.load('_0.9437_5997.pt', map_location=device)['model'].float()  # load to FP32
model.to(device).eval()

f = open('yolov5s.weights', 'w')
f.write('{}\n'.format(len(model.state_dict().keys())))
for k, v in model.state_dict().items():
    vr = v.reshape(-1).cpu().numpy()
    # f.write('{} {} '.format(k, len(vr)))
    f.write('{} '.format(k))
    for vv in vr:
        f.write(' ')
        f.write(struct.pack('>f', float(vv)).hex())
    f.write('\n')
C++ weight parsing, before the change:
void Yolo::load_weights_v5(const std::string s_weights_path_,
                           std::map<std::string, std::vector<float>> &vec_wts_)
{
    vec_wts_.clear();
    assert(fileExists(s_weights_path_));
    std::cout << "Loading pre-trained weights..." << std::endl;
    std::ifstream file(s_weights_path_, std::ios_base::binary);
    assert(file.good());
    std::string line;
    while (std::getline(file, line))
    {
        if (line.empty()) continue;
        std::stringstream iss(line);
        std::string wts_name;
        iss >> wts_name;
        std::vector<float> weights;
        uint32_t n_str;
        while (iss >> std::hex >> n_str)
        {
            weights.push_back(reinterpret_cast<float&>(n_str));
        }
        vec_wts_[wts_name] = weights;
    }
    std::cout << "Loading complete!" << std::endl;
}
Loading my own exported weights fails; the failing location:
std::cout << size << " size " << map_wts_[s_layer_name_ + ".weight"].size() << std::endl;
assert(size == (map_wts_[s_layer_name_ + ".weight"].size()));
Result: 3456 (expected size) vs 3457 (loaded count)
The Python export writes name + space + length + space + data on each line, but the C++ loader reads the name and then immediately starts reading data, so it also parses the length value as a weight and the loaded count ends up one larger than size.
Code after the fix:
void Yolo::load_weights_v5(const std::string s_weights_path_,
                           std::map<std::string, std::vector<float>> &vec_wts_)
{
    vec_wts_.clear();
    assert(fileExists(s_weights_path_));
    std::cout << "Loading pre-trained weights..." << std::endl;
    std::ifstream file(s_weights_path_, std::ios_base::binary);
    assert(file.good());
    std::string line;
    while (std::getline(file, line))
    {
        if (line.empty()) continue;
        std::stringstream iss(line);
        std::string wts_name;
        iss >> wts_name;

        int size_a;
        iss >> size_a;  // consume the tensor length written by get_weights.py

        std::vector<float> weights;
        uint32_t n_str;
        while (iss >> std::hex >> n_str)
        {
            weights.push_back(reinterpret_cast<float&>(n_str));
        }
        vec_wts_[wts_name] = weights;
    }
    std::cout << "Loading complete!" << std::endl;
}
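To sanity-check the exported file independently of the C++ loader, here is a minimal Python reader sketch, assuming the format produced by the original get_weights.py above (one tensor per line: name, length, then big-endian float values as hex):

import struct

def read_weights(path):
    weights = {}
    with open(path) as f:
        count = int(f.readline())  # first line: number of tensors
        for _ in range(count):
            parts = f.readline().split()
            name, length = parts[0], int(parts[1])
            values = [struct.unpack('>f', bytes.fromhex(h))[0] for h in parts[2:]]
            assert len(values) == length, (name, length, len(values))
            weights[name] = values
    return weights

wts = read_weights('yolov5s.weights')
print(len(wts), 'tensors loaded')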
Reference: converting PyTorch models to TensorRT5 via ONNX export - 连正's CSDN blog
ERROR: (Unnamed Layer* 193) [Concatenation]: all concat input tensors must have the same dimensions except on the concatenation axis (0), but dimensions mismatched at index 1. Input 0 shape: [384,18,18], Input 1 shape: [384,17,17]
Assertion failed: d.nbDims == 3, file f:\project\detect\yolov5\tensorrt\yolo-tensorrt_exe\modules\trt_utils.cpp, line 444
It turned out the input resolution was wrong: the resolution needs to be a multiple of 64.
Changes to yolov5s6.cfg:
width and height must be multiples of 64;
nc must be the actual number of classes the model was trained on.
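A minimal sketch of that constraint, assuming the largest stride in this yolov5s6 setup is 64 (which is why width and height must divide evenly; otherwise the downsample/upsample feature maps end up one cell apart, as in the 18-vs-17 concat mismatch above):

def round_up_to_64(x, base=64):
    # round an input dimension up so every stride (8/16/32/64) divides it evenly
    return ((x + base - 1) // base) * base

print(round_up_to_64(600), round_up_to_64(1100))  # -> 640 1152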
In yolo.cpp, the batch size is hard-coded (the cfg value is commented out):
m_BatchSize = 2;// std::stoi(trim(block.at("batch")));
The code first checks whether the engine file exists; if it does not, it builds it automatically. Building takes several minutes, so be patient.
void Yolo::create_engine_yolov5(const nvinfer1::DataType dataType,
                                Int8EntropyCalibrator* calibrator)
{
Generated engine file: yolov5s6-kFLOAT-batch1
Device | Model    | Precision | Batch | GPU compute time (ms) | Avg per image (ms)
1060   | yolov5s6 | fp32      | 1     | 23                    | 23
1060   | yolov5s6 | fp32      | 2     | 40                    | 20
1060   | yolov5s6 | fp32      | 4     | 72                    | 18
1060   | yolov5s6 | fp32      | 8     | 130                   | 16.25
Output feature dimensions:
w  | h  | anchors | boxes (w*h*anchors) | classes+box | values
80 | 64 | 3       | 15360               | 85          | 1305600
40 | 32 | 3       | 3840                | 85          | 326400
20 | 16 | 3       | 960                 | 85          | 81600
10 | 8  | 3       | 240                 | 85          | 20400
Total values: 1734000
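A minimal sketch reproducing those numbers, assuming 3 anchors per scale and 85 values per anchor (4 box + 1 objectness + 80 classes, i.e. the COCO setting):

grids = [(80, 64), (40, 32), (20, 16), (10, 8)]
anchors_per_scale, values_per_anchor = 3, 85

total = 0
for w, h in grids:
    boxes = w * h * anchors_per_scale      # predicted boxes at this scale
    values = boxes * values_per_anchor     # raw floats produced at this scale
    total += values
    print(w, h, boxes, values)
print('total values:', total)              # 1734000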
Printing the features:
std::vector<BBoxInfo> Yolo::decodeDetections(const int& imageIdx,
                                             const int& imageH,
                                             const int& imageW)
{
    // Timer timer;
    std::vector<BBoxInfo> binfo;
    for (auto& tensor : m_OutputTensors)
    {
        // debug: print the first 20 raw output values of this tensor
        for (int a = 0; a < 20; a++) {
            std::cout << tensor.hostBuffer[a] << " ";
        }
        std::cout << std::endl;

        std::vector<BBoxInfo> curBInfo = decodeTensor(imageIdx, imageH, imageW, tensor);
        binfo.insert(binfo.end(), curBInfo.begin(), curBInfo.end());
    }
    // timer.out("decodeDetections");
    return binfo;
}
Step 1: when converting the weights file, record the anchors, i.e. the anchor_grid in the code below.
Code file:
gen_weight_new.py
anchor_grid = model.model[-1].anchors * model.model[-1].stride[...,None,None]
print(anchor_grid)
# model.model[-1].anchor_grid = anchor_grid
delattr(model.model[-1], 'anchor_grid') # model.model[-1] is detect layer
model.model[-1].register_buffer("anchor_grid",anchor_grid)
Printed result:
tensor([[[ 35.18750, 5.43750],
[ 69.50000, 6.88672],
[ 49.90625, 9.78906]],
[[ 75.00000, 10.44531],
[112.87500, 8.17969],
[141.87500, 10.35156]],
[[217.25000, 8.93750],
[243.00000, 12.73438],
[320.00000, 10.21875]]])
Step 2: save the anchors into the cfg:
D:\work\lbg\yolov5_track\yolo-tensorrt_dll\sln\x64\Release\configs\kuaizi.cfg
width_multiple=0.5
anchors=35.18750,5.4375,69.50,6.88672, 49.90625,9.78906,75.0,10.445,112.870,8.17969,141.875, 10.35156,217.25,8.9375,243.0,12.73438,320.0,10.21875
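A minimal sketch of how such an anchors= line can be produced from the tensor printed in step 1 (anchors_to_cfg_line is a hypothetical helper, not part of the project):

import torch

def anchors_to_cfg_line(anchor_grid):
    # flatten the [3, 3, 2] anchor_grid into the comma-separated cfg format
    return 'anchors=' + ','.join('{:.5f}'.format(v) for v in anchor_grid.reshape(-1).tolist())

anchor_grid = torch.tensor([[[ 35.18750,  5.43750], [ 69.50000,  6.88672], [ 49.90625,  9.78906]],
                            [[ 75.00000, 10.44531], [112.87500,  8.17969], [141.87500, 10.35156]],
                            [[217.25000,  8.93750], [243.00000, 12.73438], [320.00000, 10.21875]]])
print(anchors_to_cfg_line(anchor_grid))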
Changes in the cfg:
[net]
width=640
height=128
Then delete the kuaizi-kFLOAT-batch5_640.engine file and regenerate the engine.
In my own project the speed improved from 16 ms to 8 ms.