
Converting yolov3 and yolov3-spp to ONNX and then TensorRT: a record of pitfalls

I. Pitfalls

1. The official code converts to ONNX with Python 2, e.g. this repo: https://github.com/Cw-zero/TensorRT_yolo3_module
There is also a repo that converts to ONNX with Python 3: https://github.com/jkjung-avt/tensorrt_demos. Since I failed to install an old ONNX version under Python 3, I have not tested that code.

2. Error: ValueError: not enough values to unpack (expected 2, got 1)
Use the officially provided yolov3.cfg, with two things to note: first, there must be at least one blank line between two layer blocks; second, **the cfg file must end with 2 blank lines**. I verified that leaving only one blank line at the end raises exactly this error, and it cost me dearly; some posts online say one blank line at the end is enough, which is not.
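
Since this is easy to get wrong when editing by hand, here is a minimal sketch that normalizes a cfg to end with exactly two blank lines (the path is just an example):

import sys

# Rewrite a darknet cfg so it ends with exactly two blank lines;
# one blank line at the end triggers the "not enough values to unpack" error.
path = sys.argv[1] if len(sys.argv) > 1 else 'yolov3.cfg'
with open(path) as f:
    text = f.read()
with open(path, 'w') as f:
    f.write(text.rstrip('\n') + '\n\n\n')  # last line's newline + two blank lines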

3. Errors like these two:
onnx.onnx_cpp2py_export.checker.ValidationError: Op registered for Upsample is deprecated in domain_version of 12
onnx.onnx_cpp2py_export.checker.ValidationError: Node (086_upsample) has input size 1 not in range [min=2, max=2]
The ONNX version installed by default is too new; downgrade to 1.2.1 (1.4.1 still raises the error):

pip2 install onnx==1.2.1
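To confirm which version actually ended up installed:

import onnx
print(onnx.__version__)  # should print 1.2.1 before running the conversion script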

4. Converting yolov3 to ONNX with Python 2 succeeded, but under Python 3, downgrading ONNX kept failing:

(base) lgy@lgy:~$ pip install onnx==1.2.1
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting onnx==1.2.1
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/0c/ce/e66db6ac8462eeca295b30749ec3497c8d607d822de03288531577c725ce/onnx-1.2.1.tar.gz (2.6 MB)
     |████████████████████████████████| 2.6 MB 48 kB/s 
    ERROR: Command errored out with exit status 1:
     command: /home/lgy/anaconda3/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-jz71wfj8/onnx/setup.py'"'"'; __file__='"'"'/tmp/pip-install-jz71wfj8/onnx/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-xwrptm41
         cwd: /tmp/pip-install-jz71wfj8/onnx/
    Complete output (6 lines):
    fatal: Not a git repository (or any of the parent directories): .git
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-jz71wfj8/onnx/setup.py", line 71, in <module>
        assert CMAKE, 'Could not find "cmake" executable!'
    AssertionError: Could not find "cmake" executable!
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

I tried all sorts of fixes without success; for now, I just do the conversion with Python 2. (Judging from the log, pip is building onnx 1.2.1 from source and its setup.py cannot find a cmake executable, so installing cmake first might be worth a try.)

II. Changing the inference image size from 608 to 416

https://github.com/Cw-zero/TensorRT_yolo3_module runs inference on 608×608 images by default, with the official weights downloaded from https://pjreddie.com/media/files/yolov3.weights. Following the README, it runs without problems.

Let's try 416×416; the code changes are:

yolov3-608.cfg, lines 8 and 9: change width=608 and height=608 to 416.

weight_to_onnx.py: comment out lines 640-642 and uncomment lines 644-646:

    #yolo-v3(608*608)
    # output_tensor_dims['082_convolutional'] = [255, 19, 19]
    # output_tensor_dims['094_convolutional'] = [255, 38, 38]
    # output_tensor_dims['106_convolutional'] = [255, 76, 76]
    #yolo-v3(416*416)
    output_tensor_dims['082_convolutional'] = [255, 13, 13]
    output_tensor_dims['094_convolutional'] = [255, 26, 26]
    output_tensor_dims['106_convolutional'] = [255, 52, 52]

trt_yolo3_module_1batch.py: modify lines 55 and 57:

	self.inp_dim = 416       # was 608
	self.num_classes = 80
	# self.output_shapes = [(1, 255, 19, 19), (1, 255, 38, 38), (1, 255, 76, 76)] # yolov3-608
	self.output_shapes = [(1, 255, 13, 13), (1, 255, 26, 26), (1, 255, 52, 52)]   # yolov3-416
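
These numbers all follow one rule: YOLOv3's three detection heads downsample the input by 32, 16 and 8, and each head outputs (num_classes + 5) × 3 channels, 3 being the anchors per head. A small sketch that reproduces the shapes above (the helper name is mine, not from the repo):

def yolo_output_shapes(inp_dim, num_classes, batch=1):
    # (x, y, w, h, objectness + class scores) for each of the 3 anchors per head;
    # the three heads run at strides 32, 16 and 8
    channels = (num_classes + 5) * 3
    return [(batch, channels, inp_dim // s, inp_dim // s) for s in (32, 16, 8)]

print(yolo_output_shapes(608, 80))  # [(1, 255, 19, 19), (1, 255, 38, 38), (1, 255, 76, 76)]
print(yolo_output_shapes(416, 80))  # [(1, 255, 13, 13), (1, 255, 26, 26), (1, 255, 52, 52)]

The same helper with num_classes=2 yields the 21-channel shapes that section IV runs into.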

This runs without problems, but the inference time looks about the same as my own YOLOv3 (PyTorch) inference code.

III. Converting a *.pt file downloaded from https://github.com/ultralytics/yolov3 to ONNX and TensorRT

1. The project provides two kinds of YOLOv3 weights: yolov3.weights, the official Darknet weights, and yolov3.pt, PyTorch weights trained by the repo author. Converting the official yolov3.weights to ONNX and TensorRT worked fine above, but converting yolov3.pt failed. The two formats are stored differently, so the PyTorch .pt weights must first be converted to Darknet-style .weights.
My code is still last year's version; the repo has since been heavily updated, and the latest version adds a utility that converts between its PyTorch weights and Darknet weights. Converting .pt to .weights and then on to ONNX and TensorRT works without problems.
Because my code differs in places from the latest version, the saved .pt checkpoint is different too, and the stock conversion script raised errors. A closer look shows that Darknet weights begin with 5 header values (3 int32 version fields plus an int64 images-seen counter, as in the code below), and that the author stores extra information when constructing the model. Comparing against the conversion utility in the latest code, a small modification is enough:

import torch
import numpy as np
from models import Darknet


def save_weights(self, path='model.weights', cutoff=-1):
    fp = open(path, 'wb')
    version = np.array([0, 2, 5], dtype=np.int32)  # (int32) version info: major, minor, revision
    seen = np.array([0], dtype=np.int64)  # (int64) number of images seen during training
    version.tofile(fp)
    seen.tofile(fp)

    # Iterate through layers
    for i, (module_def, module) in enumerate(zip(self.module_defs[:cutoff], self.module_list[:cutoff])):
        if module_def['type'] == 'convolutional':
            conv_layer = module[0]
            # If batch norm, write the bn parameters first (bias, weight, mean, var)
            if module_def['batch_normalize']:
                bn_layer = module[1]
                bn_layer.bias.data.cpu().numpy().tofile(fp)
                bn_layer.weight.data.cpu().numpy().tofile(fp)
                bn_layer.running_mean.data.cpu().numpy().tofile(fp)
                bn_layer.running_var.data.cpu().numpy().tofile(fp)
            # No batch norm: write the conv bias instead
            else:
                conv_layer.bias.data.cpu().numpy().tofile(fp)
            # Write the conv weights
            conv_layer.weight.data.cpu().numpy().tofile(fp)

    fp.close()


def convert(cfg='cfg/yolov3.cfg', weights='weights/yolov3.pt'):
    # Converts a PyTorch *.pt checkpoint to a Darknet *.weights file
    # e.g. convert('cfg/yolov3-spp.cfg', 'weights/yolov3-spp.pt')

    # Initialize model
    model = Darknet(cfg)

    # Load weights and save
    if weights.endswith('.pt'):  # if PyTorch format
        model.load_state_dict(torch.load(weights, map_location='cpu')['model'])
        save_weights(model, path='weights/converted.weights', cutoff=-1)
        print("Success: converted '%s' to 'converted.weights'" % weights)

    else:
        print('Error: extension not supported.')


if __name__ == '__main__':
    convert(cfg='cfg/my_yolov3.cfg', weights='weights/best.pt')
    
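To sanity-check the conversion, converted.weights can be loaded back and compared tensor-by-tensor against the original checkpoint. This is only a sketch: it assumes the repo's Darknet class exposes a load_darknet_weights method (as the ultralytics code did at the time); the paths match the convert() call above:

import torch
from models import Darknet

m_pt = Darknet('cfg/my_yolov3.cfg')
m_pt.load_state_dict(torch.load('weights/best.pt', map_location='cpu')['model'])

m_dn = Darknet('cfg/my_yolov3.cfg')
m_dn.load_darknet_weights('weights/converted.weights')

# Every float tensor should round-trip exactly (the header is metadata only)
for (name, p1), (_, p2) in zip(m_pt.state_dict().items(), m_dn.state_dict().items()):
    if p1.dtype.is_floating_point and not torch.equal(p1, p2):
        print('mismatch at', name)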

IV. Converting my own trained yolov3.pt model to ONNX

1. Replace the cfg file with my own.

2. trt_yolo3_module_1batch.py: change num_classes on line 56 and the anchors on line 59:

	self.inp_dim = 416       # was 608
	self.num_classes = 2
	# self.output_shapes = [(1, 255, 19, 19), (1, 255, 38, 38), (1, 255, 76, 76)] # yolov3-608
	self.output_shapes = [(1, 255, 13, 13), (1, 255, 26, 26), (1, 255, 52, 52)]   # yolov3-416
	self.yolo_anchors = [[(116, 90), (156, 198), (373, 326)],
	                     [(30, 61),  (62, 45),   (59, 119)],
	                     [(10, 13),  (16, 30),   (33, 23)]]

Running it then fails with:

  File "/home/lgy/PycharmProjects/TensorRT_yolo3_module/trt_yolo3_module_1batch.py", line 101, in detection
    output = output.reshape(shape)
ValueError: cannot reshape array of size 3549 into shape (1,255,13,13)

Printing output.shape and the target shape one step earlier gives (3549,) and (1, 255, 13, 13) respectively.
Analysis: 3549 = 13 × 13 × 21, whereas the target shape needs 13 × 13 × 255. I trained 2 classes, so filters before each yolo layer in my cfg is (2 + 5) × 3 = 21; the 255 is clearly the problem. It traces back to the previous step's self.output_shapes = [(1, 255, 13, 13), (1, 255, 26, 26), (1, 255, 52, 52)], which still uses 255-channel outputs; change them to 21:

self.output_shapes = [(1, 21, 13, 13), (1, 21, 26, 26), (1, 21, 52, 52)]

Running again: no more errors, but no boxes are drawn on the test image.

Then I found another mistake: at line 646 of weights_to_onnx.py, output_tensor_dims must also be changed to my model's output dimensions:

    # yolo-v3(416*416)
    # output_tensor_dims['082_convolutional'] = [255, 13, 13]
    # output_tensor_dims['094_convolutional'] = [255, 26, 26]
    # output_tensor_dims['106_convolutional'] = [255, 52, 52]

    # my_yolo-v3(416*416)
    output_tensor_dims['082_convolutional'] = [21, 13, 13]
    output_tensor_dims['094_convolutional'] = [21, 26, 26]
    output_tensor_dims['106_convolutional'] = [21, 52, 52]

After this change and another test, still nothing is displayed. Printing things out: the TensorRT engine does produce output, but after dets = dynamic_write_results(detections, 0.5, self.num_classes, nms=True, nms_conf=0.3), i.e. after NMS, nothing is left. I suspected the NMS itself; I stared at the original author's code for a long time but couldn't see how to fix it. However, since the loop below

    for output, shape, anchors in zip(trt_outputs, self.output_shapes, self.yolo_anchors):
        output = output.reshape(shape)
        trt_output = torch.from_numpy(output).cuda().data
        print(trt_output.shape)
        # trt_output = trt_output.data
        # cuda_time1 = time.time()
        trt_output = predict_transform(trt_output, self.inp_dim, anchors, self.num_classes, self.use_cuda)
        print(trt_output.shape)
        # cuda_time2 = time.time()
        # print('CUDA time : %f' % (cuda_time2 - cuda_time1))
        if type(trt_output) == int:
            continue

        if not write:
            detections = trt_output
            write = 1

        else:
            detections = torch.cat((detections, trt_output), 1)

ends with detections of shape torch.Size([1, 10647, 85]) (that is with 80 classes), everything after it can be handled by the NMS from my own YOLOv3 inference code instead.
After swapping it in, the boxes come out and are drawn correctly, and a test with the Darknet weights works too. I could have cried: it was the original author's NMS that had been burying me all along.
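
For reference, a minimal per-class NMS along the lines of what I swapped in. This is only a sketch built on torchvision.ops.nms, not my actual code or the author's dynamic_write_results, and it assumes the usual predict_transform row layout (cx, cy, w, h, objectness, class scores):

import torch
from torchvision.ops import nms

def simple_nms(detections, conf_thres=0.5, iou_thres=0.3):
    det = detections[0]                       # [N, 5 + num_classes]
    det = det[det[:, 4] > conf_thres]         # objectness threshold
    if det.numel() == 0:
        return None
    boxes = torch.empty_like(det[:, :4])      # (cx, cy, w, h) -> (x1, y1, x2, y2)
    boxes[:, 0] = det[:, 0] - det[:, 2] / 2
    boxes[:, 1] = det[:, 1] - det[:, 3] / 2
    boxes[:, 2] = det[:, 0] + det[:, 2] / 2
    boxes[:, 3] = det[:, 1] + det[:, 3] / 2
    cls_conf, cls_id = det[:, 5:].max(1)
    scores = det[:, 4] * cls_conf
    # shift boxes by class id so a single nms call stays class-aware
    keep = nms(boxes + cls_id[:, None].float() * 4096, scores, iou_thres)
    return torch.cat((boxes[keep], scores[keep, None], cls_id[keep, None].float()), 1)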

Oddly, the modified TensorRT pipeline's inference time came out even longer than my PyTorch inference. Profiling showed image preprocessing took nearly 10 ms (on a GTX 1060); it turned out my own preprocessing code was the slow part, while the conversion repo's preprocessing takes about half the time. After swapping that in, the total time did drop somewhat, and the YOLOv3 backbone inference is roughly 10% faster, which is a far cry from the big speedups reported online. Something to improve later.

V. Converting yolov3-spp to ONNX and TensorRT

  1. Use the officially provided cfg and weights files; remember the two blank lines at the end of the cfg, and change the input size to 416×416.

  2. yolov3-spp has maxpool layers that plain v3 does not, so maxpool support has to be added in weights_to_onnx.py.

    Inside class GraphBuilderONNX(object), add:

    def _make_maxpool_node(self, layer_name, layer_dict):
        stride = layer_dict['stride']
        # stride = 1
        kernel_size = layer_dict['size']
        previous_node_specs = self._get_previous_node_specs()
        inputs = [previous_node_specs.name]
        channels = previous_node_specs.channels
        kernel_shape = [kernel_size, kernel_size]
        strides = [stride, stride]
        assert channels > 0
        maxpool_node = helper.make_node(
            'MaxPool',
            inputs=inputs,
            outputs=[layer_name],
            kernel_shape=kernel_shape,
            strides=strides,
            auto_pad='SAME_UPPER',
            name=layer_name,
        )
        self._nodes.append(maxpool_node)
        return layer_name, channels


Inside the same class's def _make_onnx_node(self, layer_name, layer_dict), register it:

node_creators['maxpool'] = self._make_maxpool_node
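
A note on the auto_pad='SAME_UPPER' choice above: the SPP maxpools have stride 1 and kernel sizes 5, 9 and 13, and their outputs must keep the input's spatial size so they can be concatenated. With SAME_UPPER, ONNX pads so that the output dimension is ceil(input / stride), which leaves the size unchanged at stride 1 regardless of kernel size:

import math

inp, stride = 13, 1
for kernel in (5, 9, 13):
    out = math.ceil(inp / stride)                    # ONNX SAME_* rule: ceil(input / stride)
    pad = max((out - 1) * stride + kernel - inp, 0)  # total padding ONNX inserts
    print('kernel %2d: %d -> %d (total pad %d)' % (kernel, inp, out, pad))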

The layers used as outputs in the main function also change; you can simply print every layer's weight names to find them:

    # yolo-v3-spp(416*416)
    output_tensor_dims['089_convolutional'] = [255, 13, 13]
    output_tensor_dims['101_convolutional'] = [255, 26, 26]
    output_tensor_dims['113_convolutional'] = [255, 52, 52]

But it keeps failing with:

  File "/home/lgy/PycharmProjects/TensorRT_yolo3_module/spp_weight_to_onnx.py", line 296, in _load_one_param_type
    buffer=self.weights_file.read(param_size * 4))
TypeError: buffer is too small for requested array

The error means the requested array shape doesn't match what can still be read from yolov3-spp.weights. My first suspicion was a cfg/weights mismatch, but inference in the PyTorch project runs fine with the same pair, which rules that out.
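
One quick cross-check is to count how many float32 parameters the file actually contains and compare that with what the converter expects to read. A sketch, assuming the standard Darknet header (3 int32 version fields, then an images-seen counter that is int64 from format version 0.2 on):

import os
import numpy as np

path = 'weights_spp_80/yolov3-spp.weights'
with open(path, 'rb') as f:
    major, minor, revision = np.fromfile(f, dtype=np.int32, count=3)
    seen_dtype = np.int64 if major * 10 + minor >= 2 else np.int32
    seen = np.fromfile(f, dtype=seen_dtype, count=1)
    n_floats = (os.path.getsize(path) - f.tell()) // 4  # 4 bytes per float32 param
print('version %d.%d.%d, seen %d, %d float params in file'
      % (major, minor, revision, seen[0], n_floats))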
Printing the relevant shapes right before the failing line:

    def _load_one_param_type(self, conv_params, param_category, suffix):
        """Deserializes the weights from a file stream in the DarkNet order.

        Keyword arguments:
        conv_params -- a ConvParams object
        param_category -- the category of parameters to be created ('bn' or 'conv')
        suffix -- a string determining the sub-type of above param_category (e.g.,
        'weights' or 'bias')
        """
        param_name = conv_params.generate_param_name(param_category, suffix)
        channels_out, channels_in, filter_h, filter_w = conv_params.conv_weight_dims
        if param_category == 'bn':
            param_shape = [channels_out]
        elif param_category == 'conv':
            if suffix == 'weights':
                param_shape = [channels_out, channels_in, filter_h, filter_w]
            elif suffix == 'bias':
                param_shape = [channels_out]
        param_size = np.product(np.array(param_shape))
        print(param_name,param_shape)
        print(param_size)
        param_data = np.ndarray(
            shape=param_shape,
            dtype='float32',
            buffer=self.weights_file.read(param_size * 4))
        print(param_data.shape)
        print('----')
        param_data = param_data.flatten().astype(float)
        return param_name, param_data, param_shape

The output:

----
109_convolutional_conv_weights [128, 256, 1, 1]
32768
(128, 256, 1, 1)
----
110_convolutional_bn_bias [256]
256
(256,)
----
110_convolutional_bn_scale [256]
256
(256,)
----
110_convolutional_bn_mean [256]
256
(256,)
----
110_convolutional_bn_var [256]
256
(256,)
----
110_convolutional_conv_weights [256, 128, 3, 3]
294912
Traceback (most recent call last):
  File "/home/lgy/PycharmProjects/TensorRT_yolo3_module/spp_weight_to_onnx.py", line 721, in <module>
    main(cfg='weights_spp_80/yolov3-spp.cfg', weights_file='weights_spp_80/yolov3-spp.weights', onnx_file='weights_spp_80/yolov3-spp.onnx')

Clearly, the shape at layer 110 is where things break. I tried all sorts of things here without solving it. Oddly, changing dtype='float32' to 'float16' lets layer 110's shape print, but then it gets stuck at layer 111. In hindsight that is no surprise: float16 needs only 2 bytes per element, so even a short read of up to param_size * 4 bytes can still fill the array as the file runs dry, and the data it yields is garbage anyway. (In the v3 conversion, even switching to an int dtype caused no problem.)

The fix suggested online for this error is to re-download a matching cfg/weights pair; I re-downloaded them and it made no difference.

Finally I printed the shape of every layer stored in yolov3-spp.weights (or the .pt) and compared them against the shapes in this conversion code, and found:

weights['module_list.109.conv_109.weight'].shape = torch.Size([256, 128, 3, 3])
weights['module_list.110.conv_110.weight'].shape = torch.Size([128, 256, 1, 1])

while in weights_to_onnx.py:

109_convolutional_conv_weights [128, 256, 1, 1]
32768
(128, 256, 1, 1)
110_convolutional_conv_weights [256, 128, 3, 3]
294912

At first glance one layer appears shifted by one position, and since v3-spp only adds the SPP module on top of v3, I guessed the maxpool or route inside the SPP block was being converted to ONNX incorrectly.
On careful comparison, however, yolov3-spp.weights counts layers from module_list.0.conv_0.weight while the ONNX side counts from 001_convolutional_conv_weights, so 110_convolutional_conv_weights [256, 128, 3, 3] actually corresponds to weights['module_list.109.conv_109.weight'].shape = torch.Size([256, 128, 3, 3]). That matches after all; printing every layer of yolov3.weights against its ONNX counterpart confirms the same one-off numbering there too.
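
For the record, the side-by-side comparison came from dumping every conv weight shape in the checkpoint, roughly like this (the .pt path is an example; key names follow the pattern shown above):

import torch

sd = torch.load('weights/yolov3-spp.pt', map_location='cpu')['model']
for name, tensor in sd.items():
    if '.conv_' in name and name.endswith('.weight'):
        print(name, tuple(tensor.shape))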

To be continued.
