libtorch (PyTorch C++) Tutorial (7)

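This article assumes basic PyTorch programming experience and some familiarity with object detection frameworks. A rough understanding of the concepts is enough.

This chapter briefly describes how to implement an object detector in C++ that supports both training and prediction. The detection model uses the yolov4-tiny architecture, and the code structure follows bubbliiiing's yolov4-tiny. The C++ model shared here reproduces the PyTorch version almost exactly and runs 30-40% faster.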

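Model overview

A brief introduction to yolov4-tiny. It is the lightweight variant of version 4 in the YOLO (you only look once) family. Compared with yolov4, it gives up some accuracy in exchange for a large gain in speed. The yolov4_tiny model structure is shown in the figure:

[Figure: yolov4_tiny model structure]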

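The structure is very simple: CSPDarknet53-tiny is the backbone, FPN is the neck, and the YOLO head is the head. The network outputs two feature maps, downsampled from the input image by factors of 32 and 16. During training, each feature map is passed to the loss computation separately, the two losses are summed (or averaged, either works), and the result is backpropagated. During prediction, the boxes decoded from the two feature maps are merged and then filtered with NMS (non-maximum suppression).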
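As a quick check on the shapes involved, the sketch below computes the size of each head output and the number of candidate boxes that reach NMS. It is only an illustration in plain C++: `HeadShape`, `head_output_shape` and `total_candidates` are hypothetical names, and the strides 32 and 16 are taken from the paragraph above.

```cpp
#include <cassert>
#include <initializer_list>

// Shape of one YOLO head output (channels, height, width).
struct HeadShape {
    int channels;
    int height;
    int width;
};

// Each anchor predicts 4 box offsets + 1 objectness score + num_classes class scores.
// The spatial size is the input size divided by the downsampling stride.
HeadShape head_output_shape(int input_size, int stride, int num_anchors, int num_classes) {
    assert(stride > 0 && input_size % stride == 0);
    return { num_anchors * (5 + num_classes), input_size / stride, input_size / stride };
}

// Number of candidate boxes over both heads (strides 32 and 16) before NMS.
int total_candidates(int input_size, int num_anchors) {
    int total = 0;
    for (int stride : { 32, 16 }) {
        const int cells = (input_size / stride) * (input_size / stride);
        total += num_anchors * cells;
    }
    return total;
}
```

For a 416x416 input with 3 anchors per head and 80 classes, this gives heads of 255x13x13 and 255x26x26.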

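Backbone network

CSPDarknet53-tiny is a member of the CSPNet family. CSPNet was released on arXiv in 2019 and presented at a CVPR 2020 workshop; it is a backbone design for improving the performance of object detection models. Interested readers can consult the original paper. Put simply, its contribution is to split the feature map into two parts along the channel dimension, apply different convolutions to each part, and then concatenate them again. Compared with convolving the whole feature map directly, this reduces the amount of computation.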
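To make the saving concrete, here is a small sketch that counts multiply-accumulate operations for a convolution over all channels versus one over half of them. The helper names `conv_macs` and `half_split_cost_ratio` are made up for this illustration, and bias and activation costs are ignored.

```cpp
#include <cassert>
#include <cstdint>

// Multiply-accumulate count of a k x k convolution with stride 1 and
// "same" padding, so the output has the same spatial size as the input.
std::int64_t conv_macs(int height, int width, int in_ch, int out_ch, int kernel) {
    return static_cast<std::int64_t>(height) * width * in_ch * out_ch * kernel * kernel;
}

// Cost of convolving only half of the channels (C/2 -> C/2),
// relative to convolving all of them (C -> C).
double half_split_cost_ratio(int height, int width, int channels, int kernel) {
    assert(channels % 2 == 0);
    const std::int64_t full = conv_macs(height, width, channels, channels, kernel);
    const std::int64_t half = conv_macs(height, width, channels / 2, channels / 2, kernel);
    return static_cast<double>(half) / static_cast<double>(full);
}
```

Because both the input and output channel counts are halved, a convolution on one half costs a quarter of the full convolution.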

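Assuming you have read the earlier parts of my libtorch tutorial series, let's go straight to the code. First comes the basic unit, made up of Conv2d + BatchNorm2d + LeakyReLU.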

    #include <torch/torch.h>

    // Conv2d + BatchNorm2d + LeakyReLU
    class BasicConvImpl : public torch::nn::Module {
    public:
        BasicConvImpl(int in_channels, int out_channels, int kernel_size, int stride = 1);
        torch::Tensor forward(torch::Tensor x);
    private:
        // Declare layers
        torch::nn::Conv2d conv{ nullptr };
        torch::nn::BatchNorm2d bn{ nullptr };
        torch::nn::LeakyReLU activation{ nullptr };
    };
    TORCH_MODULE(BasicConv);

    // conv_options is a helper that is not defined in this chapter. Judging from this
    // call, its trailing arguments are padding, groups and bias.
    BasicConvImpl::BasicConvImpl(int in_channels, int out_channels, int kernel_size,
        int stride) :
        conv(conv_options(in_channels, out_channels, kernel_size, stride,
            int(kernel_size / 2), 1, false)),
        bn(torch::nn::BatchNorm2d(out_channels)),
        activation(torch::nn::LeakyReLU(torch::nn::LeakyReLUOptions().negative_slope(0.1)))
    {
        // Only conv and bn hold parameters; LeakyReLU has none.
        register_module("conv", conv);
        register_module("bn", bn);
    }

    torch::Tensor BasicConvImpl::forward(torch::Tensor x)
    {
        x = conv->forward(x);
        x = bn->forward(x);
        x = activation(x);
        return x;
    }

This layer is the basic building block from which yolo4_tiny will be assembled.
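BasicConv pads each side with `kernel_size / 2`, so the spatial size only depends on the stride. The sketch below applies the standard convolution output-size formula with that padding. `conv_out_size` is a hypothetical helper written for this illustration and is not part of libtorch.

```cpp
#include <cassert>

// Output size of a convolution along one spatial dimension:
//   out = floor((in + 2 * padding - kernel) / stride) + 1
// with padding = kernel / 2, as used by BasicConv.
int conv_out_size(int in_size, int kernel_size, int stride) {
    assert(in_size > 0 && kernel_size > 0 && stride > 0);
    const int padding = kernel_size / 2;
    return (in_size + 2 * padding - kernel_size) / stride + 1;
}
```

With a 3x3 kernel, stride 1 keeps the size and stride 2 halves it, so two stride-2 layers take 416 to 208 and then to 104.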

Next is the Resblock_body module:

    class Resblock_bodyImpl : public torch::nn::Module {
    public:
        Resblock_bodyImpl(int in_channels, int out_channels);
        std::vector<torch::Tensor> forward(torch::Tensor x);
    private:
        int out_channels;
        BasicConv conv1{ nullptr };
        BasicConv conv2{ nullptr };
        BasicConv conv3{ nullptr };
        BasicConv conv4{ nullptr };
        torch::nn::MaxPool2d maxpool{ nullptr };
    };
    TORCH_MODULE(Resblock_body);

    Resblock_bodyImpl::Resblock_bodyImpl(int in_channels, int out_channels) {
        this->out_channels = out_channels;
        conv1 = BasicConv(in_channels, out_channels, 3);
        conv2 = BasicConv(out_channels / 2, out_channels / 2, 3);
        conv3 = BasicConv(out_channels / 2, out_channels / 2, 3);
        conv4 = BasicConv(out_channels, out_channels, 1);
        // maxpool_options(kernel_size, stride) is a helper not defined in this chapter.
        maxpool = torch::nn::MaxPool2d(maxpool_options(2, 2));
        register_module("conv1", conv1);
        register_module("conv2", conv2);
        register_module("conv3", conv3);
        register_module("conv4", conv4);
    }

    std::vector<torch::Tensor> Resblock_bodyImpl::forward(torch::Tensor x) {
        auto c = out_channels;
        x = conv1->forward(x);
        auto route = x;
        // Split along the channel dimension and keep the second half
        x = torch::split(x, c / 2, 1)[1];
        x = conv2->forward(x);
        auto route1 = x;
        x = conv3->forward(x);
        // c/2 + c/2 -> c channels
        x = torch::cat({ x, route1 }, 1);
        x = conv4->forward(x);
        auto feat = x;
        // c + c -> 2c channels, then the 2x2 max pool halves the spatial size
        x = torch::cat({ route, x }, 1);
        x = maxpool->forward(x);
        return std::vector<torch::Tensor>({ x, feat });
    }

Finally, the backbone itself:

    class CSPdarknet53_tinyImpl : public torch::nn::Module
    {
    public:
        CSPdarknet53_tinyImpl();
        std::vector<torch::Tensor> forward(torch::Tensor x);
    private:
        BasicConv conv1{ nullptr };
        BasicConv conv2{ nullptr };
        Resblock_body resblock_body1{ nullptr };
        Resblock_body resblock_body2{ nullptr };
        Resblock_body resblock_body3{ nullptr };
        BasicConv conv3{ nullptr };
    };
    TORCH_MODULE(CSPdarknet53_tiny);

    CSPdarknet53_tinyImpl::CSPdarknet53_tinyImpl() {
        conv1 = BasicConv(3, 32, 3, 2);
        conv2 = BasicConv(32, 64, 3, 2);
        resblock_body1 = Resblock_body(64, 64);
        resblock_body2 = Resblock_body(128, 128);
        resblock_body3 = Resblock_body(256, 256);
        conv3 = BasicConv(512, 512, 3);
        register_module("conv1", conv1);
        register_module("conv2", conv2);
        register_module("resblock_body1", resblock_body1);
        register_module("resblock_body2", resblock_body2);
        register_module("resblock_body3", resblock_body3);
        register_module("conv3", conv3);
    }

    std::vector<torch::Tensor> CSPdarknet53_tinyImpl::forward(torch::Tensor x) {
        // 416, 416, 3 -> 208, 208, 32 -> 104, 104, 64
        x = conv1(x);
        x = conv2(x);
        // 104, 104, 64 -> 52, 52, 128
        x = resblock_body1->forward(x)[0];
        // 52, 52, 128 -> 26, 26, 256
        x = resblock_body2->forward(x)[0];
        // 26, 26, 256 -> 13, 13, 512
        // feat1 is 26, 26, 256
        auto res_out = resblock_body3->forward(x);
        x = res_out[0];
        auto feat1 = res_out[1];
        // 13, 13, 512 -> 13, 13, 512
        x = conv3->forward(x);
        auto feat2 = x;
        return std::vector<torch::Tensor>({ feat1, feat2 });
    }
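The shape comments in the backbone can be reproduced with a little bookkeeping, which is a handy way to catch channel mismatches before running the network. The following sketch tracks only channel count and spatial size. `FeatureShape`, `strided_conv`, `resblock_body` and `backbone_shapes` are hypothetical names that mirror the modules above but contain no libtorch code.

```cpp
#include <cassert>
#include <utility>

struct FeatureShape {
    int channels;
    int size;  // height == width for a square input
};

// BasicConv with a 3x3 kernel, padding 1 and stride 2.
FeatureShape strided_conv(FeatureShape in, int out_channels) {
    return { out_channels, (in.size + 2 - 3) / 2 + 1 };
}

// Resblock_body(c, c): feat is the conv4 output (c channels, same size);
// the block output is cat(route, feat) with 2c channels after a 2x2 max pool.
std::pair<FeatureShape, FeatureShape> resblock_body(FeatureShape in, int out_channels) {
    assert(in.channels == out_channels);
    FeatureShape feat{ out_channels, in.size };
    FeatureShape pooled{ 2 * out_channels, in.size / 2 };
    return { pooled, feat };
}

// Returns {feat1, feat2} as produced by the backbone's forward().
std::pair<FeatureShape, FeatureShape> backbone_shapes(int input_size) {
    FeatureShape x{ 3, input_size };
    x = strided_conv(x, 32);
    x = strided_conv(x, 64);
    x = resblock_body(x, 64).first;
    x = resblock_body(x, 128).first;
    const auto out = resblock_body(x, 256);
    const FeatureShape feat1 = out.second;
    const FeatureShape feat2{ 512, out.first.size };  // conv3: 512 -> 512, stride 1
    return { feat1, feat2 };
}
```

For a 416x416 input this yields feat1 with 256 channels at 26x26 and feat2 with 512 channels at 13x13, matching the comments in `forward`.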

At this point the yolo4_tiny backbone is complete. Next we build the yolo4_tiny model itself.

yolov4_tiny

The feature maps produced by the backbone pass through the FPN, which needs an upsampling module.

    // conv + upsample
    class UpsampleImpl : public torch::nn::Module {
    public:
        UpsampleImpl(int in_channels, int out_channels);
        torch::Tensor forward(torch::Tensor x);
    private:
        // Declare layers
        torch::nn::Sequential upsample = torch::nn::Sequential();
    };
    TORCH_MODULE(Upsample);

    UpsampleImpl::UpsampleImpl(int in_channels, int out_channels)
    {
        // Only the 1x1 convolution lives in the Sequential; the nearest-neighbour
        // upsampling is applied in forward() rather than through torch::nn::Upsample.
        upsample = torch::nn::Sequential(
            BasicConv(in_channels, out_channels, 1)
        );
        register_module("upsample", upsample);
    }

    torch::Tensor UpsampleImpl::forward(torch::Tensor x)
    {
        x = upsample->forward(x);
        // Double the height and width with nearest-neighbour interpolation
        x = at::upsample_nearest2d(x, { x.sizes()[2] * 2, x.sizes()[3] * 2 });
        return x;
    }
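For readers unfamiliar with nearest-neighbour upsampling, the sketch below shows what a 2x upsample does to a single channel: every input value is copied into a 2x2 block of the output. It is written in plain C++ on a `std::vector` grid purely for illustration, and `upsample_nearest_2x` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<float>>;

// Nearest-neighbour 2x upsampling of one channel:
// output pixel (y, x) takes the value of input pixel (y / 2, x / 2).
Grid upsample_nearest_2x(const Grid& in) {
    assert(!in.empty() && !in[0].empty());
    const std::size_t h = in.size();
    const std::size_t w = in[0].size();
    Grid out(2 * h, std::vector<float>(2 * w));
    for (std::size_t y = 0; y < 2 * h; ++y) {
        for (std::size_t x = 0; x < 2 * w; ++x) {
            out[y][x] = in[y / 2][x / 2];
        }
    }
    return out;
}
```

Applied to a 13x13 map this gives a 26x26 map, which is what lets P5 be concatenated with feat1.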

Then the yolo_head module:

    // Declaration
    torch::nn::Sequential yolo_head(std::vector<int> filters_list, int in_filters);

    // Definition: a 3x3 BasicConv followed by a plain 1x1 Conv2d
    torch::nn::Sequential yolo_head(std::vector<int> filters_list, int in_filters) {
        auto m = torch::nn::Sequential(BasicConv(in_filters, filters_list[0], 3),
            torch::nn::Conv2d(conv_options(filters_list[0], filters_list[1], 1)));
        return m;
    }

And yolo_body:

    class YoloBody_tinyImpl : public torch::nn::Module {
    public:
        YoloBody_tinyImpl(int num_anchors, int num_classes);
        std::vector<torch::Tensor> forward(torch::Tensor x);
    private:
        // Declare layers
        CSPdarknet53_tiny backbone{ nullptr };
        BasicConv conv_for_P5{ nullptr };
        Upsample upsample{ nullptr };
        torch::nn::Sequential yolo_headP5{ nullptr };
        torch::nn::Sequential yolo_headP4{ nullptr };
    };
    TORCH_MODULE(YoloBody_tiny);

    YoloBody_tinyImpl::YoloBody_tinyImpl(int num_anchors, int num_classes) {
        backbone = CSPdarknet53_tiny();
        conv_for_P5 = BasicConv(512, 256, 1);
        yolo_headP5 = yolo_head({ 512, num_anchors * (5 + num_classes) }, 256);
        upsample = Upsample(256, 128);
        yolo_headP4 = yolo_head({ 256, num_anchors * (5 + num_classes) }, 384);
        register_module("backbone", backbone);
        register_module("conv_for_P5", conv_for_P5);
        register_module("yolo_headP5", yolo_headP5);
        register_module("upsample", upsample);
        register_module("yolo_headP4", yolo_headP4);
    }

    std::vector<torch::Tensor> YoloBody_tinyImpl::forward(torch::Tensor x) {
        // The backbone returns feat1 with shape {26, 26, 256} and feat2 with shape {13, 13, 512}
        auto backbone_out = backbone->forward(x);
        auto feat1 = backbone_out[0];
        auto feat2 = backbone_out[1];
        // 13, 13, 512 -> 13, 13, 256
        auto P5 = conv_for_P5->forward(feat2);
        // 13, 13, 256 -> 13, 13, 512 -> 13, 13, 255
        auto out0 = yolo_headP5->forward(P5);
        // 13, 13, 256 -> 13, 13, 128 -> 26, 26, 128
        auto P5_Upsample = upsample->forward(P5);
        // 26, 26, 256 + 26, 26, 128 -> 26, 26, 384
        auto P4 = torch::cat({ P5_Upsample, feat1 }, 1);
        // 26, 26, 384 -> 26, 26, 256 -> 26, 26, 255
        auto out1 = yolo_headP4->forward(P4);
        return std::vector<torch::Tensor>({ out0, out1 });
    }

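Having written the code this far, a careful reader will notice that it is essentially a port of the PyTorch code to libtorch. Apart from a few bugs that need debugging, most of it translates directly to C++, which makes the process quite convenient.

As in the earlier chapters, generate a TorchScript model. bubbliiiing's yolov4-tiny provides a version trained on COCO; the following code generates the .pt file from it: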

    import numpy as np
    import torch

    from nets.yolo4_tiny import YoloBody

    device = torch.device('cpu')
    model = YoloBody(3, 80).to(device)
    model_path = "model_data/yolov4_tiny_weights_coco.pth"

    print('Loading weights into state dict...')
    model_dict = model.state_dict()
    pretrained_dict = torch.load(model_path, map_location=device)
    # Keep only the weights whose name exists in the model and whose shape matches
    pretrained_dict = {k: v for k, v in pretrained_dict.items()
                       if k in model_dict and np.shape(model_dict[k]) == np.shape(v)}
    model_dict.update(pretrained_dict)
    model.load_state_dict(model_dict)
    print('Finished!')

    # Generate the .pt model, following the official documentation
    model.eval()
    var = torch.ones((1, 3, 416, 416))
    traced_script_module = torch.jit.trace(model, var)
    traced_script_module.save("yolo4_tiny.pt")

Then use the following C++ code to check that the file loads correctly:

    auto model = YoloBody_tiny(3, 80);
    torch::load(model, "weights/yolo4_tiny.pt");

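If it runs without errors, the weights were loaded successfully.

Prediction

Prediction requires decoding the tensors output by the YOLO4_tiny model. The C++ decode function below follows the decode function in the original source code; this is where the importance of chapter 2 of the libtorch tutorial becomes apparent.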

    torch::Tensor DecodeBox(torch::Tensor input, torch::Tensor anchors, int num_classes, int img_size[])
    {
        int num_anchors = anchors.sizes()[0];
        int bbox_attrs = 5 + num_classes;
        int batch_size = input.sizes()[0];
        int input_height = input.sizes()[2];
        int input_width = input.sizes()[3];
        // Stride: how many input pixels one feature-map cell covers.
        // For a 13x13 feature map, one cell covers 416 / 13 = 32 pixels.
        auto stride_h = img_size[1] / input_height;
        auto stride_w = img_size[0] / input_width;
        // Express the anchor width and height in feature-map units
        auto scaled_anchors = anchors.clone();
        scaled_anchors.select(1, 0).div_(stride_w);
        scaled_anchors.select(1, 1).div_(stride_h);
        // bs, 3 * (5 + num_classes), 13, 13 -> bs, 3, 13, 13, (5 + num_classes)
        auto prediction = input.view({ batch_size, num_anchors, bbox_attrs, input_height, input_width }).permute({ 0, 1, 3, 4, 2 }).contiguous();
        // Offsets of the box centre within its cell
        auto x = torch::sigmoid(prediction.select(-1, 0));
        auto y = torch::sigmoid(prediction.select(-1, 1));
        // Width and height adjustments relative to the anchor
        auto w = prediction.select(-1, 2); // Width
        auto h = prediction.select(-1, 3); // Height
        // Objectness confidence
        auto conf = torch::sigmoid(prediction.select(-1, 4));
        // Per-class confidence
        auto pred_cls = torch::sigmoid(prediction.narrow(-1, 5, num_classes));
        auto FloatType = x.options();
        // Grid of cell top-left corners, shape batch_size, 3, 13, 13
        auto grid_x = torch::linspace(0, input_width - 1, input_width).repeat({ input_height, 1 }).repeat(
            { batch_size * num_anchors, 1, 1 }).view(x.sizes()).to(FloatType);
        auto grid_y = torch::linspace(0, input_height - 1, input_height).repeat({ input_width, 1 }).t().repeat(
            { batch_size * num_anchors, 1, 1 }).view(y.sizes()).to(FloatType);
        // Anchor width and height broadcast to the shape of w and h
        auto anchor_w = scaled_anchors.to(FloatType).narrow(1, 0, 1);
        auto anchor_h = scaled_anchors.to(FloatType).narrow(1, 1, 1);
        anchor_w = anchor_w.repeat({ batch_size, 1 }).repeat({ 1, 1, input_height * input_width }).view(w.sizes());
        anchor_h = anchor_h.repeat({ batch_size, 1 }).repeat({ 1, 1, input_height * input_width }).view(h.sizes());
        // Adjusted box centre and size (cx, cy, w, h) in feature-map units
        auto pred_boxes = torch::stack({ x + grid_x, y + grid_y, w.exp() * anchor_w, h.exp() * anchor_h }, -1);
        // Scale the boxes back to the 416x416 input
        std::vector<int> scales{ stride_w, stride_h, stride_w, stride_h };
        auto _scale = torch::tensor(scales).to(FloatType);
        pred_boxes = pred_boxes.view({ batch_size, -1, 4 }) * _scale;
        conf = conf.view({ batch_size, -1, 1 });
        pred_cls = pred_cls.view({ batch_size, -1, num_classes });
        auto output = torch::cat({ pred_boxes, conf, pred_cls }, -1);
        return output;
    }
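The tensor code above can be hard to follow, so here is the same arithmetic for a single grid cell and a single anchor in plain C++. This is only a sketch of the decoding rule; `Box`, `sigmoid` and `decode_cell` are hypothetical names, and the anchor size is assumed to be given in input-image pixels, as in the anchor list used for prediction.

```cpp
#include <cassert>
#include <cmath>

struct Box {
    float cx, cy, w, h;  // centre and size in input-image pixels
};

inline float sigmoid(float v) { return 1.0f / (1.0f + std::exp(-v)); }

// tx, ty, tw, th: raw network outputs for one anchor in cell (col, row).
// anchor_w, anchor_h: anchor size in input pixels. stride: pixels per cell.
Box decode_cell(float tx, float ty, float tw, float th,
                int col, int row, float anchor_w, float anchor_h, int stride) {
    assert(stride > 0);
    // Centre: cell corner plus a sigmoid offset, in feature-map units
    const float gx = sigmoid(tx) + static_cast<float>(col);
    const float gy = sigmoid(ty) + static_cast<float>(row);
    // Size: anchor (in feature-map units) scaled by exp of the raw output
    const float gw = std::exp(tw) * (anchor_w / static_cast<float>(stride));
    const float gh = std::exp(th) * (anchor_h / static_cast<float>(stride));
    // Back to input-image pixels
    return { gx * stride, gy * stride, gw * stride, gh * stride };
}
```

With all raw outputs at zero, the box sits at the centre of its cell and has exactly the anchor's size, which is a convenient sanity check when debugging a decoder.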

In addition, the output has to go through non-maximum suppression. Based on my post on several ways to write NMS, the functions are:

    #include <set>

    std::vector<int> nms_libtorch(torch::Tensor bboxes, torch::Tensor scores, float thresh) {
        auto x1 = bboxes.select(-1, 0);
        auto y1 = bboxes.select(-1, 1);
        auto x2 = bboxes.select(-1, 2);
        auto y2 = bboxes.select(-1, 3);
        auto areas = (x2 - x1) * (y2 - y1);               // [N] area of each box
        auto order = std::get<1>(scores.sort(0, true));   // indices in descending score order
        std::vector<int> keep;
        while (order.numel() > 0) {
            // Keep the remaining box with the highest score
            int i = static_cast<int>(order[0].item<int64_t>());
            keep.push_back(i);
            if (order.numel() == 1) break;
            // IoU between box i and every other remaining box, each of shape [N - 1]
            auto rest = order.narrow(0, 1, order.size(0) - 1);
            auto xx1 = x1.index({ rest }).clamp_min(x1[i].item<float>());
            auto yy1 = y1.index({ rest }).clamp_min(y1[i].item<float>());
            auto xx2 = x2.index({ rest }).clamp_max(x2[i].item<float>());
            auto yy2 = y2.index({ rest }).clamp_max(y2[i].item<float>());
            auto inter = (xx2 - xx1).clamp_min(0) * (yy2 - yy1).clamp_min(0);
            auto iou = inter / (areas[i] + areas.index({ rest }) - inter);
            // Keep only boxes whose overlap with box i is at most thresh
            auto idx = (iou <= thresh).nonzero().squeeze(1);
            order = rest.index({ idx });
        }
        return keep;
    }

    std::vector<torch::Tensor> non_maximum_suppression(torch::Tensor prediction, int num_classes, float conf_thres, float nms_thres) {
        // Convert (cx, cy, w, h) to (x1, y1, x2, y2) in place
        prediction.select(-1, 0) -= prediction.select(-1, 2) / 2;
        prediction.select(-1, 1) -= prediction.select(-1, 3) / 2;
        prediction.select(-1, 2) += prediction.select(-1, 0);
        prediction.select(-1, 3) += prediction.select(-1, 1);
        std::vector<torch::Tensor> output;
        for (int image_id = 0; image_id < prediction.sizes()[0]; image_id++) {
            auto image_pred = prediction[image_id];
            auto max_out_tuple = torch::max(image_pred.narrow(-1, 5, num_classes), -1, true);
            auto class_conf = std::get<0>(max_out_tuple);   // [N, 1]
            auto class_pred = std::get<1>(max_out_tuple);   // [N, 1]
            // Keep boxes with obj_conf * class_conf >= conf_thres
            auto conf_mask = image_pred.select(-1, 4) * class_conf.select(-1, 0) >= conf_thres;
            image_pred = image_pred.index({ conf_mask }).to(torch::kFloat);
            class_conf = class_conf.index({ conf_mask }).to(torch::kFloat);
            class_pred = class_pred.index({ conf_mask }).to(torch::kFloat);
            if (image_pred.size(0) == 0) {
                // No detections: push an empty flat tensor
                output.push_back(torch::zeros({ 0 }));
                continue;
            }
            // Each row is (x1, y1, x2, y2, obj_conf, class_conf, class_pred)
            auto detections = torch::cat({ image_pred.narrow(-1, 0, 5), class_conf, class_pred }, 1);
            // Collect the classes present in this image
            std::set<int> img_classes;
            for (int64_t m = 0; m < detections.size(0); m++) {
                img_classes.insert(detections[m][6].item<int>());
            }
            // Run NMS separately for each class
            std::vector<torch::Tensor> kept;
            for (int c : img_classes) {
                auto detections_class = detections.index({ detections.select(-1, -1) == c });
                auto keep = nms_libtorch(detections_class.narrow(-1, 0, 4), detections_class.select(-1, 4) * detections_class.select(-1, 5), nms_thres);
                for (auto v : keep) {
                    kept.push_back(detections_class[v]);   // 1-D tensor of 7 values
                }
            }
            // Concatenating the 1-D rows gives a flat tensor with 7 values per box,
            // which is the layout Predict and show_bbox_coco expect.
            output.push_back(torch::cat(kept, 0));
        }
        return output;
    }
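The greedy procedure is easier to see without tensor indexing. Below is a minimal plain C++ sketch of the same idea: sort by score, keep the best box, and drop every remaining box whose IoU with it exceeds the threshold. `Det`, `iou` and `nms` are hypothetical names for this illustration, and boxes are assumed to be in (x1, y1, x2, y2) form.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Det {
    float x1, y1, x2, y2, score;
};

// Intersection over union of two corner-format boxes.
float iou(const Det& a, const Det& b) {
    const float iw = std::max(0.0f, std::min(a.x2, b.x2) - std::max(a.x1, b.x1));
    const float ih = std::max(0.0f, std::min(a.y2, b.y2) - std::max(a.y1, b.y1));
    const float inter = iw * ih;
    const float uni = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

// Greedy NMS: returns the indices of the kept boxes, best score first.
std::vector<int> nms(const std::vector<Det>& dets, float thresh) {
    assert(thresh >= 0.0f);
    std::vector<int> order(dets.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(),
              [&dets](int a, int b) { return dets[a].score > dets[b].score; });
    std::vector<bool> suppressed(dets.size(), false);
    std::vector<int> keep;
    for (std::size_t i = 0; i < order.size(); ++i) {
        const int cur = order[i];
        if (suppressed[cur]) continue;
        keep.push_back(cur);
        // Suppress lower-scoring boxes that overlap the kept box too much
        for (std::size_t j = i + 1; j < order.size(); ++j) {
            const int other = order[j];
            if (!suppressed[other] && iou(dets[cur], dets[other]) > thresh) {
                suppressed[other] = true;
            }
        }
    }
    return keep;
}
```

As in the libtorch version, a box survives only if its IoU with every higher-scoring kept box is at most `thresh`.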

With these functions in place, write the prediction function:

    #include <iostream>
    #include <string>
    #include <vector>
    #include <opencv2/opencv.hpp>

    // bboxes holds 7 values per box: x1, y1, x2, y2, obj_conf, class_conf, class_pred.
    // nums is unused.
    void show_bbox_coco(cv::Mat image, torch::Tensor bboxes, int nums) {
        // Text drawing parameters
        int font_face = cv::FONT_HERSHEY_COMPLEX;
        double font_scale = 0.4;
        int thickness = 1;
        // Copy the boxes into a flat float buffer on the CPU
        bboxes = bboxes.to(torch::kCPU).to(torch::kFloat).contiguous().flatten();
        std::cout << bboxes << std::endl;
        std::vector<float> bbox(bboxes.data_ptr<float>(), bboxes.data_ptr<float>() + bboxes.numel());
        for (size_t i = 0; i + 6 < bbox.size(); i += 7)
        {
            cv::Point top_left(int(bbox[i + 0]), int(bbox[i + 1]));
            cv::Point bottom_right(int(bbox[i + 2]), int(bbox[i + 3]));
            cv::rectangle(image, top_left, bottom_right, cv::Scalar(0, 0, 255));
            // Draw the class index just inside the top-left corner of the box
            cv::Point origin(top_left.x, top_left.y + 8);
            cv::putText(image, std::to_string(int(bbox[i + 6])), origin, font_face, font_scale, cv::Scalar(0, 0, 255), thickness, 1, 0);
        }
        cv::imshow("test", image);
        cv::waitKey(0);
        cv::destroyAllWindows();
    }

    void Predict(YoloBody_tiny detector, cv::Mat image, bool show, float conf_thresh, float nms_thresh) {
        int origin_width = image.cols;
        int origin_height = image.rows;
        // Resize a copy to the network input size; boxes are drawn on the original image.
        // OpenCV stores images as BGR and no colour conversion is applied here.
        cv::Mat input;
        cv::resize(image, input, { 416, 416 });
        auto img_tensor = torch::from_blob(input.data, { input.rows, input.cols, 3 }, torch::kByte);
        img_tensor = img_tensor.permute({ 2, 0, 1 }).unsqueeze(0).to(torch::kFloat) / 255.0;
        float anchor[12] = { 10,14, 23,27, 37,58, 81,82, 135,169, 344,319 };
        auto anchors_ = torch::from_blob(anchor, { 6, 2 }, torch::TensorOptions(torch::kFloat32));
        int image_size[2] = { 416, 416 };
        img_tensor = img_tensor.cuda();

        detector->eval();              // BatchNorm must use its running statistics at inference
        torch::NoGradGuard no_grad;    // no gradients are needed for prediction
        auto outputs = detector->forward(img_tensor);

        // Decode the 26x26 head with the three smaller anchors and the 13x13 head with the three larger ones
        std::vector<torch::Tensor> output_list;
        output_list.push_back(DecodeBox(outputs[1], anchors_.narrow(0, 0, 3), 80, image_size));
        output_list.push_back(DecodeBox(outputs[0], anchors_.narrow(0, 3, 3), 80, image_size));
        auto output = torch::cat(output_list, 1);
        auto detection = non_maximum_suppression(output, 80, conf_thresh, nms_thresh);

        // Scale the boxes from 416x416 back to the original image size
        float w_scale = float(origin_width) / 416;
        float h_scale = float(origin_height) / 416;
        for (size_t i = 0; i < detection.size(); i++) {
            int64_t n = detection[i].size(0) / 7;   // flat layout: 7 values per box
            if (n == 0) continue;
            auto boxes = detection[i].view({ n, 7 });
            boxes.select(1, 0).mul_(w_scale);
            boxes.select(1, 1).mul_(h_scale);
            boxes.select(1, 2).mul_(w_scale);
            boxes.select(1, 3).mul_(h_scale);
        }
        if (show)
            show_bbox_coco(image, detection[0], 80);
        return;
    }
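The last step of `Predict` maps box coordinates from the 416x416 network input back to the original image. The sketch below does the same on a flat `std::vector<float>` with 7 values per detection, which is the layout the functions above pass around. `rescale_detections` is a hypothetical helper that assumes the image was resized directly, without padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dets: flat buffer with 7 floats per detection
// (x1, y1, x2, y2, obj_conf, class_conf, class_id), coordinates in network-input pixels.
void rescale_detections(std::vector<float>& dets, int net_size, int orig_w, int orig_h) {
    assert(net_size > 0 && dets.size() % 7 == 0);
    const float w_scale = static_cast<float>(orig_w) / static_cast<float>(net_size);
    const float h_scale = static_cast<float>(orig_h) / static_cast<float>(net_size);
    for (std::size_t i = 0; i < dets.size(); i += 7) {
        dets[i + 0] *= w_scale;  // x1
        dets[i + 1] *= h_scale;  // y1
        dets[i + 2] *= w_scale;  // x2
        dets[i + 3] *= h_scale;  // y2
    }
}
```

Only the four coordinates change; the confidences and class index are left as they are.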

Use an image from the VOC dataset to check that the functions work. The code above can be applied directly to the .pt file, for example:

    cv::Mat image = cv::imread("2007_005331.jpg");
    auto model = YoloBody_tiny(3, 80);
    torch::load(model, "weights/yolo4_tiny.pt");
    model->to(torch::kCUDA);
    Predict(model, image, true, 0.1, 0.3);

The image used is shown below.


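The prediction result is as follows:

Two conclusions can be drawn from the result:

  • Detection boxes are output, so the prediction function is most likely correct;
  • There are some false detections. Raising the confidence threshold may reduce them, but it will also cause missed detections. The cause is that the preprocessing used when the .pt file was trained differs from the preprocessing used by the prediction code in this article.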
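The article does not say which preprocessing the pretrained weights were trained with. One common choice in YOLO pipelines is letterbox resizing, which keeps the aspect ratio and pads the borders instead of stretching the image. The sketch below computes the letterbox parameters under that assumption; `Letterbox`, `letterbox_params` and `unletterbox` are hypothetical names and this is not taken from the original code.

```cpp
#include <algorithm>
#include <cassert>

struct Letterbox {
    float scale;   // factor applied to both width and height
    int new_w;     // resized image width inside the padded square
    int new_h;     // resized image height inside the padded square
    int pad_x;     // padding added on the left
    int pad_y;     // padding added on the top
};

// Parameters for fitting an orig_w x orig_h image into a net_size x net_size
// square while keeping the aspect ratio.
Letterbox letterbox_params(int orig_w, int orig_h, int net_size) {
    assert(orig_w > 0 && orig_h > 0 && net_size > 0);
    const float scale = std::min(static_cast<float>(net_size) / static_cast<float>(orig_w),
                                 static_cast<float>(net_size) / static_cast<float>(orig_h));
    const int new_w = static_cast<int>(orig_w * scale);
    const int new_h = static_cast<int>(orig_h * scale);
    return { scale, new_w, new_h, (net_size - new_w) / 2, (net_size - new_h) / 2 };
}

// Map one coordinate from the padded network input back to the original image.
float unletterbox(float v, int pad, float scale) {
    return (v - static_cast<float>(pad)) / scale;
}
```

If the weights were trained this way, the boxes would need to be mapped back with `unletterbox` rather than with two independent width and height scales.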

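Processing images with the same preprocessing for training and prediction should give much better results. Below is the prediction for the same image after transfer learning from the COCO pretrained weights, training only yolo_head for one epoch on the VOC dataset: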


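Training for longer, adding data augmentation, and training all of the weights should improve the results further.

Training

There is a lot of training code, so it is not covered in this post; see LibtorchTutorials instead. The code in LibtorchTutorials only implements fairly basic functionality. I will extend and refine it separately in the LibtorchSegment and LibtorchDetection projects. If this helped, please give the projects a star.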
