How the Output of the YOLOv5 Detect Layer Is Turned into Prediction Boxes on the Image (box-drawing code in val/detect)


First, in val mode the Detect layer output is a tuple of the form (tensor(bs, 3*(20*20+40*40+80*80), 5+nc), list[3]). The following is the relevant code from val.py.

# Inference
out, train_out = model(im) if training else model(im, augment=augment, val=True)  # inference, loss outputs
out receives the prediction tensor of shape tensor(bs, 3*(20*20+40*40+80*80), 5+nc), where 3*(20*20+40*40+80*80) means that for every cell of the 20x20, 40x40 and 80x80 feature maps produced by the three detection layers, three prediction boxes of different sizes and aspect ratios are predicted. The 5+nc dimension is xywh + conf + the nc class scores. train_out receives a list that is irrelevant to this article and is not discussed further.
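
As a quick sanity check, here is a minimal sketch (assuming a 640x640 input, so strides 8/16/32 give the 80/40/20 feature maps, and the default 3 anchors per scale) of how many raw predictions that tensor contains:

# minimal sketch, assuming a 640x640 input and 3 anchors per grid cell
nc = 80                                # e.g. the COCO classes
grids = [80 * 80, 40 * 40, 20 * 20]    # feature maps from strides 8 / 16 / 32
num_pred = 3 * sum(grids)              # 3 anchor boxes per cell
print(num_pred)                        # 25200 raw boxes per image
print((1, num_pred, 5 + nc))           # shape of `out` for bs = 1: (1, 25200, 85)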
# out = list with bs elements; each element is a tensor of shape (n after filtering, x1y1x2y2 + conf + class = 6)
out = non_max_suppression(out, conf_thres, iou_thres, labels=lb, multi_label=True, agnostic=single_cls)

NMS is then applied to the output. The returned out is a list with bs elements, each a tensor of shape (n after filtering, x1y1x2y2 + conf + class = 6). n differs from tensor to tensor: it is what remains of the 3*(20*20+40*40+80*80) raw boxes after discarding those that fail the confidence and IoU thresholds. Because multi_label=True, the same box can be kept several times with different classes whose confidence exceeds the threshold, so n is not the number of distinct objects in the image but usually somewhat larger.
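
A minimal sketch of what that structure looks like (the shapes and values below are invented purely for illustration):

import torch

bs = 2
# each element: (n, 6) = x1, y1, x2, y2, conf, cls, in pixel coordinates of the input image
out = [
    torch.tensor([[ 48.0,  60.0, 210.0, 340.0, 0.91,  0.0],    # image 0, class 0
                  [ 48.0,  60.0, 210.0, 340.0, 0.31, 16.0]]),  # same box kept again because multi_label=True
    torch.tensor([[300.0, 120.0, 420.0, 260.0, 0.77,  2.0]]),  # image 1, a single detection
]
assert len(out) == bs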

# Plot images: Thread creates and starts two new threads that run plot_images, so plotting does not block validation
if plots and batch_i < 3:
    f = save_dir / f'val_batch{batch_i}_labels.jpg'  # labels
    Thread(target=plot_images, args=(im, targets, paths, f, names), daemon=True).start()
    f = save_dir / f'val_batch{batch_i}_pred.jpg'  # predictions
    Thread(target=plot_images, args=(im, output_to_target(out), paths, f, names), daemon=True).start()

The last line creates a new thread that runs plot_images with the arguments (im, output_to_target(out), paths, f, names). Before plotting, out is passed through output_to_target(), which returns an array (list-like) whose rows are [batch index, class, x, y, w, h, conf], for example [0, 0, x1, y1, w1, h1, 0.5], [0, 2, x1, y1, w1, h1, 0.3], [0, 0, x2, y2, w2, h2, 0.4], ...
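
A simplified sketch of what output_to_target does (this mirrors the idea of the helper in YOLOv5's utils/plots.py, not its exact code): flatten the per-image list into one array, prepend the image index, and convert the boxes from x1y1x2y2 back to xywh.

import numpy as np
from utils.general import xyxy2xywh  # YOLOv5 helper

def output_to_target_sketch(output):
    # output: list (length bs) of (n, 6) tensors [x1, y1, x2, y2, conf, cls]
    targets = []
    for i, o in enumerate(output):                    # i = image index within the batch
        for *box, conf, cls in o.cpu().numpy():
            xywh = xyxy2xywh(np.array(box)[None])[0]  # back to centre-x, centre-y, w, h
            targets.append([i, cls, *xywh, conf])
    return np.array(targets)                          # rows: [batch_id, class, x, y, w, h, conf]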

def plot_images(images, targets, paths=None, fname='images.jpg', names=None, max_size=1920, max_subplots=16):
    # Plot image grid with labels
    if isinstance(images, torch.Tensor):
        images = images.cpu().float().numpy()
    if isinstance(targets, torch.Tensor):
        targets = targets.cpu().numpy()
    if np.max(images[0]) <= 1:
        images *= 255  # de-normalise (optional)
    bs, _, h, w = images.shape  # batch size, _, height, width
    bs = min(bs, max_subplots)  # limit plot images
    ns = np.ceil(bs ** 0.5)  # number of subplots (square)

    # Build Image
    mosaic = np.full((int(ns * h), int(ns * w), 3), 255, dtype=np.uint8)  # init
    for i, im in enumerate(images):
        if i == max_subplots:  # if last batch has fewer images than we expect
            break
        x, y = int(w * (i // ns)), int(h * (i % ns))  # block origin
        im = im.transpose(1, 2, 0)
        mosaic[y:y + h, x:x + w, :] = im

    # Resize (optional)
    scale = max_size / ns / max(h, w)
    if scale < 1:
        h = math.ceil(scale * h)
        w = math.ceil(scale * w)
        mosaic = cv2.resize(mosaic, tuple(int(x * ns) for x in (w, h)))

    # Annotate
    fs = int((h + w) * ns * 0.01)  # font size
    annotator = Annotator(mosaic, line_width=round(fs / 10), font_size=fs, pil=True, example=names)
    for i in range(i + 1):
        x, y = int(w * (i // ns)), int(h * (i % ns))  # block origin: top-left corner of image i inside the batch mosaic
        annotator.rectangle([x, y, x + w, y + h], None, (255, 255, 255), width=2)  # borders
        if paths:
            annotator.text((x + 5, y + 5 + h), text=Path(paths[i]).name[:40], txt_color=(220, 220, 220))  # filenames
        if len(targets) > 0:
            ti = targets[targets[:, 0] == i]  # image targets
            boxes = xywh2xyxy(ti[:, 2:6]).T
            classes = ti[:, 1].astype('int')
            labels = ti.shape[1] == 6  # labels if no conf column
            conf = None if labels else ti[:, 6]  # check for confidence presence (label vs pred)

            if boxes.shape[1]:
                if boxes.max() <= 1.01:  # if normalized with tolerance 0.01
                    boxes[[0, 2]] *= w  # scale to pixels
                    boxes[[1, 3]] *= h
                elif scale < 1:  # absolute coords need scale if image scales
                    boxes *= scale
            boxes[[0, 2]] += x
            boxes[[1, 3]] += y
            for j, box in enumerate(boxes.T.tolist()):
                cls = classes[j]
                color = colors(cls)
                cls = names[cls] if names else cls
                if labels or conf[j] > 0.25:  # 0.25 conf thresh
                    label = f'{cls}' if labels else f'{cls} {conf[j]:.1f}'
                    annotator.box_label(box, label, color=color)
    annotator.im.save(fname)  # save
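
To make the coordinate handling concrete, here is a small worked sketch (the sizes and the box are invented for illustration) of the two key steps: computing the block origin of image i inside the mosaic, and converting one normalized xywh box into x1y1x2y2 pixel coordinates at that position.

import numpy as np

h, w, ns = 640, 640, 2          # image size and a 2x2 mosaic (bs = 4)
i = 3                           # fourth image in the batch
x, y = int(w * (i // ns)), int(h * (i % ns))  # block origin -> (640, 640)

# one normalized xywh box: centre (0.5, 0.5), width 0.2, height 0.3
cx, cy, bw, bh = 0.5, 0.5, 0.2, 0.3
box = np.array([cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2])  # xywh -> x1y1x2y2
box[[0, 2]] *= w                # scale to pixels within the image
box[[1, 3]] *= h
box[[0, 2]] += x                # shift into this image's block of the mosaic
box[[1, 3]] += y
print(box)                      # x1, y1, x2, y2 = 896, 864, 1024, 1056 in mosaic pixels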

Of all this code, we only need to look at the excerpt below: the construction of the Annotator and the loop beneath it.

# Annotate
fs = int((h + w) * ns * 0.01)  # font size
annotator = Annotator(mosaic, line_width=round(fs / 10), font_size=fs, pil=True, example=names)
for i in range(i + 1):
    x, y = int(w * (i // ns)), int(h * (i % ns))  # block origin
    annotator.rectangle([x, y, x + w, y + h], None, (255, 255, 255), width=2)  # borders of each image in the mosaic
    if paths:
        annotator.text((x + 5, y + 5 + h), text=Path(paths[i]).name[:40], txt_color=(220, 220, 220))  # filenames
    if len(targets) > 0:
        ti = targets[targets[:, 0] == i]  # image targets: rows whose batch index == i belong to image i and are kept in ti
        boxes = xywh2xyxy(ti[:, 2:6]).T
        classes = ti[:, 1].astype('int')
        labels = ti.shape[1] == 6  # labels if no conf column
        conf = None if labels else ti[:, 6]  # check for confidence presence (label vs pred)

        if boxes.shape[1]:
            if boxes.max() <= 1.01:  # if normalized with tolerance 0.01
                boxes[[0, 2]] *= w  # scale to pixels: normalized coords -> pixel coords within this image
                boxes[[1, 3]] *= h
            elif scale < 1:  # absolute coords need scale if image scales
                boxes *= scale
        boxes[[0, 2]] += x  # shift to the box's position inside the mosaic assembled from the batch
        boxes[[1, 3]] += y
        for j, box in enumerate(boxes.T.tolist()):
            cls = classes[j]  # boxes and classes share the same index j
            color = colors(cls)
            cls = names[cls] if names else cls
            if labels or conf[j] > 0.25:  # 0.25 conf thresh
                label = f'{cls}' if labels else f'{cls} {conf[j]:.1f}'
                annotator.box_label(box, label, color=color)  # draw the box and its label
annotator.im.save(fname)  # save
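
Distilled down to a single image (outside of val.py's mosaic), the same idea looks roughly like the sketch below. This is only an illustration using plain OpenCV drawing calls; draw_predictions, the colour and the threshold are made up here, and letterbox rescaling back to the original image size is deliberately ignored to keep the sketch short.

import cv2

# `det` is one element of the NMS output: an (n, 6) tensor [x1, y1, x2, y2, conf, cls]
# in the coordinates of the image that was fed to the model.
def draw_predictions(img, det, names, conf_thres=0.25):
    for *xyxy, conf, cls in det.tolist():
        if conf < conf_thres:
            continue
        p1, p2 = (int(xyxy[0]), int(xyxy[1])), (int(xyxy[2]), int(xyxy[3]))
        label = f'{names[int(cls)]} {conf:.2f}'
        cv2.rectangle(img, p1, p2, (0, 255, 0), 2)                                        # box
        cv2.putText(img, label, (p1[0], p1[1] - 4), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)  # class + conf
    return img

Calling cv2.imwrite('pred.jpg', draw_predictions(img, out[0], names)) would then save the annotated image, which is essentially what the official plotting code does through the Annotator class.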
