
YOLOv5 rectangular training modifications: check_img_size

check_img_size

With thanks to the original author; the following is adapted from: http://t.csdn.cn/37m2w

Train.py

  1. Add separate train and val sizes, (480, 640) for each
# type=int only applies to values given on the command line; the nested default below is used as-is
parser.add_argument('--imgsz', '--img', '--img-size', type=int, default=[[480, 640], [480, 640]], help='train, val image size (pixels)')
parser.add_argument('--rect', action='store_true', default=True, help='rectangular training')
  2. Split into train/val sizes
# imgsz = check_img_size(opt.imgsz, gs, floor=gs * 2)  # verify imgsz is gs-multiple
if isinstance(opt.imgsz, int):
    imgsz_train = check_img_size(opt.imgsz, gs, floor=gs * 2)  # verify imgsz is gs-multiple
    imgsz_val = imgsz_train
else:
    imgsz_train = check_img_size(opt.imgsz[0], gs, floor=gs * 2)  # verify train size is gs-multiple
    imgsz_val = check_img_size(opt.imgsz[1], gs, floor=gs * 2)  # verify val size is gs-multiple
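
To make the size check concrete, here is a minimal standalone sketch of what "verify imgsz is gs-multiple" amounts to for an int or a [height, width] list. The names check_img_size_sketch and make_divisible are stand-ins written for this illustration; they are assumed to behave like the YOLOv5 helpers but are not copied from the library.

```python
import math


def make_divisible(x, divisor):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def check_img_size_sketch(imgsz, s=32, floor=0):
    """Raise each dimension to a multiple of stride s, and to at least floor."""
    if isinstance(imgsz, int):
        return max(make_divisible(imgsz, s), floor)
    return [max(make_divisible(x, s), floor) for x in imgsz]


print(check_img_size_sketch(640, s=32, floor=64))         # 640
print(check_img_size_sketch([480, 640], s=32, floor=64))  # [480, 640]
print(check_img_size_sketch([470, 650], s=32, floor=64))  # [480, 672]
```

Sizes that are already multiples of the stride pass through unchanged, which is why (480, 640) is a convenient choice with gs = 32.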
  3. create_dataloader
    Both train and val call create_dataloader; the arguments passed to it need to change:
train_loader, dataset = create_dataloader(train_path,
                                              imgsz_train,
                                              batch_size // WORLD_SIZE,
                                              gs,
                                              single_cls,
                                              hyp=hyp,
                                              augment=True,
                                              cache=None if opt.cache == 'val' else opt.cache,
                                              rect=opt.rect,
                                              rank=LOCAL_RANK,
                                              workers=workers,
                                              image_weights=opt.image_weights,
                                              quad=opt.quad,
                                              prefix=colorstr('train: '),
                                              shuffle=True,
                                              seed=opt.seed)
val_loader = create_dataloader(val_path,
                                       imgsz_val,
                                       batch_size // WORLD_SIZE * 2,
                                       gs,
                                       single_cls,
                                       hyp=hyp,
                                       cache=None if noval else opt.cache,
                                       rect=True,
                                       rank=-1,
                                       workers=workers * 2,
                                       pad=0.5,
                                       prefix=colorstr('val: '))[0]
if not resume:
    if not opt.noautoanchor:
        check_anchors(dataset, model=model, thr=hyp['anchor_t'], imgsz=imgsz_train)  # run AutoAnchor
    model.half().float()  # pre-reduce anchor precision
if opt.multi_scale:
    base = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)  # longest configured side
    sz = random.randrange(int(base * 0.5), int(base * 1.5) + gs) // gs * gs  # size
    sf = sz / max(imgs.shape[2:])  # scale factor
    if sf != 1:
        ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]]  # new shape (stretched to gs-multiple)
        imgs = nn.functional.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)
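
The size arithmetic in the multi-scale step can be tried without tensors. The sketch below is a hypothetical helper (sample_multiscale_shape is not part of YOLOv5) that reproduces only the shape computation, assuming the batch shape is given as (height, width) and that imgsz_train may be an int or a [height, width] list.

```python
import math
import random


def sample_multiscale_shape(imgsz_train, batch_hw, gs=32):
    """Pick a random gs-multiple size and return the new (h, w) for the batch."""
    # Use the longest configured side as the reference size.
    base = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)
    sz = random.randrange(int(base * 0.5), int(base * 1.5) + gs) // gs * gs
    sf = sz / max(batch_hw)  # scale factor relative to the longest batch side
    if sf == 1:
        return list(batch_hw)
    # Stretch each side and round up to a multiple of gs.
    return [math.ceil(x * sf / gs) * gs for x in batch_hw]


random.seed(0)
print(sample_multiscale_shape([480, 640], (480, 640)))
print(sample_multiscale_shape(640, (640, 640)))
```

Because each side is rounded up to a gs-multiple independently, the aspect ratio of the rescaled batch is only approximately that of the original.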
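  4. Other changes

This adjusts the objectness loss gain. The larger imgsz_train is, the larger (max(imgsz_train) / 640) ** 2 becomes; the more detection layers (nl) there are, the smaller 3 / nl becomes. The two factors offset each other.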

if isinstance(imgsz_train, int):
    hyp['obj'] *= (imgsz_train / 640) ** 2 * 3 / nl  # scale to image size and layers
else:
    hyp['obj'] *= (max(imgsz_train) / 640) ** 2 * 3 / nl  # scale to image size and layers
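
As a quick numeric check of how the two factors trade off, the following sketch evaluates the multiplier on its own. obj_gain_scale is a hypothetical helper name used only here; the formula is the one in the snippet above.

```python
def obj_gain_scale(imgsz_train, nl):
    """Multiplier applied to hyp['obj'] for a given train size and layer count."""
    side = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)
    return (side / 640) ** 2 * 3 / nl


print(obj_gain_scale(640, 3))         # 1.0
print(obj_gain_scale([480, 640], 3))  # 1.0
print(obj_gain_scale(1280, 4))        # 3.0
```

With a 480x640 input and three detection layers the gain is unchanged, because only the longer side enters the formula.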
LOGGER.info(f'Image sizes {imgsz_train} train, {imgsz_val} val\n'
                f'Using {train_loader.num_workers * WORLD_SIZE} dataloader workers\n'
                f"Logging results to {colorstr('bold', save_dir)}\n"
                f'Starting training for {epochs} epochs...')
# Multi-scale
if opt.multi_scale:
    base = imgsz_train if isinstance(imgsz_train, int) else max(imgsz_train)  # imgsz_train may be [h, w]
    sz = random.randrange(int(base * 0.5), int(base * 1.5) + gs) // gs * gs  # size
    sf = sz / max(imgs.shape[2:])  # scale factor
    if sf != 1:
        ns = [math.ceil(x * sf / gs) * gs for x in imgs.shape[2:]]  # new shape (stretched to gs-multiple)
        imgs = nn.functional.interpolate(imgs, size=ns, mode='bilinear', align_corners=False)
results, maps, _ = validate.run(data_dict,
			                     batch_size=batch_size // WORLD_SIZE * 2,
			                     imgsz=imgsz_val,
			                     half=amp,
			                     model=ema.ema,
			                     single_cls=single_cls,
			                     dataloader=val_loader,
			                     save_dir=save_dir,
			                     plots=False,
			                     callbacks=callbacks,
			                     compute_loss=compute_loss)
results, _, _ = validate.run(
                        data_dict,
                        batch_size=batch_size // WORLD_SIZE * 2,
                        imgsz=imgsz_val,
                        model=attempt_load(f, device).half(),
                        iou_thres=0.65 if is_coco else 0.60,  # best pycocotools at iou 0.65
                        single_cls=single_cls,
                        dataloader=val_loader,
                        save_dir=save_dir,
                        save_json=is_coco,
                        verbose=True,
                        plots=plots,
                        callbacks=callbacks,
                        compute_loss=compute_loss)  # val best model with plots

dataloaders.py

class: LoadImagesAndLabels()

  1. [mosaic] Handle an image size given as a single value or as several values
    Comment out self.mosaic = self.augment and not self.rect, so that mosaic is still applied even with rect.
# self.mosaic = self.augment and not self.rect
self.mosaic = self.augment
if isinstance(img_size, int):
    self.mosaic_border = [-img_size // 2, -img_size // 2]
else:
    self.mosaic_border = [-img_size[0] // 2, -img_size[1] // 2]  # height, width
  2. def load_image()

If the commented-out lines are enabled, the image is already resized here, so letterbox receives an enlarged image rather than the original.

if isinstance(self.img_size, int):
    r = self.img_size / max(h0, w0)  # ratio
    if r != 1:  # if sizes are not equal
        interp = cv2.INTER_LINEAR if (self.augment or r > 1) else cv2.INTER_AREA
        im = cv2.resize(im, (math.ceil(w0 * r), math.ceil(h0 * r)), interpolation=interp)
else:  # img_size is [height, width]
    rh, rw = self.img_size[0] / max(h0, w0), self.img_size[1] / max(h0, w0)
    # if rh != 1 or rw != 1:
    #     interp = cv2.INTER_LINEAR if (self.augment or rh > 1 or rw > 1) else cv2.INTER_AREA
    #     im = cv2.resize(im, (math.ceil(w0 * rw), math.ceil(h0 * rh)), interpolation=interp)
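  • With rh, rw = self.img_size[0] / max(h0, w0), self.img_size[1] / max(h0, w0), both axes are scaled relative to the longest side; the source describes this as proportional enlargement, although rh and rw are only equal when img_size[0] == img_size[1]
  • With rh, rw = self.img_size[0] / h0, self.img_size[1] / w0, this is equivalent to resizing directly to a height x width of 480x640, which distorts the image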
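
The two ratio choices can be compared on a concrete image size. This is a minimal sketch, assuming img_size is (height, width); resized_hw is a hypothetical helper, and it folds the ratio into a single division per axis to avoid float rounding, which differs slightly from the order of operations in the snippet above.

```python
import math


def resized_hw(h0, w0, img_size, per_axis=False):
    """Return (height, width) after the resize that the commented-out code would apply."""
    if per_axis:
        # rh = img_size[0] / h0, rw = img_size[1] / w0
        den_h, den_w = h0, w0
    else:
        # rh = img_size[0] / max(h0, w0), rw = img_size[1] / max(h0, w0)
        den_h = den_w = max(h0, w0)
    return math.ceil(h0 * img_size[0] / den_h), math.ceil(w0 * img_size[1] / den_w)


print(resized_hw(1107, 1222, (480, 640)))                 # (435, 640)
print(resized_hw(1107, 1222, (480, 640), per_axis=True))  # (480, 640)
```

For a 1107x1222 image, the longest-side variant yields 435x640 and the per-axis variant yields exactly 480x640; neither keeps the original aspect ratio when the target is not square.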

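Depending on your needs: if you want the original image rather than an enlarged one, keep those lines commented out as shown above.
Then, from load_image to letterbox, load_image leaves the image size unchanged and keeps the original dimensions.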

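After letterbox:

  • If the image is too large, it is scaled down proportionally

(padding is then needed to reach the target size).

Original image: 1107*1222 (height*width)
Target size: 480*672 (height*width)
ratio: 0.43
New size: 530*480 (width*height, resize)

  • If the image is too small

    Original image: 122*128
    Target size: 480*672
    ratio: 3.93 (scale-up is disabled, so ratio = 1)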
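
The two cases above can be reproduced with a small calculation. The sketch below is a simplified model of the letterbox arithmetic only (ratio, resized size, padding left to fill); letterbox_plan is a hypothetical helper, and it assumes shapes are (height, width) and ignores any stride-aligned padding options of the real function.

```python
def letterbox_plan(h0, w0, target_hw, scaleup=True):
    """Return (ratio, (new_w, new_h), (pad_w, pad_h)) for fitting an image into target_hw."""
    r = min(target_hw[0] / h0, target_hw[1] / w0)  # keep aspect ratio
    if not scaleup:
        r = min(r, 1.0)  # only shrink, never enlarge
    new_w, new_h = round(w0 * r), round(h0 * r)
    pad_w, pad_h = target_hw[1] - new_w, target_hw[0] - new_h
    return r, (new_w, new_h), (pad_w, pad_h)


# Large image: shrunk to 530x480 (width x height), 142 px of width left to pad.
print(letterbox_plan(1107, 1222, (480, 672)))
# Small image with scale-up disabled: ratio stays 1, the rest is padding.
print(letterbox_plan(122, 128, (480, 672), scaleup=False))
```

In the second case the uncapped ratio would be min(480/122, 672/128), about 3.93, which is the value quoted above before it is clamped to 1.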

If you choose the starred option, letterbox needs almost no padding, because the previous step has already brought the image close to 640x480.

  3. load_mosaic()
s = self.img_size
if isinstance(self.img_size, int):
    yc, xc = (int(random.uniform(-x, 2 * s + x)) for x in self.mosaic_border)  # mosaic center y, x
else:
    s_h, s_w = s[0], s[1]
    # pair each border with its own dimension: (height border, height), (width border, width)
    yc, xc = [int(random.uniform(-x, 2 * dim + x)) for x, dim in zip(self.mosaic_border, self.img_size)]
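
To see which range the mosaic center falls in for a rectangular size, the border and center sampling can be run in isolation. This is a sketch with hypothetical helper names (mosaic_border, sample_mosaic_center), assuming img_size is either an int or [height, width].

```python
import random


def mosaic_border(img_size):
    """Negative half-size per axis, as [height border, width border]."""
    if isinstance(img_size, int):
        return [-img_size // 2, -img_size // 2]
    return [-img_size[0] // 2, -img_size[1] // 2]


def sample_mosaic_center(img_size):
    """Sample (yc, xc) inside the central region of the 2h x 2w mosaic canvas."""
    border = mosaic_border(img_size)
    sizes = [img_size, img_size] if isinstance(img_size, int) else list(img_size)
    yc, xc = (int(random.uniform(-b, 2 * s + b)) for b, s in zip(border, sizes))
    return yc, xc


random.seed(0)
print(mosaic_border([480, 640]))         # [-240, -320]
print(sample_mosaic_center([480, 640]))  # yc in [240, 720], xc in [320, 960]
```

For 480x640 the center is drawn from the middle half of a 960x1280 canvas, separately per axis.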

Clip the label coordinates to the range from 0 to the maximum:

# Concat/clip labels
labels4 = np.concatenate(labels4, 0)
for x in (labels4[:, 1:], *segments4):
    if isinstance(s, int):
        np.clip(x, 0, 2 * s, out=x)  # clip when using random_perspective()
    else:
        np.clip(x, 0, 2 * max(s), out=x)  # clip when using random_perspective()

self.rect

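Earlier, the height x width was set to 480x640.
If self.rect == True here, the loader takes the stride and the padding into account and expands outward from 480x640, giving a tolerant per-batch size of 512x672.
On entering letterbox, the target size then becomes 512x672.

If self.rect == False here, the target size on entering letterbox remains 480x640.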
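
The 512x672 figure can be reproduced with a short calculation. The sketch below assumes the rect batch shape is computed as ceil(aspect * img_size / stride + pad) * stride per axis, with stride 32 and pad 0.5 as passed to the val loader; rect_target_shape and its aspect parameter are hypothetical names for this illustration.

```python
import math


def rect_target_shape(img_size, aspect=(1.0, 1.0), stride=32, pad=0.5):
    """Per-batch (height, width) after rounding up to a stride multiple with padding."""
    return [math.ceil(a * s / stride + pad) * stride for a, s in zip(aspect, img_size)]


print(rect_target_shape((480, 640)))           # [512, 672]
print(rect_target_shape((480, 640), pad=0.0))  # [480, 640]
```

With pad=0.5 each axis gains one extra stride step (480 to 512, 640 to 672); with pad=0 the configured size is kept because it is already a multiple of 32.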

  • In train.py:
val_loader = create_dataloader(val_path,
                                       imgsz_val,
                                       batch_size // WORLD_SIZE * 2,
                                       gs,
                                       single_cls,
                                       hyp=hyp,
                                       cache=None if noval else opt.cache,
                                       rect=opt.rect,
                                       rank=-1,
                                       workers=workers * 2,
                                       pad=0.5,
                                       prefix=colorstr('val: '))[0]