
Resuming YOLOv8 training after an interruption (closing dataloader mosaic)

Three changes are needed:

Step 1:

In the `cfg` directory, set the `resume` parameter in the `default.yaml` configuration file to `True`.
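The relevant line of `default.yaml` then looks like this (the rest of the file is unchanged; the comment is illustrative):

```yaml
resume: True  # resume training from the last checkpoint instead of starting fresh
```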

Step 2:

In `engine/model.py`, comment out the line `self.model = self.trainer.model`:

```python
if not overrides.get('resume'):  # manually set model only if not resuming
    self.trainer.model = self.trainer.get_model(weights=self.model if self.ckpt else None, cfg=self.model.yaml)
    # self.model = self.trainer.model  # comment this line out when resuming after an interruption
self.trainer.hub_session = self.session  # attach optional HUB session
self.trainer.train()
```

Step 3:

Make the following two changes in `engine/trainer.py`.

1. In the `check_resume` method, change

`resume = self.args.resume`

to

`resume = r'G:\yolov8\ultralytics\yolo\v8\detect\runs\detect\train2\weights\last.pt'`

i.e. point `resume` at the last weights saved before the interruption (adjust the path to your own run).
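For intuition, `check_resume` accepts an explicit checkpoint path if it exists and otherwise falls back to the most recent run (Ultralytics' `get_latest_run` picks the newest `last*.pt` by modification time). A dependency-free sketch of that selection logic, using hypothetical helper names `latest_last_pt` and `resolve_resume_ckpt` that are not part of Ultralytics:

```python
from pathlib import Path


def latest_last_pt(search_dir: str = 'runs') -> str:
    """Return the most recently modified last*.pt under search_dir, or '' if none."""
    ckpts = list(Path(search_dir).rglob('last*.pt'))
    return str(max(ckpts, key=lambda p: p.stat().st_mtime)) if ckpts else ''


def resolve_resume_ckpt(resume, search_dir: str = 'runs') -> str:
    """Mimic check_resume: an explicit existing path wins, else fall back to the latest run."""
    if isinstance(resume, (str, Path)) and Path(resume).exists():
        return str(resume)
    return latest_last_pt(search_dir)
```

This is why hard-coding the path works: the existing file short-circuits the fallback, so the exact `train2` run is resumed even if newer runs exist.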

2. At the top of the `resume_training` method, add

`ckpt = torch.load(r'G:\yolov8\ultralytics\yolo\v8\detect\runs\detect\train2\weights\last.pt')`

which likewise loads the last weights saved before the interruption.
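A `last.pt` checkpoint is essentially a serialized dict, and `resume_training` below reads a handful of its keys. As a sketch of that round trip (using `pickle` in place of `torch.save`/`torch.load` so it runs without PyTorch; the field values are made up, only the key names match the code):

```python
import os
import pickle
import tempfile

# Minimal stand-in for a last.pt checkpoint: the keys resume_training reads.
ckpt = {
    'epoch': 49,             # last completed epoch (0-based)
    'best_fitness': 0.62,    # best fitness seen so far
    'optimizer': {'state': {}, 'param_groups': []},  # optimizer state_dict
    'ema': None,             # EMA weights (None here for brevity)
    'updates': 0,            # EMA update counter
}

# torch.save/torch.load are pickle-based, so plain pickle illustrates the idea.
path = os.path.join(tempfile.mkdtemp(), 'last.pt')
with open(path, 'wb') as f:
    pickle.dump(ckpt, f)
with open(path, 'rb') as f:
    loaded = pickle.load(f)

start_epoch = loaded['epoch'] + 1  # training resumes at epoch 50
```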

```python
def check_resume(self):
    resume = r'G:\yolov8\ultralytics\yolo\v8\detect\runs\detect\train2\weights\last.pt'  # load the last.pt from the interrupted run
    # resume = self.args.resume  # original line; replace with the line above when resuming after an interruption
    if resume:
        try:
            last = Path(
                check_file(resume) if isinstance(resume, (str,
                                                          Path)) and Path(resume).exists() else get_latest_run())
            self.args = get_cfg(attempt_load_weights(last).args)
            self.args.model, resume = str(last), True  # reinstate
        except Exception as e:
            raise FileNotFoundError('Resume checkpoint not found. Please pass a valid checkpoint to resume from, '
                                    "i.e. 'yolo train resume model=path/to/last.pt'") from e
    self.resume = resume

def resume_training(self, ckpt):
    ckpt = torch.load(r'G:\yolov8\ultralytics\yolo\v8\detect\runs\detect\train2\weights\last.pt')  # when restarting after an interruption, add this line to load the most recent last.pt
    if ckpt is None:
        return
    best_fitness = 0.0
    start_epoch = ckpt['epoch'] + 1
    if ckpt['optimizer'] is not None:
        self.optimizer.load_state_dict(ckpt['optimizer'])  # optimizer
        best_fitness = ckpt['best_fitness']
    if self.ema and ckpt.get('ema'):
        self.ema.ema.load_state_dict(ckpt['ema'].float().state_dict())  # EMA
        self.ema.updates = ckpt['updates']
    if self.resume:
        assert start_epoch > 0, \
            f'{self.args.model} training to {self.epochs} epochs is finished, nothing to resume.\n' \
            f"Start a new training without --resume, i.e. 'yolo task=... mode=train model={self.args.model}'"
        LOGGER.info(
            f'Resuming training from {self.args.model} from epoch {start_epoch + 1} to {self.epochs} total epochs')
    if self.epochs < start_epoch:
        LOGGER.info(
            f"{self.model} has been trained for {ckpt['epoch']} epochs. Fine-tuning for {self.epochs} more epochs.")
        self.epochs += ckpt['epoch']  # finetune additional epochs
    self.best_fitness = best_fitness
    self.start_epoch = start_epoch
    if start_epoch > (self.epochs - self.args.close_mosaic):
        LOGGER.info('Closing dataloader mosaic')
        if hasattr(self.train_loader.dataset, 'mosaic'):
            self.train_loader.dataset.mosaic = False
        if hasattr(self.train_loader.dataset, 'close_mosaic'):
            self.train_loader.dataset.close_mosaic(hyp=self.args)
```
