Error when training yolo_v2 on a GPU: RuntimeError: CUDA out of memory. Tried to allocate XXX MiB

Author: AllinToyou | 2024-02-16 00:53:09


Error message:
RuntimeError: CUDA out of memory. Tried to allocate 200.00 MiB (GPU 0; 8.00 GiB total capacity; 5.96 GiB already allocated; 0 bytes free; 6.18 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

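What the message means:
PyTorch tried to allocate 200.00 MiB on GPU 0 and failed. Of the card's 8.00 GiB total capacity, 5.96 GiB was already allocated and 0 bytes were free; PyTorch had reserved 6.18 GiB in total. The message also suggests that, if reserved memory is much larger than allocated memory, setting max_split_size_mb (through PYTORCH_CUDA_ALLOC_CONF) may help avoid fragmentation.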
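To check the "reserved >> allocated" hint, the numbers can be pulled straight out of the error text. The following is a minimal sketch, not part of the original post: the helper names `_find_mib` and `summarize_oom` are hypothetical, and the regular expressions assume the message layout quoted above, which may differ between PyTorch versions.

```python
import re

# Conversion factors to MiB for the units that appear in the message.
_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
_SIZE = r"([\d.]+) ([KMG]iB)"


def _find_mib(message, pattern):
    """Return the size matched by `pattern`, converted to MiB."""
    match = re.search(pattern, message)
    if match is None:
        raise ValueError(f"field not found: {pattern}")
    return float(match.group(1)) * _TO_MIB[match.group(2)]


def summarize_oom(message):
    """Extract requested, allocated and reserved sizes from an OOM message."""
    requested = _find_mib(message, r"Tried to allocate " + _SIZE)
    allocated = _find_mib(message, _SIZE + r" already allocated")
    reserved = _find_mib(message, _SIZE + r" reserved in total")
    return {
        "requested_mib": requested,
        "allocated_mib": allocated,
        "reserved_mib": reserved,
        # Memory held by PyTorch's caching allocator but not in use by tensors.
        "cached_but_unused_mib": reserved - allocated,
    }


if __name__ == "__main__":
    msg = (
        "RuntimeError: CUDA out of memory. Tried to allocate 200.00 MiB "
        "(GPU 0; 8.00 GiB total capacity; 5.96 GiB already allocated; "
        "0 bytes free; 6.18 GiB reserved in total by PyTorch)"
    )
    for name, value in summarize_oom(msg).items():
        print(f"{name}: {value:.2f}")
```

For the message in this post, reserved memory (6.18 GiB) is only slightly above allocated memory (5.96 GiB), so the card is probably just full rather than badly fragmented, and reducing memory use is the more promising fix.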

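Cause:
The GPU does not have enough memory. With a large batch_size, each training step processes more data at once, and every parameter of the model takes part in the computation for every sample. The memory needed for the data and the intermediate results therefore grows with the batch size until it exceeds what the card can hold.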
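As a rough illustration of how memory scales with batch size, the sketch below computes the size of the input batch alone. It is not taken from the original post: the function name `input_batch_mib` is hypothetical, and the 3 x 416 x 416 float32 input shape is an assumption (a common YOLOv2 setting), since the post does not state the image size. Real training needs far more than this, because every layer's activations and gradients also scale with the batch.

```python
def input_batch_mib(batch_size, channels=3, height=416, width=416, bytes_per_value=4):
    """Size in MiB of one input batch of shape (batch, channels, height, width).

    The default shape and float32 element size are assumptions for illustration.
    """
    return batch_size * channels * height * width * bytes_per_value / 2**20


if __name__ == "__main__":
    # Compare the two batch sizes used in this post.
    for batch_size in (64, 2):
        print(f"batch_size={batch_size}: {input_batch_mib(batch_size):.2f} MiB of input")
```

The relationship is linear: going from a batch size of 64 to 2 cuts this per-batch footprint by a factor of 32.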

Solution:
Use a smaller batch_size when training. For example, change:
python train.py --cuda --batch_size 64 --num_workers 0
to:
python train.py --cuda --batch_size 2 --num_workers 0
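Instead of guessing the new value by hand, the batch size can be halved automatically until one training step fits. This is a minimal sketch under stated assumptions and not the author's method: `find_fitting_batch_size` and `fake_step` are hypothetical names, and `fake_step` only simulates the out-of-memory error so the example runs without a GPU. In a real script, `run_step` would run one forward and backward pass, and calling `torch.cuda.empty_cache()` between attempts may help release cached memory.

```python
def find_fitting_batch_size(run_step, start=64, minimum=1):
    """Halve the batch size until run_step(batch_size) no longer runs out of memory."""
    batch_size = start
    while batch_size >= minimum:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            if "out of memory" not in str(err):
                raise  # unrelated error: do not hide it
            batch_size //= 2  # try again with half the batch
    raise RuntimeError(f"even batch_size={minimum} does not fit in memory")


def fake_step(batch_size):
    """Stand-in for one training step; pretends anything above 2 is too large."""
    if batch_size > 2:
        raise RuntimeError("CUDA out of memory. (simulated)")


if __name__ == "__main__":
    print(find_fitting_batch_size(fake_step))
```

A very small batch size can make training slower and noisier, so it is worth using the largest value that still fits.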
