PyTorch training reports that GPU memory is exhausted, even though plenty of GPU memory is actually still free:
CUDA out of memory. Tried to allocate 768.00 MiB (GPU 4; 44.37 GiB total capacity; 33.63 GiB already allocated; 560.56 MiB free; 42.34 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
This is caused by memory fragmentation: PyTorch's caching allocator is holding far more memory in reserve (42.34 GiB) than is actually allocated (33.63 GiB). Releasing the fragmented cached memory before training fixes it.
I run the following code before each batch's forward pass:
import os
import torch

# Limit how large cached blocks may be split, to curb fragmentation.
# Note: this variable is only read when the CUDA context is first created,
# so it should be set before the first CUDA call (ideally before import torch).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Return cached-but-unused blocks to the driver; repeated OOM errors occur
# because fragmented blocks are otherwise never fully released.
if hasattr(torch.cuda, 'empty_cache'):
    torch.cuda.empty_cache()
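For context, a self-contained sketch of the same idea follows. The helper name `release_fragmented_cache` is mine, not from the original post; the torch import is guarded so the script also runs on a machine without PyTorch or a GPU:

import os

# Set the allocator option before torch creates its CUDA context --
# changing it mid-training has no effect.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

try:
    import torch
except ImportError:  # torch may be absent in this environment
    torch = None

def release_fragmented_cache():
    """Return cached-but-unused GPU blocks to the driver.

    torch.cuda.empty_cache() does not free tensors the program still
    references; it only releases blocks the caching allocator holds in
    reserve, i.e. the gap between "reserved" and "allocated" in the
    OOM message above.
    """
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()

release_fragmented_cache()  # safe no-op on CPU-only machines

Calling this once per batch trades a little speed for headroom; the cache is rebuilt on the next allocation.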