torch.cuda.empty_cache() causes RuntimeError: CUDA error: out of memory

Author: 小蓝xlanll | 2024-02-21 23:53:19

When training on two GPUs, if GPU0 is almost fully occupied and the code running on GPU1 contains

    torch.cuda.empty_cache()

the call fails with:

    torch._C._cuda_emptyCache()
    RuntimeError: CUDA error: out of memory.

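This happens because, even though training runs on GPU1, torch.cuda.empty_cache() acts on the current CUDA device, which defaults to GPU0. GPU0 is nearly full at that point and has no cached memory to release, so the call itself triggers the out-of-memory error. The fix is: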

    import torch
    # Make GPU1 the current device only for the duration of this call
    with torch.cuda.device('cuda:1'):
        torch.cuda.empty_cache()

This explicitly targets GPU1 when reclaiming cached memory, and the code then runs normally.
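
If the same pattern is needed in several places, it can be wrapped in a small helper. The following is only a minimal sketch: the function name `empty_cache_on` and its boolean return value are assumptions made for illustration and are not part of PyTorch. It only uses `torch.cuda.is_available()`, `torch.cuda.device_count()`, `torch.cuda.device()` and `torch.cuda.empty_cache()`.

```python
import torch


def empty_cache_on(device_index: int) -> bool:
    """Release cached CUDA memory on one specific GPU.

    Hypothetical helper (not a PyTorch API). Returns False when CUDA is
    unavailable or the index does not refer to a visible GPU, True otherwise.
    """
    if not torch.cuda.is_available():
        return False
    if not 0 <= device_index < torch.cuda.device_count():
        return False
    # Temporarily switch the current device so the call does not
    # fall back to the default device (GPU0).
    with torch.cuda.device(device_index):
        torch.cuda.empty_cache()
    return True
```

For the two-GPU case described above, the call would be `empty_cache_on(1)`. The previous current device is restored when the `with` block exits, so the rest of the training code is unaffected.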

References:
1. https://discuss.pytorch.org/t/out-of-memory-when-i-use-torch-cuda-empty-cache/57898
2. https://github.com/PyTorchLightning/pytorch-lightning/issues/3016
