Nvidia GPU: fixing the "Failed to initialize NVML: Driver/library version mismatch" error (a fix without rebooting; the simple fix is to reboot)


This article records a fix for the error "Failed to initialize NVML: Driver/library version mismatch". (Programs that use the GPU may fail in the same situation with: RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW)

Reproducing the problem

    $ nvidia-smi
    Failed to initialize NVML: Driver/library version mismatch


Analysis

  • The version of the loaded NVIDIA kernel module does not match the driver libraries installed on the system

Check the version of the loaded NVIDIA kernel module

    $ cat /proc/driver/nvidia/version
    NVRM version: NVIDIA UNIX x86_64 Kernel Module 430.34 Wed Jun 26 12:19:48 CDT 2019
    GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)

  • The loaded kernel module is version 430.34. The GCC package string (5.4.0-6ubuntu1~16.04.12) indicates that it was built on Ubuntu 16.04
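
The kernel module version on the NVRM line can also be pulled out by a script, which is handy when comparing several machines. The following is a minimal sketch, assuming the NVRM line keeps the format shown above; the function name `nvrm_version` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any NVIDIA tool): print the kernel module
# version found in text formatted like /proc/driver/nvidia/version (stdin).
nvrm_version() {
    # Take the first dotted number that follows the words "Kernel Module".
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage on a machine where the driver is loaded:
#   nvrm_version < /proc/driver/nvidia/version
```

The function reads from standard input, so it can be tried on saved output as well as on the live /proc file.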

Check the package installation log

    $ grep nvidia /var/log/dpkg.log
    2021-03-30 14:04:55 install libnvidia-compute-460-server:amd64 <none> 460.32.03-0ubuntu0.18.04.2
    2021-03-30 14:04:55 status half-installed libnvidia-compute-460-server:amd64 460.32.03-0ubuntu0.18.04.2
    2021-03-30 14:04:57 status unpacked libnvidia-compute-460-server:amd64 460.32.03-0ubuntu0.18.04.2
    2021-03-30 14:04:57 status unpacked libnvidia-compute-460-server:amd64 460.32.03-0ubuntu0.18.04.2
    2021-03-30 14:05:15 install nvidia-cuda-dev:amd64 <none> 9.1.85-3ubuntu1
    2021-03-30 14:05:15 status half-installed nvidia-cuda-dev:amd64 9.1.85-3ubuntu1
    2021-03-30 14:05:34 status unpacked nvidia-cuda-dev:amd64 9.1.85-3ubuntu1
    2021-03-30 14:05:34 status unpacked nvidia-cuda-dev:amd64 9.1.85-3ubuntu1
    2021-03-30 14:05:34 install nvidia-cuda-doc:all <none> 9.1.85-3ubuntu1
    2021-03-30 14:05:34 status half-installed nvidia-cuda-doc:all 9.1.85-3ubuntu1
    2021-03-30 14:05:38 status unpacked nvidia-cuda-doc:all 9.1.85-3ubuntu1
    2021-03-30 14:05:38 status unpacked nvidia-cuda-doc:all 9.1.85-3ubuntu1
    2021-03-30 14:05:38 install nvidia-cuda-gdb:amd64 <none> 9.1.85-3ubuntu1
    2021-03-30 14:05:38 status half-installed nvidia-cuda-gdb:amd64 9.1.85-3ubuntu1
    2021-03-30 14:05:39 status unpacked nvidia-cuda-gdb:amd64 9.1.85-3ubuntu1
    2021-03-30 14:05:39 status unpacked nvidia-cuda-gdb:amd64 9.1.85-3ubuntu1
    2021-03-30 14:05:39 install nvidia-profiler:amd64 <none> 9.1.85-3ubuntu1
    2021-03-30 14:05:39 status half-installed nvidia-profiler:amd64 9.1.85-3ubuntu1


The log shows that on 2021-03-30 the 460.32.03 library package libnvidia-compute-460-server, built for Ubuntu 18.04, was installed.

List the installed driver packages

    $ dpkg --list | grep nvidia
    ii libnvidia-compute-460-server:amd64 460.32.03-0ubuntu0.18.04.2 amd64 NVIDIA libcompute package
    ii libnvidia-container-tools 1.0.5-1 amd64 NVIDIA container runtime library (command-line tools)
    ii libnvidia-container1:amd64 1.0.5-1 amd64 NVIDIA container runtime library
    ii nvidia-container-runtime 3.1.4-1 amd64 NVIDIA container runtime
    ii nvidia-container-toolkit 1.0.5-1 amd64 NVIDIA container runtime hook
    ii nvidia-cuda-dev 9.1.85-3ubuntu1 amd64 NVIDIA CUDA development files
    ii nvidia-cuda-doc 9.1.85-3ubuntu1 all NVIDIA CUDA and OpenCL documentation

  • The system has the nvidia 460 library package for Ubuntu 18.04 installed
  • The root cause is that the installed library (460.32.03, an Ubuntu 18.04 package) does not match the loaded kernel module (430.34, built on Ubuntu 16.04)
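
The comparison made by hand above can be sketched as a small script. It assumes the `dpkg --list` row format shown above and that the part of the package version before the first hyphen is the driver version; both helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the driver version of the first installed ("ii")
# libnvidia-compute* package found in `dpkg --list` text on stdin.
libcompute_version() {
    # Assumption: the upstream driver version is everything before the first "-".
    awk '$1 == "ii" && $2 ~ /^libnvidia-compute/ { sub(/-.*/, "", $3); print $3; exit }'
}

# Hypothetical helper: report whether the kernel module version ($1) and the
# user-space library version ($2) are identical.
compare_driver_versions() {
    local kernel_ver="$1" lib_ver="$2"
    if [ "$kernel_ver" = "$lib_ver" ]; then
        echo "match: $kernel_ver"
    else
        echo "mismatch: kernel module $kernel_ver, user-space library $lib_ver"
        return 1
    fi
}

# Usage with the values from this article:
#   compare_driver_versions 430.34 "$(dpkg --list | libcompute_version)"
```

With the versions from this article the second helper prints a mismatch line and returns a non-zero status, so it can be used in an `if` statement.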

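Solution

  • Uninstall the existing driver, then reinstall it

Uninstall the driver

    # nvidia-uninstall exists only if the driver was installed from a .run file
    $ sudo /usr/bin/nvidia-uninstall
    # quote the patterns so the shell does not expand them against local files
    $ sudo apt-get --purge remove 'nvidia-*'
    $ sudo apt-get purge 'nvidia*'
    $ sudo apt-get purge 'libnvidia*'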


Repeat until the following command prints nothing:

    $ dpkg --list | grep nvidia
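
To make the "prints nothing" check explicit, the remaining packages can be counted. This is a minimal sketch that reads `dpkg --list` text on standard input; the function name `remaining_nvidia_packages` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count rows of `dpkg --list` text (stdin) whose
# package name column contains "nvidia". Prints 0 when none are left.
remaining_nvidia_packages() {
    awk '$2 ~ /nvidia/ { n++ } END { print n + 0 }'
}

# Usage:
#   dpkg --list | remaining_nvidia_packages
```

A result of 0 corresponds to the empty grep output described above.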


Reinstall

    $ sudo chmod a+x NVIDIA-Linux-x86_64-450.80.02.run
    $ sudo ./NVIDIA-Linux-x86_64-450.80.02.run -no-x-check -no-nouveau-check -no-opengl-files

  • -no-opengl-files: install only the driver files, not the OpenGL files
  • -no-x-check: do not check for a running X server during installation
  • -no-nouveau-check: do not check for nouveau during installation

Check the result of the driver update

    $ nvidia-smi
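
The check can also be scripted. The sketch below assumes the 450.80.02 installer from this article was used and relies on `nvidia-smi --query-gpu=driver_version --format=csv,noheader`, which prints one version line per GPU; the function name `driver_version_is` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if stdin has at least one line and
# every line equals the expected driver version given as $1.
driver_version_is() {
    local expected="$1"
    local line seen=0
    while IFS= read -r line; do
        [ "$line" = "$expected" ] || return 1
        seen=1
    done
    [ "$seen" -eq 1 ]
}

# Usage:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader \
#       | driver_version_is 450.80.02 && echo "driver updated"
```

If the mismatch persists, nvidia-smi prints the error message instead of a version, so the helper fails.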

References

https://blog.csdn.net/qq_40200387/article/details/90341107

https://www.zywvvd.com/2020/12/03/linux/driver/nvidia-driver-install-linux/

[nvidia-smi] Fix for "Failed to initialize NVML: Driver/library version mismatch" (no reboot required), Wumbuk's blog, CSDN
