Installing the NVIDIA Driver on Debian 11 and Running It in Docker

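Preface

For computer-vision development I needed to set up the NVIDIA driver and expose the GPU to Docker containers. This post records the process and the problems encountered along the way.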

Hardware and software environment

   Static hostname: debian
         Icon name: computer-desktop
           Chassis: desktop
  Operating System: Debian GNU/Linux 11 (bullseye)
            Kernel: Linux 5.10.0-19-amd64
      Architecture: x86-64
     
  CPU: 12th Gen Intel(R) Core(TM) i7-12700F
  GPU: Nvidia Quadro M2000

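Driver

Downloading the driver

Look up the driver that matches your GPU model on the NVIDIA website. This machine uses driver version 470.161.03.
NVIDIA Driver Downloads; official Advanced Driver Search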
Table of the driver versions required by each CUDA release:
NVIDIA CUDA Toolkit Release Notes
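
Once the minimum driver version for a CUDA release has been read from that table, the comparison with the installed driver can be scripted. The following is only a sketch: the function name driver_at_least is hypothetical, it relies on GNU sort's -V version ordering, and the minimum version must be taken from the release notes rather than from this example.

```shell
# driver_at_least INSTALLED MINIMUM
# Succeeds (exit 0) when INSTALLED >= MINIMUM, comparing dotted version
# strings with GNU sort's version ordering (-V).
driver_at_least() {
    installed="$1"
    minimum="$2"
    # The smaller of the two versions sorts first; if that one is MINIMUM,
    # the installed driver is new enough.
    lowest=$(printf '%s\n%s\n' "$installed" "$minimum" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}

# Usage (replace the placeholder with the value from the release notes):
#   driver_at_least 470.161.03 "<minimum from table>" && echo "driver is new enough"
```

Version ordering matters here because a plain string comparison would rank 470.57 above 470.161.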

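Note!

  1. Installing directly with apt-get install nvidia-driver did not work (it failed with an error along the lines of "can not communicate with nvidia driver").
  2. The newest driver at the time, 525, did not work either (same kind of error).
  3. The X server and nouveau must be disabled during installation [1].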

Installation

Disabling nouveau

sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"

Reboot the computer after disabling it:

sudo reboot
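
After the reboot, it can be worth confirming that nouveau is really no longer loaded before going further. A minimal sketch, assuming the check is done on the output of lsmod; the function name nouveau_loaded is made up for this example.

```shell
# nouveau_loaded: reads `lsmod` output on stdin and succeeds if a module
# named exactly "nouveau" appears in the first column.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage after the reboot:
#   if lsmod | nouveau_loaded; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is disabled"
#   fi
```

If nouveau still shows up, the module may be loaded from the initramfs; regenerating it with sudo update-initramfs -u and rebooting again is a common remedy, although the original steps did not need it.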

Installing dependencies

Packages needed for the kernel module build that follows [2]:

sudo apt-get install gcc g++ cmake pkg-config libglvnd-dev
sudo apt-get install linux-headers-$(uname -r|sed 's/[^-]*-[^-]*-//')
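
The sed expression in the second command is easy to misread, so here is a small sketch of what it produces. The helper name kernel_flavour is hypothetical; the expression itself is copied from the command above.

```shell
# kernel_flavour RELEASE
# Applies the same sed expression as the install command: it deletes
# everything up to and including the second "-" in the release string.
kernel_flavour() {
    printf '%s\n' "$1" | sed 's/[^-]*-[^-]*-//'
}

# Usage: show which headers package the install command resolves to.
#   echo "linux-headers-$(kernel_flavour "$(uname -r)")"
```

For the kernel listed in the environment section, 5.10.0-19-amd64, this gives amd64, so the package installed is linux-headers-amd64. As far as I know that is a metapackage that follows the newest kernel for the architecture; if the running kernel is older, linux-headers-$(uname -r) is the exact-match alternative.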

Stopping the X server

sudo service gdm3 stop

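After this command the screen drops to a text console showing only a cursor. Press Ctrl + Alt + F1 or Ctrl + Alt + F2 to bring up a login prompt where you can enter your username and password.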

Making the installer executable and running it

chmod +x ~/Downloads/NVIDIA-Linux-x86_64-470.161.03.run

# Must be run with administrator privileges
sudo ~/Downloads/NVIDIA-Linux-x86_64-470.161.03.run
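
Since the installer is launched from a bare console, a quick pre-flight check beforehand can save a round trip. The sketch below is not part of the original procedure: the function name preflight is hypothetical, and the assumption that kernel headers are reachable through /lib/modules/$(uname -r)/build is mine.

```shell
# preflight INSTALLER HEADERS_DIR
# Prints one line per problem found and returns non-zero if any check fails.
preflight() {
    installer="$1"
    headers_dir="$2"
    status=0
    if [ ! -f "$installer" ]; then
        echo "installer not found: $installer"
        status=1
    elif [ ! -x "$installer" ]; then
        echo "installer is not executable: $installer"
        status=1
    fi
    if [ ! -d "$headers_dir" ]; then
        echo "kernel headers directory missing: $headers_dir"
        status=1
    fi
    return "$status"
}

# Usage (the headers path is an assumption, adjust if your layout differs):
#   preflight ~/Downloads/NVIDIA-Linux-x86_64-470.161.03.run "/lib/modules/$(uname -r)/build"
```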

When the installer prompts, choose the following options:

Are you sure you want to continue? ->                  CONTINUE INSTALLATION
Would you like to run the nvidia-xconfig utility? ->             YES

When the installation finishes, reboot the computer and delete the blacklist file created when disabling nouveau.

Checking the installation

nvidia-smi

# Output
Thu Mar  9 14:22:29 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro M2000        Off  | 00000000:01:00.0  On |                  N/A |
| 63%   59C    P0    38W /  75W |    769MiB /  4041MiB |     30%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1835      G   /usr/lib/xorg/Xorg                282MiB |
|    0   N/A  N/A      1982      G   /usr/bin/gnome-shell              110MiB |
|    0   N/A  N/A     30799      G   gnome-control-center               39MiB |
+-----------------------------------------------------------------------------+
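
To compare the installed driver against the chosen version in a script, the version can be pulled out of the banner shown above. This is a minimal sketch that parses the text on stdin; driver_version is a made-up helper name. nvidia-smi can also print the value directly with --query-gpu=driver_version --format=csv,noheader.

```shell
# driver_version: reads nvidia-smi's banner on stdin and prints the value
# that follows "Driver Version:".
driver_version() {
    sed -n 's/.*Driver Version: *\([0-9.][0-9.]*\).*/\1/p' | head -n 1
}

# Usage:
#   nvidia-smi | driver_version
```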


Docker configuration

Installing Docker

For installation, see the article "How to set up and use Docker".

Installing nvidia-container-runtime [3]

Command

nano nvidia-container-runtime-script.sh

Script contents

curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
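
The repository URL in the script depends on the value of $distribution, which is built from two fields of /etc/os-release. The sketch below isolates that step so it can be inspected on its own; distribution_id is a hypothetical helper and takes the file path as a parameter only to make it easy to try out.

```shell
# distribution_id FILE
# Sources an os-release style FILE in a subshell and prints ID followed by
# VERSION_ID, the same way the script does with /etc/os-release.
distribution_id() (
    . "$1"
    echo "$ID$VERSION_ID"
)

# Usage:
#   distribution_id /etc/os-release
```

On Debian 11, where os-release contains ID=debian and VERSION_ID="11", this gives debian11, so the list file is fetched from the debian11 directory of the repository.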

Running the script

sh nvidia-container-runtime-script.sh

Installing nvidia-container-runtime

sudo apt-get install nvidia-container-runtime
sudo systemctl restart docker # restart Docker

Verification

which nvidia-container-runtime-hook 
/usr/bin/nvidia-container-runtime-hook
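
The same check can be wrapped so that it reports every missing tool at once instead of printing nothing when which finds no match. A small sketch; require_cmd is a hypothetical helper built on the shell's command -v.

```shell
# require_cmd NAME...
# Returns non-zero and names each command that is not found on PATH.
require_cmd() {
    missing=0
    for name in "$@"; do
        if ! command -v "$name" >/dev/null 2>&1; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   require_cmd docker nvidia-container-runtime-hook && echo "all tools found"
```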

Verifying GPU access in Docker

docker pull nvidia/cuda:11.3.1-base-ubuntu20.04
docker run --gpus all --rm -it nvidia/cuda:11.3.1-base-ubuntu20.04 bash

# Inside the container, run:
nvidia-smi
# Output like the following means the GPU is accessible:
root@8a57ae3075d7:/# nvidia-smi
Thu Mar  9 06:42:20 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.161.03   Driver Version: 470.161.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Quadro M2000        Off  | 00000000:01:00.0  On |                  N/A |
| 62%   53C    P0    28W /  75W |    761MiB /  4041MiB |     34%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
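
The interactive check above can also be run as a one-shot command, which is convenient in scripts. A sketch under the assumption that the same image is used; gpu_smoke_test is a hypothetical wrapper and only looks at the exit status of nvidia-smi inside the container.

```shell
# gpu_smoke_test [IMAGE]
# Runs nvidia-smi in a throwaway container and reports whether it succeeded.
gpu_smoke_test() {
    image="${1:-nvidia/cuda:11.3.1-base-ubuntu20.04}"
    if docker run --gpus all --rm "$image" nvidia-smi >/dev/null 2>&1; then
        echo "GPU visible in container"
    else
        echo "GPU NOT visible in container" >&2
        return 1
    fi
}

# Usage:
#   gpu_smoke_test
#   gpu_smoke_test nvidia/cuda:11.3.1-base-ubuntu20.04
```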


Uninstall commands

To uninstall the installed driver, use [4]:

sudo apt-get --purge remove "*nvidia*"
sudo /usr/bin/nvidia-uninstall
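
The glob "*nvidia*" is broad, and it also matches the nvidia-container-runtime package installed earlier, so it may help to list what would be affected before purging. A minimal sketch that filters dpkg -l output; installed_nvidia_packages is a made-up name.

```shell
# installed_nvidia_packages: reads `dpkg -l` output on stdin and prints the
# names of installed packages (status "ii") whose name contains "nvidia".
installed_nvidia_packages() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Usage:
#   dpkg -l | installed_nvidia_packages
```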

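Summary

This post records some of the problems encountered when installing the NVIDIA driver on Debian 11 and using it from Docker. It was written after the fact, so some of the error handling along the way may be missing. If you run into problems, feel free to leave a comment.

Troubleshooting reference links

GPU driver error: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver.
Fix for the firmware error "Possible missing firmware"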


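  1. A one-stop, pitfall-avoiding guide to installing the NVIDIA driver on Debian (also applies to Ubuntu)

  2. Installing the Nvidia graphics driver from the command line on Debian 10.2: a review of the problems

  3. Using the GPU from Docker

  4. Uninstalling the Nvidia driver and installing the latest driver on Ubuntu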
