1. Install Ubuntu 22.04 as usual, then run sudo apt update and sudo apt upgrade to bring all packages up to date.
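2. Install CUDA
Download the CUDA offline (local) installer from the address below, choosing the options that match your CPU architecture and system: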
https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local
The page then shows the following commands:
- wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-ubuntu2204.pin
- sudo mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
- wget https://developer.download.nvidia.com/compute/cuda/12.4.1/local_installers/cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
- sudo dpkg -i cuda-repo-ubuntu2204-12-4-local_12.4.1-550.54.15-1_amd64.deb
- sudo cp /var/cuda-repo-ubuntu2204-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
- sudo apt-get update
- sudo apt-get -y install cuda-toolkit-12-4
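The first wget fetches a small file, so it can be downloaded directly. The second fetches the installer package, roughly 3.6 GB, so a download accelerator such as Xunlei helps. Once both files are downloaded, run the commands above in order, then install the NVIDIA driver:
sudo apt-get install -y cuda-drivers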
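Because the local repository package is several gigabytes, an interrupted download is worth resuming rather than restarting. The following is only a sketch: the function names deb_name_from_url and fetch_deb are made up for illustration, and it assumes wget and dpkg-deb are available (both are on a stock Ubuntu install).

```shell
# Hypothetical helpers; the names are illustrative and not part of any NVIDIA tooling.

# Print the file name wget saves for a URL (the URL's last path component).
deb_name_from_url() {
    printf '%s\n' "${1##*/}"
}

# Download with resume support, then confirm the result is a readable .deb.
# The body runs in a subshell, so its variables do not leak into the caller.
fetch_deb() (
    url="$1"
    file="$(deb_name_from_url "$url")"
    wget -c "$url" || exit 1                       # -c continues a partial download
    dpkg-deb --info "$file" >/dev/null || exit 1   # fails on a truncated or corrupt package
    printf 'ready to install: %s\n' "$file"
)
```

Calling fetch_deb with the second wget URL from the list above, and only then running sudo dpkg -i on the file, avoids installing a half-downloaded package.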
3. Install cuDNN
Installing cuDNN is similar to installing CUDA. Open the address below and choose the options that match your system:
https://developer.nvidia.com/cudnn-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_local
- wget https://developer.download.nvidia.com/compute/cudnn/9.1.1/local_installers/cudnn-local-repo-ubuntu2204-9.1.1_1.0-1_amd64.deb
- sudo dpkg -i cudnn-local-repo-ubuntu2204-9.1.1_1.0-1_amd64.deb
- sudo cp /var/cudnn-local-repo-ubuntu2204-9.1.1/cudnn-*-keyring.gpg /usr/share/keyrings/
- sudo apt-get update
- sudo apt-get -y install cudnn
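A quick way to see which cuDNN release was installed, without compiling anything, is to read the version macros from the cuDNN version header. This is a sketch under assumptions: cudnn_version_from_header is a made-up name, the header is assumed to define CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL one per line, and the header path is passed in as an argument because its location depends on how cuDNN was installed.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCH from a cudnn_version.h-style header.
# $1 is the path to the header file.
cudnn_version_from_header() {
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$1"
}
```

For example, cudnn_version_from_header /usr/include/cudnn_version.h, where that path is an assumption about the deb install and may need adjusting.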
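4. After installation, configure the system environment variables to export the search paths for the CUDA and cuDNN headers and libraries. Edit /etc/profile, append the export lines below, then reload it: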
sudo vi /etc/profile
- export PATH=/usr/local/cuda/bin:$PATH
- export CPATH=${CPATH:+$CPATH:}/usr/include:/usr/local/cuda/include
- export LIBRARY_PATH=${LIBRARY_PATH:+$LIBRARY_PATH:}/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64
- export LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64
source /etc/profile
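Appending unconditionally means that every time /etc/profile is sourced again, the same directories are added once more. A minimal sketch of a guard follows; path_append is a hypothetical helper name, and it also avoids leaving an empty entry when the variable starts out unset (an empty entry in these lists is treated as the current directory).

```shell
# Hypothetical helper: print list "$1" with directory "$2" appended,
# unless "$2" is already one of the colon-separated entries.
path_append() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;       # already present: leave the list unchanged
        "::")     printf '%s\n' "$2" ;;       # list was empty: no stray colon
        *)        printf '%s\n' "$1:$2" ;;    # otherwise append
    esac
}

# Example use inside /etc/profile (commented out so this snippet has no side effects):
# LD_LIBRARY_PATH="$(path_append "$LD_LIBRARY_PATH" /usr/local/cuda/lib64)"
# export LD_LIBRARY_PATH
```

The same helper works for PATH, CPATH and LIBRARY_PATH, since all four are colon-separated lists.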
5. Verification
Run nvidia-smi to check that the GPU driver is installed correctly. If it is, the output looks similar to this:
- Sat May 11 15:27:56 2024
- +-----------------------------------------------------------------------------------------+
- | NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
- |-----------------------------------------+------------------------+----------------------+
- | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
- | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
- |                                         |                        |               MIG M. |
- |=========================================+========================+======================|
- |   0  NVIDIA GeForce RTX 9999 ...    Off |   00000000:01:00.0  On |                  N/A |
- | 30%   35C    P8             20W /  250W |      64MiB / 102400MiB |      0%      Default |
- |                                         |                        |                  N/A |
- +-----------------------------------------+------------------------+----------------------+
-
- +-----------------------------------------------------------------------------------------+
- | Processes:                                                                              |
- |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
- |        ID   ID                                                               Usage      |
- |=========================================================================================|
- |    0   N/A  N/A      1035      G   /usr/lib/xorg/Xorg                             54MiB |
- |    0   N/A  N/A      1106      G   /usr/bin/gnome-shell                            7MiB |
- +-----------------------------------------------------------------------------------------+
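The table is meant for reading, not for scripts. For an automated check, nvidia-smi can print individual fields with --query-gpu and --format=csv,noheader. The sketch below uses two made-up helper names, and the threshold of 550 used in the usage note is simply the driver branch shown in the output above, not an official requirement.

```shell
# Hypothetical helper: succeed if driver version "$1" has a major number >= "$2".
# The body runs in a subshell, so "major" does not leak into the caller.
driver_major_at_least() (
    major="${1%%.*}"          # 550.54.15 -> 550
    [ "$major" -ge "$2" ]
)

# Print the driver version reported for the first GPU (needs a working driver).
current_driver_version() {
    nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
}
```

A possible use is driver_major_at_least "$(current_driver_version)" 550 && echo "driver branch is new enough".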
If the installed driver has a problem, you may see a message like this instead:
- root@server:~# nvidia-smi
- Failed to initialize NVML: Driver/library version mismatch
- NVML library version: 550.54
In that case, the fix is:
- sudo apt-get remove --purge '^nvidia-.*'
- sudo rm -f /etc/modprobe.d/blacklist-nvidia.conf
- sudo rm -f /lib/modprobe.d/blacklist-nvidia.conf
- sudo apt-get update
- sudo apt-get install nvidia-driver-550
After installing this specific driver version, reboot the system and the problem should be gone.
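The mismatch message means that the kernel module currently loaded and the user-space NVIDIA library come from different driver versions, which commonly happens after a driver upgrade without a reboot. A rough way to look at both sides is sketched below. The helper names are hypothetical, and the parsing assumes that the "NVRM version" line in /proc/driver/nvidia/version contains the driver version as its first dotted number.

```shell
# Hypothetical helper: print the first dotted version number found on stdin.
first_dotted_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Version of the kernel module that is loaded right now (prints nothing if none is loaded).
loaded_module_version() {
    [ -r /proc/driver/nvidia/version ] || return 0
    grep 'NVRM version' /proc/driver/nvidia/version | first_dotted_version
}

# Version of the module installed on disk, which is what gets loaded after a reboot.
installed_module_version() {
    modinfo -F version nvidia 2>/dev/null
}
```

If loaded_module_version and installed_module_version print different values, a reboot is usually enough, and reinstalling the driver as described above is the heavier fallback.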
Run nvcc --version to check the CUDA version. Normal output looks like this:
- nvcc --version
- nvcc: NVIDIA (R) Cuda compiler driver
- Copyright (c) 2005-2024 NVIDIA Corporation
- Built on Thu_Mar_28_02:18:24_PDT_2024
- Cuda compilation tools, release 12.4, V12.4.131
- Build cuda_12.4.r12.4/compiler.34097967_0
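Note that the "CUDA Version" printed by nvidia-smi is the highest CUDA version the driver supports, while nvcc reports the toolkit that is actually installed, so the two can legitimately differ. To use the toolkit release in a script, a small filter like the following may help. The name nvcc_release is made up, and the pattern assumes the "release X.Y," wording shown in the output above.

```shell
# Hypothetical helper: read nvcc --version output on stdin and print the release number.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}
```

Typical use would be nvcc --version | nvcc_release.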
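The cuDNN version is less convenient to check. One way is to write a small program, for example a new file main.cpp with the following content:
- #include <cudnn.h>
- #include <iostream>
-
- int main() {
-     // cudnnGetVersion() returns the version of the linked cuDNN library as a single integer
-     std::cout << "cuDNN Version: " << cudnnGetVersion() << std::endl;
-     return 0;
- }
Compile it with:
g++ main.cpp -o cudnn_test -I/usr/include -I/usr/local/cuda/include -L/usr/lib/x86_64-linux-gnu -L/usr/local/cuda/lib64 -lcudnn -lcudart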
This should produce an executable named cudnn_test. Running it prints the cuDNN version:
- ./cudnn_test
- cuDNN Version: 90101
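The number 90101 packs the version into one integer. A small decoder makes it readable. This is a sketch with a made-up function name, and it assumes the cuDNN 9 encoding (MAJOR*10000 + MINOR*100 + PATCH) for values of 10000 and above, and the older encoding (MAJOR*1000 + MINOR*100 + PATCH) below that.

```shell
# Hypothetical helper: turn the integer from cudnnGetVersion() into MAJOR.MINOR.PATCH.
# The body runs in a subshell, so its variables do not leak into the caller.
decode_cudnn_version() (
    v="$1"
    if [ "$v" -ge 10000 ]; then
        major=$((v / 10000)); rest=$((v % 10000))   # cuDNN 9 style encoding
    else
        major=$((v / 1000));  rest=$((v % 1000))    # older encoding
    fi
    printf '%d.%d.%d\n' "$major" $((rest / 100)) $((rest % 100))
)
```

With the value above, decode_cudnn_version 90101 prints 9.1.1, which matches the cuDNN 9.1.1 package installed earlier.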
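6. Install GPU support for Docker
Install Docker itself by following the official Docker documentation; that part is skipped here. The steps below add GPU support so that Docker containers can use the host's GPU. Run the following commands in order:
- # Set up the stable repository and GPG key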
- distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
- curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
- curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
-
- # Install the nvidia-docker2 package and restart the Docker service
- sudo apt-get update
- sudo apt-get install -y nvidia-docker2
- sudo systemctl restart docker
Once that is installed, pull an official image to test access to the GPU:
docker pull nvidia/cuda:12.4.1-cudnn-runtime-ubuntu20.04
After the image has been pulled, run it:
docker run --gpus all nvidia/cuda:12.4.1-cudnn-runtime-ubuntu20.04 nvidia-smi
Normal output looks like this:
-
- ==========
- == CUDA ==
- ==========
-
- CUDA Version 12.4.1
-
- Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-
- This container image and its contents are governed by the NVIDIA Deep Learning Container License.
- By pulling and using the container, you accept the terms and conditions of this license:
- https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
-
- A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
-
- Sat May 11 07:46:25 2024
- +-----------------------------------------------------------------------------------------+
- | NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
- |-----------------------------------------+------------------------+----------------------+
- | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
- | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
- |                                         |                        |               MIG M. |
- |=========================================+========================+======================|
- |   0  NVIDIA GeForce RTX xxxx ...    Off |   00000000:01:00.0  On |                  N/A |
- | 30%   36C    P8             21W /  250W |        64MiB / xxxxMiB |      0%      Default |
- |                                         |                        |                  N/A |
- +-----------------------------------------+------------------------+----------------------+
-
- +-----------------------------------------------------------------------------------------+
- | Processes:                                                                              |
- |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
- |        ID   ID                                                               Usage      |
- |=========================================================================================|
- +-----------------------------------------------------------------------------------------+
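If the container cannot see the GPU, one thing worth checking is whether the Docker daemon picked up the NVIDIA runtime after the restart. The sketch below assumes that the nvidia-docker2 package registers a runtime named "nvidia" with the daemon and that docker info can print its runtimes as JSON. Both helper names are hypothetical.

```shell
# Hypothetical helper: succeed if the text on stdin mentions a runtime named "nvidia".
mentions_nvidia_runtime() {
    grep -q '"nvidia"'
}

# Ask the Docker daemon for its configured runtimes (needs access to the Docker socket).
docker_runtimes_json() {
    docker info --format '{{json .Runtimes}}'
}
```

A possible use is docker_runtimes_json | mentions_nvidia_runtime || echo "nvidia runtime not registered".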
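That completes the installation. One last note: this article has been reposted without permission, and I have even received private messages accusing me of copying it from someone else. It was first published at http://blog.csdn.net/peihexian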