CentOS Linux release 7.9.2009 (Core)
NVIDIA: Tesla K80
1. Check the GPU model
lshw -numeric -C display
*-display
description: 3D controller
product: GK210GL [Tesla K80] [10DE:102D]
vendor: NVIDIA Corporation [10DE]
physical id: 0
bus info: pci@0000:04:00.0
logical name: /dev/fb0
version: a1
width: 64 bits
clock: 33MHz
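The output shows that the GPU is a Tesla K80.
2. Download the driver from the official NVIDIA site
Select your operating system and CUDA version.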
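The same check can be scripted. The following is a minimal sketch that assumes the lshw output format shown above; gpu_products is a hypothetical helper name, not part of lshw.

```shell
#!/bin/sh
# Hypothetical helper: read `lshw -numeric -C display` output on stdin and
# print the value of every "product:" field, one per line.
gpu_products() {
    sed -n 's/^[[:space:]]*product:[[:space:]]*//p'
}

# Example usage on a real machine (not executed here):
#   lshw -numeric -C display | gpu_products
```

On the machine above this would be expected to print the GK210GL [Tesla K80] product line once per display device.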
3. Install dependencies
yum install kernel-devel gcc -y
4. Install the driver
chmod +x NVIDIA-Linux-x86_64-410.129-diagnostic.run
./NVIDIA-Linux-x86_64-410.129-diagnostic.run
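5. Troubleshooting
If the installer reports "unable to find the kernel source tree for the currently running kernel...", the installed kernel-devel package usually does not match the running kernel. Rerun the installer with the command below, replacing /usr/src/kernels/3.10.0-1160.31.1.el7.x86_64 with the kernel source directory on your own machine.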
./NVIDIA-Linux-x86_64-410.129-diagnostic.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.31.1.el7.x86_64 -k $(uname -r)
Then follow the installer's interactive interface; pressing Enter at each prompt is generally enough.
6. Verify the installation
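Rather than hard-coding the kernel directory, the path can be derived from the running kernel. The sketch below assumes that kernel-devel installs its tree under /usr/src/kernels/<release>, as in the command above; kernel_source_path and has_matching_kernel_devel are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: print the kernel source path for a kernel release
# (defaults to the running kernel reported by `uname -r`).
kernel_source_path() {
    printf '/usr/src/kernels/%s\n' "${1:-$(uname -r)}"
}

# Hypothetical helper: succeed only if that directory exists, i.e. a
# kernel-devel package matching the given kernel release is installed.
has_matching_kernel_devel() {
    [ -d "$(kernel_source_path "${1:-}")" ]
}

# Example usage (not executed here):
#   if has_matching_kernel_devel; then
#       ./NVIDIA-Linux-x86_64-410.129-diagnostic.run \
#           --kernel-source-path="$(kernel_source_path)" -k "$(uname -r)"
#   else
#       yum install -y "kernel-devel-$(uname -r)"
#   fi
```

If the matching kernel-devel version is no longer available in the repository, updating the kernel and rebooting before installing the driver is another option.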
nvidia-smi
Wed Jun 23 20:19:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.129 Driver Version: 410.129 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 00000000:04:00.0 Off | 0 |
| N/A 29C P0 56W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K80 Off | 00000000:05:00.0 Off | 0 |
| N/A 26C P0 70W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla K80 Off | 00000000:08:00.0 Off | 0 |
| N/A 28C P0 59W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla K80 Off | 00000000:09:00.0 Off | 0 |
| N/A 22C P0 71W / 149W | 0MiB / 11441MiB | 80% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
At this point the NVIDIA driver installation is complete.
Method 2: install via rpm
i) `rpm -i nvidia-driver-local-repo-rhel7-418.211.00-1.0-1.x86_64.rpm`
ii) `yum clean all`
iii) `yum install cuda-drivers` This reports a missing dkms package; install it manually from dkms-2.3-5.8.noarch.rpm.
iv) `reboot`
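Download link: CUDA Toolkit Archive | NVIDIA Developer
Reference: "Centos7安装cuda10.1" (Installing CUDA 10.1 on CentOS 7), Happy_wtg's blog on CSDN
Choose the matching version to download.
Check the current runlevel. If it is 3, no change is needed; if it is 5, change it to 3.
To switch to runlevel 3, run systemctl set-default multi-user.target, reboot the machine, and run runlevel again; it should now report 3.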
# runlevel
N 3
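The runlevel decision can also be expressed as a small script. This is only a sketch, assuming the two-field runlevel output shown above (previous level, then current level); both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `runlevel` output such as "N 3" on stdin and
# print the second field, which is the current runlevel.
current_runlevel() {
    awk '{ print $2 }'
}

# Hypothetical helper: succeed if the given `runlevel` output shows the
# graphical runlevel 5, which has to be changed to 3 before installing.
needs_multi_user_target() {
    [ "$(printf '%s\n' "$1" | current_runlevel)" = "5" ]
}

# Example usage (not executed here):
#   if needs_multi_user_target "$(runlevel)"; then
#       systemctl set-default multi-user.target && reboot
#   fi
```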
# chmod +x cuda_10.0.130_410.48_linux.run
Install
When the summary in the following output appears, the installation is complete.
# sh cuda_10.0.130_410.48_linux.run --no-opengl-libs
Do you accept the previously read EULA?
accept/decline/quit: accept
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: n #the driver was already installed above
Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y
Enter Toolkit Location
[ default is /usr/local/cuda-10.0 ]:
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y
Enter CUDA Samples Location
[ default is /home/dd ]:
Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
Installing the CUDA Samples in /home/dd ...
Copying samples to /home/dd/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.
===========
= Summary =
===========
Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-10.0
Samples: Installed in /home/dd, but missing recommended libraries
Please make sure that
- PATH includes /usr/local/cuda-10.0/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
sudo <CudaInstaller>.run -silent -driver
Logfile is /tmp/cuda_install_11398.log
Check the running status: nvidia-smi
Wed Jun 23 22:00:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.129 Driver Version: 410.129 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 00000000:04:00.0 Off | 0 |
| N/A 31C P0 56W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K80 Off | 00000000:05:00.0 Off | 0 |
| N/A 27C P0 70W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 Tesla K80 Off | 00000000:08:00.0 Off | 0 |
| N/A 31C P0 60W / 149W | 0MiB / 11441MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 Tesla K80 Off | 00000000:09:00.0 Off | 0 |
| N/A 24C P0 71W / 149W | 0MiB / 11441MiB | 78% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
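The installer warning above states that CUDA 10.0 needs a driver of version at least 384.00. A minimal sketch of how that requirement could be checked in a script follows; it assumes GNU sort with the -V (version sort) option, and version_ge is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is greater than or equal to
# version $2, using the version ordering of GNU `sort -V`.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example usage (not executed here):
#   driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   if version_ge "$driver" 384.00; then
#       echo "driver $driver is new enough for CUDA 10.0"
#   fi
```

With the 410.129 driver shown in the nvidia-smi output, the check would pass.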
Configure environment variables
# vim /etc/profile
......
#cuda
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
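One caveat with the plain export lines: if LD_LIBRARY_PATH is empty, the value ends with a trailing colon, which the dynamic loader treats as the current directory, and sourcing the file repeatedly adds duplicate entries. The sketch below is one possible way to avoid both; prepend_dir is a hypothetical helper name, and the paths are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: print the colon-separated list $2 with directory $1
# prepended, unless $1 is already listed. No stray colon is added when the
# list is empty.
prepend_dir() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}

# cuda
export PATH="$(prepend_dir /usr/local/cuda/bin "$PATH")"
export LD_LIBRARY_PATH="$(prepend_dir /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}")"
```

After editing /etc/profile, run source /etc/profile or log in again for the change to take effect.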
Test CUDA
Check that the cuda and nvcc commands are available
# cuda ; press Tab twice
cudafe++ cuda-gdb cuda-gdbserver cuda-install-samples-10.0.sh cuda-memcheck
# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
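To compare the toolkit release with the CUDA version reported by nvidia-smi, the release number can be extracted from the nvcc output. This is a sketch that assumes the "release X.Y," wording shown above; nvcc_release is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `nvcc --version` output on stdin and print the
# toolkit release number taken from the "release X.Y," line.
nvcc_release() {
    sed -n 's/^.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*$/\1/p'
}

# Example usage (not executed here):
#   nvcc --version | nvcc_release
```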
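Test with the CUDA samples. First locate the directory where the samples were installed.
The directory is shown in the CUDA installer summary above.
The default location is under /root.
I ran the installer as the user dd, so here it is /home/dd/NVIDIA_CUDA-10.0_Samples.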
cd /home/dd/NVIDIA_CUDA-10.0_Samples
# ls
0_Simple 1_Utilities 2_Graphics 3_Imaging 4_Finance 5_Simulations 6_Advanced 7_CUDALibraries bin common EULA.txt Makefile
Testing a few of the samples is enough, for example:
# cd 1_Utilities/deviceQuery
# make
# ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 4 CUDA Capable device(s)
Device 0: "Tesla K80"
CUDA Driver Version / Runtime Version 10.0 / 10.0
CUDA Capability Major/Minor version number: 3.7
Total amount of global memory: 11441 MBytes (11996954624 bytes)
(13) Multiprocessors, (192) CUDA Cores/MP: 2496 CUDA Cores
GPU Max Clock rate: 824 MHz (0.82 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 1572864 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
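For an unattended check, the device count can be pulled out of the deviceQuery output and compared with the number of GPUs expected. The sketch below assumes the "Detected N CUDA Capable device(s)" line format shown above; detected_device_count is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read deviceQuery output on stdin and print the number
# from the "Detected N CUDA Capable device(s)" line.
detected_device_count() {
    sed -n 's/^[[:space:]]*Detected \([0-9][0-9]*\) CUDA Capable device(s).*$/\1/p'
}

# Example usage (not executed here):
#   found=$(./deviceQuery | detected_device_count)
#   expected=$(nvidia-smi -L | wc -l)
#   [ "$found" -eq "$expected" ] && echo "all GPUs are visible to CUDA"
```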
At this point the CUDA Toolkit installation is complete.