
Installing CUDA 10 on CentOS 7


Environment:

CentOS Linux release 7.9.2009 (Core)

NVIDIA: Tesla K80

I. Install the driver

1. Check the GPU model

lshw -numeric -C display
*-display
       description: 3D controller
       product: GK210GL [Tesla K80] [10DE:102D]
       vendor: NVIDIA Corporation [10DE]
       physical id: 0
       bus info: pci@0000:04:00.0
       logical name: /dev/fb0
       version: a1
       width: 64 bits
       clock: 33MHz


The output shows that the GPU is a Tesla K80.

2. Download the driver from the official site

Official Drivers | NVIDIA

Select the operating system and CUDA version.

3. Install dependencies

yum install kernel-devel gcc -y
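
The driver installer compiles a kernel module, so the kernel-devel package should match the running kernel. A minimal sketch, assuming the configured repositories still carry the kernel-devel build for the running kernel (this is not guaranteed for older CentOS 7 kernels); `kernel_devel_pkg` is a hypothetical helper name, not part of any tool:

```shell
# Hypothetical helper: build the versioned kernel-devel package name
# for a given kernel release (defaults to the running kernel).
kernel_devel_pkg() {
    release=${1:-$(uname -r)}
    printf 'kernel-devel-%s\n' "$release"
}

# Usage (as root): install the headers that match the running kernel.
# yum install -y "$(kernel_devel_pkg)" gcc
```

If yum cannot find that exact version, updating the kernel and rebooting so that kernel and kernel-devel agree is another option.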

4. Install the driver

chmod +x NVIDIA-Linux-x86_64-410.129-diagnostic.run
 
./NVIDIA-Linux-x86_64-410.129-diagnostic.run

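5. Troubleshooting

If the installer reports "unable to find the kernel source tree for the currently running kernel...", rerun it with the command below, replacing 3.10.0-1160.31.1.el7.x86_64 with the directory that exists under /usr/src/kernels on your own system.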

./NVIDIA-Linux-x86_64-410.129-diagnostic.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.31.1.el7.x86_64 -k $(uname -r)
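
Before passing --kernel-source-path, it can help to confirm that the directory actually exists for the running kernel. The following is a sketch only; `kernel_src_dir` is a hypothetical helper, and its second argument exists purely so the lookup can be exercised against a scratch directory:

```shell
# Hypothetical helper: print the kernel source directory for a release,
# or return non-zero if no such directory exists.
kernel_src_dir() {
    release=$1
    base=${2:-/usr/src/kernels}   # override only for testing
    if [ -d "$base/$release" ]; then
        printf '%s\n' "$base/$release"
    else
        echo "no kernel sources for $release under $base" >&2
        return 1
    fi
}

# Usage (as root):
# src=$(kernel_src_dir "$(uname -r)") &&
#   ./NVIDIA-Linux-x86_64-410.129-diagnostic.run --kernel-source-path="$src" -k "$(uname -r)"
```

When the helper fails, the directories under /usr/src/kernels belong to a different kernel version than the one reported by uname -r.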

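The installer then continues in an interactive interface; pressing Enter at each prompt is generally enough.

6. Verify the installation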

nvidia-smi
Wed Jun 23 20:19:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.129      Driver Version: 410.129      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   29C    P0    56W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:05:00.0 Off |                    0 |
| N/A   26C    P0    70W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:08:00.0 Off |                    0 |
| N/A   28C    P0    59W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:09:00.0 Off |                    0 |
| N/A   22C    P0    71W / 149W |      0MiB / 11441MiB |     80%      Default |
+-------------------------------+----------------------+----------------------+
 
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The NVIDIA driver is now installed.

Method 2: install via rpm
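
The same check can be scripted with nvidia-smi's CSV query mode instead of reading the table by eye. A minimal sketch; `driver_versions` is a hypothetical helper that reads "name, driver_version" lines on stdin:

```shell
# Hypothetical helper: print the distinct driver versions found in
# "name, driver_version" CSV lines read from stdin.
driver_versions() {
    awk -F', ' '{ print $2 }' | sort -u
}

# Only query the GPUs when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | driver_versions
fi
```

On the machine described here this would be expected to print a single version, since all four GPUs are served by the same driver.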

i) `rpm -i nvidia-driver-local-repo-rhel7-418.211.00-1.0-1.x86_64.rpm`
ii) `yum clean all`
iii) `yum install cuda-drivers`; this reports a missing dkms package, so install dkms manually from dkms-2.3-5.8.noarch.rpm
iv) `reboot`

II. Install CUDA

Download link: CUDA Toolkit Archive | NVIDIA Developer

Reference: "Centos7安装cuda10.1", Happy_wtg's blog on CSDN

Select and download the matching version.

Check the current runlevel: if it is 3, no change is needed; if it is 5, change it to 3.
To switch to runlevel 3, run systemctl set-default multi-user.target and reboot the machine; runlevel should then report 3.
# runlevel
N 3
# chmod +x cuda_10.0.130_410.48_linux.run
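
The runlevel check above can also be expressed in terms of the systemd default target. A small sketch; `target_to_runlevel` is a hypothetical helper that covers only the two targets discussed here:

```shell
# Hypothetical helper: map a systemd default target to the classic
# runlevel number used in the text.
target_to_runlevel() {
    case $1 in
        multi-user.target) echo 3 ;;
        graphical.target)  echo 5 ;;
        *)                 echo unknown ;;
    esac
}

# Usage (as root): switch only when the system boots into graphical mode.
# if [ "$(target_to_runlevel "$(systemctl get-default)")" = 5 ]; then
#     systemctl set-default multi-user.target   # takes effect after a reboot
# fi
```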

Installation

The installation is complete when the summary at the end of the output below appears.

# sh cuda_10.0.130_410.48_linux.run --no-opengl-libs
Do you accept the previously read EULA?
accept/decline/quit: accept
 
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: n   # the driver was already installed above
 
Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: y
 
Enter Toolkit Location
 [ default is /usr/local/cuda-10.0 ]:
 
Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y
 
Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y
 
Enter CUDA Samples Location
 [ default is /home/dd ]:
 
Installing the CUDA Toolkit in /usr/local/cuda-10.0 ...
Missing recommended library: libGLU.so
Missing recommended library: libX11.so
Missing recommended library: libXi.so
Missing recommended library: libXmu.so
 
Installing the CUDA Samples in /home/dd ...
Copying samples to /home/dd/NVIDIA_CUDA-10.0_Samples now...
Finished copying samples.
 
===========
= Summary =
===========
 
Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-10.0
Samples:  Installed in /home/dd, but missing recommended libraries
 
Please make sure that
 -   PATH includes /usr/local/cuda-10.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-10.0/lib64, or, add /usr/local/cuda-10.0/lib64 to /etc/ld.so.conf and run ldconfig as root
 
To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.0/bin
 
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.0/doc/pdf for detailed information on setting up CUDA.
 
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 10.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver
 
Logfile is /tmp/cuda_install_11398.log
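
The "Missing recommended library" lines in the output above concern libraries used when building the graphical samples. The sketch below turns those lines into package names; the mapping to CentOS 7 -devel packages is an assumption based on the usual package names, and `missing_lib_packages` is a hypothetical helper:

```shell
# Hypothetical helper: read installer output on stdin and print the
# package assumed to provide each "Missing recommended library" entry.
missing_lib_packages() {
    sed -n 's/^Missing recommended library: //p' | while read -r lib; do
        case $lib in
            libGLU.so) echo mesa-libGLU-devel ;;
            libX11.so) echo libX11-devel ;;
            libXi.so)  echo libXi-devel ;;
            libXmu.so) echo libXmu-devel ;;
            *)         echo "unmapped: $lib" >&2 ;;
        esac
    done
}

# Usage: paste the installer output into a file (installer-output.txt is a
# placeholder name), then as root:
# missing_lib_packages < installer-output.txt | xargs yum install -y
```

Samples such as deviceQuery do not use these libraries, so this step only matters for the graphics-related samples.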

Check the status with nvidia-smi:

Wed Jun 23 22:00:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.129      Driver Version: 410.129      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:04:00.0 Off |                    0 |
| N/A   31C    P0    56W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K80           Off  | 00000000:05:00.0 Off |                    0 |
| N/A   27C    P0    70W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla K80           Off  | 00000000:08:00.0 Off |                    0 |
| N/A   31C    P0    60W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla K80           Off  | 00000000:09:00.0 Off |                    0 |
| N/A   24C    P0    71W / 149W |      0MiB / 11441MiB |     78%      Default |
+-------------------------------+----------------------+----------------------+
 
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Configure environment variables

# vim /etc/profile
......
#cuda
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
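
Two details of the exports above are worth guarding against: sourcing the profile repeatedly adds the same directory again, and an unset LD_LIBRARY_PATH leaves a trailing colon, which the dynamic loader treats as the current directory. A sketch of one way to avoid both; `path_prepend` is a hypothetical helper:

```shell
# Hypothetical helper: prepend a directory to a colon-separated list
# unless it is already present; never emits an empty entry.
path_prepend() {
    dir=$1
    current=$2
    case ":$current:" in
        *":$dir:"*) printf '%s\n' "$current" ;;
        *)
            if [ -n "$current" ]; then
                printf '%s\n' "$dir:$current"
            else
                printf '%s\n' "$dir"
            fi
            ;;
    esac
}

PATH=$(path_prepend /usr/local/cuda/bin "$PATH")
LD_LIBRARY_PATH=$(path_prepend /usr/local/cuda/lib64 "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH
```

After editing /etc/profile, run source /etc/profile or log in again so the current shell picks up the new values.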

Test CUDA

Check that the cuda tools and the nvcc command are available
# cuda   (type cuda, then press Tab twice)
cudafe++                      cuda-gdb                      cuda-gdbserver                cuda-install-samples-10.0.sh  cuda-memcheck
 
# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
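
To confirm from a script that the toolkit on PATH is the expected release, the version line can be parsed. A minimal sketch; `nvcc_release` is a hypothetical helper that reads nvcc --version output on stdin:

```shell
# Hypothetical helper: extract the release number (e.g. 10.0) from
# the "Cuda compilation tools, release X.Y, ..." line on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Only call nvcc when it is actually on PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | nvcc_release
fi
```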


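Test with the CUDA samples; first locate the directory where the samples were installed.

The directory is shown in the CUDA installation summary above.

When the installer is run as root, the default location is under /root.

Here the installation was run as the user dd, so the samples are in /home/dd/NVIDIA_CUDA-10.0_Samples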

cd /home/dd/NVIDIA_CUDA-10.0_Samples
# ls
0_Simple  1_Utilities  2_Graphics  3_Imaging  4_Finance  5_Simulations  6_Advanced  7_CUDALibraries  bin  common  EULA.txt  Makefile

Building and running a few of them is enough, for example:

# cd 1_Utilities/deviceQuery
# make
# ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
 
Detected 4 CUDA Capable device(s)
 
Device 0: "Tesla K80"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    3.7
  Total amount of global memory:                 11441 MBytes (11996954624 bytes)
  (13) Multiprocessors, (192) CUDA Cores/MP:     2496 CUDA Cores
  GPU Max Clock rate:                            824 MHz (0.82 GHz)
  Memory Clock rate:                             2505 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 1572864 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
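
The device count reported by deviceQuery can be pulled out for a quick automated check. A sketch only; `detected_devices` is a hypothetical helper that reads deviceQuery output on stdin:

```shell
# Hypothetical helper: print the number from the
# "Detected N CUDA Capable device(s)" line on stdin.
detected_devices() {
    sed -n 's/^Detected \([0-9][0-9]*\) CUDA Capable device(s).*/\1/p'
}

# Usage, from the deviceQuery directory:
# ./deviceQuery | detected_devices
```

On this machine the result should agree with the four GPUs listed by nvidia-smi.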

The CUDA Toolkit is now installed.
