While running pip install flash_attn to install an inference-acceleration library, I hit the following error:
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting flash_attn
  Downloading https://mirrors.aliyun.com/pypi/packages/72/94/06f618bb338ec7203b48ac542e73087362b7750f9c568b13d213a3f181bb/flash_attn-2.5.8.tar.gz (2.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 1.6 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      /tmp/pip-install-fg7pt8f4/flash-attn_1e4c76d3ba9f4a5d968930613e3c4bd7/setup.py:78: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
        warnings.warn(
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-install-fg7pt8f4/flash-attn_1e4c76d3ba9f4a5d968930613e3c4bd7/setup.py", line 134, in <module>
          CUDAExtension(
        File "/usr/local/program/miniconda3/envs/llama3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1077, in CUDAExtension
          library_dirs += library_paths(cuda=True)
        File "/usr/local/program/miniconda3/envs/llama3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1204, in library_paths
          if (not os.path.exists(_join_cuda_home(lib_dir)) and
        File "/usr/local/program/miniconda3/envs/llama3/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2419, in _join_cuda_home
          raise OSError('CUDA_HOME environment variable is not set. '
      OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
      torch.__version__ = 2.3.0+cu121
      [end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
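The operating system already has the NVIDIA driver installed, and the driver reports a CUDA version, which can be checked with the nvidia-smi command.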
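Note:
This confused me at first: the GPU already reports CUDA, so why does the error still ask for CUDA?
The reason:
CUDA exposes two main APIs, the driver API and the runtime API. The CUDA that nvidia-smi reports belongs to the driver: it corresponds to the driver API, and the version shown is the highest CUDA version that driver supports. The error is asking for the other half. Building flash_attn from source needs the CUDA Toolkit, which provides the runtime API libraries, the headers, and the nvcc compiler, and CUDA_HOME must point at the Toolkit's install root. The driver alone provides none of these, so the CUDA Toolkit has to be installed separately.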
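To make the driver/Toolkit distinction concrete, here is a minimal sketch that reports which side is present on a machine. The helper name report_tool is my own, not part of any NVIDIA tooling; it only looks commands up on PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a command is on PATH.
# Prints "<name>: <path>" when found, "<name>: not found" otherwise.
report_tool() {
  local tool_path
  if tool_path="$(command -v "$1" 2>/dev/null)"; then
    printf '%s: %s\n' "$1" "$tool_path"
  else
    printf '%s: not found\n' "$1"
  fi
}

report_tool nvidia-smi   # installed together with the driver
report_tool nvcc         # installed together with the CUDA Toolkit
printf 'CUDA_HOME: %s\n' "${CUDA_HOME:-not set}"
```

On the machine described here, I would expect nvidia-smi to be found while nvcc is missing and CUDA_HOME is not set, which matches the two complaints in the pip log.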
Fixing the error:
The CUDA Toolkit is usually installed under /usr/local/. Checking that path confirmed that no CUDA Toolkit directory exists there.
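The directory check can be sketched as a small function. The name list_cuda_dirs is an assumption of mine, and it relies on the convention that Toolkit directories are named cuda or cuda-<version>; a Toolkit installed elsewhere would not be found this way.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list CUDA Toolkit-style directories
# (cuda, cuda-12.2, ...) under a prefix, one per line.
list_cuda_dirs() {
  local prefix="${1:-/usr/local}"
  local dir
  for dir in "$prefix"/cuda*; do
    # The glob stays literal when nothing matches, so check it is a directory.
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
    fi
  done
  return 0
}

if [ -z "$(list_cuda_dirs /usr/local)" ]; then
  echo "no CUDA Toolkit directory under /usr/local"
fi
```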
Since the CUDA Toolkit is not installed, the direct fix is to install it.
The CUDA Toolkit is the CUDA development kit; "installing CUDA" in practice means installing the CUDA Toolkit.
Go to https://developer.nvidia.com/cuda-toolkit-archive and pick the CUDA version you need.
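For compatibility, run nvidia-smi to check the GPU driver version and the CUDA version it reports. The driver reports CUDA 12.2 here, which is the highest version it supports, so I chose to download CUDA Toolkit 12.2.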
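The version check can be sketched in shell as follows. Both function names are mine; the parsing assumes the nvidia-smi header contains the text "CUDA Version: X.Y", and the comparison assumes GNU sort with the -V option is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract X.Y from "CUDA Version: X.Y" in
# nvidia-smi header text read from stdin.
driver_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed when version $1 <= version $2.
# Assumes GNU sort, which supports version ordering via -V.
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Only runs on a machine where the driver is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  max_cuda="$(nvidia-smi | driver_cuda_version)"
  if version_le 12.2 "$max_cuda"; then
    echo "CUDA Toolkit 12.2 is within what this driver supports ($max_cuda)"
  fi
fi
```

The idea is simply that the Toolkit version should not exceed the version nvidia-smi reports; picking the same version, as done here, is the simplest way to satisfy that.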
On the download page, select: Linux, x86_64, Ubuntu, version 22.04, runfile (local).
The bottom of the page then shows the installation commands:
wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
sudo sh cuda_12.2.0_535.54.03_linux.run
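Select Continue and press Enter.
Type accept to accept the license agreement.
Because the driver is already installed, deselect the Driver component (it is checked [X] by default), then select Install to start the installation.
The following summary indicates that the installation succeeded: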
===========
= Summary =
===========
Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-12.2/
Please make sure that
 -   PATH includes /usr/local/cuda-12.2/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-12.2/lib64, or, add /usr/local/cuda-12.2/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-12.2/bin
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 535.00 is required for CUDA 12.2 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver
Logfile is /var/log/cuda-installer.log
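Edit ~/.bashrc (vim ~/.bashrc) and set the environment variables, following the official documentation: Environment Setup. The paths below use /usr/local/cuda, which the installer normally creates as a symbolic link to /usr/local/cuda-12.2.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64
export PATH=$PATH:/usr/local/cuda/bin
# CUDA_HOME is a single directory, not a colon-separated list
export CUDA_HOME=/usr/local/cuda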
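One thing to watch with these exports: every time ~/.bashrc is sourced again, PATH and LD_LIBRARY_PATH grow by another copy of the same directory, and an empty LD_LIBRARY_PATH ends up with a leading colon. A minimal sketch of a duplicate-free variant follows; append_unique is a helper name I made up, and /usr/local/cuda is assumed to be the Toolkit location as above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a directory to a colon-separated list
# unless it is already there, and print the resulting list.
append_unique() {
  local list="$1" dir="$2"
  case ":$list:" in
    *":$dir:"*)
      printf '%s\n' "$list" ;;          # already present, unchanged
    *)
      if [ -n "$list" ]; then
        printf '%s:%s\n' "$list" "$dir"
      else
        printf '%s\n' "$dir"            # avoid a leading colon
      fi ;;
  esac
}

export CUDA_HOME=/usr/local/cuda
PATH="$(append_unique "$PATH" "$CUDA_HOME/bin")"
LD_LIBRARY_PATH="$(append_unique "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")"
export PATH LD_LIBRARY_PATH
```

This is a matter of tidiness rather than correctness; the plain exports are enough to get nvcc found.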
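Reload the configuration with source ~/.bashrc, then run nvcc -V to check that the CUDA Toolkit is installed correctly.
nvcc is the CUDA compiler, found in the bin directory of the CUDA Toolkit; it plays the same role for CUDA code that gcc plays for C.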
root@master:~# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Jun_13_19:16:58_PDT_2023
Cuda compilation tools, release 12.2, V12.2.91
Build cuda_12.2.r12.2/compiler.32965470_0
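If the check needs to be scripted, the release number can be pulled out of this output. This is a sketch: nvcc_release is my own helper name, and it assumes the "release X.Y," wording shown in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the release number (e.g. 12.2)
# from `nvcc -V` output read from stdin.
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p' | head -n 1
}

# Only runs when nvcc is actually on PATH.
if command -v nvcc >/dev/null 2>&1; then
  echo "nvcc release: $(nvcc -V | nvcc_release)"
fi
```

With nvcc on PATH and CUDA_HOME set, re-running pip install flash_attn should get past the CUDA_HOME error from the original log; whether the rest of the build then succeeds is not something this check can confirm.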