
CentOS 7: python -m bitsandbytes fails with "CUDA Setup failed despite GPU being available"

python -m bitsandbytes

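On CentOS 7.9, after installing the GPU driver and CUDA, running a large model fails with an error that suggests running python -m bitsandbytes. Running that command fails as well: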

CUDA SETUP: Loading binary /usr/local/python3/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /usr/local/python3/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda117.so)
CUDA SETUP: Something unexpected happened. Please compile from source:
git clone https://github.com/TimDettmers/bitsandbytes.git
cd bitsandbytes
CUDA_VERSION=117 make cuda11x
python setup.py install
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.9/runpy.py", line 188, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/python3/lib/python3.9/runpy.py", line 147, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/local/python3/lib/python3.9/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/__init__.py", line 6, in <module>
    from . import cuda_setup, utils, research
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/research/__init__.py", line 1, in <module>
    from . import nn
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/research/nn/__init__.py", line 1, in <module>
    from .modules import LinearFP8Mixed, LinearFP8Global
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/research/nn/modules.py", line 8, in <module>
    from bitsandbytes.optim import GlobalOptimManager
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/optim/__init__.py", line 6, in <module>
    from bitsandbytes.cextension import COMPILED_WITH_CUDA
  File "/usr/local/python3/lib/python3.9/site-packages/bitsandbytes/cextension.py", line 20, in <module>
    raise RuntimeError('''
RuntimeError: 
        CUDA Setup failed despite GPU being available. Please run the following command to get more information:

        python -m bitsandbytes

        Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
        to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
        and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

Finding the problem:

The error is most likely caused by the missing CXXABI_1.3.9 version in the system libstdc++:

/lib64/libstdc++.so.6: version `CXXABI_1.3.9' not found
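
When the log is long, the missing version tag can be pulled out of it mechanically. The following is a minimal sketch, not part of the original troubleshooting; the function name required_version is hypothetical, and it assumes the loader message has the "version `X' not found" form shown above.

```shell
# Print the version tag named in a "version `X' not found" loader error.
# Reads the error text on stdin; prints nothing if no such message is present.
# (required_version is a hypothetical helper name.)
required_version() {
    sed -n "s/.*version \`\([A-Za-z0-9_.]*\)' not found.*/\1/p" | head -n 1
}
```

For example, piping the command's combined output through it (python -m bitsandbytes 2>&1 | required_version) should print the tag the loader is asking for.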

Solution:

1. Check the gcc version

gcc -v

Output:

gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
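
According to the libstdc++ ABI history, CXXABI_1.3.9 first appears with GCC 5.1.0, so a 4.8.5 toolchain cannot provide it. A small sketch of that comparison follows; version_ge is a hypothetical helper name, and it assumes GNU sort with the -V (version sort) option.

```shell
# Succeed if dotted version $1 is greater than or equal to $2.
# Relies on GNU sort -V; version_ge is a hypothetical helper name.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

A possible use is version_ge "$(gcc -dumpversion)" 5.1.0 || echo "gcc too old for CXXABI_1.3.9".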

2. Check the shared library currently installed on the system

strings /usr/lib64/libstdc++.so.6 | grep GLIBC
or
strings /usr/lib64/libstdc++.so.6 | grep CXXABI

The output shows that CXXABI_1.3.9 is missing; the newest version present is CXXABI_1.3.7. The first command (grep GLIBC) prints:

GLIBCXX_3.4
GLIBCXX_3.4.1
GLIBCXX_3.4.2
GLIBCXX_3.4.3
GLIBCXX_3.4.4
GLIBCXX_3.4.5
GLIBCXX_3.4.6
GLIBCXX_3.4.7
GLIBCXX_3.4.8
GLIBCXX_3.4.9
GLIBCXX_3.4.10
GLIBCXX_3.4.11
GLIBCXX_3.4.12
GLIBCXX_3.4.13
GLIBCXX_3.4.14
GLIBCXX_3.4.15
GLIBCXX_3.4.16
GLIBCXX_3.4.17
GLIBCXX_3.4.18
GLIBCXX_3.4.19
GLIBC_2.3
GLIBC_2.2.5
GLIBC_2.14
GLIBC_2.4
GLIBC_2.3.2
GLIBCXX_DEBUG_MESSAGE_LENGTH

The second command (grep CXXABI) prints:

CXXABI_1.3
CXXABI_1.3.1
CXXABI_1.3.2
CXXABI_1.3.3
CXXABI_1.3.4
CXXABI_1.3.5
CXXABI_1.3.6
CXXABI_1.3.7
CXXABI_TM_1
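
Rather than reading the list by eye, the check can be scripted. This is a sketch under assumptions: the helper names has_version_tag and highest_cxxabi are hypothetical, the input is one tag per line (as produced by the strings command above), and GNU sort -V is available.

```shell
# Succeed if the exact version tag $1 appears in the list read from stdin
# (one tag per line). Hypothetical helper name.
has_version_tag() {
    grep -qxF -- "$1"
}

# Print the highest CXXABI_x.y.z tag from the list read from stdin.
# Non-numeric tags such as CXXABI_TM_1 are ignored. Hypothetical helper name.
highest_cxxabi() {
    grep -E '^CXXABI_[0-9]+(\.[0-9]+)*$' | sort -V | tail -n 1
}
```

For instance, strings /usr/lib64/libstdc++.so.6 | has_version_tag CXXABI_1.3.9 returns a non-zero exit status on the system described here.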

3. Find the location and version of the current shared library

find / -name libstdc++.so.6*

Output:

/usr/lib64/libstdc++.so.6
/usr/lib64/libstdc++.so.6.0.19
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.py
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyc
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyo
/usr/local/cuda-12.1/nsight-systems-2023.1.2/host-linux-x64/libstdc++.so.6
/usr/local/cuda-12.1/nsight-compute-2023.1.1/host/linux-desktop-glibc_2_11_3-x64/libstdc++.so.6

Then run:

[root@dev ~]# cd /usr/lib64
[root@dev lib64]# ls -l libstdc++.so*
lrwxrwxrwx. 1 root root     19 Jul 31 09:44 libstdc++.so.6 -> libstdc++.so.6.0.19
-rwxr-xr-x. 1 root root 995840 Sep 30  2020 libstdc++.so.6.0.19

libstdc++.so.6 is a symlink that points to libstdc++.so.6.0.19.
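
The same lookup can be done in one step by resolving the symlink and keeping only the numeric suffix of the real file name. A minimal sketch, assuming GNU readlink (for -f); libstdcxx_file_version is a hypothetical helper name.

```shell
# Print the numeric suffix of the file that a libstdc++.so.6 symlink
# resolves to, e.g. "6.0.19" for libstdc++.so.6.0.19.
# Hypothetical helper name; readlink -f is the GNU coreutils option.
libstdcxx_file_version() {
    readlink -f -- "$1" | sed -n 's|^.*/libstdc++\.so\.||p'
}
```

Calling libstdcxx_file_version /usr/lib64/libstdc++.so.6 before and after the upgrade gives a quick way to confirm which file the loader will pick up.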

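4. Conclusion

The cause is that the system libstdc++ is too old to provide the CXXABI version the bitsandbytes binary needs, so upgrading gcc (and with it libstdc++) should fix the problem. With the approach clear, on to the upgrade.

Upgrade gcc

All gcc releases are listed at http://ftp.gnu.org/gnu/gcc/. Pick the one that suits your needs; if unsure, pick the latest.

Download the GCC source; gcc-11.2.0 is used here: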
wget http://ftp.gnu.org/gnu/gcc/gcc-11.2.0/gcc-11.2.0.tar.gz
Extract:
tar -zxvf gcc-11.2.0.tar.gz
Download the prerequisites:
cd gcc-11.2.0
./contrib/download_prerequisites
Create the build directory and configure:
# still inside the gcc-11.2.0 directory
mkdir build
cd build
../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib
Compile (this can take 1-3 hours):
make
# Alternatively, run the build in the background with nohup instead of the plain make above
nohup make &
After the build finishes, install:
make install
Check the gcc version:
gcc -v

If the version is still the old one, restart the server with reboot and check again.
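
Before rebooting, it may be worth checking which gcc the shell actually finds. With no --prefix given to configure, make install places the new compiler under /usr/local, while the distribution's gcc stays in /usr/bin; bash also caches command locations until hash -r is run or a new shell is opened. The sketch below lists every match on PATH in lookup order; list_on_path is a hypothetical helper name, not something from the original article.

```shell
# Print every executable named $1 found on PATH, in lookup order.
# The first line of output is the one the shell would run.
# (list_on_path is a hypothetical helper name.)
list_on_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            printf '%s\n' "$dir/$1"
        fi
    done
    IFS=$old_ifs
}
```

If list_on_path gcc shows /usr/bin/gcc ahead of /usr/local/bin/gcc, the old version will keep winning until PATH is reordered; if the order is already right, hash -r may be all that is needed.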

Create the symlink (almost done)

Run the command from the beginning again:
strings /usr/lib64/libstdc++.so.6 | grep CXXABI
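
CXXABI_1.3.9 should appear here once /usr/lib64/libstdc++.so.6 points to the new library. Because make install placed the new library under /usr/local/lib64 (see the listing that follows), the symlink in /usr/lib64 still has to be updated.

Find where the GCC build put the newest shared library: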
find / -name "libstdc++.so*"

Output:

/root/gcc-11.2.0/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.29
/root/gcc-11.2.0/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6
/root/gcc-11.2.0/build/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so
/root/gcc-11.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.29
/root/gcc-11.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6
/root/gcc-11.2.0/build/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so
/root/gcc-11.2.0/build/stage1-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6.0.29
/root/gcc-11.2.0/build/stage1-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6
/root/gcc-11.2.0/build/stage1-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so
/usr/lib/gcc/x86_64-redhat-linux/4.8.2/32/libstdc++.so
/usr/lib/gcc/x86_64-redhat-linux/4.8.2/libstdc++.so
/usr/lib64/libstdc++.so.6
/usr/lib64/libstdc++.so.6.0.19
/usr/lib64/libstdc++.so.6.0.29
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.py
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyc
/usr/share/gdb/auto-load/usr/lib64/libstdc++.so.6.0.19-gdb.pyo
/usr/local/lib64/libstdc++.so.6.0.29
/usr/local/lib64/libstdc++.so.6
/usr/local/lib64/libstdc++.so
/usr/local/lib64/libstdc++.so.6.0.29-gdb.py
/usr/local/cuda-12.1/nsight-systems-2023.1.2/host-linux-x64/libstdc++.so.6
/usr/local/cuda-12.1/nsight-compute-2023.1.1/host/linux-desktop-glibc_2_11_3-x64/libstdc++.so.6

As the listing shows, a newer version exists at /usr/local/lib64/libstdc++.so.6.0.29.
Next, make /usr/lib64/libstdc++.so.6 point to it.

Create the symlink:
cd /usr/lib64
cp /usr/local/lib64/libstdc++.so.6.0.29 /usr/lib64/
rm libstdc++.so.6
ln -s libstdc++.so.6.0.29 libstdc++.so.6
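
Between the rm and the ln -s above there is a brief moment in which libstdc++.so.6 does not exist, and any C++ program started in that window would fail to load. A more cautious variant, sketched here under assumptions, creates the new link under a temporary name and renames it over the old one; switch_libstdcxx is a hypothetical helper name, and mv -T is a GNU coreutils option.

```shell
# Point <dir>/libstdc++.so.6 at <dir>/<new_name> by renaming a freshly
# created symlink over the old one, so the link is never missing.
# The old library file is left in place, so the change can be undone by
# calling the function again with the old file name.
# (switch_libstdcxx is a hypothetical helper name; mv -T is GNU-specific.)
switch_libstdcxx() {
    dir=$1
    new_name=$2
    [ -f "$dir/$new_name" ] || { echo "missing $dir/$new_name" >&2; return 1; }
    ln -s -- "$new_name" "$dir/libstdc++.so.6.tmp.$$" &&
        mv -Tf -- "$dir/libstdc++.so.6.tmp.$$" "$dir/libstdc++.so.6"
}
```

After copying the new library into /usr/lib64 as above, this would be called as switch_libstdcxx /usr/lib64 libstdc++.so.6.0.29 (as root).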

Check:

[aigc@dev lib64]$ ls -l libstdc++.so*
lrwxrwxrwx  1 root root       19 Aug  9 11:48 libstdc++.so.6 -> libstdc++.so.6.0.29
-rwxr-xr-x. 1 root root   995840 Sep 30  2020 libstdc++.so.6.0.19
-rwxr-xr-x  1 root root 14595752 Aug  9 11:46 libstdc++.so.6.0.29

libstdc++.so.6 now points to libstdc++.so.6.0.29.
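
A less invasive option, which the bitsandbytes error message itself hints at, is to leave /usr/lib64 untouched and put /usr/local/lib64 first on LD_LIBRARY_PATH for the processes that need it. This was not the route taken in this write-up, so treat the following as an untested-on-this-system sketch; prepend_path is a hypothetical helper name.

```shell
# Print $2 (a colon-separated search path, possibly empty) with directory $1
# placed first, unless $1 is already listed.
# (prepend_path is a hypothetical helper name.)
prepend_path() {
    case ":$2:" in
        *":$1:"*) printf '%s\n' "$2" ;;
        *)        printf '%s\n' "$1${2:+:$2}" ;;
    esac
}
```

It could be used as LD_LIBRARY_PATH=$(prepend_path /usr/local/lib64 "${LD_LIBRARY_PATH:-}") python -m bitsandbytes, which affects only that one command.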

Check the shared library again:

strings /usr/lib64/libstdc++.so.6 | grep CXXABI

CXXABI_1.3.9 is now present, along with several newer versions.

The gcc upgrade is complete. Now run the original command again:

python -m bitsandbytes

The problem is fully resolved.
