Problem details:
[ 69%] Building CUDA object src/spconv/CMakeFiles/spconv.dir/maxpool.cu.o
make[2]: *** [src/spconv/CMakeFiles/spconv.dir/build.make:76: src/spconv/CMakeFiles/spconv.dir/all.cc.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [CMakeFiles/Makefile2:136: src/spconv/CMakeFiles/spconv.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
Traceback (most recent call last):
  File "setup.py", line 77, in <module>
    setup(
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/__init__.py", line 107, in setup
    return distutils.core.setup(**attrs)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
    super().run_command(command)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 299, in run
    self.run_command('build')
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
    super().run_command(command)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
    self.run_command(cmd_name)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/dist.py", line 1233, in run_command
    super().run_command(command)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "setup.py", line 40, in run
    self.build_extension(ext)
  File "setup.py", line 73, in build_extension
    subprocess.check_call(['cmake', '--build', '.'] + build_args, cwd=self.build_temp)
  File "/home/xd/anaconda3/envs/centerformer/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '--', '-j4']' returned non-zero exit status 2.
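Description: this error appears partway through compilation when you run python setup.py bdist_wheel while installing spconv 1.0/1.2.1.
Possibility 1: the spconv source tree is missing files. In a spconv 1.2.1 download from the official repository, the third_party/pybind11 folder is empty (it is a git submodule, which a plain archive download does not include), so you have to download it manually.
pybind11 link: https://github.com/pybind/pybind11/tree/3b1dbebabc801c9cf6f0953a4c20b904d444f879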
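Before starting a build, it can help to confirm that the vendored dependencies are really there. The following is a minimal sketch, assuming you pass in the path of the spconv source tree; the function name empty_third_party_dirs is my own and not part of spconv.

```python
from pathlib import Path


def empty_third_party_dirs(source_root, names=("pybind11",)):
    """Return the names under third_party/ that are missing or empty."""
    third_party = Path(source_root) / "third_party"
    problems = []
    for name in names:
        folder = third_party / name
        # A submodule that was never fetched shows up as an absent or empty folder.
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(name)
    return problems


if __name__ == "__main__":
    missing = empty_third_party_dirs(".")
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("third_party looks populated")
```

Run it from the root of the spconv checkout; if pybind11 is reported, fetch it from the link above before calling setup.py.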
Possibility 2: with a 30-series GPU, CUDA 11.x is required. That is a newer CUDA version, so spconv 2.x should be installed instead.
Error encountered:
nvcc fatal : Unsupported gpu architecture 'compute_89'
make[2]: *** [src/utils/CMakeFiles/spconv_nms.dir/build.make:63:src/utils/CMakeFiles/spconv_nms.dir/nms.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:240:src/utils/CMakeFiles/spconv_nms.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
nvcc fatal : Unsupported gpu architecture 'compute_89'
make[2]: *** [src/cuhash/CMakeFiles/cuhash.dir/build.make:89:src/cuhash/CMakeFiles/cuhash.dir/hash_table.cu.o] Error 1
make[2]: *** Waiting for unfinished jobs....
nvcc fatal : Unsupported gpu architecture 'compute_89'
make[2]: *** [src/cuhash/CMakeFiles/cuhash.dir/build.make:63:src/cuhash/CMakeFiles/cuhash.dir/hash_functions.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:159:src/cuhash/CMakeFiles/cuhash.dir/all] Error 2
make: *** [Makefile:130:all] Error 2
Traceback (most recent call last):
  File "setup.py", line 108, in <module>
    zip_safe=False,
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/site-packages/wheel/bdist_wheel.py", line 325, in run
    self.run_command("build")
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/command/build.py", line 135, in run
    self.run_command(cmd_name)
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "setup.py", line 48, in run
    self.build_extension(ext)
  File "setup.py", line 92, in build_extension
    subprocess.check_call(['cmake', '--build', '.'] + build_args, cwd=self.build_temp)
  File "/home/xd/anaconda3/envs/VFF/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '--', '-j4']' returned non-zero exit status 2.
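Cause: the GPU's compute capability (compute_89, i.e. 8.9) is newer than anything the installed nvcc supports.
Solution: add the following line to ~/.bashrc and reload it, so that the build targets an older architecture: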
export TORCH_CUDA_ARCH_LIST="8.0"
source ~/.bashrc
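Which value to export depends on the CUDA toolkit rather than on the card alone: nvcc only accepts architectures it knows about, and binaries built for 8.0 still run on later 8.x GPUs. Below is a rough sketch of picking a value. The table of "first CUDA release that accepts each architecture" is an assumption taken from NVIDIA's release notes as I recall them, and arch_list_for is a hypothetical helper, so check both against your own toolkit.

```python
# First CUDA release whose nvcc accepts each compute capability.
# Assumption: recalled from NVIDIA release notes; verify for your toolkit.
FIRST_CUDA_FOR_ARCH = {
    (7, 0): (9, 0),
    (7, 5): (10, 0),
    (8, 0): (11, 0),
    (8, 6): (11, 1),
    (8, 9): (11, 8),
    (9, 0): (11, 8),
}


def arch_list_for(gpu_cc, cuda_version):
    """Pick a TORCH_CUDA_ARCH_LIST value for a GPU and an nvcc version.

    Both arguments are (major, minor) tuples, e.g. (8, 9) and (11, 5).
    """
    usable = [cc for cc, first in FIRST_CUDA_FOR_ARCH.items()
              if first <= cuda_version and cc <= gpu_cc]
    if not usable:
        raise ValueError("no architecture in the table fits this combination")
    best = max(usable)
    value = "%d.%d" % best
    # Binaries run on later GPUs of the same major version; across major
    # versions the driver has to JIT-compile PTX, so request it explicitly.
    if best[0] != gpu_cc[0]:
        value += "+PTX"
    return value


if __name__ == "__main__":
    print(arch_list_for((8, 9), (11, 0)))
```

For a compute capability 8.9 card with a CUDA 11.0 toolkit this yields 8.0, the same value exported above; with a CUDA 11.8 toolkit it yields 8.9 and no workaround is needed.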
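Problems vary a lot from machine to machine; in many cases this kind of error comes down to a version mismatch.
Possibility 3: I ran into it again today, this time while installing spconv 1.0. Below is a solution from GitHub.
I have not tried it yet; I will try it later.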
conda create --name spconv python=3.7 pytorch=1.4 cudatoolkit=10.1 cudatoolkit-dev=10.1 cmake --channel pytorch
conda activate spconv
conda install cudnn
git clone https://github.com/traveller59/spconv --recursive
cd spconv
git checkout 8da6f967fb9a054d8870c3515b1b44eca2103634
git am <path_to_hotfixes>/0001-fix-problem-with-torch-1.4.patch
git am <path_to_hotfixes>/0001-Allow-to-specifiy-CUDA_ROOT-directory-and-pick-corre.patch
CUDA_ROOT=<path_to_your_conda_installation>/envs/spconv python setup.py bdist_wheel
cd dist/
pip install *
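Reference: https://lightrun.com/answers/traveller59-spconv-cuda-path-seems-to-be-hard-coded
Possibility 4:
The source files themselves may also be faulty. Here is a copy that I have personally verified to work:
It is for installing spconv 1.2.1 and has worked in my tests so far.
Possibility 5:
Delete everything inside the /spconv/build/ folder and install again. Leftovers from an earlier build on another machine may be what is blocking the current install.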
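The cleanup in Possibility 5 can also be scripted. This is a small sketch, assuming the build directory sits directly under the spconv source root as in the logs in this post; clean_build_dir is a made-up name.

```python
import shutil
from pathlib import Path


def clean_build_dir(source_root):
    """Delete <source_root>/build if present; return True when something was removed."""
    build_dir = Path(source_root) / "build"
    if build_dir.is_dir():
        # CMakeCache.txt inside build/ pins compiler and CUDA paths from the old run.
        shutil.rmtree(build_dir)
        return True
    return False
```

Call clean_build_dir("/path/to/spconv") and then rerun python setup.py bdist_wheel so that CMake configures from scratch.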
Possibility 6:
A new problem today: this time on an A100 with CUDA 11.5, the build failed right at the start.
|||||CMAKE ARGS||||| ['-DCMAKE_PREFIX_PATH=/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/torch', '-DPYBIND11_PYTHON_VERSION=3.8', '-DSPCONV_BuildTests=OFF', '-DPYTORCH_VERSION=11001', '-DCMAKE_CUDA_FLAGS="--expt-relaxed-constexpr" -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/home/ubuntu/Downloads/centerformer/spconv-1.2.1/build/lib.linux-x86_64-cpython-38/spconv', '-DCMAKE_BUILD_TYPE=Release']
-- The CXX compiler identification is GNU 9.5.0
CMake Error at /home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:739 (message):
  Compiling the CUDA compiler identification source file "CMakeCUDACompilerId.cu" failed.
  Compiler: /usr/bin/nvcc
  Build flags: "--expt-relaxed-constexpr";-D__CUDA_NO_HALF_OPERATORS__;-D__CUDA_NO_HALF_CONVERSIONS__
  Id flags: --keep;--keep-dir;tmp -v
  The output was:
  1
  #$ _NVVM_BRANCH_=nvvm
  #$ _SPACE_=
  #$ _CUDART_=cudart
  #$ _HERE_=/usr/lib/nvidia-cuda-toolkit/bin
  #$ _THERE_=/usr/lib/nvidia-cuda-toolkit/bin
  #$ _TARGET_SIZE_=
  #$ _TARGET_DIR_=
  #$ _TARGET_SIZE_=64
  #$ NVVMIR_LIBRARY_DIR=/usr/lib/nvidia-cuda-toolkit/libdevice
  #$ PATH=/usr/lib/nvidia-cuda-toolkit/bin:/home/ubuntu/anaconda3/envs/centerformer/bin:/home/ubuntu/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/usr/local/cuda/bin
  #$ LIBRARIES= -L/usr/lib/x86_64-linux-gnu/stubs -L/usr/lib/x86_64-linux-gnu
  nvcc fatal : Don't know what to do with '"--expt-relaxed-constexpr"'
  Compiling the CUDA compiler identification source file "CMakeCUDACompilerId.cu" failed.
  Compiler: /usr/bin/nvcc
  Build flags:
  Id flags: --keep;--keep-dir;tmp -v
  The output was:
  1
  #$ _NVVM_BRANCH_=nvvm
  #$ _SPACE_=
  #$ _CUDART_=cudart
  #$ _HERE_=/usr/lib/nvidia-cuda-toolkit/bin
  #$ _THERE_=/usr/lib/nvidia-cuda-toolkit/bin
  #$ _TARGET_SIZE_=
  #$ _TARGET_DIR_=
  #$ _TARGET_SIZE_=64
  #$ NVVMIR_LIBRARY_DIR=/usr/lib/nvidia-cuda-toolkit/libdevice
  #$ PATH=/usr/lib/nvidia-cuda-toolkit/bin:/home/ubuntu/anaconda3/envs/centerformer/bin:/home/ubuntu/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/usr/local/cuda/bin
  #$ LIBRARIES= -L/usr/lib/x86_64-linux-gnu/stubs -L/usr/lib/x86_64-linux-gnu
  #$ rm tmp/a_dlink.reg.c
  #$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -E -x c++ -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__ -D__CUDACC_VER_MAJOR__=11 -D__CUDACC_VER_MINOR__=5 -D__CUDACC_VER_BUILD__=119 -D__CUDA_API_VER_MAJOR__=11 -D__CUDA_API_VER_MINOR__=5 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "CMakeCUDACompilerId.cu" -o "tmp/CMakeCUDACompilerId.cpp1.ii"
  #$ cicc --c++14 --gnu_version=90500 --display_error_number --orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name "/home/ubuntu/Downloads/centerformer/spconv-1.2.1/build/temp.linux-x86_64-cpython-38/CMakeFiles/3.25.0/CompilerIdCUDA/CMakeCUDACompilerId.cu" --allow_managed -arch compute_52 -m64 --no-version-ident -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 --include_file_name "CMakeCUDACompilerId.fatbin.c" -tused --gen_module_id_file --module_id_file_name "tmp/CMakeCUDACompilerId.module_id" --gen_c_file_name "tmp/CMakeCUDACompilerId.cudafe1.c" --stub_file_name "tmp/CMakeCUDACompilerId.cudafe1.stub.c" --gen_device_file_name "tmp/CMakeCUDACompilerId.cudafe1.gpu" "tmp/CMakeCUDACompilerId.cpp1.ii" -o "tmp/CMakeCUDACompilerId.ptx"
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
  /usr/include/c++/11/type_traits(1406): error: type name is not allowed
  /usr/include/c++/11/type_traits(1406): error: identifier "__is_same" is undefined
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
  /usr/include/c++/11/type_traits(3251): error: type name is not allowed
  /usr/include/c++/11/bits/stl_pair.h(460): error: argument list for class template "std::pair" is missing
  /usr/include/c++/11/bits/stl_pair.h(460): error: expected a ")"
  /usr/include/c++/11/bits/stl_pair.h(460): error: template parameter "_T1" may not be redeclared in this scope
  /usr/include/c++/11/bits/stl_pair.h(460): error: expected a ";"
  9 errors detected in the compilation of "CMakeCUDACompilerId.cu".
  # --error 0x1 --
Call Stack (most recent call first):
  /home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:6 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
  /home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCompilerId.cmake:48 (__determine_compiler_id_test)
  /home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.25/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
  CMakeLists.txt:6 (project)
-- Configuring incomplete, errors occurred!
See also "/home/ubuntu/Downloads/centerformer/spconv-1.2.1/build/temp.linux-x86_64-cpython-38/CMakeFiles/CMakeOutput.log".
See also "/home/ubuntu/Downloads/centerformer/spconv-1.2.1/build/temp.linux-x86_64-cpython-38/CMakeFiles/CMakeError.log".
Traceback (most recent call last):
  File "setup.py", line 96, in <module>
    setup(
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/__init__.py", line 103, in setup
    return distutils.core.setup(**attrs)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
    dist.run_commands()
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
    self.run_command(cmd)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/wheel/bdist_wheel.py", line 364, in run
    self.run_command("build")
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/command/build.py", line 131, in run
    self.run_command(cmd_name)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
    self.distribution.run_command(command)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/dist.py", line 989, in run_command
    super().run_command(command)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
    cmd_obj.run()
  File "setup.py", line 48, in run
    self.build_extension(ext)
  File "setup.py", line 91, in build_extension
    subprocess.check_call(['cmake', ext.sourcedir] + cmake_args, cwd=self.build_temp, env=env)
  File "/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/home/ubuntu/Downloads/centerformer/spconv-1.2.1', '-DCMAKE_PREFIX_PATH=/home/ubuntu/anaconda3/envs/centerformer/lib/python3.8/site-packages/torch', '-DPYBIND11_PYTHON_VERSION=3.8', '-DSPCONV_BuildTests=OFF', '-DPYTORCH_VERSION=11001', '-DCMAKE_CUDA_FLAGS="--expt-relaxed-constexpr" -D__CUDA_NO_HALF
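I then found that the gcc version was the problem: after switching from gcc-9 and g++-9 to gcc-11 and g++-11, this error went away.
The steps for switching are as follows:
① Remove the old version
sudo rm /usr/bin/gcc
sudo rm /usr/bin/g++
② Install the new version
sudo apt-get install gcc-11 # if that fails, use: sudo apt-get install gcc
sudo apt-get install g++-11
③ Add new symbolic links (shortcuts)
sudo ln -s /usr/bin/gcc-11 /usr/bin/gcc
sudo ln -s /usr/bin/g++-11 /usr/bin/g++
Then run gcc -v to check that the switch took effect.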
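To see what the build will actually pick up, it may help to print which gcc, g++ and nvcc come first on PATH, together with gcc's major version. A small sketch follows; parse_gcc_major and report_toolchain are hypothetical helpers, and the parser assumes the usual gcc --version banner whose first line ends with the version number.

```python
import re
import shutil
import subprocess


def parse_gcc_major(version_banner):
    """Extract the major version from the first line of `gcc --version` output."""
    lines = version_banner.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"(\d+)\.\d+(?:\.\d+)?\s*$", first_line)
    return int(match.group(1)) if match else None


def report_toolchain():
    """Print the resolved path of each tool, plus gcc's major version if found."""
    for tool in ("gcc", "g++", "nvcc"):
        print(tool, "->", shutil.which(tool))
    if shutil.which("gcc"):
        banner = subprocess.run(["gcc", "--version"], capture_output=True,
                                text=True).stdout
        print("gcc major version:", parse_gcc_major(banner))
```

In the log above, nvcc passes --gnu_version=90500 to cicc while the headers being parsed come from /usr/include/c++/11, which is the kind of mismatch a report like this makes easier to spot.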
In the end, though, the problem in Possibility 6 was not really solved. Posts online blame gcc or cudnn, but I could not pin it down. Here is a simpler workaround:
We can port code written for spconv 1.2.1 to the 2.x API. The main changes are:
① Import:
# spconv 1.x
import spconv
a = spconv.SparseModule()
# spconv 2.x
import spconv.pytorch as spconv
a = spconv.SparseModule()
② Convolution
# spconv 1.x
from spconv import SparseConv3d, SubMConv3d
# spconv 2.x
from spconv.pytorch.conv import (SparseConv2d, SparseConv3d, SparseConvTranspose2d,
                                 SparseConvTranspose3d, SparseInverseConv2d,
                                 SparseInverseConv3d, SubMConv2d, SubMConv3d)
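If the same code has to run against either version, one option is to try the 2.x import first and fall back to 1.x. This is only a sketch, not something the spconv project prescribes, and load_spconv is a made-up name.

```python
import importlib


def load_spconv():
    """Return (module, major_version); (None, 0) when spconv is not installed."""
    for name, major in (("spconv.pytorch", 2), ("spconv", 1)):
        try:
            return importlib.import_module(name), major
        except ImportError:
            # spconv 1.x has no `pytorch` submodule, so the first attempt fails there.
            continue
    return None, 0


spconv, SPCONV_MAJOR = load_spconv()
if spconv is not None:
    # Both layouts expose the convolution classes at this level.
    SparseConv3d, SubMConv3d = spconv.SparseConv3d, spconv.SubMConv3d
```

The 2.x name is tried first because a 2.x install also provides a top-level spconv package, so the reverse order would misreport the version.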
③ Updating SparseTensor features
# spconv 1.x
out.features = self.bn1(out.features)
# spconv 2.x
out = out.replace_feature(self.bn1(out.features))
For example, backbone/scn.py in centerformer is changed as follows:
out = out.replace_feature(self.bn1(out.features))
out = out.replace_feature(self.relu(out.features))
out = self.conv2(out)
out = out.replace_feature(self.bn2(out.features))
if self.downsample is not None:
    identity = self.downsample(x)
out = out.replace_feature(out.features + identity.features)
out = out.replace_feature(self.relu(out.features))
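For the full list of changes, see:
[Point Cloud 3D Object Detection] Installation Problems and Solutions for Spconv 1.x and Spconv 2.x under OpenPCDet
If I run into this problem again, I will update this post.
References:
spconv 1.2.1 installation guide
Spconv library installation tutorial
caffe build error "nvcc fatal : Unsupported gpu architecture 'compute_86'"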