Bug: compilation problem
Build with make; LLAMA_CUDA_NVCC is the path to nvcc inside the CUDA installation:
make LLAMA_CUBLAS=1 LLAMA_CUDA_NVCC=/usr/local/cuda/bin/nvcc
Error message:
- nvcc fatal : Value 'native' is not defined for option 'gpu-architecture'
- make: *** [Makefile:171: ggml-cuda.o] Error 1
- make: *** Waiting for unfinished jobs....
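Before changing the build flags, it can help to confirm which CUDA release the nvcc on that path belongs to. One plausible cause of the error is that this nvcc is too old to recognize the 'native' architecture value, but that is an assumption, not something the error message confirms. A minimal sketch follows; the helper name nvcc_release is hypothetical:

```shell
# Hypothetical helper: extract the "release X.Y" number from `nvcc --version` output on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Path taken from the make command above; adjust it to your installation.
NVCC=/usr/local/cuda/bin/nvcc
if [ -x "$NVCC" ]; then
    echo "nvcc release: $("$NVCC" --version | nvcc_release)"
fi
```

If the reported release differs from the toolkit you expected, LLAMA_CUDA_NVCC may be pointing at a different CUDA installation.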
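Solution:
Add the CUDA_DOCKER_ARCH parameter. Try CUDA_DOCKER_ARCH=all first; if that does not fix the build, try another value that matches your GPU and CUDA version, for example compute_75. The accepted values are: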
'all','all-major','compute_35','compute_37', 'compute_50','compute_52','compute_53','compute_60','compute_61','compute_62', 'compute_70','compute_72','compute_75','compute_80','compute_86','compute_87', 'lto_35','lto_37','lto_50','lto_52','lto_53','lto_60','lto_61','lto_62', 'lto_70','lto_72','lto_75','lto_80','lto_86','lto_87','sm_35','sm_37','sm_50', 'sm_52','sm_53','sm_60','sm_61','sm_62','sm_70','sm_72','sm_75','sm_80', 'sm_86','sm_87'.
make LLAMA_CUBLAS=1 CUDA_DOCKER_ARCH=compute_75 LLAMA_CUDA_NVCC=/usr/local/cuda-11.4/bin/nvcc
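Rather than guessing the value, the GPU's compute capability can be mapped to a compute_XY name. The sketch below assumes your nvidia-smi supports the compute_cap query field, which older drivers may not; the helper name cap_to_arch is hypothetical, and the script only prints a make command instead of running it:

```shell
# Hypothetical helper: turn a compute capability such as "7.5" into "compute_75".
cap_to_arch() {
    printf 'compute_%s\n' "$(printf '%s' "$1" | tr -d '. ')"
}

# Assumption: this nvidia-smi understands --query-gpu=compute_cap (newer drivers only).
if command -v nvidia-smi >/dev/null 2>&1; then
    cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1)"
    if [ -n "$cap" ]; then
        # Print the suggested command; the nvcc path is the one used earlier in this post.
        echo "make LLAMA_CUBLAS=1 CUDA_DOCKER_ARCH=$(cap_to_arch "$cap") LLAMA_CUDA_NVCC=/usr/local/cuda/bin/nvcc"
    fi
fi
```

The derived value still has to appear in the list of values that your nvcc accepts; if it does not, fall back to the nearest lower compute_XY entry or to all.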
The build then succeeds:
cc -I. -Icommon -D_XOPEN_SOURCE=600 -D_GNU_SOURCE -DNDEBUG -DGGML_USE_CUBLAS -I/usr/local/cuda/include -I/opt/cuda/include -I/targets/x86_64-linux/include -std=c11 -fPIC -O3 -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -Werror=implicit-int -Werror=implicit-function-declaration -Wdouble-promotion -pthread -march=native -mtune=native -c tests/test-c.c -o tests/test-c.o
Because my hardware resources were sufficient, I did not go on to quantize the model.
Quantization method, for reference:
./quantize ./zh-models/7B/ggml-model-f16.gguf ./zh-models/7B/ggml-model-q4_0.gguf q4_0
The official Chinese-LLaMA documentation already covers this in detail, so I will not repeat it here:
llamacpp_zh · ymcui/Chinese-LLaMA-Alpaca-2 Wiki (github.com)