https://github.com/echonoshy/cgft-llm
[Large Model Quantization] - Lightweight Model Deployment and Quantization with Llama.cpp (bilibili video)
github.com/ggerganov/llama.cpp
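# Step 1: convert the Hugging Face model to a q8_0 GGUF file (the script lives in the llama.cpp repo root)
cd ~/code/llama.cpp
python convert-hf-to-gguf.py /root/autodl-tmp/models/Llama3-8B-Chinese-Chat --outfile /root/autodl-tmp/models/Llama3-8B-Chinese-Chat-GGUF/Llama3-8B-Chinese-Chat-q8_0-v1.gguf --outtype q8_0
# Step 2: requantize a q8_0 GGUF file to Q4_1 with the quantize binary from the CUDA build
cd ~/code/llama.cpp/build_cuda/bin
./quantize --allow-requantize /root/autodl-tmp/models/Llama3-8B-Chinese-Chat-GGUF/Llama3-8B-Chinese-Chat-q8_0-v2_1.gguf /root/autodl-tmp/models/Llama3-8B-Chinese-Chat-GGUF/Llama3-8B-Chinese-Chat-q4_1-v1.gguf Q4_1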
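The convert and quantize steps can be parameterized so that the input and output paths stay consistent between them. The following is a minimal dry-run sketch, not part of llama.cpp: the helper names `gguf_name` and `print_pipeline`, the `LLAMA_CPP_DIR` variable, and the simplified file naming (without the `-v1` / `-v2_1` suffixes used in these notes) are all assumptions made for illustration. It only prints the commands, so nothing is run until you paste them yourself. Newer llama.cpp checkouts name the script `convert_hf_to_gguf.py` and the binary `llama-quantize`, so adjust the names to match your checkout.

```shell
# Assumed location of the llama.cpp checkout (taken from the cd command in these notes).
LLAMA_CPP_DIR="${LLAMA_CPP_DIR:-$HOME/code/llama.cpp}"

# Hypothetical helper: build "<model_dir>-GGUF/<model_name>-<qtype>.gguf".
gguf_name() {
  printf '%s-GGUF/%s-%s.gguf\n' "$1" "$(basename "$1")" "$2"
}

# Hypothetical helper: print (do not run) the two steps in order.
#   $1 = Hugging Face model directory, $2 = target quantization type (e.g. Q4_1)
print_pipeline() {
  q8_file="$(gguf_name "$1" q8_0)"
  # Use the lowercase quantization type in the output file name.
  target_lc="$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
  out_file="$(gguf_name "$1" "$target_lc")"
  # Step 1: Hugging Face checkpoint -> q8_0 GGUF.
  echo "python $LLAMA_CPP_DIR/convert-hf-to-gguf.py $1 --outfile $q8_file --outtype q8_0"
  # Step 2: q8_0 GGUF -> target type; --allow-requantize is needed because the input is already quantized.
  echo "$LLAMA_CPP_DIR/build_cuda/bin/quantize --allow-requantize $q8_file $out_file $2"
}
```

For example, `print_pipeline /root/autodl-tmp/models/Llama3-8B-Chinese-Chat Q4_1` prints the convert command followed by the quantize command, with the q8_0 file produced by the first step used as the input to the second.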