赞
踩
llama2.c是一个极简的Llama 2 LLM全栈工具,使用一个简单的 700 行 C 文件 ( run.c ) 对其进行推理。llama2.c涉及LLM微调、模型构建、推理端末部署(量化、硬件加速)等众多方面,是学习研究Open LLM的很好切入点。
git clone https://github.com/karpathy/llama2.c.git
# 15M参数模型
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin
# 42M参数模型
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories42M.bin
# 110M参数模型
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.bin
make run
# 15M参数模型
./run stories15M.bin
# 42M参数模型,运行并输入提示词
./run stories42M.bin -i "One day, Lily met a Shoggoth"
# 下载模型
git clone https://huggingface.co/flyingfishinwater/chinese-baby-llama2
百度网盘下载:https://pan.baidu.com/s/1bqO6bii5pxMu6dEL8ZOUWQ?pwd=p6hs
安装 python 相关依赖
pip3 install numpy
pip3 install torch torchvision torchaudio
pip3 install transformers
# 将hf模型文件转换成.bin文件
python export.py ./chinese-baby-llama2.bin --hf ./chinese-baby-llama2
// 将 main() 中的 tokenizer.bin 改为 chinese-baby-llama2 目录下的tokenizer.bin
char *tokenizer_path = "chinese-baby-llama2/tokenizer.bin";
make run
./run chinese-baby-llama2.bin -i "今天是武林大会,我是武林盟主"
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。