Deploying Your Own Fine-Tuned Model with Ollama (ollama safetensor)

Author: 黑客灵魂 | 2024-07-09 15:52:08

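Importing models with Ollama

This guide describes how to import GGUF, PyTorch, or Safetensors models into Ollama.

Start by creating a Modelfile. This file is the blueprint for your model and specifies the weights, parameters, prompt template, and more.

Example Modelfile: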

FROM ./mistral-7b-v0.1.Q4_0.gguf
TEMPLATE "[INST] {{ .Prompt }} [/INST]"
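
As a minimal sketch of how a Modelfile can also carry parameters, the snippet below writes one from a shell function. The `write_modelfile` helper is hypothetical, and the temperature value and stop sequences are illustrative assumptions rather than settings from the original guide; only the `FROM` and `TEMPLATE` lines come from the example above.

```shell
# Hypothetical helper: write a Modelfile for a given GGUF weights file.
# $1 = path to the weights, $2 = path of the Modelfile to create.
write_modelfile() {
  weights="$1"
  out="$2"
  cat > "$out" <<EOF
FROM $weights
TEMPLATE "[INST] {{ .Prompt }} [/INST]"
PARAMETER temperature 0.7
PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"
EOF
}

# File name taken from the example above; the parameter values are assumptions.
write_modelfile ./mistral-7b-v0.1.Q4_0.gguf Modelfile
cat Modelfile
```

Keeping the Modelfile generation in a function makes it easy to reuse the same template for different weight files.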

Create a model from the Modelfile:

ollama create example -f Modelfile

Test the model with the following command:

ollama run example "What is your favorite condiment?"

Importing a model from PyTorch or Safetensors takes more steps than importing GGUF. Work to simplify the process is in progress.

First, clone the Ollama repository:

git clone git@github.com:ollama/ollama.git ollama
cd ollama

Then fetch its llama.cpp submodule:

git submodule init
git submodule update llm/llama.cpp

Next, install the Python dependencies:

python3 -m venv llm/llama.cpp/.venv
source llm/llama.cpp/.venv/bin/activate
pip install -r llm/llama.cpp/requirements.txt

Then build the quantization tool:

make -C llm/llama.cpp quantize
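
The steps in this guide assume that git, Git LFS, python3, and make are already on your PATH. A rough preflight check, with a hypothetical `missing_tools` helper, might look like this:

```shell
# Hypothetical helper: print the name of every listed command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools used by the steps in this guide (git-lfs is needed for the model download).
missing=$(missing_tools git git-lfs python3 make)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

This only checks that the commands exist. It does not check versions or whether a C/C++ compiler is available for the build.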

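If the model is currently hosted in a HuggingFace repository, first clone that repository to download the raw model.

Install Git LFS, verify that it is installed, and then clone the model repository: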

git lfs install
git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1 model
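
If Git LFS was not active during the clone, the weight files in the model directory are small text pointers instead of real data, and conversion will fail. The sketch below detects that case by looking for the Git LFS pointer header. The helper names are hypothetical, and the `*.safetensors` and `*.bin` patterns are an assumption about how the weights are named.

```shell
# Return success if the file is a Git LFS pointer rather than real data.
is_lfs_pointer() {
  head -c 200 "$1" 2>/dev/null | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# List weight files in a model directory that are still LFS pointers.
unfetched_weights() {
  for f in "$1"/*.safetensors "$1"/*.bin; do
    [ -f "$f" ] || continue
    if is_lfs_pointer "$f"; then echo "$f"; fi
  done
}

# Prints nothing when every weight file has been downloaded.
unfetched_weights ./model
```

If any files are listed, running `git lfs pull` inside the model directory should fetch the actual weights.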

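Next, convert the model. Note: some model architectures require a specific conversion script. For example, Qwen models require running convert-hf-to-gguf.py instead of convert.py.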

python llm/llama.cpp/convert.py ./model --outtype f16 --outfile converted.bin

Then quantize the converted model:

llm/llama.cpp/quantize converted.bin quantized.bin q4_0
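
The choice of conversion script described above can be sketched as a small dry run. The `pick_convert_script` helper is hypothetical, and looking for "qwen" anywhere in the model's config.json is only a heuristic assumed for illustration. The guide itself names Qwen as an example, so other architectures may also need convert-hf-to-gguf.py.

```shell
# Hypothetical helper: choose a conversion script for a model directory.
# Heuristic (an assumption): treat the model as Qwen if config.json mentions "qwen".
pick_convert_script() {
  if grep -qi 'qwen' "$1/config.json" 2>/dev/null; then
    echo "llm/llama.cpp/convert-hf-to-gguf.py"
  else
    echo "llm/llama.cpp/convert.py"
  fi
}

# Dry run: print the convert and quantize commands instead of executing them.
script=$(pick_convert_script ./model)
echo "python $script ./model --outtype f16 --outfile converted.bin"
echo "llm/llama.cpp/quantize converted.bin quantized.bin q4_0"
```

Printing the commands first makes it easy to confirm which script was selected before starting a long conversion.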

Next, create a Modelfile for your model:

FROM quantized.bin
TEMPLATE "[INST] {{ .Prompt }} [/INST]"

Create a model from the Modelfile:

ollama create example -f Modelfile

Test the model with the following command:

ollama run example "What is your favorite condiment?"

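Publishing models is in early alpha. If you would like to publish your model for others to use, follow the steps below.

First, copy your model into your username's namespace: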

ollama cp example <your username>/example

Note: model names may only contain lowercase letters, digits, and the characters ., -, and _.
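
The naming rule above can be checked locally before copying or pushing. The `valid_model_name` helper below is a hypothetical sketch that encodes only the character set stated in the note. Ollama may enforce further limits that are not covered here.

```shell
# Hypothetical helper: succeed only if the name consists solely of
# lowercase letters, digits, '.', '-' and '_' (and is not empty).
valid_model_name() {
  printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9._-]+$'
}

for name in example my-model_v1.0 Example "my model"; do
  if valid_model_name "$name"; then
    echo "ok:      $name"
  else
    echo "invalid: $name"
  fi
done
```

Setting `LC_ALL=C` keeps the `a-z` range from matching uppercase letters under locale-dependent collation.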

Then push the model:

ollama push <your username>/example

Once published, your model will be available at:

https://ollama.com/<your username>/example