
Downloading Transformers Models Without Garbled Filenames (Hugging Face, model)

Overview

Purpose: to inspect and move pretrained models and the like without ending up with garbled (hash-named) files, and without repeatedly re-downloading the models.

  • a. (can avoid garbled names) use snapshot_download from huggingface_hub (recommended);
  • b. (no garbled names) download manually with wget;
  • c. use git lfs;
  • d. use a model already downloaded locally.

1. (Can Avoid Garbled Names) snapshot_download from huggingface_hub

Setting local_dir_use_symlinks=False avoids the garbled names (no hash-named symlinks into the cache):

from huggingface_hub import snapshot_download

# repo_id = "ziqingyang/chinese-alpaca-lora-7b"
repo_id = "nghuyong/ernie-3.0-micro-zh"
local_dir = repo_id.replace("/", "_")
cache_dir = local_dir + "/cache"
snapshot_download(cache_dir=cache_dir,
                  local_dir=local_dir,
                  repo_id=repo_id,
                  # False copies real files into local_dir instead of hash-named
                  # symlinks into the cache. The default "auto" duplicates small
                  # files (<5MB) in local_dir and symlinks the bigger ones.
                  local_dir_use_symlinks=False,
                  resume_download=True,
                  allow_patterns=["*.model", "*.json", "*.bin",
                                  "*.py", "*.md", "*.txt"],
                  ignore_patterns=["*.safetensors", "*.msgpack",
                                   "*.h5", "*.ot"],
                  )
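
Once the download finishes, the model can be loaded straight from local_dir. A minimal sketch (assuming a transformers version that recognizes this checkpoint's model_type, and the folder name produced by the script above):

from transformers import AutoTokenizer, AutoModel

local_dir = "nghuyong_ernie-3.0-micro-zh"  # folder created by snapshot_download above
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModel.from_pretrained(local_dir)
print(type(model).__name__)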

2. (No Garbled Names) Manual Download with wget

However, large-model weights are now so big that they are usually split across many files, and downloading them is fairly slow.
The model page lives at https://huggingface.co/{{repo_id}},
and each file downloads from https://huggingface.co/{{repo_id}}/resolve/main/{{filename}}.
Take repo_id == "THUDM/chatglm-6b" as an example.
Model page: https://huggingface.co/THUDM/chatglm-6b

For example, on Linux you can simply use wget:
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/README.md
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/config.json
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/configuration_chatglm.py
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/tokenizer_config.json
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/ice_text.model
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/quantization.py
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/tokenization_chatglm.py
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/modeling_chatglm.py

wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model.bin.index.json

wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00001-of-00008.bin
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00002-of-00008.bin
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00003-of-00008.bin
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00004-of-00008.bin
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00005-of-00008.bin
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00006-of-00008.bin
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00007-of-00008.bin
wget https://huggingface.co/THUDM/chatglm-6b/resolve/main/pytorch_model-00008-of-00008.bin
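
To avoid typing every URL by hand, the same per-file downloads can be scripted. A sketch using hf_hub_download from huggingface_hub, which resolves the same /resolve/main/ URLs and caches the results (the file list is copied from the wget commands above):

from huggingface_hub import hf_hub_download

repo_id = "THUDM/chatglm-6b"
files = ["README.md", "config.json", "configuration_chatglm.py",
         "tokenizer_config.json", "ice_text.model", "quantization.py",
         "tokenization_chatglm.py", "modeling_chatglm.py",
         "pytorch_model.bin.index.json"]
files += [f"pytorch_model-{i:05d}-of-00008.bin" for i in range(1, 9)]

for filename in files:
    # Downloads https://huggingface.co/{repo_id}/resolve/main/{filename}
    path = hf_hub_download(repo_id=repo_id, filename=filename)
    print(path)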

3. Using git lfs

Install git lfs:  git lfs install
Download the model:  git clone https://huggingface.co/THUDM/chatglm-6b
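
The same clone can also be driven from Python through huggingface_hub's Repository wrapper (a sketch; it assumes git and git-lfs are installed, and note that Repository is deprecated in recent huggingface_hub releases in favor of snapshot_download above):

from huggingface_hub import Repository

# Clones https://huggingface.co/THUDM/chatglm-6b into ./chatglm-6b,
# pulling the LFS-tracked weight files through git-lfs.
repo = Repository(local_dir="chatglm-6b", clone_from="THUDM/chatglm-6b")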

4. Using an Already Downloaded Model

A model that is already on disk can be reused directly, and the model directory can also be moved elsewhere.
Default location on Windows: C:\Users\{{user}}\.cache\huggingface\hub
Default location on Linux: /home/{{user}}/.cache/huggingface/hub

from transformers import BertTokenizer, BertModel

repo_id = "nghuyong/ernie-3.0-micro-zh"
cache_dir = "{{actual cache directory}}"  # e.g. the huggingface/hub path above
tokenizer = BertTokenizer.from_pretrained(repo_id, cache_dir=cache_dir)
model = BertModel.from_pretrained(repo_id, cache_dir=cache_dir)
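
Instead of passing cache_dir on every call, the cache location can be redirected globally with the HF_HOME environment variable; a sketch (the path is a placeholder, and the variable must be set before transformers is imported):

import os
os.environ["HF_HOME"] = "/data/hf_home"  # placeholder path; set before the import below

from transformers import BertTokenizer, BertModel

repo_id = "nghuyong/ernie-3.0-micro-zh"
tokenizer = BertTokenizer.from_pretrained(repo_id)  # cached under HF_HOME/hub
model = BertModel.from_pretrained(repo_id)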


Hope this helps!
