The CogVLM2 Int4 model needs only 16 GB of GPU memory to run, but it must run on Linux with an NVIDIA GPU. Memory requirements for the 19B series:
| Mode | GPU Memory (19B Series) | Remarks |
|---|---|---|
| BF16 / FP16 inference | 42 GB | Tested with 2K dialogue text |
| Int4 inference | 16 GB | Tested with 2K dialogue text |
| BF16 LoRA tuning (vision expert frozen) | 57 GB | Training text length is 2K |
| BF16 LoRA tuning (with vision expert) | > 80 GB | Cannot be tuned on a single GPU |
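Before downloading anything, it is worth confirming that the GPU really has 16 GB available. A quick check with PyTorch (a minimal sketch; assumes torch with CUDA support is already installed):

```python
import torch

# Query total and currently free memory on GPU 0.
props = torch.cuda.get_device_properties(0)
free, total = torch.cuda.mem_get_info(0)
print(f"{props.name}: {total / 1024**3:.1f} GB total, "
      f"{free / 1024**3:.1f} GB free")
```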
# On AutoDL, enable academic network acceleration
source /etc/network_turbo
# Run this later to turn the acceleration off again
unset http_proxy && unset https_proxy
# Create a directory for the model
mkdir cogvlm2
# Install the huggingface_hub CLI, which is used to download the model
pip install -U huggingface_hub
# Download the model into the current directory
huggingface-cli download THUDM/cogvlm2-llama3-chinese-chat-19B-int4 --local-dir .
# Alternatively, clone the model repository with git (requires git-lfs)
git clone https://huggingface.co/THUDM/cogvlm2-llama3-chinese-chat-19B-int4
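The same download can also be done from Python via huggingface_hub's snapshot_download. A minimal sketch (the local_dir value is just an example path):

```python
from huggingface_hub import snapshot_download

# Download the Int4 model repository to a local directory.
snapshot_download(
    repo_id="THUDM/cogvlm2-llama3-chinese-chat-19B-int4",
    local_dir="/root/autodl-tmp/cogvlm2",  # example path; adjust as needed
)
```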
# Clone the CogVLM2 repository and install the demo dependencies
git clone https://github.com/THUDM/CogVLM2
cd CogVLM2/basic_demo
pip install -r requirements.txt
Contents of requirements.txt:
xformers>=0.0.26.post1
#torch>=2.3.0
#torchvision>=0.18.0
transformers>=4.40.2
huggingface-hub>=0.23.0
pillow>=10.3.0
chainlit>=1.0.506
pydantic>=2.7.1
timm>=0.9.16
openai>=1.30.1
loguru>=0.7.2
einops>=0.7.0
sse-starlette>=2.1.0
bitsandbytes>=0.43.1
vim web_demo.py
# Change MODEL_PATH to the local model directory
MODEL_PATH = '/root/autodl-tmp/cogvlm2/cogvlm2-llama3-chinese-chat-19B-int4'
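For reference, web_demo.py loads the model roughly as follows (a simplified sketch; the actual code in the repository may differ). The Int4 checkpoint is already quantized, so no extra quantization config is needed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = '/root/autodl-tmp/cogvlm2/cogvlm2-llama3-chinese-chat-19B-int4'

# trust_remote_code is required because CogVLM2 ships custom model code.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
).eval()
```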
chainlit run web_demo.py
If running locally, open http://localhost:8000 in your browser.
If on AutoDL, access the demo through an SSH tunnel (the -L flag below forwards local port 8000 to port 8000 on the instance): type yes at the host-key prompt, then paste the instance password.
ssh -CNg -L 8000:127.0.0.1:8000 root@connect.cqa1.xxxx.com -p 46671
The answer in this test is semantically wrong; most likely this is accuracy loss introduced by the Int4 quantization.
You can also talk to the model through requests in the OpenAI API format:
python openai_api_demo.py
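Once the API server is up, any OpenAI-style client can talk to it. A minimal sketch, assuming the server listens on http://127.0.0.1:8000/v1 (the base URL, model name, and image path are all example values; adjust them to match your setup):

```python
import base64
from openai import OpenAI

# OpenAI-compatible client pointed at the local server; the api_key
# value is a placeholder since the local demo does not check it.
client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="EMPTY")

# Encode a local image as a base64 data URL (standard OpenAI vision format).
with open("example.jpg", "rb") as f:  # example image path
    image_url = "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="cogvlm2",  # assumed model name; check the server's config
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }],
)
print(response.choices[0].message.content)
```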
Fix: reinstall the dependencies using the requirements.txt listed above.