Rented a GPU server: Ubuntu 20, GeForce RTX 3090 24G (setup process omitted). I tested ai-galaxy's service; I've since seen others recommend autodl as well.
(The GPU server has since been shut down, so the connection details below are no longer valid.)
SSH address: *
Port: 16116
SSH account: root
Password: *
Internal port: 3389, external port: 16114
VNC address: *
Port: 16115
VNC username: root
Password: *
Hardware requirements (these figures are for ChatGLM-6B; ChatGLM2-6B should be comparable):
Quantization level        Minimum GPU VRAM
FP16 (no quantization)    13 GB
INT8                      10 GB
INT4                      6 GB
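For reference, a minimal sketch of how those three tiers map onto loading code. It assumes the weights sit in a local chatglm2-6b directory (set up later in this post) and that the .quantize() helper behaves as described in THUDM's README; treat it as a sketch, not the tested path of this post.
- # Sketch: loading ChatGLM2-6B at the three quantization tiers above
- from transformers import AutoTokenizer, AutoModel
-
- tokenizer = AutoTokenizer.from_pretrained("chatglm2-6b", trust_remote_code=True)
- # FP16 (no quantization), ~13 GB of VRAM:
- model = AutoModel.from_pretrained("chatglm2-6b", trust_remote_code=True).cuda()
- # INT8 (~10 GB) or INT4 (~6 GB) variants instead:
- # model = AutoModel.from_pretrained("chatglm2-6b", trust_remote_code=True).quantize(8).cuda()
- # model = AutoModel.from_pretrained("chatglm2-6b", trust_remote_code=True).quantize(4).cuda()
- model = model.eval()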
- nvidia-smi
- (base) root@ubuntuserver:~# nvidia-smi
- Fri Sep 8 09:58:25 2023
- +-----------------------------------------------------------------------------+
- | NVIDIA-SMI 510.54 Driver Version: 510.54 CUDA Version: 11.6 |
- |-------------------------------+----------------------+----------------------+
- | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
- | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
- | | | MIG M. |
- |===============================+======================+======================|
- | 0 NVIDIA GeForce ... Off | 00000000:00:07.0 Off | N/A |
- | 38% 42C P0 62W / 250W | 0MiB / 11264MiB | 0% Default |
- | | | N/A |
- +-------------------------------+----------------------+----------------------+
-
- +-----------------------------------------------------------------------------+
- | Processes: |
- | GPU GI CI PID Type Process name GPU Memory |
- | ID ID Usage |
- |=============================================================================|
- | No running processes found |
- +-----------------------------------------------------------------------------+
- (base) root@ubuntuserver:~#
- git clone https://github.com/THUDM/ChatGLM2-6B
- cd ChatGLM2-6B
The server itself could not reach GitHub either; download the repo as a zip in a browser and copy it up with WinSCP.
Check the driver version required by each CUDA release:
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
CUDA 11.8 requires a driver >= 450.80.02, which this machine already satisfies.
Run the following to install CUDA 11.8:
- wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
- sh cuda_11.8.0_520.61.05_linux.run
-> type accept
-> deselect Driver (one is already installed)
-> select Install
- export PATH=$PATH:/usr/local/cuda-11.8/bin
- nvcc --version
- # Build OpenSSL 1.1.1 (Python 3.10's ssl module needs it):
- wget https://www.openssl.org/source/openssl-1.1.1s.tar.gz
- tar -zxf openssl-1.1.1s.tar.gz && \
- cd openssl-1.1.1s/ && \
- ./config -fPIC --prefix=/usr/include/openssl enable-shared && \
- make -j8
- make install
-
- wget https://www.python.org/ftp/python/3.10.10/Python-3.10.10.tgz
- or
- wget https://registry.npmmirror.com/-/binary/python/3.10.10/Python-3.10.10.tgz
- apt update && \
- apt install build-essential zlib1g-dev libncurses5-dev libgdbm-dev libnss3-dev libssl-dev libreadline-dev libffi-dev libsqlite3-dev wget libbz2-dev
- tar -xf Python-3.10.10.tgz && \
- cd Python-3.10.10 && \
- ./configure --prefix=/usr/local/python310 --with-openssl-rpath=auto --with-openssl=/usr/include/openssl OPENSSL_LDFLAGS=-L/usr/include/openssl/lib OPENSSL_LIBS="-lssl -lcrypto" OPENSSL_INCLUDES=-I/usr/include/openssl/include
-
- make -j8
- make install
- ln -s /usr/local/python310/bin/python3.10 /usr/bin/python3.10
- # First, install the CUDA build of torch on its own
- python3.10 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
-
- # Then install the repository's dependencies
- python3.10 -m pip install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple
- python3.10 -m pip install -r requirements.txt
-
Problem: the network is slow; point pip at a domestic (Chinese) mirror:
python3.10 -m pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
Problem: ERROR: Could not find a version that satisfies the requirement streamlit>=1.24.0
The Python 3.9 shipped in the ubuntu20 environment is too old for this, which is why Python 3.10 was built above.
Verify that the installed torch has CUDA support:
- # Check that torch can see the GPU
- import torch
- print(torch.cuda.is_available())  # should print True
- device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
- print(device)  # should print cuda:0
# Place the downloaded model files in a local chatglm2-6b directory
- curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
- sudo apt-get install git-lfs
- git clone https://huggingface.co/THUDM/chatglm2-6b $PWD/chatglm2-6b
Still too slow over this network (an untested alternative is sketched below).
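One option the original setup did not use (so treat it as an assumption on my part) is huggingface_hub's snapshot_download, which fetches every file in a model repo and can resume interrupted transfers:
- # Sketch: fetch the whole model repo with huggingface_hub instead of git-lfs
- from huggingface_hub import snapshot_download
-
- snapshot_download(repo_id="THUDM/chatglm2-6b", local_dir="chatglm2-6b")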
Another approach, the one used here: clone the repo while skipping the large LFS files (GIT_LFS_SKIP_SMUDGE=1 leaves them as small pointer files), then fetch the weights separately:
- mkdir -p THUDM/ && cd THUDM/
- GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/THUDM/chatglm2-6b
Then download the model files that the ChatGLM2 authors uploaded to the Tsinghua cloud drive:
https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/?p=%2Fchatglm2-6b&mode=list
and overwrite the files in THUDM/chatglm2-6b with them.
I had assumed wget could fetch them directly, but every file it downloaded came out the same size, which later made inference fail.
On Windows 10, verify each file's SHA256; the hashes must match the Git LFS Details on https://huggingface.co/THUDM/chatglm2-6b.
- C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00001-of-00007.bin SHA256
- SHA256 hash of pytorch_model-00001-of-00007.bin:
- cdf1bf57d519abe11043e9121314e76bc0934993e649a9e438a4b0894f4e6ee8
- CertUtil: -hashfile command completed successfully.
- C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00002-of-00007.bin SHA256
- SHA256 hash of pytorch_model-00002-of-00007.bin:
- 1cd596bd15905248b20b755daf12a02a8fa963da09b59da7fdc896e17bfa518c
- CertUtil: -hashfile command completed successfully.
- C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00003-of-00007.bin SHA256
- 812edc55c969d2ef82dcda8c275e379ef689761b13860da8ea7c1f3a475975c8
- C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00004-of-00007.bin SHA256
- 555c17fac2d80e38ba332546dc759b6b7e07aee21e5d0d7826375b998e5aada3
- C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00005-of-00007.bin SHA256
- cb85560ccfa77a9e4dd67a838c8d1eeb0071427fd8708e18be9c77224969ef48
- C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00006-of-00007.bin SHA256
- 09ebd811227d992350b92b2c3491f677ae1f3c586b38abe95784fd2f7d23d5f2
- C:\Users\qjfen\Downloads\chatglm2-6b>certutil -hashfile pytorch_model-00007-of-00007.bin SHA256
- 316e007bc727f3cbba432d29e1d3e35ac8ef8eb52df4db9f0609d091a43c69cb
Push the files up to the server, then verify them again under Ubuntu with sha256sum <filename> (a Python sketch that automates the comparison follows below).
Note: if the model files are corrupt, the first inference takes about 10 minutes and then fails with an index-out-of-bounds sort of error.
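A small Python sketch that automates the check, using the expected hashes listed above; run it from the directory that holds the shards:
- # Verify the downloaded shards against the Git LFS SHA256 hashes above
- import hashlib
-
- EXPECTED = {
-     "pytorch_model-00001-of-00007.bin": "cdf1bf57d519abe11043e9121314e76bc0934993e649a9e438a4b0894f4e6ee8",
-     "pytorch_model-00002-of-00007.bin": "1cd596bd15905248b20b755daf12a02a8fa963da09b59da7fdc896e17bfa518c",
-     "pytorch_model-00003-of-00007.bin": "812edc55c969d2ef82dcda8c275e379ef689761b13860da8ea7c1f3a475975c8",
-     "pytorch_model-00004-of-00007.bin": "555c17fac2d80e38ba332546dc759b6b7e07aee21e5d0d7826375b998e5aada3",
-     "pytorch_model-00005-of-00007.bin": "cb85560ccfa77a9e4dd67a838c8d1eeb0071427fd8708e18be9c77224969ef48",
-     "pytorch_model-00006-of-00007.bin": "09ebd811227d992350b92b2c3491f677ae1f3c586b38abe95784fd2f7d23d5f2",
-     "pytorch_model-00007-of-00007.bin": "316e007bc727f3cbba432d29e1d3e35ac8ef8eb52df4db9f0609d091a43c69cb",
- }
-
- for name, want in EXPECTED.items():
-     h = hashlib.sha256()
-     with open(name, "rb") as f:
-         for chunk in iter(lambda: f.read(1 << 20), b""):
-             h.update(chunk)
-     print(name, "OK" if h.hexdigest() == want else "MISMATCH")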
Switch back to the repo root for a quick smoke test:
python3.10
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained("chatglm2-6b", trust_remote_code=True)
>>> model = AutoModel.from_pretrained("chatglm2-6b", trust_remote_code=True, device='cuda')
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "你好", history=[])
>>> print(response)
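Since model.chat returns the updated history, multi-turn dialogue just means passing that history back in. A minimal sketch, reusing the model and tokenizer from the session above (the follow-up prompt is the example from THUDM's README):
- # Multi-turn chat: thread `history` through successive calls
- response, history = model.chat(tokenizer, "你好", history=[])
- print(response)
- response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
- print(response)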
- (base) root@ubuntuserver:~/work/ChatGLM2-6B/chatglm2-6b# nvidia-smi
- Mon Sep 11 07:12:21 2023
- +-----------------------------------------------------------------------------+
- | NVIDIA-SMI 510.54 Driver Version: 510.54 CUDA Version: 11.6 |
- |-------------------------------+----------------------+----------------------+
- | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
- | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
- | | | MIG M. |
- |===============================+======================+======================|
- | 0 NVIDIA GeForce ... Off | 00000000:00:07.0 Off | N/A |
- | 30% 41C P2 159W / 350W | 13151MiB / 24576MiB | 38% Default |
- | | | N/A |
- +-------------------------------+----------------------+----------------------+
-
- +-----------------------------------------------------------------------------+
- | Processes: |
- | GPU GI CI PID Type Process name GPU Memory |
- | ID ID Usage |
- |=============================================================================|
- | 0 N/A N/A 55025 C python3.10 13149MiB |
- +-----------------------------------------------------------------------------+
- (base) root@ubuntuserver:~/work/ChatGLM2-6B/chatglm2-6b#
vim cli_demo.py
Change the model path to chatglm2-6b and it is ready to test:
User: hello
ChatGLM: Hello! How can I assist you today?
User: 你好
ChatGLM: 你好! How can I assist you today?
User: 请问怎么应对嵌入式工程师的中年危机 (how should an embedded engineer cope with a midlife crisis?)
Next, the web demo. Modify the model path:
vim web_demo.py
Change
- tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
- model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).cuda()
to
- tokenizer = AutoTokenizer.from_pretrained("chatglm2-6b", trust_remote_code=True)
- model = AutoModel.from_pretrained("chatglm2-6b", trust_remote_code=True).cuda()
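Note that web_demo.py is Gradio-based, while web_demo2.py below uses Streamlit. If you want the Gradio demo reachable through the same port mapping (my suggestion, not something done in the original walkthrough), its final launch call can be pointed at the mapped port; this assumes the script ends with a demo.queue().launch(...) call as in the THUDM repo:
- # Sketch: expose the Gradio demo on the mapped internal port 3389
- demo.queue().launch(server_name="0.0.0.0", server_port=3389, share=False)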
- python3.10 -m pip install streamlit -i https://pypi.tuna.tsinghua.edu.cn/simple
- python3.10 -m streamlit run web_demo2.py --server.port 3389
Internal port 3389 maps to external port 16114, so open lyg.blockelite.cn:16114 in a local browser.
For the API demo, edit api.py the same way: change
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).cuda()
to
tokenizer = AutoTokenizer.from_pretrained("chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("chatglm2-6b", trust_remote_code=True).cuda()
Also, since the 智星云 (ai-galaxy) server has port mapping set up, change the port in api.py to 3389 so it can be reached from the public internet.
Run:
python3.10 api.py
Client (on the 智星云 server itself):
curl -X POST "http://127.0.0.1:3389" \
-H 'Content-Type: application/json' \
-d '{"prompt": "你好", "history": []}'
Client 2 (from any Linux machine):
curl -X POST "http://lyg.blockelite.cn:16114" \
-H 'Content-Type: application/json' \
-d '{"prompt": "你好", "history": []}'
- (base) root@ubuntuserver:~/work/ChatGLM2-6B# python3.10 api.py
- Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████| 7/7 [00:46<00:00, 6.60s/it]
- INFO: Started server process [91663]
- INFO: Waiting for application startup.
- INFO: Application startup complete.
- INFO: Uvicorn running on http://0.0.0.0:3389 (Press CTRL+C to quit)
- [2023-09-11 08:55:21] ", prompt:"你好", response:"'你好