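I won't go into environment setup in detail here, since plenty of detailed tutorials are available online. The main thing to watch is that the CUDA version matches the torch version. After installing conda (Miniconda is recommended; see the Miniconda documentation), run the experiments in a newly created virtual environment. My environment is Python 3.8.17 + CUDA 11.7 + torch 2.0.1.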
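
To confirm that the versions line up inside the new virtual environment, a small check like the following can help. This is only a sketch: the function name `describe_environment` is my own, and torch is treated as optional so the script still runs before torch is installed.

```python
import importlib.util
import platform


def describe_environment():
    """Collect the Python, torch, and CUDA versions visible to this interpreter."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "cuda": None,
        "gpu_available": False,
    }
    # Only import torch if it is installed, so the check never crashes.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = str(torch.__version__)
        info["cuda"] = torch.version.cuda  # None on CPU-only builds
        info["gpu_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `cuda` prints as None or `gpu_available` is False, the installed torch build probably does not match the CUDA setup on the machine.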


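Common conda commands:
Create a virtual environment: conda create -n ENV_NAME python=VERSION
List existing virtual environments: conda env list
Activate a virtual environment: conda activate ENV_NAME
Delete a virtual environment: conda remove -n ENV_NAME --all
List the packages installed in the current environment: conda list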

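2. pip and conda mirrors

I recommend switching to the Southern University of Science and Technology (SUSTech) conda mirror https://mirrors.sustech.edu.cn/help/anaconda.html#introduction and pip mirror https://mirrors.sustech.edu.cn/help/pypi.html#_1-confirm-your-python-environment. Follow the instructions on those pages for the exact steps.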

3. Download the model locally

git clone https://huggingface.co/FlagAlpha/Llama2-Chinese-7b-Chat
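
One thing worth checking after the clone: Hugging Face repositories store large weight files through Git LFS, and if git-lfs is not installed the clone contains small pointer stubs instead of the real weights. The sketch below scans a directory for such stubs by looking for the Git LFS pointer header. The function name `find_lfs_pointers` and the directory name in the usage section are my own assumptions; adjust the path to wherever you cloned the model.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    pointers = []
    for path in sorted(Path(model_dir).rglob("*")):
        # Skip directories and git's own internal files.
        if not path.is_file() or ".git" in path.parts:
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_HEADER)) == LFS_HEADER:
                pointers.append(path)
    return pointers


if __name__ == "__main__":
    # Assumed location: the folder created by the git clone command.
    model_dir = Path("Llama2-Chinese-7b-Chat")
    if model_dir.is_dir():
        stubs = find_lfs_pointers(model_dir)
        if stubs:
            print("Still LFS pointers (weights not downloaded):")
            for stub in stubs:
                print("  ", stub)
        else:
            print("No LFS pointer stubs found.")
```

If stubs are reported, installing git-lfs and running `git lfs pull` inside the cloned folder should fetch the actual files.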

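4. Download and set up gradio

1. Download gradio_demo.py and requirements.txt from the link to the server (adjust the paths used in this post to match your own directory layout): https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/inference/gradio_demo.py

2. Change the torch version in requirements.txt to 2.0.1, then install from requirements.txt: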

pip install -r requirements.txt
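
Editing the torch line by hand is fine, but if you prefer to script it, something like the sketch below would do it. The helper name `pin_requirement` is hypothetical, and the sample contents in the comments are made up for illustration, not the real contents of the repository's requirements.txt. Run it before the pip install step.

```python
import re
from pathlib import Path


def pin_requirement(text, package, version):
    """Return requirements text with `package` pinned to `version`.

    Lines for other packages (including ones that merely start with the
    same letters, such as torchvision) are left unchanged.
    """
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*([=<>!~].*)?$", re.IGNORECASE
    )
    out = []
    for line in text.splitlines():
        out.append(f"{package}=={version}" if pattern.match(line) else line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Assumes requirements.txt sits in the current directory.
    req = Path("requirements.txt")
    if req.exists():
        req.write_text(pin_requirement(req.read_text(), "torch", "2.0.1"))
        print(req.read_text())
```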

3. Comment out lines 59, 60, and 61 of gradio_demo.py, then manually install gradio and the packages that gradio_demo.py imports:

Install gradio:

pip install gradio

Install bitsandbytes:

pip install bitsandbytes

Install accelerate:

pip install accelerate

Install scipy:

pip install scipy
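
Before launching the demo, it can save a failed start to confirm that the four packages installed above are actually visible to the interpreter in the active environment. A minimal sketch, where `missing_packages` is a name I made up:

```python
import importlib.util

# The packages installed manually in this step.
REQUIRED = ["gradio", "bitsandbytes", "accelerate", "scipy"]


def missing_packages(names):
    """Return the names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing, try: pip install " + " ".join(missing))
    else:
        print("All listed packages can be found.")
```

This only checks that the packages can be located, not that they import without errors (bitsandbytes in particular may still complain about the CUDA setup at import time).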

5. Run the model with gradio


cd llama-2
python gradio_demo.py --base_model /home/yrgu/llm/model/FlagAlpha_Llama2-Chinese-7b-Chat --tokenizer_path /home/yrgu/llm/model/FlagAlpha_Llama2-Chinese-7b-Chat --gpus 0

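Result: (this should be run from inside the llama-2 folder; I took the screenshot in the wrong directory)

(Note: it seems a VPN is needed to generate gradio's external share link.)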

6. text generation webui

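This is another graphical, interactive Web UI for large models. It makes it easy to chat with models, download models, train and fine-tune them, and so on, and the official one-click installer is very convenient. For a detailed tutorial, see the links below:

oobabooga/text-generation-webui: A Gradio web UI for Large Language Models. Supports transformers, GPTQ, llama.cpp (GGUF), Llama models. (github.com)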

https://www.wpsshop.cn/w/盐析白兔/article/detail/292159?site