# Strip Windows (CRLF) line endings from the scripts. A ^M typed as the two characters "^" and "M" would match a leading letter M rather than a carriage return, so \r is used instead.
sed -i 's/\r$//' scripts/run_for_local_option.sh
sed -i 's/\r$//' scripts/run_for_cloud_option.sh
sed -i 's/\r$//' run.sh
sed -i 's/\r$//' close.sh
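The same cleanup can be wrapped in a small helper that skips missing files and only rewrites files that contain a carriage return. This is a minimal sketch, not part of QAnything: the function name strip_cr is made up for illustration, and it assumes GNU sed (as shipped with Ubuntu) for the -i option.

```shell
#!/bin/sh
# strip_cr: hypothetical helper that removes a trailing carriage return
# from every line of the given files (GNU sed assumed, as on Ubuntu).
strip_cr() {
    cr=$(printf '\r')
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skipped (not found): $f" >&2
            continue
        fi
        # Rewrite only files that actually contain a carriage return.
        if grep -q "$cr" "$f"; then
            sed -i "s/$cr\$//" "$f"
            echo "converted: $f"
        fi
    done
}
```

Usage from the QAnything directory: strip_cr scripts/run_for_local_option.sh scripts/run_for_cloud_option.sh run.sh close.sh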
sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo service docker stop
sudo service docker start
sudo apt install docker-compose
# Test Docker
sudo docker run --runtime=nvidia --rm -it --name tensorflow-1.14.0 tensorflow/tensorflow:1.14.0-gpu-py3
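Before running the QAnything scripts it can help to confirm that the tools installed above are actually on PATH. The helper below is only a sketch; the name missing_tools is hypothetical, and which tools to check is up to you.

```shell
#!/bin/sh
# missing_tools: hypothetical helper that prints each named command
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Usage: missing_tools docker docker-compose curl nvidia-smi (empty output means every listed command was found).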
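- Run the script:
- sudo bash ./run.sh -c local -i 0 -b hf -m Qwen-7B-QAnything -t qwen-7b-qanything (This hung with -b hf. I confirmed the cause was a faulty model download: I had downloaded the model from ModelScope (魔搭) and it never finished loading on the GPU. After switching to the chatglm3-6b model it worked.)
- This step pulls Docker resources; there are many Docker images to download.
- Choose the -b parameter carefully, because some values may not work. With chatglm3-6b, vllm would not run for me, hf worked, and default failed.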
2. urllib3.exceptions.ProtocolError: ('Connection aborted.', PermissionError(13, 'Permission denied'))
4. After installing nvidia-smi, this error appears: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
1. sudo add-apt-repository ppa:graphics-drivers/ppa --yes
2. sudo apt update
3. sudo apt install nvidia-driver-550 # the latest driver at the time of writing
1. sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^libnvidia-.*'
sudo apt-get remove --purge '^cuda-.*'
2. wget https://developer.download.nvidia.com/compute/cuda/12.3.2/local_installers/cuda_12.3.2_545.23.08_linux.run
3. sudo sh cuda_12.3.2_545.23.08_linux.run
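After reinstalling the driver, a quick probe can tell apart "nvidia-smi is not installed" from "nvidia-smi is installed but cannot reach the driver", which is the situation in item 4. This is a sketch only; driver_status is a hypothetical name, and the probe command is a parameter so the function can be exercised on a machine without a GPU.

```shell
#!/bin/sh
# driver_status: hypothetical helper. Runs the given probe command
# (nvidia-smi by default) and reports whether it succeeded.
driver_status() {
    probe=${1:-nvidia-smi}
    if ! command -v "$probe" >/dev/null 2>&1; then
        echo "not-installed"
    elif "$probe" >/dev/null 2>&1; then
        echo "ok"
    else
        echo "installed-but-failing"
    fi
}
```

Calling driver_status with no argument probes nvidia-smi; "installed-but-failing" corresponds to the NVIDIA-SMI error quoted in item 4.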
5. docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or ,Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
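Both Docker errors in this list (the Permission denied in item 2 and the FileNotFoundError here) typically arise when the client tries to open the Docker socket. The sketch below classifies the socket state so the two cases can be told apart. The name docker_socket_status is hypothetical; the default path is the one shown in the error message, and a different installation may use another path.

```shell
#!/bin/sh
# docker_socket_status: hypothetical helper. Classifies the Docker
# socket (default path taken from the error message above).
docker_socket_status() {
    sock=${1:-/var/run/docker.sock}
    if [ ! -S "$sock" ]; then
        echo "missing"          # daemon not running, or a different socket path
    elif [ ! -r "$sock" ] || [ ! -w "$sock" ]; then
        echo "no-permission"    # current user cannot read/write the socket
    else
        echo "ok"
    fi
}
```

"missing" suggests starting or restarting the Docker service; "no-permission" suggests running the command with sudo or with a user that is allowed to access the socket.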
5. Error at run time: nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/libnvidia-container/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
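sudo snap restart docker # Docker was installed via snap here; otherwise restart the service with systemctl (systemctl restart docker)
docker run --rm -it --gpus all ubuntu:22.04 nvidia-smi # failed with the same error as above
nvidia-smi # did not run
After reinstalling with the latest installation method it worked. Reference: https://blog.csdn.net/SUNbrightness/article/details/116783604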
mv /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1.bak
mv /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so.1.bak
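Renaming system libraries is easier to undo if the rename only happens when the file exists and no earlier backup would be overwritten. The following is a cautious sketch of the two mv commands above, not the author's procedure; backup_if_present is a hypothetical name.

```shell
#!/bin/sh
# backup_if_present: hypothetical helper. Renames FILE to FILE.bak only
# when FILE exists and no backup is already in place.
backup_if_present() {
    for f in "$@"; do
        # -L also catches dangling symlinks, which -e reports as absent.
        if [ ! -e "$f" ] && [ ! -L "$f" ]; then
            echo "absent: $f"
        elif [ -e "$f.bak" ] || [ -L "$f.bak" ]; then
            echo "backup exists, left untouched: $f"
        else
            mv "$f" "$f.bak" && echo "renamed: $f -> $f.bak"
        fi
    done
}
```

Because a shell function cannot be passed to sudo directly, this would need to run from a root shell, for example: backup_if_present /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so.1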
Problem 6: please install FlashAttention https://github.com/Dao-AILab/flash-attention
Problem 7: RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
Problem 8: ValueError: Cannot find any model weight files. Please check your (cached) weight path: /model_repos/CustomLLM/Qwen-7B-QAnything
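Problems 7 and 8 both point at the model directory, and the note on the run.sh command earlier already traced one hang to a faulty model download. A quick listing of the weight files and their sizes can show whether the directory is empty or a download was cut short. This is a sketch under assumptions: list_weight_files is a hypothetical name, the extensions .safetensors, .bin and .pt are a guess at what the loader looks for, and the path in the error message is the path inside the container, so the host directory to inspect may differ.

```shell
#!/bin/sh
# list_weight_files: hypothetical helper. Prints files in DIR whose
# names end in .safetensors, .bin or .pt, with their size in bytes.
list_weight_files() {
    dir=$1
    for f in "$dir"/*.safetensors "$dir"/*.bin "$dir"/*.pt; do
        [ -f "$f" ] || continue      # unmatched glob patterns stay literal
        size=$(wc -c < "$f" | tr -d ' ')
        echo "$size $(basename "$f")"
    done
}
```

Empty output matches the "Cannot find any model weight files" case; a zero-byte or unexpectedly small file hints at an incomplete download, which is one common cause of the zip archive error in Problem 7.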
Problem 9: import flash_attn rms_norm fail, import flash_attn rotary fail, import flash_attn fail
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention && pip install . (this install is too slow; use pip install flash-attn --no-build-isolation instead)
# Below are optional. Installing them might be slow.
pip install csrc/layer_norm (very CPU-intensive; expect to wait)
pip install csrc/rotary
Problem 10: /usr/bin/ld: cannot find /mnt/c/QAnything/flash-attention/csrc/rotary/build/temp.linux-x86_64-cpython-310/rotary.o: No such file or directory
Problem 11: run.sh does not succeed when set to -b hf or vllm
Problem 12: requests.exceptions.ReadTimeout: HTTPConnectionPool(host='0.0.0.0', port=36001): Read timed out. (read timeout=60)
Problem 13: After installing Ubuntu under WSL, the computer's hard disk is not mounted
Problem 14: Failed to install npm dependencies.
Problem 15: Failed to build the front end.
file:///mnt/c/QAnything/QAnything/front_end/node_modules/vite/bin/vite.js:7
await import('source-map-support').then((r) => r.default.install())
^^^^^
SyntaxError: Unexpected reserved word
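"Unexpected reserved word" on an await at the top level of vite.js typically means the installed Node.js is too old to support top-level await. The sketch below compares a Node.js version string against a minimum major version. The function names are hypothetical, and the threshold of 18 in the usage line is an assumption; the actual minimum depends on the Vite version used by front_end.

```shell
#!/bin/sh
# node_major: print the major version from a string such as "v18.17.0".
node_major() {
    v=${1#v}            # drop the leading "v"
    echo "${v%%.*}"     # keep everything before the first dot
}

# node_is_at_least: succeed when VERSION's major number is >= MIN.
node_is_at_least() {
    [ "$(node_major "$1")" -ge "$2" ]
}
```

Usage: node_is_at_least "$(node --version)" 18 || echo "Node.js may be too old for this Vite version"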
Problem 16: peer closed connection without sending complete message body (incomplete chunked read)
2024-02-26 12:06:58 | ERROR | stderr | ERROR: Exception in ASGI application
2024-02-26 12:06:58 | ERROR | stderr | Traceback (most recent call last):
2024-02-26 12:06:58 | ERROR | stderr | File "/usr/local/lib/python3.10/dist-packages/starlette/responses.py", line 261, in __call__
2024-02-26 12:06:58 | ERROR | stderr | await wrap(partial(self.listen_for_disconnect, receive))
2024-02-26 12:06:58 | ERROR | stderr | File "/usr/local/lib/python3.10/dist-packages/starlette/responses.py", line 257, in wrap
2024-02-26 12:06:58 | ERROR | stderr | await func()
2024-02-26 12:06:58 | ERROR | stderr | File "/usr/local/lib/python3.10/dist-packages/starlette/responses.py", line 234, in listen_for_disconnect
2024-02-26 12:06:58 | ERROR | stderr | message = await receive()
2024-02-26 12:06:58 | ERROR | stderr | File "/usr/local/lib/python3.10/dist-packages/uvicorn/protocols/http/httptools_impl.py", line 587, in receive
2024-02-26 12:06:58 | ERROR | stderr | await self.message_event.wait()
2024-02-26 12:06:58 | ERROR | stderr | File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
2024-02-26 12:06:58 | ERROR | stderr | await fut
2024-02-26 12:06:58 | ERROR | stderr | asyncio.exceptions.CancelledError: Cancelled by cancel scope 7fa254ef6140