This article assumes some basic knowledge of Docker and the ability to modify Python code; it is not a beginner tutorial.
Step 1: download the official project
GitHub - THUDM/GLM-4: GLM-4 series: Open Multilingual Multimodal Chat LMs
Then, in the root directory of the official project, add a Dockerfile with the following content:
# Use nvidia/cuda:12.4.1-devel-ubuntu22.04 as the base image
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04 AS dev

# Update the package index and install pip for Python 3, plus Git
RUN apt-get update -y \
    && apt-get install -y python3-pip git \
    && rm -rf /var/lib/apt/lists/*

# Run ldconfig so the CUDA compat libraries can be found
RUN ldconfig /usr/local/cuda-12.4/compat/

# Set the working directory
WORKDIR /glm4

# Copy the project folders and files into the container
COPY basic_demo /glm4/basic_demo
COPY composite_demo /glm4/composite_demo
COPY finetune_demo /glm4/finetune_demo
COPY LICENSE /glm4/LICENSE
COPY resources /glm4/resources

# Switch into the basic_demo directory
WORKDIR /glm4/basic_demo

# Install the Python dependencies listed in requirements.txt (using the Tsinghua PyPI mirror)
RUN python3 -m pip install --no-cache-dir -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/

# Return to the project root
WORKDIR /glm4

# Declare the port the container listens on
EXPOSE 8000

Then build the image from the project root (the name and tag below are placeholders; pick your own):
docker build -t you_need_name/glm4-9b:tag .
Wait for the build to finish.
Once the image is built, you also need to download the model weights ahead of time.
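For example, a minimal sketch of fetching the weights with huggingface_hub (modelscope works as well); the local_dir below is the host path used later in docker-compose.yml, so adjust it to your own setup:

# a sketch, assuming huggingface_hub is installed (pip install huggingface_hub)
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="THUDM/glm-4-9b-chat",
    # host directory that will be mounted into the container as /glm4/glm-4-9b-chat
    local_dir="/home/aicode/logs/AI_Order/glm-4-9b-chat",
)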
After downloading the model, modify openai_api_server.py as shown below. These changes let you set the model path and maximum model length directly through environment variables when the container starts:
MODEL_PATH = os.environ.get('LOCAL_MODEL_PATH', 'THUDM/glm-4-9b-chat')
# environment values are strings, so convert the length to int before use
MAX_MODEL_LENGTH = int(os.environ.get('LOCAL_MAX_MODEL_LENGTH', '8192'))
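For context, a hedged sketch of where these two values end up (parameter names follow vLLM's public AsyncEngineArgs API; the demo's actual call passes more arguments):

from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine

engine_args = AsyncEngineArgs(
    model=MODEL_PATH,                # the path mounted into the container
    max_model_len=MAX_MODEL_LENGTH,  # expects an int, hence the int(...) above
    trust_remote_code=True,          # GLM-4 ships custom modeling code
)
engine = AsyncLLMEngine.from_engine_args(engine_args)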
After making the changes, use the following docker-compose.yml and run 'docker compose up -d' in the directory that contains it:
version: '3.8'
services:
  glm4_vllm_server:
    image: jjck/glm4-9b:20240606
    runtime: nvidia
    ipc: host
    restart: always
    environment:
      - LOCAL_MODEL_PATH=/glm4/glm-4-9b-chat
      - LOCAL_MAX_MODEL_LENGTH=8192
    ports:
      - "8101:8000"
    volumes:
      - "/etc/localtime:/etc/localtime:ro"
      - "/home/aicode/logs/AI_Order/glm-4-9b-chat:/glm4/glm-4-9b-chat"
      - "/home/pythonproject/GLM-4-main/basic_demo:/glm4/basic_demo"
    command: python3 basic_demo/openai_api_server.py
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

Of the last two volume mounts, the first is where the model is stored and the second is the code path; adjust both to match your own setup.
Alternatively, start it with docker run (note that the environment variables, --ipc=host, and the trailing server command here mirror the compose file above):
docker run --runtime nvidia --gpus device=0 \
    --name glm4_vllm_server \
    --ipc=host \
    -p 8101:8000 \
    -e LOCAL_MODEL_PATH=/glm4/glm-4-9b-chat \
    -e LOCAL_MAX_MODEL_LENGTH=8192 \
    -v /etc/localtime:/etc/localtime:ro \
    -v /home/aicode/logs/AI_Order/glm-4-9b-chat:/glm4/glm-4-9b-chat \
    -v /home/pythonproject/GLM-4-main/basic_demo:/glm4/basic_demo \
    -itd xxx/glm4-9b:20240606 python3 basic_demo/openai_api_server.py
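Once the container is up, you can send a quick test request to the OpenAI-compatible endpoint. A minimal smoke test, assuming the demo registers the model name "glm-4" and does not validate the API key:

# a sketch, assuming the server is reachable on localhost:8101
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8101/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="glm-4",  # model id assumed from the demo; check your server's /v1/models
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=64,
)
print(resp.choices[0].message.content)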
Problems I ran into while trying their api_server demo: this official Tsinghua demo can hit an async event-loop problem that makes user requests fail. The error log (from docker compose logs, container-name prefix stripped for readability) looks like this:
File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
File "/glm4/basic_demo/openai_api_server.py", line 344, in create_chat_completion
    async for response in generate_stream_glm4(gen_params):
File "/glm4/basic_demo/openai_api_server.py", line 199, in generate_stream_glm4
    async for output in engine.generate(inputs=inputs, sampling_params=sampling_params, request_id="glm-4-9b"):
File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 662, in generate
    async for output in self._process_request(
File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 756, in _process_request
    stream = await self.add_request(
File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 561, in add_request
    self.start_background_loop()
File "/usr/local/lib/python3.10/dist-packages/vllm/engine/async_llm_engine.py", line 431, in start_background_loop
    raise AsyncEngineDeadError(
vllm.engine.async_llm_engine.AsyncEngineDeadError: Background loop has errored already.
So, do not put this demo into production. If you need it in production, wait for the official fix.
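One detail worth noting in the traceback: every request passes the same hardcoded request_id="glm-4-9b" to engine.generate(). vLLM's AsyncLLMEngine rejects duplicate request ids, and once its background loop errors, all later requests fail with AsyncEngineDeadError, so concurrent requests sharing one id are a plausible trigger. A hedged sketch of a possible mitigation inside generate_stream_glm4 (my own guess, not a verified fix):

# a sketch under the assumption above, not a verified fix
import uuid

async def generate_with_unique_id(engine, inputs, sampling_params):
    """Call engine.generate() with a fresh request_id for every request."""
    request_id = f"glm-4-9b-{uuid.uuid4().hex}"  # unique per call, never reused
    async for output in engine.generate(
        inputs=inputs,
        sampling_params=sampling_params,
        request_id=request_id,
    ):
        yield output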