
Large Model Learning and Practice Notes (9)


I. Deployment with LMDeploy

1. Local chat deployment

Use LMDeploy to deploy the InternLM-Chat-7B model in local chat mode and have it generate a 300-word short story.
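With lmdeploy v0.1.0, local chat runs through an offline TurboMind conversion followed by a terminal session. A minimal sketch, assuming the HF weights live at /root/share/model_repos/internlm-chat-7b (a hypothetical path; substitute your own):

# Convert the HF weights into a TurboMind workspace (writes ./workspace)
lmdeploy convert internlm-chat-7b /root/share/model_repos/internlm-chat-7b

# Chat with the converted model in the terminal
lmdeploy chat turbomind ./workspace

Asking the model for a 300-word short story at the prompt exercises the deployment end to end.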

2. Deployment via the API

Run:
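A launch command along the following lines exposes the same workspace over HTTP; the port, instance count, and tensor-parallel degree here are assumptions rather than values from the original run:

# Serve the TurboMind workspace as an HTTP API on port 23333
lmdeploy serve api_server ./workspace --server_name 0.0.0.0 --server_port 23333 --instance_num 64 --tp 1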

Result:
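With the server up, lmdeploy's bundled terminal client can drive it; a sketch, assuming the server address from the launch command above:

# Open an interactive session against the running API server
lmdeploy serve api_client http://localhost:23333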

GPU memory usage:
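Memory consumption can be read directly from the GPU driver, e.g.:

# Show per-GPU memory usage alongside running processes
nvidia-smi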

II. Error and Solution

An error occurred when installing lmdeploy from source with the following command, during the build of the flash-attn dependency.

1. Source installation command

pip install 'lmdeploy[all]==v0.1.0'

2. Error message:

Building wheels for collected packages: flash-attn
  Building wheel for flash-attn (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [9 lines of output]
      fatal: not a git repository (or any of the parent directories): .git
      torch.__version__ = 2.0.1
      running bdist_wheel
      Guessing wheel URL: https://github.com/Dao-AILab/flash-attention/releases/download/v2.4.2/flash_attn-2.4.2+cu118torch2.0cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
      error: <urlopen error Tunnel connection failed: 503 Service Unavailable>
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for flash-attn
  Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects

3. Solution

(1) Download the prebuilt wheel that matches your environment from https://github.com/Dao-AILab/flash-attention/releases/
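For a CUDA 11.7 / PyTorch 2.0 / Python 3.10 environment, the download might look like the sketch below; the filename is a template whose tags must match your local CUDA, torch, ABI, and Python versions:

# Fetch the prebuilt wheel from the flash-attention release page
wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.3.5/flash_attn-2.3.5+cu117torch2.0cxx11abiFALSE-cp310-cp310-linux_x86_64.whl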

(2) Install it with pip:

pip install flash_attn-2.3.5+cu117torch2.0cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
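The tags in the wheel name encode the build it targets: cu117 (CUDA 11.7), torch2.0, cxx11abiFALSE, and cp310 (Python 3.10). Once flash-attn is installed this way, re-running pip install 'lmdeploy[all]==v0.1.0' picks up the already-installed wheel instead of attempting to build it from source.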

4. Reference

https://github.com/Dao-AILab/flash-attention/issues/224
