
Problems deploying xinference on Windows 10

The official xinference installation method:

https://inference.readthedocs.io/zh-cn/latest/getting_started/installation.html

pip install "xinference[all]"

This single command is supposed to download and install everything.
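If you do not need every backend, the same installation page also documents per-backend extras; for example, to pull in only the transformers backend (check that page for the full list of extras):

pip install "xinference[transformers]"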


I first did the one-step deployment on macOS and hit no obstacles along the way, but the models were too large for the Mac's memory.

So I set up a Windows 10 machine as well. Under Windows 10, however, the same command

pip install "xinference[all]"

ran into numerous problems during installation. The problems I encountered and their fixes are described below:

llama-cpp-python

Installing the llama-cpp-python package failed with the following error:

Collecting llama-cpp-python
Using cached llama_cpp_python-0.2.28.tar.gz (9.4 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: typing-extensions>=4.5.0 in d:\software\anaconda3\lib\site-packages (from llama-cpp-python) (4.8.0)
Collecting diskcache>=5.6.1
Using cached diskcache-5.6.3-py3-none-any.whl (45 kB)
Requirement already satisfied: numpy>=1.20.0 in d:\software\anaconda3\lib\site-packages (from llama-cpp-python) (1.23.5)
Building wheels for collected packages: llama-cpp-python
Building wheel for llama-cpp-python (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for llama-cpp-python (pyproject.toml) did not run successfully.
exit code: 1
╰─> [20 lines of output]
*** scikit-build-core 0.7.1 using CMake 3.28.1 (wheel)
*** Configuring CMake...
2024-01-15 02:55:12,546 - scikit_build_core - WARNING - Can't find a Python library, got libdir=None, ldlibrary=None, multiarch=None, masd=None
loading initial cache file C:\Windows\TEMP\tmpyjbtivnu\build\CMakeInit.txt
-- Building for: NMake Makefiles
CMake Error at CMakeLists.txt:3 (project):
Running
'nmake' '-?'
failed with:
no such file or directory
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
*** CMake configuration failed
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llama-cpp-python
Failed to build llama-cpp-python
ERROR: Could not build wheels for llama-cpp-python, which is required to install pyproject.toml-based projects

Solution

The build is looking for a C/C++ compiler (CMake cannot find nmake or the MSVC compilers) and failing.

I fixed it by installing the C++ build tools that ship with Visual Studio 2022.

First, go to the Visual Studio site

https://visualstudio.microsoft.com/zh-hans/vs/

and download the installer.

During installation, select the "Desktop development with C++" workload; it needs roughly 10 GB of disk space.

Once Visual Studio finished installing, I reran the pip command and this error was gone.
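To confirm the wheel actually built and imports cleanly, a quick check (my own addition, not from the original post; recent llama-cpp-python releases expose a __version__ attribute):

python -c "import llama_cpp; print(llama_cpp.__version__)"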

chatglm-cpp

Installing chatglm-cpp failed with:

subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--config', 'Release', '-j']' returned non-zero exit status 1.
      [end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for chatglm-cpp
Failed to build chatglm-cpp
ERROR: Could not build wheels for chatglm-cpp, which is required to install pyproject.toml-based projects

Solution

Skip the source build and download a prebuilt wheel of the matching version straight from the project's releases page:

https://github.com/li-plus/chatglm.cpp/releases

The cpXX in the filename is the Python version. I am on Python 3.9, so I downloaded

chatglm_cpp-0.3.1-cp39-cp39-win_amd64.whl

Then, in the directory containing that file, install the package with:

pip install chatglm_cpp-0.3.1-cp39-cp39-win_amd64.whl
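If you are unsure which cpXX tag matches your interpreter, this one-liner (my own helper, not from the original post) prints it:

python -c "import sys; print('cp%d%d' % sys.version_info[:2])"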

Encoding error

Running

xinference-local --host 127.0.0.1 --port 9997

raised the following error:

Traceback (most recent call last):
File "D:\workTool\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "D:\workTool\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "D:\workTool\Scripts\xinference-local.exe\__main__.py", line 7, in <module>
File "D:\workTool\lib\site-packages\click\core.py", line 1128, in __call__
return self.main(*args, **kwargs)
File "D:\workTool\lib\site-packages\click\core.py", line 1053, in main
rv = self.invoke(ctx)
File "D:\workTool\lib\site-packages\click\core.py", line 1395, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "D:\workTool\lib\site-packages\click\core.py", line 754, in invoke
return __callback(*args, **kwargs)
File "D:\workTool\lib\site-packages\xinference\deploy\cmdline.py", line 225, in local
start_local_cluster(
File "D:\workTool\lib\site-packages\xinference\deploy\cmdline.py", line 112, in start_local_cluster
main(
File "D:\workTool\lib\site-packages\xinference\deploy\local.py", line 125, in main
from ..api import restful_api
File "D:\workTool\lib\site-packages\xinference\api\restful_api.py", line 27, in <module>
import gradio as gr
File "D:\workTool\lib\site-packages\gradio\__init__.py", line 3, in <module>
import gradio._simple_templates
File "D:\workTool\lib\site-packages\gradio\_simple_templates\__init__.py", line 1, in <module>
from .simpledropdown import SimpleDropdown
File "D:\workTool\lib\site-packages\gradio\_simple_templates\simpledropdown.py", line 6, in <module>
from gradio.components.base import FormComponent
File "D:\workTool\lib\site-packages\gradio\components\__init__.py", line 40, in <module>
from gradio.components.multimodal_textbox import MultimodalTextbox
File "D:\workTool\lib\site-packages\gradio\components\multimodal_textbox.py", line 28, in <module>
class MultimodalTextbox(FormComponent):
File "D:\workTool\lib\site-packages\gradio\component_meta.py", line 198, in __new__
create_or_modify_pyi(component_class, name, events)
File "D:\workTool\lib\site-packages\gradio\component_meta.py", line 92, in create_or_modify_pyi
source_code = source_file.read_text()
File "D:\workTool\lib\pathlib.py", line 1267, in read_text
return f.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xb2 in position 1972: illegal multibyte sequence

The culprit is gradio 4.22: as the traceback shows, component_meta.py calls read_text() with no explicit encoding, so on a Chinese-locale Windows system Python falls back to the GBK codec and cannot decode gradio's UTF-8 source files. Installing an older version avoids it:

pip install gradio==4.21.0

With that installed, the server starts normally.
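For the curious, here is a minimal sketch of the underlying behavior, my own illustration rather than anything from the gradio code: Path.read_text() with no explicit encoding uses the locale's preferred encoding, which is GBK on a Chinese-locale Windows install, so reading a UTF-8 file that contains non-ASCII bytes fails exactly like the traceback above.

# Illustration only: why read_text() can raise UnicodeDecodeError on a GBK locale
from pathlib import Path

p = Path("demo_utf8.py")
p.write_bytes("# 中文注释\n".encode("utf-8"))  # UTF-8 bytes with non-ASCII content

print(p.read_text(encoding="utf-8"))  # fine: the decoder matches the bytes
print(p.read_text())  # uses the locale default; raises UnicodeDecodeError under GBK

Running Python in UTF-8 mode (environment variable PYTHONUTF8=1) is another commonly suggested workaround for this class of error, though I have not verified it against this particular gradio version.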

Model not using the GPU

After launching a model everything felt sluggish, so I tried the following command:

python -c "import torch; print(torch.cuda.is_available())"

It printed False, meaning CUDA was not enabled at all.

Checking the installed torch version confirmed it was a CPU-only build, with no CUDA acceleration.

It turned out to be a conda problem: the torch that conda installs by default is the CPU build, so the CUDA build has to be downloaded separately.

Solution

1. Uninstall the old packages

In conda's site-packages directory, \anaconda3\Lib\site-packages, find and delete the two folders torch and torch-2.2.1.dist-info.
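(Deleting the folders by hand works, but the more conventional route, which should accomplish the same cleanup, is:)

pip uninstall torch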

Then go to the PyTorch site to download a CUDA build:

https://pytorch.org/get-started/locally/#no-cuda-1

Select the configuration that matches your system to get the download link.

Or run the command directly in the console:

pip3 install torch --index-url https://download.pytorch.org/whl/cu121

Once that installs, run the check again:

python -c "import torch; print(torch.cuda.is_available())"

It now prints True, and CUDA acceleration works.
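To double-check which CUDA version and GPU torch picked up (my own addition; torch.version.cuda and torch.cuda.get_device_name are both standard torch APIs):

python -c "import torch; print(torch.version.cuda, torch.cuda.get_device_name(0))"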
