
Deploying llama.cpp on Windows


1. Download the source code and model

  # Download the source code
  git clone https://github.com/ggerganov/llama.cpp.git
  # Download the llama-7b model
  git clone https://www.modelscope.cn/skyline2006/llama-7b.git
 Check the cmake version:
  D:\pyworkspace\llama_cpp\llama.cpp\build>cmake --version
  cmake version 3.22.0-rc2
  CMake suite maintained and supported by Kitware (kitware.com/cmake).

2. Build

  # inside the llama.cpp directory
  mkdir build
  cd build
  cmake ..

Configure output:

  D:\pyworkspace\llama_cpp\llama.cpp\build>cmake ..
  -- Building for: Visual Studio 16 2019
  -- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22631.
  -- The C compiler identification is MSVC 19.29.30137.0
  -- The CXX compiler identification is MSVC 19.29.30137.0
  -- Detecting C compiler ABI info
  -- Detecting C compiler ABI info - done
  -- Check for working C compiler: D:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
  -- Detecting C compile features
  -- Detecting C compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: D:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/Hostx64/x64/cl.exe - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found Git: D:/Git/Git/cmd/git.exe (found version "2.29.2.windows.2")
  -- Looking for pthread.h
  -- Looking for pthread.h - not found
  -- Found Threads: TRUE
  -- CMAKE_SYSTEM_PROCESSOR: AMD64
  -- CMAKE_GENERATOR_PLATFORM:
  -- x86 detected
  -- Performing Test HAS_AVX_1
  -- Performing Test HAS_AVX_1 - Success
  -- Performing Test HAS_AVX2_1
  -- Performing Test HAS_AVX2_1 - Success
  -- Performing Test HAS_FMA_1
  -- Performing Test HAS_FMA_1 - Success
  -- Performing Test HAS_AVX512_1
  -- Performing Test HAS_AVX512_1 - Failed
  -- Performing Test HAS_AVX512_2
  -- Performing Test HAS_AVX512_2 - Failed
  -- Configuring done
  -- Generating done
  -- Build files have been written to: D:/pyworkspace/llama_cpp/llama.cpp/build

 Building the Release configuration failed on my machine, so I built with Debug instead; Visual Studio is used for the build:

cmake --build . --config Debug

 Build output:

  D:\pyworkspace\llama_cpp\llama.cpp\build>cmake --build . --config Debug
  Microsoft (R) Build Engine version 16.11.2+f32259642 for .NET Framework
  Copyright (C) Microsoft Corporation. All rights reserved.
  Checking Build System
  Generating build details from Git
  -- Found Git: D:/Git/Git/cmd/git.exe (found version "2.29.2.windows.2")
  Building Custom Rule D:/pyworkspace/llama_cpp/llama.cpp/common/CMakeLists.txt
  build-info.cpp
  build_info.vcxproj -> D:\pyworkspace\llama_cpp\llama.cpp\build\common\build_info.dir\Debug\build_info.lib
  Building Custom Rule D:/pyworkspace/llama_cpp/llama.cpp/CMakeLists.txt
  ggml.c

 On my machine this produced quantize.exe, main.exe, and other executables under D:\pyworkspace\llama_cpp\llama.cpp\build\bin\Debug.

3. Quantization and inference

Install the required Python dependencies:

python -m pip install -r requirements.txt

Place the downloaded llama-7b model under the models directory and run the following command; it produces a ggml-model-f16.gguf file in the llama-7b directory:

python convert.py models/llama-7b/
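A quick way to sanity-check that conversion succeeded is to read the file's header: a GGUF file starts with the 4-byte magic "GGUF" followed by a little-endian version field. The sketch below is a minimal check based on that header layout, not a full GGUF parser; the path argument is whatever convert.py wrote out:

```python
import struct

def read_gguf_header(path):
    """Return the GGUF version if the file has a valid GGUF magic."""
    with open(path, "rb") as f:
        magic = f.read(4)                       # first 4 bytes: b"GGUF"
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))  # little-endian uint32
    return version
```

For example, `read_gguf_header("models/llama-7b/ggml-model-f16.gguf")` should return the format version rather than raise.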

Quantize the generated file:

D:\pyworkspace\llama_cpp\llama.cpp\build\bin\Debug\quantize.exe ./models/llama-7b/ggml-model-f16.gguf ./models/llama-7b/ggml-model-q4_0.gguf q4_0
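The q4_0 format stores weights in blocks of 32 values, each block holding one shared scale plus a 4-bit code per value. The Python sketch below illustrates that general scheme only; it is not the actual ggml kernel (which packs two codes per byte and stores fp16 scales), just the quantize/dequantize arithmetic:

```python
def quantize_block(xs):
    """Quantize 32 floats to 4-bit codes [0, 15] plus one float scale."""
    assert len(xs) == 32
    amax = max(xs, key=abs)                    # value with largest magnitude
    d = amax / -8 if amax != 0 else 1.0        # scale so codes fit in 4 bits
    # map x/d from [-8, 8] to [0, 16], round, clamp to [0, 15]
    qs = [min(15, max(0, int(x / d + 8.5))) for x in xs]
    return d, qs

def dequantize_block(d, qs):
    """Recover approximate floats from the scale and 4-bit codes."""
    return [(q - 8) * d for q in qs]
```

Roundtripping a block through these two functions shows the cost of 4-bit storage: each value comes back within roughly one scale step of the original.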

Run inference:

D:\pyworkspace\llama_cpp\llama.cpp\build\bin\Debug\main.exe -m ./models/llama-7b/ggml-model-q4_0.gguf -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt
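Of the flags above, --repeat_penalty controls how strongly recently generated tokens are discouraged; the value 1.0 used here disables the penalty. A hedged sketch of the classic rule (divide a positive logit by the penalty, multiply a negative one, the convention llama.cpp's sampler follows), shown here as an assumption-level illustration rather than the exact sampler code:

```python
def apply_repeat_penalty(logits, recent_tokens, penalty):
    """Lower the logits of recently seen token ids by `penalty`."""
    out = list(logits)
    for t in set(recent_tokens):
        # dividing a positive logit (or multiplying a negative one)
        # by penalty > 1 always reduces that token's probability
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out
```

With penalty = 1.0 the logits pass through unchanged, which is why the command above is effectively running without repetition penalty.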
