
ERROR: Failed building wheel for pyarrow


Problem description

Installing HuggingFace datasets fails with an error.

System: macOS 10.13.6

Environment: Conda virtual environment, python==3.8.1

Command:

pip install datasets

Error output:

  CMake Error at CMakeLists.txt:268 (find_package):
  By not providing "FindArrow.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Arrow", but
  CMake did not find one.
  Could not find a package configuration file provided by "Arrow" with any of
  the following names:
  ArrowConfig.cmake
  arrow-config.cmake
  Add the installation prefix of "Arrow" to CMAKE_PREFIX_PATH or set
  "Arrow_DIR" to a directory containing one of the above files. If "Arrow"
  provides a separate development package or SDK, be sure it has been
  installed.
  -- Configuring incomplete, errors occurred!
  See also "/private/var/folders/sd/6d0w7lz121v38498dngh6y540000gn/T/pip-install-ewqnh087/pyarrow_673989b028794d389cba544b08d75516/build/temp.macosx-10.9-x86_64-cpython-38/CMakeFiles/CMakeOutput.log".
  error: command '/usr/local/bin/cmake' failed with exit code 1
  [end of output]
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for pyarrow
  Failed to build pyarrow
  ERROR: Could not build wheels for pyarrow, which is required to install pyproject.toml-based projects
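
The log shows pip compiling pyarrow from source and CMake failing to locate an Arrow C++ installation. When a longer install log scrolls past, it can help to pull out just the packages whose wheel build failed. The helper below is a minimal sketch: `failed_wheels` and `pip.log` are hypothetical names, not part of pip.

```shell
# Hypothetical helper: list packages whose wheel build failed in a saved pip log.
failed_wheels() {
  # $1 = path to a file containing pip's output
  grep -o 'Failed building wheel for [A-Za-z0-9_.-]*' "$1" \
    | awk '{ print $NF }' \
    | sort -u
}

# Usage (assumes the install output was saved first):
#   pip install datasets 2>&1 | tee pip.log
#   failed_wheels pip.log
```

For the log in this post the helper would report only pyarrow, which is the package the rest of the post tries to build.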

Solution

Using the Conda virtual environment, first fetch the test data and set the environment variables:

  $ cd /Users/../anaconda3/envs/env_name  # first change to the virtual environment directory
  $ git clone https://github.com/apache/arrow.git
  $ pushd arrow
  $ git submodule update --init
  $ export PARQUET_TEST_DATA="${PWD}/cpp/submodules/parquet-testing/data"
  $ export ARROW_TEST_DATA="${PWD}/testing/data"
  $ popd
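
Both variables are built from `${PWD}`, so they only point at real directories if the exports run inside the `arrow` checkout and the submodules were actually fetched. A quick check, as a sketch (`check_dir` is a hypothetical helper name):

```shell
# Hypothetical helper: report whether a variable's value is an existing directory.
check_dir() {
  # $1 = variable name (used as a label only), $2 = its value
  if [ -d "$2" ]; then
    echo "OK      $1=$2"
  else
    echo "MISSING $1=$2"
    return 1
  fi
}

# Usage, after the exports:
#   check_dir PARQUET_TEST_DATA "$PARQUET_TEST_DATA"
#   check_dir ARROW_TEST_DATA "$ARROW_TEST_DATA"
```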

Install the Arrow C++ and PyArrow dependencies from conda-forge. This step fails with `CondaValueError: Malformed version string '~': invalid character(s).`

  $ conda activate env_name  # activate the virtual environment
  $ conda install -c conda-forge \
      --file arrow/ci/conda_env_unix.txt \
      --file arrow/ci/conda_env_cpp.txt \
      --file arrow/ci/conda_env_python.txt \
      --file arrow/ci/conda_env_gandiva.txt \
      compilers  # downloaded from the channel
  $ export ARROW_HOME=$CONDA_PREFIX
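
The `CondaValueError` names a `~` character in a version string, so one way to narrow it down is to look for that character in the requirement files passed with `--file`. This is only a sketch: `tilde_specs` is a hypothetical helper, and the post does not establish whether the offending line lives in these files or somewhere else.

```shell
# Hypothetical helper: print file:line:text for non-comment lines containing '~'.
tilde_specs() {
  # "$@" = requirement files; /dev/null makes grep print file names even for one file
  grep -n '~' "$@" /dev/null | grep -v '^[^:]*:[0-9]*:[[:space:]]*#'
}

# Usage:
#   tilde_specs arrow/ci/conda_env_*.txt
```

If nothing is printed, the requirement files are probably not the source of the malformed version string.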

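Next, try a system virtual environment instead: install the Arrow C++ dependencies and set the environment variables. Installing into the existing virtual environment exposed dependency conflicts among many deep-learning packages, so a new virtual environment is needed: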

  • lamini 1.0.2 requires pydantic==1.10.*, but gradio 4.4.0 requires pydantic>=2.0
  • tensorflow 2.6.5 requires typing-extensions<3.11,>=3.7, but most other packages require typing-extensions>=4.7.0
  • tensorflow 2.6.5 requires numpy~=1.19.2, but most other packages require numpy>=1.22.0

  $ brew update && brew bundle --file=arrow/cpp/Brewfile
  $ python3 -m venv pyarrow-dev  # create a new virtual environment
  $ source ./pyarrow-dev/bin/activate
  $ pip install -r arrow/python/requirements-build.txt  # includes oldest-supported-numpy, which cannot be used with Conda or Homebrew
  $ mkdir dist
  $ export ARROW_HOME=$(pwd)/dist
  $ export LD_LIBRARY_PATH=$(pwd)/dist/lib:$LD_LIBRARY_PATH
  $ export CMAKE_PREFIX_PATH=$ARROW_HOME:$CMAKE_PREFIX_PATH

Build and install

  $ mkdir arrow/cpp/build
  $ pushd arrow/cpp/build
  $ cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      -DCMAKE_INSTALL_LIBDIR=lib \
      -DCMAKE_BUILD_TYPE=Debug \
      -DARROW_BUILD_TESTS=ON \
      -DARROW_COMPUTE=ON \
      -DARROW_CSV=ON \
      -DARROW_DATASET=ON \
      -DARROW_FILESYSTEM=ON \
      -DARROW_HDFS=ON \
      -DARROW_JSON=ON \
      -DARROW_PARQUET=ON \
      -DARROW_WITH_BROTLI=ON \
      -DARROW_WITH_BZ2=ON \
      -DARROW_WITH_LZ4=ON \
      -DARROW_WITH_SNAPPY=ON \
      -DARROW_WITH_ZLIB=ON \
      -DARROW_WITH_ZSTD=ON \
      -DPARQUET_REQUIRE_ENCRYPTION=ON \
      ..
  $ make -j4
  $ make install
  $ popd
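
The original pyarrow failure was CMake not finding `ArrowConfig.cmake` or `arrow-config.cmake`. If the build gets as far as `make install`, one way to confirm that such a file now exists under `$ARROW_HOME` is sketched below. `find_arrow_config` is a hypothetical helper, and the exact subdirectory is deliberately not assumed.

```shell
# Hypothetical helper: search an install prefix for Arrow's CMake package config.
find_arrow_config() {
  # $1 = install prefix, e.g. "$ARROW_HOME"
  find "$1" -type f \( -name 'ArrowConfig.cmake' -o -name 'arrow-config.cmake' \)
}

# Usage:
#   find_arrow_config "$ARROW_HOME"
```

As the CMake message itself says, the installation prefix then has to be on `CMAKE_PREFIX_PATH`, or `Arrow_DIR` has to point at the directory containing the file.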

The cmake step failed with yet another error, so this attempt is on hold for now.
