
Embedding Python 3.5 in C++, and fixing the TensorFlow C++ warnings on Ubuntu: SSE4.1 SSE4.2 AVX AVX2 FMA and XLA (tensorflow/compiler/xla/service/service.cc:168)


If you've read my earlier posts, you know that every TensorFlow C++ prediction run prints the warnings shown below.

Namely:

    2019-07-16 10:33:52.057179: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
    2019-07-16 10:33:52.082548: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3407965000 Hz
    2019-07-16 10:33:52.082883: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x44d56d0 executing computations on platform Host. Devices:
    2019-07-16 10:33:52.082903: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
    2019-07-16 10:33:52.557067: I tensorflow/core/common_runtime/optimization_registry.cc:35] Running all optimization passes in grouping 0. If you see this a lot, you might be extending the graph too many times (which means you modify the graph many times before execution). Try reducing graph modifications or using SavedModel to avoid any graph modification
    2019-07-16 10:33:52.694202: I tensorflow/core/common_runtime/optimization_registry.cc:35] Running all optimization passes in grouping 1. If you see this a lot, you might be extending the graph too many times (which means you modify the graph many times before execution). Try reducing graph modifications or using SavedModel to avoid any graph modification
    2019-07-16 10:33:53.157970: I tensorflow/core/common_runtime/optimization_registry.cc:35] Running all optimization passes in grouping 2. If you see this a lot, you might be extending the graph too many times (which means you modify the graph many times before execution). Try reducing graph modifications or using SavedModel to avoid any graph modification
    2019-07-16 10:33:53.228415: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1337] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.

Many people say these are just optimization hints that you can safely ignore, but they are rumored to bring a big performance gain, so I wanted to resolve them anyway.
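(As an aside, if you only want to silence these INFO-level messages rather than act on them, raising TF_CPP_MIN_LOG_LEVEL before TensorFlow initializes is a common trick. A minimal sketch, assuming you control main():)

    #include <cstdlib>

    int main() {
        // Must run before the first TensorFlow call in the process.
        // 0 = all logs, 1 = hide INFO, 2 = also hide WARNING, 3 = also hide ERROR.
        setenv("TF_CPP_MIN_LOG_LEVEL", "1", /*overwrite=*/1);
        // ... load the graph and run inference as usual ...
        return 0;
    }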

Part 1: Fixing AVX AVX2 SSE4.1 SSE4.2 FMA

I looked through quite a few resources trying to resolve this warning:


https://stackoverflow.com/questions/57049454/tensorflows-warningextending-the-graph-too-many-times-which-means-you-modify (found via VPN)

https://stackoverflow.com/questions/47068709/your-cpu-supports-instructions-that-this-tensorflow-binary-was-not-compiled-to-u


https://github.com/lakshayg/tensorflow-build

https://stackoverflow.com/questions/41293077/how-to-compile-tensorflow-with-sse4-2-and-avx-instructions

https://www.tensorflow.org/install/source

https://blog.csdn.net/edisonleeee/article/details/89503365


Some people say you can simply run pip install --ignore-installed --upgrade "Download URL", or:

    pip install --upgrade tensorflow
    pip uninstall tensorflow
    pip list
    pip install tensorflow-1.9.0-cp36-cp36m-win_amd64.whl

But that still didn't work.

    root@rootwd-Default-string:/media/root/Ubuntu311/projects/Ecology_projects/copy/ThirdParty/tensorflow-master# ./configure
    WARNING: Running Bazel server needs to be killed, because the startup options are different.
    WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
    You have bazel 0.24.1 installed.
    Please specify the location of python. [Default is /usr/bin/python]: bazel shutdown
    Invalid python path: bazel shutdown cannot be found.
    Please specify the location of python. [Default is /usr/bin/python]:
    Found possible Python library paths:
    /usr/lib/python3/dist-packages
    /usr/local/lib/python3.5/dist-packages
    Please input the desired Python library path to use. Default is [/usr/lib/python3/dist-packages]
    Do you wish to build TensorFlow with XLA JIT support? [Y/n]: Y
    XLA JIT support will be enabled for TensorFlow.
    Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
    No OpenCL SYCL support will be enabled for TensorFlow.
    Do you wish to build TensorFlow with ROCm support? [y/N]:
    No ROCm support will be enabled for TensorFlow.
    Do you wish to build TensorFlow with CUDA support? [y/N]:
    No CUDA support will be enabled for TensorFlow.
    Do you wish to download a fresh release of clang? (Experimental) [y/N]:
    Clang will not be downloaded.
    Do you wish to build TensorFlow with MPI support? [y/N]:
    No MPI support will be enabled for TensorFlow.
    Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2
    Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]:
    Not configuring the WORKSPACE for Android builds.
    Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
    --config=mkl # Build with MKL support.
    --config=monolithic # Config for mostly static monolithic build.
    --config=gdr # Build with GDR support.
    --config=verbs # Build with libverbs support.
    --config=ngraph # Build with Intel nGraph support.
    --config=numa # Build with NUMA support.
    --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
    Preconfigured Bazel build configs to DISABLE default on features:
    --config=noaws # Disable AWS S3 filesystem support.
    --config=nogcp # Disable GCP support.
    --config=nohdfs # Disable HDFS support.
    --config=noignite # Disable Apache Ignite support.
    --config=nokafka # Disable Apache Kafka support.
    --config=nonccl # Disable NVIDIA NCCL support.
    Configuration finished
    root@rootwd-Default-string:/media/root/Ubuntu311/projects/Ecology_projects/copy/ThirdParty/tensorflow-master# bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2 -k
    Starting local Bazel server and connecting to it...
    WARNING: Usage: bazel build <options> <targets>.
    Invoke `bazel help build` for full description of usage and options.
    Your request is correct, but requested an empty set of targets. Nothing will be built.
    INFO: Analysed 0 targets (0 packages loaded, 0 targets configured).
    INFO: Found 0 targets...
    INFO: Elapsed time: 1.805s, Critical Path: 0.01s
    INFO: 0 processes.
    INFO: Build completed successfully, 1 total action
    root@rootwd-Default-string:/media/root/Ubuntu311/projects/Ecology_projects/copy/ThirdParty/tensorflow-master#

Note that the bazel build command above built nothing at all: I forgot to give it a target (e.g. //tensorflow:libtensorflow_cc.so), which is why Bazel answered "requested an empty set of targets". Later I saw someone say that upgrading to 2.0 fixes the warning (pip install tensorflow==2.0.0-beta1), but that didn't work either!

    root@rootwd-Default-string:/media/root/Ubuntu311/projects/Ecology_projects/copy/ThirdParty# pip install tensorflow-2.0.0b1-cp35-cp35m-manylinux1_x86_64.whl
    Processing ./tensorflow-2.0.0b1-cp35-cp35m-manylinux1_x86_64.whl
    Requirement already satisfied: keras-applications>=1.0.6 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (1.0.8)
    Requirement already satisfied: astor>=0.6.0 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (0.8.0)
    Requirement already satisfied: numpy<2.0,>=1.14.5 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (1.16.4)
    Requirement already satisfied: tb-nightly<1.14.0a20190604,>=1.14.0a20190603 in /usr/local/lib/python3.5/dist-packages (from tensorflow==2.0.0b1) (1.14.0a20190603)
    Requirement already satisfied: keras-preprocessing>=1.0.5 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (1.1.0)
    Requirement already satisfied: gast>=0.2.0 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (0.2.2)
    Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.5/dist-packages (from tensorflow==2.0.0b1) (0.1.7)
    Requirement already satisfied: tf-estimator-nightly<1.14.0.dev2019060502,>=1.14.0.dev2019060501 in /usr/local/lib/python3.5/dist-packages (from tensorflow==2.0.0b1) (1.14.0.dev2019060501)
    Requirement already satisfied: wheel>=0.26 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (0.33.4)
    Collecting protobuf>=3.6.1 (from tensorflow==2.0.0b1)
    Using cached https://files.pythonhosted.org/packages/55/34/7158a5ec978f12307eb361a8c4fdd867a8e2a0ab63fac99e5f555ee796d2/protobuf-3.9.0-cp35-cp35m-manylinux1_x86_64.whl
    Requirement already satisfied: grpcio>=1.8.6 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (1.22.0)
    Requirement already satisfied: six>=1.10.0 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (1.12.0)
    Requirement already satisfied: absl-py>=0.7.0 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (0.7.1)
    Requirement already satisfied: termcolor>=1.1.0 in /root/.local/lib/python3.5/site-packages (from tensorflow==2.0.0b1) (1.1.0)
    Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.5/dist-packages (from tensorflow==2.0.0b1) (1.11.2)
    Requirement already satisfied: h5py in /root/.local/lib/python3.5/site-packages (from keras-applications>=1.0.6->tensorflow==2.0.0b1) (2.9.0)
    Requirement already satisfied: werkzeug>=0.11.15 in /root/.local/lib/python3.5/site-packages (from tb-nightly<1.14.0a20190604,>=1.14.0a20190603->tensorflow==2.0.0b1) (0.15.4)
    Requirement already satisfied: setuptools>=41.0.0 in /root/.local/lib/python3.5/site-packages (from tb-nightly<1.14.0a20190604,>=1.14.0a20190603->tensorflow==2.0.0b1) (41.0.1)
    Requirement already satisfied: markdown>=2.6.8 in /root/.local/lib/python3.5/site-packages (from tb-nightly<1.14.0a20190604,>=1.14.0a20190603->tensorflow==2.0.0b1) (3.1.1)
    Installing collected packages: protobuf, tensorflow
    Successfully installed protobuf-3.9.0 tensorflow-2.0.0b1
    root@rootwd-Default-string:/media/root/Ubuntu311/projects/Ecology_projects/copy/ThirdParty#

I asked about this on the TensorFlow issue tracker, on Stack Overflow, and on other sites.

In the end I uninstalled TensorFlow completely and thoroughly (not just from the command line; I also deleted every related folder), then downloaded tensorflow-2.0.0-alpha0.tar and ran:

    ./configure
    bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2 //tensorflow:libtensorflow_cc.so

But it didn't work:

    https://github.com/tensorflow/tensorflow/releases/tag/v2.0.0-alpha0
    http://mirror.tensorflow.org/www.sqlite.org/2019/sqlite-amalgamation-3280000.zip
    https://www.sqlite.org/2019/sqlite-amalgamation-3280000.zip

    Executing genrule //tensorflow/cc:nn_ops_genrule failed (Exit 127)
    bazel-out/host/bin/tensorflow/cc/ops/nn_ops_gen_cc: symbol lookup error: bazel-out/host/bin/tensorflow/cc/ops/nn_ops_gen_cc: undefined symbol: _ZN10tensorflow15shape_inference21FusedBatchNormV3ShapeEPNS0_16InferenceContextE
    Target //tensorflow:libtensorflow_cc.so failed to build
    Use --verbose_failures to see the command lines of failed build steps.
    INFO: Elapsed time: 2612.253s, Critical Path: 100.09s
    INFO: 6033 processes: 6033 local.
    FAILED: Build did NOT complete successfully

So I deleted and uninstalled that as well, and downloaded tensorflow-2.0.0-beta1.tar:

    ./configure
    bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma --copt=-mfpmath=both --copt=-msse4.1 --copt=-msse4.2 //tensorflow:libtensorflow_cc.so

And this time most of the warnings were gone!!! The takeaway: don't be reluctant to delete and uninstall the previously working version — nothing ventured, nothing gained — and be sure to download the release that matches your machine: CPU, Ubuntu version, gcc version, Python version, and so on all need to line up.

I also noticed that this TensorFlow build no longer needs linker flags such as -Wl,--no-as-needed.

Now only the following messages remain:

    2019-07-25 15:47:02.775473: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3406455000 Hz
    2019-07-25 15:47:02.775793: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2f59c70 executing computations on platform Host. Devices:
    2019-07-25 15:47:02.775812: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
    Session successfully created.
    2019-07-25 15:47:02.858641: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1483] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.

Part 2: Fixing XLA

I read these links, which say XLA can be activated in code:

    https://www.tensorflow.org/xla/developing_new_backend
    https://stackoverflow.com/questions/47977533/how-to-debug-tensorflow-compiler-xla-testsarray-elementwise-ops-test-cpu-para
    https://stackoverflow.com/questions/56633372/how-can-i-activate-tensorflows-xla-for-the-c-api

    #include "c_api_experimental.h"
    TF_SessionOptions* options = TF_NewSessionOptions();
    TF_EnableXLACompilation(options, true);

But after adding those three lines my build wouldn't even compile; it immediately failed with yet another missing library (presumably because TF_EnableXLACompilation lives in c_api_experimental.h and the library I had built didn't export that symbol).

Then I followed someone else's suggestion:

    $ TF_XLA_FLAGS=--tf_xla_cpu_global_jit path/to/your/program
    export TF_XLA_FLAGS=--tf_xla_cpu_global_jit=/mytensorflowpath/tensorflow/compiler/xla:$TF_XLA_FLAGS=--tf_xla_cpu_global_jit
    2019-07-23 10:17:57.259354: E tensorflow/core/util/command_line_flags.cc:106] Couldn't interpret value =/mytensorflowpath/tensorflow/compiler/xla:=--tf_xla_cpu_global_jit for flag tf_xla_cpu_global_jit.

But as the error above shows, the environment variable cannot be set that way.

I double-checked that when I built tensorflow-2.0.0-beta1.tar I had already enabled XLA (Do you wish to build TensorFlow with XLA JIT support? [Y/n]: Y), so I genuinely couldn't understand why the XLA warning still appeared.

Then I kept trying other fixes:

    https://blog.csdn.net/w285868925/article/details/88317112
    http://quabr.com/49549364/layer-conv2d-53-was-called-with-an-input-that-isnt-a-symbolic-tensor

    // Both byte arrays below are hand-serialized ConfigProto messages (as I
    // understand the encoding). {0x10, n, 0x28, m} sets
    // intra_op_parallelism_threads (field 2) to n and
    // inter_op_parallelism_threads (field 5) to m:
    uint8_t intra_op_parallelism_threads = maxCores;
    uint8_t inter_op_parallelism_threads = maxCores;
    uint8_t config[] = {0x10, intra_op_parallelism_threads, 0x28, inter_op_parallelism_threads};
    TF_SetConfig(sess_opts, config, sizeof(config), status);
    // {0x52, 0x4, 0x1a, 0x2, 0x28, 0x1} sets
    // graph_options.optimizer_options.global_jit_level = ON_1:
    uint8_t config[] = {0x52, 0x4, 0x1a, 0x2, 0x28, 0x1};
    TF_SetConfig(sess_opts, config, sizeof(config), status);

Still no luck.
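As an aside, rather than hand-writing protobuf bytes, one can build the ConfigProto properly and serialize it — a sketch, assuming you can link the TensorFlow C++ protobuf headers next to the C API (SetSessionConfig is my own helper name):

    #include "tensorflow/c/c_api.h"
    #include "tensorflow/core/protobuf/config.pb.h"
    #include <string>

    // Builds the same configuration as the raw byte arrays above:
    // thread-pool sizes plus global_jit_level = ON_1 for XLA.
    void SetSessionConfig(TF_SessionOptions* opts, TF_Status* status, int cores) {
        tensorflow::ConfigProto config;
        config.set_intra_op_parallelism_threads(cores);
        config.set_inter_op_parallelism_threads(cores);
        config.mutable_graph_options()->mutable_optimizer_options()
              ->set_global_jit_level(tensorflow::OptimizerOptions::ON_1);
        const std::string bytes = config.SerializeAsString();
        TF_SetConfig(opts, bytes.data(), bytes.size(), status);
    }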

Anyway, I searched for a long time:


https://mp.weixin.qq.com/s/RO3FrPxhK2GEoDCGE9DXrw

https://stackoverflow.com/questions/52943489/what-is-xla-gpu-and-xla-cpu-for-tensorflow

https://stackoverflow.com/questions/43673380/tensorflow-cross-compile-xla-to-android

https://stackoverflow.com/questions/52890108/how-to-open-tensorflow-xla

https://www.cnblogs.com/iyulang/p/6586866.html

https://stackoverflow.com/questions/57049454/tensorflows-warningextending-the-graph-too-many-times-which-means-you-modify

https://stackoverflow.com/questions/57197854/fma-avx-sse-flags-did-not-bring-me-good-performance 

https://fast-depth-coding.readthedocs.io/en/latest/tf-speed.html  //speed up solution

https://github.com/tensorflow/tensorflow/issues/8243

In the end, the fix was simply to set the environment variable:

export TF_XLA_FLAGS=--tf_xla_cpu_global_jit

Just this one line: make the environment variable take effect and the problem is solved. Do not append any path; write it exactly as above.
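If you'd rather not depend on the shell environment, the same flag can be set from inside the program. A minimal sketch, assuming it runs before the first TensorFlow call (the flag is read once when the JIT initializes):

    #include <cstdlib>

    int main() {
        // Equivalent to "export TF_XLA_FLAGS=--tf_xla_cpu_global_jit",
        // but self-contained in the binary; must precede any TF API call.
        setenv("TF_XLA_FLAGS", "--tf_xla_cpu_global_jit", /*overwrite=*/1);
        // ... create the session and run inference ...
        return 0;
    }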

And look — the XLA warning line is gone.

At this point, every warning was resolved.

But then I measured the prediction speed:

This was measured on the handwritten-digit test model, and it is nowhere near the rumored 300% improvement.

And when I tested the model I actually use, it was essentially just as slow with these optimizations as without them.

Some people said it depends on the model?!

Others suggested using MKL-DNN for a big speedup. I tried both compiling MKL-DNN myself and using TensorFlow's MKL build directly, and both were still slow:

    // (requires tensorflow/core/public/session.h and <memory>)
    // initialize the number of worker threads
    tensorflow::SessionOptions options;
    tensorflow::ConfigProto& config = options.config;
    if (coresToUse > 0)
    {
        config.set_inter_op_parallelism_threads(coresToUse);
        config.set_intra_op_parallelism_threads(coresToUse);
        config.set_use_per_session_threads(false);
    }
    // now create a session to make the change
    std::unique_ptr<tensorflow::Session>
        session(tensorflow::NewSession(options));
    session->Close();  // the original had "session->Close" without the call parentheses
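For completeness, here is a hedged sketch of actually running one prediction with such a session (the tensor names "input:0", "is_training:0", and "output:0" are taken from my Python model shown later in this post; adjust them to your own graph):

    #include "tensorflow/core/framework/tensor.h"
    #include "tensorflow/core/public/session.h"
    #include <vector>

    // Feeds one image tensor plus the is_training flag and fetches the output.
    tensorflow::Status RunOnce(tensorflow::Session* session,
                               const tensorflow::Tensor& image,
                               std::vector<tensorflow::Tensor>* outputs) {
        tensorflow::Tensor is_training(tensorflow::DT_BOOL, tensorflow::TensorShape());
        is_training.scalar<bool>()() = false;
        return session->Run({{"input:0", image}, {"is_training:0", is_training}},
                            {"output:0"}, {}, outputs);
    }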

At this level of prediction speed, using libtensorflow_cc.so and libtensorflow_framework.so, there is nothing more I can do. It may be as others say: the bulk of the speed comes down to the trained model itself — a complex model predicts slowly, a simple one quickly.

If any expert has seen a large prediction speedup after enabling AVX AVX2 SSE4.1 SSE4.2 FMA and XLA, please let me know.

Part 3: Embedding Python 3.5 in C++

On another Ubuntu machine I wanted to embed Python 3.5 into C++ and call it from there; following online tutorials, the hello-world test passed.

But when I tested my own image-prediction example, it failed with:

    requests.packages.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.
    SystemError: <built-in method locked of _thread.lock object at 0x7fe771c79148> returned a result with an error set

I dug through a lot of material. Some claimed that in this embedding scenario the Python file must not contain "from ... import ..." statements, so my earlier io.imread and similar calls could not be used without raising errors. I then switched to a plain import cv2 and called cv2.imread directly, and that failed too. It seems that when Python is embedded in C++, Python code that touches the system tends to trigger these errors.
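When an embedded import fails like this, the C-side symptom is usually far less informative than the pending Python exception. A small sketch of how to surface the real traceback (the module name "mymodel2" below is just an example):

    #include <Python.h>
    #include <cstdio>

    // Import a module, and if it fails, print the full Python traceback
    // to stderr instead of guessing from the C-side error alone.
    PyObject* ImportOrReport(const char* name) {
        PyObject* mod = PyImport_ImportModule(name);
        if (!mod) {
            PyErr_Print();  // dumps the pending Python exception and traceback
            std::fprintf(stderr, "failed to import %s\n", name);
        }
        return mod;
    }

    // Usage: PyObject* m = ImportOrReport("mymodel2");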

Later I found an example online where the author reads the image on the C++ side and passes it into Python (source link at the end):

    #include <Python.h>
    #include <numpy/arrayobject.h>
    #include <opencv2/opencv.hpp>
    #include <ctime>
    #include <iostream>
    using namespace cv;
    using namespace std;

    int main()
    {
        Py_Initialize();
        import_array(); // initialize the NumPy C API
        if (!Py_IsInitialized())
        {
            return -1;
        }
        PyRun_SimpleString("print('hello')"); // the original used Python 2 syntax: print 'hello'
        PyObject *pName, *pModule, *pDict, *pFunc, *pArgs;
        PyRun_SimpleString("import sys");
        PyRun_SimpleString("sys.path.append('/home/vetec-p/Pan/project/run-maskrcnn')");
        PyRun_SimpleString("sys.path.append('/home/vetec-p/Pan/project/run-maskrcnn/build')");
        PyRun_SimpleString("sys.path.append('/home/vetec-p/Pan/Detectron-master')");
        // load the script infer_one_pic.py
        pModule = PyImport_ImportModule("infer_one_pic");
        if (!pModule)
        {
            printf("can't find infer_one_pic.py");
            //getchar();
            return -1;
        }
        pDict = PyModule_GetDict(pModule);
        if (!pDict)
        {
            return -1;
        }
        pFunc = PyDict_GetItemString(pDict, "run");
        if (!pFunc || !PyCallable_Check(pFunc))
        {
            printf("can't find function [run]");
            getchar();
            return -1;
        }
        for (int i = 1; i < 200; i++)
        {
            clock_t start, finish;
            double totaltime;
            start = clock();
            Mat img = imread("/media/vetec-p/Data/Rubbish/maskrcnn_dataset/0803_mask/train_all/pic/" + to_string(i) + ".png");
            if (img.empty())
                return -1;
            clock_t s1;
            s1 = clock();
            // PyObject *PyList = PyList_New(data_size); // unused in the original (data_size was undefined)
            PyObject *ArgList = PyTuple_New(1);
            auto sz = img.size();
            int x = sz.width;
            int y = sz.height;
            int z = img.channels();
            uchar *CArrays = new uchar[x * y * z]; // note: never freed in this example
            int iChannels = img.channels();
            int iRows = img.rows;
            int iCols = img.cols * iChannels;
            if (img.isContinuous())
            {
                iCols *= iRows;
                iRows = 1;
            }
            uchar *p;
            int id = -1;
            for (int r = 0; r < iRows; r++)
            {
                p = img.ptr<uchar>(r);
                for (int j = 0; j < iCols; j++)
                {
                    CArrays[++id] = p[j]; // copy into one contiguous buffer
                }
            }
            npy_intp Dims[3] = { y, x, z }; // note the dimension order: rows, cols, channels
            PyObject *PyArray = PyArray_SimpleNewFromData(3, Dims, NPY_UBYTE, CArrays);
            PyTuple_SetItem(ArgList, 0, PyArray);
            clock_t e1 = clock();
            cout << "\ncopy took " << (double)(e1 - s1) / CLOCKS_PER_SEC << " s" << endl;
            //PyTuple_SetItem(ArgList, 0, PyList); // put the PyList into the tuple
            PyObject *pReturn = PyObject_CallObject(pFunc, ArgList);
            clock_t e2 = clock();
            cout << "\ndetect took " << (double)(e2 - e1) / CLOCKS_PER_SEC << " s" << endl;
        }
        Py_DECREF(pModule); // shut down Python
        Py_Finalize();
        return 0;
    }

(Source: https://sportsmanlee.blogspot.com/2017/09/c-pass-opencv-mat-image-to-python.html)

I adapted this, and hit one error:

    /usr/include/python3.5m/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
     #warning "Using deprecated NumPy API, disable it by " \
      ^
    In file included from /usr/include/c++/5/cstddef:45:0,
                     from /home/jumper/workspace/opencv3.4.1/include/opencv2/core/hal/interface.h:15,
                     from /home/jumper/workspace/opencv3.4.1/include/opencv2/core/cvdef.h:91,
                     from /home/jumper/workspace/opencv3.4.1/include/opencv2/core.hpp:52,
                     from /home/jumper/workspace/opencv3.4.1/include/opencv2/opencv.hpp:52,
                     from ../src/insertpython.cpp:11:
    ../src/insertpython.cpp: In function ‘void predictimg()’:
    /usr/include/python3.5m/numpy/__multiarray_api.h:1527:35: error: return-statement with a value, in function returning 'void' [-fpermissive]
     #define NUMPY_IMPORT_ARRAY_RETVAL NULL
                                       ^
    /usr/include/python3.5m/numpy/__multiarray_api.h:1532:151: note: in expansion of macro ‘NUMPY_IMPORT_ARRAY_RETVAL’
     #define import_array() {if (_import_array() < 0) {PyErr_Print(); PyErr_SetString

The fix is to add one line of code before including the NumPy headers. (The import_array() macro expands to "return NUMPY_IMPORT_ARRAY_RETVAL;" on failure, which is NULL by default and therefore illegal inside a function returning void; redefining the macro to be empty makes the expansion legal.)

#define NUMPY_IMPORT_ARRAY_RETVAL
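An alternative that avoids redefining the macro (a sketch, not what I used here): keep import_array() inside a helper whose return type can carry the NULL that the macro returns on failure:

    #include <Python.h>
    #include <numpy/arrayobject.h>

    // import_array() expands to "return NULL;" on failure, so place it in a
    // function returning a pointer; non-NULL means the NumPy C API is ready.
    static void* InitNumpy() {
        import_array();  // returns NULL from this function on failure
        return (void*)1; // success sentinel
    }

Calling InitNumpy() once after Py_Initialize() and checking its return value then replaces the bare import_array() call.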

I finally got Python 3.5 embedded in C++ and predicting images. My final example is below.

The mymodel2.py file:

    import tensorflow as tf
    import numpy as np

    def test_one_image(imagearray):
        print("entering the model")
        # NOTE: rebuilding the graph and session on every call is expensive;
        # for repeated predictions, load the graph once and reuse the session.
        with tf.Graph().as_default():
            output_graph_def = tf.GraphDef()
            with open(r"/home/jumper/workspace/algaeprojects/insertpython/good_frozen.pb", "rb") as f:
                output_graph_def.ParseFromString(f.read())
                _ = tf.import_graph_def(output_graph_def, name="")
            with tf.Session() as sess:
                init = tf.global_variables_initializer()
                sess.run(init)
                input_x = sess.graph.get_tensor_by_name("input:0")
                # out_softmax = sess.graph.get_tensor_by_name("softmax:0")
                out_softmax = sess.graph.get_tensor_by_name("output:0")
                is_training_x = sess.graph.get_tensor_by_name("is_training:0")
                print("model loaded")
                print("reading image...")
                l = imagearray.shape
                k = l[0]
                print(k)
                img = imagearray * (1. / 255)
                print("get image...")
                feed_dict = {input_x: np.reshape(img, [-1, 96, 224, 1]), is_training_x: False}
                print("running prediction...")
                img_out_softmax = sess.run(out_softmax, feed_dict)
                print(img_out_softmax)

The C++ file:

    #include <Python.h>
    #include <iostream>
    #include <string>
    #include <cstdlib>
    #include <ctime>
    #include <opencv2/opencv.hpp>
    #include <numpy/arrayobject.h>
    using namespace std;

    #define NUMPY_IMPORT_ARRAY_RETVAL

    void predictimg()
    {
        Py_Initialize();
        PyEval_InitThreads();
        PyObject* pFunc = NULL;
        PyObject* pArg = NULL;
        PyObject* module = NULL;
        PyRun_SimpleString("import sys");
        PyRun_SimpleString("sys.path.append('/home/jumper/workspace/algaeprojects/insertpython/')");
        module = PyImport_ImportModule("mymodel2"); // module name = Python file name (mymodel2.py)
        PyObject *pDict = PyModule_GetDict(module);
        pFunc = PyDict_GetItemString(pDict, "test_one_image");
        //PyEval_CallObject(pFunc, NULL);
        clock_t start, finish;
        double totaltime;
        start = clock();
        // the model expects a single-channel image (see np.reshape in mymodel2.py),
        // so read as grayscale; the original read the default 3-channel image but
        // indexed it as single-channel
        cv::Mat img = cv::imread("/home/jumper/workspace/algaeprojects/insertpython/cnn-imgs/AABW22496.jpg", cv::IMREAD_GRAYSCALE);
        int m, n;
        n = img.cols;
        m = img.rows;
        unsigned char *data = (unsigned char*)malloc(sizeof(unsigned char) * m * n);
        int p = 0;
        for (int i = 0; i < m; i++)
        {
            for (int j = 0; j < n; j++)
            {
                data[p] = img.at<unsigned char>(i, j);
                p++;
            }
        }
        clock_t s1;
        s1 = clock();
        npy_intp Dims[3] = { m, n, 1 }; // dimension info: rows, cols, channels
        import_array(); // initialize the NumPy C API
        PyObject *PyArray = PyArray_SimpleNewFromData(3, Dims, NPY_UBYTE, data);
        PyObject *ArgArray = PyTuple_New(1);
        PyTuple_SetItem(ArgArray, 0, PyArray);
        //PyObject *pFuncFive = PyDict_GetItemString(pDict, "test_one_image");
        clock_t e1 = clock();
        cout << "\ncopy took " << (double)(e1 - s1) / CLOCKS_PER_SEC << " s" << endl;
        PyObject *pReturn = PyObject_CallObject(pFunc, ArgArray);
        clock_t e2 = clock();
        cout << "\ndetect took " << (double)(e2 - e1) / CLOCKS_PER_SEC << " s" << endl;
        Py_DECREF(module); // shut down Python
        Py_Finalize();
        free(data); // the NumPy array borrowed this buffer, so free it only after Python is done
    }

    void test()
    {
        Py_Initialize();
        PyRun_SimpleString("print('hello c++ python')");
        Py_Finalize();
        return;
    }

    void test1()
    {
        Py_Initialize();
        PyRun_SimpleString("import sys");
        PyRun_SimpleString("sys.path.append('/home/jumper/workspace/algaeprojects/insertpython/')");
        PyObject* module = PyImport_ImportModule("demo");
        PyObject* pFunc = PyObject_GetAttrString(module, "print_arg");
        PyObject* pArg = Py_BuildValue("(s)", "hello c++ python!!!");
        PyEval_CallObject(pFunc, pArg);
        Py_Finalize();
        return;
    }

    int main()
    {
        const char *path = "/home/jumper/workspace/algaeprojects/insertpython/cnn-imgs/AABW22496.jpg"; // unused here
        test();       // basic test
        test1();      // another basic test
        predictimg(); // image-prediction test
        return 0;
    }

My C++ project configuration is as follows (the screenshot is not reproduced here; in essence, the include paths for /usr/include/python3.5m and numpy, plus linking against the python3.5m library).

The red line is the optimization warning. I set this up on a colleague's machine; his CPU is weaker than mine, and only AVX and FMA were flagged as available optimizations, i.e. fewer optimizable items than on my machine. But I don't plan to fix those two there: his prediction already takes 1.9 seconds, far slower than mine, so I expect it would still be slower than my machine even after optimizing. Sigh... whether via the TensorFlow C++ shared library or by embedding Python in C++, the prediction speed is currently too slow for our project to use.

We'll see whether a solution turns up later. Simplifying the model does make it very fast, but accuracy drops after simplification...

As usual, closing with a cute photo of my little one.
