赞
踩
本项目是参加PaddlePaddle Hackathon第二期活动的任务, 在此分享将yolov5模型部署在手机端上的一些实践记录和踩坑经历。
任务目标:
参考ssd_mobilnetv1目标检测的 Android demo,使用 yolo_v5 模型在安卓手机上完成 demo 开发,输入为摄像头实时视频流,输出为包含检测框的视频流;在界面上添加一个 backend 选择开关,用户可以选择将模型运行在 CPU 或 GPU 上,保证分别运行 CPU 和 GPU 结果均正确。
项目任务链接: https://github.com/PaddlePaddle/Paddle/issues/40234
GitHub 任务提交链接: https://github.com/PaddlePaddle/Paddle-Lite-Demo/pull/237
基于之前项目的后续开发
链接: https://aistudio.baidu.com/aistudio/projectdetail/3452279
基于之前的项目稍微做了调整,代码已上传至GitHub: https://github.com/thunder95/yolov5_paddle_prune
%cd /home/aistudio/
!rm -rf yolov5_paddle_prune/
!git clone https://github.com/thunder95/yolov5_paddle_prune.git
%cd yolov5_paddle_prune
!git checkout android
# 安装依赖库
!pip install gputil==1.4.0 pycocotools terminaltables
!mkdir -p /home/aistudio/.config/thunder95/
!cp /home/aistudio/Arial.ttf /home/aistudio/.config/thunder95/
本项目不需要训练, 但需要解压并修改数据集加载的配置文件
%cd /home/aistudio/data/data127016/
!unzip /home/aistudio/data/data127016/yolov5_mot20.zip
!python create_txt.py
!sed -i 's|\/f\/dataset\/person\/out|\/home\/aistudio\/data\/data127016|g' /home/aistudio/yolov5_paddle_prune/data/coco.yaml
Error: This model is not supported, because 1 ops are not supported on ‘arm’. These unsupported ops are: ‘silu’
遇到的问题一:
[F 4/10 8:51:53.568 …-Lite/lite/kernels/opencl/image_helper.h:72 DefaultGlobalWorkSize] Unsupport DefaultGlobalWorkSize with tensor_dim.size():5, image_shape.size():2
所提Issue(https://github.com/PaddlePaddle/Paddle-Lite/issues/8803)反馈, 在PaddleLite中使用opencl不支持>4维的算子
遇到的问题二:
使用paddle.max和paddle.argmax组合会跟原网络输出无法对齐
所提Issue(https://github.com/PaddlePaddle/Paddle-Lite/issues/8647)反馈, 官方待修复这个问题。
修改models/yolo.py中检测头的foward部分
class Detect(nn.Layer): def __init__(self, nc=80, anchors=(), ch=(), inplace=True): # detection layer super().__init__() self.onnx_dynamic = False # ONNX export parameter self.stride = None # strides computed during build self.nc = nc # number of classes self.no = nc + 5 # number of outputs per anchor self.nl = len(anchors) # number of detection layers self.na = len(anchors[0]) // 2 # number of anchors self.grid = [paddle.zeros([1])] * self.nl # init grid self.anchor_grid = [paddle.zeros([1])] * self.nl # init anchor grid self.register_buffer('anchors', paddle.to_tensor(anchors).astype('float32').reshape([self.nl, -1, 2])) # shape(nl,na,2) self.m = nn.LayerList(nn.Conv2D(x, self.no * self.na, 1) for x in ch) # output conv self.inplace = inplace # use in-place ops (e.g. slice assignment) def forward(self, x): z = [] # inference output for i in range(self.nl): x[i] = self.m[i](x[i]) # conv bs, _, ny, nx = x[i].shape # x(bs,255,20,20) to x(bs,3,20,20,85) x[i] = x[i].reshape([bs, self.na, self.no, ny * nx]).transpose([0, 1, 3, 2]) if not self.training: # inference y = F.sigmoid(x[i]) z.append(y) return x if self.training else z
修改detect.py中提取boxes代码, 将在andriod端用c实现
boxes = [] # xywh, score, cls for l in range(len(pred)): pred[l] = pred[l].numpy() bs, ch, nd, cn = pred[l].shape rows = dim_num[l] print(l, rows) for b in range(bs): for c in range(ch): for r in range(rows): offset = r * row_num max_cls_val = 0 max_cls_id = 0 score = pred[l][b, c, r, 4] if score < conf_thres: continue for i in range(80): if pred[l][b, c, r, i + 5] > max_cls_val: max_cls_val = pred[l][b, c, r, i + 5] max_cls_id = i score *= max_cls_val if score < conf_thres: continue y = int(r / xdim[l]) x = int(r % xdim[l]) tmp_box = [ (pred[l][b, c, r, 0] * 2 - 0.5 + x) * strides[l], (pred[l][b, c, r, 1] * 2 - 0.5 + y) * strides[l], # i, y, x, 2 (pred[l][b, c, r, 2] * 2) ** 2 * anchors[l][c * 2], (pred[l][b, c, r, 3] * 2) ** 2 * anchors[l][c * 2 + 1], max_cls_id, score, # scores ] tmp_box[0] = tmp_box[0] - tmp_box[2] / 2 # x1 tmp_box[1] = tmp_box[1] - tmp_box[3] / 2 # y1 tmp_box[2] = tmp_box[0] + tmp_box[2] # x2 tmp_box[3] = tmp_box[1] + tmp_box[3] # y2 boxes.append(tmp_box)
# 导出静态图模型
%cd /home/aistudio/yolov5_paddle_prune
!python convert_static.py --weights ./weights/yolov5n.pdparams --cfg models/yolov5n.yaml --data ./data/coco.yaml --source crop.jpg
安装PaddleLite优化工具paddle_lite_opt,并通过命令生成nb权重文件。
本项目推荐编译安装release/v2.11的PaddleLite, 参考issue:https://github.com/PaddlePaddle/Paddle-Lite/issues/9168
git clone https://github.com/PaddlePaddle/Paddle-Lite.git
cd Paddle-Lite
git checkout release/v2.11
export NDK_ROOT=~/Android/Sdk/ndk/24.0.8215888/
./lite/tools/build_android.sh --with_opencl=ON --arch=armv8 --with_extra=ON --with_log=ON
%cd /home/aistudio/yolov5_paddle_prune/
!pip install paddlelite==2.11
# cpu
!paddle_lite_opt --model_file=./simple_net.pdmodel --param_file=simple_net.pdiparams --optimize_out_type=naive_buffer --optimize_out=./yolov5n_cpu --valid_targets=arm
#gpu
!paddle_lite_opt --model_file=./simple_net.pdmodel --param_file=simple_net.pdiparams --optimize_out_type=naive_buffer --optimize_out=./yolov5n_gpu --valid_targets=opencl,arm
/home/aistudio/yolov5_paddle_prune Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Collecting paddlelite==2.11 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8b/d7/8babc059bff1d02dc85e32fb8bff3c56680db5f5b20246467f9fcbcfe62c/paddlelite-2.11-cp37-cp37m-manylinux1_x86_64.whl (47.1 MB) [2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m47.1/47.1 MB[0m [31m16.5 MB/s[0m eta [36m0:00:00[0m00:01[0m00:01[0m [?25hInstalling collected packages: paddlelite Successfully installed paddlelite-2.11 [33mWARNING: You are using pip version 22.0.4; however, version 22.1.2 is available. You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.[0m[33m [0mLoading topology data from ./simple_net.pdmodel Loading params data from simple_net.pdiparams 1. Model is successfully loaded! 2. Model is optimized and saved into ./yolov5n_cpu.nb successfully Loading topology data from ./simple_net.pdmodel Loading params data from simple_net.pdiparams 1. Model is successfully loaded! 2. Model is optimized and saved into ./yolov5n_gpu.nb successfully
准备一部测试用的安卓手机armv8版本,处理器Qualcomm Snapdrgon 450。
sudo apt update
sudo apt install -y wget adb
adb devices
下方出现设备,表示设备连接成功
List of devices attached
XKKBB19A10231894 device
修改本地NDK配置:
export NDK_ROOT=~/Android/Sdk/ndk/24.0.8215888
下载代码:
git clone https://github.com/PaddlePaddle/Paddle-Lite-Demo.git
注意: 之前的版本opencl会存在精度对齐的问题,需要切换到develop或者release/v2.11, 参考本人issue: https://github.com/PaddlePaddle/Paddle-Lite/issues/8807
下载, 解压并拷贝预编译库到Paddle-Lite-Demo项目: https://paddle-lite.readthedocs.io/zh/develop/quick_start/release_lib.html
/cxx/include$ cp ./* /home/hulei/projects/tmp/Paddle-Lite-Demo/libs/android/cxx/include/
/cxx/lib$ cp libpaddle_light_api_shared.so /home/hulei/projects/tmp/Paddle-Lite-Demo/libs/android/cxx/libs/arm64-v8a/
// 1. Set MobileConfig MobileConfig config; config.set_model_from_file(model_file); std::cout << "model_file: " << model_file << std::endl; config.set_power_mode(static_cast<paddle::lite_api::PowerMode>(power_mode)); config.set_threads(thread_num); // 2. Create PaddlePredictor by MobileConfig std::shared_ptr<PaddlePredictor> predictor = CreatePaddlePredictor<MobileConfig>(config); // 3. Prepare input data from image // read img and pre-process std::unique_ptr<Tensor> input_tensor0(std::move(predictor->GetInput(0))); input_tensor0->Resize({1, 3, height, width}); auto *data0 = input_tensor0->mutable_data<float>(); cv::Mat img = imread(img_path, cv::IMREAD_COLOR); pre_process(img, width, height, data0); // 4. Run predictor double first_duration{-1}; for (size_t widx = 0; widx < warmup; ++widx) { if (widx == 0) { auto start = GetCurrentUS(); predictor->Run(); first_duration = (GetCurrentUS() - start) / 1000.0; } else { predictor->Run(); } }
void pre_process(const cv::Mat &img_ori, int width, int height, float *data) { cv::Mat img = img_ori.clone(); int w, h, x, y; int channelLength = width * height; float r_w = width / (img.cols * 1.0); float r_h = height / (img.rows * 1.0); if (r_h > r_w) { w = width; h = r_w * img.rows; x = 0; y = (height - h) / 2; } else { w = r_h * img.cols; h = height; x = (width - w) / 2; y = 0; } cv::Mat re(h, w, CV_8UC3); cv::resize(img, re, re.size(), 0, 0, cv::INTER_CUBIC); cv::Mat out(height, width, CV_8UC3, cv::Scalar(128, 128, 128)); re.copyTo(out(cv::Rect(x, y, re.cols, re.rows))); // split channels out.convertTo(out, CV_32FC3, 1. / 255.); cv::Mat input_channels[3]; cv::split(out, input_channels); for (int j = 0; j < 3; j++) { memcpy(data + width * height * j, input_channels[2 - j].data, channelLength * sizeof(float)); } }
void post_process(std::shared_ptr<PaddlePredictor> predictor, float thresh, std::vector<std::string> class_names, const cv::Mat &image, int in_width, int in_height) { // NOLINT const int strides[3] = {8, 16, 32}; const int anchors[3][6] = {{10, 13, 16, 30, 33, 23}, {30, 61, 62, 45, 59, 119}, {116, 90, 156, 198, 373, 326}}; std::map<int, std::vector<Object>> raw_outputs; float r_w = in_width / static_cast<float>(image.cols); float r_h = in_height / static_cast<float>(image.rows); float r, off_x, off_y; if (r_h > r_w) { r = r_w; off_x = 0; off_y = static_cast<int>((in_height - r_w * image.rows) / 2); } else { r = r_h; off_y = 0; off_x = static_cast<int>((in_width - r_h * image.cols) / 2); } for (int k = 0; k < 3; k++) { std::unique_ptr<const Tensor> output_tensor( std::move(predictor->GetOutput(k))); auto *outptr = output_tensor->data<float>(); auto shape_out = output_tensor->shape(); std::vector<int> shape_new(shape_out.begin(), shape_out.end()); int xdim = static_cast<int>(in_width / strides[k]); extract_boxes(outptr, &raw_outputs, strides[k], anchors[k], shape_new, r, thresh, off_x, off_y, xdim); } std::vector<Object> outs; nms(&raw_outputs, &outs, 0.45); }
模型可能算子不支持,也可能有其他问题,需要更多详细的日志信息, 这是需要打开日志模式。下载PaddleLite源码重新编译:
./lite/tools/build_android_armv8.sh --with_opencl=ON --arch=armv8 --with_extra=ON --with_log=ON
cd Paddle-Lite-Demo/libs
# 下载所需要的 Paddle Lite 预测库
sh download.sh
cd ../object_detection/assets
# 下载OPT 优化后模型、测试图片、标签文件
sh download.sh
cd ../android/app/shell/cxx/yolov5n_detection
# 更新 NDK_ROOT 路径,然后完成可执行文件的编译和运行
sh build.sh
# CMakeList.txt 里的 System 默认设置是linux;如果在Mac 运行,则需将 CMAKE_SYSTEM_NAME 变量设置为 drawn
cpu推理性能:
推理结果:
gpu推理性能: shell下相比cpu测试没有明显的性能提升, 输出精度与cpu对齐
参考教程: https://aistudio.baidu.com/aistudio/projectdetail/3431580
注意:如果您的 Android Studio 尚未配置 NDK ,请根据 Android Studio 用户指南中的安装及配置 NDK 和 CMake 内容,预先配置好 NDK 。您可以选择最新的 NDK 版本,或者使用 Paddle Lite 预测库版本一样的 NDK
使用AndroidStudio打开工程目录: https://github.com/PaddlePaddle/Paddle-Lite-Demo/tree/develop/object_detection/android/shell/cxx/yolov5n_detection
更新预测库到demo工程中:
/java/jar$ cp PaddlePredictor.jar yolov5n_detection_demo/app/PaddleLite/java/
/java/so$ cp libpaddle_lite_jni.so yolov5n_detection_demo/app/PaddleLite/java/libs/arm64-v8a/
cxx/include$ cp ./* yolov5n_detection_demo/app/PaddleLite/cxx/include/
cxx/lib$ cp libpaddle_light_api_shared.so yolov5n_detection_demo/app/PaddleLite/cxx/libs/arm64-v8a/
在性能不错的小米手机上可达到50fps
使用上一个项目裁剪训练的权重: yolov5_paddle_prune/weights/finetune.pdparams
转换静态图时需要适配cfg配置文件,将裁剪后的模型通过cfg配置文件加载
相比之前的nb文件大小, 裁剪后从 7.5MB 降低到 2.8MB.
%cd /home/aistudio/yolov5_paddle_prune
!python convert_static_prune.py --weights ./weights/finetune.pdparams --cfg cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg --data ./data/coco_person.yaml --source crop.jpg
/home/aistudio/yolov5_paddle_prune
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/__init__.py:107: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
from collections import MutableMapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/rcsetup.py:20: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
from collections import Iterable, Mapping
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/colors.py:53: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
from collections import Sized
[34m[1mconvert_static_prune: [0mweights=['./weights/finetune.pdparams'], cfg=cfg/prune_0.5_keep_0.01_8x_yolov5n_v6_person.cfg, source=crop.jpg, imgsz=[320, 320], conf_thres=0.01, iou_thres=0.6, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, dnn=False, single_cls=False, data=./data/coco_person.yaml, hyp=data/hyps/hyp.scratch.yaml
[31m[1mrequirements:[0m tqdm>=4.36.1 not found and is required by YOLOv5, attempting auto-update...
# 按照同样的方式生成nb文件, 并部署到同样的安卓手机, paddle-lite-demo代码不需要额外的改动
%cd /home/aistudio/yolov5_paddle_prune/
!pip install paddlelite==2.11
# cpu
!paddle_lite_opt --model_file=./prune_net.pdmodel --param_file=prune_net.pdiparams --optimize_out_type=naive_buffer --optimize_out=./yolov5n_cpu_prune --valid_targets=arm
#gpu
!paddle_lite_opt --model_file=./prune_net.pdmodel --param_file=prune_net.pdiparams --optimize_out_type=naive_buffer --optimize_out=./yolov5n_gpu_prune --valid_targets=opencl,arm
/home/aistudio/yolov5_paddle_prune Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple Collecting paddlelite==2.11 Downloading https://pypi.tuna.tsinghua.edu.cn/packages/8b/d7/8babc059bff1d02dc85e32fb8bff3c56680db5f5b20246467f9fcbcfe62c/paddlelite-2.11-cp37-cp37m-manylinux1_x86_64.whl (47.1 MB) [2K [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m47.1/47.1 MB[0m [31m12.8 MB/s[0m eta [36m0:00:00[0m00:01[0m00:01[0m [?25hInstalling collected packages: paddlelite Successfully installed paddlelite-2.11 [33mWARNING: You are using pip version 22.0.4; however, version 22.1.2 is available. You should consider upgrading via the '/opt/conda/envs/python35-paddle120-env/bin/python -m pip install --upgrade pip' command.[0m[33m [0mLoading topology data from ./prune_net.pdmodel Loading params data from prune_net.pdiparams 1. Model is successfully loaded! 2. Model is optimized and saved into ./yolov5n_cpu_prune.nb successfully Loading topology data from ./prune_net.pdmodel Loading params data from prune_net.pdiparams 1. Model is successfully loaded! 2. Model is optimized and saved into ./yolov5n_gpu_prune.nb successfully
替换之前的模型, 推理速度大幅度降低
CPU下运行:
GPU下运行:
注意: 测试时该模型的conf阈值设置0.01, 输出结果 05000319_yolov5n_detection_result.jpg
部署到安卓手机, 原来的代码有点bug, 设置阈值0.01无法生效,需要修改一点源码, 将confThresh_替换成scoreThreshold_
本项目详细记录了如何将yolov5模型部署到安卓手机的操作步骤和踩坑经历,不经将原始权重成功部署,还能将自己训练的裁剪后的模型也能无需修改源码进行部署。
我在AI Studio上获得钻石等级,点亮10个徽章,来互关呀~
https://aistudio.baidu.com/aistudio/personalcenter/thirdview/89442
请点击此处查看本环境基本用法.
Please click here for more detailed instructions.
此文为搬运,原作链接:https://aistudio.baidu.com/aistudio/projectdetail/4245931
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。