(I) OpenVINO installation:
See [PyTorch] Deploying a model to production: OpenVINO installation and Python/C++ environment setup.
(II) The OpenVINO deployment workflow is as follows:
The OpenVINO toolkit's components for accelerating AI inference include Model Optimizer, a Python tool that optimizes neural network models, and Inference Engine, a software package that accelerates inference computation.
The Intermediate Representation (IR) consists of two files: an *.xml file describing the network topology and a *.bin file storing the model's weight parameters.
The IR files are then loaded into the runtime engine for inference.
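For orientation, loading the IR into the runtime takes only a few lines. Below is a minimal sketch, assuming OpenVINO >= 2022.1, a hypothetical yolov5s.xml/yolov5s.bin pair, and an input_blob array prepared elsewhere:

from openvino.runtime import Core

ie = Core()
model = ie.read_model(model="yolov5s.xml")  # the matching yolov5s.bin is located automatically
compiled_model = ie.compile_model(model=model, device_name="CPU")
request = compiled_model.create_infer_request()
results = request.infer({0: input_blob})  # input_blob: a 1x3x640x640 float32 NCHW array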
YOLOv5 officially provides a way to export to OpenVINO, but to keep the approach generic, the conversion path used here is:
YOLOv5 .pt file → ONNX → OpenVINO IR
This is because a PyTorch model must first be converted to ONNX before Model Optimizer can turn it into IR.
(III) Converting YOLOv5 to ONNX
(1) Download the pretrained model: yolov5s.pt.
(2) Run export.py from the yolov5 project directory; the arguments used in this article are:
python export.py --weights D:\yolov5-master\yolov5s.pt --device 0 --batch-size 1 --imgsz 640 640 --iou-thres 0.6 --conf-thres 0.65 --include onnx
(3) Once yolov5s.onnx has been generated, optimize it with Model Optimizer to produce the IR files.
In PyCharm, switch to the Model Optimizer directory (see the Dev Tools installation path), e.g. D:\anaconda\envs\mypytorch\Lib\site-packages\openvino\tools\mo, and run:
mo --input_model "D:\yolov5-master\yolov5s.onnx" --data_type FP16 --output_dir "D:\yolov5-master\openvino"
Note: in OpenVINO 2022 there is no need to explicitly specify --input_shape "[1,3,512,1024]".
This finally produces yolov5s.bin, yolov5s.mapping, and yolov5s.xml in the D:\yolov5-master\openvino folder.
Alternatively, the ONNX file can be read in Python and then serialized to IR:

from openvino.runtime import Core
from openvino.offline_transformations import serialize

ie = Core()
onnx_model_path = r"D:\flask_pytorch\yolov5s.onnx"
model_onnx = ie.read_model(model=onnx_model_path)
# compiling is optional here; it merely verifies the model loads on the target device
compiled_model_onnx = ie.compile_model(model=model_onnx, device_name="CPU")
serialize(model=model_onnx, model_path="exported_onnx_model.xml", weights_path="exported_onnx_model.bin")
The model's inputs and outputs can be inspected with the following code:
ie = Core()
onnx_model_path = r"D:\flask_pytorch\yolov5s.onnx"
model_onnx = ie.read_model(model=onnx_model_path)
print(model_onnx.inputs)
print(model_onnx.outputs)
Running this prints:
[<Output: names[images] shape{1,3,640,640} type: f32>]
[<Output: names[output] shape{1,25200,85} type: f32>,
 <Output: names[onnx::Sigmoid_339] shape{1,3,80,80,85} type: f32>,
 <Output: names[onnx::Sigmoid_391] shape{1,3,40,40,85} type: f32>,
 <Output: names[onnx::Sigmoid_443] shape{1,3,20,20,85} type: f32>]
A standard yolov5-5.0 model has three outputs:
1x255x80x80
1x255x40x40
1x255x20x20
Here 255 = 85 × 3. The 3 refers to the 3 boxes produced by the 3 anchors at each grid cell (not the RGB channels; the notion of RGB no longer exists at the output). The 85 is 5 + 80, where 80 is the number of classes, each with its own label score, and 5 is the four box coordinates plus one box score.
Output 0, [<Output: names[output] shape{1,25200,85} type: f32>], merges the three: 3 × (80×80 + 40×40 + 20×20) = 25200 candidate boxes. The sample code below uses output 0.
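To make that layout concrete, here is a small NumPy sketch (using a zero-filled dummy array in place of a real inference result) showing how the merged output decomposes:

import numpy as np

out = np.zeros((1, 25200, 85), dtype=np.float32)  # dummy stand-in for output 0
assert out.shape[1] == 3 * (80 * 80 + 40 * 40 + 20 * 20)  # 25200 candidate boxes
xywh = out[0, :, 0:4]  # box center x, center y, width, height
obj = out[0, :, 4]     # box (objectness) score
cls = out[0, :, 5:]    # 80 per-class label scores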
(IV) Code
The code below uses OpenVINO's asynchronous mode:
In asynchronous mode, once AI inference on the first frame has been started, we do not wait for it to finish. Instead, we immediately capture the second frame and preprocess it, and only then check whether inference on the first frame is complete, processing its output if so. The benefit is that inference on the first frame runs in parallel with capture and preprocessing of the second frame, which reduces waiting time and improves both hardware utilization and throughput.
import cv2
import numpy as np
import random
from openvino.inference_engine import IECore  # legacy Inference Engine API, still shipped with OpenVINO >= 2022.1
def letterbox(img, new_shape=(640, 640), color=(114, 114, 114), scaleup=False, stride=32):
    """
    Resize and pad the image to the target size while keeping its aspect ratio
    (resize and pad image while meeting stride-multiple constraints); this differs
    from YOLOv4's letterbox. With the stride-mod line below commented out, a
    1920x1080 image ends up padded to the full 640x640 (enabling that line would
    give 640x384 instead).
    :param img: original image, HWC
    :param new_shape: target size after scaling
    :param color: padding color
    :param scaleup: True: also upscale smaller images; False: only downscale
    :return: img: letterboxed image, HWC
             ratio: wh ratio
             (dw, dh): padding on w and h
    """
    shape = img.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    # only downscale, since upscaling blurs the image
    # (for better test mAP) scaleup=False: images larger than new_shape (r<1) are scaled down,
    # images smaller than new_shape (r>1) are left unchanged
    if not scaleup:  # only scale down, do not scale up (for better test mAP)
        r = min(r, 1.0)
    ratio = r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    # taking the remainder would keep the padded image a multiple of the stride
    # (32 for 416x416 input, 64 for 512x512):
    # dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    # pad both sides of the shorter dimension rather than only one side
    # divide padding into 2 sides
    dw /= 2
    dh /= 2
    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return img, ratio, (dw, dh)
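# A hedged usage sketch (not from the original post): letterbox on a hypothetical
# 1920x1080 frame. r = min(640/1080, 640/1920) = 1/3, new_unpad = (640, 360),
# dh = 140 per side, so with the stride-mod line commented out the result is
# padded to the full 640x640:
#   frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
#   img, ratio, (dw, dh) = letterbox(frame)  # img.shape == (640, 640, 3)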
def iou(b1, b2):
    # IoU between one box b1 (xyxy) and an (n, 4+) array of boxes b2
    b1_x1, b1_y1, b1_x2, b1_y2 = b1[0], b1[1], b1[2], b1[3]
    b2_x1, b2_y1, b2_x2, b2_y2 = b2[:, 0], b2[:, 1], b2[:, 2], b2[:, 3]
    inter_rect_x1 = np.maximum(b1_x1, b2_x1)
    inter_rect_y1 = np.maximum(b1_y1, b2_y1)
    inter_rect_x2 = np.minimum(b1_x2, b2_x2)
    inter_rect_y2 = np.minimum(b1_y2, b2_y2)
    inter_area = np.maximum(inter_rect_x2 - inter_rect_x1, 0) * \
                 np.maximum(inter_rect_y2 - inter_rect_y1, 0)
    area_b1 = (b1_x2 - b1_x1) * (b1_y2 - b1_y1)
    area_b2 = (b2_x2 - b2_x1) * (b2_y2 - b2_y1)
    iou = inter_area / np.maximum((area_b1 + area_b2 - inter_area), 1e-6)
    return iou
# non-maximum suppression
def non_max_suppression(boxes, conf_thres=0.5, nms_thres=0.4, ratio=1, pad=(20, 20)):
    # batch size
    bs = np.shape(boxes)[0]
    # convert xywh to xyxy
    shape_boxes = np.zeros_like(boxes[:, :, :4])
    shape_boxes[:, :, 0] = boxes[:, :, 0] - boxes[:, :, 2] / 2
    shape_boxes[:, :, 1] = boxes[:, :, 1] - boxes[:, :, 3] / 2
    shape_boxes[:, :, 2] = boxes[:, :, 0] + boxes[:, :, 2] / 2
    shape_boxes[:, :, 3] = boxes[:, :, 1] + boxes[:, :, 3] / 2
    boxes[:, :, :4] = shape_boxes
    boxes[:, :, 5:] *= boxes[:, :, 4:5]  # class scores *= objectness
    # output holds the predictions for each image; at inference time there is usually one image
    output = []
    for i in range(bs):
        predictions = boxes[i]  # predicted boxes in xyxy, shape == (25200, 85)
        score = np.max(predictions[:, 5:], axis=-1)
        # score = predictions[:, 4]  # objectness score, shape == (25200,)
        mask = score > conf_thres  # confidence mask == [False, False, True, ...]; True rows are kept, False rows dropped
        detections = predictions[mask]  # first filtering pass, shape == (n, 85)
        class_conf = np.expand_dims(np.max(detections[:, 5:], axis=-1), axis=-1)  # class confidence of each box
        class_pred = np.expand_dims(np.argmax(detections[:, 5:], axis=-1), axis=-1)  # class index of each box
        # stack the results: (num_boxes, 4 box coordinates + 1 class confidence + 1 class index)
        detections = np.concatenate([detections[:, :4], class_conf, class_pred], axis=-1)  # shape = (num_boxes, 6)
        unique_class = np.unique(detections[:, -1])  # all classes present
        if len(unique_class) == 0:
            continue
        best_box = []
        for c in unique_class:
            # take the predictions belonging to class c
            cls_mask = detections[:, -1] == c
            detection = detections[cls_mask]
            # sort by class confidence, highest first
            scores = detection[:, 4]
            arg_sort = np.argsort(scores)[::-1]  # returns indices
            detection = detection[arg_sort]
            while len(detection) != 0:
                best_box.append(detection[0])
                if len(detection) == 1:
                    break
                # IoU between the current highest-confidence box and the remaining boxes
                ious = iou(best_box[-1], detection[1:])
                detection = detection[1:][ious < nms_thres]  # keep boxes with IoU below nms_thres; each round drops at least one
        output.append(best_box)
    boxes_loc = []
    conf_loc = []
    class_loc = []
    if len(output):
        for i in range(len(output)):
            pred = output[i]
            for j, det in enumerate(pred):
                if len(det):
                    # map the box coordinates back to the original image
                    det[0] = (det[0] - pad[0]) / ratio
                    det[2] = (det[2] - pad[0]) / ratio
                    det[1] = (det[1] - pad[1]) / ratio
                    det[3] = (det[3] - pad[1]) / ratio
                    boxes_loc.append([det[0], det[1], det[2], det[3]])
                    conf_loc.append(det[4])
                    class_loc.append(det[5])
    return boxes_loc, conf_loc, class_loc
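# A hedged sanity check (not from the original post): one confident synthetic box
# at (cx=320, cy=320, w=100, h=100) with objectness 0.9 and class-0 score 0.8
# survives NMS with confidence 0.9 * 0.8 = 0.72:
#   dummy = np.zeros((1, 4, 85), dtype=np.float32)
#   dummy[0, 0, :4] = [320, 320, 100, 100]
#   dummy[0, 0, 4], dummy[0, 0, 5] = 0.9, 0.8
#   non_max_suppression(dummy, ratio=1.0, pad=(0, 0))
#   # -> ([[270.0, 270.0, 370.0, 370.0]], [0.72], [0.0])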
def plot_one_box(img, boxes, conf, clas_id, line_thickness=3, names=None):
    # draw the bounding box
    # tl = line width of the box: either line_thickness, or adapted to the image size
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1
    color = [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(boxes[0]), int(boxes[1])), (int(boxes[2]), int(boxes[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    # draw the class label box
    label = f'{names[int(clas_id)]} {conf:.2f}'
    tf = max(tl - 1, 1)  # label font thickness
    t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
    c2 = (c1[0] + t_size[0], c1[1] - t_size[1] - 3)
    cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)
    cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
if __name__ == '__main__':
    # the sample uses the 80 COCO class labels
names = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard',
'cell phone','microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors',
'teddy bear', 'hair drier', 'toothbrush']
    conf_thres = 0.5
    nms_thres = 0.4
    # img_path = r'C:\Users\25360\Desktop\people_test.webp.jpg'
    # frame = cv2.imread(img_path)
    model_xml = r"D:\flask_pytorch\exported_onnx_model.xml"
    model_bin = r"D:\flask_pytorch\exported_onnx_model.bin"
    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    # two inference requests; HETERO assigns GPU and CPU resources automatically:
    exec_net = ie.load_network(network=net, num_requests=2, device_name="HETERO:GPU,CPU")
    input_layer = next(iter(net.input_info))
    is_async_mode = True
    # open the computer's webcam:
    cap = cv2.VideoCapture(0)
    # fps = cap.get(cv2.CAP_PROP_FPS)  # frame rate
    # number of frames reported by the capture source:
    number_input_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # an invalid value means the count cannot be obtained (e.g. a live camera); default to 1
    # in that case, otherwise keep the reported frame count
    number_input_frames = 1 if number_input_frames != -1 and number_input_frames < 0 else number_input_frames
    wait_key_code = 1
    # request IDs for the current frame and the next frame in asynchronous mode:
    cur_request_id = 0
    next_request_id = 1
    if number_input_frames != 1:
        is_async_mode = True
        # grab the initial frame for asynchronous mode:
        ret, frame = cap.read()
    else:
        is_async_mode = False
        wait_key_code = 0
    while cap.isOpened():
        # Asynchronous mode: after starting inference on one frame, capture and preprocess the
        # next frame without waiting, then check whether the earlier frame's inference is done
        # and process its results if so. Inference thus runs in parallel with capture and
        # preprocessing, reducing waiting time and improving hardware utilization and throughput.
        if is_async_mode:
            # asynchronous mode: the captured frame feeds the next request
            ret, next_frame = cap.read()
        else:
            # regular synchronous mode: the captured frame feeds the current request
            ret, frame = cap.read()
        if not ret:
            break
        if is_async_mode:
            request_id = next_request_id
            img, ratio, (dw, dh) = letterbox(next_frame)
        else:
            request_id = cur_request_id
            img, ratio, (dw, dh) = letterbox(frame)
        # np.ascontiguousarray() converts an array stored non-contiguously in memory into a
        # contiguous one, which is faster to process; blobFromImage takes size as (width, height)
        blob = cv2.dnn.blobFromImage(np.ascontiguousarray(img), 1 / 255.0, (img.shape[1], img.shape[0]),
                                     swapRB=True, crop=False)
        exec_net.start_async(request_id=request_id, inputs={input_layer: blob})
        # wait() returns 0 once the inference result is available; in asynchronous mode we wait
        # on the current request while the one just started keeps running:
        if exec_net.requests[cur_request_id].wait(-1) == 0:
            # use the merged 3-in-1 output of recent YOLOv5 versions directly:
            res = exec_net.requests[cur_request_id].output_blobs["output"]
            outs = res.buffer
            boxes_loc, conf_loc, class_loc = non_max_suppression(outs, conf_thres=conf_thres,
                                                                 nms_thres=nms_thres,
                                                                 ratio=ratio, pad=(dw, dh))
            # visualization
            for i in range(len(boxes_loc)):
                boxes = boxes_loc[i]
                conf = conf_loc[i]
                clas_id = class_loc[i]
                plot_one_box(frame, boxes, conf, clas_id, line_thickness=3, names=names)
        cv2.imshow("result", frame)
        key = cv2.waitKey(wait_key_code)
        if is_async_mode:
            # in asynchronous mode, swap the current and next request IDs:
            cur_request_id, next_request_id = next_request_id, cur_request_id
            frame = next_frame
        # exit when ESC is pressed:
        if key == 27:
            break
    # release resources when the while loop ends:
    cap.release()
    cv2.destroyAllWindows()
Example of the video detection results: