
YOLOv8 in Practice: Model Inference and Deployment

        This article continues from the previous one: YOLOv8 in Practice: Custom Dataset Creation, Model Training, and Testing.

        After model training and testing, we obtained a .pt file saved under './runs/detect/train/weights/'. This file stores the model's weights. To make further inference and deployment easier, we need to convert the .pt file to ONNX format, and then convert the ONNX file to a TensorRT engine. The concrete steps are described below.

        This article mainly references: YOLOv8 Model Training + Deployment.

Table of Contents

I. Converting pt to ONNX
    1. Modify the configuration file
    2. Write the conversion script
II. Converting ONNX to an engine
    1. Install TensorRT
        1) Download a suitable TensorRT version
        2) Configure the system environment
        3) Install the TensorRT Python wheel
    2. Convert ONNX to an engine
III. Inference


I. Converting pt to ONNX

This step converts the pt weight file to ONNX format.

1. Modify the configuration file

        Open the file yolov8/ultralytics/nn/modules/head.py. Around lines 76 and 252 there is a statement "return y if self.export else (y, x)"; change it to "return y.permute(0, 2, 1) if self.export else (y, x)". This transposes the exported output from (batch, 4 + num_classes, 8400) to (batch, 8400, 4 + num_classes), which is the layout that the inference code in Section III expects.
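For reference, the change to the return statement looks like this (only this line changes; the surrounding Detect head code stays as-is):

# Before (around lines 76 and 252 of ultralytics/nn/modules/head.py):
return y if self.export else (y, x)

# After: transpose so the exported output is (batch, 8400, 4 + num_classes)
return y.permute(0, 2, 1) if self.export else (y, x)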


2. Write the conversion script

        Create a new file pt2onnx.py under /yolov8 with the following content:

from ultralytics import YOLO

model = YOLO('./runs/detect/train/weights/last.pt')
model.export(format="onnx")

Run pt2onnx.py to produce the ONNX file, which is saved as ./runs/detect/train/weights/last.onnx.
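If you need more control over the export, model.export() also accepts arguments such as imgsz and opset; a sketch (the values below are illustrative assumptions, adjust them to your model and TensorRT version):

from ultralytics import YOLO

model = YOLO('./runs/detect/train/weights/last.pt')
# imgsz fixes the input resolution baked into the ONNX graph;
# opset pins the ONNX opset version, which can matter for TensorRT compatibility
model.export(format="onnx", imgsz=640, opset=12)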

        You can inspect the network structure of the ONNX model with https://netron.app/.

II. Converting ONNX to an engine

For this step I use the trtexec.exe tool that ships with TensorRT, so if you have not installed TensorRT yet, install it first.

1. Install TensorRT

        For installing TensorRT, I mainly followed: TensorRT Installation and Usage.

1) Download a suitable TensorRT version

        First, download TensorRT from the NVIDIA website (NVIDIA TensorRT 8.x Download | NVIDIA Developer). After checking "I agree to...", the available versions are listed; I chose the first one, TensorRT 8.6 GA.

My environment is Windows with CUDA 12.2, so I downloaded the Windows zip package for CUDA 12.x (the download page only lists CUDA 12.0 and CUDA 12.1 as supported, but I verified that it also works with 12.2).

Unzip the TensorRT archive you just downloaded and place the extracted files in your installation directory (in my case, D:\TensorRT-8.6.1.6).

2) Configure the system environment

        Control Panel -> System -> Advanced system settings -> Environment Variables -> System variables -> Path (add the TensorRT lib path, e.g., D:\TensorRT-8.6.1.6\lib).

3) Install the TensorRT Python wheel

        Open a terminal, activate your virtual environment, cd to D:\TensorRT-8.6.1.6\python, and check your Python version (e.g., with python --version).

        Based on the Python version of the current virtual environment (mine is Python 3.8), pick the matching wheel under D:\TensorRT-8.6.1.6\python, copy its filename, and install it with pip in the terminal.
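For example (the exact wheel filename depends on your TensorRT and Python versions; the one below assumes TensorRT 8.6.1 with Python 3.8):

pip install tensorrt-8.6.1-cp38-none-win_amd64.whl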

        After installation, verify that it succeeded: python -c "import tensorrt;print(tensorrt.__version__)"

If the installation succeeded, this prints the TensorRT version number.
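Note that the inference script in Section III also needs pycuda, which is not bundled with TensorRT, so install it as well (pip install pycuda). A quick combined sanity check, as a sketch:

import tensorrt as trt
import pycuda.autoinit  # creates and activates a CUDA context on import

print(trt.__version__)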

2. Convert ONNX to an engine

        Switch the terminal to the yolov8 folder and run the following command:

D:\TensorRT-8.6.1.6\bin\trtexec.exe --onnx=./runs/detect/train/weights/last.onnx --saveEngine=last.engine --fp16

This produces the engine file we want. (The process takes about 3-5 minutes, so be patient.)
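Before moving on, you can confirm the engine was built correctly by deserializing it and listing its bindings; a minimal sketch, assuming TensorRT 8.x and an engine named last.engine in the current folder:

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("last.engine", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
for name in engine:  # iterating an engine yields its binding names
    print(name, engine.get_tensor_shape(name))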

III. Inference

        Create a new file inference.py under the yolov8 folder with the content below. Before running it, adjust engine_file_path, image_dir, and the categories list in the __main__ block to match your own setup. Running the program performs model inference, and the results are saved under /yolov8/output/.

  1. """
  2. An example that uses TensorRT's Python api to make inferences.
  3. """
  4. import ctypes
  5. import os
  6. import shutil
  7. import random
  8. import sys
  9. import threading
  10. import time
  11. import cv2
  12. import numpy as np
  13. import pycuda.autoinit
  14. import pycuda.driver as cuda
  15. import tensorrt as trt
  16. CONF_THRESH = 0.5
  17. IOU_THRESHOLD = 0.45
  18. LEN_ALL_RESULT = 705600 ##42000 ##(20*20+40*40+80*80)*(num_cls+4) 一个batch长度
  19. NUM_CLASSES = 1 ##1
  20. OBJ_THRESH = 0.4
  21. def get_img_path_batches(batch_size, img_dir):
  22. ret = []
  23. batch = []
  24. for root, dirs, files in os.walk(img_dir):
  25. for name in files:
  26. if len(batch) == batch_size:
  27. ret.append(batch)
  28. batch = []
  29. batch.append(os.path.join(root, name))
  30. if len(batch) > 0:
  31. ret.append(batch)
  32. return ret
  33. def plot_one_box(x, img, color=None, label=None, line_thickness=None):
  34. """
  35. description: Plots one bounding box on image img,
  36. this function comes from YoLov5 project.
  37. param:
  38. x: a box likes [x1,y1,x2,y2]
  39. img: a opencv image object
  40. color: color to draw rectangle, such as (0,255,0)
  41. label: str
  42. line_thickness: int
  43. return:
  44. no return
  45. """
  46. tl = (
  47. line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1
  48. ) # line/font thickness
  49. color = color or [random.randint(0, 255) for _ in range(3)]
  50. c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
  51. cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
  52. if label:
  53. tf = max(tl - 1, 1) # font thickness
  54. t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
  55. c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
  56. cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled
  57. cv2.putText(
  58. img,
  59. label,
  60. (c1[0], c1[1] - 2),
  61. 0,
  62. tl / 3,
  63. [225, 255, 255],
  64. thickness=tf,
  65. lineType=cv2.LINE_AA,
  66. )
  67. class YoLov8TRT(object):
  68. """
  69. description: A YOLOv5 class that warps TensorRT ops, preprocess and postprocess ops.
  70. """
  71. def __init__(self, engine_file_path):
  72. # Create a Context on this device,
  73. self.ctx = cuda.Device(0).make_context()
  74. stream = cuda.Stream()
  75. TRT_LOGGER = trt.Logger(trt.Logger.INFO)
  76. runtime = trt.Runtime(TRT_LOGGER)
  77. # Deserialize the engine from file
  78. with open(engine_file_path, "rb") as f:
  79. engine = runtime.deserialize_cuda_engine(f.read())
  80. context = engine.create_execution_context()
  81. host_inputs = []
  82. cuda_inputs = []
  83. host_outputs = []
  84. cuda_outputs = []
  85. bindings = []
  86. for binding in engine:
  87. print('bingding:', binding, engine.get_tensor_shape(binding))
  88. size = trt.volume(engine.get_tensor_shape(binding)) * engine.max_batch_size
  89. dtype = trt.nptype(engine.get_tensor_dtype(binding))
  90. # Allocate host and device buffers
  91. host_mem = cuda.pagelocked_empty(size, dtype)
  92. cuda_mem = cuda.mem_alloc(host_mem.nbytes)
  93. # Append the device buffer to device bindings.
  94. bindings.append(int(cuda_mem))
  95. # Append to the appropriate list.
  96. if engine.binding_is_input(binding):
  97. self.input_w = engine.get_tensor_shape(binding)[-1]
  98. self.input_h = engine.get_tensor_shape(binding)[-2]
  99. host_inputs.append(host_mem)
  100. cuda_inputs.append(cuda_mem)
  101. else:
  102. host_outputs.append(host_mem)
  103. cuda_outputs.append(cuda_mem)
  104. # Store
  105. self.stream = stream
  106. self.context = context
  107. self.engine = engine
  108. self.host_inputs = host_inputs
  109. self.cuda_inputs = cuda_inputs
  110. self.host_outputs = host_outputs
  111. self.cuda_outputs = cuda_outputs
  112. self.bindings = bindings
  113. self.batch_size = engine.max_batch_size
  114. def infer(self, raw_image_generator):
  115. threading.Thread.__init__(self)
  116. # Make self the active context, pushing it on top of the context stack.
  117. self.ctx.push()
  118. # Restore
  119. stream = self.stream
  120. context = self.context
  121. engine = self.engine
  122. host_inputs = self.host_inputs
  123. cuda_inputs = self.cuda_inputs
  124. host_outputs = self.host_outputs
  125. cuda_outputs = self.cuda_outputs
  126. bindings = self.bindings
  127. # Do image preprocess
  128. batch_image_raw = []
  129. batch_origin_h = []
  130. batch_origin_w = []
  131. batch_input_image = np.empty(shape=[self.batch_size, 3, self.input_h, self.input_w])
  132. for i, image_raw in enumerate(raw_image_generator):
  133. input_image, image_raw, origin_h, origin_w = self.preprocess_image(image_raw)
  134. batch_image_raw.append(image_raw)
  135. batch_origin_h.append(origin_h)
  136. batch_origin_w.append(origin_w)
  137. np.copyto(batch_input_image[i], input_image)
  138. batch_input_image = np.ascontiguousarray(batch_input_image)
  139. # Copy input image to host buffer
  140. np.copyto(host_inputs[0], batch_input_image.ravel())
  141. start = time.time()
  142. # Transfer input data to the GPU.
  143. cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream)
  144. # Run inference.
  145. context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
  146. # context.execute_async(batch_size=self.batch_size, bindings=bindings, stream_handle=stream.handle)
  147. # Transfer predictions back from the GPU.
  148. cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream)
  149. # Synchronize the stream
  150. stream.synchronize()
  151. end = time.time()
  152. # Remove any context from the top of the context stack, deactivating it.
  153. self.ctx.pop()
  154. # Here we use the first row of output in that batch_size = 1
  155. output = host_outputs[0]
  156. # Do postprocess
  157. for i in range(self.batch_size):
  158. result_boxes, result_scores, result_classid = self.post_process_new(
  159. output[i * LEN_ALL_RESULT: (i + 1) * LEN_ALL_RESULT], batch_origin_h[i], batch_origin_w[i],
  160. batch_input_image[i]
  161. )
  162. if result_boxes is None:
  163. continue
  164. # Draw rectangles and labels on the original image
  165. for j in range(len(result_boxes)):
  166. box = result_boxes[j]
  167. plot_one_box(
  168. box,
  169. batch_image_raw[i],
  170. label="{}:{:.2f}".format(
  171. categories[int(result_classid[j])], result_scores[j]
  172. ),
  173. )
  174. return batch_image_raw, end - start
  175. def destroy(self):
  176. # Remove any context from the top of the context stack, deactivating it.
  177. self.ctx.pop()
  178. def get_raw_image(self, image_path_batch):
  179. """
  180. description: Read an image from image path
  181. """
  182. for img_path in image_path_batch:
  183. yield cv2.imread(img_path)
  184. def get_raw_image_zeros(self, image_path_batch=None):
  185. """
  186. description: Ready data for warmup
  187. """
  188. for _ in range(self.batch_size):
  189. yield np.zeros([self.input_h, self.input_w, 3], dtype=np.uint8)
  190. def preprocess_image(self, raw_bgr_image):
  191. """
  192. description: Convert BGR image to RGB,
  193. resize and pad it to target size, normalize to [0,1],
  194. transform to NCHW format.
  195. param:
  196. input_image_path: str, image path
  197. return:
  198. image: the processed image
  199. image_raw: the original image
  200. h: original height
  201. w: original width
  202. """
  203. image_raw = raw_bgr_image
  204. h, w, c = image_raw.shape
  205. image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
  206. # Calculate widht and height and paddings
  207. r_w = self.input_w / w
  208. r_h = self.input_h / h
  209. if r_h > r_w:
  210. tw = self.input_w
  211. th = int(r_w * h)
  212. tx1 = tx2 = 0
  213. ty1 = int((self.input_h - th) / 2)
  214. ty2 = self.input_h - th - ty1
  215. else:
  216. tw = int(r_h * w)
  217. th = self.input_h
  218. tx1 = int((self.input_w - tw) / 2)
  219. tx2 = self.input_w - tw - tx1
  220. ty1 = ty2 = 0
  221. # Resize the image with long side while maintaining ratio
  222. image = cv2.resize(image, (tw, th))
  223. # Pad the short side with (128,128,128)
  224. image = cv2.copyMakeBorder(
  225. image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, None, (128, 128, 128)
  226. )
  227. image = image.astype(np.float32)
  228. # Normalize to [0,1]
  229. image /= 255.0
  230. # HWC to CHW format:
  231. image = np.transpose(image, [2, 0, 1])
  232. # CHW to NCHW format
  233. image = np.expand_dims(image, axis=0)
  234. # Convert the image to row-major order, also known as "C order":
  235. image = np.ascontiguousarray(image)
  236. return image, image_raw, h, w
  237. def xywh2xyxy(self, origin_h, origin_w, x):
  238. """
  239. description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
  240. param:
  241. origin_h: height of original image
  242. origin_w: width of original image
  243. x: A boxes numpy, each row is a box [center_x, center_y, w, h]
  244. return:
  245. y: A boxes numpy, each row is a box [x1, y1, x2, y2]
  246. """
  247. y = np.zeros_like(x)
  248. r_w = self.input_w / origin_w
  249. r_h = self.input_h / origin_h
  250. if r_h > r_w:
  251. y[:, 0] = x[:, 0] - x[:, 2] / 2
  252. y[:, 2] = x[:, 0] + x[:, 2] / 2
  253. y[:, 1] = x[:, 1] - x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2
  254. y[:, 3] = x[:, 1] + x[:, 3] / 2 - (self.input_h - r_w * origin_h) / 2
  255. y /= r_w
  256. else:
  257. y[:, 0] = x[:, 0] - x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2
  258. y[:, 2] = x[:, 0] + x[:, 2] / 2 - (self.input_w - r_h * origin_w) / 2
  259. y[:, 1] = x[:, 1] - x[:, 3] / 2
  260. y[:, 3] = x[:, 1] + x[:, 3] / 2
  261. y /= r_h
  262. return y
  263. def post_process_new(self, output, origin_h, origin_w, img_pad):
  264. # Reshape to a two dimentional ndarray
  265. c, h, w = img_pad.shape
  266. ratio_w = w / origin_w
  267. ratio_h = h / origin_h
  268. num_anchors = int(((h / 32) * (w / 32) + (h / 16) * (w / 16) + (h / 8) * (w / 8)))
  269. pred = np.reshape(output, (num_anchors, 4 + NUM_CLASSES))
  270. results = []
  271. for detection in pred:
  272. score = detection[4:]
  273. classid = np.argmax(score)
  274. confidence = score[classid]
  275. if confidence > CONF_THRESH:
  276. if ratio_h > ratio_w:
  277. center_x = int(detection[0] / ratio_w)
  278. center_y = int((detection[1] - (h - ratio_w * origin_h) / 2) / ratio_w)
  279. width = int(detection[2] / ratio_w)
  280. height = int(detection[3] / ratio_w)
  281. x1 = int(center_x - width / 2)
  282. y1 = int(center_y - height / 2)
  283. x2 = int(center_x + width / 2)
  284. y2 = int(center_y + height / 2)
  285. else:
  286. center_x = int((detection[0] - (w - ratio_h * origin_w) / 2) / ratio_h)
  287. center_y = int(detection[1] / ratio_h)
  288. width = int(detection[2] / ratio_h)
  289. height = int(detection[3] / ratio_h)
  290. x1 = int(center_x - width / 2)
  291. y1 = int(center_y - height / 2)
  292. x2 = int(center_x + width / 2)
  293. y2 = int(center_y + height / 2)
  294. results.append([x1, y1, x2, y2, confidence, classid])
  295. results = np.array(results)
  296. if len(results) <= 0:
  297. return None, None, None
  298. # Do nms
  299. boxes = self.non_max_suppression(results, origin_h, origin_w, conf_thres=CONF_THRESH, nms_thres=IOU_THRESHOLD)
  300. result_boxes = boxes[:, :4] if len(boxes) else np.array([])
  301. result_scores = boxes[:, 4] if len(boxes) else np.array([])
  302. result_classid = boxes[:, 5] if len(boxes) else np.array([])
  303. return result_boxes, result_scores, result_classid
  304. def bbox_iou(self, box1, box2, x1y1x2y2=True):
  305. """
  306. description: compute the IoU of two bounding boxes
  307. param:
  308. box1: A box coordinate (can be (x1, y1, x2, y2) or (x, y, w, h))
  309. box2: A box coordinate (can be (x1, y1, x2, y2) or (x, y, w, h))
  310. x1y1x2y2: select the coordinate format
  311. return:
  312. iou: computed iou
  313. """
  314. if not x1y1x2y2:
  315. # Transform from center and width to exact coordinates
  316. b1_x1, b1_x2 = box1[:, 0] - box1[:, 2] / 2, box1[:, 0] + box1[:, 2] / 2
  317. b1_y1, b1_y2 = box1[:, 1] - box1[:, 3] / 2, box1[:, 1] + box1[:, 3] / 2
  318. b2_x1, b2_x2 = box2[:, 0] - box2[:, 2] / 2, box2[:, 0] + box2[:, 2] / 2
  319. b2_y1, b2_y2 = box2[:, 1] - box2[:, 3] / 2, box2[:, 1] + box2[:, 3] / 2
  320. else:
  321. # Get the coordinates of bounding boxes
  322. b1_x1, b1_y1, b1_x2, b1_y2 = box1[:, 0], box1[:, 1], box1[:, 2], box1[:, 3]
  323. b2_x1, b2_y1, b2_x2, b2_y2 = box2[:, 0], box2[:, 1], box2[:, 2], box2[:, 3]
  324. # Get the coordinates of the intersection rectangle
  325. inter_rect_x1 = np.maximum(b1_x1, b2_x1)
  326. inter_rect_y1 = np.maximum(b1_y1, b2_y1)
  327. inter_rect_x2 = np.minimum(b1_x2, b2_x2)
  328. inter_rect_y2 = np.minimum(b1_y2, b2_y2)
  329. # Intersection area
  330. inter_area = np.clip(inter_rect_x2 - inter_rect_x1 + 1, 0, None) * \
  331. np.clip(inter_rect_y2 - inter_rect_y1 + 1, 0, None)
  332. # Union Area
  333. b1_area = (b1_x2 - b1_x1 + 1) * (b1_y2 - b1_y1 + 1)
  334. b2_area = (b2_x2 - b2_x1 + 1) * (b2_y2 - b2_y1 + 1)
  335. iou = inter_area / (b1_area + b2_area - inter_area + 1e-16)
  336. return iou
  337. def non_max_suppression(self, prediction, origin_h, origin_w, conf_thres=0.5, nms_thres=0.4):
  338. """
  339. description: Removes detections with lower object confidence score than 'conf_thres' and performs
  340. Non-Maximum Suppression to further filter detections.
  341. param:
  342. prediction: detections, (x1, y1,x2, y2, conf, cls_id)
  343. origin_h: original image height
  344. origin_w: original image width
  345. conf_thres: a confidence threshold to filter detections
  346. nms_thres: a iou threshold to filter detections
  347. return:
  348. boxes: output after nms with the shape (x1, y1, x2, y2, conf, cls_id)
  349. """
  350. # Get the boxes that score > CONF_THRESH
  351. boxes = prediction[prediction[:, 4] >= conf_thres]
  352. # Trandform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2]
  353. # boxes[:, :4] = self.xywh2xyxy(origin_h, origin_w, boxes[:, :4])
  354. # clip the coordinates
  355. boxes[:, 0] = np.clip(boxes[:, 0], 0, origin_w)
  356. boxes[:, 2] = np.clip(boxes[:, 2], 0, origin_w)
  357. boxes[:, 1] = np.clip(boxes[:, 1], 0, origin_h)
  358. boxes[:, 3] = np.clip(boxes[:, 3], 0, origin_h)
  359. # Object confidence
  360. confs = boxes[:, 4]
  361. # Sort by the confs
  362. boxes = boxes[np.argsort(-confs)]
  363. # Perform non-maximum suppression
  364. keep_boxes = []
  365. while boxes.shape[0]:
  366. large_overlap = self.bbox_iou(np.expand_dims(boxes[0, :4], 0), boxes[:, :4]) > nms_thres
  367. label_match = boxes[0, -1] == boxes[:, -1]
  368. # Indices of boxes with lower confidence scores, large IOUs and matching labels
  369. invalid = large_overlap & label_match
  370. keep_boxes += [boxes[0]]
  371. boxes = boxes[~invalid]
  372. boxes = np.stack(keep_boxes, 0) if len(keep_boxes) else np.array([])
  373. return boxes
  374. def img_infer(yolov5_wrapper, image_path_batch):
  375. batch_image_raw, use_time = yolov5_wrapper.infer(yolov5_wrapper.get_raw_image(image_path_batch))
  376. for i, img_path in enumerate(image_path_batch):
  377. parent, filename = os.path.split(img_path)
  378. save_name = os.path.join('output', filename)
  379. # Save image
  380. cv2.imwrite(save_name, batch_image_raw[i])
  381. print('input->{}, time->{:.2f}ms, saving into output/'.format(image_path_batch, use_time * 1000))
  382. def warmup(yolov5_wrapper):
  383. batch_image_raw, use_time = yolov5_wrapper.infer(yolov5_wrapper.get_raw_image_zeros())
  384. print('warm_up->{}, time->{:.2f}ms'.format(batch_image_raw[0].shape, use_time * 1000))
  385. if __name__ == "__main__":
  386. engine_file_path = r"E:\yolov8\last.engine"
  387. # load coco labels
  388. categories = ["dog", "cat", "rabbit", "people", "car"]
  389. if os.path.exists('output/'):
  390. shutil.rmtree('output/')
  391. os.makedirs('output/')
  392. yolov8_wrapper = YoLov8TRT(engine_file_path)
  393. try:
  394. print('batch size is', yolov8_wrapper.batch_size)
  395. image_dir = r"E:\yolov8\data\data_nc5\test_images"
  396. image_path_batches = get_img_path_batches(yolov8_wrapper.batch_size, image_dir)
  397. for i in range(10):
  398. warmup(yolov8_wrapper)
  399. for batch in image_path_batches:
  400. img_infer(yolov8_wrapper, batch)
  401. finally:
  402. yolov8_wrapper.destroy()
