
YOLOv8-pose with TensorRT

Contents

I. Environment setup
    1. Create a virtual environment
    2. Download TensorRT
    3. Install TensorRT
    4. Install the other libraries
II. Model conversion and code
    (1) Model conversion
        Exporting the ONNX model
        Exporting the engine model
    (2) Code
        The complete TensorRT inference code
III. Problems encountered
    (1) Problems during model conversion
    (2) Notes on using the code
        1. Where to change the model path
        2. Changing the input parameters
        3. Changing keypoint colour and size
        4. Changing skeleton connection order and line width
        5. Using the code: read the comments
        6. Project directory structure
References

I. Environment setup

1. Create a virtual environment

conda create -n <env_name> python=X.X
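For example (the environment name and Python version below are only illustrative; use whatever matches your setup):

conda create -n yolov8_trt python=3.8
conda activate yolov8_trt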
2. Download TensorRT

Go to the official NVIDIA Developer download page (an NVIDIA developer account is required to log in).
Pick the version matching your CUDA installation; the version used here is TensorRT-8.4.3.1 (Windows zip package).

Download the zip archive and extract it.

3. Install TensorRT

1. Copy the contents of TensorRT-8.4.3.1\bin into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin.

2. Copy TensorRT's include folder into CUDA's include folder.

3. From TensorRT-8.4.3.1\lib, copy the .lib files into CUDA's lib folder and the .dll files into CUDA's bin folder (keep the .lib and .dll files separate).

4. Install the TensorRT Python wheel with pip install xxx.whl. The .whl files ship inside the TensorRT-8.4.3.1 package (python subfolder); pick the one matching your Python version and install it inside the virtual environment created above.

Check in Python that the installation succeeded:

import tensorrt as trt
trt.__version__

4. Install the other libraries

pip install pycuda
pip install opencv-contrib-python
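A quick sanity check that both packages import and that PyCUDA can see the GPU (a minimal sketch):

import cv2
import pycuda.driver as cuda
import pycuda.autoinit  # importing this creates a CUDA context on the default device

print('OpenCV version:', cv2.__version__)
print('GPU:', cuda.Device(0).name())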

II. Model conversion and code

(1) Model conversion

First convert the .pt weights to an ONNX model, then convert the ONNX model to a TensorRT engine.

Exporting the ONNX model

Use the following script (the ultralytics package is required; install it with pip):

from ultralytics import YOLO

# Load a model
model = YOLO("yolov8s-pose.pt")  # load a pretrained model (recommended for training)
success = model.export(format="onnx", opset=11, simplify=True)  # export the model to onnx format
assert success
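Optionally, the exported ONNX file can be checked before building the engine. This is a minimal sketch assuming onnxruntime is installed (pip install onnxruntime); it only confirms that the model runs and that the output has the expected pose layout:

import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("yolov8s-pose.onnx", providers=["CPUExecutionProvider"])
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = session.run(None, {session.get_inputs()[0].name: dummy})
print(outputs[0].shape)  # expected (1, 56, 8400) for a 640x640 input: 4 box + 1 class + 17 keypoints x 3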
Exporting the engine model

Use the following command:

trtexec.exe --onnx=your_model.onnx --saveEngine=your_model.engine --fp16
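If you prefer staying in Python, the same step can be done with the TensorRT 8.x Python API installed above. This is a minimal sketch equivalent to the trtexec command (file names are illustrative; trtexec remains the simpler route):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, engine_path, fp16=True):
    builder = trt.Builder(TRT_LOGGER)
    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)
    with open(onnx_path, 'rb') as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError('Failed to parse the ONNX file')
    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB workspace (this attribute is deprecated in newer TensorRT versions)
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)
    engine_bytes = builder.build_serialized_network(network, config)
    with open(engine_path, 'wb') as f:
        f.write(engine_bytes)

build_engine('yolov8s-pose.onnx', 'yolov8s-pose.engine')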
(2) Code
The complete TensorRT inference code:
'''
Author: [egrt]
Date: 2023-03-26 09:39:21
LastEditors: Egrt
LastEditTime: 2023-07-15 22:10:25
Description:
'''
import numpy as np
import time
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit
import cv2
from numpy import array
def resize_image(image, size, letterbox_image):
    ih, iw = image.shape[:2]
    h, w = size
    if letterbox_image:
        scale = min(w / iw, h / ih)
        nw = int(iw * scale)
        nh = int(ih * scale)
        image = cv2.resize(image, (nw, nh), interpolation=cv2.INTER_CUBIC)
        new_image = 128 * np.ones((h, w, 3), dtype=np.uint8)
        new_image[(h - nh) // 2:(h - nh) // 2 + nh, (w - nw) // 2:(w - nw) // 2 + nw, :] = image
    else:
        new_image = cv2.resize(image, (w, h), interpolation=cv2.INTER_CUBIC)
    scale = [iw / w, ih / h]
    return new_image, scale


def preprocess_input(image):
    image /= 255.0
    return image
def xywh2xyxy(x):
    """
    Convert bounding box coordinates from (x, y, width, height) format to (x1, y1, x2, y2) format where (x1, y1) is the
    top-left corner and (x2, y2) is the bottom-right corner.

    Args:
        x (np.ndarray | torch.Tensor): The input bounding box coordinates in (x, y, width, height) format.
    Returns:
        y (np.ndarray | torch.Tensor): The bounding box coordinates in (x1, y1, x2, y2) format.
    """
    y = np.copy(x)
    y[..., 0] = x[..., 0] - x[..., 2] / 2  # top left x
    y[..., 1] = x[..., 1] - x[..., 3] / 2  # top left y
    y[..., 2] = x[..., 0] + x[..., 2] / 2  # bottom right x
    y[..., 3] = x[..., 1] + x[..., 3] / 2  # bottom right y
    return y


def box_area(boxes: array):
    """
    :param boxes: [N, 4]
    :return: [N]
    """
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])


def box_iou(box1: array, box2: array):
    """
    :param box1: [N, 4]
    :param box2: [M, 4]
    :return: [N, M]
    """
    area1 = box_area(box1)  # N
    area2 = box_area(box2)  # M
    # broadcasting: trailing dimensions must either match or be 1
    lt = np.maximum(box1[:, np.newaxis, :2], box2[:, :2])
    rb = np.minimum(box1[:, np.newaxis, 2:], box2[:, 2:])
    wh = rb - lt  # bottom-right minus top-left
    wh = np.maximum(0, wh)  # [N, M, 2]
    inter = wh[:, :, 0] * wh[:, :, 1]
    iou = inter / (area1[:, np.newaxis] + area2 - inter)
    return iou  # NxM


def numpy_nms(boxes: array, scores: array, iou_threshold: float):
    idxs = scores.argsort()  # indices sorted by score in ascending order [N]
    keep = []
    while idxs.size > 0:
        max_score_index = idxs[-1]
        max_score_box = boxes[max_score_index][None, :]
        keep.append(max_score_index)
        if idxs.size == 1:
            break
        idxs = idxs[:-1]  # drop the highest-scoring box; compare the remaining boxes against it
        other_boxes = boxes[idxs]  # [?, 4]
        ious = box_iou(max_score_box, other_boxes)  # compare one box against the rest, 1xM
        idxs = idxs[ious[0] <= iou_threshold]
    keep = np.array(keep)  # indices of the kept boxes
    return keep
def non_max_suppression(
        prediction,
        conf_thres=0.25,
        iou_thres=0.45,
        classes=None,
        agnostic=False,
        multi_label=False,
        labels=(),
        max_det=300,
        nc=0,  # number of classes (optional)
        max_time_img=0.05,
        max_nms=30000,
        max_wh=7680,
):
    """
    Perform non-maximum suppression (NMS) on a set of boxes, with support for masks and multiple labels per box.

    Arguments:
        prediction (np.ndarray): An array of shape (batch_size, num_classes + 4 + num_masks, num_boxes)
            containing the predicted boxes, classes, and masks. The array should be in the format
            output by a model, such as YOLO.
        conf_thres (float): The confidence threshold below which boxes will be filtered out.
            Valid values are between 0.0 and 1.0.
        iou_thres (float): The IoU threshold below which boxes will be filtered out during NMS.
            Valid values are between 0.0 and 1.0.
        classes (List[int]): A list of class indices to consider. If None, all classes will be considered.
        agnostic (bool): If True, the model is agnostic to the number of classes, and all
            classes will be considered as one.
        multi_label (bool): If True, each box may have multiple labels.
        labels (List[List[Union[int, float, np.ndarray]]]): A list of lists, where each inner
            list contains the apriori labels for a given image. The list should be in the format
            output by a dataloader, with each label being a tuple of (class_index, x1, y1, x2, y2).
        max_det (int): The maximum number of boxes to keep after NMS.
        nc (int, optional): The number of classes output by the model. Any indices after this will be considered masks.
        max_time_img (float): The maximum time (seconds) for processing one image.
        max_nms (int): The maximum number of boxes into torchvision.ops.nms().
        max_wh (int): The maximum box width and height in pixels.

    Returns:
        (List[np.ndarray]): A list of length batch_size, where each element is an array of
            shape (num_boxes, 6 + num_masks) containing the kept boxes, with columns
            (x1, y1, x2, y2, confidence, class, mask1, mask2, ...).
    """
    # Checks
    assert 0 <= conf_thres <= 1, f'Invalid Confidence threshold {conf_thres}, valid values are between 0.0 and 1.0'
    assert 0 <= iou_thres <= 1, f'Invalid IoU {iou_thres}, valid values are between 0.0 and 1.0'
    if isinstance(prediction, (list, tuple)):  # YOLOv8 model in validation mode, output = (inference_out, loss_out)
        prediction = prediction[0]  # select only inference output

    bs = prediction.shape[0]  # batch size
    nc = nc or (prediction.shape[1] - 4)  # number of classes
    nm = prediction.shape[1] - nc - 4
    mi = 4 + nc  # mask start index
    xc = prediction[:, 4:mi].max(axis=1) > conf_thres  # candidates

    # Settings
    # min_wh = 2  # (pixels) minimum box width and height
    time_limit = 0.5 + max_time_img * bs  # seconds to quit after
    redundant = True  # require redundant detections
    multi_label &= nc > 1  # multiple labels per box (adds 0.5ms/img)
    merge = False  # use merge-NMS

    prediction = np.transpose(prediction, (0, 2, 1))  # shape(1,84,6300) to shape(1,6300,84)
    prediction[..., :4] = xywh2xyxy(prediction[..., :4])  # xywh to xyxy

    t = time.time()
    output = [np.zeros((0, 6 + nm)) for _ in range(bs)]
    for xi, x in enumerate(prediction):  # image index, image inference
        # Apply constraints
        # x[((x[:, 2:4] < min_wh) | (x[:, 2:4] > max_wh)).any(1), 4] = 0  # width-height
        x = x[xc[xi]]  # confidence

        # Cat apriori labels if autolabelling
        if labels and len(labels[xi]):
            lb = labels[xi]
            v = np.zeros((len(lb), nc + nm + 5))
            v[:, :4] = lb[:, 1:5]  # box
            v[np.arange(len(lb)), lb[:, 0].astype(int) + 4] = 1.0  # cls
            x = np.concatenate((x, v), axis=0)

        # If none remain process next image
        if not x.shape[0]:
            continue

        # Detections matrix nx6 (xyxy, conf, cls)
        box, cls, mask = np.split(x, (4, 4 + nc), axis=1)
        if multi_label:
            i, j = np.where(cls > conf_thres)
            x = np.concatenate((box[i], x[i, 4 + j, None], j[:, None].astype(float), mask[i]), axis=1)
        else:  # best class only
            conf = np.max(cls, axis=1, keepdims=True)
            j = np.argmax(cls, axis=1)
            j = np.expand_dims(j, axis=1)
            x = np.concatenate((box, conf, j.astype(float), mask), axis=1)[conf.reshape(-1) > conf_thres]

        # Filter by class
        if classes is not None:
            class_indices = np.array(classes)
            mask = np.any(x[:, 5:6] == class_indices, axis=1)
            x = x[mask]

        # Apply finite constraint
        # if not np.isfinite(x).all():
        #     x = x[np.isfinite(x).all(axis=1)]

        # Check shape
        n = x.shape[0]  # number of boxes
        if not n:  # no boxes
            continue
        if n > max_nms:  # excess boxes
            sorted_indices = np.argsort(x[:, 4])[::-1]
            x = x[sorted_indices[:max_nms]]  # sort by confidence and remove excess boxes

        # Batched NMS
        c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
        boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
        i = numpy_nms(boxes, scores, iou_thres)  # NMS
        i = i[:max_det]  # limit detections
        if merge and (1 < n < 3E3):  # Merge NMS (boxes merged using weighted mean)
            # Update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
            iou = box_iou(boxes[i], boxes) > iou_thres  # iou matrix
            weights = iou * scores[None]  # box weights
            x[i, :4] = np.dot(weights, x[:, :4]).astype(float) / weights.sum(1, keepdims=True)  # merged boxes
            if redundant:
                i = i[np.sum(iou, axis=1) > 1]  # require redundancy

        output[xi] = x[i]
        if (time.time() - t) > time_limit:
            break  # time limit exceeded

    return output
class YOLO(object):
    _defaults = {
        # ---------------------------------------------------------------------#
        #   Path of the model file
        # ---------------------------------------------------------------------#
        "model_path": 'yolov8-pose.engine',
        # ---------------------------------------------------------------------#
        #   Input image resolution
        # ---------------------------------------------------------------------#
        "input_shape": [640, 640],
        # ---------------------------------------------------------------------#
        #   Only boxes scoring above this confidence are kept
        # ---------------------------------------------------------------------#
        "confidence": 0.5,
        # ---------------------------------------------------------------------#
        #   IoU threshold used by non-maximum suppression
        # ---------------------------------------------------------------------#
        "nms_iou": 0.3,
    }

    @classmethod
    def get_defaults(cls, n):
        if n in cls._defaults:
            return cls._defaults[n]
        else:
            return "Unrecognized attribute name '" + n + "'"

    # ---------------------------------------------------#
    #   Initialize YOLO
    # ---------------------------------------------------#
    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults)
        for name, value in kwargs.items():
            setattr(self, name, value)
            self._defaults[name] = value
        # ---------------------------------------------------#
        #   Class names, number of classes and keypoint layout
        # ---------------------------------------------------#
        self.class_names = ['person']
        self.num_classes = len(self.class_names)
        self.kpts_shape = [17, 3]
        self.bbox_color = (150, 0, 0)
        self.bbox_thickness = 6
        # Box label text
        self.bbox_labelstr = {
            'font_size': 1,       # font size
            'font_thickness': 2,  # font thickness
            'offset_x': 0,        # text offset in X direction, positive to the right
            'offset_y': -10,      # text offset in Y direction, positive downwards
        }
        # Keypoint BGR colours
        self.kpt_color_map = {
            0: {'color': [255, 128, 0], 'radius': 3},  # radius of the drawn circle
            1: {'color': [255, 153, 51], 'radius': 3},
            2: {'color': [255, 178, 102], 'radius': 3},
            3: {'color': [230, 230, 0], 'radius': 3},
            4: {'color': [255, 153, 255], 'radius': 3},
            5: {'color': [153, 204, 255], 'radius': 3},
            6: {'color': [255, 102, 255], 'radius': 3},
            7: {'color': [255, 51, 255], 'radius': 3},
            8: {'color': [102, 178, 255], 'radius': 3},
            9: {'color': [51, 153, 255], 'radius': 3},
            10: {'color': [255, 153, 153], 'radius': 3},
            11: {'color': [255, 102, 102], 'radius': 3},
            12: {'color': [255, 51, 51], 'radius': 3},
            13: {'color': [153, 255, 153], 'radius': 3},
            14: {'color': [102, 255, 102], 'radius': 3},
            15: {'color': [51, 255, 51], 'radius': 3},
            16: {'color': [0, 255, 0], 'radius': 3},
        }
        # Keypoint label text
        # self.kpt_labelstr = {
        #     'font_size': 1.5,     # font size
        #     'font_thickness': 3,  # font thickness
        #     'offset_x': 10,       # text offset in X direction, positive to the right
        #     'offset_y': 0,        # text offset in Y direction, positive downwards
        # }
        # Skeleton connections and their BGR colours
        self.skeleton_map = [
            {'srt_kpt_id': 0, 'dst_kpt_id': 1, 'color': [196, 75, 255], 'thickness': 2},  # thickness: line width
            {'srt_kpt_id': 0, 'dst_kpt_id': 2, 'color': [180, 187, 28], 'thickness': 2},
            {'srt_kpt_id': 1, 'dst_kpt_id': 3, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 2, 'dst_kpt_id': 4, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 3, 'dst_kpt_id': 5, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 5, 'dst_kpt_id': 7, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 7, 'dst_kpt_id': 9, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 4, 'dst_kpt_id': 6, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 6, 'dst_kpt_id': 8, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 8, 'dst_kpt_id': 10, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 5, 'dst_kpt_id': 6, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 5, 'dst_kpt_id': 11, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 11, 'dst_kpt_id': 13, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 13, 'dst_kpt_id': 15, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 6, 'dst_kpt_id': 12, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 12, 'dst_kpt_id': 14, 'color': [47, 255, 173], 'thickness': 2},
            {'srt_kpt_id': 14, 'dst_kpt_id': 16, 'color': [47, 255, 173], 'thickness': 2},
        ]
        self.generate()

    # ---------------------------------------------------#
    #   Build the model
    # ---------------------------------------------------#
    def generate(self):
        # ---------------------------------------------------#
        #   Load the serialized engine and allocate host/device buffers
        # ---------------------------------------------------#
        engine = self.load_engine(self.model_path)
        self.context = engine.create_execution_context()
        self.inputs, self.outputs, self.bindings = [], [], []
        self.stream = cuda.Stream()
        for binding in engine:
            size = engine.get_binding_shape(binding)
            dtype = trt.nptype(engine.get_binding_dtype(binding))
            host_mem = np.empty(size, dtype=dtype)
            host_mem = np.ascontiguousarray(host_mem)
            device_mem = cuda.mem_alloc(host_mem.nbytes)
            self.bindings.append(int(device_mem))
            if engine.binding_is_input(binding):
                self.inputs.append({'host': host_mem, 'device': device_mem})
            else:
                self.outputs.append({'host': host_mem, 'device': device_mem})

    def load_engine(self, engine_path):
        TRT_LOGGER = trt.Logger(trt.Logger.ERROR)
        with open(engine_path, 'rb') as f, trt.Runtime(TRT_LOGGER) as runtime:
            return runtime.deserialize_cuda_engine(f.read())
    def forward(self, img):
        self.inputs[0]['host'] = np.ravel(img)
        # transfer data to the gpu
        for inp in self.inputs:
            cuda.memcpy_htod_async(inp['device'], inp['host'], self.stream)
        # run inference
        self.context.execute_async_v2(
            bindings=self.bindings,
            stream_handle=self.stream.handle)
        # fetch outputs from gpu
        for out in self.outputs:
            cuda.memcpy_dtoh_async(out['host'], out['device'], self.stream)
        # synchronize stream
        self.stream.synchronize()
        return [out['host'] for out in self.outputs]
    # ---------------------------------------------------#
    #   Detect a single image
    # ---------------------------------------------------#
    def detect_image(self, image):
        # ---------------------------------------------------------#
        #   Convert the image to RGB here to avoid errors with
        #   grayscale images. The code only supports RGB prediction,
        #   so every other type of image is converted to RGB first.
        # ---------------------------------------------------------#
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        image_data, scale = resize_image(image, (self.input_shape[1], self.input_shape[0]), False)
        # ---------------------------------------------------------#
        #   Add the batch_size dimension
        #   h, w, 3 => 3, h, w => 1, 3, h, w
        # ---------------------------------------------------------#
        image_data = np.expand_dims(np.transpose(preprocess_input(np.array(image_data, dtype='float32')), (2, 0, 1)), 0)
        # ---------------------------------------------------------#
        #   Feed the image into the network for prediction
        # ---------------------------------------------------------#
        outputs = self.forward(image_data)[::-1]
        # ---------------------------------------------------------#
        #   Stack the predicted boxes and run non-maximum suppression
        # ---------------------------------------------------------#
        results = non_max_suppression(outputs, conf_thres=self.confidence, iou_thres=self.nms_iou, nc=1)[0]
        if results is None:
            return image
        top_label = np.array(results[:, 5], dtype='int32')
        top_conf = results[:, 4]
        top_boxes = results[:, :4]
        top_kpts = results[:, 6:].reshape(len(results), self.kpts_shape[0], self.kpts_shape[1])
        # ---------------------------------------------------------#
        #   Draw the results on the image
        # ---------------------------------------------------------#
        for i, c in list(enumerate(top_label)):
            predicted_class = self.class_names[int(c)]
            box = top_boxes[i]
            score = top_conf[i]
            left, top, right, bottom = box.astype('int32')
            left = int(left * scale[0])
            top = int(top * scale[1])
            right = int(right * scale[0])
            bottom = int(bottom * scale[1])
            image = cv2.rectangle(image, (left, top), (right, bottom), self.bbox_color, self.bbox_thickness)
            label = '{} {:.2f}'.format(predicted_class, score)
            # Draw the box label: image, text, top-left corner of the text, font, font size, colour, thickness
            image = cv2.putText(image, label,
                                (left + self.bbox_labelstr['offset_x'], top + self.bbox_labelstr['offset_y']),
                                cv2.FONT_HERSHEY_SIMPLEX, self.bbox_labelstr['font_size'], self.bbox_color,
                                self.bbox_labelstr['font_thickness'])
            bbox_keypoints = top_kpts[i]  # all keypoint coordinates and confidences of this box
            # Draw the skeleton connections of this box
            for skeleton in self.skeleton_map:
                # Start-point coordinates
                srt_kpt_id = skeleton['srt_kpt_id']
                srt_kpt_x = int(bbox_keypoints[srt_kpt_id][0] * scale[0])
                srt_kpt_y = int(bbox_keypoints[srt_kpt_id][1] * scale[1])
                # End-point coordinates
                dst_kpt_id = skeleton['dst_kpt_id']
                dst_kpt_x = int(bbox_keypoints[dst_kpt_id][0] * scale[0])
                dst_kpt_y = int(bbox_keypoints[dst_kpt_id][1] * scale[1])
                # Connection colour
                skeleton_color = skeleton['color']
                # Connection line width
                skeleton_thickness = skeleton['thickness']
                # Draw the connection
                image = cv2.line(image, (srt_kpt_x, srt_kpt_y), (dst_kpt_x, dst_kpt_y), color=skeleton_color,
                                 thickness=skeleton_thickness)
            # Draw the keypoints of this box
            for kpt_id in self.kpt_color_map:
                # Colour, radius and XY coordinates of this keypoint
                kpt_color = self.kpt_color_map[kpt_id]['color']
                kpt_radius = self.kpt_color_map[kpt_id]['radius']
                kpt_x = int(bbox_keypoints[kpt_id][0] * scale[0])
                kpt_y = int(bbox_keypoints[kpt_id][1] * scale[1])
                # Draw a circle: image, XY coordinates, radius, colour, thickness (-1 means filled)
                image = cv2.circle(image, (kpt_x, kpt_y), kpt_radius, kpt_color, -1)
                # Draw the keypoint label: image, text, top-left corner, font, font size, colour, thickness
                # kpt_label = str(self.kpt_color_map[kpt_id]['name'])
                # image = cv2.putText(image)
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        return image
if __name__ == '__main__':
    yolo = YOLO()
    # ----------------------------------------------------------------------------------------------------------#
    #   mode specifies what to run:
    #   'predict'      single-image prediction. To change the prediction process (saving the image, cropping the
    #                  detected objects, ...), read the detailed comments below first.
    #   'video'        video detection; a camera or a video file can be used, see the comments below.
    #   'fps'          FPS test, using the image img/street.jpg, see the comments below.
    #   'dir_predict'  walk through a folder, detect every image and save the results. By default the img folder
    #                  is scanned and the results are written to img_out, see the comments below.
    # ----------------------------------------------------------------------------------------------------------#
    mode = "video"
    # ----------------------------------------------------------------------------------------------------------#
    #   video_path       path of the video; video_path=0 means using the camera.
    #                    To detect a video, set e.g. video_path = "xxx.mp4" to read xxx.mp4 from the root directory.
    #   video_save_path  path where the video is saved; video_save_path="" means the video is not saved.
    #                    To save the video, set e.g. video_save_path = "yyy.mp4" to write yyy.mp4 to the root directory.
    #   video_fps        fps of the saved video.
    #
    #   video_path, video_save_path and video_fps are only used when mode='video'.
    #   When saving the video, exit with ctrl+c or run to the last frame to complete the saving process.
    # ----------------------------------------------------------------------------------------------------------#
    video_path = 'two.mp4'
    video_save_path = "one_out.mp4"
    video_fps = 25.0
    # ----------------------------------------------------------------------------------------------------------#
    #   test_interval    number of detections used to measure fps. In theory, a larger test_interval gives a more
    #                    accurate fps.
    #   fps_image_path   image used for the fps test.
    #
    #   test_interval and fps_image_path are only used when mode='fps'.
    # ----------------------------------------------------------------------------------------------------------#
    test_interval = 100
    fps_image_path = "img/test.jpg"
    # -------------------------------------------------------------------------#
    #   dir_origin_path  folder containing the images to detect
    #   dir_save_path    folder where the detected images are saved
    #
    #   dir_origin_path and dir_save_path are only used when mode='dir_predict'.
    # -------------------------------------------------------------------------#
    dir_origin_path = "img/"
    dir_save_path = "img_out/"

    if mode == "predict":
        '''
        1. To save the detected image, use cv2.imwrite("img.jpg", r_image); modify it directly in this script.
        2. To get the coordinates of the predicted boxes, go into yolo.detect_image and read top, left, bottom, right
           in the drawing section.
        3. To crop the detected objects, go into yolo.detect_image and use the obtained top, left, bottom, right
           values to slice the original image array.
        4. To write extra text on the predicted image, e.g. the number of detected objects of a specific class, go
           into yolo.detect_image, test predicted_class in the drawing section (e.g. if predicted_class == 'car':)
           to count the objects, then draw the text with cv2.putText.
        '''
        while True:
            img = input('Input image filename:')
            try:
                image = cv2.imread(img)
            except:
                print('Open Error! Try again!')
                continue
            else:
                r_image = yolo.detect_image(image)
                cv2.imshow('result', r_image)
                c = cv2.waitKey(0)

    elif mode == "video":
        capture = cv2.VideoCapture(video_path)
        if video_save_path != "":
            fourcc = cv2.VideoWriter_fourcc(*'XVID')
            size = (int(capture.get(cv2.CAP_PROP_FRAME_WIDTH)), int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT)))
            out = cv2.VideoWriter(video_save_path, fourcc, video_fps, size)
        ref, frame = capture.read()
        if not ref:
            raise ValueError("Failed to read the camera (or video). Check that the camera is connected "
                             "(or that the video path is correct).")
        fps = 0.0
        while (True):
            t1 = time.time()
            # read one frame
            ref, frame = capture.read()
            if not ref:
                break
            # run detection
            frame = yolo.detect_image(frame)
            fps = (fps + (1. / (time.time() - t1))) / 2
            print("fps= %.2f" % (fps))
            frame = cv2.putText(frame, "fps= %.2f" % (fps), (0, 40), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
            cv2.imshow("video", frame)
            c = cv2.waitKey(1) & 0xff
            if video_save_path != "":
                out.write(frame)
            if c == 27:
                capture.release()
                break
        print("Video Detection Done!")
        capture.release()
        if video_save_path != "":
            print("Save processed video to the path :" + video_save_path)
            out.release()
        cv2.destroyAllWindows()

    elif mode == "fps":
        img = cv2.imread(fps_image_path)
        # Note: get_FPS is not defined in this class; implement it (or port it from the referenced repository)
        # before using mode='fps'.
        tact_time = yolo.get_FPS(img, test_interval)
        print(str(tact_time) + ' seconds, ' + str(1 / tact_time) + 'FPS, @batch_size 1')

    elif mode == "dir_predict":
        import os
        from tqdm import tqdm
        img_names = os.listdir(dir_origin_path)
        for img_name in tqdm(img_names):
            if img_name.lower().endswith(
                    ('.bmp', '.dib', '.png', '.jpg', '.jpeg', '.pbm', '.pgm', '.ppm', '.tif', '.tiff')):
                image_path = os.path.join(dir_origin_path, img_name)
                image = cv2.imread(image_path)
                r_image = yolo.detect_image(image)
                if not os.path.exists(dir_save_path):
                    os.makedirs(dir_save_path)
                # detect_image returns an OpenCV (numpy) image, so save it with cv2.imwrite
                cv2.imwrite(os.path.join(dir_save_path, img_name.replace(".jpg", ".png")), r_image)
    else:
        raise AssertionError("Please specify the correct mode: 'predict', 'video', 'fps', 'dir_predict'.")
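A minimal usage sketch (file names are illustrative): instead of going through the mode switch, the class above can also be called directly.

import cv2

yolo = YOLO(model_path='yolov8s-pose.engine', confidence=0.5, nms_iou=0.3)
image = cv2.imread('img/test.jpg')       # BGR image read by OpenCV
result = yolo.detect_image(image)        # BGR image with boxes, skeleton and keypoints drawn
cv2.imwrite('img_out/test_pose.jpg', result)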

III. Problems encountered

(1) Problems during model conversion

When converting the ONNX model to an engine with trtexec, some warnings are printed during the build.

They do not affect the result; they are only warnings, and the conversion completes after a few minutes.

(2) Notes on using the code
1. Where to change the model path
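The engine path is set in the _defaults dictionary of the YOLO class and can also be overridden through the constructor, for example (the file name is illustrative):

# Option 1: edit the default inside the YOLO class
"model_path": 'yolov8s-pose.engine',

# Option 2: override it when constructing the object
yolo = YOLO(model_path='yolov8s-pose.engine')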

2. Changing the input parameters
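The remaining input parameters live in the same _defaults dictionary and can likewise be overridden per object, e.g.:

yolo = YOLO(
    input_shape=[640, 640],  # must match the resolution the engine was exported/built with
    confidence=0.5,          # only boxes scoring above this are kept
    nms_iou=0.3,             # IoU threshold used during non-maximum suppression
)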

3. Changing keypoint colour and size
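Keypoint appearance is controlled by kpt_color_map in __init__: each of the 17 COCO keypoints has a BGR colour and a circle radius. An illustrative change for keypoint 0 (the nose):

self.kpt_color_map[0] = {'color': [0, 0, 255], 'radius': 5}  # illustrative: draw the nose as a larger red dot (BGR)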

4. Changing skeleton connection order and line width
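Each entry of skeleton_map in __init__ draws one connection from srt_kpt_id to dst_kpt_id with a BGR colour and a line width; reordering or editing the list changes which limbs are drawn and how. An illustrative entry:

{'srt_kpt_id': 5, 'dst_kpt_id': 7, 'color': [0, 255, 255], 'thickness': 4},  # illustrative: left shoulder -> left elbow, thicker yellow line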

5. For how to use the code, read the comments in the script (in particular the mode and path settings in the __main__ block).

6. Project directory structure
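A layout like the following works with the default paths used in the script (all names are illustrative):

yolov8_pose_trt/
├── predict_trt.py          # the inference script above (name is illustrative)
├── yolov8s-pose.engine     # engine produced by trtexec
├── two.mp4                 # test video for mode='video'
├── img/                    # input images for 'predict' / 'fps' / 'dir_predict'
└── img_out/                # results written by 'dir_predict'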

References:

睿智的目标检测——YOLOv8-Pose的TensorRT推理 (_白鹭先生_, CSDN blog)

YOLOV8模型训练+部署(实战) (CSDN blog)
