Object Detection and Tracking (1): Robot Vision and YOLO V8 (Techblog of HaoWANG, CSDN blog)
Object Detection and Tracking (2): YOLO V8 Configuration and Testing (Techblog of HaoWANG, CSDN blog)
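The core features and changes in YOLOv8 can be summarized as follows:
1. It provides a new SOTA model family, including object detection networks at P5 (640) and P6 (1280) resolutions and an instance segmentation model based on YOLACT. As with YOLOv5, N/S/M/L/X variants of different sizes are derived from scaling coefficients to meet the needs of different scenarios.
2. Backbone:
The backbone and neck appear to borrow the ELAN design idea from YOLOv7: the C3 module of YOLOv5 is replaced with the C2f module, which has richer gradient flow, and the channel counts are adjusted separately for each model scale.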
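To make the scaling idea concrete, here is a minimal sketch of how per-scale multipliers could turn one base configuration into the N/S/M/L/X variants. The multiplier values are the ones I recall from the Ultralytics `yolov8.yaml` config and should be treated as assumptions; the helper names (`make_divisible`, `scaled_channels`, `scaled_repeats`) are hypothetical and only illustrate the arithmetic.

```python
import math

# (depth_multiple, width_multiple, max_channels) per model scale.
# Assumed values, recalled from the Ultralytics yolov8.yaml config.
SCALES = {
    "n": (0.33, 0.25, 1024),
    "s": (0.33, 0.50, 1024),
    "m": (0.67, 0.75, 768),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.25, 512),
}


def make_divisible(value, divisor=8):
    """Round value up to the nearest multiple of divisor."""
    return int(math.ceil(value / divisor) * divisor)


def scaled_channels(base_channels, scale):
    """Channel count of a layer after applying the width multiplier."""
    _, width, max_channels = SCALES[scale]
    return make_divisible(min(base_channels, max_channels) * width)


def scaled_repeats(base_repeats, scale):
    """Number of repeated blocks after applying the depth multiplier."""
    depth, _, _ = SCALES[scale]
    if base_repeats <= 1:
        return base_repeats
    return max(round(base_repeats * depth), 1)


# Print the widest layer (base 1024 channels) and a 6-repeat stage per scale.
for scale in SCALES:
    print(scale, scaled_channels(1024, scale), scaled_repeats(6, scale))
```

The cap on `max_channels` is why the larger variants do not simply keep growing in width: the deepest layers are clipped before the width multiplier is applied.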
YOLO (You Only Look Once) is a family of real-time object detectors that is widely used in applications such as self-driving cars, surveillance systems, and facial recognition software. YOLO V8 was released by Ultralytics in January 2023 and was the latest version of YOLO at the time of writing.
Here are some key features of YOLO V8:
- Improved accuracy: YOLO V8 has improved object detection accuracy compared to its predecessors, especially for objects with complex shapes and sizes.
- Real-time performance: YOLO V8 is designed for real-time object detection and can process images and videos at high frame rates.
- Multi-scale features: YOLO V8 uses multi-scale features to detect objects of different sizes and shapes.
- Improved bounding box regression: YOLO V8 has improved bounding box regression, which helps to more accurately detect the location and size of objects.
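- Architecture and loss changes: YOLO V8 adopts an anchor-free, decoupled detection head and a revised loss (distribution focal loss combined with CIoU for box regression), and it retains the SPPF spatial pyramid pooling block from YOLOv5; together these improve detection accuracy and speed.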
- Support for multiple platforms: YOLO V8 can be run on a variety of platforms, including Windows, Linux, and Android.
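Bounding box quality, mentioned in the list above, is usually measured with intersection over union (IoU), which IoU-based regression losses build on. The following is a minimal sketch of plain IoU for two axis-aligned boxes, assuming the `(x1, y1, x2, y2)` corner format; it is not the loss YOLO V8 uses, only the basic quantity behind it.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two 2x2 boxes that overlap in a 1x1 square.
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```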
If you're interested in using YOLO V8 for a specific project, the Ultralytics documentation and GitHub repository provide documentation, tutorials, and sample code.
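TensorRT is a high-performance deep learning inference optimizer that provides low-latency, high-throughput deployment for deep learning applications. It can accelerate inference in hyperscale data centers, on embedded platforms, and on autonomous driving platforms. TensorRT supports models from nearly all major deep learning frameworks, including TensorFlow, Caffe, MXNet, and PyTorch; combined with NVIDIA GPUs, it enables fast and efficient inference deployment for models from almost any of them.
TensorRT is a C++ library. Since TensorRT 3 it has provided both a C++ API and a Python API, and it is used mainly for high-performance inference acceleration on NVIDIA GPUs.
TensorRT (TensorRT™) is a high-performance inference optimizer developed by NVIDIA to accelerate deep learning inference. It is optimized for NVIDIA GPUs and exploits the parallelism of deep neural network (DNN) inference to provide a fast, efficient inference solution. The following paragraphs cover TensorRT's principles, architecture, features, and performance.
(TensorRT (1): Introduction, Usage, Installation | arleyzhang)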
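The core principle of TensorRT is to optimize and streamline a deep learning model so that inference runs faster and more efficiently. It relies on three key techniques: layer fusion, precision calibration, and dynamic tensor memory.
The TensorRT architecture can be divided into four main components.
TensorRT provides a rich set of features for optimizing and accelerating deep learning inference.
TensorRT also performs strongly at inference time.
In summary, TensorRT is a high-performance engine for optimizing deep learning inference. Through techniques such as layer fusion, precision calibration, and dynamic tensor memory, it provides fast, efficient inference. It offers clear advantages in inference speed, latency, and throughput, which makes it especially suitable for performance-critical applications.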
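As an illustration of what layer fusion means, the sketch below folds a batch normalization step into the linear layer that precedes it, so that inference needs one operation instead of two. This is a toy example in plain Python, not TensorRT's implementation; the function names and the sample numbers are hypothetical.

```python
import math


def linear(x, weights, bias):
    """y = W x + b for one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch norm with fixed running statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, g, b, m, s in zip(y, gamma, beta, mean, var)]


def fuse_linear_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold batch norm into the linear layer's weights and bias."""
    fused_w, fused_b = [], []
    for row, b, g, bt, m, s in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(s + eps)
        fused_w.append([w * scale for w in row])
        fused_b.append((b - m) * scale + bt)
    return fused_w, fused_b


# Made-up parameters for a 2-input, 2-output layer.
W = [[1.0, 2.0], [0.5, -1.0]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [0.9, 1.1]
x = [3.0, -2.0]

two_step = batch_norm(linear(x, W, b), gamma, beta, mean, var)
fused_W, fused_b = fuse_linear_bn(W, b, gamma, beta, mean, var)
one_step = linear(x, fused_W, fused_b)
print(two_step, one_step)
```

The fused layer produces the same output up to floating-point rounding, while removing one pass over the activations, which is the kind of saving layer fusion aims for.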
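The TensorRT inference engine makes full use of the GPU's parallel computing capability to achieve efficient inference. The key aspects are parallel execution of the computation graph, pipeline parallelism, batch parallelism, weight sharing, and Tensor Core computation.
By combining these techniques, the TensorRT inference engine can fully exploit GPU parallelism, significantly improve inference performance, and meet requirements for real-time operation, low latency, and high throughput.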
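At the time of writing, the latest release was TensorRT 8 GA. Download the TensorRT package that matches your system, then follow the steps below (the cd paths are relative to the directory where the archive was extracted):
1. Extract the archive
    tar xzvf TensorRT-X
2. Install the TensorRT wheel that matches your Python version (Python 3.7 here)
    cd TensorRT-X/python
    pip install tensorrt-X.whl
3. Install the graphsurgeon wheel (in the TensorRT tar package it is in the graphsurgeon directory)
    cd TensorRT-X/graphsurgeon
    pip install graphsurgeon-X.whl
4. Configure the environment variables (add the export lines to /etc/profile so they persist, then reload it)
    export PATH=$PATH:/usr/local/cuda-11.1/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-11.1/lib64
    export LIBRARY_PATH=$LIBRARY_PATH:/usr/local/cuda-11.1/lib64
    source /etc/profile
5. In some cases you also need to add the TensorRT library directory (in ~/.bashrc, then reload it)
    export LD_LIBRARY_PATH=/home/XX/TensorRT-X/lib:$LD_LIBRARY_PATH
    source ~/.bashrc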
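After the installation steps above, a quick sanity check can save debugging time later. The sketch below, using only the standard library, reports which Python modules cannot be found and lists the entries of a path-style environment variable. The helper names `missing_modules` and `path_entries` are hypothetical, and the module names checked are the ones this post installs.

```python
import importlib.util
import os


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def path_entries(var_name, environ=None):
    """Split a PATH-style environment variable into its non-empty entries."""
    environ = os.environ if environ is None else environ
    return [p for p in environ.get(var_name, "").split(os.pathsep) if p]


# Modules installed from the TensorRT wheels in the steps above.
print("missing modules:", missing_modules(["tensorrt", "graphsurgeon"]))

# Directories the dynamic loader will search for the CUDA and TensorRT libraries.
for entry in path_entries("LD_LIBRARY_PATH"):
    print("LD_LIBRARY_PATH entry:", entry)
```

If `tensorrt` is reported as missing, the wheel was probably installed into a different Python environment than the one running the check.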
Install the dependencies:
    pip install onnx==1.12.0
    pip install onnx-simplifier==0.4.0
    pip install coloredlogs==15.0.1
    pip install humanfriendly==10.0
    pip install onnxruntime-gpu==1.12.0
    pip install onnxsim-no-ort==0.4.0
    pip install opencv-python==4.5.2.52  # note: do not use cv2 4.6.0
    pip install protobuf==3.19.4
    pip install setuptools==63.2.0
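Because the versions above are pinned exactly, it can help to confirm what is actually installed. The following sketch compares the pins from this post against the environment using `importlib.metadata`, which assumes Python 3.8 or newer (on the Python 3.7 used in this post, the `importlib_metadata` backport would be needed instead). The names `PINS`, `installed_version`, and `pin_report` are hypothetical.

```python
from importlib import metadata

# Version pins copied from the dependency list in this post.
PINS = {
    "onnx": "1.12.0",
    "onnx-simplifier": "0.4.0",
    "coloredlogs": "15.0.1",
    "humanfriendly": "10.0",
    "onnxruntime-gpu": "1.12.0",
    "onnxsim-no-ort": "0.4.0",
    "opencv-python": "4.5.2.52",
    "protobuf": "3.19.4",
    "setuptools": "63.2.0",
}


def installed_version(dist_name):
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def pin_report(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every pin that is not satisfied."""
    report = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            report[name] = (wanted, found)
    return report


for name, (wanted, found) in pin_report(PINS).items():
    print(f"{name}: wanted {wanted}, found {found}")
```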
Export test:
    yolo export model=yolov8n.pt format=engine device=0  # export official model
    yolo export model=path/to/best.pt format=onnx  # export custom trained model
Note: if the export fails with "AttributeError: module 'distutils' has no attribute 'version'", see the fix described in:
AttributeError: module 'distutils' has no attribute 'version' (CSDN blog)
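The same export can also be driven from Python rather than the `yolo` command line. The sketch below wraps the Ultralytics `YOLO(...).export(format="engine", device=...)` call; the wrapper names `export_engine` and `expected_engine_path` are hypothetical, and the assumption that the engine is written next to the weights with an `.engine` suffix matches the file names used later in this post.

```python
from pathlib import Path


def expected_engine_path(weights):
    """Path the exported engine is expected to have: same name, .engine suffix."""
    return Path(weights).with_suffix(".engine")


def export_engine(weights, device=0):
    """Export a .pt checkpoint to a TensorRT engine and return the output path."""
    # Imported lazily so the helper above works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    # Requires an NVIDIA GPU and a working TensorRT installation.
    return model.export(format="engine", device=device)


print(expected_engine_path("yolov8n.pt"))
```

Calling `export_engine("yolov8n.pt")` on a machine with the setup described above should produce the engine file; on other machines only `expected_engine_path` is usable.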
Model 1: yolov8l.pt
TensorRT model: yolov8l.engine
    import cv2
    from ultralytics import YOLO

    # Load the exported TensorRT engine
    model = YOLO('yolov8l.engine')

    # Video source: 0 selects the default webcam.
    # To read a file instead, pass video_path to cv2.VideoCapture.
    video_path = "path/to/your/video/file.mp4"
    cap = cv2.VideoCapture(0)

    # Loop through the video frames
    while cap.isOpened():
        # Read a frame from the video source
        success, frame = cap.read()

        if success:
            # Run YOLOv8 tracking on the frame, persisting tracks between frames
            results = model.track(frame, persist=True)

            # Visualize the results on the frame
            annotated_frame = results[0].plot()

            # Display the annotated frame
            cv2.imshow("YOLOv8 Tracking", annotated_frame)

            # Break the loop if 'q' is pressed
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
        else:
            # Break the loop if no frame could be read (end of video or camera error)
            break

    # Release the video capture object and close the display window
    cap.release()
    cv2.destroyAllWindows()
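The tracking loop above only draws the results. If the tracks themselves are needed, for example to draw trajectories, the per-frame IDs and box centers can be accumulated. The sketch below assumes the IDs and boxes have already been converted to plain Python lists; in Ultralytics they come from `results[0].boxes.id` (which is None when nothing is tracked) and `results[0].boxes.xywh`. The helper name `update_tracks` is hypothetical.

```python
from collections import deque


def update_tracks(history, track_ids, boxes_xywh, max_len=30):
    """Append each box center to the history of its track ID.

    history maps track ID to a deque of (cx, cy) points; only the most
    recent max_len points are kept per track.
    """
    for track_id, (cx, cy, _w, _h) in zip(track_ids, boxes_xywh):
        if track_id not in history:
            history[track_id] = deque(maxlen=max_len)
        history[track_id].append((float(cx), float(cy)))
    return history


# Two made-up frames: track 1 appears in both, track 2 only in the first.
history = {}
update_tracks(history, [1, 2], [(100, 50, 20, 40), (300, 80, 25, 45)])
update_tracks(history, [1], [(104, 52, 20, 40)])
print({track_id: list(points) for track_id, points in history.items()})
```

Inside the loop, one plausible way to obtain the inputs is `results[0].boxes.id.int().cpu().tolist()` and `results[0].boxes.xywh.cpu().tolist()`, guarded by a check that `boxes.id` is not None.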
Comparison: monitor GPU utilization and memory while each model runs, using: watch -n 1 nvidia-smi
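To record the comparison instead of watching it, the same numbers can be sampled from a script. The sketch below uses the `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` query interface and parses one line per GPU. The function names are hypothetical, the sample string is made up for illustration, and the parser assumes every field is numeric (some GPUs report `[N/A]` for certain fields).

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse 'util, used, total' lines (one per GPU) into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        util, used, total = (int(field) for field in line.split(","))
        rows.append({
            "util_percent": util,
            "mem_used_mib": used,
            "mem_total_mib": total,
        })
    return rows


def sample_gpus():
    """Run nvidia-smi once and return the parsed readings."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_csv(out)


# Made-up sample output for a single GPU, used here instead of a real query.
print(parse_gpu_csv("35, 1234, 8192\n"))
```

Calling `sample_gpus()` once per second while the .pt model and then the .engine model run gives two series that can be compared directly.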
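Results
With the TensorRT engine, model loading speed and detection quality improved considerably, while frame rate and GPU usage stayed at reasonable levels. TensorRT optimization and deployment performed very well in this test.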
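The frame rate mentioned above can be measured inside the loop rather than estimated by eye. Here is a minimal sketch of a sliding-window FPS meter; the class name `FpsMeter` is hypothetical, and the injectable `clock` parameter exists only so the meter can be checked with fixed timestamps.

```python
import time
from collections import deque


class FpsMeter:
    """Average frames per second over the most recent `window` frames."""

    def __init__(self, window=30, clock=time.perf_counter):
        self._stamps = deque(maxlen=window)
        self._clock = clock

    def tick(self):
        """Record one processed frame and return the current FPS estimate."""
        self._stamps.append(self._clock())
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0


# Simulated timestamps: one frame every 0.5 seconds.
fake_times = iter([0.0, 0.5, 1.0, 1.5])
meter = FpsMeter(window=10, clock=lambda: next(fake_times))
readings = [meter.tick() for _ in range(4)]
print(readings)
```

In the tracking loop, calling `meter.tick()` once per frame after `model.track(...)` gives a number that can be compared between the .pt and .engine runs.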
Using yolov8m-seg.pt -> 'yolov8m-seg.engine'
    import cv2
    from ultralytics import YOLO

    # Load the exported TensorRT segmentation engine
    model = YOLO('yolov8m-seg.engine')

    # Video source: 0 selects the default webcam.
    # To read a file instead, pass video_path to cv2.VideoCapture.
    video_path = "path/to/your/video/file.mp4"
    cap = cv2.VideoCapture(0)

    # Loop through the video frames
    while cap.isOpened():
        # Read a frame from the video source
        success, frame = cap.read()

        if success:
            # Run YOLOv8 tracking on the frame, persisting tracks between frames
            results = model.track(frame, persist=True)

            # Visualize the results (boxes and masks) on the frame
            annotated_frame = results[0].plot()

            # Display the annotated frame
            cv2.imshow("YOLOv8 Tracking", annotated_frame)

            # Break the loop if 'q' is pressed
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
        else:
            # Break the loop if no frame could be read (end of video or camera error)
            break

    # Release the video capture object and close the display window
    cap.release()
    cv2.destroyAllWindows()
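With a segmentation model, each result also carries instance masks. In Ultralytics these are exposed through `results[0].masks` (None when nothing is detected), whose `xy` attribute holds one polygon of pixel coordinates per instance. As a small example of using that output, the sketch below computes the area of one such polygon with the shoelace formula; the function name `polygon_area` is hypothetical and the input is assumed to be a plain list of `(x, y)` pairs.

```python
def polygon_area(points):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


# A made-up 4x3 pixel rectangle standing in for a mask outline.
print(polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)]))
```

Inside the loop, one plausible use is `[polygon_area(p.tolist()) for p in results[0].masks.xy]`, guarded by a check that `results[0].masks` is not None.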
Detection and segmentation output: