
Object Distance Detection Based on YOLOv8 and a RealSense D455 Camera

Contents

Preface

I. Setting Up the YOLOv8 Environment

II. Configuring the realsense-ros Package

1. Install ROS Humble

2. Install Intel RealSense SDK 2.0

3. Clone the ROS Package

III. Implementing Object Distance Detection

1. Algorithm Flow

2. Code Example

3. Demonstration


Preface

To implement object distance detection based on YOLOv8 and a RealSense D455 camera, proceed as follows:

  1. Prepare the environment:

    • Install YOLOv8: the official Ultralytics implementation is built on PyTorch (earlier YOLO versions used the Darknet framework; YOLOv8 does not).
    • Install the RealSense SDK: download and install the SDK matching your camera model and operating system.
  2. Acquire camera data:

    • Connect to the camera and grab frames through the RealSense SDK.
    • Feed the RGB image from the camera into the YOLOv8 model as input.
  3. Object detection:

    • Run the YOLOv8 model on the input image to obtain each object's bounding box and class.
  4. Distance estimation:

    • Estimate the distance to each detected object from the RealSense depth image.
    • The depth image maps each pixel to a physical distance, so reading the depth values inside a detection yields the object's distance (see the sketch after this list).
  5. Display the results:

    • Display or save the detected classes, bounding boxes, and distance information.
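
The ROS-based implementation in this article follows exactly this pipeline. For reference, here is a minimal non-ROS sketch of the same idea using pyrealsense2 and Ultralytics directly. The 640x480 stream profile and the box-centre depth sampling are simplifying assumptions, not the method used later in this article (which averages filtered mask depths):

import numpy as np
import pyrealsense2 as rs
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
profile = pipeline.start(config)
align = rs.align(rs.stream.color)  # align depth pixels to the color frame
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()

try:
    while True:
        frames = align.process(pipeline.wait_for_frames())
        color = np.asanyarray(frames.get_color_frame().get_data())
        depth = np.asanyarray(frames.get_depth_frame().get_data())
        for result in model(color):
            for box, cls in zip(result.boxes.xyxy.cpu().numpy(),
                                result.boxes.cls.cpu().numpy()):
                u = int((box[0] + box[2]) / 2)  # box centre pixel
                v = int((box[1] + box[3]) / 2)
                dist_m = depth[v, u] * depth_scale  # raw depth -> metres
                print(f"{result.names[int(cls)]}: {dist_m:.2f} m")
finally:
    pipeline.stop()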

 

I. Setting Up the YOLOv8 Environment

YOLOv8 is a deep-learning-based object detection framework that supports real-time detection, segmentation, and classification.

Official repository:

https://github.com/ultralytics/ultralytics

Basic requirements: Python >= 3.8, PyTorch >= 1.8

Install YOLOv8 as a third-party library:

pip install ultralytics

After installation, test that it works with the following command:

yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'

When the command finishes, a runs directory is created in the current directory containing the detection results for the sample image.

If the annotated sample image is produced, the YOLO model is working correctly.
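
The same check can be run from Python. This short snippet uses the standard Ultralytics API with the same yolov8n.pt weights and sample image as the command above:

from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # weights are downloaded on first use
results = model("https://ultralytics.com/images/bus.jpg")
for r in results:
    print(r.boxes.cls)   # predicted class indices
    print(r.boxes.xyxy)  # bounding boxes as (x1, y1, x2, y2)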

If the downloads are slow due to network issues, you can instead download the resources from the netdisk link below into the current directory and run the test locally.

Netdisk link: https://pan.baidu.com/s/1txVCeJ0eBOEv2DF4evzftg  Extraction code: hnnp

II. Configuring the realsense-ros Package

1. Install ROS Humble

Official installation guide:

Installation — ROS 2 Documentation: Humble documentation

2. Install Intel RealSense SDK 2.0

GitHub - IntelRealSense/realsense-ros: Intel(R) RealSense(TM) ROS Wrapper for Depth Camera

sudo apt install ros-humble-librealsense2*

You can optionally install realsense-viewer to debug the camera. Intel provides two installation methods: installing pre-built packages, or building from source. For convenience, this article uses the first.

Step 1: register the server's public key

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key F6E65AC044F831AC80A06380C8B3A55A6F3EFCDE || sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-key F6E65AC044F831AC80A06380C8B3A55A6F3EFCDE

Step 2: add the server to the repository list

sudo add-apt-repository "deb https://librealsense.intel.com/Debian/apt-repo $(lsb_release -cs) main" -u

Step 3: install the realsense-viewer libraries

sudo apt-get install librealsense2-dev
sudo apt-get install librealsense2-dbg

Step 4: open realsense-viewer and check that the streams look normal (remember to turn on the RGB switch inside the tool)

realsense-viewer

3. Clone the ROS Package

1. Clone into the workspace's src directory:

git clone https://github.com/IntelRealSense/realsense-ros.git -b ros2-development

2. Install the dependencies:

From the workspace root, run:

sudo apt-get install python3-rosdep -y
sudo rosdep init
rosdep update
rosdep install -i --from-path src --rosdistro $ROS_DISTRO --skip-keys=librealsense2 -y

If rosdep fails due to network problems, the domestic mirror tool rosdepc can be used instead.

3. Build the workspace:

colcon build

4. Launch the camera node:

ros2 launch realsense2_camera rs_launch.py
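
Once the camera node is up, confirm that both streams are being published. The topic names below are the ones the detection node in the next section subscribes to; they can differ with other wrapper versions or launch arguments:

ros2 topic hz /camera/camera/color/image_raw
ros2 topic hz /camera/camera/depth/image_rect_raw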

 

 

III. Implementing Object Distance Detection

1. Algorithm flow:

Align the camera's depth image with the RGB image and filter out background pixels whose depth exceeds a threshold. Feed the background-removed image into the YOLO model to obtain each object's class and mask. Then apply a simplified KNN-style filter to the depth values under each mask to discard edge points and outliers, and finally average the remaining depths to recover the object's 3D coordinates (sketched below).
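
To make the last two steps concrete, here is a compact sketch under the pinhole camera model. The percentile trimming stands in for the node's simplified KNN filter, and fx, fy, cx, cy are the intrinsics hard-coded in obj_detect.py below; the sample depths are hypothetical:

import numpy as np

fx, fy, cx, cy = 385.591, 385.137, 324.346, 244.673  # from obj_detect.py

def object_position(depth_samples_mm, u, v, keep=0.8):
    """Average the trimmed depth samples, then back-project pixel (u, v) to 3D metres."""
    d = np.asarray([s for s in depth_samples_mm if s > 0], dtype=float)
    trim = (1 - keep) * 50                # percent cut from each tail
    lo, hi = np.percentile(d, [trim, 100 - trim])
    d = d[(d >= lo) & (d <= hi)]          # discard edge points and outliers
    z = d.mean() / 1000.0                 # mm -> m
    x = (u - cx) * z / fx                 # pinhole back-projection
    y = (v - cy) * z / fy
    return x, y, z

# Hypothetical mask depths in mm; (320, 240) is the box-centre pixel
print(object_position([512, 530, 0, 1900, 525, 518], u=320, v=240))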

2. Code example:

Source link: https://pan.baidu.com/s/1M2_yx7Cu9UXm3u0VwzLELQ  Extraction code: ai64

Object distance measurement node:

obj_detect.py:

#!/usr/bin/env python
# Basic ROS2
# Author: Jin Xiangyv
# 2024.1.2
# System imports
import rclpy
from rclpy.node import Node
from rclpy.qos import qos_profile_sensor_data
from rclpy.qos import ReliabilityPolicy, QoSProfile
# Executor and callback imports
from rclpy.callback_groups import MutuallyExclusiveCallbackGroup
from rclpy.executors import MultiThreadedExecutor
# ROS2 interfaces
from sensor_msgs.msg import Image, CameraInfo
from std_msgs.msg import String
# Image msg parser
from cv_bridge import CvBridge
# Vision model
from ultralytics import YOLO
# Others
import numpy as np
import time, json, torch


class Yolov8Node(Node):
    '''
    Work:
        a class based on the visual recognition module of YOLOv8
        which can obtain the object's category and position.
    Param:
        model: neural network model
        device: use GPU acceleration or use CPU
        depth_threshold: background removal
        threshold: reserved parameter
        enable_yolo: use YOLO or not
    '''

    def __init__(self):
        super().__init__("yolov8_node")
        rclpy.logging.set_logger_level('yolov8_node', rclpy.logging.LoggingSeverity.INFO)

        ## Declare parameters for node
        self.declare_parameter("model", "yolov8n-seg.pt")
        self.declare_parameter("device", "cuda:0")
        self.declare_parameter("depth_threshold", 0.6)
        self.declare_parameter("threshold", 0.3)
        self.declare_parameter("enable_yolo", True)

        ## Transmit parameters
        model = self.get_parameter("model").get_parameter_value().string_value
        self.device = self.get_parameter("device").get_parameter_value().string_value
        self.depth_threshold = self.get_parameter("depth_threshold").get_parameter_value().double_value
        self.threshold = self.get_parameter("threshold").get_parameter_value().double_value
        self.enable_yolo = self.get_parameter("enable_yolo").get_parameter_value().bool_value

        ## Camera correction matrix
        self.tf_world_to_camera = np.array([[1.000, 0.000, 0.000, 0.000],
                                            [0.000, 1.000, 0.000, 0.000],
                                            [0.000, 0.000, 1.000, 0.000],
                                            [0.000, 0.000, 0.000, 1.000]])
        self.tf_camera_to_optical = np.array([[385.591,   0.000, 324.346, 0.000],
                                              [  0.000, 385.137, 244.673, 0.000],
                                              [  0.000,   0.000,   1.000, 0.000]])
        self.tf_world_to_optical = np.matmul(self.tf_camera_to_optical, self.tf_world_to_camera)

        ## Multithread processing
        self.group_img_sub = MutuallyExclusiveCallbackGroup()
        self.group_timer = MutuallyExclusiveCallbackGroup()

        ## Others init
        self.cv_bridge = CvBridge()
        self.yolo = YOLO(model)
        self.yolo.fuse()
        '''
        Conv2d and BatchNorm2d Layer Fusion:
        Conv2d layers are often followed by BatchNorm2d layers in deep neural networks.
        Fusing these layers means combining the operations of the convolutional layer
        and the batch normalization layer into a single operation.
        This can reduce the computational cost and improve inference speed.
        '''
        self.color_image_msg = None
        self.depth_image_msg = None
        self.camera_intrinsics = None
        self.pred_image_msg = Image()
        self.result_img = None

        # Set clipping distance for background removal
        depth_scale = 0.001  # 1 raw depth unit = 1 mm
        self.depth_threshold = self.depth_threshold / depth_scale  # metres -> raw units

        # Publishers
        self._item_dict_pub = self.create_publisher(String, "/yolo/prediction/item_dict", 10)
        self._pred_pub = self.create_publisher(Image, "/yolo/prediction/image", 10)
        self._bg_removed_pub = self.create_publisher(Image, "/yolo/bg_removed", 10)
        self._test = self.create_publisher(Image, "/yolo/test", 10)

        # Subscribers
        self._color_image_sub = self.create_subscription(Image, "/camera/camera/color/image_raw", self.color_image_callback, qos_profile_sensor_data, callback_group=self.group_img_sub)
        self._depth_image_sub = self.create_subscription(Image, "/camera/camera/depth/image_rect_raw", self.depth_image_callback, qos_profile_sensor_data, callback_group=self.group_img_sub)
        self._camera_info_subscriber = self.create_subscription(CameraInfo, '/camera/camera/color/camera_info', self.camera_info_callback, QoSProfile(depth=1, reliability=ReliabilityPolicy.RELIABLE), callback_group=self.group_img_sub)

        # Timers
        self._vision_timer = self.create_timer(0.04, self.object_segmentation, callback_group=self.group_timer)  # 25 Hz

    def color_image_callback(self, msg):
        self.color_image_msg = msg

    def depth_image_callback(self, msg):
        self.depth_image_msg = msg

    def camera_info_callback(self, msg):
        pass

    def cluster_select(self, num_list, threshold):
        """
        parameters
            num_list: original number list
            threshold: retention rate
        returns
            new number list
        Works by removing singularities in the original number list
        """
        n = len(num_list)
        n_threshold = n * (1 - threshold)
        new_list = []
        out_list = []
        for i in num_list:
            if i != 0:  # depth 0 means "no reading"; drop it
                new_list.append(i)
        for num in new_list:
            fit_num = 0
            current = num
            for item in new_list:
                flag = abs(current - item)
                if flag < 200 * threshold:
                    fit_num = fit_num + 1
            if fit_num > n_threshold:
                out_list.append(num)
        return out_list

    def bg_removal(self, color_img_msg: Image, depth_img_msg: Image):
        """
        parameters
            color_img_msg: Message class 'Image' -- RGB image
            depth_img_msg: Message class 'Image' -- Z16 image
        returns
            Background-removed image
        Works by removing the background in the original image
        """
        if self.color_image_msg is not None and self.depth_image_msg is not None:
            # Convert color image msg
            cv_color_image = self.cv_bridge.imgmsg_to_cv2(color_img_msg, desired_encoding='bgr8')
            np_color_image = np.array(cv_color_image, dtype=np.uint8)
            # Convert depth image msg
            cv_depth_image = self.cv_bridge.imgmsg_to_cv2(depth_img_msg, desired_encoding='passthrough')
            np_depth_image = np.array(cv_depth_image, dtype=np.uint16)
            self.test_msg = self.cv_bridge.cv2_to_imgmsg(np_depth_image, encoding='16UC1')
            self._test.publish(self.test_msg)
            # Background removal
            grey_color = 153
            depth_image_3d = np.dstack((np_depth_image, np_depth_image, np_depth_image))  # depth image is 1 channel, color is 3 channels
            # NOTE: the clipping distance is hard-coded here; use the commented line to apply the depth_threshold parameter instead
            # bg_removed = np.where((depth_image_3d > self.depth_threshold) | (depth_image_3d != depth_image_3d), grey_color, np_color_image)
            bg_removed = np.where((depth_image_3d > 10000) | (depth_image_3d != depth_image_3d), grey_color, np_color_image)
            return bg_removed, np_color_image, np_depth_image
        self.get_logger().error("Background removal error, color or depth msg was None")

    def filter_depth_object_img(self, img, starting_mask, deviation):
        """
        parameters
            img: np depth image
            deviation: the deviation allowed
        returns
            filtered image
        Works by removing pixels too far from the median value
        """
        mdv = np.median(img[starting_mask])  # median depth value
        u_lim = mdv + mdv * deviation  # upper limit
        uidx = (img >= u_lim)
        # Zero out every pixel above the upper limit
        out_img = img
        zero_img = np.zeros_like(img)
        out_img[uidx] = zero_img[uidx]
        return out_img

    def object_segmentation(self):
        if self.enable_yolo and self.color_image_msg is not None and self.depth_image_msg is not None:
            self.get_logger().debug("Successfully acquired color and depth image msgs")
            # Remove background
            bg_removed, np_color_image, np_depth_image = self.bg_removal(self.color_image_msg, self.depth_image_msg)
            self.get_logger().debug("Successfully removed background")
            self._bg_removed_pub.publish(self.cv_bridge.cv2_to_imgmsg(bg_removed, encoding='bgr8'))
            # Predict on image "bg_removed"
            results = self.yolo(source=bg_removed, device='0')
            self.get_logger().debug("Successfully predicted")
            # Go through detections in prediction results
            for detection in results:
                # Extract image with yolo predictions
                pred_img = detection.plot()
                self.result_img = pred_img
                self.pred_image_msg = self.cv_bridge.cv2_to_imgmsg(pred_img, encoding='bgr8')
                self._pred_pub.publish(self.pred_image_msg)
                # Get number of objects in the scene
                object_boxes = detection.boxes.xyxy.cpu().numpy()
                n_objects = object_boxes.shape[0]
                try:
                    masks = detection.masks
                except AttributeError:
                    continue
                self.get_logger().debug("Successfully extracted boxes and masks")
                # Declare variables used later
                objects_median_center = []
                objects_median_center_transform = []
                detection_class = detection.boxes.cls.cpu().numpy()
                detection_conf = detection.boxes.conf.cpu().numpy()
                for i in range(n_objects):
                    # Get mask for the i'th object
                    single_selection_mask = np.array(masks.xyn[i])
                    single_object_box = object_boxes[i]
                    center_x = int(0.5 * (single_object_box[0] + single_object_box[2]))
                    center_y = int(0.5 * (single_object_box[1] + single_object_box[3]))
                    depth = []
                    depth_sum = 0
                    mean_depth = 0
                    for point in single_selection_mask:
                        # masks.xyn holds normalized (x, y); map to (row, col) of the 640x480 depth image
                        p_row = int(point[1] * 480)
                        p_col = int(point[0] * 640)
                        if np_depth_image.item(p_row, p_col) != 0:
                            depth.append(np_depth_image.item(p_row, p_col))
                    print("original")
                    print(depth)
                    print(len(depth))
                    # Two filtering passes to strip edge points and outliers
                    selected_depth = self.cluster_select(depth, 0.8)
                    selected_depth = self.cluster_select(selected_depth, 0.8)
                    print("selected")
                    print(selected_depth)
                    print(len(selected_depth))
                    for dep in selected_depth:
                        depth_sum = depth_sum + dep
                    if len(selected_depth) != 0:
                        mean_depth = depth_sum / len(selected_depth)
                    else:
                        mean_depth = 0
                    mean_depth = mean_depth / 1024  # raw depth units -> metres (author's scale factor)
                    print("end")
                    print("class:" + str(detection.names[int(detection_class[i])]))
                    print(mean_depth)
                    # Back-project the box centre through the projection matrix
                    a = self.tf_world_to_optical[:, 3]
                    obj_center = [mean_depth * center_x - a[0], mean_depth * center_y - a[1], mean_depth * 1 - a[2]]
                    obj_transformed = np.linalg.solve(self.tf_world_to_optical[:, [0, 1, 2]], obj_center)
                    objects_median_center.append(obj_center)
                    objects_median_center_transform.append(obj_transformed)
                # Item dict creation
                item_dict = {}
                for item, n, median_tf in zip(detection_class, range(n_objects), objects_median_center_transform):
                    item_dict[f'item_{n}'] = {'class': detection.names[int(item)], 'position': median_tf.tolist()}
                self.item_dict = item_dict
                self.item_dict_str = json.dumps(self.item_dict)
                print(self.item_dict)
                self.get_logger().info(f"Yolo detected items: {[detection.names[int(item)] for item in detection_class]}")
                item_dict_msg = String()
                item_dict_msg.data = self.item_dict_str
                self._item_dict_pub.publish(item_dict_msg)
                self.get_logger().debug("Item dictionary successfully created and published")

    def shutdown_callback(self):
        self.get_logger().warn("Shutting down...")


## Main function
def main(args=None):
    rclpy.init(args=args)
    # Instantiate node class
    vision_node = Yolov8Node()
    # Create executor
    executor = MultiThreadedExecutor()
    executor.add_node(vision_node)
    try:
        # Run executor
        executor.spin()
    except KeyboardInterrupt:
        pass
    finally:
        # Shutdown executor
        vision_node.shutdown_callback()
        executor.shutdown()


if __name__ == "__main__":
    main()
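
For intuition about the depth filter, the cluster_select logic above can be run standalone. A value survives only if enough of the other samples lie within 200*threshold raw units of it, so isolated outliers are dropped. The sample values here are hypothetical:

def cluster_select(num_list, threshold):
    # Keep values supported by enough neighbours within 200*threshold units
    n_threshold = len(num_list) * (1 - threshold)
    new_list = [i for i in num_list if i != 0]   # 0 means "no depth reading"
    out_list = []
    for num in new_list:
        fit_num = sum(1 for item in new_list if abs(num - item) < 200 * threshold)
        if fit_num > n_threshold:
            out_list.append(num)
    return out_list

samples = [512, 530, 525, 0, 1900, 518, 540, 2100, 522]   # hypothetical raw depths
print(cluster_select(samples, 0.8))   # -> [512, 530, 525, 518, 540, 522]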

Image visualization node:

video_player.py:

#!/usr/bin/env python
# Basic ROS2
# Author: Jin Xiangyv
# 2024.1.2
import rclpy
import message_filters
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
import cv2


class Video_player(Node):
    def __init__(self, name):
        super().__init__(name)
        self.version_info()
        self.get_logger().info("Wait for video source......")
        # Subscribe to both image streams and synchronize them by arrival time
        self.subscriber_1 = message_filters.Subscriber(self, Image, '/yolo/prediction/image')
        self.subscriber_2 = message_filters.Subscriber(self, Image, '/yolo/bg_removed')
        self.sync = message_filters.ApproximateTimeSynchronizer(
            [self.subscriber_1, self.subscriber_2], 10, 0.5, allow_headerless=True)
        self.sync.registerCallback(self.multi_callback)
        self.cv_bridge = CvBridge()

    def multi_callback(self, stream_1, stream_2):
        self.get_logger().info("Image received:")
        prediction_img = self.cv_bridge.imgmsg_to_cv2(stream_1, 'bgr8')
        background_removal_img = self.cv_bridge.imgmsg_to_cv2(stream_2, 'bgr8')
        cv2.imshow("prediction_img", prediction_img)
        cv2.imshow("background_removal_img", background_removal_img)
        cv2.waitKey(10)

    def version_info(self):
        self.get_logger().info("VERSION 1.2.0")


def main(args=None):
    rclpy.init(args=args)
    node = Video_player("video_player")
    rclpy.spin(node)
    node.destroy_node()
    rclpy.shutdown()


if __name__ == "__main__":
    main()

3. Demonstration:

ros2 launch realsense2_camera rs_launch.py 
ros2 run vision_pkg_python obj_detect
ros2 run vision_pkg_python video_player
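
For the ros2 run commands above to resolve, the package's setup.py must register both scripts as console entry points, roughly as follows. This is a sketch: the package name vision_pkg_python and the module names are inferred from the commands and file names above.

entry_points={
    'console_scripts': [
        'obj_detect = vision_pkg_python.obj_detect:main',
        'video_player = vision_pkg_python.video_player:main',
    ],
},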

 

View the object position information:

ros2 topic echo /yolo/prediction/item_dict
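
The item dictionary is published as a JSON string, so any downstream node can recover each object's class and 3D position with json.loads. A minimal consumer sketch:

import json
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class ItemDictListener(Node):
    def __init__(self):
        super().__init__("item_dict_listener")
        self.create_subscription(String, "/yolo/prediction/item_dict", self.cb, 10)

    def cb(self, msg):
        # Each entry looks like {'class': <label>, 'position': [x, y, z]}
        for name, info in json.loads(msg.data).items():
            self.get_logger().info(f"{name}: {info['class']} at {info['position']}")

def main():
    rclpy.init()
    rclpy.spin(ItemDictListener())

if __name__ == "__main__":
    main()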
