
Training OpenPCDet on a Custom Dataset

I. Installing and Using OpenPCDet

* For full details on OpenPCDet, see:

GitHub - open-mmlab/OpenPCDet: OpenPCDet Toolbox for LiDAR-based 3D Object Detection.

1. First, git clone the official repository.

2. I recommend following the install guide under docs/ in the author's GitHub repo step by step (I kept hitting errors with third-party CSDN tutorials). You will also need spconv and cumm; GitHub links:

GitHub - traveller59/spconv: Spatial Sparse Convolution Library

GitHub - FindDefinition/cumm: CUda Matrix Multiply library.

3. Open the README in spconv and follow it strictly; compilation usually takes a while.

4. Open the README in cumm and follow its instructions strictly as well.

5. Once installation is done, run the provided test data end to end to verify everything works.
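For step 5, the quickest sanity check is the stock demo on a KITTI sample with a pretrained checkpoint from the model zoo (the checkpoint and data paths below are examples; substitute whatever you downloaded, and run from the tools/ directory):

python demo.py --cfg_file cfgs/kitti_models/pointpillar.yaml --ckpt pointpillar_7728.pth --data_path ../data/kitti/training/velodyne/000008.bin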

II. Training OpenPCDet on Your Own Dataset

* My case: I ported an existing dataset. Because I also have my own image data, I had already converted everything into KITTI format with the four folders velodyne, calib, label, and image, and I implemented evaluation as well as the final detection output, so my setup may differ from other write-ups.

* If you only have velodyne and label data, or are not yet sure how to convert your dataset into this format, these references may help:

Training using our own dataset · Issue #771 · open-mmlab/OpenPCDet · GitHub

 OpenPCDet 训练自己的数据集详细教程!_JulyLi2019的博客-CSDN博客_openpcdet 数据集

3D目标检测(4):OpenPCDet训练篇--自定义数据集 - 知乎

Openpcdet-(2)自数据集训练数据集训练_花花花哇_的博客-CSDN博客

win10 OpenPCDet 训练KITTI以及自己的数据集_树和猫的博客-CSDN博客_openpcdet训练
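Whichever route you take, the directory layout that the code below expects (reconstructed from the paths used in custom_dataset.py and the configs, so verify it against your own tree) is:

OpenPCDet/data/custom
├── ImageSets
│   ├── train.txt
│   └── val.txt
└── training
    ├── calib       # xxx.txt, KITTI-style calibration
    ├── image_2     # xxx.png
    ├── label_2     # xxx.txt, KITTI-style labels
    └── velodyne    # xxx.bin, float32 (x, y, z, intensity)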

To summarize first: the changes mainly involve the following four files.

* pcdet/datasets/custom/custom_dataset.py

* tools/cfgs/custom_models/pointpillar.yaml (any other model config works too)

* tools/cfgs/dataset_configs/custom_dataset.yaml

* demo.py

1. pcdet/datasets/custom/custom_dataset.py

In practice, custom_dataset.py can be written by taking kitti_dataset.py as a template and pruning it; most of the file needs no user changes at all. I modified the following:

1) get_lidar

* Loads the LiDAR point cloud for a sample; get_image and the other getters follow the same pattern.

2) __getitem__

* The most important function: it builds and updates the per-sample data dict.

* Dict entries you do not need, such as calib or image, can be dropped.

3) get_infos

* Generates the info dicts, roughly:

infos = {'image': ...,
         'calib': ...,
         'annos': ...}

annos = {'name': ...,
         'truncated': ...,
         'alpha': ...,
         ...}

Here annos is the dict parsed from your label file: class names, occlusion, box rotation angle, and so on. As before, entries you do not need can be added or removed; a small parsing sketch follows below.
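As a concrete reference, here is a minimal sketch of how one KITTI-style label line fills the annos fields used below (this is an illustration of the mapping, not the author's object3d_custom code; the field order follows the KITTI label spec):

# A KITTI label line: type truncated occluded alpha x1 y1 x2 y2 h w l x y z ry
line = 'Car 0.00 0 -1.58 587.0 173.3 614.1 200.1 1.65 1.67 3.64 -0.65 1.71 46.70 1.59'
v = line.split()
anno = {
    'name': v[0],                                            # class name
    'truncated': float(v[1]),
    'occluded': int(v[2]),
    'alpha': float(v[3]),                                    # observation angle
    'bbox': [float(x) for x in v[4:8]],                      # 2D box in the image
    'dimensions': [float(v[10]), float(v[8]), float(v[9])],  # l, h, w (camera frame)
    'location': [float(x) for x in v[11:14]],                # x, y, z in camera coords
    'rotation_y': float(v[14]),                              # yaw around the camera Y axis
}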

4) create_custom_infos

This function generates the dataset info files, usually saved with a .pkl suffix. If you do not need evaluation, the evaluation-related part can be deleted; the logic is simple.

5) The class names in the __main__ entry point.

The modified code is as follows:

import copy
import pickle
import os
from pathlib import Path

import numpy as np
from skimage import io

from ..kitti import kitti_utils
from ...ops.roiaware_pool3d import roiaware_pool3d_utils
from ...utils import box_utils, common_utils, calibration_kitti, object3d_custom
from ..dataset import DatasetTemplate


class CustomDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None, ext='.bin'):
        """
        Args:
            root_path:
            dataset_cfg:
            class_names:
            training:
            logger:
        """
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )
        self.split = self.dataset_cfg.DATA_SPLIT[self.mode]
        self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')

        split_dir = os.path.join(self.root_path, 'ImageSets', (self.split + '.txt'))  # custom/ImageSets/xxx.txt
        # sample ids listed in xxx.txt
        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if os.path.exists(split_dir) else None

        self.custom_infos = []
        self.include_data(self.mode)  # train / val
        self.map_class_to_kitti = self.dataset_cfg.MAP_CLASS_TO_KITTI
        self.ext = ext

    def include_data(self, mode):
        self.logger.info('Loading Custom dataset.')
        custom_infos = []
        for info_path in self.dataset_cfg.INFO_PATH[mode]:
            info_path = self.root_path / info_path
            if not info_path.exists():
                continue
            with open(info_path, 'rb') as f:
                infos = pickle.load(f)
                custom_infos.extend(infos)
        self.custom_infos.extend(custom_infos)
        self.logger.info('Total samples for CUSTOM dataset: %d' % (len(custom_infos)))

    def get_label(self, idx):
        label_file = self.root_split_path / 'label_2' / ('%s.txt' % idx)
        assert label_file.exists()
        return object3d_custom.get_objects_from_label(label_file)

    def get_lidar(self, idx, getitem=True):
        # 'getitem' is kept only for call-site compatibility; both branches
        # resolve the same velodyne file.
        lidar_file = self.root_split_path / 'velodyne' / ('%s.bin' % idx)
        return np.fromfile(str(lidar_file), dtype=np.float32).reshape(-1, 4)

    def get_image(self, idx):
        """
        Loads image for a sample
        Args:
            idx: int, Sample index
        Returns:
            image: (H, W, 3), RGB Image
        """
        img_file = self.root_split_path / 'image_2' / ('%s.png' % idx)
        assert img_file.exists()
        image = io.imread(img_file)
        image = image.astype(np.float32)
        image /= 255.0
        return image

    def get_image_shape(self, idx):
        img_file = self.root_split_path / 'image_2' / ('%s.png' % idx)
        assert img_file.exists()
        return np.array(io.imread(img_file).shape[:2], dtype=np.int32)

    def get_fov_flag(self, pts_rect, img_shape, calib):
        # Flag the points whose image projection falls inside the image bounds
        pts_img, pts_rect_depth = calib.rect_to_img(pts_rect)
        val_flag_1 = np.logical_and(pts_img[:, 0] >= 0, pts_img[:, 0] < img_shape[1])
        val_flag_2 = np.logical_and(pts_img[:, 1] >= 0, pts_img[:, 1] < img_shape[0])
        val_flag_merge = np.logical_and(val_flag_1, val_flag_2)
        pts_valid_flag = np.logical_and(val_flag_merge, pts_rect_depth >= 0)
        return pts_valid_flag

    def set_split(self, split):
        super().__init__(
            dataset_cfg=self.dataset_cfg, class_names=self.class_names, training=self.training,
            root_path=self.root_path, logger=self.logger
        )
        self.split = split

        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None

    def __len__(self):
        if self._merge_all_iters_to_one_epoch:
            return len(self.sample_id_list) * self.total_epochs
        return len(self.custom_infos)

    def __getitem__(self, index):
        if self._merge_all_iters_to_one_epoch:
            index = index % len(self.custom_infos)

        info = copy.deepcopy(self.custom_infos[index])
        sample_idx = info['point_cloud']['lidar_idx']
        img_shape = info['image']['image_shape']
        calib = self.get_calib(sample_idx)
        get_item_list = self.dataset_cfg.get('GET_ITEM_LIST', ['points'])

        input_dict = {
            'frame_id': self.sample_id_list[index],
            'calib': calib,
        }

        # If the annos block exists in the info dict, convert it to lidar-frame gt boxes
        if 'annos' in info:
            annos = info['annos']
            annos = common_utils.drop_info_with_name(annos, name='DontCare')
            loc, dims, rots = annos['location'], annos['dimensions'], annos['rotation_y']
            gt_names = annos['name']
            gt_boxes_camera = np.concatenate([loc, dims, rots[..., np.newaxis]], axis=1).astype(np.float32)
            gt_boxes_lidar = box_utils.boxes3d_kitti_camera_to_lidar(gt_boxes_camera, calib)

            # Update the gt boxes
            input_dict.update({
                'gt_names': gt_names,
                'gt_boxes': gt_boxes_lidar
            })
            if "gt_boxes2d" in get_item_list:
                input_dict['gt_boxes2d'] = annos["bbox"]

        # Optionally keep only the points inside the camera field of view
        if "points" in get_item_list:
            points = self.get_lidar(sample_idx, False)
            if self.dataset_cfg.FOV_POINTS_ONLY:
                pts_rect = calib.lidar_to_rect(points[:, 0:3])
                fov_flag = self.get_fov_flag(pts_rect, img_shape, calib)
                points = points[fov_flag]
            input_dict['points'] = points

        input_dict['calib'] = calib
        data_dict = self.prepare_data(data_dict=input_dict)
        data_dict['image_shape'] = img_shape
        return data_dict

    def evaluation(self, det_annos, class_names, **kwargs):
        if 'annos' not in self.custom_infos[0].keys():
            return 'No ground-truth boxes for evaluation', {}

        def kitti_eval(eval_det_annos, eval_gt_annos, map_name_to_kitti):
            from ..kitti.kitti_object_eval_python import eval as kitti_eval
            from ..kitti import kitti_utils

            kitti_utils.transform_annotations_to_kitti_format(eval_det_annos, map_name_to_kitti=map_name_to_kitti)
            kitti_utils.transform_annotations_to_kitti_format(
                eval_gt_annos, map_name_to_kitti=map_name_to_kitti,
                info_with_fakelidar=self.dataset_cfg.get('INFO_WITH_FAKELIDAR', False)
            )
            kitti_class_names = [map_name_to_kitti[x] for x in class_names]
            ap_result_str, ap_dict = kitti_eval.get_official_eval_result(
                gt_annos=eval_gt_annos, dt_annos=eval_det_annos, current_classes=kitti_class_names
            )
            return ap_result_str, ap_dict

        eval_det_annos = copy.deepcopy(det_annos)
        eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.custom_infos]

        if kwargs['eval_metric'] == 'kitti':
            ap_result_str, ap_dict = kitti_eval(eval_det_annos, eval_gt_annos, self.map_class_to_kitti)
        else:
            raise NotImplementedError
        return ap_result_str, ap_dict

    def get_calib(self, idx):
        calib_file = self.root_split_path / 'calib' / ('%s.txt' % idx)
        assert calib_file.exists()
        return calibration_kitti.Calibration(calib_file)

    def get_infos(self, num_workers=4, has_label=True, count_inside_pts=True, sample_id_list=None):
        import concurrent.futures as futures

        def process_single_scene(sample_idx):
            # Build the point_cloud block
            print('%s sample_idx: %s' % (self.split, sample_idx))
            info = {}
            pc_info = {'num_features': 4, 'lidar_idx': sample_idx}
            info['point_cloud'] = pc_info

            # Build the image block
            image_info = {'image_idx': sample_idx, 'image_shape': self.get_image_shape(sample_idx)}
            info['image'] = image_info

            # Build the calib block
            calib = self.get_calib(sample_idx)
            P2 = np.concatenate([calib.P2, np.array([[0., 0., 0., 1.]])], axis=0)
            R0_4x4 = np.zeros([4, 4], dtype=calib.R0.dtype)
            R0_4x4[3, 3] = 1.
            R0_4x4[:3, :3] = calib.R0
            V2C_4x4 = np.concatenate([calib.V2C, np.array([[0., 0., 0., 1.]])], axis=0)
            calib_info = {'P2': P2, 'R0_rect': R0_4x4, 'Tr_velo_to_cam': V2C_4x4}
            info['calib'] = calib_info

            if has_label:
                # Build the annos block from the label file
                obj_list = self.get_label(sample_idx)
                annotations = {}
                annotations['name'] = np.array([obj.cls_type for obj in obj_list])
                annotations['truncated'] = np.array([obj.truncation for obj in obj_list])
                annotations['occluded'] = np.array([obj.occlusion for obj in obj_list])
                annotations['alpha'] = np.array([obj.alpha for obj in obj_list])
                annotations['bbox'] = np.concatenate([obj.box2d.reshape(1, 4) for obj in obj_list], axis=0)
                annotations['dimensions'] = np.array([[obj.l, obj.h, obj.w] for obj in obj_list])  # lhw (camera) format
                annotations['location'] = np.concatenate([obj.loc.reshape(1, 3) for obj in obj_list], axis=0)
                annotations['rotation_y'] = np.array([obj.ry for obj in obj_list])
                annotations['score'] = np.array([obj.score for obj in obj_list])
                annotations['difficulty'] = np.array([obj.level for obj in obj_list], np.int32)

                num_objects = len([obj.cls_type for obj in obj_list if obj.cls_type != 'DontCare'])
                num_gt = len(annotations['name'])
                index = list(range(num_objects)) + [-1] * (num_gt - num_objects)
                annotations['index'] = np.array(index, dtype=np.int32)

                loc = annotations['location'][:num_objects]
                dims = annotations['dimensions'][:num_objects]
                rots = annotations['rotation_y'][:num_objects]
                loc_lidar = calib.rect_to_lidar(loc)
                l, h, w = dims[:, 0:1], dims[:, 1:2], dims[:, 2:3]
                loc_lidar[:, 2] += h[:, 0] / 2
                gt_boxes_lidar = np.concatenate([loc_lidar, l, w, h, -(np.pi / 2 + rots[..., np.newaxis])], axis=1)
                annotations['gt_boxes_lidar'] = gt_boxes_lidar
                info['annos'] = annotations

                if count_inside_pts:
                    points = self.get_lidar(sample_idx, False)
                    calib = self.get_calib(sample_idx)
                    pts_rect = calib.lidar_to_rect(points[:, 0:3])

                    fov_flag = self.get_fov_flag(pts_rect, info['image']['image_shape'], calib)
                    pts_fov = points[fov_flag]
                    corners_lidar = box_utils.boxes_to_corners_3d(gt_boxes_lidar)
                    num_points_in_gt = -np.ones(num_gt, dtype=np.int32)

                    for k in range(num_objects):
                        flag = box_utils.in_hull(pts_fov[:, 0:3], corners_lidar[k])
                        num_points_in_gt[k] = flag.sum()
                    annotations['num_points_in_gt'] = num_points_in_gt
            return info

        sample_id_list = sample_id_list if sample_id_list is not None else self.sample_id_list
        with futures.ThreadPoolExecutor(num_workers) as executor:
            infos = executor.map(process_single_scene, sample_id_list)
        return list(infos)

    def create_groundtruth_database(self, info_path=None, used_classes=None, split='train'):
        import torch

        database_save_path = Path(self.root_path) / ('gt_database' if split == 'train' else ('gt_database_%s' % split))
        db_info_save_path = Path(self.root_path) / ('custom_dbinfos_%s.pkl' % split)

        database_save_path.mkdir(parents=True, exist_ok=True)
        all_db_infos = {}

        with open(info_path, 'rb') as f:
            infos = pickle.load(f)

        for k in range(len(infos)):
            print('gt_database sample: %d/%d' % (k + 1, len(infos)))
            info = infos[k]
            sample_idx = info['point_cloud']['lidar_idx']
            points = self.get_lidar(sample_idx, False)
            annos = info['annos']
            names = annos['name']
            difficulty = annos['difficulty']
            bbox = annos['bbox']
            gt_boxes = annos['gt_boxes_lidar']

            num_obj = gt_boxes.shape[0]
            point_indices = roiaware_pool3d_utils.points_in_boxes_cpu(
                torch.from_numpy(points[:, 0:3]), torch.from_numpy(gt_boxes)
            ).numpy()  # (nboxes, npoints)

            for i in range(num_obj):
                filename = '%s_%s_%d.bin' % (sample_idx, names[i], i)
                filepath = database_save_path / filename
                gt_points = points[point_indices[i] > 0]

                gt_points[:, :3] -= gt_boxes[i, :3]
                with open(filepath, 'w') as f:
                    gt_points.tofile(f)

                if (used_classes is None) or names[i] in used_classes:
                    db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin
                    db_info = {'name': names[i], 'path': db_path, 'image_idx': sample_idx, 'gt_idx': i,
                               'box3d_lidar': gt_boxes[i], 'num_points_in_gt': gt_points.shape[0],
                               'difficulty': difficulty[i], 'bbox': bbox[i], 'score': annos['score'][i]}
                    if names[i] in all_db_infos:
                        all_db_infos[names[i]].append(db_info)
                    else:
                        all_db_infos[names[i]] = [db_info]

        # Output the number of gt samples per class in the database
        for k, v in all_db_infos.items():
            print('Database %s: %d' % (k, len(v)))

        with open(db_info_save_path, 'wb') as f:
            pickle.dump(all_db_infos, f)

    @staticmethod
    def create_label_file_with_name_and_box(class_names, gt_names, gt_boxes, save_label_path):
        with open(save_label_path, 'w') as f:
            for idx in range(gt_boxes.shape[0]):
                boxes = gt_boxes[idx]
                name = gt_names[idx]
                if name not in class_names:
                    continue
                line = "{x} {y} {z} {l} {w} {h} {angle} {name}\n".format(
                    x=boxes[0], y=boxes[1], z=(boxes[2]), l=boxes[3],
                    w=boxes[4], h=boxes[5], angle=boxes[6], name=name
                )
                f.write(line)

    @staticmethod
    def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):
        """
        Args:
            batch_dict:
                frame_id:
            pred_dicts: list of pred_dicts
                pred_boxes: (N, 7), Tensor
                pred_scores: (N), Tensor
                pred_labels: (N), Tensor
            class_names:
            output_path:
        Returns:
        """

        def get_template_prediction(num_samples):
            ret_dict = {
                'name': np.zeros(num_samples), 'truncated': np.zeros(num_samples),
                'occluded': np.zeros(num_samples), 'alpha': np.zeros(num_samples),
                'bbox': np.zeros([num_samples, 4]), 'dimensions': np.zeros([num_samples, 3]),
                'location': np.zeros([num_samples, 3]), 'rotation_y': np.zeros(num_samples),
                'score': np.zeros(num_samples), 'boxes_lidar': np.zeros([num_samples, 7])
            }
            return ret_dict

        def generate_single_sample_dict(batch_index, box_dict):
            pred_scores = box_dict['pred_scores'].cpu().numpy()
            pred_boxes = box_dict['pred_boxes'].cpu().numpy()
            pred_labels = box_dict['pred_labels'].cpu().numpy()
            pred_dict = get_template_prediction(pred_scores.shape[0])
            if pred_scores.shape[0] == 0:
                return pred_dict

            calib = batch_dict['calib'][batch_index]
            image_shape = batch_dict['image_shape'][batch_index].cpu().numpy()
            pred_boxes_camera = box_utils.boxes3d_lidar_to_kitti_camera(pred_boxes, calib)
            pred_boxes_img = box_utils.boxes3d_kitti_camera_to_imageboxes(
                pred_boxes_camera, calib, image_shape=image_shape
            )

            pred_dict['name'] = np.array(class_names)[pred_labels - 1]
            pred_dict['alpha'] = -np.arctan2(-pred_boxes[:, 1], pred_boxes[:, 0]) + pred_boxes_camera[:, 6]
            pred_dict['bbox'] = pred_boxes_img
            pred_dict['dimensions'] = pred_boxes_camera[:, 3:6]
            pred_dict['location'] = pred_boxes_camera[:, 0:3]
            pred_dict['rotation_y'] = pred_boxes_camera[:, 6]
            pred_dict['score'] = pred_scores
            pred_dict['boxes_lidar'] = pred_boxes
            return pred_dict

        annos = []
        for index, box_dict in enumerate(pred_dicts):
            frame_id = batch_dict['frame_id'][index]
            single_pred_dict = generate_single_sample_dict(index, box_dict)
            single_pred_dict['frame_id'] = frame_id
            annos.append(single_pred_dict)

            if output_path is not None:
                # Dump one KITTI-format txt per frame
                cur_det_file = output_path / ('%s.txt' % frame_id)
                with open(cur_det_file, 'w') as f:
                    bbox = single_pred_dict['bbox']
                    loc = single_pred_dict['location']
                    dims = single_pred_dict['dimensions']  # lhw -> hwl

                    for idx in range(len(bbox)):
                        print('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f'
                              % (single_pred_dict['name'][idx], single_pred_dict['alpha'][idx],
                                 bbox[idx][0], bbox[idx][1], bbox[idx][2], bbox[idx][3],
                                 dims[idx][1], dims[idx][2], dims[idx][0], loc[idx][0],
                                 loc[idx][1], loc[idx][2], single_pred_dict['rotation_y'][idx],
                                 single_pred_dict['score'][idx]), file=f)
        return annos


def create_custom_infos(dataset_cfg, class_names, data_path, save_path, workers=4):
    dataset = CustomDataset(
        dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path,
        training=False, logger=common_utils.create_logger()
    )
    train_split, val_split = 'train', 'val'
    num_features = len(dataset_cfg.POINT_FEATURE_ENCODING.src_feature_list)

    train_filename = save_path / ('custom_infos_%s.pkl' % train_split)
    val_filename = save_path / ('custom_infos_%s.pkl' % val_split)

    print('------------------------Start to generate data infos------------------------')
    dataset.set_split(train_split)
    custom_infos_train = dataset.get_infos(
        num_workers=workers, has_label=True, count_inside_pts=True
    )
    with open(train_filename, 'wb') as f:
        pickle.dump(custom_infos_train, f)
    print('Custom info train file is saved to %s' % train_filename)

    dataset.set_split(val_split)
    custom_infos_val = dataset.get_infos(
        num_workers=workers, has_label=True, count_inside_pts=True
    )
    with open(val_filename, 'wb') as f:
        pickle.dump(custom_infos_val, f)
    print('Custom info val file is saved to %s' % val_filename)

    print('------------------------Start create groundtruth database for data augmentation------------------------')
    dataset.set_split(train_split)
    dataset.create_groundtruth_database(train_filename, split=train_split)
    print('------------------------Data preparation done------------------------')


if __name__ == '__main__':
    import sys

    if sys.argv.__len__() > 1 and sys.argv[1] == 'create_custom_infos':
        import yaml
        from pathlib import Path
        from easydict import EasyDict

        dataset_cfg = EasyDict(yaml.safe_load(open(sys.argv[2])))
        ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve()
        create_custom_infos(
            dataset_cfg=dataset_cfg,
            class_names=['Car', 'Pedestrian', 'Van'],
            data_path=ROOT_DIR / 'data' / 'custom',
            save_path=ROOT_DIR / 'data' / 'custom',
        )
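After running the info generation (section III below), it is worth loading the pickle once to confirm the structure; a throwaway snippet, assuming the default save path:

import pickle

with open('data/custom/custom_infos_train.pkl', 'rb') as f:
    infos = pickle.load(f)
print(len(infos))        # number of training samples
print(infos[0].keys())   # expect: point_cloud, image, calib, annos
print(infos[0]['annos']['gt_boxes_lidar'].shape)  # (num_objects, 7)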

2. tools/cfgs/custom_models/pointpillar.yaml

This file configures the model and training parameters.

I mainly changed the following:

1) CLASS_NAMES (replace with your own class names)

2) _BASE_CONFIG_ (the path to custom_dataset.yaml; I suggest using the full absolute path)

3) POINT_CLOUD_RANGE and VOXEL_SIZE

These two matter a lot: they directly determine the tensor shapes propagated through the model, and wrong values quickly cause errors. The official suggestion is that the voxel counts along X and Y be multiples of 16, with 40 along Z. I tried a few other settings without success and never fully worked out why, so I simply left these alone; a quick numeric check follows after this list.

4) ANCHOR_GENERATOR_CONFIG

I set my own class entries and feature_map_stride, and removed gt_sampling.
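On point 3, the constraint is easy to check numerically; a quick sketch using the values from the config below:

import numpy as np

pc_range = np.array([0, -39.68, -3, 69.12, 39.68, 1])  # POINT_CLOUD_RANGE
voxel_size = np.array([0.16, 0.16, 4])                 # VOXEL_SIZE
grid = (pc_range[3:6] - pc_range[0:3]) / voxel_size
print(grid)           # [432. 496.   1.] -> PointPillars uses a single pillar along Z
print(grid[:2] / 16)  # [27. 31.]        -> X/Y voxel counts are multiples of 16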

The full config is as follows:

CLASS_NAMES: ['Car', 'Pedestrian', 'Van']

DATA_CONFIG:
    _BASE_CONFIG_: /home/gmm/下载/OpenPCDet/tools/cfgs/dataset_configs/custom_dataset.yaml
    POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]
    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True

        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
              'train': True,
              'test': False
          }

        - NAME: transform_points_to_voxels
          VOXEL_SIZE: [0.16, 0.16, 4]
          MAX_POINTS_PER_VOXEL: 32
          MAX_NUMBER_OF_VOXELS: {
              'train': 16000,
              'test': 40000
          }
    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            # - NAME: gt_sampling
            #   USE_ROAD_PLANE: True
            #   DB_INFO_PATH:
            #       - custom_dbinfos_train.pkl
            #   PREPARE: {
            #       filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Van:5']
            #   }
            #   SAMPLE_GROUPS: ['Car:15', 'Pedestrian:15', 'Van:15']
            #   NUM_POINT_FEATURES: 4
            #   DATABASE_WITH_FAKELIDAR: False
            #   REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
            #   LIMIT_WHOLE_SCENE: False

            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x']

            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]

MODEL:
    NAME: PointPillar

    VFE:
        NAME: PillarVFE
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [64]

    MAP_TO_BEV:
        NAME: PointPillarScatter
        NUM_BEV_FEATURES: 64

    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [3, 5, 5]
        LAYER_STRIDES: [2, 2, 2]
        NUM_FILTERS: [64, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]

    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False

        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2

        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'Car',
                'anchor_sizes': [[1.8, 4.7, 1.8]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [0],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.55,
                'unmatched_threshold': 0.45
            },
            {
                'class_name': 'Pedestrian',
                'anchor_sizes': [[0.77, 0.92, 1.83]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [0],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.45
            },
            {
                'class_name': 'Van',
                'anchor_sizes': [[2.5, 5.7, 1.9]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [0],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.45
            },
        ]

        TARGET_ASSIGNER_CONFIG:
            NAME: AxisAlignedTargetAssigner
            POS_FRACTION: -1.0
            SAMPLE_SIZE: 512
            NORM_BY_NUM_EXAMPLES: False
            MATCH_HEIGHT: False
            BOX_CODER: ResidualCoder

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False
        EVAL_METRIC: kitti
        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500

OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 4
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10

3. tools/cfgs/dataset_configs/custom_dataset.yaml

I changed DATA_PATH, POINT_CLOUD_RANGE, MAP_CLASS_TO_KITTI, and a few other class-related fields. Note that pointpillar.yaml above pulls this file in as its _BASE_CONFIG_, so values it redefines (such as POINT_CLOUD_RANGE and the voxel settings) override the ones here.

The modified config is as follows:

DATASET: 'CustomDataset'
DATA_PATH: '/home/gmm/下载/OpenPCDet/data/custom'

POINT_CLOUD_RANGE: [0, -40, -3, 70.4, 40, 1]

DATA_SPLIT: {
    'train': train,
    'test': val
}

INFO_PATH: {
    'train': [custom_infos_train.pkl],
    'test': [custom_infos_val.pkl],
}

GET_ITEM_LIST: ["points"]
FOV_POINTS_ONLY: True

MAP_CLASS_TO_KITTI: {
    'Car': 'Car',
    'Pedestrian': 'Pedestrian',
    'Van': 'Cyclist',
}

DATA_AUGMENTOR:
    DISABLE_AUG_LIST: ['placeholder']
    AUG_CONFIG_LIST:
        - NAME: gt_sampling
          USE_ROAD_PLANE: False
          DB_INFO_PATH:
              - custom_dbinfos_train.pkl
          PREPARE: {
              filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Van:5'],
          }
          SAMPLE_GROUPS: ['Car:20', 'Pedestrian:15', 'Van:20']
          NUM_POINT_FEATURES: 4
          DATABASE_WITH_FAKELIDAR: False
          REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
          LIMIT_WHOLE_SCENE: True

        - NAME: random_world_flip
          ALONG_AXIS_LIST: ['x']

        - NAME: random_world_rotation
          WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

        - NAME: random_world_scaling
          WORLD_SCALE_RANGE: [0.95, 1.05]

POINT_FEATURE_ENCODING: {
    encoding_type: absolute_coordinates_encoding,
    used_feature_list: ['x', 'y', 'z', 'intensity'],
    src_feature_list: ['x', 'y', 'z', 'intensity'],
}

DATA_PROCESSOR:
    - NAME: mask_points_and_boxes_outside_range
      REMOVE_OUTSIDE_BOXES: True

    - NAME: shuffle_points
      SHUFFLE_ENABLED: {
          'train': True,
          'test': False
      }

    - NAME: transform_points_to_voxels
      VOXEL_SIZE: [0.05, 0.05, 0.1]
      MAX_POINTS_PER_VOXEL: 5
      MAX_NUMBER_OF_VOXELS: {
          'train': 16000,
          'test': 40000
      }

4. demo.py

After my first training run, no detection boxes showed up. I eventually realized my dataset is probably too small, so the predicted boxes all had very low confidence. I therefore added a score mask ahead of the V.draw_scenes call and applied it to the predictions passed in, after which the boxes finally appeared.

The modified part of demo.py:

with torch.no_grad():
    for idx, data_dict in enumerate(demo_dataset):
        logger.info(f'Visualized sample index: \t{idx + 1}')
        data_dict = demo_dataset.collate_batch([data_dict])
        load_data_to_gpu(data_dict)
        pred_dicts, _ = model.forward(data_dict)

        # Keep only predictions with confidence above 0.3; with a small
        # training set the boxes otherwise all score too low to show up.
        mask = pred_dicts[0]['pred_scores'] > 0.3
        V.draw_scenes(
            points=data_dict['points'][:, 1:], ref_boxes=pred_dicts[0]['pred_boxes'][mask],
            ref_scores=pred_dicts[0]['pred_scores'][mask], ref_labels=pred_dicts[0]['pred_labels'][mask],
        )

        if not OPEN3D_FLAG:
            mlab.show(stop=True)
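For completeness, I launch the demo roughly like this (the checkpoint and data paths are examples; the output path follows OpenPCDet's usual output/<cfg_dir>/<cfg_name>/default/ckpt layout, so point both at your own files):

python tools/demo.py --cfg_file tools/cfgs/custom_models/pointpillar.yaml --ckpt output/custom_models/pointpillar/default/ckpt/checkpoint_epoch_10.pth --data_path data/custom/training/velodyne/000000.bin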

III. Running the Pipeline

1. Generate the data infos

python -m pcdet.datasets.custom.custom_dataset create_custom_infos tools/cfgs/dataset_configs/custom_dataset.yaml

 

2. Training

To save time I only trained for 10 epochs here; set this however you like.

python tools/train.py --cfg_file tools/cfgs/custom_models/pointpillar.yaml --batch_size=1 --epochs=10

A warning shows up during training that I have not tracked down; it can be ignored for now: [W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)

3. Evaluation

Because the dataset has few samples and training was short, the evaluation results are understandably poor.
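For reference, evaluation uses the stock test script (the checkpoint path is an example; point it at your own ckpt):

python tools/test.py --cfg_file tools/cfgs/custom_models/pointpillar.yaml --batch_size 1 --ckpt output/custom_models/pointpillar/default/ckpt/checkpoint_epoch_10.pth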

4. Results

At least something shows up. If no detection boxes appear, lower the score threshold in demo.py.
