
The Mask R-CNN Model

Data Preparation

To train a Mask R-CNN instance segmentation model, we first need to prepare masks for the images. This step is done with the annotation tool labelme (available on Windows and Ubuntu; install with (sudo) pip install labelme, which requires the dependency (sudo) pip install pyqt5). Once labelme is installed, running labelme on the command line opens an annotation window:

Change into the directory containing the file Abyssinian_65.json and run

labelme_json_to_dataset Abyssinian_65.json

This generates a folder named Abyssinian_65_json in the current directory, containing the following files:

        However, labelme has a significant limitation: it can only annotate closed polygons. If a target instance contains a hole (for example, the gap between the cat's legs in the second image, Abyssinian_65.jpg), the hole is counted as part of the instance, which is clearly wrong. To work around this, you can add an extra class hole when annotating (the green region in the figure above) and simply subtract the hole regions afterwards, e.g.:

 

        For training, TensorFlow requires each mask to be a binary (0-1) PNG image of the same size as the original image (as shown above), and the input data must be in TFRecord format. We therefore need a small helper Python script for the format conversion, which can be modeled on the official TensorFlow object detection file create_coco_tf_record.py.
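As a quick standalone illustration of that mask format (not part of the conversion script), a 0-1 numpy array can be round-tripped through PNG encoding with PIL, which is exactly how the script below stores masks in the TFRecord; the toy 4x4 mask here is made up for the example:

```python
import io

import numpy as np
import PIL.Image

# A toy 4x4 binary mask: value 1 inside the "object", 0 everywhere else.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1

# Encode the mask as PNG bytes (the form stored in the TFRecord).
buffer = io.BytesIO()
PIL.Image.fromarray(mask).save(buffer, format='PNG')
png_bytes = buffer.getvalue()

# Decoding the PNG recovers exactly the same 0-1 array.
decoded = np.array(PIL.Image.open(io.BytesIO(png_bytes)))
assert (decoded == mask).all()
```

PNG is lossless, so the 0-1 values survive the round trip unchanged, which is why the official pipeline can store per-object masks this way.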

        Before writing it, let's be precise about the input format: for each target in each image, the target's mask is a 0-1 binary image of the same size as the original, with value 1 inside the target region and 0 everywhere else (see the official TensorFlow/object_detection notes: Run an Instance Segmentation Model / PNG Instance Segmentation Masks). In other words, the masks of all targets in one image must be separated out of the single annotation file. This can be done with OpenCV's cv2.fillPoly function, which fills the interior of a given polygon with a user-chosen value.

Generating the Data

        Assume the mask annotations are ready. Since the smallest rectangle enclosing each target's mask is that target's bounding box, the object detection annotations come for free. All that remains is to convert these annotations (the original images plus the json files produced by labelme) into TFRecord files, using the following code (named create_tf_record.py, see github):

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sun Aug 26 10:57:09 2018

@author: shirhe-lyh
"""

"""Convert raw dataset to TFRecord for object_detection.

Please note that this tool only applies to labelme's annotations (json files).

Example usage:
    python3 create_tf_record.py \
        --images_dir=your absolute path to read images. \
        --annotations_json_dir=your path to annotation json files. \
        --label_map_path=your path to label_map.pbtxt \
        --output_path=your path to write .record.
"""

import cv2
import glob
import hashlib
import io
import json
import numpy as np
import os
import PIL.Image
import tensorflow as tf

import read_pbtxt_file


flags = tf.app.flags

flags.DEFINE_string('images_dir', None, 'Path to images directory.')
flags.DEFINE_string('annotations_json_dir', 'datasets/annotations',
                    'Path to annotations directory.')
flags.DEFINE_string('label_map_path', None, 'Path to label map proto.')
flags.DEFINE_string('output_path', None, 'Path to the output tfrecord.')

FLAGS = flags.FLAGS


def int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def int64_list_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def bytes_list_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))


def float_list_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))


def create_tf_example(annotation_dict, label_map_dict=None):
    """Converts image and annotations to a tf.Example proto.

    Args:
        annotation_dict: A dictionary containing the following keys:
            ['height', 'width', 'filename', 'sha256_key', 'encoded_jpg',
             'format', 'xmins', 'xmaxs', 'ymins', 'ymaxs', 'masks',
             'class_names'].
        label_map_dict: A dictionary mapping class names to indices.

    Returns:
        example: The converted tf.Example.

    Raises:
        ValueError: If label_map_dict is None or does not contain a
            class_name.
    """
    if annotation_dict is None:
        return None
    if label_map_dict is None:
        raise ValueError('`label_map_dict` is None')

    height = annotation_dict.get('height', None)
    width = annotation_dict.get('width', None)
    filename = annotation_dict.get('filename', None)
    sha256_key = annotation_dict.get('sha256_key', None)
    encoded_jpg = annotation_dict.get('encoded_jpg', None)
    image_format = annotation_dict.get('format', None)
    xmins = annotation_dict.get('xmins', None)
    xmaxs = annotation_dict.get('xmaxs', None)
    ymins = annotation_dict.get('ymins', None)
    ymaxs = annotation_dict.get('ymaxs', None)
    masks = annotation_dict.get('masks', None)
    class_names = annotation_dict.get('class_names', None)

    labels = []
    for class_name in class_names:
        label = label_map_dict.get(class_name, None)
        if label is None:
            raise ValueError('`label_map_dict` does not contain {}.'.format(
                class_name))
        labels.append(label)

    # Encode each 0-1 mask as a PNG image.
    encoded_masks = []
    for mask in masks:
        pil_image = PIL.Image.fromarray(mask.astype(np.uint8))
        output_io = io.BytesIO()
        pil_image.save(output_io, format='PNG')
        encoded_masks.append(output_io.getvalue())

    feature_dict = {
        'image/height': int64_feature(height),
        'image/width': int64_feature(width),
        'image/filename': bytes_feature(filename.encode('utf8')),
        'image/source_id': bytes_feature(filename.encode('utf8')),
        'image/key/sha256': bytes_feature(sha256_key.encode('utf8')),
        'image/encoded': bytes_feature(encoded_jpg),
        'image/format': bytes_feature(image_format.encode('utf8')),
        'image/object/bbox/xmin': float_list_feature(xmins),
        'image/object/bbox/xmax': float_list_feature(xmaxs),
        'image/object/bbox/ymin': float_list_feature(ymins),
        'image/object/bbox/ymax': float_list_feature(ymaxs),
        'image/object/mask': bytes_list_feature(encoded_masks),
        'image/object/class/label': int64_list_feature(labels)}
    example = tf.train.Example(features=tf.train.Features(
        feature=feature_dict))
    return example


def _get_annotation_dict(images_dir, annotation_json_path):
    """Get bounding boxes and masks.

    Args:
        images_dir: Path to images directory.
        annotation_json_path: Path to the annotated json file corresponding
            to the image. The json file is produced by labelme, with keys:
            ['lineColor', 'imageData', 'fillColor', 'imagePath', 'shapes',
             'flags'].

    Returns:
        annotation_dict: A dictionary containing the following keys:
            ['height', 'width', 'filename', 'sha256_key', 'encoded_jpg',
             'format', 'xmins', 'xmaxs', 'ymins', 'ymaxs', 'masks',
             'class_names'], or None if the paths do not exist or the
            annotation is malformed.
    """
    if (not os.path.exists(images_dir) or
            not os.path.exists(annotation_json_path)):
        return None

    with open(annotation_json_path, 'r') as f:
        json_text = json.load(f)
    shapes = json_text.get('shapes', None)
    if shapes is None:
        return None
    image_relative_path = json_text.get('imagePath', None)
    if image_relative_path is None:
        return None
    image_name = image_relative_path.split('/')[-1]
    image_path = os.path.join(images_dir, image_name)
    image_format = image_name.split('.')[-1].replace('jpg', 'jpeg')
    if not os.path.exists(image_path):
        return None

    with tf.gfile.GFile(image_path, 'rb') as fid:
        encoded_jpg = fid.read()
    image = cv2.imread(image_path)
    height = image.shape[0]
    width = image.shape[1]
    key = hashlib.sha256(encoded_jpg).hexdigest()

    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    masks = []
    class_names = []
    hole_polygons = []
    for mark in shapes:
        class_name = mark.get('label')
        class_names.append(class_name)
        polygon = mark.get('points')
        # cv2.fillPoly requires integer vertex coordinates.
        polygon = np.array(polygon, dtype=np.int32)
        if class_name == 'hole':
            hole_polygons.append(polygon)
        else:
            mask = np.zeros(image.shape[:2])
            cv2.fillPoly(mask, [polygon], 1)
            masks.append(mask)

            # Bounding box: the smallest rectangle enclosing the polygon,
            # normalized to [0, 1].
            x = polygon[:, 0]
            y = polygon[:, 1]
            xmin = np.min(x)
            xmax = np.max(x)
            ymin = np.min(y)
            ymax = np.max(y)
            xmins.append(float(xmin) / width)
            xmaxs.append(float(xmax) / width)
            ymins.append(float(ymin) / height)
            ymaxs.append(float(ymax) / height)
    # Remove the holes from each mask (cv2.fillPoly fills in place).
    for mask in masks:
        cv2.fillPoly(mask, hole_polygons, 0)

    annotation_dict = {'height': height,
                       'width': width,
                       'filename': image_name,
                       'sha256_key': key,
                       'encoded_jpg': encoded_jpg,
                       'format': image_format,
                       'xmins': xmins,
                       'xmaxs': xmaxs,
                       'ymins': ymins,
                       'ymaxs': ymaxs,
                       'masks': masks,
                       'class_names': class_names}
    return annotation_dict


def main(_):
    if not os.path.exists(FLAGS.images_dir):
        raise ValueError('`images_dir` does not exist.')
    if not os.path.exists(FLAGS.annotations_json_dir):
        raise ValueError('`annotations_json_dir` does not exist.')
    if not os.path.exists(FLAGS.label_map_path):
        raise ValueError('`label_map_path` does not exist.')

    label_map = read_pbtxt_file.get_label_map_dict(FLAGS.label_map_path)

    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

    num_annotations_skipped = 0
    annotations_json_path = os.path.join(FLAGS.annotations_json_dir, '*.json')
    for i, annotation_file in enumerate(glob.glob(annotations_json_path)):
        if i % 100 == 0:
            print('On image %d' % i)

        annotation_dict = _get_annotation_dict(FLAGS.images_dir,
                                               annotation_file)
        if annotation_dict is None:
            num_annotations_skipped += 1
            continue
        tf_example = create_tf_example(annotation_dict, label_map)
        writer.write(tf_example.SerializeToString())

    print('Successfully created TFRecord to {}.'.format(FLAGS.output_path))


if __name__ == '__main__':
    tf.app.run()

Assume all your original images live in path_to_images_dir, the labelme json files used for training in path_to_train_annotations_json_dir, and those used for validation in path_to_val_annotations_json_dir. Run the following commands in the terminal, one after the other:

$ python3 create_tf_record.py \
    --images_dir=path_to_images_dir \
    --annotations_json_dir=path_to_train_annotations_json_dir \
    --label_map_path=path_to_label_map.pbtxt \
    --output_path=path_to_train.record

$ python3 create_tf_record.py \
    --images_dir=path_to_images_dir \
    --annotations_json_dir=path_to_val_annotations_json_dir \
    --label_map_path=path_to_label_map.pbtxt \
    --output_path=path_to_val.record

All of the above paths may be relative. output_path is where the output train.record and val.record are written, and label_map_path is the configuration file listing every class name to detect together with its class id. The file has the .pbtxt extension and is easy to write: to detect targets of the classes 'person', 'car', and 'bicycle', for example, write:

item {
        id: 1
        name: 'person'
}

item {
        id: 2
        name: 'car'
}

item {
        id: 3
        name: 'bicycle'
}

...
