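To train a Mask R-CNN instance segmentation model, we first need to prepare the image masks. This step is done with the annotation tool labelme, which supports Windows and Ubuntu. Install it with (sudo) pip install labelme; it depends on PyQt5, which is installed with (sudo) pip install pyqt5. Once labelme is installed, running labelme on the command line opens an annotation window.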
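After annotating an image and saving the result (here Abyssinian_65.json), change to the folder containing Abyssinian_65.json in the terminal and run
labelme_json_to_dataset Abyssinian_65.json
This creates a folder named Abyssinian_65_json in the current directory, which holds the converted output files.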
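However, labelme has one serious limitation: it can only annotate closed polygons. If an object instance contains a hole (such as the gap between the cat's legs in the second image, Abyssinian_65.jpg), the hole is counted as part of the instance, which is clearly wrong. To work around this, add an extra class named hole when annotating (the green region in the annotated image), and remove the hole regions from the instance mask when the annotation is actually used.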
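For training, TensorFlow requires each mask to be a binary (0-1) PNG image of the same size as the original image, and the input data must be a tfrecord file. We therefore need a helper Python script that converts the data format; it can be modeled on create_coco_tf_record.py from the official TensorFlow object detection code.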
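Before writing it, note the required input format: for every object in every image, the object's mask is a 0-1 binary image of the same size as the original image, with value 1 inside the object's region and 0 everywhere else (see the official TensorFlow/object_detection documentation: Run an Instance Segmentation Model/PNG Instance Segmentation Masks). In other words, the masks of all objects in one image have to be separated out of the single annotation file. This can be done with OpenCV's cv2.fillPoly function, which fills the interior of a given polygon with a user-specified value.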
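Suppose the mask annotations are ready. Because the smallest rectangle enclosing an object's mask is that object's bounding box, the annotations for object detection are available at the same time. All that remains is to convert the annotation data (the original images plus the json files produced by labelme) into a TFRecord file, which the following code does (named create_tf_record.py, see github):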
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sun Aug 26 10:57:09 2018
@author: shirhe-lyh
"""

"""Convert raw dataset to TFRecord for object_detection.

Please note that this tool only applies to labelme's annotations (json file).

Example usage:
    python3 create_tf_record.py \
        --images_dir=your absolute path to read images. \
        --annotations_json_dir=your path to annotation json files. \
        --label_map_path=your path to label_map.pbtxt \
        --output_path=your path to write .record.
"""

import cv2
import glob
import hashlib
import io
import json
import numpy as np
import os
import PIL.Image
import tensorflow as tf

import read_pbtxt_file


# Note: tf.app.flags, tf.gfile and tf.python_io are TensorFlow 1.x APIs.
flags = tf.app.flags

flags.DEFINE_string('images_dir', None, 'Path to images directory.')
flags.DEFINE_string('annotations_json_dir', 'datasets/annotations',
                    'Path to annotations directory.')
flags.DEFINE_string('label_map_path', None, 'Path to label map proto.')
flags.DEFINE_string('output_path', None, 'Path to the output tfrecord.')

FLAGS = flags.FLAGS


def int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def int64_list_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def bytes_list_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))


def float_list_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))


def create_tf_example(annotation_dict, label_map_dict=None):
    """Converts image and annotations to a tf.Example proto.

    Args:
        annotation_dict: A dictionary containing the following keys:
            ['height', 'width', 'filename', 'sha256_key', 'encoded_jpg',
             'format', 'xmins', 'xmaxs', 'ymins', 'ymaxs', 'masks',
             'class_names'].
        label_map_dict: A dictionary mapping class_names to indices.

    Returns:
        example: The converted tf.Example.

    Raises:
        ValueError: If label_map_dict is None or does not contain a
            class_name.
    """
    if annotation_dict is None:
        return None
    if label_map_dict is None:
        raise ValueError('`label_map_dict` is None')

    height = annotation_dict.get('height', None)
    width = annotation_dict.get('width', None)
    filename = annotation_dict.get('filename', None)
    sha256_key = annotation_dict.get('sha256_key', None)
    encoded_jpg = annotation_dict.get('encoded_jpg', None)
    image_format = annotation_dict.get('format', None)
    xmins = annotation_dict.get('xmins', None)
    xmaxs = annotation_dict.get('xmaxs', None)
    ymins = annotation_dict.get('ymins', None)
    ymaxs = annotation_dict.get('ymaxs', None)
    masks = annotation_dict.get('masks', None)
    class_names = annotation_dict.get('class_names', None)

    # Map every class name to its integer id from the label map.
    labels = []
    for class_name in class_names:
        label = label_map_dict.get(class_name, None)
        if label is None:
            raise ValueError('`label_map_dict` does not contain {}.'.format(
                class_name))
        labels.append(label)

    # Encode every 0-1 mask as a PNG byte string.
    encoded_masks = []
    for mask in masks:
        pil_image = PIL.Image.fromarray(mask.astype(np.uint8))
        output_io = io.BytesIO()
        pil_image.save(output_io, format='PNG')
        encoded_masks.append(output_io.getvalue())

    feature_dict = {
        'image/height': int64_feature(height),
        'image/width': int64_feature(width),
        'image/filename': bytes_feature(filename.encode('utf8')),
        'image/source_id': bytes_feature(filename.encode('utf8')),
        'image/key/sha256': bytes_feature(sha256_key.encode('utf8')),
        'image/encoded': bytes_feature(encoded_jpg),
        'image/format': bytes_feature(image_format.encode('utf8')),
        'image/object/bbox/xmin': float_list_feature(xmins),
        'image/object/bbox/xmax': float_list_feature(xmaxs),
        'image/object/bbox/ymin': float_list_feature(ymins),
        'image/object/bbox/ymax': float_list_feature(ymaxs),
        'image/object/mask': bytes_list_feature(encoded_masks),
        'image/object/class/label': int64_list_feature(labels)}
    example = tf.train.Example(features=tf.train.Features(
        feature=feature_dict))
    return example


def _get_annotation_dict(images_dir, annotation_json_path):
    """Get bounding boxes and masks.

    Args:
        images_dir: Path to images directory.
        annotation_json_path: Path to annotated json file corresponding to
            the image. The json file annotated by labelme with keys:
            ['lineColor', 'imageData', 'fillColor', 'imagePath', 'shapes',
             'flags'].

    Returns:
        annotation_dict: A dictionary containing the following keys:
            ['height', 'width', 'filename', 'sha256_key', 'encoded_jpg',
             'format', 'xmins', 'xmaxs', 'ymins', 'ymaxs', 'masks',
             'class_names'],
            or None if a required path, key or image is missing.
    """
    if (not os.path.exists(images_dir) or
            not os.path.exists(annotation_json_path)):
        return None

    with open(annotation_json_path, 'r') as f:
        json_text = json.load(f)
    shapes = json_text.get('shapes', None)
    if shapes is None:
        return None
    image_relative_path = json_text.get('imagePath', None)
    if image_relative_path is None:
        return None
    # Accept both '/' and '\\' separators in the stored image path.
    image_name = image_relative_path.replace('\\', '/').split('/')[-1]
    image_path = os.path.join(images_dir, image_name)
    image_format = image_name.split('.')[-1].replace('jpg', 'jpeg')
    if not os.path.exists(image_path):
        return None

    with tf.gfile.GFile(image_path, 'rb') as fid:
        encoded_jpg = fid.read()
    image = cv2.imread(image_path)
    if image is None:
        return None
    height = image.shape[0]
    width = image.shape[1]
    key = hashlib.sha256(encoded_jpg).hexdigest()

    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    masks = []
    class_names = []
    hole_polygons = []
    for mark in shapes:
        class_name = mark.get('label')
        # labelme may store float coordinates; cv2.fillPoly needs int32.
        polygon = np.round(np.array(mark.get('points'))).astype(np.int32)
        if class_name == 'hole':
            # Holes are not instances: no mask, box or class name for them.
            hole_polygons.append(polygon)
        else:
            class_names.append(class_name)
            mask = np.zeros(image.shape[:2], dtype=np.uint8)
            cv2.fillPoly(mask, [polygon], 1)
            masks.append(mask)

            # Bounding box, normalized to [0, 1]
            x = polygon[:, 0]
            y = polygon[:, 1]
            xmin = np.min(x)
            xmax = np.max(x)
            ymin = np.min(y)
            ymax = np.max(y)
            xmins.append(float(xmin) / width)
            xmaxs.append(float(xmax) / width)
            ymins.append(float(ymin) / height)
            ymaxs.append(float(ymax) / height)
    # Remove holes from every mask (fillPoly modifies the array in place)
    if hole_polygons:
        for mask in masks:
            cv2.fillPoly(mask, hole_polygons, 0)

    annotation_dict = {'height': height,
                       'width': width,
                       'filename': image_name,
                       'sha256_key': key,
                       'encoded_jpg': encoded_jpg,
                       'format': image_format,
                       'xmins': xmins,
                       'xmaxs': xmaxs,
                       'ymins': ymins,
                       'ymaxs': ymaxs,
                       'masks': masks,
                       'class_names': class_names}
    return annotation_dict


def main(_):
    if not os.path.exists(FLAGS.images_dir):
        raise ValueError('`images_dir` does not exist.')
    if not os.path.exists(FLAGS.annotations_json_dir):
        raise ValueError('`annotations_json_dir` does not exist.')
    if not os.path.exists(FLAGS.label_map_path):
        raise ValueError('`label_map_path` does not exist.')

    label_map = read_pbtxt_file.get_label_map_dict(FLAGS.label_map_path)

    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

    num_annotations_skipped = 0
    annotations_json_path = os.path.join(FLAGS.annotations_json_dir, '*.json')
    for i, annotation_file in enumerate(glob.glob(annotations_json_path)):
        if i % 100 == 0:
            print('On image %d' % i)

        annotation_dict = _get_annotation_dict(
            FLAGS.images_dir, annotation_file)
        if annotation_dict is None:
            num_annotations_skipped += 1
            continue
        tf_example = create_tf_example(annotation_dict, label_map)
        writer.write(tf_example.SerializeToString())

    # Flush and close the record file before reporting success.
    writer.close()

    print('Skipped {} annotation files.'.format(num_annotations_skipped))
    print('Successfully created TFRecord to {}.'.format(FLAGS.output_path))


if __name__ == '__main__':
    tf.app.run()
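The bounding-box step can be checked in isolation without TensorFlow or OpenCV. The sketch below assumes a polygon given as a list of [x, y] pixel coordinates, as in labelme's `points` field; the function name `polygon_to_normalized_bbox` is hypothetical and is not part of the script.

```python
def polygon_to_normalized_bbox(points, width, height):
    """Return (xmin, xmax, ymin, ymax) of a polygon, scaled to [0, 1]."""
    xs = [float(x) for x, _ in points]
    ys = [float(y) for _, y in points]
    # The tightest axis-aligned rectangle around the polygon is its box.
    return (min(xs) / width, max(xs) / width,
            min(ys) / height, max(ys) / height)


# A triangle inside an image that is 200 pixels wide and 100 pixels high.
bbox = polygon_to_normalized_bbox(
    [[50, 20], [150, 40], [100, 80]], width=200, height=100)
print(bbox)
```

The x coordinates are divided by the image width and the y coordinates by the image height, matching the normalized `image/object/bbox/*` features written to the record.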
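Suppose all your original images are in path_to_images_dir, the labelme json files used for training are in path_to_train_annotations_json_dir, and the json files used for validation are in path_to_val_annotations_json_dir. Run the following two commands in the terminal, one after the other: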
$ python3 create_tf_record.py \
    --images_dir=path_to_images_dir \
    --annotations_json_dir=path_to_train_annotations_json_dir \
    --label_map_path=path_to_label_map.pbtxt \
    --output_path=path_to_train.record
$ python3 create_tf_record.py \
    --images_dir=path_to_images_dir \
    --annotations_json_dir=path_to_val_annotations_json_dir \
    --label_map_path=path_to_label_map.pbtxt \
    --output_path=path_to_val.record
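All of the paths above may be relative. output_path is where train.record and val.record are written, and label_map_path is the configuration file that lists the names and ids of all classes to be detected. This file has the .pbtxt extension and is simple to write. For example, to detect the classes 'person', 'car', 'bicycle' and so on, it would contain: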
item {
  id: 1
  name: 'person'
}
item {
  id: 2
  name: 'car'
}
item {
  id: 3
  name: 'bicycle'
}
...
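The script loads this file through the author's `read_pbtxt_file` helper, which is not shown here. As a rough stand-in, the sketch below writes a label map in the layout above and parses it back into a name-to-id dictionary with a regular expression. The names `write_label_map` and `parse_label_map` are hypothetical, and the parser only handles the simple `id`/`name` layout shown above, so it should not be taken as the helper's actual implementation.

```python
import re


def write_label_map(class_names):
    """Build label-map text; ids start at 1 (0 is reserved for background)."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (index, name))
    return ''.join(items)


def parse_label_map(text):
    """Map each class name in a simple label map to its integer id."""
    pattern = re.compile(
        r"item\s*\{\s*id:\s*(\d+)\s*name:\s*['\"]([^'\"]+)['\"]\s*\}")
    return {name: int(index) for index, name in pattern.findall(text)}


text = write_label_map(['person', 'car', 'bicycle'])
label_map = parse_label_map(text)
print(label_map)
```

A dictionary of this shape is what `create_tf_example` expects as `label_map_dict`: every class name used in the annotations must appear as a key, otherwise the script raises a ValueError.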