The segmentation field of a COCO annotation is not a binary mask; for ordinary (non-crowd) objects it is stored in polygon format.
Here is one annotation:
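{
"segmentation": [[510.66,423.01,511.72,420.03,510.45......]], # consecutive pairs form (x, y) coordinates; polygon format
"area": 702.1057499999998, # area
"iscrowd": 0, # whether this is a crowd of objects; 0 means segmentation is a polygon, otherwise it is RLE
"image_id": 289343, # ID of the image this annotation belongs to
"bbox": [473.07,395.93,38.65,28.67], # (x, y, w, h), where (x, y) is the top-left corner
"category_id": 18, # class label
"id": 1768 # ID of this annotation; an image can contain several objects, so each one is numbered (every object ID is unique)
},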
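To make the flat layout concrete, here is a minimal, dependency-free sketch that pairs the values up into (x, y) points and derives a bounding box and an area from them. The helper names and the toy rectangle are made up for illustration. The area uses the shoelace formula, so it may differ slightly from the area value stored in a real annotation.

```python
from typing import List, Tuple


def polygon_points(flat: List[float]) -> List[Tuple[float, float]]:
    """Pair up a flat [x0, y0, x1, y1, ...] list into (x, y) tuples."""
    if len(flat) % 2 != 0:
        raise ValueError("polygon must contain an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def polygon_bbox(flat: List[float]) -> List[float]:
    """Axis-aligned box around the polygon in COCO (x, y, w, h) order."""
    pts = polygon_points(flat)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


def polygon_area(flat: List[float]) -> float:
    """Area enclosed by the polygon, via the shoelace formula."""
    pts = polygon_points(flat)
    twice_area = 0.0
    # Walk every edge, including the closing edge back to the first point.
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


# Hypothetical 4-point polygon (a 4 x 3 rectangle), not taken from COCO.
toy_segmentation = [[10.0, 20.0, 14.0, 20.0, 14.0, 23.0, 10.0, 23.0]]
toy_points = polygon_points(toy_segmentation[0])
toy_bbox = polygon_bbox(toy_segmentation[0])
toy_area = polygon_area(toy_segmentation[0])
print(toy_points, toy_bbox, toy_area)
```

The outer list in segmentation exists because one object can be made of several disjoint polygons; each inner list is processed the same way.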
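segmentation is really the list of contour points of a binary mask,
so converting a binary mask into this format means extracting its contour.
Take a binary mask image found online
and extract its contours: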
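import cv2

mask_img = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
# OpenCV 4.x returns (contours, hierarchy); OpenCV 3.x returns three values.
contours, _ = cv2.findContours(mask_img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
polygons = []
for contour in contours:
    # Skip degenerate contours: a polygon needs at least 3 points.
    if len(contour) < 3:
        continue
    # Each contour has shape (N, 1, 2); flatten it to [x0, y0, x1, y1, ...].
    coords = []
    for point in contour:
        coords.append(int(point[0][0]))
        coords.append(int(point[0][1]))
    polygons.append(coords)
print(polygons)
# [[131, 48, 130, 49, 129, 50, 128, 51, ...]]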
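Can this polygon be used? Is it in the same format as a COCO annotation?
Let's verify.
Import the COCO Python API (pycocotools, a separate package rather than part of Python itself) and use its annToMask
function to convert the polygon back into a mask,
then display the result with imshow and check that it matches the original image.
First, look at the annToMask
function: it converts the polygon to RLE first and then decodes the RLE into a mask.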
def annToMask(self, ann):
    """
    Convert annotation which can be polygons, uncompressed RLE, or RLE to binary mask.
    :return: binary mask (numpy 2D array)
    """
    rle = self.annToRLE(ann)
    m = maskUtils.decode(rle)
    return m

def annToRLE(self, ann):
    """
    Convert annotation which can be polygons, uncompressed RLE to RLE.
    :return: binary mask (numpy 2D array)
    """
    t = self.imgs[ann['image_id']]
    h, w = t['height'], t['width']
    segm = ann['segmentation']
    if type(segm) == list:
        # polygon -- a single object might consist of multiple parts
        # we merge all parts into one mask rle code
        rles = maskUtils.frPyObjects(segm, h, w)
        rle = maskUtils.merge(rles)
    elif type(segm['counts']) == list:
        # uncompressed RLE
        rle = maskUtils.frPyObjects(segm, h, w)
    else:
        # rle
        rle = ann['segmentation']
    return rle
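In the polygon branch, the only things taken from the image record are h and w. The following dependency-free sketch shows what "polygon to mask" means conceptually: it fills every pixel whose center lies inside the polygon, using the even-odd rule. This is a simplified stand-in, not the rasterizer pycocotools uses, so boundary pixels may come out differently. The function name and the toy square are assumptions made for illustration.

```python
from typing import List


def polygon_to_mask(flat: List[float], h: int, w: int) -> List[List[int]]:
    """Rasterize one flat [x0, y0, x1, y1, ...] polygon onto an h x w grid.

    A pixel is set when its center (col + 0.5, row + 0.5) is inside the
    polygon under the even-odd rule.
    """
    pts = list(zip(flat[0::2], flat[1::2]))
    mask = [[0] * w for _ in range(h)]
    for row in range(h):
        py = row + 0.5
        for col in range(w):
            px = col + 0.5
            inside = False
            for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
                # Does this edge cross the horizontal ray going right from (px, py)?
                if (y0 > py) != (y1 > py):
                    x_cross = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
                    if px < x_cross:
                        inside = not inside
            mask[row][col] = 1 if inside else 0
    return mask


# Hypothetical square with corners (1, 1) and (4, 4) on a 5-row x 6-column grid.
toy_mask = polygon_to_mask([1, 1, 4, 1, 4, 4, 1, 4], h=5, w=6)
for toy_row in toy_mask:
    print(toy_row)
```

Changing h and w only changes the size of the canvas; the filled region stays at the same pixel coordinates as long as it fits inside the grid.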
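Converting to RLE uses two fields of the annotation: image_id and segmentation.
Right now we only have a segmentation and no image_id,
but image_id is only used to look up the image's w and h,
so load COCO's instances_val2017.json
file and borrow any image_id from it.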
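import numpy as np
from pycocotools.coco import COCO

coco_api = COCO('coco/annotations/instances_val2017.json')
ann = dict()
ann['image_id'] = 37777  # any valid image_id works; it only supplies h and w
ann['segmentation'] = polygons
mask = coco_api.annToMask(ann)  # uint8 array of 0/1 with shape (h, w)
mask = np.clip(mask*255,0,255)  # scale 0/1 to 0/255 for display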
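The decoded mask matches the original, which shows that the extracted contour is usable (the borrowed image_id makes h and w differ from the original image, but the mask region itself is unaffected as long as the polygon fits within that size).
(On close inspection there are small errors along the curved parts of the boundary.)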
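Rather than judging the boundary error by eye, one option is to count the differing pixels and compute the IoU between the original mask and the round-tripped one. The sketch below is dependency-free and works on nested lists. The function name and the two toy masks are made up for illustration, and both masks are assumed to have already been cropped or padded to the same shape.

```python
from typing import List, Tuple

Mask = List[List[int]]


def compare_masks(a: Mask, b: Mask) -> Tuple[int, float]:
    """Return (number of differing pixels, IoU) for two same-sized masks.

    Any non-zero value counts as foreground, so 0/1 and 0/255 masks can
    be compared directly.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("masks must have the same shape")
    differing = intersection = union = 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            fa, fb = va != 0, vb != 0
            differing += fa != fb
            intersection += fa and fb
            union += fa or fb
    iou = intersection / union if union else 1.0
    return differing, iou


# Hypothetical 3 x 4 masks: the second one loses a single boundary pixel.
original = [[0, 255, 255, 0], [0, 255, 255, 0], [0, 0, 0, 0]]
roundtrip = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
n_diff, iou = compare_masks(original, roundtrip)
print(n_diff, iou)
```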
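For that reason the RLE format described next is still the better choice.
For COCO segmentation, prefer RLE: the contour-based polygon introduces errors along the boundary, and the polygon format comes with a number of restrictions in the COCO functions.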
The mask-to-RLE code comes from the utils of Segment Anything;
here the mask is a binary mask stored as a tensor with integer values.
from typing import Any, Dict, List

import torch

def mask_to_rle(tensor: torch.Tensor) -> List[Dict[str, Any]]:
    """
    Encodes masks to an uncompressed RLE, in the format expected by
    pycoco tools.
    """
    # Put in fortran order and flatten h,w
    h, w, b = tensor.shape  # adjust to match the layout of your tensor
    tensor = tensor.permute(2, 1, 0).flatten(1)  # adjust to match the layout of your tensor

    # Compute change indices
    diff = tensor[:, 1:] ^ tensor[:, :-1]  # XOR requires an integer (or bool) dtype
    change_indices = diff.nonzero()

    # Encode run length
    out = []
    for i in range(b):
        cur_idxs = change_indices[change_indices[:, 0] == i, 1]
        cur_idxs = torch.cat(
            [
                torch.tensor([0], dtype=cur_idxs.dtype, device=cur_idxs.device),
                cur_idxs + 1,
                torch.tensor([h * w], dtype=cur_idxs.dtype, device=cur_idxs.device),
            ]
        )
        btw_idxs = cur_idxs[1:] - cur_idxs[:-1]
        counts = [] if tensor[i, 0] == 0 else [0]
        counts.extend(btw_idxs.detach().cpu().tolist())
        out.append({"size": [h, w], "counts": counts})
    return out
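To see what the resulting counts mean, here is a small pure-Python sketch of the same idea for a single mask: flatten in column-major (Fortran) order, then record run lengths that alternate between background and foreground, always starting with background. It is an illustration written for this walkthrough, not code from Segment Anything or pycocotools, and the names encode_rle and decode_rle are made up.

```python
from typing import Dict, List

Mask = List[List[int]]


def encode_rle(mask: Mask) -> Dict[str, List[int]]:
    """Uncompressed RLE of one h x w mask, scanning column by column."""
    h, w = len(mask), len(mask[0])
    # Column-major (Fortran-order) flattening: walk down each column in turn.
    flat = [1 if mask[r][c] else 0 for c in range(w) for r in range(h)]
    counts: List[int] = []
    current, run = 0, 0  # runs alternate 0s, 1s, 0s, ... starting with 0s
    for value in flat:
        if value == current:
            run += 1
        else:
            counts.append(run)
            current, run = value, 1
    counts.append(run)
    return {"size": [h, w], "counts": counts}


def decode_rle(rle: Dict[str, List[int]]) -> Mask:
    """Inverse of encode_rle: rebuild the 0/1 mask from the run lengths."""
    h, w = rle["size"]
    flat: List[int] = []
    value = 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    # Undo the column-major flattening.
    return [[flat[c * h + r] for c in range(w)] for r in range(h)]


# Hypothetical 3 x 4 mask used only for illustration.
toy = [[0, 1, 1, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 0]]
toy_rle = encode_rle(toy)
print(toy_rle)  # {'size': [3, 4], 'counts': [3, 2, 1, 1, 5]}
```

When the first pixel is foreground, the counts begin with a zero-length background run, which is the same case the `[] if tensor[i, 0] == 0 else [0]` line handles in the tensor version.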