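Official documentation: https://mmsegmentation.readthedocs.io/
My data: the images are RGB .tif files and the labels are single-channel 8-bit (0-255) .png files, with background pixels set to 0 and building pixels set to 255.
The steps for building a custom dataset for binary (two-class) segmentation are as follows:
1. Create mydataset.py in the mmsegmentation\mmseg\datasets folder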
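Note: set reduce_zero_label=False, and set ignore_index to a value that is not one of the class ids (0 and 1 here). Both parameters are described in the source file mmsegmentation\mmseg\datasets\custom.py. reduce_zero_label=True does not reduce the number of 0 pixels in the label; it marks label 0 as ignored (255) and shifts every other label down by one, which is meant for datasets where 0 is an unlabeled class. Because 0 is our background class, it is set to False. ignore_index is the label value that is excluded from evaluation.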
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class MydataDataset(CustomDataset):
    CLASSES = ('background', 'building')
    PALETTE = [[0, 0, 0], [255, 255, 255]]

    def __init__(self, **kwargs):
        super(MydataDataset, self).__init__(
            img_suffix='.tif',
            seg_map_suffix='.png',
            reduce_zero_label=False,
            ignore_index=10,
            classes=('background', 'building'),
            palette=[[0, 0, 0], [255, 255, 255]],
            **kwargs)
        assert osp.exists(self.img_dir)
The intersect_and_union function in mmsegmentation\mmseg\core\evaluation\metrics.py shows how reduce_zero_label affects the evaluation metrics:
if reduce_zero_label:
    label[label == 0] = 255
    label = label - 1
    label[label == 254] = 255

mask = (label != ignore_index)
pred_label = pred_label[mask]
label = label[mask]
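The effect of these few lines is easier to see on a toy array. The following is a minimal NumPy sketch that mirrors the excerpt; it is not the MMSegmentation function itself, and the helper name apply_label_rules and the toy label are made up for illustration.

```python
import numpy as np


def apply_label_rules(label, reduce_zero_label=False, ignore_index=255):
    """Return (processed_label, mask) following the logic of the excerpt."""
    label = label.astype(np.uint8).copy()
    if reduce_zero_label:
        label[label == 0] = 255      # class 0 becomes "ignore"
        label = label - 1            # shift the remaining ids down by one
        label[label == 254] = 255    # restore the ignore value after the shift
    mask = label != ignore_index     # pixels that take part in evaluation
    return label, mask


# Toy 2x2 label: 0 = background, 1 = building (hypothetical data).
toy = np.array([[0, 1], [1, 0]], dtype=np.uint8)

kept, mask_kept = apply_label_rules(toy, reduce_zero_label=False, ignore_index=10)
dropped, mask_dropped = apply_label_rules(toy, reduce_zero_label=True, ignore_index=255)
```

With reduce_zero_label=False every pixel stays in the evaluation. With reduce_zero_label=True the background pixels are masked out and building becomes class 0, which is not what a background/building task wants.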
2. Append the following code to the end of mmsegmentation\mmseg\core\evaluation\class_names.py
def mydata_classes():
    """mydata class names for external use."""
    return ['background', 'building']


def mydata_palette():
    """mydata palette for external use."""
    return [[0, 0, 0], [255, 255, 255]]
3. Register the dataset class in mmseg/datasets/__init__.py
Note: the line from .mydataset import MydataDataset is required; without the import the class is never registered.
from .mydataset import MydataDataset

__all__ = [
    'CustomDataset', 'build_dataloader', 'ConcatDataset', 'RepeatDataset',
    'DATASETS', 'build_dataset', 'PIPELINES', 'CityscapesDataset',
    'PascalVOCDataset', 'ADE20KDataset', 'PascalContextDataset',
    'PascalContextDataset59', 'ChaseDB1Dataset', 'DRIVEDataset', 'HRFDataset',
    'STAREDataset', 'DarkZurichDataset', 'NightDrivingDataset',
    'COCOStuffDataset', 'LoveDADataset', 'MultiImageMixDataset',
    'iSAIDDataset', 'ISPRSDataset', 'PotsdamDataset', 'MydataDataset',
]
4. Create mydata.py in mmsegmentation\configs\_base_\datasets and set the paths, sizes, and other related parameters
# dataset settings
dataset_type = 'MydataDataset'
data_root = '/media/vge/DataA/lcw/Sanyuan_Buildings'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', reduce_zero_label=False),
    dict(type='Resize', img_scale=(512, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(512, 512),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=4,  # batch size per GPU
    workers_per_gpu=6,  # number of data-loading workers per GPU
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='train/image',
        ann_dir='train/label',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='val/image',
        ann_dir='val/label',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='test/image',
        ann_dir='test/label',
        pipeline=test_pipeline))
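Before training, it can help to confirm that every image under the img_dir folders has a label with the same file stem under the matching ann_dir. The following is a minimal standard-library sketch; the helper name unmatched_images is hypothetical, and the folder names and suffixes are the ones used in the config above.

```python
from pathlib import Path


def unmatched_images(data_root, split, img_suffix='.tif', seg_map_suffix='.png'):
    """List image stems in <split>/image that have no label in <split>/label."""
    img_dir = Path(data_root) / split / 'image'
    ann_dir = Path(data_root) / split / 'label'
    image_stems = {p.stem for p in img_dir.glob('*' + img_suffix)}
    label_stems = {p.stem for p in ann_dir.glob('*' + seg_map_suffix)}
    return sorted(image_stems - label_stems)


if __name__ == '__main__':
    root = '/media/vge/DataA/lcw/Sanyuan_Buildings'  # data_root from the config
    for split in ('train', 'val', 'test'):
        if (Path(root) / split).is_dir():
            print(split, unmatched_images(root, split))
```

An empty list for each split means every image has a label file with a matching name.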
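5. The model config I use is mmsegmentation\configs\segformer\segformer_mit-b0_512x512_160k_ade20k.py
In this file, set the paths to mydata.py, segformer_mit-b0.py, default_runtime.py, and schedule_160k.py, as well as the path to the pretrained .pth checkpoint. I set num_classes=2 here so that the number of classes is 2. I removed the auxiliary classifier.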
_base_ = [
    '../_base_/models/segformer_mit-b0.py',
    '../_base_/datasets/mydata.py',
    '../_base_/default_runtime.py',
    '../_base_/schedules/schedule_160k.py'
]

model = dict(
    pretrained='/home/vge/Documents/lcw/mmsegmentation/checkpoints/segformer_mit_b0_convert.pth',
    decode_head=dict(num_classes=2))

# optimizer
optimizer = dict(
    _delete_=True,
    type='AdamW',
    lr=0.00006,
    betas=(0.9, 0.999),
    weight_decay=0.01,
    paramwise_cfg=dict(
        custom_keys={
            'pos_block': dict(decay_mult=0.),
            'norm': dict(decay_mult=0.),
            'head': dict(lr_mult=10.)
        }))

lr_config = dict(
    _delete_=True,
    policy='poly',
    warmup='linear',
    warmup_iters=1500,
    warmup_ratio=1e-6,
    power=1.0,
    min_lr=0.0,
    by_epoch=False)

# batch size per GPU and number of data-loading workers per GPU
data = dict(samples_per_gpu=16, workers_per_gpu=12)
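6. One thing to watch out for: the official pretrained model must be converted to the MMSegmentation format.
Use mmsegmentation\tools\model_converters\mit2mmseg.py to convert the model, because some parameter names in MMSegmentation's SegFormer code differ from those in the segformer_mit_b0 pretrained checkpoint. mit2mmseg.py renames the parameters in the pretrained checkpoint accordingly.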
parser.add_argument('--src',default='XXX/mmsegmentation/checkpoints/segformer_mit_b0.pth', help='src model path or url')
# The dst path must be a full path of the new checkpoint.
parser.add_argument('--dst',default='XXX/mmsegmentation/checkpoints/segformer_mit_b0_convert.pth', help='save path')
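The idea behind such a conversion, renaming the keys of a checkpoint's state dict, can be sketched in plain Python. This is only an illustration of the general technique: the prefixes in RENAME_RULES and the helper rename_keys are hypothetical, and the real rules live in mit2mmseg.py.

```python
from collections import OrderedDict

# Hypothetical prefix rules, for illustration only; the real renaming rules
# are defined in tools/model_converters/mit2mmseg.py.
RENAME_RULES = (('old_backbone.', 'backbone.'), ('proj.', 'projection.'))


def rename_keys(state_dict, rules=RENAME_RULES):
    """Return a new state dict with each matching key prefix replaced."""
    converted = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        for old, new in rules:
            if new_key.startswith(old):
                new_key = new + new_key[len(old):]
        converted[new_key] = value
    return converted


# Stand-in for a loaded checkpoint; the values would normally be tensors.
demo = OrderedDict([('old_backbone.weight', 1), ('proj.bias', 2), ('norm.weight', 3)])
converted = rename_keys(demo)
```

Keys that match no rule, such as norm.weight here, pass through unchanged.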
7. Other settings
The evaluation metrics (mIoU, mFscore) are set in the mmsegmentation\configs\_base_\schedules folder
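As a rough sketch of that setting, the evaluation entry of a schedule file can list both metrics. The interval value here is illustrative and not taken from the original setup.

```python
# Config fragment, assumed to sit in a schedule file such as schedule_160k.py.
# `interval` (iterations between evaluations) is an illustrative value.
evaluation = dict(interval=16000, metric=['mIoU', 'mFscore'])
```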
The default GPU id is set at line 47 of mmsegmentation\tools\train.py
group_gpus.add_argument(
    '--gpu-id',
    type=int,
    default=2,
    help='id of gpu to use '
    '(only applicable to non-distributed training)')
To debug in PyCharm, set a default config path at line 23 of mmsegmentation\tools\train.py
parser.add_argument('--config', default='../configs/segformer/segformer_mit-b0_512x512_160k_ade20k.py', help='train config file path')
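8. A nasty pitfall
Because my labels are 8-bit single-channel images, training still failed with an error after all of the steps above.
Debugging showed that when a single-channel label is loaded, its pixel values must equal the class ids. For my classes 'background' and 'building', the label pixels must be background = 0 and building = 1. My data has background = 0 and building = 255, so I modified the source code.
Line 143 of mmsegmentation\mmseg\datasets\pipelines\loading.py: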
gt_semantic_seg[gt_semantic_seg_copy == old_id*255] = new_id
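An alternative that leaves the library source untouched is to convert the label masks from 0/255 to 0/1 once, before training. The following is a minimal NumPy sketch of that mapping; the helper name binarize_label is hypothetical, and reading and writing the .png files is left out.

```python
import numpy as np


def binarize_label(label, foreground_value=255):
    """Map a 0/255 mask to class ids 0 (background) and 1 (building)."""
    label = np.asarray(label)
    unexpected = np.setdiff1d(np.unique(label), [0, foreground_value])
    if unexpected.size:
        raise ValueError(f'unexpected label values: {unexpected.tolist()}')
    return (label == foreground_value).astype(np.uint8)


# Toy mask standing in for a loaded label image.
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)
class_ids = binarize_label(mask)
```

The check for unexpected values guards against masks with anti-aliased edges, which would otherwise be silently mapped to background.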
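9. Evaluation metrics stuck at 0 or 100%
Changing the loss function should solve this problem. In mmsegmentation\configs\_base_\models\upernet_swin.py I changed the loss of the main decode head from CrossEntropyLoss to DiceLoss, after which the metrics looked normal. I did not use the auxiliary classifier.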
# model settings
norm_cfg = dict(type='SyncBN', requires_grad=True)
backbone_norm_cfg = dict(type='LN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='SwinTransformer',
        pretrain_img_size=224,
        embed_dims=96,
        patch_size=4,
        window_size=7,
        mlp_ratio=4,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        strides=(4, 2, 2, 2),
        out_indices=(0, 1, 2, 3),
        qkv_bias=True,
        qk_scale=None,
        patch_norm=True,
        drop_rate=0.,
        attn_drop_rate=0.,
        drop_path_rate=0.3,
        use_abs_pos_embed=False,
        act_cfg=dict(type='GELU'),
        norm_cfg=backbone_norm_cfg),
    decode_head=dict(
        type='UPerHead',
        in_channels=[96, 192, 384, 768],
        in_index=[0, 1, 2, 3],
        pool_scales=(1, 2, 3, 6),
        channels=512,
        dropout_ratio=0.1,
        num_classes=2,
        norm_cfg=norm_cfg,
        align_corners=False,
        loss_decode=dict(type='DiceLoss', loss_weight=1.0)),
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))
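One possible reason a Dice-style loss helps here (an assumption, not something verified in this post) is class imbalance: a model that predicts only background can still score high pixel accuracy, while its Dice loss for the building class stays close to 1. The following is a minimal NumPy sketch of a soft Dice loss for a single class; it is not MMSegmentation's DiceLoss implementation, and the toy arrays are made up.

```python
import numpy as np


def soft_dice_loss(prob, target, eps=1e-6):
    """1 - Dice coefficient for one class; prob and target share a shape."""
    prob = np.asarray(prob, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    intersection = (prob * target).sum()
    denominator = prob.sum() + target.sum()
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


target = np.array([0, 0, 0, 1])          # one building pixel out of four
all_background = np.zeros(4)             # model predicts no building at all
perfect = target.astype(np.float64)      # model predicts the mask exactly

pixel_accuracy = (all_background == target).mean()    # 3 of 4 pixels correct
loss_all_background = soft_dice_loss(all_background, target)
loss_perfect = soft_dice_loss(perfect, target)
```

The all-background prediction gets 75% pixel accuracy but a Dice loss near 1, so the loss still pushes the model toward finding the building pixels.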
10. To output mIoU together with mFscore, edit lines 360-380 of mmseg/core/evaluation/metrics.py
for metric in metrics:
    if metric == 'mIoU':
        iou = total_area_intersect / total_area_union
        acc = total_area_intersect / total_area_label
        ret_metrics['IoU'] = iou
        ret_metrics['Acc'] = acc
    elif metric == 'mDice':
        dice = 2 * total_area_intersect / (
            total_area_pred_label + total_area_label)
        acc = total_area_intersect / total_area_label
        ret_metrics['Dice'] = dice
        ret_metrics['Acc'] = acc
    elif metric == 'mFscore':
        precision = total_area_intersect / total_area_pred_label
        recall = total_area_intersect / total_area_label
        f_value = torch.tensor(
            [f_score(x[0], x[1], beta) for x in zip(precision, recall)])
        ret_metrics['Fscore'] = f_value
        ret_metrics['Precision'] = precision
        ret_metrics['Recall'] = recall
        # added so that IoU is reported together with mFscore
        iou = total_area_intersect / total_area_union
        ret_metrics['IoU'] = iou
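To make the relationship between these quantities concrete, here is a small NumPy sketch that computes IoU, precision, recall, and F-score from per-class pixel counts. The counts are hypothetical (a 100-pixel image with 10 building pixels, all found, plus 5 background pixels wrongly predicted as building), and the f_score helper is a standalone re-statement of the standard F-beta formula, not code imported from MMSegmentation.

```python
import numpy as np


def f_score(precision, recall, beta=1):
    """F-beta score from precision and recall."""
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


# Hypothetical per-class pixel counts, ordered (background, building).
area_intersect = np.array([85.0, 10.0])     # correctly predicted pixels
area_pred_label = np.array([85.0, 15.0])    # pixels predicted as each class
area_label = np.array([90.0, 10.0])         # ground-truth pixels of each class
area_union = area_pred_label + area_label - area_intersect

iou = area_intersect / area_union
precision = area_intersect / area_pred_label
recall = area_intersect / area_label
fscore = f_score(precision, recall)
```

For the building class this gives an IoU of 10/15 and an F-score of 0.8, which shows why the two metrics are worth reporting side by side.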