Framework: MMSegmentation;
The dataset is my own, in Pascal VOC format;
Code: https://github.com/NVlabs/SegFormer
Installing the mmlab environment: https://blog.csdn.net/Scenery0519/article/details/129595886?spm=1001.2014.3001.5501
mmseg tutorial docs: https://mmsegmentation.readthedocs.io/zh_CN/latest/useful_tools.html#id10
First, set up the mmlab environment.
See "Installing the mmlab environment": https://blog.csdn.net/Scenery0519/article/details/129595886?spm=1001.2014.3001.5501
Install the following packages, adjusting the versions to whatever matches your setup:
pip install torchvision==0.8.2
pip install timm==0.3.2
pip install mmcv-full==1.2.7
pip install opencv-python==4.5.1.48
cd SegFormer && pip install -e . --user
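As a quick sanity check after installation (my own suggestion, not a required step), the snippet below prints the versions that the compatibility fixes later in this post depend on:
# Print the versions involved in the compatibility issues discussed below.
import torch
import mmcv
import mmseg

print('torch:', torch.__version__)
print('mmcv:', mmcv.__version__)
print('mmseg:', mmseg.__version__)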
# Single-gpu training
python tools/train.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py
AssertionError: MMCV==1.7.1 is used but incompatible. Please install mmcv>=[1, 1, 4], <=[1, 7, 0].
Edit the /SegFormer-master/mmseg/__init__.py file
so that your installed mmcv version falls within this range. I used mmcv==1.6.0 and the program ran normally.
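For reference, the version guard in mmseg/__init__.py looks roughly like the sketch below; the exact contents vary by release, so treat this as an assumption rather than a verbatim copy. Raising MMCV_MAX is what widens the accepted range:
import mmcv

MMCV_MIN = '1.1.4'
MMCV_MAX = '1.7.0'  # widen this bound so your installed mmcv passes the check


def digit_version(version_str):
    # '1.6.0' -> [1, 6, 0]; pre-release suffixes are ignored in this sketch
    return [int(x) for x in version_str.split('.') if x.isdigit()]


mmcv_min_version = digit_version(MMCV_MIN)
mmcv_max_version = digit_version(MMCV_MAX)
mmcv_version = digit_version(mmcv.__version__)

assert (mmcv_min_version <= mmcv_version <= mmcv_max_version), \
    f'MMCV=={mmcv.__version__} is used but incompatible. ' \
    f'Please install mmcv>={mmcv_min_version}, <={mmcv_max_version}.'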
File "/home/8TDisk/wangjl/condaEnv/mmlab/lib/python3.7/site-packages/timm/models/layers/helpers.py", line 6, in <module>
from torch._six import container_abcs
ImportError: cannot import name 'container_abcs' from 'torch._six' (/condaEnv/mmlab/lib/python3.7/site-packages/torch/_six.py)
The traceback above gives the path of the offending file; follow that path to find and edit it.
Edit condaEnv/mmlab/lib/python3.7/site-packages/timm/models/layers/helpers.py
as shown below: comment out from torch._six import container_abcs and replace it with the following code.
# from torch._six import container_abcs
import torch

TORCH_MAJOR = int(torch.__version__.split('.')[0])
TORCH_MINOR = int(torch.__version__.split('.')[1])

if TORCH_MAJOR == 1 and TORCH_MINOR < 8:
    from torch._six import container_abcs
else:
    import collections.abc as container_abcs
File "/home/8TDisk/wangjl/condaEnv/mmlab/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 430, in _get_default_group
    "Default process group has not been initialized, "
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
Edit the /SegFormer-master/mmseg/apis/train.py file as follows. (On a single GPU this error usually comes from SyncBN in the model config, which needs an initialized process group; the workaround below creates a one-process group. Changing norm_cfg from 'SyncBN' to 'BN' should also avoid it.)
The code:
if distributed:
    print("if")
    find_unused_parameters = cfg.get('find_unused_parameters', False)
    # Sets the `find_unused_parameters` parameter in
    # torch.nn.parallel.DistributedDataParallel
    # torch.distributed.init_process_group('nccl', init_method='file:///home/.../my_file', world_size=1, rank=0)
    model = MMDistributedDataParallel(
        model.cuda(),
        device_ids=[torch.cuda.current_device()],
        broadcast_buffers=False,
        find_unused_parameters=find_unused_parameters)
    print("distributed")
else:
    print("else")
    print("cfg.gpu_ids[0]:{}".format(cfg.gpu_ids[0]))
    print("cfg.gpu_ids:{}".format(cfg.gpu_ids))
    # model = MMDataParallel(
    #     model.cuda(cfg.gpu_ids[0]), device_ids=cfg.gpu_ids)
    torch.distributed.init_process_group(
        'nccl', init_method='file:///tmp/somefile', rank=0, world_size=1)
    model = MMDataParallel(model, device_ids=cfg.gpu_ids)
    print("distributed:false")
If you then get this error:
RuntimeError: open(/tmp/somefile): Permission denied
Aborted (core dumped)
it means you have no access permission for the file 'file:///tmp/somefile'.
Switching to a different path fixes it.
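One way to avoid hard-coding a possibly unwritable path (my own sketch, not part of the original fix) is to derive a per-user rendezvous file in the system temp directory:
import os
import tempfile

import torch

# Use a per-user file in a writable temp directory instead of the shared,
# hard-coded /tmp/somefile.
init_file = os.path.join(tempfile.gettempdir(), 'dist_init_%d' % os.getuid())
if os.path.exists(init_file):
    os.remove(init_file)  # a stale file from a previous run can cause hangs

torch.distributed.init_process_group(
    'nccl', init_method='file://' + init_file, rank=0, world_size=1)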
File "/condaEnv/mmlab/lib/python3.7/site-packages/mmcv/runner/hooks/logger/text.py", line 153, in _log_info
log_str += f'time: {log_dict["time"]:.3f}, '
KeyError: 'data_time'
Fix:
In your environment directory, open the file
/condaEnv/mmlab/lib/python3.7/site-packages/mmcv/runner/hooks/logger/text.py and import the datetime module (the replacement code below uses it):
import datetime
Then change line 153 as follows:
# log_str += f'time: {log_dict["time"]:.3f}, ' \
#            f'data_time: {log_dict["data_time"]:.3f}, '
log_dict["data_time"] = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
log_str += f'time: {log_dict["time"]}, ' \
           f'data_time: {log_dict["data_time"]}, '
Files to modify or create:
1. Directory: /SegFormer-master/local_configs/segformer/B0/
Reference file: segformer.b0.512x512.ade.160k.py
Create your own config file: segformer.b0.800x800.self.160k.py
Changes: the config file path for your own dataset, and the number of classes (num_classes).
_base_ = [
    '../../_base_/models/segformer.py',
    '../../_base_/datasets/self_dataset.py',  # change this: the config file path for your own dataset, i.e. the file created in step 2 below
    '../../_base_/default_runtime.py',
    '../../_base_/schedules/schedule_160k_adamw.py'
]

# model settings
norm_cfg = dict(type='SyncBN', requires_grad=True)
find_unused_parameters = True
model = dict(
    type='EncoderDecoder',
    pretrained='pretrained/mit_b0.pth',
    backbone=dict(
        type='mit_b0',
        style='pytorch'),
    decode_head=dict(
        type='SegFormerHead',
        in_channels=[32, 64, 160, 256],
        in_index=[0, 1, 2, 3],
        feature_strides=[4, 8, 16, 32],
        channels=128,
        dropout_ratio=0.1,
        num_classes=150,  # change this to your dataset's class count; note it is classes + 1 (including _background_)
        norm_cfg=norm_cfg,
        align_corners=False,
        decoder_params=dict(embed_dim=256),
        loss_decode=dict(type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    # model training and testing settings
    train_cfg=dict(),
    test_cfg=dict(mode='whole'))

# optimizer
optimizer = dict(_delete_=True, type='AdamW', lr=0.00006, betas=(0.9, 0.999), weight_decay=0.01,
                 paramwise_cfg=dict(custom_keys={'pos_block': dict(decay_mult=0.),
                                                 'norm': dict(decay_mult=0.),
                                                 'head': dict(lr_mult=10.)
                                                 }))

lr_config = dict(_delete_=True, policy='poly',
                 warmup='linear',
                 warmup_iters=1500,
                 warmup_ratio=1e-6,
                 power=1.0, min_lr=0.0, by_epoch=False)

data = dict(samples_per_gpu=2)  # samples per GPU (batch size); change if needed
evaluation = dict(interval=16000, metric='mIoU')
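To confirm that the new config resolves correctly, you can load it with mmcv and print the fields you changed (a quick check of my own; the file name is the one created above):
from mmcv import Config

# Load the new config and print the customized fields.
cfg = Config.fromfile(
    'local_configs/segformer/B0/segformer.b0.800x800.self.160k.py')
print(cfg.model.decode_head.num_classes)
print(cfg.data.train.type)
print(cfg.data.train.data_root)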
2. Path: /SegFormer-master/local_configs/_base_/datasets/
Reference file: pascal_voc12.py
Create a new file: self_dataset.py
Changes: dataset_type and data_root.
# dataset settings
dataset_type = 'SelfVOCDataset'  # change this: give your dataset type a name
data_root = 'data/VOCdevkit/VOC2012'  # change this: the path to your dataset
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
crop_size = (512, 512)  # crop size
# tweak the train_pipeline settings below as needed
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 512), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='JPEGImages',
        ann_dir='SegmentationClass',
        split='ImageSets/Segmentation/train.txt',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='JPEGImages',
        ann_dir='SegmentationClass',
        split='ImageSets/Segmentation/val.txt',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='JPEGImages',
        ann_dir='SegmentationClass',
        split='ImageSets/Segmentation/val.txt',
        pipeline=test_pipeline))
3. Path: /SegFormer-master/local_configs/_base_/models/segformer.py
Change: set num_classes to the number of classes in your dataset.
4. Path: /SegFormer-master/mmseg/datasets/
Reference file: voc.py
Create a new file: self_voc.py
import os.path as osp

from .builder import DATASETS
from .custom import CustomDataset


# change this: name the class after your own dataset, consistent with
# dataset_type in self_dataset.py from step 2
@DATASETS.register_module()
class SelfVOCDataset(CustomDataset):
    """Pascal VOC dataset.

    Args:
        split (str): Split txt file for Pascal VOC.
    """

    # change this: replace with your own dataset's class names
    CLASSES = ('background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
               'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog',
               'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa',
               'train', 'tvmonitor')

    # change this: the colors used to render each class; the number of entries
    # must match num_classes
    PALETTE = [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0],
               [0, 0, 128], [128, 0, 128], [0, 128, 128], [128, 128, 128],
               [64, 0, 0], [192, 0, 0], [64, 128, 0], [192, 128, 0],
               [64, 0, 128], [192, 0, 128], [64, 128, 128], [192, 128, 128],
               [0, 64, 0], [128, 64, 0], [0, 192, 0], [128, 192, 0],
               [0, 64, 128]]

    def __init__(self, split, **kwargs):
        # change this: keep the name consistent with the class name
        super(SelfVOCDataset, self).__init__(
            img_suffix='.jpg',
            seg_map_suffix='.png',
            split=split,
            **kwargs)
        assert osp.exists(self.img_dir) and self.split is not None
5. Path: /SegFormer-master/mmseg/datasets/__init__.py
from .ade import ADE20KDataset
from .builder import DATASETS, PIPELINES, build_dataloader, build_dataset
from .chase_db1 import ChaseDB1Dataset
from .cityscapes import CityscapesDataset
from .custom import CustomDataset
from .dataset_wrappers import ConcatDataset, RepeatDataset
from .drive import DRIVEDataset
from .hrf import HRFDataset
from .pascal_context import PascalContextDataset
from .stare import STAREDataset
from .voc import PascalVOCDataset
from .mapillary import MapillaryDataset
from .cocostuff import CocoStuff
from .self_voc import SelfVOCDataset  # change this: import your own dataset class

__all__ = [
    'CustomDataset', 'build_dataloader', 'ConcatDataset', 'RepeatDataset',
    'DATASETS', 'build_dataset', 'PIPELINES', 'CityscapesDataset',
    'PascalVOCDataset', 'ADE20KDataset', 'PascalContextDataset',
    'ChaseDB1Dataset', 'DRIVEDataset', 'HRFDataset', 'STAREDataset',
    'MapillaryDataset', 'CocoStuff',
    'SelfVOCDataset'  # change this: add your own dataset class name here
]
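Once the class is registered, a short script like this (my own sketch, assuming the self_dataset.py config from step 2) should build the dataset and report its size and classes:
from mmcv import Config
from mmseg.datasets import build_dataset

# Build the training split of the custom dataset through the registry.
cfg = Config.fromfile('local_configs/_base_/datasets/self_dataset.py')
dataset = build_dataset(cfg.data.train)
print(len(dataset))     # number of training samples
print(dataset.CLASSES)  # class names from SelfVOCDataset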
6. Path: /SegFormer-master/mmseg/core/evaluation/class_names.py
Add two functions to the file: selfvoc_classes() and selfvoc_palette().
Also update dataset_aliases:
def selfvoc_classes():
    """Pascal VOC class names for external use."""
    return [
        'background', 'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus',
        'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
        'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train',
        'tvmonitor'
    ]


def selfvoc_palette():
    """Pascal VOC palette for external use."""
    return [[0, 0, 0], [128, 0, 0], [0, 128, 0], [128, 128, 0], [0, 0, 128],
            [128, 0, 128], [0, 128, 128], [128, 128, 128], [64, 0, 0],
            [192, 0, 0], [64, 128, 0], [192, 128, 0], [64, 0, 128],
            [192, 0, 128], [64, 128, 128], [192, 128, 128], [0, 64, 0],
            [128, 64, 0], [0, 192, 0], [128, 192, 0], [0, 64, 128]]


dataset_aliases = {
    'cityscapes': ['cityscapes'],
    'ade': ['ade', 'ade20k'],
    'voc': ['voc', 'pascal_voc', 'voc12', 'voc12aug'],
    'selfvoc': ['selfvoc']  # the lowercase name of your dataset type class
}
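If the alias is registered correctly, the lookup helpers in mmseg should resolve it (a quick check, assuming the function and alias names used above):
from mmseg.core.evaluation import get_classes, get_palette

# Both helpers resolve 'selfvoc' through dataset_aliases to the
# selfvoc_classes() / selfvoc_palette() functions added above.
print(get_classes('selfvoc'))
print(get_palette('selfvoc'))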
python tools/train.py local_configs/segformer/B0/segformer.b0.800x800.self.160k.py --gpu-ids 0 --work-dir './work_dir'
--gpu-ids specifies which GPU to use.
--work-dir specifies the directory where the run's logs and weight files are saved.
--resume-from loads saved weights and resumes training after an unexpected interruption. Note that the configuration must stay unchanged.
If training hangs and nothing actually starts on the GPU, try deleting the log files of the previous run or specifying a new work directory: if the config parameters disagree with those recorded in the log files, the run will hang and never start.
Also, you cannot run two tasks at the same time. For multi-GPU distributed training, two tasks can coexist by assigning different ports (see the PORT example after the command below); for single-GPU runs I do not know a workaround yet.
tools/dist_train.sh local_configs/segformer/B0/segformer.b0.800x800.self.160k.py 2
The final number is how many GPUs to use; 2 means two GPUs.
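To run two distributed jobs side by side, dist_train.sh in MMLab-style repos typically reads the master port from a PORT environment variable (check your copy of the script to confirm); if so, giving each job its own port avoids the conflict, e.g.:
PORT=29500 tools/dist_train.sh local_configs/segformer/B0/segformer.b0.800x800.self.160k.py 2
PORT=29501 tools/dist_train.sh <another_config>.py 2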
tools/analyze_logs.py plots loss/mIoU curves from a given training log file; first install the dependency with pip install seaborn.
pip install seaborn
Before training, uncomment dict(type='TensorboardLoggerHook') in the /SegFormer-master/local_configs/_base_/default_runtime.py file.
Plot the mIoU, mAcc, and aAcc metrics:
python tools/analyze_logs.py log.json --keys mIoU mAcc aAcc --legend mIoU mAcc aAcc
Plot the loss metric:
python tools/analyze_logs.py log.json --keys loss --legend loss
# Single-gpu testing
python tools/test.py local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file
# Multi-gpu testing
./tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM>
# Multi-gpu, multi-scale testing
tools/dist_test.sh local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py /path/to/checkpoint_file <GPU_NUM> --aug-test
python demo/image_demo.py demo/demo.png local_configs/segformer/B1/segformer.b1.512x512.ade.160k.py \
/path/to/checkpoint_file --device cuda:0 --palette cityscapes
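For a model trained on the custom dataset, the analogous command should work with the new config; since 'selfvoc' was registered in dataset_aliases above, passing --palette selfvoc presumably renders with the custom palette (this command is my assumption, not from the original post):
python demo/image_demo.py demo/demo.png local_configs/segformer/B0/segformer.b0.800x800.self.160k.py \
/path/to/checkpoint_file --device cuda:0 --palette selfvoc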