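Haha, I'm back again!!! Setting myself a goal once more: keep up the posting frequency even after the semester starts!!!
This time I use EfficientDet for smoking detection, i.e. detecting whether someone in the image is smoking. As usual, the result images come first!!!
After that, it's the same workflow as before. Let's go!!!
This time everything ran on a rented GPU machine. There was no way around it: the EfficientNet backbone takes up too much GPU memory, and my own machine can't handle it.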
The dataset was annotated with labelme and is provided in JSON format, but training here uses a VOC-format XML dataset, so the JSON annotations need to be converted.
Image:
Annotation in JSON format:
Converted XML format:
The source code for the JSON-to-XML conversion is as follows:
    # -*- coding: utf-8 -*-
    """
    Created on Sun May 31 10:19:23 2020
    @author: ywx
    """
    import os
    import codecs
    import json
    import shutil
    from glob import glob

    import cv2
    import numpy as np
    from sklearn.model_selection import train_test_split

    # 1. Paths
    labelme_path = "annotations/"  # raw labelme annotation files
    saved_path = "VOC2007/"        # output directory
    isUseTest = True               # whether to create a test split

    # 2. Create the required folders
    for sub in ("Annotations", "JPEGImages/", "ImageSets/Main/"):
        os.makedirs(saved_path + sub, exist_ok=True)

    # 3. Collect the files to process (file names without extension)
    files = glob(labelme_path + "*.json")
    files = [i.replace("\\", "/").split("/")[-1].split(".json")[0] for i in files]
    print(files)

    # 4. Read each annotation and write it out as XML
    for json_file_ in files:
        json_filename = labelme_path + json_file_ + ".json"
        with open(json_filename, "r", encoding="utf-8") as f:
            json_file = json.load(f)
        # The image is read only to get its size for the <size> element
        height, width, channels = cv2.imread('jpeg/' + json_file_ + ".jpg").shape
        with codecs.open(saved_path + "Annotations/" + json_file_ + ".xml", "w", "utf-8") as xml:
            xml.write('<annotation>\n')
            xml.write('\t<folder>' + 'WH_data' + '</folder>\n')
            xml.write('\t<filename>' + json_file_ + ".jpg" + '</filename>\n')
            xml.write('\t<source>\n')
            xml.write('\t\t<database>WH Data</database>\n')
            xml.write('\t\t<annotation>WH</annotation>\n')
            xml.write('\t\t<image>flickr</image>\n')
            xml.write('\t\t<flickrid>NULL</flickrid>\n')
            xml.write('\t</source>\n')
            xml.write('\t<owner>\n')
            xml.write('\t\t<flickrid>NULL</flickrid>\n')
            xml.write('\t\t<name>WH</name>\n')
            xml.write('\t</owner>\n')
            xml.write('\t<size>\n')
            xml.write('\t\t<width>' + str(width) + '</width>\n')
            xml.write('\t\t<height>' + str(height) + '</height>\n')
            xml.write('\t\t<depth>' + str(channels) + '</depth>\n')
            xml.write('\t</size>\n')
            xml.write('\t<segmented>0</segmented>\n')
            for multi in json_file["shapes"]:
                points = np.array(multi["points"])
                labelName = multi["label"]
                # Bounding box = extent of the annotated points
                xmin = min(points[:, 0])
                xmax = max(points[:, 0])
                ymin = min(points[:, 1])
                ymax = max(points[:, 1])
                if xmax <= xmin or ymax <= ymin:
                    # Skip degenerate boxes with zero width or height
                    continue
                xml.write('\t<object>\n')
                xml.write('\t\t<name>' + labelName + '</name>\n')
                xml.write('\t\t<pose>Unspecified</pose>\n')
                xml.write('\t\t<truncated>1</truncated>\n')
                xml.write('\t\t<difficult>0</difficult>\n')
                xml.write('\t\t<bndbox>\n')
                xml.write('\t\t\t<xmin>' + str(int(xmin)) + '</xmin>\n')
                xml.write('\t\t\t<ymin>' + str(int(ymin)) + '</ymin>\n')
                xml.write('\t\t\t<xmax>' + str(int(xmax)) + '</xmax>\n')
                xml.write('\t\t\t<ymax>' + str(int(ymax)) + '</ymax>\n')
                xml.write('\t\t</bndbox>\n')
                xml.write('\t</object>\n')
                print(json_filename, xmin, ymin, xmax, ymax, labelName)
            xml.write('</annotation>')

    # 5. Copy the images to VOC2007/JPEGImages/
    image_files = glob("jpeg/" + "*.jpg")
    print("copy image files to VOC2007/JPEGImages/")
    for image in image_files:
        shutil.copy(image, saved_path + "JPEGImages/")

    # 6. Split the file names into trainval/train/val/test lists
    txtsavepath = saved_path + "ImageSets/Main/"
    total_files = glob(saved_path + "Annotations/*.xml")
    total_files = [i.replace("\\", "/").split("/")[-1].split(".xml")[0] for i in total_files]
    test_files = []
    if isUseTest:
        trainval_files, test_files = train_test_split(total_files, test_size=0.15, random_state=55)
    else:
        trainval_files = total_files
    # Split trainval again into train and val
    train_files, val_files = train_test_split(trainval_files, test_size=0.15, random_state=55)

    for name, subset in (("trainval", trainval_files), ("train", train_files),
                         ("val", val_files), ("test", test_files)):
        with open(txtsavepath + name + ".txt", "w") as f:
            for file in subset:
                f.write(file + "\n")
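After the conversion it helps to read one generated file back and confirm the boxes are sane. The following is only a minimal sketch using the standard library; the helper names `read_voc_boxes` and `boxes_are_valid`, the sample XML, and the label `smoke` are made up for illustration and are not part of the original script.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Parse a VOC-style XML string into (width, height, boxes).

    Each box is (label, xmin, ymin, xmax, ymax).
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(int(bndbox.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"),) + coords)
    return width, height, boxes


def boxes_are_valid(width, height, boxes):
    """True if every box has positive area and lies inside the image."""
    return all(0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height
               for _, x1, y1, x2, y2 in boxes)


# Hypothetical sample in the same layout the conversion script writes.
SAMPLE_XML = """<annotation>
  <filename>0001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>smoke</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

width, height, boxes = read_voc_boxes(SAMPLE_XML)
print(width, height, boxes, boxes_are_valid(width, height, boxes))
```

Running the same two functions over every file in VOC2007/Annotations is a cheap way to catch empty or out-of-range boxes before training.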
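EfficientDet is an object detection network built on EfficientNet, so you need to understand EfficientNet first. A good starting point is the EfficientNet introduction in my earlier post on the history of convolutional neural networks.
In short, EfficientNet scales three things together: the input image resolution, the network width, and the network depth. A single compound coefficient φ controls the scaling (the constants α, β, γ set how fast depth, width, and resolution grow with φ), and different values of φ give models with different accuracy.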
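To make the compound scaling concrete, here is a small sketch. It assumes the constants reported in the EfficientNet paper (α = 1.2 for depth, β = 1.1 for width, γ = 1.15 for resolution); the function names are made up for illustration, and as far as I know the released B1 to B7 models use rounded per-model coefficients, so treat the numbers as illustrative rather than exact.

```python
# Constants from the EfficientNet paper (assumed here, not taken from this post).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution bases


def compound_scale(phi):
    """Multipliers applied to the baseline depth, width and resolution."""
    return {
        "depth": ALPHA ** phi,
        "width": BETA ** phi,
        "resolution": GAMMA ** phi,
    }


def flops_multiplier(phi):
    """Approximate growth in FLOPs: (alpha * beta^2 * gamma^2) ^ phi."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


for phi in range(4):
    scale = compound_scale(phi)
    print(phi, {k: round(v, 3) for k, v in scale.items()},
          round(flops_multiplier(phi), 2))
```

Because α·β²·γ² is close to 2, each step of φ roughly doubles the compute, which is why the larger variants need so much more GPU memory.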
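Overall, the EfficientDet detection network uses EfficientNet as its backbone, passes the backbone features through the BiFPN feature network, and then outputs the detection results.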
1.EfficientNet
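EfficientNet is built mainly from Efficient Blocks. Each block contains a small residual edge and a large residual edge, and an attention module (the squeeze-and-excitation phase in the code) is added inside it.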
    def mb_conv_block(inputs, block_args, activation, drop_rate=None, prefix='', ):
        """Mobile Inverted Residual Bottleneck."""

        has_se = (block_args.se_ratio is not None) and (0 < block_args.se_ratio <= 1)
        bn_axis = 3 if backend.image_data_format() == 'channels_last' else 1

        # workaround over non working dropout with None in noise_shape in tf.keras
        Dropout = get_dropout(
            backend=backend,
            layers=layers,
            models=models,
            utils=keras_utils
        )

        # Expansion phase
        filters = block_args.input_filters * block_args.expand_ratio
        if block_args.expand_ratio != 1:
            x = layers.Conv2D(filters, 1,
                              padding='same',
                              use_bias=False,
                              kernel_initializer=CONV_KERNEL_INITIALIZER,
                              name=prefix + 'expand_conv')(inputs)
            x = layers.BatchNormalization(axis=bn_axis, name=prefix + 'expand_bn')(x)
            x = layers.Activation(activation, name=prefix + 'expand_activation')(x)
        else:
            x = inputs

        # Depthwise Convolution
        x = layers.DepthwiseConv2D(block_args.kernel_size,
                                   strides=block_args.strides,
                                   padding='same',
                                   use_bias=False,
                                   depthwise_initializer=CONV_KERNEL_INITIALIZER,
                                   name=prefix + 'dwconv')(x)
        x = layers.BatchNormalization(axis=bn_axis, name=prefix + 'bn')(x)
        x = layers.Activation(activation, name=prefix + 'activation')(x)

        # Squeeze and Excitation phase
        if has_se:
            num_reduced_filters = max(1, int(
                block_args.input_filters * block_args.se_ratio
            ))
            se_tensor = layers.GlobalAveragePooling2D(name=prefix + 'se_squeeze')(x)

            target_shape = (1, 1, filters) if backend.image_data_format() == 'channels_last' else (filters, 1, 1)
            se_tensor = layers.Reshape(target_shape, name=prefix + 'se_reshape')(se_tensor)
            se_tensor = layers.Conv2D(num_reduced_filters, 1,
                                      activation=activation,
                                      padding='same',
                                      use_bias=True,
                                      kernel_initializer=CONV_KERNEL_INITIALIZER,
                                      name=prefix + 'se_reduce')(se_tensor)
            se_tensor = layers.Conv2D(filters, 1,
                                      activation='sigmoid',
                                      padding='same',
                                      use_bias=True,
                                      kernel_initializer=CONV_KERNEL_INITIALIZER,
                                      name=prefix + 'se_expand')(se_tensor)
            if backend.backend() == 'theano':
                # For the Theano backend, we have to explicitly make
                # the excitation weights broadcastable.
                pattern = ([True, True, True, False] if backend.image_data_format() == 'channels_last'
                           else [True, False, True, True])
                se_tensor = layers.Lambda(
                    lambda x: backend.pattern_broadcast(x, pattern),
                    name=prefix + 'se_broadcast')(se_tensor)
            x = layers.multiply([x, se_tensor], name=prefix + 'se_excite')

        # Output phase
        x = layers.Conv2D(block_args.output_filters, 1,
                          padding='same',
                          use_bias=False,
                          kernel_initializer=CONV_KERNEL_INITIALIZER,
                          name=prefix + 'project_conv')(x)
        x = layers.BatchNormalization(axis=bn_axis, name=prefix + 'project_bn')(x)
        if block_args.id_skip and all(
                s == 1 for s in block_args.strides
        ) and block_args.input_filters == block_args.output_filters:
            if drop_rate and (drop_rate > 0):
                x = Dropout(drop_rate,
                            noise_shape=(None, 1, 1, 1),
                            name=prefix + 'drop')(x)
            x = layers.add([x, inputs], name=prefix + 'add')

        return x
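The squeeze-and-excitation phase is the attention module of the block, and its arithmetic is easier to see without Keras. The sketch below redoes it in plain Python on nested lists. It is a simplification under stated assumptions: the two 1x1 convolutions are written as bias-free matrix products, swish is assumed as the activation, and the weights `w_reduce` and `w_expand` are hypothetical placeholders rather than trained values.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def swish(v):
    return v * sigmoid(v)


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Channel attention on a feature map.

    feature_map: list of C channels, each an H x W list of lists.
    w_reduce:    R x C weights (stands in for the 'se_reduce' 1x1 conv, no bias).
    w_expand:    C x R weights (stands in for the 'se_expand' 1x1 conv, no bias).
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: reduce, then expand back to one gate in (0, 1) per channel.
    reduced = [swish(sum(w * s for w, s in zip(row, squeezed)))
               for row in w_reduce]
    gates = [sigmoid(sum(w * r for w, r in zip(row, reduced)))
             for row in w_expand]
    # Scale: multiply every value of a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


fmap = [[[1.0, 2.0], [3.0, 4.0]],   # channel 0
        [[2.0, 2.0], [2.0, 2.0]]]   # channel 1
scaled, gates = squeeze_excite(fmap,
                               w_reduce=[[0.5, -0.25]],
                               w_expand=[[1.0], [-1.0]])
print(gates)
print(scaled)
```

The gates depend only on the channel averages, so the module is cheap, yet it lets the network turn whole channels up or down for each input.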
2.BiFPN
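BiFPN improves on the multi-scale feature fusion used in FPN and is a weighted bidirectional feature pyramid network. It combines a top-down path with a bottom-up path and fuses the multi-scale features P3~P7, giving each input feature a learnable weight.
The BiFPN module is similar to an FPN (feature pyramid network) but somewhat more complex. Its main purpose is to strengthen the features and extract more representative ones.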
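The "weighted" part can be shown in a few lines. The sketch below follows the fast normalized fusion described in the EfficientDet paper, O = Σ wᵢ·Iᵢ / (ε + Σ wⱼ), with each weight passed through ReLU first. It assumes the input features have already been resized to the same shape and flattened into plain lists; the function name and the sample values are made up for illustration.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse equally sized feature vectors with non-negative normalized weights.

    features: list of flat lists, all of the same length.
    weights:  one learnable scalar per input feature.
    """
    positive = [max(0.0, w) for w in weights]  # ReLU keeps every weight >= 0
    total = sum(positive) + eps                # eps avoids division by zero
    length = len(features[0])
    return [sum(w * f[i] for w, f in zip(positive, features)) / total
            for i in range(length)]


p4_in = [1.0, 2.0, 3.0]   # feature from the backbone at this level
p5_up = [3.0, 4.0, 5.0]   # upsampled feature from the level above
print(fast_normalized_fusion([p4_in, p5_up], weights=[1.0, 1.0]))
print(fast_normalized_fusion([p4_in, p5_up], weights=[0.2, 1.8]))
```

Compared with a softmax over the weights, this normalization needs no exponentials, which the paper gives as the reason it runs faster on GPU.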
The figure below shows the FPN network:
And this is the BiFPN network diagram:
A single BiFPN module looks like this:
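1.Prepare the dataset
Prepare the smoking data; training uses VOC-format data. Use the voc2efficientdet.py file to generate the corresponding txt files.
VOCdevkit
-VOC2007
├─ImageSets # dataset list files, generated by voc2efficientdet.py
├─Annotations # labels for the images, XML format
├─JPEGImages # image files of the dataset
└─voc2efficientdet.py # generates the dataset list files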
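Before generating the list files, it is worth checking that every XML file has a matching image and the other way round. This is a minimal sketch that assumes the Annotations/JPEGImages layout above with `.xml` and `.jpg` extensions; `find_unpaired` and `make_demo_tree` are hypothetical helpers, not scripts from the repository.

```python
import os
import tempfile


def find_unpaired(voc_root):
    """Return (xml names without an image, image names without an xml)."""
    ann_dir = os.path.join(voc_root, "Annotations")
    img_dir = os.path.join(voc_root, "JPEGImages")
    ann = {os.path.splitext(f)[0] for f in os.listdir(ann_dir)
           if f.lower().endswith(".xml")}
    img = {os.path.splitext(f)[0] for f in os.listdir(img_dir)
           if f.lower().endswith(".jpg")}
    return sorted(ann - img), sorted(img - ann)


def make_demo_tree(root):
    """Build a tiny fake VOC2007 folder with one mismatch on each side."""
    for folder, names in (("Annotations", ("a.xml", "b.xml")),
                          ("JPEGImages", ("a.jpg", "c.jpg"))):
        os.makedirs(os.path.join(root, folder))
        for name in names:
            open(os.path.join(root, folder, name), "w").close()


with tempfile.TemporaryDirectory() as demo_root:
    make_demo_tree(demo_root)
    missing_images, missing_labels = find_unpaired(demo_root)
    print("xml without image:", missing_images)
    print("image without xml:", missing_labels)
```

On the real data, point `find_unpaired` at VOCdevkit/VOC2007; both returned lists should be empty.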
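2.Generate the data EfficientDet needs
Next, run voc_annotation.py in the root directory. Before running it, change classes in voc_annotation.py to your own classes.
In the generated file, each line holds one image's path followed by the positions of its ground-truth boxes.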
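A line of that file can be split back into a path and boxes as shown below. The exact layout is an assumption on my part: the sketch expects `image_path x1,y1,x2,y2,class_id x1,y1,x2,y2,class_id ...` separated by spaces, with no spaces inside the path, so check a real generated line before relying on it. The function name and the sample line are made up for illustration.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, boxes).

    Assumed layout: 'path x1,y1,x2,y2,cls x1,y1,x2,y2,cls ...'
    Each box is returned as (x1, y1, x2, y2, cls).
    """
    parts = line.strip().split()
    image_path = parts[0]
    boxes = []
    for chunk in parts[1:]:
        x1, y1, x2, y2, cls = (int(v) for v in chunk.split(","))
        boxes.append((x1, y1, x2, y2, cls))
    return image_path, boxes


sample = "VOCdevkit/VOC2007/JPEGImages/0001.jpg 10,20,110,220,0 5,5,50,60,0\n"
path, boxes = parse_annotation_line(sample)
print(path)
print(boxes)
```

An image without any box comes back with an empty list, which makes it easy to count how many training images carry no object at all.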
3.Edit voc_classes.txt
Before training, edit the voc_classes.txt file in model_data and change the classes to your own.
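A quick check of the class file catches blank lines and duplicates early. The sketch assumes one class name per line, which is how such files are usually laid out; `parse_classes` is a hypothetical helper and `smoke` is a made-up label, so use whatever label names appear in your own XML files.

```python
def parse_classes(text):
    """Return the class names from the text of a classes file.

    Assumes one class name per line; blank lines are ignored.
    Raises ValueError when a name appears more than once.
    """
    names = [line.strip() for line in text.splitlines() if line.strip()]
    if len(set(names)) != len(names):
        raise ValueError("duplicate class name in classes file")
    return names


classes = parse_classes("smoke\n")
print(classes, "->", {name: idx for idx, name in enumerate(classes)})
```

The index of each name in this list is the class id, so the order must stay the same between generating the annotations and training.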
4.Edit yolo_anchors.txt
Run kmeans_for_anchors.py to generate yolo_anchors.txt.
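For readers curious what an anchor k-means does, here is a rough sketch of the usual YOLO-style approach: cluster the (width, height) of the ground-truth boxes, using IoU instead of Euclidean distance to pick the nearest centre. This is my own simplified version with a deterministic start and mean updates, not the contents of kmeans_for_anchors.py, and the sample boxes are made up.

```python
def wh_iou(box, centroid):
    """IoU of two boxes given as (width, height), aligned at one corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    return inter / (box[0] * box[1] + centroid[0] * centroid[1] - inter)


def kmeans_anchors(boxes, k, iterations=100):
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    if len(boxes) < k:
        raise ValueError("need at least k boxes")
    # Deterministic start: evenly spaced samples (real scripts often pick randomly).
    centroids = [boxes[i * len(boxes) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Nearest centre = the one with the highest IoU.
            best = max(range(k), key=lambda j: wh_iou(box, centroids[j]))
            clusters[best].append(box)
        updated = []
        for j, members in enumerate(clusters):
            if not members:            # keep an empty cluster's centre as it is
                updated.append(centroids[j])
                continue
            updated.append((sum(w for w, _ in members) / len(members),
                            sum(h for _, h in members) / len(members)))
        if updated == centroids:       # converged
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])


sample_boxes = [(10, 10), (12, 12), (100, 100), (104, 96)]
print(kmeans_anchors(sample_boxes, k=2))
```

Using IoU as the similarity keeps large boxes from dominating the clustering the way they would with plain Euclidean distance.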
5.Run
Run train.py.
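6.Test on images
Edit the model path in efficientdet.py, replacing it with your trained model, and set phi to the EfficientDet version you trained. Then run python predict.py from the root directory to test.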
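OK, that's it for this time!!!
Oh, I just remembered that I also saved the logs, so here's a look at the training process!!!
OK, next time, depending on how things go, I'll post something different, probably something about BERT for text!!!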