Recently, channel attention mechanisms have shown great potential for improving the performance of deep convolutional neural networks (CNNs). However, most existing methods focus on designing more sophisticated attention modules for better performance, which inevitably increases model complexity. To overcome the trade-off between performance and complexity, this paper proposes an Efficient Channel Attention (ECA) module, which involves only a handful of parameters while bringing a clear performance gain. By dissecting the channel attention module in SENet, the authors empirically show that avoiding dimensionality reduction is important for learning channel attention, and that appropriate cross-channel interaction can preserve performance while significantly reducing model complexity. They therefore propose a local cross-channel interaction strategy without dimensionality reduction, which can be efficiently implemented by a 1D convolution, and further develop a method to adaptively select the kernel size of the 1D convolution, which determines the coverage of local cross-channel interaction. The proposed ECA module is both efficient and effective; for example, against a ResNet-50 backbone, the module adds 80 parameters vs. 24.37M and 4.7e-4 GFLOPs vs. 3.86 GFLOPs, while improving Top-1 accuracy by more than 2%. The ECA module is extensively evaluated on image classification, object detection, and instance segmentation with ResNet and MobileNetV2 backbones. The experimental results show that the module is more efficient than its counterparts while performing favorably against them.
Rather than giving only a brief summary of this part, I expand on it so that beginners can get a sense of how attention mechanisms have developed.
Deep convolutional neural networks (CNNs) have been widely used in computer vision and have achieved great progress in a broad range of tasks such as image classification, object detection, and semantic segmentation. Starting from the seminal AlexNet, a great deal of research has continued to push the performance of deep CNNs further. Recently, incorporating channel attention into convolutional blocks has attracted much interest and shown great potential for performance improvement. One representative method is the Squeeze-and-Excitation network (SENet), which learns channel attention for each convolutional block and brings clear performance gains to various deep CNN architectures.
The authors aim for a better balance between accuracy and the required computation, and propose the following approach.
First, the channel attention module in SENet is revisited. Specifically, given the input features, the SE block first applies global average pooling to each channel independently, then uses two fully connected (FC) layers with a non-linearity in between, followed by a Sigmoid function to generate the channel weights. The two FC layers are designed to capture non-linear cross-channel interaction, and they involve dimensionality reduction to control model complexity. Although this strategy has been widely used in subsequent channel attention modules, the authors' empirical study shows that dimensionality reduction has side effects on channel attention prediction, and that capturing dependencies across all channels is inefficient and unnecessary.
Therefore, this paper proposes an Efficient Channel Attention (ECA) module for deep CNNs, which avoids dimensionality reduction and captures cross-channel interaction in an efficient way. As illustrated in Figure 2 of the paper, after channel-wise global average pooling without dimensionality reduction, ECA captures local cross-channel interaction by considering each channel and its k neighbors. This approach is shown to guarantee both efficiency and effectiveness. Note that ECA can be efficiently implemented by a fast 1D convolution of size k, where the kernel size k represents the coverage of local cross-channel interaction, i.e., how many neighbors participate in the attention prediction of one channel. To avoid manually tuning k via cross-validation, the authors develop a method to determine k adaptively, where the interaction coverage (i.e., kernel size k) is proportional to the channel dimension. As shown in Figure 1 and Table 3, deep CNNs equipped with the ECA module (called ECA-Net) introduce very few additional parameters and negligible computation while bringing notable performance gains. For example, for ResNet-50 with 24.37M parameters and 3.86 GFLOPs, the additional parameters and computation of ECA-Net50 are 80 and 4.7e-4 GFLOPs, respectively; meanwhile, ECA-Net50 outperforms ResNet-50 by 2.28% in Top-1 accuracy. Table 1 summarizes existing attention modules in terms of channel dimensionality reduction (DR), cross-channel interaction, and whether the model is lightweight; from it we can see that the ECA module learns effective channel attention by avoiding channel dimensionality reduction while capturing cross-channel interaction in an extremely lightweight way. To evaluate the method, experiments are conducted on ImageNet-1K and MS COCO across a variety of tasks with different deep CNN architectures.
Contributions of this paper:
(1) It is empirically shown that avoiding dimensionality reduction and appropriate cross-channel interaction are important for learning effective and efficient channel attention, respectively.
(2) An Efficient Channel Attention (ECA) module is proposed, an extremely lightweight channel attention module for deep CNNs that brings clear improvements while adding almost no model complexity.
(3) Experimental results on ImageNet-1K and MS COCO show that the method has lower model complexity than the state of the art while achieving very competitive performance.
Attention mechanisms have proven to be a promising way to strengthen deep neural networks. SENet was the first to propose an effective mechanism for learning channel attention and achieved good performance. Since then, the development of attention modules can roughly be divided into two directions:
(1) enhanced feature aggregation;
(2) the combination of channel and spatial attention.
Several attention methods are reviewed, but they either achieve high accuracy at the cost of complex models with many parameters, or stay lightweight with unsatisfactory results. The method in this paper strikes a balance between the two.
The method studies a 1D convolution with an adaptive kernel size to replace the FC layers in the channel attention module. Compared with group and depthwise separable convolutions, it achieves better performance with lower model complexity.
This part should be the core of the paper:
1. Revisit the channel attention module in SENet (i.e., the SE block).
2. Propose the ECA module.
3. Develop a method to adaptively determine the ECA parameter, and finally show how to apply it to deep CNNs.
This part revisits the SE block and points out its drawbacks:
Although the dimensionality reduction in the equation can reduce model complexity, it destroys the direct correspondence between a channel and its weight. For example, a single FC layer predicts the weight of each channel using a linear combination of all channels. But Eq. (2) first projects the channel features into a low-dimensional space and then maps them back, making the correspondence between channels and their weights indirect.
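For reference, here is how I would write the SE channel attention being discussed, as a sketch in standard notation rather than a quotation of the paper's equations (y is the channel descriptor from global average pooling, r the reduction ratio, δ the ReLU and σ the Sigmoid):

    \omega = \sigma\big(\mathbf{W}_2\,\delta(\mathbf{W}_1 y)\big), \qquad \mathbf{W}_1 \in \mathbb{R}^{(C/r)\times C}, \quad \mathbf{W}_2 \in \mathbb{R}^{C\times(C/r)}

The detour through the C/r-dimensional space is exactly the dimensionality reduction criticized above.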
After revisiting the SE block, an empirical comparison is made to analyze the effects of channel dimensionality reduction and cross-channel interaction on channel attention learning. Finally, the Efficient Channel Attention (ECA) module is presented.
The original SE block is compared with three of its variants (SE-Var1, SE-Var2, and SE-Var3), none of which perform dimensionality reduction. The experimental results show that SE-Var1, which has no parameters, still outperforms the original network, indicating that channel attention does have the ability to improve the performance of deep CNNs.
SE-Var2 learns the weight of each channel independently, which is slightly better than the SE block while involving fewer parameters. This suggests that channels and their weights need to correspond directly, and that avoiding dimensionality reduction is more important than modeling non-linear channel dependencies.
The experimental results clearly show that avoiding dimensionality reduction and keeping a direct correspondence between channels and their weights are both important.
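As a compact summary of these variants (my paraphrase of the paper's ablation, with y the pooled channel descriptor, σ the Sigmoid, ⊙ element-wise multiplication, w a per-channel weight vector and W a full C×C matrix):

    SE-Var1: \omega = \sigma(y)              (parameter-free)
    SE-Var2: \omega = \sigma(w \odot y)      (C parameters, one weight per channel)
    SE-Var3: \omega = \sigma(W y)            (C x C parameters, one FC layer over all channels)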
Experiments show that cross-channel interaction is beneficial for learning channel attention and leads to better network performance, whereas previous methods suffer from large numbers of parameters, high model complexity, increased memory access cost, and reduced computational efficiency. The authors therefore propose their own method.
In this paper, the authors explore another way to capture local cross-channel interaction, aiming to guarantee both efficiency and effectiveness. In particular, they use a band matrix Wk to learn channel attention.
Wk in Eq. (6) involves k × C parameters, which is usually fewer than in Eq. (5). In addition, Eq. (6) avoids the complete independence between different groups in Eq. (5). As shown in Table 2, the method of Eq. (6) (called ECA-NS) outperforms the SE-GC of Eq. (5). In Eq. (6), the weight of y_i is computed by considering only the interaction between y_i and its k neighbors.
Here Ω_i^k denotes the set of k neighboring channels of y_i.
An even more efficient way is to make all channels share the same learning parameters.
Note that this strategy can readily be implemented by a fast 1D convolution with kernel size k.
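In notation (again my paraphrase), the shared-weight local interaction and its 1D-convolution form are:

    \omega_i = \sigma\Big(\sum_{j=1}^{k} w^j\, y_i^j\Big), \quad y_i^j \in \Omega_i^k
    \qquad\Longleftrightarrow\qquad
    \omega = \sigma\big(\mathrm{C1D}_k(y)\big)

where C1D_k denotes a 1D convolution with kernel size k, so the module needs only k parameters.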
The above describes the proposed method for capturing local cross-channel interaction: a band matrix Wk is used to learn channel attention, with all channels sharing the same learning parameters. The results show that the ECA module (k=3) achieves results similar to SE-Var3 while having much lower model complexity, which guarantees both efficiency and effectiveness by appropriately capturing local cross-channel interaction. The paper also finds that the kernel size k of the 1D convolution should be proportional to the channel dimension C.
For convolutional blocks with different numbers of channels in different CNN architectures, the optimal coverage of interaction could be tuned manually. However, manual tuning via cross-validation would cost a lot of computing resources, so the authors propose a formula to compute the kernel size k (i.e., the size of the 1D convolution kernel) automatically,
where |t|_odd denotes the nearest odd number to t, and γ and b are set to 2 and 1, respectively.
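A minimal Python sketch of this adaptive rule, k = ψ(C) = |log2(C)/γ + b/γ|_odd with γ = 2 and b = 1 (the helper name adaptive_kernel_size is my own; the rounding-to-odd follows the commonly seen reference implementation):

import math

def adaptive_kernel_size(channels, gamma=2, b=1):
    # floor |(log2(C) + b) / gamma| to an integer, then move up to the next odd number if it is even
    t = int(abs((math.log(channels, 2) + b) / gamma))
    return t if t % 2 else t + 1

# e.g. C = 64 gives k = 3 and C = 256 gives k = 5 under these settings
print(adaptive_kernel_size(64), adaptive_kernel_size(256))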
The code logic of the ECA module is as follows.
After aggregating the convolutional features with GAP (without dimensionality reduction), the ECA module first adaptively determines the kernel size k, then performs a 1D convolution followed by a Sigmoid function to learn the channel attention. To apply ECA to deep CNNs, the SE module is replaced by the ECA module with the same configuration. The resulting networks are named ECA-Net. Figure 3 of the paper gives the PyTorch code of ECA; a PaddlePaddle version is shown below.
import paddle
import paddle.nn as nn


class eca_layer(nn.Layer):
    """Constructs a ECA module.

    Args:
        channel: Number of channels of the input feature map
        k_size: Adaptive selection of kernel size
    """
    def __init__(self, channel=1, k_size=3):
        super(eca_layer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2D(1)
        self.conv = nn.Conv1D(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias_attr=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # feature descriptor on the global spatial information
        y = self.avg_pool(x)
        # Two different branches of ECA module
        y = self.conv(y.squeeze(-1).transpose([0, 2, 1])).transpose([0, 2, 1]).unsqueeze(-1)
        # Multi-scale information fusion
        y = self.sigmoid(y)
        return x * y.expand_as(x)


model = paddle.Model(eca_layer())
model.summary((1, 3, 224, 224))
--------------------------------------------------------------------------------
  Layer (type)          Input Shape          Output Shape       Param #
================================================================================
AdaptiveAvgPool2D-24  [[1, 3, 224, 224]]     [1, 3, 1, 1]          0
   Conv1D-23             [[1, 1, 3]]          [1, 1, 3]            3
  Sigmoid-23           [[1, 3, 1, 1]]        [1, 3, 1, 1]          0
================================================================================
Total params: 3
Trainable params: 3
Non-trainable params: 0
--------------------------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.57
--------------------------------------------------------------------------------
{'total_params': 3, 'trainable_params': 3}
The experimental part is the second key component of the paper and describes the experimental verification in detail.
It is not hard to follow, so I will not go through it at length.
The influence of the kernel size on the ECA module is evaluated first, and the effectiveness of the adaptive kernel-size selection method is verified; ECA-Net is then compared with state-of-the-art counterparts and CNN models using ResNet-50, ResNet-101, ResNet-152, and MobileNetV2 backbones.
The results show that different deep CNNs have different optimal k, and that k has a clear influence on the performance of ECA-Net.
They also demonstrate the effectiveness of adaptive kernel-size selection in obtaining better and more stable results.
ECA-Net is compared with other networks using different backbones, in terms of efficiency (network parameters, floating point operations (FLOPs), and training/inference speed) and effectiveness (Top-1/Top-5 accuracy).
# eca_layer
import paddle
import paddle.nn as nn


class eca_layer(paddle.nn.Layer):
    """Constructs the ECA module.

    Args:
        channel: number of channels of the input feature map
        k_size: adaptively selected kernel size
    """
    def __init__(self, channel=1, k_size=3):
        super(eca_layer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2D(1)
        self.conv = nn.Conv1D(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias_attr=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # feature extraction over the global spatial information
        y = self.avg_pool(x)
        # the two different branches of the ECA module
        # shape = [64, 64, 1, 1]
        y = y.squeeze(-1).transpose([0, 2, 1])
        # print("infront: ", y.shape)
        y = self.conv(y)
        # print("conv out : ", y.shape)
        y = y.transpose([0, 2, 1]).unsqueeze(-1)
        # print("conv finish: ", y.shape)
        # multi-scale information fusion
        y = self.sigmoid(y)
        # broadcast y to the same shape as x
        y = y.expand_as(x)
        # y acts as the per-channel weights, i.e. the attention
        result = x * y
        return result


import paddle.nn as nn
import numpy as np
import paddle
from paddle.utils.download import get_weights_path_from_url

__all__ = ['ECA_MobileNetV2', 'eca_mobilenet_v2']

model_urls = {
    'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
}


class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2D(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias_attr=False),
            nn.BatchNorm2D(out_planes),
            nn.ReLU6(True)
        )


class InvertedResidual(nn.Layer):
    def __init__(self, inp, oup, stride, expand_ratio, k_size):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]

        hidden_dim = int(round(inp * expand_ratio))
        self.use_res_connect = self.stride == 1 and inp == oup

        layers = []
        if expand_ratio != 1:
            # pw
            layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
        layers.extend([
            # dw
            ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim),
            # pw-linear
            nn.Conv2D(hidden_dim, oup, 1, 1, 0, bias_attr=False),
            nn.BatchNorm2D(oup, momentum=0.1),
        ])
        layers.append(eca_layer(oup, k_size))
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)


class ECA_MobileNetV2(nn.Layer):
    def __init__(self, num_classes=1000, width_mult=1.0):
        super(ECA_MobileNetV2, self).__init__()
        block = InvertedResidual
        input_channel = 32
        last_channel = 1280
        inverted_residual_setting = [
            # t, c, n, s
            [1, 16, 1, 1],
            [6, 24, 2, 2],
            [6, 32, 3, 2],
            [6, 64, 4, 2],
            [6, 96, 3, 1],
            [6, 160, 3, 2],
            [6, 320, 1, 1],
        ]

        # building first layer
        input_channel = int(input_channel * width_mult)
        self.last_channel = int(last_channel * max(1.0, width_mult))
        features = [ConvBNReLU(3, input_channel, stride=2)]
        # building inverted residual blocks
        for t, c, n, s in inverted_residual_setting:
            output_channel = int(c * width_mult)
            for i in range(n):
                if c < 96:
                    ksize = 1
                else:
                    ksize = 3
                stride = s if i == 0 else 1
                features.append(block(input_channel, output_channel, stride, expand_ratio=t, k_size=ksize))
                input_channel = output_channel
        # building last several layers
        features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1))
        # make it nn.Sequential
        self.features = nn.Sequential(*features)

        # building classifier
        self.classifier = nn.Sequential(
            nn.Dropout(0.25),
            nn.Linear(self.last_channel, num_classes),
        )

        # weight initialization
        for m in self.sublayers():
            if isinstance(m, nn.Conv2D):
                m.weight_attr = paddle.framework.ParamAttr(initializer=paddle.nn.initializer.KaimingNormal)
                if m.bias is not None:
                    m.bias.set_value(np.zeros(m.bias.shape).astype('float32'))
            elif isinstance(m, nn.BatchNorm2D):
                m.weight.set_value(np.ones(m.weight.shape).astype('float32'))
                m.bias.set_value(np.zeros(m.bias.shape).astype('float32'))
            elif isinstance(m, nn.Linear):
                v = np.random.normal(loc=0, scale=0.01, size=m.weight.shape).astype('float32')
                m.weight.set_value(v)
                if m.bias is not None:
                    m.bias.set_value(np.zeros(m.bias.shape).astype('float32'))

    def forward(self, x):
        x = self.features(x)
        x = x.mean(-1).mean(-1)
        x = self.classifier(x)
        return x


def eca_mobilenet_v2(pretrained=False, progress=True, **kwargs):
    """Constructs an ECA_MobileNetV2 architecture.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    model = ECA_MobileNetV2(**kwargs)
    if pretrained:
        # the URL above points to the PyTorch release weights; loading them into Paddle may require conversion
        weight_path = get_weights_path_from_url(model_urls["mobilenet_v2"])
        param = paddle.load(weight_path)
        model.set_dict(param)
    return model
model=paddle.Model(ECA_MobileNetV2())
model.summary((1, 3, 224, 224))
-------------------------------------------------------------------------------- Layer (type) Input Shape Output Shape Param # ================================================================================ Conv2D-1 [[1, 3, 224, 224]] [1, 32, 112, 112] 864 BatchNorm2D-1 [[1, 32, 112, 112]] [1, 32, 112, 112] 128 ReLU6-1 [[1, 32, 112, 112]] [1, 32, 112, 112] 0 Conv2D-2 [[1, 32, 112, 112]] [1, 32, 112, 112] 288 BatchNorm2D-2 [[1, 32, 112, 112]] [1, 32, 112, 112] 128 ReLU6-2 [[1, 32, 112, 112]] [1, 32, 112, 112] 0 Conv2D-3 [[1, 32, 112, 112]] [1, 16, 112, 112] 512 BatchNorm2D-3 [[1, 16, 112, 112]] [1, 16, 112, 112] 64 AdaptiveAvgPool2D-1 [[1, 16, 112, 112]] [1, 16, 1, 1] 0 Conv1D-1 [[1, 1, 16]] [1, 1, 16] 1 Sigmoid-1 [[1, 16, 1, 1]] [1, 16, 1, 1] 0 eca_layer-1 [[1, 16, 112, 112]] [1, 16, 112, 112] 0 InvertedResidual-1 [[1, 32, 112, 112]] [1, 16, 112, 112] 0 Conv2D-4 [[1, 16, 112, 112]] [1, 96, 112, 112] 1,536 BatchNorm2D-4 [[1, 96, 112, 112]] [1, 96, 112, 112] 384 ReLU6-3 [[1, 96, 112, 112]] [1, 96, 112, 112] 0 Conv2D-5 [[1, 96, 112, 112]] [1, 96, 56, 56] 864 BatchNorm2D-5 [[1, 96, 56, 56]] [1, 96, 56, 56] 384 ReLU6-4 [[1, 96, 56, 56]] [1, 96, 56, 56] 0 Conv2D-6 [[1, 96, 56, 56]] [1, 24, 56, 56] 2,304 BatchNorm2D-6 [[1, 24, 56, 56]] [1, 24, 56, 56] 96 AdaptiveAvgPool2D-2 [[1, 24, 56, 56]] [1, 24, 1, 1] 0 Conv1D-2 [[1, 1, 24]] [1, 1, 24] 1 Sigmoid-2 [[1, 24, 1, 1]] [1, 24, 1, 1] 0 eca_layer-2 [[1, 24, 56, 56]] [1, 24, 56, 56] 0 InvertedResidual-2 [[1, 16, 112, 112]] [1, 24, 56, 56] 0 Conv2D-7 [[1, 24, 56, 56]] [1, 144, 56, 56] 3,456 BatchNorm2D-7 [[1, 144, 56, 56]] [1, 144, 56, 56] 576 ReLU6-5 [[1, 144, 56, 56]] [1, 144, 56, 56] 0 Conv2D-8 [[1, 144, 56, 56]] [1, 144, 56, 56] 1,296 BatchNorm2D-8 [[1, 144, 56, 56]] [1, 144, 56, 56] 576 ReLU6-6 [[1, 144, 56, 56]] [1, 144, 56, 56] 0 Conv2D-9 [[1, 144, 56, 56]] [1, 24, 56, 56] 3,456 BatchNorm2D-9 [[1, 24, 56, 56]] [1, 24, 56, 56] 96 AdaptiveAvgPool2D-3 [[1, 24, 56, 56]] [1, 24, 1, 1] 0 Conv1D-3 [[1, 1, 24]] [1, 1, 24] 1 Sigmoid-3 [[1, 24, 1, 1]] [1, 24, 1, 1] 0 eca_layer-3 [[1, 24, 56, 56]] [1, 24, 56, 56] 0 InvertedResidual-3 [[1, 24, 56, 56]] [1, 24, 56, 56] 0 Conv2D-10 [[1, 24, 56, 56]] [1, 144, 56, 56] 3,456 BatchNorm2D-10 [[1, 144, 56, 56]] [1, 144, 56, 56] 576 ReLU6-7 [[1, 144, 56, 56]] [1, 144, 56, 56] 0 Conv2D-11 [[1, 144, 56, 56]] [1, 144, 28, 28] 1,296 BatchNorm2D-11 [[1, 144, 28, 28]] [1, 144, 28, 28] 576 ReLU6-8 [[1, 144, 28, 28]] [1, 144, 28, 28] 0 Conv2D-12 [[1, 144, 28, 28]] [1, 32, 28, 28] 4,608 BatchNorm2D-12 [[1, 32, 28, 28]] [1, 32, 28, 28] 128 AdaptiveAvgPool2D-4 [[1, 32, 28, 28]] [1, 32, 1, 1] 0 Conv1D-4 [[1, 1, 32]] [1, 1, 32] 1 Sigmoid-4 [[1, 32, 1, 1]] [1, 32, 1, 1] 0 eca_layer-4 [[1, 32, 28, 28]] [1, 32, 28, 28] 0 InvertedResidual-4 [[1, 24, 56, 56]] [1, 32, 28, 28] 0 Conv2D-13 [[1, 32, 28, 28]] [1, 192, 28, 28] 6,144 BatchNorm2D-13 [[1, 192, 28, 28]] [1, 192, 28, 28] 768 ReLU6-9 [[1, 192, 28, 28]] [1, 192, 28, 28] 0 Conv2D-14 [[1, 192, 28, 28]] [1, 192, 28, 28] 1,728 BatchNorm2D-14 [[1, 192, 28, 28]] [1, 192, 28, 28] 768 ReLU6-10 [[1, 192, 28, 28]] [1, 192, 28, 28] 0 Conv2D-15 [[1, 192, 28, 28]] [1, 32, 28, 28] 6,144 BatchNorm2D-15 [[1, 32, 28, 28]] [1, 32, 28, 28] 128 AdaptiveAvgPool2D-5 [[1, 32, 28, 28]] [1, 32, 1, 1] 0 Conv1D-5 [[1, 1, 32]] [1, 1, 32] 1 Sigmoid-5 [[1, 32, 1, 1]] [1, 32, 1, 1] 0 eca_layer-5 [[1, 32, 28, 28]] [1, 32, 28, 28] 0 InvertedResidual-5 [[1, 32, 28, 28]] [1, 32, 28, 28] 0 Conv2D-16 [[1, 32, 28, 28]] [1, 192, 28, 28] 6,144 BatchNorm2D-16 [[1, 192, 28, 28]] [1, 192, 28, 28] 768 ReLU6-11 [[1, 192, 
28, 28]] [1, 192, 28, 28] 0 Conv2D-17 [[1, 192, 28, 28]] [1, 192, 28, 28] 1,728 BatchNorm2D-17 [[1, 192, 28, 28]] [1, 192, 28, 28] 768 ReLU6-12 [[1, 192, 28, 28]] [1, 192, 28, 28] 0 Conv2D-18 [[1, 192, 28, 28]] [1, 32, 28, 28] 6,144 BatchNorm2D-18 [[1, 32, 28, 28]] [1, 32, 28, 28] 128 AdaptiveAvgPool2D-6 [[1, 32, 28, 28]] [1, 32, 1, 1] 0 Conv1D-6 [[1, 1, 32]] [1, 1, 32] 1 Sigmoid-6 [[1, 32, 1, 1]] [1, 32, 1, 1] 0 eca_layer-6 [[1, 32, 28, 28]] [1, 32, 28, 28] 0 InvertedResidual-6 [[1, 32, 28, 28]] [1, 32, 28, 28] 0 Conv2D-19 [[1, 32, 28, 28]] [1, 192, 28, 28] 6,144 BatchNorm2D-19 [[1, 192, 28, 28]] [1, 192, 28, 28] 768 ReLU6-13 [[1, 192, 28, 28]] [1, 192, 28, 28] 0 Conv2D-20 [[1, 192, 28, 28]] [1, 192, 14, 14] 1,728 BatchNorm2D-20 [[1, 192, 14, 14]] [1, 192, 14, 14] 768 ReLU6-14 [[1, 192, 14, 14]] [1, 192, 14, 14] 0 Conv2D-21 [[1, 192, 14, 14]] [1, 64, 14, 14] 12,288 BatchNorm2D-21 [[1, 64, 14, 14]] [1, 64, 14, 14] 256 AdaptiveAvgPool2D-7 [[1, 64, 14, 14]] [1, 64, 1, 1] 0 Conv1D-7 [[1, 1, 64]] [1, 1, 64] 1 Sigmoid-7 [[1, 64, 1, 1]] [1, 64, 1, 1] 0 eca_layer-7 [[1, 64, 14, 14]] [1, 64, 14, 14] 0 InvertedResidual-7 [[1, 32, 28, 28]] [1, 64, 14, 14] 0 Conv2D-22 [[1, 64, 14, 14]] [1, 384, 14, 14] 24,576 BatchNorm2D-22 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-15 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-23 [[1, 384, 14, 14]] [1, 384, 14, 14] 3,456 BatchNorm2D-23 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-16 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-24 [[1, 384, 14, 14]] [1, 64, 14, 14] 24,576 BatchNorm2D-24 [[1, 64, 14, 14]] [1, 64, 14, 14] 256 AdaptiveAvgPool2D-8 [[1, 64, 14, 14]] [1, 64, 1, 1] 0 Conv1D-8 [[1, 1, 64]] [1, 1, 64] 1 Sigmoid-8 [[1, 64, 1, 1]] [1, 64, 1, 1] 0 eca_layer-8 [[1, 64, 14, 14]] [1, 64, 14, 14] 0 InvertedResidual-8 [[1, 64, 14, 14]] [1, 64, 14, 14] 0 Conv2D-25 [[1, 64, 14, 14]] [1, 384, 14, 14] 24,576 BatchNorm2D-25 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-17 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-26 [[1, 384, 14, 14]] [1, 384, 14, 14] 3,456 BatchNorm2D-26 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-18 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-27 [[1, 384, 14, 14]] [1, 64, 14, 14] 24,576 BatchNorm2D-27 [[1, 64, 14, 14]] [1, 64, 14, 14] 256 AdaptiveAvgPool2D-9 [[1, 64, 14, 14]] [1, 64, 1, 1] 0 Conv1D-9 [[1, 1, 64]] [1, 1, 64] 1 Sigmoid-9 [[1, 64, 1, 1]] [1, 64, 1, 1] 0 eca_layer-9 [[1, 64, 14, 14]] [1, 64, 14, 14] 0 InvertedResidual-9 [[1, 64, 14, 14]] [1, 64, 14, 14] 0 Conv2D-28 [[1, 64, 14, 14]] [1, 384, 14, 14] 24,576 BatchNorm2D-28 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-19 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-29 [[1, 384, 14, 14]] [1, 384, 14, 14] 3,456 BatchNorm2D-29 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-20 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-30 [[1, 384, 14, 14]] [1, 64, 14, 14] 24,576 BatchNorm2D-30 [[1, 64, 14, 14]] [1, 64, 14, 14] 256 AdaptiveAvgPool2D-10 [[1, 64, 14, 14]] [1, 64, 1, 1] 0 Conv1D-10 [[1, 1, 64]] [1, 1, 64] 1 Sigmoid-10 [[1, 64, 1, 1]] [1, 64, 1, 1] 0 eca_layer-10 [[1, 64, 14, 14]] [1, 64, 14, 14] 0 InvertedResidual-10 [[1, 64, 14, 14]] [1, 64, 14, 14] 0 Conv2D-31 [[1, 64, 14, 14]] [1, 384, 14, 14] 24,576 BatchNorm2D-31 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-21 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-32 [[1, 384, 14, 14]] [1, 384, 14, 14] 3,456 BatchNorm2D-32 [[1, 384, 14, 14]] [1, 384, 14, 14] 1,536 ReLU6-22 [[1, 384, 14, 14]] [1, 384, 14, 14] 0 Conv2D-33 [[1, 384, 14, 14]] [1, 96, 14, 14] 36,864 BatchNorm2D-33 [[1, 96, 14, 14]] [1, 96, 14, 14] 384 
AdaptiveAvgPool2D-11 [[1, 96, 14, 14]] [1, 96, 1, 1] 0 Conv1D-11 [[1, 1, 96]] [1, 1, 96] 3 Sigmoid-11 [[1, 96, 1, 1]] [1, 96, 1, 1] 0 eca_layer-11 [[1, 96, 14, 14]] [1, 96, 14, 14] 0 InvertedResidual-11 [[1, 64, 14, 14]] [1, 96, 14, 14] 0 Conv2D-34 [[1, 96, 14, 14]] [1, 576, 14, 14] 55,296 BatchNorm2D-34 [[1, 576, 14, 14]] [1, 576, 14, 14] 2,304 ReLU6-23 [[1, 576, 14, 14]] [1, 576, 14, 14] 0 Conv2D-35 [[1, 576, 14, 14]] [1, 576, 14, 14] 5,184 BatchNorm2D-35 [[1, 576, 14, 14]] [1, 576, 14, 14] 2,304 ReLU6-24 [[1, 576, 14, 14]] [1, 576, 14, 14] 0 Conv2D-36 [[1, 576, 14, 14]] [1, 96, 14, 14] 55,296 BatchNorm2D-36 [[1, 96, 14, 14]] [1, 96, 14, 14] 384 AdaptiveAvgPool2D-12 [[1, 96, 14, 14]] [1, 96, 1, 1] 0 Conv1D-12 [[1, 1, 96]] [1, 1, 96] 3 Sigmoid-12 [[1, 96, 1, 1]] [1, 96, 1, 1] 0 eca_layer-12 [[1, 96, 14, 14]] [1, 96, 14, 14] 0 InvertedResidual-12 [[1, 96, 14, 14]] [1, 96, 14, 14] 0 Conv2D-37 [[1, 96, 14, 14]] [1, 576, 14, 14] 55,296 BatchNorm2D-37 [[1, 576, 14, 14]] [1, 576, 14, 14] 2,304 ReLU6-25 [[1, 576, 14, 14]] [1, 576, 14, 14] 0 Conv2D-38 [[1, 576, 14, 14]] [1, 576, 14, 14] 5,184 BatchNorm2D-38 [[1, 576, 14, 14]] [1, 576, 14, 14] 2,304 ReLU6-26 [[1, 576, 14, 14]] [1, 576, 14, 14] 0 Conv2D-39 [[1, 576, 14, 14]] [1, 96, 14, 14] 55,296 BatchNorm2D-39 [[1, 96, 14, 14]] [1, 96, 14, 14] 384 AdaptiveAvgPool2D-13 [[1, 96, 14, 14]] [1, 96, 1, 1] 0 Conv1D-13 [[1, 1, 96]] [1, 1, 96] 3 Sigmoid-13 [[1, 96, 1, 1]] [1, 96, 1, 1] 0 eca_layer-13 [[1, 96, 14, 14]] [1, 96, 14, 14] 0 InvertedResidual-13 [[1, 96, 14, 14]] [1, 96, 14, 14] 0 Conv2D-40 [[1, 96, 14, 14]] [1, 576, 14, 14] 55,296 BatchNorm2D-40 [[1, 576, 14, 14]] [1, 576, 14, 14] 2,304 ReLU6-27 [[1, 576, 14, 14]] [1, 576, 14, 14] 0 Conv2D-41 [[1, 576, 14, 14]] [1, 576, 7, 7] 5,184 BatchNorm2D-41 [[1, 576, 7, 7]] [1, 576, 7, 7] 2,304 ReLU6-28 [[1, 576, 7, 7]] [1, 576, 7, 7] 0 Conv2D-42 [[1, 576, 7, 7]] [1, 160, 7, 7] 92,160 BatchNorm2D-42 [[1, 160, 7, 7]] [1, 160, 7, 7] 640 AdaptiveAvgPool2D-14 [[1, 160, 7, 7]] [1, 160, 1, 1] 0 Conv1D-14 [[1, 1, 160]] [1, 1, 160] 3 Sigmoid-14 [[1, 160, 1, 1]] [1, 160, 1, 1] 0 eca_layer-14 [[1, 160, 7, 7]] [1, 160, 7, 7] 0 InvertedResidual-14 [[1, 96, 14, 14]] [1, 160, 7, 7] 0 Conv2D-43 [[1, 160, 7, 7]] [1, 960, 7, 7] 153,600 BatchNorm2D-43 [[1, 960, 7, 7]] [1, 960, 7, 7] 3,840 ReLU6-29 [[1, 960, 7, 7]] [1, 960, 7, 7] 0 Conv2D-44 [[1, 960, 7, 7]] [1, 960, 7, 7] 8,640 BatchNorm2D-44 [[1, 960, 7, 7]] [1, 960, 7, 7] 3,840 ReLU6-30 [[1, 960, 7, 7]] [1, 960, 7, 7] 0 Conv2D-45 [[1, 960, 7, 7]] [1, 160, 7, 7] 153,600 BatchNorm2D-45 [[1, 160, 7, 7]] [1, 160, 7, 7] 640 AdaptiveAvgPool2D-15 [[1, 160, 7, 7]] [1, 160, 1, 1] 0 Conv1D-15 [[1, 1, 160]] [1, 1, 160] 3 Sigmoid-15 [[1, 160, 1, 1]] [1, 160, 1, 1] 0 eca_layer-15 [[1, 160, 7, 7]] [1, 160, 7, 7] 0 InvertedResidual-15 [[1, 160, 7, 7]] [1, 160, 7, 7] 0 Conv2D-46 [[1, 160, 7, 7]] [1, 960, 7, 7] 153,600 BatchNorm2D-46 [[1, 960, 7, 7]] [1, 960, 7, 7] 3,840 ReLU6-31 [[1, 960, 7, 7]] [1, 960, 7, 7] 0 Conv2D-47 [[1, 960, 7, 7]] [1, 960, 7, 7] 8,640 BatchNorm2D-47 [[1, 960, 7, 7]] [1, 960, 7, 7] 3,840 ReLU6-32 [[1, 960, 7, 7]] [1, 960, 7, 7] 0 Conv2D-48 [[1, 960, 7, 7]] [1, 160, 7, 7] 153,600 BatchNorm2D-48 [[1, 160, 7, 7]] [1, 160, 7, 7] 640 AdaptiveAvgPool2D-16 [[1, 160, 7, 7]] [1, 160, 1, 1] 0 Conv1D-16 [[1, 1, 160]] [1, 1, 160] 3 Sigmoid-16 [[1, 160, 1, 1]] [1, 160, 1, 1] 0 eca_layer-16 [[1, 160, 7, 7]] [1, 160, 7, 7] 0 InvertedResidual-16 [[1, 160, 7, 7]] [1, 160, 7, 7] 0 Conv2D-49 [[1, 160, 7, 7]] [1, 960, 7, 7] 153,600 BatchNorm2D-49 [[1, 960, 7, 7]] [1, 960, 7, 7] 
3,840 ReLU6-33 [[1, 960, 7, 7]] [1, 960, 7, 7] 0 Conv2D-50 [[1, 960, 7, 7]] [1, 960, 7, 7] 8,640 BatchNorm2D-50 [[1, 960, 7, 7]] [1, 960, 7, 7] 3,840 ReLU6-34 [[1, 960, 7, 7]] [1, 960, 7, 7] 0 Conv2D-51 [[1, 960, 7, 7]] [1, 320, 7, 7] 307,200 BatchNorm2D-51 [[1, 320, 7, 7]] [1, 320, 7, 7] 1,280 AdaptiveAvgPool2D-17 [[1, 320, 7, 7]] [1, 320, 1, 1] 0 Conv1D-17 [[1, 1, 320]] [1, 1, 320] 3 Sigmoid-17 [[1, 320, 1, 1]] [1, 320, 1, 1] 0 eca_layer-17 [[1, 320, 7, 7]] [1, 320, 7, 7] 0 InvertedResidual-17 [[1, 160, 7, 7]] [1, 320, 7, 7] 0 Conv2D-52 [[1, 320, 7, 7]] [1, 1280, 7, 7] 409,600 BatchNorm2D-52 [[1, 1280, 7, 7]] [1, 1280, 7, 7] 5,120 ReLU6-35 [[1, 1280, 7, 7]] [1, 1280, 7, 7] 0 Dropout-1 [[1, 1280]] [1, 1280] 0 Linear-1 [[1, 1280]] [1, 1000] 1,281,000 ================================================================================ Total params: 3,539,015 Trainable params: 3,504,903 Non-trainable params: 34,112 -------------------------------------------------------------------------------- Input size (MB): 0.57 Forward/backward pass size (MB): 157.27 Params size (MB): 13.50 Estimated Total Size (MB): 171.34 -------------------------------------------------------------------------------- {'total_params': 3539015, 'trainable_params': 3504903}
# Network
import paddle
import paddle.nn as nn
import paddle.nn.functional as F


class ResidualBasicBlock(nn.Layer):
    expansion = 1

    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBasicBlock, self).__init__()
        self.stride = stride
        self.conv1 = nn.Conv2D(in_channels=in_channels, out_channels=out_channels, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2D(out_channels)
        self.conv2 = nn.Conv2D(in_channels=out_channels, out_channels=out_channels, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2D(out_channels)
        # When in_channels == out_channels, this handles data flow inside one conv stage;
        # the shortcut on the right is then the identity, i.e. the input x itself.
        self.conv3 = None
        # When in_channels != out_channels, this handles the transition between stages,
        # i.e. conv_2 -> conv_3 -> conv_4 -> conv_5; the feature map is halved, the number
        # of channels is doubled, and the shortcut changes shape through the 1x1
        # convolution of conv3 (projection shortcut).
        if in_channels != out_channels:
            self.conv3 = nn.Conv2D(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=2)
            self.bn3 = nn.BatchNorm2D(out_channels)

    def forward(self, x):
        identity = x
        x = self.conv1(x)
        x = F.relu(self.bn1(x))
        x = self.conv2(x)
        x = F.relu(self.bn2(x))
        if self.conv3:
            identity = self.conv3(identity)
            identity = self.bn3(identity)
        out = F.relu(x + identity)
        return out


class ResidualBottleneckBlock(nn.Layer):
    expansion = 4

    def __init__(self, in_channels, out_channels, stride=1):
        super(ResidualBottleneckBlock, self).__init__()
        mid_channel_num = int(in_channels if stride == 2 else out_channels / 4)
        self.stride = stride
        self.conv1 = nn.Conv2D(in_channels=in_channels, out_channels=mid_channel_num, kernel_size=1)
        self.bn1 = nn.BatchNorm2D(mid_channel_num)
        self.conv2 = nn.Conv2D(in_channels=mid_channel_num, out_channels=mid_channel_num, kernel_size=3, stride=stride, padding=1)
        self.bn2 = nn.BatchNorm2D(mid_channel_num)
        self.conv3 = nn.Conv2D(in_channels=mid_channel_num, out_channels=out_channels, kernel_size=1, stride=1)
        self.bn3 = nn.BatchNorm2D(out_channels)
        self.conv4 = None
        # When in_channels == out_channels, data flows inside one conv stage and the
        # shortcut is the identity, i.e. the input x itself.
        # When in_channels != out_channels, this handles the transition between stages
        # (conv_2 -> conv_3 -> conv_4 -> conv_5): the feature map is halved, the number of
        # channels is doubled, and the shortcut changes shape through the 1x1 convolution
        # of conv4 (projection shortcut).
        if in_channels != out_channels:
            self.conv4 = nn.Conv2D(in_channels=in_channels, out_channels=out_channels, kernel_size=1, stride=stride)
            self.bn4 = nn.BatchNorm2D(out_channels)

    def forward(self, x):
        identity = x
        x = self.conv1(x)
        x = F.relu(self.bn1(x))
        x = self.conv2(x)
        x = F.relu(self.bn2(x))
        x = self.conv3(x)
        x = F.relu(self.bn3(x))
        if self.conv4:
            identity = self.conv4(identity)
            identity = self.bn4(identity)
        out = F.relu(x + identity)
        return out


class MyResNet(nn.Layer):
    def __init__(self, residual_block, depth, class_num):
        """
        input shape (1, 3, 224, 224)
        :param residual_block: residual block class to use
        :param depth: network depth (18/34/50/101/152)
        :param class_num: number of output classes
        """
        super(MyResNet, self).__init__()
        self.inplanes = 64
        self.block = residual_block
        self.is_bottleneck_block = isinstance(self.block, ResidualBottleneckBlock)
        layer_cfg = {
            18: [2, 2, 2, 2],
            34: [3, 4, 6, 3],
            50: [3, 4, 6, 3],
            101: [3, 4, 23, 3],
            152: [3, 8, 36, 3]
        }
        self.layers = layer_cfg[depth]
        self.conv1 = nn.Conv2D(in_channels=3, out_channels=64, kernel_size=7, stride=2, padding=3)
        self.bn1 = nn.BatchNorm2D(64)
        self.relu1 = nn.ReLU()
        self.max_pool = nn.MaxPool2D(kernel_size=3, stride=2, padding=1)
        self.conv2_layer = self._make_layer(64, 0, False)
        self.conv3_layer = self._make_layer(128, 1)
        self.conv4_layer = self._make_layer(256, 2)
        self.conv5_layer = self._make_layer(512, 3)
        self.avg_pool = nn.AdaptiveAvgPool2D((1, 1))
        self.fc = nn.Linear(in_features=512 * self.block.expansion, out_features=class_num)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu1(x)
        x = self.max_pool(x)
        x = self.conv2_layer(x)
        x = self.conv3_layer(x)
        x = self.conv4_layer(x)
        x = self.conv5_layer(x)
        x = self.avg_pool(x)
        x = paddle.flatten(x, 1)
        x = self.fc(x)
        return x

    def _make_layer(self, out_channel, layer_index, need_down_sample=True):
        out_channel *= self.block.expansion
        tmp = int(self.inplanes / 2) if need_down_sample and self.is_bottleneck_block else out_channel
        layer_list = [self.block(self.inplanes, tmp, stride=2 if need_down_sample else 1)]
        for i in range(1, self.layers[layer_index]):
            layer_list.append(self.block(tmp, out_channel, stride=1))
            tmp = out_channel
        self.inplanes = out_channel
        return nn.Sequential(*layer_list)
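Note that the MyResNet blocks above do not yet contain the ECA module itself. Here is a minimal sketch of how the eca_layer defined earlier in this post could be inserted into ResidualBasicBlock, placing channel attention at the end of the residual branch before the addition, which is where ECA-Net and SENet place it in ResNet blocks; the class name ECAResidualBasicBlock is my own:

import paddle.nn.functional as F

class ECAResidualBasicBlock(ResidualBasicBlock):
    # Basic residual block with an eca_layer applied to the residual branch (sketch)
    def __init__(self, in_channels, out_channels, stride=1, k_size=3):
        super(ECAResidualBasicBlock, self).__init__(in_channels, out_channels, stride)
        self.eca = eca_layer(out_channels, k_size)  # eca_layer is defined earlier in this post

    def forward(self, x):
        identity = x
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.eca(out)                         # channel attention before the residual addition
        if self.conv3:
            identity = self.bn3(self.conv3(identity))
        return F.relu(out + identity)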
def eca_resnet18(ECABlock, depth, class_num):
    """Constructs a ResNet model with the given residual block.

    Args:
        ECABlock: the residual block class to use
        depth: network depth (e.g. 18)
        class_num: the number of classes for classification
    """
    model = MyResNet(ECABlock, depth, class_num)
    model.avgpool = nn.AdaptiveAvgPool2D(1)
    return model
net = MyResNet(ResidualBottleneckBlock, 18, 10)  # use a separate name so the eca_resnet18 function above is not shadowed
model = paddle.Model(net)
model.summary((1, 3, 224, 224))
------------------------------------------------------------------------------------- Layer (type) Input Shape Output Shape Param # ===================================================================================== Conv2D-53 [[1, 3, 224, 224]] [1, 64, 112, 112] 9,472 BatchNorm2D-53 [[1, 64, 112, 112]] [1, 64, 112, 112] 256 ReLU-1 [[1, 64, 112, 112]] [1, 64, 112, 112] 0 MaxPool2D-1 [[1, 64, 112, 112]] [1, 64, 56, 56] 0 Conv2D-54 [[1, 64, 56, 56]] [1, 64, 56, 56] 4,160 BatchNorm2D-54 [[1, 64, 56, 56]] [1, 64, 56, 56] 256 Conv2D-55 [[1, 64, 56, 56]] [1, 64, 56, 56] 36,928 BatchNorm2D-55 [[1, 64, 56, 56]] [1, 64, 56, 56] 256 Conv2D-56 [[1, 64, 56, 56]] [1, 256, 56, 56] 16,640 BatchNorm2D-56 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024 Conv2D-57 [[1, 64, 56, 56]] [1, 256, 56, 56] 16,640 BatchNorm2D-57 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024 ResidualBottleneckBlock-1 [[1, 64, 56, 56]] [1, 256, 56, 56] 0 Conv2D-58 [[1, 256, 56, 56]] [1, 64, 56, 56] 16,448 BatchNorm2D-58 [[1, 64, 56, 56]] [1, 64, 56, 56] 256 Conv2D-59 [[1, 64, 56, 56]] [1, 64, 56, 56] 36,928 BatchNorm2D-59 [[1, 64, 56, 56]] [1, 64, 56, 56] 256 Conv2D-60 [[1, 64, 56, 56]] [1, 256, 56, 56] 16,640 BatchNorm2D-60 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024 ResidualBottleneckBlock-2 [[1, 256, 56, 56]] [1, 256, 56, 56] 0 Conv2D-61 [[1, 256, 56, 56]] [1, 256, 56, 56] 65,792 BatchNorm2D-61 [[1, 256, 56, 56]] [1, 256, 56, 56] 1,024 Conv2D-62 [[1, 256, 56, 56]] [1, 256, 28, 28] 590,080 BatchNorm2D-62 [[1, 256, 28, 28]] [1, 256, 28, 28] 1,024 Conv2D-63 [[1, 256, 28, 28]] [1, 512, 28, 28] 131,584 BatchNorm2D-63 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048 Conv2D-64 [[1, 256, 56, 56]] [1, 512, 28, 28] 131,584 BatchNorm2D-64 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048 ResidualBottleneckBlock-3 [[1, 256, 56, 56]] [1, 512, 28, 28] 0 Conv2D-65 [[1, 512, 28, 28]] [1, 128, 28, 28] 65,664 BatchNorm2D-65 [[1, 128, 28, 28]] [1, 128, 28, 28] 512 Conv2D-66 [[1, 128, 28, 28]] [1, 128, 28, 28] 147,584 BatchNorm2D-66 [[1, 128, 28, 28]] [1, 128, 28, 28] 512 Conv2D-67 [[1, 128, 28, 28]] [1, 512, 28, 28] 66,048 BatchNorm2D-67 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048 ResidualBottleneckBlock-4 [[1, 512, 28, 28]] [1, 512, 28, 28] 0 Conv2D-68 [[1, 512, 28, 28]] [1, 512, 28, 28] 262,656 BatchNorm2D-68 [[1, 512, 28, 28]] [1, 512, 28, 28] 2,048 Conv2D-69 [[1, 512, 28, 28]] [1, 512, 14, 14] 2,359,808 BatchNorm2D-69 [[1, 512, 14, 14]] [1, 512, 14, 14] 2,048 Conv2D-70 [[1, 512, 14, 14]] [1, 1024, 14, 14] 525,312 BatchNorm2D-70 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096 Conv2D-71 [[1, 512, 28, 28]] [1, 1024, 14, 14] 525,312 BatchNorm2D-71 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096 ResidualBottleneckBlock-5 [[1, 512, 28, 28]] [1, 1024, 14, 14] 0 Conv2D-72 [[1, 1024, 14, 14]] [1, 256, 14, 14] 262,400 BatchNorm2D-72 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024 Conv2D-73 [[1, 256, 14, 14]] [1, 256, 14, 14] 590,080 BatchNorm2D-73 [[1, 256, 14, 14]] [1, 256, 14, 14] 1,024 Conv2D-74 [[1, 256, 14, 14]] [1, 1024, 14, 14] 263,168 BatchNorm2D-74 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096 ResidualBottleneckBlock-6 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 0 Conv2D-75 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 1,049,600 BatchNorm2D-75 [[1, 1024, 14, 14]] [1, 1024, 14, 14] 4,096 Conv2D-76 [[1, 1024, 14, 14]] [1, 1024, 7, 7] 9,438,208 BatchNorm2D-76 [[1, 1024, 7, 7]] [1, 1024, 7, 7] 4,096 Conv2D-77 [[1, 1024, 7, 7]] [1, 2048, 7, 7] 2,099,200 BatchNorm2D-77 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 8,192 Conv2D-78 [[1, 1024, 14, 14]] [1, 2048, 7, 7] 2,099,200 BatchNorm2D-78 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 
8,192 ResidualBottleneckBlock-7 [[1, 1024, 14, 14]] [1, 2048, 7, 7] 0 Conv2D-79 [[1, 2048, 7, 7]] [1, 512, 7, 7] 1,049,088 BatchNorm2D-79 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,048 Conv2D-80 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,359,808 BatchNorm2D-80 [[1, 512, 7, 7]] [1, 512, 7, 7] 2,048 Conv2D-81 [[1, 512, 7, 7]] [1, 2048, 7, 7] 1,050,624 BatchNorm2D-81 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 8,192 ResidualBottleneckBlock-8 [[1, 2048, 7, 7]] [1, 2048, 7, 7] 0 AdaptiveAvgPool2D-18 [[1, 2048, 7, 7]] [1, 2048, 1, 1] 0 Linear-2 [[1, 2048]] [1, 10] 20,490 ===================================================================================== Total params: 25,376,010 Trainable params: 25,341,578 Non-trainable params: 34,432 ------------------------------------------------------------------------------------- Input size (MB): 0.57 Forward/backward pass size (MB): 156.20 Params size (MB): 96.80 Estimated Total Size (MB): 253.58 -------------------------------------------------------------------------------------
{'total_params': 25376010, 'trainable_params': 25341578}
# Display of the network training results; the full code is in the trainmynet.ipynb file
# Training
learning_rate = 0.001
n_epochs = 50
paddle.seed(42)
np.random.seed(42)
work_path = 'work/model'

model = eca_resnet18(ResidualBasicBlock, depth=18, class_num=10)
criterion = LabelSmoothingCrossEntropy()
scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=learning_rate, T_max=50000 // batch_size * n_epochs, verbose=False)
optimizer = paddle.optimizer.Adam(parameters=model.parameters(), learning_rate=scheduler, weight_decay=1e-5)

gate = 0.0
threshold = 0.0
best_acc = 0.0
val_acc = 0.0
loss_record = {'train': {'loss': [], 'iter': []}, 'val': {'loss': [], 'iter': []}}  # for recording loss
acc_record = {'train': {'acc': [], 'iter': []}, 'val': {'acc': [], 'iter': []}}     # for recording accuracy
loss_iter = 0
acc_iter = 0

for epoch in range(n_epochs):
    # ---------- Training ----------
    model.train()
    train_num = 0.0
    train_loss = 0.0
    val_num = 0.0
    val_loss = 0.0
    accuracy_manager = paddle.metric.Accuracy()
    val_accuracy_manager = paddle.metric.Accuracy()
    print("#===epoch: {}, lr={:.10f}===#".format(epoch, optimizer.get_lr()))
    for batch_id, data in enumerate(train_loader):
        x_data, y_data = data
        labels = paddle.unsqueeze(y_data, axis=1)
        logits = model(x_data)
        loss = criterion(logits, y_data)
        acc = paddle.metric.accuracy(logits, labels)
        accuracy_manager.update(acc)
        if batch_id % 10 == 0:
            loss_record['train']['loss'].append(loss.numpy())
            loss_record['train']['iter'].append(loss_iter)
            loss_iter += 1
        loss.backward()
        optimizer.step()
        scheduler.step()
        optimizer.clear_grad()
        train_loss += loss
        train_num += len(y_data)

    total_train_loss = (train_loss / train_num) * batch_size
    train_acc = accuracy_manager.accumulate()
    acc_record['train']['acc'].append(train_acc)
    acc_record['train']['iter'].append(acc_iter)
    acc_iter += 1
    # Print the information.
    print("#===epoch: {}, train loss is: {}, train acc is: {:2.2f}%===#".format(epoch, total_train_loss.numpy(), train_acc * 100))

    # ---------- Validation ----------
    model.eval()
    for batch_id, data in enumerate(val_loader):
        x_data, y_data = data
        labels = paddle.unsqueeze(y_data, axis=1)
        with paddle.no_grad():
            logits = model(x_data)
        loss = criterion(logits, y_data)
        acc = paddle.metric.accuracy(logits, labels)
        val_accuracy_manager.update(acc)
        val_loss += loss
        val_num += len(y_data)

    total_val_loss = (val_loss / val_num) * batch_size
    loss_record['val']['loss'].append(total_val_loss.numpy())
    loss_record['val']['iter'].append(loss_iter)
    val_acc = val_accuracy_manager.accumulate()
    acc_record['val']['acc'].append(val_acc)
    acc_record['val']['iter'].append(acc_iter)
    print("#===epoch: {}, val loss is: {}, val acc is: {:2.2f}%===#".format(epoch, total_val_loss.numpy(), val_acc * 100))

    # ===================save====================
    if val_acc > best_acc:
        best_acc = val_acc
        paddle.save(model.state_dict(), os.path.join(work_path, 'best_model.pdparams'))
        paddle.save(optimizer.state_dict(), os.path.join(work_path, 'best_optimizer.pdopt'))

print(best_acc)
paddle.save(model.state_dict(), os.path.join(work_path, 'final_model.pdparams'))
paddle.save(optimizer.state_dict(), os.path.join(work_path, 'final_optimizer.pdopt'))
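As a usage note of my own (not in the original post), the best checkpoint saved by the loop above can be restored for evaluation with the standard Paddle state-dict API, roughly like this:

import os
import paddle

# Load the best weights saved during training (work_path and model come from the code above)
state_dict = paddle.load(os.path.join(work_path, 'best_model.pdparams'))
model.set_state_dict(state_dict)
model.eval()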
1. A paper needs to be read thoroughly and carefully, not just skimmed; the details matter.
2. For the code there is no shortcut other than spending time writing and revising it. I did refer to other people's network-building code, which is by far the most efficient way for someone using Paddle for the first time. The code is tedious, but I do not want to give up.
3. Whether it is code or papers, both require down-to-earth effort.