This post collects several attention mechanisms that are easy to plug into a model and can help improve its performance, although the actual gains differ across models and datasets:
Squeeze and Excitation(SE)
Efficient Channel Attention(ECA)
Coordinate Attention(CA)
Convolutional Block Attention Module(CBAM)
Concurrent Spatial and Channel Squeeze & Excitation(SCSE)
The SE module multiplies each channel of the extracted features by a learned weight.
Squeeze: global average pooling compresses the feature map into a 1x1xC vector.
Excitation: to improve generalization, the vector is passed through two fully connected layers; the first reduces the channel dimension and the second restores it. A sigmoid then maps the result to values between 0 and 1, and each channel of the original feature map is multiplied by its corresponding weight.
(The reduction ratio r is not fixed and can be tuned for different models. The original paper tried r = 2, 4, 8, 16 and 32, and found that r = 16 offers a good balance between accuracy and model complexity; a rough parameter count for different r is sketched after the code below.)
import torch
import torch.nn as nn
from torch.nn import init


class SEAttention(nn.Module):
    def __init__(self, channel=512, reduction=16):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),
            nn.Sigmoid()
        )

    def init_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    init.constant_(m.bias, 0)
            elif isinstance(m, nn.BatchNorm2d):
                init.constant_(m.weight, 1)
                init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                init.normal_(m.weight, std=0.001)
                if m.bias is not None:
                    init.constant_(m.bias, 0)

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)      # squeeze: B x C
        y = self.fc(y).view(b, c, 1, 1)      # excitation: B x C x 1 x 1
        return x * y.expand_as(x)            # reweight each channel


if __name__ == '__main__':
    input = torch.randn((4, 320, 4, 4))
    model = SEAttention(320, 16)  # arguments: input channel count and reduction ratio
    output = model(input)
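As a rough illustration of the accuracy/complexity trade-off mentioned above: with bias disabled, the two fully connected layers contribute roughly 2*C*C/r weights per SE block. A minimal sketch (our addition, reusing the SEAttention class defined above) that counts parameters for C = 512 at the ratios tested in the paper:

for r in (2, 4, 8, 16, 32):
    block = SEAttention(channel=512, reduction=r)
    n_params = sum(p.numel() for p in block.parameters())
    # two bias-free Linear layers: 512*(512//r) + (512//r)*512 = 2*512*512/r weights
    print(f"reduction={r:>2}: {n_params} parameters")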
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8701503
(IEEE2019)
The ECA module is a variant of SE. In SE the pooled descriptor is fed through two fully connected layers that first shrink and then restore the channel dimension; the ECA paper compares a single fully connected layer against this two-layer design and finds that the single layer performs better, concluding that channel dimensionality reduction should be avoided. ECA also introduces local cross-channel interaction, learning each channel's weight from its k neighboring channels: this captures inter-channel information better than a diagonal (channel-independent) weight matrix, while using far fewer parameters than a full CxC matrix.
The ECA module first applies global average pooling to the input feature map to obtain a 1x1xC vector, then applies a 1D convolution that keeps the dimensionality unchanged; a sigmoid maps the result to values between 0 and 1, which are used as channel weights and multiplied with the original feature map.
import torch
import torch.nn as nn


class ECA_Block(nn.Module):
    """Constructs an ECA module.

    Args:
        channel: Number of channels of the input feature map
        k_size: Adaptive selection of kernel size
    """
    def __init__(self, channel, k_size=3):
        super(ECA_Block, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # feature descriptor on the global spatial information: B x C x 1 x 1
        y = self.avg_pool(x)
        # 1D convolution across neighboring channels, keeping the dimension unchanged
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        # map to (0, 1) to obtain the channel weights
        y = self.sigmoid(y)
        return x * y.expand_as(x)


if __name__ == '__main__':
    input = torch.randn((4, 320, 4, 4))
    model = ECA_Block(320)
    output = model(input)
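The code above fixes k_size=3. The ECA paper also describes choosing the kernel size adaptively from the channel count, roughly k = |log2(C)/gamma + b/gamma| rounded to the nearest odd number, with gamma = 2 and b = 1. A minimal sketch of that rule (the helper name eca_kernel_size is ours, not from the paper's code), reusing the ECA_Block defined above:

import math

def eca_kernel_size(channel, gamma=2, b=1):
    # k = |(log2(C) + b) / gamma|, rounded to the nearest odd number
    t = int(abs((math.log2(channel) + b) / gamma))
    return t if t % 2 else t + 1

# e.g. eca_kernel_size(320) == 5, so: model = ECA_Block(320, k_size=eca_kernel_size(320))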
https://arxiv.org/pdf/1910.03151.pdf
The CA (Coordinate Attention) module considers both inter-channel relationships and positional information; it is an improvement built on top of the SE module.
Coordinate Information Embedding: average pooling is applied along the two spatial directions separately; pooling over the width gives a CxHx1 map and pooling over the height gives a Cx1xW map, forming a pair of direction-aware feature maps.
Coordinate Attention Generation: the CxHx1 map is transposed to Cx1xH and concatenated with the Cx1xW map into a Cx1x(H+W) map, which a 1x1 convolution reduces to (C/r)x1x(H+W), followed by batch normalization and a non-linear activation. The result is split back into the two directional maps, each restored to C channels by a 1x1 convolution and mapped to values between 0 and 1 by a sigmoid, and the two resulting attention maps are multiplied with the original feature map.
import torch
import torch.nn as nn


class h_sigmoid(nn.Module):
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6


class h_swish(nn.Module):
    def __init__(self, inplace=True):
        super(h_swish, self).__init__()
        self.sigmoid = h_sigmoid(inplace=inplace)

    def forward(self, x):
        return x * self.sigmoid(x)


class CoordAtt(nn.Module):
    def __init__(self, inp, oup, reduction=32):
        super(CoordAtt, self).__init__()
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))   # pool over the width  -> N x C x H x 1
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))   # pool over the height -> N x C x 1 x W
        mip = max(8, inp // reduction)
        self.conv1 = nn.Conv2d(inp, mip, kernel_size=1, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(mip)
        self.act = h_swish()
        self.conv_h = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)
        self.conv_w = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        identity = x
        n, c, h, w = x.size()
        x_h = self.pool_h(x)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)
        y = torch.cat([x_h, x_w], dim=2)        # N x C x (H+W) x 1
        y = self.conv1(y)
        y = self.bn1(y)
        y = self.act(y)
        x_h, x_w = torch.split(y, [h, w], dim=2)
        x_w = x_w.permute(0, 1, 3, 2)
        a_h = self.conv_h(x_h).sigmoid()        # N x C x H x 1
        a_w = self.conv_w(x_w).sigmoid()        # N x C x 1 x W
        out = identity * a_w * a_h
        return out


if __name__ == '__main__':
    input = torch.randn((4, 320, 4, 4))
    model = CoordAtt(320, 320, 16)  # arguments: input channels, output channels, and reduction ratio
    output = model(input)
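To make the tensor bookkeeping in the forward pass easier to follow, here is a minimal shape trace (our addition, reusing the CoordAtt module defined above) for a non-square input:

import torch

x = torch.randn(4, 320, 8, 6)          # N x C x H x W
ca = CoordAtt(320, 320, reduction=16)
print(ca.pool_h(x).shape)              # torch.Size([4, 320, 8, 1]) -> the C x H x 1 branch
print(ca.pool_w(x).shape)              # torch.Size([4, 320, 1, 6]) -> the C x 1 x W branch
print(ca(x).shape)                     # torch.Size([4, 320, 8, 6]) -> same size as the input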
https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9577301
The CBAM module contains both channel attention and spatial attention: it multiplies each channel of the feature map by a learned weight and then multiplies each spatial position by a learned weight:
Channel Attention Module: the feature map is max-pooled and average-pooled into two Cx1x1 vectors, which are passed through a shared multilayer perceptron and summed; a sigmoid maps the result to values between 0 and 1 to give the channel attention weights, which are multiplied with the original feature map.
Spatial Attention Module: max pooling and average pooling are applied across the channel dimension, the two resulting 1xHxW maps are concatenated into a 2xHxW map, and a convolution reduces it to a single channel; a sigmoid maps the result to values between 0 and 1 to give the spatial attention weights, which are multiplied with the feature map.
import torch
import torch.nn as nn


class channel_attention(nn.Module):
    def __init__(self, channel, ratio=16):
        super(channel_attention, self).__init__()
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // ratio, bias=False),
            nn.ReLU(),
            nn.Linear(channel // ratio, channel, bias=False)
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, h, w = x.size()
        max_pool_out = self.max_pool(x).view(x.size(0), -1)
        avg_pool_out = self.avg_pool(x).view(x.size(0), -1)
        max_fc_out = self.fc(max_pool_out)   # shared MLP on the max-pooled vector
        avg_fc_out = self.fc(avg_pool_out)   # shared MLP on the average-pooled vector
        out = max_fc_out + avg_fc_out
        out = self.sigmoid(out).view(b, c, 1, 1)
        return out * x


class spatial_attention(nn.Module):
    def __init__(self, kernel_size=7):
        super(spatial_attention, self).__init__()
        padding = kernel_size // 2
        self.conv = nn.Conv2d(2, 1, kernel_size, stride=1, padding=padding, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        max_pool_out, _ = torch.max(x, dim=1, keepdim=True)   # 1 x H x W
        mean_pool_out = torch.mean(x, dim=1, keepdim=True)    # 1 x H x W
        pool_out = torch.cat([max_pool_out, mean_pool_out], dim=1)
        out = self.conv(pool_out)
        out = self.sigmoid(out)
        return out * x


class CBAM_Block(nn.Module):
    def __init__(self, channel, ratio=16, kernel_size=7):
        super(CBAM_Block, self).__init__()
        self.channel_attention = channel_attention(channel, ratio=ratio)
        self.spatial_attention = spatial_attention(kernel_size=kernel_size)

    def forward(self, x):
        x = self.channel_attention(x)   # channel attention first, then spatial attention
        x = self.spatial_attention(x)
        return x


if __name__ == '__main__':
    input = torch.randn((4, 320, 4, 4))
    model = CBAM_Block(320)
    output = model(input)
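The CBAM paper places the module inside each residual block of a ResNet, refining the residual branch before the skip connection is added. A minimal sketch of that placement, assuming the CBAM_Block defined above; the BasicBlockWithCBAM name and the layer sizes are ours, not from the paper's code:

import torch
import torch.nn as nn


class BasicBlockWithCBAM(nn.Module):
    # hypothetical residual block: conv -> BN -> ReLU -> conv -> BN -> CBAM -> add skip
    def __init__(self, channels, ratio=16, kernel_size=7):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.cbam = CBAM_Block(channels, ratio=ratio, kernel_size=kernel_size)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.cbam(out)            # refine the residual branch before the skip connection
        return self.relu(out + x)


# usage: block = BasicBlockWithCBAM(320); y = block(torch.randn(4, 320, 4, 4))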
https://arxiv.org/pdf/1807.06521.pdf
The SCSE module also contains both spatial and channel attention; the weights produced by each branch are multiplied with the feature map separately and the two results are summed. In contrast to CBAM's serial arrangement, SCSE combines spatial and channel attention in parallel:
Spatial Squeeze and Channel Excitation Block (cSE):
The channel branch first average-pools the feature map into a 1x1xC vector, shrinks and then restores the channel dimension with two 1x1 convolutions, and maps the result to values between 0 and 1 with a sigmoid to obtain the channel attention weights, which are multiplied with the original feature map.
Channel Squeeze and Spatial Excitation Block (sSE):
The spatial branch applies a 1x1 convolution that reduces the channel dimension to 1, producing a 1xHxW map of positional weights, each of which is multiplied with the features at the corresponding location.
import torch
import torch.nn as nn


class SCSEModule(nn.Module):
    def __init__(self, in_channels, reduction):
        super().__init__()
        # channel excitation branch (cSE)
        self.cSE = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, in_channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels // reduction, in_channels, 1),
            nn.Sigmoid(),
        )
        # spatial excitation branch (sSE)
        self.sSE = nn.Sequential(nn.Conv2d(in_channels, 1, 1), nn.Sigmoid())

    def forward(self, x):
        # parallel combination: sum of the two recalibrated feature maps
        return x * self.cSE(x) + x * self.sSE(x)


if __name__ == '__main__':
    input = torch.randn((4, 320, 4, 4))
    model = SCSEModule(320, 12)
    output = model(input)
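The SCSE paper proposes the module for fully convolutional segmentation networks, inserting it after the encoder and decoder blocks of architectures such as U-Net. A minimal sketch of a decoder block with SCSE appended, assuming the SCSEModule defined above; the DecoderBlockSCSE name and the layer sizes are ours, not from the paper's code:

import torch
import torch.nn as nn
import torch.nn.functional as F


class DecoderBlockSCSE(nn.Module):
    # hypothetical U-Net style decoder block: upsample -> concat skip -> conv -> SCSE
    def __init__(self, in_channels, skip_channels, out_channels, reduction=16):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels + skip_channels, out_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )
        self.scse = SCSEModule(out_channels, reduction)

    def forward(self, x, skip):
        x = F.interpolate(x, scale_factor=2, mode='nearest')
        x = torch.cat([x, skip], dim=1)
        return self.scse(self.conv(x))


# usage: block = DecoderBlockSCSE(256, 128, 128)
# y = block(torch.randn(2, 256, 8, 8), torch.randn(2, 128, 16, 16))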
https://arxiv.org/pdf/1803.02579.pdf