
Accuracy-Boosting Tricks: Adding CBAM, GAM, and ResBlock_CBAM Attention to YOLOv5/YOLOv7


1. Attention Mechanisms in Computer Vision

Broadly speaking, attention mechanisms are usually divided into four basic categories:

Channel Attention

Spatial Attention

Temporal Attention

Branch Attention

1.1 CBAM: Channel and Spatial Attention Combined

CBAM is a lightweight convolutional attention module that combines a channel-attention sub-module with a spatial-attention sub-module.

Paper: CBAM: Convolutional Block Attention Module
Link: https://arxiv.org/pdf/1807.06521.pdf

As the figure in the paper shows, CBAM consists of two sub-modules applied in sequence: CAM (Channel Attention Module) and SAM (Spatial Attention Module), which compute attention along the channel and spatial dimensions respectively. This keeps the extra parameters and computation small, while letting CBAM be dropped into existing network architectures as a plug-and-play module.
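The channel-then-spatial gating pattern can be pictured with a toy broadcast sketch (plain PyTorch, shapes only; this is not the real module, which is given in section 2.1):

```python
import torch

# Channel attention produces one weight per channel, shape (B, C, 1, 1);
# spatial attention produces one weight per location, shape (B, 1, H, W).
# Both are applied by broadcast multiplication, so the feature map's shape
# never changes -- which is what makes CBAM plug-and-play.
x = torch.randn(2, 16, 8, 8)               # (B, C, H, W) feature map
channel_weights = torch.rand(2, 16, 1, 1)  # one gate per channel
spatial_weights = torch.rand(2, 1, 8, 8)   # one gate per location

out = x * channel_weights   # broadcasts over H, W
out = out * spatial_weights # broadcasts over C
assert out.shape == x.shape
```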

1.2 GAM: Global Attention Mechanism

Going beyond CBAM, GAM is a newer attention mechanism that aims to raise accuracy regardless of the extra computational cost.
Paper: Global Attention Mechanism: Retain Information to Enhance Channel-Spatial Interactions
Link: https://paperswithcode.com/paper/global-attention-mechanism-retain-information

Overall, GAM is quite similar to CBAM: it likewise applies a channel-attention sub-module followed by a spatial-attention sub-module. The difference lies in how each sub-module is computed. As the code in section 2.2 shows, GAM's channel attention is an MLP applied across the channel dimension without any pooling, and its spatial attention uses two 7×7 convolutions instead of channel-wise pooling.

1.3 ResBlock_CBAM

The ResBlock_CBAM structure simply applies CBAM's channel-attention and spatial-attention information inside a single residual block.

To implement CBAM in ResNet: before the block's output is joined with the residual connection, pass it through channel attention and then spatial attention.

1.4 Performance Evaluation

For benchmark numbers comparing these modules, see the result tables in the CBAM and GAM papers linked above.

2. Adding CBAM and GAM to YOLOv5

2.1 Add CBAM to common.py

# at the top of models/common.py these imports already exist:
import torch
import torch.nn as nn


class ChannelAttentionModule(nn.Module):
    def __init__(self, c1, reduction=16, light=False):
        super().__init__()
        mid_channel = c1 // reduction
        self.light = light
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        if self.light:
            # light variant: avg- and max-pooled descriptors share one MLP
            self.max_pool = nn.AdaptiveMaxPool2d(1)
            self.shared_MLP = nn.Sequential(
                nn.Linear(in_features=c1, out_features=mid_channel),
                nn.LeakyReLU(0.1, inplace=True),
                nn.Linear(in_features=mid_channel, out_features=c1)
            )
        else:
            # default variant: a single 1x1 conv on the avg-pooled descriptor
            self.shared_MLP = nn.Conv2d(c1, c1, 1, 1, 0, bias=True)
        self.act = nn.Sigmoid()

    def forward(self, x):
        if self.light:
            avgout = self.shared_MLP(self.avg_pool(x).view(x.size(0), -1)).unsqueeze(2).unsqueeze(3)
            maxout = self.shared_MLP(self.max_pool(x).view(x.size(0), -1)).unsqueeze(2).unsqueeze(3)
            fc_out = avgout + maxout
        else:
            fc_out = self.shared_MLP(self.avg_pool(x))
        return x * self.act(fc_out)


class SpatialAttentionModule(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
        padding = 3 if kernel_size == 7 else 1
        self.cv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
        self.act = nn.Sigmoid()

    def forward(self, x):
        # channel-wise mean and max form a 2-channel spatial descriptor
        return x * self.act(self.cv1(torch.cat(
            [torch.mean(x, 1, keepdim=True), torch.max(x, 1, keepdim=True)[0]], 1)))


class CBAM(nn.Module):
    def __init__(self, c1, c2, k=7):
        super().__init__()
        self.channel_attention = ChannelAttentionModule(c1)
        self.spatial_attention = SpatialAttentionModule(k)

    def forward(self, x):
        return self.spatial_attention(self.channel_attention(x))
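One detail worth verifying in the code above: cv1 in SpatialAttentionModule always has exactly 2 input channels, no matter what c1 is, because its input is the concatenation of the channel-wise mean and channel-wise max maps. A quick standalone check:

```python
import torch

# Rebuild the 2-channel spatial descriptor exactly as SpatialAttentionModule
# does: mean over channels and max over channels, concatenated along dim 1.
x = torch.randn(1, 32, 20, 20)
desc = torch.cat([torch.mean(x, 1, keepdim=True),
                  torch.max(x, 1, keepdim=True)[0]], 1)
assert desc.shape == (1, 2, 20, 20)  # always 2 channels, hence nn.Conv2d(2, 1, k)
```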

2.2 Add GAM to common.py

def channel_shuffle(x, groups=2):
    # reshape -> transpose -> flatten, to interleave channel groups
    B, C, H, W = x.size()
    out = x.view(B, groups, C // groups, H, W).permute(0, 2, 1, 3, 4).contiguous()
    out = out.view(B, C, H, W)
    return out


class GAM_Attention(nn.Module):
    # https://paperswithcode.com/paper/global-attention-mechanism-retain-information
    def __init__(self, c1, c2, group=True, rate=4):
        super().__init__()
        self.channel_attention = nn.Sequential(
            nn.Linear(c1, int(c1 / rate)),
            nn.ReLU(inplace=True),
            nn.Linear(int(c1 / rate), c1)
        )
        # two 7x7 convolutions; grouped when group=True to cut parameters
        self.spatial_attention = nn.Sequential(
            nn.Conv2d(c1, c1 // rate, kernel_size=7, padding=3, groups=rate) if group else nn.Conv2d(c1, int(c1 / rate), kernel_size=7, padding=3),
            nn.BatchNorm2d(int(c1 / rate)),
            nn.ReLU(inplace=True),
            nn.Conv2d(c1 // rate, c2, kernel_size=7, padding=3, groups=rate) if group else nn.Conv2d(int(c1 / rate), c2, kernel_size=7, padding=3),
            nn.BatchNorm2d(c2)
        )

    def forward(self, x):
        b, c, h, w = x.shape
        # channel attention: MLP over the channel dimension of a (b, h*w, c) view
        x_permute = x.permute(0, 2, 3, 1).view(b, -1, c)
        x_att_permute = self.channel_attention(x_permute).view(b, h, w, c)
        x_channel_att = x_att_permute.permute(0, 3, 1, 2)
        # x_channel_att = channel_shuffle(x_channel_att, 4)  # optional shuffle
        x = x * x_channel_att
        # spatial attention, then shuffle so information crosses conv groups
        x_spatial_att = self.spatial_attention(x).sigmoid()
        x_spatial_att = channel_shuffle(x_spatial_att, 4)
        out = x * x_spatial_att
        # out = channel_shuffle(out, 4)  # optional final shuffle
        return out
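To see what channel_shuffle actually does, here is a self-contained demo on a tiny tensor (the helper is copied verbatim from above): with groups=2, a 4-channel tensor's channels come out interleaved as [0, 2, 1, 3], which is what lets information cross between the groups of the grouped 7×7 convolutions.

```python
import torch

def channel_shuffle(x, groups=2):
    # reshape -> transpose -> flatten, to interleave channel groups
    B, C, H, W = x.size()
    out = x.view(B, groups, C // groups, H, W).permute(0, 2, 1, 3, 4).contiguous()
    return out.view(B, C, H, W)

x = torch.arange(4.0).view(1, 4, 1, 1)   # channels carry the values 0,1,2,3
shuffled = channel_shuffle(x, groups=2)
assert shuffled.flatten().tolist() == [0.0, 2.0, 1.0, 3.0]
```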

2.3 Add ResBlock_CBAM to common.py

class ResBlock_CBAM(nn.Module):
    def __init__(self, in_places, places, stride=1, downsampling=False, expansion=4):
        super().__init__()
        self.expansion = expansion
        self.downsampling = downsampling
        self.bottleneck = nn.Sequential(
            nn.Conv2d(in_channels=in_places, out_channels=places, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(places),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(in_channels=places, out_channels=places, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(places),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(in_channels=places, out_channels=places * self.expansion, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(places * self.expansion),
        )
        self.cbam = CBAM(c1=places * self.expansion, c2=places * self.expansion)
        if self.downsampling:
            # 1x1 projection so the identity branch matches the expanded output
            self.downsample = nn.Sequential(
                nn.Conv2d(in_channels=in_places, out_channels=places * self.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(places * self.expansion)
            )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = x
        out = self.bottleneck(x)
        out = self.cbam(out)  # attention applied before the residual addition
        if self.downsampling:
            residual = self.downsample(x)
        out += residual
        out = self.relu(out)
        return out
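Note why the downsampling flag matters: the bottleneck outputs places * expansion channels, so the identity branch must be projected whenever in_places != places * expansion (or stride != 1), otherwise the residual add fails with a shape mismatch. A minimal standalone illustration of that projection:

```python
import torch
import torch.nn as nn

# With in_places=64, places=64, expansion=4, the block outputs 256 channels,
# so a 64-channel identity cannot be added directly; the 1x1 downsample
# projection fixes the shape.
in_places, places, expansion = 64, 64, 4
x = torch.randn(1, in_places, 8, 8)
block_out = torch.randn(1, places * expansion, 8, 8)  # stand-in for bottleneck+CBAM output

downsample = nn.Sequential(
    nn.Conv2d(in_places, places * expansion, kernel_size=1, stride=1, bias=False),
    nn.BatchNorm2d(places * expansion),
)
residual = downsample(x)
assert residual.shape == block_out.shape  # the residual add is now valid
```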

2.4 Register CBAM and GAM in yolo.py

In parse_model inside models/yolo.py, add the three new classes to the set of modules whose arguments get the input/output channel counts prepended:

if m in {
        Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus, CrossConv,
        BottleneckCSP, C3, C3TR, C3SPP, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, C2f,
        CBAM, ResBlock_CBAM, GAM_Attention}:
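What membership in this set buys you, sketched in simplified form (based on ultralytics/yolov5's parse_model; the exact code varies by version): the previous layer's channel count c1 and the width-scaled output channels c2 are prepended to the yaml args before the module is constructed.

```python
import math

# Simplified sketch of the channel wiring parse_model performs for modules in
# the set above. For a yaml entry like [-1, 1, CBAM, [1024, 7]] with
# width_multiple gw = 0.50 and a 512-channel previous layer:
def make_divisible(x, divisor=8):
    # round up to the nearest multiple of divisor (as yolov5 does for widths)
    return math.ceil(x / divisor) * divisor

gw = 0.50
ch_in = 512            # channels of the previous layer (already width-scaled)
yaml_args = [1024, 7]  # [c2, kernel] as written in the yaml

c1, c2 = ch_in, make_divisible(yaml_args[0] * gw, 8)
args = [c1, c2, *yaml_args[1:]]
assert args == [512, 512, 7]  # what CBAM.__init__(c1, c2, k) actually receives
```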

2.5 Update the corresponding yaml files

2.5.1 yolov5s_cbam.yaml

# parameters
nc: 10  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  #- [5,6, 7,9, 12,10]  # P2/4
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, CBAM, [1024, 7]],  # 9
   [-1, 1, SPPF, [1024, 5]],  # 10
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],  # 11
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 14
   [-1, 1, Conv, [256, 1, 1]],  # 15
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 18 (P3/8-small)
   [-1, 1, CBAM, [256, 7]],  # 19
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 15], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 22 (P4/16-medium)
   [-1, 1, CBAM, [512, 7]],  # 23
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 26 (P5/32-large)
   [-1, 1, CBAM, [1024, 7]],  # 27
   [[19, 23, 27], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
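Inserting attention layers shifts every subsequent layer index, so all the 'from' fields (Concat sources, Detect sources) must be updated by hand, which is why Detect reads [19, 23, 27] here instead of the stock yolov5s [17, 20, 23]. A small self-contained check that these indices are consistent (assuming the stock yolov5s layout, with CBAM layers landing at positions 9, 19, 23, and 27):

```python
# Relative to stock yolov5s, four CBAM layers are inserted. Each insert at or
# before a layer's position pushes that layer one slot later; Detect then
# reads the CBAM outputs, i.e. one layer after each shifted C3.
stock_detect_from = [17, 20, 23]  # C3 outputs in the unmodified yolov5s head
inserts = [9, 19, 23, 27]         # final positions of the CBAM layers

def shifted(i, inserts):
    # find the fixed point: i plus the number of inserts at or before it
    s = i
    for ins in sorted(inserts):
        if ins <= s:
            s += 1
    return s

new_from = [shifted(i, inserts) + 1 for i in stock_detect_from]
assert new_from == [19, 23, 27]   # matches the Detect line in the yaml above
```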

2.5.2 yolov5s_gam.yaml

# parameters
nc: 1  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  #- [5,6, 7,9, 12,10]  # P2/4
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   [-1, 1, GAM_Attention, [1024, 1024]],  # 9
   [-1, 1, SPPF, [1024, 5]],  # 10
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],  # 11
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 14
   [-1, 1, Conv, [256, 1, 1]],  # 15
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 18 (P3/8-small)
   [-1, 1, GAM_Attention, [256, 256]],  # 19
   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 15], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 22 (P4/16-medium)
   [-1, 1, GAM_Attention, [512, 512]],  # 23
   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 11], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 26 (P5/32-large)
   [-1, 1, GAM_Attention, [1024, 1024]],  # 27
   [[19, 23, 27], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
