
Accuracy-Boosting Tricks: Attention Mechanisms --- Adding CBAM, GAM, and ResBlock_CBAM to YOLOv8


1. Attention Mechanisms in Computer Vision

In general, attention mechanisms are commonly divided into the following four basic categories:

Channel Attention

Spatial Attention

Temporal Attention

Branch Attention

1.1 CBAM: Integrating Channel and Spatial Attention

CBAM is a lightweight convolutional attention module that combines a channel attention module and a spatial attention module.

Paper: CBAM: Convolutional Block Attention Module
Link: https://arxiv.org/pdf/1807.06521.pdf

As shown in the overview figure of the paper, CBAM consists of two sub-modules, CAM (Channel Attention Module) and SAM (Spatial Attention Module), which apply attention along the channel and spatial dimensions respectively. This not only saves parameters and computation, but also keeps CBAM a plug-and-play module that can be integrated into existing network architectures.
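In the paper's notation, for an input feature map F the two sub-modules are applied sequentially:

    F'  = M_c(F) ⊗ F
    F'' = M_s(F') ⊗ F'

where M_c and M_s are the channel and spatial attention maps and ⊗ denotes element-wise multiplication.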

1.2 GAM: Global Attention Mechanism

Beyond CBAM, the new GAM attention: improve accuracy regardless of cost!
Paper: Global Attention Mechanism: Retain Information to Enhance Channel-Spatial Interactions
Link: https://paperswithcode.com/paper/global-attention-mechanism-retain-information

Overall, GAM is quite similar to CBAM: it likewise applies a channel attention module followed by a spatial attention module. The difference lies in how the channel attention and spatial attention are computed.
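The GAM paper keeps the same two-step form as CBAM:

    F_2 = M_c(F_1) ⊗ F_1
    F_3 = M_s(F_2) ⊗ F_2

but, as described in the paper, M_c is computed by an MLP applied to the permuted (channel-last) feature tensor rather than to pooled descriptors, and M_s by two 7×7 convolutions rather than a pooled two-channel map, so pooling is avoided in both branches to retain information.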

1.3 ResBlock_CBAM

The CBAM structure simply applies the channel attention information and the spatial attention information together within a single block.

To implement CBAM in a ResNet: before the output of the original block is added to the residual (shortcut) connection, simply pass it through channel attention and then spatial attention in sequence, as in the sketch below.
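The article later imports a ResBlock_CBAM module (section 2.3) but does not list its code. A minimal sketch of such a block, assuming a plain two-convolution residual block and the CBAM module defined in section 2.1 below (the exact layer layout here is an assumption, not the author's original code):

    import torch
    import torch.nn as nn

    class ResBlock_CBAM(nn.Module):
        # Sketch: basic 3x3-3x3 residual block with CBAM applied before the shortcut addition.
        # CBAM is the module from section 2.1.
        def __init__(self, c1, c2, shortcut=True):
            super().__init__()
            self.conv1 = nn.Sequential(nn.Conv2d(c1, c2, 3, 1, 1, bias=False),
                                       nn.BatchNorm2d(c2), nn.ReLU(inplace=True))
            self.conv2 = nn.Sequential(nn.Conv2d(c2, c2, 3, 1, 1, bias=False),
                                       nn.BatchNorm2d(c2))
            self.cbam = CBAM(c2)              # channel attention, then spatial attention
            self.add = shortcut and c1 == c2  # only add the shortcut when shapes match
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            out = self.cbam(self.conv2(self.conv1(x)))
            return self.act(x + out) if self.add else self.act(out)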

1.4 Performance Evaluation

2. Adding CBAM and GAM to YOLOv8

2.1 Add CBAM to modules.py (the counterpart of common.py in YOLOv5)

    import torch
    import torch.nn as nn


    class ChannelAttention(nn.Module):
        # Channel-attention module https://github.com/open-mmlab/mmdetection/tree/v3.0.0rc1/configs/rtmdet
        def __init__(self, channels: int) -> None:
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Conv2d(channels, channels, 1, 1, 0, bias=True)
            self.act = nn.Sigmoid()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Global average pooling -> 1x1 conv -> sigmoid, then rescale the input
            return x * self.act(self.fc(self.pool(x)))


    class SpatialAttention(nn.Module):
        # Spatial-attention module
        def __init__(self, kernel_size=7):
            super().__init__()
            assert kernel_size in (3, 7), 'kernel size must be 3 or 7'
            padding = 3 if kernel_size == 7 else 1
            self.cv1 = nn.Conv2d(2, 1, kernel_size, padding=padding, bias=False)
            self.act = nn.Sigmoid()

        def forward(self, x):
            # Concatenate channel-wise mean and max maps, convolve, then rescale the input
            return x * self.act(self.cv1(torch.cat([torch.mean(x, 1, keepdim=True), torch.max(x, 1, keepdim=True)[0]], 1)))


    class CBAM(nn.Module):
        # Convolutional Block Attention Module
        def __init__(self, c1, kernel_size=7):  # ch_in, kernels
            super().__init__()
            self.channel_attention = ChannelAttention(c1)
            self.spatial_attention = SpatialAttention(kernel_size)

        def forward(self, x):
            # Channel attention first, then spatial attention
            return self.spatial_attention(self.channel_attention(x))
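A quick hypothetical sanity check (not part of the original article) confirms that CBAM keeps the input shape unchanged, so it can be dropped between any two layers of the backbone or neck:

    import torch

    x = torch.randn(1, 64, 80, 80)
    print(CBAM(64)(x).shape)  # torch.Size([1, 64, 80, 80])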

2.2 Add GAM_Attention to modules.py:

    import torch
    import torch.nn as nn


    def channel_shuffle(x, groups=2):  # shuffle channels between groups
        # reshape -> transpose -> flatten back
        B, C, H, W = x.size()
        out = x.view(B, groups, C // groups, H, W).permute(0, 2, 1, 3, 4).contiguous()
        out = out.view(B, C, H, W)
        return out


    class GAM_Attention(nn.Module):
        # https://paperswithcode.com/paper/global-attention-mechanism-retain-information
        def __init__(self, c1, c2, group=True, rate=4):
            super(GAM_Attention, self).__init__()
            # Channel attention: an MLP applied to the channel-last (permuted) feature map
            self.channel_attention = nn.Sequential(
                nn.Linear(c1, int(c1 / rate)),
                nn.ReLU(inplace=True),
                nn.Linear(int(c1 / rate), c1)
            )
            # Spatial attention: two 7x7 convolutions (grouped when group=True)
            self.spatial_attention = nn.Sequential(
                (nn.Conv2d(c1, c1 // rate, kernel_size=7, padding=3, groups=rate) if group
                 else nn.Conv2d(c1, int(c1 / rate), kernel_size=7, padding=3)),
                nn.BatchNorm2d(int(c1 / rate)),
                nn.ReLU(inplace=True),
                (nn.Conv2d(c1 // rate, c2, kernel_size=7, padding=3, groups=rate) if group
                 else nn.Conv2d(int(c1 / rate), c2, kernel_size=7, padding=3)),
                nn.BatchNorm2d(c2)
            )

        def forward(self, x):
            b, c, h, w = x.shape
            # Channel attention computed on the permuted (B, H*W, C) tensor
            x_permute = x.permute(0, 2, 3, 1).view(b, -1, c)
            x_att_permute = self.channel_attention(x_permute).view(b, h, w, c)
            x_channel_att = x_att_permute.permute(0, 3, 1, 2)
            # x_channel_att = channel_shuffle(x_channel_att, 4)  # optional shuffle
            x = x * x_channel_att
            # Spatial attention on the channel-refined feature map
            x_spatial_att = self.spatial_attention(x).sigmoid()
            x_spatial_att = channel_shuffle(x_spatial_att, 4)  # last shuffle
            out = x * x_spatial_att
            # out = channel_shuffle(out, 4)  # optional shuffle
            return out
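As with CBAM, a quick hypothetical shape check (with c1 == c2 so the module can be inserted without changing downstream channel counts):

    import torch

    x = torch.randn(1, 64, 80, 80)
    print(GAM_Attention(64, 64)(x).shape)  # torch.Size([1, 64, 80, 80])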

2.3 Add CBAM, GAM_Attention, and ResBlock_CBAM to tasks.py (the counterpart of yolo.py in YOLOv5)

    from ultralytics.nn.modules import (C1, C2, C3, C3TR, SPP, SPPF, Bottleneck, BottleneckCSP, C2f, C3Ghost, C3x, Classify,
                                        Concat, Conv, ConvTranspose, Detect, DWConv, DWConvTranspose2d, Ensemble, Focus,
                                        GhostBottleneck, GhostConv, Segment, CBAM, GAM_Attention, ResBlock_CBAM)

In the parse_model(d, ch, verbose=True) function:

    if m in (Classify, Conv, ConvTranspose, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, Focus,
             BottleneckCSP, C1, C2, C2f, C3, C3TR, C3Ghost, nn.ConvTranspose2d, DWConvTranspose2d, C3x, CBAM, GAM_Attention, ResBlock_CBAM):

2.4 Modify the corresponding yaml for CBAM and GAM

2.4.1 Add CBAM to the yolov8 yaml

    # Ultralytics YOLO