A Review and Summary of ResNet and Its Variants

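[Overview] In 2020, the major CV conferences again saw many works that build on ResNet, such as Res2Net, ResNeSt, IResNet, and SCNet. To trace how the ResNet family has developed, we revisit the series here as part of a dedicated ResNet collection, in the hope of giving readers a deeper understanding of the whole lineage. This article covers ResNet, PreResNet, and ResNeXt, together with their code implementations.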

ResNet

  • Paper: https://arxiv.org/abs/1512.03385

  • Code: https://github.com/KaimingHe/deep-residual-networks

  • PyTorch version: https://github.com/Cadene/pretrained-models.pytorch

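Motivation and contributions

Deep learning progressed from LeNet to AlexNet and then to VGGNet and GoogLeNet, with networks growing steadily deeper. Experience shows that depth matters a great deal: a deeper network can extract low-, mid-, and high-level image features. Once a network is already deep enough, however, simply stacking more layers on top causes problems. The first is vanishing / exploding gradients: backpropagation cannot carry useful gradients back to the early layers, so their parameters stop being updated. The second is degradation: when too many layers are stacked, optimization becomes difficult and both training error and test error increase. Note that this higher error is not caused by overfitting.

ResNet addresses the growing difficulty of training as networks get deeper. It introduces the residual module, which consists of two 3×3 convolutions and a shortcut connection. The shortcut connection effectively alleviates the vanishing gradients that extreme depth causes during backpropagation, so performance no longer degrades as the network deepens. Shortcut connections are another important idea in deep learning; beyond computer vision they are also used in machine translation and in speech recognition and synthesis. In addition, a ResNet with shortcut connections can be viewed as an ensemble of many networks of different depths that share parameters, with the number of such networks growing exponentially with the number of layers. It is worth noting that earlier work had already used cross-layer connections to center responses and gradients, that the Inception structure is in essence also a form of cross-layer connection, and that highway networks use cross-layer connections as well.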

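The key points of ResNet are:

  • The residual structure lets the network go deeper, converge faster, and optimize more easily, while using fewer parameters and less computation than earlier models such as VGGNet.

  • ResNet makes heavy use of batch normalization layers rather than Dropout.

  • For very deep networks (50 layers and beyond), ResNet uses the more efficient bottleneck structure, which greatly reduces the parameter count and the computation.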

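The residual structure of ResNet

[Figure: the ResNet residual block]

To address the degradation problem, ResNet introduces a deep residual learning block. Consider a stack of a few layers. For an input $x$, denote the feature the stack learns as $H(x)$. We now want the stack to learn the residual $F(x) = H(x) - x$ instead, so that the original feature is $F(x) + x$. The reason is that learning the residual is easier than learning the original feature directly. When the residual is 0, the stacked layers simply perform an identity mapping, so at the very least network performance does not drop. In practice the residual is not 0, which lets the stacked layers learn new features on top of the input features and therefore perform better.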
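The identity-mapping argument can be checked with a tiny sketch. Everything here is illustrative and not taken from the original article: the helper names (`relu`, `matvec`, `residual_unit`) are hypothetical, the "layers" are small dense matrices standing in for convolutions, and the ReLU after the addition is left out for simplicity.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def matvec(W, v):
    """Matrix-vector product with plain Python lists."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


def residual_unit(x, W1, W2):
    """Return F(x) + x, where F(x) = W2 · relu(W1 · x) (a stand-in for two conv layers)."""
    f = matvec(W2, relu(matvec(W1, x)))
    return [fi + xi for fi, xi in zip(f, x)]


x = [1.5, 2.0, 0.25]
zeros = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if r == c else 0.0 for c in range(3)] for r in range(3)]

# All residual weights at zero: F(x) = 0, so the unit is exactly the identity.
same = residual_unit(x, zeros, zeros)

# A small non-zero weight only adds a small correction on top of the input.
W2 = [row[:] for row in zeros]
W2[0][0] = 0.1
nudged = residual_unit(x, identity, W2)

print(same)
print(nudged)
```

With zero weights the output equals the input, which is the "performance at least does not drop" case; the second call shows the unit learning a small change relative to the input rather than a whole new mapping.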

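In essence, the target function $H(x)$ is left unchanged and the network is split into two branches, a residual mapping and an identity mapping, so the network only has to learn the residual mapping. We can analyze the residual unit mathematically. First, the structure can be written as:

$$y_l = h(x_l) + F(x_l, W_l)$$

$$x_{l+1} = f(y_l)$$

Here $x_l$ and $x_{l+1}$ are the input and output of the $l$-th residual unit; note that each residual unit generally contains several layers. $F$ is the residual function, i.e. the learned residual, $h(x_l) = x_l$ is the identity mapping, and $f$ is the ReLU activation. Based on these equations (and additionally treating $f$ as an identity mapping), the feature learned from a shallow layer $l$ to a deep layer $L$ is:

$$x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i)$$

Applying the chain rule gives the gradient of the backward pass:

$$\frac{\partial \text{loss}}{\partial x_l} = \frac{\partial \text{loss}}{\partial x_L} \cdot \frac{\partial x_L}{\partial x_l} = \frac{\partial \text{loss}}{\partial x_L} \cdot \left(1 + \frac{\partial}{\partial x_l} \sum_{i=l}^{L-1} F(x_i, W_i)\right)$$

The first factor, $\frac{\partial \text{loss}}{\partial x_L}$, is the gradient of the loss function arriving at layer $L$. The 1 inside the parentheses shows that the shortcut mechanism propagates the gradient without loss, while the other term, the residual gradient, has to pass through the layers with weights and is not passed along directly. The residual gradient is unlikely to be exactly -1 all the time, and even when it is small, the presence of the 1 keeps the gradient from vanishing. This is why residual learning is easier. Note that the derivation above is not a rigorous proof.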
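To get a feel for the "1 +" term, here is a toy scalar sketch. It is an illustration under strong assumptions, not the article's derivation: every layer is reduced to a scalar linear map with local derivative `d`, so a plain stack multiplies the `d` values while a residual stack multiplies `(1 + d)`. The function names are hypothetical.

```python
def plain_gradient(local_grads):
    """Gradient through a plain stack x_{i+1} = d_i * x_i: the product of the d_i."""
    g = 1.0
    for d in local_grads:
        g *= d
    return g


def residual_gradient(local_grads):
    """Gradient through a residual stack x_{i+1} = x_i + d_i * x_i: the product of (1 + d_i)."""
    g = 1.0
    for d in local_grads:
        g *= 1.0 + d
    return g


depth = 20
small = [0.1] * depth  # every layer has a small local derivative

plain = plain_gradient(small)
residual = residual_gradient(small)
print(f"plain stack:    {plain:.3e}")
print(f"residual stack: {residual:.3e}")
```

In the plain stack the gradient shrinks by a factor of ten per layer and is effectively gone after 20 layers, while in the residual stack it stays on the order of one. The sketch also shows the caveat from the text: a residual derivative of exactly -1 in any unit would zero the product.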

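Why does the residual structure work?

  1. Adaptive depth: the degradation problem reflects the fact that a multi-layer network has trouble fitting an identity mapping, that is, fitting $H(x) = x$ is hard. With the residual structure, fitting an identity mapping becomes easy: drive the parameters of the residual branch to 0 and only the identity shortcut remains. So when the network does not need to be that deep, more of its units can act as identity mappings, and fewer otherwise. (Of course it is unlikely that some layers fit exactly an identity mapping, but the second point below shows why this still helps. As for why a multi-layer network has trouble fitting an identity mapping, this involves signals-and-systems theory; see https://www.zhihu.com/question/293243905/answer/484708047)

  2. Differential amplifier: assuming the optimal $H(x)$ is close to an identity mapping, the network can more easily pick up the small fluctuations around the identity mapping.

  3. Model ensemble: the whole ResNet behaves like an ensemble of multiple networks. The evidence is that deleting some of ResNet's layers has little effect on overall performance, whereas VGGNet collapses; see the NIPS paper "Residual Networks Behave Like Ensembles of Relatively Shallow Networks".

  4. Alleviating vanishing gradients: differentiating a residual unit with respect to its input shows that, because of the shortcut connection, the total gradient is the derivative of $F(x)$ with respect to $x$ plus 1.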

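An intuitive picture:

[Figure: cartoon of "gradient" goods being delivered through an express lane]

In the figure, a truck loaded with "gradient" goods arrives on the left. Customers normally have to queue and collect their goods one by one, and when the queue is too long there is nothing left for those at the back. So someone is sent through an "express lane" to pick up part of the "gradient" from the truck and hand it directly to the people at the back, so that the customers at the end of the queue receive more "gradient".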

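Benefits of the bottleneck: two kinds of residual unit

[Figure: the two residual units, with the bottleneck design on the right]

Let us work out the advantage of the 1×1 convolutions. Take the bottleneck structure on the right of the figure: for a 256-dimensional input feature, the number of parameters is 1×1×256×64 + 3×3×64×64 + 1×1×64×256 = 69,632. With the same input and output dimensions but two 3×3 convolutions instead of the 1×1 convolutions, the number of parameters is (3×3×256×256)×2 = 1,179,648. A quick calculation shows that the bottleneck with 1×1 convolutions cuts the parameter count (and, proportionally, the computation) to 5.9% of the original, a very large saving.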
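The arithmetic above is easy to reproduce. The snippet below is a minimal sketch that counts convolution weights only (no bias, no BatchNorm parameters); `conv_params` is a hypothetical helper name introduced for this illustration.

```python
def conv_params(c_in, c_out, kernel):
    """Number of weights in a kernel x kernel convolution without bias."""
    return kernel * kernel * c_in * c_out


# Bottleneck: 1x1 (256 -> 64), 3x3 (64 -> 64), 1x1 (64 -> 256)
bottleneck = conv_params(256, 64, 1) + conv_params(64, 64, 3) + conv_params(64, 256, 1)

# Same input/output width with two plain 3x3 convolutions (256 -> 256 -> 256)
two_3x3 = 2 * conv_params(256, 256, 3)

ratio = bottleneck / two_3x3
print(bottleneck, two_3x3, f"{ratio:.1%}")  # 69632 1179648 5.9%
```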

For details, see: "[Fundamentals] What is 1×1 convolution actually useful for?"

ResNet network architecture:

[Figure: ResNet network architectures]
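To see how the spatial resolution shrinks through the network, the sketch below traces the feature-map side length using the standard convolution output-size formula. It assumes the kernel, stride, and padding values used in the PyTorch implementation in this article (7×7 stride-2 stem, 3×3 stride-2 max pooling, and one stride-2 3×3 convolution at the start of layer2 to layer4); the helper names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet(size):
    """Side length after conv1, maxpool, layer1, layer2, layer3, layer4."""
    sizes = []
    size = conv_out(size, 7, 2, 3)      # conv1: 7x7, stride 2, padding 3
    sizes.append(size)
    size = conv_out(size, 3, 2, 1)      # maxpool: 3x3, stride 2, padding 1
    sizes.append(size)
    for stride in (1, 2, 2, 2):         # first 3x3 conv of layer1..layer4
        size = conv_out(size, 3, stride, 1)
        sizes.append(size)
    return sizes


print(trace_resnet(224))  # [112, 56, 56, 28, 14, 7]
print(trace_resnet(32))   # [16, 8, 8, 4, 2, 1]
```

A 224×224 input ends at a 7×7 map, while a 32×32 input (CIFAR-sized) already ends at 1×1, which matters for how the final pooling and fully connected layer are set up.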

PyTorch implementation of ResNet

    import torch.nn as nn
    import torch.nn.functional as F


    def conv3x3(in_planes, out_planes, stride=1):
        """3x3 convolution with padding"""
        return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                         padding=1, bias=False)


    class Bottleneck(nn.Module):
        # The block outputs planes * expansion channels
        expansion = 4

        def __init__(self, inplanes, planes, stride=1, downsample=None):
            super(Bottleneck, self).__init__()
            # 1x1 conv: reduce the channel dimension
            self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
            self.bn1 = nn.BatchNorm2d(planes)
            # 3x3 conv: spatial processing (downsamples when stride > 1)
            self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                                   padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(planes)
            # 1x1 conv: expand the channel dimension again
            self.conv3 = nn.Conv2d(planes, planes * self.expansion,
                                   kernel_size=1, bias=False)
            self.bn3 = nn.BatchNorm2d(planes * self.expansion)
            self.relu = nn.ReLU(inplace=True)
            self.downsample = downsample
            self.stride = stride

        def forward(self, x):
            residual = x

            out = self.conv1(x)
            out = self.bn1(out)
            out = self.relu(out)

            out = self.conv2(out)
            out = self.bn2(out)
            out = self.relu(out)

            out = self.conv3(out)
            out = self.bn3(out)

            # Project the shortcut when the shape changes
            if self.downsample is not None:
                residual = self.downsample(x)

            out += residual
            out = self.relu(out)
            return out


    class ResNet(nn.Module):
        def __init__(self, block, layers, num_classes, grayscale):
            super(ResNet, self).__init__()
            self.inplanes = 64
            in_dim = 1 if grayscale else 3
            self.conv1 = nn.Conv2d(in_dim, 64, kernel_size=7, stride=2, padding=3,
                                   bias=False)
            self.bn1 = nn.BatchNorm2d(64)
            self.relu = nn.ReLU(inplace=True)
            self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            self.layer1 = self._make_layer(block, 64, layers[0])
            self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
            self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
            self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
            # Global average pooling, so the classifier accepts any input resolution
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
            self.fc = nn.Linear(512 * block.expansion, num_classes)

            # He initialization for convs; BN starts as the identity transform
            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                    m.weight.data.normal_(0, (2. / n) ** .5)
                elif isinstance(m, nn.BatchNorm2d):
                    m.weight.data.fill_(1)
                    m.bias.data.zero_()

        def _make_layer(self, block, planes, blocks, stride=1):
            downsample = None
            # A 1x1 projection is needed when resolution or channel count changes
            if stride != 1 or self.inplanes != planes * block.expansion:
                downsample = nn.Sequential(
                    nn.Conv2d(self.inplanes, planes * block.expansion,
                              kernel_size=1, stride=stride, bias=False),
                    nn.BatchNorm2d(planes * block.expansion),
                )

            layers = []
            # Only the first block of a stage changes the shape
            layers.append(block(self.inplanes, planes, stride, downsample))
            self.inplanes = planes * block.expansion
            for _ in range(1, blocks):
                layers.append(block(self.inplanes, planes))
            return nn.Sequential(*layers)

        def forward(self, x):
            x = self.conv1(x)
            x = self.bn1(x)
            x = self.relu(x)
            x = self.maxpool(x)

            x = self.layer1(x)
            x = self.layer2(x)
            x = self.layer3(x)
            x = self.layer4(x)

            x = self.avgpool(x)
            x = x.view(x.size(0), -1)
            logits = self.fc(x)
            probas = F.softmax(logits, dim=1)
            return logits, probas


    def resnet101(num_classes, grayscale):
        """Constructs a ResNet-101 model."""
        model = ResNet(block=Bottleneck,
                       layers=[3, 4, 23, 3],
                       num_classes=num_classes,
                       grayscale=grayscale)
        return model
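The `layers=[3, 4, 23, 3]` argument is what makes this a 101-layer network. As a rough sketch (the helper name `resnet_depth` is hypothetical), the nominal depth counts the stem convolution, the convolutions inside every residual block, and the final fully connected layer; the 1×1 projection convolutions on the shortcuts are not counted.

```python
def resnet_depth(layers, convs_per_block):
    """Nominal depth: stem conv + convs inside all residual blocks + final fc layer."""
    return 1 + convs_per_block * sum(layers) + 1


# Bottleneck blocks contain 3 convolutions each
print(resnet_depth([3, 4, 23, 3], convs_per_block=3))  # 101
print(resnet_depth([3, 4, 6, 3], convs_per_block=3))   # 50

# Basic blocks contain 2 convolutions each
print(resnet_depth([3, 4, 6, 3], convs_per_block=2))   # 34
print(resnet_depth([2, 2, 2, 2], convs_per_block=2))   # 18
```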

PreResNet

  • Paper: https://arxiv.org/abs/1603.05027

  • Code: https://github.com/KaimingHe/resnet-1k-layers

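Main ideas and improvements of the paper

The paper analyzes the propagation formulas behind the residual module, focusing on creating a "direct" path for information to propagate, not only within a residual unit but through the entire network. It shows that when identity mappings are used both as the skip connections and as the after-addition activation, the forward and backward signals can propagate directly from one block to any other block. A series of ablation experiments confirms the importance of these identity mappings. This motivates a new residual unit that makes training easier and improves generalization, called PreResNet. PreResNet mainly reorders the layers inside the residual module, and the paper compares different placements of ReLU and BN:

  • (a) is the classic residual module.

  • (b) shares BN by placing it after the addition, which interferes more with shortcut propagation; the network becomes harder to train and performs worse.

  • (c) moves ReLU to directly after BN, before the addition, so the output of the residual branch is always non-negative, which reduces the representational power of the network.

  • (d) moves ReLU to the front, which fixes the non-negativity problem of (c), but ReLU no longer benefits from BN.

  • (e) moves both BN and ReLU to the front, which fixes the problem of (d).

The shortcut connection of PreResNet (e) passes information more directly and therefore achieves better performance than ResNet.

[Figure: residual unit variants (a) to (e) with different BN and ReLU placements]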
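The non-negativity issue of variant (c) can be illustrated with scalars. This is a deliberately simplified sketch, not the paper's experiment: a single weight `w` and bias `b` stand in for the convolution, BN is omitted, and the function names are hypothetical.

```python
def relu(v):
    return max(0.0, v)


def branch_relu_before_addition(x, w, b):
    """Variant (c): weight -> ReLU, then added to the shortcut."""
    return relu(w * x + b)


def branch_pre_activation(x, w, b):
    """Pre-activation (e): ReLU -> weight, then added to the shortcut."""
    return w * relu(x) + b


xs = [-2.0, -0.5, 0.0, 0.5, 2.0]
w, b = -1.0, 0.25

c_out = [branch_relu_before_addition(x, w, b) for x in xs]
e_out = [branch_pre_activation(x, w, b) for x in xs]

print(c_out)  # never below zero, so the unit can only add to the shortcut signal
print(e_out)  # can be negative, so the unit can also subtract from it
```

Because the residual branch in (c) can only add non-negative values, the signal on the shortcut path can only grow from unit to unit, whereas the pre-activation ordering lets the branch push the signal in either direction.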

PyTorch implementation of PreResNet

    import torch.nn as nn

    __all__ = ['preresnet20', 'preresnet32', 'preresnet44',
               'preresnet56', 'preresnet110', 'preresnet1202']


    def conv3x3(in_planes, out_planes, stride=1):
        """3x3 convolution with padding"""
        return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                         padding=1, bias=False)


    class BasicBlock(nn.Module):
        expansion = 1

        def __init__(self, inplanes, planes, stride=1, downsample=None):
            super(BasicBlock, self).__init__()
            self.bn_1 = nn.BatchNorm2d(inplanes)
            self.relu = nn.ReLU(inplace=True)
            self.conv_1 = conv3x3(inplanes, planes, stride)
            self.bn_2 = nn.BatchNorm2d(planes)
            self.conv_2 = conv3x3(planes, planes)
            self.downsample = downsample
            self.stride = stride

        def forward(self, x):
            residual = x

            # Pre-activation ordering: BN -> ReLU -> conv, twice
            out = self.bn_1(x)
            out = self.relu(out)
            out = self.conv_1(out)

            out = self.bn_2(out)
            out = self.relu(out)
            out = self.conv_2(out)

            if self.downsample is not None:
                residual = self.downsample(x)

            # No ReLU after the addition, so the shortcut stays an identity path
            out += residual
            return out


    class Bottleneck(nn.Module):
        expansion = 4

        def __init__(self, inplanes, planes, stride=1, downsample=None):
            super(Bottleneck, self).__init__()
            self.bn_1 = nn.BatchNorm2d(inplanes)
            self.conv_1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
            self.bn_2 = nn.BatchNorm2d(planes)
            self.conv_2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                                    padding=1, bias=False)
            self.bn_3 = nn.BatchNorm2d(planes)
            self.conv_3 = nn.Conv2d(planes, planes * self.expansion,
                                    kernel_size=1, bias=False)
            self.relu = nn.ReLU(inplace=True)
            self.downsample = downsample
            self.stride = stride

        def forward(self, x):
            residual = x

            # Pre-activation ordering: BN -> ReLU -> conv, three times
            out = self.bn_1(x)
            out = self.relu(out)
            out = self.conv_1(out)

            out = self.bn_2(out)
            out = self.relu(out)
            out = self.conv_2(out)

            out = self.bn_3(out)
            out = self.relu(out)
            out = self.conv_3(out)

            if self.downsample is not None:
                residual = self.downsample(x)

            out += residual
            return out


    class PreResNet(nn.Module):
        def __init__(self, depth, num_classes=1000, block_name='BasicBlock'):
            super(PreResNet, self).__init__()
            # The depth determines the number of blocks per stage (CIFAR-style model)
            if block_name.lower() == 'basicblock':
                assert (depth - 2) % 6 == 0, \
                    "With BasicBlock, depth should be 6n+2, e.g. 20, 32, 44, 56, 110, 1202"
                n = (depth - 2) // 6
                block = BasicBlock
            elif block_name.lower() == 'bottleneck':
                assert (depth - 2) % 9 == 0, \
                    "With Bottleneck, depth should be 9n+2, e.g. 20, 29, 47, 56, 110, 1199"
                n = (depth - 2) // 9
                block = Bottleneck
            else:
                raise ValueError('block_name should be BasicBlock or Bottleneck')

            self.inplanes = 16
            self.conv_1 = nn.Conv2d(3, 16, kernel_size=3, padding=1,
                                    bias=False)
            self.layer1 = self._make_layer(block, 16, n)
            self.layer2 = self._make_layer(block, 32, n, stride=2)
            self.layer3 = self._make_layer(block, 64, n, stride=2)
            # Final BN + ReLU, since the last block ends without an activation
            self.bn = nn.BatchNorm2d(64 * block.expansion)
            self.relu = nn.ReLU(inplace=True)
            self.avgpool = nn.AvgPool2d(8)
            self.fc = nn.Linear(64 * block.expansion, num_classes)

            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    nn.init.kaiming_normal_(m.weight.data)
                elif isinstance(m, nn.BatchNorm2d):
                    m.weight.data.fill_(1)
                    m.bias.data.zero_()

        def _make_layer(self, block, planes, blocks, stride=1):
            downsample = None
            # 1x1 projection on the shortcut when the shape changes (no BN here)
            if stride != 1 or self.inplanes != planes * block.expansion:
                downsample = nn.Sequential(
                    nn.Conv2d(self.inplanes, planes * block.expansion,
                              kernel_size=1, stride=stride, bias=False))

            layers = []
            layers.append(block(self.inplanes, planes, stride, downsample))
            self.inplanes = planes * block.expansion
            for _ in range(1, blocks):
                layers.append(block(self.inplanes, planes))
            return nn.Sequential(*layers)

        def forward(self, x):
            x = self.conv_1(x)  # 32x32
            x = self.layer1(x)  # 32x32
            x = self.layer2(x)  # 16x16
            x = self.layer3(x)  # 8x8
            x = self.bn(x)
            x = self.relu(x)
            x = self.avgpool(x)
            x = x.view(x.size(0), -1)
            x = self.fc(x)
            return x


    def preresnet20(num_classes):
        return PreResNet(depth=20, num_classes=num_classes)


    def preresnet32(num_classes):
        return PreResNet(depth=32, num_classes=num_classes)


    def preresnet44(num_classes):
        return PreResNet(depth=44, num_classes=num_classes)


    def preresnet56(num_classes):
        return PreResNet(depth=56, num_classes=num_classes)


    def preresnet110(num_classes):
        return PreResNet(depth=110, num_classes=num_classes)


    def preresnet1202(num_classes):
        return PreResNet(depth=1202, num_classes=num_classes)
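The `6n+2` and `9n+2` depth rules in the constructor follow from the layout: three stages of `n` blocks, each block holding 2 (BasicBlock) or 3 (Bottleneck) convolutions, plus the first convolution and the final fully connected layer. The standalone sketch below mirrors that arithmetic; `blocks_per_stage` is a hypothetical helper, not part of the code above.

```python
def blocks_per_stage(depth, block_name="basicblock"):
    """Return n, the number of residual blocks in each of the 3 stages."""
    convs_per_block = {"basicblock": 2, "bottleneck": 3}[block_name.lower()]
    layers_per_n = 3 * convs_per_block  # 3 stages x convolutions per block
    if (depth - 2) % layers_per_n != 0:
        raise ValueError(f"depth should be {layers_per_n}n+2, got {depth}")
    return (depth - 2) // layers_per_n


for depth in (20, 56, 110, 1202):
    print(depth, blocks_per_stage(depth))                # n = 3, 9, 18, 200

print(164, blocks_per_stage(164, "bottleneck"))          # n = 18
```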

ResNeXt

  • Paper: https://arxiv.org/abs/1611.05431

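Main ideas and improvements of the paper

Conventional approaches usually improve performance by making the network deeper or wider, but the computational cost grows accordingly. ResNeXt aims to improve performance without changing model complexity. Inspired by the compact yet efficient Inception module, the paper proposes a simple, highly modular network architecture for image classification. The network is built by aggregating a set of transformations: it borrows Inception's "split-transform-merge" strategy but builds the multi-branch structure from branches with the same topology. This multi-branch strategy exposes a new dimension, called "cardinality", which is an important factor in the network structure alongside depth and width. As an upgraded version of ResNet, ResNeXt thus introduces cardinality in addition to width and depth. Without making the network deeper or wider, increasing cardinality improves accuracy, and the design also reduces the number of hyperparameters.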

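The key points of ResNeXt are:

  • It keeps ResNet's shortcut connection and repeatedly stacks the same combination of modules.

  • It turns the non-shortcut branch of ResNet into multiple branches.

  • The branches are processed separately.

  • It uses 1×1 convolutions to reduce computation, combining the strengths of ResNet and Inception.

  • The most fundamental difference between ResNeXt and Inception is the topology of the branches inside a block. To increase expressive power and combine different receptive fields, Inception gives each branch a different topology, whereas ResNeXt uses branches of the same topology; in other words, the branches of ResNeXt are homogeneous.

Because ResNeXt is homogeneous, it inherits the design philosophy of VGG/ResNet: keep the network topology unchanged. This shows up in two rules:

  • Blocks that produce feature maps of the same size share the same structural hyperparameters.

  • Each time the spatial resolution is halved (downsampling), the width of the blocks (the number of channels) is doubled.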

Neuron connections

[Figure: neuron connections]

Aggregated transformations

[Figure: aggregated transformations]

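The final output formula of the ResNeXt block:

$$y = x + \sum_{i=1}^{C} T_i(x)$$

where $C$ is the cardinality and $T_i$ is the transformation applied by the $i$-th branch.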

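In addition, ResNeXt cleverly uses grouped convolution for its implementation. ResNeXt finds that increasing the number of branches is a more effective way to improve network performance than going deeper or wider. The name ResNeXt indicates that it is the next generation of ResNet.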
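A quick parameter count shows how grouped convolution keeps the complexity in check. The sketch below compares a ResNet bottleneck (256 → 64 → 64 → 256) with a ResNeXt block at cardinality 32 and 4 channels per group (256 → 128 → 128 → 256, the 32×4d configuration from the ResNeXt paper). It counts convolution weights only, and `conv_params` is a hypothetical helper.

```python
def conv_params(c_in, c_out, kernel, groups=1):
    """Weights of a (possibly grouped) convolution without bias."""
    assert c_in % groups == 0 and c_out % groups == 0
    # Each output channel only sees c_in / groups input channels
    return (c_in // groups) * c_out * kernel * kernel


resnet_block = (conv_params(256, 64, 1)
                + conv_params(64, 64, 3)
                + conv_params(64, 256, 1))

resnext_block = (conv_params(256, 128, 1)
                 + conv_params(128, 128, 3, groups=32)
                 + conv_params(128, 256, 1))

print(resnet_block)   # 69632
print(resnext_block)  # 70144
```

The grouped 3×3 convolution is twice as wide as the ResNet one, yet the two blocks end up with nearly the same number of parameters (roughly 70k each), which is how ResNeXt raises cardinality without raising complexity.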

[Figure]

If a ResNeXt block contains only two conv layers, the layers before and after can each be merged into one large conv layer.

[Figure]

Reading panel (a) of the figure above:

[Figure]

The core of ResNeXt lies only in the middle part, sandwiched between the top and bottom convolution layers.

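  1. The first conv in every branch receives the same input, and the branches share the same topology. By analogy with the associative law of multiplication, this amounts to splitting up the output of a single conv and handing the pieces to the branches. (Same input, different outputs.)

  2. The last conv of every branch contributes to the same output, so these convs can be merged and handled by a single conv. (Different inputs, same output.)

  3. The only part whose inputs and outputs both differ is the 3×3 conv in the middle. The branches' inputs, parameters, and the outputs they are responsible for are all different, so they cannot be merged and remain independent of each other. This is the real key of the model. The model can finally be reduced to the equivalent form shown in the figure below:

[Figure: the final equivalent form of the ResNeXt block]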
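Point 2 can be verified numerically. At a single spatial position a 1×1 convolution is just a matrix-vector product, so the sketch below uses small integer matrices (made-up values, two branches, hypothetical names) to check that summing the per-branch outputs equals applying one merged 1×1 conv to the concatenated branch features.

```python
def matvec(W, v):
    """Matrix-vector product with plain Python lists."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]


# Features of two branches after the middle 3x3 conv (2 channels each)
h = [[1, 2], [3, -1]]

# Per-branch final 1x1 conv weights: 3 output channels x 2 input channels
W = [
    [[1, 0], [0, 1], [2, 2]],
    [[-1, 1], [0, 3], [1, 0]],
]

# (a) Apply each branch's own 1x1 conv, then sum the branch outputs
per_branch = [matvec(Wi, hi) for Wi, hi in zip(W, h)]
branch_sum = [sum(vals) for vals in zip(*per_branch)]

# (b) Concatenate features and weights along the input-channel axis, apply one conv
h_cat = [a for hi in h for a in hi]
W_cat = [[w for Wi in W for w in Wi[r]] for r in range(3)]
merged = matvec(W_cat, h_cat)

print(branch_sum)  # [-3, -1, 9]
print(merged)      # [-3, -1, 9]
```

The same reasoning applied to the first 1×1 conv (one shared input, outputs split across branches) leaves only the middle 3×3 convolutions independent, which is exactly what a grouped convolution computes.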

ResNeXt network architecture:

[Figure: ResNeXt network architecture]

PyTorch implementation of ResNeXt

    import torch.nn as nn
    import torch.nn.functional as F

    __all__ = ['resnext29_8x64d', 'resnext29_16x64d']


    class Bottleneck(nn.Module):
        def __init__(self, in_channels, out_channels, stride,
                     cardinality, base_width, expansion):
            super(Bottleneck, self).__init__()
            width_ratio = out_channels / (expansion * 64.)
            # D: total width of the grouped conv (cardinality x per-group width)
            D = cardinality * int(base_width * width_ratio)
            self.relu = nn.ReLU(inplace=True)
            # 1x1 conv shared by all branches (same input, split output)
            self.conv_reduce = nn.Conv2d(
                in_channels, D, kernel_size=1, stride=1, padding=0, bias=False)
            self.bn_reduce = nn.BatchNorm2d(D)
            # Grouped 3x3 conv: one independent group per branch
            self.conv_conv = nn.Conv2d(
                D, D, kernel_size=3, stride=stride, padding=1,
                groups=cardinality, bias=False)
            self.bn = nn.BatchNorm2d(D)
            # 1x1 conv that merges all branches into the block output
            self.conv_expand = nn.Conv2d(
                D, out_channels, kernel_size=1, stride=1, padding=0, bias=False)
            self.bn_expand = nn.BatchNorm2d(out_channels)

            # Identity shortcut by default; 1x1 projection when the channel count changes
            self.shortcut = nn.Sequential()
            if in_channels != out_channels:
                self.shortcut.add_module(
                    'shortcut_conv',
                    nn.Conv2d(in_channels, out_channels, kernel_size=1,
                              stride=stride, padding=0, bias=False))
                self.shortcut.add_module(
                    'shortcut_bn', nn.BatchNorm2d(out_channels))

        def forward(self, x):
            out = self.conv_reduce(x)
            out = self.relu(self.bn_reduce(out))
            out = self.conv_conv(out)
            out = self.relu(self.bn(out))
            out = self.conv_expand(out)
            out = self.bn_expand(out)
            residual = self.shortcut(x)
            return self.relu(residual + out)


    class ResNeXt(nn.Module):
        """
        ResNext optimized for the Cifar dataset, as specified in
        https://arxiv.org/pdf/1611.05431.pdf
        """

        def __init__(self, cardinality, depth, num_classes,
                     base_width, expansion=4):
            """ Constructor
            Args:
                cardinality: number of convolution groups.
                depth: number of layers.
                num_classes: number of classes
                base_width: base number of channels in each group.
                expansion: factor to adjust the channel dimensionality
            """
            super(ResNeXt, self).__init__()
            self.cardinality = cardinality
            self.depth = depth
            # 3 stages x 3 convs per block, plus the first conv and the fc layer
            self.block_depth = (self.depth - 2) // 9
            self.base_width = base_width
            self.expansion = expansion
            self.num_classes = num_classes
            self.output_size = 64
            # Channel counts: stem, then the output of stages 1 to 3
            self.stages = [64, 64 * self.expansion, 128 * self.expansion,
                           256 * self.expansion]

            self.conv_1_3x3 = nn.Conv2d(3, 64, 3, 1, 1, bias=False)
            self.bn_1 = nn.BatchNorm2d(64)
            self.stage_1 = self.block('stage_1', self.stages[0], self.stages[1], 1)
            self.stage_2 = self.block('stage_2', self.stages[1], self.stages[2], 2)
            self.stage_3 = self.block('stage_3', self.stages[2], self.stages[3], 2)
            self.fc = nn.Linear(self.stages[3], num_classes)

            for m in self.modules():
                if isinstance(m, nn.Conv2d):
                    nn.init.kaiming_normal_(m.weight.data)
                elif isinstance(m, nn.BatchNorm2d):
                    m.weight.data.fill_(1)
                    m.bias.data.zero_()

        def block(self, name, in_channels, out_channels, pool_stride=2):
            block = nn.Sequential()
            for bottleneck in range(self.block_depth):
                name_ = '%s_bottleneck_%d' % (name, bottleneck)
                if bottleneck == 0:
                    # The first bottleneck changes the channel count and resolution
                    block.add_module(
                        name_,
                        Bottleneck(in_channels, out_channels, pool_stride,
                                   self.cardinality, self.base_width,
                                   self.expansion))
                else:
                    block.add_module(
                        name_,
                        Bottleneck(out_channels, out_channels, 1,
                                   self.cardinality, self.base_width,
                                   self.expansion))
            return block

        def forward(self, x):
            x = self.conv_1_3x3(x)
            x = F.relu(self.bn_1(x), inplace=True)
            x = self.stage_1(x)
            x = self.stage_2(x)
            x = self.stage_3(x)
            # For 32x32 inputs the final feature map is 8x8
            x = F.avg_pool2d(x, 8, 1)
            x = x.view(-1, self.stages[3])
            return self.fc(x)


    def resnext29_8x64d(num_classes):
        return ResNeXt(
            cardinality=8,
            depth=29,
            num_classes=num_classes,
            base_width=64)


    def resnext29_16x64d(num_classes):
        return ResNeXt(
            cardinality=16,
            depth=29,
            num_classes=num_classes,
            base_width=64)
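The names `8x64d` and `16x64d` encode cardinality × base width. As a small standalone sketch (the helper `grouped_width` is hypothetical and simply repeats the width formula from the `Bottleneck` constructor), the total width `D` of the grouped 3×3 convolution in each stage works out as follows:

```python
def grouped_width(out_channels, cardinality, base_width, expansion=4):
    """Total width D of the grouped conv, as computed in the Bottleneck constructor."""
    width_ratio = out_channels / (expansion * 64.0)
    return cardinality * int(base_width * width_ratio)


expansion = 4
stage_outputs = [64 * expansion, 128 * expansion, 256 * expansion]  # 256, 512, 1024

widths_8x64d = [grouped_width(c, 8, 64) for c in stage_outputs]
widths_16x64d = [grouped_width(c, 16, 64) for c in stage_outputs]
blocks_in_stage = (29 - 2) // 9

print(widths_8x64d)     # [512, 1024, 2048]
print(widths_16x64d)    # [1024, 2048, 4096]
print(blocks_in_stage)  # 3
```

So the 29-layer model has three bottlenecks per stage, and the per-group width doubles (64, 128, 256) each time the resolution is halved, in line with the design rule stated earlier.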
