
A Detailed Walkthrough of the HRNet Model Source Code

(These are just my personal study notes; if you spot any mistakes, please point them out.)

Please read this post alongside the source code: https://github.com/leoxiaobin/deep-high-resolution-net.pytorch

I recommend skimming the paper first to get a rough idea of HRNet's characteristics before reading on.

Let's start with the function the code uses to build the model:


def get_pose_net(cfg, is_train, **kwargs):
    model = PoseHighResolutionNet(cfg, **kwargs)
    if is_train and cfg['MODEL']['INIT_WEIGHTS']:
        model.init_weights(cfg['MODEL']['PRETRAINED'])
    return model

Here, model = PoseHighResolutionNet(cfg, **kwargs) builds the entire model, so let's look at the forward function of the PoseHighResolutionNet class and analyze it piece by piece.
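Before moving on, here is a quick, hedged usage sketch. The specifics are assumptions on my part: I assume cfg has already been loaded from one of the repo's experiments/*.yaml files, and the input size 256x192 with 17 joints matches the repo's typical COCO configuration.

import torch

# Sketch only: cfg is assumed to be a loaded repo config (e.g. a COCO 256x192 yaml).
model = get_pose_net(cfg, is_train=False)
x = torch.randn(1, 3, 256, 192)   # (batch, channels, height, width)
heatmaps = model(x)
print(heatmaps.shape)             # expected: torch.Size([1, 17, 64, 48]),
                                  # i.e. 17 joint heatmaps at 1/4 input resolution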

1. Initial feature extraction

First comes the simplest part. All it does is run a preliminary feature extraction on the input image; there isn't much to say, so compare it against the __init__ function and it should be clear.

def forward(self, x):
    # Preliminary feature extraction
    x = self.conv1(x)   # (h, w, 3) --> (h/2, w/2, 64), stride-2 conv
    x = self.bn1(x)     # batch normalization
    x = self.relu(x)    # activation
    x = self.conv2(x)   # halves the spatial size again: overall (h, w, 3) --> (h/4, w/4, 64)
    x = self.bn2(x)     # batch normalization
    x = self.relu(x)    # activation

The corresponding module structure is:

(conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
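
To sanity-check the shape comments above, here is a minimal standalone sketch (not the repo's code, just the same five layers wired together in the same order):

import torch
import torch.nn as nn

# Two stride-2 3x3 convs: each halves the spatial size, so 256x192 --> 64x48.
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)
print(stem(torch.randn(1, 3, 256, 192)).shape)  # torch.Size([1, 64, 64, 48])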

2. Using residual blocks to deepen the network and keep extracting features

In the forward function, the next line after the initial feature extraction is:

    x = self.layer1(x)

Let's first look at how self.layer1 is defined in __init__:

self.layer1 = self._make_layer(Bottleneck, 64, 4)

Then let's step into _make_layer(Bottleneck, 64, 4) and take a look:

def _make_layer(self, block, planes, blocks, stride=1):
    downsample = None
    # Look at the if-branch below first.
    # For layer1, block is the Bottleneck class, and block.expansion is a class
    # attribute of Bottleneck defined as 4.
    # layer1 uses stride=1 and planes=64, while self.inplanes (the current number
    # of feature-map channels) is 64 after the initial feature extraction. Since
    # block.expansion = 4, we have 64 != 64 * 4, so the condition holds and:
    # downsample = nn.Sequential(
    #     nn.Conv2d(64, 64 * 4, kernel_size=1, stride=1, bias=False),
    #     nn.BatchNorm2d(64 * 4, momentum=BN_MOMENTUM),
    # )
    # This downsample is passed into the first Bottleneck below, where it adjusts
    # the channel count of the input x so the residual addition is possible.
    if stride != 1 or self.inplanes != planes * block.expansion:
        downsample = nn.Sequential(
            nn.Conv2d(
                self.inplanes, planes * block.expansion,
                kernel_size=1, stride=stride, bias=False
            ),
            nn.BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
        )

    layers = []
    # So the first entry in layers is Bottleneck(64, 64, 1, downsample),
    # (w, h, 64) --> (w, h, 256); a detailed analysis follows below.
    layers.append(block(self.inplanes, planes, stride, downsample))
    # After the first block, the feature map has 256 channels.
    self.inplanes = planes * block.expansion
    # blocks is 4 here, i.e. for i in range(1, 4), so this loop appends three
    # more bottlenecks, presumably to deepen the network:
    # Bottleneck(256, 64, 1)  -- no downsample passed this time, since the
    # Bottleneck(256, 64, 1)     residual addition needs no channel change
    # Bottleneck(256, 64, 1)
    for i in range(1, blocks):
        layers.append(block(self.inplanes, planes))
    return nn.Sequential(*layers)
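
As a small illustration (my own sketch, not repo code), this is all the downsample branch built above does: a 1x1 convolution plus BatchNorm that lifts 64 channels to 256 without touching the spatial size, so the skip connection can later be added to conv3's output:

import torch
import torch.nn as nn

BN_MOMENTUM = 0.1  # module-level constant in the repo

downsample = nn.Sequential(
    nn.Conv2d(64, 64 * 4, kernel_size=1, stride=1, bias=False),
    nn.BatchNorm2d(64 * 4, momentum=BN_MOMENTUM),
)
print(downsample(torch.randn(1, 64, 64, 48)).shape)  # torch.Size([1, 256, 64, 48])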

#############################################################################

Taking the first block of layer1, Bottleneck(64, 64, 1, downsample), as an example, let's see what a bottleneck actually does. The Bottleneck class code is as follows:

import torch.nn as nn

BN_MOMENTUM = 0.1  # module-level constant in the repo

# Here we only look at what the code does; the theory behind residual
# structures is not explained in detail.
class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1,
                               bias=False)
        self.bn3 = nn.BatchNorm2d(planes * self.expansion,
                                  momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x
        out = self.conv1(x)    # nn.Conv2d(64, 64, kernel_size=1, bias=False): (w, h, 64) --> (w, h, 64)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)  # nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=False): (w, h, 64) --> (w, h, 64)
        out = self.bn2(out)
        out = self.relu(out)
        out = self.conv3(out)  # nn.Conv2d(64, 64 * 4, kernel_size=1, bias=False): (w, h, 64) --> (w, h, 256)
        out = self.bn3(out)
        if self.downsample is not None:
            # downsample reshapes the input x to the same dimensions as conv3's
            # output, so the two feature maps can be added and more information
            # is preserved. (If this is unclear, read up on residual structures
            # first.) If x already matches conv3's output, they can be added
            # directly; downsample is then None and this branch is skipped.
            residual = self.downsample(x)  # here: nn.Sequential(
                                           #     nn.Conv2d(64, 64 * 4, kernel_size=1, stride=1, bias=False),
                                           #     nn.BatchNorm2d(64 * 4, momentum=BN_MOMENTUM),
                                           # )
        out += residual        # the residual addition
        out = self.relu(out)   # final result
        return out
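
Putting the pieces together, here is a small sketch (assuming the Bottleneck class and BN_MOMENTUM above) that reproduces layer1's shape bookkeeping: the first block uses downsample to expand the skip connection from 64 to 256 channels, and the following blocks use a plain identity skip:

import torch
import torch.nn as nn

# First block of layer1: Bottleneck(64, 64, 1, downsample), channels 64 --> 256.
ds = nn.Sequential(
    nn.Conv2d(64, 64 * 4, kernel_size=1, stride=1, bias=False),
    nn.BatchNorm2d(64 * 4, momentum=BN_MOMENTUM),
)
block0 = Bottleneck(64, 64, stride=1, downsample=ds)
x = torch.randn(1, 64, 64, 48)
y = block0(x)
print(y.shape)          # torch.Size([1, 256, 64, 48])

# Remaining three blocks: Bottleneck(256, 64), identity skip, no downsample.
block1 = Bottleneck(256, 64)
print(block1(y).shape)  # torch.Size([1, 256, 64, 48])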

#############################################################################

So the module structure of this part looks like this:

(layer1): Sequential(
  (0): Bottleneck(
    (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (downsample): Sequential(
      (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (1): Bottleneck(...)  # blocks (1)-(3) repeat the same structure, without downsample
  (2): Bottleneck(...)
  (3): Bottleneck(...)
)