(These are just my personal study notes; if you spot any mistakes, feel free to point them out.)
Please read this post alongside the source code: https://github.com/leoxiaobin/deep-high-resolution-net.pytorch
I recommend skimming the paper first to get a rough idea of HRNet's characteristics before reading on.
Let's start with the function the code uses to build the model:
```python
def get_pose_net(cfg, is_train, **kwargs):
    model = PoseHighResolutionNet(cfg, **kwargs)

    if is_train and cfg['MODEL']['INIT_WEIGHTS']:
        model.init_weights(cfg['MODEL']['PRETRAINED'])

    return model
```
The whole model is built by model = PoseHighResolutionNet(cfg, **kwargs), so let's look at the forward function of the PoseHighResolutionNet class and analyze it section by section.
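Before diving in, here is a hedged usage sketch of get_pose_net. The sys.path trick, the import paths, and the experiment YAML name are assumptions based on the repo's layout (its tools/ scripts add lib/ to the path via _init_paths); adjust them to your checkout:

```python
import sys
sys.path.insert(0, 'lib')  # assumption: the repo's tools/ scripts do the same via _init_paths

import torch
from config import cfg                     # default hyper-parameters (a yacs CfgNode)
from models.pose_hrnet import get_pose_net

# Merge one of the repo's experiment configs into the defaults:
cfg.defrost()
cfg.merge_from_file('experiments/coco/hrnet/w32_256x192_adam_lr1e-3.yaml')
cfg.freeze()

model = get_pose_net(cfg, is_train=False)  # is_train=False skips init_weights
with torch.no_grad():
    out = model(torch.randn(1, 3, 256, 192))
print(out.shape)                           # heatmaps: torch.Size([1, 17, 64, 48])
```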
1. Initial feature extraction
First, the simplest part: this stage just extracts preliminary features from the input image. There is not much to say here; compare it with the __init__ function and it should be clear.
```python
def forward(self, x):
    # preliminary feature extraction
    x = self.conv1(x)  # (h, w, 3) -> (h/2, w/2, 64); stride-2 conv halves the resolution (rounding up for odd sizes)
    x = self.bn1(x)    # batch normalization
    x = self.relu(x)   # activation
    x = self.conv2(x)  # (h/2, w/2, 64) -> (h/4, w/4, 64)
    x = self.bn2(x)    # batch normalization
    x = self.relu(x)   # activation
```
The module structure for this part is:
```
(conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
```
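As a quick sanity check, here is a minimal standalone sketch of the stem (plain PyTorch, mirroring the printout above) confirming that the two stride-2 convolutions shrink the input by a factor of 4:

```python
import torch
import torch.nn as nn

# The stem as a Sequential, mirroring conv1/bn1/relu/conv2/bn2/relu above:
stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)

x = torch.randn(1, 3, 256, 192)  # a typical HRNet pose input size
print(stem(x).shape)             # torch.Size([1, 64, 64, 48]) -- 4x smaller
```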
2. Using residual blocks to go deeper and keep extracting features
In the forward function, the line after the initial feature extraction is:
x = self.layer1(x)
Let's first look at how self.layer1 is defined in __init__:
self.layer1 = self._make_layer(Bottleneck, 64, 4)
Then we step into self._make_layer(Bottleneck, 64, 4) to see what it does:
```python
def _make_layer(self, block, planes, blocks, stride=1):
    downsample = None

    # Look at the if below:
    # For layer1, block is the Bottleneck class, and block.expansion is a
    # class attribute defined as 4.
    # layer1 uses stride=1 and planes=64, while self.inplanes (the current
    # feature-map channel count) is 64 after the stem. Since
    # self.inplanes (64) != planes * block.expansion (256), the condition holds, so
    # downsample = nn.Sequential(
    #     nn.Conv2d(64, 64*4, kernel_size=1, stride=1, bias=False),
    #     nn.BatchNorm2d(64*4, momentum=BN_MOMENTUM),
    # )
    # This downsample is passed into the first Bottleneck below, where it
    # adjusts the channel count of the input x so the residual addition works.
    if stride != 1 or self.inplanes != planes * block.expansion:
        downsample = nn.Sequential(
            nn.Conv2d(
                self.inplanes, planes * block.expansion,
                kernel_size=1, stride=stride, bias=False
            ),
            nn.BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
        )

    layers = []
    # So the first entry in layers is bottleneck(64, 64, 1, downsample),
    # mapping (w, h, 64) -> (w, h, 256); detailed analysis below.
    layers.append(block(self.inplanes, planes, stride, downsample))

    # After the first block, the current channel count is 256.
    self.inplanes = planes * block.expansion

    # blocks is 4 here, i.e. for i in range(1, 4), so the loop appends
    # 3 more Bottlenecks, presumably just to deepen the network:
    # bottleneck(256, 64, 1)  # no downsample passed this time, because the
    # bottleneck(256, 64, 1)  # residual addition needs no channel change
    # bottleneck(256, 64, 1)
    for i in range(1, blocks):
        layers.append(block(self.inplanes, planes))

    return nn.Sequential(*layers)
```
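To make the channel bookkeeping concrete, here is a standalone loop (plain Python arithmetic, not repo code) that mirrors what _make_layer computes for layer1:

```python
# Mirrors _make_layer(Bottleneck, planes=64, blocks=4) with stride=1:
inplanes, planes, expansion, blocks, stride = 64, 64, 4, 4, 1
for i in range(blocks):
    # only the first block can need a downsample branch
    needs_downsample = i == 0 and (stride != 1 or inplanes != planes * expansion)
    print(f'block {i}: {inplanes} -> {planes * expansion} channels, '
          f'downsample={needs_downsample}')
    inplanes = planes * expansion  # every block outputs planes * expansion
# block 0: 64 -> 256 channels, downsample=True
# block 1: 256 -> 256 channels, downsample=False
# block 2: 256 -> 256 channels, downsample=False
# block 3: 256 -> 256 channels, downsample=False
```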
#############################################################################
Taking layer1's first block, bottleneck(64, 64, 1, downsample), as an example, let's look at what a Bottleneck actually does. The Bottleneck class looks like this:
```python
# Here we only cover what the code does, not the rationale or theory
# behind residual structures.
class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1,
                               bias=False)
        self.bn3 = nn.BatchNorm2d(planes * self.expansion,
                                  momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)    # nn.Conv2d(64, 64, kernel_size=1, bias=False): (w, h, 64) -> (w, h, 64)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)  # nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=False): (w, h, 64) -> (w, h, 64)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)  # nn.Conv2d(64, 64 * 4, kernel_size=1, bias=False): (w, h, 64) -> (w, h, 256)
        out = self.bn3(out)

        if self.downsample is not None:
            # downsample reshapes the input x to the same channel count as
            # conv3's output, so the two feature maps can be added and more
            # information is kept (if this is unclear, read up on residual
            # structures first).
            # If x already matches conv3's output, they can be added directly,
            # downsample is None, and this branch is skipped.
            residual = self.downsample(x)  # downsample = nn.Sequential(
                                           #     nn.Conv2d(64, 64*4, kernel_size=1, stride=1, bias=False),
                                           #     nn.BatchNorm2d(64*4, momentum=BN_MOMENTUM),
                                           # )

        out += residual        # the residual addition
        out = self.relu(out)   # final activation
        return out
```
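To verify the shape annotations, here is a small sketch that rebuilds layer1's first block, bottleneck(64, 64, 1, downsample), and pushes a dummy tensor through it. It assumes the Bottleneck class above is in scope and supplies the BN_MOMENTUM constant the class refers to:

```python
import torch
import torch.nn as nn

BN_MOMENTUM = 0.1  # the module-level constant Bottleneck relies on

# layer1's downsample branch, as derived in _make_layer above:
downsample = nn.Sequential(
    nn.Conv2d(64, 256, kernel_size=1, stride=1, bias=False),
    nn.BatchNorm2d(256, momentum=BN_MOMENTUM),
)
block = Bottleneck(64, 64, stride=1, downsample=downsample)

x = torch.randn(1, 64, 64, 48)  # a 256x192 image after the 4x stem
print(block(x).shape)           # torch.Size([1, 256, 64, 48]): (w, h, 64) -> (w, h, 256)
```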
#############################################################################
The module structure of this part looks like this:
```
(layer1): Sequential(
  (0): Bottleneck(
    (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (downsample): Sequential(
      (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  ...
)
```
Blocks (1) to (3) repeat the same structure, except their conv1 takes 256 input channels and they have no downsample branch.
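To dump the complete printout yourself, build the model and print the submodule (assuming `model` from the first sketch above):

```python
# model is a PoseHighResolutionNet built via get_pose_net:
print(model.layer1)
```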