
PyTorch: Extracting Neural Network Layer Structures, Layer Parameters, and Custom Initialization


1. Extracting Layer Structures

Given a model, suppose we do not want all of its layers but only want to extract a particular layer or a few layers from the network. How can this be done?

First, let's define the network structure:

import torch
import torch.nn as nn


class SimpleCnn(nn.Module):
    def __init__(self):
        super(SimpleCnn, self).__init__()

        # Block 1: 3 -> 32 channels, 3x3 conv, 2x2 max-pool
        layer1 = nn.Sequential()
        layer1.add_module("conv1", nn.Conv2d(3, 32, 3, 1, padding=1))
        layer1.add_module("relu1", nn.ReLU(True))
        layer1.add_module("pool1", nn.MaxPool2d(2, 2))
        self.layer1 = layer1

        # Block 2: 32 -> 64 channels
        layer2 = nn.Sequential()
        layer2.add_module("conv2", nn.Conv2d(32, 64, 3, 1, padding=1))
        layer2.add_module("relu2", nn.ReLU(True))
        layer2.add_module("pool2", nn.MaxPool2d(2, 2))
        self.layer2 = layer2

        # Block 3: 64 -> 128 channels
        layer3 = nn.Sequential()
        layer3.add_module("conv3", nn.Conv2d(64, 128, 3, 1, padding=1))
        layer3.add_module("relu3", nn.ReLU(True))
        layer3.add_module("pool3", nn.MaxPool2d(2, 2))
        self.layer3 = layer3

        # Classifier head: three fully connected layers down to 10 classes
        layer4 = nn.Sequential()
        layer4.add_module("fc1", nn.Linear(2048, 512))
        layer4.add_module("fc_relu1", nn.ReLU(True))
        layer4.add_module("fc2", nn.Linear(512, 64))
        layer4.add_module("fc_relu2", nn.ReLU(True))
        layer4.add_module("fc3", nn.Linear(64, 10))
        self.layer4 = layer4

    def forward(self, x):
        conv1 = self.layer1(x)
        conv2 = self.layer2(conv1)
        conv3 = self.layer3(conv2)
        # Flatten to (batch_size, 2048) before the classifier
        fc_input = conv3.view(conv3.shape[0], -1)
        fc_out = self.layer4(fc_input)

        return fc_out
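
As a quick sanity check (a minimal sketch; the 3×32×32 input size is inferred from fc1's in_features, since three 2×2 poolings reduce 32 → 16 → 8 → 4 and 128 × 4 × 4 = 2048):

model = SimpleCnn()
dummy = torch.randn(1, 3, 32, 32)  # one RGB 32x32 image (e.g. CIFAR-10-sized)
print(model(dummy).shape)          # torch.Size([1, 10])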

Next, let's look at a few important nn.Module methods.

1.1 children()

children() returns an iterator over the next level of modules, i.e. the direct children. For the model defined above, it yields only self.layer1, self.layer2, self.layer3, and self.layer4; it does not descend into their contents.

  1. Example 1: model.children() returns a generator
model = SimpleCnn()
print(model.children())
# Output: <generator object Module.children at 0x7f84932e2050>
  2. Example 2: iterate over model.children() to extract the network's direct child modules
model = SimpleCnn()
for layer in model.children():
    print(layer)
Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU(inplace=True)
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Sequential(
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu2): ReLU(inplace=True)
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Sequential(
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu3): ReLU(inplace=True)
  (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Sequential(
  (fc1): Linear(in_features=2048, out_features=512, bias=True)
  (fc_relu1): ReLU(inplace=True)
  (fc2): Linear(in_features=512, out_features=64, bias=True)
  (fc_relu2): ReLU(inplace=True)
  (fc3): Linear(in_features=64, out_features=10, bias=True)
)
  3. Example 3: iterate over each child module's own children to reach the modules one level further down.
model = SimpleCnn()
for layer in model.children():
    for sublayer in layer.children():
        print(sublayer)
Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
ReLU(inplace=True)
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
ReLU(inplace=True)
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
ReLU(inplace=True)
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
Linear(in_features=2048, out_features=512, bias=True)
ReLU(inplace=True)
Linear(in_features=512, out_features=64, bias=True)
ReLU(inplace=True)
Linear(in_features=64, out_features=10, bias=True)
  4. To extract the first two layers of the network, do the following:
model = SimpleCnn()
new_model = nn.Sequential(*list(model.children())[0:2])
print(new_model)
Sequential(
  (0): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU(inplace=True)
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (1): Sequential(
    (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu2): ReLU(inplace=True)
    (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
)
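
The resulting new_model is a regular module and can serve as a feature extractor. A minimal sketch, again assuming a 3×32×32 input (two conv blocks with 2×2 pooling map it to 64 channels at 8×8):

model = SimpleCnn()
new_model = nn.Sequential(*list(model.children())[0:2])
features = new_model(torch.randn(1, 3, 32, 32))
print(features.shape)  # torch.Size([1, 64, 8, 8])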

1.2 modules()

modules() returns an iterator over every module in the model, recursively. The advantage is that it can reach the innermost modules, such as self.layer1.conv1. The output below therefore contains all of the network's modules at every level; note the difference from Examples 1 and 2 in Section 1.1.

model = SimpleCnn()
for layer in model.modules():
    print(layer)
SimpleCnn(
  (layer1): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU(inplace=True)
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu2): ReLU(inplace=True)
    (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer3): Sequential(
    (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu3): ReLU(inplace=True)
    (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer4): Sequential(
    (fc1): Linear(in_features=2048, out_features=512, bias=True)
    (fc_relu1): ReLU(inplace=True)
    (fc2): Linear(in_features=512, out_features=64, bias=True)
    (fc_relu2): ReLU(inplace=True)
    (fc3): Linear(in_features=64, out_features=10, bias=True)
  )
)
Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU(inplace=True)
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
ReLU(inplace=True)
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
Sequential(
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu2): ReLU(inplace=True)
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
ReLU(inplace=True)
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
Sequential(
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu3): ReLU(inplace=True)
  (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
ReLU(inplace=True)
MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
Sequential(
  (fc1): Linear(in_features=2048, out_features=512, bias=True)
  (fc_relu1): ReLU(inplace=True)
  (fc2): Linear(in_features=512, out_features=64, bias=True)
  (fc_relu2): ReLU(inplace=True)
  (fc3): Linear(in_features=64, out_features=10, bias=True)
)
Linear(in_features=2048, out_features=512, bias=True)
ReLU(inplace=True)
Linear(in_features=512, out_features=64, bias=True)
ReLU(inplace=True)
Linear(in_features=64, out_features=10, bias=True)
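
To see the difference in scope between children() and modules() at a glance:

model = SimpleCnn()
print(len(list(model.children())))  # 4: layer1 .. layer4 only
print(len(list(model.modules())))   # 19: SimpleCnn itself + 4 Sequentials + 14 inner layers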

1.3 named_children() and named_modules()

named_children() and named_modules() correspond to children() and modules() respectively; besides iterating over the modules, they also yield each module's name.
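
For instance, named_children() yields just the four top-level (name, module) pairs (a quick sketch), while named_modules() walks the full hierarchy, as Example 1 below shows:

model = SimpleCnn()
for name, layer in model.named_children():
    print(name)  # layer1, layer2, layer3, layer4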

  1. Example 1: iterate over model.named_modules()
model = SimpleCnn()
for layer in model.named_modules():
    print(layer)
('', SimpleCnn(
  (layer1): Sequential(
    (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu1): ReLU(inplace=True)
    (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu2): ReLU(inplace=True)
    (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer3): Sequential(
    (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (relu3): ReLU(inplace=True)
    (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer4): Sequential(
    (fc1): Linear(in_features=2048, out_features=512, bias=True)
    (fc_relu1): ReLU(inplace=True)
    (fc2): Linear(in_features=512, out_features=64, bias=True)
    (fc_relu2): ReLU(inplace=True)
    (fc3): Linear(in_features=64, out_features=10, bias=True)
  )
))
('layer1', Sequential(
  (conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU(inplace=True)
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
))
('layer1.conv1', Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
('layer1.relu1', ReLU(inplace=True))
('layer1.pool1', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False))
('layer2', Sequential(
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu2): ReLU(inplace=True)
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
))
('layer2.conv2', Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
('layer2.relu2', ReLU(inplace=True))
('layer2.pool2', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False))
('layer3', Sequential(
  (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu3): ReLU(inplace=True)
  (pool3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
))
('layer3.conv3', Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))
('layer3.relu3', ReLU(inplace=True))
('layer3.pool3', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False))
('layer4', Sequential(
  (fc1): Linear(in_features=2048, out_features=512, bias=True)
  (fc_relu1): ReLU(inplace=True)
  (fc2): Linear(in_features=512, out_features=64, bias=True)
  (fc_relu2): ReLU(inplace=True)
  (fc3): Linear(in_features=64, out_features=10, bias=True)
))
('layer4.fc1', Linear(in_features=2048, out_features=512, bias=True))
('layer4.fc_relu1', ReLU(inplace=True))
('layer4.fc2', Linear(in_features=512, out_features=64, bias=True))
('layer4.fc_relu2', ReLU(inplace=True))
('layer4.fc3', Linear(in_features=64, out_features=10, bias=True))
  2. To extract all of the convolutional layers in the model, do the following:
model = SimpleCnn()
new_model = nn.Sequential()
for layer in model.named_modules():
    if isinstance(layer[1], nn.Conv2d):
        conv_name = layer[0].replace('.', '_')
        new_model.add_module(conv_name, layer[1])

print(new_model)
Sequential(
  (layer1_conv1): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (layer2_conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (layer3_conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
)
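
Note that because the three extracted convolutions happen to chain (3 → 32 → 64 → 128 channels, all stride 1 with padding=1), the resulting stack is itself runnable; a small sketch:

model = SimpleCnn()
new_model = nn.Sequential()
for name, layer in model.named_modules():
    if isinstance(layer, nn.Conv2d):
        new_model.add_module(name.replace('.', '_'), layer)

out = new_model(torch.randn(1, 3, 32, 32))
print(out.shape)  # torch.Size([1, 128, 32, 32]): padding=1 convs keep the 32x32 size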

2. Extracting Parameters and Custom Initialization

2.1 Extracting Parameters

nn.Module provides two parameter-related methods:

  • parameters(): returns an iterator over all of the network's parameters
  • named_parameters(): returns an iterator over the names and parameters of the network's layers
  1. Example 1: get the model's parameters. Each parameter's gradient is also accessible via para.grad (populated once backward() has been called).
model = SimpleCnn()
for para in model.parameters():
    print(para)
  2. Example 2: get the names and parameters of the network's layers. model.named_parameters() yields tuples of the form (name, parameter).
model = SimpleCnn()
for name, para in model.named_parameters():
    print(name)
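
A common extension of this pattern is to print each parameter's shape and tally the total parameter count (a small sketch using Tensor.numel()):

model = SimpleCnn()
total = 0
for name, para in model.named_parameters():
    print(name, tuple(para.shape))
    total += para.numel()  # number of elements in this parameter tensor
print("total parameters:", total)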

2.2 Custom Initialization

Combining modules() with isinstance checks lets us apply a different initialization scheme to each layer type:

model = SimpleCnn()
for m in model.modules():
    if isinstance(m, nn.Conv2d):
        # Alternative schemes:
        # nn.init.normal_(m.weight.data)
        # nn.init.xavier_normal_(m.weight.data)
        nn.init.kaiming_normal_(m.weight.data)  # Kaiming (He) initialization
        m.bias.data.fill_(0)                    # zero the conv biases
    elif isinstance(m, nn.Linear):
        m.weight.data.normal_()                 # standard normal initialization
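
As a quick sanity check after running the loop above (model is the instance just initialized): the conv biases should now be exactly zero, and the re-initialized linear weights roughly standard normal.

print(model.layer1.conv1.bias.abs().sum().item())  # 0.0 after fill_(0)
print(model.layer4.fc1.weight.mean().item(),
      model.layer4.fc1.weight.std().item())        # roughly 0 and 1 after normal_()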