
A Summary of PyTorch Model Inspection Functions: model.state_dict(), model.modules(), model.children(), model.parameters()


Series Articles

PyTorch optimizers: add_param_group() explained with examples, plus the YOLOv7 optimizer code
PyTorch learning-rate settings: optimizer.param_groups, per-layer learning rates, dynamic learning-rate adjustment
PyTorch notes on tensor, Variable, nn.Parameter(), leaf vs. non-leaf nodes, detach(), and inspecting layer parameters
A Summary of PyTorch Model Inspection Functions: model.state_dict(), model.modules(), model.children(), model.parameters()
PyTorch model parameter initialization (weights_init): torch.nn.init and loading pretrained weights





1. model.modules(), model.children(), model.parameters()

First, define a network model:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Define the model: a simple CNN
class CNN(torch.nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=(5, 5))    # conv layer 1
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=(5, 5))   # conv layer 2
        self.pooling = torch.nn.MaxPool2d(2)                       # pooling layer
        self.fc1 = torch.nn.Linear(320, 256)                       # fully connected layer 1
        self.fc2 = torch.nn.Linear(256, 128)                       # fully connected layer 2
        self.fc3 = torch.nn.Linear(128, 10)                        # fully connected layer 3

    def forward(self, x):
        # x.shape = 256*1*28*28
        batch_size = x.size(0)   # 256
        # 1*28*28 -> 10*24*24 -> 10*12*12
        x = F.relu(self.pooling(self.conv1(x)))    # conv layer 1 -> pooling -> ReLU
        # 10*12*12 -> 20*8*8 -> 20*4*4
        x = F.relu(self.pooling(self.conv2(x)))    # conv layer 2 -> pooling -> ReLU
        # 20*4*4 -> 320
        x = x.view(batch_size, -1)   # flatten the tensor
        # 320 -> 256
        x = self.fc1(x)              # fully connected layer 1
        # 256 -> 128
        x = self.fc2(x)              # fully connected layer 2
        # 128 -> 10
        x = self.fc3(x)              # fully connected layer 3
        return x

model = CNN()    # instantiate the model
# The following six methods each return a generator; iterate over it
# (e.g. with a for loop or a list comprehension) to collect the contents into a list
print('model.modules()',model.modules())
print('model.named_modules()', model.named_modules())
print('model.children()',model.children())
print('model.named_children()',model.named_children())
print('model.parameters()',model.parameters())
print('model.named_parameters()',model.named_parameters())

1.1 model.modules()

Recursively iterates over all submodules of the model, including submodules of submodules; a submodule here is any layer that inherits from nn.Module. For the CNN above, that is CNN() itself plus self.conv1, self.conv2, self.pooling, self.fc1, self.fc2 and self.fc3, i.e. seven modules in total.

model_modules = [m for m in model.modules()]          
print('len(model_modules)=',len(model_modules))
print('model_modules',model_modules)

len(model_modules)= 7
model_modules [CNN(
(conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
(conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
(pooling): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(fc1): Linear(in_features=320, out_features=256, bias=True)
(fc2): Linear(in_features=256, out_features=128, bias=True)
(fc3): Linear(in_features=128, out_features=10, bias=True)
), Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1)), Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1)), MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), Linear(in_features=320, out_features=256, bias=True), Linear(in_features=256, out_features=128, bias=True), Linear(in_features=128, out_features=10, bias=True)]
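A common use of model.modules() is to visit every layer, however deeply nested, and apply per-layer-type logic. A minimal sketch (the isinstance filter is just for illustration):

# Count the Conv2d layers anywhere in the model
conv_layers = [m for m in model.modules() if isinstance(m, torch.nn.Conv2d)]
print('number of Conv2d layers:', len(conv_layers))   # 2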

1.2 model.named_modules()

model.named_modules() is simply model.modules() with each module's name attached; the top-level module is returned under the empty string ''.

model_named_modules = [m for m in model.named_modules()]     
print('len(model_named_modules)=',len(model_named_modules))
print('model_named_modules',model_named_modules)

len(model_named_modules)= 7
model_named_modules [('', CNN(
(conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
(conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
(pooling): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(fc1): Linear(in_features=320, out_features=256, bias=True)
(fc2): Linear(in_features=256, out_features=128, bias=True)
(fc3): Linear(in_features=128, out_features=10, bias=True)
)), ('conv1', Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))), ('conv2', Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))), ('pooling', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)), ('fc1', Linear(in_features=320, out_features=256, bias=True)), ('fc2', Linear(in_features=256, out_features=128, bias=True)), ('fc3', Linear(in_features=128, out_features=10, bias=True))]
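For nested modules, named_modules() produces dotted names. A small sketch with a hypothetical wrapper model (not part of the CNN above) to illustrate:

# Inner layers of a nested container get dotted names like '0.0'
nested = torch.nn.Sequential(
    torch.nn.Sequential(torch.nn.Conv2d(1, 10, kernel_size=(5, 5)), torch.nn.ReLU()),
    torch.nn.Linear(10, 2),
)
for name, m in nested.named_modules():
    print(repr(name), type(m).__name__)
# '' Sequential, '0' Sequential, '0.0' Conv2d, '0.1' ReLU, '1' Linear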

1.3 model.children()

model.children() returns only the direct submodules of the model (one level below the model itself); it does not recurse into nested submodules, and it does not include the model itself.

model_children = [m for m in model.children()]         
print('len(model_children)=',len(model_children))
print('model_children',model_children)

len(model_children)= 6
model_children [Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1)), Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1)), MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False), Linear(in_features=320, out_features=256, bias=True), Linear(in_features=256, out_features=128, bias=True), Linear(in_features=128, out_features=10, bias=True)]
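The difference from model.modules() only shows up once submodules are nested. A hypothetical sketch: children() stops at the inner Sequential container, while modules() also recurses into it:

# Contrast children() with modules() on a nested model
wrapped = torch.nn.Sequential(torch.nn.Conv2d(1, 10, kernel_size=(5, 5)), torch.nn.ReLU())
outer = torch.nn.Sequential(wrapped, torch.nn.Linear(10, 2))
print(len(list(outer.children())))   # 2: the inner Sequential and the Linear
print(len(list(outer.modules())))    # 5: outer itself plus every nested layer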

1.4 model.named_children()

model.named_children() is model.children() with each direct submodule's name attached.

model_named_children = [m for m in model.named_children()]  
print('len(model_named_children)=',len(model_named_children))
print('model_named_children',model_named_children)

len(model_named_children)= 6
model_named_children [('conv1', Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))), ('conv2', Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))), ('pooling', MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)), ('fc1', Linear(in_features=320, out_features=256, bias=True)), ('fc2', Linear(in_features=256, out_features=128, bias=True)), ('fc3', Linear(in_features=128, out_features=10, bias=True))]
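named_children() is handy when you want to treat direct submodules differently by name, e.g. freezing everything except the last fully connected layer. A minimal sketch using the layer names of the CNN above:

# Freeze every direct child except fc3
for name, child in model.named_children():
    if name != 'fc3':
        for p in child.parameters():
            p.requires_grad = False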

1.5 model.parameters()

model.parameters() iterates over all of the model's learnable parameters; layers without learnable parameters (such as ReLU, or the MaxPool2d layer here) contribute nothing.

model_parameters = [m for m in model.parameters()]                                                                         
print('len(model_parameters)=',len(model_parameters))
print('model_parameters',model_parameters)

The output is too long to reproduce here. For this model, len(model_parameters) is 10: a weight and a bias for each of the two Conv2d and three Linear layers.
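In practice, model.parameters() is most often passed straight to an optimizer; you can also inspect the shapes to see which layers contributed. A minimal sketch (the learning rate is arbitrary):

import torch.optim as optim

# The usual consumer of model.parameters() is the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Inspect shapes: each Conv2d/Linear contributes a weight and a bias
for p in model.parameters():
    print(p.shape)
# torch.Size([10, 1, 5, 5]), torch.Size([10]), torch.Size([20, 10, 5, 5]), ...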

1.6 model.named_parameters()

model.named_parameters() outputs model.parameters() with each parameter's name attached (e.g. conv1.weight, conv1.bias).

model_named_parameters = [m for m in model.named_parameters()]
print('len(model_named_parameters)=',len(model_named_parameters))
print('model_named_parameters',model_named_parameters)
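With the names attached, you can filter or inspect parameters directly, e.g. checking each parameter's shape and trainability. A minimal sketch:

# List each parameter's name, shape, and whether it is trainable
for name, p in model.named_parameters():
    print(name, tuple(p.shape), p.requires_grad)
# conv1.weight (10, 1, 5, 5) True
# conv1.bias (10,) True
# ...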

2. model.state_dict()

model.state_dict() retrieves all of the model's parameters, both learnable and non-learnable, as an ordered dictionary (OrderedDict). It contains every learnable parameter (weights and biases) and also non-learnable buffers such as a BN layer's running mean and running var. (The CNN above has no BN layers, so its state_dict holds only weights and biases.)

print('model.state_dict()',model.state_dict())
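The main practical use of state_dict() is checkpointing: save the dict with torch.save and restore it with load_state_dict. A minimal sketch (the file name is arbitrary):

# Save the parameters (not the whole model object) and load them back
torch.save(model.state_dict(), 'cnn_weights.pth')

model2 = CNN()
model2.load_state_dict(torch.load('cnn_weights.pth'))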

