Saving and Loading Models in PyTorch

Preface

This article is based on https://pytorch.org/tutorials/beginner/saving_loading_models.html#

SAVING AND LOADING MODELS

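When it comes to saving and loading models, there are three core functions to be familiar with:
1. torch.save: saves a serialized object to disk. This function uses Python's pickle utility for serialization. Models, tensors, and dictionaries of all kinds of objects can be saved with it.
2. torch.load: uses pickle's unpickling facilities to deserialize a pickled object file into memory.
3. torch.nn.Module.load_state_dict: loads a model's parameter dictionary from a deserialized state_dict.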
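
The three functions above fit together as a round trip. The following is a minimal sketch, assuming a tiny nn.Linear as a stand-in model and an in-memory buffer instead of a file on disk (both are choices made for illustration, not part of the original tutorial):

```python
import io

import torch
import torch.nn as nn

# A tiny stand-in model; any nn.Module is handled the same way.
model = nn.Linear(4, 2)

# torch.save accepts a path or a file-like object; a buffer avoids touching disk.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)

# torch.load deserializes the pickled object back into memory.
buffer.seek(0)
state_dict = torch.load(buffer)

# load_state_dict copies the deserialized tensors into a fresh model.
restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)

same = all(
    torch.equal(model.state_dict()[k], restored.state_dict()[k])
    for k in state_dict
)
print(same)
```

In real code the buffer would normally be replaced by a file path such as the PATH placeholder used throughout this article.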

1: WHAT IS A STATE_DICT

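In PyTorch, the learnable parameters (i.e. weights and biases) of a torch.nn.Module model are contained in the model's parameters (accessed with model.parameters()). A state_dict is simply a Python dictionary object that maps each layer to its parameter tensor.
Note that only layers with learnable parameters (convolutional layers, linear layers, etc.) and registered buffers (such as BatchNorm's running_mean) have entries in the model's state_dict. Optimizer objects (torch.optim) also have a state_dict, which contains information about the optimizer's state as well as the hyperparameters used.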
Example:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Define model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

# Initialize model
model = TheModelClass()

# Initialize optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

# Print model's state_dict
print("Model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, "\t", model.state_dict()[param_tensor].size())

# Print optimizer's state_dict
print("optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])

Output:

Model's state_dict:
conv1.weight     torch.Size([6, 3, 5, 5])
conv1.bias   torch.Size([6])
conv2.weight     torch.Size([16, 6, 5, 5])
conv2.bias   torch.Size([16])
fc1.weight   torch.Size([120, 400])
fc1.bias     torch.Size([120])
fc2.weight   torch.Size([84, 120])
fc2.bias     torch.Size([84])
fc3.weight   torch.Size([10, 84])
fc3.bias     torch.Size([10])
optimizer's state_dict:
state    {}
param_groups     [{'lr': 0.0001, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'params': [1310469552240, 1310469552384, 1310469552456, 1310469552528, 1310469552600, 1310469552672, 1310469552744, 1310469552816, 1310469552888, 1310469552960]}]
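
The optimizer's state entry is empty in the output above because no training step has been taken yet. As a rough sketch (using a small nn.Linear and random data, both assumed here purely for illustration), one optimizer step is enough for SGD with momentum to create its per-parameter buffers:

```python
import torch
import torch.nn as nn

model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

# Before any step, the optimizer holds no per-parameter state.
state_before = len(optimizer.state_dict()["state"])

# One dummy training step on random data so the optimizer creates its buffers.
loss = model(torch.randn(8, 3)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

state_after = optimizer.state_dict()["state"]
print(state_before, len(state_after))
print(sorted(state_after[0].keys()))
```

This is the state that would be lost if only the model's state_dict were saved in the middle of training.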

2: SAVING & LOADING MODEL FOR INFERENCE

Save/Load state_dict (Recommended)

  • Save:

       torch.save(model.state_dict(), PATH)
    

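When saving a model for inference, it is only necessary to save the trained model's learned parameters. A common PyTorch convention is to save models with a .pt or .pth file extension.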

  • Load:

     model = TheModelClass(*args, **kwargs)
     model.load_state_dict(torch.load(PATH))
     model.eval()
    

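Remember that you must call model.eval() to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do so will yield inconsistent inference results.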
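
The effect of model.eval() can be seen with a toy network. This is only a sketch, assuming a hypothetical two-layer nn.Sequential with a Dropout layer; the same input is passed twice in each mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(10, 100), nn.Dropout(p=0.5))
x = torch.ones(1, 10)

net.eval()  # dropout becomes a no-op, so repeated calls agree
with torch.no_grad():
    eval_a, eval_b = net(x), net(x)

net.train()  # dropout randomly zeroes activations again
with torch.no_grad():
    train_a, train_b = net(x), net(x)

print(torch.equal(eval_a, eval_b))
print(torch.equal(train_a, train_b))
```

In evaluation mode the two outputs match; in training mode each call draws a fresh dropout mask, which is the source of the inconsistent results mentioned above.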

Note:

Note that the load_state_dict() function takes a dictionary object, NOT a path to a saved object. This means you must deserialize the saved state_dict before passing it to load_state_dict().
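
A short sketch of the difference, assuming a temporary file named model.pt (a hypothetical name). The exact exception type raised for a path string may differ between PyTorch versions, so the example catches both likely candidates:

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)
path = os.path.join(tempfile.mkdtemp(), "model.pt")  # hypothetical file name
torch.save(model.state_dict(), path)

fresh = nn.Linear(4, 2)
try:
    fresh.load_state_dict(path)  # wrong: a path string, not a dictionary
    rejected = False
except (TypeError, AttributeError):
    rejected = True

fresh.load_state_dict(torch.load(path))  # right: deserialize first
print(rejected)
```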

Save/Load Entire Model

  • Save:

      torch.save(model, PATH)
    
  • Load:

      # Model class must be defined somewhere
      model = torch.load(PATH)
      model.eval()
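
Saving the entire model pickles the module object itself, so loading depends on the class definitions being importable. The sketch below uses an nn.Sequential built only from torch.nn classes (an assumption made so the example is self-contained) and passes weights_only=False, since recent PyTorch releases default torch.load to a restricted mode that refuses arbitrary pickled classes. Only load files from sources you trust this way.

```python
import io

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

buffer = io.BytesIO()
torch.save(model, buffer)  # pickles the whole module, not just its tensors

buffer.seek(0)
# Assumption: a PyTorch version whose torch.load accepts weights_only.
# Full-model loading has to opt out of the restricted unpickler explicitly.
loaded = torch.load(buffer, weights_only=False)
loaded.eval()
model.eval()

x = torch.randn(1, 4)
with torch.no_grad():
    identical = torch.equal(model(x), loaded(x))
print(identical)
```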
    

3: SAVING & LOADING A GENERAL CHECKPOINT FOR INFERENCE AND/OR RESUMING TRAINING

Save:

    torch.save({
        'epoch': epoch,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'loss': loss,
        ...
        }, PATH)

Load:

    model = TheModelClass(*args, **kwargs)
    optimizer = TheOptimizerClass(*args, **kwargs)

    checkpoint = torch.load(PATH)
    model.load_state_dict(checkpoint['model_state_dict'])
    optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
    epoch = checkpoint['epoch']
    loss = checkpoint['loss']

    model.eval()
    # - or -
    model.train()

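When saving a general checkpoint, to be used for either inference or resuming training, you must save more than just the model's state_dict. It is important to also save the optimizer's state_dict, as it contains buffers and parameters that are updated as the model trains. To save multiple components, organize them in a dictionary and serialize the dictionary with torch.save(). A common PyTorch convention is to save these checkpoints with the .tar file extension.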
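
A runnable version of the save and resume pattern might look like the sketch below. The nn.Linear model, the SGD settings, the epoch value, and the checkpoint.tar file name are all illustrative assumptions:

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# One dummy step so that there is real optimizer state to save.
loss = model(torch.randn(8, 3)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

path = os.path.join(tempfile.mkdtemp(), "checkpoint.tar")  # hypothetical file name
torch.save({
    "epoch": 5,  # illustrative value
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
    "loss": loss.item(),
}, path)

# Resume: rebuild the objects first, then restore their state.
resumed_model = nn.Linear(3, 1)
resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0.01, momentum=0.9)
checkpoint = torch.load(path)
resumed_model.load_state_dict(checkpoint["model_state_dict"])
resumed_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"] + 1

resumed_model.train()  # continue training; call .eval() instead for inference
print(start_epoch)
```

Storing the loss as a plain float via loss.item() keeps the checkpoint free of autograd history.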

4: SAVING & LOADING MODEL ACROSS DEVICES

Save on GPU, Load on CPU

  • Save:

      torch.save(model.state_dict(), PATH)
    
  • Load:

      device = torch.device('cpu')
      model = TheModelClass(*args, **kwargs)
      model.load_state_dict(torch.load(PATH, map_location=device))
    

Save on GPU, Load on GPU

  • Save:

      torch.save(model.state_dict(), PATH)
    
  • Load:

      device = torch.device("cuda")
      model = TheModelClass(*args, **kwargs)
      model.load_state_dict(torch.load(PATH))
      model.to(device)
      # Make sure to call input = input.to(device) on any input tensors that you feed to the model
    

Save on CPU, Load on GPU

  • Save:

      torch.save(model.state_dict(), PATH)
    
  • Load:

      device = torch.device("cuda")
      model = TheModelClass(*args, **kwargs)
      model.load_state_dict(torch.load(PATH, map_location="cuda:0"))  # Choose whatever GPU device number you want
      model.to(device)
      # Make sure to call input = input.to(device) on any input tensors that you feed to the model
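
The three device combinations above can often be folded into one device-agnostic pattern. The following is a sketch under the assumption that a single target device is chosen at run time with torch.cuda.is_available(); the nn.Linear model and the in-memory buffer are placeholders:

```python
import io

import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)

# map_location remaps the stored tensors onto the chosen device while loading.
buffer.seek(0)
state_dict = torch.load(buffer, map_location=device)

restored = nn.Linear(4, 2)
restored.load_state_dict(state_dict)
restored.to(device)

# Inputs must live on the same device as the model.
x = torch.randn(1, 4).to(device)
output = restored(x)
print(output.device.type)
```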
    

Saving torch.nn.DataParallel Models

  • Save:

      torch.save(model.module.state_dict(), PATH)
    
  • Load:

     # Load to whatever device you want
    

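torch.nn.DataParallel is a model wrapper that enables parallel GPU utilization. To save a DataParallel model generically, save model.module.state_dict(). This way, you have the flexibility to load the model any way you want, onto any device you want.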
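
The reason for saving model.module.state_dict() rather than model.state_dict() shows up in the key names. A small sketch, assuming an nn.Linear as the wrapped model:

```python
import torch.nn as nn

model = nn.Linear(4, 2)
wrapped = nn.DataParallel(model)  # the wrapper also constructs on a CPU-only machine

# The wrapper stores the real model under the attribute `module`,
# so its own state_dict prefixes every key with "module.".
wrapper_keys = list(wrapped.state_dict().keys())
module_keys = list(wrapped.module.state_dict().keys())

print(wrapper_keys)  # ['module.weight', 'module.bias']
print(module_keys)   # ['weight', 'bias']
```

A state_dict with the unprefixed keys loads directly into a plain, unwrapped model.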

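If the keys of a saved checkpoint carry the module. prefix (the model was saved through the DataParallel wrapper) and a multi-GPU checkpoint has to be loaded into a single-GPU model, the following code strips the prefix and reloads the parameters: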

    from collections import OrderedDict

    state_dict = torch.load("mobilenetV1X0.25_pretrain.tar")["state_dict"]
    # create a new OrderedDict whose keys do not contain `module.`
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        name = k[7:] if k.startswith("module.") else k  # remove `module.`
        new_state_dict[name] = v
    # load params; load_state_dict() does not return the model,
    # so do not write `model = model.load_state_dict(...)`
    model.load_state_dict(new_state_dict, strict=False)
    print(model)
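
As an alternative to the manual loop, PyTorch ships a helper that removes a key prefix in place. The sketch below assumes a PyTorch version that provides torch.nn.modules.utils.consume_prefix_in_state_dict_if_present, and it simulates the prefixed checkpoint with an nn.Linear instead of the MobileNet file used above:

```python
import torch
import torch.nn as nn
from torch.nn.modules.utils import consume_prefix_in_state_dict_if_present

# Simulate a checkpoint that was saved from the DataParallel wrapper itself.
source = nn.Linear(4, 2)
prefixed = nn.DataParallel(source).state_dict()

# Strips "module." from every key in place; keys without the prefix are untouched.
consume_prefix_in_state_dict_if_present(prefixed, "module.")

# With clean keys, the default strict=True loading succeeds.
target = nn.Linear(4, 2)
target.load_state_dict(prefixed)
print(sorted(prefixed.keys()))
```

Because the keys now match exactly, strict=False is not needed, and any genuinely missing or unexpected key would still be reported as an error.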