
PyTorch Basics (Part 1): A Complete Training Workflow Example


Most machine learning workflows involve working with data, creating a model, optimizing its parameters, and saving the trained model. This tutorial walks you through a complete ML workflow implemented in PyTorch, with links to learn more about each of these concepts.

We will train a neural network on the FashionMNIST dataset to predict which of the following classes an input image belongs to: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, or Ankle boot.

1. Working with Data

PyTorch has two primitives for working with data: torch.utils.data.DataLoader and torch.utils.data.Dataset. A Dataset stores the samples and their corresponding labels, while a DataLoader wraps an iterable around the Dataset.

The torchvision.datasets module contains Dataset objects for many real-world vision datasets, such as CIFAR and COCO. In this tutorial we use the FashionMNIST dataset. Every TorchVision dataset accepts two arguments, transform and target_transform, which modify the samples and the labels respectively.

import torch
from torchvision import datasets
from torchvision import transforms

# Download (if needed) and load the training and test splits
train_data = datasets.FashionMNIST(root="D:/datasets/DL/",
                                   train=True,
                                   download=True,
                                   transform=transforms.ToTensor())
test_data = datasets.FashionMNIST(root="D:/datasets/DL/",
                                  train=False,
                                  download=True,
                                  transform=transforms.ToTensor())

We pass the Dataset as an argument to DataLoader. This wraps an iterable around our dataset and supports automatic batching, sampling, shuffling, and multi-process data loading. Here we define a batch size of 64, i.e. each element yielded by the dataloader iterable is a batch of 64 features and labels.

from torch.utils.data import DataLoader

batch_size = 64
train_dataloader = DataLoader(train_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X,y in train_dataloader:
    print("Shape of X [N, C, H, W]: ", X.shape)
    print("Shape of y: ", y.shape, y.dtype)
    break
Shape of X [N, C, H, W]:  torch.Size([64, 1, 28, 28])
Shape of y:  torch.Size([64]) torch.int64
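Beyond the defaults used above, DataLoader also handles shuffling and partial final batches. A minimal sketch of the batching behavior, using a synthetic TensorDataset so it runs without downloading FashionMNIST:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for an image dataset: 100 samples of shape (1, 28, 28)
X = torch.randn(100, 1, 28, 28)
y = torch.randint(0, 10, (100,))
ds = TensorDataset(X, y)

# shuffle=True reshuffles the data every epoch; with drop_last=False (the
# default), the final batch is smaller when the size doesn't divide evenly
loader = DataLoader(ds, batch_size=32, shuffle=True, drop_last=False)

batch_sizes = [xb.shape[0] for xb, _ in loader]
print(batch_sizes)  # 100 samples in batches of 32 -> [32, 32, 32, 4]
```

Passing num_workers > 0 additionally enables multi-process data loading, which can hide I/O latency on larger datasets.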

2. Defining the Model

To define a neural network in PyTorch, we create a class that inherits from nn.Module. We define the network's layers in the __init__ method and specify how data flows through the network in the forward method. To accelerate operations in the neural network, we move it to the GPU if one is available.

The input is a 28×28 image; the output has 10 classes.

device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device {device}")

from torch import nn
class MyRSNN(nn.Module):
    def __init__(self, n_in=28*28, n_out=10):
        super(MyRSNN, self).__init__()
        self.flat_layer = nn.Flatten()
        self.n_hidden = 64
        self.network = nn.Sequential(
            nn.Linear(n_in, self.n_hidden),
            nn.ReLU(),
            nn.Linear(self.n_hidden, self.n_hidden),
            nn.ReLU(),
            nn.Linear(self.n_hidden, n_out)
        )
    def forward(self, X):
        return self.network(self.flat_layer(X))
Using device cuda
model = MyRSNN().to(device)
model
MyRSNN(
  (flat_layer): Flatten(start_dim=1, end_dim=-1)
  (network): Sequential(
    (0): Linear(in_features=784, out_features=64, bias=True)
    (1): ReLU()
    (2): Linear(in_features=64, out_features=64, bias=True)
    (3): ReLU()
    (4): Linear(in_features=64, out_features=10, bias=True)
  )
)
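As a sanity check on the printed architecture, we can count the trainable parameters. The sketch below rebuilds the same layer sizes (784 → 64 → 64 → 10) as a plain nn.Sequential:

```python
import torch
from torch import nn

# Same layer sizes as MyRSNN above: 784 -> 64 -> 64 -> 10
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Each Linear layer contributes weights (in*out) plus biases (out)
n_params = sum(p.numel() for p in net.parameters() if p.requires_grad)
print(n_params)  # (784*64 + 64) + (64*64 + 64) + (64*10 + 10) = 55050
```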

3. Loss Function, Optimizer, Training, and Evaluation

A loss function measures the gap between the model's predictions and the ground truth. An optimizer implements an algorithm that repeatedly adjusts the model's parameters to make that gap smaller and smaller.

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def train(model, dataloader, lf, opt):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        # compute loss
        y_pred = model(X)
        loss = lf(y_pred, y)
        # back propagation
        opt.zero_grad()
        loss.backward()
        opt.step()
        if batch % 200 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
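Note that nn.CrossEntropyLoss expects raw logits rather than probabilities: internally it combines a log-softmax with a negative log-likelihood loss. A quick sketch verifying that equivalence:

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw model outputs for a batch of 4
targets = torch.tensor([1, 0, 9, 3])  # true class indices

ce = nn.CrossEntropyLoss()(logits, targets)
# Equivalent computation: log-softmax over classes, then NLL loss
manual = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)
print(torch.allclose(ce, manual))  # True
```

This is why the model above ends in a plain Linear layer with no final softmax.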
def test(model, dataloader, lf):
    size = len(dataloader.dataset)
    batch_num = len(dataloader)
    model.eval()
    loss, acc = 0., 0.

    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            y_pred = model(X)
            loss += lf(y_pred, y).item()
            acc += (y_pred.argmax(1) == y).type(torch.float).sum().item()
    avg_loss = loss/batch_num
    avg_acc = acc/size
    print(f"Test Error: \n Accuracy: {100*avg_acc:>0.2f}%, Avg loss: {avg_loss:>8f} \n")

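The accuracy line in `test` counts how many argmax predictions match the labels. The same computation on a tiny hand-made batch:

```python
import torch

# Fake logits for 4 samples and 3 classes; argmax(1) picks each row's class
y_pred = torch.tensor([[0.1, 0.8, 0.1],
                       [0.9, 0.05, 0.05],
                       [0.2, 0.3, 0.5],
                       [0.6, 0.3, 0.1]])
y = torch.tensor([1, 0, 1, 0])  # true labels

correct = (y_pred.argmax(1) == y).type(torch.float).sum().item()
print(correct / len(y))  # 3 of 4 predictions match -> 0.75
```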
# Train for multiple epochs
epochs = 10
for e in range(epochs):
    print(f"------- Epoch {e} --------")
    train(model, train_dataloader, loss_fn, optimizer)
    test(model, test_dataloader, loss_fn)
print("Done!")
------- Epoch 0 --------
loss: 2.302572  [    0/60000]
loss: 2.278226  [12800/60000]
loss: 2.263920  [25600/60000]
loss: 2.279993  [38400/60000]
loss: 2.265715  [51200/60000]
Test Error: 
 Accuracy: 31.01%, Avg loss: 2.240282 

------- Epoch 1 --------
loss: 2.253209  [    0/60000]
loss: 2.209677  [12800/60000]
loss: 2.195682  [25600/60000]
loss: 2.215378  [38400/60000]
loss: 2.191957  [51200/60000]
Test Error: 
 Accuracy: 37.75%, Avg loss: 2.135047 

------- Epoch 2 --------
loss: 2.170743  [    0/60000]
loss: 2.075798  [12800/60000]
loss: 2.044513  [25600/60000]
loss: 2.065165  [38400/60000]
loss: 2.013370  [51200/60000]
Test Error: 
 Accuracy: 41.41%, Avg loss: 1.910970 

------- Epoch 3 --------
loss: 1.980009  [    0/60000]
loss: 1.803920  [12800/60000]
loss: 1.756796  [25600/60000]
loss: 1.774534  [38400/60000]
loss: 1.720578  [51200/60000]
Test Error: 
 Accuracy: 46.69%, Avg loss: 1.598630 

------- Epoch 4 --------
loss: 1.691669  [    0/60000]
loss: 1.478196  [12800/60000]
loss: 1.456628  [25600/60000]
loss: 1.474672  [38400/60000]
loss: 1.443763  [51200/60000]
Test Error: 
 Accuracy: 60.15%, Avg loss: 1.341288 

------- Epoch 5 --------
loss: 1.435102  [    0/60000]
loss: 1.234250  [12800/60000]
loss: 1.234853  [25600/60000]
loss: 1.256342  [38400/60000]
loss: 1.251099  [51200/60000]
Test Error: 
 Accuracy: 63.44%, Avg loss: 1.166800 

------- Epoch 6 --------
loss: 1.251537  [    0/60000]
loss: 1.064981  [12800/60000]
loss: 1.084116  [25600/60000]
loss: 1.118475  [38400/60000]
loss: 1.128412  [51200/60000]
Test Error: 
 Accuracy: 65.01%, Avg loss: 1.051667 

------- Epoch 7 --------
loss: 1.125720  [    0/60000]
loss: 0.946272  [12800/60000]
loss: 0.984997  [25600/60000]
loss: 1.029155  [38400/60000]
loss: 1.044023  [51200/60000]
Test Error: 
 Accuracy: 66.17%, Avg loss: 0.971739 

------- Epoch 8 --------
loss: 1.034441  [    0/60000]
loss: 0.859551  [12800/60000]
loss: 0.916106  [25600/60000]
loss: 0.967752  [38400/60000]
loss: 0.982405  [51200/60000]
Test Error: 
 Accuracy: 67.27%, Avg loss: 0.913043 

------- Epoch 9 --------
loss: 0.964701  [    0/60000]
loss: 0.793666  [12800/60000]
loss: 0.865933  [25600/60000]
loss: 0.922777  [38400/60000]
loss: 0.935715  [51200/60000]
Test Error: 
 Accuracy: 68.13%, Avg loss: 0.867866 

Done!

4. Saving, Loading, and Inference

Typically, we save the model's parameters to disk so they can be loaded later for inference.

model_path = "C01_10.pth"
torch.save(model.state_dict(), model_path)
print("Model parameters saved!")
Model parameters saved!
model_2 = MyRSNN()
model_2.load_state_dict(torch.load("C01_10.pth", map_location="cpu"))
<All keys matched successfully>
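Because the state dict was saved from a CUDA model, passing map_location="cpu" lets it load on a machine without a GPU. A standalone sketch of the save/load round trip (using a small stand-in model and a hypothetical temp-file path):

```python
import os
import tempfile

import torch
from torch import nn

# Small stand-in model; the same pattern applies to MyRSNN
net = nn.Linear(4, 2)
path = os.path.join(tempfile.gettempdir(), "demo_state.pth")  # hypothetical path
torch.save(net.state_dict(), path)

# map_location="cpu" remaps tensors saved on any device onto the CPU
net2 = nn.Linear(4, 2)
net2.load_state_dict(torch.load(path, map_location="cpu"))
print(torch.equal(net.weight, net2.weight))  # True: weights round-trip intact
```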
def predict(sample_idx=0):
    classes = [
        "T-shirt/top",
        "Trouser",
        "Pullover",
        "Dress",
        "Coat",
        "Sandal",
        "Shirt",
        "Sneaker",
        "Bag",
        "Ankle boot",
    ]
    model_2.eval()
    X, y = test_data[sample_idx][0], test_data[sample_idx][1]
    print("Ground Truth: ", classes[y])
    with torch.no_grad():
        y_pred = model_2(X).flatten()
    print("Model prediction: ", classes[y_pred.argmax()])
predict(0)
Ground Truth:  Ankle boot
Model prediction:  Ankle boot
predict(190)
Ground Truth:  Trouser
Model prediction:  Dress
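The model outputs raw logits, and predict simply takes their argmax. If class probabilities are wanted instead, apply a softmax, as in this small sketch with hypothetical 3-class logits:

```python
import torch

logits = torch.tensor([0.5, 2.0, 0.1])  # hypothetical 3-class model output
probs = torch.softmax(logits, dim=0)    # exponentiate and normalize to sum to 1
print(probs.sum().item())               # 1.0 (up to floating-point error)
print(probs.argmax().item())            # 1 -- softmax preserves the argmax
```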
