
PyTorch Implementation of Handwritten Digit Recognition (MNIST Dataset)

1. Loading the Dataset

A quick tip for getting started: run the experiments on Google's cloud Jupyter (Colab), where training is remarkably fast.
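Since the code below relies on `.cuda()`, it is worth confirming first that the runtime actually has a GPU. A minimal check (assuming a Colab GPU runtime):

```python
import torch
print(torch.cuda.is_available())  # should print True on a GPU runtime
```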

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
from torchvision import datasets, transforms

# Normalize with the MNIST mean (0.1307) and standard deviation (0.3081)
transformation = transforms.Compose([transforms.ToTensor(),
                                     transforms.Normalize((0.1307,), (0.3081,))])
# 'data/' is the directory the dataset is downloaded to;
# transformation is applied to every sample
train_dataset = datasets.MNIST('data/', train=True, transform=transformation, download=True)
test_dataset = datasets.MNIST('data/', train=False, transform=transformation, download=True)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=32, shuffle=True)
```
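To confirm the loaders are set up correctly, here is a quick sanity check of the dataset sizes and batch shapes (a small sketch, not part of the original code):

```python
images, labels = next(iter(train_loader))
print(len(train_dataset), len(test_dataset))  # 60000 10000
print(images.shape, labels.shape)             # torch.Size([32, 1, 28, 28]) torch.Size([32])
```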

2. Displaying an Image

```python
import matplotlib.pyplot as plt

# Treat the DataLoader as an iterator and read one batch
simple_data = next(iter(train_loader))

def plot_img(image):
    image = image.numpy()[0]
    mean = 0.1307
    std = 0.3081
    # Undo the normalization before displaying
    image = (image * std) + mean
    plt.imshow(image, cmap='gray')

plot_img(simple_data[0][3])
```
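The labels for the batch live in `simple_data[1]`, so you can check which digit was plotted (a one-line sketch):

```python
print(simple_data[1][3])  # the label of the image plotted above
```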

3. Building the Network Model

```python
import torch.nn.functional as F

class Mnist_Net(nn.Module):
    def __init__(self):
        super(Mnist_Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        # 320 follows from the convolutions: 4*4*20 (4*4 spatial size, 20 channels)
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        # x = F.dropout(x, p=0.1, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
```
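Where does the 320 come from? Each 5x5 convolution shrinks a 28x28 image by 4 pixels per side and each max-pool halves it: 28 → 24 → 12 → 8 → 4, leaving 20 channels of 4x4 features, i.e. 20*4*4 = 320. A quick shape trace (a sketch) confirms this:

```python
dummy = torch.zeros(1, 1, 28, 28)
net = Mnist_Net()
feat = F.max_pool2d(net.conv1(dummy), 2)
print(feat.shape)  # torch.Size([1, 10, 12, 12])
feat = F.max_pool2d(net.conv2(feat), 2)
print(feat.shape)  # torch.Size([1, 20, 4, 4]) -> flattened: 320
```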

log_softmax is mathematically equivalent to log(softmax(x)).

Every probability produced by softmax lies in (0, 1), so some probabilities can become so small that they underflow. The distribution is going to be fed into a cross-entropy loss anyway, and cross-entropy wraps the probabilities in a -log for the likelihood; applying the log directly when computing the distribution maps the values from (0, 1) to (-∞, 0) and prevents intermediate underflow. In short, log_softmax pulls the log that cross-entropy would otherwise apply forward into the computation of the predicted distribution, skipping the intermediate storage step, avoiding numerical underflow, and making the computation more stable.

nll_loss (negative log likelihood loss): the maximum-likelihood / log-likelihood cost function.
CrossEntropyLoss: the cross-entropy loss function. Cross-entropy measures the distance between two probability distributions; the smaller it is, the closer the two distributions are.
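The relationship described above can be checked directly: `F.cross_entropy` is equivalent to `F.nll_loss` applied on top of `F.log_softmax`. A small verification sketch (the tensors are made up for illustration):

```python
x = torch.randn(4, 10)          # a batch of 4 logit vectors
t = torch.tensor([3, 0, 9, 1])  # arbitrary target classes
a = F.cross_entropy(x, t)
b = F.nll_loss(F.log_softmax(x, dim=1), t)
print(torch.allclose(a, b))     # True
```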

```python
model = Mnist_Net()
model = model.cuda()  # use the GPU to speed up training
optimizer = optim.SGD(model.parameters(), lr=0.01)  # optimizer
```

Evaluating `model` in a cell prints the structure:

```
Mnist_Net(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
  (conv2_drop): Dropout2d(p=0.5, inplace=False)
  (fc1): Linear(in_features=320, out_features=50, bias=True)
  (fc2): Linear(in_features=50, out_features=10, bias=True)
)
```

4. Training/Validation Function

```python
def fit(epoch, model, data_loader, phase='training', volatile=False):
    if phase == 'training':      # are we training or validating?
        model.train()
    if phase == 'validation':
        model.eval()
        volatile = True
    running_loss = 0.0
    running_correct = 0
    for batch_idx, (data, target) in enumerate(data_loader):  # fetch a batch
        data, target = data.cuda(), target.cuda()             # move it to the GPU
        data, target = Variable(data, volatile), Variable(target)
        if phase == 'training':
            optimizer.zero_grad()          # reset the gradients
        output = model(data)               # forward pass
        loss = F.nll_loss(output, target)  # batch loss
        # accumulate the summed loss over the whole epoch
        # (size_average=False is the older API; newer PyTorch uses reduction='sum')
        running_loss += F.nll_loss(output, target, size_average=False).item()
        preds = output.data.max(dim=1, keepdim=True)[1]  # turn probabilities into class indices
        running_correct += preds.eq(target.data.view_as(preds)).cpu().sum()
        if phase == 'training':
            loss.backward()
            optimizer.step()
    loss = running_loss / len(data_loader.dataset)
    accuracy = 100. * running_correct / len(data_loader.dataset)
    print(f'{phase} loss is {loss:{5}.{2}} and {phase} accuracy is {running_correct}/{len(data_loader.dataset)}{accuracy:{10}.{4}}')
    return loss, accuracy
```

5. Training/Validation and Visualization

```python
train_losses, train_accuracy = [], []
val_losses, val_accuracy = [], []
for epoch in range(1, 40):
    epoch_loss, epoch_accuracy = fit(epoch, model, train_loader, phase='training')
    val_epoch_loss, val_epoch_accuracy = fit(epoch, model, test_loader, phase='validation')
    train_losses.append(epoch_loss)
    train_accuracy.append(epoch_accuracy)
    val_losses.append(val_epoch_loss)
    val_accuracy.append(val_epoch_accuracy)
```

Result: accuracy reaches roughly 99% on both the training and validation sets.

```
training loss is 0.042 and training accuracy is 59206/60000 98.68
validation loss is 0.027 and validation accuracy is 9907/10000 99.07
training loss is 0.041 and training accuracy is 59229/60000 98.71
validation loss is 0.029 and validation accuracy is 9910/10000 99.1
training loss is 0.039 and training accuracy is 59271/60000 98.79
validation loss is 0.029 and validation accuracy is 9908/10000 99.08
training loss is 0.04 and training accuracy is 59261/60000 98.77
validation loss is 0.026 and validation accuracy is 9911/10000 99.11
training loss is 0.041 and training accuracy is 59244/60000 98.74
validation loss is 0.026 and validation accuracy is 9913/10000 99.13
training loss is 0.038 and training accuracy is 59287/60000 98.81
validation loss is 0.029 and validation accuracy is 9906/10000 99.06
```

Visualizing the loss:

```python
plt.plot(range(1, len(train_losses) + 1), train_losses, 'bo', label='training loss')
plt.plot(range(1, len(val_losses) + 1), val_losses, 'r', label='validation loss')
plt.legend()
```

Visualizing the accuracy:

```python
plt.plot(range(1, len(train_accuracy) + 1), train_accuracy, 'bo', label='train accuracy')
plt.plot(range(1, len(val_accuracy) + 1), val_accuracy, 'r', label='val accuracy')
plt.legend()
```

 

6. Training with a Pretrained torchvision Model

Load the model (resnet18 is used here; note that part of the model structure has to be changed to fit the characteristics of your data).

 

```python
from torchvision import models

transfer_model = models.resnet18(pretrained=True)
# Change the first conv layer to take 1 input channel, since MNIST images are (1, 28, 28)
transfer_model.conv1 = nn.Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
# Replace the output layer so it produces 10 classes
dim_in = transfer_model.fc.in_features
transfer_model.fc = nn.Linear(dim_in, 10)
```
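Before training, it is worth verifying that the modified network really accepts MNIST-shaped input and emits 10 logits. A sanity-check sketch (this assumes a torchvision version whose ResNet uses adaptive average pooling, so 28x28 inputs are accepted; eval mode avoids BatchNorm complaints about a batch of one):

```python
transfer_model.eval()
with torch.no_grad():
    out = transfer_model(torch.zeros(1, 1, 28, 28))
print(out.shape)  # expected: torch.Size([1, 10])
```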

Training the model:

```python
# loss function
criteon = nn.CrossEntropyLoss()
optimizer = optim.SGD(transfer_model.parameters(), lr=0.01)
transfer_model = transfer_model.cuda()
train_losses, train_accuracy = [], []
val_losses, val_accuracy = [], []
for epoch in range(10):
    transfer_model.train()
    running_loss = 0.0
    running_correct = 0
    for batch_idx, (x, target) in enumerate(train_loader):
        x, target = x.cuda(), target.cuda()
        x, target = Variable(x), Variable(target)
        logits = transfer_model(x)      # raw predictions (logits)
        loss = criteon(logits, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        preds = logits.data.max(dim=1, keepdim=True)[1]
        running_correct += preds.eq(target.data.view_as(preds)).cpu().sum()
    train_loss = running_loss / len(train_loader.dataset)
    # integer tensor division: this triggers the UserWarning in the log below
    train_acc = 100 * running_correct / len(train_loader.dataset)
    train_losses.append(train_loss)
    train_accuracy.append(train_acc)
    print('epoch:{},train loss is{},train_acc is {}'.format(epoch, train_loss, train_acc))
    # evaluate the model on the test set
    test_loss = 0.0
    test_acc_num = 0
    transfer_model.eval()
    for data, target in test_loader:
        data, target = data.cuda(), target.cuda()
        data, target = Variable(data), Variable(target)
        logits = transfer_model(data)
        test_loss += criteon(logits, target).item()
        _, pred = torch.max(logits, 1)
        test_acc_num += pred.eq(target).float().sum().item()
    test_los = test_loss / len(test_loader.dataset)
    test_acc = test_acc_num / len(test_loader.dataset)
    val_losses.append(test_los)
    val_accuracy.append(test_acc)
    print("epoch:{} total loss:{},acc:{}".format(epoch, test_los, test_acc))
```

Training log:

```
epoch:0,train loss is0.006211824892895917,train_acc is 93
/pytorch/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
epoch:0 total loss:0.0015338615737855435,acc:0.9845
epoch:1,train loss is0.001937273425484697,train_acc is 98
epoch:1 total loss:0.0012044131346046925,acc:0.9875
epoch:2,train loss is0.0012458768549064795,train_acc is 98
epoch:2 total loss:0.0010627412386238575,acc:0.9888
epoch:3,train loss is0.0009563570257897178,train_acc is 99
epoch:3 total loss:0.0010410105541348456,acc:0.9895
epoch:4,train loss is0.0006897176718960205,train_acc is 99
epoch:4 total loss:0.000987174815684557,acc:0.9897
epoch:5,train loss is0.0005422685200969378,train_acc is 99
epoch:5 total loss:0.0009278423339128494,acc:0.9905
epoch:6,train loss is0.0004471833350757758,train_acc is 99
epoch:6 total loss:0.0008359558276832104,acc:0.9921
epoch:7,train loss is0.00036988445594906806,train_acc is 99
epoch:7 total loss:0.0007887807078659535,acc:0.9921
epoch:8,train loss is0.00031936961747705935,train_acc is 99
epoch:8 total loss:0.0009252857074141502,acc:0.9909
epoch:9,train loss is0.00030207459069788454,train_acc is 99
epoch:9 total loss:0.0008598978526890278,acc:0.9921
```

As the log shows, the pretrained model reaches very good accuracy very quickly.
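About the UserWarning in the log: it comes from `100 * running_correct / len(train_loader.dataset)`, where `running_correct` is an integer tensor and `/` on integer tensors was being deprecated. A possible fix (a sketch, not in the original code) is to convert to a Python number before dividing:

```python
train_acc = 100. * running_correct.item() / len(train_loader.dataset)
```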

Visualization
