LSTM on the handwritten-digit dataset (a classification problem)
LSTM predicting cos from sin (a regression problem)
input_size – the feature dimension of the input. In word embeddings this is the length of the one-hot/embedding vector: if each word is represented by a 300-dimensional vector, input_size is 300; if the input is an image whose width is 28, input_size is 28.
hidden_size – the feature dimension of the hidden state; you can set this freely.
num_layers – the number of stacked layers (not to be confused with unrolling over time steps); usually 1 or 2.
bias – if False, the LSTM does not use the bias terms $b_{ih}, b_{hh}$; default True.
batch_first – if True, the input and output tensors have shape (batch, time_step, input_size); otherwise (time_step, batch, input_size).
dropout – if nonzero, applies dropout to the outputs of each layer except the last.
bidirectional – if True, the LSTM becomes bidirectional; default False.
time_step: the length of one sentence, i.e. the number of words it contains.
batch_size: the number of samples fed to the RNN per batch.
A minimal construction example is shown right after this list.
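The following is a minimal sketch that constructs an nn.LSTM with the parameters above; the sizes (300-dim word vectors, 5 words per sentence, batch of 4) are illustrative assumptions, not values from a specific model:

import torch
import torch.nn as nn

lstm = nn.LSTM(
    input_size=300,    # feature dimension of each time step
    hidden_size=64,    # dimension of the hidden state, chosen freely
    num_layers=1,      # stacked layers, not time-step unrolling
    bias=True,         # use the bias terms b_ih and b_hh
    batch_first=True,  # tensors are (batch, time_step, input_size)
    dropout=0.0,       # no dropout between layers
    bidirectional=False,
)
x = torch.randn(4, 5, 300)    # (batch, time_step, input_size)
output, (h_n, c_n) = lstm(x)  # initial (h_0, c_0) defaults to zeros
print(output.shape)           # torch.Size([4, 5, 64])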
The following shapes come from the nn.LSTM interface.
The LSTM input is input, (h_0, c_0):
input: (time_step, batch, input_size); if batch_first is set, batch is the first dimension.
(h_0, c_0) is the initial hidden state:
h_0 shape: (num_layers * num_directions, batch, hidden_size)
c_0 shape: (num_layers * num_directions, batch, hidden_size)
The LSTM output is output, (h_n, c_n):
output: (time_step, batch, hidden_size * num_directions), containing the output features at every time step; if batch_first is set, batch is the first dimension.
(h_n, c_n) is the final hidden state:
h_n shape: (num_layers * num_directions, batch, hidden_size)
c_n shape: (num_layers * num_directions, batch, hidden_size)
In some documentation, time_step and seq_len both denote the time-step dimension. The sketch below checks these shapes.
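A minimal sketch verifying these shapes, assuming a 2-layer bidirectional LSTM so that num_directions = 2 (all sizes here are arbitrary examples):

import torch
import torch.nn as nn

time_step, batch, input_size, hidden_size = 7, 3, 10, 16
num_layers, num_directions = 2, 2  # bidirectional=True gives num_directions = 2

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)
x = torch.randn(time_step, batch, input_size)  # batch_first=False here
h0 = torch.zeros(num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(num_layers * num_directions, batch, hidden_size)

output, (h_n, c_n) = lstm(x, (h0, c0))
print(output.shape)  # torch.Size([7, 3, 32]) = (time_step, batch, hidden_size * num_directions)
print(h_n.shape)     # torch.Size([4, 3, 16]) = (num_layers * num_directions, batch, hidden_size)
print(c_n.shape)     # torch.Size([4, 3, 16])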
class RNN(nn.Module):
    def __init__(self):
        super(RNN, self).__init__()
        self.rnn = nn.LSTM(
            input_size=INPUT_SIZE,
            hidden_size=64,
            num_layers=1,
            batch_first=True,  # (batch, time_step, input_size)
        )
        self.out = nn.Linear(64, 10)  # Linear(hidden_size, number of classes)

    def forward(self, x):
        # x shape (batch, time_step, input_size)
        # r_out shape (batch, time_step, hidden_size)
        # h_n shape (n_layers, batch, hidden_size)
        # h_c shape (n_layers, batch, hidden_size)
        r_out, (h_n, h_c) = self.rnn(x, None)  # None: initial state defaults to zeros
        out = self.out(r_out[:, -1, :])        # classify from the last time step only
        return out
The line out = self.out(r_out[:, -1, :]) takes the output at the last time step as the input to the Linear layer. At each step the LSTM consumes x_t and h_{t-1} and produces an output together with the state (h_n, h_c); that state is fed back into the LSTM to produce the next step's output. The last step's output therefore summarizes the whole sequence, which is why it alone is passed to the classifier, as the sketch below shows.
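A minimal sketch (with an assumed INPUT_SIZE = 28, matching one MNIST image row) showing that for this single-layer, unidirectional model the last-step output r_out[:, -1, :] equals the final hidden state h_n[-1]:

import torch
import torch.nn as nn

INPUT_SIZE = 28  # assumed: one 28-pixel image row per time step
rnn = nn.LSTM(input_size=INPUT_SIZE, hidden_size=64, num_layers=1, batch_first=True)
x = torch.randn(4, 28, INPUT_SIZE)  # (batch, time_step, input_size)

r_out, (h_n, h_c) = rnn(x, None)
print(torch.allclose(r_out[:, -1, :], h_n[-1]))  # True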
The complete, cleaned-up code is posted below.
The CNN code for the handwritten-digit dataset is as follows:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import torch.optim as optim

BATCH_SIZE = 512
EPOCHS = 20
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
train_data = datasets.MNIST(root='./mnist', download=True, train=True, transform=transform)
test_data = datasets.MNIST(root='./mnist', download=True, train=False, transform=transform)
train_loader = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=True)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=10, kernel_size=5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.conv2 = nn.Sequential(
            nn.Conv2d(in_channels=10, out_channels=20, kernel_size=3),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.out = nn.Linear(20 * 5 * 5, 10)  # 20 channels of 5x5 feature maps (see below)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 20 * 5 * 5)
        output = self.out(x)
        return output

model = Net().to(DEVICE)  # instantiate the network, then move it to the GPU if available
optimizer = optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()

def train(model, device, train_loader, optimizer, epoch, criterion):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        output = model(data)
        optimizer.zero_grad()
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        if (batch_idx + 1) % 30 == 0:
            # len(train_loader) is len(train_loader.dataset) divided by the batch size
            print('Train Epoch:{} [{}/{} ({:.0f}%)]\tLoss:{:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()
            ))

def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    test_corr = 0
    with torch.no_grad():
        for img, label in test_loader:
            img, label = img.to(device), label.to(device)
            output = model(img)
            # note: this sums per-batch mean losses and is not divided by the dataset size
            test_loss += criterion(output, label).item()
            pred = output.max(1, keepdim=True)[1]
            test_corr += pred.eq(label.view_as(pred)).sum().item()
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, test_corr, len(test_loader.dataset),
        100. * (test_corr / len(test_loader.dataset))
    ))

for epoch in range(1, EPOCHS + 1):
    train(model, DEVICE, train_loader, optimizer, epoch, criterion)
    test(model, DEVICE, test_loader)
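The in_features of the final Linear layer, 20 * 5 * 5, follows from the feature-map sizes: a 28x28 input becomes 24x24 after the 5x5 conv, 12x12 after pooling, 10x10 after the 3x3 conv, and 5x5 after the second pooling, with 20 output channels. A minimal sketch verifying this arithmetic with a dummy tensor:

import torch
from torch import nn

features = nn.Sequential(
    nn.Conv2d(1, 10, kernel_size=5),   # 28x28 -> 24x24
    nn.ReLU(),
    nn.MaxPool2d(2),                   # 24x24 -> 12x12
    nn.Conv2d(10, 20, kernel_size=3),  # 12x12 -> 10x10
    nn.ReLU(),
    nn.MaxPool2d(2),                   # 10x10 -> 5x5
)
x = torch.randn(1, 1, 28, 28)  # one dummy MNIST-sized image
print(features(x).shape)       # torch.Size([1, 20, 5, 5]), so flatten to 20 * 5 * 5 = 500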
The training results are as follows:
Train Epoch:1 [14848/60000 (25%)] Loss:0.809119
Train Epoch:1 [30208/60000 (50%)] Loss:0.332066
Train Epoch:1 [45568/60000 (75%)] Loss:0.248601
Test set: Average loss: 3.3879, Accuracy: 9515/10000 (95%)
Train Epoch:2 [14848/60000 (25%)] Loss:0.200926
Train Epoch:2 [30208/60000 (50%)] Loss:0.167642
Train Epoch:2 [45568/60000 (75%)] Loss:0.129635
Test set: Average loss: 1.9960, Accuracy: 9700/10000 (97%)
Train Epoch:3 [14848/60000 (25%)] Loss:0.097073
Train Epoch:3 [30208/60000 (50%)] Loss:0.078300
Train Epoch:3 [45568/60000 (75%)] Loss:0.095262
Test set: Average loss: 1.5412, Accuracy: 9764/10000 (98%)
Train Epoch:4 [14848/60000 (25%)] Loss:0.067570
Train Epoch:4 [30208/60000 (50%)] Loss:0.091387
Train Epoch:4 [45568/60000 (75%)] Loss:0.058170
Test set: Average loss: 1.3722, Accuracy: 9795/10000 (98%)
Train Epoch:5 [14848/60000 (25%)] Loss:0.081385
Train Epoch:5 [30208/60000 (50%)] Loss:0.069488
Train Epoch:5 [45568/60000 (75%)] Loss:0.108909
Test set: Average loss: 1.1676, Accuracy: 9818/10000 (98%)
Train Epoch:6 [14848/60000 (25%)] Loss:0.060494
Train Epoch:6 [30208/60000 (50%)] Loss:0.070833
Train Epoch:6 [45568/60000 (75%)] Loss:0.085588
Test set: Average loss: 1.0887, Accuracy: 9833/10000 (98%)
Train Epoch:7 [14848/60000 (25%)] Loss:0.067081
Train Epoch:7 [30208/60000 (50%)] Loss:0.082414
Train Epoch:7 [45568/60000 (75%)] Loss:0.045014
Test set: Average loss: 1.0601, Accuracy: 9837/10000 (98%)
Train Epoch:8 [14848/60000 (25%)] Loss:0.062390
Train Epoch:8 [30208/60000 (50%)] Loss:0.048241
Train Epoch:8 [45568/60000 (75%)] Loss:0.042879
Test set: Average loss: 0.9528, Accuracy: 9836/10000 (98%)
Train Epoch:9 [14848/60000 (25%)] Loss:0.048539
Train Epoch:9 [30208/60000 (50%)] Loss:0.055073
Train Epoch:9 [45568/60000 (75%)] Loss:0.055796
Test set: Average loss: 0.8623, Accuracy: 9866/10000 (99%)
Train Epoch:10 [14848/60000 (25%)] Loss:0.051431
Train Epoch:10 [30208/60000 (50%)] Loss:0.045435
Train Epoch:10 [45568/60000 (75%)] Loss:0.075674
Test set: Average loss: 0.7783, Accuracy: 9874/10000 (99%)
Train Epoch:11 [14848/60000 (25%)] Loss:0.028392
Train Epoch:11 [30208/60000 (50%)] Loss:0.049267
Train Epoch:11 [45568/60000 (75%)] Loss:0.042472
Test set: Average loss: 0.8189, Accuracy: 9875/10000 (99%)
Train Epoch:12 [14848/60000 (25%)] Loss:0.058731
Train Epoch:12 [30208/60000 (50%)] Loss:0.025470
Train Epoch:12 [45568/60000 (75%)] Loss:0.029647
Test set: Average loss: 0.7829, Accuracy: 9871/10000 (99%)
Train Epoch:13 [14848/60000 (25%)] Loss:0.052567
Train Epoch:13 [30208/60000 (50%)] Loss:0.028609
Train Epoch:13 [45568/60000 (75%)] Loss:0.020649
Test set: Average loss: 0.7527, Accuracy: 9872/10000 (99%)
Train Epoch:14 [14848/60000 (25%)] Loss:0.039200
Train Epoch:14 [30208/60000 (50%)] Loss:0.019106
Train Epoch:14 [45568/60000 (75%)] Loss:0.067107
Test set: Average loss: 0.7386, Accuracy: 9886/10000 (99%)
Train Epoch:15 [14848/60000 (25%)] Loss:0.038181
Train Epoch:15 [30208/60000 (50%)] Loss:0.022419
Train Epoch:15 [45568/60000 (75%)] Loss:0.016036
Test set: Average loss: 0.7954, Accuracy: 9862/10000 (99%)
Train Epoch:16 [14848/60000 (25%)] Loss:0.018675
Train Epoch:16 [30208/60000 (50%)] Loss:0.039494
Train Epoch:16 [45568/60000 (75%)] Loss:0.017992
Test set: Average loss: 0.8029, Accuracy: 9859/10000 (99%)
Train Epoch:17 [14848/60000 (25%)] Loss:0.019442
Train Epoch:17 [30208/60000 (50%)] Loss:0.014947
Train Epoch:17 [45568/60000 (75%)] Loss:0.024432
Test set: Average loss: 0.6863, Accuracy: 9874/10000 (99%)
Train Epoch:18 [14848/60000 (25%)] Loss:0.013267
Train Epoch:18 [30208/60000 (50%)] Loss:0.022075
Train Epoch:18 [45568/60000 (75%)] Loss:0.024906
Test set: Average loss: 0.6707, Accuracy: 9887/10000 (99%)
Train Epoch:19 [14848/60000 (25%)] Loss:0.031900
Train Epoch:19 [30208/60000 (50%)] Loss:0.014791
Train Epoch:19 [45568/60000 (75%)] Loss:0.037303
Test set: Average loss: 0.7329, Accuracy: 9878/10000 (99%)
Train Epoch:20 [14848/60000 (25%)] Loss:0.030795
Train Epoch:20 [30208/60000 (50%)] Loss:0.016112
Train Epoch:20 [45568/60000 (75%)] Loss:0.020148
Test set: Average loss: 0.6894, Accuracy: 9884/10000 (99%)
The RNN code is as follows:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

BATCH_SIZE = 512
INPUT_SIZE = 28   # each image row (28 pixels) is one time step's input
TIME_STEP = 28    # 28 rows per image
HIDDEN_SIZE = 32
NUM_LAYER = 1
EPOCHS = 20

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])
train_data = datasets.MNIST(root='./mnist', download=True, train=True, transform=transform)
test_data = datasets.MNIST(root='./mnist', download=True, train=False, transform=transform)
train_loader = DataLoader(dataset=train_data, shuffle=True, batch_size=BATCH_SIZE)
test_loader = DataLoader(dataset=test_data, shuffle=True, batch_size=BATCH_SIZE)

class NET(nn.Module):
    def __init__(self):
        super(NET, self).__init__()
        self.rnn = nn.LSTM(input_size=INPUT_SIZE, hidden_size=HIDDEN_SIZE,
                           num_layers=NUM_LAYER, batch_first=True)
        # the classifier sees only the last time step's hidden state, so in_features = HIDDEN_SIZE
        self.out = nn.Linear(HIDDEN_SIZE, 10)

    def forward(self, x):
        # x shape: (batch_size, time_step, input_size)
        # r_out shape: (batch_size, time_step, hidden_size)
        # h_n, h_c shape: (num_layers * num_directions, batch_size, hidden_size)
        r_out, (h_n, h_c) = self.rnn(x, None)
        output = self.out(r_out[:, -1, :])
        return output

net = NET()
print(net)
optimizer = optim.Adam(net.parameters())
criterion = nn.CrossEntropyLoss()

def train(net, epoch, train_loader):
    net.train()
    for batch_idx, (data, label) in enumerate(train_loader):
        data = data.view(-1, 28, 28)  # (batch, 1, 28, 28) -> (batch, time_step, input_size)
        output = net(data)
        optimizer.zero_grad()
        loss = criterion(output, label)
        loss.backward()
        optimizer.step()
        if (batch_idx + 1) % 30 == 0:
            print('Train Epoch:{} [{}/{}]({:.0f})%\tLOSS:{:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * (batch_idx * len(data) / len(train_loader.dataset)),
                loss.item()
            ))

def test(net, test_loader):
    net.eval()
    test_loss = 0
    test_correct = 0
    with torch.no_grad():
        for data, label in test_loader:
            data = data.view(-1, 28, 28)
            output = net(data)
            test_loss += criterion(output, label).item()
            pred = output.max(1, keepdim=True)[1]
            test_correct += pred.eq(label.view_as(pred)).sum().item()
    # divide the summed per-batch losses by the dataset size
    test_loss = test_loss / len(test_loader.dataset)
    print("\nTest set:Average loss:{:.5f},Accuracy:{}/{}({:.0f}%)\n".format(
        test_loss, test_correct, len(test_loader.dataset),
        100. * (test_correct / len(test_loader.dataset))
    ))

for epoch in range(1, EPOCHS + 1):
    train(net=net, train_loader=train_loader, epoch=epoch)
    test(net=net, test_loader=test_loader)
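Because batch_first=True, each MNIST image is treated as a sequence of 28 time steps, one 28-pixel row per step. A minimal sketch of the reshape the train/test loops perform:

import torch

batch = torch.randn(512, 1, 28, 28)  # a batch of MNIST images: (batch, channel, height, width)
seq = batch.view(-1, 28, 28)         # drop the channel dim: (batch, time_step=28, input_size=28)
print(seq.shape)                     # torch.Size([512, 28, 28])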
The test results are as follows:
Train Epoch:1 [14848/60000](25)% LOSS:2.203524
Train Epoch:1 [30208/60000](50)% LOSS:1.798758
Train Epoch:1 [45568/60000](76)% LOSS:1.489452
Test set:Average loss:0.00231,Accuracy:6584/10000(66%)
Train Epoch:2 [14848/60000](25)% LOSS:0.899769
Train Epoch:2 [30208/60000](50)% LOSS:0.805880
Train Epoch:2 [45568/60000](76)% LOSS:0.606137
Test set:Average loss:0.00103,Accuracy:8584/10000(86%)
Train Epoch:3 [14848/60000](25)% LOSS:0.446443
Train Epoch:3 [30208/60000](50)% LOSS:0.391045
Train Epoch:3 [45568/60000](76)% LOSS:0.406176
Test set:Average loss:0.00064,Accuracy:9125/10000(91%)
Train Epoch:4 [14848/60000](25)% LOSS:0.353174
Train Epoch:4 [30208/60000](50)% LOSS:0.272244
Train Epoch:4 [45568/60000](76)% LOSS:0.343986
Test set:Average loss:0.00047,Accuracy:9339/10000(93%)
Train Epoch:5 [14848/60000](25)% LOSS:0.200243
Train Epoch:5 [30208/60000](50)% LOSS:0.195071
Train Epoch:5 [45568/60000](76)% LOSS:0.213462
Test set:Average loss:0.00044,Accuracy:9376/10000(94%)
Train Epoch:6 [14848/60000](25)% LOSS:0.197920
Train Epoch:6 [30208/60000](50)% LOSS:0.205575
Train Epoch:6 [45568/60000](76)% LOSS:0.170409
Test set:Average loss:0.00034,Accuracy:9525/10000(95%)
Train Epoch:7 [14848/60000](25)% LOSS:0.211715
Train Epoch:7 [30208/60000](50)% LOSS:0.166500
Train Epoch:7 [45568/60000](76)% LOSS:0.126960
Test set:Average loss:0.00031,Accuracy:9559/10000(96%)
Train Epoch:8 [14848/60000](25)% LOSS:0.125349
Train Epoch:8 [30208/60000](50)% LOSS:0.116293
Train Epoch:8 [45568/60000](76)% LOSS:0.178416
Test set:Average loss:0.00029,Accuracy:9592/10000(96%)
Train Epoch:9 [14848/60000](25)% LOSS:0.141461
Train Epoch:9 [30208/60000](50)% LOSS:0.164373
Train Epoch:9 [45568/60000](76)% LOSS:0.146364
Test set:Average loss:0.00027,Accuracy:9626/10000(96%)
Train Epoch:10 [14848/60000](25)% LOSS:0.121087
Train Epoch:10 [30208/60000](50)% LOSS:0.132021
Train Epoch:10 [45568/60000](76)% LOSS:0.118700
Test set:Average loss:0.00026,Accuracy:9632/10000(96%)
Train Epoch:11 [14848/60000](25)% LOSS:0.161801
Train Epoch:11 [30208/60000](50)% LOSS:0.167275
Train Epoch:11 [45568/60000](76)% LOSS:0.116637
Test set:Average loss:0.00026,Accuracy:9633/10000(96%)
Train Epoch:12 [14848/60000](25)% LOSS:0.125100
Train Epoch:12 [30208/60000](50)% LOSS:0.087668
Train Epoch:12 [45568/60000](76)% LOSS:0.087067
Test set:Average loss:0.00023,Accuracy:9673/10000(97%)
Train Epoch:13 [14848/60000](25)% LOSS:0.104275
Train Epoch:13 [30208/60000](50)% LOSS:0.049023
Train Epoch:13 [45568/60000](76)% LOSS:0.083864
Test set:Average loss:0.00023,Accuracy:9672/10000(97%)
Train Epoch:14 [14848/60000](25)% LOSS:0.064737
Train Epoch:14 [30208/60000](50)% LOSS:0.116962
Train Epoch:14 [45568/60000](76)% LOSS:0.134746
Test set:Average loss:0.00023,Accuracy:9673/10000(97%)
Train Epoch:15 [14848/60000](25)% LOSS:0.125909
Train Epoch:15 [30208/60000](50)% LOSS:0.081128
Train Epoch:15 [45568/60000](76)% LOSS:0.086529
Test set:Average loss:0.00021,Accuracy:9704/10000(97%)
Train Epoch:16 [14848/60000](25)% LOSS:0.152240
Train Epoch:16 [30208/60000](50)% LOSS:0.076676
Train Epoch:16 [45568/60000](76)% LOSS:0.103419
Test set:Average loss:0.00020,Accuracy:9721/10000(97%)
Train Epoch:17 [14848/60000](25)% LOSS:0.077082
Train Epoch:17 [30208/60000](50)% LOSS:0.070063
Train Epoch:17 [45568/60000](76)% LOSS:0.099703
Test set:Average loss:0.00020,Accuracy:9722/10000(97%)
Train Epoch:18 [14848/60000](25)% LOSS:0.042793
Train Epoch:18 [30208/60000](50)% LOSS:0.055832
Train Epoch:18 [45568/60000](76)% LOSS:0.076663
Test set:Average loss:0.00019,Accuracy:9728/10000(97%)
Train Epoch:19 [14848/60000](25)% LOSS:0.081578
Train Epoch:19 [30208/60000](50)% LOSS:0.112749
Train Epoch:19 [45568/60000](76)% LOSS:0.091129
Test set:Average loss:0.00020,Accuracy:9712/10000(97%)
Train Epoch:20 [14848/60000](25)% LOSS:0.094860
Train Epoch:20 [30208/60000](50)% LOSS:0.079394
Train Epoch:20 [45568/60000](76)% LOSS:0.081650
Test set:Average loss:0.00020,Accuracy:9721/10000(97%)
These runs show that when training on the MNIST dataset, the CNN's accuracy (about 99%) is slightly higher than the RNN's (about 97%).