LeNet is one of the earliest convolutional neural networks, first appearing in 1994. What is usually meant by "LeNet" is LeNet-5, the result of several iterations, proposed by Yann LeCun in the 1998 paper "Gradient-Based Learning Applied to Document Recognition" as an efficient convolutional neural network for handwritten character recognition. As the pioneering work on convolutional neural networks, it greatly advanced the field of deep learning.
LeNet-5 was proposed early on, and to cope with the limited computing power of the time it keeps the parameter count and compute requirements low, so some implementation details differ from current practice. The network has 7 layers, consisting mainly of convolutional layers, subsampling layers, and fully connected layers.
$$(1)\quad \text{Conv Layer output size} = \frac{\text{input size} - \text{kernel size} + 2\times\text{padding}}{\text{stride}} + 1$$
$$(2)\quad \text{Pool Layer output size} = \frac{\text{input size} - \text{pool size}}{\text{stride}} + 1$$
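As a quick check of these two formulas, here is a minimal sketch (the helper names conv_out_size and pool_out_size are made up for illustration) that reproduces the output sizes used later for C1 and S2:

def conv_out_size(input_size, kernel_size, padding=0, stride=1):
    # (input size - kernel size + 2*padding) / stride + 1
    return (input_size - kernel_size + 2 * padding) // stride + 1

def pool_out_size(input_size, pool_size, stride):
    # (input size - pool size) / stride + 1
    return (input_size - pool_size) // stride + 1

print(conv_out_size(32, 5))     # 28, i.e. C1: 32*32 -> 28*28
print(pool_out_size(28, 2, 2))  # 14, i.e. S2: 28*28 -> 14*14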
$$(1)\quad \text{Conv Layer Parameters} = \text{kernel size}^2 \times \text{kernel num} \times \text{input channels} + \text{bias num}\ (=\text{kernel num})$$
$$(2)\quad \text{FC Layer Parameters} = \text{input size}^2 \times \text{input channels} \times F + \text{bias num}\ (=F)$$
F: number of neurons in the fully connected layer
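The two parameter-count formulas can be written out the same way; a minimal sketch (conv_params and fc_params are illustrative names), checked against the C1 and F6 counts derived below:

def conv_params(kernel_size, kernel_num, input_channels):
    # kernel_size^2 * kernel_num * input_channels + bias_num (= kernel_num)
    return kernel_size ** 2 * kernel_num * input_channels + kernel_num

def fc_params(input_size, input_channels, F):
    # input_size^2 * input_channels * F + bias_num (= F)
    return input_size ** 2 * input_channels * F + F

print(conv_params(5, 6, 1))   # 156, matches C1 below
print(fc_params(1, 120, 84))  # 10164, matches F6 below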
Model input:
32*32 handwritten character images
C1 (convolutional layer):
6 convolution kernels of size 5*5 → feature maps: 6*28*28 (28 = 32−5+1)
Trainable parameters: 5*5*6*1+6=156
S2 (subsampling layer):
Each 2*2 neighbourhood is summed and multiplied by a trainable coefficient, a trainable bias is added, and the result is passed through a Tanh activation → feature maps: 6*14*14 ($14=\frac{28-2}{2}+1$)
Most LeNet implementations, and its use today, replace the original paper's subsampling layer with average/max pooling. The original subsampling layer is still instructive, however, so it is implemented here in full.
A true pooling layer has no trainable parameters; here there are trainable parameters (the custom weights and biases): (1+1)*6=12
C3 (a special convolutional layer):
16 convolution kernels of size 5*5 → feature maps: 16*10*10 (10 = 14−5+1)
This convolutional layer differs from what we usually call a convolutional layer today. To reduce the parameter count and the compute burden, the authors compute each output map from a complementary subset of the input feature maps (16 combinations) rather than the fully connected multi-channel convolution used today; the modern convolution subsumes the original computation graph.
Trainable parameters of the original implementation: 6*(3*5*5+1)+6*(4*5*5+1)+3*(4*5*5+1)+1*(6*5*5+1)=1516
Trainable parameters with the modern convolution: 5*5*16*6+16=2416
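The two C3 counts can be verified with a few lines of arithmetic; the breakdown (6 output maps fed by 3 input maps, 6 by 4, 3 by 4, and 1 by all 6) follows the connection table in the original paper:

# Original sparse connections: each of the 16 output maps sees only a subset of the 6 input maps
sparse = 6 * (3 * 5 * 5 + 1) + 6 * (4 * 5 * 5 + 1) + 3 * (4 * 5 * 5 + 1) + 1 * (6 * 5 * 5 + 1)
# Modern full convolution: every output map sees all 6 input maps
full = 5 * 5 * 16 * 6 + 16
print(sparse, full)  # 1516 2416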
S4 (subsampling layer):
feature maps: 16*5*5
Trainable parameters: 2*16=32
C5 (convolutional layer fully connected to the previous layer):
120 convolution kernels of size 5*5 → feature maps: 120*1*1
Trainable parameters: 5*5*120*16+120=48120
F6 (fully connected layer):
Neurons in the previous layer: 120 → neurons in this layer: 84
Trainable parameters: 120*84+84=10164
OUTPUT (fully connected layer with RBF units):
Gaussian connections: 84 → 10 (corresponding to the 10 output character classes)
The original paper connects this layer through radial basis functions (RBF), whose parameter settings are fixed codes tailored to the paper's dataset; we do not go into the details here.
Trainable parameters: 84*10+10=850
Total trainable parameters = 61750 = 0.06175M (counting the modern full-convolution C3 with 2416 parameters, as in the code below). Parameters are usually stored as float32, i.e. 32 bit; 32 bit = 4 Byte.
Model size = 0.06175M * 32 bit = 1.976 Mb = 0.247 MB
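Summing the per-layer counts above (with the full-convolution C3 and the custom subsampling layers) and converting float32 parameters to bytes reproduces this estimate; a minimal check:

total = 156 + 12 + 2416 + 32 + 48120 + 10164 + 850
print(total)             # 61750
print(total * 4 / 1e6)   # ~0.247 MB (float32 = 4 bytes per parameter, 1 MB = 10^6 bytes)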
import torch
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchsummary import summary
from torch import nn
import matplotlib.pyplot as plt
# Hyperparameter configuration
epochs = 10
batch_size = 32
learn_rate = 1e-3
class DownSample(nn.Module):
    def __init__(self, in_channels, kernel_size=2, stride=2):
        super(DownSample, self).__init__()
        self.in_channels = in_channels
        # Average pooling replaces the sum over each 2x2 window; since the per-channel
        # weight below is learnable, this is equivalent
        self.sum_4 = nn.AvgPool2d(kernel_size=kernel_size, stride=stride)
        # One learnable weight per channel
        self.weight = nn.Parameter(torch.randn(in_channels), requires_grad=True)
        # One learnable bias per channel
        self.bias = nn.Parameter(torch.randn(in_channels), requires_grad=True)

    def forward(self, feature):  # e.g. feature.shape = (-1, 6, 28, 28)
        sample_outputs = []
        feature = self.sum_4(feature)  # e.g. feature.shape = (-1, 6, 14, 14)
        for i in range(self.in_channels):
            sample_output = feature[:, i] * self.weight[i] + self.bias[i]  # e.g. (-1, 14, 14)
            sample_output = sample_output.unsqueeze(1)                     # e.g. (-1, 1, 14, 14)
            sample_outputs.append(sample_output)
        return torch.cat(sample_outputs, 1)                                # e.g. (-1, 6, 14, 14)
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.con_sam = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5),     # 1*32*32 -> 6*28*28
            nn.Tanh(),
            DownSample(in_channels=6, kernel_size=2, stride=2),          # 6*28*28 -> 6*14*14
            nn.Tanh(),
            nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5),    # 6*14*14 -> 16*10*10
            nn.Tanh(),
            DownSample(in_channels=16, kernel_size=2, stride=2),         # 16*10*10 -> 16*5*5
            nn.Tanh(),
            nn.Conv2d(in_channels=16, out_channels=120, kernel_size=5),  # 16*5*5 -> 120*1*1
            nn.Tanh(),
            nn.Flatten(),                                                # flatten: 120*1*1 -> 120
        )
        self.fc = nn.Sequential(
            nn.Linear(in_features=120, out_features=84),
            nn.Tanh(),
            nn.Linear(in_features=84, out_features=10)
        )

    def forward(self, input):         # input.shape = (-1, 1, 32, 32)
        output = self.con_sam(input)  # output.shape = (-1, 120)
        # Alternative ways to flatten (for a single sample):
        # output = torch.squeeze(output)
        # output = output.reshape(120)
        # output = output.view(120)
        # output = output.flatten()
        # output = torch.flatten(output)
        output = self.fc(output)
        return output
net = LeNet().cuda()
summary(net, (1, 32, 32))  # inspect the network structure
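Beyond torchsummary, a quick forward pass on a random tensor is a simple sanity check of the shapes (a sketch, assuming a CUDA device is available as in the line above):

dummy = torch.randn(1, 1, 32, 32).cuda()  # one fake 32*32 grayscale image
print(net(dummy).shape)                   # expected: torch.Size([1, 10])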
transform = transforms.Compose([torchvision.transforms.Resize(32), transforms.ToTensor()])
train_data = torchvision.datasets.MNIST('./mnist', train=True, transform=transform, download=True)
test_data = torchvision.datasets.MNIST('./mnist', train=False, transform=transform, download=True)  # raw images are 28*28, resized to 32*32
print('train_data:{}, test_data:{}'.format(len(train_data), len(test_data)))

# Inspect a single sample
# data1 = train_data[0][0].numpy().squeeze()  # squeeze out the extra channel dimension
# plt.imshow(data1)
# plt.show()

train_loader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_data, batch_size=batch_size)

optimizer = torch.optim.Adam(net.parameters(), lr=learn_rate)
loss_func = torch.nn.CrossEntropyLoss()

for epoch in range(epochs):
    print('epoch {}'.format(epoch + 1))

    # train
    net.train()  # switch to training mode
    train_loss, train_correct = 0, 0
    for _, data in enumerate(train_loader, 0):
        batch_data, batch_label = data
        batch_data, batch_label = batch_data.cuda(), batch_label.cuda()  # move data to the GPU; batch_data.shape = (batch_size, 1, 32, 32)
        batch_pred = net(batch_data)  # batch_pred.shape = (batch_size, 10); the 10 values per sample score the 10 classes, the largest is the prediction
        predict_correct = torch.max(batch_pred, 1)[1]  # max over dim 1; [1] takes the indices, i.e. the predicted class per sample
        predict_correct = (predict_correct == batch_label).sum()  # number of correct predictions in this batch
        train_correct += predict_correct.item()  # accumulate correct predictions for this epoch (for accuracy); item() extracts the Python number
        loss = loss_func(batch_pred, batch_label)
        optimizer.zero_grad()  # clear gradients
        loss.backward()        # backpropagation
        optimizer.step()       # update parameters from the gradients
        train_loss += loss.item()  # accumulate batch loss
    print('Train Loss: {:.6f}, Acc: {:.6f}'.format(train_loss / (len(train_data)), train_correct / (len(train_data))))

    # test
    net.eval()  # switch to evaluation mode
    with torch.no_grad():  # no gradient computation: faster and saves GPU memory
        test_loss, test_correct = 0, 0
        for _, data in enumerate(test_loader, 0):
            batch_data, batch_label = data
            batch_data, batch_label = batch_data.cuda(), batch_label.cuda()
            batch_pred = net(batch_data)
            predict_correct = torch.max(batch_pred, 1)[1]
            predict_correct = (predict_correct == batch_label).sum()
            test_correct += predict_correct.item()
            loss = loss_func(batch_pred, batch_label)
            test_loss += loss.item()
        print('Test Loss: {:.6f}, Acc: {:.6f}'.format(test_loss / (len(test_data)), test_correct / (len(test_data))))

print('End of the training')
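If you want to keep the trained weights, a minimal sketch (the file name lenet.pth is arbitrary):

torch.save(net.state_dict(), 'lenet.pth')      # save the trained weights
# later: restore them into a fresh model
net2 = LeNet().cuda()
net2.load_state_dict(torch.load('lenet.pth'))
net2.eval()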
Converting the PyTorch code to a PaddlePaddle implementation
import paddle
import numpy
import paddle.nn as nn
from paddle.vision.datasets import MNIST
from paddle.vision.transforms import Compose, Resize, ToTensor
from paddle.io import DataLoader

# Hyperparameter configuration
epochs = 10
batch_size = 64
learning_rate = 1e-3

class DownSample(nn.Layer):
    def __init__(self, in_channels, kernel_size=2, stride=2):
        super(DownSample, self).__init__()
        self.in_channels = in_channels
        # Average pooling replaces the sum over each 2x2 window; since the per-channel
        # weight below is learnable, this is equivalent
        self.sum_4 = nn.AvgPool2D(kernel_size=kernel_size, stride=stride)
        # One learnable weight per channel
        self.weight = paddle.static.create_parameter(shape=[in_channels], dtype='float32')
        # One learnable bias per channel
        self.bias = paddle.static.create_parameter(shape=[in_channels], dtype='float32', is_bias=True)

    def forward(self, feature):  # e.g. feature.shape = (-1, 6, 28, 28)
        sample_outputs = []
        feature = self.sum_4(feature)  # e.g. feature.shape = (-1, 6, 14, 14)
        for i in range(self.in_channels):
            sample_output = feature[:, i] * self.weight[i] + self.bias[i]  # e.g. (-1, 14, 14)
            sample_output = sample_output.unsqueeze(1)                     # e.g. (-1, 1, 14, 14)
            sample_outputs.append(sample_output)
        return paddle.concat(sample_outputs, 1)                            # e.g. (-1, 6, 14, 14)

class LeNet(nn.Layer):
    def __init__(self):
        super(LeNet, self).__init__()
        self.con_sam = nn.Sequential(
            nn.Conv2D(in_channels=1, out_channels=6, kernel_size=5),     # 1*32*32 -> 6*28*28
            nn.Tanh(),
            DownSample(in_channels=6, kernel_size=2, stride=2),          # 6*28*28 -> 6*14*14
            nn.Tanh(),
            nn.Conv2D(in_channels=6, out_channels=16, kernel_size=5),    # 6*14*14 -> 16*10*10
            nn.Tanh(),
            DownSample(in_channels=16, kernel_size=2, stride=2),         # 16*10*10 -> 16*5*5
            nn.Tanh(),
            nn.Conv2D(in_channels=16, out_channels=120, kernel_size=5),  # 16*5*5 -> 120*1*1
            nn.Tanh(),
            nn.Flatten(),                                                # flatten: 120*1*1 -> 120
        )
        self.fc = nn.Sequential(
            nn.Linear(in_features=120, out_features=84),
            nn.Tanh(),
            nn.Linear(in_features=84, out_features=10)
        )

    def forward(self, input):         # input.shape = (-1, 1, 32, 32)
        output = self.con_sam(input)  # output.shape = (-1, 120)
        # Alternative ways to flatten:
        # output = paddle.squeeze(output)
        # output = output.flatten()
        # output = paddle.flatten(output)
        output = self.fc(output)
        return output

net = LeNet()
paddle.summary(net, (-1, 1, 32, 32))  # inspect the network structure

transform = Compose([Resize(32), ToTensor()])
train_data = MNIST(mode='train', transform=transform, download=True)
test_data = MNIST(mode='test', transform=transform, download=True)  # raw images are 28*28, resized to 32*32
print('train_data:{}, test_data:{}'.format(len(train_data), len(test_data)))

# Inspect a single sample
# data1 = train_data[0][0].numpy().squeeze()  # squeeze out the extra channel dimension
# plt.imshow(data1)
# plt.show()

train_loader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test_data, batch_size=batch_size)

optimizer = paddle.optimizer.Adam(parameters=net.parameters(), learning_rate=learning_rate)
loss_func = paddle.nn.CrossEntropyLoss()

for epoch in range(epochs):
    print('epoch {}'.format(epoch + 1))

    # train
    net.train()  # switch to training mode
    train_loss, train_correct = 0, 0
    for _, data in enumerate(train_loader, 0):
        batch_data, batch_label = data
        batch_pred = net(batch_data)  # batch_pred.shape = (batch_size, 10); the 10 values per sample score the 10 classes, the largest is the prediction
        predict_correct = paddle.argmax(batch_pred, 1)  # predicted class per sample
        batch_label = batch_label.squeeze()
        predict_correct = (predict_correct == batch_label).numpy().sum()  # number of correct predictions in this batch
        train_correct += predict_correct.item()  # accumulate correct predictions for this epoch (for accuracy)
        loss = loss_func(batch_pred, batch_label)
        optimizer.clear_grad()  # clear gradients
        loss.backward()         # backpropagation
        optimizer.step()        # update parameters from the gradients
        train_loss += loss.numpy()  # accumulate batch loss
    print('Train Loss: {:.6f}, Acc: {:.6f}'.format(train_loss[0] / (len(train_data)), train_correct / (len(train_data))))

    # test
    net.eval()  # switch to evaluation mode
    with paddle.no_grad():  # no gradient computation: faster and saves memory
        test_loss, test_correct = 0, 0
        for _, data in enumerate(test_loader, 0):
            batch_data, batch_label = data
            batch_pred = net(batch_data)
            predict_correct = paddle.argmax(batch_pred, 1)
            batch_label = batch_label.squeeze()
            predict_correct = (predict_correct == batch_label).numpy().sum()
            test_correct += predict_correct.item()
            loss = loss_func(batch_pred, batch_label)
            test_loss += loss.numpy()
        print('Test Loss: {:.6f}, Acc: {:.6f}'.format(test_loss[0] / (len(test_data)), test_correct / (len(test_data))))

print('End of the training')
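As with the PyTorch version, the trained PaddlePaddle weights can be kept on disk; a minimal sketch (the file name lenet.pdparams is arbitrary):

paddle.save(net.state_dict(), 'lenet.pdparams')     # save the trained weights
# later: restore them into a fresh model
net2 = LeNet()
net2.set_state_dict(paddle.load('lenet.pdparams'))
net2.eval()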