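LeNet-5 is a classic convolutional neural network (CNN) proposed by Yann LeCun et al. in 1998, mainly for handwritten digit recognition. It performs well on the MNIST dataset and is an important milestone in the history of deep learning.
The LeNet-5 architecture alternates convolutional and pooling layers and ends with fully connected layers: two convolutional layers, each followed by a pooling layer, and then three fully connected layers, as in the implementation given later in this post.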
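CIFAR-10 is a widely used image classification dataset containing 60,000 32x32 color images in 10 classes. Each class has 6,000 images. Of the 60,000 images, 50,000 are used for training and 10,000 for testing.
1. Labeled data: the training set has 50,000 images and the test set has 10,000 images.
2. Label classes: the dataset has 10 classes in total. See Figure 1 for the individual classes.
3. Visualization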
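
The split sizes above follow from simple arithmetic. As a quick sanity check, here is a small sketch. The per-class figures of 5,000 training and 1,000 test images are the standard CIFAR-10 split rather than something stated in this post, and the variable names are only illustrative.

```python
# CIFAR-10 bookkeeping, using the figures quoted above.
num_classes = 10
images_per_class = 6_000
train_per_class = 5_000  # standard CIFAR-10 split (not stated in the post itself)
test_per_class = images_per_class - train_per_class

total_images = num_classes * images_per_class
train_images = num_classes * train_per_class
test_images = num_classes * test_per_class

# Each image is 32x32 pixels with 3 color channels.
values_per_image = 32 * 32 * 3

print(total_images, train_images, test_images, values_per_image)
# 60000 50000 10000 3072
```

So each image becomes a tensor of 3,072 values, which is the input size the network has to handle.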
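In practice, max pooling is used more often than average pooling, because it usually preserves the most salient features better and tends to improve model performance. The implementation below uses max pooling.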
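
To make the difference concrete, here is a minimal pure-Python sketch of 2x2 pooling with stride 2 on a tiny feature map. `pool2x2` and `mean` are hypothetical helpers written for this illustration, not PyTorch functions, and the feature map values are made up.

```python
def pool2x2(feature_map, reduce_fn):
    """Apply 2x2 pooling with stride 2 to a 2-D list, using reduce_fn on each window."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        row = []
        for c in range(0, cols - 1, 2):
            window = [feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1]]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


# A made-up feature map with one strong activation in each 2x2 window.
feature_map = [
    [0, 0, 1, 0],
    [0, 8, 0, 0],
    [0, 0, 0, 0],
    [4, 0, 0, 6],
]

max_pooled = pool2x2(feature_map, max)
avg_pooled = pool2x2(feature_map, mean)
print(max_pooled)  # [[8, 1], [4, 6]]
print(avg_pooled)  # [[2.0, 0.25], [1.0, 1.5]]
```

Max pooling keeps each strong activation at full strength, while average pooling dilutes it with the surrounding zeros.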

    import torch.nn as nn
    import torch.nn.functional as func


    class LeNet(nn.Module):
        """LeNet-style CNN for 3-channel 32x32 inputs (CIFAR-10) with 10 output classes."""

        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 6, kernel_size=5)    # 3x32x32 -> 6x28x28
            self.conv2 = nn.Conv2d(6, 16, kernel_size=5)   # 6x14x14 -> 16x10x10
            self.fc1 = nn.Linear(16*5*5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            x = func.relu(self.conv1(x))
            x = func.max_pool2d(x, 2)      # 6x28x28 -> 6x14x14
            x = func.relu(self.conv2(x))
            x = func.max_pool2d(x, 2)      # 16x10x10 -> 16x5x5
            x = x.view(x.size(0), -1)      # flatten to (batch, 400)
            x = func.relu(self.fc1(x))
            x = func.relu(self.fc2(x))
            x = self.fc3(x)                # raw logits, one per class
            return x
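
The `16*5*5` input size of `fc1` comes from tracing the spatial size through the network above. A minimal sketch of that arithmetic, using the usual output-size formula `(size + 2*padding - kernel) // stride + 1`. `conv_out` is a hypothetical helper name, not a PyTorch function.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer on a square input."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32                                   # CIFAR-10 input: 3 x 32 x 32
size = conv_out(size, kernel=5)             # conv1 -> 6 x 28 x 28
after_conv1 = size
size = conv_out(size, kernel=2, stride=2)   # max_pool2d(x, 2) -> 6 x 14 x 14
size = conv_out(size, kernel=5)             # conv2 -> 16 x 10 x 10
size = conv_out(size, kernel=2, stride=2)   # max_pool2d(x, 2) -> 16 x 5 x 5

flattened = 16 * size * size                # 16 channels of 5 x 5
print(after_conv1, size, flattened)  # 28 5 400
```

The same trace with a 28x28 input (as in MNIST) ends at 4x4, so `fc1` would need `16*4*4 = 256` inputs in that case.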
Load the training and test data:

    # Method of the training class (class definition not shown). Assumes these imports:
    #   import torch
    #   import torchvision
    #   from torchvision import transforms
    def load_data(self):
        # transforms.RandomHorizontalFlip() flips the input image horizontally with a
        # given probability (0.5 by default) and returns the result. It is a simple
        # form of data augmentation that helps the model generalize.
        train_transform = transforms.Compose([transforms.RandomHorizontalFlip(), transforms.ToTensor()])
        test_transform = transforms.Compose([transforms.ToTensor()])

        train_set = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=train_transform)
        # shuffle=True reshuffles the training set at the start of every epoch, so the
        # model does not see the samples in the same order each time.
        self.train_loader = torch.utils.data.DataLoader(dataset=train_set, batch_size=self.train_batch_size, shuffle=True)

        test_set = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=test_transform)
        # No shuffling is needed for evaluation.
        self.test_loader = torch.utils.data.DataLoader(dataset=test_set, batch_size=self.test_batch_size, shuffle=False)
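
To show what the random horizontal flip does to an image, here is a minimal pure-Python sketch. It is not torchvision's implementation: `random_hflip` is a hypothetical helper that works on a plain list of pixel rows, and the tiny 2x3 "image" is made up.

```python
import random


def random_hflip(image, p=0.5, rng=random):
    """Flip a row-major image (list of rows) left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image


image = [
    [1, 2, 3],
    [4, 5, 6],
]

always = random_hflip(image, p=1.0)  # always flipped
never = random_hflip(image, p=0.0)   # never flipped
print(always)  # [[3, 2, 1], [6, 5, 4]]
print(never)   # [[1, 2, 3], [4, 5, 6]]
```

With the default `p=0.5`, roughly half of the training images are mirrored in each epoch, which is why the flip is applied only in the training transform and not in the test transform.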

    def train(self):
        print("train:")
        self.model.train()  # training mode
        train_loss = 0
        train_correct = 0
        total = 0

        for batch_num, (data, target) in enumerate(self.train_loader):
            data, target = data.to(self.device), target.to(self.device)
            self.optimizer.zero_grad()
            output = self.model(data)
            loss = self.criterion(output, target)
            loss.backward()
            self.optimizer.step()
            train_loss += loss.item()
            # torch.max over dim 1 (the class dimension) returns (max values, indices)
            _, predicted = torch.max(output, 1)
            total += target.size(0)

            # add the number of correct predictions in this batch
            train_correct += (predicted == target).sum().item()

            # progress_bar is an external helper that is not defined in this post
            progress_bar(batch_num, len(self.train_loader), 'Loss: %.4f | Acc: %.3f%% (%d/%d)'
                         % (train_loss / (batch_num + 1), 100. * train_correct / total, train_correct, total))

        # summed batch loss and overall accuracy for this epoch
        return train_loss, train_correct / total
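
The bookkeeping inside the training loop (running loss, correct count, accuracy) can be followed without PyTorch. A minimal sketch, assuming made-up logits, targets, and batch losses. `argmax` and `count_correct` are hypothetical helpers that play the role of `torch.max(output, 1)` and the comparison with `target`.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def count_correct(logits_batch, targets):
    """Number of samples whose highest-scoring class equals the target label."""
    return sum(argmax(logits) == target for logits, target in zip(logits_batch, targets))


# Two made-up mini-batches of (logits, targets, batch loss) with 3 classes.
batches = [
    ([[2.0, 0.1, -1.0], [0.3, 0.2, 1.5]], [0, 1], 0.9),
    ([[0.0, 3.0, 0.5], [1.2, 0.4, 0.1]], [1, 0], 0.3),
]

running_loss, correct, total = 0.0, 0, 0
for batch_num, (logits_batch, targets, batch_loss) in enumerate(batches):
    running_loss += batch_loss
    correct += count_correct(logits_batch, targets)
    total += len(targets)
    print('Loss: %.4f | Acc: %.3f%% (%d/%d)'
          % (running_loss / (batch_num + 1), 100. * correct / total, correct, total))
# Loss: 0.9000 | Acc: 50.000% (1/2)
# Loss: 0.6000 | Acc: 75.000% (3/4)
```

The displayed loss is the mean of the batch losses seen so far, while the value returned by `train` is their sum.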

    def test(self):
        print("test:")
        self.model.eval()  # evaluation mode
        test_loss = 0
        test_correct = 0
        total = 0

        # no gradients are needed for evaluation
        with torch.no_grad():
            for batch_num, (data, target) in enumerate(self.test_loader):
                data, target = data.to(self.device), target.to(self.device)
                output = self.model(data)
                loss = self.criterion(output, target)
                test_loss += loss.item()
                _, predicted = torch.max(output, 1)  # index of the highest-scoring class
                total += target.size(0)
                test_correct += (predicted == target).sum().item()

                # progress_bar is an external helper that is not defined in this post
                progress_bar(batch_num, len(self.test_loader), 'Loss: %.4f | Acc: %.3f%% (%d/%d)'
                             % (test_loss / (batch_num + 1), 100. * test_correct / total, test_correct, total))

        # summed batch loss and overall accuracy on the test set
        return test_loss, test_correct / total
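Download any image from the web.
Then use an image editor to resize it to 32x32 pixels.
Finally, load the saved model and run a prediction on the image: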
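
The resize step can also be done in code instead of an image editor. In a script that already uses OpenCV, a call such as `cv2.resize(img, (32, 32))` would be the usual choice. To show what a resize actually does, here is a minimal pure-Python sketch of nearest-neighbour resizing. `resize_nearest` is a hypothetical helper, and the 64x64 input is made up.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a row-major image (list of rows of pixels) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# A made-up 64x64 "image" whose pixel value encodes its own (row, col) position.
big = [[(r, c) for c in range(64)] for r in range(64)]
small = resize_nearest(big, 32, 32)
print(len(small), len(small[0]))  # 32 32
print(small[1][2])                # (2, 4)
```

Each output pixel simply copies the source pixel at the proportionally scaled position, so halving the size keeps every second row and column.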

    import cv2
    import torch
    import torch.nn.functional as F
    # from model import Net  # Important: even though the name is not used directly in
    #                        # this script, the model class must be importable, otherwise
    #                        # torch.load cannot find it when loading the saved model.
    from torchvision import transforms

    classes = ('plane', 'car', 'bird', 'cat',
               'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

    if __name__ == '__main__':
        device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
        # Load the whole saved model. map_location lets a model saved on GPU load on CPU.
        # On PyTorch 2.6 and later, also pass weights_only=False to load a fully pickled model.
        model = torch.load('lenet.pth', map_location=device)
        model = model.to(device)
        model.eval()  # switch the model to evaluation mode

        img = cv2.imread("bird1.png")  # read the 32x32 image to classify
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; the training data is RGB
        trans = transforms.Compose([transforms.ToTensor()])

        img = trans(img)
        img = img.to(device)
        # Add a batch dimension: the model expects [batch_size, channels, height, width],
        # while a single image is [channels, height, width]. The result is [1, 3, 32, 32].
        img = img.unsqueeze(0)
        output = model(img)
        prob = F.softmax(output, dim=1)  # probabilities for the 10 classes
        print(prob)
        value, predicted = torch.max(output.detach(), 1)  # largest logit and its class index
        print(predicted.item())
        print(value)
        pred_class = classes[predicted.item()]
        print(pred_class)

Output:

    tensor([[1.8428e-01, 1.3935e-06, 7.8295e-01, 8.5042e-04, 3.0219e-06, 1.6916e-04,
             5.8798e-06, 3.1647e-02, 1.7037e-08, 8.9128e-05]], device='cuda:0',
           grad_fn=<SoftmaxBackward0>)
    2
    tensor([4.0915], device='cuda:0')
    bird
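
Judging from the output, the result is quite good: the model assigns about 78% probability to bird, which matches the image. Recording it here for reference.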
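
To read the printed probabilities more easily, they can be ranked by class name. A minimal pure-Python sketch: the `probs` values are copied from the output above, while `softmax`, `top_k`, and the demo logits are hypothetical and only illustrate what `F.softmax` and `torch.max` do here.

```python
import math

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# Probabilities copied from the printed output above.
probs = [1.8428e-01, 1.3935e-06, 7.8295e-01, 8.5042e-04, 3.0219e-06,
         1.6916e-04, 5.8798e-06, 3.1647e-02, 1.7037e-08, 8.9128e-05]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, labels, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


for label, p in top_k(probs, classes):
    print('%-6s %.1f%%' % (label, 100 * p))

# Softmax preserves the ordering of the logits, so the winning class is the same
# whether it is picked from the raw logits or from the probabilities.
logits = [1.0, 4.0, 0.5]  # made-up logits for illustration
demo_probs = softmax(logits)
same_winner = logits.index(max(logits)) == demo_probs.index(max(demo_probs))
print(same_winner)  # True
```

The ranking puts bird first, followed by plane and horse, and the remaining seven classes share well under 1% of the probability mass.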