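Course video: Bilibili, 刘二大人, 《PyTorch深度学习实践》 (PyTorch Deep Learning Practice), complete collection
For a more detailed explanation of CNNs, see: 通俗理解卷积神经网络 (An Accessible Introduction to Convolutional Neural Networks)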
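The networks studied so far were fully connected: every node in one layer is connected to every node in the next layer. For numerical computation, a fully connected network flattens its input into a one-dimensional vector, which discards some of the spatial information in the image.
A convolutional neural network (CNN) preserves the spatial structure of the image. It works as follows:
An input image first goes through a convolution, then through downsampling, which reduces the number of elements. Because the goal is classification, the final output is still a 10-dimensional vector; only the intermediate steps differ from the fully connected case.
Overall, the convolutional layers come first and perform feature extraction. After feature extraction, the image information is flattened into a vector, and that vector is fed into a fully connected network for classification.
Convolution and downsampling together are called the feature extractor; the fully connected layers act as the classifier.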
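Images are usually three-channel RGB images.
Raster image: the image is divided into a grid of cells. Each cell is one pixel and holds a color value. An image is usually described by W (width), H (height), and C (number of channels).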
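A raster image is often stored as height × width × channel, whereas PyTorch convolution layers expect channel-first data. The following is a minimal sketch in plain Python of that rearrangement; the 2×2 image and the helper name `hwc_to_chw` are made up for illustration.

```python
# A tiny 2x2 RGB raster image stored as image[row][col][channel], i.e. (H, W, C).
# The pixel values are illustrative only.
image_hwc = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

def hwc_to_chw(image):
    """Rearrange an H x W x C nested list into C x H x W (hypothetical helper)."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[r][c][ch] for c in range(width)] for r in range(height)]
        for ch in range(channels)
    ]

image_chw = hwc_to_chw(image_hwc)
print(len(image_chw))  # number of channels
print(image_chw[0])    # the red channel as a 2x2 grid
```

Each entry of `image_chw` is one channel viewed as its own H × W grid, which is the view the per-channel convolution works on.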
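Steps: first, mark out a window on the input image that has the same size as the kernel (3x3). Multiply the window and the kernel element by element and sum the products. Then slide the window from left to right and top to bottom, repeating the multiply-and-sum at every position. The matrix formed by these sums is the result of the convolution.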
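The sliding-window procedure above can be sketched in plain Python. This is an illustrative from-scratch version with no padding and stride 1, not how PyTorch implements it; the function name `conv2d_single` and the sample values are chosen for the example.

```python
def conv2d_single(image, kernel):
    """Single-channel convolution as used in CNNs: no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for top in range(out_h):            # slide the window top to bottom
        row = []
        for left in range(out_w):       # slide the window left to right
            total = 0
            for i in range(kh):
                for j in range(kw):
                    # element-wise product of window and kernel, accumulated
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

image = [
    [3, 4, 6, 5, 7],
    [2, 4, 6, 8, 2],
    [1, 6, 7, 8, 4],
    [9, 7, 4, 6, 2],
    [3, 7, 5, 4, 1],
]
kernel = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
result = conv2d_single(image, kernel)
print(result)  # [[211, 295, 262], [259, 282, 214], [251, 253, 169]]
```

A 5x5 input with a 3x3 kernel gives a 3x3 output, because the window fits in only three positions along each axis.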
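For a multi-channel input image, every channel is paired with its own slice of the kernel **(number of input channels == number of kernel channels)**.
Each channel is convolved with its slice using the single-channel procedure described above, which yields one matrix per channel, three in total for an RGB image. Summing these three matrices element by element gives the result of the three-channel convolution.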
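A small plain-Python sketch of the multi-channel case: each channel is convolved with its own kernel slice, and the per-channel results are summed. The 3x3 inputs, 2x2 kernel slices, and function names are invented for illustration.

```python
def conv2d_single(image, kernel):
    """Single-channel convolution, no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[top + i][left + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for left in range(len(image[0]) - kw + 1)
        ]
        for top in range(len(image) - kh + 1)
    ]

def conv2d_multi_in(channels, kernel_slices):
    """Convolve each channel with its kernel slice, then sum the results."""
    assert len(channels) == len(kernel_slices), "channel counts must match"
    per_channel = [conv2d_single(c, k) for c, k in zip(channels, kernel_slices)]
    out_h, out_w = len(per_channel[0]), len(per_channel[0][0])
    return [
        [sum(m[r][c] for m in per_channel) for c in range(out_w)]
        for r in range(out_h)
    ]

# Three input channels (illustrative values) and one kernel with three slices.
channels = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
    [[2, 2, 2], [2, 2, 2], [2, 2, 2]],
]
kernel_slices = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
    [[0, 1], [1, 0]],
]
summed = conv2d_multi_in(channels, kernel_slices)
print(summed)  # [[12, 14], [18, 20]]
```

However many input channels there are, one kernel still produces a single output channel, because the per-channel matrices are added together.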
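With only one kernel, the final output has a single channel.
Single-channel output: one kernel (with as many slices as the input has channels) produces one output channel.
Multi-channel output: to obtain several output channels, use several kernels; each kernel produces one output channel, and the results are stacked.
Number of channels of the input image == number of channels of each kernel
Number of kernels == number of channels of the output image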
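The two rules above determine the shape of a convolution layer's weight: (number of kernels, channels per kernel, kernel height, kernel width). The sketch below works this out in plain Python; the helper names are hypothetical, and the layout matches what `torch.nn.Conv2d` uses for an ordinary (non-grouped) convolution.

```python
def conv_weight_shape(in_channels, out_channels, kernel_size):
    """Weight shape of a plain 2D convolution: (out, in, kernel_h, kernel_w)."""
    if isinstance(kernel_size, int):
        kh, kw = kernel_size, kernel_size
    else:
        kh, kw = kernel_size
    return (out_channels, in_channels, kh, kw)

def conv_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable values: all kernel entries plus one bias per kernel."""
    out_c, in_c, kh, kw = conv_weight_shape(in_channels, out_channels, kernel_size)
    return out_c * in_c * kh * kw + (out_c if bias else 0)

shape = conv_weight_shape(5, 10, 3)
params = conv_param_count(5, 10, 3)
print(shape)   # (10, 5, 3, 3)
print(params)  # 460
```

With 5 input channels and 10 output channels, there are 10 kernels of 5 slices each, so the weight count does not depend on the image size at all.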
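In PyTorch, kernel_size given as a single integer k means a k x k kernel. kernel_size can also be a tuple: kernel_size=(5,3) means a rectangular 5x3 kernel. In practice the most common choice is still a square kernel such as kernel_size=3, i.e. 3x3.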
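As a quick check of the two forms of kernel_size, the following sketch builds one square and one rectangular layer and compares their output shapes. The 100x100 single-channel input is an arbitrary choice for the example.

```python
import torch

# kernel_size as an int gives a square kernel; as a tuple it is (height, width).
conv_square = torch.nn.Conv2d(1, 1, kernel_size=3)
conv_rect = torch.nn.Conv2d(1, 1, kernel_size=(5, 3))

x = torch.randn(1, 1, 100, 100)  # (batch, channels, height, width)

square_shape = tuple(conv_square(x).shape)
rect_shape = tuple(conv_rect(x).shape)
rect_weight_shape = tuple(conv_rect.weight.shape)

print(square_shape)       # (1, 1, 98, 98)
print(rect_shape)         # (1, 1, 96, 98)
print(rect_weight_shape)  # (1, 1, 5, 3)
```

With the rectangular kernel, the height shrinks by 4 and the width by 2, because the window is 5 rows tall and 3 columns wide.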
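To define a convolution layer, specify the number of input channels, the number of output channels, and the kernel size, as in the following code: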
import torch
in_channels,out_channels = 5,10
height,width = 100,100 # image size
kernel_size = 3 # a 3x3 kernel
batch_size = 1
input = torch.randn(batch_size,in_channels,height,width) # PyTorch layout: (N, C, H, W)
conv_layer = torch.nn.Conv2d(in_channels,out_channels,kernel_size=kernel_size)
output = conv_layer(input)
print(input.shape)
print(output.shape)
print(conv_layer.weight.shape)
Output:
torch.Size([1, 5, 100, 100])
torch.Size([1, 10, 98, 98])
torch.Size([10, 5, 3, 3])
padding
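Padding adds extra values around the input so that the output has the required width and height. The padding values are usually zeros.
Example: a 5x5 input convolved with a 3x3 kernel gives a 3x3 output. If the output should have the same size as the input, the input must be padded with padding = 3 // 2 = 1 (integer division), that is, one ring of zeros around the image, which turns it into a 7x7 input.
With a 5x5 kernel and the same requirement that the output size equal the input size, the padding is 5 // 2 = 2: two rings of zeros, which turn the original 5x5 image into a 9x9 input.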
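The padding arithmetic above can be illustrated with a short plain-Python sketch. The helper names `same_padding` and `zero_pad` are made up for this example, and the rule `kernel_size // 2` is assumed to apply only to odd kernel sizes with stride 1.

```python
def same_padding(kernel_size):
    """Padding that keeps the output size equal to the input size
    (assumes an odd kernel size and stride 1)."""
    return kernel_size // 2

def zero_pad(image, padding):
    """Surround a 2D grid with `padding` rings of zeros."""
    padded_width = len(image[0]) + 2 * padding
    top = [[0] * padded_width for _ in range(padding)]
    bottom = [[0] * padded_width for _ in range(padding)]
    middle = [[0] * padding + list(row) + [0] * padding for row in image]
    return top + middle + bottom

image = [[1] * 5 for _ in range(5)]  # a 5x5 image of ones

padded_3 = zero_pad(image, same_padding(3))  # 3x3 kernel: one ring
padded_5 = zero_pad(image, same_padding(5))  # 5x5 kernel: two rings

print(len(padded_3), len(padded_3[0]))  # 7 7
print(len(padded_5), len(padded_5[0]))  # 9 9
```

Convolving the 7x7 padded image with a 3x3 kernel gives 7 - 3 + 1 = 5 positions per axis, so the output is 5x5 again.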
Code:
import torch
input = [3,4,6,5,7,
2,4,6,8,2,
1,6,7,8,4,
9,7,4,6,2,
3,7,5,4,1]
input = torch.Tensor(input).view(1,1,5,5) # (batch_size, channels, height, width)
conv_layer = torch.nn.Conv2d(1,1,kernel_size=3,padding=1,bias=False)
kernel = torch.Tensor([1,2,3,4,5,6,7,8,9]).view(1,1,3,3) # (out_channels, in_channels, kernel height, kernel width)
conv_layer.weight.data = kernel.data
output = conv_layer(input)
print(output)
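Output: a tensor of shape (1, 1, 5, 5) with the following values:
 91 168 224 215 127
114 211 295 262 149
192 259 282 214 122
194 251 253 169  86
 96 112 110  68  31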
stride
Stride: the distance the kernel window moves at each step as it traverses the image.
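Kernel size, padding, and stride together determine the output size along each axis. A minimal sketch of the usual formula, floor((input + 2 × padding − kernel) / stride) + 1, assuming no dilation; the function name is chosen for the example.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one axis of a convolution (no dilation assumed)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(5, 3))             # 3: plain 3x3 kernel on a 5x5 image
print(conv_output_size(5, 3, padding=1))  # 5: padding keeps the size
print(conv_output_size(5, 3, stride=2))   # 2: stride 2 skips every other position
print(conv_output_size(100, 3))           # 98
```

With stride 2 on a 5x5 input, the 3x3 window fits at offsets 0 and 2 only, which is why the output is 2x2.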
Code:
import torch
input = [3,4,6,5,7,
2,4,6,8,2,
1,6,7,8,4,
9,7,4,6,2,
3,7,5,4,1]
input = torch.Tensor(input).view(1,1,5,5)
conv_layer = torch.nn.Conv2d(1,1,kernel_size=3,stride=2,bias=False)
kernel = torch.Tensor([1,2,3,4,5,6,7,8,9]).view(1,1,3,3)
conv_layer.weight.data = kernel.data
output = conv_layer(input)
print(output)
Output: a tensor of shape (1, 1, 2, 2) with the following values:
211 262
251 169
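The most common pooling operation is max pooling. It has no weights. A 2x2 pooling layer uses stride 2 by default (in PyTorch, the stride of MaxPool2d defaults to kernel_size).
Max pooling is applied within each channel separately; nothing is pooled across channels. The number of channels therefore stays the same, while a 2x2 max pooling halves the width and the height of the image.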
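A plain-Python sketch of max pooling on a single channel, written from scratch for illustration (the function name and sample grid are chosen for the example). As in the text, the stride defaults to the kernel size.

```python
def max_pool2d(image, kernel_size=2, stride=None):
    """Max pooling on one channel; stride defaults to kernel_size."""
    if stride is None:
        stride = kernel_size
    out_h = (len(image) - kernel_size) // stride + 1
    out_w = (len(image[0]) - kernel_size) // stride + 1
    return [
        [
            max(image[r * stride + i][c * stride + j]
                for i in range(kernel_size) for j in range(kernel_size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

image = [
    [3, 4, 6, 5],
    [2, 4, 6, 8],
    [1, 6, 7, 8],
    [9, 7, 4, 6],
]
pooled = max_pool2d(image, kernel_size=2)
print(pooled)  # [[4, 8], [9, 8]]
```

Each 2x2 block is replaced by its largest value, so the 4x4 grid becomes 2x2. For a multi-channel image the same function would be applied to every channel independently.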
Implementation:
import torch
input = [3,4,6,5,
2,4,6,8,
1,6,7,8,
9,7,4,6]
input = torch.Tensor(input).view(1,1,4,4)
maxpooling_layer = torch.nn.MaxPool2d(kernel_size=2)
output = maxpooling_layer(input)
print(output)
Output: a tensor of shape (1, 1, 2, 2) with the following values:
4 8
9 8
Full model:
import torch
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt

# 1. Prepare the data
batch_size = 64
transform = transforms.Compose([
    transforms.ToTensor(),  # convert the PIL image to a PyTorch tensor
    transforms.Normalize((0.1307,), (0.3081,))  # normalize: 0.1307 is the mean, 0.3081 the standard deviation
])

train_dataset = datasets.MNIST(root='./mnist_data/',
                               train=True,
                               download=True,
                               transform=transform)
train_loader = DataLoader(train_dataset, shuffle=True, batch_size=batch_size)

test_dataset = datasets.MNIST(root='./mnist_data/',
                              train=False,
                              download=True,
                              transform=transform)
# Fixed: the original passed train_dataset here, so the "test" accuracy was measured on training data
test_loader = DataLoader(test_dataset, shuffle=False, batch_size=batch_size)

# 2. Design the model
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = torch.nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = torch.nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = torch.nn.MaxPool2d(2)
        self.fc = torch.nn.Linear(320, 10)

    def forward(self, x):
        batch_size = x.size(0)
        x = F.relu(self.pooling(self.conv1(x)))
        x = F.relu(self.pooling(self.conv2(x)))
        x = x.view(batch_size, -1)  # flatten to (batch_size, 320)
        x = self.fc(x)
        return x

model = Net()

# 3. Loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

# 4. Training
def train(epoch):
    running_loss = 0.0
    for batch_idx, data in enumerate(train_loader, 0):
        inputs, target = data
        optimizer.zero_grad()

        # forward + backward + update
        outputs = model(inputs)
        loss = criterion(outputs, target)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if batch_idx % 300 == 299:
            print('[%d, %5d] loss: %.3f' % (epoch + 1, batch_idx + 1, running_loss / 300))
            running_loss = 0.0

epoch_list = []
acc_list = []

# 5. Testing
def test():
    correct = 0
    total = 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for data in test_loader:
            images, labels = data
            outputs = model(images)
            # along dim=1 (the class dimension), take the maximum value (_) and its index (predicted)
            _, predicted = torch.max(outputs.data, dim=1)
            # add the number of samples in this batch; labels has shape [N]
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    acc_list.append(100 * correct / total)
    print('Accuracy on test set: %d %%' % (100 * correct / total))

if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        epoch_list.append(epoch + 1)
        test()

    # plot accuracy against epoch
    plt.plot(epoch_list, acc_list)
    plt.xlabel('epoch')
    plt.ylabel('acc')
    plt.grid()
    plt.show()
Output: [training log and accuracy plot not preserved]
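The fully connected layer in the model takes 320 inputs. The sketch below traces the feature-map size of a 28x28 MNIST image through the two convolution and pooling stages to show where that number comes from. The helper names are hypothetical; the layer settings are the ones used in the model (5x5 kernels without padding, 2x2 pooling, 20 channels after the second convolution).

```python
def conv_out(size, kernel_size, stride=1, padding=0):
    """Output length along one axis of a convolution (no dilation assumed)."""
    return (size + 2 * padding - kernel_size) // stride + 1

def pool_out(size, kernel_size):
    """Output length of pooling with stride == kernel_size and no padding."""
    return size // kernel_size

size = 28                  # MNIST images are 28x28 with 1 channel
size = conv_out(size, 5)   # conv1: 1 -> 10 channels
after_conv1 = size
size = pool_out(size, 2)   # 2x2 max pooling
after_pool1 = size
size = conv_out(size, 5)   # conv2: 10 -> 20 channels
after_conv2 = size
size = pool_out(size, 2)   # 2x2 max pooling
after_pool2 = size

flattened = 20 * size * size  # channels * height * width
print(after_conv1, after_pool1, after_conv2, after_pool2)  # 24 12 8 4
print(flattened)  # 320
```

If the kernel sizes or the input resolution change, the first argument of the Linear layer has to be recomputed the same way.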
Running the code on a GPU:
Implementation:
import torch
import torch.nn as nn
from torchvision import transforms
from torchvision import datasets
from torch.utils.data import DataLoader
import torch.nn.functional as F
import torch.optim as optim
import matplotlib.pyplot as plt

# Prepare the dataset
batch_size = 64
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.1307,], std=[0.3081,])
])
# download=True only fetches the data if it is not already under root
train_dataset = datasets.MNIST(root="./mnist_data/", train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root="./mnist_data/", train=False, transform=transform, download=True)
train_dataloader = DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_dataloader = DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)  # no need to shuffle for evaluation

# Network model
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.pooling = nn.MaxPool2d(2)
        self.fc = nn.Linear(320, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))   # (n, 1, 28, 28) -> (n, 10, 24, 24)
        x = self.pooling(x)         # -> (n, 10, 12, 12)
        x = F.relu(self.conv2(x))   # -> (n, 20, 8, 8)
        x = self.pooling(x)         # -> (n, 20, 4, 4)
        x = x.view(x.size(0), -1)   # flatten to (n, 320)
        x = self.fc(x)
        return x

model = Model()
# Use the first GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)

# Loss function and optimizer
criterion = nn.CrossEntropyLoss(reduction='mean')
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

# Training
epoch_list = []
def train(epoch):
    running_loss = 0.0
    epoch_list.append(epoch + 1)
    for i, data in enumerate(train_dataloader, 0):
        inputs, target = data
        # the data must live on the same device as the model
        inputs, target = inputs.to(device), target.to(device)
        y_pred = model(inputs)
        loss = criterion(y_pred, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 300 == 299:
            print("{} {} loss:{:.3f}".format(epoch + 1, i + 1, running_loss / 300))
            running_loss = 0.0

# Testing
accuracy_list = []
def test():
    total = 0
    correct = 0
    with torch.no_grad():
        for i, data in enumerate(test_dataloader, 0):
            inputs, target = data
            inputs, target = inputs.to(device), target.to(device)
            y_pred = model(inputs)
            predicted = torch.argmax(y_pred.data, dim=1)
            total += target.size(0)
            correct += (predicted == target).sum().item()
    accuracy = correct / total  # stored as a fraction in [0, 1]
    accuracy_list.append(accuracy)
    print("Accuracy on test set:{:.2f} %".format(100 * correct / total))

if __name__ == '__main__':
    for epoch in range(10):
        train(epoch)
        test()

    # plot accuracy against epoch
    plt.plot(epoch_list, accuracy_list)
    plt.xlabel('epoch')
    plt.ylabel('accuracy')
    plt.grid()
    plt.show()
Output, including a run trained for 50 epochs: [figures not preserved]
https://blog.csdn.net/qq_43800119/article/details/126415845
https://blog.csdn.net/lizhuangabby/article/details/125730151?spm=1001.2014.3001.5501