
(Study Notes) Reimplementing AlexNet in PyTorch for MNIST Digit Recognition, with Loss, P-R, and ROC Curves

Contents

1. Background and Objectives

2. Main Structure of AlexNet

2.1 Convolutional Layers

2.2 Fully Connected Layers

2.3 PyTorch Implementation

3. Training the Model

4. Plotting the Loss Curve

4.1 Code

4.2 Results

5. P-R and ROC Curves

5.1 Code

5.2 Results

6. Predicting a Single Image


 

1. Background and Objectives

AlexNet was proposed by Alex Krizhevsky and his advisor Geoffrey Hinton at the 2012 ILSVRC competition; it is a descendant of the classic convolutional neural network. The goal of this note is to reimplement AlexNet in the PyTorch framework and use it to recognize handwritten digits from the MNIST dataset.

2. Main Structure of AlexNet

The AlexNet architecture is relatively simple: it has eight learned layers, the first five convolutional and the last three fully connected. AlexNet introduced several innovations for its time, among them training split across multiple GPUs, which was a real advantage back then; today, however, a single GPU easily handles this project. A simplified diagram of the structure is shown below.

Figure 1: AlexNet architecture diagram

AlexNet's biggest advantage is its use of the ReLU nonlinear activation function, f(x) = max(0, x). Although both pieces of this expression are linear, the function as a whole does not satisfy additivity, so it is genuinely nonlinear.

(On activation functions: real-world datasets are nonlinear, yet a neural network built from any number of stacked linear transformations is still a linear map; the activation function is what turns this linear process into a nonlinear one.)
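A tiny sketch (plain Python, added here purely for illustration) makes the point concrete: additivity would require relu(a + b) == relu(a) + relu(b) for every a and b, and one counterexample is enough to break it.

```python
def relu(x):
    # f(x) = max(0, x)
    return max(0.0, x)

# Additivity would require relu(a + b) == relu(a) + relu(b) for all a, b.
a, b = 3.0, -5.0
lhs = relu(a + b)        # relu(-2.0) = 0.0
rhs = relu(a) + relu(b)  # 3.0 + 0.0  = 3.0
print(lhs, rhs)          # 0.0 3.0 -- not equal, so ReLU is nonlinear
```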

AlexNet's other notable feature is its eight-layer design: the first five layers are convolutional and the remaining three are fully connected.

2.1 Convolutional Layers

The output size of a convolution (or pooling) operation is N = (W − F + 2P)/S + 1, where W is the side length of the W×W input, F is the kernel (or pooling window) size, P is the padding in pixels, and S is the stride; when the division is not exact, PyTorch floors it.

In principle you would repeat this calculation for every layer to get each input and output size, but in practice the library handles the shapes automatically, so we work through only the first layer here.

C1 (first layer): convolution → ReLU → max pooling

Convolution: input 1×224×224 (single channel, height × width); 48 kernels of size 11×11 (the original AlexNet used 96 kernels split as 48 per GPU across two GPUs; since one GPU suffices here, the code uses 48 output channels), padding = 2, stride = 4. The feature map size is therefore (224 − 11 + 2×2)/4 + 1 = 55 (flooring the division), i.e. 48×55×55. The remaining layers follow from their parameters in the same way; the one thing to watch is that two of the convolutional layers are not followed by max pooling, which is reflected in the code.
Activation: ReLU.
Pooling: window 3×3, padding = 0, stride = 2, so the output size is (55 − 3)/2 + 1 = 27, i.e. C1 outputs 48×27×27.
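The whole chain of sizes implied by the formula can be checked in a few lines of Python (out_size is a hypothetical helper, not part of the original code; the // mirrors PyTorch's flooring):

```python
def out_size(w, f, p, s):
    # N = floor((W - F + 2P) / S) + 1
    return (w - f + 2 * p) // s + 1

w = 224                    # input 1x224x224
w = out_size(w, 11, 2, 4)  # conv1 -> 55
w = out_size(w, 3, 0, 2)   # pool1 -> 27
w = out_size(w, 5, 2, 1)   # conv2 -> 27
w = out_size(w, 3, 0, 2)   # pool2 -> 13
w = out_size(w, 3, 1, 1)   # conv3 -> 13
w = out_size(w, 3, 1, 1)   # conv4 -> 13
w = out_size(w, 3, 1, 1)   # conv5 -> 13
w = out_size(w, 3, 0, 2)   # pool3 -> 6
print(w)                   # 6, matching nn.Linear(128 * 6 * 6, ...) in the code below
```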

2.2 Fully Connected Layers

In the fully connected layers, every node of one layer connects to every node of the next; they map the learned feature representation into the sample label space and thereby perform the classification.

An important operation in these layers is dropout: during training it removes each neuron from the forward pass with probability 0.5, which helps prevent overfitting.

Figure 2: Dropout illustration
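As a rough sketch of what dropout does (a NumPy stand-in for illustration, not the actual nn.Dropout internals): during training each activation is zeroed with probability p and the survivors are scaled by 1/(1 − p), so the expected value is preserved; at evaluation time the layer is the identity.

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    # inverted dropout: zero with probability p, rescale survivors by 1/(1-p)
    if not training:
        return x  # identity at evaluation time
    if rng is None:
        rng = np.random.default_rng(0)  # fixed seed so the demo is reproducible
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

x = np.ones(8)
train_out = dropout(x, p=0.5, training=True)
eval_out = dropout(x, p=0.5, training=False)
print(train_out)  # roughly half the entries are 0.0, the rest are scaled to 2.0
print(eval_out)   # unchanged
```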

 

2.3 PyTorch Implementation

```python
import torch.nn as nn
import torch


class AlexNet(nn.Module):
    def __init__(self, num_classes=10, init_weights=False):
        super(AlexNet, self).__init__()
        # Package the deep stack with nn.Sequential to keep it tidy
        self.features = nn.Sequential(
            nn.Conv2d(1, 48, kernel_size=11, stride=4, padding=2),  # conv layers
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(48, 128, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(128, 192, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 192, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(  # fully connected layers
            nn.Dropout(p=0.5),
            nn.Linear(128 * 6 * 6, 2048),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):  # forward pass
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)  # flatten all dims except batch
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        # Explicit weight initialization; said to be useful when reproducing
        # someone else's training setup
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)
```
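A quick sanity check of the convolutional stack (a standalone sketch that rebuilds the same nn.Sequential as self.features above) confirms the 128 × 6 × 6 feature map that the classifier's first Linear layer expects:

```python
import torch
import torch.nn as nn

# same conv stack as self.features in the AlexNet class above
features = nn.Sequential(
    nn.Conv2d(1, 48, 11, stride=4, padding=2), nn.ReLU(True), nn.MaxPool2d(3, 2),
    nn.Conv2d(48, 128, 5, padding=2), nn.ReLU(True), nn.MaxPool2d(3, 2),
    nn.Conv2d(128, 192, 3, padding=1), nn.ReLU(True),
    nn.Conv2d(192, 192, 3, padding=1), nn.ReLU(True),
    nn.Conv2d(192, 128, 3, padding=1), nn.ReLU(True), nn.MaxPool2d(3, 2),
)
with torch.no_grad():
    out = features(torch.zeros(1, 1, 224, 224))  # one dummy grayscale image
print(out.shape)  # torch.Size([1, 128, 6, 6]) -> flatten to 128*6*6 for the classifier
```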

3. Training the Model

```python
import os
import sys
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms
import torch.optim as optim
from tqdm import tqdm
from model import AlexNet

loss_list = []  # lists for later visualization
acc_list = []


def main():
    # (note: could never get this to select GPU 1 -- unresolved)
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    # training-set preprocessing
    train_transform = transforms.Compose([
        transforms.Resize(224),  # resize to the 224x224 input assumed in section 2.1
        # transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        # transforms.Normalize((0.1307,), (0.3081,)),  # optional MNIST normalization
    ])
    # test-set preprocessing
    test_transform = transforms.Compose([
        transforms.Resize(224),
        transforms.ToTensor(),
    ])

    # load the datasets; download=True fetches MNIST automatically if it is missing
    train_dataset = torchvision.datasets.MNIST(root='F:/data_set/minst/train', train=True,
                                               download=True, transform=train_transform)
    train_num = len(train_dataset)  # total number of training images
    batch_size = 100
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workers
    print('Using {} dataloader workers every process'.format(nw))
    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)
    test_dataset = torchvision.datasets.MNIST(root='F:/data_set/minst/train',
                                              train=False, download=True,
                                              transform=test_transform)
    test_num = len(test_dataset)
    test_loader = torch.utils.data.DataLoader(test_dataset,
                                              batch_size=100, shuffle=False,
                                              num_workers=nw)

    net = AlexNet(num_classes=10, init_weights=True)
    net.to(device)
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(net.parameters(), lr=0.0002)

    epochs = 10
    save_path = './AlexNet.pth'
    best_acc = 0.0
    train_steps = len(train_loader)
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        for step, data in enumerate(train_bar):
            images, labels = data
            optimizer.zero_grad()
            outputs = net(images.to(device))
            loss = loss_function(outputs, labels.to(device))
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            loss_list.append(loss.item())  # one loss value per iteration
            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                     epochs,
                                                                     loss)

        # "test" -- strictly speaking this set is being used as a validation set here
        net.eval()
        acc = 0.0  # accumulate the number of correct predictions per epoch
        with torch.no_grad():
            test_bar = tqdm(test_loader, file=sys.stdout)
            for test_data in test_bar:
                test_images, test_labels = test_data
                outputs = net(test_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]
                acc += torch.eq(predict_y, test_labels.to(device)).sum().item()
        test_accurate = acc / test_num
        acc_list.append(test_accurate)

        # dump the recorded values to files for the plotting script
        with open("./train_loss.txt", 'w') as train_loss:
            train_loss.write(str(loss_list))
        with open("./train_acc.txt", 'w') as train_acc:
            train_acc.write(str(acc_list))

        if test_accurate > best_acc:
            best_acc = test_accurate
            torch.save(net, save_path)  # saves the whole model object, not just a state_dict
        print('[epoch %d] train_loss: %.3f test_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, test_accurate))
    print('Finished Training')


if __name__ == '__main__':
    main()
```

4. Plotting the Loss Curve

4.1 Code

```python
import numpy as np
import matplotlib.pyplot as plt


# read back the data stored in the txt files
def data_read(dir_path):
    with open(dir_path, "r") as f:
        raw_data = f.read()  # the raw data is a comma-separated string like "[0.1, 0.2, ...]"
    data = raw_data[1:-1].split(", ")  # strip the brackets, then split into a list
    return np.asarray(data, dtype=float)  # convert the list to a float array


if __name__ == "__main__":
    train_loss_path = r"F:\AIPY\CNNm\train_loss.txt"
    train_acc_path = r"F:\AIPY\CNNm\train_acc.txt"
    y_train_loss = data_read(train_loss_path)
    y_train_acc = data_read(train_acc_path)
    x_train_loss = range(len(y_train_loss))
    x_train_acc = range(len(y_train_acc))

    plt.figure(figsize=(12, 6))
    plt.subplot(1, 2, 1)
    plt.xlabel('iteration')  # number of training iterations
    plt.ylabel('loss')
    plt.plot(x_train_loss, y_train_loss, linewidth=1, linestyle="solid", label="train loss")
    plt.title('loss curve')
    plt.subplot(1, 2, 2)
    plt.xlabel('epoch')
    plt.ylabel('acc')
    plt.plot(x_train_acc, y_train_acc, color='red', linestyle="solid", label="train accuracy")
    plt.legend()
    plt.title('Accuracy curve')
    plt.show()
```
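data_read assumes the file holds exactly the str(list) output the training script writes, i.e. a string like "[2.31, 1.07, 0.42]". A round-trip sketch using a temporary file (the values below are made up for illustration):

```python
import os
import tempfile
import numpy as np

def data_read(dir_path):
    # same parser as in the plotting script above
    with open(dir_path, "r") as f:
        raw_data = f.read()
    data = raw_data[1:-1].split(", ")
    return np.asarray(data, dtype=float)

loss_list = [2.31, 1.07, 0.42]
path = os.path.join(tempfile.mkdtemp(), "train_loss.txt")
with open(path, "w") as f:
    f.write(str(loss_list))  # same format the training script writes
print(data_read(path))       # [2.31 1.07 0.42]
```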

4.2 Results

Figure 3: Loss and accuracy curves

5. P-R and ROC Curves

5.1 Code

```python
import os
import sys
import prettytable
import torch
import torchvision
from torch.nn import functional as F
from torchvision import transforms
import matplotlib.pyplot as plt
import numpy as np
from tqdm import tqdm
from model import AlexNet  # needed so torch.load can unpickle the saved model
from sklearn.metrics import precision_recall_curve, roc_curve


def main():
    device = torch.device("cpu")  # run on CPU
    print("using {} device.".format(device))

    # test-set preprocessing
    batch_size = 100
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workers
    test_transform = transforms.Compose([transforms.Resize(224),
                                         transforms.ToTensor(),
                                         ])
    test_dataset = torchvision.datasets.MNIST(root='F:/data_set/minst/train',
                                              train=False, download=True,
                                              transform=test_transform)
    test_num = len(test_dataset)  # total number of test images
    test_loader = torch.utils.data.DataLoader(test_dataset,
                                              batch_size=batch_size, shuffle=False,
                                              num_workers=nw)

    # load the trained model (the training script saved the whole model object;
    # on recent PyTorch versions you may need torch.load(..., weights_only=False))
    model = torch.load('AlexNet.pth')
    model.eval()
    model.to(device)

    pred_list = torch.tensor([])
    with torch.no_grad():
        test_bar = tqdm(test_loader, file=sys.stdout)
        for X, y in test_bar:
            pred = model(X)
            pred_list = torch.cat([pred_list, pred])  # concatenate each batch of outputs

    # one giant batch just to collect all ground-truth labels in order;
    # next(iter(...)) pulls a single batch, which here is the whole test set
    test_iter1 = torch.utils.data.DataLoader(test_dataset, batch_size=10000, shuffle=False,
                                             num_workers=2)
    features, labels = next(iter(test_iter1))
    print(labels.shape)

    # per-class precision and recall over the whole test set, via a 10x10
    # confusion matrix (rows: true class, columns: predicted class)
    train_result = np.zeros((10, 10), dtype=int)
    for i in range(test_num):
        train_result[labels[i]][np.argmax(pred_list[i])] += 1  # count each (true, predicted) pair
    result_table = prettytable.PrettyTable()
    result_table.field_names = ['Type', 'Precision', 'Recall', 'F1_Score']
    class_names = ['Zero', 'One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven', 'Eight', 'Nine']
    for i in range(10):  # precision and recall for each class 0-9
        precision_i = train_result[i][i] / train_result.sum(axis=0)[i]  # TP/(TP+FP): column sum
        recall_i = train_result[i][i] / train_result.sum(axis=1)[i]     # TP/(TP+FN): row sum
        result_table.add_row([class_names[i], np.round(precision_i, 3), np.round(recall_i, 3),
                              np.round(2 * precision_i * recall_i / (precision_i + recall_i), 3)])
    print(result_table)

    # softmax turns the final-layer outputs into probabilities in (0, 1)
    pred_probabilities = F.softmax(pred_list, dim=1)

    # plot a P-R curve and a ROC curve for each class (one-vs-rest)
    for i in range(10):
        temp_true = []           # binarized ground truth: 1 if the true label is class i
        temp_probabilities = []  # predicted probability for class i
        for j in range(len(labels)):
            temp_true.append(1 if i == labels[j] else 0)
            temp_probabilities.append(pred_probabilities[j][i].item())
        precision, recall, thresholds = precision_recall_curve(temp_true, temp_probabilities)
        # TPR = TP/(TP+FN), FPR = FP/(FP+TN)
        fpr, tpr, roc_thresholds = roc_curve(temp_true, temp_probabilities)

        plt.figure(figsize=(12, 6))
        plt.subplot(1, 2, 1)  # 1 row, 2 columns: first panel
        plt.xlabel('Recall')
        plt.ylabel('Precision')
        plt.title(f'Precision & Recall Curve (class:{i})')  # f-string formatting
        plt.plot(recall, precision, 'yellow')
        plt.subplot(1, 2, 2)
        plt.xlabel('Fpr')
        plt.ylabel('Tpr')
        plt.title(f'Roc Curve (class:{i})')
        plt.plot(fpr, tpr, 'cyan')
        plt.show()


if __name__ == '__main__':
    main()
```
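The per-class bookkeeping above can be verified on a tiny hand-made confusion matrix (the counts below are hypothetical; rows are true labels, columns are predictions):

```python
import numpy as np

# 3-class toy confusion matrix: rows are true labels, columns are predictions
cm = np.array([[8, 1, 1],
               [2, 7, 1],
               [0, 2, 8]])

i = 0  # class of interest
tp = cm[i][i]
precision = tp / cm.sum(axis=0)[i]  # TP/(TP+FP): column sum = everything predicted as i
recall = tp / cm.sum(axis=1)[i]     # TP/(TP+FN): row sum = everything truly i
f1 = 2 * precision * recall / (precision + recall)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.8 0.8
```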

5.2 Results

Figure 4: P-R and ROC curves

6. Predicting a Single Image

```python
import torch
import torchvision.transforms as transforms
from PIL import Image
from model import AlexNet  # needed so torch.load can unpickle the saved model
import matplotlib.pyplot as plt

classes = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9')  # class names for the output

predict_transform = transforms.Compose([transforms.ToTensor(),
                                        transforms.Resize(224),
                                        ])

# The training script saved the whole model object with torch.save(net, ...),
# so load the whole model here; use load_state_dict only if you saved a state_dict.
net = torch.load('AlexNet.pth')
net.eval()

image = Image.open('9.png')
image_gray = image.convert('L')  # convert to single-channel grayscale
image_gray = predict_transform(image_gray)
image_gray = torch.unsqueeze(image_gray, dim=0)  # add a batch dimension: 1x1xHxW

plt.imshow(image)
plt.axis('off')
plt.show()

with torch.no_grad():
    outputs = net(image_gray)
    predict_y = torch.max(outputs, dim=1)[1]  # index of the largest logit = predicted class
    print(classes[predict_y])
```
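torch.max(outputs, dim=1)[1] simply picks the index of the largest logit; the same step in NumPy, with made-up logits for a single image:

```python
import numpy as np

classes = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9')
# hypothetical logits for one image; index 9 holds the largest value
outputs = np.array([[-1.2, 0.3, 0.1, -0.5, 0.0, 0.2, -0.8, 1.1, 0.9, 4.7]])
predict_y = int(np.argmax(outputs, axis=1)[0])  # same role as torch.max(..., dim=1)[1]
print(classes[predict_y])  # 9
```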

