
GoogLeNet: Network Structure, Implementation, and Walkthrough

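Highlights of the network:

1. It introduces the Inception structure, which fuses feature information at different scales.

2. It uses 1x1 convolution kernels for dimensionality reduction and projection.

3. It adds two auxiliary classifiers to help training. AlexNet and VGG each have a single output layer; GoogLeNet has three, two of which are auxiliary classification layers.

4. It drops the stacked fully connected layers in favor of an average pooling layer, which greatly reduces the number of model parameters. Only the final classification layer remains.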
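To get a feel for highlight 4, the sketch below counts the parameters of one fully connected layer applied to a flattened 1024 x 7 x 7 feature map, and of the same layer applied after average pooling down to 1024 x 1 x 1. The shapes and the 1000 classes match the model code in this article. Modeling the alternative as a single direct fully connected layer is an illustrative assumption, not a description of how AlexNet or VGG are actually built.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


channels, height, width, num_classes = 1024, 7, 7, 1000

# Flatten the whole 1024 x 7 x 7 map, then classify.
flatten_then_fc = linear_params(channels * height * width, num_classes)

# Average-pool to 1024 x 1 x 1 first, then classify.
pool_then_fc = linear_params(channels, num_classes)

print(flatten_then_fc)  # 50177000
print(pool_then_fc)     # 1025000
```

Under these assumptions, pooling first shrinks the classifier by a factor of roughly 49, the number of spatial positions that are averaged away.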

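Left: the initial (naive) version of the Inception structure. Right: the Inception structure with dimensionality reduction added; the three extra 1x1 convolution layers perform the reduction.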

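AlexNet and VGG, covered earlier, are both serial structures: each network is built by chaining a series of convolution layers and max-pooling downsampling layers.

Inception, by contrast, is a parallel structure: the incoming feature map is fed into 4 branches at the same time.

The feature maps produced by the 4 branches are then concatenated along the depth (channel) dimension to form the output feature map.

Note: the feature maps from all branches must have the same height and width; otherwise they cannot be concatenated along depth.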
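Because the branches are concatenated along depth, the output depth of an Inception module is the sum of the four branch output depths. The sketch below checks this against the Inception configurations used in the model code of this article. `inception_out_channels` and `check_chain` are hypothetical helper names introduced only for this illustration.

```python
# (in_channels, ch1x1, ch3x3red, ch3x3, ch5x5red, ch5x5, pool_proj)
INCEPTION_CONFIGS = {
    "3a": (192, 64, 96, 128, 16, 32, 32),
    "3b": (256, 128, 128, 192, 32, 96, 64),
    "4a": (480, 192, 96, 208, 16, 48, 64),
    "4b": (512, 160, 112, 224, 24, 64, 64),
    "4c": (512, 128, 128, 256, 24, 64, 64),
    "4d": (512, 112, 144, 288, 32, 64, 64),
    "4e": (528, 256, 160, 320, 32, 128, 128),
    "5a": (832, 256, 160, 320, 32, 128, 128),
    "5b": (832, 384, 192, 384, 48, 128, 128),
}


def inception_out_channels(cfg):
    """Depth after concatenating the four branches; the reduce layers do not count."""
    _, ch1x1, _, ch3x3, _, ch5x5, pool_proj = cfg
    return ch1x1 + ch3x3 + ch5x5 + pool_proj


def check_chain(configs):
    """Return each module's output depth and verify it feeds the next module."""
    outputs = {}
    previous = None
    for name, cfg in configs.items():
        if previous is not None:
            assert cfg[0] == previous, f"{name} expects {cfg[0]}, got {previous}"
        previous = inception_out_channels(cfg)
        outputs[name] = previous
    return outputs


outputs = check_chain(INCEPTION_CONFIGS)
print(outputs["3a"], outputs["4d"], outputs["5b"])  # 256 528 1024
```

The pooling layers between stages change only height and width, so the depth chain can be checked without them.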

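How does a 1x1 convolution kernel reduce dimensionality?

The depth of an output feature map is determined by the number of kernels, so applying 24 kernels of size 1x1 reduces the depth to 24. The layer that follows then sees fewer input channels, which reduces its parameters and therefore the amount of computation.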
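A small worked count makes the saving concrete. Only the 24 kernels of size 1x1 come from the text above; the 512-channel input and the 64 kernels of size 5x5 are assumed values chosen for illustration, and biases are ignored.

```python
def conv_weights(in_channels, out_channels, kernel_size):
    """Number of weights in a square convolution layer, ignoring biases."""
    return kernel_size * kernel_size * in_channels * out_channels


in_channels = 512   # assumed input depth
reduced = 24        # number of 1x1 kernels, as in the text
out_channels = 64   # assumed number of 5x5 kernels

# 5x5 convolution applied directly to the 512-channel input.
direct = conv_weights(in_channels, out_channels, 5)

# 1x1 reduction to 24 channels, followed by the same 5x5 convolution.
with_reduction = (conv_weights(in_channels, reduced, 1)
                  + conv_weights(reduced, out_channels, 5))

print(direct)          # 819200
print(with_reduction)  # 50688
```

With these assumed sizes the reduced path needs about 6% of the weights of the direct path, while the output depth stays at 64.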

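Auxiliary Classifier

The first layer is an average-pooling downsampling operation with a 5x5 pooling kernel and stride=3.

Suppose the input is 14x14x512. With no padding, the output height and width are

(14-5)/3+1=4

so the output feature map is 4x4x512.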
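The same arithmetic can be written as a small function. This is a minimal sketch of the usual output-size formula with rounding down; `pool_output_size` is a hypothetical helper name, and the 128 used for the flattened size is the number of 1x1 kernels in the auxiliary classifier's convolution layer.

```python
def pool_output_size(size, kernel_size, stride, padding=0):
    """Output height/width of a pooling or convolution layer, rounding down."""
    return (size + 2 * padding - kernel_size) // stride + 1


side = pool_output_size(14, kernel_size=5, stride=3)
print(side)  # 4

# After the 128-kernel 1x1 convolution the map is 128 x 4 x 4,
# which flattens to the input size of the first fully connected layer.
flattened = 128 * side * side
print(flattened)  # 2048
```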

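The second layer is a convolution layer with 128 kernels of size 1x1; its purpose is dimensionality reduction, and it uses the ReLU activation function.

The third layer is a fully connected layer with 1024 nodes, also followed by ReLU.

Dropout is applied between the two fully connected layers, randomly deactivating neurons at a rate of 70%. The implementation in this article uses a rate of 0.5 instead.

The fourth layer is the output layer; its 1000 nodes correspond to the number of classes.

Finally, a Softmax function turns the outputs into a probability distribution.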
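For reference, Softmax maps a vector of raw scores to non-negative values that sum to 1. The sketch below is a plain-Python version for a single sample, not the PyTorch implementation; subtracting the maximum before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)       # the largest score gets the largest probability
print(sum(probs))  # sums to 1, up to floating-point rounding
```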

Parameter table

Parameter comparison between GoogLeNet and VGG

Code implementation

Network structure code

import torch
import torch.nn as nn
import torch.nn.functional as F


class GoogLeNet(nn.Module):
    def __init__(self, num_classes=1000, aux_logits=True, init_weights=False):
        super(GoogLeNet, self).__init__()
        self.aux_logits = aux_logits

        self.conv1 = BasicConv2d(3, 64, kernel_size=7, stride=2, padding=3)
        # ceil_mode=True rounds a fractional output size up; ceil_mode=False rounds it down
        self.maxpool1 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        # the first 64 is the input depth, the second 64 is the number of kernels
        self.conv2 = BasicConv2d(64, 64, kernel_size=1)
        self.conv3 = BasicConv2d(64, 192, kernel_size=3, padding=1)
        self.maxpool2 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        # the first argument is the input depth; the rest follow the parameter table
        self.inception3a = Inception(192, 64, 96, 128, 16, 32, 32)
        self.inception3b = Inception(256, 128, 128, 192, 32, 96, 64)
        self.maxpool3 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        self.inception4a = Inception(480, 192, 96, 208, 16, 48, 64)
        self.inception4b = Inception(512, 160, 112, 224, 24, 64, 64)
        self.inception4c = Inception(512, 128, 128, 256, 24, 64, 64)
        self.inception4d = Inception(512, 112, 144, 288, 32, 64, 64)
        self.inception4e = Inception(528, 256, 160, 320, 32, 128, 128)
        self.maxpool4 = nn.MaxPool2d(3, stride=2, ceil_mode=True)

        self.inception5a = Inception(832, 256, 160, 320, 32, 128, 128)
        self.inception5b = Inception(832, 384, 192, 384, 48, 128, 128)

        if self.aux_logits:
            self.aux1 = InceptionAux(512, num_classes)  # takes the output of inception4a
            self.aux2 = InceptionAux(528, num_classes)  # takes the output of inception4d

        # Adaptive average pooling: (1, 1) is the output height and width.
        # Whatever the input height and width, the output has the size we specify.
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.dropout = nn.Dropout(0.4)
        self.fc = nn.Linear(1024, num_classes)
        if init_weights:
            self._initialize_weights()

    def forward(self, x):  # forward pass of the network
        # N x 3 x 224 x 224
        x = self.conv1(x)
        # N x 64 x 112 x 112
        x = self.maxpool1(x)
        # N x 64 x 56 x 56
        x = self.conv2(x)
        # N x 64 x 56 x 56
        x = self.conv3(x)
        # N x 192 x 56 x 56
        x = self.maxpool2(x)

        # N x 192 x 28 x 28
        x = self.inception3a(x)
        # N x 256 x 28 x 28
        x = self.inception3b(x)
        # N x 480 x 28 x 28
        x = self.maxpool3(x)
        # N x 480 x 14 x 14
        x = self.inception4a(x)
        # N x 512 x 14 x 14
        if self.training and self.aux_logits:  # skipped in eval mode
            aux1 = self.aux1(x)

        x = self.inception4b(x)
        # N x 512 x 14 x 14
        x = self.inception4c(x)
        # N x 512 x 14 x 14
        x = self.inception4d(x)
        # N x 528 x 14 x 14
        if self.training and self.aux_logits:  # skipped in eval mode
            aux2 = self.aux2(x)

        x = self.inception4e(x)
        # N x 832 x 14 x 14
        x = self.maxpool4(x)
        # N x 832 x 7 x 7
        x = self.inception5a(x)
        # N x 832 x 7 x 7
        x = self.inception5b(x)
        # N x 1024 x 7 x 7

        x = self.avgpool(x)
        # N x 1024 x 1 x 1
        x = torch.flatten(x, 1)
        # N x 1024
        x = self.dropout(x)
        x = self.fc(x)
        # N x 1000 (num_classes)
        if self.training and self.aux_logits:  # skipped in eval mode
            return x, aux2, aux1
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)


class Inception(nn.Module):  # Inception module template
    def __init__(self, in_channels, ch1x1, ch3x3red, ch3x3, ch5x5red, ch5x5, pool_proj):
        super(Inception, self).__init__()

        self.branch1 = BasicConv2d(in_channels, ch1x1, kernel_size=1)

        self.branch2 = nn.Sequential(  # layers are passed as positional arguments
            BasicConv2d(in_channels, ch3x3red, kernel_size=1),
            BasicConv2d(ch3x3red, ch3x3, kernel_size=3, padding=1)  # keeps output size equal to input size
        )

        self.branch3 = nn.Sequential(
            BasicConv2d(in_channels, ch5x5red, kernel_size=1),
            # The official implementation actually uses a 3x3 kernel here rather than 5x5.
            # It is left as 5x5 in this article.
            # Please see https://github.com/pytorch/vision/issues/906 for details.
            BasicConv2d(ch5x5red, ch5x5, kernel_size=5, padding=2)  # keeps output size equal to input size
        )

        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            BasicConv2d(in_channels, pool_proj, kernel_size=1)
        )

    def forward(self, x):  # forward pass
        branch1 = self.branch1(x)
        branch2 = self.branch2(x)
        branch3 = self.branch3(x)
        branch4 = self.branch4(x)

        # collect the outputs of the 4 branches in a list
        outputs = [branch1, branch2, branch3, branch4]
        # concatenate along dimension 1, the channel (depth) dimension
        return torch.cat(outputs, 1)


class InceptionAux(nn.Module):  # auxiliary classifier template
    def __init__(self, in_channels, num_classes):
        super(InceptionAux, self).__init__()
        self.averagePool = nn.AvgPool2d(kernel_size=5, stride=3)
        self.conv = BasicConv2d(in_channels, 128, kernel_size=1)  # output[batch, 128, 4, 4]

        self.fc1 = nn.Linear(2048, 1024)  # 2048 = 128 * 4 * 4, the number of nodes after flattening
        self.fc2 = nn.Linear(1024, num_classes)

    def forward(self, x):
        # input shape: aux1: N x 512 x 14 x 14, aux2: N x 528 x 14 x 14
        x = self.averagePool(x)
        # aux1: N x 512 x 4 x 4, aux2: N x 528 x 4 x 4
        x = self.conv(x)
        # N x 128 x 4 x 4
        x = torch.flatten(x, 1)
        # After instantiating a model, model.train() and model.eval() control its state:
        # self.training is True under model.train() and False under model.eval().
        x = F.dropout(x, 0.5, training=self.training)
        # N x 2048
        x = F.relu(self.fc1(x), inplace=True)
        x = F.dropout(x, 0.5, training=self.training)
        # N x 1024
        x = self.fc2(x)
        # N x num_classes
        return x


class BasicConv2d(nn.Module):  # convolution + ReLU template
    def __init__(self, in_channels, out_channels, **kwargs):
        super(BasicConv2d, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, **kwargs)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):  # forward pass
        x = self.conv(x)
        x = self.relu(x)
        return x
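The height and width noted in the forward-pass comments above can be reproduced with the standard output-size formula. The sketch below traces only the layers that change spatial size (conv1 and the four max-pooling layers), since the other layers preserve it. `out_size` and `trace` are hypothetical helpers, and the extra check inside `out_size` reflects my understanding of PyTorch's ceil_mode rule that a last window starting beyond the padded input is dropped.

```python
import math


def out_size(size, kernel, stride=1, padding=0, ceil_mode=False):
    """Output height/width of a convolution or pooling layer."""
    span = (size + 2 * padding - kernel) / stride
    out = (math.ceil(span) if ceil_mode else math.floor(span)) + 1
    # With ceil_mode, a last window that would start past the input
    # (plus left padding) is not counted.
    if ceil_mode and (out - 1) * stride >= size + padding:
        out -= 1
    return out


# (name, kernel, stride, padding, is_pool) for the size-changing layers
LAYERS = [
    ("conv1", 7, 2, 3, False),
    ("maxpool1", 3, 2, 0, True),
    ("maxpool2", 3, 2, 0, True),
    ("maxpool3", 3, 2, 0, True),
    ("maxpool4", 3, 2, 0, True),
]


def trace(size, use_ceil=True):
    """Return (layer name, output size) pairs for a square input."""
    sizes = []
    for name, kernel, stride, padding, is_pool in LAYERS:
        size = out_size(size, kernel, stride, padding,
                        ceil_mode=is_pool and use_ceil)
        sizes.append((name, size))
    return sizes


print(trace(224))                  # sizes 112, 56, 28, 14, 7
print(trace(224, use_ceil=False))  # sizes 112, 55, 27, 13, 6
```

With rounding down, the map entering the auxiliary classifiers would be 13x13 rather than 14x14, and the hard-coded `nn.Linear(2048, 1024)` would no longer match, which is why the pooling layers set `ceil_mode=True`.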

Training code

import os
import sys
import json

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, datasets
from tqdm import tqdm

from model import GoogLeNet


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    data_transform = {
        "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                     transforms.RandomHorizontalFlip(),
                                     transforms.ToTensor(),
                                     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
        "val": transforms.Compose([transforms.Resize((224, 224)),
                                   transforms.ToTensor(),
                                   transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

    data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
    image_path = os.path.join(data_root, "data_set", "flower_data")  # flower data set path
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)
    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    train_num = len(train_dataset)

    # {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}
    flower_list = train_dataset.class_to_idx
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # write dict into json file
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    batch_size = 32
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])  # number of workers
    print('Using {} dataloader workers every process'.format(nw))

    # The original listing computed nw but passed num_workers=0;
    # set num_workers=0 if multi-process loading causes problems on your platform.
    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)

    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=batch_size, shuffle=False,
                                                  num_workers=nw)

    print("using {} images for training, {} images for validation.".format(train_num,
                                                                           val_num))

    # test_data_iter = iter(validate_loader)
    # test_image, test_label = next(test_data_iter)

    net = GoogLeNet(num_classes=5, aux_logits=True, init_weights=True)
    # To use the official pretrained weights, load them into the official model,
    # not into the model implemented in this article.
    # The official model uses BN layers and changes some parameters, so the two cannot be mixed.
    # import torchvision
    # net = torchvision.models.googlenet(num_classes=5)
    # model_dict = net.state_dict()
    # # pretrained weights: https://download.pytorch.org/models/googlenet-1378be20.pth
    # pretrain_model = torch.load("googlenet.pth")
    # del_list = ["aux1.fc2.weight", "aux1.fc2.bias",
    #             "aux2.fc2.weight", "aux2.fc2.bias",
    #             "fc.weight", "fc.bias"]
    # pretrain_dict = {k: v for k, v in pretrain_model.items() if k not in del_list}
    # model_dict.update(pretrain_dict)
    # net.load_state_dict(model_dict)
    net.to(device)
    loss_function = nn.CrossEntropyLoss()
    optimizer = optim.Adam(net.parameters(), lr=0.0003)

    epochs = 10
    best_acc = 0.0
    save_path = './googleNet.pth'
    train_steps = len(train_loader)
    for epoch in range(epochs):
        # train
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        for step, data in enumerate(train_bar):
            images, labels = data
            labels = labels.to(device)
            optimizer.zero_grad()
            # in training mode the model returns the main output and both auxiliary outputs
            logits, aux_logits2, aux_logits1 = net(images.to(device))
            loss0 = loss_function(logits, labels)
            loss1 = loss_function(aux_logits1, labels)
            loss2 = loss_function(aux_logits2, labels)
            # each auxiliary loss is added with a weight of 0.3
            loss = loss0 + loss1 * 0.3 + loss2 * 0.3
            loss.backward()   # backpropagate the loss
            optimizer.step()  # update the model parameters

            # print statistics
            running_loss += loss.item()

            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                     epochs,
                                                                     loss.item())

        # validate
        net.eval()
        acc = 0.0  # accumulate accurate number / epoch
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)  # progress bar
            for val_data in val_bar:
                val_images, val_labels = val_data
                # in eval mode the model returns only the main output;
                # the auxiliary classifiers are not needed for validation
                outputs = net(val_images.to(device))
                predict_y = torch.max(outputs, dim=1)[1]
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        # keep only the weights with the best validation accuracy
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)

    print('Finished Training')


if __name__ == '__main__':
    main()
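The way the three losses are combined in the training loop can be illustrated without PyTorch. The sketch below computes the cross-entropy of a single sample in plain Python and applies the same 0.3 weights to the two auxiliary losses. `cross_entropy` and `googlenet_loss` are hypothetical helpers; unlike `nn.CrossEntropyLoss`, they do not average over a batch.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(v - shift) for v in logits))
    return log_sum_exp - logits[target]


def googlenet_loss(main_logits, aux1_logits, aux2_logits, target, aux_weight=0.3):
    """Main loss plus the two auxiliary losses, each scaled by aux_weight."""
    loss0 = cross_entropy(main_logits, target)
    loss1 = cross_entropy(aux1_logits, target)
    loss2 = cross_entropy(aux2_logits, target)
    return loss0 + aux_weight * loss1 + aux_weight * loss2


# Made-up logits for a 5-class problem; the true class is index 2.
total = googlenet_loss([0.1, 0.2, 2.5, 0.0, -1.0],
                       [0.0, 0.0, 1.0, 0.0, 0.0],
                       [0.3, 0.1, 0.8, 0.2, 0.0],
                       target=2)
print(total)
```

The auxiliary outputs contribute to the gradient only during training; at validation and prediction time the model returns the main output alone.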

Training results

Prediction code

import os
import json

import torch
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

from model import GoogLeNet


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    data_transform = transforms.Compose(
        [transforms.Resize((224, 224)),
         transforms.ToTensor(),
         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

    # load image
    img_path = "../tulip.jpg"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    # convert to RGB so that grayscale or RGBA files also match the 3-channel Normalize
    img = Image.open(img_path).convert("RGB")
    plt.imshow(img)
    # [C, H, W]
    img = data_transform(img)
    # expand batch dimension: [N, C, H, W]
    img = torch.unsqueeze(img, dim=0)

    # read class_indict (JSON keys are strings, hence the str(...) lookups below)
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)

    with open(json_path, "r") as f:
        class_indict = json.load(f)

    # create model; the auxiliary classifiers are not needed for prediction
    model = GoogLeNet(num_classes=5, aux_logits=False).to(device)

    # load model weights
    weights_path = "./googleNet.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    # strict=True requires the keys of the model and of the loaded weights to match exactly.
    # strict=False is used because the saved weights still contain the auxiliary
    # classifier parameters, which this model (aux_logits=False) does not have.
    missing_keys, unexpected_keys = model.load_state_dict(torch.load(weights_path, map_location=device),
                                                          strict=False)

    model.eval()
    with torch.no_grad():
        # predict class
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).item()

    print_res = "class: {}   prob: {:.3}".format(class_indict[str(predict_cla)],
                                                 predict[predict_cla].item())
    plt.title(print_res)
    for i in range(len(predict)):
        print("class: {:10}   prob: {:.3}".format(class_indict[str(i)],
                                                  predict[i].item()))
    plt.show()


if __name__ == '__main__':
    main()
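The prediction script loads the weights with `strict=False` because the checkpoint was saved from a model with auxiliary classifiers, while the prediction model is built with `aux_logits=False`. The sketch below mimics, with plain sets, how the missing and unexpected keys returned by `load_state_dict` come about. `compare_keys` is a hypothetical helper, and the key lists are abbreviated samples rather than the full state dict.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) keys, in the spirit of load_state_dict(strict=False)."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # in the model, absent from the file
    unexpected = sorted(checkpoint_keys - model_keys)   # in the file, absent from the model
    return missing, unexpected


# Abbreviated sample of keys saved during training (aux_logits=True).
checkpoint_keys = ["conv1.conv.weight", "fc.weight",
                   "aux1.fc2.weight", "aux2.fc2.weight"]
# Abbreviated sample of keys in the prediction model (aux_logits=False).
model_keys = ["conv1.conv.weight", "fc.weight"]

missing, unexpected = compare_keys(model_keys, checkpoint_keys)
print(missing)     # []
print(unexpected)  # ['aux1.fc2.weight', 'aux2.fc2.weight']
```

In the real script it is worth printing `missing_keys`: an empty list confirms that every parameter the prediction model needs was found in the file.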

Prediction results
