
Training a Four-Class Weather Classification Model with ResNet

Task: train an image classification model that sorts weather images into four classes: rain, snow, fog, and sunny.

Reference article: CNN经典网络模型(五):ResNet简介及代码实现(PyTorch超详细注释版)

Contents

1. Dataset Preparation

2. Model

3. Training

4. Testing

5. Model Comparison

6. Training on a New Dataset


1. Dataset Preparation

The dataset contains four kinds of weather images, with 10,000 images per class. Sort the images by class into separate folders and gather the class folders under an image directory, as follows:
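(The English class-folder names below are only an illustration; the original article does not name them. What matters is one sub-folder per weather class under image/.)

```
image/
├── rain/     # 10,000 rain images
├── snow/     # 10,000 snow images
├── fog/      # 10,000 fog images
└── sunny/    # 10,000 sunny images
```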

  • split_data.py: splits the dataset into a training set and a validation set

```python
import os
import random
from shutil import copy, rmtree


def mk_file(file_path: str):
    # If the folder already exists, delete it and recreate it empty
    if os.path.exists(file_path):
        rmtree(file_path)
    os.makedirs(file_path)


def main():
    # Fix the seed so the split is reproducible
    random.seed(0)

    # Move 10% of each class into the validation set
    split_rate = 0.1

    # getcwd(): working directory of the running script
    cwd = os.getcwd()
    # join(): concatenates path components
    data_root = os.path.join(cwd, "")
    # "image" folder holding one sub-folder per weather class
    origin_flower_path = os.path.join(data_root, "image")
    # Make sure the source folder exists
    assert os.path.exists(origin_flower_path), "path '{}' does not exist.".format(origin_flower_path)

    # isdir(): is the path a directory?  listdir(): entries of a folder
    flower_class = [cla for cla in os.listdir(origin_flower_path)
                    if os.path.isdir(os.path.join(origin_flower_path, cla))]

    # Create data/train (a relative path next to the image folder)
    # with one sub-folder per class
    train_root = os.path.join(data_root, "data", "train")
    mk_file(train_root)
    for cla in flower_class:
        mk_file(os.path.join(train_root, cla))

    # Create data/val with one sub-folder per class
    val_root = os.path.join(data_root, "data", "val")
    mk_file(val_root)
    for cla in flower_class:
        mk_file(os.path.join(val_root, cla))

    # Walk every class and split its images into train/val
    for cla in flower_class:
        cla_path = os.path.join(origin_flower_path, cla)
        # File names of every image in this class
        images = os.listdir(cla_path)
        num = len(images)
        # random.sample: draws k unique items; these file names go to the validation set
        eval_index = random.sample(images, k=int(num * split_rate))
        for index, image in enumerate(images):
            if image in eval_index:
                # Copy validation images into data/val/<class>
                image_path = os.path.join(cla_path, image)
                new_path = os.path.join(val_root, cla)
                copy(image_path, new_path)
            else:
                # Copy training images into data/train/<class>
                image_path = os.path.join(cla_path, image)
                new_path = os.path.join(train_root, cla)
                copy(image_path, new_path)
            # '\r' returns to the start of the line so the progress counter
            # overwrites itself; end="" suppresses print's newline
            print("\r[{}] processing [{}/{}]".format(cla, index + 1, num), end="")
        print()
    print("processing done!")


if __name__ == '__main__':
    main()
```

Running the script yields the split training and validation sets:
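With split_rate = 0.1 the resulting layout is roughly the following (class names illustrative, as above):

```
data/
├── train/        # ~9,000 images per class
│   ├── fog/
│   ├── rain/
│   ├── snow/
│   └── sunny/
└── val/          # ~1,000 images per class
    ├── fog/
    ├── rain/
    ├── snow/
    └── sunny/
```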

2. Model

  • model.py: defines the ResNet network model (a quick shape check follows the listing)

```python
import torch.nn as nn
import torch


# Residual block used by ResNet-18/34: two 3x3 convolutions
class BasicBlock(nn.Module):
    # expansion: ratio between the block's output channels and its base channel count (1 here)
    expansion = 1

    # downsample=None corresponds to the solid-line (identity) shortcut;
    # otherwise it is the dashed-line shortcut that reshapes the identity
    def __init__(self, in_channel, out_channel, stride=1, downsample=None, **kwargs):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=out_channel,
                               kernel_size=3, stride=stride, padding=1, bias=False)
        # Batch normalization
        self.bn1 = nn.BatchNorm2d(out_channel)
        # ReLU activation
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(in_channels=out_channel, out_channels=out_channel,
                               kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channel)
        self.downsample = downsample

    # forward(): defines how the layers are connected
    def forward(self, x):
        # Keep the original input for the shortcut branch
        identity = x
        # Dashed-line shortcut: downsample/reshape the identity
        if self.downsample is not None:
            identity = self.downsample(x)

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        # Add the shortcut to the main branch
        out += identity
        out = self.relu(out)

        return out


# Residual block used by ResNet-50/101/152: 1x1 + 3x3 + 1x1 convolutions
class Bottleneck(nn.Module):
    # The bottleneck expands the channel count by a factor of 4 (e.g. 64 -> 256)
    expansion = 4

    # downsample=None corresponds to the solid-line (identity) shortcut;
    # otherwise it is the dashed-line shortcut that adjusts the channels of x
    def __init__(self, in_channel, out_channel, stride=1, downsample=None,
                 groups=1, width_per_group=64):
        super(Bottleneck, self).__init__()

        width = int(out_channel * (width_per_group / 64.)) * groups

        self.conv1 = nn.Conv2d(in_channels=in_channel, out_channels=width,
                               kernel_size=1, stride=1, bias=False)
        self.bn1 = nn.BatchNorm2d(width)
        # -----------------------------------------
        self.conv2 = nn.Conv2d(in_channels=width, out_channels=width, groups=groups,
                               kernel_size=3, stride=stride, bias=False, padding=1)
        self.bn2 = nn.BatchNorm2d(width)
        # -----------------------------------------
        self.conv3 = nn.Conv2d(in_channels=width, out_channels=out_channel * self.expansion,
                               kernel_size=1, stride=1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channel * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        # Keep the original input for the shortcut branch
        identity = x
        if self.downsample is not None:
            identity = self.downsample(x)

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        # Add the shortcut to the main branch
        out += identity
        out = self.relu(out)

        return out


class ResNet(nn.Module):
    # block: one of the two residual block types (BasicBlock or Bottleneck)
    # blocks_num: number of residual blocks in each of the four stages
    def __init__(self,
                 block,
                 blocks_num,
                 num_classes=4,
                 include_top=True,
                 groups=1,
                 width_per_group=64):
        super(ResNet, self).__init__()
        self.include_top = include_top
        # The stem outputs 64 channels, which the first residual stage receives
        self.in_channel = 64

        self.groups = groups
        self.width_per_group = width_per_group

        self.conv1 = nn.Conv2d(3, self.in_channel, kernel_size=7, stride=2,
                               padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channel)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        # The first stage uses stride=1, the deeper stages use stride=2
        self.layer1 = self._make_layer(block, 64, blocks_num[0])
        self.layer2 = self._make_layer(block, 128, blocks_num[1], stride=2)
        self.layer3 = self._make_layer(block, 256, blocks_num[2], stride=2)
        self.layer4 = self._make_layer(block, 512, blocks_num[3], stride=2)

        if self.include_top:
            # Adaptive average pooling to a fixed (1, 1) spatial size, channels unchanged
            self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
            # Fully connected classification head
            self.fc = nn.Linear(512 * block.expansion, num_classes)

        # self.modules() (inherited from nn.Module) yields every module in the network;
        # each Conv2d is initialized with the Kaiming normal scheme (fan_out mode for ReLU)
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')

    # Builds one stage out of block_num residual blocks
    # channel: base channel count used by every convolution in the stage
    def _make_layer(self, block, channel, block_num, stride=1):
        downsample = None
        # Dashed-line shortcut: the identity has to be reshaped to match the output
        if stride != 1 or self.in_channel != channel * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, channel * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(channel * block.expansion))

        layers = []
        layers.append(block(self.in_channel,
                            channel,
                            downsample=downsample,
                            stride=stride,
                            groups=self.groups,
                            width_per_group=self.width_per_group))
        self.in_channel = channel * block.expansion

        for _ in range(1, block_num):
            layers.append(block(self.in_channel,
                                channel,
                                groups=self.groups,
                                width_per_group=self.width_per_group))

        # Sequential: chains the blocks into one module
        return nn.Sequential(*layers)

    # forward(): defines how the layers are connected
    def forward(self, x):
        # Stem shared by every ResNet variant
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        # The four residual stages
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        if self.include_top:
            x = self.avgpool(x)
            x = torch.flatten(x, 1)
            x = self.fc(x)

        return x


# block is BasicBlock or Bottleneck; blocks_num such as [3, 4, 6, 3] gives
# the number of residual blocks in each of the four stages
def resnet34(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet34-333f7ec4.pth
    return ResNet(BasicBlock, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)


def resnet50(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet50-19c8e357.pth
    return ResNet(Bottleneck, [3, 4, 6, 3], num_classes=num_classes, include_top=include_top)


def resnet101(num_classes=1000, include_top=True):
    # https://download.pytorch.org/models/resnet101-5d3b4d8f.pth
    return ResNet(Bottleneck, [3, 4, 23, 3], num_classes=num_classes, include_top=include_top)
```
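As a quick sanity check (not part of the original article), the four-class model can be instantiated and run on a dummy batch to confirm that it produces one logit per weather class:

```python
import torch

from model import resnet34

# Build the four-class weather model and push a dummy 224x224 RGB batch through it
net = resnet34(num_classes=4)
x = torch.randn(2, 3, 224, 224)
out = net(x)
print(out.shape)  # expected: torch.Size([2, 4])
```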

3. Training

  • train.py: loads the dataset, trains the network, tracks loss and accuracy, and saves the best weights (a fine-tuning sketch follows the listing)

```python
import os
import sys
import json

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms, datasets
from tqdm import tqdm

# Train ResNet-34
from model import resnet34


def main():
    # Use an NVIDIA GPU if one is available, otherwise fall back to the CPU
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.".format(device))

    data_transform = {
        # Training pipeline; Compose() chains several transforms together
        "train": transforms.Compose([
            # RandomResizedCrop(224): random crop with random size and aspect ratio, resized to 224x224
            transforms.RandomResizedCrop(224),
            # RandomHorizontalFlip(): flips the PIL image horizontally with probability 0.5
            transforms.RandomHorizontalFlip(),
            # ToTensor(): converts the image to a tensor
            transforms.ToTensor(),
            # Normalize(): normalizes each channel with the ImageNet means and stds
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])]),
        # Validation pipeline
        "val": transforms.Compose([transforms.Resize(256),
                                   transforms.CenterCrop(224),
                                   transforms.ToTensor(),
                                   transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])}

    # abspath()/getcwd(): absolute path of the directory the script is run from
    data_root = os.path.abspath(os.getcwd())
    # Path to the split dataset produced by split_data.py
    image_path = os.path.join(data_root, "data")
    # Raise an AssertionError with the given message if the path does not exist
    assert os.path.exists(image_path), "{} path does not exist.".format(image_path)

    train_dataset = datasets.ImageFolder(root=os.path.join(image_path, "train"),
                                         transform=data_transform["train"])
    # Number of training images
    train_num = len(train_dataset)

    # class_to_idx: maps each class-folder name to an index
    flower_list = train_dataset.class_to_idx
    # Invert the mapping so that predicted indices can be translated back to class names
    cla_dict = dict((val, key) for key, val in flower_list.items())
    # Serialize the mapping to JSON and save it for use at inference time
    json_str = json.dumps(cla_dict, indent=4)
    with open('class_indices.json', 'w') as json_file:
        json_file.write(json_str)

    # 16 images per training batch
    batch_size = 16
    # Number of dataloader worker processes
    nw = min([os.cpu_count(), batch_size if batch_size > 1 else 0, 8])
    print('Using {} dataloader workers every process'.format(nw))

    # DataLoader: batches the dataset
    # shuffle=True reshuffles the training data every epoch
    # num_workers: worker processes; 0 means everything is loaded in the main process
    train_loader = torch.utils.data.DataLoader(train_dataset,
                                               batch_size=batch_size, shuffle=True,
                                               num_workers=nw)

    # Validation dataset
    validate_dataset = datasets.ImageFolder(root=os.path.join(image_path, "val"),
                                            transform=data_transform["val"])
    # Number of validation images
    val_num = len(validate_dataset)
    validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                                  batch_size=batch_size, shuffle=False,
                                                  num_workers=nw)

    print("using {} images for training, {} images for validation.".format(train_num,
                                                                            val_num))

    # Instantiate the model with four output classes (rain, snow, fog, sunny)
    net = resnet34(num_classes=4)
    net.to(device)

    # Optionally start from ImageNet-pretrained weights instead:
    # model_weight_path = "./resnet34-pre.pth"
    # assert os.path.exists(model_weight_path), "file {} does not exist.".format(model_weight_path)
    # net.load_state_dict(torch.load(model_weight_path, map_location='cpu'))
    # in_channel = net.fc.in_features
    # net.fc = nn.Linear(in_channel, 4)

    # Cross-entropy loss
    loss_function = nn.CrossEntropyLoss()

    # Collect the trainable parameters
    params = [p for p in net.parameters() if p.requires_grad]
    # Adam optimizer; lr is the learning rate (step size)
    optimizer = optim.Adam(params, lr=0.0001)

    # Number of training epochs
    epochs = 3
    # Best validation accuracy seen so far
    best_acc = 0.0
    # Where the best weights are saved
    save_path = './resNet34.pth'
    train_steps = len(train_loader)

    for epoch in range(epochs):
        # Training phase
        net.train()
        running_loss = 0.0
        # tqdm: progress bar over the training batches
        train_bar = tqdm(train_loader, file=sys.stdout)
        # Each element of train_loader is an (images, labels) pair of tensors
        for step, data in enumerate(train_bar):
            images, labels = data
            # Forward pass
            logits = net(images.to(device))
            # Loss
            loss = loss_function(logits, labels.to(device))
            # Clear old gradients, backpropagate, and update the weights
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # item(): extracts the scalar value of the loss tensor
            running_loss += loss.item()
            # Progress-bar prefix; the loss is shown with three decimal places
            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1,
                                                                     epochs,
                                                                     loss)

        # Validation phase
        # eval(): freezes Batch Normalization statistics and disables Dropout
        net.eval()
        acc = 0.0
        # No gradients are needed during validation
        with torch.no_grad():
            val_bar = tqdm(validate_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                outputs = net(val_images.to(device))
                # torch.max(outputs, dim=1) returns (values, indices) per row;
                # the indices are the predicted class labels
                predict_y = torch.max(outputs, dim=1)[1]
                # torch.eq compares predictions and labels element-wise;
                # summing the matches counts the correct predictions
                acc += torch.eq(predict_y, val_labels.to(device)).sum().item()
                val_bar.desc = "valid epoch[{}/{}]".format(epoch + 1,
                                                           epochs)

        val_accurate = acc / val_num
        print('[epoch %d] train_loss: %.3f val_accuracy: %.3f' %
              (epoch + 1, running_loss / train_steps, val_accurate))

        # Keep the weights with the best validation accuracy
        if val_accurate > best_acc:
            best_acc = val_accurate
            # state_dict(): an OrderedDict mapping layer names to their parameters
            torch.save(net.state_dict(), save_path)

    print('Finished Training')


if __name__ == '__main__':
    main()
```
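The commented-out lines in train.py hint at starting from ImageNet-pretrained weights. The snippet below is a minimal sketch of that variant: the file name ./resnet34-pre.pth comes from those comments and has to be downloaded separately (for example from the URL noted in model.py), and it assumes the checkpoint's parameter names match this model definition, which holds for the official torchvision ResNet-34 weights.

```python
import torch
import torch.nn as nn

from model import resnet34

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Build the backbone with the original 1000-class head so the ImageNet
# checkpoint loads cleanly, then replace the head with a 4-class layer
net = resnet34(num_classes=1000)
net.load_state_dict(torch.load("./resnet34-pre.pth", map_location="cpu"))
net.fc = nn.Linear(net.fc.in_features, 4)
net.to(device)
# From here on, training proceeds exactly as in train.py above
```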

4. Testing

  • predict.py: classifies test images with the trained model (a minimal sketch is given below)
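The script below is a minimal inference sketch rather than the author's original predict.py: it assumes the class_indices.json mapping and the ./resNet34.pth checkpoint written by train.py above (with four output classes), and the image path ./test.jpg is a placeholder for your own test file.

```python
import os
import json

import torch
from PIL import Image
from torchvision import transforms

from model import resnet34


def main():
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    # Same preprocessing as the validation pipeline used during training
    data_transform = transforms.Compose(
        [transforms.Resize(256),
         transforms.CenterCrop(224),
         transforms.ToTensor(),
         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

    # Placeholder test image; replace with your own file
    img_path = "./test.jpg"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path).convert("RGB")
    # Add the batch dimension: [C, H, W] -> [1, C, H, W]
    img = torch.unsqueeze(data_transform(img), dim=0)

    # Class-index mapping written by train.py (JSON keys are strings)
    json_path = './class_indices.json'
    assert os.path.exists(json_path), "file: '{}' does not exist.".format(json_path)
    with open(json_path, "r") as f:
        class_indict = json.load(f)

    # Rebuild the four-class network and load the trained weights
    model = resnet34(num_classes=4).to(device)
    weights_path = "./resNet34.pth"
    assert os.path.exists(weights_path), "file: '{}' does not exist.".format(weights_path)
    model.load_state_dict(torch.load(weights_path, map_location=device))

    model.eval()
    with torch.no_grad():
        # Softmax over the logits gives per-class probabilities
        output = torch.squeeze(model(img.to(device))).cpu()
        predict = torch.softmax(output, dim=0)
        predict_cla = torch.argmax(predict).item()

    print("class: {}   prob: {:.3f}".format(class_indict[str(predict_cla)],
                                            predict[predict_cla].item()))


if __name__ == '__main__':
    main()
```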

5. Model Comparison

ResNet34

The model was trained for 140 epochs, but it already reached about 0.9 accuracy around epoch 50 and converged only slowly after that, so I considered switching models. Training took about one minute per epoch.

ResNet50

Training took roughly two minutes per epoch.

 

ResNet50 does do better than ResNet34, but it runs into the same problem: accuracy converges to about 0.955 early on and then just oscillates without breaking through.

At that point I still thought the model was the issue, so I switched to ResNet152.

ResNet152

By now it no longer feels like a model problem; something is probably wrong with the dataset, so I am looking for a new one. As for the model, ResNet50 is sufficient, because ResNet152 takes far too long to train, about five minutes per epoch.

6. Training on a New Dataset

Still being updated.
