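ResNet was proposed by a Microsoft team in 2015 and took first place in image classification, object detection, and image segmentation at the time. The paper's four authors, Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, are now all well-known names in artificial intelligence; at the time they were all at Microsoft Research Asia. The experiments show that residual networks are easier to optimize and that accuracy improves as the network gets deeper. On ImageNet the authors used a 152-layer residual network (8 times deeper than VGG nets, yet with lower complexity). An ensemble of these networks achieved a 3.57% top-5 error rate and won first place in the ILSVRC 2015 classification task.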
Paper: Deep Residual Learning for Image Recognition
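This is a classic paper in computer vision. Mu Li once said that if you are using a convolutional neural network, there is a 50% chance that it is ResNet or one of its variants. The ResNet paper has been cited more than 100,000 times.
The classic ResNet architectures are ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152. ResNet-18 and ResNet-34 share the same basic structure and are relatively shallow; the other three are deeper networks, of which ResNet-50 is the most widely used.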
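Residual networks were proposed to address the degradation problem that arises when a deep neural network (DNN) has too many hidden layers. Degradation means that as hidden layers are added, the network's accuracy saturates and then drops rapidly, and this drop is not caused by overfitting.
Suppose a network A has training error x. Build network B by adding a few layers, which we call C, on top of A. If the layers in C leave A's output unchanged (that is, they compute an identity mapping), then B's training error is also x, so B's training error should never be higher than A's. If in practice B does train to a higher error than A, this shows that learning an identity mapping (one that has no effect on its input) with the added layers C is not a trivial problem.
To solve this problem, the module in the figure above adds a direct path (a shortcut connection) between its input and output that performs the identity mapping directly. C then only needs to learn what should be added on top of the existing input features, that is, the residual. Because C learns only the residual, the module is called a residual module.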
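The idea can be sketched numerically in plain Python, without any framework. The block output is F(x) + x, so when the residual branch F outputs zero the block reduces to the identity mapping. The names `linear_branch` and `residual_block` are illustrative assumptions, and a single dense layer stands in for the convolutional branch used in ResNet.

```python
def linear_branch(x, weight, bias):
    """Hypothetical residual branch F(x): one dense layer, y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def residual_block(x, weight, bias):
    """Return F(x) + x: the branch output plus the shortcut (identity) path."""
    fx = linear_branch(x, weight, bias)
    return [f + xi for f, xi in zip(fx, x)]


x = [1.0, -2.0, 3.0]

# With all-zero parameters the branch contributes nothing,
# so the block passes its input through unchanged.
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3
identity_out = residual_block(x, zero_w, zero_b)

# With small non-zero parameters the block only nudges the input.
small_w = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 0.1]]
nudged_out = residual_block(x, small_w, zero_b)

print(identity_out)
print(nudged_out)
```

Driving the branch weights toward zero is enough to recover the identity, which is an easier target for the optimizer than forcing a stack of nonlinear layers to reproduce their input exactly.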
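In addition, like GoogLeNet (2014), ResNet places a global average pooling layer before the classification layer. With these changes, ResNet can train networks as deep as 152 layers. It achieves higher accuracy than VGGNet and GoogLeNet while being computationally more efficient than VGGNet. ResNet-152 reaches a top-5 accuracy of 95.51%.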
[Figure: network structures of ResNet-18 and ResNet-50]
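As a rough sketch of where the layer counts in the model names come from, the snippet below counts the weighted layers in each variant: one stem convolution, the convolutions inside the residual blocks, and one fully connected layer. The per-stage block counts are the standard ones from the ResNet paper; the names `RESNET_CONFIGS` and `weighted_layer_count` are illustrative, and projection shortcuts are not counted.

```python
# Residual blocks per stage (conv2_x .. conv5_x) and conv layers per block.
RESNET_CONFIGS = {
    "ResNet-18": ([2, 2, 2, 2], 2),    # basic block: two 3x3 convolutions
    "ResNet-34": ([3, 4, 6, 3], 2),
    "ResNet-50": ([3, 4, 6, 3], 3),    # bottleneck: 1x1, 3x3, 1x1 convolutions
    "ResNet-101": ([3, 4, 23, 3], 3),
    "ResNet-152": ([3, 8, 36, 3], 3),
}


def weighted_layer_count(blocks_per_stage, convs_per_block):
    """Stem convolution + convolutions in all residual blocks + final fc layer."""
    return 1 + sum(blocks_per_stage) * convs_per_block + 1


depths = {name: weighted_layer_count(*cfg) for name, cfg in RESNET_CONFIGS.items()}
for name, depth in depths.items():
    print(f"{name}: {depth} weighted layers")
```

ResNet-34 and ResNet-50 use the same number of blocks per stage; the difference in depth comes entirely from replacing the two-layer basic block with the three-layer bottleneck block.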
CIFAR-10 is a computer vision dataset for general object recognition, collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It contains 60,000 RGB color images of 32 x 32 pixels in 10 classes, of which 50,000 form the training set and 10,000 form the test set.
The 10 classes of RGB color images in CIFAR-10 are: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.
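When reading model predictions it helps to map integer labels back to class names. The sketch below assumes the label order exposed by torchvision's `datasets.CIFAR10` (alphabetical, index 0 to 9); `label_to_name` and `predict_name` are hypothetical helpers and not part of any library.

```python
# Class names in label-index order (assumed to match torchvision's CIFAR10.classes).
CIFAR10_CLASSES = (
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
)


def label_to_name(label):
    """Convert an integer label in [0, 9] to its class name."""
    return CIFAR10_CLASSES[label]


def predict_name(logits):
    """Return the class name with the highest score among 10 logits."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return CIFAR10_CLASSES[best]


example_logits = [0.1, 0.1, 0.1, 2.5, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
print(label_to_name(9))
print(predict_name(example_logits))
```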
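CIFAR-10 is a color image dataset that is closer to general, everyday objects. Compared with the MNIST dataset, CIFAR-10 differs in the following ways:
Unlike handwritten characters, CIFAR-10 contains real-world objects. The images are noisy, and the scale and features of the objects vary widely, which makes recognition much harder. A plain linear model such as softmax regression performs poorly on CIFAR-10.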
```python
import datetime

import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


class Bottleneck(nn.Module):
    """Bottleneck residual block: 1x1 reduce, 3x3, 1x1 expand (4x channels)."""

    def __init__(self, in_channels, out_channels, stride=(1, 1, 1), padding=(0, 1, 0), first=False) -> None:
        super().__init__()
        self.bottleneck = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride[0], padding=padding[0], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=stride[1], padding=padding[1], bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels * 4, kernel_size=1, stride=stride[2], padding=padding[2], bias=False),
            nn.BatchNorm2d(out_channels * 4),
        )

        # The shortcut is the identity unless the block changes the tensor shape.
        self.shortcut = nn.Sequential()
        if first:
            # The first block of each stage changes the channel count (and, when
            # stride[1] == 2, the spatial size), so the shortcut uses a 1x1
            # convolution to project the input to the matching shape.
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels * 4, kernel_size=1, stride=stride[1], bias=False),
                nn.BatchNorm2d(out_channels * 4),
            )

    def forward(self, x):
        out = self.bottleneck(x)
        out += self.shortcut(x)  # residual connection: F(x) + x
        out = F.relu(out)
        return out


class ResNet50(nn.Module):
    def __init__(self, block, num_classes=10) -> None:
        super().__init__()
        self.in_channels = 64
        # Stem: 7x7 convolution, batch norm, max pooling.
        # (The ResNet paper also applies a ReLU after the batch norm.)
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(64),
            nn.MaxPool2d(kernel_size=3, stride=2, padding=1),
        )

        # Four stages with 3, 4, 6, and 3 bottleneck blocks.
        # From conv3 on, the first block of each stage downsamples with stride 2.
        self.conv2 = self._make_layer(block, 64, [[1, 1, 1]] * 3, [[0, 1, 0]] * 3)
        self.conv3 = self._make_layer(block, 128, [[1, 2, 1]] + [[1, 1, 1]] * 3, [[0, 1, 0]] * 4)
        self.conv4 = self._make_layer(block, 256, [[1, 2, 1]] + [[1, 1, 1]] * 5, [[0, 1, 0]] * 6)
        self.conv5 = self._make_layer(block, 512, [[1, 2, 1]] + [[1, 1, 1]] * 2, [[0, 1, 0]] * 3)

        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(2048, num_classes)

    def _make_layer(self, block, out_channels, strides, paddings):
        layers = []
        for i in range(len(strides)):
            # Only the first block of a stage needs a projection shortcut.
            layers.append(block(self.in_channels, out_channels, strides[i], paddings[i], first=(i == 0)))
            self.in_channels = out_channels * 4

        return nn.Sequential(*layers)

    def forward(self, x):
        out = self.conv1(x)
        out = self.conv2(out)
        out = self.conv3(out)
        out = self.conv4(out)
        out = self.conv5(out)

        out = self.avgpool(out)
        out = torch.flatten(out, 1)
        out = self.fc(out)
        return out


def get_format_time():
    return datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')


# Scale pixels to [-1, 1], then upsample the 32x32 images to 224x224.
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    transforms.Resize((224, 224)),
])

training_data = datasets.CIFAR10(
    root="data",
    train=True,
    download=True,
    transform=transform,
)

testing_data = datasets.CIFAR10(
    root="data",
    train=False,
    download=True,
    transform=transform,
)


if __name__ == "__main__":
    res50 = ResNet50(Bottleneck)

    batch_size = 128
    # drop_last=True skips the final partial batch, so each epoch sees
    # 49,920 training images and 9,984 test images.
    train_loader = DataLoader(dataset=training_data, batch_size=batch_size, shuffle=True, drop_last=True)
    test_loader = DataLoader(dataset=testing_data, batch_size=batch_size, shuffle=True, drop_last=True)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = res50.to(device)
    cost = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters())

    epochs = 20
    accuracy_rate = []
    for epoch in range(epochs):
        train_loss = 0.0
        train_correct = 0
        model.train()

        print(f"{get_format_time()}, train epoch: {epoch}/{epochs}")
        for step, (images, labels) in enumerate(train_loader, 0):
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(images)
            loss = cost(outputs, labels)

            loss.backward()
            optimizer.step()

            predicted = outputs.argmax(dim=1)
            train_loss += loss.item()
            train_correct += (predicted == labels).sum().item()

        # Evaluate on the test set.
        model.eval()
        test_correct = 0
        test_total = 0
        test_loss = 0.0
        with torch.no_grad():
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = cost(outputs, labels)
                predicted = outputs.argmax(dim=1)
                test_total += labels.size(0)
                test_correct += (predicted == labels).sum().item()
                test_loss += loss.item()

        # Accuracy over the images actually evaluated (test_total).
        accuracy = 100 * test_correct / test_total
        accuracy_rate.append(accuracy)

        # Note: the losses are sums of per-batch mean losses divided by the
        # dataset size, so the printed values are scaled by roughly 1 / batch_size.
        # The printed accuracies divide by the full dataset size, so they are
        # slightly lower than the values stored in accuracy_rate.
        print("{}, Train Loss is:{:.4f}, Train Accuracy is:{:.4f}%, Test Loss is::{:.4f} Test Accuracy is:{:.4f}%".format(
            get_format_time(),
            train_loss / len(training_data),
            100 * train_correct / len(training_data),
            test_loss / len(testing_data),
            100 * test_correct / len(testing_data)
        ))

    # Plot test accuracy per epoch.
    accuracy_rate = np.array(accuracy_rate, dtype=np.float32)
    times = np.linspace(1, epochs, epochs)
    plt.xlabel('epoch')
    plt.ylabel('accuracy rate')
    plt.plot(times, accuracy_rate)
    plt.show()

    print(f"{get_format_time()},accuracy_rate={accuracy_rate}")
```
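To see how the feature map shrinks through the network above, the sketch below applies the standard convolution output-size formula, floor((n + 2p - k) / s) + 1, to the stem and to the stride-2 convolution at the start of conv3, conv4, and conv5. It is plain Python and only tracks the spatial size; `conv_out` and `trace_resnet50` are illustrative names. It also suggests one plausible reason for resizing CIFAR-10 images to 224 x 224: at the native 32 x 32 size the feature map is already 1 x 1 by the last stage.

```python
def conv_out(size, kernel, stride, padding):
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet50(size):
    """Return the spatial size after each stage for a square input of the given size."""
    sizes = {"input": size}
    size = conv_out(size, kernel=7, stride=2, padding=3)   # stem convolution
    sizes["conv1"] = size
    size = conv_out(size, kernel=3, stride=2, padding=1)   # max pooling
    sizes["maxpool"] = size
    sizes["conv2"] = size                                   # all strides are 1
    for stage in ("conv3", "conv4", "conv5"):
        # The first block of the stage uses a 3x3 convolution with stride 2.
        size = conv_out(size, kernel=3, stride=2, padding=1)
        sizes[stage] = size
    return sizes


trace_224 = trace_resnet50(224)
trace_32 = trace_resnet50(32)
print(trace_224)
print(trace_32)
```

For a 224 x 224 input the sizes are 112, 56, 56, 28, 14, and 7, so the adaptive average pooling layer reduces a 7 x 7 map with 2048 channels to the 2048 features consumed by `fc`.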
(1) To run on a CPU, prepare the environment as follows:

```shell
conda create -n cv python=3.9
conda activate cv

pip install torchvision==0.9.0
pip install numpy
pip install matplotlib
pip install requests
```
(2) To run on a GPU, prepare the environment as follows:
Use the nvidia-smi command to find the supported CUDA version:

```
Tue May 23 15:24:10 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.89       Driver Version: 528.89       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            TCC  | 00000000:01:00.0 Off |                    0 |
| N/A   55C    P8    11W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
Build the runtime environment by installing the GPU build of torch that matches your CUDA version:

```shell
conda create -n cv python=3.9
conda activate cv

pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install numpy
pip install matplotlib
pip install requests
```
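After installation, a quick check like the following can confirm which device PyTorch will use. This is a minimal sketch; `describe_device` is a hypothetical helper, and it deliberately tolerates a missing torch install so it can be run in any environment.

```python
def describe_device():
    """Report which device PyTorch would use; safe to call even without torch."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # Name of the first visible GPU, for example a Tesla T4.
        return f"torch {torch.__version__}, CUDA device: {torch.cuda.get_device_name(0)}"
    return f"torch {torch.__version__}, running on CPU"


device_report = describe_device()
print(device_report)
```

If the report says the code is running on CPU even though a GPU is present, the installed wheel is most likely a CPU-only build.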
While training, the nvidia-smi command shows that the program is running on the GPU:

```
Tue May 23 15:25:25 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 528.89       Driver Version: 528.89       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            TCC  | 00000000:01:00.0 Off |                    0 |
| N/A   56C    P0    28W /  70W |   1101MiB / 15360MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      6728      C   ...nda\envs\voice\python.exe     1100MiB |
+-----------------------------------------------------------------------------+
```
The training run prints the following output:

```
2023-12-22 14:44:39, train epoch: 0/20
2023-12-22 14:46:21, Train Loss is:0.0126, Train Accuracy is:40.9520%, Test Loss is::0.0116 Test Accuracy is:46.3200%
2023-12-22 14:46:21, train epoch: 1/20
2023-12-22 14:48:01, Train Loss is:0.0087, Train Accuracy is:59.5060%, Test Loss is::0.0109 Test Accuracy is:51.6700%
2023-12-22 14:48:01, train epoch: 2/20
2023-12-22 14:49:40, Train Loss is:0.0070, Train Accuracy is:68.1060%, Test Loss is::0.0072 Test Accuracy is:67.8100%
2023-12-22 14:49:40, train epoch: 3/20
2023-12-22 14:51:20, Train Loss is:0.0057, Train Accuracy is:74.2540%, Test Loss is::0.0073 Test Accuracy is:67.7400%
2023-12-22 14:51:20, train epoch: 4/20
2023-12-22 14:53:00, Train Loss is:0.0049, Train Accuracy is:77.9280%, Test Loss is::0.0061 Test Accuracy is:73.7400%
2023-12-22 14:53:00, train epoch: 5/20
2023-12-22 14:54:41, Train Loss is:0.0042, Train Accuracy is:81.3260%, Test Loss is::0.0049 Test Accuracy is:77.9900%
2023-12-22 14:54:41, train epoch: 6/20
2023-12-22 14:56:20, Train Loss is:0.0036, Train Accuracy is:83.9240%, Test Loss is::0.0047 Test Accuracy is:79.0400%
2023-12-22 14:56:20, train epoch: 7/20
2023-12-22 14:58:00, Train Loss is:0.0031, Train Accuracy is:86.0780%, Test Loss is::0.0059 Test Accuracy is:75.6300%
2023-12-22 14:58:00, train epoch: 8/20
2023-12-22 14:59:39, Train Loss is:0.0027, Train Accuracy is:87.7120%, Test Loss is::0.0048 Test Accuracy is:79.7600%
2023-12-22 14:59:39, train epoch: 9/20
2023-12-22 15:01:19, Train Loss is:0.0023, Train Accuracy is:89.3680%, Test Loss is::0.0048 Test Accuracy is:80.5800%
2023-12-22 15:01:19, train epoch: 10/20
2023-12-22 15:02:58, Train Loss is:0.0019, Train Accuracy is:91.2760%, Test Loss is::0.0044 Test Accuracy is:82.3400%
2023-12-22 15:02:58, train epoch: 11/20
2023-12-22 15:04:38, Train Loss is:0.0016, Train Accuracy is:92.4040%, Test Loss is::0.0045 Test Accuracy is:82.6400%
2023-12-22 15:04:38, train epoch: 12/20
2023-12-22 15:06:18, Train Loss is:0.0014, Train Accuracy is:93.7200%, Test Loss is::0.0053 Test Accuracy is:81.7900%
2023-12-22 15:06:18, train epoch: 13/20
2023-12-22 15:07:57, Train Loss is:0.0011, Train Accuracy is:94.7360%, Test Loss is::0.0051 Test Accuracy is:81.7700%
2023-12-22 15:07:57, train epoch: 14/20
2023-12-22 15:09:37, Train Loss is:0.0010, Train Accuracy is:95.1120%, Test Loss is::0.0062 Test Accuracy is:80.6500%
2023-12-22 15:09:37, train epoch: 15/20
2023-12-22 15:11:15, Train Loss is:0.0008, Train Accuracy is:96.1600%, Test Loss is::0.0056 Test Accuracy is:82.0300%
2023-12-22 15:11:15, train epoch: 16/20
2023-12-22 15:12:54, Train Loss is:0.0007, Train Accuracy is:96.6140%, Test Loss is::0.0055 Test Accuracy is:82.4200%
2023-12-22 15:12:54, train epoch: 17/20
2023-12-22 15:14:34, Train Loss is:0.0007, Train Accuracy is:96.8880%, Test Loss is::0.0068 Test Accuracy is:81.1300%
2023-12-22 15:14:34, train epoch: 18/20
2023-12-22 15:16:13, Train Loss is:0.0006, Train Accuracy is:97.0620%, Test Loss is::0.0062 Test Accuracy is:82.1900%
2023-12-22 15:16:13, train epoch: 19/20
2023-12-22 15:17:52, Train Loss is:0.0006, Train Accuracy is:97.4180%, Test Loss is::0.0063 Test Accuracy is:82.7800%
2023-12-22 15:17:53,accuracy_rate=[46.39423 51.752804 67.91867 67.84856 73.85818 78.11498 79.166664
 75.751205 79.887825 80.70914 82.471954 82.77244 81.921074 81.90104
 80.77925 82.16146 82.552086 81.26002 82.32172 82.91266 ]
```
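Two details of the log are easier to read with a little arithmetic. The per-epoch line reports 46.3200% test accuracy for epoch 0 while `accuracy_rate` starts at 46.39423, and the losses look unusually small. The sketch below reproduces both effects from the settings in the script (batch size 128, `drop_last=True`); it is a back-of-the-envelope check under those assumptions, and the recovered loss is approximate because the logged value is rounded to four decimals.

```python
batch_size = 128
train_samples, test_samples = 50_000, 10_000

# drop_last=True discards the final partial batch of every epoch.
train_batches = train_samples // batch_size   # 390 batches
test_batches = test_samples // batch_size     # 78 batches
test_seen = test_batches * batch_size         # 9984 images actually evaluated

# Epoch 0: the printed 46.3200% divides by all 10,000 test images ...
correct = round(0.4632 * test_samples)        # 4632 correct predictions
# ... while accuracy_rate divides by the images actually evaluated.
acc_over_seen = 100 * correct / test_seen

# The printed train loss is (sum of per-batch mean losses) / 50,000,
# so the average per-batch loss is larger by about a factor of batch_size.
logged_train_loss = 0.0126
mean_batch_loss = logged_train_loss * train_samples / train_batches

print(f"accuracy over evaluated images: {acc_over_seen:.5f}%")
print(f"approximate mean cross-entropy per batch: {mean_batch_loss:.3f}")
```

Dividing the accumulated loss by the number of batches and using the same denominator for both accuracy figures would make the printed metrics directly comparable.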