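Deep convolutional networks have greatly advanced every area of deep learning, and ILSVRC, as the most influential competition in the field, deserves much of the credit: it prompted many classic works. I have gone through the winning and runner-up networks of each ILSVRC classification task and briefly introduce their core ideas, architectures, and implementations.
Most of the code comes from: https://github.com/weiaicunzai/pytorch-cifar100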
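ImageNet and ILSVRC
ImageNet is a dataset of more than 15 million images in roughly 22,000 categories.
ILSVRC stands for ImageNet Large-Scale Visual Recognition Challenge. It ran from 2010 until its final edition in 2017 and used a subset of ImageNet with 1,000 classes in total.
Results by year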
| Year | Network / team | val top-1 error | val top-5 error | test top-5 error | Notes |
|---|---|---|---|---|---|
| 2012 | AlexNet | 38.1% | 16.4% | 16.42% | 5 CNNs |
| 2012 | AlexNet | 36.7% | 15.4% | 15.32% | 7 CNNs; used 2011 data |
| 2013 | OverFeat | | | 14.18% | 7 fast models |
| 2013 | OverFeat | | | 13.6% | Post-competition; 7 big models |
| 2013 | ZFNet | | | 13.51% | The ZFNet paper reports 14.8% |
| 2013 | Clarifai | | | 11.74% | |
| 2013 | Clarifai | | | 11.20% | Used 2011 data |
| 2014 | VGG | | | 7.32% | 7 nets, dense eval |
| 2014 | VGG (runner-up) | 23.7% | 6.8% | 6.8% | Post-competition; 2 nets |
| 2014 | GoogleNet v1 | | | 6.67% | 7 nets, 144 crops |
| | GoogleNet v2 | 20.1% | 4.9% | 4.82% | Post-competition; 6 nets, 144 crops |
| | GoogleNet v3 | 17.2% | 3.58% | | Post-competition; 4 nets, 144 crops |
| | GoogleNet v4 | 16.5% | 3.1% | 3.08% | Post-competition; v4 + Inception-Res-v2 |
| 2015 | ResNet | | | 3.57% | 6 models |
| 2016 | Trimps-Soushen | | | 2.99% | Third Research Institute of the Ministry of Public Security |
| 2016 | ResNeXt (runner-up) | | | 3.03% | UC San Diego |
| 2017 | SENet | | | 2.25% | Momenta and the University of Oxford |
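Evaluation metrics
Top-1 takes the class with the highest value in the probability vector as the prediction, and the prediction counts as correct if that class matches the label. Top-5 counts a prediction as correct if the true class is anywhere among the five highest-probability classes. The percentages in the results table are error rates, i.e. one minus the corresponding accuracy.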
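The two metrics can be sketched in a few lines of PyTorch. This is a minimal illustration rather than the official ILSVRC evaluation script; the helper name `topk_error` and the toy scores are assumptions made up for the example.

```python
import torch


def topk_error(logits, targets, ks=(1, 5)):
    """Return {k: top-k error rate} for a batch of class scores.

    Hypothetical helper for illustration; not part of any library.
    """
    max_k = max(ks)
    # Indices of the max_k highest-scoring classes per sample: shape (N, max_k)
    _, pred = logits.topk(max_k, dim=1)
    # hits[i, j] is True when the j-th ranked guess for sample i is the label
    hits = pred.eq(targets.unsqueeze(1))
    errors = {}
    for k in ks:
        # A sample is correct if the label appears among its first k guesses
        accuracy = hits[:, :k].any(dim=1).float().mean().item()
        errors[k] = 1.0 - accuracy
    return errors


# Toy batch: 3 samples, 6 classes (scores are made up)
logits = torch.tensor([
    [0.1, 0.9, 0.3, 0.2, 0.0, 0.5],  # label 1 is ranked first
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.1],  # label 2 is ranked third
    [0.9, 0.8, 0.7, 0.6, 0.5, 0.1],  # label 5 is ranked sixth
])
targets = torch.tensor([1, 2, 5])
errors = topk_error(logits, targets)
print(errors)
```

In this toy batch only the first sample is right under top-1, while the first two are right under top-5, so the top-1 error is 2/3 and the top-5 error is 1/3.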
Gradient-Based Learning Applied to Document Recognition

    import torch.nn as nn
    import torch.nn.functional as func


    class LeNet(nn.Module):
        """LeNet-style network for 1x28x28 inputs (e.g. MNIST digits)."""

        def __init__(self):
            super(LeNet, self).__init__()
            self.conv1 = nn.Conv2d(1, 6, kernel_size=5)   # 28x28 -> 24x24
            self.conv2 = nn.Conv2d(6, 16, kernel_size=5)  # 12x12 -> 8x8
            # After the second pooling step: 16 channels of 4x4 features
            self.fc1 = nn.Linear(16 * 4 * 4, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            x = func.relu(self.conv1(x))
            x = func.max_pool2d(x, 2)   # 24x24 -> 12x12
            x = func.relu(self.conv2(x))
            x = func.max_pool2d(x, 2)   # 8x8 -> 4x4
            x = x.view(x.size(0), -1)   # flatten to (batch, 256)
            x = func.relu(self.fc1(x))
            x = func.relu(self.fc2(x))
            x = self.fc3(x)             # raw class scores (logits)
            return x

ImageNet Classification with Deep Convolutional Neural Networks
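Compared with earlier work, AlexNet introduces the following improvements:
1. ReLU activation function
2. Local response normalization (LRN)
3. Overlapping pooling
4. Dropout
5. Data augmentation
6. Multi-GPU parallel training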
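Several of these items map directly onto standard PyTorch layers. The sketch below assembles a first convolution stage in the style of the AlexNet paper to show ReLU, LRN, and overlapping pooling together, plus a dropout layer. The 227x227 input size and the variable names are assumptions for illustration, and PyTorch's `LocalResponseNorm` scales `alpha` by the window size, so it is not numerically identical to the paper's formula.

```python
import torch
import torch.nn as nn

# First stage, assuming a 3x227x227 input (illustrative choice)
stage1 = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4),   # 227 -> 55
    nn.ReLU(inplace=True),                        # item 1: ReLU
    # item 2: LRN with the hyperparameters reported in the paper
    nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0),
    # item 3: overlapping pooling, window 3 > stride 2, 55 -> 27
    nn.MaxPool2d(kernel_size=3, stride=2),
)

# item 4: dropout with p=0.5, as used in the fully connected layers
dropout = nn.Dropout(p=0.5)

x = torch.randn(1, 3, 227, 227)
out = stage1(x)
print(out.shape)  # torch.Size([1, 96, 27, 27])

# Dropout only zeroes activations in training mode;
# in eval mode it passes its input through unchanged.
dropout.eval()
eval_out = dropout(out)
```

Pooling is called overlapping here because the 3x3 window is larger than the stride of 2, so neighbouring windows share a row or column of inputs.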
Code implementation

    import torch.nn as nn

    # Assumed value: the original snippet used NUM_CLASSES without defining it
    # and hard-coded 10 outputs in the last layer.
    NUM_CLASSES = 10


    class AlexNet(nn.Module):
        """AlexNet-style network sized for 1x28x28 inputs.

        A scaled-down variant, not the 224x224 original: no LRN and
        non-overlapping 2x2 pooling.
        """

        def __init__(self, num_classes=NUM_CLASSES):
            super(AlexNet, self).__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 96, kernel_size=11, padding=1),   # 28 -> 20
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2),                   # 20 -> 10
                nn.Conv2d(96, 256, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2),                   # 10 -> 5
                nn.Conv2d(256, 384, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(384, 384, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(384, 256, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2),                   # 5 -> 2
            )
            self.classifier = nn.Sequential(
                nn.Dropout(),
                nn.Linear(256 * 2 * 2, 4096),
                nn.ReLU(inplace=True),
                nn.Dropout(),
                nn.Linear(4096, 4096),
                nn.ReLU(inplace=True),
                nn.Linear(4096, num_classes),  # was hard-coded to 10
            )

        def forward(self, x):
            x = self.features(x)
            x = x.view(x.size(0), -1)  # flatten to (batch, 256 * 2 * 2)
            x = self.classifier(x)
            return x

Visualizing and Understanding Convolutional Networks
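A deconvolutional network (deconvnet) is used to visualize the features a CNN has learned. It reverses each layer in three steps:
1. Unpooling: max pooling is not invertible, but an approximate inverse can be obtained by recording the location of the maximum in each pooling region.
2. Rectification: the reconstructed signal is passed through ReLU.
3. Filtering: the transposed versions of the original convolution kernels are applied.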
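The three steps above can be sketched for a single conv + pool layer with standard PyTorch operations. This is a minimal illustration under assumed layer sizes and a random input, not the code used in the ZFNet paper; the variable names are made up for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# One forward layer: conv -> ReLU -> max pool (sizes are illustrative)
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)
pool = nn.MaxPool2d(kernel_size=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2)

x = torch.randn(1, 3, 8, 8)

with torch.no_grad():
    # Forward pass, recording where each pooling maximum came from
    feat = F.relu(conv(x))             # (1, 8, 8, 8)
    pooled, switches = pool(feat)      # (1, 8, 4, 4) plus max locations

    # 1. Unpooling: put each maximum back at its recorded location,
    #    all other positions are filled with zeros
    unpooled = unpool(pooled, switches, output_size=feat.shape)

    # 2. Rectification: pass the reconstruction through ReLU
    rectified = F.relu(unpooled)

    # 3. Filtering: apply the transposed version of the same kernels
    recon = F.conv_transpose2d(rectified, conv.weight, padding=1)

print(unpooled.shape)  # torch.Size([1, 8, 8, 8])
print(recon.shape)     # torch.Size([1, 3, 8, 8])
```

`conv.weight` can be passed to `conv_transpose2d` unchanged because a transposed convolution expects its weight as (input channels, output channels, height, width), which matches the (8, 3, 3, 3) layout of the forward kernels here.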