VGG takes its name from the Visual Geometry Group, which proposed the network in the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition".
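Configuration D (VGG16) is the one most commonly used.
It has 13 convolutional layers (five blocks of 2, 2, 3, 3, and 3 layers), 5 downsampling (max-pooling) layers, and 3 fully connected layers.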
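The layer counts above can be checked with a small sketch. The list encoding here (integers for the output channels of a 3×3 convolutional layer, "M" for a max-pooling layer) and the names VGG16_CFG, FC_LAYERS, and count_layers are illustrative assumptions, not something defined in the paper.

```python
# Configuration D (VGG16), written as a flat list (an assumed encoding):
# an integer is the number of output channels of one 3x3 conv layer,
# "M" marks one 2x2 max-pooling layer.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

# Sizes of the three fully connected layers.
FC_LAYERS = [4096, 4096, 1000]


def count_layers(cfg, fc_layers):
    """Return (conv, pool, fc) layer counts for a VGG-style configuration."""
    n_conv = sum(1 for item in cfg if item != "M")
    n_pool = sum(1 for item in cfg if item == "M")
    return n_conv, n_pool, len(fc_layers)


print(count_layers(VGG16_CFG, FC_LAYERS))  # (13, 5, 3)
```

The 13 convolutional and 3 fully connected layers are the 16 weight layers that give VGG16 its name.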
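Highlight of the network: it stacks several 3×3 convolution kernels in place of one large kernel, which reduces the number of parameters required.
The paper notes that two stacked 3×3 kernels can replace a 5×5 kernel, and three stacked 3×3 kernels can replace a 7×7 kernel.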
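A quick way to see the parameter saving is to count weights directly. This is a minimal sketch that assumes every layer has the same number of input and output channels and ignores biases; the helper names and the channel count C = 512 are chosen only for illustration.

```python
def conv_weights(kernel_size, in_channels, out_channels):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def stacked_3x3_weights(num_layers, channels):
    """Weights in a stack of 3x3 conv layers that all keep `channels` channels."""
    return num_layers * conv_weights(3, channels, channels)


C = 512  # hypothetical channel count, used only for illustration

# Two 3x3 layers versus one 5x5 layer: 18*C^2 versus 25*C^2 weights.
print(stacked_3x3_weights(2, C), conv_weights(5, C, C))

# Three 3x3 layers versus one 7x7 layer: 27*C^2 versus 49*C^2 weights.
print(stacked_3x3_weights(3, C), conv_weights(7, C, C))
```

Under these assumptions the stacked version needs 72% of the weights of a 5×5 layer and about 55% of the weights of a 7×7 layer.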
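In a convolutional neural network, the receptive field is the size of the input-layer region that determines one element of a given layer's output. Put simply, it is the size of the region in the input layer that one unit of the output feature map corresponds to.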
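The receptive field of a stack of layers can be computed by walking from the input to the output. The sketch below uses the standard recurrence (each layer widens the field by (kernel_size − 1) times the product of the strides before it); the function name and the (kernel_size, stride) tuple format are assumptions made for this example.

```python
def receptive_field(layers):
    """Receptive field of one output unit, measured on the input.

    `layers` is a list of (kernel_size, stride) pairs ordered from the
    input to the output.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance on the input between neighbouring units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(3, 1)] * 2))  # 5, same as one 5x5 kernel
print(receptive_field([(3, 1)] * 3))  # 7, same as one 7x7 kernel
```

This is why stacked 3×3 kernels can stand in for a 5×5 or 7×7 kernel: the output unit sees the same input region.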
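Replacing a 5×5 kernel with two stacked 3×3 kernels, or a 7×7 kernel with three stacked 3×3 kernels, also needs fewer parameters: for C input and C output channels (biases ignored), 18C² weights instead of 25C², and 27C² instead of 49C².
The settings and sizes of each layer are as follows: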
conv1:
kernels: 64; kernel_size: 3; padding: 1; stride: 1
input_size: [224, 224, 3]
N = (W − F + 2P) / S + 1 = (224 − 3 + 2×1) / 1 + 1 = 224
output_size: [224, 224, 64]
Maxpool1:
kernel_size: 2; padding: 0; stride: 2
input_size: [224, 224, 64]
N = (W − F + 2P) / S + 1 = (224 − 2) / 2 + 1 = 112
output_size: [112, 112, 64]
conv2:
kernels: 128; kernel_size: 3; padding: 1; stride: 1
input_size: [112, 112, 64]
N = (W − F + 2P) / S + 1 = (112 − 3 + 2×1) / 1 + 1 = 112
output_size: [112, 112, 128]
Maxpool2:
kernel_size: 2; padding: 0; stride: 2
input_size: [112, 112, 128]
N = (W − F + 2P) / S + 1 = (112 − 2) / 2 + 1 = 56
output_size: [56, 56, 128]
conv3:
kernels: 256; kernel_size: 3; padding: 1; stride: 1
input_size: [56, 56, 128]
N = (W − F + 2P) / S + 1 = (56 − 3 + 2×1) / 1 + 1 = 56
output_size: [56, 56, 256]
Maxpool3:
kernel_size: 2; padding: 0; stride: 2
input_size: [56, 56, 256]
N = (W − F + 2P) / S + 1 = (56 − 2) / 2 + 1 = 28
output_size: [28, 28, 256]
conv4:
kernels: 512; kernel_size: 3; padding: 1; stride: 1
input_size: [28, 28, 256]
N = (W − F + 2P) / S + 1 = (28 − 3 + 2×1) / 1 + 1 = 28
output_size: [28, 28, 512]
Maxpool4:
kernel_size: 2; padding: 0; stride: 2
input_size: [28, 28, 512]
N = (W − F + 2P) / S + 1 = (28 − 2) / 2 + 1 = 14
output_size: [14, 14, 512]
conv5:
kernels: 512; kernel_size: 3; padding: 1; stride: 1
input_size: [14, 14, 512]
N = (W − F + 2P) / S + 1 = (14 − 3 + 2×1) / 1 + 1 = 14
output_size: [14, 14, 512]
Maxpool5:
kernel_size: 2; padding: 0; stride: 2
input_size: [14, 14, 512]
N = (W − F + 2P) / S + 1 = (14 − 2) / 2 + 1 = 7
output_size: [7, 7, 512]
Full Connection1:
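nodes: 4096
input_size: [7×7×512]; output_size: [4096]
Full Connection2:
nodes: 4096
input_size: [4096]; output_size: [4096]
Full Connection3:
nodes: 1000
input_size: [4096]; output_size: [1000]
softmax:
Finally, a softmax function turns the 1000 outputs into class probabilities and completes the classification task.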
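The size calculations listed above can be reproduced in a few lines. This sketch applies N = (W − F + 2P) / S + 1 to each convolutional block and pooling layer, starting from a 224×224×3 input, and ends with a numerically stable softmax. The names output_size, STAGES, trace_shapes, and softmax are hypothetical helpers written for this example, and each conv block is treated as a single step because its 3×3, padding-1, stride-1 layers all keep the spatial size unchanged.

```python
import math


def output_size(w, f, p, s):
    """N = (W - F + 2P) / S + 1 for a square input of width W."""
    return (w - f + 2 * p) // s + 1


# (block name, number of kernels) for the five conv blocks listed above.
STAGES = [("conv1", 64), ("conv2", 128), ("conv3", 256),
          ("conv4", 512), ("conv5", 512)]


def trace_shapes(size=224, channels=3):
    """Return [(layer name, (height, width, channels)), ...] for conv and pool layers."""
    shapes = []
    for index, (name, kernels) in enumerate(STAGES, start=1):
        # 3x3 convolution, padding 1, stride 1: spatial size is unchanged.
        size = output_size(size, f=3, p=1, s=1)
        channels = kernels
        shapes.append((name, (size, size, channels)))
        # 2x2 max pooling, padding 0, stride 2: spatial size is halved.
        size = output_size(size, f=2, p=0, s=2)
        shapes.append((f"Maxpool{index}", (size, size, channels)))
    return shapes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


for layer_name, shape in trace_shapes():
    print(layer_name, shape)

# Flattened input to the first fully connected layer.
print(7 * 7 * 512)  # 25088
```

The last traced shape is (7, 7, 512), which flattens to the 25088 inputs of the first fully connected layer.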