When building a deep learning model with Keras, we can call model.summary()
to print the parameter details of every layer, like this:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_4 (Dense)              (None, 7)                 35
_________________________________________________________________
activation_4 (Activation)    (None, 7)                 0
_________________________________________________________________
dense_5 (Dense)              (None, 13)                104
_________________________________________________________________
activation_5 (Activation)    (None, 13)                0
_________________________________________________________________
dense_6 (Dense)              (None, 5)                 70
_________________________________________________________________
activation_6 (Activation)    (None, 5)                 0
=================================================================
Total params: 209
Trainable params: 209
Non-trainable params: 0
_________________________________________________________________
From this output we can see what each layer is (Dense denotes a fully connected layer) and the shape of the data after it passes through each layer.
We can also see Param #, the number of parameters in each layer. How are these Param values computed?
This article walks through the Param calculation for the two kinds of models below.
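Before working through the math by hand, note that Keras also exposes these counts programmatically through model.count_params() and layer.count_params(). A minimal sketch, using the same three-Dense-layer stack that is built in the next section:

# Minimal sketch: querying parameter counts directly from Keras.
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(7, input_shape=(4,)))
model.add(Dense(13))
model.add(Dense(5))

print(model.count_params())                  # 209 in total
for layer in model.layers:
    print(layer.name, layer.count_params())  # per-layer Param values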
Param calculation for a fully connected network

We first build the simplest possible neural network with the following code; it consists of just 3 fully connected layers:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation

model = Sequential()  # sequential model

# input layer
model.add(Dense(7, input_shape=(4,)))  # Dense is the standard fully connected layer
model.add(Activation('sigmoid'))       # activation function

# hidden layer
model.add(Dense(13))
model.add(Activation('sigmoid'))

# output layer
model.add(Dense(5))
model.add(Activation('softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])
model.summary()
The summary output for this model is:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_4 (Dense)              (None, 7)                 35
_________________________________________________________________
activation_4 (Activation)    (None, 7)                 0
_________________________________________________________________
dense_5 (Dense)              (None, 13)                104
_________________________________________________________________
activation_5 (Activation)    (None, 13)                0
_________________________________________________________________
dense_6 (Dense)              (None, 5)                 70
_________________________________________________________________
activation_6 (Activation)    (None, 5)                 0
=================================================================
Total params: 209
Trainable params: 209
Non-trainable params: 0
_________________________________________________________________
For a fully connected network, Param is the number of weights in each layer, including one bias per neuron, so Param = (input_dim + 1) * units. The three layers work out as follows (a short verification sketch follows the list):
The first Dense layer: the input dimension is 4 (a one-dimensional vector) and the layer has 7 neurons, so Param = (4 + 1) * 7 = 35.
The second Dense layer: the input dimension is 7 (the first layer's 7 neurons produce a 7-dimensional output) and the layer has 13 neurons, so Param = (7 + 1) * 13 = 104.
The third Dense layer: the input dimension is 13 (the second layer's 13 neurons produce a 13-dimensional output) and the layer has 5 neurons, so Param = (13 + 1) * 5 = 70.
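These three results can be checked with a few lines of Python. A minimal sketch (dense_params is a hypothetical helper name, not a Keras function):

# Minimal sketch: Param of a Dense layer = weights + biases.
# (input_dim + 1) * units: one weight per input per neuron, plus one bias per neuron.
def dense_params(input_dim, units):
    return (input_dim + 1) * units

print(dense_params(4, 7))   # 35  -> dense_4
print(dense_params(7, 13))  # 104 -> dense_5
print(dense_params(13, 5))  # 70  -> dense_6
print(35 + 104 + 70)        # 209, matching Total params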
Param calculation for a CNN

Next we build a CNN with the following code; it contains 3 convolutional layers:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers import Convolution2D as Conv2D
from keras.layers import MaxPooling2D
from keras import backend as K

model = Sequential()

# three convolutional layers whose Param values we derive below
model.add(Conv2D(32, kernel_size=(3, 2), input_shape=(8, 8, 1)))
convout1 = Activation('relu')
model.add(convout1)
model.add(Conv2D(64, (2, 3), activation='relu'))
model.add(Conv2D(64, (2, 2), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# classifier head
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.summary()
The summary output for this model is:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_10 (Conv2D)           (None, 6, 7, 32)          224
_________________________________________________________________
activation_4 (Activation)    (None, 6, 7, 32)          0
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 5, 5, 64)          12352
_________________________________________________________________
conv2d_12 (Conv2D)           (None, 4, 4, 64)          16448
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 2, 2, 64)          0
_________________________________________________________________
dropout_7 (Dropout)          (None, 2, 2, 64)          0
_________________________________________________________________
flatten_4 (Flatten)          (None, 256)               0
_________________________________________________________________
dense_6 (Dense)              (None, 128)               32896
_________________________________________________________________
dropout_8 (Dropout)          (None, 128)               0
_________________________________________________________________
dense_7 (Dense)              (None, 10)                1290
=================================================================
Total params: 63,210
Trainable params: 63,210
Non-trainable params: 0
_________________________________________________________________
According to [1], the Param of a convolutional layer in a CNN is computed as Param = (kernel_height * kernel_width * input_channels + 1) * filters, where the +1 is each filter's bias. So (a verification sketch follows the list):
The first Conv layer, Conv2D(32, kernel_size=(3, 2), input_shape=(8,8,1)): the input has 1 channel, so Param = (3 * 2 * 1 + 1) * 32 = 224.
The second Conv layer, Conv2D(64, (2, 3), activation='relu'): after the first layer's 32 filters, its input has 32 channels, so Param = (2 * 3 * 32 + 1) * 64 = 12352.
The third Conv layer, Conv2D(64, (2, 2), activation='relu'): after the second layer's 64 filters, its input has 64 channels, so Param = (2 * 2 * 64 + 1) * 64 = 16448.
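Again, these can be checked in a few lines. A minimal sketch (conv2d_params is a hypothetical helper name, not a Keras function):

# Minimal sketch: Param of a Conv2D layer with bias.
# (kernel_h * kernel_w * in_channels + 1) * filters
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    return (kernel_h * kernel_w * in_channels + 1) * filters

print(conv2d_params(3, 2, 1, 32))   # 224   -> conv2d_10
print(conv2d_params(2, 3, 32, 64))  # 12352 -> conv2d_11
print(conv2d_params(2, 2, 64, 64))  # 16448 -> conv2d_12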
Why is the Param of dense_6 (Dense) 32896?
Because flatten_4 (Flatten) reshapes the (2, 2, 64) pooling output into a vector of 2 * 2 * 64 = 256 values, and dense_6 (Dense) has 128 neurons, so Param = 128 * (256 + 1) = 32896.
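The same Dense formula also reproduces dense_7's 1290 parameters and the model total. A minimal self-contained sketch (dense_params is again a hypothetical helper, not part of Keras):

# Minimal sketch: the two Dense layers after Flatten, and the grand total.
def dense_params(input_dim, units):
    return (input_dim + 1) * units  # weights + one bias per neuron

print(dense_params(256, 128))  # 32896 -> dense_6
print(dense_params(128, 10))   # 1290  -> dense_7
print(224 + 12352 + 16448 + 32896 + 1290)  # 63210, matching Total params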