
Training Your Own Dataset with the VGG-19 Model

Preface:

The previous section introduced AlexNet, a classic model in image recognition; today's subject is another classic model in the field, VGG-19. VGG-19 was developed by the Visual Geometry Group at the University of Oxford. Unlike AlexNet, which was named after Alex Krizhevsky, this model is named after the lab's initials. The overall architecture of VGG-19 is similar to AlexNet's, with a few concrete improvements over it:

 

First: compared with AlexNet, VGG replaces the larger convolution kernels (11x11, 7x7, 5x5) with stacks of consecutive 3x3 kernels (see the parameter-count sketch after this list).

Second: the VGGNet structure is very clean; the entire network uses the same convolution kernel size (3x3) and max-pooling size (2x2).
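
To see why the 3x3 stacks pay off, here is a minimal sketch (not from the original post) comparing the weight counts of a single 5x5 convolution against two stacked 3x3 convolutions, which cover the same 5x5 receptive field; C is a hypothetical channel count:

# Weight-count comparison: one 5x5 conv vs. two stacked 3x3 convs.
# Both cover a 5x5 receptive field, but the stack uses fewer weights
# and adds an extra ReLU non-linearity between the two convolutions.
def conv_params(k, c_in, c_out):
    return k * k * c_in * c_out  # weights only, biases ignored

C = 256  # example channel count, as in VGG's third conv block
print('one 5x5 conv :', conv_params(5, C, C))      # 25*C^2 = 1,638,400
print('two 3x3 convs:', 2 * conv_params(3, C, C))  # 18*C^2 = 1,179,648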

 

The VGG-19 architecture:

First, let's look at how VGG evolved. In the configuration table from the paper, you can read off the depth of the different VGG versions, growing from 11 to 13 to 16 and finally to 19 weight layers.
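
The figure itself did not survive the repost; for reference, the standard configurations from the VGG paper are:

- VGG-11 (config A): 8 convolutional layers + 3 fully connected layers
- VGG-13 (config B): 10 convolutional + 3 fully connected
- VGG-16 (config D): 13 convolutional + 3 fully connected
- VGG-19 (config E): 16 convolutional + 3 fully connected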

 

First, as before, the main program:

It is almost identical to the AlexNet program from the previous section, so the code is simply listed here without further explanation.

# -*- coding: utf-8 -*-
# @Time : 2019/7/2 16:07
# @Author : YYLin
# @Email : 854280599@qq.com
# @File : VGG_19_Train.py
# Define the parameters needed by the model
from VGG_19 import VGG19
import tensorflow as tf
import os
import cv2
import numpy as np
from keras.utils import to_categorical

batch_size = 64
img_high = 100
img_width = 100
Channel = 3
label = 9

# Placeholders for the input images and labels
inputs = tf.placeholder(tf.float32, [batch_size, img_high, img_width, Channel], name='inputs')
y = tf.placeholder(dtype=tf.float32, shape=[batch_size, label], name='label')
keep_prob = tf.placeholder("float")
is_train = tf.placeholder(tf.bool)  # unused in this version; needed by the BN variant below

model = VGG19(inputs, keep_prob, label)
score = model.fc8
softmax_result = tf.nn.softmax(score)

# Loss function and the corresponding optimizer
cross_entropy = -tf.reduce_sum(y * tf.log(softmax_result))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Accuracy of the final predictions
correct_prediction = tf.equal(tf.argmax(softmax_result, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))


# Only the images and their labels are needed here; the text annotations are not loaded
def load_satetile_image(batch_size=128, dataset='train'):
    img_list = []
    label_list = []
    dir_counter = 0
    if dataset == 'train':
        path = '../Dataset/baidu/train_image/train'
    else:
        path = '../Dataset/baidu/valid_image/valid'
    # Read every jpg under each subfolder of the path into a list;
    # the subfolder index doubles as the class label
    for child_dir in os.listdir(path):
        child_path = os.path.join(path, child_dir)
        for dir_image in os.listdir(child_path):
            img = cv2.imread(os.path.join(child_path, dir_image))
            img = img / 255.0
            img_list.append(img)
            label_list.append(dir_counter)
        dir_counter += 1
    # Convert img_list to an np.array and one-hot encode the labels
    X_train = np.array(img_list)
    Y_train = to_categorical(label_list, 9)
    # Shuffle on every load and return one random batch
    data_index = np.arange(X_train.shape[0])
    np.random.shuffle(data_index)
    data_index = data_index[:batch_size]
    x_batch = X_train[data_index, :, :, :]
    y_batch = Y_train[data_index, :]
    return x_batch, y_batch


# Feed the data and start training
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(500000 // batch_size):
        # Load one training batch and one validation batch
        img, img_label = load_satetile_image(batch_size, dataset='train')
        img_valid, img_valid_label = load_satetile_image(batch_size, dataset='valid')
        dropout_rate = 0.5
        if i % 20 == 0:
            train_accuracy = accuracy.eval(feed_dict={inputs: img, y: img_label, keep_prob: dropout_rate})
            print("step %d, training accuracy %g" % (i, train_accuracy))
        train_step.run(feed_dict={inputs: img, y: img_label, keep_prob: dropout_rate})
        # Report the result on the validation set
        if i % 50 == 0:
            dropout_rate = 1
            valid_score = accuracy.eval(feed_dict={inputs: img_valid, y: img_valid_label, keep_prob: dropout_rate})
            print("step %d, valid accuracy %g" % (i, valid_score))
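
One caveat about the loss above: -tf.reduce_sum(y * tf.log(softmax_result)) returns NaN as soon as the softmax saturates to exactly zero. A minimal alternative sketch, using TensorFlow's fused op on the raw logits (score) instead of the hand-written formula:

# Numerically stable loss: TensorFlow fuses softmax and cross-entropy
# internally, avoiding log(0) when the softmax saturates.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y, logits=score))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)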

 

The core code of this section, VGG-19:

From the figure we can see that VGG-19 has 16 convolutional layers, whose channel counts are 64, 128, 256 and 512, followed by three fully connected layers with 4096, 4096 and 1000 units.

First: all convolution kernels in VGG-19 are 3 x 3 with stride 1 x 1. The code satisfies this.

Second: all max-pooling layers in VGG-19 use a 2 x 2 window with stride 2 x 2. The code satisfies this.

Third: checking each convolution's channel count against the figure above, the code clearly matches.

Fourth: in the first section we added some optimization tricks to the model and found that batch normalization can greatly improve accuracy, but this VGG-19 does not use it, so adding batch normalization is worth trying (a BN version follows below). The model also sticks to plain ReLU activations throughout, so there are still plenty of optimizations one could experiment with.

 

# -*- coding: utf-8 -*-
# @Time : 2019/7/2 8:18
# @Author : YYLin
# @Email : 854280599@qq.com
# @File : VGG_19.py
# VGG-19 model; see the reference implementation linked in the original post
import tensorflow as tf


def maxPoolLayer(x, kHeight, kWidth, strideX, strideY, name, padding="SAME"):
    return tf.nn.max_pool(x, ksize=[1, kHeight, kWidth, 1],
                          strides=[1, strideX, strideY, 1], padding=padding, name=name)


def dropout(x, keepPro, name=None):
    return tf.nn.dropout(x, keepPro, name=name)


def fcLayer(x, inputD, outputD, reluFlag, name):
    with tf.variable_scope(name) as scope:
        w = tf.get_variable("w", shape=[inputD, outputD], dtype="float")
        b = tf.get_variable("b", [outputD], dtype="float")
        out = tf.nn.xw_plus_b(x, w, b, name=scope.name)
        if reluFlag:
            return tf.nn.relu(out)
        else:
            return out


def convLayer(x, kHeight, kWidth, strideX, strideY, featureNum, name, padding="SAME"):
    channel = int(x.get_shape()[-1])
    with tf.variable_scope(name) as scope:
        w = tf.get_variable("w", shape=[kHeight, kWidth, channel, featureNum])
        b = tf.get_variable("b", shape=[featureNum])
        featureMap = tf.nn.conv2d(x, w, strides=[1, strideY, strideX, 1], padding=padding)
        out = tf.nn.bias_add(featureMap, b)
        return tf.nn.relu(out, name=scope.name)


class VGG19(object):
    def __init__(self, x, keepPro, classNum):
        self.X = x
        self.KEEPPRO = keepPro
        self.CLASSNUM = classNum
        self.begin_VGG_19()

    def begin_VGG_19(self):
        """build model"""
        conv1_1 = convLayer(self.X, 3, 3, 1, 1, 64, "conv1_1")
        conv1_2 = convLayer(conv1_1, 3, 3, 1, 1, 64, "conv1_2")
        pool1 = maxPoolLayer(conv1_2, 2, 2, 2, 2, "pool1")

        conv2_1 = convLayer(pool1, 3, 3, 1, 1, 128, "conv2_1")
        conv2_2 = convLayer(conv2_1, 3, 3, 1, 1, 128, "conv2_2")
        pool2 = maxPoolLayer(conv2_2, 2, 2, 2, 2, "pool2")

        conv3_1 = convLayer(pool2, 3, 3, 1, 1, 256, "conv3_1")
        conv3_2 = convLayer(conv3_1, 3, 3, 1, 1, 256, "conv3_2")
        conv3_3 = convLayer(conv3_2, 3, 3, 1, 1, 256, "conv3_3")
        conv3_4 = convLayer(conv3_3, 3, 3, 1, 1, 256, "conv3_4")
        pool3 = maxPoolLayer(conv3_4, 2, 2, 2, 2, "pool3")

        conv4_1 = convLayer(pool3, 3, 3, 1, 1, 512, "conv4_1")
        conv4_2 = convLayer(conv4_1, 3, 3, 1, 1, 512, "conv4_2")
        conv4_3 = convLayer(conv4_2, 3, 3, 1, 1, 512, "conv4_3")
        conv4_4 = convLayer(conv4_3, 3, 3, 1, 1, 512, "conv4_4")
        pool4 = maxPoolLayer(conv4_4, 2, 2, 2, 2, "pool4")

        conv5_1 = convLayer(pool4, 3, 3, 1, 1, 512, "conv5_1")
        conv5_2 = convLayer(conv5_1, 3, 3, 1, 1, 512, "conv5_2")
        conv5_3 = convLayer(conv5_2, 3, 3, 1, 1, 512, "conv5_3")
        conv5_4 = convLayer(conv5_3, 3, 3, 1, 1, 512, "conv5_4")
        pool5 = maxPoolLayer(conv5_4, 2, 2, 2, 2, "pool5")
        print('Shape of the last pooling layer:', pool5.shape)

        # With 100x100 inputs, five 2x2 poolings leave a 4x4x512 feature map
        fcIn = tf.reshape(pool5, [-1, 4*4*512])
        fc6 = fcLayer(fcIn, 4*4*512, 4096, True, "fc6")
        dropout1 = dropout(fc6, self.KEEPPRO)
        fc7 = fcLayer(dropout1, 4096, 4096, True, "fc7")
        dropout2 = dropout(fc7, self.KEEPPRO)
        # fc8 outputs raw logits (no ReLU) for the softmax in the training script
        self.fc8 = fcLayer(dropout2, 4096, self.CLASSNUM, False, "fc8")
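
As a quick sanity check (a small sketch, not part of the original code), the class can be instantiated on a dummy placeholder to confirm that pool5 flattens to 4*4*512 for 100x100 inputs and that fc8 emits one logit per class:

import tensorflow as tf
from VGG_19 import VGG19

x = tf.placeholder(tf.float32, [None, 100, 100, 3])
keep_prob = tf.placeholder(tf.float32)
model = VGG19(x, keep_prob, 9)  # 9 classes, as in the training script
print(model.fc8.shape)          # expected: (?, 9)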

 

VGG-19 with batch normalization added: I tested this and it works, but batch_size has to be reduced to 32, otherwise the GPU runs out of memory.

# -*- coding: utf-8 -*-
# @Time : 2019/7/2 16:57
# @Author : YYLin
# @Email : 854280599@qq.com
# @File : VGG_19_BN.py
import tensorflow as tf


# Added over the first version: batch normalization (2019-07-02).
# Note that in this version BN is applied to each conv layer's ReLU
# output, i.e. conv -> ReLU -> BN.
def bn(x, is_training):
    return tf.layers.batch_normalization(x, training=is_training)


def maxPoolLayer(x, kHeight, kWidth, strideX, strideY, name, padding="SAME"):
    return tf.nn.max_pool(x, ksize=[1, kHeight, kWidth, 1],
                          strides=[1, strideX, strideY, 1], padding=padding, name=name)


def dropout(x, keepPro, name=None):
    return tf.nn.dropout(x, keepPro, name=name)


def fcLayer(x, inputD, outputD, reluFlag, name):
    with tf.variable_scope(name) as scope:
        w = tf.get_variable("w", shape=[inputD, outputD], dtype="float")
        b = tf.get_variable("b", [outputD], dtype="float")
        out = tf.nn.xw_plus_b(x, w, b, name=scope.name)
        if reluFlag:
            return tf.nn.relu(out)
        else:
            return out


def convLayer(x, kHeight, kWidth, strideX, strideY, featureNum, name, padding="SAME"):
    channel = int(x.get_shape()[-1])
    with tf.variable_scope(name) as scope:
        w = tf.get_variable("w", shape=[kHeight, kWidth, channel, featureNum])
        b = tf.get_variable("b", shape=[featureNum])
        featureMap = tf.nn.conv2d(x, w, strides=[1, strideY, strideX, 1], padding=padding)
        out = tf.nn.bias_add(featureMap, b)
        return tf.nn.relu(out, name=scope.name)


class VGG19(object):
    def __init__(self, x, keepPro, classNum, is_training):
        self.X = x
        self.KEEPPRO = keepPro
        self.CLASSNUM = classNum
        self.is_training = is_training
        self.begin_VGG_19()

    def begin_VGG_19(self):
        """build model"""
        conv1_1 = convLayer(self.X, 3, 3, 1, 1, 64, "conv1_1")
        conv1_1 = bn(conv1_1, self.is_training)
        conv1_2 = convLayer(conv1_1, 3, 3, 1, 1, 64, "conv1_2")
        conv1_2 = bn(conv1_2, self.is_training)
        pool1 = maxPoolLayer(conv1_2, 2, 2, 2, 2, "pool1")

        conv2_1 = convLayer(pool1, 3, 3, 1, 1, 128, "conv2_1")
        conv2_1 = bn(conv2_1, self.is_training)
        conv2_2 = convLayer(conv2_1, 3, 3, 1, 1, 128, "conv2_2")
        conv2_2 = bn(conv2_2, self.is_training)
        pool2 = maxPoolLayer(conv2_2, 2, 2, 2, 2, "pool2")

        conv3_1 = convLayer(pool2, 3, 3, 1, 1, 256, "conv3_1")
        conv3_1 = bn(conv3_1, self.is_training)
        conv3_2 = convLayer(conv3_1, 3, 3, 1, 1, 256, "conv3_2")
        conv3_2 = bn(conv3_2, self.is_training)
        conv3_3 = convLayer(conv3_2, 3, 3, 1, 1, 256, "conv3_3")
        conv3_3 = bn(conv3_3, self.is_training)
        conv3_4 = convLayer(conv3_3, 3, 3, 1, 1, 256, "conv3_4")
        conv3_4 = bn(conv3_4, self.is_training)
        pool3 = maxPoolLayer(conv3_4, 2, 2, 2, 2, "pool3")

        conv4_1 = convLayer(pool3, 3, 3, 1, 1, 512, "conv4_1")
        conv4_1 = bn(conv4_1, self.is_training)
        conv4_2 = convLayer(conv4_1, 3, 3, 1, 1, 512, "conv4_2")
        conv4_2 = bn(conv4_2, self.is_training)
        conv4_3 = convLayer(conv4_2, 3, 3, 1, 1, 512, "conv4_3")
        conv4_3 = bn(conv4_3, self.is_training)
        conv4_4 = convLayer(conv4_3, 3, 3, 1, 1, 512, "conv4_4")
        conv4_4 = bn(conv4_4, self.is_training)
        pool4 = maxPoolLayer(conv4_4, 2, 2, 2, 2, "pool4")

        conv5_1 = convLayer(pool4, 3, 3, 1, 1, 512, "conv5_1")
        conv5_1 = bn(conv5_1, self.is_training)
        conv5_2 = convLayer(conv5_1, 3, 3, 1, 1, 512, "conv5_2")
        conv5_2 = bn(conv5_2, self.is_training)
        conv5_3 = convLayer(conv5_2, 3, 3, 1, 1, 512, "conv5_3")
        conv5_3 = bn(conv5_3, self.is_training)
        conv5_4 = convLayer(conv5_3, 3, 3, 1, 1, 512, "conv5_4")
        conv5_4 = bn(conv5_4, self.is_training)
        pool5 = maxPoolLayer(conv5_4, 2, 2, 2, 2, "pool5")
        print('Shape of the last pooling layer:', pool5.shape)

        fcIn = tf.reshape(pool5, [-1, 4*4*512])
        fc6 = fcLayer(fcIn, 4*4*512, 4096, True, "fc6")
        dropout1 = dropout(fc6, self.KEEPPRO)
        fc7 = fcLayer(dropout1, 4096, 4096, True, "fc7")
        dropout2 = dropout(fc7, self.KEEPPRO)
        # fc8 outputs raw logits (no ReLU) for the softmax in the training script
        self.fc8 = fcLayer(dropout2, 4096, self.CLASSNUM, False, "fc8")
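
One detail the training script does not show: tf.layers.batch_normalization keeps moving mean/variance statistics that are only updated through ops in the UPDATE_OPS collection, and the is_training placeholder must actually be fed. A minimal wiring sketch, assuming the same variable names as the training script above:

# Attach BN's update ops to the train step; without this the moving
# averages are never updated and inference-time BN misbehaves.
model = VGG19(inputs, keep_prob, label, is_train)
score = model.fc8
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Then feed the flag explicitly:
#   training:   {inputs: img, y: img_label, keep_prob: 0.5, is_train: True}
#   validation: {inputs: img_valid, y: img_valid_label, keep_prob: 1.0, is_train: False}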

 

Analysis of the VGG-19 training results:

 

Analysis of the results after adding BN to VGG-19:

 

 
