
Image Classification with ResNet50: A TensorFlow Implementation


Goal: classify the food images in the dataset into 5 classes. Each class has 1,000 images, for 5,000 in total.

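Summary: the training/validation split was initially 8:2, which left the model somewhat underfit, so I later changed it to 9:1. I tested the vanilla ResNet50, ResNet101, and ResNet152 as well as the improved ResNet50 and ResNet101, but the best validation accuracy reached only about 75%.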

Performance of the improved ResNet101:

  • Training and validation accuracy curves

  • Training and validation loss curves

Environment: TensorFlow 2.1.0.

ResNet50 architecture:

Reference: ResNet-50 结构, 简书 (jianshu.com)

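ResNet has two basic blocks. The Identity Block has the same input and output dimensions, so several of them can be chained in a row. The Conv Block has different input and output dimensions, so it cannot be chained back to back; its job is to change the dimensions of the feature tensor.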


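A CNN ultimately turns the input image, step by step, into a feature map that is spatially small but very deep. It usually does so with a uniform, small kernel (VGG uses 3*3, for example). As the network gets deeper, the number of output channels also grows (the learned features become more complex), so a Conv Block is needed to change the dimensions before entering a run of Identity Blocks, which can then be stacked consecutively.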


Conv Block:

Identity Block:

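In a Conv Block, a Conv2D layer (1*1 filter size) is added on the shortcut path. After the main path changes the dimensions, this layer transforms the shortcut so that its output has the same dimensions as the main path output.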
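The role of the 1*1 convolution on the shortcut can be illustrated with a small NumPy sketch. It is not the Keras implementation: `conv1x1` is a hypothetical helper, the tensors are random stand-ins, and the shapes (56x56x256 in, 28x28x512 out) are chosen only for illustration.

```python
import numpy as np

def conv1x1(x, weights, stride=1):
    """1x1 convolution on an NHWC tensor: subsample spatially, then mix channels."""
    x = x[:, ::stride, ::stride, :]   # a 1x1 kernel with stride s keeps every s-th pixel
    return x @ weights                # (N, H, W, C_in) @ (C_in, C_out)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 56, 56, 256))          # block input
main_out = rng.normal(size=(1, 28, 28, 512))   # stand-in for the main path output

# Identity shortcut: the shapes differ, so an element-wise add is not possible.
identity_ok = x.shape == main_out.shape

# Projection shortcut: 1x1 conv with stride 2 and 512 output channels.
w_shortcut = rng.normal(size=(256, 512))
shortcut = conv1x1(x, w_shortcut, stride=2)
y = np.maximum(main_out + shortcut, 0.0)       # add, then ReLU

print(identity_ok, shortcut.shape, y.shape)
```

When the main path keeps both the spatial size and the channel count, the projection is unnecessary and the input can be added directly, which is the Identity Block case.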

The ResNetV1 flow for block_sizes=[3, 4, 6, 3] is traced below. The trace uses the plain building_block (two 3x3 convolutions) instead of the bottleneck block; the standard ResNet-50 uses the bottleneck variant, in which each block outputs 4 times as many filters. Only ResNet v1 applies BN and ReLU right after initial_conv.

  block_sizes=[3, 4, 6, 3] is the number of blocks in each of the 4 layers after stage 1 (the first pool); the layers correspond to res2, res3, res4, and res5.
  The first block of every layer applies conv+BN on the shortcut, i.e. it is a Conv Block.

  inputs: (1, 720, 1280, 3)
  initial_conv:
    conv2d_fixed_padding()
      1. kernel_size=7, pad first: (1, 720, 1280, 3) -> (1, 726, 1286, 3)
      2. conv2d kernels=[7, 7, 3, 64], stride=2, VALID convolution. For the 7x7 kernel the padding is 3 on every side, so that the top-left input pixel is aligned with the kernel center.
         (1, 726, 1286, 3) -> (1, 360, 640, 64)
      3. BN, ReLU (only ResNet v1 applies BN and ReLU after the first conv)
  initial_max_pool:
    k=3, s=2, padding='SAME', (1, 360, 640, 64) -> (1, 180, 320, 64)
  Everything below uses building_block, without bottleneck.
  block_layer1:
    (3 blocks, stride=1 between layers because the previous layer already pooled, 64 filters, no bottleneck; with bottleneck the number of filters is multiplied by 4)
    1. First block:
       A Conv Block has a projection_shortcut, and its strides can be 1 or 2.
       An Identity Block has no projection_shortcut, and its strides must be 1.
       `inputs = block_fn(inputs, filters, training, projection_shortcut, strides, data_format)`
       The shortcut goes through a [1, 1, 64, 64], stride=1 conv and BN; the shape is unchanged.
       It is then added to the output of the main branch's convolutions (two for the building_block traced here, three for a bottleneck block), followed by ReLU. Note that the last convolution in a block is followed by BN only, with no ReLU.
       main branch: conv-bn-relu-conv-bn (bottleneck: conv-bn-relu-conv-bn-relu-conv-bn), added to the shortcut and then ReLU
       shortcut: conv-bn
       shortcut: [1, 1, 64, 64], s=1, (1, 180, 320, 64) -> (1, 180, 320, 64)
       The main branch applies two [3, 3, 64, 64], s=1 convolutions; the shape is unchanged: (1, 180, 320, 64) -> (1, 180, 320, 64) -> (1, 180, 320, 64)
       inputs += shortcut, then ReLU
    2. The remaining 2 blocks are identical to each other:
       `inputs = block_fn(inputs, filters, training, None, 1, data_format)`
       The shortcut is added directly to the main branch output, with no conv-bn.
       The main branch applies two [3, 3, 64, 64], s=1 convolutions; the shape is unchanged: (1, 180, 320, 64) -> (1, 180, 320, 64) -> (1, 180, 320, 64)
       inputs += shortcut, then ReLU
  block_layer2/3/4 follow block_layer1. They differ in the number of identity blocks per layer, the number of filters, and the stride between layers, but it is still only the first (conv) block whose shortcut applies conv-bn.
  block_layer2: 4 blocks, 128 filters, stride=2 between layers (because no pooling follows the previous layer)
    1. First block:
       shortcut: kernel=[1, 1, 64, 128], s=2 conv and BN, (1, 180, 320, 64) -> (1, 90, 160, 128)
       main branch, first conv: kernel=[3, 3, 64, 128], s=2, padding='VALID' (after fixed padding of 1 per side), (1, 180, 320, 64) -> (1, 90, 160, 128)
       main branch, second conv: kernel=[3, 3, 128, 128], s=1, padding='SAME', (1, 90, 160, 128) -> (1, 90, 160, 128)
    2. The remaining 3 blocks are identical to each other:
       The shortcut is added to the result unchanged, followed by ReLU.
       The main branch applies two [3, 3, 128, 128], s=1 convolutions, padding='SAME', (1, 90, 160, 128) -> (1, 90, 160, 128) -> (1, 90, 160, 128)
  block_layer3: 6 blocks, 256 filters, stride=2 between layers
    1. First block:
       shortcut: kernel=[1, 1, 128, 256], s=2 conv and BN, (1, 90, 160, 128) -> (1, 45, 80, 256)
       main branch, first conv: kernel=[3, 3, 128, 256], s=2, padding='VALID' (after fixed padding of 1 per side), (1, 90, 160, 128) -> (1, 45, 80, 256)
       main branch, second conv: kernel=[3, 3, 256, 256], s=1, padding='SAME', (1, 45, 80, 256) -> (1, 45, 80, 256)
    2. The remaining 5 blocks are identical to each other:
       The shortcut is added to the result unchanged, followed by ReLU.
       The main branch applies two [3, 3, 256, 256], s=1 convolutions, padding='SAME', (1, 45, 80, 256) -> (1, 45, 80, 256) -> (1, 45, 80, 256)
  block_layer4: 3 blocks, 512 filters, stride=2 between layers
    1. First block:
       shortcut: kernel=[1, 1, 256, 512], s=2 conv and BN, (1, 45, 80, 256) -> (1, 23, 40, 512)
       main branch, first conv: kernel=[3, 3, 256, 512], s=2, padding='VALID' (after fixed padding of 1 per side), (1, 45, 80, 256) -> (1, 23, 40, 512)
       main branch, second conv: kernel=[3, 3, 512, 512], s=1, padding='SAME', (1, 23, 40, 512) -> (1, 23, 40, 512)
    2. The remaining 2 blocks are identical to each other:
       The shortcut is added to the result unchanged, followed by ReLU.
       The main branch applies two [3, 3, 512, 512], s=1 convolutions, padding='SAME', (1, 23, 40, 512) -> (1, 23, 40, 512) -> (1, 23, 40, 512)
  avg_pool, 7*7 (the final feature map is 7x7 for a standard 224x224 input)
  FC, output 1000
  softmax
  output prediction
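The spatial sizes in the trace above can be checked with a few lines of arithmetic. This is a minimal sketch assuming the padding scheme described in the trace: explicit padding of (kernel - 1) / 2 per side before every strided VALID convolution, and 'SAME' padding for the max pool. The helper names are hypothetical.

```python
def conv_out(size, kernel, stride, pad):
    """Output length along one axis for a conv with explicit padding `pad` per side."""
    return (size + 2 * pad - kernel) // stride + 1

def same_out(size, stride):
    """Output length under 'SAME' padding: ceil(size / stride)."""
    return -(-size // stride)

def trace(height, width):
    shapes = [(height, width)]
    # initial_conv: 7x7, stride 2, explicit padding 3
    h, w = conv_out(height, 7, 2, 3), conv_out(width, 7, 2, 3)
    shapes.append((h, w))
    # initial_max_pool: 3x3, stride 2, 'SAME'
    h, w = same_out(h, 2), same_out(w, 2)
    shapes.append((h, w))
    # block_layer1 keeps the size; block_layer2..4 each start with a
    # stride-2 3x3 conv preceded by explicit padding of 1
    for _ in range(3):
        h, w = conv_out(h, 3, 2, 1), conv_out(w, 3, 2, 1)
        shapes.append((h, w))
    return shapes

print(trace(720, 1280))
# [(720, 1280), (360, 640), (180, 320), (90, 160), (45, 80), (23, 40)]
```

Without the explicit padding, a 3x3 VALID convolution with stride 2 would map 180 to 89 rather than 90, which is why the fixed padding matters for the shapes listed above.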

The Keras version is structured as follows. res2a denotes the first block of stage 2, branch1 is the shortcut path, branch2 is the main path, and branch2a is the first convolution on the main path:
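The naming scheme can be made concrete with a tiny helper. `branch_names` is a hypothetical function written for illustration only; it simply assembles the strings the way the naming convention above describes.

```python
def branch_names(stage, block, has_projection):
    """Conv layer names of one residual block under the res<stage><block>_branch<...> scheme."""
    base = f"res{stage}{block}_branch"
    names = [base + "2" + suffix for suffix in "abc"]   # main path: three convolutions
    if has_projection:
        names.append(base + "1")                        # shortcut path (Conv Block only)
    return names

print(branch_names(2, "a", has_projection=True))
# ['res2a_branch2a', 'res2a_branch2b', 'res2a_branch2c', 'res2a_branch1']
print(branch_names(2, "b", has_projection=False))
# ['res2b_branch2a', 'res2b_branch2b', 'res2b_branch2c']
```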

Experiments:

1. Vanilla ResNet50 + optimizer ①

    model.compile(
        optimizer='rmsprop',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=['accuracy']
    )

    train_count = len(train_path)
    test_count = len(test_path)
    steps_per_epoch = train_count // BATCH_SIZE
    validation_steps = test_count // BATCH_SIZE

    history = model.fit(
        train_datasets,
        steps_per_epoch=steps_per_epoch,
        epochs=120,
        # verbose=1,
        validation_data=test_datasets,
        validation_steps=validation_steps
    )

Results:

loss: 0.0414 - accuracy: 0.9880 - val_loss: 3.8412 - val_accuracy: 0.5560

2. Vanilla ResNet50 + optimizer ②

    model.compile(
        optimizer=tf.keras.optimizers.Adam(0.0001),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
        metrics=['acc']
    )

    history = model.fit(
        train_datasets,
        epochs=150,
        steps_per_epoch=steps_per_epoch,
        validation_data=test_datasets,
        validation_steps=validation_steps
    )

Results:

loss: 0.0322 - acc: 0.9896 - val_loss: 2.4060 - val_acc: 0.5938
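The two result lines so far can be compared by the gap between training and validation accuracy. The sketch below parses the log lines quoted above with a hypothetical `parse_log` helper; a large gap is a common sign of overfitting, although that reading is an interpretation rather than something the logs state.

```python
import re

def parse_log(line):
    """Parse 'name: value' pairs from a Keras progress line into a dict of floats."""
    return {key: float(value) for key, value in re.findall(r"(\w+): ([0-9.]+)", line)}

run1 = parse_log("loss: 0.0414 - accuracy: 0.9880 - val_loss: 3.8412 - val_accuracy: 0.5560")
run2 = parse_log("loss: 0.0322 - acc: 0.9896 - val_loss: 2.4060 - val_acc: 0.5938")

gap1 = run1["accuracy"] - run1["val_accuracy"]
gap2 = run2["acc"] - run2["val_acc"]
print(f"optimizer 1 gap: {gap1:.4f}")   # optimizer 1 gap: 0.4320
print(f"optimizer 2 gap: {gap2:.4f}")   # optimizer 2 gap: 0.3958
```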

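3. Improved ResNet50/101

  • Changes:

Dropout(rate=0.2) was added after every convolution, and EarlyStopping was added to training, both to reduce overfitting. The SGD optimizer uses momentum=0.8, which sped up training somewhat.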
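To see why momentum can speed up training, here is a minimal sketch of the classical momentum update (the rule Keras documents for SGD without Nesterov: velocity = momentum * velocity - learning_rate * gradient, then w = w + velocity), applied to the toy objective f(w) = w**2. The objective, starting point, and step count are assumptions chosen for illustration.

```python
def sgd_momentum_step(w, velocity, grad, lr=0.01, momentum=0.8):
    """One SGD step with classical (non-Nesterov) momentum."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

def minimize(momentum, steps=50, lr=0.01):
    """Minimize f(w) = w**2 starting from w = 5 and return the final w."""
    w, velocity = 5.0, 0.0
    for _ in range(steps):
        grad = 2.0 * w   # derivative of w**2
        w, velocity = sgd_momentum_step(w, velocity, grad, lr=lr, momentum=momentum)
    return w

plain = minimize(momentum=0.0)
with_momentum = minimize(momentum=0.8)
print(abs(plain), abs(with_momentum))
```

With the same learning rate and number of steps, the run with momentum=0.8 ends much closer to the minimum at 0 than the run without momentum. Real loss surfaces are far less regular, so the speed-up there is usually smaller.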

  • Full code:

    import glob

    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import layers
    from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, ZeroPadding2D, AveragePooling2D
    from tensorflow.keras.layers import Activation, BatchNormalization, Flatten
    from tensorflow.keras.models import Model

    # Jupyter shell command: unpack the dataset
    !unzip 'Food/foods.zip'
    glob.glob('*/*')

    img_path = glob.glob('foods/train/*/*.jpg')
    # The class name is the third path component: foods/train/<class>/<file>.jpg
    labels = [img.split('/')[2] for img in img_path]
    label = np.unique(labels)
    label_to_index = {item: index for index, item in enumerate(label)}
    index_to_label = {index: item for item, index in label_to_index.items()}
    index_labels = [label_to_index.get(name) for name in labels]  # map class names to integer ids
    index_labels[:3]  # index_labels is the list of integer labels
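
The label extraction can be tried without the dataset. The sketch below uses three hypothetical file paths (the class names `pizza` and `sushi` are invented) and takes the parent directory name instead of a fixed split index, an alternative that does not depend on how deep the dataset folder sits.

```python
from pathlib import PurePosixPath

# Hypothetical paths; the real ones come from glob.glob('foods/train/*/*.jpg')
paths = [
    "foods/train/pizza/001.jpg",
    "foods/train/sushi/002.jpg",
    "foods/train/pizza/003.jpg",
]

names = [PurePosixPath(p).parent.name for p in paths]   # class = parent directory
classes = sorted(set(names))
label_to_index = {name: index for index, name in enumerate(classes)}
index_to_label = {index: name for name, index in label_to_index.items()}
index_labels = [label_to_index[name] for name in names]

print(label_to_index)   # {'pizza': 0, 'sushi': 1}
print(index_labels)     # [0, 1, 0]
```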

    # Shuffle paths and labels together, then split 9:1 into training and validation sets
    random_index = np.random.permutation(len(img_path))
    train_img_path = np.array(img_path)[random_index]
    train_img_label = np.array(index_labels)[random_index]
    divider = int(len(img_path) * 0.9)
    train_path = train_img_path[:divider]
    train_label = train_img_label[:divider]
    test_path = train_img_path[divider:]
    test_label = train_img_label[divider:]
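
A plain random split does not guarantee that every class is equally represented in the validation set. As an alternative that the original experiment does not use, a stratified split can be sketched in pure Python; `stratified_split` is a hypothetical helper and the synthetic labels stand in for the real ones.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.1, seed=0):
    """Return (train_idx, val_idx) with the same validation fraction in every class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = int(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return train_idx, val_idx

labels = [c for c in range(5) for _ in range(1000)]   # 5 classes x 1,000 images
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))   # 4500 500
```

The returned indices could then be used to select from the path and label arrays in place of the slice at `divider`.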

    # Datasets of (path, label) pairs
    train_datasets = tf.data.Dataset.from_tensor_slices((train_path, train_label))
    test_datasets = tf.data.Dataset.from_tensor_slices((test_path, test_label))

    def load_img(img_path, img_label):
        image = tf.io.read_file(img_path)
        image = tf.image.decode_jpeg(image, channels=3)  # decode the JPEG into 3 channels
        image = tf.image.resize(image, [224, 224])
        image = tf.cast(image, tf.float32)
        image = image / 255  # scale pixel values to [0, 1]
        return image, img_label

    AUTOTUNE = tf.data.experimental.AUTOTUNE  # let tf.data pick the level of parallelism
    train_datasets = train_datasets.map(load_img, num_parallel_calls=AUTOTUNE)
    test_datasets = test_datasets.map(load_img, num_parallel_calls=AUTOTUNE)
    BATCH_SIZE = 32

    train_datasets = train_datasets.repeat().shuffle(300).batch(BATCH_SIZE)
    # <BatchDataset shapes: ((None, 224, 224, 3), (None,)), types: (tf.float32, tf.int64)>
    test_datasets = test_datasets.batch(BATCH_SIZE)
    # <BatchDataset shapes: ((None, 224, 224, 3), (None,)), types: (tf.float32, tf.int64)>

    from tensorflow.keras.layers import Dropout

    def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
        """Bottleneck block with a projection (1x1 conv + BN) on the shortcut."""
        # e.g. filters = [64, 64, 256]
        filters1, filters2, filters3 = filters
        conv_name_base = 'res' + str(stage) + block + '_branch'
        bn_name_base = 'bn' + str(stage) + block + '_branch'

        # 1x1 conv: reduce the channel count (and downsample when strides > 1)
        x = Conv2D(filters1, (1, 1), strides=strides,
                   name=conv_name_base + '2a')(input_tensor)
        x = BatchNormalization(name=bn_name_base + '2a')(x)
        x = Dropout(0.2)(x)
        x = Activation('relu')(x)

        # 3x3 conv
        x = Conv2D(filters2, kernel_size, padding='same',
                   name=conv_name_base + '2b')(x)
        x = BatchNormalization(name=bn_name_base + '2b')(x)
        x = Dropout(0.2)(x)
        x = Activation('relu')(x)

        # 1x1 conv: expand the channel count
        x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
        x = BatchNormalization(name=bn_name_base + '2c')(x)
        x = Dropout(0.2)(x)

        # Shortcut: project input_tensor to the same (w x h x c) as the main path
        shortcut = Conv2D(filters3, (1, 1), strides=strides,
                          name=conv_name_base + '1')(input_tensor)
        shortcut = BatchNormalization(name=bn_name_base + '1')(shortcut)

        x = layers.add([x, shortcut])
        x = Activation('relu')(x)
        return x

    def identity_block(input_tensor, kernel_size, filters, stage, block):
        """Bottleneck block whose shortcut is the unchanged input."""
        # e.g. filters = [512, 512, 1024]: filters1 = 512, filters2 = 512, filters3 = 1024
        filters1, filters2, filters3 = filters
        conv_name_base = 'res' + str(stage) + block + '_branch'
        bn_name_base = 'bn' + str(stage) + block + '_branch'

        # 1x1 conv: reduce the channel count
        x = Conv2D(filters1, (1, 1), name=conv_name_base + '2a')(input_tensor)
        x = BatchNormalization(name=bn_name_base + '2a')(x)
        x = Dropout(0.2)(x)
        x = Activation('relu')(x)

        # 3x3 conv (note: here Dropout precedes BatchNormalization, unlike the other branches)
        x = Conv2D(filters2, kernel_size, padding='same', name=conv_name_base + '2b')(x)
        x = Dropout(0.2)(x)
        x = BatchNormalization(name=bn_name_base + '2b')(x)
        x = Activation('relu')(x)

        # 1x1 conv: expand the channel count
        x = Conv2D(filters3, (1, 1), name=conv_name_base + '2c')(x)
        x = BatchNormalization(name=bn_name_base + '2c')(x)
        x = Dropout(0.2)(x)

        # Identity shortcut: add the input directly
        x = layers.add([x, input_tensor])
        x = Activation('relu')(x)
        return x

    # resnet18:  ResNet(BasicBlock, [2, 2, 2, 2])
    # resnet34:  ResNet(BasicBlock, [3, 4, 6, 3])
    # resnet50:  ResNet(Bottleneck, [3, 4, 6, 3])
    # resnet101: ResNet(Bottleneck, [3, 4, 23, 3])
    # resnet152: ResNet(Bottleneck, [3, 8, 36, 3])
    def ResNet50(input_shape=(224, 224, 3), classes=5):
        # [224,224,3]
        img_input = tf.keras.layers.Input(shape=input_shape)
        x = ZeroPadding2D((3, 3))(img_input)  # [230,230,3]

        x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)  # [112,112,64]
        x = BatchNormalization(name='bn_conv1')(x)
        x = Activation('relu')(x)

        # [55,55,64]: MaxPooling2D defaults to padding='valid', so 112 -> 55
        x = MaxPooling2D((3, 3), strides=(2, 2))(x)

        # [55,55,256]
        x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
        x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
        x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

        # [28,28,512]
        x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

        # [14,14,1024]
        x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
        x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')

        # [7,7,2048]
        x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
        x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
        x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

        # Average pooling takes the place of large fully connected layers
        x = AveragePooling2D((7, 7), name='avg_pool')(x)

        # Classification head
        x = Flatten()(x)
        x = Dense(classes, activation='softmax', name='fc5')(x)
        model = Model(img_input, x, name='resnet50')
        return model
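
The spatial sizes produced by this Keras model for a 224x224 input can be verified by hand. The sketch below assumes the default padding='valid' for Conv2D and MaxPooling2D, and it follows only the layers that change the spatial size (the padded 7x7 stem, the max pool, and the strided 1x1 convolution that opens each stage). `valid_out` is a hypothetical helper.

```python
def valid_out(size, kernel, stride):
    """Output length along one axis under 'valid' padding."""
    return (size - kernel) // stride + 1

size = 224
trace = [size]
size = size + 2 * 3                # ZeroPadding2D((3, 3)) -> 230
size = valid_out(size, 7, 2)       # conv1: 7x7, stride 2
trace.append(size)
size = valid_out(size, 3, 2)       # MaxPooling2D((3, 3), strides=(2, 2))
trace.append(size)
for stride in (1, 2, 2, 2):        # stages 2-5: first 1x1 conv of each conv_block
    size = valid_out(size, 1, stride)
    trace.append(size)

print(trace)   # [224, 112, 55, 55, 28, 14, 7]
```

The pool output is 55x55 rather than 56x56 because no padding is applied before it; the later stages still arrive at 28, 14, and 7, so the final 7x7 average pool fits.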

    def ResNet101(input_shape=(224, 224, 3), classes=5):
        # [224,224,3]
        img_input = tf.keras.layers.Input(shape=input_shape)
        x = ZeroPadding2D((3, 3))(img_input)  # [230,230,3]

        x = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(x)  # [112,112,64]
        x = BatchNormalization(name='bn_conv1')(x)
        x = Activation('relu')(x)

        # [55,55,64]: MaxPooling2D defaults to padding='valid', so 112 -> 55
        x = MaxPooling2D((3, 3), strides=(2, 2))(x)

        # [55,55,256]
        x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
        x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
        x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

        # [28,28,512]
        x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
        x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

        # [14,14,1024]: one conv_block followed by 22 identity blocks ('b' through 'w')
        x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
        for block in 'bcdefghijklmnopqrstuvw':
            x = identity_block(x, 3, [256, 256, 1024], stage=4, block=block)

        # [7,7,2048]
        x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
        x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
        x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

        # Average pooling takes the place of large fully connected layers
        x = AveragePooling2D((7, 7), name='avg_pool')(x)

        # Classification head
        x = Flatten()(x)
        x = Dense(classes, activation='softmax', name='fc5')(x)
        model = Model(img_input, x, name='resnet101')
        return model
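
The numbers in the model names follow from the block counts listed in the comments above. A minimal sketch of the usual counting convention, which counts the stem convolution, the convolutions on the main path of every block, and the final fully connected layer, and leaves out the projection convolutions on shortcuts:

```python
CONFIGS = {
    "resnet18": ("basic", [2, 2, 2, 2]),
    "resnet34": ("basic", [3, 4, 6, 3]),
    "resnet50": ("bottleneck", [3, 4, 6, 3]),
    "resnet101": ("bottleneck", [3, 4, 23, 3]),
    "resnet152": ("bottleneck", [3, 8, 36, 3]),
}
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}

def depth(name):
    """Number of weighted layers: stem conv + main-path convs + final FC."""
    kind, blocks = CONFIGS[name]
    return 1 + CONVS_PER_BLOCK[kind] * sum(blocks) + 1

for name in CONFIGS:
    print(name, depth(name))
```

This also shows why ResNet34 and ResNet50 share the block layout [3, 4, 6, 3]: they differ only in the block type.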

    # model = ResNet50(input_shape=[224,224,3],classes=5)
    # model = ResNet50()
    model = ResNet101()
    # model.summary()

    sgd = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.8, nesterov=False)
    model.compile(
        optimizer=sgd,
        # loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
        loss='sparse_categorical_crossentropy',  # cross-entropy loss for integer labels
        metrics=['accuracy']
    )

    train_count = len(train_path)
    test_count = len(test_path)
    steps_per_epoch = train_count // BATCH_SIZE
    validation_steps = test_count // BATCH_SIZE
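
The step counts are worth working through once. The sketch assumes that the glob matched all 5,000 images, so the 9:1 split gives 4,500 training and 500 validation images.

```python
import math

BATCH_SIZE = 32
train_count, test_count = 4500, 500   # assumed: 9:1 split of 5,000 images

steps_per_epoch = train_count // BATCH_SIZE                  # 140
validation_steps = test_count // BATCH_SIZE                  # 15
skipped_val = test_count - validation_steps * BATCH_SIZE     # 20
full_validation_steps = math.ceil(test_count / BATCH_SIZE)   # 16

print(steps_per_epoch, validation_steps, skipped_val, full_validation_steps)
# 140 15 20 16
```

With floor division, the last partial validation batch (20 images under these assumptions) would likely never be evaluated. Rounding up, or omitting `validation_steps` for a finite dataset, would include it.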

    from tensorflow.keras.callbacks import EarlyStopping

    early_stopping = EarlyStopping(
        min_delta=0.0008,  # minimum amount of change to count as an improvement
        patience=40,  # how many epochs to wait before stopping
        restore_best_weights=True,
    )
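
How `patience` and `min_delta` interact can be simulated without training anything. This is a simplified sketch of the stopping rule for a monitored loss (lower is better), not the Keras source; `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience, min_delta=0.0):
    """Return (stop_epoch, best_epoch); stop_epoch is None if training runs to the end."""
    best, best_epoch, wait = float("inf"), None, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:      # an improvement must exceed min_delta
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

losses = [1.00, 0.80, 0.7995, 0.81, 0.82, 0.83]   # hypothetical validation losses
stop, best = early_stopping_epoch(losses, patience=3, min_delta=0.0008)
print(stop, best)   # 4 1
```

Epoch 2 lowers the loss by only 0.0005, which is below min_delta, so it does not reset the counter. With restore_best_weights=True, the weights from the best epoch (epoch 1 in this toy run) would be the ones kept.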

    # Start training
    history = model.fit(
        train_datasets,
        epochs=120,
        steps_per_epoch=steps_per_epoch,
        validation_data=test_datasets,
        validation_steps=validation_steps,
        callbacks=[early_stopping],  # put your callbacks in a list
    )

    # model.save('myModel.h5')
    history.history.keys()
    # dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])

    # Plot training and validation accuracy
    plt.figure()
    plt.plot(history.epoch, history.history.get('accuracy'), label='accuracy')
    plt.plot(history.epoch, history.history.get('val_accuracy'), label='val_accuracy')
    plt.legend()

    # Plot training and validation loss
    plt.figure()
    plt.plot(history.epoch, history.history.get('loss'), label='loss')
    plt.plot(history.epoch, history.history.get('val_loss'), label='val_loss')
    plt.legend()
    plt.show()
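
Besides plotting, the best epoch can be read directly from the history dictionary. The values below are hypothetical stand-ins for `history.history`, and `best_epoch` is an illustrative helper.

```python
def best_epoch(history_dict, metric="val_accuracy", mode="max"):
    """Index of the epoch with the best value of `metric`."""
    values = history_dict[metric]
    pick = max if mode == "max" else min
    return pick(range(len(values)), key=values.__getitem__)

# Hypothetical values; in the notebook this would be history.history
history_dict = {
    "val_accuracy": [0.41, 0.58, 0.66, 0.75, 0.71],
    "val_loss": [1.9, 1.4, 1.2, 1.6, 1.9],
}

print(best_epoch(history_dict))                            # 3
print(best_epoch(history_dict, "val_loss", mode="min"))    # 2
```

In this toy history, validation accuracy and validation loss peak at different epochs, which can happen in practice too. EarlyStopping monitors val_loss by default.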

Results:

loss: 0.0115 - accuracy: 0.9960 - val_loss: 1.8830 - val_accuracy: 0.7542
