I trained the following Keras model on a 6-class dataset:
import tensorflow as tf


def single_net_block(inputs, classes, postfix='r1'):
    # Two plain conv layers followed by max pooling.
    h1 = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv1_' + postfix)(inputs)
    h1 = tf.keras.layers.Conv2D(64, 3, activation='relu', name='conv2_' + postfix)(h1)
    block1_out = tf.keras.layers.MaxPooling2D(3, name='pool1_' + postfix)(h1)

    # First residual block.
    h2 = tf.keras.layers.Conv2D(64, 3, activation='relu', padding='same', name='conv3_' + postfix)(block1_out)
    h2 = tf.keras.layers.Conv2D(64, 3, activation='relu', padding='same', name='conv4_' + postfix)(h2)
    block2_out = tf.keras.layers.add([h2, block1_out], name='add1_' + postfix)

    # Second residual block (a 3x3 conv followed by a 1x1 conv).
    h3 = tf.keras.layers.Conv2D(64, 3, activation='relu', padding='same', name='conv5_' + postfix)(block2_out)
    # h3 = tf.keras.layers.Conv2D(16, 3, activation='relu', padding='same', name='conv6_' + postfix)(h3)
    h3 = tf.keras.layers.Conv2D(64, 1, activation='relu', padding='same', name='conv6_' + postfix)(h3)
    block3_out = tf.keras.layers.add([h3, block2_out], name='add2_' + postfix)

    h4 = tf.keras.layers.Conv2D(64, 3, activation='relu', name='conv7_' + postfix)(block3_out)
    return h4


def single_net(classes=10, im_w=256, im_h=256, im_channels=3, postfix='r1'):
    inputs = tf.keras.Input(shape=(im_w, im_h, im_channels), name='input_' + postfix)
    h4 = single_net_block(inputs, classes, postfix)
    h4 = tf.keras.layers.GlobalMaxPool2D(name='pool2_' + postfix)(h4)
    h4 = tf.keras.layers.Dense(256, activation='relu', name='dense1_' + postfix)(h4)
    outputs = tf.keras.layers.Dense(classes, activation='softmax', name='dense2_' + postfix)(h4)
    model = tf.keras.Model(inputs, outputs, name=postfix)
    model.summary()
    tf.keras.utils.plot_model(model, 'architecture_' + postfix + '.png', show_shapes=True)
    return model

[Figure: network architecture diagram]
Training process:
Test-set result: 99.0% accuracy, at about 23 ms per frame
test time (ms/frame): 23.402893940607708
test accuracy: 0.990000
Model size: 1557 KB
Next, the trained model is quantized.
This requires tf.lite.TFLiteConverter.
It supports SavedModel (https://www.alibabacloud.com/help/zh/doc-detail/111030.htm), Keras models, or concrete functions (where you define the construction yourself; I have not tried this).
tf.lite.TFLiteConverter.from_keras_model returns a tf.lite.TFLiteConverter object; the attributes relevant here (post_training_quantize, optimizations, and the experimental flags) are discussed below.
I first read about this at https://www.cnblogs.com/jiangxinyang/p/12056209.html.
See lite.py (https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/lite/python/lite.py#L456-L1034):
post_training_quantize: Deprecated. Please specify `[Optimize.DEFAULT]` for `optimizations` instead. Boolean indicating whether to quantize the weights of the converted float model. Model size will be reduced and there will be latency improvements (at the cost of accuracy). (default False)
In other words, this attribute is deprecated; use the optimizations attribute instead.
tf.lite.Optimize
optimizations has three possible values:
https://github.com/tensorflow/tensorflow/blob/v2.1.0/tensorflow/lite/python/lite.py#L79-L111
class Optimize(enum.Enum):
    """Enum defining the optimizations to apply when generating tflite graphs.

    Some optimizations may come at the cost of accuracy.
    """

    # Default optimization strategy.
    #
    # Converter will do its best to improve size and latency based on the
    # information provided.
    # Enhanced optimizations can be gained by providing a representative_dataset.
    # This is recommended, and is currently equivalent to the modes below.
    # Currently, weights will be quantized and if representative_dataset is
    # provided, activations for quantizable operations will also be quantized.
    DEFAULT = "DEFAULT"

    # Optimize for size.
    #
    # Optimizations that reduce the size of the model.
    # The model size will be reduced.
    # Currently, weights will be quantized and if representative_dataset is
    # provided, activations for quantizable operations will also be quantized.
    OPTIMIZE_FOR_SIZE = "OPTIMIZE_FOR_SIZE"

    # Optimize for latency.
    #
    # Optimizations that reduce the latency of the model.
    # Currently, weights will be quantized and if representative_dataset is
    # provided, activations for quantizable operations will also be quantized.
    OPTIMIZE_FOR_LATENCY = "OPTIMIZE_FOR_LATENCY"

    def __str__(self):
        return self.value

From the source code we can see the following.
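The DEFAULT mode:
The converter does its best to improve size and latency based on the information provided.
Weights are quantized by default; if a representative dataset is provided, the activations of quantizable operations are quantized as well.
The OPTIMIZE_FOR_SIZE mode focuses on reducing the size of the model.
The OPTIMIZE_FOR_LATENCY mode focuses on reducing the model's forward-pass latency, i.e. on improving speed.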
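To make "weights will be quantized" more concrete, here is a minimal pure-Python sketch of symmetric 8-bit quantization of a handful of weights. It only illustrates the general idea (one scale per tensor, values mapped to [-127, 127]); it is not TensorFlow Lite's actual implementation, and the function names and sample weights are made up for this example.

```python
def quantize_symmetric_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Illustrative weights, not taken from the model in this post.
weights = [0.50, -1.27, 0.03, 1.00]
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight is stored in one byte instead of four, which is where the size reduction comes from; the price is a rounding error of at most half the scale per weight.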
https://github.com/tensorflow/tensorflow/blob/v2.1.0/tensorflow/lite/python/lite.py#L266-L471
experimental_new_converter: Experimental flag, subject to change. Enables MLIR-based conversion instead of TOCO conversion.
experimental_new_quantizer: Experimental flag, subject to change. Enables MLIR-based post-training quantization.
experimental_new_converter is enabled by default (True), while experimental_new_quantizer is disabled by default (False).
The new quantization method is based on MLIR.
import tensorflow as tf
from tensorflow.keras.models import load_model

model = load_model('models_resnet.hdf5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Keep exactly one of the next two lines active: comment out the first
# for the new method, or comment out the second for the old method.
# converter.experimental_new_converter = False  # old quantization method
converter.experimental_new_quantizer = True  # new quantization method, based on MLIR

# Note: the documented values are tf.lite.Optimize members such as
# tf.lite.Optimize.DEFAULT; the strings below are what this experiment used.
converter.optimizations = ["DEFAULT"]
tflite_model = converter.convert()
with open("quantize_models_resnet_new_DEFAULT.tflite", "wb") as f:
    f.write(tflite_model)

converter.optimizations = ["OPTIMIZE_FOR_SIZE"]
tflite_model = converter.convert()
with open("quantize_models_resnet_new_OPTIMIZE_FOR_SIZE.tflite", "wb") as f:
    f.write(tflite_model)

converter.optimizations = ["OPTIMIZE_FOR_LATENCY"]
tflite_model = converter.convert()
with open("quantize_models_resnet_new_OPTIMIZE_FOR_LATENCY.tflite", "wb") as f:
    f.write(tflite_model)
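The conversion code above passes plain strings to `converter.optimizations`. The TensorFlow Lite documentation uses `tf.lite.Optimize` enum members instead, and the source quoted earlier says activations are only quantized when a `representative_dataset` is provided. A minimal sketch of that variant follows. The function name and the `representative_samples` argument are hypothetical, the samples are assumed to be NumPy arrays of shape (height, width, channels) that are already preprocessed like the model inputs, and this is not the code that produced the results in this post.

```python
def convert_with_default_optimization(keras_model, representative_samples=None):
    """Convert a Keras model to TFLite bytes with Optimize.DEFAULT.

    representative_samples: optional iterable of preprocessed NumPy images
    (assumption: shape (H, W, C)); used to calibrate activation ranges.
    """
    import tensorflow as tf  # imported lazily; TensorFlow must be installed

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    if representative_samples is not None:
        def representative_dataset():
            # Yield one batch of size 1 per sample, as a list of model inputs.
            for sample in representative_samples:
                yield [sample[None, ...].astype("float32")]

        converter.representative_dataset = representative_dataset

    return converter.convert()
```

A call such as `convert_with_default_optimization(model, test_images[:100])` would then return the serialized model, which can be written to a `.tflite` file as before.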
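After conversion, all of these models are almost the same size; the old method's output is 1 KB smaller than the new method's. This is consistent with the source comment quoted above, which states that DEFAULT is currently equivalent to the other two modes.
Relative to the original Keras model, the compression ratio is defined as follows:
The compression ratio describes how effective a compression is: it is the ratio of the file size after compression to the file size before compression. For example, compressing a 100 MB file to 90 MB gives a compression ratio of 90/100*100% = 90%. A smaller ratio is generally better, but the harder a file is compressed, the longer decompression takes.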
https://baike.baidu.com/item/%E5%8E%8B%E7%BC%A9%E7%8E%87/6435712?fr=aladdin
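The definition above is simple enough to write down as a small helper. This is only an illustrative sketch; the function name is made up, and the second call just shows what size a 47.9% ratio would imply for a 1557 KB original.

```python
def compression_ratio(original_size, compressed_size):
    """Return the compressed size as a percentage of the original size."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * compressed_size / original_size


# The worked example from the definition: 100 MB compressed to 90 MB.
example_ratio = compression_ratio(100, 90)
print(example_ratio)

# Size implied by a 47.9% ratio for a 1557 KB original model (in KB).
implied_size_kb = 1557 * 47.9 / 100
print(round(implied_size_kb, 1))
```

For real files, the two sizes can be obtained with `os.path.getsize` and passed to the same function.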
With TensorFlow's quantization method the compression ratio is 47.9%, so the quantized model is about half the size of the original model.
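File size alone does not prove that the weights were actually stored as 8-bit integers, since a Keras .hdf5 file can also contain optimizer state that a .tflite file does not. One way to check is to count the tensor data types in the converted model. The sketch below assumes TensorFlow is installed; both function names are hypothetical, and it relies on `Interpreter.get_tensor_details()` returning one dictionary per tensor with a `dtype` entry.

```python
from collections import Counter


def count_tensor_dtypes(tensor_details):
    """Count tensors per dtype name, given get_tensor_details()-style dicts."""
    return Counter(detail["dtype"].__name__ for detail in tensor_details)


def tflite_dtype_summary(model_path):
    """Load a .tflite file and summarize the dtypes of its tensors."""
    import tensorflow as tf  # imported lazily; TensorFlow must be installed

    interpreter = tf.lite.Interpreter(model_path=model_path)
    return count_tensor_dtypes(interpreter.get_tensor_details())
```

Calling `tflite_dtype_summary("quantize_models_resnet_new_DEFAULT.tflite")` on one of the files produced above should show whether the model contains int8 tensors or only float32 ones.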
# Evaluate a converted TFLite model: per-frame latency and test accuracy.
import time
import tensorflow as tf
import gc

gc.collect()
tf.keras.backend.clear_session()
tf.compat.v1.reset_default_graph()

import numpy as np

from get_RSM_data import get_data_list, test_preprocess_image, preprocess_label

class_name = ['Cr', 'In', 'Pa', 'PS', 'RS', 'Sc']
file_dir = '../database/'
log_dir = './logs/'
classes = len(class_name)
train_ratio = 2 / 3
val_ratio = 2 / 10
nums_of_each_classes = 300
nums_for_training = int(len(class_name) * nums_of_each_classes * train_ratio)
nums_for_validation = int(float(nums_for_training) * val_ratio)
im_w = 256
im_h = 256
im_channels = 3
divide = 8
batch_size = 32
epochs = 500
_, _, test_image_list, test_label_list, _, _ = get_data_list(
    file_dir, class_name, nums_of_each_classes, nums_for_training, nums_for_validation)

# Preprocess every test image and its label.
test_image = []
test_label = []
for image, label in zip(test_image_list, test_label_list):
    r_image = test_preprocess_image(image, im_w=im_w, im_h=im_h, im_channels=im_channels)
    r_label = preprocess_label(label, classes=classes, one_hot=False)
    test_image.append(r_image)
    test_label.append(r_label)

test_images = np.array(test_image)
test_labels = np.array(test_label)

print(test_images.shape)
print(test_labels.shape)

# Load the TFLite model and look up its input and output tensors.
interpreter = tf.lite.Interpreter(model_path="quantize_models_resnet_OPTIMIZE_FOR_SIZE.tflite")
interpreter.allocate_tensors()
input_tensor_index = interpreter.get_input_details()[0]["index"]
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])

start_time = time.time()
prediction_digits = []
for test_image in test_images:
    # Pre-processing: add batch dimension and convert to float32 to match with
    # the model's input data format.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    interpreter.set_tensor(input_tensor_index, test_image)

    # Run inference.
    interpreter.invoke()

    # Post-processing: remove batch dimension and find the class with the
    # highest probability.
    digit = np.argmax(output()[0])
    prediction_digits.append(digit)

end_time = time.time()
print('test time (ms/frame): ', (end_time - start_time) / len(prediction_digits) * 1000)

# Compare predictions with the ground-truth labels.
accurate_count = 0
for index in range(len(prediction_digits)):
    if prediction_digits[index] == test_labels[index]:
        accurate_count += 1
accuracy = float(accurate_count) * 1.0 / float(len(prediction_digits))

print('test accuracy: %.6f' % accuracy)

gc.collect()
tf.keras.backend.clear_session()
tf.compat.v1.reset_default_graph()

# Evaluate the original Keras model with the same test data, for comparison.
import time
import tensorflow as tf
import gc

gc.collect()
tf.keras.backend.clear_session()
tf.compat.v1.reset_default_graph()

import numpy as np

from get_RSM_data import get_data_list, test_preprocess_image, preprocess_label

class_name = ['Cr', 'In', 'Pa', 'PS', 'RS', 'Sc']
file_dir = '../database/'
log_dir = './logs/'
classes = len(class_name)
train_ratio = 2 / 3
val_ratio = 2 / 10
nums_of_each_classes = 300
nums_for_training = int(len(class_name) * nums_of_each_classes * train_ratio)
nums_for_validation = int(float(nums_for_training) * val_ratio)
im_w = 256
im_h = 256
im_channels = 3
divide = 8
batch_size = 32
epochs = 500
_, _, test_image_list, test_label_list, _, _ = get_data_list(
    file_dir, class_name, nums_of_each_classes, nums_for_training, nums_for_validation)

# Preprocess every test image and its label.
test_image = []
test_label = []
for image, label in zip(test_image_list, test_label_list):
    r_image = test_preprocess_image(image, im_w=im_w, im_h=im_h, im_channels=im_channels)
    r_label = preprocess_label(label, classes=classes, one_hot=False)
    test_image.append(r_image)
    test_label.append(r_label)

test_images = np.array(test_image)
test_labels = np.array(test_label)

print(test_images.shape)
print(test_labels.shape)

# interpreter = tf.lite.Interpreter(model_path="quantize_models_resnet_OPTIMIZE_FOR_SIZE.tflite")
# interpreter.allocate_tensors()
# input_tensor_index = interpreter.get_input_details()[0]["index"]
# output = interpreter.tensor(interpreter.get_output_details()[0]["index"])
from tensorflow.keras.models import load_model
model = load_model('models_resnet.hdf5')

# cost = model.evaluate(test_images, test_labels)
# print('test loss: ', cost)

start_time = time.time()
prediction_digits = []
for test_image in test_images:
    # Add the batch dimension and predict one frame at a time.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    res = model.predict(test_image, batch_size=1)
    digit = np.argmax(res[0])
    prediction_digits.append(digit)

end_time = time.time()
print('test time (ms/frame): ', (end_time - start_time) / len(prediction_digits) * 1000)

# Compare predictions with the ground-truth labels.
accurate_count = 0
for index in range(len(prediction_digits)):
    # print(prediction_digits[index], test_labels[index])
    if prediction_digits[index] == test_labels[index]:
        accurate_count += 1
accuracy = float(accurate_count) * 1.0 / float(len(prediction_digits))

print('test accuracy: %.6f' % accuracy)

gc.collect()
tf.keras.backend.clear_session()
tf.compat.v1.reset_default_graph()

quantize_models_resnet_new_DEFAULT.tflite
test time (ms/frame): 43.77090573310852
test accuracy: 0.990000
quantize_models_resnet_new_OPTIMIZE_FOR_LATENCY
test time (ms/frame): 45.00792781511942
test accuracy: 0.990000
quantize_models_resnet_new_OPTIMIZE_FOR_SIZE
test time (ms/frame): 44.728724559148155
test accuracy: 0.990000
quantize_models_resnet_DEFAULT
test time (ms/frame): 44.802682399749756
test accuracy: 0.990000
quantize_models_resnet_OPTIMIZE_FOR_LATENCY
test time (ms/frame): 44.54366246859232
test accuracy: 0.990000
quantize_models_resnet_OPTIMIZE_FOR_SIZE
test time (ms/frame): 44.73690390586853
test accuracy: 0.990000
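The test results show that with post-training quantization, neither the old nor the new method changes the test accuracy. It is practically equal to that of the original model, so the compression can be called lossless.
However, the compressed model predicts almost twice as slowly (about 44 ms for the compressed model versus 23 ms for the original model).
The DEFAULT mode of the new quantization method may be slightly faster, but it is still 43 ms. I had expected a speedup, and all I got was a smaller model (disappointing!).
Of course, this may also have to do with my network. Quantization is usually applied to the weights, the biases, and the outputs of the activation functions.
In the network I designed, every activation is ReLU, and there is no batch normalization (BN) to normalize the distribution of the intermediate feature maps,
so it is hard to quantize, and a lot of time may be wasted converting between quantized and dequantized values. I will verify this later!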
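The timing loops above start the clock before the very first inference, so one-time costs of the first calls may be included in the average. A small helper like the following could make the two measurements easier to compare by running a few untimed warm-up calls first. It is a sketch in plain Python; the name `ms_per_call` and the `warmup` parameter are made up for this example, and `fn` stands for whatever runs one inference (for instance a wrapper around `interpreter.invoke()` or `model.predict`).

```python
import time


def ms_per_call(fn, inputs, warmup=5):
    """Average wall-clock milliseconds per call of fn(item) over inputs.

    The first `warmup` items are run once beforehand without being timed.
    """
    inputs = list(inputs)
    if not inputs:
        raise ValueError("inputs must not be empty")

    for item in inputs[:warmup]:
        fn(item)  # warm-up calls, not timed

    start = time.perf_counter()
    for item in inputs:
        fn(item)
    elapsed = time.perf_counter() - start
    return elapsed / len(inputs) * 1000.0


# Tiny self-check with a trivial function instead of a real model.
calls = []
average_ms = ms_per_call(calls.append, range(10), warmup=3)
print(len(calls), average_ms >= 0.0)
```

Using the same helper for the Keras model and for every TFLite variant keeps the measurement procedure identical across runs.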
---------- Divider ----------
Afterwards I went on to quantize the Keras model with tensorflow_model_optimization, this time using quantization aware training:
import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model

# q_aware stands for quantization aware.
q_aware_model = quantize_model(model)

optimizer = tf.keras.optimizers.RMSprop()
q_aware_model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
q_aware_model.summary()

# Fine-tune the quantization-aware model.
q_aware_model.fit(train_ds, batch_size=batch_size, epochs=epochs, validation_data=val_ds)

# Convert the fine-tuned model to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = ["DEFAULT"]
tflite_model = converter.convert()
with open("quantize_models_resnet_tflite.tflite", "wb") as f:
    f.write(tflite_model)

cost = q_aware_model.evaluate(test_ds)
print('test loss: ', cost)
However, neither the resulting Keras model nor the converted TFLite model could be loaded for testing with the loading code shown earlier, and I have not found a solution yet.
Loading the Keras model fails:
from tensorflow.keras.models import load_model
model = load_model('quantize_models_resnet.hdf5')
Error: ValueError: Unknown layer: QuantizeLayer
Traceback (most recent call last):
  File "H:/Leon/CODE/python_projects/master_ImRecognition/dataset/NEU/ThesisTest_Leon/test_keras.py", line 52, in <module>
    model = load_model('quantize_models_resnet.hdf5')
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\saving\save.py", line 207, in load_model
    compile)
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\saving\hdf5_format.py", line 184, in load_model_from_hdf5
    custom_objects=custom_objects)
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\saving\model_config.py", line 64, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\layers\serialization.py", line 177, in deserialize
    printable_module_name='layer')
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 358, in deserialize_keras_object
    list(custom_objects.items())))
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 669, in from_config
    config, custom_objects)
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 1275, in reconstruct_from_config
    process_layer(layer_data)
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 1257, in process_layer
    layer = deserialize_layer(layer_data, custom_objects=custom_objects)
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\layers\serialization.py", line 177, in deserialize
    printable_module_name='layer')
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 347, in deserialize_keras_object
    config, module_objects, custom_objects, printable_module_name)
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\python\keras\utils\generic_utils.py", line 296, in class_and_config_for_serialized_keras_object
    raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
ValueError: Unknown layer: QuantizeLayer
Process finished with exit code 1
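The "Unknown layer: QuantizeLayer" message means Keras does not know the custom layer classes that tensorflow_model_optimization added to the model. The tfmot documentation provides `tfmot.quantization.keras.quantize_scope()` for deserializing such models, so one thing worth trying is loading inside that scope. The sketch below is untested against this particular model; the function name is hypothetical and both TensorFlow and tensorflow_model_optimization are assumed to be installed.

```python
def load_quantization_aware_model(path):
    """Load a Keras model saved after tfmot quantize_model().

    quantize_scope() registers the quantization wrapper layers so that
    load_model can deserialize them.
    """
    import tensorflow as tf  # imported lazily; both packages must be installed
    import tensorflow_model_optimization as tfmot

    with tfmot.quantization.keras.quantize_scope():
        return tf.keras.models.load_model(path)
```

With this helper, the failing call would become `model = load_quantization_aware_model('quantize_models_resnet.hdf5')`.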
Loading the TFLite model fails:
interpreter = tf.lite.Interpreter(model_path="quantize_models_resnet_tflite.tflite")
interpreter.allocate_tensors()
input_tensor_index = interpreter.get_input_details()[0]["index"]
output = interpreter.tensor(interpreter.get_output_details()[0]["index"])
Error: RuntimeError: tensorflow/lite/kernels/quantize.cc:113 affine_quantization->scale->size == 1 was not true.Node number 0 (QUANTIZE) failed to prepare.
Traceback (most recent call last):
  File "H:/Leon/CODE/python_projects/master_ImRecognition/dataset/NEU/ThesisTest_Leon/test_tflite.py", line 48, in <module>
    interpreter.allocate_tensors()
  File "D:\Users\Leon_PC\Anaconda3\envs\tf2_4_1\lib\site-packages\tensorflow\lite\python\interpreter.py", line 259, in allocate_tensors
    return self._interpreter.AllocateTensors()
RuntimeError: tensorflow/lite/kernels/quantize.cc:113 affine_quantization->scale->size == 1 was not true.Node number 0 (QUANTIZE) failed to prepare.
I will keep following up on this bug; it should not be unsolvable!
For the sake of testing, though, I put the training code and the test code into one script and then measured speed and accuracy:
# -*- coding: utf-8 -*-
"""
Created on Fri Jan 29 20:36:16 2021
@author: Leon_PC
"""
import tensorflow as tf
import random
import pathlib
from tensorflow import keras
import os
import numpy as np

class_name = ['Cr', 'In', 'Pa', 'PS', 'RS', 'Sc']
file_dir = '../database/'
postfix = 'resnet'
model_save_dir = 'quantize_models_' + postfix + '.hdf5'
log_dir = './logs/'
classes = len(class_name)
train_ratio = 2 / 3
val_ratio = 2 / 10
nums_of_each_classes = 300
nums_for_training = int(len(class_name) * nums_of_each_classes * train_ratio)
nums_for_validation = int(float(nums_for_training) * val_ratio)
im_w = 256
im_h = 256
im_channels = 3
batch_size = 32
epochs = 50

# Build the training, test, and validation datasets.
from get_Normal_data import get_tfdataset
train_ds, test_ds, val_ds = get_tfdataset(
    file_dir=file_dir,
    class_name=class_name,
    nums_of_each_classes=nums_of_each_classes,
    nums_for_training=nums_for_training,
    nums_for_validation=nums_for_validation,
    im_w=im_w, im_h=im_h, im_channels=im_channels, batch_size=batch_size, is_aug=True)

from RSM_CascadeNet import single_net, cascade_net

model = single_net(classes=classes, im_w=im_w, im_h=im_h, im_channels=im_channels, postfix=postfix)

import tensorflow_model_optimization as tfmot

quantize_model = tfmot.quantization.keras.quantize_model

# q_aware stands for quantization aware.
q_aware_model = quantize_model(model)

optimizer = tf.keras.optimizers.RMSprop()
q_aware_model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
q_aware_model.summary()

import gc
# callbacks
tensorboard = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
import callbacks_history
history = callbacks_history.LossHistory()
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    model_save_dir, verbose=1, monitor='val_accuracy', save_best_only=True, mode='max')

q_aware_model.fit(train_ds, batch_size=batch_size, epochs=epochs, validation_data=val_ds,
                  callbacks=[history, tensorboard, cp_callback])

# Convert the quantization-aware model to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = ["DEFAULT"]
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()
with open("quantize_models_resnet_tflite.tflite", "wb") as f:
    f.write(tflite_model)

# from tensorflow.keras.models import load_model
# test_model = load_model(model_save_dir)
cost = q_aware_model.evaluate(test_ds)
print('test loss: ', cost)

# Rebuild the test set as NumPy arrays for per-frame timing.
from get_RSM_data import get_data_list, test_preprocess_image, preprocess_label
_, _, test_image_list, test_label_list, _, _ = get_data_list(
    file_dir, class_name, nums_of_each_classes, nums_for_training, nums_for_validation)

test_image = []
test_label = []
for image, label in zip(test_image_list, test_label_list):
    r_image = test_preprocess_image(image, im_w=im_w, im_h=im_h, im_channels=im_channels)
    r_label = preprocess_label(label, classes=classes, one_hot=False)
    test_image.append(r_image)
    test_label.append(r_label)

test_images = np.array(test_image)
test_labels = np.array(test_label)

print(test_images.shape)
print(test_labels.shape)

import time
start_time = time.time()
prediction_digits = []
for test_image in test_images:
    # Add the batch dimension and predict one frame at a time
    # with the in-memory quantization-aware Keras model.
    test_image = np.expand_dims(test_image, axis=0).astype(np.float32)
    res = q_aware_model.predict(test_image, batch_size=1)
    digit = np.argmax(res[0])
    prediction_digits.append(digit)

end_time = time.time()
print('test time (ms/frame): ', (end_time - start_time) / len(prediction_digits) * 1000)

# Compare predictions with the ground-truth labels.
accurate_count = 0
for index in range(len(prediction_digits)):
    # print(prediction_digits[index], test_labels[index])
    if prediction_digits[index] == test_labels[index]:
        accurate_count += 1
accuracy = float(accurate_count) * 1.0 / float(len(prediction_digits))

print('test accuracy: %.6f' % accuracy)

history.loss_plot('epoch')
gc.collect()
tf.keras.backend.clear_session()
tf.compat.v1.reset_default_graph()
Test results of the Keras model quantized with quantization aware training:
test time (ms/frame): 25.569610993067425
test accuracy: 0.971667
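Speed did not improve, accuracy did not improve, and the model even got larger... Note that these numbers were measured on the quantization-aware Keras model itself, which only simulates quantization in floating point during training; integer arithmetic is used only after conversion to TFLite, so a speedup was not to be expected here.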