TensorFlow Model Quantization 4 -- Summary of pb to tflite Conversion (uint8 Quantization)
I have written about fp16 and int8 quantization of models before; see:
龟龟: TensorFlow Model Quantization in Practice 2 -- Quantizing Your Own Trained Model (zhuanlan.zhihu.com)
This time I found that uint8 quantization involves some extra parameter settings, so I decided to walk through the whole process again from the start.
2. The model to be quantized:
An ssdlite_mobilenet_v2 model trained with the TensorFlow Object Detection API and exported as frozen_inference_graph.pb.
3. Getting the input and output nodes
Parse frozen_inference_graph.pb to list all node names, from which the input and output nodes can be identified.
The code is as follows:
- """
- code by zzg
- """
- import tensorflow as tf
- import os
- os.environ["CUDA_VISIBLE_DEVICES"] = "0"
- config = tf.ConfigProto()
- config.gpu_options.allow_growth = True
-
- with tf.Session() as sess:
- with open('frozen_inference_graph_resnet.pb','rb') as f:
- graph_def = tf.GraphDef()
- graph_def.ParseFromString(f.read())
-
- tf.import_graph_def(graph_def, name='')
- tensor_name_list = [tensor.name for tensor in tf.get_default_graph().as_graph_def().node]
- for tensor_name in tensor_name_list:
- print(tensor_name,'\n')
From the printed node list, find the input node, which sits right after the preprocessing stage, and the output nodes, which sit right before the post-processing stage. For this model they are FeatureExtractor/MobilenetV2/MobilenetV2/input and the two nodes concat and concat_1.
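The printed list is long; to save some eyeballing, here is a small sketch of my own (assuming the usual ssdlite_mobilenet_v2 naming, with an "input" node and "concat"/"concat_1" outputs) that filters the node names:

# a minimal sketch: filter the node list for likely input/output names
# instead of reading thousands of lines
import tensorflow as tf

graph_def = tf.GraphDef()
with open('frozen_inference_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

for node in graph_def.node:
    # these name patterns are an assumption for this particular graph
    if 'input' in node.name.lower() or node.name.startswith('concat'):
        print(node.op, node.name)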
4. Quantization (pb -> tflite)
4.1 Method 1: Using TFLiteConverter
'''
code by zzg 2020-04-27
'''
import tensorflow as tf
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

graph_def_file = "frozen_inference_graph.pb"

input_names = ["FeatureExtractor/MobilenetV2/MobilenetV2/input"]
output_names = ["concat", "concat_1"]
input_tensor = {input_names[0]: [1, 300, 300, 3]}

# uint8 quant
converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, input_names, output_names, input_tensor)
converter.target_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
converter.allow_custom_ops = True

converter.inference_type = tf.uint8  # tf.lite.constants.QUANTIZED_UINT8
input_arrays = converter.get_input_arrays()
converter.quantized_input_stats = {input_arrays[0]: (127.5, 127.5)}  # (mean, std_dev)
# fallback (min, max) range for tensors without recorded quantization stats
converter.default_ranges_stats = (0, 255)

tflite_uint8_model = converter.convert()
open("uint8.tflite", "wb").write(tflite_uint8_model)
4.2 Method 2: Using TOCO
toco \
  --graph_def_file=./frozen_inference_graph.pb \
  --output_file=test.tflite \
  --output_format=TFLITE \
  --inference_type=QUANTIZED_UINT8 \
  --input_shapes=1,300,300,3 \
  --input_arrays=FeatureExtractor/MobilenetV2/MobilenetV2/input \
  --output_arrays=concat,concat_1 \
  --std_dev_values=127.5 \
  --mean_values=127.5 \
  --default_ranges_min=0 \
  --default_ranges_max=255
Note: with the pip-installed TF 1.x converter, toco is the same tool as tflite_convert, so the plural flag spellings (--input_arrays, --output_arrays, --input_shapes, --mean_values, --std_dev_values) are the ones it accepts.
Key point: parameter settings for uint8 quantization
01. Since we are quantizing to uint8, the output range is [0, 255],
i.e. default_ranges_min = 0, default_ranges_max = 255. These two values serve as the fallback (min, max) range for activation tensors that carry no recorded quantization statistics.
02. The std_dev_value and mean_value parameters
Reference: https://www.cnblogs.com/sdu20112013/p/11960552.html
Conclusion:
The converter recovers real values as real_value = (quantized_value - mean_value) / std_dev_value. When the model's input tensor falls in different ranges during training, the corresponding mean_values and std_dev_values are:
  input range [0, 1]:   mean_value = 0,     std_dev_value = 255
  input range [-1, 1]:  mean_value = 127.5, std_dev_value = 127.5
  input range [0, 255]: mean_value = 0,     std_dev_value = 1
I checked that my input tensor range is [-1, 1], so I set mean = 127.5 and std_dev = 127.5.
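The table follows directly from the relation above; here is a minimal sketch of my own that derives the two parameters for any float input range:

# a minimal sketch: derive uint8 quantized_input_stats from a float range,
# using real_value = (quantized_value - mean_value) / std_dev_value
def quant_input_stats(float_min, float_max):
    std_dev = 255.0 / (float_max - float_min)  # 255 uint8 steps cover the range
    mean = -float_min * std_dev                # the quantized value that maps to 0.0
    return mean, std_dev

print(quant_input_stats(-1.0, 1.0))   # (127.5, 127.5)
print(quant_input_stats(0.0, 1.0))    # (0.0, 255.0)
print(quant_input_stats(0.0, 255.0))  # (0.0, 1.0)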
5. Testing the tflite model
After the conversion, load and run the tflite model to verify that the conversion actually succeeded.
The code is as follows:
'''
code by zzg 2020-04-30
'''
import tensorflow as tf
import numpy as np

InputSize = 300

def test_tflite(input_test_tflite_file):
    interpreter = tf.lite.Interpreter(model_path=input_test_tflite_file)
    # Uncomment to inspect every tensor in the model
    # for detail in interpreter.get_tensor_details():
    #     print("tensor:", detail)
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    print("=======================================")
    print("input :", str(input_details))
    output_details = interpreter.get_output_details()
    print("output:", str(output_details))
    print("=======================================")

    # Random test image; the dtype must match the model's uint8 input
    new_img = np.random.uniform(0, 255, (1, InputSize, InputSize, 3))
    new_img = new_img.astype('uint8')

    interpreter.set_tensor(input_details[0]['index'], new_img)
    # Run inference
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    print("test_tflite finish!")

input_tflite_file = "uint8.tflite"
test_tflite(input_tflite_file)
If the script prints the input and output details followed by "test_tflite finish!", the conversion succeeded.
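Note that the raw outputs of a uint8 model still need dequantizing before use; each output tensor carries a (scale, zero_point) pair for this. A minimal sketch, assuming the uint8.tflite produced above:

# a minimal sketch: dequantize uint8 outputs via real = scale * (q - zero_point)
import tensorflow as tf
import numpy as np

interpreter = tf.lite.Interpreter(model_path="uint8.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

dummy = np.random.randint(0, 256, size=tuple(input_details[0]['shape']), dtype=np.uint8)
interpreter.set_tensor(input_details[0]['index'], dummy)
interpreter.invoke()

for detail in output_details:
    q = interpreter.get_tensor(detail['index'])
    scale, zero_point = detail['quantization']  # stored in the tflite model
    real = scale * (q.astype(np.float32) - zero_point)
    print(detail['name'], real.min(), real.max())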
Addendum: for finding the input and output nodes, the neural-network visualization tool Netron is more convenient and intuitive.
References:
A great tool for visualizing model structures -- Netron (supports tf, caffe, keras, mxnet and more frameworks) (blog.csdn.net)
A lightweight, handy neural-network model visualization tool: netron (Mingyong_Zhuang's tech blog, blog.csdn.net)
Installation is simple: on Windows, install the .exe directly; on Linux, just pip install netron.
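After pip install netron, the viewer can also be launched from Python; a minimal sketch, assuming the uint8.tflite from above:

# a minimal sketch: open the converted model in the Netron viewer
import netron
netron.start('uint8.tflite')  # opens the model graph in a browser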