
TensorFlow Model Quantization 4 -- pb to tflite Summary (uint8 quantization)

1. Experimental environment: tensorflow-gpu 1.15 + CUDA 10.0

I have written about fp16 and int8 quantization of models before; see:

龟龟: Tensorflow模型量化实践2--量化自己训练的模型 (zhuanlan.zhihu.com)

This time I found that uint8 quantization involves extra parameter settings, so I decided to walk through the whole process again from the start.

2. Model to be quantized:

An ssdlite_mobilenet_v2 model trained with the TensorFlow Object Detection API and exported as frozen_inference_graph.pb.

3. Getting the input and output nodes

Parse frozen_inference_graph.pb to list its nodes and find the input and output node names.

The code is as follows:

  1. """
  2. code by zzg
  3. """
  4. import tensorflow as tf
  5. import os
  6. os.environ["CUDA_VISIBLE_DEVICES"] = "0"
  7. config = tf.ConfigProto()
  8. config.gpu_options.allow_growth = True
  9. with tf.Session() as sess:
  10. with open('frozen_inference_graph_resnet.pb','rb') as f:
  11. graph_def = tf.GraphDef()
  12. graph_def.ParseFromString(f.read())
  13. tf.import_graph_def(graph_def, name='')
  14. tensor_name_list = [tensor.name for tensor in tf.get_default_graph().as_graph_def().node]
  15. for tensor_name in tensor_name_list:
  16. print(tensor_name,'\n')

Then locate the input node, which sits right after the preprocessing stage, and the output nodes, which sit right before post-processing (the screenshots from the original post are omitted here). For this model they are FeatureExtractor/MobilenetV2/MobilenetV2/input and concat / concat_1. Since the full node list of a detection model is long, a quick filter helps, as in the sketch below.

 
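An SSD detection graph contains thousands of nodes, so printing them all is noisy. Here is a minimal filtering sketch; the keywords 'input' and 'concat' are assumptions based on this particular model's naming:

    import tensorflow as tf

    graph_def = tf.GraphDef()
    with open('frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())

    # Print only the nodes that look like candidate inputs/outputs.
    for node in graph_def.node:
        if 'input' in node.name.lower() or node.name.startswith('concat'):
            print(node.op, node.name)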

4. Quantization (pb -> tflite)

4.1 Method 1: using TFLiteConverter

    '''
    code by zzg 2020-04-27
    '''
    import tensorflow as tf
    import os

    os.environ["CUDA_VISIBLE_DEVICES"] = "0"
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True

    graph_def_file = "frozen_inference_graph.pb"
    input_names = ["FeatureExtractor/MobilenetV2/MobilenetV2/input"]
    output_names = ["concat", "concat_1"]
    input_shapes = {input_names[0]: [1, 300, 300, 3]}

    # uint8 quantization
    converter = tf.lite.TFLiteConverter.from_frozen_graph(
        graph_def_file, input_names, output_names, input_shapes)
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                           tf.lite.OpsSet.SELECT_TF_OPS]
    converter.allow_custom_ops = True
    converter.inference_type = tf.uint8  # tf.lite.constants.QUANTIZED_UINT8
    input_arrays = converter.get_input_arrays()
    converter.quantized_input_stats = {input_arrays[0]: (127.5, 127.5)}  # (mean, std_dev)
    converter.default_ranges_stats = (0, 255)
    tflite_uint8_model = converter.convert()
    open("uint8.tflite", "wb").write(tflite_uint8_model)

4.2 Method 2: using TOCO

    toco --graph_def_file=./frozen_inference_graph.pb \
      --output_file=test.tflite \
      --input_format=TENSORFLOW_GRAPHDEF \
      --output_format=TFLITE \
      --inference_type=QUANTIZED_UINT8 \
      --input_shape='1,300,300,3' \
      --input_array='FeatureExtractor/MobilenetV2/MobilenetV2/input' \
      --output_array='concat,concat_1' \
      --std_dev_value=127.5 \
      --mean_value=127.5 \
      --default_ranges_min=0 \
      --default_ranges_max=255

Key supplementary point: parameter settings for uint8 quantization

01. Since we are quantizing to uint8, the fallback range is [0, 255], i.e. default_ranges_min = 0 and default_ranges_max = 255. default_ranges_stats supplies a (min, max) range for every tensor that has no recorded quantization range, which is the usual case for a graph trained without fake-quantization nodes; a short worked example follows.
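TFLite's affine quantization maps real = scale * (q - zero_point), with scale = (r_max - r_min) / (q_max - q_min) and zero_point = q_min - r_min / scale. A quick sketch of what the (0, 255) default range implies for a uint8 tensor (a sanity check, not library code):

    # Standard TFLite affine-quantization arithmetic
    r_min, r_max = 0.0, 255.0   # default_ranges_stats
    q_min, q_max = 0, 255       # uint8 value range
    scale = (r_max - r_min) / (q_max - q_min)   # 1.0
    zero_point = q_min - r_min / scale          # 0.0
    print(scale, zero_point)

So with this fallback, each uint8 step corresponds to exactly 1.0 in real-value units.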

02. The std_dev_value and mean_value parameters

Reference: https://www.cnblogs.com/sdu20112013/p/11960552.html

Conclusion: depending on the value range of the model's input tensor at training time, mean_values and std_dev_values should be set as follows:

  • range (0,255) then mean = 0, std_dev = 1
  • range (-1,1) then mean = 127.5, std_dev = 127.5
  • range (0,1) then mean = 0, std_dev = 255

I checked that my input tensor range is [-1, 1], so I set mean = 127.5 and std_dev = 127.5; the short check below confirms each pair recovers the intended range.
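A minimal sketch verifying that each (mean, std_dev) pair maps the uint8 extremes back onto the intended training range via real = (q - mean) / std_dev:

    import numpy as np

    # uint8 extremes mapped back to real values via (q - mean) / std_dev
    q = np.array([0.0, 255.0])
    for mean, std, expected in [(0.0, 1.0, (0, 255)),
                                (127.5, 127.5, (-1, 1)),
                                (0.0, 255.0, (0, 1))]:
        print(mean, std, '->', (q - mean) / std, 'expected', expected)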

 

5. Testing the tflite model

After the conversion finishes, parse and run the tflite model to verify that the conversion succeeded.

The code is as follows:

    '''
    code by zzg 2020-04-30
    '''
    import tensorflow as tf
    import numpy as np

    InputSize = 300

    def test_tflite(input_test_tflite_file):
        interpreter = tf.lite.Interpreter(model_path=input_test_tflite_file)
        interpreter.allocate_tensors()

        # Uncomment to inspect every tensor in the model:
        # tensor_details = interpreter.get_tensor_details()
        # for i in range(len(tensor_details)):
        #     print("tensor:", i, tensor_details[i])

        input_details = interpreter.get_input_details()
        print("=======================================")
        print("input :", str(input_details))
        output_details = interpreter.get_output_details()
        print("output :", str(output_details))
        print("=======================================")

        # Random test image in [0, 255]; uniform(0, 1) would collapse to all
        # zeros after the uint8 cast.
        new_img = np.random.uniform(0, 255, (1, InputSize, InputSize, 3))
        new_img = new_img.astype('uint8')  # dtype must match the model's uint8 input
        interpreter.set_tensor(input_details[0]['index'], new_img)

        # Run inference.
        interpreter.invoke()
        output_data = interpreter.get_tensor(output_details[0]['index'])
        print("test_tflite finish!")

    input_tflite_file = "uint8.tflite"
    test_tflite(input_tflite_file)
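Since the model's outputs come back as raw uint8 values, they can be mapped to real values with the quantization parameters the converter stored in the tensor details. A minimal standalone sketch:

    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="uint8.tflite")
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    img = np.random.randint(0, 256, size=(1, 300, 300, 3), dtype=np.uint8)
    interpreter.set_tensor(input_details[0]['index'], img)
    interpreter.invoke()

    raw = interpreter.get_tensor(output_details[0]['index'])
    scale, zero_point = output_details[0]['quantization']  # stored at conversion time
    real = scale * (raw.astype(np.float32) - zero_point)
    print("dequantized output shape:", real.shape)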

The run prints the input and output tensor details and finishes without errors (screenshot omitted), which confirms the conversion succeeded.

Note: for finding input and output nodes, the neural-network visualization tool Netron is more convenient and intuitive.

References:

模型结构可视化神器--Netron(支持tf, caffe, keras, mxnet等多种框架) (blog.csdn.net)

轻量好用的神经网络模型可视化工具netron - Mingyong_Zhuang的技术博客 (blog.csdn.net)

 

Installation is simple: on Windows run the .exe installer; on Linux just pip install netron.
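Netron can also be launched straight from Python; a minimal sketch, assuming the pip-installed netron package:

    import netron

    # Serves the model on a local port and opens it in the browser.
    netron.start('uint8.tflite')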
