vitis-ai-gpu: FPGA implementation

Author: 2023面试高手 | 2024-03-16 15:48:05


Background: our board currently runs Vitis AI 1.3 (the 2020.12 release).

a. Code tutorials: https://github.com/Xilinx/Vitis-In-Depth-Tutorial/tree/master/Machine_Learning Example 2 there is the main reference for our code.

b. Setting up the system: https://github.com/Xilinx/Vitis-AI/blob/master/demo/VART/README.md

c. Installing Docker (for the nvidia-container-runtime problem encountered along the way, see a separate blog post): https://github.com/Xilinx/Vitis-AI https://github.com/Xilinx/Vitis-AI/blob/master/docs/install_docker/README.md

d. Compilation: https://forums.xilinx.com/t5/Vitis-HLS-Alveo-ML%E4%BB%A5%E5%8F%8A%E5%85%B6%E4%BB%96%E8%BD%AF%E4%BB%B6%E5%BA%94%E7%94%A8/vitis-ai-1-3%E7%BC%96%E8%AF%91%E6%8A%A5%E9%94%99/td-p/1187929 For now, a different pb conversion tool is needed.

e. User guide, with extensive API documentation. Very important! https://japan.xilinx.com/html_docs/vitis_ai/1_3/compiling_model.html#hnl1605502495146__section_ehb_d2y_ykb

Most important of all: what we freeze must be the network's inference code. Do not get this wrong.

Start Docker: sudo ./docker_run.sh xilinx/vitis-ai-gpu:latest
Restart the Docker service: sudo systemctl restart docker

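1. First, convert the TensorFlow model to pb format. (The checkpoint format also works, but it requires modifying the source code, which is tedious, so converting the original files with saved_model is recommended instead.) The conversion code is shown below.

This command shows the names of the input and output tensors: saved_model_cli show --dir ./save_model_e --all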


import tensorflow as tf  # TensorFlow 1.x API


def compress():
    """Builds the inference graph and exports it as a SavedModel."""
    # Unnamed placeholder: its node name defaults to "Placeholder".
    img = tf.placeholder(tf.float32, [None, None, None, 3])

    with tf.name_scope('intra'):
        # Run only the inference-side transform on the image.
        # analysis_transform and args are defined elsewhere in the project.
        y = analysis_transform(img, args.conv_filters, args.num_filters)

    with tf.Session() as sess:
        # Restore the weights from the latest model checkpoint.
        latest = tf.train.latest_checkpoint(checkpoint_dir=args.checkpoint_dir)
        tf.train.Saver().restore(sess, save_path=latest)
        print('saving model')
        tf.saved_model.simple_save(sess, "./save_model_e", inputs={"inputs": img}, outputs={"outputs": y})

The inputs and outputs here are the nodes we save; the output that the later freeze step targets is this outputs node.
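One detail to keep in mind when reading the saved_model_cli output: it lists tensor names such as intra/enc_4/BiasAdd:0, where the :0 suffix is the output index, while freeze_graph's --output_node_names expects the node name without that suffix. A minimal sketch of the conversion; node_name is a hypothetical helper, not part of any TensorFlow or Vitis AI API:

```python
def node_name(tensor_name):
    """Return the node (op) name for a TensorFlow tensor name.

    Tensor names have the form "<node_name>:<output_index>".
    Names without a numeric suffix are returned unchanged.
    """
    head, sep, tail = tensor_name.rpartition(":")
    # Only strip the suffix when it really is a numeric output index.
    if sep and tail.isdigit():
        return head
    return tensor_name


if __name__ == "__main__":
    print(node_name("intra/enc_4/BiasAdd:0"))
    print(node_name("Placeholder"))
```

The result for the output tensor is the value to pass to --output_node_names in the freeze step and to --output_nodes in the quantize step.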

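2. Freeze

The first time Docker is started, this step may be slow, so be patient ~QAQ~

The freeze script follows. save_model_e is the directory holding the pb file saved above, and --output_node_names must name the output you need.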


#!/bin/bash
 
# remove previous results
rm -rf ./freeze_e
mkdir ./freeze_e
 
#conda activate decent_q3
 
 
# freeze trained graph
echo "#####################################"
echo "FREEZE GRAPH"
echo "#####################################"
 
 
 
freeze_graph --input_saved_model_dir=./save_model_e/ \
             --output_graph=./freeze_e/frozen_graph.pb \
             --output_node_names=intra/enc_4/BiasAdd            
 
echo "#####################################"
echo "FREEZE GRAPH COMPLETED"
echo "#####################################"

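3. Quantize

Using the frozen graph from above, quantization is calibrated by feeding in a preprocessed subset of the dataset. input_fn names a function that generates this calibration data; the code follows. Note: if the network's original inputs were normalized by dividing by 255, the data supplied here must be normalized the same way.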


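import os
import numpy as np
import cv2
from os import listdir
from os.path import join


calib_image_dir_name = "./kodak"
calib_batch_size = 8
height = 512
width = 768
channels = 3

def calib_input(iter):
  # vai_q_tensorflow calls this once per calibration step; iter is the step index.
  imgs = np.zeros([calib_batch_size, height, width, channels], dtype=np.float32)
  pngfilenames = sorted(listdir(calib_image_dir_name))
  for image_index in range(calib_batch_size):
    # Wrap around so every step gets a full batch and never overruns imgs.
    pngfilename = pngfilenames[(iter * calib_batch_size + image_index) % len(pngfilenames)]
    print(pngfilename)
    in_png_file = join(calib_image_dir_name, pngfilename)
    # Scale to [0, 1] to match the /255 normalization of the network input.
    image = cv2.imread(in_png_file, cv2.IMREAD_COLOR) / 255.
    image = np.reshape(image, [height, width, channels])
    imgs[image_index, :, :, :] = np.float32(image)

  # The key must match the input node name passed to --input_nodes.
  return {"Placeholder": imgs}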
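vai_q_tensorflow calls the input_fn once per calibration step and passes the step index as the argument, so the index can be used to pick a different batch of images each time. A minimal sketch of that indexing in plain Python; files_for_step and the file names are hypothetical, and wrapping around when the images run out is a choice made for this sketch, not something the tool requires:

```python
def files_for_step(filenames, step, batch_size):
    """Pick the files that make up calibration batch number `step`.

    Files are taken in sorted order and the list wraps around, so every
    step returns exactly `batch_size` names even for a small directory.
    """
    if not filenames:
        raise ValueError("no calibration images found")
    ordered = sorted(filenames)
    start = step * batch_size
    return [ordered[(start + i) % len(ordered)] for i in range(batch_size)]


if __name__ == "__main__":
    names = ["img%02d.png" % i for i in range(1, 7)]  # hypothetical file names
    for step in range(2):
        print(step, files_for_step(names, step, batch_size=4))
```

With --calib_iter 10 and a batch of 8, calibration requests 80 images in total, so a small directory is visited more than once under this scheme.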

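Next comes quantization itself. --input_nodes and --input_shapes give the name and size of the input that the calibration data is fed to. This data is used to calibrate the model; its exact effect still needs investigating. Quantization is 8-bit by default, and the method used here is 1.

--output_nodes is the name of the output node in the original pb.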


#!/bin/bash
 
 
# activate DECENT_Q Python3.6 virtual environment
#conda activate decent_q3
 
# generate calibration images and list file
 
# remove existing files
rm -rf ./quantize_e
mkdir ./quantize_e
 
 
# run quantization
echo "#####################################"
echo "QUANTIZE"
echo "#####################################"
vai_q_tensorflow quantize \
  --input_frozen_graph ./freeze_e/frozen_graph.pb \
  --input_nodes Placeholder \
  --input_shapes ?,512,768,3 \
  --output_nodes intra/enc_4/BiasAdd \
  --input_fn graph_input_fn_e_list.calib_input \
  --method 1 \
  --weight_bit 8 \
  --activation_bit 8 \
  --gpu 0 \
  --calib_iter 10 \
  --output_dir ./quantize_e
echo "#####################################"
echo "QUANTIZATION COMPLETED"
echo "#####################################"
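To give a feel for what --weight_bit 8 and --method mean, here is a simplified illustration in plain Python. It assumes signed fixed-point values with a power-of-two scale and contrasts two ways of choosing that scale: one that never saturates (the idea behind method 0 in the user guide) and one that minimizes the quantization error while allowing saturation (the idea behind method 1). All function names are hypothetical and the tool's actual algorithm may differ in detail.

```python
def quantize(values, frac_bits, bit_width=8):
    """Round to signed fixed point with `frac_bits` fractional bits, then map back to float."""
    scale = 2.0 ** frac_bits
    lo, hi = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    out = []
    for v in values:
        q = max(lo, min(hi, round(v * scale)))  # saturate to the integer range
        out.append(q / scale)
    return out


def squared_error(values, frac_bits, bit_width=8):
    """Sum of squared differences between the values and their quantized versions."""
    quantized = quantize(values, frac_bits, bit_width)
    return sum((v - q) ** 2 for v, q in zip(values, quantized))


def non_overflow_bits(values, bit_width=8, search=range(-8, 17)):
    """Largest frac_bits for which no value saturates."""
    lo, hi = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    ok = [f for f in search if all(lo <= round(v * 2.0 ** f) <= hi for v in values)]
    return max(ok)


def min_diff_bits(values, bit_width=8, search=range(-8, 17)):
    """frac_bits with the smallest squared error, saturation allowed."""
    return min(search, key=lambda f: squared_error(values, f, bit_width))


if __name__ == "__main__":
    # Hypothetical data: many small values plus one outlier.
    data = [0.01 * i for i in range(-50, 51)] + [7.9]
    for name, pick in (("non-overflow", non_overflow_bits), ("min-diffs", min_diff_bits)):
        f = pick(data)
        print(name, "frac_bits =", f, "squared error =", squared_error(data, f))
```

Because the minimum-error search also considers the non-saturating choice, its error is never larger on the data it is given; whether that carries over to model accuracy is something to check on the real network.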

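4. The last stage is compilation; the code follows. It is simple: it converts the quantized file from above into something that can run on the FPGA board, in .xmodel format.

An important issue: https://forums.xilinx.com/t5/Vitis-HLS-Alveo-ML%E4%BB%A5%E5%8F%8A%E5%85%B6%E4%BB%96%E8%BD%AF%E4%BB%B6%E5%BA%94%E7%94%A8/vitis-ai-1-3%E7%BC%96%E8%AF%91%E6%8A%A5%E9%94%99/td-p/1187929

Vitis AI 1.2 used the deploy_model.pb file, but since the 2020.12 update to version 1.3, quantize_eval_model.pb is used throughout.

The thread also mentions a case where the input size information is missing and has to be given by hand, by passing the shape through --options.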


#!/bin/bash
 
# delete previous results
rm -rf ./compile_e
mkdir ./compile_e
 
 
#conda activate decent_q3
# --frozen_pb=./quantize_e/deploy_model.pb \
 
# Compile
echo "#####################################"
echo "COMPILE WITH DNNC"
echo "#####################################"
vai_c_tensorflow \
       --frozen_pb=./quantize_e/quantize_eval_model.pb \
       --arch=/opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU104/arch.json \
       --output_dir=compile_e \
       --net_name=analysis_transform \
       --options '{"input_shape": "10,512,768,3"}'
 
echo "#####################################"
echo "COMPILATION COMPLETED"
echo "#####################################"
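The --options value in the script above is a JSON string, which is easy to get wrong when quoting by hand. A minimal sketch that builds it from the shape numbers; compile_options is a hypothetical helper, and the only key used is the input_shape key that appears in the script:

```python
import json
import shlex


def compile_options(batch, height, width, channels):
    """Build the JSON string passed to vai_c_tensorflow via --options."""
    shape = ",".join(str(d) for d in (batch, height, width, channels))
    return json.dumps({"input_shape": shape})


if __name__ == "__main__":
    opts = compile_options(10, 512, 768, 3)
    # shlex.quote adds the single quotes the shell needs around the JSON.
    print("--options " + shlex.quote(opts))
```

Generating the string this way keeps the shape in one place if the height and width used for calibration ever change.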

For this last stage, which arch.json file to use is determined by a table of supported boards; see the user guide for the specifics.
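The three shell scripts (freeze, quantize, compile) can also be chained from a single Python driver. The sketch below only reuses the commands and flags that appear in the scripts above; build_commands and run_pipeline are hypothetical names, and it defaults to a dry run that prints the commands instead of executing them. Unlike the shell scripts, it does not delete earlier results.

```python
import os
import subprocess

OUTPUT_NODE = "intra/enc_4/BiasAdd"
STAGE_DIRS = ("freeze_e", "quantize_e", "compile_e")


def build_commands(arch_json, output_node=OUTPUT_NODE):
    """Return the freeze, quantize and compile commands as argument lists."""
    freeze = [
        "freeze_graph",
        "--input_saved_model_dir=./save_model_e/",
        "--output_graph=./freeze_e/frozen_graph.pb",
        "--output_node_names=" + output_node,
    ]
    quantize = [
        "vai_q_tensorflow", "quantize",
        "--input_frozen_graph", "./freeze_e/frozen_graph.pb",
        "--input_nodes", "Placeholder",
        "--input_shapes", "?,512,768,3",
        "--output_nodes", output_node,
        "--input_fn", "graph_input_fn_e_list.calib_input",
        "--method", "1",
        "--weight_bit", "8",
        "--activation_bit", "8",
        "--gpu", "0",
        "--calib_iter", "10",
        "--output_dir", "./quantize_e",
    ]
    compile_ = [
        "vai_c_tensorflow",
        "--frozen_pb=./quantize_e/quantize_eval_model.pb",
        "--arch=" + arch_json,
        "--output_dir=compile_e",
        "--net_name=analysis_transform",
        "--options", '{"input_shape": "10,512,768,3"}',
    ]
    return [freeze, quantize, compile_]


def run_pipeline(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    if not dry_run:
        for d in STAGE_DIRS:
            os.makedirs(d, exist_ok=True)
    for cmd in commands:
        print(" ".join(cmd))  # for logging only; not shell-quoted
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(build_commands("/opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU104/arch.json"))
```

Passing dry_run=False inside the Vitis AI Docker container would execute the stages in order; the arch.json path shown is the ZCU104 one from the compile script and should be swapped for the file that matches the target board.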
