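Note: I used Quartus Standard 17.1. With newer Quartus versions, the FFT IP core must be licensed before it can be instantiated; otherwise EDA Netlist Writer reports a missing license during compilation and simulation, and launching the simulation tool fails as well. The board is Xiaomeige's (小梅哥) AC620V2 development board, which carries a Cyclone IV E device, EP4CE10F17C8.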
The official documentation for the FFT IP core can be opened by following the steps below.
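I/O Data Flow: four modes are available: Variable Streaming, Streaming, Buffered Burst, and Burst. Choose one according to your requirements. Streaming mode processes input data continuously and produces a continuous output stream, so neither the input nor the output has to be paused. Variable Streaming is very similar to Streaming, but it can handle sequences of different lengths while the FFT is running. Buffered Burst takes somewhat longer to process a transform but needs less memory than Streaming. Burst is similar to Buffered Burst but uses even less memory, at the cost of lower average throughput.
Data and Twiddle: sets the data representation and the twiddle factor precision. The valid choices depend on the I/O data flow selected above, and an incompatible combination is reported as an error. See the user guide for the details.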
Parameters | Value | Description |
Transform Length | 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, or 65536. Variable streaming also allows 8, 16, 32, 131072, and 262144 | The transform length. For variable streaming, this value is the maximum FFT length. |
Transform Direction | Forward, reverse, bidirectional | The transform direction. |
I/O Data Flow | Streaming, Variable Streaming, Buffered Burst, Burst | If you select Variable Streaming and Floating Point, the precision is automatically set to 32, and the reverse I/O order options are Digit Reverse Order. |
I/O Order | Bit Reverse Order, Digit Reverse Order, Natural Order, -N/2 to N/2 | The input and output order for data entering and leaving the FFT (variable streaming FFT only). The Digit Reverse Order option replaces the Bit Reverse Order in variable streaming floating point variations. |
Data Representation | Fixed point, single floating point, or block floating point | The internal data representation type (variable streaming FFT only), either fixed point with natural bit-growth or single precision floating point. Floating-point bidirectional IP cores expect input in … |
Data Width | 8, 10, 12, 14, 16, 18, 20, 24, 28, 32 | The data precision. The values 28 and 32 are available for variable streaming only. |
Twiddle Width | 8, 10, 12, 14, 16, 18, 20, 24, 28, 32 | The twiddle precision. The values 28 and 32 are available for variable streaming only. Twiddle factor precision must be less than or equal to data precision. |
Basic Parameters
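The limits in the table can be checked before opening the parameter editor. The following is a minimal Python sketch, not part of the Quartus tooling: the helper name `check_fft_params` and its arguments are hypothetical, and it encodes only the rules stated in the table above (allowed lengths, allowed widths, and twiddle width not exceeding data width).

```python
# Hypothetical helper: encodes only the limits listed in the Basic Parameters table.
FIXED_LENGTHS = {2 ** k for k in range(6, 17)}        # 64 ... 65536
VARIABLE_ONLY_LENGTHS = {8, 16, 32, 131072, 262144}   # variable streaming only
WIDTHS = {8, 10, 12, 14, 16, 18, 20, 24}
VARIABLE_ONLY_WIDTHS = {28, 32}                       # variable streaming only


def check_fft_params(length, data_width, twiddle_width, variable_streaming=False):
    """Return a list of problems; an empty list means the combination is allowed."""
    lengths = set(FIXED_LENGTHS)
    widths = set(WIDTHS)
    if variable_streaming:
        lengths |= VARIABLE_ONLY_LENGTHS
        widths |= VARIABLE_ONLY_WIDTHS

    problems = []
    if length not in lengths:
        problems.append(f"transform length {length} is not allowed")
    if data_width not in widths:
        problems.append(f"data width {data_width} is not allowed")
    if twiddle_width not in widths:
        problems.append(f"twiddle width {twiddle_width} is not allowed")
    if twiddle_width > data_width:
        problems.append("twiddle width must not exceed data width")
    return problems


if __name__ == "__main__":
    print(check_fft_params(1024, 16, 16))
    print(check_fft_params(1024, 16, 18))
```

The parameter editor remains the authority; a check like this only helps catch an inconsistent combination early.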
In Burst mode the Advanced Parameters must be configured as well, but only FFT Engine Architecture and Number of Parallel FFT Engines need to be set.
Parameters | Value | Description |
FFT Engine Architecture | Quad Output, Single Output | Choose between one, two, and four quad-output FFT engines working in parallel. Alternatively, if you have selected a single-output FFT engine architecture, you may choose to implement one or two engines in parallel. Multiple parallel engines reduce transform time at the expense of device resources, which allows you to select the desired area and throughput trade-off point. Not available for variable streaming or streaming FFTs. |
Number of Parallel FFT Engines | 1, 2, 4 | |
DSP Block Resource Optimization | On or Off | Turn on for multiplier structure optimizations. These optimizations use different DSP block configurations to pack multiply operations and reduce DSP resource requirements. This optimization may reduce FMAX because of the structure of the specific configurations of the DSP blocks when compared to the basic operation. Specifically, on Stratix V devices, this optimization may also come at the expense of accuracy. You can evaluate it using the MATLAB model provided and bit wise accurate simulation models. If you turn on DSP Block Resource Optimization and your variation has data precision between 18 and 25 bits, inclusive, and twiddle precision less than or equal to 18 bits, the FFT MegaCore function configures the DSP blocks in complex 18 x 25 multiplication mode. |
Enable Hard Floating Point Blocks | On or off | For Arria 10 devices and single-floating-point FFTs only. |
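After the IP has been configured, the generated Block Symbol is displayed.
The table below explains each signal. It is taken from the CSDN blog post "FPGA学习专题-FFT IP核的使用_quartus实现fft全流程-CSDN博客".
When configuring the IP core, keep the FFT latency in mind: Burst mode is slightly slower than Streaming mode, while the radix-4 mode is faster than the radix-2 mode (set under Advanced Parameters). The two choices can be combined.
This is the FFT IP core editor in Quartus. Generation has to be run in two places; otherwise the simulation reports an error.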
- //File name:fft_demo
- //Complete date:23/03/24
- module fft_demo(
- input wire clk,
- input wire rst_n,
- input wire sink_valid,
- input wire sink_sop,
- input wire sink_eop,
- input signed [15:0] data_in,
- output wire source_valid,
- output wire source_sop,
- output wire source_eop,
- output signed [31:0] data_out
- );
- wire sink_ready;
- wire [1:0] sink_error;
- wire signed [15:0] sink_imag;
- wire inverse;
- wire source_ready;
- wire [1:0] source_error;
- wire signed [15:0] source_real;
- wire signed [15:0] source_imag;
- wire [5:0] source_exp;
- assign sink_error=2'b00;
- assign sink_imag=16'd0;
- assign inverse=1'b0;
- assign source_ready=1'b1;
- fft u0 (
- .clk (clk), // clk.clk
- .reset_n (rst_n), // rst.reset_n
- .sink_valid (sink_valid), // sink.sink_valid
- .sink_ready (sink_ready), // .sink_ready
- .sink_error (sink_error), // .sink_error
- .sink_sop (sink_sop), // .sink_sop
- .sink_eop (sink_eop), // .sink_eop
- .sink_real (data_in), // .sink_real
- .sink_imag (sink_imag), // .sink_imag
- .inverse (inverse), // .inverse
- .source_valid (source_valid), // source.source_valid
- .source_ready (source_ready), // .source_ready
- .source_error (source_error), // .source_error
- .source_sop (source_sop), // .source_sop
- .source_eop (source_eop), // .source_eop
- .source_real (source_real), // .source_real
- .source_imag (source_imag), // .source_imag
- .source_exp (source_exp) // .source_exp
- );
- // Power spectrum |X|^2 = re^2 + im^2. The block exponent source_exp is not applied here.
- wire signed [31:0] dout_re,dout_im;
- assign dout_re=source_real*source_real;
- assign dout_im=source_imag*source_imag;
- assign data_out=dout_re+dout_im;
- endmodule
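
The module above outputs re² + im² and leaves `source_exp` unused. In the block-floating-point flows, `source_exp` is a signed block exponent. As far as I recall from Intel's FFT user guide, the output is brought back to full scale by multiplying it by 2^(-exp). That scaling rule is an assumption here and should be checked against the user guide for your version. A minimal Python sketch of the difference, with hypothetical function names:

```python
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def power(source_real, source_imag):
    """What fft_demo computes: re^2 + im^2, without the block exponent."""
    return source_real * source_real + source_imag * source_imag


def scaled_power(source_real, source_imag, source_exp_raw, exp_bits=6):
    """Power after applying the ASSUMED scaling 2**(-exp) to both re and im."""
    exp = to_signed(source_exp_raw, exp_bits)
    scale = 2.0 ** (-exp)
    # Scaling re and im by `scale` scales the power by scale**2.
    return power(source_real, source_imag) * scale * scale


if __name__ == "__main__":
    # Same mantissas, different block exponents give different true power.
    print(power(3, -4))
    print(scaled_power(3, -4, 0b111111))  # raw 6-bit pattern for exp = -1
    print(scaled_power(3, -4, 1))         # exp = +1
```

If the exponent changes from one frame to the next, comparing unscaled values across frames can be misleading, which is why the sketch keeps the two functions separate.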

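- clc;
- clear;
- fs=100000; % sampling rate in Hz
- f1=1000; % signal frequency in Hz
- t=0:1/fs:2047/fs; % 2048 sample instants
- x=floor((0.5*cos(2*pi*f1*t)+0.5)*255); % offset cosine quantized to 0..255
- plot(x);
- y=fft(x,1024); % only the first 1024 samples are transformed
- figure; % new figure so the spectrum does not overwrite the signal plot
- plot(abs(y).*abs(y)); % power spectrum
- % assumed export step: the C++ converter reads datain.txt, one integer per line
- fid=fopen('datain.txt','w'); fprintf(fid,'%d\n',x); fclose(fid);
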
- #include <cstdio>
- #include <iostream>
- using namespace std;
- int main()
- {
- freopen("datain.txt","r",stdin);
- freopen("dataset.vh","w",stdout);
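- short val;
- for(int i=0;i<2048;i++)
- {
- cin>>val;
- // print a fixed-width 16-bit hex word; the cast keeps negative values at 4 digits
- printf("%04hx\n",static_cast<unsigned short>(val));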
- }
- fclose(stdin);
- fclose(stdout);
- return 0;
- }
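
The MATLAB script and the C++ converter together produce one hex word per line for `$readmemh`. As a cross-check, the same data can be sketched in a single step in Python. This is only an illustration: the function names are hypothetical, and the values should match the MATLAB output except possibly where floating-point rounding lands exactly on an integer boundary.

```python
import math


def make_samples(n=2048, fs=100000, f1=1000):
    """Same formula as the MATLAB script: an offset cosine scaled to 0..255."""
    return [
        math.floor((0.5 * math.cos(2 * math.pi * f1 * i / fs) + 0.5) * 255)
        for i in range(n)
    ]


def to_hex_lines(samples, bits=16):
    """Format each sample as a fixed-width two's-complement hex word."""
    mask = (1 << bits) - 1
    digits = bits // 4
    return [format(s & mask, f"0{digits}x") for s in samples]


if __name__ == "__main__":
    lines = to_hex_lines(make_samples())
    # Print the first few words; redirect all of `lines` to dataset.vh if needed.
    for line in lines[:4]:
        print(line)
```

Masking to 16 bits before formatting keeps every word at four hex digits, which matches the 16-bit memory declared in the testbench.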

- `timescale 1ns/1ns
- module fft_demo_tb;
- reg clk;
- reg rst_n;
- wire sink_valid;
- wire sink_sop;
- wire sink_eop;
- wire signed [15:0]data_in;
- wire source_valid;
- wire source_sop;
- wire source_eop;
- wire [1:0]source_error;
- wire signed [31:0]data_out;
- parameter FILE_PATH="E:/Quartus-standard-17.1/Documents/fft/simulation/modelsim/dataset.vh";
- reg [15:0] data[0:2047]; // 2048 samples, matching the generated dataset.vh
- initial begin
- clk=0;
- $readmemh(FILE_PATH,data);
- #0 rst_n=0;
- #110 rst_n=1;
- end
- always #5 clk=~clk;
- reg [10:0] cnt;
- always@(posedge clk or negedge rst_n)
- begin
- if(!rst_n)begin cnt<=0; end
- else begin cnt<=cnt+1; end
- end
- assign data_in=data[cnt];
- reg [10:0] cnt1;
- always@(posedge clk or negedge rst_n)
- begin
- if(!rst_n)begin cnt1<=0; end
- else begin cnt1<=cnt1+1; end
- end
- assign sink_sop=(cnt1==1 && rst_n==1)?1:0;
- assign sink_eop=(cnt1==1024 && rst_n==1)?1:0;
- assign sink_valid=(cnt1>=1 && cnt1<=1024 && rst_n==1)?1:0;
- fft_demo u_test(
- .clk(clk),
- .rst_n(rst_n),
- .sink_valid(sink_valid),
- .sink_sop(sink_sop),
- .sink_eop(sink_eop),
- .data_in(data_in),
- .source_valid(source_valid),
- .source_sop(source_sop),
- .source_eop(source_eop), // fft_demo has no source_error port, so it is not connected
- .data_out(data_out)
- );
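- integer vec_file1;
- initial
- begin
- wait(rst_n==1'b1);
- #10;
- vec_file1=$fopen("E:/Quartus-standard-17.1/Documents/fft/data_out.dat","w");
- forever
- begin
- @(posedge clk);
- #1;
- if(source_valid)
- begin
- $fwrite(vec_file1,"%d\n",data_out);
- // close the file and stop once the first output frame is complete;
- // a $fclose placed after the forever loop would never be reached
- if(source_eop)
- begin
- $fclose(vec_file1);
- $stop;
- end
- end
- end
- end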
- endmodule
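
The testbench derives `sink_sop`, `sink_eop` and `sink_valid` from a free-running 11-bit counter, and `data_in` is indexed by a counter with the same value. A small Python model of those three `assign` statements shows which memory entries are actually sent in one frame. The function names are hypothetical; the comparisons are copied from the testbench.

```python
def sink_controls(cnt):
    """Mirror the three assign statements of the testbench (reset released)."""
    sop = cnt == 1
    eop = cnt == 1024
    valid = 1 <= cnt <= 1024
    return sop, eop, valid


def frame_indices(counter_bits=11):
    """Memory indices presented while sink_valid is high in one counter period."""
    return [cnt for cnt in range(1 << counter_bits) if sink_controls(cnt)[2]]


if __name__ == "__main__":
    idx = frame_indices()
    print(len(idx), idx[0], idx[-1])
```

The model makes two points visible: a frame is exactly 1024 valid cycles long, and because the frame starts at counter value 1, the first sample sent is `data[1]` rather than `data[0]`.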

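Launch RTL Simulation from Quartus.
Export the signal data generated in MATLAB as a .mif file. On the FPGA, a RAM IP core reads the .mif file, and the FFT IP core then computes the spectrum.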
- clc;
- clear;
- fs=800;
- f1=50;
- N=1024;
- x=0.5*sin(2*pi*f1/fs*(0:N-1));
- f_axis=[0:N-1]*fs/N;
- figure;
- plot(x); % original signal
- y=fft(x,1024);
- figure;
- plot(f_axis,abs(y)); % spectrum
- % Quantize the signal to 12 bits
- x_quantized = round(x * (2^11-1));
- % Generate the .mif file
- fid = fopen('single_freq_signal.mif', 'w');
- fprintf(fid, 'DEPTH = 1024;\n');
- fprintf(fid, 'WIDTH = 12;\n');
- fprintf(fid, 'ADDRESS_RADIX = HEX;\n');
- fprintf(fid, 'DATA_RADIX = HEX;\n');
- fprintf(fid, 'CONTENT\n');
- fprintf(fid, 'BEGIN\n');
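- for i = 0:1023
- % mod(...,2^12) maps negative samples to 12-bit two's complement so that %03X prints valid hex
- fprintf(fid, '%03X : %03X;\n', i, mod(x_quantized(i+1), 2^12));
- end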
- fprintf(fid, 'END;\n');
- fclose(fid);
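
A .mif file with `DATA_RADIX = HEX` needs every data word as an unsigned hex pattern, so negative samples have to be converted to two's complement first. The following Python sketch generates the same kind of file content. The function names are hypothetical, and rounding is done half away from zero to mirror MATLAB's `round`.

```python
import math


def round_half_away(x):
    """Round to nearest, ties away from zero (the behaviour of MATLAB's round)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def twos_complement(value, bits):
    """Map a signed integer to its unsigned two's-complement bit pattern."""
    return value & ((1 << bits) - 1)


def sine_samples(n=1024, fs=800, f1=50, width=12):
    """Half-amplitude sine quantized like the MATLAB script."""
    amp = 2 ** (width - 1) - 1
    return [round_half_away(0.5 * math.sin(2 * math.pi * f1 / fs * i) * amp)
            for i in range(n)]


def mif_lines(samples, width):
    """Return the lines of a .mif file for the given signed samples."""
    data_digits = (width + 3) // 4
    addr_digits = max(1, ((len(samples) - 1).bit_length() + 3) // 4)
    lines = [f"DEPTH = {len(samples)};", f"WIDTH = {width};",
             "ADDRESS_RADIX = HEX;", "DATA_RADIX = HEX;", "CONTENT", "BEGIN"]
    for addr, s in enumerate(samples):
        word = twos_complement(s, width)
        lines.append(f"{addr:0{addr_digits}X} : {word:0{data_digits}X};")
    lines.append("END;")
    return lines


if __name__ == "__main__":
    for line in mif_lines(sine_samples(), 12)[:10]:
        print(line)
```

The `WIDTH` written to the file has to agree with the data width configured for the RAM IP core; the Verilog below uses 8-bit data ports, so one of the two needs to be adjusted.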

- // Top-level module
- module ram_ip(
- input sys_clk, // system clock
- input sys_rst_n, // system reset, active low
- output [7:0] ram_rd_data
- );
- //wire define
- wire ram_wr_en ; // RAM write enable
- wire ram_rd_en ; // RAM read enable
- wire [9:0] ram_addr ; // RAM read/write address
- wire [7:0] ram_wr_data ; // RAM write data
- //wire [7:0] ram_rd_data ; // RAM read data
- ram_rw ram_rw_inst( // instantiate the read/write driver
- .clk (sys_clk), // system clock
- .rst_n (sys_rst_n), // system reset, active low
- .ram_wr_en (ram_wr_en ), // RAM write enable
- .ram_rd_en (ram_rd_en ), // RAM read enable
- .ram_addr (ram_addr ), // RAM read/write address
- .ram_wr_data (ram_wr_data), // RAM write data
- .ram_rd_data (ram_rd_data) // RAM read data
- );
- ram_config ram_config_inst ( // instantiate the RAM IP core
- .address (ram_addr), // RAM read/write address
- .inclock (sys_clk), // system clock
- .outclock (sys_clk),
- .data (ram_wr_data), // RAM write data
- .rden (ram_rd_en), // RAM read enable
- .wren (ram_wr_en), // RAM write enable
- .q (ram_rd_data) // RAM read data
- );
- endmodule

- // RAM read/write driver
- module ram_rw(
- input clk , // clock
- input rst_n , // reset, active low
- output ram_wr_en , // RAM write enable
- output ram_rd_en , // RAM read enable
- output reg [9:0] ram_addr , // RAM read/write address
- output reg [7:0] ram_wr_data, // RAM write data
- output reg read_flag, // high for one clock each time rw_cnt wraps after 1023
- input [7:0] ram_rd_data // RAM read data
- );
- reg [10:0] rw_cnt ; // read/write control counter
- // The RAM is preloaded from the .mif file, so writing stays disabled
- //assign ram_wr_en = ((rw_cnt >= 6'd0) && (rw_cnt <= 6'd31)) ? 1'b1 : 1'b0;
- assign ram_wr_en = 1'b0;
- // rw_cnt only takes the values 0~1023, so ram_rd_en is effectively always high
- assign ram_rd_en = ((rw_cnt >= 11'd0) && (rw_cnt <= 11'd1023)) ? 1'b1 : 1'b0;
- // Read/write control counter, range 0~1023
- always @(posedge clk or negedge rst_n) begin
- if(rst_n == 1'b0)
- begin
- rw_cnt <= 11'd0;
- read_flag<=1'b0; // give read_flag a defined value after reset
- end
- else if(rw_cnt == 11'd1023) // wrap to zero after 1023
- begin
- rw_cnt <= 11'd0;
- read_flag<=1;
- end
- else //if(rw_cnt < 11'd1023)
- begin
- rw_cnt <= rw_cnt + 11'd1;
- read_flag<=0;
- end
- end
- // Write data increments while rw_cnt is in 0~31; it has no effect while ram_wr_en is tied low
- always @(posedge clk or negedge rst_n) begin
- if(rst_n == 1'b0)
- ram_wr_data <= 8'd0;
- else if(rw_cnt >= 6'd0 && rw_cnt <= 6'd31)
- ram_wr_data <= ram_wr_data + 8'd1;
- else
- ram_wr_data <= 8'd0;
- end
- // Read address, cycles through 0~1023
- always @(posedge clk or negedge rst_n) begin
- if(rst_n == 1'b0)
- ram_addr <= 10'd0;
- else if(ram_addr == 10'd1023)
- ram_addr <= 10'd0;
- else
- ram_addr <= ram_addr + 1'b1;
- end
- endmodule

- module fft_ram(
- input clk,
- input rst_n,
- output [15:0] data_out,
- output source_valid
- );
- wire [7:0] ram_rd_data;
- ram_ip ram_ip(
- .sys_clk(clk),
- .sys_rst_n(rst_n),
- .ram_rd_data(ram_rd_data)
- );
- fft_demo fft_demo(
- .clk(clk),
- .rst_n(rst_n),
- .data_in(ram_rd_data),
- .data_out(data_out),
- .source_valid(source_valid)
- );
- endmodule

- //File name:fft_demo
- //Complete date:23/03/24
- module fft_demo(
- input wire clk,
- input wire rst_n,
- input signed [7:0] data_in,
- output signed [15:0] data_out,
- output wire source_valid
- );
- wire sink_valid;
- wire sink_sop;
- wire sink_eop;
- //wire source_valid;
- wire source_sop;
- wire source_eop;
- wire [1:0] source_error;
- wire sink_ready;
- wire [1:0] sink_error;
- wire signed [7:0] sink_imag;
- wire inverse;
- wire source_ready;
- //wire [1:0] source_error;
- wire signed [7:0] source_real;
- wire signed [7:0] source_imag;
- wire [5:0] source_exp;
- assign sink_error=2'b00;
- assign sink_imag=8'd0;
- assign inverse=1'b0;
- assign source_ready=1'b1;
- reg [10:0] cnt;
- always@(posedge clk or negedge rst_n)
- begin
- if(!rst_n)
- cnt<=0;
- else
- cnt<=cnt+1;
- end
- assign sink_sop=(cnt==1)?1:0;
- assign sink_eop=(cnt==1024)?1:0;
- assign sink_valid=(cnt>=1 && cnt<=1024)?1:0;
- fft u0 (
- .clk (clk), // clk.clk
- .reset_n (rst_n), // rst.reset_n
- .sink_valid (sink_valid), // sink.sink_valid
- .sink_ready (sink_ready), // .sink_ready
- .sink_error (sink_error), // .sink_error
- .sink_sop (sink_sop), // .sink_sop
- .sink_eop (sink_eop), // .sink_eop
- .sink_real (data_in), // .sink_real
- .sink_imag (sink_imag), // .sink_imag
- .inverse (inverse), // .inverse
- .source_valid (source_valid), // source.source_valid
- .source_ready (source_ready), // .source_ready
- .source_error (source_error), // .source_error
- .source_sop (source_sop), // .source_sop
- .source_eop (source_eop), // .source_eop
- .source_real (source_real), // .source_real
- .source_imag (source_imag), // .source_imag
- .source_exp (source_exp) // .source_exp
- );
- wire signed [15:0] dout_re,dout_im;
- assign dout_re=source_real*source_real;
- assign dout_im=source_imag*source_imag;
- assign data_out=dout_re+dout_im;
- endmodule

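- // Convert the bin index to the corresponding frequency, which amounts to finding the peak
- module f_calculate(
- input clk,
- input rst_n,
- output reg [15:0] peak_value,
- output reg [9:0] peak_index,// bin index, not converted
- output reg [15:0] peak_index2,// converted to frequency; 16 bits because values up to fs/2 = 8000 do not fit in 10 bits
- output reg [9:0] index
- );
- reg [15:0] data_array [0:1023];
- reg [15:0] fs=16000;
- reg [15:0] threshold1=500;// lower and upper thresholds
- reg [15:0] threshold2=2000;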
- wire [15:0] data_out;
- wire source_valid;
- fft_ram fft_ram_inst(
- .clk(clk),
- .rst_n(rst_n),
- .data_out(data_out),
- .source_valid(source_valid)
- );
- always@(posedge clk or negedge rst_n)
- begin
- if(!rst_n)
- begin
- index<=0;
- peak_value<=0;
- peak_index<=0;
- peak_index2<=0;
- end
- else begin
- if(source_valid)
- begin
- data_array[index]<=data_out;
- index<=index+1;
- if(data_out>threshold1 && data_out<threshold2)
- begin
- peak_value<=data_out;
- peak_index<=index<512 ? index : 1024-index;
- // use the bin from this cycle; peak_index itself only updates on the next clock
- peak_index2<=(index<512 ? index : 1024-index)*fs/1024;
- end
- end
- end
- end
- endmodule
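
The conversion from bin index to frequency is f = k · fs / N, where k is the bin index and N the transform length. The Python sketch below illustrates it on a short test signal, using the MATLAB values fs = 800 Hz and f1 = 50 Hz. The names are hypothetical, and the naive DFT is there only to keep the example free of dependencies. This sketch picks the largest bin, whereas the module above keeps the last bin whose value falls between two thresholds.

```python
import cmath
import math


def power_spectrum(x):
    """Naive O(N^2) DFT power spectrum; adequate for a short illustration."""
    n = len(x)
    spectrum = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def peak_frequency(power, fs):
    """Return (bin index, frequency in Hz) of the largest non-DC bin."""
    n = len(power)
    # Bin 0 is DC. Only the first half is searched because the spectrum
    # of a real-valued signal is symmetric about N/2.
    k = max(range(1, n // 2), key=lambda i: power[i])
    return k, k * fs / n


if __name__ == "__main__":
    fs, f1, n = 800, 50, 64
    x = [0.5 * math.sin(2 * math.pi * f1 / fs * t) for t in range(n)]
    print(peak_frequency(power_spectrum(x), fs))
```

With N = 64 the frequency resolution is fs / N = 12.5 Hz, so a 50 Hz tone lands exactly on bin 4. A tone between two bins would be reported at the nearest bin frequency.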

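With newer Quartus versions, the FFT IP core must be licensed before it can be used; otherwise EDA Netlist Writer reports a missing license during compilation and simulation.
That completes the full flow of using the FFT IP core, from simulation to running on the board. My knowledge is still limited, so the article may contain some mistakes. If you spot a problem, please point it out so we can discuss it and learn together.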
