ECE 734: Project Presentation
64-point FFT Algorithm for OFDM Applications using 8-point DFT processor (radix-8)
Pankhuri
May 8, 2013

Fast Fourier Transform
• Uses the symmetry and periodicity properties of the DFT to reduce computation
• The 64-point DFT computes a 64-value sequence X(f) from 64 input samples
• Basis of the FFT: a DFT can be divided into smaller DFTs
• e.g. the radix-8 algorithm divides the FFT into 8-point DFTs; radix-2 divides it into 2-point DFTs (butterflies, BF)

64-point FFT Algorithm Details

Performance Improvement

8-point FFT processor details
• The FFT8 processor uses the Winograd algorithm
• Minimizes multiplications (the more expensive operation) at the expense of more additions and somewhat more memory
• FFT8 (the unit that performs the base FFT operation) is pipelined
• One complex number is read from the input data buffer and one is written into the output data buffer each clock cycle (total of 14 clock cycles)
• Supports clock frequencies of up to 250 MHz

8-point FFT processor algorithm

8-point Winograd FFT

Processor Design Overview
[Block diagram; labels: Complex Input, Buffer RAM1, 8-point FFT unit 1, Buffer RAM2, 8-point FFT unit 2, Buffer RAM3, Twiddle factor multiplier, Complex Output]
• Synthesis and Simulation: Altera Quartus II
• Language: Verilog
• Target: Stratix IV FPGA

Processor Design Overview
• Data buffers: convert data from 8-inverse order to natural order, e.g. without the third buffer at the end, the output order is 0, 8, 16, ..., 56, 1, 9, 17, ... (8-inverse order)
• Built with altsyncram; each buffer can store 2x64 complex data values
• One bank is written by the previous stage while the other can be read simultaneously
• FFT blocks: the only constant multiplications needed are by 1/√2 (implemented as shift-and-add operations)

Processor Design Overview
• Sixteen 8-point FFT units are avoided by multiplexing two units, at the expense of increased latency
• Twiddle factor multiplier: uses a ROM holding precalculated twiddle factors
• Complex multiplication is accomplished by breaking it into three multiplies and five additions.
(lpm_mult megafunction)
(A + jB)(C + jD) = C(A-B) + B(C-D) + j(A(C+D) - C(A-B))

Synthesis Results

Processor Design Overview

Learning Outcomes
• Details of various FFT processor implementation issues: resolving bandwidth issues when multiple stages are involved, reducing multiplier count (pipelining), and the total number of multiplications required (algorithm efficiency)
• Read about many FFT algorithms used for OFDM applications before shortlisting this one, including the strategies they employ to reduce computation and, in particular, the popularity of radix-8 algorithms over radix-2

Future Work & Applications
• Future work: the modular design allows it to be used together with other 64-point FFTs to build larger transforms (much as this design is built from 8-point units)
• The structure can be configured in Xilinx, Altera, Actel, Lattice FPGA devices and ASIC
• Applications: OFDM modems, software defined radio, multichannel coding and many other high-speed real-time systems
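Supplementary sketch: the complex multiplication with three multiplies and five additions, described on the twiddle factor multiplier slide, can be checked numerically. The Python below is only an illustration, not the Verilog used in the project. The function name `cmul3` and the intermediate names are hypothetical. It uses the three products C(A-B), B(C-D) and A(C+D).

```python
def cmul3(a, b, c, d):
    """Return (real, imag) of (a + jb)(c + jd) using 3 multiplies, 5 adds."""
    s1 = a - b          # addition 1
    s2 = c - d          # addition 2
    s3 = c + d          # addition 3
    m1 = c * s1         # multiply 1: C(A-B)
    m2 = b * s2         # multiply 2: B(C-D)
    m3 = a * s3         # multiply 3: A(C+D)
    real = m1 + m2      # addition 4: AC - BD
    imag = m3 - m1      # addition 5: AD + BC
    return real, imag


# Compare against Python's built-in complex arithmetic.
re, im = cmul3(3, 4, 5, 6)
assert complex(re, im) == complex(3, 4) * complex(5, 6)
```

The direct form needs four multiplies and two additions. Trading one multiply for three extra additions is worthwhile when hardware multipliers are the scarcer resource.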
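Supplementary sketch: the radix-8 structure from the design overview slides (8-point DFTs, twiddle factor multiplication, then 8-point DFTs again) can be modelled in a few lines of Python. This is a behavioural illustration only. The function names are hypothetical, and a direct O(N²) `dft` stands in for the Winograd FFT8 unit. It also records the order in which outputs are produced, which reproduces the 8-inverse order 0, 8, 16, ..., 56, 1, 9, 17, ... that the final buffer is there to undo.

```python
import cmath


def dft(x):
    """Direct DFT, used here as a stand-in for the FFT8 unit."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]


def fft64_radix8(x):
    """64-point DFT from two stages of 8-point DFTs.

    Index mapping: n = 8*n1 + n2 (input), k = k1 + 8*k2 (output).
    Returns the spectrum in natural order plus the production order.
    """
    assert len(x) == 64
    # Stage 1: for each n2, an 8-point DFT over n1 gives stage1[n2][k1].
    stage1 = [dft([x[8 * n1 + n2] for n1 in range(8)]) for n2 in range(8)]
    # Twiddle multiplication by W64^(n2*k1); hardware would read these from ROM.
    tw = [[stage1[n2][k1] * cmath.exp(-2j * cmath.pi * n2 * k1 / 64)
           for k1 in range(8)] for n2 in range(8)]
    # Stage 2: for each k1, an 8-point DFT over n2 gives X[k1 + 8*k2].
    out = [0j] * 64
    order = []
    for k1 in range(8):
        y = dft([tw[n2][k1] for n2 in range(8)])
        for k2 in range(8):
            out[k1 + 8 * k2] = y[k2]
            order.append(k1 + 8 * k2)
    return out, order


x = [complex(i % 7, -(i % 5)) for i in range(64)]
spectrum, order = fft64_radix8(x)
reference = dft(x)
assert max(abs(a - b) for a, b in zip(spectrum, reference)) < 1e-9
print(order[:10])
```

The model runs sixteen 8-point transforms in total, eight per stage. That is the count the design avoids instantiating by time-multiplexing two FFT8 units.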
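Supplementary sketch: the slides note that the only constant multiplication inside the FFT blocks is by 1/√2, built from shift-and-add operations. The Python below shows one way this could be done on integers. The 8-bit coefficient 181/256 ≈ 0.70703 is an assumption, since the slides do not state the precision used, and `mul_inv_sqrt2` is a hypothetical name.

```python
import math


def mul_inv_sqrt2(x):
    """Approximate x / sqrt(2) for integer x using only shifts and adds.

    Assumes an 8-bit coefficient: 181 = 128 + 32 + 16 + 4 + 1, so
    x * 181 is five shifted copies of x added together, then scaled by 2^-8.
    """
    acc = (x << 7) + (x << 5) + (x << 4) + (x << 2) + x  # x * 181
    return acc >> 8                                      # divide by 256 (floor)


print(mul_inv_sqrt2(1024), 1024 / math.sqrt(2))
```

A wider coefficient would reduce the approximation error at the cost of more adders. The design's actual choice is not stated in the slides.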