DSP C5000 Chapter 19 Fast Fourier Transform Copyright © 2003 Texas Instruments. All rights reserved. Discrete Fourier Transform Allows us to compute an approximation of the Fourier Transform on a discrete set of frequencies from a discrete set of time samples. X( f ) x(t )e j 2ft dt N 1 X k xn e j 2 k n N for k 0, 1,, N 1 n 0 ESIEE, Slide 2 Where k are the index of the discrete frequencies and n the index of the time samples Copyright © 2003 Texas Instruments. All rights reserved. Inverse Discrete Fourier Transform The inverse formula is: 1 xn N X k e j 2 k n N for n 0, 1,, N 1 k 0 Where, again, k are the index of the discrete frequencies and n the index of the time samples. We have the following properties: ESIEE, Slide 3 N 1 Discrete time Periodic time periodic spectra discrete spectra Copyright © 2003 Texas Instruments. All rights reserved. DFT Computation We can write the DFT: N 1 X k xn WN-kn for k 0, 1,, N 1 with WN e j 2 N n0 We need: N 128 1024 4096 ESIEE, Slide 4 N(N-1) complex ‘+’ N2 complex ‘×’ + 16256 1047552 16773120 x 16384 1048576 16777216 Copyright © 2003 Texas Instruments. All rights reserved. Fast Fourier Transform 1 of 3 Cooley-Tukey algorithm: Based on decimation, leads to a factorization of computations. Let us first look at the classical radix 2 decimation in time. This particular case of the algorithm requires the time sequence length to be a power of 2. First we split the computation between odd and even samples: X k ESIEE, Slide 5 N / 2 1 N / 21 n0 n 0 -k2n x 2 n W N -k 2n 1 x 2 n 1 W N Copyright © 2003 Texas Instruments. All rights reserved. Fast Fourier Transform 2 of 3 Using the following property: W WN 2 N 2 X k N / 21 x2n W -kn N 2 n0 ESIEE, Slide 6 The DFT can be rewritten: W -k N N / 21 x2n 1 W n0 -kn N 2 For k=0, 1, …, N-1 Copyright © 2003 Texas Instruments. All rights reserved. Fast Fourier Transform 3 of 3 Using the property that: k WN X k N 2 WNk The entire DFT can be computed with only k=0, 1, …,N/2-1. N / 21 -kn -k x 2 n W W N N n0 2 N / 21 -kn x 2 n 1 W N n0 2 and N / 21 N N / 21 -kn -k -kn X k x 2n WN WN x 2n 1 WN 2 n0 n 0 2 2 ESIEE, Slide 7 Copyright © 2003 Texas Instruments. All rights reserved. Butterfly This leads to basic building block of the FFT, the butterfly. X(0) X(1) x(0) x(2) We need: •N/2(N/2-1) complex ‘+’ for each N/2 DFT. TFD N/2 •(N/2)2 complex ‘×’ for each DFT. x(N-2) X(N/2-1) W0 W1 x(1) x(3) - X(N/2) X(N/2+1) •N/2 complex ‘×’ at the input of the butterflies. •N complex ‘+’ for the butterflies. •Grand total: N2/2 complex ‘+’ TFD N/2 N/2(N/2+1) complex ‘×’ x(N-1) ESIEE, Slide 8 WN/2-1 - X(N-1) Copyright © 2003 Texas Instruments. All rights reserved. Recursion If N/2 is even, we can further split the computation of each DFT of size N/2 into two computations of half size DFT. When N=2r this can be done until DFT of size 2 (i.e. butterfly with two elements). 3rd stage 2nd stage 1st stage X(0) x(0) x(4) W80 W80 x(2) x(6) X(1) - W80 - W81 - X(2) - X(3) W80 x(1) W80=1 x(5) W80 W80 x(3) x(7) ESIEE, Slide 9 W81 W80 - W81 - - W82 W83 - X(4) - X(5) - X(6) - X(7) Copyright © 2003 Texas Instruments. All rights reserved. Number of Operations If N=2r, we have r=log2(N) stages. For each one we have: N/2 complex ‘×’ (some of them are by ‘1’). N complex ‘+’. Thus the grand total of operations is: N/2 log2(N) complex ‘×’. N log2(N) complex ‘+’. N 128 1024 4096 + 896 10240 49152 x 448 5120 24576 These counts can be compared with the ones for the DFT ESIEE, Slide 10 Copyright © 2003 Texas Instruments. All rights reserved. Shuffling the Data, Bit Reverse Ordering At each step of the algorithm, data are split between even and odd values. This results in scrambling the order. index address 000 000 x(0) x(0) x(0) 100 001 x(4) x(2) x(1) 010 010 x(2) x(4) x(2) 110 011 x(6) x(6) x(3) 001 100 x(1) x(1) x(4) 101 101 x(5) x(3) x(5) 011 111 110 111 x(3) x(5) x(6) x(7) x(7) x(7) Recursion of the algorithm ESIEE, Slide 11 Copyright © 2003 Texas Instruments. All rights reserved. Reverse Carry Propagation ESIEE, Slide 12 This scrambling when we use radix 2 FFT can be obtained by the Reverse Carry Propagation (RCP) algorithm. n Address[x(n+1)] Address[x(n)] 000 RCP(000+100)=100 000 001 RCP(100+100)=010 100 010 RCP(010+100)=110 010 011 RCP(110+100)=001 110 100 RCP(001+100)=101 001 101 RCP(101+100)=011 101 110 111 RCP(011+100)=111 011 111 We start with address 0 then we add N/2 to obtain the next address. If there is a carry, it propagates towards the least significant bit. When the data arrive in natural order, they are scrambled in this way. Copyright © 2003 Texas Instruments. All rights reserved. Algorithm Parameters 1 of 2 The FFT can be computed according to the following pseudo-code: ESIEE, Slide 13 For each stage For each group of butterfly For each butterfly compute butterfly end end end Copyright © 2003 Texas Instruments. All rights reserved. Algorithm Parameters 2/2 Node Spacing Butterflies per group Number of groups Twiddle factor ESIEE, Slide 14 The parameters are shown below: 1st stage 2nd stage 3rd stage … Last stage 1 2 3 … N/2 1 2 3 … N/2 N/2 N/4 N/8 … 1 … Copyright © 2003 Texas Instruments. All rights reserved. Scaling The DFT computation for each k is : N 1 X k xn WN-kn n0 To prevent overflow we need to have: X k 1 This is guaranteed provided a scale factor 1/N N 1 1 -kn X k xn WN 1 n 0 N ESIEE, Slide 15 Copyright © 2003 Texas Instruments. All rights reserved. Quantization Noise Quantization step is : 2 b If words have b+1 bits and x(n) belongs to [-1,1] If we assume that each real multiplication gives rise to a noise source of power e2 The total amount of noise power for each X(k) is given by 2 2 b r / 2 4 N with r log2 N 3 2 t ESIEE, Slide 16 2 e Copyright © 2003 Texas Instruments. All rights reserved. Signal to Quantization Noise Ratio (SQNR) DFT case If we assume x(n) uniform in [-1,1], after scaling, variance of data become: x2 1 3N 2 And because each X(k) comes from N summations: 1 2 X 3N SQNR for DFT with scaling is given by: X2 SQNR (dB) 10log 2 t 6b 6r N SQNR(dB) 128 48 1024 30 4096 18 16 bits per word ENOB 8 5 3 ENOB: effective number of bits, gives the effective resolution given a SNR. Based on the assumption that 6dB of SNR equates to 1 bit of precision. ESIEE, Slide 17 Copyright © 2003 Texas Instruments. All rights reserved. Signal to Quantization Noise Ratio (SQNR) FFT case 1 of 3 The computation of one X(k) requires N-1 butterflies: N/2 butterflies N/4 butterflies of the 1st stage of the 2nd stage 1 butterfly of the last stage X(0) x(0) x(4) W80 W80 x(2) x(6) X(1) - W80 - W81 - X(2) - X(3) W80 x(1) x(5) W80 ESIEE, Slide 18 W80 x(3) x(7) W81 W80 W81 - W82 W83 - X(4) - X(5) - X(6) X(7) Copyright © 2003 Texas Instruments. All rights reserved. Signal to Quantization Noise Ratio (SQNR) FFT case 2 of 3 Butterfly computation Xn+1(l) Xn(l) Xn(k) W Xn+1(k) - Xn+1(l) Xn(l) ESIEE, Slide 19 X n1 k X n l W X n k To prevent overflow, we only need to scale the inputs of each butterfly by 1/2 1/2 Xn(k) X n1 l X n l W X n k 1/2 W - Xn+1(k) • Because we have r stages, the global scaling factor for one output is: r 1 1 N 2 Copyright © 2003 Texas Instruments. All rights reserved. Signal to Quantization Noise Ratio (SQNR) FFT case 3 of 3 Quantization noise source at one stage is attenuated by scale factors of all the following stages. This gives an equivalent noise source of: 2 2b 2 t 3 2b 1 r 2 2 1 2 2 3 for each output. SQNR for FFT with scaling is given by: X2 SQNR (dB) 10 log 2 t ESIEE, Slide 20 6b 3r 3 N SQNR(dB) 128 66 1024 57 4096 51 16 bits per word ENOB 11 9,5 8,5 Copyright © 2003 Texas Instruments. All rights reserved. FFT Algorithm with Block Floating Point Scaling Input data in bit reverse order Set-up for next stage More groups ? Set-up for next group Set-up for next butterfly Scale inputs of butterfly (×1/2) More stages ? End of algorithm Compute butterfly More butterflies ? ESIEE, Slide 21 Copyright © 2003 Texas Instruments. All rights reserved. Case Study ‘C54x Use of audioFFT* inBuffer PCM3002 ADC pipRx Block processing pipTx outBuffer PCM3002 DAC (*) this program is the same as \ti\examples\dsk5416\bios\audio except for some slight modifications that will be emphazised when necessary ESIEE, Slide 22 Copyright © 2003 Texas Instruments. All rights reserved. Audio Program Block processing is echo() function in audio.c : In the original program: Data from input stream are copied directly in output stream. In the modified program : ESIEE, Slide 23 The input stream is split between left and right in two separate buffers. Each buffer is processed Bit-reverse scrambling Forward transform Bit reverse scrambling Inverse transform Resulting left and right buffers are interleaved in the output stream. Copyright © 2003 Texas Instruments. All rights reserved. DSPLIB functions The DSP Library (DSPLIB) is a collection of high-level optimized DSP function modules for the ‘C54x and ‘C55x DSP platform. ‘C54x DSPLIB functions for FFT computation ESIEE, Slide 24 Copyright © 2003 Texas Instruments. All rights reserved. Block Processing 1 of 3 Echo( ) ESIEE, Slide 25 Include and declarations Copyright © 2003 Texas Instruments. All rights reserved. Block Processing 2 of 3 ESIEE, Slide 26 Dsplib functions calls Copyright © 2003 Texas Instruments. All rights reserved. ‘C54x DSPLIB Bit Reversal ESIEE, Slide 27 Copyright © 2003 Texas Instruments. All rights reserved. Block Processing 3 of 3 ESIEE, Slide 28 Linker .cmd file Copyright © 2003 Texas Instruments. All rights reserved. ‘C54x DSPLIB FFT ESIEE, Slide 29 Copyright © 2003 Texas Instruments. All rights reserved. To Run the Build Process ESIEE, Slide 30 Project options Copyright © 2003 Texas Instruments. All rights reserved. To Change the Size of the Processing Buffer • Change buffer length declaration in echo function (audio.c) • Change the call to cfft and cifft according to the size of the FFT (must be hard coded) • Change buffer alignment in linker .cmd file ESIEE, Slide 31 Copyright © 2003 Texas Instruments. All rights reserved. Follow on Activities for TMS320C5416 DSK ESIEE, Slide 32 Application 8 for the the TMS320C5416 DSK uses the FFT as a spectrum analyzer to display the power in an audio signal at various frequencies. Rather than using the optimized library DSPLIB for the FFT, it uses a C code version that is slower, but can be stepped through line by line using Code Composer Studio. Copyright © 2003 Texas Instruments. All rights reserved.