Chapter 4

The Fast Fourier Transform
4.1 Introduction
The fundamental motivation for the FFT is the extremely high computational load
of calculating the DTFS directly as shown in section (3.7.1). By exploiting the symmetry
and periodicity properties of the twiddle factor in section (3.3), the number of required
calculations is significantly reduced [1]. Many FFT algorithms exist today, but they are
all derivatives of the work done by Cooley and Tukey in 1965 [7]. Their algorithm was
so efficient in computation that it revolutionized digital signal processing [4]. The
highest efficiency is achieved when the sample length is a power of two [1].
It is important to note that the FFT is mathematically equivalent to the DTFS, not
an approximation. As a result, all of the properties and strengths of the DTFS hold true
for all FFT algorithms. Most of the weaknesses also apply to FFT algorithms. What the
FFT does reduce is the computational load and, because fewer arithmetic operations are
performed, the quantization noise error that accumulates in a direct DTFS calculation [2].
4.2 FFT Improvements to the DTFS
4.2.1 Computational Load
The computational load of the DTFS discussed in section (3.7.1) is defined by
equations (3.24) and (3.25). The high computational load of the DTFS is due to the N^2
terms in these equations. The FFT algorithms significantly reduce the required
computations. All FFT algorithms have a computational load of
Number of Adds = C1*N*LOG2(N)          (4.1)
Number of Multiplies = C2*N*LOG2(N)    (4.2)
where C1 and C2 are constants.
The radix-2 Cooley-Tukey algorithm requires approximately 3*N*LOG2(N) adds
and 2*N*LOG2(N) multiplies [2]. Figure (4.1) compares the computational load of this
FFT algorithm with that of computing the DTFS directly.
Figure 4.1 Computational Loads of DTFS and FFT
As shown in figure (4.1), the advantage of the FFT over the DTFS grows as the
sample length (N) increases. For example, consider a sample length (N) of
32,768. Calculating the DTFS directly would require 8,590,000,129 computations, while
calculating the FFT would require only 2,457,600. This means that if a computer takes
30 seconds to evaluate the FFT, it would take 29.13 hours to evaluate the DTFS
directly.
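The arithmetic behind this example can be checked with a short sketch. It assumes the radix-2 counts quoted above (3*N*LOG2(N) adds plus 2*N*LOG2(N) multiplies) and takes the direct-DTFS count as the figure stated in the text rather than recomputing it from equations (3.24) and (3.25). The function and variable names are illustrative only.

```python
def fft_operations(n: int) -> int:
    """Adds plus multiplies for the radix-2 counts quoted in the text."""
    stages = n.bit_length() - 1  # log2(n), exact when n is a power of two
    return 3 * n * stages + 2 * n * stages


N = 32_768
DTFS_OPERATIONS = 8_590_000_129  # direct-DTFS count quoted in the text for this N

fft_ops = fft_operations(N)
ratio = DTFS_OPERATIONS / fft_ops
# Scale a 30-second FFT run by the operation ratio and convert to hours.
dtfs_hours = 30.0 * ratio / 3600.0

print(fft_ops)               # 2457600
print(round(dtfs_hours, 2))  # 29.13
```

The estimate treats every add and multiply as costing the same time, which is the same simplification the example in the text makes.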
4.2.2 Reduced Quantization Noise Error
Quantization noise error has two contributing components. First, as discussed in
section (3.7.2), the rounding involved in quantizing a sampled signal introduces
quantization error. Quantization of a sampled signal was discussed in section
(2.6). This quantization error is magnified by the numerous multiplications in calculating
the DTFS. Since the number of multiplications required to calculate the FFT is
significantly reduced, the error due to quantization noise is also reduced.
The second component arises from the multiplications themselves, because each
product must be rounded off. The product of two M-bit numbers is a 2*M-bit number.
To store the result as an M-bit value, the bottom M bits must be discarded. For example,
if two 16-bit numbers are multiplied, the product is a 32-bit number [2]. Again, since the
number of multiplications required to calculate the FFT is reduced, this contribution to
error is also reduced.
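The second component can be illustrated with a small sketch. It assumes unsigned integers and simple truncation of the bottom M bits; a real fixed-point implementation may use signed values or a different rounding mode. The helper name `truncate_product` is hypothetical.

```python
def truncate_product(a: int, b: int, m: int):
    """Multiply two unsigned m-bit values and keep only the top m bits.

    Returns (full_product, stored_value, discarded_bits).
    """
    full = a * b                        # needs up to 2*m bits
    stored = full >> m                  # top m bits that fit in an m-bit word
    discarded = full & ((1 << m) - 1)   # bottom m bits that are lost
    return full, stored, discarded


M = 16
full, stored, discarded = truncate_product(0xFFFF, 0xFFFF, M)

print(full.bit_length())  # 32
print(hex(stored))        # 0xfffe
print(discarded)          # 1
```

Each multiplication loses its own `discarded` part, so an algorithm that performs fewer multiplications accumulates less of this error.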
4.3 Additional Weakness of the FFT
Because the FFT reorganizes the input data and shares intermediate results among the
outputs, it computes all of the output coefficients together; it cannot produce a single
output coefficient in isolation. With the DTFS, however, each output coefficient can be
computed one at a time. As a result, if only a few output coefficients are needed, then
using the DTFS directly can require less computation than the FFT. Generally
though, all of the output coefficients are needed, and this weakness does not apply [2].
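Computing one coefficient directly can be sketched as follows. The sketch assumes equation (3.1) has the form X[k] = (1/N) * sum of x[n]*exp(-j*Omega0*k*n) with Omega0 = 2*pi/N, which is the form used in section 4.4; the function name is illustrative.

```python
import cmath


def dtfs_coefficient(x, k):
    """Evaluate a single DTFS coefficient X[k] directly from the samples x."""
    n_samples = len(x)
    omega0 = 2 * cmath.pi / n_samples
    total = sum(x[n] * cmath.exp(-1j * omega0 * k * n) for n in range(n_samples))
    return total / n_samples


# One period of cos(pi*n/2) sampled at n = 0..3.
samples = [1.0, 0.0, -1.0, 0.0]
x1 = dtfs_coefficient(samples, 1)
print(abs(x1 - 0.5) < 1e-12)  # True
```

Each call costs roughly N complex multiply-adds, so this approach only pays off when the number of coefficients needed is small.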
4.4 Radix-2 Decimation-in-Time (DIT) FFT Algorithm
A length-N DTFS can be split into a series of lower-order DTFSs, and computing
that series requires significantly fewer operations than computing the length-N DTFS
directly. Consider equations (3.1) and (3.2). Since they are essentially the same operation,
the same algorithm with a small modification can be used to generate either the set of X[k]
(FFT) or the set of x[n] (inverse FFT) [1].
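The "small modification" can be seen in a direct (non-fast) evaluation. The sketch below assumes equation (3.1) carries the 1/N factor and a negative exponent, and that equation (3.2) is the usual synthesis sum with a positive exponent and no scale factor; under those assumptions only the sign and the scale change between the two directions. The function name and the `inverse` flag are illustrative.

```python
import cmath


def dtfs(seq, inverse=False):
    """Direct evaluation of the analysis sum (default) or the synthesis sum."""
    n = len(seq)
    omega0 = 2 * cmath.pi / n
    sign = 1 if inverse else -1        # exponent sign is the only structural change
    scale = 1.0 if inverse else 1.0 / n
    return [
        scale * sum(seq[m] * cmath.exp(sign * 1j * omega0 * k * m) for m in range(n))
        for k in range(n)
    ]


x = [1.0, 2.0, 3.0, 4.0]
X = dtfs(x)
x_back = dtfs(X, inverse=True)
print(all(abs(a - b) < 1e-12 for a, b in zip(x, x_back)))  # True
```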
Assuming that N is an even number, the set x[n] in equation (3.1) can be divided
into its even- and odd-indexed samples [1]. Mathematically, this is described by
equations (4.3) and (4.4):
$$x_e[n] = x[2n], \qquad 0 \le n \le N'-1 \tag{4.3}$$

$$x_o[n] = x[2n+1], \qquad 0 \le n \le N'-1 \tag{4.4}$$

where $N' = N/2$. The DTFS of each even and odd data set is defined by equations (4.5)
and (4.6)

$$\mathrm{DTFS}\{x_e[n]\} = X_e[k], \qquad \Omega_0' \tag{4.5}$$

$$\mathrm{DTFS}\{x_o[n]\} = X_o[k], \qquad \Omega_0' \tag{4.6}$$

where $\Omega_0' = 2\pi/N' = 4\pi/N = 2\Omega_0$. Now express equation (3.1) in terms of the even and
odd data sets [1]:
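$$X[k] = \frac{1}{N}\sum_{n=0}^{N-1} x[n]\,e^{-j\Omega_0 kn} = \frac{1}{N}\sum_{n\ \mathrm{even}} x[n]\,e^{-j\Omega_0 kn} + \frac{1}{N}\sum_{n\ \mathrm{odd}} x[n]\,e^{-j\Omega_0 kn} \tag{4.7}$$

$$X[k] = \frac{1}{N}\sum_{m=0}^{N'-1} x[2m]\,e^{-j\Omega_0 k(2m)} + \frac{1}{N}\sum_{m=0}^{N'-1} x[2m+1]\,e^{-j\Omega_0 k(2m+1)} \tag{4.8}$$

$$X[k] = \frac{1}{N}\sum_{m=0}^{N'-1} x[2m]\,e^{-j\Omega_0 k(2m)} + \frac{1}{N}\sum_{m=0}^{N'-1} x[2m+1]\,e^{-j\Omega_0 k(2m)}\,e^{-j\Omega_0 k} \tag{4.9}$$

$$X[k] = \frac{1}{N}\sum_{m=0}^{N'-1} x[2m]\,e^{-j\Omega_0 k(2m)} + \frac{e^{-j\Omega_0 k}}{N}\sum_{m=0}^{N'-1} x[2m+1]\,e^{-j\Omega_0 k(2m)} \tag{4.10}$$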
Now by substituting $x_e[n]$, $x_o[n]$ and $\Omega_0' = 2\pi/N' = 4\pi/N = 2\Omega_0$ into equation (4.10),
equation (4.11) is obtained:

$$X[k] = \frac{1}{N}\sum_{m=0}^{N'-1} x_e[m]\,e^{-j\Omega_0' km} + \frac{e^{-j\Omega_0 k}}{N}\sum_{m=0}^{N'-1} x_o[m]\,e^{-j\Omega_0' km} \tag{4.11}$$

and by equations (3.1), (4.5) and (4.6) the following expression is obtained:

$$X[k] = X_e[k] + e^{-j\Omega_0 k} X_o[k], \qquad 0 \le k \le N-1 \tag{4.12}$$
Equation (4.12) implies that an N-point DTFS can be divided into its even- and
odd-indexed N’-point DTFSs and evaluated. The original N-point DTFS is the sum of the
even-indexed DTFS and the weighted, odd-indexed DTFS. If N is a power of two, then
each even- and odd-indexed DTFS can be subdivided into its even- and odd-indexed
portions until an array of N one-point DTFSs exists [1].
The DTFS can be further simplified by using the periodicity of the DTFS. From
equation (3.22), equations (4.13) and (4.14) are obtained [1]:

$$X_e[k] = X_e[k+N'], \qquad 0 \le k \le N'-1 \tag{4.13}$$

$$X_o[k] = X_o[k+N'], \qquad 0 \le k \le N'-1 \tag{4.14}$$

Also, from the inverse symmetry property in equation (3.10) it is gathered that

$$e^{-j\Omega_0 k} = -e^{-j\Omega_0\left(k+\frac{N}{2}\right)} \tag{4.15}$$

By applying equations (4.13), (4.14), and (4.15) to equation (4.12), the following are
obtained [1]:

$$X[k] = X_e[k] + e^{-j\Omega_0 k} X_o[k], \qquad 0 \le k \le N'-1 \tag{4.16}$$

$$X[k+N'] = X_e[k] - e^{-j\Omega_0 k} X_o[k], \qquad 0 \le k \le N'-1 \tag{4.17}$$
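Equations (4.16) and (4.17) translate almost directly into a recursive routine. The sketch below is one possible arrangement, not the code of Appendix (D). As an assumption of this sketch, the sub-transforms are kept as unscaled sums and the 1/N factor of equation (3.1) is applied once at the end, which keeps the butterfly lines identical in form to (4.16) and (4.17). All names are illustrative.

```python
import cmath


def _dit_sums(x):
    """Unscaled sums: sum over n of x[n]*exp(-j*2*pi*k*n/N), for k = 0..N-1."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]           # a one-point transform is the sample itself
    even = _dit_sums(x[0::2])            # transform of the even-indexed samples
    odd = _dit_sums(x[1::2])             # transform of the odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor exp(-j*Omega0*k)
        out[k] = even[k] + w * odd[k]           # form of equation (4.16)
        out[k + half] = even[k] - w * odd[k]    # form of equation (4.17)
    return out


def recursive_fft_dtfs(x):
    """DTFS coefficients of x (length a power of two), scaled by 1/N at the end."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    return [value / n for value in _dit_sums(x)]


coefficients = recursive_fft_dtfs([1.0, 0.0, -1.0, 0.0])
print([round(abs(c), 6) for c in coefficients])  # [0.0, 0.5, 0.0, 0.5]
```

The recursion makes the even/odd split explicit at the cost of allocating new lists at every level; an iterative version avoids that by reordering the input once.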
Figure (4.2) displays the calculation of equations (4.16) and (4.17) for an 8-point FFT.
Figure 4.2 Eight-Point FFT [1]
Each 4-Point DTFS block can further be divided into two 2-point DTFSs as shown below
in figure (4.3).
Figure 4.3 Expanded Four-Point FFT [1]
Finally, each 2-point DTFS can be divided into two 1-point DTFSs. This is shown in
figure (4.4) below.
Figure 4.4 Expanded Two-Point FFT [1]
This is called a butterfly because of its appearance [1]. Since it is the first stage, the
calculation in figure (4.3) is called a first-stage butterfly [9]. The calculations in the
second column are called second-stage butterflies, etc. Notice the absence of the
complex exponential in figure (4.4). Equations (4.18) and (4.19) illustrate why the
complex exponential is not present in figure (4.4).
$$X_{ee}[0] = x_{ee}[0]\,e^{0} + x_{ee}[1]\,e^{0} = x_{ee}[0] + x_{ee}[1] \tag{4.18}$$

$$X_{ee}[1] = x_{ee}[0]\,e^{0} + x_{ee}[1]\,e^{-j\frac{2\pi}{2}(1)(1)} = x_{ee}[0] - x_{ee}[1] \tag{4.19}$$
The process of subdividing each of the blocks into even- and odd-indexed sets of
data permutes the order of the DTFS input coefficients. After all of the subdivisions
down to the 1-point DTFSs, the input coefficients are in bit-reversed order. For
example, after all of the subdivisions for the 8-point FFT shown above,
the final order of the input coefficients is x[0], x[4], x[2], x[6], x[1], x[5], x[3], and lastly
x[7]. Therefore, when using this algorithm, the input coefficients must first be
bit-reversed before applying the calculations. The location of each input coefficient can be
found by taking the index’s binary representation, reversing its bits, and
then converting back to decimal. For example, the binary representation of the index of
x[1] is $001_2$. Reversing the bits gives $100_2$, which is equal to $4_{10}$ in decimal.
Therefore, x[1] is relocated to the original x[4] position [5].
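The index calculation described above can be sketched in a few lines. The function name and the `num_bits` parameter (the number of bits needed to index the data, 3 for an 8-point FFT) are illustrative choices, not taken from Appendix (D).

```python
def bit_reverse(index: int, num_bits: int) -> int:
    """Reverse the lowest num_bits bits of index."""
    reversed_index = 0
    for _ in range(num_bits):
        # Shift the result left and append the current lowest bit of index.
        reversed_index = (reversed_index << 1) | (index & 1)
        index >>= 1
    return reversed_index


order = [bit_reverse(i, 3) for i in range(8)]
print(order)              # [0, 4, 2, 6, 1, 5, 3, 7]
print(bit_reverse(1, 3))  # 4
```

The printed list matches the input order given in the text for the 8-point FFT. Because reversing the bits twice returns the original index, the same list serves both as "where each sample moves to" and "which sample lands in each position".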
The FFT algorithm used in this thesis is a radix-2 decimation-in-time (DIT) FFT
algorithm that uses the scheme discussed above. It is an iterative
algorithm in which the larger DTFS blocks are broken down into smaller blocks until the
fundamental DTFSs are reached. After this process is completed the algorithm computes
the butterflies from the 1st stage to the last stage. The algorithm takes in an input data set
in its original order and rearranges the data into bit-reversed order for processing by the
FFT. The code for the FFT algorithm used in this thesis is presented in Appendix (D).
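For orientation, an iterative radix-2 DIT FFT of the kind described here can be sketched as follows. This is not the Appendix (D) code; it is a minimal illustration that assumes the 1/N scaling of equation (3.1) is applied once after the last stage, and all names are hypothetical.

```python
import cmath


def bit_reverse_copy(x):
    """Return a complex copy of x with its elements in bit-reversed order."""
    n = len(x)
    num_bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        r, v = 0, i
        for _ in range(num_bits):
            r = (r << 1) | (v & 1)
            v >>= 1
        out[r] = complex(x[i])
    return out


def iterative_fft_dtfs(x):
    """DTFS coefficients of x (length a power of two) by staged butterflies."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    data = bit_reverse_copy(x)
    size = 2                                      # block length of the current stage
    while size <= n:
        half = size // 2
        step = cmath.exp(-2j * cmath.pi / size)   # twiddle increment for this stage
        for start in range(0, n, size):
            w = 1 + 0j
            for k in range(half):
                top = data[start + k]
                bottom = w * data[start + k + half]
                data[start + k] = top + bottom          # form of equation (4.16)
                data[start + k + half] = top - bottom   # form of equation (4.17)
                w *= step
        size *= 2
    return [value / n for value in data]


result = iterative_fft_dtfs([1.0, 0.0, -1.0, 0.0])
print([round(abs(c), 6) for c in result])  # [0.0, 0.5, 0.0, 0.5]
```

Updating the twiddle factor by repeated multiplication keeps the sketch short; for long transforms, recomputing or tabulating the exponentials limits the buildup of rounding error.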
4.5 Summary
The primary weakness of the DTFS is the computational burden it puts on a
system for increasing values of N. In order to harness the power of the DTFS for signal
processing of large data sets, it is necessary to circumvent the computational load
required to calculate it. The work done by Cooley and Tukey in 1965 makes this
possible [7]. The fast Fourier transform provides an output that is mathematically
equivalent to the DTFS while drastically reducing its computational requirements.
Implementing the FFT initially takes more development time than implementing the DTFS
directly, but this cost is outweighed by the cumulative computation time saved.
Many FFT algorithms exist in addition to the algorithm used in this thesis and can
be found in references [2] and [4]. The user may peruse the available algorithms
and find the one that is best suited for the application at hand. As will be discussed in
section (5.2), the choice of the FFT algorithm and the choice of the window function are
independent of each other. This allows a “superposition” approach to the development
process of an FFT-based signal processing system. The next chapter discusses the use of
window functions in order to minimize some of the weaknesses of the DTFS.