Decimation-in-frequency FFT algorithm

The decimation-in-time FFT algorithms are all based on
structuring the DFT computation by forming smaller and smaller
subsequences of the input sequence x[n]. Alternatively, we can
consider dividing the output sequence X[k] into smaller and
smaller subsequences in the same manner.
   X[k] = ∑_{n=0}^{N-1} x[n] W_N^{nk},   k = 0, 1, ..., N-1
The even-numbered frequency samples are
   X[2r] = ∑_{n=0}^{N-1} x[n] W_N^{n(2r)}
         = ∑_{n=0}^{(N/2)-1} x[n] W_N^{n(2r)} + ∑_{n=N/2}^{N-1} x[n] W_N^{n(2r)}
         = ∑_{n=0}^{(N/2)-1} x[n] W_N^{2nr} + ∑_{n=0}^{(N/2)-1} x[n+(N/2)] W_N^{2r(n+(N/2))}
Since W_N^{2r(n+(N/2))} = W_N^{2nr} W_N^{rN} = W_N^{2nr} and W_N^2 = W_{N/2},
   X[2r] = ∑_{n=0}^{(N/2)-1} (x[n] + x[n+(N/2)]) W_{N/2}^{nr},   r = 0, 1, ..., (N/2)-1
The above equation is the (N/2)-point DFT of the (N/2)-point
sequence obtained by adding the first and the last half of the
input sequence.
Adding the two halves of the input sequence represents time aliasing, consistent with the fact that, in computing only the even-numbered frequency samples, we are sub-sampling the Fourier transform of x[n].
We now consider obtaining the odd-numbered frequency points:
   X[2r+1] = ∑_{n=0}^{N-1} x[n] W_N^{n(2r+1)}
           = ∑_{n=0}^{(N/2)-1} x[n] W_N^{n(2r+1)} + ∑_{n=N/2}^{N-1} x[n] W_N^{n(2r+1)}
Since
   ∑_{n=N/2}^{N-1} x[n] W_N^{n(2r+1)} = ∑_{n=0}^{(N/2)-1} x[n+(N/2)] W_N^{(n+N/2)(2r+1)}
                                      = W_N^{(N/2)(2r+1)} ∑_{n=0}^{(N/2)-1} x[n+(N/2)] W_N^{n(2r+1)}
                                      = − ∑_{n=0}^{(N/2)-1} x[n+(N/2)] W_N^{n(2r+1)}
(the last step uses W_N^{(N/2)(2r+1)} = W_N^{Nr} W_N^{N/2} = (1)(−1) = −1).
We obtain
   X[2r+1] = ∑_{n=0}^{(N/2)-1} (x[n] − x[n+(N/2)]) W_N^{n(2r+1)}
           = ∑_{n=0}^{(N/2)-1} (x[n] − x[n+(N/2)]) W_N^{n} W_{N/2}^{nr},   r = 0, 1, ..., (N/2)-1
The above equation is the (N/2)-point DFT of the sequence obtained by subtracting the second half of the input sequence from the first half and multiplying the resulting sequence by W_N^n.
Let g[n] = x[n] + x[n+N/2] and h[n] = x[n] − x[n+N/2]. The DFT can then be computed by forming the sequences g[n] and h[n], computing h[n] W_N^n, and finally computing the (N/2)-point DFTs of these two sequences.
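
To make the decomposition concrete, the following is a minimal NumPy sketch (ours, not part of the original notes) of a recursive radix-2 decimation-in-frequency FFT built from the g[n] and h[n] sequences defined above; the function name dif_fft and the power-of-two length assumption are ours.

import numpy as np

def dif_fft(x):
    # Recursive radix-2 decimation-in-frequency FFT (illustrative sketch).
    # Assumes len(x) is a power of two.
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    half = N // 2
    Wn = np.exp(-2j * np.pi * np.arange(half) / N)   # twiddle factors W_N^n, n = 0..N/2-1
    g = x[:half] + x[half:]                          # g[n] = x[n] + x[n+N/2]        -> X[2r]
    hw = (x[:half] - x[half:]) * Wn                  # h[n] W_N^n, h[n]=x[n]-x[n+N/2] -> X[2r+1]
    X = np.empty(N, dtype=complex)
    X[0::2] = dif_fft(g)                             # even-indexed frequency samples
    X[1::2] = dif_fft(hw)                            # odd-indexed frequency samples
    return X

x = np.random.randn(8)
assert np.allclose(dif_fft(x), np.fft.fft(x))        # agrees with a library FFT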
Flow graph of decimation-in-frequency decomposition of an N-point DFT (N=8).
Recursively, we can further decompose the (N/2)-point DFT
into smaller substructures:
Finally, we have
Butterfly structure for decimation-in-frequency FFT algorithm:
The decimation-in-frequency FFT algorithm also has a computational complexity of O(N log₂ N).
Circular Convolution (for DFT)
 Time-domain convolution implies frequency domain
multiplication. This property is valid for continuous Fourier
transform, Fourier series, and DTFT, but is not exactly true
for DFT.
 The DFT pair considered hereafter (following Oppenheim’s book, where the 1/N factor is put on the inverse-transform side):
   X[k] = ∑_{n=0}^{N-1} x[n] W_N^{kn},   k = 0, 1, ..., N-1

   x[n] = (1/N) ∑_{k=0}^{N-1} X[k] W_N^{−kn},   n = 0, 1, ..., N-1
where W_N = e^{−j2π/N} is an N-th root of unity, i.e., a root of the equation W^N = 1.
Circular Convolution (for DFT)
 For DFT, time domain circular convolution implies frequency
domain multiplication, and vice versa.
 Consider a periodic sequence. Its DTFT is both periodic
and discrete in frequency. Multiplication in the frequency
domain results in a convolution of the two corresponding
periodic sequences in the time domain.
 Now let’s consider a single period of the resulting sequence. Since the two sequences are both periodic, within a single period the convolution appears as ‘folding’ the rear of one sequence back to its front, sample by sample, and superimposing the inner products so obtained.
Convolution of two periodic sequences
Circular convolution (definition)
   x3[n] = x1[n] ⊗ x2[n] = x2[n] ⊗ x1[n]
         ≡ ∑_{m=0}^{N-1} x2[m] x1[((n − m))_N] = ∑_{m=0}^{N-1} x2[m] x1[(n − m) mod N]
Symbols for representing circular convolution: ⊗, or Ⓝ (an N enclosed in a circle).
Let the DFTs of x1[n], x2[n], and x3[n] be X1[k], X2[k], and X3[k], respectively.
 Time-domain circular convolution implies frequency-domain multiplication:
   x3[n] = x1[n] ⊗ x2[n]  ↔  X3[k] = X1[k] X2[k]
 Time-domain multiplication implies frequency-domain circular convolution (with a 1/N amplitude factor):
   x3[n] = x1[n] x2[n]  ↔  X3[k] = (1/N) X1[k] ⊗ X2[k]
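
As a quick numerical check (our illustration, not from the notes; the helper name circular_convolve is ours), the following NumPy sketch verifies both properties for random length-8 sequences.

import numpy as np

def circular_convolve(x1, x2):
    # Direct N-point circular convolution: x3[n] = sum_m x2[m] x1[(n-m) mod N]
    N = len(x1)
    return np.array([sum(x2[m] * x1[(n - m) % N] for m in range(N)) for n in range(N)])

N = 8
x1, x2 = np.random.randn(N), np.random.randn(N)

# circular convolution in time <-> multiplication of the DFTs
assert np.allclose(np.fft.fft(circular_convolve(x1, x2)), np.fft.fft(x1) * np.fft.fft(x2))

# multiplication in time <-> (1/N) times circular convolution of the DFTs
assert np.allclose(np.fft.fft(x1 * x2), circular_convolve(np.fft.fft(x1), np.fft.fft(x2)) / N)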
Example: circular convolution with a delayed impulse sequence,
   x1[n] = δ[n − n0],   0 ≤ n0 ≤ N
Example: circular convolution of two rectangular pulses
N-point circular convolution of two sequences of length N.
Example: circular convolution of two rectangular pulses (continued)
Given two sequences of length L, assume that we append L zeros to the end of each, making N = 2L point sequences – this is referred to as zero padding.
N-point circular convolution of two sequences of length L, where N=2L.
N-point circular convolution of two sequences of length L, where N=2L (continued).
Note that by zero padding, we can use circular convolution to
compute convolution of two finite length sequences.
Some other properties involving circular operations:
 Time-domain circular shift implies a frequency-domain phase shift:
   x[((n − m))_N], 0 ≤ n ≤ N−1  ↔  e^{−j(2πk/N)m} X[k] = W_N^{km} X[k]
Duality property of DFT:
 Since the DFT and the IDFT have very similar forms, we have a duality property for the DFT:
   If    x[n]  ↔  X[k]                       (DFT)
   then  X[n]  ↔  N x[((−k))_N],   0 ≤ k ≤ N−1   (DFT)
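
A quick NumPy check of this duality (our illustration; N = 8 is an arbitrary choice):

import numpy as np

N = 8
x = np.random.randn(N)
X = np.fft.fft(x)

# the DFT of X[n] equals N * x[((-k))_N]
assert np.allclose(np.fft.fft(X), N * x[(-np.arange(N)) % N])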
DFT Properties:
Sampling the DTFT spectrum
We have seen that, for an M-point sequence, if we uniformly sample M points of its DTFT spectrum within [0, 2π], we can equivalently obtain its DFT.
 What happens when we sample N points in the frequency domain [0, 2π] instead of M points? Let the samples be
   X̃[k] = X(z)|_{z = e^{j(2π/N)k}} = X(e^{j(2π/N)k}),   k = 0, ..., N−1
We will have the following property:
   x̃[n] = x[n] ∗ ∑_{r=−∞}^{∞} δ[n − rN] = ∑_{r=−∞}^{∞} x[n − rN]
which means that “frequency-domain sampling implies time domain
aliasing.”
 When N ≥ M, no aliasing happens; in particular, when N = M we obtain the DFT.
 When N < M, aliasing (of nonzero values) happens.
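
The frequency-sampling/time-aliasing relation can be checked numerically. The sketch below (ours; M = 8 and N = 5 are arbitrary choices) samples the DTFT of an 8-point sequence at only 5 frequencies and compares the inverse 5-point DFT with the explicitly time-aliased sequence.

import numpy as np

M, N = 8, 5                                    # M-point sequence, N < M frequency samples
x = np.arange(1.0, M + 1)

# sample X(e^{jw}) at w = 2*pi*k/N, k = 0..N-1
w = 2 * np.pi * np.arange(N) / N
X_samples = np.array([np.sum(x * np.exp(-1j * wk * np.arange(M))) for wk in w])

# inverse N-point DFT of the samples ...
x_tilde = np.fft.ifft(X_samples).real

# ... equals one period of sum_r x[n - rN] (the tail folds back onto the front)
x_aliased = x[:N].copy()
x_aliased[:M - N] += x[N:]
assert np.allclose(x_tilde, x_aliased)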
Linear Convolution Using DFT
 Recall that linear convolution is
   x3[n] = ∑_{m=−∞}^{∞} x1[m] x2[n − m]
When the lengths of x1[n] and x2[n] are L and P, respectively, the length of x3[n] is L+P−1.
 Thus a useful property is that the linear convolution of two
finite-length sequences (with lengths being L and P respectively)
is equivalent to circular convolution of the two N-point (N ≥ L+P−1)
sequences obtained by zero padding.
 Another useful property is that we can perform a circular convolution and see how many points remain the same as those of the linear convolution. When P < L and an L-point circular convolution is performed, the first (P−1) points are corrupted by circulation, and the remaining points from n = P−1 to n = L−1 (i.e., the last L−P+1 points) are not corrupted (i.e., the last L−P+1 points remain the same as the linear convolution result).
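
A short NumPy sketch of the zero-padding idea (ours, not from the notes): computing the linear convolution of an L-point and a P-point sequence as an N-point circular convolution, N = L + P − 1, via the FFT.

import numpy as np

L, P = 6, 4
x1, x2 = np.random.randn(L), np.random.randn(P)

N = L + P - 1                                     # any N >= L + P - 1 works
y_circ = np.fft.ifft(np.fft.fft(x1, N) * np.fft.fft(x2, N)).real   # fft(., N) zero-pads to N

assert np.allclose(y_circ, np.convolve(x1, x2))   # matches the linear convolution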
Block convolution (for implementing an FIR filter)
 FIR filtering equals the linear convolution of the input signal with a finite-length impulse response.
 To avoid delay in processing, and also to make the computation efficient, we would like to segment the signal into sections of length L. Each L-length section can then be convolved with the finite-length impulse response and the filtered sections fitted together in an appropriate way; this is called block convolution.
 When each section is sufficiently large, we usually use circular convolution (instead of linear convolution) to compute each section, since there are fast algorithms (the fast Fourier transform, FFT) that compute circular convolutions highly efficiently.
 Two methods for circular-convolution-based block convolution:
Overlapping-add method and overlapping-save method.
Overlapping-add method (for implementing an FIR filter)
When segmenting into L-length segments, the signal x[n] can be represented as
   x[n] = ∑_{r=−∞}^{∞} x_r[n − rL]
where
   x_r[n] = x[n + rL]   for 0 ≤ n ≤ L − 1,   and   x_r[n] = 0 otherwise.
Because convolution is an LTI operation, it follows that
   y[n] = x[n] ∗ h[n] = ∑_{r=−∞}^{∞} y_r[n − rL]
where
   y_r[n] = x_r[n] ∗ h[n]
Since xr[n] is of length L and h[n] is of length P, each yr[n] has
length (L+P−1).
So, we can use zero-padding to form two N-point sequences, N = L+P−1, from x_r[n] and h[n], and perform an N-point circular convolution (instead of a linear convolution) to compute y_r[n].
For example, consider two sequences h[n] and x[n] as follows.
Segmenting x[n] into L-length sequences.
Each segment is padded by P−1 zero values.
FIR filtering by using the overlapping-add method.
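
A minimal overlapping-add sketch in NumPy (ours; the function name overlap_add and the block length L = 16 are arbitrary choices): each L-sample block is convolved with h[n] through an (L+P−1)-point circular convolution computed with the FFT, and the overlapping tails are added back together.

import numpy as np

def overlap_add(x, h, L):
    P = len(h)
    N = L + P - 1                              # circular-convolution length for each block
    H = np.fft.fft(h, N)
    y = np.zeros(len(x) + P - 1)
    for start in range(0, len(x), L):
        block = x[start:start + L]             # the last block may be shorter than L
        y_block = np.fft.ifft(np.fft.fft(block, N) * H).real
        y[start:start + len(block) + P - 1] += y_block[:len(block) + P - 1]
    return y

x, h = np.random.randn(100), np.random.randn(7)
assert np.allclose(overlap_add(x, h, L=16), np.convolve(x, h))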
Overlapping-save method (for implementing an FIR filter)
Can we perform an L-point circular convolution, instead of an (L+P−1)-point circular convolution?
If an L-point sequence is circularly convolved with a P-point sequence (P < L), the first (P−1) points of the result are incorrect, while the remaining points are identical to those that would be obtained by linear convolution.
 Separate x[n] into overlapping sections of length L, so that each section overlaps the preceding section by (P−1) points:
   x_r[n] = x[n + r(L − P + 1) − P + 1],   0 ≤ n ≤ L − 1
Then, with y_r[n] denoting the L-point circular convolution of x_r[n] with h[n] (whose first P−1 points are discarded),
   y[n] = ∑_{r=0}^{∞} y_r[n − r(L − P + 1) + P − 1]
Example of overlapping-save method
Decompose x[n] into overlapping sections of length L.
Example of overlapping-save method (continued)
Result of circularly convolving each section with h[n]. The portions of each filtered section to be discarded in forming the linear convolution are indicated.
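
For comparison, a minimal overlapping-save sketch (ours; the function name and parameters are arbitrary): overlapping L-sample sections are formed with an overlap of P−1 samples, each is circularly convolved with h[n] using an L-point FFT, and the first P−1 outputs of every section are discarded.

import numpy as np

def overlap_save(x, h, L):
    P = len(h)
    step = L - P + 1                                 # new samples contributed per section
    H = np.fft.fft(h, L)
    xp = np.concatenate([np.zeros(P - 1), x, np.zeros(step)])   # prepend P-1 zeros
    out = []
    for start in range(0, len(x), step):
        section = xp[start:start + L]
        y_section = np.fft.ifft(np.fft.fft(section, L) * H).real
        out.append(y_section[P - 1:])                # discard the corrupted first P-1 samples
    return np.concatenate(out)[:len(x)]

x, h = np.random.randn(100), np.random.randn(7)
assert np.allclose(overlap_save(x, h, L=32), np.convolve(x, h)[:len(x)])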
Two-dimensional Transform
(c.f. Fundamentals of Digital Image Processing, A. K. Jain, Prentice
Hall, 1989)
One-dimensional orthogonal (unitary) transforms
   v = A u              →   v[k] = ∑_{n=0}^{N-1} a_{k,n} u[n],     0 ≤ k ≤ N−1
   u = A^{*T} v = A^H v  →   u[n] = ∑_{k=0}^{N-1} a*_{k,n} v[k],    0 ≤ n ≤ N−1
where A^{*T} = A^{−1}, i.e., A A^H = A^H A = I. That is, the columns of A^H form a set of orthonormal basis vectors, and so do the columns of A. The vectors a_k* ≡ {a*_{k,n}, 0 ≤ n ≤ N−1} are called the basis vectors of A. The series coefficients v[k] give a representation of the original sequence u[n] and are useful in filtering, data compression, feature extraction, and other analyses.
Two-dimensional orthogonal (unitary) transforms
Let {u[m,n]} be an N×N image.
   v[k, l] = ∑_{m=0}^{N-1} ∑_{n=0}^{N-1} u[m, n] a_{k,l}[m, n],     0 ≤ k, l ≤ N−1

   u[m, n] = ∑_{k=0}^{N-1} ∑_{l=0}^{N-1} v[k, l] a*_{k,l}[m, n],    0 ≤ m, n ≤ N−1
where {ak,l[m,n]}, called an image transform, is a set of complete
orthonormal discrete basis functions satisfying the properties:
Orthonormality:
   ∑_{m=0}^{N-1} ∑_{n=0}^{N-1} a_{k,l}[m, n] a*_{k',l'}[m, n] = δ[k − k', l − l']
Completeness:
   ∑_{k=0}^{N-1} ∑_{l=0}^{N-1} a_{k,l}[m, n] a*_{k,l}[m', n'] = δ[m − m', n − n']
where δ[a,b] is the 2D delta function, which is one only when
a=b=0, and is zero otherwise.
V = {v[k,l]} is called the transformed image.
The orthonormality property assures that any truncated series expansion of the form
   u_{P,Q}[m, n] = ∑_{k=0}^{P-1} ∑_{l=0}^{Q-1} v'[k, l] a*_{k,l}[m, n],   P ≤ N,  Q ≤ N
has its error minimized by the choice
   v'[k, l] = v[k, l].
When P = Q = N, the minimized error is zero.
Separable Unitary Transforms
The number of multiplications and additions required to compute the transform coefficients v[k,l] is O(N^4), which is quite excessive.
The complexity can be reduced to O(N^3) when the transform is restricted to be separable.
A transform {a_{k,l}[m,n]} is separable iff, for all 0 ≤ k, l, m, n ≤ N−1, it can be decomposed as follows:
   a_{k,l}[m, n] = a_k[m] b_l[n]
where A ≡ {a[k,m]} and B ≡ {b[l,n]} must themselves be unitary matrices, i.e., A A^H = A^H A = I and B B^H = B^H B = I.
Often one chooses B to be the same as A, so that
   v[k, l] = ∑_{m=0}^{N-1} ∑_{n=0}^{N-1} a_k[m] u[m, n] a_l[n]

   u[m, n] = ∑_{k=0}^{N-1} ∑_{l=0}^{N-1} a*_k[m] v[k, l] a*_l[n]
Hence, we can simplify the transform as
   V = A U A^T,   and   U = A^{*T} V A^*
where V = {v[k,l]} and U = {u[m,n]}.
A more general form: for an M×N rectangular image, the transform pair is
   V = A_M U A_N^T,   and   U = A_M^{*T} V A_N^*
where A_M and A_N are M×M and N×N unitary matrices, respectively.
These are called two-dimensional separable transforms. The complexity of computing the transformed image is O(N^3).
 The computation can be decomposed by first computing T = U A^T and then computing V = A T (for an N×N image).
 Computing T = U A^T requires N^2 inner products (of N-point vectors). Each inner product requires N operations, so the total is O(N^3).
 Similarly, V = A T also requires O(N^3) operations, so in total we need O(N^3) to compute V.
A closer look at T = U A^T:
Let the rows of U be {U_1, U_2, …, U_N}. Then
   T = U A^T = [U_1^T, U_2^T, …, U_N^T]^T A^T = [(U_1 A^T)^T, (U_2 A^T)^T, …, (U_N A^T)^T]^T.
Note that each U_i A^T (i = 1, …, N) is a one-dimensional unitary transform. That is, this step performs N one-dimensional transforms on the rows of the image U, obtaining a temporary image T.
Then, the step V = A T performs N 1-D unitary transforms on the columns of T.
In total, 2N 1-D transforms are performed, each of complexity O(N^2).
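
The row-column decomposition can be sketched directly in NumPy (our illustration; the unitary DFT matrix is used here only as one example of a unitary A):

import numpy as np

N = 8
k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
A = np.exp(-2j * np.pi * k * n / N) / np.sqrt(N)   # a unitary matrix (unitary DFT matrix)

U = np.random.randn(N, N)

T = U @ A.T            # N 1-D transforms applied to the rows of U
V = A @ T              # N 1-D transforms applied to the columns of T

assert np.allclose(V, A @ U @ A.T)                 # V = A U A^T
assert np.allclose(U, A.conj().T @ V @ A.conj())   # U = A^{*T} V A^*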
Two-dimensional Fourier Transform
Two-dimensional transforms can be formulated by directly extending the one-dimensional transform, e.g., the DFT of a two-dimensional signal (e.g., an image):
   v[k, l] = (1/N) ∑_{m=0}^{N-1} ∑_{n=0}^{N-1} u[m, n] W_N^{km} W_N^{ln}

   u[m, n] = (1/N) ∑_{k=0}^{N-1} ∑_{l=0}^{N-1} v[k, l] W_N^{−mk} W_N^{−nl}
Two-dimensional convolution (circular convolution):
   u2[m, n] = ∑_{m'=0}^{N-1} ∑_{n'=0}^{N-1} h[(m − m') mod N, (n − n') mod N] u1[m', n']
Since the two-dimensional DFT
   v[k, l] = (1/N) ∑_{m=0}^{N-1} ∑_{n=0}^{N-1} u[m, n] W_N^{km} W_N^{ln}

   u[m, n] = (1/N) ∑_{k=0}^{N-1} ∑_{l=0}^{N-1} v[k, l] W_N^{−mk} W_N^{−nl}
(where W_N = e^{−j(2π/N)}) is separable, it can be represented as
   V = F U F
where F is the N×N matrix whose (k, n)-th element is (1/√N) W_N^{kn}, i.e.,
   F = { (1/√N) W_N^{kn} },   0 ≤ k, n ≤ N−1
Fast computation of the two-dimensional DFT:
 According to V = F U F, it can be decomposed into the computation of 2N 1-D DFTs.
 Each 1-D DFT requires O(N log₂ N) computations.
 So, the 2-D DFT can be efficiently implemented with time complexity O(N^2 log₂ N).
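
A small NumPy check of this row-column idea for the 2-D DFT (our sketch; note that np.fft uses the convention without the 1/N scaling of these notes):

import numpy as np

N = 64
u = np.random.randn(N, N)

rows = np.fft.fft(u, axis=1)            # N 1-D DFTs along the rows
v = np.fft.fft(rows, axis=0)            # N more 1-D DFTs along the columns

assert np.allclose(v, np.fft.fft2(u))   # matches the built-in 2-D FFT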
The 2-D DFT inherits many properties of the 1-D DFT (e.g., conjugate symmetry, shifting, scaling, convolution, etc.). A property that does not come from the 1-D DFT is the rotation property.
Rotation property: if we represent (m, n) and (k, l) in polar coordinates,
   (m, n) = (r cos θ, r sin θ)   and   (k, l) = (w cos φ, w sin φ),
then
   u[r, θ + Δθ]  ↔  v[w, φ + Δθ]   (DFT).
That is, a rotation of an image implies the same rotation of its DFT.
Digital Filter Structures
   y[n] = ∑_{k=1}^{N} a_k y[n − k] + ∑_{k=0}^{M} b_k x[n − k]

   v[n] = ∑_{k=0}^{M} b_k x[n − k]

   y[n] = ∑_{k=1}^{N} a_k y[n − k] + v[n]
Direct Form I implementation
In the z-domain,
   H(z) = H2(z) H1(z) = ( 1 / (1 − ∑_{k=1}^{N} a_k z^{−k}) ) ( ∑_{k=0}^{M} b_k z^{−k} )
or equivalently
   V(z) = H1(z) X(z) = ( ∑_{k=0}^{M} b_k z^{−k} ) X(z)

   Y(z) = H2(z) V(z) = ( 1 / (1 − ∑_{k=1}^{N} a_k z^{−k}) ) V(z)
By changing the order of H1 and H2, consider the equivalence in the z-domain:
   H(z) = H1(z) H2(z)
where
   H1(z) = ∑_{k=0}^{M} b_k z^{−k},   H2(z) = 1 / (1 − ∑_{k=1}^{N} a_k z^{−k})
Let
   W(z) = H2(z) X(z) = ( 1 / (1 − ∑_{k=1}^{N} a_k z^{−k}) ) X(z)
Then
   Y(z) = H1(z) W(z) = ( ∑_{k=0}^{M} b_k z^{−k} ) W(z)
In the time domain,
   w[n] = ∑_{k=1}^{N} a_k w[n − k] + x[n]

   y[n] = ∑_{k=0}^{M} b_k w[n − k]
We have the following equivalence for implementation (we assume M = N here):
Note that exactly the same signal, w[n], is stored in the two chains of delay elements in the block diagram. The implementation can therefore be further simplified as follows:
Direct Form II (or Canonic Direct Form) implementation
By using the direct form II implementation, the number of
delay elements is reduced from (M+N) to max(M,N).
Example:
   H(z) = (1 + 2 z^{−1}) / (1 − 1.5 z^{−1} + 0.9 z^{−2})
Direct form I implementation
Direct form II implementation
Representing by signal-flow graph
Example: the signal-flow graph of direct form II.
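
As an illustrative sketch (ours, not from the notes), the example system above can be realized in Direct Form II with a single delay chain; the sign convention follows the difference equation y[n] = ∑ a_k y[n−k] + ∑ b_k x[n−k] used here, and the helper name direct_form_ii is ours.

import numpy as np

def direct_form_ii(b, a, x):
    # Direct Form II: w[n] = x[n] + sum_{k=1..N} a[k-1] w[n-k];  y[n] = sum_{k=0..M} b[k] w[n-k]
    # b = [b_0, ..., b_M]; a = [a_1, ..., a_N] (feedback coefficients as written above)
    M, N = len(b) - 1, len(a)
    w = np.zeros(max(M, N) + 1)                # one shared chain of max(M, N) delays
    y = np.zeros(len(x))
    for n in range(len(x)):
        w[0] = x[n] + np.dot(a, w[1:N + 1])    # feedback (pole) part
        y[n] = np.dot(b, w[:M + 1])            # feedforward (zero) part
        w = np.roll(w, 1)                      # shift the delay line by one sample
    return y

# Example from above: H(z) = (1 + 2 z^-1) / (1 - 1.5 z^-1 + 0.9 z^-2)
b, a = np.array([1.0, 2.0]), np.array([1.5, -0.9])
x = np.zeros(20); x[0] = 1.0                   # unit impulse

# cross-check against direct evaluation of the difference equation (Direct Form I)
y_ref = np.zeros(len(x))
for n in range(len(x)):
    y_ref[n] = (x[n] + 2 * (x[n - 1] if n >= 1 else 0)
                + 1.5 * (y_ref[n - 1] if n >= 1 else 0)
                - 0.9 * (y_ref[n - 2] if n >= 2 else 0))
assert np.allclose(direct_form_ii(b, a, x), y_ref)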
Cascade Form
If we factor the numerator and denominator polynomials, we
can express H(z) in the form:
   H(z) = A [ ∏_{k=1}^{M1} (1 − f_k z^{−1}) ∏_{k=1}^{M2} (1 − g_k z^{−1})(1 − g_k* z^{−1}) ]
            / [ ∏_{k=1}^{N1} (1 − c_k z^{−1}) ∏_{k=1}^{N2} (1 − d_k z^{−1})(1 − d_k* z^{−1}) ]
where M = M1 + 2M2 and N = N1 + 2N2; f_k and c_k are the real zeros and poles, g_k and g_k* are complex-conjugate pairs of zeros, and d_k and d_k* are complex-conjugate pairs of poles.
This is because any N-th order real-coefficient polynomial equation has N roots, and these roots are either real or occur in complex-conjugate pairs.
A general form is
   H(z) = ∏_{k=1}^{Ns} ( b_{0k} + b_{1k} z^{−1} + b_{2k} z^{−2} ) / ( 1 − a_{1k} z^{−1} − a_{2k} z^{−2} )
where Ns = ⌊(N + 1)/2⌋,
and we assume that M ≤ N. The real poles and zeros have been combined in pairs. If there is an odd number of zeros, one of the coefficients b_{2k} will be zero.
This suggests that the difference equation can be implemented via the following structure consisting of a cascade of second-order and first-order systems:
Cascade form of implementation (with a direct form II realization of each second-order subsystem).
Example:
   H(z) = (1 + 2 z^{−1} + z^{−2}) / (1 − 0.75 z^{−1} + 0.125 z^{−2})
        = (1 + z^{−1})(1 + z^{−1}) / [ (1 − 0.5 z^{−1})(1 − 0.25 z^{−1}) ]
Cascade structure: direct form I implementation
Cascade structure: direct form II implementation
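
A minimal sketch (ours) of the cascade structure for this example: two sections, each realized in Direct Form II, cross-checked against the overall difference equation.

import numpy as np

def biquad_df2(b, a, x):
    # One second-order section in Direct Form II.
    # b = (b0, b1, b2), a = (a1, a2): y[n] = a1 y[n-1] + a2 y[n-2] + b0 x[n] + b1 x[n-1] + b2 x[n-2]
    b0, b1, b2 = b
    a1, a2 = a
    w1 = w2 = 0.0
    y = np.zeros(len(x))
    for n in range(len(x)):
        w0 = x[n] + a1 * w1 + a2 * w2
        y[n] = b0 * w0 + b1 * w1 + b2 * w2
        w2, w1 = w1, w0
    return y

def cascade(sections, x):
    # Pass the signal through each section in turn.
    for b, a in sections:
        x = biquad_df2(b, a, x)
    return x

# H(z) = [(1 + z^-1) / (1 - 0.5 z^-1)] * [(1 + z^-1) / (1 - 0.25 z^-1)]
sections = [((1.0, 1.0, 0.0), (0.5, 0.0)),
            ((1.0, 1.0, 0.0), (0.25, 0.0))]

x = np.zeros(20); x[0] = 1.0                       # unit impulse
h_casc = cascade(sections, x)

# reference: y[n] = 0.75 y[n-1] - 0.125 y[n-2] + x[n] + 2 x[n-1] + x[n-2]
h_ref = np.zeros(20)
for n in range(20):
    h_ref[n] = (x[n] + 2 * (x[n - 1] if n >= 1 else 0) + (x[n - 2] if n >= 2 else 0)
                + 0.75 * (h_ref[n - 1] if n >= 1 else 0)
                - 0.125 * (h_ref[n - 2] if n >= 2 else 0))
assert np.allclose(h_casc, h_ref)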
Parallel Form
If we represent H(z) as a sum of lower-order rational systems:
   H(z) = ∑_{k=0}^{Np} C_k z^{−k} + ∑_{k=1}^{N1} A_k / (1 − c_k z^{−1})
          + ∑_{k=1}^{N2} B_k (1 − e_k z^{−1}) / [ (1 − d_k z^{−1})(1 − d_k* z^{−1}) ]
where N=N1+2N2. If M≥N, then Np = M−N.
Alternatively, we may group the real poles in pairs, so that
   H(z) = ∑_{k=0}^{Np} C_k z^{−k} + ∑_{k=1}^{Ns} ( e_{0k} + e_{1k} z^{−1} ) / ( 1 − a_{1k} z^{−1} − a_{2k} z^{−2} )
Illustration of the parallel-form structure for a sixth-order system (M = N = 6) with the real and complex poles grouped in pairs.
Example: consider again the same system
   H(z) = (1 + 2 z^{−1} + z^{−2}) / (1 − 0.75 z^{−1} + 0.125 z^{−2})
        = 8 + (−7 + 8 z^{−1}) / (1 − 0.75 z^{−1} + 0.125 z^{−2})
Another alternative form of the same system:
   H(z) = (1 + 2 z^{−1} + z^{−2}) / (1 − 0.75 z^{−1} + 0.125 z^{−2})
        = 8 + 18 / (1 − 0.5 z^{−1}) − 25 / (1 − 0.25 z^{−1})
Hence, given a system function, there are many ways to implement it. They are equivalent when infinite-precision arithmetic is used; however, their behavior with finite-precision arithmetic can be quite different.
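
To illustrate that these structures compute the same thing under infinite precision, the sketch below (ours) evaluates the parallel form 8 + 18/(1 − 0.5 z^{−1}) − 25/(1 − 0.25 z^{−1}) of the example above and compares its impulse response with direct evaluation of the original difference equation.

import numpy as np

def first_order(A, c, x):
    # One parallel branch A / (1 - c z^-1):  y[n] = c y[n-1] + A x[n]
    y = np.zeros(len(x))
    prev = 0.0
    for n in range(len(x)):
        y[n] = c * prev + A * x[n]
        prev = y[n]
    return y

x = np.zeros(20); x[0] = 1.0                      # unit impulse
h_par = 8 * x + first_order(18, 0.5, x) + first_order(-25, 0.25, x)

# reference: y[n] = 0.75 y[n-1] - 0.125 y[n-2] + x[n] + 2 x[n-1] + x[n-2]
h_ref = np.zeros(20)
for n in range(20):
    h_ref[n] = (x[n] + 2 * (x[n - 1] if n >= 1 else 0) + (x[n - 2] if n >= 2 else 0)
                + 0.75 * (h_ref[n - 1] if n >= 1 else 0)
                - 0.125 * (h_ref[n - 2] if n >= 2 else 0))
assert np.allclose(h_par, h_ref)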
While the signal flow graph is an efficient way to
represent a difference equation, not all of its instances
are realizable:
If a system function has poles, a corresponding signal
flow graph will have feedback loops.
A signal flow graph is computable if all loops contain at least one unit delay element, e.g.:
A non-computable system
Computable systems