Topic 10: The Fast Fourier Transform ELEN E4810: Digital Signal Processing 1.

ELEN E4810: Digital Signal Processing
Topic 10:
The Fast Fourier Transform
1. Calculation of the DFT
2. The Fast Fourier Transform algorithm
3. Short-Time Fourier Transform
Dan Ellis
2013-11-27
1. Calculation of the DFT


Filter design so far has been oriented to
time-domain processing - cheaper!
But: frequency-domain processing
makes some problems very simple:
x[n] → DFT → X[k] → Fourier-domain processing → Y[k] → IDFT → y[n]
use all of x[n], or use short-time windows
Need an efficient way to calculate DFT
The DFT

Recall the DFT:
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \qquad W_N = e^{-j 2\pi / N}$$
discrete transform of discrete sequence
Matrix form:
$$
\begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ \vdots \\ X[N-1] \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & W_N^{1} & W_N^{2} & \cdots & W_N^{(N-1)} \\
1 & W_N^{2} & W_N^{4} & \cdots & W_N^{2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & W_N^{(N-1)} & W_N^{2(N-1)} & \cdots & W_N^{(N-1)^2}
\end{bmatrix}
\begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \\ x[N-1] \end{bmatrix}
$$
Powers of $W_N$ are spaced by angle $2\pi/N$ around the unit circle, so $W_N^r$ has only N distinct values
Structure → opportunities for efficiency
Computational Complexity
N1
X[k] = 
kn
x[n]WN
n=0

N complex multiplies
+ N-1 complex adds per point (k)
× N points (k = 0.. N-1)



cpx mult: (a+jb)(c+jd) = ac - bd + j(ad + bc)
= 4 real mults + 2 real adds
cpx add = 2 real adds
N points: $4N^2$ real mults, $4N^2 - 2N$ real adds
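The operation counts above can be checked by brute force. The following is a minimal sketch, not an efficient implementation; the function name `direct_dft` and the counters are illustrative assumptions, and the multiply by $W_N^0$ is counted like any other multiply, as in the slide's tally.

```python
import cmath

def direct_dft(x):
    """Evaluate X[k] = sum_n x[n] * W_N^(k*n) directly, counting complex operations."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)      # W_N = e^{-j 2 pi / N}
    mults = adds = 0
    X = []
    for k in range(N):
        acc = x[0] * W ** 0                # first term of the sum (one multiply)
        mults += 1
        for n in range(1, N):
            acc += x[n] * W ** (k * n)     # one complex multiply, one complex add
            mults += 1
            adds += 1
        X.append(acc)
    return X, mults, adds

X, cpx_mults, cpx_adds = direct_dft([1.0, 2.0, 3.0, 4.0])
real_mults = 4 * cpx_mults                  # 4 real mults per complex mult
real_adds = 2 * cpx_mults + 2 * cpx_adds    # 2 real adds per cpx mult, 2 per cpx add
print(cpx_mults, cpx_adds, real_mults, real_adds)
```

For N = 4 the counters can be compared against $4N^2$ and $4N^2 - 2N$.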
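Goertzel’s Algorithm
Now:
$$X[k] = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{-k(N-\ell)} \qquad (\text{since } W_N^{-kN} = 1)$$
looks like a convolution
i.e. $X[k] = y_k[N]$, where $y_k[n] = x_e[n] \circledast h_k[n]$
$$x_e[n] = \begin{cases} x[n] & 0 \le n < N \\ 0 & n = N \end{cases}
\qquad
h_k[n] = \begin{cases} W_N^{-kn} & n \ge 0 \\ 0 & n < 0 \end{cases}$$
[Flowgraph: $x_e[n]$ enters a summing node whose output $y_k[n]$ is fed back through $W_N^{-k} z^{-1}$; $x_e[N] = 0$, $y_k[-1] = 0$, $y_k[N] = X[k]$]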
Goertzel’s Algorithm

Separate ‘filters’ for each X[k]
  can calculate for just a few values of k
No large buffer, no coefficient table
Same complexity for full X[k]
  ($4N^2$ mults, $4N^2 - 2N$ adds)
but: can halve multiplies by making the denominator real:
$$H(z) = \frac{1}{1 - W_N^{-k} z^{-1}} = \frac{1 - W_N^{k} z^{-1}}{1 - 2\cos\left(\frac{2\pi k}{N}\right) z^{-1} + z^{-2}}$$
  numerator: evaluate only for last step
  denominator: 2 real mults per step
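The real-denominator form of H(z) can be sketched as a two-term recursion followed by a single complex step. This is a minimal illustration under the definitions above; the names `goertzel` and `dft_bin` and the state variables `s1`, `s2` are assumptions made for the example, not part of the original slides.

```python
import cmath
import math

def goertzel(x, k):
    """Compute the single DFT bin X[k] with the second-order (real-denominator) recursion."""
    N = len(x)
    c = 2.0 * math.cos(2.0 * math.pi * k / N)    # the only coefficient used per step
    s1 = s2 = 0.0                                # s[n-1] and s[n-2]
    for sample in list(x) + [0.0]:               # x_e[n]: x[n] followed by x_e[N] = 0
        s0 = sample + c * s1 - s2                # denominator 1 - 2cos(.) z^-1 + z^-2
        s2, s1 = s1, s0
    # After the loop s1 = s[N] and s2 = s[N-1]; apply numerator 1 - W_N^k z^-1 once.
    Wk = cmath.exp(-2j * math.pi * k / N)
    return s1 - Wk * s2

def dft_bin(x, k):
    """Reference: direct evaluation of X[k] from the DFT definition."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))

x = [1.0, 2.0, 3.0, 4.0]
for k in range(len(x)):
    print(k, goertzel(x, k), dft_bin(x, k))
```

The complex multiply by $W_N^k$ happens only once, after the loop, which is where the saving over the first-order complex recursion comes from.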
2. Fast Fourier Transform FFT

Reduce complexity of DFT
from $O(N^2)$ to $O(N \log N)$
  grows more slowly with larger N
Works by decomposing large DFT into
several stages of smaller DFTs
Often provided as a highly optimized
library
Decimation in Time (DIT) FFT

Can rearrange DFT formula in 2 halves:
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}, \qquad k = 0 \ldots N-1$$
Arrange terms in pairs...
$$= \sum_{m=0}^{N/2-1} \left( x[2m]\, W_N^{2mk} + x[2m+1]\, W_N^{(2m+1)k} \right)$$
Group terms from each pair:
$$= \sum_{m=0}^{N/2-1} x[2m]\, W_{N/2}^{mk} \;+\; W_N^{k} \sum_{m=0}^{N/2-1} x[2m+1]\, W_{N/2}^{mk}$$
$$= X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}]$$
  $X_0$: N/2 pt DFT of x for even n
  $X_1$: N/2 pt DFT of x for odd n
Decimation in Time (DIT) FFT
$$\mathrm{DFT}_N\{x[n]\} = \mathrm{DFT}_{N/2}\{x_0[n]\} + W_N^{k}\, \mathrm{DFT}_{N/2}\{x_1[n]\}$$
  $x_0[n]$: x[n] for even n
  $x_1[n]$: x[n] for odd n
We can evaluate an N-pt DFT as two
N/2-pt DFTs (plus a few mults/adds)
But if $\mathrm{DFT}_N\{\cdot\} \sim O(N^2)$
then $\mathrm{DFT}_{N/2}\{\cdot\} \sim O((N/2)^2) = \tfrac{1}{4} O(N^2)$
Total computation ~ $2 \times \tfrac{1}{4} O(N^2)$
= 1/2 the computation (+ε) of direct DFT
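A single decimation-in-time split can be checked numerically against the direct DFT. This is a minimal sketch assuming an even-length input; `dft` and `dft_one_stage_dit` are illustrative names, and the two half-length DFTs are still computed directly.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT, used both as the reference and for the two half-length DFTs."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_one_stage_dit(x):
    """X[k] = X0[<k>_{N/2}] + W_N^k X1[<k>_{N/2}] with X0, X1 from even/odd samples."""
    N = len(x)                      # assumed even
    half = N // 2
    X0 = dft(x[0::2])               # N/2-pt DFT of even-indexed samples
    X1 = dft(x[1::2])               # N/2-pt DFT of odd-indexed samples
    return [X0[k % half] + cmath.exp(-2j * cmath.pi * k / N) * X1[k % half]
            for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
max_err = max(abs(a - b) for a, b in zip(dft(x), dft_one_stage_dit(x)))
print(max_err)
```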
One-Stage DIT Flowgraph
$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}]$$
[Flowgraph, N = 8: even points from x[n] (x[0], x[2], x[4], x[6]) feed one $\mathrm{DFT}_{N/2}$ giving $X_0[0..3]$; odd points from x[n] (x[1], x[3], x[5], x[7]) feed a second $\mathrm{DFT}_{N/2}$ giving $X_1[0..3]$; twiddle factors $W_N^0 \ldots W_N^7$ combine them into X[0] .. X[7]]
“twiddle factors”: always apply to odd-terms output
NOT mirror-image
X[4..7]: same as X[0..3] except for factors on $X_1[\cdot]$ terms
Classic FFT structure
Multiple DIT Stages


If decomposing one $\mathrm{DFT}_N$ into two
smaller $\mathrm{DFT}_{N/2}$’s speeds things up ...
Why not further divide into $\mathrm{DFT}_{N/4}$’s ?
i.e.
$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}], \qquad 0 \le k < N$$
make:
$$X_0[k] = X_{00}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{01}[\langle k \rangle_{N/4}], \qquad 0 \le k < N/2$$
  $X_{00}$: N/4-pt DFT of even points in even subset of x[n]
  $X_{01}$: N/4-pt DFT of odd points from even subset
Similarly,
$$X_1[k] = X_{10}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{11}[\langle k \rangle_{N/4}]$$
Two-Stage DIT Flowgraph
[Flowgraph, N = 8: inputs in the order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7] feed four $\mathrm{DFT}_{N/4}$ blocks giving $X_{00}$, $X_{01}$, $X_{10}$, $X_{11}$; twiddle factors $W_{N/2}^0 \ldots W_{N/2}^3$ combine them into $X_0[0..3]$ and $X_1[0..3]$ (different from before); the final stage with $W_N^0 \ldots W_N^7$ produces X[0] .. X[7] (same as before)]
Multi-stage DIT FFT

Can keep doing this until we get down
to 2-pt DFTs:
  $\mathrm{DFT}_2$: X[0] = x[0] + x[1], X[1] = x[0] - x[1]
  ≡ “butterfly” element, with $1 = W_2^0$ and $-1 = W_2^1$
→ $N = 2^M$-pt DFT reduces to M stages of
twiddle factors & summation
($O(N^2)$ part vanishes)
→ real mults < $M \cdot 4N$ , real adds < $2M \cdot 2N$
→ complexity ~ $O(N \cdot M) = O(N \log_2 N)$
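Repeating the even/odd split down to 2-pt DFTs can be written as a short recursion. This is a sketch rather than an optimized FFT: it assumes the length is a power of 2 (at least 2), recomputes every twiddle factor, and uses the illustrative names `fft_dit` and `dft`.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) assumed to be 2**M, M >= 1."""
    N = len(x)
    if N == 2:
        return [x[0] + x[1], x[0] - x[1]]       # 2-pt DFT "butterfly"
    half = N // 2
    X0 = fft_dit(x[0::2])                       # N/2-pt DFT of even samples
    X1 = fft_dit(x[1::2])                       # N/2-pt DFT of odd samples
    return [X0[k % half] + cmath.exp(-2j * cmath.pi * k / N) * X1[k % half]
            for k in range(N)]

def dft(x):
    """Direct DFT used only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
max_err = max(abs(a - b) for a, b in zip(fft_dit(x), dft(x)))
print(max_err)
```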
FFT Implementation Details

Basic butterfly (at any stage):
  $XX[r] = XX_0[r] + W_N^{r}\, XX_1[r]$
  $XX[r+N/2] = XX_0[r] + W_N^{r+N/2}\, XX_1[r]$
  2 cpx mults
Can simplify:
$$W_N^{r+N/2} = e^{-j \frac{2\pi (r + N/2)}{N}} = e^{-j \frac{2\pi r}{N}}\, e^{-j \frac{2\pi (N/2)}{N}} = -W_N^{r}$$
  $XX[r] = XX_0[r] + W_N^{r}\, XX_1[r]$
  $XX[r+N/2] = XX_0[r] - W_N^{r}\, XX_1[r]$
  just one cpx mult!
  i.e. SUB rather than ADD
8-pt DIT FFT Flowgraph
[Flowgraph: inputs in bit-reversed indexing order, x[0] (000), x[4] (100), x[2] (010), x[6] (110), x[1] (001), x[5] (101), x[3] (011), x[7] (111); three butterfly stages with twiddle factors $W_4$, then $W_8$, $W_8^2$, $W_8^3$; outputs X[0] .. X[7] in natural order]
-1’s absorbed into summation nodes
$W_N^0$ disappears
‘in-place’ algorithm: sequential stages
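The flowgraph structure (bit-reversed input order, then sequential butterfly stages that overwrite a single buffer) can be sketched as follows. This is an illustrative implementation assuming the length is a power of 2; `bit_reverse` and `fft_in_place` are names chosen for the example, and each butterfly uses the one-multiply form with a subtraction for the lower output.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of integer i (e.g. 001 -> 100)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_in_place(x):
    """Iterative radix-2 DIT FFT; len(x) assumed to be 2**M."""
    N = len(x)
    M = N.bit_length() - 1
    a = [complex(x[bit_reverse(i, M)]) for i in range(N)]   # bit-reversed input order
    size = 2
    while size <= N:                       # M sequential stages
        half = size // 2
        for start in range(0, N, size):
            for r in range(half):
                w = cmath.exp(-2j * cmath.pi * r / size)    # twiddle W_size^r
                t = w * a[start + r + half]                 # the single complex mult
                a[start + r + half] = a[start + r] - t      # SUB for the lower output
                a[start + r] = a[start + r] + t             # ADD for the upper output
        size *= 2
    return a

order = [bit_reverse(i, 3) for i in range(8)]
X = fft_in_place([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print(order)
print(X)
```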
FFT for Other Values of N


Having $N = 2^M$ meant we could divide
each stage into 2 halves = “radix-2 FFT”
Same approach works for:
  $N = 3^M$ radix-3
  $N = 4^M$ radix-4 - more optimized radix-2
  etc...
Composite N = a·b·c·d → mixed radix
(different N/r point FFTs at each stage)
  .. or just zero-pad to make $N = 2^M$
Inverse FFT


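Recall IDFT:
$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, W_N^{-nk}$$
  only differences from forward DFT: the 1/N scale and the sign of the exponent
Thus:
$$N x^*[n] = \left( \sum_{k=0}^{N-1} X[k]\, W_N^{-nk} \right)^* = \sum_{k=0}^{N-1} X^*[k]\, W_N^{nk}$$
  Forward DFT of $x'[n] = X^*[k]|_{k=n}$
  i.e. time sequence made from spectrum
Hence, use FFT to calculate IFFT:
$$x[n] = \frac{1}{N} \left[ \sum_{k=0}^{N-1} X^*[k]\, W_N^{nk} \right]^*$$
[Pure real flowgraph: Re{X[k]} feeds the DFT real input directly and Im{X[k]} is scaled by -1 before the imaginary input; the DFT real output is scaled by 1/N to give Re{x[n]} and the imaginary output by -1/N to give Im{x[n]}]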
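The conjugation identity can be tried out with any forward transform. The sketch below uses a direct DFT as a stand-in for the FFT; `dft` and `idft_via_dft` are illustrative names assumed for the example.

```python
import cmath

def dft(x):
    """Forward DFT (a direct O(N^2) stand-in for an FFT routine)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft_via_dft(X):
    """x[n] = (1/N) * conj( DFT{ conj(X[k]) } ), using only the forward transform."""
    N = len(X)
    y = dft([v.conjugate() for v in X])        # forward DFT of X*[k]
    return [v.conjugate() / N for v in y]      # conjugate again and scale by 1/N

x = [1.0, -2.0, 3.5, 0.25]
x_back = idft_via_dft(dft(x))
max_err = max(abs(a - b) for a, b in zip(x, x_back))
print(max_err)
```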
DFT of Real Sequences





If x[n] is pure-real, DFT wastes mult’s
Real x[n] → Conj. symm. $X[k] = X^*[-k]$
Given two real sequences, x[n] and w[n]
call y[n] = j·w[n] , v[n] = x[n] + y[n]
N-pt DFT V[k] = X[k] + Y[k]
but: $V[k] + V^*[-k] = X[k] + X^*[-k] + Y[k] + Y^*[-k]$
  where $X^*[-k] = X[k]$ and $Y^*[-k] = -Y[k]$
$X[k] = \tfrac{1}{2}\left(V[k] + V^*[-k]\right)$ , $W[k] = -\tfrac{j}{2}\left(V[k] - V^*[-k]\right)$
i.e. compute DFTs of two N-pt real
sequences with a single N-pt DFT
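The separation formulas can be verified on a small example. This is a minimal sketch in which the index -k is taken modulo N; `dft` and `two_real_dfts` are illustrative names, and a direct DFT stands in for the FFT.

```python
import cmath

def dft(x):
    """Direct DFT, standing in for an FFT routine."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def two_real_dfts(x, w):
    """Return (X, W) for real sequences x and w using one complex N-pt DFT."""
    N = len(x)
    V = dft([complex(a, b) for a, b in zip(x, w)])   # v[n] = x[n] + j w[n]
    X, W = [], []
    for k in range(N):
        Vc = V[-k % N].conjugate()                   # V*[-k], index modulo N
        X.append(0.5 * (V[k] + Vc))
        W.append(-0.5j * (V[k] - Vc))
    return X, W

x = [1.0, 2.0, 0.0, -1.0]
w = [0.5, -0.5, 2.0, 3.0]
X, W = two_real_dfts(x, w)
err_x = max(abs(a - b) for a, b in zip(X, dft(x)))
err_w = max(abs(a - b) for a, b in zip(W, dft(w)))
print(err_x, err_w)
```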
3. Short-Time Fourier Transform (STFT)

Fourier Transform (e.g. DTFT) gives
spectrum of an entire sequence:
How to see a time-varying spectrum?
e.g. slow AM of a sinusoid carrier:
$$x[n] = \left(1 - \cos\frac{2\pi n}{N}\right) \cos \omega_0 n$$
[Figure: x[n] vs. n for n = 0 to 1000, amplitude between -2 and 2]
Fourier Transform of AM Sine
[Figure: |X[k]| vs. k/(N/2), showing the region from 0 to 0.08]
Spectrum of whole sequence indicates modulation indirectly...
... as cancellation between closely-tuned sines: a component at $2\pi k n / N$ plus components at $2\pi (k-1) n / N$ and $2\pi (k+1) n / N$, each with half the amplitude and opposite sign
  $2 c_A c_B = c_{A+B} + c_{A-B}$
[Figure: the three closely-tuned sinusoids plotted against n]
Fourier Transform of AM Sine

Sometimes we’d rather separate
modulation and carrier:
$$x[n] = A[n] \cos \omega_0 n$$
  A[n] varies on a different (slower) timescale
[Figure: envelope A[n] and a spectral line at $\omega_0$]
One approach:
  chop x[n] into short sub-sequences ..
  .. where slow modulator is ~ constant
  DFT spectrum of pieces → show variation
FT of Short Segments

Break up x[n] into successive, shorter
chunks of length $N_{FT}$, then DFT each:
[Figure: x[n] for n = 0 to 1024 = N, cut into segments $x_0[n]$ .. $x_7[n]$ of length $N_{FT} = N/8$, with the magnitude spectra $X_0[k]$ .. $X_7[k]$ (k from 0 to 64) plotted beneath each segment]
Shows amplitude modulation of $\omega_0$ energy
$$k_0 = \frac{\omega_0 \cdot N_{FT}}{2\pi}$$
The Spectrogram
Plot successive DFTs in time-frequency:
[Figure: the spectra $|X_i[k]|$ of frames $X_0[k]$ .. $X_7[k]$ stacked side by side to form the image $|X[k,n]|$; axes: n (0 to 1024) vs. k]
time hopsize (between successive frames)
= 128 points
This image is called the Spectrogram
Short-Time Fourier Transform


Spectrogram = STFT magnitude
plotted on time-frequency plane
STFT is (DFT form):
$$X[k, n_0] = \sum_{n=0}^{N_{FT}-1} x[n_0 + n] \cdot w[n] \cdot e^{-j \frac{2\pi k n}{N_{FT}}}$$
  k: frequency index, $n_0$: time index
  $x[n_0 + n]$: $N_{FT}$ points of x starting at $n_0$
  w[n]: window
  $e^{-j 2\pi k n / N_{FT}}$: DFT kernel
intensity as a function of time & frequency
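The DFT form of the STFT translates directly into a double loop over frames and frequency bins. The sketch below is illustrative only: `stft`, the rectangular window, and the test signal parameters (N = 256, $N_{FT}$ = 32, carrier at bin 4) are assumptions chosen to keep a pure-Python example small.

```python
import cmath
import math

def stft(x, w, hop):
    """Return frames[r][k] = X[k, n0] for n0 = r * hop, with N_FT = len(w)."""
    n_ft = len(w)
    frames = []
    for n0 in range(0, len(x) - n_ft + 1, hop):
        seg = [x[n0 + n] * w[n] for n in range(n_ft)]       # windowed segment
        frames.append([sum(seg[n] * cmath.exp(-2j * cmath.pi * k * n / n_ft)
                           for n in range(n_ft))
                       for k in range(n_ft)])
    return frames

# Slowly amplitude-modulated carrier, a scaled-down version of the AM example.
N, n_ft, k0 = 256, 32, 4
w0 = 2.0 * math.pi * k0 / n_ft
x = [(1.0 - math.cos(2.0 * math.pi * n / N)) * math.cos(w0 * n) for n in range(N)]
frames = stft(x, [1.0] * n_ft, hop=n_ft)                    # rectangular window
carrier_mag = [abs(f[k0]) for f in frames]                  # |X[k0, n0]| per frame
print([round(m, 2) for m in carrier_mag])
```

The magnitude at bin `k0` follows the slow modulation from frame to frame, which is the variation the spectrogram displays.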
STFT Window Shape

w[n] provides ‘time localization’ of STFT
  e.g. rectangular w[n] selects x[n], $n_0 \le n < n_0 + N_W$
But: resulting spectrum has same
problems as windowing for FIR design:
DTFT form of STFT:
$$X(e^{j\omega}, n_0) = \mathrm{DTFT}\{x[n_0 + n] \cdot w[n]\} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{j\theta n_0}\, X(e^{j\theta})\, W(e^{j(\omega - \theta)})\, d\theta$$
spectrum of short-time window
is convolved with (twisted) parent spectrum
STFT Window Shape

e.g. if x[n] is a pure sinusoid,
[Figure: the line spectrum $X(e^{j\omega})$ convolved with $W(e^{j\omega})$]
  blurring (mainlobe)
  + ghosting (sidelobes)
Hence, use tapered window for w[n]
  e.g. Hamming
$$w[n] = 0.54 + 0.46 \cos\left(2\pi \frac{n}{2M+1}\right)$$
  sidelobes < -40 dB
[Figure: Hamming window w[n] for n = -10 to 10 and its spectrum $W(e^{j\omega})$]
STFT Window Length

Length of w[n] sets temporal resolution
  short window $w_S[n]$ measures only local properties
  longer window $w_L[n]$ averages spectral character
[Figure: x[n] for n = 0 to 1000 with the short window $w_S[n]$ and the long window $w_L[n]$ overlaid]
Window length ∝ 1/(Mainlobe width)
  $w_S[n]$, $N_1$ pts: $W_S(e^{j\omega})$ has a zero at $4\pi/N_1$
  $w_L[n]$, $N_2$ pts: $W_L(e^{j\omega})$ has a zero at $4\pi/N_2$
[Figure: $w_S[n]$ and $w_L[n]$ vs. n, each with its spectrum over $-\pi$ to $\pi$]
shorter window → more blurred spectrum
more time detail ↔ less frequency detail
STFT Window Length
Can illustrate time-frequency tradeoff
on the time-frequency plane:
[Figure: time-frequency plane, n vs. k]
  disks show ‘blurring’ due to window length;
  area of disk is constant
  → Uncertainty principle: $\Delta f \cdot \Delta t \ge k$
Alternate tilings of time-freq:
  half-length window → half as many DFT samples
Spectrograms of Real Sounds
[Figure: time-domain waveform above a spectrogram built from successive short DFTs; axes: time / s vs. freq / Hz (0 to 4000), intensity / dB]
[Figure: spectrogram of a longer excerpt; axes: time / s vs. freq / Hz (0 to 4000)]
individual t-f cells merge into continuous image
Narrowband vs. Wideband

Effect of varying window length:
[Figure: two spectrograms of the same sound; axes: time / s vs. freq / Hz (0 to 4000), level / dB]
  Window = 48 pt: Wideband
  Window = 256 pt: Narrowband
Spectrogram in Matlab
>> [d,sr]=wavread('mpgr1_sx419.wav');
>> Nw=256;             % (hann) window length
>> specgram(d,Nw,sr)   % sr: actual sampling rate (to label time axis)
>> caxis([-80 0])
>> colorbar
[Figure: resulting spectrogram; axes: Time vs. Frequency (0 to 8000), colorbar in dB from -80 to 0]
STFT as a Filterbank
Consider one ‘row’ of STFT (just one freq.):
$$X_k[n_0] = \sum_{n=0}^{N-1} x[n_0 + n] \cdot w[n] \cdot e^{-j \frac{2\pi k n}{N}} = \sum_{m=-(N-1)}^{0} h_k[m]\, x[n_0 - m]$$
  convolution with complex IR
where $h_k[n] = w[-n] \cdot e^{j \frac{2\pi k n}{N}}$
[Figure: the complex impulse response plotted as Re and Im parts against n]
Each STFT row is output of a filter
(subsampled by the STFT hop size)
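The equivalence between one STFT row and a convolution with $h_k[n]$ can be checked numerically. This is a small sketch under the definitions above; the function names `stft_row` and `filter_output`, the test signal, and the window values are assumptions made for illustration.

```python
import cmath

def stft_row(x, w, k, n0):
    """X_k[n0] from the STFT definition (sum over n = 0 .. N-1)."""
    N = len(w)
    return sum(x[n0 + n] * w[n] * cmath.exp(-2j * cmath.pi * k * n / N)
               for n in range(N))

def filter_output(x, w, k, n0):
    """Same value as a convolution sum with h_k[m] = w[-m] e^{j 2 pi k m / N}."""
    N = len(w)
    total = 0j
    for m in range(-(N - 1), 1):                        # h_k[m] is nonzero for m = -(N-1) .. 0
        h = w[-m] * cmath.exp(2j * cmath.pi * k * m / N)
        total += h * x[n0 - m]
    return total

x = [0.3, -1.0, 2.0, 0.5, 1.5, -0.7, 0.9, 0.1, -0.4, 1.1]
w = [0.2, 0.8, 0.8, 0.2]                                # an arbitrary 4-pt tapered window
diff = max(abs(stft_row(x, w, k, n0) - filter_output(x, w, k, n0))
           for k in range(4) for n0 in range(len(x) - len(w) + 1))
print(diff)
```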
STFT as a Filterbank

If $h_k[n] = w[-n] \cdot e^{j \frac{2\pi k n}{N}}$
then $H_k(e^{j\omega}) = W\left(e^{-j\left(\omega - \frac{2\pi k}{N}\right)}\right)$   (shift-in-ω)
Each STFT row is the same bandpass
response defined by $W(e^{j\omega})$,
frequency-shifted to a given DFT bin:
[Figure: $|W(e^{j\omega})|$, $|H_1(e^{j\omega})|$, $|H_2(e^{j\omega})|$, ... plotted against ω up to π]
A bank of identical, frequency-shifted
bandpass filters: “filterbank”
STFT Analysis-Synthesis



IDFT of STFT frames can reconstruct
(part of) original waveform
e.g. if $X[k, n_0] = \mathrm{DFT}\{x[n_0 + n] \cdot w[n]\}$
then $\mathrm{IDFT}\{X[k, n_0]\} = x[n_0 + n] \cdot w[n]$
Can shift by $n_0$, combine, to get $\hat{x}[n]$:
[Figure: x[n] and the windowed piece $x[n] \cdot w[n - n_0]$ located at $n_0$]
Could divide by $w[n - n_0]$ to recover x[n]...
STFT Analysis-Synthesis

Dividing by small values of w[n] is bad
Prefer to overlap windows:
[Figure: overlapping windowed pieces $x[n] \cdot w[n - r \cdot H]$ summing to $\hat{x}[n]$]
i.e. sample $X[k, n_0]$
at $n_0 = r \cdot H$ where H = N/2 (for example)
  H: hopsize, N: window length
Then
$$\hat{x}[n] = \sum_r x[n]\, w[n - rH] = x[n] \quad \text{if} \quad \sum_r w[n - rH] = 1$$
STFT Analysis-Synthesis
Hann or Hamming windows
with 50% overlap
sum to constant
$$\left(0.54 + 0.46 \cos\left(2\pi \frac{n}{N}\right)\right) + \left(0.54 + 0.46 \cos\left(2\pi \frac{n - N/2}{N}\right)\right) = 1.08$$
[Figure: w[n], w[n-N/2] and their sum w[n] + w[n-N/2] plotted against n]
Can modify individual frames of X[k,n]
and then reconstruct
  complex, time-varying modifications
  tapered overlap makes things OK
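The constant-sum property at 50% overlap can be confirmed by summing shifted copies of the window. This is a minimal sketch assuming the centred window definition $w[n] = a + (1-a)\cos(2\pi n/N)$ on $-N/2 \le n < N/2$; `tapered_window` and `overlap_sum` are illustrative names, with a = 0.54 for Hamming and a = 0.5 for Hann.

```python
import math

def tapered_window(n, N, a=0.54):
    """w[n] = a + (1 - a) cos(2 pi n / N) for -N/2 <= n < N/2, zero elsewhere."""
    half = N // 2
    if -half <= n < half:
        return a + (1.0 - a) * math.cos(2.0 * math.pi * n / N)
    return 0.0

def overlap_sum(n, N, H, a=0.54):
    """Sum over r of w[n - r*H], visiting only the shifts that can overlap sample n."""
    r_lo = (n - N // 2) // H
    r_hi = (n + N // 2) // H + 1
    return sum(tapered_window(n - r * H, N, a) for r in range(r_lo, r_hi + 1))

N, H = 64, 32                                   # window length and hop size H = N/2
hamming_sums = [overlap_sum(n, N, H, a=0.54) for n in range(4 * N)]
hann_sums = [overlap_sum(n, N, H, a=0.5) for n in range(4 * N)]
print(min(hamming_sums), max(hamming_sums))
print(min(hann_sums), max(hann_sums))
```

With the Hamming coefficients the sum is a constant other than 1, so a reconstruction would need a fixed rescaling; with the Hann coefficients the condition $\sum_r w[n - rH] = 1$ holds directly.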
STFT Analysis-Synthesis

e.g. Noise reduction:
[Figure: three time-frequency panels (frame index r vs. k): speech corrupted by white noise, STFT of original speech, and energy threshold mask]