ELEN E4810: Digital Signal Processing
Topic 10: The Fast Fourier Transform

1. Calculation of the DFT
2. The Fast Fourier Transform algorithm
3. Short-Time Fourier Transform

Dan Ellis, 2013-11-27


1. Calculation of the DFT

Filter design so far has been oriented to time-domain processing: cheaper!
But frequency-domain processing makes some problems very simple:

  x[n] → DFT → X[k] → Fourier-domain processing → Y[k] → IDFT → y[n]

Use all of x[n], or use short-time windows.
Need an efficient way to calculate the DFT.


The DFT

Recall the DFT, the discrete transform of a discrete sequence:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \qquad W_N = e^{-j 2\pi / N}$$

$W_N$ lies on the unit circle at angle $-2\pi/N$, so $W_N^r$ has only N distinct values.

Matrix form:

$$\begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ \vdots \\ X[N-1] \end{bmatrix} =
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & W_N^{1} & W_N^{2} & \cdots & W_N^{(N-1)} \\
1 & W_N^{2} & W_N^{4} & \cdots & W_N^{2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & W_N^{(N-1)} & W_N^{2(N-1)} & \cdots & W_N^{(N-1)^2}
\end{bmatrix}
\begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \\ x[N-1] \end{bmatrix}$$

Structure → opportunities for efficiency.


Computational Complexity

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}$$

N complex multiplies + (N-1) complex adds per point (k), × N points (k = 0..N-1).
Complex multiply: (a+jb)(c+jd) = ac - bd + j(ad + bc) = 4 real mults + 2 real adds.
Complex add = 2 real adds.
N points: $4N^2$ real mults, $4N^2 - 2N$ real adds.


Goertzel's Algorithm

Since $W_N^{-kN} = 1$:

$$X[k] = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = W_N^{-kN} \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{-k(N-\ell)}$$

This looks like a convolution, i.e. $X[k] = y_k[N]$ where $y_k[n] = x_e[n] \circledast h_k[n]$, with

$$x_e[n] = \begin{cases} x[n] & 0 \le n < N \\ 0 & n = N \end{cases}
\qquad
h_k[n] = \begin{cases} W_N^{-kn} & n \ge 0 \\ 0 & n < 0 \end{cases}$$

Flowgraph: first-order recursion $y_k[n] = x_e[n] + W_N^{-k}\, y_k[n-1]$, with $x_e[N] = 0$, $y_k[-1] = 0$, and $y_k[N] = X[k]$.


Goertzel's Algorithm

Separate 'filters' for each X[k]: can calculate for just a few values of k.
No large buffer, no coefficient table.
Same complexity for full X[k] ($4N^2$ mults, $4N^2 - 2N$ adds),
but can halve the multiplies by making the denominator real:

$$H(z) = \frac{1}{1 - W_N^{-k} z^{-1}} = \frac{1 - W_N^{k} z^{-1}}{1 - 2\cos\!\left(\frac{2\pi k}{N}\right) z^{-1} + z^{-2}}$$

Numerator: evaluate only for the last step.
Denominator: 2 real mults per step.


2. Fast Fourier Transform

FFT: reduce the complexity of the DFT from $O(N^2)$ to $O(N \log N)$, which grows more slowly with larger N.
Works by decomposing a large DFT into several stages of smaller DFTs.
Often provided as a highly optimized library.


Decimation in Time (DIT) FFT

Can rearrange the DFT formula in 2 halves:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}, \qquad k = 0 \ldots N-1$$

Arrange terms in pairs:

$$X[k] = \sum_{m=0}^{N/2-1} \left( x[2m]\, W_N^{2mk} + x[2m+1]\, W_N^{(2m+1)k} \right)$$

Group terms from each pair (using $W_N^{2mk} = W_{N/2}^{mk}$):

$$X[k] = \underbrace{\sum_{m=0}^{N/2-1} x[2m]\, W_{N/2}^{mk}}_{X_0[\langle k \rangle_{N/2}]} + W_N^{k} \underbrace{\sum_{m=0}^{N/2-1} x[2m+1]\, W_{N/2}^{mk}}_{X_1[\langle k \rangle_{N/2}]}$$

$X_0$ is the N/2-pt DFT of x for even n; $X_1$ is the N/2-pt DFT of x for odd n.


Decimation in Time (DIT) FFT

$$\mathrm{DFT}_N\{x[n]\} = \mathrm{DFT}_{N/2}\{x_0[n]\} + W_N^{k}\, \mathrm{DFT}_{N/2}\{x_1[n]\}$$

where $x_0[n]$ is x[n] for even n and $x_1[n]$ is x[n] for odd n.

We can evaluate an N-pt DFT as two N/2-pt DFTs (plus a few mults/adds).
But if $\mathrm{DFT}_N\{\cdot\} \sim O(N^2)$, then $\mathrm{DFT}_{N/2}\{\cdot\} \sim O((N/2)^2) = \tfrac{1}{4} O(N^2)$.
Total computation ~ $2 \times \tfrac{1}{4} O(N^2)$ = 1/2 the computation (+ε) of the direct DFT.


One-Stage DIT Flowgraph

$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}]$$

[Flowgraph, N = 8: even points x[0], x[2], x[4], x[6] → DFT_{N/2} → X0[0..3]; odd points x[1], x[3], x[5], x[7] → DFT_{N/2} → X1[0..3]; twiddle factors W_N^0 .. W_N^7 on the X1 branches → X[0] .. X[7].]

"Twiddle factors" always apply to the odd-terms output.
The structure is NOT mirror-image.
X[4..7] are the same as X[0..3] except for the factors on the X1[•] terms.
Classic FFT structure.


Multiple DIT Stages

If decomposing one DFT_N into two smaller DFT_{N/2}'s speeds things up,
why not further divide into DFT_{N/4}'s? i.e.

$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}], \qquad 0 \le k < N$$

make:

$$X_0[k] = X_{00}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{01}[\langle k \rangle_{N/4}], \qquad 0 \le k < N/2$$

$X_{00}$: N/4-pt DFT of the even points in the even subset of x[n].
$X_{01}$: N/4-pt DFT of the odd points from the even subset.

Similarly,

$$X_1[k] = X_{10}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{11}[\langle k \rangle_{N/4}]$$


Two-Stage DIT Flowgraph

[Flowgraph, N = 8: (x[0], x[4]) → DFT_{N/4} → X00; (x[2], x[6]) → DFT_{N/4} → X01; (x[1], x[5]) → DFT_{N/4} → X10; (x[3], x[7]) → DFT_{N/4} → X11. First twiddle stage W_{N/2}^0 .. W_{N/2}^3 gives X0[0..3] and X1[0..3] (different from before). Second twiddle stage W_N^0 .. W_N^7 gives X[0] .. X[7] (same as before).]


Multi-stage DIT FFT

Can keep doing this until we get down to 2-pt DFTs:

$$\mathrm{DFT}_2: \quad X[0] = x[0] + x[1], \qquad X[1] = x[0] - x[1]$$

This is the "butterfly" element, with $1 = W_2^0$ and $-1 = W_2^1$.

→ An $N = 2^M$-pt DFT reduces to M stages of twiddle factors & summation (the $O(N^2)$ part vanishes)
→ real mults < M·4N, real adds < 2M·2N
→ complexity ~ $O(N \cdot M) = O(N \log_2 N)$


FFT Implementation Details

Basic butterfly (at any stage): inputs XX0[r] and XX1[r]; outputs XX[r] (XX1 scaled by $W_N^r$) and XX[r+N/2] (XX1 scaled by $W_N^{r+N/2}$): 2 complex mults.

Can simplify:

$$W_N^{r+N/2} = e^{-j\frac{2\pi}{N}\left(r + \frac{N}{2}\right)} = e^{-j\frac{2\pi r}{N}}\, e^{-j\frac{2\pi}{N}\cdot\frac{N}{2}} = -W_N^{r}$$

So XX1[r] is multiplied by $W_N^r$ once, then added for XX[r] and subtracted for XX[r+N/2] (i.e. SUB rather than ADD): just one complex mult!


8-pt DIT FFT Flowgraph

Bit-reversed indexing of the inputs:

  x[0]  000
  x[4]  100
  x[2]  010
  x[6]  110
  x[1]  001
  x[5]  101
  x[3]  011
  x[7]  111

[Flowgraph: three butterfly stages with twiddles W_4, then W_8, W_8^2, W_8^3, producing X[0] .. X[7] in natural order.]

-1's are absorbed into the summation nodes.
$W_N^0$ disappears.
'In-place' algorithm: sequential stages.


FFT for Other Values of N

Having $N = 2^M$ meant we could divide each stage into 2 halves = "radix-2 FFT".
Same approach works for:
  $N = 3^M$: radix-3
  $N = 4^M$: radix-4 (a more optimized radix-2)
  etc.
Composite N = a·b·c·d → mixed radix (different N/r-point FFTs at each stage)
.. or just zero-pad to make $N = 2^M$.


Inverse FFT

Recall the IDFT:

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, W_N^{-nk}$$

The only differences from the forward DFT are the 1/N scale and the sign of the exponent. Thus:

$$N x^*[n] = \left( \sum_{k=0}^{N-1} X[k]\, W_N^{-nk} \right)^{*} = \sum_{k=0}^{N-1} X^*[k]\, W_N^{nk}$$

This is the forward DFT of $x'[n] = X^*[k]|_{k=n}$, i.e. a time sequence made from the spectrum.
Hence, use the FFT to calculate the IFFT:

$$x[n] = \frac{1}{N} \left[ \sum_{k=0}^{N-1} X^*[k]\, W_N^{nk} \right]^{*}$$

[Pure-real flowgraph: Re{X[k]} and -Im{X[k]} → DFT → Re output × 1/N = Re{x[n]}, Im output × (-1/N) = Im{x[n]}.]


DFT of Real Sequences

If x[n] is pure-real, the DFT wastes mults.
Real x[n] → conjugate-symmetric spectrum: $X[k] = X^*[-k]$.
Given two real sequences, x[n] and w[n], call y[n] = j·w[n] and v[n] = x[n] + y[n].
N-pt DFT: V[k] = X[k] + Y[k], but:

$$V[k] + V^*[-k] = X[k] + \underbrace{X^*[-k]}_{X[k]} + Y[k] + \underbrace{Y^*[-k]}_{-Y[k]}$$

$$\Rightarrow \quad X[k] = \tfrac{1}{2}\left(V[k] + V^*[-k]\right), \qquad W[k] = -\tfrac{j}{2}\left(V[k] - V^*[-k]\right)$$

i.e. compute the DFTs of two N-pt real sequences with a single N-pt DFT.


3. Short-Time Fourier Transform (STFT)

The Fourier Transform (e.g. DTFT) gives the spectrum of an entire sequence.
How to see a time-varying spectrum?
e.g. slow AM of a sinusoid carrier:

$$x[n] = \left(1 - \cos\frac{2\pi n}{N}\right) \cos \omega_0 n$$

[Figure: x[n] versus n, 0 to 1000, amplitude between -2 and 2.]


Fourier Transform of AM Sine

The spectrum of the whole sequence indicates modulation only indirectly,
as cancellation between closely-tuned sines:

$$\sin\frac{2\pi k n}{N}, \qquad -\tfrac{1}{2}\sin\frac{2\pi (k-1) n}{N}, \qquad -\tfrac{1}{2}\sin\frac{2\pi (k+1) n}{N}$$

(from $2\cos A \cos B = \cos(A+B) + \cos(A-B)$)

[Figure: |X[k]| versus ω/π = k/(N/2), and the three component sinusoids versus n.]


Fourier Transform of AM Sine

Sometimes we'd rather separate modulation and carrier:

$$x[n] = A[n] \cos \omega_0 n$$

A[n] varies on a different (slower) timescale than the carrier $\omega_0$.
One approach:
  chop x[n] into short sub-sequences ..
  .. where the slow modulator is ~ constant
  DFT spectrum of the pieces → shows the variation


FT of Short Segments

Break up x[n] into successive, shorter chunks of length $N_{FT}$, then DFT each.

[Figure: x[n] for n = 0..1024 (= N) split into x0[n] .. x7[n], each of length N_FT = N/8 = 128, with the DFT magnitudes X0[k] .. X7[k] for k = 0..64 below.]

Shows the amplitude modulation of the $\omega_0$ energy, which appears at bin

$$k_0 = \frac{\omega_0}{2\pi} \cdot N_{FT}$$


The Spectrogram

Plot successive DFTs $|X_i[k]|$ on the time-frequency plane as $|X[k,n]|$.

[Figure: X0[k] .. X7[k] stacked side by side, frequency index k versus time n = 0..1024.]

Hopsize (between successive frames) = 128 points.
This image is called the Spectrogram.


Short-Time Fourier Transform

Spectrogram = STFT magnitude plotted on the time-frequency plane, i.e. intensity as a function of time & frequency.
The STFT is (DFT form):

$$X[k, n_0] = \sum_{n=0}^{N_{FT}-1} x[n_0 + n]\, w[n]\, e^{-j\frac{2\pi k n}{N_{FT}}}$$

  k: frequency index
  $n_0$: time index
  $x[n_0 + n]$: $N_{FT}$ points of x starting at $n_0$
  w[n]: window
  $e^{-j 2\pi k n / N_{FT}}$: DFT kernel


STFT Window Shape

w[n] provides the 'time localization' of the STFT,
e.g. a rectangular w[n] selects x[n] for $n_0 \le n < n_0 + N_W$.
But the resulting spectrum has the same problems as windowing for FIR design.
DTFT form of the STFT:

$$X(e^{j\omega}, n_0) = \mathrm{DTFT}\{x[n_0 + n]\, w[n]\} = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{j\theta n_0}\, X(e^{j\theta})\, W(e^{j(\omega - \theta)})\, d\theta$$

The spectrum of the short-time window is convolved with the (twisted) parent spectrum.


STFT Window Shape

e.g. if x[n] is a pure sinusoid, convolving $X(e^{j\omega})$ with $W(e^{j\omega})$ gives
blurring (mainlobe) + ghosting (sidelobes).
Hence, use a tapered window for w[n], e.g. Hamming:

$$w[n] = 0.54 + 0.46 \cos\!\left(\frac{2\pi n}{2M+1}\right), \qquad -M \le n \le M$$

Sidelobes < -40 dB.


STFT Window Length

The length of w[n] sets the temporal resolution.
A short window $w_S[n]$ measures only local properties.
A longer window $w_L[n]$ averages spectral character.
Window length ∝ 1/(mainlobe width):
  $w_S[n]$, $N_1$ pts: first zero of $W_S(e^{j\omega})$ at $4\pi/N_1$
  $w_L[n]$, $N_2$ pts: first zero of $W_L(e^{j\omega})$ at $4\pi/N_2$
Shorter window → more time detail, but a more blurred spectrum → less frequency detail.

[Figure: x[n] with short and long windows overlaid; the two windows and their magnitude spectra over -π..π.]


STFT Window Length

Can illustrate the time-frequency tradeoff on the time-frequency plane.
Disks show the 'blurring' due to window length; the area of a disk is constant → Uncertainty principle:

$$\Delta f \cdot \Delta t \ge k$$

Alternate tilings of time-frequency: half-length window → half as many DFT samples.


Spectrograms of Real Sounds

[Figure: time-domain waveform; successive short DFTs for 2.35-2.6 s (freq / Hz, 0-4000, versus time / s, intensity / dB); full spectrogram for 0-2.5 s.]

Individual t-f cells merge into a continuous image.


Narrowband vs. Wideband

Effect of varying window length:
  Window = 48 pt: wideband
  Window = 256 pt: narrowband

[Figure: waveform and the two spectrograms, freq / Hz (0-4000) versus time / s (1.4-2.6), level / dB.]


Spectrogram in Matlab

  >> [d,sr] = wavread('mpgr1_sx419.wav');
  >> Nw = 256;            % (Hann) window length
  >> specgram(d,Nw,sr)    % sr: actual sampling rate (to label time axis)
  >> caxis([-80 0])
  >> colorbar

[Figure: spectrogram, Frequency (0-8000) versus Time (0-3), color scale -80 to 0 dB.]


STFT as a Filterbank

Consider one 'row' of the STFT (just one frequency, k):

$$X_k[n_0] = \sum_{n=0}^{N-1} x[n_0 + n]\, w[n]\, e^{-j\frac{2\pi k n}{N}} = \sum_{m=0}^{-(N-1)} h_k[m]\, x[n_0 - m]$$

This is a convolution with a complex impulse response, where

$$h_k[n] = w[-n]\, e^{j\frac{2\pi k n}{N}}$$

[Figure: complex impulse response h_k[n] plotted as Re and Im versus n, n = -60..0.]

Each STFT row is the output of a filter (subsampled by the STFT hop size).


STFT as a Filterbank

If $h_k[n] = w[-n]\, e^{j\frac{2\pi k n}{N}}$, then

$$H_k(e^{j\omega}) = W\!\left(e^{-j\left(\omega - \frac{2\pi k}{N}\right)}\right)$$

i.e. a shift in ω.
Each STFT row is the same bandpass response defined by $W(e^{j\omega})$, frequency-shifted to a given DFT bin.
A bank of identical, frequency-shifted bandpass filters: "filterbank".

[Figure: |W(e^{jω})|, |H_1(e^{jω})|, |H_2(e^{jω})|, ... along the ω axis up to π.]


STFT Analysis-Synthesis

The IDFT of STFT frames can reconstruct (part of) the original waveform,
e.g. if $X[k, n_0] = \mathrm{DFT}\{x[n_0 + n]\, w[n]\}$ then $\mathrm{IDFT}\{X[k, n_0]\} = x[n_0 + n]\, w[n]$.
Can shift by $n_0$ and combine to get $\hat{x}[n]$ from the pieces $x[n] \cdot w[n - n_0]$.
Could divide by $w[n - n_0]$ to recover x[n]...


STFT Analysis-Synthesis

Dividing by small values of w[n] is bad.
Prefer to overlap windows, i.e. sample $X[k, n_0]$ at $n_0 = r \cdot H$, where H is the hopsize, e.g. H = N/2 for window length N.
Then

$$\hat{x}[n] = \sum_r x[n]\, w[n - rH] = x[n] \quad \text{if} \quad \sum_r w[n - rH] = 1$$


STFT Analysis-Synthesis

Hann or Hamming windows with 50% overlap sum to a constant:

$$\left(0.54 + 0.46\cos\frac{2\pi n}{N}\right) + \left(0.54 + 0.46\cos\frac{2\pi \left(n - \frac{N}{2}\right)}{N}\right) = 1.08$$

[Figure: w[n], w[n - N/2], and their sum versus n.]

Can modify individual frames of X[k,n] and then reconstruct:
  complex, time-varying modifications
  the tapered overlap makes things OK


STFT Analysis-Synthesis

e.g. noise reduction: speech corrupted by white noise.

[Figure: STFT of the original speech and the energy-threshold mask, frequency index k (0-120) versus frame index r (0-300).]
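Goertzel's algorithm, sketched

The real-denominator form of Goertzel's algorithm from Section 1 can be sketched as below. This is a minimal illustration, not the lecture's own code: the function name `goertzel` and the state variables `s1`, `s2` are assumptions chosen for this example. The loop runs the real-coefficient recursion over x[0..N-1] plus the extra zero sample x_e[N] = 0, and the complex numerator $1 - W_N^k z^{-1}$ is applied only once at the end.

```python
import cmath
import math


def goertzel(x, k):
    """Return X[k] of the N-point DFT of x via the second-order Goertzel recursion."""
    N = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / N)  # real feedback coefficient
    s1 = 0.0  # s[n-1]
    s2 = 0.0  # s[n-2]
    # Denominator 1 - 2cos(2*pi*k/N) z^-1 + z^-2, driven by x_e[n]
    # (the input followed by one zero sample, so the loop ends at n = N).
    for sample in list(x) + [0.0]:
        s0 = sample + coeff * s1 - s2
        s2, s1 = s1, s0
    # Numerator 1 - W_N^k z^-1, evaluated only for the last step.
    w_k = cmath.exp(-2j * math.pi * k / N)
    return s1 - w_k * s2


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
    print(goertzel(x, 0))  # the k = 0 bin equals the sum of the samples
```

This form is attractive when only a few values of k are needed; for the full spectrum it costs the same order of work as the direct DFT.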
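Radix-2 DIT FFT, sketched

The decimation-in-time recursion from Section 2 (split into even and odd samples, take two N/2-pt DFTs, combine with one complex multiply per butterfly) can be written very compactly. The sketch below is recursive rather than the in-place, bit-reversed form shown in the 8-pt flowgraph, and the names `fft_dit`, `ifft_via_fft`, and `dft_direct` are hypothetical helpers for this example. The inverse uses the conjugation trick from the Inverse FFT slide.

```python
import cmath
import math


def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return [complex(x[0])]
    if N == 0 or N % 2:
        raise ValueError("length must be a power of 2")
    X0 = fft_dit(x[0::2])  # N/2-pt DFT of the even-indexed samples
    X1 = fft_dit(x[1::2])  # N/2-pt DFT of the odd-indexed samples
    X = [0j] * N
    for r in range(N // 2):
        t = cmath.exp(-2j * math.pi * r / N) * X1[r]  # W_N^r X1[r]: one complex mult
        X[r] = X0[r] + t
        X[r + N // 2] = X0[r] - t  # W_N^(r+N/2) = -W_N^r, so subtract
    return X


def ifft_via_fft(X):
    """Inverse DFT from the forward FFT: x[n] = (1/N) * conj(FFT(conj(X)))[n]."""
    N = len(X)
    Y = fft_dit([complex(c).conjugate() for c in X])
    return [y.conjugate() / N for y in Y]


def dft_direct(x):
    """O(N^2) reference DFT, used only to check the FFT."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.0, 1.5]
    err = max(abs(a - b) for a, b in zip(fft_dit(x), dft_direct(x)))
    print("max difference from direct DFT:", err)
```

A production FFT would precompute the twiddle factors and work in place; the recursion here is meant only to mirror the derivation.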
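Two real DFTs from one complex DFT, sketched

The "DFT of Real Sequences" result can be checked numerically. In this sketch the function names `dft` and `two_real_dfts` are assumptions made for the example, and a direct O(N^2) DFT stands in for the FFT so that the block stays self-contained. The index -k is taken modulo N.

```python
import cmath
import math


def dft(x):
    """Direct N-point DFT (a stand-in for an FFT routine)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def two_real_dfts(x, w):
    """DFTs of two real length-N sequences from one complex DFT of v = x + j*w."""
    N = len(x)
    V = dft([complex(a, b) for a, b in zip(x, w)])
    X, W = [], []
    for k in range(N):
        Vc = V[-k % N].conjugate()        # V*[<-k>_N]
        X.append(0.5 * (V[k] + Vc))       # X[k] = (V[k] + V*[-k]) / 2
        W.append(-0.5j * (V[k] - Vc))     # W[k] = -j (V[k] - V*[-k]) / 2
    return X, W


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0]
    w = [0.0, 1.0, -3.0, 2.0]
    X, W = two_real_dfts(x, w)
    print(max(abs(a - b) for a, b in zip(X, dft(x))))
    print(max(abs(a - b) for a, b in zip(W, dft(w))))
```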
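STFT analysis-synthesis by overlap-add, sketched

The analysis-synthesis idea from Section 3 (window, DFT, IDFT, shift by r·H, and sum) can be illustrated with a small pure-Python example. Assumptions in this sketch: the names `hann`, `dft`, `stft`, and `istft` are hypothetical; the window is a periodic Hann, $w[n] = 0.5 - 0.5\cos(2\pi n/N)$ for n = 0..N-1, chosen because at hop H = N/2 its shifted copies sum to exactly 1 rather than the 1.08 of the Hamming example; and a direct DFT is used instead of an FFT for brevity. Only samples covered by two overlapping frames are reconstructed exactly, so the first and last H samples are excluded from the check.

```python
import cmath
import math


def hann(N):
    """Periodic Hann window; w[n] + w[n + N/2] = 1 for even N."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]


def dft(x, sign=-1):
    """Direct DFT (sign=-1) or unscaled inverse DFT (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def stft(x, N, H):
    """Frames X[k, n0] for n0 = 0, H, 2H, ...: DFT of x[n0 + n] * w[n]."""
    w = hann(N)
    return [dft([x[n0 + n] * w[n] for n in range(N)])
            for n0 in range(0, len(x) - N + 1, H)]


def istft(frames, N, H):
    """Overlap-add the IDFT of each frame at offset r*H."""
    y = [0.0] * (H * (len(frames) - 1) + N)
    for r, X in enumerate(frames):
        seg = dft(X, sign=+1)
        for n in range(N):
            y[r * H + n] += seg[n].real / N  # 1/N completes the IDFT
    return y


if __name__ == "__main__":
    N, H = 16, 8
    x = [math.sin(0.3 * n) * (1.0 - math.cos(2.0 * math.pi * n / 64)) for n in range(64)]
    y = istft(stft(x, N, H), N, H)
    err = max(abs(y[n] - x[n]) for n in range(H, len(x) - H))
    print("max interior reconstruction error:", err)
```

Modifying the frames between `stft` and `istft` (for instance zeroing low-energy cells, as in the noise-reduction example) is where the tapered overlap matters: it cross-fades between differently modified frames.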