Fast Fourier Transform Fast Fourier Transform Applications. ■ ■ ■ Perhaps single algorithmic discovery that has had the greatest practical impact in history. Optics, acoustics, quantum physics, telecommunications, systems theory, signal processing, speech recognition, data compression. Progress in these areas limited by lack of fast algorithms. History. ■ Cooley-Tukey (1965) revolutionized all of these areas. ■ Danielson-Lanczos (1942) efficient algorithm. ■ Runge-König (1924) laid theoretical groundwork. ■ Gauss (1805, 1866) describes similar algorithm. ■ Importance not realized until advent of digital computers. Jean Baptiste Joseph Fourier (1768-1830) 2 Polynomials: Coefficient Representation Polynomials: Point-Value Representation Degree n polynomial. Degree n polynomial. p( x ) = a0 + a1 x + a 2 x + L + a n−1 x 2 n−1 ■ Uniquely specified by knowing p(x) at n different values of x. {(x0 , y0 ), (x1 , y1 ),K, (xn-1 , yn−1 ) }, q( x ) = b0 + b1 x + b2 x 2 + L + bn−1 x n−1 where yk = p( xk ) Addition: O(n) ops. p( x ) + q( x ) = (a0 + b0 ) + (a1 + b1 ) x + L + (a n−1 + bn−1 ) x n−1 y = p(x) Evaluation: O(n) using Horner’s method. p( x ) = a0 + ( x a1 + x (a 2 + L + x (a n− 2 + x (a n−1 ) )L ) ) yj Multiplication (convolution): O(n2). xj j p( x ) × q( x ) = (a0 b0 ) + (a0b1 + a1b0 ) x + L + ( ∑ a k b j −k ) x j + L + (a n−1bn−1 ) x 2 n−2 x k =0 3 4 Polynomials: Point-Value Representation Best of Both Worlds Degree n polynomial. Can we get "fast" multiplication and evaluation? {(x0 , y0 ), (x1 , y1 ),K, (xn-1 , yn−1 ) }, where yk = p( xk ) {(x0 , z0 ), (x1 , z1 ),K, (xn-1 , zn−1 ) }, where z k = q( xk ) Representation Multiplication Evaluation coefficient O(n2) O(n) point-value O(n) O(n2) FFT O(n log n) O(n log n) Addition: O(n). {(x0 , y0 + z 0 ), (x1 , y1 + z1 ), K, (x n-1 , yn−1 + z n−1 ) } ! Yes! Convert back and forth between two representations. Multiplication: O(n), but need 2n points. {(x0 , a0 , a1, y0 z0 ), (x1 , y1 z1 ) ,K, (x 2n-1 , y2 n−1 z 2 n−1 ) } b0 , b1, Evaluation: O(n2) using Lagrange’s formula. n −1 k =0 evaluation FFT ∏ (x − x j ) j ≠k p( x ) = ∑ y k K, an-1 K, bn-1 p(x0 ), p( x1), ∏ ( xk − x j ) q(x0 ), q ( x1), j ≠k coefficient multiplication c0 , c1, O(n2) interpolation inverse FFT O(n log n) K, p( x2n-1) K, q( x2n-1) K, c2n-2 point-value multiplication O(n) r(x 0 ), r ( x1), O(n log n) K , r ( x2n-1) 5 6 Converting Between Representations: Naïve Solution Fast Interpolation: Key Idea Evaluation (coefficient to point-value). ■ ■ Key idea: choose {x0, x1, . . . , xn-1 } to make computation easier! Given a polynomial p(x) =a0 + a2x1 + . . . + an-1 xn-1, choose n distinct points {x0, x1, . . . , xn-1 } and compute yk = p(xk), for each k using Horner’s method. ■ O(n2). ■ ■ Given n distinct points {x0, x1, . . . , xn-1 } and yk = p(xk), compute the coefficients {a0, a1, . . . , an-1 } by solving the following linear system of equations. M M Note Vandermonde matrix is invertible iff xk are distinct. O(n3). L x L x L x M O M L x 1 x0 x 02 1 x1 x12 x 22 1 x2 2 1 x n −1 x n −1 M M n −1 0 n −1 1 n −1 2 L x L x L x M O M L x 1 x0 x 02 1 x1 x12 x 22 1 x2 2 1 x n −1 x n −1 Interpolation (point-value to coefficient). ■ Set xk = xj? n −1 0 n −1 1 n −1 2 y0 a0 y1 a1 a = y 2 2 y n −1 a n −1 0 n −1 M M y0 a0 y1 a1 a = y 2 2 y n −1 a n − n − 1 1 n −1 M M 7 8 Fast Interpolation: Key Idea Fast Interpolation: Key Idea Key idea: choose {x0, x1, . . . , xn-1 } to make computation easier! ■ ■ Key idea: choose {x0, x1, . . . , xn-1 } to make computation easier! Set xk = xj? Use negative numbers: set xk = -xj so that (xk )2 = (xj )2. – set xk = -xn/2 + k 1 17 (17)2 1 5 (5 )2 2 1 − 17 (−17) 1 − 5 (−5 )2 M M M M – – – M M (17)3 (5 )3 M (−17)3 (−5 )3 M M L L M L L ■ Set xk = xj? ■ Use negative numbers: set xk = -xj so that (xk )2 = (xj )2. – ■ (17)n −1 a y0 (5 )n −1 0 a1 y1 a2 = y 2 n −1 (−17) y (−5 )n −1 a n −1 n −1 M M M set xk = -xn/2 + k Use complex numbers: set x k = ωk where ω is nth root of unity. – (xk )2 = (-xn/2 + k )2 – (xk )4 = (-xn/4 + k )4 – (xk )8 = (-xn/8 + k )8 L L L L O L 1 1 1 1 y0 1 2 3 1 ω ω ω y1 2 y ω4 ω6 1 ω 2 = 3 y3 ω6 ω9 1 ω y 1 ω n −1 ω 2( n −1) ω 3(n −1) n −1 M M E = peven(x2) = a0 + a2172 + a4174 + a6176 + . . . + an-217n-2 O = x podd(x2) = a117 + a3173 + a5175 + a7177 + . . . + an-117n - 1 y0 = E + O, yn/2 = E - O M M M M 2(n −1) ω 3(n −1) ω (n −1)(n −1) ω 1 ω n −1 M a0 a1 a 2 a3 a n −1 M 9 10 Roots of Unity Roots of Unity: Properties L1: Let ω be the principal nth root of unity. If n > 0, then ωn/2 = -1. An nth root of unity is a complex number z such that zn = 1. ■ ω = e 2π i / n = principal n th root of unity. it ■ e ■ i2 = -1. ■ ■ Proof: ω = e 2π i / n ⇒ ωn/2 = e π i = -1. (Euler’s formula) = cos t + i sin t. L2: Let n > 0 be even, and let ω and ν be the principal nth and (n/2)th roots of unity. Then (ωk ) 2 = ν k . There are exactly n roots of unity: ωk, k = 0, 1, . . . , n-1. ■ ω2 = i L3: Let n > 0 be even. Then, the squares of the n complex nth roots of unity are the n/2 complex (n/2) th roots of unity. ω1 ω3 ■ ω4 ω0 = -1 Proof: (ωk ) 2 = e (2k)2π i / n = e (k) 2π i / (n / 2) = ν k . =1 Proof: If we square all of the nth roots of unity, then each (n/2) th root is obtained exactly twice since: – – – ω7 ω5 – ω6 = -i 11 L1 ⇒ ωk + n / 2 = - ωk thus, (ωk + n / 2) 2 = (ωk ) 2 L2 ⇒ both of these = ν k ωk + n / 2 and ωk have the same square 12 Divide-and-Conquer FFT Algorithm Given degree n polynomial p(x) = a0 + a1x1 + a2 x2 + . . . + an-1 xn-1. ■ ■ ■ FFT (n, a0, a1, a2, . . . , an-1) Assume n is a power of 2, and let ω be the principal n th root of unity. if (n == 1) return a0 Define even and odd polynomials: – peven(x) := a0 + a2x1 + a4x2 + a6x3 + . . . + an-2 xn/2 - 1 – podd(x) := a1 + a3x1 + a5x2 + a7x3 + . . . + an-1 xn/2 - 1 – p(x) = peven(x2) + x podd(x2) ω ← e 2π i / n (e0,e1,e2,...,en/2-1) ← FFT(n/2, a0,a2,a4,...,an-2) (d0,d1,d2,...,dn/2-1) ← FFT(n/2, a1,a3,a5,...,an-1) for k = 0 to n/2 - 1 yk ← ek + ωk dk yk+n/2 ← ek - ωk dk Reduces problem of – evaluating degree n polynomial p(x) at ω0, ω1, . . . , ωn-1 to – ■ // n is a power of 2 evaluating two degree n/2 polynomials at: (ω0)2, (ω1)2, . . . , (ωn-1)2 . O(n) complex multiplies if we pre-compute ωk. return (y0,y1,y2,...,yn-1) L3 ⇒ peven(x) and podd(x) only evaluated at n/2 complex (n/2) th roots of unity. T (n ) = 2T (n / 2) + O (n ) ⇒ T (n ) = O (n log n ) 13 14 Recursion Tree Proof of Correctness Proof of correctness. Need to show yk = p(ωk ) for each k = 0, . . . , n-1, where ω is the principal n th root of unity. a0, a1, a2, a3, a4, a5, a6, a7 ■ ■ a0, a2, a4, a6 a0, a4 a1, a3, a5, a7 a2, a6 yk a3, a7 a1, a5 a4 a2 a6 a1 a5 a3 j =0 Base case. n = 1 ⇒ ω = 1. Algorithm returns y0 = a0 = a0 ω0 . Induction step. Assume algorithm correct for n / 2. – let ν be the principal (n/2) th root of unity – ek = peven(ν k ) = peven(ω2k ) by Lemma 2 – dk = podd(νk ) = podd(ω2k ) by Lemma 2 – recall p(x) = peven(x2) + x podd(x2) = ek + ω k d k = peven (ω 2k ) + ω k podd (ω 2k ) = a0 n −1 p(ω k ) = ∑ a j ω k j p(ω ) k y k +n / 2 = ek − ω k d k = peven (ω 2k ) − ω k podd (ω 2k ) = peven (ω 2k ) + ω k + n / 2 podd (ω 2k ) peven (ω 2k + n ) + ω k + n / 2 podd (ω 2k + n ) p(ω k + n / 2 ) a7 "bit-reversed" order 15 16 Best of Both Worlds Inverse FFT Can we get "fast" multiplication and evaluation? Forward FFT: given {a0, a1, . . . , an-1 } , compute {y0, y1, . . . , yn-1 } . Representation Multiplication Evaluation coefficient O(n2) O(n) point-value O(n) O(n2) FFT O(n log n) O(n log n) 1 1 1 1 y0 1 2 ω ω3 1 ω y1 2 4 y 1 ω ω ω6 2 = 1 ω3 ω6 ω9 y3 y 1 ω n −1 ω 2( n −1) ω 3(n −1) n −1 M ! Yes! Convert back and forth between two representations. K, an-1 b0 , b1,K , bn-1 a0 , a1, evaluation FFT coefficient multiplication O(n2) q(x0 ), q ( x1), K, p( x2n-1) K, q( x2n-1) point-value multiplication O(n) r(x 0 ), r ( x1), M M ω ω 2(n −1) ω 3(n −1) ω ( n −1)(n −1) 1 n −1 M a0 a1 a 2 a3 a n −1 M Inverse FFT: given {y0, y1, . . . , yn-1 } compute {a0, a1, . . . , an-1 }. K, c2n-2 interpolation inverse FFT O(n log n) p(x0 ), p( x1), c0 , c1, M M L L L L O L 1 1 1 1 a0 1 2 3 1 ω ω ω a1 1 ω2 a ω4 ω6 2 = 1 ω3 ω6 ω9 a3 a 1 ω n −1 ω 2(n −1) ω 3( n −1) n −1 O(n log n) M K , r ( x2n-1) M M M M L L L L O L ω ω 2(n −1) ω 3(n −1) ω (n −1)(n −1) 1 −1 y0 y1 y 2 y3 y n −1 n −1 M M 17 18 Inverse FFT Inverse FFT: Proof of Correctness Great news: same algorithm as FFT, except use ω-1 as "principal" nth root of unity (and divide by n). Fn Fn−1 = 1 = n 1 1 1 ω1 1 ω2 1 ω2 ω4 1 ω ω M 3 M 6 M 1 ω (n −1) ω 2(n −1) 1 1 1 ω −1 ω −2 1 ω −2 ω −4 1 ω −3 ω −6 M M 1 M 1 ω −( n −1) ω − 2(n −1) Summation lemma. Let ω be a primitive n th root of unity. Then L 1 L ω L ω ω L ω ω M O M ω Lω L 1 1 L ω ω L ω ω L ω ω M O M ω Lω 1 ω3 6 9 3( n −1) −3 −6 −9 − 3(n −1) n −1 ∑ω (n −1) 2(n −1) 3(n −1) (n −1)(n −1) j =0 ■ ■ − ( n −1) − 2(n −1) − 3(n −1) − ( n −1)( n −1) kj n k ≡ 0 mod n = 0 otherwise If k is a multiple of n then ωk = 1. Each nth root of unity ωk is a root of xn - 1 = (x - 1) (1 + x + x2 + . . . + xn-1), if ωk ≠ 1 we have: 1 + ωk + ωk(2) + . . . + ωk(n-1) = 0. Claim: Fn and Fn-1 are inverses. (Fn Fn−1) i i ′ = = n −1 ∑ ω ij j =0 1 n ω − j i′ n n −1 ′ ∑ ω (i − i ) j j =0 1 if i = i ′ = 0 otherwise 19 20 Inverse FFT: Algorithm Best of Both Worlds Can we get "fast" multiplication and evaluation? IFFT (n, a0, a1, a2, . . . , an-1) if (n == 1) return a0 // n is a power of 2 Representation Multiplication Evaluation coefficient O(n2) O(n) point-value O(n) O(n2) FFT O(n log n) O(n log n) ω ← e-2πi/n (e0,e1,e2,...,en/2-1) ← FFT(n/2, a0,a2,a4,...,an-2) (d0,d1,d2,...,dn/2-1) ← FFT(n/2, a1,a3,a5,...,an-1) for k = 0 to n/2 - 1 yk ← (ek + ωk dk ) / n yk+n/2 ← (ek - ωk dk ) / n ! Yes! Convert back and forth between two representations. a0 , a1, return (y0,y1,y2,...,yn-1) b0 , b1, K, an-1 K, bn-1 evaluation FFT p(x0 ), p( x1), q(x0 ), q ( x1), 21 Integer Arithmetic Multiply two n-digit integers: a = an-1 . . . a1a0 and b = bn-1 . . . b1b0. ■ Form two degree n polynomials. ■ Note: a = p(10), b = q(10). n −1 p( x ) = ∑ a j x j j =0 ■ ■ ■ n −1 q( x ) = ∑ b j x j j =0 Compute product using FFT in O(n log n) steps. Evaluate r(10) = a × b. r ( x ) = p( x ) × q ( x ) Problem: O(n log n) complex arithmetic steps. Solution. ■ ■ Strassen (1968): carry out arithmetic to suitable precision. – T(n) = O(n T(log n)) ⇒ T(n) = O(n log n (log log n)1+ε ) Schönhage-Strassen (1971): use modular arithmetic. – T(n) = O(n log n log log n) 23 coefficient multiplication O(n2) O(n log n) K, p( x2n-1) K, q( x2n-1) c0 , c1, K, c2n-2 interpolation inverse FFT point-value multiplication O(n) r(x 0 ), r ( x1), O(n log n) K , r ( x2n-1) 22