The O () notation When analyzing the runtime of an algorithm, we want to consider the time required for large n. We also want to ignore constant factors (which often stem from “tricks” and do not indicate a better procedure). Definition: Let f (n), g(n) be functions of the natural (or real) numbers. We say that f (n) = O (g(n)) if there exist constants A, N such that |f (n)| ≤ |A · g(n)| for all n ≥ N. For example 1 + n = O (n) 2 9n + 7n − 28 = O n 2 31415926 = O (1) We may also write f (n) = g(n) + O (h(n)), meaning that f (n) − g(n) = O (h(n)). For example (remember that 1 + 2 + · · · + n = n(n + 1)/2 = 21 n2 + 12 n) 3 1 + 2 + ··· + n = O n 2 1 + 2 + ··· + n = O n 1 2 n + O (n) 1 + 2 + ··· + n = 2 Lemma: Let f (n) = O (r(n)) and g(n) = O (r(n)). Then: f (n) + g(n) = O (r(n) + s(n)) f (n) · g(n) = O (r(n) · s(n)) f (n) + r(n) = O (r(n)) but: O (f (n)) − O (g(n)) 6= O (f (n) − g(n)) Theorem: Let f be a monotonous function (i.e. f (n1 ) ≥ f (n2 ) whenever n1 ≥ n2 ). Then for every c > 0 and every a > 1 we have that c f (n) (f (n)) = O a In particular nc = O (an ). d Similarly log n = O n for all d > 0. The following list is in “ascending O-order”: 1 log n n3 √ n ··· n n log n n2 1.5n 2n nn ··· Projection on Function Spaces Consider a vector space consisting of (complex-valued) functions. (If you like column vectors consider the discretized version with vectors of the form (f (t1 ), f (t2 ), . . . , f (tn )) obtained from function evaluations.) One can define a scalar product between functions as Z (f, g) = f (t)g(t) dt In the same way as with column vectors, it is now possible to define the projection of a function on a subspace spanned by basis functions bi . We can use projection methods to decompose a function in the subspace into the basis elements: X f (t) = ci bi (t) If the bi form an orthonormal basis, the projection formulae give us that Z ci = (f, bi ) = bi (t)f (t) dt These coefficients ci are called the Fourier Coefficients of f . The Fourier Transformation Consider the functions sin(n · t) and cos(n · t) on the interval [0, 2π]. Then (for n 6= m: Z 2π Z 2π 0 = sin(nt) sin(mt) dt = cos(nt) cos(mt) dt 0 0 Z 2π = sin(n1 t) cos(n2 t) dt 0 Z 2π Z 2π π = sin(nt) sin(nt) dt = cos(nt) cos(nt) dt 0 Thus the function for a subspace. 0 √1 π sin(nt), √1 π cos(nt) form an orthonormal basis Theorem: This subspace consists of all periodic functions. In general, one can consider not just integral, but real multiples. In the limit the infinite sum becomes an integral and we have Z ∞ (∗) f (t) = A(ω) sin(ωt) dω Z0 ∞ + B(ω) cos(ωt) dω 0 with Z 1 A(ω) = f (t) sin(ωt) dt π Z 1 B(ω) = f (t) cos(ωt) dt π These functions A and B indicate the weight of the various frequencies. Applications The Fourier transform has many applications in signal processing. We can use it for example to filter low frequencies: Perceptual Coding Another application is in audio encoding. The perception of loudness depends on the frequency. Also rhe human ear cannot recognize frequencies with low amplitude, if another frequency “close by” has high amplitude: When recording the audio signal, we can remove such “masked” frequencies, this data reduction cannot be spotted by the ear. This principle is used for example in the MPEG standards (e.g. MP3 audio), similar techniques apply to pictures. Interpretation over the complex numbers Things become more uniform if we move to the complex numbers. eix −e−ix eix +e−ix Here we have: sin(x) = , cos(x) = and 2i 2 ea+ib = ea · (cos b + i sin b) The Complex Fourier transform Using these relations, we get the Fourier transform of f as: Z ∞ 1 (F(f ))(ω) = g(ω) = √ f (t)eiωt dt 2π −∞ It fulfills (call f = F(g) the inverse transform of g) Z ∞ 1 f (t) = (F(g))(t) = √ g(ω)e−iωt dω 2π −∞ Discretization On a computer we cannot take infinitely many values but have to discretize an continuous process. For every natural n and 0 ≤ m < n the numbers: e 2πim n lie equally distributed on the complex unit circle. We call these numbers Roots of Unity. Im 6 e Let ω := e 2πi n 2πi3 8 .. .. e 2πi 8 . '$ . .. .. .. .. . .. . .. . . . . .. .. ..&% .. . . Re - . Then ω fulfills: 1. ω n = 1, ω m 6= 1(1 ≤ m < n) n−1 P jp 2. ω = 0 for 1 ≤ p < n j=0 A number (6= 1) fulfilling both properties is called a principal n-th root of unity. If ω is a principal n-th root of unity, then ω −1 is as well. (This is all we need to know about complex numbers.) The integrals of the Fourier transform thus become sums in the Discrete Fourier transform (DFT). Let a = [a0 , . . . , an−1 ]. Then F(a) = [u0 , . . . , un−1 ] with: uj = n−1 X ak ω jk k=0 Pn−1 If we define a polynomial p(x) = k=0 ak xk , we have uj = p(ω j ), the Fourier transformation thus consists of multiple polynomial evaluations. Similarly we define the inverse discrete Fourier transform: n−1 1X F(u) = [v0 , . . . , vn−1 ] with vj = uk ω −jk . n k=0 It is easily checked that F and F are mutually inverse. Furthermore the inverse can be computed by the same algorithm, replacing ω by ω −1 . If a ∈ Rn , a naı̈ve evaluation of the DFT takes O (n2 ) operations. Our next aim will be to develop an O (n log n) algorithm. Fast Fourier Transformation A particular efficient way to evaluate the discrete Fourier transform is the Fast-Fourier transform (FFT) algorithm of C O OLEY and T U K EY: Let u = F(a). Then we can write u = V · a with 1 1 1 ... 1 1 ω 2 n−1 ω . . . ω 2 4 2(n−1) V = 1 ω ω . . . ω . . .. . . . .. .. .. . . 1 ω n−1 ω 2(n−1) . . . ω (n−1)(n−1) a Vandermonde matrix. Let us consider the case n = 8. u 1 1 1 0 u 1 1 ω ω2 u 1 ω2 ω4 2 u 1 ω3 ω6 3 = u4 1 ω 4 1 u5 1 ω 5 ω 2 u6 1 ω 6 ω 4 u7 1 ω7 ω6 Then 1 1 1 1 1 ω3 ω4 ω5 ω6 ω7 ω6 1 ω2 ω4 ω6 ω1 ω4 ω7 ω2 ω5 ω4 1 ω4 1 ω4 ω7 ω4 ω1 ω6 ω3 ω2 1 ω6 ω4 ω2 ω5 ω4 ω3 ω2 ω1 a0 a1 a2 a3 a4 a5 a6 a7 We swap the columns to sort by even/odd indices: [0, 2, 4, 6, 1, 3, 5, 7]: u0 1 1 1 1 1 1 1 1 a0 u 1 1 ω 2 ω 4 ω 6 ω ω 3 ω 5 ω 7 a2 u 1 ω4 1 ω4 ω2 ω6 ω2 ω6 a 2 4 u 1 ω6 ω4 ω2 ω3 ω1 ω7 ω5 a 3 6 = u 4 1 1 1 1 ω 4 ω 4 ω 4 ω 4 a1 u 5 1 ω 2 ω 4 ω 6 ω 5 ω 7 ω 1 ω 3 a3 u 6 1 ω 4 1 ω 4 ω 6 ω 2 ω 6 ω 2 a5 u7 1 ω6 ω4 ω2 ω7 ω5 ω3 ω1 a7 We observe that we can write (ω, ω 3 , ω 5 , ω 7 ) = ω(1, ω 2 , ω 4 , ω 6 ) = ω(1, ζ, ζ 2 , ζ 3 ) (ω 2 , ω 6 , ω 2 , ω 6 ) = ω 2 (1, ω 4 , 1, ω 4 ) = ω 2 (1, ζ 2 , 1, ζ 2 ) (ω 3 , ω, ω 7 , ω 5 ) = ω 3 (1, ω 6 , ω 4 , ω 2 ) = ω 3 (1, ζ 3 , ζ 2 , ζ) where ζ = ω 2 is a primitive 4-th root of unity and we have a 4-dimensional fourier transform given by u0 1 1 1 1 a0 u 1 ζ ζ2 ζ3 a 1 1 · . = u 2 1 ζ 2 1 ζ 2 a2 1 ζ3 ζ2 ζ a3 u3 In other words: We can write the transformation matrix F8 as a product: 1 0 F4 0 0 F8 = 1 0 F − 4 0 0 0 ω 0 0 0 0 · F4 0 ω3 0 0 · F4 0 ω3 0 ω2 0 0 0 0 ω 0 0 ω2 0 0 where F4 is the matrix for the 4-dimensional transformation. We can now apply the same method recursively to the 4-dimensional transformation and so on. Eventually, this eliminates all matrix multiplication by scalar multiplication. For an n-dimensional problem we have log(n) iterations, each step of which has a cost of O (n). The total cost is thus O (n log n). The same process works for any power of 2 (filling up the data if there are fewer points). The inverse transform uses the same algorithm with ω replaced by ω −1 . function FFT(n,a,ω); (n a power of 2) Output: u = F(a). begin u := []; if n = 1 then u[0] := a[0]; else m := n/2; c :=FFT(m,ae ,ω 2 ); d :=FFT(m,ao ,ω 2 ); for j ∈ [0..m − 1] do u[j] := c[j] + ω j d[j]; u[j + m] := c[j] − ω j d[j]; od; fi; return u; end Example: Sunspots We want to analyze the number of sunspots per year. Astronomers have counted these for about three centuries, their number is called the Wolfer number. We will use MATLAB to perform a fourier transform. If we log in remotely (from another X-windows workstation), we have to redirect graphics output: $ setenv DISPLAY mycomputer.math.colostate.edu:0 $ matlab Once in MATLAB we load the data (which is supplied as a sample) and split the list of pairs in a list of years and of counts. We also plot the curve. load sunspot.dat year=sunspot(:,1); wolfer=sunspot(:,2); plot(year,wolfer) title(’Sunspot Data’) We now perform a fast fourier transformation. This stores the coefficients in a list. The first coefficient (cos(0x) = 1) gives an absolut shift (zero bias) that is usually not interesting, so we eliminate it. Y = fft(wolfer); Y(1)=0; The first n2 list entries correspond to positive frequencies, the last n2 corresponding to negative frequencies. Since our input signal is real, we need to consider only the entries corresponding to positive frequencies. (The rest will be their complex conjugates.) Since we are only interested in the intensities for the various frequencies, we only consider the (absolute values of) the first n2 entries. Taking their square (proportional to the power per frequency) will emphasize peaks. n=length(Y); intens = abs(Y(1:n/2)); power = intens.ˆ2;; To obtain the frequencies, we note that our frequency resolution will be 12 times the sampling frequency (Nyquist rate/Shannon’s theorem). Thus we create a equidistributed list of length n2 which goes from 1 to rate . Also, since the frequencies are very small, we instead will use 2 their reciprocal values. rate=1; nyquist = 1/2; freq = (1:n/2)/(n/2)*nyquist*rate; period=1./freq; Finally we plot the frequency distribution and set the view window to x ∈ [0, 40], y ∈ [0, 2 · 107 ], which fits the plot. plot(period,power); ylabel(’Power’); xlabel(’Period (Years/Cycle)’); axis([0 40 0 2e+7]);