Overview •Importance of Features •Mathematical Notation & Background •Fourier Transform •Windowed Fourier Transform •Wavelets •Feature Anecdote •Literature & Homework Fall 2004 Pattern Recognition for Vision General Remarks Classifier Feature vector (x1, x2 ,…, xn) Feature Extraction Photograph by MIT OCW. “The choice of features is more important than the choice of the classifier.” Fall 2004 Pattern Recognition for Vision General Remarks—Application specific Traffic sign recognition Color, shape (Hough transform) Texture Recognition DFT, WT Action Recognition Motion-based features Fall 2004 Pattern Recognition for Vision General Remarks Is the choice of features really more important than the choice of the classifier? We know that a fly uses optical flow features for navigation. Still, technical systems using optical flow for navigation are (far) behind capabilities of a fly. Fall 2004 Pattern Recognition for Vision General Remarks—why talk about FT, WFT, WT? Fourier Transform(FT), Windowed FT (WFT) and Wavelet Transform (WT) •used in many computer vision applications •derivation from signal processing •basic tools for engineers Other features: color, motion features (optical flow), gradient features, SIFT (orientation histograms), affine invariant features, steerable filters (overcomplete wavelets), … Fall 2004 Pattern Recognition for Vision Background—Notation Z, R, C H, L, L (R ) 2 f function spaces Norm a + jb = a 2 + b 2 Absolute value f ,g Inner product f (t ) Complex conjugate fˆ (ω ) Fourier transform cos 2πω t + j sin 2πω t = e j 2πω t Fall 2004 integers, real, complex Euler formula Pattern Recognition for Vision Background—Vector and Function Spaces Vectors and Functions u = [u1 ,..., u N ] T f (n), n ∈ Z f (t ), t ∈ R Fall 2004 Pattern Recognition for Vision Background—Inner Product&Norm Inner Product&Norm N u, v ≡ ∑ un vn u ≡ u, u 2 n =1 2 L norm: u 2 L2 = ∑ un 2 n ∞ f ( n), g ( n) ≡ ∑ f (n) g (n) f (n) ≡ f (n), f (n) 2 −∞ f (t ), g (t ) ≡ ∞ ∫ f (t ) g (t ) dt f (t ) ≡ f (t ), f (t ) 2 −∞ Fall 2004 Pattern Recognition for Vision Background—Function Spaces Inner Product cont. f (t ) > 0 for all f ∈ H, f ≠ 0 Positivity: f , g = g, f Hermiticity: Linearity: f , cg + h = c f , g + f , h for f , g , h ∈ H, c ∈C Triangle & Schwarz inequality f +g ≤ f + g , f,g ≤ f g for all f , g ∈ H L2 (R ) Function space, finite energy { H ≡ f : R → C, f Fall 2004 2 } ≡ ∫ f (t ) dt < ∞ 2 Pattern Recognition for Vision Background—Basis Basis of a Vector and Function Space N N b ,..., b is a basis of C if ∀ u ∈ C {1 N} N u = ∑ un b n , n =1 {u1 ,..., uN } is unique, un = { f1 ,..., f N } is a basis of H if ∀ g ∈ N g (t ) = ∑ cn f n (t ), n =1 bn , u H {cn ,..., cN } is unique, cn = f n (t ), g (t ) g (t ) = ∫ g� (ω ) fω (t ) dω , g� (ω ) is unique, g� (ω ) = f ω (t ), g (t ) Orthonormal Basis fω , fω ′ Fall 2004 0 ω ≠ ω ′ = 1 ω = ω ′ f ω = fω g� (ω ) = fω , g Pattern Recognition for Vision Fourier Series—Continuous Signal Continuous, periodic signal 1.5 T /2 ck = 1 0.5 ∫ f (t )e − j 2π kt T dt −T / 2 0 sk = e -0.5 -1 -15 -10 -5 0 5 10 T f (t ) periodic with period T f (t ) ∈ L2 ([ −T / 2, T / 2]) Fall 2004 15 j 2π kt T ck = sk , f , sk , sl T = T δ kl 2 T ∞ , f (t ) T 1 f (t ) = ∑ ck e T k =−∞ 1 ∞ = ∑ ck T k =−∞ 2 j 2π kt T Pattern Recognition for Vision Fourier Series—Examples 1 1 0.8 0.9 0.6 0.8 0.4 0.7 0.2 0.6 0 0 0.1 0.2 0.3 0.4 0.5 0.5 -0.2 0.4 -0.4 0.3 -0.6 0.2 -0.8 0.1 -1 -8 -6 -4 0 -2 0.6 0.7 0.8 0.9 1 −1/ T 0 2 4 6 8 T 2 1/ T 1 1.8 0.8 1.6 τ 1.4 1.2 0.6 0.4 1/ T 1 0.2 0.8 0.6 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.4 0 -5 1/ τ -0.2 0.2 -4 -3 -2 -1 0 1 2 3 4 5 -0.4 -4 -3 -2 -1 0 1 2 3 4 T Fall 2004 Pattern Recognition for Vision Fourier Transform f(x) Continuous signals, f (t ) ∈ L2 (R ) T →∞ 1 1 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.6 0.5 0.5 0.4 0.4 1 0.3 0.5 0 0.3 0.5 0 0.1 0.2 0.3 0.4 0.2 0.2 0.1 0.1 0 -2 -1.5 -1 -0.5 0 L2 ([ −T / 2, T / 2]) → L2 (R ) 0 x ωk = k / T 0.6 0.7 0.5 T /2 ck = ∫ 0.8 1 0.9 1 1.5 1 ∆ω = ωk +1 − ωk = T 2 f (t )e − j 2πωk t dt → fˆ (ω ) = ∞ ∫ f (t )e− j 2πωt dt −∞ −T / 2 j 2π kt ∞ 1 ∞ f (t ) = ∑ ck e T = ∑ ck e j 2πωk ∆ωk T k =−∞ k =−∞ T → ∞ f (t ) = ∞ ∫ fˆ (ω )e j 2πωt d ω −∞ Fall 2004 Pattern Recognition for Vision Fourier Transform Another look at the inverse Fourier transform ∞ ∞ j 2πω t ' ' − j 2πω t ' f (t ) = ∫ ∫ f (t )e dt e dω −∞ −∞ = ∞ ∞ ∫∫ −∞ −∞ Fall 2004 f (t ' )e − j 2πω (t −t ) dω dt ' = ' ∞ ∫ f (t ' )δ (t − t ' ) dt ' −∞ Pattern Recognition for Vision Fourier Transform—Properties Linearity: A f (t ) + B g (t ) Shift: f (t − t 0 ) ∞ Convolution: ∫ A fˆ (ω ) + Bgˆ (ω ) − j 2πωt 0 ˆ e f (ω ) fˆ (ω ) gˆ (ω ) f (t ' ) g ( t − t ' ) d t ' −∞ Derivative: d f (t ) dt j 2πω fˆ (ω ) f ( At ) 1 ˆ ω f A A Scaling: ∞ Correlation : ∫ f (t ' ) g (t + t ' )dt ' fˆ (ω ) gˆ (ω ) −∞ ∞ Autocorrelation: ∫ f (t ) f (t + t )dt ' ' ' fˆ (ω ) 2 −∞ Fall 2004 Pattern Recognition for Vision Fourier Transform—Plancherel&Parseval Plancherel’s Theorem ∞ ∫ f dt = 2 −∞ ∞ ∫ 2 fˆ dω −∞ f 2 = fˆ 2 Parseval Identity ∞ ∫ −∞ f g dt = ∞ ∫ fˆ gˆ dω −∞ f , g = fˆ , gˆ Fall 2004 Pattern Recognition for Vision Fourier Transform—Examples Time e−t 2 σ 2π e /(2σ ) 2 1 1 1 1 0.9 0.9 0.9 0.9 0.8 0.8 0.8 0.8 0.7 0.7 0.7 0.7 0.6 0.6 0.6 0.6 0.5 0.5 0.5 0.5 0.4 0.4 0.4 0.4 1 0.3 0.5 0 0 0.1 0.2 0.3 0.4 F(f) f(x) Frequency 0.3 0.5 1 0.3 0.5 0 0.6 0.7 0.8 0.9 1 0.2 0.2 0.2 0.1 0.1 0.1 0 -2 -1.5 -1 -0.5 0 0 x 0.5 1 1.5 0.1 0.2 -1.5 0.3 -1 0.4 -0.5 1 1 0.3 0.5 0.2 0.1 0 0 f 0.6 0.7 0.5 0.8 1 0.9 1.5 1 2 sin(πω T ) = T sinc(ω T ) πω T T rect (t / T ) 1.5 0 -2 2 0 −2π 2σ 2ω 2 1 0.9 0.9 0.8 0.8 0.8 0.6 0.7 0.7 1 f(x) 0.6 0.6 0.4 0.5 0.5 0.2 0.4 0.4 0.5 1 0.5 0 0 0.1 0.2 0.3 0.4 0.3 0.5 0 0.6 0.7 0.8 0.9 0 1 0.1 0.2 0.3 0.4 0.3 0.5 0.6 0.7 0.8 0.9 1 0.2 0.2 -0.2 0.1 0.1 0 -1 -0.8 -0.6 -0.4 -0.2 0 0 0.2 0.4 0.6 0.8 1 -0.4 -4 -3 -2 -1 0 0 1 2 3 4 x Fall 2004 Pattern Recognition for Vision Fourier Transform—Examples Time Frequency sin(2πΩt ) = 2Ω sinc(2Ωt ) 2Ω 2πΩt 1 rect (ω /(2Ω)) 1 1.5 1 0.9 0.9 0.8 0.8 0.8 0.6 0.7 0.7 1 0.6 0.4 f(x) 0.6 0.5 0.5 0.2 0.4 0.4 0.5 0 0 0.1 0.2 0.3 0.4 0.3 0.5 1 0.5 0 0.6 0.7 0.8 0.9 1 0 0.1 0.2 0.3 0.4 0.3 0.5 0.6 0.7 0.2 0.4 0.8 0.9 1 0.2 0.2 -0.2 0.1 0.1 -0.4 -4 -3 -2 -1 0 0 1 2 3 4 0 -1 -0.8 -0.6 -0.4 -0.2 0 0 0.6 0.8 1 x Ω Fall 2004 Pattern Recognition for Vision Fourier Transform—Sampling Theorem 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Ω 0 -4 -3 -2 -1 0 1 2 3 4 1/(2Ω ) fˆ (ω ) = 0, for ω > Ω ∞ n f (t ) = ∑ sinc ( 2Ωt − n ) f Ω 2 n =−∞ Fall 2004 Pattern Recognition for Vision Fourier Transform—Sampling Theorem fˆ (ω ) = 0, for ω > Ω, f (t ) = ∞ ∑ sinc ( 2Ωt − n ) f n =−∞ n 2 Ω ∞ n , f (t ) = ∑ sinc ( 2Ω(t − tn ) ) f ( tn ) tn = 2Ω n =−∞ expand fˆ (ω ) into a Fourier series: Ω ∞ 1 fˆ (ω ) = cn e− j 2πωtn , cn = ∫ fˆ (ω )e j 2πωtn d ω = ∫ fˆ (ω )e j 2πωtn d ω = f (tn ) ∑ 2Ω −Ω −∞ ������� Inverse FT f (t ) = ∞ ∫ fˆ (ω )e j 2πωt d ω = −∞ Ω ∫ fˆ (ω )e j 2πωt d ω −Ω Ω Ω 1 1 j 2πω ( t −tn ) j 2πω ( t −tn ) f t e d ω f t e ( ) ( ) dω = ∫ = ∑ ∑ n n ∫ 2Ω 2Ω −Ω −Ω ��������� Inverse FT of rect function = ∞ ∑ sinc ( 2Ω(t − t ) ) f ( t ) n =−∞ Fall 2004 n n basis of bandlimited functions Pattern Recognition for Vision Fourier Series—Discrete Signals Discrete Fourier Transform (DFT) Discrete, periodic signal 1 1 0.9 0.9 0.8 0.8 0.7 0.7 0.6 0.6 0.5 0.5 0.4 0.4 0.3 0.3 0.2 0.2 0.1 0.1 0 -4 Fall 2004 -3 -2 -1 0 N −1 ck = ∑ f (n) e − j 2π kn N n =0 1 f ( n) = N 0 1 2 3 N −1 ∑c k =0 k e j 2π kn N 4 Pattern Recognition for Vision Fourier Transform—Summary Time Continuous, periodic Continuous Fall 2004 Frequency FS FT Discrete Continuous Pattern Recognition for Vision Fourier Transform—Summary Frequency Time DFT Discrete, periodic Fall 2004 Discrete, periodic Pattern Recognition for Vision Discrete Fourier Transform—Fast Fourier Transform Decimation in Space N −1 ck = ∑ f (n) e − j 2π kn N O( N 2 ) n=0 = N / 2 −1 ∑ − j 2π k 2 n N N / 2 −1 ∑ � ��������� ��������� f (2n) e + n=0 = N / 2 −1 ∑ f (2n +1) e n=0 f (2n) e − j 2π kn N /2 +e − j 2π k N / 2 −1 N n=0 Wk ∑ − j 2π k N = −e ∑ cko − j 2π (k + N / 2) N N / 2 −1 f (2n) e − j 2π kn N /2 = ∑ n=0 Fall 2004 ⇔ Wk = −Wk + N / 2 N / 2 −1 n=0 N / 2 −1 f (2n +1) e − j 2π kn N /2 n=0 ck e e − j 2π k (2n+1) N ∑ f (2n) e − j 2π ( k + N / 2)n N /2 ⇔ cke = cke+ N / 2 n=0 f (2n +1) e − j 2π kn N /2 = N / 2 −1 ∑ f (2 n +1) e − j 2π ( k + N / 2)n N /2 ⇔ cko = cko+ N / 2 n=0 Pattern Recognition for Vision Discrete Fourier Transform— Fast Fourier Transform Wk = −Wk + N / 2 ck = cke +Wk cko 0 ≤ k < N / 2 e o ck + N / 2 = ck −Wk ck cke = cke+ N / 2 cko = cko+ N / 2 N −1 ck = ∑ f (n) e Recursion: − j 2π kn N ĉ0 = ĉ + ĉ e 0 n=0 N ' −1 c ek = ∑ f (2n) e − j 2π kn N' n=0 N −1 ' c k = ∑ f (2n +1) e o − j 2π kn N ' n=0 N = N / 2 O( N / 2) ' Fall 2004 2 o 0 ĉ1 = ĉ0e − ĉ0o Complexity: M / 2 ld( M ) Multiplications M ld(M ) Summations Pattern Recognition for Vision Discrete Fourier Transform—Image Analysis x y N −1 ∑ ∑ y =0 f ( x, y ) e − j 2π k x x N − j 2π k y y e M x=0 − j 2π k y y − j 2π k x x 1 N M ( , ) f x y e e = ∑ ∑ MN y = 0 x = 0 M −1 N −1 Courtesy of Professors Tomaso Poggio and Sayan Mukherjee. Used with permission. M 1-D DFT’s rows N 1-D DFT’s columns Fall 2004 1 c( k x , k y ) = MN M −1 N −1 c ( k x , y ) = ∑ f ( x, y ) e − j 2π k x x N x =0 M −1 c( k x , k y ) = ∑ c(k x , y ) e − j 2π k y y M y =0 Pattern Recognition for Vision Discrete Fourier Transform—Image Analysis Example y x Fall 2004 ky kx Pattern Recognition for Vision Discrete Fourier Transform—Image Analysis x Image y (B) (A) Spectrum ωy (A) Vertical edges ωx Horizontal edges (B) Amplitude of (A) & Phase of (B) Fall 2004 Courtesy of Professors Tomaso Poggio and Sayan Mukherjee. Used with permission. Pattern Recognition for Vision Discrete Fourier Transform—Image Analysis Template Matching corr ( x) = ∞ ∫ f (x ' ) g ( x ' − x)dx ' g ′( x ) = g(−x) −∞ corr ( x) = f ∗ g ′ = ∞ ∫ f (x ' )g ′(x − x ' )dx ' fˆ (ω ) gˆ ′(ω ) −∞ g ( x) f ( x) Fall 2004 corr ( x) Pattern Recognition for Vision Discrete Fourier Transform—Image Analysis Shape description with Fourier descriptors y r r ϕ ϕ x Fall 2004 Invariance: •Scale •Rotation •Translation Pattern Recognition for Vision Discrete Fourier Transform—Image Analysis Shape description with Fourier descriptors Im Re Im t t Re Fall 2004 t Pattern Recognition for Vision Windowed Fourier Transform—Motivation Fourier transform of a chirp signal cos(π t 2 ), ωinst = t Fall 2004 Pattern Recognition for Vision Windowed Fourier Transform Windowed Fourier Transform (WFT) f� (ω , t ) = ∞ ∫ f (u ) g (u − t )e − j 2πω u supp g ⊂ [ −T , 0] du −∞ supp ft ⊂ [t − T , t ] ft (u ) = f (u ) g (u − t ) Fourier transform: f� (ω , t ) = ∞ ∫ ft (u )e − j 2πω u du −∞ g (u − t ) f (u ) t −T Fall 2004 t u Pattern Recognition for Vision Windowed Fourier Transform—Time Frequency Symmetry f� (ω , t ) = ∞ ∫ f (u ) g (u − t )e− j 2πω u du −∞ substitute g (u − t )e j 2πωu by gω ,t (u ) f� (ω , t ) = gˆω ,t (ν ) = ∞ ∞ ∫ Parseval’s Identity gω ,t (u ) f (u )du = gω ,t , f = gˆ ω ,t , fˆ −∞ ∫ g (u − t )e j 2πωu − j 2πν u e −∞ du = ∞ ∫ g (u − t )e − j 2π u (ν −ω ) du , −∞ substitute u ′ by u − t : ∞ ∫ −∞ g (u ′)e − j 2π ( u′+t )(ν −ω ) du ′ = e − j 2π t (ν −ω ) gˆ (ν − ω ) f� (ω , t ) = e − j 2πωt ∞ ∫ gˆ (ν − ω ) fˆ (ν )e j 2πν t dν −∞ Fall 2004 Pattern Recognition for Vision Windowed Fourier Transform—Time Frequency Localization Time Frequency Symmetry f� (ω , t ) = f� (ω , t ) = e f (u ) ∞ ∫ f (u ) g (u − t )e −∞ ∞ − j 2πω t ∫ − j 2πω u du gˆ (ν − ω ) fˆ (ν )e j 2πν t dν −∞ ω g (u − t0 ) f� (ω , t0 ) supp g ⊂ [ −T / 2, T / 2] fˆ (ν ) t0 ω gˆ (ν − ω 0 ) ω0 Fall 2004 u ω0 ν t0 t f� (ω 0 , t ) t Pattern Recognition for Vision Windowed Fourier Transform—Time Frequency Localization Time Frequency Localization gˆ (ω ) = 1 2 g (t ) = 1 2 tm = σt = 2 ∞ ∞ ωm = ∫ t g (t ) dt 2 −∞ ∫ (t − tm ) g (t ) dt 2 2 σω = 2 ∞ ∞ ∫ ω gˆ (ω ) 2 dω −∞ ∫ (ω − ω ) gˆ (ω ) d ω 2 2 m −∞ −∞ Heisenberg’s uncertainty principle 4πσ ωσ t ≥ 1 g (t ) = (2a) e 1/ 4 −π at 2 gˆ (ω ) = (2 / a) e 1/ 4 Fall 2004 −πω 2 / a σt = 1 4π a t m = ωm = 0 a σω = 4π Pattern Recognition for Vision Windowed Fourier Transform—Time Frequency Localization Uncertainty Principle Time 1 1 0.8 0.9 0.6 0.8 0.4 0.7 0.2 0.6 0 0 0.1 0.2 0.3 0.4 0.5 0.5 -0.2 0.4 -0.4 0.3 -0.6 0.2 -0.8 0.1 -1 -8 -6 -4 -2 0 0 Frequency 0.6 0.7 2 0.8 4 T 0.9 6 1 −1/ T 8 1 0.8 0.9 0.6 0.8 0.4 0.7 0.2 0.6 0 −T T 1/ T 1 0 0.1 0.2 0.3 0.4 0.5 0.5 -0.2 0.4 -0.4 0.3 -0.6 0.2 -0.8 0.1 -1 -8 -6 -4 -2 0 0 0.6 0.7 2 0.8 4 0.9 6 1 8 1/ T Fall 2004 Pattern Recognition for Vision Windowed Fourier Transform—Time Frequency Localization Time Frequency Localization ω f� (ω 0 , t0 ) describes f σω ω0 σt σt σω t0 Fall 2004 σω σω σt within a resolution cell: σt Uncertainty principle: [t0 ± σ t ][ω0 ± σ ω ] 1 σ tσ ω > 4π t Pattern Recognition for Vision Windowed Fourier—Reconstruction Reconstruction Formula 1 f (u ) = ∫∫ g ω ,t (u ) f� (ω , t )dω dt C = g C 2 f t (u ) = f (u ) g (u − t ) f� (ω , t ) = ∞ ∫ ft (u )e − j 2πω u du −∞ g (u − t ) f (u ) = ∞ ∫ f� (ω , t )e j 2πωu d ω −∞ ∞ f (u ) ∞ ∞ � (ω , t ) g (u − t )e j 2πωu d ω dt g u t dt f ( − ) = ∫−∞ ∫−∞ −∞∫ �� ���� � gω ,t ( u ) ������ � 2 C Fall 2004 Pattern Recognition for Vision Windowed Fourier—Reconstruction Reconstruction Formula f (u ) u ω Reconstruction t Fall 2004 Pattern Recognition for Vision Windowed Fourier—Redundancy Redundancy ∞ ∞ 1 f (u ) = ∫ ∫ f� (ω , t ) g (u − t )e j 2πωu dω dt �� ���� � C −∞ −∞ g (u ) ω ,t Parseval for WFT f� (ω , t ) = gω ,t (u ), f (u ) L2 ( R ) 1 g� ω ,t (ω ′, t ′), f� (ω ′, t ′) = C g� ω ,t (ω ′, t ′) = gω ,t (u ), gω ′,t ′ (u ) = K g (ω , t ω ′, t ′) = L2 ( R 2 ) ∞ ∫ gω ,t (u ) gω ′,t ′ (u ) du −∞ �f (ω , t ) = 1 K (ω , t ω ′, t ′) f� (ω ′, t ′)dt ′dω ′ g ���� � C ∫∫ �� K g (ω ′,t ′ ω ,t ) -Values of f� are correlated -Not any function f� (ω , t ) ∈ L2 can be a WFT - f� (ω , t ) ∈ F ,F ⊂ L2 is a reproducing kernel Hilbert space Fall 2004 Pattern Recognition for Vision Windowed Fourier Transform—Sampling Theorem Sampling Theorem f� (n, m) = ω ∞ ∫ f (u ) g (u − nts )e − j 2π mω su du −∞ Reconstruction if ω s ts < 1 ωs ts Fall 2004 t Pattern Recognition for Vision Windowed Fourier Transform—Examples cos(π u ), ωinst = u 2 Fall 2004 Pattern Recognition for Vision Time Scale Analysis—Motivation f (t ) ω t σt Reconstruction t Features with time scales much shorter and longer than σ t have to be synthesized with many notes. Fall 2004 Pattern Recognition for Vision Time Scale Analysis—Continuous Wavelet Transform ψ s ,t (u ) = 1 s p ∞ u −t 2 ∈ u L ψ , ψ ( ) , s ≠ 0, p > 0 s f� ( s, t ) = ∫ ψ s ,t ( u ) f (u ) du = ψ s ,t , f , f ∈L 2 −∞ ψ (u ) = ue ψ −0.5,−10 red −u2 Fall 2004 ψ 1,0 green ψ 2,15 blue Pattern Recognition for Vision Wavelet Transform—Reconstruction Reconstruction Formula Admissibility condition: ψˆ (±ω ) 0 < C± = ∫ dω < ∞ ω 0 ∞ 2 ∞ ∞ f (u ) = ∫ ∫ ψ s ,t (u ) f� ( s, t ) s 2 p −3dtds 0 −∞ s ,t ψ { } reciprocal wavelet family of {ψ s,t } C C s ,t if C− = C+ = then ψ s ,t = ψ , {ψ s ,t } is self reciprocal: 2 2 ∞ ∞ 2 f (u ) = ∫ ∫ ψ s ,t (u ) f� ( s, t ) s 2 p −3 dtds C 0 −∞ Fall 2004 Pattern Recognition for Vision Wavelet Transform—Reconstruction Admissibility condition: ∞ 0 < C± = ∫ 0 ψˆ (0) = 0 ψˆ (±ω ) ω 2 dω < ∞ ∞ ∫ ψ (u )du = 0 −∞ Symmetry condition: C− = C+ if ψ is symmetric, ψˆ is symmetric if ψ is real-valued, ψˆ ( −ω ) = ψˆ (ω ) Fall 2004 Pattern Recognition for Vision Wavelet Transform—Plancherel&Parseval Plancherel Formula 1 �f , f� = L C +C + − ∞ ∞ ∫∫ s 2 p −3 f� ( s, t ) dsdt 2 −∞ −∞ Parseval Identity f,g L2 ( R ) = f� , g� 1 �f , g� = L C +C + − Fall 2004 L ∞ ∞ ∫∫ ∀ f , g ∈ L (R ) 2 s 2 p −3 f� ( s, t ) g� ( s, t )dsdt −∞ −∞ Pattern Recognition for Vision Wavelet Transform—Time Frequency Symmetry 1 u −t ψ s ,t (u ) = ψ s s p = 0.5, s > 0 : ψˆ s (ω ) = s e− j 2πω tψˆ ( sω ) ∞ f� ( s, t ) = ψ s ,t , f = ψˆ s ,t , fˆ = ∫ ψˆ s ,t (ω ) fˆ (ω )du −∞ ∞ f� ( s, t ) = s ∫ ψˆ ( sω ) fˆ (ω ) e j 2πω t dω −∞ �f ( s, t ) = 1 s Fall 2004 ∞ u −t ∫−∞ψ s f (u )du Pattern Recognition for Vision Wavelet Transform—Time Frequency Localization Time Frequency Localization ψ (u ) is centered at t0 with σ t , ψˆ (ω ) is centered at ω0 with σ ω �f ( s, t ′) = 1 s ∞ u − t′ ∫−∞ψ s f (u )du time window centered at s t0 + t ′ with sσ t ∞ f� ( s, t ′) = s ∫ ψˆ ( sω ) fˆ (ω ) e j 2πω t ′ dω −∞ frequency window centered at ω 0 / s with σ ω / s Fall 2004 Pattern Recognition for Vision Wavelet Transform—Time Frequency Localization Time Frequency Localization ω s 1/ 2 1 2ω 0 ω0 2 ω0 / 2 Fall 2004 0.5σ t 2σ ω σω 0.5σ ω 0.5σ t size of resolution cells is constant: 2σ ω σt 2σ t σω 0.5σ ω σt σ tσ ω 2σ t t Pattern Recognition for Vision Wavelet Transform—Redundancy Redundancy f1 , f 2 2 L = f�1 , f�2 L f� ( s, t ) = ψ s ,t , f ψ� s′,t ′ = ψ s ,t ,ψ s′,t ′ L2 L2 = ψ� s ,t , f� L = Kψ ( s′, t ′ s, t ) �f ( s, t ) = s 2 p −3 K ( s′, t ′ s, t ) f� ( s′, t ′)ds′t ′ ψ ∫∫ -Values of f� are correlated -Not any function f� ( s, t ) ∈ L2 can be a WFT - f� ( s, t ) ∈ L, L ⊂ L2 is a reproducing kernel Hilbert space Fall 2004 Pattern Recognition for Vision Wavelet Transform—Frames Notion of Frames f� ( s, t ) = ψ s ,t , f x u3 u2 e3 T e2 X Y e1 u1 Hilbert space X , dim X = N {u1 ,..., u M } ∈ X , M > N , Hilbert space Y , dim Y = M y = Tx Tx = ∑ x, u j e j , T : X → Y j x, T *y = Tx, y , T * : Y → X Unlike the basis the frames {u1 ,..., u M } can be linearly dependent Fall 2004 Pattern Recognition for Vision Wavelet Transform—Frames if x ∈ X is uniquely determined by Tx ∈ Y then {u1 ,..., u M } ∈ X is a frame of X T is injective {u1 ,..., u M } ∈ X is a frame of X if: A x ≤ Tx ≤ B x 2 2 2 ∀x ∈ X , B ≥ A ≥ 0 Reconstruction of x from y = Tx: x = Sy = (T *T ) −1T *y G = T *T is selfadjoint → eigenvalues are real Frames are the framework for continuous and discrete wavelets Fall 2004 Pattern Recognition for Vision Wavelet Transform—Discrete Wavelets �f ( s, t ) = s 2 p −3 K ( s′, t ′ s, t ) f� ( s′, t ′)ds′t ′ ψ ∫∫ Redundancy: Discrete wavelets ψ ( s, t ), s, t ∈ R to ψ (a, b), a, b ∈ Z Exponential sampling s(a) = σ a , σ > 1, elementary dilation step t (a, b) = bσ a β , β > 0 Fall 2004 Pattern Recognition for Vision Wavelet Transform—Discrete Wavelets Sampling of the s, t plane s σ1 σ 0 b =1 b=2 a =1 β a=0 σx a=x a t ( a , b ) = b σ β, β > 0 s(a) = σ , σ > 1 t a Fall 2004 Pattern Recognition for Vision Wavelet Transform—Multiresolution Analysis Multiresolution analysis of discrete signals f (n), n ∈ Ζ Sampling f� ( s, t ) on a dyadic grid, orthogonal wavelets: σ = 2, β = 1: s = 2a , t = b 2a a, b ∈ Ζ Two half band filters with impulse response h(n), g (n) f H ( n) = f ( n) ∗ h( n) = ∑ f ( k ) h( n − k ) k f L ( n) = f ( n) ∗ g ( n) = ∑ f ( k ) g ( n − k ) k Downsampling: f H * ( n) = f H (2n), f L* ( n) = f L (2n) Fall 2004 Pattern Recognition for Vision Wavelet Transform—Multiresolution Analysis g ( n) f ( n) f L* ,l (n) 2 Analysis Wavelet coefficients h ( n) 2 f L ,l ( n ) f H * ,l ( n) 2 g * ( n) f H * ,l ( n) + Reconstruction 2 h* ( n ) g (n), h(n) are a Quadrature mirror filter Fall 2004 Pattern Recognition for Vision Wavelet Transform—Multiresolution Analysis Haar Wavelets h 1 0.5 1 2 t h are orthonormal easy to compute but poor frequency resolution g 1 t Fall 2004 Pattern Recognition for Vision Wavelet Transform—Multiresolution Analysis Daubechie Wavelets g (t ) h(t ) g (t ) h(t ) f (t ) f ( n) f * ( n) Matlab Documentation Fall 2004 Pattern Recognition for Vision Wavelet Transform—Multiresolution Analysis Haar Wavelets (Matlab Toolbox) Screenshot from Matlab Toolbox removed due to copyright reasons. Fall 2004 Pattern Recognition for Vision Wavelet Transform—Applications •Image Compression •Texture Analysis •De-noising •Features for Object Detection and Recognition Fall 2004 Pattern Recognition for Vision My Face Detector in 2000 Face examples Classification Result Off-line training Photograph by MIT OCW. Classifier Non-linear SVM 5 min / image Feature vector (x1, x2 ,…, xn) Feature Extraction Non-face examples Pixel pattern 19x19 Search for faces at different resolutions and locations Photograph by MIT OCW. Fall 2004 Pattern Recognition for Vision My Face Detector in 2000 My experiments with different types of features Photographs courtesy of CMU/VASC Image Database at _________________________________________ http://vasc.ri.cmu.edu/idb/html/face/frontal_images/ Gradient AI Memo 2000 Fall 2004 Pattern Recognition for Vision Speeding-up Face Detection 19x19 detection Finalpolynomial classifier 17x17 polynomial classifier Original image 3x3 classifier 9x9 classifier 17x17 linear classifier 30% 1% 7% 8% 22% Propagated 70% 100% Removed Photograph courtesy of CMU/VASC Image Database at http://vasc.ri.cmu.edu/idb/html/face/frontal_images/ Fall 2004 Pattern Recognition for Vision Hierarchical Face Detection—Heisele, CVPR 2001 Fall 2004 Pattern Recognition for Vision Viola&Jones Detector—Viola&Jones, CVPR 2001 Speed is proportional to the average number of features computed per sub-window. On the MIT+CMU test set, an average of 9 features out of a total of 6061 are computed per sub-window. On a 700 Mhz Pentium III, a 384x288 pixel image takes about 0.067 seconds to process (15 fps). Roughly 15 times faster than Rowley-Baluja-Kanade and 600 times faster than Schneiderman-Kanade. Fall 2004 Pattern Recognition for Vision Viola&Jones Detector—Image Features Photo removed due to copyright considerations. “Rectangle filters” Similar to Haar wavelets Differences between sums of pixels in adjacent rectangles { ht(x) = +1 if ft(x) > θt -1 otherwise 160,000 × 100 = 16,000,000 Unique Features Series of photos and figures from Rapid object detection using a boosted cascade of simple features. Viola, P., and M. Jones. Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference 1 (2001): I-511 - I-518. Graphs courtesy of IEEE. Copyright 2001 IEEE. Used with Permission. Fall 2004 Pattern Recognition for Vision Viola&Jones Detector How exactly does it work?? Read the CVPR 2001 paper……. or wait until the lecture on object detection by Mike Jones Fall 2004 Pattern Recognition for Vision Integral Image—Viola&Jones CVPR 2001 Define the Integral Image Any rectangular sum can be computed in constant time: I ' ( x, y ) = ∑ I ( x ' , y ' ) x '≤ x y '≤ y Rectangle features can be computed as differences between rectangles D = 1 + 4 − (2 + 3) = A + ( A + B + C + D) − ( A + C + A + B) =D Series of photos and figures from Rapid object detection using a boosted cascade of simple features. Viola, P., and M. Jones. Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference 1 (2001): I-511 - I-518. Graphs courtesy of IEEE. Copyright 2001 IEEE. Used with Permission. Fall 2004 Pattern Recognition for Vision Example Classifier for Face Detection—Viola&Jones CVPR 2001 A classifier with 200 rectangle features was learned using AdaBoost 95% correct detection on test set with 1 in 14084 false positives. Not quite competitive... Photographs removed due to copyright considerations. ROC curve for 200 feature classifier Series of photos and figures from Rapid object detection using a boosted cascade of simple features. Viola, P., and M. Jones. Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference 1 (2001): I-511 - I-518. Graphs courtesy of IEEE. Copyright 2001 IEEE. Used with Permission. Fall 2004 Pattern Recognition for Vision Literature Ingrid Daubechies “Ten Lectures on Wavelets” For mathematicians only, many proofs Gerald Kaiser “A Friendly Guide to Wavelets” Not friendly, but easier to understand than above Christian Blatter “Wavelets—A Primer” more friendly than Kaiser… Stephane Mallat “Multifrequency Channel Decomposition” IEEE Acoustics Speech & Signal Proc. 1989 One of the early papers on multiresolution analysis Fall 2004 Pattern Recognition for Vision Homework •Some proofs (optional) •Template matching with Fourier transform Shape representation with Fourier descriptors (stick to instructions) •Playing with wavelets Fall 2004 Pattern Recognition for Vision