Analysis of High-Dimensional Signal Data by Manifold Learning and Convolution Transforms

Mijail Guillemard and Armin Iske
University of Hamburg

DRWA2010, Alba di Canazei, 6-9 September 2010


Outline

Application from Neuro- and Bioscience
- Electromyogram (EMG) Signal Analysis
- Objectives and Problem Formulation
- Manifold Learning and Convolution Transforms

Some more Background
- Dimensionality Reduction: PCA, MDS, and Isomap
- Differential Geometry: Curvature Analysis

Numerical Examples
- Parameterization of Scale- and Frequency-Modulated Signals
- Manifold Evolution under the Wave Equation
- Geometrical and Topological Distortions through Convolutions


EMG Signal Analysis in Neuro and Bioscience

An EMG signal is an electrical measurement combining multiple action potentials propagating along motor neural cells. EMG signal analysis provides physiological information about muscle and nerve interactions.

Applications: diagnosis of neural diseases, athletic performance analysis.
Goal: develop fast and accurate methods for EMG signal analysis.
Tools: dimensionality reduction; manifold learning.

[Photo: Peggy busy weight lifting.]

Collaborators: Deutscher Olympia-Stützpunkt Rudern, Hamburg;
Prof. Klaus Mattes (Bewegungs- und Trainingswissenschaft, UHH)


Action Potential and Surface EMG

[Figures: action potential propagation and surface EMG measurement.]


Objectives and Problem Formulation

Manifold Learning by Dimensionality Reduction.
- Input data: $X = \{x_1, \dots, x_m\} \subset M \subset \mathbb{R}^n$;
- Hypothesis: $Y = \{y_1, \dots, y_m\} \subset \Omega \subset \mathbb{R}^d$, $d \ll n$, with a nonlinear embedding map $A : \Omega \to \mathbb{R}^n$, $X = A(Y)$;
- Task: recover $Y$ (and $\Omega$) from $X$.

$$\mathbb{R}^d \supset \Omega \supset Y \ \xrightarrow{\;A\;}\ X \subset M \subset \mathbb{R}^n \ \xrightarrow{\;P\;}\ \Omega' \subset \mathbb{R}^d$$


Objectives and Method Description

[Figure: schematic of the method pipeline.]


Manifold Learning Techniques

Geometry-based Dimensionality Reduction Methods.
- Principal Component Analysis (PCA)
- Multidimensional Scaling (MDS)
- Isomap
- Supervised Isomap
- Local Tangent Space Alignment (LTSA)
- Riemannian Normal Coordinates (RNC, 2005)


Principal Component Analysis (PCA)

Problem: Given points $X = \{x_k\}_{k=1}^m \subset \mathbb{R}^n$, find the hyperplane $H \subset \mathbb{R}^n$ closest to $X$, i.e., find an orthogonal projection $P : \mathbb{R}^n \to \mathbb{R}^n$ with $\mathrm{rank}(P) = p < n$ minimizing
$$d(P, X) = \sum_{k=1}^m \|x_k - P(x_k)\|^2.$$

Theorem: Let $X = (x_1, \dots, x_m) \in \mathbb{R}^{n \times m}$ be scattered with zero mean,
$$\frac{1}{m} \sum_{k=1}^m x_k = 0.$$
Then, for an orthogonal projection $P : \mathbb{R}^n \to \mathbb{R}^n$ with $\mathrm{rank}(P) = p < n$, the following are equivalent:
(a) The projection $P$ minimizes the distance $d(P, X) = \sum_{k=1}^m \|x_k - P(x_k)\|^2$.
(b) The projection $P$ maximizes the variance $\mathrm{var}(P(X)) = \sum_{k=1}^m \|P(x_k)\|^2$.
(c) The range of the projection $P$ is spanned by $p$ eigenvectors corresponding to the $p$ largest eigenvalues of the covariance matrix $XX^T \in \mathbb{R}^{n \times n}$.

Proof: By the Pythagorean theorem, we have
$$\|x_k\|^2 = \|x_k - P(x_k)\|^2 + \|P(x_k)\|^2 \qquad \text{for } k = 1, \dots, m,$$
and so
$$d(P, X) = \sum_{k=1}^m \|x_k\|^2 - \mathrm{var}(P(X)).$$
Therefore, minimizing $d(P, X)$ is equivalent to maximizing $\mathrm{var}(P(X))$, i.e., characterizations (a) and (b) are equivalent.

To study property (c), let $\{v_1, \dots, v_p\} \subset \mathbb{R}^n$ denote an orthonormal set of vectors such that
$$P(x_k) = \sum_{i=1}^p \langle v_i, x_k \rangle \, v_i,$$
and therefore
$$\mathrm{var}(P(X)) = \sum_{k=1}^m \|P(x_k)\|^2 = \sum_{k=1}^m \sum_{i=1}^p |\langle v_i, x_k \rangle|^2.$$

Now,
$$\sum_{k=1}^m |\langle v_i, x_k \rangle|^2 = \sum_{k=1}^m \langle v_i, x_k \rangle \langle x_k, v_i \rangle = \Big\langle v_i, \sum_{k=1}^m x_k \langle x_k, v_i \rangle \Big\rangle = \Big\langle v_i, \sum_{k=1}^m x_k \sum_{j=1}^n x_{jk} v_{ji} \Big\rangle.$$
For the covariance matrix
$$S = XX^T = \Big( \sum_{k=1}^m x_{\ell k} x_{jk} \Big)_{1 \le \ell, j \le n}$$
we obtain
$$(S v_i)_\ell = \sum_{j=1}^n \sum_{k=1}^m x_{\ell k} x_{jk} v_{ji} = \sum_{k=1}^m x_{\ell k} \sum_{j=1}^n x_{jk} v_{ji}.$$
Therefore,
$$\sum_{k=1}^m |\langle v_i, x_k \rangle|^2 = \langle v_i, S v_i \rangle,$$
and so
$$\mathrm{var}(P(X)) = \sum_{k=1}^m \|P(x_k)\|^2 = \sum_{i=1}^p \langle v_i, S v_i \rangle.$$
Now, maximizing $\mathrm{var}(P(X))$ is equivalent to selecting $p$ eigenvectors $v_1, \dots, v_p$ corresponding to the $p$ largest eigenvalues of $S$. The latter observation relies on the following

Lemma: Let $S \in \mathbb{R}^{n \times n}$ be a symmetric matrix. Then the function $W(v) = \langle v, S v \rangle$, for $v \in \mathbb{R}^n$ with $\|v\| = 1$, attains its maximum at an eigenvector of $S$ corresponding to the largest eigenvalue of $S$. Proof: Exercise.

Conclusion. In order to compute $P$ minimizing $d(P, X)$, perform the following steps (a minimal code sketch follows after this list):
(a) Compute the singular value decomposition of $X$, so that $X = U D V^T$, with a diagonal matrix $D$ containing the singular values of $X$, and $U$, $V$ unitary matrices.
(b) This yields the eigendecomposition
$$XX^T = U (D D^T) U^T$$
of the covariance matrix $XX^T$.
(c) Take the $p$ (orthonormal) eigenvectors $v_1, \dots, v_p$ in $U$ corresponding to the $p$ largest singular values in $D D^T$, and so obtain the required projection
$$P(x) = \sum_{i=1}^p \langle x, v_i \rangle \, v_i.$$
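Steps (a)-(c) map directly onto a few lines of NumPy. The following is a minimal sketch, not the authors' implementation; the function name and the synthetic test data are illustrative assumptions:

```python
import numpy as np

def pca_projection(X, p):
    """PCA of the columns of X in R^n via SVD, following steps (a)-(c).

    X : (n, m) array, one data point x_k per column.
    Returns centred data Xc, its projection P(Xc), and p-dim coordinates Y.
    """
    Xc = X - X.mean(axis=1, keepdims=True)   # enforce the zero-mean assumption
    U, D, Vt = np.linalg.svd(Xc, full_matrices=False)  # (a) X = U D V^T
    Vp = U[:, :p]                            # (b),(c): top-p eigenvectors of XX^T
    P_Xc = Vp @ (Vp.T @ Xc)                  # P(x) = sum_i <x, v_i> v_i
    Y = Vp.T @ Xc                            # coordinates in the principal subspace
    return Xc, P_Xc, Y

# illustrative test: 200 points scattered near a 2-plane in R^10
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 200))
X += 0.01 * rng.standard_normal((10, 200))
Xc, P_Xc, Y = pca_projection(X, p=2)
print(np.linalg.norm(Xc - P_Xc))            # small residual: data is nearly 2-dimensional
```

Working with the SVD of the centred data avoids forming the covariance matrix $XX^T$ explicitly, which is numerically preferable when $n$ is large.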
Multidimensional Scaling (MDS)

Problem (MDS): Given the distance matrix $D_X = (\|x_i - x_j\|^2)_{1 \le i,j \le m} \in \mathbb{R}^{m \times m}$ of $X = (x_1 \dots x_m) \in \mathbb{R}^{n \times m}$, find $Y = (y_1 \dots y_m) \in \mathbb{R}^{p \times m}$ minimizing
$$d(Y, D) = \sum_{i,j=1}^m \big( d_{ij} - \|y_i - y_j\|^2 \big).$$

Solution to the Multidimensional Scaling Problem.

Theorem: Let $D = (\|x_i - x_j\|^2)_{1 \le i,j \le m} \in \mathbb{R}^{m \times m}$ be a given distance matrix of $X = (x_1, \dots, x_m) \in \mathbb{R}^{n \times m}$ with
$$\frac{1}{m} \sum_{i=1}^m x_i = 0.$$
Then the matrix $X$ can (up to orthogonal transformation) be recovered from $D$ by
$$Y = \mathrm{diag}(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_m}) \, U^T,$$
where $\lambda_1 \ge \dots \ge \lambda_m \ge 0$ are the eigenvalues and the columns of $U \in \mathbb{R}^{m \times m}$ the corresponding eigenvectors of the Gram matrix
$$X^T X = -\frac{1}{2} J D J, \qquad \text{with } J = I - \frac{1}{m} e e^T, \quad e = (1, \dots, 1)^T \in \mathbb{R}^m.$$

Proof (sketch): Note that
$$d_{ij} = \|x_i - x_j\|^2 = (x_i, x_i) - 2 (x_i, x_j) + (x_j, x_j).$$
This implies the relation
$$D = A e^T - 2 X^T X + e A^T \qquad \text{for } A = [(x_1, x_1), \dots, (x_m, x_m)]^T \in \mathbb{R}^m.$$
Now regard $J = I_m - \frac{1}{m} e e^T \in \mathbb{R}^{m \times m}$. Since $J e = 0$, and since the zero-mean assumption gives $J X^T = X^T$, applying $J$ on both sides of the above relation implies
$$X^T X = -\frac{1}{2} J D J.$$
Therefore, the eigendecomposition $X^T X = U \, \mathrm{diag}(\lambda_1, \dots, \lambda_m) \, U^T$ allows us to recover $X$, up to orthogonal transformation, by
$$Y = \mathrm{diag}(\sqrt{\lambda_1}, \dots, \sqrt{\lambda_m}) \, U^T.$$


Dimensionality Reduction by Isomap

From Multidimensional Scaling to Isomap.

Example: Regard the Swiss roll data, sampled from the surface
$$f(u, v) = (u \cos(u), \, v, \, u \sin(u)) \qquad \text{for } u \in [3\pi/2, 9\pi/2], \; v \in [0, 1].$$

Isomap Strategy (a minimal code sketch of the three steps follows after this list).
1. Neighbourhood graph construction. Define a graph where each vertex is a data point, and each edge connects two points satisfying an $\varepsilon$-radius or $k$-nearest-neighbour criterion.
2. Geodesic distance construction. Approximate the geodesic distance between each point pair $(x, y)$ by the length of the shortest path from $x$ to $y$ in the graph.
3. $d$-dimensional embedding. Use the geodesic distances to compute a $d$-dimensional embedding as in the MDS algorithm.
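The three steps translate into a short NumPy/SciPy sketch. This is illustrative rather than the original Isomap code; the neighbourhood size k = 8, the sample size, and the use of scipy.sparse.csgraph.shortest_path are our assumptions:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import shortest_path

def isomap(X, k=8, d=2):
    """Isomap sketch: kNN graph -> graph geodesics -> classical MDS.

    X : (m, n) array of m points in R^n; returns an (m, d) embedding.
    Assumes the kNN graph is connected (otherwise geodesics are infinite).
    """
    # 1. neighbourhood graph: keep the k nearest neighbours of each point
    D = squareform(pdist(X))                  # Euclidean distance matrix
    m = D.shape[0]
    W = np.full((m, m), np.inf)               # inf = "no edge"
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]  # skip column 0 (the point itself)
    for i in range(m):
        W[i, nbrs[i]] = D[i, nbrs[i]]
    W = np.minimum(W, W.T)                    # make the graph symmetric

    # 2. geodesic distances: shortest paths in the graph (Dijkstra)
    G = shortest_path(W, method='D', directed=False)

    # 3. d-dimensional embedding by classical MDS on the geodesics
    J = np.eye(m) - np.ones((m, m)) / m       # centring matrix J = I - (1/m) e e^T
    B = -0.5 * J @ (G ** 2) @ J               # Gram matrix -1/2 J D J
    lam, U = np.linalg.eigh(B)
    top = np.argsort(lam)[::-1][:d]           # d largest eigenvalues
    return U[:, top] * np.sqrt(np.maximum(lam[top], 0.0))

# illustrative test: unroll the Swiss roll f(u,v) = (u cos u, v, u sin u)
rng = np.random.default_rng(1)
u = rng.uniform(3 * np.pi / 2, 9 * np.pi / 2, 800)
v = rng.uniform(0.0, 1.0, 800)
X = np.column_stack([u * np.cos(u), v, u * np.sin(u)])
Y = isomap(X, k=8, d=2)                       # roughly a flat strip, unlike PCA
```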
Dimensionality Reduction by Isomap

[Figure: Isomap embedding of the Swiss roll data.]


Curvature Analysis for Curves

Curvature of Curves. For a curve $r : I \to \mathbb{R}^n$ with arc-length parametrization
$$s(t) = \int_a^t \|r'(x)\| \, dx,$$
its curvature is
$$\kappa(s) = \|r''(s)\|.$$
For a curve $r : I \to \mathbb{R}^n$ with arbitrary parametrization we have
$$\kappa^2(r) = \frac{\|\ddot r\|^2 \, \|\dot r\|^2 - \langle \ddot r, \dot r \rangle^2}{(\|\dot r\|^2)^3}.$$


Curvature Analysis for Manifolds

Computation of the Scalar Curvature for Manifolds.
$$\text{Metric tensor:} \quad g_{ij}(p) = \Big\langle \frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j} \Big\rangle$$
$$\text{Christoffel symbols:} \quad \Gamma_{ij}^k = \frac{1}{2} \sum_{\ell=1}^m \Big( \frac{\partial g_{j\ell}}{\partial x_i} + \frac{\partial g_{i\ell}}{\partial x_j} - \frac{\partial g_{ij}}{\partial x_\ell} \Big) g^{\ell k}$$
$$\text{Curvature tensor:} \quad R^\ell_{\,ijk} = \sum_{h=1}^m \big( \Gamma_{jk}^h \Gamma_{ih}^\ell - \Gamma_{ik}^h \Gamma_{jh}^\ell \big) + \frac{\partial \Gamma_{jk}^\ell}{\partial x_i} - \frac{\partial \Gamma_{ik}^\ell}{\partial x_j}, \qquad R_{ijk\ell} = \sum_{h=1}^m R^h_{\,ijk} \, g_{\ell h}$$
$$\text{Ricci tensor:} \quad R_{ij} = \sum_{k,\ell=1}^m g^{k\ell} R_{kij\ell} = \sum_{k=1}^m R^k_{\,kij}$$
$$\text{Scalar curvature:} \quad \kappa = \sum_{i,j=1}^m g^{ij} R_{ij}$$


Curvature Distortion and Convolution

Regard the convolution map $T$ on the manifold $M$,
$$M_T = \{ T(x) : x \in M \}, \qquad T(x) = x * h, \qquad h = (h_1, \dots, h_m),$$
with Toeplitz matrix representation
$$T = \begin{pmatrix}
h_1 & 0 & \cdots & 0 \\
h_2 & h_1 & \cdots & 0 \\
h_3 & h_2 & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
h_m & h_{m-1} & \cdots & h_1 \\
0 & h_m & \cdots & h_2 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & h_m
\end{pmatrix}.$$
Curvature (for curves):
$$\kappa_T^2(r) = \frac{\|T \ddot r\|^2 \, \|T \dot r\|^2 - \langle T \ddot r, T \dot r \rangle^2}{(\|T \dot r\|^2)^3}.$$
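For discretely sampled curves, both $\kappa^2(r)$ and the distorted curvature $\kappa_T^2(r)$ can be approximated by finite differences. A minimal NumPy sketch, under illustrative assumptions: np.convolve with mode='same' as a simple stand-in for the Toeplitz matrix $T$, and a circle with a moving-average filter $h$ as test case (none of these choices are from the talk):

```python
import numpy as np

def curvature_sq(r, t):
    """Squared curvature of a sampled curve r(t) in R^n by finite differences.

    Implements kappa^2 = (|r''|^2 |r'|^2 - <r'', r'>^2) / (|r'|^2)^3
    for r : (N, n) samples at parameter values t : (N,).
    """
    dr = np.gradient(r, t, axis=0)    # first derivative
    ddr = np.gradient(dr, t, axis=0)  # second derivative
    v2 = np.sum(dr * dr, axis=1)      # |r'|^2
    a2 = np.sum(ddr * ddr, axis=1)    # |r''|^2
    va = np.sum(dr * ddr, axis=1)     # <r', r''>
    return (a2 * v2 - va ** 2) / v2 ** 3

def convolve_curve(r, h):
    """Apply the convolution map T componentwise (stand-in for the Toeplitz T)."""
    return np.column_stack([np.convolve(r[:, j], h, mode='same')
                            for j in range(r.shape[1])])

# illustrative test: a circle of radius 2 has kappa^2 = 1/4 everywhere
t = np.linspace(0.0, 2.0 * np.pi, 400)
r = np.column_stack([2.0 * np.cos(t), 2.0 * np.sin(t)])
h = np.ones(9) / 9.0                  # moving-average filter (illustrative h)
k2 = curvature_sq(r, t)
k2_T = curvature_sq(convolve_curve(r, h), t)
print(k2[200], k2_T[200])             # curvature before vs. after filtering
```

Smoothing shrinks the circle slightly, so the filtered curvature deviates from $1/4$; this is the curvature distortion quantified by $\kappa_T^2$ above.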
Modulation Maps and Modulation Manifolds

A modulation map is defined for $\alpha = (\alpha_1, \dots, \alpha_d) \in \Omega$ and $\{t_i\}_{i=1}^n \subset [0, 1]$ by
$$A : \Omega \to M, \qquad A_\alpha(t_i) = \sum_{k=1}^d \varphi_k(\alpha_k t_i) \quad \text{for } 1 \le i \le n,$$
where $\Omega \subset \mathbb{R}^d$ and $M \subset \mathbb{R}^n$ are manifolds with $\dim(\Omega) = \dim(M)$ and $d < n$.

Example: scale modulation map
$$A_\alpha(t_i) = \sum_{k=1}^d \exp\big( -\alpha_k (t_i - b_k)^2 \big) \qquad \text{for } 1 \le i \le n.$$

Example: frequency modulation map
$$A_\alpha(t_i) = \sum_{k=1}^d \sin(\alpha_k t_i + b_k) \qquad \text{for } 1 \le i \le n.$$


Numerical Example 1: Curvature Distortion

Low-dimensional parameterization of scale-modulated signals:
$$X = \Big\{ f_{\alpha_t} = \sum_{i=1}^3 e^{-\alpha_i(t)(\cdot - b_i)^2} : \alpha_t \in \Omega \Big\}, \qquad \Omega = \big\{ \alpha_t = (\alpha_1(t), \alpha_2(t), \alpha_3(t)) : t \in [t_0, t_1] \big\}.$$
[Figures: parameter curve $\Omega \subset \mathbb{R}^3$, signal family $f_\alpha$, and data set $X \subset \mathbb{R}^{64}$.]

Multiresolution analysis $V_j \to V_{j+1} \to V_{j+2} \to \cdots$ with wavelet components $W_{j+1}, W_{j+2}, \dots$, e.g.
$$V_8 \oplus W_8 \oplus W_{16} \oplus W_{32} = V_{64}.$$
[Figures: $\Omega \subset \mathbb{R}^3$, $X \subset \mathbb{R}^{64}$, and filtered data $T(X) \subset \mathbb{R}^{64}$.]


Numerical Example 1: Manifold Evolution under a PDE

Wave equation (WE):
$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}.$$
Evolution of the manifold:
$$U_t = \big\{ u_\alpha(t, \cdot) : u_\alpha \text{ solution of (WE) with initial condition } f_\alpha, \ \alpha \in \Omega_0 \big\}.$$
[Figures: solution $u_{\alpha_0}(t, x)$ for $(t, x) \in [t_0, t_1] \times [x_0, x_1]$; manifolds $X_0$ and $X_t$.]


Example 2: Curvature Distortion - Frequency Modulation

Frequency Modulation. $A : \Omega \subset \mathbb{R}^d \to M \subset \mathbb{R}^n$ with torus parameters
$$\alpha_1(u, v) = (R + r \cos v) \cos u, \qquad \alpha_2(u, v) = (R + r \cos v) \sin u, \qquad \alpha_3(u, v) = r \sin v,$$
and modulation map
$$A_\alpha(t_i) = \sum_{k=1}^3 \sin\big( (\alpha_k^0 + \gamma \alpha_k) t_i \big).$$
[Figures: signal $A_\alpha(t)$ and frequency content of $A_\alpha(t)$.]

Torus Deformation. The modulation map takes the torus $\Omega \subset \mathbb{R}^3$ to $M \subset \mathbb{R}^{256}$; the PCA projection $P : M \subset \mathbb{R}^n \to \Omega' \subset \mathbb{R}^3$ gives the 3D view $P(M) = \Omega'$.
[Figures: torus $\Omega \subset \mathbb{R}^3$; PCA 3D projection of $M \subset \mathbb{R}^{256}$; scalar curvature of $M \subset \mathbb{R}^{256}$; scalar curvature of $P(M) \subset \mathbb{R}^3$.]

PCA distortion for high frequency bandwidths $\gamma$:
$$A_\alpha(t_i) = \sum_{k=1}^3 \sin\big( (\alpha_k^0 + \gamma \alpha_k) t_i \big).$$
[Figures: Isomap 3D projection of $M$; PCA 3D projection of $M$.]


Example 3: Topological Distortion

Torus (Genus 1 and Genus 2).
[Figures: genus-1 and genus-2 tori (ftorus12) with frequency bands band3 and band4.]


Literature

Relevant Literature.
- M. Guillemard and A. Iske (2010): Curvature Analysis of Frequency Modulated Manifolds in Dimensionality Reduction. Calcolo. Published online, 21 September 2010, DOI: 10.1007/s10092-010-0031-8.
- M. Guillemard and A. Iske (2009): Analysis of High-Dimensional Signal Data by Manifold Learning and Convolutions. In: Sampling Theory and Applications (SampTA'09), L. Fesquet and B. Torrésani (eds.), Marseille, May 2009, 287-290.

Further up-to-date material, including slides and code, is accessible through our homepages http://www.math.uni-hamburg.de/home/guillemard/ and http://www.math.uni-hamburg.de/home/iske/.