Object Orie’d Data Analysis, Last Time
• Gene Cell Cycle Data
• Microarrays and HDLSS visualization
• DWD bias adjustment
• NCI 60 Data
Today: More NCI 60 Data &
Detailed (math’cal) look at PCA
Last Time: Checked Data
Combo, using DWD Dir’ns
DWD Views of NCI 60 Data
Interesting Question:
Which clusters are really there?
Issues:
• DWD great at finding dir’ns of separation
• And will do so even if no real structure
• Is this happening here?
• Or: which clusters are important?
• What does “important” mean?
Real Clusters in NCI 60 Data
Simple Visual Approach:
• Randomly relabel data (Cancer Types)
• Recompute DWD dir’ns & visualization
• Get heuristic impression from this (see the code sketch below)
Deeper Approach
• Formal Hypothesis Testing
(Done later)
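A minimal Python sketch of the relabelling idea above (numpy assumed). DWD needs a dedicated solver, so a simple mean-difference direction stands in for it here, and the data are simulated placeholders rather than NCI 60; only the relabel-and-recompute loop is the point.

```python
import numpy as np

def mean_diff_direction(X, labels):
    # Stand-in for a DWD direction: unit vector between the two class means.
    # (Real DWD solves a second-order cone program; this is only illustrative.)
    d = X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0)
    return d / np.linalg.norm(d)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 1000))    # placeholder for 60 cases x many genes
labels = np.repeat([0, 1], 30)     # placeholder for two cancer types

# Projections on the direction found from the *real* labels
real_scores = X @ mean_diff_direction(X, labels)

# Randomly relabel, recompute the direction, and compare separation
for i in range(4):
    perm = rng.permutation(labels)
    scores = X @ mean_diff_direction(X, perm)
    gap = scores[perm == 1].mean() - scores[perm == 0].mean()
    print(f"relabelling #{i + 1}: mean gap = {gap:.2f}")
```

If the separation seen for the real labels is comparable to what random relabellings produce, the "cluster" is suspect.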
Random Relabelling #1
Random Relabelling #2
Random Relabelling #3
Random Relabelling #4
Revisit Real Data
Revisit Real Data (Cont.)
Heuristic Results:
• Strong Clust's: Melanoma, Leukemia, Renal, Colon
• Weak Clust's: CNS, Ovarian
• Not Clust's: NSCLC, Breast
Later: will find way to quantify these ideas
i.e. develop statistical significance
NCI 60 Controversy
• Can NCI 60 Data be normalized?
• Negative Indication:
• Kuo, et al (2002) Bioinformatics, 18, 405-412.
– Based on Gene by Gene Correlations
• Resolution:
Gene by Gene Data View
vs.
Multivariate Data View
Resolution of Paradox: Toy Data, Gene View
Resolution: Correlations suggest “no chance”
Resolution: Toy Data, PCA View
Resolution: PCA & DWD direct’ns
Resolution: DWD Adjusted
Resolution: DWD Adjusted, PCA view
Resolution: DWD Adjusted, Gene view
Resolution: Correlations & PC1 Projection Correl’n
Needed final verification of Cross-platform Normal’n
• Is statistical power actually improved?
• Will study later
DWD: Why does it work?
Rob Tibshirani Query:
• Really need that complicated stuff?
(DWD is complex)
• Can’t we just use means?
• Empirical Fact (Joel Parker):
(DWD better than simple methods)
DWD: Why does it work?
Xuxin Liu Observation:
• Key is unbalanced sub-sample sizes
(e.g. biological subtypes)
• Mean methods strongly affected
• DWD much more robust
• Toy Example
DWD: Why does it work?
Xuxin Liu Example
• Goals:
– Bring colors together
– Keep symbols distinct (interesting biology)
• Study varying sub-sample proportions:
– Ratio = 1
– Ratio = 0.61
– Ratio = 0.35
– Ratio = 0.11
Ratio = 1: Both methods great
Ratio = 0.61: Mean degrades, DWD good
Ratio = 0.35: Mean poor, DWD still OK
Ratio = 0.11: DWD degraded, still better
• Later: will find underlying theory
PCA: Rediscovery – Renaming
Statistics:
Principal Component Analysis (PCA)
Social Sciences:
Factor Analysis (PCA is a subset)
Probability / Electrical Eng:
Karhunen-Loève expansion
Applied Mathematics:
Proper Orthogonal Decomposition (POD)
Geo-Sciences:
Empirical Orthogonal Functions (EOF)
An Interesting Historical Note
The 1st (?) application of PCA to Functional
Data Analysis:
Rao, C. R. (1958) Some statistical methods
for comparison of growth curves,
Biometrics, 14, 1-17.
1st Paper with “Curves as Data” viewpoint
Detailed Look at PCA
Three important (and interesting) viewpoints:
1. Mathematics
2. Numerics
3. Statistics
1st: Review linear alg. and multivar. prob.
Review of Linear Algebra
Vector Space:
• set of “vectors”, $x$
• and “scalars” (coefficients), $a$
• “closed” under “linear combination”
  ($\sum_i a_i x_i$ in space)
• e.g. $\mathbb{R}^d = \left\{ (x_1, \ldots, x_d)^t : x_1, \ldots, x_d \in \mathbb{R} \right\}$,
  “$d$ dim Euclid’n space”
Review of Linear Algebra (Cont.)
Subspace:
• subset that is again a vector space
• i.e. closed under linear combination
• e.g. lines through the origin
• e.g. planes through the origin
• e.g. subsp. “generated by” a set of vectors
  (all linear combos of them
  = containing hyperplane through origin)
Review of Linear Algebra (Cont.)
Basis of subspace: set of vectors that:
• span, i.e. everything is a lin. com. of them
• are linearly indep’t, i.e. lin. com. is unique
• e.g. “unit vector basis” of $\mathbb{R}^d$:
  $\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \ldots, \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}$
• since
  $\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + x_2 \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} + \cdots + x_d \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}$
Review of Linear Algebra (Cont.)
Basis Matrix, of subspace of $\mathbb{R}^d$:
Given a basis $v_1, \ldots, v_n$, create matrix of columns:
$B = \begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix} = \begin{pmatrix} v_{11} & \cdots & v_{1n} \\ \vdots & \ddots & \vdots \\ v_{d1} & \cdots & v_{dn} \end{pmatrix}_{d \times n}$
Review of Linear Algebra (Cont.)
Then “linear combo” is a matrix multiplicat’n:
$\sum_{i=1}^n a_i v_i = B a$, where $a = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}$
Check sizes: $(d \times 1) = (d \times n) \cdot (n \times 1)$
Review of Linear Algebra (Cont.)
Aside on matrix multiplication: (linear transformat’n)
For matrices
$A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,m} \\ \vdots & \ddots & \vdots \\ a_{k,1} & \cdots & a_{k,m} \end{pmatrix}$,  $B = \begin{pmatrix} b_{1,1} & \cdots & b_{1,n} \\ \vdots & \ddots & \vdots \\ b_{m,1} & \cdots & b_{m,n} \end{pmatrix}$
Define the “matrix product”
$AB = \begin{pmatrix} \sum_{i=1}^m a_{1,i} b_{i,1} & \cdots & \sum_{i=1}^m a_{1,i} b_{i,n} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^m a_{k,i} b_{i,1} & \cdots & \sum_{i=1}^m a_{k,i} b_{i,n} \end{pmatrix}$
(“inner products” of rows with columns)
(composition of linear transformations)
Often useful to check sizes: $(k \times n) = (k \times m) \cdot (m \times n)$
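A small numpy illustration of the entrywise definition and the size check (sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 6))   # k x m
B = rng.normal(size=(6, 2))   # m x n
AB = A @ B                    # k x n, per the size check above
assert AB.shape == (4, 2)

# Entry (j, l) is the inner product of row j of A with column l of B
assert np.isclose(AB[1, 0], A[1, :] @ B[:, 0])
```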
Review of Linear Algebra (Cont.)
Matrix trace:
• For a square matrix $A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,m} \\ \vdots & \ddots & \vdots \\ a_{m,1} & \cdots & a_{m,m} \end{pmatrix}$
• Define $\operatorname{tr}(A) = \sum_{i=1}^m a_{i,i}$
• Trace commutes with matrix multiplication: $\operatorname{tr}(AB) = \operatorname{tr}(BA)$
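A one-line numpy check of the commutation property (random square matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A, B = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
assert np.isclose(np.trace(A @ B), np.trace(B @ A))   # tr(AB) = tr(BA)
```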
Review of Linear Algebra (Cont.)
Dimension of subspace (a notion of “size”):
• number of elements in a basis (unique)
• $\dim(\mathbb{R}^d) = d$ (use basis above)
• e.g. dim of a line is 1
• e.g. dim of a plane is 2
• dimension is “degrees of freedom”
Review of Linear Algebra (Cont.)
Norm of a vector:
• in $\mathbb{R}^d$:  $\|x\| = \left( \sum_{j=1}^d x_j^2 \right)^{1/2} = \left( x^t x \right)^{1/2}$
• Idea: “length” of the vector
• Note: strange properties for high $d$,
  e.g. “length of diagonal of unit cube” $= \sqrt{d}$
Review of Linear Algebra (Cont.)
Norm of a vector (cont.):
• “length normalized vector”: $\dfrac{x}{\|x\|}$
  (has length one, thus on surf. of unit sphere & is a direction vector)
• get “distance” as: $d(x, y) = \|x - y\| = \left( (x - y)^t (x - y) \right)^{1/2}$
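A numpy sketch of both points: the $\sqrt{d}$ diagonal of the unit cube, and length normalization:

```python
import numpy as np

d = 100
diag = np.ones(d)                        # diagonal of the unit cube in R^d
print(np.linalg.norm(diag), np.sqrt(d))  # both 10.0: length grows like sqrt(d)

x = np.array([3.0, 4.0])
u = x / np.linalg.norm(x)                # direction vector on the unit sphere
print(u, np.linalg.norm(u))              # [0.6 0.8], length 1.0
```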
Review of Linear Algebra (Cont.)
Inner (dot, scalar) product:
• for vectors $x$ and $y$:  $\langle x, y \rangle = \sum_{j=1}^d x_j y_j = x^t y$
• related to norm, via  $\|x\| = \langle x, x \rangle^{1/2} = \left( x^t x \right)^{1/2}$
Review of Linear Algebra (Cont.)
Inner (dot, scalar) product (cont.):
• measures “angle between $x$ and $y$” as:
  $\operatorname{angle}(x, y) = \cos^{-1} \left( \dfrac{\langle x, y \rangle}{\|x\| \, \|y\|} \right) = \cos^{-1} \left( \dfrac{x^t y}{\sqrt{x^t x} \sqrt{y^t y}} \right)$
• key to “orthogonality”, i.e. “perpendicul’ty”:
  $x \perp y$ if and only if $\langle x, y \rangle = 0$
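A numpy sketch of the angle formula and the orthogonality criterion:

```python
import numpy as np

def angle(x, y):
    # angle(x, y) = arccos(<x, y> / (||x|| ||y||)), in radians
    return np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

x = np.array([1.0, 0.0])
y = np.array([0.0, 2.0])
print(angle(x, y), np.pi / 2)   # equal: <x, y> = 0, so x is perpendicular to y
```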
Review of Linear Algebra (Cont.)
Orthonormal basis $v_1, \ldots, v_n$:
• All ortho to each other, i.e. $\langle v_i, v_{i'} \rangle = 0$, for $i \neq i'$
• All have length 1, i.e. $\langle v_i, v_i \rangle = 1$, for $i = 1, \ldots, n$
Review of Linear Algebra (Cont.)
Orthonormal basis $v_1, \ldots, v_n$ (cont.):
• “Spectral Representation”:  $x = \sum_{i=1}^n a_i v_i$, where $a_i = \langle x, v_i \rangle$
  check:  $\langle x, v_i \rangle = \left\langle \sum_{i'=1}^n a_{i'} v_{i'}, \, v_i \right\rangle = \sum_{i'=1}^n a_{i'} \langle v_{i'}, v_i \rangle = a_i$
• Matrix notation:  $x = B a$, where $a^t = x^t B$, i.e. $a = B^t x$
• $a$ is called “transform (e.g. Fourier, wavelet) of $x$”
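A numpy sketch of the spectral representation and its matrix form (a toy orthonormal basis of a 2-dim subspace of $\mathbb{R}^3$):

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])      # orthonormal columns v_1, v_2
x = np.array([2.0, -3.0, 0.0])  # x lies in the subspace they span

a = B.T @ x                     # transform: a_i = <x, v_i>
assert np.allclose(B @ a, x)    # reconstruction: x = sum_i a_i v_i
```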
Review of Linear Algebra (Cont.)
Parseval identity, for $x$ in subsp. gen’d by o.n. basis $v_1, \ldots, v_n$:
$\|x\|^2 = \sum_{i=1}^n \langle x, v_i \rangle^2 = \sum_{i=1}^n a_i^2 = \|a\|^2$
• Pythagorean theorem
• “Decomposition of Energy”
• ANOVA - sums of squares
• Transform, $a$, has same length as $x$, i.e. “rotation in $\mathbb{R}^d$”
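A numpy check of Parseval, using a random orthonormal basis of all of $\mathbb{R}^4$ (from a QR factorization):

```python
import numpy as np

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))   # orthonormal basis as columns
x = rng.normal(size=4)
a = Q.T @ x                                    # transform of x

# Transform has the same length as x: the map is a rotation of R^4
assert np.isclose(np.linalg.norm(a), np.linalg.norm(x))
```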
Review of Linear Algebra (Cont.)
Gram-Schmidt Ortho-normalization
Idea: Given a basis $v_1, \ldots, v_n$, find an orthonormal version, by subtracting non-ortho parts:
$u_1 = v_1 / \|v_1\|$
$u_2 = \left( v_2 - \langle v_2, u_1 \rangle u_1 \right) / \left\| v_2 - \langle v_2, u_1 \rangle u_1 \right\|$
$u_3 = \left( v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2 \right) / \left\| v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2 \right\|$
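A short Python implementation of the scheme above (the modified variant, which subtracts each non-ortho part as it goes; numerically steadier but the same idea):

```python
import numpy as np

def gram_schmidt(V):
    # Columns of V form a basis; returns orthonormal columns, same span.
    U = np.zeros_like(V, dtype=float)
    for j in range(V.shape[1]):
        u = V[:, j].astype(float)          # copy of v_j
        for i in range(j):
            u -= (u @ U[:, i]) * U[:, i]   # subtract non-ortho part
        U[:, j] = u / np.linalg.norm(u)    # normalize to length one
    return U

V = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
U = gram_schmidt(V)
print(np.round(U.T @ U, 12))               # identity: columns orthonormal
```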
Review of Linear Algebra (Cont.)
Projection of a vector $x$ onto a subspace $V$:
• Idea: member of $V$ that is closest to $x$ (i.e. “approx’n”)
• Find $P_V x \in V$ that solves:  $\min_{v \in V} \|x - v\|$  (“least squares”)
• For inner product (Hilbert) space: $P_V x$ exists and is unique
Review of Linear Algebra (Cont.)
Projection of a vector onto a subspace (cont.):
• General solution in $\mathbb{R}^d$: for basis matrix $B_V$,
  $P_V x = B_V \left( B_V^t B_V \right)^{-1} B_V^t x$
• So “proj’n operator” is “matrix mult’n”:
  $P_V = B_V \left( B_V^t B_V \right)^{-1} B_V^t$
  (thus projection is another linear operation)
  (note same operation underlies least squares)
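A numpy sketch of the projection formula, checked against numpy's own least squares solver (toy sizes):

```python
import numpy as np

rng = np.random.default_rng(5)
B_V = rng.normal(size=(4, 2))   # basis matrix of a 2-dim subspace V of R^4
x = rng.normal(size=4)

# P_V x = B_V (B_V^t B_V)^{-1} B_V^t x
P_V = B_V @ np.linalg.inv(B_V.T @ B_V) @ B_V.T
proj = P_V @ x

# Same operation underlies least squares: fit x on the columns of B_V
coef, *_ = np.linalg.lstsq(B_V, x, rcond=None)
assert np.allclose(proj, B_V @ coef)
```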
Review of Linear Algebra (Cont.)
Projection using orthonormal basis $v_1, \ldots, v_n$:
• Basis matrix is “orthonormal”:
  $B_V^t B_V = \begin{pmatrix} v_1^t \\ \vdots \\ v_n^t \end{pmatrix} \begin{pmatrix} v_1 & \cdots & v_n \end{pmatrix} = \begin{pmatrix} \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & \ddots & \vdots \\ \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{pmatrix} = \begin{pmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{pmatrix} = I_{n \times n}$
• So  $P_V x = B_V B_V^t x$ = Recon(Coeffs of $x$ “in $V$ dir’n”)
Review of Linear Algebra (Cont.)
Projection using orthonormal basis (cont.):
• For “orthogonal complement” $V^{\perp}$:
  $x = P_V x + P_{V^{\perp}} x$  and  $\|x\|^2 = \|P_V x\|^2 + \|P_{V^{\perp}} x\|^2$
• Parseval inequality:
  $\|P_V x\|^2 = \sum_{i=1}^n \langle x, v_i \rangle^2 = \sum_{i=1}^n a_i^2 \leq \|x\|^2$
Review of Linear Algebra (Cont.)
(Real) Unitary Matrices: $U_{d \times d}$ with $U^t U = I$
• Orthonormal basis matrix (so all of above applies)
• Follows that $U U^t = I$
  (since $U$ has full rank, so $U^{-1}$ exists ...)
• Lin. trans. (mult. by $U$) is like “rotation” of $\mathbb{R}^d$
• But also includes “mirror images”
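A numpy sketch: a rotation and a mirror image of $\mathbb{R}^2$, both unitary and both length preserving:

```python
import numpy as np

theta = np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],   # rotation
              [np.sin(theta),  np.cos(theta)]])
M = np.array([[1.0, 0.0],
              [0.0, -1.0]])                      # mirror image

x = np.array([1.0, 2.0])
for U in (R, M, R @ M):
    assert np.allclose(U.T @ U, np.eye(2))                       # U^t U = I
    assert np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x))  # length kept
```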
Review of Linear Algebra (Cont.)
Singular Value Decomposition (SVD):
For a matrix $X_{d \times n}$:
Find a diagonal matrix $S_{d \times n}$, with entries $s_1, \ldots, s_{\min(d, n)}$ called singular values,
and unitary (rotation) matrices $U_{d \times d}$, $V_{n \times n}$
(recall $U^t U = V^t V = I$)
so that  $X = U S V^t$
Review of Linear Algebra (Cont.)
Intuition behind Singular Value Decomposition:
• For $X$ a “linear transf’n” (via matrix multi’n):
  $X \cdot v = U \cdot S \cdot V^t \cdot v = U \cdot \left( S \cdot \left( V^t \cdot v \right) \right)$
• First rotate (by $V^t$)
• Second rescale coordinate axes (by $s_i$)
• Third rotate again (by $U$)
• i.e. have diagonalized the transformation
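A numpy check of the factorization and of the rotate-rescale-rotate reading (toy $d = 5$, $n = 3$):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(5, 3))

U, s, Vt = np.linalg.svd(X, full_matrices=True)   # U: 5x5, Vt: 3x3
S = np.zeros((5, 3))
S[:3, :3] = np.diag(s)                            # diagonal d x n matrix

assert np.allclose(X, U @ S @ Vt)                 # X = U S V^t

v = rng.normal(size=3)
# First rotate (V^t v), then rescale (S ...), then rotate again (U ...)
assert np.allclose(X @ v, U @ (S @ (Vt @ v)))
```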
Review of Linear Algebra (Cont.)
SVD Compact Representation:
Useful Labeling: Singular Values in Decreasing Order:
$s_1 \geq \cdots \geq s_{\min(n, d)}$
Note: singular values = 0 can be omitted
Let $r$ = # of positive singular values
Then:  $X = U_{d \times r} S_{r \times r} V_{n \times r}^t$,
where $U_{d \times r}$, $S_{r \times r}$, $V_{n \times r}$ are truncations of $U, S, V$
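A numpy sketch of the compact form, on a matrix built to have rank $r = 3$:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(5, 3)) @ rng.normal(size=(3, 4))   # 5x4, rank 3

U, s, Vt = np.linalg.svd(X, full_matrices=False)
r = int(np.sum(s > 1e-10))           # number of positive singular values
U_r, S_r, V_r = U[:, :r], np.diag(s[:r]), Vt[:r, :].T

# Zero singular values omitted, yet X is recovered exactly
assert r == 3
assert np.allclose(X, U_r @ S_r @ V_r.T)
```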
Review of Linear Algebra (Cont.)
Eigenvalue Decomposition:
For a (symmetric) square matrix $X_{d \times d}$:
Find a diagonal matrix  $D = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_d \end{pmatrix}$
and an orthonormal matrix $B_{d \times d}$
(i.e. $B^t B = B B^t = I_{d \times d}$)
so that:  $X B = B D$,  i.e.  $X = B D B^t$
Review of Linear Algebra (Cont.)
Eigenvalue Decomposition (cont.):
• Relation to Singular Value Decomposition (looks similar?):
• Eigenvalue decomposition “harder”
• Since needs $U = V$
• Price is eigenvalue decomp’n is generally complex valued
• Except for $X$ square and symmetric
• Then eigenvalue decomp. is real valued
• Thus is the sing’r value decomp. with: $U = V = B$