Lecture 18 Varimax Factors and Empircal Orthogonal Functions Syllabus Lecture 01 Lecture 02 Lecture 03 Lecture 04 Lecture 05 Lecture 06 Lecture 07 Lecture 08 Lecture 09 Lecture 10 Lecture 11 Lecture 12 Lecture 13 Lecture 14 Lecture 15 Lecture 16 Lecture 17 Lecture 18 Lecture 19 Lecture 20 Lecture 21 Lecture 22 Lecture 23 Lecture 24 Describing Inverse Problems Probability and Measurement Error, Part 1 Probability and Measurement Error, Part 2 The L2 Norm and Simple Least Squares A Priori Information and Weighted Least Squared Resolution and Generalized Inverses Backus-Gilbert Inverse and the Trade Off of Resolution and Variance The Principle of Maximum Likelihood Inexact Theories Nonuniqueness and Localized Averages Vector Spaces and Singular Value Decomposition Equality and Inequality Constraints L1 , L∞ Norm Problems and Linear Programming Nonlinear Problems: Grid and Monte Carlo Searches Nonlinear Problems: Newton’s Method Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals Factor Analysis Varimax Factors, Empircal Orthogonal Functions Backus-Gilbert Theory for Continuous Problems; Radon’s Problem Linear Operators and Their Adjoints Fréchet Derivatives Exemplary Inverse Problems, incl. Filter Design Exemplary Inverse Problems, incl. Earthquake Location Exemplary Inverse Problems, incl. Vibrational Problems Purpose of the Lecture Choose Factors Satisfying A Priori Information of Spikiness (varimax factors) Use Factor Analysis to Detect Patterns in data (EOF’s) Part 1: Creating Spiky Factors can we find “better” factors that those returned by svd() ? mathematically S = CF = C’ F’ with F’ = M F and C’ = M-1 C where M is any P×P matrix with an inverse must rely on prior information to choose M one possible type of prior information factors should contain mainly just a few elements example of rock and minerals rocks contain minerals minerals contain elements Mineral Composition Quartz SiO2 Rutile TiO2 Anorthite CaAl2Si2O8 Fosterite Mg2SiO4 example of rock and minerals rocks contain minerals minerals contain elements Mineral Composition Quartz SiO2 Rutile TiO2 Anorthite CaAl2Si2O8 Fosterite Mg2SiO4 factors most of these minerals are “simple” in the sense that each contains just a few elements spiky factors factors containing mostly just a few elements How to quantify spikiness? variance as a measure of spikiness modification for factor analysis modification for factor analysis depends on the square, so positive and negative values are treated the same f(1)= [1, 0, 1, 0, 1, 0]T is much spikier than f(2)= [1, 1, 1, 1, 1, 1]T f(2)=[1, 1, 1, 1, 1, 1]T is just as spiky as f(3)= [1, -1, 1, -1, -1, 1]T “varimax” procedure find spiky factors without changing P start with P svd() factors rotate pairs of them in their plane by angle θ to maximize the overall spikiness f’A fB f’B q fA determine θ by maximizing after tedious trig the solution can be shown to be and the new factors are in this example A=3 and B=5 now one repeats for every pair of factors and then iterates the whole process several times until the whole set of factors is as spiky as possible example: Atlantic Rock dataset Old New SiO2 TiO2 Al2O3 FeOtotal MgO CaO Na2O K2O f2 f3 f4 f5 f2p f3p f4p f5p f2 f3 f4 f5 f’2f’3 f’4f’5 example: Atlantic Rock dataset Old New SiO2 TiO2 Al2O3 FeOtotal MgO CaO Na2O K2O f2 f3 f4 f5 f2p f3p f4p f5p f2 f3 f4 f5 f’2f’3 f’4f’5 not so spiky example: Atlantic Rock dataset Old New SiO2 TiO2 Al2O3 spiky FeOtotal MgO CaO Na2O K2O f2 f3 f4 f5 f2p f3p f4p f5p f2 f3 f4 f5 f’2f’3 f’4f’5 Part 2: Empirical Orthogonal Functions row number in the sample matrix could be meaningful example: samples collected at a succession of times time column number in the sample matrix could be meaningful example: concentration of the same chemical element at a sequence of positions distance S = CF becomes S = CF becomes distance dependence time dependence S = CF becomes each loading: a temporal pattern of variability of the corresponding factor each factor: a spatial pattern of variability of the element S = CF becomes there are P patterns and they are sorted into order of importance S = CF becomes factors now called EOF’s (empirical orthogonal functions) example 1 hypothetical mountain profiles what are the most important spatial patterns that characterize mountain profiles this problem has space but not time s( xj , i ) = p Σk=1 cki f (k)(xj) this problem has space but not time s( xj , i ) = p Σk=1 cki f (k)(xj ) factors are spatial patterns that add together to make mountain profiles 3 5 i 6 5 0 10 5 i 5 5 i 0 10 5 i 5 5 i 0 10 5 i 0 0 5 i 10 5 i 10 0 5 i 10 0 5 i 10 0 5 i 10 5 0 10 5 s s 14 5 0 15 0 0 5 s 0 10 5 0 10 s s 11 5 0 12 0 5 i s 0 0 5 0 10 s 8 5 0 9 5 i 0 10 s 0 s 7 0 s 5 5 0 10 0 10 s 5 i 5 s 2 0 s 4 0 13 5 s s 1 5 0 0 5 i 10 0 3 5 i 5 0 0 5 i 9 5 0 10 5 i 5 5 i 0 10 5 i 0 0 5 i 10 10 0 5 i 10 0 5 i 10 0 5 i 10 5 0 10 5 s s 14 5 0 15 0 5 i 5 s 0 0 5 0 10 s 11 5 0 12 5 i 10 s 0 5 i 5 0 10 s 8 5 0 s 10 0 10 6 5 5 i s 10 0 s 0 s 7 0 0 13 0 10 5 s 2 5 i s 4 0 5 s 0 elements: elevations ordered by distance along profile 5 s s 1 5 0 0 5 i 10 0 λ1 = 38 λ2 = 13 λ3 = 7 EOF 2 EOF 3 EOF 1 -1 0 5 i index, i 10 fi 3 fi 2 0 1 F (3) 1 F (2) fiF(1) 1 1 0 -1 0 5 i index, i 10 0 -1 0 5 i index, i 10 factor loading, Ci3 factor loading, Ci2 example 2 spatial-temporal patterns (synthetic data) the data 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 y the data x 1 spatial pattern at a single time t=1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 the data time 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 the data 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 need to unfold each 2D image into vector 4 6 1 9 4 6 1 9 1 2 2 1 3 6 1 1 3 6 s 200 150 λi 100 50 0 0 5 p=3 10 index, i 15 20 25 (A) 1 EOF 1 loadng1 (B) 3 EOF 3 EOF 2 0 -50 -100 loadng3 loadng2 2 0 5 10 0 5 10 0 5 10 20 time, t 15 20 25 time, t 15 20 25 time, t 15 20 25 0 -20 50 0 -50 example 3 spatial-temporal patterns (actual data) sea surface temperature in the Pacific Ocean CAC Sea Surface Temperature 29N latitude equatorial Pacific Ocean 29S 124E longitude 290E sea surface temperature (black = warm) the image is 30 by 84 pixels in size, or 2520 pixels total to use svd(), the image must be unwrapped into a vector of length 2520 399 times 2520 positions in the equatorial Pacific ocean “element” means temperature singular values, Sii singular values index, i singular values, Sii singular values index, i no clear cutoff for P, but the first 12 singular values are considerably larger than the rest using SVD to approximate data S=CMFM With M EOF’s, the data is fit exactly S=CPFP With P chosen to exclude only zero singular values, the data is fit exactly S≈CP’FP’ With P’<P, small non-zero singular values are excluded too, and the data is fit only approximately A) Original B) Based on first 5 EOF’s