Lecture 18 Varimax Factors and Empircal Orthogonal Functions

advertisement
Lecture 18
Varimax Factors
and
Empircal Orthogonal Functions
Syllabus
Lecture 01
Lecture 02
Lecture 03
Lecture 04
Lecture 05
Lecture 06
Lecture 07
Lecture 08
Lecture 09
Lecture 10
Lecture 11
Lecture 12
Lecture 13
Lecture 14
Lecture 15
Lecture 16
Lecture 17
Lecture 18
Lecture 19
Lecture 20
Lecture 21
Lecture 22
Lecture 23
Lecture 24
Describing Inverse Problems
Probability and Measurement Error, Part 1
Probability and Measurement Error, Part 2
The L2 Norm and Simple Least Squares
A Priori Information and Weighted Least Squared
Resolution and Generalized Inverses
Backus-Gilbert Inverse and the Trade Off of Resolution and Variance
The Principle of Maximum Likelihood
Inexact Theories
Nonuniqueness and Localized Averages
Vector Spaces and Singular Value Decomposition
Equality and Inequality Constraints
L1 , L∞ Norm Problems and Linear Programming
Nonlinear Problems: Grid and Monte Carlo Searches
Nonlinear Problems: Newton’s Method
Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Factor Analysis
Varimax Factors, Empircal Orthogonal Functions
Backus-Gilbert Theory for Continuous Problems; Radon’s Problem
Linear Operators and Their Adjoints
Fréchet Derivatives
Exemplary Inverse Problems, incl. Filter Design
Exemplary Inverse Problems, incl. Earthquake Location
Exemplary Inverse Problems, incl. Vibrational Problems
Purpose of the Lecture
Choose Factors Satisfying A Priori
Information of Spikiness
(varimax factors)
Use Factor Analysis to Detect
Patterns in data
(EOF’s)
Part 1: Creating Spiky Factors
can we find “better” factors
that those returned by svd()
?
mathematically
S = CF = C’ F’
with F’ = M F and C’ = M-1 C
where M is any P×P matrix with an inverse
must rely on prior information to choose M
one possible type of prior
information
factors should contain mainly just a
few elements
example of rock and minerals
rocks contain minerals
minerals contain elements
Mineral
Composition
Quartz
SiO2
Rutile
TiO2
Anorthite
CaAl2Si2O8
Fosterite
Mg2SiO4
example of rock and minerals
rocks contain minerals
minerals contain elements
Mineral
Composition
Quartz
SiO2
Rutile
TiO2
Anorthite
CaAl2Si2O8
Fosterite
Mg2SiO4
factors
most of
these
minerals are
“simple” in
the sense
that each
contains just
a few
elements
spiky factors
factors containing mostly just a few elements
How to quantify spikiness?
variance as a measure of spikiness
modification for factor analysis
modification for factor analysis
depends on the square, so positive and
negative values are treated the same
f(1)= [1, 0, 1, 0, 1, 0]T
is much spikier than
f(2)= [1, 1, 1, 1, 1, 1]T
f(2)=[1, 1, 1, 1, 1, 1]T
is just as spiky as
f(3)= [1, -1, 1, -1, -1, 1]T
“varimax” procedure
find spiky factors without changing P
start with P svd() factors
rotate pairs of them in their plane by angle θ
to maximize the overall spikiness
f’A
fB
f’B
q
fA
determine θ by maximizing
after tedious trig the solution can be
shown to be
and the new factors are
in this example A=3 and B=5
now one repeats for every pair of factors
and then iterates the whole process several
times
until the whole set of factors is as spiky as
possible
example: Atlantic Rock dataset
Old
New
SiO2
TiO2
Al2O3
FeOtotal
MgO
CaO
Na2O
K2O
f2
f3
f4
f5
f2p
f3p f4p f5p
f2 f3 f4 f5 f’2f’3 f’4f’5
example: Atlantic Rock dataset
Old
New
SiO2
TiO2
Al2O3
FeOtotal
MgO
CaO
Na2O
K2O
f2
f3
f4
f5
f2p
f3p f4p f5p
f2 f3 f4 f5 f’2f’3 f’4f’5
not so spiky
example: Atlantic Rock dataset
Old
New
SiO2
TiO2
Al2O3
spiky
FeOtotal
MgO
CaO
Na2O
K2O
f2
f3
f4
f5
f2p
f3p f4p f5p
f2 f3 f4 f5 f’2f’3 f’4f’5
Part 2: Empirical Orthogonal Functions
row number in the sample matrix could be
meaningful
example: samples collected at a succession of times
time
column number in the sample matrix could
be meaningful
example: concentration of the same chemical element at a
sequence of positions
distance
S = CF
becomes
S = CF
becomes
distance dependence
time dependence
S = CF
becomes
each loading: a
temporal pattern of
variability of the
corresponding factor
each factor:
a spatial pattern
of variability of
the element
S = CF
becomes
there are P
patterns and
they are sorted
into order of
importance
S = CF
becomes
factors now called
EOF’s (empirical
orthogonal functions)
example 1
hypothetical mountain profiles
what are the most important spatial patterns
that characterize mountain profiles
this problem has space but not time
s( xj , i ) =
p
Σk=1 cki
f (k)(xj)
this problem has space but not time
s( xj , i ) =
p
Σk=1 cki
f (k)(xj )
factors are spatial
patterns that add
together to make
mountain profiles
3
5
i
6
5
0
10
5
i
5
5
i
0
10
5
i
5
5
i
0
10
5
i
0
0
5
i
10
5
i
10
0
5
i
10
0
5
i
10
0
5
i
10
5
0
10
5
s
s
14
5
0
15
0
0
5
s
0
10
5
0
10
s
s
11
5
0
12
0
5
i
s
0
0
5
0
10
s
8
5
0
9
5
i
0
10
s
0
s
7
0
s
5
5
0
10
0
10
s
5
i
5
s
2
0
s
4
0
13
5
s
s
1
5
0
0
5
i
10
0
3
5
i
5
0
0
5
i
9
5
0
10
5
i
5
5
i
0
10
5
i
0
0
5
i
10
10
0
5
i
10
0
5
i
10
0
5
i
10
5
0
10
5
s
s
14
5
0
15
0
5
i
5
s
0
0
5
0
10
s
11
5
0
12
5
i
10
s
0
5
i
5
0
10
s
8
5
0
s
10
0
10
6
5
5
i
s
10
0
s
0
s
7
0
0
13
0
10
5
s
2
5
i
s
4
0
5
s
0
elements:
elevations
ordered by
distance
along profile
5
s
s
1
5
0
0
5
i
10
0
λ1 = 38
λ2 = 13
λ3 = 7
EOF 2
EOF 3
EOF 1
-1
0
5
i
index, i
10
fi 3
fi 2
0
1
F (3)
1
F (2)
fiF(1)
1
1
0
-1
0
5
i
index, i
10
0
-1
0
5
i
index, i
10
factor loading, Ci3
factor loading, Ci2
example 2
spatial-temporal patterns
(synthetic data)
the data
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
y
the data
x
1
spatial
pattern at
a single
time
t=1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
the data
time
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
the data
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
need to unfold each 2D image into vector
4
6
1
9
4
6
1
9
1
2
2
1
3
6
1
1
3
6
s
200
150
λi
100
50
0
0
5
p=3
10
index, i
15
20
25
(A)
1 EOF 1
loadng1
(B)
3 EOF 3
EOF 2
0
-50
-100
loadng3 loadng2
2
0
5
10
0
5
10
0
5
10
20
time, t
15
20
25
time, t
15
20
25
time, t
15
20
25
0
-20
50
0
-50
example 3
spatial-temporal patterns
(actual data)
sea surface temperature in the Pacific Ocean
CAC Sea Surface Temperature
29N
latitude
equatorial Pacific
Ocean
29S
124E
longitude
290E
sea surface temperature
(black = warm)
the image is 30 by
84 pixels in size,
or 2520 pixels
total
to use svd(), the image must be
unwrapped into a vector of length 2520
399 times
2520 positions in the equatorial Pacific ocean
“element” means
temperature
singular values, Sii
singular values
index, i
singular values, Sii
singular values
index, i
no clear cutoff for P, but the first 12 singular values are
considerably larger than the rest
using SVD to approximate data
S=CMFM
With M EOF’s, the data is fit exactly
S=CPFP
With P chosen to exclude only zero
singular values, the data is fit exactly
S≈CP’FP’
With P’<P, small non-zero singular
values are excluded too, and the data is
fit only approximately
A) Original
B) Based on first 5 EOF’s
Download