Singular value decomposition

If only the first p singular values are nonzero we write

$$G = \begin{bmatrix} U_p \,|\, U_o \end{bmatrix} \begin{bmatrix} S_p & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_p \,|\, V_o \end{bmatrix}^T$$

$U_p$ represents the first p columns of U
$U_o$ represents the last N−p columns of U → a data null space is created
$V_p$ represents the first p columns of V
$V_o$ represents the last M−p columns of V → a model null space is created
Properties

$$U_p^T U_o = 0, \quad U_o^T U_p = 0, \quad V_p^T V_o = 0, \quad V_o^T V_p = 0$$
$$U_p^T U_p = I, \quad U_o^T U_o = I, \quad V_p^T V_p = I, \quad V_o^T V_o = I$$
Since the columns of $U_o$ and $V_o$ multiply only zeros, we obtain the compact form of G:

$$G = U_p S_p V_p^T$$
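To make the partitioning concrete, here is a minimal numpy sketch (the matrix is an arbitrary rank-deficient example, not from the notes) showing the split into $U_p, U_o, V_p, V_o$ and the compact form:

```python
import numpy as np

# A hypothetical G with a non-trivial null space: rank 2, size 3x4,
# so N - p = 1 vector spans the data null space and M - p = 2 the model null space.
G = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])

U, s, Vt = np.linalg.svd(G)           # full SVD: U is NxN, Vt is MxM
p = np.sum(s > 1e-10)                 # number of nonzero singular values

Up, Uo = U[:, :p], U[:, p:]           # first p / last N-p columns of U
Vp, Vo = Vt[:p, :].T, Vt[p:, :].T     # first p / last M-p columns of V
Sp = np.diag(s[:p])

# The compact form reproduces G because the columns of Uo and Vo
# multiply only zeros in the diagonal matrix of singular values.
assert np.allclose(G, Up @ Sp @ Vp.T)
```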
Model null space

Consider a vector made up of a linear combination of the columns of $V_o$:

$$m_v = \sum_{i=p+1}^{M} \lambda_i v_i$$

The model $m_v$ lies in the space spanned by the columns of $V_o$. Since $V_p^T v_i = 0$ for $i > p$,

$$G m_v = \sum_{i=p+1}^{M} \lambda_i U_p S_p V_p^T v_i = 0$$

So any model of this type has no effect on the data: it lies in the model null space!
Where have we seen this before? This is the uniqueness question of Backus and Gilbert.

Consequence: if any solution exists to the inverse problem, then an infinite number of solutions exist. Assume the model $m_{ls}$ fits the data:

$$G m_{ls} = d_{obs}$$

$$G(m_{ls} + m_v) = G m_{ls} + G m_v = d_{obs} + 0$$

The data cannot constrain models in the model null space.
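A numerical illustration of the uniqueness problem, reusing the hypothetical rank-2 matrix from the earlier sketch: adding any linear combination of the columns of $V_o$ to a fitting model leaves the predicted data unchanged.

```python
import numpy as np

G = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
U, s, Vt = np.linalg.svd(G)
p = np.sum(s > 1e-10)
Vo = Vt[p:, :].T                      # columns spanning the model null space

m_ls = np.linalg.lstsq(G, np.array([1., 2., 1.]), rcond=None)[0]
m_v = Vo @ np.array([3.0, -1.5])      # arbitrary lambda_i: any null-space vector

# G m_v = 0, so both models predict exactly the same data.
assert np.allclose(G @ m_v, 0.0)
assert np.allclose(G @ m_ls, G @ (m_ls + m_v))
```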
Data null space

Consider a data vector with at least one component in $U_o$:

$$d_{obs} = d_o + \lambda_i u_i \qquad (i > p)$$

For any model space vector m we have

$$d_{pre} = G m = U_p S_p V_p^T m = U_p a$$

For the model to fit the data ($d_{obs} = d_{pre}$) we must have

$$d_o + \lambda_i u_i = \sum_{j=1}^{p} a_j u_j$$

but $u_i$ ($i > p$) is orthogonal to every $u_j$ on the right-hand side, so no choice of a can satisfy this. Data of this type cannot be fit by any model: the data has a component in the data null space!

Where have we seen this before? This is the existence question of Backus and Gilbert.

Consequence: no model exists that can fit the data.

All this depends on the structure of the kernel matrix G!
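The existence problem can be demonstrated the same way; a sketch (same hypothetical G) in which a data vector is given a component along a column of $U_o$ that no model can remove:

```python
import numpy as np

G = np.array([[1., 2., 3., 4.],
              [2., 4., 6., 8.],
              [1., 0., 1., 0.]])
U, s, Vt = np.linalg.svd(G)
p = np.sum(s > 1e-10)
Uo = U[:, p:]

# Contaminate perfectly fittable data with a component in the data null space.
d_obs = G @ np.array([1., 0., 0., 0.]) + 0.3 * Uo[:, 0]

m_best = np.linalg.lstsq(G, d_obs, rcond=None)[0]
residual = d_obs - G @ m_best

# The residual is exactly the Uo component: every prediction G m lies in the
# span of Up, so no model can remove it.
assert np.allclose(residual, 0.3 * Uo[:, 0])
```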
Moore-Penrose generalized inverse

$$G^\dagger = V_p S_p^{-1} U_p^T$$

The generalized inverse combines the features of the least squares and minimum length solutions.

In a purely over-determined problem it is equivalent to the least squares solution:

$$m^\dagger = G^\dagger d = (G^T G)^{-1} G^T d$$

In a purely under-determined problem it is equivalent to the minimum length solution:

$$m^\dagger = G^\dagger d = G^T (G G^T)^{-1} d$$

In general problems it minimizes the data prediction error while also minimizing the length of the solution:

$$\phi(m^\dagger) = (d - G m^\dagger)^T (d - G m^\dagger), \qquad L(m^\dagger) = m^{\dagger T} m^\dagger$$
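These equivalences are easy to verify numerically; a sketch with two small random systems, one purely over-determined (full column rank) and one purely under-determined (full row rank):

```python
import numpy as np

rng = np.random.default_rng(0)

# Purely over-determined: N > M, full column rank almost surely.
G_over = rng.standard_normal((6, 3))
d_over = rng.standard_normal(6)
assert np.allclose(np.linalg.pinv(G_over) @ d_over,
                   np.linalg.solve(G_over.T @ G_over, G_over.T @ d_over))

# Purely under-determined: N < M, full row rank almost surely.
G_under = rng.standard_normal((3, 6))
d_under = rng.standard_normal(3)
assert np.allclose(np.linalg.pinv(G_under) @ d_under,
                   G_under.T @ np.linalg.solve(G_under @ G_under.T, d_under))
```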
Covariance and resolution of the pseudo inverse

How does data noise propagate into the model? What is the model covariance matrix for the generalized inverse?

$$C_M = G^\dagger C_d (G^\dagger)^T$$

For the case $C_d = \sigma^2 I$, using $G^\dagger = V_p S_p^{-1} U_p^T$:

$$C_M = \sigma^2 G^\dagger (G^\dagger)^T = \sigma^2 V_p S_p^{-2} V_p^T \qquad \text{(prove this)}$$

Recall that $S_p$ is a diagonal matrix of ordered singular values, $S_p = \mathrm{diag}[s_1, s_2, \ldots, s_p]$, so

$$C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2} \qquad \text{(prove this)}$$

What is the effect of singular values on the model covariance? As the number of singular values, p, increases, the variance of the model parameters increases!
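Both "prove this" steps can also be checked numerically; a sketch verifying that $\sigma^2 G^\dagger (G^\dagger)^T$, $\sigma^2 V_p S_p^{-2} V_p^T$ and the eigenvector sum agree for an arbitrary test matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((5, 4))       # hypothetical kernel, full rank here
sigma = 0.1

U, s, Vt = np.linalg.svd(G, full_matrices=False)
p = np.sum(s > 1e-10)
Vp, sp = Vt[:p, :].T, s[:p]

Gdag = np.linalg.pinv(G)
CM_direct = sigma**2 * Gdag @ Gdag.T                    # sigma^2 G+ (G+)^T
CM_svd = sigma**2 * Vp @ np.diag(1.0 / sp**2) @ Vp.T    # sigma^2 Vp Sp^-2 Vp^T
CM_sum = sigma**2 * sum(np.outer(Vp[:, i], Vp[:, i]) / sp[i]**2
                        for i in range(p))              # eigenvector expansion

assert np.allclose(CM_direct, CM_svd) and np.allclose(CM_svd, CM_sum)
```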
Covariance and resolution of the pseudo inverse

How is the estimated model related to the true model?

Model resolution matrix:

$$m^\dagger = R\, m_{true}$$

$$R = G^\dagger G = V_p S_p^{-1} U_p^T U_p S_p V_p^T = V_p V_p^T$$

As p increases the model null space shrinks:

$$p \to M: \quad V_p \to V, \quad R \to I$$

What is the effect of singular values on the resolution matrix? As the number of singular values, p, increases, the resolution of the model parameters increases!

We see the trade-off between variance and resolution.
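A sketch of the resolution side of the trade-off, using a hypothetical full-rank square G: as the truncation level p grows, $R = V_p V_p^T$ approaches the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((4, 4))       # full rank almost surely, so p can reach M = 4

U, s, Vt = np.linalg.svd(G)
for p in range(1, 5):
    Vp = Vt[:p, :].T
    R = Vp @ Vp.T                     # model resolution matrix at truncation level p
    print(p, np.linalg.norm(R - np.eye(4)))   # distance from perfect resolution

# At p = M the null space is empty, R = I, and every parameter is
# perfectly resolved (at the price of maximum variance).
```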
Worked example: tomography

$$\delta d = G\, \delta m$$

Using rays 1-4:

$$G = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ \sqrt{2} & 0 & 0 & \sqrt{2} \\ 0 & \sqrt{2} & \sqrt{2} & 0 \end{bmatrix}, \qquad G^T G = \begin{bmatrix} 3 & 0 & 1 & 2 \\ 0 & 3 & 2 & 1 \\ 1 & 2 & 3 & 0 \\ 2 & 1 & 0 & 3 \end{bmatrix}$$

This has eigenvalues 0, 2, 4, 6.

$$V_p = \begin{bmatrix} 0.5 & -0.5 & -0.5 \\ 0.5 & 0.5 & 0.5 \\ 0.5 & 0.5 & -0.5 \\ 0.5 & -0.5 & 0.5 \end{bmatrix} \quad (s_1^2 = 6,\; s_2^2 = 4,\; s_3^2 = 2), \qquad V_o = \begin{bmatrix} 0.5 \\ 0.5 \\ -0.5 \\ -0.5 \end{bmatrix} \quad (s_4^2 = 0,\; G v_o = 0)$$
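The worked example is easy to reproduce; a sketch assuming the ray geometry reconstructed above (two unit-length rays and two diagonal rays of length $\sqrt{2}$ through the four cells):

```python
import numpy as np

s2 = np.sqrt(2.0)
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [s2, 0., 0., s2],
              [0., s2, s2, 0.]])

# Eigenvalues of G^T G are the squared singular values of G.
print(np.round(np.linalg.eigvalsh(G.T @ G), 10))   # 0, 2, 4, 6

# The null-space vector: G v_o = 0.
v_o = np.array([0.5, 0.5, -0.5, -0.5])
assert np.allclose(G @ v_o, 0.0)
```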
Worked example: eigenvectors

[Figure: the eigenvectors plotted as block models, labelled by $s_1^2 = 6$, $s_2^2 = 4$, $s_3^2 = 2$]

$$V_p = \begin{bmatrix} 0.5 & -0.5 & -0.5 \\ 0.5 & 0.5 & 0.5 \\ 0.5 & 0.5 & -0.5 \\ 0.5 & -0.5 & 0.5 \end{bmatrix}, \qquad V_o = \begin{bmatrix} 0.5 \\ 0.5 \\ -0.5 \\ -0.5 \end{bmatrix}$$
Worked example: tomography

Using all non-zero eigenvalues ($s_1$, $s_2$ and $s_3$) the resolution matrix becomes

$$\delta m^\dagger = R\, \delta m_{true} = V_p V_p^T\, \delta m_{true}$$

$$R = \begin{bmatrix} 0.75 & -0.25 & 0.25 & 0.25 \\ -0.25 & 0.75 & 0.25 & 0.25 \\ 0.25 & 0.25 & 0.75 & -0.25 \\ 0.25 & 0.25 & -0.25 & 0.75 \end{bmatrix}$$

with $V_p$ as on the previous slide.

[Figure: input model and recovered model]
Worked example: tomography

Using eigenvalues $s_1$, $s_2$ and $s_3$ the model covariance becomes

$$C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2}$$

$$C_M = \frac{\sigma^2}{4} \left\{ \frac{1}{6}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} + \frac{1}{4}\begin{bmatrix} 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \\ 1 & -1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{bmatrix} \right\}$$

where the three terms correspond to $s_1^2 = 6$, $s_2^2 = 4$ and $s_3^2 = 2$, giving

$$C_M = \frac{\sigma^2}{48} \begin{bmatrix} 11 & -7 & 5 & -1 \\ -7 & 11 & -1 & 5 \\ 5 & -1 & 11 & -7 \\ -1 & 5 & -7 & 11 \end{bmatrix}$$
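Both the resolution matrix and the covariance follow directly from $V_p$; a short numerical check of the matrices above (same reconstructed G as before):

```python
import numpy as np

s2 = np.sqrt(2.0)
G = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [s2, 0., 0., s2],
              [0., s2, s2, 0.]])
U, s, Vt = np.linalg.svd(G)
p = 3                                  # keep the three nonzero singular values
Vp, sp = Vt[:p, :].T, s[:p]

R = Vp @ Vp.T
print(np.round(R, 2))                  # 0.75 on the diagonal, +/-0.25 elsewhere

CM_over_sigma2 = Vp @ np.diag(1.0 / sp**2) @ Vp.T
print(np.round(48 * CM_over_sigma2))   # the matrix [[11,-7,5,-1], ...] above
```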
Worked example: tomography

Repeat using only the largest singular value ($s^2 = 6$), so that

$$V_p = \begin{bmatrix} 0.5 \\ 0.5 \\ 0.5 \\ 0.5 \end{bmatrix}$$

Model resolution matrix:

$$R = V_p V_p^T = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$$

Model covariance matrix:

$$C_M = \sigma^2 \sum_{i=1}^{p} \frac{v_i v_i^T}{s_i^2} = \frac{\sigma^2}{24}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$$

[Figure: input and output models]
Recap: Singular value decomposition

There may exist a model null space → models that cannot be constrained by the data.

There may exist a data null space → data that cannot be fit by any model.

The general linear discrete inverse problem may be simultaneously under- and over-determined (mixed-determined).

Singular value decomposition is a framework for dealing with ill-posed problems.

The pseudo-inverse is constructed using SVD and provides a unique model with desirable properties: it fits the data in a least squares sense and gives a minimum length model (no component in the null space).

Model resolution and covariance can be traded off by choosing the number of eigenvalues used in the reconstruction.
Ill-posedness = sensitivity to noise

Look what happens when the eigenvalues are small and positive. Truncated SVD:

$$m^\dagger = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i \cdot d}{s_i} \right) v_i$$

(discrete Picard condition; the stability question of Backus and Gilbert)

Noise in the data is amplified in the model if $s_i \ll 1$. The eigenvalue spectrum needs to be truncated by reducing p.

TSVD: choose the smallest p such that the data fit is acceptable:

$$\|Gm - d\|^2 \le \delta$$

As N or M increase the computational cost increases significantly!

(See example 4.3 of Aster et al., 2005)
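A compact TSVD solver along these lines; a sketch in which the tolerance `delta` implements the discrepancy criterion quoted above:

```python
import numpy as np

def tsvd_solve(G, d, p):
    """Truncated-SVD solution m = sum_{i=1..p} (u_i . d / s_i) v_i.

    p must not exceed the numerical rank of G, or the division blows up."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return Vt[:p, :].T @ ((U[:, :p].T @ d) / s[:p])

def smallest_acceptable_p(G, d, delta):
    """Smallest p whose misfit satisfies ||G m - d||^2 <= delta."""
    for p in range(1, min(G.shape) + 1):
        m = tsvd_solve(G, d, p)
        if np.sum((G @ m - d) ** 2) <= delta:
            return p, m
    return min(G.shape), m   # no p met the tolerance; return the full solution
```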
SVD Example: The Shaw problem

$m(\theta)$ = intensity of light incident on a slit at angle $\theta$, with $-\pi/2 \le \theta \le \pi/2$

$d(s)$ = measurements of diffracted light intensity at angle $s$, with $-\pi/2 \le s \le \pi/2$

Shaw problem: given $d(s)$, find $m(\theta)$:

$$d(s) = \int_{-\pi/2}^{\pi/2} (\cos(s) + \cos(\theta))^2 \left( \frac{\sin(\pi(\sin(s) + \sin(\theta)))}{\pi(\sin(s) + \sin(\theta))} \right)^2 m(\theta)\, d\theta$$

Is this a continuous or discrete inverse problem? Is this a linear or nonlinear inverse problem?
SVD Example: The Shaw problem

Let's discretize the inverse problem: sample the data $d(s)$ and model $m(\theta)$ at n equal angles

$$s_i = \theta_i = \frac{(i - 0.5)\pi}{n} - \frac{\pi}{2} \qquad (i = 1, 2, \ldots, n)$$

$$d_i = d(s_i), \qquad m_j = m(\theta_j) \qquad (i, j = 1, \ldots, n)$$

This gives a system of n × n linear equations, $d = Gm$, where

$$G_{i,j} = \Delta s\, (\cos(s_i) + \cos(\theta_j))^2 \left( \frac{\sin(\pi(\sin(s_i) + \sin(\theta_j)))}{\pi(\sin(s_i) + \sin(\theta_j))} \right)^2, \qquad \Delta s = \frac{\pi}{n}$$

See MATLAB routine `shaw`.
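A Python transcription of this discretization (a sketch mirroring the `shaw` routine; `np.sinc(x)` implements exactly the $\sin(\pi x)/(\pi x)$ factor in the kernel):

```python
import numpy as np

def shaw(n):
    """Discretized Shaw forward operator G (n x n) and the angle grid."""
    theta = (np.arange(1, n + 1) - 0.5) * np.pi / n - np.pi / 2   # = s_i
    S, T = np.meshgrid(theta, theta, indexing="ij")               # s_i rows, theta_j cols
    G = (np.pi / n) * (np.cos(S) + np.cos(T)) ** 2 \
        * np.sinc(np.sin(S) + np.sin(T)) ** 2
    return G, theta

G, theta = shaw(20)
_, s, _ = np.linalg.svd(G)
print(s[0] / s[-1])   # condition number, ~1e14 for n = 20: severely ill-posed
```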
Example: Ill-posedness

Ill-posedness means solution sensitivity to noise.

$$d = Gm, \qquad m^\dagger = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i \cdot d}{s_i} \right) v_i$$

[Figure: eigenvalue spectrum $s_i$ versus index i for the Shaw problem with 20 data and 20 unknowns (N = M = 20)]

The condition number, the ratio of largest to smallest singular value, is about $10^{14}$. A large condition number means severe ill-posedness.
Example: Ill-posedness

Eigenvectors for different singular values in the Shaw problem:

$$m^\dagger = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i \cdot d}{s_i} \right) v_i$$

[Figure: $v_1$, the eigenvector for the largest singular value, and $v_{18}$, the eigenvector for the smallest non-zero singular value, plotted in model units]
Test inversion without noise

$$d = Gm, \qquad m^\dagger = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i \cdot d}{s_i} \right) v_i$$

[Figure: input spike model, the data it generates, and the recovered model]
Test inversion with noise

$$d = Gm, \qquad m^\dagger = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i \cdot d}{s_i} \right) v_i$$

Add Gaussian noise to the data: $\sigma = 10^{-6}$.

[Figure: input spike model, noisy data from the spike model, and the recovered model]

The presence of small eigenvalues means sensitivity of the solution to noise.
Shaw problem with p = 10

$$d = Gm, \qquad m^\dagger = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i \cdot d}{s_i} \right) v_i$$

Use the first 10 eigenvalues only.

[Figure: input spike model; recovered solutions with and without noise]

Truncating eigenvalues reduces sensitivity to noise, but also the resolving power of the data.
Shaw problem Picard plot

A guide to choosing the SVD truncation level p (= number of eigenvalues):

$$m^\dagger = V_p S_p^{-1} U_p^T d = \sum_{i=1}^{p} \left( \frac{u_i \cdot d}{s_i} \right) v_i$$

[Figure: Picard plot, with the eigenvalue at the SVD truncation level marked]
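A Picard plot is simply the decay of $|u_i \cdot d|$ compared with $s_i$; a matplotlib sketch rebuilding the discretized G from the earlier `shaw` sketch (the test model and noise level are illustrative assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt

n = 20
theta = (np.arange(1, n + 1) - 0.5) * np.pi / n - np.pi / 2
S, T = np.meshgrid(theta, theta, indexing="ij")
G = (np.pi / n) * (np.cos(S) + np.cos(T)) ** 2 * np.sinc(np.sin(S) + np.sin(T)) ** 2

m_true = np.exp(-theta ** 2)                       # an arbitrary smooth test model
rng = np.random.default_rng(3)
d = G @ m_true + 1e-6 * rng.standard_normal(n)     # assumed noise level

U, s, Vt = np.linalg.svd(G)
coeffs = np.abs(U.T @ d)                           # |u_i . d|

plt.semilogy(s, "o-", label=r"$s_i$")
plt.semilogy(coeffs, "s-", label=r"$|u_i \cdot d|$")
plt.semilogy(coeffs / s, "^-", label=r"$|u_i \cdot d| / s_i$")
plt.xlabel("index $i$")
plt.legend()
plt.show()
# Truncate p where |u_i . d| stops decaying faster than s_i: beyond that
# index the discrete Picard condition fails and noise dominates the solution.
```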