Jos M. F. Ten Berge , 1977 ORTHOGONAL PROCRUSTES ROTATION FOR

ORTHOGONAL PROCRUSTES ROTATION FOR
TWO OR MORE MATRICES
Jos M. F. Ten Berge , 1977
Presented by Arne Gjuvsland - INF9540 - 2005
ORTHOGONAL PROCRUSTES ROTATION FOR TWO OR MORE MATRICES – p.1/20
The Orthogonal Procrustes Problem
Problem: How to rotate matrices to maximum
agreement in the least-squares sense?
Let $A_i$ ($i = 1, 2, \ldots, m$) be matrices of size $n \times k$ ($n \ge k$).
The problem is to find orthonormal matrices $T_i$ ($k \times k$) for
which the function
$$f(T_1, \ldots, T_m) = \sum_{i<j} \operatorname{tr}\big((A_i T_i - A_j T_j)^T (A_i T_i - A_j T_j)\big)$$
is minimized, which is equivalent to maximizing
$$g(T_1, \ldots, T_m) = \sum_{i<j} \operatorname{tr}(T_i^T A_i^T A_j T_j).$$
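The equivalence of the two criteria can be checked numerically. The sketch below is illustrative only: the function names `f_loss` and `g_agreement`, the random test data and the use of QR factors as orthonormal matrices are assumptions, not part of the original slides. It relies on the fact that $\operatorname{tr}(T_i^T A_i^T A_i T_i) = \operatorname{tr}(A_i^T A_i)$ does not depend on $T_i$, so $f = (m-1)\sum_i \operatorname{tr}(A_i^T A_i) - 2g$.

```python
import numpy as np

def f_loss(mats):
    """Sum over pairs i<j of the squared Frobenius distance between B_i and B_j."""
    m = len(mats)
    return sum(np.sum((mats[i] - mats[j]) ** 2)
               for i in range(m) for j in range(i + 1, m))

def g_agreement(mats):
    """Sum over pairs i<j of tr(B_i^T B_j)."""
    m = len(mats)
    return sum(np.trace(mats[i].T @ mats[j])
               for i in range(m) for j in range(i + 1, m))

rng = np.random.default_rng(0)
n, k, m = 10, 3, 4
A = [rng.standard_normal((n, k)) for _ in range(m)]
# Arbitrary orthonormal T_i, taken from QR decompositions of random matrices.
T = [np.linalg.qr(rng.standard_normal((k, k)))[0] for _ in range(m)]
B = [Ai @ Ti for Ai, Ti in zip(A, T)]

# The constant term does not depend on the rotations T_i.
const = (m - 1) * sum(np.sum(Ai ** 2) for Ai in A)
identity_holds = np.isclose(f_loss(B), const - 2 * g_agreement(B))
print(identity_holds)
```

Since the constant is fixed by the data, any choice of rotations that lowers f raises g by half as much.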
Example 1
$A_1$ and $A_2$ are ($10 \times 2$).
$g(A_1, A_2) = 7.8924$
$g(A_1 T_1, A_2) = 15.9254$
[Figure: two scatter plots of second column against first column. Left panel: A1 and A2. Right panel: A1T1 and A2.]
Special case m = 2
When the number of matrices is two, taking $T_2 = I_k$,
$g$ reduces to
$$g(T_1) = \operatorname{tr}(T_1^T A_1^T A_2).$$
The solution to this reduced problem is well known and
based on the SVD of $A_1^T A_2$.
Singular Value (or Eckart-Young) Decomposition
Theorem 1. If $X$ is a real $n \times k$ matrix of rank $r$ ($n \ge k \ge r$),
then matrices $P_r$ ($n \times r$), $D_r$ ($r \times r$) and $Q_r$ ($k \times r$) can be
constructed which satisfy the equation
$$X = P_r D_r Q_r^T,$$
where
$$P_r^T P_r = Q_r^T Q_r = I_r,$$
and $D_r$ is diagonal and positive definite.
By adding orthonormal columns to $P_r$ and $Q_r$ and zeros to
$D_r$ one can construct $P$ ($n \times k$), $D$ ($k \times k$) and $Q$ ($k \times k$)
satisfying
$$X = P_r D_r Q_r^T = P D Q^T.$$
Solution when m = 2
Symmetry: $X^T = X$.
Positive semidefinite: $X$ ($n \times n$), for all non-zero vectors
$v$ ($n \times 1$), $v^T X v \ge 0$.
Theorem 2. The function $g(T_1) = \operatorname{tr}(T_1^T A_1^T A_2)$, where $T_1$
is a ($k \times k$) orthonormal matrix, is maximized if and only
if $T_1^T A_1^T A_2$ is symmetric and positive semi-definite
(SPSD).
If $P D Q^T$ is an SVD of $A_1^T A_2$, and we let $T_1 = P Q^T$, then
$T_1^T A_1^T A_2 = Q D Q^T$ is SPSD. This is the solution of the
Orthogonal Procrustes problem when $m = 2$.
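A minimal NumPy sketch of this two-matrix solution follows. The function name `procrustes_rotation` and the random test matrices are assumptions made for illustration. Note that `np.linalg.svd` returns $Q^T$ directly, and that the resulting orthonormal matrix may be a reflection as well as a rotation.

```python
import numpy as np

def procrustes_rotation(A1, A2):
    """Orthonormal T maximizing tr(T^T A1^T A2), i.e. rotating A1 toward A2."""
    P, d, Qt = np.linalg.svd(A1.T @ A2)   # A1^T A2 = P diag(d) Q^T
    return P @ Qt                          # T1 = P Q^T

rng = np.random.default_rng(1)
A1 = rng.standard_normal((10, 2))
A2 = rng.standard_normal((10, 2))

T1 = procrustes_rotation(A1, A2)
S = T1.T @ A1.T @ A2                       # should equal Q D Q^T, hence SPSD

# Compare with an arbitrary orthonormal competitor N.
N = np.linalg.qr(rng.standard_normal((2, 2)))[0]
g_opt = np.trace(S)
g_other = np.trace(N.T @ A1.T @ A2)
print(g_opt >= g_other)
```

The trace reached by `T1` equals the sum of the singular values of $A_1^T A_2$, which is the value $\operatorname{tr}(D)$ used in the proof of Theorem 2.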
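Proof for Theorem 2: g maximised ⇒ $T_1^T A_1^T A_2$ is SPSD
Let $T_1^T A_1^T A_2 = P_r D_r Q_r^T = P D Q^T$ be an SVD of $T_1^T A_1^T A_2$ and
assume that for any orthonormal ($k \times k$) matrix $N$
$$\operatorname{tr}(T_1^T A_1^T A_2) \ge \operatorname{tr}(N^T A_1^T A_2). \tag{1}$$
If $T_1^T A_1^T A_2$ is not SPSD then $Q_r \ne P_r$ and
$$\operatorname{tr}(T_1^T A_1^T A_2) = \operatorname{tr}(P_r D_r Q_r^T) = \operatorname{tr}(Q_r^T P_r D_r) < \operatorname{tr}(D_r). \tag{2}$$
But taking $N = T_1 P Q^T$ gives
$$\operatorname{tr}(N^T A_1^T A_2) = \operatorname{tr}(Q P^T T_1^T A_1^T A_2) = \operatorname{tr}(Q P^T P D Q^T) = \operatorname{tr}(D) = \operatorname{tr}(D_r), \tag{3}$$
which together with (2) contradicts (1).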
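Proof for Theorem 2: g maximised ⇐ $T_1^T A_1^T A_2$ is SPSD
Let $T_1^T A_1^T A_2$ be SPSD. Then
$$T_1^T A_1^T A_2 = P D Q^T = P D P^T \tag{4}$$
and
$$\operatorname{tr}(T_1^T A_1^T A_2) = \operatorname{tr}(P D P^T) = \operatorname{tr}(P^T P D) = \operatorname{tr}(D). \tag{5}$$
If $N$ is an arbitrary orthonormal ($k \times k$) matrix then
$$\operatorname{tr}(N^T A_1^T A_2) = \operatorname{tr}(N^T T_1 T_1^T A_1^T A_2) = \operatorname{tr}(N^T T_1 P D P^T) = \operatorname{tr}(P^T N^T T_1 P D) \le \operatorname{tr}(D), \tag{6}$$
where the inequality holds because $P^T N^T T_1 P$ is orthonormal, so its diagonal elements are at most 1, and $D$ is diagonal with non-negative entries.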
General case m > 2
Sufficient and necessary conditions for maximizing $g$
(A) $S_{ij} = T_i^T A_i^T A_j T_j$ is SPSD for $i, j = 1, 2, \ldots, m$
⇓
(B) $g$ is maximal
⇓
(C) $S_{i\cdot} = T_i^T A_i^T \sum_{j \ne i} A_j T_j$ is SPSD for $i = 1, 2, \ldots, m$
⇓
(D) $S_i = T_i^T A_i^T \sum_{j} A_j T_j$ is SPSD for $i = 1, 2, \ldots, m$
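Example 2: (B) ⇏ (A)
The condition that $T_i^T A_i^T A_j T_j$ is SPSD for $i, j = 1, 2, \ldots, m$ cannot
always be fulfilled.
$$A_1 = \begin{pmatrix} I_k \\ I_k \\ 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} -I_k \\ 0 \\ I_k \end{pmatrix}, \quad A_3 = \begin{pmatrix} 0 \\ I_k \\ I_k \end{pmatrix}.$$
$$A_1^T A_2 = -I_k, \quad A_1^T A_3 = I_k, \quad A_2^T A_3 = I_k.$$
$T_1^T A_1^T A_3 T_3$ SPSD ⇒ $T_1 = T_3$, and $T_2^T A_2^T A_3 T_3$ SPSD ⇒ $T_2 = T_3$.
But then $T_1 = T_2$, which makes $T_1^T A_1^T A_2 T_2$ negative definite.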
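Example 3: (C) ⇏ (B)
$$A_1 T_1, A_2 T_2, A_3 T_3, A_4 T_4 = \begin{pmatrix} I_k \\ I_k \\ 0 \end{pmatrix}, \begin{pmatrix} -I_k \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ I_k \end{pmatrix}, \begin{pmatrix} 0 \\ I_k \\ I_k \end{pmatrix}.$$
$$S_{1\cdot} = T_1^T A_1^T A_2 T_2 + T_1^T A_1^T A_3 T_3 + T_1^T A_1^T A_4 T_4 = -I_k + 0 + I_k = 0.$$
$$S_{1\cdot} = 0, \quad S_{2\cdot} = 0, \quad S_{3\cdot} = I_k, \quad S_{4\cdot} = I_k.$$
(C) is fulfilled, but $g = k$; jointly changing the signs of $A_2 T_2$ and
$A_1 T_1$ would give $g = 3k$.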
Example 4: (D) ⇏ (C)
$$A_1 T_1, A_2 T_2, A_3 T_3 = \begin{pmatrix} I_k \\ I_k \\ 0 \end{pmatrix}, \begin{pmatrix} -I_k \\ 0 \\ I_k \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ I_k \end{pmatrix}.$$
$$S_{1\cdot} = T_1^T A_1^T A_2 T_2 + T_1^T A_1^T A_3 T_3 = -I_k + 0 = -I_k.$$
$$S_1 = T_1^T A_1^T A_1 T_1 + T_1^T A_1^T A_2 T_2 + T_1^T A_1^T A_3 T_3 = 2I_k + (-I_k) + 0 = I_k.$$
$$S_1 = I_k, \quad S_2 = 2I_k, \quad S_3 = 2I_k.$$
(D) is fulfilled, but not (C).
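The block arithmetic of Example 4 can be verified numerically. The sketch below assumes $k = 2$ and reads the three rotated matrices as the block columns $(I_k, I_k, 0)$, $(-I_k, 0, I_k)$ and $(0, 0, I_k)$; the helper names `S_full` and `S_dot` are illustrative, not from the slides.

```python
import numpy as np

k = 2
I, O = np.eye(k), np.zeros((k, k))
B = [np.vstack([I, I, O]),     # A1 T1
     np.vstack([-I, O, I]),    # A2 T2
     np.vstack([O, O, I])]     # A3 T3

def S_full(i):
    """S_i = B_i^T * (sum over all j of B_j), the matrix in condition (D)."""
    return B[i].T @ sum(B)

def S_dot(i):
    """S_i. = B_i^T * (sum over j != i of B_j), the matrix in condition (C)."""
    return S_full(i) - B[i].T @ B[i]

def is_spsd(S, tol=1e-12):
    """Symmetric with all eigenvalues non-negative (up to tol)."""
    return np.allclose(S, S.T) and np.all(np.linalg.eigvalsh((S + S.T) / 2) >= -tol)

cond_D = all(is_spsd(S_full(i)) for i in range(3))
cond_C = all(is_spsd(S_dot(i)) for i in range(3))
print(cond_D, cond_C)
```

Here `S_dot(0)` equals $-I_k$, which is what breaks condition (C) while every `S_full(i)` stays positive definite.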
Algorithm for the general case
In each step a pair of matrices is compared ($m = 2$) and one
of them is rotated.
Step 1. Rotate $A_1$ to $\sum_{j=2}^{m} A_j$, to get $A_1 T_1^{(1)}$.
Step 2. Rotate $A_2$ to $A_1 T_1^{(1)} + \sum_{j=3}^{m} A_j$, to get $A_2 T_2^{(1)}$.
...
Step m. Rotate $A_m$ to $\sum_{j=1}^{m-1} A_j T_j^{(1)}$, to get $A_m T_m^{(1)}$.
Step m+1. Rotate $A_1 T_1^{(1)}$ to $\sum_{j=2}^{m} A_j T_j^{(1)}$, to get $A_1 T_1^{(2)}$.
Terminate if $m$ consecutive steps jointly fail to raise $g$ by more than a given
threshold.
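The steps above can be sketched in NumPy as follows. This is an illustrative reading of the slide, not the author's code: the function names, the default threshold `tol`, the cap `max_cycles` and the test data (three noisy rotated copies of a random matrix) are all assumptions. Each pass of the inner loop performs $m$ steps, and each step is the two-matrix SVD solution applied to one matrix and the sum of the others.

```python
import numpy as np

def g_agreement(mats):
    """Sum over pairs i<j of tr(B_i^T B_j)."""
    m = len(mats)
    return sum(np.trace(mats[i].T @ mats[j])
               for i in range(m) for j in range(i + 1, m))

def rotate_to(A, target):
    """Two-matrix Procrustes solution: orthonormal R maximizing tr(R^T A^T target)."""
    P, _, Qt = np.linalg.svd(A.T @ target)
    return P @ Qt

def generalized_procrustes_rotation(mats, tol=1e-10, max_cycles=1000):
    B = [A.copy() for A in mats]                 # current A_j T_j
    T = [np.eye(A.shape[1]) for A in mats]       # accumulated rotations T_j
    history = [g_agreement(B)]
    for _ in range(max_cycles):
        for i in range(len(B)):
            target = sum(B) - B[i]               # sum of all other current matrices
            R = rotate_to(B[i], target)
            B[i] = B[i] @ R
            T[i] = T[i] @ R
        history.append(g_agreement(B))
        if history[-1] - history[-2] <= tol:     # m steps failed to raise g enough
            break
    return T, B, history

def rotation(deg):
    a = np.deg2rad(deg)
    return np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])

rng = np.random.default_rng(5)
A1 = rng.standard_normal((5, 2))
A2 = A1 @ rotation(60) + 0.05 * rng.standard_normal((5, 2))
A3 = A1 @ rotation(120) + 0.05 * rng.standard_normal((5, 2))
T, B, history = generalized_procrustes_rotation([A1, A2, A3])
print(history[0], history[-1])
```

Because each step solves its two-matrix subproblem exactly and the identity rotation is always a candidate, the recorded values of $g$ in `history` never decrease.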
Example 5
$A_1$ is a ($5 \times 2$) matrix drawn from $Z(0, 1)$. $A_1$ is rotated by 60
and 120 degrees and noise from $Z(0, 0.05)$ is added to get $A_2$
and $A_3$.
[Figure: four panels titled "Original matrices", "Step nr. 1", "Step nr. 2" and "Step nr. 3", each plotting A1, A2 and A3.]
Upper bounds for g
The algorithm can be shown to converge, but it only
ensures the strong necessary condition (C) for maximal
agreement. Two upper bounds for $g$ can be compared
to the computed value.
Bound 1: let $A_i^T A_j = P_{ij} D_{ij} Q_{ij}^T$ be an SVD. Then
$\operatorname{tr}(T_i^T A_i^T A_j T_j) \le \operatorname{tr}(D_{ij})$, and summing gives
$$g \le \sum_{i<j} \operatorname{tr}(D_{ij}).$$
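Bound 1 is simply the sum of the singular values of every cross-product $A_i^T A_j$ with $i < j$. A short sketch, with hypothetical function names and random test data chosen only for illustration:

```python
import numpy as np

def g_agreement(mats):
    """Sum over pairs i<j of tr(B_i^T B_j)."""
    m = len(mats)
    return sum(np.trace(mats[i].T @ mats[j])
               for i in range(m) for j in range(i + 1, m))

def upper_bound_1(mats):
    """Sum over pairs i<j of the singular values of A_i^T A_j."""
    m = len(mats)
    return sum(np.linalg.svd(mats[i].T @ mats[j], compute_uv=False).sum()
               for i in range(m) for j in range(i + 1, m))

rng = np.random.default_rng(6)
A = [rng.standard_normal((10, 9)) for _ in range(7)]
# Arbitrary orthonormal rotations; any choice must stay below the bound.
T = [np.linalg.qr(rng.standard_normal((9, 9)))[0] for _ in range(7)]
g_value = g_agreement([Ai @ Ti for Ai, Ti in zip(A, T)])
bound_1 = upper_bound_1(A)
print(g_value <= bound_1)
```

The bound treats every pair independently, so it is attained only when all pairwise optimal rotations are mutually compatible, which is condition (A).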
Upper bounds for g (2)
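Bound 2: let
$$A^T A = \begin{pmatrix} 0 & A_1^T A_2 & \cdots & A_1^T A_m \\ A_2^T A_1 & 0 & \cdots & A_2^T A_m \\ \vdots & \vdots & \ddots & \vdots \\ A_m^T A_1 & A_m^T A_2 & \cdots & 0 \end{pmatrix}.$$
If $A^T A = P \Delta P^T$ is an eigenvalue decomposition (EVD) with the eigenvalues $\Delta_{ii}$ in descending order, then a second
bound for $g$ can be computed by
$$g \le \frac{m}{2} \sum_{i=1}^{k} \Delta_{ii}.$$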
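A sketch of Bound 2 in NumPy, under the assumption that the sum runs over the $k$ largest eigenvalues of the block matrix with zeroed diagonal blocks. The function names and test data are illustrative. `np.linalg.eigvalsh` returns eigenvalues in ascending order, so the last $k$ entries are used.

```python
import numpy as np

def g_agreement(mats):
    """Sum over pairs i<j of tr(B_i^T B_j)."""
    m = len(mats)
    return sum(np.trace(mats[i].T @ mats[j])
               for i in range(m) for j in range(i + 1, m))

def upper_bound_2(mats):
    """(m / 2) times the sum of the k largest eigenvalues of the off-diagonal block matrix."""
    m, k = len(mats), mats[0].shape[1]
    A = np.hstack(mats)                       # n x (m k) supermatrix (A_1, ..., A_m)
    M = A.T @ A                               # blocks A_i^T A_j
    for i in range(m):                        # zero the diagonal blocks A_i^T A_i
        M[i * k:(i + 1) * k, i * k:(i + 1) * k] = 0.0
    eigvals = np.linalg.eigvalsh(M)           # ascending order
    return 0.5 * m * eigvals[-k:].sum()

rng = np.random.default_rng(7)
A = [rng.standard_normal((10, 9)) for _ in range(7)]
T = [np.linalg.qr(rng.standard_normal((9, 9)))[0] for _ in range(7)]
g_value = g_agreement([Ai @ Ti for Ai, Ti in zip(A, T)])
bound_2 = upper_bound_2(A)
print(g_value <= bound_2)
```

The bound follows from writing $g = \tfrac{1}{2}\operatorname{tr}(T^T M T)$ with $T$ the stacked $T_i$, which satisfies $T^T T = m I_k$; relaxing the blockwise orthonormality to this single constraint gives the eigenvalue sum.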
Example 6 with bounds
Seven ($10 \times 9$) matrices $A_1, A_2, \ldots, A_7$ were drawn from
$Z(0, 1)$.
[Figure: upper bound 1, upper bound 2 and the value of g for the dataset, plotted against step nr.]
Gower’s Generalized Procrustes Analysis (GGPA)
GGPA includes rotation, scaling and translation of two
or more matrices.
Corrections and improvements to the rotation step and
scaling step in Gower’s algorithm are suggested.
For the rotation step the algorithm developed here is
better than the Kristof and Wingersky procedure used
by Gower.
Generalized Procrustes Analysis - scaling
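Let $A_1, A_2, \ldots, A_m$ be $m$ matrices of size $n \times k$, prescaled so that
$\sum_i \operatorname{tr}(A_i^T A_i) = m$.
We want to find scaling constants $c_1, c_2, \ldots, c_m$ that maximize
$$h(c_1, c_2, \ldots, c_m) = \sum_{i<j} c_i c_j \operatorname{tr}(A_i^T A_j)$$
under the constraint
$$\sum_i c_i^2 \operatorname{tr}(A_i^T A_i) = \sum_i \operatorname{tr}(A_i^T A_i) = m.$$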
Generalized Procrustes Analysis - scaling - solution



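Let
$$Y = \begin{pmatrix} \operatorname{tr}(A_1^T A_1) & \operatorname{tr}(A_1^T A_2) & \cdots & \operatorname{tr}(A_1^T A_m) \\ \operatorname{tr}(A_2^T A_1) & \operatorname{tr}(A_2^T A_2) & \cdots & \operatorname{tr}(A_2^T A_m) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{tr}(A_m^T A_1) & \operatorname{tr}(A_m^T A_2) & \cdots & \operatorname{tr}(A_m^T A_m) \end{pmatrix},$$
$Y_d = \operatorname{diag}(Y)$, and $\Phi = Y_d^{-1/2} Y Y_d^{-1/2}$.
If $\Phi = P \Delta P^T$ is an eigenvalue decomposition of $\Phi$, then the
scaling problem is solved by setting
$$c_i = \left( \frac{m}{\operatorname{tr}(A_i^T A_i)} \right)^{1/2} P_{i1}.$$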
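The scaling solution can be sketched as below, assuming $P_{i1}$ denotes the $i$-th element of the eigenvector belonging to the largest eigenvalue of $\Phi$. The function names, the sign normalization of the eigenvector and the test data are assumptions added for illustration.

```python
import numpy as np

def scaling_constants(mats):
    """c_i = sqrt(m / tr(A_i^T A_i)) * P_i1, with P_.1 the leading eigenvector of Phi."""
    m = len(mats)
    Y = np.array([[np.trace(Ai.T @ Aj) for Aj in mats] for Ai in mats])
    yd = np.diag(Y)                            # tr(A_i^T A_i)
    Phi = Y / np.sqrt(np.outer(yd, yd))        # Yd^(-1/2) Y Yd^(-1/2)
    eigvals, P = np.linalg.eigh(Phi)           # eigenvalues in ascending order
    p1 = P[:, -1]                              # eigenvector of the largest eigenvalue
    if p1.sum() < 0:                           # eigenvector sign is arbitrary
        p1 = -p1
    return np.sqrt(m / yd) * p1

def h(c, mats):
    """Sum over pairs i<j of c_i c_j tr(A_i^T A_j)."""
    m = len(mats)
    return sum(c[i] * c[j] * np.trace(mats[i].T @ mats[j])
               for i in range(m) for j in range(i + 1, m))

rng = np.random.default_rng(8)
base = rng.standard_normal((10, 3))
raw = [s * (base + 0.3 * rng.standard_normal((10, 3))) for s in (0.5, 1.0, 2.0, 4.0)]
m = len(raw)
total = sum(np.sum(A ** 2) for A in raw)
mats = [A * np.sqrt(m / total) for A in raw]   # prescale: sum_i tr(A_i^T A_i) = m

c = scaling_constants(mats)
yd = np.array([np.sum(A ** 2) for A in mats])
print(np.isclose(np.sum(c ** 2 * yd), m), h(c, mats) >= h(np.ones(m), mats))
```

With the prescaling, $c_i = 1$ for all $i$ also satisfies the constraint, so comparing `h(c, mats)` with `h(np.ones(m), mats)` gives a quick sanity check that the eigenvector solution does not do worse than leaving the matrices unscaled.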