ORTHOGONAL PROCRUSTES ROTATION FOR TWO OR MORE MATRICES
Jos M. F. Ten Berge, 1977
Presented by Arne Gjuvsland - INF9540 - 2005

The Orthogonal Procrustes Problem

Problem: How to rotate matrices to maximum agreement in the least-squares sense?

Let $A_i$ ($i = 1, 2, \ldots, m$) be matrices of size $n \times k$ ($n \ge k$). The problem is to find orthonormal matrices $T_i$ ($k \times k$) for which the function

$$f(T_1, \ldots, T_m) = \sum_{i < j} \operatorname{tr}\big((A_i T_i - A_j T_j)^T (A_i T_i - A_j T_j)\big)$$

is minimized, which is equivalent to maximizing

$$g(T_1, \ldots, T_m) = \sum_{i < j} \operatorname{tr}(T_i^T A_i^T A_j T_j).$$

Example 1

$A_1$ and $A_2$ are $(10 \times 2)$. $g(A_1, A_2) = 7.8924$, $g(A_1 T_1, A_2) = 15.9254$.

[Figure: two scatter plots of second column against first column. Left panel: $A_1$ and $A_2$. Right panel: $A_1 T_1$ and $A_2$.]

Special case m = 2

When the number of matrices is two, taking $T_2 = I_k$, $g$ reduces to

$$g(T_1) = \operatorname{tr}(T_1^T A_1^T A_2).$$

The solution to this reduced problem is well known and is based on the SVD of $A_1^T A_2$.

Singular Value (or Eckart-Young) Decomposition

Theorem 1. If $X$ is a real $n \times k$ matrix of rank $r$ ($n \ge k \ge r$), then matrices $P_r$ ($n \times r$), $D_r$ ($r \times r$) and $Q_r$ ($k \times r$) can be constructed which satisfy the equation

$$X = P_r D_r Q_r^T,$$

where $P_r^T P_r = Q_r^T Q_r = I_r$, and $D_r$ is diagonal and positive definite.

By adding orthonormal columns to $P_r$ and $Q_r$ and zeros to $D_r$ one can construct $P$ ($n \times k$), $D$ ($k \times k$) and $Q$ ($k \times k$) satisfying

$$X = P_r D_r Q_r^T = P D Q^T.$$

Solution when m = 2

Symmetry: $X^T = X$.
Positive semidefinite: $X$ ($n \times n$) satisfies $v^T X v \ge 0$ for all non-zero vectors $v$ ($n \times 1$).

Theorem 2.
The function $g(T_1) = \operatorname{tr}(T_1^T A_1^T A_2)$, where $T_1$ is a $(k \times k)$ orthonormal matrix, is maximized if and only if $T_1^T A_1^T A_2$ is symmetric and positive semidefinite (SPSD).

If $P D Q^T$ is an SVD of $A_1^T A_2$ and we let $T_1 = P Q^T$, then $T_1^T A_1^T A_2 = Q D Q^T$ is SPSD. This is the solution of the Orthogonal Procrustes problem when $m = 2$.

Proof for Theorem 2: g maximised ⇒ $T_1^T A_1^T A_2$ is SPSD

Let $T_1^T A_1^T A_2 = P_r D_r Q_r^T = P D Q^T$ be an SVD of $T_1^T A_1^T A_2$ and assume that for any orthonormal $(k \times k)$ matrix $N$

(1) $\operatorname{tr}(T_1^T A_1^T A_2) \ge \operatorname{tr}(N^T A_1^T A_2)$.

If $T_1^T A_1^T A_2$ is not SPSD then $Q_r \ne P_r$ and

(2) $\operatorname{tr}(T_1^T A_1^T A_2) = \operatorname{tr}(P_r D_r Q_r^T) = \operatorname{tr}(Q_r^T P_r D_r) < \operatorname{tr} D_r$.

But taking $N = T_1 P Q^T$ gives

(3) $\operatorname{tr}(N^T A_1^T A_2) = \operatorname{tr}(Q P^T P D Q^T) = \operatorname{tr}(D) = \operatorname{tr}(D_r)$,

which contradicts (1).

Proof for Theorem 2: g maximised ⇐ $T_1^T A_1^T A_2$ is SPSD

Let $T_1^T A_1^T A_2$ be SPSD. Then

(4) $T_1^T A_1^T A_2 = P D Q^T = P D P^T$

and

(5) $\operatorname{tr}(T_1^T A_1^T A_2) = \operatorname{tr}(P D P^T) = \operatorname{tr}(P P^T D) = \operatorname{tr}(D)$.

If $N$ is an arbitrary orthonormal $(k \times k)$ matrix then

(6) $\operatorname{tr}(N^T A_1^T A_2) = \operatorname{tr}(N^T T_1 T_1^T A_1^T A_2) = \operatorname{tr}(N^T T_1 P D P^T) = \operatorname{tr}(P^T N^T T_1 P D) \le \operatorname{tr} D$.

General case m > 2

Sufficient and necessary conditions for maximizing g

(A) $S_{ij} = T_i^T A_i^T A_j T_j$ is SPSD for $i, j = 1, 2, \ldots, m$
⇓
(B) $g$ is maximal
⇓
(C) $S_{i.} = T_i^T A_i^T \sum_{j \ne i} A_j T_j$ is SPSD for $i = 1, 2, \ldots, m$
⇓
(D) $S_i = T_i^T A_i^T \sum_{j} A_j T_j$ is SPSD for $i = 1, 2, \ldots, m$

Example 2: (B) ⇏ (A)

The condition that $T_i^T A_i^T A_j T_j$ is SPSD for $i, j = 1, 2, \ldots, m$ cannot always be fulfilled. Let

$$A_1 = \begin{pmatrix} I_k \\ I_k \\ 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} -I_k \\ 0 \\ I_k \end{pmatrix}, \quad A_3 = \begin{pmatrix} 0 \\ I_k \\ I_k \end{pmatrix}.$$

Then

$$A_1^T A_2 = -I_k, \quad A_1^T A_3 = I_k, \quad A_2^T A_3 = I_k.$$

$T_1^T A_1^T A_3 T_3$ SPSD ⇒ $T_1 = T_3$, and $T_2^T A_2^T A_3 T_3$ SPSD ⇒ $T_2 = T_3$. But then $T_1 = T_2$, which makes $T_1^T A_1^T A_2 T_2$ negative definite.
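The m = 2 solution of Theorem 2 and the cross-products in Example 2 can be checked numerically. What follows is a minimal NumPy sketch, not code from the paper or the slides: the helper name `rotate_to`, the random test matrices, the seed and the choice k = 2 for Example 2 are assumptions made for illustration.

```python
import numpy as np

def rotate_to(A, B):
    """Return the orthonormal T = P Q^T built from the SVD A^T B = P D Q^T."""
    P, _, Qt = np.linalg.svd(A.T @ B)   # NumPy returns Q^T directly
    return P @ Qt

# Random (10 x 2) matrices standing in for A1 and A2 (assumed data).
rng = np.random.default_rng(0)
A1 = rng.standard_normal((10, 2))
A2 = rng.standard_normal((10, 2))

T1 = rotate_to(A1, A2)
S = T1.T @ A1.T @ A2                    # should equal Q D Q^T, hence SPSD
g_before = np.trace(A1.T @ A2)          # g with T1 = I
g_after = np.trace(S)                   # g with T1 = P Q^T
sing = np.linalg.svd(A1.T @ A2, compute_uv=False)

print("symmetric:", np.allclose(S, S.T))
print("eigenvalues of S:", np.linalg.eigvalsh((S + S.T) / 2))
print("g before / after / tr(D):", g_before, g_after, sing.sum())

# Example 2 with k = 2 (k is an assumed value): A1^T A2 = -I_k,
# so rotating A1 to A2 alone must give T = -I_k.
k = 2
I, Z = np.eye(k), np.zeros((k, k))
E1 = np.vstack([I, I, Z])
E2 = np.vstack([-I, Z, I])
E3 = np.vstack([Z, I, I])
T_ex = rotate_to(E1, E2)
print(T_ex)
```

The rotation uses only the two orthonormal factors of the SVD, so the singular values themselves are needed only to evaluate the attained maximum tr(D).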
Example 3: (C) ⇏ (B)

$$(A_1 T_1,\ A_2 T_2,\ A_3 T_3,\ A_4 T_4) = \left( \begin{pmatrix} I_k \\ I_k \\ 0 \end{pmatrix}, \begin{pmatrix} -I_k \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ I_k \end{pmatrix}, \begin{pmatrix} 0 \\ I_k \\ I_k \end{pmatrix} \right).$$

$$S_{1.} = T_1^T A_1^T A_2 T_2 + T_1^T A_1^T A_3 T_3 + T_1^T A_1^T A_4 T_4 = -I_k + 0 + I_k = 0.$$

$S_{1.} = 0$, $S_{2.} = 0$, $S_{3.} = I_k$, $S_{4.} = I_k$.

(C) is fulfilled, but $g = k$; jointly changing the signs of $A_2 T_2$ and $A_1 T_1$ would give $g = 3k$.

Example 4: (D) ⇏ (C)

$$(A_1 T_1,\ A_2 T_2,\ A_3 T_3) = \left( \begin{pmatrix} I_k \\ I_k \\ 0 \end{pmatrix}, \begin{pmatrix} -I_k \\ 0 \\ I_k \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ I_k \end{pmatrix} \right).$$

$$S_{1.} = T_1^T A_1^T A_2 T_2 + T_1^T A_1^T A_3 T_3 = -I_k + 0 = -I_k.$$

$$S_1 = T_1^T A_1^T A_1 T_1 + T_1^T A_1^T A_2 T_2 + T_1^T A_1^T A_3 T_3 = 2I_k + (-I_k) + 0 = I_k.$$

$S_1 = I_k$, $S_2 = 2I_k$, $S_3 = 2I_k$.

(D) is fulfilled, but not (C).

Algorithm for the general case

In each step a pair of matrices is compared (as in the case $m = 2$) and one of them is rotated.

Step 1. Rotate $A_1$ to $\sum_{j=2}^{m} A_j$, to get $A_1 T_1^{(1)}$.
Step 2. Rotate $A_2$ to $A_1 T_1^{(1)} + \sum_{j=3}^{m} A_j$, to get $A_2 T_2^{(1)}$.
Step m. Rotate $A_m$ to $\sum_{j=1}^{m-1} A_j T_j^{(1)}$, to get $A_m T_m^{(1)}$.
Step m+1. Rotate $A_1 T_1^{(1)}$ to $\sum_{j=2}^{m} A_j T_j^{(1)}$, to get $A_1 T_1^{(2)}$.

Terminate if $m$ consecutive steps jointly fail to raise $g$ by more than a chosen threshold.

Example 5

$A_1$ is a $(5 \times 2)$ matrix drawn from $Z(0, 1)$. $A_1$ is rotated 60 and 120 degrees and noise from $Z(0, 0.05)$ is added to get $A_2$ and $A_3$.

[Figure: four scatter plots of $A_1$, $A_2$ and $A_3$. Panels: Original matrices, Step nr. 1, Step nr. 2, Step nr. 3.]

Upper bounds for g

The algorithm can be shown to converge, but it only ensures the strong necessary condition (C) for maximal agreement. Two upper bounds for $g$ can be compared to the computed value.
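Before the bounds are stated, the cyclic procedure from the algorithm slide can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the author's implementation: the names `rotate_to`, `agreement`, `generalized_rotation` and `rot`, the tolerance, the sweep limit, the seed, and the reading of Z(0, 0.05) as normal noise with standard deviation 0.05 are all assumptions.

```python
import numpy as np

def rotate_to(A, B):
    """Orthonormal T maximizing tr(T^T A^T B), i.e. the m = 2 solution."""
    P, _, Qt = np.linalg.svd(A.T @ B)
    return P @ Qt

def agreement(mats):
    """g = sum over i < j of tr(B_i^T B_j) for already rotated matrices B_i."""
    total = sum(mats)
    # tr(total^T total) = sum_i tr(B_i^T B_i) + 2 g
    return 0.5 * (np.sum(total * total) - sum(np.sum(B * B) for B in mats))

def generalized_rotation(mats, tol=1e-10, max_sweeps=100):
    """Cycle through the matrices, rotating each one to the sum of the others."""
    B = [np.asarray(A, dtype=float).copy() for A in mats]   # holds A_i T_i
    T = [np.eye(B[0].shape[1]) for _ in B]                  # accumulated T_i
    g = agreement(B)
    for _ in range(max_sweeps):
        for i in range(len(B)):
            target = sum(B) - B[i]          # sum of all other A_j T_j
            R = rotate_to(B[i], target)
            B[i] = B[i] @ R
            T[i] = T[i] @ R
        g_new = agreement(B)
        improved = g_new - g > tol
        g = g_new
        if not improved:                    # m steps jointly failed to raise g
            break
    return T, B, g

def rot(deg):
    """Plane rotation matrix for an angle given in degrees."""
    t = np.deg2rad(deg)
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])

# Data in the spirit of Example 5 (assumed seed and noise scale).
rng = np.random.default_rng(1)
A1 = rng.standard_normal((5, 2))
A2 = A1 @ rot(60) + 0.05 * rng.standard_normal((5, 2))
A3 = A1 @ rot(120) + 0.05 * rng.standard_normal((5, 2))
mats = [A1, A2, A3]

g_start = agreement(mats)
T, B, g_end = generalized_rotation(mats)
print("g before:", g_start, "g after:", g_end)
```

Each single rotation can only keep or raise g, because the identity is always among the candidate rotations. That is why the sketch compares g once per full sweep of m steps instead of after every step.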
Bound 1: let $A_i^T A_j = P_{ij} D_{ij} Q_{ij}^T$ be an SVD. Then $\operatorname{tr}(T_i^T A_i^T A_j T_j) \le \operatorname{tr}(D_{ij})$, and summing gives

$$g \le \sum_{i < j} \operatorname{tr}(D_{ij}).$$

Upper bounds for g (2)

Bound 2: let

$$A^T A = \begin{pmatrix} 0 & A_1^T A_2 & \cdots & A_1^T A_m \\ A_2^T A_1 & 0 & \cdots & A_2^T A_m \\ \vdots & \vdots & & \vdots \\ A_m^T A_1 & A_m^T A_2 & \cdots & 0 \end{pmatrix}.$$

If we do an EVD of $A^T A$, $A^T A = P \Delta P^T$, with the eigenvalues $\Delta_{ii}$ in descending order, then a second bound for $g$ can be computed by

$$g \le \frac{m}{2} \sum_{i=1}^{k} \Delta_{ii}.$$

Example 6 with bounds

Seven $(10 \times 9)$ matrices $A_1, A_2, \ldots, A_7$ were drawn from $Z(0, 1)$.

[Figure: value of g against step number, with three series: Dataset, Upper bound 1, Upper bound 2.]

Gower's Generalized Procrustes Analysis (GGPA)

GGPA includes rotation, scaling and translation of two or more matrices.
Corrections and improvements to the rotation step and the scaling step in Gower's algorithm are suggested.
For the rotation step, the algorithm developed here is better than the Kristof and Wingersky procedure used by Gower.

Generalized Procrustes Analysis - scaling

Let $A_1, A_2, \ldots, A_m$ be $m$ $(n \times k)$ matrices, prescaled to $\sum_i \operatorname{tr}(A_i^T A_i) = m$. We want to find scaling constants $c_1, c_2, \ldots, c_m$ to maximize

$$h(c_1, c_2, \ldots, c_m) = \sum_{i < j} c_i c_j \operatorname{tr}(A_i^T A_j)$$

under the constraint

$$\sum_i c_i^2 \operatorname{tr}(A_i^T A_i) = \sum_i \operatorname{tr}(A_i^T A_i) = m.$$

Generalized Procrustes Analysis - scaling - solution

Let

$$Y = \begin{pmatrix} \operatorname{tr}(A_1^T A_1) & \operatorname{tr}(A_1^T A_2) & \cdots & \operatorname{tr}(A_1^T A_m) \\ \operatorname{tr}(A_2^T A_1) & \operatorname{tr}(A_2^T A_2) & \cdots & \operatorname{tr}(A_2^T A_m) \\ \vdots & \vdots & & \vdots \\ \operatorname{tr}(A_m^T A_1) & \operatorname{tr}(A_m^T A_2) & \cdots & \operatorname{tr}(A_m^T A_m) \end{pmatrix},$$

$Y_d = \operatorname{diag}(Y)$, and $\Phi = Y_d^{-1/2} Y Y_d^{-1/2}$.

If $\Phi = P \Delta P^T$ is an eigenvalue decomposition of $\Phi$, then the scaling problem is solved by setting

$$c_i = \left( \frac{m}{\operatorname{tr}(A_i^T A_i)} \right)^{1/2} P_{i1},$$

where the first column of $P$ is the eigenvector belonging to the largest eigenvalue.
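The scaling solution can be tried out with a short NumPy sketch. It is offered as an illustration under stated assumptions, not as the author's code: the names `scaling_constants` and `h`, the random test matrices, the seed and the sign convention for the eigenvector are assumptions. The sketch also assumes that the first column of P belongs to the largest eigenvalue, so it picks the last column returned by `numpy.linalg.eigh`, which orders eigenvalues ascending.

```python
import numpy as np

def scaling_constants(mats):
    """Return (c, Y) with c_i = sqrt(m / tr(A_i^T A_i)) * P_i1."""
    m = len(mats)
    # Y[i, j] = tr(A_i^T A_j), computed as an elementwise sum
    Y = np.array([[np.sum(Ai * Aj) for Aj in mats] for Ai in mats])
    d = np.sqrt(np.diag(Y))
    Phi = Y / np.outer(d, d)            # Y_d^{-1/2} Y Y_d^{-1/2}
    _, vecs = np.linalg.eigh(Phi)       # eigenvalues in ascending order
    p1 = vecs[:, -1]                    # eigenvector of the largest eigenvalue
    if p1.sum() < 0:                    # sign is arbitrary; choose one (assumption)
        p1 = -p1
    return np.sqrt(m) * p1 / d, Y

def h(c, Y):
    """h(c) = sum over i < j of c_i c_j tr(A_i^T A_j)."""
    return 0.5 * (c @ Y @ c - np.sum(c**2 * np.diag(Y)))

# Assumed test data: four (6 x 3) matrices with different overall sizes.
rng = np.random.default_rng(2)
raw = [s * rng.standard_normal((6, 3)) for s in (0.5, 1.0, 2.0, 1.5)]
m = len(raw)

# Prescale so that the sum of tr(A_i^T A_i) equals m.
scale = np.sqrt(m / sum(np.sum(A * A) for A in raw))
mats = [scale * A for A in raw]

c, Y = scaling_constants(mats)
constraint = np.sum(c**2 * np.diag(Y))
print("c:", c)
print("constraint value:", constraint, "target:", m)
print("h(c):", h(c, Y), "h(ones):", h(np.ones(m), Y))
```

After prescaling, leaving every matrix unscaled (all c_i = 1) also satisfies the constraint, so comparing h(c) with h at the all-ones vector gives a quick sanity check on the result.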