Groups of Banded Matrices with Banded Inverses

Gilbert Strang
Department of Mathematics, MIT

Abstract

A product $A = F_1 \cdots F_N$ of invertible block-diagonal matrices will be banded with a banded inverse. We establish this factorization with the number $N$ controlled by the bandwidths $w$ and not by the matrix size $n$. When $A$ is an orthogonal matrix, or a permutation, or banded plus finite rank, the factors $F_i$ have $w = 1$ and generate that corresponding group. In the case of infinite matrices, conjectures remain open.

1 Introduction

Banded matrices with banded inverses are unusual, but these exceptional matrices do exist. Block-diagonal matrices are the first examples. Products of block-diagonal matrices give many more examples (including useful ones). The main theorem in an earlier paper [10] is that all examples are produced this way, from multiplying invertible block-diagonal matrices of bandwidth 1.

When $A$ and $B$ are banded with banded inverses, $A^{-1}$ and $AB$ also have those properties. This group of matrices is all of $GL(n)$ in the finite case (every matrix has bandwidth less than $n$). For singly infinite matrices this appears to be a new group. In both cases the key point of the theorem is that the number $N$ of block-diagonal factors is controlled by the bandwidth $w$ and not by the matrix size $n$.

Theorem 1  The factors in $A = F_1 F_2 \cdots F_N$ can be chosen block diagonal, with 2 by 2 and 1 by 1 blocks. Then each generator $F$ and $F^{-1}$ has bandwidth $\le 1$.

Here $w$ is the larger of the bandwidths of $A$ and $A^{-1}$. So $A_{ij} = 0$ and $(A^{-1})_{ij} = 0$ for $|i - j| > w$. The number of factors $F$ could be as large as $Cw^2$ (just to carry out ordinary elimination) but $N$ does not increase with $n$.

Important banded matrices with banded inverses arise in constructing orthogonal polynomials on the unit circle [5]. They also yield filter banks with perfect reconstruction, the key to wavelets. Those are block Toeplitz matrices in the wavelet case, and "CMV matrices" in other cases. Our earlier paper [10] applied an observation of Ilya Spitkovsky to separate these matrices into the minimum number $N = w$ of factors $F$ (each with bandwidth 1). We believe that these special (and short) factorizations can lead to useful algorithms.

For other $A$, the main step of the proof is to reach $A = BC$, two factors with diagonal blocks of size $2w$. One of those factors begins with a block of size $w$, and it is this "offset" between the block positions in $B$ and $C$ that makes the proof work. The form of this "offset product" is important even for $w = 1$:

$$BC = \begin{bmatrix} 1 & & & & & \\ & 2 & 3 & & & \\ & 4 & 5 & & & \\ & & & 6 & 7 & \\ & & & 8 & 9 & \\ & & & & & 10 \end{bmatrix}
\begin{bmatrix} 1 & 2 & & & & \\ 3 & 4 & & & & \\ & & 5 & 6 & & \\ & & 7 & 8 & & \\ & & & & 9 & 10 \\ & & & & 11 & 12 \end{bmatrix} \qquad (1)$$

The second row of $BC$ has 2 times $[\,3\;\;4\,]$ followed by 3 times $[\,5\;\;6\,]$. The third row of $BC$ has 4 times $[\,3\;\;4\,]$ followed by 5 times $[\,5\;\;6\,]$.
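As a concrete check of (1), here is a short numerical sketch (my own illustration in Python with NumPy, not part of the original paper; the block entries are copied from the display and the helper names are mine). It builds the offset factors and confirms that $BC$ and $(BC)^{-1}$ both have bandwidth 2:

```python
import numpy as np

def block_diag(*blocks):
    """Place square blocks down the diagonal of one matrix."""
    n = sum(b.shape[0] for b in blocks)
    M = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        M[i:i+k, i:i+k] = b
        i += k
    return M

def bandwidth(M, tol=1e-12):
    """Largest |i - j| over the significantly nonzero entries of M."""
    i, j = np.nonzero(np.abs(M) > tol)
    return int(np.max(np.abs(i - j)))

# Offset factors from equation (1): B begins with a 1 by 1 block,
# C begins with a 2 by 2 block, so the blocks never line up.
B = block_diag(np.array([[1.]]),
               np.array([[2., 3.], [4., 5.]]),
               np.array([[6., 7.], [8., 9.]]),
               np.array([[10.]]))
C = block_diag(np.array([[1., 2.], [3., 4.]]),
               np.array([[5., 6.], [7., 8.]]),
               np.array([[9., 10.], [11., 12.]]))

A = B @ C
print(bandwidth(A), bandwidth(np.linalg.inv(A)))  # 2 2
print(A[1, :4])   # 2*[3 4] then 3*[5 6]:  [ 6.  8. 15. 18.]
```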
A pair of singular matrices lies side by side in the product $BC$:

$$BC = \begin{bmatrix} 1 & 2 & & & & \\ 6 & 8 & 15 & 18 & & \\ 12 & 16 & 25 & 30 & & \\ & & 42 & 48 & 63 & 70 \\ & & 56 & 64 & 81 & 90 \\ & & & & 110 & 120 \end{bmatrix}$$

When a column of $B$ multiplies a row of $C$, the nonzeros sit in a 2 by 2 matrix of rank 1. (Column $[\,2\;\;4\,]^{\rm T}$ of $B$ times row $[\,3\;\;4\,]$ of $C$ gives the first singular block, with entries 6, 8, 12, 16; column $[\,3\;\;5\,]^{\rm T}$ times row $[\,5\;\;6\,]$ gives the next.) These 2 by 2 matrices don't overlap in $BC$. So when those column-row products are added (a legal way to compute $BC$), the result is a "CMV matrix."

This matrix $BC$ has two diagonals of singular 2 by 2 blocks. The inverse matrix $(BC)^{-1} = C^{-1}B^{-1}$ has a similar multiplication in the opposite order. The pattern of nonzeros is just the transpose of the pattern above. This product is another CMV matrix (singular 2 by 2 blocks side by side).

This paper will describe a corresponding factorization for other (smaller or larger) groups of matrices. Here are four of those groups, not an exhaustive list:

1. Banded orthogonal matrices. The inverse is the transpose (so automatically banded). The factors $F$ will now be orthogonal matrices with $w = 1$: block diagonal with small orthogonal blocks.

2. Banded permutation matrices. In this case $w = \max |i - p(i)|$ measures the maximum movement from $i$ in $(1, \ldots, n)$ to $p(i)$ in $(p(1), \ldots, p(n))$. Each factor $F$ is then a permutation with $w = 1$; it exchanges disjoint pairs of neighbors. We conjectured that fewer than $2w$ factors $F$ would be sufficient. Greta Panova has found a beautiful proof [6]. Other constructions [1, 9] also yield $N \le 2w - 1$.

3. Banded plus finite rank. For infinite matrices $A$, banded with banded inverse, we enlarge to a group $\mathcal{B}$ by including also $A + Z$. The perturbation $Z$ allows any matrix of finite rank such that $A + Z$ is invertible. Then its inverse will be $A^{-1} + Y$, also perturbed with rank$(Y) \le$ rank$(Z)$, and we have a group. In this case, we include the new factors $F = I + (\hbox{rank } 1)$ along with the block-diagonal $F$'s. These factors generate the enlarged group.

4. Cyclically banded matrices. Cyclic means that "$n$ is adjacent to 1." The distance from $i$ to $j$ is the smaller of $|i - j|$ and $n - |i - j|$. The cyclic bandwidth of $A$ is the largest distance for which $A_{ij} \ne 0$. Then $w_c$ is the larger of the cyclic bandwidths of $A$ and $A^{-1}$. The natural conjecture is $A = F_1 \cdots F_N$ with factors that have cyclic bandwidth 1 (so that $F_{1n}$ and $F_{n1}$ may be nonzero). The number $N$ should be controlled by $w_c$. No proof of this conjecture is to be found in the present paper.

2 Banded orthogonal matrices

We need to recall how to obtain $A = BC$ with block-diagonal factors $B$ and $C$. We construct orthogonal $B$ and $C$ when $A$ is orthogonal. Then the final step will separate $B$ and $C$ into orthogonal factors with $w = 1$.

The key is to partition $A$ into blocks $H$ and $K$ of size $2w$. Each of those blocks has rank exactly $w$. This comes from a theorem of Asplund about ranks of submatrices of $A$, when the inverse matrix has bandwidth $w$. The display shows a matrix for which $A$ and $A^{-1}$ have bandwidth at most $w = 2$:

$$\begin{array}{cccccccccc}
X & x & x & & & & & & & \\
x & X & x & x & & & & & & \\
x & x & X & x & x & & & & & \\
 & x & x & X & x & x & & & & \\
 & & x & x & X & x & x & & & \\
 & & & x & x & X & x & x & & \\
 & & & & x & x & X & x & x & \\
 & & & & & x & x & X & x & x
\end{array}$$

Here $H_1$ is rows 1 to 4, columns 1 to 2, and $K_1$ is rows 1 to 4, columns 3 to 6; then $H_2$ is rows 5 to 8, columns 5 to 6, and $K_2$ is rows 5 to 8, columns 7 to 10. Asplund's theorem applies to submatrices like the $K$'s that lie above subdiagonal $w$, and like the $H$'s that lie below superdiagonal $w$: those submatrices have rank $\le w$ [2, 11]. In our case the ranks are exactly $w$, because each set of $2w$ rows of $A$ must have full rank $2w$ (since $A$ is invertible). The two columns of $H_1$ are orthonormal.
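Asplund's rank statement is easy to test numerically. The sketch below (my own, not from the paper) manufactures a banded orthogonal $A$ with $w = 2$ as an offset product of random plane-rotation blocks, the same kind of factors this section produces, and checks the claims just made: $H_1$ and $K_1$ have rank exactly $w = 2$, and the two columns of $H_1$ are orthonormal.

```python
import numpy as np
rng = np.random.default_rng(0)

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def offset_rotations(n, offset):
    """Block-diagonal orthogonal matrix: 2x2 rotation blocks starting at row `offset`."""
    F = np.eye(n)
    for i in range(offset, n - 1, 2):
        F[i:i+2, i:i+2] = rot(rng.uniform(0, 2 * np.pi))
    return F

n, w = 8, 2
A = offset_rotations(n, 0) @ offset_rotations(n, 1)   # orthogonal, bandwidth 2

H1 = A[0:4, 0:2]    # first 2w rows, first w columns
K1 = A[0:4, 2:6]    # first 2w rows, the next 2w columns
print(np.linalg.matrix_rank(H1), np.linalg.matrix_rank(K1))   # 2 2
print(np.allclose(H1.T @ H1, np.eye(2)))    # True: orthonormal columns of H1
```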
Choose a 4 by 4 orthogonal matrix $Q_1$, such that we get two columns correct in the identity matrix:

$$Q_1 H_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} = R.$$

Since the four rows of $Q_1[\,H_1 \;\; K_1\,] = [\,R \;\; Q_1 K_1\,]$ are orthonormal, the first two rows of $Q_1 K_1$ must be zero. Then there is a 4 by 4 orthogonal matrix $C_1^{\rm T}$ acting on the columns of $Q_1 K_1$ that produces $R^{\rm T}$ in the next two rows:

$$Q_1 K_1 C_1^{\rm T} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 \\ R^{\rm T} \end{bmatrix} \qquad \hbox{in rows 1 to 4, columns 3 to 6.}$$

At this stage the first $2w = 4$ rows are set: $Q_1 A\,C_1^{\rm T}$ agrees with the first four rows of $I$. If $B_1 = Q_1^{-1}$ is placed into the first block of $B$, and $C_1$ into rows/columns 3 to 6 of $C$, then four rows of $A = BC$ are now correct.

Moving to the next four rows, our goal is to change those into four more rows of the identity matrix. This will put the 8 by 10 submatrix of $H$'s and $K$'s in its final form:

$$\begin{bmatrix} Q_1 & \\ & Q_2 \end{bmatrix}
\begin{bmatrix} H_1 & K_1 & \\ & H_2 & K_2 \end{bmatrix}
\begin{bmatrix} I_2 & & \\ & C_1^{\rm T} & \\ & & C_2^{\rm T} \end{bmatrix}
= \begin{bmatrix} R & 0^{\rm T} & & \\ & R^{\rm T} & & \\ & 0 & R & 0^{\rm T} \\ & & & R^{\rm T} \end{bmatrix}.$$

Columns 3 and 4 end with the zero matrix as indicated, because those columns are orthonormal and they already have 1's from $R^{\rm T}$. Rows 5 to 8 are now in exactly the same situation that we originally met for rows 1 to 4. There is an orthogonal matrix $Q_2$ that produces $R$ in columns 5 and 6, as shown. The rest of rows 5 and 6 must contain $0^{\rm T}$, as shown. Then an orthogonal $C_2^{\rm T}$ produces $R^{\rm T}$ in rows 7 and 8. We now have eight rows of $QAC^{\rm T} = I$. The construction continues to rows 9 to 12, and onward:

$$QAC^{\rm T} = \begin{bmatrix} Q_1 & & \\ & Q_2 & \\ & & \ddots \end{bmatrix}
\; A \;
\begin{bmatrix} I_2 & & & \\ & C_1^{\rm T} & & \\ & & C_2^{\rm T} & \\ & & & \ddots \end{bmatrix} = I.$$

$Q$ and $C$ are block diagonal with orthogonal 4 by 4 blocks, except for the 2 by 2 block $I_2$. We have shown that every banded orthogonal matrix $A$ with $w = 2$ can be factored into $A = Q^{\rm T}C = BC$. The reasoning is the same for any $w$.

Theorem 2  Every banded orthogonal matrix $A$, finite or singly infinite, has orthogonal block-diagonal factors $A = Q^{-1}C = BC$. The blocks $B_i$ and $C_i$ have size $2w$, except that $C_0 = I$ has size $w$ (to produce the offset).

This completes the main step in the factorization: we have $B$ and $C$ with orthogonal blocks $B_i$ and $C_i$. The final step is to reach 2 by 2 blocks. A straightforward construction succeeds for each $B_i$ and $C_i$. At the end we factor all $B_i$ at once and all $C_i$ at once.

Lemma  Any orthogonal matrix $Q$ of size $2w$ can be reduced to the identity matrix by a sequence of multiplications, $G_M \cdots G_1 Q = I$. Each factor $G$ differs from $I$ only in a 2 by 2 orthogonal block (a plane rotation):

$$G = \begin{bmatrix} I_m & & & \\ & c & -s & \\ & s & c & \\ & & & I_p \end{bmatrix}. \qquad (2)$$

Proof  Start at the bottom of column 1. Choose $c = \cos\theta$ and $s = \sin\theta$ to produce zero from $s\,Q_{n-1,1} + c\,Q_{n,1}$ in that corner of $G_1 Q$. (Take $s = 1$ and $c = 0$ in case $Q_{n-1,1} = 0$.) Move up column 1, producing a new zero with each factor $G$. At the top of column 1, choose signs to produce 1 as the diagonal entry in this unit vector. Continue from the bottom of column 2. The new factors $G$ that produce zeros below the diagonal will not affect the zeros in column 1. The $(1, 2)$ entry in column 2 is already zero because the columns remain orthogonal at every step. At the end, we have the columns of $I$ from $M = O(w^2)$ orthogonal matrices $G_i$.

In the non-orthogonal case, elimination in this order (climbing up each column) yields $N = O(w^2)$ factors $F$ in Theorem 1. Each zero is produced by a "Gauss matrix" that has a single nonzero next to its diagonal part $I$. The $O(w^3)$ estimate in [10] was over-generous.

This lemma applies to each orthogonal block $B_i$ (of size $2w$) in the matrix $B$. The key point is that we can simultaneously reduce all those blocks to $I$.
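Here is a minimal implementation of the Lemma's sweep (my own NumPy sketch, not the paper's code): climb each column from the bottom with plane rotations. A 1 by 1 sign block is included for the case $\det Q = -1$ or a column that ends at $-1$; that block is still an orthogonal factor of bandwidth $\le 1$, though not a rotation.

```python
import numpy as np

def reduce_to_identity(Q, tol=1e-12):
    """Return factors G with  G_M ... G_1 Q = I  (the Lemma, equation (2)).
    Each G is a plane rotation in two adjacent rows, except for an
    occasional 1 by 1 sign block (see the note above)."""
    n = Q.shape[0]
    Q = Q.copy()
    factors = []
    for col in range(n):
        for row in range(n - 1, col, -1):          # climb up column `col`
            a, b = Q[row - 1, col], Q[row, col]
            r = np.hypot(a, b)
            if r < tol:
                continue
            G = np.eye(n)
            G[row-1:row+1, row-1:row+1] = [[a/r, b/r], [-b/r, a/r]]
            Q = G @ Q            # sets Q[row, col] = 0 and Q[row-1, col] = r
            factors.append(G)
        if Q[col, col] < 0:      # the column ended at -1: flip the sign
            G = np.eye(n)
            G[col, col] = -1.0
            Q = G @ Q
            factors.append(G)
    assert np.allclose(Q, np.eye(n))
    return factors

w = 3
Q0, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((2*w, 2*w)))
print(len(reduce_to_identity(Q0)))   # at most w(2w - 1) rotations, plus sign flips
```

The rotation count $w(2w-1) = \binom{2w}{2}$ is the $M = O(w^2)$ of the Lemma.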
The matrices $G_1, \ldots, G_M$ for all the different blocks of $B$ go along the diagonals of $F_1, \ldots, F_M$. Then $F_M \cdots F_1 B = I$. Similarly the lemma applies to the blocks $C_i$ of $C$. Then we have $A = BC = F_1^{-1} \cdots F_M^{-1} F_{M+1}^{-1} \cdots F_{2M}^{-1}$. This completes the orthogonal case.

3 Wavelet Matrices

The matrices that lead to wavelets are banded with banded inverses. Furthermore they are block Toeplitz (away from the first and last rows) with 2 by 2 blocks. In this case the factors $F$ will also be block Toeplitz (away from those boundary rows). This means that each $F_i$ will have a 2 by 2 block repeating down the diagonal.

In the analysis of wavelets it is often convenient to work with doubly infinite matrices $A_\infty$ (purely block Toeplitz, with no boundary rows). Suppose $A_\infty$ and its inverse have bandwidth $w = N$, coming from $N$ blocks centered on each pair of rows. We showed in [10] how to find $N$ block-diagonal factors in $A_\infty = F_1 \cdots F_N$: First find $N - 1$ factors $G_i$, each with two singular 2 by 2 blocks per row ($w = 2$). Then split each $G_i$ into block-diagonal factors $F_{i1}F_{i2}$. Those are offset as in equation (1) above, and we choose the factors so that $F_{i2}$ is not offset compared to $F_{i+1,1}$. Then $F_{i2}F_{i+1,1}$ is a block-diagonal $F$, and the banded matrix $A_\infty$ has $N$ factors:

$$A_\infty = (F_{11}F_{12})(F_{21}F_{22}) \cdots (F_{N-1,1}F_{N-1,2}) = F_{11}(F_{12}F_{21})(F_{22}F_{31}) \cdots (F_{N-2,2}F_{N-1,1})F_{N-1,2}.$$

The 4-coefficient Daubechies wavelet matrix has $N = 2$. It is the single matrix $G_1$. It was factored in [5] into block-diagonal $F_{11}$ times $F_{12}$ (checked numerically in the sketch below):

$$F_{11}F_{12} = \begin{bmatrix} \ddots & & & & \\ & 1+\sqrt3 & \sqrt3-1 & & \\ & 1-\sqrt3 & \sqrt3+1 & & \\ & & & 1+\sqrt3 & \sqrt3-1 \\ & & & 1-\sqrt3 & \sqrt3+1 \end{bmatrix}
\begin{bmatrix} \ddots & & & & \\ & \sqrt3 & -1 & & \\ & 1 & \sqrt3 & & \\ & & & \sqrt3 & -1 \\ & & & 1 & \sqrt3 \end{bmatrix},$$

with the blocks of $F_{12}$ offset one row down from the blocks of $F_{11}$, so that the repeating rows of the product are $(\,1+\sqrt3 \;\; 3+\sqrt3 \;\; 3-\sqrt3 \;\; 1-\sqrt3\,)$ and $(\,1-\sqrt3 \;\; \sqrt3-3 \;\; \sqrt3+3 \;\; -1-\sqrt3\,)$. Again, column 2 times row 2 gives one singular block and column 3 times row 3 gives the other. Dividing by the lengths $\sqrt8$ and $\sqrt4$ of their rows, those factors become orthogonal matrices. Their product shows the $4 + 4$ Daubechies coefficients.

The $6 + 6$ coefficient matrix for the next orthogonal Daubechies wavelet was factored by Raghavan [8]. These factors are plane rotations through $\pi/12$ and $\pi/6$. TeKolste also observed [12] that the Daubechies matrices are offset products of plane rotations. He recognized that the $N$ rotation angles add to $\pi/4$. This connects to the fact that the coefficients in the second row of $A_\infty$ add to zero (a highpass filter). There should be additional conditions on the angles to determine the multiplicity of this zero in the highpass frequency response (the polynomial whose coefficients come from that second row of $A_\infty$).

A task for the future is to construct useful matrices $A_\infty$ and $A_n$ starting from well-chosen factors. Wavelets need not be orthogonal (the most popular choices are not). And they need not be restricted to block Toeplitz matrices. It seems feasible to construct time-varying wavelets by starting with factors $F$ in which the blocks are not constantly repeated down the diagonal. The matrix still has a banded inverse.

4 Banded Permutation Matrices

The bandwidth of a permutation is the maximum distance $|i - p(i)|$ that any entry is moved. Thus $w = 1$ for a "parallel exchange of neighbors" like 2, 1, 4, 3. The matrix $F$ for this permutation is block diagonal with a 2 by 2 block for each exchange. It is straightforward to reach the identity by a sequence of these $F$'s.
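As a check on the wavelet factorization of Section 3: with the rotation convention $R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$, the two normalized Daubechies blocks above are $R(-\pi/12)$ and $R(\pi/6)$, whose magnitudes add to $\pi/4$ (compare TeKolste's observation; other sign conventions shift this bookkeeping). The sketch below (mine, in NumPy; helper names are my own) forms the offset product and recovers the $4 + 4$ coefficients, with the highpass row summing to zero.

```python
import numpy as np

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def block_rotations(n, theta, offset):
    """Orthogonal and block diagonal: R(theta) blocks starting at row `offset`."""
    F = np.eye(n)
    for i in range(offset, n - 1, 2):
        F[i:i+2, i:i+2] = rot(theta)
    return F

n = 8
A = block_rotations(n, -np.pi/12, 0) @ block_rotations(n, np.pi/6, 1)

s = 1 / (4 * np.sqrt(2))   # normalizes the sqrt(8) and sqrt(4) row lengths
lowpass  = s * np.array([1+np.sqrt(3), 3+np.sqrt(3), 3-np.sqrt(3), 1-np.sqrt(3)])
highpass = s * np.array([1-np.sqrt(3), np.sqrt(3)-3, np.sqrt(3)+3, -1-np.sqrt(3)])

print(np.allclose(A[2, 1:5], lowpass))    # True: an interior lowpass row
print(np.allclose(A[3, 1:5], highpass))   # True: the interior highpass row
print(np.isclose(A[3].sum(), 0.0))        # True: highpass coefficients add to zero
```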
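The reduction of a banded permutation takes only a few lines. This sketch (my own) performs the left-to-right sweeps described next, each sweep being one block-diagonal factor $F$, and counts them; on the worked example that follows it returns exactly $5 = 2w - 1$.

```python
def sweeps_to_identity(p):
    """Sort the list p by repeated left-to-right sweeps.  Each sweep
    exchanges disjoint out-of-order neighbor pairs (one factor F with
    2x2 exchange blocks).  Returns the number of sweeps."""
    p = list(p)
    n, steps = len(p), 0
    target = sorted(p)
    while p != target:
        i = 0
        while i < n - 1:              # one sweep = one factor F
            if p[i] > p[i + 1]:
                p[i], p[i + 1] = p[i + 1], p[i]
                i += 2                # the exchanged pair is finished
            else:
                i += 1
        steps += 1
    return steps

print(sweeps_to_identity([4, 5, 6, 1, 2, 3]))   # 5, and here w = 3
```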
At each step we move from left to right, exchanging pairs of neighbors that are in the wrong order:

$$4\;5\;6\;1\;2\;3 \;\to\; 4\;5\;1\;6\;2\;3 \;\to\; 4\;1\;5\;2\;6\;3 \;\to\; 1\;4\;2\;5\;3\;6 \;\to\; 1\;2\;4\;3\;5\;6 \;\to\; 1\;2\;3\;4\;5\;6.$$

The original permutation has $w = 3$ (in fact all entries have $|i - p(i)| = 3$). Five steps were required to reach the identity. So the banded permutation matrix is the product of $N = 5 = 2w - 1$ matrices $F$. We conjectured in [10] that $N \le 2w - 1$ in all cases. A beautiful proof is given by Greta Panova [6], using the "wiring diagram" to decide the sequence of exchanges in advance. A second proof [1] by Albert, Li, Strang, and Yu confirms that the algorithm illustrated above also has $N \le 2w - 1$. A third proof is proposed by Samson and Ezerman [9].

5 Finite Rank Perturbations

We now consider the larger set of invertible matrices $M = A + Z$, when $A$ and $A^{-1}$ have bandwidth $\le w$ and $Z$ has rank $\le r$. The inverse matrix $M^{-1}$ has the same properties, from the Woodbury-Morrison formula (which has many discoverers). Write $Z$ as $UV$ where $U$ has independent columns and $V$ has independent rows. Then the formula yields

$$M^{-1} = (A + UV)^{-1} = A^{-1} + Y = A^{-1} - A^{-1}U(I + VA^{-1}U)^{-1}VA^{-1}. \qquad (3)$$

The rank of $Y$ is not greater than the ranks (both $\le r$) of $U$ and $V$. The product $M_1 M_2$ will be the sum of $A_1 A_2$ with bandwidth $\le 2w$, and $A_1 Z_2 + Z_1 A_2 + Z_1 Z_2$ with rank $\le 3r$. So we have a group $\mathcal{B}$ of invertible matrices $M = A + Z$, in which $A$ and $A^{-1}$ belong to our original group and rank$(Z)$ is finite. As before, $\mathcal{B}$ is all of $GL(n)$ for finite size $n$. It is apparently a new group for $n = \infty$.

We want to describe a set of generators for this group. They will include the same block-diagonal factors $F$ with bandwidth 1, together with new factors of the form $I + (\hbox{rank } 1)$. We show how to express $A^{-1}M = I + A^{-1}Z$ using at most $r$ of these new factors.

Theorem 3  If $L = A^{-1}Z$ has rank $r$, there are vectors $u_1, v_1, \ldots, u_r, v_r$ so that

$$I + L = \left(I + u_r v_r^{\rm T}\right) \cdots \left(I + u_1 v_1^{\rm T}\right). \qquad (4)$$

Proof  We will choose vectors such that $v_i^{\rm T}u_j = 0$ for $i > j$. Then if the columns of $V$ and $U$ are $v_1, \ldots, v_r$ and $u_1, \ldots, u_r$, the product $V^{\rm T}U$ is upper triangular. Under this condition, the expression (4) reduces to $I + u_r v_r^{\rm T} + \cdots + u_1 v_1^{\rm T} = I + UV^{\rm T}$. So the goal is to achieve $UV^{\rm T} = L$ with $V^{\rm T}U$ upper triangular.

The usual elimination steps reduce $L$ to a matrix $W$ with only $r$ nonzero rows. Thus $EL = W$ or $L = E^{-1}W = E_r^{-1}W_r$, where we keep only those first $r$ rows of $W$ and the first $r$ columns of $E^{-1}$. In the opposite order, $W_r E_r^{-1}$ may not be triangular, but every matrix is similar to an upper triangular matrix: For some $r$ by $r$ matrix $S$, $SW_r E_r^{-1}S^{-1}$ is upper triangular. Now take $V^{\rm T} = SW_r$ and $U = E_r^{-1}S^{-1}$. Then $V^{\rm T}U$ is triangular and $UV^{\rm T} = L$. (A numerical test of this construction appears in a sketch below.)

6 Cyclically Banded Matrices

A cyclically banded $n$ by $n$ matrix $A$ (with bandwidth $w$) has three nonzero parts. There can be $w$ nonzero diagonals in the upper right corner and lower left corner, in addition to the $2w + 1$ diagonals in the main band. Call the new parts in the corners $b$ and $c$ (the band $a$ runs down the diagonal, with zeros between the band and the corners):

$$A = \begin{bmatrix} & & b \\ & \!a\! & \\ c & & \end{bmatrix},
\qquad A_{ij} = 0 \ \hbox{ if } \ |i - j| > w \ \hbox{ and also } \ n - |i - j| > w.$$

Thus "cyclic bandwidth $w = 1$" now allows $A_{1n} \ne 0$ and $A_{n1} \ne 0$. There is a corresponding infinite periodic matrix that has bandwidth $w$ in the normal way. "Periodic" means that $A_{i+n,j+n} = A_{ij}$ for all integers $i$ and $j$. When we identify 0 with $n$ and every $i$ with $n + i$, the corner parts $b$ and $c$ move to fill the gaps at the start and end of the main band.
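The proof of Theorem 3 can be tested before we leave Section 5. In the sketch below (my own; it uses an SVD for the rank factorization instead of elimination, and SciPy's complex Schur form for the similarity $S$, since triangularization may require complex arithmetic), the $r$ rank-one factors in (4) multiply back to $I + L$:

```python
import numpy as np
from scipy.linalg import schur

def rank_one_factors(L, r):
    """Vectors u_i, v_i with I + L = (I + u_r v_r*) ... (I + u_1 v_1*),
    following the proof of Theorem 3 (SVD in place of elimination)."""
    P, sig, Qh = np.linalg.svd(L)
    Ur = P[:, :r] * sig[:r]        # rank factorization  L = Ur @ Wr
    Wr = Qh[:r, :]
    T, Z = schur(Wr @ Ur, output='complex')   # Z* (Wr Ur) Z = T, upper triangular
    U = Ur @ Z                     # columns u_1, ..., u_r
    V = Wr.conj().T @ Z            # columns v_1, ..., v_r;  v_i* u_j = T_ij
    return U, V

rng = np.random.default_rng(3)
n, r = 6, 3
L = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank r
U, V = rank_one_factors(L, r)

M = np.eye(n, dtype=complex)
for i in reversed(range(r)):       # (I + u_r v_r*) ... (I + u_1 v_1*)
    M = M @ (np.eye(n) + np.outer(U[:, i], V[:, i].conj()))
print(np.allclose(M, np.eye(n) + L))   # True
```

Since $v_i^* u_j = T_{ij} = 0$ for $i > j$, every cross term in the expanded product vanishes and the product collapses to $I + UV^* = I + L$, exactly as in the proof.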
Here is an example with $b = 4$ and $c = 3$ and all 1's in the tridiagonal $a$:

$$\hbox{Periodic } A_\infty = \begin{bmatrix} 1 & 3 & & & & \\ 4 & 1 & 1 & & & \\ & 1 & 1 & 1 & & \\ & & 1 & 1 & 3 & \\ & & & 4 & 1 & 1 \\ & & & & 1 & 1 \end{bmatrix} \qquad \hbox{with cyclic } w = 1.$$

Notice that the periodic matrix $A_\infty$ is block Toeplitz. The banded blocks $a$ repeat down the main diagonal, and blocks $b$ and $c$ (all 3 by 3) go down the adjacent diagonals. Multiplication $AB$ of cyclically banded matrices corresponds to $A_\infty B_\infty$ for periodic matrices. Inversion $A^{-1}$ corresponds to $(A_\infty)^{-1}$. The cyclic bandwidth for $A$ is the normal bandwidth for $A_\infty$. The reader will make the same conjecture as the author:

Conjecture  If $A$ and $A^{-1}$ have cyclic bandwidth $\le w$, they are products of $N = O(w^2)$ factors for which $F$ and $F^{-1}$ have cyclic bandwidth $\le 1$. The number $N$ does not depend on $n$.

For cyclically banded permutation matrices this conjecture is proved by Greta Panova. Her wiring diagrams move to a cylinder, for periodicity. The number of factors still satisfies $N \le 2w - 1$ (for permutations). The parallel transpositions in a factor $F$ can include an exchange of 1 with $n$. Two new factors are allowed that also have cyclic bandwidth 1: the cyclic shifts. They permute $1, \ldots, n$ to $n, 1, \ldots, n-1$ or to $2, \ldots, n, 1$. The inverse of one cyclic shift is the other.

Allow us to close the paper with preliminary comments on a proof of the conjecture. The non-cyclic proof began with Asplund's condition on a matrix $A$ whose inverse has bandwidth $\le w$: All submatrices of $A$ above subdiagonal $w$ and below superdiagonal $w$ have rank $\le w$. We expect this condition to extend to periodic infinite matrices $A_\infty$, but one change is important. The cyclic shift $S$ (and its periodic form $S_\infty$) is inverted by its transpose. The finite matrix obeys Asplund's rule:

$$S = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\qquad
\begin{array}{l}
\hbox{has rank 1 above subdiagonal 1:} \\
\hbox{then } S^{-1} \hbox{ has upper bandwidth 1;} \\
\hbox{has rank 3 below superdiagonal 3:} \\
\hbox{then } S^{-1} \hbox{ has lower bandwidth 3.}
\end{array}$$

But $S_\infty$ needs a revised rule. It is lower triangular (rank zero above the diagonal) but its inverse (the transpose shift) is upper triangular.

The second obstacle, perhaps greater, lies in the factorization of $A_\infty$ into block-diagonal periodic matrices $B_\infty C_\infty$. As in Section 2 above, their diagonal blocks may have size $2w$ (independent of $n$). The further factorization into periodic matrices $F$ with 2 by 2 blocks would be straightforward, but how to reach $B_\infty$ and $C_\infty$? The elimination steps we used earlier are now looking for a place to start.

In our original proof of the (non-cyclic) factorization into $A = F_1 \cdots F_N$, a decomposition of "Bruhat type" was the first step, in order to reach triangular factors:

$$A = \hbox{(triangular)(permutation)(triangular)} = LPU. \qquad (5)$$

With $P$ factored into $F$'s, this leaves the triangular $L$ and $U$ to be factored. $L$, $P$, $U$ are all banded with banded inverses; their bandwidths come from $A$ and $A^{-1}$. Note that Bruhat places $P$ between the triangular factors, where numerical linear algebra places it first. (The standard Bruhat triangular factors are both upper triangular.) In the periodic case, $P_\infty$ will be an affine permutation.

The banded periodic matrix $A_\infty$ is naturally associated with a rational matrix function $a(z)$. The blocks in $A_\infty$ are the coefficients in $a(z)$.
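Before developing $a(z)$ further, the displayed rank claims for the finite shift $S$ take only a few lines to verify (my own sketch, using 0-based NumPy indexing):

```python
import numpy as np

n = 4
S = np.roll(np.eye(n), 1, axis=0)   # the cyclic shift: S maps e_i to e_{i+1 mod n}

# Maximal submatrices above subdiagonal 1 (entries with i <= j):
above = [S[:c+1, c:] for c in range(n)]
print(max(np.linalg.matrix_rank(M) for M in above))   # 1: S^{-1} has upper bandwidth 1

# Maximal submatrices below superdiagonal 3 (entries with j - i < 3):
below = [S[r:, :min(r+3, n)] for r in range(n)]
print(max(np.linalg.matrix_rank(M) for M in below))   # 3: S^{-1} has lower bandwidth 3
```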
For our block tridiagonal matrix, this function will be $a(z) = bz^{-1} + a + cz$. Then the triangular $L_\infty P_\infty U_\infty$ factorization of $A_\infty$ is associated with a factorization of this $n$ by $n$ matrix function, as in

$$a(z) = l(z)\,p(z)\,u(z) \qquad \hbox{with} \qquad p(z) = \mathrm{diag}(z^{k_1}, \ldots, z^{k_n}). \qquad (6)$$

The integers $k_1, \ldots, k_n$ are the (left) partial indices of $a(z)$. They determine the periodic permutation matrix $P_\infty$: the 1 in row $i$ lies in column $i + k_i n$. The factors $u(z)$ and $l(z)$ are analytic for $|z| \le 1$ and $|z| \ge 1$. Their matrix coefficients appear as the blocks in the triangular factors $U_\infty$ and $L_\infty$.

Matrix factorization theory began with the fundamental paper of Plemelj [7]. A remarkable bibliography and a particularly clear exposition and proof of (6) are in the survey paper [4] by Gohberg, Kaashoek, and Spitkovsky. They include a small example that we convert from $a(z) = l(z)\,p(z)\,u(z)$ to $A_\infty = L_\infty P_\infty U_\infty$ with $U_\infty = I$:

$$a(z) = \begin{bmatrix} z & 0 \\ 1 & z^{-1} \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ z^{-1} & 1 \end{bmatrix} \begin{bmatrix} z & 0 \\ 0 & z^{-1} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = l\,p\,u.$$

Rows 1 to 4 (and columns $-1$ to 6) show the pattern in these infinite block Toeplitz matrices:

$$A_\infty = \begin{bmatrix}
0&0&0&0&1&0&0&0 \\
0&1&1&0&0&0&0&0 \\
0&0&0&0&0&0&1&0 \\
0&0&0&1&1&0&0&0
\end{bmatrix} = L_\infty P_\infty = \begin{bmatrix}
0&0&1&0&0&0&0&0 \\
1&0&0&1&0&0&0&0 \\
0&0&0&0&1&0&0&0 \\
0&0&1&0&0&1&0&0
\end{bmatrix}\begin{bmatrix}
0&0&0&0&1&0&0&0 \\
0&1&0&0&0&0&0&0 \\
0&0&0&0&0&0&1&0 \\
0&0&0&1&0&0&0&0
\end{bmatrix}.$$

Important: The inverse of $A_\infty$ is also banded in this example because the determinant of $a(z)$ is a monomial. The inverse of this $L_\infty$ reverses signs below the diagonal, and the inverse of a permutation $P_\infty$ is always the transpose. With indices $k_1 = 1$ and $k_2 = -1$ and $n = 2$, rows 1 and 2 of $P_\infty$ have ones in columns $1 + 2 = 3$ and $2 - 2 = 0$.

Two key questions stand out:

1. Block diagonal factorization of the periodic matrices $U_\infty$ and $L_\infty$.

2. Nonperiodic infinite matrices $A$, banded with banded inverses: Do they factor into $F_1 \cdots F_N$ (block diagonal and shifts)?

References

1. C. Albert, C.-K. Li, G. Strang, and G. Yu, Permutations as products of parallel transpositions, submitted to SIAM J. Discrete Math. (2010).
2. E. Asplund, Inverses of matrices $\{a_{ij}\}$ which satisfy $a_{ij} = 0$ for $j > i + p$, Math. Scand. 7 (1959) 57-60.
3. I. Daubechies, Orthonormal bases of compactly supported wavelets, Comm. Pure Appl. Math. 41 (1988) 909-996.
4. I. Gohberg, M. Kaashoek, and I. Spitkovsky, An overview of matrix factorization theory and operator applications, Operator Th. Adv. Appl., Birkhäuser (2003) 1-102.
5. V. Olshevsky, P. Zhlobich, and G. Strang, Green's matrices, Lin. Alg. Appl. 432 (2010) 218-241.
6. G. Panova, Factorization of banded permutations, arXiv:1007.1760 (2010).
7. J. Plemelj, Riemannsche Funktionenscharen mit gegebener Monodromiegruppe, Monat. Math. Phys. 19 (1908) 211-245.
8. V.S.G. Raghavan, Banded Matrices with Banded Inverses, M.Sc. Thesis, MIT (2010).
9. M.D. Samson and M.F. Ezerman, Factoring permutation matrices into a product of tridiagonal matrices, arXiv:1007.3467 (2010).
10. G. Strang, Fast transforms: Banded matrices with banded inverses, Proc. Natl. Acad. Sci. 107 (2010) 12413-12416.
11. G. Strang and T. Nguyen, The interplay of ranks of submatrices, SIAM Review 46 (2004) 637-646.
12. K. TeKolste, private communication (2010).

Department of Mathematics, MIT
Cambridge MA 02139
gs@math.mit.edu

AMS Subject Classification 15A23
Keywords: Banded matrix, banded inverse, group generators, factorization, permutations.