ETNA
Electronic Transactions on Numerical Analysis.
Volume 43, pp. 163-187, 2015.
Copyright © 2015, Kent State University.
ISSN 1068-9613.
http://etna.math.kent.edu
BLOCK GRAM-SCHMIDT DOWNDATING∗
JESSE L. BARLOW†
Abstract. Given positive integers m, n, and p, where m ≥ n + p and p ≪ n, a method is proposed to modify the QR decomposition of X ∈ R^{m×n} to produce a QR decomposition of X with p rows deleted. The algorithm is based upon the classical block Gram-Schmidt method, requires an approximation of the norm of the inverse of a triangular matrix, has O(mnp) operations, and achieves an accuracy in the matrix 2-norm that is comparable to similar bounds for related procedures for p = 1 in the vector 2-norm. Since the algorithm is based upon matrix-matrix operations, it is appropriate for modern cache-oriented computer architectures.
Key words. QR decomposition, singular value decomposition, orthogonality, downdating, matrix-matrix operations.
AMS subject classifications. 65F25, 65F20, 65F35
1. Introduction. Given a matrix X ∈ R^{m×n} and an integer p, where m ≥ n + p and p ≪ n, suppose that we have the orthogonal decomposition (i.e., QR decomposition)

(1.1)    X = U R,

where U ∈ R^{m×n} is left orthogonal and R ∈ R^{n×n} is upper triangular. Let X be partitioned as

(1.2)    X = \begin{bmatrix} X_0 \\ X̄ \end{bmatrix},

where X_0 has p rows and X̄ has m − p rows, and suppose that we wish to produce the QR decomposition

(1.3)    X̄ = Ū R̄,

where Ū ∈ R^{(m−p)×n̄} is left orthogonal, R̄ ∈ R^{n̄×n} is upper trapezoidal, and n̄ ≤ n. Obtaining (1.3) inexpensively from (1.1), called the block downdating problem, is important in the context of solving recursive least squares problems where observations are added or deleted over time. It also arises as an intermediate computation in a recent null space algorithm by Overton et al. [18]. In (1.2), the first p observations are deleted, but, by simply applying a row permutation to X, any p observations can be deleted. Without changing the algorithm, we could assume that U ∈ R^{m×n_0} and that R ∈ R^{n_0×n} is upper trapezoidal, where n_0 ≤ n, but, for simplicity, we assume n_0 = n.
For p = 1, the block downdating problem (or simply the downdating problem) has an extensive literature [4, 9, 20]. A block downdating algorithm for the Cholesky decomposition based upon hyperbolic transformations was described by Q. Liu [17]. Our block CGS algorithm, closely related to the BCGS2 algorithm in [3] and the CGS2 algorithm in [1, 12], substitutes matrix-matrix operations for matrix-vector operations, substitutes orthogonal decompositions (either the QR or the singular value decomposition) for normalizations, and uses inverse norm estimates and singular values in the orthogonality tests instead of the size of vector norms. This leads to a BLAS-3 [10] or matrix-matrix operation oriented algorithm, which is more suited to modern computer architectures as it makes more effective use of caching. It requires O(mnp) operations.
∗ Received December 5, 2013. Accepted November 5, 2014. Published online on January 23, 2015. Recommended by L. Reichel. The research of Jesse L. Barlow was sponsored by the National Science Foundation under
contract no. CCF-1115704.
† Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA,
16802–6822, USA (barlow@cse.psu.edu).
The downdating problem for p = 1 is solved by adding the column b = e_1, the first column of the identity, to X in (1.1), updating its QR factorization, and obtaining the matrices Ū and R̄ in (1.3) as "by-products" of that updating process.

For p > 1, we substitute for b = e_1 the left orthogonal matrix B ∈ R^{m×p} given by

(1.4)    B = \begin{bmatrix} V \\ 0 \end{bmatrix},

where V ∈ R^{p×p} is orthogonal and the zero block has m − p rows.
Following a script in [3] for adding a block of columns B, we seek an integer k ≤ p, a left orthogonal matrix Q_B ∈ R^{m×k}, an upper trapezoidal matrix R_B ∈ R^{k×p}, and a matrix S_B ∈ R^{n×p} such that

(1.5)    B = U S_B + Q_B R_B,
(1.6)    U^T Q_B = 0.

Once the problem (1.5)–(1.6) is solved, we have the decomposition

\begin{bmatrix} V & X_0 \\ 0 & X̄ \end{bmatrix} = \begin{bmatrix} Q_B & U \end{bmatrix} \begin{bmatrix} R_B & 0 \\ S_B & R \end{bmatrix}.
We then let Z ∈ R^{(n+k)×(n+k)} be an orthogonal matrix such that

(1.7)    Z^T \begin{bmatrix} R_B & 0 \\ S_B & R \end{bmatrix} = \begin{bmatrix} R_V & Y_0 \\ 0 & R̄ \end{bmatrix},

where the column blocks on both sides have widths p and n, the row blocks on the right have p and n̄ rows, n̄ = n − p + k, and R̄ ∈ R^{n̄×n} remains upper trapezoidal. Applying Z to [Q_B  U], we obtain

(1.8)    Ũ = \begin{bmatrix} Q_B & U \end{bmatrix} Z = \begin{bmatrix} Ũ_1 & Ũ_2 \end{bmatrix},

where Ũ_1 has p columns and Ũ_2 has n̄ columns, and we note that

(1.9)    \begin{bmatrix} V & X_0 \\ 0 & X̄ \end{bmatrix} = Ũ \begin{bmatrix} R_V & Y_0 \\ 0 & R̄ \end{bmatrix}.

Thus,

\begin{bmatrix} V \\ 0 \end{bmatrix} = Ũ_1 R_V,

requiring that the matrix Ũ_1 in (1.8) have the form

(1.10)    Ũ_1 = \begin{bmatrix} U_V \\ 0 \end{bmatrix},

where U_V has p rows and the zero block has m − p rows, and implying that U_V, R_V ∈ R^{p×p} must be orthogonal.
Since Ũ is left orthogonal, Ũ_2 ∈ R^{m×n̄} has the form

(1.11)    Ũ_2 = \begin{bmatrix} 0 \\ Ū \end{bmatrix},

where the zero block has p rows and Ū has m − p rows, so that Ū in (1.11) and R̄ in (1.9) satisfy (1.3). Once Q_B, S_B, and R_B are obtained, the decomposition (1.7) can be constructed efficiently as the product of 2 × 2 orthogonal transformations.
Unfortunately, if the matrix

(1.12)    C = \begin{bmatrix} B & U \end{bmatrix}

is rank deficient, the problem of obtaining Q_B, R_B, and S_B in (1.5)–(1.6) is ill-posed; the left orthogonal matrix Q_B is not unique, R_B is rank deficient, and the resulting factorization in (1.3) is rank deficient. From (1.5)–(1.6), C in (1.12) can be factored into

C = \begin{bmatrix} Q_B & U \end{bmatrix} \begin{bmatrix} R_B & 0 \\ S_B & I_n \end{bmatrix}.

Since [Q_B  U] is left orthogonal and the matrix on the right is quasi-upper triangular, C is rank deficient (ill-conditioned) only if R_B is.
There is little difficulty in obtaining Q_B, R_B, and S_B that satisfy (1.5) with a small residual, i.e., where ‖B − U S_B − Q_B R_B‖_2 is small. However, if C is (near) rank deficient, the ill-conditioning in R_B can make it difficult to obtain Q_B such that ‖U^T Q_B‖_2 is small, i.e., such that (1.6) has a small residual. Understanding this issue and constructing an algorithm that addresses it are the main themes of the text below.
To develop block CGS downdating, we relax the assumption that U is left orthogonal and instead assume that

(1.13)    ‖I_n − U^T U‖_2 ≤ ξ ≪ 1

for some small and unknown value ξ. In some contexts (as in, say, [3]), we may assume that ξ ≤ f(m, n) ε_M, where ε_M is machine precision and f(m, n) is a modestly growing function, but, in our discussion, we simply assume that it is "small." It is possible that ξ depends on the condition number of R, for example, when it is the result of a modified Gram-Schmidt QR factorization [6]. Using the assumption (1.13), in Section 2 we design our algorithm with the goal of computing Q_B ∈ R^{m×k}, R_B ∈ R^{k×p} upper trapezoidal, and S_B ∈ R^{n×p}, k ≤ p, such that

(1.14)    ‖B − U S_B − Q_B R_B‖_2 ≤ √5 ξ + O(ξ²),
(1.15)    ‖U^T Q_B‖_2 ≤ 0.5 ξ + O(ξ²).

When k < p, i.e., when R_B is strictly upper trapezoidal, the algorithm produces a lower bound estimate ξ_est for ξ in (1.13).

The algorithm we develop and the bounds associated with it, (1.14)–(1.15), assume the reliability of an O(p² log p) operation heuristic that, for an upper triangular matrix R_2, finds the largest integer k such that, for a prescribed constant β_orth, ‖R_2^{-1}(1:k, 1:k)‖_2 ≤ β_orth.
The outline of this paper is as follows. The algorithm for computing Q_B, R_B, and S_B that satisfy (1.14)–(1.15) is assembled in Section 2. It is based firmly upon the function block_CGS2_step from [3, Function 2.2], which we restate as Function 2.1. Our modification, given as block_CGS2_down (Function 2.2), is justified by Theorems 2.1 and 2.2, requires a condition number estimate to determine the appropriate value k for the dimensions of Q_B and R_B, and leads to the function block_downdate_info (Function 2.4). We forgo a backward error analysis of block_downdate_info, but this could be established from the backward error analysis of block_CGS2_step in [3, Section 3.2]. In Section 3.1, we perform a sequence of Givens rotations to produce Z in (1.7) that results in the factorization (1.3). In Section 3.2, we give an analysis of the relationship between ‖I_n̄ − Ū^T Ū‖_F and ‖I_n − U^T U‖_F and show how, in practice, Ũ_1 and Ũ_2 deviate from the structural orthogonality in (1.10)–(1.11). Numerical tests are presented in Section 4 along with a Householder-based algorithm for adding a block of rows in Section 4.1. Proofs of important theorems are given in Section 5, and we summarize with a brief conclusion in Section 6.
FUNCTION 2.1 (Function block_CGS2_step from [3]).

function [Q_B, S_B, R_B] = block_CGS2_step(U, B)
%
% First block Gram-Schmidt step
% Produces Y_1 = (I − U U^T) B
% Left orthogonal Q_1 satisfies Range(Q_1) = Range(Y_1) = Range((I_m − U U^T) B)
% if R_1 nonsingular
%
(1) S_1 = U^T B;
(2) Y_1 = B − U S_1;
(3) Q_1 R_1 = Y_1;        % QR decomposition of Y_1
%
% Second block Gram-Schmidt step
% Produces Y_2 = (I − U U^T) Q_1
% Left orthogonal Q_B satisfies Range(Q_B) = Range(Y_2) = Range((I_m − U U^T)² B)
% if R_1 and R_2 nonsingular
%
(4) S_2 = U^T Q_1;
(5) Y_2 = Q_1 − U S_2;
(6) Q_B R_2 = Y_2;        % QR decomposition of Y_2
%
% Assemble R_B and S_B to satisfy (1.5)
(7) S_B = S_1 + S_2 R_1;  R_B = R_2 R_1;
end block_CGS2_step
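As a concrete illustration, the following is a minimal NumPy sketch of Function 2.1. The code and its names are ours, not the paper's implementation; np.linalg.qr stands in for the Householder QR factorizations of steps (3) and (6).

import numpy as np

def block_cgs2_step(U, B):
    """Two reorthogonalized block CGS steps: returns QB, SB, RB with
    B = U @ SB + QB @ RB as in (1.5) and U.T @ QB near zero as in (1.6)."""
    # First block Gram-Schmidt step: Y1 = (I - U U^T) B
    S1 = U.T @ B
    Y1 = B - U @ S1
    Q1, R1 = np.linalg.qr(Y1)        # QR decomposition of Y1
    # Second block Gram-Schmidt step: Y2 = (I - U U^T) Q1
    S2 = U.T @ Q1
    Y2 = Q1 - U @ S2
    QB, R2 = np.linalg.qr(Y2)        # QR decomposition of Y2
    # Assemble SB and RB to satisfy (1.5)
    SB = S1 + S2 @ R1
    RB = R2 @ R1
    return QB, SB, RB

When U is near left orthogonal and [B  U] is well conditioned, this sketch reproduces (1.5)–(1.6) to roughly machine precision; the point of Sections 2.1–2.3 is what to do when R_B is ill-conditioned.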
2. A block Gram-Schmidt algorithm for downdating.

2.1. The functions block_CGS2_step and block_CGS2_down. We begin our development by examining the function block_CGS2_step from [3, Function 2.2] (given as Function 2.1 here) applied to a general matrix B ∈ R^{m×p} with the intention of producing matrices Q_B, R_B, and S_B to solve (1.5)–(1.6). The function performs two block classical Gram-Schmidt (BCGS) steps on B. The resulting matrices Q_B and R_B satisfy

Q_B R_B = (I_m − U U^T)² B,

and Q_B, S_B, and R_B satisfy (1.5). The remaining issue is the extent to which (1.6) or at least (1.15) can be satisfied. In [3], Function 2.1 is part of a QR factorization of a larger matrix, and the conditions on that QR factorization implicitly impose conditions on R_B relative to the loss of orthogonality in U.
Assuming U satisfies (1.13) and that R_B is nonsingular, we have that

U^T Q_B = (I_n − U^T U)² U^T B R_B^{-1},    ‖U‖_2 ≤ (1 + ξ)^{1/2}.

A crude norm bound is

‖U^T Q_B‖_2 ≤ ‖I_n − U^T U‖_2² ‖U‖_2 ‖B‖_2 ‖R_B^{-1}‖_2
            ≤ ξ² (1 + ξ)^{1/2} ‖B‖_2 ‖R_B^{-1}‖_2.

If

(2.1)    ξ ‖B‖_2 ‖R_B^{-1}‖_2 ≤ c_orth

for some constant c_orth, then

(2.2)    ‖U^T Q_B‖_2 ≤ c_orth ξ + O(ξ²).

The inequality (2.1) poses two problems: we cannot always guarantee that R_B is nonsingular, much less that it satisfies (2.1), and ξ is not necessarily known. The relationship (2.1)–(2.2) can be made columnwise in that, for any positive integer k ≤ p,

(2.3)    ‖U^T Q_B(:, 1:k)‖_2 ≤ c_orth ξ + O(ξ²)

if

(2.4)    ξ ‖B(:, 1:k)‖_2 ‖R_B^{-1}(1:k, 1:k)‖_2 ≤ c_orth.

Unfortunately, (2.3)–(2.4) do not lead to a bound such as (1.14).

The downdating problem (1.3) demands that we choose B of the form (1.4), but our ideas for choosing Q_B, R_B, and S_B apply to the situation in [3] where B is more general, provided that we normalize it so that ‖B‖_2 = 1. For the downdating problem, R_B and S_B are discarded once the decomposition (1.3) is computed, whereas for the application in [3], these quantities are needed in subsequent computations.
To delete the first p rows from the QR decomposition (1.1), we let

B_0 = \begin{bmatrix} I_p \\ 0 \end{bmatrix}

and look at the application of Function 2.1 to B_0. Steps (1)–(2) of Function 2.1 read

S_1 = U^T B_0 = U(1:p, :)^T,    Y_1 = B_0 − U S_1.

Instead of computing the QR decomposition in step (3), we compute the singular value decomposition (SVD) of Y_1. Thus, we have

(2.5)    Y_1 = Q_1 R_1 V^T,    Q_1 ∈ R^{m×p} left orthogonal,
(2.6)    V ∈ R^{p×p} orthogonal,    R_1 = diag(ρ_1, ..., ρ_p),    ρ_1 ≥ ··· ≥ ρ_p ≥ 0.
REMARK 2.1. In (2.5), the matrix R_1 in statement (3) of Function 2.1 is replaced by the diagonal matrix of Y_1's singular values rather than its upper triangular factor. We use the same notation for it since it is the matrix R_1 obtained by Function 2.1 with B given by (1.4) and V being the right singular vector matrix in (2.5). Moreover, R_1 is used in the same way in the algorithm described below as it is in Function 2.1. We substitute the SVD because it sets us up to extract useful information for solving the problem (1.14)–(1.15).

To obtain a matrix Q_B that is near orthogonal to U, the second BCGS step, steps (4)–(5) in Function 2.1, reads

S_2 = U^T Q_1,    Y_2 = Q_1 − U S_2,
(2.7)    Q̂_B R_2 = Y_2,  QR decomposition,    Q̂_B ∈ R^{m×p} left orthogonal,    R_2 ∈ R^{p×p} upper triangular.

To guarantee that Q_1 and Q̂_B are left orthogonal to near machine accuracy, that is,

(2.8)    ‖I_p − Q^T Q‖_F ≤ ε_M L(m, p) ≤ ξ,    Q ∈ {Q_1, Q̂_B},

where L(m, p) = O(mp^{3/2}), ε_M is the machine unit, and ξ is defined in (1.13), we recommend that (2.5) be computed with either the Golub-Kahan-Householder (GKH) SVD [13] or the Lawson-Hanson-Chan (LHC) SVD [16, Section 18.5], [8] and that the QR factorization (2.7) be computed using the Householder QR factorization [7].
Step (7) of Function 2.1 becomes

(2.9)    S_B = S_1 V + S_2 R_1,    R̂_B = R_2 R_1.

Equations (2.7) and (2.9) yield Q̂_B, R̂_B, and S_B such that

(2.10)    B = U S_B + Q̂_B R̂_B.

We write Q̂_B and R̂_B because we are not yet ready to accept these matrices as solutions to (1.14)–(1.15); however, S_B will not be further modified by our algorithms. The only difference between (2.9) and the assembly step of Function 2.1 is the presence of V in the computation of S_B.
The modifications (2.5) and (2.9) to block_CGS2_step produce block_CGS2_down, given in Function 2.2. This function produces the same Q̂_B, R̂_B, and S_B as Function 2.1 applied to U and B in (1.4). Note that it also outputs R_1, the diagonal matrix of singular values from the SVD in (2.5)–(2.6), and the upper triangular matrix R_2.

REMARK 2.2. Function 2.2 requires three matrix multiplications with U, one SVD of Y_1 using the GKH or LHC SVD, one QR decomposition of Y_2, two matrix multiplications and one matrix addition to form S_B, and one multiplication of a triangular matrix by a diagonal matrix to form R̂_B. This totals 6mnp + 10mp² + (20/3)p³ + 2np² + O(m + n) operations. The dominant cost is the matrix multiplication with U.
Although equation (2.10) guarantees that Q̂_B, R̂_B, and S_B satisfy (1.5) (and thus also (1.14)), we cannot guarantee that Q_B = Q̂_B satisfies (1.15). However, we are able to identify a subset of the columns of Q̂_B that are sufficiently orthogonal to U. More precisely, for a given constant c_orth such that 0 < c_orth ≤ 1, we intend to find the largest integer k such that

(2.11)    ‖U^T Q̂_B(:, 1:k)‖_2 ≤ c_orth ξ + O(ξ²),
(2.12)    ‖U^T Q̂_B(:, 1:k)‖_F ≤ c_orth ξ_F + O(ξ_F²),    ξ_F ≥ max{‖I_n − U^T U‖_F, ξ}.
FUNCTION 2.2 (The function block_CGS2_down).

function [Q̂_B, S_B, R̂_B, R_2, R_1] = block_CGS2_down(U, p)
%
% First orthogonalization of B_0 = [I_p; 0] against U
%
(1) S_1 = U^T [I_p; 0]  (= U(1:p, :)^T);
(2) Y_1 = [I_p; 0] − U S_1;
(3) Q_1 R_1 V^T = Y_1;       % SVD of Y_1
%
% Second orthogonalization
%
(4) S_2 = U^T Q_1;  Y_2 = Q_1 − U S_2;
(5) Q̂_B R_2 = Y_2;           % QR decomposition of Y_2
(6) S_B = S_1 V + S_2 R_1;  R̂_B = R_2 R_1;
end block_CGS2_down
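A NumPy sketch of Function 2.2 follows, again with our own names and conventions: np.linalg.svd plays the role of the GKH or LHC SVD, np.linalg.qr plays the role of the Householder QR factorization, and R_1 is carried as a vector rho of singular values.

import numpy as np

def block_cgs2_down(U, p):
    """Sketch of Function 2.2: orthogonalize B0 = [I_p; 0] against U.
    Returns QB_hat, SB, RB_hat, R2, and rho (the diagonal of R1)."""
    # Steps (1)-(2): S1 = U(1:p,:)^T and Y1 = B0 - U S1
    S1 = U[:p, :].T
    Y1 = -U @ S1
    Y1[:p, :] += np.eye(p)
    # Step (3): SVD of Y1 in place of a QR factorization
    Q1, rho, Vt = np.linalg.svd(Y1, full_matrices=False)
    # Steps (4)-(5): second orthogonalization and its QR factorization
    S2 = U.T @ Q1
    QB_hat, R2 = np.linalg.qr(Q1 - U @ S2)
    # Step (6): SB = S1 V + S2 R1 and RB_hat = R2 R1 with R1 = diag(rho)
    SB = S1 @ Vt.T + S2 * rho        # trailing * rho scales columns by diag(rho)
    RB_hat = R2 * rho
    return QB_hat, SB, RB_hat, R2, rho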
In this paper, we default to c_orth = 0.5 as it was used in [4] for the case p = 1. Daniel et al. [9] used c_orth = 1 for the case p = 1 since it maintains roughly the same level of orthogonality. Values of c_orth > 1 are not desirable since they would make the bound in (2.11) larger than the bound in (1.13). On the other hand, we do not want the restriction (2.11) to be too harsh; thus we recommend that c_orth be bounded away from zero.
To assure the bound (1.15), we need two theorems, both of which require the definition

(2.13)    β_j = ‖R_2^{-1}(1:j, 1:j)‖_2,  j = 1, ..., p,    β_0 = 0,

and both of which are proved in Section 5.1. Equation (2.13) defines the norms of the inverses of the leading principal submatrices of R_2 in (2.7). The first theorem relates the values β_j in (2.13) to the singular values in the diagonal matrix R_1 in (2.5).
THEOREM 2.1. Assume that U satisfies (1.13). Let Q_1 in (2.5) be left orthogonal and β_j, j = 1, ..., p, be given by (2.13). Let ρ_j, j = 1, ..., p, be the singular values of Y_1 in (2.5). If, for a given constant c_orth with 0 < c_orth ≤ 1, k is the largest integer such that

(2.14)    β_k ≤ β_orth = (1 + c_orth²)^{1/2}

and k < p, then

(2.15)    ρ_j ≤ α_orth ξ + O(ξ²),    α_orth = (1 + c_orth²)^{1/2} / c_orth,    for j = k+1, ..., p.

Using c_orth = 0.5, the two constants in Theorem 2.1 are β_orth = √5/2 and α_orth = √5. If k < p, then the inequality

ρ_{k+1} ≤ α_orth ξ + O(ξ²)
yields the lower bound estimate of ξ given by

(2.16)    ξ_est = ρ_{k+1} / α_orth.

This estimate may be useful in determining if U has drifted too far from being left orthogonal after a sequence of modifications to the QR factorization.
A second theorem, which follows from Theorem 2.1, shows that establishing k in (2.14) leads to an algorithm to produce Q_B, S_B, and R_B satisfying (1.14)–(1.15).

THEOREM 2.2. Assume the hypothesis and terminology of Theorem 2.1, including the assumption that k satisfies (2.14). Let Q̂_B, R̂_B, and S_B be the output of Function 2.2. Then we have (2.11)–(2.12) and

(2.17)    B = U S_B + Q̂_B(:, 1:k) R̂_B(1:k, :) + D_k,

where D_p = 0_{m×p},

(2.18)    D_k = Q̂_B(:, k+1:p) R̂_B(k+1:p, k+1:p),    k < p,

and

(2.19)    ‖D_k‖_2 ≤ \begin{cases} 0, & k = p, \\ ρ_{k+1}, & k < p. \end{cases}

Thus, using the bound for ρ_{k+1} in (2.15), for c_orth = 0.5, we have that Q_B = Q̂_B(:, 1:k), R_B = R̂_B(1:k, :), and S_B satisfy (1.14)–(1.15).
In the next section, we give an O(p² log p) operation algorithm to find the integer k that satisfies (2.14).
2.2. Finding the largest k such that Q̂_B(:, 1:k) is near orthogonal to U. To produce an algorithm that finds the largest k satisfying (2.14) for the upper triangular matrix R_2 in step (5) of Function 2.2, we need an algorithm to compute ‖R_2^{-1}(1:j, 1:j)‖_2 with reasonable accuracy for a given j, and we need a binary search.

To produce the first, we compute z_j, w_j ∈ R^j such that

R_2^{-1}(1:j, 1:j) w_j = β_j z_j,
R_2^{-T}(1:j, 1:j) z_j = β_j w_j + f_j,

thus yielding the approximate leading singular triplet (β_j, z_j, w_j) of R_2^{-1}(1:j, 1:j). The vector f_j is a residual that satisfies

w_j^T f_j = 0,    ‖f_j‖_2 ≤ tol · β_j

for some tolerance tol. This can easily be done with a few steps of a Golub-Kahan-Lanczos (GKL) bidiagonal reduction followed by an algorithm to find the largest singular triplet of a bidiagonal matrix. This approach is related to ideas in Ferng, Golub, and Plemmons [11] and ideas that have been used in ULV decompositions [2, 5]. If we are seeking the leading singular value of R_2^{-1}(1:j, 1:j), such a procedure is akin to finding the leading eigenvalue of a symmetric positive definite matrix by the Lanczos algorithm, and no reorthogonalization is necessary in this circumstance [19]. Since the details of GKL bidiagonal reduction and the process of extracting the leading singular triplet are explored in detail elsewhere (see, for instance, [14, Chapters 8 and 9]), we skip them here and simply assume that the triplet (β_j, z_j, w_j) can be delivered by the "black box" call

[β_j, z_j, w_j] = GKL_inv_norm(R_2(1:j, 1:j)).
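To make the black box concrete, here is a hedged NumPy/SciPy sketch of one way GKL_inv_norm could be realized: a few GKL bidiagonalization steps applied to R_2^{-1}(1:j, 1:j) via triangular solves. It returns only the norm estimate β_j; the singular vectors z_j and w_j, the residual test on f_j, and the tolerance tol are omitted, and the step count and starting vector are our choices, not the paper's.

import numpy as np
from scipy.linalg import solve_triangular

def gkl_inv_norm(R, steps=5):
    """Estimate ||R^{-1}||_2 for upper triangular R via Golub-Kahan-Lanczos
    bidiagonalization of R^{-1}; no reorthogonalization, as the text permits."""
    j = R.shape[0]
    w = np.ones(j) / np.sqrt(j)      # starting vector w_1 (our choice)
    u = np.zeros(j)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(min(steps, j)):
        # alpha_i u_i = R^{-1} w_i - beta_{i-1} u_{i-1}
        u = solve_triangular(R, w) - beta * u
        alpha = np.linalg.norm(u)
        u /= alpha
        # beta_i w_{i+1} = R^{-T} u_i - alpha_i w_i
        t = solve_triangular(R, u, trans='T') - alpha * w
        beta = np.linalg.norm(t)
        alphas.append(alpha)
        betas.append(beta)
        if beta == 0.0:
            break
        w = t / beta
    # Bidiagonal projection of R^{-1}: its largest singular value is a lower
    # bound for ||R^{-1}||_2 that sharpens with each step
    k = len(alphas)
    Bd = np.zeros((k, k))
    Bd[np.arange(k), np.arange(k)] = alphas
    Bd[np.arange(1, k), np.arange(k - 1)] = betas[:k - 1]
    return np.linalg.svd(Bd, compute_uv=False)[0]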
FUNCTION 2.3 (Lanczos-based routine for finding k satisfying (2.14)).

function k = max_col_orth(R_2, β_orth)
p = length(R_2);
if β_orth |R_2(1, 1)| < 1
  %
  % All of R_2 is too small; no columns are guaranteed orthogonal
  %
  k = 0;
else
  [β, z, w] = GKL_inv_norm(R_2);
  if β <= β_orth
    %
    % ‖R_2^{-1}‖_2 ≤ β_orth; all columns are guaranteed orthogonal
    %
    k = p;
  else
    %
    % Do binary search
    %
    first = 1;  last = p;
    %
    % At any given point in this loop, first ≤ k < last
    %
    while last − first > 1
      middle = ⌊(first + last)/2⌋;
      cols = 1:middle;
      [β, z, w] = GKL_inv_norm(R_2(cols, cols));
      if β > β_orth
        last = middle;
      else
        first = middle;
      end;
    end;
    k = first;
  end;
end;
end max_col_orth
Coupling GKL_inv_norm with a binary search that successively brackets k in the interval first ≤ k < last, starting with first = 1 and last = p and converging when k = first = last − 1, yields the function max_col_orth (Function 2.3) that produces k in (2.14).
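In Python, the same control flow looks as follows. For simplicity, this sketch computes ‖R_2^{-1}(1:j, 1:j)‖_2 exactly from the smallest singular value where the paper would call GKL_inv_norm (the estimator sketched above could be substituted). Because β_j is nondecreasing in j, the bracket [first, last) always contains k, and the loop takes at most ⌈log₂ p⌉ norm evaluations.

import numpy as np

def inv_norm(R):
    # Exact ||R^{-1}||_2 via the smallest singular value; an estimator such as
    # gkl_inv_norm above would match the paper's O(p^2 log p) cost instead.
    return 1.0 / np.linalg.svd(R, compute_uv=False)[-1]

def max_col_orth(R2, beta_orth):
    """Sketch of Function 2.3: the largest k with
    ||R2^{-1}(1:k,1:k)||_2 <= beta_orth (k = 0 if there is none)."""
    p = R2.shape[0]
    if beta_orth * abs(R2[0, 0]) < 1.0:
        return 0                      # no columns are guaranteed orthogonal
    if inv_norm(R2) <= beta_orth:
        return p                      # all columns are guaranteed orthogonal
    first, last = 1, p                # invariant: first <= k < last
    while last - first > 1:
        middle = (first + last) // 2
        if inv_norm(R2[:middle, :middle]) > beta_orth:
            last = middle
        else:
            first = middle
    return first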
2.3. The necessary information for a block downdate. The following function, Function 2.4, uses block_CGS2_down (Function 2.2) to produce Q̂_B, S_B, and R̂_B and then uses max_col_orth (Function 2.3) to find k such that Q_B = Q̂_B(:, 1:k), S_B, and R_B = R̂_B(1:k, :) satisfy (1.14)–(1.15) as shown by Theorem 2.2.
FUNCTION 2.4 (Block Downdate Information).

function [Q_B, S_B, R_B, k, ξ_est] = block_downdate_info(U, p)
%
% Produces information for a block downdate operation for deleting the first p rows
% from a QR decomposition where U is a near left orthogonal matrix.
% Function max_col_orth from Section 2.2 is called to find the largest integer k
% such that ‖R_2^{-1}(1:k, 1:k)‖_2 ≤ β_orth.
%
% Define constants used in the function. Specify c_orth = 0.5 to get
% the bounds (1.14)–(1.15).
%
(1) c_orth = 0.5;  β_orth = (1 + c_orth²)^{1/2};  α_orth = β_orth / c_orth;
(2) [Q̂_B, S_B, R̂_B, R_2, R_1] = block_CGS2_down(U, p);
%
% Determine "rank = k", the number of columns of Q̂_B guaranteed orthogonal to U.
%
(3) k = max_col_orth(R_2, β_orth);
%
% Give the lower bound estimate ξ_est in (2.16) for ξ.
% Note that ρ_{k+1} = R_1(k+1, k+1).
%
(4) if k < p
(5)   ξ_est = R_1(k+1, k+1) / α_orth;
(6) else
(7)   ξ_est = 0;
(8) end;
%
% Produce the rank-k solution to satisfy (1.14)–(1.15).
%
(9) R_B = R̂_B(1:k, :);
(10) Q_B = Q̂_B(:, 1:k);
end block_downdate_info
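Combining the sketches above gives a compact Python analogue of Function 2.4; block_cgs2_down and max_col_orth are the earlier sketches, and the names are again ours.

import numpy as np

def block_downdate_info(U, p, c_orth=0.5):
    """Sketch of Function 2.4: data for deleting the first p rows of X = U R."""
    # Constants; c_orth = 0.5 gives the bounds (1.14)-(1.15)
    beta_orth = np.sqrt(1.0 + c_orth ** 2)
    alpha_orth = beta_orth / c_orth
    QB_hat, SB, RB_hat, R2, rho = block_cgs2_down(U, p)
    # k = number of columns of QB_hat guaranteed near orthogonal to U
    k = max_col_orth(R2, beta_orth)
    # Lower bound estimate (2.16) for xi; rho[k] is rho_{k+1} in 0-based indexing
    xi_est = rho[k] / alpha_orth if k < p else 0.0
    return QB_hat[:, :k], SB, RB_hat[:k, :], k, xi_est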
REMARK 2.3. Except for the O(p² log p) operations for max_col_orth, almost all of the operations for Function 2.4 are from block_CGS2_down; thus it requires 6mnp + 10mp² + (20/3)p³ + 2np² + O(m + n + p² log p) operations.
3. Producing a new QR factorization.

3.1. The algorithm to produce a new QR factorization. The remaining step in performing the block downdate is to find an orthogonal matrix Z such that

(3.1)    Z^T \begin{bmatrix} R_B & 0 \\ S_B & R \end{bmatrix} = \begin{bmatrix} R_V & Y_0 \\ 0 & R̄ \end{bmatrix},    n̄ = n − p + k,

where the row blocks on the left have k and n rows, the row blocks on the right have p and n̄ rows, the column blocks on both sides have widths p and n, and R̄ remains upper trapezoidal. First, permute the rows of the above matrix so that

P^T \begin{bmatrix} R_B & 0 \\ S_B & R \end{bmatrix} = \begin{bmatrix} R_B & 0 \\ S_B(n̄+1:n, :) & R(n̄+1:n, :) \\ S_B(1:n̄, :) & R(1:n̄, :) \end{bmatrix}.
We then let Z̆_0 be a product of Householder transformations such that

Z̆_0^T \begin{bmatrix} R_B \\ S_B(n̄+1:n, :) \end{bmatrix} = R̆_B,

where R̆_B is upper triangular. Letting Z_0 = \begin{bmatrix} Z̆_0 & 0 \\ 0 & I_n̄ \end{bmatrix}, we have

Z_0^T \begin{bmatrix} R_B & 0 \\ S_B(n̄+1:n, :) & R(n̄+1:n, :) \\ S_B(1:n̄, :) & R(1:n̄, :) \end{bmatrix} = \begin{bmatrix} R̆_B & Y̆_0 \\ S̆_B & R̆ \end{bmatrix},

where the row blocks on the right have p and n̄ rows. Note that R̆ is upper trapezoidal and that Y̆_0 is nonzero only in columns n̄+1, ..., n.

The remaining orthogonal transformations in Z are either Givens rotations or Householder transformations applied to only two rows. For simplicity, we just refer to both kinds as "Givens rotations."
Since R̆_B is upper triangular, we let

(3.2)    Z̆ = Z_1 ··· Z_p,

where each Z_j, j = 1, ..., p, is given by

Z_j = G_{j,p+n̄} ··· G_{j,p+1}

and G_{j,ℓ} is a Givens rotation rotating rows j and ℓ and inserting a zero in position (j, ℓ) so that

Z̆^T \begin{bmatrix} R̆_B \\ S̆_B \end{bmatrix} = \begin{bmatrix} R_V \\ 0 \end{bmatrix}.

In terms of data movement, (3.2) is a poor Givens ordering. A better one is to let

Z̆ = Ẑ_1 Ẑ_2 ··· Ẑ_{n̄+p−1}.

Here we have

Ẑ_j = \begin{cases} G_{j,n̄+p} ··· G_{1,n̄+p−j+1}, & j < p, \\ G_{p,n̄+2p−j} ··· G_{1,n̄+p−j+1}, & p ≤ j ≤ n̄, \\ G_{p,n̄+2p−j} ··· G_{j−n̄+1,p+1}, & n̄ < j < n̄+p. \end{cases}

On its set of "active rows," each of these Ẑ_j has the form

Ẑ_j = \begin{bmatrix} Γ_j & Δ_j \\ −Δ_j & Γ_j \end{bmatrix},    Γ_j² + Δ_j² = I,

where Γ_j and Δ_j are diagonal. The orthogonal factor Z in the operation above is given by

Z = P Z_0 Ẑ_1 ··· Ẑ_{n̄+p−1}.
REMARK 3.1. Ignoring terms of O(mn), the algorithm described in this section requires 2mp² + (10/3)p³ − 2kp² operations to calculate and apply Z_0 and 6mn̄p + 3nn̄p + 3n̄p² operations to calculate and apply Z̆, for a total of 6mn̄p + 3nn̄p + 3n̄p² + 2mp² + (10/3)p³ − 2kp² operations. The dominant cost in this algorithm is the operation of updating the orthogonal factor as in (1.8).
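The sketch below completes the downdate in dense NumPy form. It is not the paper's algorithm: a complete Householder QR supplies an orthogonal Z annihilating the first block column, in place of the permutation and structured Givens ordering above, so the trapezoidal form of R̄ is not preserved automatically and a final QR restores it. The asymptotic cost is therefore higher than in Remark 3.1, but the computed Ū and R̄ serve the same purpose.

import numpy as np

def block_downdate(U, R, QB, SB, RB, k):
    """Given X = U R and the output of block_downdate_info, return
    U_bar, R_bar with X[p:, :] approximately equal to U_bar @ R_bar."""
    p = RB.shape[1]
    # Z^T [RB; SB] = [RV; 0] via a complete QR of the first block column
    Z, _ = np.linalg.qr(np.vstack([RB, SB]), mode='complete')
    # Transform the second block column [0; R]; rows p+1 onward play R_bar's role
    T = Z.T @ np.vstack([np.zeros((k, R.shape[1])), R])
    W = T[p:, :]
    # Transform the orthogonal factor as in (1.8) and keep the (2,2) block (1.11)
    U_tilde = np.hstack([QB, U]) @ Z
    U_bar = U_tilde[p:, p:]
    # W is full here (unlike with the paper's structured Z), so re-triangularize:
    # X_bar = U_bar W = (U_bar Qw) R_bar
    Qw, R_bar = np.linalg.qr(W)
    return U_bar @ Qw, R_bar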
3.2. Properties of the new QR factorization. In our new QR factorization, we have Ũ given in (1.8) with

(3.3)    Ũ = \begin{bmatrix} U_V & ΔU \\ ΔU_V & Ū \end{bmatrix},

where the row blocks have p and m − p rows, the column blocks have widths p and n̄, and, if ξ = 0, ΔU = 0_{p×n̄} and ΔU_V = 0_{(m−p)×p}. In practice, this will not necessarily be the case. Given the QR factorization (1.1), our downdate algorithm has produced

\begin{bmatrix} V & X_0 \\ 0 & X̄ \end{bmatrix} = \begin{bmatrix} Q_B & U \end{bmatrix} \begin{bmatrix} R_B & 0 \\ S_B & R \end{bmatrix} + \begin{bmatrix} D_k & 0 \end{bmatrix}
  = \begin{bmatrix} Q_B & U \end{bmatrix} Z Z^T \begin{bmatrix} R_B & 0 \\ S_B & R \end{bmatrix} + \begin{bmatrix} D_k & 0 \end{bmatrix}
  = Ũ \begin{bmatrix} R_V & Y_0 \\ 0 & R̄ \end{bmatrix} + \begin{bmatrix} D_k & 0 \end{bmatrix}

(3.4)    = \begin{bmatrix} U_V & ΔU \\ ΔU_V & Ū \end{bmatrix} \begin{bmatrix} R_V & Y_0 \\ 0 & R̄ \end{bmatrix} + \begin{bmatrix} D_k & 0 \end{bmatrix}.

Blockwise, that is

\begin{bmatrix} V \\ 0 \end{bmatrix} = \begin{bmatrix} U_V \\ ΔU_V \end{bmatrix} R_V + D_k,
X_0 = U_V Y_0 + (ΔU) R̄,
(3.5)    X̄ = (ΔU_V) Y_0 + Ū R̄.

We accept Ū above to produce (1.3). Thus we need bounds for

(3.6)    ‖X̄ − Ū R̄‖_F

and

(3.7)    ‖I_n̄ − Ū^T Ū‖_F.

For the quantity in (3.6), we could use an identical argument to obtain a bound in the 2-norm, but for the quantity in (3.7), the Frobenius norm yields a more meaningful bound. Bounds for ‖ΔU‖_2 and ‖ΔU_V‖_2 are also desirable. Theorems 3.1 and 3.3 are proved in Section 5.2. The short proof of Corollary 3.2, which follows from Theorem 3.1, is given in this section.
THEOREM 3.1. Let Ũ be the result of applying the matrix Z defined to perform the operation (1.7), using Q_B, R_B, and S_B produced by Function 2.4 with input U satisfying (1.13) and Q_B satisfying (2.8). Assume that Q_B is exactly left orthogonal. If Ũ is partitioned according to (3.3), then

(3.8)    ‖ΔU_V‖_2 ≤ α_orth ξ + O(ξ²),
(3.9)    ‖ΔU‖_2 ≤ (α_orth + γ_orth) ξ + O(ξ²),
(3.10)   ‖I_{n+k} − Ũ^T Ũ‖_2 ≤ γ_orth ξ + O(ξ²),

where

γ_orth = 1 + c_orth.
COROLLARY 3.2. Assume the terminology and hypothesis of Theorem 3.1. Then

(3.11)    ‖X̄ − Ū R̄‖_F ≤ ‖ΔU_V‖_2 ‖X‖_F
(3.12)                  ≤ α_orth ξ ‖X‖_F + O(ξ²).

Proof. From (3.5), we have

X̄ − Ū R̄ = (ΔU_V) Y_0,

where

Y_0 = Z(:, 1:p)^T \begin{bmatrix} 0 \\ R \end{bmatrix}.

Thus,

‖X̄ − Ū R̄‖_F = ‖(ΔU_V) Y_0‖_F ≤ ‖ΔU_V‖_2 ‖Y_0‖_F.

Since

‖Y_0‖_F = ‖Z(:, 1:p)^T \begin{bmatrix} 0 \\ R \end{bmatrix}‖_F ≤ ‖X‖_F,

we have (3.11). The bound (3.8) gives us (3.12).
We now let

Ũ_1 = \begin{bmatrix} U_V \\ ΔU_V \end{bmatrix},    Ũ_2 = \begin{bmatrix} ΔU \\ Ū \end{bmatrix},

where in each case the upper block has p rows and the lower block has m − p rows, and present a theorem bounding the quantity in (3.7).
THEOREM 3.3. Assume the terminology and hypothesis of Theorem 3.1. Let Q_B satisfy (2.8), and define ξ_F = max{‖I_n − U^T U‖_F, ξ}. Then

(3.13)    ‖I_n̄ − Ũ_2^T Ũ_2‖_F² ≤ ξ_F² + 2 (‖U^T Q_B‖_F² − ‖Ũ_1^T Ũ_2‖_F²) + ‖I_k − Q_B^T Q_B‖_F² − ‖I_p − Ũ_1^T Ũ_1‖_F²,

(3.14)    ‖I_n̄ − Ū^T Ū‖_F ≤ ‖I_n̄ − Ũ_2^T Ũ_2‖_F + ‖(ΔU)^T (ΔU)‖_F
(3.15)                     ≤ ‖I_n̄ − Ũ_2^T Ũ_2‖_F + √p (α_orth + γ_orth)² ξ² + O(ξ_F³).

Thus,

(3.16)    ‖I_n̄ − Ū^T Ū‖_F² ≤ (5/2) ξ_F² − 2 ‖Ũ_1^T Ũ_2‖_F² − ‖I_p − Ũ_1^T Ũ_1‖_F² + O(ξ_F³).
Theorem 3.3 establishes that the loss of orthogonality in Ū will not be significantly worse than that in U. Moreover, it is possible that Ū is closer to a left orthogonal matrix than U since ‖I_p − Ũ_1^T Ũ_1‖_F and ‖Ũ_1^T Ũ_2‖_F may contain a significant portion of the loss of orthogonality in Ũ, while ‖I_k − Q_B^T Q_B‖_F will be near machine accuracy and our implied bound from (1.15) for ‖U^T Q_B‖_F could be quite pessimistic. In the next section, our numerical tests on a sliding window problem bear out this observation.
4. Numerical tests. We consider a classic "sliding window" problem from statistics. Here, we compute the QR decomposition of a matrix X(t) ∈ R^{m×n} which is a slice of a large matrix X_big ∈ R^{M×n}. The results displayed are for M = 4000, m = 300, n = 250. At step t,

X(t) = X_big(p(t−1)+1 : p(t−1)+m, :),

where, in the example shown, p = 40. The matrix X_big is constructed by generating a random M × n matrix using MATLAB's randn function, which simulates a standard normal distribution, and then multiplying the rows at random by factors of 1, 10^{−7}, 10^{−14}, and 10^{−21}. Thus X_big is a random matrix with rows that have both large and small entries.
4.1. Adding a block of rows to a QR factorization. For the purposes of our numerical tests in Section 4.2, we also need an algorithm that adds a block of rows to a QR factorization. Supposing again that we already have the factorization (1.1) and that we wish to add a p × n block of rows given by X_new, we simply compute the QR factorization

\begin{bmatrix} R \\ X_new \end{bmatrix} = Q_new \begin{bmatrix} R_new \\ 0 \end{bmatrix},

where Q_new is the product of n Householder transformations. Thus, the new QR factorization is

\begin{bmatrix} X \\ X_new \end{bmatrix} = U_new R_new,

where

U_new = \begin{bmatrix} U & 0 \\ 0 & I_p \end{bmatrix} Q_new(:, 1:n).
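In NumPy, the row update just described is a few lines (a sketch with our names; np.linalg.qr supplies the product of Householder transformations):

import numpy as np

def add_block_rows(U, R, X_new):
    """Given X = U R, return U_new, R_new with [X; X_new] = U_new @ R_new."""
    r = R.shape[0]
    # Householder QR of the stacked matrix [R; X_new]
    Q_new, R_new = np.linalg.qr(np.vstack([R, X_new]))
    # U_new = [U 0; 0 I_p] Q_new(:, 1:n), formed blockwise to avoid the zeros
    U_new = np.vstack([U @ Q_new[:r, :], Q_new[r:, :]])
    return U_new, R_new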
4.2. The sliding window experiment. Given the QR factorization

X(t) = U(t) R(t)

at step t, the QR factorization of X(t+1) is produced by adding p rows at the bottom of X(t) using the algorithm in Section 4.1 to produce its QR factorization, deleting the p rows at the top of X(t), and updating its QR factorization using Function 2.4 followed by the algorithm in Section 3.1 to obtain

X(t+1) = U(t+1) R(t+1).

For t = 1, 2, ..., rank changes were frequent because of the wild scaling of X_big.

At t = 1, we produce the QR decomposition

X(1) = U(1) R(1)

using the modified Gram-Schmidt algorithm. Since X(1) is ill-conditioned, consistent with the bound in [6], we expect that ‖I − U(1)^T U(1)‖_2 will be significantly larger than the IEEE double precision machine unit ε_M = 2^{−53} ≈ 1.1102 × 10^{−16}.
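A scaled-down driver for this experiment, written against the sketches of the earlier sections (add_block_rows, block_downdate_info, and block_downdate), might look as follows. The dimensions are reduced from the paper's M = 4000, m = 300, n = 250, p = 40 so that the loop runs quickly, and mgs_qr is a textbook modified Gram-Schmidt factorization used only to start the recursion.

import numpy as np

def mgs_qr(X):
    """Modified Gram-Schmidt QR factorization, used only at t = 1."""
    Q = X.astype(float).copy()
    n = Q.shape[1]
    R = np.zeros((n, n))
    for j in range(n):
        R[j, j] = np.linalg.norm(Q[:, j])
        Q[:, j] /= R[j, j]
        R[j, j + 1:] = Q[:, j] @ Q[:, j + 1:]
        Q[:, j + 1:] -= np.outer(Q[:, j], R[j, j + 1:])
    return Q, R

M, m, n, p = 1000, 120, 100, 20
rng = np.random.default_rng(0)
X_big = rng.standard_normal((M, n))
X_big *= rng.choice([1.0, 1e-7, 1e-14, 1e-21], size=(M, 1))  # wild row scaling

U, R = mgs_qr(X_big[:m, :])
for t in range(1, (M - m) // p):
    # Slide the window: add p rows at the bottom, delete p rows at the top
    U, R = add_block_rows(U, R, X_big[p * (t - 1) + m : p * (t - 1) + m + p, :])
    QB, SB, RB, k, xi_est = block_downdate_info(U, p)
    U, R = block_downdate(U, R, QB, SB, RB, k)
    xi = np.linalg.norm(np.eye(U.shape[1]) - U.T @ U, 2)  # loss (4.1)
    print(f"t={t:3d}  k={k:2d}  xi={xi:.2e}  xi_est={xi_est:.2e}")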
[Figure 4.1, "Orthogonality for Sliding Window Problem," plots the log_10 of the orthogonality loss (4.1), together with its estimate ξ_est, against the window number.]
FIG. 4.1. Loss of orthogonality for the sliding window example.
We produce two graphs. The first, Figure 4.1, is the graph of

(4.1)    ξ = ‖I − U(t)^T U(t)‖_2.

The symbol "+" in the graph indicates the estimate of ξ given by ξ_est in (2.16) from the function block_downdate_info whenever k < p holds in Theorem 2.2. As can be observed, ξ_est is a fairly accurate estimate of ξ.

The second, Figure 4.2, is the graph of

(4.2)    ‖X(t) − U(t) R(t)‖_2 / ‖X(t)‖_2.

The value is graphed with "*" whenever k < p holds in Function 2.4, which makes R̄ have smaller rank than R.
Note that in Figure 4.1, the value of ‖I − U(t)^T U(t)‖_2 improves to near machine precision after about 10–20 steps, and this improvement persists. A similar pattern can be observed for the relative residual ‖X(t) − U(t) R(t)‖_2 / ‖X(t)‖_2; that is, it also improves to near machine precision and remains at this level. If we compute the QR factorization of X(1) with Householder transformations instead of the modified Gram-Schmidt method, the loss of orthogonality starts out at near machine precision and stays there. The residuals, graphed in Figure 4.2, follow the same pattern.

We have repeated this test with different values of M, m, n, and p many times. If U(1) satisfied the fundamental assumption (1.13), the result was always similar. However, if the matrix U(1) in the initial MGS factorization of X(1) did not satisfy the assumption (1.13), meaning that U(1) could not be considered "near left orthogonal" and thus did not meet a fundamental assumption of this work, the residuals still self-corrected, but the loss of orthogonality either did not self-correct or took significantly longer to do so.
[Figure 4.2, "Residuals for Sliding Window Problem," plots the log_10 of the relative residual (4.2) against the window number, marking the steps at which R̄ is rank deficient.]
FIG. 4.2. Residuals for the sliding window example.
5. Proofs of the key theorems.

5.1. Proofs of Theorems 2.1 and 2.2. Using the form of R_1 in (2.6), we have that if, for some ℓ ≤ p, it holds that ρ_ℓ = ··· = ρ_p = 0, then

(I − U U^T) \begin{bmatrix} V(:, ℓ:p) \\ 0 \end{bmatrix} = 0,

thus concluding that \begin{bmatrix} V(:, ℓ:p) \\ 0 \end{bmatrix} is linearly dependent upon the columns of U and thereby allowing us to immediately reduce the dimension of this problem. Therefore, without loss of generality, we assume that

(5.1)    ρ_1 ≥ ··· ≥ ρ_p > 0

and thus that R_1 is nonsingular.

Using (5.1), we note that

(5.2)    U^T Q_1 = (I − U^T U) U^T B R_1^{-1} = (I − U^T U) F_1,

where F_1 = U^T B R_1^{-1}. This allows us to rewrite Q_1 as

(5.3)    Q_1 = B R_1^{-1} − U F_1,

leading to the following lemma.
LEMMA 5.1. Let U ∈ R^{m×n} and ξ satisfy (1.13). Let B, Q_1 ∈ R^{m×p} be left orthogonal matrices related as in (5.3), with n + p ≤ m, let R_1 ∈ R^{p×p} be nonsingular, and let F_1 = U^T B R_1^{-1}. Then for any unit vector w ∈ R^p,

(5.4)    ‖R_1^{-1} w‖_2² = 1 + ‖F_1 w‖_2² (1 + δ_w),    |δ_w| ≤ ξ,

where δ_w depends upon w.

Proof. Using the fact that Q_1 and B are left orthogonal and computing the normal equations matrix of both sides of (5.3) yields

I = R_1^{-T} R_1^{-1} − R_1^{-T} B^T U F_1 − F_1^T U^T B R_1^{-1} + F_1^T U^T U F_1
  = R_1^{-T} R_1^{-1} − 2 F_1^T F_1 + F_1^T U^T U F_1
  = R_1^{-T} R_1^{-1} − F_1^T F_1 − F_1^T (I − U^T U) F_1.

Thus,

R_1^{-T} R_1^{-1} = I + F_1^T F_1 + F_1^T (I − U^T U) F_1,

so that for any unit vector w ∈ R^p, ‖w‖_2 = 1, the use of norm inequalities yields

‖R_1^{-1} w‖_2² = 1 + ‖F_1 w‖_2² + w^T F_1^T (I − U^T U) F_1 w
              ≤ 1 + ‖F_1 w‖_2² + ‖I − U^T U‖_2 ‖F_1 w‖_2²
              = 1 + (1 + ξ) ‖F_1 w‖_2².

By a similar argument,

‖R_1^{-1} w‖_2² ≥ 1 + (1 − ξ) ‖F_1 w‖_2².

Thus, for some |δ_w| ≤ ξ that depends upon w, we have (5.4).

REMARK 5.1. Note that Lemma 5.1 does not depend upon R_1 being diagonal; R_1 only needs to be nonsingular.
From (5.4), using the definition of the 2-norm yields

‖F_1(:, 1:k)‖_2 ≤ (‖R_1^{-1}(1:k, 1:k)‖_2² − 1)^{1/2} / (1 − ξ)^{1/2}.

From the definition of R_1 in (2.5)–(2.6), i.e., as a diagonal matrix of singular values, we have that

(5.5)    ‖F_1(:, 1:k)‖_2 ≤ ((1 − ρ_k²)^{1/2} / ρ_k) (1 − ξ)^{−1/2},

and thus from (5.2) it follows that

‖U^T Q_1(:, 1:k)‖_2 ≤ ‖I − U^T U‖_2 ‖F_1(:, 1:k)‖_2 ≤ ξ (1 − ρ_k²)^{1/2} / ρ_k + O(ξ²).
The following lemma allows us to assume that R_2 in (2.7) is nonsingular.

LEMMA 5.2. Assume the hypothesis and terminology of Lemma 5.1. Let the left orthogonal matrix Q̂_B ∈ R^{m×p} and the upper triangular matrix R_2 ∈ R^{p×p} be given by (2.7). If R_2(1:k, 1:k) is singular, then

(5.6)    ρ_k ≤ ξ (1 + ξ).

Proof. If ρ_k = 0, the theorem holds trivially, so we assume that ρ_k > 0. For the left orthogonal matrix B in (1.4),

Q_1 R_1 = (I − U U^T) B,
Q̂_B R_2 R_1 = (I − U U^T)² B.

If R_2(1:k, 1:k) is singular, there is a vector v ≠ 0 such that

R_2(1:k, 1:k) v = 0.

Since ρ_1 ≥ ··· ≥ ρ_k > 0 and R_1(1:k, 1:k) = diag(ρ_1, ..., ρ_k), we can choose v so that

v = R_1(1:k, 1:k) w,

where w is a unit vector. Thus,

Q̂_B(:, 1:k) R_2(1:k, 1:k) R_1(1:k, 1:k) w = (I − U U^T)² B(:, 1:k) w = 0.

Since

(I − U U^T)² = I − U U^T − U (I − U^T U) U^T,

this implies

Q_1(:, 1:k) R_1(1:k, 1:k) w = (I − U U^T) B(:, 1:k) w = U (I − U^T U) U^T B(:, 1:k) w.

Thus, by the definition of ρ_k and using the fact that ‖w‖_2 = 1,

ρ_k ≤ ‖R_1(1:k, 1:k) w‖_2 ≤ ‖U (I − U^T U) U^T B(:, 1:k) w‖_2 ≤ ‖U (I − U^T U) U^T B(:, 1:k)‖_2.

Since B is left orthogonal, we have

ρ_k ≤ ‖U‖_2² ‖I − U^T U‖_2 ≤ ξ ‖U‖_2².

From the assumption (1.13),

‖U‖_2² = ‖U^T U‖_2 ≤ 1 + ‖I − U^T U‖_2 = 1 + ξ,

thus ρ_k satisfies (5.6).
Assuming R_2 is nonsingular, we have that

U^T Q̂_B = (I − U^T U) F_2,    F_2 = U^T Q_1 R_2^{-1},

so that

‖U^T Q̂_B(:, 1:k)‖_2 ≤ ξ ‖F_2(:, 1:k)‖_2,
‖U^T Q̂_B(:, 1:k)‖_F ≤ ξ_F ‖F_2(:, 1:k)‖_2.

Invoking Lemma 5.1, for any unit vector w ∈ R^p, we obtain

‖R_2^{-1} w‖_2² = 1 + ‖F_2 w‖_2² (1 + δ_w),    |δ_w| ≤ ξ,

so that, again from the definition of the matrix 2-norm,

‖U^T Q̂_B(:, 1:k)‖_2 ≤ ξ (β_k² − 1)^{1/2} + O(ξ²),
‖U^T Q̂_B(:, 1:k)‖_F ≤ ξ_F (β_k² − 1)^{1/2} + O(ξ_F²),

where β_k = ‖R_2^{-1}(1:k, 1:k)‖_2.
Proof of Theorem 2.1. We note that F_1 and F_2 are related according to

(5.7)    F_2 = U^T Q_1 R_2^{-1} = (I − U^T U) U^T B R_1^{-1} R_2^{-1} = (I − U^T U) F_1 R_2^{-1}.

We now exploit (5.7) to show an important relationship between β_j = ‖R_2^{-1}(1:j, 1:j)‖_2 and ρ_j, the jth singular value of Y_1 in (2.5). Applying norm inequalities to (5.7) and (5.5), we have

‖F_2(:, 1:j)‖_2 ≤ ‖I − U^T U‖_2 ‖F_1(:, 1:j)‖_2 ‖R_2^{-1}(1:j, 1:j)‖_2
              ≤ ξ ((1 − ρ_j²)^{1/2} / ρ_j) (1 + ‖F_2(:, 1:j)‖_2²)^{1/2} + O(ξ²),

which yields

‖F_2(:, 1:j)‖_2 / (1 + ‖F_2(:, 1:j)‖_2²)^{1/2} ≤ ξ (1 − ρ_j²)^{1/2} / ρ_j + O(ξ²).

Since β_j > β_orth implies ‖F_2(:, 1:j)‖_2 ≥ c_orth + O(ξ), and since x/(1 + x²)^{1/2} is a strictly increasing function for x > 0, ‖F_2(:, 1:j)‖_2 ≥ c_orth implies that

c_orth / (1 + c_orth²)^{1/2} ≤ ξ (1 − ρ_j²)^{1/2} / ρ_j + O(ξ²) ≤ ξ / ρ_j + O(ξ²),

which becomes

ρ_j ≤ α_orth ξ + O(ξ²),

where α_orth is given in (2.15).

We note that if k is the largest integer such that β_k ≤ (1 + c_orth²)^{1/2}, then the matrix Q_B = Q̂_B(:, 1:k) as defined in Theorem 2.2 satisfies

(5.8)    ‖U^T Q_B‖_2 ≤ c_orth ξ + O(ξ²),    ‖U^T Q_B‖_F ≤ c_orth ξ_F + O(ξ_F²).
Some algebra shows that

B = U S_B + Q̂_B R̂_B
(5.9)  = U S_B + Q_B R_B + D_k,

where D_k is given by (2.18). Thus, we have that

(5.10)    ‖B − U S_B − Q_B R_B‖_2 = ‖D_k‖_2
        = ‖Q̂_B(:, k+1:p) R̂_B(k+1:p, k+1:p)‖_2
        = ‖R̂_B(k+1:p, k+1:p)‖_2.

To prove a bound for ‖R̂_B(k+1:p, k+1:p)‖_2 and thus to bound the residual in (5.10), we need the following lemma proved in [3].

LEMMA 5.3 ([3, Lemma 3.2]). If U ∈ R^{m×n} satisfies (1.13), then

‖I_m − U U^T‖_2 ≤ 1.
Proof of Theorem 2.2. From (5.8), we have that Q_B = Q̂_B(:, 1:k) satisfies (2.11). For k = p, we have D_k = 0, and (2.19) is trivially satisfied. For k < p, from (5.9), we have (2.17)–(2.18); thus, by orthogonal equivalence,

‖D_k‖_2 = ‖R̂_B(k+1:p, k+1:p)‖_2.

Thus, we only need to establish a bound for ‖R̂_B(k+1:p, k+1:p)‖_2 to prove the theorem. Since

R̂_B(k+1:p, k+1:p) = R_2(k+1:p, k+1:p) R_1(k+1:p, k+1:p),

by a standard norm inequality and the SVD structure (2.5)–(2.6), it follows that

(5.11)    ‖R̂_B(k+1:p, k+1:p)‖_2 ≤ ‖R_2(k+1:p, k+1:p)‖_2 ‖R_1(k+1:p, k+1:p)‖_2
        = ρ_{k+1} ‖R_2(k+1:p, k+1:p)‖_2 ≤ ρ_{k+1} ‖R_2‖_2.

A bit of algebra shows that

R_2 = Q̂_B^T (I − U U^T) Q_1,

thus we can use orthogonal equivalence and Lemma 5.3 to show

‖R_2‖_2 ≤ ‖I − U U^T‖_2 ≤ 1.

Thus, from (5.11),

‖D_k‖_2 = ‖R̂_B(k+1:p, k+1:p)‖_2 ≤ ρ_{k+1} ‖R_2‖_2 ≤ ρ_{k+1},

which establishes (2.19).
5.2. Proofs of Theorems 3.1 and 3.3. We begin with the proof of Theorem 3.1. First, we need two lemmas.

LEMMA 5.4. Let Ũ and γ_orth be as in Theorem 3.1. Then

ξ̃ := ‖I_{n+k} − Ũ^T Ũ‖_2 ≤ γ_orth ξ + O(ξ²).

Proof. We have that

I_{n+k} − Ũ^T Ũ = Z^T (I_{n+k} − [Q_B  U]^T [Q_B  U]) Z
              = Z^T \begin{bmatrix} I_k − Q_B^T Q_B & Q_B^T U \\ U^T Q_B & I_n − U^T U \end{bmatrix} Z.

Thus, using standard norm inequalities, we have

‖I_{n+k} − Ũ^T Ũ‖_2 = ‖\begin{bmatrix} I_k − Q_B^T Q_B & Q_B^T U \\ U^T Q_B & I_n − U^T U \end{bmatrix}‖_2
                   ≤ ‖\begin{bmatrix} ‖I_k − Q_B^T Q_B‖_2 & ‖U^T Q_B‖_2 \\ ‖U^T Q_B‖_2 & ‖I_n − U^T U‖_2 \end{bmatrix}‖_2.

Since Q_B satisfies (2.8), since Theorem 2.2 implies that Function 2.4 produces a matrix Q_B satisfying

‖U^T Q_B‖_2 ≤ c_orth ξ + O(ξ²),    ‖I_k − Q_B^T Q_B‖_2 ≤ ξ,

and since U is assumed to satisfy (1.13), it follows that

‖I_{n+k} − Ũ^T Ũ‖_2 ≤ ‖\begin{bmatrix} 1 & c_orth \\ c_orth & 1 \end{bmatrix}‖_2 ξ + O(ξ²) = (1 + c_orth) ξ + O(ξ²) = γ_orth ξ + O(ξ²).
LEMMA 5.5. Let R_V ∈ R^{p×p} be the upper triangular matrix defined in (3.1). Using the terminology in Theorem 2.1 and Lemma 5.4 with the convention that ρ_{p+1} = 0, we have

(5.12)    ‖R_V^{-1}‖_2 ≤ (1 + ξ̃)^{1/2} (1 − ρ_{k+1})^{−1} ≤ 1 + (α_orth + γ_orth/2) ξ + O(ξ²),
(5.13)    ‖R_V‖_2 ≤ 1 + (α_orth + γ_orth/2) ξ + O(ξ²).

Proof. Theorem 2.2 implies

(5.14)    Ũ_1 R_V = \begin{bmatrix} V \\ 0 \end{bmatrix} − D_k,

where D_k is bounded as in (2.19). Through the use of a singular value inequality in [15, Problem 7.3.P16], we conclude that

σ_p(Ũ_1) σ_j(R_V) ≤ σ_j(Ũ_1 R_V) ≤ σ_1(Ũ_1) σ_j(R_V).

Thus, using standard norm inequalities,

σ_p(R_V) ‖Ũ_1‖_2 ≥ σ_p(\begin{bmatrix} V \\ 0 \end{bmatrix}) − ‖D_k‖_2.

Since V is orthogonal and

‖Ũ_1‖_2 ≤ ‖Ũ‖_2 ≤ (1 + ξ̃)^{1/2},

we have that, within a margin of O(ξ²),

σ_p(R_V) (1 + ξ̃)^{1/2} ≥ 1 − ‖D_k‖_2.

Invoking Theorem 2.2 yields

σ_p(R_V) ≥ (1 + ξ̃)^{−1/2} (1 − ρ_{k+1}).

Thus,

‖R_V^{-1}‖_2 = σ_p(R_V)^{−1} ≤ (1 + ξ̃)^{1/2} (1 − ρ_{k+1})^{−1} + O(ξ²).

From the bound for ξ̃ in Lemma 5.4 and the bound for ρ_{k+1} in Theorem 2.1, we have

‖R_V^{-1}‖_2 ≤ 1 + (α_orth + γ_orth/2) ξ + O(ξ²).

To establish the bound for ‖R_V‖_2, simply note that

R_V = Ũ_1^† (\begin{bmatrix} V \\ 0 \end{bmatrix} − D_k)

so that

‖R_V‖_2 ≤ ‖Ũ_1^†‖_2 (1 + ρ_{k+1}) ≤ (1 − ξ̃)^{−1/2} (1 + ρ_{k+1}).

The bound (5.13) follows from an argument similar to that for (5.12).
We are now ready to prove Theorem 3.1.

Proof of Theorem 3.1. We have already proved (3.10). Next we bound ‖ΔU_V‖_2. From (1.8), (3.1), (3.3), and (5.14), we have

ΔU_V = −D_k(p+1:m, :) R_V^{−1}.

Thus, from Theorem 2.1 and Lemma 5.5,

‖ΔU_V‖_2 ≤ ‖D_k(p+1:m, :)‖_2 ‖R_V^{−1}‖_2
         ≤ ρ_{k+1} (1 + (α_orth + γ_orth/2) ξ) + O(ξ²)
         = α_orth ξ + O(ξ²),

which is (3.8).

We now proceed to prove (3.9). We have that

Ũ_1^T Ũ_2 = U_V^T ΔU + (ΔU_V)^T Ū,

so that

(5.15)    ‖U_V^T ΔU‖_2 ≤ ‖Ũ_1^T Ũ_2‖_2 + ‖(ΔU_V)^T Ū‖_2.

To bound the first term in (5.15), note that

‖Ũ_1^T Ũ_2‖_2 ≤ ‖I_{n+k} − Ũ^T Ũ‖_2 = ξ̃,

and thus

(5.16)    ‖U_V^T ΔU‖_2 ≤ ξ̃ + ‖ΔU_V‖_2 ‖Ū‖_2.

Since Ū is just the lower right block of Ũ,

‖Ū‖_2 ≤ ‖Ũ‖_2 ≤ (1 + ξ̃)^{1/2},

so that (5.16) becomes

‖U_V^T ΔU‖_2 ≤ ξ̃ + ‖ΔU_V‖_2 (1 + ξ̃)^{1/2}.

Again using the singular value result in [15, Problem 7.3.P16], we have

σ_p(U_V) ‖ΔU‖_2 ≤ (ξ̃ + ‖ΔU_V‖_2)(1 + ξ̃)^{1/2}.

We note that from (3.4) it follows that

U_V = (V − D_k(1:p, :)) R_V^{−1},

hence

σ_p(U_V) ≥ σ_p(V − D_k(1:p, :)) σ_p(R_V^{−1}).

Using the orthogonality of V and the bound for ‖D_k‖_2 from Theorem 2.2, we have

σ_p(U_V) ≥ (1 − ρ_{k+1}) ‖R_V‖_2^{−1}.

Therefore, using (5.13) yields

‖ΔU‖_2 ≤ (ξ̃ + ‖ΔU_V‖_2)(1 + ξ̃)^{1/2} (1 − ρ_{k+1})^{−1} ‖R_V‖_2
       ≤ (ξ̃ + ‖ΔU_V‖_2)(1 + ξ̃)^{1/2} (1 + ρ_{k+1})(1 − ρ_{k+1})^{−1} + O(ξ²)
       ≤ (α_orth + γ_orth) ξ + O(ξ²).
We now prove Theorem 3.3.

Proof of Theorem 3.3. We note that

(5.17)    ‖I_{n+k} − Ũ^T Ũ‖_F² = ‖\begin{bmatrix} I_p − Ũ_1^T Ũ_1 & Ũ_1^T Ũ_2 \\ Ũ_2^T Ũ_1 & I_n̄ − Ũ_2^T Ũ_2 \end{bmatrix}‖_F²
        = ‖I_p − Ũ_1^T Ũ_1‖_F² + 2 ‖Ũ_1^T Ũ_2‖_F² + ‖I_n̄ − Ũ_2^T Ũ_2‖_F²

and that, by orthogonal equivalence,

(5.18)    ‖I_{n+k} − Ũ^T Ũ‖_F² = ‖Z^T (I_{n+k} − [Q_B  U]^T [Q_B  U]) Z‖_F²
        = ‖I_{n+k} − [Q_B  U]^T [Q_B  U]‖_F²
        = ‖I_k − Q_B^T Q_B‖_F² + 2 ‖U^T Q_B‖_F² + ‖I_n − U^T U‖_F².

Equating (5.17) and (5.18) and solving for ‖I_n̄ − Ũ_2^T Ũ_2‖_F² obtains (3.13).
To obtain (3.15), we note that

Ũ_2 = \begin{bmatrix} ΔU \\ Ū \end{bmatrix},

thus

I_n̄ − Ū^T Ū = (I_n̄ − Ũ_2^T Ũ_2) + (ΔU)^T (ΔU).

Thus,

‖I_n̄ − Ū^T Ū‖_F ≤ ‖I_n̄ − Ũ_2^T Ũ_2‖_F + ‖(ΔU)^T (ΔU)‖_F,

which is (3.14). From the inequalities

‖(ΔU)^T (ΔU)‖_F ≤ √p ‖(ΔU)^T (ΔU)‖_2 ≤ √p ‖ΔU‖_2²

and the bound (3.9), we obtain (3.15). To get (3.16), we note that

(5.19)    ‖I_n − U^T U‖_F, ‖I_k − Q_B^T Q_B‖_F ≤ ξ_F

and that

(5.20)    ‖U^T Q_B‖_F ≤ c_orth ξ_F + O(ξ_F²) = (1/2) ξ_F + O(ξ_F²);

thus

‖I_n̄ − Ū^T Ū‖_F² ≤ (‖I_n̄ − Ũ_2^T Ũ_2‖_F + √p (α_orth + γ_orth)² ξ²)² + O(ξ_F⁴)
                ≤ ‖I_n̄ − Ũ_2^T Ũ_2‖_F² + O(ξ_F³).

From (3.13) and (5.19)–(5.20), we have

‖I_n̄ − Ū^T Ū‖_F² ≤ (5/2) ξ_F² − 2 ‖Ũ_1^T Ũ_2‖_F² − ‖I_p − Ũ_1^T Ũ_1‖_F² + O(ξ_F³),

which is (3.16).
6. Conclusion. We have taken the 2-norm formulation of the downdating algorithms in [4, 9, 20] for deleting a single row from a QR factorization and fashioned a matrix 2-norm formulation for a block downdating algorithm designed to delete p rows from a matrix. Similar to results shown in [4], if we are asked to delete p rows from a QR decomposition with a near left orthogonal factor U satisfying (1.13), we can obtain a QR decomposition for the remaining m − p rows that has a new left orthogonal factor Ū whose loss of orthogonality can be bounded as in Theorem 3.3. Our numerical tests indicate that repeated block updates and downdates often have a correcting effect on the loss of orthogonality.
Acknowledgments. The author completed much of this work while visiting the University of California at Berkeley as the guest of James Demmel. That visit provided a stimulating
and helpful atmosphere to work on the ideas in this paper as well as some other ideas. During
that time, a conversation with Ming Gu had an important influence on this work. Thanks
also to Lothar Reichel and to two careful referees for helpful suggestions that improved this
manuscript.
REFERENCES

[1] N. Abdelmalek, Round off error analysis for Gram-Schmidt method and solution of linear least squares problems, Nordisk Tidskr. Informationsbehandling (BIT), 11 (1971), pp. 354–367.
[2] J. Barlow, H. Erbay, and I. Slapničar, An alternative algorithm for the refinement of ULV decompositions, SIAM J. Matrix Anal. Appl., 27 (2005), pp. 198–211.
[3] J. Barlow and A. Smoktunowicz, Reorthogonalized block classical Gram-Schmidt, Numer. Math., 123 (2013), pp. 395–423.
[4] J. Barlow, A. Smoktunowicz, and H. Erbay, Improved Gram-Schmidt downdating methods, BIT, 45 (2005), pp. 259–285.
[5] J. Barlow, P. Yoon, and H. Zha, An algorithm and a stability theory for downdating the ULV decomposition, BIT, 36 (1996), pp. 14–40.
[6] A. Björck, Solving linear least squares problems by Gram-Schmidt orthogonalization, Nordisk Tidskr. Informationsbehandling (BIT), 7 (1967), pp. 1–21.
[7] P. Businger and G. Golub, Linear least squares solutions by Householder transformations, Numer. Math., 7 (1965), pp. 269–278.
[8] T. Chan, An improved algorithm for computing the singular value decomposition, ACM Trans. Math. Software, 8 (1982), pp. 72–83.
[9] J. W. Daniel, W. B. Gragg, L. Kaufman, and G. W. Stewart, Reorthogonalization and stable algorithms for updating the Gram-Schmidt QR factorization, Math. Comp., 30 (1976), pp. 772–795.
[10] J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling, A set of level 3 basic linear algebra subprograms, ACM Trans. Math. Software, 16 (1990), pp. 1–17.
[11] W. Ferng, G. Golub, and R. Plemmons, Adaptive Lanczos methods for recursive condition estimation, Numer. Algorithms, 1 (1991), pp. 1–20.
[12] L. Giraud, J. Langou, M. Rozložník, and J. van den Eshof, Rounding error analysis of the classical Gram-Schmidt orthogonalization process, Numer. Math., 101 (2005), pp. 87–100.
[13] G. Golub and W. Kahan, Calculating the singular values and pseudoinverse of a matrix, J. Soc. Indust. Appl. Math. Ser. B Numer. Anal., 2 (1965), pp. 205–224.
[14] G. Golub and C. Van Loan, Matrix Computations, 4th ed., Johns Hopkins University Press, Baltimore, 2013.
[15] R. Horn and C. Johnson, Matrix Analysis, 2nd ed., Cambridge University Press, Cambridge, 2013.
[16] C. Lawson and R. Hanson, Solving Least Squares Problems, Prentice-Hall, Englewood Cliffs, 1974.
[17] Q. Liu, Modified Gram-Schmidt-based methods for block downdating the Cholesky factorization, J. Comput. Appl. Math., 235 (2011), pp. 1897–1905.
[18] M. Overton, N. Guglielmi, and G. Stewart, An efficient algorithm for computing the generalized null space decomposition, SIAM J. Matrix Anal. Appl., to appear.
[19] B. Parlett, H. Simon, and L. Stringer, On estimating the largest eigenvalue with the Lanczos algorithm, Math. Comp., 38 (1982), pp. 153–166.
[20] K. Yoo and H. Park, Accurate downdating of a modified Gram-Schmidt QR decomposition, BIT, 36 (1996), pp. 166–181.