Continuous analogues of matrix factorizations

NASC seminar, 9th May 2014

Alex Townsend
DPhil student, Mathematical Institute, University of Oxford
(joint work with Nick Trefethen)

Many thanks to Gil Strang, MIT.
Work supported by EPSRC grant EP/P505666/1.
Introduction
Discrete vs. continuous

  v = column vector        ↔  f(x)                    chebfun      [Battles & Trefethen, 04]
  A = tall skinny matrix   ↔  [f1(x) | ··· | fn(x)]   quasimatrix  [Stewart, 98]
  A = square matrix        ↔  f(x, y)                 chebfun2     [T. & Trefethen, 13]
                                                      chebop       [Driscoll, Bornemann, & Trefethen, 08]
                                                      cmatrix      [T. & Trefethen, 14]
  Av                       ↔  ∫ f(s, y) v(s) ds
  SVD, QR, LU, Chol        ↔  ?

Interested in continuous analogues rather than infinite analogues.
Aside: infinite analogues are Schmidt, Wiener–Hopf, infinite-dimensional QR, etc.
Introduction
Matrices, quasimatrices, cmatrices

  matrix:       m × n
  quasimatrix:  [a, b] × n
  cmatrix:      [a, b] × [c, d]

A cmatrix is a continuous function of (y, x) ∈ [a, b] × [c, d].
Introduction
Matrices vs. cmatrices

  An m × n matrix:                            An [a, b] × [c, d] cmatrix:          Question
  entries indexed by {1, …, m} × {1, …, n}    entries indexed by [a, b] × [c, d]
  Well-ordered                                Not well-ordered by <                What is the 1st column?
  Successor                                   No successor                         What is the next column?
  A null set                                  Null subsets                         What sparsity makes sense?
  Finite                                      Infinite                             Convergence?

Three heroes: smoothness, pivoting, and εmach.
Singular value decomposition
Matrix factorization

A = UΣVᵀ,   Σ = diagonal,   U, V = orthonormal columns.

- Exists: SVD exists and is (almost) unique
- Application: A best rank-r approx. is A_r = 1st r terms (in 2- & F-norms)
- Separable model: A = ∑_{j=1}^n σj uj vjᵀ is a sum of outer products
- Computation: Bidiagonalize then iterate [Golub & Kahan 1965]
Singular value decomposition
Continuous analogue

A = UΣVᵀ (at least formally),   Σ = diagonal,   U, V = orthonormal columns.

- Exists: SVD exists if A is continuous and is (almost) unique [Schmidt 1907]
- Application: A best rank-r approx. is f_r = 1st r terms (L²-norm) [Weyl 1912]
- Separable model: A = ∑_{j=1}^∞ σj uj vjᵀ is a sum of “outer products”
- Computation: Avoid bidiagonalization
Singular value decomposition
Absolute and uniform convergence of the SVD

Theorem
Let A be an [a, b] × [c, d] cmatrix that is (uniformly) Lipschitz continuous in both
variables. Then the SVD of A exists, the singular values are unique with σj → 0
as j → ∞, and

    A = ∑_{j=1}^∞ σj uj vjᵀ,

where the series is uniformly and absolutely convergent to A.

Proof. See [Schmidt 1907], [Hammerstein 1923], and [Smithies 1937].

If A satisfies the assumptions of the theorem, then A = UΣVᵀ.
Singular value decomposition
Algorithm

1. Compute a quasimatrix QR factorization, A = QA RA.
2. Compute a quasimatrix QR factorization, RAᵀ = QR RR.
   (Householder triangularization of a quasimatrix [Trefethen 08])
3. Compute the SVD of the small factor, RR = UΣVᵀ.

Then A = (QA V) Σ (QR U)ᵀ.

This is a continuous analogue of a discrete algorithm [Ipsen 90].
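The three steps above can be sketched in numpy by discretizing the quasimatrix as a tall skinny matrix whose columns are sampled functions. This is a minimal sketch of the algebra (not Chebfun's implementation); the grid and monomial columns are illustrative choices.

```python
import numpy as np

# Discretized analogue: columns of A are functions sampled on a grid,
# standing in for a quasimatrix.
m, n = 1000, 6
x = np.linspace(-1, 1, m)
A = np.column_stack([x**j for j in range(n)])   # monomial "columns"

# Step 1: quasimatrix QR, A = QA @ RA.
QA, RA = np.linalg.qr(A)

# Step 2: QR of the small transposed factor, RA.T = QR_ @ RR.
QR_, RR = np.linalg.qr(RA.T)

# Step 3: ordinary SVD of the small n x n factor, RR = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(RR)

# Assemble A = (QA V) Sigma (QR U)^T.
left = QA @ Vt.T       # orthonormal columns (plays the role of the quasimatrix U)
right = QR_ @ U        # orthonormal columns (plays the role of the quasimatrix V)
assert np.allclose(left @ np.diag(s) @ right.T, A)
```

The payoff is that the only SVD computed is of a small n × n matrix; everything tall is handled by QR.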
Singular value decomposition
Related work

Erhard Schmidt; James Mercer
Autonne, Bateman, Hammerstein, Kellogg, Picard, Smithies, Weyl
Aizerman, Braverman, König, Rozonoer
Carl Eckart & Gail Young
Golub, Hestenes, Kahan, Kogbetliantz, Reinsch
LU decomposition
Matrix factorization

A = P⁻¹LU,   P = permutation,   L = unit lower-triangular,   U = upper-triangular.

P⁻¹L = “psychologically” lower-triangular

- Exists: It (almost) exists and with extra conditions is (almost) unique
- Application: Used to solve dense linear systems Ax = b
- Separable model: A = ∑_{j=1}^n ℓj ujᵀ is a sum of outer products [Pan 2000]
- Computation: Gaussian elimination with pivoting
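For concreteness, a minimal numpy sketch of GE with partial pivoting producing A = P⁻¹LU and then solving Ax = b with two triangular solves. The helper `plu` and the random test data are illustrative, not part of the slides.

```python
import numpy as np

# GE with partial pivoting: returns a row permutation, unit lower-triangular L,
# and upper-triangular U with A[perm] = L @ U.
def plu(A):
    A = A.astype(float).copy()
    n = A.shape[0]
    perm = np.arange(n)
    L = np.eye(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(A[k:, k]))    # partial pivot: largest entry in column k
        A[[k, p], k:] = A[[p, k], k:]          # swap rows of the working matrix...
        L[[k, p], :k] = L[[p, k], :k]          # ...and of the multipliers found so far
        perm[[k, p]] = perm[[p, k]]
        m_col = A[k+1:, k] / A[k, k]           # multipliers
        L[k+1:, k] = m_col
        A[k+1:, k:] -= np.outer(m_col, A[k, k:])   # rank-1 elimination step
    return perm, L, np.triu(A)

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
perm, L, U = plu(A)
assert np.allclose(A[perm], L @ U)             # P A = L U

# Solve Ax = b via the triangular factors.
b = rng.standard_normal(6)
y = np.linalg.solve(L, b[perm])                # forward substitution
x = np.linalg.solve(U, y)                      # back substitution
assert np.allclose(A @ x, b)
```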
LU decomposition
Continuous analogue

A = LU,   L = unit lower-triangular,   U = upper-triangular.

- Exists: It (usually) exists and with extra conditions is (almost) unique
- Application: Can be used to “solve” integral equations
- Separable model: A = ∑_{j=1}^∞ ℓj ujᵀ is a sum of outer products
- Computation: Continuous analogue of GECP (GE with complete pivoting)
LU decomposition
Computation

The standard point of view: A = P⁻¹LU.

A different point of view:

    A ←− A − A(:, k) A(j, :) / A(j, k)          (GE step for matrices)
    A ←− A − A(:, x₀) A(y₀, :) / A(y₀, x₀)      (GE step for functions)

Each step of GE is a rank-1 update. We use complete pivoting.
Pivoting orders the columns and rows.
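The rank-1-update point of view can be sketched in numpy on a sampled function: each GE step with complete pivoting subtracts the outer product of the pivot column and pivot row. A sketch, not Chebfun2's code; the test function and grid are illustrative.

```python
import numpy as np

# GE with complete pivoting on a sampled function f(y, x): each step subtracts
# the rank-1 term A(:, x0) A(y0, :) / A(y0, x0).
def ge_lowrank(F, kmax):
    R = F.astype(float).copy()                   # residual
    Fk = np.zeros_like(R)
    for _ in range(kmax):
        j, i = np.unravel_index(np.argmax(np.abs(R)), R.shape)  # complete pivot (y0, x0)
        if abs(R[j, i]) < 1e-14 * np.abs(F).max():
            break                                # residual is numerically zero
        term = np.outer(R[:, i], R[j, :]) / R[j, i]
        Fk += term
        R -= term                                # rank-1 update
    return Fk

# f(y, x) = cos(x + y^2) + x*y has rank exactly 3
# (two terms from the cosine addition formula, one from xy).
y = np.linspace(-1, 1, 200)
x = np.linspace(-1, 1, 200)
F = np.cos(np.add.outer(y**2, x)) + np.outer(y, x)

approx = ge_lowrank(F, 10)
assert np.max(np.abs(F - approx)) < 1e-10
```

Because the function has exact rank 3, GE terminates after three rank-1 updates.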
LU decomposition
What is a triangular quasimatrix?

L = unit lower-triangular,   U = upper-triangular.

What is a lower-triangular quasimatrix?

[Figure: the columns ℓ1, ℓ2, … of L, with zeros marked at the points y1, y2, y3, y4, y5.]

- Red dots = 0’s, blue squares = 1’s
- Position of 0’s is determined by pivoting strategy
- Forward substitution has a continuous analogue
- More precisely, L is lower-triangular wrt y1, y2, …
LU decomposition
Absolute and uniform convergence of LU

Theorem
Let A be an [a, b] × [c, d] continuous cmatrix. Suppose A(·, x) is analytic in the
“stadium” of radius 2ρ(b − a) about [a, b] for some ρ > 1, where it is bounded in
absolute value by M (uniformly in x). Then

    A = ∑_{j=1}^∞ ℓj ujᵀ,

where the series is uniformly and absolutely convergent to A. Moreover,

    ‖A − ∑_{j=1}^k ℓj ujᵀ‖∞ ≤ M ρ⁻ᵏ.
LU decomposition
A Chebfun2 application
Low rank function approximation

A = chebfun2(@(x,y) cos(10*(x.^2+y))+sin(10*(x+y.^2)));
contour(A, '.')

[Contour plots of the GE approximants of rank 2, 5, 28, 33, 65, and 125; • = pivot location.]

    A(y, x) ≈ ∑_{j=1}^k ℓj(y) uj(x),

    ∫_c^d ∫_a^b A(y, x) dy dx ≈ ∑_{j=1}^k (∫_a^b ℓj(y) dy) (∫_c^d uj(x) dx).
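Once GE writes A(y, x) as a sum of products ℓj(y)uj(x), the double integral collapses into products of 1-D integrals, as in the formula above. A numpy sketch with trapezoidal quadrature (not Chebfun2's Chebyshev machinery); the rank-3 test function is illustrative.

```python
import numpy as np

n = 401
y = np.linspace(-1, 1, n)
x = np.linspace(-1, 1, n)
F = np.cos(np.add.outer(y**2, x)) + np.outer(y, x)     # a rank-3 function f(y, x)

# Collect rank-1 terms l_j u_j^T with GE (complete pivoting).
terms, R = [], F.copy()
for _ in range(3):
    j, i = np.unravel_index(np.argmax(np.abs(R)), R.shape)
    l, u = R[:, i].copy(), R[j, :] / R[j, i]
    terms.append((l, u))
    R = R - np.outer(l, u)

# Trapezoidal weights on each axis.
dx = x[1] - x[0]
w = np.full(n, dx); w[0] = w[-1] = dx / 2

I_sep = sum((w @ l) * (w @ u) for l, u in terms)       # sum of (int l_j)(int u_j)
I_2d = w @ F @ w                                       # full 2-D quadrature
assert abs(I_sep - I_2d) < 1e-10
```

The separable form turns one 2-D quadrature into 2k 1-D quadratures.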
LU decomposition
A Chebfun2 application
SVD is optimal, but GE can be faster

2D Runge function:
    A(y, x) = 1 / (1 + γ(x² + y²)).

Wendland’s CSRBFs (compactly supported radial basis functions):
    A_s(y, x) = φ_{3,s}(‖x − y‖₂) ∈ C^{2s}.

[Plots: relative L² error vs. rank of approximant, for the SVD and GE.
Left: Runge function with γ = 1, 10, 100. Right: φ_{3,0} ∈ C⁰, φ_{3,1} ∈ C², φ_{3,3} ∈ C⁶.]
LU decomposition
Related work

Eugene Tyrtyshnikov; Goreinov, Oseledets, Savostyanov, Zamarashkin
Mario Bebendorf; Gesenhues, Griebel, Hackbusch, Rjasanow
Keith Geddes; Carvajal, Chapman
Petros Drineas; Candes, Greengard, Mahoney, Martinsson, Rokhlin

Moral of the story: iterative GE is everywhere, under different guises.
Many others: Halko, Liberty, Martinsson, O’Neil, Tropp, Tygert, Woolfe, etc.
Cholesky factorization
Matrix factorization

A = RᵀR,   R = upper-triangular.

- Exists: Exists and is unique if A is a positive definite matrix
- Application: A numerical test for a positive definite matrix
- Separable model: A = ∑_{j=1}^n rj rjᵀ is a sum of outer products
- Computation: Cholesky algorithm, i.e., GECP on a positive definite matrix
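The "numerical test" application is exactly how Cholesky is used in practice: the factorization succeeds if and only if the symmetric matrix is (numerically) positive definite. A small numpy illustration:

```python
import numpy as np

B = np.array([[4.0, 2.0], [2.0, 3.0]])       # positive definite
R = np.linalg.cholesky(B).T                  # numpy returns lower L; R = L^T
assert np.allclose(R.T @ R, B)               # B = R^T R with R upper-triangular

C = np.array([[1.0, 2.0], [2.0, 1.0]])       # symmetric but indefinite (eigenvalues 3, -1)
try:
    np.linalg.cholesky(C)
    is_posdef = True
except np.linalg.LinAlgError:
    is_posdef = False
assert not is_posdef
```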
Cholesky factorization
Continuous analogue

A = RᵀR (at least formally),   R = upper-triangular quasimatrix.

- Pivoting: Essential. Continuous analogue of pivoted Cholesky
- Exists: Exists and is essentially unique for nonnegative definite functions

Definition
An [a, b] × [a, b] continuous symmetric cmatrix A is nonnegative definite if

    vᵀAv = ∫_a^b ∫_a^b v(y) A(y, x) v(x) dx dy ≥ 0,   ∀v ∈ C[a, b].
Cholesky factorization
Convergence

Theorem
Let A be an [a, b] × [a, b] continuous, symmetric, and nonnegative definite
cmatrix. Suppose that A(·, x) is analytic in the closed Bernstein ellipse E_{2ρ(b−a)}
with foci a and b, with ρ > 1, and bounded in absolute value by M, uniformly in y.
Then

    A = ∑_{j=1}^∞ rj rjᵀ,

where the series is uniformly and absolutely convergent to A. Moreover,

    ‖A − ∑_{j=1}^k rj rjᵀ‖∞ ≤ 32Mk ρ⁻ᵏ / (4ρ − 1).
Cholesky factorization
Computation

Pivoted Cholesky = GECP on a nonnegative definite function¹

[Figures: pivot locations falling on the y = x diagonal, and pivot size decaying
geometrically with the step number.]

Each step is a rank-1 update:

    A ←− A − A(:, x₀) A(x₀, :) / A(x₀, x₀)

¹Always take the absolute maximum on the diagonal, even if there is a tie with an off-diagonal entry.
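The rank-1 update above can be sketched in numpy on a sampled nonnegative definite kernel: pivot on the largest diagonal entry and subtract one rank-1 term per step. A sketch under illustrative choices (Gaussian kernel, 100-point grid), not Chebfun2's code.

```python
import numpy as np

# Pivoted Cholesky: diagonal pivoting plus the rank-1 update
# A <- A - A(:, x0) A(x0, :) / A(x0, x0).
def pivoted_cholesky(A, kmax):
    R = A.astype(float).copy()
    cols = []
    for _ in range(kmax):
        i = int(np.argmax(np.diag(R)))       # pivot (x0, x0): max on the diagonal
        if R[i, i] <= 0:                     # residual numerically zero
            break
        r = R[:, i] / np.sqrt(R[i, i])       # next column of R^T
        cols.append(r)
        R -= np.outer(r, r)                  # rank-1 update
    return np.column_stack(cols)

x = np.linspace(-1, 1, 100)
A = np.exp(-(x[:, None] - x[None, :])**2)    # Gaussian kernel: smooth, nonneg. definite

C = pivoted_cholesky(A, 20)                  # A ~ C C^T after a few steps
assert np.max(np.abs(A - C @ C.T)) < 1e-8
```

Because the kernel is smooth, the pivot sizes decay geometrically and a handful of steps suffice.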
Cholesky factorization
A Chebfun2 application
A test for symmetric nonnegative definite functions

A = chebfun2(@(x,y) cos(10*x.*y) + y + x.^2 + sin(10*x.*y));
B = A.' * A;
chol(B)

[Figures: pivot locations for B (left) and for an inverse multiquadric (right).]

All the pivots are nonnegative and on the y = x line ⇒ nonnegative definite.
Demo
References

Z. Battles & L. N. Trefethen, An extension of MATLAB to continuous functions and operators, SISC, 25 (2004), pp. 1743–1770.
T. A. Driscoll, F. Bornemann, & L. N. Trefethen, The chebop system for automatic solution of differential equations, BIT, 48 (2008), pp. 701–723.
C. Eckart & G. Young, The approximation of one matrix by another of lower rank, Psychometrika, 1 (1936), pp. 211–218.
N. J. Higham, Accuracy and Stability of Numerical Algorithms, 2nd edition, SIAM, 2002.
E. Schmidt, Zur Theorie der linearen und nichtlinearen Integralgleichungen. I Teil. Entwicklung willkürlicher Funktionen nach Systemen vorgeschriebener, Math. Ann., 63 (1907), pp. 433–476.
G. W. Stewart, Afternotes Goes to Graduate School, SIAM, Philadelphia, 1998.
A. Townsend & L. N. Trefethen, Gaussian elimination as an iterative algorithm, SIAM News, March 2013.
A. Townsend & L. N. Trefethen, An extension of Chebfun to two dimensions, to appear in SISC, 2013.