Random Matrix Theory and Numerical Linear Algebra: ILAS Meeting

Random Matrix Theory and Numerical Linear Algebra:
A Story of Communication

Alan Edelman
Mathematics, Computer Science & AI Labs
ILAS Meeting, June 3, 2013
An Intriguing Thesis
The results/methodologies from
NUMERICAL LINEAR ALGEBRA
would be valuable even if computers had not been built.
… but I'm glad we have computers
… and I'm especially grateful for the Julia computing system
Page 5
Eigenvalues of GOE (β=1 means reals)
• Naïve way in four languages:
  MATLAB:      A=randn(n); S=(A+A')/sqrt(2*n); eig(S)
  Julia:       A=randn(n,n); S=(A+A')/sqrt(2*n); eigvals(S)
  R:           A=matrix(rnorm(n*n),ncol=n); S=(A+t(A))/sqrt(2*n);
               eigen(S,symmetric=T,only.values=T)$values
  Mathematica: A=RandomArray[NormalDistribution[],{n,n}];
               S=(A+Transpose[A])/Sqrt[2*n]; Eigenvalues[S]
Page 6
Eigenvalues of GOE (β=1 means reals)
• Naïve way in four languages (same code as above)
If, for you, simulation is just intuition, a check, a figure, or a student
project, then software like this is written quickly and victory is declared.
Page 7
Eigenvalues of GOE (β=1 means reals)
• Naïve way in four languages (same code as above)
For me, simulation is a chance to research efficiency, Numerical Linear
Algebra style, and this research cycles back to the mathematics!
Page 8
Tridiagonal Model More Efficient
(β=1: same eigenvalue distribution!)
The dense matrix is replaced by a symmetric tridiagonal one; in the
β-Hermite form of Dumitriu & E., the diagonal entries are g_i ~ N(0,2)
and the off-diagonal entries are χ_{β(n−1)}, χ_{β(n−2)}, …, χ_β.
Julia: eig(SymTridiagonal(d,e))
Storage: O(n) (vs O(n²))      Time: O(n²) (vs O(n³))
(Silverstein, Trotter; general β: Dumitriu & E., etc.)
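A minimal Julia sketch of sampling from this tridiagonal model (the χ
degrees-of-freedom pattern follows Dumitriu & E.'s β-Hermite construction;
the final 1/sqrt(n) rescaling is chosen so that, for β=1, the eigenvalue
distribution matches the naïve (A+A')/sqrt(2*n) code above):

    using LinearAlgebra, Distributions

    # β-Hermite tridiagonal model: O(n) storage, O(n²) time for eigenvalues
    function goe_eigs_tridiag(n, β=1)
        d = sqrt(2) .* randn(n)                 # diagonal: g_i ~ N(0,2)
        e = [rand(Chi(β*k)) for k in n-1:-1:1]  # off-diagonal: χ_{β(n-1)}, …, χ_β
        eigvals(SymTridiagonal(d, e)) ./ sqrt(n)
    end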
Page 9
Histogram without Histogramming: Sturm Sequences
• Count #eigs < s: count sign changes in the sequence
  det( (A − sI)[1:k, 1:k] ),  k = 1, …, n
• Count #eigs in [x, x+h]: take the difference of the sign-change counts
  at x+h and at x
• Speed comparison vs. the naïve way
(Mentioned in Dumitriu and E., 2006)
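For a symmetric tridiagonal matrix the sign-change count becomes a one-pass
recurrence on ratios of leading principal minors; a minimal sketch (with no
safeguard against a zero pivot, which careful code would add):

    # number of eigenvalues of SymTridiagonal(d, e) strictly less than s
    function sturm_count(d, e, s)
        cnt = 0
        q = d[1] - s                       # q_k = det_k / det_{k-1} of A - sI
        q < 0 && (cnt += 1)
        for k in 2:length(d)
            q = (d[k] - s) - e[k-1]^2 / q
            q < 0 && (cnt += 1)
        end
        return cnt
    end

    # histogram counts over bin edges, with no eigenvalue ever computed
    hist_counts(d, e, edges) = diff([sturm_count(d, e, s) for s in edges])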
Page 10
A good computational trick is a good theoretical trick!
Page 11
Efficient Tracy–Widom Simulation
• Naïve way: A=randn(n); S=(A+A')/sqrt(2*n); max(eig(S))
• Better way: only create the ~10n^(1/3) initial segment of the diagonal
  and off-diagonal, since the "Airy" behavior tells us the max eigenvalue
  hardly depends on the rest
• This led to the notion of the stochastic operator limit
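A sketch of the truncated simulation, reusing the tridiagonal model above;
the 10n^(1/3) cutoff is the slide's, and the centering and scaling to
Tracy–Widom coordinates is the standard n^(2/3)(λ_max − 2) for this
normalization:

    using LinearAlgebra, Distributions

    # one Tracy–Widom sample from only the top k ≈ 10n^(1/3) corner
    function tw_trial(n, β=1)
        k = min(n, ceil(Int, 10n^(1/3)))
        d = sqrt(2) .* randn(k)                     # top k diagonal entries
        e = [rand(Chi(β*j)) for j in n-1:-1:n-k+1]  # top k-1 off-diagonal χ's
        λmax = maximum(eigvals(SymTridiagonal(d, e))) / sqrt(n)
        n^(2/3) * (λmax - 2)                        # ≈ Tracy–Widom fluctuation
    end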
Page 12
Google Translate
Page 13
Google Translate (in my dreams)
Lost in translation: eigs = svd's squared
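The identity being "translated" is the standard one: the eigenvalues of AᵀA
are the squares of the singular values of A. A one-line Julia check:

    using LinearAlgebra

    A = randn(5, 5)
    eigvals(Symmetric(A'A)) ≈ sort(svdvals(A)).^2   # true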
Page 14
The Jacobi Random Matrix Ensemble (Constantine 1963)
• Suppose A and B are randn(m1,n) and randn(m2,n) (iid standard normals)
• The eigenvalues of (AᵀA)(AᵀA + BᵀB)⁻¹,
  or in symmetric form (AᵀA + BᵀB)^(−1/2) (AᵀA) (AᵀA + BᵀB)^(−1/2),
• have joint density (β=1)
  ∝ ∏_{i<j} |λ_i − λ_j| · ∏_i λ_i^{(m1−n−1)/2} (1 − λ_i)^{(m2−n−1)/2},
  with λ_i ∈ [0,1]
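A minimal Julia sketch drawing from this ensemble directly from the
definition (dense and O(n³); the symmetric generalized eigenproblem
(AᵀA)x = λ(AᵀA + BᵀB)x returns the same values):

    using LinearAlgebra

    # β=1 Jacobi ensemble: eigenvalues lie in [0,1]
    function jacobi_eigs(m1, m2, n)
        A, B = randn(m1, n), randn(m2, n)
        eigvals(Symmetric(A'A), Symmetric(A'A + B'B))
    end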
Page 15
The Jacobi Ensemble, Geometrically
• Take a random n-hyperplane in R^m (m > n), uniformly
• Take a reference hyperplane (any dimension)
• The orthogonal projection of the unit ball in the random hyperplane onto
  the reference hyperplane is an ellipsoid
• The semi-axis lengths are Jacobi cosines
[Figure: the unit ball in the random hyperplane projects to an ellipsoid in
the reference hyperplane]
Page 16
The GSVD
A: m1 × n,  B: m2 × n,  [A; B]: m × n  (m = m1 + m2)
• Usual presentation (one common convention): A = U C H, B = V S H, with
  UᵀU = VᵀV = I and C, S nonnegative diagonal satisfying C² + S² = I
• Underlying geometry, in m dimensions: take the n-dimensional subspace
  spanned by the columns of [A; B]; let X = span(first m1 columns of I_m)
  and Y = span(last m2 columns of I_m); orthogonal projection π onto X
  picks out the cosine directions c_i u_i, and projection onto Y the sine
  directions s_i v_i
[Figure: flattened view (subspaces drawn as lines) and expanded view
(subspaces drawn as planes) of the projections onto X (m1-dim) and
Y (m2-dim)]
Page 17
GSVD Mathematics (Cosine, Sine)
• Thus any such hyperplane is the span of [U C; V S] (times a basis W)
• with UᵀU = VᵀV = I and C² + S² = I (diagonal)
• One can verify directly, by taking Jacobians and projecting out the WᵀdW
  directions we do not care about,
• and further understand the Jacobi normalization constant c_J
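The cosines and sines are computable without anything exotic: take a thin
orthonormal basis Q for span([A; B]) and read them off the two blocks (a
sketch; the reverse on one list is only to pair each c_i with its matching
s_i):

    using LinearAlgebra

    m1, m2, n = 8, 9, 3
    A, B = randn(m1, n), randn(m2, n)
    Q = Matrix(qr([A; B]).Q)               # (m1+m2) × n orthonormal basis
    c = svdvals(Q[1:m1, :])                # Jacobi cosines
    s = reverse(svdvals(Q[m1+1:end, :]))   # matching sines
    @assert c.^2 .+ s.^2 ≈ ones(n)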
Page 18
Circular Ensembles
• A Haar-random complex unitary matrix has eigenvalues e^{iθ_1}, …, e^{iθ_n}
  on the unit circle with density ∝ ∏_{j<k} |e^{iθ_j} − e^{iθ_k}|²
• It is desirable to construct random matrices that we can compute with for
  general β, with density ∝ ∏_{j<k} |e^{iθ_j} − e^{iθ_k}|^β
  (Killip and Nenciu 2004: product-of-tridiagonals solution)
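For β=2 this is just the spectrum of a Haar unitary, which one can sample
the straightforward dense way to check any general-β construction against
(a sketch using the usual QR-with-phase-fix recipe):

    using LinearAlgebra

    # eigenvalue angles of a Haar-distributed (CUE) unitary matrix
    function cue_angles(n)
        Z = (randn(n, n) + im * randn(n, n)) / sqrt(2)
        F = qr(Z)
        Q = Matrix(F.Q) * Diagonal(sign.(diag(F.R)))  # fix phases → exact Haar
        angle.(eigvals(Q))
    end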
Page 19
Numerical Linear Algebra
• Ammar, Gragg, Reichel (1991): unitary Hessenberg matrices as a product of
  2×2 Givens-type blocks built from Schur parameters α_j,

    G_j = [ −α_j  ρ_j ]
          [  ρ_j  ᾱ_j ],   ρ_j = sqrt(1 − |α_j|²)

  (signs vary by author)
• Recalled in Forrester and Rains (2005)
• Converts Killip and Nenciu to the Ammar–Gragg–Reichel format
• Classical orthogonal polynomials on the unit circle!
• The story relates to CMV matrices
• Verblunsky coefficients = Schur parameters
Page 20
Implementation
Remark: the authors would rightly feel that all the mathematical information
is in their papers. Still, this presenter had considerable trouble gathering
enough of the pieces to produce the ultimate arbiter of understanding the
result: a working code!
Plea: Random Matrix Theory is so valuable to so many that pseudocode, or
available code, is a great technology-transfer mechanism.
Needed facts:
1) The Ammar–Gragg–Reichel format
2) Generating the variables requires some thinking:
   |α_j| ~ sqrt(1 − rand()^(2/(β·j)))
   α_j = |α_j| * exp(2π·i·rand())
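A minimal Julia sketch of step 2, exactly as written (which j attaches to
which position in the Ammar–Gragg–Reichel product is a convention to check
against Killip–Nenciu):

    # Verblunsky coefficients / Schur parameters α_j for general β
    schur_parameters(n, β) =
        [sqrt(1 - rand()^(2/(β*j))) * cis(2π * rand()) for j in 1:n-1]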
Page 21
Julia Code (similar to MATLAB, etc.): really useful!
Page 22
The method of Ghosts and Shadows
for Beta Ensembles
Page 23
So far, I have tried to hint about β.
Introduction to Ghosts
• G₁ is a standard normal N(0,1)
• G₂ is a complex normal (G₁ + iG₁, with independent copies of G₁)
• G₄ is a quaternion normal (G₁ + iG₁ + jG₁ + kG₁, independent copies)
• G_β (β > 0) seems to often work just fine: a "ghost Gaussian"
Page 24
Chi-squared
• Defn: χ²_β is the sum of β iid squares of standard normals if β = 1, 2, …
• It generalizes to non-integer β just as the gamma function interpolates
  the factorial
• χ_β is the square root of the sum of squares (which generalizes; see the
  Wikipedia chi distribution)
• |G₁| is χ₁, |G₂| is χ₂, |G₄| is χ₄
• So why not: |G_β| is χ_β?
• I call χ_β the shadow of G_β
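Since χ_β makes sense for every real β > 0, the shadow is directly
sampleable even though G_β itself is a ghost; a two-line Julia sketch:

    using Distributions

    shadow(β)    = rand(Chi(β))        # |G_β| ~ χ_β
    shadow_sq(β) = rand(Gamma(β/2, 2)) # |G_β|² ~ χ²_β (shape β/2, scale 2)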
Page 26
Working with Ghosts
[Table: ghost expressions and the corresponding real "shadow" quantities]
Page 27
Singular Values of a Ghost Gaussian Times a Real Diagonal
(Dubbs, E., Koev, Venkataramana 2013)
(see related work by Forrester 2011)
Page 28
The Algorithm
Page 29
Removing U and V
Page 30
Algorithm cont.
Page 31
Completion of Recursion
Page 32
Monte Carlo Trials
Parallel Histograms
No tears!
Page 33
Julia: Parallel Histogram
3rd eigenvalue, pylab plot, 8 seconds!
75 processors
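The slide's plot itself is not reproducible here, but the pattern is; a
minimal sketch with the Distributed standard library (matrix size, bin
edges, and worker count are placeholders, not the slide's):

    using Distributed
    addprocs(8)
    @everywhere using LinearAlgebra

    # per-trial histogram contribution for the 3rd-smallest eigenvalue
    @everywhere function trial_bin(n, edges)
        A = randn(n, n)
        λ3 = eigvals(Symmetric((A + A') / sqrt(2n)))[3]
        counts = zeros(Int, length(edges) - 1)
        i = searchsortedlast(edges, λ3)
        1 <= i <= length(counts) && (counts[i] += 1)
        return counts
    end

    edges = range(-2.2, -1.6; length = 61)
    counts = @distributed (+) for t in 1:100_000   # summed across workers
        trial_bin(200, edges)
    end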
Page 34
Linear algebra libraries are too limited; Julia lets me put together what I
need, e.g.:
• Tridiagonal eigensolver
• Fast rank-one update
• Arrow-matrix eigensolver
I can surgically use LAPACK, without tears.
Page 35
Weekend Julia Run
• Histogram of the smallest singular value of randn(200)
  – spiked by doubling the first column
  – Julia: A=randn(200,200); A[:,1] *= 2
  – Reduced mathematically to bidiagonal form
  – Used Julia's bidiagonal SVD
  – Ran 2.25 billion trials in 20 hours on 75 processors
  – A naïve serial algorithm would take 16 years
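For reference, one naïve trial in straight-line Julia (svdvals is the dense
route; the run above got its speed from the bidiagonal reduction):

    using LinearAlgebra

    # smallest singular value of a spiked randn(200, 200)
    function spiked_trial(n = 200)
        A = randn(n, n)
        A[:, 1] .*= 2            # the spike: double the first column
        minimum(svdvals(A))
    end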
Page 36
Weekend Julia Run (annotated)
• Histogram of the smallest singular value of randn(200), spiked by doubling
  the first column (as on the previous slide)
• I knew the limiting histogram from my thesis, universality, etc.
• With this kind of power, I could obtain the first-order correction for
  finite n! Semilogy plot of the absolute correction and a prediction:
[Figure: semilogy plot of the |correction| and a prediction]
• This is like having an electron microscope to see small details that would
  be invisible with conventional tools!
Page 37
Conclusion
• If you already held Numerical Linear Algebra in high esteem, then thanks
  to random matrix theory there are now even more reasons!
• Try Julia. (Google: julia)