Advances in Random Matrix Theory (stochastic eigenanalysis)
Alan Edelman
MIT: Dept of Mathematics, Computer Science & AI Laboratories
5/28/2016

Stochastic Eigenanalysis
Counterpart to stochastic differential equations
Emphasis on applications to engineering & finance
Beautiful mathematics:
  Random Matrix Theory
  Free Probability
Raw Material from
  Physics
  Combinatorics
  Numerical Linear Algebra
  Multivariate Statistics

Scalars, Vectors, Matrices
Mathematics: Notation = power & less ink!
Computation: Use those caches!
Statistics: Classical, Multivariate, Modern Random Matrix Theory
The Stochastic Eigenproblem
* Mathematics of probabilistic linear algebra
* Emerging Computational Algorithms
* Emerging Statistical Techniques
Ideas from numerical computation that stand the test of time are right for mathematics!

Open Questions
Find new applications of spacing (or other) statistics
Cleanest derivation of Tracy-Widom?
“Finite” free probability?
Finite meets infinite
Muirhead meets Tracy-Widom
Software for stochastic eigen-analysis

Wigner’s Semi-Circle
The classical & most famous random eigenvalue theorem
Let S = random symmetric Gaussian
MATLAB: A=randn(n); S=(A+A')/2;   (randn(n): n x n iid standard normals)
S is known as the Hermite Ensemble
The normalized eigenvalue histogram is a semi-circle
Precise statements require $n\to\infty$ etc.

Wigner’s original proof
Compute $E(\operatorname{tr} A^{2p})$ as $n\to\infty$
Terms with too many indices have some element with power 1. These vanish because the mean is 0.
Terms with too few indices: not enough of them to be relevant as $n\to\infty$
This leaves only a Catalan number, $C_p = \binom{2p}{p}/(p+1)$, for the moments when all is said and done
The semi-circle is the only distribution with Catalan number moments

Finite Versions of semicircle
[Figure: four panels, n=2, n=3, n=4, n=5]
Area under curve $(-\infty, x)$: can be expressed as sums of probabilities that certain tridiagonal determinants are positive.

Wigner’s Semi-Circle
Real Numbers: x (β=1)
Complex Numbers: x+iy (β=2)
Quaternions: x+iy+jz+kw (β=4)
β=2½? x+iy+jz
Defined through joint eigenvalue density: $\text{const} \times \prod_{i<j} |x_i - x_j|^\beta \prod_i \exp(-x_i^2/2)$
β = repulsion strength
β=0: “no interference”, spacings are Poisson
Classical research covers only β=1,2,4, missing the link to Poisson, continuous techniques, etc.

Largest eigenvalue
“convection-diffusion?”

Haar or not Haar?
“Uniform Distribution on orthogonal matrices”
Gram-Schmidt or [Q,R]=qr(randn(n))
Eigenvalues wrong

Longest Increasing Subsequence (n=4)
(Baik-Deift-Johansson) (Okounkov’s proof)
Color key (length of the longest increasing subsequence): Green: 4, Yellow: 3, Red: 2, Purple: 1
1234  2134  3124  4123
1243  2143  3142  4132
1324  2314  3214  4213
1342  2341  3241  4231
1423  2413  3412  4312
1432  2431  3421  4321

Bulk spacing statistics
“convection-diffusion?”
Bus wait times in Mexico
Energy levels of heavy atoms
Parked Cars in London
Zeros of Riemann zeta
Mice Brain Wave Spikes
Telltale Sign: Repulsion + optimality

“what’s my β?” web page
Cy’s tricks:
• Maximum Likelihood Estimation
• Bayesian Probability
• Kernel Density Estimation
  • Epanechnikov kernel
• Confidence Intervals
http://people.csail.mit.edu/cychan/BetaEstimator.html

Open Questions
Find new applications of spacing (or other) distributions
Cleanest derivation of Tracy-Widom?
“Finite” free probability?
Finite meets infinite
Muirhead meets Tracy-Widom
Software for stochastic eigen-analysis

Everyone’s Favorite Tridiagonal
[Matrix: tridiagonal with -2 on the diagonal and 1 on the off-diagonals, scaled by a factor involving $n^2$, shown as the discrete counterpart of $\frac{d^2}{dx^2}$]
[Matrix: the same tridiagonal plus a diagonal of entries G with a factor involving $(\beta n)^{1/2}$, shown as the discrete counterpart of $\frac{d^2}{dx^2} + \frac{dW}{\beta^{1/2}}$]

Stochastic Operator Limit
$\frac{d^2}{dx^2} - x + \frac{2}{\sqrt{\beta}}\,dW$
$H_n^\beta \sim \frac{1}{\sqrt{2n\beta}} \begin{pmatrix} N(0,2) & \chi_{(n-1)\beta} & & & \\ \chi_{(n-1)\beta} & N(0,2) & \chi_{(n-2)\beta} & & \\ & \ddots & \ddots & \ddots & \\ & & \chi_{2\beta} & N(0,2) & \chi_\beta \\ & & & \chi_\beta & N(0,2) \end{pmatrix}$
$H_n^\beta \approx H_n^\infty + \frac{2}{\sqrt{\beta}}\,G_n$
Cast of characters: Dumitriu, Sutton, Rider

Open Questions
Find new applications of spacing (or other) distributions
Cleanest derivation of Tracy-Widom?
“Finite” free probability?
Finite meets infinite
Muirhead meets Tracy-Widom
Software for stochastic eigen-analysis

Is it really the random matrices?
The excitement is that the random matrix statistics are everywhere
Random matrices properly tridiagonalized are discretizations of stochastic differential operators!
Eigenvalues of SDOs are not as well studied
Deep down, this is what I believe is the important mechanism in the spacings, not the random matrices!
(See Brian Sutton thesis, Brian Rider papers: connection to Schrödinger operators)
Deep down for other statistics, though, it’s the matrices

Open Questions
Find new applications of spacing (or other) distributions
Cleanest derivation of Tracy-Widom?
“Finite” free probability?
Finite meets infinite
Muirhead meets Tracy-Widom
Software for stochastic eigen-analysis

Free Probability
Free Probability (name refers to “free algebras”, meaning no strings attached)
Gets us past Gaussian ensembles and Wishart Matrices

The flipping coins example
Classical Probability: Coin: +1 or -1 with p=.5
[Figure: x and y each take the values -1 and +1 with probability 50%; x+y takes the values -2, 0, +2]

The flipping coins example
Free Probability: Coin: +1 or -1 with p=.5
[Figure: eig(A) and eig(B) each sit at -1 and +1 with weight 50%; eig(A+QBQ') is plotted on an axis marked -2, 0, +2]

From Finite to Infinite
Gaussian (m=1)
Wiggly
Wigner

Semi-circle law for different betas

Open Questions
Find new applications of spacing (or other) distributions
Cleanest derivation of Tracy-Widom?
“Finite” free probability?
Finite meets infinite
Muirhead meets Tracy-Widom
Software for stochastic eigen-analysis

Matrix Statistics
• Many worked out in the 1950s and 1960s
• Muirhead, “Aspects of Multivariate Statistical Theory”
• Are two covariance matrices equal?
• Does my matrix equal this matrix?
• Is my matrix a multiple of the identity?
• Answers require computation of
  • Hypergeometrics of Matrix Argument
• Long thought computationally intractable

The special functions of multivariate statistics
Hypergeometric Functions of Matrix Argument
β=2: Schur Polynomials
Other values: Jack Polynomials
Orthogonal Polynomials of Matrix Argument
Begin with w(x) on I
$\int p_\kappa(x)\,p_\lambda(x)\,\Delta(x)^\beta \prod_i w(x_i)\,dx_i = \delta_{\kappa\lambda}$
Jack Polynomials orthogonal for w=1 on the unit circle.
Analogs of $x^m$
Plamen Koev: revolutionary computation
Dumitriu’s MOPS symbolic package

Multivariate Orthogonal Polynomials & Hypergeometrics of Matrix Argument
The important special functions of the 21st century
Begin with w(x) on I
$\int p_\kappa(x)\,p_\lambda(x)\,\Delta(x)^\beta \prod_i w(x_i)\,dx_i = \delta_{\kappa\lambda}$
Jack Polynomials orthogonal for w=1 on the unit circle. Analogs of $x^m$

Smallest eigenvalue statistics
A=randn(m,n); hist(min(svd(A).^2))

Multivariate Hypergeometric Functions

Open Questions
Find new applications of spacing (or other) distributions
Cleanest derivation of Tracy-Widom?
“Finite” free probability?
Finite meets infinite
Muirhead meets Tracy-Widom
Software for stochastic eigen-analysis

Plamen Koev’s clever idea

Symbolic MOPS applications
A=randn(n); S=(A+A')/2; trace(S^4) det(S^3)

MOPS (Ioana Dumitriu)
Symbolic

Random Matrix Calculator

Encoding the semicircle
The algebraic secret
$f(x) = \sqrt{4-x^2}/(2\pi)$
$m(z) = (-z + i\sqrt{4-z^2})/2$
$L(m,z) \equiv m^2 + zm + 1 = 0$
Stieltjes transform: $m(z) = \int (x-z)^{-1} f(x)\,dx$
[Figure: semicircle density, Probability versus x]
Practical encoding: Polynomial L whose root m is the Stieltjes transform

The Polynomial Method
RMTool
http://arxiv.org/abs/math/0601389
The polynomial method for random matrices
Eigenvectors as well!
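
As a rough illustration of the encoding on the semicircle slide, the sketch below recovers the density f(x) from the polynomial L(m,z) = m^2 + zm + 1 alone. It picks the root with positive imaginary part just above the real axis and applies the standard Stieltjes inversion f(x) ≈ Im m(x + iε)/π. This is plain Python, not RMTool; the function names and the choice ε = 1e-9 are assumptions made for illustration.

```python
import cmath
import math


def stieltjes_root(z):
    """Root m of L(m, z) = m^2 + z*m + 1 with the larger imaginary part.

    For z in the upper half-plane this is the branch that is a
    Stieltjes transform of a probability density.
    """
    disc = cmath.sqrt(z * z - 4)
    roots = ((-z + disc) / 2, (-z - disc) / 2)
    return max(roots, key=lambda m: m.imag)


def density_from_polynomial(x, eps=1e-9):
    """Approximate f(x) as Im m(x + i*eps) / pi (Stieltjes inversion)."""
    m = stieltjes_root(complex(x, eps))
    return m.imag / math.pi


def semicircle(x):
    """Closed-form semicircle density sqrt(4 - x^2) / (2*pi), zero outside [-2, 2]."""
    return math.sqrt(max(4 - x * x, 0.0)) / (2 * math.pi)


if __name__ == "__main__":
    # Compare the density decoded from the polynomial with the closed form.
    for x in (-1.5, 0.0, 1.0, 3.0):
        print(x, density_from_polynomial(x), semicircle(x))
```

The point of the design is that nothing but the polynomial is stored: the density is read off from a root, which is what makes the encoding practical for sums and products of matrices.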
Plus
[Figure: density plots, Probability versus x, for A, B, and A+B]
X=randn(n,n); A=X+X';   $m^2 + zm + 1 = 0$
Y=randn(n,2*n); B=Y*Y';   $z m^2 + (2z-1)m + 2 = 0$
A+B:   $m^3 + (z+2)m^2 + (2z-1)m + 2 = 0$

Times
[Figure: density plots, Probability versus x, for A, B, and A*B]
X=randn(n,n); A=X+X';   $m^2 + zm + 1 = 0$
Y=randn(n,2*n); B=Y*Y';   $z m^2 + (2z-1)m + 2 = 0$
A*B:   $m^4 z^2 - 2m^3 z + m^2 + 4mz + 4 = 0$

Open Questions
Find new applications of spacing (or other) distributions
Cleanest derivation of Tracy-Widom?
“Finite” free probability?
Finite meets infinite
Muirhead meets Tracy-Widom
Software for stochastic eigen-analysis

Matrix Versions of Classical Stats
(A=randn(n); B=randn(n))
Orthog     Matrix    MATLAB             Stats
Hermite    Sym Eig   eig(A+A')          Normal
Laguerre   SVD       eig(A*A')          Chisquared
Jacobi     GSVD      gsvd(A,B)          Beta
Fourier    Eig       [U,R]=qr(A+i*B)

The big structure
Orthog     Matrix    Weight                        Stats        Graph Theory      SymSpace
Hermite    Sym Eig   $\exp(-x^2)$                  Normal       Complete Graph    noncompact A, AI, AII
Laguerre   SVD       $x^\alpha e^{-x}$             Chisquared   Bipartite Graph   noncompact AIII, BDI, CII
Jacobi     GSVD      $(1-x)^\alpha (1+x)^\beta$    Beta         Regular Graph     compact A, AI, AII, C, D, CI, D, DIII
Fourier    Eig       $e^{i\theta}$                                                compact AIII, BDI, CDI

Summary
Stochastic Eigenanalysis
Emerging Techniques
Open Problems
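
One way to sanity-check the A+B polynomial on the Plus slide is through the R-transform of free probability, which adds for free summands. The sketch below assumes the normalization implied by the slide's polynomials: R(w) = w for the semicircle and R(w) = 1/(1 - w/2) for the Wishart-type law with ratio 1/2. It also assumes the sign convention G = -m between the usual Cauchy transform and the slide's m(z) = ∫ (x-z)^{-1} f(x) dx. These conventions and all function names are assumptions for illustration and are not taken from RMTool.

```python
def z_from_m(m, r_transform):
    """Recover z from m using z = 1/G + R(G), with G = -m (assumed convention)."""
    g = -m
    return 1 / g + r_transform(g)


def r_semicircle(w):
    """R-transform of the semicircle law on [-2, 2]."""
    return w


def r_wishart(w):
    """R-transform of the Wishart-type law with ratio 1/2 (assumed normalization)."""
    return 1 / (1 - w / 2)


def r_sum(w):
    """R-transforms add for free summands."""
    return r_semicircle(w) + r_wishart(w)


def poly_a(m, z):
    return m**2 + z * m + 1


def poly_b(m, z):
    return z * m**2 + (2 * z - 1) * m + 2


def poly_sum(m, z):
    return m**3 + (z + 2) * m**2 + (2 * z - 1) * m + 2


def residuals(m):
    """Absolute values of the three slide polynomials at the matching (m, z) pairs."""
    return (
        abs(poly_a(m, z_from_m(m, r_semicircle))),
        abs(poly_b(m, z_from_m(m, r_wishart))),
        abs(poly_sum(m, z_from_m(m, r_sum))),
    )


if __name__ == "__main__":
    # Arbitrary test points away from the poles m = 0 and m = -2.
    for m in (0.3 + 0.7j, -0.5 + 0.2j, 0.1 + 1.3j):
        print(m, residuals(m))
```

All three residuals should come out at rounding-error level. The A*B polynomial on the Times slide would need the multiplicative analogue (the S-transform), which this sketch does not attempt.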