Beta-Ensembles with Covariance

by Alexander Dubbs

A.B., Harvard University (2009)

Submitted to the Department of Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Applied Mathematics at the Massachusetts Institute of Technology, June 2014.

© 2014 Alexander Dubbs. All rights reserved.

Author: Department of Mathematics, April 18, 2014
Certified by: Alan Edelman, Professor, Thesis Supervisor
Accepted by: Peter Shor, Chairman, Applied Mathematics Committee

Beta-Ensembles with Covariance

by Alexander Dubbs

Submitted to the Department of Mathematics on April 18, 2014, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Applied Mathematics

Abstract

This thesis presents analytic samplers for the β-Wishart and β-MANOVA ensembles with diagonal covariance. These generalize the β-ensembles of Dumitriu-Edelman, Lippert, Killip-Nenciu, Forrester-Rains, and Edelman-Sutton, as well as the classical β = 1, 2, 4 ensembles of James, Li-Xue, and Constantine. Forrester discovered a sampler for the β-Wishart ensemble around the same time, although our proof has key differences. We also derive the largest eigenvalue pdf for the β-MANOVA case. In infinite-dimensional random matrix theory, we find the moments of the Wachter law, and the Jacobi parameters and free cumulants of the McKay and Wachter laws. We also present an algorithm that uses complex analysis to solve "The Moment Problem." It takes the first batch of moments of an analytic, compactly-supported distribution as input, and it outputs a fine discretization of that distribution.

Thesis Supervisor: Alan Edelman
Title: Professor

Acknowledgments

I am grateful for the help of my adviser Alan Edelman. This thesis would not have been possible without his patience and inspiration.
Thanks to him I am a much better researcher than I was when I arrived at MIT. It was a privilege to contribute to the fields of β-ensembles and infinite random matrix theory.

I am also grateful for the help of my coauthors Plamen Koev and Praveen Venkataramana. Plamen's mhg software and Praveen's combinatorial skills helped push this thesis across the finish line.

I would also like to thank Marcelo Magnasco and Christopher Jones. Marcelo let me into his lab while I was still a high school student and taught me to do computational neuroscience research, culminating in a paper. Chris both kept me occupied with "bonus" problems and allowed me the opportunity to learn independently.

To my friends, it has been a wonderful experience living with you in Cambridge for the last nine years; you will all be missed.

Finally, I would like to thank my family, who encouraged me to study mathematics.

Chapter 1

Introduction

1.1 Work on beta-ensembles to date.

We define a β-ensemble to be a probability distribution with a continuous dimension parameter β > 0 that adjusts the degree of Vandermonde repulsion among its variables. β-ensembles are typically the eigenvalue, singular value, or generalized singular value distributions of finite random matrices with Gaussian entries. The three main ones are the Hermite, Laguerre, and Jacobi ensembles; see [12]:

    Hermite:   c_H^β · ∏_{i<j} |λ_i − λ_j|^β · exp(−½ Σ_i λ_i²)

    Laguerre:  c_L^{β,a} · ∏_{i<j} |λ_i − λ_j|^β · ∏_i λ_i^{a−p} · exp(−½ Σ_i λ_i)

    Jacobi:    c_J^{β,a₁,a₂} · ∏_{i<j} |λ_i − λ_j|^β · ∏_i λ_i^{a₁−p} (1 − λ_i)^{a₂−p}

The β = 1, 2, 4 cases of these distributions are the eigenvalue distributions of ensembles of real (β = 1), complex (β = 2), and quaternionic (β = 4) random matrices of Gaussians. Let X and Y denote Gaussian random matrices over the reals, complexes, or quaternions, depending on β.
In terms of the eigenvalue distribution, we have the correspondence:

    Hermite:   eig((X + Xᵗ)/2)
    Laguerre:  eig(XXᵗ)
    Jacobi:    eig(XXᵗ(XXᵗ + YYᵗ)⁻¹)

There also exist finite Gaussian random matrix ensembles over the reals, complexes, or quaternions governed by a diagonal matrix of nonrandom tuning parameters, called the ensemble's "covariance." The two known ones are below, where gsvdc denotes the "cosine generalized singular values,"

    gsvdc(Y, XQ) = eig(YᵗY(YᵗY + QXᵗXQ)⁻¹)^{1/2}.

D and Q are diagonal matrices of tuning parameters, Σ = diag(σ₁, …, σ_n) are singular values, and C = diag(c₁, …, c_n) are cosine generalized singular values; see [10] and [11].

    Wishart:  Σ = diag(svd(D^{1/2}X)), with density proportional to
        ∏_i σ_i^{(m−n+1)β−1} · ∏_{i<j} |σ_i² − σ_j²|^β · ₀F₀^{(β)}(−Σ²/2, D⁻¹).

    MANOVA:   C = diag(gsvdc(Y, XQ)), with density proportional to
        ∏_{i<j} |c_i² − c_j²|^β · ∏_i c_i^{(m−n+1)β−1}(1 − c_i²)^{(p+n−2)β/2−1} · ₁F₀^{(β)}((m+p)β/2; C²(C² − I)⁻¹, Q²).

The hypergeometric functions ₚF_q^{(β)} are defined in Chapter 2, Section 2.5 (and that definition uses the Jack functions, which are in Section 2.4). It is a natural question to ask, "For continuous β > 0, is there a matrix ensemble that has a given β-ensemble as its eigenvalue distribution?" In [12], Dumitriu and Edelman were the first to answer yes, in the cases of the Hermite and Laguerre ensembles. If we define the matrix B as below, eig(BBᵗ) follows the Laguerre ensemble, and it works for any β > 0. The χ's denote independent χ-distributed variables with the indicated degrees of freedom:

    B = ⎡ χ_{2a}                              ⎤
        ⎢ χ_{β(m−1)}  χ_{2a−β}                ⎥
        ⎢       ⋱          ⋱                  ⎥
        ⎣            χ_β    χ_{2a−β(m−1)}     ⎦

This thesis' contributions to finite random matrix theory are analytic samplers for the β-Wishart and β-MANOVA ensembles with covariance for general β > 0. They are not as simple as finding the eigenvalues of a single matrix; instead, the eigenvalues of many matrices are needed to produce the samples, which are proven to come from exactly the correct distributions.
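As a concrete aside (our own sketch, not code from the thesis; the function name and interface are ours), the Dumitriu-Edelman bidiagonal model is easy to implement, since every entry of B is an independent χ variable:

```python
import numpy as np

def beta_laguerre_eigs(m, a, beta, rng):
    """Sample eigenvalues of the beta-Laguerre ensemble as eig(B B^t),
    where B is the Dumitriu-Edelman bidiagonal matrix of chi variables."""
    # Diagonal degrees of freedom: 2a, 2a - beta, ..., 2a - beta(m-1)
    diag_df = [2 * a - beta * k for k in range(m)]
    # Subdiagonal degrees of freedom: beta(m-1), ..., beta
    sub_df = [beta * (m - 1 - k) for k in range(m - 1)]
    diag = np.sqrt(rng.chisquare(diag_df))  # chi = sqrt of chi-square
    sub = np.sqrt(rng.chisquare(sub_df))
    B = np.diag(diag) + np.diag(sub, -1)
    return np.sort(np.linalg.eigvalsh(B @ B.T))

rng = np.random.default_rng(0)
lam = beta_laguerre_eigs(5, 10.0, 2.5, rng)  # any continuous beta > 0 works
```

The draw lam then follows the Laguerre β-ensemble for the chosen a, without any matrix over the reals, complexes, or quaternions ever being formed.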
In addition, we contribute the probability distribution function of the largest eigenvalue of the β-MANOVA ensemble, which we check with the software mhg [43]. Chapter 2 (originally in [10]) is concerned with the β-Wishart ensemble, which was discovered around the same time by Forrester [27], and Chapter 3 (originally in [11]) is concerned with the β-MANOVA ensemble. Most work to date on matrix models for β-ensembles is described below:

Laguerre/Wishart Models

    β       D = I (Laguerre)                     D general (Wishart)
    β = 1   Fisher [24] (1939), Hsu [33]         James [36] (1960)
            (1939), Roy [61] (1939)
    β = 2   James [37] (1964)                    James [37] (1964)
    β = 4   Li-Xue [45] (2009)                   Li-Xue [45] (2009)
    β > 0   Dumitriu-Edelman [12] (2002)         Forrester [27] (2011),
                                                 Dubbs-Edelman-Koev-Venkataramana [10] (2013)

Jacobi/MANOVA Models

    β       Q = I (Jacobi)                       Q general (MANOVA)
    β = 1   Fisher [24] (1939), Girshick [29]    Constantine (unpublished,
            (1939), Hsu [33] (1939), Mood [51]   found in [37] (1964))
            (1951), Olkin-Roy [55] (1954),
            Roy [61] (1939)
    β = 2   James [37] (1964)                    —
    β > 0   Lippert [46] (2003), Killip-Nenciu   Dubbs-Edelman [11] (2014)
            [40] (2004), Forrester-Rains [28]
            (2005), Edelman-Sutton [21] (2008)

The Hermite ensemble does not, as of this writing, have a generalization using a covariance matrix. In addition, a matrix model for the β-circular ensemble was proven to work by [40]. The samplers for the β-Wishart and β-MANOVA ensembles are described below. The Wishart covariance parameters are in D and its singular values are in Σ; the MANOVA covariance parameters are in Q and its cosine generalized singular values are in C.
Beta-Wishart (Recursive) Model Pseudocode

    Function Σ := BetaWishart(m, n, β, D)
        if n = 1 then
            Σ := χ_{mβ} D_{11}^{1/2}
        else
            Z_{1:n−1,1:n−1} := BetaWishart(m, n − 1, β, D_{1:n−1,1:n−1})
            Z_{n,1:n−1} := [0, …, 0]
            Z_{1:n−1,n} := [χ_β D_{nn}^{1/2}; …; χ_β D_{nn}^{1/2}]
            Z_{n,n} := χ_{(m−n+1)β} D_{nn}^{1/2}
            Σ := diag(svd(Z))
        end if

Beta-MANOVA Model Pseudocode

    Function C := BetaMANOVA(m, n, p, β, Q)
        Λ := BetaWishart(m, n, β, Q²)²
        M := BetaWishart(p, n, β, Λ⁻¹)²
        C := (M + I)^{−1/2}

The distributions of the largest and smallest β-Wishart eigenvalues are due to [41] and included in Chapter 2. The distribution of the largest cosine generalized singular value of the β-MANOVA distribution is new to this thesis and is proved in Chapter 3. It is:

Theorem 1. If t = (m − n + 1)β/2 − 1 ∈ Z_{≥0}, then

    P(c₁ < x) = det(x²Q²((1 − x²)I + x²Q²)⁻¹)^{pβ/2}
                × Σ_{k=0}^{nt} Σ_{κ⊢k, κ₁≤t} [(pβ/2)_κ^{(β)} / k!] · C_κ^{(β)}((1 − x²)((1 − x²)I + x²Q²)⁻¹),    (1.1)

where the Jack polynomial C_κ^{(β)} and Pochhammer symbol (·)_κ^{(β)} are defined in Sections 2.4 and 2.5.

1.2 Ghost methods.

Dumitriu and Edelman's original paper on β-ensembles [12], as well as Chapters 2 and 3 of this thesis, make use of Ghost methods, a concept formally put forward by Edelman [17]. There are two ways of looking at it.

1. Say you can use linear algebra to reduce a complex or quaternionic matrix to a real matrix with the same eigenvalues, and say that the method works in the same way for an initially given random real, complex, or quaternionic matrix. The derived similar real matrix will have a tuning parameter β > 0 indicating whether it originally came from a real (β = 1), complex (β = 2), or quaternionic (β = 4) matrix. Then, make that tuning parameter β in the derived similar matrix continuous and find the matrix's eigenvalue p.d.f., which will be a β-ensemble with Vandermonde β-repulsion. Now we have accomplished two things: we have a proof of the eigenvalue p.d.f. for the initial real, complex, or quaternionic random matrix, and we have additionally found a matrix model for a generalizing β-ensemble.
2. Another way to look at Ghost methods, which has not yet been fully formalized, is to pretend that an initial matrix for which we desire the eigenvalue p.d.f. is populated by independent "Ghost Gaussians," and possibly some real covariance parameters. Ghost matrices have the property that their eigenvalue distributions are invariant under real orthogonal matrices and "Ghost Orthogonal Matrices," including but not limited to diagonal matrices of "Ghost Signs." A Ghost Orthogonal Matrix or a real orthogonal matrix times a vector of Ghost Gaussians leaves its distribution invariant. Ghost Signs have the property that if they multiply their respective Ghost Gaussians, the result is a real χ_β random variable.

Let us consider the case of the 3 × 3 Wishart over the reals, complex numbers, or quaternions with identity covariance. Let G_β represent an independent Gaussian real, complex, or quaternion for β = 1, 2, 4, with mean zero and variance one. Let χ_d be a χ-distributed real with d degrees of freedom. The following algorithm computes the singular values, where all of the random variables in a given matrix are assumed independent. We assume D = I for purposes of illustration, but the algorithm generalizes. We proceed through a series of matrices related by orthogonal transformations on the left and the right:

    ⎡ G_β G_β G_β ⎤      ⎡ χ_{3β} G_β G_β ⎤
    ⎢ G_β G_β G_β ⎥  →   ⎢ 0      G_β G_β ⎥
    ⎣ G_β G_β G_β ⎦      ⎣ 0      G_β G_β ⎦

To create the real, positive (1,2) entry, we multiply the second column by a real sign, or a complex or quaternionic phase. We then use a Householder reflector on the bottom two rows to make the (2,2) entry a χ_{2β}:

    ⎡ χ_{3β} χ_β    G_β ⎤
    ⎢ 0      χ_{2β} G_β ⎥
    ⎣ 0      0      G_β ⎦

Now we take the SVD of the 2 × 2 upper-left block:

    ⎡ T₁ 0  G_β ⎤
    ⎢ 0  T₂ G_β ⎥
    ⎣ 0  0  G_β ⎦

We convert the third column to reals using a diagonal matrix of signs on both sides:

    ⎡ T₁ 0  χ_β ⎤
    ⎢ 0  T₂ χ_β ⎥
    ⎣ 0  0  χ_β ⎦

The process can be continued for a larger matrix, and can work with one that is taller than it is wide.
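For β = 1, the chain of orthogonal transformations above can be checked numerically. The following is our own illustrative sketch (not code from the thesis): it reduces a real 3 × 3 Gaussian matrix to broken-arrow form and verifies that the singular values are unchanged.

```python
import numpy as np

def householder(x):
    """Orthogonal H with H @ x = (||x||, 0, ..., 0)."""
    e1 = np.zeros_like(x)
    e1[0] = 1.0
    v = x - np.linalg.norm(x) * e1
    if np.linalg.norm(v) < 1e-12:
        return np.eye(len(x))
    v /= np.linalg.norm(v)
    return np.eye(len(x)) - 2.0 * np.outer(v, v)

def broken_arrow_reduce(A):
    """Reduce a real 3x3 matrix to [[t1, 0, a], [0, t2, b], [0, 0, c]]
    with nonnegative entries, using only orthogonal transformations."""
    A = A.copy()
    A = householder(A[:, 0]) @ A                 # zero entries (2,1), (3,1)
    if A[0, 1] < 0:                              # real "sign" on column 2
        A[:, 1] *= -1.0
    A[1:, :] = householder(A[1:, 1]) @ A[1:, :]  # zero entry (3,2)
    U, S, Vt = np.linalg.svd(A[:2, :2])          # SVD of upper-left 2x2 block
    L, R = np.eye(3), np.eye(3)
    L[:2, :2], R[:2, :2] = U.T, Vt.T
    A = L @ A @ R                                # upper-left block -> diag(S)
    # Diagonal sign matrices on both sides make the third column nonnegative
    # while keeping the diagonal nonnegative.
    s = [1.0 if A[i, 2] >= 0 else -1.0 for i in range(3)]
    A = np.diag(s) @ A @ np.diag([s[0], s[1], 1.0])
    return A

rng = np.random.default_rng(2)
A0 = rng.standard_normal((3, 3))
B = broken_arrow_reduce(A0)
```

For β = 2, 4 the sign flips become complex or quaternionic phase rotations, which is exactly the step the Ghost-methods argument mimics for continuous β.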
What this proves is that the second-to-last matrix,

    ⎡ T₁ 0  χ_β ⎤
    ⎢ 0  T₂ χ_β ⎥
    ⎣ 0  0  χ_β ⎦

has the same singular values as the first matrix, if β = 1, 2, 4. We call this new matrix a "Broken-Arrow Matrix." The previously stated algorithm, "Beta-Wishart (Recursive) Model Pseudocode," which generalizes the one above for the 3 × 3 case, samples the singular values of the Wishart ensemble for general β and general D.

We can also use Ghost methods to derive the correctness of the previously stated algorithm, "Beta-MANOVA Model Pseudocode," for β = 1, 2, 4, and conjecture that the algorithm works for continuous β. Let X be m × n real, complex, quaternion, or Ghost normal, let Y be p × n real, complex, quaternion, or Ghost normal, and let Q be n × n diagonal, real, and positive-definite. Let QX*XQ have eigendecomposition UΛU*, and let QX*XQ(Y*Y)⁻¹ have eigendecomposition VMV*. We want to draw M so we can draw C = (M + I)^{−1/2}, where (C, S) = gsvd(Y, XQ). Let ~ mean "having the same eigenvalue distribution." Then

    QX*XQ(Y*Y)⁻¹ ~ ΛU*(Y*Y)⁻¹U ~ Λ((U*Y*)(YU))⁻¹ ~ Λ(Y*Y)⁻¹,

from which we can draw the eigenvalues M using BetaWishart(p, n, β, Λ⁻¹)². Since Λ can be drawn using BetaWishart(m, n, β, Q²)², this completes the algorithm for BetaMANOVA(m, n, p, β, Q) and proves that it has the desired generalized singular values in the β = 1, 2, 4 cases.

1.3 Infinite Random Matrix Theory and The Moment Problem.

Consider the "big" laws for asymptotic level densities for various random matrices:

    Wigner semicircle law [66]
    Marchenko-Pastur law [49]
    McKay law [50]
    Wachter law [65]

Their measures and supports are defined in the table below. (The McKay and Wachter laws are related by an affine change of variables when a = b = ν/2.)
    Law                  Measure                                               Support                           Parameters
    Wigner semicircle    dμ_WS = (1/(2π)) √(4 − x²) dx                         I_WS = [−2, 2]                    N/A
    Marchenko-Pastur     dμ_MP = √((λ₊ − x)(x − λ₋)) / (2πλx) dx               I_MP = [λ₋, λ₊]                   λ± = (1 ± √λ)²
    McKay                dμ_M = ν √(4(ν − 1) − x²) / (2π(ν² − x²)) dx          I_M = [−2√(ν − 1), 2√(ν − 1)]     ν ≥ 2
    Wachter              dμ_W = (a + b) √((λ₊ − x)(x − λ₋)) / (2πx(1 − x)) dx  I_W = [λ₋, λ₊]                    λ± = ((√(a(a + b − 1)) ± √b)/(a + b))², a, b ≥ 1

These four measures have other representations: as their Cauchy, R, and S transforms; as their moments and free cumulants; and as their Jacobi parameters and orthogonal polynomial sequences. In fact, their Jacobi parameters (α_i, β_i)_{i=0}^∞ have the property that they are "bordered Toeplitz": after a short initial border, α_k = α_{k+1} = ⋯ and β_k = β_{k+1} = ⋯. This motivates the two parts of Chapter 4:

1. We tabulate in one place key properties of the four laws, not all of which can be found in the literature. These sections are expository, with the exception of the as-yet unpublished Wachter moments, and the McKay and Wachter law Jacobi parameters and free cumulants.

2. We describe a new algorithm to exploit the Toeplitz-with-length-k-border structure. In particular, we show how practical it is to approximate distributions with incomplete information using distributions having nearly-Toeplitz encodings. We can use the theory of Cauchy transforms to go from the first batch of moments or Jacobi parameters of an analytic, compactly-supported distribution to a fine discretization thereof, which can be used computationally.

Studies of nearly Toeplitz matrices in random matrix theory have been pioneered by Anshelevich [2, 3]. Other laws may be characterized as being asymptotically Toeplitz or numerically Toeplitz fairly quickly, such as the limiting eigenvalue histogram of (X/√m + μI)ᵗ(X/√m + μI), where X is m × n, n is O(m), and m → ∞. Figure 4-3 shows that its histogram can be reconstructed from its first batch of Jacobi parameters. Figure 4-2 shows distributions recovered from random first batches of Jacobi parameters.
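The Jacobi-parameter encoding can be made concrete: the k-th moment of a measure is the (1,1) entry of the k-th power of its tridiagonal Jacobi matrix. Here is a small illustrative sketch (ours, not the algorithm of Chapter 4), applied to the semicircle law, whose Jacobi parameters are α_i = 0, β_i = 1 and whose even moments are the Catalan numbers:

```python
import numpy as np

def moments_from_jacobi(alpha, beta, kmax):
    """Moments m_0..m_kmax of the measure with Jacobi parameters
    (alpha_i, beta_i): m_k = e_1^T J^k e_1, where J is tridiagonal with
    diagonal alpha and off-diagonals sqrt(beta)."""
    n = len(alpha)
    J = (np.diag(alpha)
         + np.diag(np.sqrt(beta[:n - 1]), 1)
         + np.diag(np.sqrt(beta[:n - 1]), -1))
    e1 = np.zeros(n)
    e1[0] = 1.0
    v, out = e1.copy(), []
    for _ in range(kmax + 1):
        out.append(v[0])  # current entry is e_1^T J^k e_1
        v = J @ v
    return np.array(out)

# Wigner semicircle: alpha_i = 0, beta_i = 1 (fully Toeplitz, empty border).
m = moments_from_jacobi(np.zeros(12), np.ones(12), 8)
```

Truncating J to 12 × 12 is harmless here, since m_k for k ≤ 8 only involves the top-left corner of the infinite Jacobi matrix.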
Figure 4-4 shows the normal distribution, which is not compactly supported, reasonably well recovered from its first 10 or 20 moments.

Chapter 2

A Matrix Model for the β-Wishart Ensemble

2.1 Introduction

The goal of this chapter is to prove that a random matrix Z has eig(ZᵗZ) distributed with pdf equal to the β-ensemble below:

    c_W^β · det(D)^{−mβ/2} · ∏_i λ_i^{(m−n+1)β/2−1} · Δ(λ)^β · ₀F₀^{(β)}(−Λ/2, D⁻¹) dλ.

Z's singular values are said to be the √λ_i. Z is defined by the recursion in the box, if n is a positive integer and m a real greater than n − 1.

Beta-Wishart (Recursive) Model, W^{(β)}(D, m, n):

    Z = ⎡ τ₁                 χ_β D_{nn}^{1/2}          ⎤
        ⎢     ⋱              ⋮                         ⎥
        ⎢         τ_{n−1}    χ_β D_{nn}^{1/2}          ⎥
        ⎣ 0   ⋯   0          χ_{(m−n+1)β} D_{nn}^{1/2} ⎦

where {τ₁, …, τ_{n−1}} are the singular values of W^{(β)}(D_{1:n−1,1:n−1}, m, n − 1), with base case W^{(β)}(D, m, 1) = χ_{mβ} D_{11}^{1/2}.

The singular values of W^{(β)}(D, m, n) are the singular values of Z. The critical aspect of the proof is changing variables from the τ_i's and the χ_β D_{nn}^{1/2}'s to the singular values of Z and the bottom row of its right singular vector matrix, q. This requires the derivation of a Jacobian between the two sets of variables. In addition, to complete the recursion, a theorem about Jack polynomials is needed. It is originally due to [54] in a different form; the proof in this thesis is due to Praveen Venkataramana. Let dq be the surface area measure of the first quadrant of the n-sphere. The theorem is:

    C_κ^{(β)}(Λ) ∝ ∫ ∏_{i=1}^{n} q_i^{β−1} C_κ^{(β)}((I − qqᵗ)Λ) dq.

The Jack polynomial C_κ^{(β)} is defined in Section 2.4. For completeness we also include the distributions of the extreme eigenvalues, which are due to [41], and we check them with the mhg software from [43].

2.2 Arrow and Broken-Arrow Matrix Jacobians

Define the (symmetric) arrow matrix

    A = ⎡ d₁                  c₁      ⎤
        ⎢      ⋱              ⋮       ⎥
        ⎢          d_{n−1}    c_{n−1} ⎥
        ⎣ c₁   ⋯   c_{n−1}    c_n     ⎦

Let its eigenvalues be λ₁, …, λ_n. Let q be the last row of its eigenvector matrix, i.e., q contains the n-th element of each eigenvector. q is by convention in the positive quadrant.

Define the broken-arrow matrix B by

    B = ⎡ b₁                  a₁      ⎤
        ⎢      ⋱              ⋮       ⎥
        ⎢          b_{n−1}    a_{n−1} ⎥
        ⎣ 0    ⋯   0          a_n     ⎦

Let its singular values be σ₁, …, σ_n, and let q contain the bottom row of its right singular vector matrix, i.e.,
the last row of the eigenvector matrix of A = BᵗB, which is an arrow matrix. q is by convention in the positive quadrant. Define dq to be the surface-area element on the sphere in Rⁿ.

Lemma 1. For an arrow matrix A, let f be the unique map f : (c, d) → (q, λ). The Jacobian of f satisfies:

    dq dλ = (∏_{i=1}^{n−1} q_i / c_i) dc dd.

The proof is after Lemma 3.

Lemma 2. For a broken-arrow matrix B, let g be the unique map g : (a, b) → (q, σ). The Jacobian of g satisfies:

    dq dσ = (∏_{i=1}^{n−1} q_i / a_i) da db.

The proof is after Lemma 3.

Lemma 3. If all elements of a, b, q, σ are nonnegative, and b, d, λ, σ are ordered, then f and g are bijections excepting sets of measure zero (if some b_i = b_j or some d_i = d_j for i ≠ j).

Proof. We only prove it for f; the g case is similar. We show that f is a bijection using results from Dumitriu and Edelman [12], who in turn cite Parlett [57]. Define the tridiagonal matrix

    ⎡ η₁  ε₁                      ⎤
    ⎢ ε₁  η₂  ε₂                  ⎥
    ⎢        ⋱     ⋱              ⎥
    ⎣           ε_{n−2}  η_{n−1}  ⎦

to have eigenvalues d₁, …, d_{n−1} and bottom entries of the eigenvector matrix u = (c₁, …, c_{n−1})/γ, where γ = √(c₁² + ⋯ + c_{n−1}²). (d, u) ↔ (ε, η) is a bijection [12], [57] excepting sets of measure 0. Now we extend the above tridiagonal matrix by a bordering row and column and use ~ to indicate similar matrices. Since (u, γ) ↔ (c₁, …, c_{n−1}) is a bijection, we have constructed a bijection from (c₁, …, c_n, d₁, …, d_{n−1}) to (c_n, γ, ε, η), excepting sets of measure 0. (c_n, γ, ε, η) defines a bordered tridiagonal matrix which is in bijection with (q, λ) [12], [57]. Hence we have bijected (c, d) ↔ (q, λ). The proof that f is a bijection is complete. ∎

Proof of Lemma 1. Apply Dumitriu and Edelman [12], Lemma 2.9, twice: once to pass from the bordered tridiagonal variables (c_n, γ, ε, η) to (q, λ), and once to pass from (ε, η) to (d, u). Together,

    dq dλ = (∏_{i=1}^{n−1} q_i / u_i) γ⁻¹ dc_n dd du dγ.

The full spherical element is, using γ as the radius, dc₁ ⋯ dc_{n−1} = γ^{n−2} du dγ.
Hence

    dq dλ = (∏_{i=1}^{n−1} q_i / u_i) γ^{−(n−1)} dc dd,

which, by the substitution c_i = γ u_i, is

    dq dλ = (∏_{i=1}^{n−1} q_i / c_i) dc dd. ∎

Proof of Lemma 2. Let A = BᵗB, so that c_i = a_i b_i and d_i = b_i² for i < n, and c_n = a₁² + ⋯ + a_n². Since λ_i = σ_i², dλ = 2ⁿ (∏_{i=1}^{n} σ_i) dσ, and det(BᵗB) = det(B)² = a_n² ∏_{i=1}^{n−1} b_i² = ∏_{i=1}^{n} σ_i². The full-matrix Jacobian ∂(c, d)/∂(a, b) has determinant 2ⁿ a_n ∏_{i=1}^{n−1} b_i², so

    dc dd = 2ⁿ a_n ∏_{i=1}^{n−1} b_i² da db.

By Lemma 1,

    dq dλ = (∏_{i=1}^{n−1} q_i / (a_i b_i)) · 2ⁿ a_n ∏_{i=1}^{n−1} b_i² da db,

and dividing by dλ = 2ⁿ (∏ σ_i) dσ and using det(B) = a_n ∏_{i=1}^{n−1} b_i = ∏_{i=1}^{n} σ_i,

    dq dσ = (∏_{i=1}^{n−1} q_i / a_i) da db. ∎

2.3 Further Arrow and Broken-Arrow Matrix Lemmas

Lemma 4.

    q_k = (1 + Σ_{j=1}^{n−1} c_j² / (λ_k − d_j)²)^{−1/2}.

Proof. Let v be the eigenvector of A corresponding to λ_k. Temporarily fix v_n = 1. Using Av = λ_k v, for j < n, v_j = c_j / (λ_k − d_j). Renormalizing v so that ||v|| = 1, we get the desired value for v_n = q_k. ∎

Lemma 5. For a vector x of length l, define Δ(x) = ∏_{i<j} |x_i − x_j|. Then

    Δ(λ) = Δ(d) ∏_{k=1}^{n−1} |c_k| ∏_{k=1}^{n} q_k⁻¹.

Proof. Using a result in Wilkinson [67], the characteristic polynomial of A is:

    p(λ) = ∏_{i=1}^{n} (λ_i − λ) = ∏_{i=1}^{n−1} (d_i − λ) · (c_n − λ − Σ_{j=1}^{n−1} c_j² / (d_j − λ)).    (2.1)

Therefore, for k < n,

    |p(d_k)| = ∏_{i=1}^{n} |λ_i − d_k| = c_k² ∏_{i=1, i≠k}^{n−1} |d_i − d_k|.    (2.2)

Taking a product on both sides,

    ∏_{k=1}^{n−1} ∏_{i=1}^{n} |λ_i − d_k| = Δ(d)² ∏_{k=1}^{n−1} c_k².

Also,

    |p′(λ_k)| = ∏_{i=1, i≠k}^{n} |λ_i − λ_k| = ∏_{i=1}^{n−1} |d_i − λ_k| · (1 + Σ_{j=1}^{n−1} c_j² / (d_j − λ_k)²).    (2.3)

Taking a product on both sides,

    Δ(λ)² = ∏_{i=1}^{n} ∏_{k=1}^{n−1} |d_k − λ_i| · ∏_{i=1}^{n} (1 + Σ_{j=1}^{n−1} c_j² / (d_j − λ_i)²).

Equating the two expressions for ∏_{i=1}^{n} ∏_{k=1}^{n−1} |λ_i − d_k|, we get

    Δ(λ)² = Δ(d)² ∏_{k=1}^{n−1} c_k² ∏_{i=1}^{n} (1 + Σ_{j=1}^{n−1} c_j² / (d_j − λ_i)²).

The desired result follows by the previous lemma. ∎

Lemma 6. For a vector x of length l, define Δ₂(x) = ∏_{i<j} |x_i² − x_j²|. The singular values of B satisfy

    Δ₂(σ) = Δ₂(b) ∏_{k=1}^{n−1} |a_k b_k| ∏_{k=1}^{n} q_k⁻¹.

Proof. Follows from A = BᵗB and Lemma 5. ∎

2.4 Jack and Hermite Polynomials

The proof structure of this section, culminating in Theorem 2, is due to my collaborator Praveen Venkataramana, as are several of the lemmas. As in [14], if κ ⊢ k, then κ = (κ₁, κ₂, …) is nonnegative, ordered non-increasingly, and sums to k. Let α = 2/β. Let ρ_κ^α = Σ_i κ_i (κ_i − 1 − (2/α)(i − 1)).
We define l(κ) to be the number of nonzero elements of κ. We say that μ < κ in "lexicographic ordering" if, for the largest integer j such that μ_i = κ_i for all i < j, we have μ_j < κ_j.

Definition 1. As in Dumitriu, Edelman, and Shuman [14], we define the Jack polynomial of a matrix argument, C_κ^{(β)}(X), as follows: Let x₁, …, x_n be the eigenvalues of X. C_κ^{(β)}(X) is the only homogeneous polynomial eigenfunction of the Laplace-Beltrami-type operator

    D* = Σ_{i=1}^{n} x_i² ∂²/∂x_i² + β Σ_{1≤i≠j≤n} (x_i² / (x_i − x_j)) ∂/∂x_i,

with eigenvalue ρ_κ^α + k(n − 1), having highest-order monomial basis function in lexicographic ordering (see [14], Section 2.4) corresponding to κ. In addition,

    Σ_{κ⊢k, l(κ)≤n} C_κ^{(β)}(X) = trace(X)^k.

Lemma 7. If we write C_κ^{(β)}(X) in terms of the eigenvalues x₁, …, x_n as C_κ^{(β)}(x₁, …, x_n), then C_κ^{(β)}(x₁, …, x_{n−1}) = C_κ^{(β)}(x₁, …, x_{n−1}, 0) if l(κ) < n. If l(κ) = n, then C_κ^{(β)}(x₁, …, x_{n−1}, 0) = 0.

Proof. The l(κ) = n case follows from a formula in Stanley [63], Propositions 5.1 and 5.5, that only applies if κ_n > 0:

    C_κ^{(β)}(X) ∝ det(X) · C_{(κ₁−1, …, κ_n−1)}^{(β)}(X).

If κ_n = 0, then from Koev [43, (3.8)], C_κ^{(β)}(x₁, …, x_{n−1}) = C_κ^{(β)}(x₁, …, x_{n−1}, 0). ∎

Definition 2. The Hermite polynomials (of a matrix argument) are a basis for the space of symmetric multivariate polynomials in the eigenvalues x₁, …, x_n of X which are related to the Jack polynomials by (Dumitriu, Edelman, and Shuman [14], page 17)

    H_κ^{(β)}(X) = Σ_{σ⊆κ} c_{κσ} C_σ^{(β)}(X),

where σ ⊆ κ means that for each i, σ_i ≤ κ_i, and the coefficients c_{κσ} are given by (Dumitriu, Edelman, and Shuman [14], page 17). Since Jack polynomials are homogeneous, that means H_κ^{(β)}(X) ∝ C_κ^{(β)}(X) + L.O.T. Furthermore, by (Dumitriu, Edelman, and Shuman [14], page 16), the Hermite polynomials are orthogonal with respect to the measure

    exp(−½ Σ_{i=1}^{n} x_i²) ∏_{i<j} |x_i − x_j|^β.

Lemma 8. Let

    A(μ, c) = ⎡ μ₁                  c₁      ⎤
              ⎢      ⋱              ⋮       ⎥
              ⎢          μ_{n−1}    c_{n−1} ⎥
              ⎣ c₁   ⋯   c_{n−1}    c_n     ⎦

and let, for l(κ) < n,

    Q(μ, c_n) = ∫ ∏_{i=1}^{n−1} c_i^{β−1} H_κ^{(β)}(A(μ, c)) exp(−c₁² − ⋯ − c_{n−1}²) dc₁ ⋯ dc_{n−1}.
Then Q is a symmetric polynomial in μ, with leading term proportional to H_κ^{(β)}(μ) plus terms of order strictly less than |κ|.

Proof. If we exchange two c_i's, i < n, and the corresponding μ_i's, A(μ, c) has the same eigenvalues, so H_κ^{(β)}(A(μ, c)) is unchanged. So we can prove that Q(μ, c_n) is symmetric in μ by swapping two μ_i's and seeing that the integral is invariant under swapping the corresponding c_i's. Now, since H_κ^{(β)}(A(μ, c)) is a symmetric polynomial in the eigenvalues of A(μ, c), we can write it in the power-sum basis, i.e., it is in the ring generated by t_p = λ₁^p + ⋯ + λ_n^p, for p = 0, 1, 2, 3, …, if λ₁, …, λ_n are the eigenvalues of A(μ, c). But t_p = trace(A(μ, c)^p), so it is a polynomial in μ and c:

    H_κ^{(β)}(A(μ, c)) = Σ_{i, ε₁, …, ε_{n−1} ≥ 0} p_{i,ε}(μ) c_n^i c₁^{ε₁} ⋯ c_{n−1}^{ε_{n−1}}.

Its order in μ and c must be |κ|, the same as its order in λ. Integrating, it follows that

    Q(μ, c_n) = Σ_{i≥0} Σ_{ε₁, …, ε_{n−1} ≥ 0} p_{i,ε}(μ) c_n^i M_ε,

for constants M_ε. Since deg(H_κ^{(β)}(A(μ, c))) = |κ|, deg(p_{i,ε}(μ)) ≤ |κ| − |ε| − i. Writing

    Q(μ, c_n) = M₀ p_{0,0}(μ) + Σ_{(i,ε)≠(0,0)} p_{i,ε}(μ) c_n^i M_ε,

we see that the summation has degree at most |κ| − 1 in μ only, treating c_n as a constant. Now

    p_{0,0}(μ) ∝ H_κ^{(β)}(μ) + r(μ),

where r(μ) has degree at most |κ| − 1. This follows from the expansion of H_κ^{(β)} in Jack polynomials in Definition 2 and the fact about Jack polynomials in Lemma 7. ∎

The new lemma follows.

Lemma 9. Let the arrow matrix below have eigenvalues in Λ = diag(λ₁, …, λ_n) and have q be the last row of its eigenvector matrix, i.e., q contains the n-th element of each eigenvector:

    A(Λ, q) = ⎡ μ₁                  c₁      ⎤
              ⎢      ⋱              ⋮       ⎥
              ⎢          μ_{n−1}    c_{n−1} ⎥
              ⎣ c₁   ⋯   c_{n−1}    c_n     ⎦

and write M = diag(μ₁, …, μ_{n−1}). By Lemma 3 this is a well-defined map except on a set of measure zero. Then, for U(X) a symmetric homogeneous polynomial of degree k in the eigenvalues of X,

    V(Λ) = ∫ ∏_{i=1}^{n} q_i^{β−1} U(M) dq

is a symmetric homogeneous polynomial of degree k in λ₁, …, λ_n.

Proof. Let e_n be the column vector that is 0 everywhere except in the last entry, which is 1. (I − e_n e_nᵗ) A(Λ, q) (I − e_n e_nᵗ) has eigenvalues {μ₁, …,
μ_{n−1}, 0}. If the eigenvector matrix of A(Λ, q) is Q, then Qᵗ(I − e_n e_nᵗ)Q Λ Qᵗ(I − e_n e_nᵗ)Q must have those eigenvalues as well. But this is (I − qqᵗ)Λ(I − qqᵗ). So

    U(M) = U(eig((I − qqᵗ)Λ(I − qqᵗ)) \ {0}).    (2.4)

It is well known that we can write U(M) in the power-sum ring: U(M) is made of sums and products of functions of the form μ₁^p + ⋯ + μ_{n−1}^p, where p is a positive integer. Therefore, the right-hand side is made of functions of the form

    μ₁^p + ⋯ + μ_{n−1}^p + 0^p = trace(((I − qqᵗ)Λ(I − qqᵗ))^p),

which, if U(M) is order k in the μ_i's, must be order k in the λ_i's. So V(Λ) is a polynomial of order k in the λ's. Switching λ₁ and λ₂ and also q₁ and q₂ leaves

    ∫ ∏_{i=1}^{n} q_i^{β−1} U(eig((I − qqᵗ)Λ(I − qqᵗ)) \ {0}) dq

invariant, so V(Λ) is symmetric. ∎

Theorem 2 is a new theorem about Jack polynomials.

Theorem 2. Let the arrow matrix A(Λ, q) be as in Lemma 9, with M = diag(μ₁, …, μ_{n−1}). By Lemma 3 this is a well-defined map except on a set of measure zero. Then, if κ is a partition with l(κ) < n, and q ranges over the first quadrant of the unit sphere,

    C_κ^{(β)}(Λ) ∝ ∫ ∏_{i=1}^{n} q_i^{β−1} C_κ^{(β)}(M) dq.

Proof. Define

    η_{κ⁽⁰⁾}(Λ) = ∫ ∏_{i=1}^{n} q_i^{β−1} H_{κ⁽⁰⁾}^{(β)}(M) dq.

This is a symmetric polynomial in n variables (Lemma 9). Thus it can be expanded in Hermite polynomials with max order |κ⁽⁰⁾| (Lemma 9):

    η_{κ⁽⁰⁾}(Λ) = Σ_{|κ|≤|κ⁽⁰⁾|} c(κ⁽⁰⁾, κ) H_κ^{(β)}(Λ),

where |κ| = κ₁ + κ₂ + ⋯ + κ_{l(κ)}. Using orthogonality, from the previous definition of Hermite polynomials,

    c(κ⁽⁰⁾, κ) ∝ ∫_{λ∈Rⁿ} ∫ ∏_{i=1}^{n} q_i^{β−1} H_{κ⁽⁰⁾}^{(β)}(M) H_κ^{(β)}(Λ) exp(−½ trace(Λ²)) ∏_{i<j} |λ_i − λ_j|^β dq dλ.

Using Lemmas 1 and 3,

    c(κ⁽⁰⁾, κ) ∝ ∫ ∏_{i=1}^{n−1} (q_i^β / c_i) q_n^{β−1} H_{κ⁽⁰⁾}^{(β)}(M) H_κ^{(β)}(Λ) exp(−½ trace(Λ²)) ∏_{i<j} |λ_i − λ_j|^β dμ dc.
Using Lemma 5,

    c(κ⁽⁰⁾, κ) ∝ ∫ ∏_{i=1}^{n−1} c_i^{β−1} H_{κ⁽⁰⁾}^{(β)}(M) H_κ^{(β)}(Λ) exp(−½ trace(Λ²)) ∏_{i<j} |μ_i − μ_j|^β dμ dc,

and by substitution,

    c(κ⁽⁰⁾, κ) ∝ ∫ ∏_{i=1}^{n−1} c_i^{β−1} H_{κ⁽⁰⁾}^{(β)}(M) H_κ^{(β)}(A(μ, c)) exp(−½ trace(A(μ, c)²)) ∏_{i<j} |μ_i − μ_j|^β dμ dc.

Define Q(μ, c_n) as in Lemma 8:

    Q(μ, c_n) = ∫ ∏_{i=1}^{n−1} c_i^{β−1} H_κ^{(β)}(A(μ, c)) exp(−c₁² − ⋯ − c_{n−1}²) dc₁ ⋯ dc_{n−1}.

Q is a symmetric polynomial in μ (Lemma 8). Furthermore, by Lemma 8, Q(μ, c_n) ∝ H_κ^{(β)}(M) + L.O.T., where the lower-order terms are of order lower than |κ| and are symmetric polynomials. Hence they can be written in a basis of lower-order Hermite polynomials, and as

    c(κ⁽⁰⁾, κ) ∝ ∫ H_{κ⁽⁰⁾}^{(β)}(M) Q(μ, c_n) ∏_{i<j} |μ_i − μ_j|^β exp(−(c_n² + μ₁² + ⋯ + μ_{n−1}²)/2) dμ dc_n,

we have by orthogonality c(κ⁽⁰⁾, κ) ∝ δ(κ⁽⁰⁾, κ), where δ is the Kronecker delta. So

    η_{κ⁽⁰⁾}(Λ) = ∫ ∏_{i=1}^{n} q_i^{β−1} H_{κ⁽⁰⁾}^{(β)}(M) dq ∝ H_{κ⁽⁰⁾}^{(β)}(Λ).

By Lemma 9, coupled with Definition 2,

    C_κ^{(β)}(Λ) ∝ ∫ ∏_{i=1}^{n} q_i^{β−1} C_κ^{(β)}(M) dq. ∎

Corollary 1. Finding the proportionality constant: for l(κ) < n,

    C_κ^{(β)}(Λ) = [2^{n−1} Γ(nβ/2) C_κ^{(β)}(I_n)] / [Γ(β/2)ⁿ C_κ^{(β)}(I_{n−1})] · ∫ ∏_{i=1}^{n} q_i^{β−1} C_κ^{(β)}((I − qqᵗ)Λ) dq.

Proof. By Theorem 2 with Equation (2.4) (in the proof of Lemma 9),

    C_κ^{(β)}(Λ) ∝ ∫ ∏_{i=1}^{n} q_i^{β−1} C_κ^{(β)}(eig((I − qqᵗ)Λ(I − qqᵗ)) \ {0}) dq,

which by Lemma 7 and properties of matrices is

    C_κ^{(β)}(Λ) ∝ ∫ ∏_{i=1}^{n} q_i^{β−1} C_κ^{(β)}((I − qqᵗ)Λ) dq.

Now to find the proportionality constant. Let Λ = I_n, and let c_p be the constant of proportionality:

    C_κ^{(β)}(I_n) = c_p ∫ ∏_{i=1}^{n} q_i^{β−1} C_κ^{(β)}(I − qqᵗ) dq.

Since I − qqᵗ is a projection, we can replace the term in the integral by C_κ^{(β)}(I_{n−1}), which can be moved out. So

    C_κ^{(β)}(I_n) = c_p C_κ^{(β)}(I_{n−1}) ∫ ∏_{i=1}^{n} q_i^{β−1} dq.

Now, evaluating ∫_{R₊ⁿ} ∏_i x_i^{β−1} e^{−|x|²/2} dx both as a product of one-dimensional integrals and in polar coordinates (with radius r and dx = r^{n−1} dr dq),

    ∫ ∏_{i=1}^{n} q_i^{β−1} dq = Γ(β/2)ⁿ / (2^{n−1} Γ(nβ/2)),

and the corollary follows. ∎

Corollary 2. The Jack polynomials can be defined recursively using Corollary 1 and two results in the compilation [41].

Proof. By Stanley [63], Proposition 4.2, the Jack polynomial of one variable under the J normalization is

    J_κ^{(β)}(λ₁) = λ₁^{κ₁} (1 + (2/β))(1 + 2(2/β)) ⋯ (1 + (κ₁ − 1)(2/β)).
There exists another recursion for Jack polynomials under the J normalization:

    J_κ^{(β)}(Λ) = det(Λ) · J_{(κ₁−1, …, κ_n−1)}^{(β)}(Λ) · ∏_{i=1}^{n} (n − i + 1 + (2/β)(κ_i − 1)),

if κ_n > 0. Note that if κ_n > 0 we can use the above formula to reduce the size of κ in a recursive expression for a Jack polynomial, and if κ_n = 0 we can use Corollary 1 to reduce the number of variables in a recursive expression for a Jack polynomial. Using those facts together and the conversion between C and J normalizations in [14], we can define all Jack polynomials. ∎

2.5 Hypergeometric Functions

Definition 3. We define the hypergeometric function of two matrix arguments and parameter β, ₀F₀^{(β)}(X, Y), for n × n matrices X and Y, by

    ₀F₀^{(β)}(X, Y) = Σ_{k=0}^{∞} Σ_{κ⊢k, l(κ)≤n} C_κ^{(β)}(X) C_κ^{(β)}(Y) / (k! C_κ^{(β)}(I)),

as in Koev and Edelman [43]. It is efficiently calculated using the software described in Koev and Edelman [43], mhg, which is available online [42]. The C's are Jack polynomials under the C normalization; κ ⊢ k means that κ is a partition of the integer k, so κ₁ ≥ κ₂ ≥ ⋯ ≥ 0 and |κ| = κ₁ + κ₂ + ⋯ = k.

Lemma 10.

    ₀F₀^{(β)}(X, Y) = exp(s · trace(X)) · ₀F₀^{(β)}(X, Y − sI).

Proof. The claim holds for s = 1 by Baker and Forrester [4]. Now, using that fact with the homogeneity of Jack polynomials,

    ₀F₀^{(β)}(X, Y − sI) = ₀F₀^{(β)}(X, s((1/s)Y − I))
                         = ₀F₀^{(β)}(sX, (1/s)Y − I)
                         = exp(−trace(sX)) · ₀F₀^{(β)}(sX, (1/s)Y)
                         = exp(−s · trace(X)) · ₀F₀^{(β)}(X, Y). ∎

Definition 4. We define the generalized Pochhammer symbol to be, for a partition κ = (κ₁, …, κ_l),

    (a)_κ^{(β)} = ∏_{i=1}^{l} ∏_{j=1}^{κ_i} (a − (i − 1)β/2 + j − 1).

Definition 5. As in Koev and Edelman [43], we define the hypergeometric function ₁F₁ to be

    ₁F₁^{(β)}(a; b; X, Y) = Σ_{k=0}^{∞} Σ_{κ⊢k, l(κ)≤n} [(a)_κ^{(β)} / (b)_κ^{(β)}] · C_κ^{(β)}(X) C_κ^{(β)}(Y) / (k! C_κ^{(β)}(I)).

The best software available to compute this function numerically is described in Koev and Edelman [43], mhg.

Definition 6. We define the generalized Gamma function to be

    Γ_n^{(β)}(c) = π^{n(n−1)β/4} ∏_{i=1}^{n} Γ(c − (i − 1)β/2),

for ℜ(c) > (n − 1)β/2.
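Definition 6 is straightforward to evaluate numerically. The following sketch is our own (not from the thesis); it reduces to Γ(c) when n = 1, and to √π Γ(c) Γ(c − 1/2) when n = 2, β = 1:

```python
import math

def multivariate_gamma(c, n, beta):
    """Generalized Gamma function of Definition 6:
    Gamma_n^(beta)(c) = pi^(n(n-1)beta/4) * prod_{i=1}^n Gamma(c - (i-1)beta/2).
    Requires c > (n-1)beta/2 for real arguments."""
    assert c > (n - 1) * beta / 2
    out = math.pi ** (n * (n - 1) * beta / 4)
    for i in range(1, n + 1):
        out *= math.gamma(c - (i - 1) * beta / 2)
    return out
```

For large arguments one would work with the logarithm (summing `math.lgamma` values) to avoid overflow, but the direct product suffices to check the small cases used in Corollaries 4 and 5.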
2.6 The β-Wishart Ensemble and its Spectral Distribution

The β-Wishart ensemble for m × n matrices is defined iteratively; we derive the m × n case from the m × (n − 1) case.

Definition 7. We assume n is a positive integer and m is a real greater than n − 1. Let D be a positive-definite diagonal n × n matrix. For n = 1, the β-Wishart ensemble is

    Z = (χ_{mβ} D_{1,1}^{1/2}, 0, …, 0)ᵗ,

with n − 1 zeros, where χ_{mβ} represents a random positive real that is χ-distributed with mβ degrees of freedom. For n > 1, the β-Wishart ensemble with positive-definite diagonal n × n covariance matrix D is defined as follows: Let τ₁, …, τ_{n−1} be one draw of the singular values of the m × (n − 1) β-Wishart ensemble with covariance D_{1:(n−1),1:(n−1)}. Define the matrix Z by

    Z = ⎡ τ₁                 χ_β D_{n,n}^{1/2}          ⎤
        ⎢     ⋱              ⋮                          ⎥
        ⎢         τ_{n−1}    χ_β D_{n,n}^{1/2}          ⎥
        ⎣ 0   ⋯   0          χ_{(m−n+1)β} D_{n,n}^{1/2} ⎦

All the χ-distributed random variables are independent. Let σ₁, …, σ_n be the singular values of Z. They are one draw of the singular values of the m × n β-Wishart ensemble, completing the recursion. λ_i = σ_i² are the eigenvalues of the β-Wishart ensemble.

Theorem 3. Let Σ = diag(σ₁, …, σ_n), σ₁ > σ₂ > ⋯ > σ_n. The singular values of the β-Wishart ensemble with covariance D are distributed by a pdf proportional to

    det(D)^{−mβ/2} ∏_{i=1}^{n} σ_i^{(m−n+1)β−1} Δ₂(σ)^β ₀F₀^{(β)}(−Σ²/2, D⁻¹) dσ.

It follows from a simple change of variables that the ordered λ_i's are distributed as

    c_W^β det(D)^{−mβ/2} ∏_{i=1}^{n} λ_i^{(m−n+1)β/2−1} Δ(λ)^β ₀F₀^{(β)}(−Λ/2, D⁻¹) dλ.

Proof. First we need to check the n = 1 case: the one singular value σ₁ = χ_{mβ} D_{1,1}^{1/2} has pdf proportional to

    D_{1,1}^{−mβ/2} σ₁^{mβ−1} exp(−σ₁² / (2 D_{1,1})) dσ₁.

We use the fact that

    ₀F₀^{(β)}(−σ₁²/2, D_{1,1}⁻¹) = ₀F₀^{(β)}(−σ₁² D_{1,1}⁻¹/2, 1) = exp(−σ₁² / (2 D_{1,1})).

The first equality comes from the expansion of ₀F₀ in terms of Jack polynomials and the fact that Jack polynomials are homogeneous (see the definitions of Jack polynomials and ₀F₀ in this chapter); the second comes from (2.1) in Koev [41], or in Forrester [28]. We also use that ₀F₀^{(β)}(X, I) = ₀F₀^{(β)}(X), by definition [43]. Now we assume n > 1.
Let

Z = \begin{pmatrix} T & a_{1:n-1} \\ 0 & a_n \end{pmatrix}, \qquad a_i = \chi_\beta D_{n,n}^{1/2}\ (1 \le i \le n-1), \qquad a_n = \chi_{(m-n+1)\beta} D_{n,n}^{1/2},

so the a_i's are \chi-distributed with different parameters. By hypothesis, the \tau_i's are a β-Wishart draw. Therefore, the a_i's and the \tau_i's have joint distribution proportional to

\det(D_{1:n-1,1:n-1})^{-m\beta/2} \prod_{i=1}^{n-1} \tau_i^{(m-n+2)\beta - 1} \prod_{i<j} |\tau_i^2 - \tau_j^2|^\beta\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}T^2,\, D_{1:n-1,1:n-1}^{-1}\right) \prod_{i=1}^{n-1} a_i^{\beta - 1}\, a_n^{(m-n+1)\beta - 1} \exp\!\left(-\frac{|a|^2}{2 D_{n,n}}\right) da\, d\tau,

where T = \mathrm{diag}(\tau_1, \ldots, \tau_{n-1}) and |a|^2 = a_1^2 + \cdots + a_n^2. Using Lemmas 2 and 3 we change variables from (a, \tau) to (\sigma, q), and using Lemma 6 and properties of determinants the density becomes

\det(D)^{-m\beta/2} \prod_{i=1}^{n} \sigma_i^{(m-n+1)\beta - 1} \prod_{i<j} |\sigma_i^2 - \sigma_j^2|^\beta \prod_{i=1}^{n-1} q_i^{\beta - 1}\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}T^2,\, D_{1:n-1,1:n-1}^{-1}\right) \exp\!\left(-\frac{|a|^2}{2 D_{n,n}}\right) d\sigma\, dq.

To complete the induction, we need to prove

{}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Sigma^2,\, D^{-1}\right) \propto \int \prod_{i=1}^{n-1} q_i^{\beta - 1}\, e^{-|a|^2/(2D_{n,n})}\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}T^2,\, D_{1:n-1,1:n-1}^{-1}\right) dq.

We can reduce this expression using |a|^2 + \sum_{i=1}^{n-1} \tau_i^2 = \sum_{i=1}^{n} \sigma_i^2, so that it suffices to show

\exp\!\left(\frac{\mathrm{trace}(\Sigma^2)}{2D_{n,n}}\right) {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Sigma^2,\, D^{-1}\right) \propto \int \prod_{i=1}^{n-1} q_i^{\beta - 1} \exp\!\left(\frac{\mathrm{trace}(T^2)}{2D_{n,n}}\right) {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}T^2,\, D_{1:n-1,1:n-1}^{-1}\right) dq,

or, moving some constants and signs around,

\exp\!\left(-\frac{1}{D_{n,n}}\,\mathrm{trace}\!\left(-\frac{\Sigma^2}{2}\right)\right) {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Sigma^2,\, D^{-1}\right) \propto \int \prod_{i=1}^{n-1} q_i^{\beta - 1} \exp\!\left(-\frac{1}{D_{n,n}}\,\mathrm{trace}\!\left(-\frac{T^2}{2}\right)\right) {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}T^2,\, D_{1:n-1,1:n-1}^{-1}\right) dq,

or, using Lemma 10 with s = -1/D_{n,n},

{}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Sigma^2,\; D^{-1} - \frac{1}{D_{n,n}} I_n\right) \propto \int \prod_{i=1}^{n-1} q_i^{\beta - 1}\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}T^2,\; D_{1:n-1,1:n-1}^{-1} - \frac{1}{D_{n,n}} I_{n-1}\right) dq.

We will prove this expression termwise using the expansion of {}_0F_0 into infinitely many Jack polynomials. The (k, \kappa) term on the right hand side is

\int \prod_{i=1}^{n-1} q_i^{\beta - 1}\, \frac{C_\kappa^{(\beta)}\!\left(-\frac{1}{2}T^2\right) C_\kappa^{(\beta)}\!\left(D_{1:n-1,1:n-1}^{-1} - \frac{1}{D_{n,n}} I_{n-1}\right)}{k!\, C_\kappa^{(\beta)}(I_{n-1})}\, dq,

where \kappa \vdash k and l(\kappa) < n. The (k, \kappa) term on the left hand side is

\frac{C_\kappa^{(\beta)}\!\left(-\frac{1}{2}\Sigma^2\right) C_\kappa^{(\beta)}\!\left(D^{-1} - \frac{1}{D_{n,n}} I_n\right)}{k!\, C_\kappa^{(\beta)}(I_n)},

where \kappa \vdash k and l(\kappa) \le n. If l(\kappa) = n, the term is 0 by Lemma 7, so either it has a corresponding term on the right hand side or it is zero. Hence, using Lemma 7 again, it suffices to show that for l(\kappa) < n,

C_\kappa^{(\beta)}(\Sigma^2) \propto \int \prod_{i=1}^{n-1} q_i^{\beta - 1}\, C_\kappa^{(\beta)}(T^2)\, dq.

This follows by Theorem 2, and the proof of Theorem 3 is complete. \square

Corollary 3.
The normalization constant, for \lambda_1 > \lambda_2 > \cdots > \lambda_n, is K_{m,n}^{(\beta)}: the eigenvalue pdf is

\left(K_{m,n}^{(\beta)}\right)^{-1} \det(D)^{-m\beta/2} \prod_{i=1}^{n} \lambda_i^{(m-n+1)\beta/2 - 1} \prod_{i<j} |\lambda_i - \lambda_j|^\beta\; {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Lambda,\, D^{-1}\right),

where

K_{m,n}^{(\beta)} = \frac{2^{mn\beta/2}\, \Gamma_n^{(\beta)}(n\beta/2)\, \Gamma_n^{(\beta)}(m\beta/2)}{\pi^{n(n-1)\beta/2}\, \Gamma(\beta/2)^n}.

Proof. We have used the convention that elements of D do not move through \propto, so we may assume D is the identity. Using {}_0F_0^{(\beta)}(-\Lambda/2, I) = \exp(-\mathrm{trace}(\Lambda)/2) (Koev [41], (2.1)), the model becomes the β-Laguerre model studied in Forrester [25]. \square

Corollary 4. Using Definition 6 of the generalized Gamma, the distribution of \lambda_{\max} for the β-Wishart ensemble with general covariance in diagonal D, P(\lambda_{\max} < x), is:

\frac{\Gamma_n^{(\beta)}\!\left(1 + \frac{(n-1)\beta}{2}\right)}{\Gamma_n^{(\beta)}\!\left(1 + \frac{(m+n-1)\beta}{2}\right)} \det\!\left(\frac{x}{2} D^{-1}\right)^{m\beta/2} {}_1F_1^{(\beta)}\!\left(\frac{m\beta}{2};\; \frac{(m+n-1)\beta}{2} + 1;\; -\frac{x}{2} D^{-1}\right).

Proof. See page 14 of Koev [41], Theorem 6.1. A factor of β is lost due to differences in nomenclature. The best software to calculate this is described in Koev and Edelman [43], mhg. Convergence is improved using formula (2.6) in Koev [41]. \square

Corollary 5. The distribution of \lambda_{\min} for the β-Wishart ensemble with general covariance in diagonal D, P(\lambda_{\min} < x), is:

1 - \exp\!\left(\mathrm{trace}\!\left(-\frac{x}{2} D^{-1}\right)\right) \sum_{k=0}^{nt} \sum_{\kappa \vdash k,\ \kappa_1 \le t} \frac{C_\kappa^{(\beta)}\!\left(\frac{x}{2} D^{-1}\right)}{k!}.

It is only valid when t = (m - n + 1)\beta/2 - 1 is a nonnegative integer.

Proof. See pages 14-15 of Koev [41], Theorem 6.1. A factor of β is lost due to differences in nomenclature. The best software to calculate this is described in Koev and Edelman [43], mhg. \square

Koev [41], Theorem 6.2, gives a formula for the distribution of the trace of the β-Wishart ensemble.

Figure 2-1: The line is the empirical cdf created from many draws of the maximum eigenvalue of the β-Wishart ensemble, with m = 4, n = 4, β = 2.5, and D = diag(1.1, 1.2, 1.4, 1.8). The x's are the analytically derived values of the cdf using Corollary 4 and mhg.

Figures 2-1, 2-2, 2-3, and 2-4 demonstrate the correctness of Corollaries 4 and 5, which are derived from Theorem 3.
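The recursion in Definition 7 translates directly into code. The sketch below is our own illustration (not the software used to make the figures); it draws the singular values for arbitrary β > 0 and checks one simple consequence of the construction: at β = 1, D = I the ensemble matches an m x n real Gaussian matrix, so the squared singular values sum to a variable with mean mn.

```python
import numpy as np

def chi(dof, rng):
    # chi-distributed scalar with (possibly non-integer) dof: sqrt of Gamma(dof/2, scale=2)
    return np.sqrt(rng.gamma(shape=dof / 2.0, scale=2.0))

def beta_wishart_sv(m, n, beta, D, rng):
    # Singular values of the m x n beta-Wishart ensemble with covariance diag(D),
    # built by the recursion of Definition 7: border the previous singular values
    # with a column of chi variables and re-take the SVD.
    sv = np.array([chi(m * beta, rng) * np.sqrt(D[0])])          # n = 1 base case
    for k in range(2, n + 1):
        Z = np.zeros((k, k))
        Z[:k - 1, :k - 1] = np.diag(sv)                          # T = diag(tau)
        Z[:k - 1, k - 1] = np.sqrt(D[k - 1]) * np.array(
            [chi(beta, rng) for _ in range(k - 1)])              # iid chi_beta border
        Z[k - 1, k - 1] = np.sqrt(D[k - 1]) * chi((m - k + 1) * beta, rng)
        sv = np.linalg.svd(Z, compute_uv=False)
    return np.sort(sv)[::-1]

rng = np.random.default_rng(0)
m, n = 6, 4
draws = np.array([beta_wishart_sv(m, n, 1.0, np.ones(n), rng) for _ in range(4000)])
mean_tr = (draws ** 2).sum(axis=1).mean()
assert abs(mean_tr - m * n) < 0.5   # E[sum sigma_i^2] = mn at beta = 1, D = I
```

Empirical cdfs of the extreme eigenvalues of such draws, plotted against Corollaries 4 and 5 evaluated by mhg, produce comparisons like Figures 2-1 through 2-4.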
2.7 The β-Wishart Ensemble and Free Probability

Given the eigenvalue distributions of two large random matrices, free probability allows one to analytically compute the eigenvalue distributions of the sum and product of those matrices (a good summary is Nadakuditi and Edelman [59]). In particular, we would like to compute the eigenvalue histogram for X^t X D/(m\beta), where X is a tall matrix of standard normal reals, complexes, quaternions, or Ghosts, and D is a positive definite diagonal matrix drawn from a prior. Dumitriu [13] proves that for the D = I and β = 1, 2, 4 cases, the answer is the Marchenko-Pastur law, invariant over β. So it is reasonable to assume that the value of β does not figure into hist(eig(X^t X D)), where D is random. We use the methods of Olver and Nadakuditi [56] to analytically compute the product of the Marchenko-Pastur distribution for m/n -> 10 and variance 1 with the semicircle distribution of width 2\sqrt{2} centered at 3.

Figure 2-2: The line is the empirical cdf created from many draws of the maximum eigenvalue of the β-Wishart ensemble, with m = 6, n = 4, β = 0.75, and D = diag(1.1, 1.2, 1.4, 1.8). The x's are the analytically derived values of the cdf using Corollary 4 and mhg.

Figure 2-3: The line is the empirical cdf created from many draws of the minimum eigenvalue of the β-Wishart ensemble, with m = 4, n = 3, β = 5, and D = diag(1.1, 1.2, 1.4). The x's are the analytically derived values of the cdf using Corollary 5 and mhg.

Figure 2-4: The line is the empirical cdf created from many draws of the minimum eigenvalue of the β-Wishart ensemble, with m = 7, n = 4, β = 0.5, and D = diag(1, 2, 3, 4). The x's are the analytically derived values of the cdf using Corollary 5 and mhg.

Figure 2-5 demonstrates that
the histogram of 1000 draws of X^t X D/(m\beta) for m = 1000, n = 100, and β = 3, represented as a bar graph, is equal to the analytically computed red line. The β-Wishart distribution allows us to draw the eigenvalues of X^t X D/(m\beta) even though we cannot sample the entries of the matrix for β = 3.

2.8 Acknowledgements

We acknowledge the support of the National Science Foundation through grants SOLAR Grant No. 1035400, DMS-1035400, and DMS-1016086. Alexander Dubbs was funded by the NSF GRFP. We also acknowledge the partial support by the Woodward Fund for Applied Mathematics at San Jose State University, a gift from the estate of Mrs. Marie Woodward in memory of her son, Henry Tynham Woodward. He was an alumnus of the Mathematics Department at San Jose State University and worked with research groups at NASA Ames.

Figure 2-5: The analytical product of the semicircle and Marchenko-Pastur laws is the red line; the histogram is 1000 draws of the β-Wishart (β = 3) with covariance drawn from the shifted semicircle distribution. They match perfectly.

Chapter 3

A Matrix Model for the β-MANOVA Ensemble

3.1 Introduction

Recall from the thesis introduction that:

Beta-MANOVA Model Pseudocode
Function C := BetaMANOVA(m, n, p, β, Ω)
    Λ := BetaWishart(m, n, β, Ω²)
    M := BetaWishart(p, n, β, Λ⁻¹)⁻¹
    C := (M + I)^{-1/2}

Our main theorem is the joint distribution of the elements of C.

Theorem 4. The distribution of the generalized singular values diag(C) = (c_1, \ldots, c_n), c_1 > c_2 > \cdots > c_n, generated by the above algorithm for m, p \ge n, is equal to:

\frac{2^n\, n!\, K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)}\, K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta} \prod_{i=1}^{n} c_i^{(p-n+1)\beta - 1} (1 - c_i^2)^{-(p+n-1)\beta/2 - 1} \prod_{i<j} |c_i^2 - c_j^2|^\beta\; {}_1F_0^{(\beta)}\!\left(\frac{(m+p)\beta}{2};\; ;\; C^2 (C^2 - I)^{-1},\, \Omega^2\right) dc,

where {}_1F_0^{(\beta)} and K_{m,n}^{(\beta)} are defined in the upcoming section, Preliminaries. We also find the distribution of the largest generalized singular value in certain cases, generalizing Dumitriu and Koev's results on the Jacobi ensemble in [15].

Theorem 5.
If t = (m - n + 1)\beta/2 - 1 \in \mathbb{Z}_{\ge 0},

P(c_1 < x) = \det\!\left(x^2 \Omega^2 \left((1 - x^2) I + x^2 \Omega^2\right)^{-1}\right)^{p\beta/2} \sum_{k=0}^{nt} \sum_{\kappa \vdash k,\ \kappa_1 \le t} \frac{(p\beta/2)_\kappa^{(\beta)}}{k!}\, C_\kappa^{(\beta)}\!\left((1 - x^2)\left((1 - x^2) I + x^2 \Omega^2\right)^{-1}\right), \tag{3.1}

where the Jack polynomial C_\kappa^{(\beta)} and Pochhammer symbol (\cdot)_\kappa^{(\beta)} are defined in the upcoming section, Preliminaries.

The following section contains preliminaries to the proofs of Theorems 4 and 5 in the general β case. Most important are several propositions concerning Jack polynomials and hypergeometric functions. Proposition 1 was conjectured by Macdonald [47] and proved by Baker and Forrester [4], Proposition 3 is due to Kaneko, in a paper containing many results on Selberg-type integrals [39], and the other propositions are found in [26, pp. 593-596].

3.2 Preliminaries

Definition 8. We define the generalized gamma function to be

\Gamma_n^{(\beta)}(c) = \pi^{n(n-1)\beta/4} \prod_{i=1}^{n} \Gamma\!\left(c - \frac{(i-1)\beta}{2}\right) \quad \text{for } \Re(c) > (n-1)\beta/2.

Definition 9.

K_{m,n}^{(\beta)} = \frac{2^{mn\beta/2}\, \Gamma_n^{(\beta)}(m\beta/2)\, \Gamma_n^{(\beta)}(n\beta/2)}{\pi^{n(n-1)\beta/2}\, \Gamma(\beta/2)^n}.

Definition 10. \Delta(\Lambda) = \prod_{i<j} |\lambda_i - \lambda_j|. If X is a diagonal matrix, \Delta(X) = \prod_{i<j} |X_{i,i} - X_{j,j}|.

As in [14], if \kappa \vdash k, then \kappa = (\kappa_1, \kappa_2, \ldots, \kappa_n) is nonnegative, ordered non-increasingly, and it sums to k. Let \alpha = 2/\beta. Let \rho_\kappa^\alpha = \sum_i \kappa_i(\kappa_i - 1 - (2/\alpha)(i-1)). We define l(\kappa) to be the number of nonzero elements of \kappa. We say that \mu < \kappa in "lexicographic ordering" if for the largest integer j such that \mu_i = \kappa_i for all i < j, we have \mu_j < \kappa_j.

Definition 11. We define the Jack polynomial of a matrix argument, C_\kappa^{(\beta)}(X) (see, for example, [14]), as follows: Let x_1, \ldots, x_n be the eigenvalues of X. C_\kappa^{(\beta)}(X) is the only homogeneous polynomial eigenfunction of the Laplace-Beltrami-type operator

D_n^* = \sum_{i=1}^{n} x_i^2 \frac{\partial^2}{\partial x_i^2} + \beta \sum_{1 \le i \ne j \le n} \frac{x_i^2}{x_i - x_j}\, \frac{\partial}{\partial x_i},

with eigenvalue \rho_\kappa^\alpha + k(n-1), having highest order monomial basis function in lexicographic ordering (see Dumitriu, Edelman, Shuman, Section 2.4) corresponding to \kappa. In addition,

\sum_{\kappa \vdash k,\ l(\kappa) \le n} C_\kappa^{(\beta)}(X) = \mathrm{trace}(X)^k.

Definition 12. We define the generalized Pochhammer symbol to be, for a partition \kappa = (\kappa_1, \ldots, \kappa_n),

(a)_\kappa^{(\beta)} = \prod_{i=1}^{n} \prod_{j=1}^{\kappa_i} \left(a - \frac{(i-1)\beta}{2} + j - 1\right).

Definition 13. As in Koev and Edelman [43], we define the hypergeometric function {}_pF_q^{(\beta)} to be

{}_pF_q^{(\beta)}(a_1, \ldots, a_p; b_1, \ldots, b_q; X, Y) = \sum_{k=0}^{\infty} \sum_{\kappa \vdash k,\ l(\kappa) \le n} \frac{(a_1)_\kappa^{(\beta)} \cdots (a_p)_\kappa^{(\beta)}}{(b_1)_\kappa^{(\beta)} \cdots (b_q)_\kappa^{(\beta)}}\, \frac{C_\kappa^{(\beta)}(X)\, C_\kappa^{(\beta)}(Y)}{k!\, C_\kappa^{(\beta)}(I)}.

The best software available to compute this function numerically is described in Koev and Edelman, mhg, [43]. We write {}_pF_q^{(\beta)}(a; b; X) = {}_pF_q^{(\beta)}(a; b; X, I).

We will also need several theorems from the literature about integrals of Jack polynomials and hypergeometric functions. The first was conjectured by Macdonald [47] and proved by Baker and Forrester ([4], (6.1)) with the wrong constant. The correct constant is found using Special Functions [1, p. 406] (Corollary 8.2.2):

Proposition 1. Let Y be a diagonal matrix. Then

\int_{X > 0} \det(X)^a\, C_\kappa^{(\beta)}(X)\, |\Delta(X)|^\beta\, {}_0F_0^{(\beta)}(-X, Y)\, dX = c^{(\beta)} \left(a + \frac{(n-1)\beta}{2} + 1\right)_\kappa^{(\beta)} \Gamma_n^{(\beta)}\!\left(a + \frac{(n-1)\beta}{2} + 1\right) \det(Y)^{-a - (n-1)\beta/2 - 1}\, C_\kappa^{(\beta)}(Y^{-1}),

where

c^{(\beta)} = \frac{\pi^{-n(n-1)\beta/4}}{n!\, \Gamma(\beta/2 + 1)^n} \prod_{i=1}^{n} \Gamma\!\left(\frac{i\beta}{2} + 1\right).

From [26, p. 593]:

Proposition 2. If X < I is diagonal, {}_1F_0^{(\beta)}(a; ; X) = \det(I - X)^{-a}.

Kaneko, Corollary 2 [39]:

Proposition 3. Let \kappa = (\kappa_1, \ldots, \kappa_n) be nonincreasing and X be diagonal. Let a, b > -1 and \beta > 0. Then

\int_{0 < X < I} C_\kappa^{(\beta)}(X)\, \Delta(X)^\beta \prod_{i=1}^{n} x_i^a (1 - x_i)^b\, dX = C_\kappa^{(\beta)}(I) \prod_{i=1}^{n} \frac{\Gamma\!\left(\frac{i\beta}{2} + 1\right) \Gamma\!\left(\kappa_i + a + \frac{\beta}{2}(n-i) + 1\right) \Gamma\!\left(b + \frac{\beta}{2}(n-i) + 1\right)}{\Gamma\!\left(\frac{\beta}{2} + 1\right) \Gamma\!\left(\kappa_i + a + b + \frac{\beta}{2}(2n-i-1) + 2\right)}.

From [26, p. 595]:

Proposition 4. Let X be diagonal. Then

{}_2F_1^{(\beta)}(a, b; c; X) = {}_2F_1^{(\beta)}\!\left(c - a, b; c; -X(I - X)^{-1}\right) \det(I - X)^{-b} = {}_2F_1^{(\beta)}(c - a, c - b; c; X)\, \det(I - X)^{c - a - b}.

From [26, p. 596]:

Proposition 5. If X is n x n diagonal and a or b is a nonpositive integer,

{}_2F_1^{(\beta)}(a, b; c; X) = \frac{{}_2F_1^{(\beta)}(a, b; c; I)}{{}_2F_1^{(\beta)}\!\left(a, b;\, a + b + 1 + \frac{(n-1)\beta}{2} - c;\, I\right)}\; {}_2F_1^{(\beta)}\!\left(a, b;\, a + b + 1 + \frac{(n-1)\beta}{2} - c;\, I - X\right).

From [26, p. 594]:

Proposition 6.

{}_2F_1^{(\beta)}(a, b; c; I) = \frac{\Gamma_n^{(\beta)}(c)\, \Gamma_n^{(\beta)}(c - a - b)}{\Gamma_n^{(\beta)}(c - a)\, \Gamma_n^{(\beta)}(c - b)}.

3.3 Main Theorems

Proof of Theorem 4. Let m, p \ge n. We will draw Λ ~ P(Λ) = BetaWishart(m, n, β, Ω²), and compute M by drawing M ~ P(M|Λ) = BetaWishart(p, n, β, Λ⁻¹)⁻¹. The distribution of M is \int P(M|\Lambda) P(\Lambda)\, d\Lambda. Then we will compute C by C = (M + I)^{-1/2}. We use the convention that eigenvalues and generalized singular values are unordered.
By the paper [10] (the BetaWishart described in the introduction), we sample the diagonal Λ from

P(\Lambda) = \frac{1}{K_{m,n}^{(\beta)}} \det(\Omega)^{-m\beta} \prod_{i=1}^{n} \lambda_i^{(m-n+1)\beta/2 - 1} \prod_{i<j} |\lambda_i - \lambda_j|^\beta\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Lambda,\, \Omega^{-2}\right) d\lambda.

Likewise, by inverting the answer to the [10] BetaWishart described in the introduction, we can sample diagonal M from

P(M|\Lambda) = \frac{1}{K_{p,n}^{(\beta)}} \det(\Lambda)^{p\beta/2} \prod_{i=1}^{n} \mu_i^{-(p+n-1)\beta/2 - 1} \prod_{i<j} |\mu_i - \mu_j|^\beta\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2} M^{-1},\, \Lambda\right) d\mu.

To get P(M) we need to compute

\frac{\det(\Omega)^{-m\beta}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \prod_{i=1}^{n} \mu_i^{-(p+n-1)\beta/2 - 1} \prod_{i<j} |\mu_i - \mu_j|^\beta \int_{\lambda_1, \ldots, \lambda_n \ge 0} \det(\Lambda)^{p\beta/2} \prod_{i=1}^{n} \lambda_i^{(m-n+1)\beta/2 - 1} \prod_{i<j} |\lambda_i - \lambda_j|^\beta\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Lambda,\, \Omega^{-2}\right) {}_0F_0^{(\beta)}\!\left(-\frac{1}{2} M^{-1},\, \Lambda\right) d\lambda\; d\mu.

Expanding the second hypergeometric function, the inner integral is

\sum_{k=0}^{\infty} \sum_{\kappa \vdash k} \frac{C_\kappa^{(\beta)}\!\left(-\frac{1}{2} M^{-1}\right)}{k!\, C_\kappa^{(\beta)}(I)} \int_{\lambda_1, \ldots, \lambda_n \ge 0} \prod_{i=1}^{n} \lambda_i^{(m+p-n+1)\beta/2 - 1} \prod_{i<j} |\lambda_i - \lambda_j|^\beta\, C_\kappa^{(\beta)}(\Lambda)\, {}_0F_0^{(\beta)}\!\left(-\frac{1}{2}\Lambda,\, \Omega^{-2}\right) d\lambda.

Using Proposition 1 (with Y = \frac{1}{2}\Omega^{-2}), each integral evaluates in terms of \left(\frac{(m+p)\beta}{2}\right)_\kappa^{(\beta)}, \Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right), \det(2\Omega^2)^{(m+p)\beta/2}, and C_\kappa^{(\beta)}(2\Omega^2); by homogeneity the factors of 2 cancel against those in C_\kappa^{(\beta)}(-M^{-1}/2). Cleaning things up and using the definition of the hypergeometric function, this is

P(M) = \frac{K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta} \prod_{i=1}^{n} \mu_i^{-(p+n-1)\beta/2 - 1} \prod_{i<j} |\mu_i - \mu_j|^\beta\; {}_1F_0^{(\beta)}\!\left(\frac{(m+p)\beta}{2};\; ;\; -M^{-1},\, \Omega^2\right) d\mu. \tag{3.2}

Converting to cosine form, C = diag(c_1, \ldots, c_n) = (M + I)^{-1/2}, so that M = C^{-2} - I and -M^{-1} = C^2(C^2 - I)^{-1}; the change of variables \mu_i = (1 - c_i^2)/c_i^2 (together with passing to ordered variables) gives

f(c) = \frac{2^n\, n!\, K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta} \prod_{i=1}^{n} c_i^{(p-n+1)\beta - 1} (1 - c_i^2)^{-(p+n-1)\beta/2 - 1} \prod_{i<j} |c_i^2 - c_j^2|^\beta\; {}_1F_0^{(\beta)}\!\left(\frac{(m+p)\beta}{2};\; ;\; C^2(C^2 - I)^{-1},\, \Omega^2\right) dc. \tag{3.3}

\square

Theorem 6. If we set Ω = I and u_i = c_i^2, then (u_1, \ldots, u_n) obey the standard β-Jacobi density of [46], [40], [28], and [22]:

f(u) = \frac{n!\, K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \prod_{i=1}^{n} u_i^{(p-n+1)\beta/2 - 1} (1 - u_i)^{(m-n+1)\beta/2 - 1} \prod_{i<j} |u_i - u_j|^\beta\, du. \tag{3.4}

Proof. Proposition 2 works from the statement of Theorem 4 because C^2(C^2 - I)^{-1} < I (we know that M > 0 from how it is sampled, so 0 < C^2 = (M + I)^{-1} < I, and likewise C^2(C^2 - I)^{-1} = -C^2(I - C^2)^{-1} < 0 < I). Since I - C^2(C^2 - I)^{-1} = (I - C^2)^{-1},

{}_1F_0^{(\beta)}\!\left(\frac{(m+p)\beta}{2};\; ;\; C^2(C^2 - I)^{-1},\, I\right) = \det\!\left((I - C^2)^{-1}\right)^{-(m+p)\beta/2} = \prod_{i=1}^{n} (1 - c_i^2)^{(m+p)\beta/2},

so the density becomes

\frac{2^n\, n!\, K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \prod_{i=1}^{n} c_i^{(p-n+1)\beta - 1} (1 - c_i^2)^{(m-n+1)\beta/2 - 1} \prod_{i<j} |c_i^2 - c_j^2|^\beta\, dc.
If we substitute u_i = c_i^2, by the change-of-variables theorem we get the desired result. \square

Proof of Theorem 5. Let H = diag(\eta_1, \ldots, \eta_n) = M^{-1}. Changing variables from (3.2) we get

P(H) = \frac{K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta} \prod_{i=1}^{n} \eta_i^{(p-n+1)\beta/2 - 1} \prod_{i<j} |\eta_i - \eta_j|^\beta\; {}_1F_0^{(\beta)}\!\left(\frac{(m+p)\beta}{2};\; ;\; H,\, -\Omega^2\right) d\eta.

Taking the maximum eigenvalue, following [19],

P(H < xI) = \frac{K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta} \int_{0 < H < xI} \prod_{i=1}^{n} \eta_i^{(p-n+1)\beta/2 - 1} \prod_{i<j} |\eta_i - \eta_j|^\beta\; {}_1F_0^{(\beta)}\!\left(\frac{(m+p)\beta}{2};\; ;\; H,\, -\Omega^2\right) d\eta.

Letting N = diag(\nu_1, \ldots, \nu_n) = H/x and changing variables again, we get

P(H < xI) = \frac{K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta}\, x^{np\beta/2} \int_{0 < N < I} \prod_{i=1}^{n} \nu_i^{(p-n+1)\beta/2 - 1} \prod_{i<j} |\nu_i - \nu_j|^\beta\; {}_1F_0^{(\beta)}\!\left(\frac{(m+p)\beta}{2};\; ;\; N,\, -x\Omega^2\right) d\nu.

Expanding the hypergeometric function we get

P(H < xI) = \frac{K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta}\, x^{np\beta/2} \sum_{k=0}^{\infty} \sum_{\kappa \vdash k} \frac{\left(\frac{(m+p)\beta}{2}\right)_\kappa^{(\beta)} C_\kappa^{(\beta)}(-x\Omega^2)}{k!\, C_\kappa^{(\beta)}(I)} \int_{0 < N < I} \prod_{i=1}^{n} \nu_i^{(p-n+1)\beta/2 - 1} \prod_{i<j} |\nu_i - \nu_j|^\beta\, C_\kappa^{(\beta)}(N)\, d\nu. \tag{3.5}

Proposition 3, with a = (p-n+1)\beta/2 - 1 and b = 0, evaluates the inner integral in classical Gamma functions, which repackage into generalized Pochhammer symbols: by Definition 12,

\prod_{i=1}^{n} \frac{\Gamma\!\left(\kappa_i + \frac{\beta}{2}(p+1-i)\right)}{\Gamma\!\left(\frac{\beta}{2}(p+1-i)\right)} = \left(\frac{p\beta}{2}\right)_\kappa^{(\beta)} \quad \text{and} \quad \prod_{i=1}^{n} \frac{\Gamma\!\left(\kappa_i + \frac{\beta}{2}(p+n-i) + 1\right)}{\Gamma\!\left(\frac{\beta}{2}(p+n-i) + 1\right)} = \left(\frac{(p+n-1)\beta}{2} + 1\right)_\kappa^{(\beta)}.
Therefore,

\int_{0 < N < I} \prod_{i=1}^{n} \nu_i^{(p-n+1)\beta/2 - 1} \prod_{i<j} |\nu_i - \nu_j|^\beta\, C_\kappa^{(\beta)}(N)\, d\nu = C_\kappa^{(\beta)}(I)\, \frac{\left(\frac{p\beta}{2}\right)_\kappa^{(\beta)}}{\left(\frac{(p+n-1)\beta}{2} + 1\right)_\kappa^{(\beta)}} \prod_{i=1}^{n} \frac{\Gamma\!\left(\frac{i\beta}{2} + 1\right) \Gamma\!\left(\frac{\beta}{2}(p+1-i)\right) \Gamma\!\left(\frac{\beta}{2}(n-i) + 1\right)}{\Gamma\!\left(\frac{\beta}{2} + 1\right) \Gamma\!\left(\frac{\beta}{2}(p+n-i) + 1\right)}.

Using (3.5) and the definition of the hypergeometric function, we get

P(H < xI) = \frac{K_{m+p,n}^{(\beta)}}{K_{m,n}^{(\beta)} K_{p,n}^{(\beta)}} \det(\Omega)^{p\beta}\, x^{np\beta/2} \left(\prod_{i=1}^{n} \frac{\Gamma\!\left(\frac{i\beta}{2} + 1\right) \Gamma\!\left(\frac{\beta}{2}(p+1-i)\right) \Gamma\!\left(\frac{\beta}{2}(n-i) + 1\right)}{\Gamma\!\left(\frac{\beta}{2} + 1\right) \Gamma\!\left(\frac{\beta}{2}(p+n-i) + 1\right)}\right) {}_2F_1^{(\beta)}\!\left(\frac{(m+p)\beta}{2},\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; -x\Omega^2\right).

Rewriting the constant in terms of generalized Gamma functions and cancelling the classical Gammas against the K's, we get

P(H < xI) = \det(\Omega)^{p\beta}\, \frac{\Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n-1)\beta}{2} + 1\right)}{\Gamma_n^{(\beta)}\!\left(\frac{m\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n+p-1)\beta}{2} + 1\right)}\, x^{np\beta/2}\; {}_2F_1^{(\beta)}\!\left(\frac{(m+p)\beta}{2},\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; -x\Omega^2\right). \tag{3.6}

Now H = M^{-1} and C = (M + I)^{-1/2}, so C < xI is equivalent to H < \frac{x^2}{1-x^2} I, and

P(C < xI) = \det(\Omega)^{p\beta}\, \frac{\Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n-1)\beta}{2} + 1\right)}{\Gamma_n^{(\beta)}\!\left(\frac{m\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n+p-1)\beta}{2} + 1\right)} \left(\frac{x^2}{1-x^2}\right)^{np\beta/2} {}_2F_1^{(\beta)}\!\left(\frac{(m+p)\beta}{2},\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; -\frac{x^2}{1-x^2}\,\Omega^2\right). \tag{3.7}

Remark. Using U = diag(u_1, \ldots, u_n) = C^2 and setting Ω = I, this is

P(U < xI) = \frac{\Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n-1)\beta}{2} + 1\right)}{\Gamma_n^{(\beta)}\!\left(\frac{m\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n+p-1)\beta}{2} + 1\right)} \left(\frac{x}{1-x}\right)^{np\beta/2} {}_2F_1^{(\beta)}\!\left(\frac{(m+p)\beta}{2},\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; -\frac{x}{1-x}\, I\right),

so by using Proposition 4, this is

P(U < xI) = \frac{\Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n-1)\beta}{2} + 1\right)}{\Gamma_n^{(\beta)}\!\left(\frac{m\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n+p-1)\beta}{2} + 1\right)}\, x^{np\beta/2}\; {}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; xI\right),

which is familiar from Dumitriu and Koev [15]. Now back to the proof of Theorem 5.
If we use Proposition 4 on (3.7) we get

P(C < xI) = \det\!\left(x^2 \Omega^2 \left((1-x^2) I + x^2 \Omega^2\right)^{-1}\right)^{p\beta/2} \frac{\Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n-1)\beta}{2} + 1\right)}{\Gamma_n^{(\beta)}\!\left(\frac{m\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n+p-1)\beta}{2} + 1\right)}\; {}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; x^2 \Omega^2 \left((1-x^2) I + x^2 \Omega^2\right)^{-1}\right). \tag{3.8}

Using the approach of Dumitriu and Koev [15], let t = (m-n+1)\beta/2 - 1 \in \mathbb{Z}_{\ge 0}. We can prove that the series truncates: looking at (3.8), the hypergeometric function involves the term

\left(\frac{(n-m-1)\beta}{2} + 1\right)_\kappa^{(\beta)} = (-t)_\kappa^{(\beta)} = \prod_{i=1}^{n} \prod_{j=1}^{\kappa_i} \left(-t - \frac{(i-1)\beta}{2} + j - 1\right),

which contains a zero factor at i = 1, j - 1 = t, so the series truncates when any \kappa has \kappa_1 - 1 \ge t, i.e. \kappa_1 = t + 1. This must happen if k > nt. Thus (3.8) is just a finite polynomial,

P(C < xI) = \det\!\left(x^2 \Omega^2 \left((1-x^2) I + x^2 \Omega^2\right)^{-1}\right)^{p\beta/2} \frac{\Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n-1)\beta}{2} + 1\right)}{\Gamma_n^{(\beta)}\!\left(\frac{m\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n+p-1)\beta}{2} + 1\right)} \sum_{k=0}^{nt} \sum_{\kappa \vdash k,\ \kappa_1 \le t} \frac{\left(\frac{(n-m-1)\beta}{2} + 1\right)_\kappa^{(\beta)} \left(\frac{p\beta}{2}\right)_\kappa^{(\beta)}}{k!\, \left(\frac{(p+n-1)\beta}{2} + 1\right)_\kappa^{(\beta)}}\, C_\kappa^{(\beta)}\!\left(x^2 \Omega^2 \left((1-x^2) I + x^2 \Omega^2\right)^{-1}\right). \tag{3.9}

Let Z be a positive-definite diagonal matrix with Z < I, and c a real with |c| > 0. Define

f(Z, c) = {}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1 - c;\; Z\right).

Using Proposition 5 (applicable since \frac{(n-m-1)\beta}{2} + 1 = -t is a nonpositive integer),

f(Z, c) = \frac{{}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1 - c;\; I\right)}{{}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(n-m-1)\beta}{2} + 1 + c;\; I\right)}\; {}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(n-m-1)\beta}{2} + 1 + c;\; I - Z\right). \tag{3.10}

Using the definition of the hypergeometric function and the fact that the series must truncate (every sum is restricted to \kappa_1 \le t by the factor (-t)_\kappa^{(\beta)}),

{}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(n-m-1)\beta}{2} + 1 + c;\; I - Z\right) = \sum_{k=0}^{nt} \sum_{\kappa \vdash k,\ \kappa_1 \le t} \frac{\left(\frac{(n-m-1)\beta}{2} + 1\right)_\kappa^{(\beta)} \left(\frac{p\beta}{2}\right)_\kappa^{(\beta)}}{k!\, \left(\frac{(n-m-1)\beta}{2} + 1 + c\right)_\kappa^{(\beta)}}\, C_\kappa^{(\beta)}(I - Z). \tag{3.11}

Now the limit is obvious: as c \to 0 the Pochhammer ratio tends to 1 termwise, so

f(Z, 0) = {}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; I\right) \sum_{k=0}^{nt} \sum_{\kappa \vdash k,\ \kappa_1 \le t} \frac{\left(\frac{p\beta}{2}\right)_\kappa^{(\beta)}}{k!}\, C_\kappa^{(\beta)}(I - Z). \tag{3.12}

Plugging this expression into (3.9), with Z = x^2 \Omega^2 ((1-x^2)I + x^2\Omega^2)^{-1} so that I - Z = (1-x^2)((1-x^2)I + x^2\Omega^2)^{-1}, and cancelling via Proposition 6, which gives

{}_2F_1^{(\beta)}\!\left(\frac{(n-m-1)\beta}{2} + 1,\, \frac{p\beta}{2};\; \frac{(p+n-1)\beta}{2} + 1;\; I\right) = \frac{\Gamma_n^{(\beta)}\!\left(\frac{(p+n-1)\beta}{2} + 1\right) \Gamma_n^{(\beta)}\!\left(\frac{m\beta}{2}\right)}{\Gamma_n^{(\beta)}\!\left(\frac{(m+p)\beta}{2}\right) \Gamma_n^{(\beta)}\!\left(\frac{(n-1)\beta}{2} + 1\right)},

exactly the reciprocal of the constant in (3.9), we obtain

P(C < xI) = \det\!\left(x^2 \Omega^2 \left((1-x^2) I + x^2 \Omega^2\right)^{-1}\right)^{p\beta/2} \sum_{k=0}^{nt} \sum_{\kappa \vdash k,\ \kappa_1 \le t} \frac{\left(\frac{p\beta}{2}\right)_\kappa^{(\beta)}}{k!}\, C_\kappa^{(\beta)}\!\left((1-x^2)\left((1-x^2) I + x^2 \Omega^2\right)^{-1}\right). \tag{3.13}

\square
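The BetaMANOVA pseudocode from the introduction, combined with the recursive β-Wishart model of Chapter 2, gives a complete sampler for the generalized singular values. The following self-contained sketch is our own illustration (function names ours, unordered-eigenvalue convention as in the proof); code of this kind generates the empirical cdfs in the next section.

```python
import numpy as np

def chi(dof, rng):
    # chi-distributed scalar: sqrt of a Gamma(dof/2, scale=2) draw
    return np.sqrt(rng.gamma(shape=dof / 2.0, scale=2.0))

def beta_wishart_eig(m, n, beta, D, rng):
    # Eigenvalues (squared singular values) of the m x n beta-Wishart with
    # covariance diag(D), via the recursive model of Chapter 2.
    sv = np.array([chi(m * beta, rng) * np.sqrt(D[0])])
    for k in range(2, n + 1):
        Z = np.zeros((k, k))
        Z[:k - 1, :k - 1] = np.diag(sv)
        Z[:k - 1, k - 1] = np.sqrt(D[k - 1]) * np.array(
            [chi(beta, rng) for _ in range(k - 1)])
        Z[k - 1, k - 1] = np.sqrt(D[k - 1]) * chi((m - k + 1) * beta, rng)
        sv = np.linalg.svd(Z, compute_uv=False)
    return sv ** 2

def beta_manova_gsv(m, n, p, beta, Omega, rng):
    # BetaMANOVA pseudocode: Lambda := BetaWishart(m, n, beta, Omega^2),
    # M := BetaWishart(p, n, beta, Lambda^{-1})^{-1}, C := (M + I)^{-1/2}.
    Lam = beta_wishart_eig(m, n, beta, Omega ** 2, rng)
    M = 1.0 / beta_wishart_eig(p, n, beta, 1.0 / Lam, rng)
    return np.sort(1.0 / np.sqrt(M + 1.0))[::-1]

rng = np.random.default_rng(1)
c = beta_manova_gsv(7, 4, 5, 2.5, np.array([1.0, 2.0, 2.5, 2.7]), rng)
assert c.shape == (4,) and np.all((0 < c) & (c < 1))
```

The empirical cdf of the largest c over many such draws can then be compared against (3.13) evaluated by mhg, as in Figures 3-1 and 3-2 below.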
(3.13) k=0 K~k,KI <t 3.4 Numerical Evidence The plots below are empirical cdf's of the greatest generalized singular value as sam- pled by the BetaMANOVA pseudocode in the introduction (the solid lines) against the Theorem 5 formula for them as calculated by mhg (the o's). 54 CDF of Greatest Generalized Singular Value 1 0.9 0.8F 0.7S0.6 2 0.5E 0.40.30.20.1 - 0.4 0.5 0.8 0.7 0.6 Greatest Generalized Singular Value Figure 3-1: Empirical vs. analytic when m Q = diag(1, 2, 2.5, 2.7). = 0.9 1 7, n = 4, p = 5, /# = 2.5, and CDF of Greatest Generalized Singular Value 0.9 - 0.8 6 0.7- 2 0.6- 2 aa 0.5 E 0.40.3- 0.2 0.1 0.4 .5 0.8 0.7 0.6 Greatest Generalized Singular Value 0.9 1 Figure 3-2: Empirical vs. analytic when m = 9, n = 4, p = 6, /3 = 3, and Q = diag(1, 2,2.5, 2.7). 55 Acknowledgments We acknowledge the support of the National Science Foundation through grants SOLAR Grant No. 1035400, DMS-1035400, and DMS-1016086. Alexander Dubbs was funded by the NSF GRFP. 56 Chapter 4 Infinite Random Matrix Theory, Tridiagonal Bordered Toeplitz Matrices, and the Moment Problem 4.1 Introduction First, we list the Cauchy, R, and S transforms; moments and free cumulants; and Jacobi parameters and orthogonal polynomial sequences; of the four major laws of infinite random matrix theory, the Wigner semicircle law, the Marchenko-Pastur law, the McKay law, and the Wachter law. We discuss special properties of the moments and free cumulants. Then, we present an algorithm that starts with the moments of an analytic, compactly-supported measure and returns a fine discretization of the measure in MATLAB. Moments are converted to Jacobi parameters via the continuous Lanczos iteration, which are then placed in a continued fraction, the imaginary part of which is nearly the original measure, see Theorem 7 and the algorithm before it. 57 4.2 The Jacobi Symmetric Tridiagonal Encoding of Probability Distributions All distributions have corresponding tridiagonal matrices of Jacobi parameters. 
They may be computed, for example, by the continuous Lanczos iteration, described in [64, p.286] and reproduced in Table 4.7. We computed the Jacobi representations of the four laws providing the results in Table 4.1. The Jacobi parameters (ai and 3 .j .) are elements of an for i = 0, 1, 2 ... infinite Toeplitz tridiagonal representations bordered by the first row and column, which may have different values from the Toeplitz part of the matrix. a0 10 a1 31 ,31 a #1 00 =31 o 01 a1 3 f 1 ao = al ao 7 al Wigner semicircle McKay Marchenko-Pastur Wachter ai ao an, (n ; 1) 0 f3, (n ; 1) Wigner Semicircle 0 0 1 1 Marchenko-Pastur A A+ 1 V McKay 0 0 Measure a a+b V/f ab a2 - a+ab+b 2 (a + b)3/ 2 (a + b) V V/o -I Fab(a+ b -1) (a + b)2 Table 4.1: Jacobi parameter encodings for the big level density laws. Upper left: Symmetric Toeplitz Tridiagonal with 1-boundary , Upper Right: Laws Organized by Toeplitz Property, Below: Specific Parameter Values Ansehlovich [3] provides a complete table of six distributions that have Toeplitz 58 Jacobi structure. The first three of which are semicircle, Marchenko-Pastur, and Wachter. The other three distributions occupy the same box as Wachter in Table 4.1. Anshelovich casts the problem as the description of all distributions whose orthogonal polynomials have generating functions of the form 00 1 1 (X) z -1 - xu(z) + tv(z)' n=O which he calls Free Meixner distributions. He includes the one and two atom forms of the Marchenko-Pastur and Wachter laws which correspond in random matrix theory to the choices of tall-and-skinny vs. short-and-fat matrices in the SVD or CS decompositions, respectively. 4.3 Infinite RMT Laws. This section compares the properties of all four major infinite random matrix theory laws, the Wigner semicircle law, the Marchenko-Pastur law, the McKay law, and the Wachter law. Their Cauchy transforms are below. 
Measure Cauchy Transform 2 z - dz -_4 Wigner Semicircle 2 (1 - A+ z) 2 - 4z 2z Marchenko-Pastur - A+z - McKay (v- 2)z -v Wachter 2(v2 4(1 -v) + z 2 - z2 ) 1 - a + (a + b - 2)z - y'(a+ 1 - (a+ b)z) 2 - 4a(1 - z) 2z(1 - z) Table 4.2: Cauchy transforms . 59 We can also write down the moments for each measure in Table 4.3, for Wigner and Marchenko-Pastur see [18], for McKay see [50], and for Wachter see Theorem 6.1 in the Section 6. Remember the Catalan number C = n1 polynomial Nn(r) = 1 Nn,jrj, where Nn, = ( I() n (2n) and the Narayana ), excepting No(r) = 1. The coefficients of v3 (1 - v)n/ 2-j in the McKay moments form the Catalan triangle. We discuss the pyramid created by the Wachter moments in Section 4.4. Moment n Measure Wigner Semicircle Cn/2 if n is even, 0 otherwise Marchenko-Pastur Nn(A) McKay Wachter n ( E n/2 j=+ a a+b- ( n- j b n-2 [ (a + b) 1: . vj(v - - 1)n/2-j a(a+b- 1) if n is even, 0 otherwise 2j+4 a+bNj+1 b aa+b-1 Table 4.3: Moments Inverting the Cauchy transforms and subtracting 1/w, computes the R-transform, see Table 4.4. If there are multiple roots, we pick one with a series expansion with no pole at w = 0. The free cumulants rn for each measure appear in Table 4.5 by expnding the Rtransform above (the generating function for the Narayana polynomials is given by [48], the generating function for the Catalan numbers is well known). It is widely known that the Catalan numbers are the moments of the semicircle law, but we have not seen any mention that the same numbers figure prominently as 60 Measure R-transform S-transform w 1 1-w z Wigner Semicircle Marchenko-Pastur -v+v McKay Mc~ay V+ -a-b+ w-+F Wachter 1+4w2 v V2 V -z ' +422 2w 2 (a+b)2 + 2(a-b)w+w a-az-bz 2w z2(z- 1) Table 4.4: R-transforms and S-transforms computed as S(z) R-l(z)/z the free cumulants of the McKay Law. The Narayana Polynomials are prominent as the moments of the Marchenko-Pastur Law, but they also figure clearly as the free cumulants of the Wachter Law. 
There are well known relationships, involving Catalan numbers, between the moments and free cumulants of any law [53], but we do not know if the pattern is general enough to take the moments of one law, transform it somewhat, and have them show up in the free cumulants in another law. We compute an S-transform as S(z) = R- 1 (z)/z. See Table 4.4. Each measure has a corresponding three-term recurrence for its orthonormal polynomial basis, with q_1 ((x - an)qn(x) - (x) = 0, qo(x) = 1, 3_1 = 0, and for n > 0, qni+(x) = /3_1qn_(x))/. In the case of the Wigner semicircle, Marchenko- Pastur, McKay, and Wachter laws, the Jacobi parameters an and f3 are constant for n > 1 because they are all versions of the Meixner law [3] (a linear transformation may be needed). The Wigner Semicircle case is given by simplifying the Meixner law in [2], and the Marchenko-Pastur, McKay, and Wachter cases are given by taking two iterations Lanczos algorithm symbolically to get a 1 and #1. See Table 4.1. Each measure also has an infinite sequence of monic polynomials qn(x) which are 61 Measure rn Wigner Semicircle 6n,2 Marchenko-Pastur A (-1)(n- 1 )/ 2vC(n-l)/ McKay if n is odd, 0 otherwise 2 -Nn (b) Wachter ( a)n+ a (a +b)2n+1 Table 4.5: Free cumulants. qn(x), Measure Wigner Semicircle A(n- 1 )/ 2 (x - A)Un Wachter (v - 1)(n- 1)/ 2 xUn (x - ab) - An/ 2 Un- 1 (xA 1 (2 - v(v ab(a+b-1) Un- a~b(a~) a+b a-b-1 ( ab(a+b-1) (a+b) 2 1. (') Un Marchenko-Pastur McKay > U 2 (XA1) 1)(n- 2 )/ 2 Un-2 1 ( -b-a(a+b-l)+(a+b)2 2V/ab(a+b--1) x -b-a(a+b-1)+(a+b)2X 2Vab(a+b-1) n-2 Table 4.6: Sequences of polynomials orthogonal over of the four major laws. orthogonal with respect to that measure. They can be written as sums of Chebyshev polynomials of the second kind, Un(x), which satisfy U 62 1 = 0, Uo(x) = 1, and Un(x) = 2xU,_ 1 (x) - Un- 2 (x) for n > 1, [35]. See Table 4.6. For n = 0, qo(x) = 1, and in general for n > 1, qn(x) = 3'--(x - ao)Un-1 ((x - a1)/(2/1)) - /32,3n- 2 Un- 2 ((X - a1)/(21)). 
In the Wigner semicircle case the polynomials can be combined using the recursion rule for Chebyshev polynomials. 4.4 The Wachter Law Moment Pyramid. Using Mathematica we can extract an interesting number pyramid from the Wachter moments, see Figure 4-1. Each triangle in the pyramid is formed by taking the coefficients of a and b in the i-th Wachter moment, with the row number within the pyramid determined by the degree of the corresponding monomial in a and b. All factors of (a + b) are removed from the numerator and denominator beforehand and alternating signs are ignored. Furtheremore, there are many patterns within the pyramid. The top row of each triangle is a list of Narayana numbers, which sum to Catalan numbers. The bottom entries of each pyramid are triangular numbers. The second-to-bottom entry on the right of every pyramid is a sum of consecutive triangular numbers. The second to both the left and right on the top row of every triangle are also triangular numbers. 4.5 Moments Build Nearly-Toeplitz Jacobi Matrices. This section is concerned with recovering a probability distribution from its Jacobi parameters, ac and #3 such that they are "Nearly Toeplitz," i.e. there exists a k such that for i > k all ai are equal and all /3# are equal. Note that i ranges from k to oc. The Jacobi parameters are found from a distribution by the Lanczos iteration. 
We now state the continuous Lanczos iteration, replacing the matrix A by the 63 I 1 1 3 1 1 3 4 6 6 6 1 1 6 10 20 5 10 20 10 1 10 60 6 45 15 75 50 1 10 20 15 20 50 15 15 1 15 50 84 210 21 315 189 1 50 140 21 105 175 210 105 7 35 35 21 1 21 28 105 175 105 700 280 560 392 1176 588 56 1176 70 490 490 980 196 490 196 1 21 8 140 28 56 28 1 36 28 490 196 490 28 196 216 2520 1260 1890 504 36 720 5040 3360 2520 336 1344 84 4704 4704 2176 1512 3528 1764 1008 1176 1 9 126 126 84 336 36 1 36 336 1176 336 1176 1764 36 1 315 10 2520 7350 8820 840 4410 45 45 1215 8100 18900 17010 5670 540 120 2700 14400 25200 15120 2520 210 3780 15120 17640 5292 2520 252 3402 9072 5292 1890 120 540 210 45 Figure 4-1: A number pyramid from the coefficients of the Wachter law moments. 64 variable x and using p =pws,IptMP, ptM, pw to compute dot products. A good source is [64]. For a given measure p on an interval I, let (p(x), q(x)) = p(x)q()dy, and ||p(x)IH = V(p(x),p(x)). Then the Lanczos iteration is described by Table 4.7. Lanczos on Measure p /3 -i = 0, q_ 1 (x) = 0, qo(x) = 1 for n= 0,1,2, ... do v(x) = xq,(x) an = (qn(x), v(x)) v(x) = v(x) - #2n-1qn-i(x) - anqn(x) f3 = 11v(x)1| qn+1(x) = VW #n end for Table 4.7: The Lanczos iteration produces the Jacobi parameters in a and /3. There are two ways to compute the integrals numerically. The first is to sample x and qa(x) at many points on the interval of support for qo(x) = 1 and discretize the integrals on that grid. The second can be done if you know the moments of 1t. If r(x) and s(x) are polynomials, (r(x), s(x)) can be computed given p's moments. Since the qn(x) are polynomials, every integral in the Lanczos iteration can be done in this way. In that case, the qn(x) are stored by their coefficients of powers of x instead of on a grid. Once we have reached k iterations, we have fully constructed the infinite Jacobi matrix using the first batch of f's moments, or a discretization of P. 
Step 1 can start with a general measure in which case Step 3 finds an approximate measure with a nearly Toeplitz representation. Step 1 could also start with a sequence of moments. It should be noted that the standard way to go from moments to Lanczos coefficients uses a Hankel matrix of moments and its Cholesky factorization ((30], (4.3)). As an example, we apply the algorithm to the histogram of the eigenvalues of (X/ II + pI)'(X/Vm + pI), where X is m x n, which has Jacobi parameters ai 65 Algorithm: Compute Measure from Nearly Toeplitz Jacobi Matrix. 1. Nearly Jacobi Toeplitz Representation: Run the continuous Lanczos algorithm up to step k, after which all ai are equal and all 13i are equal, or very nearly so. If they are equal, this algorithm will recover dp exactly, otherwise it will find it approximately. The Lanczos algorithm may be run using a discretization of the measure y, or its initial moments. (ao:oo, /3o:oo) = Lanczos (dp(x)). 2. Cauchy transform: evaluate the finite continued fraction below on the interval of x where it is imaginary. 1 g(x) 2 x - ao - /32 -1 Xk- al 2 k-2 3. Inverse Cauchy Transform: divide the imaginary part by -wr, to compute the desired measure. 1 dp(x) -- Im (g(x)). Table 4.8: Algorithm recovering or approximating an analytic measure by Toeplitz matrices with boundary. and 13i that converge asymptotically and quickly. We smooth the histogram using a Gaussian kernel and then compute its Jacobi parameters. The reconstruction of the histogram is in Figure 4-3 We also use the above algorithm to reconstruct a normal distribution from its first sixty moments, see Figure 4-4. The following theorem concerning continued fractions allows one to stably recover a distribution from its Lanczos coefficients ac and fi. As we have said, if the first batch of p's moments are known, we can find all ai and /J from i = 0 to 00 using the continuous Lanczos iteration. Theorem 7. 
Let p be a measure on interval I C R with Lanczos coefficients ai and 03, with the property that all ai are equal for i > k and all f3 are equal for i > k. We can recover I = [ak - 2 ,3 k, ack + 20k], and we can recover dp(x) using a continued 66 fraction. This theorem combines Theorems 1.97 and 1.102 of [32]. 1 g(x) x - a1 - ____________ -ak dpu(x) 1 -- Im (g(x)). Ir Figure 4-2 illustrates curves recovered from random terminating continued fractions g(x) such that the f3# are positive and greater in magnitude than the aj. In both cases, the above theorem allows correct recovery of the ai and fj (which is not always numerically possible). In the first one, k = 5, in the second, k = 3. If X is an m x n, m < n matrix of normals for m and n very large, (X/inI pI)(X// m + pI) has ai and + /J which converge to a constant, making its eigenvalue distribution recoverable up to a very small approximation. See Figure 4-3 We also tried to reconstruct the normal distribution, whose Jacobi parameterization is not at all Toeplitz, and which is not compactly supported. Figure 4-4 plots the approximations using 10 and 20 moments. 4.6 Direct computation of the Wachter law moments. While the moments of the Wachter law may be obtained in a number of ways, including expanding the cauchy transform, or applying the mobius inverse formula to the free cumulants, in this section we show that a direct computation of the integral is possible. 67 0.21 0.180.160.140.120.10.080.060.040.020 -6 8 6 4 2 0 -2 -4 x 0.18 0.160.140.12> 0.1- C V 0.080.060.040.020 -8 -6 -4 -2 2 0 4 6 8 10 x Figure 4-2: Recovery from of a distribution from random ao and /3 using Theorem 5.1. On top we use k = 5, on bottom we use k = 3. 68 140 r 120- 100- 80- 60- 40- 20- 0 10 15 20 30 25 35 40 45 x Figure 4-3: Eigenvalues taken from (X/V/~r + pI)(X/#/i+ JI), where X is m x n, 4 = 10 , n = 3m, p = 5. The blue bar histogram is taken using hist.m, a better one was taken by convolving the data with Gaussian kernel. 
That convolution histogram was used to initialize the continuous Lanczos algorithm, which produced five α's and β's. They were put into a continued fraction as described above, assuming α_i and β_i to be constant after i = 5. The continued fraction recreated the histogram, shown as the thick red line.

Figure 4-4: The normal distribution's Jacobi matrix is not well approximated by Toeplitz-plus-boundary, but with sufficiently many moments good approximations are possible. The graph shows the normal distribution recovered by the method in this paper using 10 and 20 moments. The thick line is the normal density e^{−x²/2}/√(2π), and the thin lines on top of it use our algorithm.

Theorem 8. We find the moments of the Wachter law, m_k:

m_k = a/(a+b) − (a+b) Σ_{j=0}^{k−2} (√(a(a+b−1))/(a+b))^{2j+4} N_{j+1}(b/(a(a+b−1))),

where N_n(λ) = Σ_{i=1}^{n} (1/n) C(n,i) C(n,i−1) λ^i is the Narayana polynomial.

Proof. We start by integrating the following expression by comparing it to the Marchenko–Pastur law:

J_1 = (1/(2π)) ∫_{μ_−}^{μ_+} x^k √((μ_+ − x)(x − μ_−)) dx,

where μ_± = ((√(a(a+b−1)) ± √b)/(a+b))² are the edges of the Wachter law's support. If x = su, dx = s du, and this integral becomes

J_1 = (s^{k+2}/(2π)) ∫ u^k √((μ_+/s − u)(u − μ_−/s)) du.

To compare this expression to the Marchenko–Pastur law, we need to pick s and λ such that μ_+/s = (1 + √λ)² and μ_−/s = (1 − √λ)². There is more than one choice of each parameter, but we pick

√s = (√μ_+ + √μ_−)/2 = √(a(a+b−1))/(a+b) and √λ = (√μ_+ − √μ_−)/(√μ_+ + √μ_−) = √(b/(a(a+b−1))).

Using the Narayana numbers and the formula for the moments of the Marchenko–Pastur law, m_k^{MP}(λ) = Σ_{j=0}^{k−1} N(k, j+1) λ^j, the integral equals

J_1 = s^{k+2} λ m_{k+1}^{MP}(λ) = s^{k+2} N_{k+1}(λ).

Using a and b, this becomes

J_1 = (√(a(a+b−1))/(a+b))^{2k+4} N_{k+1}(b/(a(a+b−1))).

We also need to integrate

J_2 = (1/(2π)) ∫_{μ_−}^{μ_+} √((μ_+ − x)(x − μ_−))/(1 − x) dx.

Let su = 1 − x, s du = −dx; this becomes an integral of the same Marchenko–Pastur form, with 1/u in place of 1/x. Using the same technique as previously,

√s = (√(1 − μ_−) − √(1 − μ_+))/2 = √a/(a+b),

and using the fact that the Marchenko–Pastur law is normalized, the answer is

J_2 = s = a/(a+b)².
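As a numerical sanity check of Theorem 8, the closed form for m_k can be compared against direct quadrature of the Wachter density. The Python sketch below writes out the density (a+b)√((λ₊ − x)(x − λ₋))/(2πx(1 − x)) and the edges λ± = ((√(a(a+b−1)) ± √b)/(a+b))² explicitly; these should be read as assumptions restated to match the μ± used in the proof, and the helper names are ours.

```python
import numpy as np
from math import comb, sqrt, pi

def wachter_moment_quad(k, a, b, n=200_000):
    """k-th moment of the Wachter law W(a, b) by trapezoidal quadrature."""
    # Support edges (assumption: as in the proof of Theorem 8).
    lp = ((sqrt(a * (a + b - 1)) + sqrt(b)) / (a + b)) ** 2
    lm = ((sqrt(a * (a + b - 1)) - sqrt(b)) / (a + b)) ** 2
    x = np.linspace(lm, lp, n + 1)[1:-1]  # drop endpoints to avoid 0/0
    rho = (a + b) * np.sqrt((lp - x) * (x - lm)) / (2 * pi * x * (1 - x))
    y = x ** k * rho
    # Trapezoidal rule (written out to avoid NumPy version differences).
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))

def narayana_poly(m, lam):
    """Narayana polynomial N_m(lam) = sum_i N(m, i) lam^i, N(m, i) = C(m,i)C(m,i-1)/m."""
    return sum(comb(m, i) * comb(m, i - 1) / m * lam ** i for i in range(1, m + 1))

def wachter_moment_formula(k, a, b):
    """Closed form of Theorem 8 for m_k."""
    base = sqrt(a * (a + b - 1)) / (a + b)
    lam = b / (a * (a + b - 1))
    tail = sum(base ** (2 * j + 4) * narayana_poly(j + 1, lam) for j in range(k - 1))
    return a / (a + b) - (a + b) * tail
```

For example, with a = 2 and b = 3 the formula gives m_1 = 2/5 and m_2 = 26/125, and the first few moments agree with the quadrature values to the accuracy of the discretization.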
Using the identity x^k/(x(1 − x)) = x^{k−1}/(1 − x) = 1/(1 − x) − (1 + x + ⋯ + x^{k−2}), a finite form of the geometric series formula,

m_k = ((a+b)/(2π)) ∫_{μ_−}^{μ_+} (x^{k−1}/(1 − x)) √((μ_+ − x)(x − μ_−)) dx

    = (a+b) J_2 − (a+b) Σ_{j=0}^{k−2} (1/(2π)) ∫_{μ_−}^{μ_+} x^j √((μ_+ − x)(x − μ_−)) dx

    = a/(a+b) − (a+b) Σ_{j=0}^{k−2} (√(a(a+b−1))/(a+b))^{2j+4} N_{j+1}(b/(a(a+b−1))),

since (a+b) J_2 = a/(a+b) and the remaining integrals are the evaluation of J_1 with k replaced by j. ∎

4.7 Acknowledgements

We would like to thank Michael LaCroix, Plamen Koev, Sheehan Olver, and Bernie Wang for interesting discussions. We gratefully acknowledge the support of the National Science Foundation: DMS-1312831, DMS-1016125, DMS-1016086.

Bibliography

[1] George E. Andrews, Richard Askey, and Ranjan Roy, Special Functions, Cambridge University Press, 1999.

[2] Michael Anshelevich and Wojciech Młotkowski, "The free Meixner class for pairs of measures," arXiv, 2011, http://arxiv.org/abs/1003.4025

[3] Michael Anshelevich, "Bochner-Pearson-type characterization of the free Meixner class," Adv. in Appl. Math. 46 (2011), 25-45.

[4] T. H. Baker and P. J. Forrester, "The Calogero-Sutherland model and generalized classical polynomials," Communications in Mathematical Physics 188 (1997), no. 1, 175-216.

[5] A. Bekker and J. J. J. Roux, "Bayesian multivariate normal analysis with a Wishart prior," Communications in Statistics - Theory and Methods, 24:10, 2485-2497.

[6] James R. Bunch and Christopher P. Nielson, "Updating the Singular Value Decomposition," Numerische Mathematik, 31, 111-129, 1978.

[7] Mireille Capitaine and Muriel Casalis, "Asymptotic freeness by generalized moments for Gaussian and Wishart matrices. Application to beta random matrices," Indiana University Mathematics Journal, 53 (2004), no. 2, 397-432.

[8] Djalil Chafai, "Singular Values of Random Matrices," notes available online.

[9] Percy Deift, Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach, Courant Lecture Notes in Mathematics, 1998.
[10] Alexander Dubbs, Alan Edelman, Plamen Koev, and Praveen Venkataramana, "The Beta-Wishart Ensemble," 2013, http://arxiv.org/abs/1305.3561

[11] Alexander Dubbs and Alan Edelman, "The Beta-MANOVA Ensemble with General Covariance," Random Matrices: Theory and Applications, Vol. 3, No. 1.

[12] Ioana Dumitriu and Alan Edelman, "Matrix Models for Beta Ensembles," Journal of Mathematical Physics, Volume 43, Number 11, November 2002.

[13] Ioana Dumitriu, "Eigenvalue Statistics for Beta-Ensembles," Ph.D. Thesis, MIT, 2003.

[14] Ioana Dumitriu, Alan Edelman, and Gene Shuman, "MOPS: Multivariate orthogonal polynomials (symbolically)," Journal of Symbolic Computation, 42, 2007.

[15] Ioana Dumitriu and Plamen Koev, "Distributions of the extreme eigenvalues of Beta-Jacobi random matrices," SIAM Journal on Matrix Analysis and Applications, Volume 30, Number 1, 2008.

[16] Freeman J. Dyson, "The Threefold Way. Algebraic Structure of Symmetry Groups and Ensembles in Quantum Mechanics," Journal of Mathematical Physics, Volume 3, Issue 6.

[17] Alan Edelman, "The Random Matrix Technique of Ghosts and Shadows," Markov Processes and Related Fields, 16, 2010, No. 4, 783-790.

[18] Alan Edelman, Random Matrix Theory, in preparation.

[19] Alan Edelman and Plamen Koev, "Eigenvalue distributions of beta-Wishart matrices," unpublished, available at: math.mit.edu/~plamen/files/mvs.pdf

[20] Alan Edelman and N. Raj Rao, "Random matrix theory," Acta Numerica, 2005.

[21] Alan Edelman and Brian D. Sutton, "The Beta-Jacobi Matrix Model, the CS Decomposition, and Generalized Singular Value Problems," Foundations of Computational Mathematics, 259-285 (2008).

[22] Alan Edelman and Brian Sutton, "The Beta-Jacobi Matrix Model, the CS decomposition, and generalized singular value problems," Foundations of Computational Mathematics, 2007.

[23] I. G. Evans, "Bayesian Estimation of Parameters of a Multivariate Normal Distribution," Journal of the Royal Statistical Society, Series B (Methodological), Vol.
27, No. 2 (1965), pp. 279-283.

[24] R. A. Fisher, "The sampling distribution of some statistics obtained from nonlinear equations," Ann. Eugenics 9, 238-249, 1939.

[25] Peter Forrester, "Exact results and universal asymptotics in the Laguerre random matrix ensemble," J. Math. Phys. 35 (1994).

[26] Peter Forrester, Log-gases and random matrices, Princeton University Press, 2010.

[27] Peter Forrester, "Probability densities and distributions for spiked Wishart β-ensembles," arXiv:1101.2261v1 (2011).

[28] Peter J. Forrester and Eric M. Rains, "Interpretations of some parameter dependent generalizations of classical matrix ensembles," Probability Theory and Related Fields, Volume 131, Issue 1, pp. 1-61, January 2005.

[29] M. A. Girshick, "On the sampling theory of roots of determinantal equations," Ann. Math. Statist., 10, 203-224 (1939).

[30] G. H. Golub and J. A. Welsch, "Calculation of Gauss Quadrature Rules," Math. Comp. 23 (1969), 221.

[31] Ming Gu and Stanley Eisenstat, "A stable and fast algorithm for updating the singular value decomposition," Research Report YALEU/DCS/RR9-66, Yale University, New Haven, CT, 1994.

[32] Akihito Hora and Nobuaki Obata, Quantum Probability and Spectral Analysis of Graphs, Theoretical and Mathematical Physics, Springer, Berlin Heidelberg, 2007.

[33] P. L. Hsu, "On the distribution of roots of certain determinantal equations," Ann. Eugenics 9, 250-258, 1939.

[34] Suk-Geun Hwang, "Cauchy's Interlace Theorem for Eigenvalues of Hermitian Matrices," The American Mathematical Monthly, Vol. 111, No. 2 (Feb. 2004), pp. 157-159.

[35] Timothy Kusalik, James A. Mingo, and Roland Speicher, "Orthogonal Polynomials and Fluctuations of Random Matrices" (2005), on arXiv.

[36] Alan T. James, "The distribution of the latent roots of the covariance matrix," Ann. Math. Statist., Volume 31, 151-158 (1960).

[37] Alan T. James, "Distributions of matrix variates and latent roots derived from normal samples," Ann. Math. Statist.
Volume 35, Number 2 (1964).

[38] Kurt Johansson and Eric Nordenstam, "Eigenvalues of GUE minors," Electronic Journal of Probability, Vol. 11 (2006), pp. 1342-1371.

[39] Jyoichi Kaneko, "Selberg integrals and hypergeometric functions associated with Jack polynomials," SIAM Journal on Mathematical Analysis, Volume 24, Issue 4, July 1993.

[40] Rowan Killip and Irina Nenciu, "Matrix models for circular ensembles," International Mathematics Research Notices, Volume 2004, Issue 50, pp. 2665-2701.

[41] Plamen Koev, "Computing Multivariate Statistics," online notes at http://math.mit.edu/~plamen/files/mvs.pdf

[42] Plamen Koev's web page: http://www-math.mit.edu/~plamen/software/mhgref.html

[43] Plamen Koev and Alan Edelman, "The Efficient Evaluation of the Hypergeometric Function of a Matrix Argument," Mathematics of Computation, Volume 75, Number 254, pp. 833-846, January 19, 2006. Code available at http://www-math.mit.edu/~plamen/software/mhgref.html

[44] Fei Li and Yifeng Xue, "Zonal polynomials and hypergeometric functions of quaternion matrix argument," Communications in Statistics: Theory and Methods, Volume 38, Number 8, January 2009.

[45] Fei Li and Yifeng Xue, "Zonal polynomials and hypergeometric functions of quaternion matrix argument," Communications in Statistics: Theory and Methods, Volume 38, Number 8, January 2009.

[46] Ross Lippert, "A matrix model for the β-Jacobi ensemble," Journal of Mathematical Physics 44(10), 2003.

[47] I. G. Macdonald, "Hypergeometric functions," unpublished manuscript.

[48] T. Mansour and Y. Sun, "Identities involving Narayana polynomials and Catalan numbers," Disc. Math., 309:4079-4088, 2009.

[49] V. A. Marchenko and L. A. Pastur, "Distribution of the eigenvalues in certain sets of random matrices," Matematicheskii Sbornik, 72 (114), 1967.

[50] Brendan D. McKay, "The Expected Eigenvalue Distribution of a Large Regular Graph," Linear Algebra and its Applications, 40:203-216 (1981).

[51] Mood, A.
M., "On the distribution of the characteristic roots of normal second-moment matrices," Ann. Math. Statist., 22, 266-273, 1951.

[52] Robb J. Muirhead, Aspects of Multivariate Statistical Theory, Wiley-Interscience, 1982.

[53] Alexandru Nica and Roland Speicher, Lectures on the Combinatorics of Free Probability, Cambridge University Press, 2006.

[54] A. Okounkov and G. Olshanski, "Shifted Jack Polynomials, Binomial Formula, and Applications," Mathematical Research Letters 4, 69-78 (1997).

[55] I. Olkin and S. N. Roy, "On Multivariate Distribution Theory," Ann. Math. Statist., Volume 25, Number 2 (1954), 329-339.

[56] S. Olver and R. Nadakuditi, "Numerical computation of convolutions in free probability theory," preprint, arXiv:1203.1958.

[57] B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM Classics in Applied Mathematics, 1998.

[58] Victor Perez-Abreu and Noriyoshi Sakuma, "Free infinite divisibility and free multiplicative mixtures of the Wigner distribution," Comunicaciones del CIMAT, No. 1-09-07/15-10-2009.

[59] N. Raj Rao and Alan Edelman, "The Polynomial Method for Random Matrices," Foundations of Computational Mathematics, 2007.

[60] T. Ratnarajah, R. Vaillancourt, and M. Alvo, "Complex Random Matrices and Applications," CRM-2938, January 2004.

[61] S. N. Roy, "p-Statistics and some generalizations in analysis of variance appropriate to multivariate problems," Sankhya 4, 381-396, 1939.

[62] Luis Santalo, Integral Geometry and Geometric Probability, Addison-Wesley Publishing Company, Inc., 1976.

[63] Richard P. Stanley, "Some combinatorial properties of Jack symmetric functions," Adv. Math. 77, 1989.

[64] Lloyd N. Trefethen and David Bau, III, Numerical Linear Algebra, SIAM, 1997.

[65] Kenneth W. Wachter, "The strong limits of random matrix spectra for sample matrices of independent elements," Annals of Probability, 6, 1978.

[66] Eugene P. Wigner, "Characteristic vectors of bordered matrices with infinite dimensions," Annals of Mathematics, Vol. 62, 1955.
[67] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Oxford University Press, 1999.