Eigenvectors of Random Graphs: Delocalization and Nodal Domains

Sanjeev Arora and Aditya Bhaskara
Department of Computer Science, Princeton University, Princeton NJ 08540
{arora, bhaskara}@cs.princeton.edu

Abstract. We study properties of the eigenvectors of adjacency matrices of $G(n, p)$ random graphs, for $p = \omega(\log n)/n$. This connects to similar investigations for other random matrix models studied in physics and mathematics. Motivated by the recent paper of Dekel, Lee and Linial, we study delocalization properties of eigenvectors and their connection to nodal domains. We show the following for an eigenvector $x$ (normalized so that $\|x\|_2 = 1$):

1. For any $S \subseteq [n]$ with $|S| = \delta n$, we have
$$\sum_{i \in S} x_i^2 \ \ge\ \frac{\delta^4 p^6}{\log^2(1/p\delta)} \quad \text{w.h.p.}$$
A similar statement was proved for $\delta > 1/2$ by Dekel, Lee and Linial.
2. Let $p > n^{-1/20}$. Then $x$ has exactly two nodal domains whp. (i.e., the subgraph of vertices with $x_i \ge 0$ is connected, and so is the subgraph of vertices with $x_i \le 0$). Previously such a statement was not known even for $p = 1/2$, unless one is allowed to discard $O(1/p)$ "exceptional" vertices of the graph.

Our techniques involve using Wigner's semicircle law on "short scales", an idea previously used in mathematical physics by Erdős, Schlein and Yau.

1 Introduction

Eigenvalues and eigenvectors of random matrices are a topic that spans multiple disciplines of science, and is related to topics as diverse as the zeros of the Riemann zeta function, properties of Hamiltonians of atomic nuclei, and conductance properties of metallic lattices. The seminal work of Wigner [1] studied matrices whose entries are i.i.d. (standard) Gaussians, and identified the distribution of eigenvalues, which he quantified in the celebrated semicircle law. Our knowledge of the finer distribution of eigenvalues has grown in the decades since, and extensions of these results to other random matrix models have been found. In this paper we are interested in adjacency matrices of random graphs in the Erdős–Rényi model $G(n, p)$, wherein each edge is picked independently with probability $p$. The eigenvalue results in the Wigner tradition have also been proved for this model (see [2]).

Establishing properties of eigenvectors has proved much harder than for eigenvalues in most models. Hints from a variety of disciplines suggest that eigenvectors of random matrices should look like random vectors in $\mathbb{R}^n$. A natural conjecture is that the entries are distributed like i.i.d. Gaussians. This is easy to prove via a symmetry argument when the random matrix has i.i.d. Gaussian entries (note that this distribution is invariant under rotation), and with some effort weaker results are provable for the case of Gaussian Orthogonal Ensembles (we refer to the excellent survey [3]). Recently Tao and Vu [4] and Erdős, Schlein and Yau [5] have proved strong theorems about distributions of eigenvectors for more general distributions.

When a full characterization of eigenvector properties eludes us, we may try to prove delocalization properties of eigenvectors as a first step. Delocalization says that the "mass distribution" over the coordinates of an eigenvector is not too lumpy. If the eigenvector entries were truly Gaussian, one would expect entries to be about $1/\sqrt{n}$ in magnitude, so the following is a small list of properties one may expect to hold with high probability (throughout the paper, "whp." will mean with probability bigger than $1 - n^{-c}$ for a large constant $c$); a quick numerical probe of these quantities follows the list:

1. $\|x\|_4 \le \frac{\log n}{n^{1/4}}$. (Posed by T. Spencer, see [3].)
2. $\|x\|_\infty \le \frac{\log n}{\sqrt{n}}$. (Stronger than the previous statement.)
3. $|x \cdot u| \le \frac{\log n}{\sqrt{n}}$ w.h.p. for every unit vector $u$.
4. Say $S \subseteq [n]$ and $|S| = \delta n$. Then $\sum_{i \in S} x_i^2$ is bounded from above and below by 'reasonable' functions of $\delta$ w.h.p.
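None of these statements are established here in full strength, but they are easy to probe numerically. The following is a minimal sketch (ours, not from the paper) using numpy; the parameters $n$, $p$ and the choice of a "bulk" eigenvector are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 0.1

# Sample the adjacency matrix of G(n, p): symmetric 0/1 entries, zero diagonal.
upper = np.triu((rng.random((n, n)) < p).astype(float), 1)
A = upper + upper.T

# eigh returns eigenvalues in increasing order; columns of V are unit eigenvectors.
eigvals, V = np.linalg.eigh(A)
x = V[:, n // 2]                 # an eigenvector from the bulk of the spectrum

print("||x||_4   =", np.sum(x**4) ** 0.25, " vs log(n)/n^(1/4) =", np.log(n) / n**0.25)
print("||x||_inf =", np.abs(x).max(), " vs log(n)/sqrt(n) =", np.log(n) / np.sqrt(n))

S = rng.choice(n, size=n // 10, replace=False)   # a fixed set of delta*n indices
print("mass on S =", np.sum(x[S] ** 2), " (compare delta =", len(S) / n, ")")
```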
Dekel, Lee and Linial [6] were led to some of the above questions through their study of nodal domains. Given an $f : V \to \mathbb{R}$, its nodal domains are defined as the connected components formed by vertices having the same sign of $f$ (some care is needed in dealing with $v$ s.t. $f(v) = 0$, which we will get to later). Nodal domains arising out of physical systems have been studied extensively. For a review of the subject, we refer to the articles in [7]. For instance, Blum et al. have shown that nodal count statistics can be used as a criterion for quantum chaos. Closer to home, in computer graphics they are used as a way to distinguish between surfaces whose associated graphs happen to be isospectral (see [8] for these and other examples). Courant's celebrated nodal domain theorem (cf. [7]) says that the $k$th eigenvector (after sorting based on increasing eigenvalues) of the Laplacian of a graph has at most $k$ nodal domains. A canonical example is that of harmonics on a string: the $k$th eigenfunction is a sine function with $k$ peaks and $k$ sign changes. Connections are also known between nodal domains and chromatic number (we refer to the survey [9]).

In $G(n, p)$ graphs which are dense enough, we do not expect the eigenvectors to have more than two nodal domains (one each for positive and negative). The intuition is that each vertex with a negative value in the eigenvector is very likely to have an edge to a vertex in the domain corresponding to negative vertices, so the entire negative domain should be connected. Dekel, Lee and Linial [6] studied this question, and proved that this is true provided we throw away $O(1/p)$ "exceptional" vertices. As $p$ gets smaller, this number is quite large, and thus they cannot obtain a good bound on the number of nodal domains.

Our results. We prove that there are precisely two nodal domains in any eigenvector of $G(n, p)$ whp., for $p \ge n^{-1/19+\varepsilon}$ (any $\varepsilon > 0$). Our proof has two key ingredients: the first is a strong delocalization result (in this case an $\ell_\infty$ bound) on the eigenvectors, proved very recently in [10] (an earlier version of this paper used a weaker bound due to [11]). The second is a stronger "spreading" result we prove: it says that for any subset $S$ of $\delta n$ coordinates, an eigenvector has non-trivial $\ell_2$ mass on $S$ whp. Dekel, Lee and Linial [6] also prove such a result; however, they can do this only for $\delta > 1/2$. The emphasis of our results is on the connection between delocalization (in this case, saying that mass is "spread out", even over sets of a small constant size) and properties of nodal domains.

Our techniques are inspired by the work of Erdős, Schlein and Yau [5], who give a beautiful technique to prove properties of eigenvectors starting with refined versions of the semicircle law (Section 2.1). A key component of their argument is proving that the gap between two eigenvalues is small for most of the spectrum – it is only around $1/n$, if eigenvalues are normalized to lie in $[-2, 2]$. Gaps between eigenvalues are interesting in their own right, and are related to, for instance, the distribution of roots of the Riemann zeta function [3]. We rely on results of this flavor for $G(n, p)$, proved recently in work by Tran et al. [11] and Erdős et al. [10].

Outline. The delocalization result is proved in Section 3. This is used to prove the theorem on nodal domains (Section 4). We end with a discussion of other natural delocalization properties and directions for future work.
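Before moving to the preliminaries, here is a short, self-contained sketch (ours) of the nodal-domain counting just described. It counts weak nodal domains of one sign by running BFS over the subgraph induced on coordinates of that sign; zero coordinates are allowed into either side, and only components containing a strictly signed vertex are counted. For a dense $G(n, p)$ and a non-top eigenvector one should observe the output 1 and 1, i.e. two domains in total.

```python
import numpy as np
from collections import deque

def count_weak_nodal_domains(A, v, sign):
    """Count components of the subgraph induced on {i : sign*v[i] >= 0} that
    contain at least one strictly signed vertex (zeros may join either side)."""
    n = len(v)
    keep = sign * v >= 0
    seen = np.zeros(n, dtype=bool)
    domains = 0
    for s in range(n):
        if keep[s] and not seen[s] and sign * v[s] > 0:
            domains += 1
            queue = deque([s])
            seen[s] = True
            while queue:                          # BFS inside one domain
                i = queue.popleft()
                for j in np.nonzero(A[i])[0]:
                    if keep[j] and not seen[j]:
                        seen[j] = True
                        queue.append(j)
    return domains

rng = np.random.default_rng(0)
n, p = 1500, 0.1
upper = np.triu((rng.random((n, n)) < p).astype(float), 1)
A = upper + upper.T
v = np.linalg.eigh(A)[1][:, n // 2]               # a non-top eigenvector
print("positive domains:", count_weak_nodal_domains(A, v, +1))
print("negative domains:", count_weak_nodal_domains(A, v, -1))
```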
2 Notation and Preliminaries

We consider the Erdős–Rényi graph model $G(n, p)$, in which we have $n$ vertices and each possible edge is picked independently with probability $p$. We study properties of the adjacency matrix of $G \sim G(n, p)$.

Define $D_p$ to be a random variable which takes value 1 w.p. $p$ and value 0 w.p. $1 - p$. Define $X_p$ to be the "shifted" variant: it takes value $1 - p$ w.p. $p$, and $-p$ w.p. $1 - p$ (and hence has mean zero). Note that the adjacency matrix has zeros on the diagonal; the other entries are distributed according to $D_p$ (and they are independent, modulo symmetry constraints).

Let $1_n$ denote the vector in $\mathbb{R}^n$ with every coordinate (in the standard basis) equal to $1/\sqrt{n}$ (thus it is a unit vector for each $n$). Also, $J_{m,n}$ will refer to an $m \times n$ matrix with each entry equal to 1. Finally, whenever we say that a statement is true 'whp.', we mean that it is true with probability $1 - \frac{1}{\mathrm{poly}(n)}$. As is standard, there will be constants which can be chosen so as to make the statements true for any given polynomial. We also take union bounds over polynomially many events that are true whp., and the result will be true whp.

Let us now outline some standard results from random matrix theory.

2.1 Semicircle law and the eigenvalues of G(n, p)

Let $G$ be drawn from $G(n, p)$, with $p = \omega((\log n)/n)$, and let $A$ denote its adjacency matrix. The following facts are well known about the eigenvalues of $A$ (we refer to the excellent lecture notes of Tao [12] for details):

Fact 1. The largest eigenvalue $\lambda_{\max}$ is $np\,(1 \pm O(\frac{1}{\sqrt n}))$. Further, the eigenvector corresponding to the largest eigenvalue is 'close' to the vector $1_n$ (see [6] or [13]).

Fact 2. The rest of the eigenvalues (apart from the first) satisfy $|\lambda_i| \le 2(1 + o(1))\gamma\sqrt{n}$, where $\gamma^2 := p(1-p)$ (so $\gamma\sqrt n = \sqrt{np(1-p)}$).

Fact 3 (Semicircle law). Let $\varepsilon > 0$ and $t, t + \varepsilon \in [-2, 2]$. Let $N(t, \varepsilon)$ denote the number of eigenvalues of $A$ that lie between $t\gamma\sqrt n$ and $(t+\varepsilon)\gamma\sqrt n$. Then for any $\delta > 0$, we have
$$\left| N(t, \varepsilon) - n\int_t^{t+\varepsilon} \rho_{sc}(x)\,dx \right| \ \le\ \delta n \int_t^{t+\varepsilon} \rho_{sc}(x)\,dx$$
whp., where $\rho_{sc}(x)$ denotes the 'semicircle density' $\frac{1}{2\pi}\sqrt{4 - x^2}$.

A lot of work has been aimed at proving the semicircle law for $\varepsilon$ (in the statement above) as small as possible. These are called local semicircle laws, and they play a crucial role in all the existing results on properties of eigenvectors ([5], [11], [10]).

2.2 Singular values of random matrices

The distribution of singular values of random rectangular matrices has also been studied extensively in the literature. We consider rectangular $m \times n$ matrices (think of $m > n$), with entries drawn i.i.d. from a distribution $D$ with mean zero and variance $\gamma^2$. Bai and Yin [14] showed asymptotic (in $m, n$, with fixed ratio $m/n$) bounds on the smallest and largest singular values, for fairly general $D$. This was extended to more general $m, n$ in later works, culminating in the result of Rudelson and Vershynin [15], which we state for the special case $D = X_p$ (defined earlier in this section).

Theorem 1 ([15]). Suppose $A$ is an $m \times n$ matrix with entries drawn i.i.d. from $X_p$, and let $\gamma^2 = p(1-p)$, as earlier. Then we have
$$\sigma_{\max}(A) = (1 + o(1))\,\gamma(\sqrt m + \sqrt n), \qquad \sigma_{\min}(A) = (1 - o(1))\,\gamma(\sqrt m - \sqrt n) \quad \text{whp.}$$

We will only need fairly innocuous applications of this theorem – for instance, the $o(1)$ terms could be, say, $1/2$. Such versions are in fact much easier to prove.
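As a quick numerical sanity check of Theorem 1 (ours; nothing here is used in the proofs, and the constants are not meant to be sharp), one can sample an $X_p$ matrix and compare its extreme singular values with the predictions:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 4000, 1000, 0.2
gamma = np.sqrt(p * (1 - p))

# Entries i.i.d. from X_p: value 1-p with probability p, value -p otherwise.
X = (rng.random((m, n)) < p).astype(float) - p

s = np.linalg.svd(X, compute_uv=False)          # singular values, descending
print("sigma_max =", s[0],  "  predicted ~", gamma * (np.sqrt(m) + np.sqrt(n)))
print("sigma_min =", s[-1], "  predicted ~", gamma * (np.sqrt(m) - np.sqrt(n)))
```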
3 Support on δ-sized sets

In what follows, let $G$ be a graph drawn from $G(n, p)$, and let $A$ denote its adjacency matrix. Let us fix $1 < i \le n$, and let $v \in \mathbb{R}^n$ be the eigenvector corresponding to the $i$th largest eigenvalue of $A$. (We do not deal with $i = 1$ since it is very simple – in this case standard results (cf. [13]) say that the eigenvector is close, entry-wise, to the all-ones vector w.h.p., and all our claims follow from this.) The main result of this section considers the distribution of mass on this eigenvector. More precisely:

Theorem 2. Let $G$, $A$ and $v$ be defined as in the paragraph above, and let $S \subseteq [n]$ be a set of indices satisfying $|S| \ge \delta n$, for some parameter $\delta > 0$. Then we have
$$\|v|_S\|_2^2 \ \ge\ \Omega\!\left( \frac{\delta^4 p^6}{\log^2(1/p\delta)} \right) \quad \text{whp.},$$
where $\|v|_S\|_2^2 := \sum_{j \in S} v_j^2$.

Comments. Note that the "whp." is after fixing an $S$. It would be nice to say that every set of size $\delta n$ has a large mass, but we do not know how to do this. Another point is that the bound degrades (quite badly) as $p$ gets smaller – this is a common issue with $G(n, p)$, and it is not clear how it can be removed.
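The bound is easy to probe empirically. The sketch below (ours; all parameters are arbitrary) samples random sets $S$ of size $\delta n$ and reports the $\ell_2$ mass of a fixed bulk eigenvector on them. In experiments the mass concentrates near $\delta$ itself, far above the $\Omega(\delta^4p^6/\log^2(1/p\delta))$ guarantee, which illustrates the point in the comments above about how much the bound degrades.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, delta = 2000, 0.1, 0.05

upper = np.triu((rng.random((n, n)) < p).astype(float), 1)
A = upper + upper.T
v = np.linalg.eigh(A)[1][:, n // 3]               # a fixed non-top eigenvector

masses = [np.sum(v[rng.choice(n, int(delta * n), replace=False)] ** 2)
          for _ in range(20)]                     # several random delta*n sized sets
bound = delta**4 * p**6 / np.log(1.0 / (p * delta)) ** 2
print("min observed mass:", min(masses), "  Theorem 2 lower bound ~", bound)
```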
As a warm-up, we will consider a simpler case: a symmetric matrix with entries drawn i.i.d. from the distribution $X_p$ (instead of an adjacency matrix), and a large set $S$. More precisely:

Lemma 1. Let $M$ be an $n \times n$ symmetric matrix with the upper-diagonal entries drawn i.i.d. from the (mean-zero) distribution $X_p$. Let $S \subseteq [n]$ be a set of indices with $|S| > \delta n$ and $\delta > 1/2$. Lastly, let $v$ be an eigenvector of $M$, normalized to have unit length. Then w.h.p. we have $\|v|_S\|_2^2 \ge \Omega(1)$.

Having entries from $X_p$ and not $D_p$ simplifies things. The other (more crucial) simplification is in assuming that $\delta$ is a constant $> 1/2$; the gap $\delta - 1/2$ determines the constant in the $\Omega(\cdot)$. This is the key lemma in the work of Dekel et al.

Proof. Let $\lambda$ be the eigenvalue corresponding to $v$. Let us view the matrix $M$ in blocks based on indices being in $S$ and $V \setminus S$, and write the equation $Mv = \lambda v$ as
$$\begin{pmatrix} B & H \\ H^T & D \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix} \qquad (1)$$
(to be clear, the vector $v$ is split into vectors $x, y$ based on indices in $S$, $V \setminus S$, and the matrix $M$ is split into $B, H, H^T, D$, since it is symmetric). Let us denote $|S| = m$ $(= \delta n)$, and let $r = n - m$; thus the dimensions of $B, H, D$ are $m \times m$, $m \times r$, $r \times r$ respectively. Since $v$ is normalized, $\|x\|^2 + \|y\|^2 = 1$. From (1), we have
$$Bx + Hy = \lambda x, \qquad H^T x + Dy = \lambda y. \qquad (2)$$

Assumption. We will assume that neither $B$ nor $D$ has $\lambda$ as an eigenvalue, and will thus freely use the expression $(\lambda I - B)^{-1}$ and the like. This is a technical issue – we can deal with it by either adding a small amount of noise (as in Section 3.1 of [11]), or by working with the pseudo-inverse.

Now from the first equation of (2), we have that $x = (\lambda I - B)^{-1} H y$. Since $\|x\|^2 + \|y\|^2 = 1$, and we are done if $\|x\|^2 > 1/4$, we may assume that $\|y\|^2 > 3/4$.

The key now is that for $\delta > 1/2$, $H$ is an $m \times r$ matrix with $m > cr$ for some constant $c > 1$. Thus $H$ is well-conditioned; more precisely (see Section 2), we have that
$$\sigma_{\min}(H) \ \ge\ \gamma(\sqrt m - \sqrt r) \ >\ \gamma\sqrt m \cdot c'$$
for some constant $c'$ (depending on $c$). Since $\|Hz\|^2 \ge \sigma_{\min}^2 \|z\|^2$ for any $z$, we have in particular that $\|Hy\|^2 \ge m\gamma^2 c'^2 \|y\|^2 \ge \frac34 m\gamma^2 c'^2$.

We will claim that the largest eigenvalue (in magnitude) of $(\lambda I - B)$ is at most $6\gamma\sqrt m$. This would imply that the smallest eigenvalue of $(\lambda I - B)^{-1}$ is at least $\frac{1}{6\gamma\sqrt m}$ in magnitude; thus for any vector $z$, $\|(\lambda I - B)^{-1} z\|^2 \ge \frac{1}{36\gamma^2 m}\|z\|^2$, so plugging in $z = Hy$ and using the earlier bound, the lemma follows.

It thus remains to verify that the largest eigenvalue of $\lambda I - B$ is at most $6\gamma\sqrt m$. First, note that $\lambda$ is an eigenvalue of $M$, a symmetric matrix with entries i.i.d. copies of the mean-zero r.v. $X_p$. Thus $\lambda < 2\gamma\sqrt n < 3\gamma\sqrt m$ (since $m > n/2$). Second, since $B$ is a symmetric random matrix with entries from $X_p$, the eigenvalues of $B$ all have magnitude smaller than $3\gamma\sqrt m$ w.h.p. These imply the desired bound. □

General case. The main difficulty in the general case ($\delta \le 1/2$) is that we no longer have a good lower bound on $\|Hy\|^2$ – indeed, such a bound is false for arbitrary $y$ (there will even exist non-zero $y$ s.t. $Hy = 0$!). Another (more technical) difficulty is that we wish to deal with the adjacency matrix, whose entries are distributed according to $D_p$ and not the mean-zero $X_p$: this means that $(\lambda I - B)^{-1}$ will have one eigenvalue which is tiny (of the order of $1/n$ if we think of $p$ as a constant), and we need to ensure that $Hy$ does not have most of its mass along the corresponding eigenvector.

Outline. For the sake of intuition, suppose the second issue did not exist (i.e., we are dealing with symmetric matrices with upper-diagonal entries i.i.d. from $X_p$). Then the first issue is dealt with by observing the following: even though $\|Hy\|$ could be small for certain $y$, it is still true that for a given $y$, this quantity is large with high probability (over the choice of $H$). So the key idea is to try to restrict the possible $y$ to a "small" set (which does not depend on $H$), and then take a union bound. The restriction is achieved by observing that the second equality in (2) implies that $y = (\lambda I - D)^{-1} H^T x$, and if $x$ is "too short", then so is $H^T x$, and hence $(\lambda I - D)^{-1}$ should "stretch" $H^T x$ by a large amount (because $y$ is close to length 1). This in turn implies that $y$ is essentially supported on the large singular vectors of $(\lambda I - D)^{-1}$ (which is independent of the choice of $H$!). This allows the union bound argument to go through.

Dealing with the second issue above, namely working with $D_p$, will involve many technical difficulties, but the rough outline is as described: first analyze for a fixed $y$, then impose a "structure" on $y$, and finish with an $\varepsilon$-net argument.

3.1 Theorem 2: Analyzing a fixed y

Let us now show, for a given vector $y$, that $(\lambda I - B)^{-1} H y$ has a large norm w.h.p. (over the choice of $H$). This step is more or less straightforward if the entries of the matrix $B$ are i.i.d. from $X_p$. However, in our case $B$ is the adjacency matrix of the induced graph on $S$ (which is also random, with edge probability $p$). It has $mp$ as its top eigenvalue, and all other eigenvalues are $< 3\gamma\sqrt m$ whp. Since the parameter $\lambda < 3\gamma\sqrt n$, we have that all, except possibly one, of the eigenvalues of $(\lambda I - B)$ have magnitude at most $6\gamma\sqrt n$. Let $u \in \mathbb{R}^m$ be the eigenvector corresponding to the only "large" ($> 6\gamma\sqrt n$) eigenvalue. Now if the vector $Hy$ has "enough" mass orthogonal to $u$, it turns out we can give a lower bound on the length of $(\lambda I - B)^{-1} H y$. We thus show the following.

Lemma 2. Suppose $u \in \mathbb{R}^m$ is a unit vector s.t. $\|u - 1_m\|^2 < 1$, and let $y \in \mathbb{R}^r$. Let $H$ be a matrix in $\mathbb{R}^{m \times r}$ whose entries are i.i.d. copies of $D_p$ (it is rectangular – we do not have any symmetry conditions). Then we have
$$\Pr\left[ \|Hy\|^2 - \langle u, Hy \rangle^2 < mp\gamma^2 \|y\|^2 / 10 \right] \ <\ e^{-mp\gamma^2/200}.$$

Proof. First, let us write $H = H' + pJ_{m,r}$. Also, since the statement is invariant under scaling $y$, let us assume $\|y\| = 1$. Further, let $y = y' + \alpha 1_r$, where $\langle y', 1_r \rangle = 0$. Then $Hy = H'y + (mr)^{1/2} p\alpha \cdot 1_m$. Let us write $c := (mr)^{1/2} p\alpha$, and let $Z = H'y$. Thus for random $H$, $Z$ is a vector with i.i.d. mean-zero entries.
The quantity we are interested in is
$$\|Z + c1_m\|^2 - \langle u, Z + c1_m \rangle^2 = \left( \|Z\|^2 + \|c1_m\|^2 + 2\langle Z, c1_m\rangle \right) - \left( \langle u, Z\rangle^2 + \langle u, c1_m\rangle^2 + 2\langle u, Z\rangle\langle u, c1_m\rangle \right)$$
$$= \left( \|Z\|^2 - \langle u, Z\rangle^2 \right) + \left( \langle 1_m, c1_m\rangle^2 - \langle u, c1_m\rangle^2 + 2\langle 1_m, Z\rangle\langle 1_m, c1_m\rangle - 2\langle u, Z\rangle\langle u, c1_m\rangle \right).$$
Let $T_1$ and $T_2$ denote the terms in the first and the second parentheses. The idea is now to manipulate the terms $T_1$ and $T_2$ separately, using Azuma's inequality to prove that they are both large whp. The details are somewhat tricky, though morally straightforward. The entire proof of the lemma is reproduced in the appendix (Section A.2). □

3.2 Theorem 2: Imposing structure on y

First, we show how to restrict the possible set of $y$. To this effect, we show the following:

Lemma 3. Let $x \in \mathbb{R}^m$, $y \in \mathbb{R}^r$, with $\|x\| < \varepsilon^2 d / 2n$, and suppose $\|y\| > 3/4$. Further, let $y = (\lambda I - D)^{-1} H^T x$ (with $H, D$ as before). Then there exists a set $S$ of at most $d$ vectors, determined entirely by $D$, such that $y = s + z$, with $s \in \mathrm{span}(S)$ and $\|z\| < \varepsilon$.

Before going into the proof, let us show the following (proof in Section A.3):

Lemma 4. Suppose $M \in \mathbb{R}^{n \times n}$ and $u, v \in \mathbb{R}^n$, with $v = Mu$. Suppose $\|v\| = 1$ and $\|u\| < \varepsilon/\tau$, for some $\varepsilon, \tau > 0$. Then $v = v' + z$, where $\|z\| < \varepsilon$, and $v'$ is in the span of the eigenvectors of $M$ with eigenvalues of magnitude $> \tau$.

Proof (of Lemma 3). Let us write $H = H' + pJ_{m,r}$, so we get
$$y = (\lambda I - D)^{-1}(H'^T + pJ_{r,m})x = (\lambda I - D)^{-1} H'^T x + p(\lambda I - D)^{-1} J_{r,m} x.$$
The second term is a vector which is parallel to $(\lambda I - D)^{-1} 1_r$. Let us add a unit vector along this direction to $S$. Now consider the vector $\eta := (\lambda I - D)^{-1} H'^T x$. If $\|\eta\| < \varepsilon$, we are done: we have expressed $y$ as an element of the span of a small set of vectors (which depend only on $D$) plus an $\varepsilon$ noise. Thus suppose $\|\eta\| > \varepsilon$.

We now observe that $H'^T$ has entries which are i.i.d. copies of $X_p$. Thus from Theorem 1, we have $\sigma_{\max}(H'^T) \le \gamma(\sqrt r + \sqrt m)$ whp., and thus $\|H'^T x\| < 4\gamma\sqrt n \|x\|$. Now we ask: how many eigenvalues of $(\lambda I - D)^{-1}$ have magnitude $> \frac{C}{2\gamma\sqrt n}$ (for some parameter $C$)? It is precisely the number of eigenvalues of $D$ which lie in a $2\gamma\sqrt n / C$ interval around $\lambda$. By the semicircle law (Fact 3, at scale roughly $1/C$), this number is at most $4n/C$ (recall that we assumed $\delta \le 1/2$, and hence $r > n/2$). We set $C = 4n/d$.

Let us now recall that $\|\eta\| \ge \varepsilon$, and further we saw that $\|H'^T x\| \le 4\gamma\sqrt n\|x\| < 4\gamma\sqrt n \cdot \frac{\varepsilon^2 d}{2n}$. Thus $(\lambda I - D)^{-1}$ stretches $H'^T x$ by a factor at least $\frac{n}{d\varepsilon} \cdot \frac{1}{2\gamma\sqrt n}$. Thus by Lemma 4 (in the hypothesis, the vector is stretched by a factor $\tau/\varepsilon$), we have $\eta = \eta' + z$, where $\|z\| < \varepsilon$, and $\eta'$ is in the span of eigenvectors of $(\lambda I - D)^{-1}$ with eigenvalues of magnitude at least $\frac{n}{d} \cdot \frac{1}{2\gamma\sqrt n}$. The number of such eigenvectors, as we saw above, is at most $d$. This completes the proof. □

We will use this lemma with $\varepsilon = (p\delta)^{1/2}/20$ and $d = \frac{mp\gamma^2}{20\log(1/\varepsilon)} \approx \frac{\delta n p^2}{40 \log(1/\varepsilon)}$, for which we need $\|x\| \le \frac{\varepsilon^2 d}{2n} = \frac{\delta^2 p^3}{C \log(1/p\delta)}$, for some absolute constant $C$.
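Lemma 4 is elementary linear algebra, and a quick numerical check (ours; the symmetric matrix $M$ below is an arbitrary example, not the $D$ of the argument) may help fix the quantifiers: for $v = Mu$ with $\|v\| = 1$, the component of $v$ outside the span of eigenvectors with eigenvalue magnitude $> \tau$ has norm at most $\tau\|u\|$, i.e. at most $\varepsilon$ whenever $\|u\| < \varepsilon/\tau$.

```python
import numpy as np

rng = np.random.default_rng(5)
n, tau = 400, 3.0

# Any symmetric M will do; shift the diagonal so part of the spectrum exceeds tau.
M = rng.standard_normal((n, n))
M = (M + M.T) / np.sqrt(2 * n) + np.diag(np.linspace(-6.0, 6.0, n))

lam, W = np.linalg.eigh(M)
u0 = rng.standard_normal(n)
v = M @ u0
u, v = u0 / np.linalg.norm(v), v / np.linalg.norm(v)   # now v = M u and ||v|| = 1

large = np.abs(lam) > tau
z = v - W[:, large] @ (W[:, large].T @ v)              # component off the "large" span
print("||z|| =", np.linalg.norm(z), " <= tau * ||u|| =", tau * np.linalg.norm(u))
```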
3.3 Theorem 2: The ε-net argument, finale

In Section 3.2, we saw that we could restrict $y$ to vectors of the kind $s + z$, where $s$ is a unit vector in the span of a set of $d$ vectors (we can assume $s$ is a unit vector even though we did not technically prove it, since $y$ is extremely close to a unit vector if $\|x\|$ is small), for some "small" $d$, and $\|z\| \le \varepsilon$. Let us denote the set of all such $y$ by $\mathcal{Y}$.

Modified ε-net. We will need an ε-net with an additional property. In particular, we need a set $N_\varepsilon$ s.t. for every $y \in \mathcal{Y}$, we have $y = y' + \eta$, where $y' \in N_\varepsilon$, $\|\eta\| \le \varepsilon$, and further $|\langle \eta, 1_r \rangle| \le \varepsilon/r$. The additional requirement is there to handle the fact that we deal with $D_p$ and not $X_p$.

We can find an $N_\varepsilon$ of this kind as follows: first we consider a "usual" ε-net, i.e., a set $M_\varepsilon$ such that $\mathcal{Y} \subseteq \bigcup_{x \in M_\varepsilon} B(x, \varepsilon)$. Now we can chop up $B(x, \varepsilon)$ for each $x$ into $2r$ "slices", where for any $u, u'$ in one slice, $\langle u, 1_r \rangle$ and $\langle u', 1_r \rangle$ differ by at most $\varepsilon/r$, and pick the 'center' of each slice (the number of slices is $2r$ because in $B(x, \varepsilon)$, the dot product with $1_r$ can vary by at most $2\varepsilon$).

Now we use Lemma 2 to prove:

Lemma 5. Let $u$ be a unit vector in $\mathbb{R}^m$ satisfying $\|u - 1_m\|^2 < 1$, and let $H$ be a matrix in $\mathbb{R}^{m \times r}$ whose entries are i.i.d. copies of $D_p$. Then w.p. $\ge 1 - \frac{1}{\mathrm{poly}(n)}$, we have
$$\|Hy\|^2 - \langle u, Hy \rangle^2 \ \ge\ mp\gamma^2 \|y\|^2 / 10 \quad \text{for all } y \in \mathcal{Y}.$$

Proof. Let $\Pi : \mathbb{R}^m \to \mathbb{R}^m$ be the operator which projects onto the (dimension $m-1$) subspace orthogonal to $u$ (note that $\|\Pi Hy\|^2 = \|Hy\|^2 - \langle u, Hy\rangle^2$). Now let $\varepsilon := (p\delta)^{1/2}/20$, and let $N_\varepsilon$ be an ε-net for $\mathcal{Y}$ in the modified sense above. For each $y \in N_\varepsilon$, we have by Lemma 2 that $\|\Pi H y\|^2 \ge mp\gamma^2\|y\|^2/10$ w.p. $\ge 1 - e^{-mp\gamma^2/200}$. By our choice of parameters (i.e., $d$, and because $\delta np^2 \gg \log n$), we have $r \cdot (1/\varepsilon)^d < \exp(mp\gamma^2/10)$, so w.p. $\ge 1 - \frac{1}{\mathrm{poly}(n)}$ we have $\|\Pi H y\|^2 \ge mp\gamma^2\|y\|^2/10$ for all $y \in N_\varepsilon$. In particular, since $\|y\| > 3/4$ for all $y \in \mathcal{Y}$, we have $\|\Pi H y\| \ge \gamma\sqrt{mp}/6$ for all $y \in N_\varepsilon$ w.h.p.

Now suppose $\eta \in \mathbb{R}^r$ has the property that $\|\eta\| < \varepsilon$ and $|\langle \eta, 1_r \rangle| < \varepsilon/r$. Write $\eta = \eta' + \rho 1_r$, where $\langle \eta', 1_r \rangle = 0$; so we have $|\rho| < \varepsilon/r$. Thus
$$\|H\eta\| = \|H'\eta + p(mr)^{1/2}\rho 1_m\| \ \le\ \varepsilon\,\sigma_{\max}(H') + \gamma\sqrt m\,\varepsilon.$$
From Section 2, we have $\sigma_{\max}(H') \le \gamma(\sqrt m + \sqrt r)$ w.h.p. Combining this with the above, we have that
$$\|\Pi H y\| \ \ge\ \gamma\sqrt{mp}/6 - 2\gamma(\sqrt m + \sqrt r)\varepsilon \quad \text{for all } y \in \mathcal{Y}, \text{ whp.}$$
This completes the proof, noting $\varepsilon < (p\delta)^{1/2}/20$. □

Lemma 6. Let $B$ be the adjacency matrix of a $G(m, p)$ random graph, and let $H$ be an $m \times r$ matrix with entries picked i.i.d. from $D_p$. Then whp. (over the choice of $H$, $B$), we have
$$\|(\lambda I - B)^{-1} H y\| \ \ge\ \frac{\sqrt p}{24} \quad \text{for all } y \in \mathcal{Y}.$$

See Appendix A.4 for the proof. This completes the three-step proof of Theorem 2.

4 Nodal domains

Let $G \sim G(n, p)$, and let us consider an eigenvector $v$ of the adjacency matrix $A$. Let the corresponding eigenvalue be $\lambda$, and suppose that it is not the largest (this means $\langle v, 1_n \rangle$ is small, and hence we expect $v$ to have both positive and negative components). Recall that the nodal domains of $v$ are the maximal connected components of vertices on which $v$ has the same sign. For technical reasons, we will consider only weak nodal domains, i.e., if a coordinate $v_i = 0$, we allow it to be in multiple domains. In fact, Courant's nodal domain theorem in a discrete setting applies only to weak nodal domains (though there is a different version for strong nodal domains, it is not as clean to state) [9].

Intuitively, if a graph is fairly dense, we would expect far fewer than $n$ nodal domains for any function $f$ on the vertices, unless the graph has a certain structure (for instance, if a graph is bipartite, there exists an $f$ with $n$ different nodal domains). Thus for dense enough random graphs, we would expect to see precisely two nodal domains (one each for positive/negative) when we consider the eigenfunctions. This is the subject of this section.

We crucially use the following (very recent) result of Erdős, Knowles, Yau and Yin [10]. A weaker result (used in an earlier version of the current paper) was proved by Tran et al. [11].

Lemma 7 ([10], Theorem 2.16). Let $A$ be the adjacency matrix of $G(n, p)$, and let $v$ be any eigenvector of $A$, normalized so that $\|v\|_2 = 1$. Then we have
$$\|v\|_\infty \ \le\ \frac{\log^2 n}{\sqrt n} \quad \text{whp.}$$

For convenience, we will denote this upper bound by $\beta$.
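Lemma 7 is again easy to probe numerically (a sketch of ours; note that at simulation-sized $n$ the polylog factor makes $\beta > 1$, so the stated bound is vacuous there – the meaningful observation is that the largest entry over all eigenvectors is only a small multiple of $1/\sqrt n$):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 2000, 0.05
upper = np.triu((rng.random((n, n)) < p).astype(float), 1)
A = upper + upper.T

V = np.linalg.eigh(A)[1]              # all eigenvectors, as columns
print("max entry over all eigenvectors:", np.abs(V).max())
print("1/sqrt(n) =", 1 / np.sqrt(n),
      "  beta = log^2(n)/sqrt(n) =", np.log(n) ** 2 / np.sqrt(n))
```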
Let us start with some simple observations, along the lines of [6]. First, observe that the size of the largest independent set in $G$ is $O(\log n)/p$ whp. (by a standard calculation – Observation 1). Thus there cannot be more than $O(\log n)/p$ domains of each sign (vertices in different domains of the same sign cannot share an edge, by definition, so picking one vertex from each gives an independent set). Further, there can be at most one domain of each sign of size $\ge C\log n/p$, because otherwise we would have two sets of size $C\log n/p$ with no edges between them, and whp. this cannot exist (a calculation à la Observation 1). Thus let $M_+$ and $M_-$ denote the largest positive and negative domains respectively. We claim that both are 'large' whp.

Lemma 8. Suppose $p > n^{-1/2+\varepsilon}$. Then both $|M_+|$ and $|M_-|$ are at least $\log^2 n/p$ whp.

The proof is an application of Lemma 7, and can be found in Section A.5.

Now, following [6], we will call the vertices in $M_e = V \setminus (M_+ \cup M_-)$ exceptional vertices. At most $C\log n/p$ of these have $v_i > 0$, because otherwise these, along with $M_+$, would form two large sets with no edges between them. Thus the number of exceptional vertices is only $C\log n/p$. This crude upper bound will be helpful in showing that there are in fact no exceptional vertices w.h.p.

Theorem 3. Let $p \ge n^{-1/19+\varepsilon}$, and let $v$ be an eigenvector of $G \sim G(n, p)$ for a non-first eigenvalue $\lambda$. Then $v$ has precisely two nodal domains whp.

Proof. Equivalently, we wish to prove that there are no exceptional vertices. Suppose for the sake of contradiction that $M_e \ne \emptyset$, and let $i$ be an exceptional vertex. Wlog we may assume that $v_i > 0$. Thus $i$ can have edges only to $M_- \cup M_e$, because if it had an edge to $M_+$ it would be in the same connected component, and hence not exceptional. Consider the $i$th coordinate in the equality $Av = \lambda v$:
$$\sum_{j \in M_e \cap \Gamma(i)} v_j + \sum_{j \in M_- \cap \Gamma(i)} v_j = \lambda v_i, \quad \text{i.e.,} \quad \sum_{j \in M_e \cap \Gamma(i)} v_j - \sum_{j \in M_- \cap \Gamma(i)} |v_j| = \lambda v_i. \qquad (3)$$

The proof is along the following lines: for every vertex $i$, we wish to claim that $\sum_{j \in \Gamma(i)} |v_j|$ is large. This is because Theorem 2 implies that $\sum_{j \in \Gamma(i)} v_j^2$ is 'large' (the $\ell_2$ mass on any set is large, and we apply this to the set of neighbors), and we then use Lemma 7. Then we will claim that the other two terms are both very small – the first because there are too few terms, and the last by Lemma 7 and the fact that $\lambda$ is a non-first eigenvalue.

Formally, we first note that $|\Gamma(i)| \ge np(1 - o(1))$ for every $i$ whp. Thus by Theorem 2 applied to $\Gamma(i)$, we get $\sum_{j \in \Gamma(i)} v_j^2 \ge s$ whp., where $s$ denotes the RHS of the bound in Theorem 2. (Strictly speaking, we cannot use the theorem directly – it says that for a fixed $S$ the property holds whp., while here we are given the graph and are picking $S = \Gamma(i)$. However, we can repeat the proof with $S = \Gamma(i)$ for one vertex $i$: in the block decomposition, one of the columns of $H$ has all ones, and the 'corresponding' column in $D$ has zeroes, but the rest of the entries are random, and this ensures that all the claims hold. We omit the details.) We can then take a union bound and conclude this for all $i$.

Now by Lemma 7 we have $|v_j| \le \beta$ for every $j$; thus $\sum_{j \in \Gamma(i)} |v_j| \ge s/\beta$. We showed that $|M_e|$ is at most $O(\log n)/p$, and thus
$$\sum_{j \in M_e \cap \Gamma(i)} |v_j| \ \le\ \frac{C\beta\log n}{p}$$
for some constant $C$. We will verify that our choice of parameters satisfies
$$\frac{s}{2\beta} \ \ge\ \frac{4C\beta\log n}{p}, \qquad \text{and} \qquad \frac{s}{2\beta} \ >\ 8\gamma\sqrt n\,\beta \ \ (\ge 4\lambda\beta). \qquad (4)$$
These will together contradict (3), thus finishing the proof of the theorem.

Plugging in the bound from Theorem 2 with $\delta = p$, the first inequality above simplifies to $\frac{p^{10}}{\log^2(1/p)} \ge \frac{4C\log^5 n}{np}$, which is true as long as $p \ge n^{-1/11+\varepsilon}$. The second inequality in (4), along with the fact that $\lambda \le 2\gamma\sqrt n = 2\sqrt{np(1-p)}$, simplifies to $\frac{p^{10}}{\log^2(1/p)} \ge \frac{(np)^{1/2}\log^4 n}{n}$, which is true as long as $p \ge n^{-1/19+\varepsilon}$. The latter bound dominates. □

5 Future Directions

A natural question, posed by [11], is the following.
Conjecture 1. Let $v$ be an eigenvector corresponding to a non-first eigenvalue of $A$, and let $u$ be any unit vector. Then whp., $|\langle u, v \rangle| \le \frac{C\log n}{\sqrt n}$.

Note that $\ell_\infty$ bounds on eigenvectors correspond to choosing $u = e_i$. While it seems tempting to use similar techniques – after all, by a rotation we can map any $u$ to $e_i$ – the difficulty is that we no longer have blocks with entries that are independent of each other.

Level sets. One can also consider nodal domains of level sets of the eigenvectors: these are nodal domains of the vector $v - t1_n$, for some threshold $t$. Such sets arise in spectral partitioning, for instance [16]. We would expect that for $t$ beyond some critical value, the nodal domains start "breaking apart" into smaller pieces, and it is interesting to study this threshold.

G(n, p) for smaller p. Perhaps the most interesting open question is to study delocalization and properties of nodal domains for small $p$ (including $p = C/n$ for a large constant $C$). Many of our technical ingredients, like the delocalization results of Erdős, Schlein and Yau, and even some properties of the spectrum, are not known for this range of $p$. An ambitious goal is to investigate nodal domain counts for the giant component. Perhaps a more approachable question is to reduce the lower bound on $p$ required by Theorem 3 (i.e., does it hold for all $p = \omega((\log n)/n)$?).

5.1 Acknowledgements

We thank Peter Sarnak for pointing us to [5], which served as the starting point for this work. We would also like to thank James Lee and Nati Linial for interesting discussions, and Nati for encouraging thought on related problems, such as nodal domains in level sets.

References

1. Wigner, E.: On the distribution of the roots of certain symmetric matrices. Annals of Mathematics 67 (1958) 325–328
2. Bollobás, B.: Random Graphs. Cambridge University Press (2001)
3. Erdős, L.: Universality of Wigner random matrices: a survey of recent results. arXiv e-prints (April 2010)
4. Tao, T., Vu, V.: Random matrices: Universality of local eigenvalue statistics. arXiv e-prints (June 2009)
5. Erdős, L., Schlein, B., Yau, H.T.: Local semicircle law and complete delocalization for Wigner random matrices. Communications in Mathematical Physics 287 (2009) 641–655
6. Dekel, Y., Lee, J.R., Linial, N.: Eigenvectors of random graphs: Nodal domains. In: APPROX-RANDOM (2007) 436–448
7. Nodal patterns in physics and mathematics (special issue). The European Physical Journal Special Topics 145 (June 2007)
8. Band, R., Oren, I., Smilansky, U.: Nodal domains on graphs – how to count them and why? arXiv e-prints (November 2007)
9. Bıyıkoğlu, T., Leydold, J., Stadler, P.F.: Laplacian Eigenvectors of Graphs: Perron–Frobenius and Faber–Krahn Type Theorems. Springer Verlag (2007)
10. Erdős, L., Knowles, A., Yau, H.T., Yin, J.: Spectral statistics of Erdős–Rényi graphs I: Local semicircle law. arXiv e-prints (March 2011)
11. Tran, L., Vu, V., Wang, K.: Sparse random graphs: Eigenvalues and eigenvectors. arXiv e-prints (November 2010)
12. Tao, T.: Topics in Random Matrix Theory (in preparation) (2011)
13. Mitra, P.: Entrywise bounds for eigenvectors of random graphs. Electronic Journal of Combinatorics 16 (2009)
14. Bai, Z., Yin, Y.: Limit of the smallest eigenvalue of a large dimensional sample covariance matrix. Annals of Probability 21(3) (1993) 1275–1294
15. Rudelson, M., Vershynin, R.: Smallest singular value of a random rectangular matrix. Communications on Pure and Applied Mathematics 62(12) (2009) 1707–1739
16. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 888–905

A Additional Lemmas

Lemma 9. Let $Y_1, \ldots, Y_n$ be i.i.d. random variables drawn according to $X_p$, and let $\alpha_1, \ldots, \alpha_n$ be real numbers s.t. $\sum_i \alpha_i^2 = S$. Further, let $Z$ denote the r.v. $Z := \sum_i \alpha_i Y_i$. Then
$$E[Z^2] = Sp(1-p) = S\gamma^2, \qquad E[Z^4] \le S^2(\gamma^2 + 3\gamma^4). \qquad (5)$$

Proof. Since the $Y_i$ are independent with mean zero, we have $E[Y_iY_j] = E[Y_i^3Y_j] = 0$ for distinct $i, j$. Thus we have
$$E[Z^2] = E\Big[\Big(\sum_i \alpha_i Y_i\Big)^2\Big] = \sum_i \alpha_i^2\,E[Y_i^2] = \gamma^2 S,$$
$$E[Z^4] = E\Big[\Big(\sum_i \alpha_i Y_i\Big)^4\Big] = \sum_{i,j,k,l} \alpha_i\alpha_j\alpha_k\alpha_l\,E[Y_iY_jY_kY_l] = \sum_i \alpha_i^4\,E[Y_i^4] + 6\sum_{i<j} \alpha_i^2\alpha_j^2\,E[Y_i^2Y_j^2].$$
Now $E[Y_i^4] = p(1-p)(1 - 3p + 3p^2) < p(1-p) = \gamma^2$, and $E[Y_i^2Y_j^2] = \gamma^4$. Noting that $\sum_i \alpha_i^4 \le S^2$ and $6\sum_{i<j} \alpha_i^2\alpha_j^2 \le 3S^2$ now implies the desired result. □

Observation 1. Let $G \sim G(n, p)$. The size of the largest independent set in $G$ is at most $C\log n/p$ whp.

Proof. The expected number of independent sets of size $k$ is, by linearity of expectation,
$$\binom{n}{k}(1-p)^{\binom{k}{2}} \ \le\ e^{k\log n - pk(k-1)/2}.$$
This number is $1/\mathrm{poly}(n)$ for $k > C\log n/p$; thus Markov's inequality gives the result. □

A.1 Proof of Claim

Claim. Let $y \in \mathbb{R}^r$ be any vector with $\|y\| = 1$. Let $H'$ be an $m \times r$ matrix with entries drawn i.i.d. from $X_p$, and let $Z = H'y$. Then for any unit vector $u \in \mathbb{R}^m$ and all $t > 0$, we have
$$\Pr\big[|\langle u, Z \rangle| \ge t\big] \ \le\ e^{-t^2}. \qquad (6)$$

Proof. We can write $u^T H' y = \sum_{i,j} H'_{i,j} u_i y_j$. Since $\sum_{i,j} u_i^2 y_j^2 = 1$, and the $H'_{i,j}$ are mean-zero random variables bounded in absolute value by 1, we have the claim by Azuma's inequality. □

A.2 Proof of Lemma 2

First, let us write $H = H' + pJ_{m,r}$, as before. Also, since the statement is invariant under scaling $y$, let us assume $\|y\| = 1$. Further, let $y = y' + \alpha 1_r$, where $\langle y', 1_r \rangle = 0$. Then $Hy = H'y + (mr)^{1/2}p\alpha \cdot 1_m$. Let us write $c := (mr)^{1/2}p\alpha$, and let $Z = H'y$. Thus for random $H$, $Z$ is a vector with i.i.d. mean-zero entries. The quantity we are interested in is
$$\|Z + c1_m\|^2 - \langle u, Z + c1_m \rangle^2 = \left( \|Z\|^2 - \langle u, Z \rangle^2 \right) + \left( \langle 1_m, c1_m \rangle^2 - \langle u, c1_m \rangle^2 + 2\langle 1_m, Z\rangle\langle 1_m, c1_m\rangle - 2\langle u, Z\rangle\langle u, c1_m\rangle \right).$$
Let $T_1$ and $T_2$ denote the terms in the first and the second parentheses.

Consider $T_2$. To simplify things, let $\delta := 1_m - u$, so that $1_m = u + \delta$. So first, we have
$$\langle 1_m, c1_m \rangle^2 - \langle u, c1_m \rangle^2 = \langle \delta, c1_m \rangle \langle 1_m + u, c1_m \rangle.$$
Next, writing $\langle 1_m, Z \rangle = \langle u, Z \rangle + \langle \delta, Z \rangle$,
$$\langle 1_m, Z\rangle\langle 1_m, c1_m\rangle - \langle u, Z\rangle\langle u, c1_m\rangle = \langle \delta, Z\rangle\langle 1_m, c1_m\rangle + \langle u, Z\rangle\langle \delta, c1_m\rangle.$$
Now since $u$ and $1_m$ are unit vectors and $u = 1_m - \delta$, we have $1 = 1 + \|\delta\|^2 - 2\langle \delta, 1_m \rangle$, implying $\langle \delta, 1_m \rangle = \|\delta\|^2/2$. Thus $\langle 1_m + u, c1_m \rangle = 2c\,(1 - \|\delta\|^2/4)$. Plugging in these values and simplifying,
$$T_2 = c^2\|\delta\|^2\left(1 - \|\delta\|^2/4\right) + 2c\left( \langle \delta, Z \rangle + \frac{\|\delta\|^2 \langle u, Z \rangle}{2} \right).$$
By assumption, $\|u - 1_m\|^2 < 1$, and so we have
$$T_2 \ \ge\ \frac{c^2\|\delta\|^2}{2} - 2|c|\left( |\langle \delta, Z \rangle| + \frac{\|\delta\|^2}{2}|\langle u, Z \rangle| \right).$$
Now a standard Azuma–Hoeffding bound gives the following.

Claim. Let $y \in \mathbb{R}^r$ be any vector with $\|y\| = 1$. Let $H'$ be an $m \times r$ matrix with entries drawn i.i.d. from $X_p$, and let $Z = H'y$. Then for any unit vector $u \in \mathbb{R}^m$ and all $t > 0$, we have
$$\Pr\big[|\langle u, Z \rangle| \ge t\big] \ \le\ e^{-t^2}. \qquad (7)$$
The simple proof can be found in Section A.1. (It would be nice if we could have $\exp(-t^2/2\gamma^2)$ on the RHS – it would lead to a better bound overall.) Using this, we have that $\Pr[|\langle \delta, Z \rangle| > \|\delta\|\cdot\gamma\sqrt{mp}/20] < e^{-mp\gamma^2/400}$, and likewise $\Pr[|\langle u, Z \rangle| > \gamma\sqrt{mp}/20] < e^{-mp\gamma^2/400}$. Thus w.p. at least $1 - e^{-mp\gamma^2/500}$, we have
$$T_2 \ \ge\ \frac{(c\|\delta\|)^2}{4} + \frac{(c\|\delta\|)^2}{4} - 2|c|\|\delta\| \cdot \frac{\gamma\sqrt{mp}}{20} - 2|c| \cdot \frac{\|\delta\|^2}{2} \cdot \frac{\gamma\sqrt{mp}}{20}.$$
Since $\|\delta\|^2 < 2$, and for all $\alpha, \beta \in \mathbb{R}$ we have $\alpha^2 - 2\alpha\beta \ge -\beta^2$, we get, w.p. at least $1 - e^{-mp\gamma^2/500}$,
$$T_2 \ \ge\ -mp\gamma^2/50.$$

Now consider $T_1 := \|Z\|^2 - \langle u, Z \rangle^2$. As before, we have that $\Pr[\langle u, Z \rangle^2 > mp\gamma^2/100] < e^{-mp\gamma^2/100}$. We now show that $\|Z\|^2 \ge mp\gamma^2/10$ with extremely high probability. More specifically:

Claim. Let $y \in \mathbb{R}^r$ be given, with $\|y\|^2 = 1$, and let $H'$ be an $m \times r$ matrix with entries drawn i.i.d. from $X_p$. Then
$$\Pr\left[ \|H'y\|^2 < \frac{mp\gamma^2}{10} \right] \ <\ e^{-mp/10}.$$

Proof. Let $Z := H'y$, and let $Z_i$ denote the $i$th component of $Z$. We can check (see Lemma 9 for details) that for each $i$,
$$E[Z_i^2] = \gamma^2\|y\|^2, \qquad (8)$$
$$E[Z_i^4] \le (\gamma^2 + 3\gamma^4)\|y\|^4. \qquad (9)$$
Thus by the Paley–Zygmund inequality, $\Pr[Z_i^2 > \gamma^2\|y\|^2/3] \ge \frac13 \cdot \frac{(\gamma^2\|y\|^2)^2}{(\gamma^2 + 3\gamma^4)\|y\|^4} \ge p/6$ (the last step uses the assumption $p \le 1/2$). Let us say an index $i$ is good if the inequality $Z_i^2 > \gamma^2\|y\|^2/3$ holds. Thus any $i$ is good w.p. at least $p/6$, and further, these events are independent. Thus in expectation there are at least $mp/6$ good indices, and by simple Chernoff bounds, the probability that there are at most $mp/8$ of them is at most $e^{-mp/100}$. If there are at least $mp/8$ good indices, then $\|Z\|^2 \ge (mp/8)\cdot\gamma^2/3 \ge mp\gamma^2/24$. This proves the claim. □

Putting all these together, we showed that $T_1 \ge \frac{mp\gamma^2}{10}$ w.p. at least $1 - e^{-mp\gamma^2/20}$. Also, we saw that $T_2 \ge -\frac{mp\gamma^2}{20}$ w.p. at least $1 - e^{-mp\gamma^2/20}$. This completes the proof of Lemma 2. □

A.3 Proof of Lemma 4

Proof. Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $M$ in decreasing order of magnitude, and let $s$ denote the index beyond which the magnitudes become smaller than $\tau$. Further, let $w_i$ be the corresponding eigenvectors, and let $u = \sum_i \alpha_i w_i$. Then $Mu = \sum_i \alpha_i \lambda_i w_i$, and thus if we write $v = v' + z$, with $v'$ being the component along the 'large' eigenvectors of $M$, we have
$$\|z\|^2 = \sum_{i>s} \lambda_i^2\alpha_i^2 \ <\ \tau^2 \sum_{i>s} \alpha_i^2 \ <\ \varepsilon^2,$$
where the last step uses $\|u\| < \varepsilon/\tau$. □

A.4 Proof of Lemma 6

Proof. Let $u$ be the eigenvector corresponding to the largest eigenvalue of $B$. It is well known (see e.g. [6]) that $u$ is "close" to $1_m$, i.e., $\|u - 1_m\|^2 < 1$ whp. Now from the semicircle law (Section 2), all the eigenvalues of $B$ except the first are at most $3\gamma\sqrt m$ whp. Thus if $w$ is a vector orthogonal to $u$, we have that
$$\|(\lambda I - B)^{-1} w\| \ \ge\ \frac{1}{3\gamma\sqrt m} \cdot \|w\|.$$
Now using the previous lemma (Lemma 5), we obtain
$$\|(\lambda I - B)^{-1} H y\| \ \ge\ \frac{1}{3\gamma\sqrt m} \cdot \frac{\gamma\sqrt{mp}}{8} \ \ge\ \frac{\sqrt p}{24} \quad \text{for all } y \in \mathcal{Y}. \ \square$$

A.5 Proof of Lemma 8

Proof. Suppose, for the sake of contradiction, that $|M_-| \le \log^2 n/p$. By the above, there can only be $\log^2 n/p^2$ indices $i$ with $v_i < 0$ (a small number of components, each small). Since $\lambda$ is not the top eigenvalue, $\langle v, 1_n \rangle \le \frac{C\log n}{\sqrt{np}}$ whp. (cf. [6]), i.e., $\sum_i v_i \le \frac{C\log n}{\sqrt p}$. But using Lemma 7, we get $\sum_i |v_i| \ge \frac{1}{\beta}\sum_i v_i^2 = \frac{1}{\beta}$. Since the sum $\sum_i v_i$ is small, the negative terms must nearly cancel the positive ones; however, there are only $\log^2 n/p^2$ of them, and each has magnitude only $\beta$. Thus we would have a contradiction if
$$\beta \cdot \frac{\log^2 n}{p^2} \ <\ \frac{1}{4\beta}.$$
This simplifies to $p^2 > \log^6 n/n$, which holds as long as $p > n^{-1/2+\varepsilon}$. □
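Finally, the moment computations in Lemma 9 are easy to verify by simulation. In the following sketch (ours; the $\alpha_i$ and the sample size are arbitrary), the empirical second moment should match $S\gamma^2$ closely, and the empirical fourth moment should land below the stated bound:

```python
import numpy as np

rng = np.random.default_rng(3)
p, k, samples = 0.3, 50, 200_000
gamma2 = p * (1 - p)

alpha = rng.standard_normal(k)
S = np.sum(alpha**2)

Y = (rng.random((samples, k)) < p).astype(float) - p   # rows: i.i.d. copies of X_p
Z = Y @ alpha
print("E[Z^2] ~", np.mean(Z**2), "  exact S*gamma^2 =", S * gamma2)
print("E[Z^4] ~", np.mean(Z**4),
      "  bound S^2*(gamma^2 + 3*gamma^4) =", S**2 * (gamma2 + 3 * gamma2**2))
```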