Mean field Ising models
Sumit Mukherjee, Columbia University
Joint work with Anirban Basak, Duke University

What is an Ising model?
An Ising model is a probability distribution on {−1, 1}^n.
The probability mass function of the Ising model is given by
P_n(X = x) = exp( (β/2) x'A_n x − F_n ).
Here A_n is a symmetric n × n matrix with 0 on the diagonal, and β > 0 is a one dimensional parameter usually referred to as the inverse temperature.
F_n is the log partition function, which will be the main focus of this talk.
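To make the definition concrete, here is a minimal brute-force sketch (an illustration added here, not code from the talk): it enumerates all 2^n configurations for a tiny n, computes F_n by a log-sum-exp, and evaluates the probability of one configuration. The 4-cycle example, β = 1, and the helper name log_partition are arbitrary choices.

```python
import itertools
import numpy as np

def log_partition(A, beta):
    """Brute-force F_n = log sum_{x in {-1,1}^n} exp((beta/2) x' A x)."""
    vals = []
    for x in itertools.product([-1, 1], repeat=A.shape[0]):
        x = np.array(x, dtype=float)
        vals.append(0.5 * beta * x @ A @ x)
    vals = np.array(vals)
    top = vals.max()
    return top + np.log(np.exp(vals - top).sum())   # log-sum-exp for stability

# Example input: the 4-cycle, with its adjacency matrix scaled by the degree 2.
n, beta = 4, 1.0
G = np.zeros((n, n))
for i in range(n):
    G[i, (i + 1) % n] = G[(i + 1) % n, i] = 1.0
A = G / 2.0                                         # symmetric, zero diagonal
F = log_partition(A, beta)
x = np.array([1.0, -1.0, 1.0, -1.0])
print(F, np.exp(0.5 * beta * x @ A @ x - F))        # F_n and P_n(X = x)
```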
Applications of Ising model
Ferromagnetism: the Ising model was introduced by Wilhelm Lenz in 1920 to study the orientation of configurations of electrons in ferromagnetic materials. The idea is that if most of the electrons are oriented in the same direction, we observe magnetism.
Lattice gas: Ising models have been used to model the motion of atoms of a gas, by partitioning space into a lattice. Each position is either 1 or −1 depending on whether an atom is present or absent.
Neuroscience: Ising models have been used to study neuron activity in our brains. Here each neuron is +1 if active, and −1 if inactive.
A generalization of the Ising model to more than two states (the Potts model) has been used to study protein folding and image processing.

Commonly studied Ising models
Ferromagnetic Ising model on general graphs: A_n is the adjacency matrix of a finite graph (scaled by the average degree):
Lattice graphs: the original paper by Ising (1925) considered the one dimensional lattice; subsequent focus has been on higher dimensional lattices;
Curie-Weiss model: the complete graph;
Random graphs: Erdős-Rényi, random regular, etc.
Sherrington-Kirkpatrick: A_n is a matrix of i.i.d. N(0, 1/n) entries.
Hopfield model of neural networks: A_n = (1/n) B_n B_n', with B_n a matrix of i.i.d. entries taking values ±1 with probability 1/2.

Why is the log partition function important?
The log partition function gives information about the moments of the distribution. For example,
∂F_n / ∂A_n(i, j) = E(X_i X_j).
Similar statements hold for higher derivatives.
The partition function gives information about the stable configurations under the model.
The uniform distribution on the space of q colorings of a graph is a (limiting) Potts model, and the partition function is the number of q colorings of the graph.
Many estimation techniques in Statistics require knowledge of, or good estimates of, the partition function.
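A quick finite-difference check of the moment identity (my own sketch, not from the slides). One convention caveat: I take β = 1 and perturb the symmetric pair A_n(i,j), A_n(j,i) together, which is the normalization under which the identity reads exactly ∂F_n/∂A_n(i,j) = E(X_i X_j); under other conventions a constant factor appears.

```python
import itertools
import numpy as np

def log_partition(A, beta=1.0):
    vals = []
    for x in itertools.product([-1, 1], repeat=A.shape[0]):
        x = np.array(x, dtype=float)
        vals.append(0.5 * beta * x @ A @ x)
    vals = np.array(vals)
    top = vals.max()
    return top + np.log(np.exp(vals - top).sum())

def correlation(A, i, j, beta=1.0):
    """E(X_i X_j) under the Ising model, by brute force."""
    F = log_partition(A, beta)
    total = 0.0
    for x in itertools.product([-1, 1], repeat=A.shape[0]):
        x = np.array(x, dtype=float)
        total += x[i] * x[j] * np.exp(0.5 * beta * x @ A @ x - F)
    return total

rng = np.random.default_rng(0)
n = 4
B = rng.normal(size=(n, n))
A = (B + B.T) / 2
np.fill_diagonal(A, 0.0)

i, j, eps = 0, 2, 1e-5
Ap, Am = A.copy(), A.copy()
Ap[i, j] += eps; Ap[j, i] += eps        # perturb the symmetric pair together
Am[i, j] -= eps; Am[j, i] -= eps
fd = (log_partition(Ap) - log_partition(Am)) / (2 * eps)
print(fd, correlation(A, i, j))          # the two numbers agree
```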
Mean field technique of Statistical Physics
Let D(·||·) denote the Kullback-Leibler divergence.
Then for any probability measure Q_n on {−1, 1}^n, D(Q_n||P_n) equals
Σ_{x∈{−1,1}^n} Q_n(x) log [Q_n(x)/P_n(x)]
= Σ_{x∈{−1,1}^n} Q_n(x) log Q_n(x) − Σ_{x∈{−1,1}^n} Q_n(x) log P_n(x)
= Σ_{x∈{−1,1}^n} Q_n(x) log Q_n(x) − (β/2) Σ_{x∈{−1,1}^n} Q_n(x) x'A_n x + Σ_{x∈{−1,1}^n} Q_n(x) F_n
= E_{Q_n}{ log Q_n(X) − (β/2) X'A_n X } + F_n.
Since D(Q_n||P_n) ≥ 0 with equality iff Q_n = P_n, we have
F_n ≥ E_{Q_n}{ (β/2) X'A_n X − log Q_n(X) },
with equality iff Q_n = P_n. Equivalently,
F_n = sup_{Q_n} E_{Q_n}{ (β/2) X'A_n X − log Q_n(X) },
where the sup is over all probability measures on {−1, 1}^n.
Instead, if we restrict the sup over product measures Q_n, we get a lower bound
F_n ≥ sup_{Q_n product} E_{Q_n}{ (β/2) X'A_n X − log Q_n(X) }.   (1)
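A numerical check (mine) of the variational identity just derived: for any pmf Q_n on {−1, 1}^n the deficit F_n − E_{Q_n}{(β/2) X'A_n X − log Q_n(X)} equals D(Q_n||P_n), so it is nonnegative and vanishes at Q_n = P_n. The size n = 4, β = 1.2, and the random A_n are arbitrary.

```python
import itertools
import numpy as np

rng = np.random.default_rng(7)
n, beta = 4, 1.2
B = rng.normal(size=(n, n)); A = (B + B.T) / 2; np.fill_diagonal(A, 0.0)

configs = [np.array(x, dtype=float) for x in itertools.product([-1, 1], repeat=n)]
w = np.array([0.5 * beta * x @ A @ x for x in configs])
F = w.max() + np.log(np.exp(w - w.max()).sum())          # F_n by brute force
P = np.exp(w - F)                                        # the Ising pmf

def gibbs_objective(Q):
    """E_Q[(beta/2) X'A X - log Q(X)] for a pmf Q on {-1,1}^n."""
    return np.sum(Q * (w - np.log(Q)))

Q = rng.random(len(configs)); Q /= Q.sum()               # an arbitrary pmf
print(F - gibbs_objective(Q), np.sum(Q * np.log(Q / P))) # equal, and >= 0
print(F - gibbs_objective(P))                            # zero at Q = P_n
```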
Under a product measure Q_n, each X_i is independently distributed with mean m_i ∈ [−1, 1], say.
Then X_i takes the values {1, −1} with probabilities (1 + m_i)/2 and (1 − m_i)/2 respectively.
Also Q_n(x) becomes Π_{i=1}^n Q_{n,i}(x_i), which on taking expectations gives
E_{Q_n} log Q_n(X) = Σ_{i=1}^n E log Q_{n,i}(X_i)
= Σ_{i=1}^n [ ((1 + m_i)/2) log((1 + m_i)/2) + ((1 − m_i)/2) log((1 − m_i)/2) ].
Thus the RHS of (1) becomes
(β/2) m'A_n m + Σ_{i=1}^n I(m_i),
where I(x) = −((1 + x)/2) log((1 + x)/2) − ((1 − x)/2) log((1 − x)/2) is the binary entropy function.
Combining we have F_n ≥ M_n, where
M_n = sup_{m∈[−1,1]^n} { (β/2) m'A_n m + Σ_{i=1}^n I(m_i) }.
This lower bound on the log partition function is referred to as the mean field prediction.
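An illustrative sketch (not the authors' code) of how one might approximate M_n in practice: iterate the first order condition of the objective, m_i = tanh(β (A_n m)_i), from several random starts and keep the best value. The iteration only finds a stationary point, so the output is a lower bound on M_n, which in turn is a lower bound on the brute-force F_n computed here for a tiny complete graph.

```python
import itertools
import numpy as np

def I(m):
    """Binary entropy of a +/-1 variable with mean m."""
    p = np.clip((1 + m) / 2, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def log_partition(A, beta):
    vals = []
    for x in itertools.product([-1, 1], repeat=A.shape[0]):
        x = np.array(x, dtype=float)
        vals.append(0.5 * beta * x @ A @ x)
    vals = np.array(vals)
    top = vals.max()
    return top + np.log(np.exp(vals - top).sum())

def mean_field_value(A, beta, n_starts=20, iters=500, seed=0):
    """Approximate M_n by iterating m <- tanh(beta * A m) from random starts."""
    rng = np.random.default_rng(seed)
    best = -np.inf
    for _ in range(n_starts):
        m = rng.uniform(-1, 1, size=A.shape[0])
        for _ in range(iters):
            m = np.tanh(beta * A @ m)
        best = max(best, 0.5 * beta * m @ A @ m + I(m).sum())
    return best

n, beta = 8, 1.5
A = (np.ones((n, n)) - np.eye(n)) / (n - 1)    # complete graph scaled by degree
print(mean_field_value(A, beta), log_partition(A, beta))   # M_n <= F_n
```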
Eg: Mean field prediction for Regular graphs
For a regular graph with degree d_n the matrix A_n(i, j) is taken to be 1/d_n if i ∼ j, and 0 otherwise.
This scaling ensures that the resulting Ising model is non trivial, for all values of d_n.
By our scaling, each row sum of A_n equals 1.

Upper bound for M_n
Thus the Perron-Frobenius theorem gives λ_∞(A_n) = 1. This gives
M_n ≤ sup_{m∈[−1,1]^n} { (β/2) m'm + Σ_{i=1}^n I(m_i) }
= sup_{m∈[−1,1]^n} Σ_{i=1}^n { (β/2) m_i² + I(m_i) }
= Σ_{i=1}^n sup_{m_i∈[−1,1]} { (β/2) m_i² + I(m_i) }
= n sup_{m∈[−1,1]} { (β/2) m² + I(m) }.

Lower bound for M_n
Also, by restricting the sup to m ∈ [−1, 1]^n such that m_i = m is the same for all i, we have
M_n ≥ sup_{m∈[−1,1]} { (β/2) m² 1'A_n 1 + n I(m) }
= n sup_{m∈[−1,1]} { (β/2) m² + I(m) }.
In the last line we use A_n 1 = 1.
Thus combining we have
M_n = n sup_{m∈[−1,1]} { (β/2) m² + I(m) }.

Solving the optimization
Consider the optimization of the function m ↦ (β/2) m² + I(m) on [−1, 1].
Differentiating with respect to m gives the equation m = tanh(βm).
If β ≤ 1 then this equation has a unique solution m = 0.
If β > 1 then this equation has three solutions, {0, m_β, −m_β}.
Of these, 0 is a local minimizer, and ±m_β are global maximizers.
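A small sketch (mine) of the scalar optimization: solve m = tanh(βm) by fixed-point iteration starting from m = 1, and evaluate (β/2) m² + I(m) at the solution and at m = 0.

```python
import numpy as np

def I(m):
    p = np.clip((1 + m) / 2, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def m_beta(beta, iters=20000):
    """Largest solution of m = tanh(beta*m), found by iterating from m = 1."""
    m = 1.0
    for _ in range(iters):
        m = np.tanh(beta * m)
    return m

for beta in [0.5, 1.0, 1.5, 2.0]:
    m = m_beta(beta)
    print(beta, m, 0.5 * beta * m**2 + I(m), I(0.0))
# For beta <= 1 the iterate drifts to m = 0 and the optimum is I(0) = log 2;
# for beta > 1 it settles at m_beta > 0, which gives a strictly larger value.
```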
So?
For any sequence of d_n regular graphs, we have
liminf_{n→∞} (1/n) F_n ≥ sup_{m∈[−1,1]} { (β/2) m² + I(m) }.
If the graph happens to be the complete graph (d_n = n − 1), then using large deviation arguments for i.i.d. random variables one can show the bound is tight.
It is also known that equality does not hold in the above limit if the graph happens to be a random d_n regular graph with d_n = d fixed.
This raises the natural question: "For which sequences of regular graphs is the above bound tight?"

Regular graphs continued
In a regular graph the scaled adjacency matrix has maximum eigenvalue 1, with eigenvector (1/√n) 1.
Assume that all other eigenvalues are o_n(1). Thus we have λ_∞(A_n − (1/n) 11') = o_n(1).
For any x ∈ {−1, 1}^n, this gives
x'(A_n − (1/n) 11') x ≈ o_n(1) x'x = o(n)
⇒ x'A_n x = (1/n) (Σ_{i=1}^n x_i)² + o(n) = (1/n) [n + Σ_{i≠j} x_i x_j] + o(n).
Summing over x ∈ {−1, 1}^n we have
Σ_{x∈{−1,1}^n} e^{(β/2) x'A_n x} ≈ e^{o(n)} Σ_{x∈{−1,1}^n} e^{(β/2n) Σ_{i≠j} x_i x_j}.
Up to an o(n) term, the RHS above is exactly the partition function of the Curie-Weiss model.
Thus using the Curie-Weiss solution we readily get
lim_{n→∞} (1/n) F_n = sup_{m∈[−1,1]} { (β/2) m² + I(m) }.
The above heuristic is rigorous, and so we get the same asymptotics for any regular graph, as soon as the matrix A_n has only one dominant eigenvalue, i.e. there is a spectral gap.
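A numerical sanity check I added: for the complete graph (the Curie-Weiss model itself), (1/n) F_n computed by brute force for small n already sits close to the variational limit sup_m {(β/2) m² + I(m)} and approaches it from above, consistent with F_n ≥ M_n = n · sup.

```python
import itertools
import numpy as np

def I(m):
    p = np.clip((1 + m) / 2, 1e-12, 1 - 1e-12)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def log_partition(A, beta):
    vals = []
    for x in itertools.product([-1, 1], repeat=A.shape[0]):
        x = np.array(x, dtype=float)
        vals.append(0.5 * beta * x @ A @ x)
    vals = np.array(vals)
    top = vals.max()
    return top + np.log(np.exp(vals - top).sum())

beta = 1.5
grid = np.linspace(-1, 1, 20001)
limit = np.max(0.5 * beta * grid**2 + I(grid))     # sup_m (beta/2) m^2 + I(m)

for n in [6, 10, 14]:
    A = (np.ones((n, n)) - np.eye(n)) / (n - 1)    # K_n scaled by its degree
    print(n, log_partition(A, beta) / n, limit)
# (1/n) F_n approaches the limit from above as n grows.
```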
Example: Erdős-Rényi graphs
Suppose G_n is the adjacency matrix of an Erdős-Rényi graph on n vertices with parameter p_n.
In this case we choose A_n to be G_n scaled by n p_n, which is the average degree.
If p_n ≫ (log n)/n then λ_∞(G_n) ≈ n p_n (Krivelevich-Sudakov '03), and λ_∞(G_n − p_n 11') = O_p(√(n p_n)) (Feige-Ofek '05).
Thus λ_∞(A_n − (1/n) 11') = O_p(1/√(n p_n)) = o_n(1), as n p_n ≫ log n.
Thus it follows that
lim_{n→∞} (1/n) F_n = sup_{m∈[−1,1]} { (β/2) m² + I(m) }.
If p_n ≪ (log n)/n then the estimates for λ_1(G_n) and λ_∞(G_n − p_n 11') are not as nice.
Also we know the mean field prediction does not hold when n p_n is constant.
Thus a natural question is where the mean field approximation breaks down.
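A quick numpy check (mine) of the spectral statements: for an Erdős-Rényi graph with n p_n well above log n, the top eigenvalue of A_n is close to 1 while the largest absolute eigenvalue of A_n − (1/n)11' is of order 1/√(n p_n). For simplicity I scale by the expected average degree n p rather than the realized one.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 1500, 0.08                       # n*p = 120, well above log n ~ 7.3
U = rng.random((n, n)) < p
G = np.triu(U, 1)
G = (G + G.T).astype(float)             # symmetric 0/1 adjacency, zero diagonal
A = G / (n * p)                         # scaled by the (expected) average degree

print(np.linalg.eigvalsh(A)[-1])                      # close to 1
centered = A - np.ones((n, n)) / n
print(np.max(np.abs(np.linalg.eigvalsh(centered))))   # small, about 2/sqrt(n p)
```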
Example: Random d_n regular graphs
Suppose G_n is the adjacency matrix of a random d_n regular graph, and A_n is G_n scaled by d_n.
In this case λ_1(G_n) = d_n, and λ_2(G_n) = o(d_n) as soon as d_n ≥ (log n)^γ for some γ (Chung-Lu-Vu '03, Coja-Oghlan-Lanka '09).
Thus a similar argument goes through, proving that the mean field prediction is asymptotically correct if d_n ≥ (log n)^γ.
Again there is a question of how far one can stretch this result.
It is known that the mean field prediction is not correct if d_n = d is fixed.

What is the underlying picture?
If the random graph is dense enough, then the mean field prediction is expected to hold via the dominant eigenvalue argument.
How dense is dense enough depends on the underlying model.
None of the results mentioned above applies to a specific sequence of non random graphs.
However, the mean field prediction has this very nice form for any regular graph.

Example: The Hypercube
Suppose G_n is the adjacency matrix of the d dimensional hypercube {0, 1}^d.
In this case the number of vertices is n = 2^d.
G_n has eigenvalues d − 2i with multiplicity (d choose i), for 0 ≤ i ≤ d.
Thus the eigenvalues of G_n range over [−d, d], and there is no dominant eigenvalue.
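The hypercube spectrum can be checked directly for a small d (my own sketch): build the adjacency matrix of {0,1}^d via the differ-in-one-bit rule and compare the eigenvalue multiplicities with (d choose i).

```python
import numpy as np
from math import comb

d = 8
n = 2 ** d
verts = np.arange(n)
xor = verts[:, None] ^ verts[None, :]               # bitwise difference of labels
ones = np.array([bin(v).count("1") for v in xor.ravel()]).reshape(n, n)
G = (ones == 1).astype(float)                       # adjacent iff labels differ in one bit

eigs = np.round(np.linalg.eigvalsh(G)).astype(int)
vals, counts = np.unique(eigs, return_counts=True)
for v, c in zip(vals, counts):
    print(v, c, comb(d, (d - v) // 2))              # multiplicity of d - 2i is C(d, i)
```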
[Figure: histogram of the eigenvalues of the hypercube with d = 50.]

Example: The hypercube
In this case d = d_n = log_2 n, which is somewhere between sparse and dense graphs.
There is essentially no spectral gap, and hence the dominant eigenvalue argument fails.
Thus it is not clear even at a heuristic level whether the mean field prediction should be correct.

A more general question
More generally, for a general sequence of matrices A_n (not necessarily adjacency matrices), we pose the question of when the mean field prediction is tight.
Thus we look for sufficient conditions on the sequence of matrices A_n such that
lim_{n→∞} (1/n) (F_n − M_n) = 0.
If this holds, then we are able to reduce the asymptotics of F_n to the asymptotics of M_n.
A follow up question is how to solve the optimization problem in M_n.
As we checked, the optimization problem is easy for regular graphs.
The proposed solution
We claim that a sufficient condition for the mean field prediction to hold is
Σ_{i,j=1}^n A_n(i, j)² = tr(A_n²) = Σ_{i=1}^n λ_i² = o(n).
This basically says that the empirical eigenvalue distribution converges to 0 in L².
One way to think about this claim is that in this case most of the eigenvalues are o(1).
Thus the number of dominant eigenvalues is o(n).

Example: Ising models on graphs
In this case A_n is the adjacency matrix of G_n scaled by the average degree 2E(G_n)/n.
Thus we have
Σ_{i,j=1}^n A_n(i, j)² = Σ_{i,j=1}^n (n²/(4E(G_n)²)) 1{i ∼ j} = n²/(2E(G_n)).
The RHS is o(n) iff E(G_n) ≫ n.
Thus mean field prediction holds as soon as the graph has super linear edges.
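A one-screen check (mine) of the displayed identity: with A_n the adjacency matrix scaled by the average degree 2E(G_n)/n, the quantity Σ_{i,j} A_n(i,j)² equals n²/(2E(G_n)), so the o(n) condition is exactly a superlinear number of edges. The Erdős-Rényi sample below is just a convenient test graph.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 400, 0.1
U = rng.random((n, n)) < p
G = np.triu(U, 1)
G = (G + G.T).astype(float)

E = G.sum() / 2                        # number of edges
A = G / (2 * E / n)                    # adjacency scaled by the average degree

lhs = (A ** 2).sum()                   # sum_{i,j} A_n(i,j)^2 = tr(A_n^2)
print(lhs, n ** 2 / (2 * E), np.trace(A @ A))   # all three agree
print(lhs / n)                         # small exactly when E(G_n) >> n
```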
Example: Erdős-Rényi graphs
In this case 2E(G_n) ≈ n² p_n.
This is super linear as soon as n p_n → ∞.
Also if n p_n = c is fixed, then the mean field prediction does not hold.
This gives a complete picture for Erdős-Rényi graphs.

Example: Regular graphs
In this case 2E(G_n) = n d_n.
This is super linear as soon as d_n → ∞.
Thus the mean field prediction holds for any sequence of d_n regular graphs, as soon as d_n → ∞.
In particular this covers the hypercube, as d_n = log_2 n.
For d_n = d fixed the mean field prediction does not hold.

Sanity check: SK Model
In this case A_n(i, j) = Z(i, j)/√n, where {Z(i, j), 1 ≤ i < j ≤ n} are i.i.d. N(0, 1).
This gives
(1/n) Σ_{i,j=1}^n A_n(i, j)² = (1/n²) Σ_{i,j=1}^n Z(i, j)² → 1 in probability.
Thus we have Σ_{i,j=1}^n A_n(i, j)² = Θ_p(n), and so the mean field condition does not hold.
This is what should be the case, as the mean field prediction does not hold for the SK Model.

Comments about our results
Our results allow for the presence of an external magnetic field term of the form B Σ_{i=1}^n x_i in the Ising model.
Our results apply to the Potts model as well, where we have q states instead of 2.
There is a technical condition required on the matrix A_n, which is
sup_{x∈[−1,1]^n} Σ_{i=1}^n | Σ_{j=1}^n A_n(i, j) x_j | = O(n).
We don't know of a single model in the literature which violates this assumption, be it mean field or otherwise (Ising models on all graphs, the SK Model, the Hopfield Model).
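The SK sanity check in code (mine): with A_n(i,j) = Z(i,j)/√n, the normalized sum of squares concentrates near 1, so Σ_{i,j} A_n(i,j)² is of order n and the sufficient condition fails.

```python
import numpy as np

rng = np.random.default_rng(3)
for n in [200, 1000, 2000]:
    Z = rng.normal(size=(n, n))
    Z = np.triu(Z, 1)
    Z = Z + Z.T                        # symmetric, zero diagonal
    A = Z / np.sqrt(n)
    print(n, (A ** 2).sum() / n)       # approaches 1, so the sum is Theta(n)
```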
What about the optimization problem?
Our general sufficient condition gives a wide class of graphs for which the mean field prediction holds.
This means that for Ising models on such graphs we have lim_{n→∞} (1/n) (F_n − M_n) = 0.
Recall that
M_n = sup_{m∈[−1,1]^n} { (β/2) m'A_n m + Σ_{i=1}^n I(m_i) }.
Solving this for regular graphs was easy. What can we say about other graphs?

Approximately regular graphs
Suppose all the degrees d_i are approximately close to d̄, with d̄ → ∞.
Then one expects the same asymptotics for F_n as in the regular graph case.
We show this is the case under the assumption that the empirical distribution (1/n) Σ_{i=1}^n δ_{d_i/d̄} → δ_1.
As a by-product of our analysis, we show that X̄ satisfies the same large deviations for all approximately regular graphs in the ferromagnetic domain (β > 0), as soon as the average degree goes to +∞.
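A quick look (mine) at the approximate-regularity assumption in the simplest random example: for a dense Erdős-Rényi graph the ratios d_i/d̄ concentrate around 1, with spread of order 1/√(n p).

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 3000, 0.05                       # average degree about 150
U = rng.random((n, n)) < p
G = np.triu(U, 1)
G = (G + G.T).astype(int)

deg = G.sum(axis=1)
ratios = deg / deg.mean()               # d_i / d-bar
print(ratios.min(), ratios.max(), ratios.std())   # std is roughly 1/sqrt(n p)
```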
Ising model on bi-regular bi-partite graphs
Let G_n be a sequence of bi-regular bi-partite graphs with bipartition sizes a_n and b_n.
Thus we have a_n + b_n = n, the total number of vertices.
Assume that lim_{n→∞} a_n/n = p ∈ (0, 1).
This removes the possibility that one of the partitions dominates.
Assume that the two sides of the bipartition have constant degrees c_n and d_n respectively.
Then we must have a_n c_n = b_n d_n, which equals the total number of edges.
Thus for E(G_n) to be super linear, we need c_n (or equivalently d_n) to go to +∞.
Choose A_n to be the adjacency matrix of G_n scaled by 1/(c_n + d_n).
The asymptotics of F_n/n is given by
sup_{m_1,m_2∈[−1,1]} { β p(1 − p) m_1 m_2 + p I(m_1) + (1 − p) I(m_2) }.
Thus again the asymptotics is universal for any sequence of bi-regular bi-partite graphs, as soon as the average degree goes to +∞.
Differentiating with respect to m_1, m_2 we get the equations
m_1 = tanh(β(1 − p) m_2),  m_2 = tanh(βp m_1).
If |β| ≤ 1/√(p(1 − p)) then there is a unique solution (m_1, m_2) = (0, 0).
If β > 1/√(p(1 − p)), then there are two solutions (x_{β,p}, y_{β,p}) and (−x_{β,p}, −y_{β,p}).
If β < −1/√(p(1 − p)), then there are again two solutions (x_{β,p}, −y_{β,p}) and (−x_{β,p}, y_{β,p}).
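A small sketch (mine) of the bipartite stationarity equations: iterate them from (1, 1). Below the threshold 1/√(p(1 − p)) the pair collapses to (0, 0); above it, a nonzero solution (x_{β,p}, y_{β,p}) appears. The value p = 0.3 is arbitrary.

```python
import numpy as np

def bipartite_fixed_point(beta, p, iters=5000):
    m1, m2 = 1.0, 1.0                  # start from the all-plus corner
    for _ in range(iters):
        m1, m2 = np.tanh(beta * (1 - p) * m2), np.tanh(beta * p * m1)
    return m1, m2

p = 0.3
print("threshold:", 1 / np.sqrt(p * (1 - p)))      # about 2.18 for p = 0.3
for beta in [1.5, 2.0, 2.5, 3.0]:
    print(beta, bipartite_fixed_point(beta, p))
# beta below the threshold: the pair drifts to (0, 0); above it: a nonzero pair.
```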
Difference between K_n and K_{n/2,n/2}
Consider the Ising model on two graphs, the complete graph and the complete bi-partite graph.
If β > 0 then both models show the same asymptotic behavior.
In particular, there is a phase transition in β.
If β < 0 then the two models behave differently.
The Ising model on the complete bi-partite graph shows a phase transition, whereas that on the complete graph does not.

Graphs converging in cut metric
Given a graph G_n on n vertices, one can define a function W_{G_n} on the unit square as follows:
Partition the unit square into n² squares, each of equal size.
Define W_{G_n} to be equal to 1 on the (i, j)th box if (i, j) is an edge in G_n, and 0 otherwise.
Thus W_{G_n} is a function on the unit square, with ∫_{[0,1]²} |W_{G_n}(x, y)| dx dy = 2E(G_n)/n².
To proceed in a manner which works across edge densities, scale the function W_{G_n} by 2E(G_n)/n², and call the resulting function W̃_{G_n}.
Thus after scaling we have ∫_{[0,1]²} |W̃_{G_n}(x, y)| dx dy = 1.
Consider the space of symmetric measurable functions W ∈ L¹[0, 1]², to be henceforth referred to as graphons.
For any graphon W, define the cut norm ||W|| = sup_{S,T⊂[0,1]} | ∫_{S×T} W(x, y) dx dy |.
We will say a sequence of graphs G_n converges in cut metric to W if the functions W̃_{G_n} converge in cut norm to W.
As an example, the complete graph converges to W_1 ≡ 1.
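A tiny sketch (mine) of the scaled step function W̃_{G_n} for the complete graph: its cells take the value n/(n − 1) off the diagonal, it integrates to 1 by construction, and its L¹ distance to the constant graphon 1 (which dominates the cut norm) is 2/n.

```python
import numpy as np

def scaled_graphon_matrix(G):
    """Cell values of W~_{G_n}: the 0/1 step function divided by 2E(G_n)/n^2."""
    n = G.shape[0]
    E = G.sum() / 2
    return G * n ** 2 / (2 * E)

n = 6
G = np.ones((n, n)) - np.eye(n)          # complete graph K_n
W = scaled_graphon_matrix(G)
print(np.abs(W).mean())                  # the integral over [0,1]^2: equals 1
print(np.abs(W - 1).mean())              # L1 distance to the limit W == 1: 2/n
```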
It follows from (Borgs et al. '14) that if a sequence of graphs G_n converges to W in cut metric, then |E(G_n)| ≫ n.
Thus the mean field condition holds in this case, and so it suffices to consider the asymptotics of M_n.
By a simple analysis of M_n, the limiting log partition function is
sup_{m(·)} { (β/2) ∫_{[0,1]²} m(x) m(y) W(x, y) dx dy + ∫_{[0,1]} I(m(x)) dx }.
Here the supremum is over all measurable functions m : [0, 1] → [−1, 1].
This result was already proved in (Borgs et al. '14).
In our paper we re-derive this result as a consequence of the mean field prediction, to illustrate the flexibility of this approach.
As an immediate application, we obtain the phase transition point for such Ising models, which is β = 1/λ_1(W) ≈ 1/λ_1(A_n).
Proof Strategy
The main tool for proving the mean field prediction is a large deviation estimate for binary variables (Chatterjee-Dembo '14).
They give a general technique to prove mean field type predictions for exponential families of the form e^{f(x)} on binary random variables x ∈ {−1, 1}^n.
The idea is that if the gradient ∇f(x) can be expressed in o(n) bits, then the mean field prediction works.
In our case f(x) = (1/2) x'A_n x for a symmetric matrix A_n.

E.g. Ising model on Complete graph
For a specific example take f(x) = (1/n) Σ_{1≤i<j≤n} x_i x_j.
This is the Curie-Weiss Ising model on the complete graph.
A direct calculation gives
∂f/∂x_i (x) = (1/n) Σ_{j≠i} x_j = x̄ − x_i/n.
This is approximately x̄, and so every component of ∇f(x) can be expressed by one bit of information, x̄.

E.g. Ising model on Z¹
For an example of a model that is not mean field, take f(x) = Σ_{i=1}^{n−1} x_i x_{i+1}.
This is the Ising model on the one dimensional integer lattice.
In this case we have
∂f/∂x_i (x) = x_{i−1} + x_{i+1} for 2 ≤ i ≤ n − 1.
This vector is no longer expressible in terms of o(n) many quantities.
Thus the Ising model on the complete graph is mean field, whereas the Ising model on Z¹ is not.
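An illustration (mine) of the o(n)-bits criterion: for the Curie-Weiss f the whole gradient is determined, up to O(1/n), by the single number x̄, while for the one dimensional chain each coordinate of the gradient genuinely depends on its two neighbours.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x = rng.choice([-1.0, 1.0], size=n)

# Curie-Weiss: f(x) = (1/n) sum_{i<j} x_i x_j, so df/dx_i = x_bar - x_i/n.
grad_cw = (x.sum() - x) / n
print(np.max(np.abs(grad_cw - x.mean())))    # O(1/n): the single number x_bar suffices

# One dimensional chain: f(x) = sum_i x_i x_{i+1}, so df/dx_i = x_{i-1} + x_{i+1}.
grad_chain = np.zeros(n)
grad_chain[1:] += x[:-1]
grad_chain[:-1] += x[1:]
print(np.unique(grad_chain))                 # few distinct values, but knowing which
                                             # coordinate takes which one needs ~n bits
```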
Extending to general matrices
In general if f(x) = (1/2) x'A_n x then ∇f(x) = A_n x.
Assume A_n has the spectral decomposition A_n = Σ_{i=1}^n λ_i p_i p_i'.
Recall that our mean field assumption is Σ_{i=1}^n λ_i² = o(n), which says most eigenvalues are o(1).
Thus A_n x ≈ Σ_{i=1}^K λ_i p_i p_i' x with K = o(n) (a numerical sketch of this truncation is given after the open questions below).
This is expressible using at most K bits of information, {p_i'x}_{i=1}^K, which is why the resulting model is mean field.

Open questions?
Extend the argument to other exponential families, such as ERGMs. Think of edges in graphs as binary random variables.
Find for what values of the parameters the mean field prediction is tight. Partially addressed in Chatterjee-Dembo; the results are almost sharp.
Solve the optimization problem for more graph ensembles.
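Backup illustration (mine, not a slide from the talk) of the truncation on the "Extending to general matrices" slide: for a scaled Erdős-Rényi matrix, keeping only the eigenvalues above a small cutoff leaves K = o(n) terms, and discarding the rest changes A_n x by at most the cutoff times the norm of x.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 1500, 0.1
U = rng.random((n, n)) < p
G = np.triu(U, 1)
G = (G + G.T).astype(float)
A = G / (n * p)                               # Erdos-Renyi adjacency, scaled

eigval, eigvec = np.linalg.eigh(A)
keep = np.abs(eigval) > 0.2                   # the "dominant" eigenvalues
K = int(keep.sum())

x = rng.choice([-1.0, 1.0], size=n)
full = A @ x
lowrank = eigvec[:, keep] @ (eigval[keep] * (eigvec[:, keep].T @ x))
print(K, n)                                                # K is tiny compared to n
print(np.linalg.norm(full - lowrank) / np.linalg.norm(x))  # at most the cutoff 0.2
```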