Algorithms on large graphs
László Lovász
Eötvös Loránd University, Budapest
May 2013

The Weak Regularity Lemma

Cut norm of an n×n matrix A:
$\|A\|_\square = \frac{1}{n^2} \max_{S,T \subseteq [n]} \left| \sum_{i \in S} \sum_{j \in T} A_{ij} \right|$

Cut distance of two graphs with V(G) = V(G'):
$d_\square(G, G') = \frac{1}{n^2} \max_{S,T \subseteq V(G)} \left| e_G(S,T) - e_{G'}(S,T) \right|$
(extends to edge-weighted graphs)

The Weak Regularity Lemma

Averaged graph G_P (P a partition of V(G))
Template graph G/P
[Figure: a graph, its averaged graph G_P, and the template graph G/P with weighted edges]

The Weak Regularity Lemma

For every graph G and every $\varepsilon > 0$ there is a partition P with $|P| = 2^{O(1/\varepsilon^2)}$ and $d_\square(G, G_P) < \varepsilon$.
Frieze-Kannan 1999

Algorithms for large graphs

How is the graph given?
- Graph is HUGE.
- Not known explicitly, not even the number of nodes.
Idealize: define minimum amount of info.

Algorithms for large graphs

Dense case: cn^2 edges.
- We can sample a uniform random node a bounded number of times, and see edges between sampled nodes.
"Property testing", constant time algorithms: Arora-Karger-Karpinski, Goldreich-Goldwasser-Ron, Rubinfeld-Sudan, Alon-Fischer-Krivelevich-Szegedy, Fischer, Frieze-Kannan, Alon-Shapira

Algorithms for large graphs

Parameter estimation: edge density, triangle density, maximum cut
Property testing: is the graph bipartite? triangle-free? perfect?
Computing a structure: find a maximum cut, regularity partition,...
Computing a constant-size encoding:
- The partition (cut,...) can be computed in polynomial time.
- For every node, we can determine in constant time which class it belongs to.

Representative set

Representative set of nodes: bounded size, (almost) every node is "similar" to one of the nodes in the set.
When are two nodes similar? Neighbors? Same neighborhood?
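The cut distance and the averaged graph G_P from the Weak Regularity Lemma slides can be checked by brute force on very small graphs. The sketch below is only an illustration, not the constant-time algorithm the talk is about: the function names are hypothetical, the search is exponential in n, and it assumes that G_P averages every block S_i × S_j of the adjacency matrix, diagonal blocks included.

```python
def cut_norm(A):
    """(1/n^2) * max over S, T of |sum_{i in S, j in T} A[i][j]| (brute force)."""
    n = len(A)
    best = 0.0
    for mask in range(1, 2 ** n):          # every nonempty row set S
        rows = [i for i in range(n) if (mask >> i) & 1]
        col_sums = [sum(A[i][j] for i in rows) for j in range(n)]
        # For a fixed S the best T takes all positive or all negative columns.
        pos = sum(c for c in col_sums if c > 0)
        neg = -sum(c for c in col_sums if c < 0)
        best = max(best, pos, neg)
    return best / n ** 2


def averaged_graph(A, classes):
    """Weighted adjacency matrix of G_P: each block is replaced by its average.

    Assumption: diagonal blocks are averaged the same way as off-diagonal ones.
    """
    n = len(A)
    P = [[0.0] * n for _ in range(n)]
    for Si in classes:
        for Sj in classes:
            avg = sum(A[i][j] for i in Si for j in Sj) / (len(Si) * len(Sj))
            for i in Si:
                for j in Sj:
                    P[i][j] = avg
    return P


def cut_distance(A, B):
    """Cut distance of two (weighted) graphs on the same node set."""
    n = len(A)
    D = [[A[i][j] - B[i][j] for j in range(n)] for i in range(n)]
    return cut_norm(D)


# Complete bipartite graph K_{2,2} with sides {0, 1} and {2, 3}.
K22 = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 0]]
exact = cut_distance(K22, averaged_graph(K22, [[0, 1], [2, 3]]))
trivial = cut_distance(K22, averaged_graph(K22, [[0, 1, 2, 3]]))
```

On this example the partition into the two sides reproduces the graph exactly, while the one-class partition replaces every entry by 1/2 and is at a positive cut distance from it.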

Similarity distance of nodes

$d_{\mathrm{sim}}(s,t) := \mathbb{E}_v \left| \mathbb{E}_u(a_{su} a_{vu}) - \mathbb{E}_w(a_{tw} a_{wv}) \right|$
[Figure: nodes s and t joined to v by paths of length 2 through u and w]
This is a metric, computable in the sampling model.

Representative set

Strong representative set U:
- for any two nodes $s,t \in U$, $d_{\mathrm{sim}}(s,t) > \varepsilon$
- for all nodes s, $d_{\mathrm{sim}}(U,s) \le \varepsilon$
Average representative set U:
- for any two nodes $s,t \in U$, $d_{\mathrm{sim}}(s,t) > \varepsilon$
- for a random node s, $\mathbb{E}\, d_{\mathrm{sim}}(U,s) \le 2\varepsilon$

Representative sets and regularity partitions

If $P = \{S_1, \dots, S_k\}$ is a weak regularity partition with error $\varepsilon$, then we can select nodes $v_i \in S_i$ such that $S = \{v_1, \dots, v_k\}$ is an average representative set with error $< 4\varepsilon$.
If $S \subseteq V$ is an average representative set with error $\varepsilon$, then the Voronoi cells of S form a weak regularity partition with error $< 8\sqrt{\varepsilon}$.
L-Szegedy

Representative sets and regularity partitions

Voronoi diagram = weak regularity partition

Representative sets

Every graph has an average representative set with at most $2^{O(1/\varepsilon^2)}$ nodes.
Every graph has a strong representative set with at most $2^{O(\log(1/\varepsilon)/\varepsilon^2)}$ nodes.
Alon: If $S \subseteq V(G)$ and $d_{\mathrm{sim}}(u,v) > \varepsilon$ for all $u,v \in S$, then $|S| = 2^{O(\log(1/\varepsilon)/\varepsilon^2)}$.

Representative sets

Example: every average representative set has $2^{\Omega(1/\varepsilon^2)}$ nodes.
[Figure: example construction; labels: dimension, angle]

Representative sets and regularity partitions

For every graph G and $\varepsilon > 0$ there are $u_i, v_i \in \{0,1\}^{V(G)}$ and $a_i$ such that
$\left\| A_G - \sum_{i=1}^{k} a_i u_i v_i^{T} \right\|_\square < \varepsilon, \qquad k = O\!\left(\frac{1}{\varepsilon^2}\right)$
Frieze-Kannan
$d_{\mathrm{sim}}(s,t) := \mathbb{E}_v \left| \mathbb{E}_u(a_{su} a_{vu}) - \mathbb{E}_w(a_{tw} a_{wv}) \right|$

How to compute a (weak) regularity partition?

Construct weak representative set U.
Each node is in the same class as its closest representative.

How to compute a maximum cut?

- Construct representative set
- Compute weights in template graph (use sampling)
- Compute max cut in template graph
Each node is on the same side as its closest representative.
(A different algorithm is implicit in Frieze-Kannan.)

How to compute a maximum matching?
Given a bigraph with bipartition {U,W} (|U|=|W|=n) and $c \in [0,1]$, find a maximum subgraph with all degrees at most c|U|.

Nondeterministically estimable parameters

Divine help: coloring the nodes, orienting and coloring the edges
G: directed, (edge-)colored graph
G': forget orientation, delete some colors, forget coloring; shadow of G
g: parameter defined on directed, colored graphs
g'(H) = max{g(G): G' = H}; shadow of g
f nondeterministically estimable: f = g', where g is an estimable parameter of colored directed graphs.

Nondeterministically estimable parameters

Examples:
- density of maximum cut (Goldreich-Goldwasser-Ron)
- edit distance from a testable property (Fischer-Newman)
- the graph contains a subgraph G' with all degrees $\le cn$ and $|E(G')| \ge an^2$

Nondeterministically estimable parameters

Every nondeterministically estimable graph parameter is estimable.
L-Vesztergombi
Every nondeterministically estimable graph property is testable.
L-Vesztergombi
N=NP for dense property testing
Proof via graph limit theory: pure existence proof of an algorithm...

How to compute a maximum matching?

More generally, how to compute a witness in non-deterministic property testing?
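
Returning to the regularity-partition procedure from the earlier slides (construct a representative set, then put each node in the class of its closest representative), here is a minimal sketch of that idea. It makes several assumptions that are not in the slides: the function names are hypothetical, the expectations in the similarity distance are computed exactly over all nodes rather than estimated by sampling, and the representative set is built by a simple greedy pass that keeps a node whenever it is farther than eps from every node kept so far. A maximal set built this way satisfies both conditions of a strong representative set, but the slides do not say this is the construction used.

```python
def d_sim(A, s, t):
    """Similarity distance E_v | E_u(a_su a_vu) - E_w(a_tw a_wv) |, computed exactly."""
    n = len(A)
    total = 0.0
    for v in range(n):
        via_s = sum(A[s][u] * A[v][u] for u in range(n)) / n
        via_t = sum(A[t][w] * A[w][v] for w in range(n)) / n
        total += abs(via_s - via_t)
    return total / n


def greedy_representatives(A, eps):
    """Keep a node if it is more than eps away from all representatives so far."""
    reps = []
    for s in range(len(A)):
        if all(d_sim(A, s, r) > eps for r in reps):
            reps.append(s)
    return reps


def voronoi_classes(A, reps):
    """Assign every node to the class of its closest representative."""
    cells = {r: [] for r in reps}
    for s in range(len(A)):
        closest = min(reps, key=lambda r: d_sim(A, s, r))
        cells[closest].append(s)
    return list(cells.values())


# Complete bipartite graph K_{2,2} with sides {0, 1} and {2, 3}.
K22 = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [1, 1, 0, 0],
       [1, 1, 0, 0]]
reps = greedy_representatives(K22, eps=0.1)
classes = voronoi_classes(K22, reps)
```

In this example nodes on the same side have identical neighborhoods and are at similarity distance 0, so one representative per side is kept and the Voronoi classes are the two sides. In the sampling model the sums over u, w and v would be replaced by averages over bounded-size random samples.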