Property Testing in Sparse and General Graphs

Michael Krivelevich
Tel Aviv University
Graph Property Testing
Very general setting:
P = graph property to test
(k-colorability, planarity, non-existence of a copy of H, etc.)
Input: graph G on n vertices, n→∞
Should be
Promise: G ∈ P (positive)
or: G is ε-far from P (negative)
(an ε-fraction of the description of G must be changed to get H ∈ P)
Algorithm A (typically randomized): queries the description of G
G ∈ P ⇒ Pr[A accepts G] ≥ 2/3
G is ε-far from P ⇒ Pr[A rejects G] ≥ 2/3
If Pr[A accepts G] = 1 for every G ∈ P:
one-sided error algorithm
Property Testing in Dense Graphs
- Formally defined in GGR’98
(appeared implicitly in combinatorial papers in 70’s, 80’s)
Input graph description: adjacency matrix
G=(V,E), V=[n]
a_ij = 1 if (i,j) ∈ E(G), and a_ij = 0 otherwise
Algorithm: queries the adjacency matrix of G
Query: is (i,j) ∈ E(G)?
(vertex pair query)
Distance: G is ε-far from P if ≥ εn² entries in A(G) need to
be changed to get H ∈ P
Property Testing in Dense Graphs – Brief
“… It’s all about REGULARITY.” (AFNS’06)
• Very strong (and fruitful) connection between property
testing in dense graphs and the Szemerédi Regularity
Lemma and its versions
(started in AFKS’99 and culminated in AFNS’06)
• Have reached very good understanding of this setting
(though of course quite a few challenging problems
Dense Graph Model - limitations
• Suitable/tailored for dense graphs only
• Degenerate for many graph properties
Ex. : P = “ G is connected”
- Always answer “YES”
( dist(G,P) ≤ n-1 << εn² )
• A typical algorithm:
- sample S ⊆ [n], |S|=O(1)
- look inside to check whether G[S] ∈ P
- a.s. sees an empty graph G[S] when |E(G)|=o(n²)
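The last item can be checked with a short calculation: for a uniformly random s-subset S, the expected number of edges in G[S] is |E|·s(s-1)/(n(n-1)), which tends to 0 when |E| = o(n²) and s = O(1). The sketch below is illustrative only; the helper names and the choice of a cycle as the sparse input are assumptions, not part of the slides.

```python
import random

def cycle_edges(n):
    """Edge set of the cycle on vertices 0..n-1 (a sparse graph with |E| = n)."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}

def sample_induced_edges(n, edges, s, rng):
    """Sample a uniform s-subset S of [n] and return the edges of G[S]."""
    S = rng.sample(range(n), s)
    return [(u, v) for i, u in enumerate(S) for v in S[i + 1:]
            if frozenset((u, v)) in edges]

def expected_induced_edges(n, m, s):
    """E[|E(G[S])|] = m * s(s-1) / (n(n-1)) for a uniform s-subset S."""
    return m * s * (s - 1) / (n * (n - 1))

if __name__ == "__main__":
    n, s = 10**6, 10
    # For a cycle on a million vertices a 10-vertex sample almost never sees an edge.
    print(expected_induced_edges(n, n, s))
```

With n = 10⁶ and s = 10 the expectation is below 10⁻⁴, so a constant-size sample carries essentially no information about a sparse input.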
Property Testing in Bounded Degree Graphs
Introduced by GR’97
• Assumption: Δ(input graph G) ≤ d=const;
ε<< 1/d
• Graph representation: by incidence lists
L(v_i) = (v_{i,1},…,v_{i,d}) – list of neighbors of v_i
• Query: who is the j-th neighbor of v_i?
(neighbor query)
• Distance: G is ε-far from P if ≥ εdn modifications in the
incidence lists are needed to get H ∈ P
Bounded Degree Graphs – an Example
Th. (GR’97): Connectivity in the bounded degree model can be
tested in O(1/ε²) queries
Proof: Assume: G is ε-far from being connected
⇒ G has ≥ εn connected components
⇒ G has ≥ εn/2 connected components of size ≤ 2/ε
(= small components; fewer than εn/2 disjoint components can have size > 2/ε)
⇒ ≥ ε/2 fraction of all vertices lie in small components
Property Testing in Bounded Degree Graphs (cont.)
Algorithm: Repeat O(1/ε) times:
1. Sample a random vertex v ∈_R V
2. Explore the connected component C(v) of v until more than
2/ε vertices have been found
3. If |C(v)| ≤ 2/ε – reject
If never rejected – accept
One-sided error algorithm with complexity O(1/ε²)
More careful analysis
⇒ Õ(1/ε) queries
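The simple O(1/ε²) version of the algorithm above can be sketched as follows. This is a minimal illustration, not the tester as analyzed in GR'97: the graph is assumed to be given as a list of adjacency lists, the constant `c` in the number of repetitions is an arbitrary choice, and the guard `len(comp) < n` (so that a small connected graph is not rejected) is an addition for the sketch.

```python
import math
import random
from collections import deque

def explore(adj, v, limit):
    """BFS from v that stops once more than `limit` vertices are found.
    If the returned set has at most `limit` vertices, it is all of C(v)."""
    seen = {v}
    queue = deque([v])
    while queue and len(seen) <= limit:
        u = queue.popleft()
        for w in adj[u]:              # each step corresponds to a neighbor query
            if w not in seen:
                seen.add(w)
                queue.append(w)
                if len(seen) > limit:
                    break
    return seen

def connectivity_tester(adj, eps, rng=random, c=4):
    """Return False (reject) only if a small connected component is found."""
    n = len(adj)
    limit = math.ceil(2 / eps)
    for _ in range(math.ceil(c / eps)):   # O(1/eps) repetitions, c is an assumption
        v = rng.randrange(n)
        comp = explore(adj, v, limit)
        if len(comp) <= limit and len(comp) < n:
            return False                  # witnessed a component of size <= 2/eps
    return True
```

A connected graph is never rejected, since rejection requires exhibiting a full component smaller than the graph; this is the one-sided error property.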
Testing bounded degree graphs – basic tools
• Random sampling
• Local search
(exploring the neighborhood/ball of a vertex)
• Random walks
(a random neighbor of a random neighbor of a random
Bounded degree – first results
Results from GR’97:
Can test:
connectivity in Õ(1/ε) queries
2-edge connectivity: Õ(1/ε²)
3-edge connectivity: Õ(1/ε³)
k-vertex connectivity, k=2,3: Õ(1/ε^k)
- one-sided error algorithms
Uses structural
connectivity results
(block, cactus, etc.)
- cycle-freeness in O(1/ε³) queries
- two-sided error algorithm
Proof idea: G is ε-far from a forest ⇒ many small components
with a cycle, or large components Ci with large surplus e(Ci)-v(Ci)
Testing bipartiteness in bounded degree graphs
P = “G is bipartite”
Lower bound (GR’97): Ω(√n) queries
- in very sharp contrast to the dense case
Proof idea:
Negative distribution DN= Hamilton cycle + random perfect matching
(O(1)-far from being bipartite a.s.)
Positive distribution DP=Hamilton cycle + random perfect matching between
vertices of different parity
Any tester: can’t distinguish between DP, DN before having seen a cycle
⇒ takes Ω(√n) queries by the birthday paradox
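The two distributions in the proof idea are easy to generate. The sketch below is an illustration under stated assumptions: vertices are 0..n-1 placed on the Hamilton cycle in order, n is even, the function name is hypothetical, and the matching may repeat a cycle edge (so the result is a multigraph in rare cases).

```python
import random

def cycle_plus_matching(n, parity_respecting, rng=random):
    """Hamilton cycle 0-1-...-(n-1)-0 plus a random perfect matching.
    parity_respecting=True  matches even vertices to odd ones (positive, D_P);
    parity_respecting=False uses a uniformly random perfect matching (negative, D_N)."""
    if n % 2:
        raise ValueError("n must be even")
    edges = [(i, (i + 1) % n) for i in range(n)]
    if parity_respecting:
        evens = list(range(0, n, 2))
        odds = list(range(1, n, 2))
        rng.shuffle(odds)
        edges += list(zip(evens, odds))
    else:
        perm = list(range(n))
        rng.shuffle(perm)
        edges += [(perm[i], perm[i + 1]) for i in range(0, n, 2)]
    return edges
```

Every graph drawn with `parity_respecting=True` is bipartite, with the parity of the vertex label as a proper 2-coloring. Locally both kinds of instance look like a cycle with one extra matching edge per vertex, which is why a tester learns nothing until its queries close a cycle.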
Testing bipartiteness in bounded degree graphs
Th. (GR’99): There is a one-sided error algorithm for testing
bipartiteness in the bounded degree model in Õ(√n) queries
Algorithm: Repeat T = O(1/ε) times:
1. Choose a random vertex s ∈_R V
2. Perform K := Õ(√n) random walks of length
L := polylog(n) starting from s
3. If the same end vertex is reached by an odd and an even
path – reject
If no rejection – accept
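A minimal sketch of the random-walk tester above. Several details are assumptions made for illustration and are not taken from GR'99: the walk is lazy (it stays put with probability 1/2 and otherwise moves to a uniformly random neighbor), parity counts only the edges actually traversed, and the default values of K and L are arbitrary stand-ins for Õ(√n) and polylog(n).

```python
import math
import random

def lazy_walk(adj, s, length, rng):
    """Lazy random walk of `length` steps from s.
    Returns (end vertex, parity of the number of edges traversed)."""
    v, parity = s, 0
    for _ in range(length):
        if adj[v] and rng.random() < 0.5:
            v = rng.choice(adj[v])
            parity ^= 1
    return v, parity

def bipartiteness_tester(adj, eps, rng=random, walks=None, length=None):
    """Return False (reject) only if some vertex is reached from the same
    start by an even walk and an odd walk, which implies an odd cycle."""
    n = len(adj)
    K = walks if walks is not None else math.ceil(8 * math.sqrt(n))
    L = length if length is not None else max(1, math.ceil(math.log2(n + 1)) ** 2)
    for _ in range(math.ceil(1 / eps)):
        s = rng.randrange(n)
        reached = {}                      # end vertex -> set of parities seen
        for _ in range(K):
            v, p = lazy_walk(adj, s, L, rng)
            reached.setdefault(v, set()).add(p)
            if len(reached[v]) == 2:
                return False
    return True
```

The laziness matters in this sketch: non-lazy walks of one fixed length all have the same parity, so they could never produce an even and an odd walk to the same end vertex. A bipartite graph is never rejected, because an even and an odd walk between the same two vertices form an odd closed walk.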
Testing bipartiteness in bounded degree graphs
Analysis: very elaborate
- relatively easy for rapidly mixing case
(∀ s, v: Pr[a random walk of length L starting from s ends at v] =
Θ(1/n))
- for general case:
no rapid mixing ⇒ ∃ small cut (M’89)
use them to decompose the graph and the
Testing k-colorability
P = “G is k-colorable”; k≥3 – fixed
Obviously can be done in O(n) queries
(just get all O(dn) edges of G)
Th. (BOT’02): For every fixed k ≥ 3, testing
k-colorability in the bounded degree model requires Ω(n) queries
⇒ No room for sophisticated testing algorithms
Testing k-colorability (cont.)
Proof Idea:
For one-sided error: Can use classical result of Erdős’62:
Th.: There exists G=(V,E), |V|=n, Δ(G)=O(1),
G is ε-far from 3-colorable,
but: every δn edges form a 3-colorable graph
⇒ tester has to obtain ≥ δn edges to catch G0 ⊆ G with χ(G0)>3
For two-sided error algorithm:
- Two distributions (positive, negative) over instances of systems of
linear equations;
Any algorithm can’t distinguish between them in o(n) time
- Then: gap-preserving reductions from linear equations to 3-colorability
Testing in non-expanding bounded degree graphs
Czumaj, Shapira, Sohler’07
Notion of hereditary non-expanding graphs:
Def: G is λ-expanding if for every V0 ⊆ V(G), |V0| ≤ n/2,
|N(V0)| ≥ λ|V0|
Def: Graph family F is non-expanding if there exists n0=n0(F) s.t.
for all G ∈ F, |V(G)| ≥ n0, G is not (1/log²n)-expanding
Ex.: F = planar graphs – non-expanding
(exists separator of size O(√|V(G)|))
Use: G ∈ non-expanding family F, bounded degree
⇒ can repeatedly cut G to decompose it into constant sized pieces
number of edges between pieces ≤ εn/2
Testing in non-expanding graphs (cont.)
Th. (CSS): P= hereditary property
(closed under taking induced subgraphs,
say, 3-colorability)
Assume: Input G ∈ non-expanding family F of bounded
degree subgraphs
⇒ P can be tested over F in constant time f(ε)
Proof idea: Decompose G=(H1,H2,…) as above
G = negative instance
⇒ many of Hi’s are witnesses
⇒ can be found by random sampling + local search
Testing planarity
Th. (BSS’08) P = “G is planar”
P can be tested in time O_ε(1) in bounded degree graphs by
a 2-sided error algorithm
(proved more: every minor-closed property P is testable
in constant time)
Proof idea: Local statistics in planar graphs differ
substantially from those in graphs ε-far from planar
(related to hyper-finite graphs, converging sequences of
sparse graphs, etc.)
Testing planarity (cont.)
1. Get two-sided error algorithm, query complexity exp(exp(exp(1/ε))).
Better query complexity?
2. Two-sided vs one-sided
Ex: G= bounded degree expander of high girth (Θ(log n))
(say, LPS graph)
- Θ(1)-far from planar
- every c log n edges form a forest ⇒ planar subgraph
⇒ LB=Ω(log n)
can strengthen to Ω(√n) of GR’97
Conj: P= “G is H-minor free”
P can be tested with a one-sided error algorithm in O(√n) queries
Bounded degree graphs –open questions
• Characterization of testable properties?
(testable := testable in O_ε(1) queries)
or at least: wide classes of testable properties
• One-sided vs two-sided?
Comparative study for various properties
• Testing in restricted graph classes?
(á la CSS)
• Tolerant testing? Estimating distance to a given
Bounded degree model - limitations
Opposite/similar to the dense model
• Suitable/tailored only for bounded degree graphs
• Distance notion is “hardwired” – always measured w.r.t. dn
• Degenerates for certain properties
(e.g. √n-colorability – always answer “YES”)
Testing in graphs of general density
- Introduced in KKR’03
Main principles:
Distance is measured w.r.t. the actual size of the input graph
(latter can be approximated first if necessary)
G=(V,E) is ε-far from P if ≥ ε|E| edges need to be changed to get H ∈ P
(appeared already in PR’02)
Queries allowed:
a) vertex pair queries: is (i,j) ∈ E(G)?
(like in the dense model)
b) neighbor queries: j-th neighbor of i ∈ V(G)?
(like in the sparse model)
c) degree queries: what is d_G(i)?
No inherent limitation on input graph density
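The three query types can be pictured as one oracle with a query counter. The class below is a hypothetical interface written for illustration; the slides do not prescribe any particular representation, and the 0-based indexing of neighbors is an assumption of the sketch.

```python
class GraphOracle:
    """Query access to a graph stored as adjacency lists (illustrative only)."""

    def __init__(self, adj):
        self._adj = [list(nbrs) for nbrs in adj]
        self._sets = [set(nbrs) for nbrs in adj]
        self.queries = 0                  # total number of queries asked so far

    def pair(self, i, j):
        """Vertex pair query: is (i, j) an edge?"""
        self.queries += 1
        return j in self._sets[i]

    def neighbor(self, i, j):
        """Neighbor query: the j-th neighbor of i (0-based), or None if deg(i) <= j."""
        self.queries += 1
        return self._adj[i][j] if j < len(self._adj[i]) else None

    def degree(self, i):
        """Degree query: d_G(i)."""
        self.queries += 1
        return len(self._adj[i])

    def far_threshold(self, eps):
        """Number of edge modifications that 'eps-far' refers to: eps * |E|."""
        m = sum(len(nbrs) for nbrs in self._adj) // 2
        return eps * m
```

A tester written against such an interface has its query complexity measured by `queries`; `far_threshold` is not a query and is included only to make the distance measure ε|E| concrete.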
Testing bipartiteness in general graphs
Th. (KKR’03):
Testing bipartiteness can be done in O(min(√n, n/d)) queries,
where d=2|E|/|V| is the average degree of G;
Lower bound of Ω(min(√n, n/d))
- continuous interpolation between the sparse and the dense cases
Testing bipartiteness for general graphs - proofs
Upper bound:
Case d≤√n – same as in the bounded degree model
K := O_ε(√n),
Repeat T = O(1/ε) times:
1. Choose a random vertex s ∈_R V
2. Perform K random walks of length L starting from s
3. A0 = endpoints of walks corresponding to paths of even length
A1 = endpoints of walks corresponding to paths of odd length
4. If A0 ∩ A1 ≠ ∅ – reject, found an odd cycle
Never rejected – accept
Testing bipartiteness for general graphs – proofs
Upper bound:
Case d≥√n
Now: K := O_ε(√(n/d)),
A0 , A1 – as before
Check whether A0 or A1 spans an edge
(here use vertex pair queries)
If happens – reject
Never happens - accept
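The check in the dense case can be sketched as below. The function names and the callback `pair_query(u, v)` are hypothetical. An edge inside A0 (or inside A1) joins two vertices reached from s by walks of the same parity, so together with the two walks it closes an odd closed walk. Each set costs at most |A|(|A|-1)/2 vertex pair queries.

```python
from itertools import combinations

def spans_edge(vertices, pair_query):
    """True if some pair of distinct vertices in `vertices` is an edge.
    Uses one vertex pair query per pair examined."""
    return any(pair_query(u, v)
               for u, v in combinations(sorted(set(vertices)), 2))

def reject_on_same_parity_edge(A0, A1, pair_query):
    """Reject (return True) if A0 or A1 spans an edge."""
    return spans_edge(A0, pair_query) or spans_edge(A1, pair_query)

if __name__ == "__main__":
    edges = {frozenset((0, 1)), frozenset((2, 3))}
    query = lambda u, v: frozenset((u, v)) in edges
    print(reject_on_same_parity_edge({0, 1}, {2}, query))     # edge 0-1 inside A0
    print(reject_on_same_parity_edge({0, 2}, {1, 3}, query))  # no edge inside A0 or A1
```

With |A0|, |A1| ≤ K the pair queries number at most about K², which for K of order √(n/d) is of order n/d.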
Testing bipartiteness for general graphs – proofs
Lower bound:
Negative distribution DN = G_{n,d} – random d-regular graph
Positive distribution DP = G_{n/2,n/2,d} – random bipartite d-regular graph
- choose an equipartition V=(V1,V2) u.a.r.
- construct a random d-regular bipartite graph between V1, V2
Proof idea: ALG = arbitrary algorithm
• o(n/d) vertex pair queries ⇒ a.s. do not produce an edge
• have seen o(√n) vertices ⇒ a.s. no neighbor query closes a cycle
(birthday paradox)
o(min(n/d, √n)) queries – both items apply,
can’t distinguish between DP, DN
Testing triangle-freeness in general graphs
Result of AKKR’06
Property P to test = “G is K3-free”
Most interesting part – Lower Bound
d:=average degree of the input graph
• d ≤ n^(1-δ(n)), δ(n)→0 ⇒ Ω(n^(1/3)) queries are needed
• d=Θ(n) ⇒ O_ε(1) queries are enough (AFKS’99)
⇒ Threshold-like behavior for query complexity, abrupt change around
Proof Idea: Cayley graphs, set of generators – random subset of a
dense 3AP-free set
(cf. A’02 for the dense case)
Comparative study of strength of different query types
- BKKR’08
Test case: k-colorability, k≥3 fixed
Models to compare:
• vertex pair queries
• neighbor queries
• combined model (pair+neighbor queries)
• new query type – group query
Group query: v ∈ V – vertex, S – vertex subset
Is there an edge between v and S in G?
(and then can find a random edge between v and S in O(log n) queries if
motivated by Group Testing
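One way to turn group queries into a random edge between v and S with 1 + ⌈log₂|S|⌉ queries is to shuffle S and binary-search for the first neighbor of v in the shuffled order; the first neighbor in a uniformly random order is a uniformly random neighbor. This is a sketch of that idea, not necessarily the procedure used in BKKR'08, and `group_query(v, T)` is a hypothetical callback.

```python
import random

def random_neighbor_in(v, S, group_query, rng=random):
    """Return a uniformly random neighbor of v inside S, or None if there is none.
    group_query(v, T) must answer: is there an edge between v and the set T?"""
    order = list(S)
    rng.shuffle(order)
    if not group_query(v, order):
        return None
    # Invariant: order[:lo] contains no neighbor of v, order[:hi] contains one.
    lo, hi = 0, len(order)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if group_query(v, order[:mid]):
            hi = mid
        else:
            lo = mid
    return order[hi - 1]                  # first neighbor of v in the shuffled order

if __name__ == "__main__":
    nbrs = {0: {3, 5}}
    gq = lambda v, T: any(t in nbrs[v] for t in T)
    print(random_neighbor_in(0, range(1, 8), gq, random.Random(0)))
```

Each iteration halves the interval [lo, hi], so the loop asks at most ⌈log₂|S|⌉ group queries after the initial one.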
Comparative study of strength of different query types – results
On the qualitative level:
• vertex pair, neighbor < combined model < group query
Say, in testing bipartiteness
• vertex pair queries are better for dense graphs, neighbor
queries are better for sparse graphs
• for group queries: UB=O(n/d)
LB= Ω(n/d)
(d := average degree of the input graph)
Testing general graphs – open problems
Results for (other) concrete problems?
(testing H-freeness, k-colorability, etc.)
Develop technology for proving lower bounds
One-sided vs two-sided error algorithms?
What if given ability to sample a random edge?
(to eliminate hiding small dense hard instances)
Further query types, their comparison? Query types
driven by practical applications?