Chernoff Bound
Markov ➞ Chebyshev ➞ Chernoff
Uses generating function type calculation
Exponential bound on $B_{p,n}$ exceeding mean by constant factor
Application example
Alon Orlitsky, UCSD

Herman Chernoff
Born 1923
Statistician
UIUC, Stanford, MIT, Harvard
Broad view of statistics
Chernoff bound
Chernoff faces

Chernoff Faces
Visualize multi-dimensional data
Familiarity with human faces
Dimensions as facial features
April 1, 2008

Why Chernoff
$X \sim B_{p,n}$
$P(X \ge \text{constant times its mean})$, that is, $P(X \ge (1+\delta)\mu)$ with $\delta > 0$
Example: $p = \frac{1}{2}$, $\delta = 1$, so $X \sim B_{1/2,n}$ and $\mu = n/2$
$P(X \ge 2\mu) = P(X \ge n) = P(X = n) = \frac{1}{2^n}$
What sayeth Markov, Chebyshev, Chernoff?
General $B_{p,n}$

Markov, Chebyshev, Chernoff
$X \sim B_{p,n}$, $\mu = pn$, bound $P(X \ge (1+\delta)\mu)$
Markov: for $X \ge 0$, $P(X \ge \alpha\mu) \le 1/\alpha$, so
$P(X \ge (1+\delta)\mu) \le \frac{1}{1+\delta}$
Constant (does not decrease with $n$)
Chebyshev: $P(X \ge \mu + a) \le P(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}$ with $\sigma^2 = npq$, so
$P(X \ge (1+\delta)\mu) = P(X \ge \mu + \delta np) \le \frac{\sigma^2}{(\delta np)^2} = \frac{npq}{(\delta np)^2} = \frac{q}{\delta^2 np}$
Linear (decays like $1/n$)
Chernoff: next

Markov → Chernoff
Crucial observation: $\forall a,\ \forall t > 0$: $X \ge a \iff tX \ge ta \iff e^{tX} \ge e^{ta}$
Apply Markov to $e^{tX} \ge 0$:
$P(X \ge a) = P(tX \ge ta) = P(e^{tX} \ge e^{ta}) \le \frac{E(e^{tX})}{e^{ta}}$
$E(e^{tX})$ is exactly the moment generating function; the derivation here is self-contained
Plan: evaluate $E(e^{tX})$, bound it, incorporate in Markov, simplify

Evaluate $E(e^{tX})$
$X \sim B_{p,n}$, $X = \sum_{i=1}^{n} X_i$ with $X_i \sim B_p$ independent (⫫)
$E(e^{tX}) = E\left(e^{t\sum_i X_i}\right) = E\left(e^{\sum_i tX_i}\right) = E\left(\prod_{i=1}^{n} e^{tX_i}\right) = \prod_{i=1}^{n} E(e^{tX_i})$
$E(e^{tX_i}) = P(X_i = 0) \cdot e^{t \cdot 0} + P(X_i = 1) \cdot e^{t \cdot 1} = (1-p) + pe^t$
$E(e^{tX}) = [(1-p) + pe^t]^n$
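The Markov and Chebyshev bounds and the closed form of $E(e^{tX})$ can be checked numerically against the exact binomial distribution. The following is a minimal sketch using only the Python standard library; the function names are illustrative and do not come from the slides.

```python
from math import comb, exp

def binom_pmf(n, p, k):
    """Exact P(X = k) for X ~ B_{p,n}."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_tail(n, p, k):
    """Exact P(X >= k) for X ~ B_{p,n}."""
    return sum(binom_pmf(n, p, j) for j in range(k, n + 1))

def markov_bound(delta):
    """Markov: P(X >= (1 + delta) mu) <= 1 / (1 + delta)."""
    return 1 / (1 + delta)

def chebyshev_bound(n, p, delta):
    """Chebyshev: P(X >= (1 + delta) mu) <= q / (delta^2 n p)."""
    return (1 - p) / (delta**2 * n * p)

def mgf_from_pmf(n, p, t):
    """E(e^{tX}) summed directly over the binomial pmf."""
    return sum(binom_pmf(n, p, k) * exp(t * k) for k in range(n + 1))

def mgf_closed_form(n, p, t):
    """E(e^{tX}) = [(1 - p) + p e^t]^n, from independence of the X_i."""
    return ((1 - p) + p * exp(t)) ** n

# Example from the slides: p = 1/2, delta = 1, so (1 + delta) mu = n.
n, p, delta = 20, 0.5, 1.0
print("exact    ", binom_tail(n, p, n))
print("Markov   ", markov_bound(delta))
print("Chebyshev", chebyshev_bound(n, p, delta))
print("MGF check", mgf_from_pmf(n, 0.3, 0.5), mgf_closed_form(n, 0.3, 0.5))
```

For small $n$ the two moment generating function values should agree up to floating-point rounding, while the Markov and Chebyshev bounds sit far above the exact tail $1/2^n$.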
Bound $E(e^{tX})$
$E(e^{tX_i}) = (1-p) + pe^t = 1 + p(e^t - 1) \le e^{p(e^t - 1)}$
using $1 + x \le e^x$, since $e^x = 1 + x + \frac{x^2}{2} + \dots \ge 1 + x$ for $x \ge 0$
$E(e^{tX}) = [(1-p) + pe^t]^n \le e^{np(e^t - 1)} = e^{\mu(e^t - 1)}$

Incorporate in Markov
$X \sim B_{p,n}$, $a = (1+\delta)\mu$, $\forall t > 0$:
$P(X \ge (1+\delta)\mu) \le \frac{E(e^{tX})}{e^{ta}} \le \frac{e^{\mu(e^t - 1)}}{e^{\mu(1+\delta)t}} = e^{\mu((e^t - 1) - t(1+\delta))}$
Given $\mu$ and $\delta$, find $t$ minimizing $(e^t - 1) - t(1+\delta)$

Optimizing $t$
$f(t) \overset{\text{def}}{=} (e^t - 1) - t(1+\delta)$
$f'(t) = e^t - (1+\delta) = 0 \Rightarrow e^t = 1 + \delta \Rightarrow t = \ln(1+\delta)$
$f''(t) = e^t > 0$, so this $t$ is the minimizer
$P(X \ge (1+\delta)\mu) \le e^{\mu((e^t - 1) - t(1+\delta))} = e^{\mu[\delta - (1+\delta)\ln(1+\delta)]}$

Final Simplification
$P(X \ge (1+\delta)\mu) \le e^{\mu[\delta - (1+\delta)\ln(1+\delta)]}$
Inequality: $\ln(1+x) \ge \frac{x}{1 + \frac{x}{2}}$ $\forall x \ge 0$
Define $f(x) = \ln(1+x) - \frac{x}{1 + \frac{x}{2}}$
$f(0) = \ln 1 - 0 = 0$
Show $f'(x) \ge 0$: $f'(x) = \frac{1}{1+x} - \frac{1}{(1 + \frac{x}{2})^2} = \frac{1}{1+x} - \frac{1}{1 + x + \frac{x^2}{4}} \ge 0$ $\forall x \ge 0$
Hence
$\delta - (1+\delta)\ln(1+\delta) \le \delta - \frac{(1+\delta)\delta}{1 + \frac{\delta}{2}} = \delta\left[1 - \frac{1+\delta}{1 + \frac{\delta}{2}}\right] = \delta\left[1 - \frac{2 + 2\delta}{2 + \delta}\right] = \delta \cdot \frac{-\delta}{2+\delta} = -\frac{\delta^2}{2+\delta}$
$P(X \ge (1+\delta)\mu) \le e^{-\frac{\delta^2}{2+\delta}\mu}$

Chernoff Bound
$X \sim B_{p,n}$, $\forall \delta \ge 0$
Showed: $P(X \ge (1+\delta)\mu) \le e^{-\frac{\delta^2}{2+\delta}\mu}$
Similarly: $P(X \le (1-\delta)\mu) \le e^{-\frac{\delta^2}{2}\mu}$
Both decrease (➘) exponentially in $n$
Current form: binomial
Modifications for other distributions: Poisson
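The optimization over $t$ and the final simplification can be sanity-checked numerically. The sketch below is illustrative only: the helper names, the grid search, and the test values $n = 100$, $p = 1/2$ are assumptions introduced here, not part of the slides.

```python
from math import comb, exp, log

def exponent(t, delta):
    """f(t) = (e^t - 1) - t(1 + delta), the exponent per unit of mu."""
    return (exp(t) - 1) - t * (1 + delta)

def best_t_grid(delta, steps=3000, t_max=3.0):
    """Crude grid search for the t > 0 that minimizes f(t)."""
    ts = [t_max * i / steps for i in range(1, steps + 1)]
    return min(ts, key=lambda t: exponent(t, delta))

def chernoff_tight(mu, delta):
    """Bound at the optimal t = ln(1 + delta)."""
    return exp(mu * (delta - (1 + delta) * log(1 + delta)))

def chernoff_simple(mu, delta):
    """Simplified upper-tail bound e^{-delta^2 mu / (2 + delta)}."""
    return exp(-delta**2 * mu / (2 + delta))

def chernoff_lower(mu, delta):
    """Lower-tail bound e^{-delta^2 mu / 2} for P(X <= (1 - delta) mu)."""
    return exp(-delta**2 * mu / 2)

def binom_tail(n, p, k):
    """Exact P(X >= k) for X ~ B_{p,n}."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# The grid minimizer should land next to the closed form t = ln(1 + delta).
print("grid t:", best_t_grid(1.0), " ln(2):", log(2))

# Exact tail vs. the two Chernoff forms for n = 100, p = 1/2, threshold 60.
n, p, k = 100, 0.5, 60
mu = n * p
delta = k / mu - 1
print("exact :", binom_tail(n, p, k))
print("tight :", chernoff_tight(mu, delta))
print("simple:", chernoff_simple(mu, delta))
```

The simplified form is never smaller than the bound at the optimal $t$; the price of the cleaner expression is a slightly weaker exponent.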
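Markov > Chebyshev > Chernoff
$X \sim B_{1/2,n}$, $P(X \ge n) \le\ ?$
Exact: $P(X \ge n) = P(X = n) = \frac{1}{2^n}$
Markov: $P(X \ge a) \le \frac{\mu}{a}$, so $P(X \ge n) \le \frac{\mu}{n} = \frac{n/2}{n} = \frac{1}{2}$
Constant
Chebyshev: $P(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}$ with $\sigma^2 = npq = \frac{n}{4}$, so $P(X \ge n) \le P\left(\left|X - \frac{n}{2}\right| \ge \frac{n}{2}\right) \le \frac{n/4}{(n/2)^2} = \frac{1}{n}$
Linear (decays like $1/n$)
Chernoff: $P(X \ge (1+\delta)\mu) \le e^{-\frac{\delta^2}{2+\delta}\mu}$, so $P(X \ge n) = P\left(X \ge (1+1) \cdot \frac{n}{2}\right) \le e^{-\frac{\mu}{3}} = e^{-\frac{n}{6}}$
Exponential

What's the most surprising thing about Markov's ≤?
That such a weak bound, $P(B_{1/2,n} \ge n) \le \frac{1}{2}$,
can be used to derive such strong results: $P(B_{1/2,n} \ge n) \le e^{-n/6}$

Poll
47% vote R, so R will lose
Poll 6000 voters
$X$ = # polled who support R, $X \sim B_{0.47,\,6000}$
Poll wrong ↔ $X \ge 0.5 \cdot 6000 = 3000$
To apply Chernoff: $P(\text{wrong}) = P(X \ge 3000) = P(X \ge (1+\delta)\mu) \le e^{-\frac{\delta^2}{2+\delta}\mu}$
$\mu = np = 6000 \cdot 0.47 = 2820$
$3000 = (1+\delta)\mu$, so $1 + \delta = \frac{3000}{2820} = \frac{6000 \cdot 0.5}{6000 \cdot 0.47} = \frac{50}{47} \approx 1.06383$ and $\delta \approx 0.0638$
$P(\text{wrong}) \le e^{-\frac{\delta^2}{2+\delta}\mu} \approx 0.38\%$

Chernoff Bound
Markov ➞ Chebyshev ➞ Chernoff
Uses generating function type calculation
Exponential bound on $B_{p,n}$ exceeding mean by constant factor
$P(X \ge (1+\delta)\mu) \le e^{-\frac{\delta^2}{2+\delta}\mu}$
$P(X \le (1-\delta)\mu) \le e^{-\frac{\delta^2}{2}\mu}$
Application: poll accuracy
Central Limit Theorem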
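The poll calculation can be reproduced in a few lines. This is a sketch under the slide's assumptions (6000 independent respondents, each supporting R with probability 0.47); the helper names and the log-space exact tail, included only for comparison, are additions introduced here.

```python
from math import exp, lgamma, log

def chernoff_upper(mu, delta):
    """Simplified Chernoff bound e^{-delta^2 mu / (2 + delta)}."""
    return exp(-delta**2 * mu / (2 + delta))

def log_binom_pmf(n, p, k):
    """log P(X = k) for X ~ B_{p,n}, in log space to avoid underflow."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + k * log(p) + (n - k) * log(1 - p)

def binom_tail(n, p, k):
    """P(X >= k), summing exponentiated log-probabilities."""
    return sum(exp(log_binom_pmf(n, p, j)) for j in range(k, n + 1))

n, p = 6000, 0.47
mu = n * p                  # 2820
threshold = n // 2          # poll is wrong if X >= 3000
delta = threshold / mu - 1  # 50/47 - 1, about 0.0638

bound = chernoff_upper(mu, delta)
exact = binom_tail(n, p, threshold)

print(f"delta          = {delta:.4f}")
print(f"Chernoff bound = {bound:.2%}")  # the slides quote about 0.38%
print(f"exact tail     = {exact:.3e}")
```

The exact tail is smaller than the Chernoff value, as it must be; the bound trades tightness for a closed form that needs only $\mu$ and $\delta$.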