
14.1 Chernoff Bound

Chernoff Bound
Markov ➞ Chebyshev ➞ Chernoff
Uses generating function type calculation
Exponential bound on B_{p,n} exceeding mean by constant factor
Application example
Alon Orlitsky, UCSD
Herman Chernoff
1923 Statistician
UIUC, Stanford, MIT, Harvard
Broad view of statistics
Chernoff bound
Chernoff faces
Chernoff Faces
Visualize multi-dimensional data
Familiarity with human faces
Dimensions as facial features
April 1, 2008
Why Chernoff
X ∼ B_{p,n}
P(X ≥ constant times its mean)
P(X ≥ (1 + δ)μ), δ > 0
Example: p = 1/2, δ = 1
X ∼ B_{1/2,n}, μ = n/2
P(X ≥ 2μ) = P(X ≥ n) = P(X = n) = 1/2^n
General B_{p,n}: what sayeth
Markov, Chebyshev, Chernoff
X ∼ B_{p,n}, μ = pn
Bound P(X ≥ (1 + δ)μ)
Markov
X ≥ 0: P(X ≥ αμ) ≤ 1/α
P(X ≥ (1 + δ)μ) ≤ 1/(1 + δ)
Constant
Chebyshev
P(X ≥ μ + a) ≤ P(|X − μ| ≥ a) ≤ σ²/a², σ² = npq
P(X ≥ (1 + δ)μ) = P(X ≥ μ + δnp) ≤ σ²/(δnp)² = npq/(δnp)² = q/(δ²np)
Linear
Chernoff
Next
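To make the two bounds concrete, here is a minimal Python sketch that evaluates them next to the exact binomial tail. The function names and the sample values of n, p and δ are illustrative assumptions, not part of the slides.

```python
import math

def exact_upper_tail(n, p, k):
    """Exact P(X >= k) for X ~ B_{p,n}, summing the pmf term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def markov_bound(delta):
    """Markov: P(X >= (1 + delta) mu) <= 1 / (1 + delta)."""
    return 1.0 / (1.0 + delta)

def chebyshev_bound(n, p, delta):
    """Chebyshev: P(X >= (1 + delta) mu) <= q / (delta^2 n p)."""
    q = 1.0 - p
    return q / (delta**2 * n * p)

if __name__ == "__main__":
    p, delta = 0.5, 0.5  # sample values chosen for illustration
    for n in (20, 100, 400):
        k = math.ceil((1 + delta) * n * p)  # smallest integer >= (1 + delta) mu
        print(f"n = {n:3d}  exact = {exact_upper_tail(n, p, k):.3e}  "
              f"Markov = {markov_bound(delta):.3f}  "
              f"Chebyshev = {chebyshev_bound(n, p, delta):.3e}")
```

The Markov value does not change with n, while the Chebyshev value shrinks like 1/n.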
Markov → Chernoff
Crucial observation
∀a, ∀t > 0: X ≥ a ⇔ tX ≥ ta ⇔ e^{tX} ≥ e^{ta}
Markov
P(X ≥ a) = P(tX ≥ ta) = P(e^{tX} ≥ e^{ta}) ≤ E(e^{tX})/e^{ta}
At t = 0 the right side equals 1, so the bound holds ∀t ≥ 0
E(e^{tX}) is exactly the moment generating function
Self-contained
Steps: Evaluate E(e^{tX}), Bound, Incorporate, Simplify
Evaluate
X ∼ B_{p,n}: X = Σ_{i=1}^n X_i, X_i ∼ B_p, independent (⫫)
E(e^{tX}) = E(e^{tΣX_i}) = E(e^{ΣtX_i}) = E(Π_{i=1}^n e^{tX_i}) = Π_{i=1}^n E(e^{tX_i}) = [E(e^{tX_i})]^n (product step uses ⫫)
E(e^{tX_i}) = P(X_i = 0)·e^{t·0} + P(X_i = 1)·e^{t·1} = (1 − p) + pe^t = 1 + p(e^t − 1)
E(e^{tX}) = [(1 − p) + pe^t]^n
Bound
e^x = 1 + x + x²/2 + ... ≥ 1 + x for x ≥ 0
E(e^{tX_i}) = 1 + p(e^t − 1) ≤ e^{p(e^t − 1)}
E(e^{tX}) ≤ e^{np(e^t − 1)} = e^{μ(e^t − 1)}
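As a sanity check on the Evaluate and Bound steps, the following sketch computes E(e^{tX}) directly from the binomial pmf and compares it with the closed form [(1 − p) + pe^t]^n and with the upper bound e^{μ(e^t − 1)}. The function names and the sample values are assumptions made for illustration.

```python
import math

def mgf_direct(n, p, t):
    """E(e^{tX}) for X ~ B_{p,n}, computed by summing e^{tk} P(X = k)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) * math.exp(t * k)
               for k in range(n + 1))

def mgf_closed_form(n, p, t):
    """Closed form [(1 - p) + p e^t]^n from the independence argument."""
    return ((1 - p) + p * math.exp(t)) ** n

def mgf_upper_bound(n, p, t):
    """Upper bound e^{mu (e^t - 1)} with mu = n p, from 1 + x <= e^x."""
    return math.exp(n * p * (math.exp(t) - 1))

if __name__ == "__main__":
    n, p = 10, 0.3  # sample values chosen for illustration
    for t in (0.0, 0.25, 0.5, 1.0):
        print(f"t = {t:.2f}  direct = {mgf_direct(n, p, t):.6f}  "
              f"closed = {mgf_closed_form(n, p, t):.6f}  "
              f"bound = {mgf_upper_bound(n, p, t):.6f}")
```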
Incorporate in Markov ≤
X ∼ B_{p,n}, X = Σ_{i=1}^n X_i
∀a, ∀t ≥ 0: P(X ≥ a) ≤ E(e^{tX})/e^{ta} ≤ e^{μ(e^t − 1)}/e^{ta}
a = (1 + δ)μ
P(X ≥ (1 + δ)μ) ≤ e^{μ(e^t − 1)}/e^{μ(1+δ)t} = e^{μ((e^t − 1) − t(1+δ))}
Find t minimizing (e^t − 1) − t(1 + δ)
μ, δ given
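Since the bound holds for every t ≥ 0, its value depends on which t is picked. A small sketch, assuming illustrative values μ = 50 and δ = 0.5 and a hypothetical helper name, shows how much the choice matters:

```python
import math

def bound_at_t(mu, delta, t):
    """Value of e^{mu((e^t - 1) - t(1 + delta))} for a given t >= 0."""
    return math.exp(mu * ((math.exp(t) - 1) - t * (1 + delta)))

if __name__ == "__main__":
    mu, delta = 50.0, 0.5  # sample values chosen for illustration
    for t in (0.0, 0.1, 0.2, 0.4, 0.8, 1.2):
        print(f"t = {t:.1f}  bound = {bound_at_t(mu, delta, t):.3e}")
```

At t = 0 the bound is the trivial value 1, and for large t it exceeds 1, so a useful t lies in between.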
Optimizing t
∀t ≥ 0: P(X ≥ (1 + δ)μ) ≤ e^{μ((e^t − 1) − t(1+δ))}
Find t minimizing
f(t) := (e^t − 1) − t(1 + δ)
f'(t) = e^t − (1 + δ) = 0
e^t = 1 + δ
t = ln(1 + δ)
f''(t) = e^t > 0 (minimum)
P(X ≥ (1 + δ)μ) ≤ e^{μ[δ − (1+δ) ln(1+δ)]}
[Plot: f(t) versus t]
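The calculus result t = ln(1 + δ) can be cross-checked numerically. The sketch below uses a plain grid search, which is a stand-in chosen for simplicity and not a method from the slides; the grid range and step count are arbitrary assumptions.

```python
import math

def f(t, delta):
    """Exponent factor f(t) = (e^t - 1) - t(1 + delta)."""
    return (math.exp(t) - 1) - t * (1 + delta)

def grid_argmin(delta, t_max=3.0, steps=30000):
    """Grid point in [0, t_max] with the smallest f(t, delta)."""
    grid = (i * t_max / steps for i in range(steps + 1))
    return min(grid, key=lambda t: f(t, delta))

if __name__ == "__main__":
    for delta in (0.1, 0.5, 1.0, 2.0):
        t_star = math.log(1 + delta)
        print(f"delta = {delta:.1f}  grid argmin = {grid_argmin(delta):.4f}  "
              f"ln(1 + delta) = {t_star:.4f}  f(t*) = {f(t_star, delta):.6f}")
```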
Final Simplification
P(X ≥ (1 + δ)μ) ≤ e^{μ[δ − (1+δ) ln(1+δ)]} ≤ e^{−δ²μ/(2+δ)}
Inequality
ln(1 + x) ≥ x/(1 + x/2) ∀x ≥ 0
Define f(x) = ln(1 + x) − x/(1 + x/2)
f(0) = ln 1 − 0 = 0
Show f'(x) ≥ 0 ∀x ≥ 0
f'(x) = 1/(1 + x) − [(1 + x/2) − x/2]/(1 + x/2)² = 1/(1 + x) − 1/(1 + x/2)² = 1/(1 + x) − 1/(1 + x + x²/4) ≥ 0
Apply with x = δ
δ − (1 + δ) ln(1 + δ) ≤ δ − (1 + δ)·δ/(1 + δ/2) = δ[1 − (1 + δ)/(1 + δ/2)] = δ[1 − (2 + 2δ)/(2 + δ)] = δ[−δ/(2 + δ)] = −δ²/(2 + δ)
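A quick numerical check of the logarithm inequality and of the resulting exponent comparison, as a sketch. The function names and the grid of test points are assumptions; a grid check illustrates the inequality but does not replace the derivative argument.

```python
import math

def log_gap(x):
    """ln(1 + x) - x / (1 + x/2); the inequality says this is >= 0 for x >= 0."""
    return math.log1p(x) - x / (1 + x / 2)

def exact_exponent(delta):
    """delta - (1 + delta) ln(1 + delta), the exponent factor before simplifying."""
    return delta - (1 + delta) * math.log1p(delta)

def simplified_exponent(delta):
    """-delta^2 / (2 + delta), the exponent factor after simplifying."""
    return -delta**2 / (2 + delta)

if __name__ == "__main__":
    for x in (0.0, 0.1, 0.5, 1.0, 5.0):
        print(f"x = {x:.1f}  gap = {log_gap(x):.6f}  "
              f"exact = {exact_exponent(x):.6f}  simplified = {simplified_exponent(x):.6f}")
```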
Chernoff Bound
X ∼ B_{p,n}
Showed: ∀δ ≥ 0, P(X ≥ (1 + δ)μ) ≤ e^{−δ²μ/(2+δ)}
Similarly: P(X ≤ (1 − δ)μ) ≤ e^{−δ²μ/2}
➘ exponentially in n
Current form
Binomial
Modifications for other distributions
Poisson
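Both tail bounds are easy to evaluate and compare with exact binomial tails. This is a minimal sketch; the helper names and the sample values n = 200, p = 0.25, δ = 0.5 are assumptions made for illustration.

```python
import math

def binom_pmf(n, p, k):
    """P(X = k) for X ~ B_{p,n}."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def chernoff_upper(mu, delta):
    """Bound on P(X >= (1 + delta) mu): e^{-delta^2 mu / (2 + delta)}."""
    return math.exp(-delta**2 * mu / (2 + delta))

def chernoff_lower(mu, delta):
    """Bound on P(X <= (1 - delta) mu): e^{-delta^2 mu / 2}."""
    return math.exp(-delta**2 * mu / 2)

if __name__ == "__main__":
    n, p, delta = 200, 0.25, 0.5  # sample values chosen for illustration
    mu = n * p
    hi = math.ceil((1 + delta) * mu)   # smallest integer >= (1 + delta) mu
    lo = math.floor((1 - delta) * mu)  # largest integer <= (1 - delta) mu
    upper_exact = sum(binom_pmf(n, p, k) for k in range(hi, n + 1))
    lower_exact = sum(binom_pmf(n, p, k) for k in range(0, lo + 1))
    print(f"upper tail: exact = {upper_exact:.3e}  bound = {chernoff_upper(mu, delta):.3e}")
    print(f"lower tail: exact = {lower_exact:.3e}  bound = {chernoff_lower(mu, delta):.3e}")
```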
Markov > Chebyshev > Chernoff
X ∼ B_{1/2,n}
P(X ≥ n) ≤ ?
P(X ≥ n) = P(X = n) = 1/2^n
Markov
P(X ≥ a) ≤ μ/a
P(X ≥ n) ≤ μ/n = (n/2)/n = 1/2
Constant
Chebyshev
P(|X − μ| ≥ a) ≤ (σ/a)², σ² = npq = n/4
P(X ≥ n) ≤ P(|X − n/2| ≥ n/2) ≤ (n/4)/(n/2)² = 1/n
Linear
Chernoff
P(X ≥ (1 + δ)μ) ≤ e^{−δ²μ/(2+δ)}
P(X ≥ n) = P(X ≥ (1 + 1)·n/2) ≤ e^{−μ/3} = e^{−n/6}
Exponential
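The four quantities on this slide can be tabulated for a few values of n. A minimal sketch, with a hypothetical function name and arbitrarily chosen n:

```python
import math

def fair_coin_bounds(n):
    """Return (exact, markov, chebyshev, chernoff) for P(X >= n), X ~ B_{1/2,n}."""
    exact = 0.5 ** n                    # P(X = n) = 1/2^n
    markov = (n / 2) / n                # mu / a with a = n
    chebyshev = (n / 4) / (n / 2) ** 2  # sigma^2 / a^2 with a = n/2
    chernoff = math.exp(-n / 6)         # delta = 1, mu = n/2
    return exact, markov, chebyshev, chernoff

if __name__ == "__main__":
    print(f"{'n':>4} {'exact':>11} {'Markov':>8} {'Chebyshev':>10} {'Chernoff':>10}")
    for n in (6, 30, 60, 120):
        exact, markov, chebyshev, chernoff = fair_coin_bounds(n)
        print(f"{n:>4} {exact:>11.3e} {markov:>8.3f} {chebyshev:>10.3e} {chernoff:>10.3e}")
```

For small n (for example n = 6) the Chebyshev value is smaller than the Chernoff value; the ordering in the slide title describes how the bounds behave as n grows.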
What’s the most surprising thing about Markov’s ≤?
That such a weak bound
P(B_{1/2,n} ≥ n) ≤ 1/2
Can be used to derive such strong results
P(B_{1/2,n} ≥ n) ≤ e^{−n/6}
Poll
47% vote R
Poll 6000
R will lose
Poll wrong ↔ X ≥ 0.5·6,000 = 3,000
X = # support R ∼ B_{0.47, 6000}
To apply Chernoff
μ = np = 6000 · 0.47 = 2820
3000 = (1 + δ)μ
1 + δ = 3000/2820 = (6000 · 0.5)/(6000 · 0.47) = 50/47 ≈ 1.06383
δ ≈ 0.0638
P(wrong) = P(X ≥ 3000) = P(X ≥ (1 + δ)μ) ≤ e^{−δ²μ/(2+δ)} ≈ 0.38%
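The poll arithmetic can be reproduced in a few lines. The sketch below also sums the exact binomial tail in log space (via `math.lgamma`, to avoid floating-point underflow at n = 6000); that exact tail is an addition for comparison and does not appear in the slides, and the variable and function names are assumptions.

```python
import math

def log_binom_pmf(n, p, k):
    """log P(X = k) for X ~ B_{p,n}, computed with lgamma to avoid underflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

n, p = 6000, 0.47
threshold = 3000                  # poll is wrong when X >= 3000
mu = n * p                        # 2820
delta = threshold / mu - 1        # 50/47 - 1
chernoff = math.exp(-delta**2 * mu / (2 + delta))
exact = sum(math.exp(log_binom_pmf(n, p, k)) for k in range(threshold, n + 1))

if __name__ == "__main__":
    print(f"mu = {mu:.1f}, delta = {delta:.5f}")
    print(f"Chernoff bound on P(wrong) = {chernoff:.4%}")
    print(f"exact P(wrong)             = {exact:.4%}")
```

The Chernoff value is an upper bound, so the exact tail probability comes out smaller.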
Chernoff Bound
Markov ➞ Chebyshev ➞ Chernoff
Uses generating function type calculation
Exponential bound on B_{p,n} exceeding mean by constant factor
P(X ≥ (1 + δ)μ) ≤ e^{−δ²μ/(2+δ)}
P(X ≤ (1 − δ)μ) ≤ e^{−δ²μ/2}
Application: poll accuracy
Central Limit Theorem