Chernoff bounds

The Chernoff bound for a random variable X is
obtained as follows: for any t > 0,
Pr[X ≥ a] = Pr[e^{tX} ≥ e^{ta}] ≤ E[e^{tX}] / e^{ta}.
Similarly, for any t < 0,
Pr[X ≤ a] = Pr[e^{tX} ≥ e^{ta}] ≤ E[e^{tX}] / e^{ta}.
The value of t that minimizes E[e^{tX}] / e^{ta} gives the
best possible bound.
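As an illustration of this optimization over t, here is a minimal sketch (assuming X ~ Binomial(n, p), whose moment generating function is (1 − p + pe^t)^n; the parameter values are arbitrary choices for the demo) that minimizes the bound numerically and compares it with the exact tail:

import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import binom

# Assumed example: X ~ Binomial(n, p); bound Pr[X >= a] by
# min_{t>0} E[e^{tX}] / e^{ta}, with E[e^{tX}] = (1 - p + p e^t)^n.
n, p, a = 100, 0.5, 60

def log_bound(t):
    # log of E[e^{tX}] / e^{ta}; minimizing the log minimizes the ratio
    return n * np.log(1 - p + p * np.exp(t)) - t * a

res = minimize_scalar(log_bound, bounds=(1e-9, 10), method="bounded")
print("optimized Chernoff bound:", np.exp(res.fun))
print("exact tail Pr[X >= a]:   ", binom.sf(a - 1, n, p))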
Moment generating functions
• Def: The moment generating function of a random variable X is M_X(t) = E[e^{tX}].
• E[X^n] = M_X^{(n)}(0), the nth derivative of M_X(t) evaluated at t = 0.
• Fact: If M_X(t) = M_Y(t) for all t in (−c, c) for some c > 0, then X and Y have the same distribution.
• If X and Y are independent r.v.'s, then M_{X+Y}(t) = M_X(t) M_Y(t).
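To make the moment property concrete, here is a small sketch (using sympy, with a Bernoulli(p) variable as an assumed example) that checks E[X^n] = M_X^{(n)}(0) symbolically:

import sympy as sp

# Bernoulli(p) has M_X(t) = 1 - p + p*e^t (assumed example, not from the slides).
t, p = sp.symbols('t p')
M = 1 - p + p * sp.exp(t)

for n in (1, 2, 3):
    nth_moment = sp.diff(M, t, n).subs(t, 0)
    # X^n = X for a 0-1 variable, so every moment equals p
    print(f"E[X^{n}] =", sp.simplify(nth_moment))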
Chernoff bounds for the sum of Poisson trials
• Poisson trials: the distribution of a sum of independent 0-1 random variables, which need not be identically distributed.
• Bernoulli trials: same as above, except that all the random variables are identically distributed.
• X_i, i = 1, …, n: mutually independent 0-1 r.v.'s with Pr[X_i = 1] = p_i.
Let X = X_1 + ⋯ + X_n and E[X] = μ = p_1 + ⋯ + p_n. Then
M_{X_i}(t) = E[e^{tX_i}] = p_i e^t + (1 − p_i) = 1 + p_i(e^t − 1) ≤ exp(p_i(e^t − 1)),
using the inequality 1 + x ≤ e^x.
Chernoff bound for a sum of Poisson trials
• M_X(t) = M_{X_1}(t) M_{X_2}(t) ⋯ M_{X_n}(t) ≤ exp{(p_1 + ⋯ + p_n)(e^t − 1)} = exp{(e^t − 1)μ}.
Theorem. Let X = X_1 + ⋯ + X_n, where X_1, …, X_n are n independent Poisson trials such that Pr[X_i = 1] = p_i for each i = 1, 2, …, n. Then:
(1) for any δ > 0, Pr[X ≥ (1+δ)μ] ≤ (e^δ / (1+δ)^{1+δ})^μ;
(2) for δ ∈ (0, 1], Pr[X ≥ (1+δ)μ] ≤ e^{−μδ²/3};
(3) for R ≥ 6μ, Pr[X ≥ R] ≤ 2^{−R}.

Proof: By Markov's inequality, for any t > 0 we have Pr[X ≥ (1+δ)μ] = Pr[e^{tX} ≥ e^{t(1+δ)μ}] ≤ E[e^{tX}] / e^{t(1+δ)μ} ≤ e^{(e^t − 1)μ} / e^{t(1+δ)μ}. For any δ > 0, setting t = ln(1+δ) > 0 gives (1).

To prove (2), we need to show that for 0 < δ ≤ 1, e^δ / (1+δ)^{1+δ} ≤ e^{−δ²/3}. Taking the logarithm of both sides, this becomes δ − (1+δ)ln(1+δ) + δ²/3 ≤ 0, which can be proved with calculus.

To prove (3), let R = (1+δ)μ. Then, for R ≥ 6μ, δ = R/μ − 1 ≥ 5. Hence, using (1), Pr[X ≥ (1+δ)μ] ≤ (e^δ / (1+δ)^{1+δ})^μ ≤ (e / (1+δ))^{(1+δ)μ} ≤ (e/6)^R ≤ 2^{−R}.
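A quick numerical sanity check of the three bounds against the exact tail, with X ~ Binomial(n, p) as an assumed test case (the values of n and p are arbitrary):

import numpy as np
from scipy.stats import binom

n, p = 1000, 0.01
mu = n * p

for delta in (0.5, 1.0, 5.0):
    a = (1 + delta) * mu
    exact = binom.sf(int(np.ceil(a)) - 1, n, p)   # Pr[X >= (1+delta)mu]
    bound1 = (np.exp(delta) / (1 + delta) ** (1 + delta)) ** mu
    bound2 = np.exp(-mu * delta**2 / 3) if delta <= 1 else None  # needs delta in (0,1]
    bound3 = 2.0 ** (-a) if a >= 6 * mu else None                # needs R >= 6mu
    print(delta, exact, bound1, bound2, bound3)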
• Similarly, we have:

Theorem. Let X = Σ_{i=1}^n X_i, where X_1, …, X_n are n independent Poisson trials such that Pr[X_i = 1] = p_i. Let μ = E[X]. Then, for 0 < δ < 1:
(1) Pr[X ≤ (1−δ)μ] ≤ (e^{−δ} / (1−δ)^{1−δ})^μ;
(2) Pr[X ≤ (1−δ)μ] ≤ e^{−μδ²/2}.

Corollary. For 0 < δ < 1, Pr[|X − μ| ≥ δμ] ≤ 2e^{−μδ²/3}.
• Example: Let X be the number of heads in n independent fair coin flips. Applying the above Corollary, we have:
Pr[|X − n/2| ≥ √(6n ln n)/2] ≤ 2 exp(−(1/3)(n/2)(6 ln n/n)) = 2/n;
Pr[|X − n/2| ≥ n/4] ≤ 2 exp(−(1/3)(n/2)(1/4)) = 2e^{−n/24}.
By Chebyshev's inequality, i.e. Pr[|X − E[X]| ≥ a] ≤ Var[X]/a², we have Pr[|X − n/2| ≥ n/4] ≤ 4/n.
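The gap between the two bounds can be checked directly; the sketch below (n = 200 is an arbitrary choice) evaluates the exact probability, the Chernoff bound, and the Chebyshev bound for the event |X − n/2| ≥ n/4:

import numpy as np
from scipy.stats import binom

n = 200
mu, a = n // 2, n // 4
delta = a / mu                                   # = 1/2
exact = binom.sf(mu + a - 1, n, 0.5) + binom.cdf(mu - a, n, 0.5)
chernoff = 2 * np.exp(-mu * delta**2 / 3)        # = 2 e^{-n/24}
chebyshev = (n / 4) / a**2                       # Var[X]/a^2 = 4/n
print(exact, chernoff, chebyshev)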
Application: Estimating a parameter
• Given a DNA sample, a lab test can determine whether it carries the mutation. Since the test is expensive, we would like to obtain a relatively reliable estimate from a small number of samples. Let p be the unknown parameter that we want to estimate.
• Assume we have n samples and X = p̃n of these samples have the mutation. For a sufficiently large number of samples, we expect p to be close to p̃.
Def: A 1 − r confidence interval for a parameter p is an interval [p̃ − d, p̃ + d] such that Pr[p ∈ [p̃ − d, p̃ + d]] ≥ 1 − r.

Among the n samples, we find X = p̃n mutation samples. We need to find d and r for which Pr[p ∈ [p̃ − d, p̃ + d]] = Pr[np ∈ [n(p̃ − d), n(p̃ + d)]] ≥ 1 − r.

X = np̃ has a binomial distribution with parameters n and p, so E[X] = np. If p ∉ [p̃ − d, p̃ + d], then one of the following events holds:
(1) if p < p̃ − d, then X = np̃ > n(p + d) = E[X](1 + d/p);
(2) if p > p̃ + d, then X = np̃ < n(p − d) = E[X](1 − d/p).
Thus, by the Chernoff bounds above,
Pr[p ∉ [p̃ − d, p̃ + d]] = Pr[X < np(1 − d/p)] + Pr[X > np(1 + d/p)] ≤ e^{−np(d/p)²/2} + e^{−np(d/p)²/3} = e^{−nd²/2p} + e^{−nd²/3p}.

Setting r = e^{−nd²/2p} + e^{−nd²/3p}, we have a tradeoff between d, n, and r.
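One way to read the tradeoff, sketched below: fix the interval half-width d and the failure probability r, and search for the smallest n that satisfies the bound. The function name and the conservative choice p = 1, which maximizes both exponential terms since p ≤ 1, are assumptions made for illustration:

import numpy as np

def samples_needed(d, r, p=1.0):
    # smallest power of 2 with e^{-n d^2/2p} + e^{-n d^2/3p} <= r
    n = 1
    while np.exp(-n * d**2 / (2 * p)) + np.exp(-n * d**2 / (3 * p)) > r:
        n *= 2
    return n

print(samples_needed(d=0.05, r=0.05))   # on the order of a few thousand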
Better bounds for special cases

Theorem. Let X = X_1 + ⋯ + X_n, where X_1, …, X_n are n independent random variables with Pr[X_i = 1] = Pr[X_i = −1] = 1/2. For any a > 0, Pr[X ≥ a] ≤ e^{−a²/2n}.

Pf: For any t > 0, E[e^{tX_i}] = e^t/2 + e^{−t}/2. Using the Taylor series e^t = 1 + t + t²/2! + ⋯ + t^i/i! + ⋯ and e^{−t} = 1 − t + t²/2! + ⋯ + (−1)^i t^i/i! + ⋯, we get
E[e^{tX_i}] = Σ_{i≥0} t^{2i}/(2i)! ≤ Σ_{i≥0} (t²/2)^i/i! = e^{t²/2}   (using (2i)! ≥ 2^i i!).
Thus E[e^{tX}] = Π_{i=1}^n E[e^{tX_i}] ≤ e^{t²n/2}, and Pr[X ≥ a] = Pr[e^{tX} ≥ e^{ta}] ≤ E[e^{tX}]/e^{ta} ≤ e^{t²n/2}/e^{ta}. Setting t = a/n, we have Pr[X ≥ a] ≤ e^{−a²/2n}. By symmetry, we also have Pr[X ≤ −a] ≤ e^{−a²/2n}.
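A Monte Carlo sanity check of this bound (the sample sizes below are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
n, a, trials = 100, 25, 200_000

# X is a sum of n independent +/-1 variables, replicated `trials` times
X = rng.choice([-1, 1], size=(trials, n)).sum(axis=1)
print("empirical Pr[X >= a]:", (X >= a).mean())
print("bound e^{-a^2/2n}:   ", np.exp(-a**2 / (2 * n)))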
Better bounds for special cases

Corollary. Let X = X_1 + ⋯ + X_n, where X_1, …, X_n are n independent random variables with Pr[X_i = 1] = Pr[X_i = −1] = 1/2. For any a > 0, Pr[|X| ≥ a] ≤ 2e^{−a²/2n}.

Letting Y_i = (X_i + 1)/2, we have the following.

Corollary. Let Y = Y_1 + ⋯ + Y_n, where Y_1, …, Y_n are n independent random variables with Pr[Y_i = 1] = Pr[Y_i = 0] = 1/2. Let μ = E[Y] = n/2. (1) For any a > 0, Pr[Y ≥ μ + a] ≤ e^{−2a²/n}. (2) For any δ > 0, Pr[Y ≥ (1+δ)μ] ≤ e^{−δ²μ}.

Pf: Y = Σ_{i=1}^n Y_i = (1/2)(Σ_{i=1}^n X_i) + n/2 = X/2 + μ. Thus Pr[Y ≥ μ + a] = Pr[X ≥ 2a] ≤ e^{−(2a)²/2n} = e^{−2a²/n}.
To prove (2), set a = δμ = δn/2. Thus Pr[Y ≥ (1+δ)μ] = Pr[X ≥ 2δμ] ≤ e^{−2δ²μ²/n} = e^{−δ²μ}, since μ = n/2.
Better bounds for special cases

Corollary. Let Y = Y_1 + ⋯ + Y_n, where Y_1, …, Y_n are n independent random variables with Pr[Y_i = 1] = Pr[Y_i = 0] = 1/2. Let μ = E[Y] = n/2. (1) For any 0 < a < μ, Pr[Y ≤ μ − a] ≤ e^{−2a²/n}. (2) For any 0 < δ < 1, Pr[Y ≤ (1−δ)μ] ≤ e^{−δ²μ}.
Application (Set balancing): Given an n × m matrix A with entries in {0, 1}, let v be an m-dimensional vector with entries in {1, −1} and c be an n-dimensional vector such that Av = c. The goal is to choose v so that the entries of c are small in absolute value.

Theorem. For a random vector v with entries chosen independently and with equal probability from the set {1, −1}, Pr[max_i |c_i| ≥ √(4m ln n)] ≤ 2/n.
Proof of set balancing:

Proof: Let the i-th row of A be a_i = (a_{i,1}, ⋯, a_{i,m}) and suppose there are k 1s in a_i. If k ≤ √(4m ln n), then clearly |a_i v| ≤ √(4m ln n). Suppose k > √(4m ln n); then there are k non-zero terms in Z_i = Σ_{j=1}^m a_{i,j} v_j, which is a sum of k independent ±1 random variables.
By the Chernoff bound and the fact that m ≥ k, we have Pr[|Z_i| > √(4m ln n)] ≤ 2e^{−4m ln n / 2k} ≤ 2/n². By the union bound over the n rows, the probability that some row exceeds the bound is at most n · 2/n² = 2/n.