Chernoff bounds

The Chernoff bound for a random variable X is obtained as follows: for any t > 0,
  Pr[X ≥ a] = Pr[e^{tX} ≥ e^{ta}] ≤ E[e^{tX}] / e^{ta}.
Similarly, for any t < 0,
  Pr[X ≤ a] = Pr[e^{tX} ≥ e^{ta}] ≤ E[e^{tX}] / e^{ta}.
The value of t that minimizes E[e^{tX}] / e^{ta} gives the best possible bound.

Moment generating functions
• Def: The moment generating function of a random variable X is M_X(t) = E[e^{tX}].
• E[X^n] = M_X^{(n)}(0), the n-th derivative of M_X(t) evaluated at t = 0.
• Fact: If M_X(t) = M_Y(t) for all t in (−c, c) for some c > 0, then X and Y have the same distribution.
• If X and Y are independent random variables, then M_{X+Y}(t) = M_X(t) M_Y(t).

Chernoff bounds for the sum of Poisson trials
• Poisson trials: a sum of independent 0-1 random variables whose success probabilities need not be identical.
• Bernoulli trials: same as above except that all the random variables are identically distributed.
• Let X_1, …, X_n be mutually independent 0-1 random variables with Pr[X_i = 1] = p_i. Let X = X_1 + … + X_n and μ = E[X] = p_1 + … + p_n. Then
  M_{X_i}(t) = E[e^{tX_i}] = p_i e^t + (1 − p_i) = 1 + p_i(e^t − 1) ≤ exp(p_i(e^t − 1)),
using 1 + x ≤ e^x.

Chernoff bound for a sum of Poisson trials
• By independence,
  M_X(t) = M_{X_1}(t) M_{X_2}(t) ⋯ M_{X_n}(t) ≤ exp{(p_1 + … + p_n)(e^t − 1)} = exp{(e^t − 1)μ}.

Theorem. Let X = X_1 + ⋯ + X_n, where X_1, …, X_n are n independent trials such that Pr[X_i = 1] = p_i holds for each i = 1, 2, …, n, and let μ = E[X]. Then,
  (1) for any δ > 0, Pr[X ≥ (1+δ)μ] ≤ (e^δ / (1+δ)^{1+δ})^μ;
  (2) for δ ∈ (0, 1], Pr[X ≥ (1+δ)μ] ≤ e^{−μδ²/3};
  (3) for R ≥ 6μ, Pr[X ≥ R] ≤ 2^{−R}.

Proof: By Markov's inequality, for any t > 0 we have
  Pr[X ≥ (1+δ)μ] = Pr[e^{tX} ≥ e^{t(1+δ)μ}] ≤ E[e^{tX}] / e^{t(1+δ)μ} ≤ e^{(e^t − 1)μ} / e^{t(1+δ)μ}.
For any δ > 0, setting t = ln(1+δ) > 0 yields (1). To prove (2), we need to show that for 0 < δ ≤ 1, e^δ / (1+δ)^{1+δ} ≤ e^{−δ²/3}. Taking the logarithm of both sides, this becomes δ − (1+δ)ln(1+δ) + δ²/3 ≤ 0, which can be proved with calculus. To prove (3), let R = (1+δ)μ. Then, for R ≥ 6μ, δ = R/μ − 1 ≥ 5.
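As a numerical sanity check of bound (1), the following sketch compares it against a Monte Carlo estimate of the true tail probability for a sum of Poisson trials. The parameters (100 trials with p_i = 0.1, so μ = 10, and δ = 1) are arbitrary choices for illustration.

```python
import math
import random

def chernoff_upper(mu, delta):
    """Bound (1): Pr[X >= (1+delta)*mu] <= (e^delta / (1+delta)^(1+delta))^mu."""
    return (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu

def empirical_tail(ps, delta, trials=20_000, seed=0):
    """Monte Carlo estimate of Pr[X >= (1+delta)*mu] for independent
    Poisson trials with success probabilities ps."""
    rng = random.Random(seed)
    threshold = (1 + delta) * sum(ps)
    hits = sum(
        sum(1 for p in ps if rng.random() < p) >= threshold
        for _ in range(trials)
    )
    return hits / trials

ps = [0.1] * 100      # 100 Poisson trials, mu = 10
delta = 1.0
bound = chernoff_upper(sum(ps), delta)
est = empirical_tail(ps, delta)
print(f"Chernoff bound = {bound:.4g}, empirical tail = {est:.4g}")
```

The empirical frequency should fall below the bound; the bound is loose here (roughly 0.02 versus a true tail of about 0.002), which is typical of Chernoff bounds far from the mean.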
Hence, using (1),
  Pr[X ≥ (1+δ)μ] ≤ (e^δ / (1+δ)^{1+δ})^μ ≤ (e / (1+δ))^{(1+δ)μ} ≤ (e/6)^R ≤ 2^{−R}.

• Similarly, we have:

Theorem. Let X = Σ_{i=1}^n X_i, where X_1, …, X_n are n independent Poisson trials such that Pr[X_i = 1] = p_i. Let μ = E[X]. Then, for 0 < δ < 1:
  (1) Pr[X ≤ (1−δ)μ] ≤ (e^{−δ} / (1−δ)^{1−δ})^μ;
  (2) Pr[X ≤ (1−δ)μ] ≤ e^{−μδ²/2}.

Corollary. For 0 < δ < 1, Pr[|X − μ| ≥ δμ] ≤ 2e^{−μδ²/3}.

• Example: Let X be the number of heads in n independent fair coin flips. Applying the above Corollary with μ = n/2:
  Pr[|X − n/2| ≥ √(6n ln n)/2] ≤ 2 exp(−(1/3)(n/2)(6 ln n / n)) = 2e^{−ln n} = 2/n;
  Pr[|X − n/2| ≥ n/4] ≤ 2 exp(−(1/3)(n/2)(1/2)²) = 2e^{−n/24}.
By Chebyshev's inequality, i.e. Pr[|X − E[X]| ≥ a] ≤ Var[X]/a², we only get Pr[|X − n/2| ≥ n/4] ≤ 4/n.

Application: Estimating a parameter
• Given a DNA sample, a lab test can determine whether it carries a particular mutation. Since the test is expensive, we would like to obtain a relatively reliable estimate from a small number of samples. Let p be the unknown parameter that we want to estimate.
• Assume we have n samples and X = p̃n of these samples have the mutation. For a sufficiently large number of samples, we expect p̃ to be close to p.

Def: A 1 − γ confidence interval for a parameter p is an interval [p̃ − δ, p̃ + δ] such that Pr[p ∈ [p̃ − δ, p̃ + δ]] ≥ 1 − γ.

Among the n samples, we find X = p̃n mutation samples. We need to find δ and γ for which Pr[p ∈ [p̃ − δ, p̃ + δ]] = Pr[np ∈ [n(p̃ − δ), n(p̃ + δ)]] ≥ 1 − γ. X = p̃n has a binomial distribution with parameters n and p, so E[X] = np. If p ∉ [p̃ − δ, p̃ + δ], then one of the following events holds:
  (1) if p < p̃ − δ, then X = np̃ > n(p + δ) = E[X](1 + δ/p);
  (2) if p > p̃ + δ, then X = np̃ < n(p − δ) = E[X](1 − δ/p).
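The coin-flip example above can be checked numerically: for the deviation n/4 (δ = 1/2), the Chernoff bound 2e^{−n/24} is exponentially small while Chebyshev gives only the polynomial bound 4/n. A minimal sketch, with n = 1000 as an arbitrary illustration:

```python
import math

def chernoff_two_sided(mu, delta):
    """Corollary: Pr[|X - mu| >= delta*mu] <= 2*exp(-mu*delta^2/3)."""
    return 2 * math.exp(-mu * delta ** 2 / 3)

def chebyshev(var, a):
    """Chebyshev: Pr[|X - E[X]| >= a] <= Var[X] / a^2."""
    return var / a ** 2

n = 1000                 # fair coin flips: mu = n/2, Var = n/4
mu, var = n / 2, n / 4

# Deviation n/4 corresponds to delta = 1/2.
print("Chernoff:", chernoff_two_sided(mu, 0.5))   # 2e^{-n/24}
print("Chebyshev:", chebyshev(var, n / 4))        # 4/n
```

For n = 1000 the Chernoff bound is about 2e^{−41.7} ≈ 10^{−18}, versus 0.004 from Chebyshev.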
Thus
  Pr[p ∉ [p̃ − δ, p̃ + δ]] = Pr[X < np(1 − δ/p)] + Pr[X > np(1 + δ/p)] ≤ e^{−np(δ/p)²/2} + e^{−np(δ/p)²/3} = e^{−nδ²/2p} + e^{−nδ²/3p}.
Setting γ = e^{−nδ²/2p} + e^{−nδ²/3p}, we have a tradeoff among δ, n and γ.

Better bounds for special cases

Theorem. Let X = X_1 + ⋯ + X_n, where X_1, …, X_n are n independent random variables with Pr[X_i = 1] = Pr[X_i = −1] = 1/2. For any a > 0,
  Pr[X ≥ a] ≤ e^{−a²/2n}.
Pf: For any t > 0, E[e^{tX_i}] = e^t/2 + e^{−t}/2. Using the Taylor series e^t = 1 + t + t²/2! + ⋯ + t^i/i! + ⋯ and e^{−t} = 1 − t + t²/2! − ⋯ + (−1)^i t^i/i! + ⋯, we get
  E[e^{tX_i}] = Σ_{i≥0} t^{2i}/(2i)! ≤ Σ_{i≥0} (t²/2)^i / i! = e^{t²/2}.
Thus E[e^{tX}] = Π_{i=1}^n E[e^{tX_i}] ≤ e^{t²n/2}, and
  Pr[X ≥ a] = Pr[e^{tX} ≥ e^{ta}] ≤ E[e^{tX}]/e^{ta} ≤ e^{t²n/2}/e^{ta}.
Setting t = a/n, we have Pr[X ≥ a] ≤ e^{−a²/2n}. By symmetry, Pr[X ≤ −a] ≤ e^{−a²/2n}.

Better bounds for special cases

Corollary. Let X = X_1 + ⋯ + X_n, where X_1, …, X_n are n independent random variables with Pr[X_i = 1] = Pr[X_i = −1] = 1/2. For any a > 0, Pr[|X| ≥ a] ≤ 2e^{−a²/2n}.

Letting Y_i = (X_i + 1)/2, we have the following.

Corollary. Let Y = Y_1 + ⋯ + Y_n, where Y_1, …, Y_n are n independent random variables with Pr[Y_i = 1] = Pr[Y_i = 0] = 1/2. Let μ = E[Y] = n/2. Then
  (1) for any a > 0, Pr[Y ≥ μ + a] ≤ e^{−2a²/n};
  (2) for any δ > 0, Pr[Y ≥ (1+δ)μ] ≤ e^{−δ²μ}.
Pf: Y = Σ_{i=1}^n Y_i = (Σ_{i=1}^n X_i)/2 + n/2 = X/2 + μ. Thus
  Pr[Y ≥ μ + a] = Pr[X ≥ 2a] ≤ e^{−4a²/2n} = e^{−2a²/n}.
To prove (2), set a = δμ = δn/2. Then
  Pr[Y ≥ (1+δ)μ] = Pr[X ≥ 2δμ] ≤ e^{−2δ²μ²/n} = e^{−δ²μ}.

Better bounds for special cases

Corollary. Let Y = Y_1 + ⋯ + Y_n, where Y_1, …, Y_n are n independent random variables with Pr[Y_i = 1] = Pr[Y_i = 0] = 1/2. Let μ = E[Y] = n/2. Then
  (1) for any μ > a > 0, Pr[Y ≤ μ − a] ≤ e^{−2a²/n};
  (2) for any 1 > δ > 0, Pr[Y ≤ (1−δ)μ] ≤ e^{−δ²μ}.
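Since X = 2B − n where B ~ Binomial(n, 1/2), the theorem's bound e^{−a²/2n} for ±1 variables can be compared against the exact tail probability computed from binomial coefficients. A small sketch (n = 100, a = 20 chosen for illustration):

```python
import math

def exact_tail_pm1(n, a):
    """Exact Pr[X >= a] for X a sum of n independent fair +/-1 signs.
    X = 2B - n with B ~ Binomial(n, 1/2), so X >= a iff B >= (n + a)/2."""
    k_min = math.ceil((n + a) / 2)
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

def chernoff_tail_pm1(n, a):
    """Theorem bound: Pr[X >= a] <= exp(-a^2 / (2n))."""
    return math.exp(-a ** 2 / (2 * n))

n, a = 100, 20
exact = exact_tail_pm1(n, a)
bound = chernoff_tail_pm1(n, a)
print(f"exact = {exact:.4g}, bound = {bound:.4g}")
```

Here the bound is e^{−2} ≈ 0.135, while the exact tail (two standard deviations out) is about 0.028; the bound always dominates, as the theorem guarantees.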
Application
Set balancing: Given an n × m matrix A with entries 0 or 1, let v be an m-dimensional vector with entries in {1, −1} and let c be the n-dimensional vector such that Av = c. The goal is to choose v so that every |c_i| is small.

Theorem. For a random vector v with entries chosen independently and with equal probability from the set {1, −1},
  Pr[max_i |c_i| ≥ √(4m ln n)] ≤ 2/n.

Proof of set balancing:
Proof: Let the i-th row of A be a_i = (a_{i,1}, ⋯, a_{i,m}), and suppose there are k 1s in a_i. If k ≤ √(4m ln n), then clearly |a_i · v| ≤ √(4m ln n). Suppose k > √(4m ln n). Then there are k non-zero terms in Z_i = Σ_{j=1}^m a_{i,j} v_j, which is a sum of k independent ±1 random variables. By the Chernoff bound and the fact that m ≥ k, we have
  Pr[|Z_i| > √(4m ln n)] ≤ 2e^{−4m ln n / 2k} ≤ 2/n².
By the union bound, the probability that this bound fails for any row is at most n · 2/n² = 2/n.
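The set-balancing guarantee can be observed empirically: for a fixed random 0-1 matrix A, the fraction of random ±1 vectors v whose maximum deviation reaches √(4m ln n) should be at most 2/n. A minimal simulation sketch, with n = m = 50 and the trial count chosen arbitrarily:

```python
import math
import random

def max_deviation(A, rng):
    """Draw a random +/-1 vector v and return max_i |(Av)_i|."""
    m = len(A[0])
    v = [rng.choice((1, -1)) for _ in range(m)]
    return max(abs(sum(a * x for a, x in zip(row, v))) for row in A)

rng = random.Random(1)
n, m = 50, 50
A = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n)]
threshold = math.sqrt(4 * m * math.log(n))   # ~ 28 for these parameters

trials = 2000
bad = sum(max_deviation(A, rng) >= threshold for _ in range(trials))
print(f"empirical failure rate = {bad / trials:.4g}, theorem bound = {2 / n}")
```

In practice the empirical failure rate is far below 2/n, since the per-row bound 2/n² is itself quite loose for rows with few 1s.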