
Introduction to Probability
Stat 134, Fall 2005, Berkeley
Lectures prepared by: Elchanan Mossel, Yelena Shvets
Follows Jim Pitman's book: Probability, Section 2.2

Binomial Distribution: Recall the Mean ($\mu$)
The mean of a binomial(n, p) distribution is given by:
$\mu = \#\text{trials} \times P(\text{success}) = np$
• The mean ≈ the mode (most likely value); the two coincide when np is an integer.
• The mean = "center of gravity" of the distribution.

Standard Deviation
The standard deviation (SD) of a distribution, denoted $\sigma$, measures the spread of the distribution.
Def: The standard deviation $\sigma$ of the binomial(n, p) distribution is given by:
$\sigma = \sqrt{np(1-p)}$

Example 1: Binomial Distribution
• Let's compare Bin(100, 1/4) to Bin(100, 1/2).
• $\mu_1 = 25$, $\mu_2 = 50$
• $\sigma_1 = 4.33$, $\sigma_2 = 5$
• We expect Bin(100, 1/4) to be less spread out than Bin(100, 1/2).
• Indeed, you are more likely to guess the value of a Bin(100, 1/4) variable than of a Bin(100, 1/2) variable, since:
For p = 1/2: $P(50) \approx 0.0796$;
For p = 1/4: $P(25) \approx 0.0918$.
[Figure: histograms for binomial(100, 1/4) and binomial(100, 1/2); horizontal axis 0 to 100, vertical axis 0 to 0.12.]

Example 2: Binomial Distribution
• Let's compare Bin(50, 1/2) to Bin(100, 1/2).
• $\mu_1 = 25$, $\mu_2 = 50$
• $\sigma_1 = 3.54$, $\sigma_2 = 5$
• We expect Bin(50, 1/2) to be less spread out than Bin(100, 1/2).
• Indeed, you are more likely to guess the value of a Bin(50, 1/2) variable than of a Bin(100, 1/2) variable, since:
For n = 100: $P(50) \approx 0.0796$;
For n = 50: $P(25) \approx 0.1123$.
[Figure: histograms for binomial(50, 1/2) and binomial(100, 1/2); horizontal axis 0 to 100, vertical axis 0 to 0.12.]

$\mu$, $\sigma$ and the normal curve
• We will see today that the mean ($\mu$) and the SD ($\sigma$) give a very good summary of the binomial(n, p) distribution via the normal curve with parameters $\mu$ and $\sigma$.
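The comparisons in Examples 1 and 2 can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `binom_mean_sd` are made up for this illustration and do not come from the lecture.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1-p)) of binomial(n, p)."""
    return n * p, sqrt(n * p * (1 - p))

# The three distributions compared in Examples 1 and 2.
for n, p in [(100, 0.25), (100, 0.5), (50, 0.5)]:
    mu, sigma = binom_mean_sd(n, p)
    k = round(mu)  # np is an integer here, so this is also the mode
    print(f"Bin({n},{p}): mean={mu:.0f}, SD={sigma:.2f}, P({k})={binom_pmf(k, n, p):.4f}")
```

In each pair, the distribution with the smaller SD puts more probability on its most likely value, which is the sense in which it is easier to guess.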
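[Figure: histograms of binomial(n, ½) for n = 50, 100, 250, 500, with ($\mu$, $\sigma$) = (25, 3.54), (50, 5), (125, 7.91), (250, 11.2).]

Normal Curve
$y = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
[Figure: the normal curve, with $\mu - \sigma$, $\mu$, and $\mu + \sigma$ marked on the x-axis.]

The Normal Distribution
$P(-\infty, a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx$
$P(a, b) = \int_{a}^{b} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx$
[Figure: the area under the normal curve between a and b.]

The Normal Distribution
$P(a, b) = \int_{a}^{b} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx = \int_{-\infty}^{b} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx - \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx$

Standard Normal
Standard normal density function:
$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$
Corresponds to $\mu = 0$ and $\sigma = 1$.

Standard Normal
Standard normal cumulative distribution function:
$\Phi(z) = \int_{-\infty}^{z} \phi(x)\, dx$
For the normal($\mu$, $\sigma$) distribution:
$P(a, b) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$
[Figure: the density $\phi(x)$ and the cumulative distribution function $\Phi(z)$, plotted from -5 to 5.]
[Figure: $\phi$ and $\Phi$ with $\Phi(z)$, $\Phi(-z)$, and $1 - \Phi(z)$ marked at $-z$ and $z$; $\Phi(0) = 1/2$.]

Standard Normal
Standard normal cumulative distribution function. For the normal($\mu$, $\sigma$) distribution:
$P(a, b) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$
To prove this, it suffices to show:
$P(-\infty, a) = \Phi\left(\frac{a-\mu}{\sigma}\right)$
Change of variables: $s = \frac{x-\mu}{\sigma}$, so $dx = \sigma\, ds$, and $x = a$ corresponds to $s = \frac{a-\mu}{\sigma}$. Then
$\int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx = \int_{-\infty}^{\frac{a-\mu}{\sigma}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{s^2}{2}}\, ds = \Phi\left(\frac{a-\mu}{\sigma}\right)$

Properties of $\Phi$:
$\Phi(0) = 1/2$; $\Phi(-z) = 1 - \Phi(z)$; $\Phi(-\infty) = 0$; $\Phi(\infty) = 1$.
$\Phi$ does not have a closed-form formula!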
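Although $\Phi$ has no closed form, it can be evaluated numerically. The sketch below uses the error function from Python's standard library through the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$; the function names `phi`, `Phi`, and `normal_prob` are chosen for this illustration and are not part of the lecture.

```python
from math import erf, exp, pi, sqrt

def phi(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def Phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob(a, b, mu, sigma):
    """P(a, b) for the normal(mu, sigma) distribution, by standardizing."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

print(Phi(0))                       # property: Phi(0) = 1/2
print(Phi(-1.3) + Phi(1.3))         # property: Phi(-z) = 1 - Phi(z)
print(normal_prob(45, 55, 50, 5))   # area within one SD of the mean
```

The last line standardizes the interval (45, 55) for $\mu = 50$, $\sigma = 5$ to (-1, 1), exactly as in the change-of-variables argument.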
Normal Approximation of a binomial
For n independent trials with success probability p:
$P(a \text{ to } b \text{ successes}) \approx \Phi\left(\frac{b + \frac{1}{2} - \mu}{\sigma}\right) - \Phi\left(\frac{a - \frac{1}{2} - \mu}{\sigma}\right)$
where: $\mu = np$, $\sigma = \sqrt{np(1-p)}$.
[Figure: histograms of binomial(n, ½) for n = 50, 100, 250, 500, with ($\mu$, $\sigma$) = (25, 3.54), (50, 5), (125, 7.91), (250, 11.2).]
[Figure: the Normal($\mu$, $\sigma$) curves for the same four ($\mu$, $\sigma$) pairs.]

Normal Approximation of a binomial
• The 0.5 correction (the $\pm\frac{1}{2}$ in the formula) is called the "continuity correction".
• It is important when $\sigma$ is small or when a and b are close.

Normal Approximation
Question: Find P(H > 40) in 100 tosses.
Normal approximation to bin(100, 1/2), using the curve $\frac{1}{\sqrt{2\pi}\cdot 5}\, e^{-\frac{(x-50)^2}{2\cdot 5^2}}$:
$1 - \Phi\left(\frac{40-50}{5}\right) = 1 - \Phi(-2) = \Phi(2) \approx 0.9772$
What do we get with the continuity correction?
Exact: P(#H > 40) = 0.971556.

When does the Normal Approximation fail?
[Figure: bin(1, 1/2) against N(0.5, 0.5); $\mu = 0.5$, $\sigma = 0.5$.]
[Figure: bin(100, 1/100) against N(1, 1); $\mu = 1$, $\sigma \approx 1$.]

Rule of Thumb
The normal approximation works better:
• the larger $\sigma$ is;
• the closer p is to ½.

Fluctuation in the number of successes
From the normal approximation it follows that:
P($\mu - \sigma$ to $\mu + \sigma$ successes in n trials) ≈ 68%
P($\mu - 2\sigma$ to $\mu + 2\sigma$ successes in n trials) ≈ 95%
P($\mu - 3\sigma$ to $\mu + 3\sigma$ successes in n trials) ≈ 99.7%
P($\mu - 4\sigma$ to $\mu + 4\sigma$ successes in n trials) ≈ 99.99%

Fluctuation in the proportion of successes
Typical size of fluctuation in the number of successes is:
$\sigma = \sqrt{np(1-p)}$
Typical size of fluctuation in the proportion of successes is:
$\frac{\sigma}{n} = \sqrt{\frac{p(1-p)}{n}}$

Square Root Law
Let n be a large number of independent trials with probability of success p on each.
The number of successes will, with high probability, lie in an interval centered on the mean np, with a width that is a moderate multiple of $\sqrt{n}$.
The proportion of successes will lie in a small interval centered on p, with a width that is a moderate multiple of $1/\sqrt{n}$.
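The coin-tossing question above, P(H > 40) in 100 tosses, can be worked three ways: without the continuity correction, with it, and exactly. This is a minimal sketch in standard-library Python; the helper names are invented for the illustration, and $\Phi$ is evaluated through `math.erf`, which is an implementation choice rather than anything stated in the lecture.

```python
from math import comb, erf, sqrt

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx(a, b, n, p):
    """P(a to b successes) with the +/- 1/2 continuity correction."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    return Phi((b + 0.5 - mu) / sigma) - Phi((a - 0.5 - mu) / sigma)

def exact_binom(a, b, n, p):
    """Exact P(a to b successes) by summing binomial probabilities."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(a, b + 1))

n, p = 100, 0.5
uncorrected = 1 - Phi((40 - 50) / 5)     # the calculation on the slide
corrected = normal_approx(41, n, n, p)   # H > 40 means 41 to 100 heads
exact = exact_binom(41, n, n, p)
print(f"uncorrected={uncorrected:.4f} corrected={corrected:.4f} exact={exact:.6f}")
```

With the correction, the lower endpoint becomes 40.5, so the estimate is essentially $\Phi(1.9)$ rather than $\Phi(2)$, which lands much closer to the exact value.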
Law of large numbers
Let n be a number of independent trials, with probability p of success on each. For each $\epsilon > 0$:
$P\left(\left|\frac{\#\text{successes}}{n} - p\right| < \epsilon\right) \to 1$ as $n \to \infty$.

Confidence intervals
Suppose we observe the results of n trials with an unknown probability of success p.
The observed frequency of successes is $\hat{p} = \frac{\#\text{successes}}{n}$.

Conf Intervals
The Normal Curve Approximation says that for any fixed p and n large enough, there is a 99.99% chance that the observed frequency $\hat{p}$ will differ from p by less than $4\sqrt{\frac{p(1-p)}{n}}$.
It's easy to see that $\sqrt{p(1-p)} \le \frac{1}{2}$, so $4\sqrt{\frac{p(1-p)}{n}} \le \frac{2}{\sqrt{n}}$.

Conf Intervals
$\left(\hat{p} - \frac{2}{\sqrt{n}},\ \hat{p} + \frac{2}{\sqrt{n}}\right)$ is called a 99.99% confidence interval.

[Figure: histograms of binomial(n, ½) and the matching Normal($\mu$, $\sigma$) curves for n = 50, 100, 250, 500.]

Normal Approximation
[Figure: the normal curve $\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ with $\mu = 50$, $\sigma = 5$, drawn over histograms labeled bin(50, 1/2) ($\mu = 25$, $\sigma = 3.535534$), bin(50, 1/2) + 25 ($\mu = 50$, $\sigma = 3.535534$), (bin(50, 1/2) + 25) * 5/3.535534 ($\mu = 50$, $\sigma = 5$), and bin(100, 1/2) ($\mu = 50$, $\sigma = 5$); horizontal axis 0 to 100.]

Notre Dame de Rheims
Quote: Soon I was able to show a few sketches. No one understood anything of them. Even those who were favorable to my perception of the truths that I later wished to engrave in the temple congratulated me on having discovered them with a "microscope", when I had, on the contrary, used a telescope to perceive things that were indeed very small, but only because they were situated at a great distance […]. Where I was seeking the great laws, I was called a rummager of details. (TR, p.346)
