251distrl 11/19/04 (Open this document in 'Outline' view!)

L. Discrete Distributions.
  1. Binomial Distribution.
    a. Formula: $P(x) = C_x^n p^x q^{n-x}$, $q = 1 - p$. Gives probability of $x$ successes in $n$ tries. $p$ is probability of success on 1 try.
    b. Mean: $\mu = np$.
    c. Variance: $\sigma^2 = npq$.
    d. Replacement of $x$ by observed proportion $\bar{p}$ in formulas. $E(\bar{p}) = p$, $\mathrm{Var}(\bar{p}) = \frac{pq}{n}$.
    e. Cumulative Distribution - Use of Tables. Always use tables if possible! Note that for any discrete distribution (with integer values for $x$) like the ones in this section $P(x_1 \le x \le x_2) = F(x_2) - F(x_1 - 1)$. (But for a continuous distribution like the normal or continuous uniform distribution $P(x_1 \le x \le x_2) = F(x_2) - F(x_1)$.) Also note that for $p > .5$ the problem must be totally recast in terms of failures to use the table.
  2. Geometric Distribution.
    a. Formula: $P(x) = q^{x-1} p$. Gives probability that the first success occurs on try $x$.
    b. Mean and Variance: $\mu = \frac{1}{p}$, $\sigma^2 = \frac{q}{p^2}$.
    c. Cumulative Distribution: $F(x) = 1 - q^x$. This is the formula you really use!
  3. Poisson Distribution.
    a. Formula: $P(x) = \frac{e^{-m} m^x}{x!}$. Gives probability of $x$ successes in an interval in which the average number of successes is $m$. Recursive version: $P(x) = \frac{m}{x} P(x-1)$.
    b. Mean and Variance: $\mu = m$, $\sigma^2 = m$.
    c. Cumulative Distribution - Use of Tables.
    d. Use Poisson as an approximation for the Binomial Distribution. $m = np$ if $\frac{n}{p} > 500$.
  4. Hypergeometric Distribution.
    a. Formula: $P(x) = \frac{C_{n-x}^{N-M} C_x^M}{C_n^N}$. (Note that $x$, $n$, $N$ and $M$ are integers!) Gives the probability of $x$ successes in a sample of $n$ taken from a population of $N$ in which there are $M$ successes. There is a recursive formula for the Hypergeometric distribution that can save you some time in repeated calculations with the same distribution - see appendix.
    b. Mean and Variance: If $p = \frac{M}{N}$, $\mu = np$, $\sigma^2 = \frac{N-n}{N-1}\, npq$.
    c. What if $N$ is infinite? Use the Binomial Distribution! This works if $N > 20n$.
  5. Summary (Not in 251 distr.
See also 251greatD)
    Remember:
    (i) If you are looking for numbers of successes when the number of tries is given and the probability of success is constant, you want the Binomial distribution.
    (ii) If you are looking for the try on which the first success occurs out of many possible tries when the probability of success is constant, you want the Geometric distribution.
    (iii) If you are looking for numbers of successes when the average number of successes per unit time or space is given, you want the Poisson distribution.
    (iv) If you are looking for numbers of successes when the number of tries is given and the probability of success is not constant because the total number of successes in the population is limited, you want the Hypergeometric distribution.

M. Continuous Distributions.
  1. Introduction. (For more detail see 251greatD)
    a. Normal Distribution.
    b. Exponential Distribution: $F(x) = 1 - e^{-cx}$, when the mean time to a success is $\frac{1}{c}$.
    c. Chi-squared Distribution.
    d. t Distribution.
    e. F Distribution.
  2. Properties of the Normal Distribution.
    a. Use of Standard Normal Tables. For examples see 251distrex2. For more examples see 251distrex1.
    b. Probabilities for Normal Distributions that are not Standardized. $z = \frac{x - \mu}{\sigma}$. For examples see 251distrex3.
  3. Percentiles and Intervals about the Mean. Read the table backwards to find $z$. $x = \mu + z\sigma$ or $x = \mu \pm z\sigma$. For examples see 251distrex4.
  4. Normal Approximation to the Binomial Distribution.
    a. Without Continuity Correction. $\mu = np$, $\sigma = \sqrt{npq}$ if $np > 5$ and $nq > 5$.
    b. With Continuity Correction. Expand interval by 0.5 in both directions. Use especially if $npq < 9$. For examples see 251distrex5.
  5. Normal Approximation to the Poisson Distribution. $\mu = m$, $\sigma = \sqrt{m}$ if $m > 25$.
  6. Review of Conditions for Approximation of One Distribution by Another.

N. Statistical Sampling.
  1. Definitions. Random Samples, Sampling and Nonsampling Errors. The Law of Large Numbers. Convenience, Judgment and Probability Samples. Simple, Stratified and Cluster Random Samples.
  2.
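Distribution of $\bar{x}$ and $\bar{p}$.
    a. $E(\bar{x}) = \mu$, $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$, so $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. This is the standard deviation of the sample mean, and is often called the standard error.
    b. Finite Population Correction Factor. Use this if sample is more than 5% of population! $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$ or $s_{\bar{x}} = \frac{s_x}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$.
  3. The Central Limit Theorem.
    a. The distribution of $\bar{x}$ approaches $N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$, the normal distribution with mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$, as $n$ becomes large.
    b. Problems involving Probabilities for $x$ and $\bar{x}$.

O. Estimation of Parameters. (For more detail see 251param)
  1. Point and Interval Estimation. Properties of Estimators.
    a. Unbiasedness.
    b. Consistency.
    c. Efficiency.
    d. Maximum Likelihood.
  2. A Confidence Interval for $\mu$ When $\sigma$ is Known. $\mu = \bar{x} \pm z_{\alpha/2}\, \sigma_{\bar{x}}$. You can only use this when you know the population variance. Don't forget that there are two formulas for $\sigma_{\bar{x}}$ depending on sample size (with and without the finite population correction factor)!
  3. A Confidence Interval for $\mu$ When $\sigma$ is not known. $\mu = \bar{x} \pm t_{\alpha/2}^{(n-1)}\, s_{\bar{x}}$. This is what you actually use most of the time! All that "$\sigma$ unknown" means is that we do not have a value of the population variance. If you only have the sample variance, use the t table.

Appendix to L - a Recursive Formula for the Hypergeometric Distribution.

The formula is $P(x) = \frac{M-x+1}{x} \cdot \frac{n-x+1}{N-M-n+x}\, P(x-1)$.

Example: $P(x) = \frac{C_{n-x}^{N-M} C_x^M}{C_n^N}$. Assume that $N = 100$, $M = 10$ and $n = 7$.

$P(0) = \frac{C_7^{90} C_0^{10}}{C_7^{100}} = \frac{90 \cdot 89 \cdot 88 \cdot 87 \cdot 86 \cdot 85 \cdot 84}{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96 \cdot 95 \cdot 94} = .46674$ (Particularly easy to calculate because both the numerator and denominator are divided by $7!$.)

$P(1) = \frac{10-1+1}{1} \cdot \frac{7-1+1}{100-10-7+1}\, P(0)$, $P(2) = \frac{10-2+1}{2} \cdot \frac{7-2+1}{100-10-7+2}\, P(1)$, etc. This amounts to saying that

$P(1) = \frac{10}{1} \cdot \frac{7}{84}\, P(0) = \frac{10}{1} \cdot \frac{7}{84}\,(.46674) = .38895$
$P(2) = \frac{9}{2} \cdot \frac{6}{85}\, P(1) = \frac{9}{2} \cdot \frac{6}{85}\,(.38895) = .12355$
$P(3) = \frac{8}{3} \cdot \frac{5}{86}\, P(2) = \frac{8}{3} \cdot \frac{5}{86}\,(.12355) = .01916$
$P(4) = \frac{7}{4} \cdot \frac{4}{87}\, P(3) = \frac{7}{4} \cdot \frac{4}{87}\,(.01916) = .00154$
$P(5) = \frac{6}{5} \cdot \frac{3}{88}\, P(4) = \frac{6}{5} \cdot \frac{3}{88}\,(.00154) = .00006$

Notice how each part of the fractions changes by 1 each time until the first denominator rises to $n$, the second denominator rises to $N - M$ and the second numerator falls to 1. At that point we have all values of $x$ that can occur.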
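
The recursion in the appendix can also be checked by machine. What follows is only a minimal Python sketch, not part of the course materials: the function name `hypergeom_pmf_recursive` is hypothetical, and the sketch assumes $n \le N - M$ so that $P(0) > 0$ and the recursion can start at $x = 0$. It computes $P(0)$ directly from the combinations formula and then applies the recursive formula for each later $x$.

```python
from math import comb


def hypergeom_pmf_recursive(N, M, n):
    """Return [P(0), P(1), ..., P(min(n, M))] for the hypergeometric distribution.

    N = population size, M = successes in the population, n = sample size.
    Assumption of this sketch: n <= N - M, so P(0) > 0 and we can start at x = 0.
    """
    if not (0 <= M <= N and 0 <= n <= N - M):
        raise ValueError("this sketch requires 0 <= M <= N and 0 <= n <= N - M")

    # Starting value: P(0) = C(N-M, n) * C(M, 0) / C(N, n), and C(M, 0) = 1.
    probs = [comb(N - M, n) / comb(N, n)]

    for x in range(1, min(n, M) + 1):
        # P(x) = [(M - x + 1) / x] * [(n - x + 1) / (N - M - n + x)] * P(x - 1)
        ratio = (M - x + 1) / x * (n - x + 1) / (N - M - n + x)
        probs.append(ratio * probs[-1])

    return probs


# The appendix example: N = 100, M = 10, n = 7.
probs = hypergeom_pmf_recursive(N=100, M=10, n=7)
for x, p in enumerate(probs):
    print(f"P({x}) = {p:.5f}")
```

Because the code carries full floating-point precision from step to step, while the hand calculation rounds each result before reusing it, the last printed digit may differ slightly from the values in the appendix.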