Applied Statistics I
Liang Zhang
Department of Mathematics, University of Utah
June 23, 2008
Hypergeometric Distribution
Assume we are drawing cards from a well-shuffled deck with replacement, one card per draw. We do this 5 times and record whether the outcome is ♠ or not. Then this is a binomial experiment.
If we do the same thing without replacement, then it is NO LONGER a
binomial experiment.
However, if we are drawing from 100 decks of cards without replacement
and record only the first 5 outcomes, then it is approximately a binomial
experiment.
What is the exact model for drawing cards without replacement?
1. The population or set to be sampled consists of N individuals, objects,
or elements (a finite population).
2. Each individual can be characterized as a success (S) or a failure (F),
and there are M successes in the population.
3. A sample of n individuals is selected without replacement in such a way
that each subset of size n is equally likely to be chosen.
Definition
For any experiment which satisfies the above 3 conditions, let X = the
number of S’s in the sample. Then X is a hypergeometric random
variable and we use h(x; n, M, N) to denote the pmf p(x) = P(X = x).
Example:
In the second card-drawing example (without replacement, 52 cards in total), if we let X = the number of ♠'s in the first 5 draws, then X is a hypergeometric random variable with n = 5, M = 13, and N = 52.
For the pmf, the probability of getting exactly x ♠'s (x = 0, 1, 2, 3, 4, or 5) is calculated as follows:

p(x) = P(X = x) = C(13, x) · C(39, 5 − x) / C(52, 5)

where C(n, k) denotes the binomial coefficient "n choose k": C(13, x) is the number of choices for getting x ♠'s, C(39, 5 − x) is the number of choices for getting the remaining 5 − x non-♠ cards, and C(52, 5) is the total number of choices for selecting 5 cards from 52 cards.
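This pmf is straightforward to compute; a minimal Python sketch (the function name hypergeom_pmf is our own, not from the slides):

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x): x successes in a sample of size n drawn without
    replacement from N items, M of which are successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Probability of exactly x spades among the first 5 cards drawn from a 52-card deck
for x in range(6):
    print(x, hypergeom_pmf(x, n=5, M=13, N=52))
```

Summing the six probabilities gives 1, as a pmf must.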
For the same experiment (without replacement, 52 cards in total), if we let X = the number of ♠'s in the first 20 draws, then X is still a hypergeometric random variable, but with n = 20, M = 13, and N = 52.
However, in this case the possible values for X are 0, 1, 2, . . . , 13, and the pmf is

p(x) = P(X = x) = C(13, x) · C(39, 20 − x) / C(52, 20)

where 0 ≤ x ≤ 13.
Proposition
If X is the number of S's in a completely random sample of size n drawn from a population consisting of M S's and (N − M) F's, then the probability distribution of X, called the hypergeometric distribution, is given by

P(X = x) = h(x; n, M, N) = C(M, x) · C(N − M, n − x) / C(N, n)

for x an integer satisfying max(0, n − N + M) ≤ x ≤ min(n, M).
Remark:
If n ≤ M, then the largest possible x is n; if n > M, it is M. Therefore we require x ≤ min(n, M).
Similarly, if n ≤ N − M, then the smallest possible x is 0; if n > N − M, it is n − (N − M). Thus x ≥ max(0, n − N + M).
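The support bounds can be checked numerically; a sketch (the helper name h is ours) for a case where the lower bound max(0, n − N + M) is strictly positive:

```python
from math import comb

def h(x, n, M, N):
    """Hypergeometric pmf h(x; n, M, N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Drawing n = 45 cards from N = 52 forces at least 45 - 39 = 6 spades,
# so the support runs from max(0, n - N + M) = 6 up to min(n, M) = 13.
n, M, N = 45, 13, 52
lo, hi = max(0, n - N + M), min(n, M)
total = sum(h(x, n, M, N) for x in range(lo, hi + 1))
print(lo, hi, total)  # probabilities over the support sum to 1
```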
Example: (Problem 70)
An instructor who taught two sections of engineering statistics last term,
the first with 20 students and the second with 30, decided to assign a term
project. After all projects had been turned in, the instructor randomly
ordered them before grading. Consider the first 15 graded projects.
a. What is the probability that exactly 10 of these are from the second
section?
b. What is the probability that at least 10 of these are from the second
section?
c. What is the probability that at least 10 of these are from the same
section?
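A sketch of one way to set these up (not the textbook's worked solution), using the hypergeometric model with N = 50 projects, M = 30 from the second section, and n = 15 graded first:

```python
from math import comb

def h(x, n, M, N):
    """Hypergeometric pmf h(x; n, M, N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

N, M, n = 50, 30, 15  # 50 projects total, 30 from section 2, first 15 graded

a = h(10, n, M, N)                             # exactly 10 from section 2
b = sum(h(x, n, M, N) for x in range(10, 16))  # at least 10 from section 2
# "at least 10 from the same section": at least 10 from section 2, OR at
# least 10 from section 1 (i.e. at most 5 from section 2) -- disjoint events
c = b + sum(h(x, n, M, N) for x in range(0, 6))
print(a, b, c)
```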
Proposition
The mean and variance of the hypergeometric rv X having pmf h(x; n, M, N) are

E(X) = n · (M/N)    and    V(X) = ((N − n)/(N − 1)) · n · (M/N) · (1 − M/N).

Remark:
The ratio M/N is the proportion of S's in the population. If we replace M/N by p, then we get

E(X) = np    and    V(X) = ((N − n)/(N − 1)) · np(1 − p).

Recall that the mean and variance for a binomial rv are np and np(1 − p). We see that the means for the binomial and hypergeometric rv's are equal, while the variances differ by the factor (N − n)/(N − 1).
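A sketch of the comparison, using the term-project numbers (N = 50, M = 30, n = 15) purely as an illustration:

```python
N, M, n = 50, 30, 15
p = M / N  # proportion of successes in the population

mean_hyper = n * p
var_hyper = (N - n) / (N - 1) * n * p * (1 - p)  # finite-population correction

mean_binom = n * p            # same mean as the hypergeometric rv
var_binom = n * p * (1 - p)   # larger, since (N - n)/(N - 1) < 1 when n > 1

print(mean_hyper, var_hyper, var_binom)
```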
Example (Problem 70) continued:
d. What are the mean value and standard deviation of the number of
projects among these 15 that are from the second section?
e. What are the mean value and standard deviation of the number of second-section projects that are not among these first 15?
Negative Binomial Distribution
Consider the card-drawing example again. This time, we still draw cards from a well-shuffled deck with replacement, one card per draw. However, we keep drawing until we get 5 ♠'s. If we let X = the number of draws that do not give us a ♠, then X is NO LONGER a binomial random variable, but a negative binomial random variable.
1. The experiment consists of a sequence of independent trials.
2. Each trial can result in either a success (S) or a failure (F).
3. The probability of success is constant from trial to trial, so
P(S on trial i) = p for i = 1, 2, 3, . . . .
4. The experiment continues (trials are performed) until a total of r
successes have been observed, where r is a specified positive integer.
Definition
For any experiment which satisfies the above 4 conditions, let X = the
number of failures that precede the rth success. Then X is a negative
binomial random variable and we use nb(x; r , p) to denote the pmf
p(x) = P(X = x).
Remark:
1. In some sources, the negative binomial rv is taken to be the number of
trials X + r rather than the number of failures.
2. If r = 1, we call X a geometric random variable. The pmf for X is then the familiar one

nb(x; 1, p) = (1 − p)^x · p,    x = 0, 1, 2, . . .
Proposition
The pmf of the negative binomial rv X with parameters r = number of S's and p = P(S) is

nb(x; r, p) = C(x + r − 1, r − 1) · p^r · (1 − p)^x,    x = 0, 1, 2, . . .

The mean and variance for X are

E(X) = r(1 − p)/p    and    V(X) = r(1 − p)/p²,

respectively.
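A sketch of this pmf, with two checks: the r = 1 case reduces to the geometric pmf above, and a (truncated) pmf-weighted sum recovers the mean formula. The spade example's p = 1/4 is used for illustration:

```python
from math import comb

def nb(x, r, p):
    """P(X = x): x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

r, p = 5, 0.25  # e.g. drawing until five spades, P(spade) = 1/4

# r = 1 reduces to the geometric pmf (1 - p)^x * p
assert all(abs(nb(x, 1, p) - (1 - p)**x * p) < 1e-15 for x in range(20))

# mean from a truncated pmf-weighted sum vs. the formula r(1 - p)/p
mean_approx = sum(x * nb(x, r, p) for x in range(500))
print(mean_approx, r * (1 - p) / p)  # both close to 15
```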
Example: (Problem 78)
Individual A has a red die and B has a green die (both fair). If they each
roll until they obtain five “doubles” (1 − 1, 2 − 2, . . . , 6 − 6), what is the
pmf of X = the total number of times a die is rolled? What are E (X ) and
V (X )?
Poisson Distribution
Consider the following random variables:
1. The number of people arriving for treatment at an emergency room in
each hour.
2. The number of drivers who travel between Salt Lake City and Sandy
during each day.
3. The number of trees in each square mile in a forest.
None of them are binomial, hypergeometric or negative binomial random
variables.
In fact, the experiments associated with the above random variables DO NOT involve trials. We use the Poisson distribution to model the occurrence of events of some type over time or area.
Definition
A random variable X is said to have a Poisson distribution with parameter λ (λ > 0) if the pmf of X is

p(x; λ) = e^(−λ) · λ^x / x!,    x = 0, 1, 2, . . .

1. The value λ is frequently a rate per unit time or per unit area.
2. e is the base of the natural logarithm system.
3. It is guaranteed that Σ_{x=0}^{∞} p(x; λ) = 1, since

e^λ = 1 + λ + λ²/2! + λ³/3! + · · · = Σ_{x=0}^{∞} λ^x / x!
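Fact 3 can be checked numerically with a truncated sum of the pmf; a sketch:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson pmf p(x; lambda)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 1.5
partial = sum(poisson_pmf(x, lam) for x in range(100))
print(partial)  # the truncated series already equals 1 to machine precision
```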
Example:
The red blood cell (RBC) density in blood is estimated by means of a hematometer. A blood sample is thoroughly mixed with a saline solution and then pipetted onto a slide. The RBC's are counted under a microscope through a square grid. Because the solution is thoroughly mixed, the RBC's have an equal chance of being in any particular square in the grid. It is known that the number of cells counted in a given square follows a Poisson distribution, and the parameter λ for a certain blood sample is believed to be 1.5.
Then what is the probability that there is no RBC in a given square?
What is the probability of a square containing exactly 2 RBC's?
What is the probability of a square containing at most 2 RBC's?
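With λ = 1.5, all three questions can be answered directly from the pmf; a sketch:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson pmf p(x; lambda)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 1.5
p0 = poisson_pmf(0, lam)                                 # no RBC: e^(-1.5) ~ 0.223
p2 = poisson_pmf(2, lam)                                 # exactly 2 RBC's
at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))   # P(X <= 2)
print(p0, p2, at_most_2)
```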
Proposition
If X has a Poisson distribution with parameter λ, then E (X ) = V (X ) = λ.
We see that the parameter λ equals both the mean and the variance of the Poisson random variable X.
e.g. for the previous example, the expected number of RBC’s per square is
thus 1.5 and the variance is also 1.5.
In practice, the parameter is usually unknown to us. However, we can use the sample mean to estimate it. For example, if we observed 15 RBC's over 10 squares, then we can use x̄ = 15/10 = 1.5 to estimate λ.
Poisson Process: the occurrence of events over time.
1. There exists a parameter α > 0 such that for any short time interval of length ∆t, the probability that exactly one event occurs is α · ∆t + o(∆t).
2. The probability of more than one event occurring during ∆t is o(∆t) [which, along with Assumption 1, implies that the probability of no events during ∆t is 1 − α · ∆t − o(∆t)].
3. The number of events occurring during the time interval ∆t is independent of the number that occurred prior to this time interval.
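Taken together, these assumptions say that over a fine grid of subintervals of length ∆t, the process behaves like independent Bernoulli trials with success probability α · ∆t. A small simulation sketch (the values of α and t are illustrative assumptions, not from the slides):

```python
import random

random.seed(0)
alpha, t = 10.0, 2.0   # event rate per unit time; length of observation window
n = 20_000             # number of short subintervals
dt = t / n             # P(one event in a subinterval) ~ alpha * dt

# Count events in repeated simulated windows
counts = [sum(random.random() < alpha * dt for _ in range(n))
          for _ in range(300)]
mean = sum(counts) / len(counts)
print(mean)  # should be close to alpha * t = 20
```

The average count lands near αt, consistent with the number of events over a window of length t being Poisson with λ = αt.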
Proposition
Let Pk(t) denote the probability that k events will be observed during any particular time interval of length t. Then

Pk(t) = e^(−αt) · (αt)^k / k!
In words, the number of events during a time interval of length t is a
Poisson rv with parameter λ = αt. The expected number of events during
any such time interval is then αt, so the expected number during a unit
interval of time is α.
Example: (Problem 92)
Automobiles arrive at a vehicle equipment inspection station according to a Poisson process with rate α = 10 per hour. Suppose that with probability 0.5 an arriving vehicle will have no equipment violations.
a. What is the probability that exactly ten arrive during the hour and all
ten have no violations?
b. For any fixed y ≥ 10, what is the probability that y arrive during the
hour, of which ten have no violations?
c. What is the probability that ten “no-violation” cars arrive during the
next 45 minutes?
In some sense, the Poisson distribution can be recognized as the limit of a
binomial experiment.
Proposition
Suppose that in the binomial pmf b(x; n, p), we let n → ∞ and p → 0 in
such a way that np approaches a value λ > 0. Then b(x; n, p) → p(x; λ).
This tells us that in any binomial experiment in which n is large and p is small, b(x; n, p) ≈ p(x; λ), where λ = np.
As a rule of thumb, this approximation can safely be applied if n > 50 and
np < 5.
Example 3.40:
If a publisher of nontechnical books takes great pains to ensure that its
books are free of typographical errors, so that the probability of any given
page containing at least one such error is 0.005 and errors are independent
from page to page, what is the probability that one of its 400-page novels
will contain exactly one page with errors?
Let S denote a page containing at least one error, F denote an error-free page, and X denote the number of pages containing at least one error. Then X is a binomial rv, and

P(X = 1) = b(1; 400, 0.005) ≈ p(1; 400 · 0.005) = p(1; 2) = e^(−2) · 2 / 1! = 0.270671
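A quick numerical check of how good this approximation is (helper names are ours):

```python
from math import comb, exp

n, p = 400, 0.005
lam = n * p  # = 2

exact = comb(n, 1) * p**1 * (1 - p)**(n - 1)  # b(1; 400, 0.005)
approx = exp(-lam) * lam**1 / 1               # p(1; 2)
print(exact, approx)  # both are about 0.27067
```

The two values agree to about four decimal places, as the rule of thumb (n > 50, np < 5) suggests they should.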
A proof that b(x; n, p) → p(x; λ) as n → ∞ and p → 0 with np → λ:

b(x; n, p) = (n! / (x!(n − x)!)) · p^x · (1 − p)^(n−x)

For the first factor,

lim_{n→∞} (n! / (x!(n − x)!)) · p^x = lim_{n→∞} (n(n − 1) · · · (n − x + 1) / x!) · p^x
                                    = lim_{n→∞} (np)((n − 1)p) · · · ((n − x + 1)p) / x!
                                    = λ^x / x!

For the second factor,

lim_{n→∞} (1 − p)^(n−x) = lim_{n→∞} (1 − np/n)^(n−x)
                        = lim_{n→∞} (1 − λ/n)^(n−x)
                        = e^(−λ)

Multiplying the two limits gives b(x; n, p) → e^(−λ) · λ^x / x! = p(x; λ).