Applied Statistics I
Liang Zhang
Department of Mathematics, University of Utah
June 23, 2008
Hypergeometric Distribution
Assume we are drawing cards from a well-shuffled deck with replacement, one card per draw. We do this 5 times and record whether the outcome is ♠ or not. Then this is a binomial experiment.
If we do the same thing without replacement, then it is NO LONGER a
binomial experiment.
However, if we are drawing from 100 decks of cards without replacement
and record only the first 5 outcomes, then it is approximately a binomial
experiment.
What is the exact model for drawing cards without replacement?
1. The population or set to be sampled consists of N individuals, objects,
or elements (a finite population).
2. Each individual can be characterized as a success (S) or a failure (F),
and there are M successes in the population.
3. A sample of n individuals is selected without replacement in such a way
that each subset of size n is equally likely to be chosen.
Definition
For any experiment which satisfies the above 3 conditions, let X = the
number of S’s in the sample. Then X is a hypergeometric random
variable and we use h(x; n, M, N) to denote the pmf p(x) = P(X = x).
Example:
In the second card-drawing example (without replacement, 52 cards in total), if we let X = the number of ♠'s in the first 5 draws, then X is a hypergeometric random variable with n = 5, M = 13, and N = 52.
For the pmf, the probability of getting exactly x ♠'s (x = 0, 1, 2, 3, 4, or 5) is calculated as follows:

p(x) = P(X = x) = C(13, x) · C(39, 5 − x) / C(52, 5)

where C(n, k) denotes the binomial coefficient "n choose k": C(13, x) is the number of choices for getting x ♠'s, C(39, 5 − x) is the number of choices for getting the remaining 5 − x non-♠ cards, and C(52, 5) is the total number of choices for selecting 5 cards from 52 cards.
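This pmf is straightforward to compute; a minimal Python sketch (the function name hypergeom_pmf is our own, not from the slides):

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x): x successes in a sample of size n drawn without
    replacement from N items, M of which are successes."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Probability of exactly x spades among the first 5 cards drawn from a 52-card deck
for x in range(6):
    print(x, hypergeom_pmf(x, n=5, M=13, N=52))
```

Summing the six probabilities gives 1, as a pmf must.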
For the same experiment (without replacement, 52 cards in total), if we let X = the number of ♠'s in the first 20 draws, then X is still a hypergeometric random variable, but with n = 20, M = 13, and N = 52.
However, in this case the possible values for X are 0, 1, 2, . . . , 13, and the pmf is

p(x) = P(X = x) = C(13, x) · C(39, 20 − x) / C(52, 20)

where 0 ≤ x ≤ 13.
Proposition
If X is the number of S's in a completely random sample of size n drawn from a population consisting of M S's and (N − M) F's, then the probability distribution of X, called the hypergeometric distribution, is given by

P(X = x) = h(x; n, M, N) = C(M, x) · C(N − M, n − x) / C(N, n)

for x an integer satisfying max(0, n − N + M) ≤ x ≤ min(n, M).
Remark:
If n ≤ M, then the largest possible x is n; if n > M, it is M. Therefore we require x ≤ min(n, M).
Similarly, if n ≤ N − M, then the smallest possible x is 0; if n > N − M, it is n − (N − M). Thus x ≥ max(0, n − N + M).
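The support bounds can be checked numerically; a sketch (the helper name h is ours) for a case where the lower bound max(0, n − N + M) is strictly positive:

```python
from math import comb

def h(x, n, M, N):
    """Hypergeometric pmf h(x; n, M, N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Drawing n = 45 cards from N = 52 forces at least 45 - 39 = 6 spades,
# so the support runs from max(0, n - N + M) = 6 up to min(n, M) = 13.
n, M, N = 45, 13, 52
lo, hi = max(0, n - N + M), min(n, M)
total = sum(h(x, n, M, N) for x in range(lo, hi + 1))
print(lo, hi, total)  # probabilities over the support sum to 1
```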
Example: (Problem 70)
An instructor who taught two sections of engineering statistics last term,
the first with 20 students and the second with 30, decided to assign a term
project. After all projects had been turned in, the instructor randomly
ordered them before grading. Consider the first 15 graded projects.
a. What is the probability that exactly 10 of these are from the second
section?
b. What is the probability that at least 10 of these are from the second
section?
c. What is the probability that at least 10 of these are from the same
section?
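A sketch of one way to set these up (not the textbook's worked solution), using the hypergeometric model with N = 50 projects, M = 30 from the second section, and n = 15 graded first:

```python
from math import comb

def h(x, n, M, N):
    """Hypergeometric pmf h(x; n, M, N)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

N, M, n = 50, 30, 15  # 50 projects total, 30 from section 2, first 15 graded

a = h(10, n, M, N)                             # exactly 10 from section 2
b = sum(h(x, n, M, N) for x in range(10, 16))  # at least 10 from section 2
# "at least 10 from the same section": at least 10 from section 2, OR at
# least 10 from section 1 (i.e. at most 5 from section 2) -- disjoint events
c = b + sum(h(x, n, M, N) for x in range(0, 6))
print(a, b, c)
```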
Proposition
The mean and variance of the hypergeometric rv X having pmf h(x; n, M, N) are

E(X) = n · (M/N)    and    V(X) = ((N − n)/(N − 1)) · n · (M/N) · (1 − M/N).

Remark:
The ratio M/N is the proportion of S's in the population. If we replace M/N by p, then we get

E(X) = np    and    V(X) = ((N − n)/(N − 1)) · np(1 − p).

Recall that the mean and variance for a binomial rv are np and np(1 − p). We see that the means for the binomial and hypergeometric rv's are equal, while the variances differ by the factor (N − n)/(N − 1).
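A sketch of the comparison, using the term-project numbers (N = 50, M = 30, n = 15) purely as an illustration:

```python
N, M, n = 50, 30, 15
p = M / N  # proportion of successes in the population

mean_hyper = n * p
var_hyper = (N - n) / (N - 1) * n * p * (1 - p)  # finite-population correction

mean_binom = n * p            # same mean as the hypergeometric rv
var_binom = n * p * (1 - p)   # larger, since (N - n)/(N - 1) < 1 when n > 1

print(mean_hyper, var_hyper, var_binom)
```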
Example (Problem 70) continued:
d. What are the mean value and standard deviation of the number of
projects among these 15 that are from the second section?
e. What are the mean value and standard deviation of the number of second-section projects that are not among these first 15?
Negative Binomial Distribution
Consider the card-drawing example again. This time, we still draw cards from a well-shuffled deck with replacement, one card per draw. However, we keep drawing until we get 5 ♠'s. If we let X = the number of draws that do not give us a ♠, then X is NO LONGER a binomial random variable, but a negative binomial random variable.
1. The experiment consists of a sequence of independent trials.
2. Each trial can result in either a success (S) or a failure (F).
3. The probability of success is constant from trial to trial, so
P(S on trial i) = p for i = 1, 2, 3, . . . .
4. The experiment continues (trials are performed) until a total of r
successes have been observed, where r is a specified positive integer.
Definition
For any experiment which satisfies the above 4 conditions, let X = the
number of failures that precede the rth success. Then X is a negative
binomial random variable and we use nb(x; r , p) to denote the pmf
p(x) = P(X = x).
Remark:
1. In some sources, the negative binomial rv is taken to be the number of
trials X + r rather than the number of failures.
2. If r = 1, we call X a geometric random variable. The pmf for X is then the familiar one

nb(x; 1, p) = (1 − p)^x · p,    x = 0, 1, 2, . . .
Proposition
The pmf of the negative binomial rv X with parameters r = number of S's and p = P(S) is

nb(x; r, p) = C(x + r − 1, r − 1) · p^r · (1 − p)^x,    x = 0, 1, 2, . . .

The mean and variance for X are

E(X) = r(1 − p)/p    and    V(X) = r(1 − p)/p²,

respectively.
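A sketch of this pmf, with two checks: the r = 1 case reduces to the geometric pmf above, and a (truncated) pmf-weighted sum recovers the mean formula. The spade example's p = 1/4 is used for illustration:

```python
from math import comb

def nb(x, r, p):
    """P(X = x): x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

r, p = 5, 0.25  # e.g. drawing until five spades, P(spade) = 1/4

# r = 1 reduces to the geometric pmf (1 - p)^x * p
assert all(abs(nb(x, 1, p) - (1 - p)**x * p) < 1e-15 for x in range(20))

# mean from a truncated pmf-weighted sum vs. the formula r(1 - p)/p
mean_approx = sum(x * nb(x, r, p) for x in range(500))
print(mean_approx, r * (1 - p) / p)  # both close to 15
```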
Example: (Problem 78)
Individual A has a red die and B has a green die (both fair). If they each
roll until they obtain five “doubles” (1 − 1, 2 − 2, . . . , 6 − 6), what is the
pmf of X = the total number of times a die is rolled? What are E (X ) and
V (X )?
Poisson Distribution
Consider the following random variables:
1. The number of people arriving for treatment at an emergency room in
each hour.
2. The number of drivers who travel between Salt Lake City and Sandy
during each day.
3. The number of trees in each square mile in a forest.
None of them are binomial, hypergeometric or negative binomial random
variables.
In fact, the experiments associated with the above random variables DO NOT involve trials. We use the Poisson distribution to model the occurrence of events of some type over time or area.
Definition
A random variable X is said to have a Poisson distribution with parameter λ (λ > 0) if the pmf of X is

p(x; λ) = e^(−λ) · λ^x / x!,    x = 0, 1, 2, . . .

1. The value λ is frequently a rate per unit time or per unit area.
2. e is the base of the natural logarithm system.
3. It is guaranteed that Σ_{x=0}^{∞} p(x; λ) = 1, since

e^λ = 1 + λ + λ²/2! + λ³/3! + · · · = Σ_{x=0}^{∞} λ^x / x!
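Fact 3 can be checked numerically with a truncated sum of the pmf; a sketch:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson pmf p(x; lambda)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 1.5
partial = sum(poisson_pmf(x, lam) for x in range(100))
print(partial)  # the truncated series already equals 1 to machine precision
```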
Example:
The red blood cell (RBC) density in blood is estimated by means of a hematometer. A blood sample is thoroughly mixed with a saline solution and then pipetted onto a slide. The RBC's are counted under a microscope through a square grid. Because the solution is thoroughly mixed, the RBC's have an equal chance of being in any particular square in the grid. It is known that the number of cells counted in a given square follows a Poisson distribution, and the parameter λ for a certain blood sample is believed to be 1.5.
Then what is the probability that there is no RBC in a given square?
What is the probability of a square containing exactly 2 RBC's?
What is the probability of a square containing at most 2 RBC's?
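With λ = 1.5, all three questions can be answered directly from the pmf; a sketch:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson pmf p(x; lambda)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 1.5
p0 = poisson_pmf(0, lam)                                 # no RBC: e^(-1.5) ~ 0.223
p2 = poisson_pmf(2, lam)                                 # exactly 2 RBC's
at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))   # P(X <= 2)
print(p0, p2, at_most_2)
```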
Proposition
If X has a Poisson distribution with parameter λ, then E (X ) = V (X ) = λ.
We see that the parameter λ equals both the mean and the variance of the Poisson random variable X.
e.g. for the previous example, the expected number of RBC’s per square is
thus 1.5 and the variance is also 1.5.
In practice, the parameter is usually unknown to us. However, we can use the sample mean to estimate it. For example, if we observed 15 RBC's over 10 squares, then we can use x̄ = 15/10 = 1.5 to estimate λ.
Poisson Process: the occurrence of events over time.
1. There exists a parameter α > 0 such that for any short time interval of length ∆t, the probability that exactly one event occurs is α · ∆t + o(∆t).
2. The probability of more than one event occurring during ∆t is o(∆t) [which, along with Assumption 1, implies that the probability of no events during ∆t is 1 − α · ∆t − o(∆t)].
3. The number of events occurring during the time interval ∆t is independent of the number that occurred prior to this time interval.
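Taken together, these assumptions say that over a fine grid of subintervals of length ∆t, the process behaves like independent Bernoulli trials with success probability α · ∆t. A small simulation sketch (the values of α and t are illustrative assumptions, not from the slides):

```python
import random

random.seed(0)
alpha, t = 10.0, 2.0   # event rate per unit time; length of observation window
n = 20_000             # number of short subintervals
dt = t / n             # P(one event in a subinterval) ~ alpha * dt

# Count events in repeated simulated windows
counts = [sum(random.random() < alpha * dt for _ in range(n))
          for _ in range(300)]
mean = sum(counts) / len(counts)
print(mean)  # should be close to alpha * t = 20
```

The average count lands near αt, consistent with the number of events over a window of length t being Poisson with λ = αt.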
Proposition
Let Pk(t) denote the probability that k events will be observed during any particular time interval of length t. Then

Pk(t) = e^(−αt) · (αt)^k / k!
In words, the number of events during a time interval of length t is a
Poisson rv with parameter λ = αt. The expected number of events during
any such time interval is then αt, so the expected number during a unit
interval of time is α.
Example: (Problem 92)
Automobiles arrive at a vehicle equipment inspection station according to a Poisson process with rate α = 10 per hour. Suppose that with probability 0.5 an arriving vehicle will have no equipment violations.
a. What is the probability that exactly ten arrive during the hour and all
ten have no violations?
b. For any fixed y ≥ 10, what is the probability that y arrive during the
hour, of which ten have no violations?
c. What is the probability that ten “no-violation” cars arrive during the
next 45 minutes?
In some sense, the Poisson distribution can be recognized as the limit of a
binomial experiment.
Proposition
Suppose that in the binomial pmf b(x; n, p), we let n → ∞ and p → 0 in
such a way that np approaches a value λ > 0. Then b(x; n, p) → p(x; λ).
This tells us that in any binomial experiment in which n is large and p is small, b(x; n, p) ≈ p(x; λ), where λ = np.
As a rule of thumb, this approximation can safely be applied if n > 50 and
np < 5.
Example 3.40:
If a publisher of nontechnical books takes great pains to ensure that its
books are free of typographical errors, so that the probability of any given
page containing at least one such error is 0.005 and errors are independent
from page to page, what is the probability that one of its 400-page novels
will contain exactly one page with errors?
Let S denote a page containing at least one error, F denote an error-free page, and X denote the number of pages containing at least one error. Then X is a binomial rv, and

P(X = 1) = b(1; 400, 0.005) ≈ p(1; 400 · 0.005) = p(1; 2) = e^(−2) · 2 / 1! = 0.270671
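A quick numerical check of how good this approximation is (helper names are ours):

```python
from math import comb, exp

n, p = 400, 0.005
lam = n * p  # = 2

exact = comb(n, 1) * p**1 * (1 - p)**(n - 1)  # b(1; 400, 0.005)
approx = exp(-lam) * lam**1 / 1               # p(1; 2)
print(exact, approx)  # both are about 0.27067
```

The two values agree to about four decimal places, as the rule of thumb (n > 50, np < 5) suggests they should.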
A proof that b(x; n, p) → p(x; λ) as n → ∞ and p → 0 with np → λ:

b(x; n, p) = (n! / (x!(n − x)!)) · p^x · (1 − p)^(n−x)

For the first factor,

lim_{n→∞} (n! / (x!(n − x)!)) · p^x = lim_{n→∞} (n(n − 1) · · · (n − x + 1) / x!) · p^x
                                    = lim_{n→∞} (np)((n − 1)p) · · · ((n − x + 1)p) / x!
                                    = λ^x / x!

For the second factor,

lim_{n→∞} (1 − p)^(n−x) = lim_{n→∞} (1 − np/n)^(n−x)
                        = lim_{n→∞} (1 − λ/n)^(n−x)
                        = e^(−λ)

Multiplying the two limits gives b(x; n, p) → e^(−λ) · λ^x / x! = p(x; λ).