Remark: (a) Probability is based on data from
many repetitions of the same random experiment. Do not
confuse it with personal probability, which is just a
personal judgment of how likely an outcome is.
(b) Does the order matter?
This depends on the nature of the question. For example, in the three-coin selection example, we are only
concerned with the total monetary value selected. Then
we have
NQD = DQN = $0.40,
so the order doesn’t matter.
However, in the example of a couple having three
children, BGB and GBB should be counted as different outcomes. Otherwise, the possible outcomes
are not equally likely.
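The three-children point can be checked by listing the outcomes. The sketch below is illustrative only: it assumes each child is independently a boy or a girl with probability 1/2, and the names `ordered_outcomes` and `girls_count` are made up for this example. It shows that the 8 ordered outcomes are equally likely, while grouping them by the number of girls gives classes of different sizes.

```python
from collections import Counter
from itertools import product

# All ordered birth sequences for three children, e.g. ('B', 'G', 'B').
# Assumption: each child is independently B or G with probability 1/2,
# so every ordered sequence has probability 1/8.
ordered_outcomes = list(product("BG", repeat=3))

# Ignore the order: count how many ordered sequences share the same number of girls.
girls_count = Counter(outcome.count("G") for outcome in ordered_outcomes)

print(len(ordered_outcomes), "ordered outcomes")
for girls, ways in sorted(girls_count.items()):
    print(f"{girls} girl(s): {ways} of {len(ordered_outcomes)} ordered outcomes")
```

If order were ignored, "one girl" would have to be treated as a single outcome even though it covers three of the eight equally likely sequences.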
Example: Three students selected from a total of 10 stand
in a line to take a picture. Two pictures are counted as
different if the positions of at least two students are changed
or at least one student is replaced. How many pictures
can we take?
Example: Three students are selected from a total of 10
to form a travel group. How many different
groups can we get?
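One way to read the two examples is as an ordered count versus an unordered count. The sketch below uses Python's `math.perm` and `math.comb`; the variable names are chosen here for illustration and are not part of the notes.

```python
import math

students = 10
chosen = 3

# Pictures: the position in the line matters, so count ordered selections
# (permutations): 10 * 9 * 8.
pictures = math.perm(students, chosen)

# Travel groups: only membership matters, so count unordered selections
# (combinations): 10 * 9 * 8 / (3 * 2 * 1).
groups = math.comb(students, chosen)

print("pictures:", pictures)
print("groups:", groups)
```

Each group of three students can be lined up in 3! = 6 ways, which is why the picture count is six times the group count.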
Chapter Eleven: Sampling Distributions
A parameter is a number that describes the population. A statistic is a number that can be computed
from the sample data without making use of
any unknown parameters. So a statistic is a function
of the random sample (SRS) observations that does not
involve any unknown parameter.
Remark: In statistical practice, the value of a parameter is not known because we cannot examine the
entire population. So we often use a statistic, which
is called an estimator, to estimate the parameter.
Example: The average height of Americans is a parameter of the population, which is called the population mean. Let X1, X2, · · · , Xn be the height observations of the SRS of Americans. The mean of the
sample,
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, \]
is the average of the observations in the sample, which
is a statistic, where n is the sample size, which is a given
integer.
In an advanced probability and statistics course, we
can prove the following theorem.
Law of Large Numbers:
Let X1, X2, · · · , Xn be the observations of a simple random sample from a population with finite mean µ.
As the sample size n increases, the sample mean X̄ gets
closer and closer to µ.
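A small simulation can illustrate this behavior. The sketch below assumes a hypothetical normal population of heights with mean 170 and standard deviation 10 (values chosen only for illustration) and records how far the running sample mean is from µ at a few sample sizes. The error need not shrink at every single step, but it tends to become small as n grows.

```python
import random

rng = random.Random(0)       # fixed seed so the run is repeatable
mu, sigma = 170.0, 10.0      # hypothetical population mean and sd (heights in cm)

running_sum = 0.0
errors = {}                  # sample size -> |sample mean - mu|
for n in range(1, 100_001):
    running_sum += rng.gauss(mu, sigma)   # draw one more observation
    if n in (10, 1_000, 100_000):
        errors[n] = abs(running_sum / n - mu)

for n, err in errors.items():
    print(f"n = {n:>6}: |sample mean - mu| = {err:.4f}")
```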
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible
samples of the same size from the same population.
An important question, then, is how to find the sampling distribution of a given statistic. The following
theorem gives a partial answer.
Central Limit Theorem:
(1) Suppose that X̄ is the mean of an SRS of size n
drawn from a large population with mean µ and standard deviation σ. Then, the sampling distribution of
X̄ has mean µ and standard deviation σ/√n.
(2) If individual observations have the N(µ, σ) distribution, then the sample mean X̄ of an SRS of size n
has the N(µ, σ/√n) distribution.
(3) If the population with mean µ and finite standard
deviation σ is not normally distributed, when n is large
(n ≥ 25), X̄ has approximately the N(µ, σ/√n) distribution.
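Parts (1) and (3) can be illustrated by simulation. The sketch below assumes a hypothetical skewed population, an exponential distribution with rate 1 (so µ = 1 and σ = 1), and compares the mean and standard deviation of many simulated sample means with µ and σ/√n. The population, seed, and number of repetitions are illustrative choices, not part of the notes.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 30                   # sample size
reps = 20_000            # number of simulated samples

# Hypothetical skewed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

# One sample mean per simulated SRS of size n.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

print("mean of sample means:", statistics.fmean(sample_means), "theory:", mu)
print("sd of sample means:  ", statistics.stdev(sample_means),
      "theory:", sigma / math.sqrt(n))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly right-skewed.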
Example:
Suppose that {X1, · · · , X6} are the observations of an SRS
of size n = 6 selected from a population that is normally distributed, with mean equal to 1 and standard
deviation equal to 0.3.
(1) Find the mean and the standard deviation of
\[ \bar{X} = \frac{1}{6} \sum_{i=1}^{6} X_i. \]
(2) Find the probability P(X̄ ≥ 1.03).
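Solution:
(1) According to the central limit theorem, the mean of X̄ is
µ = 1
and the standard deviation of X̄ is
σ/√n = 0.3/√6 ≈ 0.1225.
(2) Standardizing X̄ and rounding z to two decimals for the normal table,
\[
\begin{aligned}
P(\bar{X} \ge 1.03) &= P\left( \frac{\bar{X} - 1}{0.1225} \ge \frac{1.03 - 1}{0.1225} \right) \\
&= P\left( \frac{\bar{X} - 1}{0.1225} \ge 0.245 \right) \\
&\approx 1 - P\left( \frac{\bar{X} - 1}{0.1225} \le 0.24 \right) \\
&= 1 - P(Z \le 0.24) = 1 - 0.5948 \\
&= 0.4052.
\end{aligned}
\]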
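The same calculation can be reproduced without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative. It reports the probability both with z rounded to two decimals, as in a table lookup, and with z left unrounded, so the small difference caused by rounding is visible.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1.0, 0.3, 6

# Sampling distribution of the sample mean: N(mu, sigma / sqrt(n)).
se = sigma / math.sqrt(n)
xbar_dist = NormalDist(mu, se)

z = (1.03 - mu) / se
p_exact_z = 1 - xbar_dist.cdf(1.03)              # z left unrounded
p_table_z = 1 - NormalDist().cdf(round(z, 2))    # z rounded to two decimals, as with a table

print(f"se = {se:.4f}, z = {z:.4f}")
print(f"P(Xbar >= 1.03): unrounded z {p_exact_z:.4f}, rounded z {p_table_z:.4f}")
```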
Example:
Suppose that {X1, · · · , X30} are the observations of an SRS
of size n = 30 selected from a population that is not
normally distributed, with mean equal to 1 and standard deviation equal to 0.3.
(1) Find the mean and the standard deviation of
\[ \bar{X} = \frac{1}{30} \sum_{i=1}^{30} X_i. \]
(2) Find the probability P(X̄ ≥ 1.01).
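Solution:
(1) According to the central limit theorem, the mean of X̄ is
µ = 1
and the standard deviation of X̄ is
σ/√n = 0.3/√30 = 0.05477.
(2) Since n = 30 is large, X̄ is approximately normal, so
\[
\begin{aligned}
P(\bar{X} \ge 1.01) &= P\left( \frac{\bar{X} - 1}{0.05477} \ge \frac{1.01 - 1}{0.05477} \right) \\
&= P\left( \frac{\bar{X} - 1}{0.05477} \ge 0.1826 \right) \\
&\approx 1 - P(Z \le 0.18) = 1 - 0.5714 = 0.4286.
\end{aligned}
\]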
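A quick simulation can sanity-check the normal approximation for P(X̄ ≥ 1.01) with µ = 1, σ = 0.3 and n = 30. The example does not say what the non-normal population looks like, so the sketch below assumes a hypothetical gamma population with mean 1 and standard deviation 0.3. The seed and number of repetitions are also illustrative choices.

```python
import math
import random
from statistics import NormalDist, fmean

rng = random.Random(2)   # fixed seed so the run is repeatable
mu, sigma, n = 1.0, 0.3, 30
reps = 20_000            # number of simulated samples

# Hypothetical non-normal population: a gamma distribution chosen so that
# shape * scale = 1 (mean) and shape * scale**2 = 0.09 (variance, sd = 0.3).
shape = (mu / sigma) ** 2
scale = sigma ** 2 / mu

# Count the simulated samples whose mean is at least 1.01.
hits = sum(
    fmean(rng.gammavariate(shape, scale) for _ in range(n)) >= 1.01
    for _ in range(reps)
)
p_simulated = hits / reps

# Normal approximation from the central limit theorem (z left unrounded).
p_normal = 1 - NormalDist(mu, sigma / math.sqrt(n)).cdf(1.01)

print(f"simulated: {p_simulated:.3f}, normal approximation: {p_normal:.3f}")
```

The two numbers should be close but not identical, since the normal distribution is only an approximation to the sampling distribution of X̄ at n = 30.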
Example: The gypsy moth is a serious threat to
oak and aspen trees. A state agriculture department
places traps throughout the state to detect the moths.
When traps are checked periodically, the mean number of moths trapped is only 0.5, but some traps have
several moths. The distribution of moth counts is discrete and strongly skewed, with standard deviation 0.7.
(a) What are the mean and standard deviation of the
average number of moths X̄ in 50 traps?
(b) Find the probability that the average number of
moths in 50 traps is greater than 0.6.
Solution:
(a) According to the CLT, the mean and the standard deviation of X̄ are 0.5 and σ/√n = 0.7/√50 = 0.09899, respectively.
(b)
\[ P(\bar{X} > 0.6) \approx P\left( Z > \frac{0.6 - 0.5}{0.09899} \right) = P(Z > 1.01) = 0.1562. \]
Example: An insurance company knows that in the
entire population of millions of homeowners, the mean
annual loss from fire is µ = $250 and the standard deviation of the loss is σ = $1000. The distribution of
losses is strongly right-skewed: most policies have $0
loss, but a few have large losses. If the company sells
10,000 policies, can it safely base its rates on the assumption that its average loss will be no greater than
$275?
Solution: We use the central limit theorem to approximate the probability that the average loss will be no
greater than $275. The central limit theorem says that,
in spite of the skewness of the population distribution,
the average loss X̄ among 10,000 policies will be approximately
N($250, σ/√10,000) = N($250, $10). Therefore,
\[ P(\bar{X} \le 275) = P\left( \frac{\bar{X} - 250}{10} \le \frac{275 - 250}{10} \right) \approx P(Z \le 2.5) = 0.9938. \]
So we can be about 99.38% certain that the average loss
will not exceed $275 per policy.
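To see why the number of policies matters, the same normal approximation can be evaluated for several portfolio sizes. The sketch below is illustrative: the helper `prob_average_loss_at_most` and the smaller portfolio sizes are assumptions made here, not part of the example. For small portfolios the normal approximation is also rougher, because the loss distribution is strongly skewed.

```python
import math
from statistics import NormalDist

mu, sigma = 250.0, 1000.0    # mean and sd of the annual loss per policy, in dollars
threshold = 275.0            # average loss the rates are based on

def prob_average_loss_at_most(n_policies: int) -> float:
    """Normal approximation to P(average loss <= threshold) for n_policies policies."""
    se = sigma / math.sqrt(n_policies)   # sd of the average loss
    return NormalDist(mu, se).cdf(threshold)

for n_policies in (100, 1_000, 10_000):
    print(n_policies, round(prob_average_loss_at_most(n_policies), 4))
```

With 10,000 policies the standard deviation of the average loss is $10, so $275 sits 2.5 standard deviations above the mean. With only 100 policies it sits just 0.25 standard deviations above.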