Notes

advertisement
Name__________________________________
AP Statistics
Chapter 18: Sampling Distribution Models Notes
Suppose I randomly select 100 seniors in Howard County and record each one’s GPA.
1.95
1.98
1.86
2.04
2.75
2.72
2.06
3.36
2.09
2.06
2.33
2.56
2.17
1.67
2.75
3.95
2.23
4.53
1.31
3.79
1.29
3.00
1.89
2.36
2.76
3.29
1.51
1.09
2.75
2.68
2.28
3.13
2.62
2.85
2.41
3.16
3.39
3.18
4.05
3.26
1.95
3.23
2.53
3.70
2.90
2.79
3.08
2.79
3.26
2.29
2.59
1.36
2.38
2.03
3.31
2.05
1.58
3.12
3.33
2.04
2.81
3.94
0.82
3.14
2.63
1.51
2.24
2.22
1.85
1.96
2.05
2.62
3.27
1.94
2.01
1.68
2.01
3.15
3.44
4.00
2.33
3.01
3.15
2.25
3.34
2.22
3.29
3.90
2.96
2.61
3.01
2.86
1.70
1.55
1.63
2.37
2.84
1.67
2.92
3.29
-These 100 seniors make up one possible
sample. All seniors in Howard County make up
the population.
-The sample mean, ð‘ĨĖ… is 2.5470 and the sample
standard deviation, s, is 0.7150. -The
population mean, µ, and the population
standard deviation, 𝜎, are unknown.
-We can use ð‘ĨĖ… to estimate µ and we can use s
to estimate 𝜎. These estimates may or may
not be reliable.
-A number that describes the population is
called a parameter. Hence,  and ïģ are both
parameters. A parameter is usually
represented by p.
-A number that is computed from a sample is
called a statistic. Therefore, x and s x are
both statistics. A statistic is usually
represented by pĖ‚.
**If I had chosen a different 100 seniors, then I would have a different sample, but it would still
represent the same population. A different sample almost always produces different statistics.**
Example: Let pĖ‚ represent the proportion of seniors in a sample of 100 seniors whose GPA is 2.0 or higher.
pˆ1 ï€― .78
pˆ 2 ï€― .72
p3 ï€― .81
pˆ 4 ï€― .70
pˆ 5 ï€― .68
pˆ 6 ï€― .75
pˆ 7 ï€― .79
pˆ 8 ï€― .72
pˆ 9 ï€― .83
pˆ10 ï€― .76
If I compare many different samples and the statistic is very similar in each one, then the sampling
variability is low. If I compare many different samples and the statistic is very different in each one, then
the sampling variability is high.
Sampling Model
The sampling model of a statistic is a model of the values of the statistic from all possible samples of the
same size from the same population.
Example: Suppose the sampling model consists of the samples pˆ1 , pˆ 2 ,..., pˆ 9 , pˆ10 . (Note: There are
actually many more than ten possible samples.) This sampling model has mean 0.754 and standard
deviation 0.049.
Sampling Distribution Model for a Proportion
Assumptions:
1) Sampled values are independent
Condition: 10% Condition
2) Sample size is large enough
Condition: Success/Failure Condition (i.e. np ï‚ģ 10
and
nq ï‚ģ 10 )
If these assumption/conditions are met, then we can model this with a Normal model were the
mean and standard deviation are as follows:
𝜇pĖ‚ = 𝑝
𝑝𝑞
𝜎pĖ‚ = √ 𝑛 .
Examples:
1. Recent estimates suggest that 16.9% of young people (aged 2–19) are obese. A
researcher into childhood obesity obtains a random sample of 60 young people.
(a) What is the probability that fewer than 10 of these young people are obese?
(b) What is the probability that between 9 and 11 of these young people are obese?
2. A CBS poll asked people if they thought that high school students should be required
to learn a foreign language in order to graduate from high school. 51% of the
respondents answered “yes.” As a follow-up, a research group has randomly selected
25 people and asked the same question.
(a) What is the probability that more than 13 people will answer
yes?
(b) What is the probability that fewer than 10 people will answer
yes?
Sampling Distribution Model for Sample Means
Must check:
1) Random Sampling Condition: values must be randomly sampled
2) Independence Assumption: sampled values must be mutually independent
3) 10% Condition
Central Limit Theorem:
As the sample size, n, increases, the mean of n independent values has a sampling distribution that
tends toward a Normal model with the following mean and standard deviation:
𝜎
𝜇ð‘Ĩ = 𝜇
𝜎ð‘Ĩ = 𝑛
√
Examples:
1.) A new drug designed boost the immune system of HIV infected patients is being
tested. Using the drug for three years results in a mean change in CD4+ cells
(more of these cells indicate a healthier immune system) was 94.9 million cells
per liter of blood, with a standard deviation of 169.0414. A sample of 127 HIV
infected patients is selected to use this drug for a three year period, and their
CD4+ cell count change is measured at the conclusion. What is the probability
that between the patients will have a mean change of 110 million cells per liter of
blood or more?
2.) Studies suggest that adult males are missing 3.49 teeth on average, with a
standard deviation of 12.459. The population distribution of number of missing
teeth amongst males is clearly skew right. A random sample of 100 males is
obtained and the number of missing teeth measured.
(a) What is the probability that group will have a mean of 1 missing tooth or less?
(b) 10% of samples will have a mean of ____ missing teeth or more. What
number fills in the blank?
Download