Sampling Distributions
Central Limit Theorem*
Distribution of Sample Means
• Consider the following data as a Population
2, 4, 6, 8
– The population mean is 5
– The population standard deviation is 2.236
• Now we take ALL possible samples of n = 2 from this population,
sampling with replacement (16 ordered samples in total).
• We calculate the mean for each sample.
Sampling Distribution of Means for Samples of n = 2

Pick 1   Pick 2   Mean   Mean²   Variance   Standard Deviation
2        2        2      4       0          0.000
2        4        3      9       2          1.414
2        6        4      16      8          2.828
2        8        5      25      18         4.243
4        2        3      9       2          1.414
4        4        4      16      0          0.000
4        6        5      25      2          1.414
4        8        6      36      8          2.828
6        2        4      16      8          2.828
6        4        5      25      2          1.414
6        6        6      36      0          0.000
6        8        7      49      2          1.414
8        2        5      25      18         4.243
8        4        6      36      8          2.828
8        6        7      49      2          1.414
8        8        8      64      0          0.000
Sum               80     440
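The table can be checked with a short Python sketch. It assumes, as the 16 rows suggest, that samples are ordered and drawn with replacement; the variable names are illustrative only. It computes the mean of the 16 sample means and their standard deviation, and compares the latter with the population standard deviation divided by √n.

```python
from itertools import product
from math import sqrt

population = [2, 4, 6, 8]
n = 2

# All ordered samples of size n drawn with replacement (4**2 = 16 samples).
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# Mean of the sampling distribution (the "Mean" column sums to 80).
mean_of_means = sum(sample_means) / len(sample_means)

# Standard deviation of the sample means, using the sum of the
# "Mean squared" column (440): sqrt(mean of squares - square of mean).
sum_sq = sum(m ** 2 for m in sample_means)
sd_of_means = sqrt(sum_sq / len(sample_means) - mean_of_means ** 2)

# Population standard deviation (divide by N, not N - 1).
pop_mean = sum(population) / len(population)
pop_sd = sqrt(sum((x - pop_mean) ** 2 for x in population) / len(population))

print("mean of sample means:", mean_of_means)
print("sd of sample means:", round(sd_of_means, 3))
print("population sd / sqrt(n):", round(pop_sd / sqrt(n), 3))
```

The mean of the sample means equals the population mean, and the spread of the sample means equals the population standard deviation divided by √n, which is what the Central Limit Theorem slides rely on later.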
Central Limit Theorem Applied
Page 337
Results from a survey of students who were asked how many hours they
spend per week using a search engine on the Internet.
n = 400
μ = 3.88 σ = 2.40
Sample 1
A sample of 32 students selected from the 400 on the previous slide.
1.1   7.0   6.8   7.8   3.8   6.5   6.8   5.7
1.7   4.9   6.5   2.1   3.0   2.7   1.2   6.5
2.6   0.3   5.2   1.4   0.9   2.2   7.1   2.4
5.1   5.5   2.5   3.4   3.1   7.8   4.7   5.0
The mean of this sample is x̄ = 4.17.
Sample 2
A different sample of 32 students selected from the 400.
1.8   3.6   5.0   0.4   5.2   3.1   4.0   5.7
0.5   2.4   6.5   3.9   0.8   1.2   3.1   6.2
5.4   5.8   0.8   5.7   2.9   6.6   7.2   7.2
5.7   5.1   0.9   7.9   3.2   4.0   2.5   3.1
The mean of this sample is x̄ = 3.98.
Now you have two sample means that don’t agree with each other, and neither
one agrees with the true population mean.
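The two sample means can be reproduced with a few lines of Python. This is only a sketch: the lists hold the 32 values of each sample as given on the slides, and the names `sample_1` and `sample_2` are illustrative.

```python
from statistics import mean

# Hours per week, as listed for Sample 1 and Sample 2 on the slides.
sample_1 = [1.1, 7.0, 6.8, 7.8, 3.8, 6.5, 6.8, 5.7,
            1.7, 4.9, 6.5, 2.1, 3.0, 2.7, 1.2, 6.5,
            2.6, 0.3, 5.2, 1.4, 0.9, 2.2, 7.1, 2.4,
            5.1, 5.5, 2.5, 3.4, 3.1, 7.8, 4.7, 5.0]
sample_2 = [1.8, 3.6, 5.0, 0.4, 5.2, 3.1, 4.0, 5.7,
            0.5, 2.4, 6.5, 3.9, 0.8, 1.2, 3.1, 6.2,
            5.4, 5.8, 0.8, 5.7, 2.9, 6.6, 7.2, 7.2,
            5.7, 5.1, 0.9, 7.9, 3.2, 4.0, 2.5, 3.1]

population_mean = 3.88  # mu for all 400 students, from the slides

for name, sample in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    x_bar = mean(sample)
    # Each sample mean misses the population mean by a different amount.
    print(f"{name}: mean = {x_bar:.2f}, "
          f"difference from mu = {x_bar - population_mean:+.2f}")
```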
Figure 8.6 shows a histogram that results from 100 different samples, each with
32 students. Notice that this histogram is very close to a normal distribution and
its mean is very close to the population mean, μ = 3.88.
Figure 8.6 A distribution of 100 sample means, with a sample size of n = 32,
appears close to a normal distribution with a mean of 3.88.
Central Limit Theorem application: If we were to include all possible
samples of size n = 32, this distribution would have these
characteristics:
• The distribution of sample means is approximately a normal
distribution.
• The mean of the distribution of sample means is 3.88 (the
mean of the population).
• The standard deviation of the distribution of sample means
depends on the population standard deviation and the sample
size. The population standard deviation is σ = 2.40 and the
sample size is n = 32, so the standard deviation of sample
means is
$$\frac{\sigma}{\sqrt{n}} = \frac{2.40}{\sqrt{32}} \approx 0.42$$
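As a quick numerical check, the standard deviation of the sample means can be computed in Python. The helper name `standard_error` and the larger sample sizes 128 and 512 are illustrative assumptions, added only to show how the value shrinks as n grows.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the distribution of sample means: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 2.40  # population standard deviation from the survey

# n = 32 is the slide's sample size; 128 and 512 are hypothetical sizes.
for n in (32, 128, 512):
    print(f"n = {n:3d}: sigma / sqrt(n) = {standard_error(sigma, n):.3f}")
```

Each fourfold increase in n cuts the standard deviation of the sample means in half.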
Margin of Error for the Mean
The margin of error for the 95% confidence interval is
$$\text{margin of error} = E \approx \frac{2s}{\sqrt{n}}$$
where s is the standard deviation of the sample.
We find the 95% confidence interval by subtracting the margin of error
from the sample mean and adding it to the sample mean. That is, the 95%
confidence interval ranges
from (x̄ – margin of error)
to (x̄ + margin of error)
We can write this confidence interval more formally as
x̄ – E < μ < x̄ + E
or more briefly as
x̄ ± E
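To make the formula concrete, here is a minimal Python sketch that applies x̄ ± E with E ≈ 2s/√n to the 32 values of Sample 1 from the earlier slide. The function name `confidence_interval_95` is an illustrative choice, and `statistics.stdev` is used for the sample standard deviation s.

```python
from math import sqrt
from statistics import mean, stdev

def confidence_interval_95(sample):
    """Return (x_bar, E, low, high) using the slide's rule E = 2s / sqrt(n)."""
    x_bar = mean(sample)
    s = stdev(sample)                      # sample standard deviation (n - 1)
    margin = 2 * s / sqrt(len(sample))
    return x_bar, margin, x_bar - margin, x_bar + margin

# Sample 1: hours per week for 32 students, as listed on the slides.
sample_1 = [1.1, 7.0, 6.8, 7.8, 3.8, 6.5, 6.8, 5.7,
            1.7, 4.9, 6.5, 2.1, 3.0, 2.7, 1.2, 6.5,
            2.6, 0.3, 5.2, 1.4, 0.9, 2.2, 7.1, 2.4,
            5.1, 5.5, 2.5, 3.4, 3.1, 7.8, 4.7, 5.0]

x_bar, margin, low, high = confidence_interval_95(sample_1)
print(f"x_bar = {x_bar:.2f}, E = {margin:.2f}")
print(f"95% confidence interval: {low:.2f} < mu < {high:.2f}")
```

For this sample the interval contains the population mean μ = 3.88 given on the earlier slide.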
95% Confidence Interval
Constructing a Confidence Interval
Interpreting the Confidence Interval
Figure 8.10 This figure illustrates the idea behind confidence intervals. The
central vertical line represents the true population mean, μ. Each of the 20
horizontal lines represents the 95% confidence interval for a particular
sample, with the sample mean marked by the dot in the center of the
confidence interval. With a 95% confidence interval, we expect that 95% of all
samples will give a confidence interval that contains the population mean.
That is the case in this figure: 19 of the 20 confidence intervals do contain
the population mean. We expect that the population mean will not be within the
confidence interval in 5% of the cases; here, 1 of the 20 confidence intervals
(the sixth from the top) does not contain the population mean.
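The 95% figure can be explored with a small simulation. This is a sketch under a loud assumption: the population is modeled as normal with μ = 3.88 and σ = 2.40, which the slides do not state. The function name, the number of trials and the seed are illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

def coverage_rate(trials, n, mu, sigma, seed=0):
    """Fraction of simulated samples whose interval x_bar +/- 2s/sqrt(n) contains mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # ASSUMPTION: a normal population; real survey data need not be normal.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        margin = 2 * stdev(sample) / sqrt(n)
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(trials=2000, n=32, mu=3.88, sigma=2.40)
print(f"{rate:.1%} of the simulated intervals contain the population mean")
```

With many more than 20 samples, the share of intervals that contain μ settles near 95%, while any single batch of 20 may show 18, 19 or 20 hits.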
Using StatCrunch: Confidence Intervals
• In the data set, select:
– STAT
– Z Statistics
– One-Sample
– With Data
– Select Variable
– Click next
– Select confidence interval and percent
– Calculate
Determine Minimum Sample Size
• Solve the margin of error formula [E = 2s/√n] for n.
$$n = \left(\frac{2s}{E}\right)^2$$
EXAMPLE Constructing a Confidence Interval
You want to study housing costs in the country by sampling recent house
sales in various (representative) regions. Your goal is to provide a 95%
confidence interval estimate of the housing cost. Previous studies suggest that
the population standard deviation is about $7,200. What sample size (at a
minimum) should be used to ensure that the sample mean is within
a. $500 of the true population mean?
b. $100 of the true population mean?
Solution:
a. With E = $500 and σ estimated as $7,200, the minimum sample size that
meets the requirements is
$$n = \left(\frac{2\sigma}{E}\right)^2 = \left(\frac{2 \times 7{,}200}{500}\right)^2 = 28.8^2 = 829.4$$
Because the sample size must be a whole number, we round up and conclude that
the sample should include at least 830 prices.
b. With E = $100 and σ = $7,200, the minimum sample size that meets the
requirements is
$$n = \left(\frac{2\sigma}{E}\right)^2 = \left(\frac{2 \times 7{,}200}{100}\right)^2 = 144^2 = 20{,}736$$
Notice that to decrease the margin of error by a factor of 5 (from $500 to
$100), we must increase the sample size by a factor of 25. That is why
achieving greater accuracy generally comes with a high cost.
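Both parts of the example can be reproduced with a short Python sketch. The function name `min_sample_size_mean` is an illustrative choice; it applies n = (2σ/E)² and rounds up to the next whole number.

```python
from math import ceil

def min_sample_size_mean(sigma, margin):
    """Smallest whole n with 2 * sigma / sqrt(n) <= margin, i.e. n >= (2 * sigma / margin)**2."""
    return ceil((2 * sigma / margin) ** 2)

sigma = 7200  # estimated population standard deviation, in dollars

n_500 = min_sample_size_mean(sigma, 500)
n_100 = min_sample_size_mean(sigma, 100)

print("E = $500:", n_500, "prices")
print("E = $100:", n_100, "prices")
# Ratio of the unrounded sample sizes: (500 / 100) squared.
print("increase factor:", (500 / 100) ** 2)
```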
Distribution of Sample Proportions
Page 340
Sample Proportions
In a survey where 400 students were asked if they own a car, 240
replied that they did.
The exact proportion of car owners is
$$p = \frac{240}{400} = 0.6$$
This population proportion, p = 0.6, is another example of a population
parameter.
95% Confidence Interval for a Population Proportion
For a population proportion, the margin of error for the 95%
confidence interval is
$$E \approx 2\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
where p̂ is the sample proportion.
The 95% confidence interval ranges
from p̂ – margin of error
to p̂ + margin of error
We can write this confidence interval more formally as
p̂ – E < p < p̂ + E
Choosing the Correct Sample Size
In order to estimate a population proportion with a 95%
degree of confidence and a specified margin of error of E,
the size of the sample should be at least
$$n \geq \frac{1}{E^2}$$
EXAMPLE TV Nielsen Ratings
The Nielsen ratings for television use a random sample of households. A
Nielsen survey results in an estimate that a women’s World Cup soccer game
had 72.3% of the entire viewing audience. Assuming that the sample consists
of n = 5,000 randomly selected households, find the margin of error and the
95% confidence interval for this estimate.
Solution: The sample proportion, p̂ = 72.3% = 0.723, is the best estimate of
the population proportion.
The margin of error is
E2
pˆ (1  pˆ )
0.723(1  0.723)
2
 0.013
n
5,000
The 95% confidence interval is
0.723 – 0.013 < p < 0.723 + 0.013,
or
0.710 < p < 0.736
With 95% confidence, we conclude that between 71.0% and 73.6% of the entire
viewing audience watched the women’s World Cup soccer game.
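The Nielsen calculation can be sketched in Python as follows. The function name `proportion_interval` is an illustrative choice; it applies E ≈ 2√(p̂(1 – p̂)/n) and returns the margin of error with both ends of the interval.

```python
from math import sqrt

def proportion_interval(p_hat, n):
    """Return (E, low, high) for the 95% interval p_hat +/- 2 * sqrt(p_hat * (1 - p_hat) / n)."""
    margin = 2 * sqrt(p_hat * (1 - p_hat) / n)
    return margin, p_hat - margin, p_hat + margin

# Values from the Nielsen example: p_hat = 0.723, n = 5,000 households.
margin, low, high = proportion_interval(0.723, 5000)
print(f"E = {margin:.3f}")
print(f"95% confidence interval: {low:.3f} < p < {high:.3f}")
```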
EXAMPLE Minimum Sample Size for Survey
You plan a survey to estimate the proportion of students on your
campus who carry a cell phone regularly. How many students
should be in the sample if you want (with 95% confidence) a
margin of error of no more than 4 percentage points?
Solution: Note that 4 percentage points means a margin of error
of 0.04. From the given formula, the minimum sample size is
$$n = \frac{1}{E^2} = \frac{1}{0.04^2} = 625$$
You should survey at least 625 students.
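The same rule can be sketched in Python. The function name `min_sample_size_proportion` is illustrative, and passing the margin of error as a string to `fractions.Fraction` is a design choice made here so that 1/0.04² evaluates to exactly 625 rather than a floating-point value slightly above or below it.

```python
from fractions import Fraction
from math import ceil

def min_sample_size_proportion(margin):
    """Smallest whole n with n >= 1 / E**2; margin is a decimal string such as "0.04"."""
    e = Fraction(margin)      # exact rational value, e.g. 1/25
    return ceil(1 / e ** 2)   # round up to a whole number of respondents

print(min_sample_size_proportion("0.04"))  # the slide's 4 percentage points
# A hypothetical tighter target, to show how quickly n grows:
print(min_sample_size_proportion("0.02"))
```

Halving the margin of error quadruples the required sample size, the same pattern seen in the housing cost example.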
Core Logic of Hypothesis Testing
• Considers the probability that the result of a study
could have come about if the experimental
procedure had no effect
• If this probability is low, the scenario of no effect is
rejected and the theory behind the experimental
procedure is supported
Hypothesis Testing using Confidence Intervals
• State the claim about the population mean
• Determine desired confidence level
• Select a random sample from the population
• Calculate the confidence interval for the desired level of
confidence.
• If the claim is contained within the interval, the claim is
reasonable; if it is not within the interval, the claim is not
reasonable, at the given level of confidence.
• See Testing a Claim document in Doc Sharing
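The confidence interval approach to testing a claim can be sketched in Python. This is an illustration only, not the procedure from the Testing a Claim document: it reuses Sample 1 from the earlier slide and the 95% rule E ≈ 2s/√n, the function name `claim_is_reasonable` is made up for this sketch, and the claimed mean of 5.5 is a hypothetical value.

```python
from math import sqrt
from statistics import mean, stdev

def claim_is_reasonable(claimed_mean, sample):
    """True if the claimed population mean lies inside x_bar +/- 2s/sqrt(n)."""
    x_bar = mean(sample)
    margin = 2 * stdev(sample) / sqrt(len(sample))
    return x_bar - margin < claimed_mean < x_bar + margin

# Sample 1: hours per week for 32 students, as listed on the slides.
sample_1 = [1.1, 7.0, 6.8, 7.8, 3.8, 6.5, 6.8, 5.7,
            1.7, 4.9, 6.5, 2.1, 3.0, 2.7, 1.2, 6.5,
            2.6, 0.3, 5.2, 1.4, 0.9, 2.2, 7.1, 2.4,
            5.1, 5.5, 2.5, 3.4, 3.1, 7.8, 4.7, 5.0]

# Claim 1: the population mean is 3.88 hours (the value given on the slides).
print("claim mu = 3.88 reasonable?", claim_is_reasonable(3.88, sample_1))
# Claim 2: a hypothetical claim that the population mean is 5.5 hours.
print("claim mu = 5.5 reasonable?", claim_is_reasonable(5.5, sample_1))
```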