AP Statistics Chapter 23 Notes

“Inference about Sample Means”
Introduction to Sample Means

We’ve been working with confidence intervals and
hypothesis tests about proportions (% of the population, p).
Now we want to do the same thing for means (average
amount for the population, μ).

The Central Limit Theorem tells us that we can still use a
Normal Model for the sampling distribution of the mean, no matter
what shape the population distribution has (as long as the conditions are met).
The assumptions and conditions




Independence – must have independent events
Random sample – must be a simple random sample
10% condition – population must be at least 10 times the sample size
Nearly Normal condition –
– For very small samples (<40), always check your data in a histogram to make sure the data is unimodal and roughly symmetric.
– For larger samples (>40), proceed with hypothesis testing even if the data is skewed or has outliers. You do not have to check a histogram.
Standard Error

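For proportions we used:
$SE(\hat{p}) = \sqrt{\frac{\hat{p}\,\hat{q}}{n}}$

For means we will use:
$SE(\bar{x}) = \frac{s}{\sqrt{n}}$, where $s$ is the sample standard deviation (std)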
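The two standard error formulas can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers passed in are made up for the example and are not part of the notes.

```python
import math

def se_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    return math.sqrt(p_hat * q_hat / n)

def se_mean(s, n):
    """Standard error of a sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)

# Hypothetical inputs, chosen only for illustration.
prop_se = se_proportion(0.5, 100)   # p-hat = 0.5 from a sample of 100
mean_se = se_mean(10, 25)           # s = 10 from a sample of 25
print("SE for the proportion:", prop_se)
print("SE for the mean:", mean_se)
```

Notice that both formulas divide by the square root of n, so a larger sample gives a smaller standard error.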
The T-Model

With smaller samples, the sample standard deviation is only an estimate of the
population standard deviation, so we need a little extra variation and a little more
margin of error than the Normal Model allows.

The T-Model, created by William Gosset, uses a whole family of
corrected “Normal Models” with fatter tails to correct this problem of a
small sample size.
[Figure: solid curve = t model with 2 degrees of freedom; dotted curve = normal model]
More about the T-Model

Gosset’s T-Model uses degrees of freedom
to determine how fat the tails should be. The
smaller the sample size, the fatter the tails.
As the sample size increases, the tails shrink
closer to the tails of the Normal Model, and as
the sample size approaches infinity, the T-Model
becomes the same as the Normal Model.
Compare the Normal Model to the T-Model
(n – 1 degrees of freedom)

Find normalcdf(1.645, 99)

Find tcdf(1.645, 99, 12)

Find tcdf(1.645, 99, 25)

Find tcdf(1.645, 99, 100)
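If you do not have the calculator handy, the same comparison can be approximated in Python. This is a rough sketch, not the calculator's own method: it assumes the third argument of tcdf is the degrees of freedom, treats the upper bound of 99 as "infinity", and uses a hand-written Simpson's rule integration (the helper names are made up for this example).

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for the standard Normal model."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def t_pdf(x, df):
    """Density of the t-model with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3      # area beyond |t|
    return upper if t >= 0 else 1 - upper

normal_area = normal_upper_tail(1.645)
t_areas = {df: t_upper_tail(1.645, df) for df in (12, 25, 100)}

print("Normal model:", round(normal_area, 4))
for df, area in t_areas.items():
    print("t-model with", df, "df:", round(area, 4))
```

The areas should shrink toward the Normal value as the degrees of freedom grow, which is the "fatter tails for smaller samples" idea in numbers.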
Example – With Data

In 2000 the Bureau of the Census reported that the average life expectancy
for a person in the United States had increased beyond 77 years.
Insurance companies track life expectancy information to assist in
determining the cost of life insurance policies. The insurance
company wants to know if their clients have also started living longer,
so they randomly select a sample of recently paid policies to see if the
mean life expectancy of those policyholders has increased. The
insurance company will change their premium rates if there is
evidence that people who buy their policies are living longer. Does this
sample indicate that the insurance company should increase their
premiums? Test the hypotheses and state your conclusion.
Sample data (20 values):
86  76  75  85  83  70  84  76  81  79
77  81  78  73  79  74  79  72  81  83
The Solution Process
STAT > TESTS > #2 T-Test

State Ho and Ha

Check the conditions
– It is a random sample – given information
– The events are independent – we assume the length of one person’s life is not affected by the length of another person’s life in this situation
– The population of all policy holders with this insurance company must be at least 200 people.
– The sample size is under 40 so we need to check a histogram of the data. The histogram of the data looks unimodal and roughly symmetric. (draw it here)

Use a T-model with how many degrees of freedom?
Calculations:
SE =
p-value =

Conclusion:
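To check your calculator work, here is a minimal Python sketch of the one-sample t-test on the 20 data values. It assumes the hypotheses are Ho: μ = 77 and Ha: μ > 77 (read from the problem statement, since the notes leave them for you to state), and it approximates the p-value with a hand-written Simpson's rule integration rather than the calculator's tcdf.

```python
import math
import statistics

ages = [86, 76, 75, 85, 83, 70, 84, 76, 81, 79,
        77, 81, 78, 73, 79, 74, 79, 72, 81, 83]
mu_0 = 77                      # assumed null value: Ho is mu = 77

def t_pdf(x, df):
    """Density of the t-model with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper

n = len(ages)
df = n - 1                               # degrees of freedom
x_bar = statistics.mean(ages)            # sample mean
s = statistics.stdev(ages)               # sample standard deviation
se = s / math.sqrt(n)                    # standard error of the mean
t_stat = (x_bar - mu_0) / se             # test statistic
p_value = t_upper_tail(t_stat, df)       # one-sided, since Ha is mu > 77

print("df =", df)
print("mean =", x_bar, " s =", round(s, 3))
print("SE =", round(se, 3), " t =", round(t_stat, 3))
print("p-value =", round(p_value, 4))
```

Compare the printed SE and p-value with what the T-Test screen gives you before writing the conclusion.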

Confidence Intervals
STAT > TESTS > #8 T-Interval

Find the 95% confidence interval and explain
what it means in context.

Does the confidence interval support your
conclusion? Why?
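The T-Interval can also be sketched in Python for the life expectancy data. This is an approximation under stated assumptions: the critical value t* is found by bisection on a numerically integrated t-model (helper names are made up here), which stands in for the calculator's built-in routine.

```python
import math
import statistics

ages = [86, 76, 75, 85, 83, 70, 84, 76, 81, 79,
        77, 81, 78, 73, 79, 74, 79, 72, 81, 83]

def t_pdf(x, df):
    """Density of the t-model with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 with Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(confidence, df):
    """Find t* by bisection so the area beyond t* is (1 - confidence) / 2."""
    target = (1 - confidence) / 2
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > target:
            lo = mid          # tail still too big: move right
        else:
            hi = mid
    return (lo + hi) / 2

n = len(ages)
x_bar = statistics.mean(ages)
se = statistics.stdev(ages) / math.sqrt(n)
t_star = t_critical(0.95, n - 1)
margin = t_star * se                      # margin of error
lower, upper = x_bar - margin, x_bar + margin

print("t* =", round(t_star, 3))
print("95% interval: (", round(lower, 2), ",", round(upper, 2), ")")
```

When you interpret the interval, check whether 77 falls inside it and compare that with your hypothesis test conclusion.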
Example – With Stats

According to a newspaper article, the national
average math SAT score in 2010 was 516. A
certain teacher wants to see if the students in her
school are performing higher than the national
average, so she collects a random sample of 50
students’ SAT math scores in her school. She
calculates the average from her sample to be 550
with a standard deviation of 36 points. Is this
sufficient evidence to conclude that students in her
school are performing higher than the national
average?
The Solution Process
STAT > TESTS > #2 T-Test

State Ho and Ha

Check the conditions
– It is a random sample – given information
– The events are independent – we assume one student’s SAT math score is not influenced by another student’s SAT math score
– The population of all students at this high school must be at least 500.
– The sample size is over 40 so we don’t need to check a histogram of the data.

Use a T-model with how many degrees of freedom?
Calculations:
SE =
p-value =

Conclusion:
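When you only have summary statistics, the same test can be sketched from the mean, standard deviation, and sample size. The version below assumes Ho: μ = 516 and Ha: μ > 516 (taken from the problem statement) and approximates the p-value with a hand-written Simpson's rule integration, so treat it as a check on your calculator work rather than the official method.

```python
import math

# Summary statistics from the problem statement.
n = 50          # sample size
x_bar = 550     # sample mean
s = 36          # sample standard deviation
mu_0 = 516      # assumed null value: Ho is mu = 516

def t_pdf(x, df):
    """Density of the t-model with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|]."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3
    return upper if t >= 0 else 1 - upper

df = n - 1                            # degrees of freedom
se = s / math.sqrt(n)                 # standard error of the mean
t_stat = (x_bar - mu_0) / se          # test statistic
p_value = t_upper_tail(t_stat, df)    # one-sided, since Ha is mu > 516

print("df =", df)
print("SE =", round(se, 3), " t =", round(t_stat, 3))
print("p-value =", p_value)
```

A t statistic this many standard errors above 516 gives a very small tail area, so expect the p-value to print in scientific notation.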
