AP Stats Chapter 23 Part 1 PDF

AP Stats Chapter 23
Inferences About Means
In this chapter we can do with means what we did with proportions over the last few chapters.
From a population whose mean and standard deviation are unknown, we can use a sample to make estimates and decisions about the mean.
Remember...the CLT says that while we might not know the true shape of a population distribution, the sampling distribution of the sample mean will be approximately normal (if we take a large enough sample).
The CLT also says the mean of the sampling distribution will be the same as the mean of the population distribution, and that the standard deviation of the sampling distribution will be the standard deviation of the population divided by the square root of n.
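Here is a quick simulation sketch of those two CLT claims. The population (a right-skewed exponential with mean 10), the sample size, and the number of repetitions are all made up for illustration; they are not from the textbook.

```python
import random
import statistics

random.seed(23)  # fixed seed so the run is repeatable

# Hypothetical right-skewed population: exponential with mean 10.
# (For an exponential distribution the standard deviation equals the mean.)
POP_MEAN = 10.0
POP_SD = 10.0
n = 40               # size of each sample
num_samples = 5000   # how many samples we draw

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean(random.expovariate(1 / POP_MEAN) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = POP_SD / n ** 0.5   # sigma divided by the square root of n

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POP_MEAN})")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma / sqrt(n) = {predicted_sd:.3f})")
```

The two printed pairs should land close to each other even though the population itself is far from normal.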
But since we don't know the standard deviation of the population, we have a problem. With proportions we could use the sample proportion as our estimate of the mean, and the mean gave us the standard deviation (or in that case the standard error).
But if we aren't dealing with proportions, there is no easy way to get the standard deviation from the mean.
What can we do? Well, we can estimate the standard deviation of the population using the standard deviation of the sample (s):
SE(x̄) = s / √n
Now this works pretty well (especially with large samples), but not as well as we would like. Remember, s will vary from sample to sample, and this messes up P-values and margins of error.
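To see how much the sample-to-sample wobble in s can hurt, here is a simulation sketch. It builds "95%" intervals from the normal model with s plugged in for the population standard deviation, using tiny samples, and counts how often they catch the true mean. The population values (mean 50, standard deviation 8) and the sample size of 4 are assumptions chosen only for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 8.0   # hypothetical normal population
n = 4                   # a very small sample, like the ones Gosset used
trials = 20000

# Normal-model critical value for 95% confidence (about 1.96).
z_star = statistics.NormalDist().inv_cdf(0.975)

hits = 0
for _ in range(trials):
    sample = [random.gauss(MU, SIGMA) for _ in range(n)]
    xbar = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5   # s / sqrt(n)
    if xbar - z_star * se <= MU <= xbar + z_star * se:
        hits += 1

coverage = hits / trials
print(f"'95%' intervals actually captured the mean {coverage:.1%} of the time")
```

The capture rate comes out well short of 95%, which is the problem Gosset ran into.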
William S. Gosset to the rescue!!!!
Gosset was the quality control engineer at the Guinness Brewery in Dublin. He was responsible for making sure the stout (dark ale) that the brewery produced was of a high enough quality. Large samples could be a problem in this case...so he used small samples (3 or 4). But often batches that he rejected went back to the lab and it was found that they should not have been rejected.
[Image caption: This is not an actual photo of William S. Gosset]
Gosset figured out that when the standard error (s / √n) is used, the sampling distribution actually changes shape (it is not consistently normal), and he came up with a model that fit this changing distribution.
This new model is the "t-distribution". The Guinness company had strict policies against its employees publishing materials (scared to have trade secrets given away), so Gosset had to publish his work under a pseudonym, "Student". Thus the distribution is called Student's t.
Since the t distribution changes shape depending on the size of the sample, we must always know the degrees of freedom. This simple parameter determines the shape of the t distribution.
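A small sketch of how the shape changes with the degrees of freedom. It evaluates the standard density formula for Student's t at the center (0) and out in the tail (3) for a few df values, next to the normal curve. The helper names `t_pdf` and `normal_pdf` are just illustrative.

```python
import math

def t_pdf(t, df):
    """Height of the Student's t curve with `df` degrees of freedom at t."""
    # lgamma keeps the gamma-function ratio from overflowing for large df.
    log_ratio = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_ratio) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def normal_pdf(z):
    """Height of the standard normal curve at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

for df in (2, 5, 30, 100):
    print(f"df = {df:>3}: height at 0 = {t_pdf(0, df):.4f}, "
          f"height at 3 = {t_pdf(3, df):.4f}")
print(f"normal  : height at 0 = {normal_pdf(0):.4f}, "
      f"height at 3 = {normal_pdf(3):.4f}")
```

With few degrees of freedom the curve is lower in the middle and fatter in the tails; as df grows it closes in on the normal curve.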
So...using Student's t
t = (x̄ − µ) / SE(x̄)
df = n − 1
SE(x̄) = s / √n
This model accounts for the extra uncertainty (small samples) and thus gives wider confidence intervals and larger P-values than the normal model.
One Sample Confidence Interval:
Confidence interval = x̄ ± t*_(n−1) × SE(x̄)
t*_(n−1) depends on the confidence level and on the degrees of freedom (n − 1). We can find this value using a table (A-104) or a calculator (coming soon).
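If neither a table nor a calculator is handy, the critical value can be approximated numerically. The sketch below is one way to do it, not the calculator's actual method: it integrates the t density with Simpson's rule and then uses bisection to find the cutoff with the right area in the middle. The function names `t_cdf` and `t_star` are made up for this example, and the search assumes the critical value is below 100.

```python
import math

def t_cdf(t, df, steps=1000):
    """P(T <= t) for Student's t, via Simpson's rule on [0, |t|] plus symmetry."""
    log_ratio = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_ratio) / math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # area under the curve from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

def t_star(confidence, df):
    """Critical value t* with `confidence` of the area between -t* and +t*."""
    target = 0.5 + confidence / 2      # area to the left of +t*
    lo, hi = 0.0, 100.0                # assumption: t* lies in this bracket
    for _ in range(50):                # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for conf_level, df in [(0.90, 22), (0.95, 10), (0.95, 149)]:
    print(f"{conf_level:.0%} confidence, df = {df}: t* is about {t_star(conf_level, df):.3f}")
```

The printed values can be checked against the rows of the t table.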
Let's look at an example....
A research team took a sample of 150 salmon from a salmon farm and measured the level of mirex (an insecticide expected to be toxic to the kidneys) in each. The results are shown below.
n = 150, x̄ = 0.0913 ppm, s = 0.0495 ppm
What is a 95% confidence interval for the mean amount of mirex in the salmon?
df = 149
SE(x̄) = 0.0495 / √150 ≈ 0.004
t*_149 ≈ 1.977 (approximate, from table)
Confidence interval: 0.0913 ± 1.977(0.004)
0.0913 ± 0.0079
(0.0834, 0.0992)
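The same arithmetic as a short script, using the sample values from the example and the table value 1.977 for t*. The variable names are just illustrative. Because the script carries the standard error without rounding it to 0.004, the endpoints can differ from the hand calculation by one in the fourth decimal place.

```python
import math

n = 150
xbar = 0.0913    # sample mean, ppm
s = 0.0495       # sample standard deviation, ppm
t_star = 1.977   # table value used in the notes for 95% confidence

df = n - 1
se = s / math.sqrt(n)        # standard error of the mean
margin = t_star * se         # margin of error
lower, upper = xbar - margin, xbar + margin

print(f"df = {df}")
print(f"SE = {se:.5f} ppm")
print(f"margin of error = {margin:.5f} ppm")
print(f"95% CI: ({lower:.4f}, {upper:.4f}) ppm")
```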
NOTE!!!!! If you are doing a problem where the actual population standard deviation is known, then use the normal model!!!! We use t when we do not know the population standard deviation.
Calculator Time
Let's review normalcdf
Use normalcdf to find the area of the normal curve greater than a z-score of 1.645 _____________
Use tcdf to find the area of the t-distribution that is greater than a t-score of 1.645 with df = 12 ______
Use tcdf to find the area of the t-distribution that is greater than a t-score of 1.645 with df = 25 ______
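For checking calculator answers away from the calculator, here is a rough Python stand-in for these three tail areas. It is only a sketch: the normal area uses the standard library's `NormalDist`, and the t areas come from a hand-rolled Simpson's rule integration of the t density (the helper `t_upper_tail` is a made-up name, not a calculator or library function).

```python
import math
from statistics import NormalDist

def t_upper_tail(t, df, steps=1000):
    """P(T > t) for Student's t with t >= 0, via Simpson's rule on [0, t]."""
    log_ratio = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_ratio) / math.sqrt(df * math.pi)

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t           # half the curve lies to the right of 0

normal_tail = 1 - NormalDist().cdf(1.645)
t12_tail = t_upper_tail(1.645, 12)
t25_tail = t_upper_tail(1.645, 25)

print(f"normal:      area above 1.645 = {normal_tail:.4f}")
print(f"t, df = 12:  area above 1.645 = {t12_tail:.4f}")
print(f"t, df = 25:  area above 1.645 = {t25_tail:.4f}")
```

The t areas are larger than the normal area, and they shrink toward it as df goes up.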
Assumptions and Conditions
Independence Assumption:
Randomization Condition
10% Condition
*Not as crucial when dealing with inferences about means (as opposed to proportions)
Normal Population Assumption
Nearly Normal Condition: unimodal and roughly symmetric
A random sample of 23 cars on Triphammer Road was found to have a mean speed of 31.0 mph with a sample standard deviation of 4.25 mph. A histogram of the sample shows it to be unimodal and roughly symmetric. Find a 90% confidence interval for the mean speed of all cars driving on Triphammer Road.
df = 22, SE(x̄) = 4.25 / √23 ≈ 0.886, t*_22 ≈ 1.717
31.0 ± 1.717(0.886) = 31.0 ± 1.52
I am 90% confident that the interval from 29.5 to 32.5 mph contains the true mean speed of all vehicles on Triphammer Road.
Assignment: 5, 9, 10, 11, 13, 14, 19