
Section 11.1 – Inference for the Mean of a Population
Conditions for Inference about a Mean



SRS of size n from the population of interest. Very important!!!
Observations must be independent.
Observations must have a normal distribution with mean µ and standard deviation σ.
In practice, the distribution can be just symmetric and single-peaked unless the sample is very small.
What if σ isn’t given?
Because σ is usually unknown, we estimate it by the sample standard deviation s.
Standard Error
The standard error of the sample mean $\bar{x}$ is $\dfrac{s}{\sqrt{n}}$. ***s is the standard deviation of the sample.
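As a quick check of the formula, here is a minimal Python sketch that computes the standard error from raw data. The function name `standard_error` and the five-value sample are made up for illustration; only the formula s/√n comes from these notes.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)

# Hypothetical data, not from the textbook.
sample = [1, 2, 3, 4, 5]
print(standard_error(sample))
```

Note that `statistics.stdev` is the sample standard deviation, which is the s used here; `statistics.pstdev` would divide by n instead and give a different answer.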
***When we know the value of σ, we base confidence intervals and tests for µ on the z-statistic: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.
The z-statistic has the standard normal distribution.
***When we do not know σ, we substitute the standard error $s / \sqrt{n}$ of $\bar{x}$ for its standard deviation $\sigma / \sqrt{n}$. The resulting statistic does not have a normal distribution.
So what do we use?
The t-distribution → New table :)
Draw an SRS of size n from a population that has the normal distribution with mean µ and standard deviation σ.
The one-sample t statistic $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$ has the t distribution with n – 1 degrees of freedom.
***This statistic has the same interpretation as any standardized statistic: it says how far the sample mean is
from the population mean in standard deviation units.
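The one-sample t statistic can be sketched in a few lines of Python. The function name `one_sample_t`, the sample values, and the value of µ below are all hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return (t, df), where t = (x_bar - mu) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample standard deviation
    t = (x_bar - mu) / (s / math.sqrt(n))  # distance from mu in standard-error units
    return t, n - 1

# Hypothetical data and hypothesized mean, not from the textbook.
t, df = one_sample_t([1, 2, 3, 4, 5], mu=2)
print(t, df)
```

A positive t means the sample mean lies above µ; the size of t says how many standard errors separate them.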
Different t-distributions:
There is a different t-distribution for each sample size n.
So what do we notice? As n increases, the t distribution gets closer to the standard normal distribution.
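One way to see this convergence without the picture is to compare the height of each curve at 0. The sketch below uses the standard formula for the t density; the function name `t_pdf` and the particular degrees of freedom are choices made for this illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

normal_peak = NormalDist().pdf(0)  # height of the standard normal curve at 0

# The t curve is lower at the center (more area in the tails),
# and the gap shrinks as the degrees of freedom grow.
for df in (2, 5, 30, 1000):
    print(df, round(t_pdf(0, df), 4), round(normal_peak, 4))
```

The t curve is shorter at the center because more of its area sits in the tails, which is why t* is larger than the matching z* for small samples.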
Example 11.1, p. 619 – Using the t-table
What critical value t* from Table C (the t table, back cover of the book) would you use for a t distribution with 18
degrees of freedom having probability 0.90 to the left of t*?
In Table C, t* is the critical value with probability p to its right.
We need the tail probability. The question wants 0.90 to the left, but the table picture shows the area to the right.
1 – 0.90 = 0.10 → Tail probability.
So we have df = 18 and 0.10 tail probability:
So the critical value t* = 1.330
Now suppose you want to construct a 95% confidence interval for the mean µ of a population based on an SRS
of size n = 12. What critical value of t* should you use?
First, degrees of freedom: n – 1 = 12 – 1 = 11
What about the tail area? Don’t need it!!! Confidence levels are listed along the bottom of Table C!
So the desired critical value t* = 2.201
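Table C can be cross-checked numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and then bisects for the critical value. The helper names (`t_pdf`, `t_cdf`, `t_critical`), the step count, and the search bound of 50 are choices made for this illustration, not part of the textbook; a statistics package would normally do this for you.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(upper_tail, df):
    """Find t* with the given probability to its right (0 < upper_tail < 0.5)."""
    lo, hi = 0.0, 50.0  # assumed wide enough for the cases in these notes
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid  # too much area to the right, so t* is larger
        else:
            hi = mid
    return (lo + hi) / 2

# First part: 0.90 to the left means 0.10 in the upper tail, df = 18.
print(round(t_critical(0.10, 18), 3))   # compare with the table value 1.330
# Second part: 95% confidence leaves (1 - 0.95) / 2 = 0.025 in each tail, df = 11.
print(round(t_critical(0.025, 11), 3))  # compare with the table value 2.201
```

The second call shows why the confidence-level row works: a level C interval always corresponds to an upper tail area of (1 – C) / 2.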
One-sample t procedures
Draw an SRS of size n from a population having unknown mean µ. A level C confidence interval for µ is
$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$
where t* is the upper (1 – C) / 2 critical value for the t(n – 1) distribution.
**t(n – 1) → This means the t distribution with n – 1 degrees of freedom.
***Interval is exact when the population distribution is normal and is approximately correct for large n in other
cases.
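The interval formula translates directly into code. In this sketch the function name `t_interval` and the summary statistics (mean 10.0, s = 2.0, n = 12) are hypothetical; the critical value 2.201 is the df = 11, 95% value found in Example 11.1.

```python
import math

def t_interval(x_bar, s, n, t_star):
    """Level C interval: x_bar +/- t* * s / sqrt(n). t_star must match C and df = n - 1."""
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical summary statistics for a sample of size 12.
low, high = t_interval(x_bar=10.0, s=2.0, n=12, t_star=2.201)
print(round(low, 3), round(high, 3))
```

The function takes t* as an input on purpose: the caller still has to look it up for the right degrees of freedom and confidence level.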
Example 11.2, p. 622 – 623
The following table gives the NOX levels (in g/mi) for a sample of light-duty engines of the same type.
1.28  1.17  1.16  1.08  0.60  1.32  1.24  0.71
0.49  1.38  1.20  0.78  0.95  2.20  1.78  1.83
1.26  1.73  1.31  1.80  1.15  0.97  1.12  0.72
1.31  1.45  1.22  1.32  1.47  1.44  0.51  1.49
1.33  0.86  0.57  1.79  2.27  1.87  2.94  1.16
1.45  1.51  1.47  1.06  2.01  1.39
Construct a 95% confidence interval for the mean amount of NOX emitted by light-duty engines of this type.
*Remember: State, Plan, Do, Conclude
State: We want to estimate µ, the mean amount of NOX emitted by light-duty engines of this type, at a 95% confidence
level.
Plan: Procedure: one-sample t procedure
→ Use t because we do not know σ
Conditions: 1. SRS? We will assume the sample is random.
2. Independent? 10n = 10 × 46 = 460. We assume the population contains more than 460 engines of this type.
3. Normal? Because n = 46 ≥ 30, the sampling distribution of $\bar{x}$ is approximately normal by the Central Limit
Theorem.
Do: Calculate $\bar{x} = 1.329$
s = 0.484
df = n – 1 = 45, so t* = 2.021
*Note: 45 is not on the df chart. Go to one that is close (rounding down is the conservative choice). In this case, we will use 40. To get the mean and standard
deviation, type the numbers into your lists and calculate one-variable stats. (Stat, Edit, enter the data in L1, 2nd, Quit, Stat, CALC, 1-Var Stats)
$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}} = 1.329 \pm 2.021 \cdot \dfrac{0.484}{\sqrt{46}} = 1.329 \pm 0.144 = (1.185, 1.473)$ → Remember, smallest to largest in the interval
Conclude: We are 95% confident that the true mean level of nitrogen oxides emitted by this type of light-duty engine is
between 1.185 grams/mile and 1.473 grams/mile.
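The "Do" step can be reproduced in Python as a check on the calculator work. This sketch takes the 46 NOX readings from the table and the critical value t* = 2.021 used in these notes; the variable names are illustrative. Because the code carries the unrounded mean and standard deviation through the calculation, the endpoints may differ from the hand computation in the third decimal place.

```python
import math
import statistics

# NOX levels (g/mi) for the 46 engines in Example 11.2.
nox = [
    1.28, 1.17, 1.16, 1.08, 0.60, 1.32, 1.24, 0.71,
    0.49, 1.38, 1.20, 0.78, 0.95, 2.20, 1.78, 1.83,
    1.26, 1.73, 1.31, 1.80, 1.15, 0.97, 1.12, 0.72,
    1.31, 1.45, 1.22, 1.32, 1.47, 1.44, 0.51, 1.49,
    1.33, 0.86, 0.57, 1.79, 2.27, 1.87, 2.94, 1.16,
    1.45, 1.51, 1.47, 1.06, 2.01, 1.39,
]

n = len(nox)
x_bar = statistics.mean(nox)
s = statistics.stdev(nox)  # sample standard deviation
t_star = 2.021             # Table C, df = 40 row (closest to 45), 95% confidence

margin = t_star * s / math.sqrt(n)
low, high = x_bar - margin, x_bar + margin

print(n, round(x_bar, 3), round(s, 3))
print(round(low, 3), round(high, 3))
```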
Robust Procedures:
A confidence interval or significance test is called robust if the confidence level or P-value does not change very much
when the assumptions of the procedure are violated.
- t procedures are relatively robust when the population is non-normal, especially for larger samples.
IMPORTANT: t procedures are strongly influenced by outliers. (example 11.6)
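The outlier warning is easy to see numerically, because both $\bar{x}$ and s feed into the t procedures and neither resists extreme values. The data below are invented for this illustration and are not from Example 11.6.

```python
import statistics

clean = [10, 11, 9, 10, 12, 10, 11]  # hypothetical measurements
with_outlier = clean + [30]          # same data plus one extreme value

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    print(label, round(statistics.mean(data), 2), round(statistics.stdev(data), 2))
```

A single extreme observation shifts the mean and inflates the standard deviation severalfold, so both the center and the width of a t interval change.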
Using the t procedures (p. 636)
- The assumption that the data are an SRS from the population is more important than the assumption that the population
is normal (except with small samples).
- Sample size < 15: Use t procedures if the data are close to normal. If outliers are present, do not!
- Sample size ≥ 15: t procedures can be used except when there are outliers or strong skewness.
- Large samples (roughly n ≥ 40): t procedures can be used even for skewed distributions.