Comparing Two Means

Chapter 10
Two-Sample Hypothesis Testing
• Two-Sample Tests
• Comparing Two Means: Independent Samples
• Comparing Two Means: Paired Samples
• Comparing Two Proportions
• Comparing Two Variances
McGraw-Hill/Irwin
Copyright © 2009 by The McGraw-Hill Companies, Inc.
Two-Sample Tests
What Is a Two-Sample Test?
• A two-sample test compares two sample estimates with each other.
• A one-sample test compares a sample estimate against a non-sample benchmark.
Basis of Two-Sample Tests
• Two samples that are drawn from the same population may yield different estimates of a parameter due to chance.
• If the two sample statistics differ by more than the amount attributable to chance, then we conclude that the samples came from populations with different parameter values.
Figure 10.1
Test Procedure
• State the hypotheses.
• Set up the decision rule.
• Insert the sample statistics.
• Make a decision based on the critical values or using p-values.
• If our decision is wrong, we could commit a Type I or Type II error.
• Larger samples are needed to reduce the chance of a Type I or Type II error.
Comparing Two Means:
Independent Samples
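Format of Hypotheses
• The hypotheses for comparing two independent population means $\mu_1$ and $\mu_2$ are, for a two-tailed test:
  $H_0: \mu_1 = \mu_2$
  $H_1: \mu_1 \ne \mu_2$
  A one-tailed test replaces $\ne$ in $H_1$ with $<$ or $>$.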
Test Statistic
• If the population variances $\sigma_1^2$ and $\sigma_2^2$ are known, then use the normal distribution.
• If the population variances are unknown and estimated using $s_1^2$ and $s_2^2$, then use the Student’s t distribution.
Table 10.1
• Excel’s Tools | Data Analysis menu handles all three cases (known variances; unknown variances assumed equal; unknown variances assumed unequal).
Figure 10.2
Case 1: Known Variances
• When the variances are known, use the normal distribution for the test (assuming normal populations).
• The test statistic is:
  $$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
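As a rough illustration of Case 1, the sketch below computes the z statistic as the difference in sample means divided by the standard error built from the known variances, and then a two-tailed p-value from the standard normal distribution. The function name `two_sample_z` and the summary numbers are hypothetical and chosen only for the example.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(xbar1, xbar2, var1, var2, n1, n2):
    """z statistic and two-tailed p-value for H0: mu1 = mu2, known population variances."""
    se = sqrt(var1 / n1 + var2 / n2)           # standard error of xbar1 - xbar2
    z = (xbar1 - xbar2) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical summary data (not from the chapter)
z, p = two_sample_z(xbar1=52.0, xbar2=50.0, var1=16.0, var2=9.0, n1=40, n2=36)
print(f"z = {z:.3f}, two-tailed p-value = {p:.4f}")
```

With these made-up inputs the p-value falls below .05, so a two-tailed test at that level would reject the hypothesis of equal means.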
Case 2: Unknown Variances, Assumed Equal
• Since the variances are unknown, they must be estimated and the Student’s t distribution used to test the means.
• Assuming the population variances are equal, $s_1^2$ and $s_2^2$ can be used to estimate a common pooled variance $s_p^2$:
  $$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
• The test statistic is
  $$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$
• with degrees of freedom $\nu = n_1 + n_2 - 2$.
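A minimal sketch of the Case 2 calculation, assuming the usual pooled-variance formula (each sample variance weighted by its own degrees of freedom). The function name `pooled_t` and the summary data are hypothetical. The critical value still has to come from a Student's t table or software, since the Python standard library has no t distribution.

```python
from math import sqrt

def pooled_t(xbar1, xbar2, s1_sq, s2_sq, n1, n2):
    """Pooled-variance t statistic for H0: mu1 = mu2 (variances assumed equal)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df
    t = (xbar1 - xbar2) / sqrt(sp_sq / n1 + sp_sq / n2)
    return t, df, sp_sq

# Hypothetical summary data (not from the chapter)
t_stat, df, sp_sq = pooled_t(xbar1=20.0, xbar2=18.0, s1_sq=4.0, s2_sq=6.0, n1=10, n2=12)
print(f"pooled variance = {sp_sq:.2f}, t = {t_stat:.3f}, d.f. = {df}")
```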
Case 3: Unknown Variances, Assumed Unequal
• If the unknown variances are assumed to be unequal, they are not pooled together.
• In this case, the distribution of the random variable $\bar{x}_1 - \bar{x}_2$ is not known exactly (the Behrens-Fisher problem).
• Use the Welch-Satterthwaite test, which replaces $\sigma_1^2$ and $\sigma_2^2$ with $s_1^2$ and $s_2^2$ in the known-variance z formula, then uses a Student’s t test with adjusted degrees of freedom.
• Welch-Satterthwaite test:
  $$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
• with degrees of freedom
  $$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
• A quick rule for the degrees of freedom is to use $\min(n_1 - 1, n_2 - 1)$.
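The following sketch shows how the Case 3 quantities might be computed, assuming the standard Welch-Satterthwaite approximation for the adjusted degrees of freedom. The function name `welch_t` and the summary data are hypothetical. It also returns the quick-rule degrees of freedom so the two can be compared.

```python
from math import sqrt

def welch_t(xbar1, xbar2, s1_sq, s2_sq, n1, n2):
    """Welch t statistic with Welch-Satterthwaite and quick-rule degrees of freedom."""
    v1 = s1_sq / n1                      # variance of the first sample mean
    v2 = s2_sq / n2                      # variance of the second sample mean
    t = (xbar1 - xbar2) / sqrt(v1 + v2)  # variances are not pooled
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    quick_df = min(n1 - 1, n2 - 1)       # conservative shortcut
    return t, df, quick_df

# Hypothetical summary data (not from the chapter)
t_welch, df_welch, df_quick = welch_t(20.0, 18.0, 4.0, 6.0, 10, 12)
print(f"t = {t_welch:.3f}, Welch d.f. = {df_welch:.2f}, quick-rule d.f. = {df_quick}")
```

The quick rule never gives more degrees of freedom than the Welch-Satterthwaite formula, so it errs on the conservative side.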
Steps in Testing Two Means
• Step 1: State the hypotheses.
• Step 2: Specify the decision rule. Choose $\alpha$ (the level of significance) and determine the critical value(s).
• Step 3: Calculate the test statistic.
• Step 4: Make the decision. Reject $H_0$ if the test statistic falls in the rejection region(s) as defined by the critical value(s).
For example, for a two-tailed test using Student’s t and $\alpha = .05$:
Figure 10.4
Which Assumption Is Best?
• If the sample sizes are equal, the Case 2 and Case 3 test statistics will be identical, although the degrees of freedom may differ.
• If the variances are similar, the two tests will usually agree.
• If no information about the population variances is available, then the best choice is Case 3.
• The fewer assumptions, the better.
Must Sample Sizes Be Equal?
• Unequal sample sizes are common and the formulas still apply.
Large Samples
• For unknown variances, if both samples are large ($n_1 > 30$ and $n_2 > 30$) and the populations aren’t badly skewed, use the following formula with Appendix C:
  $$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Caution: Three Issues
1. Are the populations skewed? Are there outliers?
   • Check using histograms and dot plots of each sample.
   • t tests are OK if the populations are moderately skewed, especially if samples are large.
   • Outliers are more serious.
2. Are the sample sizes large (n > 30)?
   • If samples are small, the mean is not a reliable indicator of central tendency and the test may lack power.
3. Is the difference important as well as significant?
   • A small difference in means or proportions could be significant if the sample size is large.
Comparing Two Means:
Paired Samples
Paired Data
• Data occur in matched pairs when the same item is observed twice but under different circumstances.
• For example, blood pressure is taken before and after a treatment is given.
• Paired data are typically displayed in columns.
Paired t Test
• Paired data typically come from a before/after experiment.
• In the paired t test, the difference between $x_1$ and $x_2$ is measured as $d = x_1 - x_2$.
• The mean $\bar{d}$ and standard deviation $s_d$ of the sample of n differences are calculated with the usual formulas for a mean and standard deviation.
• The calculations for the mean and standard deviation are:
  $$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i \qquad s_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$
• Since the population variance of d is unknown, use the Student’s t distribution with $n - 1$ degrees of freedom.
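To make the paired calculation concrete, here is a small sketch that forms the differences, computes their mean and sample standard deviation, and builds the t statistic for a hypothesized mean difference of zero. The before/after readings are invented for illustration and do not come from the chapter.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements on the same six subjects
before = [140, 152, 138, 147, 160, 155]
after = [135, 150, 136, 140, 152, 151]

d = [b - a for b, a in zip(before, after)]   # paired differences d = x1 - x2
n = len(d)
d_bar = mean(d)                              # mean difference
s_d = stdev(d)                               # sample standard deviation (n - 1 divisor)

t_stat = d_bar / (s_d / sqrt(n))             # tests H0: mean difference = 0
df = n - 1
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t_stat:.3f}, d.f. = {df}")
```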
Steps in Testing Paired Data
• Step 1: State the hypotheses, for example
  $H_0: \mu_d = 0$
  $H_1: \mu_d \ne 0$
• Step 2: Specify the decision rule. Choose $\alpha$ (the level of significance) and determine the critical values from Appendix D.
• Step 3: Calculate the test statistic t:
  $$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$$
• Step 4: Make the decision. Reject $H_0$ if the test statistic falls in the rejection region(s) as defined by the critical values.
Excel’s Paired Difference Test
• Excel gives you the option of choosing either a one-tailed or two-tailed test and also gives the p-value.
Analogy to Confidence Interval
• A two-tailed test for a zero difference is equivalent to asking whether the confidence interval for the true mean difference $\mu_d$ includes zero.
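The confidence-interval view can be sketched as follows, assuming the usual interval of the mean difference plus or minus a t critical value times the standard error. The data are hypothetical, and the critical value 2.571 (two-tailed, α = .05, 5 degrees of freedom) is supplied by hand from a Student's t table because the Python standard library does not provide t quantiles.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(d, t_crit):
    """Confidence interval for the mean difference, given a t critical value."""
    n = len(d)
    half_width = t_crit * stdev(d) / sqrt(n)
    return mean(d) - half_width, mean(d) + half_width

# Hypothetical paired differences; t_crit read from a t table for 5 d.f.
lower, upper = paired_ci([5, 2, 2, 7, 8, 4], t_crit=2.571)
includes_zero = lower <= 0 <= upper
print(f"95% CI: ({lower:.3f}, {upper:.3f}); includes zero: {includes_zero}")
```

Because this interval lies entirely above zero, the matching two-tailed test at α = .05 would reject a zero mean difference.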
Comparing Two Proportions
Testing for Zero Difference: $\pi_1 = \pi_2$
• To compare two population proportions $\pi_1$ and $\pi_2$, use the following hypotheses (shown for a two-tailed test):
  $H_0: \pi_1 - \pi_2 = 0$
  $H_1: \pi_1 - \pi_2 \ne 0$
• The sample proportion $p_1$ is a point estimate of $\pi_1$: $p_1 = x_1 / n_1$
• The sample proportion $p_2$ is a point estimate of $\pi_2$: $p_2 = x_2 / n_2$
Pooled Proportion
• If $H_0$ is true, there is no difference between $\pi_1$ and $\pi_2$, so the samples are pooled (averaged) into one “big” sample to estimate the common population proportion:
  $$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{\text{number of successes in combined samples}}{\text{combined sample size}}$$
Test Statistic
• If the samples are large, $p_1 - p_2$ may be assumed normally distributed.
• The test statistic is the difference of the sample proportions divided by the standard error of the difference.
• The standard error is calculated by using the pooled proportion $\bar{p}$.
• The test statistic for the hypothesis $\pi_1 = \pi_2$ is:
  $$z = \frac{p_1 - p_2}{\sqrt{\dfrac{\bar{p}(1 - \bar{p})}{n_1} + \dfrac{\bar{p}(1 - \bar{p})}{n_2}}}$$
• The test statistic for the hypothesis $\pi_1 = \pi_2$ may also be written as:
  $$z = \frac{p_1 - p_2}{\sqrt{\bar{p}(1 - \bar{p})\left[\dfrac{1}{n_1} + \dfrac{1}{n_2}\right]}}$$
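A minimal sketch of the two-proportion z test, assuming the pooled proportion is the combined number of successes divided by the combined sample size. The function name `two_proportion_z` and the counts are hypothetical. The standard error is computed in both algebraic forms to show that they agree.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: pi1 = pi2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)                      # pooled proportion
    se = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    # Same standard error written as a sum of two terms
    se_alt = sqrt(p_bar * (1 - p_bar) / n1 + p_bar * (1 - p_bar) / n2)
    return (p1 - p2) / se, p_bar, se, se_alt

# Hypothetical counts: 60 successes in 200 trials versus 45 in 200
z_prop, p_bar, se, se_alt = two_proportion_z(60, 200, 45, 200)
print(f"pooled p = {p_bar:.4f}, z = {z_prop:.3f}")
```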
Steps in Testing Two Proportions
• Step 1: State the hypotheses.
• Step 2: Specify the decision rule. Choose $\alpha$ (the level of significance) and determine the critical value(s).
• Step 3: Calculate the test statistic. Assuming that $\pi_1 = \pi_2$, use a pooled estimate of the common proportion.
• Step 4: Make the decision. Reject $H_0$ if the test statistic falls in the rejection region(s) as defined by the critical value(s).
For example, for a two-tailed test and $\alpha = .05$, the critical values are $z = \pm 1.96$.
Using the p-Value
• The p-value is the smallest level of significance at which we could reject $H_0$.
• The p-value indicates the probability that a sample result at least as extreme as the one observed would occur by chance if $H_0$ were true.
• The p-value can be obtained from Appendix C-2 or from Excel using =NORMSDIST(z), which returns the area to the left of z under the standard normal curve.
• A smaller p-value indicates a more significant difference.
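As a sketch of how a z statistic might be converted to a p-value without a table, the helper below uses the standard normal CDF from Python's standard library. The function name `p_value` and its `tail` argument are hypothetical conveniences for this example.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """p-value for a standard normal test statistic; tail is 'left', 'right' or 'two'."""
    cdf = NormalDist().cdf            # area to the left of z
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right' or 'two'")

p_two = p_value(1.96)                 # close to .05 for a two-tailed test
p_left = p_value(-1.645, tail="left") # close to .05 for a left-tailed test
print(f"two-tailed: {p_two:.4f}, left-tailed: {p_left:.4f}")
```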
Checking Normality
• Check the normality assumption with $n\pi \ge 10$ and $n(1 - \pi) \ge 10$.
• Each of the two samples must be checked separately, using each sample proportion in place of $\pi$.
• If either sample proportion cannot be assumed normal, their difference cannot safely be assumed normal.
• The sample size rule of thumb is equivalent to requiring that each sample contains at least 10 “successes” and at least 10 “failures.”
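The rule of thumb can be checked mechanically. In the sketch below, the sample proportion x/n stands in for the population proportion, so the check reduces to counting successes and failures in each sample. The function name `large_enough` and the counts are hypothetical.

```python
def large_enough(x, n, minimum=10):
    """True if a sample has at least `minimum` successes and `minimum` failures."""
    successes = x
    failures = n - x
    return successes >= minimum and failures >= minimum

# Hypothetical samples: both must pass before the difference is treated as normal
sample_1_ok = large_enough(60, 200)   # 60 successes, 140 failures
sample_2_ok = large_enough(6, 200)    # only 6 successes
normality_ok = sample_1_ok and sample_2_ok
print(sample_1_ok, sample_2_ok, normality_ok)
```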
Small Samples
• If sample sizes do not justify the normality assumption, treat each sample as a binomial experiment.
• If the samples are small, the test is likely to have low power.
Must Sample Sizes Be Equal?
• Unequal sample sizes are common and the formulas still apply.
Using Software for Calculations
• MegaStat gives the option of entering sample proportions or the fractions.
• MINITAB gives the option of nonpooled proportions.
Figure 10.15
Analogy to Confidence Intervals
• The confidence interval for $\pi_1 - \pi_2$ without pooling the samples is:
  $$(p_1 - p_2) \pm z_{\alpha/2}\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
• If the confidence interval does not include 0, then we reject the null hypothesis.
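A short sketch of the unpooled interval, assuming the usual large-sample form in which each sample contributes its own p(1 − p)/n term to the standard error. The function name `diff_ci` and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for pi1 - pi2 without pooling."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled standard error
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Hypothetical counts: 60 of 200 versus 45 of 200
ci_low, ci_high = diff_ci(60, 200, 45, 200)
reject_h0 = not (ci_low <= 0 <= ci_high)
print(f"95% CI: ({ci_low:.4f}, {ci_high:.4f}); reject H0: {reject_h0}")
```

For these made-up counts the interval includes zero, so the hypothesis of equal proportions would not be rejected at the 5 percent level.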
Testing for Non-Zero Differences (Optional)
• Testing for equality is a special case of testing for a specified difference $D_0$ between two proportions.
• If the hypothesized difference $D_0$ is nonzero, the samples are not pooled and the test statistic is:
  $$z = \frac{p_1 - p_2 - D_0}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$
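For a nonzero hypothesized difference, one common approach (assumed here) is to subtract the hypothesized difference from the observed difference and divide by the unpooled standard error, since the null hypothesis no longer implies a common proportion. The function name `z_for_difference` and the inputs are hypothetical.

```python
from math import sqrt

def z_for_difference(x1, n1, x2, n2, d0):
    """z statistic for H0: pi1 - pi2 = d0, using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2 - d0) / se

# Hypothetical counts with a hypothesized difference of 0.02
z_d0 = z_for_difference(60, 200, 45, 200, d0=0.02)
print(f"z = {z_d0:.3f}")
```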
Comparing Two Variances
Format of Hypotheses
• To test whether two population means are equal, we may also need to test whether two population variances are equal.
• The hypotheses may be stated as, for a two-tailed test:
  $H_0: \sigma_1^2 = \sigma_2^2$
  $H_1: \sigma_1^2 \ne \sigma_2^2$
• An equivalent way to state these hypotheses is to use the ratio $\sigma_1^2 / \sigma_2^2$ (equal to 1 under $H_0$), since a variance can never be less than zero and it would not make sense to take the difference between two variances.
The F Test
• The test statistic is the ratio of the sample variances:
  $$F = \frac{s_1^2}{s_2^2}$$
• If the variances are equal, this ratio should be near unity: $F \approx 1$.
• If the test statistic is far below 1 or far above 1, we would reject the hypothesis of equal population variances.
• The numerator $s_1^2$ has degrees of freedom $\nu_1 = n_1 - 1$ and the denominator $s_2^2$ has degrees of freedom $\nu_2 = n_2 - 1$.
• The F distribution is skewed, with its mean > 1 and its mode < 1.
• Critical values for the F test are denoted $F_L$ (left tail) and $F_R$ (right tail).
• A right-tail critical value $F_R$ may be found from Appendix F using $\nu_1$ and $\nu_2$ degrees of freedom:
  $F_R = F_{\nu_1, \nu_2}$
• A left-tail critical value $F_L$ may be found by reversing the numerator and denominator degrees of freedom, finding the critical value from Appendix F and taking its reciprocal:
  $F_L = 1 / F_{\nu_2, \nu_1}$
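The F ratio and the reciprocal rule for the left-tail critical value can be sketched as below. The two samples are invented for illustration, and the right-tail table value 9.60 (upper .025 point with 4 and 4 degrees of freedom) is entered by hand as it would be read from an F table, because the Python standard library has no F distribution.

```python
from statistics import variance

def f_statistic(sample1, sample2):
    """Ratio of sample variances with numerator and denominator degrees of freedom."""
    f = variance(sample1) / variance(sample2)   # variance() uses the n - 1 divisor
    return f, len(sample1) - 1, len(sample2) - 1

def left_tail_critical(f_table_reversed_df):
    """F_L = 1 / F(df2, df1): reciprocal of the table value found with reversed d.f."""
    return 1 / f_table_reversed_df

# Hypothetical samples (not from the chapter)
f_ratio, df1, df2 = f_statistic([12, 15, 11, 14, 13], [10, 18, 9, 16, 12])

f_right = 9.60                       # table value for 4 and 4 d.f., upper .025 tail
f_left = left_tail_critical(9.60)    # d.f. are equal here, so reversing changes nothing
reject_h0 = f_ratio < f_left or f_ratio > f_right
print(f"F = {f_ratio:.4f}, F_L = {f_left:.4f}, F_R = {f_right}, reject H0: {reject_h0}")
```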
Steps in Testing Two Variances
• Step 1: State the hypotheses, for example
  $H_0: \sigma_1^2 = \sigma_2^2$
  $H_1: \sigma_1^2 \ne \sigma_2^2$
• Step 2: Specify the decision rule. Degrees of freedom are:
  Numerator: $\nu_1 = n_1 - 1$
  Denominator: $\nu_2 = n_2 - 1$
  Choose $\alpha$ and find the left-tail and right-tail critical values from Appendix F.
• Step 3: Calculate the test statistic.
• Step 4: Make the decision. Reject $H_0$ if the test statistic falls in the rejection regions as defined by the critical values.
Comparison of Variances: One-Tailed Test
• Step 1: State the hypotheses, for example
  $H_0: \sigma_1^2 = \sigma_2^2$
  $H_1: \sigma_1^2 < \sigma_2^2$
• Step 2: State the decision rule. Degrees of freedom are:
  Numerator: $\nu_1 = n_1 - 1$
  Denominator: $\nu_2 = n_2 - 1$
  Choose $\alpha$ and find the left-tail critical value from Appendix F.
• Step 3: Calculate the test statistic F.
• Step 4: Make the decision. Reject $H_0$ if the test statistic falls in the left-tail rejection region as defined by the critical value.
Excel’s F Test
• Excel allows you to test the equality of variances and gives you a p-value.
Figure 10.26
Assumptions of the F Test
• The F test assumes that the populations being sampled are normal.
• It is sensitive to non-normality of the sampled populations.
• Often, the samples used in the F test are too small for a normality test.
Applied Statistics in
Business & Economics
End of Chapter 10