Two Sample Hypothesis Tests

Chapter 8 (2/3e) or 9 (1e)
Hypothesis Testing:
Two Sample Test for Means and
Proportions
Introduction:

The two sample test is similar to the one sample
test, except that we are now testing for
differences between two populations rather than
a sample and a population. There are three
types of two sample tests:

Hypothesis Testing with Sample Means (Large
Samples)
Hypothesis Testing with Sample Means (Small
Samples)
Hypothesis Testing with Sample Proportions (Large
Samples)


The Question to be Answered:
 “Is
the difference between sample
statistics large enough to
conclude that the populations
represented by the samples are
significantly different?”
Null Hypothesis:

The H0 is that the populations are the same.
H0: μ1 = μ2

If the difference between the sample statistics is
large enough, or, if a difference of this size is
unlikely, assuming that the H0 is true, we will
reject the H0 and conclude there is a difference
between the populations.
Null Hypothesis (cont.)

The H0 is a statement of “no difference”

The 0.05 level will continue to be our indicator of
a significant difference

We change the sample statistics to a Z score,
place the Z score on the sampling distribution
and use Appendix A to determine the probability
of getting a difference that large if the H0 is true.
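
As a rough illustration of the table lookup described above, the sketch below computes the two-tailed probability of a Z score directly from the standard normal curve instead of reading it from Appendix A. The function name `two_tailed_p` is made up for this example and is not from the text.

```python
import math

def two_tailed_p(z: float) -> float:
    """Probability of a Z score at least this far from 0 (in either
    direction) if H0 is true. Hypothetical helper for illustration."""
    # For a standard normal variable, P(|Z| >= z) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# A Z score of 1.96 leaves about .05 of the area in the two tails combined.
p = two_tailed_p(1.96)
print(round(p, 3))
```

If the probability is below 0.05, a difference that large would be unlikely under the H0, which is the condition for rejecting it.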
Alternate Hypothesis:


The alternate hypothesis is the research
hypothesis.
If the null hypothesis is rejected, then we will
have found evidence to support the research
hypothesis.
H1: μ1 ≠ μ2
Formula for Hypothesis Testing with
Sample Means (Large Samples)

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}-\bar{X}}}$$
Explanation of formula:

The numerator ($\bar{X}_1 - \bar{X}_2$) is the difference in
sample means.

The denominator ($\sigma_{\bar{X}-\bar{X}}$) is the “pooled estimate”
of the standard error for both samples.
The pooled estimate is calculated by using the
sample information in the following formula:

$$\sigma_{\bar{X}-\bar{X}} = \sqrt{\frac{s_1^2}{n_1 - 1} + \frac{s_2^2}{n_2 - 1}}$$
The Five Step Model
1. Make assumptions and meet test requirements.
2. State the H0 and H1.
3. Select the Sampling Distribution and Determine the Critical Region.
4. Calculate the test statistic.
5. Make a Decision and Interpret Results.
Example: Hypothesis Testing in the Two
Sample Case

Text 1e 9.5b, 2/3e 8.5b (Email messages):

Middle class families average 8.7 email
messages and working class families average
5.7 messages.

The middle class families seem to use email
more but is the difference significant?
Problem Information:
E-Mail Messages
Sample 1 (M. Class): X̄1 = 8.7, s1 = 0.3, n1 = 89
Sample 2 (W. Class): X̄2 = 5.7, s2 = 1.1, n2 = 55
Step 1 Make Assumptions and Meet
Test Requirements

We have:

Independent Random Samples

Level of Measurement is Interval-Ratio

Sampling Distribution is normal in shape because
we have a large sample:
n1 + n2 ≥ 100 (in this case, n1 + n2 = 144)
Step 2 State the Null Hypothesis

H0: μ1 = μ2


The Null asserts there is no significant difference
between the populations.
H1: µ1≠ µ2

The research hypothesis contradicts the H0 and
asserts there is a significant difference between
the populations.
Step 3 Select the Sampling Distribution
and Establish the Critical Region

Sampling Distribution = Z distribution

Alpha (α) = 0.05

Z (critical) = ± 1.96
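
The critical value above comes from the Z table, but it can also be recovered from alpha. The sketch below is one way to do that with Python's standard library; the function name `z_critical_two_tailed` is an assumption made for this example.

```python
from statistics import NormalDist

def z_critical_two_tailed(alpha: float) -> float:
    """Z score that cuts off alpha/2 of the area in each tail.
    Hypothetical helper for illustration."""
    # The upper critical value leaves alpha/2 above it, so it sits at
    # the (1 - alpha/2) quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)

z_crit = z_critical_two_tailed(0.05)
print(round(z_crit, 2))  # 1.96, so the critical region is beyond +/- 1.96
```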
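Step 4 Calculate the Test Statistic

Using the formula, compute the pooled estimate (S.E.):

$$\sigma_{\bar{X}-\bar{X}} = \sqrt{\frac{s_1^2}{n_1 - 1} + \frac{s_2^2}{n_2 - 1}} = \sqrt{\frac{(.3)^2}{89 - 1} + \frac{(1.1)^2}{55 - 1}} = \sqrt{.001 + .022} = .152$$

Solve for Z:

$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}-\bar{X}}} = \frac{8.7 - 5.7}{.152} = 19.74$$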
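
The arithmetic for the email example can be checked with a short Python sketch. The function names `standard_error_means` and `z_obtained` are invented for this illustration. The sample values are the ones given in the problem information.

```python
import math

def standard_error_means(s1, n1, s2, n2):
    """Pooled estimate of the standard error for two large samples:
    sqrt(s1^2 / (n1 - 1) + s2^2 / (n2 - 1))."""
    return math.sqrt(s1**2 / (n1 - 1) + s2**2 / (n2 - 1))

def z_obtained(mean1, mean2, se):
    """Difference in sample means divided by the standard error."""
    return (mean1 - mean2) / se

# Email example: middle class (sample 1) vs. working class (sample 2).
se = standard_error_means(s1=0.3, n1=89, s2=1.1, n2=55)
z = z_obtained(8.7, 5.7, se)
print(round(se, 3), round(z, 2))
```

Because the code does not round the intermediate terms, it gives a standard error of about .153 and a Z of about 19.6. The slide's .152 and 19.74 come from rounding each term under the square root to three decimals first. The decision is the same either way, since both values are far beyond ±1.96.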
Step 5 Make a Decision

The obtained test statistic (Z = 19.74) falls in the
Critical Region so reject the null hypothesis.

The difference between the sample means is so
large that we can conclude (at α = 0.05) that a
difference exists between the populations
represented by the samples.

The difference between the email usage of
middle class and working class families is
significant (Z=19.74, α=.05)
Two-tailed Hypothesis Test:
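[Figure: sampling distribution with critical regions (c) beyond Z = -1.96 and Z = +1.96; the obtained Z = 19.74 is marked in the upper tail.]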
When α = .05, .025 of the area lies in each tail of the
curve (the critical regions, marked c).
The .95 in the middle section is where a difference between
the samples is too small to be declared significant.
The cut-off between the middle section and +/- .025 is
represented by a Z-value of +/- 1.96.
Factors in Making a Decision


The use of one- vs. two-tailed tests (we
are more likely to reject with a one-tailed
test)
The size of the sample (n). The larger the
sample the more likely we are to reject the
H0.
Significance Vs. Importance


As long as we work with random samples, we
must conduct a test of significance.
Significance is not the same thing as
importance.

Differences that are otherwise trivial or
uninteresting may be significant.
Significance Vs. Importance (cont.)

When working with large samples, even small
differences may be significant.


The value of the test statistic (step 4) is a direct
function of n: the standard error in its denominator
shrinks as n grows.
The larger the n, the greater the value of the test
statistic, the more likely it will fall in the critical
region (region of rejection) and be declared
significant.
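
To see the effect of sample size, the sketch below holds a small difference in means fixed and lets n grow. All numbers are hypothetical (a difference of 0.2, a standard deviation of 1.0 in both samples, and equal sample sizes), and the function name `z_for_equal_n` is made up for this example.

```python
import math

def z_for_equal_n(mean_diff, s, n):
    """Z for two samples that share the same standard deviation s and
    the same size n, using the large-sample standard error formula."""
    se = math.sqrt(2 * s**2 / (n - 1))
    return mean_diff / se

# Same small difference, larger and larger samples (hypothetical data).
for n in (50, 100, 400, 1600):
    print(n, round(z_for_equal_n(0.2, 1.0, n), 2))
```

With these assumed values the difference is not significant at n = 50 or n = 100 per sample, but it passes the ±1.96 cut-off by n = 400, even though the difference itself never changes.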
Significance Vs. Importance (cont.)

Significance and importance are different things.

A sample outcome could be:
 significant and important
 significant but unimportant
 not significant but important
 not significant and unimportant
Formula for Hypothesis Testing with
Sample Proportions (Large Samples)

Formula for proportions:

$$Z = \frac{P_{s1} - P_{s2}}{\sigma_{p-p}}$$

See next slide for how to calculate the standard
deviation of the sampling distribution* and the
pooled estimate of the population proportion*….

*Note that you need to calculate both these values in order to solve the
denominator of the above equation!
Calculating Pu (the Pooled Estimate of the
Population Proportion) and the Standard
Deviation of the Sampling Distribution

To calculate Pu (the pooled estimate, fig. 7.7 or 8.7):

$$P_u = \frac{n_1 P_{s1} + n_2 P_{s2}}{n_1 + n_2}$$

Standard deviation of the sampling distribution (fig. 7.7 or 8.8):

$$\sigma_{p-p} = \sqrt{P_u (1 - P_u)} \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$
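
The two pieces above (Pu and the standard deviation of the sampling distribution) combine into the Z for proportions. The sketch below shows one way to put them together. The function name `z_for_proportions` and the sample values are hypothetical and are not taken from the text problems.

```python
import math

def z_for_proportions(ps1, n1, ps2, n2):
    """Large-sample Z for a difference between two sample proportions."""
    # Pooled estimate of the population proportion.
    pu = (n1 * ps1 + n2 * ps2) / (n1 + n2)
    # Standard deviation of the sampling distribution of the difference.
    se = math.sqrt(pu * (1 - pu)) * math.sqrt((n1 + n2) / (n1 * n2))
    return (ps1 - ps2) / se

# Hypothetical data: 60% of 100 cases vs. 50% of 120 cases.
z = z_for_proportions(0.60, 100, 0.50, 120)
print(round(z, 2))
```

For these made-up numbers Z comes out at roughly 1.48, which is inside ±1.96, so the H0 would not be rejected at α = 0.05.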
Example:

Using the same guidelines as for the large
sample test for means (above) and the 5-step
method, work with a partner and try #9.11 to
test for a difference in proportions.

The answer to this question can be found at
the back of your text.
Formula (t-test) for Hypothesis Testing with
Sample Means (Small Samples N1 + N2 < 100)
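Formula:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}-\bar{X}}}$$

S.E.:

$$\sigma_{\bar{X}-\bar{X}} = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{n_1 + n_2}{n_1 n_2}}$$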
Note: Use t-table with df = n1 + n2 - 2
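
A minimal sketch of the small-sample calculation follows, assuming the standard error formula shown for this test. The function name `t_small_samples` and the sample values are hypothetical and are not from the text problems.

```python
import math

def t_small_samples(mean1, s1, n1, mean2, s2, n2):
    """t (obtained) and degrees of freedom for two small samples."""
    # Pooled standard deviation term, then the standard error.
    pooled = math.sqrt((n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2))
    se = pooled * math.sqrt((n1 + n2) / (n1 * n2))
    t = (mean1 - mean2) / se
    df = n1 + n2 - 2
    return t, df

# Hypothetical data: n1 = 15 and n2 = 12, so n1 + n2 < 100.
t, df = t_small_samples(12.0, 2.0, 15, 10.0, 2.5, 12)
print(round(t, 2), df)
```

With these made-up values, t is about 2.22 with df = 25. The two-tailed critical t for df = 25 at α = 0.05 is ±2.060, so this difference would fall in the critical region.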
Example:



Using the same format as for the large sample
test (above) and the 5-step method, work with
a partner and try #9.7a (1e) or #8.7a (2/3e).
Do part b for homework.
The answer to this question can be found at
the back of your text.
Using SPSS to do Independent Samples Test
for Difference in Two Means





SPSS uses a t-test rather than a z-test for both large
and small samples.
Follow guidelines in text at the end of the chapter.
In interpreting your printout, look at Levene’s test
(shown in the first two columns, F and Sig.) first.
If the p-value (Sig.) is greater than alpha = .05, focus
on interpreting the top row of the “t-test for Equality of
Means” (equal variances assumed). If it is less than .05,
use the bottom row of the t-test (equal variances not assumed).
If the significance level (Sig. 2-tailed) is less than
α=.05, then the difference between the sample
means is significant. Report t, df, and your α-level in
your interpretation.
Practice Problems



#8.4 (2/3e)/ 9.4 (1e)
#8.8 (2/3e)/ 9.8 (1e)
#8.12a (2/3e)/ 9.12a (1e)
Answers can be found in the lecture list directly
below this presentation. No looking before
you have tried the questions!