(3) Paired-Samples T tests

Author(s): Brenda Gunderson, Ph.D., 2011
License: Unless otherwise noted, this material is made available under the
terms of the Creative Commons Attribution–Non-commercial–Share
Alike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/
We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your
ability to use, share, and adapt it. The citation key on the following slide provides information about how you
may share and adapt this material.
Copyright holders of content included in this material should contact open.michigan@umich.edu with any
questions, corrections, or clarification regarding the use of content.
For more information about how to cite these materials visit http://open.umich.edu/education/about/terms-of-use.
Some material may be sourced from:
Mind on Statistics
Utts/Heckard, 3rd Edition, Duxbury, 2006
Text Only: ISBN 0495667161
Bundled version: ISBN 1111978301
Material from this publication used with permission.
Module 6: Paired t Procedures
Objectives: In this module you will learn how to construct a confidence interval and perform a paired
t-test in the case when we have two quantitative variables collected in pairs. You will make a confidence
interval for and test hypotheses about the population mean difference, μ_D. You will be able to provide a
statement about how confident you are about your interval estimate or in your decision.
Overview: Matched or paired data results from a deliberate experimental design scheme. For example,
suppose we are examining the effect of a drug on a certain type of response. The drug is administered
to a group of people. Responses for each individual can be measured both before and after the drug is
given. Or consider an experiment where rats are matched by weight, and then one rat in each match
receives a new diet and the other rat in the match receives a control diet. These types of design are
called paired data designs. Note that paired designs can occur when you have two measurements on
the same individual or when you have two individuals that have been matched or paired prior to
administering a treatment.
The inference procedures for a paired data design are based on the one sample of differences, and thus
the one-sample t procedures from Module 5 could be used. We are interested in estimating or testing
hypotheses about the population mean difference μ_D, generally with the hypothesized value of zero,
indicating no difference on average.
Formula Card:
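For reference, the standard paired t formulas can be written as below. The notation is an assumption of this sketch and may differ from the symbols printed on the course formula card: $\bar{d}$ is the sample mean difference, $s_d$ is the sample standard deviation of the differences, $n$ is the number of pairs, and $t^{*}$ is the t multiplier with $n - 1$ degrees of freedom.

```latex
\[
\text{s.e.}(\bar{d}) = \frac{s_d}{\sqrt{n}}, \qquad
t = \frac{\bar{d} - 0}{s_d / \sqrt{n}} \quad (\text{df} = n - 1), \qquad
\text{CI for } \mu_D:\ \ \bar{d} \pm t^{*}\,\frac{s_d}{\sqrt{n}}
\]
```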
Activity:
Do books purchased from Borders (in-store) cost more on
average than if purchased online at Amazon.com?
Background: In recent years, the popularity of purchasing books via the Internet has increased
dramatically. The conventional bookstore no longer dominates the sales of books. The most influential
factor that sways customers into purchasing books online is lower prices when compared to local
bookstores.
A group of Statistics 350 students decided to perform a comparison of the Amazon.com prices versus
Borders bookstore (Ann Arbor) prices based on a sample of 40 books, selected from a wide range of
categories. For Amazon, a standard ground shipping of $4.29 and local state tax of 6% were included in
the cost.
The corresponding costs are available in the SPSS data set called books.sav (Source: Stat 350 group
project, 2004). Do the data provide sufficient evidence to conclude that, on average, Borders (in-store)
books are more expensive than Amazon.com books?
Task: Perform the appropriate paired t-test regarding the mean difference in book price, μ_D, where the
differences are computed as “Borders less Amazon” (i.e., ‘price at Borders’ minus ‘price on Amazon’).
Before conducting any test, here is a set of questions to ask yourself:
• How many populations are there?
    One
    Two
    More than two
• How many variables are there?
    One
    Two
• What is the response variable?
• What type of variable is the response?
    Categorical
    Quantitative
• What type of parameter would be useful for summarizing this response?
    Proportion
    Mean
    Other (see Supplement 3)
Based on the answers to these questions, you should be able to identify the appropriate inference
procedure. You may refer back to Supplement 3 – Name that Scenario for assistance.
The appropriate inference procedure for this scenario is ______________________________
and the specific parameter of interest is ___________________ .
NOTE: Why is this a paired procedure?
1. State the hypotheses: H0: __________ = __________ Ha: _______________________
where _____ represents
Your parameter definition should always be a statement about the population(s) under study.
2. Assumption Checks and Computing the Test Statistic:
Assumptions:
a. For this scenario, we need to assume that the sampled differences are a ________ sample.
To check this assumption, we would make a _______ plot (if there was time order) of the
_________________ and look for _____________________________________.
b. We also need to assume that the ___________________ of differences
is normally distributed. To check this assumption, we would make a _____________ plot
of the __________________.
c. We will assume the assumptions are reasonable for this example.
Test-statistic:
d. Generate the paired t-test output.
Use Analyze> Compare Means> Paired-Samples T-Test.
Note: If you want a CI, you can use Options to change the confidence level from 95%.
e. The test value is _____ (this is the null value from the null hypothesis).
f. What is the value of the test statistic?
g. What is the distribution of the test statistic if the null hypothesis is true?
This is the same as asking what model you use to find the p-value.
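The quantities asked for in steps d through g can also be computed by hand from the differences. The following is a minimal sketch in Python rather than SPSS; the function names and the five book prices are hypothetical (they are NOT the books.sav data), and the multiplier t* = 2.776 is the standard t-table value for 95% confidence with 4 degrees of freedom.

```python
import math
import statistics


def paired_t_summary(x1, x2, null_value=0.0):
    """Summarize paired data through the differences d = x1 - x2."""
    if len(x1) != len(x2):
        raise ValueError("paired data need the same number of values in each list")
    diffs = [a - b for a, b in zip(x1, x2)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)      # sample mean difference
    s_d = statistics.stdev(diffs)       # sample sd of the differences (n - 1 divisor)
    se = s_d / math.sqrt(n)             # standard error of the sample mean difference
    t_stat = (d_bar - null_value) / se  # test statistic; df = n - 1 under H0
    return {"n": n, "mean_diff": d_bar, "sd_diff": s_d,
            "se": se, "t": t_stat, "df": n - 1}


def confidence_interval(summary, t_star):
    """Interval d_bar +/- t* x s.e.; t_star is looked up in a t table for df = n - 1."""
    half_width = t_star * summary["se"]
    return (summary["mean_diff"] - half_width, summary["mean_diff"] + half_width)


# Hypothetical prices in dollars for five books (illustration only).
borders = [24.95, 15.00, 29.99, 12.50, 18.00]
amazon = [22.45, 15.50, 26.99, 11.00, 17.00]

summary = paired_t_summary(borders, amazon)    # differences are Borders less Amazon
ci = confidence_interval(summary, t_star=2.776)  # 95% confidence, df = 4

print(summary)
print(ci)
```

Passing `null_value=0.0` mirrors the test value in step e; the returned `df` identifies the t distribution asked for in step g.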
3. Calculate the p-value:
a. What is the SPSS reported p-value? _____________. Is it the p-value we want? _____
b. Draw a picture of the p-value we want.
c. So, our p-value is _____________________
d. Provide an interpretation of the p-value.
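SPSS reports a two-sided p-value ("Sig. (2-tailed)") for the paired-samples t-test, so a one-sided alternative needs a conversion. The sketch below shows that conversion, which relies on the symmetry of the t distribution about zero. The function name and the example numbers are hypothetical and are not taken from the books output.

```python
def one_sided_p_value(two_sided_p, t_stat, alternative):
    """Convert a two-sided p-value into a one-sided p-value.

    alternative: "greater" for Ha: mu_D > 0, "less" for Ha: mu_D < 0.
    """
    if not 0.0 <= two_sided_p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if alternative not in ("greater", "less"):
        raise ValueError('alternative must be "greater" or "less"')

    # Does the observed t statistic fall on the side that Ha points to?
    if alternative == "greater":
        supports_ha = t_stat > 0
    else:
        supports_ha = t_stat < 0

    half = two_sided_p / 2
    # Same side as Ha: keep one tail. Opposite side: everything except that tail.
    return half if supports_ha else 1 - half


# Hypothetical output values, for illustration only.
print(one_sided_p_value(0.040, t_stat=2.1, alternative="greater"))
print(one_sided_p_value(0.040, t_stat=-2.1, alternative="greater"))
```

The second call illustrates why the sign of the test statistic matters: halving the reported value is only appropriate when the sample result points in the direction of the alternative.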
4. Decision:
What is your decision at a 5% significance level?    Reject H0    Fail to reject H0
Remember:
Reject H0  ⇔  Results statistically significant
Fail to reject H0  ⇔  Results not statistically significant
5. Conclusion: State your conclusion in the context of the problem.
Conclusions should not be too strong: say that you have sufficient evidence (or equivalent wording); do NOT
say that you have proven anything.
Conclusions should always include a reference to the population parameter of interest.
Check Your Understanding:
1. The denominator of the test statistic is the standard error of the sample mean difference. The
following two sentences attempt to interpret a standard error. Which one is correct and why?
“The standard error of the sample mean estimates roughly the average distance of the
sample mean from the population mean”
“The standard error of the sample mean difference estimates roughly the average
distance of the observed differences from the population mean differences”
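A small simulation can help when weighing the two statements. The sketch below repeatedly draws samples of differences from a hypothetical normal population and compares how much the sample means vary with how much the individual differences vary. The population values, sample size, and seed are arbitrary assumptions chosen only for illustration.

```python
import math
import random
import statistics

random.seed(350)  # arbitrary seed so the run is repeatable

# Hypothetical population of differences (illustration only).
MU_D = 2.0      # population mean difference
SIGMA_D = 5.0   # population standard deviation of the differences
N = 40          # number of pairs in each sample
REPS = 2000     # number of repeated samples

sample_means = []
for _ in range(REPS):
    diffs = [random.gauss(MU_D, SIGMA_D) for _ in range(N)]
    sample_means.append(sum(diffs) / N)

# Typical distance of a sample mean difference from MU_D across repeated samples.
spread_of_means = statistics.stdev(sample_means)
# What the standard error formula targets when sigma is known.
theoretical_se = SIGMA_D / math.sqrt(N)

print("spread of the sample means:       ", round(spread_of_means, 3))
print("sigma_D / sqrt(n):                ", round(theoretical_se, 3))
print("spread of individual differences: ", SIGMA_D)
```

Comparing the three printed values shows which quantity the standard error tracks and which it does not.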
Think About It:
What is the connection between the paired t-test procedure and the one-sample t-test procedure from
Module 5?
How could you carry out this test via the one-sample procedure?
Try this and compare your results. Comment on your findings.
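Outside SPSS, the same comparison can be sketched with SciPy, assuming it is installed: `scipy.stats.ttest_rel` runs a paired t-test on two columns, and `scipy.stats.ttest_1samp` runs a one-sample t-test on a single column. The five prices below are hypothetical stand-ins, not the books.sav data.

```python
from scipy import stats

# Hypothetical prices in dollars for five books (illustration only).
borders = [24.95, 15.00, 29.99, 12.50, 18.00]
amazon = [22.45, 15.50, 26.99, 11.00, 17.00]

# One sample of differences, computed as Borders less Amazon.
diffs = [b - a for b, a in zip(borders, amazon)]

# Paired t-test on the two original columns (two-sided by default).
paired = stats.ttest_rel(borders, amazon)

# One-sample t-test on the differences with null value 0 (two-sided by default).
one_sample = stats.ttest_1samp(diffs, popmean=0)

print("paired:     t =", paired.statistic, " p =", paired.pvalue)
print("one-sample: t =", one_sample.statistic, " p =", one_sample.pvalue)
```

Both calls report a two-sided p-value, so the same adjustment is needed for a one-sided alternative as with the SPSS output.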
Example Exam Question on Paired t-Test
A utilization study was conducted to see how often two rooms of a sports facility were being used
during the lunch hour. The number of people in each room was counted at 12:30 p.m. each Monday
for 10 weeks. The results are summarized below.
Week    1 = Dance Studio    2 = Weight Room    D = Difference
 1            16                  11                  5
 2            20                  13                  7
 3             7                   8                 -1
 4            12                  14                 -2
 5            11                  12                 -1
 6            19                  15                  4
 7            23                   9                 14
 8            12                  15                 -3
 9            25                  18                  7
10            16                  10                  6
Suppose you want to test the hypothesis of no difference between the utilization of the two rooms
against the alternative that the dance studio is used by more people during the lunch hour on average.
You conduct a (matched) paired t-test and enter the above data into SPSS to obtain the following
output.
[SPSS paired-samples t-test output table not reproduced]
a. The observed test statistic is given as t = 2.13. State what this value tells you about the location of
the sample mean difference of 3.6.
b. State the appropriate null and alternative hypotheses, and define the parameter of interest.
H0: _______ _____________________ Ha: _______ ____________________
where ______ is ____________________________________________________.
c. Report the p-value for the test in part (b) and decision using a significance level of 0.10.
p-value: _____________________
Decision: (circle)    Reject H0    Fail to Reject H0
d. A decision was made in part (c). Which type of error could have been made? Use the appropriate
statistical name to identify the mistake. Error:
e. Circle each of the following statements that is an assumption required for performing the paired t-test.
…the population standard deviation for the difference in room use is known.
…the numbers using the dance studio are normally distributed.
…the difference in room use is normally distributed.
…the standard deviation for the number using the dance studio is equal to
the standard deviation for the number using the weight room.
…the numbers using the dance studio are independent of the numbers using the weight room.
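The values quoted in part (a) can be checked directly from the room counts. The sketch below recomputes the sample mean difference and the test statistic with the Python standard library, then brackets the one-sided p-value using upper-tail critical values for 9 degrees of freedom taken from a standard t table. The variable names are hypothetical, and the bracketing is a table-style approximation, not the exact SPSS p-value.

```python
import math
import statistics

# Weekly counts from the utilization study, weeks 1 through 10.
dance_studio = [16, 20, 7, 12, 11, 19, 23, 12, 25, 16]
weight_room = [11, 13, 8, 14, 12, 15, 9, 15, 18, 10]

# Differences: dance studio minus weight room.
diffs = [d - w for d, w in zip(dance_studio, weight_room)]
n = len(diffs)

d_bar = statistics.mean(diffs)               # sample mean difference
se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of d_bar
t_stat = d_bar / se                          # null value is 0; df = n - 1 = 9

# Upper-tail area -> critical value for df = 9 (standard t-table entries).
T_TABLE_DF9 = {0.10: 1.383, 0.05: 1.833, 0.025: 2.262, 0.01: 2.821}

# For Ha: mu_D > 0, the p-value is the area to the right of t_stat.
# It is below every tail area whose critical value t_stat exceeds,
# and above every tail area whose critical value t_stat falls short of.
upper = min((area for area, crit in T_TABLE_DF9.items() if t_stat >= crit), default=1.0)
lower = max((area for area, crit in T_TABLE_DF9.items() if t_stat < crit), default=0.0)

print("mean difference:", d_bar)
print("t statistic:", round(t_stat, 2), "with df =", n - 1)
print("one-sided p-value lies between", lower, "and", upper)
```

The test statistic expresses the sample mean difference in standard-error units: it is the number of standard errors by which 3.6 sits above the null value of 0.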