One Sample T-Test

Dr. Scott Marley
EdPsy 511
Spring 2007
SPSS Tutorial
Please note that SPSS reports p-values based only on two-tailed statistical tests. All
examples presented here are two-tailed tests. To verify the hand calculations for
one-tailed tests on your homework, Dr. Marley recommends the following: 1) After
computing the t value by hand, compare it to the t value in the SPSS output for the
two-tailed test. 2) Divide the p-value in the SPSS output in half and compare it to the
alpha level assigned in the homework problem. Halving is appropriate only when the
observed difference falls in the direction predicted by the one-tailed hypothesis.
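The halving rule above can be sketched in a few lines of Python. This is only an illustration: the function name `one_tailed_p` and the numbers in the example are hypothetical and do not come from the homework. It assumes the sign of t is defined so that a positive t supports a "greater" hypothesis.

```python
def one_tailed_p(two_tailed_p: float, t_value: float, predicted_direction: str) -> float:
    """Convert a two-tailed p-value (as printed by SPSS) to a one-tailed p-value.

    predicted_direction is "greater" if the one-tailed hypothesis predicts a
    positive t, or "less" if it predicts a negative t.
    """
    if predicted_direction not in ("greater", "less"):
        raise ValueError("predicted_direction must be 'greater' or 'less'")
    # Did the observed t fall on the side the hypothesis predicted?
    in_predicted_direction = (t_value > 0) == (predicted_direction == "greater")
    half = two_tailed_p / 2
    # Halve the p-value only when the result is in the predicted direction.
    return half if in_predicted_direction else 1 - half


# Illustrative (made-up) values: two-tailed p = .08 with t = 1.90
alpha = 0.05
p_one = one_tailed_p(0.08, 1.90, "greater")
print(f"one-tailed p = {p_one:.3f}, reject H0: {p_one < alpha}")
```

With these made-up values the two-tailed test is not significant at .05, while the one-tailed test in the predicted direction is.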
One Sample T-Test
The One-Sample T Test procedure tests whether the mean of a single variable differs
from a specified constant.
Examples. A researcher might want to test whether the average IQ score for a group of
students differs from 100. Or, a cereal manufacturer can take a sample of boxes from the
production line and check whether the mean weight of the samples differs from 1.3
pounds at the 95% confidence level. The following example is designed for the purposes
of humor and instruction.
We have volunteered to collect data for the people of Mozambique on the circumferences
of mythical giraffe necks. We were able to measure 10 adult mythical giraffes whose
neck sizes in feet had the following dimensions: 37, 68, 45, 30, 40, 50, 70, 35, 26, 47.
We have approximately 500 feet of gauze material that we are hoping is sufficient to
create neck bows for each giraffe. Thus, we are testing the null hypothesis that the
average neck size in the population of mythical giraffes is 50 ft. We are using an alpha
level of .05.
H0: µ = 50
HA: µ ≠ 50
Enter the following neck sizes of the giraffes 37, 68, 45, 30, 40, 50, 70, 35, 26, 47. Then
choose One-Sample T-Test
Set Test Value at 50.
Click OK.
Conclusion Based Upon SPSS Output for a One-Sample T-Test.
The mean size of our giraffe necks was 44.8 ft. with a standard deviation of 14.75 ft. The
standard error of the mean was 4.66 ft. We find a p-value of .294, which is greater than
our alpha level of .05. Thus, the mean neck size in our sample of giraffes is not
statistically significantly different from 50 ft and we fail to reject the null
hypothesis. The people of Mozambique are sure to cast admiring gazes at the beautifully
mythical and gauzed giraffes.
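As a check on the hand calculation, the same one-sample test can be sketched in Python using only the standard library. This is not how SPSS computes its output; the helper names `t_density` and `two_tailed_p` are our own, and the p-value is obtained here by numerically integrating the t density with Simpson's rule.

```python
import math
from statistics import mean, stdev

neck_sizes = [37, 68, 45, 30, 40, 50, 70, 35, 26, 47]  # feet, from the example
test_value = 50                                        # value under H0

n = len(neck_sizes)
df = n - 1
sample_mean = mean(neck_sizes)
sample_sd = stdev(neck_sizes)              # sample SD (n - 1 in the denominator)
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - test_value) / standard_error


def t_density(x: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value: 1 minus the area under the t density between -|t| and |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = 2 * (h / 3) * total  # the density is symmetric about 0
    return 1 - central_area


p_value = two_tailed_p(t_stat, df)
print(f"mean = {sample_mean:.1f}, SD = {sample_sd:.2f}, SE = {standard_error:.2f}")
print(f"t({df}) = {t_stat:.3f}, two-tailed p = {p_value:.3f}")
```

The printed mean, standard deviation, standard error and p-value should agree with the SPSS values quoted above to the number of decimals shown there.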
Independent-Samples T Test
The Independent-Samples T Test procedure compares means for two groups of cases.
Ideally, for this test, the subjects should be randomly assigned to two groups, so that any
difference in response is due to the treatment (or lack of treatment) and not to other
factors. This is not the case if you compare average income for males and females. A
person is not randomly assigned to be a male or female. In such situations, you should
ensure that differences in other factors are not masking or enhancing a significant
difference in means. Differences in average income may be influenced by factors such as
education (and not by sex alone).
Example. Patients with high blood pressure are randomly assigned to a placebo group and
a treatment group. The placebo subjects receive an inactive pill, and the treatment
subjects receive a new drug that is expected to lower blood pressure. After the subjects
are treated for two months, the two-sample t test is used to compare the average blood
pressures for the placebo group and the treatment group. Each patient is measured once
and belongs to one group.
1. Select Independent-Samples T Test as follows. Please find the data set entitled
‘New Drug’ on Dr. Marley’s website to follow along with this exercise as practice for
your homework.
2. Enter the drug condition as the grouping variable and the blood pressure results as the
test variable.
Using Option Subcommand enter values (in this case 1 & 2) specified for your control
and treatment drug groups. Then hit the continue button followed by the ok button on the
original screen.
3. Analyze SPSS Output for the Independent Samples T-Test. Find the Mean, Std.
Deviation and Std. Error of the Mean for each of the drug conditions.
Group Statistics

Blood Pressure
Drug        N    Mean       Std. Deviation    Std. Error Mean
New Drug    6    149.000    18.6976           7.6333
Placebo     6    129.833    16.4732           6.7252
I apologize for the blurriness of the following output, which is due to size restrictions.
The t-test for Equality of Means yields a p-value of .089, which informs us that the mean
difference of 19.1667 in blood pressure between the two groups was not statistically
significant at the two-tailed alpha = .05 level. We note that a larger sample size or a
one-tailed test might give different results.
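The pooled-variance t statistic behind this result can be reconstructed from the group summary statistics alone. The sketch below is our own check, not SPSS output. The critical values 2.228 and 1.812 are the standard t-table entries for df = 10 at alpha = .05 (two-tailed and one-tailed), and the group labels in the comments assume the first row of the output belongs to the New Drug group.

```python
import math

# Summary statistics read from the Group Statistics output
n1, mean1, sd1 = 6, 149.000, 18.6976   # assumed: New Drug
n2, mean2, sd2 = 6, 129.833, 16.4732   # assumed: Placebo

df = n1 + n2 - 2
# Pooled variance: weighted average of the two sample variances
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
# Standard error of the difference between the two means
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / se_diff

# t-table critical values for df = 10, alpha = .05
T_CRIT_TWO_TAILED = 2.228
T_CRIT_ONE_TAILED = 1.812

print(f"t({df}) = {t_stat:.3f}")
print("significant, two-tailed:", abs(t_stat) > T_CRIT_TWO_TAILED)
print("exceeds one-tailed critical value:", abs(t_stat) > T_CRIT_ONE_TAILED)
```

The statistic falls between the two critical values, which is why a one-tailed test could change the conclusion here, provided the direction of the difference was predicted in advance.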
Paired Samples T-Test
The Paired-Samples T Test procedure compares the means of two variables for a single
group. The procedure computes the differences between values of the two variables for
each case and tests whether the average differs from 0.
Example. In a study on coronary artery disease, the time a patient can spend on a
treadmill is measured while still smoking and measured again after six months of having
quit smoking. Thus, each subject has two measures, often called before and after
measures.
1. Select Paired-Samples T Test as follows. Please find the data set entitled ‘Coronary
Artery Data’ on Dr. Marley’s website to follow along with this exercise as practice for
your homework.
2. Select ‘Treadmill Time Before’ and ‘Treadmill Time After’ in the left window and
notice how Variable 1 and Variable 2 are filled in as you do so. Then press the arrow
button.
3. Notice how the two variables are paired into a relationship that indicates that the ‘time
before’ variable precedes the ‘time after’ variable. Verify that the correct sequence is
established. Then hit the okay button.
4. Analyze SPSS output. Notice the mean, sample size, standard deviation and standard
error of the mean for each of the paired variables.
Paired Samples Statistics

Pair 1                                        Mean        N     Std. Deviation    Std. Error Mean
Treadmill time in seconds before Treatment    837.44      18    197.653           46.587
Treadmill time in seconds after Treatment     617.9444    18    204.98335         48.31504
The correlation between the paired variables is .810, which is statistically significant at
the alpha = .05 level with a p-value <.001.
Paired Samples Correlations

Pair 1                                          N     Correlation    Sig.
Treadmill time in seconds before Treatment &    18    .810           .000
Treadmill time in seconds after Treatment
Finally, in examining the following output (again adjusted due to size restrictions) we see
that the mean difference in treadmill times before and after cessation of smoking was
statistically significant at the alpha = .05 level with a p-value < .001 (SPSS reports
this as .000).
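Because the paired-samples test output itself is not legible here, the t statistic can be approximated from the summary values reported above. The sketch below is our own reconstruction, not SPSS output. It uses the identity that the variance of the difference scores equals s1² + s2² − 2·r·s1·s2, so the result is only as precise as the rounded correlation of .810. The critical value 2.110 is the t-table entry for df = 17 at two-tailed alpha = .05, and the before/after labels follow our reading of the statistics output.

```python
import math

n = 18
mean_before, sd_before = 837.44, 197.653      # assumed: treadmill time before
mean_after, sd_after = 617.9444, 204.98335    # assumed: treadmill time after
r = 0.810                                     # correlation between the paired scores

df = n - 1
mean_diff = mean_before - mean_after
# Variance of the difference scores from the two SDs and their correlation
var_diff = sd_before ** 2 + sd_after ** 2 - 2 * r * sd_before * sd_after
se_diff = math.sqrt(var_diff / n)
t_stat = mean_diff / se_diff

T_CRIT_TWO_TAILED = 2.110  # t table, df = 17, alpha = .05

print(f"mean difference = {mean_diff:.2f}, SE = {se_diff:.2f}")
print(f"t({df}) = {t_stat:.2f}, significant: {abs(t_stat) > T_CRIT_TWO_TAILED}")
```

A t value this far beyond the critical value is consistent with the very small p-value reported above.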
Part II
I. Go to the Piface website at: http://www.stat.uiowa.edu/~rlenth/Power/
Choose the two-sample t test option from the menu listed on the left of your screen. If
this menu does not appear, then you must install JAVA onto your computer via the
instructions listed on the web page or use a different computer.
This tutorial presents the way to calculate the power or sample size for a two-sample
t-test given an effect size, which is calculated as the difference in means divided by the
pooled standard deviation. For the sake of clarity of explanation, the example presented
here includes samples with equal n-sizes and equal standard deviations. However, the
same principles apply in other situations.
II. Piface Options for the Two Sample T-Test with Pooled Standard Deviation. (This
information is entered into the dialog box shown below.)
Standard Deviation (sigma)
This dialog provides for power analysis of a two-sample t-test. If the "equal SDs" box is
checked, then the pooled t test is used. This is the option you will usually use.
Sample Size
You have three choices for sample-size allocation. "Equal" forces n1 = n2.
Power Slider
You may choose to solve for sample size when you click on the "Power" slider. For our
current examples we will keep the selection on sample size.
III. Example: Two Sample T-Test with Pooled Standard Deviation
Research Question:
Does the desire for food give purpose to maze walking in mice? We hypothesize that the
influence of the cheese will motivate the treatment group to race through the maze with
more speed.
Study:
We randomly assigned mice to one of two conditions. The first condition is a control
group composed of 30 mice that are simply placed within a maze and timed to see how
long each one takes to exit. The second group of 30 mice is the treatment group and
these mice are also timed to see how long it takes each one to get through the maze with
the incentive of a wafting piece of Swiss cheese at the end of the maze.
Results:
The control group had a mean exit time of 1.004 seconds while the treatment group had a
mean exit time of .589 seconds. Amazingly, the standard deviation of each sample was
the same at .63 seconds.
Power Analysis:
I. Using the formula for Cohen's d (shown below) to determine effect size, we see that our
current study, with sample means of 1.004 and .589 for the control and treatment groups
respectively and an equal standard deviation of .63, yields an effect size of .66. To
find the power of this study, we use the Piface statistical software.
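$$
\begin{aligned}
d &= \frac{M_1 - M_2}{\sqrt{(\sigma_1^2 + \sigma_2^2)/2}} \\
  &= \frac{1.004 - 0.589}{\sqrt{(0.63^2 + 0.63^2)/2}} \\
  &= \frac{0.415}{\sqrt{(0.3969 + 0.3969)/2}} \\
  &= \frac{0.415}{\sqrt{0.7938/2}} \\
  &= \frac{0.415}{\sqrt{0.3969}} \\
  &= \frac{0.415}{0.63} \\
  &\approx .66
\end{aligned}
$$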
II. In order to find the power needed at an alpha level of .05 to detect a statistically
significant difference in mean exit time for our two samples of mice with an effect size of
.66, we enter the difference of means (.415), and the standard deviations of .63
(remember, .415/.63 = .66), and the sample sizes of 30 in the dialog box by clicking the
little box in the right hand corner or sliding the n1 bar to 30.
III. Results of the Independent Two-Sample T-Test Power Analysis.
The dialog box above shows us that we have power of approximately .7085 to detect a
statistically significant mean difference in exit time between the treatment and
control conditions with an effect size of .66 at the alpha = .05 level. If you want to
increase the power to .80, move the slider bar to .80. The n size will increase as you
slide the bar to the right.
IV. Another way to input Cohen’s d is to standardize sigma to 1 and then indicate the
difference is .66 (i.e., d = .66/1 = .66). In the screen below you will see that the results
are exactly the same as above when we entered d = .415/.63 = .66.
Notice that the power (1 – β) is the same.
V. As we discussed in class, one way to increase your power is to perform a one-tailed
test. Do this by removing the checkmark from the two-tailed box.
Notice that the power has now increased to .81. This is why many prefer to use a
one-tailed test.
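The two power values from Piface can be roughly checked with a normal approximation. This is a sketch under stated assumptions, not the method Piface uses (Piface works with the noncentral t distribution, so small discrepancies are expected). The function name `approx_power` is our own, and the critical values 2.0017 and 1.6716 are t-table entries for df = 58 at alpha = .05, two-tailed and one-tailed respectively.

```python
import math
from statistics import NormalDist


def approx_power(d: float, n_per_group: int, t_crit: float) -> float:
    """Approximate power of a two-sample t-test with equal group sizes.

    Treats the test statistic as normal with mean delta = d * sqrt(n / 2)
    and ignores the negligible rejection area in the opposite tail.
    """
    delta = d * math.sqrt(n_per_group / 2)
    return NormalDist().cdf(delta - t_crit)


d = 0.415 / 0.63             # effect size from the maze example
T_CRIT_TWO_TAILED = 2.0017   # t table, df = 58, alpha = .05 two-tailed
T_CRIT_ONE_TAILED = 1.6716   # t table, df = 58, alpha = .05 one-tailed

power_two_tailed = approx_power(d, 30, T_CRIT_TWO_TAILED)
power_one_tailed = approx_power(d, 30, T_CRIT_ONE_TAILED)
print(f"two-tailed power is about {power_two_tailed:.2f}")
print(f"one-tailed power is about {power_one_tailed:.2f}")
```

The one-tailed test has more power because its critical value is smaller, so the same true difference clears the rejection threshold more often.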
VI. The other purpose of power analysis is to determine what size sample you will need
to find an effect of interest. For example, in previous studies researchers have found d =
.40 when comparing one instructional strategy to another. If you were to perform a
similar study using a two-tailed test and α = .05, how many subjects would you need per
condition if you would like to have power of .80?
First, set sigma to 1 in both groups. Second, to find a Cohen’s d = .40 insert .40 in the
difference of means box. Next, click on the box in the right corner of the Power field and
input .80. After you do this, the sample size will automatically be calculated.
Based on this analysis, to have power of .80 to find a Cohen’s d = .40, the study will
require 99 participants in each condition (N = 198).
VII. Using the previous example let’s see what happens to our sample size estimation
when we use a one-tailed test.
Now our study will require 156 participants in total (78 per condition), which is 42
fewer than are required for the two-tailed test.
VIII. In some cases a type II error is more problematic so researchers increase their
power to .90 (or even .95). In the analysis below the researcher is looking to find a
difference of 4/10 of a standard deviation (i.e., d = .40) using a two-tailed test and α =
.05.
With 132 participants per condition, the total sample size increases to 264.
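The three sample-size results in steps VI through VIII can be approximated with the usual normal-approximation formula, n per group = 2(z_alpha + z_power)² / d². The sketch below is our own and is not what Piface computes internally; the function name `n_per_group` is hypothetical. For these particular inputs the approximation rounds up to the same numbers Piface reports, although in general an exact calculation can come out a participant or so higher.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, power: float, alpha: float = 0.05,
                two_tailed: bool = True) -> int:
    """Approximate participants needed per condition for a two-sample t-test."""
    z = NormalDist()
    tail_alpha = alpha / 2 if two_tailed else alpha
    z_alpha = z.inv_cdf(1 - tail_alpha)   # critical z for the chosen alpha
    z_power = z.inv_cdf(power)            # z corresponding to the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


n_two_tailed_80 = n_per_group(0.40, 0.80)                     # step VI
n_one_tailed_80 = n_per_group(0.40, 0.80, two_tailed=False)   # step VII
n_two_tailed_90 = n_per_group(0.40, 0.90)                     # step VIII

for label, n in [("two-tailed, power .80", n_two_tailed_80),
                 ("one-tailed, power .80", n_one_tailed_80),
                 ("two-tailed, power .90", n_two_tailed_90)]:
    print(f"{label}: {n} per condition, {2 * n} in total")
```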