Practice Test 2 Answer Key

STATISTICS FOR BEHAVIORAL SCIENTISTS
Name:
You have 3 hours for the real test. This is longer, to give you more practice, but does not exhaustively
sample the possible topics to be tested.
You must show work on computational problems.
In computations, keep as much accuracy as possible in your intermediate work, but round your final answers off to two
decimal places.
1. Why do we use the pooled variance in calculations of independent samples t-statistics
and ANOVAs?
Because we believe that the different groups all have the same underlying population variance
(homogeneity of variance assumption) and that the best way to get an estimate of that true
variance is to pool together the individual estimates we get from each group.
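As a rough illustration of that pooling step, the sketch below combines group sums of squares into one variance estimate. The function name and the numbers are illustrative assumptions, not part of the test.

```python
def pooled_variance(ss_list, n_list):
    """Pool group sums of squares: sum(SS) / sum(n - 1)."""
    df_total = sum(n - 1 for n in n_list)
    return sum(ss_list) / df_total

# Illustrative values: three groups of five scores each.
ss = [16, 10, 10]
n = [5, 5, 5]

# Each group's own variance estimate, SS / (n - 1).
individual = [s / (k - 1) for s, k in zip(ss, n)]

# The pooled estimate weights each group by its degrees of freedom.
pooled = pooled_variance(ss, n)
print(individual, pooled)
```

With equal group sizes the pooled value is just the average of the individual estimates; with unequal sizes, larger groups count for more.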
2. What is the difference between testwise and experimentwise alpha? How are testwise
and experimentwise alpha affected as the number of tests increases?
Testwise alpha is the criterion we select for an individual statistical test, that limits the risk of Type
I error we deem to be acceptable for that test. Experimentwise alpha sets the risk of Type I error we
deem to be acceptable for an entire experiment (consisting of several tests, usually). As the
number of tests increases, experimentwise alpha will by default go up, unless we correct this by
setting our testwise alpha lower.
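A small sketch of how experimentwise alpha grows with the number of tests. It assumes the tests are independent, and it uses the Bonferroni division as one common correction; both the function names and that choice of correction are assumptions added here for illustration.

```python
def experimentwise_alpha(testwise_alpha, n_tests):
    """Chance of at least one Type I error across independent tests."""
    return 1 - (1 - testwise_alpha) ** n_tests

def bonferroni_testwise(experimentwise_target, n_tests):
    """Testwise alpha that keeps the experimentwise rate at or below the target."""
    return experimentwise_target / n_tests

for k in (1, 3, 10):
    print(k, round(experimentwise_alpha(0.05, k), 4))

# Lowering the testwise alpha pulls the experimentwise rate back down.
print(round(experimentwise_alpha(bonferroni_testwise(0.05, 10), 10), 4))
```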
3. How are t-tests related to F-tests? In what ways are they similar, and how are they
different?
We use a t-test to examine the difference between two group means. We use an F-test to examine
the differences between two or more group means. When there are two groups, F = t². The t is
computed by comparing the difference in means we observed in our samples to the difference we
would have expected by chance alone. F is computed by comparing the variance in our sample
means to the variance we would have expected by chance alone. Thus, F can never be negative, whereas t can be.
Both t and F are families of curves: the t distribution is indexed by one degrees-of-freedom value, the F distribution by two.
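The two-group relationship can be checked numerically. The sketch below computes a pooled-variance t and a one-way F on the same two made-up groups; the data and helper names are illustrative assumptions.

```python
from statistics import mean

def sum_sq(xs):
    """Sum of squared deviations from the group mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

def independent_t(a, b):
    """Independent-samples t using the pooled variance."""
    pooled = (sum_sq(a) + sum_sq(b)) / (len(a) + len(b) - 2)
    se = (pooled * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / se

def one_way_f(groups):
    """One-way ANOVA F: between-groups variance over within-groups variance."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum_sq(g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Made-up scores for two groups.
group_a = [0, 4, 2, 0, 4]
group_b = [6, 4, 5, 3, 7]

t = independent_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(round(t, 3), round(t ** 2, 3), round(f, 3))
```

Here t is negative because the first group has the lower mean, while F, being a ratio of variances, is not.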
4. The F-statistic is the ratio of s²_b to s²_w. Why is the F constructed this way? What does
it tell us?
The between-subjects variance tells us about the variability in the sample means we observed in
our study. If there are large differences in the group means, this variability will be large. We
expect the between-subjects variance to include chance variability and variability coming from any
true effect. The within-subjects variance is our way of estimating just the chance (or error)
variability: how variable would we expect the means to be by chance alone? If the between-subjects
variance is much larger than the within-subjects variance, we conclude that there was a
significant effect.
Practice
5. For which of the two tables are you more likely to find a significant ANOVA result?
Why?
Table 1
          X-bar   SS
Group 1   10      38
Group 2   13      35
Group 3   12      35

Table 2
          X-bar   SS
Group 4   10      38
Group 5   18      35
Group 6   9       35
A significant ANOVA result would be more likely in the second table, because the means vary
more from one another (and thus from the grand mean, which is how we calculate the
between-subjects variance). This would make the between-subjects variance higher in table two than in table
one. Both tables would have the same within-subjects variance. Thus, the F would be higher in
table 2 than in table 1.
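To make the comparison concrete, the sketch below computes F for both tables from the group means and SS values. The question does not state the group size, so five scores per group is assumed here purely for illustration. Any equal group size of two or more gives the same ordering, because both tables share the same within-groups SS.

```python
def f_from_summary(means, ss_within_groups, n_per_group):
    """One-way ANOVA F from group means and within-group SS (equal n)."""
    k = len(means)
    grand = sum(means) / k  # with equal n, the grand mean is the mean of the means
    ss_between = n_per_group * sum((m - grand) ** 2 for m in means)
    ms_between = ss_between / (k - 1)
    ms_within = sum(ss_within_groups) / (k * (n_per_group - 1))
    return ms_between / ms_within

N_PER_GROUP = 5  # assumption: the group size is not given in the question

f_table1 = f_from_summary([10, 13, 12], [38, 35, 35], N_PER_GROUP)
f_table2 = f_from_summary([10, 18, 9], [38, 35, 35], N_PER_GROUP)
print(round(f_table1, 2), round(f_table2, 2))
```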
6. What is the difference between a post-hoc test and a planned comparison? When
should you use each, and why is it appropriate for that context?
A post-hoc test is a conservative test that controls for runaway Type I error when a
researcher is running multiple tests. It is appropriate to use when you have no strong a
priori predictions and you have already gotten “permission” to go fishing in the data because
the ANOVA was significant. A planned comparison is a test that allows you to directly
assess whether you obtained the predicted pattern across groups. You use it when you
have strong a priori predictions about the pattern. It is a more liberal test than a post-hoc
test (it is permitted because you don’t have runaway Type I error with strong a priori
predictions).
7. Fill in the ANOVA table for the dataset below. Is there evidence of a significant
difference between groups?
            Group 1   Group 2   Group 3
            0         6         2
            4         4         4
            2         5         3
            0         3         1
            4         7         0
Mean        2         5         2
SS          16        10        10

Source     SS    df    s²    F
Between    30    2     15    5
Within     36    12    3
Total      66    14
With F(2, 12) = 5, you would reject the null using an alpha of .05 (critical F = 3.89), but you would not
reject the null using an alpha of .01 (critical F = 6.93). I will use an alpha of .05 and say there is
evidence of a significant difference between groups.
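The ANOVA table can be checked with a short script. The assignment of scores to groups below is a reconstruction from the flattened data table; it is an assumption, though it reproduces the group means and SS values given in the key.

```python
from statistics import mean

# Reconstructed group scores (assumed layout; matches the key's means and SS).
groups = {
    "Group 1": [0, 4, 2, 0, 4],
    "Group 2": [6, 4, 5, 3, 7],
    "Group 3": [2, 4, 3, 1, 0],
}

scores = [x for g in groups.values() for x in g]
grand_mean = mean(scores)

# Between: how far each group mean sits from the grand mean, weighted by n.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within: spread of scores around their own group mean.
ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups.values())
ss_total = sum((x - grand_mean) ** 2 for x in scores)

df_between = len(groups) - 1
df_within = len(scores) - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(ss_between, ss_within, ss_total, df_between, df_within, f_ratio)
```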
9. What are the null hypotheses that we test in a two-way ANOVA?
In a two-way ANOVA, there are three null hypotheses. The first is that there is no main effect of
factor A, the second is that there is no main effect of factor B, and the third is that there is no
interaction.
10. The number of cells in a 2 by 2 full factorial design is
a) two
b) three
c) four
d) none
C
11. Give an example of a study with a 2 by 3 design. Name the factors and levels.
Answers will vary. Example: One factor is gender. It has two levels: men and women. The other
factor is experimental condition. It has three levels: control, low dose of drug, and high dose of
drug. The experiment will look at the incidence of cancer in the 2(gender) by 3(condition) groups.
Questions 10-11. Identify the possible main effects and/or interactions you see in each
of the pictures below.
[Two line graphs, each with axes running from 0 to 9; images not recovered.]
First picture: One main effect of the variable on the X axis.
Second picture: Most notably, an interaction. Probably no main effects.
12. A statistics student has to do a final project involving two-way ANOVAs. She decides
to collect data on the physical strength of science geeks. She gets 12 science-majors
to compete against a champion arm wrestler and records the number of seconds they
last. She has a control group of 6 psychology students to compare them with. She
groups her data by major and by gender. Test for main effects and interactions in the
data below. Fill in the table below, give your F values, compare them to critical values
of F and then write a paragraph stating the hypotheses you were testing and the
conclusions you reached.
          Psych   Chem   Physics
Male      6       5      6
          6       3      4
          3       7      2
Female    3       5      1
          2       4      3
          4       3      2

Source          SS      df    s²      F
Between
  Gender        12.5    1     12.5    5.4
  Major         7       2     3.5     1.5
  Interaction   1       2     .5      .21
Within          28      12    2.33
Total           48.5    17
There is no main effect of major. There is a main effect of gender (such that men last longer). There is no
interaction between gender and major.
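A sketch of the two-way computation for this problem follows. The cell layout (three scores per gender-by-major cell, read row by row from the data table) is a reconstruction and therefore an assumption, although it reproduces the SS values in the key. The helper names are hypothetical.

```python
from statistics import mean

# Seconds lasted, by (gender, major). Assumed cell layout, three scores per cell.
cells = {
    ("Male", "Psych"): [6, 6, 3],
    ("Male", "Chem"): [5, 3, 7],
    ("Male", "Physics"): [6, 4, 2],
    ("Female", "Psych"): [3, 2, 4],
    ("Female", "Chem"): [5, 4, 3],
    ("Female", "Physics"): [1, 3, 2],
}
genders = ["Male", "Female"]
majors = ["Psych", "Chem", "Physics"]

def ss_about(values, center):
    """Sum of squared deviations of values from a given center."""
    return sum((v - center) ** 2 for v in values)

def pooled_scores(keep):
    """All scores from the cells whose (gender, major) key passes the filter."""
    return [x for key, xs in cells.items() if keep(key) for x in xs]

all_scores = pooled_scores(lambda key: True)
grand = mean(all_scores)

ss_total = ss_about(all_scores, grand)
ss_within = sum(ss_about(xs, mean(xs)) for xs in cells.values())

def ss_main(levels, position):
    """Main-effect SS: n * (level mean - grand mean)^2, summed over levels."""
    total = 0.0
    for level in levels:
        xs = pooled_scores(lambda key: key[position] == level)
        total += len(xs) * (mean(xs) - grand) ** 2
    return total

ss_gender = ss_main(genders, 0)
ss_major = ss_main(majors, 1)
# Whatever between-cell variability the main effects leave over is the interaction.
ss_interaction = ss_total - ss_within - ss_gender - ss_major

df_gender, df_major = len(genders) - 1, len(majors) - 1
df_interaction = df_gender * df_major
df_within = len(all_scores) - len(cells)
ms_within = ss_within / df_within

f_gender = (ss_gender / df_gender) / ms_within
f_major = (ss_major / df_major) / ms_within
f_interaction = (ss_interaction / df_interaction) / ms_within
print(round(f_gender, 2), round(f_major, 2), round(f_interaction, 2))
```

At alpha = .05 the critical values are F(1, 12) = 4.75 and F(2, 12) = 3.89, so only the gender F (about 5.36) exceeds its critical value; the major F (1.50) and the interaction F (about 0.21) do not.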
13. Imagine you are a college professor and you notice that fewer students appear to attend class on
afternoons when the weather is warm than when it is cold outside. To test your hunch, you
collect data regarding outside temperature and attendance from randomly selected lecture
classes for several randomly selected days during the academic year. Your data are as follows:
Temp (F)    Attendance
58          84
62          82
77          64
76          62
67          66
50          85
80          59
75          72
70          65
75          61
Mean = 69   70
SS = 842    912
a. Draw a scatterplot of the data– don’t forget to label the axes.
b. Do you feel comfortable using a Pearson’s r to calculate the correlation of these variables? Why or
why not?
Yes, the relationship appears to be linear.
c. Calculate the correlation coefficient for these data. Check its significance.
r = -.898.
t(8) = -5.78, which is beyond the two-tailed critical value of ±2.306 at alpha = .05.
It is significant.
d. What does the r tell you about the relationship between temperature and attendance?
As temperature increases, attendance decreases.
e. How much of the variance in attendance is explained by temperature?
r² = .81. 81% of the variance in attendance is explained by temperature.
f. Calculate the regression coefficients a and b
a = 134.49, b = -0.93
g. Write the regression line for these data.
Attendance’ = 134.49 - 0.93(temp)
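Parts c through g can be reproduced with the sketch below, which works from the ten temperature and attendance pairs in the question. The variable names are illustrative; intermediate values are kept unrounded, as the test instructions ask.

```python
from math import sqrt
from statistics import mean

temp = [58, 62, 77, 76, 67, 50, 80, 75, 70, 75]
attendance = [84, 82, 64, 62, 66, 85, 59, 72, 65, 61]
n = len(temp)

mean_x, mean_y = mean(temp), mean(attendance)
ss_x = sum((x - mean_x) ** 2 for x in temp)
ss_y = sum((y - mean_y) ** 2 for y in attendance)
# Sum of cross-products of the deviations.
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, attendance))

r = sp / sqrt(ss_x * ss_y)                  # Pearson correlation
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)      # significance test, df = n - 2
r_squared = r ** 2                          # proportion of variance explained

b = sp / ss_x                               # slope
a = mean_y - b * mean_x                     # intercept

print(round(r, 3), round(t, 2), round(r_squared, 2), round(a, 2), round(b, 2))
```

The slope and intercept are computed from the unrounded sums, so the fitted line passes through the point of means (69, 70).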
h. Describe how you would compute an F-test for this regression model.
We would get s²_model, which describes the variance in Y explained by our regression model (based on Y’ - Y-bar). We
would compare that to s²_resid, which describes the variance in Y not explained by our model (based on Y - Y’). s²_resid is our
estimate of the error variance. We form F = s²_model / s²_resid; if s²_model is substantially bigger, we would reject the null.
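The description above can be sketched for the temperature and attendance data as follows. The degrees of freedom (1 for the model, n - 2 for the residual) are the standard ones for a single-predictor regression; the variable names are illustrative.

```python
from statistics import mean

temp = [58, 62, 77, 76, 67, 50, 80, 75, 70, 75]
attendance = [84, 82, 64, 62, 66, 85, 59, 72, 65, 61]

mean_x, mean_y = mean(temp), mean(attendance)
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, attendance))
     / sum((x - mean_x) ** 2 for x in temp))
a = mean_y - b * mean_x
predicted = [a + b * x for x in temp]

# Explained variability: predicted scores around the mean of Y.
ss_model = sum((p - mean_y) ** 2 for p in predicted)
# Unexplained variability: observed scores around their predictions.
ss_resid = sum((y - p) ** 2 for y, p in zip(attendance, predicted))

df_model = 1
df_resid = len(temp) - 2

f_ratio = (ss_model / df_model) / (ss_resid / df_resid)
print(round(ss_model, 2), round(ss_resid, 2), round(f_ratio, 2))
```

With one predictor, this F equals the square of the t computed for the correlation.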
14. Explain why the multiple R² in a multiple regression is not simply the sum of the R² values of the individual
regression lines.
There may be some overlap in the Y variance that each predictor is able to account for. If the two predictors
are somewhat correlated, you will overestimate the multiple R² when you add together the individual R² values,
because you have double-counted the overlap.
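For the two-predictor case, the overlap can be illustrated with the standard formula for multiple R² in terms of the three pairwise correlations. The correlation values below are made up for illustration.

```python
def multiple_r2(r_y1, r_y2, r_12):
    """R² for Y regressed on two predictors, from the pairwise correlations.

    r_y1, r_y2: correlation of each predictor with Y.
    r_12: correlation between the two predictors.
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical correlations with the outcome.
r_y1, r_y2 = 0.6, 0.5
sum_of_r2 = r_y1 ** 2 + r_y2 ** 2

uncorrelated = multiple_r2(r_y1, r_y2, 0.0)   # predictors share nothing
correlated = multiple_r2(r_y1, r_y2, 0.4)     # predictors overlap

print(round(sum_of_r2, 4), round(uncorrelated, 4), round(correlated, 4))
```

When the predictors are uncorrelated, the multiple R² equals the simple sum; with these positively correlated predictors it falls below the sum.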
15. What are two research questions a multiple regression would allow you to answer that would be difficult
to study with other analyses you’ve learned about in this class? (Explain why).
What is the effect of predictor 1 when controlling for predictor 2? How well can you predict an outcome
variable when you use several predictors together? What interaction is there between a categorical variable
and a continuous variable in predicting an outcome? Etc. etc.
16. Why do we say multiple regression is a more general mathematical model than the other tests we’ve
learned about?
You can use multiple regression to answer all the same questions as both ANOVA and t-tests (with the same
results), as well as tackle many more types of questions.
17. A social psychology student is doing a study of conformity. In her high pressure condition, she tries to get
people she knows well to conform. In her low pressure condition, she tries to get people she doesn’t know
well to conform. She records the number of people who do and do not conform.
             High pressure     Low pressure
Conform      25 (fe = 24.08)   17 (fe = 17.92)
No conform   18 (fe = 18.92)   15 (fe = 14.08)
b. Use a two-way chi-square test to examine the same question. What are χ²test, χ²crit, and your final decision?
χ²calc = 0.19, χ²crit = 3.84 (df = 1), retain null.
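The expected frequencies and the test statistic can be reproduced with the sketch below, which uses the observed counts from the table. Variable names are illustrative.

```python
# Rows: conform, no conform. Columns: high pressure, low pressure.
observed = [
    [25, 17],
    [18, 15],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: row total * column total / N.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CHI_CRIT_05_DF1 = 3.84  # critical value for alpha = .05, df = 1
print([[round(e, 2) for e in row] for row in expected], round(chi_sq, 2), df)
print("reject null" if chi_sq > CHI_CRIT_05_DF1 else "retain null")
```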
18. A movie producer wants to find out what kind of audience her latest movie has found among adult
viewers. She expects it to have universal appeal. She screens the movie in a large city and observes the
breakdown of genders and ages that are in attendance. Write in the expected frequencies you would use to
compute a chi-squared test of the hypothesis that the movie had universal appeal.
                   18-24   25-31   32 & over
WOMEN (observed)   10      40      44
      (expected)   28.8    28.8    28.8
MEN   (observed)   8       25      46
      (expected)   28.8    28.8    28.8
Tricky question. This is not a chi-squared test of independence. If the movie had universal appeal there
would be equal numbers of people from all age groups and equal numbers of women and men.
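A sketch of where the 28.8 comes from: under universal appeal, the total audience is split evenly across the six gender-by-age cells. The chi-squared statistic at the end goes one step beyond what the question asks and is included only to show how the expected frequencies would be used.

```python
# Observed attendance by gender and age group (18-24, 25-31, 32 & over).
observed = {
    "Women": [10, 40, 44],
    "Men": [8, 25, 46],
}

counts = [c for row in observed.values() for c in row]
total = sum(counts)

# Equal appeal: every cell is expected to hold the same share of the audience.
expected = total / len(counts)

# Goodness-of-fit statistic against the equal-frequency expectation.
chi_sq = sum((o - expected) ** 2 / expected for o in counts)
df = len(counts) - 1

print(total, round(expected, 1), round(chi_sq, 2), df)
```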