Homework #1:

Psych 3101
Keller
1
3/16/16
Name:____________________________________
ANOVA HOMEWORK
DIRECTIONS:
1) Please turn in hard copies of homeworks in lab the week after they are assigned. Write your answers out in a document (e.g., Microsoft Word). You may work in groups on R-related material, but make sure to code the work yourself and come up with independent answers.
2) You have been given an (incomplete) R script. Do your R work IN THIS SCRIPT and turn it in along with your homework. Saving this script will be helpful for future homeworks and tests.
3) If you want to save graphics, the easiest way is to click on the graphic, then go to FILE, then “Save As”, and save it as a PDF (or another format if you are on Windows and can do so). You should be able to simply drag the PDF into your Microsoft Word homework file and resize it thereafter.
4) In case you want it as a reference, a PDF of the survey taken during the first lab is here: http://psych.colorado.edu/wiki/doku.php?id=courses:keller:3101:other
ANOVA Q1) Read the question below, then work through the script “ANOVA.R” in R, then
return to this HW to write out your answer based on your R results.
A) The Big Five agreeableness scale is a measure of empathy; it was measured during the first lab
session, along with a question about your favorite color. There is a hypothesis that a person’s
favorite color is associated with their personality. In particular, people who choose ‘cool’ colors
(like blue and purple) are thought to be more empathic (interested in others, sympathetic to
others’ feelings) than people who choose ‘warm’ colors (like orange or red) or neutral colors
(like green). Perform an Analysis of Variance (ANOVA) to test this hypothesis. What are your
alternative and null hypotheses?
B) Create an ANOVA summary table of your results.
C) What are your F-value and p-value? What do you conclude?
D) A low p-value (a significant finding) merely indicates that a result was unlikely to have arisen
by chance. It doesn’t tell us how “large” an effect is (very small mean differences between
groups – i.e., a small effect – can nevertheless be significant if the sample size is very large).
Provide an estimate of the “effect size” (an r²) for the effect you found above. How much
variation in agreeableness is “explained” by a person’s favorite color?
E) Write a four-sentence summary of your findings. Make sure to include an estimate of the
“effect size” (i.e., r²) of the effect you found, and include a side-by-side boxplot of agreeableness
as a function of favorite color below your four-sentence summary.
F) Was this an experiment or not? Can we make causal inference based on this study (this is a
hypothetical, so answer irrespective of whether or not your results were significant)?
G) If you collected data on hundreds of additional students (the exact number is kept intentionally
vague), but your effect size estimate (r² estimate) stayed the same, what would happen to your
p-value? Explain, intuitively, why this occurs.
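The relationship between effect size, sample size, and F that parts D and G ask about can be explored numerically. The following is a minimal Python sketch, not part of the ANOVA.R script; the function names and the sums of squares are hypothetical and chosen only for illustration. It assumes r² = SS_between / SS_total and uses the algebraically equivalent form F = [r² / (k − 1)] / [(1 − r²) / (N − k)].

```python
def r_squared(ss_between, ss_total):
    """Proportion of total variation explained by group membership."""
    return ss_between / ss_total


def f_from_r_squared(r2, k, n_total):
    """F statistic implied by r2 for k groups and n_total observations."""
    return (r2 / (k - 1)) / ((1 - r2) / (n_total - k))


# Hypothetical sums of squares (not the survey data).
r2 = r_squared(ss_between=42.0, ss_total=48.0)

# Same r2 and same number of groups, two hypothetical sample sizes.
f_small = f_from_r_squared(r2, k=3, n_total=9)
f_large = f_from_r_squared(r2, k=3, n_total=90)
print(r2, f_small, f_large)
```

Turning F into a p-value additionally requires the F distribution with (k − 1, N − k) degrees of freedom, which this sketch does not compute.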
ANOVA Q2) Do this problem by hand and show your work. You may use the formulas
below to answer these questions, or may choose to use the ones from your book.
There are k groups, with n_j individuals within group j and N individuals total. The
subscript j indexes group, so j=1 for group 1, j=2 for group 2, and so forth. X̄_j is the mean
of group j, and X̄ (with no subscript) stands for the “grand mean,” or the mean of all the data
points. Note that these formulas are equivalent to those in your book but are written differently
(and, in my opinion, are more intuitive and easier). You can choose to memorize the formulas
below or the ones that are in your book; it’s up to you. These are the only formulas you’ll need
to know for one-way ANOVA:
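$$SS_{\text{between}} = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{X})^2 \qquad df_{\text{between}} = k - 1 \qquad MS_{\text{between}} = SS_{\text{between}} / df_{\text{between}}$$

$$SS_{\text{within}} = \sum_{j=1}^{k} SS_j = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_i - \bar{X}_j)^2 \qquad df_{\text{within}} = N - k \qquad MS_{\text{within}} = SS_{\text{within}} / df_{\text{within}}$$

$$F = MS_{\text{between}} / MS_{\text{within}}$$

$$SS_{\text{Total}} = SS_{\text{between}} + SS_{\text{within}} = \sum_{i=1}^{N} (X_i - \bar{X})^2 \qquad df_{\text{Total}} = df_{\text{between}} + df_{\text{within}} = N - 1$$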
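As an illustration of how the formulas above fit together, here is a minimal Python sketch that computes each quantity step by step. The function name `one_way_anova` and the data are hypothetical (the numbers are made up and are not the homework scores); the by-hand work and the R script remain the intended way to answer the questions.

```python
from statistics import mean


def one_way_anova(groups):
    """Compute the one-way ANOVA summary quantities for a list of groups."""
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # N
    grand_mean = mean([x for g in groups for x in g])

    # SS_between: group size times squared distance of group mean from grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS_within: squared distance of each score from its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


# Hypothetical scores for three groups of three people each.
result = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]])
for name, value in result.items():
    print(name, value)
```

The returned dictionary mirrors the rows of an ANOVA summary table (between, within, total), which makes it easy to check that SS_between and SS_within add up to SS_Total.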
Here are the scores on a statistical reasoning test from people who had taken a stats class at CU:
GROUP1: 10, 15, 17, 18
Here are the scores on a statistical reasoning test from people who had taken a stats class at CSU:
GROUP2: 6, 8, 10, 12
Here are the scores on a statistical reasoning test from people who had taken a class at NU:
GROUP3: 10, 12, 12, 14
We are interested in whether statistical reasoning test scores differ depending on where
the statistics class was taken.
A) What are the null and alternative hypotheses?
B) What is your alpha level? Give an intuitive explanation behind what this number means.
C) What is the mean of Group 1? Of Group 2? Of Group 3?
D) Fill out an ANOVA summary table for the above data.
G) What is your F-value? How many degrees of freedom does this test have? Use Table B4 in
the back of your book to understand whether to reject the null hypothesis or not. What is your
conclusion?
H) What is your estimate of effect size (r²)?
I) Explain what would happen to your estimates of F, p, and r² under the following scenarios:
i) The differences between the means get larger, but the variation within the groups
remains the same.
ii) The variation within the groups gets larger, but the differences between the means
remain the same.
iii) The differences between the means stay the same, the variation within the groups
remains the same, but the sample size for each group gets larger.
iv) You learn that the first score in each group comes from the same person, who took the
class at each school, the second score in each group comes from the same person, and so forth.
You therefore perform a repeated-measures ANOVA rather than a normal ANOVA.
J) Open up the “ANOVA.R” script and perform the analysis you just did by hand above in R.
What are your F-value and p-value from the R analysis? Are the answers you got in R the same
as those you obtained by hand?
ANOVA Q3) This problem should be done in R after Q2 above.
OK, let’s assume that the same person took the statistical reasoning test 3 times, once after
having taken the class at CU, once after having taken it at NU, and once after taking it at CSU.
The first set of 3 scores (10, 6, and 10) are for the same person, the second set of 3 scores (15, 8,
and 12) are for the same person, and so forth. For purposes of demonstration, let us assume that
people were given a memory-erasing drug after taking the class and pre-test (such that no
learning occurred between taking the course at one school and then another). Silly, I know, but…
Open up your R script and complete Question 3, then answer the following questions:
A) Fill out an ANOVA summary table for your results. Note that the Residual row under
“Error: person” in the R output corresponds to what we call the “between subject” information,
and the Residual row under “Error: Within” in the R output corresponds to what we call the
“error” information.
B) Write a 4-sentence summary of your findings.
C) Why did the F value change between your result in the above question vs. this question?
Explain this intuitively in a few sentences.
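To make the “between subject” and “error” rows concrete, here is a minimal Python sketch of how a repeated-measures ANOVA splits up the variation. It assumes the standard partition SS_Total = SS_conditions + SS_subjects + SS_error with df_error = (k − 1)(n − 1); the function name `repeated_measures_anova` and the data are hypothetical and are not the homework scores. In the sketch, `ss_subjects` plays the role of the “between subject” row and `ss_error` the role of the “error” row.

```python
from statistics import mean


def repeated_measures_anova(scores):
    """scores[i][j] is the score of person i under condition j."""
    n = len(scores)        # number of people
    k = len(scores[0])     # number of conditions (e.g., schools)
    grand = mean([x for row in scores for x in row])
    person_means = [mean(row) for row in scores]
    cond_means = [mean([row[j] for row in scores]) for j in range(k)]

    # Variation between condition means (the effect of interest)
    ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
    # Variation between people ("between subject" information)
    ss_subjects = k * sum((m - grand) ** 2 for m in person_means)
    # What is left after removing condition and person effects ("error")
    ss_error = sum(
        (scores[i][j] - person_means[i] - cond_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )

    df_conditions = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_conditions / df_conditions) / (ss_error / df_error)
    return {
        "ss_conditions": ss_conditions,
        "ss_subjects": ss_subjects,
        "ss_error": ss_error,
        "df_conditions": df_conditions,
        "df_error": df_error,
        "f": f_value,
    }


# Hypothetical data: three people (rows), each measured under three conditions.
rm_result = repeated_measures_anova(
    [[1.0, 3.0, 6.0], [2.0, 2.0, 8.0], [3.0, 4.0, 7.0]]
)
print(rm_result)
```

Compared with a one-way ANOVA on the same numbers, the within-group variation is split into a person part and a leftover error part, and only the leftover part goes into the denominator of F.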