Statistics Portfolio
Section A: Issues in Research Design
Q1. What are differential carry-over effects? [50 words]
The difference in magnitude of the residual (or carry-over) effect of one condition on another in one
testing order, compared to another testing order (Pesarin & Salmaso, 2010). For example, the residual
effect of condition 1 on condition 2 is different from the residual effect of condition 2 on condition 1.
Q2. Table 1 contains information on the complete counterbalancing sequences for a within-participants design that has 6 conditions. The arrangement is digram-balanced. Explain what this
means. [20 words]
Each condition appears once in each ordinal position and precedes and follows every other condition
once (Myers & Hansen, 2012).
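The two properties named in this answer can be checked mechanically. The sketch below is only an illustration: it assumes the standard Williams construction for an even number of conditions, which may not be the exact arrangement shown in Table 1, and the function names `williams_square` and `adjacent_pairs` are made up for this example.

```python
from collections import Counter


def williams_square(n):
    """Build a digram-balanced Latin square for an even number of conditions."""
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # The first sequence interleaves low and high labels: 0, 1, n-1, 2, n-2, ...
    first = [0]
    low, high = 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Every other sequence adds a constant to each label (mod n).
    return [[(c + shift) % n for c in first] for shift in range(n)]


def adjacent_pairs(square):
    """Count how often condition a is immediately followed by condition b."""
    return Counter(
        (row[i], row[i + 1]) for row in square for i in range(len(row) - 1)
    )


square = williams_square(6)
pairs = adjacent_pairs(square)
for row in square:
    print(" ".join("ABCDEF"[c] for c in row))
```

With six conditions there are 30 ordered pairs of different conditions, and in this arrangement each of them occurs as an immediate sequence exactly once, while every column (ordinal position) contains each condition once.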
Q3. How does a cyclic counterbalancing arrangement differ from one that is digram-balanced? [40
words]
A cyclic counterbalancing arrangement, unlike a digram-balanced arrangement, does not ensure that
each condition precedes and follows every other condition once (Myers & Hansen, 2012). For example,
in a six-condition cyclic counterbalancing arrangement, all conditions are preceded and followed by the
same condition five times.
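The "five times" figure can be reproduced with a short count. This is a sketch that assumes the cyclic arrangement is built by rotating one base sequence by one position per row; `cyclic_square` and `count_adjacent_pairs` are illustrative names, not taken from the cited text.

```python
from collections import Counter


def cyclic_square(n):
    """Each sequence is the previous one rotated by one position."""
    return [[(position + shift) % n for position in range(n)] for shift in range(n)]


def count_adjacent_pairs(square):
    """Count how often condition a is immediately followed by condition b."""
    return Counter(
        (row[i], row[i + 1]) for row in square for i in range(len(row) - 1)
    )


cyclic = cyclic_square(6)
cyclic_pairs = count_adjacent_pairs(cyclic)
for (first, second), count in sorted(cyclic_pairs.items()):
    print(f"{'ABCDEF'[first]} -> {'ABCDEF'[second]}: {count} times")
```

Only six of the 30 possible ordered pairs ever occur, each of them five times, which is why a cyclic arrangement controls ordinal position but not which condition comes immediately before another.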
Q4. State one advantage and one disadvantage of using a cyclic counterbalancing arrangement. [45
words]
As the number of possible orders grows factorially with the number of conditions used, a cyclic
counterbalancing arrangement, which uses far fewer sequences, is more practical to use (Girden, 1992).
However, all conditions are preceded and followed by the same condition more than once, which does
little to reduce carry-over effects (Girden, 1992).
Q5. State one advantage and one disadvantage of using a digram-balanced arrangement. [45 words]
As each condition precedes and follows every other condition once, a digram-balanced arrangement
significantly reduces carry-over effects (Girden, 1992). However, the reduction of carry-over effects is
not absolute as each condition does not precede and follow every other condition equally often at each
ordinal position (Girden, 1992).
Q6. Why should a researcher be interested in effect sizes? [4 arguments: 60 words]
Using null hypothesis testing alone is arbitrary, dependent upon sample size and severely limiting to
what can be concluded (Cohen, 1990). Moreover, the null hypothesis is never true in social science data
(Cohen, 1990). Effect size provides a measure of the magnitude of an observed effect (Field, 2005).
Effect size can be standardised, and thus compared across different measures (Field, 2005). Effect size
allows one to estimate the likely size of an effect in the target population (Field, 2001).
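As an illustration of a standardised effect size, the sketch below computes Cohen's d (the mean difference divided by the pooled standard deviation). The data are hypothetical values invented for the example, and `cohens_d` is an illustrative helper rather than anything from the cited sources.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical search times in milliseconds (illustrative values only).
absent_ms = [1600, 1800, 2000]
present_ms = [1100, 1300, 1500]

d_ms = cohens_d(absent_ms, present_ms)
# The same scores expressed in seconds give the same d.
d_s = cohens_d([t / 1000 for t in absent_ms], [t / 1000 for t in present_ms])
print(round(d_ms, 3), round(d_s, 3))
```

Because d is expressed in standard deviation units, rescaling the measure leaves it unchanged, which is what makes comparison across different measures possible.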
Q7. Supporting your comments with references, explain what statistical power refers to and why this
is important for psychological research. [60 words]
Power is the probability of detecting a significant effect, if one exists (Cohen, 1990). It matters
because an underpowered study is likely to miss a genuine effect. Power is influenced by a number of
factors (Cohen, 1990), including effect size (e.g. a smaller effect size requires a larger sample to reach
the same power), the accepted probability level (e.g. a smaller probability level reduces power) and the
number of participants tested (e.g. the larger the sample, the greater the power).
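How the three factors move power can be sketched numerically. The example below uses a normal approximation for a two-sided comparison of two independent groups of equal size; that design, the approximation (which ignores the negligible opposite tail) and the name `approx_power` are assumptions made for illustration, not the method of any cited source.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the true effect is d.
    noncentrality = effect_size_d * sqrt(n_per_group / 2)
    return NormalDist().cdf(noncentrality - z_crit)


for d in (0.2, 0.5, 0.8):
    for n in (20, 64, 200):
        print(f"d = {d}, n per group = {n}: power ~ {approx_power(d, n):.2f}")
```

Under this approximation, power rises with effect size and sample size and falls when a stricter probability level is adopted.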
Section B: Factorial ANOVA and Interactions
Q1. Inspect Table 2. The data are taken from an experiment examining visual search performance in
which participants searched for a target among distractors. The within-participants factors were trial
type (the target was either present or absent from the display) and target type (digits or letters) and
the between-participants factor was working memory group. Are the patterns of means suggestive
of any main effects? Explain your answer. [60 words]
Visual search performance was considerably quicker in target present trials (M = 1403.75) relative to
target absent trials (M = 1887.25), when participants searched for letters (M = 1579.25) as opposed to
digits (M = 1711.75), and for participants with high working memory (M = 1362) relative to those with
low working memory (M = 1929). The means are suggestive of a main effect for all three independent
variables.
Q2. The data on Blackboard are taken from an experiment that examined visual search performance
in men and women. Participants searched for a target item among distractors in displays with a
small set size (12 items in the display) and a large set size (24 items in the display). The target was
present in the display on 50% of trials but absent on the remaining trials. The dependent variable
was search times recorded in milliseconds (ms). Using the descriptive data available, produce an APA
table of means and standard deviations. Remember to be consistent with decimal places and to
include an appropriate table number and title.
Table 1. Mean (Standard Deviation) Visual Search Times (in ms) as a Function of Trial Type, Set Size and
Gender

Trial Type                 Set Size (12 items)    Set Size (24 items)
Male
  Target Present Trials    993.26 (155.49)        1223.51 (145.10)
  Target Absent Trials     1698.25 (252.71)       2116.26 (191.66)
Female
  Target Present Trials    994.88 (152.62)        1256.45 (181.84)
  Target Absent Trials     1510.25 (229.03)       1985.38 (265.04)
Q3. Using the SPSS data output on Blackboard, report the results of the study (all main effects and
interactions) in a narrative. You should include F ratios, MSEs, p values and effect sizes. State the
exact p value unless this is < .001. You should refer to all effects by variable name (i.e. the main
effect of set size; the interaction between set size and trial type) and ensure that you make it clear
whether each effect is significant. [160 words]
The ANOVA revealed significant main effects of trial type [F (1, 38) = 715.459, MSE = 28224.170, p <
.001, ηp² = .950] and set size [F (1, 38) = 422.092, MSE = 11360.995, p < .001, ηp² = .917].
However, no significant main effect was found for gender [F (1, 38) = 1.706, MSE = 118475.617, p = .199,
ηp² = .043]. No significant three-way interaction was found between trial type, set size and gender [F
(1, 38) = .380, p = .541, ηp² = .010]. A significant interaction was found between trial type and set size [F
(1, 38) = 91.919, MSE = 4381.081, p < .001, ηp² = .708] and between trial type and gender [F (1, 38) =
11.064, p = .002, ηp² = .226]. However, no significant interaction was found between gender and set size
[F (1, 38) = 1.722, p = .197, ηp² = .043].
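The reported effect sizes can be cross-checked against the F ratios, because partial eta squared equals (F × df effect) / (F × df effect + df error). The sketch below applies that identity to the values in the narrative; `partial_eta_squared` is an illustrative helper name, and small discrepancies in the third decimal place would only reflect rounding.

```python
def partial_eta_squared(f_ratio, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_ratio * df_effect) / (f_ratio * df_effect + df_error)


# (F ratio, reported partial eta squared); every effect has df = (1, 38).
reported = {
    "trial type": (715.459, 0.950),
    "set size": (422.092, 0.917),
    "gender": (1.706, 0.043),
    "trial type x set size x gender": (0.380, 0.010),
    "trial type x set size": (91.919, 0.708),
    "trial type x gender": (11.064, 0.226),
    "gender x set size": (1.722, 0.043),
}

recomputed = {
    name: partial_eta_squared(f_ratio, 1, 38) for name, (f_ratio, _) in reported.items()
}
for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.3f}, reported {reported[name][1]:.3f}")
```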
Q4. Using the SPSS data file on Blackboard run the appropriate post hoc tests to examine the trial
type x set size interaction and report the results of the post hocs in an appropriate narrative. [80
words]
Within-subjects t-tests revealed that participants’ visual search times were significantly quicker
for the small set size (12 items) than for the large set size (24 items) in both the present target trials [t
(39) = 15.636, p < .001] and the absent target trials [t (39) = 19.098, p < .001]. Moreover, visual search
times were quicker for present relative to absent target trials for both the small set size [t (39) = 19.413,
p < .001] and the large set size [t (39) = 25.603, p < .001].
Q5. Summarise the trial type x set size interaction using the descriptive information where necessary
to support your explanation. [120 words]
The reduction in visual search times between absent and present target trials was greater for the large
set size (MD = 810.85, SD = 200.30) relative to the small set size (MD = 610.18, SD = 198.79). This
suggests that trial type has a greater impact on visual search times for the large set size, relative to the
small set size. Moreover, there is a greater reduction in visual search times between the large and small
set size for absent target trials (MD = 446.58, SD = 147.89) relative to present target trials (MD =
245.91, SD = 99.47). This suggests that set size has a greater impact on visual search times for absent
relative to present target trials.
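The mean differences quoted here can be approximately recovered from the cell means in Table 1. The sketch assumes equal numbers of men and women, so that a simple average of the two gender means gives the overall cell mean; that group split is an assumption, and agreement is only expected to within rounding of the published means.

```python
# Cell means from Table 1, keyed by (gender, trial type, set size).
cell_means = {
    ("male", "present", 12): 993.26,
    ("male", "present", 24): 1223.51,
    ("male", "absent", 12): 1698.25,
    ("male", "absent", 24): 2116.26,
    ("female", "present", 12): 994.88,
    ("female", "present", 24): 1256.45,
    ("female", "absent", 12): 1510.25,
    ("female", "absent", 24): 1985.38,
}


def mean_over_gender(trial_type, set_size):
    """Average the male and female means (assumes equal group sizes)."""
    male = cell_means[("male", trial_type, set_size)]
    female = cell_means[("female", trial_type, set_size)]
    return (male + female) / 2


# Effect of trial type (absent minus present) at each set size.
trial_type_effect = {
    size: mean_over_gender("absent", size) - mean_over_gender("present", size)
    for size in (12, 24)
}
# Effect of set size (24 minus 12 items) for each trial type.
set_size_effect = {
    trial: mean_over_gender(trial, 24) - mean_over_gender(trial, 12)
    for trial in ("present", "absent")
}
print(trial_type_effect)
print(set_size_effect)
```

The interaction is visible as the gap between the two values in each dictionary: the trial type effect is larger at 24 items, and the set size effect is larger on absent trials.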
Q6. The post hoc tests for the trial type x gender interaction have been conducted for you (see SPSS
output on Blackboard). With reference to this information and the Bonferroni correction, explain
what the interaction shows. Use descriptive information where necessary to support your
explanation. [120 words]
Accounting for the Bonferroni correction, there was no significant difference in visual search times
between males and females on either absent target trials (p = .031) or present target trials (p = .720).
However, females (M = 1747.82, SD = 237.04) were quicker than males (M = 1907.26, SD = 211.34) on
absent target trials, whereas males (M = 1108.39, SD = 142.08) were marginally quicker than females (M =
1125.66, SD = 160.19) on present target trials. Thus, a small, albeit non-significant, effect of gender was
found only for absent target trial visual search times. Visual search times were significantly lower for
present relative to absent target trials for both females (p < .001) and males (p < .001). The difference in
visual search times between absent and present target trials was greater for females (MD = 798.87, SD
= 192.60) relative to males (MD = 622.16, SD = 139.11), suggesting that, relative to males, trial type has
a greater impact on females’ visual search times.
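The Bonferroni decision can be sketched as follows. The number of comparisons is not stated in the answer, so the sketch assumes the four post hoc tests it mentions and a family-wise alpha of .05; both are assumptions, and 0.001 is used only as an upper bound for the values reported as p < .001.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


# p values from the narrative; 0.001 stands in as an upper bound for "p < .001".
post_hoc_p = {
    "gender on absent trials": 0.031,
    "gender on present trials": 0.720,
    "trial type for females": 0.001,
    "trial type for males": 0.001,
}

adjusted_alpha = bonferroni_alpha(0.05, len(post_hoc_p))  # assumed family of four tests
significant = {label: p < adjusted_alpha for label, p in post_hoc_p.items()}
for label, is_significant in significant.items():
    print(f"{label}: p = {post_hoc_p[label]}, significant = {is_significant}")
```

Under these assumptions the gender difference on absent trials (p = .031) would pass an uncorrected .05 threshold but not the corrected one, which is the point the answer makes.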
Q7. Write three to four sentences detailing what the data in Figure 1 suggests about the potential
results (see Appendix 1). [80 words]
The data is illustrative of a main effect for task type: mean completion times for both women (20 s) and
men (80 s) for motor task 1 are considerably less than for women (50 s) and men (90 s) for motor task
2. The data is illustrative of a main effect for gender: mean completion times are lower for women (20
s) relative to men (80 s) for motor task 1 and lower for women (50 s) relative to men (90 s) for motor
task 2. The data is illustrative of an interaction: there is a bigger difference in mean completion times
between women and men for motor task 1 (60 s) relative to motor task 2 (40 s).
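The three patterns described can be read off the four cell means with simple arithmetic. The sketch below uses the approximate values quoted in the answer (taken from Figure 1, so they are estimates) and assumes equal cell sizes when averaging.

```python
# Approximate mean completion times (s) as read from Figure 1.
means = {
    ("women", "task 1"): 20,
    ("men", "task 1"): 80,
    ("women", "task 2"): 50,
    ("men", "task 2"): 90,
}

# Marginal means (simple averages, assuming equal cell sizes).
task_means = {
    task: (means[("women", task)] + means[("men", task)]) / 2
    for task in ("task 1", "task 2")
}
gender_means = {
    gender: (means[(gender, "task 1")] + means[(gender, "task 2")]) / 2
    for gender in ("women", "men")
}

# Gender gap within each task, and the difference between the two gaps.
gender_gap = {
    task: means[("men", task)] - means[("women", task)]
    for task in ("task 1", "task 2")
}
interaction_contrast = gender_gap["task 1"] - gender_gap["task 2"]
print(task_means, gender_means, gender_gap, interaction_contrast)
```

A non-zero difference between the two gender gaps is what the plot shows as non-parallel lines; whether it is statistically reliable would still depend on the ANOVA.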
Section C: Linear and Logistic Regression
Q1. Comment briefly on the assumptions for both hierarchical linear regression and logistic
regression. [40 words]
According to Field (2009), hierarchical linear regression requires: normality, linearity and
homoscedasticity of residuals, the criterion variable to be interval/ratio level, significant outliers to be
removed and very little or no multicollinearity between predictors. At least 15 cases per predictor are
recommended (Stevens, 1996). For logistic regression, 50 cases per predictor are recommended, and the
criterion variable must be dichotomous, with categories that are mutually exclusive and exhaustive
(Field, 2009).
Q2. The SPSS data output on Blackboard shows a hierarchical regression analysis in which the
criterion was spelling performance in children. Non-verbal reasoning skills (Raven’s) were entered
first, followed by two language measures (TROG & expressive vocabulary) and phoneme awareness
(spoonerisms and alliteration) was entered in the final model (see Appendix A for a summary of the
tests). Present an APA formatted table to show the B, SE of B and Beta values for the data shown.
Remember to number the table and to provide an appropriate table title.
Table 2. Hierarchical Multiple Regression with Spelling Performance as the Criterion and Non-verbal
Reasoning Skills (Raven’s), Language Measures (Test of Reception of Grammar & Expressive Vocabulary)
and Phoneme Awareness (Spoonerisms & Alliteration) as Predictors.

Predictor                         b         SE b      β
Block 1
  Constant                        15.236    2.977
  Raven’s                         .556      .135      .411**
Block 2
  Constant                        -.899     6.460
  Raven’s                         .465      .135      .343**
  Test of Reception of Grammar    .298      .116      .384*
  Expressive Vocabulary           -.166     .168      -.146
Block 3
  Constant                        8.417     4.307
  Raven’s                         .236      .101      .174*
  Test of Reception of Grammar    .101      .085      .130
  Expressive Vocabulary           -.246     .120      -.216*
  Spoonerisms                     .521      .073      .639**
  Alliteration                    .190      .122      .139

R² = .169, Adjusted R² = .159 (Block 1); R² = .247, Adjusted R² = .220 (Block 2); R² = .630, Adjusted R² = .607 (Block 3); R²
change = .169 (Block 1); R² change = .079 (Block 2); R² change = .383 (Block 3). ** p < .001; * p < .05.
Q3. Using the SPSS output on Blackboard, report all the results of the multiple regression analysis
(see lecture and guidance notes). Provide an interpretation for these results including a statement of
what beta values represent and therefore what they show for this particular study. [300 – 350 words]
The ANOVA revealed that the regression model was significant for block 1 [F (1, 84) = 17.042, MSE =
19.161, p < .001], block 2 [F (3, 82) = 8.983, MSE = 17.771, p < .001] and block 3 [F (5, 80) = 27.227, MSE
= 8.958, p < .001]. The proportion of variance in spelling explained by block 1 was 15.9% (R² = .169;
Adjusted R² = .159); this rose to 22% for block 2 (R² = .247; Adjusted R² = .220) and rose considerably
more to 60.7% for block 3 (R² = .630; Adjusted R² = .607). The change statistics showed that the
proportion of variance explained by block 1 (16.9%) was significant [F Change (1, 84) = 17.042, p < .001].
The additional 7.9% of variance explained by block 2 [F Change (2, 82) = 4.387, p = .017] and the
additional 38.3% of variance explained by block 3 [F Change (2, 80) = 41.336, p < .001] were both
significant. Thus, the inclusion of Phoneme Awareness measures in block 3 explained substantially
more variance in spelling on top of the combined 22% of variance explained by Non-verbal Reasoning
Skills and Language Measures in block 2.
On block 1, Raven’s emerged as a significant predictor of spelling (β = .411, t = 4.128, p < .001) and
remained so on block 2 (β = .343, t = 3.439, p < .001) and block 3 (β = .174, t = 2.342, p = .022). The Test
of Reception of Grammar emerged as a significant predictor on block 2 (β = .384, t = 2.579, p = .012), but
not on block 3 (β = .130, t = 1.188, p = .238). Expressive Vocabulary was not a significant predictor on
block 2 (β = -.146, t = -.989, p = .325), but was on block 3 (β = -.216, t = -2.056, p = .043). On block 3,
Spoonerisms was a significant predictor (β = .639, t = 7.089, p < .001), but Alliteration was not (β = .139,
t = 1.560, p = .123).
Beta represents the number of standard deviations the criterion variable will change as a result of a 1
standard deviation change in a predictor, with the other predictors held constant. Analysing block 3,
Spoonerisms is the most important predictor of spelling performance; for every 1 standard deviation
increase in the Spoonerisms score, spelling performance improves by .639 of a standard deviation. The
second most important predictor is Expressive Vocabulary, which is negatively related to spelling
performance. The other 3 predictors, in order of importance, are Raven’s, Alliteration and Test of
Reception of Grammar.
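The adjusted R² values can be recomputed from R², the number of cases and the number of predictors. In the sketch below the sample size of 86 is not stated in the answer; it is inferred from the block 1 error degrees of freedom (84 = n - 1 - 1), so treat it as an assumption. `adjusted_r_squared` is an illustrative helper name.

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Shrink R squared to account for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)


n_cases = 86  # inferred from the reported error df, not stated in the source

# block: (R squared, number of predictors, reported adjusted R squared)
blocks = {
    "block 1": (0.169, 1, 0.159),
    "block 2": (0.247, 3, 0.220),
    "block 3": (0.630, 5, 0.607),
}

adjusted = {
    name: adjusted_r_squared(r2, n_cases, k) for name, (r2, k, _) in blocks.items()
}
for name, value in adjusted.items():
    print(f"{name}: recomputed {value:.3f}, reported {blocks[name][2]:.3f}")
```

The recomputed values match the reported ones to within rounding of the published R² figures, and the adjustment grows as predictors are added.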
Q4. Following the format of the table used in the lecture, present an APA table showing the relevant
statistics for the logistic regression analysis. In this analysis, the criterion was spelling group
(poor/good speller) with phoneme awareness (spoonerisms) entered first and then TROG entered
next (i.e. hierarchical). The table should be numbered and have an appropriate title.
Table 3. Binary Logistic Regression with Spelling Group as the Criterion and Spoonerisms and Test of
Reception of Grammar as Predictors.

                                                     95% CI for EXP (B)
Model                             B        SE       Lower    EXP (B)   Upper
Constant                          -.140    .216              .870
Block 1
  Spoonerisms                     .330*    .075     1.202    1.392     1.610
Block 2
  Spoonerisms                     .363*    .081     1.228    1.438     1.684
  Test of Reception of Grammar    -.068    .052     .843     .934      1.035

Hosmer and Lemeshow (final model): χ² (8) = 11.192, p = .191; R² = .361 (Cox & Snell); R² = .481 (Nagelkerke); Model: χ² (2) =
38.449, p < .001. *p < .001.
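The EXP (B) column and its confidence limits follow from B and SE: the odds ratio is exp(B), and a Wald-type 95% interval is exp(B ± 1.96 × SE). The sketch below assumes that this is how the interval was produced; the inputs are the rounded table values, so agreement is only expected to about two decimal places.

```python
from math import exp


def odds_ratio_with_ci(b, se, z=1.96):
    """Return (lower, odds ratio, upper) for a logistic regression coefficient."""
    return exp(b - z * se), exp(b), exp(b + z * se)


# predictor: (B, SE) as reported in the table
coefficients = {
    "Spoonerisms (block 1)": (0.330, 0.075),
    "Spoonerisms (block 2)": (0.363, 0.081),
    "Test of Reception of Grammar (block 2)": (-0.068, 0.052),
}

odds_ratios = {name: odds_ratio_with_ci(b, se) for name, (b, se) in coefficients.items()}
for name, (lower, ratio, upper) in odds_ratios.items():
    print(f"{name}: EXP (B) = {ratio:.3f}, 95% CI {lower:.3f} to {upper:.3f}")
```

An interval that spans 1, as for the Test of Reception of Grammar, indicates that the direction of the effect is uncertain.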
Q5. Using the SPSS output on Blackboard, report all the results of the logistic regression analysis (see
lecture and guidance notes). Provide an interpretation for these results including a comparison of
the models and an explanation of the relevant statistical values. [200 words]
The binary logistic regression revealed that prior to any coefficients being entered into the model, the
-2LL value was 118.802 and the overall classification was 53.5% [EXP (B) = .870, p = .518]. When
Spoonerisms was entered on block 1, the -2LL value was 82.128 and the overall classification was
81.4%. The model was significant [χ² (1) = 36.674, p < .001; Spoonerisms: EXP (B) = 1.392, CI: 1.202 –
1.610]. The percentage of variance explained in spelling group membership on block 1 was 34.7% (R² =
.347) according to Cox and Snell and 46.7% (R² = .467) according to Nagelkerke. On block 2, the -2LL
value was 80.353 and the overall classification was 82.6%. The overall model was significant [χ² (2) =
38.449, p < .001], but the block 2 statistic revealed a non-significant effect [χ² (1) = 1.775, p = .183;
Spoonerisms: EXP (B) = 1.438, CI: 1.228 – 1.684; Test of Reception of Grammar: EXP (B) = .934, CI:
.843 – 1.035]. The percentage of variance explained on block 2 was 36.1% (R² = .361) according to Cox
and Snell and 48.1% (R² = .481) according to Nagelkerke. The introduction of Spoonerisms on block 1
considerably improved the fit of the model, as evident by the significance of the model, the notable
reduction in the -2LL value and the increase in the percentage of cases that can be correctly classified
as good or poor spellers. The introduction of Test of Reception of Grammar in block 2 only slightly
improved the fit of the model, as evident by the non-significance of the block, the marginal decrease
in the -2LL value and the marginal increase in the percentage of cases that can be correctly classified as
good or poor spellers. In block 2, Spoonerisms is the most important predictor of spelling group
membership; for every 1 unit increase in Spoonerisms score, the odds of being classified as a good
speller are multiplied by 1.438. For every 1 unit increase in Test of Reception of Grammar, individuals
are marginally more likely to be classified as a poor speller (EXP (B) = .934), although the confidence
interval for this odds ratio includes 1.
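The chi-square statistics in this answer are differences between successive -2LL values, which the sketch below reproduces from the reported figures. The p value helper applies only to 1 degree of freedom, where the chi-square tail probability equals erfc(sqrt(x / 2)); `chi_square_p_df1` is an illustrative name.

```python
from math import erfc, sqrt

# -2 log-likelihood values reported for each step of the model.
minus_2ll = {"constant only": 118.802, "block 1": 82.128, "block 2": 80.353}

block_1_chi_square = minus_2ll["constant only"] - minus_2ll["block 1"]
block_2_chi_square = minus_2ll["block 1"] - minus_2ll["block 2"]
model_chi_square = minus_2ll["constant only"] - minus_2ll["block 2"]


def chi_square_p_df1(statistic):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(statistic / 2))


print(f"block 1: chi-square (1) = {block_1_chi_square:.3f}")
print(
    f"block 2: chi-square (1) = {block_2_chi_square:.3f}, "
    f"p = {chi_square_p_df1(block_2_chi_square):.3f}"
)
print(f"final model: chi-square (2) = {model_chi_square:.3f}")
```

The small drop in -2LL from block 1 to block 2 is what makes the second block non-significant even though the overall model remains significant.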
Q6. Supporting your answer with references, provide a brief discussion of the influence that residuals
play in multiple regression analysis and the extent to which they may compromise the integrity of
the results. [150 words].
According to Gonzalez’s (2013) overview of residuals, outliers are data points with large residual values
(the deviation of a particular point from the regression line [its predicted value]) that can ‘push’ or ‘pull’
the regression line in one direction, leading to biased regression coefficients. As a result, and in line with
the assumptions of multiple regression, one should omit or rescore outliers. The exclusion of just one
outlier can have a substantial impact on the regression model. Detecting outliers is difficult in multiple
regression due to the need to analyse all variables concurrently. Cook’s Distance (Cook, 1977) assesses
the influence of potential outliers: it measures the difference between the regression one gets by
including a data point and the regression one gets by omitting it. It is influenced by both the residual
and the leverage (the amount by which the predicted value would change if the observation was shifted
one unit in the y-direction) of the data point.
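How the residual and the leverage combine in Cook's Distance can be sketched for the simplest case of one predictor. The data below are made up for illustration, with one deliberately aberrant final point, and the closed-form expression used (squared residual scaled by the mean squared error, weighted by leverage) is the textbook formula rather than anything specific to the SPSS output.

```python
from statistics import mean


def cooks_distances(x, y):
    """Cook's Distance for each case in a simple (one-predictor) regression."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    n_params = 2  # intercept and slope
    mse = sum(e ** 2 for e in residuals) / (n - n_params)
    # Leverage: how far each case's predictor value lies from the mean.
    leverages = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (n_params * mse)) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical data: the last case is an outlier with an extreme predictor value.
x_values = [1, 2, 3, 4, 5, 6, 7, 8]
y_values = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.0, 20.0]

distances = cooks_distances(x_values, y_values)
for case, distance in enumerate(distances, start=1):
    print(f"case {case}: Cook's D = {distance:.3f}")
```

In this made-up data set the final case has both a large residual and high leverage, so its distance is far larger than the others; values above 1 are commonly treated as a cause for concern.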
References
Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45, 1304-1312.
Cook, R.D. (1977). Detection of influential observations in linear regression. Technometrics, 19(1), 15-18.
Field, A.P. (2001). Meta-analysis of correlation coefficients: a Monte Carlo comparison of fixed- and
random-effects methods. Psychological Methods, 6 (2), 161-180.
Field, A.P. (2005). Meta-analysis. In J. Miles & P. Gilbert (Eds.). A handbook of research methods in
clinical and health psychology (pp. 295-308). Oxford: Oxford University Press.
Field, A.P. (2009). Discovering statistics using SPSS: and sex and drugs and rock ‘n’ roll (3rd edition).
London: Sage.
Girden, E. (1992). ANOVA: Repeated Measures. California: SAGE.
Gonzalez, R. (2013). Residual Analysis and Multiple Regression (lecture notes). Retrieved 10 December,
2014, from http://www-personal.umich.edu/~gonzo/coursenotes/file7.pdf
Myers, A., & Hansen, C. (2012). Experimental Psychology. California: Thomson/Wadsworth.
Pesarin, F., & Salmaso, L. (2010). Permutation tests for complex data. Chichester: John Wiley and Sons
Ltd.
Stevens, J. (1996). Applied multivariate statistics for the social sciences (3rd edition). Mahwah: Lawrence
Erlbaum Associates.