Neuropsychological norming and mixed effects models

Davide Crepaldi (davide.crepaldi1@unimib.it, www.davidecrepaldi.net) [MoMo Lab, Department of Psychology, University of Milano Bicocca, Italy]
Alessandra Casarotti [Department of Medical Biotechnology and Translational Medicine, University of Milano, Italy] [Humanitas Research Hospital, Unit of Neurosurgical Oncology, Rozzano, Italy]
Barbara Zarino [Department of Neuroscience and Sense Organs, Ospedale Maggiore Policlinico, Milan, Italy]
Costanza Papagno [Department of Psychology, University of Milano Bicocca, Italy]

INFERENCE IN PSYCHOLOGY

Psychologists are typically interested in effects that are general, i.e., that hold across multiple items and multiple subjects. Establishing when this is the case is by no means trivial [Clark, 1973]. Very often, the statistical tests that we typically use are seriously flawed [Jaeger, 2008].

An effective solution is mixed effect modeling [Baayen et al., 2008; Jaeger, 2008]. Among other things, it allows one to:
- run models on individual data points, that is, the performance of subject i on item j, with no averaging. This yields many more data points, and thus much more power;
- deal appropriately with special distributions (e.g., binomial, as in accuracy data), which we typically fail to do (e.g., ANOVA on % correct is deeply wrong);
- explain away spurious variability related to random variation in items (some are difficult, some are easy) and subjects (some are good, some are bad). This may account for up to some 45-50% of the variance in our data.

Here we apply these big advances to psychological test norming.

THE TEST (CASE)

Picture naming of actions, taken from Crepaldi et al. (2006):
- 50 items
- Rated for frequency, age of acquisition, actionality, and picture typicality. Length in letters and syllables was also considered
- Three syntactic classes: transitive verbs (n=20), unergative verbs (n=17) and unaccusative verbs (n=13)
- Pictures either new or taken from Druks (2000)

NORMING SAMPLE

290 unimpaired Italian speakers (148 F and 142 M), ranging from 18 to 98 years in age and from 3 to 23 years in education.

[Figure: age (years) by education (years).]

CLASSIC APPROACH (Capitani et al., 1988)

Simple regression, no random effects.
No item variables, because the actual data points are subject means.
Relevant subject variables: age (p<.001), education (p<.001), and gender (p=.03).
Overall fit: R-squared = .34 [0=randomness, 1=perfect prediction]

MIXED EFFECT MODELING

Mixed effect models, with random intercepts for subjects and items.
Item variables are included, because the actual data points are individual responses (subject i on item j).
Relevant item variables: imageability (p=.02), number of syllables (p=.12), age of acquisition (p=.15), and picture typicality (p=.12).
Relevant subject variables: age (p<.001), education (p=.07), gender (p=.01), and age by education (p<.001).
Overall fit: Somers' Dxy = .73 [0=randomness, 1=perfect prediction]

COMPARISON: % CORRECT AND EQUIVALENT SCORES

We computed the expected % correct using the two procedures for 2.6k combinations of age (20-85), education (3-22) and gender: the correlation is .81.

[Figure: %correct-mixed minus %correct-classic, by age (years) and education (years). Legend: mixed prediction is higher than classic prediction; mixed prediction is lower than classic prediction.]

We computed Equivalent Scores (ES) using the two procedures for 80k combinations of age (20-85), education (3-22), gender and raw score (20-50 correct). The ES disagree 28% of the time.

WHERE DO THEY DISAGREE?

[Figure: % of cases in which the two procedures agree vs. disagree, in nine panels: Age = 30, 50, 70 by Education = 5, 13, 18.]

AND HOW STRONGLY?

[Figure: % of cases by ES-mixed minus ES-classic.]

We computed Equivalent Scores using the two procedures for 69 unselected aphasic patients. The ES disagree 26% of the time.

[Figure: patients by age (years) and education (years). Legend: Pt with same score; Pt with +1 classic; Pt with +1 mixed.]

REFERENCES

Baayen, R. H. et al. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390-412.
Capitani, E. (1987). Statistical methods. In Spinnler, H. and Tognoni, G. (Eds.), Italian Normative Values and standardization of neuropsychological tests. Italian Journal of Neurological Sciences, 6 (suppl. 8).
Clark, H. H. (1973). The language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Verbal Behavior, 12, 335-359.
Crepaldi et al. (2006). Noun-verb dissociation in aphasia: the role of imageability and functional locus of the lesion. Neuropsychologia, 44, 73-99.
Druks, J. (2000). Object and Action Naming Battery. Hove: Psychology Press.
Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59, 434-446.
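
NOTE ON SOMERS' DXY

The overall fit of the mixed model is reported above as Somers' Dxy [0=randomness, 1=perfect prediction]. For binary accuracy data, this index can be read as a rank correlation between predicted probabilities and observed responses: across all (correct, incorrect) response pairs, the proportion in which the correct response received the higher prediction minus the proportion in which it received the lower one. The following is a minimal sketch under that definition, not the analysis code behind the poster; the function name `somers_dxy` and the toy data are illustrative assumptions.

```python
from itertools import product


def somers_dxy(outcomes, predicted):
    """Somers' Dxy between binary outcomes (1 = correct, 0 = error)
    and predicted probabilities of a correct response.

    Hypothetical helper for illustration only.
    """
    # Split predictions by the observed outcome.
    pos = [p for y, p in zip(outcomes, predicted) if y == 1]
    neg = [p for y, p in zip(outcomes, predicted) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect response")

    concordant = discordant = 0
    # Compare every correct response with every incorrect response.
    for p_correct, p_error in product(pos, neg):
        if p_correct > p_error:
            concordant += 1   # model ranks the correct response higher
        elif p_correct < p_error:
            discordant += 1   # model ranks the correct response lower
        # ties count for neither side

    return (concordant - discordant) / (len(pos) * len(neg))


# Toy data (made up): five responses and their predicted probabilities.
toy_outcomes = [1, 1, 0, 1, 0]
toy_predicted = [0.9, 0.8, 0.7, 0.4, 0.2]
print(round(somers_dxy(toy_outcomes, toy_predicted), 3))  # 0.667
```

Equivalently, Dxy = 2 x (C - 0.5), where C is the concordance index (the area under the ROC curve), which is why 0 corresponds to chance-level prediction and 1 to perfect discrimination.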