Issues with Mixed Models

Model doesn’t converge… OR

Convergence

Likelihood Landscape

Maximum Likelihood Estimation
Likelihood = the probability of seeing the data we actually collected, given a particular model
Maximum Likelihood Estimates = those parameter values that make the observed data most likely to have happened

Sources of Convergence Problems
• You estimate more parameters than you have data points (or, in general, too many parameters)
• Severe collinearity (e.g., two predictors are perfectly correlated)
• Missing cells in your design
• Predictors on vastly different scales

Failure to converge

                ATTITUDE
    GENDER      polite    informal
    male        16        0
    female      16        32

… and then trying to test the ATTITUDE*GENDER interaction

How can this happen?
“Death by Design” (coined by Roger Mundry)

Solutions to Convergence Problems
• Drop a random slope (not preferred, should be reported)
• Drop subjects/items for which there is not enough data (not preferred, should be reported)
• Rescale variables so that they range between 0 and 1, or otherwise put them on similar scales
• Center continuous predictors
• Apply nonlinear transformations to skewed predictors
• Change the order of variable names in the model formula
• Have a balanced and complete design

p-values

The p-value conundrum
What are the degrees of freedom?

How to get p-values out of mixed models is not entirely straightforward…
Douglas Bates

“There are a number of ways to compute p-values from LMEMs, none of which is uncontroversially the best.”
Barr et al.
(2013)

Ways to get p-values
• t-test/F-test with normal approximation
• Likelihood Ratio Test
• Bootstrapping
• Permutation
• Markov Chain Monte Carlo (MCMC)

Getting p-vals with normal approximation

    xmdl    # a fitted lmer model
    # coef(summary()) replaces the @coefs slot, which exists only in lme4 versions before 1.0
    coefs = data.frame(coef(summary(xmdl)))
    # two-tailed p-value, treating the t-value as a z-score
    coefs$p = 2*(1-pnorm(abs(coefs$t.value)))
    coefs

Function for getting p-vals with normal approximation

    create.sig.table = function(x){
        coefs = data.frame(coef(summary(x)))
        coefs$p = 2*(1-pnorm(abs(coefs$t.value)))
        coefs$sig = character(nrow(coefs))
        # index the column directly so that an empty match is a no-op instead of an error
        coefs$sig[coefs$p < 0.05] = "*"
        coefs$sig[coefs$p < 0.01] = "**"
        coefs$sig[coefs$p < 0.001] = "***"
        return(coefs)
    }

Likelihood Ratio Test
The first model needs to be nested in the second

Likelihood Ratio
The likelihood ratio expresses how many times more likely the data are under one model than under the other

Important when doing likelihood ratio tests

    lmer(…,REML=FALSE)

http://anythingbutrbitrary.blogspot.com/2012/06/random-regression-coefficients-using.html

Final issue: Random slopes

DANGEROUS!!!
Random intercept only models are known to be very anti-conservative in many circumstances (cf.
Barr et al., 2013; Schielzeth & Forstmeier, 2008)

Schielzeth & Forstmeier (2008): Type I error simulation
10 subjects, 10 data points each; 5 of those in condition A, 5 in B

Random intercept only

    LRT        LRT    LRT        LRT    z-test     z-test  z-test     z-test
    intercept  slope  intercept  slope  intercept  slope   intercept  slope
    ML         ML     REML       REML   ML         ML      REML       REML
    0.052      0.035  0.052      0.035  0.053      0.039   0.054      0.042

Add to this explicit subject slopes for A/B

    LRT        LRT    LRT        LRT    z-test     z-test  z-test     z-test
    intercept  slope  intercept  slope  intercept  slope   intercept  slope
    ML         ML     REML       REML   ML         ML      REML       REML
    0.24       0.15   0.24       0.069  0.24       0.079   0.25       0.091

Add to this explicit subject slopes for A/B + take item slopes

    LRT        LRT    LRT        LRT    z-test     z-test  z-test     z-test
    intercept  slope  intercept  slope  intercept  slope   intercept  slope
    ML         ML     REML       REML   ML         ML      REML       REML
    0.18       0.085  0.18       0.052  0.21       0.064   0.23       0.08

“Keep it maximal”
random effects justified by the design vs. random effects justified by the data
Barr et al. (2013)

“for whatever fixed effects are of critical interest, the corresponding random effects should be in that analysis”
Barr et al. (2013)

That’s it (for now)
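
Worked sketch: “keep it maximal” plus a likelihood ratio test

The two pieces of advice above (fit with REML=FALSE when comparing models, and include the random slope that corresponds to the fixed effect of interest) can be combined roughly as follows. This is only a minimal sketch under stated assumptions: it uses the sleepstudy data that ships with lme4, which is not part of these slides, and the names full, null, lrt, chisq and p are illustrative.

```r
library(lme4)

# Assumption: sleepstudy (bundled with lme4) stands in for your own data.
# Reaction = reaction time, Days = predictor of interest, Subject = grouping factor.
data(sleepstudy)

# Full model: fixed effect of Days, with by-subject random intercepts AND
# by-subject random slopes for Days (the random effect matching the fixed effect).
full <- lmer(Reaction ~ Days + (1 + Days | Subject),
             data = sleepstudy, REML = FALSE)

# Null model: identical random effects, but without the fixed effect of Days,
# so the null model is nested in the full model.
null <- lmer(Reaction ~ 1 + (1 + Days | Subject),
             data = sleepstudy, REML = FALSE)

# Likelihood ratio test via anova()
lrt <- anova(null, full)
print(lrt)

# The same test by hand: twice the difference in log-likelihoods,
# compared against a chi-square distribution with 1 degree of freedom
# (the models differ by one fixed-effect parameter).
chisq <- as.numeric(2 * (logLik(full) - logLik(null)))
p <- pchisq(chisq, df = 1, lower.tail = FALSE)
```

Both models are fit with maximum likelihood (REML=FALSE) because they differ in their fixed effects. The random slope stays in the null model as well, so that the comparison isolates the fixed effect of Days.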