Mixed Effects Models Exercise

The dataset in the file man1a2.POE.txt gives the weights of mice at different ages. The mice were offspring in an experiment to examine parent of origin effects of the gene man1a2. The experiment comprises a reciprocal F1 cross: males carrying a KO for man1a2 were mated with wild-type females, and vice versa. The offspring are classified by sex, genotype (whether they carry a knockout for the man1a2 gene or are wild type), and mother (whether the mother carried the KO or was wild-type C57).

The object of the exercise is to investigate the effects of sex, age, genotype and maternal status on body weight. The same mouse was weighed at different ages, so we expect observations on the same animal to be correlated. These correlations can be taken into account by fitting mixed models which include a random effect for each mouse.

(i) Install the R library lme4 from CRAN and load it.

    install.packages("lme4")
    library(lme4)

(ii) Read in the dataset and examine it.

    man1a2 = read.delim("man1a2.POE.txt")

The columns are self explanatory. The commands below use generic column names (weight, sex, age, mouse); substitute the names reported by names(man1a2). The solution uses Sex, weeks and Mouse.ID.number.

(iii) First investigate the effect of sex. Fit a random effect model with no fixed effects with the command

    h0 = lmer(weight ~ 1 + (1|mouse), data=man1a2)

and compare it to a model with sex

    h1 = lmer(weight ~ sex + (1|mouse), data=man1a2)
    anova(h0,h1)

Compare these to model fits using standard least squares with lm().

(iv) Now investigate the effect of age. Plot the residuals from h1 against age to get a sense of the dependence on age.

    plot(man1a2$age, resid(h1))

Fit models with linear and linear+quadratic dependence on age and compare them, using both anova and residual plots. What do you conclude?

(v) Now investigate the effects of maternal status and offspring genotype by adding appropriate terms to your existing best model.
You should understand the difference between fitting a model including each of the terms

    mother
    genotype
    mother+genotype
    mother*genotype

Which of these can you compare using anova? What do you conclude? If you had fitted the same models using lm(), how would your conclusions have differed?

Solution:

There is an important effect due to Sex and a quadratic dependence on age, as shown by the following residual plots for a series of increasingly complex model fits using lmer:

    par(mfrow=c(3,2))

(i) plot weight against age, colour males red:

    plot(man1a2$weeks,man1a2$weight, xlab="age, weeks", ylab="weight/g",main="mouse weight vs age",cex.main=0.6)
    males = man1a2$Sex=="Male"
    points(man1a2$weeks[males],man1a2$weight[males], col="red")

(ii) plot residuals after removing sex:

    plot(man1a2$weeks,resid(lmer(man1a2$weight ~ man1a2$Sex + (1|man1a2$Mouse.ID.number))),xlab="age, weeks", ylab="residual weight/g",main="resid(lmer(man1a2$weight ~ man1a2$Sex))",cex.main=0.6)
    points(man1a2$weeks[males],resid(lmer(man1a2$weight ~ man1a2$Sex + (1|man1a2$Mouse.ID.number)))[males],col="red")

(iii) fit a Sex effect and a linear dependence on age (don't colour sexes):

    plot(man1a2$weeks,resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks + (1|man1a2$Mouse.ID.number))),xlab="age, weeks", ylab="residual weight/g",main="resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks))",cex.main=0.6)

(iv) fit Sex and a linear+quadratic dependence on age (weeks2 is the square of weeks):

    plot(man1a2$weeks,resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks + man1a2$weeks2 + (1|man1a2$Mouse.ID.number))),xlab="age, weeks", ylab="residual weight/g",main="resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks + man1a2$weeks2))",cex.main=0.6)

(v) fit interaction between sex and age:

    plot(man1a2$weeks,resid(lmer(man1a2$weight ~ man1a2$Sex * (man1a2$weeks + man1a2$weeks2) + (1|man1a2$Mouse.ID.number))),xlab="age, weeks", ylab="residual weight/g",main="resid(lmer(man1a2$weight ~ man1a2$Sex * (man1a2$weeks + man1a2$weeks2)))",cex.main=0.6)

Model fits:

    g0 = lmer(weight ~ 1 + (1|Mouse.ID.number),data=man1a2)
    g1 = lmer(weight ~ Sex + (1|Mouse.ID.number),data=man1a2)
    g2 = lmer(weight ~ Sex + weeks + (1|Mouse.ID.number),data=man1a2)
    g3 = lmer(weight ~ Sex + weeks + weeks2 + (1|Mouse.ID.number),data=man1a2)
    g4 = lmer(weight ~ Sex * (weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)
    anova(g0,g1,g2,g3,g4)

    Data: man1a2
    Models:
    g0: weight ~ 1 + (1 | Mouse.ID.number)
    g1: weight ~ Sex + (1 | Mouse.ID.number)
    g2: weight ~ Sex + weeks + (1 | Mouse.ID.number)
    g3: weight ~ Sex + weeks + weeks2 + (1 | Mouse.ID.number)
    g4: weight ~ Sex * (weeks + weeks2) + (1 | Mouse.ID.number)
       Df    AIC    BIC  logLik   Chisq Chi Df Pr(>Chisq)
    g0  3 4542.1 4556.3 -2268.0
    g1  4 4380.4 4399.4 -2186.2  163.71      1  < 2.2e-16 ***
    g2  5 2938.6 2962.4 -1464.3 1443.73      1  < 2.2e-16 ***
    g3  6 2775.2 2803.7 -1381.6  165.43      1  < 2.2e-16 ***
    g4  8 2154.9 2192.9 -1069.5  624.28      2  < 2.2e-16 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The ANOVA indicates that every term added to the model is highly significant (each step has p < 2.2e-16), and the residual plots suggest that most of the dependence on these covariates has been captured.

We now turn to examining the parent of origin effects.
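As a brief aside, the columns of the anova table above are linked by simple arithmetic: AIC is -2*logLik + 2*Df, Chisq is twice the gain in log-likelihood over the preceding model, and Chi Df is the number of extra parameters. The following is a minimal sketch in base R that checks this, using only the rounded numbers printed in the table (so agreement is only expected to within rounding). The object names tab, aic_check, lrt_check and df_check are illustrative and not part of the original solution.

```r
# Values copied from the anova(g0, ..., g4) output above (rounded as printed).
tab <- data.frame(
  model  = c("g0", "g1", "g2", "g3", "g4"),
  Df     = c(3, 4, 5, 6, 8),
  AIC    = c(4542.1, 4380.4, 2938.6, 2775.2, 2154.9),
  logLik = c(-2268.0, -2186.2, -1464.3, -1381.6, -1069.5),
  Chisq  = c(NA, 163.71, 1443.73, 165.43, 624.28)
)

# AIC = -2 * logLik + 2 * (number of parameters)
aic_check <- -2 * tab$logLik + 2 * tab$Df

# Likelihood ratio statistic: twice the gain in log-likelihood
# relative to the preceding (nested) model.
lrt_check <- c(NA, 2 * diff(tab$logLik))

# Degrees of freedom of each test: number of parameters added.
df_check <- c(NA, diff(tab$Df))

# Side-by-side comparison; small discrepancies come from the
# rounding of logLik in the printed table.
print(cbind(tab, aic_check, lrt_check, df_check))
```

The same check applies to any sequence of nested models compared with anova().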
(a) add a term for purely maternal effects, i.e. whether the mother carrying the KO has an effect. We allow this to change not only the overall weight but also the rate of growth by using the term (Sex + Mother)*(weeks + weeks2):

    g5 = lmer(weight ~ (Sex + Mother) * (weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)

(b) add a term for whether the offspring carry the KO or are wild-type. Again we allow this to affect not only the overall level but also the rate of growth, (Sex + Mother + Genotype)*(weeks + weeks2):

    g6 = lmer(weight ~ (Sex + Mother + Genotype) * (weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)

(c) now test for a parent of origin effect, i.e. whether a KO allele transmitted from the Mother is different from one transmitted from the Father. As fitted, this adds the Mother:Genotype interaction, (Sex + Mother*Genotype)*(weeks + weeks2):

    g7 = lmer(weight ~ (Sex + Mother * Genotype) * (weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)

(d) now allow the maternal and genotype effects to depend on the sex of the mice, (Sex*Mother + Sex*Genotype + Mother*Genotype)*(weeks + weeks2):

    g8 = lmer(weight ~ (Sex * Mother + Sex * Genotype + Mother * Genotype) * (weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)

(e) finally a model with sex-dependent parent of origin effects, (Sex*Mother*Genotype)*(weeks + weeks2):

    g9 = lmer(weight ~ (Sex * Mother * Genotype) * (weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)
    anova(g0,g1,g2,g3,g4,g5,g6,g7,g8,g9)

    Data: man1a2
    Models:
    g0: weight ~ 1 + (1 | Mouse.ID.number)
    g1: weight ~ Sex + (1 | Mouse.ID.number)
    g2: weight ~ Sex + weeks + (1 | Mouse.ID.number)
    g3: weight ~ Sex + weeks + weeks2 + (1 | Mouse.ID.number)
    g4: weight ~ Sex * (weeks + weeks2) + (1 | Mouse.ID.number)
    g5: weight ~ (Sex + Mother) * (weeks + weeks2) + (1 | Mouse.ID.number)
    g6: weight ~ (Sex + Mother + Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
    g7: weight ~ (Sex + Mother * Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
    g8: weight ~ (Sex * Mother + Sex * Genotype + Mother * Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
    g9: weight ~ (Sex * Mother * Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
       Df    AIC    BIC  logLik     Chisq Chi Df Pr(>Chisq)
    g0  3 4542.1 4556.3 -2268.0
    g1  4 4380.4 4399.4 -2186.2  163.7116      1  < 2.2e-16 ***
    g2  5 2938.6 2962.4 -1464.3 1443.7336      1  < 2.2e-16 ***
    g3  6 2775.2 2803.7 -1381.6  165.4312      1  < 2.2e-16 ***
    g4  8 2154.9 2192.9 -1069.5  624.2836      2  < 2.2e-16 ***
    g5 11 2050.3 2102.5 -1014.1  110.6239      3  < 2.2e-16 ***
    g6 14 2051.3 2117.8 -1011.6    5.0344      3  0.1692990
    g7 17 2040.0 2120.7 -1003.0   17.2750      3  0.0006204 ***
    g8 23 2035.3 2144.6  -994.7   16.6434      6  0.0106872 *
    g9 26 2040.2 2163.7  -994.1    1.1528      3  0.7643500
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

The ANOVA indicates that most models are significantly better than the immediately preceding model, the exceptions being:

    g6 vs g5 (p = 0.17, i.e. there is no overall effect of whether the offspring carry the KO)
    g8 vs g7 (p = 0.011, which is only marginally significant after allowing for multiple comparisons)
    g9 vs g8 (p = 0.76)

Conclusion: There are both maternal effects (g5 vs g4) and imprinting effects (the Mother:Genotype interaction, g7 vs g6); the former are stronger than the latter.
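The p-values in the table above, and the remark about multiple comparisons, can be reproduced from the Chisq and Chi Df columns alone. The sketch below uses base R only. It assumes a Bonferroni correction over the nine sequential comparisons g1 to g9; the handout does not say which correction was intended, so that choice, and the names chisq, chi_df, p_raw and p_bonf, are illustrative assumptions.

```r
# Chisq and Chi Df for the last four comparisons (g6 vs g5, ..., g9 vs g8),
# copied from the anova output above.
chisq  <- c(g6 = 5.0344, g7 = 17.2750, g8 = 16.6434, g9 = 1.1528)
chi_df <- c(g6 = 3,      g7 = 3,       g8 = 6,       g9 = 3)

# Upper-tail chi-squared probability of each likelihood ratio statistic.
p_raw <- pchisq(chisq, df = chi_df, lower.tail = FALSE)

# ASSUMPTION: Bonferroni correction over all nine sequential comparisons
# (g1 vs g0 up to g9 vs g8). The n argument tells p.adjust how many tests
# were made in total, even though only four p-values are passed in.
p_bonf <- p.adjust(p_raw, method = "bonferroni", n = 9)

print(signif(cbind(p_raw, p_bonf), 3))
```

Under this assumption the g7 vs g6 comparison remains clearly significant after correction, whereas g8 vs g7 no longer falls below 0.05, which is consistent with describing it as marginal.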