Mixed Effects Models Exercise

The dataset in the file man1a2.POE.txt gives the weights of mice at different ages.
The mice were offspring in an experiment to examine parent of origin effects of
the gene man1a2. The experiment comprises a reciprocal F1 cross between
males carrying a KO for man1a2 mated with a wild-type female, and vice versa.
The offspring are classified by sex, genotype (whether they carry a knockout for
the man1a2 gene or are wild type), and mother (whether the mother carried the
KO or was wildtype C57). The object of the exercise is to investigate the effects of
sex, age, genotype and maternal status on body weight. The same mouse was
weighed at different ages, so we expect observations on the same animal to be
correlated. We take this into account by fitting mixed models which include a
random effect for each mouse.
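To see why a per-mouse random effect induces correlation, here is a minimal base-R simulation sketch. It does not use the real dataset: the number of mice, the standard deviations and the mean weight are all made-up values chosen for illustration.

```r
# Simulated repeated measures with a random intercept per mouse.
# All numbers below are illustrative assumptions, not estimates from man1a2.
set.seed(42)
n_mice   <- 200   # hypothetical number of mice
n_obs    <- 6     # hypothetical number of weighings per mouse
sd_mouse <- 2     # between-mouse standard deviation (random intercept)
sd_resid <- 1     # within-mouse residual standard deviation

mouse <- rep(seq_len(n_mice), each = n_obs)
u     <- rnorm(n_mice, sd = sd_mouse)            # one random effect per mouse
y     <- 20 + u[mouse] + rnorm(n_mice * n_obs, sd = sd_resid)

# Theoretical correlation between two observations on the same mouse
icc_theory <- sd_mouse^2 / (sd_mouse^2 + sd_resid^2)

# Empirical check: correlate the first and second weighing across mice
wide    <- matrix(y, nrow = n_mice, byrow = TRUE)  # one row per mouse
icc_emp <- cor(wide[, 1], wide[, 2])

print(c(theory = icc_theory, empirical = icc_emp))
```

The larger the between-mouse variance relative to the residual variance, the stronger the correlation, and the more misleading an analysis that treats all observations as independent.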
(i) Install the R package lme4 from CRAN
(ii) Read in the dataset and examine it.
man1a2=read.delim("man1a2.POE.txt")
The columns are self explanatory
(iii) First investigate the effect of sex. Fit a random effect model with no fixed
effects with the command
h0 = lmer(weight ~ 1 + (1|Mouse.ID.number), data=man1a2)
and compare it to a model with sex
h1 = lmer(weight ~ Sex + (1|Mouse.ID.number), data=man1a2)
anova(h0,h1)
Compare these to model fits using standard least squares with lm()
(iv) Now investigate the effect of age. Plot the residuals from h1 against age to
get a sense of the dependence on age.
plot(man1a2$weeks, resid(h1))
Fit models with linear and linear+quadratic dependence on age and compare
them, using both anova and residual plots. What do you conclude?
(v) Now investigate the effects of maternal status and offspring genotype by
adding appropriate terms to your existing best model. You should understand
the difference between fitting a model including the terms
mother
genotype
mother+genotype
mother*genotype
Which of these can you compare using anova? What do you conclude? If you had
fitted the same models using lm() how would your conclusions have differed?
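As a hint for the question about the four sets of terms, the sketch below lists the fixed-effect columns each formula generates. It uses a toy two-by-two design rather than the real data, and the level labels "WT" and "KO" are assumptions about how the factors are coded.

```r
# Toy design: every combination of maternal status and offspring genotype.
# The level names "WT" and "KO" are hypothetical.
design <- expand.grid(
  Mother   = factor(c("WT", "KO"), levels = c("WT", "KO")),
  Genotype = factor(c("WT", "KO"), levels = c("WT", "KO"))
)

cols_mother   <- colnames(model.matrix(~ Mother, design))
cols_genotype <- colnames(model.matrix(~ Genotype, design))
cols_additive <- colnames(model.matrix(~ Mother + Genotype, design))
cols_interact <- colnames(model.matrix(~ Mother * Genotype, design))

print(cols_mother)
print(cols_genotype)
print(cols_additive)
print(cols_interact)

# A model is nested in another when its columns are a subset of the other's.
is_nested <- function(small, big) all(small %in% big)
print(is_nested(cols_mother, cols_additive))    # mother vs mother+genotype
print(is_nested(cols_mother, cols_genotype))    # mother vs genotype
```

A likelihood ratio comparison with anova is only meaningful between nested models, so the last two lines indicate which pairs qualify.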
Solution:
There is an important effect due to Sex and a quadratic dependence on age, as
shown by the following residual plots for a series of more complicated model fits
using lmer:
par(mfrow=c(3,2))
(i) plot weight against age, colour males red:
plot(man1a2$weeks, man1a2$weight, xlab="age, weeks", ylab="weight/g", main="mouse weight vs age", cex.main=0.6)
males = man1a2$Sex=="Male"
points(man1a2$weeks[males], man1a2$weight[males], col="red")
(ii) plot residuals after removing sex:
plot(man1a2$weeks, resid(lmer(man1a2$weight ~ man1a2$Sex + (1|man1a2$Mouse.ID.number))), xlab="age, weeks", ylab="residual weight/g", main="resid(lmer(man1a2$weight ~ man1a2$Sex))", cex.main=0.6)
points(man1a2$weeks[males], resid(lmer(man1a2$weight ~ man1a2$Sex + (1|man1a2$Mouse.ID.number)))[males], col="red")
(iii) fit a Sex effect and a linear dependence on age (don't colour sexes):
plot(man1a2$weeks, resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks + (1|man1a2$Mouse.ID.number))), xlab="age, weeks", ylab="residual weight/g", main="resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks))", cex.main=0.6)
(iv) fit Sex and a linear+quadratic dependence on age (weeks2 is the square of weeks):
plot(man1a2$weeks, resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks + man1a2$weeks2 + (1|man1a2$Mouse.ID.number))), xlab="age, weeks", ylab="residual weight/g", main="resid(lmer(man1a2$weight ~ man1a2$Sex + man1a2$weeks + man1a2$weeks2))", cex.main=0.6)
(v) fit interaction between sex and age:
plot(man1a2$weeks, resid(lmer(man1a2$weight ~ man1a2$Sex * (man1a2$weeks + man1a2$weeks2) + (1|man1a2$Mouse.ID.number))), xlab="age, weeks", ylab="residual weight/g", main="resid(lmer(man1a2$weight ~ man1a2$Sex * (man1a2$weeks + man1a2$weeks2)))", cex.main=0.6)
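The plotting commands above refit a model inside each call. A tidier pattern is to fit once, store the residuals and reuse them. The sketch below illustrates this on simulated data, since the real file is not reproduced here: the data frame sim and all its coefficients are invented, and only the column names mirror those of man1a2. It assumes lme4 is installed.

```r
library(lme4)

# Simulated stand-in for man1a2 (all effect sizes are made up).
set.seed(1)
n_mice <- 30
sim <- expand.grid(weeks = 3:10,
                   Mouse.ID.number = factor(seq_len(n_mice)))
id         <- as.integer(sim$Mouse.ID.number)
sim$Sex    <- factor(ifelse(id %% 2 == 0, "Male", "Female"))
sim$weeks2 <- sim$weeks^2            # square of age, as in the solution
mouse_eff  <- rnorm(n_mice, sd = 1.5)
sim$weight <- 5 + 3 * sim$weeks - 0.12 * sim$weeks2 +
  2 * (sim$Sex == "Male") + mouse_eff[id] + rnorm(nrow(sim), sd = 0.8)

# Fit once with data=, then reuse the stored residuals.
fit   <- lmer(weight ~ Sex + weeks + weeks2 + (1 | Mouse.ID.number), data = sim)
res   <- resid(fit)
males <- sim$Sex == "Male"

plot(sim$weeks, res, xlab = "age, weeks", ylab = "residual weight/g",
     main = "residuals from a single stored fit", cex.main = 0.6)
points(sim$weeks[males], res[males], col = "red")   # overlay males in red
```

Passing data= instead of writing man1a2$ before every variable also gives shorter formulas and cleaner labels in the anova output.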
Model fits:
g0 = lmer(weight ~ 1 + (1|Mouse.ID.number), data=man1a2)
g1 = lmer(weight ~ Sex + (1|Mouse.ID.number), data=man1a2)
g2 = lmer(weight ~ Sex + weeks + (1|Mouse.ID.number), data=man1a2)
g3 = lmer(weight ~ Sex + weeks + weeks2 + (1|Mouse.ID.number), data=man1a2)
g4 = lmer(weight ~ Sex * (weeks + weeks2) + (1|Mouse.ID.number), data=man1a2)
anova(g0,g1,g2,g3,g4)
Data: man1a2
Models:
g0: weight ~ 1 + (1 | Mouse.ID.number)
g1: weight ~ Sex + (1 | Mouse.ID.number)
g2: weight ~ Sex + weeks + (1 | Mouse.ID.number)
g3: weight ~ Sex + weeks + weeks2 + (1 | Mouse.ID.number)
g4: weight ~ Sex * (weeks + weeks2) + (1 | Mouse.ID.number)
   Df    AIC    BIC  logLik   Chisq Chi Df Pr(>Chisq)
g0  3 4542.1 4556.3 -2268.0
g1  4 4380.4 4399.4 -2186.2  163.71      1  < 2.2e-16 ***
g2  5 2938.6 2962.4 -1464.3 1443.73      1  < 2.2e-16 ***
g3  6 2775.2 2803.7 -1381.6  165.43      1  < 2.2e-16 ***
g4  8 2154.9 2192.9 -1069.5  624.28      2  < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The ANOVA indicates that all the terms in the model are very significant, and the
residual plot suggests that most of the dependence on these covariates has been
captured.
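The Chisq, Chi Df and AIC columns follow directly from the log-likelihoods and Df values. As a hedged illustration, the base-R sketch below recomputes the g4 versus g3 row from the rounded figures printed in the table, so it agrees with the anova output only up to rounding.

```r
# Values copied from the anova table (rounded to one decimal place there).
loglik_g3 <- -1381.6; df_g3 <- 6
loglik_g4 <- -1069.5; df_g4 <- 8

# Likelihood ratio statistic: twice the gain in log-likelihood.
chisq  <- 2 * (loglik_g4 - loglik_g3)
chi_df <- df_g4 - df_g3

# Upper-tail probability from the chi-squared distribution.
p_value <- pchisq(chisq, df = chi_df, lower.tail = FALSE)

# AIC = -2 * logLik + 2 * (number of parameters).
aic_g4 <- -2 * loglik_g4 + 2 * df_g4

print(c(chisq = chisq, chi_df = chi_df, aic_g4 = aic_g4))
print(p_value < 2.2e-16)
```

The recomputed statistic is close to the tabulated 624.28 and the AIC close to 2154.9; the small differences come from the one-decimal rounding of logLik.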
We now turn to examining the parent of origin effects.
(a) add a term for purely maternal effects, i.e. whether the mother carrying the
KO has an effect – we allow this to change not only the overall weight but also
the rate of growth by using the term (Sex + Mother)*(weeks + weeks2)
g5 = lmer(weight ~ (Sex + Mother) *( weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)
(b) add a term for whether the offspring carry the KO or are wild-type – again we
allow this to affect not only the overall level but also the rate of growth (Sex +
Mother + Genotype)*(weeks + weeks2)
g6 = lmer(weight ~ (Sex + Mother + Genotype) *( weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)
(c) now test for a parent of origin effect, i.e. whether a KO allele transmitted
from the Mother has a different effect from one transmitted from the Father, by
adding the Mother*Genotype interaction: (Sex + Mother*Genotype)*(weeks + weeks2)
g7 = lmer(weight ~ (Sex + Mother * Genotype) *( weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)
(d) now also allow the maternal and genotype effects to depend on the sex of the
mice: (Sex*Mother + Sex*Genotype + Mother*Genotype)*(weeks + weeks2)
g8 = lmer(weight ~ (Sex * Mother + Sex * Genotype + Mother*Genotype) *( weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)
(e) finally a model with sex-dependent Parent of Origin effects
(Sex*Mother*Genotype)*(weeks + weeks2)
> g9 = lmer(weight ~ (Sex * Mother * Genotype ) *( weeks + weeks2) + (1|Mouse.ID.number),data=man1a2)
> anova(g0,g1,g2,g3,g4,g5,g6,g7,g8,g9)
Data: man1a2
Models:
g0: weight ~ 1 + (1 | Mouse.ID.number)
g1: weight ~ Sex + (1 | Mouse.ID.number)
g2: weight ~ Sex + weeks + (1 | Mouse.ID.number)
g3: weight ~ Sex + weeks + weeks2 + (1 | Mouse.ID.number)
g4: weight ~ Sex * (weeks + weeks2) + (1 | Mouse.ID.number)
g5: weight ~ (Sex + Mother) * (weeks + weeks2) + (1 | Mouse.ID.number)
g6: weight ~ (Sex + Mother + Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
g7: weight ~ (Sex + Mother * Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
g8: weight ~ (Sex * Mother + Sex * Genotype + Mother * Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
g9: weight ~ (Sex * Mother * Genotype) * (weeks + weeks2) + (1 | Mouse.ID.number)
   Df    AIC    BIC  logLik     Chisq Chi Df Pr(>Chisq)
g0  3 4542.1 4556.3 -2268.0
g1  4 4380.4 4399.4 -2186.2  163.7116      1  < 2.2e-16 ***
g2  5 2938.6 2962.4 -1464.3 1443.7336      1  < 2.2e-16 ***
g3  6 2775.2 2803.7 -1381.6  165.4312      1  < 2.2e-16 ***
g4  8 2154.9 2192.9 -1069.5  624.2836      2  < 2.2e-16 ***
g5 11 2050.3 2102.5 -1014.1  110.6239      3  < 2.2e-16 ***
g6 14 2051.3 2117.8 -1011.6    5.0344      3  0.1692990
g7 17 2040.0 2120.7 -1003.0   17.2750      3  0.0006204 ***
g8 23 2035.3 2144.6  -994.7   16.6434      6  0.0106872 *
g9 26 2040.2 2163.7  -994.1    1.1528      3  0.7643500
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
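The Df column can be checked by counting the terms each formula expands to. The sketch below is a rough base-R check, not part of the original solution. It assumes that Sex, Mother and Genotype each have two levels (so every term contributes one coefficient) and that each model has two variance parameters, one for the mouse random intercept and one for the residual. The helper count_params is a hypothetical name introduced here.

```r
# Count parameters: fixed-effect terms + intercept + 2 variance parameters.
# Assumes every factor is binary, so each term label is a single coefficient.
count_params <- function(rhs) {
  tt <- terms(rhs)
  length(attr(tt, "term.labels")) + attr(tt, "intercept") + 2
}

df_check <- c(
  g4 = count_params(~ Sex * (weeks + weeks2)),
  g5 = count_params(~ (Sex + Mother) * (weeks + weeks2)),
  g6 = count_params(~ (Sex + Mother + Genotype) * (weeks + weeks2)),
  g7 = count_params(~ (Sex + Mother * Genotype) * (weeks + weeks2)),
  g8 = count_params(~ (Sex * Mother + Sex * Genotype + Mother * Genotype) *
                      (weeks + weeks2)),
  g9 = count_params(~ (Sex * Mother * Genotype) * (weeks + weeks2))
)
print(df_check)
```

Comparing the printed counts with the Df column is a quick way to confirm which terms each step adds, for example that g7 differs from g6 only by the Mother:Genotype interaction and its products with weeks and weeks2.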
The ANOVA indicates that most models are significantly better than the
immediately preceding model, the exceptions being:
g6 vs g5 (p = 0.17, i.e. there is no overall effect of whether the offspring carry the KO)
g8 vs g7 (p = 0.011, which is only marginally significant after allowing for
multiple comparisons)
g9 vs g8 (p = 0.76)
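One way to make the allowance for multiple comparisons concrete is to adjust the p-values with p.adjust. The sketch below is an illustration only: treating the four comparisons g6 to g9 as the family of tests, and the choice of the Bonferroni and Holm methods, are assumptions made here, not choices stated in the original analysis.

```r
# p-values copied from the anova table for the last four comparisons.
p_raw <- c(g6_vs_g5 = 0.1692990,
           g7_vs_g6 = 0.0006204,
           g8_vs_g7 = 0.0106872,
           g9_vs_g8 = 0.7643500)

# Bonferroni multiplies each p-value by the number of tests (capped at 1);
# Holm is a step-down version that is never more conservative.
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_holm <- p.adjust(p_raw, method = "holm")

print(round(cbind(raw = p_raw, bonferroni = p_bonf, holm = p_holm), 4))
```

Under either adjustment the g7 vs g6 comparison remains clearly significant, while g8 vs g7 moves close to the 0.05 threshold.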
Conclusion: There are both maternal effects and imprinting effects; the former
are stronger than the latter.