Variance functions, or modelling heteroscedasticity

In earlier slides our models assumed constant variance (homoskedasticity).

[Figure: two scatterplots of y against x, with x running from -3 to 3. Left panel: residual variance constant with respect to x (homoskedasticity). Right panel: residual variance not constant with respect to x (heteroskedasticity).]

A simple case of heteroskedasticity

In the educational data set we have been working with, tabulating normexam by gender shows a mean of -0.140 and a variance of 1.051 for boys, and a mean of 0.093 and a variance of 0.940 for girls. We may want to fit a model that estimates separate variances for boys and girls. The notation we have been using so far assumes a common intercept ($\beta_0$) and a single set of student residuals, $e_i$, with a common variance $\sigma_e^2$. We need a more flexible notation to build this model.

Working with general notation in MLwiN

A model with no variables specified in general notation differs from the earlier notation in two ways.

A new first line is added stating that the response variable follows a Normal distribution. This gives us the flexibility to specify alternative distributions for the response. We will explore these models later.

The $\beta_0$ coefficient now has an explanatory variable $x_0$ associated with it. The values $x_0$ takes determine the meaning of the $\beta_0$ coefficient. If $x_0$ is a vector of 1s, then $\beta_0$ estimates an intercept common to all individuals; in the absence of other predictors this is the overall mean. If $x_0$ is a (0,1) variable, say 1 for boys and 0 for girls, then $\beta_0$ estimates the mean for boys.

A simple variance function

The new notation allows us to set up a simple model in which $x_{0i}$ is a dummy variable for boy and $x_{1i}$ is a dummy variable for girl. This model estimates separate means and variances for the two groups. It is an example of a variance function, because the variance changes as a function of explanatory variables.
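The two-group model described above can be sketched numerically. The following is a minimal illustration, not MLwiN output: it plugs the tabulated boy and girl means and variances in as if they were the model parameters, simulates responses, and checks that each group's sample variance matches the value the variance function returns for that group. The function names, the sample size and the seed are assumptions made for the sketch.

```python
import numpy as np

# Group summaries quoted in the text (normexam tabulated by gender).
MEAN = {"boy": -0.140, "girl": 0.093}
VAR = {"boy": 1.051, "girl": 0.940}


def variance_function(x0, x1, var_e0=VAR["boy"], var_e1=VAR["girl"]):
    """Level 1 variance given the boy dummy x0 and the girl dummy x1."""
    return var_e0 * x0 + var_e1 * x1


def simulate(n_per_group, seed=0):
    """Simulate responses with a separate mean and variance for each group."""
    rng = np.random.default_rng(seed)
    x0 = np.repeat([1, 0], n_per_group)  # 1 for boys, 0 for girls
    x1 = 1 - x0                          # 1 for girls, 0 for boys
    mean = MEAN["boy"] * x0 + MEAN["girl"] * x1
    sd = np.sqrt(variance_function(x0, x1))
    y = rng.normal(mean, sd)
    return x0, x1, y


x0, x1, y = simulate(50_000)
for label, mask in (("boys", x0 == 1), ("girls", x1 == 1)):
    print(f"{label}: mean = {y[mask].mean():.3f}, variance = {y[mask].var(ddof=1):.3f}")
```

Because the two dummies never equal 1 at the same time, the variance function simply switches between the two group variances.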
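The variance function for this model is

$$\operatorname{var}(y_i) = \sigma_{e0}^2 x_{0i} + \sigma_{e1}^2 x_{1i} \qquad (1)$$

Deriving the variance function

We arrive at expression (1) by taking the basic model

$$y_i = \beta_{0i} x_{0i} + \beta_{1i} x_{1i}, \qquad \beta_{0i} = \beta_0 + e_{0i}, \qquad \beta_{1i} = \beta_1 + e_{1i}$$

and rearranging it:

$$y_i = \beta_0 x_{0i} + \beta_1 x_{1i} + e_{0i} x_{0i} + e_{1i} x_{1i}$$

$$\begin{aligned}
\operatorname{var}(y_i) &= \operatorname{var}(e_{0i} x_{0i} + e_{1i} x_{1i}) \\
&= \operatorname{var}(e_{0i} x_{0i}) + 2\operatorname{cov}(e_{0i} x_{0i}, e_{1i} x_{1i}) + \operatorname{var}(e_{1i} x_{1i}) \\
&= \operatorname{var}(e_{0i})\, x_{0i}^2 + 2\operatorname{cov}(e_{0i}, e_{1i})\, x_{0i} x_{1i} + \operatorname{var}(e_{1i})\, x_{1i}^2 \\
&= \sigma_{e0}^2 x_{0i}^2 + 2\sigma_{e01} x_{0i} x_{1i} + \sigma_{e1}^2 x_{1i}^2
\end{aligned}$$

The covariance term $2\sigma_{e01} x_{0i} x_{1i}$ is zero because a student cannot be both a boy and a girl, so $x_{0i} x_{1i} = 0$. Also, $x_{0i}$ and $x_{1i}$ are (0,1) variables, so $x_{0i}^2 = x_{0i}$ and $x_{1i}^2 = x_{1i}$, and we arrive at (1).

Variance functions at level 2

The notion of a variance function is powerful and is not restricted to level 1 variances. We have already met a level 2 variance function. The random slopes model fitted earlier produces school level predictions in which school level variability increases with intake score.

The model

$$y_{ij} = \beta_{0ij} x_0 + \beta_{1j} x_{1ij}, \qquad \beta_{0ij} = \beta_0 + u_{0j} + e_{0ij}, \qquad \beta_{1j} = \beta_1 + u_{1j}$$

can be rewritten as

$$y_{ij} = \beta_0 x_0 + \beta_1 x_{1ij} + u_{0j} x_0 + u_{1j} x_{1ij} + e_{0ij} x_0$$

$$\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N(0, \Omega_u), \qquad \Omega_u = \begin{pmatrix} \sigma_{u0}^2 & \\ \sigma_{u01} & \sigma_{u1}^2 \end{pmatrix}, \qquad e_{0ij} \sim N(0, \sigma_e^2)$$

So the between school variance is

$$\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = E\big((u_{0j} x_0 + u_{1j} x_{1ij})^2\big) = \sigma_{u0}^2 x_0^2 + 2\sigma_{u01} x_0 x_{1ij} + \sigma_{u1}^2 x_{1ij}^2$$

Two views of the level 2 variance

Given $x_0 = 1$, we have

$$\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = \sigma_{u0}^2 x_0^2 + 2\sigma_{u01} x_0 x_{1ij} + \sigma_{u1}^2 x_{1ij}^2 = \sigma_{u0}^2 + 2\sigma_{u01} x_{1ij} + \sigma_{u1}^2 x_{1ij}^2$$

which shows that the level 2 variance is a polynomial (quadratic) function of $x_{1ij}$:

$$\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = a + b x_{1ij} + c x_{1ij}^2 = 0.9 + (2 \times 0.018)\, x_{1ij} + 0.015\, x_{1ij}^2$$

• View 1: in terms of school lines, with predicted intercepts and slopes varying across schools.
• View 2: in terms of a variance function, which shows how the level 2 variance changes as a function of one or more explanatory variables.

Elaborating the level 1 variance

Perhaps the variance of the student level departures around their school's summary line is not constant.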
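The level 2 variance function derived above can be evaluated directly. The sketch below is a minimal illustration that uses the three coefficients as quoted on the slide (0.9, 0.018 and 0.015); it computes the variance both as the quadratic form z' Ω z with z = (1, x1) and as the expanded polynomial, to show that the two agree. The function names and the grid of intake scores are assumptions made for the sketch.

```python
import numpy as np

# Level 2 covariance matrix of (u0j, u1j), using the values quoted above.
OMEGA_U = np.array([[0.9, 0.018],
                    [0.018, 0.015]])


def level2_variance(x1, omega=OMEGA_U):
    """var(u0j*x0 + u1j*x1) with x0 = 1, computed as z' Omega z for z = (1, x1)."""
    x1 = np.atleast_1d(np.asarray(x1, dtype=float))
    z = np.column_stack([np.ones_like(x1), x1])
    return np.einsum("ni,ij,nj->n", z, omega, z)


def level2_polynomial(x1, omega=OMEGA_U):
    """The same variance written as the polynomial a + b*x1 + c*x1**2."""
    x1 = np.atleast_1d(np.asarray(x1, dtype=float))
    return omega[0, 0] + 2.0 * omega[0, 1] * x1 + omega[1, 1] * x1 ** 2


grid = np.linspace(-3.0, 3.0, 7)
for x, v in zip(grid, level2_variance(grid)):
    print(f"x1 = {x:+.1f}   level 2 variance = {v:.3f}")
```

With these coefficients the variance rises across the positive range of intake scores, which matches the fanning out of the predicted school lines.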
[Figure: 2 schools, 2 students]

Note that at level 2 we have two interpretations of the random variation: random coefficients (slopes and intercepts varying across level 2 units) and variance functions. Each level 1 unit, by definition, contributes only one point, so the first interpretation does not exist at level 1: you cannot have a slope given a single data point.

Variance functions at level 1

If we allow standlrt ($x_{1ij}$) to have a random term at level 1, we get

$$y_{ij} = \beta_0 x_0 + \beta_1 x_{1ij} + u_{0j} x_0 + u_{1j} x_{1ij} + e_{0ij} x_0 + e_{1ij} x_{1ij}$$

$$\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim N(0, \Omega_u), \quad \Omega_u = \begin{pmatrix} \sigma_{u0}^2 & \\ \sigma_{u01} & \sigma_{u1}^2 \end{pmatrix}, \qquad \begin{pmatrix} e_{0ij} \\ e_{1ij} \end{pmatrix} \sim N(0, \Omega_e), \quad \Omega_e = \begin{pmatrix} \sigma_{e0}^2 & \\ \sigma_{e01} & \sigma_{e1}^2 \end{pmatrix}$$

So the student level variance is now

$$\operatorname{var}(e_{0ij} x_0 + e_{1ij} x_{1ij}) = \sigma_{e0}^2 x_0^2 + 2\sigma_{e01} x_0 x_{1ij} + \sigma_{e1}^2 x_{1ij}^2 = 0.533 + 2(-0.015)\, x_{1ij} + 0.001\, x_{1ij}^2$$

where the covariance estimate is taken as negative, consistent with the decreasing variance described next.

The resulting graph shows the level 1 variance decreasing with standlrt. This accentuates the importance of school level factors in driving variation in the outcome score, particularly for high ability pupils.

Modelling the mean and variance simultaneously

In our model

$$y_{ij} = \beta_0 x_0 + \beta_1 x_{1ij} + u_{0j} x_0 + u_{1j} x_{1ij} + e_{0ij} x_0 + e_{1ij} x_{1ij}$$

with $(u_{0j}, u_{1j}) \sim N(0, \Omega_u)$ and $(e_{0ij}, e_{1ij}) \sim N(0, \Omega_e)$ as above:

• The global mean is predicted by $\beta_0 x_0 + \beta_1 x_{1ij}$.
• The jth school mean is predicted by $(\beta_0 + u_{0j}) x_0 + (\beta_1 + u_{1j}) x_{1ij}$.
• The student level variance is $\operatorname{var}(e_{0ij} x_0 + e_{1ij} x_{1ij}) = \sigma_{e0}^2 + 2\sigma_{e01} x_{1ij} + \sigma_{e1}^2 x_{1ij}^2$.
• The school level variance is $\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = \sigma_{u0}^2 + 2\sigma_{u01} x_{1ij} + \sigma_{u1}^2 x_{1ij}^2$.

Ordinary regression, by contrast,

$$y_i = \beta_0 + \beta_1 x_{1i} + e_i, \qquad e_i \sim N(0, \sigma_e^2)$$

estimates only the global relationship and has a single catch-all bucket for the variance.

Opening up new types of research question

The multilevel approach allows the mean and the variance to be modelled simultaneously. We illustrate this with an analysis exploring the sources of differential parenting: why do parents treat siblings differently?

Understanding the sources of differential parenting: the role of child and family level effects. Jenny Jenkins, Jon Rasbash and Tom O'Connor. Developmental Psychology 2003(1) 99-113.

Is there a family effect?
Recent studies in developmental psychology and behavioural genetics emphasise that non-shared environment and genetic influences are much more important than shared environment in explaining children's adjustment. This has led to a focus on non-shared environment (Plomin et al., 1994; Turkheimer & Waldron, 2000).

Differential parental treatment

• One key aspect of the non-shared environment that has been investigated is differential parental treatment of siblings.
• Differential treatment predicts differences in sibling adjustment.
• What are the sources of differential treatment?
• Child specific/non-shared: age, temperament, biological relatedness.
• Can family level shared environmental factors influence differential treatment?

The Stress/Resources Hypothesis

Do family contexts (shared environment) increase or decrease the extent to which children within the same family are treated differently?

"Parents have a finite amount of resources in terms of time, attention, patience and support to give their children. In families in which most of these resources are devoted to coping with economic stress, depression and/or marital conflict, parents may become less consciously or intentionally equitable and more driven by preferences or child characteristics in their childrearing efforts." (Henderson et al., 1996)

This is the hypothesis we wish to test. We operationalised the stress/resources hypothesis using four contextual variables: socioeconomic status, single parenthood, large family size, and marital conflict.

A multilevel analysis

We need a model for the mean and a model for the variability around the mean. With $y_{ij}$ the positive parenting score for child $i$ in family $j$:

$$y_{ij} = \beta_0 + u_j + e_{ij}, \qquad u_j \sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2)$$

• $\beta_0$ is the overall mean.
• Family means vary around the overall mean (between family variance, $\sigma_u^2$).
• Child specific parenting scores vary around their family mean (between child, within family variance, $\sigma_e^2$).

The within family variance is a measure of differential parental treatment.
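The variance components model above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the parameter values, family counts and function names are hypothetical, the data are balanced (the same number of children in every family), and the estimates come from simple one-way ANOVA moments, not from the estimation procedure used in MLwiN.

```python
import numpy as np


def simulate_families(n_families, n_children, beta0, var_u, var_e, seed=0):
    """Simulate y_ij = beta0 + u_j + e_ij for a balanced set of families."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, np.sqrt(var_u), size=(n_families, 1))           # family effects
    e = rng.normal(0.0, np.sqrt(var_e), size=(n_families, n_children))  # child effects
    return beta0 + u + e


def variance_components(y):
    """Moment estimates of the overall mean, between and within family variance."""
    n_children = y.shape[1]
    within = y.var(axis=1, ddof=1).mean()  # average within family variance
    # The variance of family means overstates var_u by var_e / n_children.
    between = y.mean(axis=1).var(ddof=1) - within / n_children
    return y.mean(), between, within


# Hypothetical parameter values chosen only for illustration.
y = simulate_families(n_families=20_000, n_children=3, beta0=10.0, var_u=2.0, var_e=1.5)
mean, between, within = variance_components(y)
print(f"overall mean = {mean:.2f}, between family = {between:.2f}, within family = {within:.2f}")
```

The within family estimate is the quantity read here as differential parental treatment: the larger it is, the more the children of one family differ in the parenting they receive.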
Modelling the mean and variance simultaneously

The schematic diagram shows a possible pattern of how the mean, the within family variance and the between family variance might behave as functions of HSES.

[Figure: schematic of positive parenting against HSES]

Here are 5 families of increasing HSES (in the actual data set there are 3900 families). We can fit a linear function of SES to the mean. The family means now vary around the dashed trend line. This is now the between family variation, which is fairly constant with respect to HSES. However, the within family variation (the measure of differential parenting) decreases with HSES. This supports the stress/resources (SR) hypothesis.

Full combined model for mean and variance

$$\begin{aligned}
y_{ij} = {} & \beta_0 + \beta_1\,\text{age}_{ij} + \beta_2\,\text{age}^2_{ij} + \beta_3\,\text{girl}_{ij} + \beta_4\,\text{notBioM}_{ij} + \beta_5\,\text{notBioF}_{ij} + \beta_6\,\text{oldestSib}_{ij} + \beta_7\,\text{midSib}_{ij} \\
& + \beta_8\,\text{hses}_j + \beta_9\,\text{famsize}_j + \beta_{10}\,\text{loneParent}_j + \beta_{11}\,\text{allGirls}_j + \beta_{12}\,\text{mixedGender}_j + \beta_{13}\,\text{maritalprb}_j \\
& + \beta_{14}\,(\text{famsize} \times \text{age})_{ij} + u_j + e_{ij}
\end{aligned}$$

$$u_j \sim N(0, \sigma_u^2)$$

• We then allow the level 1 variance to be a function of the family level variables household socioeconomic status, large family size, and marital conflict. That is,

$$e_{ij} \sim N(0, \sigma_{ej}^2)$$

$$\sigma_{ej}^2 = w_0 + 2w_1\,\text{hses}_j + w_2\,\text{hses}_j^2 + 2w_3\,\text{maritalprb}_j + 2w_4\,(\text{maritalprb} \times \text{hses})_j + 2w_5\,\text{familysize}_j$$

Estimates, with standard errors in parentheses and signs shown as they appear in the source:

$$\hat{w}_0 = 1.84\ (0.1), \quad \hat{w}_1 = 0.23\ (0.04), \quad \hat{w}_2 = 0.17\ (0.07), \quad \hat{w}_4 = 0.29\ (0.13), \quad \hat{w}_5 = 0.11\ (0.05)$$

The reduction in the deviance with 7 df is 78.

Graphically …

[Figure: differential parenting (vertical axis, 1 to 5) against household SES (horizontal axis, -2.0 to 2.0), with four lines: family size = 2, no marital problems; family size = 2, marital problems; family size > 2, marital problems; family size > 2, no marital problems.]

Conclusion for differential parenting

• We have found strong support for the stress/resources hypothesis. That is, although differential parenting is a child specific factor that drives differential adjustment, differential parenting itself is influenced by family level as well as child specific factors.
• This challenges the current tendency in developmental psychology and behavioural genetics to focus on child specific factors.
• Multilevel models that fit complex level 1 variation need to be employed to uncover these relationships.
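What "complex level 1 variation" means in practice can be sketched with simulated data. The example below is an illustration only: the coefficients W0, W1 and W2 are hypothetical (they are not the estimates reported above), only a quadratic in a single family level variable is used, and the variance function is recovered by regressing each family's sample variance on the family covariate. That moment-based shortcut stands in for the likelihood-based estimation a multilevel package would use, and is named here as a substitute, not as the method behind the reported results.

```python
import numpy as np

# Hypothetical coefficients for the sketch (not the estimates reported above).
W0, W1, W2 = 1.8, -0.2, 0.1


def within_variance(hses, w0=W0, w1=W1, w2=W2):
    """Within family variance as a quadratic function of family level hses."""
    return w0 + 2.0 * w1 * hses + w2 * hses ** 2


def simulate(n_families=20_000, n_children=3, var_u=1.0, seed=0):
    """Simulate family effects plus child effects whose variance depends on hses."""
    rng = np.random.default_rng(seed)
    hses = rng.uniform(-2.0, 2.0, size=n_families)
    u = rng.normal(0.0, np.sqrt(var_u), size=(n_families, 1))
    sd_e = np.sqrt(within_variance(hses))[:, None]
    e = rng.normal(0.0, 1.0, size=(n_families, n_children)) * sd_e
    return hses, u + e


def fit_variance_function(hses, y):
    """Least squares fit of family sample variances on (1, 2*hses, hses**2)."""
    s2 = y.var(axis=1, ddof=1)  # unbiased for each family's within variance
    design = np.column_stack([np.ones_like(hses), 2.0 * hses, hses ** 2])
    coef, *_ = np.linalg.lstsq(design, s2, rcond=None)
    return coef


hses, y = simulate()
w0_hat, w1_hat, w2_hat = fit_variance_function(hses, y)
print(f"w0 = {w0_hat:.2f}, w1 = {w1_hat:.2f}, w2 = {w2_hat:.2f}")
```

A model with a single constant level 1 variance would average over this pattern, which is why the variance itself has to be modelled to see how differential parenting changes with family context.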