Variance functions or modelling heteroscedasticity

In earlier slides our models assumed constant residual variance (homoskedasticity).
[Figure: two scatter plots of y against x, with x running from -3 to 3. Left panel: residual variance constant with respect to x (homoskedasticity). Right panel: residual variance not constant with respect to x (heteroskedasticity).]
A simple case of heteroskedasticity
In the educational data set we have been working with, tabulating normexam by gender shows that the mean and variance are -0.140 and 1.051 for boys and 0.093 and 0.940 for girls.
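As a minimal sketch of this kind of tabulation, the snippet below simulates scores with the means and variances quoted above and then tabulates them by gender. The real normexam data are not reproduced here, so the group sizes, the seed and the Normal draws are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed group sizes; the means and variances are those quoted above.
groups = {
    "boy": {"n": 20000, "mean": -0.140, "var": 1.051},
    "girl": {"n": 20000, "mean": 0.093, "var": 0.940},
}

def tabulate(groups, rng):
    """Simulate scores per group and return {group: (mean, variance)}."""
    table = {}
    for name, g in groups.items():
        y = rng.normal(g["mean"], np.sqrt(g["var"]), size=g["n"])
        table[name] = (y.mean(), y.var(ddof=1))
    return table

table = tabulate(groups, rng)
for name, (m, v) in table.items():
    print(f"{name}: mean={m:.3f} variance={v:.3f}")
```

The two sample variances differ, which is the simplest form of heteroskedasticity: the residual variance depends on a categorical explanatory variable.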
We may want to fit a model that estimates separate variances for boys and girls. The notation we have been using so far assumes a common intercept ($\beta_0$) and a single set of student residuals, $e_i$, with a common variance $\sigma_e^2$. We need a more flexible notation to build this model.
Working with general notation in MLwiN
A model with no variables specified in
general notation looks like this.
A new first line is added stating that the response variable follows a
Normal distribution. We now have the flexibility to specify
alternative distributions for our response. We will explore these
models later.
The $\beta_0$ coefficient now has an explanatory variable $x_0$ associated with it. The values $x_0$ takes determine the meaning of the $\beta_0$ coefficient. If $x_0$ is a vector of 1s then $\beta_0$ will estimate an intercept common to all individuals; in the absence of other predictors this would be the overall mean. If $x_0$ is a dummy variable, say 1 for boys and 0 for girls, then $\beta_0$ will estimate the mean for boys.
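The role of $x_0$ can be checked numerically. The sketch below fits the one-coefficient model by least squares, first with $x_0$ as a vector of 1s and then with $x_0$ as a boy dummy. The data are simulated stand-ins (an assumption), and `fit_single_coefficient` is a hypothetical helper, not part of MLwiN.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated stand-in for the exam data (assumption: 1 = boy, 0 = girl).
boy = rng.integers(0, 2, size=5000)
y = np.where(boy == 1, -0.14, 0.09) + rng.normal(0.0, 1.0, size=5000)

def fit_single_coefficient(x0, y):
    """Least-squares estimate of beta_0 in y = beta_0 * x0 + e (no other terms)."""
    design = np.asarray(x0, dtype=float).reshape(-1, 1)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0]

beta_const = fit_single_coefficient(np.ones_like(y), y)  # x0 = vector of 1s
beta_boy = fit_single_coefficient(boy, y)                # x0 = boy dummy

print(beta_const, y.mean())           # estimate next to the overall mean
print(beta_boy, y[boy == 1].mean())   # estimate next to the mean for boys
```

In each case the estimate coincides with the corresponding sample mean, because the least-squares solution for a single 0/1 column is the average of the responses where that column equals 1.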
A simple variance function
The new notation allows us to set up this simple model, where $x_{0i}$ is a dummy variable for boy and $x_{1i}$ is a dummy variable for girl. This model estimates separate means and variances for the two groups. This is an example of a variance function because the variance changes as a function of explanatory variables. The function is:

$$\operatorname{var}(y_i) = \sigma_{e0}^2 x_{0i} + \sigma_{e1}^2 x_{1i}$$
Deriving the variance function
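We arrive at the expression

$$\operatorname{var}(y_i) = \sigma_{e0}^2 x_{0i} + \sigma_{e1}^2 x_{1i} \qquad (1)$$

by taking the basic model

$$y_i = \beta_{0i} x_{0i} + \beta_{1i} x_{1i}$$
$$\beta_{0i} = \beta_0 + e_{0i}$$
$$\beta_{1i} = \beta_1 + e_{1i}$$

and rearranging it:

$$y_i = \beta_0 x_{0i} + \beta_1 x_{1i} + e_{0i} x_{0i} + e_{1i} x_{1i}$$

$$\operatorname{var}(y_i) = \operatorname{var}(e_{0i} x_{0i} + e_{1i} x_{1i}) = \operatorname{var}(e_{0i} x_{0i}) + 2\operatorname{cov}(e_{0i} x_{0i}, e_{1i} x_{1i}) + \operatorname{var}(e_{1i} x_{1i})$$

$$= \operatorname{var}(e_{0i})\, x_{0i}^2 + 2\operatorname{cov}(e_{0i}, e_{1i})\, x_{0i} x_{1i} + \operatorname{var}(e_{1i})\, x_{1i}^2 = \sigma_{e0}^2 x_{0i}^2 + 2\sigma_{e01} x_{0i} x_{1i} + \sigma_{e1}^2 x_{1i}^2$$

The term in $\sigma_{e01}$ vanishes because a student cannot be both a boy and a girl, so $x_{0i} x_{1i} = 0$. Also, $x_{0i}$ and $x_{1i}$ are (0,1) variables, so $x_{0i}^2 = x_{0i}$ and $x_{1i}^2 = x_{1i}$, and we arrive at (1).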
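The reduction from the general quadratic form to expression (1) can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, the two variances are the ones tabulated earlier for boys and girls, and the covariance value is deliberately arbitrary to show that it has no effect.

```python
def level1_variance(x0, x1, var_e0, var_e1, cov_e01=0.0):
    """General quadratic form: var_e0*x0^2 + 2*cov_e01*x0*x1 + var_e1*x1^2."""
    return var_e0 * x0**2 + 2.0 * cov_e01 * x0 * x1 + var_e1 * x1**2

def simple_variance(x0, x1, var_e0, var_e1):
    """Simplified form (1): var_e0*x0 + var_e1*x1, valid for 0/1 dummies."""
    return var_e0 * x0 + var_e1 * x1

var_boy, var_girl = 1.051, 0.940  # variances quoted earlier for boys and girls

# (x0, x1) = (1, 0) for a boy and (0, 1) for a girl.
for label, (x0, x1) in {"boy": (1, 0), "girl": (0, 1)}.items():
    # The covariance is arbitrary here: its term is multiplied by x0*x1 = 0.
    general = level1_variance(x0, x1, var_boy, var_girl, cov_e01=0.3)
    simple = simple_variance(x0, x1, var_boy, var_girl)
    print(label, general, simple)
```

For both dummy codings the two forms agree, whatever value is passed for the covariance.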
Variance functions at level 2
The notion of variance functions is powerful and not restricted to level 1 variances; we have already met a level 2 variance function.
The random slopes model fitted earlier produces school level predictions which show school level variability increasing with intake score.
The model

$$y_{ij} = \beta_{0ij} x_0 + \beta_{1j} x_{1ij}$$
$$\beta_{0ij} = \beta_0 + u_{0j} + e_{0ij}$$
$$\beta_{1j} = \beta_1 + u_{1j}$$

can be rewritten as

$$y_{ij} = \beta_0 x_0 + \beta_1 x_{1ij} + u_{0j} x_0 + u_{1j} x_{1ij} + e_{0ij} x_0$$

$$\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim N(0, \Omega_u), \qquad \Omega_u = \begin{bmatrix} \sigma_{u0}^2 & \sigma_{u01} \\ \sigma_{u01} & \sigma_{u1}^2 \end{bmatrix}$$

$$e_{0ij} \sim N(0, \sigma_e^2)$$

So the between school variance is

$$\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = E\big((u_{0j} x_0 + u_{1j} x_{1ij})^2\big) = \sigma_{u0}^2 x_0^2 + 2\sigma_{u01} x_0 x_{1ij} + \sigma_{u1}^2 x_{1ij}^2$$
Two views of the level 2 variance
Given $x_0 = [1]$, we have

$$\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = \sigma_{u0}^2 x_0^2 + 2\sigma_{u01} x_0 x_{1ij} + \sigma_{u1}^2 x_{1ij}^2 = \sigma_{u0}^2 + 2\sigma_{u01} x_{1ij} + \sigma_{u1}^2 x_{1ij}^2$$

which shows that the level 2 variance is a polynomial function of $x_{1ij}$:

$$\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = a + b x_{1ij} + c x_{1ij}^2 = 0.9 + (2 \times 0.018)\, x_{1ij} + 0.015\, x_{1ij}^2$$

• View 1: in terms of school lines, with predicted intercepts and slopes varying across schools.
• View 2: in terms of a variance function, which shows how the level 2 variance changes as a function of one or more explanatory variables.
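View 2 can be made concrete by evaluating the variance function directly. The sketch below writes the level 2 variance as the quadratic form z' Ω z with z = (1, x1), using the three estimates quoted on this slide (0.9, 0.018 and 0.015). The function name and the grid of standlrt values are assumptions for illustration.

```python
import numpy as np

# Level 2 covariance matrix, using the estimates quoted on the slide.
omega_u = np.array([[0.9, 0.018],
                    [0.018, 0.015]])

def level2_variance(x1, omega=omega_u):
    """Between-school variance z' Omega z with z = (1, x1), i.e. x0 = 1."""
    x1 = np.asarray(x1, dtype=float)
    z = np.stack([np.ones_like(x1), x1], axis=-1)
    return np.einsum("...i,ij,...j->...", z, omega, z)

standlrt = np.linspace(-3, 3, 7)
for x, v in zip(standlrt, level2_variance(standlrt)):
    print(f"standlrt={x:+.1f}  level 2 variance={v:.3f}")
```

The quadratic form and the polynomial a + b x + c x² are the same function; the matrix version simply generalises more easily when further random coefficients are added.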
Elaborating the level 1 variance
Maybe the student level departures around their school's summary line do not have constant variance.

[Figure: 2 schools, 2 students.]

Note that at level 2 we have two interpretations of the random variation: random coefficients (slopes and intercepts varying across level 2 units) and variance functions. Each level 1 unit, by definition, contributes only one point, so the first interpretation does not exist at level 1: you cannot have a slope given a single data point.
Variance functions at level 1
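If we allow standlrt ($x_{1ij}$) to have a random term at level 1, we get

$$y_{ij} = \beta_0 x_0 + \beta_1 x_{1ij} + u_{0j} x_0 + u_{1j} x_{1ij} + e_{0ij} x_0 + e_{1ij} x_{1ij}$$

$$\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim N(0, \Omega_u), \qquad \Omega_u = \begin{bmatrix} \sigma_{u0}^2 & \sigma_{u01} \\ \sigma_{u01} & \sigma_{u1}^2 \end{bmatrix}$$

$$\begin{bmatrix} e_{0ij} \\ e_{1ij} \end{bmatrix} \sim N(0, \Omega_e), \qquad \Omega_e = \begin{bmatrix} \sigma_{e0}^2 & \sigma_{e01} \\ \sigma_{e01} & \sigma_{e1}^2 \end{bmatrix}$$

So the student level variance is now:

$$\operatorname{var}(e_{0ij} x_0 + e_{1ij} x_{1ij}) = \sigma_{e0}^2 x_0^2 + 2\sigma_{e01} x_0 x_{1ij} + \sigma_{e1}^2 x_{1ij}^2 = 0.533 - (2 \times 0.015)\, x_{1ij} + 0.001\, x_{1ij}^2$$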
The resulting graph shows the level 1 variance decreasing with standlrt. This accentuates the importance of school level factors in driving variation in the outcome score, particularly for high ability pupils.
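The point about high ability pupils can be sketched numerically by computing the share of the total variance that lies at school level for different values of standlrt. The level 2 values (0.9, 0.018, 0.015) and level 1 values (0.533, 0.015, 0.001) are those quoted on the slides. Taking the level 1 covariance as negative is an assumption, made so that the level 1 variance decreases with standlrt as described. The function names are hypothetical.

```python
import numpy as np

def variance_function(x1, var0, cov01, var1):
    """Quadratic variance function var0 + 2*cov01*x1 + var1*x1^2 (x0 = 1)."""
    x1 = np.asarray(x1, dtype=float)
    return var0 + 2.0 * cov01 * x1 + var1 * x1**2

def school_share(x1):
    """Share of total variance at school level for a given standlrt."""
    # Level 2 estimates as quoted on the earlier slide.
    level2 = variance_function(x1, 0.9, 0.018, 0.015)
    # Level 1 estimates; the negative covariance is an assumption (see lead-in).
    level1 = variance_function(x1, 0.533, -0.015, 0.001)
    return level2 / (level1 + level2)

for x in (-2.0, 0.0, 2.0):
    print(f"standlrt={x:+.1f}  school share={float(school_share(x)):.3f}")
```

Under these assumptions the school share rises with standlrt, because the level 2 variance grows while the level 1 variance shrinks.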
Modelling the mean and variance simultaneously
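In our model

$$y_{ij} = \beta_0 x_0 + \beta_1 x_{1ij} + u_{0j} x_0 + u_{1j} x_{1ij} + e_{0ij} x_0 + e_{1ij} x_{1ij}$$

$$\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim N(0, \Omega_u), \qquad \Omega_u = \begin{bmatrix} \sigma_{u0}^2 & \sigma_{u01} \\ \sigma_{u01} & \sigma_{u1}^2 \end{bmatrix}$$

$$\begin{bmatrix} e_{0ij} \\ e_{1ij} \end{bmatrix} \sim N(0, \Omega_e), \qquad \Omega_e = \begin{bmatrix} \sigma_{e0}^2 & \sigma_{e01} \\ \sigma_{e01} & \sigma_{e1}^2 \end{bmatrix}$$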
The global mean is predicted by

$$\beta_0 x_0 + \beta_1 x_{1ij}$$

The jth school mean is predicted by

$$(\beta_0 + u_{0j})\, x_0 + (\beta_1 + u_{1j})\, x_{1ij}$$

The student level variance is

$$\operatorname{var}(e_{0ij} x_0 + e_{1ij} x_{1ij}) = \sigma_{e0}^2 + 2\sigma_{e01} x_{1ij} + \sigma_{e1}^2 x_{1ij}^2$$

The school level variance is

$$\operatorname{var}(u_{0j} x_0 + u_{1j} x_{1ij}) = \sigma_{u0}^2 + 2\sigma_{u01} x_{1ij} + \sigma_{u1}^2 x_{1ij}^2$$

Whereas ordinary regression,

$$y_i = \beta_0 + \beta_1 x_{1i} + e_i, \qquad e_i \sim N(0, \sigma_e^2)$$

estimates only the global relationship and has a single catch-all bucket for the variance.
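As a rough check on the two variance functions, the sketch below simulates the random part of the model at a fixed value of x1 and compares the empirical variance with the sum of the level 1 and level 2 variance functions. The covariance matrices use the estimates quoted on the slides, with the level 1 covariance taken as negative (an assumption). Each draw is treated as a student from a different school, which is enough to recover the marginal variance. The function names, seed and sample size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)

# Covariance matrices built from the estimates quoted on the slides;
# the negative level 1 covariance is an assumption.
omega_u = np.array([[0.9, 0.018], [0.018, 0.015]])
omega_e = np.array([[0.533, -0.015], [-0.015, 0.001]])

def simulate_residual_variance(x1, n=200_000):
    """Empirical variance of u0 + u1*x1 + e0 + e1*x1 at a fixed x1."""
    u = rng.multivariate_normal(np.zeros(2), omega_u, size=n)
    e = rng.multivariate_normal(np.zeros(2), omega_e, size=n)
    total = u[:, 0] + u[:, 1] * x1 + e[:, 0] + e[:, 1] * x1
    return total.var()

def theoretical_variance(x1):
    """Sum of the level 2 and level 1 variance functions at x1."""
    z = np.array([1.0, x1])
    return z @ omega_u @ z + z @ omega_e @ z

for x1 in (-1.0, 0.0, 1.5):
    print(x1, simulate_residual_variance(x1), theoretical_variance(x1))
```

The fixed part is left out because it shifts the mean but does not change the variance. An ordinary regression would summarise all of this with one constant.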
MM Opening up new types of research question
A multilevel approach allows the mean and the variance to be modelled simultaneously.
We illustrate this with an analysis exploring the sources of differential parenting: why do parents treat siblings differently?
Understanding the sources of differential parenting: the role of child
and family level effects. Jenny Jenkins, Jon Rasbash and Tom
O’Connor Developmental Psychology 2003(1) 99-113
Is there a family effect?
Recent studies in developmental psychology and behavioural genetics emphasise that non-shared environment and genetic influences are much more important in explaining children's adjustment than shared environment. This has led to a focus on non-shared environment (Plomin et al, 1994; Turkheimer & Waldron, 2000).
Differential parental treatment
• One key aspect of the non-shared environment that has been investigated is differential parental treatment of siblings.
• Differential treatment predicts differences in sibling adjustment.
• What are the sources of differential treatment?
• Child specific/non-shared: age, temperament, biological relatedness.
• Can family level shared environmental factors influence differential treatment?
The Stress/Resources Hypothesis
Do family contexts (shared environment) increase or decrease the extent to which children within the same family are treated differently?
“Parents have a finite amount of resources in terms of time,
attention, patience and support to give their children. In families in
which most of these resources are devoted to coping with
economic stress, depression and/or marital conflict, parents may
become less consciously or intentionally equitable and more driven
by preferences or child characteristics in their childrearing efforts”.
Henderson et al 1996.
This is the hypothesis we wish to test. We operationalised the stress/resources hypothesis using four contextual variables: socioeconomic status, single parenthood, large family size, and marital conflict.
A multilevel analysis
A model for the mean and a model for the variability around the mean.
$$y_{ij} = \beta_0 + u_j + e_{ij}, \qquad u_j \sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2)$$

[Figure: positive parenting scores, showing the overall mean, the family means and the child specific scores.]

• Overall mean.
• Family means (between family variance).
• Child specific parenting scores vary around the family mean (between child within family variance). The within family variance is a measure of differential parental treatment.
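To see how the two variance components separate, the sketch below simulates families from this variance components model and recovers the within and between family variances with simple moment estimators. All numerical values (variances, number of families, children per family) are hypothetical and are not estimates from the study. A multilevel package would normally estimate these by maximum likelihood rather than by the shortcut used here.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical values for illustration only (not estimates from the study).
beta0, var_u, var_e = 0.0, 0.5, 1.0
n_families, n_children = 3000, 3

u = rng.normal(0.0, np.sqrt(var_u), size=(n_families, 1))           # family effects
e = rng.normal(0.0, np.sqrt(var_e), size=(n_families, n_children))  # child effects
y = beta0 + u + e

family_means = y.mean(axis=1)
# Within family variance: pooled variance of children around their family mean.
within = ((y - family_means[:, None]) ** 2).sum() / (n_families * (n_children - 1))
# Between family variance: variance of family means minus the sampling noise in them.
between = family_means.var(ddof=1) - within / n_children

print(f"within family variance  ~ {within:.2f}")
print(f"between family variance ~ {between:.2f}")
```

The within family estimate is the quantity interpreted here as differential parental treatment. In this basic model it is a single constant for all families.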
Modelling the mean and variance simultaneously
We show a possible pattern of how the mean, the within family variance and the between family variance might behave as functions of household SES (HSES) in the schematic diagram below.

[Figure: schematic of positive parenting against HSES for 5 families, with a dashed trend line for the mean.]

Here are 5 families of increasing HSES (in the actual data set there are 3900 families).
We can fit a linear function of SES to the mean.
The family means now vary around the dashed trend line. This is now the between family variation, which is roughly constant with respect to HSES.
However, the within family variation (the measure of differential parenting) decreases with HSES; this supports the stress/resources hypothesis.
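Full Combined model for mean and variance

$$\begin{aligned}
y_{ij} ={}& \beta_{0j} + \beta_{1j}\,\text{age}_{ij} + \beta_2\,\text{age}^2_{ij} + \beta_3\,\text{girl}_{ij} + \beta_4\,\text{notBioM}_{ij} \\
&+ \beta_5\,\text{notBioF}_{ij} + \beta_6\,\text{oldestSib}_{ij} + \beta_7\,\text{midSib}_{ij} \\
&+ \beta_7\,\text{hses}_j + \beta_8\,\text{famsize}_j + \beta_9\,\text{loneParent}_j + \beta_{10}\,\text{allGirls}_j \\
&+ \beta_{11}\,\text{mixedGender}_j + \beta_{12}\,\text{maritalprb}_j + \beta_{13}\,\text{famsize*age} \\
&+ u_j + e_{ij}
\end{aligned}$$

$$u_j \sim N(0, \sigma_u^2)$$

• We then allow the level 1 variance to be a function of the family level variables household socioeconomic status, large family size, and marital conflict. That is,

$$e_{ij} \sim N(0, \sigma_{ej}^2)$$

$$\sigma_{ej}^2 = w_0 + 2w_1\,\text{hses}_j + w_2\,\text{hses}_j^2 + 2w_3\,\text{marital.prb}_j + 2w_4\,\text{maritalprb.ses}_j + 2w_5\,\text{familysize}_j$$

$$\hat{w}_0 = 1.84\,(0.1) \qquad \hat{w}_1 = 0.23\,(0.04) \qquad \hat{w}_2 = 0.17\,(0.07) \qquad \hat{w}_4 = 0.29\,(0.13) \qquad \hat{w}_5 = 0.11\,(0.05)$$

Reduction in the deviance with 7df is 78.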
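The level 1 variance function above can be written as a small Python function to show how the within family variance changes with family context. This is a sketch under stated assumptions: `maritalprb.ses` is taken to be the product of the marital problems indicator and hses, family size is taken to be a 0/1 indicator for a large family, and the coefficient values below are hypothetical placeholders rather than the estimates from the analysis.

```python
def within_family_variance(hses, marital_prb, large_family, w):
    """Level 1 variance w0 + 2*w1*hses + w2*hses^2 + 2*w3*marital_prb
    + 2*w4*marital_prb*hses + 2*w5*large_family.

    Assumes maritalprb.ses is the product marital_prb * hses.
    """
    return (w[0]
            + 2.0 * w[1] * hses
            + w[2] * hses**2
            + 2.0 * w[3] * marital_prb
            + 2.0 * w[4] * marital_prb * hses
            + 2.0 * w[5] * large_family)

# Hypothetical coefficients for illustration only (w0..w5).
w = [1.0, -0.1, 0.05, 0.1, 0.05, 0.05]

for hses in (-2.0, 0.0, 2.0):
    small_calm = within_family_variance(hses, 0, 0, w)
    large_conflict = within_family_variance(hses, 1, 1, w)
    print(f"hses={hses:+.1f}  small/no conflict={small_calm:.2f}  "
          f"large/conflict={large_conflict:.2f}")
```

Each combination of family size and marital problems traces out its own curve against household SES.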
Graphically …

[Figure: differential parenting (vertical axis, 1 to 5) plotted against household ses (horizontal axis, -2.0 to 2.0), with one line for each of four groups: family size = 2, no marital problems; family size = 2, marital problems; family size > 2, marital problems; family size > 2, no marital problems.]
Conclusion for differential parenting
• We have found strong support for the stress/resources hypothesis. That is, although differential parenting is a child specific factor that drives differential adjustment, differential parenting itself is influenced by family as well as child specific factors.
• This challenges the current tendency in developmental psychology and behavioural genetics to focus on child specific factors.
• Multilevel models fitting complex level 1 variation need to be employed to uncover these relationships.