An Introduction to Hierarchical Linear Modeling

advertisement
An Introduction to Hierarchical Linear Modeling
The data are those described in the following article:
 Singer, J. D. (1998). Using SAS proc mixed to fit multilevel models, hierarchical models, and
individual growth models. Journal of Educational and Behavioral Statistics, 24, 323-355.
There are data for 7,185 students (Level 1) in 160 schools (Level 2). I shall use MathAch as
the Level 1 outcome variable.
Download data file in XLS format . Boot up SAS and import the data. When the Import Wizard
asks you to name the imported member, name it HLM.
Model 1: Unconditional Means, Intercepts Only
After SAS has imported the data, submit this program which will estimate parameters for a
model that includes only the outcome variable and intercepts.
title 'Model 1: Unconditional Means Model, Intercepts Only';
options formdlim='-' pageno=min nodate;
proc mixed data = covtest noclprint;
class School;
model MathAch = / solution;
random intercept / subject = School;
run;
Level 1 Equation. Yij   0 j  eij . That is, the score of the ith case in the jth school is due to the
intercept for the jth school and error for the the ith case in the jth school. Although I usually use “a” for
the intercept, here I use “β0” for the intercept. 00 will be an estimate of the variance in the school
intercepts – the more the schools differ in mean MathAch, the greater this variance should be.
Level 2 Equation.  0 j   00  0 j . That is, the school intercepts are due to the average
intercept across schools plus the effect (on the intercept) of being in school j.
Combined Equation. Substitute  00  0 j (from the Level 2 equation) for  0 j in the Level 1
equation and you get Yij   00  0 j  eij .
HLM-Intro.docx
2
Fixed Effects. These are effects that are constant across schools. They are specified in the
model statement (see the SAS code above). Since no variable follows the “=” sign in
“model MathAch =/”, the only fixed parameter is the average intercept across schools, which SAS
automatically includes in the model. This effect is symbolized with “  00 ” in the boxed equations.
Remember that the outcome variable is MathAch.
Random Effects. These are effects that vary across schools, 0j and eij. I shall estimate their
variance.
Look at the Output. Under “Solution for Fixed Effects” we find an estimate of the average
intercept across schools, 12.637. That it differs significantly from zero is of no interest (unless zero is
a meaningful point on the scale of the outcome variable).
Under “Covariance Parameter Estimates,” the Random Effects, you see that the variance in
intercepts across schools is estimated to be 8.6097 and it is significantly greater than zero (this is a
one-tailed test, since a variance cannot be less than 0 unless you have been drinking too much).
This tells us that the schools differ significantly in intercepts (means). The error variance (differences
among students within schools) is estimated to be 39.1487, also significantly greater than zero.
Intraclass Correlation. We can use this coefficient to estimate the proportion of the variance
in MathAch that is due to differences among schools. To get this coefficient we simply take the
estimated variance due to schools and divide by the sum of that same variance plus the error
variance, that is, 8.6097 / (8.6097 + 39.1487) = 18%.
Model 2: Including a Level 2 Predictor in the Model
I shall use MeanSES (the mean SES by school) as a predictor of MathAch. MeanSES has
been centered/transformed to have a mean of zero (by subtracting the grand mean from each score).
Level 1 Equation. Same as before.
Level 2 Equation.  0 j   00   01MeanSES j  0 j . That is, the school intercepts/means are
due to the average intercept across schools, the effect of being in a school with the MeanSES of
school j, and the effect of everything else (error, extraneous variables) on which school j differs from
the other schools.
Combined Equation. Substituting the right hand side of the Level 2 equation into the Level 1
equation, we get Yij  [ 00   01MeanSES j ]  [  0 j  e ij ] . The parameters within the brackets on the
left are fixed, those on the right are random.
SAS Code. Add this code to your program and submit it (highlight just this code before you
click the running person).
title 'Model 2: Including Effects of School (Level 2) Predictors';
title2 '-- predicting MathAch from MeanSES'; run;
proc mixed covtest noclprint;
class school;
model MathAch = MeanSES / solution ddfm = bw;
random intercept / subject = school;
run;
Notice the addition of “ddfm = bw;”. This results in SAS using the “between/within” method of
computing denominator df for tests of fixed effects. Why do this – because Singer says so. 
Look at the Output, Fixed Effects. Under “Solution for Fixed Effects,” we see that the
equation for predicting MathAch is 12.6495 + 5.8635*(School MeanSES – Grand MeanSES) –
remember that MeanSES is centered about zero. That is, for each one point increase in a school’s
3
MeanSES, MathAch rises by 5.9 points. For a school with average MeanSES, the predicted
MathACH would be the intercept, 12.6.
Grab your calculator and divide the estimated slope for MeanSES by its standard error,
retaining all decimal points. Square the resulting value of t. You will get the value of F reported
under “Type 3 Tests of Fixed Effects.”
Notice that the df for the fixed effect of MeanSES = 158 (number of schools minus 2). Without
the “ddfm = bw” the df would have been 7025. The t distribution is not much different with 7025 df
than with 158 df, so this really would not have much mattered.
Look at the Output, Random Effects. The value of the covariance parameter estimate for
the (error) variance within schools has changed little, but that for the difference in intercepts/means
across schools has decreased dramatically, from 8.6097 to 2.6357, a reduction of 5.974. That is,
MeanSES explains 5.974/8.6097 = 69% of the differences among schools in MathAch.
Even after accounting for variance explained by MeanSES, the MathAch scores differ
significantly across schools (z = 6.53). Our estimate of this residual variance is 2.6357. Add to that
our estimate of (error) variance among students within schools (39.1578) and we have 41.7935 units
of variance not yet explained. Of that not yet explained variance, 2.6357/41.7935 = 6.3% remains
available to be explained by some other (not yet introduced into the model) Level 2 predictor. Clearly
most of the variance not yet explained is within schools, that is, at Level 1 – so lets introduce a Level
1 predictor in our next model.
Model 3: Including a Level 1 Predictor in the Model
Suppose that instead of entering MeanSES into the model I entered SES, the socio-economicstatus of individual students.
Level 1 Equation. Yij   0 j   1 j SESij  eij . That is, a student’s score is due to the
intercept/mean for his/her school, the within-school effect of SES (these slopes may differ
across schools), and error. To facilitate interpretation, I shall subtract from each student’s SES
score the mean SES score for the school in which that student is enrolled. Thus,
Yij  0 j  1j (SESij  MSES j )  eij . In the SAS code this centered SES is represented by the
variable “cSES.”
Level 2 Equations. Each random effect (excepting error within schools) will require a
separate Level 2 equation. Here I need one for the random intercept and one for the random slope.
For the random intercept,  0 j   00  0 j . That is, the intercept for school j is the sum of the
grand intercept across schools and the effect (on intercept) of being in school j.
For the random slope, 1 j   10  1 j . That is, the slope for predicting MathAch from SES is, in
school j, the grand slope (across all groups) and the effect (on slope) of being in school j.
Combined Equation. Substituting the right hand expressions in the Level 2 equations for the
corresponding elements in the Level 1 equation yields
Yij  [ 00   10 (SESij  McSES j )  eij ]  [ 0 j  1j (SESij  MSES j )  eij ] . The fixed effects are within the
brackets on the left, the random effects to the right.
SAS Code. Add this code to your program and submit it.
title 'Model 3: Including Effects of Student-Level Predictors';
title2 '--predicting MathAch from cSES';
data HLM2; set HLM;
run;
cSES = SES - MeanSES;
4
proc mixed data = hsbc noclprint covtest noitprint;
class School;
model MathAch = cSES / solution ddfm = bw notest;
random intercept cSES / subject = School type = un;
run;
Note the computation of cSES, student SES centered about the mean SES for the student’s
school. Just as “noclprint” suppresses the printing of class information, “noitprint” suppresses printing
of information about the iterations. “Type = un” indicates you are imposing no structure, allowing all
parameters to be determined by the data.
Look at the Output, Fixed Effects. Under “Solution for Fixed Effects,” see that the estimated
MathAch for a student whose SES is average for his or her school is 12.6493. The average slope,
across schools, for predicting MathAch from SES is 2.1932, which is significantly different from zero.
Look at the Output, Random Effects. Under “Covariance Parameter Estimates” we see that
the “UN(1,1)” estimate is 8.6769 and is significantly greater than zero. This is an estimate of the
variance (across schools) for the first parameter, the intercept. That it is significantly greater than
zero tells us that there remains variance, across schools, in MathAch, even after controlling for cSES.
The UN(2,1) estimates the covariance between the first parameter and the second, that is,
between the school intercepts and school slopes. This (with a two-tailed test) falls well short of
significance. There is no evidence that the slope for predicting MathAch from cSES depends on a
school’s average value of MathAch.
The UN(2,2) estimates the variance in the second parameter, cSES. The estimated variance,
.694, is significantly greater than zero. In other words, the slope for predicting MathAch from cSES
differs across schools.
The unconditional means model (the first model) estimated the within-schools variance in
MathAch to be 39.1487. Our most recent model shows that within-schools variance is 36.7006 after
taking out the effect of cSES. Accordingly, cSES accounted for 39.1487-36.7006 – 2.4481 units of
variance, or 2.4881/39.1487 = 6.25% of the within-school variance.
Model 4: A Model with Predictors at Both Levels and All Interactions
Here I add to the model the variable sector, where 0 = public school and 1 = Catholic school.
Notice in the SAS code that the model also includes interactions among predictors. More on this
later.
SAS Code.
title 'Model 4: Model with Predictors From Both Levels and Interactions';
proc mixed noclprint covtest noitprint;
class School;
model mathach = MeanSES sector cSES MeanSES*Sector
MeanSES*cSES Sector*cSES MeanSES*Sector*cSES
/ solution ddfm = bw notest;
random intercept cSES / subject = School type = un;
run;
Look at the Output, Fixed Effects. MeanSES x Sector and MeanSES x Sector x cSES are
not significant. Without further comment I shall drop these from the model.
Model 5: A Model with Predictors at Both Levels and Selected Interactions
I provide more comment on this model.
Level 1 Equation. Yij   0 j  1 j cSES  eij .
5
Level 2 Equations.
For the random intercept,  0 j   00   01MeanSESj   02Sector j  0 j . That is, the
intercept/mean for a school’s MathAch is due to the grand mean, the effect of MeanSES, the effect of
Sector, and the effect of being in school j.
For the random slope, 1 j   10   11MeanSESj   12Sector j  1 j . That is, the slope for
predicting MathAch from SES, in school j, is affected by the grand slope (across all schools), the
effect of being in a school with the MeanSES that school j has, the effect of being in Catholic school,
and the effect of everything else on which the schools differ.
Combined Equation.
Yij  [ 00   01MeanSESj   02Sector j   10 cSES   11MeanSESj cSES j   12Sector j cSES j ] 
[ 0 j  1 j cSES j  eij ] . Aren’t you glad you remember that algebra you learned in ninth grade?
SAS Code. Add this code to your program and submit it
title 'Model 5: Model with Two Interactions Deleted';
title2 '--predicting mathach from meanses, sector, cses and ';
title3 'cross level interaction of meanses and sector with cses'; run;
proc mixed noclprint covtest noitprint;
class School;
model MathAch = MeanSES Sector cSES MeanSES*cSES Sector*cSES
/ solution ddfm = bw notest;
random intercept cSES / subject = School type = un;
proc means mean q1 q3 min max skewness kurtosis; var MeanSES Sector cSES; run;
Look at the Output, Fixed Effects. All of the fixed effects are significant. Sector is new to this
model. The main effect of sector tells us that a one point increase in sector is associated with a 1.2
point increase in MathAch. Since public schools were coded with “0” and Catholic schools with “1,”
this means that higher MatchAch is associated with the schools being Catholic. Keep in mind that
this is “above and beyond” other effects in the model. Also new to this model are the interactions with
cSES. The MeanSES x cSES interaction indicates that the slopes for predicting MathAch from cSES
differ across levels of MeanSES. The Sector x cSES interaction indicates that the slopes for
predicting MathAch from cSES differ between public and Catholic schools. Note that Singer reported
that she tested for a MeanSES x Sector interaction and a MeanSES x cSES x Sector interaction but
found them not to be significant.
I created separate regressions equations for the public and the Catholic schools by substituting
“0” and “1” for the values of sector. For the public schools, that yields MathAch = 12.11 +
5.34(MeanSES) + 1.22(0) + 2.94(cSES) + 1.04(MeanSES)(cSES) - 1.64(cSES)(0). For the Catholic
schools that yields MathAch = 12.11 + 5.34(MeanSES) + 1.22(1) + 2.94(cSES) +
1.04(MeanSES*cSES) - 1.64(cSES)(0). These simplify to:
 Public:
12.11 + 5.34(MeanSES) + 2.94(cSES) + 1.04(MeanSES)(cSES)
 Catholic: 13.33 + 5.34(MeanSES) + 1.30(cSES) + 1.04(MeanSES)(cSES)
As you can see, MathAch is significantly higher in the Catholic schools and the effect of cSES on
MathAch is significantly greater in the public schools.
The MeanSES x cSES interaction can be illustrated by preparing a plot of the relationship
between MathAch and cSES at each of two or three levels of MeanSES – for example, when
MeanSES is its first quartile, its second quartile, and its third quartile. Italassi could be used to
illustrate this interaction interactively, but it is hard to move that slider in the published article.
6
At the mean for sector (.493), MathAch = 12.11 + 5.34(MeanSES) + 1.22(.493) + 2.94(cSES)
+ 1.04(MeanSES)(cSES) - 1.64(cSES)(.493) =
12.71 + 5.34(MeanSES) + 2.13(cSES) + 1.04(MeanSES)(cSES).
At Q1 for MeanSES (-.32), MathAch = 12.71 -1.71 + 2.13(cSES) - 0.32(cSES) =
11.00 + 1.81(cSES).
At Q2 for MeanSES (.038), MathAch = 12.71 + .03 + 2.13(cSES) + .038(cSES) =
12.74 + 2.17(cSES).
At Q3 for MeanSES (.33), MathAch = 12.71 +1.76 + 2.13(cSES) + 0.33(cSES) =
14.47 + 2.47(cSES).
For each of these conditional regressions I shall predict MatchAch at two values of cSES (-3
and +3) and then produce an overlay plot with the three lines.
Here is the table of predicted values:
MeanSES
cSES
-3
+3
Difference
Q1
5.57
16.43
10.86
Q2
6.23
19.25
13.02
Q3
7.06
21.88
14.82
MathAch
Here is a plot of the relationship between cSES and MathAch at each of three levels of
MeanSES. Notice that the slope increases as MeanSES increases.
MeanSES=Q1
MeanSES=Q2
MeanSES=Q3
cSES
Look at the Output, Random Effects. The estimate for the difference in intercepts across
schools, UN (1,1) remains significant, but now the estimate for the differences across schools in slope
(for predicting MathAch from cSES), UN (2,2), is small and not significant, as is the estimate for the
covariance between intercepts and slopes, UN (2,1). Perhaps I should trim the model, removing
those components.
7
Model 6: Trimmed
In this model I remove the random effect of cSES slopes (and thus also the interaction
between those slopes and the intercepts). Because there is only one random effect, I no longer need
to use “type = un.”
SAS Code.
title 'Model 6: Simpler Model Without cSES Slopes';
proc mixed noclprint covtest noitprint;
class School;
model MathAch = MeanSES Sector cSES MeanSES*cSES Sector*cSES / solution ddfm = bw
notest;
random intercept / subject = School;
run;
data p;
df = 2; p = 1-probchi(1.1,2);
proc print data = pvalue noobs; run;
Look at the Output. As before, all the fixed effects are significant.
Fit Statistics
-2 Res Log Likelihood
AIC (smaller is better)
AICC (smaller is better)
BIC (smaller is better)
46503.7
46511.7
46511.7
46524.0
-2 Res Log Likelihood
AIC (smaller is better)
AICC (smaller is better)
BIC (smaller is better)
46504.8
46508.8
46508.8
46514.9
Trimming the model has increased the log likelihood statistic by 1.1, indicating slightly poorer
fit. We can evaluate the significance of this change with a Chi-square on 2 df (one df for each
parameter trimmed, the slopes and the Slope x Intercept interaction). As you can see, deleting those
two parameters has not significantly affected the fit of the model to the data.




SAS Results
HLM Cheat Sheet – for symbols commonly used in HLM equations
Return to Wuensch’s Stats Lessons Page
Resources for Hierarchical Linear Modeling
Karl L. Wuensch, East Carolina Univ., Dept. of Psychology, Greenville, NC 27858, USA
October, 2013
Download