Multilevel Modeling - KU School of Medicine–Wichita

Multilevel Modeling: Why, When, and How?
Frank Dong
1-9-2013
Outline
• Why do we need multilevel modeling?
• When do we need multilevel modeling?
• How can we conduct a multilevel modeling
analysis (live demo)?
Background
• Everyone knows about ordinary least squares (OLS)
regression, also known as linear regression
• The formula is
$y = \alpha + x\beta + \varepsilon$
• We typically assume the error term $\varepsilon$ has a
normal distribution $N(0, \sigma_\varepsilon^2)$
• Everyone knows how to do it in SPSS
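For readers who prefer to see the arithmetic, the simple regression above can be sketched in a few lines of Python. This is only an illustration of the closed-form least squares solution; the function name `fit_ols` and the toy data are hypothetical and do not come from the presentation.

```python
from statistics import mean

def fit_ols(x, y):
    """Return (alpha, beta) minimising the squared residuals of y = alpha + beta * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)                        # spread of x
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))  # co-variation of x and y
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta

# Hypothetical toy data lying exactly on the line y = 2 + 3x
x = [0.0, 1.0, 2.0, 3.0]
y = [2.0, 5.0, 8.0, 11.0]
alpha, beta = fit_ols(x, y)
print(alpha, beta)
```

Because the toy points sit exactly on a line, the fitted intercept and slope recover that line; real data add the error term, which is what the normality assumption describes.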
Problems
• Ordinary least squares analysis does not solve
everything
• Data often have a hierarchical structure
• For example, students' test scores may depend on
the students themselves, but also on the schools
they attend
• School effects are often ignored
Purpose of this presentation
• To introduce the idea of multilevel modeling
• Not everything can be done with linear
regression
• Live demonstration of how to conduct
multilevel analysis in SPSS.
An example
• This example is from a book called Multilevel
Statistical Models, 4th Edition by Harvey Goldstein
• We have data on 728 elementary school students
• The students come from 50 schools
• Interested in the following question: Does the
student’s 8-year math score predict the 11-year
math score?
• Y= 11-year math score
• X=8-year math score
Some data points
11-year      8-year       School ID   Gender            Social class
Math Score   Math Score               (Boy=1, Girl=0)   (Manual=1, Non-manual=0)
39           36           1           1                 0
11           19           1           0                 1
32           31           1           0                 1
27           23           1           0                 0
36           39           1           0                 0
Inappropriate Analysis
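• For each school, $y_i = \alpha + \beta x_i + \varepsilon_i$
• The overall model, for student $i$ in school $j$, becomes
$y_{ij} = \alpha_j + \beta_j x_{ij} + \varepsilon_{ij}$
• We have 50 pairs of $(\alpha_j, \beta_j)$ to estimate, one
for each school
• We also have a variance term, $\sigma_\varepsilon^2$, to estimate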
Issues
• Too many unknown parameters ($2 \times 50 + 1 = 101$)
• We cannot readily compare school performance if we
wish to do so
• Some schools have fewer students than others, so
their estimates are less precise
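A small sketch may make these issues concrete. It fits a separate regression line to each school, which is the approach described as inappropriate above. The rows and the helper `fit_ols` are hypothetical; the point is only that every school adds two parameters and that a school with very few students gives an unstable or undefined fit.

```python
from collections import defaultdict
from statistics import mean

def fit_ols(x, y):
    """Return (alpha, beta) for y = alpha + beta * x, or None if the slope is not identifiable."""
    if len(x) < 2:
        return None
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        return None
    beta = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - beta * x_bar, beta

# Hypothetical rows: (school_id, 8-year score, 11-year score)
rows = [
    (1, 20, 25), (1, 30, 33), (1, 36, 37),
    (2, 18, 20), (2, 28, 31),
    (3, 25, 30),  # a school with a single student
]

by_school = defaultdict(list)
for school, x, y in rows:
    by_school[school].append((x, y))

# One (alpha_j, beta_j) pair per school: two parameters for every school
fits = {}
for school, pairs in by_school.items():
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    fits[school] = fit_ols(xs, ys)

for school, fit in sorted(fits.items()):
    print(school, fit)
```

In this toy data the third school has one student, so no line can be fitted at all, and the second school's line is determined entirely by two points.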
Solutions
• Multilevel Modeling
• Instead of estimating $2 \times 50 + 1 = 101$ unknown
parameters, we simplify the model
• Original model: $y_{ij} = \alpha_j + \beta_j x_{ij} + \varepsilon_{ij}$
• More importantly, $\alpha_j$ and $\beta_j$ are also treated
as random variables
• They are assumed to follow a normal
distribution with a certain mean and standard deviation
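To illustrate what treating the school intercepts and slopes as random variables means, the following sketch simulates data from such a model. Everything here is an assumption made for illustration: the function name `simulate_schools`, all numeric values, the equal number of students per school, and the simplification that intercepts and slopes are drawn independently (the model in the slides also allows a covariance between them).

```python
import random

def simulate_schools(n_schools, students_per_school, alpha0, beta0,
                     sd_alpha, sd_beta, sd_eps, seed=0):
    """Simulate (school, x, y) rows where each school draws its own intercept and slope."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_schools):
        # School-level draws: alpha_j and beta_j vary around the overall mean values
        alpha_j = rng.gauss(alpha0, sd_alpha)
        beta_j = rng.gauss(beta0, sd_beta)
        for _ in range(students_per_school):
            x = rng.uniform(10, 40)                         # hypothetical 8-year score
            y = alpha_j + beta_j * x + rng.gauss(0, sd_eps)  # student-level error
            rows.append((j, x, y))
    return rows

# Hypothetical settings: 50 schools with 15 students each
rows = simulate_schools(50, 15, alpha0=14.0, beta0=0.65,
                        sd_alpha=1.8, sd_beta=0.1, sd_eps=4.4, seed=1)
print(len(rows), rows[0])
```

Only the means and spreads of the school-level distributions need to be estimated, rather than a separate intercept and slope for every school.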
Final Solution
• The final model becomes
• $y_{ij} = \alpha_0 + \beta_0 x_{ij} + \mu_{0j} + e_{1j} x_{ij} + \varepsilon_{0ij}$
• The unknown parameters are $\alpha_0$, $\beta_0$, the variances
of $\mu_{0j}$, $e_{1j}$, and $\varepsilon_{0ij}$, and the covariance between
$\mu_{0j}$ and $e_{1j}$
• We reduced the number of parameters from
101 to 6
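The step from the original model to the final model can be written out as follows. This is a sketch of the usual substitution; the variance and covariance symbols ($\sigma_{\mu 0}^2$, $\sigma_{e1}^2$, $\sigma_{\mu 0, e1}$) are notation assumed here for illustration and do not appear in the slides. The student-level error $\varepsilon_{0ij}$ is taken to be the same quantity written $\varepsilon_{ij}$ in the original model.

```latex
% School-level equations: each school deviates from the overall intercept and slope
\alpha_j = \alpha_0 + \mu_{0j}, \qquad \beta_j = \beta_0 + e_{1j}

% Substitute into the original model y_{ij} = \alpha_j + \beta_j x_{ij} + \varepsilon_{ij}
y_{ij} = (\alpha_0 + \mu_{0j}) + (\beta_0 + e_{1j})\, x_{ij} + \varepsilon_{0ij}
       = \alpha_0 + \beta_0 x_{ij} + \mu_{0j} + e_{1j} x_{ij} + \varepsilon_{0ij}

% Distributional assumptions (assumed notation)
\begin{pmatrix} \mu_{0j} \\ e_{1j} \end{pmatrix}
\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_{\mu 0}^2 & \sigma_{\mu 0, e1} \\ \sigma_{\mu 0, e1} & \sigma_{e1}^2 \end{pmatrix} \right),
\qquad \varepsilon_{0ij} \sim N(0, \sigma_\varepsilon^2)

% Six parameters in total:
% \alpha_0,\ \beta_0,\ \sigma_{\mu 0}^2,\ \sigma_{e1}^2,\ \sigma_{\mu 0, e1},\ \sigma_\varepsilon^2
```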
Results
Parameter                          Multilevel Modeling   OLS
                                   Estimate (s.e.)       Estimate (s.e.)
Fixed
  Intercept                        13.9                  13.8
  8-year Math Score                0.65 (0.025)          0.65 (0.026)
Random Effect
  Between School Variance          3.28
  Between Students Variance        19.8                  23.34
  Variance Partition Coefficient   0.14
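The variance partition coefficient reported above is the share of the total variance that lies between schools. A minimal check of that arithmetic, using the two multilevel variance estimates from the results (the function name is hypothetical):

```python
def variance_partition_coefficient(between_school_var, between_student_var):
    """Share of total variance attributable to differences between schools."""
    return between_school_var / (between_school_var + between_student_var)

vpc = variance_partition_coefficient(3.28, 19.8)
print(round(vpc, 2))  # 0.14
```

In other words, roughly 14% of the variation in 11-year math scores, after accounting for the 8-year score, is associated with the school a student attends.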
Research Question 2
• We also have gender (1=boy, 0=girl) and
social class (1=manual, 0=non-manual). Do
these two variables affect the 11-year
math score?
• Is gender significant?
• Is social class significant?
Parameters                              Multilevel Modeling   OLS Modeling
                                        Estimate (s.e.)       Estimate (s.e.)
Fixed Effects
  Intercept                             14.88                 14.79
  8-year Math Score                     0.638 (0.025)         0.638 (0.026)
  Gender (boy vs girl)                  -0.357 (0.340)        -0.363 (0.358)
  Social Class (manual vs non-manual)   -0.720 (0.387)        -0.697 (0.397)
Random Effect
  Between School Variance               3.312
  Between Students Variance             19.728                49.36
  Variance Partition Coefficient        0.144
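One rough way to read the significance questions off these results is a Wald test: divide each estimate by its standard error and compare the ratio with a standard normal distribution. The sketch below does this for the multilevel estimates of gender and social class. The large-sample normal approximation is an assumption made here, not something stated in the presentation, and `wald_test` is a hypothetical helper.

```python
import math

def wald_test(estimate, std_error):
    """Return (z, two-sided p-value) using a standard normal reference distribution."""
    z = estimate / std_error
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Multilevel estimates and standard errors taken from the results for this question
multilevel = {
    "gender": (-0.357, 0.340),
    "social class": (-0.720, 0.387),
}

results = {name: wald_test(est, se) for name, (est, se) in multilevel.items()}
for name, (z, p) in results.items():
    print(f"{name}: z = {z:.2f}, p = {p:.3f}")
```

Both ratios are smaller than 1.96 in absolute value, so under this approximation neither gender nor social class is significant at the 5% level, although social class comes close.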
How to conduct a Multilevel Modeling
• You do not need to do it yourself
• You do need to be aware that multilevel
modeling exists
• The benefit is improved accuracy of the
estimates
• Here is how to do it in SPSS (live demo)
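The live demo itself is not reproduced here. As a rough illustration of why the estimates improve, the sketch below shows the shrinkage that a random-intercept model applies to each school's raw effect: schools with few students are pulled more strongly toward the overall average. It assumes a random-intercept-only model with known variances and reuses the variance estimates from the first results table (3.28 and 19.8); the function name and the example school sizes are hypothetical.

```python
def shrunken_school_effect(raw_effect, n_students,
                           between_school_var, between_student_var):
    """Shrink a school's raw mean residual toward zero according to its sample size."""
    # Weight given to the school's own data; it approaches 1 as n_students grows
    weight = (n_students * between_school_var) / (
        n_students * between_school_var + between_student_var
    )
    return weight * raw_effect

# Two hypothetical schools whose students score 2 points above prediction on average
small_school = shrunken_school_effect(2.0, 5, 3.28, 19.8)
large_school = shrunken_school_effect(2.0, 30, 3.28, 19.8)
print(small_school, large_school)
```

The small school's estimated effect ends up well below 2 while the large school's stays closer to it, which reflects how much information each school contributes.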
Summary
• Ordinary least squares regression is not
a cure-all
• When the data have a clear hierarchical structure,
multilevel modeling is useful
• Multilevel modeling can also be used to
compare the performance of hospitals