Adjusting for Unequal Selection Probability in Multilevel Models: 2005

2005
Adjusting for Unequal Selection
Probability in Multilevel Models:
A Comparison of Software Packages
by
Kim Chantala
C. M. Suchindran
Dan Blanchette
Overview
• Compare capabilities of multilevel modeling software
packages for analyzing data collected with a complex
sampling plan
• Describe characteristics of survey data that can
influence estimates
• Construct sampling weights for estimating multilevel
models
• Contrast results from estimating a two-level model
with different software packages
Comparison of Software Packages: General Information
Features compared: SEM Analysis, MLM Analysis, Adjust for Clustering, Adjust for Stratification
Packages compared: MPLUS 3.1, LISREL 8.7, GLLAMM (Stata 8), MLWIN 1.1, HLM 6.0, MIXED (SAS 8.2), NLMIXED (SAS 8.2)
Comparison of Software Packages: Implementation of Sampling Weights
Package | Allow MLM Sampling Weights | Method for Scaling MLM Sampling Weights | Responsibility for Scaling MLM Sampling Weights
MPLUS 3.1 | | Asparouhov (2004) | User
LISREL 8.7 | | Pfeffermann (1998) | User
GLLAMM (Stata 8) | | Pfeffermann (1998) | User
MLWIN 1.1 | | Pfeffermann (1998) | User or MLWIN default
HLM 6.0 | | Normalize | HLM default
MIXED (SAS 8.2) | | Unknown | User
NLMIXED (SAS 8.2) | | Grilli and Pratesi (2004) | User
Comparison of Software Packages: MLM Analyses with Sampling Weights
Outcome types compared: Normal, Binary, Poisson, Multinomial Categorical, Ordered Categorical
Packages compared: MPLUS 3.1, LISREL 8.7, GLLAMM (Stata 8), MLWIN 1.1, HLM 6.0, MIXED (SAS 8.2), NLMIXED (SAS 8.2)
Survey Data Characteristics: Design of Add Health
• 80 High Schools selected with probability proportional to size from a list of 26,666 schools sorted by:
  • Enrollment Size
  • Region of Country
  • School Type
  • Location
  • Percent White
• 52 High Schools did not include a 7th or 8th grade. For each, a Feeder School was selected with probability proportional to the percentage of the high school's entering class that came from the feeder school.
• 80 High Schools + 52 Feeder Schools = 132 schools
• 18,924 Students selected from 132 schools for Wave I In-Home Interview:
  • Core Sample
  • Disabled Sample
  • All Students from 16 Schools
  • Ethnic Samples: High SES Black, Cuban, Puerto Rican, Chinese
  • Genetic Samples: Twins, Full siblings, Half siblings, Unrelated in Same HH
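To make "selected with probability proportional to size from a sorted list" concrete, here is a minimal Python sketch of one common way to do it: systematic PPS selection. The slide does not say which selection algorithm Add Health used, so the systematic scheme, the function name `pps_systematic_sample`, and the toy enrollment figures are all assumptions for illustration.

```python
import random


def pps_systematic_sample(sizes, n, rng):
    """Pick n units with probability proportional to size (systematic PPS).

    sizes: positive integer size measures, already in the frame's sort order.
    Assumes no unit is larger than the sampling interval, so no unit can be
    hit twice. Returns the indices of the selected units.
    """
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0
    t = 0
    for idx, size in enumerate(sizes):
        cumulative += size
        # A unit is selected when a target falls inside its cumulative span.
        while t < n and targets[t] < cumulative:
            selected.append(idx)
            t += 1
    return selected


# Hypothetical frame: enrollment of 8 schools, sorted by size (illustrative only).
sizes = [120, 300, 450, 800, 950, 1200, 1500, 2000]
n = 3
total = sum(sizes)

selected = pps_systematic_sample(sizes, n, random.Random(0))

# Inclusion probability of school j is n * size_j / total; the base
# school-level weight is its inverse.
inclusion_prob = [n * s / total for s in sizes]
base_weight = {j: 1.0 / inclusion_prob[j] for j in selected}
print(selected, base_weight)
```

Sorting the frame before systematic selection spreads the sample across the sort variables, which is one reason a design might sort by size, region, type, location, and composition.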
Constructing Multilevel Weights
Weight Components Needed to Construct Sampling Weights for Two-Level Analysis using the Add Health Data:
Level 1
  Unit Interviewed: Adolescent i enrolled in School j
  Weight Component*: fsu_wt_{i|j}
  Meaning of Weight Component: Number of adolescents enrolled in school j with the same characteristics as adolescent i.
Level 2
  Unit Interviewed: School j
  Weight Component*: psu_wt_j
  Meaning of Weight Component: Number of schools in the U.S. with the same characteristics as school j.
* Stata programs for constructing sampling weights for estimating two-level models can be downloaded from our website (http://www.cpc.unc.edu/restools/data_analysis/ml_sampling_weights) after August 1, 2005. These programs implement methods from Pfeffermann (1998) and Asparouhov (2004).
Some MLM Software Packages Require Special Weights* Constructed for Each Level:
Level 2 (School) weight: psu_m2wt_j
Level 1 (Adolescent) weight:
$$fsu\_m2wt_{i|j} = \frac{fsu\_wt_{i|j}}{\left(\sum_{i}^{n_j} fsu\_wt_{i|j}\right) / n_j}$$
[Figure: two histograms of Frequency by Midpoint, one for Level 2 (School) Weights and one for Level 1 (Adolescent) Weights]
*Method of weight construction from Pfeffermann (1998)
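The level-specific scaling can be sketched in a few lines of Python. This is a minimal illustration, not the authors' Stata program: it divides each adolescent's weight fsu_wt by the mean fsu_wt in that school, so the scaled weights in a school sum to the school's sample size n_j. Passing the school weight through unchanged as psu_m2wt is an assumption here, and the function name `pwigls_method2`, the record layout, and the toy data are hypothetical.

```python
from collections import defaultdict


def pwigls_method2(records):
    """Scale level-1 weights so they sum to the sample size within each school.

    records: list of dicts with keys 'school', 'fsu_wt', 'psu_wt'.
    Returns new dicts with 'fsu_m2wt' and 'psu_m2wt' added.
    """
    total_wt = defaultdict(float)   # sum of fsu_wt within each school
    n_j = defaultdict(int)          # number of sampled adolescents per school
    for r in records:
        total_wt[r["school"]] += r["fsu_wt"]
        n_j[r["school"]] += 1

    scaled = []
    for r in records:
        j = r["school"]
        mean_wt = total_wt[j] / n_j[j]
        scaled.append({
            **r,
            "fsu_m2wt": r["fsu_wt"] / mean_wt,
            "psu_m2wt": r["psu_wt"],   # assumption: level-2 weight left unscaled
        })
    return scaled


# Toy data (illustrative values only).
sample = [
    {"school": "A", "fsu_wt": 10.0, "psu_wt": 100.0},
    {"school": "A", "fsu_wt": 20.0, "psu_wt": 100.0},
    {"school": "A", "fsu_wt": 30.0, "psu_wt": 100.0},
    {"school": "B", "fsu_wt": 5.0, "psu_wt": 250.0},
    {"school": "B", "fsu_wt": 15.0, "psu_wt": 250.0},
]
scaled_sample = pwigls_method2(sample)
for row in scaled_sample:
    print(row["school"], row["fsu_m2wt"], row["psu_m2wt"])
```

After scaling, the level-1 weights carry only the relative differences among adolescents within a school, while the school weight carries the between-school differences.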
Other MLM Software Packages require one Weight* that combines the weights from each level in a particular way:
$$mpml\_wta_{i,j} = \frac{fsu\_wt_{i|j} \times psu\_wt_j}{\left(\sum_{i}^{n_j} fsu\_wt_{i|j}\right) / n_j}$$
[Figure: histogram of the combined weight, Frequency by Midpoint]
*Method of weight construction from Asparouhov (2004)
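A single combined weight of this kind can be sketched as follows. The sketch multiplies the adolescent weight by the school weight and divides by the mean adolescent weight in the school. It illustrates the arithmetic only. The function name `mpml_method_a`, the record layout, and the toy values are hypothetical and are not taken from the authors' programs.

```python
from collections import defaultdict


def mpml_method_a(records):
    """Return one combined weight per adolescent.

    combined = fsu_wt * psu_wt / (mean fsu_wt within the school)
    records: list of dicts with keys 'school', 'fsu_wt', 'psu_wt'.
    """
    total_wt = defaultdict(float)   # sum of fsu_wt within each school
    n_j = defaultdict(int)          # number of sampled adolescents per school
    for r in records:
        total_wt[r["school"]] += r["fsu_wt"]
        n_j[r["school"]] += 1

    combined = []
    for r in records:
        j = r["school"]
        mean_wt = total_wt[j] / n_j[j]
        combined.append(r["fsu_wt"] * r["psu_wt"] / mean_wt)
    return combined


# Toy data (illustrative values only).
students = [
    {"school": "A", "fsu_wt": 10.0, "psu_wt": 100.0},
    {"school": "A", "fsu_wt": 20.0, "psu_wt": 100.0},
    {"school": "A", "fsu_wt": 30.0, "psu_wt": 100.0},
    {"school": "B", "fsu_wt": 5.0, "psu_wt": 250.0},
    {"school": "B", "fsu_wt": 15.0, "psu_wt": 250.0},
]
mpml_wta = mpml_method_a(students)
print(mpml_wta)
```

Within a school the combined weights sum to n_j times the school weight, so each school contributes in proportion to both its sample size and its school-level weight.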
Illustrative Example
• Research Question: How is the effect of hours watching
TV on BMI of students in a school influenced by the
availability of a school recreation center?
• Data from the National Longitudinal Study of Adolescent
Health (Add Health)
• Contrast the results from MPLUS, MIXED, LISREL,
MLWIN, and GLLAMM
• Weights for MPLUS and MIXED will be constructed with
the Asparouhov (2004) method; weights for LISREL,
MLWIN, and GLLAMM will be constructed with the
Pfeffermann (1998) method.
Data in example
Level | Variable | Meaning
School | RC_S | School has on-site recreation facility, 0=No, 1=Yes
Individual | BMIPCT | Percentile BMI for age and sex of adolescent
Individual | HR_WATCH | Hours watched TV, played video or computer games during past week
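Two-level Model
• Student-level model (Within or Level 1):
$BMIPCT_{ij} = \{\beta_{0j} + \beta_{1j}(HR\_WATCH_{ij})\} + e_{ij}$
where:
$E(e_{ij}) = 0$, $Var(e_{ij}) = \sigma^2$
• School-level Model (Between or Level 2):
$\beta_{0j} = \gamma_{00} + \gamma_{01}(RC\_S)_j + \delta_{0j}$
$\beta_{1j} = \gamma_{10} + \gamma_{11}(RC\_S)_j + \delta_{1j}$
where:
$E(\delta_{0j}) = E(\delta_{1j}) = 0$
$Var(\delta_{0j}) = \sigma^2_0$, $Var(\delta_{1j}) = \sigma^2_1$, $Cov(\delta_{0j}, \delta_{1j}) = \sigma_{0,1}$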
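Substituting the school-level equations into the student-level equation gives a single combined equation, which makes the cross-level interaction behind the research question explicit. The symbol names are assumptions of this sketch: β for the school-specific intercept and slope, γ for the fixed effects (subscripts 00, 01, 10, 11), and δ for the school-level random effects.

```latex
\begin{aligned}
\mathrm{BMIPCT}_{ij}
  &= \gamma_{00} + \gamma_{01}\,\mathrm{RC\_S}_j
   + \gamma_{10}\,\mathrm{HR\_WATCH}_{ij}
   + \gamma_{11}\,\mathrm{RC\_S}_j\,\mathrm{HR\_WATCH}_{ij} \\
  &\quad + \delta_{0j} + \delta_{1j}\,\mathrm{HR\_WATCH}_{ij} + e_{ij}
\end{aligned}
```

In this form the coefficient with subscript 11 measures how much the TV-hours slope differs between schools with and without a recreation center.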
Effect of Sampling Weights on Estimates
Range of Parameter Estimates
Parameter | Using Weights | Ignoring Weights | Ratio
Fixed Effect
γ_00 | 2.72 | 0.05 | 54.5
γ_01 | 3.08 | 0.08 | 38.5
γ_10 | 0.019 | 0.001 | 19.0
γ_11 | 0.055 | 0.003 | 18.3
Random Effect
σ²_0 | 12.41 | 0.53 | 23.4
σ²_1 | 0.008 | 0.0005 | 16.0
σ_0,1 | 0.234 | 0.025 | 9.36
σ² | 25.03 | 0.62 | 40.3
When sampling weights were omitted from analyses, all software packages gave nearly the same results.
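The "Using Weights" ranges can be reproduced from the per-package weighted estimates reported on the results slide of this deck, taking the range as the largest minus the smallest estimate across the five packages. The following Python sketch does that and divides by the "Ignoring Weights" ranges from this slide. The dictionary keys and the helper name `spread` are labels chosen here for illustration.

```python
# Weighted estimates in the order MPLUS, MIXED, LISREL, MLWIN, GLLAMM,
# copied from the "Analysis Results from Different Packages" slide.
weighted_estimates = {
    "gamma_00": [60.19, 59.09, 57.83, 58.52, 57.47],
    "gamma_01": [-4.49, -2.74, -1.678, -1.41, -1.51],
    "gamma_10": [0.033, 0.038, 0.045, 0.052, 0.049],
    "gamma_11": [0.12, 0.11, 0.099, 0.065, 0.101],
    "sigma2_0": [16.27, 24.84, 14.13, 12.43, 17.11],
    "sigma2_1": [0.002, 0.009, 0.002, 0.001, 0.007],
    "sigma_01": [-0.065, -0.241, -0.047, -0.007, -0.12],
    "sigma2": [794.36, 774.08, 792.95, 793.57, 799.11],
}

# "Ignoring Weights" ranges, copied from this slide.
unweighted_range = {
    "gamma_00": 0.05, "gamma_01": 0.08, "gamma_10": 0.001, "gamma_11": 0.003,
    "sigma2_0": 0.53, "sigma2_1": 0.0005, "sigma_01": 0.025, "sigma2": 0.62,
}


def spread(values):
    """Range of a list of estimates: largest minus smallest."""
    return max(values) - min(values)


weighted_range = {name: spread(v) for name, v in weighted_estimates.items()}
ratio = {name: weighted_range[name] / unweighted_range[name]
         for name in weighted_range}

for name in weighted_range:
    print(f"{name:9s} range={weighted_range[name]:.4g} ratio={ratio[name]:.1f}")
```

Ratios computed this way from the rounded figures can differ slightly in the last digit from the Ratio column, which would be expected if the slide's ratios were computed from unrounded ranges.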
Analysis Results from Different Packages
Entries are Estimate (S.E.). Weight: MPML Method A for MPLUS 3.1 and MIXED 8.2. Weight: PWIGLS Method 2 for LISREL 8.7, MLWIN 1.1, and GLLAMM.
Parameter | MPLUS 3.1 | MIXED 8.2 | LISREL 8.7 | MLWIN 1.1 | GLLAMM
Fixed Effects
γ_00 | 60.19 (0.65) | 59.09 (0.79) | 57.83 (0.72) | 58.52 (0.58) | 57.47 (0.77)
γ_01 | -4.49 (0.87) | -2.74 (1.10) | -1.678 (1.06) | -1.41 (0.95) | -1.51 (1.18)
γ_10 | 0.033 (0.016) | 0.038 (0.020) | 0.045 (0.018) | 0.052 (0.013) | 0.049 (0.021)
γ_11 | 0.12 (0.021) | 0.11 (0.027) | 0.099 (0.025) | 0.065 (0.022) | 0.101 (0.029)
Random Effects
σ²_0 | 16.27 (4.04) | 24.84 (5.04) | 14.13 (3.18) | 12.43 (3.05) | 17.11 (4.74)
σ²_1 | 0.002 (0.002) | 0.009 (0.003) | 0.002 (0.001) | 0.001 (0.001) | 0.007 (0.003)
σ_0,1 | -0.065 (0.067) | -0.241 (0.097) | -0.047 (0.047) | -0.007 (0.040) | -0.12 (0.08)
σ² | 794.36 (10.12) | 774.08 (8.19) | 792.95 (8.72) | 793.57 (8.38) | 799.11 (11.94)
[Figure: Parameter Estimate Profile for Analysis Using Sampling Weights. Profiles for MPLUS, LISREL, MLWIN, GLLAMM, and MIXED across the parameters γ_00, γ_01, γ_10, γ_11, σ²_0, σ²_1, σ_0,1, and σ²]
Predictions from Analysis Using Sampling Weights
[Figure: Percentile BMI (BMIPCT) of Students for Average School (y-axis, 54 to 64) against Hours per week watch TV, etc. (HR_WATCH) (x-axis, 0 to 50), with one pair of lines each for MPLUS, MIXED, LISREL, MLWIN, and GLLAMM. Solid lines (RC_S=1): schools with recreation centers. Dashed lines (RC_S=0): schools without recreation centers.]
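Prediction lines of this kind can be approximated from the weighted fixed-effect estimates reported on the results slide. The Python sketch below assumes that "average school" means both school-level random effects are set to zero, so the predicted BMIPCT is the intercept, plus the recreation-center effect, plus the combined slope times HR_WATCH. The function name `predict_bmipct` is hypothetical.

```python
# Weighted fixed-effect estimates per package, in the order
# (intercept, RC_S effect, HR_WATCH slope, RC_S x HR_WATCH interaction),
# copied from the "Analysis Results from Different Packages" slide.
FIXED_EFFECTS = {
    "MPLUS": (60.19, -4.49, 0.033, 0.12),
    "MIXED": (59.09, -2.74, 0.038, 0.11),
    "LISREL": (57.83, -1.678, 0.045, 0.099),
    "MLWIN": (58.52, -1.41, 0.052, 0.065),
    "GLLAMM": (57.47, -1.51, 0.049, 0.101),
}


def predict_bmipct(package, hr_watch, rc_s):
    """Predicted BMIPCT for an average school (random effects assumed zero)."""
    g00, g01, g10, g11 = FIXED_EFFECTS[package]
    return g00 + g01 * rc_s + (g10 + g11 * rc_s) * hr_watch


# Predictions at a few values of HR_WATCH, with and without a recreation center.
for package in FIXED_EFFECTS:
    for rc_s in (0, 1):
        line = [round(predict_bmipct(package, h, rc_s), 2) for h in (0, 20, 40)]
        print(package, "RC_S =", rc_s, line)
```

Comparing the RC_S=0 and RC_S=1 lines for one package shows how the lower intercept and steeper slope in schools with recreation centers play out over the range of viewing hours.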
Conclusion
• Use of sampling weights to adjust for non-response and the
design characteristics of complex survey data has recently been
incorporated in software used for estimating multilevel models.
• This provides analysts with a simple method for obtaining
unbiased estimates from complex survey data.
• When sampling weights are used, results from these packages
can vary. If weights are ignored, these packages produce nearly
the same results.
• Simulation studies need to be conducted to determine why
these packages produce different results when sampling
weights are used.
• Models with non-normal outcomes need to be examined.
References
• Asparouhov, T. (2004). Weighting for Unequal Probability of
Selection in Multilevel Modeling. Mplus Web Notes No. 8.
Available from http://www.statmodel.com/
• Grilli, L., and Pratesi, M. (2004). Weighted Estimation in Multilevel
Ordinal and Binary Models in the Presence of Informative
Sampling Designs. Survey Methodology, 30, 93-103.
• Pfeffermann, D., Skinner, C. J., Holmes, D. J., Goldstein, H.,
and Rasbash, J. (1998). Weighting for Unequal Selection
Probabilities in Multilevel Models. JRSS, Series B, 60, 123-40.