Advances in Missing Data and Implications for Developmental Science Todd D. Little

Advances in Missing Data
and Implications for
Developmental Science
Todd D. Little
University of Kansas
Director, Quantitative Training Program
Director, Center for Research Methods and Data Analysis
Director, Undergraduate Social and Behavioral Sciences Methodology Minor
Member, Developmental Psychology Training Program
crmda.KU.edu
Talk presented 03-31-2011 @
Society for Research in Child Development
Conclusions
• Imputing missing data is not cheating
• NOT imputing missing data is MOST likely to lead to errors in generalization!
• Plan for unintentional missing data
• Plan intentionally missing data
Types of missing data
Modern Missing Data Analysis
MI or FIML
• In 1978, Rubin proposed Multiple Imputation (MI)
  • An approach especially well suited for use with large public-use databases.
  • First suggested in 1978 and developed more fully in 1987.
  • MI primarily uses the Expectation Maximization (EM) algorithm and/or the Markov Chain Monte Carlo (MCMC) algorithm.
• Beginning in the 1980s, likelihood approaches developed.
  • Multiple group SEM
  • Full Information Maximum Likelihood (FIML).
    • An approach well suited to more circumscribed models
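The slide does not show how results from several imputed data sets are combined. A minimal sketch of the standard pooling step (Rubin's rules) is given here, assuming each imputed data set has already been analyzed and has yielded a point estimate and a squared standard error. The function name and the numbers are hypothetical and serve only as illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool one parameter across m imputed data sets using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled_estimate = mean(estimates)
    within = mean(within_variances)   # average sampling variance
    between = variance(estimates)     # variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Hypothetical results from m = 3 imputed data sets.
estimates = [1.0, 2.0, 3.0]
squared_ses = [0.5, 0.5, 0.5]
pooled, total_var = pool_rubin(estimates, squared_ses)
```

The total variance exceeds the average within-imputation variance whenever the estimates disagree across imputations, which is how the uncertainty due to missingness enters the standard error.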
Missing Data and Estimation:
Missingness by Design
• Assess all persons, but not all variables at each time of measurement
• Control entry into study: estimate and control for retest effects, increase validity, decrease costs, increase power, etc.
• Randomly assign participants to their entry into a longitudinal study and/or to the occasions of assessment
• Key to providing unbiased estimates of growth or change
3-Form Protocol
Form  Common Variables  Variable Set A     Variable Set B     Variable Set C
1     Marker Variables  ~1/3 of Variables  ~1/3 of Variables  none
2     Marker Variables  ~1/3 of Variables  none               ~1/3 of Variables
3     Marker Variables  none               ~1/3 of Variables  ~1/3 of Variables
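The protocol above can be sketched in code as random assignment to one of three forms, followed by blanking the variable set that a form omits. This is an illustrative sketch only: the item names, the convention that an item's first letter names its set ("X" for the common marker set), and the function names are all assumptions, not part of the talk.

```python
import random

# Variable sets administered on each form; "X" is the common (marker) set.
FORM_SETS = {
    1: ("X", "A", "B"),
    2: ("X", "A", "C"),
    3: ("X", "B", "C"),
}


def assign_forms(participant_ids, seed=0):
    """Randomly assign participants to the three forms in near-equal numbers."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: (i % 3) + 1 for i, pid in enumerate(ids)}


def mask_record(record, form):
    """Replace items from the set this form omits with None (planned missing)."""
    given = FORM_SETS[form]
    return {item: (value if item[0] in given else None)
            for item, value in record.items()}


assignment = assign_forms(range(1, 10))
example = mask_record({"X1": 3, "A1": 4, "B1": 2, "C1": 5}, form=1)
```

Because form membership is assigned at random, the missingness on the omitted set is unrelated to the participant's scores.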
Expansions of 3-Form Design
(Graham, Taylor, Olchowski, & Cumsille, 2006)
2-Method Planned Missing Design
Controlled Enrollment
Group  Time 1   Time 2   Time 3   Time 4   Time 5
1      x        x        x        x        x
2      x        x        x        missing  missing
3      x        x        missing  x        missing
4      x        missing  x        x        missing
5      missing  x        x        x        missing
6      x        x        missing  missing  x
7      x        missing  x        missing  x
8      missing  x        x        missing  x
9      x        missing  missing  x        x
10     missing  x        missing  x        x
11     missing  missing  x        x        x
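Groups 2 through 11 in the table above correspond to every way of choosing three of the five occasions, with Group 1 measured at all five. The sketch below regenerates that layout. The function names are hypothetical, and the sort key is chosen only so that the group numbering matches the table.

```python
from itertools import combinations

OCCASIONS = (1, 2, 3, 4, 5)


def enrollment_patterns(occasions=OCCASIONS, k=3):
    """Group 1 is complete; the other groups cover every k-occasion subset."""
    patterns = [tuple(occasions)]
    # Sorting on the reversed tuple reproduces the group order used in the table.
    patterns.extend(sorted(combinations(occasions, k), key=lambda c: c[::-1]))
    return {group: p for group, p in enumerate(patterns, start=1)}


def format_row(group, observed, occasions=OCCASIONS):
    """Render one group as a row of 'x' and 'missing' cells."""
    cells = ["x" if t in observed else "missing" for t in occasions]
    return f"{group:<5}  " + "  ".join(f"{c:<7}" for c in cells)


patterns = enrollment_patterns()
for group, observed in patterns.items():
    print(format_row(group, observed))
```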
Optimal Growth Curve Design
Group  Time 1   Time 2   Time 3   Time 4   Time 5
1      x        x        x        x        x
6      x        x        missing  missing  x
7      x        missing  x        missing  x
9      x        missing  missing  x        x
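Every group retained in the table above is measured at the first and last occasions, and each middle occasion is covered by two groups. A small sketch that tallies this coverage follows. It assumes equal group sizes, which the slide does not state, and the function name is hypothetical.

```python
# Observed occasions per group, copied from the table above.
OPTIMAL_GROUPS = {
    1: (1, 2, 3, 4, 5),
    6: (1, 2, 5),
    7: (1, 3, 5),
    9: (1, 4, 5),
}


def coverage_by_occasion(groups, n_occasions=5):
    """Share of groups observed at each occasion, assuming equal group sizes."""
    n_groups = len(groups)
    return {
        t: sum(t in observed for observed in groups.values()) / n_groups
        for t in range(1, n_occasions + 1)
    }


coverage = coverage_by_occasion(OPTIMAL_GROUPS)
```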
Combined Elements
The Sequential Designs
Transforming to Accelerated Longitudinal
• Assumes a missing at random (MAR) process, but if you plan for it and measure cohort-related influences, the impact can be easily estimated.
• In the analysis, cohort becomes a variable that is controlled for.
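As a rough illustration of the transformation, the sketch below re-indexes wave-based scores by age so that each cohort contributes to the ages it was observed at and is missing elsewhere. The record layout, the ages, the scores, and the one-year spacing between waves are all assumptions made for this example.

```python
def to_age_metric(records, min_age, max_age):
    """Re-index wave-based scores by age; unobserved ages become None."""
    rows = []
    for rec in records:
        by_age = {age: None for age in range(min_age, max_age + 1)}
        for wave, score in enumerate(rec["scores"]):
            # Assumes waves are spaced one year apart.
            by_age[rec["start_age"] + wave] = score
        # Cohort is kept as a variable so it can be controlled for later.
        rows.append({"id": rec["id"], "cohort": rec["start_age"], **by_age})
    return rows


# Hypothetical participants from three cohorts, each measured at three waves.
records = [
    {"id": "p1", "start_age": 10, "scores": [1.0, 1.5, 2.0]},
    {"id": "p2", "start_age": 11, "scores": [1.4, 2.1, 2.6]},
    {"id": "p3", "start_age": 12, "scores": [2.2, 2.4, 3.0]},
]
wide = to_age_metric(records, min_age=10, max_age=14)
```

Three waves per cohort span ages 10 through 14 here, and the ages a cohort never reaches are missing by design.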
Advances in Missing Data
and Implications for
Developmental Science
Thanks for your attention!
Questions?
Update
Dr. Todd Little is currently at
Texas Tech University
Director, Institute for Measurement, Methodology, Analysis and Policy (IMMAP)
Director, “Stats Camp”
Professor, Educational Psychology and Leadership
Email: yhat@ttu.edu
IMMAP (immap.educ.ttu.edu)
Stats Camp (Statscamp.org)
www.Quant.KU.edu