2011-25
Treatment of Missing Data in Workforce Education Research
Sinan Gemici, Jay W. Rojewski, In Heok Lee
University of Georgia
Abstract
Most quantitative analyses in workforce education are affected by missing data.
Traditional approaches to remedy missing data problems often result in reduced
statistical power and biased parameter estimates due to systematic differences between
missing and observed values. This article examines the treatment of missing data in
pertinent quantitative analyses published in recent issues of Career and Technical
Education Research. Next, essential missing data patterns and mechanisms are reviewed,
and alternative methods of handling missing data are discussed. The article concludes
with a comparison of missing data methods using a small sample from the National
Longitudinal Survey of Youth 1997 to illustrate the detrimental effects of traditional
approaches to handling missing data, and demonstrate the benefits of multiple imputation
(MI) as an efficient modern missing data technique.
Introduction
Most quantitative research studies in education and the social sciences contain missing data
(Allison, 2002). To address this issue, Wilkinson and the Task Force on Statistical Inference (1999) urged
researchers to portray any complications encountered in the course of completing their investigations,
including the occurrence of missing data. The report strongly highlighted the importance of using
appropriate methodology to ensure that results are not biased by anomalies in the data. Authors were
encouraged to describe patterns of missing data, as well as steps taken to address the problem during the
analysis stage. It was concluded that such steps should be a standard component of all analyses, for to do
otherwise was to risk “publishing nonsense” (p. 597). Guidelines found in the latest edition of the
American Psychological Association Publication Manual (2010) echoed these sentiments. Given that
many applied research studies in the social sciences continue to show deficiencies in the treatment of
missing data, Schlomer, Bauman, and Card (2010) recently called for greater attention to the issue.
Researchers know that they must report response rates for survey data and include effect sizes for
statistically significant results. The same expectation for accurate reporting should apply to
missing data management. Ignoring this step is poor science, and results reported without
attention to missing data can misinform our scientific understanding and misguide policy and
practice. (p. 8)
Purpose
The problem of how to deal with missing data has received little attention in career-technical
education (CTE) research. Given the importance of using best practice in addressing missing data issues,
the purpose of our study is threefold. First, we determine how CTE researchers have managed missing
data in their own analyses by examining several recent issues of Career and Technical Education
Research (CTER). Our examination criteria are based on two items considered essential in best practices
related to the treatment of missing data, i.e., reporting the extent and nature of the missing data, and
describing the procedures used to manage missing data, including the rationale for using the method
selected (Schlomer et al., 2010). Second, we outline the types and mechanisms under which missing data
occur, and review several alternative methods of handling missing data. Third, we compare these methods
using a small data sample to illustrate the detrimental effects that unprincipled missing data techniques
2011-25
may have on the accuracy of parameter estimates. Our overall objective is to offer guidelines that may
help CTE researchers to adequately address missing data problems in their own work.
2011-25
Treatment of Missing Data in Recent CTER Articles
We selected four volumes of the CTER journal (Volumes 31-34), covering the years 2006 through 2009, to determine how researchers have managed missing data in quantitative investigations.
We emphasize that the objective here was neither to engage in a full meta-analytic review of recently
published research, nor to criticize the work of individual CTE researchers. Instead, our motivation for
conducting this review was to highlight the need for taking a more principled approach to dealing with
missing data when conducting quantitative CTE research. A total of 12 issues were published during this
4-year period, containing 27 quantitative articles that formed the pool of eligible studies. For each article in
our sample, we determined whether (a) the percentage of missing data was reported, (b) the method for
handling these data was specified, and (c) a rationale for the chosen method was provided (see Table 1).
Of the 27 articles we examined, only three addressed missing data with specific techniques. One
article employed a listwise deletion technique, eliminating 19 surveys with incomplete responses to items
measuring the dependent variable. A second article included only “useable” surveys in its final data pool,
meaning that only surveys containing complete responses to all questionnaire items were retained,
eliminating the missing data problem. A third article mentioned the percentage of cases with complete
data and indicated the use of a pairwise deletion method to address the problem, despite the author’s
knowledge of potential bias being introduced into the analysis. No rationale was offered to support this
choice. Four articles contained data tables with footnotes alerting readers to the missingness of data.
However, in each case no explanation of the missing data, its possible effect on analysis, or its treatment
was found in the article narrative. An additional eight articles made no mention of missing data but
presented data tables indicating missing data. Finally, one study conducted a missing data analysis to
determine differences in pre-/post-group attrition but did not extend this analysis to the data used for the
final analysis.
Table 1
Summary of How Articles Published in Career and Technical Education Research from 2006-2009 Dealt with the Issue of Missing Data

Volume 31

Park & Rojewski (Iss. 1). Missing data: YES; small n missing data in demographic data table; does not appear to influence inferential analysis but no explanation. Addressed: NO. Rationale: __. Comments: Authors noted, "A final screening procedure to check for missing data did not detect any missing values for variables" (p. 34).

Chadd & Drage (Iss. 2). Missing data: YES; "due to missing responses in the 11 items addressing perceptions" (p. 89). Addressed: Listwise deletion (n=19). Rationale: NO. Comments: Authors also noted that "many respondents did not identify their school or school district" (p. 89).

Zirkle, Norris, Winegardner, & Frustaci (Iss. 2). Missing data: NO; all surveys "had complete responses to the 36 barrier items of interest…and were deemed useable" (p. 108). Addressed: NA. Rationale: __. Comments: __.

Alfeld, Hansen, Aragon, & Stone (Iss. 3). Missing data: YES; due to "a page missing in some of the surveys sent" (p. 136) to one of the groups. Addressed: NO. Rationale: __. Comments: Low response rate from survey ~ 20%. Authors examined attrition between surveys completed in the fall but not spring administrations; they referred to this analysis as a missing data analysis, finding "no significant pattern to which students did not take the second survey" (p. 147).

Higgins & Kotrlik (Iss. 3). Missing data: YES; discrepancies noted in frequency tables for n but no explanation offered. Addressed: NO. Rationale: __. Comments: Correlation table n values vary from 102-104 without explanation (N=105).

Volume 32

Bae, Gray, & Yeager (Iss. 1). Missing data: NO; used post hoc random selection of existing database, so may not be an issue. Addressed: NA. Rationale: __. Comments: Sample size confusing; initially n=80 divided into 2 groups of 40 but eventually reported as 2 groups of n=39 and n=31. Additional information would allow for determining possible missing values.

Geiman, Torres, Burris, & Kitchel (Iss. 1). Missing data: YES; missing data detected in frequency table but no explanation offered; does not appear to be an issue for analysis. Addressed: NO. Rationale: __. Comments: __.

Park & Osborne (Iss. 1). Missing data: YES; two variables in the descriptive data table and ANOVA results (df) indicated missingness. Addressed: NO. Rationale: __. Comments: Low response rate from survey ~ 20%. Small sample size, n=88.

Esters (Iss. 2). Missing data: YES; not mentioned but appears widespread through each data table. Addressed: NO. Rationale: __. Comments: Treated ordinal results as interval data for analysis (p. 87).

McClain & McClain (Iss. 2). Missing data: YES; author notes in Table 1 that "total does not equal 88 due to missing data" (p. 55); Table 2 also contains missing values; results of PDA reveal only n=43. Addressed: NO. Rationale: __. Comments: Low response rate from survey = 31.3%.

Geiman & Covington (Iss. 2). Missing data: DON'T KNOW; cohorts and conditions described based on n=44 but had response of n=41; only clarified response in terms of independent variable for analysis; no missing data on this variable. Addressed: NO. Rationale: __. Comments: Small sample size, n=41 [3/44 did not reply]. Authors claimed, "Non-response error was not considered a serious threat to the validity of the study due to the high response rate" (p. 125).

Park & Covington (Iss. 3). Missing data: YES; missing data on 3 of 9 variables included in Table 2; df for selected t-tests indicate no missing data. Addressed: NO. Rationale: __. Comments: __.

Bennett (Iss. 3). Missing data: YES; author indicates that only 602 of 1741 students responded to all items on the survey (34.6%). Addressed: YES; pairwise deletion was used though "it risked entering significant bias in the analyses" (p. 206). Rationale: NO. Comments: "To determine whether the missing data were randomly distributed, the differences between the 'missing' and the 'non-missing' student groups on each measure were tested" (p. 206). Archival database representing 67% student response rate. All descriptive data in percentages only, no n.

Volume 33

Burns (Iss. 1). Missing data: NO. Addressed: __. Rationale: __. Comments: Small sample size, n=48. Indicated that 55 teachers were initially administered surveys but included 48 for analysis.

Rehm (Iss. 1). Missing data: YES; Table 2 (descriptive) has missing data on 13 of 25 items, footnote provided; Table 3 indicates n=47. Addressed: NO. Rationale: __. Comments: Low response rate from survey = 22.8%.

Kitchel, Geiman, Torres, & Burris (Iss. 2). Missing data: YES; a footnote to Table 1 indicated presence of missing data. Addressed: NA. Rationale: __. Comments: Small sample size, n=41.

Bragg & Marvel (Iss. 2). Missing data: YES; mentioned in text for a single descriptive variable. Addressed: NO. Rationale: __. Comments: Low response rate from survey = 29.0%.

Gaytan (Iss. 2). Missing data: YES; large discrepancies between responses and data in Tables 2-4. Addressed: NO. Rationale: NA. Comments: __.

Kim & Bragg (Iss. 2). Missing data: DON'T KNOW; no n provided after initial values; can't determine. Addressed: NO. Rationale: __. Comments: F statistics presented without df so extent of missing data not determined.

McCharen (Iss. 3). Missing data: DON'T KNOW. Addressed: NO. Rationale: __. Comments: t statistics presented without df so extent of missing data not determined.

Volume 34

Grier-Reid, Skaar, & Parson (Iss. 1). Missing data: YES; Table 1 presents incomplete data for initial rather than final pool, which indicates missing data although final data pool does not. Addressed: NO. Rationale: __. Comments: Reports attrition rates of 45%.

Friedel & Rudd (Iss. 1). Missing data: YES; nonresponse on individual items reported; table footnote indicates n=511, which is much lower than reported sample size of n=716 "due to mortality rate of the KAI" (p. 36); n=712 noted for analysis on another variable without comment. Addressed: NO. Rationale: __. Comments: Missing data = 28%.

Kotrlik & Redmann (Iss. 1). Missing data: YES. Addressed: NO. Rationale: __. Comments: __.

Fletcher & Zirkle (Iss. 2). Missing data: DON'T KNOW; can't determine from data given. Addressed: NO. Rationale: __. Comments: Descriptive data presented in percentages only.

Wolf, Foster, & Birkensholz (Iss. 2). Missing data: DON'T KNOW; can't determine from data given. Addressed: NO. Rationale: __. Comments: Small sample size, n=27.

Crittenden (Iss. 3). Missing data: NO; df indicate no missing data. Addressed: NA. Rationale: __. Comments: __.

Kitchel, Cannon, & Duncan (Iss. 3). Missing data: YES; missing data in descriptive data table (very small). Addressed: NO. Rationale: __. Comments: Small sample size, n=45.

Note. Missing data = whether missing data were evident or reported; Addressed = method used to handle the missing data; Rationale = whether a rationale for the chosen method was provided.
Overview of Missing Data
Data can be missing for a variety of reasons. Missing data due to nonresponse can occur because
of noncontact, refusal to cooperate, or specific barriers that impede an eligible respondent from
participating (Groves & Couper, 1998). Survey researchers generally distinguish between unit
nonresponse and item nonresponse. The former refers to the absence of any sort of data from an eligible
respondent due to noncontact or outright refusal to participate, while the latter denotes a situation in
which a respondent answers some items but fails to answer others (Elliott, Edwards, Angeles,
Hambarsoomians, & Hays, 2005). Wave nonresponse occurs in longitudinal surveys where participants’
responses may be missing for one or more survey waves. In experimental studies missing data may occur
due to attrition, meaning that a participant decides to drop out before data collection has been completed
(Given, Keilman, Collins, & Given, 1990). Finally, erroneous data entry, disclosure restrictions, and
similar procedural factors can lead to incomplete data.
Missing values that emanate from these and other scenarios routinely obstruct data analysis
because most statistical procedures require a complete data matrix. Incomplete data can result in reduced
statistical power, difficulties in data analytic procedures using standard software packages, and biased
analysis results due to the potential existence of systematic differences between missing and observed
data (Barnard & Meng, 1999). The detrimental effects caused by missing data are particularly challenging
in the context of survey research due to the sizeable number of responses and respondents involved
(Raaijmakers, 1999). Overall, incomplete data are a nuisance that routinely obstructs data analytic
procedures in applied workforce education and other areas of scientific investigation, including
psychology (Jeliĉić, Phelps, & Lerner, 2009), political science (Honaker & King, 2010), and public health
(Stuart, Azur, Frangakis, & Leaf, 2009).
Historically, cases with missing values were either ignored or the missing observations were
substituted with imprecise approximations based on simplistic replacement procedures. The statistical
cost incurred by these approaches was frequently prohibitive in terms of case loss and/or analysis bias. To
address this issue, Dempster, Laird, and Rubin (1977) developed the expectation maximization algorithm,
whereby a likelihood function is used to draw parameter estimates from a particular distribution that is
assumed to underlie the missing data. Based on Rubin’s (1976) framework of inference from incomplete
data, EM was the first modern-day stochastic missing data technique (Schafer & Graham, 2002). A
decade after introducing EM, Rubin (1987) developed the multiple imputation (MI) method that is based
on the creation of several complete datasets in which missing values are replaced with different random
draws from a distribution of plausible values. By analyzing each imputed dataset separately before
pooling results, MI is able to incorporate the uncertainty inherent in the missing data, thus producing
more robust parameter estimates (Schafer, 1999).
This brief historic overview of handling missing data illustrates a progression from simplistic
approaches to more principled ones that incorporate the randomness reflected in the missing data. This
progression has been supported by the general proliferation of computing power and the widespread
incorporation of advanced missing data methods in standard statistical software packages. Before
reviewing different missing data methods it is important to understand the nature of missing data patterns
and mechanisms.
Missing Data Patterns and Mechanisms
Missing data can occur in random or nonrandom patterns within a data matrix. Methodologists
differentiate between three types of patterns, including univariate, monotone, and arbitrary. Univariate
patterns occur when a specific variable contains missing values, while all other variables are fully
observed. Monotone patterns occur when individuals decide to drop out from a study before its formal
completion (Fielding, Fayers, & Ramsay, 2009). For instance, if an individual within a Yi variable data
matrix had an observed value on variable Y3, the same individual would have observed values on all
2011-25
preceding variables Y1 and Y2. Likewise, if Y3 was the last variable for which data were collected before
dropout, all further variables Y4 to Yi would exhibit missing values. Arbitrary patterns arise when missing
data display no systematic, discernable structure within a given data matrix. This occurs when each case
exhibits a different pattern of missing values (McKnight et al., 2007). It should be noted that not all
different missing data patterns have to be present in any given data matrix.
Missing data theory distinguishes between three nonresponse mechanisms that may underlie the
structure of a data matrix (Graham, 2009). These mechanisms capture differences in the probabilistic
relationship between missing and observed values. When data are missing completely at random (MCAR)
the missingness of a value in variable Z is unrelated to any other data point within variable Z or any other
variable in the dataset. Nonresponse under MCAR is ignorable, since it assumes that missing values are
simply a random subsample of the complete data matrix and, therefore, do not alter the original
distributional relationships between variables. MCAR makes the very strong assumption of complete
randomness, which is difficult to uphold in practice (Little & Rubin, 1987). Little’s (1988) MCAR test,
which is available in several standard statistical software packages (e.g., SPSS, SAS), can be conducted to
determine whether missing data are, in fact, missing completely at random. Where test results reject the
null hypothesis of complete randomness, the application of missing data techniques that assume MCAR
will yield biased parameter estimates.
A less stringent mechanism is known as missing at random (MAR), whereby the missingness of a
value in variable Z is unrelated to any other data point within variable Z, but is related to one or more of
the other variables in the dataset. Lower-achieving students, for instance, may exhibit a lower propensity
to participate in a voluntary aptitude test. Consequently, the missingness of test scores is directly related
to a student’s achievement status. Nonresponse under MAR is ignorable because the probabilities of
missingness do not depend on the missing data themselves (Allison, 2002). The strength of modern
missing data techniques lies in their ability to produce unbiased parameter estimates under MAR.
The third mechanism, missing not at random (MNAR), refers to situations in which the
missingness of a value in variable Z depends on the (unobserved) values of variable Z itself. Data that are MNAR
represent nonignorable nonresponse and greatly complicate the treatment of missing data, since a model
for the distribution of missingness in each variable must be specified separately (Schafer & Graham,
2002). While methods dealing with MNAR have been developed (e.g., Demirtas, 2005), these models are
highly complex and require modeling assumptions that, if incorrect, exacerbate bias when compared to
the application of modern MAR-based techniques (Demirtas & Schafer, 2003). No statistical test exists to
determine whether missing data are MAR or MNAR.
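The three mechanisms are easy to make concrete in simulation. The sketch below, written in R (the statistical environment used for the analyses reported later in this article), generates a toy achievement-score dataset and imposes each mechanism in turn; all object names and parameter values are ours. Little's MCAR test is also available in R, for example as mcar_test() in the naniar package.

```r
# Toy illustration of the three missingness mechanisms (names and
# parameter values are ours): achievement predicts a test score, and
# we then delete scores under each mechanism
set.seed(1)
n <- 1000
achievement <- rnorm(n)
score <- 50 + 5 * achievement + rnorm(n, sd = 3)

# MCAR: every score has the same 20% chance of being missing
score_mcar <- ifelse(runif(n) < .20, NA, score)

# MAR: lower-achieving students are more likely to skip the test, so
# missingness depends only on the fully observed achievement variable
score_mar <- ifelse(runif(n) < plogis(-1 - 1.5 * achievement), NA, score)

# MNAR: low scores themselves go unreported, so missingness depends on
# the value that would have been observed
score_mnar <- ifelse(runif(n) < plogis(-1 - .15 * (score - 50)), NA, score)

# observed means drift upward under MAR and MNAR but not under MCAR
colMeans(cbind(mcar = score_mcar, mar = score_mar, mnar = score_mnar),
         na.rm = TRUE)
```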
Traditional Missing Data Methods
Frequently-encountered traditional approaches to handling missing data include complete case
analysis, complete variables analysis, mean substitution, regression-based imputation, and cold-deck/hot-deck imputation. For clarity, we have divided traditional approaches into case reduction and deterministic
methods.
Case Reduction Methods
Complete case and complete variables analysis are based on eliminating the missing data problem
through case reduction. Complete case analysis, also referred to as listwise deletion, entails simply
discarding all cases in a dataset that exhibit missing data on one or more variables. This approach is often
used by researchers because it can be implemented without computational effort and may be used in
conjunction with all sorts of subsequent statistical analyses (Allison, 2002). Complete variables analysis,
also referred to as pairwise deletion, is a variable-by-variable approach that discards only those cases that
exhibit missing values on a particular bivariate pair. Both methods are generally considered inefficient
because they discard cases for which information is at least partially available.
The use of case reduction techniques may be appropriate when data are MCAR and the amount of
missing values is small. Five percent has been suggested as an acceptable upper limit for case reduction
(Schafer, 1997). When applied in scenarios with higher rates of missingness, case reduction eliminates
important information contained in the original data matrix, resulting in potentially dramatic case loss and
biased parameter estimates (Graham, Hofer, & MacKinnon, 1996). Complex multivariate analyses based
on large-scale datasets are particularly prone to detrimental effects from case reduction due to the high
number of variables on which missingness can occur.
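As a minimal sketch, the two case reduction approaches amount to a single call each in R, assuming a data frame dat of numeric variables with scattered missing values (names are ours):

```r
# listwise deletion: drop every case with a missing value on any variable
complete_dat <- na.omit(dat)

# pairwise deletion: each correlation uses all cases observed on that
# particular pair of variables, so coefficients can rest on different n's
cor(dat, use = "pairwise.complete.obs")
```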
Deterministic Methods
Mean substitution, regression imputation, as well as cold-deck/hot-deck imputation are
considered deterministic procedures because they replace missing values with a simple fixed estimate of
the hypothesized true value (Schulte Nordholt, 1998). The key advantage of deterministic approaches
over case reduction methods lies in the preservation of sample size. Mean substitution simply replaces all
missing data points in a given variable with that variable’s arithmetic mean value. This approach,
however, is no less problematic than case reduction techniques, for replacing missing observations with
the mean value reduces variability in the data and leads to biased estimates of variances and covariances
even under MCAR (Little & Rubin, 1987).
Regression imputation is a slightly more refined approach that replaces missing data with the
predicted values from a linear regression model using a set of auxiliary variables. This method requires at
least a moderate degree of covariance between variables with missing data and all other variables within
the data matrix. Similar to case reduction, regression imputation requires data to be MCAR. Although
easy to implement, regression imputation produces negatively biased standard errors (Enders, 2006),
inflated correlations between variables, and overestimated R2 values (Schafer & Olsen, 1998). Moreover,
imputed values fall directly on the regression plane, leading to a lack of residual variability in the data. To
offset this effect, a random error term can be added to the imputation model to introduce additional
variance.
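A minimal sketch of both deterministic replacements, including the stochastic variant just mentioned, might look as follows; the data frame dat and variables x and y are ours, and y is assumed fully observed:

```r
# mean substitution: replace every missing x with the observed mean of x
x_mean <- dat$x
x_mean[is.na(x_mean)] <- mean(dat$x, na.rm = TRUE)

# regression imputation: predict missing x from the observed y values
fit  <- lm(x ~ y, data = dat)
miss <- is.na(dat$x)
x_reg <- dat$x
x_reg[miss] <- predict(fit, newdata = dat[miss, ])

# stochastic variant: add a residual draw so that imputed values do not
# fall exactly on the regression line
x_reg[miss] <- x_reg[miss] + rnorm(sum(miss), mean = 0,
                                   sd = summary(fit)$sigma)
```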
Cold-deck and hot-deck procedures have long been used to deal with missing data in survey
research. In contrast to mean substitution and regression imputation, cold-deck and hot-deck procedures
do not rely on the creation of synthetic values (Chen & Shao, 2001). Cold-deck imputation is used in
longitudinal surveys that consist of several data collection waves. If a certain case exhibits an observed
value on a given variable in a previous wave, but a missing value on that same variable in a later wave,
the previous wave’s observed value is assigned (Chaudhuri & Stenger, 1992). Whereas cold-deck
imputation is based on data from different datasets on the same case, hot-deck imputation uses the actual
value from a different case in the same dataset (Schulte Nordholt, 1998). Hot deck imputation identifies a
case (also referred to as a donor) in the same dataset that is similar across all variables to the case
containing the missing value and replaces the missing observation with the donor’s value. Distance
measures ensure that the closest-fitting donor value is identified and used for replacement (Switzer, Roth,
& Switzer, 1998). Using similar donors generally avoids the computation of nonsensical replacement
values. Nonetheless, resulting estimates of correlations and regression weights are often unreliable (Roth
& Switzer, 1995) and parameter estimates can be biased even under MCAR (Brown, 1994).
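A naive nearest-neighbor hot deck can be sketched in a few lines; this illustration is ours and assumes one incomplete variable y, two fully observed matching variables x1 and x2, and a plain Euclidean distance standing in for the more refined distance measures cited above:

```r
hot_deck <- function(dat) {
  donors <- dat[!is.na(dat$y), ]          # cases with observed y
  for (i in which(is.na(dat$y))) {
    # distance from recipient i to every donor on the matching variables
    d <- sqrt((donors$x1 - dat$x1[i])^2 + (donors$x2 - dat$x2[i])^2)
    dat$y[i] <- donors$y[which.min(d)]    # borrow the closest donor's value
  }
  dat
}
```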
While deterministic approaches are easy to compute and implement, their detrimental effect on
variance is problematic when data are used for multivariate analysis. Moreover, deterministic methods
routinely underestimate parameter standard errors, thus increasing the likelihood for Type I error. Modern
estimation procedures, such as expectation maximization and multiple imputation, can remedy many of
these shortcomings.
Modern Missing Data Methods
In contrast to the case reduction and deterministic approaches of traditional missing data methods,
modern missing data techniques include an element of randomness. Specifically, modern methods assume
a certain underlying distribution for the missing values. Plausible values are then drawn at random from
that assumed distribution to re-create a complete data matrix (see Little & Rubin, 1987). Modern
techniques have gained widespread popularity because they have demonstrated consistently superior
efficiency and estimation properties in terms of parameter bias (Schafer, 1997). Here, we refer to Chen
and Åstebro’s (2003) simplified definition of efficiency as “a procedure that provides an unbiased
estimate of sample properties that is also easy to implement” (p. 315). One key advantage of these
methods lies in their ability to produce unbiased parameter estimates under MAR instead of requiring the
more stringent (and less realistic) MCAR assumption. Frequently-used modern missing data methods
include expectation maximization and multiple imputation. Options for carrying out these methods exist
for standard statistical software packages, such as SPSS, Stata, SAS, or R.
Expectation Maximization
Expectation maximization (EM; Dempster et al., 1977) is a maximum-likelihood approach that
arrives at missing value estimates through an iterative approximation process. Maximum-likelihood
estimation “searches over different possible population values, finally selecting parameter estimates that
are most likely (have the ‘maximum likelihood’) to be true, given the sample observations” (Eliason,
1993, p. v). Conceptually, EM solves a complex missing data problem by repeatedly solving simpler
complete data problems. EM is a two-step process that consists of an expectation and a maximization
step. During the expectation step, the mean vectors and covariance matrix of the available data and
resulting parameter estimates are used to determine the conditional expectations of the missing data
(Enders, 2006). This means that a series of separate equations is used to regress each missing variable on
the remaining complete variables for a given case. Predicted scores (i.e., parameter estimates) produced
from these regressions are used to replace the missing values. The maximization step consists of recalculating these predicted scores using maximum-likelihood estimates based on actual and re-estimated
missing data from the expectation step (Little & Rubin, 1987). EM iterations are repeated until the log-likelihood converges to a stationary point.
The maximum-likelihood procedures in EM infer probable values for the missing data from
information contained in the observed data. Simulation studies have found EM to perform very well under
different missing data scenarios (see Graham & Donaldson, 1993; Ibrahim, 1990). One important
disadvantage of EM is its high sensitivity to misspecifications of the imputation model. Another
disadvantage lies in EM’s limited ability to account for the uncertainty inherent in the estimation of
missing data. This is due to the fact that the covariance matrix used as a basis for the regression equations
is itself only one of many plausible covariance matrices and, therefore, estimated with error due to the
missing data (Enders, 2006).
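For intuition, the two steps can be hand-rolled for the simplest case: a bivariate normal in which x is fully observed and y contains MAR missingness. The sketch below is ours and is meant only to make the expectation and maximization steps concrete; production routines handle general multivariate patterns.

```r
# Minimal EM sketch for a bivariate normal with missing y (x complete);
# hand-rolled for illustration only, not a production routine
em_bivariate <- function(x, y, tol = 1e-8, max_iter = 500) {
  miss <- is.na(y)
  # starting values from the complete cases
  mu  <- c(mean(x), mean(y[!miss]))
  sig <- cov(cbind(x, y)[!miss, ])
  for (iter in seq_len(max_iter)) {
    # E-step: conditional mean and residual variance of y given x
    beta <- sig[1, 2] / sig[1, 1]
    resid_var <- sig[2, 2] - beta^2 * sig[1, 1]
    y_hat <- y
    y_hat[miss] <- mu[2] + beta * (x[miss] - mu[1])
    # second moments need the residual variance added back for missing y
    y2 <- y_hat^2
    y2[miss] <- y2[miss] + resid_var
    # M-step: re-estimate means, variances, and covariance
    mu_new  <- c(mean(x), mean(y_hat))
    sxy     <- mean(x * y_hat) - mu_new[1] * mu_new[2]
    sig_new <- matrix(c(mean(x^2) - mu_new[1]^2, sxy,
                        sxy, mean(y2) - mu_new[2]^2), 2, 2)
    if (max(abs(sig_new - sig), abs(mu_new - mu)) < tol) break
    mu <- mu_new; sig <- sig_new
  }
  list(mu = mu, sigma = sig, iterations = iter)
}
```

Each pass replaces the missing y values with their conditional expectations given x (plus a residual-variance correction to the second moments) and then re-estimates the parameters until they stop changing.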
Multiple Imputation
First notions of multiple imputation (MI) were introduced by Rubin (1978) as a reaction to the
nonresponse problem in the analysis of large-scale surveys. Almost a decade later, Rubin (1987)
presented a comprehensive framework for the use of MI as a highly versatile, general-purpose approach
to missing data. However, it was not until the late 1990s that MI became more widely used based on
advances in computational power (Sinharay, Stern, & Russell, 2001). Today, it is well established that
MI provides accurate estimates in conditions under which deterministic approaches yield biased results
(Schafer, 1997; Schulte Nordholt, 1998).
MI is a Monte Carlo approach, a general term for computational techniques that repeat an
artificially created chance process using random numbers (Mooney, 1997). MI is based on the creation of
m > 1 complete datasets that are analyzed individually before pooling parameter estimates and standard
errors into one unified set of results. The replacement of each missing data point with several simulated
values is a key characteristic that distinguishes MI from other methods (Rubin, 1996). By replacing each
missing observation with several slightly different plausible values, MI incorporates the randomness
inherent in the missing data, thus mitigating the problem of variance underestimation inherent in both
traditional missing data methods and the EM approach. MI is further able to yield precise missing value
estimates without a large number of computation cycles. Between five and 10 imputations are generally
viewed as sufficient (Schafer, 1997), although much higher numbers of imputations have been suggested
with regards to preserving statistical power for testing small effect sizes (for more details see Graham,
Olchowski, & Gilreath, 2007). Once several imputed data matrices have been created they are analyzed
separately before results are pooled into a final set of parameter estimates and standard errors using the
four-step process outlined in Table 2 (see also Enders, 2006).
Table 2
Pooling Procedure for Multiple Parameter Estimates and Standard Errors

1. Pooled parameter estimate: $\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i$, where $m$ is the number of imputations and $\hat{Q}_i$ is the parameter estimate from the ith imputed dataset.

2. Pooled standard error:
a. Within-imputation variance: $\bar{U} = \frac{1}{m}\sum_{i=1}^{m}\hat{U}_i$, where $\hat{U}_i$ is the variance estimate from the ith imputed dataset.
b. Between-imputation variance: $B = \frac{1}{m-1}\sum_{i=1}^{m}(\hat{Q}_i - \bar{Q})^2$
c. Total imputation variance: $T = \bar{U} + \left(1 + \frac{1}{m}\right)B$
d. MI standard error: $\mathrm{S.E.} = \sqrt{T}$
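The pooling rules in Table 2 translate directly into a few lines of R. The function below is our own illustrative sketch (in practice, mice::pool() performs these computations), and the example values are invented.

```r
# Our illustrative translation of Table 2, assuming `est` and `se` hold
# the m parameter estimates and standard errors from the imputed datasets
pool_rubin <- function(est, se) {
  m     <- length(est)
  q_bar <- mean(est)                       # 1. pooled parameter estimate
  u_bar <- mean(se^2)                      # 2a. within-imputation variance
  b     <- sum((est - q_bar)^2) / (m - 1)  # 2b. between-imputation variance
  t_tot <- u_bar + (1 + 1 / m) * b         # 2c. total imputation variance
  c(estimate = q_bar, se = sqrt(t_tot))    # 2d. MI standard error
}

# invented example values for m = 5 imputations
pool_rubin(est = c(4.1, 3.9, 4.4, 4.0, 4.2), se = c(.80, .82, .79, .81, .80))
```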
In this section, we reviewed traditional and modern missing data methods. Generally, these
methods represent a progression from unprincipled approaches, such as case reduction and deterministic
methods, to principled ones, such as EM and MI. While traditional methods can be adequate for simple
missing data problems with low rates of missingness under the MCAR assumption, EM and MI generally
yield much better estimation results when data are MAR. Due to its flexibility, MI is particularly well-suited for addressing multivariate missing data problems under the normal model. MI has also
demonstrated relative robustness to misspecification of the imputation model (Beunckens, Sotto, &
Molenberghs, 2008) and deviations from multivariate normality (Graham & Schafer, 1999).
Comparison of Missing Data Methods
To demonstrate the effects of different approaches to handling missing data, we conducted a
comparison of several missing data methods using a small sample from the National Longitudinal Survey
of Youth 1997 (NLSY97, U.S. Bureau of Labor Statistics, 2009). The NLSY97 is a nationally representative annual survey that provides data to examine the transition process of secondary students
into postsecondary education and/or the workplace. We randomly selected a sample of 100 complete
cases from the 1996/97 base year cohort of 9th-graders to regress outcome scores on the Peabody
Individual Achievement Test (PIAT) math assessment on socioeconomic status, academic achievement,
and curriculum track. The PIAT is a widely-used brief assessment of academic achievement, and the
instrument’s mathematics assessment subtest was administered to all respondents who were in ninth grade
or lower during the NLSY97 base year data collection. The choice of these predictor and outcome
variables was guided by their frequent use in CTE research, along with the need to keep the analysis
simple for demonstration purposes. Also, we intentionally did not select a large sample to account for the
fact that many studies in applied workforce education research operate with smaller sample sizes (see
examples provided in Table 1). Table 3 provides details on the variables used for comparison.
Table 3
NLSY97 Variables Used for Comparison

CV_HH_POV_RATIO. Description: Ratio of household income to poverty level (referred to as Poverty ratio). Levels: Continuous. Role: Predictor.
YSCH-6800. Description: Grades received in eighth grade (Grades). Levels: 1=Mostly below Ds; 2=Mostly Ds; 3=About half Cs and Ds; 4=Mostly Cs; 5=About half Bs and Cs; 6=Mostly Bs; 7=About half As and Bs; 8=Mostly As. Role: Predictor.
TRANS_SCH_PGM. Description: Curriculum track (Track). Levels: 0=Academic; 1=CTE. Role: Predictor.
CV_PIAT_STANDARD_UPD. Description: PIAT standard score (PIAT). Levels: Continuous. Role: Outcome.

Note. The eight-level ordinal grades variable was treated as a continuous predictor for imputation purposes.
The distributional properties of all continuous predictors were examined, since standard missing
data methods, including MI, assume multivariate normality. The original household poverty ratio
variable was positively skewed and leptokurtic. After applying square root transformation, the poverty
ratio variable more closely approximated normality. No transformations were applied to any other
variable. Following this transformation, the Shapiro-Wilk test indicated the presence of multivariate
normality for the complete dataset (W = .983, p = .230). Descriptive statistics for the complete-case
sample are provided in Table 4.
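As a sketch, that screening step might be carried out as follows; the data frame and column names are ours, and mshapiro.test() from the mvnormtest package is one available multivariate Shapiro-Wilk implementation, not necessarily the routine behind the results above.

```r
# square root transformation to reduce the positive skew in poverty ratio
nlsy$povratio <- sqrt(nlsy$povratio)

# multivariate Shapiro-Wilk test on the continuous variables
# (mvnormtest expects variables in rows, hence the transpose)
library(mvnormtest)
mshapiro.test(t(as.matrix(nlsy[, c("povratio", "grades", "piat")])))
```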
Table 4
Descriptive Statistics for the Complete-case Sample (n=100)

                Min     Max     M       SD      Skewness   Kurtosis
Poverty ratio   3.46    31.21   17.80   5.384   -.150      -.090
Grades          3       8       5.80    1.287   -.197      -.466
Track           0       1       .61     •       •          •
PIAT            68      137     99.65   13.432  .149       .150
We first used the complete-case sample to regress PIAT scores on poverty ratio, grades, and track
in order to establish a baseline of the hypothetically true parameter estimates. Baseline results for the
complete sample are provided in Table 5.
Table 5
Baseline Results for the Complete Dataset (R2 Adj = .586)

            β         S.E.    t            df   CI-        CI+
Intercept   64.528    5.512   11.707***    96   53.588     75.469
Povratio    .648      .175    3.699***     96   .300       .996
Grades      4.839     .809    5.982***     96   3.233      6.444
Track       -7.334    2.023   -3.626***    96   -11.349    -3.319

*p<.05. **p<.01. ***p<.001.
The same analysis was repeated for different missing data mechanisms and rates of missingness
using listwise deletion (LD), mean substitution (MS), and multiple imputation (MI). LD and MS were
chosen due to their continued use in applied workforce education research, and MI was chosen due to its
rapidly increasing popularity as a modern missing data method. Parameter estimates were compared to
those of the complete-case baseline to determine differential effects on missing data bias. All analyses
were conducted in the statistics program R, which is widely available on the internet at no cost. Multiple
imputation was carried out using the Multiple Imputation by Chained Equations (MICE, Van Buuren &
Groothuis-Oudshoorn, 2009) package for R. Ten complete imputed datasets were created for each
application of MI under different missing data mechanisms and rates of missingness.
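A minimal sketch of this pipeline, assuming a data frame nlsy with the variables from Table 3 (object names and seed are ours), is:

```r
library(mice)

imp <- mice(nlsy, m = 10, seed = 123, printFlag = FALSE)  # 10 imputed datasets
fit <- with(imp, lm(piat ~ povratio + grades + track))    # fit model in each
summary(pool(fit))  # pool estimates and standard errors (Rubin's rules, Table 2)
```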
MCAR
We imposed an MCAR mechanism by randomly deleting 40 (10%), 80 (20%), and 120 (30%)
observations from the sample’s 100 x 4 complete data matrix. Table 6 illustrates the missing data pattern
by category of missingness. The first row of the first pattern (i.e., 10% missingness) indicates that 64 out
of 100 cases in the sample are complete. The second row shows that eight cases have a missing value on
the poverty ratio variable, whereas the seventh row indicates the existence of one case with missing
values on both poverty ratio and PIAT. The last column summarizes the number of variables that have
missing values for the number of cases specified in the first column. The total number of missing values
is 40, and most of them (i.e., n=13) occur in the PIAT outcome variable. The interpretation applies
analogously to all other missing data patterns.
Table 6
MCAR Missing Data Patterns for Various Categories of Missingness

10% missingness
Cases   PR   G    T    PT   # Missing
64      1    1    1    1    0
8       0    1    1    1    1
8       1    0    1    1    1
6       1    1    0    1    1
10      1    1    1    0    1
1       0    0    1    1    2
1       0    1    1    0    2
2       1    0    1    0    2
Total   10   11   6    13   40

20% missingness
Cases   PR   G    T    PT   # Missing
36      1    1    1    1    0
8       0    1    1    1    1
13      1    0    1    1    1
13      1    1    0    1    1
14      1    1    1    0    1
4       0    0    1    1    2
2       0    1    0    1    2
4       1    0    0    1    2
1       0    1    1    0    2
5       1    1    0    0    2
Total   15   21   24   20   80

30% missingness
Cases   PR   G    T    PT   # Missing
13      1    1    1    1    0
8       0    1    1    1    1
11      1    0    1    1    1
21      1    1    0    1    1
16      1    1    1    0    1
10      0    0    1    1    2
6       0    1    0    1    2
2       1    0    0    1    2
4       0    1    1    0    2
2       1    0    1    0    2
5       1    1    0    0    2
1       0    0    1    0    3
1       0    1    0    0    3
Total   30   26   35   29   120

Note. PR = poverty ratio; G = grades; T = track; PT = PIAT.
For the four variable columns, 0 indicates missing data and 1 indicates observed data.
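As a minimal sketch, such an MCAR mechanism can be imposed by deleting cells completely at random from the 100 x 4 matrix; the function, object names, and seed below are ours, so the exact patterns in Table 6 will not be reproduced.

```r
set.seed(42)
make_mcar <- function(dat, n_missing) {
  # draw cell positions uniformly at random across the whole data matrix
  cells <- arrayInd(sample(nrow(dat) * ncol(dat), n_missing), .dim = dim(dat))
  for (k in seq_len(n_missing)) dat[cells[k, 1], cells[k, 2]] <- NA
  dat
}
nlsy_mcar10 <- make_mcar(nlsy, 40)   # 10% of 400 cells; 80 and 120 likewise
```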
Listwise deletion. The MCAR mechanism randomly deleted 40, 80, and 120 values across all four
variables in the 400-cell complete data matrix; listwise deletion then discarded every case with a missing
value on any variable, retaining 64, 36, and 13 complete cases, respectively (see Table 6). When compared
with the hypothetically true parameters from the complete dataset (see Table 5), results from the listwise-deletion-based analysis
yielded Type II errors for track (for 10% and 20% missingness), as well as poverty ratio and grades (for
30% missingness). This means that the null hypothesis for these variables is incorrectly retained. Type II
errors and biased regression coefficients were accompanied by inflated adjusted R2 values and standard
errors for 20 and 30% missingness. Table 7 lists results for listwise deletion under MCAR.
Table 7
Regression Results for Listwise Deletion, Mean Substitution, and Multiple Imputation under MCAR

            β         S.E.     t            df   CI-        CI+

Listwise deletion
10% missingness (R2 Adj = .556)
Intercept   61.611    6.965    8.846***     60   47.679     75.543
Povratio    .710      .212     3.355**      60   .287       1.134
Grades      4.593     1.058    4.342***     60   2.477      6.709
Track       -4.379    2.661    -1.645       60   -9.702     .944
20% missingness (R2 Adj = .602)
Intercept   53.186    9.735    5.464***     32   33.357     73.015
Povratio    .645      .285     2.263*       32   .064       1.225
Grades      6.408     1.549    4.136***     32   3.252      9.564
Track       -3.036    3.483    -.872        32   -10.129    4.058
30% missingness (R2 Adj = .610)
Intercept   87.526    16.341   5.356***     9    50.559     124.492
Povratio    .527      .615     .857         9    -.864      1.918
Grades      1.332     2.400    .555         9    -4.096     6.760
Track       -17.018   6.596    -2.580*      9    -31.939    -2.097

Mean substitution
10% missingness (R2 Adj = .477)
Intercept   65.374    5.721    11.426***    96   54.017     76.731
Povratio    .662      .188     3.518**      96   .289       1.036
Grades      4.282     .881     4.860***     96   2.533      6.031
Track       -5.948    2.083    -2.855**     96   -10.082    -1.813
20% missingness (R2 Adj = .346)
Intercept   80.768    6.022    13.412***    96   68.814     92.722
Povratio    .464      .211     2.196*       96   .045       .883
Grades      2.957     .952     3.106**      96   1.067      4.847
Track       -8.856    2.357    -3.757***    96   -13.535    -4.177
30% missingness (R2 Adj = .232)
Intercept   70.077    8.241    8.504***     96   53.719     86.435
Povratio    .719      .253     2.840**      96   .216       1.221
Grades      3.172     1.079    2.939**      96   1.030      5.314
Track       -4.956    3.000    -1.652       96   -10.911    .999

Multiple imputation
10% missingness (R2 Adj = .518)
Intercept   68.541    6.159    11.128***    96   56.270     80.812
Povratio    .705      .197     3.581**      96   .313       1.097
Grades      3.954     .961     4.116***     96   2.035      5.873
Track       -8.050    2.369    -3.398**     96   -12.794    -3.306
20% missingness (R2 Adj = .625)
Intercept   65.514    6.205    10.558***    96   52.880     78.148
Povratio    .461      .236     1.982*       96   .026       .948
Grades      5.286     .935     5.655***     96   3.402      7.170
Track       -7.330    2.552    -2.872**     96   -12.598    -2.062
30% missingness (R2 Adj = .503)
Intercept   61.995    9.251    6.701***     96   42.793     81.198
Povratio    .620      .272     2.278*       96   .064       1.176
Grades      5.156     1.302    3.960**      96   2.471      7.840
Track       -6.484    3.393    -1.911       96   -13.520    .551

*p<.05. **p<.01. ***p<.001.
Mean substitution. In our example, mean substitution was more robust to Type II error than
listwise deletion, and a misclassification of the track variable occurred only in the highest missingness
category. However, whereas standard error inflation and bias in the regression coefficients were moderate
compared with the performance of listwise deletion, negative bias in R2 values was substantial (see Table 7 for
results of mean substitution under MCAR).
Multiple imputation. Poverty ratio, grades, and PIAT were imputed using predictive mean
matching (PMM), which is the default method for imputing continuous data in MICE (see Van Buuren &
Groothuis-Oudshoorn, 2009, for detailed information on PMM). The binary track variable was imputed
using logistic regression. MI performed considerably better than LD and MS with regard to regression
coefficients, standard errors, and R2 values. Results were robust to Type II error up to, but not including,
30% missingness (see multiple imputation results under MCAR in Table 7).
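In mice terms, this per-variable specification corresponds to the method argument shown below (column order assumed to match Table 3; track must be a factor for "logreg", and with track coded as a factor these choices are also the package defaults):

```r
nlsy$track <- factor(nlsy$track, levels = c(0, 1))
imp <- mice(nlsy, m = 10,
            method = c("pmm", "pmm", "logreg", "pmm"),
            printFlag = FALSE)
```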
MAR
A MAR mechanism was imposed on the dataset by creating an artificial dependency between
poverty ratio and PIAT scores such that individuals with lower poverty ratios had a higher likelihood of
missingness on PIAT. Accordingly, we randomly deleted 10 (10%), 20 (20%), and 30 (30%) values
from the PIAT outcome variable for cases in the two lowest poverty ratio quartiles. Table 8 illustrates the
data pattern by category of missingness.
Table 8
MAR Missing Data Patterns for Various Categories of Missingness

10% missingness
Cases   PR   G    T    PT   # Missing
90      1    1    1    1    0
10      1    1    1    0    1
Total   0    0    0    10   10

20% missingness
Cases   PR   G    T    PT   # Missing
80      1    1    1    1    0
20      1    1    1    0    1
Total   0    0    0    20   20

30% missingness
Cases   PR   G    T    PT   # Missing
70      1    1    1    1    0
30      1    1    1    0    1
Total   0    0    0    30   30

Note. PR = poverty ratio; G = grades; T = track; PT = PIAT.
For the four variable columns, 0 indicates missing data and 1 indicates observed data.
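A minimal sketch of this MAR mechanism, under the same naming assumptions as the earlier MCAR sketch, deletes PIAT values only for cases in the two lowest poverty ratio quartiles:

```r
make_mar <- function(dat, n_missing) {
  low <- which(dat$povratio <= quantile(dat$povratio, .50))  # lowest two quartiles
  dat$piat[sample(low, n_missing)] <- NA
  dat
}
nlsy_mar10 <- make_mar(nlsy, 10)   # 10%; 20 and 30 analogously
```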
Listwise deletion. When compared with the hypothetically true parameters from the complete
dataset, results from the listwise deletion-based analysis yielded Type II errors on poverty ratio for 20%
and 30% missingness. R2 values were deflated, whereas regression coefficients and standard errors were
moderately inflated. Table 9 lists results for listwise deletion under MAR.
Table 9
Regression Results for Listwise Deletion, Mean Substitution, and Multiple Imputation under MAR

            β         S.E.     t            df   CI-        CI+

Listwise deletion
10% missingness (R2 Adj = .459)
Intercept   70.932    6.487    10.934***    60   58.036     83.828
Povratio    .442      .197     2.244*       60   .050       .834
Grades      4.472     .878     5.093***     60   2.726      6.217
Track       -6.997    2.058    -3.400**     60   -11.089    -2.906
20% missingness (R2 Adj = .423)
Intercept   77.866    7.366    10.571***    32   63.196     92.536
Povratio    .256      .215     1.192        32   -.172      .685
Grades      4.126     .935     4.411***     32   2.263      5.990
Track       -8.147    2.166    -3.761***    32   -12.461    -3.833
30% missingness (R2 Adj = .463)
Intercept   75.706    7.623    9.931***     9    60.486     90.927
Povratio    .370      .231     1.600        9    -.092      .833
Grades      4.085     .980     4.169***     9    2.129      6.040
Track       -8.898    2.339    -3.804***    9    -13.569    -4.227

Mean substitution
10% missingness (R2 Adj = .337)
Intercept   85.552    4.846    14.635***    96   73.948     97.155
Povratio    .110      .186     .594         96   -.258      .479
Grades      3.226     .858     3.760***     96   1.523      4.929
Track       -6.913    2.145    -3.222**     96   -11.171    -2.654
20% missingness (R2 Adj = .281)
Intercept   94.993    5.569    17.058***    96   83.939     106.046
Povratio    -.080     .177     -.450        96   -.431      .272
Grades      2.453     .817     3.001**      96   .831       4.075
Track       -7.338    2.044    -3.591**     96   -11.394    -3.281
30% missingness (R2 Adj = .278)
Intercept   93.533    5.449    17.166***    96   82.717     104.349
Povratio    -.023     .173     -.134        96   -.367      .320
Grades      2.416     .800     3.021**      96   .829       4.003
Track       -6.845    2.000    -3.423**     96   -10.814    -2.876

Multiple imputation
10% missingness (R2 Adj = .560)
Intercept   67.604    6.661    10.149***    96   54.095     81.113
Povratio    .516      .204     2.528*       96   .104       .928
Grades      4.760     .846     5.625***     96   3.075      6.445
Track       -6.990    2.008    -3.481**     96   -10.976    -3.003
20% missingness (R2 Adj = .561)
Intercept   71.488    6.446    11.090***    96   58.454     84.522
Povratio    .334      .209     1.992*       96   .089       .758
Grades      4.876     .813     6.000***     96   3.261      6.491
Track       -7.818    2.142    -3.650**     96   -12.092    -3.542
30% missingness (R2 Adj = .584)
Intercept   71.354    7.633    9.348***     96   55.422     87.287
Povratio    .497      .197     2.519*       96   .100       .895
Grades      4.350     .926     4.700***     96   2.482      6.219
Track       -9.050    2.670    -3.389**     96   -14.573    -3.526

*p<.05. **p<.01. ***p<.001.
Mean substitution. In our example, mean substitution led to Type II error with only 10%
missing values on the PIAT outcome variable. Negative bias in R2 values was substantial, as was bias in
coefficients. Standard errors were close to the complete-case benchmark (see Table 9 for results of mean
substitution under MAR).
Multiple imputation. Whereas both listwise deletion and mean substitution produced Type II
error, MI estimates were robust across all three tested categories of missingness. R2 values were
consistently close to the complete-case benchmark, as were regression coefficients and standard errors
(see Table 9 for results for multiple imputation under MAR).
Relative Efficiency of Missing Data Methods
We used a relative efficiency approach to assess performance differences between the missing
data methods examined in our example. Relative efficiency is a concept in which two estimators of a
given parameter of interest are compared against each other in terms of their bias relative to the
parameter’s hypothesized true value. To illustrate, let T represent the first estimator and T′ the second
estimator of parameter θ. T is relatively more efficient if σ2(T) < σ2(T′) for all possible values of θ (Panik,
2005). For ease of implementation, we chose R2 as our reference statistic although relative efficiencies
could have also been calculated for each individual t-value for the regression coefficients. Given that our
example was based on a single estimation run for each method, we adapted the relative efficiency
approach by substituting estimator variance with differences between the hypothesized true adjusted R2
value and the adjusted R2 value produced by each missing data method. Missing data methods were
compared pairwise, with MI being the reference method in each comparison. Our adaptation of relative
efficiency resulted in
(βˆ†π‘… 2 𝐴𝑑𝑗)2
R.E.1 vs. 2 = (βˆ†π‘…12 𝐴𝑑𝑗)2
2
where (βˆ†π‘…12 𝐴𝑑𝑗)2 is the difference between the hypothesized true adjusted R2 value from the complete
dataset and the adjusted R2 value generated by method 1, and (βˆ†π‘…22 𝐴𝑑𝑗)2 is the difference between the
hypothesized true adjusted R2 value from the complete dataset and the adjusted R2 value generated by
method 2 (i.e., multiple imputation as the reference method). Adapted relative efficiency values are
provided in Tables 10 and 11 for MCAR and MAR, respectively.
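The adapted formula reduces to a one-line function in R; the call shown reproduces the first listwise deletion entry of Table 10 up to rounding:

```r
# adapted relative efficiency, with MI as the reference (method 2);
# r2_true is the adjusted R2 from the complete data
rel_eff <- function(r2_method, r2_mi, r2_true = .586) {
  (r2_true - r2_method)^2 / (r2_true - r2_mi)^2
}
rel_eff(r2_method = .556, r2_mi = .518)  # LD vs. MI at 10% MCAR: about .195
```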
Table 10
Relative Efficiency of Missing Data Methods under MCAR based on Adjusted R2

Method                % Missing   R2 Adj   (ΔR2 Adj)2   R.E. vs. MI
Listwise deletion     10          .556     .0009        .195
                      20          .602     .0003        .168
                      30          .610     .0006        .084
Mean substitution     10          .477     .0119        2.569
                      20          .346     .0576        37.870
                      30          .232     .1253        18.191
Multiple imputation   10          .518     .0046        __
                      20          .625     .0015        __
                      30          .503     .0069        __

Note. Efficiency was computed relative to the hypothesized true adjusted R2 value from the complete data (.586; see
Table 5). Two equally-efficient estimators have a relative efficiency of 1. Given that MI was the reference method,
R.E. < 1 indicates that the respective comparison missing data method is more efficient than MI. Likewise, R.E. > 1
indicates that MI is more efficient.
Table 11
Relative Efficiency of Missing Data Methods under MAR based on Adjusted R2

Method                % Missing   R2 Adj   (ΔR2 Adj)2   R.E. vs. MI
Listwise deletion     10          .459     .0161        23.859
                      20          .423     .0266        42.510
                      30          .463     .0151        3782.250
Mean substitution     10          .337     .0620        91.717
                      20          .281     .0930        148.840
                      30          .278     .0949        23716.000
Multiple imputation   10          .560     .0007        __
                      20          .561     .0006        __
                      30          .584     .0000        __

Note. Efficiency was computed relative to the hypothesized true adjusted R2 value from the complete data (.586; see
Table 5). Two equally-efficient estimators have a relative efficiency of 1. Given that MI was the reference method,
R.E. < 1 indicates that the respective comparison missing data method is more efficient than MI. Likewise, R.E. > 1
indicates that MI is more efficient.
Discussion
The purpose of this article was threefold: (a) to determine how CTE researchers have managed
missing data in several recent issues of CTER, (b) to review missing data theory and alternative methods
of handling missing data, and (c) to illustrate the detrimental effects that unprincipled missing data
techniques may have on the accuracy of research results. Our overarching goal was to offer guidelines
that may help CTE researchers to adequately address missing data problems in their own work.
The examination of recent research in our field reveals that more attention needs to be paid to the
issue of missing data. Improvements are necessary in the reporting of missing data, as well as the methods
used for their treatment. These concerns are intensified by the frequent use of small sample sizes. While
small sample sizes are not problematic in certain types of research designs, they do present a problem
when using survey data to conduct inferential statistical analysis. Small sample sizes may not provide
enough power to detect differences, and the use of case reduction methods to remedy missing data in
small samples further exacerbates this issue. Finally, low response rates to survey research were reported
in the 20-30% range for a number of studies. The extent of non-response bias threatens the
generalizability of findings from these studies, for non-responders may be systematically different from
responders (i.e., missing data are not likely to be MCAR). Traditional missing data methods, however,
perform acceptably only under MCAR.
Our example using a small sample of real-life data illustrated the various detrimental effects of
traditional approaches to handling missing data. These effects include the occurrence of Type II error,
biased regression coefficients, standard errors, and adjusted R2 values, as well as loss of variance (for
mean substitution) and statistical power/efficiency (for listwise deletion). Clearly, the repercussions of
distorted analysis results can be serious for policy and practice, as intervention effects and other outcomes
of interest may be severely misestimated.
MI yielded estimates that were much closer to those of the complete-case benchmark. The
method’s performance was particularly robust under the MAR assumption, which is more realistic than
MCAR in most education and social science datasets. While researchers must keep in mind that imputed
data represent plausible as opposed to real data, and that the uncertainty inherent in the missing data will
be reflected in larger standard errors, MI has allowed us to draw inferences that would be highly similar
to those from complete data. Overall, our findings are in line with prior studies that have illustrated the
superior performance of MI as a prominent modern missing data technique (e.g., Graham, Hofer, &
MacKinnon, 1996; Roth & Switzer, 1995; Schafer & Graham, 2002).
Conclusion
Schlomer et al. (2010) urged researchers to provide detail in their manuscripts about the extent
and nature of the missing data they encountered, describe any procedures they used to manage this
missing data, and include a rationale for selecting particular techniques to manage the missing data.
Incorporating this type of expectation in the review process of the CTER and other journals in workforce
education would contribute to reducing the potentially biasing effects that missing data can have, as
detailed in our comparison of missing data techniques with a small sample from NLSY97. These
potentially biasing effects can produce incorrect inferences that may, in the worst case, have implications
for CTE policy decisions.
Schlomer et al.’s (2010) suggestions, if adopted, will not change behavior overnight. Our review
of the treatment of missing data for four recent volumes of the CTER showed little attention to this
important aspect of data analysis. Nevertheless, directing attention toward missing data now can enhance
the quality of the research generated by CTE researchers, and also provide a more robust basis for making
recommendations for policy and practice.
References
Allison, P. D. (2002). Missing data. Thousand Oaks, CA: Sage.
American Psychological Association. (2010). Publication manual. Washington, DC: Author.
Barnard, J., & Meng, X. L. (1999). Applications of multiple imputation in medical studies: From AIDS to
NHANES. Statistical Methods in Medical Research, 8, 7-36.
Beunckens, C., Sotto, C., & Molenberghs, G. (2008). A simulation study comparing weighted estimating
equations with multiple imputation based estimating equations for longitudinal binary data.
Computational Statistics and Data Analysis, 52, 1533-1548.
Brown, R. L. (1994). Efficacy of the indirect approach for estimating structural equation models with
missing data: A comparison of five methods. Structural Equation Modeling, 4, 287-316.
Chaudhuri, A., & Stenger, H. (1992). Survey sampling: Theory and methods. New York, NY: Dekker.
Chen, G., & Åstebro, T. (2003). How to deal with missing categorical data: Test of a simple Bayesian
method. Organizational Research Methods, 6, 309-327.
Chen, J., & Shao, J. (2001). Jackknife variance estimation for nearest-neighbor imputation. Journal of the
American Statistical Association, 96, 260-269.
Demirtas, H. (2005). Multiple imputation under Bayesianly smoothed pattern-mixture models for
nonignorable drop-out. Statistics in Medicine, 24, 2345-2363.
Demirtas, H., & Schafer, J. L. (2003). On the performance of random-coefficient pattern-mixture models
for nonignorable dropout. Statistics in Medicine, 21, 1-23.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the
EM algorithm. Journal of the Royal Statistical Society, 39, 1-38.
Eliason, S. R. (1993). Maximum likelihood estimation: Logic and practice. Newbury Park, CA: Sage.
Elliott, M. N., Edwards, C., Angeles, J., Hambarsoomians, K., & Hays, R. D. (2005). Patterns of unit and
item nonresponse in the CAHPS hospital survey. Health Services Research, 40, 2096-2119.
Enders, C. K. (2006). Analyzing structural equation models with missing data. In G. R. Hancock & R. O.
Mueller (Eds.), Structural equation modeling: A second course (pp. 313-342). Greenwich, CT:
Information Age.
Fielding, S., Fayers, P. M., & Ramsay, C. R. (2009). Investigating the missing data mechanism in quality
of life outcomes: A comparison of approaches. Health and Quality of Life Outcomes, 7, 1-10.
Given, B. A., Keilman, L. J., Collins, C., & Given, C. W. (1990). Strategies to minimize attrition in
longitudinal studies. Nursing Research, 39, 184-187.
Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of
Psychology. Advance online publication. doi: 10.1146/annurev.psych.58.110405.085530
Graham, J. W., & Donaldson, S. W. (1993). Evaluating interventions with differential attrition: The
importance of nonresponse mechanisms and use of follow-up data. Journal of Applied Psychology,
78, 119-128.
Graham, J. W., Hofer, S. M., & MacKinnon, D. P. (1996). Maximizing the usefulness of data obtained
with planned missing value patterns: An application of maximum likelihood procedures. Multivariate
Behavioral Research, 31, 197-218.
Graham, J. W., Olchowski, A. E., & Gilreath, T. D. (2007). How many imputations are really needed?
Some practical clarifications of multiple imputation theory. Prevention Science, 8, 206-213.
Graham, J. W., & Schafer, J. L. (1999). On the performance of multiple imputation for multivariate data
with small sample size. In R. H. Hoyle (Ed.), Statistical strategies for small sample research (pp. 1-29). Thousand Oaks, CA: Sage.
Groves, R. M., & Couper, M. P. (1998). Nonresponse in household interview surveys. New York, NY:
Wiley.
Honaker, J. & King, G. (2010). What to do about missing data in time-series cross-sectional data.
American Journal of Political Science, 54, 561-581.
Ibrahim, J. G. (1990). Incomplete data in generalized linear models. Journal of the American Statistical
Association, 85, 765-769.
Jeliĉić, H., Phelps, E., & Lerner, R. M. (2009). Use of missing data methods in longitudinal studies: The
persistence of bad practices in developmental psychology. Developmental Psychology, 45, 1195-1199.
Little, R. J. (1988). A test of missing completely at random for multivariate data with missing values.
Journal of the American Statistical Association, 83, 1198-1202.
Little, R. J., & Rubin, D. B. (1987). Statistical analysis with missing data. New York, NY: Wiley.
McKnight, P. E., McKnight, K. M., Sidani, S., & Figueredo, A. J. (2007). Missing data: A gentle
introduction. New York, NY: Guilford.
Mooney, C. Z. (1997). Monte Carlo simulation. Thousand Oaks, CA: Sage.
Panik, M. J. (2005). Advanced statistics from an elementary point of view. Burlington, MA: Elsevier.
Raaijmakers, Q. A. W. (1999). Effectiveness of different missing data treatments in surveys with Likert-type data: Introducing the relative mean substitution approach. Educational and Psychological
Measurement, 59, 725-748.
Roth, P. L., & Switzer, F. S. (1995). A Monte Carlo analysis of missing data techniques in HRM settings.
Journal of Management, 21, 1003-1023.
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581-592.
Rubin, D. B. (1978). Multiple imputation in sample surveys – a phenomenological Bayesian approach to
nonresponse. Proceedings of the Survey Research Methods Section. American Statistical Association,
20-34.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: Wiley.
Rubin, D. B. (1996). Multiple imputation after 18+ years. Journal of the American Statistical Association,
91, 473-489.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. Boca Raton, FL: Chapman & Hall.
Schafer, J. L. (1999). Multiple imputation: A primer. Statistical Methods in Medical Research, 8, 3-15.
Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological
Methods, 7, 147-177.
Schafer, J. L., & Olsen, M. K. (1998). Multiple imputation for multivariate missing-data problems: A data
analyst’s perspective. Multivariate Behavioral Research, 33, 545-571.
Schlomer, G. L., Bauman, S., & Card, N. A. (2010). Best practices for missing data management in
counseling psychology. Journal of Counseling Psychology, 57, 1–10.
Schulte Nordholt, E. (1998). Imputation: Methods, simulation experiments and practical examples.
International Statistical Review, 66, 157-180.
Sinharay, S., Stern, H. S., & Russell, D. (2001). The use of multiple imputation for the analysis of
missing data. Psychological Methods, 6, 317-329.
Stuart, E. A., Azur, M., Frangakis, C., & Leaf, P. (2009). Multiple imputation with large data sets: A case
study of the children’s mental health initiative. American Journal of Epidemiology, 169, 1133-1139.
Switzer, F. S., Roth, P. L., & Switzer, D. M. (1998). A Monte Carlo analysis of systematic data loss in an
HRM setting. Journal of Management, 24, 763-779.
Tsikriktsis, N. (2005). A review of techniques for treating missing data in OM survey research. Journal of
Operations Management, 24, 53-62.
U.S. Bureau of Labor Statistics. (2009). National longitudinal survey of youth 1997 [Data file and code
book]. Retrieved from https://www.nlsinfo.org/investigator/pages/welcome.jsp
Van Buuren, S., & Groothuis-Oudshoorn, K. (2009). MICE: Multivariate imputation by chained
equations in R. Journal of Statistical Software. Retrieved from
http://www.stefvanbuuren.nl/publications/MICE in R - Draft.pdf
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals:
Guidelines and explanations. American Psychologist, 54, 594–604.