Inaccurate Disparities Estimates:
Systematic Measurement Error’s
Biasing Influence on Disparities in
National Rates of Children with Special
Health Care Needs across Spanish and
English Speaking Households
Adam C. Carle, M.A., Ph.D.
adam.carle@cchmc.org
Division of Health Policy and Clinical Effectiveness
Cincinnati Children’s Hospital Medical Center
University of Cincinnati School of Medicine
AcademyHealth June 26, 2010: Boston, MA
Introduction
• To validly compare treatments and policies, the
field needs sound measurement.
– Observation and evaluation depend on self-report.
– Sound measurement = reliable and valid.
• To accurately examine outcomes across
subgroups, field requires measures demonstrating
similar reliability and validity across subgroups.
Introduction
• Little research has empirically evaluated child
health measures.
• (Drotar, 1998; Stein, 2004).
• This gap persists despite the fundamental importance of
reliable and valid measurement in research.
• Consider a measure to identify children with
special health care needs.
Introduction
• Children with special health care needs (CSHCN):
• Children with or at “increased risk for a chronic
physical, developmental, behavioral, or emotional
condition and who also require health and related
services of a type beyond that required for
children generally.”
• (McPherson et al., 1998).
Introduction
• CSHCN account for 12.8% of all US children.
• (van Dyck, et al., 2004).
• Yet, CSHCN’s medical expenses account for
over 80% of all children’s medical expenses.
• (Institute of Medicine, 1998).
Introduction
• Researchers have observed substantial differences
in the prevalence of CSHCN across:
– Race and ethnicity.
• White, Black, Hispanic.
– Main difference between non-Hispanic (NH) and Hispanic children.
– Primary household language.
• English vs. Other.
• Largest difference across Spanish speaking
Hispanics and all others.
• 4.6% vs. 14%!
– (Read, et al., 2007).
Introduction
• These rates specifically inform allocation of
resources to different subpopulations of children.
• Rates also guide how we understand the
effectiveness of health policies, services, and
treatments for different subgroups of children.
• Given a limited set of resources, we should:
– Direct resources to the groups most in need.
– Direct resources to the most successful policies.
– Direct resources to the most successful treatments.
Introduction
• Before making comparisons across subgroups, we
must consider cross-group measurement
equivalence.
– If responses affected by translation problems,
impossible to compare groups on observed scores.
• We must consider whether observed differences
reflect true differences.
• Do we allocate resources and deliver treatments
based on systematic measurement error?
Measurement Bias
• A possibility…
• Individuals with identical health statuses who
receive different translations may respond
differently to questions about their health.
– Should respond similarly, but don’t.
• Systematic measurement error.
– Measurement bias.
– Differential item functioning (DIF).
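A minimal sketch of the idea on this slide, under stated assumptions: two groups share the same latent health distribution (standard normal here), but the translated item has a higher response threshold. All numbers are hypothetical and chosen only for illustration; none are estimates from the survey.

```python
import math


def endorsement_rate(threshold, mean=0.0, sd=1.0):
    """P(X* >= threshold) when the latent response X* ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical thresholds: identical latent health, different item thresholds.
rate_version_a = endorsement_rate(threshold=1.0)
rate_version_b = endorsement_rate(threshold=1.5)

print(f"version A endorsement rate: {rate_version_a:.3f}")
print(f"version B endorsement rate: {rate_version_b:.3f}")
```

Because the latent distributions are identical by construction, any gap between the two printed rates comes from the threshold shift alone, which is the pattern DIF analyses try to detect.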
Why Study Bias?
• Generally decreases reliability and validity.
• (Knight & Hill, 1998).
• Can attenuate or accentuate group differences.
• (Carle, 2008).
• Can lead to inaccurate diagnoses and classifications.
• (Carle, 2009).
• Can render cross-group comparisons impossible.
• (Prelow, et al., 2002).
Bias and Translation
• No studies empirically examine equivalence of
CSHCN measures across translations.
– Does not imply failure to follow stringent translation
process methods.
– Despite best methods, differences still occur.
• The extent to which bias influences disparities
research therefore remains unclear.
Evaluating Bias
• Latent variable models offer a powerful way to investigate bias.
• (Millsap & Kwok, 2004; Muthén, 1989).
• Equations describe the relations among questions.
• Examine the cross-group equivalence of the
measurement parameters in the equations.
• (Millsap & Yun-Tein, 2004).
• Differences in these parameters across groups
reflect bias.
Evaluating Bias
• Multiple group (MG) confirmatory factor analyses
for ordered-categorical measures (CFA-OCM).
• A popular latent variable method.
Evaluating Bias: MG-CFA-OCM
• Let $X_{ij}$ equal the $i$th individual’s score on the
$j$th ordered-categorical item.
– Let the number of items be $p$ ($j = 1, 2, \ldots, p$).
– Scores, $m$, range over $\{0, 1, \ldots, s\}$.
• We assume a continuous latent response
variate, $X^{*}_{ij}$, determines observed responses.
Evaluating Bias: MG-CFA-OCM

• A threshold value on $X^{*}_{ij}$ determines responses:
– If $X^{*}_{ij}$ is less than the threshold, respond in one category.
– If $X^{*}_{ij}$ is greater than the threshold, respond in at least the next
highest category.
$$X_{ij} = m \quad \text{if} \quad \nu_{jm} \le X^{*}_{ij} < \nu_{j(m+1)}$$
• $\{\nu_{j0}, \nu_{j1}, \ldots, \nu_{j(s+1)}\}$ represent threshold parameters.
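The threshold rule can be sketched in a few lines. This is an illustration only: the function name and the example threshold values are hypothetical, and it assumes the usual convention that the outermost thresholds are minus and plus infinity, so only the interior thresholds are passed in.

```python
from bisect import bisect_right


def observed_category(latent_value, thresholds):
    """Return m such that nu_m <= latent_value < nu_(m+1).

    `thresholds` holds the finite interior thresholds nu_1 < ... < nu_s;
    nu_0 = -inf and nu_(s+1) = +inf are implied (an assumed convention).
    """
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be in ascending order")
    # bisect_right counts how many thresholds are <= latent_value,
    # which is exactly the category index m.
    return bisect_right(thresholds, latent_value)


# Hypothetical three-category item (s = 2) with two interior thresholds.
example_thresholds = [-0.5, 0.8]
for value in (-1.0, -0.5, 0.0, 0.8, 2.0):
    print(value, "->", observed_category(value, example_thresholds))
```

A latent value that sits exactly on a threshold falls in the higher category, matching the "less than or equal" on the left side of the inequality.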
Evaluating Bias: MG-CFA-OCM
• Suppose some factor or set of factors, $\xi$, is
responsible for the observed scores.
• $X^{*}_{ij}$ relates to the factor(s) as follows:
$$X^{*}_{ij} = \tau_{j} + \lambda_{j}\,\xi_{i} + \varepsilon_{ij}$$
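To make the link between the factor and the observed categories concrete, the sketch below computes the probability of each response category for a given factor score. It assumes a single factor, a latent intercept, a loading, and a standard normal unique factor (a probit-type model); these distributional choices and all names and values are illustrative assumptions, not details taken from the slides.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probabilities(xi, intercept, loading, thresholds):
    """P(X = m | xi) for m = 0..s, assuming X* = intercept + loading * xi + eps
    with eps ~ N(0, 1) and ascending interior thresholds."""
    mean = intercept + loading * xi
    # Cumulative P(X* < nu) at each interior threshold, bracketed by 0 and 1.
    cumulative = [0.0] + [normal_cdf(nu - mean) for nu in thresholds] + [1.0]
    return [upper - lower for lower, upper in zip(cumulative, cumulative[1:])]


# Hypothetical binary item: one interior threshold at 0.
print(category_probabilities(xi=0.0, intercept=0.0, loading=1.0, thresholds=[0.0]))
print(category_probabilities(xi=1.0, intercept=0.0, loading=1.0, thresholds=[0.0]))
```

Raising the factor score shifts probability toward the higher category; changing the intercept, loading, or thresholds for one group only would change that mapping for otherwise identical respondents.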
Evaluating Bias: MG-CFA-OCM
• Subscript the parameters to allow for group
differences.
• To the extent that cross-group constraints lead to
problematic fit, measurement bias is present.
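One way to write the group-subscripted model and the constraints being tested is given below. The notation is an assumed, conventional one (group index $k$, latent intercept $\tau$, loading $\lambda$, factor $\xi$, unique factor $\varepsilon$, thresholds $\nu$) and is offered as a sketch rather than as the exact parameterization used in the analyses.

```latex
\begin{aligned}
X^{*}_{ijk} &= \tau_{jk} + \lambda_{jk}\,\xi_{ik} + \varepsilon_{ijk},\\
X_{ijk} &= m \quad \text{if} \quad \nu_{jkm} \le X^{*}_{ijk} < \nu_{jk(m+1)},\\[4pt]
\text{equal loadings:} \quad & \lambda_{j1} = \lambda_{j2} \quad \text{for all } j,\\
\text{equal thresholds:} \quad & \nu_{j1m} = \nu_{j2m} \quad \text{for all } j, m.
\end{aligned}
```

If imposing these equalities across the English and Spanish groups noticeably worsens fit, the corresponding parameters differ across groups, which is what the slide calls measurement bias.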
Evaluating Bias: MG-CFA-OCM
• Essentially, we ask if we “see” the same picture
for each group.
• If not, bias exists.
– Not good!
• If yes, we have equivalent measurement.
– A good thing!
Evaluating Bias
• Methodological and substantive issues can limit
MG-CFA-OCM.
• Difficult to incorporate multiple grouping
variables simultaneously.
• Why does this matter?
Observational Research
• Survey respondents not randomly assigned.
• Differences across translations may result from
translation problems, or….
• Differences may result from other variables that
covary with language.
Observational Research
• Individuals who speak only Spanish tend to:
– Come from poorer homes.
– Have lower educational achievement.
• (Yu, et al., 2002).
• Both of these predict CSHCN status.
• Apparent translation differences may reflect
systematic differences in the distribution of SES,
not bias.
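The confounding argument can be illustrated with simple arithmetic. In the sketch below, the probability of being identified depends on SES alone and never on language, yet the language groups still differ overall because their SES mix differs. Every number is hypothetical and chosen only to show the mechanism, including the direction of the SES effect.

```python
# Hypothetical inputs, chosen only for illustration (not survey estimates).
# Identification probability depends on SES alone, never on language.
P_IDENTIFIED_BY_SES = {"low": 0.08, "high": 0.16}
# Hypothetical share of each language group living in low-SES households.
SHARE_LOW_SES = {"english": 0.30, "spanish": 0.80}


def marginal_rate(language):
    """Overall identification rate for a language group, averaging over SES."""
    low = SHARE_LOW_SES[language]
    return low * P_IDENTIFIED_BY_SES["low"] + (1.0 - low) * P_IDENTIFIED_BY_SES["high"]


for language in ("english", "spanish"):
    print(f"{language}: {marginal_rate(language):.3f}")
```

Within each SES stratum the two language groups are identical by construction, so a comparison that ignores SES would attribute the overall gap to language.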
Observational Research
• Difficult to simultaneously include multiple
variables in “traditional” latent variable
approaches.
• Failure to include available information in model
estimation may lead to erroneous conclusions.
• What do we do?
MG-MIMIC Models
• Multiple group (MG) multiple indicator,
multiple cause (MIMIC) models.
– Build on developments in structural equation
modeling, IRT, and CFA-OCM.
• (Jones, 2003; Jones, 2006; Muthén, 1989).
• Control for “extra” (background) variables by
incorporating them as covariates.
• More fully address heterogeneity within and
across groups.
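A generic MIMIC specification, written in assumed conventional notation rather than the exact model fitted here, adds a vector of background covariates $\mathbf{z}_{i}$ (for example, education and income) in two places:

```latex
\begin{aligned}
\xi_{i} &= \boldsymbol{\gamma}^{\top}\mathbf{z}_{i} + \zeta_{i},\\
X^{*}_{ij} &= \tau_{j} + \lambda_{j}\,\xi_{i} + \boldsymbol{\beta}_{j}^{\top}\mathbf{z}_{i} + \varepsilon_{ij}.
\end{aligned}
```

Here $\boldsymbol{\gamma}$ captures covariate differences in the factor itself, while a nonzero direct effect $\boldsymbol{\beta}_{j}$ signals that item $j$ functions differently across covariate levels even at the same factor score. In the multiple group version, these parameters would additionally carry a group subscript so that they can be compared across language groups.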
MG-MIMIC Models
Review
• Children in Spanish speaking households are
much less likely to be identified as CSHCN.
– 4.6% vs. 14%.
• Measurement bias may partially or fully account
for this difference.
• No studies have examined this possibility.
Why should we care?
• Beyond “just” achieving accurate estimates, why
should we care?
• The Maternal and Child Health Bureau allocates
approximately one BILLION dollars each YEAR
to the child health population.
• States must spend 30% of this money on CSHCN.
• Distribution of funds depends in part on
distribution of CSHCN.
Why should we care?
• If CSHCN measures systematically fail to identify
minority CSHCN….
• Minority CSHCN fail to receive services as a
function of measurement bias.
• Measurement bias may influence hundreds of
millions of dollars in federal spending.
• Measurement bias may cause minority children to
suffer needlessly.
Current Study
• Utilized MG-MIMIC.
– Probed for bias across English to Spanish translation of
the CSHCN Screener in the 2005-2006 National Survey
(NS) of CSHCN.
• Principal source of information on US CSHCN.
• MG-MIMIC models simultaneously included the
family’s education and income level in analyses.
• Compared MG-MIMIC to MG-CFA-OCM.
– MG-CFA-OCM “only” examined language.
Methods
• Participants (n = 350,345) came from the National
Survey of CSHCN (NS-CSHCN).
– English n = 326,050.
– Spanish n = 24,295.
– Sponsored by Maternal and Child Health Bureau.
– Fielded by National Center for Health Statistics
(NCHS).
• (Blumberg, et al., 2008).
• Provides state and national data on CSHCN.
– Represents noninstitutionalized children.
– All analyses accounted for complex design.
Results: MG-CFA-OCM
• Similar general model fit across language
versions.
– RMSEA = 0.025
– CFI = 0.99
– TLI = 0.99
– McDonald’s NCI > 0.99
– Gamma Hat = 0.998
• Items should measure a similar construct and
have comparable meaning regardless of
translation.
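For readers unfamiliar with the indices listed above, the sketch below computes them from chi-square statistics using common textbook formulas. It is an illustration under assumptions: the slides do not report the underlying chi-square values, software packages differ in details (for example, N versus N − 1, or a multiple-group adjustment to RMSEA), and robust estimators for ordered-categorical data use scaled statistics. The function name and example inputs are hypothetical.

```python
import math


def fit_indices(chi2_model, df_model, chi2_baseline, df_baseline, n, n_items):
    """Common SEM fit indices from model and baseline chi-square statistics."""
    excess = max(chi2_model - df_model, 0.0)            # model misfit beyond df
    excess_base = max(chi2_baseline - df_baseline, 0.0)  # baseline misfit beyond df
    ratio_model = chi2_model / df_model
    ratio_base = chi2_baseline / df_baseline
    return {
        "rmsea": math.sqrt(excess / (df_model * (n - 1))),
        "cfi": 1.0 - excess / max(excess_base, excess, 1e-12),
        "tli": (ratio_base - ratio_model) / (ratio_base - 1.0),
        "mcdonald_nci": math.exp(-0.5 * excess / (n - 1)),
        "gamma_hat": n_items / (n_items + 2.0 * excess / (n - 1)),
    }


# Hypothetical inputs only; these are not values from the study.
example = fit_indices(chi2_model=30.0, df_model=10,
                      chi2_baseline=5000.0, df_baseline=15,
                      n=1001, n_items=5)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Lower RMSEA and values of the other indices closer to 1 indicate better fit, which is how the values on the slide are being read.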
Results : MG-CFA-OCM
• Model constraining the loadings demonstrated
measurement bias.
– RMSEA = 0.019
– CFI > 0.99
– TLI > 0.99
– McDonald’s NCI > 0.99
– Gamma Hat = 0.999
– $\Delta\chi^{2}$(3, n = 350,345) = 11.727, p < 0.01
Results : MG-CFA-OCM
• Items would not relate to underlying construct
similarly across groups.
• Estimates wouldn’t deliver similar reliability
across language.
Results : MG-CFA-OCM
• Model constraining the thresholds demonstrated
statistically significant measurement bias.
– RMSEA = 0.018
– CFI > 0.99
– TLI > 0.99
– McDonald’s NCI > 0.99
– Gamma Hat = 0.999
– $\Delta\chi^{2}$(3, n = 350,345) = 76.434, p < 0.01
• Responses would not correspond to similar health
levels.
• Translation would not provide equivalent
measurement.
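As a rough check on the reported difference tests, the sketch below converts a chi-square statistic with 3 degrees of freedom into an upper-tail p-value using the closed form for that case. It treats the reported statistics as ordinary chi-square variates, which is an assumption: difference tests from robust estimators under a complex survey design are usually computed with a scaled procedure, so the exact p-values may differ.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


# Statistics as reported for the constrained-loadings and constrained-thresholds models.
for label, statistic in [("loadings constrained", 11.727),
                         ("thresholds constrained", 76.434)]:
    print(f"{label}: p = {chi2_sf_df3(statistic):.2e}")
```

Both values fall below 0.01, consistent with the slides. With a sample of 350,345, even small departures from equivalence can reach statistical significance, so the size of the statistic alone says little about practical importance.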
Discussion
• MG-MIMIC showed that measurement bias was
not present across translations.
– After accounting for systematic differences in SES
across groups, I found NO bias.
– Revealed that the CSHCN Screener did provide
equivalent measurement across translations.
– Indicates we can trust estimates across translations.
• When comparing within similar sociodemographic
groups.
Discussion
• MG-CFA-OCM indicated measurement bias
across translations.
– Suggested that the CSHCN Screener did not provide
equivalent measurement across translations.
– This would mean that we could not trust estimates
across English and Spanish versions of the Screener.
Discussion
• Traditional methods were unable to model the
systematic influence of multiple variables.
• Led to an erroneous conclusion.
– Suggested that bias was present across translations.
• This would incorrectly indicate that
differences in prevalences of CSHCN across
languages reflect systematic measurement error due to
language.
– When disregarding sociodemographic groupings.
Discussion
• MG-MIMIC provided more accurate picture.
– Modeled systematic influence of multiple variables.
• Differences in prevalences of CSHCN across
languages likely do not reflect measurement bias.
– When considering sociodemographic differences.
• Likely reflect true differences.
Limitations
• Self report data.
– Responses may not reflect children’s actual health.
• The Screener addresses only five health consequences.
– Consequences not included on the Screener may have
more relevance among Spanish speaking Hispanics.
Limitations
• MG-MIMIC can only detect biased thresholds for
background variables.
– Leaves loadings unaddressed.
– Problem for background variables only.
– Missing data approach possibilities.
• If bias permeates the entire Screener, analyses
cannot detect this.
– No statistical approach can.
• (Millsap, 2006).
Summary
• Findings strongly support comparing CSHCN Screener
responses across Spanish and English versions.
– Only when examining within similar
sociodemographic groups.
• Increases confidence in research showing
significant differences in CSHCN prevalences
across English and Spanish speaking children.
– When comparing similar sociodemographic groups.
Summary
• MG-MIMIC and MG-CFA-OCM differences:
– Highlight importance of simultaneously considering
multiple influences on measurement.
– Some cross-cultural measurement differences result
from systematic education and income differences.
– Using model-based estimates can mitigate systematic
measurement error and lead to more accurate
disparities estimates and research.
Conclusion
• When conducting research across distinct racial,
ethnic, and/or cultural groups, investigators
should establish measurement equivalence to
ensure validity of results before developing policy
and allocating resources based on the results.
• In this way, we can better advance positive
outcomes in a diverse world.