Reviewer's report
Title: Measuring socioeconomic status in multi-country studies: Results from the eight-country MAL-ED study
Version: 1
Date: 20 August 2013
Reviewer: Sean Green
Reviewer's report:
Major compulsory issues
1. The analysis is thorough and the paper clearly articulates the need for a better
measure of SES for cross-country comparisons, but the authors need to present
a more compelling case that the wealth measure they have created meets this
need. On lines 263-265 (pg 11) the authors mention that HAZ is used instead of
WHZ because HAZ is a better composite measure of acute and chronic
deprivation, but there is no explanation for why assessing the association
between the composite SES measure and HAZ is a sufficient test of the
suitability of the SES measure. If association with HAZ is an adequate test for an
SES measure, and parsimony is a consideration, why not just use HAZ itself? It
would be good if the analysis could show how the composite SES index fares
when associated with other quantities of interest (e.g. biostatistical measures
such as WHZ, WAZ, and BMIZ, or other survey measures that are routinely
associated with SES in single-country analyses).
Thank you for this comment. We will respond in two parts, first to the question of why we used
the association between SES and HAZ to assess suitability of our SES measure, and second,
whether our new measure is an improvement over what previously existed.
The question of why we assessed associations between SES and HAZ is an important one,
which we discussed at length while working on this paper. First, the association with HAZ is not
the only way we assessed the suitability of this SES measure, or even the primary way. The
development of the four possible measures of wealth, and then the full measure of SES, was
based on an understanding of underlying theory (e.g. SES represents wealth/income,
education, occupation), as well as grounded in the extensive literature exploring approaches to
measuring wealth and SES, such as Filmer & Pritchett’s 2001 paper in Demography. In our
view, this strong grounding is sufficient justification for the approaches to measuring SES that
we compare, with the exception of mother’s education, which is included because it has been
used (arguably incorrectly) in previous research as a simple proxy.
We chose to take these analyses of suitability a step further for two reasons: 1) we were
interested in comparing the four measures of wealth directly, and 2) analyzing associations of a
construct of interest with other constructs that are believed to be related theoretically or
empirically is an accepted approach to assessing construct validity (Cronbach & Meehl 1955).
We have added this explanation more explicitly starting in line 263. Further, we could not use
HAZ as a proxy for SES because it was one of our primary outcomes of interest, and because
that approach could not be justified on theoretical grounds. Although we expected HAZ and
SES to be statistically associated, we do not assume that they are equivalent.
As described (in what is now lines 267-269), we chose HAZ because it is understood to be a
child growth measure associated with chronic deprivation, rather than WHZ, which is more often
associated with acute deprivation. This was more closely aligned with our theoretical
understanding of SES. However, we did also assess associations with WHZ (not associated, as
expected, results not shown).
Further, in analyses that were published in PHM in December 2012 (see citation below), we
found that, while food access insecurity and low SES are related in the study population, food
access insecurity also appears to capture a separate area of vulnerability that is (at least in part)
independent of SES. In the case of that study, we found similar results when using a PCA-based
measure of SES and a simpler sum score of household assets. Our final measure of SES
in the current analyses was also significantly associated with food access insecurity, as
expected given the hypothesized nomological network (Cronbach & Meehl 1955).
Stephanie Psaki; Zulfiqar Bhutta; Tahmeed Ahmed; A.M. Shamsir Ahmed; Pascal Bessong;
Munirul Islam; Sushil John; Margaret Kosek; Cebisa Nesamvuni; Prakash Shrestha; Erling
Svensen; Stephanie Richard; Jessica Seidman; Laura Caulfield; Aldo A Lima; Mark Miller;
William Checkley; MALED Network Investigators. Household food insecurity and child
malnutrition: Results from the eight-country MAL-ED study. Population Health Metrics. 2012 Dec
13; 10(1): 24.
In response to whether our measure is an improvement over what previously existed, we
believe that it is for several reasons: 1) It is a robust measure of SES that reflects theory on
what SES is intended to measure, 2) It reduces the data collection burden by highlighting a
priority set of indicators for measurement, in contrast to commonly-used PCA approaches,
which require collecting data on a full set of indicators, even if some are irrelevant, 3) It is
computationally simple to apply, once the priority assets have been selected using the random
forests technique. We have added text to more explicitly outline these contributions starting in
line 453.
[Cronbach, LJ & Paul E. Meehl. Construct Validity in Psychological Tests. Psychological Bulletin.
(1955). 52: 281-302.]
2. The statement on lines 231 and 232 of page 10 "household eligibility was
determined based on location..." is a bit ambiguous. Were households that
comprised the intersection of the set of households within the MAL-ED
catchment area and the set of households with children aged 24 to 60 months
chosen, or were certain locations in the catchment area favorably
selected (e.g. peri-urban or rural locations)? The selection process is important
because it determines whether the 798 households selected differ significantly
in the proportions of households from peri-urban, urban, and rural
locations, and whether this data set differs from Filmer and Pritchett's approach to
DHS data with regard to bias towards sampling in urban areas. A clearer
explanation of the selection process or a chart showing summary statistics for
the proportion of households of each type would address this issue.
Thank you for this question; we can see that it was unclear as written in the text on lines 231-232.
All households that met those two criteria (within the MAL-ED catchment area and had a
child aged 24 to 60 months in the household) were eligible. We have edited the wording to
clarify this.
Table 1 describes each study site. We have added an additional column to more explicitly show
which sites are urban, rural, or peri-urban, and added some clarifying language in line 220 on
the study setting. Three of the sites are urban (in Brazil, Bangladesh, and India), two sites are
peri-urban (in Peru and Nepal), and three sites are rural (in South Africa, Pakistan, and
Tanzania). Given the mix, we do not feel that these analyses suffer from the bias toward urban
areas that Filmer and Pritchett discuss.
3. What was the percent of missingness in the data, especially in the attributes that
were selected by the random forest and the PCA index? Some matrix
factorization techniques are impeded by data with a high degree of missingness
so the software package will perform list-wise deletion to produce a matrix that is
positive semidefinite. Depending on the random forest algorithm used
missingness can be handled in several ways-- none of which involves list-wise
deletion. Depending on the degree of missingness in the data random forest's
variable importance measures may be aided by the fact that they include
observations that are excluded in the other methods because of the methods'
implicit treatment of missingness. Without knowing the completeness of the data
and the random forest algorithm used, it will be difficult to reproduce your results.
If your data has no missingness, then you may want to consider using conditional
variable importance (Carolin Strobl's method in R's cforest package) because it
can provide more robust variable importance results in data sets without missing data.
As described in lines 325-327, we dropped 11 observations because of missing or extreme
anthropometric values. All remaining observations had complete data on the variables used for
these analyses. We have added a sentence at line 327 with that clarification. We did calculate
and use conditional variable importance using the cforest package in R, and have added that
clarification in lines 287-288.
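The missingness check the reviewer asks for can be sketched as follows. This is a minimal illustration only: the records, the column names (`haz`, `asset_tv`, `maternal_edu`), and both helper functions are hypothetical and are not taken from the MAL-ED data or analysis code.

```python
from typing import Optional

Row = dict[str, Optional[float]]  # None marks a missing value


def percent_missing(rows: list[Row], columns: list[str]) -> dict[str, float]:
    """Percent of rows in which each column is missing."""
    n = len(rows)
    return {c: 100.0 * sum(r.get(c) is None for r in rows) / n for c in columns}


def listwise_complete(rows: list[Row], columns: list[str]) -> list[Row]:
    """Rows that survive list-wise deletion (no missing value in any column)."""
    return [r for r in rows if all(r.get(c) is not None for c in columns)]


# Made-up toy records with hypothetical column names (not study data).
rows = [
    {"haz": -1.2, "asset_tv": 1.0, "maternal_edu": 6.0},
    {"haz": None, "asset_tv": 0.0, "maternal_edu": 3.0},
    {"haz": -2.1, "asset_tv": None, "maternal_edu": 8.0},
    {"haz": -0.4, "asset_tv": 1.0, "maternal_edu": 10.0},
]
columns = ["haz", "asset_tv", "maternal_edu"]

missing = percent_missing(rows, columns)
complete = listwise_complete(rows, columns)
print(missing)
print(f"{len(complete)} of {len(rows)} rows remain after list-wise deletion")
```

Reporting both numbers makes it clear how many observations each method actually sees, which is the reproducibility concern raised in the comment.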
Minor essential revisions
4. There are three measures of variable importance associated with random
forests-- Gini importance, permutation importance, and conditional importance
(Strobl's method). The paper should be clear about which one was used,
because the chosen method has implications for how the importance measures
perform.
We thank the reviewer for this question and request for clarification of the method. Indeed, we
used Strobl’s approach. As the reviewer correctly points out, we calculated conditional variable
importance, and have added this clarification in lines 287-288.
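For readers unfamiliar with the distinction, the plain permutation importance mentioned by the reviewer can be sketched as below. This is not the conditional (Strobl) variant the authors report using, which permutes a predictor within strata defined by correlated predictors. The prediction function and data are invented stand-ins for a fitted model.

```python
import random
from typing import Callable


def mse(predict: Callable[[list[float]], float],
        X: list[list[float]], y: list[float]) -> float:
    """Mean squared error of predict over the rows of X."""
    return sum((predict(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, col: int,
                           n_repeats: int = 20, seed: int = 0) -> float:
    """Mean increase in MSE after shuffling one column, others left intact."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    increases = []
    for _ in range(n_repeats):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        increases.append(mse(predict, X_perm, y) - baseline)
    return sum(increases) / n_repeats


# Hypothetical "model" that uses column 0 and ignores column 1.
def predict(row: list[float]) -> float:
    return 2.0 * row[0]


# Made-up toy data (not study data).
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0], [5.0, 9.0], [6.0, 2.0]]
y = [2.0 * row[0] for row in X]

imp_used = permutation_importance(predict, X, y, col=0)
imp_unused = permutation_importance(predict, X, y, col=1)
print(imp_used, imp_unused)
```

A predictor the model ignores gets an importance of zero, while shuffling a predictor the model relies on raises the error.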
Discretionary revisions
5. Leave-one-out cross-validation (LOOCV) provides optimistic, deflated estimates
of MSE if any of your data rows are duplicates since, with duplicate rows, the
training set contains an exact match for the test row at least twice. You may want
to do a uniqueness test on your data because just one duplicate entry would
likely break the tie between random forest, PCA, and the composite SES index in
Table 4.
We also conducted five- and ten-fold cross-validation, but chose to include only the results from
LOOCV because they were similar to and consistent with those from five-fold and ten-fold cross-validation.
We have added a note under Table 4 with this information.
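The uniqueness test suggested by the reviewer, together with a bare-bones LOOCV, might look like the sketch below. It assumes a single predictor and ordinary least squares. The data and the helpers `duplicate_rows`, `fit_ols`, and `loocv_mse` are illustrative and are not the authors' code.

```python
from collections import Counter


def duplicate_rows(rows: list[tuple]) -> dict[tuple, int]:
    """Rows that occur more than once, with their counts."""
    return {row: c for row, c in Counter(rows).items() if c > 1}


def fit_ols(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Intercept and slope of a one-predictor least-squares fit."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def loocv_mse(xs: list[float], ys: list[float]) -> float:
    """Leave-one-out MSE: refit without row i, then predict row i."""
    squared_errors = []
    for i in range(len(xs)):
        intercept, slope = fit_ols(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared_errors.append((ys[i] - (intercept + slope * xs[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Made-up (ses_score, haz) pairs; the last row duplicates the second.
data = [(0.2, -2.0), (0.4, -1.5), (0.6, -1.1), (0.8, -0.4), (0.4, -1.5)]

dupes = duplicate_rows(data)
xs = [row[0] for row in data]
ys = [row[1] for row in data]
print("duplicates:", dupes)
print("LOOCV MSE:", loocv_mse(xs, ys))
```

Running the duplicate check before cross-validation shows whether any held-out row also appears in its own training set.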
6. On line 393-394 of page 17 the author states that maternal education was more
parsimonious than other methods but was rejected. It would be good if the way
that parsimony was treated in evaluation of models could be explicitly stated
since the parsimony of models is included in Table 4. Is a model with eight terms
eight times as bad as one with one term? A simple way to address this would
be to use the Akaike Information Criterion (AIC) or Bayesian Information Criterion
(BIC), which explicitly include terms for model degrees of freedom.
Thank you very much for this comment; we have amended this section of the manuscript.
Your comment made us think more in depth about this point, and upon revisiting the subject we
realized we were discussing parsimony from the standpoint of the number of variables that were
included in the development of each metric instead of statistical parsimony (i.e., fewer
variables in a regression model). Specifically, all four SES metrics (plus the WAMI index) are
summarized into a single independent variable, so there may be no value in including
AIC/BIC as a comparative metric of model fit (essentially all metrics have the same number of
variables in the model). Moreover, given that the models are not nested with each other, it
made less sense to use likelihood-based approaches for model comparison. This is the main
reason why we chose to use cross-validation approaches (LOOCV) and effect size as methods
to compare the models. Again, thank you very much for bringing up this point and allowing us
to provide further clarification.
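The point about equal numbers of model terms can be illustrated with a short sketch. It assumes least-squares models with Gaussian errors fit to the same n observations, for which AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, and -2 ln L equals n ln(RSS/n) plus a constant. The residual sums of squares, n, and k below are invented for illustration.

```python
import math


def gaussian_aic_bic(rss: float, n: int, k: int) -> tuple[float, float]:
    """AIC and BIC for a Gaussian least-squares fit, up to an additive
    constant shared by all models fit to the same n observations."""
    neg2_loglik = n * math.log(rss / n)
    aic = neg2_loglik + 2 * k
    bic = neg2_loglik + k * math.log(n)
    return aic, bic


# Made-up values: two single-predictor models (intercept + slope, so k = 2).
n, k = 100, 2
aic_a, bic_a = gaussian_aic_bic(rss=120.0, n=n, k=k)
aic_b, bic_b = gaussian_aic_bic(rss=125.0, n=n, k=k)

# With equal k the penalty terms cancel, so both criteria rank the models
# exactly as the residual sum of squares does.
print(aic_a - aic_b, bic_a - bic_b)
```

When every candidate metric enters the regression as one term, the AIC and BIC differences reduce to the difference in fit alone, which is the argument made in the response above.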
Reviewer's report
Title: Measuring socioeconomic status in multi-country studies: Results from the
eight-country MAL-ED study
Version: 1
Date: 22 August 2013
Reviewer: Abraham Flaxman
Reviewer's report:
This paper takes on an important topic, how to measure socioeconomic (SE) status
cross-culturally. The authors have collected a new dataset consisting of information on
800 children aged 24-60 months in 8 distinct sites. Although not explicitly stated, it
seems that the authors are also particularly interested in developing an SE index that is
easily implemented in a household survey and easily calculated from the survey results,
and their proposed solution, the WAMI index, achieves this goal.
The paper uses a sophisticated machine learning technique to select household assets
for use in a wealth index, and uses the components of the SE index independently and
collectively to predict levels of height-for-age Z score (HAZ). However, the paper has
some major shortcomings that prevent me from recommending it for publication at this
time. It does not satisfactorily define SE status, and it does not describe in sufficient
detail how the novel random forest approach works. Furthermore, validating the SE
index by its ability to predict low HAZ seems circular.
Major Compulsory Revisions
1. Please include a detailed definition of socioeconomic status and
contextualize it among existing, competing definitions and
operationalizations. It seems that there is no settled conceptual definition of SE
status [1], and even calling it “status” is not universal, with some authors
considering “position” more precise [2,3]. This makes it essential for your work to
clearly state the definition you are using.
Thank you for this comment. You are right that there has been much discussion in the literature
about the definition of socio-economic status vs. socio-economic position, as well as the
appropriate measurement of both. We have defined socio-economic status in lines 166-168 in
accordance with Adler and colleagues’ definition (1994), which is also the definition used by the
U.S. Centers for Disease Control: “a composite measure that typically incorporates economic,
social, and work status. Economic status is measured by income. Social status is measured by
education, and work status is measured by occupation.”
Krieger and colleagues’ main argument in favor of using the term socio-economic position,
rather than socio-economic status, is that the latter, “blurs distinctions between two different
aspects of socioeconomic position: (a) actual resources, and (b) status, meaning prestige- or
rank-related characteristics.” They explain that indicators in the former category tend to be
dichotomous (e.g. owning a chair), whereas prestige or rank-related characteristics are typically
continuous, e.g. level of access to services. Krieger and colleagues’ definition of socio-economic
position explicitly includes indicators in the latter category, which are not included in
our measure. We have added this clarification as a footnote in the text in line 169, as well as
adding the lack of prestige or rank-related characteristics more explicitly as a limitation in lines
465-466.
2. Complete description of the random forest method used. A satisfactory revision
will at least address the following questions, of which I was unsure of the
answers in the current version: Why is it necessary to use the subset of
indicators selected for PCA (line 282) and what happens if you do not? Did you
use “supervised learning” specifically to predict HAZ (line 283-4) or
“unsupervised learning” (which is more commonly used to cluster instances, but
can also be used to identify important covariates)? If you used supervised
learning with RF to predict HAZ and compared this to methods that were not
designed specifically to predict HAZ, it is not surprising that RF performed better
(see point 4 below). Why did you choose 8 indicators (line 284-5), instead of
more or less?
Thank you for this comment. We have provided additional information on the random forests
method used, as requested. We made a decision to use the same initial set of indicators that
was used in PCA so that the results would be comparable. If we used a smaller subset of
indicators to start, we would not know whether a difference in predictive power was due to the
choice of indicators or the method used. We would not have used a larger set of indicators
because the initial set used for PCA was selected based on assessments of variation and
internal consistency.
We describe our reason for choosing 8 indicators in the results section (currently lines 344-347).
Since we found no accepted approach to this in the literature, we borrowed from the methods
used in PCA and created a scree plot using variable importance. We selected eight variables
based on the results of the scree plot (not shown), because the magnitude of change in variable
importance between items dropped off notably after eight variables.
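One way to make a scree-style cutoff on variable importance reproducible is sketched below. This is an assumed rule, not the authors' procedure: it keeps every variable up to the last "notable" drop between consecutive ranked importances. The importance values and the `min_drop` threshold are invented, and were chosen only so that the toy example has an elbow after eight variables.

```python
def elbow_cutoff(importances: list[float], min_drop: float) -> int:
    """Number of top-ranked variables to keep: everything up to the last
    consecutive drop in importance that is at least min_drop."""
    ranked = sorted(importances, reverse=True)
    keep = len(ranked)  # if no drop qualifies, there is no elbow: keep all
    for i in range(len(ranked) - 1):
        if ranked[i] - ranked[i + 1] >= min_drop:
            keep = i + 1
    return keep


# Made-up importance scores for 12 hypothetical indicators (not study results).
importances = [0.30, 0.26, 0.22, 0.19, 0.16, 0.13,
               0.10, 0.08, 0.02, 0.018, 0.017, 0.015]

n_keep = elbow_cutoff(importances, min_drop=0.01)
print(n_keep)
```

Stating the threshold alongside the plot would let readers see how sensitive the number of retained variables is to that choice.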
We used unsupervised learning with conditional random forests to predict HAZ, and have added
this to the text.
3. Explicitly state goal of developing a simple index (implicit in the parsimony criteria
on line 301-303, perhaps also in “easily collected across diverse settings”
observation on line 403-404, and table 5), and give necessary criteria for any
solution to be considered simple (or easy or parsimonious or whatever you
decide to call it).
We have added the goal of developing a simple index explicitly in several places in the text,
including in lines 270-272, and 415-416. We also added a note under Table 4 explaining more
clearly how parsimony was assessed and compared.
4. Validation of wealth or SE index by predicting HAZ seems invalid. For example,
selecting RF over MPI as the preferred measure of wealth based on the MSE of
1.37 vs 1.39 for predicting HAZ says that RF is a better proxy for HAZ, not
necessarily for wealth (that these MSE values are similarly large is also
concerning, but that is another issue). Implicit in the validation approach is the
assumption that a better predictor for wealth is a better predictor for HAZ, which I
find implausible. At the very least, this should be clearly stated as a limitation of
the study.
The question of why we assessed associations between SES and HAZ is an important one,
which we discussed at length while working on this paper. First, the association with HAZ is not
the only way we assessed the suitability of this SES measure, or even the primary way. The
development of the four possible measures of wealth, and then the full measure of SES, was
based on an understanding of underlying theory (e.g. SES represents wealth/income,
education, occupation), as well as grounded in the extensive literature exploring approaches to
measuring wealth and SES, such as Filmer & Pritchett’s 2001 paper in Demography. In our
view, this strong grounding is sufficient justification for the approaches to measuring SES that
we compare, with the exception of mother’s education, which is included because it has been
used (arguably incorrectly) in previous research as a simple proxy.
We chose to take these analyses of suitability a step further for two reasons: 1) we were
interested in comparing the four measures of wealth directly, and 2) analyzing associations of a
construct of interest with other constructs that are believed to be related theoretically or
empirically is an accepted approach to assessing construct validity (Cronbach & Meehl 1955).
Although we expected HAZ and SES to be statistically associated, we do not assume that they
are equivalent. We have added this explanation more explicitly starting in line 263, as well as
including this as a limitation in lines 450-453.
As for the MSE values for RF and PCA, we agree that they are very close, and did not use the
difference between these values as one of our criteria for selecting random forests. Rather, as
described in the “choice of wealth measure” section starting in line 375, we considered those
values to be equivalent, and chose RF because of the parsimony and simplicity of application,
once the important variables are selected.
Minor Essential Revisions
5. State all household asset questions from which the 16 used in PCA were chosen,
as well as the reason for excluding any questions outside these 16.
We have added additional detail in the “household wealth measurement” section (starting in line
346) on how the 16 assets were chosen, including a full list of the assets.
6. I am not familiar with using maternal education as a proxy for household wealth
(line 268-9), please include references.
Below are two references that discuss the use of mother’s education as a proxy for SES. We
have added these references to the text. The Fotso et al. (2005) paper you mentioned
previously also references this practice.
Monteiro, CA; Conde, WL; Popkin, BM. Obesity and inequities in health in the
developing world. International Journal of Obesity (2004) 28, 1181-1186.
Desai, S. & Alva, S. Maternal education and child health: Is there a strong causal
relationship? Demography (1998) 35, 71-81.
Discretionary Revisions
7. Food insecurity questions (line 228) seem highly relevant, and perhaps should
not be excluded (line 278-9).
We previously assessed associations between food access insecurity and anthropometry in the
study population. In analyses that were published in PHM in December 2012 (citation below),
we found that, while food access insecurity and low SES are related in the study population,
food access insecurity also appears to capture a separate area of vulnerability that is (at least in
part) independent of SES. We found similar results when using a PCA-based measure of SES
and a simpler sum score of household assets. Our final measure of SES in the current analyses
was also significantly associated with food access insecurity, as expected given the
hypothesized nomological network (Cronbach & Meehl 1955).
Stephanie Psaki; Zulfiqar Bhutta; Tahmeed Ahmed; A.M. Shamsir Ahmed; Pascal Bessong;
Munirul Islam; Sushil John; Margaret Kosek; Cebisa Nesamvuni; Prakash Shrestha; Erling
Svensen; Stephanie Richard; Jessica Seidman; Laura Caulfield; Aldo A Lima; Mark Miller;
William Checkley; MALED Network Investigators. Household food insecurity and child
malnutrition: Results from the eight-country MAL-ED study. Population Health Metrics. 2012 Dec
13; 10(1): 24.
[Cronbach, LJ & Paul E. Meehl. Construct Validity in Psychological Tests. Psychological Bulletin.
(1955). 52: 281-302.]
8. Were translated questionnaires back-translated for quality assurance? If so, say
so (line 248).
Yes, questionnaires were back-translated for quality assurance. We have added that
clarification in line 249.
9. Supplementary table 1, sort total high to low, instead of alphabetically
Thank you for this suggestion. We have made that change, and now country sites are sorted
from highest SES score (Brazil) to lowest (Tanzania) from left to right. We have included a note
under the table with that clarification.
10. Figure 1, sort high to low, instead of alphabetically
We have made this change, and sorted country sites accordingly in Figure 1.