What Determines Learning among Indigenous and Non-Indigenous
Children in Peru? An Analysis of the First Three Waves of the Young
Lives Survey
PRELIMINARY AND INCOMPLETE
PLEASE DO NOT CITE OR CIRCULATE WITHOUT PERMISSION
Irma A. Arteaga1
Abstract
In Peru, indigenous children obtain lower test scores, on average, than non-indigenous children. In this study we investigate whether there is a test score achievement
gap prior to starting school (age 5) and whether this gap increases or not by age 8. We are
particularly interested in decomposing the ethnic gap into its underlying determinants.
Specifically, we examine four pathways: the effect of family characteristics and behavior,
child nutrition, neighborhood, and school on child development. A major problem in
estimating the effect of child health on child cognitive skills is that unobserved factors
may affect child health and child cognitive skills simultaneously. This means that the
effect of child health on child skills may not be correctly identified. To date, studies on
the relationship between child health and child cognitive skills have used instrumental
variables methods to correct for potential bias. In this paper, rainfall around the time of
birth, district’s health infrastructure, and exogenous changes in food availability when a
child was a newborn are used as instruments. Using the 2000/01 cohort of the Young
Lives Longitudinal Survey for Peru, our results suggest that the indigenous – non-indigenous
gap increases over time. Also, child nutrition has a positive effect on
vocabulary and math test scores. Moreover, the estimated effect on both vocabulary and math is
larger when child nutrition is instrumented than when instrumental
variables are not used. Equalizing family characteristics and behavior, child
nutrition, neighborhood, and school factors to the average level of Spanish-speaking
children would potentially reduce the indigenous – non-indigenous test score gap by up to
76%. The pathways considered yield valuable information for policy makers who are
interested in targeting policies. These are important for human capital formation and
could potentially have long lasting impacts on educational attainment and poverty in
Peru.
JEL classification: I20; J15; H75; O15
Keywords: Cognitive skills; test score; child development; indigenous – non-indigenous
gap; discrimination
1
Paper prepared for presentation at the 2012 Annual Meeting of the Association for Education Finance
and Policy. Please do not cite without permission. Arteaga is an Assistant Professor at the University of
Missouri. Corresponding author: Irma Arteaga, 112 Middlebush Hall, Truman School of Public Affairs,
University of Missouri, MO 65211. Email: arteagai@missouri.edu.
1. Introduction
The gap in test scores between indigenous and non-indigenous children in Latin
America is of increasing concern. The country with the highest indigenous population in
Latin America is Peru, where 8.5 million people speak 67 different indigenous languages.
Indigenous people in Peru account for approximately 30% of the total population, and
Quechua is the most widely spoken indigenous language (Solis, 2009). Indigenous people
have some of the lowest overall test scores in both reading and math (World Bank, 2009).
This problem is compounded by the large, persistent test score gaps between rural and
urban students as well as a large gap between indigenous and non-indigenous students
(Sakellariou, 2008). Prior research in Latin America has shown that these gaps persist
even after controlling for socioeconomic status indicators (Hernandez-Zavala et al., 2006;
Marshall, 2009; McEwan, 2007; and McEwan and Trowbridge, 2007).
The sources of the disparities between indigenous and non-indigenous children
have not been fully identified. In some studies, more than 50% of the achievement gap is
left unexplained. Prior research has found that family and parental influences are
important factors. The effects of schools have also been examined, although results are
mixed. Data from Chile suggest that schools explain a large portion of the gap (McEwan,
2007) but data from Mexico and Guatemala show that school effects are not as important
as family factors, and the combination of family and school effects does not satisfactorily
explain the gap (Hernandez-Zavala et al., 2006). Although parental education and
attitudes, family health, and school quality have been suggested along with the usual
determinants of educational outcomes, there are no studies on the ethnic achievement gap
that adequately address all of these dimensions. Specifically, child health is absent from
these studies even though it is theoretically crucial to explaining the achievement
gap (Glewwe & Miguel, 2008; Marshall, 2009). Another problem that we address is that
school and neighborhood effects are often treated as the same effect. Additionally, the
effects of parental preferences and attitudes have not been examined separately from
demographics and socioeconomic status due to data limitations. Prior research has been
limited to an analysis of parents’ roles based solely on education and ability.
The importance of family influences on cognitive development is well established
in the literature. Basic socioeconomic factors along with genetic endowments are
accepted components of the production function for cognitive development. The role of
parental behavior, however, has not been fully explored in the research from Latin
American countries. In the United States, James-Burdumy (2005) finds that birth order,
number of children, mother’s age at birth, parents’ education, and the presence of a
grandmother in the household all have significant effects on cognitive development.
These variables represent parental attitudes towards child rearing. A study of the test
score gap between black and white students in the United States finds that differences in
“household” inputs explain 10-20% of the gap and that differences in mothers’ abilities
explain as much as 50% of the test score gap (Todd and Wolpin, 2007). This result raises
some questions as to what the important household inputs are.
School and teacher quality have also been examined as critical elements. Cook
and Evans (2000) examine the importance of these factors on the decrease in the test
score achievement gap between black and white students in the United States between the
1970s and 2000. Using the Oaxaca-Blinder decomposition technique and data from the
National Assessment of Educational Progress, they found that, together, convergence in
family factors and school quality explain only 25% of the convergence in test scores.
They conclude that more detailed variables for parental education and school quality
would improve the model but that there still are missing elements such as health and
parenting attitudes. Further investigation using data from the National Institute of Child
Health and Human Development Study of Early Child Care and Youth Development
(NICHD) shows that, among third graders, the quality of parenting has a significant
effect on learning (Murnane et al., 2006). Results show that a rough measure, “maternal
sensitivity,” explains almost one-third of the gap in test scores between black and white
students.
There is a great deal of variation in the estimates for the Latin American countries
regarding the relative importance of family, school, and parenting quality. An analysis of
data on test scores in Mexico and Guatemala shows that the test score gap between
indigenous and non-indigenous students cannot be fully explained by family and school
factors; only 41-75% of the gap is explained by these variables (Hernandez-Zavala et al.,
2006). Results from an analysis of the National Census for Student Achievement
(SIMCE) in Chile show that school effects have a large impact on the test score gap
between indigenous and non-indigenous students (McEwan, 2007). The study, which
focused on Mathematics and Spanish scores for 8-year-olds in 1997 and 2001, used a
differences-in-differences estimation method and a modified Oaxaca-Blinder
decomposition. The results showed that a family’s socioeconomic status had only a small
effect on the child’s cognitive development, whereas school factors explained a significant
portion of the achievement gap between indigenous and non-indigenous children.
In addition to family, schools and parenting, health has been shown to have a
large influence on a child’s cognitive development. Research shows that a well-nourished
child is more able to pay attention in class, focus, concentrate and get better grades
(Currie, 2009). Moreover, in a developing country such as Peru, a child who does not
receive proper nutrition may develop lasting deficiencies or stunting that would be
difficult or impossible to remedy (Alderman et al., 2006). This is one of the main reasons
why, in contrast with the U.S. literature, research on developing countries has placed
higher emphasis on child health as a determinant of cognitive development and little
emphasis on parenting practices and home environment.
However, child health has not been used to explain the test score gap between
indigenous and non-indigenous students. One reason health and nutrition have been
excluded from previous analysis is that data limitations make it difficult to measure
nutrition and maternal health during the beginning of a child’s life, when nutrition is
thought to have the greatest influence on cognitive development. There are few data sets
currently available that include information on early nutrition and maternal health as well
as test scores that show cognitive development. Additionally, it is challenging to
measure health accurately since it encompasses a broad range of factors. Moreover,
when child health is included, most of the literature ignores that child health and child
cognitive skills both reflect household decisions regarding investments in children’s
human capital, and therefore child health cannot be treated as an exogenous variable.
There are only a few studies from developing countries that control for behaviors that
determine child health when estimating the impact of child health on child cognitive
skills (Alderman et al., 2001; Glewwe, Jacoby & King, 2001; Behrman & Levy, 1998).
These studies on Pakistan and the Philippines suggest that food availability, maternal
health, and environmental health factors are ideal instruments for child health. Moreover,
these findings indicate that child health is crucial to cognitive development and that the
effect of child health on child cognitive skills is three to seven times greater than when
child health is assumed to be an exogenous variable.
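To illustrate why instrumenting child health can raise the estimated effect, the following is a minimal simulated sketch, not the estimation carried out in this paper. All variable names, coefficients, and the data-generating process are assumptions made for illustration: an unobserved household factor raises measured health but lowers test scores, so OLS understates the true effect, while a single hypothetical instrument (standing in for something like rainfall around the time of birth) recovers it through two-stage least squares.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Hypothetical data-generating process (all numbers are illustrative assumptions).
z = rng.normal(size=n)                      # instrument: affects health only
u = rng.normal(size=n)                      # unobserved household factor
health = 0.8 * z + 0.6 * u + rng.normal(size=n)
score = 1.0 * health - 1.0 * u + rng.normal(size=n)   # true effect of health = 1.0


def ols(X, y):
    """Least-squares coefficients for y = X b + e (X includes a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)
X = np.column_stack([const, health])        # regressors, health treated as exogenous
Z = np.column_stack([const, z])             # instruments

# OLS ignores the correlation between health and the unobserved factor.
beta_ols = ols(X, score)

# 2SLS: first stage predicts health from the instrument,
# second stage regresses the score on predicted health.
health_hat = Z @ ols(Z, health)
X_hat = np.column_stack([const, health_hat])
beta_2sls = ols(X_hat, score)

print(f"OLS estimate of health effect:  {beta_ols[1]:.3f}")
print(f"2SLS estimate of health effect: {beta_2sls[1]:.3f}")
```

The second-stage coefficient is the 2SLS point estimate, but standard errors taken from a naive second-stage regression are not valid; dedicated instrumental-variables routines correct them.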
The analysis in this essay contributes to the existing literature in a number of
ways. First, this is one of the first studies that explicitly examines the achievement gap
between Quechua and Spanish speakers. School performance and relevant information to
model its determinants are not widely available for Peru’s Quechua population,
despite their being the largest indigenous group in Peru and Quechua being the second
official language. Furthermore, even when such data are gathered, cultural factors and
differences in modeling behavior in rural areas pose additional difficulties. For purposes
of this analysis, an unusually rich dataset containing information on child, caregiver,
family, and community is used, which identifies ethnicity for each member of the child’s
nuclear family. Secondly, this may be the first study on a developing country that
considers parenting behavior as one of the components explaining the achievement gap,
even though these factors have proven to be important determinants of children’s
cognitive skills (James-Burdumy, 2005; Kimmel & Connelly, 2007; Paxson & Schady,
2007). Thirdly, this paper expands the existing body of literature by incorporating several
previously excluded variables: mother and child health, parenting behavior, and separate
neighborhood and school characteristics. We use the Young Lives Study to measure the
effects of parents’ education, family factors, home environment, child health,
neighborhood and school. The data set provides longitudinal data for 1,681 Peruvian
children who were 1 year old in 2002, 5 years old in 2006, and 8 years old in 2009. It
includes information on the factors listed above, test scores from the Spanish
version of the Peabody Picture Vocabulary Test, and information on the child’s
ethnic and linguistic background.
The remainder of this paper is organized as follows. Section 2 provides
background on the indigenous people of Peru. Section 3 describes the conceptual and
empirical framework. Section 4 briefly describes the Peruvian setting, the sample, and the
variables used in the analysis. Section 5 studies the effects of child health, family
characteristics and behavior, and school and neighborhood characteristics on the cognitive skills
of children, as well as the sources of the indigenous – non-indigenous gap in test scores.
Section 6 concludes.
2. Background on Peru
Peru has succeeded in providing broad access to preprimary and primary school:
the net enrollment rates for preschool and primary school, respectively, were 64.4% and
96.6% in 2008, according to the Peruvian Ministry of Education. Improving educational
outcomes for those who are in school, however, remains a challenge. Along these lines,
more than half of Peruvian children tested into the lowest of 5 categories of the PISA
20092 tests in reading (OECD, 2010). Improving investments in the first years of life may
be an important strategy to improve learning outcomes in Peru and other developing
countries.
2
PISA is the acronym for the Programme for International Student Assessment. It is run by the
Organisation for Economic Co-operation and Development (OECD). Peru participated in PISA 2000 and
2009. In 2009, 65 countries participated and Peru ranked 62 in reading and 60 in mathematics.
Peru is a lower-middle income country, where the GDP per capita in 2010 was
only $5,205, 11 percent that of the United States (IMF, 2010). An estimated 31 percent of
people live below the official poverty line (INEI, 2011). An estimated 8 percent of the
population lives on less than a dollar per person per day, and 19 percent live on less than
two dollars per day (United Nations, 2010). Overall, education and health indicators for
Peru are roughly on par with those of other Latin American countries. Net primary
enrollment in Peru for males and females is 96% and 97%, respectively, whereas the
Latin American regional averages are 94% and 95%, respectively. Net secondary enrollment
in Peru is 75% for both males and females, whereas the Latin American regional averages are
69% and 74%, respectively. Infant and child mortality are lower in Peru than in the rest
of the Latin American region; the infant mortality rate per 1,000 live births is 17 for Peru,
compared to an average of 22 for Latin America and the child mortality rate is 20 for
Peru and 26 for Latin America. However, Peru has worse indicators for some measures of
child nutrition. Thirty percent of children younger than five have low height-for-age
(stunting), a figure that is far above the Latin American average of 16 percent (UNICEF,
2009). Moreover, one in ten children has low birth weight, whereas the Latin American
average is nine percent.
Indicators of child health status are even worse for indigenous children in Peru.
But what do we mean by indigenous people? According to Vasquez, Chumpitaz and
Jara (2008), there are three main definitions. The first refers to mother tongue.
According to these authors, mother tongue is the most widely used definition in Latin
America. Bolivia, Ecuador, Colombia, Honduras, Mexico, Paraguay, and Peru, among
others, use this definition. This criterion is considered “objective” because it allows
culture to be transmitted from one generation to the next. When a group loses its own
language, it also loses a fundamental element of the socialization process and therefore
begins its extinction as an ethnic group (Vasquez, Chumpitaz, and Jara, 2008). The
second definition of indigenous refers to self-identification and is referred to as a
“subjective” measure. This definition is based on an individual’s own perception of
belonging to a certain ethnic group and his or her self-identification with its ancestors and
customs. This criterion can be influenced by an individual’s pride in belonging to a
certain group or by a feeling of exclusion, which is why this definition is more
subjective. Finally, the last definition is based on the recognition of an indigenous
community through its territory, race, religion, or clothing. Guatemala and Paraguay, for
example, use type of shoes and indigenous clothing as complementary criteria to
identify native groups.
In this paper we use the objective definition of language learned at home.
Traditional measures in Peru show that around 30 percent of the population is
indigenous; these traditional measures use the first definition and complement their
findings with the other measures of self-identification, territory and clothing. The Young
Lives Survey shows that when mothers are asked about their language and when they are
asked about their ethnicity, 30 percent identify themselves as indigenous. This is similar
to the traditional measure. However, when the survey asks their children about the
language they speak, only 13 percent responded that they speak an indigenous language.
This result is similar to the Peruvian National Census from 2007, where 15 percent of the
population five years and older responded that they learnt an indigenous language at
home. The children in the Young Lives Survey were born in 2001, which means that the
younger the cohort, the lower the percentage of individuals who are learning an
indigenous language. This consistency with the national census suggests that
our results are generalizable.
According to the Census of 2007, 83 percent of the indigenous population is
Quechua, 11 percent is Aymara, 2 percent is Ashaninka, and 4 percent belong to other
indigenous groups. The Peruvian National Institute of Statistics reports that the
enrollment rate for indigenous children 6 to 16 years of age is 88 percent whereas for
Spanish-speaking children it is 96 percent. Moreover, 29 percent of indigenous children
who are enrolled in school do not attend, and 73 percent are one or more grade
levels below the grade level appropriate to their age.
Because the Political Constitution of Peru guarantees education for all, and due to
the large percentage of indigenous population in Peru and their vulnerability, the Ministry
of Education initiated a National Policy of Bilingual Education in the early 1970s.
During that time, the government gathered information about appropriate techniques to
be used to address education for the indigenous population. Since then, some advances
have been made, but different governments have had different visions of bilingual
education, and the National Division of Intercultural and Bilingual Education still does
not have the capacity to hire teachers or to approve a bilingual curriculum for children or
for future teachers. One of the most important policies that has been implemented is the
creation of intercultural bilingual schools. A bilingual school is defined as a school that
has at least one teacher who has been trained in intercultural and bilingual education.
According to the Peruvian Ministry of Education, only 35% of the rural bilingual
population attended an intercultural bilingual school in 2008.
3. Conceptual Framework
A standard procedure used to analyze the determinants of the child’s cognitive
achievement gap is to fit equations between cognitive achievement and observed
characteristics. Given there are two groups, indigenous (I) and non-indigenous (NI), let
the outcome variable be Dcog, and denote the explanatory variables by X. The question is
how much of the mean outcome difference is accounted for by group differences in
predictors.
\[ \text{Gap} = E\left(D_{cog}^{I}\right) - E\left(D_{cog}^{NI}\right) \tag{1} \]
where E (Dcog) denotes the expected value of the outcome variable, child’s cognitive
achievement (e.g. test scores).
Consider the following linear model:
\[ Y_{l} = X_{l}\beta_{l} + \varepsilon_{l}, \quad \text{assuming } E\left(\varepsilon_{l} \mid X_{l}\right) = 0, \quad l \in \{I, NI\} \tag{2} \]
where ε denotes unobserved determinants of Y and random noise in the Y variable. This
setup allows one to write the gap between indigenous and non-indigenous children as
follows:
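\[ \text{Gap} = E\left(D_{cog}^{I}\right) - E\left(D_{cog}^{NI}\right) = E\left(X_{I}\right)\beta_{I} - E\left(X_{NI}\right)\beta_{NI} \tag{3} \]
because:
\[ E\left(Y_{l}\right) = E\left(X_{l}\beta_{l} + \varepsilon_{l}\right) = E\left(X_{l}\beta_{l}\right) + E\left(\varepsilon_{l}\right) = E\left(X_{l}\right)\beta_{l} \]
with E(βl)=βl, and E(εl)=0 by assumption.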
To identify the contribution of group differences in the explanatory variables and of
differences in their impacts, equation (3) can be rearranged so that the observed cognitive
achievement differential can be decomposed as follows:3
\[ \text{Gap} = E\left(D_{cog}^{I}\right) - E\left(D_{cog}^{NI}\right) = \left(\bar{X}_{I} - \bar{X}_{NI}\right)\beta_{I} + \bar{X}_{NI}\left(\beta_{I} - \beta_{NI}\right) \tag{4} \]
where \(\bar{X} = E(X)\). A child is identified as indigenous if the child speaks an indigenous
language. Robustness checks that use different definitions of indigenous, such as
the mother’s tongue, maternal ethnicity, or the child’s ethnicity being
indigenous, showed similar results.
This is known as the Oaxaca-Blinder decomposition (Blinder, 1973; Oaxaca,
1973). The overall achievement gap is decomposed into two components. The first is the
portion attributed to differences in mean characteristics, \(\bar{X}_{I} - \bar{X}_{NI}\), evaluated at the
coefficients of indigenous children, βI. This portion is known in the literature as the
endowment effect, because it accounts for the contribution of differences in observed
characteristics between the two groups, weighted by βI. Thus, this portion of the gap
answers the following question: if the characteristics of both groups were valued at the
indigenous coefficients (βI), what would be the difference
between indigenous and non-indigenous children? The second component is attributable
to differences in performance (βI - βNI) across indigenous and non-indigenous children
for the same characteristics (\(\bar{X}_{NI}\)). This portion of the gap is known in the literature as the
coefficient effect, because it accounts for the contribution of differences in the
coefficients. This portion of the gap answers the following question: if
indigenous children had the characteristics of non-indigenous children, what would be the
difference between indigenous and non-indigenous children?
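The decomposition in equation (4) can be sketched numerically as follows. This is an illustrative example on simulated data, not the specification estimated in this paper: the single regressor, the sample sizes, and the coefficient values are all assumptions. Because each group regression includes a constant, the fitted mean equals the sample mean of the outcome, so the endowment and coefficient effects add up to the raw gap.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y = X b + e (X includes a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def oaxaca_blinder(X_i, y_i, X_ni, y_ni):
    """Twofold decomposition as in equation (4):
    gap = (Xbar_I - Xbar_NI) b_I + Xbar_NI (b_I - b_NI)."""
    b_i, b_ni = ols(X_i, y_i), ols(X_ni, y_ni)
    xbar_i, xbar_ni = X_i.mean(axis=0), X_ni.mean(axis=0)
    endowment = (xbar_i - xbar_ni) @ b_i      # differences in mean characteristics
    coefficient = xbar_ni @ (b_i - b_ni)      # differences in coefficients
    gap = y_i.mean() - y_ni.mean()            # raw mean outcome gap
    return gap, endowment, coefficient


rng = np.random.default_rng(0)


def simulate(n, x_mean, beta):
    """Hypothetical group: one characteristic plus a constant (illustrative only)."""
    x = rng.normal(x_mean, 1.0, size=n)
    X = np.column_stack([np.ones(n), x])
    y = X @ beta + rng.normal(0.0, 1.0, size=n)
    return X, y


# Assumed values: the indigenous group has a lower mean characteristic
# and lower coefficients than the non-indigenous group.
X_i, y_i = simulate(500, 0.0, np.array([80.0, 3.0]))
X_ni, y_ni = simulate(1500, 1.0, np.array([85.0, 4.0]))

gap, endowment, coefficient = oaxaca_blinder(X_i, y_i, X_ni, y_ni)
print(f"gap={gap:.3f}  endowment={endowment:.3f}  coefficient={coefficient:.3f}")
```

Swapping the reference coefficients, as in the alternative decomposition of footnote 3, generally gives different shares for the two effects, which is why the choice of reference group should be reported.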
4. Sample, Data, and Measures
3
A second way to do the decomposition is: \(\text{Gap} = \left(\bar{X}_{I} - \bar{X}_{NI}\right)\beta_{NI} + \bar{X}_{I}\left(\beta_{I} - \beta_{NI}\right)\).
The data used in this paper include the first three waves of the Peruvian Young
Lives Survey, which covers about 2,000 children who were around one year old in the first
wave, randomly selected in 20 districts
that are part of 18 provinces in Peru4. In addition, district data were gathered from
FONCODES (Fondo Nacional de Compensacion para el Desarrollo; Peruvian National
Fund for Development and Social Compensation). This section briefly describes some
important features of the Peruvian data, discusses measures of cognitive development,
child health, family characteristics, home environment and school characteristics.
4.1. Data
The sample in this paper consists of urban and rural children from the Young
Lives Survey (YLS) panel data for Peru. Young Lives is a longitudinal research project
that collects data for investigating the changing nature of childhood poverty and tracks
the development of 12,000 children in Ethiopia, Vietnam, India and Peru through
studying and following two cohorts of children over a 15-year period. The younger
cohort consists of about 2,000 children per country, aged between 6 and 18 months in
2002. The older cohort consists of 1,000 children per country aged between 7.5 and 8.5
years in 2002. The present study only uses data from Peru. In Peru, Young Lives used
multi-stage, stratified, random sampling to select the two cohorts of children. The panel
dataset is composed of three waves of data, collected in 2002, 2006, and 2009. This study
uses the panel data set for the younger cohort, children who were about 1 year old in
2002, because it includes data on parenting practices just after birth, and also two waves
of data that account for cognitive development. Previous research has shown that
4
A district was randomly chosen within each of the 18 provinces. Only in the province of Lima were three
districts selected. The reason is that almost one-third of the population in Peru lives in Lima.
cognitive ability is a cumulative process (Todd & Wolpin, 2007). This survey provides
information on household characteristics, detailed caregiver records, child characteristics,
as well as results of the Spanish version of the Peabody Picture Vocabulary Test (PPVT),
a test of vocabulary recognition that has been widely used in Latin America as a general
measure of cognitive development (Bernal, 2008; James-Burdumy, 2005; Schady,
2006). Results from the PPVT test constitute a key variable of interest in this study and
serve as the primary measure of cognitive ability. This panel data set contains 1,784
observations from the whole country: 35% of the children live on the coast, 50% in the
mountains, and 15% in the cloud forest/jungle.
Given that the YLS is a longitudinal study, it is important to consider the
potential for attrition bias. Such bias may occur if attrition is non-random and if
characteristics correlated with attrition are also correlated with the outcomes of interest.
Outes-Leon and Dercon (2008) analyzed the first and second waves of data for the four YLS
countries. They found that the attrition rate (3.5%, excluding 6 deaths) for the
Peru sample is low compared to other longitudinal studies in developing countries.
Furthermore, the authors use attrition probits and BGLW5 tests for attrition bias and
conclude that attrition is overwhelmingly random and unlikely to lead to significant
biases.
A sample of 1,681 children was used for the analyses in this paper. First, this
article selected the complete panel of children from the younger cohort, a total of 1,959
children. Second, 103 children were excluded because of missing information on the
control variables used in the analyses. In most cases, missing values were due to missing
5
BGLW stands for the Becketti, Gould, Lillard and Welch (1988) test for selection on observables.
paternal characteristics (either the father was not present in the household at the time of
the interview or the father refused to answer those questions). Third, 278 children were
excluded because of missing information on the dependent variables used in the analysis:
223 were not reached in the second wave, and an additional 55 were not reached in the
third wave of data collection. A comparison of the mean characteristics of the full sample
of children and the subsample of children used in the analysis, presented in Appendix
Table A1, suggests that differences between the two groups are very small and statistically
insignificant.
4.2. Measures
Descriptive statistics are shown in Appendix Table A1. The two outcome measures for
the analysis of this paper are a child’s score on the quantitative subset of questions of the
quantitative/mathematics test and a child’s score on the PPVT-III test. Both tests were
conducted in rounds 2 and 3. The former test was developed by the International
Evaluation Association. For round 2, the test used was the CDA-Q (cognitive
development assessment – quantitative), which is specially developed for children 4.5-5.5
years of age. It consists of 15 items, and for each item the child is shown a picture and
asked a question, and she is asked to choose the best answer from three or four choices.
An example is the following: “Look at the plates of cupcakes; point to the plate that has
the fewest cupcakes.” The child scores 1 point for each correct answer and 0 points
for incorrect or incomplete answers. The raw score is calculated as the number of correct
answers. For round 3, the mathematical achievement test measures various numerical
abilities appropriate for children aged 8, and it has two parts. The first part consists of 9
questions, where the fieldworker showed the child a series of cards and asked him/her
to answer some questions orally. Examples of these questions are: “Please put your finger
on number 21” and “Which number should be in the blank space?” In the second part the
child was tested on math computing. The fieldworker gave the child a test booklet and
explained the instructions found in there to him/her. The child had 12 minutes to answer
20 basic arithmetic questions.
The PPVT is a test of receptive vocabulary that is frequently used to evaluate
Hispanic children and adults (Paxson & Schady, 2007; Umbel et al., 1992). During the
test, children were asked to select from among four pictures the one that best represented
the meaning of a word presented to them orally by the interviewer. This test is composed
of 125 questions arranged in 10 groups. Each group corresponds to items that children of
a certain age should be proficient in. Thus, the first questions are targeted to 3-4 year-old children, the next group to 5 year-old children, and so on. Finally, the last group targets
14 year-old children; later pictures are gradually more difficult. It is not expected that
children who are 8 years old answer the 125 questions correctly, because they would be
shown pictures that would represent words that are completely unfamiliar to them.
However, some children will be able to recognize words that are far beyond the
expectations for their age. Test items have been selected for their universality and
appropriateness to Hispanic communities and have been widely used in Latin American
countries and with Spanish-speaking children in the United States (Paxson & Schady,
2007; Rosenzweig & Wolpin, 1994). In the case of the Young Lives survey, the test was
available in Spanish, Quechua and Aymara, the most frequently spoken languages in the
country (Spanish and Quechua are both official languages in Peru).
We examine how performance on the PPVT differs across children of different
ages, which requires the use of age-normed scores. The PPVT has not been normed for
samples of Peruvian children. We standardized the raw scores on the test using the norms
published by the developers of the Peabody test, which are based on samples of Mexican
and Puerto Rican children, but apply to Hispanic populations. These norms are set so that
the mean should be 100 and the standard deviation 15 at each age. It is important to
note that the PPVT has a minimum raw score that can be normed for children of each
age.
For children over 132 months of age (11 years old), the minimum raw score
always corresponds to a normed score of 55. The existence of these minimum scores is a
challenge for our analysis. A fraction of children in our sample—8.3 percent overall—do
not attain the minimum raw score that can be normed. However, in contrast to other
studies (Paxson & Schady, 2007), none of the children in our sample have a raw score of
zero or close to zero; most of them have a score between 40 and 54, which indicates
that they have made at least some progress. A common practice is to exclude such
children from analyses, but this will generally result in biased parameter estimates.
Instead of dropping these children, and because their scores are very close to the minimum
norm, we assign them the minimum normed score for their age.
Including the censored values, average performance on the PPVT test is low: the
mean of 88.71 indicates that, on average, the children in our sample are 0.75 standard
deviations below the mean of the group used for norming the PPVT in round 2. Twelve
and a half percent of children in the sample have censored PPVT scores. When the
censored group is excluded, the mean PPVT score rises to 90.1, still well below the
normed mean.
Our variable of interest is ethnicity. There is not a consensus in the literature
about the definition of ethnicity. Some researchers use mother’s tongue, some others use
child’s language, while others use child’s self-reported ethnicity. Our analysis in this
paper classifies a child as indigenous based on the child’s spoken language. However,
auxiliary regressions using the other definitions of ethnicity yield similar results.
This paper controls for socioeconomic characteristics, child health, parenting
characteristics, neighborhood, and school characteristics. The socioeconomic variables
include the natural logarithm of annual household expenditures, father’s education, and
mother’s education. To calculate the annual household expenditures we added up all the
expenses reported by the head of the household during the year. These expenses included
items bought on weekly, monthly and annual basis, as well as paid utilities and services.
The education of parents is given in single years.
This paper uses four measures of parental behavior towards the education of their
child: the natural logarithm of expenditure in education, number of months in day care,
number of months in preschool and hours per week spent in preschool. One variable, the
number of books in the household, is used to control for home environment.
Measures of school characteristics were used only for round 3, because children were
not yet in school in round 2. This study uses three measures of school characteristics:
grade repetition, teaching quality and type of school (private or public). All measures are
based on parents’ self-reports. Teacher quality was constructed using two measures:
whether the parent thinks that the teacher is a good teacher or not, and whether the
teacher frequently misses school or not. Grade repetition is a dichotomous variable that
refers to whether the child repeated a grade in round 3 or not.
This paper uses the child’s height-for-age z-score as an indicator of early childhood
malnutrition. There is a consensus in the literature that this is the best single indicator of a
child’s cumulative nutritional status among infants and children (McKee & Todd, 2009;
Behrman & Hoddinott, 2001; Glewwe & King, 2001). To avoid an endogeneity problem
caused by the fact that parents make decisions about health/nutrition and education
simultaneously, this study instruments child’s height-for-age (Glewwe, Jacoby & King,
2001; Glewwe & Miguel, 2008). Moreover, ordinary least squares (OLS) estimates of
the parameter of the demand for cognitive skills are likely to be biased for several
reasons. First, a child’s nutritional status may be correlated with unobserved parental
tastes regarding their child’s nutrition. Second, height-for-age z-score may not
completely reflect children’s nutritional status (there is variation in height and weight
even among healthy children, which implies that this z-score can be a noisy measure of
child’s nutritional status). Because these two problems can be resolved using instrumental
variable methods, we consider the choice of instruments for child’s growth. To deal with
the first problem, the instruments must be correlated with child’s growth but uncorrelated
with parental tastes or a child’s inherited ability to learn, both of which are unobserved
variables in the demand equation for child’s cognitive skills. Three variables that would
appear to satisfy these criteria are exogenous changes in the food supply, rainfall in each
child’s birth year and birth location, and a measure of fluctuations in the physical
environment such as health characteristics of the community. Food shocks when a child
is 6-18 months of age allow one to control for the behavior determining the child health
stock as measured by child’s height by age 5. These food shocks can serve as
instrumental variables because they may disrupt income-earning activities or raise
individuals’ exposure to disease-spreading pathogens, both of which are important
determinants of health status. Such exogenous changes in food supply (e.g. livestock
death, crop failure, livestock theft, and exogenous changes in food availability) are
uncorrelated with shocks at the subsequent age at which decisions are made whether or
not to invest in children’s education. Therefore, this procedure permits estimation of the
impact of child health on child cognitive skills without contamination from unobserved
behaviors such as time allocation decisions about other household members that are
concurrent with the child’s cognitive development decision. Similar conditions are
satisfied by the rainfall shocks instrument. In a rural country like Peru, higher birth-year
rainfall can be interpreted as a positive shock to a Peruvian community, leading to
higher local-level crop output and household income. These exogenous weather
conditions, rainfall shocks, are expected to have a positive and significant effect on
child’s nutrition and health in the first year of life. The nutrition literature suggests that
this initial health endowment is determined by environmental conditions early in life and
has lasting effects. We obtained historical rainfall data for weather stations across Peru
from the Global Historical Climatology Network Precipitation and Temperature Data6.
The data include monthly records for each station as well as its latitude and longitude.
For each month between 1999 and 2001, we use the station location information to match
each birth district represented in the YLS to the closest weather station. We matched a
total of 20 stations with YLS birth districts. Finally, we use population per health center
as our third instrument, to measure health environment in the community. All three of
these variables appear to resolve the first econometric problem.
6 These data are available at http://www.ncdc.noaa.gov
Regarding the second problem, changes in height may not fully measure children’s
nutritional status, because they may not reflect micronutrient deficiencies, implying that
the error term in the demand equation for cognitive skills could be correlated with the
instruments. However, the literature suggests that this is not going to generate a
significant bias. The reason is that low-income households not only have lower calorie
and protein intakes but also less diverse diets, which leads to micronutrient deficiencies.
Because of the correlation between calorie and protein intakes and micronutrient intakes,
child growth may reflect micronutrient deficiencies, leaving little possibility for such
deficiencies to be revealed in the error term in the demand equation for cognitive skills.
We explore how much the estimates from this preferred procedure differ from those of
a naïve model in which child health is treated as exogenous, as well as from other
alternatives. The alternatives include adding the district’s health infrastructure
characteristics during the first wave of data collection to the first-stage estimates, in an
attempt to control for the behaviors determining child health.
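The matching of birth districts to the closest weather station, described above, can be sketched with a great-circle distance. This is an illustrative sketch only: the station identifiers and all coordinates are hypothetical placeholders, and the paper does not state which distance measure was used (the haversine formula is assumed here).

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_station(district_lat, district_lon, stations):
    """Return the id of the station closest to the district centroid.

    stations is a list of (station_id, latitude, longitude) tuples.
    """
    closest = min(
        stations,
        key=lambda s: haversine_km(district_lat, district_lon, s[1], s[2]),
    )
    return closest[0]


# Hypothetical stations and district centroid, for illustration only.
stations = [("A", -12.05, -77.04), ("B", -13.52, -71.97), ("C", -15.84, -70.02)]
matched = nearest_station(-13.60, -72.00, stations)
```

Once each district is matched, the station's monthly rainfall records for the birth year can be merged onto the child-level data by district.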
4.3. Empirical Framework
The empirical approach is based on the Oaxaca-Blinder decomposition, which is
commonly used to assess the determinants of earning gaps between two groups (Blinder,
1973; Oaxaca, 1973). This approach has recently been used to decompose the
determinants of the racial and ethnic gap on test scores (Hernandez-Zavala et al., 2006;
Neidell, 2000).
As mentioned above, the Young Lives data collect information for children in 20
sites. Each group of about 100 children lives in the same neighborhood, and it is highly
likely that these neighborhoods vary in terms of unobserved family variables such as
wealth or parental behavior (due to choices parents made regarding where to raise their
children). If these omitted variables have a causal impact on test scores, then the
estimated impact of neighborhood effects will be biased in ordinary least squares (OLS)
regressions. One way to reduce this concern about omitted variable bias is to control
for neighborhood fixed effects, which is feasible because we observe multiple children,
both indigenous and non-indigenous, within neighborhoods.
4.3.1. Ordinary Least Squares (OLS) Estimation
In order to explore the indigenous – non-indigenous gap in test scores, this study
starts by specifying the test score to be a function of the child’s ethnicity and a set of
child and household characteristics (child’s gender, per capita expenditure, father’s years
of education, mother’s years of education, height-for-age z-score, per capita expenditure
in education, months of child in day care, months of child in preschool, and hours a week
spent in a preschool center) that may influence test scores. It begins with the estimation
of OLS regressions for indigenous and non-indigenous children (as indicated in the
superscripts) within each year’s sample:


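\[ \mathit{Test}^{ind}_{ij} = \beta^{ind} X^{ind}_{ij} + \delta^{ind}_{j} + \varepsilon^{ind}_{ij} \tag{5} \]
\[ \mathit{Test}^{nind}_{ij} = \beta^{nind} X^{nind}_{ij} + \delta^{nind}_{j} + \varepsilon^{nind}_{ij} \tag{6} \]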
The test score (Test) of child i in neighborhood j is regressed on a vector of child
and family variables (X). The β’s are coefficients to be estimated. Each neighborhood
(district) is allowed to have a separate intercept via a series of fixed effects (δ). Finally, ε
is an error term, which represents all unobserved variables. The standard errors are
adjusted for arbitrary forms of heteroskedasticity because observations within
neighborhoods are correlated due to their exposure to the same community factors; this
makes the disturbances not independently and identically distributed (Wooldridge, 2010).
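A regression with neighborhood fixed effects of the kind described above can be sketched with the within (demeaning) transformation. This is a minimal sketch on simulated data, not the authors' estimation code: the variable names, the simulated coefficients, and the use of NumPy are assumptions, and the sketch returns point estimates only, without the robust standard errors mentioned in the text.

```python
import numpy as np


def within_ols(y, X, groups):
    """OLS slope estimates with group fixed effects via within-group demeaning."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    y_dm = y.copy()
    X_dm = X.copy()
    for g in np.unique(groups):
        mask = groups == g
        # Subtracting group means removes the neighborhood-specific intercept.
        y_dm[mask] -= y[mask].mean()
        X_dm[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta


# Simulated data: 5 hypothetical neighborhoods with 40 children each.
rng = np.random.default_rng(0)
neighborhood = np.repeat(np.arange(5), 40)
X = rng.normal(size=(200, 2))
fixed_effect = np.array([0.0, 5.0, -3.0, 2.0, 10.0])[neighborhood]
true_beta = np.array([1.5, -0.5])
test_score = X @ true_beta + fixed_effect + rng.normal(scale=0.1, size=200)

beta_hat = within_ols(test_score, X, neighborhood)
```

Running this separately on the indigenous and non-indigenous subsamples would give the two coefficient vectors that enter the decomposition.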
Given the coefficient estimates, the mean difference in indigenous and non-indigenous
test scores within each cross-section can be rewritten as:
\[ \underbrace{\Delta \overline{\mathit{Test}}}_{\text{Test score difference}} = \underbrace{\Delta \bar{X}\, \hat{\beta}^{ind}}_{\text{Endowments}} + \underbrace{\bar{X}^{nind} \left( \hat{\beta}^{ind} - \hat{\beta}^{nind} \right)}_{\text{Performance}} + \underbrace{\Delta \delta}_{\text{Neighborhood effect}} \tag{7} \]
where $\Delta \overline{\mathit{Test}} = \overline{\mathit{Test}}^{ind} - \overline{\mathit{Test}}^{nind}$ and $\Delta \bar{X} = \bar{X}^{ind} - \bar{X}^{nind}$. The overbars refer
to variable means and the superscripts refer to the subsamples of indigenous and
non-indigenous children. The first term on the right-hand side of equation (7) is the portion of
the gap that can be explained by differences in each group’s endowments, such as maternal
schooling or the number of books in the household. The second term is known as the
performance effect. Technically speaking, this is the sum, over all the x variables, of the
contributions to the difference in mean test scores between non-indigenous and
indigenous children caused by differences, across the two groups of children, in the
impacts of the x variables on test scores. For example, it may be that the impact (β) of
maternal years of education on test scores is higher for non-indigenous than for indigenous
children because school quality is higher among the non-indigenous
population. The third term in equation (7) is the portion of the gap due to living in a
certain neighborhood, also known as the neighborhood effect.
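The three-part decomposition described above can be sketched numerically. This is a simplified illustration under stated assumptions: the data are simulated, the function names are hypothetical, and a single intercept per group stands in for the average of the neighborhood fixed effects, so the third term here is only an analogue of the neighborhood effect.

```python
import numpy as np


def ols_with_intercept(y, X):
    """Return (intercept, slopes) from an OLS regression of y on X."""
    Z = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[0], coef[1:]


def oaxaca_blinder(y_ind, X_ind, y_nind, X_nind):
    """Split the mean gap (indigenous minus non-indigenous) into three terms."""
    a_ind, b_ind = ols_with_intercept(y_ind, X_ind)
    a_nind, b_nind = ols_with_intercept(y_nind, X_nind)
    xbar_ind = X_ind.mean(axis=0)
    xbar_nind = X_nind.mean(axis=0)
    gap = y_ind.mean() - y_nind.mean()
    endowments = (xbar_ind - xbar_nind) @ b_ind   # difference in means, indigenous coefficients
    performance = xbar_nind @ (b_ind - b_nind)    # difference in coefficients, non-indigenous means
    intercepts = a_ind - a_nind                   # stand-in for the neighborhood term
    return gap, endowments, performance, intercepts


# Simulated data with two hypothetical covariates.
rng = np.random.default_rng(0)
X_ind = rng.normal(loc=[6.0, -1.5], scale=1.0, size=(150, 2))
X_nind = rng.normal(loc=[10.0, -1.0], scale=1.0, size=(300, 2))
y_ind = 70.0 + X_ind @ np.array([1.0, 2.0]) + rng.normal(size=150)
y_nind = 75.0 + X_nind @ np.array([1.5, 2.5]) + rng.normal(size=300)

gap, endowments, performance, intercepts = oaxaca_blinder(y_ind, X_ind, y_nind, X_nind)
```

Because OLS with an intercept fits the group means exactly, the three terms add up to the raw gap, which is a useful check on any implementation.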
Even though one of the goals of this paper is to understand the indigenous –
non-indigenous gap in each round of data, another goal is to understand the causes of
changes in the test score gap. To this end, equation (7) can be successively applied to the
2005 and 2009 samples. The difference between the two can be written as:
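\[ \underbrace{\Delta \left( \Delta \overline{\mathit{Test}} \right)}_{\text{Test score diff-in-diff}} = \underbrace{\Delta \left( \Delta \bar{X}\, \hat{\beta}^{ind} \right)}_{\text{Endowments}} + \underbrace{\Delta \left[ \bar{X}^{nind} \left( \hat{\beta}^{ind} - \hat{\beta}^{nind} \right) \right]}_{\text{Performance}} + \underbrace{\Delta \left( \Delta \delta \right)}_{\text{Neighborhood effect}} \tag{8} \]
where:
\[ \Delta \left( \Delta \bar{X}\, \hat{\beta}^{ind} \right) = \Delta \bar{X}_{09}\, \hat{\beta}^{ind}_{09} - \Delta \bar{X}_{05}\, \hat{\beta}^{ind}_{05}, \]
\[ \Delta \left[ \bar{X}^{nind} \left( \hat{\beta}^{ind} - \hat{\beta}^{nind} \right) \right] = \bar{X}^{nind}_{09} \left( \hat{\beta}^{ind}_{09} - \hat{\beta}^{nind}_{09} \right) - \bar{X}^{nind}_{05} \left( \hat{\beta}^{ind}_{05} - \hat{\beta}^{nind}_{05} \right). \]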

The left-hand side of equation (8) refers to the test score difference-in-difference,
discussed in the conceptual framework section. The right-hand side decomposes into
three terms. The first term represents the change in the gap that is due to changes in
endowments between indigenous and non-indigenous children. We consider four types of
endowments: socio-economic, child’s health, home environment and parental behavior
towards education. The second term measures whether changes in the performance of child
and family variables (that is, in the coefficients on child and family variables) change the
test score gap. Finally, the third term represents the portion of the changing gap that is due
to changes in neighborhood effects.
4.3.2. Instrumental Variable Approach
Following Glewwe and King (2001), this paper uses food availability, rainfall
shocks and environmental health as instruments for child’s nutritional status. Child’s
nutritional status is measured by height-for-age z-score. This measures the deviation of
the child’s height according to her/his age and gender with respect to the reference
population. This variable is commonly used to measure child’s nutritional status. We use
three instruments: food availability shocks, rainfall shocks and health infrastructure in the
child’s district. We are able to use food shocks during wave 1 to control for the behavior
determining the child nutritional status in waves 2 and 3. Such food availability shocks
are uncorrelated with shocks during the following waves at which decisions are made
about investing in child’s cognitive skills. Similarly, we use child’s birth-year rainfall
shocks in the locality where the child was born to instrument child’s nutritional status.
Rainfall shocks should be uncorrelated with decisions about whether or not to invest in
child’s cognitive skills. In the same way, health infrastructure in the district in wave 1
should be uncorrelated with parental decisions about investing in their child’s cognitive
skills. Therefore, this procedure permits estimation of the impact of child health on child
cognitive skills without contamination from unobserved behaviors such as allocation
decisions for other household members that are concurrent with the child’s cognitive
skills decision. This suggests running the following regression in the first stage:
\[ \text{height-for-age}_{t} = f\left( \text{food availability}_{t-\tau},\ \text{rainfall}_{t},\ \text{neigh}_{t-\tau},\ u_{t} \right) \]
where height-for-age denotes the child’s height-for-age z-score, food availability refers to
food availability shocks in wave 1, rainfall refers to rainfall shocks and neigh to health
characteristics in the district in wave 1 (e.g. population per health center).
As Fortin, Lemieux, and Firpo (2011) suggest, the Oaxaca-Blinder decomposition
can be performed by substituting the OLS estimates of the beta coefficients by their
instrumental variables equivalents.
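The two-stage procedure described above can be sketched on simulated data. This is an illustration under stated assumptions, not the authors' estimation code: all variable names and coefficients are hypothetical, an unobserved "taste" variable is simulated to create endogeneity, and the sketch returns point estimates only (standard errors from a manually run second stage would not be valid).

```python
import numpy as np


def ols_coefs(y, X):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(y, x_endog, controls, instruments):
    """Return the 2SLS coefficient on the single endogenous regressor."""
    n = len(y)
    const = np.ones((n, 1))
    # First stage: endogenous regressor on controls and excluded instruments.
    first_X = np.column_stack([const, controls, instruments])
    x_hat = first_X @ ols_coefs(x_endog, first_X)
    # Second stage: outcome on controls and the first-stage fitted values.
    second_X = np.column_stack([const, controls, x_hat])
    return ols_coefs(y, second_X)[-1]


rng = np.random.default_rng(1)
n = 5000
instruments = rng.normal(size=(n, 3))   # stand-ins for food shock, rainfall, health infrastructure
controls = rng.normal(size=(n, 1))      # stand-in for an exogenous control
taste = rng.normal(size=n)              # unobserved factor affecting both health and scores
haz = instruments @ np.array([0.5, 0.4, 0.3]) + 0.3 * controls[:, 0] + taste + rng.normal(size=n)
score = 2.0 * haz + 1.0 * controls[:, 0] - 1.5 * taste + rng.normal(size=n)

naive_X = np.column_stack([np.ones(n), controls, haz])
beta_ols = ols_coefs(score, naive_X)[-1]
beta_iv = two_stage_least_squares(score, haz, controls, instruments)
```

In this simulation the unobserved factor biases the OLS coefficient downward, so the IV estimate is closer to the true value of 2.0; the direction of bias in real data depends on the sign of the omitted correlation.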
5. Results
This section examines the test scores of the 1,681 children in the younger cohort,
who were about five years old when they were tested in mid 2006 and about eight to nine
years old when they were retested at the end of 2009. The section begins by comparing
the test scores of ethnic majority (Spanish speakers) and ethnic minority (Quechua,
Aymara and other native language speakers) children, first for the entire sample and then
for the subsample of districts that have both Spanish speaking and non-Spanish speaking
children. Then, it presents estimates of cognitive skills production functions for both
vocabulary and mathematics that attempt to explain the determinants of test scores, and
why test scores are lower for ethnic minority children.
Table 1 presents means and standard deviations of the scores for the Math
Achievement test and the PPVT-III test, first for the full sample and then separately for
the Spanish-speaking population and the indigenous-language-speaking population. The top half of
the table shows statistics for all students while the bottom half limits the sample to the
seven districts that had both Spanish speakers and ethnic minority children.
Beginning with the first row in that table, the average child in the full sample had
a standardized PPVT score of 88.7 in 2005. The gap between non-indigenous and
indigenous children is not so large: the former had a mean score of 90 while the latter had
a mean score of 79.9, a difference of 10.1, which is equivalent to 0.4 standard deviations
of the distribution of test scores. This gap, however, increases considerably in 2009,
when non-indigenous children had a mean score of 99.4 and indigenous children a mean
score of 77.0, a difference of 22.4, which is equivalent to 1.1 standard deviations of the
distribution of test scores. Turning to the mathematics scores, the average child in the full
sample correctly answered 8.4 of the 15 questions in 2005. The gap between
non-indigenous and indigenous children is small: the former had a mean score of 8.5 and the
latter had a mean score of 8.0, a difference of 0.5 points, which is equivalent to 0.2
standard deviations of the distribution of test scores. However, the mathematics
achievement gap increases in 2009. The average child in the full sample correctly
answered 12.1 of the 29 questions in the Mathematics Achievement test. Non-indigenous
children had a mean score of 12.6 while indigenous children had a mean score of 8.2, a
difference of 4.5 which is equivalent to 0.9 standard deviations of the distribution of test
scores.
These simple descriptive statistics show that the test score gap between non-indigenous
and indigenous children is increasing over time. It is possible, however, that
part of the difference between the test scores of non-indigenous and indigenous children
may be due to the fact that they live in different communities and so the neighborhood
environment can play a role. It may also be that living in different communities implies
that children attend different schools. To control for differences in communities, the
bottom half of Table 1 presents the same numbers as in the top half, but is limited to the
seven districts that have both non-indigenous and indigenous children in the sample. This
comparison does result in somewhat smaller gaps for 2005 (44 percent smaller for the
PPVT test and 65 percent smaller for the Math test), and marginally smaller gaps for
2009 (12 percent smaller for both PPVT and Math tests). Yet, there are still large gaps
even when comparing non-indigenous and indigenous children who live in the same
district.
To better understand the nature of the gaps, the rest of this section presents
regressions that attempt to explain the differences in the test scores of indigenous and
non-indigenous children in Peru. The results for the PPVT test are shown in Table 2,
those for the Mathematics test are shown in Table 3 and those for the gap decomposition
in Table 4.
The first column of Table 2 shows the estimates of $\beta^{nind}$ in equation (6) and the
second column shows the estimates of $\beta^{ind}$ in equation (5), both for 2005. The next two
columns show similar results for 2009. For 2005, per capita expenditure, age of child in
months, per capita expenditure in education, and months in preschool have a significant
effect on non-indigenous PPVT skills. Positive impacts of household income on child
cognitive abilities are often found in other studies and are quite intuitive; however, a
negative impact of expenditure in education is not that intuitive. One possible explanation
is that in 2005, most of the children were 5 years old; therefore, expenses on education
are not that common in Peru for that age. In 2009, however, expenditure in education
had a positive and significant effect on PPVT skills for non-indigenous children, as did
household income, maternal education, gender, age of child in months and height-for-age
z-scores. Similar effects were found for indigenous children: in 2005, age of child in
months, household income, maternal education, height-for-age z-score and months in day
care had a significant effect on PPVT scores; and in 2009, age of child in months,
maternal and paternal education, household expenditures and height-for-age z-scores had
positive and significant effects.
As previously described, the gap in mathematics achievement was really small in
2005, but increased eight hundred percent by 2009. In the latter year, age in months was
significant for both non-indigenous and indigenous children. It is possible that this
reflects the fact that older children are more mature and therefore are able to acquire
more skills. Paternal education has a significant effect on indigenous children, but not on
non-indigenous children. Presumably, this reflects the fact that in indigenous societies
the male plays a predominant role and, if the father is educated, he will want his child to
be educated too. Maternal education, on the other hand, has a significant effect only on
non-indigenous children. This result is frequently found in the literature (Currie, 2009;
James-Burdumy, 2005; Kimmel & Connelly, 2007). Height-for-age z-score has a positive and
significant impact on mathematics achievement, as often found in the literature, but only
for non-indigenous children.
The Oaxaca-Blinder decomposition can be used to see how much of the observed
gap between non-indigenous and indigenous children in their mean PPVT test scores is
explained by differences in the means of the explanatory variables between
non-indigenous and indigenous children (endowment effect) and how much is explained by
differences in the impacts of those variables (performance effect). Table 4 shows the
decomposition results.
The left-most columns report the decomposition within the 2005 sample, based
upon Equation (7). Approximately 71 percent of the 2005 PPVT test score gap can be
explained by the different endowments across indigenous and non-indigenous children. A
modest share, 5 percent, is explained by differing neighborhood
effects. In 2009, however, 64 percent of the PPVT gap can be explained by the different
endowments across indigenous and non-indigenous children. We can observe then that
parental education explains 49 percent of the gap, followed by preschool (13 percent).
Twenty-two percent of the gap is accounted for by the returns, or performance. Some
researchers call the endowment effect the “explained” part of the gap, because modifying
the endowments of parental education, child health, or preschool for the indigenous
groups will reduce the gap. This can be done, for example, with policies that reduce adult
illiteracy in indigenous communities, policies that provide incentives to indigenous
people to get vaccinations for their children, or policies that provide them nutritious food
for their children. These same researchers call the performance effect the “unexplained”
part of the gap, because there are no simple policies that can be used to modify
it. This means that we can also interpret these results in the following way: seventy-eight
percent of the gap can be explained by parental, child, and preschool factors (64 percent)
and neighborhood factors (14 percent), and 22 percent of the gap is unexplained.
One objective of this study is to decompose the gaps within years, but another one
is to understand what caused the increase of the gap between 2005 and 2009. Thus, the
right-most columns report the decomposition of the test score difference-in-differences in
each subject based upon Equation (8). Overall, the gap increases by 0.46 standard deviations
in vocabulary and by 1.08 in mathematics. The decomposition of the vocabulary gap
suggests that 79 percent of this change is explained by changing endowments of family,
child and neighborhood, consistent with previous descriptive evidence. Similarly, in the
case of mathematics, 77 percent of the gap can be explained by changes in the
endowments.
To summarize, the evidence suggests that convergence in endowments of family,
children, preschool and neighborhood variables, at least those observed in the Young
Lives data, plays a significant role in the increasing gap.
The next tables present results with the IV method. Recall that the OLS
estimates in Table 2 and Table 3 for non-indigenous and indigenous children in 2005 and
indigenous children in 2009 suggest that poor nutrition has no significant effects on
children’s cognitive development, which is consistent with previous findings (Alderman
et al., 2001; Stein et al., 1975; Waber et al., 1981), but is inconsistent with Glewwe &
King (2001) and Villar et al. (1984). Yet, the OLS results may be misleading because
tastes for child quality may induce a correlation between child nutrition and cognitive skills
that is not causal. To address this problem, we use an IV approach and present
these results in Tables 5 and 6. Estimates in these tables treat child nutrition as
endogenous, using food availability shocks, rainfall shocks, and health characteristics in
the community as instruments. The first-stage results are given in Appendix Table A2.
These parameter estimates are different from the OLS results. Indeed, a Hausman test
rejects the hypothesis that the two sets of estimates are equal. The most notable difference
is that the coefficient on child nutrition increased relative to the OLS estimate. These
results are consistent with Glewwe & King (2001), Alderman et al. (2001), Waber et al.
(1981), and Stein et al. (1975), whose studies suggest that the OLS estimates are biased.
The instrumental variables used for height-for-age z-score have strong predictive
power, as revealed by the F-test statistics (bottom of table A2). The model easily passes
the standard over-identification test, suggesting that the instruments are not correlated
with the error term in the demand equation for cognitive skills.
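The first-stage F-test mentioned above compares a first-stage regression with and without the excluded instruments. The following is a minimal sketch on simulated data, assuming homoskedastic errors; the function name, the variable names and the simulated coefficients are hypothetical and are not taken from Table A2.

```python
import numpy as np


def first_stage_f(x_endog, controls, instruments):
    """F statistic for the joint significance of the excluded instruments."""
    n = len(x_endog)
    const = np.ones((n, 1))
    restricted = np.column_stack([const, controls])
    unrestricted = np.column_stack([const, controls, instruments])

    def rss(X):
        # Residual sum of squares from regressing x_endog on X.
        coef, *_ = np.linalg.lstsq(X, x_endog, rcond=None)
        resid = x_endog - X @ coef
        return resid @ resid

    q = instruments.shape[1]            # number of excluded instruments
    df = n - unrestricted.shape[1]      # residual degrees of freedom
    return ((rss(restricted) - rss(unrestricted)) / q) / (rss(unrestricted) / df)


rng = np.random.default_rng(2)
n = 1000
instruments = rng.normal(size=(n, 3))
controls = rng.normal(size=(n, 1))

# Relevant instruments: they shift the endogenous regressor.
haz_strong = instruments @ np.array([0.5, 0.4, 0.3]) + 0.3 * controls[:, 0] + rng.normal(size=n)
# Irrelevant instruments: the regressor does not depend on them at all.
haz_weak = 0.3 * controls[:, 0] + rng.normal(size=n)

f_strong = first_stage_f(haz_strong, controls, instruments)
f_weak = first_stage_f(haz_weak, controls, instruments)
```

A commonly cited rule of thumb treats a first-stage F statistic above roughly 10 as a sign that the instruments are not weak; the statistic says nothing about instrument validity, which is what the over-identification test addresses.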
Table 5 presents regression results using an IV approach that includes measures
of child health, parenting, and neighborhood characteristics. Regressions include the full
vector of characteristics specified in table 2. Results suggest that for 2005 and 2009
maternal education, gender and expenditure in education have a statistically significant
effect on vocabulary test scores for non-indigenous children. The effect of expenditure in
education is negative in 2005. As shown in Table A1, Peruvian parents spent 5 times
more in 2009 than in 2005; thus, presumably, because they spent so little in
2005, it does not have a positive influence on test scores. In addition, for 2009,
height-for-age z-score has a positive effect on vocabulary for indigenous children. This
effect is 3.5 times bigger than the one described in Table 2 (OLS). In the case of
non-indigenous children, maternal education, household expenses, and educational expenses
have positive effects on vocabulary of a similar magnitude to the ones found with the
OLS regressions.
Similar effects are found for mathematics achievement in 2009. Age of child in
months has a positive effect on mathematics achievement for indigenous children.
Expenditure in education also has a positive and significant effect on mathematics
achievement for indigenous and non-indigenous children. For indigenous children, being
a boy has a positive impact on mathematics achievement, as does height-for-age z-score.
The effect of height-for-age z-score on mathematics achievement is almost 6 times
larger in the IV estimation than in the OLS estimation. For non-indigenous children,
maternal and paternal education also play an important role.
Results of the Oaxaca-Blinder decomposition are presented in table 6. As before,
the left-most columns report the decomposition within the 2005 sample. Approximately
one-third of the 2005 PPVT test score gaps can be explained by parental endowments,
while in 2009, half the gap was explained by parental endowments.
The right-most columns report the decomposition of the test score
difference-in-differences in each subject. Overall, the gap increases by 0.46 standard deviations in
vocabulary and by 1.1 in mathematics. The decomposition of the vocabulary gap suggests
that 78 percent of this change is explained by changing endowments of family and child,
consistent with previous descriptive evidence. Similarly, in the case of mathematics, 79
percent of the gap can be explained by changes in the endowments. In both cases, the
major contributor is parental education.
To summarize, the evidence suggests that convergence in endowments of family,
children, and parental behavior towards daycare and preschool plays a significant role in
the increasing gap. Even though child health is an important determinant of vocabulary and
mathematics scores for indigenous children when using the IV approach, it does not
explain a greater portion of the endowment effect of the ethnic gap. What might be
occurring is that child health operates through the performance effect. This result is
consistent with the previous literature in developing countries that shows that child health
is an important determinant of child’s cognitive skills. However, our results also show that
parental behavior towards daycare and preschool plays a significant role. Thus, it seems that
investments in early education will help close the ethnic gap.
Auxiliary regressions were run for different scenarios. We used different
definitions of indigenous and found similar results. Also, we included school variables
for 2009 (when children were old enough to go to school). Table 7 reports the
indigenous-non-indigenous gap decomposition considering schooling factors. However,
we found that the effect of school was small.
6. Discussion
In this paper we studied the achievement gap between indigenous and non-indigenous
children using a unique longitudinal data set, one that follows a large sample of children
in a low income country from age 1 to age 8. Several important findings emerge from this
analysis. First, there is a strong relationship between childhood nutrition and academic
achievement and this factor explains an important portion of the achievement gap
between indigenous and non-indigenous children. The positive relationship between
nutrition and achievement persists even after controlling for parental and household
characteristics. Our results thus support a causal link between nutrition and academic
success, though arguably a definitive answer to the causality question is only possible
from an ideal (and therefore unlikely) empirical experiment. Second, parental
characteristics, especially maternal education, have a positive and significant effect on child
achievement. Moreover, this factor is also important in closing the
indigenous-non-indigenous gap. Indeed, the child development literature in the US points out that this
is a crucial factor, probably the most important in explaining child achievement.
Preliminary results that are not presented in this paper show that this is also the case for
older children (using the Young Lives’ older cohort for Peru, where children were seven
years old in wave 1, twelve in wave 2 and fifteen in wave 3). In the case studied in this
paper, it seems possible that the effects of parental attitudes and behaviors when children
were infants and toddlers are acting through the daycare and preschool variables.
It is important to notice that there is not a consensus in the literature about the
definition of indigenous. Some studies define indigenous based on self-reported ethnicity,
maternal tongue or child’s tongue. We used the latter definition. However, for robustness
checks, we tried the other two definitions and the results were similar.
When considering the policy implications of our findings, substantial caution is in
order. First, although we found evidence that the effect of school characteristics is small, we
cannot conclude that schooling variables are not important. The data set that we used was
collected by interviewing individuals in the household. We did not have access to
teaching reports or specific data from schools. Our schooling variables are based on
parental perceptions about schools. Second, these findings are based on one area of one
developing country; further evidence is needed from other countries before we can claim
that parental education is the most critical factor in closing the indigenous – nonindigenous achievement gap. Third, even when child health has a large, positive and
significant effect on achievement for indigenous children, the cost of preventing child
malnutrition must be compared with the cost of literacy programs and further education
classes for adults targeted to indigenous women.
In sum, the results of this study suggest that parenting characteristics, child nutrition,
and school and neighborhood characteristics account for more than 75 percent of the
indigenous-non-indigenous gap. These findings suggest that there may be large returns to
programs that reduce child malnutrition, as well as literacy programs for adult women, and to programs
that improve parenting quality. However, a full assessment of such policies would require
a more complete analysis of how families make decisions about what inputs to provide to
their children.
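The 75 percent figure can be checked against Table 4 by summing the endowment components. The sketch below does this for the 2009 PPVT gap, using the point estimates reported in Panel A. The dictionary and function names are illustrative assumptions, and the label "Other" for the fourth component follows one reading of the table layout.

```python
# Point estimates copied from Table 4, Panel A (PPVT test), 2009 column.
# Names are illustrative; "Other" is the assumed label of the unlabeled row.
ppvt_2009 = {
    "Parental Education": 5.72,
    "Child's Health": 0.35,
    "Preschool": 1.46,
    "Other": 1.60,
    "Neighborhood Effects": -0.08,
}
performance_effect = 2.57
total_gap = 11.63


def endowment_share(endowments, total):
    """Return the share of the total gap attributed to endowment differences."""
    return sum(endowments.values()) / total


share = endowment_share(ppvt_2009, total_gap)
print(f"Endowment share of the 2009 PPVT gap: {share:.1%}")

# Contribution of each component as a percentage of the total gap.
for name, value in ppvt_2009.items():
    print(f"  {name}: {value / total_gap:.0%}")
```

The endowment components and the performance effect add up to the total gap only up to rounding in the published figures. The same calculation can be applied to the other columns of Table 4.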
References
Alderman, H., Behrman, J.R., Lavy, V., & Menon, R. (2001). Child health and school
enrolment: A longitudinal analysis. The Journal of Human Resources, 36(1): 185-205.
Becker, G.S. (1965). A theory of the allocation of time. The Economic Journal, 75(299),
493-517.
Bernal, R. (2008). The effect of maternal employment and child care on children’s
cognitive development. International Economic Review, 49(4):1173-1209.
Cook, M.D., & Evans, W.N. (2000). Families or schools? Explaining the convergence in
white and black academic performance. Journal of Labor Economics, 18(4): 729-754.
Currie, J. (2009). Healthy, Wealthy, and Wise: Socioeconomic Status, Poor Health in
Childhood, and Human Capital Development, Journal of Economic Literature
47(1), 87-122.
Fortin, N., Lemieux, T., & Firpo, S. (2011). Decomposition methods in economics. In
Handbook of Labor Economics.
Glewwe, P., & King, E.M. (2001). The impact of early childhood nutritional status on
cognitive development: Does the timing of malnutrition matter? The World Bank
Economic Review, 15(1): 81-115.
Glewwe, P., & Miguel, E.A. (2008). The impact of child health and nutrition on
education in less developed countries, Handbook of Development Economics,
Elsevier. Vol. 4: 3562-3604.
Hernandez-Zavala, M., Patrinos, H.A., Sakellariou, C., & Shapiro, J. (2006). Quality of
schooling and quality of schools for indigenous students in Guatemala, Mexico and
Peru. World Bank Working paper 3982.
International Monetary Fund. World Economic Outlook Database-September 2011,
International Monetary Fund. Accessed on September 20, 2011.
James-Burdumy, S. (2005). The effect of maternal labor force participation on child
development. Journal of Labor Economics, 23(1), 177-211.
Kimmel, J., & Connelly, R. (2007). Mothers’ time choices: Caregiving, leisure, home
production, and paid work. Journal of Human Resources, 42(3), 643-81.
Leibowitz, A. (1977). Parental inputs and children’s achievement. Journal of Human
Resources, 12(2), 242-51.
Marshall, J.H. (2009). School quality and learning gains in Guatemala. Economics of
Education Review, 28(2): 207-216.
McEwan, P.J. (2007). Can schools reduce the indigenous test score gap? Evidence from
Chile. Journal of Development Studies, 44(10): 1506-1530.
McEwan, P.J., & Trowbridge, M. (2007). The achievement of indigenous students in
Guatemalan primary schools. International Journal of Educational Development,
27: 61-76.
McKee, & Todd, P. (2009). The longer term effects of human capital enrichment
programs on poverty and inequality: Oportunidades in Mexico. Working paper.
Neidell, M. (2000). Early parental time investments in children's human capital
development: Effects of time in the first year on cognitive and non-cognitive
outcomes. UCLA Economics Working Papers, 886.
Oaxaca, R. & Ransom, M. (1999). Identification in detailed wage decomposition. Review
of Economics and Statistics, 81(1): 154-157.
Paxson, C., & Schady, N. (2007). Cognitive development among young children in
Ecuador: The role of health, wealth and parenting. Journal of Human Resources,
42(1), 49-84.
Schady, N. (2006). Early childhood development in Latin America and the Caribbean.
World Bank Policy Research Working Paper No. 3869.
Shin, H. (2007). Child Health in Peru: Importance of Regional Variation and Community
Effects on Children's Height and Weight. Journal of Health and Social Behavior,
48(4): 418-433.
Stein, Z., Susser, M., Saenger, G., & Marolla, F. (1975). Famine and Human
Development: The Dutch Hunger Winter of 1944-45. New York: Oxford University
Press.
Todd, P., & Wolpin, K. (2007). The production of cognitive achievement in children:
home, school, and racial test scores gaps. Journal of Human Capital, 1(1), 91-136.
Umbel, V.M., Pearson, B.Z., Fernandez, M.C., & Oller, D.K. (1992). Measuring bilingual
children’s receptive vocabularies. Child Development, 63(4): 1012-1020.
UNICEF. (2009). The state of the world’s children special edition: Celebrating 20 years
of the convention on the rights of the child.
United Nations. (2010). Human Development Report 2010. New York: Palgrave
Macmillan.
Villar, J., Smerglio, V., Martorell, R., Brown, C.H., & Klein, R.E. (1984). Heterogeneous
growth and mental development of intrauterine growth-retarded infants during the
first 3 years of life. Pediatrics, 74(5): 783-791.
Waber, D., Vuori-Christiansen, L., Ortiz, N., Clement, J., Christiansen, N., Mora, J.,
Reed, R., & Herrera, G. (1981). Nutritional supplementation, maternal education,
and cognitive development of infants at risk of malnutrition. American Journal of
Clinical Nutrition, 34: 807-813.
Wilcox-Gök, V. (1983). The determination of child health: An application of sibling and
adoption data. The Review of Economics and Statistics, 65(5), 266-273.
Wooldridge, J. (2010). Econometric analysis of cross-section and panel data (2nd ed.).
MIT Press.
Figure 1 – Regions in the Sample by Child’s Spoken Language
Language Spoken:
- Only Spanish
- Spanish and Indigenous (Quechua, Aymara or other)
Figure 2 – PPVT and Math Test Scores by Ethnicity, rounds 2 and 3
Figure 3 – Math Test Scores by different characteristics, rounds 2 and 3
Panels: Round 2, Round 3
Table 1: Mean Test Scores for Ethnic Majority and Ethnic Minority Children

All districts
Student Type                          Variable                  Year    Mean   Std. Dev.
Full sample                           PPVT standardized score   2005   88.71   21.77
(n=1,662) (n=1,681)                                             2009   96.55   19.92
                                      CDA-Q raw score           2005    8.40    2.11
                                                                2009   12.05    4.91
Spanish (child speaks Spanish)        PPVT standardized score   2005   90.00   22.04
(n=1,450) (n=1,462)                                             2009   99.41   18.57
                                      CDA-Q raw score           2005    8.47    2.13
                                                                2009   12.63    4.72
Indigenous (speaks other language)    PPVT standardized score   2005   79.93   17.49
(n=212) (n=219)                                                 2009   76.97   17.72
                                      CDA-Q raw score           2005    7.96    1.91
                                                                2009    8.17    4.34

Mixed districts
Student Type                          Variable                  Year    Mean   Std. Dev.
Full sample                           PPVT standardized score   2005   89.67   21.99
(n=572) (n=584)                                                 2009   95.93   20.92
                                      CDA-Q raw score           2005    8.48    2.09
                                                                2009   11.95    5.08
Non-indigenous                        PPVT standardized score   2005   85.57   20.86
(n=360) (n=365)                                                 2009   96.78   20.28
                                      CDA-Q raw score           2005    8.14    2.11
                                                                2009   12.09    4.84
Indigenous                            PPVT standardized score   2005   79.93   17.49
(n=212) (n=219)                                                 2009   76.97   17.72
                                      CDA-Q raw score           2005    7.96    1.91
                                                                2009    8.17    4.34
Table 2 - Regression Estimates for PPVT Test
2005
Spanish
Indigenous
Age of Child in month
0.596
***
Per Capita Expenditure(log)
6.455
***
Father's education
0.325
-0.085
Mother's education
0.303
1.008
Child is male
-4.318
Height-for-age Z-score
Per Capita Expenditure in
education, (log)
-1.324
Months in pre-school
-0.039
*
1.553
0.572
***
0.344
***
0.437
***
5.703
***
3.496
**
4.005
**
0.894
**
***
**
***
1.196
3.425
**
3.051
1.975
*
3.252
-0.146
3.422
***
0.708
-0.06
0.127
-0.152
0.11
0.119
1.083
0.268
Hours per week in pre-school
0.117
-0.091
-0.03
0.027
0.942
0.95
0.973
0.96
Adjusted R
0.302
*
Months in daycare,
2
*
0.359
1.542
2.775
***
2009
Spanish
Indigenous
*
**
Notes:
1. Child’s districts are also taken as control for neighborhood effect.
2. *p<.1, **p<.05, ***p<.01.
Table 3 - Regression Estimates for Mathematics Test
2005
Spanish
Indigenous
Age of Child in month
Per Capita Expenditure(log)
0.106
0.096
*
***
0.017
0.05
0.221
0.109
0.1
Father's education
0.013
0.061
0.116
0.226
Mother's education
0.017
0.021
0.274
***
-0.013
Child is male
0.119
-0.054
1.118
**
0.831
Height-for-age Z-score
Per Capita Expenditure,
(log)
0.042
-0.139
0.329
-0.116
-0.077
1.089
Months in pre-school
-0.002
0.004
0.021
0.041
0.014
-0.01
0.01
-0.006
-0.003
-0.011
-0.041
0.938
0.943
0.88
0.853
Months in daycare,
Hours per week in preschool
2
Adjusted R
0.25
***
2009
Spanish
Indigenous
0.048
**
***
Notes:
Child’s districts are also taken as control for neighborhood effect.
*p<.1, **p<.05, ***p<.01.
***
*
**
0.977
***
0.716
**
Table 4 – Decomposition Results

                               2005             2009          Difference
Panel A. PPVT Test Score
Endowment Effect
  Parental Education        1.60    45%      5.72    49%     4.12    51%
  Child's Health           -0.51   -14%      0.35     3%     0.86    11%
  Preschool                -1.84   -52%      1.46    13%     3.30    41%
  Other                     3.26    92%      1.60    14%    -1.66   -21%
  Neighborhood Effects      0.18     5%     -0.08    -1%    -0.26    -3%
Performance Effect          0.86    24%      2.57    22%     1.71    21%
Total Gap                   3.56   100%     11.63   100%     8.07   100%

Panel B. Math Test Score
Endowment Effect
  Parental Education        0.08  -227%      1.14    51%     1.06    47%
  Child's Health            0.01   -38%      0.05     2%     0.04     2%
  Preschool                -0.22   614%      0.56    25%     0.77    34%
  Other                     0.15  -440%      0.08     4%    -0.08    -3%
  Neighborhood Effects      0.03   -76%     -0.03    -1%    -0.05    -2%
Performance Effect         -0.09   266%      0.44    20%     0.53    23%
Total Gap                  -0.04   100%      2.24   100%     2.28   100%
Table 5 – Regression Estimates for PPVT Test, Instrumental Variable Method
Mathematics
Vocabulary
2005
Variable
Age of Child in month
Household Per Capita
Expenditure(log)
Spanish
0.182
***
2009
Indigenous
0.093
***
Spanish
2005
Indigenous
-0.032
0.119
**
Spanish
0.699
**
***
2009
Indigenous
0.664
**
0.313
5.456
**
3.599
0.040
0.229
0.256
-0.076
6.169
Father's education
-0.027
0.063
0.173
*
0.160
0.281
-0.125
Mother's education
-0.034
0.025
0.306
***
0.170
0.233
0.919
Child is male
0.244
-0.067
1.058
**
1.037
*
Height-for-age Z-score
Education per Capita
Expenditure, (log)
1.405
**
0.222
1.097
3.440
***
-0.168
***
-0.074
1.257
0.537
*
Months in pre-school
-0.007
0.004
0.018
0.042
-0.043
0.013
-0.011
0.045
0.275
Months in daycare
0.050
***
***
-4.159
*
0.321
-1.406
***
**
Spanish
0.883
**
0.389
0.459
***
0.282
1.473
3.418
**
4.313
-0.666
1.101
-0.246
3.540
-0.060
0.125
-0.162
0.106
0.357
*
18.843
**
-0.607
Hours per week in pre-school
-0.005
-0.004
0.002
-0.047
0.113
-0.065
-0.021
0.015
Adjusted R2
0.8564
0.8470
0.8006
0.8026
0.8593
0.8486
0.801
0.7982
Notes:
Child’s districts are also taken as control for neighborhood effect.
*p<.1, **p<.05, ***p<.01.
***
2.777
1.557
0.332
*
Indigenous
***
Table 6 – Decomposition results, Instrumental Variable Method

                               2005             2009          Difference
Panel A. PPVT Test Score
Endowment Effect
  Parental Education        1.30    36%      5.83    50%     4.53    55%
  Child's Health            0.11     3%      0.20     2%     0.09     1%
  Preschool                -1.94   -54%      1.53    13%     3.47    42%
  Other                     3.13    88%      1.64    14%    -1.49   -18%
  Neighborhood Effects      0.18     5%     -0.09    -1%    -0.27    -3%
Performance Effect          0.79    22%      2.63    22%     1.84    22%
Total Gap                   3.57   100%     11.74   100%     8.17   100%

Panel B. Math Test Score
Endowment Effect
  Parental Education       -0.16   291%      1.37    60%     1.53    66%
  Child's Health            0.44  -804%     -0.18    -8%    -0.62   -27%
  Preschool                -0.28   499%      0.65    29%     0.92    40%
  Other                     0.06  -112%      0.13     6%     0.07     3%
  Neighborhood Effects      0.03   -47%     -0.04    -2%    -0.07    -3%
Performance Effect         -0.15   273%      0.34    15%     0.49    21%
Total Gap                  -0.06   100%      2.27   100%     2.32   100%
Appendix Tables
Appendix Table 1 – Descriptive Statistics – Children in and out of the sample

                                                 Sample in this study,   Sample in this study,
                                                 all districts           mixed districts
                                                 (n=1662)                (n=987)                 All YLS for Peru
Variable                                           Mean   Std. Dev.        Mean   Std. Dev.        Mean   Std. Dev.   Obs.
PPVT test
Standardized PPVT, round 2                        88.71   21.77           89.67   21.99           88.42   21.79       1903
Standardized PPVT, round 3                        96.55   19.92           95.93   20.92           95.58   20.15       1904
Math test, round 2                                 8.40    2.11            8.48    2.09            8.37    2.16       1949
Math test, round 3                                12.05    4.91           11.95    5.08           11.90    4.93       1884
Being indigenous (child's spoken language)         0.13    0.33            0.21    0.41            0.13    0.34       1963
Being indigenous (maternal tongue)                 0.28    0.45            0.47    0.50            0.29    0.45       1954
Age of child in months, round 2                   63.53    4.69           63.42    4.86           63.45    4.71       1963
Age of child in months, round 3                  103.53    4.69          103.42    4.86          103.45    4.71       1963
Per Capita Expenditure (logs), round 2             7.11    0.90            7.06    0.94            7.07    0.91       1963
Per Capita Expenditure (logs), round 3             7.51    0.85            7.51    0.87            7.47    0.86       1932
Father's education, round 2                        9.24    3.94            9.21    4.05            9.15    3.97       1906
Father's education, round 3                        9.16    4.06            9.09    4.16            9.07    4.05       1923
Mother's education, round 2                        8.03    4.52            7.63    4.78            7.81    4.55       1954
Mother's education, round 3                        8.08    4.40            7.71    4.63            7.89    4.44       1816
Male                                               0.50    0.50            0.50    0.50            0.50    0.50       2052
Height-for-age Z-score, round 2                   -1.43    1.08           -1.53    1.08           -1.50    1.12       1950
Height-for-age Z-score, round 3                   -1.11    1.03           -1.23    1.03           -1.17    1.06       1938
Expenditure in child’s education (logs), round 2   3.68    2.91            3.60    3.00            3.51    3.00       1963
Expenditure in child’s education (logs), round 3   5.27    1.23            5.35    1.19            5.15    1.45       1943
Months in pre-school                              14.10   10.19           13.68   10.17           13.09   10.42       2052
Months in daycare, round 2                         3.04    7.28            3.43    7.70            2.78    7.02       2052
Hours per week in pre-school                      17.31    9.49           17.53    9.87           16.20   10.04       2052
Number of books in household (excluding textbooks) 1.71    1.26            1.69    1.26            1.68    1.26       1936
Student repeated a grade on round 3                0.04    0.21            0.04    0.20            0.04    0.21       2052
Teacher quality, round 3                           0.63    0.48            0.63    0.48            0.59    0.49       2052
Attended private school, round 3                   0.19    0.39            0.23    0.42            0.18    0.39       1938