Inequality, Poverty and their evolution in the US: Consumption and Income information in
the Consumer Expenditure Survey.
Orazio Attanasio, Erich Battistin and Andrew Leicester.
London 26.10.2004
Notes prepared for the National Poverty Center's ASPE-Initiated Workshop on Consumption
among Low-Income Families. Washington DC November 5th 2004.
I. Introduction
In these notes we discuss three main issues. We start with the quality of household level data
on consumption in the US and their use for the measurement of inequality, poverty and their
evolution. Consumption has some important advantages over income as a measure of well-being:
it is more likely to reflect ‘permanent income’ and, more generally, the effect of the different
mechanisms that households use to smooth shocks over time, and it is more likely than income
to be directly related to ‘utility’ and ‘welfare’. It is therefore worrying and puzzling that
the basic facts that come out of the main available surveys are not consistent. We will argue that the
consumption survey has some important problems and that they have got worse in recent years. These
need, in our opinion, to be resolved, as these data can be extremely important for a variety of policy
issues.
These problems notwithstanding, we proceed to show some evidence on the evolution of poverty
as measured by consumption and income in the CEX. In particular, we focus on: (i) the number of
‘poor’ according to the different definitions; (ii) how much the ‘poor’ consume as a fraction of their
income; (iii) the extent to which some observables (human capital, race) are linked to the probability
of being poor and how that has changed over time.
The level of consumption and its relation to income is only one aspect of the behaviour of poor
households that is of interest. A relatively under-researched agenda is, in our opinion, the
modelling of consumption components for poor households. From a policy perspective, it is
important to consider which components of consumption are most important for poor
households and how they change with the level of total consumption, income, relative prices and so
on. This analysis is important because it can be very informative both on the nature of consumption
among poor households and on the mechanisms through which poor households cope with shocks
to their income. In the last part of the paper we present some preliminary evidence on modelling
food shares in total consumption for the whole population and for sub-groups. This is just an
example of the type of analysis that can and should be pursued. In the conclusions we mention
possible extensions to this analysis and some future research projects.
II. Consumption and Income: available data and their quality
We will not reiterate the importance of having exhaustive and high quality consumption data in
household level surveys. This issue has been discussed at length elsewhere and we will take it for
granted. In the US there are two main household level databases that contain information on
consumption. The first is the PSID. The big advantage of the PSID is that it is a longitudinal
database that has been collected since 1967 and has proven to be of very high quality in a variety of
dimensions. The main problem with the PSID, in terms of the consumption data, is that the
information it contains is far from exhaustive. The PSID contains information on food
consumption (at home and outside the home) and on a few additional items. Besides being
very synthetic and incomplete, this information is gathered through questions that are
somewhat ambiguous, especially in terms of the time horizon to which they refer.
The other large data set that contains consumption information is, of course, the Consumer
Expenditure Survey (CEX). The CEX has a long history, going back to the beginning of the 20th
century. However, only in 1980 did the survey become a continuous and consistent survey.
The CEX is made up of two independent samples. The first, and largest, sample is referred to as the
Interview sample and is a rotating panel. Around 5,000 households are interviewed 4 times per year
and in each of these interviews they report their expenditure in the previous three
months on a long list of commodities, which should almost exhaust the items that make up Personal
Consumption Expenditure (PCE) in the NIPA accounts.[1] While the information on most commodities is
very detailed, that on food is not. There is basically a catch-all question on ‘total food in the
home’. Food is particularly problematic in the Interview Survey as it is the one question for which
there were two important changes in the methodology used to collect the information, one in 1982 and
one in 1987. In 1999 the sample size of this survey was increased by about 30%.
The second component of the CEX is the so-called Diary Survey. This is a somewhat smaller
independent sample of 5,000 households. Households in this sample are observed for two weeks,
and during this time they keep detailed diaries of all their expenditures. From 1980 to 1985 the Diary
Survey only collected information on ‘frequently purchased items’. These include very detailed
information on food and on a number of other items. Since 1986, the Diary Survey includes all
commodities, including retrospective information on durables. Interestingly, in addition to the
detailed diary information, for food at home the Diary Survey also contains the same retrospective
questions asked in the Interview Survey.
Both surveys, but especially the Interview Survey, contain a large set of additional variables ranging
from income and labour supply variables, to detailed demographic information. The Interview
Survey also contains a number of extremely detailed modules on a variety of themes, ranging from
vehicles and vehicle loans to education, housing and so on.
The main purpose of these data sets is the computation of the weights for the CPI. The BLS
typically uses information from the Diary Survey for frequently purchased items and from the
Interview Survey for the remaining items. This procedure implicitly recognizes the plausible fact that
the information on frequently purchased items in the Diary Survey is more accurate than in the
Interview Survey, while the information on other commodities is more reliable in the Interview
Survey.
Partly because the CEX was only started in 1980, until relatively recently the survey has not been
used extensively, especially in comparison to other data sources, such as the PSID, the CPS, and the
NLSs. An important question, therefore, is the reliability of the data. In this respect, one can check
whether the information one extracts from the CEX is consistent with the information from other
data sources. As is often the case, there is good news and bad news.
The good news is that information on wages and on pre-tax income is remarkably consistent with the
information from other data sources such as the CPS. If one studies the pattern of inequality using
hourly wage measures from the CEX, one gets, with a little additional noise (explained by the
smaller sample size) the same general picture that one gets out of the CPS. This is particularly
comforting as it signals that the CEX should not be affected by particularly nasty composition
problems and the like.
The bad news is that when one looks at consumption, the picture is not very comforting. As the
CEX is pretty much the only data source containing consumption information, the only possible
comparison is to National Income and Product Accounts (NIPA) data. In Figure 1, we plot ‘grossed
up’ CEX data for consumption along with PCE data. First, the CEX aggregates represent only a
relatively small fraction of PCE from NIPA. What is worse is that the ratio of CEX consumption to
NIPA PCE has been deteriorating over time and is now, for total consumption, roughly 60% (down
from 65% until the early 1990s). There is no convincing explanation that we are aware of for this
deterioration of the data (in relation to the aggregate counterpart) during the last decade.
[1] The two main exceptions are imputed rent on owner occupied housing and personal care items.
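The ‘grossing up’ comparison with PCE described above amounts to a weighted sum of household
expenditures divided by the national accounts aggregate. The following is a minimal sketch of that
arithmetic, not the procedure used for Figure 1: the records, the weights and the benchmark total
are made-up numbers, and the function names are hypothetical.

```python
# Hypothetical micro data: (annual expenditure, survey weight). The weight is the
# number of households in the population that each sampled household represents.
sample = [
    (30_000.0, 2_000.0),
    (45_000.0, 1_500.0),
    (20_000.0, 2_500.0),
]


def grossed_up_total(records):
    """Weighted sum of household expenditure: the survey estimate of the aggregate."""
    return sum(expenditure * weight for expenditure, weight in records)


def coverage_ratio(records, national_accounts_total):
    """Grossed-up survey aggregate as a fraction of the national accounts figure."""
    return grossed_up_total(records) / national_accounts_total


survey_total = grossed_up_total(sample)
benchmark_total = 284_000_000.0  # made-up benchmark, not an actual NIPA figure
ratio = coverage_ratio(sample, benchmark_total)
print(f"coverage ratio: {ratio:.3f}")
```

A ratio that falls over time, as reported for the CEX, means that the survey aggregate grows more
slowly than the benchmark; the calculation by itself says nothing about why.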
In addition to the comparability with aggregate data there are additional problems of consistency
between the two components of the Survey. Given the different methodology used in the Interview
and Diary surveys, it is not particularly surprising that one gets different levels of aggregate
expenditures from the two surveys. In Figure 2, we plot the aggregated Diary and Interview data for
(log) non durable consumption. The comforting feature of these data is that, over time, the two
series are roughly parallel.[2] However, if one moves to consider distributional aspects, which are the
focus of this note, the picture changes. The story about the evolution of inequality that emerges from
the two surveys is considerably different. Figure 3, which is taken from Attanasio, Battistin and
Ichimura (2004) (ABI), shows the evolution of the standard deviation of log non durable
consumption in the Diary and the Interview Survey. According to the Interview Survey (IS), inequality in
consumption, especially after the early 1990s, is substantially flat. This contrasts strongly with the
evolution of income and wage inequality over the same period and would have strong implications in
terms of the smoothing mechanisms and insurance markets available to US households. On the other
hand, if one looks at the picture that emerges from the Diary Survey (DS), one gets a substantial increase of
inequality throughout the 1990s. ABI rule out a number of simple explanations for this puzzling
evidence (such as differences in sample compositions, changes in the frequency of purchases and so
on), and propose a methodology that uses the assumption that the IS provides good measurements
for some commodities while the DS provides good measurements for others. The problem, of
course, is that to compute the variance of total non durable consumption one needs the covariance
between the expenditure in the two sets of commodities. ABI use some assumptions that allow them
to identify the path of inequality from 1982 to 2000 combining the information from the two data
sources. We reproduce their evidence in Figure 4. Not surprisingly, the path of inequality is
somewhere in between the one implied by the DS and the one implied by the IS. When looking at
the distribution of consumption it becomes much harder to combine the information from the two
data sets: for means it can be done without any problems; for the variance, we can do it by making some
(strong) assumptions. For the other features of the distribution, it is nearly impossible without even
stronger assumptions.
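The role of the covariance can be made explicit with a short derivation. This is a sketch in
notation of our own choosing (the labels $c_i^{IS}$ and $c_i^{DS}$ are illustrative and are not taken
from ABI): write total non durable consumption of household $i$ as the sum of the commodities
assumed to be well measured in the IS and those assumed to be well measured in the DS.

```latex
c_i = c_i^{IS} + c_i^{DS},
\qquad
\mathrm{E}(c_i) = \mathrm{E}(c_i^{IS}) + \mathrm{E}(c_i^{DS}),
\qquad
\mathrm{Var}(c_i) = \mathrm{Var}(c_i^{IS}) + \mathrm{Var}(c_i^{DS})
  + 2\,\mathrm{Cov}(c_i^{IS}, c_i^{DS}).
```

Each mean and each variance on the right hand side can be estimated from a single survey. The
covariance cannot, because the two samples are independent and the two components are never
measured jointly for the same household, which is where the additional assumptions come in. The
inequality measure plotted in Figure 3 is defined on logs, so the expression above only indicates
where the problem arises and is not the formula behind that figure.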
The main lesson that comes out of this discussion is that the CEX data is affected by serious quality
problems, which are particularly difficult to ignore if one wants to analyze the dynamics of inequality
or other distributional issues. In what follows we will proceed and use the CEX data despite these
problems. It is clear, however, that they should be kept in mind when interpreting our results.
Before moving to the analysis of poverty in the CEX, it is however worth asking the question: are the
data problems discussed in this section unavoidable? Is consumption in rich developed countries
intrinsically impossible to measure without substantial measurement error? To answer this question
the experience of other countries can be instructive.
The UK has had a continuous consumption survey, the Family Expenditure Survey (FES), since the
late 1960s and consistent consumption definitions can be constructed since 1974. A number of
studies have compared the evolution of the FES data to the national accounts data. The contrast with the US
situation is stunning: Tanner (1998) shows that when grossing up the FES data, one gets nearly 95%
of national accounts PCE, and the difference is easily accounted for by definitional differences
(mainly owner occupied housing, which is not included in the FES, and the exclusion of
institutionalized individuals). In Figure 5, taken from Attanasio, Blow, Hamilton and Leicester
(2004), we compare rates of growth of consumption in the FES and in the national accounts:
while there are some blips (notably the 1991-1992 data), the correspondence between the two
series is remarkable.
[2] Another comforting feature is that if one looks at the retrospective question asked in the Diary on food in,
one gets figures that are remarkably close to those of the Interview Survey.
The methodology used in the FES is much more similar to the one used in the Diary Survey
than to the one used in the Interview Survey, in that it is based on two-week diaries. However, it also has a considerable
retrospective component and a number of procedures that capture accurately several important items
(such as utilities etc.). Finally, it is larger than the CEX diary (at about 7,000 households per year) for
a population that is considerably smaller (and presumably more homogenous) than the one in the US.
III. Consumption and Income Poor Households in the US
In this section, we use the CEX to identify poor households according to income and consumption.
In doing so we have to select a sample, define consumption and income and so on. Many of these
choices are arbitrary and a more complete analysis would explore several alternatives and develop
several loose ends. However, the material we present is meant to be indicative of the type of analysis
that can be conducted with these data.
The sample is made up of households headed by an individual aged 25 to 60, excluding single
mothers. This implies that we exclude important groups that might be of particular interest in
the analysis of poverty, such as elderly individuals and single mothers.
All figures are deflated using the CPI. As income we consider before tax total family income. The
main motivation, at this point, is the fact that taxes are not extremely well measured in the CEX.
Some progress, however, can be made in this direction. As consumption we take total non durable
consumption expenditure. Both income and consumption are ‘equivalized’ using the OECD adult
equivalent scales. A household is defined as ‘poor’ in terms of consumption (income) if it has
consumption (income) below 60% of median consumption (income). Most of the analysis will be
executed both with data from the IS and the DS.
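The relative poverty definition above can be sketched in a few lines. This is a minimal
illustration under stated assumptions, not the code behind Figures 6 and 7: the notes say only
that the OECD adult equivalent scales are used, so the modified OECD scale (1.0 for the first
adult, 0.5 for each further adult, 0.3 for each child) is assumed here, and the five households
are made-up data.

```python
from statistics import median


def oecd_modified_scale(n_adults, n_children):
    """Adult-equivalent household size under the modified OECD scale (an assumption):
    1.0 for the first adult, 0.5 for each further adult, 0.3 for each child."""
    return 1.0 + 0.5 * (n_adults - 1) + 0.3 * n_children


def relative_poverty_rate(households, key, threshold=0.6):
    """Share of households whose equivalized `key` is below threshold * median."""
    equivalized = [
        h[key] / oecd_modified_scale(h["adults"], h["children"]) for h in households
    ]
    poverty_line = threshold * median(equivalized)
    poor = [value < poverty_line for value in equivalized]
    return sum(poor) / len(poor)


# Made-up monthly figures for five households (illustration only).
households = [
    {"adults": 1, "children": 0, "income": 900.0, "consumption": 1000.0},
    {"adults": 2, "children": 0, "income": 3000.0, "consumption": 2400.0},
    {"adults": 2, "children": 2, "income": 4200.0, "consumption": 3150.0},
    {"adults": 1, "children": 1, "income": 1300.0, "consumption": 1560.0},
    {"adults": 2, "children": 1, "income": 5400.0, "consumption": 3600.0},
]

income_poverty = relative_poverty_rate(households, "income")
consumption_poverty = relative_poverty_rate(households, "consumption")
print(income_poverty, consumption_poverty)
```

Because the threshold is a fraction of the median of the same distribution, the resulting rate is a
measure of dispersion in the lower half of that distribution rather than of absolute deprivation.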
In Figures 6 and 7 we plot the evolution of poverty rates over time computed in terms of income and
consumption, first with the Interview and then with the Diary survey. Several interesting features
emerge from these pictures. First, both in the Diary and the Interview Surveys, income poverty rates
are higher than consumption poverty rates, and income poverty rates are roughly comparable across
the two surveys, between 25% and 30%. In both surveys, the income poverty rates increase during the
1990s. The stories from the two surveys diverge when we consider consumption. First, the level of
consumption poverty rates is considerably larger when we look at the DS. It should be remembered
that the concept we are considering is a relative, rather than absolute, concept. Therefore, this
evidence points to a substantially different (and more unequal) distribution in the DS than in the IS.
For the IS the poverty rate is between 16% and 19%. Consistent with the puzzle discussed in the
previous section, poverty rates are substantially flat over time in the IS, while those from the DS
increase considerably. When looking at the whole distribution of consumption, it is difficult to
integrate the two databases, one of which contains reliable information on one set of commodities
and the other on the remaining set.
A fact that has received some attention (Meyer and Sullivan, 2003; Sabelhaus 2000) is that
households in the lowest income quantiles consume substantially more than their income. In Figures
8 and 9 we plot, for a few representative years, consumption and income (smoothed) against income
percentiles for the bottom 35% of the income distribution. Notice that in all the pictures, the
consumption graph is remarkably flat and, up to almost the 10th percentile, it is considerably larger
than income.
We next move to consider how the probability of being poor (in terms of income and consumption)
is related to membership of particular groups and how that has changed between the 1980s and
1990s. We do this by running a simple probit for the ‘poor’ indicator on a constant, a dummy for
group membership, a dummy for the decade and their interaction. The groups we consider (in turn)
are high school dropouts and blacks. Of course the size of the first group declines considerably
in the 1990s. We report the results of this exercise in Table 1 for high school dropouts and in
Table 2 for blacks. The coefficients should be interpreted as the change in the probability of being
poor (according to the relevant definition).
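The specification just described has one parameter for each of its four group-by-decade cells, so
it is saturated: whatever the link function, the fitted probabilities of such a model equal the
poverty rates in the four cells. The sketch below uses that fact to compute the implied changes in
probability directly from cell means. It is an illustration on made-up observations with
hypothetical function names; the figures in the tables may have been computed differently, for
instance by evaluating marginal effects at sample means.

```python
def cell_rate(rows, group, nineties):
    """Poverty rate among observations with the given group and decade dummies."""
    cell = [r["poor"] for r in rows if r["group"] == group and r["nineties"] == nineties]
    return sum(cell) / len(cell)


def saturated_effects(rows):
    """Probability differences implied by a model with a group dummy, a decade
    dummy and their interaction (fitted probabilities equal the cell means)."""
    p00 = cell_rate(rows, 0, 0)  # reference group, 1980s
    p10 = cell_rate(rows, 1, 0)  # group of interest, 1980s
    p01 = cell_rate(rows, 0, 1)  # reference group, 1990s
    p11 = cell_rate(rows, 1, 1)  # group of interest, 1990s
    return {
        "group": p10 - p00,
        "nineties": p01 - p00,
        "interaction": (p11 - p10) - (p01 - p00),
    }


def make_rows(group, nineties, n_poor, n_total):
    """Build n_total made-up observations for one cell, n_poor of them poor."""
    return [
        {"group": group, "nineties": nineties, "poor": 1 if i < n_poor else 0}
        for i in range(n_total)
    ]


# Made-up cells: poverty rates of 0.20, 0.50, 0.25 and 0.60 respectively.
rows = (
    make_rows(0, 0, 2, 10)
    + make_rows(1, 0, 5, 10)
    + make_rows(0, 1, 2, 8)
    + make_rows(1, 1, 6, 10)
)
effects = saturated_effects(rows)
print(effects)
```

Read this way, the interaction term is a difference in differences: it measures how much more (or
less) the poverty rate of the group changed between the decades than that of everybody else.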
Table 1: effect on the probability of being poor (high school dropouts)
                              Interview survey             Diary survey
                           Consumption     Income      Consumption     Income
                             poverty       poverty       poverty       poverty
High school dropouts          0.227         0.314         0.164         0.332
                             (0.005)       (0.005)       (0.015)       (0.016)
1990s                         0.003         0.015         0.014         0.012
                             (0.002)       (0.002)       (0.006)       (0.006)
High school drop* 1990s       0.015         0.066         0.034         0.040
                             (0.005)       (0.006)       (0.016)       (0.017)
Fraction of poor              0.174         0.269         0.251         0.267

Table 2: effect on the probability of being poor (blacks)
                              Interview survey             Diary survey
                           Consumption     Income      Consumption     Income
                             poverty       poverty       poverty       poverty
Black                         0.149         0.162         0.178         0.147
                             (0.006)       (0.007)       (0.019)       (0.021)
1990s                         0.006         0.023         0.020         0.018
                             (0.002)       (0.002)       (0.005)       (0.006)
black* 1990s                 -0.022        -0.012        -0.021        -0.020
                             (0.005)       (0.007)       (0.017)       (0.020)
Fraction of poor              0.174         0.269         0.251         0.267
In the Interview Survey, being a high school dropout increases the probability of being income poor
by 0.31 (from a base of 0.27) and increases the probability of being consumption poor by 0.227
(from a base of 0.17). In the 1990s there is a slight (and, in the case of consumption, insignificant)
increase in the probability of being poor. However, high school dropouts fare marginally worse,
especially in terms of income. If we look at the Diary figures, we find that, in terms of consumption,
high school dropouts have fared considerably worse in the 1990s than indicated by the IS. More
generally, the 1990s seem a worse decade than indicated by the IS.
The evidence on the effect of race is consistent in the two surveys. In both cases and for both
definitions of poverty, blacks are between 15 and 18 percentage points more likely to be poor. In both
surveys and for both definitions the 1990s see a slight (albeit, in most cases, not significant)
improvement. As before, we see that
the 1990s register an overall increase in poverty rates in three out of the four columns, the exception
being the consumption definition in the Interview Survey.
IV. Food shares
The study of expenditure shares and, more generally, demand systems has, of course, a long tradition.
However, we are not aware of many studies that have looked at Engel curves for the US and, in
particular, for poor households. In this section we present some preliminary and simple analysis of
food shares in the CEX sample we have been using and for some sub-samples.
We start by looking at the shares of food in and food out in total non durable consumption. In
Table 3 we report the median share for the whole sample, the sample of consumption poor and the
sample of households headed by a high school dropout. Moreover, we report the median in each of
the two decades in our sample. These data are from the Interview Survey.
As is to be expected, the share of food at home in total non durable consumption for the consumption
poor is considerably higher than in the overall sample. Moreover, while for the overall sample the
share increases slightly in the 1990s relative to the 1980s, the increase is much larger for the
consumption poor. For the high school dropouts, the share of food in lies between the two, but it goes up
at a similar pace as for the consumption poor.
Table 3: median share of food in non durable expenditure for the consumption poor
Interview survey
                    Total             Consumption poor     High school dropouts
               Food in  Food out     Food in  Food out      Food in  Food out
1980-1989       0.242    0.060        0.363    0.039         0.306    0.039
1990-2001       0.256    0.056        0.386    0.035         0.324    0.034
Next we estimate some simple Engel curves. In particular, we estimate the following relationship for
total food:
(1)   $w_{ij} = \theta X_i + \nu_1 \ln(q_i) + \nu_2 (\ln(q_i))^2 + \varepsilon_{ij}$
where $w_{ij}$ is the share of consumption of food in total non durable consumption by household $i$, $X_i$
is a vector of control variables, including year dummies, family composition and age variables as well
as socio-economic indicators, $q_i$ is total non durable consumption and $\varepsilon_{ij}$ is a residual term. An
equation like (1) can be derived from a (rank-two) demand system and describes how the expenditure
share for food (or other commodities) changes with total expenditure. Year dummies control for
changes in relative prices. Notice that the consumption share is allowed to be a quadratic function of
the log of total consumption, in line with Banks, Blundell and Lewbel (1996).
The estimation of equation (1) presents a number of econometric problems, chief among them the
possibility that total non durable expenditure is correlated with the residual term, either because of
unobservable taste shocks not completely captured by the observables X or because of the presence
of measurement error. The latter can be particularly problematic if one wants to estimate (1) using
individual level instruments (see Lewbel, 1996). In what follows we simply report the OLS estimates
of equation (1) for the whole sample and two sub-samples: the ‘consumption poor’ and the high
school dropouts.
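As a rough illustration of what OLS on equation (1) involves, the sketch below fits the food share
on a constant, ln(q) and its square by solving the normal equations. It is a simplified stand-in
and not the estimation behind Table 4: the control variables X are omitted, the data are synthetic
and noise-free, the function names are hypothetical, and nothing is done about the endogeneity and
measurement error concerns just mentioned.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic_engel_curve(total_expenditure, food_share):
    """OLS of the food share on a constant, ln(q) and ln(q)^2.
    Returns [constant, nu1, nu2]."""
    design = [[1.0, math.log(q), math.log(q) ** 2] for q in total_expenditure]
    k = 3
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * w for row, w in zip(design, food_share)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated from known coefficients (illustration only).
true_constant, true_nu1, true_nu2 = -0.1, 0.3, -0.03
q_values = [50.0, 100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0]
shares = [
    true_constant + true_nu1 * math.log(q) + true_nu2 * math.log(q) ** 2
    for q in q_values
]

constant, nu1, nu2 = fit_quadratic_engel_curve(q_values, shares)
print(round(constant, 4), round(nu1, 4), round(nu2, 4))
```

With real data one would add the controls as further columns of the design matrix and, given the
concerns above, treat the OLS coefficients as descriptive.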
Table 4: Engel curves for food
                          $\nu_1$      $\nu_2$       R^2      (n. obs.)
Whole sample              -0.105       0.00006      0.2034    (192639)
                          (0.007)     (0.00053)
Consumption poor           0.421      -0.045        0.0351    (32882)
                          (0.051)     (0.005)
High school dropouts       0.044      -0.012        0.1778    (26706)
                          (0.020)     (0.002)
In Table 4, to save space, we report only the estimates of $\nu_1$ and $\nu_2$ for the Interview Survey. The
estimates from the Diary Survey were qualitatively similar. The Engel curve for the whole
population is unremarkable: the quadratic term is very small and not significantly different from zero.
The linear term is negative, indicating that food is a necessity, consistent with the evidence in
Table 3. However, when we focus on the consumption poor or on the high school dropouts, things
become a bit more interesting. In both cases the quadratic term is strongly significant. The linear
term is now positive and the quadratic negative, indicating an inverse U shaped relationship, for the
poorest segment of the population, between the share of food and total non durable consumption
expenditure. Taken at face value, this evidence means that, at very low levels of total consumption,
food is a luxury, in that its share increases with total consumption. This type of pattern can be found
in data from several developing countries. While the peak of the curve is such that very few households are
on the increasing portion of the Engel curve, it is nonetheless interesting to notice that such a group
exists and that, for a much larger group, the share of food declines slowly with total consumption.
Clearly these results need to be investigated in depth before any strong statement can be made.
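The location of the peak follows from equation (1): the derivative of the share with respect to
ln(q) is ν1 + 2 ν2 ln(q), which is zero at ln(q) = -ν1 / (2 ν2) when ν2 is negative. The sketch below
applies this to the point estimates in Table 4. It ignores sampling error and the omitted controls,
and the units of q are those of the underlying data, which are not stated here, so the numbers are
only indicative. The function names are hypothetical.

```python
# Point estimates of (nu1, nu2) as reported in Table 4 (Interview Survey).
TABLE_4 = {
    "whole sample": (-0.105, 0.00006),
    "consumption poor": (0.421, -0.045),
    "high school dropouts": (0.044, -0.012),
}


def share_slope(nu1, nu2, log_q):
    """Derivative of the food share with respect to ln(q) in equation (1)."""
    return nu1 + 2.0 * nu2 * log_q


def peak_log_expenditure(nu1, nu2):
    """ln(q) at which the share is maximal; None when nu2 >= 0 (no maximum)."""
    if nu2 >= 0:
        return None
    return -nu1 / (2.0 * nu2)


peaks = {name: peak_log_expenditure(nu1, nu2) for name, (nu1, nu2) in TABLE_4.items()}
for name, peak in peaks.items():
    print(name, "no interior maximum" if peak is None else round(peak, 2))
```

On these estimates the share peaks at ln(q) of about 4.68 for the consumption poor and about 1.83
for high school dropouts. A household is on the increasing portion of the curve only if its log
total consumption is below the relevant value.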
V. Conclusions and thoughts for future research
In this note we have discussed three issues: (i) consumption data quality and reliability; (ii) the
dynamics of poverty and inequality in the last two decades and (iii) the composition of consumption
among the poor and in particular the share of food. Each of these issues constitutes an important
research topic in its own right, so that the discussion here was necessarily superficial. However, the
discussion above gives an idea of an entire research agenda that can be opened. We conclude these
notes with a list of potential topics that develop what is discussed above.
(i) What more can be learned about the evolution of consumption distributions by
combining data sets? To what extent can one use structural models, economic theory
and other pieces of evidence to fill in the missing bits?
(ii) Many other groups are worth looking at: single mothers, the elderly and children are
particularly important.
(iii) Much work can be done in exploring the composition of consumption. The evidence in
the last section is suggestive of the fact that food is not necessarily a necessity for the
poorest. There is much detailed information in the Diary survey about the components
of food. These could be analyzed to find out how and what the poor (according to
different definitions) eat.
(iv) Exploring the composition of consumption in other dimensions and over time can also
be useful to establish the importance of different mechanisms the poor use to cope with
shocks. In this respect it could be important to exploit time and geographic variation.
References
Attanasio, Battistin and Ichimura (2004): “What really happened to consumption inequality in the US”
Attanasio, Blow, Hamilton and Leicester (2004): “Booms and Busts in the UK”
Banks, Blundell and Lewbel (1996)
Lewbel (1996)
Meyer and Sullivan (2003)
Figure 1: CEX and PCE
[Figure not reproduced. Series plotted: CEX and PCE.]
Figure 2: CEX Diary and Interview Survey
Mean log expenditure on non-durables
[Figure not reproduced. Series plotted: Interview data and Diary data; horizontal axis: year, 1982 to 2001.]
Figure 3: Inequality in the Diary and Interview data
Standard deviation of log expenditure on non-durables
[Figure not reproduced. Series plotted: Interview data and Diary data; horizontal axis: year, 1982 to 2001.]
Figure 4. Combining information from Diary and Interview Survey
What happens to overall consumption inequality before 1986?
[Figure not reproduced. Series plotted: Combined and Interview data.]
Figure 5. Real per capita expenditure growth rates, UK FES and ONS (National Accounts), 1975-2001
[Figure not reproduced. Series plotted: FES and ONS; vertical axis: per cent; horizontal axis: year.]
Source: National Accounts from ONS; Author’s calculations from FES/EFS
Figure 6: Poverty Rates (threshold is 60% of median in each year): Interview Survey
[Figure not reproduced. Series plotted: (mean) povinco and (mean) povcons; horizontal axis: year.]
Figure 7: Poverty Rates: Diary Survey (from 1986 only)
[Figure not reproduced. Series plotted: (mean) povinco and (mean) povcons; horizontal axis: year.]
Figure 8: Mean expenditure and income by income percentile, 1985, 1990, 1995, 2000
(Interview survey), bottom 35% of income distribution
[Figure not reproduced. Series plotted: mean monthly expenditure and mean monthly income; one panel per year (1985, 1990, 1995, 2000); horizontal axis: income percentile (pinc).]
Figure 9: Mean expenditure and income by income percentile, 1990, 1995, 2000
(Diary Survey), bottom 35% of income distribution
[Figure not reproduced. Series plotted: mean monthly expenditure and mean monthly income; one panel per year (1990, 1995, 2000); horizontal axis: income percentile (pinc).]
Figure 10: Engel Curve for total food (food in + food out), Interview Survey, whole
population
[Figure not reproduced. Plot title: Engel curve for food; vertical axis: engel; horizontal axis: lnc1.]
Figure 11: Engel Curve for total food (food in + food out), Interview Survey,
consumption poor only
[Figure not reproduced. Plot title: Engel curve for food - Poor only; vertical axis: engelpov; horizontal axis: lnc1.]
Figure 12: Engel Curve for total food (food in + food out), Interview Survey, High
school dropouts
[Figure not reproduced. Plot title: Engel curve for food - High school dropouts; vertical axis: engeldrop; horizontal axis: lnc1.]