Empirical Welfare Analysis for Discrete Choice: Some New Results Debopam Bhattacharya

Empirical Welfare Analysis for Discrete Choice: Some New
Results
Debopam Bhattacharya
University of Cambridge
Spring, 2016.
Abstract
This paper explores the problem of empirical welfare-analysis in multinomial choice settings,
allowing for completely general consumer-heterogeneity and income-effects. We present results
on welfare-analysis under three practically important scenarios, viz., (i) simultaneous price-change
of multiple options, (ii) introduction/elimination of a choice-alternative, and (iii) choice
among non-exclusive goods. Our results do not follow from Bhattacharya (Econometrica, 2015).
In program-evaluation contexts, these results would enable estimation of compensated program-effects,
i.e., how much the subjects themselves value the program, and the resulting deadweight
loss. Welfare-analysis under income-endogeneity is also briefly discussed. The methods are
illustrated with hypothetical policy experiments using survey data from India on teenagers who
make the choice between school, work and home-stay.
1
Introduction
Welfare analysis, based on the compensating and equivalent variation (CV and EV, henceforth), lies
at the heart of economic policy evaluations. Although theoretically well-understood, these measures
are rarely calculated or reported as part of econometric program evaluation. In fact, empirical
studies – both reduced-form and structural – typically adopt a paternalistic view and evaluate a
policy intervention in terms of its uncompensated effects on individual outcomes. But they typically
ignore the heterogeneity of policy-impact on individual welfare which requires computation of
compensated effects. For example, in an educational context, researchers typically evaluate the
impact of a tuition subsidy program via its net effect on college-enrolment (c.f. Ichimura and
Taber, 2002, Kane, 2003). But they stop short of evaluating how much the subsidy is valued by
the potential beneficiaries themselves, viz., the lump-sum income transfer that would result in the
same individual utilities as the subsidy, and any resulting deadweight loss thereof.1 The present
paper develops methods for simple calculation of such welfare effects as part of program evaluation
studies in the common setting of discrete choice, without requiring the researcher to make any
restrictive assumptions on the nature of preference heterogeneity or income effects.
I am grateful to Phil Haile, Jerry Hausman, Whitney Newey and seminar participants at the University of
Wisconsin for feedback and suggestions. All errors are mine.
In practical settings, unobservable heterogeneity in consumer preferences makes empirical welfare
analysis a challenging problem. Nonetheless, some advances based on Roy's identity have
recently been made for the case of a continuous good, such as gasoline demand, under general
preference heterogeneity (c.f., Hausman and Newey, 2016 and Lewbel and Pendakur, 2015). However,
many real-life individual decisions involve discrete choice, such as college-attendance, choice
of commuting method, school-choice, retirement decisions, and so forth. The Roy's identity based
methods used for continuous choice cannot be applied in these settings owing to the non-smoothness
of individual demand in price and income. Until very recently, available methods for welfare
analysis in discrete choice settings were based on restrictive and arbitrary assumptions on preference
heterogeneity, e.g., quasi-linear preferences implying absence of income effects (c.f., Domencich
and McFadden, 1975, Small and Rosen, 1981), or parametrically specified utility functions and
heterogeneity distributions (Herriges and Kling, 1999, Dagsvik and Karlstrom, 2005, Goolsbee and
Klenow, 2006). Two key concerns with these parametric approaches are (a) model mis-specification
leading to erroneous substantive conclusions, and (b) identification of welfare distributions solely
from functional form assumptions.
Recently, Bhattacharya, 2015 (DB15, henceforth) has shown that for heterogeneous consumers
facing choice between mutually exclusive discrete alternatives, the marginal distributions of
equivalent/compensating variation (EV/CV) resulting from a ceteris paribus price-change of an alternative
can be expressed as closed-form transformations of conditional choice probabilities. These results
hold under fully unrestricted heterogeneity and income-effects across consumers. The present paper
complements DB15, and makes three contributions. First, in Section 2, it is shown that in a discrete
choice setting, welfare effects of simultaneous price changes of several alternatives continue to
remain well-defined whereas the analog of Marshallian Consumer Surplus becomes path-dependent.
Next, EV/CV distributions are shown to be expressible as closed-form transformations of estimable
choice-probabilities (Theorem 1). These results cover situations where some price changes are
negative, some zero and some positive. The key issue here is that although welfare effects of simultaneous
change in multiple prices are well-defined, their distributions cannot be obtained by iterating the
single price-change result of DB15. This is because the income at which welfare distributions are to
be evaluated varies in an unobservable way across individuals from the second iteration onwards,
and thus cannot be conditioned on. Consequently, new results are required for welfare analysis in
these situations.
1 Stiglitz (2000, page 276) notes that empirical researchers typically ignore income effects owing to the perceived
difficulty of calculating them, and Goolsbee (1999, page 10) points out that whereas the economic theory of welfare
largely relates to compensated elasticities, common program evaluation studies in public finance typically report
uncompensated effects. Hendren (2013) discusses this point and some related issues in more detail.
Multiple price-changes are the likely "general equilibrium" consequences of a single initial price
change of a product with substitutes. For example, a school-tuition subsidy in an area with child
labor is likely to raise children's wages in response. The impact of such simultaneous price-changes
on consumer welfare is usually the key consideration for regulatory agencies with regard to their
decision on whether to implement the proposed change (c.f., Willig et al, 1991). It has been common
practice in applications to use the so-called log-sum formula (c.f., Small and Rosen, 1981, Train,
2009) for welfare comparisons. This formula, though convenient, is based on strong, unsubstantiated
assumptions like absence of income effects and extreme valued heterogeneity, and potentially leads
to erroneous substantive conclusions (see our empirical example below for a concrete illustration of
such errors).2 In the present paper, we show that under completely general preference heterogeneity
and income effects, one can express welfare-distributions resulting from multiple simultaneous
price-changes in terms of choice probability functions, thereby reducing welfare-analysis to the problem
of estimating choice probabilities. Crucially, our welfare-expressions (a) hold when income effects
are non-negligible, as is likely for bigger purchases like children's education and consumer durables,
and (b) apply to arbitrary patterns of price changes across alternatives, thereby making the results
useful across a large range of empirical situations.
Section 3 of the present paper concerns welfare effects resulting from elimination of an existing
alternative, or, equivalently, "retrospective" calculation of welfare gain from introduction of a
choice alternative, e.g., banning teenage wage-labour in a poor country. Previous empirical studies
on this problem (e.g. Hausman's celebrated study on breakfast-cereal brands) either ignored consumer
heterogeneity, and implicitly assumed a "representative consumer" model (Willig et al, 1991,
Hausman, 1996, Hausman and Leonard, 2002, Hausman, 2003),3 and/or worked under restrictive
parametric assumptions (c.f., Petrin, 2002). In contrast, our results here allow for (a) unrestricted
consumer heterogeneity, and (b) arbitrary effects of introducing/eliminating the alternative on the
price of existing alternatives, thereby making them applicable to a wide variety of practical
situations.4 Importantly, these results are robust to failure of parametric assumptions, and clarify
which features of welfare distributions can and which ones cannot be identified from demand data
without functional form assumptions. Indeed, our key result, outlined in theorem 2 below, shows
that Hicksian welfare distributions in this case can again be expressed as closed-form transformations
of choice-probabilities without any assumption on heterogeneity. Nonetheless, identification
of the tails of these distributions requires observation of demand at prices where aggregate demand
at an adjusted income level reaches zero. At high income levels, such prices may not be observed
since producers have no incentive to set a price where aggregate demand for high income people is
zero. In that case, a lower bound on the welfare distributions can be obtained using the value of
aggregate demand at the highest observed price.5 A corollary of our main result is that the heuristic
empirical practice of calculating welfare effects of new goods by integrating choice probabilities
from the current price to infinity yields the average CV, if prices of substitutes remain unchanged;
however, calculation of the average EV and/or allowing for prices of substitutes to change entail
different expressions. These results can be used mutatis mutandis for analyzing welfare effects of
eliminating an existing alternative simply by reversing the labels of EV and CV.
Section 4 of this paper considers multinomial choice with non-exclusive alternatives. For example,
if a cable-TV company offering a sports package and a movie package raises product-prices,
then the resulting welfare calculations require new results because the packages are not exclusive
alternatives for a potential consumer. We show that for multiple non-exclusive discrete goods, the
welfare distribution resulting from a single price-change can be directly expressed as a closed-form
functional of choice-probabilities but that for multiple price-changes cannot, unlike the case of
multinomial choice among exclusive options. Accordingly, we also show that one can construct
non-trivial bounds on these distributions.
2 Some of these restrictive parametric assumptions have subsequently been replaced with less stringent parametric
assumptions, c.f. Dagsvik and Karlstrom, 2005, McFadden and Train, 2000, Herriges and Kling, 1999, Goolsbee and
Klenow, 2006. All of these still require specification of the dimension and distribution of unobserved heterogeneity
and functional form of utility functions about which no a priori information is available.
3 See Lewbel (2001) for an illuminating discussion of demand and welfare analysis with a representative consumer
vis-a-vis allowing for preference heterogeneity.
4 Welfare calculations in the two scenarios described above correspond to situations where the price vectors before
and after are given. The process through which the final price vector following the relevant change is calculated
requires modelling the supply side. See Hausman and Leonard, 2002, page 256-8, for an illustration of the methodology,
which is now well-known. See also Berry and Pakes, 1993 for an overview of this topic.
5 For common parametric models, like mixed logit, these issues are assumed away, and welfare distributions are
point-identified simply via functional form assumptions.
Taken together, these three results provide new insights into welfare analysis in discrete choice
situations, as well as providing practitioners with useful empirical tools for evaluating policy changes
in real-life settings. In section 5, we discuss practical implementation of these results, and state a
practically useful result, viz. that under income endogeneity and corresponding to a price increase,
the EV but not the CV can be used for legitimate welfare analysis even in the absence of instruments
or control functions.
A concrete empirical illustration of our methods is provided in Section 6. Using household
survey data from India on teenage children’s choice between school, wage-work and staying at
home, we calculate welfare impacts for two hypothetical policy changes - viz., (i) a tuition subsidy
accompanied by a rise in wage, and (ii) elimination of the work option via child-labor regulations.
In this application, income effects are shown to be large, and their effect on welfare calculations
important in both magnitude and in respect of their policy implications.
Proofs of all theoretical results and details of numerical calculations for the empirical illustration
are provided in the appendix.
2
Multiple Price-Changes
Set-up and notation: Consider a multinomial choice situation where alternatives are indexed by
$j = 1, \ldots, J$; individual income is denoted by $Y$, and price of alternative $j$ by $P_j$. Individual utility
from choosing alternative $j$ is $U_j(Y - P_j, \eta)$, $j = 1, \ldots, J$, where $\eta$ denotes individual heterogeneity
of unknown dimension; $\eta$ is distributed in the population with unknown marginal CDF $F(\cdot)$.
Define the structural choice probability for alternative $j$ evaluated at price vector $p$ and income
$y$, denoted $\{q_j(p, y)\}$, $j = 1, \ldots, J$, as
\[
q_j(p, y) = \int 1\left\{ U_j(y - p_j, r) > \max_{k \neq j} U_k(y - p_k, r) \right\} dF(r). \tag{1}
\]
In words, if we randomly sample individuals from the population, and offer the price vector $p$
and income $y$ to each sampled individual, then a fraction $q_j(p, y)$ will choose alternative $j$, in
expectation.
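As an illustration of equation (1), the following is a minimal simulation sketch, not part of the paper's method. The utility specification (an alternative-specific taste plus a type-specific coefficient on log numeraire), the function names `draw_types`, `utility` and `choice_probabilities`, and all numerical values are assumptions made purely for this example; the paper itself imposes no functional form.

```python
import math
import random

def draw_types(n, J, seed=0):
    """Draw n hypothetical consumer types eta = (theta, tastes)."""
    rng = random.Random(seed)
    return [(math.exp(rng.gauss(0.0, 0.5)),             # income sensitivity, > 0
             [rng.gauss(0.0, 1.0) for _ in range(J)])   # alternative-specific tastes
            for _ in range(n)]

def utility(j, c, eta):
    """Assumed U_j(c, eta): strictly increasing in the numeraire c."""
    theta, tastes = eta
    return tastes[j] + theta * math.log(c)

def choice_probabilities(prices, y, types):
    """Monte Carlo analogue of equation (1): share of types choosing each j
    when every type is offered the same price vector and income y."""
    counts = [0] * len(prices)
    for eta in types:
        utils = [utility(j, y - p, eta) for j, p in enumerate(prices)]
        counts[utils.index(max(utils))] += 1
    return [c / len(types) for c in counts]

types = draw_types(5000, J=3)
q_before = choice_probabilities([1.0, 2.0, 3.0], 10.0, types)
q_after = choice_probabilities([1.5, 2.0, 3.0], 10.0, types)  # alternative 1 dearer
print(q_before, q_after)
```

Because the same simulated types are offered both price vectors, the share choosing the first alternative cannot rise when only its own price rises, which mirrors the "structural" interpretation given in the text.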
Assumption 1 Assume that for each $\eta$ and for each $j = 1, \ldots, J$, the utility function $U_j(\cdot, \eta)$ is
strictly increasing.
Assumption 1 simply says that corresponding to making any choice, every consumer is strictly
better off if they have more numeraire left in their pocket. Note that this assumption leaves the
dimension of heterogeneity completely unspecified, and says nothing about how utility changes with
unobserved heterogeneity.
Consider arbitrary price changes from $p_0 \equiv (p_{10}, p_{20}, \ldots, p_{J0})$ to $p_1 \equiv (p_{11}, p_{21}, \ldots, p_{J1})$. Assume
without loss of generality that
\[
p_{J1} - p_{J0} \ge p_{J-1,1} - p_{J-1,0} \ge \cdots \ge p_{11} - p_{10}. \tag{2}
\]
That is, label the alternative with the smallest price change (the smallest could be a negative
number, representing a fall in price) alternative 1, the next smallest as alternative 2 and so on.
Define EV as the solution $S$ to the equation
\[
\max\{U_1(y - p_{11}, \eta), U_2(y - p_{21}, \eta), \ldots, U_J(y - p_{J1}, \eta)\}
= \max\{U_1(y - S - p_{10}, \eta), U_2(y - S - p_{20}, \eta), \ldots, U_J(y - S - p_{J0}, \eta)\}. \tag{3}
\]
Similarly, define CV as the solution $S$ to the equation
\[
\max\{U_1(y + S - p_{11}, \eta), U_2(y + S - p_{21}, \eta), \ldots, U_J(y + S - p_{J1}, \eta)\}
= \max\{U_1(y - p_{10}, \eta), U_2(y - p_{20}, \eta), \ldots, U_J(y - p_{J0}, \eta)\}. \tag{4}
\]
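To make definitions (3) and (4) concrete, the sketch below solves them for a single consumer by bisection. It assumes the same hypothetical utility as above (taste plus a positive coefficient on log numeraire); the names `best`, `bisect_increasing`, `cv` and `ev` and the numbers are illustrative only. The bracket uses the fact that both solutions must lie between the smallest and the largest price change.

```python
import math

def utility(j, c, eta):
    """Assumed U_j(c, eta): strictly increasing in the numeraire c."""
    theta, tastes = eta
    return tastes[j] + theta * math.log(c)

def best(income, prices, eta):
    """Maximised utility over alternatives at the given income and prices."""
    return max(utility(j, income - p, eta) for j, p in enumerate(prices))

def bisect_increasing(f, lo, hi, iters=80):
    """Root of a nondecreasing function f with f(lo) <= 0 <= f(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def price_change_bounds(p0, p1):
    diffs = [b - a for a, b in zip(p0, p1)]
    return min(diffs), max(diffs)

def cv(p0, p1, y, eta):
    """Solve (4): income added at new prices that restores initial utility."""
    target = best(y, p0, eta)
    lo, hi = price_change_bounds(p0, p1)
    return bisect_increasing(lambda s: best(y + s, p1, eta) - target, lo, hi)

def ev(p0, p1, y, eta):
    """Solve (3): income removed at old prices that yields final utility."""
    target = best(y, p1, eta)
    lo, hi = price_change_bounds(p0, p1)
    return bisect_increasing(lambda s: target - best(y - s, p0, eta), lo, hi)

eta = (1.0, [0.3, 0.0, -0.2])          # one hypothetical consumer type
p0 = [1.0, 2.0, 3.0]
cv_uniform = cv(p0, [1.5, 2.5, 3.5], 10.0, eta)   # every price rises by 0.5
ev_uniform = ev(p0, [1.5, 2.5, 3.5], 10.0, eta)
cv_unused = cv(p0, [1.0, 2.0, 4.0], 10.0, eta)    # only an unchosen option gets dearer
print(cv_uniform, ev_uniform, cv_unused)
```

When every price rises by the same amount, both measures equal that amount; when only an alternative the consumer does not choose becomes dearer, the CV is zero.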
Corresponding to the multiple price-changes, one can attempt to define the change in average
consumer surplus via the line-integral
\[
CS(L) = -\int_L \sum_{j=1}^{J} q_j(p, y)\, dp_j, \tag{5}
\]
where $L$ denotes a path from $p_0$ to $p_1$. The negative sign stems from the fact that a rise in price
leads to a loss in consumer surplus.
In the set-up described above, we first show that for arbitrary price changes, the EV and CV are
well-defined under assumption 1, but the Marshallian Consumer Surplus is not. We then establish
the first key result of the paper, viz., that the marginal distributions of individual-level EV and CV
are point-identified, and closed-form expressions for their CDFs can be obtained as functionals of
the structural choice probabilities.
It is well known that for continuous choice with multiple prices changing simultaneously, the
Marshallian Consumer Surplus is generically undefined in the sense that the corresponding
line-integral is path-dependent, but the Hicksian CV and EV continue to remain well-defined (c.f.,
Tirole, 1988, page 11). Our first result establishes that the same conclusion is also true for discrete
choice.
Proposition 1 Consider the multinomial choice set-up with $J$ alternatives. Consider a price
change from $p_0 \equiv (p_{10}, p_{20}, \ldots, p_{J0})$ to $p_1 \equiv (p_{11}, p_{21}, \ldots, p_{J1})$. Under assumption 1, the individual
compensating and equivalent variations are uniquely defined.
(Proof in Appendix)
Marshallian Consumer Surplus: It can be shown that for discrete choice with multiple price
changes, the integral (5) is path-dependent. In the appendix, we demonstrate this for the case with
3 alternatives ($J = 3$). Thus it is no longer true that the average CS is identical to the average EV,
as found in DB15 for the case of a ceteris paribus price increase of a single alternative.
2.1
Welfare Distributions for Multiple Price-Changes
DB15 showed that in the above setting, when the price of a single alternative changes ceteris
paribus, the resulting CV and EV distributions can be expressed as closed-form functionals of
choice-probabilities. We begin by noting that one cannot iterate this single price-change result to
obtain the welfare distribution for simultaneous changes in the prices of multiple alternatives. To
see this, consider a choice among three alternatives ($J = 3$), and suppose that the price of alternative
1 changes from $p_{10}$ to $p_{11}$, and that of 2 from $p_{20}$ to $p_{21}$, and the price of 3 is unchanged at $p_3$. Suppose
we try to calculate the overall CV, starting with, say, the price change of alternative 1, followed by 2
(the order in which we do this does not matter, by path independence) and applying theorem 2 of
DB15 at each stage. Suppose the CVs corresponding to the two price changes are denoted by $S_1$
and $S_2$ respectively. Then, by definition,
\begin{align*}
& \max\{U_1(y - p_{10}, \eta), U_2(y - p_{20}, \eta), U_3(y - p_3, \eta)\} \\
={}& \max\{U_1(y + S_1 - p_{11}, \eta), U_2(y + S_1 - p_{20}, \eta), U_3(y + S_1 - p_3, \eta)\} \\
={}& \max\{U_1(y + \underbrace{S_1 + S_2}_{= S,\ \text{overall CV}} - p_{11}, \eta),
U_2(y + \underbrace{S_1 + S_2}_{S} - p_{21}, \eta),
U_3(y + \underbrace{S_1 + S_2}_{S} - p_3, \eta)\}.
\end{align*}
Then using theorem 2 of DB15, we can get the marginal distribution of $S_1$ but we cannot get the
marginal distribution of $S_1 + S_2$ because the prices of both alternatives 1 and 2 have changed between
lines 1 and 3 of the previous display and so theorem 2 of DB15 does not apply. Secondly, because
we cannot calculate the value of the CV $S_1$ for an individual (we can only calculate its distribution
across all individuals), we cannot apply theorem 2 of DB15 to calculate the marginal distribution
of $S_2$, since the income at which we could potentially apply the theorem depends on $S_1$ which is
unknown, unlike $y$ which is a fixed known constant. Thus, a new result is required for welfare
analysis corresponding to simultaneous changes in multiple prices, and it is given by the following
theorem.
Theorem 1 Consider the multinomial choice set-up with $J$ exclusive alternatives. Consider a price
change from $p_0 \equiv (p_{10}, p_{20}, \ldots, p_{J0})$ to $p_1 \equiv (p_{11}, p_{21}, \ldots, p_{J1})$. Assume without loss of generality that
\[
p_{J1} - p_{J0} \ge p_{J-1,1} - p_{J-1,0} \ge \cdots \ge p_{11} - p_{10}.
\]
Under assumption 1, the marginal distribution of the individual EV evaluated at income $y$ is given
by
\[
\Pr(EV \le a) =
\begin{cases}
0 & \text{if } a < p_{11} - p_{10}, \\
\sum_{k=1}^{j} q_k\left(p_{11}, \ldots, p_{j1}, p_{j+1,0} + a, \ldots, p_{J0} + a, y\right) & \text{if } p_{j1} - p_{j0} \le a < p_{j+1,1} - p_{j+1,0},\ a \ge 0,\ 1 \le j \le J - 1, \\
\sum_{k=1}^{j} q_k\left(p_{11} - a, \ldots, p_{j1} - a, p_{j+1,0}, \ldots, p_{J0}, y - a\right) & \text{if } p_{j1} - p_{j0} \le a < p_{j+1,1} - p_{j+1,0},\ a < 0,\ 1 \le j \le J - 1, \\
1 & \text{if } a \ge p_{J1} - p_{J0},
\end{cases} \tag{6}
\]
while that of the individual CV evaluated at income $y$ is given by
\[
\Pr(CV \le a) =
\begin{cases}
0 & \text{if } a < p_{11} - p_{10}, \\
\sum_{k=1}^{j} q_k\left(p_{11}, \ldots, p_{j1}, p_{j+1,0} + a, \ldots, p_{J0} + a, y + a\right) & \text{if } p_{j1} - p_{j0} \le a < p_{j+1,1} - p_{j+1,0},\ a \ge 0,\ 1 \le j \le J - 1, \\
\sum_{k=1}^{j} q_k\left(p_{11} - a, \ldots, p_{j1} - a, p_{j+1,0}, \ldots, p_{J0}, y\right) & \text{if } p_{j1} - p_{j0} \le a < p_{j+1,1} - p_{j+1,0},\ a < 0,\ 1 \le j \le J - 1, \\
1 & \text{if } a \ge p_{J1} - p_{J0},
\end{cases} \tag{7}
\]
where the $q_k$s are defined above in equation (1). (The separate entries for $a \ge 0$ and $a < 0$ in each line
of (6) and (7) arise from accommodating rise and fall of prices, respectively).
The distributional results cover positive, zero and negative price-changes. For example, in
the 3-alternative case, suppose alternative 2 is the outside option with price $p_{21} = p_{20} = 0$, and
$p_{11} - p_{10} < 0 < p_{31} - p_{30}$, then (7) becomes
\[
\Pr(CV \le a) =
\begin{cases}
0 & \text{if } a < p_{11} - p_{10}, \\
q_1(p_{11} - a, 0, p_{30}, y) & \text{if } p_{11} - p_{10} \le a < 0, \\
q_1(p_{11}, 0, p_{30} + a, y + a) + q_2(p_{11}, 0, p_{30} + a, y + a) & \text{if } 0 \le a < p_{31} - p_{30}, \\
1 & \text{if } a \ge p_{31} - p_{30}.
\end{cases}
\]
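The three-alternative example can be checked numerically. The sketch below is only an illustration under an assumed utility (taste plus a positive coefficient on log numeraire) and made-up prices; the function names `q`, `cv_cdf_theorem1` and `cv_direct` are hypothetical. It evaluates the CV distribution in (7) from simulated choice shares and compares it with the share of simulated consumers whose individually solved CV, from definition (4), falls below the same threshold.

```python
import math
import random

def utility(j, c, eta):
    """Assumed U_j(c, eta): strictly increasing in the numeraire c."""
    theta, tastes = eta
    return tastes[j] + theta * math.log(c)

def q(prices, y, types):
    """Simulated choice shares at a common price vector and income."""
    counts = [0] * len(prices)
    for eta in types:
        utils = [utility(j, y - p, eta) for j, p in enumerate(prices)]
        counts[utils.index(max(utils))] += 1
    return [c / len(types) for c in counts]

def cv_cdf_theorem1(a, p0, p1, y, types):
    """Pr(CV <= a) as in (7); alternatives must be sorted by p1[j] - p0[j]."""
    diffs = [b - c for c, b in zip(p0, p1)]
    if a < diffs[0]:
        return 0.0
    if a >= diffs[-1]:
        return 1.0
    j = sum(1 for d in diffs if d <= a)      # alternatives with price change <= a
    if a >= 0:
        prices, income = p1[:j] + [p + a for p in p0[j:]], y + a
    else:
        prices, income = [p - a for p in p1[:j]] + p0[j:], y
    return sum(q(prices, income, types)[:j])

def cv_direct(p0, p1, y, eta, iters=60):
    """Individual CV from definition (4), solved by bisection."""
    def best(income, prices):
        return max(utility(j, income - p, eta) for j, p in enumerate(prices))
    target = best(y, p0)
    diffs = [b - c for c, b in zip(p0, p1)]
    lo, hi = min(diffs), max(diffs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if best(y + mid, p1) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(1)
types = [(math.exp(rng.gauss(0.0, 0.5)), [rng.gauss(0.0, 1.0) for _ in range(3)])
         for _ in range(2000)]
# Alternative 1 gets cheaper, alternative 2 is an outside option, 3 gets dearer.
p0, p1, y = [2.0, 0.0, 3.0], [1.5, 0.0, 4.0], 10.0
results = {}
for a in (-0.2, 0.4):
    formula = cv_cdf_theorem1(a, p0, p1, y, types)
    direct = sum(cv_direct(p0, p1, y, eta) <= a for eta in types) / len(types)
    results[a] = (formula, direct)
print(results)
```

The thresholds are chosen away from the price changes themselves, where the CV distribution has mass points and a numerical root-finder could land on either side.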
Finally, note that in the above results, the CDFs are non-decreasing because of assumption 1.
For instance, for the CV, we require
\[
q_1(p_{11}, p_{20} + a, p_{30} + a, y + a) \le q_1(p_{11}, p_{20} + a', p_{30} + a', y + a') \tag{8}
\]
whenever $p_{11} - p_{10} \le a < a'$. But this is true because the LHS is the probability of the event
\begin{align*}
& U_1(y + a - p_{11}, \eta) \ge \max\{U_2(y - p_{20}, \eta), U_3(y - p_{30}, \eta)\} \\
\Longrightarrow\ & U_1(y + a' - p_{11}, \eta) \ge \max\{U_2(y - p_{20}, \eta), U_3(y - p_{30}, \eta)\}, \text{ since } a' > a, \\
\Longleftrightarrow\ & U_1(y + a' - p_{11}, \eta) \ge \max\{U_2(y + a' - (p_{20} + a'), \eta), U_3(y + a' - (p_{30} + a'), \eta)\},
\end{align*}
whose probability is the RHS of the previous display. Indeed, inequality (8) can be interpreted as
a Slutsky/Revealed Preference inequality for discrete choice.
Remark 1 Note that the above results hold no matter whether price and income are endogenous or
not. Endogeneity affects how the choice probabilities will be estimated, not the relationship between
welfare distributions and the structural choice probabilities which is what the above results establish.
See Section 5 below for a more detailed discussion of welfare estimation under income-endogeneity.6
6 It is also implicit throughout (as in DB15) that the price changes do not alter the population of interest, e.g. a
tuition subsidy in a district might attract outsiders with a strong preference for education to migrate in, altering the
distribution of preferences relative to the status quo. In other words, the price changes considered here are assumed
to be modest enough to have no impact on the distribution of preference heterogeneity.
3
Two Extensions
3.1
Elimination of an Alternative
Consider a setting of multinomial choice among exclusive alternatives $\{1, \ldots, J + 1\}$. Suppose the
alternative $J + 1$ is eliminated subsequently, which can potentially affect consumer welfare by both
restricting the choice set and also by affecting the prices of other alternatives. Assume that we
have data on a cross-section of individual choices in the pre-elimination situation. We wish to
calculate the distribution of Hicksian welfare effects that would result from eliminating the $(J+1)$th
alternative. Previous analyses of the problem, notably Hausman, 2003, had provided convenient,
off-the-shelf formulae for estimation of welfare effects for a "representative" consumer, ignoring the
impact of unobserved heterogeneity in preferences. Incorporation of preference heterogeneity and
developing the associated welfare analysis reveals interesting differences between CV and EV based
formulae and identifiability of their distribution, as will be shown below.
Toward that end, consider an individual with initial income $y$ and unobserved heterogeneity $\eta$
whose utility from consuming alternative $j$ at price $p_j$ is given by $U_j(y - p_j, \eta)$. The problem is to
find the distribution of welfare arising from elimination of alternative $J + 1$ as $\eta$ varies across
individuals. Suppose from an initial price vector $(p_{11}, \ldots, p_{J1}, p_{J+1})$, the eventual price vector becomes
$(p_{10}, \ldots, p_{J0})$.
Define the CV as the income compensation in the final situation necessary to attain the initial
utility. Formally, CV is the solution $S$ to the equation
\[
\max\{U_1(y - p_{11}, \eta), \ldots, U_J(y - p_{J1}, \eta), U_{J+1}(y - p_{J+1}, \eta)\}
= \max\{U_1(y + S - p_{10}, \eta), \ldots, U_J(y + S - p_{J0}, \eta)\}.
\]
Analogously, the EV is defined as the income reduction in the initial situation necessary to attain
the eventual utility, i.e.,
\[
\max\{U_1(y - S - p_{11}, \eta), \ldots, U_J(y - S - p_{J1}, \eta), U_{J+1}(y - S - p_{J+1}, \eta)\}
= \max\{U_1(y - p_{10}, \eta), \ldots, U_J(y - p_{J0}, \eta)\}.
\]
Define the initial (i.e., pre-elimination) choice probabilities, for alternatives $k = 1, \ldots, J + 1$, as
\[
q_k(p_1, \ldots, p_{J+1}, y) = \Pr\left[ U_k(y - p_k, \eta) \ge \max_{j \in \{1, \ldots, J+1\} \setminus \{k\}} U_j(y - p_j, \eta) \right]. \tag{9}
\]
As before, assume WLOG that $p_{J0} - p_{J1} \ge \cdots \ge p_{10} - p_{11}$.
We now state the main result describing the marginal distribution of EV and CV defined above.
The proof appears in the appendix.
Theorem 2 Assume that for each $j = 1, \ldots, J + 1$, $U_j(\cdot, \eta)$ is strictly increasing. Let $q_k(\cdot, \ldots, \cdot, y)$
be as defined in (9). Then
\[
\Pr(CV \le a) =
\begin{cases}
0 & \text{if } a < p_{10} - p_{11}, \\
\sum_{k=1}^{j} q_k\left(p_{10}, \ldots, p_{j0}, p_{j+1,1} + a, \ldots, p_{J1} + a, p_{J+1} + a, y + a\right) & \text{if } p_{j0} - p_{j1} \le a < p_{j+1,0} - p_{j+1,1},\ 1 \le j \le J - 1,\ a \ge 0, \\
\sum_{k=1}^{j} q_k\left(p_{10} - a, \ldots, p_{j0} - a, p_{j+1,1}, \ldots, p_{J1}, p_{J+1}, y\right) & \text{if } p_{j0} - p_{j1} \le a < p_{j+1,0} - p_{j+1,1},\ 1 \le j \le J - 1,\ a < 0, \\
1 - q_{J+1}\left(p_{10}, \ldots, p_{J0}, p_{J+1} + a, y + a\right) & \text{if } p_{J0} - p_{J1} \le a,\ a \ge 0, \\
1 - q_{J+1}\left(p_{10} - a, \ldots, p_{J0} - a, p_{J+1}, y\right) & \text{if } p_{J0} - p_{J1} \le a < 0.
\end{cases}
\]
On the other hand,
\[
\Pr(EV \le a) =
\begin{cases}
0 & \text{if } a < p_{10} - p_{11}, \\
\sum_{k=1}^{j} q_k\left(p_{10}, \ldots, p_{j0}, p_{j+1,1} + a, \ldots, p_{J1} + a, p_{J+1} + a, y\right) & \text{if } p_{j0} - p_{j1} \le a < p_{j+1,0} - p_{j+1,1},\ 1 \le j \le J - 1,\ a \ge 0, \\
\sum_{k=1}^{j} q_k\left(p_{10} - a, \ldots, p_{j0} - a, p_{j+1,1}, \ldots, p_{J1}, p_{J+1}, y - a\right) & \text{if } p_{j0} - p_{j1} \le a < p_{j+1,0} - p_{j+1,1},\ 1 \le j \le J - 1,\ a < 0, \\
1 - q_{J+1}\left(p_{10}, \ldots, p_{J0}, p_{J+1} + a, y\right) & \text{if } p_{J0} - p_{J1} \le a,\ a \ge 0, \\
1 - q_{J+1}\left(p_{10} - a, \ldots, p_{J0} - a, p_{J+1}, y - a\right) & \text{if } p_{J0} - p_{J1} \le a < 0.
\end{cases}
\]
As in the previous theorem, the pairs of results for $a < 0$ and $a \ge 0$ correspond to which
existing alternatives have become more and less expensive, respectively, following the elimination
of the $(J+1)$th alternative. Also, note that in order to calculate the probabilities appearing in theorem
2, we need to observe adequate cross-sectional variation in the price of all $J + 1$ alternatives in the
pre-elimination period.
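Corollary 1 If elimination of the alternative has no effect on prices of the other alternatives, then
$p_{j1} = p_{j0}$ for all $j = 1, \ldots, J$, and the above results simplify to
\[
\Pr(CV \le a) =
\begin{cases}
0 & \text{if } a < 0, \\
1 - q_{J+1}\left(p_{10}, \ldots, p_{J0}, p_{J+1} + a, y + a\right) & \text{if } 0 \le a,
\end{cases} \tag{10}
\]
\[
\Pr(EV \le a) =
\begin{cases}
0 & \text{if } a < 0, \\
1 - q_{J+1}\left(p_{10}, \ldots, p_{J0}, p_{J+1} + a, y\right) & \text{if } 0 \le a.
\end{cases} \tag{11}
\]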
These corollaries have clear intuitive interpretation. For example, consider the result (10).
Recall that CV is defined as the solution $S$ to
\[
\max\{U_1(y - p_{10}, \eta), \ldots, U_J(y - p_{J0}, \eta), U_{J+1}(y - p_{J+1}, \eta)\}
= \max\{U_1(y + S - p_{10}, \eta), \ldots, U_J(y + S - p_{J0}, \eta)\}.
\]
Observe that any individual who does not consume the $(J+1)$th product in the pre-change situation
suffers no welfare change relative to the final situation when the product becomes unavailable.
Therefore, we only need to consider those individuals who consume the product in the pre-change
situation. The CV is positive and less than $a$ for those individuals who, with the additional income
$a$, would enjoy a higher utility than what they are currently getting from consuming the $(J+1)$th
alternative. Since buying this alternative at price $p_{J+1}$ with income $y$ yields the same utility as
buying it at price $p_{J+1} + a$ with income $y + a$, the probability of CV being less than $a$ equals the
probability of not buying the alternative at price $p_{J+1} + a$ when income is $y + a$.
Corollary 2 If eliminating the $(J+1)$th alternative has no effect on prices of the other alternatives,
then the average values of individual welfare change are given by
\[
E(CV) = \int_{p_{J+1}}^{\infty} q_{J+1}\left(p_{10}, \ldots, p_{J0}, r, y + r - p_{J+1}\right) dr, \tag{12}
\]
\[
E(EV) = \int_{p_{J+1}}^{\infty} q_{J+1}\left(p_{10}, \ldots, p_{J0}, r, y\right) dr, \tag{13}
\]
using change of variable $r = p_{J+1} + a$.
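Expression (13) can be illustrated with a small simulation. The sketch below assumes a hypothetical utility (taste plus a positive coefficient on log numeraire), under which nobody buys the eliminated alternative once its price reaches income, so the integral is truncated at $r = y$. It compares a trapezoid approximation of (13) with the average of individually computed EVs; the names `q_last`, `mean_ev_formula` and `mean_ev_direct` and all numbers are illustrative, and the closed-form individual EV used in the check is specific to the assumed utility.

```python
import math
import random

def utility(j, c, eta):
    """Assumed U_j(c, eta): strictly increasing in the numeraire c."""
    theta, tastes = eta
    return tastes[j] + theta * math.log(c)

def q_last(prices, y, types):
    """Simulated share choosing the last alternative (the one to be eliminated)."""
    last = len(prices) - 1
    buyers = 0
    for eta in types:
        utils = [utility(j, y - p, eta) for j, p in enumerate(prices)]
        buyers += utils.index(max(utils)) == last
    return buyers / len(types)

def mean_ev_formula(others, p_last, y, types, steps=200):
    """Trapezoid approximation of (13) on [p_last, y]; demand is zero at r = y."""
    h = (y - p_last) / steps
    values = [q_last(others + [p_last + i * h], y, types) for i in range(steps)]
    return h * (0.5 * values[0] + sum(values[1:]))

def mean_ev_direct(others, p_last, y, types):
    """Average individual EV: zero for non-buyers; for buyers, the income cut
    that makes the eliminated option as good as their best remaining option."""
    last = len(others)
    total = 0.0
    for eta in types:
        theta, tastes = eta
        v_rest = max(utility(j, y - p, eta) for j, p in enumerate(others))
        if utility(last, y - p_last, eta) > v_rest:
            c_star = math.exp((v_rest - tastes[last]) / theta)  # indifference numeraire
            total += (y - p_last) - c_star
    return total / len(types)

rng = random.Random(2)
types = [(math.exp(rng.gauss(0.0, 0.5)), [rng.gauss(0.0, 1.0) for _ in range(3)])
         for _ in range(1000)]
ev_formula = mean_ev_formula([1.0, 2.0], 3.0, 10.0, types)
ev_direct = mean_ev_direct([1.0, 2.0], 3.0, 10.0, types)
print(ev_formula, ev_direct)
```

The two numbers differ only by the discretisation error of the trapezoid rule, since the same simulated types enter both calculations. An analogous check for (12) would evaluate the choice share at income $y + r - p_{J+1}$ instead of $y$.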
The expression for expected EV is commonly used as a "heuristic" measure of the welfare effect
of introducing a new product. Thus it follows from the above discussion that if elimination of the
alternative entails no price change for the other alternatives, then the commonly used expression
is identical to the mean EV, but ceases to be so if one is interested in the mean CV, or prices of
substitutes also change.
In order to calculate the above expressions nonparametrically, a researcher needs to observe
demand up to the price where choice probability becomes zero. Typically, in a dataset, one is
unlikely to observe such prices, since producers have no incentive to raise prices where revenue is
zero. But then one can obtain a lower bound for $E(CV)$ by integrating up to the highest price
observed in the dataset where demand from consumers with income $y$ is non-zero. This is in contrast
to the case of price changes for multiple alternatives, reported in equation (6) above, where welfare
distributions are nonparametrically identified as long as the hypothetical price-changes are within
the range of the observed price data. Of course, for parametric choice probabilities, e.g. random
coefficient logit, expressions like (13) are identified directly from functional form.
3.2
Non-exclusive Discrete Choice
We now change the set-up described above to allow for non-exclusive choice. Accordingly, assume
that there are two binary choices which are non-exclusive among themselves. For example, suppose
choice 1 for a household is whether to subscribe to a sports package offered by a cable TV network
which costs $P_1$ and choice 2 is whether to subscribe to a movie package which costs $P_2$. A household
then has four exclusive options: $\{1\}$, $\{2\}$, $\{1, 2\}$, $\{0\}$ (where $\{0\}$ denotes choosing none of the two
packages) with utilities $U_1(Y - P_1, \eta)$, $U_2(Y - P_2, \eta)$, $U_{12}(Y - P_1 - P_2, \eta)$ and $U_0(Y, \eta)$,
respectively.
Consider the CV corresponding to a rise in the price of the sports package from $p_{10}$ to $p_{11}$ with
the price of the movie package fixed at $p_2$. The CV evaluated at income $Y = y$ is the solution $S$ to
the equation
\begin{align*}
& \max\{U_0(y + S, \eta), U_1(y + S - p_{11}, \eta), U_2(y + S - p_2, \eta), U_{12}(y + S - p_{11} - p_2, \eta)\} \\
={}& \max\{U_0(y, \eta), U_1(y - p_{10}, \eta), U_2(y - p_2, \eta), U_{12}(y - p_{10} - p_2, \eta)\}. \tag{14}
\end{align*}
The marginal distribution of CV corresponding to this single price change is point-identified in
this case. The intuition for this result is as follows. Suppose the price $P_1$ changes from $p_{10}$ to $p_{11}$
with the price $P_2$ fixed at $p_2$. Now, group options $\{1\}$ and $\{1, 2\}$ together (call it group A) and
options $\{0\}$ and $\{2\}$ together and call it group B. Define
\begin{align*}
\varepsilon &\overset{\text{def}}{=} (p_2, \eta), \\
V_A(y - p_1, \varepsilon) &\overset{\text{def}}{=} \max\{U_1(y - p_1, \eta), U_{12}(y - p_1 - p_2, \eta)\}, \\
V_B(y, \varepsilon) &\overset{\text{def}}{=} \max\{U_0(y, \eta), U_2(y - p_2, \eta)\}.
\end{align*}
Correspondingly (14) becomes
\[
\max\{V_A(y + S - p_{11}, \varepsilon), V_B(y + S, \varepsilon)\} = \max\{V_A(y - p_{10}, \varepsilon), V_B(y, \varepsilon)\}. \tag{15}
\]
If the $U$ functions are strictly increasing and continuous in the first argument for each $\eta$, then so
are $V_A(\cdot, \varepsilon)$ and $V_B(\cdot, \varepsilon)$ for each $\varepsilon$. Now we can apply theorem 1 of DB15 for binary choice to
this problem and get the marginal distribution of the compensating variation. For example, for
$0 \le a < p_{11} - p_{10}$, theorem 1 of DB15 gives
\begin{align*}
\Pr(CV \le a)
&= \Pr\left[V_B(y + a, \varepsilon) \ge V_A(y + a - (p_{10} + a), \varepsilon)\right] \\
&= \Pr\left[\max\{U_0(y + a, \eta), U_2(y + a - p_2, \eta)\} \ge
\max\{U_1(y + a - (p_{10} + a), \eta), U_{12}(y + a - (p_{10} + a) - p_2, \eta)\}\right] \\
&= q_0(p_{10} + a, p_2, y + a) + q_2(p_{10} + a, p_2, y + a).
\end{align*}
Thus
\[
\Pr(CV \le a) =
\begin{cases}
0 & \text{if } a < 0, \\
q_0(p_{10} + a, p_2, y + a) + q_2(p_{10} + a, p_2, y + a) & \text{if } 0 \le a < p_{11} - p_{10}, \\
1 & \text{if } a \ge p_{11} - p_{10}.
\end{cases}
\]
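The single price-change result for non-exclusive goods can be checked in the same simulated fashion. The sketch below is only illustrative: the utility over the four options (a taste for each option plus a positive coefficient on log numeraire), the names `utilities`, `q`, `cv_cdf` and `cv_direct`, and all numbers are assumptions. Options are ordered as none, sports only, movies only, both.

```python
import math
import random

def utilities(p1, p2, y, eta):
    """Assumed utilities of the four options [{0}, {1}, {2}, {1,2}]."""
    theta, tastes = eta
    spend = (0.0, p1, p2, p1 + p2)
    return [tastes[k] + theta * math.log(y - spend[k]) for k in range(4)]

def q(p1, p2, y, types):
    """Simulated shares of the four options at common prices and income."""
    counts = [0, 0, 0, 0]
    for eta in types:
        utils = utilities(p1, p2, y, eta)
        counts[utils.index(max(utils))] += 1
    return [c / len(types) for c in counts]

def cv_cdf(a, p10, p11, p2, y, types):
    """Pr(CV <= a) for a rise in P1 from p10 to p11 with P2 fixed at p2."""
    if a < 0:
        return 0.0
    if a >= p11 - p10:
        return 1.0
    shares = q(p10 + a, p2, y + a, types)
    return shares[0] + shares[2]            # options that do not contain good 1

def cv_direct(p10, p11, p2, y, eta, iters=60):
    """Individual CV from definition (14), solved by bisection on [0, p11 - p10]."""
    target = max(utilities(p10, p2, y, eta))
    lo, hi = 0.0, p11 - p10
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if max(utilities(p11, p2, y + mid, eta)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(3)
types = [(math.exp(rng.gauss(0.0, 0.5)), [rng.gauss(0.0, 1.0) for _ in range(4)])
         for _ in range(2000)]
p10, p11, p2, y, a = 1.0, 2.0, 1.5, 10.0, 0.3
cdf_formula = cv_cdf(a, p10, p11, p2, y, types)
cdf_direct = sum(cv_direct(p10, p11, p2, y, eta) <= a for eta in types) / len(types)
print(cdf_formula, cdf_direct)
```

The threshold $a = 0.3$ lies strictly between the two mass points of the CV distribution (zero for households that never take the sports package, and the full price rise for those who keep it), so the comparison is not sensitive to root-finding error.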
The key fact enabling us to write (14) as the binary choice CV (15) is that $P_2$ is being held
fixed at $p_2$; if $P_2$ also varied across individuals, then the distribution of $\varepsilon$ would vary beyond the
variation of $\eta$ and the binary formulation would no longer be applicable. Indeed, for multiple price
changes in the non-exclusive alternatives case, welfare distributions can no longer be written in
terms of choice probabilities. To see this, consider a simultaneous rise in $P_1$ and $P_2$ from $(p_{10}, p_{20})$
to $(p_{11}, p_{21})$. Assume that $0 < p_{11} - p_{10} < p_{21} - p_{20}$.
Then the CV is defined via
\begin{align*}
& \max\{U_0(y + S, \eta), U_1(y + S - p_{11}, \eta), U_2(y + S - p_{21}, \eta), U_{12}(y + S - p_{11} - p_{21}, \eta)\} \\
={}& \max\{U_0(y, \eta), U_1(y - p_{10}, \eta), U_2(y - p_{20}, \eta), U_{12}(y - p_{10} - p_{20}, \eta)\}. \tag{16}
\end{align*}
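Now,
\begin{align*}
\Pr[S = p_{11} - p_{10}]
&= \Pr\left[U_1(y - p_{10}, \eta) \ge \max\left\{
\begin{array}{l}
U_0(y, \eta),\ U_2(y - p_{20}, \eta),\ U_{12}(y - p_{10} - p_{20}, \eta), \\
U_0(y + p_{11} - p_{10}, \eta),\ U_2(y + p_{11} - p_{10} - p_{21}, \eta), \\
U_{12}(y - p_{10} - p_{21}, \eta)
\end{array}
\right\}\right] \\
&= \Pr\left[U_1(y - p_{10}, \eta) \ge \max\left\{U_2(y - p_{20}, \eta),\ U_{12}(y - p_{10} - p_{20}, \eta),\ U_0(y + p_{11} - p_{10}, \eta)\right\}\right]. \tag{17}
\end{align*}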
This equals the probability of an individual picking the bundle $(1, y - p_{10})$ when facing choice among the bundles $(1, y - p_{10})$, $(2, y - p_{20})$, $(12, y - p_{10} - p_{20})$ and $(0, y + p_{11} - p_{10})$. In order that we can identify this probability, there has to be a mass of individuals who face choice over these four bundles. This requires the four bundles to lie in the same budget set; in other words, we need a combination of alternative specific prices $\tilde p_1$, $\tilde p_2$ and income $\tilde y$ such that the four bundles are respectively identical to $(1, \tilde y - \tilde p_1)$, $(2, \tilde y - \tilde p_2)$, $(\{1,2\}, \tilde y - \tilde p_1 - \tilde p_2)$ and $(0, \tilde y)$, i.e.,
$$\begin{aligned}
\tilde y &= y + p_{11} - p_{10}, \\
\tilde y - \tilde p_1 &= y - p_{10}, \\
\tilde y - \tilde p_2 &= y - p_{20}, \\
\tilde y - \tilde p_1 - \tilde p_2 &= y - p_{10} - p_{20}.
\end{aligned}$$
But adding the first and fourth equations in the previous display we get
$$2\tilde y - \tilde p_1 - \tilde p_2 = 2y + p_{11} - 2p_{10} - p_{20}, \tag{18}$$
whereas adding the second and third equations gives
$$2\tilde y - \tilde p_1 - \tilde p_2 = 2y - p_{10} - p_{20}. \tag{19}$$
Now, equations (18) and (19) can hold simultaneously if and only if $p_{11} = p_{10}$, which reduces to the single price change case. Therefore, if both price changes are positive, we cannot observe anyone in the data who might face the choice between bundles $(1, y - p_{10})$, $(2, y - p_{20})$, $(12, y - p_{10} - p_{20})$ and $(0, y + p_{11} - p_{10})$; therefore, (17) cannot be expressed as a choice-probability. But for any generic probability distribution on $\eta$, the probability (17) will be positive.
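The incompatibility of the four bundle-matching conditions can be checked numerically. The sketch below is illustrative only (the function name and the numbers in the usage note are assumptions): it solves the first three conditions for the hypothetical income and prices and reports by how much the fourth condition then fails.

```python
def fourth_equation_gap(y, p10, p20, p11):
    """Solve the first three bundle-matching conditions for the hypothetical
    income and prices, then return LHS minus RHS of the fourth condition."""
    y_t = y + p11 - p10        # outside option bundle: (0, y + p11 - p10)
    p1_t = y_t - (y - p10)     # bundle 1 must leave y - p10 of the numeraire
    p2_t = y_t - (y - p20)     # bundle 2 must leave y - p20 of the numeraire
    lhs = y_t - p1_t - p2_t    # numeraire left after buying both goods
    rhs = y - p10 - p20        # what the fourth condition requires
    return lhs - rhs           # algebraically equal to p10 - p11
```

With `y = 100`, `p10 = 10`, `p20 = 20`, `p11 = 15`, the gap equals `p10 - p11 = -5`; it vanishes only when `p11 == p10`, in line with the comparison of (18) and (19).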
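One can nonetheless bound the probability (17) using revealed preference arguments. In particular, let
$$A = \left\{(\tilde p_1, \tilde p_2, \tilde y) : \begin{array}{l} \tilde y - \tilde p_1 \ge y - p_{10},\quad \tilde y - \tilde p_2 \le y - p_{20}, \\ \tilde y - \tilde p_1 - \tilde p_2 \le y - p_{10} - p_{20},\quad \tilde y \le y + p_{11} - p_{10} \end{array}\right\},$$
$$B = \left\{(\tilde p_1, \tilde p_2, \tilde y) : \begin{array}{l} \tilde y - \tilde p_1 \le y - p_{10},\quad \tilde y - \tilde p_2 \ge y - p_{20}, \\ \tilde y - \tilde p_1 - \tilde p_2 \ge y - p_{10} - p_{20},\quad \tilde y \ge y + p_{11} - p_{10} \end{array}\right\}.$$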
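Then an upper bound can be obtained as
$$\begin{aligned}
&\Pr\left[U_1(y - p_{10}, \eta) \ge \max\left\{\begin{array}{l} U_2(y - p_{20}, \eta),\; U_{12}(y - p_{10} - p_{20}, \eta), \\ U_0(y + p_{11} - p_{10}, \eta) \end{array}\right\}\right] \\
&\quad \le \inf_{(\tilde p_1, \tilde p_2, \tilde y) \in A} \Pr\left[U_1(\tilde y - \tilde p_1, \eta) \ge \max\left\{\begin{array}{l} U_2(\tilde y - \tilde p_2, \eta),\; U_{12}(\tilde y - \tilde p_1 - \tilde p_2, \eta), \\ U_0(\tilde y, \eta) \end{array}\right\}\right] \\
&\quad = \inf_{(\tilde p_1, \tilde p_2, \tilde y) \in A} q_1(\tilde p_1, \tilde p_2, \tilde y).
\end{aligned}$$
These bounds arise from the fact that if, e.g., $\tilde y - \tilde p_1 \ge y - p_{10}$, then $U_1(\tilde y - \tilde p_1, \eta) \ge U_1(y - p_{10}, \eta)$ by assumption 1, so that the probability of $U_1(y - p_{10}, \eta)$ exceeding a specific number is smaller than that of $U_1(\tilde y - \tilde p_1, \eta)$ exceeding that same number. Similarly, $\tilde y - \tilde p_2 \le y - p_{20}$ implies that $U_2(y - p_{20}, \eta) \ge U_2(\tilde y - \tilde p_2, \eta)$, so that the probability of $U_2(y - p_{20}, \eta)$ being exceeded by a number is smaller than that of $U_2(\tilde y - \tilde p_2, \eta)$ being exceeded by it, etc.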
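By a similar logic, a lower bound on (17) is given by
$$\begin{aligned}
&\Pr\left[U_1(y - p_{10}, \eta) \ge \max\left\{\begin{array}{l} U_2(y - p_{20}, \eta),\; U_{12}(y - p_{10} - p_{20}, \eta), \\ U_0(y + p_{11} - p_{10}, \eta) \end{array}\right\}\right] \\
&\quad \ge \sup_{(\tilde p_1, \tilde p_2, \tilde y) \in B} \Pr\left[U_1(\tilde y - \tilde p_1, \eta) \ge \max\left\{\begin{array}{l} U_2(\tilde y - \tilde p_2, \eta),\; U_{12}(\tilde y - \tilde p_1 - \tilde p_2, \eta), \\ U_0(\tilde y, \eta) \end{array}\right\}\right] \\
&\quad = \sup_{(\tilde p_1, \tilde p_2, \tilde y) \in B} q_1(\tilde p_1, \tilde p_2, \tilde y).
\end{aligned}$$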
These upper and lower bounds cannot coincide, implying that (17) cannot be point-identified unless $p_{11} = p_{10}$. To see this, note that by assumption 1, the only way for the lower and upper bound to coincide would be that the inf and sup are attained at the point of intersection of A and B where each weak inequality holds as an equality (otherwise the distribution of $\eta$ can be chosen such that the inf in the interior of A is strictly larger than the sup in the interior of B). But we saw above that this is impossible unless $p_{11} = p_{10}$.
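As an illustration of how such bounds behave, the sketch below computes the infimum over A and the supremum over B on a finite grid. Everything in it is an assumption made for illustration: the logit-style model with log numeraire (`A1`, `A2`, `A12`, `B_COEF`), the grid, and the price and income values. In this toy model the target probability (17) is known in closed form, so the bracketing can be seen directly; with real data `q1` would be an estimated choice probability.

```python
import math

A1, A2, A12, B_COEF = 0.5, 0.3, 0.6, 2.0  # hypothetical taste parameters

def prob_1_best(c0, c1, c2, c12):
    """Toy probability that option {1} beats {0}, {2}, {1,2} when the
    numeraire left over under each option is c0, c1, c2, c12."""
    w0 = math.exp(B_COEF * math.log(c0))
    w1 = math.exp(A1 + B_COEF * math.log(c1))
    w2 = math.exp(A2 + B_COEF * math.log(c2))
    w12 = math.exp(A12 + B_COEF * math.log(c12))
    return w1 / (w0 + w1 + w2 + w12)

def q1(p1, p2, y):
    """Choice probability of option {1} at observable prices and income."""
    return prob_1_best(y, y - p1, y - p2, y - p1 - p2)

def bounds(y, p10, p20, p11, grid):
    """Grid versions of sup over B (lower bound) and inf over A (upper bound)."""
    lower, upper = 0.0, 1.0
    for p1, p2, yt in grid:
        c1, c2, c12 = yt - p1, yt - p2, yt - p1 - p2
        in_a = (c1 >= y - p10 and c2 <= y - p20
                and c12 <= y - p10 - p20 and yt <= y + p11 - p10)
        in_b = (c1 <= y - p10 and c2 >= y - p20
                and c12 >= y - p10 - p20 and yt >= y + p11 - p10)
        if in_a:
            upper = min(upper, q1(p1, p2, yt))
        if in_b:
            lower = max(lower, q1(p1, p2, yt))
    return lower, upper

y, p10, p20, p11 = 100, 10, 20, 15
grid = [(p1, p2, yt) for yt in range(95, 116)
        for p1 in range(5, 26) for p2 in range(10, 36)]
lower, upper = bounds(y, p10, p20, p11, grid)
# Probability (17) in the toy model: the four bundles are compared directly.
target = prob_1_best(y + p11 - p10, y - p10, y - p20, y - p10 - p20)
```

Here `lower < target < upper`, and the two bounds stay apart because no grid point satisfies all four conditions with equality when `p11` differs from `p10`.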
The conclusion from the above discussion is that in a discrete choice setting when alternatives are non-exclusive, the distributions of individual welfare change for a single ceteris paribus price rise can be expressed directly as a choice probability, but for simultaneous increase in several alternatives’ prices, they can only be bounded. This is in contrast to the multinomial case with exclusive alternatives where welfare distributions are identified for both single (shown in DB15) and multiple (shown above) price changes. An intuition for the result is that in the non-exclusive case, there is a systematic relationship between the prices of the different (composite) options, so that in a dataset we can never observe independent price variations across the various (composite) options, no matter how many price combinations are observed across the individual alternatives.

Remark 2 Note that this distinction between the exclusive and non-exclusive options would not be apparent if one started with a parametric model of choice; indeed, welfare distributions would appear to be identifiable in both cases for arbitrary price-changes. This identification would, of course, arise solely from functional form assumptions.

4 Discussions on Implementation
The identification results reported in theorems 1 and 2, and the associated corollaries are fully nonparametric in that no functional form assumptions are required to derive them. When implementing these results in practical applications, one can therefore estimate conditional choice probabilities nonparametrically, e.g., using kernel or series regressions and use these estimates to calculate welfare distributions. For instance, recall the 3-alternative case, leading to (??), where we have, say, $p_{11} - p_{10} < 0 = p_{21} - p_{20} < p_{31} - p_{30}$. For any random variable $X$ with CDF $F(\cdot)$ and finite mean, the expectation satisfies
$$E(X) = -\int_{-\infty}^{0} F(x)\,dx + \int_{0}^{\infty} (1 - F(x))\,dx;$$
thus the mean CV from (??) is given by
$$-\int_{p_{11} - p_{10}}^{0} q_1(p_{11} - a, p_{20}, p_{30}, y)\,da + \int_{0}^{p_{31} - p_{30}} q_3(p_{11}, p_{20}, p_{30} + a, y + a)\,da, \tag{20}$$
where $q_j(p_1, p_2, p_3, y)$ denotes the choice probability of alternative $j$ when the prices of the three alternatives are $(p_1, p_2, p_3)$ and income is $y$. If the dataset is of modest size, so that kernel regressions are imprecise, then one can alternatively use a parametric approximation to the choice probabilities. Numerical integration routines are now available in popular software packages like STATA and MATLAB and can be used to calculate the integrals of choice probabilities, in the same way that consumer surplus was traditionally calculated by earlier researchers. A specific numerical illustration based on household survey data from India is provided below in Section 5.
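The integrals in (20) are one-dimensional and can be approximated with any quadrature rule. The sketch below uses a plain trapezoid rule; the function `q` is a hypothetical parametric stand-in for the estimated choice probabilities (its parameters `a` and `b` are illustrative assumptions), not the specification used in this paper.

```python
import math

def q(j, p, y, a=(0.2, 0.0, 0.5), b=2.0):
    """Hypothetical probability of alternative j (0-based index) at prices
    p = (p1, p2, p3) and income y; stands in for an estimated q_j."""
    w = [math.exp(a[k] + b * math.log(y - p[k])) for k in range(3)]
    return w[j] / sum(w)

def trapezoid(f, lo, hi, n=2000):
    """Composite trapezoid rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    inner = sum(f(lo + i * h) for i in range(1, n))
    return h * (0.5 * f(lo) + inner + 0.5 * f(hi))

def mean_cv(p0, p1, y):
    """Mean CV as in (20): price of alternative 1 falls, price of
    alternative 2 is unchanged, price of alternative 3 rises."""
    d1, d3 = p1[0] - p0[0], p1[2] - p0[2]
    if not (d1 < 0 < d3) or p1[1] != p0[1]:
        raise ValueError("expects p11 - p10 < 0 = p21 - p20 < p31 - p30")
    first = trapezoid(lambda a: q(0, (p1[0] - a, p0[1], p0[2]), y), d1, 0.0)
    second = trapezoid(lambda a: q(2, (p1[0], p0[1], p0[2] + a), y + a), 0.0, d3)
    return -first + second
```

By construction the result lies between the smallest and the largest price change, e.g. `mean_cv((30.0, 20.0, 10.0), (20.0, 20.0, 25.0), 100.0)` lies between -10 and 15.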
4.1 Endogeneity

In some applications, observed income may be endogenous with respect to individual choice. A natural example is when omitted variables, such as unrecorded education level, can both determine individual choice and be correlated with income. Under endogeneity, one can define welfare distributions either unconditionally, or conditionally on income, analogous to the average treatment effect and the average effect of treatment on the treated, respectively, in the program evaluation literature. In either case, the observed choice probabilities would potentially differ from the structural choice probabilities which appear in our welfare distribution results, and control function type restrictions would typically be required for estimating them.7 In this context, an important and useful insight, not previously noted, is that for a price-rise, the distribution of the income-conditioned EV is not affected by income endogeneity, whereas that of the CV is (for a fall in price, the conclusion is reversed for CV and EV respectively). To see why, recall the three alternative case discussed above, and define the conditional-on-income structural choice probabilities as
$$q_j^c(p_1, p_2, p_3, y', y) = \int 1\left\{U_j(y' - p_j, \eta) \ge \max_{k \ne j} U_k(y' - p_k, \eta)\right\} dF(\eta \mid y),$$
where $F(\eta \mid y)$ denotes the distribution of the unobserved heterogeneity $\eta$ for individuals whose current income is $y$. Now, for a real number $a$ satisfying $p_{11} - p_{10} \le a < p_{21} - p_{20}$, it is easy to see that similar to equations (??) and (??), the distributions of EV at $a$, evaluated at income $y$, conditional on current income being $y$, are given by
$$\Pr(EV \le a \mid Inc = y) = q_1^c(p_{11}, p_{20} + a, p_{30} + a, y, y),$$
while for CV it is given by
$$\Pr(CV \le a \mid Inc = y) = q_1^c(p_{11}, p_{20} + a, p_{30} + a, y + a, y).$$
Now, $q_1^c(p_{11}, p_{20} + a, p_{30} + a, y, y)$, by definition, is the fraction of individuals currently at income $y$ who would choose alternative 1 at prices $(p_{11}, p_{20} + a, p_{30} + a)$, had their income been $y$. But this is directly observable in the data since the realized and hypothetical incomes are the same, and therefore, no corrections are required owing to endogeneity. However, $q_1^c(p_{11}, p_{20} + a, p_{30} + a, y + a, y)$ is the fraction of individuals currently at income $y$ who would choose alternative 1 at prices $(p_{11}, p_{20} + a, p_{30} + a)$, had their income been $y + a$. This fraction is counterfactual and not directly estimable because the distribution of $\eta$ is likely to be different across people with income $y + a$ relative to those with income $y$, due to endogeneity. To summarize, if the objective of welfare analysis is to calculate the EV distribution resulting from price rise for individuals whose realized income equals the hypothetical income, then endogeneity of income is irrelevant to the analysis. This suggests that if exogeneity of income is suspect and no obvious instrument or control function is available, then a researcher can still perform meaningful welfare analysis based on the EV distribution at current income.

7 When individual choice data are available, endogeneity of price is typically of lesser concern, because an individual’s choice or her omitted characteristics are less likely to affect the market price she faces. However, if there are unobserved choice attributes, then price endogeneity may be an important empirical concern. Blundell and Matzkin (2015) and Berry and Haile (2015) have made important recent advances for handling price endogeneity in demand analysis with general heterogeneity.
5 An Illustration

We now provide a simple empirical illustration of theorems 2 and 3 by estimating average welfare effects resulting from two hypothetical policy changes in the context of children’s school attendance in India. This exercise is in the spirit of Hausman (1981)’s illustration of welfare calculations for wives’ labor supply in the United States. The data come from a large-scale household survey, conducted by the Indian National Sample Survey Organization in 2004-5. Our subsample of interest consists of approximately 25000 households from across India, which have a single child of secondary-school age (15-18 years) and belong to one of the two dominant religious groups, Hindu and Muslim. Each child faces three alternatives: stay at home (option 1), work for a wage (option 2), or attend school (option 3). We drop the two states of Kerala and Himachal Pradesh which typically score unusually high relative to the rest of India on human development indicators including children’s education. For income y, we use the monthly per capita household expenditure (mpce), as is standard in empirical studies on developing countries; we use the term "income" throughout the rest of this section to mean mpce. For potential price faced by a household related to school-attendance, we use the median annual educational expense (divided by 12) per school-going child across all households in the same income stratum in the village or urban block where the household resides.8 The reason for using the median expense is that some households might spend excessively large amounts on educational resources, which does not represent the true "price" envisaged by a typical household making the choice between sending and not sending their child to school. The price of working is taken to be the negative of the average wage for children aged 15-18 in the village or urban block where the household resides. The price of staying at home is normalized to zero. All money amounts are expressed as Indian rupees per month (Rs), with 1 Indian rupee=0.02 US dollars in 2004. Summary statistics are presented in Table 1.

8 An advantage of the Indian NSS is that it divides each village and urban block into two income strata (labelled "second stage stratum" in the survey documentation) and samples households from both strata. In forming our measure of potential price faced by a household, we use median school expenses across households in the same income stratum and the same village as the reference household. In cases where no household from a village and a specific income stratum sends any child to school, we use the median educational expenditure in the entire village as the potential price.
In Table 2, we show semi-elasticities for price and income from a multinomial logit regression of school-attendance and working conditional on a set of household characteristics, using home-stay as the base outcome. Results are shown separately by gender, for a "typical" child, defined as Hindu, at age 15 and coming from a household with 5 members, adult literacy rate 60% and at the lowest decile of monthly expenditure, viz., y = Rs 1800. These semi-elasticities represent the increase in probability of choosing the respective outcome corresponding to a doubling of prices or income. Income effects are seen to be highly significant for each activity and for both genders. This suggests that including income effects can potentially make a significant impact on welfare calculations.
To illustrate the calculation of welfare effects, we consider two hypothetical policy changes. The first is a simultaneous price change for schooling and working induced by a tuition subsidy and a potential general equilibrium increase in wages. The second is elimination of the work option with no induced change on school fees, as would result from enforcement of a child labor legislation. Details of all numerical calculations are shown in the Appendix.

Policy Change 1: For the first policy simulation exercise, we consider a change from initial price of schooling of Rs 790 and initial wage Rs 25 to a final school-price of Rs 45 and wage Rs 256 with the price of staying at home (i.e., the outside option) constant at 0. These changes in prices correspond to a 2 standard deviation fall from the ninth decile of school-fees and a 1.5 s.d. rise from the lowest decile for child wage. We calculate the average welfare gain at various hypothetical income values for these price changes for the typical household, defined above. Note that it is not a priori obvious which of the two effects will dominate: the tuition subsidy effect, which benefits the comparatively richer whose children are more likely to attend schools, or the wage rise effect, which benefits the comparatively poorer whose children work. How welfare effects vary with income is therefore an empirical question.
To use our formulae for welfare change in this context, we need to estimate the choice probabilities for schooling and working at the specified covariate values. We model these choice probabilities via cubic splines in prices and income (see appendix for details). This formulation should be interpreted as an approximation to the underlying nonparametric choice probabilities. The alternative to the above calculations is the Small and Rosen (1981) approach which ignores income-effects, assumes Type 1 extreme valued errors and leads to average welfare calculations via the popular "log-sum" formula, c.f., Train, 2009, page 52.
Results: In Figure 1, we plot the estimated mean welfare change at various values of income for female children. Two interesting features of the CV and EV graphs are evident from the figure, viz., (a) welfare effects rise with income, because at relatively higher levels of income, a larger proportion of children attend schools and gain from the fall in tuition (i.e. the tuition effect dominates), with the mean EV/CV reaching the entire tuition subsidy of Rs 745 at the highest income levels, and (b) the log-sum based estimates of welfare gains, which do not vary with income due to assumed lack of income effects, also do not lie in between the EV and CV estimates for most of the income range, suggesting potentially serious under/over estimation of welfare effects due to neglected income-effects. To focus on mean welfare at a specific income, in Table 3, we present results on mean welfare effects of the price changes at the lowest percentile of income, together with corresponding 95% confidence intervals, by gender. We see that welfare gains for male teenagers are slightly higher than for females owing to higher incidence of school-attendance. Furthermore, for both male and female children, the Logsum based estimates are outside the 95% confidence intervals for the average EV and CV. The last two columns in Table 3 present mean deadweight loss calculations, obtained by subtracting the mean welfare gains from the sum of the subsidy payments and the additional wage-payments of employers of child labourers. The entries show that deadweight losses are estimated to be much smaller (up to nearly 100 times) using the logsum method than the semiparametric estimates using our formulas above. This can also be inferred from Fig. 2, where the estimated mean welfare effects using the logsum formulae are much higher at low income levels.

The reason why ignoring income effects leads to over-estimation of mean welfare at low incomes and under-estimation at high incomes is intuitive. When income effects are ignored, the estimated choice probability at a specific price vector is effectively the true choice probability as a function of price and income averaged over the distribution of income. Since the good in question is normal, the income-averaged choice probability is higher than the true income-conditioned choice probability at lower incomes and lower than it at higher incomes. Given that subsidy-eligible individuals are typically poorer, over-estimating private welfare gains by ignoring income-effects should therefore be a bigger concern in contexts like the one above.
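For reference, the log-sum benchmark used in the comparison above can be sketched as follows. The sketch assumes the textbook multinomial logit form with systematic utilities linear in price, $V_j = \delta_j - \alpha p_j$, and no income effects; the names `delta` and `alpha` and the numbers in the usage note are illustrative assumptions, not estimates from this paper.

```python
import math

def logsum(values):
    """Numerically stable log(sum(exp(v))) over a list of utilities."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def logsum_welfare_gain(delta, p0, p1, alpha):
    """Mean welfare gain from moving prices p0 -> p1 under a logit model
    with V_j = delta_j - alpha * p_j (no income effects)."""
    v0 = [d - alpha * p for d, p in zip(delta, p0)]
    v1 = [d - alpha * p for d, p in zip(delta, p1)]
    return (logsum(v1) - logsum(v0)) / alpha
```

Because income does not enter, the result is the same at every income level, which is why the log-sum curve in such comparisons is flat. As a sanity check, if every price falls by the same amount the gain equals that amount: `logsum_welfare_gain([0.0, 1.0, 2.0], [0.0, -25.0, 790.0], [-7.0, -32.0, 783.0], 0.01)` is 7 up to rounding.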
Policy Change 2: The second policy change we consider is a hypothetical child-labor regulation, mandating removal of the "work" option. For simplicity, we assume that there is no change in school tuition as a result of the legislation since the fraction of working teenagers is relatively small. The corresponding welfare analysis can be performed using corollary 1 above. It is important to stress that here welfare loss refers to the loss in private utility from reduced choice, ignoring any external social benefit of eliminating child labor. Accordingly, consider a child with demographic characteristics given above, family income y = Rs 1800 (lowest decile) and facing school-fees of Rs 146 per month who initially had the option to work at wage Rs 111 per month (Rs 146 and Rs 111 being the respective medians in the data), and then the work option is eliminated. The question is what would be the average CV/EV resulting from this restriction on choice. Using corollary 1, these are given by integrating the adjusted probability of working:
$$E(EV) = \int_0^{\infty} q_2(-111 + a, 146, 1800 + a)\,da,$$
$$E(CV) = \int_0^{\infty} q_2(-111 + a, 146, 1800)\,da.$$
Ignoring income effects would lead to the Marshallian Consumer Surplus which is given by
$$E(CS) = \int_0^{\infty} \hat q_2(-111 + a, 146)\,da,$$
where $\hat q_2(p_2, p_3)$ is the estimated probability of working at wage $-p_2$ and school fee $p_3$.
Results: In Figure 2, we plot the mean welfare loss by gender against income, as measured via average EV, average CV and the average consumer surplus which ignores income effects. Both the CV and EV indicate that the welfare effect diminishes to zero rapidly as income rises; at higher income levels very few children work, and so eliminating the option to work has little impact on their welfare. In contrast, the average Marshallian consumer surplus does not vary with income due to assumed lack of income effects; more importantly, its magnitude does not lie between the EV and CV estimates for most of the income range, suggesting potentially serious under/over estimation of welfare effects. It is also apparent that male children, who engage in wage-work much more than females, suffer a larger welfare loss from reduced choice. In Table 4, we record the mean EV/CV at the first decile of income separately by gender; it is apparent that removal of the work option leads to a welfare loss for males equal to a sixth of average teenage wage, whereas that for females is about half that amount.
6 Summary and Conclusion

In this paper, we have shown how to conduct empirical welfare analysis in multinomial choice settings, allowing for completely general consumer heterogeneity and income-effects. The paper considers three scenarios: (a) simultaneous change in prices of multiple alternatives, (b) the introduction of a new alternative or elimination of an existing alternative, possibly accompanied by price-changes of other alternatives; and (c) situations where choice-alternatives are non-exclusive. The key contributions are to show that (i) Hicksian welfare changes are well-defined under a mild monotonicity assumption on utilities in all these cases, (ii) in cases (a) and (b) the marginal distributions of CV and EV can be expressed as simple closed form functions of choice probabilities without making any assumption on the functional forms of utilities, preference heterogeneity or income effects, and (iii) this last conclusion fails when alternatives are non-exclusive. Our results are empirically illustrated through a hypothetical policy simulation using school/work choice data for Indian teenagers. The illustration shows that in real empirical examples, income effects can be substantial, and that mean welfare effects vary systematically by income, a highly policy-relevant point that would be missed if one uses conventional "log-sum" type formulae popular in applied work. In particular, corresponding to elimination of the "work-for-wage" option, private welfare losses will be systematically over-estimated at relatively higher incomes and under-estimated at lower incomes if one erroneously assumes away income-effects. These shortcomings can be easily corrected by the methods developed in the present paper. More broadly, our methods can be used in program evaluation studies to calculate the program’s value to the subjects themselves, i.e., compensated treatment effects, and the resulting deadweight loss, without any assumption on the nature of preference heterogeneity and income effects in the population.
Table 1: Summary Statistics, N=24,793

Variable            Mean     Std. Dev.   Min   Max
At home             0.19     0.39        0     1
Working             0.20     0.40        0     1
In school           0.61     0.49        0     1
School-fees         168      383.41      0     12000
Local wage          123      170         0     1800
Monthly Expense     4063     6030        1     80000
Age                 16.13    1.70        15    18
Hhd size            5.84     2.53        1     36
Female              0.45     0.49        0     1
Fraction literate   0.34     0.23        0     1
Muslim              0.15     0.35        0     1
SCST                0.30     0.46        0     1

Notes: Summary statistics for children aged 15-18 in India, in 2004. Data are from the Indian National Sample Survey, 61st round, Schedule 10 and includes households from 31 states with a single child between 15-18 yrs.

Table 2: Semi-Elasticities (t-stat) for MN Logit; Base outcome=Home-stay

               Male, N=13,675                     Female, N=11,118
Covariate      Work             Attend School     Work             Attend School
School-fees    0.015 (8.50)     -0.02 (-11.41)    0.006 (7.66)     -0.04 (-11.7)
Local wage     0.004 (2.43)     -0.006 (-3.08)    0.002 (2.23)     -0.01 (-2.7)
Income         -0.071 (-10.7)   0.11 (14.09)      -0.03 (-9.57)    0.19 (14.6)

Notes: Increase in probability of choice due to doubling of prices or income, controlling for age, religion, household size and literacy, for a “typical” child, defined as Hindu, aged 15, coming from a household of size 5, literacy rate 60% and at lowest income decile, viz., Rs 1800. For example, the highlighted coefficient -0.04 says that a doubling of school fees from the average faced by a typical household reduces the probability of school attendance (vis-à-vis staying at home) of a girl by 4 percentage points.
[Figure 1: Mean Welfare Gain in Rupees per Month for Girls from Tuition Subsidy and Simultaneous Wage Rise, by Income. Horizontal axis: Income (0 to 10000); vertical axis: 600 to 750; series plotted: Mean_CV, Logsum, Mean_EV.]

Figure 1: Mean welfare gain in monthly rupees per capita based on the Logsum approximation, and semiparametric CV and EV resulting from tuition subsidy of Rs 538 and simultaneous wage rise of Rs 235, for Hindu teenage girls at age 15, living in households with household size 5 and adult literacy rate 60%, where 1 Rupee=0.02 US $.

Table 3: Mean Welfare Gain in Rupees per Month from Tuition Subsidy and Wage Rise by Gender, Evaluated at First Percentile of Income

                       Mean Welfare Gain   95% CI        Mean Deadweight Loss   95% CI
                       (Rs/month)                        (Rs/month)
Male
  Log-Sum              731                 [729, 732]    0.19                   [0.12, 0.40]
  Semiparametric CV    681                 [673, 689]    20.08                  [17.36, 23.46]
  Semiparametric EV    688                 [679, 696]    13.39                  [9.98, 18.89]
Female
  Log-Sum              715                 [711, 719]    0.47                   [0.03, 1.17]
  Semiparametric CV    638                 [616, 660]    34.45                  [9.37, 33.08]
  Semiparametric EV    661                 [643, 687]    26.62                  [14.19, 36.33]

Notes: Mean welfare gain in monthly rupees per capita evaluated at the first income percentile of Rs 1000, resulting from a tuition subsidy of Rs 745 and simultaneous wage rise of Rs 221, evaluated at age 15, for Hindu households of size 5 and adult literacy rate 60%, where 1 Rupee=0.02 US $.
[Figure 2: Mean Private Welfare Loss in Rupees per Month from Elimination of Work Option, by Income. Two panels: Female and Male. Horizontal axis: Income (0 to 10000); vertical axis: 0 to 50; series plotted in each panel: Mean_CV, Mean_EV, Mean_CS.]

Note: Mean welfare loss in Rupees per month per person from removal of wage-work option. Graph for Hindu teenagers at age 15, living in households of size 5 and adult literacy rate 60%. Mean welfare is evaluated at median monthly tuition of Rs 200, where 1 Rupee=0.02 US $.

Table 4: Mean Welfare Loss at Lowest Income Decile, by Gender, due to Elimination of Work Option for Teenagers

             Mean EV (Rs/month)   Mean CV (Rs/month)
Male         22.19                28.64
  Std Err    4.39                 5.55
Female       12.09                15.09
  Std Err    2.64                 4.35

Mean private welfare loss in monthly rupees per capita resulting from eliminating work for wage option, evaluated at monthly per capita expenditure level of Rs 1800 (lowest decile) for Hindu children at age 15, living in households of size 5 and adult literacy rate 60%. Mean income for the population is about Rs 4000, and mean wage Rs 124 where 1 Rupee=0.02 US $.
References
Berry, S. and P. Haile (2014). Identification in Differentiated Products Markets Using Market Level Data, Econometrica, 82 (5), pp. 1749-1798.
Bhattacharya, D. (2009). Inferring Optimal Peer Assignment from Experimental Data. Journal of the American Statistical Association 104, pp. 486-500.
Bhattacharya, D. (2015). Nonparametric Welfare Analysis for Discrete Choice, Econometrica, 83(2), 617–649.
Blundell, R. and J. Powell (2003). Endogeneity in Nonparametric and Semiparametric Regression Models, in Advances in Economics and Econometrics, Cambridge University Press, Cambridge, U.K.
Dagsvik, J. and A. Karlstrom (2005). Compensating Variation and Hicksian Choice Probabilities in Random Utility Models that are Nonlinear in Income, Review of Economic Studies 72 (1), 57-76.
Domencich, T. and D. McFadden (1975). Urban Travel Demand - A Behavioral Analysis, North-Holland, Oxford.
Goolsbee, A. 1999. Evidence on the High-Income Laffer Curve from Six Decades of Tax Reform. Brookings Papers on Economic Activity, Economic Studies Program, The Brookings Institution, vol. 30(2), pages 1-64.
Goolsbee, A. and P. Klenow. Valuing Consumer Products By The Time Spend Using Them: An Application To The Internet. American Economic Review, vol. 96, 108-113.
Hausman, J. (1981). Exact Consumer’s Surplus and Deadweight Loss, The American Economic Review, Vol. 71, 4, 662-676.
Hausman, J. (1996). Valuation of new goods under perfect and imperfect competition, The Economics of New Goods. University of Chicago Press, 207-248.
Hausman, J. and T. Leonard (2002). The Competitive Effects of a New Product Introduction: A Case Study, The Journal of Industrial Economics, Vol. 50 (3), pages 237–263.
Hausman, J. (2003). Sources of Bias and Solutions to Bias in the Consumer Price Index, The Journal of Economic Perspectives, Vol. 17, No. 1, 23-44.
Hausman, J. and Newey, W. (2015). Individual Heterogeneity and Average Welfare, CEMMAP working paper, 42/14, forthcoming, Econometrica.
Hendren, N. (2013). The Policy Elasticity. NBER Working Paper #19177
Herriges, J and C. Kling (1999). Nonlinear Income Effects in Random Utility Models, The Review of Economics and Statistics, Vol. 81, No. 1, 62-72.
Hicks, J. (1946). Value and Capital, Clarendon Press, Oxford.
Ichimura, H., and C. Taber (2002). Semiparametric reduced-form estimation of tuition subsidies. American Economic Review (2002): 286-292.
Kaldor, N. (1939). Welfare propositions of economics and interpersonal comparisons of utility. The Economic Journal, pp. 549-552.
Kane, T. J. (2003). A quasi-experimental estimate of the impact of financial aid on college-going. No. w9703. National Bureau of Economic Research.
Kitamura, Y. and J. Stoye (2013). Nonparametric Analysis of Random Utility Models. Testing, mimeo. Cornell and Yale University.
Lewbel, A. (2001). Demand Systems with and without Errors, The American Economic Review, Vol. 91, No. 3.
Lewbel. A. and K. Pendakur (2015). Unobserved Preference Heterogeneity in Demand Using Generalized Random Coefficients, mimeo. Boston College.
D. McFadden, K. Train (2000). Mixed MNL models for discrete response. Journal of Applied Econometrics 15 (5), 447-470.
Newey, W. (1994). Kernel Estimation of Partial Means and a General Variance Estimator. Econometric Theory, 10, 233-253.
Petrin, A. (2002). Quantifying the Benefits of New Products: The Case of the Minivan, Journal of Political Economy, 110, 705-729.
Small, K. and Rosen, H. (1981). Applied Welfare Economics with Discrete Choice Models. Econometrica. Vol. 49, No. 1, 105-130.
Stiglitz, J (2000). Economics of the public sector. W. W. Norton.
Train, K. (2009). Discrete Choice Methods with Simulation, Cambridge University Press, Cambridge, UK.
Willig, R., S. Salop, and F. Scherer (1991). Merger analysis, industrial organization theory, and merger guidelines. Brookings Papers on Economic Activity. Microeconomics 281-332.
Appendix

Path-dependence of Line Integral Defining Marshallian Consumer Surplus for Discrete Choice

Consider a setting with three mutually exclusive alternatives with initial prices $p_0 \equiv (p_{10}, p_{20}, p_{30})$ and final prices $p_1 \equiv (p_{11}, p_{21}, p_{31})$. Let $y$ denote income and $q_j(p, y)$ denote the choice probability of alternative $j$ when the price vector is $p$ and income is $y$. Then the change in average consumer surplus arising from the price change from $p_0$ to $p_1$ can be defined via the line integral
$$CS(L) = \int_L q_1(p, y)\,dp_1 + q_2(p, y)\,dp_2 + q_3(p, y)\,dp_3,$$
where $L$ denotes a path $L(t)$ from $t = 0$ to $t = 1$ such that $L(0) = p_0 \equiv (p_{10}, p_{20}, p_{30})$ and $L(1) = p_1 \equiv (p_{11}, p_{21}, p_{31})$. Consider two different such paths
$$L_1(t) = \left(p_{10} + t(p_{11} - p_{10}),\; p_{20} + t(p_{21} - p_{20}),\; p_{30} + t(p_{31} - p_{30})\right),$$
$$L_2(t) = \left(p_{10} + t^2(p_{11} - p_{10}),\; p_{20} + t(p_{21} - p_{20}),\; p_{30} + t(p_{31} - p_{30})\right).$$
Then
$$CS(L_1) = \int_0^1 \left[\begin{array}{l} (p_{11} - p_{10})\, q_1(p_0 + t(p_1 - p_0), y) \\ +\, (p_{21} - p_{20})\, q_2(p_0 + t(p_1 - p_0), y) \\ +\, (p_{31} - p_{30})\, q_3(p_0 + t(p_1 - p_0), y) \end{array}\right] dt.$$
But
$$CS(L_2) = \int_0^1 \left[\begin{array}{l} 2t(p_{11} - p_{10})\, q_1(L_2(t), y) \\ +\, (p_{21} - p_{20})\, q_2(L_2(t), y) \\ +\, (p_{31} - p_{30})\, q_3(L_2(t), y) \end{array}\right] dt,$$
where $L_2(t)$ differs from $p_0 + t(p_1 - p_0)$ only in its first coordinate, by $(t^2 - t)(p_{11} - p_{10})$. This would in general differ from $CS(L_1)$. Thus the CS is not well-defined for simultaneous change in multiple prices. Note that if only $p_1$ changes, then
$$CS(L_1) = (p_{11} - p_{10}) \int_0^1 q_1(p_{10} + t(p_{11} - p_{10}), y)\,dt,$$
$$\begin{aligned}
CS(L_2) &= 2(p_{11} - p_{10}) \int_0^1 q_1(p_{10} + t^2(p_{11} - p_{10}), y)\, t\,dt \\
&= (p_{11} - p_{10}) \int_0^1 q_1(p_{10} + r(p_{11} - p_{10}), y)\,dr, \quad \text{substituting } t^2 = r, \\
&= CS(L_1),
\end{aligned}$$
and we get back path independence. Thus the loss of path-independence arises only for multiple simultaneous price-changes.
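A small numerical experiment illustrates the point. The sketch below is not taken from the paper: the choice probabilities `q` are a hypothetical example with income effects (probabilities proportional to `y - p_j`), and the midpoint rule is used for the line integral $\int_L \sum_j q_j\,dp_j$ along the two paths $L_1$ and $L_2$ defined above.

```python
def q(p, y=10.0):
    """Hypothetical choice probabilities with income effects:
    q_j is proportional to the numeraire y - p_j left after buying j."""
    w = [y - pj for pj in p]
    total = sum(w)
    return [wj / total for wj in w]

def line_integral(path, velocity, n=20000):
    """Midpoint-rule approximation of the integral over t in [0, 1] of
    sum_j q_j(path(t)) * d path_j / dt."""
    h = 1.0 / n
    acc = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        acc += h * sum(qj * vj for qj, vj in zip(q(path(t)), velocity(t)))
    return acc

def cs_along_paths(p0, p1):
    """Return (CS(L1), CS(L2)) for the linear path L1 and the path L2
    whose first coordinate moves with t**2 instead of t."""
    d = [b - a for a, b in zip(p0, p1)]
    l1 = lambda t: [a + t * dj for a, dj in zip(p0, d)]
    v1 = lambda t: d
    l2 = lambda t: [p0[0] + t * t * d[0]] + [a + t * dj for a, dj in zip(p0[1:], d[1:])]
    v2 = lambda t: [2.0 * t * d[0]] + d[1:]
    return line_integral(l1, v1), line_integral(l2, v2)

multi = cs_along_paths([0.0, 0.0, 0.0], [8.0, 4.0, 0.0])   # two prices change
single = cs_along_paths([0.0, 0.0, 0.0], [8.0, 0.0, 0.0])  # one price changes
```

In this example the two entries of `multi` differ, while the two entries of `single` agree up to quadrature error, matching the substitution argument above.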
Proof of Proposition 1
Proof. Let $p_0 = (p_{10}, p_{20}, \ldots, p_{J0})$ denote the initial price vector and $p_1 = (p_{11}, p_{21}, \ldots, p_{J1})$ denote the final price vector. Suppose that two numbers $S$ and $T$ with $S \neq T$ solve the equation for the EV. Then, by definition,
$$\max\{U_1(y - p_{11}, \eta), \ldots, U_J(y - p_{J1}, \eta)\} = \max\{U_1(y - p_{10} - S, \eta), \ldots, U_J(y - p_{J0} - S, \eta)\}, \tag{21}$$
and
$$\max\{U_1(y - p_{11}, \eta), \ldots, U_J(y - p_{J1}, \eta)\} = \max\{U_1(y - p_{10} - T, \eta), \ldots, U_J(y - p_{J0} - T, \eta)\}, \tag{22}$$
so that
$$\max\{U_1(y - p_{10} - T, \eta), \ldots, U_J(y - p_{J0} - T, \eta)\} = \max\{U_1(y - p_{10} - S, \eta), \ldots, U_J(y - p_{J0} - S, \eta)\}. \tag{23}$$
Each $U_j(\cdot, \eta)$, $j = 1, \ldots, J$, is strictly increasing by assumption 1. Hence if $S > T$, each term within $\{\}$ on the RHS of (23) will be strictly smaller than the corresponding term on the LHS. Therefore, each term within $\{\}$ on the RHS will be strictly smaller than the maximum value on the LHS. Since there are finitely many terms on the RHS, the maximum on the RHS will also be strictly smaller than the maximum on the LHS, a contradiction. Similarly, if $S < T$, then the RHS of (23) will be strictly larger than the LHS. Therefore, in order for (23) to hold, we must have $S = T$.
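The uniqueness argument rests on the gap between the two sides of the EV equation being strictly monotone in $S$, so a simple bisection recovers the unique solution for any given consumer. The following is a minimal sketch under assumed ingredients: the utility form $U_j(c, \eta) = \eta_j + s_j \ln c$, the slopes, the prices and the draw of $\eta$ are all hypothetical and serve only to illustrate the logic.

```python
import math

SLOPES = (1.0, 1.5, 2.0)  # assumed alternative-specific slopes, all positive

def indirect_utility(income, prices, eta, S=0.0):
    """max_j U_j(income - p_j - S, eta), with hypothetical U_j(c, eta) = eta_j + s_j * log(c)."""
    return max(e + s * math.log(income - p - S) for e, s, p in zip(eta, SLOPES, prices))

def solve_ev(income, p0, p1, eta, lo, hi, tol=1e-10):
    """Bisection for S solving max_j U_j(y - p_j0 - S) = max_j U_j(y - p_j1)."""
    target = indirect_utility(income, p1, eta)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The left-hand side is strictly decreasing in S, so the sign of the gap
        # tells us on which side of mid the unique root lies.
        if indirect_utility(income, p0, eta, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

income = 20.0
p0 = (1.0, 2.0, 3.0)      # initial prices (illustrative)
p1 = (1.0, 4.0, 8.0)      # final prices (illustrative)
eta = (0.3, 1.5, 0.5)     # one hypothetical draw of unobserved heterogeneity

# The EV must lie between the smallest and largest price change, here 0 and 5.
ev = solve_ev(income, p0, p1, eta, lo=0.0, hi=5.0)
print("EV for this consumer:", ev)
```

Because the gap is strictly decreasing in $S$, any bracketing interval containing the root leads to the same answer, which is the content of the proposition for this example.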
An exactly analogous argument works for the CV where the analogous equalities are
$$\begin{aligned} \max\{U_1(y - p_{11} + S, \eta), \ldots, U_J(y - p_{J1} + S, \eta)\} &= \max\{U_1(y - p_{10}, \eta), \ldots, U_J(y - p_{J0}, \eta)\} \\ &= \max\{U_1(y - p_{11} + T, \eta), \ldots, U_J(y - p_{J1} + T, \eta)\}. \end{aligned}$$
Again, by assumption 1, this implies $S = T$.
Proof of Theorem 1. First, consider EV.
Note that the EV is defined by
$$\max\{U_1(y - p_{11}, \eta), U_2(y - p_{21}, \eta), \ldots, U_J(y - p_{J1}, \eta)\} = \max\{U_1(y - S - p_{10}, \eta), U_2(y - p_{20} - S, \eta), \ldots, U_J(y - p_{J0} - S, \eta)\}. \tag{24}$$
The first step is to establish that
$$\{EV \le a\} \iff \left\{\begin{array}{l} \max\{U_1(y - p_{11}, \eta), U_2(y - p_{21}, \eta), \ldots, U_J(y - p_{J1}, \eta)\} \\ \quad \ge \max\{U_1(y - a - p_{10}, \eta), U_2(y - p_{20} - a, \eta), \ldots, U_J(y - p_{J0} - a, \eta)\} \end{array}\right\}. \tag{25}$$
Indeed, it is obvious from (24) that $EV \le a$ will imply the RHS inequality inside the $\{\}$ in (25). To see the converse, assume that
$$\max\{U_1(y - p_{11}, \eta), U_2(y - p_{21}, \eta), \ldots, U_J(y - p_{J1}, \eta)\} \ge \max\{U_1(y - a - p_{10}, \eta), U_2(y - p_{20} - a, \eta), \ldots, U_J(y - p_{J0} - a, \eta)\}. \tag{26}$$
Now (26) and equation (24) imply that
$$\max\{U_1(y - S - p_{10}, \eta), U_2(y - p_{20} - S, \eta), \ldots, U_J(y - p_{J0} - S, \eta)\} \ge \max\{U_1(y - a - p_{10}, \eta), U_2(y - p_{20} - a, \eta), \ldots, U_J(y - p_{J0} - a, \eta)\},$$
i.e., for all $j = 1, \ldots, J$, we have that
$$\max\{U_1(y - S - p_{10}, \eta), \ldots, U_J(y - p_{J0} - S, \eta)\} \ge U_j(y - p_{j0} - a, \eta).$$
If the maximum on the LHS of the previous display is the $k$th term, i.e., $U_k(y - S - p_{k0}, \eta)$, then choosing $j = k$ on the RHS, we have that
$$U_k(y - S - p_{k0}, \eta) \ge U_k(y - p_{k0} - a, \eta),$$
whence, applying assumption 1, it follows that $S \le a$. This establishes (25).
Now we work from (25) to derive the CDF of the EV. To do this, first note that if $p_{l1} - p_{l0} \le a < p_{l+1,1} - p_{l+1,0}$, $l = 1, \ldots, J - 1$, then the inequality (26) can hold only if the LHS max is one of $U_1(y - p_{11}, \eta), U_2(y - p_{21}, \eta), \ldots, U_l(y - p_{l1}, \eta)$ but not $U_{l+1}(y - p_{l+1,1}, \eta), \ldots, U_J(y - p_{J1}, \eta)$. To see this, suppose to the contrary that the max on the LHS of (26) is obtained for some $k$ satisfying $J \ge k > l$. Then (26) implies
$$U_k(y - p_{k1}, \eta) \ge \max_{j = 1, \ldots, J}\{U_j(y - a - p_{j0}, \eta)\} \ge U_k(y - a - p_{k0}, \eta).$$
Given assumption 1, i.e., monotonicity of $U_k(\cdot, \eta)$, it follows that $p_{k1} \le a + p_{k0}$, i.e., $a \ge p_{k1} - p_{k0}$, a contradiction since $k > l$.
Therefore, if $a$ satisfies $p_{l1} - p_{l0} \le a < p_{l+1,1} - p_{l+1,0}$, then
$$\begin{aligned} \Pr(EV \le a) &= \Pr\left[U_1(y - p_{11}, \eta) \ge \max\left\{\begin{array}{l} U_1(y - p_{11}, \eta), \ldots, U_l(y - p_{l1}, \eta), \ldots, U_J(y - p_{J1}, \eta), \\ U_1(y - a - p_{10}, \eta), U_2(y - p_{20} - a, \eta), \ldots, \\ U_l(y - a - p_{l0}, \eta), \ldots, U_J(y - p_{J0} - a, \eta) \end{array}\right\}\right] \\ &\quad + \Pr\left[U_2(y - p_{21}, \eta) \ge \max\left\{\begin{array}{l} U_1(y - p_{11}, \eta), \ldots, U_l(y - p_{l1}, \eta), \ldots, U_J(y - p_{J1}, \eta), \\ U_1(y - a - p_{10}, \eta), U_2(y - p_{20} - a, \eta), \ldots, \\ U_l(y - a - p_{l0}, \eta), \ldots, U_J(y - p_{J0} - a, \eta) \end{array}\right\}\right] \\ &\quad + \cdots \\ &\quad + \Pr\left[U_l(y - p_{l1}, \eta) \ge \max\left\{\begin{array}{l} U_1(y - p_{11}, \eta), U_2(y - p_{21}, \eta), \ldots, U_J(y - p_{J1}, \eta), \\ U_1(y - a - p_{10}, \eta), U_2(y - p_{20} - a, \eta), \ldots, \\ U_l(y - a - p_{l0}, \eta), \ldots, U_J(y - p_{J0} - a, \eta) \end{array}\right\}\right] \\ &\overset{(1)}{=} \Pr\left[U_1(y - p_{11}, \eta) \ge \max\left\{\begin{array}{l} U_2(y - p_{21}, \eta), \ldots, U_l(y - p_{l1}, \eta), \\ U_{l+1}(y - a - p_{l+1,0}, \eta), \ldots, U_J(y - p_{J0} - a, \eta) \end{array}\right\}\right] \\ &\quad + \cdots \\ &\quad + \Pr\left[U_l(y - p_{l1}, \eta) \ge \max\left\{\begin{array}{l} U_1(y - p_{11}, \eta), \ldots, U_{l-1}(y - p_{l-1,1}, \eta), \\ U_{l+1}(y - a - p_{l+1,0}, \eta), \ldots, U_J(y - p_{J0} - a, \eta) \end{array}\right\}\right], \end{aligned}$$
where the equality marked (1) uses the fact that $y - p_{j1} \ge y - a - p_{j0}$ for all $j = 1, \ldots, l$, since $p_{l1} - p_{l0} \le a$. The above probability equals
$$q_1(p_{11}, p_{21}, \ldots, p_{l1}, p_{l+1,0} + a, \ldots, p_{J0} + a, y) + q_2(p_{11}, p_{21}, \ldots, p_{l1}, p_{l+1,0} + a, \ldots, p_{J0} + a, y) + \cdots + q_l(p_{11}, p_{21}, \ldots, p_{l1}, p_{l+1,0} + a, \ldots, p_{J0} + a, y)$$
if $a \ge 0$ and equals
$$q_1(p_{11} - a, p_{21} - a, \ldots, p_{l1} - a, p_{l+1,0}, \ldots, p_{J0}, y - a) + q_2(p_{11} - a, p_{21} - a, \ldots, p_{l1} - a, p_{l+1,0}, \ldots, p_{J0}, y - a) + \cdots + q_l(p_{11} - a, p_{21} - a, \ldots, p_{l1} - a, p_{l+1,0}, \ldots, p_{J0}, y - a)$$
if $a < 0$. This is precisely expression (6).
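The result just derived can be checked by simulation. The sketch below draws heterogeneity, solves the defining equation (24) for each simulated consumer by bisection, and compares the empirical frequency of $\{EV \le a\}$ with the sum of choice probabilities at the adjusted prices $(p_{11}, \ldots, p_{l1}, p_{l+1,0} + a, \ldots, p_{J0} + a)$ and income $y$. Everything specific here is an assumption made for illustration: the utility form $U_j(c, \eta) = \eta_j + s_j \sqrt{c}$, the slopes, the prices, the normal distribution of $\eta$ and the evaluation point $a$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
y = 20.0
p0 = np.array([1.0, 2.0, 3.0])       # initial prices (illustrative)
p1 = np.array([1.0, 4.0, 8.0])       # final prices; price changes (0, 2, 5) are in increasing order
slopes = np.array([1.0, 1.5, 2.0])   # assumed alternative-specific slopes
eta = rng.normal(size=(n, 3))        # assumed distribution of unobserved heterogeneity

def utilities(net_income):
    """Hypothetical U_j(c, eta) = eta_j + s_j * sqrt(c), one row per simulated consumer."""
    return eta + slopes * np.sqrt(net_income)

# Solve max_j U_j(y - p_j0 - S) = max_j U_j(y - p_j1) for S, consumer by consumer.
target = utilities(y - p1).max(axis=1)
lo = np.zeros(n)
hi = np.full(n, (p1 - p0).max())
for _ in range(60):
    mid = 0.5 * (lo + hi)
    above = utilities(y - p0 - mid[:, None]).max(axis=1) > target
    lo = np.where(above, mid, lo)    # root lies above mid
    hi = np.where(above, hi, mid)    # root lies at or below mid
ev = 0.5 * (lo + hi)

a = 3.0
l = 2                                # alternatives 1..l have price change <= a < change of l+1
prices_cf = np.concatenate([p1[:l], p0[l:] + a])   # (p11, p21, p30 + a)
choice_cf = utilities(y - prices_cf).argmax(axis=1)

empirical = float(np.mean(ev <= a))       # direct frequency of {EV <= a}
formula = float(np.mean(choice_cf < l))   # q_1 + ... + q_l at the adjusted prices
print(empirical, formula)
```

Since both quantities are computed from the same draws, they should coincide up to ties and the bisection tolerance, not merely up to simulation noise.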
Next consider CV, defined as the solution $S$ to the equation
$$\max\{U_1(y + S - p_{11}, \eta), U_2(y + S - p_{21}, \eta), \ldots, U_J(y + S - p_{J1}, \eta)\} = \max\{U_1(y - p_{10}, \eta), U_2(y - p_{20}, \eta), \ldots, U_J(y - p_{J0}, \eta)\}. \tag{27}$$
The first step is to see that $CV \le a$ is equivalent to
$$\max\{U_1(y - p_{10}, \eta), U_2(y - p_{20}, \eta), \ldots, U_J(y - p_{J0}, \eta)\} \le \max\{U_1(y + a - p_{11}, \eta), U_2(y + a - p_{21}, \eta), \ldots, U_J(y + a - p_{J1}, \eta)\}. \tag{28}$$
The necessity is obvious by assumption 1. Sufficiency follows because (27) and (28) imply that
$$\max\{U_1(y + S - p_{11}, \eta), U_2(y + S - p_{21}, \eta), \ldots, U_J(y + S - p_{J1}, \eta)\} \le \max\{U_1(y + a - p_{11}, \eta), U_2(y + a - p_{21}, \eta), \ldots, U_J(y + a - p_{J1}, \eta)\}. \tag{29}$$
Then there must be at least one $j$ such that the RHS max is $U_j(y + a - p_{j1}, \eta)$, which, by (29), must be no smaller than $U_j(y + S - p_{j1}, \eta)$, implying by assumption 1 that $S \le a$.
Next, suppose $a$ satisfies $p_{j1} - p_{j0} \le a < p_{j+1,1} - p_{j+1,0}$. Then $y + a - p_{j+1,1} < y - p_{j+1,0}$, and therefore, by assumption 1, $U_{j+1}(y + a - p_{j+1,1}, \eta) < U_{j+1}(y - p_{j+1,0}, \eta)$. Therefore, the RHS max in (28) must be one of $U_1(y + a - p_{11}, \eta), \ldots, U_j(y + a - p_{j1}, \eta)$. Accordingly, we have that
$$\begin{aligned} \Pr(CV \le a) &= \sum_{k=1}^{j} \Pr\left[U_k(y + a - p_{k1}, \eta) \ge \max\left\{\begin{array}{l} U_1(y - p_{10}, \eta), \ldots, U_{k-1}(y - p_{k-1,0}, \eta), U_k(y - p_{k0}, \eta), \\ U_{k+1}(y - p_{k+1,0}, \eta), \ldots, U_J(y - p_{J0}, \eta), \\ U_1(y + a - p_{11}, \eta), \ldots, U_{j-1}(y + a - p_{j-1,1}, \eta), \\ U_j(y + a - p_{j1}, \eta), U_{j+1}(y + a - p_{j+1,1}, \eta), \\ \ldots, U_J(y + a - p_{J1}, \eta) \end{array}\right\}\right] \\ &= \sum_{k=1}^{j} \Pr\left[U_k(y + a - p_{k1}, \eta) \ge \max\left\{\begin{array}{l} U_1(y - p_{10}, \eta), \ldots, U_{k-1}(y - p_{k-1,0}, \eta), \\ U_{k+1}(y - p_{k+1,0}, \eta), \ldots, U_J(y - p_{J0}, \eta), \\ U_1(y + a - p_{11}, \eta), \ldots, U_{j-1}(y + a - p_{j-1,1}, \eta), \\ U_j(y + a - p_{j1}, \eta), U_{j+1}(y + a - p_{j+1,1}, \eta), \\ \ldots, U_J(y + a - p_{J1}, \eta) \end{array}\right\}\right] \\ &= \sum_{k=1}^{j} \Pr\left[U_k(y + a - p_{k1}, \eta) \ge \max\left\{\begin{array}{l} U_1(y + a - p_{11}, \eta), \ldots, U_j(y + a - p_{j1}, \eta), \\ U_{j+1}(y - p_{j+1,0}, \eta), \ldots, U_J(y - p_{J0}, \eta) \end{array}\right\}\right]. \end{aligned}$$
The second equality follows from the fact that $k \le j$ and $p_{j1} - p_{j0} \le a$, and so we have by assumption 1 that $U_k(y + a - p_{k1}, \eta) \ge U_k(y - p_{k0}, \eta)$. The third equality follows from $p_{j1} - p_{j0} \le a < p_{j+1,1} - p_{j+1,0}$, so by assumption 1, $U_k(y - p_{k0}, \eta) \le U_k(y + a - p_{k1}, \eta)$ for all $k \le j$ and $U_k(y - p_{k0}, \eta) > U_k(y + a - p_{k1}, \eta)$ for all $k > j$. Finally,
$$\begin{aligned} &\sum_{k=1}^{j} \Pr\left[U_k(y + a - p_{k1}, \eta) \ge \max\left\{\begin{array}{l} U_1(y + a - p_{11}, \eta), \ldots, U_j(y + a - p_{j1}, \eta), \\ U_{j+1}(y - p_{j+1,0}, \eta), \ldots, U_J(y - p_{J0}, \eta) \end{array}\right\}\right] \\ &\quad = \begin{cases} \sum_{k=1}^{j} q_k(p_{11}, \ldots, p_{j1}, p_{j+1,0} + a, \ldots, p_{J0} + a, y + a) & \text{if } a \ge 0, \\ \sum_{k=1}^{j} q_k(p_{11} - a, \ldots, p_{j1} - a, p_{j+1,0}, \ldots, p_{J0}, y) & \text{if } a < 0. \end{cases} \end{aligned}$$
Proof of theorem 2: The EV solves
$$\max\{U_1(y - p_{11}, \eta), \ldots, U_J(y - p_{J1}, \eta), U_{J+1}(y - p_{J+1}, \eta)\} = \max\{U_1(y + S - p_{10}, \eta), \ldots, U_J(y + S - p_{J0}, \eta)\}.$$
Proof. Now, by the same logic as the one leading to equation (25) in the previous proof,
$$\Pr[S \le a] = \Pr\left[\begin{array}{l} \max\{U_1(y - p_{11}, \eta), \ldots, U_J(y - p_{J1}, \eta), U_{J+1}(y - p_{J+1}, \eta)\} \\ \quad \le \max\{U_1(y + a - p_{10}, \eta), \ldots, U_J(y + a - p_{J0}, \eta)\} \end{array}\right]. \tag{30}$$
Recall the assumption that (WLOG)
$$p_{J0} - p_{J1} \ge p_{J-1,0} - p_{J-1,1} \ge \cdots \ge p_{10} - p_{11}.$$
Therefore, if $a < p_{10} - p_{11}$, then for each $j = 1, \ldots, J$, the $U_j(y - p_{j1}, \eta)$ on the LHS of (30) will be strictly larger than the corresponding $U_j(y + a - p_{j0}, \eta)$, contradicting that the RHS max exceeds the LHS max. Thus there can be no probability mass below $p_{10} - p_{11}$. If for some $j \in \{1, \ldots, J - 1\}$, $p_{j0} - p_{j1} \le a < p_{j+1,0} - p_{j+1,1}$, then each of the first $j$ terms on the RHS of (30) is at least as large as the corresponding term on the LHS, while the $(j+1)$th term onwards on the RHS are smaller than the corresponding terms on the LHS. This means that one of these first $j$ terms on the RHS must be the maximum and it must also exceed the $(j+1)$th term onwards on the LHS. Thus, for $p_{j0} - p_{j1} \le a < p_{j+1,0} - p_{j+1,1}$, the probability that $S \le a$ equals
$$\begin{aligned} &\sum_{k=1}^{j} \Pr\left[U_k(y + a - p_{k0}, \eta) \ge \max\left\{\begin{array}{l} \max_{k' \in \{1, \ldots, j\} \setminus k}\{U_{k'}(y + a - p_{k'0}, \eta)\}, \\ \max_{l \ge j+1}\{U_l(y - p_{l1}, \eta)\}, \\ U_{J+1}(y - p_{J+1}, \eta) \end{array}\right\}\right] \\ &\quad = \sum_{k=1}^{j} q_k(p_{10}, \ldots, p_{j0}, p_{j+1,1} + a, \ldots, p_{J1} + a, p_{J+1} + a, y + a). \end{aligned}$$
When $a \ge p_{J0} - p_{J1}$, then (30) reduces to
$$\begin{aligned} &\Pr\left[\begin{array}{l} \max\{U_1(y - p_{11}, \eta), \ldots, U_J(y - p_{J1}, \eta), U_{J+1}(y - p_{J+1}, \eta)\} \\ \quad \le \max\{U_1(y + a - p_{10}, \eta), \ldots, U_J(y + a - p_{J0}, \eta)\} \end{array}\right] \\ &\quad = \Pr\left[\begin{array}{l} \max\{U_J(y - p_{J1}, \eta), U_{J+1}(y - p_{J+1}, \eta)\} \\ \quad \le \max\{U_1(y + a - p_{10}, \eta), \ldots, U_J(y + a - p_{J0}, \eta)\} \end{array}\right] \\ &\quad = 1 - \Pr\left[\begin{array}{l} \max\{U_J(y - p_{J1}, \eta), U_{J+1}(y - p_{J+1}, \eta)\} \\ \quad > \max\{U_1(y + a - p_{10}, \eta), \ldots, U_J(y + a - p_{J0}, \eta)\} \end{array}\right] \\ &\quad \overset{(1)}{=} 1 - \Pr\left[\begin{array}{l} U_{J+1}(y - p_{J+1}, \eta) \\ \quad > \max\{U_1(y + a - p_{10}, \eta), \ldots, U_J(y + a - p_{J0}, \eta)\} \end{array}\right] \\ &\quad = \begin{cases} 1 - q_{J+1}(p_{10}, \ldots, p_{J0}, p_{J+1} + a, y + a) & \text{if } p_{J0} - p_{J1} \le a, \\ 1, & \text{otherwise.} \end{cases} \end{aligned}$$
The equality marked (1) follows from the fact that if $a \ge p_{J0} - p_{J1}$, then $U_J(y - p_{J1}, \eta)$ must be no more than $U_J(y + a - p_{J0}, \eta)$.
The proof for the CV is very similar and is omitted for the sake of brevity.
Numerical Calculations for Empirical Illustration
Policy change 1: Let alternatives 1, 2, 3 denote staying at home, working and going to school, respectively. We wish to compute welfare distributions at income 1800 resulting from a simultaneous tuition subsidy of Rs 745 and wage rise of Rs 231, viz., from an initial school-price of 790 and initial wage of 25 to a final school-price of 45 and wage of 256. The price of working is the negative of the wage. To do this, first consider the reverse price change $p_0 = (0, -256, 45)$, $p_1 = (0, -25, 790)$, and compute EV and CV distributions for this change using formulae given by equations (??) and (??).
$$\Pr(EV \le a) = \begin{cases} 0 & \text{if } a < 0, \\ q_1(-256 + a, 45 + a, 1800) & \text{if } 0 \le a < 231, \\ 1 - q_3(-25, 45 + a, 1800) & \text{if } 231 \le a < 745, \\ 1 & \text{if } a \ge 745, \end{cases}$$
$$E(EV) = \int_0^{231} \{1 - q_1(-256 + a, 45 + a, 1800)\}\,da + \int_{231}^{745} q_3(-25, 45 + a, 1800)\,da, \tag{31}$$
$$\Pr(CV \le a) = \begin{cases} 0 & \text{if } a < 0, \\ q_1(-256 + a, 45 + a, 1800 + a) & \text{if } 0 \le a < 231, \\ 1 - q_3(-25, 45 + a, 1800 + a) & \text{if } 231 \le a < 745, \\ 1 & \text{if } a \ge 745, \end{cases}$$
$$E(CV) = \int_0^{231} \{1 - q_1(-256 + a, 45 + a, 1800 + a)\}\,da + \int_{231}^{745} q_3(-25, 45 + a, 1800 + a)\,da. \tag{32}$$
We incorporate observed covariates into the choice probabilities as a linear index and use a sieve approximation to the price and income terms for $j = 2, 3$, viz.,
$$q_j(x, p_2, p_3, y) = \Lambda\big(\beta_{0j} + \beta_{cj}' x + \beta_j' m(p_2, p_3, y)\big), \tag{33}$$
where $m(\cdot, \cdot, \cdot)$ represents cubic splines, and the standard logit CDF $\Lambda(\cdot)$ is used as a link-function to keep the choice probability between 0 and 1. The results reported here correspond to 5 equi-spaced knots for income and 3 equi-spaced knots for each of the two prices. This specification minimized the cross-validation criterion of leave-one-out in-sample prediction of the actual choice out of all specifications using between 3 and 5 knots for each variable. Going beyond 5 knots posed difficulty with estimation owing to the very high-dimensional optimization involved. The integrals in (31) and (32) are calculated numerically using the trapezoidal rule via the STATA command "integ". The coefficients $\beta$ are estimated via maximum likelihood. The average welfare gain measured in terms of CV for the actual change is the average EV welfare loss calculated from (31), and the average welfare gain in terms of EV for the actual change is the average CV welfare loss calculated from (32). These numbers are reported in Table 3, top row.
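The numerical integration step can be sketched as follows. The paper's calculations use the STATA command "integ" with the estimated spline-based choice probabilities. The sketch below instead uses Python and a hypothetical three-alternative logit with made-up coefficients (the function `probs` and every number inside it are assumptions), purely to show how the two-piece integrals for the average EV and CV over $[0, 231)$ and $[231, 745)$ are assembled with the trapezoidal rule.

```python
import numpy as np

def probs(p2, p3, y):
    """Hypothetical choice probabilities (home, work, school); coefficients are made up."""
    v_work = 0.2 - 0.004 * p2                  # p2 is the negative of the wage
    v_school = 1.0 - 0.004 * p3 + 0.0002 * y   # small assumed income effect on schooling
    denom = 1.0 + np.exp(v_work) + np.exp(v_school)
    return 1.0 / denom, np.exp(v_work) / denom, np.exp(v_school) / denom

def trapezoid(values, grid):
    """Trapezoidal rule for values tabulated on an increasing grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

y = 1800.0
a_low = np.linspace(0.0, 231.0, 2311)     # first segment, step 0.1
a_high = np.linspace(231.0, 745.0, 5141)  # second segment, step 0.1

# Average EV: income stays at y inside the choice probabilities.
q1_ev, _, _ = probs(-256.0 + a_low, 45.0 + a_low, y)
_, _, q3_ev = probs(-25.0, 45.0 + a_high, y)
mean_ev = trapezoid(1.0 - q1_ev, a_low) + trapezoid(q3_ev, a_high)

# Average CV: income is shifted to y + a inside the choice probabilities.
q1_cv, _, _ = probs(-256.0 + a_low, 45.0 + a_low, y + a_low)
_, _, q3_cv = probs(-25.0, 45.0 + a_high, y + a_high)
mean_cv = trapezoid(1.0 - q1_cv, a_low) + trapezoid(q3_cv, a_high)

print("average EV:", mean_ev, "average CV:", mean_cv)
```

In this illustrative specification schooling becomes more attractive as income rises, so the average CV exceeds the average EV. Nothing of the sort is implied for the estimates reported in the paper.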
The log-sum calculations, on the other hand, yield
$$\begin{aligned} E(EV) = E(CV) &= \frac{1}{|\alpha|} \ln\left[\frac{\exp\{\beta_{02} - \beta_{01} + (\beta_{x2} - \beta_{x1})' x + \alpha p_{20}\} + \exp\{\beta_{03} - \beta_{01} + (\beta_{x3} - \beta_{x1})' x + \alpha p_{30}\} + 1}{\exp\{\beta_{02} - \beta_{01} + (\beta_{x2} - \beta_{x1})' x + \alpha p_{21}\} + \exp\{\beta_{03} - \beta_{01} + (\beta_{x3} - \beta_{x1})' x + \alpha p_{31}\} + 1}\right] \\ &= \frac{1}{|\alpha|} \ln\left[\frac{\exp(b_{02} + b_{x2}' x - \alpha \times 256) + \exp(b_{03} + b_{x3}' x + \alpha \times 45) + 1}{\exp(b_{02} + b_{x2}' x - \alpha \times 25) + \exp(b_{03} + b_{x3}' x + \alpha \times 790) + 1}\right], \end{aligned} \tag{34}$$
where the $b$s represent the normalized alternative-specific intercepts and slopes and $|\alpha|$ is the constant marginal utility of income. The $b$s and $\alpha$ are estimated via the constrained "mlogit" command in STATA, which imposes constant marginal utility of income by requiring price coefficients to be identical across alternatives. The EV and CV for the actual policy change, viz., the fall in prices, are the CV and EV respectively from the price rise derived above.
The deadweight loss is obtained by subtracting the mean welfare gains from the sum of the subsidy payments and the additional wage-payments of employers of child labourers to individuals at income 1800 rupees, i.e.,
$$\begin{aligned} DWL &= (p_{31} - p_{30}) \times q_3(p_{20}, p_{30}, 1800) - p_{20} \times q_2(p_{20}, p_{30}, 1800) + p_{21} \times q_2(p_{21}, p_{31}, 1800) - E(Welfare) \\ &= 745 \times q_3(-256, 45, 1800) + 256 \times q_2(-256, 45, 1800) - 25 \times q_2(-25, 790, 1800) - E(Welfare), \end{aligned}$$
where $E(Welfare)$ is either the log-sum given by (34), or the EV given by (31), or the CV given by (32).
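For comparison, the log-sum expression can be evaluated in a few lines. The sketch below is an illustration under assumed values: the price coefficient `alpha` and the indices `v2` and `v3` (which stand in for the normalized intercept-plus-covariate terms of working and schooling) are hypothetical and are not the estimates behind Table 3.

```python
import math

def logsum_loss(alpha, v2, v3, p0, p1):
    """(1/|alpha|) * ln[S(p0) / S(p1)], with S(p) = exp(v2 + alpha*p2) + exp(v3 + alpha*p3) + 1.

    p0 and p1 are (p2, p3) pairs; the home alternative is the normalized base with index 0.
    """
    def inclusive_sum(p):
        return math.exp(v2 + alpha * p[0]) + math.exp(v3 + alpha * p[1]) + 1.0
    return math.log(inclusive_sum(p0) / inclusive_sum(p1)) / abs(alpha)

alpha = -0.004             # assumed common price coefficient (negative)
v2, v3 = 0.2, 1.0          # assumed indices for work and school

p_low = (-256.0, 45.0)     # (negative wage, school price) before the reverse price change
p_high = (-25.0, 790.0)    # (negative wage, school price) after the reverse price change

loss = logsum_loss(alpha, v2, v3, p_low, p_high)
print("log-sum welfare loss from the price rise:", loss)
```

Under these logit assumptions the log-sum value coincides with what one would obtain by integrating the two-piece expression for the average EV using the same logit choice probabilities, because the integrals telescope to the log of the ratio of inclusive sums.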
Policy Change 2: Recall the specification $q_j(x, p_2, p_3, y) = \Lambda\big(\beta_{0j} + \beta_{cj}' x + \beta_j' m(p_2, p_3, y)\big)$, which yields
$$E(EV) = \int_0^\infty q_2(-111 + a, 146, 1800 + a)\,da = \int_0^\infty \Lambda\big(\beta_{02} + \beta_{c2}' x + \beta_2' m(-111 + a, 146, 1800 + a)\big)\,da,$$
$$E(CV) = \int_0^\infty q_2(-111 + a, 146, 1800)\,da = \int_0^\infty \Lambda\big(\beta_{02} + \beta_{c2}' x + \beta_2' m(-111 + a, 146, 1800)\big)\,da,$$
where the $\beta$ coefficients and the $m(\cdot)$ function are the same as in (33). The integral up to infinity is numerically computed simply by taking a large upper limit of integration such that taking any higher limit does not change the answer by more than 0.1%.
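The truncation rule for the improper integral can be sketched as below. The doubling schedule for the upper limit, the step size and the integrand are assumptions made for this illustration. The integrand is a hypothetical logistic function of $a$ for which the exact integral is known, so the stopping rule can be checked against it.

```python
import math
import numpy as np

def q2_hypothetical(a):
    """Illustrative integrand Lambda(1 - 0.01 * a); not the estimated choice probability."""
    return 1.0 / (1.0 + np.exp(-(1.0 - 0.01 * a)))

def integrate_to_infinity(f, upper=100.0, step=0.5, rel_tol=1e-3, max_doublings=40):
    """Trapezoidal integral of f over [0, upper], doubling upper until the value settles."""
    def integral(u):
        grid = np.linspace(0.0, u, int(round(u / step)) + 1)
        vals = f(grid)
        return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)))

    current = integral(upper)
    for _ in range(max_doublings):
        upper *= 2.0
        candidate = integral(upper)
        # Stop once a higher limit changes the answer by no more than rel_tol (0.1%).
        if abs(candidate - current) <= rel_tol * abs(candidate):
            return candidate, upper
        current = candidate
    raise RuntimeError("integral did not settle; the integrand may not be integrable")

value, upper_used = integrate_to_infinity(q2_hypothetical)
exact = 100.0 * math.log(1.0 + math.e)   # closed form for this particular integrand
print(value, upper_used, exact)
```

The check against the closed form is available only because of the specific integrand assumed here. With the spline-based probabilities one would rely on the stability of the answer across upper limits, as described above.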