Modeling Consumer Decision Making and Discrete Choice Behavior

Part 24: Stated Choice [1/117]
Econometric Analysis of Panel Data
William Greene
Department of Economics
Stern School of Business
Part 24: Stated Choice [2/117]
Econometric Analysis of Panel Data
24. Multinomial Choice and Stated Choice Experiments
Part 24: Stated Choice [3/117]
A Microeconomics Platform



Consumers Maximize Utility (!!!)
Fundamental Choice Problem: Maximize U(x1,x2,…) subject to prices and budget constraints
A Crucial Result for the Classical Problem:
  - Indirect Utility Function: V = V(p,I)
  - Demand System of Continuous Choices (Roy's identity):
\[ x_j^* = -\frac{\partial V(p,I)/\partial p_j}{\partial V(p,I)/\partial I} \]
Observed data usually consist of choices, prices, income
The Integrability Problem: Utility is not revealed by demands
Part 24: Stated Choice [4/117]
Implications for Discrete Choice Models


• Theory is silent about discrete choices
• Translation of utilities to discrete choice requires:
  - Well defined utility indexes: Completeness of rankings
  - Rationality: Utility maximization
  - Axioms of revealed preferences
• Consumers often act to simplify choice situations
• This allows us to build “models.”
  - What common elements can be assumed?
  - How can we account for heterogeneity?
• However, revealed choices do not reveal utility, only rankings which are scale invariant.
Part 24: Stated Choice [5/117]
Multinomial Choice Among J Alternatives
• Random Utility Basis
\[ U_{itj} = \alpha_{ij} + \beta_i' x_{itj} + \gamma_{ij}' z_{it} + \varepsilon_{ijt} \]
i = 1,…,N; j = 1,…,J(i,t); t = 1,…,T(i)
N individuals studied, J(i,t) alternatives in the choice set, T(i) [usually 1] choice situations examined.
• Maximum Utility Assumption
Individual i will choose alternative j in choice setting t if and only if $U_{itj} > U_{itk}$ for all $k \ne j$.
• Underlying assumptions
  - Smoothness of utilities
  - Axioms of utility maximization: Transitive, Complete, Monotonic
Part 24: Stated Choice [6/117]
Features of Utility Functions


• The linearity assumption $U_{itj} = \alpha_{ij} + \beta_i' x_{itj} + \gamma_j' z_{it} + \varepsilon_{ijt}$
  To be relaxed later: $U_{itj} = V(x_{itj}, z_{it}, \beta_i) + \varepsilon_{ijt}$
• The choice set:
  - Individual (i) and situation (t) specific
  - Unordered alternatives j = 1,…,J(i,t)
• Deterministic $(x_{itj}, z_{it}, \gamma_j)$ and random components $(\alpha_{ij}, \beta_i, \varepsilon_{ijt})$
• Attributes of choices, $x_{itj}$, and characteristics of the chooser, $z_{it}$.
  - Alternative specific constants $\alpha_{ij}$ may vary by individual
  - Preference weights, $\beta_i$, may vary by individual
  - Individual components, $\gamma_j$, typically vary by choice, not by person
  - Scaling parameters, $\sigma_{ij}^2 = \mathrm{Var}[\varepsilon_{ijt}]$, subject to much modeling
Part 24: Stated Choice [7/117]
Unordered Choices of 210 Travelers
Part 24: Stated Choice [8/117]
Data on Multinomial Discrete Choices
Part 24: Stated Choice [9/117]
The Multinomial Logit (MNL) Model

• Independent extreme value (Gumbel):
  - $F(\varepsilon_{itj}) = \exp(-\exp(-\varepsilon_{itj}))$ (random part of each utility)
  - Independence across utility functions
  - Identical variances (means absorbed in constants)
  - Same parameters for all individuals (temporary)
• Implied probabilities for observed outcomes
\[ P[\text{choice} = j \mid x_{itj}, z_{it}, i, t] = \text{Prob}[U_{i,t,j} \ge U_{i,t,k}],\ k = 1,\ldots,J(i,t) \]
\[ = \frac{\exp(\alpha_j + \beta' x_{itj} + \gamma_j' z_{it})}{\sum_{m=1}^{J(i,t)} \exp(\alpha_m + \beta' x_{itm} + \gamma_m' z_{it})} \]
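The probability formula above is a softmax over the systematic utilities. A minimal sketch follows, assuming the utilities $\alpha_j + \beta' x_{itj} + \gamma_j' z_{it}$ have already been computed; the function name and the numbers are hypothetical and not part of the original slides.

```python
import math

def mnl_probabilities(utilities):
    """Logit choice probabilities for a list of systematic utilities V_j."""
    v_max = max(utilities)                       # subtract the max for numerical stability
    expv = [math.exp(v - v_max) for v in utilities]
    total = sum(expv)
    return [e / total for e in expv]

# Hypothetical systematic utilities for four alternatives.
V = [1.0, 0.5, 0.2, 0.0]
P = mnl_probabilities(V)
print([round(p, 4) for p in P])

# Adding the same constant to every utility leaves the probabilities unchanged.
P_shifted = mnl_probabilities([v + 10.0 for v in V])
```

The shift invariance shown at the end is why only differences in utility are identified and one alternative specific constant must be normalized.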
Part 24: Stated Choice [10/117]
Multinomial Choice Models
Multinomial logit model depends on characteristics
\[ P[\text{choice} = j \mid z_{it}, i, t] = \frac{\exp(\alpha_j + \gamma_j' z_{it})}{\sum_{m=1}^{J(i,t)} \exp(\alpha_m + \gamma_m' z_{it})} \]
Conditional logit model depends on attributes
\[ P[\text{choice} = j \mid x_{itj}, i, t] = \frac{\exp(\alpha_j + \beta' x_{itj})}{\sum_{m=1}^{J(i,t)} \exp(\alpha_m + \beta' x_{itm})} \]
THE multinomial logit model accommodates both.
\[ P[\text{choice} = j \mid x_{itj}, z_{it}, i, t] = \frac{\exp(\alpha_j + \beta' x_{itj} + \gamma_j' z_{it})}{\sum_{m=1}^{J(i,t)} \exp(\alpha_m + \beta' x_{itm} + \gamma_m' z_{it})} \]
There is no meaningful distinction.
Part 24: Stated Choice [11/117]
Specifying the Probabilities
• Choice specific attributes (X) vary by choices, multiply by generic coefficients. E.g., TTME=terminal time, GC=generalized cost of travel mode
• Generic characteristics (Income, constants) must be interacted with choice specific constants.
• Estimation by maximum likelihood; dij = 1 if person i chooses j
\[ P[\text{choice} = j \mid x_{itj}, z_{it}, i, t] = \text{Prob}[U_{i,t,j} \ge U_{i,t,k}],\ k = 1,\ldots,J(i,t) \]
\[ = \frac{\exp(\alpha_j + \beta' x_{itj} + \gamma_j' z_{it})}{\sum_{m=1}^{J(i,t)} \exp(\alpha_m + \beta' x_{itm} + \gamma_m' z_{it})} \]
\[ \log L = \sum_{i=1}^{N} \sum_{j=1}^{J(i)} d_{ij} \log P_{ij} \]
Part 24: Stated Choice [12/117]
Willingness to Pay
Generally a ratio of coefficients
\[ \text{WTP} = \frac{\beta(\text{Attribute Level})}{\beta(\text{Income})} \]
Use negative of cost coefficient as a proxy for MU of income
\[ \text{WTP} = \frac{-\beta(\text{Attribute Level})}{\beta(\text{cost})} \]
Measurable using model parameters
Ratios of possibly random parameters can produce wild and unreasonable values. We will consider a different approach later.
Part 24: Stated Choice [13/117]
An Estimated MNL Model
-----------------------------------------------------------
Discrete choice (multinomial logit) model
Dependent variable               Choice
Log likelihood function      -199.97662
Estimation based on N =    210, K =   5
Information Criteria: Normalization=1/N
              Normalized   Unnormalized
AIC              1.95216      409.95325
Fin.Smpl.AIC     1.95356      410.24736
Bayes IC         2.03185      426.68878
Hannan Quinn     1.98438      416.71880
R2=1-LogL/LogL*  Log-L fncn  R-sqrd  R2Adj
Constants only    -283.7588   .2953  .2896
Chi-squared[ 2]          =    167.56429
Prob [ chi squared > value ] =   .00000
Response data are given as ind. choices
Number of obs.=   210, skipped    0 obs
--------+--------------------------------------------------
Variable| Coefficient   Standard Error  b/St.Er.  P[|Z|>z]
--------+--------------------------------------------------
      GC|   -.01578***      .00438       -3.601     .0003
    TTME|   -.09709***      .01044       -9.304     .0000
   A_AIR|   5.77636***      .65592        8.807     .0000
 A_TRAIN|   3.92300***      .44199        8.876     .0000
   A_BUS|   3.21073***      .44965        7.140     .0000
--------+--------------------------------------------------
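As an illustration of the ratio-of-coefficients idea, the sketch below takes the GC and TTME coefficients from the output above and computes the implied trade-off between terminal time and generalized cost. Treating GC as the cost variable is an assumption made here for illustration; the slides do not report this number.

```python
# Coefficients copied from the estimated MNL output.
b_gc = -0.01578      # generalized cost
b_ttme = -0.09709    # terminal time

# Both coefficients are negative, so the ratio is positive: the amount of
# generalized cost a traveler would give up to avoid one unit of terminal time.
value_of_terminal_time = b_ttme / b_gc
print(round(value_of_terminal_time, 2))
```

Because this is a ratio of two estimates, its sampling variability would normally be assessed with the delta method or by simulation rather than read off the table.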
Part 24: Stated Choice [15/117]
Estimated MNL Model
Fit measures in the output of the estimated MNL model (log L = -199.97662; constants only log L0 = -283.7588; N = 210, J = 4, K = 5):
\[ \text{Pseudo } R^2 = 1 - \frac{\log L}{\log L_0} = .2953 \]
\[ \text{Adjusted Pseudo } R^2 = 1 - \frac{N(J-1)}{N(J-1)-K}\cdot\frac{\log L}{\log L_0} = .2896 \]
The remaining entries (AIC, Fin.Smpl.AIC, Bayes IC, Hannan Quinn, Chi-squared[2] = 167.56429, and the coefficient estimates for GC, TTME, A_AIR, A_TRAIN, A_BUS) are the same as in the output shown for this model on the preceding slides.
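The fit statistics in the output can be reproduced from the two log likelihoods. The sketch below uses the standard formulas for the likelihood ratio statistic and the information criteria; that the program uses exactly these definitions is an assumption, although the numbers agree with the printed values.

```python
import math

logL, logL0 = -199.97662, -283.7588   # fitted model and constants-only model
N, J, K = 210, 4, 5                   # individuals, alternatives, parameters

pseudo_r2 = 1.0 - logL / logL0
adj_r2 = 1.0 - (N * (J - 1)) / (N * (J - 1) - K) * (logL / logL0)
chi2 = 2.0 * (logL - logL0)           # likelihood ratio statistic, 5 - 3 = 2 d.f.
aic = -2.0 * logL + 2.0 * K
bic = -2.0 * logL + K * math.log(N)
hq = -2.0 * logL + 2.0 * K * math.log(math.log(N))

print(round(pseudo_r2, 4), round(adj_r2, 4), round(chi2, 3))
print(round(aic, 3), round(bic, 3), round(hq, 3))
```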
Part 24: Stated Choice [16/117]
Partial effects:
Change in attribute "k" of alternative "m" on the probability that the individual makes choice "j"
\[ \frac{\partial \text{Prob}(j)}{\partial x_{m,k}} = \frac{\partial P_j}{\partial x_{m,k}} = P_j\,[\mathbf{1}(j = m) - P_m]\,\beta_k \]
Example: j = Train, m = Car, k = Price
Part 24: Stated Choice [17/117]
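Partial effects: k = Price
Own effects: j = Train
\[ \frac{\partial \text{Prob}(j)}{\partial x_{j,k}} = \frac{\partial P_j}{\partial x_{j,k}} = P_j\,[1 - P_j]\,\beta_k \]
Cross effects: j = Train, m = Car
\[ \frac{\partial \text{Prob}(j)}{\partial x_{m,k}} = \frac{\partial P_j}{\partial x_{m,k}} = -P_j P_m\,\beta_k \]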
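The own and cross effects can be collected in one matrix. This is a minimal sketch with hypothetical probabilities and a hypothetical price coefficient; none of the numbers come from the slides.

```python
def partial_effects(P, beta_k):
    """E[j][m] = dP_j / dx_{m,k} = P_j * (1(j == m) - P_m) * beta_k."""
    J = len(P)
    return [[P[j] * ((1.0 if j == m else 0.0) - P[m]) * beta_k for m in range(J)]
            for j in range(J)]

# Hypothetical probabilities for (Air, Train, Bus, Car) and price coefficient.
P = [0.3, 0.3, 0.1, 0.3]
beta_price = -0.02
E = partial_effects(P, beta_price)

own_train = E[1][1]          # effect of Train's price on P(Train)
cross_train_car = E[1][3]    # effect of Car's price on P(Train)
print(own_train, cross_train_car)
```

Each column of the matrix sums to zero, since a change in one attribute only reallocates probability among the alternatives.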
Part 24: Stated Choice [18/117]
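Elasticities for proportional changes:
\[ \frac{\partial \log \text{Prob}(j)}{\partial \log x_{m,k}} = \frac{\partial \log P_j}{\partial \log x_{m,k}} = \frac{x_{m,k}}{P_j}\,P_j\,[\mathbf{1}(j = m) - P_m]\,\beta_k = [\mathbf{1}(j = m) - P_m]\,x_{m,k}\,\beta_k \]
Note the cross elasticity, $-P_m x_{m,k} \beta_k$, is the same for all $j \ne m$. This is a consequence of the IIA assumption in the model specification made at the outset.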
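A small numerical sketch makes the IIA pattern visible: in each column of the elasticity matrix, all off-diagonal entries coincide. The probabilities, attribute levels and coefficient below are hypothetical.

```python
def elasticities(P, x_k, beta_k):
    """el[j][m] = dlog P_j / dlog x_{m,k} = (1(j == m) - P_m) * x_{m,k} * beta_k."""
    J = len(P)
    return [[((1.0 if j == m else 0.0) - P[m]) * x_k[m] * beta_k for m in range(J)]
            for j in range(J)]

P = [0.3, 0.3, 0.1, 0.3]            # hypothetical choice probabilities
x_k = [100.0, 50.0, 40.0, 20.0]     # hypothetical attribute levels by alternative
beta_k = -0.02                      # hypothetical coefficient

el = elasticities(P, x_k, beta_k)
for row in el:
    print([round(e, 3) for e in row])
```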
Part 24: Stated Choice [19/117]
+---------------------------------------------------+
| Elasticity             averaged over observations.|
| Attribute is INVT     in choice AIR               |
|                           Mean       St.Dev       |
| *    Choice=AIR          -.2055       .0666       |
|      Choice=TRAIN         .0903       .0681       |
|      Choice=BUS           .0903       .0681       |
|      Choice=CAR           .0903       .0681       |
+---------------------------------------------------+
| Attribute is INVT     in choice TRAIN             |
|      Choice=AIR           .3568       .1231       |
| *    Choice=TRAIN        -.9892       .5217       |
|      Choice=BUS           .3568       .1231       |
|      Choice=CAR           .3568       .1231       |
+---------------------------------------------------+
| Attribute is INVT     in choice BUS               |
|      Choice=AIR           .1889       .0743       |
|      Choice=TRAIN         .1889       .0743       |
| *    Choice=BUS         -1.2040       .4803       |
|      Choice=CAR           .1889       .0743       |
+---------------------------------------------------+
| Attribute is INVT     in choice CAR               |
|      Choice=AIR           .3174       .1195       |
|      Choice=TRAIN         .3174       .1195       |
|      Choice=BUS           .3174       .1195       |
| *    Choice=CAR          -.9510       .5504       |
+---------------------------------------------------+
| Effects on probabilities of all choices in model: |
| * = Direct Elasticity effect of the attribute.    |
+---------------------------------------------------+
Note the effect of IIA on the cross effects: within each block, the starred row is the own effect and the remaining rows are cross effects, which are identical.
Elasticities are computed for each observation; the mean and standard deviation are then computed across the sample observations.
Part 24: Stated Choice [20/117]
A Multinomial Logit Common Effects Model

• How to handle unobserved effects in other nonlinear models?
  - Single index models such as probit, Poisson, tobit, etc. that are functions of an xit'β can be modified to be functions of xit'β + ci.
  - Other models – not at all obvious. Rarely found in the literature.
• Dealing with fixed and random effects?
• Dynamics makes things much worse.
Part 24: Stated Choice [21/117]
A Multinomial Logit Model
The multinomial logit model for unordered choices
\[ U_{i,t}(j) = \beta' x_{i,t}(j) + \varepsilon_{i,t}(j), \quad j = 1,\ldots,J \text{ (choice set)} \]
t = 1,...,T (choice situations)
i = 1,...,N (individuals)
$\varepsilon_{i,t}(j)$ ~ I.I.D. Type 1 extreme value.
$j^*_{i,t}$ = index of the choice made by individual i in situation t, such that $U_{i,t}(j^*_{i,t}) \ge U_{i,t}(k)$ for $k \ne j^*_{i,t}$.
How to modify the model to include common (random or fixed) effects?
Part 24: Stated Choice [22/117]
A Heterogeneous Multinomial Logit Model
The multinomial logit model for unordered choices
\[ U_{i,t}(1) = \beta' x_{i,t}(1) + \varepsilon_{i,t}(1) + u_i(1) \]
\[ U_{i,t}(2) = \beta' x_{i,t}(2) + \varepsilon_{i,t}(2) + u_i(2) \]
...
\[ U_{i,t}(J) = \beta' x_{i,t}(J) + \varepsilon_{i,t}(J) + u_i(J) \]
t = 1,...,T (choice situations)
i = 1,...,N (individuals)
$j^*_{i,t}$ = index of the choice made, such that $U_{i,t}(j^*_{i,t}) \ge U_{i,t}(k)$ for $k \ne j^*_{i,t}$
$\varepsilon_{i,t}(j)$ ~ I.I.D. Type 1 extreme value, j=1,...,J.
$[u_i(1), u_i(2), \ldots, u_i(J)] = u_i$ = J common individual effects.
[A dynamic version of this model in Gong, et al., "Mobility in the Urban Labor Market" IZA Working Paper 213, Bonn, 2000]
Part 24: Stated Choice [23/117]
Common Effects Multinomial Logit
Fixed Effects is complicated. Needs N sets of J dummy variable coefficients (that sum to zero across choices).
Random Effects:
\[ L_i \mid u_i = \prod_{t=1}^{T} \frac{\exp[\beta' x_{i,t}(j^*_{i,t}) + u_i(j^*_{i,t})]}{\sum_{j=1}^{J} \exp[\beta' x_{i,t}(j) + u_i(j)]} = \prod_{t=1}^{T} \text{Prob}[\text{choice made} \mid u_i] \]
Unconditional contribution to the log likelihood for person i is
\[ \log L_i = \log \int_{u_i} \prod_{t=1}^{T} \frac{\exp[\beta' x_{i,t}(j^*_{i,t}) + u_i(j^*_{i,t})]}{\sum_{j=1}^{J} \exp[\beta' x_{i,t}(j) + u_i(j)]}\, f(u_i)\, du_i \]
Part 24: Stated Choice [24/117]
Simulation Based Estimation
\[ \log L_i = \log \int_{u_i} \prod_{t=1}^{T} \frac{\exp[\beta' x_{i,t}(j^*_{i,t}) + u_i(j^*_{i,t})]}{\sum_{j=1}^{J} \exp[\beta' x_{i,t}(j) + u_i(j)]}\, f(u_i)\, du_i \]
\[ \log L = \sum_{i=1}^{N} \log \int_{u_i} \prod_{t=1}^{T} \frac{\exp[\beta' x_{i,t}(j^*_{i,t}) + u_i(j^*_{i,t})]}{\sum_{j=1}^{J} \exp[\beta' x_{i,t}(j) + u_i(j)]}\, f(u_i)\, du_i \]
\[ \text{Simulated} \log L = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{\exp[\beta' x_{i,t}(j^*_{i,t}) + \sigma_{j^*} v_{i,r}(j^*_{i,t})]}{\sum_{j=1}^{J} \exp[\beta' x_{i,t}(j) + \sigma_j v_{i,r}(j)]} \]
where $v_{i,r}(j)$ are random draws from the assumed population.
This function is maximized over $\beta$ and $\sigma_1, \ldots, \sigma_J$.
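The simulated log likelihood can be sketched in a few lines. This version assumes a single scalar attribute per alternative, standard normal draws, and made-up data; the data layout and function name are hypothetical, and a real application would pass the function to a numerical optimizer.

```python
import math
import random

def simulated_loglik(beta, sigma, data, draws):
    """Simulated log likelihood for a random effects MNL.

    data[i]  = list of (x, chosen) pairs; x[j] is a scalar attribute of alternative j.
    draws[i] = R vectors v_r of length J, held fixed across function evaluations.
    """
    total = 0.0
    for person, person_draws in zip(data, draws):
        sim = 0.0
        for v in person_draws:                       # r = 1..R
            prob_seq = 1.0
            for x, chosen in person:                 # t = 1..T
                util = [beta * x[j] + sigma[j] * v[j] for j in range(len(x))]
                m = max(util)
                denom = sum(math.exp(u - m) for u in util)
                prob_seq *= math.exp(util[chosen] - m) / denom
            sim += prob_seq
        total += math.log(sim / len(person_draws))
    return total

random.seed(0)
J, T, N, R = 3, 4, 5, 50
data = [[([random.gauss(0, 1) for _ in range(J)], random.randrange(J))
         for _ in range(T)] for _ in range(N)]
draws = [[[random.gauss(0, 1) for _ in range(J)] for _ in range(R)] for _ in range(N)]

ll = simulated_loglik(0.5, [0.3, 0.3, 0.3], data, draws)
ll_flat = simulated_loglik(0.0, [0.0] * J, data, draws)   # every probability equals 1/J
print(round(ll, 4), round(ll_flat, 4))
```

Holding the draws fixed across evaluations keeps the simulated function smooth in the parameters, which matters for gradient-based maximization.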
Part 24: Stated Choice [25/117]
Application: Shoe Brand Choice
• Simulated Data: Stated Choice, N=400 respondents, T=8 choice situations, 3,200 observations
• 3 choice/attributes + NONE, so J=4
  - Fashion = High / Low
  - Quality = High / Low
  - Price = 25/50/75/100 coded 1,2,3,4; and Price squared
• Heterogeneity: Sex, Age (<25, 25-39, 40+)
• Underlying data generated by a 3 class latent class process (100, 200, 100 in classes)
• Thanks to www.statisticalinnovations.com (Latent Gold)
Part 24: Stated Choice [26/117]
Application
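\[ U_{i,t}(1) = \beta_1 F_{i,t} + \beta_2 Q_{i,t} + \beta_3 P_{i,t} + \beta_4 P_{i,t}^2 + \gamma_{S,1}\text{Sex}_i + \gamma_{Y,1}\text{Young}_i + \gamma_{O,1}\text{Older}_i + \varepsilon_{i,t}(1) + u_i(1) \]
\[ U_{i,t}(2) = \beta_1 F_{i,t} + \beta_2 Q_{i,t} + \beta_3 P_{i,t} + \beta_4 P_{i,t}^2 + \gamma_{S,2}\text{Sex}_i + \gamma_{Y,2}\text{Young}_i + \gamma_{O,2}\text{Older}_i + \varepsilon_{i,t}(2) + u_i(2) \]
\[ U_{i,t}(3) = \beta_1 F_{i,t} + \beta_2 Q_{i,t} + \beta_3 P_{i,t} + \beta_4 P_{i,t}^2 + \gamma_{S,3}\text{Sex}_i + \gamma_{Y,3}\text{Young}_i + \gamma_{O,3}\text{Older}_i + \varepsilon_{i,t}(3) + u_i(3) \]
\[ U_{i,t}(\text{none}) = \varepsilon_{i,t}(\text{none}) \]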
Part 24: Stated Choice [27/117]
No Common Effects
+---------------------------------------------+
| Start values obtained using MNL model       |
| Log likelihood function       -4119.500     |
+---------------------------------------------+
+--------+--------------+----------------+--------+--------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]|
+--------+--------------+----------------+--------+--------+
 FASH    |   1.45964424       .07748860     18.837   .0000
 QUAL    |   1.10637961       .07153725     15.466   .0000
 PRICE   |   2.31763951      3.98732636       .581   .5611
 PRICESQ | -55.5527148      13.8684229      -4.006   .0001
 ASC4    |    .64637513       .24440240      2.645   .0082
 B1_MAL1 |   -.16751621       .10552035     -1.588   .1124
 B1_YNG1 |   -.58118337       .11969068     -4.856   .0000
 B1_OLD1 |   -.02600079       .14091863      -.185   .8536
 B2_MAL2 |   -.05966758       .10055110      -.593   .5529
 B2_YNG2 |   -.14991404       .11180414     -1.341   .1800
 B2_OLD2 |   -.15128297       .14133889     -1.070   .2845
 B3_MAL3 |   -.12076085       .09301010     -1.298   .1942
 B3_YNG3 |   -.12265952       .10419547     -1.177   .2391
 B3_OLD3 |   -.04753400       .12950649      -.367   .7136
Part 24: Stated Choice [28/117]
Random Effects MNL Model
+---------------------------------------------+
| Error Components (Random Effects) model     |
| Restricted logL = -4119.5                   |
| Log likelihood function       -4112.495     |
| Chi squared(3) = 14.01 (Crit.Val.=7.81)     |
+---------------------------------------------+
+--------+--------------+----------------+--------+--------+
|Variable| Coefficient  | Standard Error |b/St.Er.|P[|Z|>z]|
+--------+--------------+----------------+--------+--------+
---------+Nonrandom parameters in utility functions
 FASH    |   1.50759565       .08204283     18.376   .0000
 QUAL    |   1.14155991       .07884212     14.479   .0000
 PRICE   |   2.61115484      4.23285024       .617   .5373
 PRICESQ | -58.0172769      14.7409678      -3.936   .0001
 ASC4    |    .72127357       .25703909      2.806   .0050
 B1_MAL1 |   -.19918832       .11818500     -1.685   .0919
 B1_YNG1 |   -.61263642       .12580875     -4.870   .0000
 B1_OLD1 |   -.03213515       .15732926      -.204   .8382
 B2_MAL2 |   -.04059494       .10950154      -.371   .7108
 B2_YNG2 |   -.12504492       .11986238     -1.043   .2968
 B2_OLD2 |   -.12470329       .14151490      -.881   .3782
 B3_MAL3 |   -.10619757       .10471334     -1.014   .3105
 B3_YNG3 |   -.10372335       .11851081      -.875   .3815
 B3_OLD3 |   -.02538899       .13269408      -.191   .8483
---------+Standard deviations of latent random effects
 SigmaE01|    .53459541       .09531536      5.609   .0000
 SigmaE02|    .01799747       .62983694       .029   .9772
 SigmaE03|    .03109637       .35256770       .088   .9297
Part 24: Stated Choice [29/117]
Revealed and Stated Preference Data

• Pure RP Data
  - Market (ex-post, e.g., supermarket scanner data)
  - Individual observations
• Pure SP Data
  - Contingent valuation
  - (?) Validity
• Combined (Enriched) RP/SP
  - Mixed data
  - Expanded choice sets
Part 24: Stated Choice [30/117]
Revealed Preference Data


• Advantage: Actual observations on actual behavior
• Disadvantage: Limited range of choice sets and attributes – does not allow analysis of switching behavior.
Part 24: Stated Choice [31/117]
Stated Preference Data



• Pure hypothetical – does the subject take it seriously?
• No necessary anchor to real market situations
• Vast heterogeneity across individuals
Part 24: Stated Choice [32/117]
Pooling RP and SP Data Sets - 1


• Enrich the attribute set by replicating choices
• E.g.:
  - RP: Bus,Car,Train (actual)
  - SP: Bus(1),Car(1),Train(1)
  - Bus(2),Car(2),Train(2),…
• How to combine?
Part 24: Stated Choice [33/117]
A Stated Choice Experiment with Variable Choice Sets
Each person makes four choices from a choice set that includes either 2 or 4 alternatives.
The first choice is the RP between two of the 4 RP alternatives.
The second through fourth are the SP among four of the 6 SP alternatives.
There are 10 alternatives in total.
Part 24: Stated Choice [34/117]
Enriched Data Set – Vehicle Choice
Choosing between Conventional, Electric and LPG/CNG
Vehicles in Single-Vehicle Households
David A. Hensher
Institute of Transport Studies
School of Business
The University of Sydney
NSW 2006 Australia
William H. Greene
Department of Economics
Stern School of Business
New York University
New York USA
September 2000
Part 24: Stated Choice [35/117]
Fuel Types Study





• Conventional, Electric, Alternative
• 1,400 Sydney Households
• Automobile choice survey
• RP + 3 SP fuel classes
• Nested logit – 2 level approach – to handle the scaling issue
Part 24: Stated Choice [36/117]
Attribute Space: Conventional
Part 24: Stated Choice [37/117]
Attribute Space: Electric
Part 24: Stated Choice [38/117]
Attribute Space: Alternative
Part 24: Stated Choice [39/117]
Part 24: Stated Choice [40/117]
Mixed Logit Approaches



• Pivot SP choices around an RP outcome.
• Scaling is handled directly in the model
• Continuity across choice situations is handled by random elements of the choice structure that are constant through time
  - Preference weights – coefficients
  - Scaling parameters
    - Variances of random parameters
    - Overall scaling of utility functions
Part 24: Stated Choice [41/117]
Application
Survey sample of 2,688 trips, 2 or 4 choices per situation
Sample consists of 672 individuals
Choice based sample
Revealed/Stated choice experiment:
Revealed: Drive,ShortRail,Bus,Train
Hypothetical: Drive,ShortRail,Bus,Train,LightRail,ExpressBus
Attributes:
Cost – Fuel or fare
Transit time
Parking cost
Access and Egress time
Part 24: Stated Choice [42/117]
Nested Logit Approach
[Tree diagram: Mode, with an RP branch containing Car, Train and Bus, and the SP alternatives SPCar, SPTrain and SPBus.]
Use a two level nested model, and constrain three SP IV parameters to be equal.
Part 24: Stated Choice [44/117]
Customers’ Choice of Energy Supplier



• California, Stated Preference Survey
• 361 customers presented with 8-12 choice situations each
• Supplier attributes:
  - Fixed price: cents per kWh
  - Length of contract
  - Local utility
  - Well-known company
  - Time-of-day rates (11¢ in day, 5¢ at night)
  - Seasonal rates (10¢ in summer, 8¢ in winter, 6¢ in spring/fall)
(TrainCalUtilitySurvey.lpj)
Part 24: Stated Choice [45/117]
Population Distributions

• Normal for:
  - Contract length
  - Local utility
  - Well-known company
• Log-normal for:
  - Time-of-day rates
  - Seasonal rates
• Price coefficient held fixed
Part 24: Stated Choice [46/117]
Estimated Model
                        Estimate   Std error
Price                    -.883       0.050
Contract   mean          -.213       0.026
           std dev        .386       0.028
Local      mean          2.23        0.127
           std dev       1.75        0.137
Known      mean          1.59        0.100
           std dev        .962       0.098
TOD        mean*         2.13        0.054
           std dev*       .411       0.040
Seasonal   mean*         2.16        0.051
           std dev*       .281       0.022
*Parameters of underlying normal.
Part 24: Stated Choice [47/117]
Distribution of Brand Value
[Figure: density of the brand value of local utility, centered at 2.5¢ with standard deviation = 2.0¢; the area below 0 is labeled "10% dislike local utility".]
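The figure's numbers appear to follow from the estimated model: dividing the normal coefficient on "Local" by the magnitude of the fixed price coefficient converts it to cents per kWh. The sketch below checks this under that assumption; the mapping from the table to the figure is inferred, not stated on the slide.

```python
import math

price_coef = -0.883                  # fixed price coefficient (utility per cent/kWh)
local_mean, local_sd = 2.23, 1.75    # normal coefficient on "Local utility"

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

value_mean = local_mean / -price_coef     # mean brand value, cents per kWh
value_sd = local_sd / -price_coef         # standard deviation of brand value
share_negative = normal_cdf(-local_mean / local_sd)   # share with negative brand value

print(round(value_mean, 2), round(value_sd, 2), round(share_negative, 3))
```

Dividing by a fixed price coefficient keeps the brand value normally distributed, which is one reason the price coefficient is held fixed in this specification.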
Part 24: Stated Choice [48/117]
Random Parameter Distributions
Part 24: Stated Choice [49/117]
Time of Day Rates (Customers do not like –
lognormal coefficient. Multiply variable by -1.)
Part 24: Stated Choice [50/117]
Estimating Individual Parameters



• Model estimates = structural parameters, α, β, ρ, Δ, Σ, Γ
• Objective, a model of individual specific parameters, βi
• Can individual specific parameters be estimated?
  - Not quite – βi is a single realization of a random process; one random draw.
  - We estimate E[βi | all information about i]
  - (This is also true of Bayesian treatments, despite claims to the contrary.)
Part 24: Stated Choice [51/117]
Expected Preferences of Each Customer
Customer likes long-term contract, local utility,
and non-fixed rates.
Local utility can retain and make profit from this
customer by offering a long-term contract with
time-of-day or seasonal rates.
Part 24: Stated Choice [52/117]
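Posterior Estimation of βi
\[ \hat\beta_i = E[\beta_i \mid \beta, \Delta, \Gamma, y_i, X_i, z_i] = \frac{\int_{\beta_i} \beta_i \prod_{t=1}^{T} P(\text{choice } j \mid X_i, \beta_i)\, g(\beta_i \mid \beta, \Delta, \Gamma, \text{etc.}, z_i)\, d\beta_i}{\int_{\beta_i} \prod_{t=1}^{T} P(\text{choice } j \mid X_i, \beta_i)\, g(\beta_i \mid \beta, \Delta, \Gamma, \text{etc.}, z_i)\, d\beta_i} \]
Estimate by simulation:
\[ \hat\beta_i = \frac{\frac{1}{R}\sum_{r=1}^{R} \hat\beta_{ir} \prod_{t=1}^{T} P(\text{choice } j \mid X_i, \hat\beta_{ir})}{\frac{1}{R}\sum_{r=1}^{R} \prod_{t=1}^{T} P(\text{choice } j \mid X_i, \hat\beta_{ir})}, \qquad \hat\beta_{ir} = \hat\beta + \hat\Delta z_i + \hat\Gamma w_{ir} \]
The draws $\hat\beta_{ir}$ are taken from the estimated population density $g(\beta_i \mid \hat\beta, \hat\Delta, \hat\Gamma, \text{etc.}, z_i)$, so g enters the simulated averages through the draws rather than as a separate factor.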
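The simulated conditional mean is a likelihood-weighted average of draws from the population distribution. The sketch below assumes one random coefficient with $\beta_i \sim N(b, s^2)$, a scalar attribute, and two made-up respondents; all names and numbers are hypothetical.

```python
import math
import random

def choice_prob(beta_i, x, chosen):
    """MNL probability of the chosen alternative; x[j] is a scalar attribute."""
    util = [beta_i * xj for xj in x]
    m = max(util)
    return math.exp(util[chosen] - m) / sum(math.exp(u - m) for u in util)

def conditional_mean(b, s, person, R=2000, seed=1):
    """Simulated E[beta_i | choices of person i] with beta_i ~ N(b, s^2)."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(R):
        beta_r = b + s * rng.gauss(0.0, 1.0)     # draw from the population distribution
        like = 1.0
        for x, chosen in person:                  # product over choice situations
            like *= choice_prob(beta_r, x, chosen)
        num += beta_r * like
        den += like
    return num / den

# Hypothetical respondent who always picks the alternative with the larger x ...
person_high = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([2.0, 0.5], 0), ([0.2, 1.5], 1)]
# ... and one who always picks the alternative with the smaller x.
person_low = [(x, 1 - c) for x, c in person_high]

est_high = conditional_mean(0.0, 1.0, person_high)
est_low = conditional_mean(0.0, 1.0, person_low)
print(round(est_high, 3), round(est_low, 3))
```

Although both respondents share the same population mean of zero, their observed choices pull the conditional means in opposite directions.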
Part 24: Stated Choice [53/117]
Application: Shoe Brand Choice

• Simulated Data: Stated Choice,
  - 400 respondents,
  - 8 choice situations, 3,200 observations
• 3 choice/attributes + NONE
  - Fashion = High / Low
  - Quality = High / Low
  - Price = 25/50/75/100 coded 1,2,3,4
• Heterogeneity: Sex (Male=1), Age (<25, 25-39, 40+)
• Underlying data generated by a 3 class latent class process (100, 200, 100 in classes)
Part 24: Stated Choice [54/117]
Stated Choice Experiment: Unlabeled Alternatives, One Observation
[Figure: the eight choice situations, t = 1,…,8, presented to one respondent.]
Part 24: Stated Choice [55/117]
Individual parameters
Random Parameters Logit Model
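\[ U(\text{brand1})_{n,s} = \beta_{1,n}\text{Fashion}_{1,n,s} + \beta_2\text{Quality}_{1,n,s} + \beta_3\text{Price}_{1,n,s} + \varepsilon_{\text{Brand1},n,s} \]
\[ U(\text{brand2})_{n,s} = \beta_{1,n}\text{Fashion}_{2,n,s} + \beta_2\text{Quality}_{2,n,s} + \beta_3\text{Price}_{2,n,s} + \varepsilon_{\text{Brand2},n,s} \]
\[ U(\text{brand3})_{n,s} = \beta_{1,n}\text{Fashion}_{3,n,s} + \beta_2\text{Quality}_{3,n,s} + \beta_3\text{Price}_{3,n,s} + \varepsilon_{\text{Brand3},n,s} \]
\[ U(\text{None})_{n,s} = \beta_4 + \varepsilon_{\text{No Brand},n,s} \]
\[ \beta_{1,n} = \beta_1 + \delta_{11}\text{Sex} + \delta_{12}\text{Age2539} + \delta_{13}\text{Age40} + \eta_1 z_{n1} \]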
Part 24: Stated Choice [56/117]
Individual parameters
Part 24: Stated Choice [57/117]
Individual parameters
Part 24: Stated Choice [58/117]
Panel Data



• Repeated Choice Situations
• Typically RP/SP constructions (experimental)
• Accommodating “panel data”
  - Multinomial Probit [marginal, impractical]
  - Latent Class
  - Mixed Logit
Part 24: Stated Choice [63/117]
Unlabeled Choice Experiments
The shoe brand study is an unlabelled choice experiment: Compare
Choice = (Air, Train, Bus, Car)
To
Choice = (Brand 1, Brand 2, Brand 3, None)
Brand 1 is only Brand 1 because it is first in the list.
What does it mean to substitute Brand 1 for Brand 2?
What does the own elasticity for Brand 1 mean?
Part 24: Stated Choice [64/117]
Aggregate Data and Multinomial Choice:
The Model of Berry, Levinsohn and Pakes
Part 24: Stated Choice [65/117]
Resources
Automobile Prices in Market Equilibrium, S. Berry, J. Levinsohn, A. Pakes, Econometrica, 63, 4, 1995, 841-890. (BLP)
http://people.stern.nyu.edu/wgreene/Econometrics/BLP.pdf
A Practitioner’s Guide to Estimation of Random-Coefficients Logit Models of Demand, A. Nevo, Journal of Economics and Management Strategy, 9, 4, 2000, 513-548.
http://people.stern.nyu.edu/wgreene/Econometrics/Nevo-BLP.pdf
A New Computational Algorithm for Random Coefficients Model with Aggregate-level Data, Jinyoung Lee, UCLA Economics, Dissertation, 2011.
http://people.stern.nyu.edu/wgreene/Econometrics/Lee-BLP.pdf
Elasticities of Market Shares and Social Health Insurance Choice in Germany: A Dynamic Panel Data Approach, M. Tamm et al., Health Economics, 16, 2007, 243-256.
http://people.stern.nyu.edu/wgreene/Econometrics/Tamm.pdf
Part 24: Stated Choice [76/117]
Theoretical Foundation



• Consumer market for J differentiated brands of a good
  - j = 1,…,Jt brands or types
  - i = 1,…,N consumers
  - t = 1,…,T “markets” (like panel data)
• Consumer i’s utility for brand j (in market t) depends on
  - p = price
  - x = observable attributes
  - f = unobserved attributes
  - w = unobserved heterogeneity across consumers
  - ε = idiosyncratic aspects of consumer preferences
• Observed data consist of aggregate choices, prices and features of the brands.
Part 24: Stated Choice [77/117]
BLP Automobile Market
[Table of the BLP automobile market data, with columns labeled t and Jt.]
Part 24: Stated Choice [78/117]
Random Utility Model

• Utility: $U_{ijt} = U(w_i, p_{jt}, x_{jt}, f_{jt} \mid \theta)$, i = 1,…,(large)N, j=1,…,J
  - $w_i$ = individual heterogeneity; time (market) invariant. w has a continuous distribution across the population.
  - $p_{jt}, x_{jt}, f_{jt}$ = price, observed attributes, unobserved features of brand j; all may vary through time (across markets)
• Revealed Preference: Choice j provides maximum utility
• Across the population, given market t, set of prices $p_t$ and features $(X_t, f_t)$, there is a set of values of $w_i$ that induces choice j, for each j=1,…,Jt; then, $s_j(p_t, X_t, f_t \mid \theta)$ is the market share of brand j in market t.
• There is an outside good that attracts a nonnegligible market share, j=0. Therefore, $\sum_{j=1}^{J_t} s_j(p_t, X_t, f_t \mid \theta) < 1$.
Part 24: Stated Choice [79/117]
Functional Form


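(Assume one market for now so drop “t.”)
\[ U_{ij} = U(w_i, p_j, x_j, f_j \mid \theta) = x_j'\beta - \alpha p_j + f_j + \varepsilon_{ij} = \delta_j + \varepsilon_{ij} \]
$E_{\text{consumers } i}[\varepsilon_{ij}] = 0$, so $\delta_j$ is E[Utility].
\[ \text{Market share}_j = E\big[\text{Prob}(\delta_j + \varepsilon_{ij} \ge \delta_q + \varepsilon_{iq} \text{ for all } q \ne j)\big] \]
Will assume logit form to make integration unnecessary. The expectation has a closed form.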
Part 24: Stated Choice [80/117]
Heterogeneity




• Assumptions so far imply IIA. Cross price elasticities depend only on market shares.
• Individual heterogeneity: Random parameters
\[ U_{ij} = U(w_i, p_j, x_j, f_j \mid \theta_i) = x_j'\beta_i - \alpha p_j + f_j + \varepsilon_{ij}, \qquad \beta_{ik} = \beta_k + \sigma_k v_{ik} \]
• The mixed model only imposes IIA for a particular consumer, but not for the market as a whole.
Part 24: Stated Choice [81/117]
Endogenous Prices: Demand side





• $U_{ij} = U(w_i, p_j, x_j, f_j \mid \theta) = x_j'\beta_i - \alpha p_j + f_j + \varepsilon_{ij}$
• fj is unobserved
• Utility responds to the unobserved fj
• Price pj is partly determined by features fj.
• In a choice model based on observables, price is correlated with the unobservables that determine the observed choices.
Part 24: Stated Choice [82/117]
Endogenous Price: Supply Side





• There are a small number of competitors in this market
• Price is determined by firms that maximize profits given the features of their products and those of their competitors.
• $mc_j$ = g(observed cost characteristics c, unobserved cost characteristics h)
• At equilibrium, for a profit maximizing firm that produces one product,
\[ s_j + (p_j - mc_j)\,\frac{\partial s_j}{\partial p_j} = 0 \]
• Market share depends on unobserved cost characteristics as well as unobserved demand characteristics, and price is correlated with both.
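The equilibrium condition can be unpacked in one step. As a sketch, assume a market of size M (a symbol introduced here, not in the slides) and profit $\pi_j = (p_j - mc_j)\,M\,s_j$. Then

\[ \frac{\partial \pi_j}{\partial p_j} = M\left[s_j + (p_j - mc_j)\frac{\partial s_j}{\partial p_j}\right] = 0 \quad\Longrightarrow\quad p_j - mc_j = -\frac{s_j}{\partial s_j/\partial p_j}. \]

For the simple logit case with nonrandom parameters and price coefficient $-\alpha$, $\partial s_j/\partial p_j = -\alpha\, s_j (1 - s_j)$, so the markup is

\[ p_j = mc_j + \frac{1}{\alpha\,(1 - s_j)}. \]

The markup depends on $s_j$, and $s_j$ depends on the unobserved $f_j$, which is one way to see why price is correlated with the demand unobservable.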
Part 24: Stated Choice [83/117]
Instrumental Variables
(ξ and ω are our f and h, respectively.)
Part 24: Stated Choice [84/117]
Econometrics: Essential Components
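\[ U_{ijt} = x_{jt}'\beta_i + f_{jt} + \varepsilon_{ijt} \]
\[ U_{i0t} = \varepsilon_{i0t} \quad \text{(Outside good)} \]
\[ \beta_i = \beta + \Sigma v_i, \quad \Sigma = \text{diagonal}(\sigma_1, \ldots) \]
$\varepsilon_{ijt}$ ~ Type I extreme value, IID across all choices
\[ \text{Market shares: } s_j(X_t, f_t : \beta_i) = \frac{\exp(x_{jt}'\beta_i + f_{jt})}{1 + \sum_{m=1}^{J_t} \exp(x_{mt}'\beta_i + f_{mt})}, \quad j = 1,\ldots,J_t \]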
Part 24: Stated Choice [85/117]
Econometrics
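\[ \text{Market Shares: } s_j(X_t, f_t : \beta_i) = \frac{\exp(x_{jt}'\beta_i + f_{jt})}{1 + \sum_{m=1}^{J_t} \exp(x_{mt}'\beta_i + f_{mt})}, \quad j = 1,\ldots,J_t \]
\[ \text{Expected Share: } E[s_j(X_t, f_t : \beta, \Sigma)] = \int_{\beta_i} \frac{\exp(x_{jt}'\beta_i + f_{jt})}{1 + \sum_{m=1}^{J_t} \exp(x_{mt}'\beta_i + f_{mt})}\, dF(\beta_i) \]
Expected Shares are estimated using simulation:
\[ \hat s_j(X_t, f_t : \beta, \Sigma) = \frac{1}{R}\sum_{r=1}^{R} \frac{\exp[x_{jt}'(\beta + \Sigma v_{ir}) + f_{jt}]}{1 + \sum_{m=1}^{J_t} \exp[x_{mt}'(\beta + \Sigma v_{ir}) + f_{mt}]} \]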
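The simulated share is an average of logit shares over draws of the random coefficient. The sketch below assumes one scalar attribute, a scalar $\sigma$ in place of the matrix $\Sigma$, and made-up values for x and f; the function name is hypothetical.

```python
import math
import random

def simulated_shares(x, f, beta, sigma, draws):
    """Average logit shares over draws v_r, with beta_i = beta + sigma * v_r.

    x[j] is a scalar attribute and f[j] the unobserved attribute of brand j.
    The outside good has utility normalized to zero (the '1 +' in the denominator).
    """
    J = len(x)
    shares = [0.0] * J
    for v in draws:
        b = beta + sigma * v
        expu = [math.exp(x[j] * b + f[j]) for j in range(J)]
        denom = 1.0 + sum(expu)
        for j in range(J):
            shares[j] += expu[j] / denom
    return [s / len(draws) for s in shares]

random.seed(0)
draws = [random.gauss(0.0, 1.0) for _ in range(500)]
x = [1.0, 2.0, 0.5]        # hypothetical attribute levels
f = [0.2, -0.1, 0.0]       # hypothetical unobserved attributes

s_hat = simulated_shares(x, f, beta=-1.0, sigma=0.5, draws=draws)
outside_share = 1.0 - sum(s_hat)
print([round(s, 4) for s in s_hat], round(outside_share, 4))
```

With sigma set to zero the function reduces to the closed-form logit shares, which is a convenient check on an implementation.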
Part 24: Stated Choice [86/117]
GMM Estimation Strategy - 1
\[ \hat s_{jt}(X_t, f_t : \beta, \Sigma) = \frac{1}{R}\sum_{r=1}^{R} \frac{\exp[x_{jt}'(\beta + \Sigma v_{ir}) + f_{jt}]}{1 + \sum_{m=1}^{J_t} \exp[x_{mt}'(\beta + \Sigma v_{ir}) + f_{mt}]} \]
We have instruments $z_{jt}$ such that
\[ E[f_{jt}(\beta, \Sigma)\, z_{jt}] = 0 \]
$f_{jt}$ is obtained from an inverse mapping by equating the fitted market shares, $\hat s_t$, to the observed market shares, $S_t$.
\[ \hat s_t(X_t, f_t : \beta, \Sigma) = S_t \quad \text{so} \quad \hat f_t = \hat s_t^{-1}(X_t, S_t : \beta, \Sigma). \]
Part 24: Stated Choice [87/117]
GMM Estimation Strategy - 2
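We have instruments $z_{jt}$ such that
\[ E[f_{jt}(\beta, \Sigma)\, z_{jt}] = 0 \]
\[ \hat s_t(X_t, f_t : \beta, \Sigma) = S_t \quad \text{so} \quad \hat f_t = \hat s_t^{-1}(X_t, S_t : \beta, \Sigma). \]
\[ \text{Define } \hat g_t = \frac{1}{J_t}\sum_{j=1}^{J_t} \hat f_{jt}\, z_{jt} \]
\[ \text{GMM Criterion would be } \hat Q_t(\beta, \Sigma) = \hat g_t'\, W\, \hat g_t \]
where W = the weighting matrix for minimum distance estimation.
For the entire sample, the GMM estimator is built on
\[ \hat g = \frac{1}{T}\sum_{t=1}^{T} \frac{1}{J_t}\sum_{j=1}^{J_t} \hat f_{jt}\, z_{jt} \quad \text{and} \quad Q(\beta, \Sigma) = \hat g'\, W\, \hat g \]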
Part 24: Stated Choice [88/117]
BLP Iteration
Begin with starting values for $f_t = \hat f_t^{(0)}$ and starting values for structural parameters $\beta$ and $\Sigma$.
Compute predicted shares $\hat s_t^{(M-1)}(X_t, \hat f_t^{(M-1)} : \hat\beta^{(M-1)}, \hat\Sigma^{(M-1)})$.
INNER (Contraction Mapping) Find a fixed point for
\[ \hat f_t^{(M)} = \hat f_t^{(M-1)} + \log(S_t) - \log[\hat s_t^{(M-1)}(X_t, \hat f_t^{(M-1)} : \hat\beta^{(M-1)}, \hat\Sigma^{(M-1)})] = \hat f_t^{(M)}(\hat f_t^{(M-1)}, \hat\beta^{(M-1)}, \hat\Sigma^{(M-1)}) \]
OUTER (GMM Step) With $\hat f_t^{(M)}$ in hand, use GMM to (re)estimate $\hat\beta^{(M)}, \hat\Sigma^{(M)}$.
Return to INNER step or exit if $\hat f_t^{(M)} - \hat f_t^{(M-1)}$ is sufficiently small.
GMM step is straightforward - concave function (quadratic form) of a concave function (logit probability).
Solving the INNER step is time consuming and very complicated. Recent research has produced several alternative algorithms.
Overall complication: The estimates $\hat f_t^{(M)}$ can diverge.
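The INNER step can be sketched on its own. The code below holds $\beta$ and a scalar $\sigma$ fixed, generates "observed" shares from known values of f, and recovers f by iterating the contraction. It covers only the inner loop under these assumptions; the data, the four fixed draws and the function names are hypothetical.

```python
import math

def shares(x, f, beta, sigma, draws):
    """Simulated logit shares with beta_i = beta + sigma * v_r and an outside good."""
    J = len(x)
    out = [0.0] * J
    for v in draws:
        b = beta + sigma * v
        expu = [math.exp(x[j] * b + f[j]) for j in range(J)]
        denom = 1.0 + sum(expu)
        for j in range(J):
            out[j] += expu[j] / denom
    return [s / len(draws) for s in out]

def invert_shares(S, x, beta, sigma, draws, tol=1e-12, max_iter=1000):
    """Iterate f <- f + log(S) - log(s_hat(f)) until the update is negligible."""
    f = [0.0] * len(S)
    for _ in range(max_iter):
        s_hat = shares(x, f, beta, sigma, draws)
        f_new = [f[j] + math.log(S[j]) - math.log(s_hat[j]) for j in range(len(S))]
        if max(abs(a - b) for a, b in zip(f_new, f)) < tol:
            return f_new
        f = f_new
    raise RuntimeError("contraction did not converge")

draws = [-1.5, -0.5, 0.5, 1.5]     # fixed draws, for reproducibility
x = [1.0, 2.0, 0.5]                # hypothetical attribute levels
f_true = [0.2, -0.1, 0.0]          # hypothetical unobserved attributes

S_obs = shares(x, f_true, beta=-1.0, sigma=0.5, draws=draws)
f_hat = invert_shares(S_obs, x, beta=-1.0, sigma=0.5, draws=draws)
print([round(v, 6) for v in f_hat])
```

In the full procedure this inversion would be repeated inside every evaluation of the GMM criterion, which is the source of the computational burden noted above.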
Part 24: Stated Choice [89/117]
ABLP Iteration
Notation: ξt is our ft; θ is our (β,Σ); no superscript is our (M); superscript 0 is our (M-1).
Part 24: Stated Choice [90/117]
Side Results
Part 24: Stated Choice [91/117]
ABLP Iterative Estimator
Part 24: Stated Choice [92/117]
BLP Design Data
Part 24: Stated Choice [93/117]
Exogenous price and nonrandom parameters
Part 24: Stated Choice [94/117]
IV Estimation
Part 24: Stated Choice [95/117]
Full Model
Part 24: Stated Choice [96/117]
Some Elasticities
Part 24: Stated Choice [97/117]
Fixed Effects Multinomial Logit: Application of Minimum Distance Estimation
Part 24: Stated Choice [98/117]
Binary Logit Conditional Probabilities
$$\text{Prob}(y_{it}=1 \mid x_{it}) = \frac{e^{\alpha_i + x_{it}'\beta}}{1 + e^{\alpha_i + x_{it}'\beta}}.$$
$$\text{Prob}\Big(Y_{i1}=y_{i1}, Y_{i2}=y_{i2}, \ldots, Y_{iT_i}=y_{iT_i} \,\Big|\, \sum_{t=1}^{T_i} y_{it}\Big)
= \frac{\exp\Big(\sum_{t=1}^{T_i} y_{it}\,x_{it}'\beta\Big)}{\sum_{\Sigma_t d_{it}=S_i}\exp\Big(\sum_{t=1}^{T_i} d_{it}\,x_{it}'\beta\Big)}
= \frac{\exp\Big(\sum_{t=1}^{T_i} y_{it}\,x_{it}'\beta\Big)}{\sum_{\text{All } \binom{T_i}{S_i} \text{ different ways that } \Sigma_t d_{it} \text{ can equal } S_i}\exp\Big(\sum_{t=1}^{T_i} d_{it}\,x_{it}'\beta\Big)}.$$
The denominator is summed over all the different combinations of $T_i$ values of $y_{it}$ that sum to the same sum as the observed $\sum_{t=1}^{T_i} y_{it}$. If $S_i$ is this sum, there are $\binom{T_i}{S_i}$ terms. May be a huge number. An algorithm by Krailo and Pike makes it simple.
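The denominator does not have to be enumerated term by term. The sketch below uses the standard one-period-at-a-time recursion for this kind of sum and checks it against brute-force enumeration. Whether it matches the Krailo and Pike algorithm in every detail is an assumption, and the function names and the simulated values of x_it'β are hypothetical.

```python
from itertools import combinations
import numpy as np

def denom_bruteforce(w, s):
    """Sum over all C(T, s) ways to place s ones among T periods."""
    return sum(np.prod([w[t] for t in ones])
               for ones in combinations(range(len(w)), s))

def denom_recursive(w, s):
    """Same sum in O(T * s) operations.

    b[k] holds the sum over sequences with k ones among the periods seen so far;
    adding a period with weight w_t gives b_new[k] = b[k] + w_t * b[k - 1].
    """
    b = np.zeros(s + 1)
    b[0] = 1.0
    for w_t in w:
        for k in range(s, 0, -1):       # descend so b[k - 1] is still the old value
            b[k] += w_t * b[k - 1]
    return b[s]

rng = np.random.default_rng(1)
xb = rng.normal(size=7)                 # hypothetical values of x_it' beta, T = 7
w = np.exp(xb)                          # w_t = exp(x_it' beta)
y = np.array([1, 0, 0, 0, 1, 1, 1])
# Log of the conditional probability of the observed sequence.
log_cond_prob = y @ xb - np.log(denom_recursive(w, int(y.sum())))
```

The recursion needs on the order of T × S operations rather than one term per combination, which is the difference that matters when T is large.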
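Example: Seven-Period Binary Logit
$$\text{Prob}[y = (1,0,0,0,1,1,1) \mid X_i] = \frac{\exp(\alpha_i + x_{i1}'\beta)}{1+\exp(\alpha_i + x_{i1}'\beta)} \times \frac{1}{1+\exp(\alpha_i + x_{i2}'\beta)} \times \cdots \times \frac{\exp(\alpha_i + x_{i7}'\beta)}{1+\exp(\alpha_i + x_{i7}'\beta)}$$
There are 35 different sequences of $y_{it}$ (permutations) that sum to 4. For example, $y^*_{it|p=1}$ might be (1,1,1,1,0,0,0). Etc.
$$\text{Prob}\Big[y = (1,0,0,0,1,1,1) \,\Big|\, X_i, \sum_{t=1}^{7} y_{it} = 4\Big] = \frac{\exp\Big(\sum_{t=1}^{7} y_{it}\,x_{it}'\beta\Big)}{\sum_{p=1}^{35}\exp\Big(\sum_{t=1}^{7} y^*_{it|p}\,x_{it}'\beta\Big)}$$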
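A short numerical check makes the point of conditioning concrete: the unconditional probability of the sequence depends on α_i, while the conditional probability does not. The sketch enumerates all 35 sequences directly. The function names and the values used for x_it'β are hypothetical.

```python
from itertools import combinations
from math import comb
import numpy as np

def sequence_prob(y, xb, alpha):
    """Unconditional probability of the sequence y under the binary logit."""
    p = 1.0 / (1.0 + np.exp(-(alpha + xb)))
    return float(np.prod(np.where(y == 1, p, 1.0 - p)))

def conditional_prob(y, xb, alpha):
    """Probability of y given sum(y), enumerating every sequence with that sum."""
    T, S = len(y), int(y.sum())
    total = 0.0
    for ones in combinations(range(T), S):
        d = np.zeros(T, dtype=int)
        d[list(ones)] = 1
        total += sequence_prob(d, xb, alpha)
    return sequence_prob(y, xb, alpha) / total

y = np.array([1, 0, 0, 0, 1, 1, 1])
xb = np.linspace(-1.0, 1.0, 7)          # hypothetical values of x_it' beta
n_sequences = comb(7, 4)                # number of terms in the denominator
p_a = conditional_prob(y, xb, alpha=-2.0)
p_b = conditional_prob(y, xb, alpha=3.0)
```

Here `p_a` and `p_b` agree to rounding error even though the two values of `alpha` give very different unconditional probabilities.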
With T = 50, the number of distinct sequences of y with a given sum ranges from 1 for sum = 0 or 50, to $2.3 \times 10^{12}$ for sum = 15 or 35, up to a maximum of $1.3 \times 10^{14}$ for sum = 25.
These are the numbers of terms that must be summed for a model
with T = 50. In the application below, the sum ranges from 15 to 35.
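These counts are binomial coefficients and can be reproduced directly. A minimal sketch; the variable names are hypothetical.

```python
from math import comb

T = 50
# Number of 0/1 sequences of length T with exactly s ones, for every s.
terms = {s: comb(T, s) for s in range(T + 1)}
largest = max(terms, key=terms.get)     # sum with the most sequences
```

`terms[15]`, `terms[35]` and `terms[25]` give the orders of magnitude quoted above.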
The sample is 200 individuals each observed 50 times.
The data are generated from a probit process with b1 = b2 = 0.5 but are fit as a logit model. The coefficients obey the familiar relationship: logit ≈ 1.6 × probit.
Multinomial Logit Model: J+1 choices including a base choice.
$y_{itj} = 1$ if individual i makes choice j in period t
$$\text{Prob}(y_{itj}=1 \mid x_{itj}) = \frac{e^{\alpha_{ij} + x_{itj}'\beta}}{1 + \sum_{m=1}^{J} e^{\alpha_{im} + x_{itm}'\beta}}, \quad j = 1,\ldots,J.$$
$$\text{Prob}(y_{it0}=1 \mid x_{it0}) = \frac{1}{1 + \sum_{m=1}^{J} e^{\alpha_{im} + x_{itm}'\beta}}.$$
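The two probability formulas share one denominator, with the base alternative normalized to utility 0. A minimal sketch for a single individual and choice situation follows. The function name `mnl_probs` and the numerical inputs are hypothetical.

```python
import numpy as np

def mnl_probs(alpha, xb):
    """Choice probabilities for the base alternative (index 0) and j = 1..J.

    alpha : (J,) alternative-specific effects alpha_ij for one individual
    xb    : (J,) values of x_itj' beta for one choice situation
    """
    v = alpha + xb
    m = max(float(v.max()), 0.0)         # guard against overflow in exp
    e = np.exp(v - m)
    base = np.exp(-m)                    # base utility is normalized to 0
    return np.concatenate(([base], e)) / (base + e.sum())

p_equal = mnl_probs(np.zeros(2), np.zeros(2))                      # J = 2, all utilities 0
p = mnl_probs(np.array([0.3, -0.5, 1.0]), np.array([0.2, 0.1, -0.4]))
```

With all utilities equal to zero the three alternatives are equally likely, and in general the ratio of `p[j]` to `p[0]` equals exp(α_ij + x_itj'β).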
The probability attached to the sequence of choices is remarkably complicated.
$$\frac{\prod_{j=1}^{J}\exp\Big(\sum_{t=1}^{T_i} y_{itj}\,x_{itj}'\beta\Big)}{\prod_{j=1}^{J}\sum_{\Sigma_t d_{itj}=S_{ij}}\exp\Big(\sum_{t=1}^{T_i} d_{itj}\,x_{itj}'\beta\Big)}
= \frac{\prod_{j=1}^{J}\exp\Big(\sum_{t=1}^{T_i} y_{itj}\,x_{itj}'\beta\Big)}{\prod_{j=1}^{J}\sum_{\text{All } \binom{T_i}{S_{ij}} \text{ different ways that } \Sigma_t d_{itj} \text{ can equal } S_{ij}}\exp\Big(\sum_{t=1}^{T_i} d_{itj}\,x_{itj}'\beta\Big)}.$$
The denominator is summed over all the different combinations of $T_i$ values of $y_{itj}$ that sum to the same sum as the observed $\sum_{t=1}^{T_i} y_{itj}$. If $S_{ij}$ is this sum, there are $\binom{T_i}{S_{ij}}$ terms. May be a huge number. Larger yet by summing over choices.
Estimation Strategy
• Conditional ML of the full MNL model. Impressively complicated.
• A Minimum Distance (MDE) Strategy
  • Each alternative treated as a binary choice vs. the base provides an estimator of β
    • Select subsample that chose either option j or the base
    • Estimate β using this binary choice setting
    • This provides J different estimators of the same β
  • Optimally combine the different estimators of β
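The subsample selection in this strategy is mechanical. The sketch below shows only that step. It assumes choices are coded 0 for the base and 1..J for the other alternatives; the function name and the toy choice vector are hypothetical. Each subsample would then be passed to a fixed effects binary logit estimator, which is not shown here.

```python
import numpy as np

def binary_subsample(choice, j):
    """Rows where the choice is alternative j or the base (coded 0).

    Returns the row indices kept and the binary outcome (1 if j was chosen).
    """
    keep = np.flatnonzero((choice == j) | (choice == 0))
    return keep, (choice[keep] == j).astype(int)

choice = np.array([0, 2, 1, 1, 0, 3, 2, 0, 1])     # hypothetical choices, J = 3
subsamples = {j: binary_subsample(choice, j) for j in (1, 2, 3)}
```

Every subsample contains the base-choice rows, so the J binary estimators are built from overlapping observations.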
Minimum Distance Estimation
There are J estimators $\hat\beta_j$ of the same parameter vector, $\beta$.
Each estimator is consistent and asymptotically normal.
Estimated covariance matrices $\hat{V}_j$. How to combine the estimators?
MDE: Minimize wrt $\hat\beta_*$
$$q = \begin{bmatrix}\hat\beta_1 - \hat\beta_* \\ \hat\beta_2 - \hat\beta_* \\ \vdots \\ \hat\beta_J - \hat\beta_*\end{bmatrix}' W \begin{bmatrix}\hat\beta_1 - \hat\beta_* \\ \hat\beta_2 - \hat\beta_* \\ \vdots \\ \hat\beta_J - \hat\beta_*\end{bmatrix}$$
What to use for the weighting matrix W? Any positive definite matrix will do.
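MDE Estimation
Estimated covariance matrices $\hat{V}_j$. How to combine the estimators?
MDE: Minimize wrt $\hat\beta_*$
$$q = \begin{bmatrix}\hat\beta_1 - \hat\beta_* \\ \hat\beta_2 - \hat\beta_* \\ \vdots \\ \hat\beta_J - \hat\beta_*\end{bmatrix}' W \begin{bmatrix}\hat\beta_1 - \hat\beta_* \\ \hat\beta_2 - \hat\beta_* \\ \vdots \\ \hat\beta_J - \hat\beta_*\end{bmatrix}.$$
Propose a GLS approach:
$$W = A^{-1} = \begin{bmatrix}\hat{V}_1 & 0 & \cdots & 0 \\ 0 & \hat{V}_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \hat{V}_J\end{bmatrix}^{-1}$$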
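MDE Estimation
MDE: Minimize wrt $\hat\beta_*$
$$q = \begin{bmatrix}\hat\beta_1 - \hat\beta_* \\ \hat\beta_2 - \hat\beta_* \\ \vdots \\ \hat\beta_J - \hat\beta_*\end{bmatrix}' \begin{bmatrix}\hat{V}_1 & 0 & \cdots & 0 \\ 0 & \hat{V}_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \hat{V}_J\end{bmatrix}^{-1} \begin{bmatrix}\hat\beta_1 - \hat\beta_* \\ \hat\beta_2 - \hat\beta_* \\ \vdots \\ \hat\beta_J - \hat\beta_*\end{bmatrix}.$$
The solution is
$$\hat\beta_* = \Big[\hat{V}_1^{-1} + \hat{V}_2^{-1} + \cdots + \hat{V}_J^{-1}\Big]^{-1}\Big[\hat{V}_1^{-1}\hat\beta_1 + \hat{V}_2^{-1}\hat\beta_2 + \cdots + \hat{V}_J^{-1}\hat\beta_J\Big]
= \Big[\sum_{j=1}^{J}\hat{V}_j^{-1}\Big]^{-1}\Big[\sum_{j=1}^{J}\hat{V}_j^{-1}\hat\beta_j\Big]
= \sum_{j=1}^{J} H_j\hat\beta_j \quad\text{where}\quad \sum_{j=1}^{J} H_j = I.$$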
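The closed-form solution is a matrix-weighted average and is easy to compute once the J estimates and their covariance matrices are available. The sketch below follows the formula as written, including its block-diagonal weighting, which treats the J estimators as uncorrelated. The function name `mde_combine` and the numerical inputs are hypothetical.

```python
import numpy as np

def mde_combine(betas, covs):
    """Combine J estimates using W = blockdiag(V_1, ..., V_J)^{-1}.

    Returns beta_star and the weight matrices H_j, which sum to the identity.
    """
    precisions = [np.linalg.inv(V) for V in covs]
    A = np.linalg.inv(sum(precisions))           # (sum_j V_j^{-1})^{-1}
    H = [A @ P for P in precisions]              # H_j = A V_j^{-1}
    beta_star = sum(H_j @ b for H_j, b in zip(H, betas))
    return beta_star, H

betas = [np.array([0.9, 0.4]), np.array([1.1, 0.6])]       # hypothetical estimates
covs = [np.diag([0.04, 0.01]), np.diag([0.01, 0.04])]      # hypothetical covariances
beta_star, H = mde_combine(betas, covs)
```

Each element of `beta_star` is pulled toward the estimate with the smaller variance for that coefficient, which is the sense in which the combination is precision weighted.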
Why a 500-fold increase in speed?
• MDE is much faster
• Not using Krailo and Pike, or not using efficiently
• Numerical derivatives for an extremely messy function (increase the number of function evaluations by at least 5 times)