Latent Class Models

1/68: Topic 4.2 – Latent Class Models
Microeconometric Modeling
William Greene
Stern School of Business
New York University
New York NY USA
4.2 Latent Class Models
Concepts
• Latent Class
• Prior and Posterior Probabilities
• Classification Problem
• Finite Mixture
• Normal Mixture
• Health Satisfaction
• EM Algorithm
• ZIP Model
• Hurdle Model
• NB2 Model
• Obesity/BMI Model
• Misreporter
• Value of Travel Time Saved
• Decision Strategy
Models
• Multinomial Logit Model
• Latent Class MNL
• Heckman – Singer Model
• Latent Class Ordered Probit
• Attribute Nonattendance Model
• 2^K Model
Latent Classes
A population contains a mixture of individuals of different types (classes)
• Common form of the data generating mechanism within the classes
• Observed outcome y is governed by the common process $F(y \mid x, \theta_j)$
• Classes are distinguished by the parameters, $\theta_j$.

Unmixing a Mixed Sample
Sample  ; 1-1000$
Calc    ; Ran(123457)$
Create  ; lc1=rnn(1,1) ;lc2=rnn(5,1)$
Create  ; class=rnu(0,1)$
Create  ; if(class<.3)ylc=lc1 ; (else)ylc=lc2$
Kernel  ; rhs=ylc $
Regress ; lhs=ylc;rhs=one;lcm;pts=2;pds=1$
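The same experiment can be sketched in Python for readers without LIMDEP. This is an illustration under stated assumptions: NumPy's random stream differs from LIMDEP's, so the draws will not match, and the fit uses a plain EM loop (swapped in here) rather than whatever optimizer the `lcm` estimator uses. The function names are hypothetical.

```python
import numpy as np

def simulate_mixture(n=1000, seed=123457):
    """Draw y from N(1,1) with probability .3, otherwise from N(5,1)."""
    rng = np.random.default_rng(seed)
    in_class1 = rng.uniform(size=n) < 0.3
    return np.where(in_class1, rng.normal(1.0, 1.0, n), rng.normal(5.0, 1.0, n))

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_two_normals(y, iters=200):
    """EM for a two-component normal mixture; returns (pi, mu, sigma)."""
    pi = np.array([0.5, 0.5])
    mu = np.quantile(y, [0.25, 0.75])          # crude starting values
    sigma = np.array([y.std(), y.std()])
    for _ in range(iters):
        # E step: posterior class weights, shape (n, 2)
        dens = pi * normal_pdf(y[:, None], mu, sigma)
        w = dens / dens.sum(axis=1, keepdims=True)
        # M step: weighted proportions, means and standard deviations
        nk = w.sum(axis=0)
        pi = nk / len(y)
        mu = (w * y[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((w * (y[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma

y = simulate_mixture()
pi_hat, mu_hat, sigma_hat = fit_two_normals(y)
```

With components four standard deviations apart, the fitted means should land near 1 and 5 and the first mixing probability near .3.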
Mixture of Normals
How Finite Mixture Models Work
Find the ‘Best’ Fitting Mixture of Two Normal Densities
$\log L = \sum_{i=1}^{1000} \log \sum_{j=1}^{2} \pi_j \, \frac{1}{\sigma_j} \, \phi\!\left(\frac{y_i - \mu_j}{\sigma_j}\right)$
Maximum Likelihood Estimates
          Class 1                  Class 2
      Estimate   Std. Error    Estimate   Std. Error
μ     7.05737     .77151       3.25966     .09824
σ     3.79628     .25395       1.81941     .10858
π      .28547     .05953        .71453     .05953
$\hat F(y) = .28547 \, \frac{1}{3.79628} \, \phi\!\left(\frac{y - 7.05737}{3.79628}\right) + .71453 \, \frac{1}{1.81941} \, \phi\!\left(\frac{y - 3.25966}{1.81941}\right)$
Mixing probabilities .715 and .285
[Figure: approximation vs. actual distribution]
A Practical Distinction

Finite Mixture (Discrete Mixture):
• Functional form strategy
• Component densities have no meaning
• Mixing probabilities have no meaning
• There is no question of “class membership”
• The number of classes is uninteresting – enough to get a good fit
Latent Class:
• Mixture of subpopulations
• Component densities are believed to be definable “groups” (Low Users and High Users in Bago d’Uva and Jones application)
• The classification problem is interesting – who is in which class?
• Posterior probabilities, P(class|y,x) have meaning
• Question of the number of classes has content in the context of the analysis
The Latent Class Model
(1) There are Q classes, unobservable to the analyst
(2) Class specific model: $f(y_{it} \mid x_{it}, \text{class} = q) = g(y_{it}, x_{it}, \beta_q)$
(3) Conditional class probabilities
Common multinomial logit form for prior class probabilities
$P(\text{class} = q \mid \delta) = \pi_{iq} = \dfrac{\exp(\delta_q)}{\sum_{c=1}^{Q} \exp(\delta_c)}, \quad \delta_Q = 0$
$\delta_q = \log(\pi_q / \pi_Q)$.
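The logit form for the prior class probabilities and its inverse, δ_q = log(π_q/π_Q), can be checked with a short round trip. This is a minimal sketch; the function names are hypothetical, and the probabilities are the mixing probabilities reported for the two-normal example.

```python
import numpy as np

def class_probs(delta_free):
    """Map Q-1 free parameters to Q probabilities, with delta_Q fixed at 0."""
    delta = np.append(np.asarray(delta_free, dtype=float), 0.0)
    expd = np.exp(delta - delta.max())   # subtract the max for numerical stability
    return expd / expd.sum()

def deltas_from_probs(pi):
    """Inverse map: delta_q = log(pi_q / pi_Q) for q = 1..Q-1."""
    pi = np.asarray(pi, dtype=float)
    return np.log(pi[:-1] / pi[-1])

pi_slide = np.array([0.28547, 0.71453])  # mixing probabilities from the two-normal example
delta_1 = deltas_from_probs(pi_slide)
pi_back = class_probs(delta_1)
```

Estimating the unrestricted δ rather than π directly keeps every probability inside (0,1) and makes them sum to one without constraints.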
Log Likelihood for an LC Model
Conditional density for each observation is
$P(y_{i,t} \mid x_{i,t}, \text{class} = q) = f(y_{it} \mid x_{i,t}, \beta_q)$
Joint conditional density for $T_i$ observations is
$f(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid X_i, \beta_q) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_q)$
($T_i$ may be 1. This is not only a 'panel data' model.)
Maximize this for each class if the classes are known.
They aren't. Unconditional density for individual i is
$f(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid X_i) = \sum_{q=1}^{Q} \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_q)$
Log Likelihood
$\log L(\beta_1, \ldots, \beta_Q, \delta_1, \ldots, \delta_Q) = \sum_{i=1}^{N} \log \sum_{q=1}^{Q} \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_q)$
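The log likelihood can be evaluated directly once the class-specific log densities are available. The sketch below assumes those log densities have already been computed for every row and class; the function name and array layout are hypothetical. It works in logs throughout so that the product over t does not underflow.

```python
import numpy as np

def lc_loglik(logf, log_pi, ids):
    """
    logf   : (n_obs, Q) array, log f(y_it | x_it, beta_q) for each row and class
    log_pi : (Q,) array of log prior class probabilities
    ids    : (n_obs,) integer individual labels 0..N-1 (T_i rows per individual)
    """
    ids = np.asarray(ids)
    n_ind = ids.max() + 1
    # Sum over t within each individual: log of the product over t, per class
    joint = np.zeros((n_ind, logf.shape[1]))
    np.add.at(joint, ids, logf)
    a = joint + log_pi                       # log[pi_q * prod_t f(.)]
    m = a.max(axis=1, keepdims=True)
    # log-sum-exp over classes, then sum over individuals
    return float((m[:, 0] + np.log(np.exp(a - m).sum(axis=1))).sum())

# Tiny check: individual 0 has two observations, individual 1 has one
logf = np.log(np.array([[0.5, 0.1], [0.2, 0.4], [0.3, 0.3]]))
ll = lc_loglik(logf, np.log(np.array([0.4, 0.6])), [0, 0, 1])
```

For these inputs the mixed densities are .4(.1) + .6(.04) = .064 and .3, so the log likelihood is log(.0192).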
Estimating Which Class
Prior class probability Prob[class=q] = $\pi_q$
Joint conditional density for $T_i$ observations is
$P(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid X_i, \text{class} = q) = \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_q)$
Joint density for data and class membership is the product
$P(y_{i1}, y_{i2}, \ldots, y_{i,T_i}, \text{class} = q \mid X_i) = \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_q)$
Posterior probability for class, given the data
$P(\text{class} = q \mid y_{i1}, y_{i2}, \ldots, y_{i,T_i}, X_i) = \dfrac{P(y_i, \text{class} = q \mid X_i)}{P(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid X_i)} = \dfrac{P(y_i, \text{class} = q \mid X_i)}{\sum_{c=1}^{Q} P(y_i, \text{class} = c \mid X_i)}$
Use Bayes Theorem to compute the posterior (conditional) probability
$w(q \mid y_i, X_i) = P(\text{class} = q \mid y_i, X_i) = \dfrac{\pi_q \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_q)}{\sum_{c=1}^{Q} \pi_c \prod_{t=1}^{T_i} f(y_{it} \mid x_{i,t}, \beta_c)} = w_{iq}$
Best guess = the class with the largest posterior probability.
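The Bayes calculation is a row-wise normalization of prior times joint density. A minimal sketch, assuming the joint log densities per individual and class are already in hand (the function name and the numbers are hypothetical):

```python
import numpy as np

def posterior_class_probs(joint_logf, pi):
    """
    joint_logf : (N, Q) array, sum over t of log f(y_it | x_it, beta_q)
    pi         : (Q,) prior class probabilities
    Returns w  : (N, Q) posterior probabilities w_iq (rows sum to 1).
    """
    a = joint_logf + np.log(pi)
    a = a - a.max(axis=1, keepdims=True)     # stabilise before exponentiating
    w = np.exp(a)
    return w / w.sum(axis=1, keepdims=True)

joint_logf = np.log(np.array([[0.10, 0.04],
                              [0.30, 0.30]]))
w = posterior_class_probs(joint_logf, np.array([0.4, 0.6]))
best_class = w.argmax(axis=1)    # class with the largest posterior probability
```

The second individual's data are equally likely under both classes, so the posterior equals the prior (.4, .6); the first individual's data move the weight toward class 1.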
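Posterior for Normal Mixture
$\hat w(q \mid \text{data}_i) = \hat w(q \mid i) = \dfrac{\hat\pi_q \prod_{t=1}^{T_i} \frac{1}{\hat\sigma_q} \phi\!\left(\frac{y_{it} - \hat\mu_q}{\hat\sigma_q}\right)}{\sum_{c=1}^{Q} \hat\pi_c \prod_{t=1}^{T_i} \frac{1}{\hat\sigma_c} \phi\!\left(\frac{y_{it} - \hat\mu_c}{\hat\sigma_c}\right)}$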
Estimated Posterior Probabilities
More Difficult When the
Populations are Close Together
The Technique Still Works
---------------------------------------------------------------------
Latent Class / Panel LinearRg Model
Dependent variable                   YLC
Sample is 1 pds and 1000 individuals
LINEAR regression model
Model fit with 2 latent classes.
--------+------------------------------------------------------------
Variable| Coefficient    Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Model parameters for latent class 1
Constant|   2.93611***      .15813       18.568     .0000
   Sigma|   1.00326***      .07370       13.613     .0000
        |Model parameters for latent class 2
Constant|    .90156***      .28767        3.134     .0017
   Sigma|    .86951***      .10808        8.045     .0000
        |Estimated prior probabilities for class membership
Class1Pr|    .73447***      .09076        8.092     .0000
Class2Pr|    .26553***      .09076        2.926     .0034
--------+-------------------------------------------------------------
‘Estimating’ $\beta_i$
(1) Use $\hat\beta_j$ from the class with the largest estimated probability
(2) Probabilistic - in the same spirit as the 'posterior mean'
$\hat\beta_i = \sum_{q=1}^{Q} \text{Posterior Prob}[\text{class} = q \mid \text{data}_i] \, \hat\beta_q = \sum_{q=1}^{Q} \hat w_{iq} \hat\beta_q$
Note: This estimates $E[\beta_i \mid y_i, X_i]$, not $\beta_i$ itself.
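Both approaches reduce to simple array operations once the posterior weights exist. In this sketch the posterior weights are hypothetical; the two class constants are the estimates reported in the two-class linear model output.

```python
import numpy as np

def individual_betas(w, beta_by_class):
    """w: (N, Q) posterior weights; beta_by_class: (Q, K). Returns (N, K)."""
    return w @ beta_by_class

w = np.array([[0.9, 0.1],
              [0.2, 0.8]])                       # hypothetical posterior weights
beta_hat = np.array([[2.93611], [0.90156]])      # class constants from the two-class output
beta_i = individual_betas(w, beta_hat)           # approach (2): posterior-weighted average
beta_modal = beta_hat[w.argmax(axis=1)]          # approach (1): most probable class
```

Approach (2) never returns exactly a class value unless one posterior weight is 1, which is why it is read as an estimate of the conditional mean rather than of the individual's own parameter.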
How Many Classes?
(1) Q is not a 'parameter' - can't 'estimate' Q with π and β
(2) Can't 'test' down or 'up' to Q by comparing log likelihoods. The degrees of freedom for Q+1 vs. Q classes are not well defined.
(3) Use AKAIKE IC; AIC = -2 logL + 2(#Parameters).
For our mixture of normals problem,
AIC_1 = 10827.88
AIC_2 = 9954.268 (smallest)
AIC_3 = 9958.756
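The selection rule is to fit each candidate Q and keep the one with the smallest AIC. A small sketch using the three values reported above (the helper name is hypothetical):

```python
def aic(loglik, n_params):
    """Akaike information criterion: -2 logL + 2 * (number of parameters)."""
    return -2.0 * loglik + 2.0 * n_params

# AIC values reported above for 1, 2 and 3 classes
reported = {1: 10827.88, 2: 9954.268, 3: 9958.756}
best_q = min(reported, key=reported.get)
```

Here `best_q` is 2: the third class improves the log likelihood by less than the penalty for its extra parameters.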
LCM for Health Status
• Self Assessed Health Status = 0,1,…,10
• Recoded: Healthy = HSAT > 6
• Using only groups observed T=7 times; N=887
• Prob = Φ(Age,Educ,Income,Married,Kids)
• 2, 3 classes
Too Many Classes
Two Class Model
---------------------------------------------------------------------
Latent Class / Panel Probit Model
Dependent variable                 HEALTHY
Unbalanced panel has               887 individuals
PROBIT (normal) probability model
Model fit with 2 latent classes.
--------+------------------------------------------------------------
Variable| Coefficient    Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Model parameters for latent class 1
Constant|    .61652**       .28620        2.154     .0312
     AGE|   -.02466***      .00401       -6.143     .0000     44.3352
    EDUC|    .11759***      .01852        6.351     .0000     10.9409
  HHNINC|    .10713         .20447         .524     .6003      .34930
 MARRIED|    .11705         .09574        1.223     .2215      .84539
  HHKIDS|    .04421         .07017         .630     .5287      .45482
        |Model parameters for latent class 2
Constant|    .18988         .31890         .595     .5516
     AGE|   -.03120***      .00464       -6.719     .0000     44.3352
    EDUC|    .02122         .01934        1.097     .2726     10.9409
  HHNINC|    .61039***      .19688        3.100     .0019      .34930
 MARRIED|    .06201         .10035         .618     .5367      .84539
  HHKIDS|    .19465**       .07936        2.453     .0142      .45482
        |Estimated prior probabilities for class membership
Class1Pr|    .56604***      .02487       22.763     .0000
Class2Pr|    .43396***      .02487       17.452     .0000
Partial Effects in LC Model
---------------------------------------------------------------------
Partial derivatives of expected val. with
respect to the vector of characteristics.
They are computed at the means of the Xs.
Conditional Mean at Sample Point     .6116
Scale Factor for Marginal Effects    .3832
B for latent class model is a wghted avrg.
--------+------------------------------------------------------------
Variable| Coefficient    Standard Error  b/St.Er.  P[|Z|>z]  Elasticity
--------+------------------------------------------------------------
        |Two class latent class model
     AGE|   -.01054***      .00134       -7.860     .0000     -.76377
    EDUC|    .02904***      .00589        4.932     .0000      .51939
  HHNINC|    .12475**       .05598        2.228     .0259      .07124
 MARRIED|    .03570         .02991        1.194     .2326      .04934
  HHKIDS|    .04196**       .02075        2.022     .0432      .03120
--------+------------------------------------------------------------
        |Pooled Probit Model
     AGE|   -.00846***      .00081      -10.429     .0000     -.63399
    EDUC|    .03219***      .00336        9.594     .0000      .59568
  HHNINC|    .16699***      .04253        3.927     .0001      .09865
        |Marginal effect for dummy variable is P|1 - P|0.
 MARRIED|    .02414         .01877        1.286     .1986      .03451
        |Marginal effect for dummy variable is P|1 - P|0.
  HHKIDS|    .06754***      .01483        4.555     .0000      .05195
--------+-------------------------------------------------------------
Conditional Means of Parameters
$\text{Est.}E[\beta_i \mid \text{All information for individual } i] = \sum_{j=1}^{J} \hat w_{ij} \hat\beta_j$
using posterior (conditional) estimated class probabilities.
An Extended Latent Class Model
Class probabilities relate to observable variables (usually demographic factors such as age and sex).
(1) There are Q classes, unobservable to the analyst
(2) Class specific model: $f(y_{it} \mid x_{it}, \text{class} = q) = g(y_{it}, x_{it}, \beta_q)$
(3) Conditional class probabilities (given some information, $z_i$)
Common multinomial logit form for prior class probabilities
$P(\text{class} = q \mid z_i, \delta) = \pi_{iq} = \dfrac{\exp(z_i'\delta_q)}{\sum_{c=1}^{Q} \exp(z_i'\delta_c)}, \quad \delta_Q = 0$
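When the class probabilities depend on z_i, each individual gets their own row of prior probabilities. The sketch below is an illustration only: the two individuals are hypothetical, and the coefficient row reuses the class 1 estimates (ONE_1, AGEBAR_1, FEMALE_1) from the health satisfaction output, with class 2 normalized to zero.

```python
import numpy as np

def class_probs_z(Z, Delta_free):
    """
    Z          : (N, M) covariates z_i, including a column of ones
    Delta_free : (Q-1, M) coefficients for classes 1..Q-1; class Q is normalised to 0
    Returns    : (N, Q) prior probabilities pi_iq
    """
    Delta = np.vstack([Delta_free, np.zeros((1, Z.shape[1]))])
    v = Z @ Delta.T
    v = v - v.max(axis=1, keepdims=True)     # numerical stability
    ev = np.exp(v)
    return ev / ev.sum(axis=1, keepdims=True)

# Hypothetical individuals: columns are (one, mean age, female)
Z = np.array([[1.0, 40.0, 0.0],
              [1.0, 40.0, 1.0]])
Delta_free = np.array([[1.43661, -0.01897, -0.78809]])  # class 1 estimates from the output
pi_iq = class_probs_z(Z, Delta_free)
```

Because the FEMALE coefficient is negative, the second (female) row has a lower prior probability of class 1 than the first.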
Health Satisfaction Model
---------------------------------------------------------------------
Latent Class / Panel Probit Model
Used mean AGE and FEMALE in class probability model
Dependent variable                 HEALTHY
Log likelihood function        -3465.98697
--------+------------------------------------------------------------
Variable| Coefficient    Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
--------+------------------------------------------------------------
        |Model parameters for latent class 1
Constant|    .60050**       .29187        2.057     .0396
     AGE|   -.02002***      .00447       -4.477     .0000     44.3352
    EDUC|    .10597***      .01776        5.968     .0000     10.9409
  HHNINC|    .06355         .20751         .306     .7594      .34930
 MARRIED|    .07532         .10316         .730     .4653      .84539
  HHKIDS|    .02632         .07082         .372     .7102      .45482
        |Model parameters for latent class 2
Constant|    .10508         .32937         .319     .7497
     AGE|   -.02499***      .00514       -4.860     .0000     44.3352
    EDUC|    .00945         .01826         .518     .6046     10.9409
  HHNINC|    .59026***      .19137        3.084     .0020      .34930
 MARRIED|   -.00039         .09478        -.004     .9967      .84539
  HHKIDS|    .20652***      .07782        2.654     .0080      .45482
        |Estimated prior probabilities for class membership
   ONE_1|   1.43661***      .53679        2.676     .0074     (.56519)
AGEBAR_1|   -.01897*        .01140       -1.664     .0960
FEMALE_1|   -.78809***      .15995       -4.927     .0000
   ONE_2|    .000        ......(Fixed Parameter)......       (.43481)
AGEBAR_2|    .000        ......(Fixed Parameter)......
FEMALE_2|    .000        ......(Fixed Parameter)......
--------+-------------------------------------------------------------
Zero Inflation?
Zero Inflation – ZIP Models
• Two regimes: (Recreation site visits)
  • Zero (with probability 1). (Never visit site)
  • Poisson with Pr(0) = exp[-λi], λi = exp(β′xi). (Number of visits, including zero visits this season.)
• Unconditional:
  • Pr[0] = P(regime 0) + P(regime 1)*Pr[0|regime 1]
  • Pr[j] = P(regime 1)*Pr[j|regime 1] for j > 0
• This is a “latent class model”
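The two-regime probabilities can be written as a short function. This is a sketch with hypothetical parameter values; the regime probability and the Poisson mean are passed in directly rather than modeled with covariates.

```python
import math

def zip_pmf(j, p_regime1, lam):
    """P(Y = j) when regime 0 is always zero and regime 1 is Poisson(lam)."""
    poisson = math.exp(-lam) * lam**j / math.factorial(j)
    if j == 0:
        # Zeros come from both regimes
        return (1.0 - p_regime1) + p_regime1 * poisson
    return p_regime1 * poisson

probs = [zip_pmf(j, p_regime1=0.7, lam=2.0) for j in range(60)]  # hypothetical values
```

The probability of a zero exceeds the pure Poisson zero probability, which is the "inflation" the model is named for.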
Hurdle Models
• Two decisions:
  • Whether or not to participate: y=0 or +.
  • If participate, how much. y|y>0
• One ‘regime’ – individual always makes both decisions.
• Implies different models for zeros and positive values
  • Prob(0) = 1 – F(γ′z), Prob(+) = F(γ′z)
  • Prob(y|+) = P(y)/[1 – P(0)]
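A hurdle probability function makes the contrast with zero inflation concrete: zeros come only from the participation decision. The sketch assumes a Poisson count truncated at zero for participants and takes the participation probability as a given number; all values are hypothetical.

```python
import math

def hurdle_pmf(y, p_participate, lam):
    """Hurdle model with a Poisson(lam) count truncated at zero for participants."""
    if y == 0:
        return 1.0 - p_participate
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    # Prob(y | +) = P(y) / [1 - P(0)], scaled by the probability of participating
    return p_participate * poisson / (1.0 - math.exp(-lam))

probs_h = [hurdle_pmf(y, p_participate=0.6, lam=2.0) for y in range(60)]  # hypothetical values
```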
A Latent Class Hurdle NB2 Model
• Analysis of ECHP panel data (1994-2001)
• Two class Latent Class Model
  • Typical in health economics applications
• Hurdle model for physician visits
  • Poisson hurdle for participation and negative binomial intensity given participation
  • Contrast to a negative binomial model
LC Poisson Regression for Doctor Visits
Is the LCM Finding High and Low Users?
Is the LCM Finding High and Low Users?
Apparently So.
Heckman and Singer’s RE Model
• Random Effects Model
• Random Constants with Discrete Distribution
(1) There are Q classes, unobservable to the analyst
(2) Class specific model: $f(y_{it} \mid x_{it}, \text{class} = q) = g(y_{it}, x_{it}, \alpha_q, \beta)$
(3) Conditional class probabilities $\pi_q$
Common multinomial logit form for prior class probabilities to constrain all probabilities to (0,1) and ensure $\sum_{q=1}^{Q} \pi_q = 1$; multinomial logit form for class probabilities:
$P(\text{class} = q \mid \delta) = \pi_q = \dfrac{\exp(\delta_q)}{\sum_{c=1}^{Q} \exp(\delta_c)}, \quad \delta_Q = 0$
3 Class Heckman-Singer Form
Heckman and Singer Binary Choice Model – 3 Points
Heckman/Singer vs. REM
----------------------------------------------------------------------------
Random Effects Binary Probit Model
Sample is 7 pds and 887 individuals.
--------+-------------------------------------------------------------------
        |                Standard             Prob.       95% Confidence
 HEALTHY|  Coefficient     Error        z    |z|>Z*          Interval
--------+-------------------------------------------------------------------
Constant|    .33609       .29252      1.15   .2506     -.23723     .90941
        |  (Other coefficients omitted)
     Rho|    .52565***    .02025     25.96   .0000      .48596     .56534
--------+-------------------------------------------------------------------
Rho = σ²/(1+σ²) so σ² = rho/(1-rho) = 1.10814.
Mean = .33609, Variance = 1.10814
For Heckman and Singer model,
3 points         a1,a2,a3 = 1.82601, .50135, -.75636
3 probabilities  p1,p2,p3 = .31094, .45267, .23639
Mean = .61593, variance = .90642
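The mean and variance quoted for the Heckman and Singer distribution follow directly from the three support points and their probabilities, and the random effects variance follows from rho. A short check (variable names are hypothetical):

```python
import numpy as np

a = np.array([1.82601, 0.50135, -0.75636])   # support points a1, a2, a3
p = np.array([0.31094, 0.45267, 0.23639])    # class probabilities p1, p2, p3

hs_mean = float(p @ a)                       # sum of p_q * a_q
hs_var = float(p @ a**2 - hs_mean**2)        # E[a^2] - (E[a])^2

rho = 0.52565
re_var = rho / (1.0 - rho)                   # sigma^2 implied by rho = sigma^2 / (1 + sigma^2)
```

These reproduce the reported .61593, .90642 and 1.10814 up to rounding, so the discrete distribution has a variance of the same order as the normal random effect.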
Modeling Obesity with a
Latent Class Model
Mark Harris
Department of Economics, Curtin University
Bruce Hollingsworth
Department of Economics, Lancaster University
William Greene
Stern School of Business, New York University
Pushkar Maitra
Department of Economics, Monash University
Two Latent Classes: Approximately Half of European Individuals
An Ordered Probit Approach
A Latent Regression Model for “True BMI”
BMI* = β′x + ε, ε ~ N[0,σ²], σ² = 1
“True BMI” = a proxy for weight is unobserved
Observation Mechanism for Weight Type
WT = 0 if BMI* ≤ 0        (Normal)
     1 if 0 < BMI* ≤ μ    (Overweight)
     2 if μ < BMI*        (Obese)
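The observation mechanism implies three cell probabilities from the standard normal CDF. A minimal sketch with hypothetical values for the index β′x and the threshold μ:

```python
import math

def norm_cdf(v):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def weight_type_probs(xb, mu):
    """P(WT = 0, 1, 2) given the index beta'x and threshold mu > 0, with sigma = 1."""
    p0 = norm_cdf(-xb)                    # BMI* <= 0
    p1 = norm_cdf(mu - xb) - p0           # 0 < BMI* <= mu
    p2 = 1.0 - norm_cdf(mu - xb)          # BMI* > mu
    return p0, p1, p2

p_normal, p_over, p_obese = weight_type_probs(xb=0.3, mu=1.2)  # hypothetical values
```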
Latent Class Modeling
• Several ‘types’ or ‘classes’. Obesity may be due to genetic reasons (the FTO gene) or lifestyle factors
• Distinct sets of individuals may have differing reactions to various policy tools and/or characteristics
• The observer does not know from the data which class an individual is in.
• Suggests a latent class approach for health outcomes (Deb and Trivedi, 2002, and Bago d’Uva, 2005)
Latent Class Application
• Two class model (considering FTO gene):
  • More classes make class interpretations much more difficult
  • Parametric models proliferate parameters
  • Two classes allow us to correlate the unobservables driving class membership and observed weight outcomes.
  • Theory for more than two classes not yet developed.
Correlation of Unobservables in Class
Membership and BMI Equations
Outcome Probabilities
• Class 0 dominated by normal and overweight probabilities: ‘normal weight’ class
• Class 1 dominated by probabilities at top end of the scale: ‘non-normal weight’
• Unobservables for weight class membership, negatively correlated with those determining weight levels:
Classification (Latent Probit) Model
Inflated Responses in Self-Assessed Health
Mark Harris
Department of Economics, Curtin University
Bruce Hollingsworth
Department of Economics, Lancaster University
William Greene
Stern School of Business, New York University
SAH vs. Objective Health Measures
Favorable SAH categories seem artificially high.
• 60% of Australians are either overweight or obese (Dunstan et al., 2001)
• 1 in 4 Australians has either diabetes or a condition of impaired glucose metabolism
• Over 50% of the population has elevated cholesterol
• Over 50% has at least 1 of the “deadly quartet” of health conditions (diabetes, obesity, high blood pressure, high cholesterol)
• Nearly 4 out of 5 Australians have 1 or more long term health conditions (National Health Survey, Australian Bureau of Statistics 2006)
• Australia ranked #1 in terms of obesity rates
Similar results appear for other countries
A Two Class Latent Class Model
True Reporter
Misreporter


• Mis-reporters choose either good or very good
• The response is determined by a probit model
$m^* = x_m'\beta_m + \varepsilon_m$
[Figure: probit response with outcomes Y=2 and Y=3]
[Figure: ordered response with outcomes Y=0 to Y=4]
Observed Mixture of Two Classes
Pr(true,y) = Pr(true) * Pr(y | true)
Pr(y) = Pr(true) Pr(y | true) + Pr(misreporter) Pr(y | misreporter)
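The observed distribution is a two-class mixture of the true reporters' ordered response and the misreporters' binary response. The sketch below uses entirely hypothetical probabilities and assumes Y=2 is Good and Y=3 is Very Good on the five-point scale.

```python
import numpy as np

def observed_probs(p_true, p_y_true, p_y_mis):
    """Pr(y) = Pr(true) Pr(y|true) + Pr(misreporter) Pr(y|misreporter)."""
    return p_true * np.asarray(p_y_true) + (1.0 - p_true) * np.asarray(p_y_mis)

# Hypothetical probabilities over Y = 0..4 (Poor, Fair, Good, Very Good, Excellent)
p_y_true = [0.10, 0.25, 0.35, 0.20, 0.10]
p_y_mis = [0.00, 0.00, 0.60, 0.40, 0.00]   # misreporters choose only Good or Very Good
p_y = observed_probs(0.8, p_y_true, p_y_mis)
```

The misreporting class adds mass only to the two middle-upper categories, which is the inflation pattern the model is designed to pick up.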
General Result
[Figure: bar chart comparing Sample, Predicted and Mis-Reporting shares (vertical axis 0 to 0.4) across Poor, Fair, Good, Very Good, Excellent]
Discrete Parameter Heterogeneity
Latent Classes
Discrete unobservable partition of the population into Q classes
Discrete approximation to a continuous distribution of parameters across individuals
$\text{Prob}[\beta = \beta_q \mid w_i] = \pi_{iq}, \quad q = 1, \ldots, Q$
$\pi_{iq} = \dfrac{\exp(\delta_q' w_i)}{\sum_{c=1}^{Q} \exp(\delta_c' w_i)}$
Latent Class Probabilities
• Ambiguous – Classical Bayesian model?
• Equivalent to random parameters models with discrete parameter variation
• The randomness of the class assignment is from the point of view of the observer, not a natural process governed by a discrete distribution.
• Using nested logits, etc. does not change this
• Precisely analogous to continuous ‘random parameter’ models
• Not always equivalent – zero inflation models – in which classes have completely different models
A Latent Class MNL Model
Within a “class”
$P[\text{choice } j \mid i, t, \text{class} = q] = \dfrac{\exp(\alpha_j + \beta_q' x_{itj} + \gamma_{j,q}' z_{it})}{\sum_{j=1}^{J(i)} \exp(\alpha_j + \beta_q' x_{itj} + \gamma_{j,q}' z_{it})}$
Class sorting is probabilistic (to the analyst) determined by individual characteristics
$P[\text{class } q \mid i] = \dfrac{\exp(\theta_q' w_i)}{\sum_{c=1}^{Q} \exp(\theta_c' w_i)} = H_{iq}$
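The unconditional choice probability is the class-probability-weighted average of the within-class logit probabilities. The sketch below is a simplification: it keeps only the β_q′x term (no alternative-specific constants or γ′z terms), takes the class probabilities as given, and uses hypothetical attribute and taste values.

```python
import numpy as np

def softmax(v):
    ev = np.exp(v - v.max(axis=-1, keepdims=True))
    return ev / ev.sum(axis=-1, keepdims=True)

def lc_mnl_choice_probs(X, betas, H):
    """
    X     : (J, K) attributes of the J alternatives in one choice situation
    betas : (Q, K) class-specific taste parameters beta_q
    H     : (Q,) class probabilities H_iq for this individual
    Returns the (J,) unconditional choice probabilities sum_q H_iq P[j | q].
    """
    within = softmax(betas @ X.T)      # (Q, J): P[choice j | class q]
    return H @ within

X = np.array([[1.0, 0.0, 0.25],
              [0.0, 1.0, 0.50],
              [1.0, 1.0, 0.75]])       # hypothetical (fashion, quality, price) rows
betas = np.array([[3.0, 0.0, -9.0],
                  [0.0, 2.5, -9.0]])   # hypothetical class-specific tastes
P = lc_mnl_choice_probs(X, betas, H=np.array([0.6, 0.4]))
```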
Two Interpretations of Latent Classes
Heterogeneity with respect to 'latent' consumer classes
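$\Pr(\text{Choice}_i) = \sum_{q=1}^{Q} \Pr(\text{Choice}_i \mid \text{class} = q) \Pr(\text{Class} = q)$
$\Pr(\text{Choice}_i \mid \text{Class} = q) = \dfrac{\exp(\beta_q' x_{i,\text{choice}})}{\sum_{j \in \text{choices}} \exp(\beta_q' x_{i,j})}$
$\Pr(\text{Class} = q \mid i) = F_{i,q} = \dfrac{\exp(\theta_q' z_i)}{\sum_{c \in \text{classes}} \exp(\theta_c' z_i)}$
Discrete random parameter variation
$\Pr(\text{Choice}_i \mid \beta_i) = \dfrac{\exp(\beta_i' x_{i,\text{choice}})}{\sum_{j \in \text{choices}} \exp(\beta_i' x_{i,j})}$
$\Pr(\beta_i = \beta_q) = F_{i,q} = \dfrac{\exp(\theta_q' z_i)}{\sum_{c \in \text{classes}} \exp(\theta_c' z_i)}, \quad q = 1, \ldots, Q$
$\Pr(\text{Choice}_i) = \sum_{q=1}^{Q} \Pr(\text{choice} \mid \beta_i = \beta_q) \Pr(\beta_i = \beta_q)$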
Estimates from the LCM
• Taste parameters within each class, β_q
• Parameters of the class probability model, θ_q
• For each person:
  • Posterior estimates of the class they are in, q|i
  • Posterior estimates of their taste parameters, E[β_q|i]
  • Posterior estimates of their behavioral parameters, elasticities, marginal effects, etc.
Using the Latent Class Model
Computing posterior (individual specific) class probabilities
$\hat F_{q|i} = \dfrac{\hat P_{i|q} \hat F_{iq}}{\sum_{q=1}^{Q} \hat P_{i|q} \hat F_{iq}}$ (posterior) Note $\hat F_{q|i}$ vs. $\hat F_{iq}$
$\hat F_{iq}$ = estimated prior class probability
$\hat P_{i|q}$ = estimated choice probability for the choice made, given the class
Computing posterior (individual specific) taste parameters
$\hat\beta_i = \sum_{q=1}^{Q} \hat F_{q|i} \hat\beta_q$
Application: Shoe Brand Choice
• Simulated Data: Stated Choice, 400 respondents, 8 choice situations, 3,200 observations
• 3 choice/attributes + NONE
  • Fashion = High / Low
  • Quality = High / Low
  • Price = 25/50/75/100 coded 1,2,3,4
• Heterogeneity: Sex (Male=1), Age (<25, 25-39, 40+)
• Underlying data generated by a 3 class latent class process (100, 200, 100 in classes)
• Thanks to www.statisticalinnovations.com (Latent Gold)
Degenerate Branches
[Figure: Shoe Choice tree. Choice Situation branches on Purchase into Opt Out (None) and Choose Brand; Choose Brand branches on Brand into Brand1, Brand2, Brand3]
U(Brand j) = β1 Fashion + β2 Quality + β3 Price + ε_Brand
U(None) = β0 + ε_None
One Class MNL Estimates
-----------------------------------------------------------
Discrete choice (multinomial logit) model
Dependent variable               Choice
Log likelihood function     -4158.50286
Estimation based on N =   3200, K =   4
R2=1-LogL/LogL*  Log-L fncn  R-sqrd  R2Adj
Constants only   -4391.1804   .0530   .0510
Response data are given as ind. choices
Number of obs.=  3200, skipped    0 obs
--------+--------------------------------------------------
Variable| Coefficient    Standard Error  b/St.Er.  P[|Z|>z]
--------+--------------------------------------------------
  FASH|1|    1.47890***      .06777       21.823     .0000
  QUAL|1|    1.01373***      .06445       15.730     .0000
 PRICE|1|  -11.8023***       .80406      -14.678     .0000
  ASC4|1|     .03679         .07176         .513     .6082
--------+--------------------------------------------------
Application: Brand Choice
True underlying model is a three class LCM
NLOGIT
; Lhs=choice
; Choices=Brand1,Brand2,Brand3,None
; Rhs = Fash,Qual,Price,ASC4
; LCM=Male,Age25,Age39
; Pts=3
; Pds=8
; Parameters (Save posterior results) $
Three Class LCM
Normal exit from iterations. Exit status=0.
-----------------------------------------------------------
Latent Class Logit Model
Dependent variable               CHOICE
Log likelihood function     -3649.13245
Restricted log likelihood   -4436.14196
Chi squared [ 20 d.f.]       1574.01902
Significance level               .00000
McFadden Pseudo R-squared      .1774085
Estimation based on N =   3200, K =  20
R2=1-LogL/LogL*  Log-L fncn  R-sqrd  R2Adj
No coefficients  -4436.1420   .1774   .1757
Constants only   -4391.1804   .1690   .1673
At start values  -4158.5428   .1225   .1207
Response data are given as ind. choices
Number of latent classes =            3
Average Class Probabilities
     .506  .239  .256
LCM model with panel has     400 groups
Fixed number of obsrvs./group=        8
Number of obs.=  3200, skipped    0 obs
--------+--------------------------------------------------
LogL for one class MNL = -4158.503
Based on the LR statistic it would seem unambiguous to reject the one class model. The degrees of freedom for the test are uncertain, however.
Estimated LCM: Utilities
--------+--------------------------------------------------
Variable| Coefficient    Standard Error  b/St.Er.  P[|Z|>z]
--------+--------------------------------------------------
        |Utility parameters in latent class -->> 1
  FASH|1|    3.02570***      .14549       20.796     .0000
  QUAL|1|    -.08782         .12305        -.714     .4754
 PRICE|1|   -9.69638***     1.41267       -6.864     .0000
  ASC4|1|    1.28999***      .14632        8.816     .0000
        |Utility parameters in latent class -->> 2
  FASH|2|    1.19722***      .16169        7.404     .0000
  QUAL|2|    1.11575***      .16356        6.821     .0000
 PRICE|2|  -13.9345***      1.93541       -7.200     .0000
  ASC4|2|    -.43138**       .18514       -2.330     .0198
        |Utility parameters in latent class -->> 3
  FASH|3|    -.17168         .16725       -1.026     .3047
  QUAL|3|    2.71881***      .17907       15.183     .0000
 PRICE|3|   -8.96483***     1.93400       -4.635     .0000
  ASC4|3|     .18639         .18412        1.012     .3114
Estimated LCM: Class Probability Model
--------+--------------------------------------------------
Variable| Coefficient    Standard Error  b/St.Er.  P[|Z|>z]
--------+--------------------------------------------------
        |This is THETA(01) in class probability model.
Constant|    -.90345**       .37612       -2.402     .0163
 _MALE|1|     .64183*        .36245        1.771     .0766
_AGE25|1|    2.13321***      .32096        6.646     .0000
_AGE39|1|     .72630*        .43511        1.669     .0951
        |This is THETA(02) in class probability model.
Constant|     .37636         .34812        1.081     .2796
 _MALE|2|   -2.76536***      .69325       -3.989     .0001
_AGE25|2|    -.11946         .54936        -.217     .8279
_AGE39|2|    1.97657***      .71684        2.757     .0058
        |This is THETA(03) in class probability model.
Constant|     .000        ......(Fixed Parameter)......
 _MALE|3|     .000        ......(Fixed Parameter)......
_AGE25|3|     .000        ......(Fixed Parameter)......
_AGE39|3|     .000        ......(Fixed Parameter)......
--------+--------------------------------------------------
Estimated LCM:
Conditional Parameter Estimates
Estimated LCM: Conditional (Posterior)
Class Probabilities
Average Estimated Class Probabilities
MATRIX ; list ; 1/400 * classp_i'1$
Matrix Result has 3 rows and 1 columns.
               1
        +--------------
       1|      .50555
       2|      .23853
       3|      .25593
This is how the data were simulated. Class probabilities are .5, .25, .25. The model ‘worked.’
Elasticities
+---------------------------------------------------+
| Elasticity             averaged over observations.|
| Effects on probabilities of all choices in model: |
| * = Direct Elasticity effect of the attribute.    |
| Attribute is PRICE    in choice BRAND1            |
|                           Mean      St.Dev        |
| *     Choice=BRAND1      -.8010      .3381        |
|       Choice=BRAND2       .2732      .2994        |
|       Choice=BRAND3       .2484      .2641        |
|       Choice=NONE         .2193      .2317        |
+---------------------------------------------------+
| Attribute is PRICE    in choice BRAND2            |
|       Choice=BRAND1       .3106      .2123        |
| *     Choice=BRAND2     -1.1481      .4885        |
|       Choice=BRAND3       .2836      .2034        |
|       Choice=NONE         .2682      .1848        |
+---------------------------------------------------+
| Attribute is PRICE    in choice BRAND3            |
|       Choice=BRAND1       .3145      .2217        |
|       Choice=BRAND2       .3436      .2991        |
| *     Choice=BRAND3      -.6744      .3676        |
|       Choice=NONE         .3019      .2187        |
+---------------------------------------------------+
Elasticities are computed by averaging individual elasticities computed at the expected (posterior) parameter vector.
This is an unlabeled choice experiment. It is not possible to attach any significance to the fact that the elasticity is different for Brand1 and Brand 2 or Brand 3.
Application: Long Distance Drivers’ Preference for Road Environments
• New Zealand survey, 2000, 274 drivers
• Mixed revealed and stated choice experiment
• 4 Alternatives in choice set
  • The current road the respondent is/has been using;
  • A hypothetical 2-lane road;
  • A hypothetical 4-lane road with no median;
  • A hypothetical 4-lane road with a wide grass median.
• 16 stated choice situations for each with 2 choice profiles
  • choices involving all 4 choices
  • choices involving only the last 3 (hypothetical)
Hensher and Greene, A Latent Class Model for Discrete Choice Analysis: Contrasts with Mixed Logit – Transportation Research B, 2003
Attributes
• Time on the open road which is free flow (in minutes);
• Time on the open road which is slowed by other traffic (in minutes);
• Percentage of total time on open road spent with other vehicles close behind (i.e., tailgating) (%);
• Curviness of the road (a four-level attribute: almost straight, slight, moderate, winding);
• Running costs (in dollars);
• Toll cost (in dollars).
Experimental Design
The four levels of the six attributes chosen are:
• Free Flow Travel Time: -20%, -10%, +10%, +20%
• Time Slowed Down: -20%, -10%, +10%, +20%
• Percent of time with vehicles close behind: -50%, -25%, +25%, +50%
• Curviness: almost straight, slight, moderate, winding
• Running Costs: -10%, -5%, +5%, +10%
• Toll cost for car and double for truck if trip duration is:
    1 hour or less                  0, 0.5, 1.5, 3
    Between 1 hour and 2.5 hours    0, 1.5, 4.5, 9
    More than 2.5 hours             0, 2.5, 7.5, 15
Estimated Latent Class Model
Estimated Value of Time Saved
Distribution of Parameters –
Value of Time on 2 Lane Road
[Figure: kernel density estimate for VOT2L; horizontal axis VOT2L from -2 to 16, vertical axis Density from .00 to .12]
Decision Strategy in Multinomial Choice
Choice Situation: Alternatives $A_1, \ldots, A_J$
Attributes of the choices: $x_1, \ldots, x_K$
Characteristics of the individual: $z_1, \ldots, z_M$
Random utility functions: $U(j \mid x, z) = U(x_{ij}, z_i, \varepsilon_{ij})$
Choice probability model: $\text{Prob}(\text{choice} = j) = \text{Prob}(U_j > U_l) \;\; \forall \, l \neq j$
Multinomial Logit Model
$\text{Prob}(\text{choice} = j) = \dfrac{\exp[\beta' x_{ij} + \gamma_j' z_i]}{\sum_{j=1}^{J} \exp[\beta' x_{ij} + \gamma_j' z_i]}$
Behavioral model assumes
(1) Utility maximization (and the underlying micro-theory)
(2) Individual pays attention to all attributes. That is the implication of the nonzero β.
Individual Explicitly Ignores Attributes
Hensher, D.A., Rose, J. and Greene, W. (2005) The Implications on Willingness to
Pay of Respondents Ignoring Specific Attributes (DoD#6) Transportation, 32 (3),
203-222.
Hensher, D.A. and Rose, J.M. (2009) Simplifying Choice through Attribute
Preservation or Non-Attendance: Implications for Willingness to Pay, Transportation
Research Part E, 45, 583-590.
Rose, J., Hensher, D., Greene, W. and Washington, S. Attribute Exclusion Strategies
in Airline Choice: Accounting for Exogenous Information on Decision Maker
Processing Strategies in Models of Discrete Choice, Transportmetrica, 2011
Choice situations in which the individual explicitly states
that they ignored certain attributes in their decisions.
Appropriate Modeling Strategy
• Fix ignored attributes at zero? Definitely not!
  • Zero is an unrealistic value of the attribute (price)
  • The probability is a function of xij – xil, so the substitution distorts the probabilities
• Appropriate model: for that individual, the specific coefficient is zero – consistent with the utility assumption. A person specific, exogenously determined model
• Surprisingly simple to implement
Choice Strategy Heterogeneity
• Methodologically, a rather minor point – construct appropriate likelihood given known information
$\log L = \sum_{m=1}^{M} \sum_{i \in m} \log L_i(\theta \mid \text{data}, m)$
• Not a latent class model. Classes are not latent.
• Not the ‘variable selection’ issue (the worst form of “stepwise” modeling)
• Familiar strategy gives the wrong answer.
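When the respondent reports which attributes were ignored, each observation's likelihood contribution uses a coefficient vector with the corresponding elements set to zero. A minimal sketch for one choice situation in a plain multinomial logit; the function name, the 0/1 `attended` flags and the numbers are all hypothetical.

```python
import numpy as np

def mnl_loglik_with_ignored(beta, X, chosen, attended):
    """
    beta     : (K,) coefficients
    X        : (J, K) attributes for one choice situation
    chosen   : index of the chosen alternative
    attended : (K,) 0/1 flags reported by the respondent;
               a 0 means the coefficient on that attribute is zero for this person
    """
    v = X @ (beta * attended)          # utilities under this person's strategy
    v = v - v.max()                    # numerical stability
    return float(v[chosen] - np.log(np.exp(v).sum()))

beta = np.array([0.5, -1.0])
X = np.array([[1.0, 2.0],
              [0.0, 1.0]])
ll_i = mnl_loglik_with_ignored(beta, X, chosen=0, attended=np.array([1, 0]))
```

Summing these contributions over individuals, grouped by their reported strategy m, gives the log likelihood above.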
Application: Sydney Commuters’ Route Choice
• Stated Preference study – several possible choice situations considered by each person
• Multinomial and mixed (random parameters) logit
• Consumers included data on which attributes were ignored.
• Ignored attributes visibly coded as ignored are automatically treated by constraining β=0 for that observation.
Data for Application of Information Strategy
Stated/Revealed preference study, Sydney car commuters.
500+ surveyed, about 10 choice situations for each.
Existing route vs. 3 proposed alternatives.
Attribute design
 Original: respondents presented with 3, 4, 5, or 6 attributes
 Attributes – four level design.
   Free flow time
   Slowed down time
   Stop/start time
   Trip time variability
   Toll cost
   Running cost
 Final: respondents use only some attributes and indicate when surveyed which ones they ignored
93/68: Topic 4.2 – Latent Class Models
Stated Choice Experiment
Ancillary questions: Did you ignore any of these attributes?
95/68: Topic 4.2 – Latent Class Models
Individual Implicitly Ignores Attributes
Hensher, D.A. and Greene, W.H. (2010) Non-attendance and dual processing of
common-metric attributes in choice analysis: a latent class specification, Empirical
Economics 39 (2), 413-426
Campbell, D., Hensher, D.A. and Scarpa, R. Non-attendance to Attributes in
Environmental Choice Analysis: A Latent Class Specification, Journal of
Environmental Planning and Management, proofs 14 May 2011.
Hensher, D.A., Rose, J.M. and Greene, W.H. Inferring attribute non-attendance from
stated choice data: implications for willingness to pay estimates and a warning for
stated choice experiment design, 14 February 2011, Transportation, online 2 June
2011 DOI 10.1007/s11116-011-9347-8.
96/68: Topic 4.2 – Latent Class Models
Stated Choice Experiment
Individuals seem to be ignoring attributes, but which ones is unknown to the analyst.
97/68: Topic 4.2 – Latent Class Models
The 2^K model
 The analyst believes some attributes are ignored. There is no indicator.
 Classes distinguished by which attributes are ignored
 Same model applies, now a latent class. For K attributes there are 2^K candidate coefficient vectors
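The 2^K candidate vectors can be enumerated mechanically: each class is one pattern of attended and ignored attributes applied to a single shared coefficient vector. The sketch below assumes the six attributes of the Sydney application. The coefficient values and the function name are hypothetical.

```python
from itertools import product

attributes = ["free flow time", "slowed down time", "stop/start time",
              "trip time variability", "toll cost", "running cost"]
beta = [-0.04, -0.06, -0.08, -0.02, -0.30, -0.25]  # illustrative values, not estimates

def candidate_vectors(beta):
    """All 2^K coefficient vectors obtained by zeroing subsets of beta."""
    K = len(beta)
    return [[b if a else 0.0 for b, a in zip(beta, mask)]
            for mask in product([1, 0], repeat=K)]

classes = candidate_vectors(beta)
print(len(attributes), len(classes))  # K and the number of candidate classes
print(classes[0])                     # full attendance: beta unchanged
print(classes[-1])                    # total non-attendance: all zeros
```

Every class reuses the same K structural parameters, so only the class probabilities grow with the number of classes.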
98/68: Topic 4.2 – Latent Class Models
Latent Class Models with Cross Class Restrictions
[Matrix of class-specific coefficient vectors; individual entries not recoverable from the extracted text. Rows: the 8 classes. Columns: Free Flow, Slowed, Start/Stop, Uncertainty, Toll Cost, Running Cost, Prior Probs. Each coefficient entry is either 0 or one of the common parameters β1,…,β6; the prior probabilities are π1,…,π7 and 1 − Σj πj.]
8 Class Model: 6 structural utility parameters, 7 unrestricted prior probabilities.
Reduced form has 8(6)+8 = 56 parameters. (πj = exp(αj)/∑jexp(αj), αJ = 0.)
EM Algorithm: Does not provide any means to impose cross class restrictions.
“Bayesian” MCMC Methods: May be possible to force the restrictions – it will
not be simple.
Conventional Maximization: Simple
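For conventional maximization, the prior probabilities can be written in the logit form given above so that the optimizer works on unconstrained α values. A minimal sketch, with a hypothetical function name and arbitrary α values:

```python
import math

def prior_probs(alpha_free):
    """pi_j = exp(alpha_j) / sum_j exp(alpha_j), with the last alpha fixed at 0."""
    alpha = list(alpha_free) + [0.0]      # normalization alpha_J = 0
    amax = max(alpha)                     # subtract the max for stability
    e = [math.exp(a - amax) for a in alpha]
    total = sum(e)
    return [x / total for x in e]

alpha_free = [0.5, -0.2, 0.0, 1.0, -1.0, 0.3, 0.1]  # 7 illustrative values
pi = prior_probs(alpha_free)
print(len(pi), sum(pi))
```

With the cross class restrictions imposed, the free parameters are the 6 utility coefficients plus these 7 α values, 13 in total.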
99/68: Topic 4.2 – Latent Class Models
Results for the 2K model
101/68: Topic 4.2 – Latent Class Models
Choice Model with 6 Attributes
102/68: Topic 4.2 – Latent Class Models
Stated Choice Experiment
103/68: Topic 4.2 – Latent Class Models
Latent Class Model – Prior Class Probabilities
104/68: Topic 4.2 – Latent Class Models
Latent Class Model – Posterior Class Probabilities
105/68: Topic 4.2 – Latent Class Models
6 attributes imply 2^6 = 64 classes. Strategy to reduce
the computational burden on a small sample
106/68: Topic 4.2 – Latent Class Models
Posterior probabilities of membership in the
nonattendance class for 6 models
107/68: Topic 4.2 – Latent Class Models
The EM Algorithm
Latent Class is a 'missing data' model
$d_{i,q} = 1$ if individual $i$ is a member of class $q$
If $d_{i,q}$ were observed, the complete data log likelihood would be
$\log L_c = \sum_{i=1}^{N} \log \left[ \sum_{q=1}^{Q} d_{i,q} \prod_{t=1}^{T_i} f(y_{i,t} \mid \text{data}_{i,t}, \text{class} = q) \right]$
(Only one of the Q terms would be nonzero.)
Expectation - Maximization algorithm has two steps
(1) Expectation Step: Form the 'Expected log likelihood'
given the data and a prior guess of the parameters.
(2) Maximize the expected log likelihood to obtain a new
guess for the model parameters.
(E.g., http://crow.ee.washington.edu/people/bulyko/papers/em.pdf)
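The expectation step can be written out as follows. This is the standard Bayes' rule form, stated here as a sketch. It assumes $\pi_q$ denotes the prior probability of class $q$, and it uses the fact that only one $d_{i,q}$ is nonzero, so the log of the sum equals the sum of $d_{i,q}$ times the log.

```latex
% Posterior class probability given the current parameter guesses
\hat F(q \mid i) =
  \frac{\pi_q \prod_{t=1}^{T_i} f(y_{i,t} \mid \text{data}_{i,t}, \text{class}=q)}
       {\sum_{s=1}^{Q} \pi_s \prod_{t=1}^{T_i} f(y_{i,t} \mid \text{data}_{i,t}, \text{class}=s)}

% Expected complete data log likelihood: replace d_{i,q} by its posterior expectation
E[\log L_c] = \sum_{i=1}^{N} \sum_{q=1}^{Q} \hat F(q \mid i)
              \sum_{t=1}^{T_i} \log f(y_{i,t} \mid \text{data}_{i,t}, \text{class}=q)
```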
108/68: Topic 4.2 – Latent Class Models
Implementing EM for LC Models
Given initial guesses $\pi_q^0 = \pi_1^0, \pi_2^0, \ldots, \pi_Q^0$ and $\beta_q^0 = \beta_1^0, \beta_2^0, \ldots, \beta_Q^0$
E.g., use 1/Q for each $\pi_q$ and the MLE of β from a one class
model. (Must perturb each one slightly, because if all $\pi_q$ are equal
and all $\beta_q$ are the same, the model will satisfy the FOC.)
(1) Compute $\hat F(q \mid i)$ = posterior class probabilities, using $\hat\beta^0, \hat\delta^0$
Reestimate each $\beta_q$ using a weighted log likelihood
Maximize wrt $\beta_q$: $\sum_{i=1}^{N} \hat F_{iq} \sum_{t=1}^{T_i} \log f(y_{it} \mid x_{it}, \beta_q)$
(2) Reestimate $\pi_q$ by reestimating δ
$\hat\pi_q = (1/N) \sum_{i=1}^{N} \hat F(q \mid i)$ using old $\hat\pi$ and new $\hat\beta$
Now, return to step 1.
Iterate until convergence.
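The two steps can be sketched for the simplest case, a mixture of normals with one observation per individual, where the weighted maximization in step (1) has a closed form (weighted mean and standard deviation). The simulated data mimic the earlier normal mixture example (means 1 and 5, mixing probability 0.3). The function names, seed, starting values, and tolerances are assumptions made for illustration.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_normal_mixture(y, pi0, mu0, sigma0, max_iter=500, tol=1e-8):
    """EM for a Q-class mixture of normals; returns (pi, mu, sigma, logL)."""
    pi, mu, sigma = list(pi0), list(mu0), list(sigma0)
    Q, N = len(pi), len(y)
    old_ll = -math.inf
    for _ in range(max_iter):
        # Step 1: posterior class probabilities F(q|i) at the current guesses
        F, ll = [], 0.0
        for yi in y:
            joint = [pi[q] * normal_pdf(yi, mu[q], sigma[q]) for q in range(Q)]
            total = sum(joint)
            ll += math.log(total)   # log likelihood at the pre-update parameters
            F.append([j / total for j in joint])
        for q in range(Q):
            w = sum(F[i][q] for i in range(N))
            # Weighted MLE of the class parameters (closed form for a normal)
            mu[q] = sum(F[i][q] * y[i] for i in range(N)) / w
            var = sum(F[i][q] * (y[i] - mu[q]) ** 2 for i in range(N)) / w
            sigma[q] = math.sqrt(var)
            # Step 2: new prior probability is the average posterior
            pi[q] = w / N
        if abs(ll - old_ll) < tol:
            break
        old_ll = ll
    return pi, mu, sigma, ll

# Simulated sample: class 1 ~ N(1,1) with probability 0.3, class 2 ~ N(5,1)
random.seed(123457)
y = [random.gauss(1.0, 1.0) if random.random() < 0.3 else random.gauss(5.0, 1.0)
     for _ in range(1000)]

# Start from the one class MLE, perturbed so the two classes differ
m = sum(y) / len(y)
s = math.sqrt(sum((yi - m) ** 2 for yi in y) / len(y))
pi_hat, mu_hat, sigma_hat, ll = em_normal_mixture(y, [0.5, 0.5], [0.9 * m, 1.1 * m], [s, s])
print(pi_hat, mu_hat, sigma_hat)
```

For models without a closed-form weighted MLE, the inner update would be replaced by a numerical maximization of the weighted log likelihood for each class.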