1/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Microeconometric Modeling William Greene Stern School of Business New York University New York NY USA 3.3 Discrete Choice; The Multinomial Logit Model 2/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Concepts • • • • • • • • • • • • • • • • • Random Utility Multinomial Choice Attributes and Characteristics Stated and Revealed Preference Unlabeled Choice IID and IIA Consumer Surplus Choice Based Sampling Alternative Specific Constants Partial Effects and Elasticities Simulation WTP Space Hausman – McFadden Test for IIA Random Regret Minimum Distance Estimator Scaled MNL Model Nonlinear Utility Functions Models • • • • • • • • Multinomial Logit Model Conditional Logit Fixed Effects Multinomial Logit Best/Worst Nested Logit Exploded Logit Heteroscedastic Extreme Value (HEV) Generalized Mixed Logit 3/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model A Microeconomics Platform Consumers Maximize Utility (!!!) Fundamental Choice Problem: Maximize U(x1,x2,…) subject to prices and budget constraints A Crucial Result for the Classical Problem: Indirect Utility Function: V = V(p,I) Demand System of Continuous Choices * j x = V(p,I)/p j V(p,I)/I Observed data usually consist of choices, prices, income The Integrability Problem: Utility is not revealed by demands 4/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Implications for Discrete Choice Models Theory is silent about discrete choices Translation of utilities to discrete choice requires: Consumers often act to simplify choice situations This allows us to build “models.” Well defined utility indexes: Completeness of rankings Rationality: Utility maximization Axioms of revealed preferences What common elements can be assumed? How can we account for heterogeneity? However, revealed choices do not reveal utility, only rankings which are scale invariant. 5/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Multinomial Choice Among J Alternatives • Random Utility Basis Uitj = ij + i’xitj + ijzit + ijt i = 1,…,N; j = 1,…,J(i,t); t = 1,…,T(i) N individuals studied, J(i,t) alternatives in the choice set, T(i) [usually 1] choice situations examined. • Maximum Utility Assumption Individual i will Choose alternative j in choice setting t if and only if Uitj > Uitk for all k j. • Underlying assumptions Smoothness of utilities Axioms of utility maximization: Transitive, Complete, Monotonic 6/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Features of Utility Functions The linearity assumption Uitj = ij + ixitj + jzit + ijt To be relaxed later: Uitj = V(xitj,zit,i) + ijt The choice set: Individual (i) and situation (t) specific Unordered alternatives j = 1,…,J(i,t) Deterministic (x,z,j) and random components (ij,i,ijt) Attributes of choices, xitj and characteristics of the chooser, zit. Alternative specific constants ij may vary by individual Preference weights, i may vary by individual Individual components, j typically vary by choice, not by person Scaling parameters, σij = Var[εijt], subject to much modeling 7/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Unordered Choices of 210 Travelers 8/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Data on Multinomial Discrete Choices 9/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Each person makes four choices from a choice set that includes either two or four alternatives. The first choice is the RP between two of the RP alternatives The second-fourth are the SP among four of the six SP alternatives. There are ten alternatives in total. A Stated Choice Experiment with Variable Choice Sets 10/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Stated Choice Experiment: Unlabeled Alternatives, One Observation t=1 t=2 t=3 t=4 t=5 t=6 t=7 t=8 11/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Unlabeled Choice Experiments This an unlabelled choice experiment: Compare Choice = (Air, Train, Bus, Car) To Choice = (Brand 1, Brand 2, Brand 3, None) Brand 1 is only Brand 1 because it is first in the list. What does it mean to substitute Brand 1 for Brand 2? What does the own elasticity for Brand 1 mean? 12/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model The Multinomial Logit (MNL) Model Independent extreme value (Gumbel): F(itj) = Exp(-Exp(-itj)) (random part of each utility) Independence across utility functions Identical variances (means absorbed in constants) Same parameters for all individuals (temporary) Implied probabilities for observed outcomes P[choice = j | xitj , zit ,i,t] = Prob[Ui,t,j Ui,t,k ], k = 1,...,J(i,t) = exp(α j + β'xitj + γ j'zit ) J(i,t) j=1 exp(α j + β'xitj + γ j'zit ) 13/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Multinomial Choice Models Multinomial logit model depends on characteristics P[choice = j | zit ,i,t] = exp(α j + γ j'zit ) J(i,t) j=1 exp(α j + γ j'zit ) Conditional logit model depends on attributes P[choice = j | x itj,i,t] = exp(α j + β'x itj ) J(i,t) j=1 exp(α j + β'x itj ) THE multinomial logit model accommodates both. P[choice = j | x itj, zit ,i,t] = exp(α j + β'x itj + γ j'zit ) J(i,t) j=1 exp(α j + β'x itj + γ j'z it ) There is no meaningful distinction. 14/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Specifying the Probabilities • Choice specific attributes (X) vary by choices, multiply by generic coefficients. E.g., TTME=terminal time, GC=generalized cost of travel mode • Generic characteristics (Income, constants) must be interacted with choice specific constants. • Estimation by maximum likelihood; dij = 1 if person i chooses j P[choice = j | x itj , zit ,i,t] = Prob[Ui,t,j Ui,t,k ], k = 1,...,J(i,t) = exp(α j + β'x itj + γ j'zit ) J(i,t) j=1 logL = i=1 N exp(α j + β'x itj + γ j'zit ) J(i) j=1 dijlogPij 15/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Using the Model to Measure Consumer Surplus Maximum j (Uj ) Consumer Surplus = Marginal Utility of Income Utility and marginal utility are not observable For the multinomial logit model (only), 1 E[CS] = log MUI J(i,t) j=1 exp(α j + β'x itj + γ j'z it ) + C Where Uj = the utility of the indicated alternative and C is the constant of integration. The log sum is the "inclusive value." 16/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Measuring the Change in Consumer Surplus 1 E[CS | Scenario B] = log MU E[CS | Scenario A] = 1 log MUI I J(i,t) j=1 J(i,t) j=1 + γ 'z ) | B + C exp(α j + β'x itj + γ j'zit ) | A + C exp(α j + β'x itj j it MUI and the constant of integration do not change under scenarios. Change in expected consumer surplus from a policy (scenario) change E[CS | Scenario A] - E[CS | Scenario B] J(i,t) exp(α j + β'xitj + γ j'zit ) | A 1 j=1 = log J(i,t) MUI j=1 exp(α j + β'xitj + γ j'zit ) | B 17/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Willingness to Pay Generally a ratio of coefficients WTP = β(Attribute Level) β(Income) Use negative of cost coefficient as a proxu for MU of income WTP = negative β(Attribute Level) β(cost) Measurable using model parameters Ratios of possibly random parameters can produce wild and unreasonable values. We will consider a different approach later. 18/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Observed Data Types of Data Attributes and Characteristics Individual choice Market shares – consumer markets Frequencies – vote counts Ranks – contests, preference rankings Attributes are features of the choices such as price Characteristics are features of the chooser such as age, gender and income. Choice Settings Cross section Repeated measurement (panel data) Stated choice experiments Repeated observations – THE scanner data on consumer choices 19/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Choice Based Sampling Over/Underrepresenting alternatives in the data set Choice Air Train Bus Car True 0.14 0.13 0.09 0.64 Sample 0.28 0.30 0.14 0.28 May cause biases in parameter estimates. (Possibly constants only) Certainly causes biases in estimated variances Weighted log likelihood, weight = j / Fj for all i. Fixup of covariance matrix – use “sandwich” estimator. Using weighted Hessian and weighted BHHH in the center of the sandwich 20/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model An Estimated MNL Model ----------------------------------------------------------Discrete choice (multinomial logit) model Dependent variable Choice Log likelihood function -199.97662 Estimation based on N = 210, K = 5 Information Criteria: Normalization=1/N Normalized Unnormalized AIC 1.95216 409.95325 Fin.Smpl.AIC 1.95356 410.24736 Bayes IC 2.03185 426.68878 Hannan Quinn 1.98438 416.71880 R2=1-LogL/LogL* Log-L fncn R-sqrd R2Adj Constants only -283.7588 .2953 .2896 Chi-squared[ 2] = 167.56429 Prob [ chi squared > value ] = .00000 Response data are given as ind. choices Number of obs.= 210, skipped 0 obs --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------GC| -.01578*** .00438 -3.601 .0003 TTME| -.09709*** .01044 -9.304 .0000 A_AIR| 5.77636*** .65592 8.807 .0000 A_TRAIN| 3.92300*** .44199 8.876 .0000 A_BUS| 3.21073*** .44965 7.140 .0000 --------+-------------------------------------------------- 21/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Estimated MNL Model ----------------------------------------------------------Discrete choice (multinomial logit) model Dependent variable Choice Log likelihood function -199.97662 Estimation based on N = 210, K = 5 Information Criteria: Normalization=1/N Normalized Unnormalized AIC 1.95216 409.95325 Fin.Smpl.AIC 1.95356 410.24736 Bayes IC 2.03185 426.68878 Hannan Quinn 1.98438 416.71880 R2=1-LogL/LogL* Log-L fncn R-sqrd R2Adj Constants only -283.7588 .2953 .2896 Chi-squared[ 2] = 167.56429 Prob [ chi squared > value ] = .00000 Response data are given as ind. choices Number of obs.= 210, skipped 0 obs --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------GC| -.01578*** .00438 -3.601 .0003 TTME| -.09709*** .01044 -9.304 .0000 A_AIR| 5.77636*** .65592 8.807 .0000 A_TRAIN| 3.92300*** .44199 8.876 .0000 A_BUS| 3.21073*** .44965 7.140 .0000 --------+-------------------------------------------------- 22/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Model Fit Based on Log Likelihood Three sets of predicted probabilities No model: Pij = 1/J (.25) Constants only: Pij = (1/N)i dij (58,63,30,59)/210=.286,.300,.143,.281 Constants only model matches sample shares Estimated model: Logit probabilities Compute log likelihood Measure improvement in log likelihood with Pseudo R-squared = 1 – LogL/LogL0 (“Adjusted” for number of parameters in the model.) 23/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Fit the Model with Only ASCs If the choice set varies across observations, this is the only way to obtain the restricted log likelihood. ----------------------------------------------------------Discrete choice (multinomial logit) model Dependent variable Choice If the choice set is fixed at Log likelihood function -283.75877 Estimation based on N = 210, K = 3 Nj J Information Criteria: Normalization=1/N logL = N log j 1 l Normalized Unnormalized N AIC 2.73104 573.51754 J Fin.Smpl.AIC 2.73159 573.63404 N log Pj Bayes IC 2.77885 583.55886 j 1 l Hannan Quinn 2.75037 577.57687 R2=1-LogL/LogL* Log-L fncn R-sqrd R2Adj Constants only -283.7588 .0000-.0048 Response data are given as ind. choices Number of obs.= 210, skipped 0 obs --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------A_AIR| -.01709 .18491 -.092 .9263 A_TRAIN| .06560 .18117 .362 .7173 A_BUS| -.67634*** .22424 -3.016 .0026 --------+-------------------------------------------------- J, then 24/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Estimated MNL Model ----------------------------------------------------------Discrete choice (multinomial logit) model Dependent variable Choice Log likelihood function -199.97662 Estimation based on N = 210, K = 5 Information Criteria: Normalization=1/N log L Pseudo R 2 = 1. Normalized Unnormalized log L0 AIC 1.95216 409.95325 Fin.Smpl.AIC 1.95356 410.24736 Adjusted Pseudo R 2 =1- Bayes IC 2.03185 426.68878 Hannan Quinn 1.98438 416.71880 R2=1-LogL/LogL* Log-L fncn R-sqrd R2Adj Constants only -283.7588 .2953 .2896 Chi-squared[ 2] = 167.56429 Prob [ chi squared > value ] = .00000 Response data are given as ind. choices Number of obs.= 210, skipped 0 obs --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------GC| -.01578*** .00438 -3.601 .0003 TTME| -.09709*** .01044 -9.304 .0000 A_AIR| 5.77636*** .65592 8.807 .0000 A_TRAIN| 3.92300*** .44199 8.876 .0000 A_BUS| 3.21073*** .44965 7.140 .0000 --------+-------------------------------------------------- N(J-1) log L . N(J-1)-K log L0 25/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Model Fit Based on Predictions Nj = actual number of choosers of “j.” Nfitj = i Predicted Probabilities for “j” Cross tabulate: Predicted vs. Actual, cell prediction is cell probability Predicted vs. Actual, cell prediction is the cell with the largest probability Njk = i dij Predicted P(i,k) 26/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Fit Measures Based on Crosstabulation AIR TRAIN BUS CAR Total AIR TRAIN BUS CAR Total +-------------------------------------------------------+ | Cross tabulation of actual choice vs. predicted P(j) | | Row indicator is actual, column is predicted. | | Predicted total is F(k,j,i)=Sum(i=1,...,N) P(k,j,i). | | Column totals may be subject to rounding error. | +-------------------------------------------------------+ NLOGIT Cross Tabulation for 4 outcome Multinomial Choice Model AIR TRAIN BUS CAR Total +-------------+-------------+-------------+-------------+-------------+ | 32 | 8 | 5 | 13 | 58 | | 8 | 37 | 5 | 14 | 63 | | 3 | 5 | 15 | 6 | 30 | | 15 | 13 | 6 | 26 | 59 | +-------------+-------------+-------------+-------------+-------------+ | 58 | 63 | 30 | 59 | 210 | +-------------+-------------+-------------+-------------+-------------+ NLOGIT Cross Tabulation for 4 outcome Constants Only Choice Model AIR TRAIN BUS CAR Total +-------------+-------------+-------------+-------------+-------------+ | 16 | 17 | 8 | 16 | 58 | | 17 | 19 | 9 | 18 | 63 | | 8 | 9 | 4 | 8 | 30 | | 16 | 18 | 8 | 17 | 59 | +-------------+-------------+-------------+-------------+-------------+ | 58 | 63 | 30 | 59 | 210 | +-------------+-------------+-------------+-------------+-------------+ 27/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Partial effects : Change in attribute "k" of alternative "m" on the probability that the individual makes choice "j" j = Train Prob(j) Pj = = Pj [1(j = m) - Pm ]βk xm,k xm,k m = Car k = Price 28/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Partial effects : k = Price Own effects : j = Train Prob(j) Pj = = Pj [1- Pj ]βk x j,k x j,k Cross effects : j = Train m = Car Prob(j) Pj = = -PP j m βk x m,k x m,k 29/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Elasticities for proportional changes : xm,k logProb(j) logPj = = Pj [1(j = m) - Pm ]βk logxm,k logx m,k Pj = [1(j = m) - Pm ] x m,k βk Note the elasticity is the same for all j. This is a consequence of the IIA assumption in the model specification made at the outset. 30/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model +---------------------------------------------------+ | Elasticity averaged over observations.| | Attribute is INVT in choice AIR | | Mean St.Dev | | * Choice=AIR -.2055 .0666 | | Choice=TRAIN .0903 .0681 | | Choice=BUS .0903 .0681 | | Choice=CAR .0903 .0681 | +---------------------------------------------------+ | Attribute is INVT in choice TRAIN | | Choice=AIR .3568 .1231 | | * Choice=TRAIN -.9892 .5217 | | Choice=BUS .3568 .1231 | | Choice=CAR .3568 .1231 | +---------------------------------------------------+ | Attribute is INVT in choice BUS | | Choice=AIR .1889 .0743 | | Choice=TRAIN .1889 .0743 | | * Choice=BUS -1.2040 .4803 | | Choice=CAR .1889 .0743 | +---------------------------------------------------+ | Attribute is INVT in choice CAR | | Choice=AIR .3174 .1195 | | Choice=TRAIN .3174 .1195 | | Choice=BUS .3174 .1195 | | * Choice=CAR -.9510 .5504 | +---------------------------------------------------+ | Effects on probabilities of all choices in model: | | * = Direct Elasticity effect of the attribute. | +---------------------------------------------------+ Note the effect of IIA on the cross effects. Own effect Cross effects Elasticities are computed for each observation; the mean and standard deviation are then computed across the sample observations. 31/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Use Krinsky and Robb to compute standard errors for Elasticities 32/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Analyzing the Behavior of Market Shares to Examine Discrete Effects Scenario: What happens to the number of people who make specific choices if a particular attribute changes in a specified way? Fit the model first, then using the identical model setup, add ; Simulation = list of choices to be analyzed ; Scenario = Attribute (in choices) = type of change For the CLOGIT application ; Simulation = * ? This is ALL choices ; Scenario: GC(car)=[*]1.25$ Car_GC rises by 25% 33/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Model Simulation +---------------------------------------------+ | Discrete Choice (One Level) Model | | Model Simulation Using Previous Estimates | | Number of observations 210 | +---------------------------------------------+ +------------------------------------------------------+ |Simulations of Probability Model | |Model: Discrete Choice (One Level) Model | |Simulated choice set may be a subset of the choices. | |Number of individuals is the probability times the | |number of observations in the simulated sample. | |Column totals may be affected by rounding error. | |The model used was simulated with 210 observations.| +------------------------------------------------------+ ------------------------------------------------------------------------Specification of scenario 1 is: Attribute Alternatives affected Change type Value --------- ------------------------------- ------------------- --------GC CAR Scale base by value 1.250 ------------------------------------------------------------------------The simulator located 209 observations for this scenario. Simulated Probabilities (shares) for this scenario: +----------+--------------+--------------+------------------+ |Choice | Base | Scenario | Scenario - Base | | |%Share Number |%Share Number |ChgShare ChgNumber| +----------+--------------+--------------+------------------+ |AIR | 27.619 58 | 29.592 62 | 1.973% 4 | |TRAIN | 30.000 63 | 31.748 67 | 1.748% 4 | |BUS | 14.286 30 | 15.189 32 | .903% 2 | |CAR | 28.095 59 | 23.472 49 | -4.624% -10 | |Total |100.000 210 |100.000 210 | .000% 0 | +----------+--------------+--------------+------------------+ Changes in the predicted market shares when GC_CAR increases by 25%. 34/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model More Complicated Model Simulation In vehicle cost of CAR falls by 10% Market is limited to ground (Train, Bus, Car) CLOGIT ; Lhs = Mode ; Choices = Air,Train,Bus,Car ; Rhs = TTME,INVC,INVT,GC ; Rh2 = One ,Hinc ; Simulation = TRAIN,BUS,CAR ; Scenario: GC(car)=[*].9$ 35/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Model Estimation Step ----------------------------------------------------------Discrete choice (multinomial logit) model Dependent variable Choice Log likelihood function -172.94366 Estimation based on N = 210, K = 10 R2=1-LogL/LogL* Log-L fncn R-sqrd R2Adj Constants only -283.7588 .3905 .3807 Chi-squared[ 7] = 221.63022 Prob [ chi squared > value ] = .00000 Response data are given as ind. choices Number of obs.= 210, skipped 0 obs --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------TTME| -.10289*** .01109 -9.280 .0000 INVC| -.08044*** .01995 -4.032 .0001 INVT| -.01399*** .00267 -5.240 .0000 GC| .07578*** .01833 4.134 .0000 A_AIR| 4.37035*** 1.05734 4.133 .0000 AIR_HIN1| .00428 .01306 .327 .7434 A_TRAIN| 5.91407*** .68993 8.572 .0000 TRA_HIN2| -.05907*** .01471 -4.016 .0001 A_BUS| 4.46269*** .72333 6.170 .0000 BUS_HIN3| -.02295 .01592 -1.442 .1493 --------+-------------------------------------------------- Alternative specific constants and interactions of ASCs and Household Income 36/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model +---------------------------------------------+ | Discrete Choice (One Level) Model | | Model Simulation Using Previous Estimates | | Number of observations 210 | +---------------------------------------------+ +------------------------------------------------------+ |Simulations of Probability Model | |Model: Discrete Choice (One Level) Model | |Simulated choice set may be a subset of the choices. | |Number of individuals is the probability times the | |number of observations in the simulated sample. | |The model used was simulated with 210 observations.| +------------------------------------------------------+ ------------------------------------------------------------------------Specification of scenario 1 is: Attribute Alternatives affected Change type Value --------- ------------------------------- ------------------- --------INVC CAR Scale base by value .900 ------------------------------------------------------------------------The simulator located 210 observations for this scenario. Simulated Probabilities (shares) for this scenario: +----------+--------------+--------------+------------------+ |Choice | Base | Scenario | Scenario - Base | | |%Share Number |%Share Number |ChgShare ChgNumber| +----------+--------------+--------------+------------------+ |TRAIN | 37.321 78 | 35.854 75 | -1.467% -3 | |BUS | 19.805 42 | 18.641 39 | -1.164% -3 | |CAR | 42.874 90 | 45.506 96 | 2.632% 6 | |Total |100.000 210 |100.000 210 | .000% 0 | +----------+--------------+--------------+------------------+ Model Simulation Step 37/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Willingness to Pay U(alt) = aj + bINCOME*INCOME + bAttribute*Attribute + … WTP = MU(Attribute)/MU(Income) When MU(Income) is not available, an approximation often used is –MU(Cost). U(Air,Train,Bus,Car) = αalt + βcost Cost + βINVT INVT + βTTME TTME + εalt WTP for less in vehicle time = -βINVT / βCOST WTP for less terminal time = -βTIME / βCOST 38/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model WTP from CLOGIT Model ----------------------------------------------------------Discrete choice (multinomial logit) model Dependent variable Choice --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------GC| -.00286 .00610 -.469 .6390 INVT| -.00349*** .00115 -3.037 .0024 TTME| -.09746*** .01035 -9.414 .0000 AASC| 4.05405*** .83662 4.846 .0000 TASC| 3.64460*** .44276 8.232 .0000 BASC| 3.19579*** .45194 7.071 .0000 --------+-------------------------------------------------WALD ; fn1=WTP_INVT=b_invt/b_gc ; fn2=WTP_TTME=b_ttme/b_gc$ ----------------------------------------------------------WALD procedure. --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------WTP_INVT| 1.22006 2.88619 .423 .6725 WTP_TTME| 34.0771 73.07097 .466 .6410 --------+-------------------------------------------------- Very different estimates suggests this might not be a very good model. 39/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Estimation in WTP Space Problem with WTP calculation : Ratio of two estimates that are asymptotically normally distributed may have infinite variance. Sample point estimates may be reasonable Inference - confidence intervals - may not be possible. WTP estimates often become unreasonable in random parameter models in which parameters vary across individuals. Estimation in WTP Space U(Air) = α + βCOST COST + βTIME TIME + βattr Attr + ε β β = α + βCOST COST + TIME TIME + attr Attr + ε βCOST βCOST = α + βCOST COST + θTIME TIME + θattr Attr + ε For a simple MNL the transformation is 1:1. Results will be identical to the original model. In more elaborate, RP models, results change. 40/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model The I.I.D Assumption Uitj = ij + ’xitj + ’zit + ijt F(itj) = Exp(-Exp(-itj)) (random part of each utility) Independence across utility functions Identical variances (means absorbed in constants) Restriction on equal scaling may be inappropriate Correlation across alternatives may be suppressed Equal cross elasticities is a substantive restriction Behavioral implication of IID is independence from irrelevant alternatives. If an alternative is removed, probability is spread equally across the remaining alternatives. This is unreasonable 41/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model IIA Implication of IID exp[U (train)] exp[U (air )] exp[U (train)] exp[U (bus )] exp[U (car )] exp[U (train)] Prob(train|train,bus,car) = exp[U (train)] exp[U (bus )] exp[U (car )] Air is in the choice set, probabilities are independent from air if air is Prob(train) = not in the condition. This is a testable behavioral assumption. 42/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model +---------------------------------------------------+ | Elasticity averaged over observations.| | Attribute is INVT in choice AIR | | Mean St.Dev | | * Choice=AIR -.2055 .0666 | | Choice=TRAIN .0903 .0681 | | Choice=BUS .0903 .0681 | | Choice=CAR .0903 .0681 | +---------------------------------------------------+ | Attribute is INVT in choice TRAIN | | Choice=AIR .3568 .1231 | | * Choice=TRAIN -.9892 .5217 | | Choice=BUS .3568 .1231 | | Choice=CAR .3568 .1231 | +---------------------------------------------------+ | Attribute is INVT in choice BUS | | Choice=AIR .1889 .0743 | | Choice=TRAIN .1889 .0743 | | * Choice=BUS -1.2040 .4803 | | Choice=CAR .1889 .0743 | +---------------------------------------------------+ | Attribute is INVT in choice CAR | | Choice=AIR .3174 .1195 | | Choice=TRAIN .3174 .1195 | | Choice=BUS .3174 .1195 | | * Choice=CAR -.9510 .5504 | +---------------------------------------------------+ | Effects on probabilities of all choices in model: | | * = Direct Elasticity effect of the attribute. | +---------------------------------------------------+ Behavioral Implication of IIA Own effect Cross effects Note the effect of IIA on the cross effects. Elasticities are computed for each observation; the mean and standard deviation are then computed across the sample observations. 43/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model A Hausman and McFadden Test for IIA Estimate full model with “irrelevant alternatives” Estimate the short model eliminating the irrelevant alternatives Eliminate individuals who chose the irrelevant alternatives Drop attributes that are constant in the surviving choice set. Do the coefficients change? Under the IIA assumption, they should not. Use a Hausman test: Chi-squared, d.f. Number of parameters estimated H = bshort - b full ' Vshort - Vfull bshort - b full -1 44/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model IIA Test for Choice AIR +--------+--------------+----------------+--------+--------+ |Variable| Coefficient | Standard Error |b/St.Er.|P[|Z|>z]| +--------+--------------+----------------+--------+--------+ GC | .06929537 .01743306 3.975 .0001 TTME | -.10364955 .01093815 -9.476 .0000 INVC | -.08493182 .01938251 -4.382 .0000 INVT | -.01333220 .00251698 -5.297 .0000 AASC | 5.20474275 .90521312 5.750 .0000 TASC | 4.36060457 .51066543 8.539 .0000 BASC | 3.76323447 .50625946 7.433 .0000 +--------+--------------+----------------+--------+--------+ GC | .53961173 .14654681 3.682 .0002 TTME | -.06847037 .01674719 -4.088 .0000 INVC | -.58715772 .14955000 -3.926 .0001 INVT | -.09100015 .02158271 -4.216 .0000 TASC | 4.62957401 .81841212 5.657 .0000 BASC | 3.27415138 .76403628 4.285 .0000 Matrix IIATEST has 1 rows and 1 columns. 1 +-------------1| 33.78445 Test statistic +------------------------------------+ | Listed Calculator Results | IIA is rejected +------------------------------------+ Result = 9.487729 Critical value 45/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Alternative to Utility Maximization (!) Minimizing Random Regret 46/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model RUM vs. Random Regret 47/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 48/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 49/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 50/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 51/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 52/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 53/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Fixed Effects Multinomial Logit: Application of Minimum Distance Estimation 54/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Binary Logit Conditional Probabiities ei xit Prob( yit 1| xit ) . 1 ei xit Ti Prob Yi1 yi1 , Yi 2 yi 2 , , YiTi yiTi yit t 1 Ti Ti exp yit xit exp yit xit β t 1 t 1 . Ti Ti dit xit All Ti different ways that exp dit xit β t dit Si exp Si t 1 t 1 t dit can equal Si Denominator is summed over all the different combinations of Ti values of yit that sum to the same sum as the observed Tt=1i yit . If Si is this sum, T there are terms. May be a huge number. An algorithm by Krailo Si and Pike makes it simple. 55/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Example: Seven Period Binary Logit Prob[y = (1,0,0,0,1,1,1)|Xi ]= exp( i x1 ) exp( i x 7 ) 1 ... 1 exp( i x1 ) 1 exp( i x 2 ) 1 exp( i x 7 ) There are 35 different sequences of y it (permutations) that sum to 4. For example, y*it| p1 might be (1,1,1,1,0,0,0). Etc. Prob[y=(1,0,0,0,1,1,1)|Xi ,t71y it =7] = exp t71 yit xit 7 * exp y t 1 it | p x it p 1 35 56/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 57/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model With T = 50, the number of permutations of sequences of y ranging from sum = 0 to sum = 50 ranges from 1 for 0 and 50, to 2.3 x 1012 for 15 or 35 up to a maximum of 1.3 x 1014 for sum =25. These are the numbers of terms that must be summed for a model with T = 50. In the application below, the sum ranges from 15 to 35. 58/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model The sample is 200 individuals each observed 50 times. 59/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model The data are generated from a probit process with b1 = b2 = .5. But, it is fit as a logit model. The coefficients obey the familiar relationship, 1.6*probit. 60/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Multinomial Logit Model: J+1 choices including a base choice. yitj = 1 if individual i makes choice j in period t x e ij itj Prob( yitj 1| xitj ) , j 1,..., J . 1 mJ 1eim xitm Prob( yit 0 1| xit 0 ) 1 . im xitm J 1 m 1e The probability attached to the sequence of choices is remarkably complicated. Ti exp y x j 1 itj itj t 1 Ti J J exp d x j 1 t ditj Sij j 1 it it t 1 J Ti exp y x β j 1 itj itj t 1 . Ti ditj xitj β All STiji different ways that exp t 1 t ditj can equal Sij J Denominator is summed over all the different combinations of Ti values of yitj that sum to the same sum as the observed Tt=1i yit . If Sij is this sum, T there are terms. May be a huge number. Larger yet by summing over choices. Sij 61/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Estimation Strategy Conditional ML of the full MNL model. Impressively complicated. A Minimum Distance (MDE) Strategy Each alternative treated as a binary choice vs. the base provides an estimator of Select subsample that chose either option j or the base Estimate using this binary choice setting This provides J different estimators of the same Optimally combine the different estimators of 62/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Minimum Distance Estimation There are J estimators βˆ j of the same parameter vector, βˆ . Each estimator is consistent and asymptotically normal. ˆ . How to combine the estimators? Estimated covariance matrices V j βˆ 1 βˆ * βˆ 1 βˆ * βˆ 2 βˆ * βˆ 2 βˆ * ˆ MDE: Minimize wrt * q = W ˆ ˆ ˆ ˆ β β β β * * J J What to use for the weighting matrix W? Any positive definite matrix will do. 63/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model MDE Estimation ˆ . How to combine the estimators? Estimated covariance matrices V j βˆ 1 βˆ * βˆ 1 βˆ * ˆ ˆ ˆ ˆ β 2 β* β β* W MDE: Minimize wrt βˆ * q = 2 . Propose a GLS approach ˆ ˆ ˆ ˆ β J β* β J β* ˆ V 1 0 W = A 1 0 0 ˆ V 2 0 0 0 ˆ V J 1 64/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model MDE Estimation ˆ βˆ 1 βˆ * V 1 βˆ 2 βˆ * 0 ˆ MDE: Minimize wrt β* q = ˆ ˆ β J β* 0 0 ˆ V 2 0 0 0 ˆ V J 1 βˆ 1 βˆ * ˆ ˆ β 2 β* . ˆ ˆ β J β* 1 1 1 1 ˆ ˆ 1βˆ ˆ 1βˆ ... V ˆ 1βˆ V ˆ ˆ ˆ The solution is β* V1 V2 ... VJ V J J 2 2 1 1 1 J ˆ 1βˆ ˆ 1 J V = j 1 V j j 1 j j J J = j 1 H j βˆ j where j 1 H j I 65/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 66/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 67/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 68/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 69/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 70/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 71/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 72/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 73/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Why a 500 fold increase in speed? MDE is much faster Not using Krailo and Pike, or not using efficiently Numerical derivatives for an extremely messy function (increase the number of function evaluations by at least 5 times) 74/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Rank Data and Best/Worst 75/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 76/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Rank Data and Exploded Logit Alt 1 is the best overall Alt 3 is the best among remaining alts 2,3,4,5 Alt 5 is the best among remaining alts 2,4,5 Alt 2 is the best among remaining alts 2,4 Alt 4 is the worst. 77/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Exploded Logit U[j] = jth favorite alternative among 5 alternatives U[1] = the choice made if the individual indicates only the favorite Prob{j = [1],[2],[3],[4],[5]} = Prob{[1]|choice set = [1]...[5]} Prob{[2]|choice set = [2]...[5]} Prob{[3]|choice set = [3]...[5]} Prob{[4]|choice set = [4],[5]} 1 78/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Exploded Logit U[j] = jth favorite alternative among 5 alternatives U[1] = the choice made if the individual indicates only the favorite Individual ranked the alternatives 1,3,5,2,4 Prob{This set of ranks} exp(x3 ) exp(x1 ) = j 1,2,3,4,5 exp(x j ) j 2,3,4,5 exp(x j ) exp(x5 ) j 2,4,5 exp(x j ) exp(x 2 ) j 2,4 exp(x j ) 1 79/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Best Worst Individual simultaneously ranks best and worst alternatives. Prob(alt j) = best = exp[U(j)] / mexp[U(m)] Prob(alt k) = worst = exp[-U(k)] / mexp[-U(m)] 80/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 81/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Choices 82/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Best 83/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Worst 84/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 85/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Uses the result that if U(i,j) is the lowest utility, -U(i,j) is the highest. 86/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Uses the result that if U(i,j) is the lowest utility, -U(i,j) is the highest. 87/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 88/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 89/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 90/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Nonlinear Utility Functions Generalized (in functional form) multinomial logit model U(i, j) = Vj (x ij,zi, ) + ε ij (Utility function may vary by choice.) F(ε ij ) = exp(-exp(-(ε ij )) - the standard IID assumptions for MNL Prob(i, j) = exp Vj (xij ,zi ,β) J m=1 exp Vm (x im ,zi,β) Estimation problem is more complicated in practical terms Large increase in model flexibility. Note : Coefficients are no longer generic. WTP(i, k | j) = - Vj (xij ,zi ,β) / xi,j (k) Vj (xij ,zi ,β) / Cost 91/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Assessing Prospect Theoretic Functional Forms and Risk in a Nonlinear Logit Framework: Valuing Reliability Embedded Travel Time Savings David Hensher The University of Sydney, ITLS William Greene Stern School of Business, New York University 8th Annual Advances in Econometrics Conference Louisiana State University Baton Rouge, LA November 6-8, 2009 Hensher, D., Greene, W., “Embedding Risk Attitude and Decisions Weights in Non-linear Logit to Accommodate Time Variability in the Value of Expected Travel Time Savings,” Transportation Research Part B 92/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Prospect Theory Marginal value function for an attribute (outcome) v(xm) = subjective value of attribute Decision weight w(pm) = impact of a probability on utility of a prospect Value function V(xm,pm) = v(xm)w(pm) = value of a prospect that delivers outcome xm with probability pm We explore functional forms for w(pm) with implications for decisions 93/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model An Application of Valuing Reliability (due to Ken Small) late late 94/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Stated Choice Survey Trip Attributes in Stated Choice Design Routes A and B Free flow travel time Slowed down travel time Stop/start/crawling travel time Minutes arriving earlier than expected Minutes arriving later than expected Probability of arriving earlier than expected Probability of arriving at the time expected Probability of arriving later than expected Running cost Toll Cost Individual Characteristics: Age, Income, Gender 95/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Value and Weighting Functions x1-α Value Function : V(x) = 1- α Weighting Functions : Model 1 = pmγ 1 γ γ [pmγ + (1- pm ) ] τPmγ Model 2 = [τPmγ + (1- pm )γ ] Model 3 = exp(-τ(-lnpm )γ ) Model 4 = exp(-(-lnpm )γ ) 96/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Choice Model U(j) = βref + βcostCost + βAgeAge + βTollTollASC + βcurr w(pcurr)v(tcurr) + βlate w(plate) v(tlate) + βearly w(pearly)v(tearly) + εj Constraint: βcurr = βlate = βearly U(j) = βref + βcostCost + βAgeAge + βTollTollASC + β[w(pcurr)v(tcurr) + w(plate)v(tlate) + w(pearly)v(tearly)] + εj 97/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Application 2008 study undertaken in Australia toll vs. free roads stated choice (SC) experiment involving two SC alternatives (i.e., route A and route B) pivoted around the knowledge base of travellers (i.e., the current trip). 280 Individuals 32 Choice Situations (2 blocks of 16) 98/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Data 99/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 100/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Reliability Embedded Value of Travel Time Savings in Au$/hr $4.50 101/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model SCALING IN CHOICE MODELS 102/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Using Degenerate Branches to Reveal Scaling LIMB BRANCH TWIG Travel Fly Air Rail Train Drive Car GrndPblc Bus 103/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Scaling in Transport Modes ----------------------------------------------------------FIML Nested Multinomial Logit Model Dependent variable MODE Log likelihood function -182.42834 The model has 2 levels. Nested Logit form:IVparms=Taub|l,r,Sl|r & Fr.No normalizations imposed a priori Number of obs.= 210, skipped 0 obs --------+-------------------------------------------------Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------|Attributes in the Utility Functions (beta) GC| .09622** .03875 2.483 .0130 TTME| -.08331*** .02697 -3.089 .0020 INVT| -.01888*** .00684 -2.760 .0058 INVC| -.10904*** .03677 -2.966 .0030 A_AIR| 4.50827*** 1.33062 3.388 .0007 A_TRAIN| 3.35580*** .90490 3.708 .0002 A_BUS| 3.11885** 1.33138 2.343 .0192 |IV parameters, tau(b|l,r),sigma(l|r),phi(r) FLY| 1.65512** .79212 2.089 .0367 RAIL| .92758*** .11822 7.846 .0000 LOCLMASS| 1.00787*** .15131 6.661 .0000 DRIVE| 1.00000 ......(Fixed Parameter)...... --------+-------------------------------------------------- NLOGIT ; Lhs=mode ; Rhs=gc,ttme,invt,invc,one ; Choices=air,train,bus,car ; Tree=Fly(Air), Rail(train), LoclMass(bus), Drive(Car) ; ivset:(drive)=[1]$ 104/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model A Model with Choice Heteroscedasticity U(i,t, j) α j + β'x itj + γ j'zit + σ jε i,t,j F(ε i,t,j ) = exp(-exp(-ε i,t,j )) IID after scaling by a choice specific scale parameter P[choice = j | x itj , zit ,i,t] = Prob[Ui,t,j Ui,t,k ], k = 1,...,J(i,t) = exp (α j + β'x itj + γ j'zit ) / j J(i,t) exp(α j + β'x itj + γ j'zit ) / j Normalization required as only ratios can be estimated; j=1 σ j = 1 for one of the alternatives (Remember the integrability problem - scale is not identified.) 105/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Heteroscedastic Extreme Value Model (1) +---------------------------------------------+ | Start values obtained using MNL model | | Maximum Likelihood Estimates | | Log likelihood function -184.5067 | | Dependent variable Choice | | Response data are given as ind. choice. | | Number of obs.= 210, skipped 0 bad obs. | +---------------------------------------------+ +--------+--------------+----------------+--------+--------+ |Variable| Coefficient | Standard Error |b/St.Er.|P[|Z|>z]| +--------+--------------+----------------+--------+--------+ GC | .06929537 .01743306 3.975 .0001 TTME | -.10364955 .01093815 -9.476 .0000 INVC | -.08493182 .01938251 -4.382 .0000 INVT | -.01333220 .00251698 -5.297 .0000 AASC | 5.20474275 .90521312 5.750 .0000 TASC | 4.36060457 .51066543 8.539 .0000 BASC | 3.76323447 .50625946 7.433 .0000 106/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Heteroscedastic Extreme Value Model (2) +---------------------------------------------+ | Heteroskedastic Extreme Value Model | | Log likelihood function -182.4440 | (MNL logL was -184.5067) | Number of parameters 10 | | Restricted log likelihood -291.1218 | +---------------------------------------------+ +--------+--------------+----------------+--------+--------+ |Variable| Coefficient | Standard Error |b/St.Er.|P[|Z|>z]| +--------+--------------+----------------+--------+--------+ ---------+Attributes in the Utility Functions (beta) GC | .11903513 .06402510 1.859 .0630 TTME | -.11525581 .05721397 -2.014 .0440 INVC | -.15515877 .07928045 -1.957 .0503 INVT | -.02276939 .01122762 -2.028 .0426 AASC | 4.69411460 2.48091789 1.892 .0585 TASC | 5.15629868 2.05743764 2.506 .0122 BASC | 5.03046595 1.98259353 2.537 .0112 ---------+Scale Parameters of Extreme Value Distns Minus 1.0 s_AIR | -.57864278 .21991837 -2.631 .0085 Normalized for estimation s_TRAIN | -.45878559 .34971034 -1.312 .1896 s_BUS | .26094835 .94582863 .276 .7826 s_CAR | .000000 ......(Fixed Parameter)....... ---------+Std.Dev=pi/(theta*sqr(6)) for H.E.V. distribution. s_AIR | 3.04385384 1.58867426 1.916 .0554 Structural parameters s_TRAIN | 2.36976283 1.53124258 1.548 .1217 s_BUS | 1.01713111 .76294300 1.333 .1825 s_CAR | 1.28254980 ......(Fixed Parameter)....... 107/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model HEV Model - Elasticities +---------------------------------------------------+ | Elasticity averaged over observations.| | Attribute is INVC in choice AIR | | Effects on probabilities of all choices in model: | | * = Direct Elasticity effect of the attribute. | | Mean St.Dev | | * Choice=AIR -4.2604 1.6745 | | Choice=TRAIN 1.5828 1.9918 | | Choice=BUS 3.2158 4.4589 | | Choice=CAR 2.6644 4.0479 | | Attribute is INVC in choice TRAIN | | Choice=AIR .7306 .5171 | | * Choice=TRAIN -3.6725 4.2167 | | Choice=BUS 2.4322 2.9464 | | Choice=CAR 1.6659 1.3707 | | Attribute is INVC in choice BUS | | Choice=AIR .3698 .5522 | | Choice=TRAIN .5949 1.5410 | | * Choice=BUS -6.5309 5.0374 | | Choice=CAR 2.1039 8.8085 | | Attribute is INVC in choice CAR | | Choice=AIR .3401 .3078 | | Choice=TRAIN .4681 .4794 | | Choice=BUS 1.4723 1.6322 | | * Choice=CAR -3.5584 9.3057 | +---------------------------------------------------+ Multinomial Logit +---------------------------+ | INVC in AIR | | Mean St.Dev | | * -5.0216 2.3881 | | 2.2191 2.6025 | | 2.2191 2.6025 | | 2.2191 2.6025 | | INVC in TRAIN | | 1.0066 .8801 | | * -3.3536 2.4168 | | 1.0066 .8801 | | 1.0066 .8801 | | INVC in BUS | | .4057 .6339 | | .4057 .6339 | | * -2.4359 1.1237 | | .4057 .6339 | | INVC in CAR | | .3944 .3589 | | .3944 .3589 | | .3944 .3589 | | * -1.3888 1.2161 | +---------------------------+ 108/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Variance Heterogeneity in MNL We extend the HEV model by allowing variances to differ across individuals U(i,t, j) = α j + β'x itj + γ j'zit + σ ijε i,t,j σ ij = exp( j + δw i ). δ = 0 returns the HEV model F(ε i,t,j ) = exp(-exp(-ε i,t,j )) j = 0 for one of the alternatives Scaling now differs both across alternatives and across individuals 109/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Application: Shoe Brand Choice Simulated Data: Stated Choice, 400 respondents, 8 choice situations, 3,200 observations 3 choice/attributes + NONE Fashion = High / Low Quality = High / Low Price = 25/50/75,100 coded 1,2,3,4 Heterogeneity: Sex, Age (<25, 25-39, 40+) Underlying data generated by a 3 class latent class process (100, 200, 100 in classes) 110/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Multinomial Logit Baseline Values +---------------------------------------------+ | Discrete choice (multinomial logit) model | | Number of observations 3200 | | Log likelihood function -4158.503 | | Number of obs.= 3200, skipped 0 bad obs. | +---------------------------------------------+ +--------+--------------+----------------+--------+--------+ |Variable| Coefficient | Standard Error |b/St.Er.|P[|Z|>z]| +--------+--------------+----------------+--------+--------+ FASH | 1.47890473 .06776814 21.823 .0000 QUAL | 1.01372755 .06444532 15.730 .0000 PRICE | -11.8023376 .80406103 -14.678 .0000 ASC4 | .03679254 .07176387 .513 .6082 111/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Multinomial Logit Elasticities +---------------------------------------------------+ | Elasticity averaged over observations.| | Attribute is PRICE in choice BRAND1 | | Effects on probabilities of all choices in model: | | * = Direct Elasticity effect of the attribute. | | Mean St.Dev | | * Choice=BRAND1 -.8895 .3647 | | Choice=BRAND2 .2907 .2631 | | Choice=BRAND3 .2907 .2631 | | Choice=NONE .2907 .2631 | | Attribute is PRICE in choice BRAND2 | | Choice=BRAND1 .3127 .1371 | | * Choice=BRAND2 -1.2216 .3135 | | Choice=BRAND3 .3127 .1371 | | Choice=NONE .3127 .1371 | | Attribute is PRICE in choice BRAND3 | | Choice=BRAND1 .3664 .2233 | | Choice=BRAND2 .3664 .2233 | | * Choice=BRAND3 -.7548 .3363 | | Choice=NONE .3664 .2233 | +---------------------------------------------------+ 112/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model HEV Model without Heterogeneity +---------------------------------------------+ | Heteroskedastic Extreme Value Model | | Dependent variable CHOICE | | Number of observations 3200 | | Log likelihood function -4151.611 | | Response data are given as ind. choice. | +---------------------------------------------+ +--------+--------------+----------------+--------+--------+ |Variable| Coefficient | Standard Error |b/St.Er.|P[|Z|>z]| +--------+--------------+----------------+--------+--------+ ---------+Attributes in the Utility Functions (beta) FASH | 1.57473345 .31427031 5.011 .0000 QUAL | 1.09208463 .22895113 4.770 .0000 PRICE | -13.3740754 2.61275111 -5.119 .0000 ASC4 | -.01128916 .22484607 -.050 .9600 ---------+Scale Parameters of Extreme Value Distns Minus 1.0 s_BRAND1| .03779175 .22077461 .171 .8641 s_BRAND2| -.12843300 .17939207 -.716 .4740 s_BRAND3| .01149458 .22724947 .051 .9597 s_NONE | .000000 ......(Fixed Parameter)....... ---------+Std.Dev=pi/(theta*sqr(6)) for H.E.V. distribution. s_BRAND1| 1.23584505 .26290748 4.701 .0000 s_BRAND2| 1.47154471 .30288372 4.858 .0000 s_BRAND3| 1.26797496 .28487215 4.451 .0000 s_NONE | 1.28254980 ......(Fixed Parameter)....... Essentially no differences in variances across choices 113/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Homogeneous HEV Elasticities Multinomial Logit +---------------------------------------------------+ | Attribute is PRICE in choice BRAND1 | | Mean St.Dev | | * Choice=BRAND1 -1.0585 .4526 | | Choice=BRAND2 .2801 .2573 | | Choice=BRAND3 .3270 .3004 | | Choice=NONE .3232 .2969 | | Attribute is PRICE in choice BRAND2 | | Choice=BRAND1 .3576 .1481 | | * Choice=BRAND2 -1.2122 .3142 | | Choice=BRAND3 .3466 .1426 | | Choice=NONE .3429 .1411 | | Attribute is PRICE in choice BRAND3 | | Choice=BRAND1 .4332 .2532 | | Choice=BRAND2 .3610 .2116 | | * Choice=BRAND3 -.8648 .4015 | | Choice=NONE .4156 .2436 | +---------------------------------------------------+ | Elasticity averaged over observations.| | Effects on probabilities of all choices in model: | | * = Direct Elasticity effect of the attribute. | +---------------------------------------------------+ +--------------------------+ | PRICE in choice BRAND1| | Mean St.Dev | | * -.8895 .3647 | | .2907 .2631 | | .2907 .2631 | | .2907 .2631 | | PRICE in choice BRAND2| | .3127 .1371 | | * -1.2216 .3135 | | .3127 .1371 | | .3127 .1371 | | PRICE in choice BRAND3| | .3664 .2233 | | .3664 .2233 | | * -.7548 .3363 | | .3664 .2233 | +--------------------------+ 114/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Heteroscedasticity Across Individuals +---------------------------------------------+ | Heteroskedastic Extreme Value Model | Homog-HEV | Log likelihood function -4129.518[10] | -4151.611[7] +---------------------------------------------+ +--------+--------------+----------------+--------+--------+ |Variable| Coefficient | Standard Error |b/St.Er.|P[|Z|>z]| +--------+--------------+----------------+--------+--------+ ---------+Attributes in the Utility Functions (beta) FASH | 1.01640726 .20261573 5.016 .0000 QUAL | .55668491 .11604080 4.797 .0000 PRICE | -7.44758292 1.52664112 -4.878 .0000 ASC4 | .18300524 .09678571 1.891 .0586 ---------+Scale Parameters of Extreme Value Distributions s_BRAND1| .81114924 .10099174 8.032 .0000 s_BRAND2| .72713522 .08931110 8.142 .0000 s_BRAND3| .80084114 .10316939 7.762 .0000 s_NONE | 1.00000000 ......(Fixed Parameter)....... ---------+Heterogeneity in Scales of Ext.Value Distns. MALE | .21512161 .09359521 2.298 .0215 AGE25 | .79346679 .13687581 5.797 .0000 AGE39 | .38284617 .16129109 2.374 .0176 MNL -4158.503[4] 115/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Variance Heterogeneity Elasticities Multinomial Logit +---------------------------------------------------+ | Attribute is PRICE in choice BRAND1 | | Mean St.Dev | | * Choice=BRAND1 -.8978 .5162 | | Choice=BRAND2 .2269 .2595 | | Choice=BRAND3 .2507 .2884 | | Choice=NONE .3116 .3587 | | Attribute is PRICE in choice BRAND2 | | Choice=BRAND1 .2853 .1776 | | * Choice=BRAND2 -1.0757 .5030 | | Choice=BRAND3 .2779 .1669 | | Choice=NONE .3404 .2045 | | Attribute is PRICE in choice BRAND3 | | Choice=BRAND1 .3328 .2477 | | Choice=BRAND2 .2974 .2227 | | * Choice=BRAND3 -.7458 .4468 | | Choice=NONE .4056 .3025 | +---------------------------------------------------+ +--------------------------+ | PRICE in choice BRAND1| | Mean St.Dev | | * -.8895 .3647 | | .2907 .2631 | | .2907 .2631 | | .2907 .2631 | | PRICE in choice BRAND2| | .3127 .1371 | | * -1.2216 .3135 | | .3127 .1371 | | .3127 .1371 | | PRICE in choice BRAND3| | .3664 .2233 | | .3664 .2233 | | * -.7548 .3363 | | .3664 .2233 | +--------------------------+ 116/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 117/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model 118/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Generalized Mixed Logit Model U(i,t, j) = βi xi,t,j Common effects + εi,t,j Random Parameters βi = σi [β + Δhi ] + [γ + σi (1- γ)]Γivi Γi = ΛΣi Λ is a lower triangular matrix with 1s on the diagonal (Cholesky matrix) Σi is a diagonal matrix with φk exp(ψkhi ) Overall preference scaling σi = exp(-τi2 / 2 + τi w i + θhi ] τi = exp( λri ) 0 < γ <1 119/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Unobserved Heterogeneity in Scaling HEV formulation: U it , j x itj (1 / i )it , j Generalized model with = 1 and = [0]. Produces a scaled multinomial logit model with i i exp(i x itj ) Prob(choiceit = j) = , i 1,..., N , j 1,..., J it , t 1,..., Ti J it j 1 exp(i xitj ) The random variation in the scaling is i exp( 2 / 2 wi ) The variation across individuals may also be observed, so that i exp( 2 / 2 wi zi ) 120/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Scaled MNL 121/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Observed and Unobserved Heterogeneity 122/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Price Elasticities 123/122: Topic 3.3 – Discrete Choice; The Multinomial Logit Model Data on Discrete Choices CHOICE MODE TRAVEL AIR .00000 TRAIN .00000 BUS .00000 CAR 1.0000 AIR .00000 TRAIN .00000 BUS .00000 CAR 1.0000 AIR .00000 TRAIN .00000 BUS 1.0000 CAR .00000 AIR .00000 TRAIN .00000 BUS .00000 CAR 1.0000 INVC 59.000 31.000 25.000 10.000 58.000 31.000 25.000 11.000 127.00 109.00 52.000 50.000 44.000 25.000 20.000 5.0000 ATTRIBUTES INVT TTME 100.00 69.000 372.00 34.000 417.00 35.000 180.00 .00000 68.000 64.000 354.00 44.000 399.00 53.000 255.00 .00000 193.00 69.000 888.00 34.000 1025.0 60.000 892.00 .00000 100.00 64.000 351.00 44.000 361.00 53.000 180.00 .00000 GC 70.000 71.000 70.000 30.000 68.000 84.000 85.000 50.000 148.00 205.00 163.00 147.00 59.000 78.000 75.000 32.000 CHARACTERISTIC HINC 35.000 35.000 35.000 35.000 30.000 30.000 30.000 30.000 60.000 60.000 60.000 60.000 70.000 70.000 70.000 70.000 This is the ‘long form.’ In the ‘wide form,’ all data for the individual appear on a single ‘line’. The ‘wide form’ is unmanageable for models of any complexity and for stated preference applications.