Discrete Choice Modeling
William Greene
Stern School of Business, New York University

Lab Sessions

Lab Session 8
Discrete Choice, Multinomial Logit Model

Observed Data
  Types of data
    Individual choice
    Market shares
    Frequencies
    Ranks
  Attributes and characteristics
  Choice settings
    Cross section
    Repeated measurement (panel data)

Data for Multinomial Choice

  Line  MODE    TRAVEL    INVC     INVT     TTME     GC       HINC
     1  AIR     .00000    59.000   100.00   69.000   70.000   35.000
     2  TRAIN   .00000    31.000   372.00   34.000   71.000   35.000
     3  BUS     .00000    25.000   417.00   35.000   70.000   35.000
     4  CAR     1.0000    10.000   180.00   .00000   30.000   35.000
     5  AIR     .00000    58.000   68.000   64.000   68.000   30.000
     6  TRAIN   .00000    31.000   354.00   44.000   84.000   30.000
     7  BUS     .00000    25.000   399.00   53.000   85.000   30.000
     8  CAR     1.0000    11.000   255.00   .00000   50.000   30.000
   321  AIR     .00000    127.00   193.00   69.000   148.00   60.000
   322  TRAIN   .00000    109.00   888.00   34.000   205.00   60.000
   323  BUS     1.0000    52.000   1025.0   60.000   163.00   60.000
   324  CAR     .00000    50.000   892.00   .00000   147.00   60.000
   325  AIR     .00000    44.000   100.00   64.000   59.000   70.000
   326  TRAIN   .00000    25.000   351.00   44.000   78.000   70.000
   327  BUS     .00000    20.000   361.00   53.000   75.000   70.000
   328  CAR     1.0000    5.0000   180.00   .00000   32.000   70.000

Using NLOGIT To Fit the Model
  Start program
  Load CLOGIT.LPJ project
  Use command builder dialog box, or
  Use typed commands in editor

Specification of Choice Variable

Specification of Utility Functions
  Copy the variable names from the list at the right into the appropriate window at the left, then press Run.

Submit Command from Editor
  (1) Type commands in editor
  (2) Highlight by dragging mouse
  (3) Press GO button

Command Structure

  Generic
    CLOGIT (or NLOGIT)
      ; Lhs = choice variable
      ; Choices = list of labels for the J choices
      ; RHS = list of attributes that vary by choice
      ; RH2 = list of attributes that do not vary by choice $

  For this application
    CLOGIT (or NLOGIT)
      ; Lhs = MODE
      ; Choices = Air, Train, Bus, Car
      ; RHS = TTME,INVC,INVT,GC
      ; RH2 = ONE, HINC $

Output Window
  Note: the coefficient on GC has the wrong sign!
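To make the link between the stacked data layout and the fitted model concrete, the following Python sketch computes multinomial logit probabilities for the first traveler (rows 1 to 4 of the data listing). It is only an illustration, not NLOGIT's computation: the coefficient values are copied from the unweighted estimates reported later in this session, and it is an assumption that they belong to the application command above, with CAR as the base alternative for the constants and the HINC interactions. The function and variable names are hypothetical.

```python
import math

# Rows 1-4 of the data listing: one traveler, four alternatives.
ATTRS = {
    "AIR":   {"TTME": 69.0, "INVC": 59.0, "INVT": 100.0, "GC": 70.0},
    "TRAIN": {"TTME": 34.0, "INVC": 31.0, "INVT": 372.0, "GC": 71.0},
    "BUS":   {"TTME": 35.0, "INVC": 25.0, "INVT": 417.0, "GC": 70.0},
    "CAR":   {"TTME": 0.0,  "INVC": 10.0, "INVT": 180.0, "GC": 30.0},
}
HINC = 35.0  # household income, constant across the four rows

# Assumption: unweighted estimates from the later output table.
BETA = {"TTME": -0.10289, "INVC": -0.08044, "INVT": -0.01399, "GC": 0.07578}
ASC = {"AIR": 4.37035, "TRAIN": 5.91407, "BUS": 4.46269, "CAR": 0.0}
HINC_COEF = {"AIR": 0.00428, "TRAIN": -0.05907, "BUS": -0.02295, "CAR": 0.0}


def mnl_probabilities(attrs, hinc):
    """Return P(alt) = exp(V_alt) / sum_m exp(V_m) for one individual."""
    utility = {}
    for alt, x in attrs.items():
        generic = sum(BETA[name] * x[name] for name in BETA)  # RHS attributes
        utility[alt] = generic + ASC[alt] + HINC_COEF[alt] * hinc  # RH2 terms
    denom = sum(math.exp(v) for v in utility.values())
    return {alt: math.exp(v) / denom for alt, v in utility.items()}


probs = mnl_probabilities(ATTRS, HINC)
for alt, p in probs.items():
    print(f"{alt:5s} {p:.3f}")
```

Under these assumed coefficients the largest fitted probability for this traveler falls on CAR, the alternative recorded as chosen on line 4.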
Effects of Changes in Attributes on Probabilities

  Partial effects: the effect of a change in attribute "k" of alternative "m" on the probability that choice "j" will be made is

    \[ \frac{\partial P_j}{\partial x_{mk}} = P_j \,[\mathbf{1}(j = m) - P_m]\, \beta_k \]

  Proportional changes: elasticities

    \[ \frac{\partial \log P_j}{\partial \log x_{mk}} = \frac{x_{mk}}{P_j}\, P_j \,[\mathbf{1}(j = m) - P_m]\, \beta_k = [\mathbf{1}(j = m) - P_m]\, \beta_k\, x_{mk} \]

  Note: for j ≠ m the elasticity is -P_m β_k x_mk, which is the same for all choices "j" (IIA).

Elasticities for CLOGIT

  Request:  ; Effects: attribute (choices where changes)
  Example:  ; Effects: INVT(*)     (INVT changes in all choices)

  +---------------------------------------------------+
  | Elasticity             averaged over observations.|
  | Attribute is INVT in choice AIR                   |
  | Effects on probabilities of all choices in model: |
  | * = Direct Elasticity effect of the attribute.    |
  |                       Mean      St.Dev            |
  | *  Choice=AIR      -1.3363       .7275            |
  |    Choice=TRAIN      .5349       .6358            |
  |    Choice=BUS        .5349       .6358            |
  |    Choice=CAR        .5349       .6358            |
  | Attribute is INVT in choice TRAIN                 |
  |    Choice=AIR       2.2153      2.4366            |
  | *  Choice=TRAIN    -6.2976      4.0280            |
  |    Choice=BUS       2.2153      2.4366            |
  |    Choice=CAR       2.2153      2.4366            |
  | Attribute is INVT in choice BUS                   |
  |    Choice=AIR       1.1942      1.7469            |
  |    Choice=TRAIN     1.1942      1.7469            |
  | *  Choice=BUS      -7.6150      3.4417            |
  |    Choice=CAR       1.1942      1.7469            |
  | Attribute is INVT in choice CAR                   |
  |    Choice=AIR       2.0852      2.0953            |
  |    Choice=TRAIN     2.0852      2.0953            |
  |    Choice=BUS       2.0852      2.0953            |
  | *  Choice=CAR      -5.9367      3.7493            |
  +---------------------------------------------------+

  Rows marked * are own effects; all other rows are cross effects.
  Note the effect of IIA on the cross effects: within each block they are identical.

Other Useful Options
  ; Describe      for descriptive statistics, by alternative
  ; Crosstab      for crosstabulations of actuals and predicted
  ; List          for listing of outcomes and predictions
  ; Prob = name   to create a new variable with fitted probabilities
  ; IVB = name    to create a new variable with the log sum (inclusive value)

Analyzing Behavior of Market Shares
  Scenario: What happens to the number of people who make specific choices if a particular attribute changes in a specified way?
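The scenario question can be illustrated with a small numerical sketch in Python: scale one attribute of one alternative, recompute every individual's probabilities, and compare the average shares with the base case. This is not NLOGIT's simulator. It uses only the two travelers on lines 1 to 8 of the data listing, and it assumes the unweighted coefficient estimates reported later in this session with CAR as the base alternative. All names in the code are hypothetical.

```python
import math

# Assumption: unweighted estimates from the later output table.
BETA = {"TTME": -0.10289, "INVC": -0.08044, "INVT": -0.01399, "GC": 0.07578}
ASC = {"AIR": 4.37035, "TRAIN": 5.91407, "BUS": 4.46269, "CAR": 0.0}
HINC_COEF = {"AIR": 0.00428, "TRAIN": -0.05907, "BUS": -0.02295, "CAR": 0.0}

# Lines 1-8 of the data listing: (TTME, INVC, INVT, GC) per alternative, then HINC.
TRAVELERS = [
    ({"AIR": (69, 59, 100, 70), "TRAIN": (34, 31, 372, 71),
      "BUS": (35, 25, 417, 70), "CAR": (0, 10, 180, 30)}, 35),
    ({"AIR": (64, 58, 68, 68), "TRAIN": (44, 31, 354, 84),
      "BUS": (53, 25, 399, 85), "CAR": (0, 11, 255, 50)}, 30),
]


def probabilities(alts, hinc, invc_scale=None):
    """MNL probabilities for one traveler; invc_scale maps alternative -> multiplier."""
    invc_scale = invc_scale or {}
    v = {}
    for alt, (ttme, invc, invt, gc) in alts.items():
        invc = invc * invc_scale.get(alt, 1.0)  # apply the scenario, if any
        v[alt] = (BETA["TTME"] * ttme + BETA["INVC"] * invc
                  + BETA["INVT"] * invt + BETA["GC"] * gc
                  + ASC[alt] + HINC_COEF[alt] * hinc)
    denom = sum(math.exp(u) for u in v.values())
    return {alt: math.exp(u) / denom for alt, u in v.items()}


def shares(invc_scale=None):
    """Average the individual probabilities to get predicted market shares."""
    per_person = [probabilities(a, h, invc_scale) for a, h in TRAVELERS]
    return {alt: sum(p[alt] for p in per_person) / len(per_person) for alt in ASC}


base = shares()
scenario = shares({"CAR": 1.25})  # in-vehicle cost of CAR rises by 25%
for alt in ASC:
    print(f"{alt:5s} base {base[alt]:.3f}  scenario {scenario[alt]:.3f}  "
          f"change {scenario[alt] - base[alt]:+.3f}")
```

Because the cost coefficient is negative, the CAR share falls and the other three shares rise. For any single traveler the ratio of two unaffected probabilities (for example AIR to TRAIN) is unchanged by the scenario, which is the IIA property seen in the cross elasticities.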
  Fit the model first, then, using the identical model setup, add
    ; Simulation = list of choices to be analyzed
    ; Scenario = Attribute (in choices) = type of change

  For the CLOGIT application, for example
    ; Simulation = *        ? This is ALL choices
    ; Scenario: INVC(car)=[*]1.25 $
  (INVC rises by 25%)

More Complicated Model Simulation
  In vehicle cost of CAR rises by 25%
  Market is limited to ground (Train, Bus, Car)

    NLOGIT ; Lhs = Mode
           ; Choices = Air,Train,Bus,Car
           ; Rhs = TTME,INVC,INVT,GC
           ; Rh2 = One,Hinc
           ; Simulation = TRAIN,BUS,CAR
           ; Scenario: INVC(car)=[*]1.25 $

Model Simulation
  In vehicle cost of CAR rises by 25%

  +------------------------------------------------------+
  |Simulations of Probability Model                      |
  |Model: Discrete Choice (One Level) Model              |
  |Simulated choice set may be a subset of the choices.  |
  |Number of individuals is the probability times the    |
  |number of observations in the simulated sample.       |
  |Column totals may be affected by rounding error.      |
  |The model used was simulated with    210 observations.|
  +------------------------------------------------------+
  ------------------------------------------------------------------------
  Specification of scenario 1 is:
  Attribute  Alternatives affected            Change type          Value
  ---------  -------------------------------  -------------------  ---------
  INVC       CAR                              Scale base by value      1.250
  ------------------------------------------------------------------------
  The simulator located 209 observations for this scenario.
  Simulated Probabilities (shares) for this scenario:
  +----------+--------------+--------------+------------------+
  |Choice    |     Base     |   Scenario   | Scenario - Base  |
  |          |%Share Number |%Share Number |ChgShare ChgNumber|
  +----------+--------------+--------------+------------------+
  |TRAIN     | 37.321    78 | 40.711    85 |  3.390%        7 |
  |BUS       | 19.805    42 | 22.560    47 |  2.755%        5 |
  |CAR       | 42.874    90 | 36.729    77 | -6.145%      -13 |
  |Total     |100.000   210 |100.000   209 |   .000%       -1 |
  +----------+--------------+--------------+------------------+
  The last column gives the changes in the predicted market shares when INVC_CAR changes.

Compound Scenario: INVC(Car) falls by 10%, TTME(Air,Train) rises by 25% (at the same time).

    ; Simulation = *
    ; Scenario: INVC(car)=[*]0.9 / TTME(air,train)=[*]1.25

  +------------------------------------------------------+
  |Simulations of Probability Model                      |
  |Model: Discrete Choice (One Level) Model              |
  |Simulated choice set may be a subset of the choices.  |
  |Number of individuals is the probability times the    |
  |number of observations in the simulated sample.       |
  |Column totals may be affected by rounding error.      |
  |The model used was simulated with    210 observations.|
  +------------------------------------------------------+
  ------------------------------------------------------------------------
  Specification of scenario 1 is:
  Attribute  Alternatives affected            Change type          Value
  ---------  -------------------------------  -------------------  ---------
  INVC       CAR                              Scale base by value       .900
  TTME       AIR TRAIN                        Scale base by value      1.250
  ------------------------------------------------------------------------
  The simulator located 209 observations for this scenario.
  Simulated Probabilities (shares) for this scenario:
  +----------+--------------+--------------+------------------+
  |Choice    |     Base     |   Scenario   | Scenario - Base  |
  |          |%Share Number |%Share Number |ChgShare ChgNumber|
  +----------+--------------+--------------+------------------+
  |AIR       | 27.619    58 | 16.516    35 |-11.103%      -23 |
  |TRAIN     | 30.000    63 | 23.012    48 | -6.988%      -15 |
  |BUS       | 14.286    30 | 18.495    39 |  4.209%        9 |
  |CAR       | 28.095    59 | 41.977    88 | 13.882%       29 |
  |Total     |100.000   210 |100.000   210 |   .000%        0 |
  +----------+--------------+--------------+------------------+

Choice Based Sampling
  Over/underrepresenting alternatives in the data set

    Choice   Air    Train   Bus    Car
    True     0.14   0.13    0.09   0.64
    Sample   0.28   0.30    0.14   0.28

  Biases in parameter estimates
  Biases in estimated variances
  Weighted log likelihood, weight = \pi_j / F_j for all i, where \pi_j is the true proportion and F_j the sample proportion for alternative j
  Fixup of covariance matrix

    ; Choices = list of names / list of true proportions $
    ; Choices = Air,Train,Bus,Car / 0.14, 0.13, 0.09, 0.64

Choice Based Sampling Estimators

  --------+-------------------------------------------------
  Variable| Coefficient   Standard Error  b/St.Er. P[|Z|>z]
  --------+-------------------------------------------------
  Unweighted
      TTME|    -.10289***      .01109    -9.280    .0000
      INVC|    -.08044***      .01995    -4.032    .0001
      INVT|    -.01399***      .00267    -5.240    .0000
        GC|     .07578***      .01833     4.134    .0000
     A_AIR|    4.37035***     1.05734     4.133    .0000
  AIR_HIN1|     .00428         .01306      .327    .7434
   A_TRAIN|    5.91407***      .68993     8.572    .0000
  TRA_HIN2|    -.05907***      .01471    -4.016    .0001
     A_BUS|    4.46269***      .72333     6.170    .0000
  BUS_HIN3|    -.02295         .01592    -1.442    .1493
  --------+-------------------------------------------------
  Weighted
      TTME|    -.13611***      .02538    -5.363    .0000
      INVC|    -.10351***      .02470    -4.190    .0000
      INVT|    -.01772***      .00323    -5.486    .0000
        GC|     .10225***      .02107     4.853    .0000
     A_AIR|    4.52505***     1.75589     2.577    .0100
  AIR_HIN1|     .00746         .01481      .504    .6145
   A_TRAIN|    5.53229***      .97331     5.684    .0000
  TRA_HIN2|    -.06026***      .02235    -2.696    .0070
     A_BUS|    4.36579***      .97182     4.492    .0000
  BUS_HIN3|    -.01957         .01631    -1.200    .2302

Changes in Estimated Elasticities

  +---------------------------------------------------+
  | Unweighted                                        |
  | Elasticity             averaged over observations.|
  | Attribute is INVC in choice CAR                   |
  | Effects on probabilities of all choices in model: |
  | * = Direct Elasticity effect of the attribute.    |
  |                       Mean      St.Dev            |
  |    Choice=AIR        .3622       .3437            |
  |    Choice=TRAIN      .3622       .3437            |
  |    Choice=BUS        .3622       .3437            |
  | *  Choice=CAR      -1.3266      1.1731            |
  +---------------------------------------------------+
  | Weighted                                          |
  | Elasticity             averaged over observations.|
  | Attribute is INVC in choice CAR                   |
  | Effects on probabilities of all choices in model: |
  | * = Direct Elasticity effect of the attribute.    |
  |                       Mean      St.Dev            |
  |    Choice=AIR        .8371       .7363            |
  |    Choice=TRAIN      .8371       .7363            |
  |    Choice=BUS        .8371       .7363            |
  | *  Choice=CAR      -1.3362      1.4557            |
  +---------------------------------------------------+

Testing IIA vs. AIR Choice

    ? No alternative constants in the model
    NLOGIT ; Lhs = Mode
           ; Choices = Air,Train,Bus,Car
           ; Rhs = TTME,INVC,INVT,GC $

    NLOGIT ; Lhs = Mode
           ; Choices = Air,Train,Bus,Car
           ; Rhs = TTME,INVC,INVT,GC
           ; IAS = Air $

Testing IIA: Dealing with Constants
  With ASCs in the model, the covariance matrix becomes singular because the constant for AIR is always zero within the reduced sample. Do the test against the other coefficients.

    NLOGIT ; Lhs = Mode ; Choices = Air,Train,Bus,Car
           ; Rhs = TTME,INVC,INVT,GC,One $
    MATRIX ; Bair = b(1:4) ; Vair = Varb(1:4,1:4) $
    NLOGIT ; Lhs = Mode ; Choices = Air,Train,Bus,Car
           ; Rhs = TTME,INVC,INVT,GC,One
           ; IAS = Air $
    MATRIX ; BNoair = b(1:4) ; VNoair = Varb(1:4,1:4) $
    MATRIX ; Db = BNoair-BAir ; Dv = VNoair - Vair $
    MATRIX ; List ; H = Db'<Dv>Db $

  H is the Hausman statistic for the four attribute coefficients; under IIA it is asymptotically chi-squared with 4 degrees of freedom.

Lab Session 8 Part 2
Nested Logit Models
Extensions of the MNL

Using NLOGIT To Fit the Model
  Start program
  Load CLOGIT.LPJ project
  Specify trees with ; TREE = name1(alt1,alt2…), name2(alt…),…
  "Names" are optional names for branches.

Nested Logit Model

    ? Load the CLOGIT data
    ?
    ? (1) A simple nested logit model
    ?
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car
           ; Tree = Private (Air,Car) , Public (Train,Bus) $

Model Form RU1

  Twig level probability:

    \[ \text{Prob}(\text{Choice} = k \mid j) = \frac{\exp(\beta' x_{k|j})}{\sum_{m=1}^{K|j} \exp(\beta' x_{m|j})} \]

  Inclusive value for the branch:

    \[ IV(j) = \log \sum_{m=1}^{K|j} \exp(\beta' x_{m|j}) \]

  Branch probability:

    \[ \text{Prob}(\text{Branch} = j) = \frac{\exp\{\lambda_j [\gamma' y_j + IV(j)]\}}{\sum_{b=1}^{B} \exp\{\lambda_b [\gamma' y_b + IV(b)]\}} \]

  \lambda_j = 1 returns the multinomial logit model.

Moving Scaling Down to the Twig Level
  RU2 normalization (;RU2)

  Twig level probability:

    \[ P_{k|j} = \frac{\exp(\beta' x_{k|j} / \mu_j)}{\sum_{m=1}^{K|j} \exp(\beta' x_{m|j} / \mu_j)} \]

  Inclusive value for the branch:

    \[ IV(j) = \log \sum_{m=1}^{K|j} \exp(\beta' x_{m|j} / \mu_j) \]

  Branch probability:

    \[ P_j = \frac{\exp(\gamma' y_j + \mu_j IV(j))}{\sum_{b=1}^{B} \exp(\gamma' y_b + \mu_b IV(b))} \]

Normalizations
  There are different ways to normalize the variances in the nested logit model, at the lowest level, or up at the highest level.
  Use ;RU1 to normalize at the low level or ;RU2 to normalize at the branch level.

Normalizations of Nested Logit Models

    ?
    ? (2) Renormalize the nested logit model
    ?
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car
           ; Tree = Private (Air,Car) , Public (Train,Bus)
           ; RU1 $
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car
           ; Tree = Private (Air,Car) , Public (Train,Bus)
           ; RU2 $

Fixing IV Parameters
  With branches defined by ; TREE = br1(…),br2(…),…,brK(…)
  (a) Force IV parameters to be equal with
        ; IVSET: (br1,…)
      The list may contain any or all of the branch names.
  (b) Force IV parameters to equal specific values with
        ; IVSET: (br1,…) = [ the value ]

Constraining the IV Parameters

    ? (3) Force the IV parameters to be equal
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car
           ; Tree = Private (Air,Car) , Public (Train,Bus)
           ; RU2
           ; IVSET: (Private,Public) $
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car
           ; Tree = Private (Air,Car) , Public (Train,Bus)
           ; RU2
           ; IVSET: (Private,Public) = [1] $
    ? The preceding constraint produces the simple MNL model
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car $

Degenerate Branch

    ? (4) Fit the model with a degenerate branch
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car
           ; Tree = Fly (Air) , Ground (Train,Bus,Car) $
    ? (5) Study scaling differences with nested logit rather
    ?     than HEV. Make all alts their own branch. One is
    ?     normalized to 1.000.
    NLOGIT ; Lhs = Mode
           ; RHS = GC, TTME, INVT ; RH2 = ONE
           ; Choices = Air,Train,Bus,Car
           ; Tree = Fly(Air),Rail(Train),Autobus(Bus),Auto(Car)
           ; IVSET: (Fly) = [1] $

Heteroscedasticity in the MNL Model
  Add ;HET to the generic NLOGIT command. No other changes.
    NLOGIT ; Lhs = Mode
           ; Choices = Air,Train,Bus,Car
           ; Rhs = TTME,INVC,INVT,GC,One
           ; Het
           ; Effects: INVT(*) $

Heteroscedastic Extreme Value Model (1)

  -----------------------------------------------------------
  Start values obtained using MNL model
  Dependent variable               Choice
  Log likelihood function      -184.50669
  Estimation based on N =    210, K =   7
  Information Criteria: Normalization=1/N
                Normalized   Unnormalized
  AIC              1.82387      383.01339
  Fin.Smpl.AIC     1.82651      383.56784
  Bayes IC         1.93544      406.44314
  Hannan Quinn     1.86898      392.48517
  R2=1-LogL/LogL*  Log-L fncn  R-sqrd  R2Adj
  Constants only    -283.7588   .3498   .3393
  Chi-squared[ 4]          =    198.50415
  Prob [ chi squared > value ] =   .00000
  Response data are given as ind. choices
  Number of obs.=   210, skipped    0 obs
  --------+-------------------------------------------------
  Variable| Coefficient   Standard Error  b/St.Er. P[|Z|>z]
  --------+-------------------------------------------------
      TTME|    -.10365***      .01094    -9.476    .0000
      INVC|    -.08493***      .01938    -4.382    .0000
      INVT|    -.01333***      .00252    -5.297    .0000
        GC|     .06930***      .01743     3.975    .0001
     A_AIR|    5.20474***      .90521     5.750    .0000
   A_TRAIN|    4.36060***      .51067     8.539    .0000
     A_BUS|    3.76323***      .50626     7.433    .0000
  --------+-------------------------------------------------

Heteroscedastic Extreme Value Model (2)

  -----------------------------------------------------------
  Heteroskedastic Extreme Value Model
  Dependent variable                 MODE
  Log likelihood function      -182.44396
  Restricted log likelihood    -291.12182
  Chi squared [ 10 d.f.]        217.35572
  R2=1-LogL/LogL*  Log-L fncn  R-sqrd  R2Adj
  No coefficients   -291.1218   .3733   .3632
  Constants only    -283.7588   .3570   .3467
  At start values   -218.6505   .1656   .1521
  Response data are given as ind. choices
  Number of obs.=   210, skipped    0 obs
  --------+-------------------------------------------------
  Variable| Coefficient   Standard Error  b/St.Er. P[|Z|>z]
  --------+-------------------------------------------------
          |Attributes in the Utility Functions (beta)
      TTME|    -.11526**       .05721    -2.014    .0440
      INVC|    -.15516*        .07928    -1.957    .0503
      INVT|    -.02277**       .01123    -2.028    .0426
        GC|     .11904*        .06403     1.859    .0630
     A_AIR|    4.69411*       2.48092     1.892    .0585
   A_TRAIN|    5.15630**      2.05744     2.506    .0122
     A_BUS|    5.03047**      1.98259     2.537    .0112
          |Scale Parameters of Extreme Value Distns Minus 1.
     s_AIR|    -.57864***      .21992    -2.631    .0085
   s_TRAIN|    -.45879         .34971    -1.312    .1896
     s_BUS|     .26095         .94583      .276    .7826
     s_CAR|       .000    ......(Fixed Parameter)......
          |Std.Dev=pi/(theta*sqr(6)) for H.E.V. distribution
     s_AIR|    3.04385*       1.58867     1.916    .0554
   s_TRAIN|    2.36976        1.53124     1.548    .1217
     s_BUS|    1.01713         .76294     1.333    .1825
     s_CAR|    1.28255    ......(Fixed Parameter)......
  --------+-------------------------------------------------

  The scale parameters (minus 1) are the form normalized for estimation; the standard deviations are the structural parameters.

  Use to test vs. IIA assumption in MNL model? LogL0 = -184.5067, so the likelihood ratio statistic is 2(184.50669 - 182.44396) = 4.125 with 3 degrees of freedom (5% critical value 7.81). IIA would not be rejected on this basis. (Not necessarily a test of that methodological assumption.)

HEV Model - Elasticities

  +---------------------------------------------------+
  | Elasticity             averaged over observations.|
  | Attribute is INVC in choice AIR                   |
  | Effects on probabilities of all choices in model: |
  | * = Direct Elasticity effect of the attribute.    |
  |                       Mean      St.Dev            |
  | *  Choice=AIR      -4.2604      1.6745            |
  |    Choice=TRAIN     1.5828      1.9918            |
  |    Choice=BUS       3.2158      4.4589            |
  |    Choice=CAR       2.6644      4.0479            |
  | Attribute is INVC in choice TRAIN                 |
  |    Choice=AIR        .7306       .5171            |
  | *  Choice=TRAIN    -3.6725      4.2167            |
  |    Choice=BUS       2.4322      2.9464            |
  |    Choice=CAR       1.6659      1.3707            |
  | Attribute is INVC in choice BUS                   |
  |    Choice=AIR        .3698       .5522            |
  |    Choice=TRAIN      .5949      1.5410            |
  | *  Choice=BUS      -6.5309      5.0374            |
  |    Choice=CAR       2.1039      8.8085            |
  | Attribute is INVC in choice CAR                   |
  |    Choice=AIR        .3401       .3078            |
  |    Choice=TRAIN      .4681       .4794            |
  |    Choice=BUS       1.4723      1.6322            |
  | *  Choice=CAR      -3.5584      9.3057            |
  +---------------------------------------------------+

  Multinomial Logit (for comparison; rows in the same order: AIR, TRAIN, BUS, CAR)

  +---------------------------+
  | INVC in AIR               |
  |         Mean     St.Dev   |
  | *    -5.0216     2.3881   |
  |       2.2191     2.6025   |
  |       2.2191     2.6025   |
  |       2.2191     2.6025   |
  | INVC in TRAIN             |
  |       1.0066      .8801   |
  | *    -3.3536     2.4168   |
  |       1.0066      .8801   |
  |       1.0066      .8801   |
  | INVC in BUS               |
  |        .4057      .6339   |
  |        .4057      .6339   |
  | *    -2.4359     1.1237   |
  |        .4057      .6339   |
  | INVC in CAR               |
  |        .3944      .3589   |
  |        .3944      .3589   |
  |        .3944      .3589   |
  | *    -1.3888     1.2161   |
  +---------------------------+

Heterogeneous HEV Model
  Does the variance depend on household income?

    NLOGIT ; Lhs = Mode
           ; Choices = Air,Train,Bus,Car
           ; Rhs = TTME,INVC,INVT,GC,One
           ; Het ; Hfn = HINC
           ; Effects: INVT(*) $

Lab Session 9
  Multinomial Probit
  Mixed Logit (Random Parameters)
  Latent Class Models

Multinomial Probit Model
  Add ;MNP to the generic command.
  Use ;PTS=number to specify the number of points in the simulations. Use a small number (15) for demonstrations and examples. Use a large number (200+) for real estimation.
  (Don't fit this now. It takes forever to compute, and it is much less practical, and probably less useful, than other specifications.)

Multinomial Probit Model

  --------+-------------------------------------------------
  Variable| Coefficient   Standard Error  b/St.Er. P[|Z|>z]
  --------+-------------------------------------------------
          |Attributes in the Utility Functions (beta)
        GC|     .11825**       .04783     2.472    .0134
      TTME|    -.09105***      .03439    -2.647    .0081
      INVC|    -.14880***      .05495    -2.708    .0068
      INVT|    -.02300***      .00797    -2.886    .0039
     A_AIR|    2.94413*       1.59671     1.844    .0652
   A_TRAIN|    4.64736***     1.50865     3.080    .0021
     A_BUS|    4.09869***     1.29880     3.156    .0016
          |Std. Devs. of the Normal Distribution.
    s[AIR]|    3.99782**      1.59304     2.510    .0121
  s[TRAIN]|    1.63224*        .86143     1.895    .0581
    s[BUS]|    1.00000    ......(Fixed Parameter)......
    s[CAR]|    1.00000    ......(Fixed Parameter)......
          |Correlations in the Normal Distribution
  rAIR,TRA|     .31999         .53343      .600    .5486
  rAIR,BUS|     .40675         .70841      .574    .5659
  rTRA,BUS|     .37434         .41343      .905    .3652
  rAIR,CAR|       .000    ......(Fixed Parameter)......
  rTRA,CAR|       .000    ......(Fixed Parameter)......
  rBUS,CAR|       .000    ......(Fixed Parameter)......
  --------+-------------------------------------------------

MNP Elasticities

  +---------------------------------------------------+
  | Elasticity             averaged over observations.|
  | Attribute is INVT in choice AIR                   |
  | Effects on probabilities of all choices in model: |
  | * = Direct Elasticity effect of the attribute.    |
  |                       Mean      St.Dev            |
  | *  Choice=AIR      -1.0154       .4600            |
  |    Choice=TRAIN      .4773       .4052            |
  |    Choice=BUS        .6124       .4282            |
  |    Choice=CAR        .3237       .3037            |
  +---------------------------------------------------+
  | Attribute is INVT in choice TRAIN                 |
  |    Choice=AIR       1.8113      1.6718            |
  | *  Choice=TRAIN   -11.8375     10.1346            |
  |    Choice=BUS       7.9668      6.8088            |
  |    Choice=CAR       4.3257      4.4078            |
  +---------------------------------------------------+
  | Attribute is INVT in choice BUS                   |
  |    Choice=AIR        .9635      1.4635            |
  |    Choice=TRAIN     3.9555      6.7724            |
  | *  Choice=BUS     -23.3467     14.2837            |
  |    Choice=CAR       4.6840      7.8314            |
  +---------------------------------------------------+
  | Attribute is INVT in choice CAR                   |
  |    Choice=AIR       1.3324      1.4476            |
  |    Choice=TRAIN     4.5062      4.7695            |
  |    Choice=BUS       9.6001      7.6406            |
  | *  Choice=CAR     -10.8870     10.0449            |
  +---------------------------------------------------+

Data Sets for Random Parameters Modeling
  (1) clogit.lpj (as before)
  (2) brandchoicesSP.LPJ has 8 choice situations per person, 4 choices. The true underlying model is a three class latent class model.
  (3) panelprobit.lpj has 5 binary outcome situations per firm, 1270 firms. It has only firm specific data, no "choice specific" data. Suitable for random parameters probit models.
  (4) innovation.lpj has 5 "choice" situations per firm. It is the panelprobit.lpj data converted to a format amenable to the RPL program in NLOGIT. The second line of each outcome is the other outcome, "not innovate," plus zeros for the "attributes."
  (5) healthcare.lpj is a panel data set with numerous variables (DocVis, HospVis, DOCTOR, HOSPITAL, HSAT) that can be modeled with random parameters models. There are varying numbers of observations per person.
  (6) sprp.lpj is a mixed revealed/stated multinomial choice data set. There is a mixture of a variable number of choices per person as well as a choice among the elements of a master choice set.
Panel Data Formats
  In case
    (1) use ; PDS = 1
    (2) use ; PDS = 8
    (3) use ; PDS = 5
    (4) use ; PDS = 5
    (5) use ; PDS = _Groupti
    (6) use ; PDS = 4   (See discussion in Lab Session 10)

Commands for Random Parameters

    Model name ; Lhs = … ; Rhs = …
        ; … < any other specifications >
        ; RPM if not NLOGIT, or ; RPL if NLOGIT model
        ; PTS = the number of points (use 25 for our class)
        ; PDS = the panel data specification
        ; Halton (to get better results)
        ; FCN = the specification of the random parameters $

Random Parameter Specifications
  All models in LIMDEP/NLOGIT may be fit with random parameters, with panel or cross sections. NLOGIT has more options (not shown here) than the more general cases.

  Options for specifications
    ; Correlated    parameters are correlated (otherwise, independent)
    ; FCN = name ( type )
        Type is N = normal, U = uniform, L = lognormal (positive), T = tent shaped distribution, C = nonrandom (variance = 0, only in NLOGIT).
        Name is the name of a variable or parameter in the model, or A_choice for ASCs (up to 8 characters). In the CLOGIT model, they are A_AIR, A_TRAIN, A_BUS.

Replicability
  Consecutive runs of the identical model give different results. Why? Different random draws.
  To achieve replicability:
    Use ;HALTON
    Set random number generator before each run with the same value.
    CALC ; Ran( large odd number ) $

Random Parameters Models

    PROBIT ; Lhs = IP
           ; Rhs = One,IMUM,FDIUM,LogSales
           ; RPM ; Pts = 25 ; Halton ; Pds = 5
           ; Fcn = IMUM(N),FDIUM(N)
           ; Correlated $
    POISSON ; Lhs = Doctor
            ; Rhs = One,Educ,Age,Hhninc,Hhkids
            ; Fcn = Educ(N)
            ; Pds=_Groupti ; Pts=100 ; Halton ; Maxit = 25 $

  And so on…

Random Effects in Utility Functions
  Model has U(i,j,t) = β'x(i,j,t) + e(i,j,t) + w(i,j)
  w(i,j) is constant across time, correlated across utilities.

    RPLogit ; lhs=mode ; choices=air,train,bus,car
            ; rhs=gc,ttme
            ; rh2=one
            ; rpl ; maxit=50;pts=25;halton ; pds=5
            ; fcn=a_air(n),a_train(n),a_bus(n)
            ; Correlated $

Random Effects in Utility Functions
  Model has U(i,j,t) = β'x(i,j,t) + e(i,j,t) + w(i,m)
  w(i,m) is constant across time, the same for specified groups of utilities.

    ? This specifies two effects, one for private, one for public
    ECLogit ; lhs=mode ; choices=air,train,bus,car
            ; rhs=gc,ttme ; rh2=one
            ; rpl ; maxit=50;pts=25;halton ; pds=5
            ; fcn=a_air(n),a_train(n),a_bus(n)
            ; ECM= (air,car),(bus,train) $

Options for Random Parameters in NLOGIT Only
  Name ( type )           as described above.
  Name ( C )              a constant parameter. Variance = 0.
  Name (T,*)              triangular with one end at 0, the other at 2β.
  Name (type | value)     fixes the mean at value; variance is free.
  Name (type | # )        if there are variables in RPL=list, they do not apply to this parameter. Mean is constant.
  Name (type | #pattern)  as above, but pattern is used to remove only some variables in RPL=list. Pattern is 1s and 0s. E.g., if RPL=Hinc,Psize, then GC(N | #10) allows only Hinc in the mean.
  Name (type , value )    forces the standard deviation to equal value times the absolute value of β.
  Name (type,*,value)     forces the mean to equal value; variance is free; any variables in RPL=list are removed for this parameter.

Some Random Parameters Models

    ? Basic random parameters model
    Nlogit ; lhs=mode ; choices=air,train,bus,car
           ; rhs=gc,ttme,invt ; rh2=one
           ; rpl ; maxit=50 ; pts=25 ; halton ; pds=5
           ; fcn=gc(n),ttme(n),invt(n) $
    ?
    ? Random parameters model with constrained parameter.
    Nlogit ; lhs=mode ; choices=air,train,bus,car
           ; rhs=gc,ttme,invt ; rh2=one
           ; rpl ; maxit=50 ; pts=25 ; halton ; pds=5
           ; fcn=gc(t,*),ttme(n),invt(n) $
    ?
    ? Random parameters with effects to induce correlation
    Nlogit ; lhs=mode ; choices=air,train,bus,car
           ; rhs=gc,ttme,invt ; rh2=one
           ; rpl ; maxit=50 ; pts=25 ; halton ; pds=5
           ; fcn=gc(n),ttme(n),invt(n)
           ; kernel = (air,car),(bus,train) $

Constructed Parameters with Restrictions

    ? Dummy variables for PUBLIC or PRIVATE mode
    Create ; apriv = aasc + casc ; apub = tasc + basc $
    ? Model contains a "type" effect (random effect) in the
    ? utility functions. Note, no coefficients, just random variation.
    Nlogit ; lhs=mode ; choices=air,train,bus,car
           ; rhs=gc,ttme,apriv,apub ; rh2=one
           ; rpl ; maxit=50;pts=25;halton;output=3; pds=5
           ; fcn=apriv(n,*,0), apub(n,*,0) $

Using NLOGIT To Fit an LC Model
  Start program
  Load BrandChoices.lpj project
  This is the artificial shoe brand choice data.
  Specify the model with
    ; LCM
    ; PTS = number of classes
  To request class probabilities to depend on variables in the data, use
    ; LCM = the variables
  (Do not include ONE in this variables list.)

Latent Class Models
  ? Load the BrandChoicesSP.lpj data set.
  (1) Three class model.
  (The truth)

    NLOGIT ; Lhs=choice
           ; Choices=Brand1,Brand2,Brand3,None
           ; Rhs = Fash,Qual,Price,ASC4
           ; lcm ; pds=8 ; pts=3
           ; Crosstab $

  (2) Try with different numbers of classes

    NLOGIT ; Lhs=choice
           ; Choices=Brand1,Brand2,Brand3,None
           ; Rhs = Fash,Qual,Price,ASC4
           ; lcm ; pds=8 ; pts=2
           ; Crosstab $
    NLOGIT ; Lhs=choice
           ; Choices=Brand1,Brand2,Brand3,None
           ; Rhs = Fash,Qual,Price,ASC4
           ; lcm ; pds=8 ; pts=4
           ; Crosstab $

Latent Class Models
  (3) More elaborate model for class probabilities

    NLOGIT ; Lhs=choice
           ; Choices=Brand1,Brand2,Brand3,None
           ; Rhs = Fash,Qual,Price,ASC4
           ; lcm=Male,Agel25,Age2539 ; pds=8 ; pts=4
           ; Crosstab $

  (4) Compare LCM to a simpler model: Nested Logit

    NLOGIT ; Lhs=choice
           ; Choices=Brand1,Brand2,Brand3,None
           ; Rhs = Fash,Qual,Price,ASC4
           ; Tree=Shoes(brand*),NoShoes(none)
           ; ivset:(noshoes)=[1]
           ; Crosstab $

  (5) Try some other experiments

Lab Session 10
Discrete Choice
Combining RP and SP Data

Application
  Survey sample of 2,688 trips, 2 or 4 choices per situation
  Sample consists of 672 individuals
  Choice based sample
  Revealed/Stated choice experiment:
    Revealed: Drive,ShortRail,Bus,Train
    Hypothetical: Drive,ShortRail,Bus,Train,LightRail,ExpressBus
  Attributes:
    Cost: fuel or fare
    Transit time
    Parking cost
    Access and egress time

Data Set
  Load data set RPSP.LPJ
  9408 observations
  We fit separate models for the RP and SP subsets of the data, then a combined, nested model that accommodates the different scaling.
  Each person makes four choices from a choice set that includes either two or four alternatives.
  The first choice is the RP choice between two of the RP alternatives.
  The second through fourth are SP choices among four of the six SP alternatives.
  There are ten alternatives in total.

Model for Revealed Preference Data

    ? Using only Revealed Preference Data
    sample;all$
    reject;sprp=2$        ? deleting SP data
    dstats;rhs=autotime,fcost,mptrtime,mptrfare$
    NLOGIT
     ;lhs=chosen,cset,altij
     ;choices=RPDA,RPRS,RPBS,RPTN
     ;descriptives;crosstab
     ;maxit=100
     ;model:
      U(RPDA) = rdasc  + fl*fcost+tm*autotime/
      U(RPRS) = rrsasc + fl*fcost+tm*autotime/
      U(RPBS) = rbsasc + ptc*mptrfare+mt*mptrtime/
      U(RPTN) =          ptc*mptrfare+mt*mptrtime$

Model for Stated Preference Data

    ? Using only Stated Preference Data
    sample;all$
    reject;sprp=1$        ? deleting RP data
    ? BASE MODEL
    nlogit
     ;lhs=chosen,cset,alt
     ;choices=SPDA,SPRS,SPBS,SPTN,SPLR,SPBW
     ;descriptives;crosstab
     ;maxit=150
     ;model:
      U(SPDA) = dasc +cst*fueld+ tmcar*time+prk*parking
                +pincda*pincome +cavda*carav/
      U(SPRS) = rsasc+cst*fueld+ tmcar*time+prk*parking/
      U(SPBS) = bsasc+cst*fared+ tmpt*time+act*acctime+egt*eggtime/
      U(SPTN) = tnasc+cst*fared+ tmpt*time+act*acctime+egt*eggtime/
      U(SPLR) = lrasc+cst*fared+ tmpt*time+act*acctime+egt*eggtime/
      U(SPBW) =       cst*fared+ tmpt*time+act*acctime+egt*eggtime$

A Nested Logit Model for RP/SP Data

    NLOGIT
     ;lhs=chosen,cset,altij
     ;choices=RPDA,RPRS,RPBS,RPTN,SPDA,SPRS,SPBS,SPTN,SPLR,SPBW
      /.592,.208,.089,.111,1.0,1.0,1.0,1.0,1.0,1.0
     ;tree=mode[rp(RPDA,RPRS,RPBS,RPTN),spda(SPDA),
      sprs(SPRS),spbs(SPBS),sptn(SPTN),splr(SPLR),spbw(SPBW)]
     ;ivset: (rp)=[1.0];ru1
     ;maxit=150
     ;model:
      U(RPDA) = rdasc+ invc*fcost+tmrs*autotime  ?+prkda*vehprkct+
                + pinc*pincome+CAVDA*CARAV/
      U(RPRS) = rrsasc + invc*fcost+tmrs*autotime/?+
      U(RPBS) = rbsasc + invc*mptrfare+mtpt*mptrtime/?+acegt*rpacegtm/
      U(RPTN) = cstrs*mptrfare+mtpt*mptrtime/?+acegt*rpacegtm/
      U(SPDA) = sdasc + invc*fueld + tmrs*time+cavda*carav  ?+prkda*parking
                + pinc*pincome/
      U(SPRS) = srsasc + invc*fueld + tmrs*time/? cavrs*carav/
      U(SPBS) = invc*fared + mtpt*time +acegt*spacegtm/
      U(SPTN) = stnasc + invc*fared + mtpt*time+acegt*spacegtm/
      U(SPLR) = slrasc + invc*fared + mtpt*time+acegt*spacegtm/
      U(SPBW) = sbwasc + invc*fared + mtpt*time+acegt*spacegtm$

A Random Parameters Approach

    NLOGIT
     ;lhs=chosen,cset,altij
     ;choices=RPDA,RPRS,RPBS,RPTN,SPDA,SPRS,SPBS,SPTN,SPLR,SPBW
      /.592,.208,.089,.111,1.0,1.0,1.0,1.0,1.0,1.0
     ; rpl ; pds=4 ; halton ; pts=25 ; fcn=invc(n)
     ; model:
      U(RPDA) = rdasc+ invc*fcost+tmrs*autotime  ?+prkda*vehprkct+
                + pinc*pincome+CAVDA*CARAV/
      U(RPRS) = rrsasc + invc*fcost+tmrs*autotime/?+
                ?egt*autoegtm+prk*vehprkct+
      U(RPBS) = rbsasc + invc*mptrfare+mtpt*mptrtime/?+acegt*rpacegtm/
      U(RPTN) = cstrs*mptrfare+mtpt*mptrtime/?+acegt*rpacegtm/
      U(SPDA) = sdasc + invc*fueld + tmrs*time+cavda*carav  ?+prkda*parking
                + pinc*pincome/
      U(SPRS) = srsasc + invc*fueld + tmrs*time/? cavrs*carav/
      U(SPBS) = invc*fared + mtpt*time +acegt*spacegtm/
      U(SPTN) = stnasc + invc*fared + mtpt*time+acegt*spacegtm/
      U(SPLR) = slrasc + invc*fared + mtpt*time+acegt*spacegtm/
      U(SPBW) = sbwasc + invc*fared + mtpt*time+acegt*spacegtm$

Connecting Choice Situations through RPs

  --------+-------------------------------------------------
  Variable| Coefficient   Standard Error  b/St.Er. P[|Z|>z]
  --------+-------------------------------------------------
          |Random parameters in utility functions
      INVC|    -.58944***      .03922   -15.028    .0000
          |Nonrandom parameters in utility functions
     RDASC|    -.75327         .56534    -1.332    .1827
      TMRS|    -.05443***      .00789    -6.902    .0000
      PINC|     .00482         .00451     1.068    .2857
     CAVDA|     .35750***      .13103     2.728    .0064
    RRSASC|   -2.18901***      .54995    -3.980    .0001
    RBSASC|   -1.90658***      .53953    -3.534    .0004
      MTPT|    -.04884***      .00741    -6.591    .0000
     CSTRS|   -1.57564***      .23695    -6.650    .0000
     SDASC|    -.13612         .27616     -.493    .6221
    SRSASC|    -.10172         .18943     -.537    .5913
     ACEGT|    -.02943***      .00384    -7.663    .0000
    STNASC|     .13402         .11475     1.168    .2428
    SLRASC|     .27250**       .11017     2.473    .0134
    SBWASC|    -.00685         .09861     -.070    .9446
          |Distns. of RPs. Std.Devs or limits of triangular
    NsINVC|     .45285***      .05615     8.064    .0000
  --------+-------------------------------------------------
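
One way to read the random parameter estimates: with ; fcn=invc(n) the INVC coefficient is normally distributed across individuals, with the mean reported in the INVC row and the standard deviation in the NsINVC row. The short Python sketch below computes the share of that normal distribution lying above zero, that is, the implied share of individuals with a positive (wrong signed) cost coefficient. It is an illustration based on the point estimates only and ignores their sampling error; the function and variable names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean_invc = -0.58944  # INVC row: mean of the random coefficient
sd_invc = 0.45285     # NsINVC row: standard deviation of the random coefficient

# P(beta > 0) when beta ~ Normal(mean_invc, sd_invc^2)
share_positive = 1.0 - normal_cdf((0.0 - mean_invc) / sd_invc)
print(f"Implied share with a positive INVC coefficient: {share_positive:.3f}")
```

An unbounded normal distribution always places some mass on the wrong sign. The constrained triangular specification (t,*) listed among the NLOGIT options is one way to rule that out.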