Sunday October 15 10:10:23 2023   Page 1

      name:  <unnamed>
       log:  C:\Users\egevrek\Dropbox\PORTO-econometrics2\Workshops_2023\WS3\Estimation_Results.smcl
  log type:  smcl
 opened on:   1 Oct 2023, 17:57:09

. import excel "C:\Users\egevrek\Dropbox\PORTO-econometrics2\Workshops_2023\WS3\Mroz.xls", sheet("Sheet1") firstrow
(7 vars, 753 obs)

. do "C:\Users\egevrek\AppData\Local\Temp\STDb8b0_000000.tmp"

. **Part 1. Estimate the linear probability model

. ** To estimate the linear probability model (LPM) by OLS, we use "reg" command.

. reg inlf hincome educ exper age kidslt6 kidsge6

      Source |       SS           df       MS      Number of obs   =       753
-------------+----------------------------------   F(6, 746)       =     42.32
       Model |  46.9082357         6  7.81803929   Prob > F        =    0.0000
    Residual |   137.81952       746  .184744665   R-squared       =    0.2539
-------------+----------------------------------   Adj R-squared   =    0.2479
       Total |  184.727756       752  .245648611   Root MSE        =    .42982

------------------------------------------------------------------------------
        inlf | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0033265   .0014574    -2.28   0.023    -.0061876   -.0004654
        educ |   .0398189   .0074006     5.38   0.000     .0252905    .0543474
       exper |   .0225725   .0021786    10.36   0.000     .0182956    .0268493
         age |   -.017712   .0024487    -7.23   0.000    -.0225191   -.0129049
     kidslt6 |  -.2718291   .0335715    -8.10   0.000    -.3377348   -.2059233
     kidsge6 |   .0125301   .0132781     0.94   0.346    -.0135368     .038597
       _cons |   .7072318   .1504335     4.70   0.000     .4119083    1.002555
------------------------------------------------------------------------------

. ** Since the error terms are heteroskedastic in the linear probability model (LPM), we should obtain heteroskedasticity-robust standard errors.

. **To obtain heteroskedasticity-robust standard errors, we use "reg" command with "robust" option.

. reg inlf hincome educ exper age kidslt6 kidsge6, robust

Linear regression                               Number of obs     =        753
                                                F(6, 746)         =      65.55
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2539
                                                Root MSE          =     .42982

------------------------------------------------------------------------------
             |               Robust
        inlf | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0033265   .0015099    -2.20   0.028    -.0062907   -.0003623
        educ |   .0398189   .0072073     5.52   0.000     .0256698     .053968
       exper |   .0225725   .0021001    10.75   0.000     .0184497    .0266952
         age |   -.017712    .002273    -7.79   0.000    -.0221743   -.0132497
     kidslt6 |  -.2718291   .0314266    -8.65   0.000    -.3335241    -.210134
     kidsge6 |   .0125301   .0136017     0.92   0.357     -.014172    .0392322
       _cons |   .7072318   .1459241     4.85   0.000      .420761    .9937025
------------------------------------------------------------------------------

. ** All the explanatory variables except for "kidsge6" have a statistically significant impact on the probability of being in the labor force (at the 5 percent significance level).

. **Part 2. Use "predict" command (with "xb" option) to obtain the estimated probabilities.
. ** Stata creates a new variable named "probability" whose values are the estimated probabilities. Note that you can use a different name for the new variable (i.e., predict "name of the new variable", xb)

. predict probability, xb

. ** For example, if you want to name the new variable "fittedvalues", then type: predict fittedvalues, xb

. ** "summarize" command (or shortcut "sum" command) provides descriptive statistics about a variable.

. **Use "sum" command to find out the number of the estimated probabilities that are outside the unit interval.

. sum probability if probability>1 | probability<0

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
 probability |         34    .6666093    .5640637  -.3042662   1.138675

. sum probability if probability>1

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
 probability |         23     1.04874    .0357447   1.003759   1.138675

. sum probability if probability<0

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
 probability |         11   -.1323903    .0951249  -.3042662  -.0003745

. ** Note that 11 of the estimated probabilities are less than zero and 23 of the estimated probabilities are greater than one.

. **Part 4. Use "probit" command to estimate the probit model.
. probit inlf hincome educ exper age kidslt6 kidsge6

Iteration 0:  Log likelihood =  -514.8732
Iteration 1:  Log likelihood = -407.11545
Iteration 2:  Log likelihood = -406.21971
Iteration 3:  Log likelihood = -406.21886
Iteration 4:  Log likelihood = -406.21886

Probit regression                                       Number of obs =    753
                                                        LR chi2(6)    = 217.31
                                                        Prob > chi2   = 0.0000
Log likelihood = -406.21886                             Pseudo R2     = 0.2110

------------------------------------------------------------------------------
        inlf | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0115648   .0047942    -2.41   0.016    -.0209613   -.0021684
        educ |   .1336902   .0251346     5.32   0.000     .0844273    .1829531
       exper |   .0702165    .007571     9.27   0.000     .0553775    .0850555
         age |  -.0555548   .0083447    -6.66   0.000    -.0719101   -.0391995
     kidslt6 |  -.8742923   .1175098    -7.44   0.000    -1.104607   -.6439773
     kidsge6 |   .0345459   .0429862     0.80   0.422    -.0497055    .1187974
       _cons |   .5795817    .496205     1.17   0.243    -.3929623    1.552126
------------------------------------------------------------------------------

. **Part 4 (a). The marginal effect of education on the probability of being in the labor force for a woman who is 30 years old, has 14 years of education, 7 years of labor market experience, a husband with an annual income of $20,000, and 2 children who are older than 9 years old?

. ** Use "margins" command with "at" option.

. margins, dydx(educ) at(hincome=20 educ=14 exper=7 age=30 kidslt6=0 kidsge6=2)

Conditional marginal effects                               Number of obs = 753
Model VCE: OIM

Expression: Pr(inlf), predict()
dy/dx wrt:  educ
At: hincome = 20
    educ    = 14
    exper   =  7
    age     = 30
    kidslt6 =  0
    kidsge6 =  2

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0286796   .0055782     5.14   0.000     .0177466    .0396126
------------------------------------------------------------------------------

. **Part 4 (b). The average marginal effect of education on the probability of being in the labor force
. ** Use "margins" command.

. margins, dydx(educ)

Average marginal effects                                   Number of obs = 753
Model VCE: OIM

Expression: Pr(inlf), predict()
dy/dx wrt:  educ

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0408301   .0072785     5.61   0.000     .0265645    .0550958
------------------------------------------------------------------------------

. **Part 4 (c). The marginal effect of education on the probability of being in the labor force for an average individual.
. ** Use "margins" command with "atmeans" option.

. margins, dydx(educ) atmeans

Conditional marginal effects                               Number of obs = 753
Model VCE: OIM

Expression: Pr(inlf), predict()
dy/dx wrt:  educ
At: hincome = 20.12896 (mean)
    educ    = 12.28685 (mean)
    exper   = 10.63081 (mean)
    age     = 42.53785 (mean)
    kidslt6 = .2377158 (mean)
    kidsge6 = 1.353254 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
        educ |   .0521537   .0098013     5.32   0.000     .0329436    .0713638
------------------------------------------------------------------------------

. **Part 5. Use "logit" command to estimate the logit model.

. logit inlf hincome educ exper age kidslt6 kidsge6

Iteration 0:  Log likelihood =  -514.8732
Iteration 1:  Log likelihood = -406.91038
Iteration 2:  Log likelihood = -406.14404
Iteration 3:  Log likelihood = -406.14318
Iteration 4:  Log likelihood = -406.14318

Logistic regression                                     Number of obs =    753
                                                        LR chi2(6)    = 217.46
                                                        Prob > chi2   = 0.0000
Log likelihood = -406.14318                             Pseudo R2     = 0.2112

------------------------------------------------------------------------------
        inlf | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0202165   .0082637    -2.45   0.014     -.036413   -.0040199
        educ |   .2269766   .0432954     5.24   0.000     .1421191    .3118341
       exper |   .1197458   .0136264     8.79   0.000     .0930385     .146453
         age |  -.0910884   .0143207    -6.36   0.000    -.1191564   -.0630204
     kidslt6 |  -1.439393   .2014989    -7.14   0.000    -1.834324   -1.044462
     kidsge6 |   .0581735     .07338     0.79   0.428    -.0856487    .2019957
       _cons |   .8379089   .8409368     1.00   0.319     -.810297    2.486115
------------------------------------------------------------------------------

. **Part 5 (a). The marginal effect of labor market experience on the probability of being in the labor force for a woman who is 30 years old, has 14 years of education, 7 years of labor market experience, a husband with an annual income of $20,000, and 2 children who are older than 9 years old?

. ** Use "margins" command with "at" option.
. margins, dydx(exper) at(hincome=20 educ=14 exper=7 age=30 kidslt6=0 kidsge6=2)

Conditional marginal effects                               Number of obs = 753
Model VCE: OIM

Expression: Pr(inlf), predict()
dy/dx wrt:  exper
At: hincome = 20
    educ    = 14
    exper   =  7
    age     = 30
    kidslt6 =  0
    kidsge6 =  2

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       exper |   .0142325   .0026503     5.37   0.000     .0090379     .019427
------------------------------------------------------------------------------

. **Part 5 (b). The average marginal effect of labor market experience on the probability of being in the labor force
. ** Use "margins" command.

. margins, dydx(exper)

Average marginal effects                                   Number of obs = 753
Model VCE: OIM

Expression: Pr(inlf), predict()
dy/dx wrt:  exper

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       exper |   .0216992   .0019897    10.91   0.000     .0177994     .025599
------------------------------------------------------------------------------

. **Part 5 (c). The marginal effect of labor market experience on the probability of being in the labor force for an average individual.
. ** Use "margins" command with "atmeans" option.

. margins, dydx(exper) atmeans

Conditional marginal effects                               Number of obs = 753
Model VCE: OIM

Expression: Pr(inlf), predict()
dy/dx wrt:  exper
At: hincome = 20.12896 (mean)
    educ    = 12.28685 (mean)
    exper   = 10.63081 (mean)
    age     = 42.53785 (mean)
    kidslt6 = .2377158 (mean)
    kidsge6 = 1.353254 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
       exper |   .0290145    .003267     8.88   0.000     .0226113    .0354177
------------------------------------------------------------------------------

. **Part 6. Estimate the logit model using "logit" command.

. logit inlf hincome educ exper age kidslt6 kidsge6

Iteration 0:  Log likelihood =  -514.8732
Iteration 1:  Log likelihood = -406.91038
Iteration 2:  Log likelihood = -406.14404
Iteration 3:  Log likelihood = -406.14318
Iteration 4:  Log likelihood = -406.14318

Logistic regression                                     Number of obs =    753
                                                        LR chi2(6)    = 217.46
                                                        Prob > chi2   = 0.0000
Log likelihood = -406.14318                             Pseudo R2     = 0.2112

------------------------------------------------------------------------------
        inlf | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0202165   .0082637    -2.45   0.014     -.036413   -.0040199
        educ |   .2269766   .0432954     5.24   0.000     .1421191    .3118341
       exper |   .1197458   .0136264     8.79   0.000     .0930385     .146453
         age |  -.0910884   .0143207    -6.36   0.000    -.1191564   -.0630204
     kidslt6 |  -1.439393   .2014989    -7.14   0.000    -1.834324   -1.044462
     kidsge6 |   .0581735     .07338     0.79   0.428    -.0856487    .2019957
       _cons |   .8379089   .8409368     1.00   0.319     -.810297    2.486115
------------------------------------------------------------------------------

. **Wald Test:

. **Use "test" command to perform Wald Test.

. test educ age

 ( 1)  [inlf]educ = 0
 ( 2)  [inlf]age = 0

           chi2(  2) =   72.45
         Prob > chi2 =    0.0000

. **Likelihood Ratio (LR) Test:

. ** Estimate the unrestricted model using "logit" command and store the estimation results obtained from the unrestricted model using "estimates store" command.

. logit inlf hincome educ exper age kidslt6 kidsge6

Iteration 0:  Log likelihood =  -514.8732
Iteration 1:  Log likelihood = -406.91038
Iteration 2:  Log likelihood = -406.14404
Iteration 3:  Log likelihood = -406.14318
Iteration 4:  Log likelihood = -406.14318

Logistic regression                                     Number of obs =    753
                                                        LR chi2(6)    = 217.46
                                                        Prob > chi2   = 0.0000
Log likelihood = -406.14318                             Pseudo R2     = 0.2112

------------------------------------------------------------------------------
        inlf | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0202165   .0082637    -2.45   0.014     -.036413   -.0040199
        educ |   .2269766   .0432954     5.24   0.000     .1421191    .3118341
       exper |   .1197458   .0136264     8.79   0.000     .0930385     .146453
         age |  -.0910884   .0143207    -6.36   0.000    -.1191564   -.0630204
     kidslt6 |  -1.439393   .2014989    -7.14   0.000    -1.834324   -1.044462
     kidsge6 |   .0581735     .07338     0.79   0.428    -.0856487    .2019957
       _cons |   .8379089   .8409368     1.00   0.319     -.810297    2.486115
------------------------------------------------------------------------------

. estimates store unrestricted

. ** Estimate the restricted model using "logit" command and store the estimation results obtained from the restricted model using "estimates store" command.
. logit inlf hincome exper kidslt6 kidsge6

Iteration 0:  Log likelihood =  -514.8732
Iteration 1:  Log likelihood = -450.95875
Iteration 2:  Log likelihood = -450.24251
Iteration 3:  Log likelihood = -450.24055
Iteration 4:  Log likelihood = -450.24055

Logistic regression                                     Number of obs =    753
                                                        LR chi2(4)    = 129.27
                                                        Prob > chi2   = 0.0000
Log likelihood = -450.24055                             Pseudo R2     = 0.1255

------------------------------------------------------------------------------
        inlf | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0122709   .0070886    -1.73   0.083    -.0261643    .0016225
       exper |   .1053723   .0126944     8.30   0.000     .0804918    .1302527
     kidslt6 |  -.6853735   .1619961    -4.23   0.000     -1.00288    -.367867
     kidsge6 |   .1930941   .0643118     3.00   0.003     .0670453    .3191428
       _cons |  -.6222555     .24754    -2.51   0.012    -1.107425    -.137086
------------------------------------------------------------------------------

. estimates store restricted

. ** Note that the command "estimates store" allows us to store the estimation results from the unrestricted and restricted model.

. ** Use "lrtest" command to perform LR Test.

. lrtest unrestricted restricted

Likelihood-ratio test
Assumption: restricted nested within unrestricted

 LR chi2(2) =  88.19
Prob > chi2 = 0.0000

. **Part 7. The percent correctly predicted in the probit model

. probit inlf hincome educ exper age kidslt6 kidsge6

Iteration 0:  Log likelihood =  -514.8732
Iteration 1:  Log likelihood = -407.11545
Iteration 2:  Log likelihood = -406.21971
Iteration 3:  Log likelihood = -406.21886
Iteration 4:  Log likelihood = -406.21886

Probit regression                                       Number of obs =    753
                                                        LR chi2(6)    = 217.31
                                                        Prob > chi2   = 0.0000
Log likelihood = -406.21886                             Pseudo R2     = 0.2110

------------------------------------------------------------------------------
        inlf | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
     hincome |  -.0115648   .0047942    -2.41   0.016    -.0209613   -.0021684
        educ |   .1336902   .0251346     5.32   0.000     .0844273    .1829531
       exper |   .0702165    .007571     9.27   0.000     .0553775    .0850555
         age |  -.0555548   .0083447    -6.66   0.000    -.0719101   -.0391995
     kidslt6 |  -.8742923   .1175098    -7.44   0.000    -1.104607   -.6439773
     kidsge6 |   .0345459   .0429862     0.80   0.422    -.0497055    .1187974
       _cons |   .5795817    .496205     1.17   0.243    -.3929623    1.552126
------------------------------------------------------------------------------

. **Use "predict" command to calculate the estimated probability of being in the labor force for each observation.

. predict phat
(option pr assumed; Pr(inlf))

. ** Create a new variable named "p" using "generate" command (or shortcut "gen" command)
. ** "p" is a dummy variable that takes the value of 1 if the estimated probability of being in the labor force is greater than 0.5 and takes the value of 0 otherwise.
. ** The variable "p" is called predicted outcome.

. gen p=(phat>0.5)

. *** Use "tabulate" command (or shortcut "tab" command) to examine the relationship between the variable "p" (i.e., predicted outcome) and the variable "inlf" (i.e., actual outcome)

. tab p inlf

           |         inlf
         p |         0          1 |     Total
-----------+----------------------+----------
         0 |       213         80 |       293
         1 |       112        348 |       460
-----------+----------------------+----------
     Total |       325        428 |       753

. ** The overall percent correctly predicted is 74.50, i.e., ((213+348)/753)*100

. **The percent correctly predicted for inlf=0 is 65.54

. display (213/325)*100
65.538462

. **The percent correctly predicted for inlf=1 is 81.31

. display (348/428)*100
81.308411

.
end of do-file

. log close
      name:  <unnamed>
       log:  C:\Users\egevrek\Dropbox\PORTO-econometrics2\Workshops_2023\WS3\Estimation_Results.smcl
  log type:  smcl
 closed on:   1 Oct 2023, 17:58:21
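
** The conditional marginal effects in Part 4 (a) and Part 5 (a) and the LR statistic can be checked by hand. The sketch below is not part of the original do-file. It plugs the coefficients printed in the probit and logit tables into the standard formulas: normal density at the index times the coefficient for probit, Lambda*(1-Lambda) times the coefficient for logit, and twice the difference in log likelihoods for the LR test. The scalar names xb_probit, xb_logit and lr are made up for this illustration.

```stata
* Illustrative hand check only; scalar names are hypothetical and all numbers
* are copied from the estimation tables in this log.
* Run from a do-file: the /// line continuation is not available interactively.

* Part 4 (a): probit index at hincome=20 educ=14 exper=7 age=30 kidslt6=0 kidsge6=2
scalar xb_probit = .5795817 - .0115648*20 + .1336902*14 + .0702165*7 ///
    - .0555548*30 - .8742923*0 + .0345459*2
* Marginal effect of educ = standard normal density at the index times the educ coefficient
display normalden(scalar(xb_probit)) * .1336902

* Part 5 (a): logit index at the same covariate values
scalar xb_logit = .8379089 - .0202165*20 + .2269766*14 + .1197458*7 ///
    - .0910884*30 - 1.439393*0 + .0581735*2
* Marginal effect of exper = Lambda*(1 - Lambda) times the exper coefficient
display invlogit(scalar(xb_logit)) * (1 - invlogit(scalar(xb_logit))) * .1197458

* LR test: twice the gap between the unrestricted and restricted log likelihoods
scalar lr = 2 * (-406.14318 - (-450.24055))
display scalar(lr)
* p-value from a chi-squared distribution with 2 degrees of freedom (educ and age dropped)
display chi2tail(2, scalar(lr))
```

** Up to rounding, the first two displayed values should line up with the dy/dx of about .0287 for educ in Part 4 (a) and about .0142 for exper in Part 5 (a), and the third with the LR chi2(2) of 88.19 reported by "lrtest".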