JPL Page 1 08/03/2016

M2R International Economics, Development, Transition
Academic year 2010-2011

APPLIED ECONOMETRICS EXAM

Question 1

. use "C:\Documents and Settings\Administrador\Mis documentos\International Economics I\Econometrics\Laffargue\exam\mus06data.dta", clear

. describe ldrugexp hi_empunion totchr age female blhisp linc

              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
ldrugexp        float  %9.0g                  log(drugexp)
hi_empunion     byte   %8.0g                  Insured thro emp/union
totchr          byte   %8.0g                  Total chronic cond
age             byte   %8.0g                  Age
female          byte   %8.0g                  Female
blhisp          float  %9.0g                  Black or Hispanic
linc            float  %9.0g                  log(income)

. sum ldrugexp hi_empunion totchr age female blhisp linc

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    ldrugexp |     10391    6.479668    1.363395          0   10.18017
 hi_empunion |     10391    .3796555    .4853245          0          1
      totchr |     10391    1.860745    1.290131          0          9
         age |     10391    75.04639     6.69368         65         91
      female |     10391    .5797325    .4936256          0          1
-------------+--------------------------------------------------------
      blhisp |     10391    .1703397    .3759491          0          1
        linc |     10089    2.743275    .9131433  -6.907755   5.744476

The variable linc has missing observations: only 10,089 of the 10,391 observations are non-missing. The average age in the sample is about 75 years, and about 38% of individuals have supplementary insurance through an employer or union. About 58% are female, and about 17% are Black or Hispanic.

Question 2

To check for missing observations we use the codebook command, and then drop the missing ones:

. codebook linc

------------------------------------------------------------------------------
linc                                                               log(income)
------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [-6.9077554,5.7444763]       units:  1.000e-09
         unique values:  6914                     missing .:  302/10391

                  mean:   2.74328
              std. dev:   .913143

           percentiles:        10%       25%       50%       75%       90%
                           1.79176    2.2327   2.74316   3.31506   3.79928

. drop if linc==.
(302 observations deleted)

. des ssiratio lowincome firmsz multlc

              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------------
ssiratio        float  %9.0g                  SSI/Income ratio
lowincome       byte   %8.0g                  Low income
firmsz          float  %9.0g                  Firm size
multlc          byte   %8.0g                  Multiple locations

. sum ssiratio lowincome firmsz multlc

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    ssiratio |     10089    .5365438    .3678175          0    9.25062
   lowincome |     10089    .1874319    .3902771          0          1
      firmsz |     10089    .1405293    2.170389          0         50
      multlc |     10089    .0620478    .2412543          0          1

The mean of lowincome is about 0.19, so low-income status applies to a fairly small share of the observations. The average ssiratio is about 0.54, which suggests that the income constraint is not very tight on average. The firms where individuals are employed are on average rather small (mean firmsz of 0.14, with a maximum of 50), and few operate in multiple locations (about 6%).

Question 3

. ivreg2 ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio), first robust
First-stage regressions
-----------------------

First-stage regression of hi_empunion:

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   119.18
                                                      Prob > F      =   0.0000
Total (centered) SS     =  2382.242839                Centered R2   =   0.0761
Total (uncentered) SS   =         3856                Uncentered R2 =   0.4292
Residual SS             =  2201.062524                Root MSE      =    .4672

------------------------------------------------------------------------------
             |               Robust
 hi_empunion |      Coef.  Std. Err.        t   P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      totchr |   .0127865   .0036655     3.49   0.000     .0056015    .0199716
         age |  -.0086323   .0007087   -12.18   0.000    -.0100216   -.0072431
      female |    -.07345   .0096392    -7.62   0.000    -.0923448   -.0545552
      blhisp |    -.06268   .0122742    -5.11   0.000      -.08674   -.0386201
        linc |   .0483937   .0066075     7.32   0.000     .0354417    .0613456
    ssiratio |  -.1916432   .0236326    -8.11   0.000    -.2379678   -.1453186
       _cons |   1.028981   .0581387    17.70   0.000     .9150172    1.142944
------------------------------------------------------------------------------
Included instruments: totchr age female blhisp linc ssiratio
------------------------------------------------------------------------------
F test of excluded instruments:
  F(  1, 10082) =    65.76
  Prob > F      =   0.0000
Angrist-Pischke multivariate F test of excluded instruments:
  F(  1, 10082) =    65.76
  Prob > F      =   0.0000

Summary results for first-stage regressions
-------------------------------------------

                                          (Underid)             (Weak id)
Variable     | F(  1, 10082)  P-val | AP Chi-sq(  1)  P-val | AP F(  1, 10082)
hi_empunion  |       65.76   0.0000 |        65.81   0.0000 |        65.76

NB: first-stage test statistics heteroskedasticity-robust

Stock-Yogo weak ID test critical values for single endogenous regressor:
                                    10% maximal IV size             16.38
                                    15% maximal IV size              8.96
                                    20% maximal IV size              6.66
                                    25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic          Chi-sq(1)=138.02   P-val=0.0000

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                     183.98
Kleibergen-Paap Wald rk F statistic                                65.76

Stock-Yogo weak ID test critical values for K1=1 and L1=1:
                                    10% maximal IV size             16.38
                                    15% maximal IV size              8.96
                                    20% maximal IV size              6.66
                                    25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(1,10082)=     22.12    P-val=0.0000
Anderson-Rubin Wald test           Chi-sq(1)=      22.13    P-val=0.0000
Stock-Wright LM S statistic        Chi-sq(1)=      20.71    P-val=0.0000

NB: Underidentification, weak identification and weak-identification-robust
    test statistics heteroskedasticity-robust

Number of observations               N  =      10089
Number of regressors                 K  =          7
Number of endogenous regressors      K1 =          1
Number of instruments                L  =          7
Number of excluded instruments       L1 =          1

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   333.25
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.0640
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9604
Residual SS             =  17518.21658                Root MSE      =    1.318

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 hi_empunion |  -.8975913   .2211268    -4.06   0.000    -1.330992   -.4641908
      totchr |   .4502655   .0101969    44.16   0.000       .43028     .470251
         age |  -.0132176   .0029977    -4.41   0.000    -.0190931   -.0073421
      female |   -.020406   .0326114    -0.63   0.531    -.0843232    .0435113
      blhisp |  -.2174244   .0394944    -5.51   0.000     -.294832   -.1400167
        linc |   .0870018   .0226356     3.84   0.000     .0426368    .1313668
       _cons |    6.78717   .2688453    25.25   0.000     6.260243    7.314097
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):            138.015
                                                   Chi-sq(1) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              183.980
                         (Kleibergen-Paap rk Wald F statistic):         65.760
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.000
                                                 (equation exactly identified)
------------------------------------------------------------------------------
Instrumented:         hi_empunion
Included instruments: totchr age female blhisp linc
Excluded instruments: ssiratio
------------------------------------------------------------------------------

First-stage results: the effect of the instrument ssiratio on hi_empunion is negative, as expected, and statistically significant at the 1% level.

Second stage: having supplementary insurance lowers expenditure on prescribed medication by about 0.90 log points. Because the dependent variable is in logs and hi_empunion is a dummy, this corresponds to a fall of roughly 59% (exp(-0.8976) - 1 = -0.59) rather than 89.76%, which is still a very large effect.

Question 4

. quietly ivreg ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio multlc)
. estimates store iv
. quietly ivreg ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio multlc), robust
. estimates store ivrobust
. quietly ivreg2 ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio multlc), gmm
. estimates store GMM
. estimates table iv ivrobust GMM, stat(se r2_a rmse) star

--------------------------------------------------------------
    Variable |       iv           ivrobust          GMM
-------------+------------------------------------------------
 hi_empunion | -.98992691***   -.98992691***   -.99327949***
      totchr |  .45120505***    .45120505***    .45095079***
         age | -.01413842***   -.01413842***   -.01415093***
      female | -.02783978      -.02783978      -.02817157
      blhisp | -.22370865***   -.22370865***   -.22310484***
        linc |  .09427483***    .09427483***    .09446321***
       _cons |  6.8751877***    6.8751877***    6.8778206***
-------------+------------------------------------------------
          se |
        r2_a |  .04087781       .04087781       .04002154
        rmse |  1.3339228       1.3339228       1.3340551
--------------------------------------------------------------
                      legend: * p<0.05; ** p<0.01; *** p<0.001

All regressors are significant except the female indicator, which suggests that gender has no significant effect on drug expenditure. "Number of chronic conditions" and "log of income" both have a positive effect, but the first is much larger than the second (the more chronic conditions you have, the more you have to spend on medication, and the higher your income, the more you can afford it). Expenditure is lower for Black or Hispanic individuals (perhaps because they have lower income and can afford less medical care) and for those with supplementary insurance (which is natural if the insurance covers the expenses). Expenditure also appears to decrease with age, which is somewhat surprising: the older you get, the more likely you are to fall ill and to spend more on medication.
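Since ldrugexp is in logs and hi_empunion is a 0/1 indicator, its coefficient is a change in log points rather than an exact percentage. The following is a minimal sketch of the conversion, assuming the user-written ivreg2 package is installed, the exam data are in memory, and the commands are run from a do-file (the /// continuation is not accepted at the interactive prompt). The label pct is a hypothetical name introduced here for illustration.

```stata
* Sketch only: refit the Question 4 model with robust standard errors,
* then convert the log-point coefficient on the dummy into a percentage.
quietly ivreg2 ldrugexp totchr age female blhisp linc ///
    (hi_empunion = ssiratio multlc), robust

* Exact percentage effect of a 0/1 regressor in a log-linear model
display 100*(exp(_b[hi_empunion]) - 1)

* Same quantity with a delta-method standard error ("pct" is only a label)
nlcom (pct: 100*(exp(_b[hi_empunion]) - 1))
```

With a coefficient of about -0.99, the implied change is roughly -63%, noticeably smaller in magnitude than the -99% that a direct reading of the coefficient would suggest.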
The iv and ivrobust coefficients are identical, since the robust option changes only the standard errors, and they differ only slightly from the GMM estimates.

Question 5

. ivreg2 ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio multlc), gmm
-gmm- is no longer a supported option; use -gmm2s- with the appropriate option
      gmm              = gmm2s robust
      gmm robust       = gmm2s robust
      gmm bw()         = gmm2s bw()
      gmm robust bw()  = gmm2s robust bw()
      gmm cluster()    = gmm2s cluster()

2-Step GMM estimation
---------------------

Estimates efficient for arbitrary heteroskedasticity
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   325.50
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.0406
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9594
Residual SS             =  17955.42285                Root MSE      =    1.334

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 hi_empunion |  -.9932795   .2045645    -4.86   0.000    -1.394219   -.5923405
      totchr |   .4509508   .0103058    43.76   0.000     .4307517    .4711498
         age |  -.0141509      .0029    -4.88   0.000    -.0198347   -.0084671
      female |  -.0281716   .0321727    -0.88   0.381    -.0912288    .0348857
      blhisp |  -.2231048   .0395804    -5.64   0.000     -.300681   -.1455287
        linc |   .0944632   .0218833     4.32   0.000     .0515727    .1373537
       _cons |   6.877821   .2578727    26.67   0.000     6.372399    7.383242
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):            170.738
                                                   Chi-sq(2) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              110.613
                         (Kleibergen-Paap rk Wald F statistic):         58.612
Stock-Yogo weak ID test critical values: 10% maximal IV size             19.93
                                         15% maximal IV size             11.59
                                         20% maximal IV size              8.75
                                         25% maximal IV size              7.25
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         1.048
                                                   Chi-sq(1) P-val =    0.3061
------------------------------------------------------------------------------
Instrumented:         hi_empunion
Included instruments: totchr age female blhisp linc
Excluded instruments: ssiratio multlc
------------------------------------------------------------------------------

The p-value of the Hansen J test (0.3061) is large, so we cannot reject the null hypothesis that the overidentifying restrictions are valid. We therefore do not reject the instruments.

Including the four instruments:

. ivreg2 ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio multlc lowincome firmsz), gmm
-gmm- is no longer a supported option; use -gmm2s- with the appropriate option
      gmm              = gmm2s robust
      gmm robust       = gmm2s robust
      gmm bw()         = gmm2s bw()
      gmm robust bw()  = gmm2s robust bw()
      gmm cluster()    = gmm2s cluster()

2-Step GMM estimation
---------------------

Estimates efficient for arbitrary heteroskedasticity
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   335.98
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.0829
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9612
Residual SS             =  17163.61371                Root MSE      =    1.304

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 hi_empunion |  -.8124043   .1861018    -4.37   0.000    -1.177157   -.4476515
      totchr |    .449488     .01011    44.46   0.000     .4296728    .4693033
         age |  -.0124598   .0027643    -4.51   0.000    -.0178777    -.007042
      female |  -.0104528   .0308857    -0.34   0.735    -.0709876     .050082
      blhisp |  -.2061018   .0385144    -5.35   0.000    -.2815886    -.130615
        linc |   .0796532   .0205381     3.88   0.000     .0393992    .1199073
       _cons |     6.7126   .2441439    27.49   0.000     6.234086    7.191113
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):            200.657
                                                   Chi-sq(4) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               62.749
                         (Kleibergen-Paap rk Wald F statistic):         44.823
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
                                         10% maximal IV relative bias    10.27
                                         20% maximal IV relative bias     6.71
                                         30% maximal IV relative bias     5.34
                                         10% maximal IV size             24.58
                                         15% maximal IV size             13.96
                                         20% maximal IV size             10.26
                                         25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):        11.590
                                                   Chi-sq(3) P-val =    0.0089
------------------------------------------------------------------------------
Instrumented:         hi_empunion
Included instruments: totchr age female blhisp linc
Excluded instruments: ssiratio multlc lowincome firmsz
------------------------------------------------------------------------------

With the two additional instruments, the p-value of the Hansen J test becomes very small (0.0089), so we reject the null hypothesis that all instruments are valid. We should not keep the new instruments, because the test suggests that at least one of them is correlated with the error term.
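The joint Hansen J test does not say which of the added instruments is at fault. One way to probe this, sketched here under the assumption that the user-written ivreg2 package is installed and the commands are run from a do-file, is the C (difference-in-Hansen) statistic requested through the orthog() option, which the exam itself uses in Question 6. The sketch also uses gmm2s robust, the replacement that the warning above gives for the deprecated gmm option.

```stata
* Sketch only: test the orthogonality of the two added instruments
* (lowincome, firmsz) while maintaining ssiratio and multlc as valid.
ivreg2 ldrugexp totchr age female blhisp linc ///
    (hi_empunion = ssiratio multlc lowincome firmsz), ///
    gmm2s robust orthog(lowincome firmsz)

* p-values of the two Hansen J statistics reported in the text,
* recomputed from the chi-squared upper tail: chi2tail(df, statistic)
display chi2tail(1, 1.048)
display chi2tail(3, 11.590)
```

A small p-value for the C statistic would point to lowincome and firmsz specifically, rather than to the instrument set as a whole.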
Question 6

. quietly reg ldrugexp totchr age female blhisp linc hi_empunion, robust
. estimates store OLSrobust
. quietly ivreg2 ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio), first robust
. estimates store IVrobust
. estimates table OLSrobust IVrobust, stat(se r2_a rmse) star

----------------------------------------------
    Variable |   OLSrobust        IVrobust
-------------+--------------------------------
      totchr |  .44038073***    .45026553***
         age | -.00352947      -.01321759***
      female |   .0578055*     -.02040599
      blhisp | -.15130678***   -.21742435***
        linc |  .01048155       .08700179***
 hi_empunion |   .0738788**    -.89759128***
       _cons |  5.8611305***    6.7871701***
-------------+--------------------------------
          se |
        r2_a |  .17648308       .06339657
        rmse |  1.2360328       1.3177132
----------------------------------------------
      legend: * p<0.05; ** p<0.01; *** p<0.001

Age loses its significance under OLS, and the female indicator becomes significant at the 5% level. The most interesting (and unsurprising) result is that linc loses all significance under OLS. This reflects the endogeneity problem: OLS treats hi_empunion as exogenous, and the resulting bias carries over to the other coefficients. The hi_empunion coefficient itself changes sign, from 0.074 under OLS to -0.898 under IV.

A first attempt at the Hausman test compares the wrong pair of estimates (the stored iv results from Question 4 against the new ivreg2 results) and fails:

. quietly ivreg2 ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio)
. hausman iv ., constant sigmamore

Note: the rank of the differenced variance matrix (1) does not equal the number of coefficients being tested (7); be sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       iv           .          Difference          S.E.
-------------+----------------------------------------------------------------
 hi_empunion |   -.9899269    -.8975913       -.0923356               .
      totchr |    .4512051     .4502655        .0009395               .
         age |   -.0141384    -.0132176       -.0009208               .
      female |   -.0278398     -.020406       -.0074338               .
      blhisp |   -.2237087    -.2174244       -.0062843               .
        linc |    .0942748     .0870018         .007273               .
       _cons |    6.875188      6.78717        .0880176               .
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from ivreg
           B = inconsistent under Ha, efficient under Ho; obtained from ivreg2

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =    -1.19    chi2<0 ==> model fitted on these
                                        data fails to meet the asymptotic
                                        assumptions of the Hausman test;
                                        see suest for a generalized test

The correct comparison is between the IV estimates and the OLS estimates:

. quietly ivreg2 ldrugexp totchr age female blhisp linc (hi_empunion = ssiratio)
. estimates store iv
. quietly reg ldrugexp totchr age female blhisp linc hi_empunion
. hausman iv ., constant sigmamore

Note: the rank of the differenced variance matrix (1) does not equal the number of coefficients being tested (7); be sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       iv           .          Difference          S.E.
-------------+----------------------------------------------------------------
 hi_empunion |   -.8975913     .0738788       -.9714701        .1932748
      totchr |    .4502655     .4403807        .0098848        .0019666
         age |   -.0132176    -.0035295       -.0096881        .0019275
      female |    -.020406     .0578055       -.0782115        .0155602
      blhisp |   -.2174244    -.1513068       -.0661176        .0131541
        linc |    .0870018     .0104815        .0765202        .0152238
       _cons |     6.78717     5.861131        .9260396        .1842364
------------------------------------------------------------------------------
                          b = consistent under Ho and Ha; obtained from ivreg2
          B = inconsistent under Ha, efficient under Ho; obtained from regress

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       25.26
                Prob>chi2 =      0.0000
                (V_b-V_B is not positive definite)

The Hausman test statistic (chi2(1) = 25.26, p-value 0.0000) rejects the exogeneity of hi_empunion.
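Because only one regressor is treated as endogenous, the differenced variance matrix has rank 1, and the Hausman statistic reduces to the squared ratio of the difference in the hi_empunion coefficients to its standard error. As a check on the reported value, using the figures from the second Hausman table (the symbols b_IV and b_OLS are notation introduced here):

```latex
H = \left( \frac{b_{IV} - b_{OLS}}{\operatorname{s.e.}(b_{IV} - b_{OLS})} \right)^{2}
  = \left( \frac{-0.9714701}{0.1932748} \right)^{2}
  \approx (-5.026)^{2}
  \approx 25.26
```

Under the null of exogeneity, H follows a chi-squared distribution with 1 degree of freedom, whose 1% critical value is 6.63, so the null is rejected.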
Alternatively, the same test can be obtained from ivreg2 with the orthog() option:

. ivreg2 ldrugexp totchr age female blhisp linc hi_empunion (= ssiratio), robust orthog(hi_empunion)

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   376.85
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.1770
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9652
Residual SS             =   15403.0482                Root MSE      =    1.236

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      totchr |   .4403807     .00936    47.05   0.000     .4220354    .4587261
         age |  -.0035295   .0019363    -1.82   0.068    -.0073246    .0002657
      female |   .0578055   .0253563     2.28   0.023      .008108     .107503
      blhisp |  -.1513068   .0341146    -4.44   0.000    -.2181701   -.0844435
        linc |   .0104815   .0137079     0.76   0.444    -.0163854    .0373485
 hi_empunion |   .0738788   .0259757     2.84   0.004     .0229673    .1247903
       _cons |   5.861131   .1570491    37.32   0.000      5.55332    6.168941
------------------------------------------------------------------------------
Hansen J statistic (Lagrange multiplier test of excluded instruments):  24.935
                                                   Chi-sq(1) P-val =    0.0000
-orthog- option:
Hansen J statistic (eqn. excluding suspect orthog. conditions):          0.000
                                                   Chi-sq(0) P-val =         .
C statistic (exogeneity/orthogonality of suspect instruments):          24.935
                                                   Chi-sq(1) P-val =    0.0000
Instruments tested:   hi_empunion
------------------------------------------------------------------------------
Included instruments: totchr age female blhisp linc hi_empunion
Excluded instruments: ssiratio
------------------------------------------------------------------------------
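As a cross-check that does not depend on user-written packages, the same question can be put to Stata's built-in ivregress. This is a sketch that is not part of the original answer; it assumes the exam data are in memory and that the commands are run from a do-file. The statistics it reports after vce(robust) (a robust score test and a robust regression-based F test) need not match the C statistic above exactly.

```stata
* Sketch only: just-identified 2SLS with heteroskedasticity-robust errors
ivregress 2sls ldrugexp totchr age female blhisp linc ///
    (hi_empunion = ssiratio), vce(robust)

* H0: hi_empunion is exogenous
estat endogenous

* First-stage diagnostics for the excluded instrument ssiratio
estat firststage
```

estat overid is deliberately left out: with a single instrument for a single endogenous regressor the equation is exactly identified, so there are no overidentifying restrictions to test.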