solution - Jean-Pierre Laffargue's home page

JPL
Page 1
08/03/2016
M2R International Economics, Development, Transition
Academic year 2010-2011
APPLIED ECONOMETRICS EXAM
Question 1
. use "C:\Documents and Settings\Administrador\Mis documentos\International Economics I\Econometrics\Laffargue\exam\mus06data.dta", clear
. describe ldrugexp hi_empunion totchr age female blhisp linc

              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------
ldrugexp        float   %9.0g                 log(drugexp)
hi_empunion     byte    %8.0g                 Insured thro emp/union
totchr          byte    %8.0g                 Total chronic cond
age             byte    %8.0g                 Age
female          byte    %8.0g                 Female
blhisp          float   %9.0g                 Black or Hispanic
linc            float   %9.0g                 log(income)
. sum ldrugexp hi_empunion totchr age female blhisp linc

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    ldrugexp |     10391    6.479668    1.363395          0   10.18017
 hi_empunion |     10391    .3796555    .4853245          0          1
      totchr |     10391    1.860745    1.290131          0          9
         age |     10391    75.04639     6.69368         65         91
      female |     10391    .5797325    .4936256          0          1
-------------+--------------------------------------------------------
      blhisp |     10391    .1703397    .3759491          0          1
        linc |     10089    2.743275    .9131433  -6.907755   5.744476
The variable linc has missing observations: only 10089 of the 10391 values are non-missing. The average age of
individuals in the sample is about 75 years, and fewer than half of them (38%) have complementary insurance. More
than half (58%) are female. Blacks and Hispanics make up a fairly small share of the sample (17%).
Question 2
To check for missing observations we use the codebook command, and then we drop the missing
ones:
. codebook linc

------------------------------------------------------------------------------
linc                                                               log(income)
------------------------------------------------------------------------------

                  type:  numeric (float)

                 range:  [-6.9077554,5.7444763]       units:  1.000e-09
         unique values:  6914                     missing .:  302/10391

                  mean:   2.74328
              std. dev:   .913143

           percentiles:        10%       25%       50%       75%       90%
                           1.79176    2.2327   2.74316   3.31506   3.79928
. drop if linc==.
(302 observations deleted)
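The missing-value check can also be done without reading the full codebook output. The lines below are a minimal sketch, assuming the dataset is in memory and that they are run before the drop; `misstable` is only available in Stata 11 or later.

```stata
* Sketch: count the missing log-income values, then drop them
count if missing(linc)        // number of observations with linc missing
misstable summarize linc      // missing-value summary (Stata 11 or later)
drop if missing(linc)         // same effect here as: drop if linc==.
```

`missing(linc)` also catches extended missing values (`.a`, `.b`, ...); the codebook output reports only system missing values, so both forms delete the same 302 observations here.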
. des ssiratio lowincome firmsz multlc

              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------
ssiratio        float   %9.0g                 SSI/Income ratio
lowincome       byte    %8.0g                 Low income
firmsz          float   %9.0g                 Firm size
multlc          byte    %8.0g                 Multiple locations
. sum ssiratio lowincome firmsz multlc

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    ssiratio |     10089    .5365438    .3678175          0    9.25062
   lowincome |     10089    .1874319    .3902771          0          1
      firmsz |     10089    .1405293    2.170389          0         50
      multlc |     10089    .0620478    .2412543          0          1
The mean of lowincome is 0.19, so low-income status applies to a small proportion of the observations. The
average ssiratio is about 0.54, which is not very high and suggests that the income constraint is not severe.
The firms where the individuals are employed are small on average (mean firmsz of 0.14), and few of them
operate in multiple locations (about 6%).
Question 3
. ivreg2 ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio ), first robust

First-stage regressions
-----------------------

First-stage regression of hi_empunion:

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   119.18
                                                      Prob > F      =   0.0000
Total (centered) SS     =  2382.242839                Centered R2   =   0.0761
Total (uncentered) SS   =         3856                Uncentered R2 =   0.4292
Residual SS             =  2201.062524                Root MSE      =    .4672

------------------------------------------------------------------------------
             |               Robust
 hi_empunion |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      totchr |   .0127865   .0036655     3.49   0.000     .0056015    .0199716
         age |  -.0086323   .0007087   -12.18   0.000    -.0100216   -.0072431
      female |    -.07345   .0096392    -7.62   0.000    -.0923448   -.0545552
      blhisp |    -.06268   .0122742    -5.11   0.000      -.08674   -.0386201
        linc |   .0483937   .0066075     7.32   0.000     .0354417    .0613456
    ssiratio |  -.1916432   .0236326    -8.11   0.000    -.2379678   -.1453186
       _cons |   1.028981   .0581387    17.70   0.000     .9150172    1.142944
------------------------------------------------------------------------------
Included instruments: totchr age female blhisp linc ssiratio
------------------------------------------------------------------------------
F test of excluded instruments:
  F(  1, 10082) =    65.76
  Prob > F      =   0.0000
Angrist-Pischke multivariate F test of excluded instruments:
  F(  1, 10082) =    65.76
  Prob > F      =   0.0000

Summary results for first-stage regressions
-------------------------------------------

                                       (Underid)             (Weak id)
Variable     | F(  1, 10082)  P-val | AP Chi-sq(  1) P-val | AP F(  1, 10082)
hi_empunion  |       65.76   0.0000 |        65.81  0.0000 |        65.76

NB: first-stage test statistics heteroskedasticity-robust

Stock-Yogo weak ID test critical values for single endogenous regressor:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic          Chi-sq(1)=138.02   P-val=0.0000

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                     183.98
Kleibergen-Paap Wald rk F statistic                                65.76

Stock-Yogo weak ID test critical values for K1=1 and L1=1:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(1,10082)=     22.12    P-val=0.0000
Anderson-Rubin Wald test           Chi-sq(1)=      22.13    P-val=0.0000
Stock-Wright LM S statistic        Chi-sq(1)=      20.71    P-val=0.0000

NB: Underidentification, weak identification and weak-identification-robust
    test statistics heteroskedasticity-robust

Number of observations               N  =      10089
Number of regressors                 K  =          7
Number of endogenous regressors      K1 =          1
Number of instruments                L  =          7
Number of excluded instruments       L1 =          1

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   333.25
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.0640
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9604
Residual SS             =  17518.21658                Root MSE      =    1.318

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 hi_empunion |  -.8975913   .2211268    -4.06   0.000    -1.330992   -.4641908
      totchr |   .4502655   .0101969    44.16   0.000       .43028     .470251
         age |  -.0132176   .0029977    -4.41   0.000    -.0190931   -.0073421
      female |   -.020406   .0326114    -0.63   0.531    -.0843232    .0435113
      blhisp |  -.2174244   .0394944    -5.51   0.000     -.294832   -.1400167
        linc |   .0870018   .0226356     3.84   0.000     .0426368    .1313668
       _cons |    6.78717   .2688453    25.25   0.000     6.260243    7.314097
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):            138.015
                                                   Chi-sq(1) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              183.980
                         (Kleibergen-Paap rk Wald F statistic):         65.760
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         0.000
                                                 (equation exactly identified)
------------------------------------------------------------------------------
Instrumented:         hi_empunion
Included instruments: totchr age female blhisp linc
Excluded instruments: ssiratio
------------------------------------------------------------------------------
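First-stage results: the effect of the instrument ssiratio on hi_empunion is negative, as expected, and
statistically significant at the 1% level. The robust first-stage F statistic of 65.76 is well above the
Stock-Yogo 10% maximal IV size critical value of 16.38 (bearing in mind that these critical values are derived
for i.i.d. errors), so ssiratio does not appear to be a weak instrument.
Having supplementary insurance decreases log expenditure on prescribed medication by 0.898. Because the
dependent variable is in logs, this corresponds to a fall of about 59% (exp(-0.898) - 1 = -0.59), which is
a very large effect.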
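Because ldrugexp is the log of drug expenditure, the coefficient on a dummy regressor such as hi_empunion is a change in logs, and the implied percentage change is 100*(exp(b) - 1). The line below is a small sketch of that conversion, with the coefficient copied by hand from the 2SLS output above.

```stata
* Sketch: convert the IV coefficient on hi_empunion (log-linear model)
* into a percentage change in drug expenditure.
* The result is approximately -59, i.e. a fall of about 59 percent.
display 100*(exp(-.8975913) - 1)
```

Run immediately after the ivreg2 command, the same calculation can use the stored coefficient instead of the typed value: `display 100*(exp(_b[hi_empunion]) - 1)`.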
Question 4
. quietly ivreg ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio multlc )
. estimates store iv
. quietly ivreg ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio multlc ), robust
. estimates store ivrobust
. quietly ivreg2 ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio multlc ), gmm
. estimates store GMM
. estimates table iv ivrobust GMM, stat (se r2_a rmse) star

--------------------------------------------------------------
    Variable |      iv            ivrobust          GMM
-------------+------------------------------------------------
 hi_empunion | -.98992691***   -.98992691***   -.99327949***
      totchr |  .45120505***    .45120505***    .45095079***
         age | -.01413842***   -.01413842***   -.01415093***
      female | -.02783978      -.02783978      -.02817157
      blhisp | -.22370865***   -.22370865***   -.22310484***
        linc |  .09427483***    .09427483***    .09446321***
       _cons |  6.8751877***    6.8751877***    6.8778206***
-------------+------------------------------------------------
          se |
        r2_a |  .04087781       .04087781       .04002154
        rmse |  1.3339228       1.3339228       1.3340551
--------------------------------------------------------------
                      legend: * p<0.05; ** p<0.01; *** p<0.001
The coefficients are significant for all regressors except the female indicator, which suggests that
gender has no effect on drug expenditure.
The number of chronic conditions and the log of income both have a positive effect on drug
expenditure, and the first effect is much larger than the second (the more ill you are, the more
you have to spend on medical care, and the higher your income, the more you can afford it).
Drug expenditure is lower for blacks and Hispanics (perhaps because they have lower incomes
and can afford less medical care) and lower for people with supplementary insurance
(which is natural if the insurance covers the expenses). Expenditure also appears to decrease
with age, which is somewhat odd (the older you get, the more likely you are to get sick and to
spend more on medical care). The iv and ivrobust coefficients are identical, because the robust
option changes only the standard errors, and they differ only slightly from the GMM coefficients.
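The three sets of estimates differ mainly in their standard errors, which the table above does not show: the se row is empty, presumably because se is not a stored scalar that stat() can report. The following is a minimal sketch of how the standard errors could be displayed, assuming the estimates are still stored under the names iv, ivrobust and GMM.

```stata
* Sketch: show coefficients with standard errors underneath.
* -star- cannot be combined with -se-, so it is left out here.
estimates table iv ivrobust GMM, b(%9.4f) se(%9.4f) stats(N r2_a rmse)
```

With this layout the iv and ivrobust columns should show the same coefficients but different standard errors.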
Question 5
. ivreg2 ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio multlc ), gmm
-gmm- is no longer a supported option; use -gmm2s- with the appropriate option
      gmm             = gmm2s robust
      gmm robust      = gmm2s robust
      gmm bw()        = gmm2s bw()
      gmm robust bw() = gmm2s robust bw()
      gmm cluster()   = gmm2s cluster()

2-Step GMM estimation
---------------------

Estimates efficient for arbitrary heteroskedasticity
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   325.50
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.0406
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9594
Residual SS             =  17955.42285                Root MSE      =    1.334

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 hi_empunion |  -.9932795   .2045645    -4.86   0.000    -1.394219   -.5923405
      totchr |   .4509508   .0103058    43.76   0.000     .4307517    .4711498
         age |  -.0141509      .0029    -4.88   0.000    -.0198347   -.0084671
      female |  -.0281716   .0321727    -0.88   0.381    -.0912288    .0348857
      blhisp |  -.2231048   .0395804    -5.64   0.000     -.300681   -.1455287
        linc |   .0944632   .0218833     4.32   0.000     .0515727    .1373537
       _cons |   6.877821   .2578727    26.67   0.000     6.372399    7.383242
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):            170.738
                                                   Chi-sq(2) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              110.613
                         (Kleibergen-Paap rk Wald F statistic):         58.612
Stock-Yogo weak ID test critical values: 10% maximal IV size             19.93
                                         15% maximal IV size             11.59
                                         20% maximal IV size              8.75
                                         25% maximal IV size              7.25
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         1.048
                                                   Chi-sq(1) P-val =    0.3061
------------------------------------------------------------------------------
Instrumented:         hi_empunion
Included instruments: totchr age female blhisp linc
Excluded instruments: ssiratio multlc
------------------------------------------------------------------------------
The p-value of the Hansen J test is 0.3061, so we cannot reject the null hypothesis that the
overidentifying restrictions are valid. We therefore do not reject the instruments.
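The log shows that ivreg2 no longer accepts the gmm option and prints the mapping to the new syntax. The sketch below reruns the same model with gmm2s robust and then reproduces the Hansen J p-value. It assumes ivreg2 is installed (it is a user-written command) and that it stores the p-value in e(jp), as its help file describes; the `///` line continuation works in a do-file, not at the interactive prompt.

```stata
* Sketch: same model with the current ivreg2 syntax
* (ivreg2 is user-written; install with: ssc install ivreg2)
ivreg2 ldrugexp totchr age female blhisp linc ///
    (hi_empunion = ssiratio multlc), gmm2s robust

* Hansen J p-value from the stored results (assumed to be in e(jp))
display e(jp)

* The same p-value by hand: chi-squared(1) upper tail at the printed J = 1.048
display chi2tail(1, 1.048)
```

The degrees of freedom of the J test equal the number of excluded instruments minus the number of endogenous regressors, here 2 - 1 = 1.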
Including the four instruments:
. ivreg2 ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio multlc lowincome firmsz ), gmm
-gmm- is no longer a supported option; use -gmm2s- with the appropriate option
      gmm             = gmm2s robust
      gmm robust      = gmm2s robust
      gmm bw()        = gmm2s bw()
      gmm robust bw() = gmm2s robust bw()
      gmm cluster()   = gmm2s cluster()

2-Step GMM estimation
---------------------

Estimates efficient for arbitrary heteroskedasticity
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   335.98
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.0829
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9612
Residual SS             =  17163.61371                Root MSE      =    1.304

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 hi_empunion |  -.8124043   .1861018    -4.37   0.000    -1.177157   -.4476515
      totchr |    .449488     .01011    44.46   0.000     .4296728    .4693033
         age |  -.0124598   .0027643    -4.51   0.000    -.0178777    -.007042
      female |  -.0104528   .0308857    -0.34   0.735    -.0709876     .050082
      blhisp |  -.2061018   .0385144    -5.35   0.000    -.2815886    -.130615
        linc |   .0796532   .0205381     3.88   0.000     .0393992    .1199073
       _cons |     6.7126   .2441439    27.49   0.000     6.234086    7.191113
------------------------------------------------------------------------------
Underidentification test (Kleibergen-Paap rk LM statistic):            200.657
                                                   Chi-sq(4) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               62.749
                         (Kleibergen-Paap rk Wald F statistic):         44.823
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    16.85
                                         10% maximal IV relative bias    10.27
                                         20% maximal IV relative bias     6.71
                                         30% maximal IV relative bias     5.34
                                         10% maximal IV size             24.58
                                         15% maximal IV size             13.96
                                         20% maximal IV size             10.26
                                         25% maximal IV size              8.31
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):        11.590
                                                   Chi-sq(3) P-val =    0.0089
------------------------------------------------------------------------------
Instrumented:         hi_empunion
Included instruments: totchr age female blhisp linc
Excluded instruments: ssiratio multlc lowincome firmsz
------------------------------------------------------------------------------
With the two additional instruments the p-value of the Hansen J test falls to 0.0089, so we reject the
null hypothesis that all instruments are valid at the 1% level. Since the test did not reject with
ssiratio and multlc alone, this suggests that lowincome or firmsz (or both) is correlated with the
error term, and we should not keep the new instruments.
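The Hansen J statistic tests all overidentifying restrictions jointly, so it does not say which instruments are at fault. One way to focus on the two added instruments is the C (difference-in-Sargan) statistic from the orthog() option of ivreg2, the same option used in the last command of this solution. The sketch below assumes ivreg2 is installed and is meant for a do-file because of the `///` continuations.

```stata
* Sketch: test lowincome and firmsz as a subset of the instruments
ivreg2 ldrugexp totchr age female blhisp linc ///
    (hi_empunion = ssiratio multlc lowincome firmsz), ///
    gmm2s robust orthog(lowincome firmsz)

* Check the printed J p-value: chi-squared(3) upper tail at 11.590
display chi2tail(3, 11.590)
```

A small p-value for the C statistic would point to lowincome and firmsz specifically, under the maintained assumption that ssiratio and multlc are valid.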
Question 6
. quietly reg ldrugexp totchr age female blhisp linc hi_empunion, robust
. estimates store OLSrobust
. quietly ivreg2 ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio ), first robust
. estimates store IVrobust
. estimates table OLSrobust IVrobust , stat (se r2_a rmse) star

----------------------------------------------
    Variable |   OLSrobust       IVrobust
-------------+--------------------------------
      totchr |  .44038073***    .45026553***
         age | -.00352947      -.01321759***
      female |   .0578055*     -.02040599
      blhisp | -.15130678***   -.21742435***
        linc |  .01048155       .08700179***
 hi_empunion |   .0738788**    -.89759128***
       _cons |  5.8611305***    6.7871701***
-------------+--------------------------------
          se |
        r2_a |  .17648308       .06339657
        rmse |  1.2360328       1.3177132
----------------------------------------------
      legend: * p<0.05; ** p<0.01; *** p<0.001
Under OLS, age loses its significance and the female indicator becomes significant at the 5% level.
The most interesting result (though not a surprising one) is that linc loses all its significance
under OLS, and that the coefficient on hi_empunion changes sign, from -0.898 under IV to 0.074
under OLS, which is consistent with hi_empunion being endogenous.
. quietly ivreg2 ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio )
. hausman iv ., constant sigmamore

Note: the rank of the differenced variance matrix (1) does not equal the number of
        coefficients being tested (7); be sure this is what you expect, or there may be
        problems computing the test.  Examine the output of your estimators for anything
        unexpected and possibly consider scaling your variables so that the coefficients
        are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       iv           .          Difference          S.E.
-------------+----------------------------------------------------------------
 hi_empunion |   -.9899269    -.8975913       -.0923356               .
      totchr |    .4512051     .4502655        .0009395               .
         age |   -.0141384    -.0132176       -.0009208               .
      female |   -.0278398     -.020406       -.0074338               .
      blhisp |   -.2237087    -.2174244       -.0062843               .
        linc |    .0942748     .0870018         .007273               .
       _cons |    6.875188      6.78717        .0880176               .
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from ivreg
            B = inconsistent under Ha, efficient under Ho; obtained from ivreg2

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =    -1.19    chi2<0 ==> model fitted on these
                                        data fails to meet the asymptotic
                                        assumptions of the Hausman test;
                                        see suest for a generalized test

. quietly ivreg2 ldrugexp totchr age female blhisp linc ( hi_empunion= ssiratio )
. estimates store iv
. quietly reg ldrugexp totchr age female blhisp linc hi_empunion
. hausman iv ., constant sigmamore

Note: the rank of the differenced variance matrix (1) does not equal the number of
        coefficients being tested (7); be sure this is what you expect, or there may be
        problems computing the test.  Examine the output of your estimators for anything
        unexpected and possibly consider scaling your variables so that the coefficients
        are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       iv           .          Difference          S.E.
-------------+----------------------------------------------------------------
 hi_empunion |   -.8975913     .0738788       -.9714701        .1932748
      totchr |    .4502655     .4403807        .0098848        .0019666
         age |   -.0132176    -.0035295       -.0096881        .0019275
      female |    -.020406     .0578055       -.0782115        .0155602
      blhisp |   -.2174244    -.1513068       -.0661176        .0131541
        linc |    .0870018     .0104815        .0765202        .0152238
       _cons |     6.78717     5.861131        .9260396        .1842364
------------------------------------------------------------------------------
                          b = consistent under Ho and Ha; obtained from ivreg2
           B = inconsistent under Ha, efficient under Ho; obtained from regress

    Test:  Ho:  difference in coefficients not systematic

                  chi2(1) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =       25.26
                Prob>chi2 =      0.0000
                (V_b-V_B is not positive definite)
The Hausman test rejects the exogeneity of hi_empunion (chi2(1) = 25.26, p-value = 0.0000).
Alternatively, the same hypothesis can be tested with the orthog() option of ivreg2:
. ivreg2 ldrugexp totchr age female blhisp linc hi_empunion (= ssiratio ), robust orthog(hi_empunion)

OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics robust to heteroskedasticity

                                                      Number of obs =    10089
                                                      F(  6, 10082) =   376.85
                                                      Prob > F      =   0.0000
Total (centered) SS     =  18715.11622                Centered R2   =   0.1770
Total (uncentered) SS   =  442534.2012                Uncentered R2 =   0.9652
Residual SS             =   15403.0482                Root MSE      =    1.236

------------------------------------------------------------------------------
             |               Robust
    ldrugexp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      totchr |   .4403807     .00936    47.05   0.000     .4220354    .4587261
         age |  -.0035295   .0019363    -1.82   0.068    -.0073246    .0002657
      female |   .0578055   .0253563     2.28   0.023      .008108     .107503
      blhisp |  -.1513068   .0341146    -4.44   0.000    -.2181701   -.0844435
        linc |   .0104815   .0137079     0.76   0.444    -.0163854    .0373485
 hi_empunion |   .0738788   .0259757     2.84   0.004     .0229673    .1247903
       _cons |   5.861131   .1570491    37.32   0.000      5.55332    6.168941
------------------------------------------------------------------------------
Hansen J statistic (Lagrange multiplier test of excluded instruments):  24.935
                                                   Chi-sq(1) P-val =    0.0000
-orthog- option:
Hansen J statistic (eqn. excluding suspect orthog. conditions):          0.000
                                                   Chi-sq(0) P-val =         .
C statistic (exogeneity/orthogonality of suspect instruments):          24.935
                                                   Chi-sq(1) P-val =    0.0000
Instruments tested:   hi_empunion
------------------------------------------------------------------------------
Included instruments: totchr age female blhisp linc hi_empunion
Excluded instruments: ssiratio
------------------------------------------------------------------------------
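
The C statistic of 24.935 leads to the same conclusion as the Hausman test: the exogeneity of hi_empunion is rejected. As a cross-check, the official ivregress command offers a built-in endogeneity test. The lines below are a sketch only, assuming Stata 10 or later and the same estimation sample; the exact statistics will not match the hausman output, since the test is computed differently.

```stata
* Sketch: endogeneity test with official Stata commands (Stata 10 or later)
ivregress 2sls ldrugexp totchr age female blhisp linc ///
    (hi_empunion = ssiratio), vce(robust)
estat endogenous      // robust score and regression-based tests of exogeneity
estat firststage      // first-stage F statistic for the excluded instrument
```

Unlike hausman with sigmamore, the tests reported after vce(robust) remain valid under heteroskedasticity.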