IV/2SLS models
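$z_i = 0$: $\bar{x}_0 = 0.80$
$z_i = 1$: $\bar{x}_1 = 0.57$
$\bar{y}_0 = 3186$
$\bar{y}_1 = 3278$
$\bar{y}_1 - \bar{y}_0 = 3278 - 3186$
$\hat{\beta}_1 = \dfrac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0} = \dfrac{3278 - 3186}{0.57 - 0.80} = -400$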
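The ratio above is the Wald estimator: the difference in mean outcomes between the two instrument groups divided by the difference in mean treatment. A minimal Python sketch of that arithmetic follows; the function name `wald_estimate` is an illustrative label, not something taken from the slides.

```python
def wald_estimate(y_bar_1, y_bar_0, x_bar_1, x_bar_0):
    """Wald/IV estimate from group means for a binary instrument z."""
    reduced_form = y_bar_1 - y_bar_0   # effect of z on the outcome y
    first_stage = x_bar_1 - x_bar_0    # effect of z on the treatment x
    if first_stage == 0:
        raise ValueError("instrument does not move x; estimate undefined")
    return reduced_form / first_stage

# Group means from the slide: y = 3186 / 3278, x = 0.80 / 0.57
beta_hat = wald_estimate(3278, 3186, 0.57, 0.80)
print(round(beta_hat, 1))  # -400.0
```

The guard on a zero first stage reflects the fact that the estimate is undefined when the instrument does not shift x at all.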
Vietnam era service
• Defined as 1964-1975
• Estimated 8.7 million served during era
• 3.4 million were in SE Asia
• 2.6 million served in Vietnam
• 1.6 million saw combat
• 203K wounded in action, 153K hospitalized
• 58,000 deaths
• http://www.history.navy.mil/library/online/american%20war%20casualty.htm#t7
Vietnam Era Draft
• 1st part of war, operated like WWII and
Korean War
• At age 18 men report to local draft boards
• Could receive deferment for variety of
reasons (kids, attending school)
• If available for service, pre-induction
physical and tests
• Military needs determined those drafted
• Everyone drafted went to the Army
• Local draft boards filled army.
• Priorities
– Delinquents, volunteers, non-vol. 19-25
– For non-vol., determined by age
• College enrollment powerful way to avoid
service
– Men w. college degree 1/3 less likely to serve
Draft Lottery
• Proposed by Nixon
• Passed in Nov 1969, 1st lottery Dec 1, 1969
• 1st lottery for men age 19-26 on 1/1/70
– Men born 1944-1950.
• Randomly assigned number 1-365, Draft Lottery
number (DLN)
• Military estimates needs, sets threshold T
• If DLN<=T, drafted
Questions?
• What are the research questions?
• Why can we NOT obtain estimates from
observational data?
• If volunteer, could get better assignment
• Thresholds for service
Draft    Year of Birth    Threshold
1970     1946-50          195
1971     1951             125
1972     1952             95
• Draft suspended in 1973
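The lottery rule "if DLN<=T, drafted" together with the thresholds listed above can be written as a 0/1 indicator, which is the kind of binary variable an IV analysis would treat as z_i. This is a small sketch under the slide's own numbers; the names `THRESHOLDS` and `draft_eligible` are hypothetical, and the 1-365 range follows the slide.

```python
# Thresholds from the slide, keyed by year of birth.
# Men born 1946-50 faced the 1970 draft threshold of 195.
THRESHOLDS = {**{yob: 195 for yob in range(1946, 1951)}, 1951: 125, 1952: 95}

def draft_eligible(year_of_birth, dln):
    """Return 1 if DLN <= T for this birth cohort, else 0."""
    if year_of_birth not in THRESHOLDS:
        raise ValueError("no lottery threshold listed for this cohort")
    if not 1 <= dln <= 365:
        raise ValueError("DLN must be between 1 and 365")
    return int(dln <= THRESHOLDS[year_of_birth])

print(draft_eligible(1950, 195))  # 1
print(draft_eligible(1951, 126))  # 0
print(draft_eligible(1952, 95))   # 1
```

Because the DLN is randomly assigned, the indicator is unrelated to individual characteristics by construction, which is what makes it attractive as an instrument.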
Angrist/Evans
[Figure: Female Labor Force Participation Rate. Vertical axis: percent in labor force (0 to 70); horizontal axis: year (1948 to 2002).]
. * get correlation coefficient between;
. * instrument and endogenous RHS variable;
. corr morekids samesex;
(obs=254654)

             | morekids  samesex
-------------+------------------
    morekids |   1.0000
     samesex |   0.0695   1.0000

Correlation coefficient
Ratio of variances = (0.0020246/0.0291242)^2 = 0.004832484
R2 = 290.247937/60030.836855 = 0.004832
βiv = -0.0092924/0.0675253 = -0.137631
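These three lines make two points that can be checked numerically. With a single instrument, the squared correlation between z and x, the first-stage R2, and the squared ratio of the two standard errors should all be roughly the same number. The IV estimate is the ratio of two coefficients. The sketch below only re-does the slide's arithmetic; reading the two standard errors as OLS and IV, and the two coefficients as reduced form and first stage, is my assumption about what the slide's numbers represent.

```python
# Numbers copied from the slides.
corr_zx = 0.0695                                # corr(morekids, samesex)
se_ratio_sq = (0.0020246 / 0.0291242) ** 2      # squared ratio of std. errors
r2_first_stage = 290.247937 / 60030.836855      # model SS / total SS

print(round(corr_zx ** 2, 4))     # 0.0048
print(round(se_ratio_sq, 4))      # 0.0048
print(round(r2_first_stage, 4))   # 0.0048

# Wald/IV estimate: one coefficient divided by the other.
reduced_form = -0.0092924   # assumed: coefficient on samesex in the outcome equation
first_stage = 0.0675253     # assumed: coefficient on samesex in the morekids equation
beta_iv = reduced_form / first_stage
print(round(beta_iv, 3))          # -0.138
```

All three strength measures agree to four decimal places, and the ratio matches the slide's βiv up to rounding in the displayed inputs.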
Reduced form, just identified model
First stage, just identified model
2SLS, just identified model
βiv = -0.0083481/0.0693854 = -0.120315
1st stage over identified model
ivreg2
• Download from the web
• Within Stata, type
ssc install ivreg2, replace
• and hit return
• Runs all the tests seamlessly
* the syntax is ivreg2 y w (x=z), first endog(x);
* the first option asks Stata to report the 1st stage, and;
* endog(x) asks Stata to do the Hausman-Wu test of endogeneity;
ivreg2 workedm boy1st boy2nd agem1 agefstm black hispan othrace
    (morekids=samesex), first endog(morekids);

Annotations:
• workedm: outcome of interest
• boy1st ... othrace: W's (exogenous covariates)
• (morekids=samesex): endogenous variable and instruments
• first: ask for 1st stage
• endog(morekids): test for endogeneity of morekids in model
IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =   254654
                                                      F(  8,254645) =   865.24
                                                      Prob > F      =   0.0000
Total (centered) SS     =  63460.72056                Centered R2   =   0.0482
Total (uncentered) SS   =       134513                Uncentered R2 =   0.5510
Residual SS             =  60402.67924                Root MSE      =     .487

------------------------------------------------------------------------------
     workedm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    morekids |  -.1203151   .0278407    -4.32   0.000    -.1748818   -.0657483
      boy1st |   .0009211   .0019489     0.47   0.636    -.0028986    .0047409
      boy2nd |  -.0048314   .0019425    -2.49   0.013    -.0086386   -.0010241
       agem1 |   .0219352   .0009013    24.34   0.000     .0201687    .0237018
     agefstm |  -.0264911   .0012647   -20.95   0.000    -.0289698   -.0240124
       black |   .1899764   .0047674    39.85   0.000     .1806325    .1993203
      hispan |  -.0139081   .0053812    -2.58   0.010    -.0244551   -.0033611
     othrace |   .0443545   .0048137     9.21   0.000     .0349198    .0537891
       _cons |   .4498966   .0138562    32.47   0.000     .4227389    .4770543
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):        1405.578
                                                   Chi-sq(1) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):             1413.330
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           0.000
OLS estimation
--------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =   254654
                                                      F(  8,254645) =  2825.70
                                                      Prob > F      =   0.0000
Total (centered) SS     =  60030.83676                Centered R2   =   0.0815
Total (uncentered) SS   =        96912                Uncentered R2 =   0.4311
Residual SS             =   55136.2215                Root MSE      =    .4653

------------------------------------------------------------------------------
    morekids |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      boy1st |  -.0015753   .0026228    -0.60   0.548    -.0067158    .0035653
       agem1 |   .0304246    .000298   102.09   0.000     .0298405    .0310087
     agefstm |  -.0435676   .0003462  -125.85   0.000    -.0442461   -.0428891
       black |   .0679715   .0041853    16.24   0.000     .0597684    .0761747
      hispan |    .125998   .0038974    32.33   0.000     .1183591    .1336369
     othrace |   .0479479   .0044209    10.85   0.000      .039283    .0566127
     twoboys |   .0598382   .0025731    23.26   0.000     .0547951    .0648813
    twogirls |   .0789326   .0026467    29.82   0.000     .0737452      .08412
       _cons |   .3138696   .0092684    33.86   0.000     .2957038    .3320353
------------------------------------------------------------------------------
Included instruments: boy1st agem1 agefstm black hispan othrace twoboys twogirls
------------------------------------------------------------------------------
F test of excluded instruments:
  F(  2,254645) =   715.13          <- 1st stage F
  Prob > F      =   0.0000
Angrist-Pischke multivariate F test of excluded instruments:
  F(  2,254645) =   715.13
  Prob > F      =   0.0000
Summary results for first-stage regressions
-------------------------------------------

                                           (Underid)            (Weak id)
Variable     | F(  2,254645)  P-val | AP Chi-sq(  2) P-val | AP F(  2,254645)
morekids     |      715.13   0.0000 |      1430.31  0.0000 |       715.13
IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =   254654
                                                      F(  7,254646) =   987.26
                                                      Prob > F      =   0.0000
Total (centered) SS     =  63460.72056                Centered R2   =   0.0475
Total (uncentered) SS   =       134513                Uncentered R2 =   0.5506
Residual SS             =  60445.97117                Root MSE      =    .4872

------------------------------------------------------------------------------
     workedm |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    morekids |  -.1127816   .0276854    -4.07   0.000     -.167044   -.0585193
      boy1st |   .0009424   .0019496     0.48   0.629    -.0028786    .0047635
       agem1 |   .0217057   .0008969    24.20   0.000     .0199478    .0234635
     agefstm |  -.0261649   .0012583   -20.79   0.000    -.0286312   -.0236987
       black |   .1895035   .0047653    39.77   0.000     .1801637    .1988433
      hispan |   -.014818   .0053707    -2.76   0.006    -.0253444   -.0042916
     othrace |   .0439784    .004813     9.14   0.000      .034545    .0534118
       _cons |   .4448388   .0137111    32.44   0.000     .4179656    .4717121
------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):        1422.320
                                                   Chi-sq(2) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              715.129
Stock-Yogo weak ID test critical values: 10% maximal IV size             19.93
                                         15% maximal IV size             11.59
                                         20% maximal IV size              8.75
                                         25% maximal IV size              7.25
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           6.182
                                                   Chi-sq(1) P-val =    0.0129
-endog- option:
Endogeneity test of endogenous regressors:                               3.809
                                                   Chi-sq(1) P-val =    0.0510
Regressors tested:    morekids
------------------------------------------------------------------------------
Instrumented:         morekids
Included instruments: boy1st agem1 agefstm black hispan othrace
Excluded instruments: twoboys twogirls
------------------------------------------------------------------------------

Annotations:
• Sargan statistic: test of over id.
• Endogeneity test: Hausman endo test
. * output residuals and do the tests of overid;
. * and hausman test by brute force;
. predict res_2sls_worked, res;
. * test of overid;
. reg res_2sls_worked twoboys twogirls boy1st agem1 agefstm black hispan othrace;

      Source |       SS       df       MS              Number of obs =  254654
-------------+------------------------------           F(  8,254645) =    0.77
       Model |  1.46731447      8  .183414308          Prob > F      =  0.6269
    Residual |  60444.5039 254645  .237367723          R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0000
       Total |  60445.9712 254653  .237366028          Root MSE      =   .4872

------------------------------------------------------------------------------
res_2sls_w~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     twoboys |  -.0052822   .0026941    -1.96   0.050    -.0105625   -1.83e-06
    twogirls |   .0042367   .0027711     1.53   0.126    -.0011946    .0096681
      boy1st |    .004822   .0027461     1.76   0.079    -.0005603    .0102043
       agem1 |   3.72e-07    .000312     0.00   0.999    -.0006112     .000612
     agefstm |   2.07e-06   .0003625     0.01   0.995    -.0007084    .0007125
       black |  -.0000392   .0043822    -0.01   0.993    -.0086282    .0085498
      hispan |  -.0000393   .0040807    -0.01   0.992    -.0080375    .0079588
     othrace |   .0000149   .0046288     0.00   0.997    -.0090575    .0090872
       _cons |  -.0021381   .0097043    -0.22   0.826    -.0211583     .016882
------------------------------------------------------------------------------
• SSM = 1.467
• SST = 60445.97
• R2 = SSM/SST = 2.43E-5
• N = 254654
• NR2 = 6.18
• Dist as χ2(1)
• P-value of 6.18 is 0.0129
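The NR2 overidentification test above can be reproduced by hand. The sketch below uses the model and total sums of squares reported in the regression of the 2SLS residuals on all exogenous variables. The degrees of freedom are the two excluded instruments minus the one endogenous regressor, which gives 1. The helper `chi2_sf_1df` is an illustrative name; it relies on the standard identity that a chi-square(1) upper tail equals erfc(sqrt(x/2)).

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

ssm = 1.46731447    # model SS from the residual regression
sst = 60445.9712    # total SS from the same regression
n = 254654          # number of observations

r2 = ssm / sst
nr2 = n * r2
p_value = chi2_sf_1df(nr2)

print(round(nr2, 2))      # 6.18
print(round(p_value, 4))  # 0.0129
```

Both numbers agree with the Sargan statistic and p-value that ivreg2 reports for the overidentified model.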
Do Hausman test brute force

. * Run Hausmans test of endogeneity, two instrument case;
. * add residual from 1st stage regression to OLS of structural model;
. reg workedm morekids boy1st agem1 agefstm black hispan othrace res_1st_2zs;

      Source |       SS       df       MS              Number of obs =  254654
-------------+------------------------------           F(  8,254645) = 1677.06
       Model |  3176.20362      8  397.025453          Prob > F      =  0.0000
    Residual |  60284.5169 254645  .236739449          R-squared     =  0.0500
-------------+------------------------------           Adj R-squared =  0.0500
       Total |  63460.7206 254653  .249204685          Root MSE      =  .48656

------------------------------------------------------------------------------
     workedm |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    morekids |  -.1127816   .0276489    -4.08   0.000    -.1669726   -.0585906
      boy1st |   .0009424    .001947     0.48   0.628    -.0028736    .0047585
       agem1 |   .0217057   .0008957    24.23   0.000     .0199501    .0234612
     agefstm |  -.0261649   .0012566   -20.82   0.000    -.0286279   -.0237019
       black |   .1895035    .004759    39.82   0.000      .180176    .1988311
      hispan |   -.014818   .0053636    -2.76   0.006    -.0253305   -.0043054
     othrace |   .0439784   .0048067     9.15   0.000     .0345574    .0533994
 res_1st_2zs |  -.0541136   .0277264    -1.95   0.051    -.1084566    .0002294
       _cons |   .4448388    .013693    32.49   0.000     .4180009    .4716768
------------------------------------------------------------------------------

. * notice that OLS of this model generates 2SLS estimates of the other;
. * variables in the model (morekids, boy1st, etc.);
. test res_1st_2zs;

 ( 1)  res_1st_2zs = 0

       F(  1,254645) =    3.81
            Prob > F =    0.0510
Can reject at the 5.1 percent level the null that the OLS and 2SLS
coefficients are the same
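The Stata comment in the brute-force Hausman test notes that adding the first-stage residual to the structural OLS regression reproduces the 2SLS coefficient on the endogenous variable. The following sketch illustrates that identity on simulated data in the simplest case (one binary instrument, no covariates). Everything here is hypothetical: the data-generating values, the variable names, and the sample size are arbitrary choices for illustration, not the Angrist/Evans data.

```python
import random

random.seed(0)
n = 5000
z = [random.randint(0, 1) for _ in range(n)]      # binary instrument
u = [random.gauss(0, 1) for _ in range(n)]        # unobserved confounder
x = [0.5 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [-0.1 * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

def mean(a):
    return sum(a) / len(a)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

# Just-identified IV estimate: cov(z, y) / cov(z, x)
beta_iv = cov(z, y) / cov(z, x)

# First stage x = a + b*z + v; keep the residual v
b = cov(z, x) / cov(z, z)
a = mean(x) - b * mean(z)
v = [xi - a - b * zi for xi, zi in zip(x, z)]

# OLS of y on (1, x, v): solve the 2x2 normal equations in deviations from means
sxx, sxv, svv = cov(x, x), cov(x, v), cov(v, v)
sxy, svy = cov(x, y), cov(v, y)
det = sxx * svv - sxv ** 2
coef_x = (sxy * svv - sxv * svy) / det   # equals the IV estimate
coef_v = (sxx * svy - sxv * sxy) / det   # its t-test is the endogeneity test

print(abs(coef_x - beta_iv) < 1e-8)  # True
```

Testing whether `coef_v` is zero corresponds to the `test res_1st_2zs` step in the Stata log: a nonzero coefficient on the residual signals that OLS and IV estimates differ.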
Angrist/Krueger
Example
• Suppose a school district requires that a
child turn 6 by October 31 in the 1st grade
• Has compulsory education until age 18
• Consider two kids
• One born Oct 1, 1960
• Another born Nov 1,1960
• Oct 1, 1960
– Starts school in 1966 (age 5)
– Turns 6 a few months into school
– Starts senior year in 1977 (age 16)
– Does not turn 18 until after high school is over
• Nov 1, 1960
– Starts school in 1967 (age 6)
– Turns 7 a few months into school
– Starts senior year in 1978 (age 17)
– Turns 18 midway through senior year
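The two timelines follow mechanically from the October 31 cutoff and the age-18 compulsory schooling rule. A minimal sketch of that calculation is below; the function names are hypothetical, the 11-year gap between 1st grade and senior year assumes no grade repetition or skipping, and February 29 birthdays are not handled.

```python
from datetime import date

CUTOFF_MONTH, CUTOFF_DAY = 10, 31   # must turn 6 by October 31 of 1st grade
DROPOUT_AGE = 18                    # compulsory schooling until age 18

def first_grade_year(birth):
    """Calendar year in which the child starts 1st grade."""
    turns_six_by_cutoff = (birth.month, birth.day) <= (CUTOFF_MONTH, CUTOFF_DAY)
    return birth.year + 6 if turns_six_by_cutoff else birth.year + 7

def senior_year_start(birth):
    """Senior year (12th grade) begins 11 years after 1st grade."""
    return first_grade_year(birth) + 11

def can_leave_date(birth):
    """Date the student reaches the compulsory schooling age."""
    return date(birth.year + DROPOUT_AGE, birth.month, birth.day)

for birth in (date(1960, 10, 1), date(1960, 11, 1)):
    print(birth, first_grade_year(birth), senior_year_start(birth),
          can_leave_date(birth))
# 1960-10-01 1966 1977 1978-10-01
# 1960-11-01 1967 1978 1978-11-01
```

Both children reach age 18 in the fall of 1978, but the October-born child has already finished the senior year that began in 1977, while the November-born child is only partway through the senior year that began in 1978.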
. * get reduced-forms for wald estimate;
. * compare to table III, panel B;
. reg educ qob1;

      Source |       SS       df       MS              Number of obs =  329509
-------------+------------------------------           F(  1,329507) =   67.57
       Model |  727.393312      1  727.393312          Prob > F      =  0.0000
    Residual |  3546940.27 329507  10.7643852          R-squared     =  0.0002
-------------+------------------------------           Adj R-squared =  0.0002
       Total |  3547667.66 329508    10.76656          Root MSE      =  3.2809

------------------------------------------------------------------------------
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        qob1 |  -.1088179   .0132376    -8.22   0.000    -.1347633   -.0828725
       _cons |   12.79688   .0065904  1941.75   0.000     12.78397     12.8098
------------------------------------------------------------------------------
1st stage

. reg earnwkl qob1;

      Source |       SS       df       MS              Number of obs =  329509
-------------+------------------------------           F(  1,329507) =   16.42
       Model |  7.56705582      1  7.56705582          Prob > F      =  0.0001
    Residual |    151830.3 329507  .460780197          R-squared     =  0.0000
-------------+------------------------------           Adj R-squared =  0.0000
       Total |  151837.867 329508  .460801763          Root MSE      =  .67881

------------------------------------------------------------------------------
     earnwkl |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        qob1 |  -.0110989   .0027388    -4.05   0.000    -.0164669   -.0057309
       _cons |   5.902694   .0013635  4329.00   0.000     5.900022    5.905367
------------------------------------------------------------------------------
Reduced-form

βiv = -0.0110989/-0.1088179 = 0.10199
. * get correlation coefficient for;
. * educ and qob1;
. corr educ qob1;
(obs=329509)

             |     educ     qob1
-------------+------------------
        educ |   1.0000
        qob1 |  -0.0143   1.0000

Correlation coefficient: z and x
[Figure: % of Mothers that Smoked During Pregnancy by Birth Month of their Child. Vertical axis: % smoked (11.0% to 14.0%); horizontal axis: month (JAN to DEC).]
[Figure: Average Birth weight by Birth Month. Vertical axis: birth weight in grams (3280 to 3340); horizontal axis: month (JAN to DEC).]
Overidentified model
• 10 years of birth
• 3 quarters of birth
• 30 instruments
. * get dummies needed for the models;
. xi i.yob*i.qob;
i.yob             _Iyob_30-39         (naturally coded; _Iyob_30 omitted)
i.qob             _Iqob_1-4           (naturally coded; _Iqob_1 omitted)
i.yob*i.qob       _IyobXqob_#_#       (coded as above)

The xi command with i.m*i.n generates dummies for i.m and i.n,
then all the unique interactions of m and n
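To see where the count of 30 instruments comes from, the dummy names that xi creates can be enumerated directly. This sketch mimics the naming pattern shown in the xi output (dropping the first category of each variable); it is an illustration of the counting, not a reimplementation of xi.

```python
yobs = range(30, 40)   # years of birth, coded 30..39
qobs = range(1, 5)     # quarters of birth, coded 1..4

# Like xi, drop the first category of each variable as the omitted group.
yob_dummies = [f"_Iyob_{y}" for y in yobs if y != 30]
qob_dummies = [f"_Iqob_{q}" for q in qobs if q != 1]
interactions = [f"_IyobXqob_{y}_{q}"
                for y in yobs if y != 30
                for q in qobs if q != 1]

# QOB main effects plus QOB x YOB interactions serve as instruments.
instruments = qob_dummies + interactions
print(len(yob_dummies), len(qob_dummies), len(interactions))  # 9 3 27
print(len(instruments))       # 30
print(len(instruments) - 1)   # 29
```

With one endogenous regressor, 30 instruments leave 29 overidentifying restrictions.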
. * run 2sls, qob times yob interactions as instruments;
. * compare to column (2), table V;
. ivregress 2sls earnwkl _Iyob_* (educ=_Iqob* _IyobX*);

Annotations:
• _Iyob_*: YOB effects
• _Iqob* _IyobX*: QOB main effects and qob x yob interactions as instruments

Instrumental variables (2SLS) regression               Number of obs =  329509
                                                       Wald chi2(10) =   41.67
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.1102
                                                       Root MSE      =  .64034

------------------------------------------------------------------------------
     earnwkl |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0891154   .0161098     5.53   0.000     .0575408    .1206901
    _Iyob_31 |  -.0088813   .0055293    -1.61   0.108    -.0197185    .0019558
             (some results omitted)
    _Iyob_39 |  -.0585271   .0104573    -5.60   0.000    -.0790231   -.0380311
       _cons |   4.792727   .2006807    23.88   0.000       4.3994    5.186054
------------------------------------------------------------------------------
Instrumented:  educ
Instruments:   _Iyob_31 _Iyob_32 _Iyob_33 _Iyob_34 _Iyob_35 _Iyob_36
               _Iyob_37 _Iyob_38 _Iyob_39 _Iqob_2 _Iqob_3 _Iqob_4
               _IyobXqob_31_2 _IyobXqob_31_3 _IyobXqob_31_4 _IyobXqob_32_2
               _IyobXqob_32_3 _IyobXqob_32_4 _IyobXqob_33_2 _IyobXqob_33_3
               _IyobXqob_33_4 _IyobXqob_34_2 _IyobXqob_34_3 _IyobXqob_34_4
               _IyobXqob_35_2 _IyobXqob_35_3 _IyobXqob_35_4 _IyobXqob_36_2
               _IyobXqob_36_3 _IyobXqob_36_4 _IyobXqob_37_2 _IyobXqob_37_3
               _IyobXqob_37_4 _IyobXqob_38_2 _IyobXqob_38_3 _IyobXqob_38_4
               _IyobXqob_39_2 _IyobXqob_39_3 _IyobXqob_39_4

. estat overid;

  Tests of overidentifying restrictions:

  Sargan (score) chi2(29) =  25.4394  (p = 0.6553)
  Basmann chi2(29)        =  25.4383  (p = 0.6553)
. estat firststage;

  First-stage regression summary statistics
  --------------------------------------------------------------------------
               |            Adjusted      Partial
      Variable |   R-sq.       R-sq.        R-sq.   F(30,329469)   Prob > F
  -------------+------------------------------------------------------------
          educ |  0.0033      0.0032       0.0004       4.90707      0.0000
  --------------------------------------------------------------------------

1st stage F – lots of concerns about finite sample bias
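One way to gauge the finite-sample concern is a commonly cited rule of thumb: the bias of 2SLS, expressed as a fraction of the OLS bias, is roughly 1/(F + 1), where F is the first-stage F statistic on the excluded instruments. This approximation is my addition and is not stated on the slides, so treat the sketch below as a rough guide under that assumption; the function name is hypothetical.

```python
def relative_bias_2sls(first_stage_f):
    """Rule-of-thumb 2SLS bias as a share of OLS bias: about 1 / (F + 1)."""
    if first_stage_f < 0:
        raise ValueError("F statistic cannot be negative")
    return 1.0 / (first_stage_f + 1.0)

# First-stage F values that appear in these slides.
for f_stat in (715.13, 4.90707, 1.6):
    print(f_stat, round(relative_bias_2sls(f_stat), 3))
# 715.13 0.001
# 4.90707 0.169
# 1.6 0.385
```

Under this approximation the twoboys/twogirls first stage (F of about 715) leaves essentially no bias, while an F near 4.9 or 1.6 pulls the 2SLS estimate a substantial share of the way toward OLS.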
In columns (4) and (8), age and agesq reduce the information contained
in the instrument. 1st stage F falls to 1.6. Compare 2SLS to OLS in these cases.
In this instance, with a low F (poor 1st stage fit), results collapse to OLS
Notice how close the 2SLS and OLS are
Generate instruments by interacting
3 QOB x 10 YOB dummies (30)
3 QOB x 50 YOB dummies (147)
177 instruments, 176 DOF in NR2 test