Uploaded by Grace -

Endogeneity & Simultaneous Equations

advertisement
Endogeneity & Simultaneous Equation Models
Interdependence between variables also leads to endogeneity problems (ie
violates assumption that Cov(X,u) = 0 and so OLS will give biased
estimates)
Eg
C = a + bY + e
Y= C +I+G + v
(1)
(2)
This is a 2 equation simultaneous equation system
C and Y appear on both sides of respective equations and are
interdependent since
Any shock, represented by ∆e
but ∆C
and ∆Y
→ ∆C
→ ∆Y
→∆C
in (1)
from (2)
from (1)
so changes in C lead to changes in Y and changes in Y lead to changes in
C
but the fact that ∆e → ∆Y means Cov(X,u) = Cov(Y,e) ≠ 0 in (1)
which given OLS implies
^
b=
Cov ( X , y )
Cov ( X , u )
=b +
Var( X )
Var ( X )
^
means E (b ) ≠ b
So OLS in the presence of interdependent variables gives biased estimates.
Any right hand side variable which has the property
Cov(X,u) ≠ 0
is said to be endogenous
Any right hand side variable which has the property
Cov(X,u) = 0
is said to be exogenous
Solution: IV estimation (as with measurement error, since symptom, if not
cause, is the same)
^
bIV =
Cov ( Z , y )
Cov (Z , X )
(A)
Again, problem is where to find instruments. In a simultaneous equation
model, the answer is often in the system itself
Eg.2.
Price = b0 + b1 Wage + e
Wage = d0 + d1Price + d2Unemployment + v
(1)
(2)
This time wages and prices are interdependent so OLS on either (1) or (2)
will give biased estimates….. but
unemployment does not appear in (1) but is correlated with wages through
(2)
This means unemployment can be used as an instrument for wages in (1)
as
Cov(Unemployment, e) = 0
so uncorrelated with residual
and
Cov(Unemployment, Wage) ≠ 0 so correlated with endogenous RHS
variable
^
So now bIV =
Cov(Unemploment , Pr ice )
Cov (Unemployment,Wage)
(compare with (A))
This process of using extra exogenous variables as instruments for
endogenous RHS variables is known as identification
If there are no additional exogenous variables outside the original equation
than can be used as instruments for the endogenous RHS variables then
the equation is said to be unidentified
(In the example above (2) is unidentified because despite Price being
endogenous , there are no other exogenous variables not already in (2) that
can be used as instruments for Price).
In general we can develop a rule that tells us whether any particular
equation will be identified.
“In a system of M simultaneous equations, then any one equation is
identified if the number of exogenous variables excluded from that
equation is greater than or equal to the total number of endogenous
variables in that equation less one.”
K − k ≥ m −1
(B)
where K = Total no. of exogenous variables in the system
k = No. of exogenous variables included in the equation
m = No. of endogenous variables included in the equation
In practice this rule tells us whether we can find an instrument for each
and every endogenous RHS variable.
Consider the previous example
Price = b0 + b1 Wage + e
Wage = d0 + d1Price + d2Unemployment + v
(1)
(2)
We know Price and Wage are interdependent so can’t use OLS
Consider each equation in turn
In (1)
∃ 2 Endogenous variables (Price, Wage) ∴ m = 2
∃ 1 Exogenous variables in the whole system
(Unemployment)
∴K=1
∃ 0 Exogenous variables in equation (1) ∴ k = 0
Using (B)
K − k ≥ m −1
1-0 = 2-1
1= 1
so equation (1) is identified. There is an instrument for Wage
(Unemployment) ∴ can use IV
In (2)
As before
∃ 2 Endogenous variables (Price, Wage) ∴ m = 2
∃ 1 Exogenous variables in the whole system
(Unemployment)
∴K=1
but now
∃ 1 Exogenous variables in equation (2) ∴ k = 1
Using (B)
K − k ≥ m −1
1-1< 2-1
0<1
so equation (2) is not identified. There is no instrument (extra exogenous
variable) for Price anywhere else in the system of equations ∴ can’t use
IV
Eg 2. Suppose now that
Price = b0 + b1 Wage + e
(1)
Wage = d0 + d1Price + d2Unemployment + d3Productivity+ v (2)
ie added another exogenous variable, Productivity, to (2). Since exogenous
we assume Cov(Productivity, v) = 0
Now (2) is still not identified, since
∃ 2 Endogenous variables (Price, Wage) ∴ m = 2
and now
∃ 2 Exogenous variables in the whole system
(Unemployment, productivity)
∴K=2
∃ 2 Exogenous variables in equation (2) ∴ k = 2
(Unemployment, productivity)
K − k ≥ m −1
2-2< 2-1
0<1
But now in (1)
∃ 2 Endogenous variables (Price, Wage) ∴ m = 2
and
∃ 2 Exogenous variables in the whole system
(Unemployment, productivity)
∴K=2
∃ 0 Exogenous variables in equation (1) ∴ k = 0
K − k ≥ m −1
2-0 > 2-1
2>1
Equation (1) said to be over-identified (more instruments (external
exogenous variables) than strictly necessary for IV estimation)
So which instrument to use for Wage in (1) ?
If use unemployment then as before, using (A)
^
bIV =
Cov (Unemployment, price)
Cov (Unemployment,Wage)
If instead use Productivity then
^
bIV =
Cov ( productivity , price)
Cov ( productivity ,Wage)
Which is best?
Both will give unbiased estimate of true value, but likely (especially in
small samples) that estimates will be different.
(see example)
In practice use both at the same time
-removes possibility of conflicting estimates
- is more efficient (smaller variance) at least in large samples. With
small samples more efficient to use the minimum number of
instruments
Given
Price = b0 + b1 Wage + e
(1)
We know
Wage = d0 + d1Price + d2Unemployment + d3Productivity+ v (2)
and that Wage is related to the exogenous variables only by
Wage = g0 + g1Unemployment + g2Productivity+ w
(3)
(the equation which expresses the endogenous right hand side variables
solely as a function of exogenous variables is said to be a reduced form)
^
Idea is then to estimate (3) and save the predicted values Wage
^
^
^
^
Wage = g 0 + g 1 Unemployment + g 2 Pr oductivity
and use this as the instrument for Wage in (1)
^
Wage satisfies properties of an instrument since clearly correlated with
variable of interest and because it is an average only of exogenous
variables, is uncorrelated with the residual e in (1) (by assumption)
This strategy is called Two Stage Least Squares and
^
^
b2 SLS =
Cov( X , y )
^
Cov ( X , X )
^
=
Cov ( wage, price)
^
Cov ( wage, price)
Note: In many cases you will not have a simultaneous system. More than
likely will have just one equation but there may well be endogenous RHS
variables. In this case the principle is exactly the same. Find additional
exogenous variables that are correlated with the problem variable but
uncorrelated with the error (ie “as if” there were another equation in the
system).
Example
annual % change in nominal wages annual % change in price level
% change
30
20
10
0
1963 Q1
2000 Q4
time
The data set prod.dta contains quarterly time series data on wage, price,
unemployment and productivity changes
The graph suggests that wages and prices move together over time
and suggests may want to run a regression of inflation on wage changes
where by assumption (and nothing else) the direction of causality runs from
changes in wages to changes in inflation
reg inflat dwages if time>50
Source |
SS
df
MS
Number of obs =
101
-------------+-----------------------------F( 1,
99) = 381.84
Model | 2534.57478
1 2534.57478
Prob > F
= 0.0000
Residual | 657.143032
99 6.63780841
R-squared
= 0.7941
-------------+-----------------------------Adj R-squared = 0.7920
Total | 3191.71781
100 31.9171781
Root MSE
= 2.5764
-----------------------------------------------------------------------------inflatio |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------dwages |
1.012592
.0518196
19.54
0.000
.9097704
1.115413
_cons | -1.609425
.5174281
-3.11
0.002
-2.636115
-.5827354
which suggests an almost one-for-one relation between inflation and wage
changes over this period.
However you might equally run a regression of wage changes on prices
(where now the implied direction of causality is from changes in the inflation
rate to changes in wages)
. reg dwages inflat if time>50
Source |
SS
df
MS
Number of obs =
101
-------------+-----------------------------F( 1,
99) = 381.84
Model | 1962.98474
1 1962.98474
Prob > F
= 0.0000
Residual |
508.94602
99 5.14086889
R-squared
= 0.7941
-------------+-----------------------------Adj R-squared = 0.7920
Total | 2471.93076
100 24.7193076
Root MSE
= 2.2673
-----------------------------------------------------------------------------dwages |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------inflatio |
.784235
.0401334
19.54
0.000
.7046016
.8638684
_cons |
3.04795
.3657581
8.33
0.000
2.322207
3.773694
this suggests wages grow at constant rate of around 3% a year (the
coefficient on the constant) and then each 1 percentage point increase in the
inflation rate adds a .78 percentage point increase in wages.
Which is right specification?
In a sense both, since wages affect prices but prices also affect wages. The
2 variables are interdependent and said to be endogenous. This means that
Cov(X,u)≠ 0 ie a correlation between right hand side variables and the
residuals which makes OLS estimates biased and inconsistent.
Need to instrument the endogenous right hand side variables. ie find a
variable that is correlated with the suspect right hand side variable but
uncorrelated with the error term.
If, as in the example above, wages also depend on the unemployment rate
and productivity growth as well as inflation.
. reg
dwages inflat umprate prod if time>50
Source |
SS
df
MS
Number of obs =
101
---------+-----------------------------F( 3,
97) = 133.16
Model | 1988.96632
3 662.988773
Prob > F
= 0.0000
Residual | 482.964442
97 4.97901486
R-squared
= 0.8046
---------+-----------------------------Adj R-squared = 0.7986
Total | 2471.93076
100 24.7193076
Root MSE
= 2.2314
-----------------------------------------------------------------------------dwages |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
---------+-------------------------------------------------------------------inflatio |
.7823925
.0423672
18.467
0.000
.6983054
.8664796
umprate |
.1898947
.0966741
1.964
0.052
-.0019766
.3817661
prod | -.2960195
.1554679
-1.904
0.060
-.6045802
.0125412
_cons |
2.117253
.8587953
2.465
0.015
.4127825
3.821724
(Strangely) over this period higher unemployment is associated with higher
wages and lower productivity with lower wages, controlling for the effects of
inflation. (Can you think of reasons why this might be?)
Which instrument to use? Both should give same estimate if sample size is
large enough but in finite (small) samples the two IV estimates will be
different.
Compare IV estimate using unemployment as instrument
. ivreg
inflat (dwages=ump ) if time>50
Instrumental variables (2SLS) regression
-----------------------------------------------------------------------------inflatio |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
---------+-------------------------------------------------------------------dwages |
1.601649
.3725813
4.299
0.000
.8623667
2.340931
_cons | -6.718597
3.254932
-2.064
0.042
-13.17709
-.2601064
Instrumented:
Instruments:
dwages
umprate
With that using productivity as an instrument
. ivreg
inflat (dwages=prod ) if time>50
Instrumental variables (2SLS) regression
-----------------------------------------------------------------------------inflatio |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
---------+-------------------------------------------------------------------dwages |
1.089042
.1544639
7.050
0.000
.782552
1.395532
_cons | -2.272514
1.364576
-1.665
0.099
-4.980128
.4351012
Instrumented:
Instruments:
dwages
prod
Both estimates are higher than original OLS estimate, but differ widely from
each other.
Best idea (which also gives more efficient estimates ie ones with lower
standard errors) is to use all the instruments at the same time – at least in
large samples
Use unemp and productivity as instruments for wages in original equation
1. Regress endogenous variables (wages) on instruments (unemp and prod.)
. reg
dwages umprate prod if time>50
Source |
SS
df
MS
Number of obs =
101
---------+-----------------------------F( 2,
98) =
6.54
Model | 290.979911
2 145.489955
Prob > F
= 0.0022
Residual | 2180.95085
98 22.2546005
R-squared
= 0.1177
---------+-----------------------------Adj R-squared = 0.0997
Total | 2471.93076
100 24.7193076
Root MSE
= 4.7175
-----------------------------------------------------------------------------dwages |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
---------+-------------------------------------------------------------------umprate | -.1101753
.201477
-0.547
0.586
-.5099998
.2896493
prod | -.9147086
.3209619
-2.850
0.005
-1.551647
-.2777703
_cons |
11.28755
1.481331
7.620
0.000
8.347894
14.2272
Save predicted wage
. predict dwagehat
* stata command to save predicted value of dep. var. *
2.
Include this instead of wages on the right hand side of the inflation
regression
. reg
inflat dwagehat if time>50
Source |
SS
df
MS
Number of obs =
101
---------+-----------------------------F( 1,
99) =
13.41
Model | 380.648731
1 380.648731
Prob > F
= 0.0004
Residual | 2811.06908
99 28.3946372
R-squared
= 0.1193
---------+-----------------------------Adj R-squared = 0.1104
Total | 3191.71781
100 31.9171781
Root MSE
= 5.3287
-----------------------------------------------------------------------------inflatio |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
---------+-------------------------------------------------------------------dwagehat |
1.143749
.3123825
3.661
0.000
.5239143
1.763584
_cons | -2.747013
2.760836
-0.995
0.322
-8.22511
2.731083
This gives unbiased estimate of effect of wages on prices.
Compare with original (biased) estimate below, can see wage effect is now
stronger, (though standard error of IV estimate is larger than in OLS)
. reg
inflat dwages if time>50
inflatio |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
---------+-------------------------------------------------------------------dwages |
1.012592
.0518196
19.541
0.000
.9097704
1.115413
_cons | -1.609425
.5174281
-3.110
0.002
-2.636115
-.5827354
Remember that Stata does all this automatically using the ivreg command
. ivreg
inflat (dwages=ump prod) if time>50, first
First-stage regressions
----------------------Source |
SS
df
MS
Number of obs =
101
-------------+-----------------------------F( 2,
98) =
6.54
Model | 290.979911
2 145.489955
Prob > F
= 0.0022
Residual | 2180.95085
98 22.2546005
R-squared
= 0.1177
-------------+-----------------------------Adj R-squared = 0.0997
Total | 2471.93076
100 24.7193076
Root MSE
= 4.7175
-----------------------------------------------------------------------------dwages |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------umprate | -.1101753
.201477
-0.55
0.586
-.5099998
.2896493
prod | -.9147086
.3209619
-2.85
0.005
-1.551647
-.2777703
_cons |
11.28755
1.481331
7.62
0.000
8.347894
14.2272
Instrumental variables (2SLS) regression
Source |
SS
df
MS
Number of obs =
101
-------------+-----------------------------F( 1,
99) =
53.86
Model | 2492.05222
1 2492.05222
Prob > F
= 0.0000
Residual | 699.665589
99 7.06732918
R-squared
= 0.7808
-------------+-----------------------------Adj R-squared = 0.7786
Total | 3191.71781
100 31.9171781
Root MSE
= 2.6584
-----------------------------------------------------------------------------inflatio |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------dwages |
1.143749
.1558462
7.34
0.000
.8345162
1.452981
_cons | -2.747012
1.377368
-1.99
0.049
-5.48001
-.0140152
-----------------------------------------------------------------------------Instrumented: dwages
Instruments:
umprate prod
------------------------------------------------------------------------------
If there is more than one potentially endogenous rhs variable in your
equation the order condition (above) tells us that you will have to find at
least one different instrument for each endogenous rhs variable (eg 2
endogenous rhs variables requires 2 different instruments). Again, each
instrument should be correlated with the endogenous rhs variable it
replaces net of the other existing exogenous rhs variables.
Example
Price = b0 + b1Wage + b1InterestRates + e
(1)
where now both wages and interest rates are probably endogenous
(prices depend on both interest rates and wages but wages and
interest rates also depend on prices). OLS estimation of (1) will
therefore give biased estimates. To obtain unbiased 2SLS
estimates you need
1) to find one instrument for wages and a different instrument for
prices.
2) Obtain predicted values for both wages and prices using these
instruments (and any other exogenous variables there may be
in the equation)
^
^
^
Wage = g 0 + g 1 Instrument _ 1
^
^
^
InterestRa tes = d 0 + d 1 Instrument _ 2
and then use these predicted values instead of the originals in (1)
(in practice, the computer will do this for you, but you should always
be aware of what exactly is going on).
Example of Poor Instruments & IV Estimation
Often a poor choice of instrument can make things much worse
than the original OLS estimates.
Consider the example of the effect of education on wages (taken
from the data set video.dta). Policy makers are often interested in
the costs and benefits of education. Some people argue that
education is endogenous (because it also picks up the effects of
omitted variables like ability or motivation and so is correlated with
the error term).
The OLS estimates from a regression of log hourly wages on years of
education suggest that
. reg lhw yearsed
Source |
SS
df
MS
Number of obs =
6076
-------------+-----------------------------F( 1, 6074) = 589.23
Model | 246.491385
1 246.491385
Prob > F
= 0.0000
Residual | 2540.90296 6074
.41832449
R-squared
= 0.0884
-------------+-----------------------------Adj R-squared = 0.0883
Total | 2787.39434 6075 .458830344
Root MSE
= .64678
-----------------------------------------------------------------------------lhw |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------yearsed |
.0759492
.0031288
24.27
0.000
.0698156
.0820828
_cons |
5.468133
.0400159
136.65
0.000
5.389688
5.546579
1 extra year of education is associated with 7.6% increase in earnings.
If endogeneity is a problem, then these estimates are biased, (upward if
ability and education are positively correlated – see lecture notes on omitted
variable bias).
So try to instrument instead.
For some reason you choose whether the individual owns a video recorder.
To be a good instrument the variable should be a) uncorrelated with the
residual and by extension the dependent variable (wages) but b) correlated
with the endogenous right hand side variable (education).
To test assumption a) you first examine the correlation between wages and
videos in a regression.
. reg lhw video
Source |
SS
df
MS
Number of obs =
6706
-------------+-----------------------------F( 1, 6704) =
0.20
Model | .089653333
1 .089653333
Prob > F
= 0.6576
Residual | 3059.15176 6704 .456317387
R-squared
= 0.0000
-------------+-----------------------------Adj R-squared = -0.0001
Total | 3059.24142 6705 .456262702
Root MSE
= .67551
-----------------------------------------------------------------------------lhw |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------video | -.0178585
.0402898
-0.44
0.658
-.0968393
.0611223
_cons |
6.444377
.0428575
150.37
0.000
6.360363
6.528391
The regression shows that it satisfies the first requirement since it is
uncorrelated with wages (videos are relatively cheap now so that ownership
is more a matter of taste than income).
However when you instrument years of education using video ownership
. ivreg lhw (yearsed=video)
Instrumental variables (2SLS) regression
Source |
SS
df
MS
Number of obs =
6076
-------------+-----------------------------F( 1, 6074) =
0.03
Model | -57.4179219
1 -57.4179219
Prob > F
= 0.8657
Residual | 2844.81226 6074
.46835895
R-squared
=
.
-------------+-----------------------------Adj R-squared =
.
Total | 2787.39434 6075 .458830344
Root MSE
= .68437
-----------------------------------------------------------------------------lhw |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------yearsed | -.0083832
.0495839
-0.17
0.866
-.1055853
.0888189
_cons |
6.52326
.6204326
10.51
0.000
5.306992
7.739528
-----------------------------------------------------------------------------Instrumented: yearsed
Instruments:
video
------------------------------------------------------------------------------
The IV estimates are now negative (and insignificant). This does not appear
very sensible.
The reason is that video ownership is hardly correlated with education as the
regression below shows.
. reg yearsed video
Source |
SS
df
MS
Number of obs =
12334
-------------+-----------------------------F( 1, 12332) =
0.28
Model | 1.91564494
1 1.91564494
Prob > F
= 0.5943
Residual | 83278.6218 12332 6.75305075
R-squared
= 0.0000
-------------+-----------------------------Adj R-squared = -0.0001
Total | 83280.5375 12333 6.75265851
Root MSE
= 2.5987
-----------------------------------------------------------------------------yearsed |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------video | -.0500536
.0939784
-0.53
0.594
-.234266
.1341587
_cons |
12.15383
.1029141
118.10
0.000
11.9521
12.35556
------------------------------------------------------------------------------
Moral: Always check the correlation of your instrument with the
endogenous right hand variable. The “first” option on stata’s ivreg
command will always print the 1st stage of the 2SLS regression – ie
the regression above – in which you can tell if the proposed
instrument is correlated with the endogenous variable by simply
looking at the t value.
Sometimes instruments may be statistically significant from zero in
the 1st stage and still not good enough. So there is a rule of thumb
(at least in the case of a single endogenous variable) that should
only proceed with IV estimation if the F value on the 1st stage of
2SLS > 10.
In the example above this is clearly not the case.
Testing Instrumental Validity (Overidentifying Restrictions)
If you have more instruments than endogenous right hand side
variables (the equation is overidentified – hence the name for the
test) then it is possible to test whether (some of the) instruments
are valid – in the sense that they satisfy the assumption of being
uncorrelated with the residual in the original model.
One way to do this would be to compute two different 2SLS
estimates, one using one instrument and another using the other
instrument (rather like in the above example on prices, wages
productivity and unemployment). If these estimates are radically
different you might conclude that one (or both) of the instruments
was invalid (not exogenous).
An implicit test of this – that avoids having to compute all of the
possible IV estimates is based on the following idea
Given y = b0 + b1X + u
and Cov(X,u)? 0
If an instrument Z is valid (exogenous) it is uncorrelated with u
To test this simply regress u on all the possible instruments.
u = d0 + d1Z1 + d2Z2 + …. dkZk + v
If the instruments are exogenous they should be uncorrelated with
u and so the coefficients d1 .. dk should all be zero (ie the Z
variables have no explanatory power)
Since u is never observed have to use a proxy for this which turns
out to be the residual from the 2SLS estimation
^ 2sls
^
^
2
sls
u
= y − b0
− b12sls X
(since this is a consistent estimate of the true unknown residuals)
So to Test Overidentifying Restrictions
1. Estimate model by 2SLS and save the residuals
2. Regress these residuals on all the exogenous variables
(including those X variables in the original equation that are
not suspect) and save the R2
3. Compute N*R2
4. Under the null that all the instruments are uncorrelated then
N*R2 ~ ?2
with L-k degrees of freedom
(L is the number of instruments and k is the number of
endogenous right hand side variables)
Note that can only do this test if there are more instruments than
endogenous right hand side variables
Example: using the prod.dta file we can test whether the additional
variables (unemployment or productivity) are valid instruments for
wages
1st do the two stage ;least squares regression using all the possible
instruments
ivreg inflat (dwages=prod ump) if time>50
Instrumental variables (2SLS) regression
Source |
SS
df
MS
Number of obs =
101
-------------+-----------------------------F( 1,
99) =
53.86
Model | 2492.05222
1 2492.05222
Prob > F
= 0.0000
Residual | 699.665589
99 7.06732918
R-squared
= 0.7808
-------------+-----------------------------Adj R-squared = 0.7786
Total | 3191.71781
100 31.9171781
Root MSE
= 2.6584
-----------------------------------------------------------------------------inflatio |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------dwages |
1.143749
.1558462
7.34
0.000
.8345162
1.452981
_cons | -2.747012
1.377368
-1.99
0.049
-5.48001
-.0140152
Now save these 2sls residuals
. predict ivres, resid
and regress these on all the exogenous variables in the system (in this case
productivity and unemployment, but there may be situations where the
original equation contained other exogenous variables, in which case include
them here)
. reg ivres prod ump if time>50
Source |
SS
df
MS
Number of obs =
101
-------------+-----------------------------F( 2,
98) =
2.75
Model | 37.2070174
2 18.6035087
Prob > F
= 0.0687
Residual | 662.458589
98 6.75978152
R-squared
= 0.0532
-------------+-----------------------------Adj R-squared = 0.0339
Total | 699.665607
100 6.99665607
Root MSE
=
2.60
-----------------------------------------------------------------------------ivres |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
-------------+---------------------------------------------------------------prod |
.2554312
.1768927
1.44
0.152
-.0956065
.606469
umprate | -.2575159
.1110406
-2.32
0.022
-.4778724
-.0371594
_cons |
1.557729
.8164104
1.91
0.059
-.0624104
3.177869
The test is N*R 2 = 101*0.053 = 5.35 which is ~ ?2 (L-k = 2-1)
(L=2 instruments and k=1 endogenous right hand side variable)
Since ?2 (1)critical = 3.84, estimated value exceeds critical value so reject null
hypothesis that some of the instruments are valid
(which is consistent with earlier results showing IV estimates varied widely
depending on which instrument was used – one of them, probably
unemployment since the IV estimate was far from the OLS one, was wrong).
Download