lab13_19may05_tobit

Javier Aparicio
July 27, 2001
Regressions with Censored Data:
Tobit Estimation of PAC contributions
(use PAC_tobit.dta)
We will model the amount of AFL-CIO PAC contributions as a function of a
member of Congress's seat on the Budget (V19) or Ways and Means (V35)
Committee, the candidate's party affiliation (PARTY), seniority (SENIOR), the
vote obtained in 1990 (VOTE90), and the ideological “distance” between the PAC
and the given candidate (DISTANCE).
The data contain lower-censored observations at 0 dollars (211 obs.) and
upper-censored observations at 5000 dollars (20 obs.). First, we will estimate
a simple OLS model on the data “as is”, that is, a censored OLS model. Next, we
will exclude the censored observations and estimate a truncated OLS model.
Finally, we will estimate a Tobit model on the entire sample, with both the
lower and upper censoring cutoff points properly identified.
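Before fitting any model, it can help to confirm how many observations sit at each limit. The following is a minimal sketch, assuming PAC_tobit.dta is in the working directory and that contrib is recorded in dollars with limits at 0 and 5000, as described above. The counts noted in the comments are the ones reported in the Tobit output later in this handout.

```stata
* Load the lab data (the path is an assumption; adjust as needed)
use PAC_tobit.dta, clear

* Distribution of the dependent variable
summarize contrib, detail

* Observations at the lower limit (the handout reports 211)
count if contrib <= 0

* Observations at the upper limit (the handout reports 20)
* Missing values sort above any number in Stata, so exclude them explicitly
count if contrib >= 5000 & !missing(contrib)

* Observations strictly between the limits (the handout reports 116)
count if contrib > 0 & contrib < 5000
```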
. * MODEL 1. OLS CENSORED MODEL
. reg contrib v19 v35 party senior vote90 distance

      Source |       SS       df       MS              Number of obs =     347
-------------+------------------------------           F(  6,   340) =   18.35
       Model |   162478358     6  27079726.3           Prob > F      =  0.0000
    Residual |   501622144   340  1475359.25           R-squared     =  0.2447
-------------+------------------------------           Adj R-squared =  0.2313
       Total |   664100502   346  1919365.61           Root MSE      =  1214.6

------------------------------------------------------------------------------
     contrib |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         v19 |   -377.864    229.575    -1.65   0.101    -829.4302    73.70213
         v35 |  -513.8839   260.6351    -1.97   0.049    -1026.544   -1.223525
       party |  -618.8379    295.288    -2.10   0.037    -1199.659   -38.01659
      senior |  -17.34461   8.680039    -2.00   0.046    -34.41795    -.271268
      vote90 |   -14.7655   4.655749    -3.17   0.002     -23.9232   -5.607803
    distance |  -341.3641   143.8918    -2.37   0.018    -624.3943   -58.33388
       _cons |   2668.555   335.0562     7.96   0.000     2009.511    3327.599
------------------------------------------------------------------------------

. predict yhat_cen
(option xb assumed; fitted values)
In this first estimation, every variable has a negative and statistically
significant effect on the amount of PAC contributions, except V19 (Budget
Committee), which falls just short of significance at the 10% level
(p = 0.101). This means that Republican candidates, as well as more
ideologically distant candidates, tend to receive smaller AFL-CIO PAC
contributions, an expected result. A bit more surprisingly, PAC contributions
vary inversely with seniority and with Budget and Ways and Means Committee
membership: one would have expected that the more influential the candidates
are (by accumulating seniority or important Committee positions), the higher
their contributions.
. * MODEL 2. OLS TRUNCATED MODEL
. reg contrib v19 v35 party senior vote90 distance if contrib > 0 & contrib < 5000

      Source |       SS       df       MS              Number of obs =     116
-------------+------------------------------           F(  6,   109) =    2.59
       Model |  16989820.0     6  2831636.66           Prob > F      =  0.0221
    Residual |   119305154   109  1094542.70           R-squared     =  0.1247
-------------+------------------------------           Adj R-squared =  0.0765
       Total |   136294974   115  1185173.69           Root MSE      =  1046.2

------------------------------------------------------------------------------
     contrib |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         v19 |  -452.5959   390.3753    -1.16   0.249    -1226.307    321.1152
         v35 |  -615.2362   486.9393    -1.26   0.209    -1580.334    349.8617
       party |   361.2142   794.4016     0.45   0.650    -1213.264    1935.692
      senior |  -6.232589   11.99617    -0.52   0.604     -30.0086    17.54343
      vote90 |  -26.07335   7.879505    -3.31   0.001    -41.69027   -10.45642
    distance |  -347.8259    393.511    -0.88   0.379    -1127.752       432.1
       _cons |   3396.926   575.6287     5.90   0.000     2256.049    4537.803
------------------------------------------------------------------------------

. predict yhat_tr
(option xb assumed; fitted values)
This second model excludes all censored observations and hence discards a
large proportion of the data (231 of the 347 observations). As a result, most
variables turn out to be statistically insignificant and some even flip signs
(like PARTY). The only variable that retains a significant negative effect on
contributions is the vote obtained in the previous election cycle (VOTE90).
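To see the sign flips and the loss of significance side by side, the two OLS fits can be stored and tabulated together. This is a minimal sketch rather than part of the original lab; the stored-estimate names ols_cens and ols_trunc are hypothetical labels chosen here.

```stata
* Fit and store the censored OLS model (full sample)
quietly regress contrib v19 v35 party senior vote90 distance
estimates store ols_cens

* Fit and store the truncated OLS model (uncensored observations only)
quietly regress contrib v19 v35 party senior vote90 distance if contrib > 0 & contrib < 5000
estimates store ols_trunc

* Coefficients and standard errors side by side, with N and R-squared
estimates table ols_cens ols_trunc, b(%9.3f) se(%9.3f) stats(N r2)
```

The same pattern extends to the Tobit model: store it after estimation and add its name to the estimates table call.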
. * MODEL 3. TOBIT model
. tobit contrib v19 v35 party senior vote90 distance, ll ul

Tobit estimates                                   Number of obs   =        347
                                                  LR chi2(6)      =     198.43
                                                  Prob > chi2     =     0.0000
Log likelihood = -1152.4198                       Pseudo R2       =     0.0793

------------------------------------------------------------------------------
     contrib |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         v19 |  -1441.746    658.317    -2.19   0.029    -2736.619   -146.8724
         v35 |  -2412.174   826.4691    -2.92   0.004    -4037.793    -786.555
       party |  -1275.197   1028.501    -1.24   0.216    -3298.202    747.8075
      senior |  -51.60711    21.7074    -2.38   0.018    -94.30437   -8.909856
      vote90 |  -41.40697   12.84263    -3.22   0.001    -66.66772   -16.14621
    distance |  -2913.713   617.6242    -4.72   0.000    -4128.545    -1698.88
       _cons |    5616.36   948.0951     5.92   0.000     3751.509    7481.211
-------------+----------------------------------------------------------------
         _se |   2395.664   175.3974           (Ancillary parameter)
------------------------------------------------------------------------------

  Obs. summary:        211  left-censored observations at contrib<=0
                       116     uncensored observations
                        20 right-censored observations at contrib>=5000

. predict yhat_tob
(option xb assumed; fitted values)
In the Tobit estimation we use the entire sample once more, but properly
account for the conditional distribution of the censored observations. The
results are broadly similar to those of model 1, but the coefficients are much
larger in absolute value (from about twice as large for PARTY to more than
eight times as large for DISTANCE). However, being a Republican candidate is
no longer statistically significant (see the PARTY coefficient).
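For reference, the two-limit Tobit model behind this estimation can be written as follows. This is the standard textbook formulation rather than something stated in the handout; the symbols y_i^*, x_i, \beta and \sigma are notation introduced here, and the limits 0 and 5000 are the censoring points given above.

```latex
% Latent contribution and the observed, censored contribution
y_i^* = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)

y_i =
\begin{cases}
0      & \text{if } y_i^* \le 0 \\
y_i^*  & \text{if } 0 < y_i^* < 5000 \\
5000   & \text{if } y_i^* \ge 5000
\end{cases}

% Log-likelihood: censored observations contribute probabilities,
% uncensored observations contribute densities
\ln L = \sum_{y_i = 0} \ln \Phi\!\left(\frac{0 - x_i'\beta}{\sigma}\right)
      + \sum_{0 < y_i < 5000} \ln\!\left[\frac{1}{\sigma}\,
        \phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right)\right]
      + \sum_{y_i = 5000} \ln\!\left[1 - \Phi\!\left(\frac{5000 - x_i'\beta}{\sigma}\right)\right]
```

Under this formulation the reported coefficients describe the latent variable y_i^*, not the observed contribution, which is one reason they are larger in absolute value than the OLS coefficients. The _se row of the Tobit output (2395.664) corresponds to the estimate of \sigma.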
To illustrate the effect of the Tobit parameter estimates relative to those of
the censored and truncated OLS models, we will generate fitted values of the
dependent variable for different levels of ideological distance, while keeping
all the other explanatory variables fixed at their means. This will require
some data manipulation for each model, as follows.
(Note how miserable life was before Clarify existed!)
. * OLS CENSORED MODEL
. quietly reg contrib v19 v35 party senior vote90 distance
. * Creating holding variables
. gen v19h = v19
. gen v35h = v35
. gen partyh = party
. gen seniorh = senior
. gen vote90h = vote90
. gen distanceh = distance
. drop v19 v35 party senior vote90
. * Generate hypothetical mean values & predict
. egen v19 = mean(v19h)
. egen v35 = mean(v35h)
. egen party = mean(partyh)
. egen senior = mean(seniorh)
. egen vote90 = mean(vote90h)
. predict y1, xb
. * Drop the hypothetical mean values
. drop v19 v35 party senior vote90
. * Bring back real variable values
. gen v19 = v19h
. gen v35 = v35h
. gen party = partyh
. gen senior = seniorh
. gen vote90 = vote90h
.
. ** OLS TRUNCATED MODEL
. quietly reg contrib v19 v35 party senior vote90 distance if contrib > 0 & contrib < 5000
.
. * Substitute again for variable means & predict
. drop v19 v35 party senior vote90
. egen v19 = mean(v19h) if contrib > 0 & contrib < 5000
(231 missing values generated)
. egen v35 = mean(v35h) if contrib > 0 & contrib < 5000
(231 missing values generated)
. egen party = mean(partyh) if contrib > 0 & contrib < 5000
(231 missing values generated)
. egen senior = mean(seniorh) if contrib > 0 & contrib < 5000
(231 missing values generated)
. egen vote90 = mean(vote90h) if contrib > 0 & contrib < 5000
(231 missing values generated)
. predict y2, xb
(231 missing values generated)
. * Drop the hypothetical mean values
. drop v19 v35 party senior vote90
. * Bring back real variable values
. gen v19 = v19h
. gen v35 = v35h
. gen party = partyh
. gen senior = seniorh
. gen vote90 = vote90h
.
. ** TOBIT model
. quietly tobit contrib v19 v35 party senior vote90 distance, ll ul
. drop v19 v35 party senior vote90
. * Generate hypothetical mean values & predict
. egen v19 = mean(v19h)
. egen v35 = mean(v35h)
. egen party = mean(partyh)
. egen senior = mean(seniorh)
. egen vote90 = mean(vote90h)
. predict y3, xb
. * Drop the hypothetical mean values
. drop v19 v35 party senior vote90
. * Bring back real variable values
. gen v19 = v19h
. gen v35 = v35h
. gen party = partyh
. gen senior = seniorh
. gen vote90 = vote90h
. gen index=0
. graph y1 y2 y3 index distance, s(opd.) c(...l) key1(s(o) "censored yhat") key2(s(p) "truncated yhat") key3(s(d) "tobit yhat")
[Graph: fitted values (censored yhat, truncated yhat, tobit yhat) plotted against distance; vertical axis from -12638.6 to 2028.55, horizontal axis (distance) from .0036 to 4.23125.]
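The graph command above uses the older Stata graphics syntax. As far as we know, Stata 8 and later only accept that syntax through graph7, so a rough equivalent in the newer twoway syntax is sketched here. It assumes y1, y2 and y3 already exist from the predict steps above, draws the fitted values as lines rather than symbols, and uses yline(0) in place of the index variable; the axis titles are our own choice.

```stata
* Assumes y1, y2 and y3 already exist from the predict steps above.
* The /// line continuations require running this from a do-file.
twoway (line y1 distance, sort) ///
       (line y2 distance, sort) ///
       (line y3 distance, sort), ///
       yline(0) ///
       legend(order(1 "censored yhat" 2 "truncated yhat" 3 "tobit yhat")) ///
       ytitle("fitted contrib") xtitle("distance")
```

Lines are a reasonable substitute for symbols here because the other regressors are held at their means, so each set of fitted values is exactly linear in distance.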
As the graph above indicates, the Tobit estimates imply a much stronger
negative impact of ideological distance than the censored and truncated OLS
estimates: the Tobit fitted values have a more negative slope than those of
the OLS models. (The horizontal line in the graph indicates the zero
contribution level.)
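In more recent versions of Stata (11 or later for margins, 12 or later for marginsplot), the hold-and-replace steps used earlier in this section can be sketched more compactly. This is an alternative we are adding, not the procedure of the original lab; the grid of distance values from 0 to 4 in steps of 0.5 is an assumption based on the range shown in the graph.

```stata
* Censored OLS: linear prediction over a grid of distance, other covariates at means
quietly regress contrib v19 v35 party senior vote90 distance
margins, at(distance=(0(0.5)4)) atmeans

* Truncated OLS: means are taken over the estimation sample (uncensored cases)
quietly regress contrib v19 v35 party senior vote90 distance if contrib > 0 & contrib < 5000
margins, at(distance=(0(0.5)4)) atmeans

* Tobit with the censoring limits written out explicitly
quietly tobit contrib v19 v35 party senior vote90 distance, ll(0) ul(5000)

* Linear prediction for the latent variable (comparable to predict, xb)
margins, at(distance=(0(0.5)4)) atmeans predict(xb)

* Expected contribution after censoring to the [0, 5000] range
margins, at(distance=(0(0.5)4)) atmeans predict(ystar(0,5000))
marginsplot, yline(0)
```

The last margins call is worth comparing with the one before it: the linear prediction can fall far below zero, as in the graph, because it refers to the latent variable, whereas the ystar prediction stays within the censoring limits.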
. * MODEL 4. PROBIT ESTIMATION
. probit give v19 v35 party senior vote90 distance

Probit estimates                                  Number of obs   =        347
                                                  LR chi2(6)      =     209.50
                                                  Prob > chi2     =     0.0000
Log likelihood = -127.60213                       Pseudo R2       =     0.4508

------------------------------------------------------------------------------
        give |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         v19 |  -.6317307   .3198683    -1.97   0.048    -1.258661   -.0048003
         v35 |  -.9641854   .3737206    -2.58   0.010    -1.696664   -.2317066
       party |  -.1928536   .4614771    -0.42   0.676    -1.097332     .711625
      senior |   -.028079   .0111763    -2.51   0.012    -.0499841   -.0061738
      vote90 |  -.0167377   .0064355    -2.60   0.009    -.0293511   -.0041244
    distance |  -1.618576   .3011001    -5.38   0.000    -2.208721   -1.028431
       _cons |   2.713734   .4979086     5.45   0.000     1.737851    3.689617
------------------------------------------------------------------------------
note: 2 failures and 0 successes completely determined.
The probit estimates (model 4) are roughly similar to the Tobit results of
model 3. The party variable remains insignificant, while the other variables
still have a negative and significant effect on the probability of giving a
contribution. That is to say, the substantive results from model 3 and model 4
do not differ.
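One way to make this comparison concrete is to put the two sets of coefficients on the same scale. Under the Tobit model, the probability of a positive contribution depends on the coefficients divided by sigma, so those ratios are what should be set against the probit coefficients. The sketch below uses the numbers reported in the outputs above. It also checks our assumption, not stated in the handout, that give equals 1 whenever contrib is positive; give_check is a hypothetical variable name.

```stata
* Assumption: give equals 1 when any contribution was made, 0 otherwise
generate byte give_check = contrib > 0 if !missing(contrib)
tabulate give give_check

* Tobit coefficients divided by the Tobit sigma (_se = 2395.664)

* v19 (probit coefficient above: -.6317307)
display -1441.746/2395.664

* v35 (probit coefficient above: -.9641854)
display -2412.174/2395.664

* distance (probit coefficient above: -1.618576)
display -2913.713/2395.664
```

For v19 the ratio is about -0.60 against a probit coefficient of about -0.63, and for distance about -1.22 against about -1.62, which is in line with the reading that the two models tell a similar story.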