Marginal Effects in the Censored Regression Model

William Greene*
Department of Economics, Stern School of Business, New York University, 44 West 4th St.,
New York, NY 10012, USA
Abstract
We find that a well known result for marginal effects in the censored regression model with normally
distributed disturbances applies more generally to any censored regression model in which the disturbances have a
continuous distribution. The result suggests a comparison of the coefficients and marginal effects in alternative
models, e.g., normal vs. logistic, that is qualitatively different from the familiar counterpart in the probit and logit
models for binary choice.
Keywords: Tobit, marginal effects, conditional mean
JEL classification: C34, C51
1. Introduction
The latest release of the E-Views computer program (QMS, 1997) advertises that it contains an estimator
for the censored regression model with not only the familiar normal distribution platform, but also with the logistic
and extreme value distributions. This raises a question about the model. In the context of the probit model, a close
cousin to the tobit model, it has been widely documented that while distributional assumptions can produce large
differences in structural coefficient estimates, these differences often exaggerate the substantive differences in the
models. The differences in the binary choice models often become surprisingly small when the coefficients are
scaled to produce the marginal effects, or derivatives of the conditional mean function. (See, e.g., Greene (1997, p.
886).) In this note, we obtain a general result that suggests that this effect is less likely to characterize the censored
regression model. Our results suggest that the differences in the structural estimates produced by these models are
more likely to be of substance and less likely to be just an effect of scaling, and, second, that the scaling that
produces marginal effects in the censored regression model is less likely to remove the apparent differences.
The main result of our derivation is that the ‘general result’ cited in Greene (1997, page 963) for the
normal distribution, that the marginal effects in the censored regression model are simply the probability of the
nonlimit outcome times the coefficient vector, is even more general than suggested there. The result applies
regardless of the distribution assumed for the disturbances in the model. After establishing this result, we then use
it to examine the original question, whether the distributional assumption has as large an effect on the marginal
effects in the model as it appears to have on the estimates of the structural coefficients.
* Telephone: 001-212-998-0876; fax: 001-212-995-4218; e-mail: wgreene@stern.nyu.edu. The helpful suggestions of an
anonymous reviewer are gratefully acknowledged. Any remaining errors are my own.
2. Marginal Effects in the Censored Regression Model
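We write the censored regression model in a generic form,

(latent regression)          $y^* = \beta'x + \sigma\varepsilon$, $E[\varepsilon] = 0$,
(observation)                $y = \max(0, y^*)$,
(disturbance distribution)   $\text{Prob}[\varepsilon \le a] = F(a)$, $-\infty < \varepsilon < +\infty$, independent of $x$,
(density)                    $f(\varepsilon) = F'(\varepsilon) = dF(\varepsilon)/d\varepsilon$.
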
Thus, $\sigma$ is simply a scale parameter in the distribution used to normalize the underlying variable, such that the
density of the structural disturbance is free of the model parameters. It is not necessarily the standard deviation of the
disturbance. In the familiar tobit model, $\varepsilon$ has a standard normal distribution. In the logistic distribution, $\varepsilon$ has zero
mean and variance $\pi^2/3$, so the disturbance variance is $(\sigma\pi)^2/3$. Other distributions, such as the extreme value, are
likewise parameterized. Note that we make the conventional assumption that $\varepsilon$ ranges continuously over the entire
real line.
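The conditional mean function in this model is

$$E[y|x] = \text{Prob}[y = 0|x] \times 0 + \text{Prob}[y > 0|x] \times E[y^*|x, y^* > 0].$$

The conditioning probability is

$$\begin{aligned}
\text{Prob}[y > 0|x] &= \text{Prob}[\sigma\varepsilon > -\beta'x] \\
                     &= \text{Prob}[\varepsilon > -\beta'x/\sigma] \\
                     &= 1 - \text{Prob}[\varepsilon \le -\beta'x/\sigma] \\
                     &= 1 - F(-\beta'x/\sigma).
\end{aligned}$$
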
The useful results that follow from the symmetry of the normal and logistic distributions may not apply in more
general cases (such as the extreme value distribution), so we leave the probability in this form. Using some familiar
results, the conditional mean in the second term of E[y|x] is
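
$$\begin{aligned}
E[y^*|x, y^* > 0] &= \beta'x + \sigma E[\varepsilon \mid y^* > 0] \\
                  &= \beta'x + \sigma E[\varepsilon \mid \varepsilon > -\beta'x/\sigma].
\end{aligned}$$

The conditional mean in this expression is

$$E[\varepsilon \mid \varepsilon > -\beta'x/\sigma] = \int_{-\beta'x/\sigma}^{\infty} \varepsilon\, f(\varepsilon \mid \varepsilon > -\beta'x/\sigma)\, d\varepsilon.$$
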
The conditional density is the unconditional density divided by the probability of the conditioning event, which is
$\text{Prob}[y^* > 0|x] = 1 - F(-\beta'x/\sigma)$. This is the same probability that multiplies the conditional mean term in $E[y|x]$,
so it falls out of the product. Collecting terms, we have

$$E[y|x] = [1 - F(-\beta'x/\sigma)]\,\beta'x + \sigma \int_{-\beta'x/\sigma}^{\infty} \varepsilon\, f(\varepsilon)\, d\varepsilon.$$

The result is a familiar one for the standard normal distribution; exploiting the symmetry of the distribution, we have
for the tobit model,

$$E[y|x] = \Phi(\beta'x/\sigma)\,[\beta'x + \sigma\lambda], \quad \text{where } \lambda = \phi(\beta'x/\sigma)/\Phi(\beta'x/\sigma),$$

and $\phi$ and $\Phi$ denote the density and cdf of the standard normal distribution. The more general case is likely to be
more complicated, however.
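
The step from the general expression for E[y|x] to the tobit form can be written out. The following is a sketch of that calculation; it uses only the symmetry of the standard normal density and the fact that $\phi'(\varepsilon) = -\varepsilon\phi(\varepsilon)$, neither of which is available for an arbitrary F.

```latex
\begin{aligned}
\int_{-\beta'x/\sigma}^{\infty} \varepsilon\,\phi(\varepsilon)\,d\varepsilon
  &= \bigl[-\phi(\varepsilon)\bigr]_{-\beta'x/\sigma}^{\infty}
   = \phi(-\beta'x/\sigma) = \phi(\beta'x/\sigma), \\
1 - \Phi(-\beta'x/\sigma) &= \Phi(\beta'x/\sigma), \\
E[y|x] &= \Phi(\beta'x/\sigma)\,\beta'x + \sigma\,\phi(\beta'x/\sigma)
        = \Phi(\beta'x/\sigma)\,\bigl[\beta'x + \sigma\lambda\bigr].
\end{aligned}
```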
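What interests us here is not the conditional mean, but its derivatives. The marginal effects in the general
censored regression model are

$$\delta = \partial E[y|x]/\partial x = [1 - F(-\beta'x/\sigma)]\,\beta + (\beta'x)\,[-f(-\beta'x/\sigma)]\,(-1/\sigma)\,\beta + \sigma\, \partial\!\left[\int_{-\beta'x/\sigma}^{\infty} \varepsilon\, f(\varepsilon)\, d\varepsilon\right]\!\Big/\partial x.$$
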
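The third term need not be evaluated. We use Leibniz’s theorem,

$$\frac{\partial}{\partial\theta} \int_{l(\theta)}^{u(\theta)} f(z,\theta)\,dz = \int_{l(\theta)}^{u(\theta)} [\partial f(z,\theta)/\partial\theta]\,dz + f(u(\theta),\theta)\,\partial u(\theta)/\partial\theta - f(l(\theta),\theta)\,\partial l(\theta)/\partial\theta.$$
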
Only the third of these terms is nonzero in the expression for $\delta$. The required derivative is simply $-(1/\sigma)\beta$.
Combining terms, we find that in the expression for $\delta$, the second and third terms are the same apart from sign, which
leaves our central result,

Marginal Effects for Censored Regressions:

$$\delta = \partial E[y|x]/\partial x = [1 - F(-\beta'x/\sigma)]\,\beta = \text{Prob}[y^* > 0|x]\,\beta.$$

This is the result given in Greene (1997) for the normal distribution. We now see that it holds for any continuous
distribution. (Using the same theorem, it is easy to show that the more general result cited in Greene’s footnote for
censoring in both tails of the distribution holds as well.) The result is a convenient one for computational purposes.
It obviates computation of the mean of the truncated distribution. The probability required for the computation is
also required for evaluation of the likelihood function, so by construction, this is easily computable for any estimated
model.
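
The central result can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates E[y|x] by numerical integration for a scalar regressor, differentiates it by central finite differences, and compares the slope with [1 - F(-β'x/σ)]β under the normal and logistic distributions. The function names (`simpson`, `cond_mean`, `check`), the truncation of the upper limit at 40, and the parameter values are all assumptions made for illustration.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)

def simpson(g, a, b, n=10000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def cond_mean(index, sigma, cdf, pdf, upper=40.0):
    """E[y|x] = [1 - F(-index/sigma)] * index + sigma * int e f(e) de,
    with the integral taken from -index/sigma to `upper` (a stand-in for infinity)."""
    lower = -index / sigma
    tail = simpson(lambda e: e * pdf(e), lower, upper)
    return (1.0 - cdf(lower)) * index + sigma * tail

def check(beta, x, sigma, cdf, pdf, h=1e-3):
    """Return (finite-difference slope of E[y|x] in x, closed-form marginal effect)."""
    up = cond_mean(beta * (x + h), sigma, cdf, pdf)
    dn = cond_mean(beta * (x - h), sigma, cdf, pdf)
    numeric = (up - dn) / (2.0 * h)
    closed = (1.0 - cdf(-beta * x / sigma)) * beta
    return numeric, closed

# Hypothetical parameter values, chosen only for illustration.
for name, cdf, pdf in [("normal", normal_cdf, normal_pdf),
                       ("logistic", logistic_cdf, logistic_pdf)]:
    numeric, closed = check(beta=1.5, x=0.4, sigma=2.0, cdf=cdf, pdf=pdf)
    print(name, round(numeric, 5), round(closed, 5))
```

For each distribution the two printed numbers should agree to within the accuracy of the numerical integration and differencing, which is what the cancellation of the second and third terms implies.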
The counterpart to this result for the binary choice model is

$$\delta_{\text{binary choice}} = \partial E[y|x]/\partial x = f(\beta'x)\,\beta.$$

The observed empirical regularity is that the marginal effects tend to be similar across different specifications of the
distribution. Since the scale of the density can vary widely across distributions, it follows that there can be large
variation in the coefficients, which is then offset by the corresponding variation in the scale of the density. Thus, in
comparing logit to probit models, one is likely to find that the logit coefficients are larger than their probit
counterparts by a factor of roughly 1.6, a difference which is offset in the marginal effects by the smaller scale of the
logit density, of value roughly .25/.399, or 1/1.6. Looking at the censored regression model, it would appear that
this result would not be expected. Note, first, that unlike the binary choice case, the censored regression model contains
information about the true scale of y*, in the nonlimit observations. Thus, the quantity $\beta'x$ is tied to an observable,
measurable scale. Second, in the marginal effects, the scale factor Prob[y > 0|x] is the theoretical counterpart to a
sample statistic, the sample proportion of nonlimit observations. Thus, in this instance, there is less reason to expect
wide variation across models in the scale factor that produces the marginal effects.
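
The logit and probit magnitudes quoted here are easy to reproduce. The short sketch below evaluates the standard logistic and standard normal densities at zero; taking zero as the evaluation point is an assumption inferred from the .25/.399 figures in the text, and the variable names are illustrative.

```python
import math

# Standard normal density at 0: 1 / sqrt(2 * pi).
normal_pdf_at_0 = 1.0 / math.sqrt(2.0 * math.pi)
# Standard logistic density at 0: e^0 / (1 + e^0)^2.
logistic_pdf_at_0 = math.exp(0.0) / (1.0 + math.exp(0.0)) ** 2

ratio = logistic_pdf_at_0 / normal_pdf_at_0
print(round(normal_pdf_at_0, 4), round(ratio, 4), round(1.0 / ratio, 4))
# prints: 0.3989 0.6267 1.5958
```

The inverse ratio, about 1.6, is the factor by which logit coefficients typically exceed their probit counterparts.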
The preceding is qualitative, of course. But, it does suggest that in the censored regression model, the
foundation provides for similar values of the coefficients and marginal effects. Indeed, the literature contains a
stronger, useful result that is consistent with this. In Powell’s (1984) derivation of the least absolute deviations
estimator for the censored regression model, he obtains strong consistency for the LAD estimator based only on an
assumption that the conditional median of the disturbance is zero. This encompasses both the tobit and logistic
models discussed earlier, so we infer that, at least for this interesting subset of our cases, the parameters of the model
are invariant to the distribution. Our main result above implies that the marginal effects are also. It follows, then,
that the scaling difference necessary to bring about the equality of the estimated values of the CDF (i.e., the
marginal effects) is achieved by the parameter $\sigma$. This might have been expected, ex post, when one considers that
the parameters estimated in the binary choice model are only $\beta/\sigma$, whereas here, the two parameters are separately
identified. A small example given below is illustrative.
The end result of this is that in the censored regression model, distribution should generally not matter.
Where real differences do emerge, these results suggest that they are less likely to be due to a simple distributional
scaling effect, and more to real, substantive differences in the models. That is, when the analyst observes substantial
differences attending the distributional choice, it may well be appropriate to take a very close look at the
specification of the model.
3. An Application
To illustrate these effects, we have used the labor supply data analyzed in Berndt (1991) (and contained on
a data disk that is distributed with the book). These are survey data on the labor supply behavior of a sample of
married women in the 1970s from a study by Mroz (1987). As this is is simply an illustration, we refer the reader to
this source, and Mroz (1987) for further detail. (Also, as a referee has pointed out, the “labor supply” model
estimated is, itself, questionable owing to the omission of any consideration of the fixed costs of working. Again,
our interst is in a numerical example.) The original sample contains 773 observations; we have used 250 of them.
(The subsample was not chosen systematically; it is simply a relatively balanced subset of the observations chosen
for this illustration.) The dependent variable is hours worked in the survey year. Independent variables in the
equation are:
KL6     = number of children under 6
K618    = number of children aged 6 to 18
WA      = wife’s age
WE      = wife’s education in years
HHRS    = husband’s hours of work
HA      = husband’s age
HE      = husband’s education
HW      = husband’s wage
FAMINC  = family income
Estimates of all of the parameters were obtained using LIMDEP. Standard errors for the estimates of the marginal
effects are computed using the delta method. In each model, let [b,s] denote the maximum likelihood estimates of
[$\beta$,$\sigma$]. The estimated marginal effect in both cases would be d = [1 - F(-b'x/s)]b. Let V denote the estimated
asymptotic covariance matrix of the maximum likelihood estimator of the parameters. Finally, let G = $\partial$d/$\partial$(b',s).
Then, the estimated asymptotic covariance matrix for the marginal effects is GVG'.
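
The delta method calculation can be sketched in a few lines. The code below is an illustration under stated assumptions, not the LIMDEP computation: it takes F to be the standard normal cdf, evaluates everything at a single vector x, and uses made-up values for b, s, x and V. The function names are hypothetical. The Jacobian follows from differentiating d = [1 - F(-b'x/s)]b, which gives [1 - F(-b'x/s)]I + f(-b'x/s)(1/s)bx' for the derivative in b' and -f(-b'x/s)(b'x/s^2)b for the derivative in s.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def marginal_effects(b, s, x, cdf=normal_cdf):
    """d = [1 - F(-b'x/s)] b, returned as a list of length K."""
    a = sum(bk * xk for bk, xk in zip(b, x)) / s
    scale = 1.0 - cdf(-a)
    return [scale * bk for bk in b]

def jacobian(b, s, x, cdf=normal_cdf, pdf=normal_pdf):
    """G = dd/d(b', s): K rows and K + 1 columns; the last column is dd/ds."""
    K = len(b)
    bx = sum(bk * xk for bk, xk in zip(b, x))
    scale = 1.0 - cdf(-bx / s)
    dens = pdf(-bx / s)
    G = []
    for i in range(K):
        row = [(scale if i == j else 0.0) + dens * b[i] * x[j] / s
               for j in range(K)]
        row.append(-dens * bx / s ** 2 * b[i])
        G.append(row)
    return G

def delta_method_cov(G, V):
    """Return G V G' for G (K x P) and V (P x P), as nested lists."""
    K, P = len(G), len(V)
    GV = [[sum(G[i][p] * V[p][q] for p in range(P)) for q in range(P)]
          for i in range(K)]
    return [[sum(GV[i][q] * G[j][q] for q in range(P)) for j in range(K)]
            for i in range(K)]

# Hypothetical inputs for illustration only (not the estimates in Table 1).
b = [0.5, -1.0]
s = 2.0
x = [1.0, 0.3]
V = [[0.04, 0.00, 0.00],
     [0.00, 0.09, 0.00],
     [0.00, 0.00, 0.01]]

G = jacobian(b, s, x)
cov = delta_method_cov(G, V)
std_errors = [math.sqrt(cov[k][k]) for k in range(len(b))]
print(marginal_effects(b, s, x), std_errors)
```

Because s is estimated along with b, the last column of G carries the sampling variation in s into the standard errors of the marginal effects.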
Parameter estimates for a tobit model and a model based on the logistic distribution are shown in Table 1.
As expected, the parameter estimates are quite similar, as are the marginal effects. There is, in fact, little to
distinguish these models. The large difference in the estimates of the parameter $\sigma$ is only apparent; rescaling the
logistic estimate by $\pi/\sqrt{3}$ gives a value of 1251.9, which is essentially the same as the tobit estimate. Indeed, the
likelihood functions are virtually identical. The same similarity emerges in the estimated marginal effects in Table
1. The sample proportion of nonlimit observations is 0.6, or 150 of the 250 observations. The two scale factors for
the tobit and logistic model are .6360 and .6640, respectively, which are essentially the same, again, as expected.
Table 2 contains the same set of estimates for a model of binary choice, labor force participation, in which the
dependent variable is 1(hours > 0). The probit or logit estimates, therefore, are estimates of $\beta/\sigma$. Since $\sigma$ is so
much smaller in the logistic than in the tobit model, for this reason alone, the slopes in the logit model are far larger
than their counterparts in the probit model. This effect is increased by the scaling effect of the densities. As can be
seen, however, both of these effects are essentially offset when the binary choice coefficients are transformed to
marginal effects.
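
The arithmetic behind these comparisons can be reproduced from the figures reported in Table 1. The sketch below is only a check on the reported numbers; the variable names are illustrative.

```python
import math

# Scale parameter estimates reported in Table 1.
sigma_tobit = 1203.5
sigma_logistic = 690.2

# A standard logistic variate has standard deviation pi / sqrt(3), so the
# logistic scale estimate is put on the tobit scale by this factor.
rescaled = sigma_logistic * math.pi / math.sqrt(3.0)
print(round(rescaled, 1))  # prints 1251.9

# The ratio of a marginal effect to its coefficient recovers the scale factor.
# KL6 row of Table 1: tobit, then logistic.
tobit_scale = -480.3 / -755.2
logistic_scale = -464.2 / -699.1
print(round(tobit_scale, 3), round(logistic_scale, 3))  # prints 0.636 0.664

# Sample proportion of nonlimit observations.
print(150 / 250)  # prints 0.6
```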
4. Conclusion
We have shown a simple, useful result for computing marginal effects in the censored regression model.
The result suggests that in the censored regression model, the effects of scaling which are produced by a specific
choice of distribution, and which are so misleading in the binary choice model derived from the censored dependent
variable, are less likely to be present in the censored regression model. However, none of this gives any indication of
which of these models, or whether some other model, should be the preferred specification.
The results are derived under two crucial assumptions. First, the disturbances are assumed to be
independent of (at least uncorrelated with) the independent variables. This is standard in the literature, and is part of
the specification of the model. (I.e., if not, then the assumption $E[\varepsilon|x] = 0$ is not tenable, and the interpretation of $\beta$
becomes even more ambiguous than it is already.) The second assumption is that the disturbances are
homoscedastic, which merits some attention. If the disturbances are heteroscedastic, then the “parameters” in the
model become more difficult to interpret. If the conditional variance of $\varepsilon$ varies systematically with x, then the
result for the slopes of the conditional mean function given earlier does not apply; the relationship is much more
complicated. In this instance, it is unclear whether the MLE of $\beta$ is even consistent. Powell’s result remains
(consider his statement “[i]t is presumed throughout that estimation of $\beta$ is the primary object of estimation”), as his
LAD estimator is still consistent. But, for what? Whether the latent, unobserved, regressand and its conditional
mean ($\beta'x$) are of interest or the observed dependent variable and its conditional mean are the primary focus of
interest is not obvious at the outset. This aspect is left for further study.
Table 1. Censored Regression Model Estimates

              Tobit Model                                 Logistic Model
              Coefficients          Marginal Effects      Coefficients          Marginal Effects
Variable      Est.       S.E.       Est.       S.E.       Est.       S.E.       Est.       S.E.
Constant      2341.      913.7                            2314.      930.8
KL6           -755.2     202.2      -480.3     127.9      -699.1     219.7      -464.2     144.9
K618          -119.9     68.76      -76.24     43.77      -106.7     75.93      -70.82     50.32
WA            -17.20     22.42      -10.94     14.26      -15.36     24.76      -10.20     16.43
WE            73.20      49.17      46.56      31.18      73.87      51.11      49.05      34.01
HHRS          -.5688     .1607      -.3618     .1020      -.5800     .1549      -.3851     .1016
HA            -11.60     22.22      -7.374     14.13      -12.16     24.22      -8.073     16.09
HE            15.73      35.99      10.00      22.89      13.28      34.85      8.820      23.13
HW            -193.9     36.16      -123.3     22.85      -204.5     35.19      -135.8     22.64
FAMINC        .05646     .01157     .03591     .00740     .06171     .01017     .04097     .006507
$\sigma$      1203.5     75.56                            690.2      53.68
Log-L         -1350.40                                    -1350.65
Scale Factor for Effects:           .6360                                       .6640

Table 2. Parameter Estimates and Marginal Effects for Probit and Logit Models

                          Probit Model                    Logit Model
Variable                  Coefficient   Marginal Effect   Coefficient   Marginal Effect
Constant                  1.606                           2.602
KL6                       -.5975        -.2288            -.9800        -.2312
K618                      -.06603       -.02528           -.1038        -.02448
WA                        -.01309       -.005011          -.01832       -.004320
WE                        .1381         .05288            .2330         .05496
HHRS                      -.0004634     -.0001774         -.0007903     -.0001864
HA                        -.01820       -.006969          -.03334       -.007863
HE                        -.01581       -.006053          -.002310      -.005448
HW                        -.1417        -.05423           -.2443        -.05762
FAMINC                    .00003825     .00001465         .00006728     .00001587
Log-L                     -149.423                        -149.20
Scale Factor for Effects                0.3828                          0.2359

5. References
Berndt, E., 1991, The Practice of Econometrics: Classic and Contemporary. Addison Wesley, Reading.
Greene, W., 1997, Econometric Analysis, 3rd Ed., Prentice Hall, Saddle River.
Mroz, T., 1987, The sensitivity of an empirical model of married women’s hours of work to economic and statistical
assumptions, Econometrica, 55, 4, 765-799.
Powell, J., 1984, Least absolute deviations estimation for the censored regression model, Journal of Econometrics,
25, 303-325.
Quantitative Micro Software, 1998, EViews, Version 3, Irvine, CA.
Professor Eric Maskin
Editor, Economics Letters
Department of Economics
Harvard University
Cambridge, MA 02138
February 16, 1998
Dear Professor Maskin:
I’ve received your letter of acceptance of my paper “Marginal Effects in The Censored
Regression Model.” Thank you very much. Your letter included some information you needed to
complete the first page of the paper. The items you listed there are already in the manuscript that you
have, though I’ve added an acknowledgement of the reviewer’s suggestions.
With your permission, however, I’d like to make one minor change in the conclusions. The
sentence which appears there now:
“The result suggests that in the censored regression model, the effects of scaling which are so misleading
in the binary choice model derived from the censored dependent variable, are less likely to be present in
the result.”
is badly constructed and a bit confusing. I’d like to change it to
“The result suggests that the effects of parameter scaling which are produced by a specific choice of
distribution, and which are so misleading in binary choice models, are less likely to be present in the
censored regression model.”
In addition, there was a typo in the header to table 2. I’ve enclosed a copy of the revised paper.
Thanks again.
Sincerely yours,
William Greene