The Multiple Regression Model and the Least Squares Assumptions

Yi = 0 + 1X1i + 2X2i + … + kXki + ui, i = 1,…,n
1. The conditional distribution of u
given the X’s has mean zero, that
is, E(u|X1 = x1,…, Xk = xk) = 0.
2. (X1i,…,Xki,Yi), i = 1,…,n, are i.i.d.
3. X1,…, Xk, and u have finite fourth
moments: E(X1i^4) < ∞, …, E(Xki^4) < ∞,
and E(ui^4) < ∞.
4. There is no perfect
multicollinearity.
The OLS estimator of the β’s is
unbiased and consistent.
For large n,
(β̂i − βi) / se(β̂i)
is approximately distributed N(0,1).
Consider the following example
using the CA test score data set:
TESTSCRi = β0 + β1STRi +
β2EXPN_STUi + β3AVG_INCi +
β4EL_PCTi + ui
TESTSCR = average test score
STR = student-teacher ratio
EXPN_STU = expenditures per student ($)
AVG_INC = average household income ($)
EL_PCT = percent of English learners
Estimate this model by OLS in Eviews,
with heteroskedasticity-adjusted
standard errors, to obtain:
Dependent Variable: TESTSCR
Method: Least Squares
Date: 09/30/04 Time: 14:35
Sample: 1 420
Included observations: 420
White Heteroskedasticity-Consistent Standard Errors & Covariance
Variable               Coefficient    Std. Error    t-Statistic    Prob.
C                       651.7362      11.42310       57.05423      0.0000
STR                    -0.323907      0.360775      -0.897807      0.3698
EXPN_STU               -0.001289      0.001098      -1.173235      0.2414
AVGINC                  1.518165      0.092780       16.36306      0.0000
EL_PCT                 -0.483630      0.028504      -16.96727      0.0000

R-squared               0.708237     Mean dependent var        654.1565
Adjusted R-squared      0.705425     S.D. dependent var        19.05335
S.E. of regression      10.34116     Akaike info criterion     7.521974
Sum squared resid       44379.90     Schwarz criterion         7.570072
Log likelihood         -1574.614     F-statistic               251.8473
Durbin-Watson stat      1.193905     Prob(F-statistic)         0.000000
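For readers working outside Eviews, here is a minimal sketch of the same estimation using Python's statsmodels. The file name caschool.csv and the column names TESTSCR, STR, EXPN_STU, AVG_INC, and EL_PCT are assumptions about how the CA test score data set is stored; HC1 is used as a stand-in for Eviews' White heteroskedasticity-consistent standard errors.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical file and column names; rename to match your copy of the CA test score data.
df = pd.read_csv("caschool.csv")

model = smf.ols("TESTSCR ~ STR + EXPN_STU + AVG_INC + EL_PCT", data=df)
results = model.fit(cov_type="HC1")  # heteroskedasticity-robust (White-type) standard errors
print(results.summary())             # coefficients, robust SEs, t-statistics, p-values

With the same data this should reproduce the coefficient estimates above, although the robust standard errors may differ slightly from Eviews' depending on the finite-sample correction used.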
First, let’s look at the point estimates of
the slope coefficients.
• The signs of the estimated coefficients seem to conform to our expectations, except for the negative coefficient on EXPN_STU.
• The coefficient on STR is much smaller (in absolute value) than in the other regressions we have looked at.
Hypothesis Testing:
Under the null hypothesis
H0: βi = βi,0
for large n, the t-statistic
t = (β̂i − βi,0) / se(β̂i)
is approximately N(0,1).
So tests of this null hypothesis can
be applied in exactly the same way as in
the simple regression model.
Example: California Test Scores
First, consider the null hypothesis that β1,
the coefficient on STR, is equal to 0 (i.e.,
all else equal, variations in STR do not
affect test scores) against the one-sided
alternative that the coefficient is less than
zero.
The t-statistic corresponding to this null
hypothesis is -0.898 (= -.324/.361)
The p-value for this one-sided test is
Prob(t < -.898), which is the probability
that an N(0,1) random variable is less than
-.898; this is approximately 0.18.
Therefore, we do not reject the null
hypothesis at conventional significance
levels (e.g., 10%, 5%, 1%).
The coefficient on EXPN_STU appears to
have the “wrong” sign, suggesting that
increasing expenditures per student will,
all else equal, reduce test scores. Let’s test
the null hypothesis that β2, the coefficient
on EXPN_STU is equal to 0 against the
two-sided alternative that it is not equal to
0.
The t-statistic is equal to -1.17. The p-value for this two-sided test is equal to the
probability of drawing a N(0,1) random
variable that is larger than 1.17 in absolute
value, which is equal to 0.24. Therefore,
we would not reject the null hypothesis
that β2 = 0 against the two-sided
alternative at conventional significance
levels.
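As a quick check of this arithmetic, here is a short sketch using scipy; the coefficients and standard errors are taken from the regression output above.

from scipy.stats import norm

t_str = -0.323907 / 0.360775              # t-statistic for STR, about -0.898
p_one_sided = norm.cdf(t_str)             # Pr(Z < -0.898): one-sided p-value for STR

t_expn = -0.001289 / 0.001098             # t-statistic for EXPN_STU, about -1.17
p_two_sided = 2 * norm.cdf(-abs(t_expn))  # Pr(|Z| > 1.17): two-sided p-value for EXPN_STU

print(p_one_sided, p_two_sided)           # roughly 0.18 and 0.24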
It appears that, all else equal (including
expenditures per student), changes in STR
do not affect test scores. It also appears
that, all else equal (including the student-teacher
ratio), changes in EXPN_STU
do not affect test scores. Does it follow
that, all else equal, neither changes in STR
nor changes in EXPN_STU affect test
scores?
No. If STR and EXPN_STU are correlated
with one another, the individual t-tests
could easily be “misled” into understating
the actual significance of each of these
variables.
The appropriate way to test the hypothesis
that neither one matters is to formulate a
joint hypothesis and apply an F-test.
The joint null hypothesis that β1 = 0
and β2 = 0 is:
H0: β1 = 0, β2 = 0
The alternative hypothesis is that at
least one of (β1,β2) is non-zero.
The F-statistic can be constructed to
test this joint hypothesis.
Under the null hypothesis, for large n,
the F-statistic will have an F(2,∞)
distribution. (Or equivalently, 2×F will
have a chi-square distribution with 2
degrees of freedom.)
The null hypothesis is rejected at the α-significance level if the statistic is
greater than the (1-α)x100 percentile of
the F(2, ∞) distribution. The p-value of
the test is the probability of drawing a
number from the F(2, ∞) distribution
larger than the calculated value of the
statistic.
Note that the “correct” calculation of
the F-statistic will depend on whether
you assume that the regression errors
are heteroskedastic or whether you
make the more restrictive assumption
that they are homoskedastic.
To calculate the F-statistic in Eviews:
• Estimate the original regression equation by OLS, using the heteroskedasticity-consistent standard errors option (unless we assume that the errors are homoskedastic).
• Apply the Wald test with the restrictions “C(2)=0,C(3)=0” (assuming that STR and EXPN_STU are the second and third variables in the list under “variables” in the regression).
Output:
Wald Test:
Equation: Untitled

Null Hypothesis:   C(2)=0
                   C(3)=0

F-statistic        0.720996     Probability    0.486876
Chi-square         1.441992     Probability    0.486268
This produces F and 2xF (the Chi-square
statistic). The “Probability” is the p-value of the test. That is, under the null
hypothesis that C(2)=0 and C(3)=0, the
probability of drawing an F-statistic as
large as 0.72 is about 0.49. (Why are the
two probabilities different?) So, it
appears that neither STR nor
EXPN_STU has a statistically significant
effect on test scores.
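For comparison outside Eviews, here is a hedged statsmodels/scipy sketch of the same joint test, reusing the assumed file and column names from the earlier sketch. The last two lines illustrate why the two reported probabilities differ: the F probability appears to be computed from an F(2, 415) distribution (415 = 420 observations minus 5 estimated coefficients), while the chi-square probability uses the χ²(2) limit.

import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2, f

df = pd.read_csv("caschool.csv")  # hypothetical file and column names, as above
results = smf.ols("TESTSCR ~ STR + EXPN_STU + AVG_INC + EL_PCT", data=df).fit(cov_type="HC1")

# Joint test of H0: the coefficients on STR and EXPN_STU are both zero (robust covariance)
print(results.f_test("STR = 0, EXPN_STU = 0"))

# Why the two reported probabilities differ: F(2, 415) vs. the chi-square(2) approximation
print(f.sf(0.720996, 2, 415))      # about 0.4869, matching the "F-statistic" probability
print(chi2.sf(2 * 0.720996, 2))    # about 0.4863, matching the "Chi-square" probability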
Maybe none of the variables that we’ve
included affects test scores! Let’s test
this joint null hypothesis –
H0 : β1 = 0, β2 = 0, β3 = 0, β4 = 0
against the alternative that at
least one of these β's is not equal to zero.
Under the null hypothesis, for large n,
the F-statistic has an F(4,∞)
distribution (and, equivalently, 4xF has
a chi-squared distribution with 4
degrees of freedom). We reject H0 at
the α-significance level if the
calculated F is greater than the
(1-α)x100 percentile of the F(4,∞)
distribution.
In Eviews, we follow the same
procedure as we used for the previous
F-test, except that now we insert the
following restrictions into the Wald test
dialogue box:
C(2)=0,C(3)=0,C(4)=0,C(5)=0
The result of this test:
Wald Test:
Equation: Untitled

Null Hypothesis:   C(2)=0
                   C(3)=0
                   C(4)=0
                   C(5)=0

F-statistic        223.0756     Probability    0.000000
Chi-square         892.3023     Probability    0.000000
Note that the Chi-square statistic is 4
times the F-statistic.
Note, too, that this F-statistic differs
from the F-statistic produced in the
regression summary, even though these
F-statistics have been constructed to test
the same hypothesis (i.e., that all of the slope
coefficients are equal to 0). Why?
The p-value of this test is VERY small
and so we reject the joint null hypothesis
at standard significance levels. So,
something seems to matter!
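The following sketch (same assumed column names) makes the distinction concrete: it computes the robust Wald test of all four slopes from the HC1 fit, alongside the classical overall F-statistic from a non-robust fit of the same equation.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("caschool.csv")  # hypothetical file and column names

formula = "TESTSCR ~ STR + EXPN_STU + AVG_INC + EL_PCT"
robust = smf.ols(formula, data=df).fit(cov_type="HC1")
classical = smf.ols(formula, data=df).fit()

# Heteroskedasticity-robust Wald test of H0: all four slope coefficients are zero
print(robust.f_test("STR = 0, EXPN_STU = 0, AVG_INC = 0, EL_PCT = 0"))

# Classical overall F-statistic of the kind reported in a regression summary
# (it assumes homoskedastic errors, which is why it differs from the Wald version)
print(classical.fvalue, classical.f_pvalue)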
What seem to matter are the district
characteristics that policymakers have
little control over – average income and
the percentage of English learners in the
district. Each of these two variables is
statistically significant.
The joint hypotheses we have looked at
so far have been of the form that
some subset of the regression parameters
is equal to zero. However, the F-test
can be applied to test more general kinds
of linear restrictions.
Example 1: H0: β1 = β3
The F-statistic will have an F(1,∞)
distribution, for large n.
Eviews implementation? Same as above,
but in the Wald-test dialogue box use the
restriction C(2)=C(4), assuming the
intercept appears at the top of the
variable list.
Example 2:
H0: β1 = β3 and 2β1 + 0.5β2 = β4
The F-statistic will have an F(2,∞)
distribution, for large n.
Eviews implementation? Same as above,
but in the Wald-test dialogue box use the
restriction
C(2)=C(4), 2*C(2)+0.5*C(3)=C(5)
again assuming the intercept appears at
the top of the variable list.
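In statsmodels, analogous linear restrictions can be passed to f_test as strings; here is a sketch under the same assumed file and column names.

import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("caschool.csv")  # hypothetical file and column names
results = smf.ols("TESTSCR ~ STR + EXPN_STU + AVG_INC + EL_PCT", data=df).fit(cov_type="HC1")

# Example 1: H0: beta1 = beta3 (the coefficients on STR and AVG_INC are equal)
print(results.f_test("STR = AVG_INC"))

# Example 2: H0: beta1 = beta3 and 2*beta1 + 0.5*beta2 = beta4
print(results.f_test("STR = AVG_INC, 2*STR + 0.5*EXPN_STU = EL_PCT"))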
More on the F-statistic –
The F-statistic essentially compares how
well the estimated model fits the data
when the restrictions specified by the
null hypothesis are imposed against the
fit of the model when the restrictions are
ignored. If the restrictions are correct,
then imposing the restrictions should not
substantially worsen the fit of the model.
This is especially clear in the case where
the errors are assumed to be
homoskedastic. In this special case, the
F-statistic has the following form:
Run two regressions, one with the
restrictions of the null hypothesis
imposed (the “restricted” regression)
and one without them (the
“unrestricted” regression). Then
compare the fit of the two regressions:
F = [(R2_unrestricted − R2_restricted)/q] / [(1 − R2_unrestricted)/(n − k_unrestricted − 1)]

where:
R2_restricted = the R2 for the restricted regression
R2_unrestricted = the R2 for the unrestricted regression
q = the number of restrictions under the null
k_unrestricted = the number of regressors in the unrestricted regression.
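As a sketch, this formula translates directly into a small Python function (the argument names are purely descriptive).

def homoskedasticity_only_f(r2_unrestricted, r2_restricted, q, n, k_unrestricted):
    """Homoskedasticity-only F-statistic for q linear restrictions."""
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k_unrestricted - 1)
    return numerator / denominator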
Example: are the coefficients on STR
and EXPN_STU zero?
Restricted population regression (that
is, under H0):
TESTSCRi = β0 + β3AVG_INCi +
β4EL_PCTi + ui (why?)
Unrestricted population regression
(under H1):
TESTSCRi = β0 + β1STRi +
β2EXPN_STUi + β3AVG_INCi +
β4EL_PCTi + ui
• The number of restrictions under H0 = q = 2.
• The fit will be better (R2 will be higher) in the unrestricted regression (why?)
By how much must the R2 increase for
the coefficients on STR and
EXPN_STU to be judged statistically
significant?
Under the null hypothesis, for large n
and homoskedastic errors, this
statistic is drawn from an F(2,∞)
distribution.
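A sketch of this comparison using statsmodels (same assumed file and column names): fit the restricted and unrestricted regressions, inspect the two R2 values, and let anova_lm compute the homoskedasticity-only F-test of the q = 2 restrictions.

import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("caschool.csv")  # hypothetical file and column names

restricted = smf.ols("TESTSCR ~ AVG_INC + EL_PCT", data=df).fit()
unrestricted = smf.ols("TESTSCR ~ STR + EXPN_STU + AVG_INC + EL_PCT", data=df).fit()

print(restricted.rsquared, unrestricted.rsquared)  # how much does R2 rise when STR and EXPN_STU are added?
print(anova_lm(restricted, unrestricted))          # homoskedasticity-only F-test of the q = 2 restrictions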
When the errors are heteroskedastic,
this simple form of the F-statistic
does not have an F(q, ∞) distribution. The
“correct” version of the F-statistic
under heteroskedasticity has a more
complicated form, but it works
according to the same principles: If
the fit of the model is substantially
worse when the null hypothesis is
imposed, then the null hypothesis is
rejected.