4 ECONOMETRICS CHAPTER Y

Multiple Regression = more than one explanatory variable
Yi = B1 + B2 X2i + B3X3i + ui
Independent variables are X2 and X3.
X2i is the ith observation of X2.
Yi = B1 + B2 X2i + B3X3i + ui
B2 and B3 are partial regression coefficients.
• B2 measures the change in E(Y) per unit change in X2,
holding the value of X3 constant.
Yi = b1 + b2 X2i + b3X3i + ei
Sample regression function with parameter
estimates.
Estimating the impact of GDP and population on education
expenditures.
Educi = 414 + 0.052 GDPi - 50 Popi
Holding population fixed, education
spending increases 5.2¢ for every $1 of GDP.
Educi = -161 + 0.048 GDPi
GDP and population are correlated. When
we don’t control for population, part of the
population effect gets picked up by GDP.
Estimating the impact of GDP and population on education
expenditures.
Educi = 414 + 0.052 GDPi - 50 Popi
Holding GDP fixed, education spending
decreases $50 for each additional person.
Educi = 2,946 + 78.7 Popi
When we don’t control for GDP, population
picks up the GDP effect.
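The sign flip on population can be reproduced with a small numeric sketch. The data below are made up for illustration (they are not the education data), and the names `x2`, `x3`, `slope` and `gamma` are hypothetical. Because y is built as an exact linear function, the partial effects are known by construction; regressing y on x3 alone then picks up part of the x2 effect.

```python
def mean(v):
    return sum(v) / len(v)

def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Illustrative, made-up data: x2 and x3 are positively correlated.
x2 = [1, 2, 3, 4, 5]
x3 = [2, 1, 4, 3, 6]

# y is built with known partial effects: +2 for x2 and -1 for x3.
b1, b2, b3 = 1, 2, -1
y = [b1 + b2 * a + b3 * b for a, b in zip(x2, x3)]

# Regressing y on x3 alone omits x2.
short_slope = slope(x3, y)

# Slope from regressing the omitted variable x2 on the included x3.
gamma = slope(x3, x2)

# The short-regression slope equals b3 + b2 * gamma.
predicted = b3 + b2 * gamma

print(f"partial effect of x3:     {b3}")
print(f"slope when x2 is omitted: {short_slope:.3f}")
print(f"b3 + b2 * gamma:          {predicted:.3f}")
```

The partial effect of x3 is negative, yet the slope from the regression that omits x2 comes out positive, mirroring the change from -50 to 78.7 on population.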
[Figure: Impact of Population on Education (millions). Plot of education expenditures (vertical axis) against population (horizontal axis), with fitted line y = 78.716x + 2946.4 and R2 = 0.0865.]
The Classical Linear Regression Model
One more assumption
8. No exact linear relationship between explanatory
variables, i.e. no perfect multicollinearity.
Example of multicollinearity:
X2 = population of the state
X3 = female population of the state
X4 = male population of the state
Linear relationship: X2 = X3 + X4
Second example of multicollinearity:
X2 = % females in the state
X3 = % males in the state
Linear relationship: X2 = 100 - X3 (or X2 = 1 - X3 if measured as shares)
Perfect collinearity is rare; regression software returns an error message if it happens.
Regression is possible with high (but imperfect) collinearity, but the
coefficients must be interpreted with caution.
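One way to see why perfect collinearity breaks estimation is to check the rank of the data matrix. The sketch below uses made-up population counts and a hypothetical helper `rank`. With total, female and male population all included alongside an intercept, the matrix has fewer independent columns than coefficients to estimate.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# Made-up data: total population is exactly female + male.
female = [3, 5, 2, 7, 4]
male = [4, 4, 3, 6, 5]
total = [f + m for f, m in zip(female, male)]

# Columns: intercept, total, female, male (perfectly collinear).
X_bad = [[1, t, f, m] for t, f, m in zip(total, female, male)]
# Dropping the redundant column removes the exact linear relationship.
X_ok = [[1, f, m] for f, m in zip(female, male)]

print("rank", rank(X_bad), "with", len(X_bad[0]), "columns")
print("rank", rank(X_ok), "with", len(X_ok[0]), "columns")
```

When the rank is below the number of columns, the OLS normal equations have no unique solution, which is why software stops with an error.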
Estimation of Parameters
Procedures for estimating parameters using OLS are the
same as before (the equations just become more complicated).
Standard errors of the estimators are calculated in much
the same way.
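As a minimal sketch of what "more complicated" means in practice, the code below solves the OLS normal equations for an intercept and two regressors. The data are made up, the function names `solve` and `ols` are hypothetical, and exact fractions are used only to keep the arithmetic transparent; it is not how statistical packages implement OLS.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if m[i][c] != 0), None)
        if pivot is None:
            raise ValueError("singular system (perfect collinearity?)")
        m[c], m[pivot] = m[pivot], m[c]
        m[c] = [v / m[c][c] for v in m[c]]  # scale pivot row to 1
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [vi - f * vc for vi, vc in zip(m[i], m[c])]
    return [row[-1] for row in m]

def ols(y, *regressors):
    """Return [b1, b2, ...] from the normal equations (X'X) b = X'y."""
    rows = [(1,) + xs for xs in zip(*regressors)]  # prepend intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data for illustration.
x2 = [1, 2, 3, 4, 5]
x3 = [2, 1, 4, 3, 6]
y = [2, 3, 4, 6, 5]

b = ols(y, x2, x3)
fitted = [b[0] + b[1] * u + b[2] * v for u, v in zip(x2, x3)]
e = [yi - fi for yi, fi in zip(y, fitted)]  # residuals

print("b1, b2, b3 =", [float(v) for v in b])
print("sum of residuals =", float(sum(e)))
```

The residuals sum to zero and are uncorrelated with each regressor, exactly as in the two-variable case.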
We estimate the variance of the disturbance term in the
population from the residuals in the sample.
$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - k}$$
k represents the number of coefficients estimated (including the intercept).
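The variance estimate can be sketched in a few lines. The residuals below are made up for illustration, and `residual_variance` is a hypothetical helper name; k = 3 corresponds to an intercept and two slope coefficients.

```python
import math

def residual_variance(residuals, k):
    """Estimate the disturbance variance as sum(e_i^2) / (n - k)."""
    n = len(residuals)
    if n <= k:
        raise ValueError("need more observations than estimated coefficients")
    rss = sum(e ** 2 for e in residuals)  # residual sum of squares
    return rss / (n - k)

# Made-up residuals (they sum to zero, as OLS residuals do with an intercept).
residuals = [1.0, -2.0, 1.0, 0.5, -0.5]
k = 3  # intercept plus two slopes

sigma2_hat = residual_variance(residuals, k)
print("estimated variance:", sigma2_hat)
print("standard error of the regression:", math.sqrt(sigma2_hat))
```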
Estimating Goodness of Fit
As before, R2 is used as a measure of goodness of fit.
R2 = ESS / TSS
Hypothesis Testing
Testing the null hypothesis that Bi = 0 is the same as
before except:
df = n - k
The test of significance approach to hypothesis testing
Educi = 414 + 0.052 GDPi - 50 Popi
Test statistic: t = b1 / se(b1) = 414 / 267 = 1.55
p = TDIST(t, df, tails)
1 tail: p = 0.065
2 tails: p = 0.13
[Figure: t distribution centered at 0, with the tail areas beyond t = -1.55 and t = 1.55 marked.]
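The p-values above can be checked without a spreadsheet. The sketch below integrates the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_upper_tail` are hypothetical, and df = 35 is an assumption taken from the n - k used in the F-test example for the same regression.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > |t|), using Simpson's rule on [0, |t|]; steps must be even."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # The density is symmetric, so half the mass lies above zero.
    return 0.5 - area_0_to_t

t_stat = 414 / 267  # b1 / se(b1)
df = 35             # assumed: n - k from the F-test example

p_one = t_upper_tail(t_stat, df)
p_two = 2 * p_one

print(f"t = {t_stat:.2f}")
print(f"1 tail:  p = {p_one:.3f}")
print(f"2 tails: p = {p_two:.2f}")
```

If SciPy is available, `scipy.stats.t.sf(t_stat, df)` should give the same one-tailed area.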
Testing the Joint Hypothesis that B2=B3=0
Testing that all the slope coefficients are jointly equal to zero is the
same as testing that R2 = 0.
The intercept, B1, is not part of this hypothesis.
$$F = \frac{R^2 / (k - 1)}{(1 - R^2) / (n - k)}$$
F follows the F distribution
with (k-1) df in the numerator
and (n-k) df in the denominator.
From the regression of education expenditures on GDP and
population (R2 = 0.962):
$$F = \frac{0.962 / 2}{0.038 / 35} = 443.0$$
p = FDIST(F, df1, df2)
p = FDIST(443, 2, 35) = 1.6E-25 ≈ 0.000
* Note: This number is reported in
standard regression output.
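The F calculation can be sketched directly from R2. The function names are hypothetical, and n = 38 is an assumption implied by n - k = 35 with k = 3. The tail probability uses a closed form that holds only when the numerator has exactly 2 degrees of freedom, as it does here. Because the inputs are the rounded values from the text, the leading digits of the p-value may differ from the reported figure while remaining effectively zero.

```python
def f_statistic(r2, n, k):
    """F = [R2 / (k - 1)] / [(1 - R2) / (n - k)]."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))

def f_upper_tail_df1_2(f, df2):
    """P(F > f) for numerator df = 2 only: (1 + 2 f / df2) ** (-df2 / 2)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)

r2 = 0.962
k = 3    # intercept, GDP, population
n = 38   # assumed: implied by n - k = 35

f = f_statistic(r2, n, k)
p = f_upper_tail_df1_2(f, n - k)

print(f"F = {f:.1f}")
print(f"p = {p:.1e}")
```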
Adjusted R2
Adjusted R2 is a goodness of fit measure that is adjusted for
the number of explanatory variables.
R2 never decreases as you add explanatory
variables. Adjusted R2 can decrease.
$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k}$$
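A short sketch of the adjustment, using R2 = 0.962 from the education regression. The function name `adjusted_r2` is hypothetical, and n = 38 is an assumption implied by n - k = 35 with k = 3. The second call is a hypothetical scenario in which a fourth coefficient is added without raising R2 at all.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)

r2 = 0.962
n = 38  # assumed: implied by n - k = 35 with k = 3

adj = adjusted_r2(r2, n, 3)

# Hypothetical: one more regressor that leaves R2 unchanged.
adj_extra = adjusted_r2(r2, n, 4)

print(f"adjusted R2 with k = 3: {adj:.4f}")
print(f"adjusted R2 with k = 4: {adj_extra:.4f}")
```

With R2 held fixed, the extra coefficient lowers adjusted R2, which is the penalty for adding explanatory variables that contribute nothing.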