
ME-UC3M-Classnotes-U4

Universidad Carlos III de Madrid
ME + MIEM
César Alonso
ECONOMETRICS I
THE MULTIPLE LINEAR REGRESSION MODEL
Contents

1 The Multiple Regression Model
  1.1 Assumptions of the Multiple Regression Model
  1.2 Interpretation of coefficients
  1.3 The relation between multiple and simple regression: long vs. short regression
2 Parameter interpretation in the most usual specifications
  2.1 Linear in variables model
  2.2 Semilogarithmic models
    2.2.1 Model with log in the exogenous variable
    2.2.2 Model with log in the endogenous variable
  2.3 Double logarithmic model
  2.4 Model with quadratic terms
  2.5 Other models
    2.5.1 Reciprocal model
    2.5.2 Models with interactions
  2.6 Final comments
3 Estimation in the multiple regression model: OLS
  3.1 Properties of the OLS estimators
  3.2 Estimation of σ²
  3.3 Variances of the OLS estimators
  3.4 Goodness of fit measures
4 Inference in the multiple regression model
  4.1 Hypothesis tests on a single coefficient
  4.2 Tests about a linear restriction on several parameters
  4.3 Tests about q linear restrictions
  4.4 Test of joint significance
1 The Multiple Regression Model
In most economic applications, more than two variables are involved, since the factors explaining an economic phenomenon are usually multiple.
This points out the limitation of the simple regression model for empirical analysis.
The extension to several explanatory variables is imperative to address most
interesting real-world problems.
We can propose a model to account for a multiple relationship between Y and several other variables X1, X2, ..., XK.
Examples:
Y = wage; X1 = education; X2 = experience; X3 = gender
Y = sales; X1 = advertising expenditure; X2 = prices
So, for example, we will have:

Wage = β0 + β1·Education + β2·Experience + β3·Gender + ε
– The unobserved error term ε captures any factors other than Education, Experience or Gender affecting the Wage.
– Notice that since the Multiple Regression Model accounts for several factors, it will be easier to argue the independence of the observed explanatory variables with respect to the unobserved factors included in ε.
– If we are interested in the effect of Education on Wages keeping other factors constant, in our example we can ensure that we measure the effect of Education for a given Experience and a given Gender.
On the contrary, in the simple regression model, the coefficient of education can be interpreted as the effect of education for a given experience and gender only if experience and gender were uncorrelated with education.
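This contrast between the multiple and the simple regression can be illustrated with a short simulation; the data-generating process below (coefficient values, distributions and sample size) is invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical DGP: education is correlated with experience, so a simple
# regression of wage on education alone picks up part of the experience
# effect, while the multiple regression recovers each coefficient.
education = rng.normal(12, 2, n)
experience = 0.5 * education + rng.normal(10, 3, n)
gender = rng.integers(0, 2, n).astype(float)
wage = 1.0 + 0.8 * education + 0.3 * experience - 0.5 * gender + rng.normal(0, 1, n)

# Multiple regression: wage on a constant, education, experience and gender.
X = np.column_stack([np.ones(n), education, experience, gender])
beta_hat, *_ = np.linalg.lstsq(X, wage, rcond=None)
print(beta_hat)  # close to (1.0, 0.8, 0.3, -0.5)

# Short regression: wage on a constant and education only.
Xs = np.column_stack([np.ones(n), education])
alpha_hat, *_ = np.linalg.lstsq(Xs, wage, rcond=None)
print(alpha_hat[1])  # larger than 0.8: education also picks up the experience effect
```

The short-regression slope exceeds the ceteris paribus effect 0.8 precisely because experience is omitted and correlated with education, which is the point developed formally in section 1.3.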
1.1 Assumptions of the Multiple Regression Model
Let the multiple linear regression model be:

Y = β0 + β1X1 + β2X2 + ··· + βKXK + ε
The assumptions are very similar to those invoked in the simple regression
model.
1. Linearity in parameters
2. E(ε|X1, X2, ..., XK) = 0 for any combination of values of X1, X2, ..., XK.
⇒ E(Y|X1, X2, ..., XK) = β0 + β1X1 + β2X2 + ··· + βKXK
3. Homoskedasticity:
V(ε|X1, X2, ..., XK) = σ² ⇒ V(Y|X1, X2, ..., XK) = σ²
4. Absence of multicollinearity: No explanatory variable is an exact linear
function of other explanatory variables.
As in the simple regression case, assumption 2 is crucial for the model parameters to have a causal interpretation.
But conditioning on several variables, this assumption is more likely to be fulfilled.
Assumptions 1 and 2 imply that:
– The Conditional Expectation Function (CEF) is linear:
E(Y|X1, X2, ..., XK) = β0 + β1X1 + β2X2 + ··· + βKXK
– For each possible combination of (X1, X2, ..., XK), the CEF yields the mean of Y in the subpopulation given the corresponding values of X1, X2, ..., XK.
– The CEF, as in the simple linear model, coincides with L(Y|X1, X2, ..., XK), and is the best predictor in the sense that it minimizes E(ε²), where ε = prediction error = Y − c(X1, X2, ..., XK) = Y − (β0 + β1X1 + β2X2 + ··· + βKXK).
Hence, the first order conditions determining the β's are:
E(ε) = 0; C(X1, ε) = 0; ...; C(XK, ε) = 0.
Example: Linear regression model with two explanatory variables:

Y = β0 + β1X1 + β2X2 + ε

where:
Y = earnings
X1 = education
X2 = gender = 1 if woman, 0 if man
– We have that:
E(Y|X1, X2) = β0 + β1X1 + β2X2
so that
E(Y|X1, X2 = 0) = β0 + β1X1
E(Y|X1, X2 = 1) = (β0 + β2) + β1X1
Consequently, if β2 < 0, E(Y|X1, X2 = 0) is a line parallel to E(Y|X1, X2 = 1) and above it.
1.2 Interpretation of coefficients
If all variables except Xj remain constant (other things equal),
ΔE(Y|X1, X2, ..., XK) = βj·ΔXj
and therefore
βj = ΔE(Y|X1, X2, ..., XK)/ΔXj.
In other words,
βj = ∂E(Y|X1, X2, ..., XK)/∂Xj.
Thus, we interpret βj as follows: when Xj increases by one unit (all other things constant), Y varies, on average, by βj units of Y.
This interpretation corresponds to the ceteris paribus notion.
Precisely this ability to make ceteris paribus comparisons when estimating the relationship between one variable and another is the value of econometric analysis.
It must be noticed that the multiple regression
Y = E(Y|X1, X2, ..., XK) + ε
answers a different question than the simple regressions
Y = E(Y|X1) + ε1; ...; Y = E(Y|XK) + εK
Example: Considering earnings, education and gender,
– E(Y|X1, X2) = β0 + β1X1 + β2X2 = expected earnings for X1 years of education, for a given gender X2.
β1: change in earnings due to an additional year of education, for a given gender.
– E(Y|X1) = α0 + α1X1 = expected earnings for X1 years of education.
α1: change in earnings due to an additional year of education, without controlling for gender.

1.3 The relation between multiple and simple regression: long vs. short regression
Consider the simplest multiple linear regression model (population "long regression")
Y = E(Y|X1, X2) + ε
where
E(Y|X1, X2) = L(Y|X1, X2) = β0 + β1X1 + β2X2
The parameters β0, β1 and β2 must verify:
E(ε) = 0; C(X1, ε) = 0; C(X2, ε) = 0:
E(ε) = 0 ⇒ β0 = E(Y) − β1E(X1) − β2E(X2)   (1)
C(X1, ε) = 0 ⇒ β1V(X1) + β2C(X1, X2) = C(X1, Y)   (2)
C(X2, ε) = 0 ⇒ β1C(X1, X2) + β2V(X2) = C(X2, Y)   (3)
From (2) and (3) we have:
β1 = [V(X2)C(X1, Y) − C(X1, X2)C(X2, Y)] / {V(X1)V(X2) − [C(X1, X2)]²}
β2 = [V(X1)C(X2, Y) − C(X1, X2)C(X1, Y)] / {V(X1)V(X2) − [C(X1, X2)]²}
Note that, if C(X1, X2) = 0:
β1 = C(X1, Y)/V(X1)   (slope of L(Y|X1))
β2 = C(X2, Y)/V(X2)   (slope of L(Y|X2))
Consider now the simple linear regression model (population "short regression")
Y = E(Y|X1) + ε1
where
E(Y|X1) = L(Y|X1) = α0 + α1X1
The parameters α0 and α1 must verify:
E(ε1) = 0 ⇒ α0 = E(Y) − α1E(X1)   (4)
C(X1, ε1) = 0 ⇒ α1 = C(X1, Y)/V(X1)   (5)
From (2) and (5),
α1 = C(X1, Y)/V(X1) = [β1V(X1) + β2C(X1, X2)]/V(X1) = β1 + β2·C(X1, X2)/V(X1)
Hence:
– α1 = β1 only if either C(X1, X2) = 0 or β2 = 0.
– C(X1, X2)/V(X1) is the slope of L(X2|X1):
L(X2|X1) = γ0 + γ1X1
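The identity α1 = β1 + β2·C(X1, X2)/V(X1) can be checked numerically on simulated data; the population coefficients and the correlation between the regressors below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Simulated population with known long-regression coefficients (illustrative values).
b1, b2 = 2.0, -1.5
x1 = rng.normal(0, 1, n)
x2 = 0.6 * x1 + rng.normal(0, 1, n)   # X1 and X2 are correlated
y = 1.0 + b1 * x1 + b2 * x2 + rng.normal(0, 1, n)

# Short-regression slope: C(X1, Y) / V(X1).
alpha1 = np.cov(x1, y)[0, 1] / np.var(x1, ddof=1)

# Long-regression slope plus the omitted-variable term b2 * C(X1, X2) / V(X1).
gamma1 = np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)
print(alpha1, b1 + b2 * gamma1)  # the two numbers agree up to sampling error
```

With a large n the sample covariances are close to their population counterparts, so the short-regression slope reproduces β1 + β2·γ1 (here roughly 2 − 1.5·0.6 = 1.1).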
By the same reasoning, there will always be another simple linear regression:
E(Y|X2) = L(Y|X2) = δ0 + δ2X2
where the parameters δ0 and δ2 must verify:
δ0 = E(Y) − δ2E(X2)   (4')
δ2 = C(X2, Y)/V(X2)   (5')
From (3) and (5'),
δ2 = C(X2, Y)/V(X2) = [β2V(X2) + β1C(X1, X2)]/V(X2) = β2 + β1·C(X1, X2)/V(X2)
Likewise,
– δ2 = β2 only if either C(X1, X2) = 0 or β1 = 0.
– C(X1, X2)/V(X2) is the slope of L(X1|X2):
L(X1|X2) = λ0 + λ1X2

2 Parameter interpretation in the most usual specifications
Chapter 7 (7.5) and 13 (13.2), Goldberger
Chapter 2 (2.4), 3 (3.1) and 6 (6.2), Wooldridge
We have focused on linear relations (both in parameters and in variables) between the dependent variable Y and the explanatory variables X1, ..., XK.
However, many relations in economics are nonlinear.
Provided that the model is linear in parameters, regression analysis allows us to introduce nonlinear relations.
Key point:
In general, when we are saying that the regression model is linear, we mean
that the model is linear-in-parameters.
But it can refer to nonlinear transformations of the original variables.
The concept of elasticity is very important in economics: it measures the
percentage change in a variable (Y ) in response to a percentage change in
another variable (X).
– In general, elasticities are not constant for most speci…cations.
The value will depend on the realized values of the explanatory variable
(X) and the response variable (Y ).
– The transformation that we apply to the variables will affect the
way in which elasticities are calculated.
We will consider the most usual speci…cations.
For the sake of simplicity, we will concentrate on models with one or two explanatory variables.
Summary of the most usual specifications (model, causal effect, elasticity and interpretation):
– Linear: Y = β0 + β1X + ε. Causal effect: β1 = ΔE(Y|X)/ΔX. Elasticity: β1·X/E(Y|X). As X increases by 1 unit, Y varies on average by β1 units.
– Semilog (X): Y = β0 + β1 ln X + ε. Causal effect: β1 ≈ ΔE(Y|X)/(ΔX/X). Elasticity: β1/E(Y|X). As X increases by 1%, Y varies on average by β1/100 units.
– Semilog (Y): ln Y = β0 + β1X + ε. Causal effect: β1 = ΔE(ln Y|X)/ΔX ≈ E[(ΔY/Y)|X]/ΔX. Elasticity: β1·X. As X increases by 1 unit, Y varies on average by (β1·100)%.
– Double log: ln Y = β0 + β1 ln X + ε. Causal effect: β1 = ΔE(ln Y|X)/Δ ln X ≈ E[(ΔY/Y)|X]/(ΔX/X). Elasticity: β1. As X increases by 1%, Y varies on average by β1%.
Other models:
– Reciprocal: Y = β0 + β1(1/X) + ε. Causal effect: ΔE(Y|X)/ΔX = −β1/X². As X increases by 1 unit, Y varies on average by −β1(1/X²) units.
– Interactions (2 or more variables): Y = β0 + β1X1 + β2X2 + β3X1X2 + ε. Causal effect: ΔE(Y|X1, X2)/ΔX1 = β1 + β3X2. As X1 increases by 1 unit, Y varies on average by (β1 + β3X2) units.
2.1 Linear in variables model

The model is simply
Y = β0 + β1X + ε,
where E(ε|X) = 0 ⇒ E(Y|X) = β0 + β1X.
Interpretation of β1:
β1 = ΔE(Y|X)/ΔX ⇒ as X increases by 1 unit, Y varies on average by β1 units of Y.
Elasticity of E(Y|X) with respect to X:
E[(ΔY/Y)|X] / (ΔX/X) = β1·X/E(Y|X)
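A quick sketch of how this elasticity moves along the regression line; the coefficient values below are invented for illustration.

```python
# Illustrative linear model E(Y|X) = b0 + b1*X with made-up coefficients.
b0, b1 = 50.0, 2.0

def elasticity(x):
    """Elasticity of E(Y|X) with respect to X at the point x: b1 * x / E(Y|X)."""
    return b1 * x / (b0 + b1 * x)

# The elasticity is not constant: here it grows with X.
print(elasticity(10.0))  # 2*10/70, about 0.286
print(elasticity(50.0))  # 2*50/150, about 0.667
```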
The elasticity varies with the possible realizations of X and Y, so it is not constant.
Usually, we calculate elasticities for particular individuals, with particular values of X and Y, as β1·(X/Y).

2.2 Semilogarithmic models

2.2.1 Model with log in the exogenous variable
Sometimes, percentage changes in X lead to constant changes in Y:
Y = β0 + β1 ln X + ε,
where E(ε|X) = 0 ⇒ E(Y|X) = β0 + β1 ln X.
Interpretation of β1:
β1 = ΔE(Y|X)/Δ ln X ≈ ΔE(Y|X)/(ΔX/X)
Notice that β1 is a semielasticity.
Elasticity of E(Y|X) with respect to X:
β1/E(Y|X),
which depends on the particular realization of E(Y|X).
Usually, we calculate elasticities for particular individuals, with a particular value of Y, as β1/Y, or we use the sample mean of Y, Ȳ, or we estimate E(Y|X) by the predicted value of Y at the sample means of the X's, Ê(Y|X = X̄) = β̂0 + β̂1X̄.
Multiplying and dividing by 100 to express the change in X in percentage terms,
β1/100 ≈ ΔE(Y|X)/(100·ΔX/X) ⇒ as X increases by 1%, Y varies on average by β1/100 units of Y.
Example: Let Y = consumption (in euros), X = income (in euros). Consider two alternative models:
Model 1:
Y = β0 + β1X + ε
In this model, β1 = ΔE(Y|X)/ΔX ⇒ if income increases by 1 euro, consumption varies on average by β1 euros (the MPC, marginal propensity to consume, is constant).
Model 2:
Y = β0 + β1 ln X + ε
In this model, β1/100 ≈ ΔE(Y|X)/(100·ΔX/X) ⇒ if income increases by 1%, consumption varies on average by β1/100 euros.
(Here the MPC is β1/X, which is not constant: it decreases with income.)

2.2.2 Model with log in the endogenous variable
Sometimes, variations in X entail percentage changes in Y:
ln Y = β0 + β1X + ε,
where E(ε|X) = 0 ⇒ E(ln Y|X) = β0 + β1X.
In terms of the original variables, this model can be expressed as
Y = exp(β0 + β1X + ε)
Interpretation of β1:
β1 = ΔE(ln Y|X)/ΔX ≈ E[(ΔY/Y)|X]/ΔX
so that
β1·100 ≈ E[(100·ΔY/Y)|X]/ΔX ⇒ when X varies by 1 unit, Y varies on average by (β1·100)%.
β1 is a semielasticity.
The elasticity of E(Y|X) with respect to X is equal to β1X (so it varies with the value of X).
This specification is very useful to describe curves with exponential growth. In particular, if X = t (time), then Y = exp(β0 + β1t + ε) and, since β1 = ΔE(ln Y|X)/Δt, β1 captures the average growth rate of Y over time.
Example: Let Y = hourly wage (euros), X = education (years).
Model 1:
Y = β0 + β1X + ε
where β1 = ΔE(Y|X)/ΔX ⇒ an additional year of education implies an increase in the average wage of β1 euros.
(The mean wage increases by a constant amount β1 for each additional year of education, irrespective of the level of education.)
Model 2:
ln Y = β0 + β1X + ε
where β1·100 ≈ E[(100·ΔY/Y)|X]/ΔX ⇒ an additional year of education implies a percentage increase in the average wage of (β1·100)%.
(The hourly wage increases by (β1·Y) euros for each additional year of education, which varies with the wage level.)
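Since β1·100 is only an approximation to the percentage change, for larger coefficients it is worth comparing it with the exact change implied by the model, 100·(exp(β1) − 1); the coefficient below is invented for illustration.

```python
import math

# Hypothetical semilog wage equation ln(wage) = b0 + b1*education, with b1 = 0.08.
b1 = 0.08

approx_pct = 100 * b1                 # approximate % change per extra year
exact_pct = 100 * (math.exp(b1) - 1)  # exact % change implied by the model

print(approx_pct)  # 8.0
print(exact_pct)   # about 8.33: the approximation understates the effect
```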
2.3 Double logarithmic model

It characterizes situations in which percentage changes in X lead to percentage changes in the mean value of Y ⇒ constant elasticity.
Very useful in studies of demand, production, costs, etc.
The model can be expressed as
ln Y = β0 + β1 ln X + ε,
where E(ε|X) = 0 ⇒ E(ln Y|X) = β0 + β1 ln X.
Interpretation of β1:
β1 = ΔE(ln Y|X)/Δ ln X ≈ E[(ΔY/Y)|X]/(ΔX/X) ⇒ when X varies by 1%, Y varies on average by β1%.
(β1 is an elasticity.)
Example: Let Y = output, X1 = labor and X2 = capital.
Model 1:
Y = β0 + β1X1 + β2X2 + ε
so β1 = ΔE(Y|X1, X2)/ΔX1 ⇒ if labor input is increased by 1 unit (keeping capital constant), output varies on average by β1 units of output.
⇒ The elasticity of output with respect to labor is not constant:
(ΔY/Y)/(ΔX1/X1) = β1·X1/Y
Analogously, β2 = ΔE(Y|X1, X2)/ΔX2 ⇒ if capital input is increased by 1 unit (keeping labor constant), output varies on average by β2 units of output.
⇒ The elasticity of output with respect to capital is not constant:
(ΔY/Y)/(ΔX2/X2) = β2·X2/Y
Model 2:
ln Y = β0 + β1 ln X1 + β2 ln X2 + ε
so β1 ≈ E[(ΔY/Y)|X1, X2]/(ΔX1/X1) ⇒ if labor input is increased by 1% (keeping capital constant), output varies on average by β1%.
⇒ The elasticity of output with respect to labor is constant.
Likewise, if capital input varies by 1% (keeping labor constant), output varies on average by β2%.
⇒ The elasticity of output with respect to capital is constant.
– Note that this model has the following representation in terms of the original variables:
Y = b0·X1^β1·X2^β2·exp(ε)   (Cobb-Douglas), with b0 = exp(β0)
2.4 Model with quadratic terms
This model allows for increasing or decreasing marginal effects of X on Y:
Y = β0 + β1X + β2X² + ε,
where E(ε|X) = 0 ⇒ E(Y|X) = β0 + β1X + β2X².
It is useful for production technologies or cost functions.
Here,
ΔE(Y|X)/ΔX = β1 + 2β2X,
so that when X varies by 1 unit, Y varies on average by (β1 + 2β2X) units.
Note that β1 and β2 cannot be interpreted separately.
– The sign of β2 will determine whether the marginal effect is increasing (β2 > 0) or decreasing (β2 < 0).
– There is a critical value of X after which the sign of the effect of X on E(Y|X) switches. Such critical value is X* = −β1/(2β2).
Example: Let Y = hourly wage (euros), X1 = education (years), X2 = labor experience (years).
Model 1:
ln Y = β0 + β1X1 + β2X2 + ε
If experience increases by 1 year, keeping education constant, the wage varies on average by (β2·100)%.
Model 2:
ln Y = β0 + β1X1 + β2X2 + β3X2² + ε
If experience increases by 1 year, keeping education constant, the wage varies on average by 100·(β2 + 2β3X2)% ⇒ the return to an additional year of experience is not constant (it depends on the years of experience).
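For Model 2, the marginal return β2 + 2β3X2 and the turning point X2* = −β2/(2β3) are simple arithmetic; the coefficient values below are made up for illustration.

```python
# Hypothetical quadratic experience profile: ln(wage) = b0 + b1*educ + b2*exp + b3*exp^2.
b2, b3 = 0.04, -0.0008

def marginal_return(x2):
    """Percentage wage change for one more year of experience at x2 years."""
    return 100 * (b2 + 2 * b3 * x2)

turning_point = -b2 / (2 * b3)  # experience level where the return changes sign

print(marginal_return(5))  # 100*(0.04 - 0.008) = 3.2 (% per extra year)
print(turning_point)       # 25.0 years
```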
2.5 Other models

2.5.1 Reciprocal model

Y = β0 + β1(1/X) + ε,
where E(ε|X) = 0 ⇒ E(Y|X) = β0 + β1(1/X).
– It implies a hyperbolic curvature.
– It is used to describe nonlinear inverse relationships, such as the Phillips curve (unemployment-inflation tradeoff).
– When X varies by 1 unit, Y varies on average by −β1(1/X²) units of Y.

2.5.2 Models with interactions
Sometimes, the effect of one explanatory variable depends on the level of another:
Y = β0 + β1X1 + β2X2 + β3X1X2 + ε,
where:
E(ε|X1, X2) = 0 ⇒ E(Y|X1, X2) = β0 + β1X1 + β2X2 + β3X1X2
– If X1 varies by 1 unit, Y varies on average by (β1 + β3X2) units.
– Note that the parameters cannot be interpreted separately.
2.6 Final comments

The different transformations above can be combined in one model, so that we can have logarithmic or semilogarithmic models with interactions, powers, etc.
Example: Translogarithmic production function.
– Let Y = output, X1 = labor and X2 = capital.
ln Y = β0 + β1 ln X1 + β2 ln X2 + β3(ln X1)² + β4(ln X2)² + β5(ln X1)(ln X2) + ε
Here, the elasticities of output with respect to either labor or capital are not constant, despite being in logarithms:
E[(ΔY/Y)|X1, X2]/(ΔX1/X1) ≈ β1 + 2β3 ln X1 + β5 ln X2,
which depends on the logarithms of both inputs.
– This specification is also useful to model expenditure functions or cost functions.
3 Estimation in the multiple regression model: OLS
Goldberger: Chapters 6 (6.4), 8 (8.2 and 8.3), 9 (9.2 and 9.4), 10 (10.2) and 12 (12.1 and 12.3).
Wooldridge: Chapters 2 (2.2, 2.3, 2.5 and 2.6), 3 (3.2-3.5), 5 (5.1 and 5.3).
Estimation follows the same rationale as in the simple regression case.
Consider the model:
Y = β0 + β1X1 + β2X2 + ··· + βKXK + ε
with the assumptions above.
Recall that the analog principle allows us to derive estimators for the β's that coincide with the OLS estimators.
We will illustrate the two-variable case, where the population parameters satisfy:
β0 = E(Y) − β1E(X1) − β2E(X2)
β1 = [V(X2)C(X1, Y) − C(X1, X2)C(X2, Y)] / {V(X1)V(X2) − [C(X1, X2)]²}
β2 = [V(X1)C(X2, Y) − C(X1, X2)C(X1, Y)] / {V(X1)V(X2) − [C(X1, X2)]²}
and, applying the analog principle,
β̂0 = Ȳ − β̂1X̄1 − β̂2X̄2
β̂1 = (S2²·S1y − S12·S2y) / (S1²S2² − S12²)
β̂2 = (S1²·S2y − S12·S1y) / (S1²S2² − S12²)
where
S1² = (1/n)Σi(X1i − X̄1)²;  S2² = (1/n)Σi(X2i − X̄2)²
S1y = (1/n)Σi(X1i − X̄1)(Yi − Ȳ);  S2y = (1/n)Σi(X2i − X̄2)(Yi − Ȳ)
S12 = (1/n)Σi(X1i − X̄1)(X2i − X̄2) = S21
In the general case, if we consider the LS criterion, the OLS estimator solves the problem
min over β̂0, β̂1, ..., β̂K of (1/n)·Σi ε̂i²,
where ε̂i = Yi − Ŷi = Yi − (β̂0 + β̂1X1i + β̂2X2i + ··· + β̂KXKi) is the residual.
The first order conditions for OLS are:
Σi ε̂i = 0; Σi ε̂iX1i = 0; ...; Σi ε̂iXKi = 0
or equivalently,
(1/n)Σi ε̂ix1i = 0; ...; (1/n)Σi ε̂ixKi = 0
where xji = Xji − X̄j, j = 1, ..., K.
These conditions are simply the sample analogs of the conditions that the β's verify in the population:
E(ε) = 0; C(X1, ε) = 0; ...; C(XK, ε) = 0.
The first order conditions imply the following system of (K + 1) equations with (K + 1) unknowns (the β̂'s):
n·β̂0 + β̂1ΣiX1i + β̂2ΣiX2i + ··· + β̂KΣiXKi = ΣiYi
β̂1Σix1i² + β̂2Σix2ix1i + ··· + β̂KΣixKix1i = Σiyix1i
...
β̂1Σix1ixKi + β̂2Σix2ixKi + ··· + β̂KΣixKi² = ΣiyixKi
Provided that no explanatory variable is an exact linear combination of the others (i.e., there is no exact multicollinearity), the system will have a unique solution.
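This system of normal equations can be solved directly with linear algebra; the sketch below uses simulated data with arbitrary coefficients and also checks the first order conditions on the residuals.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

# Simulated data from an arbitrary two-regressor model (illustrative values).
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
y = 1.0 + 0.5 * X1 - 0.3 * X2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), X1, X2])

# Solve the (K+1) x (K+1) system of normal equations X'X b = X'y.
b = np.linalg.solve(X.T @ X, X.T @ y)

# The residuals satisfy the first order conditions: sum(e) = 0 and sum(e * Xj) = 0.
e = y - X @ b
print(b)        # close to (1.0, 0.5, -0.3)
print(X.T @ e)  # numerically a zero vector
```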
3.1 Properties of the OLS estimators
As in the simple regression case, the OLS estimators satisfy the properties of:
– Linearity in the observations of Y .
– Unbiasedness (given assumptions 1., 2. and 4.)
– Gauss-Markov Theorem: under assumptions 1. to 4., β̂0, β̂1, ..., β̂K have the lowest variance among the linear and unbiased estimators.
– Consistency.
3.2 Estimation of σ²

Similar to the simple regression. A consistent estimator of σ² is
σ̃² = Σi ε̂i² / n.
Since the residuals satisfy K + 1 linear restrictions,
Σi ε̂i = 0; Σi ε̂iX1i = 0; ...; Σi ε̂iXKi = 0,
there are only (n − K − 1) independent residuals (degrees of freedom).
We can then use an unbiased (and also consistent) estimator of σ²,
σ̂² = Σi ε̂i² / (n − K − 1).
Under regularity conditions, both σ̃² and σ̂² are consistent estimators of σ², and very similar for moderately large sample sizes.
3.3 Variances of the OLS estimators

In addition to assumptions 1. and 2., we make use of assumption 3. (V(ε|X1, X2, ..., XK) = σ² for any combination of the values of X1, X2, ..., XK).
V(β̂j) = σ² / [nSj²(1 − Rj²)] = σ² / [Σi xji²·(1 − Rj²)],   (j = 1, ..., K), where
– Sj² = (1/n)Σi xji² = (1/n)Σi(Xji − X̄j)²
– Rj² is the R² of the sample linear projection of Xj on the remaining explanatory variables X1, X2, ..., X(j−1), X(j+1), ..., XK:
Xji = δ0 + δ1X1i + δ2X2i + ··· + δ(j−1)X(j−1)i + δ(j+1)X(j+1)i + ··· + δKXKi + ui
Rj² measures the fraction of Xj which can be explained by the remaining explanatory variables. Hence, 1 − Rj² is the information (not contained in other variables) that Xj provides in addition to the remaining explanatory variables.
It is not possible that Rj² = 1, because then Xj would be an exact linear combination of the remaining explanatory variables (ruled out by assumption 4.). But if Rj² were close to 1, V(β̂j) would be very large.
On the contrary, if Rj² = 0 (i.e., the correlation of Xj with the remaining explanatory variables is 0), then V(β̂j) would be the smallest.
Intuitively:
– The higher Sj² = (1/n)Σi xji², the higher the sample variation in Xj, and the better the estimator precision.
– The larger the sample size n, the better the estimator precision.
– The higher the Rj², the lower the estimator precision.
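The term 1/(1 − Rj²) is the familiar variance inflation factor; the sketch below uses an invented, strongly correlated design to show how Rj² translates into inflated variance for the corresponding coefficient.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000

# X2 is built to be strongly correlated with X1 (illustrative design).
X1 = rng.normal(size=n)
X2 = 0.95 * X1 + rng.normal(scale=0.3, size=n)

# Rj^2: R^2 of regressing X1 on a constant and X2.
Z = np.column_stack([np.ones(n), X2])
g, *_ = np.linalg.lstsq(Z, X1, rcond=None)
resid = X1 - Z @ g
r2_j = 1 - resid.var() / X1.var()

vif = 1 / (1 - r2_j)
print(r2_j, vif)  # a high Rj^2 inflates the variance of the estimated coefficient
```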
The variance of β̂j can then be consistently estimated using a consistent estimator for σ²:
V̂(β̂j) = σ̂² / [nSj²(1 − Rj²)],
where Sj² is the sample variance of Xj, Sj² = (1/n)Σi(Xji − X̄j)².

3.4 Goodness of fit measures
The goodness-of-fit measures are similar to the simple regression case.
We can use the square root of σ̂², σ̂, denoted as the standard error of the regression.
We can also use the R², with the same interpretation (fraction of the variance of Y explained by the explanatory variables):
R² = ESS/TSS = Σi(Ŷi − Ȳ)² / Σi(Yi − Ȳ)² = 1 − RSS/TSS,   0 ≤ R² ≤ 1.
The R² can be helpful when comparing different models for the same dependent variable Y.
However, the R² always increases when adding new regressors, even when they add no explanatory value.
There is a similar measure, R̄², also called the adjusted R², which avoids this problem:
R̄² = 1 − (1 − R²)·(n − 1)/(n − K − 1) = 1 − [RSS/(n − K − 1)] / [TSS/(n − 1)].
In any case, for large sample sizes, R̄² ≈ R².
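The adjustment is simple arithmetic on R², n and K; the helper below applies it to the money-demand figures used later in the notes (R² = 0.927, n = 38, K = 2).

```python
def adjusted_r2(r2, n, k):
    """R-bar^2 = 1 - (1 - R^2) * (n - 1) / (n - K - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(0.927, 38, 2))  # about 0.9228: slightly below the raw R^2
```

Adding a useless regressor raises R² mechanically but raises K, so R̄² can fall; that is the penalty at work.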
4 Inference in the multiple regression model
Goldberger: Chapters 7, 10 (10.3), 11 and 12 (12.5 and 12.6).
Wooldridge: Chapters 4 and 5 (5.2).
4.1 Hypothesis tests on a single coefficient

We proceed in a similar way as with the simple regression.
Suppose we have the following null and alternative hypotheses:
H0: β1 = a   vs.   H1: β1 ≠ a
We then construct the t statistic
t = (β̂1 − a)/s_β̂1,
which tells us how many standard deviations our sample slope is from the slope hypothesized under the null.
Under normality,
t = (β̂1 − a)/s_β̂1 ~ N(0, 1)
or, using the asymptotic approximation,
t = (β̂1 − a)/s_β̂1 ~ N(0, 1) approximately.
In general, we would reject the null hypothesis at the 100·α% significance level when
|t| = |β̂1 − a|/s_β̂1 > z(1 − α/2)
To test the one-sided (upper tail) hypothesis
H0: β1 = a   vs.   H1: β1 > a
we would decide in favor of the alternative at the 100·α% significance level when
t = (β̂1 − a)/s_β̂1 > z(1 − α)
Example: Let
Y = logarithm of money demand (M1)
X1 = logarithm of real GDP
X2 = logarithm of the Treasury-bill interest rate
– Using US data, we have obtained the following results:
Ŷ = −2.3296 + 0.5573·X1 − 0.2032·X2
(standard errors: 0.2054, 0.0264 and 0.0210, respectively)
R² = 0.927;  s = 0.048;  n = 38;  Ȳ = 6.629
– Interpretation:
β̂1: estimate of the elasticity of money demand with respect to output (keeping the interest rate constant). If GDP increases by 1% (and the interest rate does not change), money demand increases on average by 0.6%.
β̂2: estimate of the elasticity of money demand with respect to the interest rate (keeping GDP constant). If the interest rate increases by 1% (and GDP does not change), money demand falls on average by 0.2%.
H0: β2 = 0 (money demand is inelastic to the interest rate)   vs.   H1: β2 ≠ 0
Then, under H0:
β̂2/s_β̂2 ~ N(0, 1) approximately,
and
|t| = 0.2032/0.021 = 9.676 > z = 1.96
⇒ we reject H0 at the 5% significance level.
H0: β1 = 1 (unit elasticity of money demand with respect to output)   vs.   H1: β1 ≠ 1
Then, under H0:
(β̂1 − 1)/s_β̂1 ~ N(0, 1) approximately,
and
|t| = |0.5573 − 1|/0.0264 = 16.769 > z = 1.96
⇒ we reject H0 at the 5% significance level.
95% confidence interval for β1: β̂1 ± 1.96·s_β̂1 ⇒ 0.5573 ± 1.96·0.0264 ⇒ [0.505, 0.609]
95% confidence interval for β2: β̂2 ± 1.96·s_β̂2 ⇒ −0.2032 ± 1.96·0.0210 ⇒ [−0.244, −0.162]
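The two t statistics and confidence intervals are simple arithmetic on the reported estimates and standard errors:

```python
# Reported money-demand estimates and standard errors from the fitted equation.
b1, se1 = 0.5573, 0.0264
b2, se2 = -0.2032, 0.0210

t2 = b2 / se2          # H0: beta2 = 0
t1 = (b1 - 1) / se1    # H0: beta1 = 1

print(abs(t2))  # about 9.68  -> reject at the 5% level (|t| > 1.96)
print(abs(t1))  # about 16.77 -> reject at the 5% level

# 95% confidence intervals: estimate +/- 1.96 * standard error.
ci1 = (b1 - 1.96 * se1, b1 + 1.96 * se1)
ci2 = (b2 - 1.96 * se2, b2 + 1.96 * se2)
print(ci1)  # about (0.505, 0.609)
print(ci2)  # about (-0.244, -0.162)
```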
4.2 Tests about a linear restriction on several parameters
Consider the null hypothesis H0: λ0β0 + λ1β1 + ··· + λKβK = a, where λ0, λ1, ..., λK, a are known constants.
Using the asymptotic approximation, we have that, under H0:
t = (λ0β̂0 + λ1β̂1 + ··· + λKβ̂K − a) / sqrt[V̂(λ0β̂0 + λ1β̂1 + ··· + λKβ̂K)] ~ N(0, 1) approximately.
Example: Let
Y = logarithm of output
X1 = logarithm of labour input
X2 = logarithm of (physical) capital input
– Using data on 31 companies, we have obtained the following results:
Ŷ = 2.37 + 0.632·X1 + 0.452·X2,   n = 31
(standard errors of the slopes: 0.257 and 0.219, respectively)
Ĉ(β̂1, β̂2) = 0.055
– Interpretation:
β̂1: estimate of the elasticity of output with respect to labor (keeping capital constant). When labor input rises by 1% (and capital does not change), output increases on average by 0.63%.
β̂2: estimate of the elasticity of output with respect to capital (keeping labor constant). When capital input rises by 1% (and labor does not change), output increases on average by 0.45%.
– Consider the hypothesis
H0: β1 + β2 = 1 (constant returns to scale)   vs.   H1: β1 + β2 ≠ 1.
Then, under H0:
t = (β̂1 + β̂2 − 1) / sqrt[V̂(β̂1 + β̂2)] ~ N(0, 1) approximately,
where V̂(β̂1 + β̂2) = V̂(β̂1) + V̂(β̂2) + 2Ĉ(β̂1, β̂2), and
|t| = |0.632 + 0.452 − 1| / sqrt[(0.257)² + (0.219)² + 2·0.055] = 0.177 < z = 1.96
So we cannot reject constant returns to scale.
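The same computation, step by step, on the reported estimates:

```python
import math

# Reported estimates, standard errors and covariance from the production example.
b1, b2 = 0.632, 0.452
se1, se2 = 0.257, 0.219
cov12 = 0.055

# Var(b1 + b2) = Var(b1) + Var(b2) + 2 Cov(b1, b2)
var_sum = se1**2 + se2**2 + 2 * cov12

# t statistic for H0: beta1 + beta2 = 1 (constant returns to scale)
t = (b1 + b2 - 1) / math.sqrt(var_sum)
print(abs(t))  # about 0.177 < 1.96: cannot reject constant returns to scale
```

Note how the covariance term dominates the denominator here; ignoring it would overstate the precision of β̂1 + β̂2 and could flip the conclusion.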
4.3 Tests about q linear restrictions

How can we test several linear constraints? For example, how can we test q linear restrictions like:
– H0: β1 = β2 = ··· = βK = 0
– H0: β1 = β2, β4 = 0
– H0: β1 + β2 = 1, β2 − 5β3 = 1
We must form two regressions:
– The one that embodies the null hypothesis: this becomes the restricted model, which is the appropriate model if the null is true.
– The original model, which does not restrict the coefficients in any way: this is denoted as the unrestricted model.
Basic idea: ascertain whether imposing the null hypothesis has much of an impact on how well the model fits the data.
– If the null hypothesis is true, then both models should "fit" the data equally well.
– Of course, even if the null hypothesis is true, the unrestricted model will better capture random variation in the sample (since the constraints will not hold exactly in the sample) and will provide a somewhat better fit.
– What we want to know is whether the fit achieved by the unrestricted model is so much better than that of the restricted model that we are willing to reject the null hypothesis.
When estimating each model, we obtain:
Unrestricted: R² = RU²;  RSS = Σi ε̂iU² = URSS
Restricted: R² = RR²;  RSS = Σi ε̂iR² = RRSS
– It is easy to check that (RRSS − URSS) ≥ 0 and (RU² − RR²) ≥ 0.
Examples:
– Example 1: H0: β1 = β2 = 0 vs. H1: β1 ≠ 0 and/or β2 ≠ 0.
Unrestricted: Y = β0 + β1X1 + β2X2 + ε
Restricted: Y = β0 + ε
– Example 2: H0: 2β1 + β2 = 1 vs. H1: 2β1 + β2 ≠ 1.
Unrestricted: Y = β0 + β1X1 + β2X2 + ε
Restricted: Y* = β0 + β1X* + ε, where Y* = Y − X2 and X* = X1 − 2X2,
since
2β1 + β2 = 1 ⇒ β2 = 1 − 2β1 ⇒
Y = β0 + β1X1 + (1 − 2β1)X2 + ε ⇒ (Y − X2) = β0 + β1(X1 − 2X2) + ε
Assuming conditional normality, it can be proved that
Y | X1i, X2i, ..., XKi ~ N(β0 + β1X1i + β2X2i + ··· + βKXKi, σ²),
so under H0 ["q" linear restrictions],
F = [(RRSS − URSS)/URSS] · [(n − K − 1)/q] ~ F(q, n − K − 1)
or equivalently (provided that the dependent variable is unchanged in the restricted model after introducing the constraints),
F = [(RU² − RR²)/(1 − RU²)] · [(n − K − 1)/q] ~ F(q, n − K − 1)
Or, using the asymptotic approximation,
W0 = [(RRSS − URSS)/URSS] · (n − K − 1) = q·F ~ χ²(q) approximately,
or equivalently (provided that the dependent variable is unchanged in the restricted model after introducing the constraints),
W0 = [(RU² − RR²)/(1 − RU²)] · (n − K − 1) = q·F ~ χ²(q) approximately.
All previous tests are particular cases of this one.
4.4 Test of joint significance

This test is also denoted as the "regression test".
It consists of testing whether all the regression slope coefficients are zero:
H0: β1 = β2 = ··· = βK = 0   vs.   H1: βj ≠ 0 for at least one j = 1, ..., K.
Unrestricted: Y = β0 + β1X1 + β2X2 + ··· + βKXK + ε;  RU² > 0
Restricted: Y = β0 + ε;  RR² = 0
Under H0,
F = [RU²/(1 − RU²)] · [(n − K − 1)/K] ~ F(K, n − K − 1)
(if we assume conditional normality).
Alternatively, using the asymptotic approximation, we have that under H0,
W0 = [RU²/(1 − RU²)] · (n − K − 1) = K·F ~ χ²(K) approximately.
Example: Let
Y = logarithm of money demand (M1)
X1 = logarithm of real GDP
X2 = logarithm of the Treasury-bill interest rate
– Using US data, we have obtained the following results:
Ŷ = −2.3296 + 0.5573·X1 − 0.2032·X2
(standard errors: 0.2054, 0.0264 and 0.0210, respectively)
R² = 0.927;  s = 0.048;  n = 38;  Ȳ = 6.629
– We want to test whether money demand is insensitive to both output and the interest rate:
H0: β1 = β2 = 0   vs.   H1: β1 ≠ 0 and/or β2 ≠ 0
– Then,
F = [0.927/(1 − 0.927)] · [(38 − 2 − 1)/2] = 222.23 > F(2, 35) = 3.28
or, using the asymptotic test,
W0 = [RU²/(1 − RU²)] · (n − K − 1) = K·F ~ χ²(K) approximately,
so that
W0 = [0.927/(1 − 0.927)] · (38 − 2 − 1) = 444.45 > χ²(2) = 5.99
⇒ we reject H0.
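Both statistics follow directly from the reported R², n and K:

```python
# Joint significance test for the money-demand regression: R^2 = 0.927, n = 38, K = 2.
r2, n, k = 0.927, 38, 2

# F statistic under conditional normality.
F = (r2 / (1 - r2)) * ((n - k - 1) / k)
# Asymptotic Wald statistic: W0 = K * F.
W0 = (r2 / (1 - r2)) * (n - k - 1)

print(F)   # about 222.2 > 3.28, the 5% critical value of F(2, 35)
print(W0)  # about 444.5 > 5.99, the 5% critical value of chi-squared(2)
```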