# Chapter 3: The Multiple Linear Regression Model

Christophe Hurlin (University of Orléans)
November 23, 2013

## Section 1: Introduction
The objectives of this chapter are the following:

1. Define the multiple linear regression model.
2. Introduce the ordinary least squares (OLS) estimator.
The outline of this chapter is the following:

- Section 2: The multiple linear regression model
- Section 3: The ordinary least squares estimator
- Section 4: Statistical properties of the OLS estimator
  - Subsection 4.1: Finite sample properties
  - Subsection 4.2: Asymptotic properties
References

- Amemiya, T. (1985), Advanced Econometrics, Harvard University Press.
- Greene, W. (2007), Econometric Analysis, sixth edition, Pearson Prentice Hall (recommended).
- Pelgrin, F. (2010), Lecture Notes in Advanced Econometrics, HEC Lausanne (a special thank).
- Ruud, P. (2000), An Introduction to Classical Econometric Theory, Oxford University Press.
Notations: in this chapter, I will (try to...) follow some conventions of notation.

- $f_Y(y)$: probability density or mass function
- $F_Y(y)$: cumulative distribution function
- $\Pr(\cdot)$: probability
- $\mathbf{y}$: vector
- $\mathbf{Y}$: matrix

Be careful: in this chapter, I don't distinguish between a random vector (matrix) and a vector (matrix) of deterministic elements. For more appropriate notations, see Abadir and Magnus (2002), "Notation in econometrics: a proposal for a standard", Econometrics Journal.
## Section 2: The Multiple Linear Regression Model
Objectives:

1. Define the concept of multiple linear regression model.
2. Introduce the semi-parametric and parametric multiple linear regression models.
3. Introduce the multiple linear Gaussian model.
**Definition (Multiple linear regression model)**
The multiple linear regression model is used to study the relationship between a dependent variable and one or more independent variables. The generic form of the linear regression model is
$$y = x_1 \beta_1 + x_2 \beta_2 + \dots + x_K \beta_K + \varepsilon$$
where $y$ is the dependent or explained variable and $x_1, \dots, x_K$ are the independent or explanatory variables.
Notations:

1. $y$ is the dependent variable, the regressand or the explained variable.
2. $x_j$ is an explanatory variable, a regressor or a covariate.
3. $\varepsilon$ is the error term or disturbance.

IMPORTANT: do not use the term "residual" here.
Notations (cont'd)

The term $\varepsilon$ is a random disturbance, so named because it "disturbs" an otherwise stable relationship. The disturbance arises for several reasons:

1. Primarily because we cannot hope to capture every influence on an economic variable in a model, no matter how elaborate. The net effect, which can be positive or negative, of these omitted factors is captured in the disturbance.
2. There are many other contributors to the disturbance in an empirical model. Probably the most significant is errors of measurement. It is one thing to theorize about the relationships among precisely defined variables; it is quite another to obtain accurate measures of these variables.
Notations (cont'd)

We assume that each observation in a sample $\{y_i, x_{i1}, x_{i2}, \dots, x_{iK}\}$ for $i = 1, \dots, N$ is generated by an underlying process described by
$$y_i = x_{i1}\beta_1 + x_{i2}\beta_2 + \dots + x_{iK}\beta_K + \varepsilon_i$$

Remark: $x_{ik}$ is the value of the $k$-th explanatory variable for the $i$-th unit of the sample, following the convention $x_{\text{unit},\text{variable}}$.
Notations (cont'd)

- Let the $N \times 1$ column vector $\mathbf{x}_k$ be the $N$ observations on variable $x_k$, for $k = 1, \dots, K$.
- Assemble these data in an $N \times K$ data matrix $\mathbf{X}$.
- Let $\mathbf{y}$ be the $N \times 1$ column vector of the $N$ observations $y_1, y_2, \dots, y_N$.
- Let $\varepsilon$ be the $N \times 1$ column vector containing the $N$ disturbances.
Notations (cont'd)

$$\underset{N \times 1}{\mathbf{y}} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_i \\ \vdots \\ y_N \end{pmatrix} \qquad \underset{N \times 1}{\mathbf{x}_k} = \begin{pmatrix} x_{1k} \\ x_{2k} \\ \vdots \\ x_{ik} \\ \vdots \\ x_{Nk} \end{pmatrix} \qquad \underset{N \times 1}{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_i \\ \vdots \\ \varepsilon_N \end{pmatrix} \qquad \underset{K \times 1}{\beta} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix}$$
Notations (cont'd)

$$\underset{N \times K}{\mathbf{X}} = (\mathbf{x}_1 : \mathbf{x}_2 : \dots : \mathbf{x}_K)$$
or equivalently
$$\underset{N \times K}{\mathbf{X}} = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1k} & \dots & x_{1K} \\ x_{21} & x_{22} & \dots & x_{2k} & \dots & x_{2K} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_{i1} & x_{i2} & \dots & x_{ik} & \dots & x_{iK} \\ \vdots & \vdots & & \vdots & & \vdots \\ x_{N1} & x_{N2} & \dots & x_{Nk} & \dots & x_{NK} \end{pmatrix}$$
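As a small illustration (a sketch with made-up values, assuming NumPy is available), the $N \times K$ data matrix $\mathbf{X}$ can be assembled by stacking the column vectors $\mathbf{x}_k$ side by side:

```python
import numpy as np

# Hypothetical data: N = 4 observations on K = 2 explanatory variables
x1 = np.array([1.0, 2.0, 3.0, 4.0])   # observations on variable x_1
x2 = np.array([5.0, 6.0, 7.0, 8.0])   # observations on variable x_2

# Stack the K column vectors side by side into the N x K data matrix X
X = np.column_stack([x1, x2])
print(X.shape)  # (4, 2): N rows (units), K columns (variables)
```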
Fact: in most cases, the first column of $\mathbf{X}$ is assumed to be a column of 1s, so that $\beta_1$ is the constant term in the model:
$$\underset{N \times 1}{\mathbf{x}_1} = \underset{N \times 1}{\mathbf{1}} \qquad \underset{N \times K}{\mathbf{X}} = \begin{pmatrix} 1 & x_{12} & \dots & x_{1k} & \dots & x_{1K} \\ 1 & x_{22} & \dots & x_{2k} & \dots & x_{2K} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & x_{i2} & \dots & x_{ik} & \dots & x_{iK} \\ \vdots & \vdots & & \vdots & & \vdots \\ 1 & x_{N2} & \dots & x_{Nk} & \dots & x_{NK} \end{pmatrix}$$
Remark: more generally, the matrix $\mathbf{X}$ may contain both stochastic and non-stochastic elements, such as:

- a constant;
- a time trend;
- dummy variables (for specific episodes in time);
- etc.

Therefore, $\mathbf{X}$ is generally a mixture of fixed and random variables.
**Definition (Simple linear regression model)**
The simple linear regression model is a model with only one stochastic regressor: $K = 1$ if there is no constant,
$$y_i = \beta_1 x_i + \varepsilon_i$$
or $K = 2$ if there is a constant:
$$y_i = \beta_1 + \beta_2 x_{i2} + \varepsilon_i$$
for $i = 1, \dots, N$, or in vector form
$$\mathbf{y} = \beta_1 + \beta_2 \mathbf{x}_2 + \varepsilon$$
**Definition (Multiple linear regression model)**
The multiple linear regression model can be written
$$\underset{N \times 1}{\mathbf{y}} = \underset{N \times K}{\mathbf{X}} \, \underset{K \times 1}{\beta} + \underset{N \times 1}{\varepsilon}$$
One key difference for the specification of the MLRM: parametric versus semi-parametric specification.

- Parametric model: the distribution of the error terms is fully characterized, e.g. $\varepsilon \sim \mathcal{N}(0, \Omega)$.
- Semi-parametric specification: only a few moments of the error terms are specified, e.g. $E(\varepsilon) = 0$ and $V(\varepsilon) = E(\varepsilon \varepsilon^\top) = \Omega$.
This difference does not matter for the derivation of the ordinary least squares estimator. But it matters for (among others):

1. The characterization of the statistical properties of the OLS estimator (e.g., efficiency);
2. The choice of alternative estimators (e.g., the maximum likelihood estimator);
3. Etc.
**Definition (Semi-parametric multiple linear regression model)**
The semi-parametric multiple linear regression model is defined by
$$\mathbf{y} = \mathbf{X}\beta + \varepsilon$$
where the error term $\varepsilon$ satisfies
$$E(\varepsilon \mid \mathbf{X}) = \mathbf{0}_{N \times 1} \qquad V(\varepsilon \mid \mathbf{X}) = \sigma^2 \mathbf{I}_N$$
and $\mathbf{I}_N$ is the identity matrix of order $N$.
Remarks

1. If the matrix $\mathbf{X}$ is non-stochastic (fixed), i.e. there are only fixed regressors, then the conditions on the error term $\varepsilon$ read:
$$E(\varepsilon) = 0 \qquad V(\varepsilon) = \sigma^2 \mathbf{I}_N$$
2. If the (conditional) variance-covariance matrix of $\varepsilon$ is not diagonal, i.e. if
$$V(\varepsilon \mid \mathbf{X}) = \Omega$$
the model is called the multiple generalized linear regression model.
Remarks (cont'd)

The two conditions on the error term $\varepsilon$
$$E(\varepsilon \mid \mathbf{X}) = \mathbf{0}_{N \times 1} \qquad V(\varepsilon \mid \mathbf{X}) = \sigma^2 \mathbf{I}_N$$
are equivalent to
$$E(\mathbf{y} \mid \mathbf{X}) = \mathbf{X}\beta \qquad V(\mathbf{y} \mid \mathbf{X}) = \sigma^2 \mathbf{I}_N$$
**Definition (The multiple linear Gaussian model)**
The (parametric) multiple linear Gaussian model is defined by
$$\mathbf{y} = \mathbf{X}\beta + \varepsilon$$
where the error term $\varepsilon$ is normally distributed:
$$\varepsilon \sim \mathcal{N}\left(\mathbf{0}, \sigma^2 \mathbf{I}_N\right)$$
As a consequence, the vector $\mathbf{y}$ has a conditional normal distribution:
$$\mathbf{y} \mid \mathbf{X} \sim \mathcal{N}\left(\mathbf{X}\beta, \sigma^2 \mathbf{I}_N\right)$$
Remarks

1. The multiple linear Gaussian model is (by definition) a parametric model.
2. If the matrix $\mathbf{X}$ is non-stochastic (fixed), i.e. there are only fixed regressors, then the vector $\mathbf{y}$ has a marginal normal distribution:
$$\mathbf{y} \sim \mathcal{N}\left(\mathbf{X}\beta, \sigma^2 \mathbf{I}_N\right)$$
The classical linear regression model consists of a set of assumptions that describe how the data set is produced by a data generating process (DGP):

- Assumption 1: Linearity
- Assumption 2: Full rank condition or identification
- Assumption 3: Exogeneity
- Assumption 4: Spherical error terms
- Assumption 5: Data generation
- Assumption 6: Normal distribution
**Definition (Assumption 1: Linearity)**
The model is linear with respect to the parameters $\beta_1, \dots, \beta_K$.
Remarks

The model specifies a linear relationship between the dependent variable and the regressors. For instance, the models
$$y = \beta_0 + \beta_1 x + u \qquad y = \beta_0 + \beta_1 \cos(x) + v \qquad y = \beta_0 + \beta_1 \frac{1}{x} + w$$
are all linear (with respect to $\beta$). In contrast, the model $y = \beta_0 + \beta_1 x^{\beta_2} + \varepsilon$ is nonlinear.
Remark: the model can be linear after some transformations. Starting from $y = A x^{\beta} \exp(\varepsilon)$, one has a log-linear specification:
$$\ln(y) = \ln(A) + \beta \ln(x) + \varepsilon$$
**Definition (Log-linear model)**
The log-linear model is
$$\ln(y_i) = \beta_1 \ln(x_{i1}) + \beta_2 \ln(x_{i2}) + \dots + \beta_K \ln(x_{iK}) + \varepsilon_i$$
This equation is also known as the constant elasticity form, since in this equation the elasticity of $y$ with respect to changes in $x$ does not vary with $x_{ik}$:
$$\beta_k = \frac{\partial \ln(y_i)}{\partial \ln(x_{ik})} = \frac{\partial y_i}{\partial x_{ik}} \cdot \frac{x_{ik}}{y_i}$$
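A quick numerical sanity check of the constant-elasticity property (a sketch with made-up numbers, one regressor, $A = 1$ and $\varepsilon = 0$): scaling $x$ changes $\ln(y)$ by exactly $\beta$ times the change in $\ln(x)$.

```python
import numpy as np

beta = 0.7           # hypothetical elasticity
x = 10.0
y = x ** beta        # log-linear model with A = 1 and epsilon = 0

# Increase x by 1% and measure the relative (log) change in y
x_new = 1.01 * x
y_new = x_new ** beta
elasticity = (np.log(y_new) - np.log(y)) / (np.log(x_new) - np.log(x))
print(elasticity)    # equals beta: d ln(y) / d ln(x) = beta
```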
**Definition (Assumption 2: Full column rank)**
$\mathbf{X}$ is an $N \times K$ matrix with rank $K$.
Interpretation

1. There is no exact relationship among any of the independent variables in the model.
2. The columns of $\mathbf{X}$ are linearly independent.
Example: suppose that a cross-section model satisfies
$$y_i = \beta_0 + \beta_1 \,\text{non-labor income}_i + \beta_2 \,\text{salary}_i + \beta_3 \,\text{total income}_i + \varepsilon_i$$
The identification condition does not hold, since total income is exactly equal to salary plus non-labor income (exact linear dependency in the model).
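The exact linear dependency can be seen numerically (a sketch with fabricated income data): the column "total income = salary + non-labor income" drops the rank of $\mathbf{X}$ below $K$, so $\mathbf{X}^\top \mathbf{X}$ is singular.

```python
import numpy as np

# Fabricated data for 5 individuals
non_labor = np.array([1.0, 0.5, 2.0, 0.0, 1.5])
salary    = np.array([30.0, 45.0, 25.0, 50.0, 40.0])
total     = non_labor + salary          # exact linear dependency

# Design matrix with a constant: N = 5, K = 4
X = np.column_stack([np.ones(5), non_labor, salary, total])

print(np.linalg.matrix_rank(X))         # 3 < K = 4: identification fails
print(np.linalg.det(X.T @ X))           # (numerically) zero: X'X is not invertible
```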
Remarks

1. Perfect multicollinearity is generally not difficult to spot and is signalled by most statistical software.
2. Imperfect multicollinearity is a more serious issue (see further).
**Definition (Identification)**
The multiple linear regression model is identifiable if and only if one of the following equivalent assertions holds:

(i) $\operatorname{rank}(\mathbf{X}) = K$
(ii) The matrix $\mathbf{X}^\top \mathbf{X}$ is invertible
(iii) The columns of $\mathbf{X}$ form a basis of $\mathcal{L}(\mathbf{X})$
(iv) $\mathbf{X}\beta_1 = \mathbf{X}\beta_2 \implies \beta_1 = \beta_2, \; \forall (\beta_1, \beta_2) \in \mathbb{R}^K \times \mathbb{R}^K$
(v) $\mathbf{X}\beta = 0 \implies \beta = 0, \; \forall \beta \in \mathbb{R}^K$
(vi) $\ker(\mathbf{X}) = \{0\}$
**Definition (Assumption 3: Strict exogeneity of the regressors)**
The regressors are exogenous in the sense that
$$E(\varepsilon \mid \mathbf{X}) = \mathbf{0}_{N \times 1}$$
or equivalently, for all units $i \in \{1, \dots, N\}$,
$$E(\varepsilon_i \mid \mathbf{X}) = 0$$
or equivalently
$$E(\varepsilon_i \mid x_{jk}) = 0$$
for any explanatory variable $k \in \{1, \dots, K\}$ and any unit $j \in \{1, \dots, N\}$.
Remarks

1. The expected value of the error term at observation $i$ (in the sample) is not a function of the independent variables observed at any observation (including the $i$-th observation). The independent variables are not predictors of the error terms.
2. The strict exogeneity condition can be rewritten as:
$$E(\mathbf{y} \mid \mathbf{X}) = \mathbf{X}\beta$$
3. If the regressors are fixed, this condition can be rewritten as:
$$E(\varepsilon) = \mathbf{0}_{N \times 1}$$
Implications

The (strict) exogeneity condition $E(\varepsilon \mid \mathbf{X}) = \mathbf{0}_{N \times 1}$ has two implications:

1. The zero conditional mean of $\varepsilon$ implies that the unconditional mean of $\varepsilon$ is also zero (the reverse is not true):
$$E(\varepsilon) = E_{\mathbf{X}}\left(E(\varepsilon \mid \mathbf{X})\right) = E_{\mathbf{X}}(0) = 0$$
2. The zero conditional mean of $\varepsilon$ implies that (the reverse is not true):
$$E(\varepsilon_i x_{jk}) = 0 \quad \forall i, j, k \qquad \text{or} \qquad \operatorname{Cov}(\varepsilon_i, \mathbf{X}) = 0 \quad \forall i$$
**Definition (Assumption 4: Spherical disturbances)**
The error terms are such that
$$V(\varepsilon_i \mid \mathbf{X}) = E\left(\varepsilon_i^2 \mid \mathbf{X}\right) = \sigma^2 \quad \text{for all } i \in \{1, \dots, N\}$$
and
$$\operatorname{Cov}(\varepsilon_i, \varepsilon_j \mid \mathbf{X}) = E(\varepsilon_i \varepsilon_j \mid \mathbf{X}) = 0 \quad \text{for all } i \neq j$$
The condition of constant variances is called homoscedasticity. The uncorrelatedness across observations is called nonautocorrelation.
Remarks

1. Spherical disturbances = homoscedasticity + nonautocorrelation.
2. If the errors are not spherical, we call them nonspherical disturbances.
3. The assumption of homoscedasticity is a strong one: it is the exception rather than the rule!
Let us consider the (conditional) variance-covariance matrix of the error terms:
$$\underset{N \times N}{V(\varepsilon \mid \mathbf{X})} = \underset{N \times N}{E\left(\varepsilon \varepsilon^\top \mid \mathbf{X}\right)} = \begin{pmatrix} E\left(\varepsilon_1^2 \mid \mathbf{X}\right) & E(\varepsilon_1 \varepsilon_2 \mid \mathbf{X}) & \dots & E(\varepsilon_1 \varepsilon_N \mid \mathbf{X}) \\ E(\varepsilon_2 \varepsilon_1 \mid \mathbf{X}) & E\left(\varepsilon_2^2 \mid \mathbf{X}\right) & \dots & E(\varepsilon_2 \varepsilon_N \mid \mathbf{X}) \\ \vdots & \vdots & \ddots & \vdots \\ E(\varepsilon_N \varepsilon_1 \mid \mathbf{X}) & E(\varepsilon_N \varepsilon_2 \mid \mathbf{X}) & \dots & E\left(\varepsilon_N^2 \mid \mathbf{X}\right) \end{pmatrix}$$
The two assumptions (homoscedasticity and nonautocorrelation) imply that:
$$\underset{N \times N}{V(\varepsilon \mid \mathbf{X})} = \underset{N \times N}{E\left(\varepsilon \varepsilon^\top \mid \mathbf{X}\right)} = \sigma^2 \mathbf{I}_N = \begin{pmatrix} \sigma^2 & 0 & \dots & 0 \\ 0 & \sigma^2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \sigma^2 \end{pmatrix}$$
**Definition (Assumption 5: Data generation)**
The data in $(x_{i1}, x_{i2}, \dots, x_{iK})$ may be any mixture of constants and random variables.
Remarks

1. Analysis will be done conditionally on the observed $\mathbf{X}$, so whether the elements in $\mathbf{X}$ are fixed constants or random draws from a stochastic process will not influence the results.
2. In the case of stochastic regressors, the unconditional statistical properties of the estimators are obtained in two steps: (1) using the result conditioned on $\mathbf{X}$, and (2) finding the unconditional result by "averaging" (i.e., integrating over) the conditional distributions.
An assumption regarding how $(x_{i1}, x_{i2}, \dots, x_{iK}, y_i)$ for $i = 1, \dots, N$ is drawn is also required; this is a statement about how the sample is generated. In the sequel, we assume that the $(x_{i1}, x_{i2}, \dots, x_{iK}, y_i)$, for $i = 1, \dots, N$, are independently and identically distributed (i.i.d.): the observations are drawn by simple random sampling from a large population.
**Definition (Assumption 6: Normal distribution)**
The disturbances are normally distributed:
$$\varepsilon_i \mid \mathbf{X} \sim \mathcal{N}\left(0, \sigma^2\right)$$
or equivalently
$$\varepsilon \mid \mathbf{X} \sim \mathcal{N}\left(\mathbf{0}_{N \times 1}, \sigma^2 \mathbf{I}_N\right)$$
Remarks

1. Once again, this is a convenience that we will dispense with after some analysis of its implications.
2. Normality is not necessary to obtain many of the results presented below.
3. Assumption 6 implies assumptions 3 (exogeneity) and 4 (spherical disturbances).
Summary: the main assumptions of the multiple linear regression model.

| Assumption | Statement |
| --- | --- |
| A1: linearity | The model is linear in $\beta$ |
| A2: identification | $\mathbf{X}$ is an $N \times K$ matrix with rank $K$ |
| A3: exogeneity | $E(\varepsilon \mid \mathbf{X}) = \mathbf{0}_{N \times 1}$ |
| A4: spherical error terms | $V(\varepsilon \mid \mathbf{X}) = \sigma^2 \mathbf{I}_N$ |
| A5: data generation | $\mathbf{X}$ may be fixed or random |
| A6: normal distribution | $\varepsilon \mid \mathbf{X} \sim \mathcal{N}\left(\mathbf{0}_{N \times 1}, \sigma^2 \mathbf{I}_N\right)$ |
Key Concepts

1. Simple linear regression model
2. Multiple linear regression model
3. Semi-parametric multiple linear regression model
4. Multiple linear Gaussian model
5. Assumptions of the multiple linear regression model
6. Linearity (A1), identification (A2), exogeneity (A3), spherical error terms (A4), data generation (A5) and normal distribution (A6)
## Section 3: The ordinary least squares estimator
Introduction

1. The linear regression model assumes that the following specification is true in the population:
$$\mathbf{y} = \mathbf{X}\beta + \varepsilon$$
where other unobserved factors determining $\mathbf{y}$ are captured by the error term $\varepsilon$.
2. Consider a sample $\{x_{i1}, x_{i2}, \dots, x_{iK}, y_i\}_{i=1}^N$ of i.i.d. random variables (be careful about the change of notation here) and only one realization of this sample (your data set).
3. How can we estimate the vector of parameters $\beta$?
Introduction (cont'd)

1. If we assume that assumptions A1-A6 hold, we have a multiple linear Gaussian model (parametric model), and a solution is to use the MLE. The ML estimator for $\beta$ coincides with the ordinary least squares (OLS) estimator (cf. chapter 2).
2. If we assume that only assumptions A1-A5 hold, we have a semi-parametric multiple linear regression model, and the MLE is unfeasible.
3. In this case, the only solution is to use the ordinary least squares (OLS) estimator.
Intuition

Let us consider the simple linear regression model and, for simplicity, denote $x_i = x_{i2}$:
$$y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$$
The general idea of OLS consists in minimizing the "distance" between the points $(x_i, y_i)$ and the regression line
$$\hat{y}_i = \hat{\beta}_1 + \hat{\beta}_2 x_i$$
i.e. the points $(x_i, \hat{y}_i)$, for all $i = 1, \dots, N$.
Intuition (cont'd)

Estimates of $\beta_1$ and $\beta_2$ are chosen by minimizing the sum of squared residuals (SSR):
$$\sum_{i=1}^N \hat{\varepsilon}_i^2 = \sum_{i=1}^N \left( y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i \right)^2$$
Therefore, $\hat{\beta}_1$ and $\hat{\beta}_2$ are the solutions of the minimization problem
$$\left( \hat{\beta}_1, \hat{\beta}_2 \right) = \arg\min_{(\beta_1, \beta_2)} \sum_{i=1}^N \left( y_i - \beta_1 - \beta_2 x_i \right)^2$$
**Definition (OLS, simple linear regression model)**
In the simple linear regression model $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$, the OLS estimators $\hat{\beta}_1$ and $\hat{\beta}_2$ are the solutions of the minimization problem
$$\left( \hat{\beta}_1, \hat{\beta}_2 \right) = \arg\min_{(\beta_1, \beta_2)} \sum_{i=1}^N \left( y_i - \beta_1 - \beta_2 x_i \right)^2$$
The solutions are:
$$\hat{\beta}_1 = \bar{y}_N - \hat{\beta}_2 \bar{x}_N \qquad \hat{\beta}_2 = \frac{\sum_{i=1}^N (x_i - \bar{x}_N)(y_i - \bar{y}_N)}{\sum_{i=1}^N (x_i - \bar{x}_N)^2}$$
where $\bar{y}_N = N^{-1} \sum_{i=1}^N y_i$ and $\bar{x}_N = N^{-1} \sum_{i=1}^N x_i$ respectively denote the sample means of the dependent variable $y$ and the regressor $x$.
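The closed-form solutions above translate directly into code (a minimal sketch on simulated data; the true values $\beta_1 = 1$, $\beta_2 = 2$ and the noise level are made up):

```python
import numpy as np

# Simulated sample: y_i = 1 + 2 x_i + noise
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=100)

x_bar, y_bar = x.mean(), y.mean()

# beta2_hat = sum (x_i - x_bar)(y_i - y_bar) / sum (x_i - x_bar)^2
beta2_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
# beta1_hat = y_bar - beta2_hat * x_bar
beta1_hat = y_bar - beta2_hat * x_bar

print(beta1_hat, beta2_hat)  # close to the true values (1, 2)
```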
Remark: the OLS estimator is a linear estimator (cf. chapter 1), since it can be expressed as a linear function of the observations $y_i$:
$$\hat{\beta}_2 = \sum_{i=1}^N \omega_i y_i \qquad \text{with} \qquad \omega_i = \frac{x_i - \bar{x}_N}{\sum_{i=1}^N (x_i - \bar{x}_N)^2}$$
in the case where $\bar{y}_N = 0$.
**Definition (Fitted value)**
The predicted or fitted value for observation $i$ is
$$\hat{y}_i = \hat{\beta}_1 + \hat{\beta}_2 x_i$$
with a sample mean equal to the sample average of the observations:
$$\bar{\hat{y}}_N = \frac{1}{N} \sum_{i=1}^N \hat{y}_i = \bar{y}_N = \frac{1}{N} \sum_{i=1}^N y_i$$
**Definition (Fitted residual)**
The residual for observation $i$ is
$$\hat{\varepsilon}_i = y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i$$
with a sample mean equal to zero by definition:
$$\bar{\hat{\varepsilon}}_N = \frac{1}{N} \sum_{i=1}^N \hat{\varepsilon}_i = 0$$
Remarks

1. The fit of the regression is "good" if the sum $\sum_{i=1}^N \hat{\varepsilon}_i^2$ (or SSR) is "small", i.e., the unexplained part of the variance of $y$ is "small".
2. The coefficient of determination, or $R^2$, is given by:
$$R^2 = \frac{\sum_{i=1}^N (\hat{y}_i - \bar{y}_N)^2}{\sum_{i=1}^N (y_i - \bar{y}_N)^2} = 1 - \frac{\sum_{i=1}^N \hat{\varepsilon}_i^2}{\sum_{i=1}^N (y_i - \bar{y}_N)^2}$$
Orthogonality conditions

Under assumption A3 (strict exogeneity), we have $E(\varepsilon_i \mid x_i) = 0$. This condition implies that:
$$E(\varepsilon_i) = 0 \qquad E(\varepsilon_i x_i) = 0$$
Using the sample analogs of these moment conditions (cf. chapter 6, GMM), one has:
$$\frac{1}{N} \sum_{i=1}^N \left( y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i \right) = 0 \qquad \frac{1}{N} \sum_{i=1}^N \left( y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i \right) x_i = 0$$
This is a system of two equations in the two unknowns $\hat{\beta}_1$ and $\hat{\beta}_2$.
**Definition (Orthogonality conditions)**
The ordinary least squares estimator can be defined from the sample analogs of the following two moment conditions:
$$E(\varepsilon_i) = 0 \qquad E(\varepsilon_i x_i) = 0$$
The corresponding system of equations is just-identified.
OLS and the multiple linear regression model

Now consider the multiple linear regression model
$$\mathbf{y} = \mathbf{X}\beta + \varepsilon \qquad \text{or} \qquad y_i = \sum_{k=1}^K \beta_k x_{ik} + \varepsilon_i$$
Objective: find an estimator (estimate) of $\beta_1, \beta_2, \dots, \beta_K$ and $\sigma^2$ under assumptions A1-A5.
Different methods:

1. Minimize the sum of squared residuals (SSR).
2. Solve the same minimization problem in matrix notation.
3. Use moment conditions.
4. Geometric interpretation.
1. Minimize the sum of squared residuals (SSR)

As in the simple linear regression,
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^N \varepsilon_i^2 = \arg\min_{\beta} \sum_{i=1}^N \left( y_i - \sum_{k=1}^K \beta_k x_{ik} \right)^2$$
One can derive the first-order conditions with respect to $\beta_k$ for $k = 1, \dots, K$ and solve a system of $K$ equations in $K$ unknowns.
2. Using matrix notation

**Definition (OLS and the multiple linear regression model)**
In the multiple linear regression model $y_i = \mathbf{x}_i^\top \beta + \varepsilon_i$, with $\mathbf{x}_i = (x_{i1}, \dots, x_{iK})^\top$, the OLS estimator $\hat{\beta}$ is the solution of
$$\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^N \left( y_i - \mathbf{x}_i^\top \beta \right)^2$$
The OLS estimator of $\beta$ is:
$$\hat{\beta} = \left( \sum_{i=1}^N \mathbf{x}_i \mathbf{x}_i^\top \right)^{-1} \left( \sum_{i=1}^N \mathbf{x}_i y_i \right)$$
**Definition (Normal equations)**
Under suitable regularity conditions, in the multiple linear regression model $y_i = \mathbf{x}_i^\top \beta + \varepsilon_i$, with $\mathbf{x}_i = (x_{i1} : \dots : x_{iK})^\top$, the normal equations are
$$\sum_{i=1}^N \mathbf{x}_i \left( y_i - \mathbf{x}_i^\top \hat{\beta} \right) = \mathbf{0}_{K \times 1}$$
**Definition (OLS and the multiple linear regression model)**
In the multiple linear regression model $\mathbf{y} = \mathbf{X}\beta + \varepsilon$, the OLS estimator $\hat{\beta}$ is the solution of the minimization problem
$$\hat{\beta} = \arg\min_{\beta} \varepsilon^\top \varepsilon = \arg\min_{\beta} \left( \mathbf{y} - \mathbf{X}\beta \right)^\top \left( \mathbf{y} - \mathbf{X}\beta \right)$$
The OLS estimator of $\beta$ is:
$$\hat{\beta} = \left( \mathbf{X}^\top \mathbf{X} \right)^{-1} \mathbf{X}^\top \mathbf{y}$$
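The matrix formula $\hat{\beta} = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}$ is a one-liner (a sketch on simulated data with made-up true parameters; solving the normal equations with `np.linalg.solve` is numerically preferable to forming the explicit inverse):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 200, 3
# Constant plus 2 random regressors: an N x K design matrix
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + 0.1 * rng.normal(size=N)

# OLS: solve the normal equations (X'X) beta = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to (1, 2, -0.5)
```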
**Definition**
The ordinary least squares estimator $\hat{\beta}$ of $\beta$ minimizes the following criterion:
$$s(\beta) = \left\| \mathbf{y} - \mathbf{X}\beta \right\|_{\mathbf{I}_N}^2 = \left( \mathbf{y} - \mathbf{X}\beta \right)^\top \left( \mathbf{y} - \mathbf{X}\beta \right)$$
The first-order conditions (normal equations) are:
$$\left. \frac{\partial s(\beta)}{\partial \beta} \right|_{\hat{\beta}} = -2 \underset{K \times N}{\mathbf{X}^\top} \underset{N \times 1}{\left( \mathbf{y} - \mathbf{X}\hat{\beta} \right)} = \mathbf{0}_{K \times 1}$$
The second-order conditions hold:
$$\left. \frac{\partial^2 s(\beta)}{\partial \beta \, \partial \beta^\top} \right|_{\hat{\beta}} = 2 \underset{K \times K}{\mathbf{X}^\top \mathbf{X}} \text{ is positive definite}$$
since, under the full rank condition, $\mathbf{X}^\top \mathbf{X}$ is a positive definite matrix. We therefore have a minimum.
**Definition (Normal equations)**
Under suitable regularity conditions, in the multiple linear regression model $\mathbf{y} = \mathbf{X}\beta + \varepsilon$, the normal equations are given by:
$$\underset{K \times N}{\mathbf{X}^\top} \underset{N \times 1}{\left( \mathbf{y} - \mathbf{X}\hat{\beta} \right)} = \mathbf{0}_{K \times 1}$$
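The normal equations can be verified numerically: the OLS residual vector is orthogonal to every column of $\mathbf{X}$, up to floating-point error (a sketch on simulated data):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
residuals = y - X @ beta_hat

# Normal equations: X'(y - X beta_hat) = 0 (up to floating-point error)
print(X.T @ residuals)  # approximately the zero vector
```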
**Definition (Unbiased variance estimator)**
In the multiple linear regression model $\mathbf{y} = \mathbf{X}\beta + \varepsilon$, the unbiased estimator of $\sigma^2$ is given by:
$$\hat{\sigma}^2 = \frac{1}{N - K} \sum_{i=1}^N \hat{\varepsilon}_i^2 = \frac{\text{SSR}}{N - K}$$
The estimator $\hat{\sigma}^2$ can also be written as:
$$\hat{\sigma}^2 = \frac{1}{N - K} \sum_{i=1}^N \left( y_i - \mathbf{x}_i^\top \hat{\beta} \right)^2 = \frac{\left( \mathbf{y} - \mathbf{X}\hat{\beta} \right)^\top \left( \mathbf{y} - \mathbf{X}\hat{\beta} \right)}{N - K} = \frac{\left\| \mathbf{y} - \mathbf{X}\hat{\beta} \right\|_{\mathbf{I}_N}^2}{N - K}$$
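The unbiased variance estimator divides the SSR by the degrees of freedom $N - K$ (a sketch on simulated data; the true $\sigma = 0.5$, so $\sigma^2 = 0.25$, is made up):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 500, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 2.0]) + 0.5 * rng.normal(size=N)  # sigma = 0.5

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
residuals = y - X @ beta_hat

# sigma2_hat = SSR / (N - K), the unbiased estimator of sigma^2
sigma2_hat = (residuals @ residuals) / (N - K)
print(sigma2_hat)  # close to sigma^2 = 0.25
```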
3. Using moment conditions

Under assumption A3 (strict exogeneity), we have $E(\varepsilon \mid \mathbf{X}) = 0$. This condition implies:
$$E(\varepsilon_i \mathbf{x}_i) = \mathbf{0}_{K \times 1}$$
with $\mathbf{x}_i = (x_{i1} : \dots : x_{iK})^\top$. Using the sample analogs, one has:
$$\frac{1}{N} \sum_{i=1}^N \mathbf{x}_i \left( y_i - \mathbf{x}_i^\top \hat{\beta} \right) = \mathbf{0}_{K \times 1}$$
We have $K$ (normal) equations in the $K$ unknown parameters $\hat{\beta}_1, \dots, \hat{\beta}_K$: the system is just-identified.
4. Geometric interpretation

1. The ordinary least squares estimation method consists in determining $\hat{\mathbf{y}}$, the vector closest to $\mathbf{y}$ (in a certain space...), such that the squared norm between $\mathbf{y}$ and $\hat{\mathbf{y}}$ is minimized.
2. Finding $\hat{\mathbf{y}}$ is equivalent to finding an estimator of $\beta$.
**Definition (Geometric interpretation)**
The fitted vector $\hat{\mathbf{y}}$ is the (orthogonal) projection of $\mathbf{y}$ onto the column space of $\mathbf{X}$. The fitted error term $\hat{\varepsilon}$ is the projection of $\mathbf{y}$ onto the orthogonal complement of the column space of $\mathbf{X}$. The vectors $\hat{\mathbf{y}}$ and $\hat{\varepsilon}$ are orthogonal.
[Figure: orthogonal projection of $\mathbf{y}$ onto the column space of $\mathbf{X}$. Source: F. Pelgrin (2010), Lecture Notes, Advanced Econometrics.]
**Definition (Projection matrices)**
The vectors $\hat{\mathbf{y}}$ and $\hat{\varepsilon}$ are defined by
$$\hat{\mathbf{y}} = \mathbf{P}\mathbf{y} \qquad \hat{\varepsilon} = \mathbf{M}\mathbf{y}$$
where $\mathbf{P}$ and $\mathbf{M}$ denote the two following projection matrices:
$$\mathbf{P} = \mathbf{X} \left( \mathbf{X}^\top \mathbf{X} \right)^{-1} \mathbf{X}^\top \qquad \mathbf{M} = \mathbf{I}_N - \mathbf{P} = \mathbf{I}_N - \mathbf{X} \left( \mathbf{X}^\top \mathbf{X} \right)^{-1} \mathbf{X}^\top$$
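The algebraic properties of $\mathbf{P}$ and $\mathbf{M}$ are easy to confirm numerically (a sketch on random data): both are idempotent, their product is zero, and $\hat{\mathbf{y}} = \mathbf{P}\mathbf{y}$ and $\hat{\varepsilon} = \mathbf{M}\mathbf{y}$ decompose $\mathbf{y}$ into two orthogonal pieces.

```python
import numpy as np

rng = np.random.default_rng(4)
N, K = 50, 3
X = rng.normal(size=(N, K))
y = rng.normal(size=N)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto the column space of X
M = np.eye(N) - P                       # projection onto its orthogonal complement

y_hat = P @ y                           # fitted values
eps_hat = M @ y                         # fitted residuals

print(np.allclose(P @ P, P))            # True: P is idempotent
print(np.allclose(P @ M, np.zeros((N, N))))  # True: the two projections are orthogonal
print(np.allclose(y_hat + eps_hat, y))  # True: y decomposes as y_hat + eps_hat
print(abs(y_hat @ eps_hat) < 1e-6)      # True: y_hat and eps_hat are orthogonal
```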
3. The ordinary least squares estimator

Other geometric interpretations:
Suppose that there is a constant term in the model.
1 The least squares residuals sum to zero: ∑_{i=1}^N ε̂ᵢ = 0.
2 The regression hyperplane passes through the point of means of the data (x̄_N, ȳ_N).
3 The mean of the fitted (adjusted) values of y equals the mean of the actual values of y: (1/N) ∑_{i=1}^N ŷᵢ = ȳ_N.
3. The ordinary least squares estimator

Definition (Coefficient of determination)
The coefficient of determination of the multiple linear regression model (with a constant term) is the ratio of the total (empirical) variance explained by the model to the total (empirical) variance of y:

R² = ∑_{i=1}^N (ŷᵢ − ȳ_N)² / ∑_{i=1}^N (yᵢ − ȳ_N)² = 1 − ∑_{i=1}^N ε̂ᵢ² / ∑_{i=1}^N (yᵢ − ȳ_N)²
3. The ordinary least squares estimator

Remark
1 The coefficient of determination measures the proportion of the total variance (or variability) in y that is accounted for by variation in the regressors (or the model).
2 Problem: the R² automatically and spuriously increases when extra explanatory variables are added to the model.
3. The ordinary least squares estimator

The adjusted R-squared coefficient is defined to be:

R̄² = 1 − (N − 1)/(N − p − 1) × (1 − R²)

where p denotes the number of regressors (not counting the constant term, i.e., p = K − 1 if there is a constant or p = K otherwise).
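Both R² and the adjusted R² are easy to compute from the residuals. A minimal sketch on simulated data (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 100, 3                       # constant + p = K - 1 regressors
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat
eps_hat = y - y_hat                 # residuals sum to zero (constant term included)

ssr = eps_hat @ eps_hat
tss = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ssr / tss
p = K - 1                           # regressors, not counting the constant
r2_adj = 1.0 - (N - 1) / (N - p - 1) * (1.0 - r2)
```
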
3. The ordinary least squares estimator

Remark
One can show that:
1 R̄² < R²
2 If N is large, R̄² ≃ R²
3 The adjusted R-squared R̄² can be negative.
3. The ordinary least squares estimator

Key Concepts
1 OLS estimator and estimate
2 Fitted or predicted value
3 Residual or fitted residual
4 Orthogonality conditions
5 Normal equations
6 Geometric interpretations of the OLS
7 Coefficient of determination and adjusted R-squared
Section 4
Statistical properties of the OLS estimator
4. Statistical properties of the OLS estimator

In order to study the statistical properties of the OLS estimator, we have to distinguish (cf. chapter 1):
1 The finite sample properties
2 The large sample or asymptotic properties
4. Statistical properties of the OLS estimator

But we also have to distinguish the properties according to the assumptions made on the linear regression model:
1 Semi-parametric linear regression model (the exact distribution of ε is unknown) versus parametric linear regression model (and especially the Gaussian linear regression model, assumption A6).
2 X is a matrix of random regressors versus X is a matrix of fixed regressors.
4. Statistical properties of the OLS estimator

Fact (Assumptions)
In the rest of this section, we assume that assumptions A1-A5 hold.
A1 (linearity): the model is linear in β.
A2 (identification): X is an N × K matrix with rank K.
A3 (exogeneity): E(ε ∣ X) = 0_{N×1}.
A4 (spherical error terms): V(ε ∣ X) = σ² I_N.
A5 (data generation): X may be fixed or random.
Subsection 4.1.
Finite sample properties of the OLS estimator
4.1. Finite sample properties

Objectives
The objectives of this subsection are the following:
1 Compute the first two moments of the (unknown) finite sample distribution of the OLS estimators β̂ and σ̂².
2 Determine the finite sample distribution of the OLS estimators β̂ and σ̂² under particular assumptions (A6).
3 Determine whether the OLS estimators are "good": efficient estimator versus BLUE.
4 Introduce the Gauss-Markov theorem.
4.1. Finite sample properties
First moments of the OLS estimators
4.1. Finite sample properties

Moments
In a first step, we will derive the first moments of the OLS estimators:
Step 1: compute E(β̂) and V(β̂)
Step 2: compute E(σ̂²) and V(σ̂²)
4.1. Finite sample properties

Definition (Unbiased estimator)
In the multiple linear regression model y = Xβ₀ + ε, under assumption A3 (strict exogeneity), the OLS estimator β̂ is unbiased:

E(β̂) = β₀

where β₀ denotes the true value of the vector of parameters. This result holds whether or not the matrix X is considered as random.
4.1. Finite sample properties

Proof
Case 1: fixed regressors (cf. chapter 1)

β̂ = (X⊤X)⁻¹ X⊤y = β₀ + (X⊤X)⁻¹ X⊤ε

So, if X is a matrix of fixed regressors:

E(β̂) = β₀ + (X⊤X)⁻¹ X⊤ E(ε)

Under assumption A3 (exogeneity), E(ε ∣ X) = E(ε) = 0. Then, we get:

E(β̂) = β₀

The OLS estimator is unbiased.
4.1. Finite sample properties

Proof (cont'd)
Case 2: random regressors

β̂ = (X⊤X)⁻¹ X⊤y = β₀ + (X⊤X)⁻¹ X⊤ε

If X includes some random elements:

E(β̂ ∣ X) = β₀ + (X⊤X)⁻¹ X⊤ E(ε ∣ X)

Under assumption A3 (exogeneity), E(ε ∣ X) = 0. Then, we get:

E(β̂ ∣ X) = β₀
4.1. Finite sample properties

Proof (cont'd)
Case 2: random regressors
The OLS estimator β̂ is conditionally unbiased:

E(β̂ ∣ X) = β₀

Besides, by the law of iterated expectations, we have:

E(β̂) = E_X[E(β̂ ∣ X)] = E_X(β₀) = β₀

where E_X denotes the expectation with respect to the distribution of X. So, the OLS estimator β̂ is unbiased:

E(β̂) = β₀
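Unbiasedness can be illustrated by Monte Carlo: averaging β̂ over many samples drawn with E(ε) = 0 should recover β₀. A sketch with a fixed design and simulated errors (illustrative values):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K, R = 50, 2, 5000
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # fixed design across replications
beta0 = np.array([1.0, 2.0])

estimates = np.empty((R, K))
for r in range(R):
    y = X @ beta0 + rng.normal(size=N)                 # errors with E(eps) = 0
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

mean_beta_hat = estimates.mean(axis=0)                 # close to beta0
```
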
4.1. Finite sample properties

Definition (Variance of the OLS estimator, non-stochastic regressors)
In the multiple linear regression model y = Xβ₀ + ε, if the matrix X is non-stochastic, the unconditional variance covariance matrix of the OLS estimator β̂ is:

V(β̂) = σ² (X⊤X)⁻¹
4.1. Finite sample properties

Proof

β̂ = (X⊤X)⁻¹ X⊤y = β₀ + (X⊤X)⁻¹ X⊤ε

So, if X is a matrix of fixed regressors:

V(β̂) = E[(β̂ − β₀)(β̂ − β₀)⊤]
      = E[(X⊤X)⁻¹ X⊤ εε⊤ X (X⊤X)⁻¹]
      = (X⊤X)⁻¹ X⊤ E(εε⊤) X (X⊤X)⁻¹
4.1. Finite sample properties

Proof (cont'd)
Under assumption A4 (spherical disturbances), we have:

V(ε) = E(εε⊤) = σ² I_N

The variance covariance matrix of the OLS estimator is then:

V(β̂) = (X⊤X)⁻¹ X⊤ E(εε⊤) X (X⊤X)⁻¹
      = (X⊤X)⁻¹ X⊤ (σ² I_N) X (X⊤X)⁻¹
      = σ² (X⊤X)⁻¹ (X⊤X) (X⊤X)⁻¹
      = σ² (X⊤X)⁻¹
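The formula V(β̂) = σ²(X⊤X)⁻¹ can be checked by simulation: the empirical covariance of β̂ across replications on a fixed design should match it. A sketch (illustrative values):

```python
import numpy as np

rng = np.random.default_rng(4)
N, R = 40, 20000
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # fixed design
beta0 = np.array([0.5, -1.0])
sigma2 = 2.0

V_theory = sigma2 * np.linalg.inv(X.T @ X)             # sigma^2 (X'X)^(-1)

draws = np.empty((R, 2))
for r in range(R):
    y = X @ beta0 + np.sqrt(sigma2) * rng.normal(size=N)
    draws[r] = np.linalg.solve(X.T @ X, X.T @ y)

V_empirical = np.cov(draws, rowvar=False)              # close to V_theory
```
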
4.1. Finite sample properties

Definition (Variance of the OLS estimator, stochastic regressors)
In the multiple linear regression model y = Xβ₀ + ε, if the matrix X is stochastic, the conditional variance covariance matrix of the OLS estimator β̂ is:

V(β̂ ∣ X) = σ² (X⊤X)⁻¹

The unconditional variance covariance matrix is equal to:

V(β̂) = σ² E_X[(X⊤X)⁻¹]

where E_X denotes the expectation with respect to the distribution of X.
4.1. Finite sample properties

Proof

β̂ = (X⊤X)⁻¹ X⊤y = β₀ + (X⊤X)⁻¹ X⊤ε

So, if X is a stochastic matrix:

V(β̂ ∣ X) = E[(β̂ − β₀)(β̂ − β₀)⊤ ∣ X]
          = E[(X⊤X)⁻¹ X⊤ εε⊤ X (X⊤X)⁻¹ ∣ X]
          = (X⊤X)⁻¹ X⊤ E(εε⊤ ∣ X) X (X⊤X)⁻¹
4.1. Finite sample properties

Proof (cont'd)
Under assumption A4 (spherical disturbances), we have:

V(ε ∣ X) = E(εε⊤ ∣ X) = σ² I_N

The conditional variance covariance matrix of the OLS estimator is then:

V(β̂ ∣ X) = (X⊤X)⁻¹ X⊤ E(εε⊤ ∣ X) X (X⊤X)⁻¹
          = (X⊤X)⁻¹ X⊤ (σ² I_N) X (X⊤X)⁻¹
          = σ² (X⊤X)⁻¹
4.1. Finite sample properties

Proof (cont'd)
We have:

V(β̂ ∣ X) = σ² (X⊤X)⁻¹

The (unconditional) variance covariance matrix of the OLS estimator is defined by:

V(β̂) = E_X[V(β̂ ∣ X)] = σ² E_X[(X⊤X)⁻¹]

(the term V_X[E(β̂ ∣ X)] of the variance decomposition vanishes, since E(β̂ ∣ X) = β₀ is constant). Here E_X denotes the expectation with respect to the distribution of X.
4.1. Finite sample properties

Summary

| | Case 1: X stochastic | Case 2: X nonstochastic |
|---|---|---|
| Mean | E(β̂) = β₀ | E(β̂) = β₀ |
| Variance | V(β̂) = σ² E_X[(X⊤X)⁻¹] | V(β̂) = σ² (X⊤X)⁻¹ |
| Cond. mean | E(β̂ ∣ X) = β₀ | — |
| Cond. var | V(β̂ ∣ X) = σ² (X⊤X)⁻¹ | — |
4.1. Finite sample properties

Question
How can we estimate the variance covariance matrix of the OLS estimator?

V(β̂) = σ² (X⊤X)⁻¹ if X is nonstochastic

V(β̂) = σ² E_X[(X⊤X)⁻¹] if X is stochastic
4.1. Finite sample properties

Question (cont'd)

Definition (Variance estimator)
An unbiased estimator of the variance covariance matrix of the OLS estimator is given by:

V̂(β̂) = σ̂² (X⊤X)⁻¹

where σ̂² = (N − K)⁻¹ ε̂⊤ε̂ is an unbiased estimator of σ². This result holds whether X is stochastic or nonstochastic.
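In practice the estimated covariance matrix and the usual standard errors follow directly from σ̂². A sketch on simulated data (illustrative values):

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 120, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, -0.7, 0.4]) + rng.normal(size=N)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
eps_hat = y - X @ beta_hat

sigma2_hat = eps_hat @ eps_hat / (N - K)   # unbiased estimator of sigma^2
V_hat = sigma2_hat * XtX_inv               # estimated covariance matrix of beta_hat
std_errors = np.sqrt(np.diag(V_hat))
```
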
4.1. Finite sample properties

Summary

| | Case 1: X stochastic | Case 2: X nonstochastic |
|---|---|---|
| Variance | V(β̂) = σ² E_X[(X⊤X)⁻¹] | V(β̂) = σ² (X⊤X)⁻¹ |
| Estimator | V̂(β̂) = σ̂² (X⊤X)⁻¹ | V̂(β̂) = σ̂² (X⊤X)⁻¹ |
4.1. Finite sample properties

Definition (Estimator of the variance of the disturbances)
Under assumptions A1-A5, in the multiple linear regression model y = Xβ₀ + ε, the estimator σ̂² is unbiased:

E(σ̂²) = σ²

where

σ̂² = (1/(N − K)) ∑_{i=1}^N ε̂ᵢ² = ε̂⊤ε̂ / (N − K)

This result holds whether or not the matrix X is considered as random.
4.1. Finite sample properties

Proof
We assume that X is stochastic. Let M denote the projection matrix ("residual maker") defined by:

M = I_N − X (X⊤X)⁻¹ X⊤

with

ε̂ = M y, where ε̂ is (N,1), M is (N,N) and y is (N,1).

The N × N matrix M satisfies the following properties:
1 If X is regressed on X, a perfect fit will result and the residuals will be zero, so M X = 0.
2 The matrix M is symmetric (M⊤ = M) and idempotent (M M = M).
4.1. Finite sample properties

Proof (cont'd)
The residuals are defined to be:

ε̂ = M y

Since y = Xβ + ε, we have:

ε̂ = M (Xβ + ε) = M Xβ + M ε

Since M X = 0, we have:

ε̂ = M ε
4.1. Finite sample properties

Proof (cont'd)
The estimator σ̂² is based on the sum of squared residuals (SSR):

σ̂² = ε̂⊤ε̂ / (N − K) = ε⊤ M ε / (N − K)

The expected value of the SSR is:

E(ε̂⊤ε̂ ∣ X) = E(ε⊤ M ε ∣ X)

The quantity ε⊤ M ε is a 1 × 1 scalar, so it is equal to its trace:

E(ε⊤ M ε ∣ X) = E(tr(ε⊤ M ε) ∣ X) = E(tr(M εε⊤) ∣ X)

since tr(AB) = tr(BA).
4.1. Finite sample properties

Proof (cont'd)
Since M = I_N − X (X⊤X)⁻¹ X⊤ depends only on X, we have:

E(ε̂⊤ε̂ ∣ X) = E(tr(M εε⊤) ∣ X) = tr(M E(εε⊤ ∣ X))

Under assumptions A3 and A4, we have:

E(εε⊤ ∣ X) = σ² I_N

As a consequence:

E(ε̂⊤ε̂ ∣ X) = tr(σ² M I_N) = σ² tr(M)
4.1. Finite sample properties

Proof (cont'd)

E(ε̂⊤ε̂ ∣ X) = σ² tr(M)
            = σ² tr(I_N − X (X⊤X)⁻¹ X⊤)
            = σ² tr(I_N) − σ² tr(X (X⊤X)⁻¹ X⊤)
            = σ² tr(I_N) − σ² tr((X⊤X)⁻¹ X⊤X)
            = σ² tr(I_N) − σ² tr(I_K)
            = σ² (N − K)
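Both the identity tr(M) = N − K and the resulting expectation E(ε̂⊤ε̂) = σ²(N − K) can be verified numerically (simulated design, illustrative values):

```python
import numpy as np

rng = np.random.default_rng(6)
N, K = 30, 4
X = rng.normal(size=(N, K))
M = np.eye(N) - X @ np.linalg.inv(X.T @ X) @ X.T

trace_M = np.trace(M)                       # equals N - K

# Monte Carlo check of E(eps_hat' eps_hat) = sigma^2 (N - K)
sigma2, R = 1.5, 20000
ssr = np.empty(R)
for r in range(R):
    eps = np.sqrt(sigma2) * rng.normal(size=N)
    ssr[r] = eps @ M @ eps                  # eps_hat' eps_hat = eps' M eps
mean_sigma2_hat = ssr.mean() / (N - K)      # close to sigma2
```
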
4.1. Finite sample properties

Proof (cont'd)
By definition of σ̂², we have:

E(σ̂² ∣ X) = E(ε̂⊤ε̂ ∣ X) / (N − K) = σ² (N − K) / (N − K) = σ²

So, the estimator σ̂² is conditionally unbiased. By the law of iterated expectations:

E(σ̂²) = E_X[E(σ̂² ∣ X)] = E_X(σ²) = σ²

The estimator σ̂² is unbiased:

E(σ̂²) = σ²
4.1. Finite sample properties

Remark
By the same principle, we can compute the variance of the estimator σ̂². Since

σ̂² = ε̂⊤ε̂ / (N − K) = ε⊤ M ε / (N − K)

we have:

V(σ̂² ∣ X) = (N − K)⁻² V(ε⊤ M ε ∣ X)

V(σ̂²) = E_X[V(σ̂² ∣ X)]

But it takes... at least ten slides.....
4.1. Finite sample properties

Definition (Variance of the estimator σ̂²)
In the multiple linear regression model y = Xβ₀ + ε, the variance of the estimator σ̂² is:

V(σ̂²) = 2σ⁴ / (N − K)

where σ² denotes the true value of the variance of the error terms. This result holds whether or not the matrix X is considered as random.
4.1. Finite sample properties

Summary

| | Case 1: X stochastic | Case 2: X nonstochastic |
|---|---|---|
| Mean | E(σ̂²) = σ² | E(σ̂²) = σ² |
| Variance | V(σ̂²) = 2σ⁴/(N − K) | V(σ̂²) = 2σ⁴/(N − K) |
| Cond. mean | E(σ̂² ∣ X) = σ² | — |
| Cond. var | V(σ̂² ∣ X) = 2σ⁴/(N − K) | — |
4.1. Finite sample properties
Finite sample distributions
4.1. Finite sample properties

Summary
1 Under assumptions A1-A5, we can derive the first two moments of the (unknown) finite sample distribution of the OLS estimator β̂, i.e. E(β̂) and V(β̂).
2 Are we able to characterize the finite sample distribution (or exact sampling distribution) of β̂?
3 For that, we need to put an assumption on the distribution of ε and use a parametric specification.
4.1. Finite sample properties

Finite sample distribution

Fact (Finite sample distribution, I)
(1) In a parametric multiple linear regression model with stochastic regressors, the conditional finite sample distribution of β̂ is known; the unconditional finite sample distribution is generally unknown:

β̂ ∣ X ~ D    β̂ ~ ??

where D is a multivariate distribution.
4.1. Finite sample properties

Finite sample distribution

Fact (Finite sample distribution, II)
(2) In a parametric multiple linear regression model with nonstochastic regressors, the marginal (unconditional) finite sample distribution of β̂ is known:

β̂ ~ D
4.1. Finite sample properties

Finite sample distribution

Fact (Finite sample distribution, III)
(3) In a semi-parametric multiple linear regression model, with fixed or random regressors, the finite sample distribution of β̂ is unknown:

β̂ ∣ X ~ ??    β̂ ~ ??
4.1. Finite sample properties

Example (Linear Gaussian regression model)
Under assumption A6 (normality - linear Gaussian regression model),

ε ~ N(0, σ² I_N)

β̂ = (X⊤X)⁻¹ X⊤y = β + (X⊤X)⁻¹ X⊤ε

If X is a stochastic matrix, the conditional distribution of β̂ is:

β̂ ∣ X ~ N(β, σ² (X⊤X)⁻¹)

If X is a nonstochastic matrix, the marginal distribution of β̂ is:

β̂ ~ N(β, σ² (X⊤X)⁻¹)
4.1. Finite sample properties

Theorem (Linear Gaussian regression model)
Under assumption A6 (normality), the estimators β̂ and σ̂² have a finite sample distribution given by:

β̂ ~ N(β, σ² (X⊤X)⁻¹)

(N − K) σ̂² / σ² ~ χ²(N − K)

Moreover, β̂ and σ̂² are independent. This result holds whether or not the matrix X is considered as random. In the latter case, the distribution of β̂ is conditional on X, with V(β̂ ∣ X) = σ² (X⊤X)⁻¹.
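The χ²(N − K) result can be illustrated by simulating Gaussian errors and checking that (N − K)σ̂²/σ² has the mean N − K and variance 2(N − K) of a χ²(N − K) variate (illustrative values):

```python
import numpy as np

rng = np.random.default_rng(7)
N, K, R = 25, 3, 30000
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])  # fixed design
sigma2 = 1.0
M = np.eye(N) - X @ np.linalg.inv(X.T @ X) @ X.T

stat = np.empty(R)
for r in range(R):
    eps = rng.normal(size=N)                   # Gaussian errors (assumption A6)
    sigma2_hat = eps @ M @ eps / (N - K)
    stat[r] = (N - K) * sigma2_hat / sigma2    # chi2(N - K) under A6

# A chi2(N - K) variate has mean N - K and variance 2(N - K)
```
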
4.1. Finite sample properties

Efficient estimator or BLUE?
4.1. Finite sample properties

Question: is the OLS estimator a "good" estimator of β?
The question is whether this estimator is preferred to other unbiased estimators:

V(β̂_OLS) ≷ V(β̂_other)

In general, to answer this question we use the FDCR or Cramer-Rao bound and study the efficiency of the estimator (cf. chapters 1 and 2).
Problem: the computation of the FDCR bound requires an assumption on the distribution of ε.
4.1. Finite sample properties

Two cases:
1 In a parametric model, it is possible to show that the OLS estimator is efficient - its variance reaches the FDCR bound. The OLS estimator is then the Best Unbiased Estimator (BUE).
2 In a semi-parametric model:
  1 We don't know the conditional distribution of y given X, nor the likelihood of the sample.
  2 The FDCR bound is unknown, so it is impossible to show the efficiency of the OLS estimator.
  3 We therefore introduce another concept: the Best Linear Unbiased Estimator (BLUE).
4.1. Finite sample properties

Problem (Efficiency of the OLS estimator)
In a semi-parametric multiple linear regression model (with no assumption on the distribution of ε), it is impossible to compute the FDCR or Cramer-Rao bound and to show the efficiency of the OLS estimator. The efficiency of the OLS estimator can only be shown in a parametric multiple linear regression model.
4.1. Finite sample properties

Theorem (Efficiency - Gaussian model)
In a Gaussian multiple linear regression model y = Xβ₀ + ε, with ε ~ N(0, σ² I_N) (assumption A6), the OLS estimator β̂ is efficient. Its variance attains the FDCR or Cramer-Rao bound:

V(β̂) = I_N⁻¹(β₀)

This result holds whether or not the matrix X is considered as random.
4.1. Finite sample properties

Proof
Let us consider the case where X is nonstochastic. Under assumption A6, the distribution of y is known, since y ~ N(Xβ, σ² I_N). The log-likelihood of the sample {yᵢ, xᵢ}_{i=1}^N, with xᵢ = (x_{i1} : .. : x_{iK})⊤, is then equal to:

ℓ_N(β; y, x) = −(N/2) ln σ² − (N/2) ln(2π) − (y − Xβ)⊤(y − Xβ) / (2σ²)

If we denote θ = (β⊤ : σ²)⊤, we have (cf. chapter 2):

I_N(θ₀) = −E_θ[∂²ℓ_N(θ; y, x) / ∂θ∂θ⊤] = ( (1/σ²) X⊤X , 0_{K×1} ; 0_{1×K} , N/(2σ⁴) )
4.1. Finite sample properties

Proof (cont'd)

I_N(θ₀) = ( (1/σ²) X⊤X , 0_{K×1} ; 0_{1×K} , N/(2σ⁴) )

The inverse of this block-diagonal matrix is given by:

I_N⁻¹(θ₀) = ( σ² (X⊤X)⁻¹ , 0_{K×1} ; 0_{1×K} , 2σ⁴/N ) = ( I_N⁻¹(β₀) , 0_{K×1} ; 0_{1×K} , 2σ⁴/N )

Then we have:

V(β̂) = I_N⁻¹(β₀) = σ² (X⊤X)⁻¹
4.1. Finite sample properties

Problem
1 In a semi-parametric model (with no assumption on the distribution of ε), it is impossible to compute the FDCR bound and to show the efficiency of the OLS estimator.
2 The solution consists in introducing the concept of best linear unbiased estimator (BLUE): the Gauss-Markov theorem...
4.1. Finite sample properties

Theorem (Gauss-Markov theorem)
In the linear regression model under assumptions A1-A5, the least squares estimator β̂ is the best linear unbiased estimator (BLUE) of β₀, whether X is stochastic or nonstochastic.
4.1. Finite sample properties

Comment
The estimator β̂_k, for k = 1, .., K, is the BLUE of β_k:
Best = smallest variance
Linear (in y or yᵢ): β̂_k = ∑_{i=1}^N ω_{ki} yᵢ
Unbiased: E(β̂_k) = β_k
Estimator: β̂_k = f(y₁, .., y_N)
4.1. Finite sample properties

Remark
The fact that the OLS estimator is the BLUE means that there is no linear unbiased estimator with a smaller variance than β̂.
BUT nothing guarantees that a nonlinear unbiased estimator cannot have a smaller variance than β̂.
4.1. Finite sample properties

Summary

| Model | Case 1: X stochastic | Case 2: X nonstochastic |
|---|---|---|
| Semi-parametric | β̂ is the BLUE (Gauss-Markov) | β̂ is the BLUE (Gauss-Markov) |
| Parametric | β̂ is efficient and BUE (FDCR bound) | β̂ is efficient and BUE (FDCR bound) |

BUE: Best Unbiased Estimator
4.1. Finite sample properties
Other remarks
4.1. Finite sample properties

Remarks
1 Including irrelevant variables (i.e., variables whose true coefficient is 0) does not affect the unbiasedness of the estimator.
2 Irrelevant variables decrease the precision of the OLS estimator, but do not result in bias.
3 Omitting relevant regressors obviously does affect the estimator, since the zero conditional mean assumption no longer holds.
4.1. Finite sample properties

Remarks
1 Omitting relevant variables causes bias (under some conditions on these variables).
  1 The direction of the bias can (sometimes) be characterized.
  2 The size of the bias is more complicated to derive.
2 More generally, correlation between a single regressor and the error term (endogeneity) usually results in all OLS estimators being biased.
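The omitted variable bias is easy to reproduce by simulation: dropping a regressor that is correlated with an included one shifts the estimated coefficient away from its true value (all values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
N, R = 200, 3000
b_full = np.empty(R)
b_short = np.empty(R)
for r in range(R):
    x2 = rng.normal(size=N)
    x1 = 0.8 * x2 + rng.normal(size=N)        # x1 is correlated with x2
    y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(size=N)
    X_full = np.column_stack([np.ones(N), x1, x2])
    X_short = np.column_stack([np.ones(N), x1])           # x2 omitted
    b_full[r] = np.linalg.solve(X_full.T @ X_full, X_full.T @ y)[1]
    b_short[r] = np.linalg.solve(X_short.T @ X_short, X_short.T @ y)[1]

# b_full is centered on 2.0; b_short is biased upward because Cov(x1, x2) > 0
```
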
4.1. Finite sample properties

Key Concepts
1 OLS unbiased estimator under assumption A3 (exogeneity)
2 Mean and variance of the OLS estimator under assumptions A3-A4 (exogeneity - spherical disturbances)
3 Finite sample distribution
4 Estimator of the variance of the error terms
5 BLUE estimator and efficient estimator (FDCR bound)
6 Gauss-Markov theorem
Subsection 4.2.
Asymptotic properties of the OLS estimator
4.2. Asymptotic properties

Objectives
The objectives of this subsection are the following:
1 Study the consistency of the OLS estimator.
2 Derive the asymptotic distribution of the OLS estimator β̂.
3 Estimate the asymptotic variance covariance matrix of β̂.
4.2. Asymptotic properties

Theorem (Consistency)
Consider the multiple linear regression model yᵢ = xᵢ⊤β₀ + εᵢ, with xᵢ = (x_{i1} : .. : x_{iK})⊤. Under assumptions A1-A5, the OLS estimators β̂ and σ̂² are (weakly) consistent:

β̂ →ᵖ β₀    σ̂² →ᵖ σ²
4.2. Asymptotic Properties

Proof (cf. chapter 1)
We have:

β̂ = ( ∑_{i=1}^N xᵢxᵢ⊤ )⁻¹ ( ∑_{i=1}^N xᵢyᵢ ) = β₀ + ( ∑_{i=1}^N xᵢxᵢ⊤ )⁻¹ ( ∑_{i=1}^N xᵢεᵢ )

By multiplying and dividing by N, we get:

β̂ = β₀ + ( (1/N) ∑_{i=1}^N xᵢxᵢ⊤ )⁻¹ ( (1/N) ∑_{i=1}^N xᵢεᵢ )
4.2. Asymptotic Properties

Proof (cf. chapter 1)

β̂ = β₀ + ( (1/N) ∑_{i=1}^N xᵢxᵢ⊤ )⁻¹ ( (1/N) ∑_{i=1}^N xᵢεᵢ )

1 By using the (weak) law of large numbers (Khinchine's theorem), we have:

(1/N) ∑_{i=1}^N xᵢxᵢ⊤ →ᵖ E(xᵢxᵢ⊤)

(1/N) ∑_{i=1}^N xᵢεᵢ →ᵖ E(xᵢεᵢ)

2 By using Slutsky's theorem:

β̂ →ᵖ β₀ + E⁻¹(xᵢxᵢ⊤) E(xᵢεᵢ)
4.2. Asymptotic Properties

Proof (cf. chapter 1)

β̂ →ᵖ β₀ + E⁻¹(xᵢxᵢ⊤) E(xᵢεᵢ)

Under assumption A3 (exogeneity):

E(εᵢ ∣ xᵢ) = 0 ⟹ E(εᵢxᵢ) = 0_{K×1} and E(εᵢ) = 0

We thus have:

β̂ →ᵖ β₀

The OLS estimator β̂ is (weakly) consistent.
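Consistency can be illustrated by simulation: the average distance between β̂ and β₀ shrinks as N grows. A sketch (the helper `ols_error` is my own, illustrative construction):

```python
import numpy as np

rng = np.random.default_rng(9)
beta0 = np.array([1.0, -0.5])

def ols_error(N, reps=200):
    """Average distance between beta_hat and beta0 over `reps` simulated samples."""
    total = 0.0
    for _ in range(reps):
        X = np.column_stack([np.ones(N), rng.normal(size=N)])
        y = X @ beta0 + rng.normal(size=N)
        b = np.linalg.solve(X.T @ X, X.T @ y)
        total += np.linalg.norm(b - beta0)
    return total / reps

errors = [ols_error(N) for N in (50, 500, 5000)]   # decreasing in N
```
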
4.2. Asymptotic properties

Theorem (Asymptotic distribution)
Consider the multiple linear regression model yᵢ = xᵢ⊤β₀ + εᵢ, with xᵢ = (x_{i1} : .. : x_{iK})⊤. Under assumptions A1-A5, the OLS estimator β̂ is asymptotically normally distributed:

√N (β̂ − β₀) →ᵈ N(0, σ² E_X⁻¹(xᵢxᵢ⊤))

or equivalently:

β̂ ~asy N(β₀, (σ²/N) E_X⁻¹(xᵢxᵢ⊤))
4.2. Asymptotic properties

1 Whatever the specification used (parametric or semi-parametric), it is always possible to characterize the asymptotic distribution of the OLS estimator.
2 Under assumptions A1-A5, the OLS estimator β̂ is asymptotically normally distributed, even if ε is not normally distributed.
3 This result (asymptotic normality) is valid whether X is stochastic or nonstochastic.
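Point 2 can be illustrated by simulation: with Uniform(−1, 1) errors (clearly non-Gaussian), the scaled estimator √N(β̂ − β₀) still has covariance close to σ² E_X⁻¹(xᵢxᵢ⊤), which equals σ² I₂ in the illustrative design below:

```python
import numpy as np

rng = np.random.default_rng(10)
N, R = 400, 5000
beta0 = np.array([1.0, 2.0])
sigma2 = 1.0 / 3.0                     # variance of Uniform(-1, 1)

z = np.empty((R, 2))
for r in range(R):
    X = np.column_stack([np.ones(N), rng.normal(size=N)])   # E(x_i x_i') = I_2
    eps = rng.uniform(-1.0, 1.0, size=N)                    # non-Gaussian errors
    y = X @ beta0 + eps
    b = np.linalg.solve(X.T @ X, X.T @ y)
    z[r] = np.sqrt(N) * (b - beta0)

V_emp = np.cov(z, rowvar=False)        # close to sigma2 * I_2
```
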
4.2. Asymptotic properties

Proof (cf. chapter 1)
This proof was given in chapter 1 (Estimation theory).
1 Let us rewrite the OLS estimator as:

β̂ = β₀ + ( (1/N) ∑_{i=1}^N xᵢxᵢ⊤ )⁻¹ ( (1/N) ∑_{i=1}^N xᵢεᵢ )

2 Normalize the vector β̂ − β₀:

√N (β̂ − β₀) = ( (1/N) ∑_{i=1}^N xᵢxᵢ⊤ )⁻¹ ( (1/√N) ∑_{i=1}^N xᵢεᵢ ) = A_N⁻¹ b_N

with

A_N = (1/N) ∑_{i=1}^N xᵢxᵢ⊤    b_N = (1/√N) ∑_{i=1}^N xᵢεᵢ
4.2. Asymptotic properties

Proof (cf. chapter 1)
3. Using the WLLN (Khinchine's theorem) and the CMT (continuous mapping theorem) - remark: if xᵢ is nonstochastic, this step is not needed:

( (1/N) ∑_{i=1}^N xᵢxᵢ⊤ )⁻¹ →ᵖ E_X⁻¹(xᵢxᵢ⊤)

4. Using the CLT:

b_N = √N ( (1/N) ∑_{i=1}^N xᵢεᵢ − E(xᵢεᵢ) ) →ᵈ N(0, V(xᵢεᵢ))

Under assumption A3, E(εᵢ ∣ xᵢ) = 0 ⟹ E(xᵢεᵢ) = 0, and

V(xᵢεᵢ) = E(xᵢεᵢεᵢxᵢ⊤) = E_X[E(xᵢεᵢεᵢxᵢ⊤ ∣ xᵢ)] = E_X[xᵢ V(εᵢ ∣ xᵢ) xᵢ⊤] = σ² E_X(xᵢxᵢ⊤)
4.2. Asymptotic properties

Proof (cf. chapter 1)
So we have:

√N (β̂ − β₀) = A_N⁻¹ b_N

with

A_N⁻¹ →ᵖ A⁻¹ = E_X⁻¹(xᵢxᵢ⊤)

b_N →ᵈ N(E_b, V_b)    E_b = 0    V_b = σ² E_X(xᵢxᵢ⊤) = σ² A
4.2. Asymptotic properties

Proof (cf. chapter 1)

A_N⁻¹ →ᵖ A⁻¹ = E⁻¹(xᵢxᵢ⊤)    b_N →ᵈ N(E_b, V_b)    E_b = 0    V_b = σ² E_X(xᵢxᵢ⊤) = σ² A

By using Slutsky's theorem (for a convergence in distribution), we have:

A_N⁻¹ b_N →ᵈ N(E_c, V_c)

with

E_c = A⁻¹ E_b = A⁻¹ · 0 = 0

V_c = A⁻¹ V_b A⁻¹ = A⁻¹ (σ² A) A⁻¹ = σ² A⁻¹
4.2. Asymptotic properties

Proof (cf. chapter 1)

√N (β̂ − β₀) = A_N⁻¹ b_N →ᵈ N(0, σ² A⁻¹)

with A⁻¹ = E_X⁻¹(xᵢxᵢ⊤) (be careful: it is the inverse of the expectation and not the reverse).
Finally, we have:

√N (β̂ − β₀) →ᵈ N(0, σ² E_X⁻¹(xᵢxᵢ⊤))
4.2. Asymptotic properties

Remark
1 In some textbooks (Greene, 2007 for instance), the asymptotic variance covariance matrix of the OLS estimator is expressed as a function of the plim of N⁻¹X⊤X, denoted by Q:

Q = plim (1/N) X⊤X

2 Both notations are equivalent since:

(1/N) X⊤X = (1/N) ∑_{i=1}^N xᵢxᵢ⊤

and, by using the WLLN (Khinchine's theorem), we have:

(1/N) X⊤X →ᵖ Q = E_X(xᵢxᵢ⊤)
4.2. Asymptotic properties

Theorem (Asymptotic distribution)
Consider the multiple linear regression model y = Xβ₀ + ε. Under assumptions A1-A5, the OLS estimator β̂ is asymptotically normally distributed:

√N (β̂ − β₀) →ᵈ N(0, σ² Q⁻¹)

where

Q = plim (1/N) X⊤X = E_X(xᵢxᵢ⊤)

or equivalently:

β̂ ~asy N(β₀, (σ²/N) Q⁻¹)
4.2. Asymptotic properties

Definition (Asymptotic variance covariance matrix)
The asymptotic variance covariance matrix of the OLS estimator β̂ is equal to:

V_asy(β̂) = (σ²/N) E_X⁻¹(xᵢxᵢ⊤)

or equivalently:

V_asy(β̂) = (σ²/N) Q⁻¹

where Q = plim N⁻¹ X⊤X.
4.2. Asymptotic properties

Remark

| Finite sample variance | Asymptotic variance |
|---|---|
| V(β̂) = σ² E_X[(X⊤X)⁻¹] | V_asy(β̂) = (σ²/N) E_X⁻¹(xᵢxᵢ⊤) |
4.2. Asymptotic properties

Remark
How can we estimate the asymptotic variance covariance matrix of the OLS estimator?

V_asy(β̂) = (σ²/N) E_X⁻¹(xᵢxᵢ⊤)
4.2. Asymptotic properties

Definition (Estimator of the asymptotic variance matrix)
A consistent estimator of the asymptotic variance covariance matrix is given by:

V̂_asy(β̂) = σ̂² (X⊤X)⁻¹

where σ̂² is a consistent estimator of σ²:

σ̂² = ε̂⊤ε̂ / (N − K)
4.2. Asymptotic properties

Proof

V_asy(β̂) = (σ²/N) E_X⁻¹(xᵢxᵢ⊤)

We know that:

(1/N) X⊤X = (1/N) ∑_{i=1}^N xᵢxᵢ⊤ →ᵖ E_X(xᵢxᵢ⊤) = Q

σ̂² →ᵖ σ²

By using the CMT and the Slutsky theorem, we get:

σ̂² ( (1/N) X⊤X )⁻¹ →ᵖ σ² E_X⁻¹(xᵢxᵢ⊤)
4.2. Asymptotic properties

Proof (cont'd)
As a consequence:

σ̂² ( (1/N) X⊤X )⁻¹ →ᵖ σ² E_X⁻¹(xᵢxᵢ⊤)

or equivalently:

V̂_asy(β̂) = σ̂² (X⊤X)⁻¹ = (σ̂²/N) ( (1/N) X⊤X )⁻¹ →ᵖ (σ²/N) E_X⁻¹(xᵢxᵢ⊤) = V_asy(β̂)

so that:

V̂_asy(β̂) →ᵖ V_asy(β̂)
4.2. Asymptotic properties

Remark
Under assumptions A1-A5, for a stochastic matrix X, we have:

| | Finite sample | Asymptotic |
|---|---|---|
| Variance | V(β̂) = σ² E_X[(X⊤X)⁻¹] | V_asy(β̂) = σ² N⁻¹ Q⁻¹, with Q = plim (1/N) X⊤X |
| Estimator | V̂(β̂) = σ̂² (X⊤X)⁻¹ | V̂_asy(β̂) = σ̂² (X⊤X)⁻¹ |
4.2. Asymptotic properties

Example (CAPM)
The empirical analogue of the CAPM is given by:

r̃ᵢₜ = αᵢ + βᵢ r̃ₘₜ + εₜ

where r̃ᵢₜ = rᵢₜ − r_{ft} is the excess return of security i at time t, r̃ₘₜ = rₘₜ − r_{ft} is the market excess return at time t, and εₜ is an i.i.d. error term with:

E(εₜ) = 0    V(εₜ) = σ²    E(εₜ ∣ r̃ₘₜ) = 0

Data file: capm.xls for Microsoft, SP500 and Tbill (closing prices) from 11/1/1993 to 04/03/2003.
4.2. Asymptotic properties

Example (CAPM, cont'd)
Question: write a Matlab code in order to find (1) the estimates of α and β for Microsoft, (2) the corresponding standard errors, (3) the SE of the regression, (4) the SSR, (5) the coefficient of determination and (6) the adjusted R². Compare your results to the following table (Eviews).
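A Python sketch of the requested computations, using simulated returns in place of capm.xls (the data file is not reproduced here; all numbers are illustrative). The same steps translate line by line to Matlab:

```python
import numpy as np

# Simulated stand-in for the capm.xls data
rng = np.random.default_rng(11)
T = 500
rm_excess = rng.normal(0.005, 0.04, size=T)                       # market excess return
ri_excess = 0.01 + 1.2 * rm_excess + rng.normal(0, 0.08, size=T)  # security excess return

X = np.column_stack([np.ones(T), rm_excess])
N, K = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
coef = XtX_inv @ X.T @ ri_excess                       # (1) alpha_hat, beta_hat
resid = ri_excess - X @ coef
ssr = resid @ resid                                    # (4) sum of squared residuals
sigma2_hat = ssr / (N - K)
se_regression = np.sqrt(sigma2_hat)                    # (3) SE of the regression
std_errors = np.sqrt(np.diag(sigma2_hat * XtX_inv))    # (2) standard errors
tss = np.sum((ri_excess - ri_excess.mean()) ** 2)
r2 = 1.0 - ssr / tss                                   # (5) coefficient of determination
r2_adj = 1.0 - (N - 1) / (N - K) * (1.0 - r2)          # (6) adjusted R^2 (p = K - 1)
```
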
4.2. Asymptotic properties

[Eviews estimation output for the CAPM regression of Microsoft excess returns]
4.2. Asymptotic properties

Key Concepts
1 The OLS estimator is weakly consistent.
2 The OLS estimator is asymptotically normally distributed.
3 Asymptotic variance covariance matrix.
4 Estimator of the asymptotic variance.