Chapter 1: Estimation Theory Advanced Econometrics - HEC Lausanne Christophe Hurlin

Chapter 1: Estimation Theory
Advanced Econometrics - HEC Lausanne
Christophe Hurlin
University of Orléans
November 20, 2013
Section 1
Introduction
1. Introduction
Estimation problem
Let us consider a continuous random variable $Y$ characterized by a
marginal probability density function $f_Y(y; \theta)$ for $y \in \mathbb{R}$ and $\theta \in \Theta$.
The parameter $\theta$ is unknown.
Let $\{Y_1, \ldots, Y_N\}$ be a random sample of i.i.d. random variables that have
the same distribution as $Y$.
We have one realisation $\{y_1, \ldots, y_N\}$ of this sample.
How can we estimate the parameter $\theta$?
Remarks
1. The estimation problem can be extended to the case of an econometric model. In this case we consider two variables $Y$ and $X$ and a conditional pdf $f_{Y|X=x}(y; \theta)$ that depends on a parameter or a vector of unknown parameters $\theta$.
2. In this chapter, we do not derive the estimators (for the estimation methods, see the next chapters). We take as given an estimator $\hat{\theta}$ of $\theta$, whatever the estimation method used, and we study its finite sample and large sample properties.
Notations: In this course, I will (try to...) follow some conventions of
notation.
$Y$: random variable
$y$: realisation
$f_Y(y)$: probability density or mass function
$F_Y(y)$: cumulative distribution function
$\Pr()$: probability
$\mathbf{y}$ (bold lowercase): vector
$\mathbf{Y}$ (bold uppercase): matrix
Problem: this system of notation does not make it possible to distinguish between
a vector (matrix) of random elements and a vector (matrix) of
non-stochastic elements (realisation).
Abadir and Magnus (2002), Notation in econometrics: a proposal for a
standard, Econometrics Journal.
The outline of this chapter is the following:
Section 2: What is an estimator?
Section 3: Finite sample properties
Section 4: Large sample properties
Subsection 4.1: Almost sure convergence
Subsection 4.2: Convergence in probability
Subsection 4.3: Convergence in mean square
Subsection 4.4: Convergence in distribution
Subsection 4.5: Asymptotic distributions
Section 2
What is an Estimator?
2. What is an Estimator?
Objectives
1. Define the concept of estimator.
2. Define the concept of estimate.
3. Sampling distribution.
4. Discussion about the notion of a "good" estimator.
Definition (Point estimator)
A point estimator is any function $T(Y_1, Y_2, \ldots, Y_N)$ of a sample. Any
statistic is a point estimator.
Example (Sample mean)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The
sample mean (or average)
$$\bar{Y}_N = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
is a point estimator (or an estimator) of $m$.
Example (Sample variance)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The
sample variance
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$
is a point estimator (or an estimator) of $\sigma^2$.
Fact
An estimator $\hat{\theta}$ is a random variable.
Consequence: $\hat{\theta}$ has a (marginal or conditional) probability distribution.
This sampling distribution is characterized by a probability density
function (pdf) $f_{\hat{\theta}}(u)$.
Definition (Sampling Distribution)
The probability distribution of an estimator (or a statistic) is called the
sampling distribution.
Fact
An estimator $\hat{\theta}$ is a random variable.
Consequence: The sampling distribution of $\hat{\theta}$ is characterized by
moments such as the expectation $E(\hat{\theta})$, the variance $V(\hat{\theta})$ and, more
generally, the $k$-th central moment defined by:
$$E\left[\left(\hat{\theta} - E(\hat{\theta})\right)^k\right] = \int \left(u - \mu_{\hat{\theta}}\right)^k f_{\hat{\theta}}(u)\, du \quad \forall k \in \mathbb{N}$$
$$\mu_{\hat{\theta}} = E(\hat{\theta}) = \int u\, f_{\hat{\theta}}(u)\, du$$
Definition (Point estimate)
A (point) estimate is the realized value of an estimator (i.e. a number)
that is obtained when a sample is actually taken. For an estimator $\hat{\theta}$ it can
be denoted by $\hat{\theta}(y)$.
Example (Point estimate)
For instance $\bar{y}_N$ is an estimate of $m$.
$$\bar{y}_N = \frac{1}{N} \sum_{i=1}^{N} y_i$$
If $N = 3$ and $\{y_1, y_2, y_3\} = \{3, -1, 2\}$ then $\bar{y}_N = 1.333$.
If $N = 3$ and $\{y_1, y_2, y_3\} = \{4, -8, 1\}$ then $\bar{y}_N = -1$.
etc.
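The arithmetic of a point estimate can be checked with a few lines of Python. This is only a sketch: the helper name `sample_mean` is hypothetical, and the two samples are taken to be {3, -1, 2} and {4, -8, 1}, the values consistent with the reported estimates 1.333 and -1.

```python
def sample_mean(values):
    """Return the arithmetic mean (1/N) * sum(y_i) of a non-empty sequence."""
    return sum(values) / len(values)


# One realisation of the sample gives one estimate (a number, not a random variable).
print(round(sample_mean([3, -1, 2]), 3))   # 1.333
print(sample_mean([4, -8, 1]))             # -1.0
```

The same rule (the estimator) applied to two different realisations yields two different estimates.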
Question: What constitutes a good estimator?
1. The search for good estimators constitutes much of econometrics.
2. An estimator is a rule or strategy for using the data to estimate the parameter. It is defined before the data are drawn.
3. Our objective is to use the sample data to infer the value of a parameter or set of parameters, which we denote $\theta$.
4. Sampling distributions are used to make inferences about the population. The issue is to know whether the sampling distribution of the estimator $\hat{\theta}$ is informative about the value of $\theta$.
Question (cont’d): What constitutes a good estimator?
1. Obviously, some estimators are better than others.
   1. To take a simple example, your intuition should convince you that the sample mean would be a better estimator of the population mean than the sample minimum; the minimum is almost certain to underestimate the mean.
   2. Nonetheless, the minimum is not entirely without virtue; it is easy to compute, which is occasionally a relevant criterion.
2. The idea is to study the properties of the sampling distribution and especially its moments such as $E(\hat{\theta})$ (for the bias), $V(\hat{\theta})$ (for the precision), etc.
Question (cont’d): What constitutes a good estimator?
Estimators are compared on the basis of a variety of attributes.
1. Finite sample properties (or finite sample distribution) of estimators are those attributes that can be compared regardless of the sample size (SECTION 3).
2. Some estimation problems involve characteristics that are unknown in finite samples. In these cases, estimators are compared on the basis of their large sample, or asymptotic, properties (SECTION 4).
Key Concepts Section 2
1. Point estimator
2. Point estimate
3. Sampling distribution
Section 3
Finite Sample Properties
3. Finite Sample Properties
Objectives
1. Define the concept of finite sample distribution.
2. Finite sample properties => What is a good estimator?
3. Unbiased estimator.
4. Comparison of two unbiased estimators.
5. FDCR or Cramer-Rao bound.
6. Best Linear Unbiased Estimator (BLUE).
Definition (Finite sample properties and finite sample distribution)
The finite sample properties of an estimator $\hat{\theta}$ correspond to the properties
of its finite sample distribution (or exact distribution) defined for any
sample size $N \in \mathbb{N}$.
Two cases:
1. In some particular cases, the finite sample distribution of the estimator is known. It corresponds to the distribution of the random variable $\hat{\theta}$ for any sample size $N$.
2. In most cases, the finite sample distribution is unknown, but we can study some specific moments (mean, variance, etc.) of this distribution (finite sample properties).
Example (Sample mean and finite sample distribution)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The
estimator $\hat{m} = \bar{Y}_N$ (sample mean) also has a normal distribution:
$$\hat{m} = \frac{1}{N} \sum_{i=1}^{N} Y_i \sim \mathcal{N}\left(m, \frac{\sigma^2}{N}\right) \quad \forall N \in \mathbb{N}$$
Consequence: the finite sample distribution of $\hat{m}$ for any $N \in \mathbb{N}$ is fully
characterized by $m$ and $\sigma^2$ (parameters that can be estimated). Example:
if $N = 3$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/3)$; if $N = 10$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/10)$,
etc.
Proof: The sum of independent normal variables has a normal distribution
with:
$$E(\hat{m}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{Nm}{N} = m$$
$$V(\hat{m}) = V\left(\frac{1}{N} \sum_{i=1}^{N} Y_i\right) = \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N}$$
since the variables $Y_i$ are
- independent (then $\mathrm{cov}(Y_i, Y_j) = 0$ for $i \neq j$)
- identically distributed (then $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, $\forall i \in \{1, \ldots, N\}$).
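The two moments derived above can be checked by simulation. The following is a minimal sketch using NumPy; the function name `simulate_sample_means` and the values $m = 2$, $\sigma = 3$, $N = 10$ are illustrative assumptions, not part of the course material.

```python
import numpy as np


def simulate_sample_means(m, sigma, n, replications, seed=0):
    """Draw `replications` samples of size n from N(m, sigma^2) and return their means."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(loc=m, scale=sigma, size=(replications, n))
    return samples.mean(axis=1)


means = simulate_sample_means(m=2.0, sigma=3.0, n=10, replications=200_000)
print("empirical mean    :", means.mean())   # theory: m = 2
print("empirical variance:", means.var())    # theory: sigma^2 / N = 0.9
```

Each row of `samples` is one realisation of the sample, so `means` holds 200,000 draws from the finite sample distribution of $\hat{m}$.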
Remarks
1. Except in very particular cases (normally distributed samples), the exact distribution of the estimator is very difficult to calculate.
2. Sometimes, it is possible to derive the exact distribution of a transformed variable $g(\hat{\theta})$, where $g(.)$ is a continuous function.
Example (Sample variance and finite sample distribution)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The
sample variance
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$
is an estimator of $\sigma^2$. The transformed variable $(N-1) S_N^2 / \sigma^2$ has a
Chi-squared (exact / finite sample) distribution with $N-1$ degrees of
freedom:
$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Proof: see Chapter 4.
Fact
In most cases, it is impossible to derive the exact / finite sample
distribution for the estimator (or a transformed variable).
Two reasons:
1. In some cases, the exact distribution of $Y_1, Y_2, \ldots, Y_N$ is known, but the function $T(.)$ is too complicated to derive the distribution of $\hat{\theta}$:
$$\hat{\theta} = T(Y_1, \ldots, Y_N) \sim\ ??? \quad \forall N \in \mathbb{N}$$
2. In most cases, the distribution of the sample variables $Y_1, Y_2, \ldots, Y_N$ is unknown:
$$\hat{\theta} = T(Y_1, \ldots, Y_N) \sim\ ??? \quad \forall N \in \mathbb{N}$$
Question: how can we evaluate the finite sample properties of the estimator $\hat{\theta}$
when its finite sample distribution is unknown?
$$\hat{\theta} \sim\ ??? \quad \forall N \in \mathbb{N}$$
Solution: We will focus on some specific moments of this (unknown)
finite sample (sampling) distribution in order to study some properties of
the estimator $\hat{\theta}$ and determine whether it is a "good" estimator or not.
Definition (Unbiased estimator)
An estimator $\hat{\theta}$ of a parameter $\theta$ is unbiased if the mean of its sampling
distribution is $\theta$:
$$E(\hat{\theta}) = \theta$$
or, equivalently, if its bias is zero:
$$E(\hat{\theta}) - \theta = \mathrm{Bias}(\hat{\theta} \mid \theta) = 0$$
If $\theta$ is a vector of parameters, then the
estimator is unbiased if the expected value of every element of $\hat{\theta}$ equals
the corresponding element of $\theta$.
Source: Greene (2007), Econometrics
Example (Bernoulli distribution)
Let $Y_1, Y_2, \ldots, Y_N$ be a random sample from a Bernoulli distribution with
a success probability $p$. An unbiased estimator of $p$ is
$$\hat{p} = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = p$, we have:
$$E(\hat{p}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{pN}{N} = p$$
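Example (Uniform distribution)
Let $Y_1, Y_2, \ldots, Y_N$ be a random sample from a uniform distribution $\mathcal{U}_{[0, \theta]}$.
An unbiased estimator of $\theta$ is
$$\hat{\theta} = \frac{2}{N} \sum_{i=1}^{N} Y_i$$
Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = (\theta + 0)/2 = \theta/2$, we
have:
$$E(\hat{\theta}) = E\left(\frac{2}{N} \sum_{i=1}^{N} Y_i\right) = \frac{2}{N} \sum_{i=1}^{N} E(Y_i) = \frac{2}{N} \cdot \frac{N\theta}{2} = \theta$$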
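Unbiasedness is a statement about the average of the estimator over repeated samples, which a small Monte Carlo experiment can illustrate. This is a sketch only: the helper name `uniform_theta_hat` and the values $\theta = 5$, $N = 20$ are assumptions chosen for illustration.

```python
import numpy as np


def uniform_theta_hat(sample):
    """Estimator from the example: twice the sample mean."""
    return 2.0 * np.mean(sample)


rng = np.random.default_rng(1)
theta, n, replications = 5.0, 20, 100_000        # illustrative values
draws = rng.uniform(0.0, theta, size=(replications, n))
estimates = 2.0 * draws.mean(axis=1)             # theta_hat for each replication

print("average of theta_hat:", estimates.mean())  # close to theta = 5
```

Individual estimates scatter around $\theta$; only their average over many replications is close to $\theta$.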
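Example (Multiple linear regression model)
Consider the model
$$\mathbf{y} = \mathbf{X}\beta + \mu$$
where $\mathbf{y} \in \mathbb{R}^N$, $\mathbf{X} \in \mathcal{M}_{N \times K}$ is a nonrandom matrix, $\beta \in \mathbb{R}^K$ is a vector of
parameters, $E(\mu) = 0_{N \times 1}$ and $V(\mu) = \sigma^2 I_N$. The OLS estimator
$$\hat{\beta} = \left(\mathbf{X}^\top \mathbf{X}\right)^{-1} \mathbf{X}^\top \mathbf{y}$$
is an unbiased estimator of $\beta$.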
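Proof: Since $\mathbf{y} = \mathbf{X}\beta + \mu$, $\mathbf{X} \in \mathcal{M}_{N \times K}$ is a nonrandom matrix and
$E(\mu) = 0$, we have
$$E(\mathbf{y}) = \mathbf{X}\beta$$
As a consequence:
$$E(\hat{\beta}) = \left(\mathbf{X}^\top \mathbf{X}\right)^{-1} \mathbf{X}^\top E(\mathbf{y}) = \left(\mathbf{X}^\top \mathbf{X}\right)^{-1} \mathbf{X}^\top \mathbf{X}\beta = \beta$$
The estimator $\hat{\beta}$ is unbiased.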
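The unbiasedness of OLS with a nonrandom design matrix can be illustrated numerically. The sketch below assumes NumPy; the function name `ols`, the parameter vector `beta`, the error scale 1.5 and the sample size are all hypothetical. The matrix `X` is drawn once and then held fixed across replications, which mimics the nonrandom-regressor assumption.

```python
import numpy as np


def ols(X, y):
    """OLS estimator (X'X)^{-1} X'y, computed by solving the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)


rng = np.random.default_rng(2)
n, replications = 50, 5_000
beta = np.array([1.0, -2.0, 0.5])                 # hypothetical true parameters
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # drawn once, then fixed

estimates = np.empty((replications, beta.size))
for r in range(replications):
    mu = rng.normal(scale=1.5, size=n)            # E(mu) = 0, V(mu) = sigma^2 I_N
    estimates[r] = ols(X, X @ beta + mu)

print("mean of beta_hat:", estimates.mean(axis=0))  # close to beta
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse, but gives the same estimator as the formula in the text.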
Remark:
Even if it is not strictly relevant in a section devoted to the finite sample
properties of estimators, we can introduce here the notion of an
asymptotically unbiased estimator (which can be considered a large
sample property).
Here we assume that the estimator $\hat{\theta} = \hat{\theta}_N$ depends on the sample
size $N$.
Definition (Asymptotically unbiased estimator)
The sequence of estimators $\hat{\theta}_N$ (with $N \in \mathbb{N}$) is asymptotically unbiased if
$$\lim_{N \to \infty} E(\hat{\theta}_N) = \theta$$
Example (Sample variance)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The
uncorrected sample variance defined by
$$\tilde{S}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$
is a biased estimator of $\sigma^2$ but is asymptotically unbiased.
Proof: We know that:
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2 \qquad \frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Since we have a relationship between $S_N^2$ and $\tilde{S}_N^2$, such that:
$$\tilde{S}_N^2 = \frac{1}{N} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2 = \frac{N-1}{N} S_N^2$$
then we get:
$$\frac{N}{\sigma^2} \tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Proof (cont’d):
$$\frac{N}{\sigma^2} \tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Reminder: If $X \sim \chi^2(v)$, then $E(X) = v$ and $V(X) = 2v$. Hence:
$$E\left(\frac{N}{\sigma^2} \tilde{S}_N^2\right) = N - 1$$
or equivalently:
$$E\left(\tilde{S}_N^2\right) = \frac{N-1}{N} \sigma^2 \neq \sigma^2$$
So, $\tilde{S}_N^2 = (1/N) \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$ is a biased estimator of $\sigma^2$.
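Proof (cont’d): But $\tilde{S}_N^2 = (1/N) \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$ is asymptotically
unbiased since:
$$\lim_{N \to \infty} E\left(\tilde{S}_N^2\right) = \lim_{N \to \infty} \frac{N-1}{N} \sigma^2 = \sigma^2$$
Remark: Even in a more general framework (non-normal), the sample
variance (with a correction for small sample) is an unbiased estimator of $\sigma^2$:
$$S_N^2 = \underbrace{(N-1)^{-1}}_{\text{correction for small sample}} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2 \qquad E\left(S_N^2\right) = \sigma^2$$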
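The bias of the uncorrected sample variance is easy to see numerically. The sketch below relies on the `ddof` argument of NumPy's `var` (divisor $N - \text{ddof}$); the values $\sigma^2 = 4$ and $N = 5$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
m, sigma2, n, replications = 0.0, 4.0, 5, 200_000   # illustrative values
samples = rng.normal(m, np.sqrt(sigma2), size=(replications, n))

s2_corrected = samples.var(axis=1, ddof=1)      # S_N^2, divides by N - 1
s2_uncorrected = samples.var(axis=1, ddof=0)    # uncorrected version, divides by N

print("average of corrected variance  :", s2_corrected.mean())    # theory: 4
print("average of uncorrected variance:", s2_uncorrected.mean())  # theory: (N-1)/N * 4 = 3.2
```

With $N = 5$ the uncorrected estimator underestimates $\sigma^2$ by 20% on average; the factor $(N-1)/N$ tends to 1 as $N$ grows, which is the asymptotic unbiasedness stated above.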
Unbiasedness is interesting per se but not so much!
1. The absence of bias is not a sufficient criterion to discriminate among competing estimators.
2. There may exist many unbiased estimators for the same parameter (vector) of interest.
Example (Estimators)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$. The statistics
$$\hat{m}_1 = \frac{1}{N} \sum_{i=1}^{N} Y_i \qquad \hat{m}_2 = Y_1$$
are unbiased estimators of $m$.
Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = m$, we have:
$$E(\hat{m}_1) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = \frac{Nm}{N} = m$$
$$E(\hat{m}_2) = E(Y_1) = m$$
Both estimators $\hat{m}_1$ and $\hat{m}_2$ of the parameter $m$ are unbiased.
How to compare two unbiased estimators?
When two (or more) estimators are unbiased, the best one is the most
precise, i.e. the estimator with the minimum variance.
Comparing two (or more) unbiased estimators becomes equivalent to
comparing their variance-covariance matrices.
Definition
Suppose that $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators. $\hat{\theta}_1$ dominates $\hat{\theta}_2$, i.e.
$\hat{\theta}_1 \succeq \hat{\theta}_2$, if and only if
$$V(\hat{\theta}_1) \leq V(\hat{\theta}_2)$$
In the case where $\hat{\theta}_1$, $\hat{\theta}_2$ and $\theta$ are vectors, this inequality becomes:
$$V(\hat{\theta}_2) - V(\hat{\theta}_1) \text{ is a positive semi definite matrix}$$
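In the vector case, the dominance condition can be checked numerically by looking at the eigenvalues of the difference of the two covariance matrices: a symmetric matrix is positive semi definite exactly when its smallest eigenvalue is non-negative. The sketch below assumes NumPy; the function name `dominates`, the tolerance `tol` and the two matrices are hypothetical.

```python
import numpy as np


def dominates(v1, v2, tol=1e-10):
    """Return True if V2 - V1 is positive semi definite (estimator 1 dominates estimator 2)."""
    diff = np.atleast_2d(np.asarray(v2, dtype=float) - np.asarray(v1, dtype=float))
    diff = (diff + diff.T) / 2.0                 # symmetrise against rounding error
    return bool(np.linalg.eigvalsh(diff).min() >= -tol)


V1 = np.array([[1.0, 0.2], [0.2, 0.5]])          # hypothetical covariance matrices
V2 = np.array([[2.0, 0.2], [0.2, 1.5]])
print(dominates(V1, V2))   # True: V2 - V1 is the identity matrix
print(dominates(V2, V1))   # False
```

Note that this ordering is only partial: for two arbitrary covariance matrices it may happen that neither difference is positive semi definite, in which case neither estimator dominates the other.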
[Figure: sampling densities of Estimator 1 and Estimator 2, plotted against $\theta$]
Example (Estimators)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$. The
estimator $\hat{m}_1 = N^{-1} \sum_{i=1}^{N} Y_i$ dominates the estimator $\hat{m}_2 = Y_1$.
Proof: The two estimators $\hat{m}_1$ and $\hat{m}_2$ are unbiased, so they can be
compared in terms of variance (precision):
$$V(\hat{m}_1) = \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N} \quad \text{since the } Y_i \text{ are i.i.d.}$$
$$V(\hat{m}_2) = V(Y_1) = \sigma^2$$
So, $V(\hat{m}_1) \leq V(\hat{m}_2)$, the estimator $\hat{m}_1$ is preferred to $\hat{m}_2$, $\hat{m}_1 \succeq \hat{m}_2$.
Question: is there a bound for the variance of the unbiased estimators?
Definition (Cramer-Rao or FDCR bound)
Let $X_1, \ldots, X_N$ be an i.i.d. sample with pdf $f_X(\theta; x)$. Let $\hat{\theta}$ be an unbiased
estimator of $\theta$; i.e., $E_\theta(\hat{\theta}) = \theta$. If $f_X(\theta; x)$ is regular then
$$V_\theta(\hat{\theta}) \geq I_N^{-1}(\theta_0) = \text{FDCR or Cramer-Rao bound}$$
where $I_N(\theta_0)$ denotes the Fisher information number for the sample
evaluated at the true value $\theta_0$. If $\theta$ is a vector then this inequality means
that $V_\theta(\hat{\theta}) - I_N^{-1}(\theta_0)$ is positive semi-definite.
FDCR: Fréchet - Darmois - Cramér and Rao
Remark: we will define the Fisher information matrix (or number) in
Chapter 2 (Maximum Likelihood Estimation).
Definition (Efficiency)
An estimator is efficient if its variance attains the FDCR (Fréchet - Darmois - Cramér - Rao) or Cramer-Rao bound:
$$V_\theta(\hat{\theta}) = I_N^{-1}(\theta_0)$$
where $I_N(\theta_0)$ denotes the Fisher information matrix associated to the
sample evaluated at the true value $\theta_0$.
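As a worked illustration, consider the Bernoulli example of this section. The sketch below uses the standard expression of the Fisher information of a Bernoulli sample, which is taken here as given since the Fisher information is only defined in Chapter 2; the true value is written $p_0$.

```latex
% Fisher information of N i.i.d. Bernoulli(p_0) draws (standard result, assumed here)
I_N(p_0) = \frac{N}{p_0 (1 - p_0)}
\quad \Longrightarrow \quad
I_N^{-1}(p_0) = \frac{p_0 (1 - p_0)}{N}

% Variance of the unbiased estimator \hat{p} = N^{-1} \sum_{i=1}^{N} Y_i
V(\hat{p}) = \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \frac{p_0 (1 - p_0)}{N} = I_N^{-1}(p_0)
```

Under these assumptions the variance of $\hat{p}$ equals the bound, so the sample proportion is an efficient estimator of $p$.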
Finally, note that in some cases we further restrict the set of estimators to
linear functions of the data.
Definition (BLUE estimator)
An estimator is the minimum variance linear unbiased estimator or best
linear unbiased estimator (BLUE) if it is a linear function of the data and
has minimum variance among linear unbiased estimators.
Remark: the term "linear" means that the estimator $\hat{\theta}$ is a linear function
of the data $Y_i$:
$$\hat{\theta}_j = \sum_{i=1}^{N} \omega_{ij} Y_i$$
Key Concepts Section 3
1. Finite sample distribution
2. Finite sample properties
3. Bias and unbiased estimator
4. Comparison of unbiased estimators
5. Cramer-Rao or FDCR bound
6. Efficient estimator
7. Linear estimator
8. BLUE estimator
Section 4
Asymptotic Properties
4. Asymptotic Properties
Problem:
1. Let us consider an i.i.d. sample $Y_1, Y_2, \ldots, Y_N$, where $Y$ has a pdf $f_Y(y; \theta)$ and $\theta$ is an unknown parameter.
2. We assume that $f_Y(y; \theta)$ is also unknown (we do not know the distribution of $Y_i$).
3. We consider an estimator $\hat{\theta}$ (also denoted $\hat{\theta}_N$ to show that it depends on $N$) such that
$$\hat{\theta} = T(Y_1, Y_2, \ldots, Y_N) \equiv \hat{\theta}_N$$
4. The finite sample distribution of $\hat{\theta}_N$ is unknown:
$$\hat{\theta}_N \sim\ ??? \quad \forall N \in \mathbb{N}$$
Question: what is the behavior of the random variable $\hat{\theta}_N$ when the
sample size $N$ tends to infinity?
Definition (Asymptotic theory)
Asymptotic or large sample theory consists in the study of the
distribution of the estimator when the sample size is sufficiently large.
Asymptotic theory is fundamentally based on the notion of
convergence.
We are mainly concerned with four modes of convergence:
1. Almost sure convergence
2. Convergence in probability
3. Convergence in quadratic mean
4. Convergence in distribution.
4.1. Almost Sure Convergence
Definition (Almost sure convergence)
Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$
converges almost surely (or with probability 1 or strongly) to a constant
$c$, if, for every $\varepsilon > 0$,
$$\Pr\left(\lim_{N \to \infty} |X_N - c| < \varepsilon\right) = 1$$
or equivalently if:
$$\Pr\left(\lim_{N \to \infty} X_N = c\right) = 1$$
It is written
$$X_N \xrightarrow{a.s.} c$$
Comments
1. Almost sure convergence means that the values of $X_N$ approach the value $c$, in the sense (hence "almost surely") that events for which $X_N$ does not converge to $c$ have probability 0.
2. In other words, it means that when $N$ tends to infinity, the random variable $X_N$ tends to a degenerate random variable (a random variable which only takes a single value $c$) with a pdf equal to a probability mass function.
[Figure: distribution of $X_N$ around $c = 2$]
Definition (Strong consistency)
A point estimator $\hat{\theta}_N$ of $\theta$ is strongly consistent if:
$$\hat{\theta}_N \xrightarrow{a.s.} \theta$$
Comments: When $N \to \infty$, the estimator tends to a degenerate random
variable that takes a single value equal to $\theta$.
The crème de la crème (best of the best) of the estimators...
4.2. Convergence in Probability
Definition (Convergence in probability)
Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$
converges in probability to a constant $c$, if, for any $\varepsilon > 0$,
$$\lim_{N \to \infty} \Pr\left(|X_N - c| > \varepsilon\right) = 0$$
It is written
$$X_N \xrightarrow{p} c \quad \text{or} \quad \operatorname{plim} X_N = c$$
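The probability in this definition can be approximated by simulation for a concrete sequence. The sketch below takes, as an assumed example, $X_N$ to be the mean of $N$ i.i.d. $\mathcal{U}_{[0,1]}$ draws and $c = 0.5$; the function name `exceedance_probability` and the value $\varepsilon = 0.05$ are hypothetical choices.

```python
import numpy as np


def exceedance_probability(n, eps, replications=2_000, seed=4):
    """Monte Carlo estimate of Pr(|X_N - c| > eps), X_N = mean of n U[0,1] draws, c = 0.5."""
    rng = np.random.default_rng(seed)
    x_n = rng.uniform(0.0, 1.0, size=(replications, n)).mean(axis=1)
    return float(np.mean(np.abs(x_n - 0.5) > eps))


for n in (10, 100, 1_000):
    print(n, exceedance_probability(n, eps=0.05))
```

The estimated probability shrinks towards 0 as $N$ grows, for this fixed $\varepsilon$; the definition requires this to hold for every $\varepsilon > 0$.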
$X_N \xrightarrow{p} c$ if $\lim_{N \to \infty} \Pr\left(|X_N - c| > \varepsilon\right) = 0$
[Figure: density of $X_N$ with $c = 2$, $c - \varepsilon$ and $c + \varepsilon$ marked on the horizontal axis; the area outside the band tends to 0]
[Figure: the same plot for a very small $\varepsilon$]
Comments
1. The general idea is the same as for a.s. convergence: $X_N$ tends to a degenerate random variable (even if it is not exactly the case) equal to $c$.
2. But when $X_N$ is very likely to be close to $c$ for large $N$, what about the location of the remaining small probability mass which is not close to $c$?
3. Convergence in probability allows more erratic behavior in the converging sequence than almost sure convergence.
Remark: The notation
$$X_N \xrightarrow{p} X$$
where $X$ is a random element (scalar, vector, matrix) means that the
variable $X_N - X$ converges in probability to $c = 0$:
$$X_N - X \xrightarrow{p} 0$$
Definition (Weak consistency)
A point estimator $\hat{\theta}_N$ of $\theta$ is (weakly) consistent if:
$$\hat{\theta}_N \xrightarrow{p} \theta$$
Remark: In econometrics, in most cases, we only consider weak
consistency. When we say that an estimator is "consistent", it generally
refers to convergence in probability.
Lemma (Convergence in probability)
Let $X_N$ be a sequence of random variables indexed by the sample size and $c$ a
constant. If
$$\lim_{N \to \infty} E(X_N) = c$$
$$\lim_{N \to \infty} V(X_N) = 0$$
then $X_N$ converges in probability to $c$ as $N \to \infty$:
$$X_N \xrightarrow{p} c$$
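One way to see why the lemma holds is to apply Markov's inequality to the non-negative variable $(X_N - c)^2$. This argument is a sketch added for illustration and is not taken from the slides:

```latex
\Pr\left(|X_N - c| > \varepsilon\right)
  \leq \frac{E\left[(X_N - c)^2\right]}{\varepsilon^2}
  = \frac{V(X_N) + \left(E(X_N) - c\right)^2}{\varepsilon^2}
  \xrightarrow[N \to \infty]{} 0
  \qquad \forall \varepsilon > 0
```

The equality decomposes the mean squared deviation from $c$ into a variance term and a squared bias term; both vanish under the two conditions of the lemma.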
Example (Consistent estimator)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$,
where $\sigma^2$ is known and $m$ is unknown. The estimator $\hat{m}$, defined by
$$\hat{m} = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
is a consistent estimator of $m$.
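Proof: Since $Y_1, Y_2, \ldots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$,
we have:
$$E(\hat{m}) = \frac{1}{N} \sum_{i=1}^{N} E(Y_i) = m$$
$$\lim_{N \to \infty} V(\hat{m}) = \lim_{N \to \infty} \frac{1}{N^2} \sum_{i=1}^{N} V(Y_i) = \lim_{N \to \infty} \frac{\sigma^2}{N} = 0$$
The estimator $\hat{m}$ is (weakly) consistent:
$$\hat{m} \xrightarrow{p} m$$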
Example (Consistent estimator)
Assume that $Y_1, Y_2, \ldots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The
sample variance defined by
$$S_N^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(Y_i - \bar{Y}_N\right)^2$$
is a (weakly) consistent estimator of $\sigma^2$.
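Proof: We know that for a normal sample:
$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
$$E\left(\frac{(N-1)}{\sigma^2} S_N^2\right) = N - 1 \qquad V\left(\frac{(N-1)}{\sigma^2} S_N^2\right) = 2(N-1)$$
We get immediately:
$$E\left(S_N^2\right) = \sigma^2$$
$$\lim_{N \to \infty} V\left(S_N^2\right) = \lim_{N \to \infty} \frac{2\sigma^4}{N-1} = 0$$
The estimator $S_N^2$ is (weakly) consistent: $S_N^2 \xrightarrow{p} \sigma^2$.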
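The variance formula $2\sigma^4/(N-1)$ used in this proof can be compared with a Monte Carlo estimate. This is a sketch only; the function name `sample_variance_spread` and the value $\sigma^2 = 4$ are assumptions made for illustration.

```python
import numpy as np


def sample_variance_spread(n, sigma2=4.0, replications=10_000, seed=5):
    """Return the Monte Carlo mean and variance of S_N^2 for normal samples of size n."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(0.0, np.sqrt(sigma2), size=(replications, n))
    s2 = samples.var(axis=1, ddof=1)             # corrected sample variance S_N^2
    return s2.mean(), s2.var()


for n in (5, 50, 500):
    mean_s2, var_s2 = sample_variance_spread(n)
    theory = 2 * 4.0**2 / (n - 1)                # 2 sigma^4 / (N - 1)
    print(n, mean_s2, var_s2, theory)
```

The simulated mean stays near $\sigma^2$ for every $N$, while the simulated variance tracks the theoretical value and shrinks towards 0, which is what the lemma on convergence in probability requires.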
Lemma (Chain of implication)
Almost sure convergence implies convergence in probability:
$$\xrightarrow{a.s.} \implies \xrightarrow{p}$$
where the symbol "$\implies$" means "implies". The converse is not true.
Comments
1. One of the main applications of convergence in probability and almost sure convergence is the law of large numbers.
2. The law of large numbers tells you that the sample mean converges in probability (weak law of large numbers) or almost surely (strong law of large numbers) to the population mean:
$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow[N \to \infty]{} E(X_i)$$
Theorem (Weak law of large numbers, Khinchine)
If fXi g , for i = 1, .., N is a sequence of independently and identically
distributed (i.i.d.) random variables with …nite mean E (Xi ) = µ (<∞),
then the sample mean X N converges in probability to µ:
XN =
Christophe Hurlin (University of Orléans)
1
N
N
p
∑ Xi ! E (Xi ) = µ
i =1
Advanced Econometrics - HEC Lausanne
November 20, 2013
84 / 147
Theorem (Strong law of large numbers, Kolmogorov)
If $\{X_i\}$, for $i = 1, \ldots, N$, is a sequence of independently and identically distributed (i.i.d.) random variables such that $E(X_i) = \mu$ ($< \infty$) and $E(|X_i|) < \infty$, then the sample mean $\bar{X}_N$ converges almost surely to $\mu$:
$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i \xrightarrow{a.s.} E(X_i) = \mu$$
Illustration:
1. Let us consider a random variable $X_i \sim U_{[0,10]}$ and draw an i.i.d. sample $\{x_i\}_{i=1}^{N}$.
2. Compute the sample mean $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$.
3. Repeat this procedure 500 times. We get 500 realisations of the sample mean $\bar{x}_N$.
4. Build a histogram of these 500 realisations.
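The four steps above can be sketched in Python as follows. The function names, the seed and the text-based bin counts are assumptions made for this illustration; the sample sizes printed are limited to keep the pure-Python run short.

```python
import random
import statistics


def sample_mean_realisations(n, reps=500, low=0.0, high=10.0, seed=42):
    """Steps 1 to 3: draw `reps` i.i.d. U[low, high] samples of size n
    and return the `reps` sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.uniform(low, high) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


def histogram_counts(values, low=0.0, high=10.0, bins=20):
    """Step 4: number of values falling in each equal-width bin of [low, high]."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        idx = min(int((v - low) / width), bins - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    for n in (10, 100, 1000):
        means = sample_mean_realisations(n)
        print(f"N={n:5d}  std of sample means={statistics.pstdev(means):.4f}")
        print("   ", histogram_counts(means))
```

As N grows, the counts concentrate in the bins around the population mean 5, which is what the weak law of large numbers predicts.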
[Figure: histograms of the 500 realisations of the sample mean $\bar{x}_N$, in four panels: N = 10, N = 100, N = 1,000 and N = 10,000. Horizontal axis from 0 to 10.]
Proof: There are many proofs of the law of large numbers. Most of them use the additional assumption of finite variance $V(X_i) = \sigma^2$ and Chebyshev's inequality.
Theorem (Chebyshev's inequality)
Let $X$ be a random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$,
$$\Pr\left( |X - \mu| \geq k\sigma \right) \leq \frac{1}{k^2}$$
Proof (cont'd): Under the assumption of i.i.d. $(\mu, \sigma^2)$ variables, we have:
$$E\left( \bar{X}_N \right) = \mu \qquad V\left( \bar{X}_N \right) = \frac{\sigma^2}{N}$$
Given Chebyshev's inequality, we get for $k > 0$:
$$\Pr\left( \left| \bar{X}_N - \mu \right| \geq k \frac{\sigma}{\sqrt{N}} \right) \leq \frac{1}{k^2}$$
Let us define $\varepsilon > 0$ such that
$$\varepsilon = \frac{k\sigma}{\sqrt{N}} \iff k = \frac{\varepsilon \sqrt{N}}{\sigma}$$
Then we get for any $\varepsilon > 0$:
$$\Pr\left( \left| \bar{X}_N - \mu \right| \geq \varepsilon \right) \leq \frac{\sigma^2}{\varepsilon^2 N}$$
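Proof (cont'd): for any $\varepsilon > 0$:
$$\Pr\left( \left| \bar{X}_N - \mu \right| \geq \varepsilon \right) \leq \frac{\sigma^2}{\varepsilon^2 N}$$
So, when $N \to \infty$, this probability is necessarily equal to 0 (the bound tends to 0 and a probability cannot be negative, so $\leq 0$ means $= 0$):
$$\lim_{N \to \infty} \Pr\left( \left| \bar{X}_N - \mu \right| \geq \varepsilon \right) = 0 \quad \forall \varepsilon > 0$$
Since $\Pr\left( \left| \bar{X}_N - \mu \right| < \varepsilon \right) = 1 - \Pr\left( \left| \bar{X}_N - \mu \right| \geq \varepsilon \right)$, we have:
$$\lim_{N \to \infty} \Pr\left( \left| \bar{X}_N - \mu \right| < \varepsilon \right) = 1 \quad \forall \varepsilon > 0$$
This is the weak law. It is also implied by the strong law:
$$\bar{X}_N \xrightarrow{a.s.} \mu \ \text{(SLLN)} \implies \bar{X}_N \xrightarrow{p} \mu \ \text{(WLLN)}$$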
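The bound used in this proof can be compared with simulated tail frequencies. The sketch below is only an illustration under assumed settings (uniform draws on [0, 10], ε = 0.5, an arbitrary seed and hypothetical function names): it estimates $\Pr(|\bar{X}_N - \mu| \geq \varepsilon)$ and prints it next to $\sigma^2/(\varepsilon^2 N)$.

```python
import random
import statistics


def tail_probability(n, eps, reps=1000, low=0.0, high=10.0, seed=1):
    """Empirical Pr(|sample mean - mu| >= eps) for U[low, high] samples of size n."""
    rng = random.Random(seed)
    mu = (low + high) / 2
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean([rng.uniform(low, high) for _ in range(n)])
        if abs(xbar - mu) >= eps:
            hits += 1
    return hits / reps


def chebyshev_bound(n, eps, low=0.0, high=10.0):
    """Upper bound sigma^2 / (eps^2 * N), capped at 1 since it bounds a probability."""
    sigma2 = (high - low) ** 2 / 12  # variance of U[low, high]
    return min(1.0, sigma2 / (eps**2 * n))


if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, tail_probability(n, 0.5), round(chebyshev_bound(n, 0.5), 4))
```

The bound is typically loose, but both columns go to 0 as N increases, which is all the proof requires.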
Remarks
1. These two theorems consider a sequence of independently and identically distributed (i.i.d.) random variables (as a consequence, with the same mean $E(X_i) = \mu$, $\forall i = 1, \ldots, N$).
2. There are alternative versions of the law of large numbers for independent random variables not identically (heterogeneously) distributed with $E(X_i) = \mu_i$ (cf. Greene, 2007):
   1. Chebychev's Weak Law of Large Numbers.
   2. Markov's Strong Law of Large Numbers.
Theorem (Slutsky's theorem)
Let $X_N$ and $Y_N$ be two sequences of random variables where $X_N \xrightarrow{p} X$ and $Y_N \xrightarrow{p} c$, where $c \neq 0$, then:
$$X_N + Y_N \xrightarrow{p} X + c$$
$$X_N Y_N \xrightarrow{p} cX$$
$$\frac{X_N}{Y_N} \xrightarrow{p} \frac{X}{c}$$
Remark: This also holds for sequences of random matrices. The last statement reads: if $X_N \xrightarrow{p} X$ and $Y_N \xrightarrow{p} \Omega$ then
$$Y_N^{-1} X_N \xrightarrow{p} \Omega^{-1} X$$
provided that $\Omega^{-1}$ exists.
Example
Let us consider the multiple linear regression model
$$y_i = x_i^\top \beta + \mu_i$$
where $x_i = (x_{i1} \ldots x_{iK})^\top$ is a $K \times 1$ vector of random variables, $\beta = (\beta_1 \ldots \beta_K)^\top$ is a $K \times 1$ vector of parameters, and where the error term $\mu_i$ satisfies $E(\mu_i) = 0$ and $E(\mu_i \mid x_{ij}) = 0$, $\forall j = 1, \ldots, K$. Question: show that the OLS estimator defined by
$$\hat{\beta} = \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right)$$
is a consistent estimator of $\beta$.
Proof: let us rewrite the OLS estimator as:
$$\hat{\beta} = \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right)$$
$$= \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \left( x_i^\top \beta + \mu_i \right) \right)$$
$$= \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i x_i^\top \right) \beta + \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right)$$
$$= \beta + \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right)$$
Proof (cont'd): By multiplying and dividing by $N$, we get:
$$\hat{\beta} = \beta + \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right)$$
1. By using the (weak) law of large numbers (Khinchine's theorem), we have:
$$\frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \xrightarrow{p} E\left( x_i x_i^\top \right) \qquad \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \xrightarrow{p} E\left( x_i \mu_i \right)$$
2. By using Slutsky's theorem:
$$\hat{\beta} \xrightarrow{p} \beta + E^{-1}\left( x_i x_i^\top \right) E\left( x_i \mu_i \right)$$
where $E^{-1}(\cdot)$ denotes the inverse of the expectation matrix.
Reminder: If $X$ and $Y$ are two random variables, then
$$E(X \mid Y) = 0 \implies E(XY) = 0$$
The reverse is not true. The implication follows from:
$$E(X \mid Y) = 0 \implies \begin{cases} \mathrm{cov}(X, Y) = E(XY) - E(X) E(Y) = 0 \\ E(X) = 0 \end{cases} \implies E(XY) = 0$$
Proof (cont'd): Since
$$\hat{\beta} \xrightarrow{p} \beta + E^{-1}\left( x_i x_i^\top \right) E\left( x_i \mu_i \right)$$
$$E(\mu_i \mid x_{ij}) = 0, \ \forall j = 1, \ldots, K \implies E(\mu_i x_i) = 0_{K \times 1}$$
we have
$$\hat{\beta} \xrightarrow{p} \beta$$
The OLS estimator $\hat{\beta}$ is (weakly) consistent.
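The consistency of OLS can be illustrated with a small simulation. The sketch below makes several simplifying assumptions that are not in the slides: a single regressor without intercept (K = 1), normal draws for the regressor and the error, a true value β = 2, and hypothetical function names. With K = 1 the estimator reduces to $\hat{\beta} = \sum x_i y_i / \sum x_i^2$.

```python
import random


def ols_slope(n, beta=2.0, seed=0):
    """OLS estimate of beta in y_i = x_i * beta + u_i (K = 1, no intercept)."""
    rng = random.Random(seed)
    sxx = 0.0
    sxy = 0.0
    for _ in range(n):
        x = rng.gauss(1.0, 1.0)
        u = rng.gauss(0.0, 1.0)  # drawn independently of x, so E(u | x) = 0
        y = x * beta + u
        sxx += x * x
        sxy += x * y
    return sxy / sxx


def mean_abs_error(n, beta=2.0, reps=50):
    """Average of |beta_hat - beta| over `reps` independent samples of size n."""
    errors = [abs(ols_slope(n, beta, seed=s) - beta) for s in range(reps)]
    return sum(errors) / reps


if __name__ == "__main__":
    for n in (50, 500, 5000):
        print(f"N={n:5d}  mean |beta_hat - beta| = {mean_abs_error(n):.4f}")
```

The average estimation error should shrink as N grows, in line with $\hat{\beta} \xrightarrow{p} \beta$.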
We are mainly concerned with four modes of convergence:
1. Almost sure convergence
2. Convergence in probability
3. Convergence in quadratic mean
4. Convergence in distribution
4.3. Convergence in mean square
Definition (Convergence in mean square)
Let $\{X_i\}$ for $i = 1, \ldots, N$ be a sequence of real-valued random variables such that $E\left( |X_N|^2 \right) < \infty$. $X_N$ converges in mean square to a constant $c$ if:
$$\lim_{N \to \infty} E\left( |X_N - c|^2 \right) = 0$$
It is written
$$X_N \xrightarrow{m.s.} c$$
Remark: It is the least useful notion of convergence, except for proving convergence in probability.
Lemma (Chain of implication)
Convergence in mean square implies convergence in probability:
$$\xrightarrow{m.s.} \implies \xrightarrow{p}$$
where the symbol "$\implies$" means "implies". The converse is not true.
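The slides state this lemma without proof. One standard argument, given here as a sketch and not as the author's own derivation, applies Markov's inequality to the non-negative variable $|X_N - c|^2$:

```latex
% For any \varepsilon > 0:
\Pr\left( |X_N - c| \geq \varepsilon \right)
  = \Pr\left( |X_N - c|^2 \geq \varepsilon^2 \right)
  \leq \frac{E\left( |X_N - c|^2 \right)}{\varepsilon^2}
  \xrightarrow[N \to \infty]{} 0
% The right-hand side vanishes by the definition of convergence in mean square,
% so X_N converges in probability to c.
```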
4.4. Convergence in distribution
Definition (Convergence in distribution)
Let $X_N$ be a sequence of random variables indexed by the sample size, with cdf $F_N(.)$. $X_N$ converges in distribution to a random variable $X$ with cdf $F(.)$ if
$$\lim_{N \to \infty} F_N(x) = F(x)$$
for all $x$ at which $F(.)$ is continuous. It is written:
$$X_N \xrightarrow{d} X$$
Comment: In general, we have:
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{d} \underbrace{X}_{\text{random var.}} \qquad \underbrace{X_N}_{\text{random var.}} \xrightarrow{p} \underbrace{c}_{\text{constant}}$$
In the case where
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{p} \underbrace{X}_{\text{random var.}}$$
it means
$$\underbrace{X_N - X}_{\text{random var.}} \xrightarrow{p} \underbrace{0}_{\text{constant}}$$
Lemma (Chain of implication)
Convergence in probability implies convergence in distribution:
$$\xrightarrow{p} \implies \xrightarrow{d}$$
where the symbol "$\implies$" means "implies". The converse is not true.
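A classical counterexample, added here as an illustration and not taken from the slides, shows why the converse fails: a sequence can have the right limiting cdf without getting close to the limit variable itself.

```latex
% Let X \sim \mathcal{N}(0, 1) and define X_N = -X for every N.
% By symmetry of the normal distribution, X_N has the same cdf as X, so
X_N \xrightarrow{d} X
% but X_N - X = -2X does not shrink: for any \varepsilon > 0,
\Pr\left( |X_N - X| \geq \varepsilon \right)
  = \Pr\left( 2|X| \geq \varepsilon \right)
  = 2\left( 1 - \Phi\left( \varepsilon / 2 \right) \right) > 0
% which does not depend on N, so X_N does not converge in probability to X.
```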
Definition (Asymptotic distribution)
If $X_N$ converges in distribution to $X$, where $F_N(.)$ is the cdf of $X_N$, then $F(.)$ is the cdf of the limiting or asymptotic distribution of $X_N$.
Consequence: Generally, we denote:
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{d} \underbrace{\mathcal{L}}_{\text{asy. distribution}}$$
It means that $X_N$ converges in distribution to a random variable $X$ that has a distribution $\mathcal{L}$.
Example
$$X_N \xrightarrow{d} \mathcal{N}(0, 1)$$
means that $X_N$ converges in distribution to a random variable $X$ with a standard normal distribution, or that $X_N$ has an asymptotic standard normal distribution.
Definition (Asymptotic mean and variance)
The asymptotic mean and variance of a random variable $X_N$ are the mean and variance of the asymptotic or limiting distribution, assuming that the limiting distribution and its moments exist. These moments are denoted by
$$E_{asy}(X_N) \qquad V_{asy}(X_N)$$
Definition (Asymptotically normally distributed estimator)
A consistent estimator $\hat{\theta}$ of $\theta$ is said to be asymptotically normally distributed (or asymptotically normal) if:
$$\sqrt{N} \left( \hat{\theta} - \theta_0 \right) \xrightarrow{d} \mathcal{N}(0, \Sigma_0)$$
Equivalently, $\hat{\theta}$ is asymptotically normal if:
$$\hat{\theta} \overset{asy}{\sim} \mathcal{N}\left( \theta_0, N^{-1} \Sigma_0 \right)$$
The asymptotic variance of $\hat{\theta}$ is then defined by:
$$V_{asy}\left( \hat{\theta} \right) \equiv \mathrm{avar}\left( \hat{\theta} \right) = \frac{1}{N} \Sigma_0$$
4.5. Asymptotic distributions
Let's go back to our estimation problem.
We consider a (strongly) consistent estimator $\hat{\theta}_N$ of the true parameter $\theta_0$:
$$\hat{\theta}_N \xrightarrow{a.s.} \theta_0 \implies \hat{\theta}_N \xrightarrow{p} \theta_0$$
This estimator has a degenerate asymptotic distribution (point-mass distribution), since when $N \to \infty$,
$$\lim_{N \to \infty} f_{\hat{\theta}_N}(x) = f(x)$$
where $f_{\hat{\theta}_N}(.)$ is the pdf of $\hat{\theta}_N$ and $f(x)$ is defined by:
$$f(x) = \begin{cases} 1 & \text{if } x = \theta_0 \\ 0 & \text{otherwise} \end{cases}$$
Conclusion: one needs more than consistency to do inference (tests about the true value of $\theta$, etc.).
Solution: we will transform the estimator $\hat{\theta}_N$ to get a transformed variable that has a non-degenerate asymptotic distribution, in order to derive the asymptotic distribution of the estimator.
It is the general idea of the Central Limit Theorem for a particular estimator: the sample mean...
Theorem (Lindeberg–Levy Central Limit Theorem, univariate)
Let $X_1, \ldots, X_N$ denote a sequence of independent and identically distributed random variables with finite mean $E(X_i) = \mu$ and finite variance $V(X_i) = \sigma^2$. Then the sample mean $\bar{X}_N = N^{-1} \sum_{i=1}^{N} X_i$ satisfies
$$\sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
Comment:
1. The result is quite remarkable as it holds regardless of the form of the parent distribution (the distribution of $X_i$).
2. The central limit theorem requires virtually no assumptions (other than independence and finite variances) to end up with normality: normality is inherited from the sums of "small" independent disturbances with finite variance.
Proof: Rao (1973).
Illustration:
1. Let us consider a random variable $X_i \sim \chi^2(2)$, such that $E(X_i) = 2$ and $V(X_i) = 4$, and draw an i.i.d. sample $\{x_i\}_{i=1}^{N}$.
2. Compute the sample mean $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$ and the transformed variable $\sqrt{N} \left( \bar{x}_N - 2 \right) / 2$.
3. Repeat this procedure 5,000 times. We get 5,000 realisations of this transformed variable.
4. Build a histogram (and a nonparametric kernel estimate of the pdf of the transformed variable) of these 5,000 realisations and compare it to the standard normal pdf.
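The steps above can be sketched in Python as follows. This is an illustration under assumptions: the function names and seed are hypothetical, the kernel estimate is replaced by a simpler comparison of empirical and standard normal cdf values, and the $\chi^2(2)$ draws use the fact that a $\chi^2(2)$ variable is exponential with mean 2.

```python
import math
import random
import statistics


def transformed_means(n, reps=5000, seed=7):
    """Realisations of sqrt(N) * (sample mean - 2) / 2 for chi-square(2) samples."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        # A chi-square(2) variable is exponential with mean 2, i.e. rate 1/2
        xbar = statistics.fmean([rng.expovariate(0.5) for _ in range(n)])
        out.append(math.sqrt(n) * (xbar - 2.0) / 2.0)
    return out


def empirical_cdf(values, point):
    """Share of realisations that are less than or equal to `point`."""
    return sum(v <= point for v in values) / len(values)


if __name__ == "__main__":
    normal = statistics.NormalDist()  # standard normal, the CLT limit
    for n in (10, 100):
        z = transformed_means(n, reps=2000)
        for point in (-1.0, 0.0, 1.0):
            print(f"N={n:4d}  x={point:+.1f}  "
                  f"empirical={empirical_cdf(z, point):.3f}  "
                  f"normal={normal.cdf(point):.3f}")
```

For small N the skewness of the $\chi^2(2)$ distribution is still visible in the empirical cdf; the gap to the normal cdf should narrow as N increases.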
[Figure: histograms of the 5,000 realisations of the transformed variable, with the kernel estimate and the standard normal pdf, in four panels: N = 10, N = 100, N = 1,000 and N = 10,000.]
Definition
The convergence result (CLT)
$$\sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
can be understood as:
$$\bar{X}_N \overset{asy}{\sim} \mathcal{N}\left( \mu, \frac{\sigma^2}{N} \right)$$
where the symbol $\overset{asy}{\sim}$ means "asymptotically distributed as". The asymptotic mean and variance of the sample mean are then defined by:
$$E_{asy}\left( \bar{X}_N \right) = \mu \qquad V_{asy}\left( \bar{X}_N \right) = \frac{\sigma^2}{N}$$
Speed of convergence: why study $\sqrt{N} \bar{X}_N$ in the CLT?
1. For simplicity, let us assume that $\mu = E(X_i) = 0$ and let us study the asymptotic behavior of $N^{\alpha} \bar{X}_N$:
$$V\left( N^{\alpha} \bar{X}_N \right) = N^{2\alpha} V\left( \bar{X}_N \right) = N^{2\alpha} \frac{\sigma^2}{N} = N^{2\alpha - 1} \sigma^2$$
2. If we assume that $\alpha > 1/2$, then $2\alpha - 1 > 0$ and the asymptotic variance of $N^{\alpha} \bar{X}_N$ is infinite:
$$\lim_{N \to \infty} V\left( N^{\alpha} \bar{X}_N \right) = +\infty$$
3. If we assume that $\alpha < 1/2$, then $2\alpha - 1 < 0$ and $N^{\alpha} \bar{X}_N$ has a degenerate distribution:
$$\lim_{N \to \infty} V\left( N^{\alpha} \bar{X}_N \right) = 0$$
4. As a consequence, $\alpha = 1/2$ is the only choice to get a finite and positive variance:
$$V\left( \sqrt{N} \bar{X}_N \right) = \sigma^2$$
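The role of the exponent can be made concrete by evaluating $N^{2\alpha - 1} \sigma^2$ for a few values of α. The short sketch below is only an illustration; the function name and the chosen values of α, N and σ² are assumptions.

```python
def scaled_variance(n, alpha, sigma2=1.0):
    """V(N^alpha * sample mean) = N^(2*alpha - 1) * sigma2 for i.i.d. data with mu = 0."""
    return n ** (2 * alpha - 1) * sigma2


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 0.75):
        values = [scaled_variance(n, alpha, sigma2=4.0) for n in (10**2, 10**4, 10**6)]
        print(f"alpha={alpha:.2f}:", ["%.4g" % v for v in values])
```

Only α = 1/2 keeps the variance constant (equal to σ²) as N grows; smaller exponents drive it to 0 and larger ones make it diverge.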
Summary: Let $X_1, \ldots, X_N$ denote a sequence of independent and identically distributed random variables with finite mean $E(X_i) = \mu$ and finite variance $V(X_i) = \sigma^2$. Then, the sample mean
$$\bar{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i$$
satisfies
$$\text{WLLN:} \quad \bar{X}_N \xrightarrow{p} \mu$$
$$\text{CLT:} \quad \sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
The central limit theorem does not assert that the sample mean tends to normality. It is the transformation of the sample mean that has this property:
$$\text{WLLN:} \quad \bar{X}_N \xrightarrow{p} \mu$$
$$\text{CLT:} \quad \sqrt{N} \left( \bar{X}_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
Theorem (Lindeberg–Levy Central Limit Theorem, multivariate)
Let $x_1, \ldots, x_N$ denote a sequence of independent and identically distributed random $K \times 1$ vectors with finite mean $E(x_i) = \mu$ and finite $K \times K$ variance-covariance matrix $V(x_i) = \Sigma$. Then the sample mean $\bar{x}_N = N^{-1} \sum_{i=1}^{N} x_i$ satisfies
$$\underbrace{\sqrt{N} \left( \bar{x}_N - \mu \right)}_{K \times 1} \xrightarrow{d} \mathcal{N}\Big( \underbrace{0}_{K \times 1}, \underbrace{\Sigma}_{K \times K} \Big)$$
Remark: there exist other versions of the CLT, especially for independent but not identically (heterogeneously) distributed variables:
1. Lindeberg–Feller Central Limit Theorem for unequal variances.
2. Liapounov Central Limit Theorem for unequal means and variances.
For more details, see:
Greene W. (2007), Econometric Analysis, sixth edition, Pearson Prentice Hall.
Question: from the CLT (univariate or multivariate), and the asymptotic distribution of $\bar{X}_N$, how to derive the asymptotic distribution of an estimator $\hat{\theta}$ that depends on the sample mean?
$$\hat{\theta} = g\left( \bar{X}_N \right) \overset{asy}{\sim} \ ???$$
Theorem (Continuous mapping theorem)
Let $\{X_i\}$ for $i = 1, \ldots, N$ be a sequence of real-valued random variables and $g(.)$ a continuous function:
if $X_N \xrightarrow{a.s.} X$ then $g(X_N) \xrightarrow{a.s.} g(X)$
if $X_N \xrightarrow{p} X$ then $g(X_N) \xrightarrow{p} g(X)$
if $X_N \xrightarrow{d} X$ then $g(X_N) \xrightarrow{d} g(X)$
Example (multiple linear regression model)
Let us consider the multiple linear regression model
$$y_i = x_i^\top \beta + \mu_i$$
where $x_i = (x_{i1} \ldots x_{iK})^\top$ is a $K \times 1$ vector of random variables, $\beta = (\beta_1 \ldots \beta_K)^\top$ is a $K \times 1$ vector of parameters, and where the error term $\mu_i$ satisfies $E(\mu_i) = 0$, $V(\mu_i) = \sigma^2$ and $E(\mu_i \mid x_{ij}) = 0$, $\forall j = 1, \ldots, K$.
Question: show that the OLS estimator satisfies
$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 E^{-1}\left( x_i x_i^\top \right) \right)$$
Proof:
1. Rewrite the OLS estimator as:
$$\hat{\beta} = \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right) = \beta_0 + \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right)$$
2. Normalize the vector $\hat{\beta} - \beta_0$:
$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) = \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right)$$
Reminder: if $x$ is a vector of random variables and $Y$ is a scalar random variable such that $E(xY) = 0$, then
$$V(xY) = E\left( x \, E\left( Y^2 \mid x \right) x^\top \right)$$
which equals $E\left( x \, V(Y \mid x) \, x^\top \right)$ when $E(Y \mid x) = 0$.
Proof (cont'd):
3. Using the WLLN and the CMT (continuous mapping theorem):
$$\left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \xrightarrow{p} E^{-1}\left( x_i x_i^\top \right)$$
4. Using the CLT:
$$\sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i - E(x_i \mu_i) \right) \xrightarrow{d} \mathcal{N}\left( 0, V(x_i \mu_i) \right)$$
with $E(\mu_i \mid x_{ik}) = 0$, $\forall k = 1, \ldots, K \implies E(x_i \mu_i) = 0$ and
$$V(x_i \mu_i) = E\left( x_i \mu_i \mu_i x_i^\top \right) = E\left( E\left( x_i \mu_i \mu_i x_i^\top \mid x_i \right) \right) = E\left( x_i V(\mu_i \mid x_i) x_i^\top \right) = \sigma^2 E\left( x_i x_i^\top \right)$$
Proof (cont'd): we have
$$\left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \xrightarrow{p} E^{-1}\left( x_i x_i^\top \right)$$
$$\sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 E\left( x_i x_i^\top \right) \right)$$
Theorem (Slutsky's theorem for convergence in distribution)
Let $X_N$ and $Y_N$ be two sequences of random variables where $X_N \xrightarrow{d} X$ and $Y_N \xrightarrow{p} c$, where $c \neq 0$, then:
$$X_N + Y_N \xrightarrow{d} X + c$$
$$X_N Y_N \xrightarrow{d} cX$$
$$\frac{X_N}{Y_N} \xrightarrow{d} \frac{X}{c}$$
If $Y_N$ and $X_N$ are matrices/vectors, then $Y_N^{-1} X_N \xrightarrow{d} c^{-1} X$ with
$$V\left( c^{-1} X \right) = c^{-1} V(X) \left( c^{-1} \right)^\top$$
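Proof (cont'd): By using Slutsky's theorem (for convergence in distribution), we have:
$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) = \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right) \xrightarrow{d} \mathcal{N}(\Pi, \Omega)$$
with
$$\Pi = E^{-1}\left( x_i x_i^\top \right) \times 0 = 0$$
$$\Omega = E^{-1}\left( x_i x_i^\top \right) \sigma^2 E\left( x_i x_i^\top \right) E^{-1}\left( x_i x_i^\top \right) = \sigma^2 E^{-1}\left( x_i x_i^\top \right)$$
Finally, we have:
$$\sqrt{N} \left( \hat{\beta} - \beta_0 \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 E^{-1}\left( x_i x_i^\top \right) \right)$$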
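The limiting variance can be checked numerically in a simplified setting. The sketch below assumes a single regressor without intercept (K = 1), normal regressor and error draws, and hypothetical names and parameter values; none of these come from the slides. With K = 1 the result reads $\sqrt{N}(\hat{\beta} - \beta_0) \xrightarrow{d} \mathcal{N}(0, \sigma^2 / E(x_i^2))$.

```python
import math
import random
import statistics


def normalized_ols_errors(n, reps=1000, beta0=2.0, sigma=1.5, x_std=2.0, seed=3):
    """Realisations of sqrt(N) * (beta_hat - beta0) in y_i = x_i * beta0 + u_i."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        sxx = 0.0
        sxy = 0.0
        for _ in range(n):
            x = rng.gauss(0.0, x_std)   # E(x^2) = x_std^2
            u = rng.gauss(0.0, sigma)   # homoskedastic error, independent of x
            sxx += x * x
            sxy += x * (x * beta0 + u)
        beta_hat = sxy / sxx
        out.append(math.sqrt(n) * (beta_hat - beta0))
    return out


if __name__ == "__main__":
    errors = normalized_ols_errors(200, reps=500)
    theory = 1.5**2 / 2.0**2  # sigma^2 / E(x^2)
    print(f"empirical variance = {statistics.pvariance(errors):.4f}")
    print(f"sigma^2 / E(x^2)   = {theory:.4f}")
```

The empirical variance of the normalized errors should be close to $\sigma^2 / E(x_i^2) = 0.5625$ under these assumed values.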
Definition (univariate Delta method)
Let $Z_N$ be a sequence of random variables indexed by the sample size $N$ such that
$$\sqrt{N} \left( Z_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
If $g(.)$ is a continuous and continuously differentiable function with $g(\mu) \neq 0$ and not involving $N$, then
$$\sqrt{N} \left( g(Z_N) - g(\mu) \right) \xrightarrow{d} \mathcal{N}\left( 0, \left( \left. \frac{\partial g(x)}{\partial x} \right|_{\mu} \right)^2 \sigma^2 \right)$$
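The variance formula of the univariate delta method is easy to evaluate numerically. The helper below is a minimal sketch, not part of the course: its name is hypothetical, the derivative is approximated by a central difference with an assumed step `h`, and the functions $g(x) = x^2$ and $g(x) = \log x$ are arbitrary examples.

```python
import math


def delta_method_variance(g, mu, sigma2, h=1e-6):
    """Asymptotic variance of sqrt(N) * (g(Z_N) - g(mu)): (g'(mu))^2 * sigma2.

    g'(mu) is approximated by a central difference with step h.
    """
    derivative = (g(mu + h) - g(mu - h)) / (2 * h)
    return derivative**2 * sigma2


if __name__ == "__main__":
    # g(x) = x^2 at mu = 3 with sigma2 = 4: exact value (2 * 3)^2 * 4
    print(delta_method_variance(lambda x: x**2, mu=3.0, sigma2=4.0))
    # g(x) = log(x) at mu = 2 with sigma2 = 1: exact value (1 / 2)^2
    print(delta_method_variance(math.log, mu=2.0, sigma2=1.0))
```

An analytical derivative is preferable when it is available; the numerical version is only a convenient check.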
Definition (multivariate Delta method)
Let $Z_N$ be a sequence of random vectors indexed by the sample size such that
$$\sqrt{N} \left( Z_N - \mu \right) \xrightarrow{d} \mathcal{N}(0, \Sigma)$$
If $g(.)$ is a continuous and continuously differentiable multivariate function with $g(\mu) \neq 0$ and not involving $N$, then
$$\sqrt{N} \left( g(Z_N) - g(\mu) \right) \xrightarrow{d} \mathcal{N}\left( 0, \left. \frac{\partial g(x)}{\partial x} \right|_{\mu} \Sigma \left. \frac{\partial g(x)}{\partial x^\top} \right|_{\mu} \right)$$
Example (Gamma distribution)
Let $X_1, \ldots, X_N$ denote a sequence of independent and identically distributed random variables. We assume that $X_i \sim \Gamma(\alpha, \beta)$ (gamma distribution) with $E(X) = \alpha\beta$ and $V(X) = \alpha\beta^2$, $\alpha > 0$, $\beta > 0$ and a pdf defined by:
$$f_X(x; \alpha, \beta) = \frac{x^{\alpha - 1} \exp\left( -\frac{x}{\beta} \right)}{\Gamma(\alpha) \beta^{\alpha}}$$
(not needed in this exercise, given for reference) for all $x \in [0, +\infty[$, where $\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} \exp(-t) \, dt$ denotes the Gamma function. We assume that $\alpha$ is known. Question: What is the asymptotic distribution of the estimator $\hat{\beta}$ defined by:
$$\hat{\beta} = \frac{1}{\alpha N} \sum_{i=1}^{N} X_i$$
Solution: The estimator $\hat{\beta}$ is defined by:
$$\hat{\beta} = \frac{1}{\alpha N} \sum_{i=1}^{N} X_i$$
Since $X_1, \ldots, X_N$ are i.i.d. with $E(X) = \alpha\beta$ and $V(X) = \alpha\beta^2$, we can apply the Lindeberg–Levy CLT, and we get immediately:
$$\sqrt{N} \left( \bar{X}_N - \alpha\beta \right) \xrightarrow{d} \mathcal{N}\left( 0, \alpha\beta^2 \right)$$
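Solution (cont'd): If we define $g(x) = x / \alpha$, with
$$g\left( E\left( \bar{X}_N \right) \right) = g(\alpha\beta) = \beta \neq 0 \qquad \hat{\beta} = \frac{1}{\alpha} \bar{X}_N = g\left( \bar{X}_N \right)$$
$$\sqrt{N} \left( \bar{X}_N - \alpha\beta \right) \xrightarrow{d} \mathcal{N}\left( 0, \alpha\beta^2 \right)$$
By using the delta method, we have:
$$\sqrt{N} \left( g\left( \bar{X}_N \right) - g(\alpha\beta) \right) \xrightarrow{d} \mathcal{N}\left( 0, \left( \left. \frac{\partial g(z)}{\partial z} \right|_{\alpha\beta} \right)^2 \alpha\beta^2 \right)$$
Since $\partial g(z) / \partial z = \partial (z / \alpha) / \partial z = 1 / \alpha$, we have:
$$\sqrt{N} \left( \hat{\beta} - \beta \right) \xrightarrow{d} \mathcal{N}\left( 0, \frac{\beta^2}{\alpha} \right)$$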
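The result of this exercise can be checked by simulation. The sketch below is an illustration under assumed values (α = 3, β = 2, an arbitrary seed and hypothetical function names). It relies on Python's `random.gammavariate(alpha, beta)`, which uses the same shape/scale parameterisation as the pdf above, so that $E(X) = \alpha\beta$.

```python
import math
import random
import statistics


def normalized_beta_errors(n, alpha=3.0, beta=2.0, reps=1500, seed=11):
    """Realisations of sqrt(N) * (beta_hat - beta), with beta_hat = mean / alpha."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        # Shape alpha and scale beta: mean alpha * beta, variance alpha * beta^2
        xbar = statistics.fmean([rng.gammavariate(alpha, beta) for _ in range(n)])
        beta_hat = xbar / alpha
        out.append(math.sqrt(n) * (beta_hat - beta))
    return out


if __name__ == "__main__":
    errors = normalized_beta_errors(100)
    print(f"empirical mean     = {statistics.fmean(errors):+.4f}")
    print(f"empirical variance = {statistics.pvariance(errors):.4f}")
    print(f"beta^2 / alpha     = {2.0**2 / 3.0:.4f}")
```

The empirical variance should be close to $\beta^2 / \alpha = 4/3$ under these assumed values, in agreement with the delta-method result.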
Key Concepts Section 4
1. Almost sure convergence
2. Convergence in probability
3. Law of large numbers: Khinchine's and Kolmogorov's theorems
4. Weakly and strongly consistent estimator
5. Slutsky's theorem
6. Convergence in mean square
7. Convergence in distribution
8. Asymptotic distribution and asymptotic variance
9. Lindeberg–Levy Central Limit Theorem (univariate and multivariate)
10. Continuous mapping theorem
11. Delta method
End of Chapter 1