Chapter 1: Estimation Theory
Advanced Econometrics - HEC Lausanne
Christophe Hurlin, University of Orléans
November 20, 2013

Section 1. Introduction

Estimation problem

Let us consider a continuous random variable $Y$ characterized by a marginal probability density function $f_Y(y;\theta)$ for $y \in \mathbb{R}$ and $\theta \in \Theta$. The parameter $\theta$ is unknown.
Let $\{Y_1,\dots,Y_N\}$ be a random sample of i.i.d. random variables that have the same distribution as $Y$.
We have one realisation $\{y_1,\dots,y_N\}$ of this sample.
How can we estimate the parameter $\theta$?

Remarks

1. The estimation problem can be extended to the case of an econometric model. In this case we consider two variables $Y$ and $X$ and a conditional pdf $f_{Y|X=x}(y;\theta)$ that depends on a parameter or a vector of unknown parameters $\theta$.
2. In this chapter, we do not derive the estimators (for the estimation methods, see the next chapters). We take as given an estimator $\hat{\theta}$ of $\theta$, whatever the estimation method used, and we study its finite sample and large sample properties.

Notations: In this course, I will (try to...) follow some conventions of notation.

$Y$ : random variable
$y$ : realisation
$f_Y(y)$ : probability density or mass function
$F_Y(y)$ : cumulative distribution function
$\Pr(\cdot)$ : probability
$\mathbf{y}$ : vector
$\mathbf{Y}$ : matrix

Problem: this system of notation does not distinguish between a vector (matrix) of random elements and a vector (matrix) of non-stochastic elements (realisations).
Abadir and Magnus (2002), Notation in econometrics: a proposal for a standard, Econometrics Journal.
The outline of this chapter is the following:

Section 2: What is an estimator?
Section 3: Finite sample properties
Section 4: Large sample properties
  Subsection 4.1: Almost sure convergence
  Subsection 4.2: Convergence in probability
  Subsection 4.3: Convergence in mean square
  Subsection 4.4: Convergence in distribution
  Subsection 4.5: Asymptotic distributions

Section 2. What is an Estimator?

Objectives

1. Define the concept of estimator.
2. Define the concept of estimate.
3. Sampling distribution.
4. Discussion about the notion of "good" estimator.

Definition (Point estimator)
A point estimator is any function $T(Y_1, Y_2, \dots, Y_N)$ of a sample. Any statistic is a point estimator.

Example (Sample mean)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample mean (or average)
$$\bar{Y}_N = \frac{1}{N}\sum_{i=1}^{N} Y_i$$
is a point estimator (or an estimator) of $m$.

Example (Sample variance)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample variance
$$S_N^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2$$
is a point estimator (or an estimator) of $\sigma^2$.
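The two estimators defined above can be written as ordinary functions of a sample, which makes the estimator/estimate distinction concrete: the function is the estimator, and its value on one realised sample is the estimate. The following is only a minimal sketch; the function names, the seed and the "true" values of $m$ and $\sigma$ are illustrative assumptions, not part of the course material.

```python
import random


def sample_mean(ys):
    """Point estimator of m: the average of the sample."""
    return sum(ys) / len(ys)


def sample_variance(ys):
    """Point estimator of sigma^2, using the N - 1 divisor."""
    n = len(ys)
    y_bar = sample_mean(ys)
    return sum((y - y_bar) ** 2 for y in ys) / (n - 1)


if __name__ == "__main__":
    rng = random.Random(0)      # arbitrary seed, for reproducibility only
    m, sigma = 1.0, 2.0         # hypothetical true parameter values
    ys = [rng.gauss(m, sigma) for _ in range(50)]  # one realised sample
    print("estimate of m:", sample_mean(ys))
    print("estimate of sigma^2:", sample_variance(ys))
```

Calling the same functions on a different realised sample returns different numbers, which is why an estimator has a sampling distribution.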
Fact
An estimator $\hat{\theta}$ is a random variable.
Consequence: $\hat{\theta}$ has a (marginal or conditional) probability distribution. This sampling distribution is characterized by a probability density function (pdf) $f_{\hat{\theta}}(u)$.

Definition (Sampling distribution)
The probability distribution of an estimator (or a statistic) is called the sampling distribution.

Fact
An estimator $\hat{\theta}$ is a random variable.
Consequence: The sampling distribution of $\hat{\theta}$ is characterized by moments such as the expectation $E(\hat{\theta})$, the variance $V(\hat{\theta})$ and more generally the $k$-th central moment defined by:
$$E\left[\left(\hat{\theta} - E(\hat{\theta})\right)^k\right] = \int \left(u - \mu_{\hat{\theta}}\right)^k f_{\hat{\theta}}(u)\,du \quad \forall k \in \mathbb{N}$$
$$\mu_{\hat{\theta}} = E(\hat{\theta}) = \int u\, f_{\hat{\theta}}(u)\,du$$

Definition (Point estimate)
A (point) estimate is the realized value of an estimator (i.e. a number) that is obtained when a sample is actually taken. For an estimator $\hat{\theta}$ it can be denoted by $\hat{\theta}(y)$.

Example (Point estimate)
For instance $\bar{y}_N$ is an estimate of $m$:
$$\bar{y}_N = \frac{1}{N}\sum_{i=1}^{N} y_i$$
If $N = 3$ and $\{y_1, y_2, y_3\} = \{3, -1, 2\}$ then $\bar{y}_N = 1.333$.
If $N = 3$ and $\{y_1, y_2, y_3\} = \{4, -8, 1\}$ then $\bar{y}_N = -1$.
etc.

Question: What constitutes a good estimator?

1. The search for good estimators constitutes much of econometrics.
2. An estimator is a rule or strategy for using the data to estimate the parameter. It is defined before the data are drawn.
3. Our objective is to use the sample data to infer the value of a parameter or set of parameters, which we denote $\theta$.
4. Sampling distributions are used to make inferences about the population. The issue is to know whether the sampling distribution of the estimator $\hat{\theta}$ is informative about the value of $\theta$.

Question (cont'd): What constitutes a good estimator?

1. Obviously, some estimators are better than others.
   (a) To take a simple example, your intuition should convince you that the sample mean would be a better estimator of the population mean than the sample minimum; the minimum is almost certain to underestimate the mean.
   (b) Nonetheless, the minimum is not entirely without virtue; it is easy to compute, which is occasionally a relevant criterion.
2. The idea is to study the properties of the sampling distribution, and especially its moments such as $E(\hat{\theta})$ (for the bias), $V(\hat{\theta})$ (for the precision), etc.

Question (cont'd): What constitutes a good estimator?

Estimators are compared on the basis of a variety of attributes.

1. Finite sample properties (or finite sample distribution) of estimators are those attributes that can be compared regardless of the sample size (Section 3).
2. Some estimation problems involve characteristics that are unknown in finite samples. In these cases, estimators are compared on the basis of their large sample, or asymptotic, properties (Section 4).
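The comparison between the sample mean and the sample minimum can be illustrated by simulating both sampling distributions. This is a rough sketch under assumed settings (standard normal data, samples of size 20, 2,000 replications, all chosen arbitrarily); the helper names are hypothetical.

```python
import random
import statistics


def simulate(estimator, n, reps, rng, m=0.0, sigma=1.0):
    """Apply `estimator` to `reps` independent samples of size n from N(m, sigma^2)."""
    return [
        estimator([rng.gauss(m, sigma) for _ in range(n)])
        for _ in range(reps)
    ]


def compare_mean_and_min(n=20, reps=2000, seed=0):
    """Average value of each estimator over the simulated samples (true mean is 0)."""
    rng = random.Random(seed)
    means = simulate(statistics.fmean, n, reps, rng)
    minima = simulate(min, n, reps, rng)
    return {
        "mean_of_sample_means": statistics.fmean(means),
        "mean_of_sample_minima": statistics.fmean(minima),
    }


if __name__ == "__main__":
    print(compare_mean_and_min())
```

Under these assumptions the simulated sample means are centred close to the true mean, while the simulated minima are centred well below it, in line with the remark that the minimum is almost certain to underestimate the mean.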
Key Concepts Section 2

1. Point estimator
2. Point estimate
3. Sampling distribution

Section 3. Finite Sample Properties

Objectives

1. Define the concept of finite sample distribution.
2. Finite sample properties => What is a good estimator?
3. Unbiased estimator.
4. Comparison of two unbiased estimators.
5. FDCR or Cramer-Rao bound.
6. Best Linear Unbiased Estimator (BLUE).

Definition (Finite sample properties and finite sample distribution)
The finite sample properties of an estimator $\hat{\theta}$ correspond to the properties of its finite sample distribution (or exact distribution) defined for any sample size $N \in \mathbb{N}$.

Two cases:

1. In some particular cases, the finite sample distribution of the estimator is known. It corresponds to the distribution of the random variable $\hat{\theta}$ for any sample size $N$.
2. In most cases, the finite sample distribution is unknown, but we can study some specific moments (mean, variance, etc.) of this distribution (finite sample properties).

Example (Sample mean and finite sample distribution)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The estimator $\hat{m} = \bar{Y}_N$ (sample mean) also has a normal distribution:
$$\hat{m} = \frac{1}{N}\sum_{i=1}^{N} Y_i \sim \mathcal{N}\left(m, \frac{\sigma^2}{N}\right) \quad \forall N \in \mathbb{N}$$
Consequence: the finite sample distribution of $\hat{m}$ for any $N \in \mathbb{N}$ is fully characterized by $m$ and $\sigma^2$ (parameters that can be estimated).
Example: if $N = 3$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/3)$; if $N = 10$, then $\hat{m} \sim \mathcal{N}(m, \sigma^2/10)$; etc.

Proof: A sum of independent normal variables has a normal distribution, with:
$$E(\hat{m}) = \frac{1}{N}\sum_{i=1}^{N} E(Y_i) = \frac{Nm}{N} = m$$
$$V(\hat{m}) = V\left(\frac{1}{N}\sum_{i=1}^{N} Y_i\right) = \frac{1}{N^2}\sum_{i=1}^{N} V(Y_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N}$$
since the variables $Y_i$ are independent (then $\text{cov}(Y_i, Y_j) = 0$ for $i \neq j$) and identically distributed (then $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, $\forall i \in [1, \dots, N]$).

Remarks

1. Except in very particular cases (normally distributed samples), the exact distribution of the estimator is very difficult to calculate.
2. Sometimes, it is possible to derive the exact distribution of a transformed variable $g(\hat{\theta})$, where $g(\cdot)$ is a continuous function.

Example (Sample variance and finite sample distribution)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample variance
$$S_N^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2$$
is an estimator of $\sigma^2$. The transformed variable $(N-1)S_N^2/\sigma^2$ has a Chi-squared (exact / finite sample) distribution with $N-1$ degrees of freedom:
$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Proof: see Chapter 4.

Fact
In most cases, it is impossible to derive the exact / finite sample distribution of the estimator (or of a transformed variable).

Two reasons:

1. In some cases, the exact distribution of $Y_1, Y_2, \dots, Y_N$ is known, but the function $T(\cdot)$ is too complicated to derive the distribution of $\hat{\theta}$:
$$\hat{\theta} = T(Y_1, \dots, Y_N) \sim\ ??? \quad \forall N \in \mathbb{N}$$
2. In most cases, the distribution of the sample variables $Y_1, Y_2, \dots, Y_N$ is unknown:
$$\hat{\theta} = T(Y_1, \dots, Y_N) \sim\ ??? \quad \forall N \in \mathbb{N}$$

Question: how can we evaluate the finite sample properties of the estimator $\hat{\theta}$ when its finite sample distribution is unknown?
$$\hat{\theta} \sim\ ??? \quad \forall N \in \mathbb{N}$$
Solution: We will focus on some specific moments of this (unknown) finite sample (sampling) distribution in order to study some properties of the estimator $\hat{\theta}$ and determine whether it is a "good" estimator or not.

Definition (Unbiased estimator)
An estimator $\hat{\theta}$ of a parameter $\theta$ is unbiased if the mean of its sampling distribution is $\theta$:
$$E(\hat{\theta}) = \theta$$
or
$$E(\hat{\theta} - \theta) = \text{Bias}(\hat{\theta} \mid \theta) = 0$$
implies that $\hat{\theta}$ is unbiased.
If $\theta$ is a vector of parameters, then the estimator is unbiased if the expected value of every element of $\hat{\theta}$ equals the corresponding element of $\theta$.

Source: Greene (2007), Econometrics

Example (Bernoulli distribution)
Let $Y_1, Y_2, \dots, Y_N$ be a random sample from a Bernoulli distribution with success probability $p$. An unbiased estimator of $p$ is
$$\hat{p} = \frac{1}{N}\sum_{i=1}^{N} Y_i$$
Proof: Since the $Y_i$ are i.i.d. with $E(Y_i) = p$, we have:
$$E(\hat{p}) = \frac{1}{N}\sum_{i=1}^{N} E(Y_i) = \frac{pN}{N} = p$$

Example (Uniform distribution)
Let $Y_1, Y_2, \dots, Y_N$ be a random sample from a uniform distribution $U_{[0,\theta]}$. An unbiased estimator of $\theta$ is
$$\hat{\theta} = \frac{2}{N}\sum_{i=1}^{N} Y_i$$
Proof: Since the $Y_i$ are i.i.d.
with $E(Y_i) = (\theta + 0)/2 = \theta/2$, we have:
$$E(\hat{\theta}) = E\left(\frac{2}{N}\sum_{i=1}^{N} Y_i\right) = \frac{2}{N}\sum_{i=1}^{N} E(Y_i) = \frac{2}{N}\cdot\frac{N\theta}{2} = \theta$$

Example (Multiple linear regression model)
Consider the model
$$\mathbf{y} = \mathbf{X}\beta + \mu$$
where $\mathbf{y} \in \mathbb{R}^N$, $\mathbf{X} \in \mathcal{M}_{N \times K}$ is a nonrandom matrix, $\beta \in \mathbb{R}^K$ is a vector of parameters, $E(\mu) = 0_{N \times 1}$ and $V(\mu) = \sigma^2 I_N$. The OLS estimator
$$\hat{\beta} = \left(\mathbf{X}^\top\mathbf{X}\right)^{-1}\mathbf{X}^\top\mathbf{y}$$
is an unbiased estimator of $\beta$.

Proof: Since $\mathbf{y} = \mathbf{X}\beta + \mu$, $\mathbf{X} \in \mathcal{M}_{N \times K}$ is a nonrandom matrix and $E(\mu) = 0$, we have
$$E(\mathbf{y}) = \mathbf{X}\beta$$
As a consequence:
$$E(\hat{\beta}) = \left(\mathbf{X}^\top\mathbf{X}\right)^{-1}\mathbf{X}^\top E(\mathbf{y}) = \left(\mathbf{X}^\top\mathbf{X}\right)^{-1}\mathbf{X}^\top\mathbf{X}\beta = \beta$$
The estimator $\hat{\beta}$ is unbiased.

Remark: Even if it does not strictly belong in a section devoted to the finite sample properties of estimators, we can introduce here the notion of asymptotically unbiased estimator (which can be considered a large sample property). Here we assume that the estimator $\hat{\theta} = \hat{\theta}_N$ depends on the sample size $N$.

Definition (Asymptotically unbiased estimator)
The sequence of estimators $\hat{\theta}_N$ (with $N \in \mathbb{N}$) is asymptotically unbiased if
$$\lim_{N \to \infty} E(\hat{\theta}_N) = \theta$$

Example (Sample variance)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The uncorrected sample variance defined by
$$\tilde{S}_N^2 = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2$$
is a biased estimator of $\sigma^2$ but is asymptotically unbiased.
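The bias of the uncorrected sample variance stated in the example above can be checked numerically by averaging both variance estimators over many simulated samples. This is only a sketch under assumed settings (standard normal data, $N = 5$, 10,000 replications, arbitrary seed); it relies on `statistics.variance` (divisor $N-1$) and `statistics.pvariance` (divisor $N$) from the Python standard library.

```python
import random
import statistics


def average_variance_estimates(n=5, reps=10000, sigma=1.0, seed=0):
    """Monte Carlo averages of the corrected and uncorrected sample variances."""
    rng = random.Random(seed)
    corrected, uncorrected = [], []
    for _ in range(reps):
        ys = [rng.gauss(0.0, sigma) for _ in range(n)]
        corrected.append(statistics.variance(ys))      # divides by N - 1
        uncorrected.append(statistics.pvariance(ys))   # divides by N
    return statistics.fmean(corrected), statistics.fmean(uncorrected)


if __name__ == "__main__":
    avg_corrected, avg_uncorrected = average_variance_estimates()
    print("average corrected variance:  ", avg_corrected)
    print("average uncorrected variance:", avg_uncorrected)
```

With $\sigma^2 = 1$ and $N = 5$, the first average should be close to 1 and the second close to $(N-1)/N = 0.8$; increasing `n` moves the second average towards 1, which is the asymptotic unbiasedness.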
Proof: We know that:
$$S_N^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2 \qquad \frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Since there is a relationship between $S_N^2$ and $\tilde{S}_N^2$, namely
$$\tilde{S}_N^2 = \frac{1}{N}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2 = \frac{N-1}{N} S_N^2$$
we get:
$$\frac{N}{\sigma^2}\tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$

Proof (cont'd):
$$\frac{N}{\sigma^2}\tilde{S}_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
Reminder: If $X \sim \chi^2(v)$, then $E(X) = v$ and $V(X) = 2v$.
By definition:
$$E\left(\frac{N}{\sigma^2}\tilde{S}_N^2\right) = N - 1$$
or equivalently:
$$E\left(\tilde{S}_N^2\right) = \frac{N-1}{N}\sigma^2 \neq \sigma^2$$
So, $\tilde{S}_N^2 = (1/N)\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2$ is a biased estimator of $\sigma^2$.

Proof (cont'd): But $\tilde{S}_N^2 = (1/N)\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2$ is asymptotically unbiased since:
$$\lim_{N \to \infty} E\left(\tilde{S}_N^2\right) = \lim_{N \to \infty}\frac{N-1}{N}\sigma^2 = \sigma^2$$
Remark: Even in a more general framework (non-normal), the sample variance (with a correction for small samples) is an unbiased estimator of $\sigma^2$:
$$S_N^2 = \underbrace{(N-1)^{-1}}_{\text{correction for small sample}}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2 \qquad E\left(S_N^2\right) = \sigma^2$$

Unbiasedness is interesting per se but not so much!

1. The absence of bias is not a sufficient criterion to discriminate among competing estimators.
2. There may exist many unbiased estimators for the same parameter (vector) of interest.

Example (Estimators)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$. The statistics
$$\hat{m}_1 = \frac{1}{N}\sum_{i=1}^{N} Y_i \qquad \hat{m}_2 = Y_1$$
are unbiased estimators of $m$.

Proof: Since the $Y_i$ are i.i.d.
with $E(Y_i) = m$, we have:
$$E(\hat{m}_1) = \frac{1}{N}\sum_{i=1}^{N} E(Y_i) = \frac{Nm}{N} = m$$
$$E(\hat{m}_2) = E(Y_1) = m$$
Both estimators $\hat{m}_1$ and $\hat{m}_2$ of the parameter $m$ are unbiased.

How to compare two unbiased estimators?

When two (or more) estimators are unbiased, the best one is the most precise, i.e. the estimator with the minimum variance.
Comparing two (or more) unbiased estimators becomes equivalent to comparing their variance-covariance matrices.

Definition
Suppose that $\hat{\theta}_1$ and $\hat{\theta}_2$ are two unbiased estimators. $\hat{\theta}_1$ dominates $\hat{\theta}_2$, i.e. $\hat{\theta}_1 \succ \hat{\theta}_2$, if and only if
$$V(\hat{\theta}_1) \leq V(\hat{\theta}_2)$$
In the case where $\hat{\theta}_1$, $\hat{\theta}_2$ and $\theta$ are vectors, this inequality becomes:
$$V(\hat{\theta}_2) - V(\hat{\theta}_1) \text{ is a positive semi-definite matrix}$$

[Figure: curves labelled "Estimator 1" and "Estimator 2"; horizontal axis: $\theta$.]

Example (Estimators)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$. The estimator $\hat{m}_1 = N^{-1}\sum_{i=1}^{N} Y_i$ dominates the estimator $\hat{m}_2 = Y_1$.

Proof: The two estimators $\hat{m}_1$ and $\hat{m}_2$ are unbiased, so they can be compared in terms of variance (precision):
$$V(\hat{m}_1) = \frac{1}{N^2}\sum_{i=1}^{N} V(Y_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N} \quad \text{since the } Y_i \text{ are i.i.d.}$$
$$V(\hat{m}_2) = V(Y_1) = \sigma^2$$
So, $V(\hat{m}_1) \leq V(\hat{m}_2)$: the estimator $\hat{m}_1$ is preferred to $\hat{m}_2$, $\hat{m}_1 \succ \hat{m}_2$.

Question: is there a bound for the variance of unbiased estimators?
Definition (Cramer-Rao or FDCR bound)
Let $X_1, \dots, X_N$ be an i.i.d. sample with pdf $f_X(\theta; x)$. Let $\hat{\theta}$ be an unbiased estimator of $\theta$; i.e., $E_\theta(\hat{\theta}) = \theta$. If $f_X(\theta; x)$ is regular then
$$V_\theta(\hat{\theta}) \geq I_N^{-1}(\theta_0) = \text{FDCR or Cramer-Rao bound}$$
where $I_N(\theta_0)$ denotes the Fisher information number for the sample evaluated at the true value $\theta_0$.
If $\theta$ is a vector then this inequality means that $V_\theta(\hat{\theta}) - I_N^{-1}(\theta_0)$ is positive semi-definite.
FDCR: Fréchet - Darmois - Cramér - Rao
Remark: we will define the Fisher information matrix (or number) in Chapter 2 (Maximum Likelihood Estimation).

Definition (Efficiency)
An estimator is efficient if its variance attains the FDCR (Fréchet - Darmois - Cramér - Rao) or Cramer-Rao bound:
$$V_\theta(\hat{\theta}) = I_N^{-1}(\theta_0)$$
where $I_N(\theta_0)$ denotes the Fisher information matrix associated with the sample, evaluated at the true value $\theta_0$.

Finally, note that in some cases we further restrict the set of estimators to linear functions of the data.

Definition (BLUE estimator)
An estimator is the minimum variance linear unbiased estimator or best linear unbiased estimator (BLUE) if it is a linear function of the data and has minimum variance among linear unbiased estimators.

Remark: the term "linear" means that the estimator $\hat{\theta}$ is a linear function of the data $Y_i$:
$$\hat{\theta}_j = \sum_{i=1}^{N}\omega_{ij} Y_i$$
Key Concepts Section 3

1. Finite sample distribution
2. Finite sample properties
3. Bias and unbiased estimator
4. Comparison of unbiased estimators
5. Cramer-Rao or FDCR bound
6. Efficient estimator
7. Linear estimator
8. BLUE estimator

Section 4. Asymptotic Properties

Problem:

1. Let us consider an i.i.d. sample $Y_1, Y_2, \dots, Y_N$, where $Y$ has a pdf $f_Y(y;\theta)$ and $\theta$ is an unknown parameter.
2. We assume that $f_Y(y;\theta)$ is also unknown (we do not know the distribution of $Y_i$).
3. We consider an estimator $\hat{\theta}$ (also denoted $\hat{\theta}_N$ to show that it depends on $N$) such that
$$\hat{\theta}_N = T(Y_1, Y_2, \dots, Y_N)$$
4. The finite sample distribution of $\hat{\theta}_N$ is unknown:
$$\hat{\theta}_N \sim\ ??? \quad \forall N \in \mathbb{N}$$

Question: what is the behavior of the random variable $\hat{\theta}_N$ when the sample size $N$ tends to infinity?

Definition (Asymptotic theory)
Asymptotic or large sample theory consists in the study of the distribution of the estimator when the sample size is sufficiently large.

The asymptotic theory is fundamentally based on the notion of convergence.

We are mainly concerned with four modes of convergence:

1. Almost sure convergence
2. Convergence in probability
3. Convergence in quadratic mean
4. Convergence in distribution
Subsection 4.1. Almost Sure Convergence

Definition (Almost sure convergence)
Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$ converges almost surely (or with probability 1, or strongly) to a constant $c$ if, for every $\varepsilon > 0$,
$$\Pr\left(\lim_{N \to \infty}\left|X_N - c\right| < \varepsilon\right) = 1$$
or equivalently if:
$$\Pr\left(\lim_{N \to \infty} X_N = c\right) = 1$$
It is written
$$X_N \xrightarrow{a.s.} c$$

Comments

1. Almost sure convergence means that the values of $X_N$ approach the value $c$, in the sense (hence "almost surely") that the events for which $X_N$ does not converge to $c$ have probability 0.
2. In other words, when $N$ tends to infinity, the random variable $X_N$ tends to a degenerate random variable (a random variable which only takes a single value $c$), with a pdf equal to a probability mass function.

[Figure: horizontal axis marked at $c = 2$.]

Definition (Strong consistency)
A point estimator $\hat{\theta}_N$ of $\theta$ is strongly consistent if:
$$\hat{\theta}_N \xrightarrow{a.s.} \theta$$
Comments

When $N \to \infty$, the estimator tends to a degenerate random variable that takes a single value equal to $\theta$.
The crème de la crème (best of the best) of the estimators.

Subsection 4.2. Convergence in Probability

Definition (Convergence in probability)
Let $X_N$ be a sequence of random variables indexed by the sample size. $X_N$ converges in probability to a constant $c$ if, for any $\varepsilon > 0$,
$$\lim_{N \to \infty}\Pr\left(\left|X_N - c\right| > \varepsilon\right) = 0$$
It is written
$$X_N \xrightarrow{p} c \quad \text{or} \quad \text{plim}\, X_N = c$$

$X_N \xrightarrow{p} c$ if $\lim_{N \to \infty}\Pr\left(\left|X_N - c\right| > \varepsilon\right) = 0$

[Figure: markers at $c - \varepsilon$ and $c + \varepsilon$ around $c = 2$; annotation: "This area tends to 0".]

$X_N \xrightarrow{p} c$ if $\lim_{N \to \infty}\Pr\left(\left|X_N - c\right| > \varepsilon\right) = 0$ for a very small $\varepsilon$...
[Figure: horizontal axis marked at $c = 2$.]

Comments

1. The general idea is the same as for almost sure convergence: $X_N$ tends to a degenerate random variable equal to $c$ (even if it is not exactly the case).
2. But when $X_N$ is very likely to be close to $c$ for large $N$, what about the location of the remaining small probability mass which is not close to $c$?
3. Convergence in probability allows more erratic behavior in the converging sequence than almost sure convergence.

Remark
The notation
$$X_N \xrightarrow{p} X$$
where $X$ is a random element (scalar, vector, matrix) means that the variable $X_N - X$ converges to $c = 0$:
$$X_N - X \xrightarrow{p} 0$$

Definition (Weak consistency)
A point estimator $\hat{\theta}_N$ of $\theta$ is (weakly) consistent if:
$$\hat{\theta}_N \xrightarrow{p} \theta$$
Remark: In econometrics, in most cases, we only consider weak consistency. When we say that an estimator is "consistent", it generally refers to convergence in probability.

Lemma (Convergence in probability)
Let $X_N$ be a sequence of random variables indexed by the sample size and $c$ a constant. If
$$\lim_{N \to \infty} E(X_N) = c \qquad \lim_{N \to \infty} V(X_N) = 0$$
then $X_N$ converges in probability to $c$ as $N \to \infty$:
$$X_N \xrightarrow{p} c$$
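The lemma above is stated without proof. One standard way to see why it holds, given here as a short sketch rather than as the argument used in the course, is to apply Markov's inequality to the non-negative variable $(X_N - c)^2$ and to split the mean squared error into variance plus squared bias:

```latex
\Pr\left(\left|X_N - c\right| > \varepsilon\right)
  \le \frac{E\left[(X_N - c)^2\right]}{\varepsilon^2}
  = \frac{V(X_N) + \left(E(X_N) - c\right)^2}{\varepsilon^2}
  \xrightarrow[N \to \infty]{} 0
  \qquad \forall \varepsilon > 0
```

Both terms in the numerator vanish under the two conditions of the lemma, so the probability on the left tends to 0, which is exactly the definition of convergence in probability.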
Example (Consistent estimator)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, where $\sigma^2$ is known and $m$ is unknown. The estimator $\hat{m}$, defined by
$$\hat{m} = \frac{1}{N}\sum_{i=1}^{N} Y_i$$
is a consistent estimator of $m$.

Proof: Since $Y_1, Y_2, \dots, Y_N$ are i.i.d. with $E(Y_i) = m$ and $V(Y_i) = \sigma^2$, we have:
$$E(\hat{m}) = \frac{1}{N}\sum_{i=1}^{N} E(Y_i) = m$$
$$\lim_{N \to \infty} V(\hat{m}) = \lim_{N \to \infty}\frac{1}{N^2}\sum_{i=1}^{N} V(Y_i) = \lim_{N \to \infty}\frac{\sigma^2}{N} = 0$$
The estimator $\hat{m}$ is (weakly) consistent:
$$\hat{m} \xrightarrow{p} m$$

Example (Consistent estimator)
Assume that $Y_1, Y_2, \dots, Y_N$ are i.i.d. $\mathcal{N}(m, \sigma^2)$ random variables. The sample variance defined by
$$S_N^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(Y_i - \bar{Y}_N\right)^2$$
is a (weakly) consistent estimator of $\sigma^2$.

Proof: We know that for a normal sample:
$$\frac{(N-1)}{\sigma^2} S_N^2 \sim \chi^2(N-1) \quad \forall N \in \mathbb{N}$$
$$E\left(\frac{(N-1)}{\sigma^2} S_N^2\right) = N - 1 \qquad V\left(\frac{(N-1)}{\sigma^2} S_N^2\right) = 2(N-1)$$
Dividing by $(N-1)/\sigma^2$ for the mean and by its square for the variance, we get immediately:
$$E\left(S_N^2\right) = \sigma^2$$
$$\lim_{N \to \infty} V\left(S_N^2\right) = \lim_{N \to \infty}\frac{2\sigma^4}{N-1} = 0$$
The estimator $S_N^2$ is (weakly) consistent: $S_N^2 \xrightarrow{p} \sigma^2$.

Lemma (Chain of implication)
Almost sure convergence implies convergence in probability:
$$\xrightarrow{a.s.} \ \Longrightarrow\ \xrightarrow{p}$$
where the symbol "$\Longrightarrow$" means "implies". The converse is not true.
Comments

1. One of the main applications of convergence in probability and almost sure convergence is the law of large numbers.
2. The law of large numbers tells you that the sample mean converges in probability (weak law of large numbers) or almost surely (strong law of large numbers) to the population mean:
$$\bar{X}_N = \frac{1}{N}\sum_{i=1}^{N} X_i \xrightarrow[N \to \infty]{} E(X_i)$$

Theorem (Weak law of large numbers, Khinchine)
If $\{X_i\}$, for $i = 1, \dots, N$, is a sequence of independently and identically distributed (i.i.d.) random variables with finite mean $E(X_i) = \mu$ ($< \infty$), then the sample mean $\bar{X}_N$ converges in probability to $\mu$:
$$\bar{X}_N = \frac{1}{N}\sum_{i=1}^{N} X_i \xrightarrow{p} E(X_i) = \mu$$

Theorem (Strong law of large numbers, Kolmogorov)
If $\{X_i\}$, for $i = 1, \dots, N$, is a sequence of independently and identically distributed (i.i.d.) random variables such that $E(X_i) = \mu$ ($< \infty$) and $E(|X_i|) < \infty$, then the sample mean $\bar{X}_N$ converges almost surely to $\mu$:
$$\bar{X}_N = \frac{1}{N}\sum_{i=1}^{N} X_i \xrightarrow{a.s.} E(X_i) = \mu$$

Illustration:

1. Let us consider a random variable $X_i \sim U_{[0,10]}$ and draw an i.i.d. sample $\{x_i\}_{i=1}^{N}$.
2. Compute the sample mean $\bar{x}_N = N^{-1}\sum_{i=1}^{N} x_i$.
3. Repeat this procedure 500 times. We get 500 realisations of the sample mean $\bar{x}_N$.
4. Build a histogram of these 500 realisations.
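The four steps of the illustration above can be sketched with the Python standard library alone. The function names, the seed and the 20-bin histogram are assumptions made for this sketch; only the $U_{[0,10]}$ distribution and the 500 replications come from the slide.

```python
import random
import statistics


def sample_means(n, reps=500, seed=0):
    """Steps 1-3: draw `reps` i.i.d. U[0, 10] samples of size n, return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.uniform(0.0, 10.0) for _ in range(n))
        for _ in range(reps)
    ]


def histogram_counts(values, lo=0.0, hi=10.0, bins=20):
    """Step 4: number of values falling in each of `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        k = min(int((v - lo) / width), bins - 1)  # put v == hi in the last bin
        counts[k] += 1
    return counts


if __name__ == "__main__":
    for n in (10, 100, 1000):
        means = sample_means(n)
        print(n, round(statistics.pstdev(means), 4), histogram_counts(means))
```

As `n` grows, the printed spread of the 500 sample means shrinks and the histogram counts pile up in the bins around 5, the population mean of $U_{[0,10]}$. Larger sizes such as 10,000 work the same way but take longer in pure Python.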
[Figure: histograms of the 500 realisations of the sample mean $\overline{x}_N$ for $N = 10$, $N = 100$, $N = 1{,}000$ and $N = 10{,}000$; horizontal axis from 0 to 10.]

Proof: There are many proofs of the law of large numbers. Most of them use the additional assumption of finite variance $V(X_i) = \sigma^2$ and Chebyshev's inequality.

Theorem (Chebyshev's inequality)
Let $X$ be a random variable with finite expected value $\mu$ and finite non-zero variance $\sigma^2$. Then for any real number $k > 0$,
$$\Pr\left( |X - \mu| \geq k\sigma \right) \leq \frac{1}{k^2}$$

Proof (cont'd): Under the assumption of i.i.d. $(\mu, \sigma^2)$ variables, we have:
$$E\left( \overline{X}_N \right) = \mu \qquad V\left( \overline{X}_N \right) = \frac{\sigma^2}{N}$$
Given Chebyshev's inequality, we get for $k > 0$:
$$\Pr\left( \left| \overline{X}_N - \mu \right| \geq k \frac{\sigma}{\sqrt{N}} \right) \leq \frac{1}{k^2}$$
Let us define $\varepsilon > 0$ such that
$$\varepsilon = \frac{k\sigma}{\sqrt{N}} \iff k = \frac{\varepsilon \sqrt{N}}{\sigma}$$
Then we get for any $\varepsilon > 0$:
$$\Pr\left( \left| \overline{X}_N - \mu \right| \geq \varepsilon \right) \leq \frac{\sigma^2}{\varepsilon^2 N}$$
When $N \to \infty$ the upper bound tends to 0, and since a probability cannot be negative, the limit is necessarily equal to 0:
$$\lim_{N \to \infty} \Pr\left( \left| \overline{X}_N - \mu \right| \geq \varepsilon \right) = 0 \quad \forall \varepsilon > 0$$
Since $\Pr\left( \left| \overline{X}_N - \mu \right| < \varepsilon \right) = 1 - \Pr\left( \left| \overline{X}_N - \mu \right| \geq \varepsilon \right)$, we have:
$$\lim_{N \to \infty} \Pr\left( \left| \overline{X}_N - \mu \right| < \varepsilon \right) = 1 \quad \forall \varepsilon > 0$$
This is the weak law. Recall also that the strong law implies the weak law:
$$\overline{X}_N \xrightarrow{a.s.} \mu \text{ (SLLN)} \; \Longrightarrow \; \overline{X}_N \xrightarrow{p} \mu \text{ (WLLN)}$$

Remarks
1. These two theorems consider a sequence of independently and identically distributed (i.i.d.) random variables (and, as a consequence, with the same mean $E(X_i) = \mu$, $\forall i = 1, \ldots, N$).
2. There are alternative versions of the law of large numbers for independent random variables that are not identically (heterogeneously) distributed, with $E(X_i) = \mu_i$ (cf. Greene, 2007):
   (a) Chebychev's Weak Law of Large Numbers.
   (b) Markov's Strong Law of Large Numbers.

Theorem (Slutsky's theorem)
Let $X_N$ and $Y_N$ be two sequences of random variables where $X_N \xrightarrow{p} X$ and $Y_N \xrightarrow{p} c$, where $c \neq 0$, then:
$$X_N + Y_N \xrightarrow{p} X + c$$
$$X_N Y_N \xrightarrow{p} cX$$
$$\frac{X_N}{Y_N} \xrightarrow{p} \frac{X}{c}$$

Remark: This also holds for sequences of random matrices. The last statement reads: if $X_N \xrightarrow{p} X$ and $Y_N \xrightarrow{p} \Omega$ then
$$Y_N^{-1} X_N \xrightarrow{p} \Omega^{-1} X$$
provided that $\Omega^{-1}$ exists.

Example
Let us consider the multiple linear regression model
$$y_i = x_i^\top \beta + \mu_i$$
where $x_i = (x_{i1} \ldots x_{iK})^\top$ is a $K \times 1$ vector of random variables, $\beta = (\beta_1 \ldots \beta_K)^\top$ is a $K \times 1$ vector of parameters, and where the error term $\mu_i$ satisfies $E(\mu_i) = 0$ and $E(\mu_i \mid x_{ij}) = 0$, $\forall j = 1, \ldots, K$.
Question: show that the OLS estimator defined by
$$\widehat{\beta} = \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right)$$
is a consistent estimator of $\beta$.
Proof: let us rewrite the OLS estimator as:
$$\begin{aligned}
\widehat{\beta} &= \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right) \\
&= \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \left( x_i^\top \beta + \mu_i \right) \right) \\
&= \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i x_i^\top \right) \beta + \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right) \\
&= \beta + \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right)
\end{aligned}$$

Proof (cont'd): By multiplying and dividing by $N$, we get:
$$\widehat{\beta} = \beta + \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right)$$
1. By using the (weak) law of large numbers (Khinchine's theorem), we have:
$$\frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \xrightarrow{p} E\left( x_i x_i^\top \right) \qquad \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \xrightarrow{p} E\left( x_i \mu_i \right)$$
2. By using Slutsky's theorem:
$$\widehat{\beta} \xrightarrow{p} \beta + E^{-1}\left( x_i x_i^\top \right) E\left( x_i \mu_i \right)$$
where $E^{-1}\left( x_i x_i^\top \right)$ denotes the inverse of the matrix $E\left( x_i x_i^\top \right)$.

Reminder: If $X$ and $Y$ are two random variables, then
$$E(X \mid Y) = 0 \; \Longrightarrow \; E(XY) = 0$$
The reverse is not true. Indeed:
$$E(X \mid Y) = 0 \; \Longrightarrow \; \begin{cases} \operatorname{cov}(X, Y) = E(XY) - E(X) E(Y) = 0 \\ E(X) = 0 \end{cases} \; \Longrightarrow \; E(XY) = 0$$

Proof (cont'd): Since
$$\widehat{\beta} \xrightarrow{p} \beta + E^{-1}\left( x_i x_i^\top \right) E\left( x_i \mu_i \right)$$
$$E(\mu_i \mid x_{ij}) = 0 \;\; \forall j = 1, \ldots, K \; \Longrightarrow \; E(\mu_i x_i) = 0_{K \times 1}$$
we have
$$\widehat{\beta} \xrightarrow{p} \beta$$
The OLS estimator $\widehat{\beta}$ is (weakly) consistent.
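The consistency result for OLS can be illustrated numerically. The sketch below is not from the slides: it assumes a small design with $K = 2$, $x_i = (1, z_i)^\top$ with $z_i \sim \mathcal{N}(0, 1)$, a uniform error term drawn independently of $x_i$ (so that $E(\mu_i \mid x_i) = 0$), and true parameters $\beta = (1, 2)^\top$. The function name `ols_two_regressors` and all numerical values are hypothetical choices for the example.

```python
import random

TRUE_BETA = (1.0, 2.0)  # assumed true parameters, for illustration only

def ols_two_regressors(n, beta=TRUE_BETA, seed=1):
    """OLS for y_i = x_i' beta + u_i with x_i = (1, z_i), using the 2x2 normal equations."""
    rng = random.Random(seed)
    sxx = [[0.0, 0.0], [0.0, 0.0]]  # sum of x_i x_i'
    sxy = [0.0, 0.0]                # sum of x_i y_i
    for _ in range(n):
        x = (1.0, rng.gauss(0.0, 1.0))
        u = rng.uniform(-1.0, 1.0)  # non-normal error, independent of x
        y = beta[0] * x[0] + beta[1] * x[1] + u
        for a in range(2):
            sxy[a] += x[a] * y
            for b in range(2):
                sxx[a][b] += x[a] * x[b]
    # Closed-form inverse of the 2x2 matrix sum(x_i x_i') applied to sum(x_i y_i)
    det = sxx[0][0] * sxx[1][1] - sxx[0][1] * sxx[1][0]
    b1 = (sxx[1][1] * sxy[0] - sxx[0][1] * sxy[1]) / det
    b2 = (sxx[0][0] * sxy[1] - sxx[1][0] * sxy[0]) / det
    return b1, b2

errors = {}
for n in (100, 10000, 200000):
    estimate = ols_two_regressors(n)
    errors[n] = max(abs(e - t) for e, t in zip(estimate, TRUE_BETA))
    print(f"N = {n:6d}   beta_hat = ({estimate[0]:.4f}, {estimate[1]:.4f})")
```

The error term is deliberately non-normal: the consistency argument only uses the law of large numbers and $E(x_i \mu_i) = 0$, not a distributional assumption on $\mu_i$.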
We are mainly concerned with four modes of convergence:
1. Almost sure convergence
2. Convergence in probability
3. Convergence in quadratic mean
4. Convergence in distribution

4.3. Convergence in mean square

Definition (Convergence in mean square)
Let $\{X_i\}$ for $i = 1, \ldots, N$ be a sequence of real-valued random variables such that $E\left( |X_N|^2 \right) < \infty$. $X_N$ converges in mean square to a constant $c$ if:
$$\lim_{N \to \infty} E\left( |X_N - c|^2 \right) = 0$$
It is written
$$X_N \xrightarrow{m.s.} c$$

Remark: It is the least useful notion of convergence, except for proving convergence in probability.

Lemma (Chain of implication)
Convergence in mean square implies convergence in probability:
$$\xrightarrow{m.s.} \; \Longrightarrow \; \xrightarrow{p}$$
where the symbol "$\Longrightarrow$" means "implies". The converse is not true.
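As a numerical illustration of the definition above, the sketch below estimates $E\left( |\overline{X}_N - \mu|^2 \right)$ for the sample mean of i.i.d. draws and compares it with the theoretical value $\sigma^2 / N$. It is only a sketch under stated assumptions: the choice of an exponential distribution with $\mu = 1$ and $\sigma^2 = 1$, the number of replications and the function name `mean_square_error` are not taken from the slides.

```python
import random

def mean_square_error(n, reps=2000, seed=7):
    """Monte Carlo estimate of E|X_bar_N - mu|^2 for i.i.d. Exp(1) draws (mu = 1, sigma^2 = 1)."""
    rng = random.Random(seed)
    mu = 1.0
    total = 0.0
    for _ in range(reps):
        x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        total += (x_bar - mu) ** 2
    return total / reps

mse = {n: mean_square_error(n) for n in (10, 100, 1000)}
for n, value in mse.items():
    print(f"N = {n:5d}   estimated E|X_bar - mu|^2 = {value:.5f}   sigma^2 / N = {1.0 / n:.5f}")
```

The estimated mean square error tracks $\sigma^2 / N$ and tends to 0, so the sample mean converges in mean square to $\mu$, and therefore also in probability by the lemma.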
4.4. Convergence in distribution

Definition (Convergence in distribution)
Let $X_N$ be a sequence of random variables indexed by the sample size, with cdf $F_N(.)$. $X_N$ converges in distribution to a random variable $X$ with cdf $F(.)$ if
$$\lim_{N \to \infty} F_N(x) = F(x)$$
at every point $x$ at which $F(.)$ is continuous. It is written:
$$X_N \xrightarrow{d} X$$

Comment: In general, we have:
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{d} \underbrace{X}_{\text{random var.}} \qquad \underbrace{X_N}_{\text{random var.}} \xrightarrow{p} \underbrace{c}_{\text{constant}}$$
In the case where
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{p} \underbrace{X}_{\text{random var.}}$$
it means
$$\underbrace{X_N - X}_{\text{random var.}} \xrightarrow{p} \underbrace{0}_{\text{constant}}$$

Lemma (Chain of implication)
Convergence in probability implies convergence in distribution:
$$\xrightarrow{p} \; \Longrightarrow \; \xrightarrow{d}$$
where the symbol "$\Longrightarrow$" means "implies". The converse is not true.

Definition (Asymptotic distribution)
If $X_N$ converges in distribution to $X$, where $F_N(.)$ is the cdf of $X_N$, then $F(.)$ is the cdf of the limiting or asymptotic distribution of $X_N$.

Consequence: Generally, we denote:
$$\underbrace{X_N}_{\text{random var.}} \xrightarrow{d} \underbrace{\mathcal{L}}_{\text{asy. distribution}}$$
It means that $X_N$ converges in distribution to a random variable $X$ that has a distribution $\mathcal{L}$.

Example
$$X_N \xrightarrow{d} \mathcal{N}(0, 1)$$
means that $X_N$ converges in distribution to a random variable $X$ that is normally distributed, or that $X_N$ has an asymptotic standard normal distribution.

Definition (Asymptotic mean and variance)
The asymptotic mean and variance of a random variable $X_N$ are the mean and variance of the asymptotic or limiting distribution, assuming that the limiting distribution and its moments exist. These moments are denoted by
$$E_{asy}(X_N) \qquad V_{asy}(X_N)$$

Definition (Asymptotically normally distributed estimator)
A consistent estimator $\widehat{\theta}$ of $\theta$ is said to be asymptotically normally distributed (or asymptotically normal) if:
$$\sqrt{N} \left( \widehat{\theta} - \theta_0 \right) \xrightarrow{d} \mathcal{N}(0, \Sigma_0)$$
Equivalently, $\widehat{\theta}$ is asymptotically normal if:
$$\widehat{\theta} \overset{asy}{\approx} \mathcal{N}\left( \theta_0, N^{-1} \Sigma_0 \right)$$
The asymptotic variance of $\widehat{\theta}$ is then defined by:
$$V_{asy}\left( \widehat{\theta} \right) \equiv \operatorname{avar}\left( \widehat{\theta} \right) = \frac{1}{N} \Sigma_0$$

4.5. Asymptotic distributions

Let's go back to our estimation problem. We consider a (strongly) consistent estimator $\widehat{\theta}_N$ of the true parameter $\theta_0$:
$$\widehat{\theta}_N \xrightarrow{a.s.} \theta_0 \; \Longrightarrow \; \widehat{\theta}_N \xrightarrow{p} \theta_0$$
This estimator has a degenerate asymptotic distribution (point-mass distribution), since when $N \to \infty$,
$$\lim_{N \to \infty} f_{\widehat{\theta}_N}(x) = f(x)$$
where $f_{\widehat{\theta}_N}(.)$ is the pdf of $\widehat{\theta}_N$ and $f(x)$ is defined by:
$$f(x) = \begin{cases} 1 & \text{if } x = \theta_0 \\ 0 & \text{otherwise} \end{cases}$$
Conclusion: one needs more than consistency to do inference (tests about the true value of $\theta$, etc.).

Solution: we will transform the estimator $\widehat{\theta}_N$ to get a transformed variable that has a non-degenerate asymptotic distribution, in order to derive the asymptotic distribution. This is the general idea of the Central Limit Theorem for a particular estimator: the sample mean.

Theorem (Lindeberg–Levy Central Limit Theorem, univariate)
Let $X_1, \ldots, X_N$ denote a sequence of independent and identically distributed random variables with finite mean $E(X_i) = \mu$ and finite variance $V(X_i) = \sigma^2$. Then the sample mean $\overline{X}_N = N^{-1} \sum_{i=1}^{N} X_i$ satisfies
$$\sqrt{N} \left( \overline{X}_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$

Comment:
1. The result is quite remarkable as it holds regardless of the form of the parent distribution (the distribution of $X_i$).
2. The central limit theorem requires virtually no assumptions (other than independence and finite variances) to end up with normality: normality is inherited from the sums of "small" independent disturbances with finite variance.

Proof: Rao (1973).

Illustration:
1. Let us consider a random variable $X_i \sim \chi^2(2)$, such that $E(X_i) = 2$ and $V(X_i) = 4$, and draw an i.i.d. sample $\{x_i\}_{i=1}^{N}$.
2. Compute the sample mean $\overline{x}_N = N^{-1} \sum_{i=1}^{N} x_i$ and the transformed variable $\sqrt{N} \left( \overline{x}_N - 2 \right) / 2$.
3. Repeat this procedure 5,000 times. We get 5,000 realisations of this transformed variable.
4. Build a histogram (and a nonparametric kernel estimate of $f_{\overline{X}_N}(.)$) of these 5,000 realisations and compare it to the normal pdf.

[Figure: histograms of the 5,000 realisations of the transformed variable for $N = 10$, $N = 100$, $N = 1{,}000$ and $N = 10{,}000$, each overlaid with the standard normal pdf and a kernel density estimate.]

Definition
The convergence result (CLT)
$$\sqrt{N} \left( \overline{X}_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
can be understood as:
$$\overline{X}_N \overset{asy}{\approx} \mathcal{N}\left( \mu, \frac{\sigma^2}{N} \right)$$
where the symbol $\overset{asy}{\approx}$ means "asymptotically distributed as". The asymptotic mean and variance of the sample mean are then defined by:
$$E_{asy}\left( \overline{X}_N \right) = \mu \qquad V_{asy}\left( \overline{X}_N \right) = \frac{\sigma^2}{N}$$

Speed of convergence: why study $\sqrt{N}\, \overline{X}_N$ in the CLT?
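The question just raised can be explored numerically before turning to the formal argument. The sketch below reuses the $\chi^2(2)$ setting of the illustration (simulated here as an exponential distribution with mean 2, which is the same distribution) and estimates the variance of $N^{\alpha} \left( \overline{x}_N - \mu \right)$ for three values of $\alpha$. The number of replications, the seed and the helper names `xbar_draws` and `variance` are assumptions made for this example.

```python
import random

MU = 2.0  # mean of a chi-square(2) variable; its variance is 4

def xbar_draws(n, reps=1000, seed=3):
    """Return `reps` realisations of the sample mean of n chi-square(2) draws."""
    rng = random.Random(seed)
    # A chi-square(2) variable is exponential with mean 2, i.e. rate 0.5
    return [sum(rng.expovariate(0.5) for _ in range(n)) / n for _ in range(reps)]

def variance(values):
    centre = sum(values) / len(values)
    return sum((v - centre) ** 2 for v in values) / (len(values) - 1)

table = {}
for n in (10, 100, 1000):
    draws = xbar_draws(n)
    for alpha in (0.25, 0.5, 0.75):
        # Variance of the scaled and centred sample mean N^alpha (x_bar - mu)
        table[(alpha, n)] = variance([n ** alpha * (x - MU) for x in draws])

for alpha in (0.25, 0.5, 0.75):
    row = "   ".join(f"N={n}: {table[(alpha, n)]:8.3f}" for n in (10, 100, 1000))
    print(f"alpha = {alpha:4.2f}   {row}")
```

With $\alpha = 1/2$ the variance stays close to $\sigma^2 = 4$ for every $N$, while it shrinks towards 0 for $\alpha < 1/2$ and grows without bound for $\alpha > 1/2$.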
1. For simplicity, let us assume that $\mu = E(X_i) = 0$ and let us study the asymptotic behavior of $N^{\alpha} \overline{X}_N$:
$$V\left( N^{\alpha} \overline{X}_N \right) = N^{2\alpha} V\left( \overline{X}_N \right) = N^{2\alpha} \frac{\sigma^2}{N} = N^{2\alpha - 1} \sigma^2$$
2. If we assume that $\alpha > 1/2$, then $2\alpha - 1 > 0$ and the asymptotic variance of $N^{\alpha} \overline{X}_N$ is infinite:
$$\lim_{N \to \infty} V\left( N^{\alpha} \overline{X}_N \right) = +\infty$$
3. If we assume that $\alpha < 1/2$, then $2\alpha - 1 < 0$ and $N^{\alpha} \overline{X}_N$ has a degenerate distribution:
$$\lim_{N \to \infty} V\left( N^{\alpha} \overline{X}_N \right) = 0$$
4. As a consequence, $\alpha = 1/2$ is the only choice to get a finite and positive variance:
$$V\left( \sqrt{N}\, \overline{X}_N \right) = \sigma^2$$

Summary: Let $X_1, \ldots, X_N$ denote a sequence of independent and identically distributed random variables with finite mean $E(X_i) = \mu$ and finite variance $V(X_i) = \sigma^2$. Then, the sample mean
$$\overline{X}_N = \frac{1}{N} \sum_{i=1}^{N} X_i$$
satisfies
$$\text{WLLN:} \quad \overline{X}_N \xrightarrow{p} \mu$$
$$\text{CLT:} \quad \sqrt{N} \left( \overline{X}_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
The central limit theorem does not assert that the sample mean tends to normality. It is the transformation $\sqrt{N} \left( \overline{X}_N - \mu \right)$ of the sample mean that has this property.

Theorem (Lindeberg–Levy Central Limit Theorem, multivariate)
Let $x_1, \ldots, x_N$ denote a sequence of independent and identically distributed random $K \times 1$ vectors with finite mean $E(x_i) = \mu$ and finite $K \times K$ variance-covariance matrix $V(x_i) = \Sigma$. Then the sample mean $\overline{x}_N = N^{-1} \sum_{i=1}^{N} x_i$ satisfies
$$\underbrace{\sqrt{N} \left( \overline{x}_N - \mu \right)}_{K \times 1} \xrightarrow{d} \mathcal{N}\Big( \underbrace{0}_{K \times 1}, \underbrace{\Sigma}_{K \times K} \Big)$$

Remark: there exist other versions of the CLT, especially for independent but not identically (heterogeneously) distributed variables:
1. Lindeberg–Feller Central Limit Theorem for unequal variances.
2. Liapounov Central Limit Theorem for unequal means and variances.
For more details, see: Greene W. (2007), Econometric Analysis, sixth edition, Pearson Prentice Hall.

Question: from the CLT (univariate or multivariate) and the asymptotic distribution of $\overline{X}_N$, how can we derive the asymptotic distribution of an estimator $\widehat{\theta}$ that depends on the sample mean?
$$\widehat{\theta} = g\left( \overline{X}_N \right) \overset{asy}{\approx} \; ???$$

Theorem (Continuous mapping theorem)
Let $\{X_i\}$ for $i = 1, \ldots, N$ be a sequence of real-valued random variables and $g(.)$ a continuous function:
$$\text{if } X_N \xrightarrow{a.s.} X \text{ then } g(X_N) \xrightarrow{a.s.} g(X)$$
$$\text{if } X_N \xrightarrow{p} X \text{ then } g(X_N) \xrightarrow{p} g(X)$$
$$\text{if } X_N \xrightarrow{d} X \text{ then } g(X_N) \xrightarrow{d} g(X)$$

Example (multiple linear regression model)
Let us consider the multiple linear regression model
$$y_i = x_i^\top \beta + \mu_i$$
where $x_i = (x_{i1} \ldots x_{iK})^\top$ is a $K \times 1$ vector of random variables, $\beta = (\beta_1 \ldots \beta_K)^\top$ is a $K \times 1$ vector of parameters, and where the error term $\mu_i$ satisfies $E(\mu_i) = 0$, $V(\mu_i) = \sigma^2$ and $E(\mu_i \mid x_{ij}) = 0$, $\forall j = 1, \ldots, K$.
Question: show that the OLS estimator satisfies
$$\sqrt{N} \left( \widehat{\beta} - \beta_0 \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 E^{-1}\left( x_i x_i^\top \right) \right)$$
where $\beta_0$ denotes the true value of $\beta$ and $E^{-1}\left( x_i x_i^\top \right)$ the inverse of the matrix $E\left( x_i x_i^\top \right)$.

Proof:
1. Rewrite the OLS estimator as:
$$\widehat{\beta} = \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i y_i \right) = \beta_0 + \left( \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \left( \sum_{i=1}^{N} x_i \mu_i \right)$$
2. Normalize the vector $\widehat{\beta} - \beta_0$:
$$\sqrt{N} \left( \widehat{\beta} - \beta_0 \right) = \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right)$$

Reminder: if $x$ is a vector of random variables and $Y$ is a scalar random variable such that $E(xY) = 0$, then
$$V(xY) = E\left( x\, E\left( Y^2 \mid x \right) x^\top \right)$$
which equals $E\left( x\, V(Y \mid x)\, x^\top \right)$ when $E(Y \mid x) = 0$.

Proof (cont'd):
3. Using the WLLN and the CMT:
$$\left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \xrightarrow{p} E^{-1}\left( x_i x_i^\top \right)$$
4. Using the CLT:
$$\sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i - E(x_i \mu_i) \right) \xrightarrow{d} \mathcal{N}\left( 0, V(x_i \mu_i) \right)$$
with $E(\mu_i \mid x_{ik}) = 0$, $\forall k = 1, \ldots, K$ $\Longrightarrow$ $E(x_i \mu_i) = 0$ and, taking the conditional variance $V(\mu_i \mid x_i) = \sigma^2$ to be constant,
$$V(x_i \mu_i) = E\left( x_i \mu_i \mu_i x_i^\top \right) = E\left( E\left( x_i \mu_i \mu_i x_i^\top \mid x_i \right) \right) = E\left( x_i V(\mu_i \mid x_i) x_i^\top \right) = \sigma^2 E\left( x_i x_i^\top \right)$$

Proof (cont'd): we have
$$\left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \xrightarrow{p} E^{-1}\left( x_i x_i^\top \right)$$
$$\sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 E\left( x_i x_i^\top \right) \right)$$

Theorem (Slutsky's theorem for convergence in distribution)
Let $X_N$ and $Y_N$ be two sequences of random variables where $X_N \xrightarrow{d} X$ and $Y_N \xrightarrow{p} c$, where $c \neq 0$, then:
$$X_N + Y_N \xrightarrow{d} X + c$$
$$X_N Y_N \xrightarrow{d} cX$$
$$\frac{X_N}{Y_N} \xrightarrow{d} \frac{X}{c}$$
If $Y_N$ and $X_N$ are matrices/vectors, then
$$Y_N^{-1} X_N \xrightarrow{d} c^{-1} X \quad \text{with} \quad V\left( c^{-1} X \right) = c^{-1} V(X) \left( c^{-1} \right)^\top$$

Proof (cont'd): By using Slutsky's theorem (for convergence in distribution), we have:
$$\sqrt{N} \left( \widehat{\beta} - \beta_0 \right) = \left( \frac{1}{N} \sum_{i=1}^{N} x_i x_i^\top \right)^{-1} \sqrt{N} \left( \frac{1}{N} \sum_{i=1}^{N} x_i \mu_i \right) \xrightarrow{d} \mathcal{N}(\Pi, \Omega)$$
with
$$\Pi = E^{-1}\left( x_i x_i^\top \right) \times 0 = 0$$
$$\Omega = E^{-1}\left( x_i x_i^\top \right) \sigma^2 E\left( x_i x_i^\top \right) E^{-1}\left( x_i x_i^\top \right) = \sigma^2 E^{-1}\left( x_i x_i^\top \right)$$
Finally, we have:
$$\sqrt{N} \left( \widehat{\beta} - \beta_0 \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 E^{-1}\left( x_i x_i^\top \right) \right)$$

Definition (univariate Delta method)
Let $Z_N$ be a sequence of random variables indexed by the sample size $N$ such that
$$\sqrt{N} \left( Z_N - \mu \right) \xrightarrow{d} \mathcal{N}\left( 0, \sigma^2 \right)$$
If $g(.)$ is a continuous and continuously differentiable function with $g'(\mu) \neq 0$ and not involving $N$, then
$$\sqrt{N} \left( g(Z_N) - g(\mu) \right) \xrightarrow{d} \mathcal{N}\left( 0, \left( \left. \frac{\partial g(x)}{\partial x} \right|_{\mu} \right)^2 \sigma^2 \right)$$

Definition (multivariate Delta method)
Let $Z_N$ be a sequence of random vectors indexed by the sample size such that
$$\sqrt{N} \left( Z_N - \mu \right) \xrightarrow{d} \mathcal{N}(0, \Sigma)$$
If $g(.)$ is a continuous and continuously differentiable multivariate function, not involving $N$, whose matrix of first derivatives at $\mu$ is not equal to zero, then
$$\sqrt{N} \left( g(Z_N) - g(\mu) \right) \xrightarrow{d} \mathcal{N}\left( 0, \left. \frac{\partial g(x)}{\partial x^\top} \right|_{\mu} \Sigma \left( \left. \frac{\partial g(x)}{\partial x^\top} \right|_{\mu} \right)^\top \right)$$

Example (Gamma distribution)
Let $X_1, \ldots, X_N$ denote a sequence of independent and identically distributed random variables. We assume that $X_i \sim \Gamma(\alpha, \beta)$ (gamma distribution) with $E(X) = \alpha\beta$ and $V(X) = \alpha\beta^2$, $\alpha > 0$, $\beta > 0$, and a pdf (not needed for this exercise, but given for reference) defined by:
$$f_X(x; \alpha, \beta) = \frac{x^{\alpha - 1} \exp\left( -\frac{x}{\beta} \right)}{\Gamma(\alpha) \beta^{\alpha}}$$
for all $x \in [0, +\infty[$, where $\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} \exp(-t)\, dt$ denotes the Gamma function.
We assume that $\alpha$ is known.
Question: What is the asymptotic distribution of the estimator $\widehat{\beta}$ defined by:
$$\widehat{\beta} = \frac{1}{\alpha N} \sum_{i=1}^{N} X_i$$

Solution: The estimator $\widehat{\beta}$ is defined by:
$$\widehat{\beta} = \frac{1}{\alpha N} \sum_{i=1}^{N} X_i$$
Since $X_1, \ldots, X_N$ are i.i.d. with $E(X) = \alpha\beta$ and $V(X) = \alpha\beta^2$, we can apply the Lindeberg–Levy CLT, and we get immediately:
$$\sqrt{N} \left( \overline{X}_N - \alpha\beta \right) \xrightarrow{d} \mathcal{N}\left( 0, \alpha\beta^2 \right)$$

Solution (cont'd): If we define $g(x) = x / \alpha$, with $g\left( E\left( \overline{X}_N \right) \right) = g(\alpha\beta) = \beta \neq 0$ and $\partial g(z) / \partial z = 1/\alpha \neq 0$, then
$$\widehat{\beta} = \frac{1}{\alpha} \overline{X}_N = g\left( \overline{X}_N \right)$$
Starting from
$$\sqrt{N} \left( \overline{X}_N - \alpha\beta \right) \xrightarrow{d} \mathcal{N}\left( 0, \alpha\beta^2 \right)$$
and using the delta method, we have:
$$\sqrt{N} \left( g\left( \overline{X}_N \right) - g(\alpha\beta) \right) \xrightarrow{d} \mathcal{N}\left( 0, \left( \left. \frac{\partial g(z)}{\partial z} \right|_{\alpha\beta} \right)^2 \alpha\beta^2 \right)$$
Since $\partial g(z) / \partial z = \partial (z/\alpha) / \partial z = 1/\alpha$, we have:
$$\sqrt{N} \left( \widehat{\beta} - \beta \right) \xrightarrow{d} \mathcal{N}\left( 0, \frac{\beta^2}{\alpha} \right)$$

Key Concepts Section 4
1. Almost sure convergence
2. Convergence in probability
3. Law of large numbers: Khinchine's and Kolmogorov's theorems
4. Weakly and strongly consistent estimator
5. Slutsky's theorem
6. Convergence in mean square
7. Convergence in distribution
8. Asymptotic distribution and asymptotic variance
9. Lindeberg–Levy Central Limit Theorem (univariate and multivariate)
10. Continuous mapping theorem
11. Delta method

End of Chapter 1
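As a numerical companion to the Gamma example of Section 4.5, the sketch below simulates $\sqrt{N} \left( \widehat{\beta} - \beta \right)$ and compares its empirical variance with the delta-method value $\beta^2 / \alpha$. It is an illustration only: the values $\alpha = 2$, $\beta = 3$, $N = 400$, the number of replications and the function name `scaled_errors` are assumptions. Python's `random.gammavariate(alpha, beta)` uses the same shape and scale parameterisation as the example, with mean $\alpha\beta$.

```python
import random

ALPHA, BETA = 2.0, 3.0  # assumed shape and scale, for illustration only

def scaled_errors(n, reps=1000, seed=11):
    """Return realisations of sqrt(N) * (beta_hat - beta), with beta_hat = x_bar / alpha."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        x_bar = sum(rng.gammavariate(ALPHA, BETA) for _ in range(n)) / n
        beta_hat = x_bar / ALPHA
        out.append(n ** 0.5 * (beta_hat - BETA))
    return out

draws = scaled_errors(400)
emp_mean = sum(draws) / len(draws)
emp_var = sum((d - emp_mean) ** 2 for d in draws) / (len(draws) - 1)
asy_var = BETA ** 2 / ALPHA  # delta-method asymptotic variance beta^2 / alpha

print(f"empirical mean     = {emp_mean:.3f}  (asymptotic mean 0)")
print(f"empirical variance = {emp_var:.3f}  (asymptotic variance {asy_var:.3f})")
```

Because $g(x) = x / \alpha$ is linear, the delta method is exact here up to the CLT approximation, so the empirical variance should already be close to $\beta^2 / \alpha$ for moderate $N$.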