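Multiple Regression Analysis: Asymptotics

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$$

Copyright © 2007 Thomson Asia Pte. Ltd. All rights reserved.
Introduction to Econometrics (计量经济学导论), Liu Yuan (刘愿)

What asymptotics means
If the errors are not normally distributed, then for any sample size the t and F statistics do not follow the t and F distributions exactly.
Fortunately, even without the normality assumption, the t and F statistics are still approximately t and F distributed, at least in large samples.

Consistency
Under the Gauss-Markov assumptions, the OLS estimator is the best linear unbiased estimator, but we cannot always obtain an unbiased estimator.
Consistency is the minimal requirement for an estimator. When unbiasedness fails, we can collect as large a sample as possible: as $n \to \infty$, the distribution of the estimator collapses toward the true parameter value.

Formal definition of consistency
Let $W_n$ be an estimator of the parameter $\theta$ based on a sample $Y_1, Y_2, \dots, Y_n$. Then $W_n$ is a consistent estimator of $\theta$ if, for every $\varepsilon > 0$,
$$P(|W_n - \theta| > \varepsilon) \to 0 \quad \text{as } n \to \infty.$$
Otherwise, $W_n$ is not a consistent estimator of $\theta$.
When $W_n$ is consistent, we say that $\theta$ is the probability limit of $W_n$, written $\operatorname{plim}(W_n) = \theta$.

An intuitive view of consistency
If an estimator is consistent, then as the sample size grows the distribution of $\hat\beta_j$ becomes more and more tightly concentrated around $\beta_j$. As $n$ tends to infinity, the distribution of $\hat\beta_j$ collapses to the single point $\beta_j$. This means that if we can collect as much data as we want, we can make the estimator arbitrarily close to $\beta_j$.

Sampling distributions as the sample size increases
[Figure]

Theorem 5.1: Consistency of OLS
Under assumptions MLR.1 through MLR.4, the OLS estimator $\hat\beta_j$ is a consistent estimator of $\beta_j$ for all $j = 0, 1, \dots, k$.

Consistency of OLS
Under the Gauss-Markov assumptions, the OLS estimator is both consistent and unbiased.
Consistency can be proved in much the same way as unbiasedness; to do so, we need probability limits.

Proving consistency in the simple regression case
$$\begin{aligned}
\hat\beta_1 &= \frac{\sum_i (x_{i1} - \bar x_1)\, y_i}{\sum_i (x_{i1} - \bar x_1)^2}
 = \frac{\sum_i (x_{i1} - \bar x_1)(\beta_0 + \beta_1 x_{i1} + u_i)}{\sum_i (x_{i1} - \bar x_1)^2} \\
&= \frac{\beta_0 \sum_i (x_{i1} - \bar x_1) + \beta_1 \sum_i (x_{i1} - \bar x_1)\, x_{i1} + \sum_i (x_{i1} - \bar x_1)\, u_i}{\sum_i (x_{i1} - \bar x_1)^2} \\
&= \beta_1 + \frac{n^{-1} \sum_i (x_{i1} - \bar x_1)\, u_i}{n^{-1} \sum_i (x_{i1} - \bar x_1)^2}
\end{aligned}$$
The last step uses $\sum_i (x_{i1} - \bar x_1) = 0$ and $\sum_i (x_{i1} - \bar x_1)\, x_{i1} = \sum_i (x_{i1} - \bar x_1)^2$. Taking probability limits,
$$\operatorname{plim} \hat\beta_1 = \beta_1 + \frac{\operatorname{Cov}(x_1, u)}{\operatorname{Var}(x_1)} = \beta_1,$$
because $y_i = \beta_0 + \beta_1 x_{i1} + u_i$ and $\operatorname{Cov}(x_1, u) = 0$.

A weaker assumption
For unbiasedness, we need the zero conditional mean assumption: $E(u \mid x_1, x_2, \dots, x_k) = 0$.
For consistency, we need only the weaker assumption of zero mean and zero correlation: $E(u) = 0$ and $\operatorname{Cov}(x_j, u) = 0$ for $j = 1, 2, \dots, k$.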
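The consistency of OLS under zero mean and zero correlation can be illustrated with a small Monte Carlo sketch. Everything specific here is an assumption chosen for illustration and is not part of the lecture: the true values `beta0 = 1` and `beta1 = 2`, the sample sizes, the number of replications, and the skewed (centered exponential) error. The sketch only shows that the OLS slope estimates cluster more tightly around the true slope as n grows, even though the error is not normal.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    x_bar = statistics.fmean(x)
    sxy = sum((xi - x_bar) * yi for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def simulate_slopes(n, reps, beta0=1.0, beta1=2.0, seed=0):
    """Draw `reps` samples of size n and return the OLS slope from each."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Skewed, non-normal error with mean zero, drawn independently of x,
        # so E(u) = 0 and Cov(x, u) = 0 both hold (illustrative assumption).
        u = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]
        slopes.append(ols_slope(x, y))
    return slopes


# Mean and standard deviation of the slope estimates for each sample size.
results = {}
for n in (25, 250, 2500):
    slopes = simulate_slopes(n, reps=200)
    results[n] = (statistics.fmean(slopes), statistics.stdev(slopes))
    print(f"n={n:5d}  mean={results[n][0]:.4f}  sd={results[n][1]:.4f}")
```

The standard deviation of the estimates should fall as n rises while the mean stays near the assumed true slope, which is the "collapsing distribution" picture of consistency.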
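If the zero-mean and zero-correlation conditions fail, in particular if $u$ is correlated with any of the $x_j$, OLS is biased and inconsistent.

Deriving the Inconsistency
Just as we could derive the omitted variable bias earlier, we now want to think about the inconsistency, or asymptotic bias, in this case.
True model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + v$
You think: $y = \beta_0 + \beta_1 x_1 + u$, so that $u = \beta_2 x_2 + v$ and
$$\operatorname{plim} \tilde\beta_1 = \beta_1 + \beta_2 \delta, \quad \text{where } \delta = \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)}.$$

Asymptotic Bias (cont)
So, thinking about the direction of the asymptotic bias is just like thinking about the direction of bias for an omitted variable.
The main difference is that asymptotic bias uses the population variance and covariance, while bias uses the sample counterparts.
Remember, inconsistency is a large sample problem: it does not go away as we add data.

Summary of Direction of Asymptotic Bias

                    Corr(x1, x2) > 0 ($\delta > 0$)    Corr(x1, x2) < 0 ($\delta < 0$)
$\beta_2 > 0$       Positive asymptotic bias           Negative asymptotic bias
$\beta_2 < 0$       Negative asymptotic bias           Positive asymptotic bias

Large Sample Inference
Recall that under the CLM assumptions, the sampling distributions are normal, so we could derive t and F distributions for testing.
This exact normality was due to assuming the population error distribution was normal.
This assumption of normal errors implied that the distribution of y, given the x's, was normal as well.

Large Sample Inference (cont)
It is easy to come up with examples for which this exact normality assumption will fail.
Any clearly skewed variable, like wages, arrests, savings, etc., cannot be normal, since a normal distribution is symmetric.
The normality assumption is not needed to conclude that OLS is BLUE, only for inference.

Central Limit Theorem
Based on the central limit theorem, we can show that OLS estimators are asymptotically normal.
Asymptotic normality implies that $P(Z < z) \to \Phi(z)$ as $n \to \infty$, or $P(Z < z) \approx \Phi(z)$, where $\Phi$ is the standard normal cumulative distribution function.
The central limit theorem states that the standardized average of a random sample from any population with mean $\mu_Y$ and finite variance $\sigma_Y^2$ is asymptotically $N(0, 1)$, or
$$Z = \frac{\bar Y - \mu_Y}{\sigma_Y / \sqrt{n}} \overset{a}{\sim} N(0, 1).$$

Theorem 5.2: Asymptotic Normality
Under the Gauss-Markov assumptions,
(i) $\sqrt{n}\,(\hat\beta_j - \beta_j) \overset{a}{\sim} \text{Normal}(0, \sigma^2 / a_j^2)$, where $a_j^2 = \operatorname{plim}\left(n^{-1} \sum_{i=1}^{n} \hat r_{ij}^2\right)$ and the $\hat r_{ij}$ are the residuals from regressing $x_j$ on the other independent variables;
(ii) $\hat\sigma^2$ is a consistent estimator of $\sigma^2$;
(iii) $(\hat\beta_j - \beta_j) / se(\hat\beta_j) \overset{a}{\sim} \text{Normal}(0, 1)$.

Where the asymptotic variance comes from (heuristic sketch):
$$\hat\beta_j \sim N\left(\beta_j, \operatorname{var}(\hat\beta_j)\right), \qquad \operatorname{var}(\hat\beta_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)} = \frac{\sigma^2}{\sum_{i=1}^{n} \hat r_{ij}^2}$$
$$\hat\beta_j - \beta_j \sim N\left(0, \frac{\sigma^2}{\sum_{i=1}^{n} \hat r_{ij}^2}\right)$$
$$\sqrt{n}\,(\hat\beta_j - \beta_j) \sim N\left(0, \frac{\sigma^2}{n^{-1} \sum_{i=1}^{n} \hat r_{ij}^2}\right) \overset{a}{\sim} N\left(0, \frac{\sigma^2}{a_j^2}\right), \qquad a_j^2 = \operatorname{plim}\, n^{-1} \sum_{i=1}^{n} \hat r_{ij}^2$$

Law of large numbers
For a random sample from a population with finite mean $\mu_Y$, the sample average converges in probability to the population mean: $\operatorname{plim} \bar Y_n = \mu_Y$. This is what allows sample averages such as $n^{-1} \sum_i \hat r_{ij}^2$ to be replaced by their probability limits.

Asymptotic Normality (cont)
Because the t distribution approaches the normal distribution for large df, we can also say that
$$\frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \overset{a}{\sim} t_{n-k-1}.$$
Note that while we no longer need to assume normality with a large sample, we do still need homoskedasticity.

Asymptotic Standard Errors
If u is not normally distributed, we will sometimes refer to the standard error as an asymptotic standard error, since
$$se(\hat\beta_j) = \sqrt{\frac{\hat\sigma^2}{SST_j (1 - R_j^2)}} = \frac{\hat\sigma}{\sqrt{SSR_j}}, \qquad \hat\sigma^2 = \frac{1}{n - k - 1} \sum_{i=1}^{n} \hat u_i^2 = \frac{SSR}{n - k - 1},$$
where $SSR_j = SST_j (1 - R_j^2)$ and the divisor $n - k - 1$ reduces to $n - 2$ in simple regression. Because $SST_j \approx n \sigma_j^2$,
$$se(\hat\beta_j) \approx \frac{c_j}{\sqrt{n}}.$$
So we can expect standard errors to shrink at a rate proportional to the inverse of $\sqrt{n}$.

Lagrange Multiplier statistic
Once we are using large samples and relying on asymptotic normality for inference, we can use more than t and F statistics.
The Lagrange multiplier or LM statistic is an alternative for testing multiple exclusion restrictions.
Because the LM statistic uses an auxiliary regression, it is sometimes called an $nR^2$ statistic.

LM Statistic (cont)
Suppose we have a standard model, $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$, and our null hypothesis is
$$H_0: \beta_{k-q+1} = 0, \dots, \beta_k = 0.$$
First, we just run the restricted model
$$y = \tilde\beta_0 + \tilde\beta_1 x_1 + \dots + \tilde\beta_{k-q} x_{k-q} + \tilde u.$$
Now take the residuals, $\tilde u$, and regress $\tilde u$ on $x_1, x_2, \dots, x_k$ (i.e. all the variables).
$LM = n R_u^2$, where $R_u^2$ is the R-squared from this regression.

The idea of the LM statistic
If the omitted variables $x_{k-q+1}$ through $x_k$ truly have zero population coefficients then, at least approximately, $\tilde u$ should be uncorrelated with each of these variables in the sample.
Running a regression of these residuals on the independent variables excluded under $H_0$, we should get a small $R^2$.
However, we must include all of the independent variables in the regression for technical reasons.

LM Statistic (cont)
$LM \overset{a}{\sim} \chi_q^2$, so we can choose a critical value, $c$, from a $\chi_q^2$ distribution, or just calculate a p-value for $\chi_q^2$.
If $LM > c$, reject $H_0$.
With a large sample, the results from an F test and from an LM test should be similar.

Example: Economic Model of Crime
narr86: number of times a person was arrested in 1986
pcnv: proportion of prior arrests that led to a conviction
avgsen: average sentence length served for past convictions
tottime: total time the person spent in prison before 1986 since reaching age 18
ptime86: months spent in prison in 1986
qemp86: number of quarters in 1986 during which the person was legally employed

Restricted model (under $H_0$ the coefficients on avgsen and tottime are zero, so $q = 2$):
$$narr86 = \beta_0 + \beta_1\, pcnv + \beta_4\, ptime86 + \beta_5\, qemp86 + u$$
Auxiliary regression of the restricted residuals on all the variables:
$$\tilde u = \gamma_0 + \gamma_1\, pcnv + \gamma_2\, avgsen + \gamma_3\, tottime + \gamma_4\, ptime86 + \gamma_5\, qemp86 + v$$
$$R_u^2 = 0.0015, \qquad LM = 2725 \times 0.0015 \approx 4.09 < 4.61,$$
where 4.61 is the 10% critical value of $\chi_2^2$. The p-value is $P(\chi_2^2 > 4.09) \approx 0.129$, so we fail to reject $H_0$ at the 10% level.

Asymptotic Efficiency
Estimators besides OLS will be consistent.
However, under the Gauss-Markov assumptions, the OLS estimators will have the smallest asymptotic variances.
We say that OLS is asymptotically efficient.
It is important to remember our assumptions, though: if the errors are not homoskedastic, this is not true.

The discussion in the simple regression case
Let $g(x)$ be any function of $x$ and define $z_i = g(x_i)$. Under the zero conditional mean assumption, $u$ is uncorrelated with $z_i$.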
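The arithmetic of the LM test in the crime example above can be checked with a short sketch. It takes n = 2725 and the auxiliary R-squared of 0.0015 from the slides; the function name `lm_test_two_restrictions` is hypothetical. It relies on one fact that holds only for q = 2: the chi-square survival function with 2 degrees of freedom has the closed form exp(-x/2), so no statistics library is needed. For other values of q a chi-square routine from a statistics package would be required.

```python
import math


def lm_test_two_restrictions(n, r2_aux, alpha=0.10):
    """LM test for q = 2 exclusion restrictions.

    n      : sample size used in the auxiliary regression
    r2_aux : R-squared from regressing the restricted residuals
             on all independent variables
    alpha  : significance level

    Valid only for q = 2, where P(chi2_2 > x) = exp(-x / 2).
    """
    lm = n * r2_aux                     # LM = n * R_u^2
    p_value = math.exp(-lm / 2.0)       # upper tail of chi-square, 2 df
    critical = -2.0 * math.log(alpha)   # solves exp(-c / 2) = alpha
    return lm, critical, p_value, lm > critical


lm, critical, p_value, reject = lm_test_two_restrictions(n=2725, r2_aux=0.0015)
print(f"LM = {lm:.2f}, 10% critical value = {critical:.2f}, "
      f"p-value = {p_value:.3f}, reject H0: {reject}")
```

The LM statistic falls below the 10% critical value, matching the conclusion on the slide that avgsen and tottime are not jointly significant at that level.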