
Analysis of Cross Section and Panel Data

Yan Zhang

School of Economics, Fudan University

CCER, Fudan University

Introductory Econometrics: A Modern Approach


Part 1. Regression Analysis on Cross Sectional Data

Chap 2. The Simple Regression Model

——

Practice for Learning Multiple Regression

 Bivariate linear regression model: $y = \beta_0 + \beta_1 x + u$

 $\beta_1$: the slope parameter in the relationship between y and x holding the other factors in u fixed; it is of primary interest in applied economics.

 $\beta_0$: the intercept parameter, which also has its uses, although it is rarely central to an analysis.

More Discussion

 Linearity of $\beta_1$: a one-unit change in x has the same effect on y, regardless of the initial value of x.

 Increasing returns: the wage-education relationship suggests this may be unrealistic (a functional form issue).

 Can we draw ceteris paribus conclusions about how x affects y from a random sample of data, when we are ignoring all the other factors?

 Only if we make an assumption restricting how the unobservable random variable u is related to the explanatory variable x.

Classical Regression Assumptions

 $E(u) = 0$: a feasible assumption as long as an intercept term is included.

 Zero conditional mean: $E(u \mid x) = E(u) = 0$, which is stronger than u and x being merely linearly uncorrelated.

 Meaning: when this assumption fails, x is endogenous (endogeneity).

 PRF (Population Regression Function): $E(y \mid x) = \beta_0 + \beta_1 x$, something fixed but unknown.

 OLS: choose $\hat{\beta}_0, \hat{\beta}_1$ to minimize the sum of squared residuals $\sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$.

 SRF (Sample Regression Function): $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, the OLS regression line.

 The point $(\bar{x}, \bar{y})$ is always on the OLS regression line.

 Fitted values and residuals: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, $\hat{u}_i = y_i - \hat{y}_i$.
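A minimal numerical sketch of these formulas, assuming hypothetical data (not from the original slides); it computes the closed-form OLS estimates, fitted values, and residuals, and checks that $(\bar{x}, \bar{y})$ lies on the fitted line:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100
x = rng.normal(5, 2, n)        # hypothetical regressor
u = rng.normal(0, 1, n)        # unobserved error
y = 1.0 + 0.5 * x + u          # true PRF: beta0 = 1, beta1 = 0.5

# Closed-form OLS estimates for the bivariate model
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x            # fitted values (SRF)
u_hat = y - y_hat              # residuals

print(b0, b1)
print(np.isclose(u_hat.mean(), 0))               # residuals average to zero
print(np.isclose(b0 + b1 * x.mean(), y.mean()))  # (x-bar, y-bar) on the line
```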

 Coefficient of determination: $R^2 = SSE/SST = 1 - SSR/SST$

 the fraction of the sample variation in y that is explained by x.

 also the square of the sample correlation coefficient between $y_i$ and $\hat{y}_i$.

 Low R-squareds are common, especially in cross-sectional analysis; a low $R^2$ does not by itself make a regression useless.
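A quick sketch, on hypothetical data, that the two definitions of $R^2$ agree:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 1.5 * x + rng.normal(size=200)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

r2_sst = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2
print(np.isclose(r2_sst, r2_corr))  # True: 1 - SSR/SST equals squared corr(y, y_hat)
```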

Units of Measurement

 If the dependent variable is multiplied by the constant c

— which means each value in the sample is multiplied by c

— then the OLS intercept and slope estimates are also multiplied by c.

 If one of the independent variables is divided or multiplied by some nonzero constant c, then its OLS slope coefficient is respectively multiplied or divided by c.

 The goodness-of-fit of the model, R-squared, should not depend on the units of measurement of our variables.
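A sketch verifying these scaling rules on hypothetical data (the scaling constant c = 100 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10, 3, 500)
y = 3.0 + 0.8 * x + rng.normal(0, 1, 500)

def ols(x, y):
    """Closed-form bivariate OLS: returns (intercept, slope)."""
    b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y.mean() - b1 * x.mean(), b1

b0, b1 = ols(x, y)
c = 100.0

# Dependent variable multiplied by c: both estimates scale by c
b0c, b1c = ols(x, c * y)
print(np.isclose(b0c, c * b0), np.isclose(b1c, c * b1))

# Independent variable multiplied by c: slope divided by c, intercept unchanged
b0x, b1x = ols(c * x, y)
print(np.isclose(b0x, b0), np.isclose(b1x, b1 / c))
```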

Functional Form

 Linear vs. nonlinear

Logarithmic dependent variable

 $\log(wage) = \beta_0 + \beta_1 educ + u$

 $\beta_1$ approximates the percentage change in y per unit change in x: a semi-elasticity.

 It implies an increasing (dollar) return to education.

 Other nonlinearity: the diploma effect.

Bi-Logarithmic

 $\log(y) = \beta_0 + \beta_1 \log(x) + u$

 $\beta_1$ is a constant elasticity.

Change of units of measurement

 P45, error: $\beta_0^* = \beta_0 + \log(c_1) - \beta_1 \log(c_2)$ (when y is multiplied by $c_1$ and x by $c_2$).


 Be proficient at interpreting the coefficients.
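A sketch verifying the unit-change relation in the log-log model, with hypothetical data and arbitrary scaling constants $c_1, c_2$:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, 1000)
y = np.exp(0.5 + 1.2 * np.log(x) + rng.normal(0, 0.1, 1000))

def ols(lx, ly):
    """Bivariate OLS on already-logged data: returns (intercept, slope)."""
    b1 = np.cov(lx, ly, ddof=1)[0, 1] / np.var(lx, ddof=1)
    return ly.mean() - b1 * lx.mean(), b1

b0, b1 = ols(np.log(x), np.log(y))
c1, c2 = 3.0, 7.0  # arbitrary rescalings of y and x

b0_star, b1_star = ols(np.log(c2 * x), np.log(c1 * y))
print(np.isclose(b1_star, b1))                                 # slope (elasticity) unchanged
print(np.isclose(b0_star, b0 + np.log(c1) - b1 * np.log(c2)))  # intercept shifts as stated
```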

Unbiasedness of OLS Estimators

 Statistical properties of OLS: the distribution of the OLS estimates across different random samples drawn from the population.

 Assumptions

 SLR.1 Linear in parameters (a functional form restriction; relaxed by more advanced methods)

 SLR.2 Random sampling (fails for some time series data and nonrandom sampling)

 SLR.3 Zero conditional mean (unbiased vs. biased; spurious correlation)

 SLR.4 Sample variation in the independent variable (rules out collinearity with the constant)

 Theorem (Unbiasedness)

 Under the four assumptions above, $E(\hat{\beta}_0) = \beta_0$ and $E(\hat{\beta}_1) = \beta_1$.

Variance of OLS Estimators

 The sampling distribution of $\hat{\beta}_1$ is centered at $\beta_1$; the question is how far $\hat{\beta}_1$ is from $\beta_1$ on average.

 Assumptions

 Homoskedasticity: $Var(u \mid x) = \sigma^2$

 Error variance $\sigma^2$

 A larger $\sigma^2$ means that the distribution of the unobservables affecting y is more spread out.

 Theorem (Sampling variance of OLS estimators)

 Under the five assumptions above: $Var(\hat{\beta}_1) = \sigma^2 / \sum_{i=1}^n (x_i - \bar{x})^2 = \sigma^2 / SST_x$ and $Var(\hat{\beta}_0) = \sigma^2 \, n^{-1}\sum_{i=1}^n x_i^2 / SST_x$.

Variance of y given x

 Conditional mean and variance of y: $E(y \mid x) = \beta_0 + \beta_1 x$, $Var(y \mid x) = \sigma^2$.

 Heteroskedasticity: $Var(u \mid x)$ depends on x.

What does $Var(\hat{\beta}_1)$ depend on?

 More variation in the unobservables affecting y makes it more difficult to precisely estimate $\beta_1$.

 The more spread out the sample of $x_i$'s, the easier it is to find the relationship between $E(y \mid x)$ and x.

 As the sample size increases, so does the total variation in the $x_i$. Therefore, a larger sample size results in a smaller variance of the estimator.
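A small Monte Carlo sketch of the variance theorem (all numbers hypothetical): across repeated samples with the $x_i$ held fixed, the empirical variance of $\hat{\beta}_1$ should be close to $\sigma^2 / SST_x$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma = 50, 2.0
x = rng.uniform(0, 10, n)              # regressors held fixed across replications
sst_x = np.sum((x - x.mean()) ** 2)

b1_draws = []
for _ in range(20_000):
    u = rng.normal(0, sigma, n)
    y = 1.0 + 0.5 * x + u
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / sst_x
    b1_draws.append(b1)

print(np.var(b1_draws))        # empirical sampling variance of beta1-hat
print(sigma ** 2 / sst_x)      # theoretical value sigma^2 / SST_x
```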

Estimating Error Variance

 Errors (disturbances) and residuals

 Errors: $u_i = y_i - \beta_0 - \beta_1 x_i$, defined by the population model.

 Residuals: $\hat{u}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$, defined by the estimated function.

 Theorem (The unbiased estimator of $\sigma^2$)

 Under the five assumptions above, $\hat{\sigma}^2 = \frac{1}{n-2}\sum_{i=1}^n \hat{u}_i^2 = SSR/(n-2)$ satisfies $E(\hat{\sigma}^2) = \sigma^2$.

 Standard error of the regression (SER): $\hat{\sigma}$

 It estimates the standard deviation in y after the effect of x has been taken out.

 Standard error of $\hat{\beta}_1$: $se(\hat{\beta}_1) = \hat{\sigma} / \sqrt{SST_x}$
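A sketch of these estimates on one simulated sample (hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
x = rng.normal(0, 3, n)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, n)

sst_x = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sst_x
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x                     # residuals

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)   # unbiased estimator of sigma^2
ser = np.sqrt(sigma2_hat)                   # standard error of the regression
se_b1 = ser / np.sqrt(sst_x)                # standard error of the slope
print(sigma2_hat, ser, se_b1)
```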

Regression through the Origin

 Regression through the origin: $\tilde{y} = \tilde{\beta}_1 x$

 The line passes through $(0, 0)$.

 E.g. income tax revenue —— income.

 The OLS estimator: $\tilde{\beta}_1 = \sum_{i=1}^n x_i y_i / \sum_{i=1}^n x_i^2$

 $\tilde{\beta}_1 = \hat{\beta}_1$ only if $\bar{x} = 0$ (or the fitted intercept $\hat{\beta}_0 = 0$).

 If the intercept $\beta_0 \ne 0$, then $\tilde{\beta}_1$ is a biased estimator of $\beta_1$.
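A sketch of this bias by simulation (hypothetical numbers, with a true intercept of 3):

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 100, 10_000
x = rng.uniform(1, 5, n)          # fixed regressors, mean far from zero

b1_origin = []
for _ in range(reps):
    y = 3.0 + 0.5 * x + rng.normal(0, 1, n)    # true beta0 = 3, beta1 = 0.5
    b1_origin.append(np.sum(x * y) / np.sum(x ** 2))  # through-origin estimator

print(np.mean(b1_origin))  # noticeably above 0.5: upward bias since beta0 > 0
```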

Chap 3. Multiple Regression Analysis: Estimation

 Advantages of multiple regression analysis

 build better models for predicting the dependent variable,

 e.g. $wage = \beta_0 + \beta_1 educ + \beta_2 exper + u$;

 generalize functional form,

 e.g. the marginal propensity to consume in a quadratic consumption function;

 be more amenable to ceteris paribus analysis.

 Chap 3.2

 Key assumption: $E(u \mid x_1, \ldots, x_k) = 0$

 Implication: other factors affecting wage are not related on average to educ and exper.

 Multiple linear regression model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u$

 $\beta_j$: the ceteris paribus effect of $x_j$ on y.

Ordinary Least Squares Estimator

 SRF: $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k$

 OLS: minimize $\sum_{i=1}^n (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \cdots - \hat{\beta}_k x_{ik})^2$

 F.O.C.: $\sum_i \hat{u}_i = 0$ and $\sum_i x_{ij} \hat{u}_i = 0$ for $j = 1, \ldots, k$.

 Ceteris paribus interpretations:

 Holding $x_2, \ldots, x_k$ fixed, $\Delta\hat{y} = \hat{\beta}_1 \Delta x_1$.

 Thus, we have controlled for the variables $x_2, \ldots, x_k$ when estimating the effect of $x_1$ on y.
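A sketch of OLS via the normal equations implied by these first-order conditions (hypothetical data; numpy only):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)            # regressors deliberately correlated
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])     # design matrix with intercept
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # solves the normal equations X'X b = X'y
u_hat = y - X @ beta_hat

print(beta_hat)                                 # close to [1, 2, -1]
print(np.allclose(X.T @ u_hat, 0, atol=1e-8))   # F.O.C.: residuals orthogonal to X
```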

Holding Other Factors Fixed

 The power of multiple regression analysis is that it provides this ceteris paribus interpretation even though the data have not been collected in a ceteris paribus fashion.

 It allows us to do in non-experimental environments what natural scientists are able to do in a controlled laboratory setting: keep other factors fixed.

OLS and Ceteris Paribus Effects

 Steps of OLS (partialling out):

 (1) $\hat{r}_{i1}$: the OLS residuals from a multiple regression of $x_1$ on $x_2, \ldots, x_k$.

 (2) $\hat{\beta}_1$: the OLS estimator from a simple regression of y on $\hat{r}_{i1}$; it measures the effect of $x_1$ on y after $x_2, \ldots, x_k$ have been partialled or netted out.

 Two special cases in which the simple regression of y on $x_1$ will produce the same OLS estimate on $x_1$ as the regression of y on $x_1$ and $x_2$: (i) the OLS coefficient on $x_2$ is zero; (ii) $x_1$ and $x_2$ are uncorrelated in the sample.
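A sketch of the partialling-out result on hypothetical data: the coefficient from step (2) matches the full multiple regression coefficient.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Full multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
b_full = np.linalg.solve(X.T @ X, X.T @ y)

# Step (1): residuals r1 from regressing x1 on (1, x2)
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)

# Step (2): simple regression of y on r1 (r1 has mean zero by construction)
b1_partial = np.sum(r1 * y) / np.sum(r1 ** 2)

print(np.isclose(b_full[1], b1_partial))  # True: same coefficient on x1
```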

Goodness-of-fit

 $R^2$ also equals the squared correlation coefficient between the actual and the fitted values of y.

 $R^2$ never decreases, and it usually increases, when another independent variable is added to a regression.

 The factor that should determine whether an explanatory variable belongs in a model is whether the explanatory variable has a nonzero partial effect on y in the population.

Regression through the origin

 The properties of OLS derived earlier no longer hold for regression through the origin.

 The OLS residuals no longer have a zero sample average.

 $R^2$ can actually be negative.

 One remedy is to calculate $R^2$ as the squared correlation coefficient between the actual and fitted values.

 If the intercept in the population model is different from zero, then the OLS estimators of the slope parameters will be biased.

The Expectation of OLS Estimator

 Assumptions (direct generalizations of the simple regression assumptions; compare them)

 MLR.1 Linear in parameters

 MLR.2 Random sampling

 MLR.3 Zero conditional mean: $E(u \mid x_1, \ldots, x_k) = 0$

 MLR.4 No perfect collinearity: rank(X) = K, i.e.

 none of the independent variables is constant;

 and there are no exact linear relationships among the independent variables.

 Theorem (Unbiasedness)

 Under the four assumptions above, $E(\hat{\beta}_j) = \beta_j$, $j = 0, 1, \ldots, k$.

Notice 1: Zero conditional mean

 Exogenous vs. endogenous explanatory variables

 Misspecification of functional form (Chap 9)

 omitting the quadratic term;

 using the level instead of the log of a variable, or vice versa.

 Omitting important factors that are correlated with any independent variable

 if the omitted variable is correlated with the explanatory variables, the zero conditional mean assumption fails and the regression results are biased.

 Measurement error (Chap 15, IV)

 Simultaneously determining one or more x's with y (Chap 16, simultaneous equations)

Omitted Variable Bias: The Simple Case

 Problem: excluding a relevant variable, or under-specifying the model (omitting a variable that belongs in the true population model).

 Omitted variable bias (misspecification analysis)

 The true population model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$

 The underspecified OLS line: $\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$

 The expectation of $\tilde{\beta}_1$: $E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1$, where $\tilde{\delta}_1$ is the slope from the simple regression of $x_2$ on $x_1$.

 The omitted variable bias: $E(\tilde{\beta}_1) - \beta_1 = \beta_2 \tilde{\delta}_1$

 Note: in Section 3.2 above, the auxiliary regression ran $x_1$ on $x_2$; here $\tilde{\delta}_1$ comes from regressing $x_2$ on $x_1$.
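A sketch verifying the bias formula by simulation (all parameters hypothetical):

```python
import numpy as np

rng = np.random.default_rng(8)
n, reps = 200, 5_000
beta1, beta2 = 1.0, 2.0
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)   # corr(x1, x2) > 0 by construction

# delta1: slope from regressing x2 on x1 (fixed, since the x's are held fixed)
delta1 = np.sum((x1 - x1.mean()) * (x2 - x2.mean())) / np.sum((x1 - x1.mean()) ** 2)

b1_short = []
for _ in range(reps):
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    # underspecified regression: y on x1 only
    b1 = np.sum((x1 - x1.mean()) * (y - y.mean())) / np.sum((x1 - x1.mean()) ** 2)
    b1_short.append(b1)

print(np.mean(b1_short))        # close to beta1 + beta2 * delta1
print(beta1 + beta2 * delta1)   # theoretical expectation under omitted variable bias
```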

Omitted Variable Bias: Nonexistence

 Two cases where $\tilde{\beta}_1$ is unbiased:

 in the true population model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, the coefficient $\beta_2 = 0$;

 $\tilde{\delta}_1 = 0$, where $\tilde{\delta}_1$ is the sample covariance between $x_1$ and $x_2$ over the sample variance of $x_1$.

 If $\tilde{\delta}_1 = 0$, the unbiasedness of $\tilde{\beta}_1$ has nothing to do with $x_2$: only the intercept needs to adjust, and putting $x_2$ into the error term does not violate the zero conditional mean assumption.

Summary of Omitted Variable Bias:

 The expectation of $\tilde{\beta}_1$: $E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1$

 The omitted variable bias: $\beta_2 \tilde{\delta}_1$

The Size of Omitted Variable Bias

 Direction and size of the bias both matter.

 A small bias of either sign need not be a cause for concern.

 The bias is unknown, but we often have some idea about it:

 we usually have a pretty good idea about the direction of the partial effect of $x_2$ on y, that is, the sign of $\beta_2$;

 in many cases we can make an educated guess about whether $x_1$ and $x_2$ are positively or negatively correlated.

 E.g. (upward/downward bias; biased toward zero): omitting ability in a wage-education regression gives an upward bias when $\beta_2 > 0$ and ability is positively correlated with education, so the return to education is overestimated!

Omitted Variable Bias: More General Cases

 Suppose the true model is $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$ and $x_3$ is omitted; $x_2$ and $x_3$ are uncorrelated, but $x_1$ is correlated with $x_3$.

 Both $\tilde{\beta}_1$ and $\tilde{\beta}_2$ will normally be biased. The only exception to this is when $x_1$ and $x_2$ are also uncorrelated.

 It is difficult to obtain the direction of the bias in $\tilde{\beta}_1$ and $\tilde{\beta}_2$.

 Approximation: if $x_1$ and $x_2$ are also uncorrelated, then $E(\tilde{\beta}_1) = \beta_1 + \beta_3 \tilde{\delta}_1$, with $\tilde{\delta}_1$ from the regression of $x_3$ on $x_1$, as in the simple case.

Notice 2: No Perfect Collinearity

 An assumption only about the x's; it says nothing about the relationship between u and the x's.

 Assumption MLR.4 does allow the independent variables to be correlated; they just cannot be perfectly correlated. This is what allows ceteris paribus effects to be estimated.

 If we did not allow for any correlation among the independent variables, then multiple regression would not be very useful for econometric analysis.

Cases of Perfect Collinearity

 When can independent variables be perfectly collinear? Regression software then reports a "singular" matrix.

 Nonlinear functions of the same variable are allowed, since they are not exact linear functions of it.

 Do not include the same explanatory variable measured in different units in the same regression equation.

 More subtle ways:

 one independent variable can be expressed as an exact linear function of some or all of the other independent variables; the remedy is to drop one of them.

 Key: whether any regressor can be written as an exact linear function of the others.

Notice 3: Unbiasedness

 The meaning of unbiasedness:

 an estimate cannot be unbiased: an estimate is a fixed number, obtained from a particular sample, which usually is not equal to the population parameter.

 when we say that OLS is unbiased under Assumptions MLR.1 through MLR.4, we mean that the procedure by which the OLS estimates are obtained is unbiased when we view the procedure as being applied across all possible random samples.

Notice 4: Over-Specification

 Inclusion of an irrelevant variable, or over-specifying the model:

 does not affect the unbiasedness of the OLS estimators;

 but including irrelevant variables can have undesirable effects on the variances of the OLS estimators.

Variance of The OLS Estimators

 Adding an assumption

 Homoskedasticity: $Var(u \mid x_1, \ldots, x_k) = \sigma^2$

 Error variance $\sigma^2$

 A larger $\sigma^2$ means that the distribution of the unobservables affecting y is more spread out.

 Gauss-Markov assumptions (for cross-sectional regression): Assumptions 1-5

 Theorem (Sampling variance of OLS estimators)

 Under the five assumptions above: $Var(\hat{\beta}_j) = \dfrac{\sigma^2}{SST_j (1 - R_j^2)}$, $j = 1, \ldots, k$.

More about $Var(\hat{\beta}_j)$

 The statistical properties of the regression of y on $x = (x_1, x_2, \ldots, x_k)$.

 Error variance $\sigma^2$:

 only one way to reduce the error variance: add more explanatory variables, which is not always possible or desirable.

 The total sample variation in $x_j$: $SST_j$

 increase the sample size.

Multicollinearity (多重共线性)

 The linear relationships among the independent variables.

 $R_j^2$: the goodness-of-fit from regressing $x_j$ on the other explanatory variables (with an intercept).

 If k = 2: $R_1^2$ is the $R^2$ from the simple regression of $x_1$ on $x_2$.

 $R_j^2$: the proportion of the total variation in $x_j$ that can be explained by the other independent variables.

 As $R_j^2 \to 1$, $Var(\hat{\beta}_j) \to \infty$.

 High (but not perfect) correlation between two or more of the independent variables is called multicollinearity.

 Micro-numerosity: the problem of a small sample size.

 High $R_j^2$ combined with low $SST_j$ inflates the variance. One thing is clear: everything else being equal, for estimating $\beta_j$ it is better to have less correlation between $x_j$ and the other x's.
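A sketch computing $R_j^2$ and the implied variance inflation factor $VIF_j = 1/(1 - R_j^2)$ on hypothetical data (the VIF label is standard terminology, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 400
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.9 * x2 + 0.3 * x3 + rng.normal(0, 0.5, n)  # x1 highly collinear with x2, x3

# R_1^2: regress x1 on (1, x2, x3) and compute the goodness of fit
Z = np.column_stack([np.ones(n), x2, x3])
fitted = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
r2_1 = 1 - np.sum((x1 - fitted) ** 2) / np.sum((x1 - x1.mean()) ** 2)

vif_1 = 1 / (1 - r2_1)   # factor by which Var(beta1-hat) is inflated
print(r2_1, vif_1)
```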

How to "solve" the multicollinearity?

 Increase the sample size.

 Dropping some variables? If a variable that belongs in the population model is dropped, the estimates become biased.

Notice: The influence of multicollinearity

 A high degree of correlation between certain independent variables can be irrelevant as to how well we can estimate other parameters in the model.

 E.g. high correlation between $x_2$ and $x_3$ does not affect $Var(\hat{\beta}_1)$ when $x_1$ is uncorrelated with them.

 Importance for economists: control variables may be highly collinear with each other without harming the estimate of interest (see the notes).

Variances in Misspecified Models

 Comparing the two estimators of $\beta_1$: $Var(\tilde{\beta}_1) = \sigma^2 / SST_1 \le \sigma^2 / [SST_1 (1 - R_1^2)] = Var(\hat{\beta}_1)$.

Whether or Not to Include $x_2$: Two Favorable Reasons

 The choice of whether or not to include a particular variable in a regression model can be made by analyzing the tradeoff between bias and variance.

 However, when $\beta_2 \ne 0$, there are two favorable reasons for including $x_2$ in the model:

 any bias in $\tilde{\beta}_1$ does not shrink as the sample size grows;

 the variances of both estimators shrink to zero as n increases, so the multicollinearity induced by adding $x_2$ becomes less important as the sample size grows. In large samples, we would prefer $\hat{\beta}_1$ (see the notes).

Estimating $\sigma^2$: Standard Errors of the OLS Estimators

 $\hat{\sigma}^2 = SSR/(n - k - 1)$, where $n - k - 1$ is the degrees of freedom.

 $se(\hat{\beta}_j) = \hat{\sigma} / \sqrt{SST_j (1 - R_j^2)}$

EFFICIENCY OF OLS: THE GAUSS-MARKOV THEOREM

 BLUE: Best Linear Unbiased Estimator

 "Best": smallest variance

 "Linear": $\hat{\beta}_j = \sum_{i=1}^n w_{ij} y_i$

 "Unbiased": $E(\hat{\beta}_j) = \beta_j$

 Implications of the theorem: (1) there is no need to search for unbiased estimators that are other linear combinations of the $y_i$; (2) if any one of the Gauss-Markov assumptions fails, BLUE fails. For example, failure of the zero conditional mean assumption (endogeneity) leads to bias; heteroskedasticity does not cause bias, but the OLS variance is no longer the smallest.

Classical Linear Model Assumptions —— Inference

References for this part of the course

 Jeffrey M. Wooldridge, Introductory Econometrics: A Modern Approach, Chap. 2-3.