Handout 2

Research Method
Lecture 2 (Ch3)
Multiple linear regression
Model with k independent variables
y=β0+β1x1+β2x2+….+βkxk+u
β0 is the intercept
βj for j=1,…,k are the slope parameters
Mechanics of OLS
Variable labels
Suppose you have n observations. Then you have
data that look like
Obs id   y     x1     x2     …    xk
1        y1    x11    x12    …    x1k
2        y2    x21    x22    …    x2k
⋮        ⋮     ⋮      ⋮           ⋮
n        yn    xn1    xn2    …    xnk
The OLS estimates of the parameters are chosen to minimize the sum of squared residuals. That is, you minimize Q, given below, by choosing the β̂'s.
Q = Σ_{i=1}^n ûi² = Σ_{i=1}^n (yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik)²
This is achieved by taking the partial derivatives of Q with respect to the β̂'s and setting them equal to zero. (See next page.)
The first order conditions (FOCs)
∂Q/∂β̂0 = −2 Σ_{i=1}^n (yi − β̂0 − β̂1xi1 − … − β̂kxik) = 0
∂Q/∂β̂1 = −2 Σ_{i=1}^n xi1(yi − β̂0 − β̂1xi1 − … − β̂kxik) = 0
…
∂Q/∂β̂k = −2 Σ_{i=1}^n xik(yi − β̂0 − β̂1xi1 − … − β̂kxik) = 0
Solving these equations for the β̂'s gives the OLS estimators of the coefficients.
The most common way to solve the FOCs is to use matrix notation. We will use this method later.
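As a quick preview, here is a minimal NumPy sketch of the matrix solution (the simulated data and coefficient values below are purely illustrative, not from the lecture): the regressors are stacked into a matrix X with a column of ones, and the normal equations from the FOCs are solved for β̂.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2                                  # illustrative sample size and number of regressors

# simulate data from y = 1 + 2*x1 - 3*x2 + u
x = rng.normal(size=(n, k))
u = rng.normal(size=n)
y = 1 + 2 * x[:, 0] - 3 * x[:, 1] + u

X = np.column_stack([np.ones(n), x])           # add the intercept column
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # solves the normal equations (the FOCs)
print(beta_hat)                                # approximately [1, 2, -3]
```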
For our purposes, a more useful representation of the estimators is given on the next slide.
The OLS estimators
 The slope parameters have the following representation.
The jth parameter (except intercept) is given by
β̂j = (Σ_{i=1}^n r̂ij yi) / (Σ_{i=1}^n r̂ij²)
where r̂ij is the OLS residual from the following regression, in which xj is regressed on all the other explanatory variables. That is,
xij = δ̂0 + δ̂1xi1 + … + δ̂j-1xi,j-1 + δ̂j+1xi,j+1 + … + δ̂kxik + r̂ij
(the regressors are all the explanatory variables except xj)
Proof: See the front board
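As a numerical check of this representation (a sketch with simulated data and illustrative variable names, not part of the slides), the code below regresses x1 on the other regressors, takes the residuals r̂i1, and confirms that Σ r̂i1 yi / Σ r̂i1² equals the coefficient on x1 from the full regression.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)             # x1 correlated with x2
y = 1 + 2 * x1 - 3 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta_full = np.linalg.solve(X.T @ X, X.T @ y)  # full regression: [b0, b1, b2]

# regress x1 on the other explanatory variables (here a constant and x2)
Z = np.column_stack([np.ones(n), x2])
gamma = np.linalg.solve(Z.T @ Z, Z.T @ x1)
r1 = x1 - Z @ gamma                            # residuals r̂_i1

beta1_partial = (r1 @ y) / (r1 @ r1)           # Σ r̂_i1 y_i / Σ r̂_i1²
print(beta_full[1], beta1_partial)             # the two numbers agree
```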
Unbiasedness of OLS
Now, we introduce a series of assumptions to
show the unbiasedness of OLS.
Assumption MLR.1: Linear in parameters
The population model can be written as
y=β0+β1x1+β2x2+….+βkxk+u
Assumption MLR.2: Random sampling
We have a random sample of n observations {(xi1, xi2, …, xik, yi)}, i=1,…,n, following the population model.
MLR.2 means the following:
MLR.2a: yi, i=1,…,n are iid.
MLR.2b: xi1, i=1,…,n are iid; …; xik, i=1,…,n are iid.
MLR.2c: Variables from different observations are independent of each other.
MLR.2d: ui, i=1,…,n are iid.
(Recall the data table from the "Variable labels" slide: observations i = 1,…,n, each with yi, xi1, …, xik.)
Assumption MLR.3: No perfect collinearity
In the sample and in the population, none of
the independent variables are constant,
and there are no exact linear relationships
among the independent variables.
Assumption MLR.4: Zero conditional mean
E(u|x1,x2,…,xk)=0
Combining MLR.2 and MLR.4, we have the following.
MLR.4a: E(ui|xi1, xi2,…,xik)=0 for i=1,…,n
MLR.4b:
E(ui|x11,x12,..,x1k,x21,x22,..,x2k,..…,xn1,xn2,..,xnk)=0
for i=1,…,n.
We usually write this
as E(ui|X)=0
MLR.4b means that conditional on all the data, the
expected value of ui is zero.
Unbiasedness of OLS parameters
Theorem 3.1
Under Assumptions MLR.1 through MLR.4, we have
E(β̂j) = βj   for j = 0, 1,…, k
Proof: See front board
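A small Monte Carlo sketch (illustrative numbers, assuming the simulated model below) makes the theorem concrete: averaging β̂ over many random samples drawn under MLR.1 through MLR.4 gives values close to the true β's.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 2000
beta_true = np.array([1.0, 2.0, -3.0])        # [beta0, beta1, beta2], illustrative values

estimates = np.empty((reps, 3))
for r in range(reps):
    x = rng.normal(size=(n, 2))
    u = rng.normal(size=n)                    # E(u|x) = 0 by construction
    y = beta_true[0] + x @ beta_true[1:] + u
    X = np.column_stack([np.ones(n), x])
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(estimates.mean(axis=0))                 # close to [1, 2, -3]
```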
Omitted variable bias
 Suppose that the following population model satisfies
MLR.1 through MLR.4
y=β0+β1x1+β2x2+u -----------------------------(1)
But, further suppose that you instead estimate the
following model which omits x2, perhaps because of a
simple mistake, or perhaps because x2 is not available in
your data.
y=β0+β1x1+v ------------------------------------(2)
Then the OLS estimate of (1) and the OLS estimate of (2) have the following relationship:
β̃1 = β̂1 + β̂2 δ̃1
where β̂1 and β̂2 are the OLS estimates from (1), β̃1 is the OLS estimate from (2), and δ̃1 is the OLS estimate of δ1 in the following model:
x2 = δ0 + δ1x1 + e
The proof will be given later for the general case.
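This relationship can also be verified numerically. The sketch below (simulated data and illustrative coefficient values) fits (1), (2), and the auxiliary regression of x2 on x1, and checks that β̃1 = β̂1 + β̂2 δ̃1 holds exactly.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)            # x2 correlated with x1
y = 1 + 2 * x1 + 4 * x2 + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

b_long = ols(np.column_stack([np.ones(n), x1, x2]), y)   # β̂0, β̂1, β̂2 from (1)
b_short = ols(np.column_stack([np.ones(n), x1]), y)      # β̃0, β̃1 from (2)
delta = ols(np.column_stack([np.ones(n), x1]), x2)       # δ̃0, δ̃1 from x2 on x1

print(b_short[1], b_long[1] + b_long[2] * delta[1])      # identical
```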
So we have
E(β̃1|X) = β1 + β2 δ̃1
So, unless β2 = 0 or δ̃1 = 0, the estimate from equation (2), β̃1, is biased.
Notice that δ̃1 > 0 if cov(x1,x2) > 0 and vice versa, so we can predict the direction of the bias in the following way.
Summary of bias
            δ̃1 > 0, i.e., cov(x1,x2) > 0     δ̃1 < 0, i.e., cov(x1,x2) < 0
β2 > 0      Positive bias (upward bias)       Negative bias (downward bias)
β2 < 0      Negative bias (downward bias)     Positive bias (upward bias)
Question
Suppose the population model (satisfying MLR.1 through MLR.4) is given by
(Crop yield)= β0+ β1(fertilizer)+ β2(land quality)+u -----(1)
But your data do not have land quality variable, so you
estimate the following.
(Crop yield)= β0+ β1(fertilizer)+ v ---------------------------(2)
Questions next page:
 Consider the following two scenarios.
Scenario 1: On the farm where data were collected, farmers
used more fertilizer on pieces of land where land quality
is better.
Scenario 2: On the farm where data were collected,
scientists randomly assigned different quantities of
fertilizer on different pieces of land, irrespective of the
land quality.
Question 1: In which scenario do you expect to get an unbiased estimate?
Question 2: If the estimate under one of the above scenarios is biased, predict the direction of the bias.
Omitted variable bias, more general case
Suppose the population model (which satisfies MLR.1 through MLR.3) is given by
y=β0+β1x1+β2x2+….+βk-1xk-1+βkxk+u -----(1)
But you estimate a model which omits xk.
y=β0+β1x1+β2x2+….+βk-1xk-1+v -----(2)
Then we have the following:
β̃j = β̂j + β̂k δ̃j
where β̂j and β̂k are the OLS estimates from (1), β̃j is the OLS estimate from (2), and δ̃j is the OLS estimate of δj in the following regression:
xk = δ0 + δ1x1 + … + δk-1xk-1 + e
It is difficult to predict the direction of the bias in this general case.
However, an approximation is often useful. Note that δ̃j is likely to be positive if the correlation between xj and xk is positive. Using this, you can predict the "approximate" direction of the bias.
Endogeneity
Consider the following model
y=β0+β1x1+β2x2+….+βk-1xk-1+βkxk+u
A variable xj is said to be endogenous if xj and u are correlated. This causes bias in the OLS estimator of βj and, in certain cases, in the estimators of other coefficients as well.
One reason why endogeneity occurs is the omitted
variable problem, described in the previous
slides.
Variance of OLS estimators
First, we introduce one more assumption
Assumption MLR.5: Homoskedasticity
Var(u|x1,x2,…,xk)=σ2
This means that the variance of u does not depend
on the values of independent variables.
Combining MLR.5 with MLR.2, we also have
MLR.5a: Var(ui|X) = σ² for i = 1,…,n
where X denotes all the independent variables for all the observations; that is, x11, x12,…,x1k, x21, x22,…,x2k,…, xn1, xn2,…,xnk.
Sampling variance of OLS slope estimators
Theorem 3.2:
Under Assumptions MLR.1 through MLR.5, we have
Var(β̂j|X) = σ² / [SSTj(1 − Rj²)]   for j = 1,…,k
where
SSTj = Σ_{i=1}^n (xij − x̄j)²
and Rj² is the R-squared from regressing xj on all the other independent variables. That is, the R-squared from the following regression:
xj = δ0 + δ1x1 + … + δj-1xj-1 + δj+1xj+1 + … + δkxk + e
(the regressors are all the x-variables except xj)
Proof: see front board
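The formula can be checked against the usual matrix expression σ²(X'X)⁻¹. The sketch below (simulated data with a known σ², illustrative names) computes Var(β̂1|X) both ways and confirms they agree.

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma2 = 400, 1.5
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# matrix formula: Var(beta_hat | X) = sigma^2 (X'X)^{-1}
var_matrix = sigma2 * np.linalg.inv(X.T @ X)

# Theorem 3.2 formula for the coefficient on x1
Z = np.column_stack([np.ones(n), x2])                 # all the other regressors
resid = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)   # residuals from regressing x1 on them
SST_1 = np.sum((x1 - x1.mean()) ** 2)
R2_1 = 1 - (resid @ resid) / SST_1                    # R_1^2 of that auxiliary regression
print(var_matrix[1, 1], sigma2 / (SST_1 * (1 - R2_1)))  # the two values agree
```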
The standard deviation of an OLS slope estimator is given by the square root of its variance:
sd(β̂j) = √Var(β̂j|X) = √(σ² / [SSTj(1 − Rj²)]) = σ / √(SSTj(1 − Rj²))   for j = 1,…,k
The estimator of σ2
In Theorem 3.2, σ² is unknown and has to be estimated.
The estimator is given by
σ̂² = [1/(n − k − 1)] Σ_{i=1}^n ûi²
n − k − 1 comes from (# obs) − (# parameters estimated, including the intercept). This is called the degrees of freedom.
Theorem 3.3: Unbiased estimator of σ2 .
Under MLR.1 through MLR.5, we have
E(σ̂²) = σ²
Proof: See the front board
Estimates of the variance and the standard errors of OLS slope parameters
We replace σ² in Theorem 3.2 by σ̂² to get the estimate of the variance of the OLS parameters. This is given by
V̂ar(β̂j|X) = σ̂² / [SSTj(1 − Rj²)]
Note the hat on Var, indicating that this is an estimate.
Then the standard error of the OLS estimate is the square root of the above. This is the estimated standard deviation of the slope parameter:
se(β̂j) = √(σ̂² / [SSTj(1 − Rj²)]) = σ̂ / √(SSTj(1 − Rj²))
Multicollinearity
Var ( ˆ j | X ) 
2
SST j (1  R j )
2
•If xj is highly correlated with the other independent variables, Rj² gets close to 1. This in turn means that the variance of β̂j gets large. This is the problem of multicollinearity (see the sketch after this list).
•In an extreme case where xj is perfectly linearly correlated with the other explanatory variables, Rj² is equal to 1. In this case, you cannot estimate the betas at all. However, this case is ruled out by MLR.3.
•Note that multicollinearity does not violate any of the OLS assumptions (except in the perfect-collinearity case), and it should not be over-emphasized. You can reduce the variance by increasing the number of observations.
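To see the effect numerically, the sketch below (an illustrative simulation, not from the lecture) increases the correlation between x1 and x2 and reports the variance inflation factor 1/(1 − R1²); by the formula above, Var(β̂1|X) grows by the same factor.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500

for rho in [0.0, 0.5, 0.9, 0.99]:
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)  # corr(x1, x2) ≈ rho

    # R_1^2 from regressing x1 on a constant and x2
    Z = np.column_stack([np.ones(n), x2])
    resid = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
    R2_1 = 1 - (resid @ resid) / np.sum((x1 - x1.mean()) ** 2)

    print(f"rho={rho:4.2f}  1/(1 - R_1^2) = {1 / (1 - R2_1):7.2f}")
```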
Gauss-Markov theorem
Theorem 3.4
Under Assumptions MLR.1 through MLR.5, the OLS estimators of the beta parameters are the best linear unbiased estimators (BLUE).
This theorem means that among all linear unbiased estimators of the beta parameters, the OLS estimators have the smallest variances.