University of Alberta
3. Multiple Regression Analysis: Estimation
-Although bivariate linear regressions are
sometimes useful, they are often unrealistic
-SLR.4, that all factors affecting y are
uncorrelated with x, is often violated
-MULTIPLE REGRESSION ANALYSIS allows us to explicitly control for other factors to obtain a ceteris paribus interpretation
-this allows us to infer causality better
than a bivariate regression
3. Multiple Regression Analysis: Estimation
-multiple regression analysis includes more
variables, therefore explaining more of the
variation in y
-multiple regression analysis can also incorporate fairly general functional form relationships
-it's more flexible
3. Multiple Regression Analysis: Estimation
3.1 Motivation for Multiple Regression
3.2 Mechanics and Interpretation of Ordinary Least Squares
3.3 The Expected Value of the OLS Estimators
3.4 The Variance of the OLS Estimators
3.5 Efficiency of OLS: The Gauss-Markov Theorem
3.1 Motivation for Multiple Regression
Take the bivariate regression:
Moviequality = β0 + β1Plot + u   (ie)
-where u takes into account other factors affecting movie quality, such as the characters
-for this regression to be valid, we have to assume that characters are uncorrelated with the plot – a poor assumption
-since u is correlated with Plot, this estimate is biased and we can't isolate the ceteris paribus effect of plot on movie quality
3.1 Motivation for Multiple Regression
Take the multiple variable regression:
Moviequality = β0 + β1Plot + β2Character + u   (ie)
-we still need to be concerned about u's effect on Character and Plot BUT…
-by including Character in the regression, we can examine Plot's effect with Character held constant (β1)
-we can also analyze Character's effect on movie quality with Plot held constant (β2)
3.1 Motivation for Multiple Regression
-multiple regression analysis is also useful for generalizing functional relationships between variables:
Exammark = β0 + β1Study + β2Study² + u   (ie)
-here study time can impact exam mark in a
direct and/or quadratic fashion
-this quadratic term affects how the parameters are interpreted
-you cannot examine study's effect on exammark by holding study² constant
3.1 Motivation for Multiple Regression
-the change in exammark due to an extra hour of
studying therefore becomes:
ΔExammark/ΔStudy = β1 + 2β2Study   (ie)
-the impact is no longer a constant (β1)
-while including both a variable and its square in multiple regression analysis allows it to have a more dynamic impact, it requires a more in-depth analysis of the estimated coefficients
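One way to see the point above: the marginal effect β1 + 2β2Study can be evaluated at different study levels. The coefficient values below are made up for illustration, not estimates from real data:

```python
# Marginal effect of an extra hour of studying in the quadratic model
# exammark = b0 + b1*study + b2*study^2 + u.
# The values b1 = 6.0, b2 = -0.25 are made up for illustration.

def marginal_effect(b1, b2, study):
    """Return d(exammark)/d(study) = b1 + 2*b2*study."""
    return b1 + 2 * b2 * study

# With diminishing returns (b2 < 0), the effect shrinks as study grows:
b1, b2 = 6.0, -0.25
effects = [marginal_effect(b1, b2, s) for s in (0, 4, 12)]
print(effects)  # [6.0, 4.0, 0.0]
```

The effect of the first hour (6 marks) is larger than the effect of the twelfth (0 marks), which is exactly why no single number summarizes study's impact.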
3.1 Motivation for Multiple Regression
-A simple model with two independent variables
(x1 and x2) can be written as:
y = β0 + β1x1 + β2x2 + u   (3.3)
-where β1 measures x1's impact on y and β2 measures x2's impact on y
-a key assumption about how u is related to x1 and x2 is:
E(u | x1, x2) = 0   (3.5)
-that is, the expected value of all unobserved impacts on y is zero given any values of x1 and x2
-as in the bivariate case, β0 can always be rescaled to make this hold true
3.1 Motivation for Multiple Regression
-in our movie example, this becomes:
E(u | plot, character) = 0   (ie)
-in other words, other factors affecting movie
quality (such as filming skill) are not related to
plot or character
-in the quadratic case, this assumption simplifies:
E(u | study, study²) = 0  ⟺  E(u | study) = 0   (ie)
3.1 Model with k Independent Variables
-in a regression with k independent variables, the
MULTIPLE LINEAR REGRESSION MODEL or
MULTIPLE REGRESSION MODEL of the
population is:
y = β0 + β1x1 + β2x2 + β3x3 + … + βkxk + u   (3.6)
-β0 is the intercept; β1 relates to x1, β2 relates to x2, and so on
-k variables and an intercept give k+1 unknown parameters
-parameters other than the intercept are sometimes called SLOPE PARAMETERS
3.1 Model with k Independent Variables
-in the multiple regression model:
y = β0 + β1x1 + β2x2 + β3x3 + … + βkxk + u   (3.6)
-u is the error term or disturbance, capturing all effects on y not included in the x's
-some effects can't be measured
-some effects aren't anticipated
-y is the DEPENDENT, EXPLAINED, or PREDICTED variable
-the x's are the INDEPENDENT, EXPLANATORY, or PREDICTOR variables
3.1 Model with k Independent Variables
-parameter interpretation is key in multiple
regressions:
log(mark) = β0 + β1log(ability) + β2study + β3study² + u   (ie)
-here β1 is the ceteris paribus elasticity of mark with respect to ability
-if β3 = 0, then 100β2 is approximately the ceteris paribus percentage increase in mark when you study an extra hour
-if β3 ≠ 0, the interpretation is more complicated
-note that this equation is linear in the parameters even though mark and study have a non-linear relationship
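The 100β2 approximation holds because, with β3 = 0, an extra hour of study raises log(mark) by β2, so mark changes by a factor of exp(β2), and exp(β2) − 1 ≈ β2 when β2 is small. A quick numerical check (β2 = 0.04 is a made-up value, not from the slides):

```python
# Compare the approximate percentage effect 100*b2 with the exact effect
# 100*(exp(b2) - 1) in a log(y) model. b2 = 0.04 is a made-up coefficient.
import math

b2 = 0.04
approx_pct = 100 * b2                 # approximate % change in mark
exact_pct = 100 * (math.exp(b2) - 1)  # exact % change in mark

print(approx_pct)           # 4.0
print(round(exact_pct, 3))  # 4.081
```

For small coefficients the two agree closely; for large ones the exact expression should be used.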
3.1 Model with k Independent Variables
-the key assumption with k independent variables becomes:
E(u | x1, x2, …, xk) = 0   (3.8)
-that is, ALL unobserved factors are uncorrelated
with ALL explanatory variables
-anything that causes correlation between u and
any explanatory variable causes (3.8) to fail
3.2 Mechanics and Interpretation of Ordinary Least Squares
-in a simple model with two independent
variables, the OLS estimation is written as:
ŷ = β̂0 + β̂1x1 + β̂2x2   (3.9)
-where β̂0 estimates β0, β̂1 estimates β1, and β̂2 estimates β2
-we obtain these estimates through the method
of ORDINARY LEAST SQUARES which
minimizes the sum of squared residuals:
min_{β̂0, β̂1, β̂2}  Σi (yi − β̂0 − β̂1xi1 − β̂2xi2)²   (3.10)
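As a sketch of what the software does later in this chapter, the minimization in (3.10) can be carried out numerically. The data below are simulated for illustration; np.linalg.lstsq finds the coefficients minimizing the sum of squared residuals once a column of ones is included for the intercept:

```python
# A minimal sketch of (3.10): choose b0hat, b1hat, b2hat to minimize the
# sum of squared residuals. Data are simulated; true parameters (1, 2, -3).
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 3.0 * x2 + u

# Prepend a column of ones so the intercept is estimated too.
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, ssr, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to [1, 2, -3]
```

With 500 observations the estimates land near the true values, though sampling error keeps them from matching exactly.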
3.2 Indexing Note
-when independent variables have two
subscripts, the i refers to the observation number
-likewise the number (1 or 2, etc.) distinguishes
between different variables
-for example, x54 indicates the 5th observation's data for variable 4
-in this course, variables will be generalized as xij, where i refers to the observation number and j refers to the variable number
-this is not universal, other papers will use
different conventions
3.2 K Independent Variables
-in a model with k independent variables, the
OLS estimation is written as:
ŷ = β̂0 + β̂1x1 + β̂2x2 + … + β̂kxk   (3.11)
-where β̂0 estimates β0, β̂1 estimates β1, β̂2 estimates β2, etc.
-this is called the OLS REGRESSION LINE or
SAMPLE REGRESSION FUNCTION (SRF)
-we still obtain k+1 OLS estimates by minimizing
the sum of squared residuals:
min_{β̂j}  Σ(i=1…n) (yi − β̂0 − β̂1xi1 − … − β̂kxik)²   (3.12)
3.2 K Independent Variables
-using multivariable calculus (partial derivatives),
this leads to k+1 equations of k+1 unknowns:
Σi (yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0
Σi xi1(yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0
Σi xi2(yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0   (3.13)
…
Σi xik(yi − β̂0 − β̂1xi1 − β̂2xi2 − … − β̂kxik) = 0
-these are also OLS’s FIRST ORDER CONDITIONS
(FOC’s)
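The FOCs in (3.13) are the normal equations. A minimal sketch with simulated data: stacking the regressors (with a leading column of ones) into a matrix X, (3.13) says X'(y − Xβ̂) = 0, so β̂ solves X'Xβ̂ = X'y, and the residuals come out orthogonal to every regressor:

```python
# The first order conditions (3.13) say each regressor (and the constant)
# is orthogonal to the OLS residuals. Solve the normal equations
# X'X b = X'y directly on simulated data, then check the FOCs.
import numpy as np

rng = np.random.default_rng(1)
n, k = 300, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # intercept + k vars
beta_true = np.array([0.5, 1.0, -2.0, 0.7])
y = X @ beta_true + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # normal equations
resid = y - X @ beta_hat

# Every column of X is orthogonal to the residuals, as (3.13) requires:
print(np.max(np.abs(X.T @ resid)))  # essentially zero (floating-point noise)
```

The "solved uniquely" requirement mentioned below corresponds to X'X being invertible, which fails only if one regressor is an exact linear combination of the others.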
3.2 K Independent Variables
-these equations are sample counterparts of
population moments from a method of
moments estimation (we’ve omitted dividing by
n) using the following assumptions:
E(u) = 0,  E(xj u) = 0   (3.8)
-(3.13) is tedious to solve by hand, so we use statistical and econometric software
-the one requirement is that (3.13) can be solved uniquely for the β̂j's (an easy assumption)
-β̂0 is called the OLS INTERCEPT ESTIMATE and β̂1 through β̂k the OLS SLOPE ESTIMATES
3.2 Interpreting the OLS Equation
-given a model with 2 independent variables (x1
and x2):
ŷ = β̂0 + β̂1x1 + β̂2x2   (3.14)
-β̂0 is the predicted value of y when x1 = 0 and x2 = 0
-this is sometimes an interesting situation and other times impossible
-the intercept is still essential to the estimation, even if it is theoretically meaningless
3.2 Interpreting the OLS Equation
-β̂1 and β̂2 have PARTIAL EFFECT, or CETERIS PARIBUS, interpretations:
Δŷ = β̂1Δx1 + β̂2Δx2
-therefore given changes in x1 and x2, we can predict the change in ŷ
-in addition, when the other x variable is held constant, we have:
Δŷ = β̂1Δx1 (when x2 is held fixed)
and
Δŷ = β̂2Δx2 (when x1 is held fixed)
3.2 Interpreting Example
-consider the theoretical model:
intelligence-hat = 80 + 5HomeParent + 0.5Held   (ie)
-where a person's innate intelligence is a function of how many years a parent was home during their childhood and the average number of hours they were held as a child
-the intercept (80) estimates that a child with no stay-at-home parent who is never held will have an innate intelligence of 80
3.2 Interpreting Example
-consider the theoretical model:
intelligence-hat = 80 + 5HomeParent + 0.5Held   (ie)
-β̂1 estimates that a parent staying home for an extra year increases child intellect by 5
-β̂2 estimates that holding a child an extra hour on average increases child intellect by 0.5
-if a parent stays home for an extra year and, as a result, holds the child an extra hour on average, we would estimate their intellect to rise by 5.5 (1·β̂1 + 1·β̂2 = 5 + 0.5)
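The arithmetic above, written out as a small helper (the function name is just for illustration; the coefficients are the slide's own):

```python
# Predicted change in intelligence from the slide's estimated equation
# intelligence-hat = 80 + 5*HomeParent + 0.5*Held.

def predicted_change(d_homeparent, d_held, b1hat=5.0, b2hat=0.5):
    """Change in predicted intelligence: b1hat*dHomeParent + b2hat*dHeld."""
    return b1hat * d_homeparent + b2hat * d_held

print(predicted_change(1, 0))  # 5.0  (extra year home, Held fixed)
print(predicted_change(0, 1))  # 0.5  (extra hour held, HomeParent fixed)
print(predicted_change(1, 1))  # 5.5  (both change together)
```

Each partial effect holds the other regressor fixed; changing both at once simply adds the two effects.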
3.2 Interpreting the OLS Equation
-A model with k independent variables is written
similar to the 2 independent variable case:
ŷ = β̂0 + β̂1x1 + β̂2x2 + … + β̂kxk   (3.16)
-written in terms of changes:
Δŷ = β̂1Δx1 + β̂2Δx2 + … + β̂kΔxk   (3.17)
-if we hold all other variables (xj, j = 1, 2, …, k, j ≠ f) fixed, or CONTROL FOR all other variables, then:
Δŷ = β̂f Δxf   (3.18')
3.2 Holding Other Factors Fixed
-we’ve already seen that Bjhat examines the
effect of increasing xj by one, holding all
other x’s constant
-in simple regression analysis, this would
require two identical observations where
only xj differed
-multiple regression analysis estimates this
effect without having an explicit example
-multiple regression analysis mimics a
controlled experiment using
nonexperimental data