3. Multiple Regression Analysis: Estimation
-although bivariate linear regressions are sometimes useful, they are often unrealistic
-SLR.4, that all factors affecting y are uncorrelated with x, is often violated
-MULTIPLE REGRESSION ANALYSIS allows us to explicitly control for other factors to obtain a ceteris paribus situation
-this allows us to infer causality better than a bivariate regression
-multiple regression analysis includes more variables, therefore explaining more of the variation in y
-multiple regression analysis can also “incorporate fairly general functional form relationships”
-it’s more flexible

3. Multiple Regression Analysis: Estimation
3.1 Motivation for Multiple Regression
3.2 Mechanics and Interpretation of Ordinary Least Squares
3.3 The Expected Value of the OLS Estimators
3.4 The Variance of the OLS Estimators
3.5 Efficiency of OLS: The Gauss-Markov Theorem

3.1 Motivation for Multiple Regression
Take the bivariate regression:
$Moviequality = \beta_0 + \beta_1 Plot + u$ (ie)
-where u captures other factors affecting movie quality, such as the characters
-for this regression to be valid, we have to assume that characters are uncorrelated with the plot, a poor assumption
-since u is correlated with Plot, this estimate is biased and we can’t isolate the ceteris paribus effect of plot on movie quality

Take the multiple variable regression:
$Moviequality = \beta_0 + \beta_1 Plot + \beta_2 Character + u$ (ie)
-we still need to be concerned about u’s correlation with Character and Plot, BUT…
-by including Character in the regression we ensure we can examine Plot’s effect with Character held constant (B1)
-we can also analyze Character’s effect on movie quality with Plot held constant (B2)

-“Multiple regression analysis is also useful for generalizing functional relationships between variables”:
$Exammark = \beta_0 + \beta_1 Study + \beta_2 Study^2 + u$ (ie)
-here study time can impact exam mark in a direct and/or quadratic fashion
-this
quadratic specification changes how the parameters are interpreted
-you cannot examine Study’s effect on Exammark while holding Study^2 constant, since Study^2 changes whenever Study does
-the change in exammark due to an extra hour of studying therefore becomes approximately:
$\frac{\Delta Exammark}{\Delta Study} \approx \beta_1 + 2\beta_2 Study$ (ie)
-the impact is no longer a constant (B1)
-while including one variable twice in multiple regression analysis allows it to have a more dynamic impact, it requires a more in-depth analysis of the estimated coefficients

-a simple model with two independent variables (x1 and x2) can be written as:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$ (3.3)
-where B1 examines x1’s impact on y and B2 examines x2’s impact on y
-a key assumption on how u is related to x1 and x2 is:
$E(u \mid x_1, x_2) = 0$ (3.5)
-that is, the unobserved factors affecting y have an expected value of zero given any values of x1 and x2
-as in the bivariate case, B0 can absorb any nonzero mean of u, so assuming E(u) = 0 costs nothing; the substantive part is that this mean does not vary with x1 and x2
-in our movie example, this becomes:
$E(u \mid Plot, Character) = 0$ (ie)
-in other words, other factors affecting movie quality (such as filming skill) are on average not related to plot or character
-in the quadratic case, this assumption is simplified, since Study^2 is known once Study is known:
$E(u \mid Study, Study^2) = 0 \iff E(u \mid Study) = 0$ (ie)

3.1 Model with k Independent Variables
-in a regression with k independent variables, the MULTIPLE LINEAR REGRESSION MODEL or MULTIPLE REGRESSION MODEL of the population is:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + u$ (3.6)
-B0 is the intercept, B1 relates to x1, B2 relates to x2, and so on
-k variables and an intercept give k+1 unknown parameters
-parameters other than the intercept are sometimes called SLOPE PARAMETERS
-u is the error term or disturbance that captures all effects on y not included in the x’s
-some effects can’t be measured
-some effects aren’t expected
-y is the DEPENDENT, EXPLAINED, or PREDICTED variable
-the x’s are the INDEPENDENT, EXPLANATORY or PREDICTOR variables

-parameter interpretation is key in multiple regressions:
$\log(mark) = \beta_0 + \beta_1 \log(ability) + \beta_2 study + \beta_3 study^2 + u$ (ie)
-here B1 is the ceteris paribus elasticity of mark with respect to ability
-if B3=0, then 100B2 is approximately the ceteris paribus percentage increase in mark when you study an extra hour
-if B3≠0, this is more complicated
-note that this equation is linear in the parameters even though mark and study have a non-linear relationship

-the key assumption with k independent variables becomes:
$E(u \mid x_1, x_2, \dots, x_k) = 0$ (3.8)
-that is, ALL unobserved factors are uncorrelated with ALL explanatory variables
-anything that causes correlation between u and any explanatory variable causes (3.8) to fail

3.2 Mechanics and Interpretation of Ordinary Least Squares
-in a simple model with two independent variables, the OLS estimation is written as:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$ (3.9)
-where B0hat estimates B0, B1hat estimates B1 and B2hat estimates B2
-we obtain these estimates through the method of ORDINARY LEAST SQUARES, which minimizes the sum of squared residuals:
$\min_{\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2})^2$ (3.10)

3.2 Indexing Note
-when independent variables have two subscripts, the i refers to the observation number
-likewise the number (1 or 2, etc.)
distinguishes between different variables
-for example, x54 indicates the 5th observation’s data for variable 4
-in this course, variables will be generalized as xij, where i refers to observation number and j refers to variable number
-this is not universal; other papers will use different conventions

3.2 K Independent Variables
-in a model with k independent variables, the OLS estimation is written as:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$ (3.11)
-where B0hat estimates B0, B1hat estimates B1, B2hat estimates B2, etc.
-this is called the OLS REGRESSION LINE or SAMPLE REGRESSION FUNCTION (SRF)
-we still obtain k+1 OLS estimates by minimizing the sum of squared residuals:
$\min_{\hat{\beta}_j} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \dots - \hat{\beta}_k x_{ik})^2$ (3.12)
-using multivariable calculus (partial derivatives), this leads to k+1 equations in k+1 unknowns:
$$
\begin{aligned}
\sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0 \\
\sum_{i=1}^{n} x_{i1} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0 \\
\sum_{i=1}^{n} x_{i2} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0 \\
&\;\;\vdots \\
\sum_{i=1}^{n} x_{ik} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \dots - \hat{\beta}_k x_{ik}) &= 0
\end{aligned}
$$ (3.13)
-these are also OLS’s FIRST ORDER CONDITIONS (FOCs)
-these equations are sample counterparts of population moments from a method of moments estimation (we’ve omitted dividing by n) using the following assumptions, which follow from (3.8):
$E(u) = 0, \quad E(x_j u) = 0 \text{ for } j = 1, \dots, k$
-(3.13) is tedious to solve by hand, so we use statistical and econometric software
-the one requirement is that (3.13) can be solved uniquely for the Bjhat (this is an easy assumption)
-B0hat is called the OLS INTERCEPT ESTIMATE and B1hat to Bkhat the OLS SLOPE ESTIMATES

3.2 Interpreting the OLS Equation
-given a model with 2 independent variables (x1 and x2):
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$ (3.14)
-B0hat is the predicted value of y when x1=0 and x2=0
-this is sometimes an interesting situation and other times impossible
-the intercept is still essential to the estimation, even if it is theoretically meaningless
-B1hat and B2hat have PARTIAL EFFECT or CETERIS PARIBUS interpretations:
$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2$
-therefore given a change in x1 and x2, we can predict a change in y
-in addition, when the other x variable is held constant, we have:
$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1$ (when x2 is held fixed)
and
$\Delta\hat{y} = \hat{\beta}_2 \Delta x_2$ (when x1 is held fixed)

3.2 Interpreting Example
-consider the theoretical model:
$\widehat{intelligence} = 80 + 5\,HomeParent + 0.5\,Held$ (ie)
-where a person’s innate intelligence is a function of how many years a parent was home during their childhood and the average number of hours they were held as a child
-the intercept (80) estimates that a child with no stay-at-home parent who is never held will have an innate intelligence of 80
-B1hat estimates that a parent staying home for an extra year increases child intellect by 5
-B2hat estimates that a parent holding a child for on average an extra hour increases child intellect by 0.5
-if a parent stays home for an extra year, and as a result
holds a child an extra hour on average, we would estimate their intellect to rise by 5.5 (5 + 0.5 = 1(B1hat) + 1(B2hat))

3.2 Interpreting the OLS Equation
-a model with k independent variables is written similarly to the 2 independent variable case:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$ (3.16)
-written in terms of changes:
$\Delta\hat{y} = \hat{\beta}_1 \Delta x_1 + \hat{\beta}_2 \Delta x_2 + \dots + \hat{\beta}_k \Delta x_k$ (3.17)
-if we hold all other variables (xj, j = 1, 2, …, k, j ≠ f) fixed, or CONTROL FOR all other variables:
$\Delta\hat{y} = \hat{\beta}_f \Delta x_f$ (3.18')

3.2 Holding Other Factors Fixed
-we’ve already seen that Bjhat examines the effect of increasing xj by one, holding all other x’s constant
-in simple regression analysis, this would require two observations that are identical except for xj
-multiple regression analysis estimates this effect without having an explicit example
-multiple regression analysis mimics a controlled experiment using nonexperimental data
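
3.2 Illustrative Sketch: Solving the First Order Conditions
-the first order conditions (3.13) and the "holding other factors fixed" idea can be checked numerically
-what follows is a minimal sketch in plain Python, not the software used in the course: the sample is made up and generated without noise from the illustrative intelligence equation, and the helper names design_rows, solve_linear_system and ols are hypothetical
-it solves the k+1 equations, confirms the residuals satisfy (3.13), and compares the HomeParent slope with and without Held in the regression

```python
def design_rows(*regressors):
    """Build one row [1, x_i1, ..., x_ik] per observation (the 1 is for the intercept)."""
    return [[1.0, *xs] for xs in zip(*regressors)]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], so row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("the first order conditions have no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(y, *regressors):
    """Return [b0hat, b1hat, ..., bkhat] by solving the k+1 first order conditions."""
    rows = design_rows(*regressors)
    k1 = len(rows[0])
    # Rearranged FOCs: sum_i x_ip * (fitted value_i) = sum_i x_ip * y_i for each p.
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k1)] for p in range(k1)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k1)]
    return solve_linear_system(xtx, xty)


# Made-up sample (an assumption for illustration, not data from the course).
home_parent = [0, 1, 2, 3, 4, 5]
held = [1, 2, 2, 4, 3, 6]
# Generated without noise from the illustrative equation 80 + 5*HomeParent + 0.5*Held.
intelligence = [80 + 5 * h + 0.5 * d for h, d in zip(home_parent, held)]

b_long = ols(intelligence, home_parent, held)  # controls for Held
b_short = ols(intelligence, home_parent)       # omits Held

# Left-hand sides of (3.13): residuals summed against 1, x_i1 and x_i2.
rows = design_rows(home_parent, held)
residuals = [yi - sum(bj * xj for bj, xj in zip(b_long, r))
             for yi, r in zip(intelligence, rows)]
foc = [sum(e * r[p] for e, r in zip(residuals, rows)) for p in range(3)]

print("with Held:   ", [round(b, 4) for b in b_long])
print("without Held:", [round(b, 4) for b in b_short])
print("FOC sums:    ", [round(v, 8) for v in foc])
```

-with Held included, the slope on HomeParent is the 5 used to build the data; with Held omitted it rises to about 5.43, because HomeParent and Held move together in this made-up sample and the short regression cannot hold Held fixed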
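
3.1 Illustrative Sketch: Marginal Effect in the Quadratic Model
-returning to the quadratic exam mark model from 3.1, the effect of an extra hour of study depends on how many hours are already studied
-the sketch below uses hypothetical coefficients (40, 6 and -0.25 are assumptions chosen for illustration, not estimates) to compare the approximation B1 + 2(B2)(Study) with the change in the predicted mark from one more hour

```python
# Hypothetical coefficients chosen only for illustration (not estimates from the course).
B0, B1, B2 = 40.0, 6.0, -0.25


def predicted_mark(study):
    """Exam mark implied by the quadratic model, ignoring the error term u."""
    return B0 + B1 * study + B2 * study ** 2


def marginal_effect(study):
    """Approximate change in mark from one more hour of study: B1 + 2*B2*study."""
    return B1 + 2 * B2 * study


for hours in (2, 10):
    one_more_hour = predicted_mark(hours + 1) - predicted_mark(hours)
    print(hours, marginal_effect(hours), one_more_hour)
```

-under these assumed coefficients the approximation gives 5.0 at 2 hours and 1.0 at 10 hours, while the predicted change from one more hour is 4.75 and 0.75
-the gap is B2, because the exact one-hour change is B1 + B2(2Study + 1); either way, B1 alone does not describe the effect of studying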