3.1 MOTIVATION FOR MULTIPLE REGRESSION
3.2 MECHANICS AND INTERPRETATION OF OLS (WITH 2 REGRESSORS)

Multiple Regression Analysis

Motivation for multiple regression
- Incorporate more explanatory factors into the model
- Explicitly hold fixed other factors that would otherwise be in the error term
- Allow for more flexible functional forms

Example: Family income and family consumption
  consumption = β0 + β1 income + β2 income^2 + u
  (consumption = family consumption, income = family income, income^2 = family income squared, u = other factors)
- The model has two explanatory variables: income and income squared
- Consumption is explained as a quadratic function of income
- One has to be very careful when interpreting the coefficients

OLS Estimation with 2 Regressors
Recall the case of two regressors:
  Y = β0 + β1X1 + β2X2 + u
- Y is the dependent variable
- X1, X2 are the two independent variables (regressors)
- β0 = unknown population intercept
- β1 = effect on Y of a change in X1, holding X2 constant
- β2 = effect on Y of a change in X2, holding X1 constant
- u = the regression error (omitted factors)
Key assumption: E(u|X1, X2) = 0

For a random sample, Yi = β0 + β1X1i + β2X2i + ui, i = 1, …, n, where (Yi, X1i, X2i) denotes the ith observation on Y, X1, and X2. With two regressors, the OLS estimator solves
  min over b0, b1, b2 of Σ_{i=1}^{n} (Yi − b0 − b1X1i − b2X2i)^2
The OLS estimator minimizes the sum of squared differences between the actual values Yi and the predicted values based on the estimated line. This minimization problem is solved using calculus and yields the OLS estimators β̂0, β̂1, and β̂2.

Interpretation: for a change in the regressors, the predicted difference in Y is ΔY = β1ΔX1 + β2ΔX2. So the estimates β̂1 and β̂2 are partial effects:
- β1 = ΔY/ΔX1, holding X2 constant (ΔX2 = 0)
- β2 = ΔY/ΔX2, holding X1 constant (ΔX1 = 0)
- β0 = predicted value of Y when X1 = X2 = 0

Textbook Example 3.1: Determinants of college GPA
- colGPA = grade point average at college
- hsGPA = high school grade point average
- ACT = achievement test score
Interpretation:
- Holding ACT fixed, another point of high school GPA is associated with another .453 points of college GPA. Or: if we compare two students with the same ACT score but the hsGPA of student A is one point higher, we predict student A to have a colGPA that is .453 points higher than that of student B.
- Holding high school GPA fixed, another 10 points on the ACT are associated with less than one point of college GPA.

Example: the California Test Score Data
Regression of TestScore against STR:
  TestScore = 698.9 − 2.28×STR
Now include the percentage of English learners in the district (PctEL):
  TestScore = 686.0 − 1.10×STR − 0.65×PctEL
- What happens to the coefficient on STR?
- How do you interpret the coefficient on STR now?

Example: the California Test Score Data (Stata)

reg testscr str pctel, robust;

Regression with robust standard errors                 Number of obs =     420
                                                       F(  2,   417) =  223.82
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4264
                                                       Root MSE      =  14.464

------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .4328472    -2.54   0.011     -1.95213   -.2504616
       pctel |  -.6497768   .0310318   -20.94   0.000     -.710775   -.5887786
       _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
------------------------------------------------------------------------------

TestScore = 686.0 − 1.10×STR − 0.65×PctEL

(A short Stata sketch that reproduces both regressions follows at the end of this section.)

Textbook Exercise 3.2
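The following is a minimal Stata sketch, not part of the original slides, showing how the two California Test Score regressions above could be produced. The variable names testscr, str, and pctel come from the output shown; the data file name caschool.dta is an assumption.

* Minimal sketch (not from the slides): the two California Test Score regressions.
* The data file name caschool.dta is an assumption; testscr, str, and pctel
* match the variable names in the output above.
use caschool.dta, clear

* Short regression: test scores on the student-teacher ratio only
reg testscr str, robust

* Longer regression: also hold fixed the percentage of English learners
reg testscr str pctel, robust

* Comparing the two outputs, the STR coefficient falls in magnitude from about
* -2.28 to about -1.10 once pctel is included.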
3.2 MECHANICS AND INTERPRETATION OF OLS (WITH K REGRESSORS)

Multiple Regression Analysis

Definition of the multiple linear regression model
"Explains variable y in terms of variables x1, x2, …, xk"
  y = β0 + β1x1 + β2x2 + … + βkxk + u
- y: dependent variable, explained variable, response variable, …
- β0: intercept
- β1, …, βk: slope parameters
- x1, …, xk: independent variables, explanatory variables, regressors, …
- u: error term, disturbance, unobservables, …
Key assumption: E(u|x1, x2, …, xk) = 0

OLS Estimation with k Regressors
OLS estimation of the multiple regression model:
- Random sample: {(xi1, xi2, …, xik, yi): i = 1, …, n}
- Regression residuals: ûi = yi − β̂0 − β̂1xi1 − … − β̂kxik
- Minimize the sum of squared residuals: choose β̂0, β̂1, …, β̂k to minimize Σ_{i=1}^{n} ûi^2
- The minimization is carried out by computer

Interpretation of the multiple regression model
  Δŷ = β̂j Δxj, holding all other independent variables fixed
- The coefficient β̂j answers: by how much does the dependent variable change if the j-th independent variable is increased by one unit, holding all other independent variables and the error term constant?
- The multiple linear regression model manages to hold the values of the other explanatory variables fixed even if, in reality, they are correlated with the explanatory variable under consideration.
- "Ceteris paribus" interpretation
- It still has to be assumed that unobserved factors do not change when the explanatory variables are changed.

Textbook Example 3.2: wage equation with educ, exper, and tenure
1. Interpret the slope coefficient on educ.
2. What is the effect on wage when educ increases by 2 years, holding other factors fixed?
3. What is the effect on wage when exper and tenure each increase by one year, holding educ fixed?

OLS Properties
Properties of OLS on any sample of data:
- Fitted or predicted values: ŷi = β̂0 + β̂1xi1 + … + β̂kxik
- Residuals: ûi = yi − ŷi

Algebraic properties of OLS regression:
- The residuals (deviations from the regression line) sum to zero: Σ_{i=1}^{n} ûi = 0
- The sample covariance between the residuals and each regressor is zero: Σ_{i=1}^{n} xij ûi = 0 for j = 1, …, k
- The sample averages of y and of the regressors lie on the regression line: ȳ = β̂0 + β̂1x̄1 + … + β̂kx̄k

Stata Exercise
lwage is log(wage), educ is years of education, exper is years of experience, and expersq is exper^2. The sample means are lwage = 1.62, educ = 12.56, exper = 17.02, and expersq = 473.44. Calculate the value of the missing educ coefficient. What is its interpretation? (A sketch of the idea, using the means-on-the-regression-line property, appears at the end of this section.)

Goodness-of-Fit
- Decomposition of total variation: SST = SSE + SSR, where SST = Σ(yi − ȳ)^2 is the total sum of squares, SSE = Σ(ŷi − ȳ)^2 the explained sum of squares, and SSR = Σ ûi^2 the residual sum of squares
- R-squared: R^2 = SSE/SST = 1 − SSR/SST
- Alternative expression: R^2 equals the squared sample correlation between the actual yi and the fitted values ŷi
- Notice that R-squared can never decrease, and usually increases, when another explanatory variable is added to the regression

Textbook Example 3.5: Explaining arrest records
- Dependent variable: number of times arrested in 1986
- pcnv = proportion of prior arrests that led to conviction
- ptime86 = months in prison in 1986
- qemp86 = quarters employed in 1986
Interpretation: pcnv, ptime86, and qemp86 together explain about 4.13% of the variation in the number of times arrested (R^2 ≈ 0.0413).

Textbook Example 3.5: Explaining arrest records (cont.)
An additional explanatory variable is added: the average sentence length in prior convictions.
Interpretation:
- R-squared increases only slightly, so the additional variable has limited explanatory power.
- The estimated coefficient implies that a longer average sentence length increases criminal activity.

General remark on R-squared
- Even if R-squared is small (as in this example), the regression may still provide good estimates of the ceteris paribus effects.

Textbook Exercise 3.4
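Below is a minimal Stata sketch, not part of the slides, that illustrates the algebraic OLS properties above and the means-on-the-regression-line idea behind the Stata exercise. The variable names lwage, educ, exper, and expersq are taken from the exercise; the data file name wage1.dta, and the assumption that expersq is already in the data, are mine.

* Minimal sketch (not from the slides): algebraic properties of OLS and the
* means-on-the-regression-line property behind the Stata exercise.
* Assumptions: the data file is wage1.dta and it already contains
* lwage, educ, exper, and expersq (= exper^2).
use wage1.dta, clear

reg lwage educ exper expersq

predict lwagehat, xb                // fitted values
predict uhat, residuals             // OLS residuals

summarize uhat                      // residuals average to (numerically) zero
correlate uhat educ exper expersq   // and are uncorrelated with each regressor

* Sample means lie on the regression line:
*   mean(lwage) = b0 + b_educ*mean(educ) + b_exper*mean(exper) + b_expersq*mean(expersq)
* so a single missing coefficient can be backed out from the reported means:
display (1.62 - _b[_cons] - _b[exper]*17.02 - _b[expersq]*473.44) / 12.56

The last line simply rearranges the means-on-the-line equation to solve for the educ coefficient, which is the same calculation the exercise asks for by hand.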