Lecture Notes #6
LN6—Applications in Econometrics

AN APPLICATION OF OPTIMIZATION AND MATRIX ALGEBRA: REGRESSION

Regression analysis, as applied in econometrics, is a basic tool of economics. Here we will study how the concepts we learned in the discussion of matrix algebra and derivatives are applied in regression analysis. We will start with simple regression and then extend the discussion to multiple regression.

1. Simple Regression

In simple regression we want to find out about the relationship between two variables: whether the variations in the values of one variable are associated with variations in the values of another variable. We designate one variable as the dependent variable and the other as the independent variable. We will use the following simple example to see if, and to what extent, the variation in the quantity of a good sold by a firm (the dependent variable) is related to the firm's annual advertising expenditure (the independent variable).

    Year              1     2     3     4     5     6     7     8     9    10    11    12
    Sales y          38    48    52    35    30    56    63    46    61    68    72    65
    Ad Expenditure x 56.7  63.9  60.7  59.7  55.9  68.7  69.2  65.5  72.5  73.4  74.1  76.2

A scatter diagram of the data is shown in Figure 1.

    Figure 1  Scatter Diagram of Sales and Advertising Data

As the first step, the scatter diagram indicates that there is a direct relationship between sales volume and advertising expenditure. We want, however, to obtain a mathematical relationship between x and y. The mathematical relationship is represented by the regression equation, which is the equation of a line that is fitted to the scatter diagram. This fitted line is shown in Figure 2.

    Figure 2  The Fitted Regression Line

The general format of the regression line is as follows:

    ŷ = b₁ + b₂x

Note the symbol "^" above y. The symbol ŷ (y-hat) represents the y value that lies on the regression line, which is obtained by plugging in values for x after we determine the values of the slope of the line (b₂) and the vertical intercept (b₁).
ŷ is called the predicted value. The "plain", hatless, y represents the value of the dependent variable observed in the data set, called the observed value. For now, the equation of the fitted regression line is:

    ŷ = −67.4299 + 1.8119x

To show the difference between the observed and predicted values of y, consider, for example, year 7: the observed sales volume is y = 63 million units, which corresponds to an advertising expenditure of x = $69.2 thousand. The predicted sales volume is:

    ŷ = −67.4299 + 1.8119(69.2) = 57.95

The difference between the observed and predicted values of y is called the prediction error, or the residual, and is denoted by e:

    e = y − ŷ

The prediction error provides the theoretical basis for the mathematical process used to obtain the values of the slope and the vertical intercept (the coefficients) of the regression equation. The mathematical process is called the least squares method.

Note that for each value of x in the data set there is an observed y value and a predicted y value. Thus, given the number of observations in the data set, n, there are n residuals. All predicted values lie along the fitted regression line. There is only one line that fits the scatter diagram best. The best-fitting line has the property that the residuals sum to zero; that is, the combined positive prediction errors exactly balance the combined negative errors:

    Σᵢ eᵢ = Σᵢ (yᵢ − ŷᵢ) = 0,    i = 1, …, n

Many lines satisfy this zero-sum property, however, so it cannot serve as the criterion by itself. The least squares criterion selects, among all candidate lines, the one for which the sum of squared residuals is the minimum, or the least. Thus, we need a method to obtain the coefficients of the regression line by minimizing the sum of squared errors (SSE):

    SSE = Σeᵢ² = Σ(yᵢ − ŷᵢ)²

2. Least Squares Method

The objective here is to find the formulas to compute the values for b₁ and b₂. We start with the SSE.
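Before working through the algebra, the least squares idea can be previewed numerically. The following sketch (plain Python, using the sales/advertising data and the fitted coefficients reported above; an illustration, not part of the original notes) computes the SSE at the fitted coefficients and shows that nearby coefficients produce a larger SSE:

```python
# Sales/advertising data from the example above
x = [56.7, 63.9, 60.7, 59.7, 55.9, 68.7, 69.2, 65.5, 72.5, 73.4, 74.1, 76.2]
y = [38, 48, 52, 35, 30, 56, 63, 46, 61, 68, 72, 65]

def sse(b1, b2):
    """Sum of squared residuals for the line y-hat = b1 + b2*x."""
    return sum((yi - (b1 + b2 * xi)) ** 2 for xi, yi in zip(x, y))

best = sse(-67.4299, 1.8119)       # SSE at the reported least squares coefficients
print(round(best, 2))

# Any other coefficients give a larger SSE:
print(sse(-67.4299, 1.9) > best)   # perturbing the slope raises the SSE
print(sse(-60.0, 1.8119) > best)   # perturbing the intercept raises the SSE
```

Trying other perturbations of b₁ and b₂ always raises the SSE, which is exactly what "least squares" means.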
    Σeᵢ² = Σ(yᵢ − ŷᵢ)²

Substituting ŷᵢ = b₁ + b₂xᵢ on the right-hand side, we have:

    Σeᵢ² = Σ(yᵢ − b₁ − b₂xᵢ)²

Since we are interested in determining the values of the two coefficients b₁ and b₂ such that Σeᵢ² is minimized, we take the partial derivatives of the right-hand side with respect to b₁ and b₂ and set them equal to zero:

    ∂(Σe²)/∂b₁ = −2Σ(y − b₁ − b₂x)
    ∂(Σe²)/∂b₂ = −2Σx(y − b₁ − b₂x)

Rewriting the right-hand sides and setting them equal to zero, we have:

    Σy − nb₁ − b₂Σx = 0
    Σxy − b₁Σx − b₂Σx² = 0

which lead to:

    nb₁ + (Σx)b₂ = Σy
    (Σx)b₁ + (Σx²)b₂ = Σxy

These are called the normal equations of the regression. Here we have two equations in the two unknowns b₁ and b₂. We can solve for b₁ and b₂ in two ways. Using matrix notation, the equation system is written as Xb = k, where

    X = [ n    Σx  ]      b = [ b₁ ]      k = [ Σy  ]
        [ Σx   Σx² ]          [ b₂ ]          [ Σxy ]

    "Coefficient" matrix   "Variable" matrix   "Constant" matrix

Thus,

    [ n    Σx  ] [ b₁ ]   [ Σy  ]
    [ Σx   Σx² ] [ b₂ ] = [ Σxy ]

The solution for b₁ and b₂ is then found by finding the inverse of the coefficient matrix and post-multiplying the inverse by the constant matrix:

    b = X⁻¹k

To find X⁻¹, first find the determinant of X:

    |X| = nΣx² − (Σx)² = nΣx² − (nx̄)²        (substituting Σx = nx̄)

    |X| = n(Σx² − nx̄²)

Next find the cofactor matrix:

    [C] = [ Σx²   −Σx ]
          [ −Σx    n  ]

Since the coefficient matrix is symmetric about the principal diagonal, the adjoint matrix, which is the transpose of the cofactor matrix, is the same as [C].
The inverse matrix X⁻¹ is then:

    X⁻¹ = (1/|X|) [ Σx²   −Σx ]
                  [ −Σx    n  ]

    X⁻¹ = (1/(n(Σx² − nx̄²))) [ Σx²   −Σx ]
                             [ −Σx    n  ]

    X⁻¹ = [ Σx²/(n(Σx² − nx̄²))    −x̄/(Σx² − nx̄²) ]
          [ −x̄/(Σx² − nx̄²)        1/(Σx² − nx̄²)  ]

Thus,

    [ b₁ ]   [ Σx²/(n(Σx² − nx̄²))    −x̄/(Σx² − nx̄²) ] [ Σy  ]
    [ b₂ ] = [ −x̄/(Σx² − nx̄²)        1/(Σx² − nx̄²)  ] [ Σxy ]

Using the matrix multiplication rule, first solve for b₂ (substituting Σy = nȳ):

    b₂ = −nx̄ȳ/(Σx² − nx̄²) + Σxy/(Σx² − nx̄²)

    b₂ = (Σxy − nx̄ȳ)/(Σx² − nx̄²)

Next, solve for b₁:

    b₁ = ȳΣx²/(Σx² − nx̄²) − x̄Σxy/(Σx² − nx̄²)

    b₁ = (ȳΣx² − x̄Σxy)/(Σx² − nx̄²)

Adding and subtracting nx̄²ȳ in the numerator,

    b₁ = (ȳΣx² − nx̄²ȳ − x̄Σxy + nx̄²ȳ)/(Σx² − nx̄²)

    b₁ = (ȳ(Σx² − nx̄²) − (Σxy − nx̄ȳ)x̄)/(Σx² − nx̄²)

    b₁ = ȳ − b₂x̄

Alternatively, dividing both sides of the first normal equation by n:

    b₁ + b₂(Σx/n) = Σy/n

    b₁ + b₂x̄ = ȳ

    b₁ = ȳ − b₂x̄

Now substitute for b₁ in the second normal equation (using Σx = nx̄):

    (ȳ − b₂x̄)Σx + b₂Σx² = Σxy

    n(ȳ − b₂x̄)x̄ + b₂Σx² = Σxy

    nx̄ȳ − nb₂x̄² + b₂Σx² = Σxy

    b₂(Σx² − nx̄²) = Σxy − nx̄ȳ

    b₂ = (Σxy − nx̄ȳ)/(Σx² − nx̄²)

From the data compute the following:

    x̄ = 66.375      ȳ = 52.833      Σxy = 43066.4      Σx² = 53411.13

    b₂ = (43066.4 − 12(66.375)(52.833)) / (53411.13 − 12(66.375)²) = 1.8119

    b₁ = 52.833 − 1.8119(66.375) = −67.4299

Alternatively, compute the quantities in the matrix notation:

    [ n    Σx  ] [ b₁ ]   [ Σy  ]
    [ Σx   Σx² ] [ b₂ ] = [ Σxy ]

The solution for b₁ and b₂ is then found by finding the inverse of the coefficient matrix and post-multiplying the inverse by the constant matrix.
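The matrix route can also be carried out with a short script. This sketch (an illustration using NumPy, not part of the original notes) builds the normal-equations system from the data and evaluates b = X⁻¹k:

```python
import numpy as np

# Advertising data from the example above
x = np.array([56.7, 63.9, 60.7, 59.7, 55.9, 68.7, 69.2, 65.5, 72.5, 73.4, 74.1, 76.2])
y = np.array([38, 48, 52, 35, 30, 56, 63, 46, 61, 68, 72, 65], dtype=float)
n = len(x)

# Normal-equations system X b = k
X = np.array([[n,       x.sum()],
              [x.sum(), (x ** 2).sum()]])
k = np.array([y.sum(), (x * y).sum()])

b = np.linalg.inv(X) @ k   # b = X^(-1) k
print(b.round(4))          # [b1, b2]
```

In practice `np.linalg.solve(X, k)` is preferred over forming the inverse explicitly, but the explicit inverse matches the derivation in the notes.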
    X = [ n    Σx  ]      b = [ b₁ ]      k = [ Σy  ]
        [ Σx   Σx² ]          [ b₂ ]          [ Σxy ]

    b = X⁻¹k

From the data compute the following:

    Σx = 796.5      Σy = 634      Σx² = 53411.13      Σxy = 43066.4

    [ 12      796.5    ] [ b₁ ]   [ 634.0   ]
    [ 796.5   53411.13 ] [ b₂ ] = [ 43066.4 ]

Using Excel, find:

    X⁻¹ = [ 8.1902    −0.1221 ]
          [ −0.1221    0.0018 ]

    [ b₁ ]   [ 8.1902    −0.1221 ] [ 634     ]
    [ b₂ ] = [ −0.1221    0.0018 ] [ 43066.4 ]

    [ b₁ ]   [ −67.4299 ]
    [ b₂ ] = [ 1.8119   ]

3. Multiple Regression

In multiple regression there are two or more independent variables. With two independent variables, the regression equation is written as:

    ŷ = b₁ + b₂x₂ + b₃x₃

To determine the coefficients of a multiple regression the only practical approach is matrix algebra. Here, again, we start with finding the normal equations.

    Σe² = Σ(y − ŷ)²

    Σe² = Σ(y − b₁ − b₂x₂ − b₃x₃)²

    ∂(Σe²)/∂b₁ = −2Σ(y − b₁ − b₂x₂ − b₃x₃) = 0
    ∂(Σe²)/∂b₂ = −2Σx₂(y − b₁ − b₂x₂ − b₃x₃) = 0
    ∂(Σe²)/∂b₃ = −2Σx₃(y − b₁ − b₂x₂ − b₃x₃) = 0

The normal equations are:

    nb₁ + (Σx₂)b₂ + (Σx₃)b₃ = Σy
    (Σx₂)b₁ + (Σx₂²)b₂ + (Σx₂x₃)b₃ = Σx₂y
    (Σx₃)b₁ + (Σx₂x₃)b₂ + (Σx₃²)b₃ = Σx₃y

In matrix format, we have:

    [ n     Σx₂     Σx₃   ] [ b₁ ]   [ Σy   ]
    [ Σx₂   Σx₂²    Σx₂x₃ ] [ b₂ ] = [ Σx₂y ]
    [ Σx₃   Σx₂x₃   Σx₃²  ] [ b₃ ]   [ Σx₃y ]

Labeling the three matrices as before, the values of the regression coefficients are found from:

    b = X⁻¹k

Example

The following data show the price of houses as the dependent variable, and the size (in square feet) and the age (in years) as the independent variables. We want to determine to what extent the price varies with the size and the age of the house.
    PRICE y    SQFT x₂    AGE x₃
    89950      917        39
    138950     1684       10
    87000      1800       20
    165000     1900       13
    210000     2000       20
    108000     1050       40
    89000      1057       45
    79000      954        6
    124500     1350       47
    135000     2134       13
    105500     1313       12
    133650     1671       37
    83500      1200       18
    101000     1314       45
    151500     1877       10
    88500      1132       38
    198000     2198       2
    135000     1525       10
    79500      1208       29
    135050     1450       35
    175000     2000       9
    71000      1267       18
    76000      1088       12
    99250      1159       35
    98500      1255       4
    117500     1386       30
    97000      1400       63
    125000     1442       12
    115000     1477       23
    145000     1566       6

The following are the quantities needed to generate the X and k matrices:

    n = 30          Σy = 3556850
    Σx₂ = 43774     Σx₂² = 67540602     Σx₂y = 5494952850
    Σx₃ = 701       Σx₃² = 23573        Σx₂x₃ = 955209      Σx₃y = 77587100

    X = [ 30      43774      701    ]      k = [ 3556850    ]
        [ 43774   67540602   955209 ]          [ 5494952850 ]
        [ 701     955209     23573  ]          [ 77587100   ]

    b = X⁻¹k

    [ b₁ ]   [ 1.0387523    −0.0005537   −0.0084550 ] [ 3556850    ]
    [ b₂ ] = [ −0.0005537    0.0000003    0.0000031 ] [ 5494952850 ]
    [ b₃ ]   [ −0.0084550    0.0000031    0.0001682 ] [ 77587100   ]

    [ b₁ ]   [ −3609.450 ]
    [ b₂ ] = [ 83.459    ]
    [ b₃ ]   [ 16.802    ]

The estimated regression equation is then:

    ŷ = −3609.450 + 83.459x₂ + 16.802x₃

Thus, for example, the price of a 2,000-square-foot house that is 10 years old is predicted to be:

    ŷ = −3609.450 + 83.459(2000) + 16.802(10) = $163,476.57

4. Functional Forms in Econometrics

In the simple linear regression model the population parameters β₁ (the intercept) and β₂ (the slope) are linear; that is, they are not expressed as, say, β₂², 1/β₂, or any form other than β₂. Also, the impact of changes in the independent variable on y works directly through x rather than through expressions such as, say, x², 1/x, or ln x. In many cases, even though the parameters are held as linear, the variables may take on forms other than linear. In many economic models the relationship between the dependent and independent variables is not a straight-line relationship; that is, the change in y does not follow the same pattern for all values of x.
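Returning to the house-price example of Section 3, the solution b = X⁻¹k can be reproduced with a short NumPy sketch (an illustration, not part of the original notes), building the sums directly from the data:

```python
import numpy as np

# House-price data: price (y), size in sq ft (x2), age in years (x3)
price = np.array([89950, 138950, 87000, 165000, 210000, 108000, 89000, 79000,
                  124500, 135000, 105500, 133650, 83500, 101000, 151500, 88500,
                  198000, 135000, 79500, 135050, 175000, 71000, 76000, 99250,
                  98500, 117500, 97000, 125000, 115000, 145000], dtype=float)
sqft = np.array([917, 1684, 1800, 1900, 2000, 1050, 1057, 954, 1350, 2134,
                 1313, 1671, 1200, 1314, 1877, 1132, 2198, 1525, 1208, 1450,
                 2000, 1267, 1088, 1159, 1255, 1386, 1400, 1442, 1477, 1566], dtype=float)
age = np.array([39, 10, 20, 13, 20, 40, 45, 6, 47, 13, 12, 37, 18, 45, 10, 38,
                2, 10, 29, 35, 9, 18, 12, 35, 4, 30, 63, 12, 23, 6], dtype=float)
n = len(price)

# Normal-equations system X b = k for two independent variables
X = np.array([[n,          sqft.sum(),         age.sum()],
              [sqft.sum(), (sqft ** 2).sum(),  (sqft * age).sum()],
              [age.sum(),  (sqft * age).sum(), (age ** 2).sum()]])
k = np.array([price.sum(), (sqft * price).sum(), (age * price).sum()])

b = np.linalg.solve(X, k)   # equivalent to X^(-1) k, numerically more stable
print(b.round(3))           # [b1, b2, b3]

# Predicted price of a 2,000 sq ft, 10-year-old house
pred = b[0] + b[1] * 2000 + b[2] * 10
print(round(pred, 2))
```

The coefficients agree with the values computed above, as does the predicted price of the 2,000-square-foot, 10-year-old house.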
Consider, for example, an economic model explaining the relationship between expenditure on food (or housing) and income. As income rises, we do expect expenditure on food to rise, but not at a constant rate. In fact, we should expect the rate of increase in expenditure on food to decrease as income rises. Therefore the relationship between income and food expenditure is not a straight-line relationship.

The following is an outline of various functional forms encountered in regression analysis. First we start with the straight-line (linear) relationship between x and y and then point out various non-linear functional relationships.

4.1. Linear Functional Form

The linear functional form is the familiar:

    y = β₁ + β₂x

The slope of the function is:

    dy/dx = β₂

The slope represents the change in y per unit change in x. The elasticity of the function shows the proportional, or percentage, change in y relative to a percentage change in x:

    ε = d(ln y)/d(ln x) = β₂(x/y)

To show this, let u = ln y and v = ln x. But, since v = ln x, then x = eᵛ. Therefore,

    d(ln y)/d(ln x) = du/dv = (du/dy)(dy/dx)(dx/dv) = (1/y)(β₂)(eᵛ) = β₂(x/y)

Note that to obtain the elasticity, simply multiply the slope by x/y:

    ε = (dy/dx)(x/y) = β₂(x/y)

4.2. Reciprocal Functional Form

The reciprocal functional form is:

    y = β₁ + β₂(1/x)

The slope is:

    dy/dx = −β₂/x²

and the elasticity is:

    ε = d(ln y)/d(ln x) = −β₂/(xy)

Note that, as usual, using the general definition of elasticity:

    ε = (dy/dx)(x/y) = (−β₂/x²)(x/y) = −β₂/(xy)

4.3. Log-Log Functional Form

Many relationships between variables are naturally expressed in percentages. Logarithms convert changes in variables into percentage changes.
The log-log functional form is:

    ln y = β₁ + β₂ ln x

The slope and elasticity of the function are:

    (1/y)(dy/dx) = β₂(1/x)        dy/dx = β₂(y/x)

    ε = (dy/dx)(x/y) = β₂(y/x)(x/y) = β₂

Consider, for example, the function:

    ln y = 0.5 + 2 ln x

For x₀ = 1.5,

    ln y = 0.5 + 2 ln 1.5 = 0.5 + 2(0.4055) = 1.3109

    y₀ = exp(1.3109) = 3.7096

The slope of the function at the point x₀ = 1.5 is then:

    dy/dx = β₂(y/x) = 2(3.7096/1.5) = 4.9462

This means that for each small unit increase in x, y increases by 4.9462.

Now let's compute the elasticity. First, suppose x increases by 1 percent, from x₀ = 1.5 to x₁ = 1.5 × 1.01 = 1.515. Then,

    ln y₁ = 0.5 + 2 ln 1.515 = 1.3308

    y₁ = exp(1.3308) = 3.7842

Thus, the percentage change in y when x increases by 1 percent is:

    (3.7842 − 3.7096)/3.7096 = 0.0201, or 2.01%

The elasticity is then ε = 0.0201/0.01 = 2.01. For a very small percentage change in x, the elasticity approaches β₂ = 2, as shown in the following table.

    "Percent" change in x (1)    x₁ (2)       y₁ (3)     Δy/y₀ (4)    Elasticity (4)/(1)
    0.01                         1.515        3.7842     0.0201       2.01
    0.005                        1.5075       3.7468     0.010025     2.005
    0.0025                       1.50375      3.7282     0.005006     2.0025
    0.001                        1.5015       3.7170     0.002001     2.001
    0.0001                       1.50015      3.7104     0.0002       2.0001
    0.00001                      1.500015     3.7097     0.00002      2.00001

The following will be added:
    Linear-log (semi-log) model
    Log-inverse model
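The table above can be reproduced with a short script. This sketch (plain Python, an illustration rather than part of the notes) recomputes the arc elasticity of ln y = 0.5 + 2 ln x at x₀ = 1.5 for shrinking percentage changes in x, showing the values approaching β₂ = 2:

```python
import math

beta1, beta2 = 0.5, 2.0
x0 = 1.5
y0 = math.exp(beta1 + beta2 * math.log(x0))   # y at x0, about 3.7096

for pct in [0.01, 0.005, 0.0025, 0.001, 0.0001, 0.00001]:
    x1 = x0 * (1 + pct)                        # x after a "pct" proportional increase
    y1 = math.exp(beta1 + beta2 * math.log(x1))
    elasticity = ((y1 - y0) / y0) / pct        # arc elasticity: %-change in y over %-change in x
    print(pct, round(elasticity, 5))
```

Each printed elasticity equals 2 plus the percent change itself (e.g. 2.01 for a 1 percent change), so the limit as the change shrinks is the point elasticity β₂ = 2.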