UNC-Wilmington Department of Economics and Finance ECN 377 Dr. Chris Dumas

Appendix to Regression Handout

Derivation of the OLS Regression Estimator Equations for $\hat{\beta}_0$ and $\hat{\beta}_1$

(The material in this Appendix is not required for ECN 377, but if you plan to go to grad school, you should understand and be able to explain the derivation below. If you want to understand, but you don't, come by my office, and I'll help you.)

The OLS Regression Estimator Equations are the solution to the problem: "Find the $\hat{\beta}_i$'s that minimize the sum of the squared $\hat{e}_i$'s for all individuals in the sample."

$$\min_{\hat{\beta}_0,\,\hat{\beta}_1} \; \sum_i \hat{e}_i^{\,2}$$

The problem above is a nonlinear optimization problem without constraints, so we can use the classical calculus method of optimization to find a solution (recall the classical calculus method of optimization from your ECN 321 course at UNCW).

To solve this problem, begin by replacing $\hat{e}_i$ with $Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i}$:

$$\min_{\hat{\beta}_0,\,\hat{\beta}_1} \; \sum_i \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_{1i} \right)^2$$

Next, square the expression within the parentheses:

$$\min_{\hat{\beta}_0,\,\hat{\beta}_1} \; \sum_i \left[ Y_i^2 - Y_i \hat{\beta}_0 - Y_i \hat{\beta}_1 X_{1i} - \hat{\beta}_0 Y_i + \hat{\beta}_0^2 + \hat{\beta}_0 \hat{\beta}_1 X_{1i} - \hat{\beta}_1 X_{1i} Y_i + \hat{\beta}_1 X_{1i} \hat{\beta}_0 + \hat{\beta}_1^2 X_{1i}^2 \right]$$

Inside the brackets, collect similar terms together:

$$\min_{\hat{\beta}_0,\,\hat{\beta}_1} \; \sum_i \left[ Y_i^2 - 2 Y_i \hat{\beta}_0 - 2 Y_i \hat{\beta}_1 X_{1i} + \hat{\beta}_0^2 + 2 \hat{\beta}_0 \hat{\beta}_1 X_{1i} + \hat{\beta}_1^2 X_{1i}^2 \right]$$

Take partial derivatives with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$ and set each equal to zero to find the First Order Conditions for the minimization problem (remember that the partial derivative operator can "move through" the summation operator):

F.O.C.'s

(1) $\dfrac{\partial}{\partial \hat{\beta}_0} = \sum_i \left[ -2 Y_i + 2 \hat{\beta}_0 + 2 \hat{\beta}_1 X_{1i} \right] = 0$

(2) $\dfrac{\partial}{\partial \hat{\beta}_1} = \sum_i \left[ -2 Y_i X_{1i} + 2 \hat{\beta}_0 X_{1i} + 2 \hat{\beta}_1 X_{1i}^2 \right] = 0$

Now let's focus on simplifying FOC (1):

$$\sum_i \left[ -2 Y_i + 2 \hat{\beta}_0 + 2 \hat{\beta}_1 X_{1i} \right] = 0$$

Recall that the summation operator can distribute across terms that are added or subtracted:

$$\sum_i \left[ -2 Y_i \right] + \sum_i \left[ 2 \hat{\beta}_0 \right] + \sum_i \left[ 2 \hat{\beta}_1 X_{1i} \right] = 0$$

Recall that constants can be "pulled through" summation operators:

$$-2 \sum_i Y_i + 2 \hat{\beta}_0 \sum_i 1 + 2 \hat{\beta}_1 \sum_i X_{1i} = 0$$

Next, assuming that our sample size is $n$, notice that the sum of $n$ ones is simply equal to $n$:

$$-2 \sum_i Y_i + 2 \hat{\beta}_0 n + 2 \hat{\beta}_1 \sum_i X_{1i} = 0$$

Cancelling the "2's" and moving the term with the sum of $Y$ to the right side of the equation:

$$\hat{\beta}_0 n + \hat{\beta}_1 \sum_i X_{1i} = \sum_i Y_i \qquad \text{call this Equation (3)}$$

Turning to FOC (2), using methods similar to those that we used for FOC (1), we find that FOC (2) simplifies to:

$$\hat{\beta}_0 \sum_i X_{1i} + \hat{\beta}_1 \sum_i X_{1i}^2 = \sum_i Y_i X_{1i} \qquad \text{call this Equation (4)}$$

Importantly, notice that Equation (3) and Equation (4) are "two equations in two unknowns." The two unknowns are $\hat{\beta}_0$ and $\hat{\beta}_1$ (recall that we are trying to solve for $\hat{\beta}_0$ and $\hat{\beta}_1$). The X and Y variables are not unknowns, because we have sample data on X and Y that we can insert into the equations; we also assume that we know $n$, the sample size.

Next, we solve these "two equations in two unknowns" for $\hat{\beta}_0$ and $\hat{\beta}_1$ (for example, we could solve Equation (3) for $\hat{\beta}_0$, substitute the result into Equation (4), solve Equation (4) for $\hat{\beta}_1$, and then substitute the result for $\hat{\beta}_1$ back into Equation (3) to find $\hat{\beta}_0$). When we solve Equation (3) and Equation (4) for $\hat{\beta}_0$ and $\hat{\beta}_1$, we find:

$$\hat{\beta}_0 = \frac{\sum_i Y_i}{n} - \hat{\beta}_1 \frac{\sum_i X_{1i}}{n}$$

$$\hat{\beta}_1 = \frac{\sum_i \left[ Y_i X_{1i} \right] - \left( \sum_i X_{1i} \right) \left( \frac{\sum_i Y_i}{n} \right)}{\sum_i \left[ X_{1i}^2 \right] - \left( \sum_i X_{1i} \right) \left( \frac{\sum_i X_{1i}}{n} \right)}$$
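To see that Equations (3) and (4) really do pin down $\hat{\beta}_0$ and $\hat{\beta}_1$, here is a minimal Python sketch that builds the two equations as a linear system from a small made-up sample and solves it with NumPy. The data values and the names x1 and y are illustrative assumptions, not course data.

import numpy as np

# Made-up sample data for X1 and Y (illustrative only).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y  = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n  = len(x1)

# Equation (3):  b0*n       + b1*sum(X1)   = sum(Y)
# Equation (4):  b0*sum(X1) + b1*sum(X1^2) = sum(Y*X1)
A = np.array([[n,         x1.sum()],
              [x1.sum(),  (x1 ** 2).sum()]])
b = np.array([y.sum(), (y * x1).sum()])

b0_hat, b1_hat = np.linalg.solve(A, b)
print(b0_hat, b1_hat)   # the OLS estimates for this sample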
We can simplify the last two equations above a bit more by noticing that $\frac{\sum_i X_{1i}}{n}$ is simply the average value of $X_1$ in the dataset, $\bar{X}_1$, and similarly, $\frac{\sum_i Y_i}{n}$ is equal to $\bar{Y}$. Making these substitutions:

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1$$

$$\hat{\beta}_1 = \frac{\sum_i \left[ Y_i X_{1i} \right] - \left( \sum_i X_{1i} \right) \bar{Y}}{\sum_i \left[ X_{1i}^2 \right] - \left( \sum_i X_{1i} \right) \bar{X}_1}$$

As a last simplification, if we multiply each $\sum_i X_{1i}$ by $\frac{n}{n}$ and notice again that $\frac{\sum_i X_{1i}}{n}$ is equal to $\bar{X}_1$, we find:

The OLS Regression Estimator Equations for $\hat{\beta}_0$ and $\hat{\beta}_1$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}_1$$

$$\hat{\beta}_1 = \frac{\sum_i \left[ Y_i X_{1i} \right] - n \, \bar{X}_1 \, \bar{Y}}{\sum_i \left[ X_{1i}^2 \right] - n \left( \bar{X}_1 \right)^2}$$
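These final estimator equations are easy to compute directly from sample data. Below is a minimal Python sketch, using the same made-up sample as in the earlier sketch, that applies the two equations and cross-checks the result against NumPy's built-in least-squares fit; again, the data and variable names are illustrative assumptions.

import numpy as np

# Same made-up sample as in the sketch above.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y  = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n  = len(x1)
x1_bar, y_bar = x1.mean(), y.mean()

# The OLS Regression Estimator Equations derived above:
b1_hat = ((y * x1).sum() - n * x1_bar * y_bar) / ((x1 ** 2).sum() - n * x1_bar ** 2)
b0_hat = y_bar - b1_hat * x1_bar

# Cross-check: np.polyfit returns (slope, intercept) for deg=1;
# both approaches should give the same numbers.
slope, intercept = np.polyfit(x1, y, deg=1)
print(b0_hat, b1_hat)
print(intercept, slope)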