Econ 399 Chapter 2b

2.4 Units of Measurement and Functional Form
-Two important econometric issues are:
1) Changing units of measurement
-When does scaling variables have an effect on OLS estimates?
-When does scaling variables have no effect on OLS estimates?
2) Functional forms
-How do natural logs affect linear regressions?
-How do functional forms impact elasticity?

• Consider the following model:
$\widehat{squirrels}_i = 500 + 0.5\,trees_i$ (ie)
-where the number of squirrels and trees are in single units
-if trees=2000 then the predicted number of squirrels is 1500
-if there are no trees, the predicted number of squirrels is 500
-how does this change if squirrels are measured in hundreds (ie: divided by 100)?

-If squirrels are measured in hundreds, then zero trees would still predict 500 squirrels, now recorded as 5, so B0hat=5 (divided by 100)
-If there are 2,000 trees, then B1hat*2000 must equal 10 (15-5, or 1,000 squirrels)
-Therefore B1hat=0.005 (divided by 100)
$\widehat{squirrels}_i = 5 + 0.005\,trees_i$ (ie)

2.4 Acorns are for the squirrels
Therefore, when the dependent variable is multiplied (divided) by a constant…
Multiply (divide) the OLS intercept and slope by the same constant.
IE: change the y, change OLS the same way.

• Consider the following model:
$\widehat{treehuggers}_i = 500 + 0.5\,trees_i$ (ie)
-where the number of treehuggers and trees are in single units
-if trees=800 then the predicted number of treehuggers is 900
-if there are no trees, the predicted number of treehuggers is 500
-how does this change if trees are measured in hundreds (ie: divided by 100)?
-If trees are measured in hundreds, then zero trees would still predict 500 treehuggers, so B0hat=500 (nothing has changed)
-If there are 800 trees, then trees=8 and 8(B1hat) must equal 400 (900-500)
-Therefore B1hat=50 (multiplied by 100)
$\widehat{treehuggers}_i = 500 + 50\,trees_i$ (ie)

2.4 Acorns are for the squirrels
Therefore, when the independent variable is multiplied (divided) by a constant…
Divide (multiply) the OLS slope (not the intercept) by the same constant.
IE: change the x, change the OLS slope the opposite way.

2.4 Units of Measurement and Functional Form
How does R2 (goodness of fit) change when a variable is scaled?
-It doesn't
-R2 measures how much of the variation in y is explained by x
-this fraction doesn't depend on scaling
-the same "best fit line" is drawn through the data points, regardless of scaling; only the units on the axes change

2.4 Functional Form
Thus far, we have focused on LINEAR relationships
-Linear relationships don't capture all of the possible relationships between variables
-Linear relationships assume that the first unit of x has the same impact on y as the last unit of x
-changing impacts can be captured through the use of NATURAL LOGARITHMS

2.4 Log-Lin Model
When x has a constant percentage impact on y (so that the impact in levels grows as y grows), the log-lin model is appropriate:
$\log(y) = \beta_0 + \beta_1 x + u$ (2.42)
-note that log(y) indicates the natural log of y
-if we assume that u doesn't change,
$\%\Delta y \approx (100\beta_1)\Delta x$ (2.43)
-note that when B1>0, each additional unit of x raises y by roughly the same percentage, so the increase in the level of y gets larger as x increases; this equation therefore expresses INCREASING returns

Assume that absence does make the heart grow fonder:
$\log(fondness) = 3 + 0.5\,absence + u$ (ie)
-assuming (for simplicity) that u=0, 2 days' absence gives fondness of e^4 (54.6) while 10 days' absence gives fondness of e^8 (2,981)
-therefore, given another day of absence:
$\%\Delta fondness \approx (100\beta_1)\Delta x = 50\%$ (2.43)

Therefore, using the 50% approximation:
-the 3rd day of absence increases fondness by about 27.3 (50% of 54.6)
-the 11th day of absence increases fondness by about 1,490.5 (50% of 2,981)
-we have INCREASING RETURNS
-Note: (2.43) is an approximation and is rough for a slope as large as 0.5; the exact increases are e^4.5 - e^4 ≈ 35.4 and e^8.5 - e^8 ≈ 1,933.8, an exact change of 100(e^0.5 - 1) ≈ 64.9%
-Note: a log-lin model can also be expressed:
$y = e^{\beta_0 + \beta_1 x + u}$

2.4 Log-Log Model
Recall from Econ 299 that elasticity is calculated as:
elasticity $= \frac{\partial \ln(y)}{\partial \ln(x)}$
-if constant elasticity is theoretically important to a model, a log-log functional form ensures that elasticity is constant and equal to B1:
$\log(y) = \beta_0 + \beta_1 \log(x) + u$

2.4 Scaling and Dependent Logs
• Consider a log-lin model where the y value is multiplied by c:
$\log(y_i) = \beta_0 + \beta_1 x_i + u_i$
$\log(y_i) + \log(c) = \beta_0 + \beta_1 x_i + u_i + \log(c)$
$\log(c y_i) = [\beta_0 + \log(c)] + \beta_1 x_i + u_i$
$\log(c y_i) = \beta_0^{*} + \beta_1 x_i + u_i$, where $\beta_0^{*} = \beta_0 + \log(c)$ is the new intercept
-scaling a dependent variable in log form changes the intercept but does not affect the slope

2.4 Units of Measurement and Functional Form
Different functional forms are summarized as follows:

Model     Function            Interpretation of B1
Lin-Lin   y=f(x)              ∆y = B1∆x
Lin-Log   y=f(log(x))         ∆y = (B1/100)%∆x
Log-Lin   log(y)=f(x)         %∆y = 100B1∆x *
Log-Log   log(y)=f(log(x))    %∆y = B1%∆x
*also called a semi-elasticity

Notes:
1) Even though non-linear variables are included in models (ie: log(y) or y^2), the models are still considered "Linear Regressions" as they are linear in the parameters B0 and B1
2) Non-linear variables make interpreting B0 and B1 more complicated
3) Some estimated models are NOT linear regression models

2.5 Expected Values and Variances of the OLS Estimators
• This section will, using the classical Gauss-Markov Assumptions, find 3 OLS properties:
1) OLS is unbiased
2) Sampling variance of the OLS estimators
3) Estimated error variance
This will be done viewing B0hat and B1hat as estimators of the parameters of the population model:
$y = \beta_0 + \beta_1 x + u$

Gauss-Markov Assumption SLR.1 (Linear in Parameters)
In the population model, the dependent variable, y, is related to the independent variable, x, and the error (or disturbance), u, as
$y = \beta_0 + \beta_1 x + u$ (2.47)
where B0 and B1 are the population intercept and slope parameters, respectively.
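The Section 2.4 rules on units of measurement and logged dependent variables can be checked numerically. The following is a minimal sketch, not part of the original slides: the data are simulated around the squirrel and fondness examples, and the seed, sample size, noise levels and the helper name `ols` are all illustrative assumptions.

```python
import numpy as np

def ols(x, y):
    """Simple-regression OLS: returns (intercept, slope, r_squared)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    slope = np.sum(dx * y) / np.sum(dx ** 2)
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return intercept, slope, r2

# Simulated data (assumption): seed, sample size and noise are arbitrary.
rng = np.random.default_rng(0)
n = 200
trees = rng.uniform(0, 2000, size=n)
squirrels = 500 + 0.5 * trees + rng.normal(0, 50, size=n)

b0, b1, r2 = ols(trees, squirrels)               # single units
b0_y, b1_y, r2_y = ols(trees, squirrels / 100)   # y measured in hundreds
b0_x, b1_x, r2_x = ols(trees / 100, squirrels)   # x measured in hundreds

print("original units  :", b0, b1, r2)
print("y / 100         :", b0_y, b1_y, r2_y)  # intercept and slope both / 100
print("x / 100         :", b0_x, b1_x, r2_x)  # intercept same, slope * 100

# Log-lin model: multiplying y by c shifts only the intercept, by log(c).
absence = rng.uniform(0, 10, size=n)
fondness = np.exp(3 + 0.5 * absence + rng.normal(0, 0.1, size=n))
c = 100.0
a0, a1, _ = ols(absence, np.log(fondness))
c0, c1, _ = ols(absence, np.log(c * fondness))

print("log(y) on x     :", a0, a1)
print("log(c*y) on x   :", c0, c1)  # slope unchanged, intercept + log(c)
```

In every case R2 is the same, because rescaling changes the units of the fitted line but not the share of variation explained. The log-lin fit is still ordinary OLS on a transformed variable, which is the sense in which the model is linear in the parameters.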
Gauss-Markov Assumption SLR.1 (Linear in Parameters)
Notes:
1) In reality, x, y and u are all viewed as random variables
2) Since the model need only be linear in B0 and B1, SLR.1 is far from restrictive
Given an equation, an assumption must now be made concerning data

Gauss-Markov Assumption SLR.2 (Random Sampling)
We have a random sample of size n, $\{(x_i, y_i): i = 1, 2, \ldots, n\}$, following the population model in equation (2.47).
-We will see in later chapters that random sampling can fail, especially in time series data but also in cross-sectional data
-Now that we have a population equation and an assumption about data, (2.47) can be rewritten as:
$y_i = \beta_0 + \beta_1 x_i + u_i, \quad i = 1, 2, \ldots, n$ (2.48)
where ui captures all unobservables for observation i and differs from the residual uihat
-to estimate B0 and B1 we need a 3rd assumption:

Gauss-Markov Assumption SLR.3 (Sample Variation in the Explanatory Variable)
The sample outcomes of x, namely $\{x_i, i = 1, \ldots, n\}$, are not all the same value.
-This assumption ensures that the denominator of B1hat is not zero
-This assumption is violated if (equivalently):
-The sample variance of x is zero
-The sample standard deviation of x is zero
-The minimum value of x is equal to the maximum value
-Although we can now obtain OLS estimates, we need one more assumption to ensure unbiasedness

Gauss-Markov Assumption SLR.4 (Zero Conditional Mean)
The error u has an expected value of zero given any value of the explanatory variable. In other words,
$E(u \mid x) = 0$
Given our assumption about random sampling, we can further conclude:
$E(u_i \mid x_i) = 0 \quad \forall\, i = 1, 2, \ldots, n$
-this is read "for all i=1, 2,…,n"
-given SLR.2 and SLR.4, we can derive the properties of the OLS estimators conditional on the xi values
-given these 2 assumptions, nothing is lost in the derivation by treating the xi as nonrandom

2.5 OLS is Unbiased
In order to prove OLS's unbiasedness, B1hat must first be algebraically manipulated:
$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x}) y_i}{\sum (x_i - \bar{x})^2}$ (2.49)
-By a familiar mathematical property: $\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum (x_i - \bar{x}) y_i$
-Substituting out yi and restating the denominator as SSTx gives us (note that SSTx is not the same as SST):
$\hat{\beta}_1 = \frac{\sum (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{SST_x}$ (2.50)

Using summation properties, the numerator becomes:
$\sum (x_i - \bar{x})\beta_0 + \sum (x_i - \bar{x})\beta_1 x_i + \sum (x_i - \bar{x})u_i$
$= \beta_0 \sum (x_i - \bar{x}) + \beta_1 \sum (x_i - \bar{x}) x_i + \sum (x_i - \bar{x}) u_i$ (2.51)
-Which is simplified using the properties:
$\sum (x_i - \bar{x}) = 0$ and $\sum (x_i - \bar{x}) x_i = \sum (x_i - \bar{x})^2 = SST_x$

Returning to our B1hat estimate, we now have:
$\hat{\beta}_1 = \beta_1 + \frac{\sum (x_i - \bar{x}) u_i}{SST_x} = \beta_1 + \frac{1}{SST_x}\sum d_i u_i$ (2.52)
-Which indicates that the estimate of B1 equals B1 plus a term that is a linear combination of the errors
-Conditional on the values of x, B1hat's randomness is due solely to the errors
-Note: $d_i = x_i - \bar{x}$

Theorem 2.1 (Unbiasedness of OLS)
Using assumptions SLR.1 through SLR.4,
$E(\hat{\beta}_0) = \beta_0$ and $E(\hat{\beta}_1) = \beta_1$ (2.53)
for any values of B0 and B1. In other words, B0hat is unbiased for B0 and B1hat is unbiased for B1.

Theorem 2.1 Proof
Since expected values are conditional on the sample values of x, and SSTx and di are functions only of the xi, they are nonrandom in the conditioning.
Therefore:
$E(\hat{\beta}_1) = \beta_1 + E[(1/SST_x) \sum d_i u_i]$
$= \beta_1 + (1/SST_x) \sum E(d_i u_i)$
$= \beta_1 + (1/SST_x) \sum d_i E(u_i)$
$= \beta_1 + (1/SST_x) \sum d_i \cdot 0 = \beta_1$
-this follows from the fact that the expected value of each ui (conditional on the sample x's) is zero by SLR.2 and SLR.4
-"since unbiasedness holds for any outcome on {x1, x2,…,xn}, unbiasedness also holds without conditioning on {x1, x2,…,xn}"
-unbiasedness of B0hat is now straightforward:

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \beta_0 + \beta_1 \bar{x} + \bar{u} - \hat{\beta}_1 \bar{x} = \beta_0 + (\beta_1 - \hat{\beta}_1)\bar{x} + \bar{u}$
$E(\hat{\beta}_0) = \beta_0 + E[(\beta_1 - \hat{\beta}_1)\bar{x}] + E[\bar{u}] = \beta_0 + E[(\beta_1 - \hat{\beta}_1)\bar{x}] = \beta_0$
Since we already proved that $E(\hat{\beta}_1) = \beta_1$, therefore $E(\hat{\beta}_1 - \beta_1) = 0$

Theorem 2.1 Notes
-Remember that unbiasedness is a feature of the sampling distributions of B0hat and B1hat
-if we have a poor sample, our OLS estimates could be far from the true values
-if any of our 4 initial Gauss-Markov assumptions is not true, OLS's unbiasedness fails

Assumption Failure
-If SLR.1 fails (y and x are not related linearly in the parameters), very advanced estimation methods are needed
-Failure of SLR.2 (random sampling) is discussed in Chapters 9 and 17
-common in time series and possible in cross-sectional data
-If SLR.3 fails (x's are all the same), we cannot obtain OLS estimates
-If SLR.4 fails, the OLS estimators are biased, which can be corrected

SLR.4 Failure
-If x is correlated with u, we have spurious correlation
-the estimated relationship between x and y is influenced by other factors connected with x
-note that some weak connection between x and u is always possible, but it may not be statistically significant

SLR.4 Failure Example
-Saskatchewan instituted a hypothetical drunk driving awareness (DDA) campaign as an alternative to jail time for DUI
-It was found that the relationship between DUI's and enrolment in the program is as follows:
$\widehat{DUI}_i = 23 + 0.78\,DDA_i$
-Even though the program looks to have failed, this is due to a spurious correlation:
-the existence of drunk drivers increases both the number of DUI's and the enrolment in the program
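Theorem 2.1 and the SLR.4 failure case can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions rather than anything from the slides: the parameter values, sample size, number of replications, and the unobserved factor `z` that drives both x and u in the second design are all hypothetical.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope) from a simple OLS regression of y on x."""
    dx = x - x.mean()
    slope = np.sum(dx * y) / np.sum(dx ** 2)
    return y.mean() - slope * x.mean(), slope

rng = np.random.default_rng(1)      # arbitrary seed (assumption)
n, reps = 100, 5000                 # illustrative sizes (assumption)
beta0, beta1 = 2.0, 0.5             # hypothetical population parameters

# Design 1: SLR.1 to SLR.4 hold. x is held fixed across replications,
# which mirrors deriving the properties conditional on the sample x's.
x = rng.normal(10, 2, size=n)
good = np.empty((reps, 2))
for r in range(reps):
    u = rng.normal(0, 1, size=n)            # E(u | x) = 0
    good[r] = ols_fit(x, beta0 + beta1 * x + u)

# Design 2: SLR.4 fails. An unobserved factor z raises both x and u,
# so x and u are correlated and the slope picks up z's effect.
bad_slopes = np.empty(reps)
for r in range(reps):
    z = rng.normal(0, 1, size=n)
    x_bad = 10 + 2 * z + rng.normal(0, 1, size=n)
    u_bad = 1.5 * z + rng.normal(0, 1, size=n)
    _, bad_slopes[r] = ols_fit(x_bad, beta0 + beta1 * x_bad + u_bad)

print("true beta0, beta1          :", beta0, beta1)
print("mean estimates, SLR.4 holds:", good[:, 0].mean(), good[:, 1].mean())
print("mean slope, SLR.4 fails    :", bad_slopes.mean())
```

Any single estimate in the first design can still miss the true value. Unbiasedness only says that the estimates average out to B0 and B1 across repeated samples. In the second design the average slope stays away from B1 no matter how many replications are run, which is the same mechanism as in the DUI example, where the number of drunk drivers drives both variables.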
