Chapter 5 The Multiple Regression Model Walter R. Paczkowski Rutgers University

Chapter 5 The Multiple Regression Model Walter R. Paczkowski Rutgers University Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 1 Chapter Contents  5.1 Introduction  5.2 Estimating the Parameters of the Multiple Regression Model  5.3 Sampling Properties of the Least Squares Estimators  5.4 Interval Estimation  5.5 Hypothesis Testing  5.6 Polynomial Equations  5.7 Interaction Variables  5.8 Measuring Goodness-of-fit Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 2 5.1 Introduction Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 3 5.1 Introduction Let’s set up an economic model in which sales revenue depends on one or more explanatory variables – We initially hypothesize that sales revenue is linearly related to price and advertising expenditure – The economic model is: Eq. 5.1 SALES  1  2 PRICE  3 ADVERT Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 4 5.1 Introduction 5.1.1 The Economic Model In most economic models there are two or more explanatory variables – When we turn an economic model with more than one explanatory variable into its corresponding econometric model, we refer to it as a multiple regression model – Most of the results we developed for the simple regression model can be extended naturally to this general case Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 5 5.1 Introduction 5.1.1 The Economic Model β2 is the change in monthly sales SALES ($1000) when the price index PRICE is increased by one unit ($1), and advertising expenditure ADVERT is held constant SALES 2  PRICE  ADVERT held constant  SALES  PRICE Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 6 5.1 Introduction 5.1.1 The Economic Model Similarly, β3 is the change in monthly sales SALES ($1000) when the advertising expenditure is increased by one unit ($1000), and the price index PRICE is held constant SALES 3  ADVERT  PRICE held constant  SALES  ADVERT Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 7 5.1 Introduction 5.1.2 The Econometric Model The econometric model is: E (SALES )  β1  β2 PRICE  β3 ADVERT – To allow for a difference between observable sales revenue and the expected value of sales revenue, we add a random error term, e = SALES - E(SALES) Eq. 5.2 SALES  E  SALES   e  β1  β2 PRICE  β3 ADVERT  e Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 8 5.1 Introduction FIGURE 5.1 The multiple regression plane 5.1.2 The Econometric Model Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 9 5.1 Introduction Table 5.1 Observations on Monthly Sales, Price, and Advertising in Big Andy’s Burger Barn 5.1.2 The Econometric Model Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 10 5.1 Introduction 5.1.2a The General Model In a general multiple regression model, a dependent variable y is related to a number of explanatory variables x2, x3, …, xK through a linear equation that can be written as: Eq. 5.3 Principles of Econometrics, 4th Edition y  β1  β2 x2  β3 x3  Chapter 5: The Multiple Regression Model  β K xK  e Page 11 5.1 Introduction 5.1.2a The General Model A single parameter, call it βk, measures the effect of a change in the variable xk upon the expected value of y, all other variables held constant βk  Principles of Econometrics, 4th Edition E  y  xk  other xs held constant Chapter 5: The Multiple Regression Model E  y  xk Page 12 5.1 Introduction 5.1.2a The General Model The parameter β1 is the intercept term. – We can think of it as being attached to a variable x1 that is always equal to 1 – That is, x1 = 1 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 13 5.1 Introduction 5.1.2a The General Model The equation for sales revenue can be viewed as a special case of Eq. 5.3 where K = 3, y = SALES, x1 = 1, x2 = PRICE and x3 = ADVERT Eq. 5.4 Principles of Econometrics, 4th Edition y  β1  β 2 x2  β3 x3  e Chapter 5: The Multiple Regression Model Page 14 5.1 Introduction 5.1.2b The Assumptions of the Model We make assumptions similar to those we made before: – E(e) = 0 – var(e) = σ2 • Errors with this property are said to be homoskedastic – cov(ei,ej) = 0 – e ~ N(0, σ2) Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 15 5.1 Introduction 5.1.2b The Assumptions of the Model The statistical properties of y follow from those of e: – E(y) = β1 + β2x2 + β3x3 – var(y) = var(e) = σ2 – cov(yi, yj) = cov(ei, ej) = 0 – y ~ N[(β1 + β2x2 + β3x3), σ2] • This is equivalent to assuming that e ~ N(0, σ2) Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 16 5.1 Introduction 5.1.2b The Assumptions of the Model We make two assumptions about the explanatory variables: 1. The explanatory variables are not random variables • We are assuming that the values of the explanatory variables are known to us prior to our observing the values of the dependent variable Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 17 5.1 Introduction 5.1.2b The Assumptions of the Model We make two assumptions about the explanatory variables (Continued): 2. Any one of the explanatory variables is not an exact linear function of the others • This assumption is equivalent to assuming that no variable is redundant • If this assumption is violated – a condition called exact collinearity - the least squares procedure fails Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 18 5.1 Introduction ASSUMPTIONS of the Multiple Regression Model 5.1.2b The Assumptions of the Model MR1. yi  1  2 xi 2    K xiK  ei , i  1, , N MR2. E ( yi )  1  2 xi 2    K xiK  E (ei )  0 MR3. var( yi )  var(ei )  2 MR4. cov( yi , y j )  cov(ei , e j )  0 MR5. The values of each xtk are not random and are not exact linear functions of the other explanatory variables MR6. yi ~ N (1  2 xi 2  Principles of Econometrics, 4th Edition  K xiK ), 2   ei ~ N (0, 2 ) Chapter 5: The Multiple Regression Model Page 19 5.2 Estimating the Parameters of the Multiple Regression Model Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 20 5.2 Estimating the Parameters of the Multiple Regression Model We will discuss estimation in the context of the model in Eq. 5.4, which we repeat here for convenience, with i denoting the ith observation yi  β1  β2 xi 2  β3 xi 3  ei Eq. 5.4 – This model is simpler than the full model, yet all the results we present carry over to the general case with only minor modifications Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 21 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.1 Least Squares Estimation Procedure Mathematically we minimize the sum of squares function S(β1, β2, β3), which is a function of the unknown parameters, given the data: N S  β1 ,β 2 ,β3     yi  E  yi   Eq. 5.5 2 i 1 N    yi  β1  β 2 xi 2  β3 xi 3  i 1 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 22 2 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.1 Least Squares Estimation Procedure Formulas for b1, b2, and b3, obtained by minimizing Eq. 5.5, are estimation procedures, which are called the least squares estimators of the unknown parameters – In general, since their values are not known until the data are observed and the estimates calculated, the least squares estimators are random variables Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 23 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.2 Least Squares Estimates Using Hamburger Chain Data Estimates along with their standard errors and the equation’s R2 are typically reported in equation format as: Eq. 5.6 SALES  118.91  7.908PRICE  1863 ADVERT ( se) Principles of Econometrics, 4th Edition (6.35) (1.096) R2  0.448 (0.683) Chapter 5: The Multiple Regression Model Page 24 5.2 Estimating the Parameters of the Multiple Regression Model Table 5.2 Least Squares Estimates for Sales Equation for Big Andy’s Burger Barn 5.2.2 Least Squares Estimates Using Hamburger Chain Data Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 25 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.2 Least Squares Estimates Using Hamburger Chain Data Interpretations of the results: 1. The negative coefficient on PRICE suggests that demand is price elastic; we estimate that, with advertising held constant, an increase in price of $1 will lead to a fall in monthly revenue of $7,908 2. The coefficient on advertising is positive; we estimate that with price held constant, an increase in advertising expenditure of $1,000 will lead to an increase in sales revenue of $1,863 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 26 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.2 Least Squares Estimates Using Hamburger Chain Data Interpretations of the results (Continued): 3. The estimated intercept implies that if both price and advertising expenditure were zero the sales revenue would be $118,914 • Clearly, this outcome is not possible; a zero price implies zero sales revenue • In this model, as in many others, it is important to recognize that the model is an approximation to reality in the region for which we have data • Including an intercept improves this approximation even when it is not directly interpretable Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 27 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.2 Least Squares Estimates Using Hamburger Chain Data Using the model to predict sales if price is $5.50 and advertising expenditure is $1,200: SALES  118.91 - 7.908PRICE  1.863 ADVERT  118.914 - 7.9079  5.5  1.8626 1.2  77.656 – The predicted value of sales revenue for PRICE = 5.5 and ADVERT =1.2 is $77,656. Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 28 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.2 Least Squares Estimates Using Hamburger Chain Data A word of caution is in order about interpreting regression results: – The negative sign attached to price implies that reducing the price will increase sales revenue. – If taken literally, why should we not keep reducing the price to zero? – Obviously that would not keep increasing total revenue – This makes the following important point: • Estimated regression models describe the relationship between the economic variables for values similar to those found in the sample data • Extrapolating the results to extreme values is generally not a good idea • Predicting the value of the dependent variable for values of the explanatory variables far from the sample values invites disaster Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 29 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.3 Estimation of the Error Variance σ2 We need to estimate the error variance, σ2 – Recall that:  2  var(ei )  E ei2 – But, the squared errors are unobservable, so we develop an estimator for σ2 based on the squares of the least squares residuals:   eî  yi  yî  yi  b1  b2 xi 2  b3 xi 3  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 30 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.3 Estimation of the Error Variance σ2 An estimator for σ2 that uses the information from êi2 and has good statistical properties is: 2 ˆ  i 1 ei N 2 ˆ   Eq. 5.7 N K where K is the number of β parameters being estimated in the multiple regression model. Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 31 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.3 Estimation of the Error Variance σ2 For the hamburger chain example: 2 ˆ  i1 ei 75 1718.943 ˆ    23.874 N K 75  3 2 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 32 5.2 Estimating the Parameters of the Multiple Regression Model 5.2.3 Estimation of the Error Variance σ2 Note that: N SSE   eî2  1718.943 i 1 – Also, note that ˆ  23.874  4.8861 – Both quantities typically appear in the output from your computer software • Different software refer to it in different ways. Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 33 5.3 Sampling Properties of the Least Squares Estimators Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 34 5.3 Sampling Properties of the Least Squares Estimators THE GAUSS-MARKOV THEOREM For the multiple regression model, if assumptions MR1–MR5 hold, then the least squares estimators are the best linear unbiased estimators (BLUE) of the parameters. Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 35 5.3 Sampling Properties of the Least Squares Estimators If the errors are not normally distributed, then the least squares estimators are approximately normally distributed in large samples – What constitutes ‘‘large’’ is tricky – It depends on a number of factors specific to each application – Frequently, N – K = 50 will be large enough Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 36 5.3 Sampling Properties of the Least Squares Estimators 5.3.1 The Variances and Covariances of the Least Squares Estimators We can show that: var(b2 )  Eq. 5.8 2 N (1  r232 ) ( xi 2  x2 ) 2 i 1 where Eq. 5.9 Principles of Econometrics, 4th Edition r23   ( xi 2  x2 )( xi 3  x3 ) 2 2 ( x  x ) ( x  x )  i 2 2  i3 3 Chapter 5: The Multiple Regression Model Page 37 5.3 Sampling Properties of the Least Squares Estimators 5.3.1 The Variances and Covariances of the Least Squares Estimators We can see that: 1. Larger error variances 2 lead to larger variances of the least squares estimators 2. Larger sample sizes N imply smaller variances of the least squares estimators 3. More variation in an explanatory variable around its mean, leads to a smaller variance of the least squares estimator 4. A larger correlation between x2 and x3 leads to a larger variance of b2 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 38 5.3 Sampling Properties of the Least Squares Estimators 5.3.1 The Variances and Covariances of the Least Squares Estimators We can arrange the variances and covariances in a matrix format:  var  b1  cov  b1 , b2  cov  b1 , b3     cov  b1 , b2 , b3   cov  b1 , b2  var  b2  cov  b2 , b3    cov  b1 , b3  cov  b2 , b3  var  b3   Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 39 5.3 Sampling Properties of the Least Squares Estimators 5.3.1 The Variances and Covariances of the Least Squares Estimators Using the hamburger data: Eq. 5.10  40.343 6.795 0.7484  cov  b1 , b2 , b3    6.795 1.201 0.0197   0.7484 0.0197 0.4668  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 40 5.3 Sampling Properties of the Least Squares Estimators 5.3.1 The Variances and Covariances of the Least Squares Estimators Therefore, we have: var  b1   40.343 cov  b1 , b2   6.795 var  b2   1.201 cov  b1 , b3   0.7484 var  b3   0.4668 cov  b2 , b3   0.0197 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 41 5.3 Sampling Properties of the Least Squares Estimators 5.3.1 The Variances and Covariances of the Least Squares Estimators We are particularly interested in the standard errors: se  b1   var  b1   40.343  6.3516 se  b2   var  b2   1.201  1.0960 se  b3   var  b3   0.4668  0.6832 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 42 5.3 Sampling Properties of the Least Squares Estimators Table 5.3 Covariance Matrix for Coefficient Estimates 5.3.1 The Variances and Covariances of the Least Squares Estimators Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 43 5.3 Sampling Properties of the Least Squares Estimators 5.3.2 The Distribution of the Least Squares Estimators Consider the general for of a multiple regression model: yi  1  2 xi 2  3 xi 3    K xiK  ei – If we add assumption MR6, that the random errors ei have normal probability distributions, then the dependent variable yi is normally distributed: yi ~ N (1  2 xi 2  Principles of Econometrics, 4th Edition  K xiK ), 2   ei ~ N (0, 2 ) Chapter 5: The Multiple Regression Model Page 44 5.3 Sampling Properties of the Least Squares Estimators 5.3.2 The Distribution of the Least Squares Estimators Since the least squares estimators are linear functions of dependent variables, it follows that the least squares estimators are also normally distributed: bk ~ N   k , var  bk  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 45 5.3 Sampling Properties of the Least Squares Estimators 5.3.2 The Distribution of the Least Squares Estimators We can now form the standard normal variable Z: Eq. 5.11 z Principles of Econometrics, 4th Edition bk  k var  bk  ~ N  0,1 , for k  1, 2,, K Chapter 5: The Multiple Regression Model Page 46 5.3 Sampling Properties of the Least Squares Estimators 5.3.2 The Distribution of the Least Squares Estimators Replacing the variance of bk with its estimate: t Eq. 5.12 bk  k var  bk  bk  k  ~ t N K  se(bk ) – Notice that the number of degrees of freedom for t-statistics is N - K Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 47 5.3 Sampling Properties of the Least Squares Estimators 5.3.2 The Distribution of the Least Squares Estimators We can form a linear combination of the coefficients as:   c1β1  c2β 2   cK β K   k 1 ck β k K And then we have Eq. 5.13 Principles of Econometrics, 4th Edition ˆ    ck bk   ck β k t  ~ t N  K  se( ck bk ) se ˆ  Chapter 5: The Multiple Regression Model Page 48 5.3 Sampling Properties of the Least Squares Estimators 5.3.2 The Distribution of the Least Squares Estimators If = 3, the we have: se  c1b1  c2 b2  c3 b K   var  c1b1  c2 b2  c3 b K  where var  c1b1  c2 b 2  c3 b K   c12 var  b1   c22 var  b2   c32 var  b3  Eq. 5.14 2c1c2 cov  b1 , b2   2c1c3 cov  b1 , b3   2c2c3 cov  b2 , b3  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 49 5.3 Sampling Properties of the Least Squares Estimators 5.3.2 The Distribution of the Least Squares Estimators What happens if the errors are not normally distributed? – Then the least squares estimator will not be normally distributed and Eq. 5.11, Eq. 5.12, and Eq. 5.13 will not hold exactly – They will, however, be approximately true in large samples – Thus, having errors that are not normally distributed does not stop us from using Eq. 5.12 and Eq. 5.13, but it does mean we have to be cautious if the sample size is not large – A test for normally distributed errors was given in Chapter 4.3.5 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 50 5.4 Interval Estimation Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 51 5.4 Interval Estimation We now examine how we can do interval estimation and hypothesis testing Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 52 5.4 Interval Estimation 5.4.1 Interval Estimation for a Single Coefficient For the hamburger example, we need: P(tc  t(72)  tc )  .95 Eq. 5.15 Using tc = 1.993, we can rewrite (5.15) as:   b2  2 P  1.993   1.993   .95 se(b2 )   Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 53 5.4 Interval Estimation 5.4.1 Interval Estimation for a Single Coefficient Rearranging, we get: P b2  1.993  se(b2 )  2  b2  1.993  se(b2 )  .95 – Or, just writing the end-points for a 95% interval: Eq. 5.16 b2  1.993  se(b2 ), Principles of Econometrics, 4th Edition b2  1.993  se(b2 ) Chapter 5: The Multiple Regression Model Page 54 5.4 Interval Estimation 5.4.1 Interval Estimation for a Single Coefficient Using our data, we have b2 = -7.908 and se(b2) = 1.096, so that:  7.9079  1.993 1.096,  7.9079  1.993 1.096   10.093, 5.723 – This interval estimate suggests that decreasing price by $1 will lead to an increase in revenue somewhere between $5,723 and $10,093. • In terms of a price change whose magnitude is more realistic, a 10-cent price reduction will lead to a revenue increase between $572 and $1,009 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 55 5.4 Interval Estimation 5.4.1 Interval Estimation for a Single Coefficient Similarly for advertising, we get: 1.8626  1.9935  0.6832,1.8626  1.9935  0.6832   0.501,3.225 – We estimate that an increase in advertising expenditure of $1,000 leads to an increase in sales revenue of between $501 and $3,225 – This interval is a relatively wide one; it implies that extra advertising expenditure could be unprofitable (the revenue increase is less than $1,000) or could lead to a revenue increase more than three times the cost of the advertising Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 56 5.4 Interval Estimation 5.4.1 Interval Estimation for a Single Coefficient We write the general expression for a 100(1-α)% confidence interval as: b k  t(1 /2, N  K )  se(bk ), bk  t(1 /2, N  K )  se(bk )  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 57 5.4 Interval Estimation 5.4.2 Interval Estimation for a Linear Combination of Coefficients Suppose Big Andy wants to increase advertising expenditure by $800 and drop the price by 40 cents. – Then the change in expected sales is:   E  SALES1   E  SALES0   β1  β 2  PRICE0  0.4   β3  ADVERT0  0.8    β1  β 2 PRICE0  β3 ADVERT0   0.4β 2  0.8β3 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 58 5.4 Interval Estimation 5.4.2 Interval Estimation for a Linear Combination of Coefficients A point estimate would be: ˆ  0.4b2  0.8b3  0.4   7.9079   0.8 1.8626  4.6532 A 90% interval would be:  ˆ  t  se  ˆ  , ˆ  t  se  ˆ  c c    0.4b2  0.8b3   tc  se  0.4b2  0.8b3  ,  0.4b2  0.8b3   tc  se  0.4b2  0.8b3   Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 59 5.4 Interval Estimation 5.4.2 Interval Estimation for a Linear Combination of Coefficients The standard error is: se  0.4b 2  0.8b3   var  0.4b 2  0.8b3    0.4  2 var  b2    0.8  var  b3   2  0.4  0.8  cov  b2 , b3  2  0.16 1.2012  0.64  0.4668  0.64   0.0197   0.7096 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 60 5.4 Interval Estimation 5.4.2 Interval Estimation for a Linear Combination of Coefficients The 90% interval is then:  4.6532 1.666  0.7096, 4.6532  1.666  0.7096  3.471,5.835 – We estimate, with 90% confidence, that the expected increase in sales will lie between $3,471 and $5,835 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 61 5.5 Hypothesis Testing Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 62 5.5 Hypothesis Testing COMPONENTS OF HYPOTHESIS TESTS 1. A null hypothesis H0 2. An alternative hypothesis H1 3. A test statistic 4. A rejection region 5. A conclusion Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 63 5.5 Hypothesis Testing 5.5.1 Testing the Significance of a Single Coefficient We need to ask whether the data provide any evidence to suggest that y is related to each of the explanatory variables – If a given explanatory variable, say xk, has no bearing on y, then βk = 0 – Testing this null hypothesis is sometimes called a test of significance for the explanatory variable xk Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 64 5.5 Hypothesis Testing 5.5.1 Testing the Significance of a Single Coefficient Null hypothesis: H 0 : k  0 Alternative hypothesis: H1 :  k  0 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 65 5.5 Hypothesis Testing 5.5.1 Testing the Significance of a Single Coefficient Test statistic: bk t ~ t( N  K ) se  bk  t values for a test with level of significance α: tc  t(1 /2, N  K ) and  tc  t(  /2, N  K ) Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 66 5.5 Hypothesis Testing 5.5.1 Testing the Significance of a Single Coefficient For our hamburger example, we can conduct a test that sales revenue is related to price: 1. The null and alternative hypotheses are: H 0 : 2  0 and H1 : 2  0 2. The test statistic, if the null hypothesis is true, is: t  b2 se  b2  ~ t( N  K ) 3. Using a 5% significance level (α=.05), and 72 degrees of freedom, the critical values that lead to a probability of 0.025 in each tail of the distribution are: t.975,72  1.993 and t.025,72  1.993 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 67 5.5 Hypothesis Testing 5.5.1 Testing the Significance of a Single Coefficient For our hamburger example (Continued) : 4. The computed value of the t-statistic is: 7.908 t  7.215 1.096 and the p-value from software is:     P t 72  7.215  P t 72  7.215  2  (2.2 1010 )  0.000 5. Since -7:215 < -1.993, we reject H0: β2 = 0 and conclude that there is evidence from the data to suggest sales revenue depends on price • Using the p-value to perform the test, we reject H0 because 0.000 < 0.05 . Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 68 5.5 Hypothesis Testing 5.5.1 Testing the Significance of a Single Coefficient Similarly, we can conduct a test that sales revenue is related to advertising expenditure: 1. The null and alternative hypotheses are: H 0 : 3  0 and H1 : 3  0 2. The test statistic, if the null hypothesis is true, is: t  b3 se  b3  ~ t( N  K ) 3. Using a 5% significance level (α=.05), and 72 degrees of freedom, the critical values that lead to a probability of 0.025 in each tail of the distribution are: t.975,72  1.993 and t.025,72  1.993 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 69 5.5 Hypothesis Testing 5.5.1 Testing the Significance of a Single Coefficient For our hamburger example (Continued) : 4. The computed value of the t-statistic is: 1.8626 t  2.726 0.6832 and the p-value from software is:     P t 72  2.726  P t 72  2.726  2  0.004  0.008 5. Since 2.726 > 1.993, we reject H0: β3 = 0: the data support the conjecture that revenue is related to advertising expenditure • Using the p-value to perform the test, we reject H0 because 0.008 < 0.05 . Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 70 5.5 Hypothesis Testing 5.5.2a Testing for Elastic Demand We now are in a position to state the following questions as testable hypotheses and ask whether the hypotheses are compatible with the data 1. Is demand price-elastic or price-inelastic? 2. Would additional sales revenue from additional advertising expenditure cover the costs of the advertising? Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 71 5.5 Hypothesis Testing 5.5.2a Testing for Elastic Demand For the demand elasticity, we wish to know if: – β2 ≥ 0: a decrease in price leads to a decrease in sales revenue (demand is price-inelastic or has an elasticity of unity), or – β2 < 0: a decrease in price leads to a decrease in sales revenue (demand is price-inelastic) Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 72 5.5 Hypothesis Testing 5.5.2a Testing for Elastic Demand As before: 1. The null and alternative hypotheses are: H 0 : 2  0 (demand is unit-elastic or inelastic) H1 : 2 < 0 (demand is elastic) 2. The test statistic, if the null hypothesis is true, is: t  b2 se(b2 ) ~ t N  K  3. At a 5% significance level, we reject H0 if t ≤ -1.666 or if the p-value ≤ 0.05 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 73 5.5 Hypothesis Testing 5.5.2a Testing for Elastic Demand Hypothesis test (Continued) : 4. The test statistic is: t b2 7.908   7.215 se  b2  1.096 and the p-value is:   P t 72  7.215  0.000 5. Since -7.215 < 1.666, we reject H0: β2 ≥ 0 and conclude that H0: β2 < 0 (demand is elastic) Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 74 5.5 Hypothesis Testing 5.5.2b Testing Advertising Effectiveness The other hypothesis of interest is whether an increase in advertising expenditure will bring an increase in sales revenue that is sufficient to cover the increased cost of advertising – Such an increase will be achieved if β3 > 1 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 75 5.5 Hypothesis Testing 5.5.2b Testing Advertising Effectiveness As before: 1. The null and alternative hypotheses are: H 0 : 3  1 H 1 : 3 > 1 2. The test statistic, if the null hypothesis is true, is: b3  1 t se(b3 ) ~ t N  K  3. At a 5% significance level, we reject H0 if t ≥ 1.666 or if the p-value ≤ 0.05 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 76 5.5 Hypothesis Testing 5.5.2b Testing Advertising Effectiveness Hypothesis test (Continued) : 4. The test statistic is: b3  β3 1.8626  1 t   1.263 se  b2  0.6832 and the p-value is:   P t 72  1.263  0.105 5. Since 1.263 < 1.666, we do not reject H0 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 77 5.5 Hypothesis Testing 5.5.3 Hypothesis Testing for a Linear Combination of Coefficients The marketing adviser claims that dropping the price by 20 cents will be more effective for increasing sales revenue than increasing advertising expenditure by $500 – In other words, she claims that -0.2β2 > 0.5β3, or -0.2β2 - 0.5β3> 0 – We want to test a hypothesis about the linear combination -0.2β2 – 0.5β3 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 78 5.5 Hypothesis Testing 5.5.3 Hypothesis Testing for a Linear Combination of Coefficients As before: 1. The null and alternative hypotheses are: H 0 : 0.2β 2  0.5β3  0 (marketer's claim is not correct) H1 : 0.2β 2  0.5β3  0 (marketer's claim is correct) 2. The test statistic, if the null hypothesis is true, is: 0.2b2  0.5b3 t ~ t 72 se(0.2b2  0.5b3 ) 3. At a 5% significance level, we reject H0 if t ≥ 1.666 or if the p-value ≤ 0.05 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 79 5.5 Hypothesis Testing 5.5.3 Hypothesis Testing for a Linear Combination of Coefficients We need the standard error: se( 0.2b2  0.5b3 )  var  se( 0.2b2  0.5b3 )    0.2  2 var  b2    0.5  var  b3   2   0.2    0.5  cov  b2, b3  2  0.04  1.2012  0.25  0.4668  0.2   0.0197   0.4010 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 80 5.5 Hypothesis Testing 5.5.3 Hypothesis Testing for a Linear Combination of Coefficients Hypothesis test (Continued) : 4. The test statistic is: 0.2b2  0.5b3 1.58158  0.9319 t   1.622 se  0.2b2  0.5b3  0.4010 and the p-value is:   P t 72  1.622  0.055 5. Since 1.622 < 1.666, we do not reject H0 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 81 5.6 Polynomial Equations Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 82 5.6 Polynomial Equations We have studied the multiple regression model y  β1  β2 x2  β3 x3  Eq. 5.17  β K xK  e – Sometimes we are interested in polynomial equations such as the quadratic y = β1 + β2x + β3x2 + e or the cubic y = α1 + α2x + α3x2 + α4x3 + e. Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 83 5.6 Polynomial Equations 5.6.1 Hypothesis Testing for a Linear Combination of Coefficients Consider the average cost equation AC  β1  β 2Q  β3Q  e 2 Eq. 5.18 – And the total cost function Eq. 5.19 Principles of Econometrics, 4th Edition TC  1   2Q   3Q   4Q  e 2 Chapter 5: The Multiple Regression Model 3 Page 84 5.6 Polynomial Equations FIGURE 5.2 (a) Total cost curve and (b) total product curve 5.6.1 Hypothesis Testing for a Linear Combination of Coefficients Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 85 5.6 Polynomial Equations FIGURE 5.3 Average and marginal (a) cost curves and (b) product curves 5.6.1 Hypothesis Testing for a Linear Combination of Coefficients Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 86 5.6 Polynomial Equations 5.6.1 Hypothesis Testing for a Linear Combination of Coefficients Eq. 5.20 For the general polynomial function: y  a0  a1 x  a2 x2  a3 x3  – The slope is dy 2  a1  2a2 x  3a3 x  dx  ap x p  pa p x p 1 – Evaluated at a particular value, x = x0 , the slope is: Eq. 5.21 dy dx Principles of Econometrics, 4th Edition  a1  2a2 x0  3a3 x02   pa p x0p 1 x  x0 Chapter 5: The Multiple Regression Model Page 87 5.6 Polynomial Equations 5.6.1 Hypothesis Testing for a Linear Combination of Coefficients The slope of the average cost curve Eq. 5.18 is: dE TC  dQ   2  23Q  3 4Q 2 – For a U-shaped marginal cost curve, we expect the parameter signs to be α2 > 0, α3 < 0, and α4 > 0 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 88 5.6 Polynomial Equations 5.6.1 Hypothesis Testing for a Linear Combination of Coefficients The slope of the total cost curve Eq. 5.19 ,which is the marginal cost, is: dE  AC  dQ  β2  2β3Q – For this U-shaped curve, we expect β2 < 0 and β3 > 0 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 89 5.6 Polynomial Equations 5.6.1 Hypothesis Testing for a Linear Combination of Coefficients It is sometimes true that having a variable and its square or cube in the same model causes collinearity problems – This will be discussed in Chapter 6 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 90 5.6 Polynomial Equations 5.6.2 Extending the Model for Burger Barn Sales The linear sales model with the constant slope β3 for advertising SALES  β1  β2 PRICE  β3 ADVERT  e does not capture diminishing returns in advertising expenditure Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 91 5.6 Polynomial Equations 5.6.2 Extending the Model for Burger Barn Sales A new, better model might be: Eq. 5.22 SALES  β1  β 2 PRICE  β3 ADVERT  β 4 ADVERT 2  e The change in expected sales to a change in advertising is: E  SALES  Eq. 5.23 ADVERT   PRICE held constant  E  SALES  ADVERT = β3  2β 4 ADVERT Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 92 5.6 Polynomial Equations FIGURE 5.4 A model where sales exhibits diminishing returns to advertising expenditure 5.6.2 Extending the Model for Burger Barn Sales Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 93 5.6 Polynomial Equations 5.6.2 Extending the Model for Burger Barn Sales We refer to E  SALES  ADVERT as the marginal effect of advertising on sales Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 94 5.6 Polynomial Equations 5.6.2 Extending the Model for Burger Barn Sales The least squares estimates are: Eq. 5.24 SALES  109.72  7.640 PRICE  12.151ADVERT  2.768 ADVERT 2  se  Principles of Econometrics, 4th Edition  6.80  1.046  3.556  Chapter 5: The Multiple Regression Model  0.941 Page 95 5.6 Polynomial Equations 5.6.2 Extending the Model for Burger Barn Sales The estimated response of sales to advertising is:  SALES  12.151  5.536 ADVERT ADVERT Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 96 5.6 Polynomial Equations 5.6.2 Extending the Model for Burger Barn Sales Substituting, we find that when advertising is at its minimum value in the sample of $500 (ADVERT = 0.5), the marginal effect of advertising on sales is 9.383 – When advertising is at a level of $2,000 (ADVERT = 2), the marginal effect is 1.079 – Allowing for diminishing returns to advertising expenditure has improved our model both statistically and in terms of meeting expectations about how sales will respond to changes in advertising Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 97 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising The marginal benefit from advertising is the marginal revenue from more advertising – The required marginal revenue is given by the marginal effect of more advertising β3 + 2β4ADVERT – The marginal cost of $1 of advertising is $1 plus the cost of preparing the additional products sold due to effective advertising – Ignoring the latter costs, the marginal cost of $1 of advertising expenditure is $1 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 98 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising Advertising should be increased to the point where β3  2β 4 ADVERT0  1 with ADVERT0 denoting the optimal level of advertising Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 99 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising Using the least squares estimates, a point estimate of ADVERT0 is: 1  b3 1  12.1512 ADVERT0    2.014 2b4 2   2.76796  implying that the optimal monthly advertising expenditure is $2,014 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 100 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising Variances of nonlinear functions are hard to derive – Recall that the variance of a linear function, say, c3b3 + c4b4, is: Eq. 5.25 var  c3b3  c4b4   c32 var b3   c42 var b4   2c3c4 cov b3 , b4  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 101 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising Suppose λ = (1 - β3)/2β4 and 1  b3 ˆ  2b4 – Then, the approximate variance expression is: Eq. 5.26 2 2            var ˆ   var b  var b  2          3 4  β  β  β  4  3  3   β 4     cov  b3 , b4   – Using Eq. 5.26 to find an approximate expression for a variance is called the delta method Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 102 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising The required derivatives are:  1  , β3 2β 4 Principles of Econometrics, 4th Edition 1  β3   β 4 2β 42 Chapter 5: The Multiple Regression Model Page 103 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising Thus, for the estimated variance of the optimal level of advertising, we have: 2 2  1   1  b3   1   1  b3 var ˆ    var b   var b  2          3 4 2  2 b 2 b 2 b 2b42 4  4  4         cov  b3 , b4   1    1  12.151    0.88477  12.646   2  2  2.768 2  2.768     1    1  12.151  2   3.2887  2  2  2.768 2  2.768     0.016567 2 and 2   se ˆ  Principles of Econometrics, 4th Edition 0.016567  0.1287 Chapter 5: The Multiple Regression Model Page 104 5.6 Polynomial Equations 5.6.3 The Optimal Level of Advertising An approximate 95% interval estimate for ADVERT0 is:      ˆ  t 0.975,71se ˆ , ˆ  t 0.975,71se ˆ   2.014 -1.994  0.1287, 2.014  1.994  0.1287   1.757, 2.271 – We estimate with 95% confidence that the optimal level of advertising lies between $1,757 and $2,271 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 105 5.7 Interaction Variables Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 106 5.7 Interaction Variables Suppose that we wish to study the effect of income and age on an individual’s expenditure on pizza – An initial model would be: Eq. 5.27 PIZZA  β1  β 2 AGE  β3 INCOME  e Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 107 5.7 Interaction Variables Implications of this model are: 1. E  PIZZA AGE  β2 : For a given level of income, the expected expenditure on pizza changes by the amount β2 with an additional year of age 2. E  PIZZA INCOME  β3 : For individuals of a given age, an increase in income of $1,000 increases expected expenditures on pizza by β3 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 108 5.7 Interaction Variables Principles of Econometrics, 4th Edition Table 5.4 Pizza Expenditure Data Chapter 5: The Multiple Regression Model Page 109 5.7 Interaction Variables The estimated model is: PIZZA  342.88  7.576 AGE  1.832 INCOME (t)  -3.27   3.95 – The signs of the estimated parameters are as we anticipated • Both AGE and INCOME have significant coefficients, based on their t-statistics Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 110 5.7 Interaction Variables It is not reasonable to expect that, regardless of the age of the individual, an increase in income by $1,000 should lead to an increase in pizza expenditure by $1.83? – It would seem more reasonable to assume that as a person grows older, his or her marginal propensity to spend on pizza declines • That is, as a person ages, less of each extra dollar is expected to be spent on pizza – This is a case in which the effect of income depends on the age of the individual. • That is, the effect of one variable is modified by another – One way of accounting for such interactions is to include an interaction variable that is the product of the two variables involved Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 111 5.7 Interaction Variables We will add the interaction variable (AGE x INCOME) to the regression model – The new model is: Eq. 5.28 PIZZA  β1  β2 AGE  β3 INCOME  β4  AGE  INCOME   e Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 112 5.7 Interaction Variables Implications of this revised model are: 1. E  PIZZA AGE  β2  β4 INCOME 2. E  PIZZA INCOME  β3  β4 AGE Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 113 5.7 Interaction Variables The estimated model is: PIZZA  161.47  2.977 AGE  6.980 INCOME  0.1232  AGE  INCOME  (t) Principles of Econometrics, 4th Edition  -0.89   2.47  Chapter 5: The Multiple Regression Model  1.85  Page 114 5.7 Interaction Variables The estimated marginal effect of age upon pizza expenditure for two individuals—one with $25,000 income and one with $90,000 income is: E  PIZZA  AGE  b2  b4 INCOME  2.977  0.1232 INCOME -6.06  -14.07 for INCOME = 25 for INCOME = 90 – We expect that an individual with $25,000 income will reduce pizza expenditures by $6.06 per year, whereas the individual with $90,000 income will reduce pizza expenditures by $14.07 per year Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 115 5.7 Interaction Variables 5.7.1 Log-Linear Models Eq. 5.29 Consider a wage equation where ln(WAGE) depends on years of education (EDUC) and years of experience (EXPER): ln WAGE   β1  β2 EDUC  β3 EXPER  e – If we believe the effect of an extra year of experience on wages will depend on the level of education, then we can add an interaction variable Eq. 5.30 ln WAGE   β1  β2 EDUC  β3 EXPER  β4  EDUC  EXPER   e Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 116 5.7 Interaction Variables 5.7.1 Log-Linear Models The effect of another year of experience, holding education constant, is roughly:  ln WAGE  EXPER  β3  β 4 EDUC EDUC fixed – The approximate percentage change in wage given a one-year increase in experience is 100(β3+β4EDUC)% Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 117 5.7 Interaction Variables 5.7.1 Log-Linear Models An estimated model is: ln WAGE   1.392  0.09494 EDUC  0.00633EXPER 0.0000364  EDUC  EXPER  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 118 5.7 Interaction Variables 5.7.1 Log-Linear Models If there is an interaction and a quadratic term on the right-hand side, as in ln WAGE   β1  β 2 EDUC  β3 EXPER  β 4  EDUC  EXPER  β5 EXPER 2  e then we find that a one-year increase in experience leads to an approximate percentage wage change of: %WAGE  100 β3 +β4 EDUC  2β5 EXPER  % Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 119 5.8 Measuring Goodness-of-fit Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 120 5.8 Measuring Goodness-of-fit In the multiple regression model the R2 is relevant and the same formulas are valid, but now we talk of the proportion of variation in the dependent variable explained by all the explanatory variables included in the linear model Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 121 5.8 Measuring Goodness-of-fit The coefficient of determination is: SSR R   SST 2 Eq. 5.31   yˆ  y i 1 N i  y 2 i 1 i  y 2 N SSE  1 SST 2 ˆ e  i 1 i N  1  y N i 1 Principles of Econometrics, 4th Edition i  y Chapter 5: The Multiple Regression Model 2 Page 122 5.8 Measuring Goodness-of-fit The predicted value of y is: yî  b1  b2 xi 2  b3 xi 3   bK xiK – Recall that: sy  1 N 2  yi  y    N  1 i 1 SST N 1 – Then: SST   N  1 s y2 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 123 5.8 Measuring Goodness-of-fit For the hamburger example:. 2 ˆ e i 1 i N 1718.943 R  1 1  0.448 N 2 i 1  yi  y  3115.482 2 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 124 5.8 Measuring Goodness-of-fit Interpretation – 44.8% of the variation in sales revenue is explained by the variation in price and by the variation in the level of advertising expenditure – In our sample, 55.2% of the variation in revenue is left unexplained and is due to variation in the error term or to variation in other variables that implicitly form part of the error term – Adding the square of advertising to the Burger Barn model (see Eq. 5.24) increased the R2 to 0.508 • Thus an additional 6% of the variation in sales is explained by including this variable Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 125 5.8 Measuring Goodness-of-fit If the model does not contain an intercept parameter, then the measure R2 given in Eq. 5.31 is no longer appropriate – The reason it is no longer appropriate is that without an intercept term in the model, i1  yi  y  N 2  i 1  yî  y   i 1 eî 2 N 2 N so SST ≠ SSR + SSE Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 126 5.8 Measuring Goodness-of-fit Under these circumstances it does not make sense to talk of the proportion of total variation that is explained by the regression – When your model does not contain a constant, it is better not to report R2 • Even if your computer displays one Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 127 Key Words Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 128 Keywords BLU estimator covariance matrix of least squares estimator critical value delta method error variance estimate error variance estimator goodness-of-fit interaction variable interval estimate Principles of Econometrics, 4th Edition least squares estimates least squares estimation least squares estimators linear combinations marginal effects multiple regression model Nonlinear functions one-tailed test p-value Chapter 5: The Multiple Regression Model Polynomial regression coefficients standard errors sum of squared errors sum of squares of regression testing significance total sum of squares two-tailed test Page 129 Appendices Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 130  5A Derivation of Least Squares Estimators  5B Large Sample Analysis Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 131 5A Derivation of the Least Squares Estimators Following a procedure similar to the single variable model, we have to minimize: N S (1 , 2 , 3 )   ( yi  1  2 xi 2  3 xi 3 ) 2 – To get: i 1 S  2 N 1  22  xi 2  23  xi 3  2 yi 1 S  21  xi 2  22  xi22  23  xi 2 xi 3  2 xi 2 yi 2 S  21  xi 3  22  xi 2 xi 3  23  xi23  2 xi 3 yi 3 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 132 5A Derivation of the Least Squares Estimators We now can get three simultaneous normal equations: Nb1   xi 2b2   xi 3b3   yi Eq. 5A.1 2 x b  x  i 2 1  i 2b2   xi 2 xi 3b3   xi 2 yi 2 x b  x x b  x  i 3 1  i 2 i 3 2  i 3b3   xi 3 yi Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 133 5A Derivation of the Least Squares Estimators To help solve them, let: yi  yi  y , xi2  xi 2  x2 , xi3  xi 3  x3 – Then the least squares estimators are: b1  y  b2 x2  b3 x3 b2  b3  Principles of Econometrics, 4th Edition   2     y x x  y x x   i i 2   i 3    i i 3   i 2 xi 3    x   x     x 2 i2    2 i 2 i3 2 i3 x   2     y x x  y x x   i i 3   i 2    i i 2   i 3 xi 2    x   x     x 2 i2 2 i3 Chapter 5: The Multiple Regression Model    2 i 2 i3 x Page 134 5B Large Sample Analysis In the multiple regression model, if assumptions MR1– MR5 hold (or SR1–SR5 in the simple regression model) we are able to show that the least squares estimators are Best, Linear, Unbiased Estimators (BLUE) – These properties are called ‘‘finite sample’’ properties because they do not depend on the sample size N, and will hold if the sample is any size N > K There are additional properties of the least squares estimator that can be established if samples are imagined becoming infinitely large – In econometrics and statistics these are called asymptotic properties, with the term asymptotic implying the analysis of limiting behavior, here as N→∞ Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 135 5B Large Sample Analysis 5B.1 Consistency The probability of obtaining an estimate ‘‘close’’ to β2 is: P β2    b2  β2    Eq. 5B.1 An estimator is said to be consistent if this probability converges to 1 as the sample size N→∞ Eq. 5B.2 Principles of Econometrics, 4th Edition lim P β2    b2  β2     1 N  Chapter 5: The Multiple Regression Model Page 136 5B Large Sample Analysis FIGURE 5B.1 An illustration of consistency 5B.1 Consistency Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 137 5B Large Sample Analysis 5B.1 Consistency The property of consistency applies to many estimators, even ones that are biased in finite samples – Example: the estimator β̂2  b2  1 N is a biased estimator • The bias is     1 ˆ ˆ bias β 2  E β 2  β 2  N Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 138 5B Large Sample Analysis 5B.1 Consistency But the bias converges to zero as N → ∞:     lim bias βˆ 2  lim  E βˆ 2  β 2   0 N  N    Eq. 5B.3 – The estimator is said to be asymptotically unbiased Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 139 5B Large Sample Analysis 5B.1 Consistency Consistency for an estimator can be established by showing that the estimator is either unbiased or asymptotically unbiased, and that its variance converges to zero as N → ∞: Eq. 5B.4 Principles of Econometrics, 4th Edition   lim var βˆ 2  0 N  Chapter 5: The Multiple Regression Model Page 140 5B Large Sample Analysis 5B.1 Consistency Because the probability density function of a consistent estimator collapses around the true parameter, and the probability that an estimator b2 will be close to the true parameter β2 approaches one, the estimator b2 is said to ‘‘converge in probability’’ to β2 – Notation plim  b2   β 2 N  p b2  β2 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 141 5B Large Sample Analysis 5B.2 Asymptotic Normality As N → ∞ the probability density function of the standardized estimator has a distribution that approaches the standard normal: bk  β k var  bk  a ~ N  0,1 – The estimator is asymptotically normal and generally write: bk ~ N  β k , var  bk   a Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 142 5B Large Sample Analysis 5B.3 Monte Carlo Simulation Recall that the data generation process for the simple linear regression model is given by: yi  E  yi | xi   ei  β1  β2 xi  ei , i  1, Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 143 ,N 5B Large Sample Analysis 5B.3 Monte Carlo Simulation The Monte Carlo parameter values are β1 = 100 and β2 = 10 – The value of xi is 10 for the first N = 2 observations and 20 for the remaining N = 2 observations, so that the regression functions are: E  yi | xi  10   100  10 xi  100  10 10  200, i  1, ,N 2 E  yi | xi  20   100  20 xi  100  10  20  300, i   N 2   1, Principles of Econometrics, 4th Edition ,N Chapter 5: The Multiple Regression Model Page 144 5B Large Sample Analysis Table 5B.1 The least squares estimators, tests, and interval estimators 5B.3 Monte Carlo Simulation Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 145 5B Large Sample Analysis FIGURE 5B.2 Histogram of the estimates b2 for N = 40 5B.3 Monte Carlo Simulation Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 146 5B Large Sample Analysis 5B.4 The Delta Method Working with nonlinear functions of the estimated parameters requires additional tools, because even if the regression errors are normal, nonlinear functions of them are not normally distributed in finite samples, and usual variance formulas do not apply Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 147 5B Large Sample Analysis 5B.4.1 Nonlinear Functions of a Single Parameter The Taylor series approximation is: f  x  f a  df  x  dx  x  a   f  a   f   a  x  a  xa – We apply this to g1(b2) replacing x with b2 and a with β2 Eq. 5B.5 Principles of Econometrics, 4th Edition g1  b2   g1 β2   g  β2 b2  β2  Chapter 5: The Multiple Regression Model Page 148 5B Large Sample Analysis 5B.4.1 Nonlinear Functions of a Single Parameter This Taylor series expansion of shows the following: 1. If E(b2) = β2 then E[g1(b2)]  β2. 2. If b2 is a biased but consistent estimator, so that b2 β 2 , then g1  b2  g1 β2  3. The variance of g1(b2) is given by p  p  var  g1  b2    g1 β 2  var  b2  2 which is known as the delta method. Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 149 5B Large Sample Analysis 5B.4.1 Nonlinear Functions of a Single Parameter This Taylor series expansion of shows (Continued) – The delta method follows from working with the Taylor series approximation: var  g1  b2    var  g1  β 2   g1  β 2  b2  β 2    var  g1  β 2  b2  β 2   because g1  β 2  is not random =  g1  β 2   var  b2  β 2  because g1 β 2  is not random =  g1  β 2   var  b2  because β 2 is not random 2 2 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 150 5B Large Sample Analysis 5B.4.1 Nonlinear Functions of a Single Parameter This Taylor series expansion of shows (Continued) 4. The estimator g1(b2) has an approximate normal distribution in large samples: a Eq. 5B.6 Principles of Econometrics, 4th Edition 2  g1  b2  ~ N  g1  β 2  ,  g1 β 2   var  b2    Chapter 5: The Multiple Regression Model Page 151 5B Large Sample Analysis 5B.4.2 The Delta Method Illustrated A fitted regression is yˆ  87.44311  10.68456 x  se   33.8764  2.1425 – Consider the nonlinear function: g1 β2   exp β2 10 – The estimated value of the nonlinear function is: g1  b2   exp  b2 10  exp 10.68456 10   2.91088 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 152 5B Large Sample Analysis Table 5B.2 Estimated covariance matrix 5B.4.2 The Delta Method Illustrated Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 153 5B Large Sample Analysis Table 5B.3 Estimates and tests of g1(β2) = exp(β2) =10 5B.4.2 The Delta Method Illustrated Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 154 5B Large Sample Analysis 5B.4.2 The Delta Method Illustrated The estimated variance is: var  g1  b2    g1  b2  var b2   exp  b2 10   1 10  var b2  2 2  exp 10.68456 10   1 10    4.59045 2  0.38896 – se(g1(b2)) = 0.62367 – The 95% confidence interval is: g1  b2   t 0.975,202 se  g1  b2    2.91088  2.10092  0.62367  1.60061, 4.22116 Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 155 5B Large Sample Analysis 5B.4.3 Monte Carlo Simulation of the Delta Method The test statistic is: t g1  b2   2.71828 se  g1  b2  t N 2 – The histogram of the estimates for sample size N = 40 in Figure 5B.3 shows only the very slightest deviation from normality, which is why the t-test performs so well. Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 156 5B Large Sample Analysis FIGURE 5B.3 Histogram of g1(b2) = exp(b2) =10 5B.4.3 Monte Carlo Simulation of the Delta Method Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 157 5B Large Sample Analysis 5B.5 The Delta Method Extended When working with functions of two (or more) parameters, the approach is much the same, but the Taylor series approximation changes to a more general form: Eq. 5B.7 g 2  b1 , b2   g 2  β1 ,β 2   Principles of Econometrics, 4th Edition g 2  β1 ,β 2  β1  b1  β1   Chapter 5: The Multiple Regression Model g 2 β1 ,β 2  β 2 Page 158  b2  β2  5B Large Sample Analysis 5B.5 The Delta Method Extended This Taylor series expansion of shows the following: 1. If E(b1) = β1 and E(b2) = β2 then E[g2(b1,b2)]  g2(β1,β2). 2. If b1 and b2 are consistent estimators, so that β 2 , then g2  b1 , b2  g2 β1 ,β2  b1 β1 and b2 p  Principles of Econometrics, 4th Edition p  Chapter 5: The Multiple Regression Model p  Page 159 5B Large Sample Analysis 5B.5 The Delta Method Extended This Taylor series expansion of shows (Continued) : 3. The variance of g1(b2) is given by:  g 2  β1 ,β 2    g 2  β1 ,β 2   var  g 2  b1 , b2      var  b1     var  b2  β1 β 2     2 Eq. 5B.8 Principles of Econometrics, 4th Edition 2  g 2  β1 ,β 2    g 2  β1 ,β 2   2    cov  b1 , b2  β1 β 2    Chapter 5: The Multiple Regression Model Page 160 5B Large Sample Analysis 5B.5 The Delta Method Extended This Taylor series expansion of shows (Continued) 4. The estimator g2(b1,b2) has an approximate normal distribution in large samples: a Eq. 5B.9  g 2  b1 , b2  ~ N g 2  β1 ,β 2  , var  g 2  b1 , b2   Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model  Page 161 5B Large Sample Analysis Table 5B.4 Estimates g2(b1, b2) = b1/b2 5B.5.1 The Delta Method Illustrated: Continued Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 162 5B Large Sample Analysis 5B.5.1 The Delta Method Illustrated: Continued The nonlinear function of two parameters that we consider is g2(b1, b2) = b1/b2 – The derivatives are: g 2  β1 ,β 2  β1 1  β2 β 2 β1  2 β2 g 2  β1 ,β 2  Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 163 5B Large Sample Analysis 5B.5.1 The Delta Method Illustrated: Continued The estimate g2(b1, b2) = b1/b2 = 87.44311 /10.68456 = 8.18406 and its estimated variance is: 2 2 1  b   1  b  var  g 2  b1 , b2     var  b1     12  var  b2   2     12  cov  b1 , b2   b2   b2   b2   b2   22.61857 – The resulting 95% interval estimate for b1/b2 is [1.807712, 18.17583] Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 164 5B Large Sample Analysis FIGURE 5B.4a Histogram of g2(b1, b2) = b1/b2, N = 40 5B.5.2 Monte Carlo Simulation of the Extended Delta Method Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 165 5B Large Sample Analysis FIGURE 5B.4b Histogram of g2(b1, b2) = b1/b2, N = 200 5B.5.2 Monte Carlo Simulation of the Extended Delta Method Principles of Econometrics, 4th Edition Chapter 5: The Multiple Regression Model Page 166

Chapter 5 The Multiple Regression Model Walter R. Paczkowski Rutgers University

Related documents

Products

Support

Chapter 5 The Multiple Regression Model Walter R. Paczkowski Rutgers University

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib