Lecture notes to Stock and Watson chapter 5
Introductory linear regression
Tore Schweder
September 2008

TS () LN3 01/09 1 / 11

Inference in regression

Model: $E[Y \mid X = x] = \beta_0 + \beta_1 x$

Typical questions:
- What is the marginal expected effect on Y of a change in X?
  Estimate or confidence interval for $\beta_1$
- What is the expected effect of increasing X from $x_1$ to $x_1 + \Delta x$?
  Estimate or confidence interval for $\theta = \beta_1 \Delta x$
- Is Y affected by X in the mean?
  Test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$
- What is the mean response of Y when X = x?
  Estimate or confidence interval for $\theta = \beta_0 + \beta_1 x$

Example: desired number of children by age of student

Model: $E[\text{Children} \mid \text{Age} = x] = \beta_0 + \beta_1 x$

Typical questions:
- What is the mean number of desired children for a student of age x + 1 compared to that of a student of age x?
  Estimate or confidence interval for $\beta_1$
- What is the difference in the expected number of desired children between students of age 30 and students of age 22?
  Estimate or confidence interval for $\theta = 8\beta_1$
- Does the expected desired number of children depend on the age of the student?
  Test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$
- What is the mean desired number of children at student age 22?
  Estimate or confidence interval for $\theta = \beta_0 + 22\beta_1$

Methods

- Linear parameters $\theta = a\beta_0 + b\beta_1$ are estimated by $\hat\theta = a\hat\beta_0 + b\hat\beta_1$, where $\hat\beta_0$ and $\hat\beta_1$ are the OLS estimators and a and b are numbers.
- The variance of the estimator is
  $$\operatorname{var}(\hat\theta) = a^2 \operatorname{var}(\hat\beta_0) + b^2 \operatorname{var}(\hat\beta_1) + 2ab \operatorname{cov}(\hat\beta_0, \hat\beta_1)$$
- Estimates of $\sqrt{\operatorname{var}(\hat\beta_0)}$ and $\sqrt{\operatorname{var}(\hat\beta_1)}$, namely $\mathrm{SE}(\hat\beta_0)$ and $\mathrm{SE}(\hat\beta_1)$, are given in computer output, along with an estimate of $\rho = \operatorname{cor}(\hat\beta_0, \hat\beta_1)$. Compute $\mathrm{SE}(\hat\theta)$ from these numbers.

Methods cont.
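For large samples, and when the basic assumptions are satisfied (Key Concept 4.3 in SW: linear regression, random sample, large outliers are unlikely):

- The confidence interval of degree $1 - 2\alpha$ for $\theta$ is
  $$\hat\theta \pm z_\alpha \, \mathrm{SE}(\hat\theta)$$
  where $z_\alpha = F_{N(0,1)}^{-1}(\alpha)$ is the $\alpha$ quantile of the standard normal distribution.
- The p-value for the two-sided test problem $H_0: \theta = \theta_0$ vs. $H_1: \theta \neq \theta_0$ is
  $$p = 2 \min\left\{ F_{N(0,1)}\left(\frac{\hat\theta - \theta_0}{\mathrm{SE}(\hat\theta)}\right),\ 1 - F_{N(0,1)}\left(\frac{\hat\theta - \theta_0}{\mathrm{SE}(\hat\theta)}\right) \right\}$$
- The p-value for the one-sided test problem $H_0: \theta \le \theta_0$ vs. $H_1: \theta > \theta_0$ is
  $$p = 1 - F_{N(0,1)}\left(\frac{\hat\theta - \theta_0}{\mathrm{SE}(\hat\theta)}\right)$$
- The p-value for the one-sided test problem $H_0: \theta \ge \theta_0$ vs. $H_1: \theta < \theta_0$ is
  $$p = F_{N(0,1)}\left(\frac{\hat\theta - \theta_0}{\mathrm{SE}(\hat\theta)}\right)$$

Methods for estimating standard errors for OLS regression parameters

1. Calculate the empirical residuals $\hat u_i = Y_i - (\hat\beta_0 + \hat\beta_1 X_i) = Y_i - \hat Y_i$, $i = 1, \dots, n$
2. Look at the scatter diagram or a diagnostic plot to choose the method
3. The standard errors are conditional on $X_1, \dots, X_n$

Homoskedasticity-only standard error, with $\sigma^2 = \operatorname{var}[Y \mid X = x]$:
$$\mathrm{SE}(\hat\beta_1) = \frac{s_{\hat u}}{\sqrt{\sum_i (X_i - \bar X)^2}} = \frac{1}{\sqrt{n}} \, \hat\sigma \, \frac{1}{\sqrt{s_X^2}}$$
Here $\hat\sigma = s_{\hat u}$, and the two expressions agree when $s_X^2 = \frac{1}{n} \sum_i (X_i - \bar X)^2$.

These standard errors are obtained by the STATA command regress

Standard errors cont.

Heteroskedasticity-robust standard error:
$$\mathrm{SE}(\hat\beta_1)^2 = \frac{1}{n} \cdot \frac{\frac{1}{n-2} \sum_i (X_i - \bar X)^2 \hat u_i^2}{\left[ \frac{1}{n} \sum_i (X_i - \bar X)^2 \right]^2}$$

The robust standard errors are obtained by the STATA command regress...,robust

$\mathrm{SE}(\hat\beta_1)$ is small when the empirical residuals are small and/or when $\sum_i (X_i - \bar X)^2$ is large!

Properties of OLS

- Linearity, $E[Y_i \mid X_i] = \beta_0 + \beta_1 X_i$ ⇒ estimators of linear parameters, $\hat\theta = a\hat\beta_0 + b\hat\beta_1$, are unbiased, i.e. $E[\hat\theta] = \theta$
- Linearity + homoskedasticity + independent normally distributed residuals, $u_1, \dots, u_n$ iid $N(0, \sigma^2)$ ⇒ $\dfrac{\hat\theta - \theta}{\mathrm{SE}(\hat\theta)}$ is t-distributed with $n - 2$ degrees of freedom
- Linearity + random sample + large outliers unlikely ⇒ $\dfrac{\hat\theta - \theta}{\mathrm{SE}(\hat\theta)}$ is approximately standard normally distributed. The approximation is better the larger n is and/or the closer to normality the residuals are distributed.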
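
The recipe on the Methods slides can be written out explicitly. The sketch below uses the identity $\operatorname{cov}(\hat\beta_0, \hat\beta_1) = \rho \sqrt{\operatorname{var}(\hat\beta_0)} \sqrt{\operatorname{var}(\hat\beta_1)}$ and plugs in the reported estimates. The symbol $\hat\rho$ for the estimated correlation is notation introduced here, not taken from the slides. The two special cases are the parameters from the children-by-age example, and the last line is the large-sample interval for $\alpha = 0.025$, where the standard normal quantile is about $-1.96$.

```latex
\begin{align*}
\mathrm{SE}(\hat\theta)^2
  &= a^2\,\mathrm{SE}(\hat\beta_0)^2 + b^2\,\mathrm{SE}(\hat\beta_1)^2
     + 2ab\,\hat\rho\,\mathrm{SE}(\hat\beta_0)\,\mathrm{SE}(\hat\beta_1) \\[4pt]
\theta = 8\beta_1 \ (a = 0,\ b = 8):\quad
\mathrm{SE}(\hat\theta)
  &= 8\,\mathrm{SE}(\hat\beta_1) \\[4pt]
\theta = \beta_0 + 22\beta_1 \ (a = 1,\ b = 22):\quad
\mathrm{SE}(\hat\theta)^2
  &= \mathrm{SE}(\hat\beta_0)^2 + 22^2\,\mathrm{SE}(\hat\beta_1)^2
     + 44\,\hat\rho\,\mathrm{SE}(\hat\beta_0)\,\mathrm{SE}(\hat\beta_1) \\[4pt]
\text{95\% confidence interval:}\quad
  & \hat\theta \pm 1.96\,\mathrm{SE}(\hat\theta)
\end{align*}
```

Because $\hat\beta_0$ and $\hat\beta_1$ are typically negatively correlated when the X values are positive, the cross term usually reduces $\mathrm{SE}(\hat\theta)$ for the mean response, so it should not be dropped.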
Figure: t-distributions for degrees of freedom 3, 6 and ∞, and upper and lower 2.5% quantiles.

STATA replication file found from http://wps.aw.com/aw_stock_ie_2/

    # delimit ;
    clear;
    cap log close;
    *************************************************************;
    * Economic Value of a Year of Education;
    *************************************************************;
    log using ch5_EconomicValueBox.log,replace;
    set more 1;
    ***********************************;
    * Read In Data;
    * (Note: Change path name so that it is appropriate for your computer);
    use ntbn2en…naldata…les_replication…lesnch5_cps_box.dta;
    *;
    *;
    reg ahe yrseduc , robust;

STATA replication file cont.

    predict t5;
    predict u5, residual;
    su u5 if yrseduc==10;
    su u5 if yrseduc==12;
    su u5 if yrseduc==16;
    gr7 ahe t5 yrseduc if yrseduc<=18, s(Oi) c(.l) sort xlab ylab
        l1(" ") l2("Average Hourly Earnings") b2("Years of Education")
        saving(figure53,replace);
    *************************************************************;
    log close;
    clear;
    exit;

Problems to be done in class

- SW: 5.4
- SW: 5.8
- SW: 5.15
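
As a follow-up to the replication file, the sketch below shows one way the linear parameters from the Methods slides could be estimated in STATA after the regression of ahe on yrseduc. It is an illustration only, not part of the Stock and Watson replication file. It assumes the data set is already in memory, uses the default line-end delimiter rather than `# delimit ;`, and the values 12 and 16 years of education are chosen here only because the replication file summarizes residuals at those values.

```stata
* Sketch only: assumes the CPS data from the replication file is in memory,
* with variables ahe (average hourly earnings) and yrseduc (years of education).

* OLS with heteroskedasticity-robust standard errors
regress ahe yrseduc, robust

* theta = 4*beta1: expected difference in ahe between 16 and 12 years of education
lincom 4*yrseduc

* theta = beta0 + 12*beta1: mean of ahe at 12 years of education
lincom _cons + 12*yrseduc

* Two-sided test of H0: beta1 = 0
test yrseduc = 0

* Estimated correlation between the two coefficient estimators
estat vce, correlation

* The same standard error by hand, from the estimated covariance matrix.
* After regress, e(V) is ordered (yrseduc, _cons).
matrix V = e(V)
scalar se_theta = sqrt(V[2,2] + 12^2*V[1,1] + 2*12*V[1,2])
scalar theta_hat = _b[_cons] + 12*_b[yrseduc]

* Large-sample 95% confidence interval based on the standard normal quantile
scalar lo = theta_hat - invnormal(0.975)*se_theta
scalar hi = theta_hat + invnormal(0.975)*se_theta
display "theta-hat = " theta_hat "   SE = " se_theta
display "95% CI: [" lo ", " hi "]"
```

Note that lincom reports intervals based on the t-distribution with n - 2 degrees of freedom, while the hand calculation uses the normal quantile from the slides. For a large sample the two should be nearly identical.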