Simple Linear Regression: Basic Ideas

In simple linear regression, there is an approximately linear relation between two variables, say x and y:

    y = α + βx + ε,

where
• x and y are observed;
• α and β are unknown;
• ε is a random error with mean 0.

Slide 1

Example: Animal Studies of Side Effects

y = pressure in the pancreas, x = dose.

    x:   0     5    10    15    20    25
    y:  14.6  24.5  21.8  34.5  35.1  43.0

Figure 1: A Scatterplot (pressure y vs. dose x)

Note: Here x is a design variable, set by the experimenter.

Slide 2

Example: From the Coleman Report

y = average 6th-grade verbal score, x = mother's education (yrs).

    x:  12.38  10.34  14.08  14.20  12.30  11.46
    y:  37.01  26.51  36.51  40.70  37.10  33.40

Figure 2: A Scatterplot (verbal score y vs. mother's education x)

Note: Here x is a covariate, measured with y.

Slide 3

Drawing the Line: Least Squares Estimators

The Problem: Given (x_1, y_1), ..., (x_n, y_n), find a and b to minimize

    SS(a, b) = Σ_{i=1}^n (y_i − a − b·x_i)².

The Solution:

    b = s_xy / s_xx,    a = ȳ − b·x̄,

where x̄ and ȳ are the sample means of x_1, ..., x_n and y_1, ..., y_n,

    s_xy = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ),    s_xx = Σ_{i=1}^n (x_i − x̄)².

Slide 4

The Details

Recall:

    SS(a, b) = Σ_{i=1}^n (y_i − a − b·x_i)².

Differentiate:

    ∂SS(a, b)/∂a = −2 Σ_{i=1}^n (y_i − a − b·x_i),
    ∂SS(a, b)/∂b = −2 Σ_{i=1}^n (y_i − a − b·x_i)·x_i.

Solve:

    ∂SS(a, b)/∂a = 0,    ∂SS(a, b)/∂b = 0.

Slide 5

Here

    ∂SS(a, b)/∂a = −2n(ȳ − a − b·x̄).

So, a = ȳ − b·x̄. Then

    ∂SS(a, b)/∂b = −2 Σ_{i=1}^n [y_i − ȳ − b(x_i − x̄)]·x_i
                 = −2 [ Σ (y_i − ȳ)·x_i − b Σ (x_i − x̄)·x_i ].

Now Σ (x_i − x̄) = 0 = Σ (y_i − ȳ). So,

    Σ (y_i − ȳ)·x_i = Σ (y_i − ȳ)(x_i − x̄) = s_xy,

Slide 6

say, and

    Σ (x_i − x̄)·x_i = Σ (x_i − x̄)² = s_xx,

say. So,

    ∂SS(a, b)/∂b = −2(s_xy − b·s_xx),    b = s_xy / s_xx,

and a = ȳ − b·x̄.

The Least Squares Line: y = a + bx.

Slide 7

Example: Coleman

a = 0.1312 and b = 2.8149.

Figure 3: A Scatterplot and Least Squares Line (Verbal Score vs. Mother's Ed)

Slide 8

Calculating a and b

By Machine: using Excel, for example.

By Hand: Recall

    s_xx = Σ (x_i − x̄)² = Σ x_i² − n·x̄².

Similarly,

    s_xy = Σ (x_i − x̄)(y_i − ȳ) = Σ x_i·y_i − n·x̄·ȳ.

So, a and b can be calculated from the sums of the x_i, y_i, x_i·y_i, and x_i².

Slide 9

Example: Dose and Response

            x      y       xy      x²
            0    14.6      0.0      0
            5    24.5    122.5     25
           10    21.8    218.0    100
           15    34.5    517.5    225
           20    35.1    702.0    400
           25    43.0   1075.0    625
    Sums   75   173.5   2635.0   1375

Slide 10

The Calculations

    x̄ = 75/6 = 12.5,    ȳ = 173.5/6 = 28.92,
    s_xy = 2635 − 6 × 12.5 × 28.92 = 466.25,
    s_xx = 1375 − 6 × (12.5)² = 437.5,
    b = 466.25/437.5 = 1.066,

and

    a = 28.92 − 1.066 × 12.5 = 15.595.

Slide 11

A Least Squares Line

a = 15.595 and b = 1.066.

Figure 4: A Scatterplot and Least Squares Line (Pressure vs. Dose)

Slide 12

Some Terminology

Least Squares Estimators:

    b = s_xy / s_xx,    a = ȳ − b·x̄.

Fitted Values (AKA Predicted Values):

    ŷ_i = a + b·x_i = ȳ + b(x_i − x̄).

Residuals:

    e_i = y_i − ŷ_i = y_i − ȳ − b(x_i − x̄).

Regression Sum of Squares:

    SSR = Σ (ŷ_i − ȳ)² = b²·s_xx.

Slide 13

Error Sum of Squares (AKA Residual Sum of Squares):

    SSE = Σ e_i².

Let

    s_yy = Σ (y_i − ȳ)².

Then s_yy = SSR + SSE. Let

    R² = SSR / s_yy.

Then 100·R² = % explained variation.

Note: SSE = s_yy − SSR = s_yy − b²·s_xx.

Slide 14
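Note: The quantities on Slides 11–14 can be checked numerically. Below is a minimal sketch (not part of the original slides), assuming Python with numpy is available; the variable names are illustrative only.

    # Least squares fit and the decomposition s_yy = SSR + SSE
    # for the dose-response data (Slides 10-14).
    import numpy as np

    x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0])    # dose
    y = np.array([14.6, 24.5, 21.8, 34.5, 35.1, 43.0])  # pressure in the pancreas

    xbar, ybar = x.mean(), y.mean()
    sxx = np.sum((x - xbar) ** 2)
    sxy = np.sum((x - xbar) * (y - ybar))
    syy = np.sum((y - ybar) ** 2)

    b = sxy / sxx              # slope, ~1.066
    a = ybar - b * xbar        # intercept, ~15.595

    yhat = a + b * x           # fitted values
    e = y - yhat               # residuals
    SSE = np.sum(e ** 2)       # error (residual) sum of squares
    SSR = b ** 2 * sxx         # regression sum of squares
    R2 = SSR / syy             # proportion of variation explained

    print(a, b, SSE + SSR, syy, R2)   # SSE + SSR should equal syy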
Derivation of s_yy = SSR + SSE

    s_yy = Σ (y_i − ŷ_i + ŷ_i − ȳ)²
         = Σ e_i² + 2 Σ e_i(ŷ_i − ȳ) + Σ (ŷ_i − ȳ)².

Here the 1st term = SSE, the 3rd = SSR, and the 2nd is

    2 Σ [(y_i − ȳ) − b(x_i − x̄)]·b(x_i − x̄) = 2[b·s_xy − b²·s_xx] = 0.

Slide 15

Inference

Model: Now suppose

    y_i = α + β·x_i + ε_i,    where ε_1, ..., ε_n ~ind Normal[0, σ²].

Notes:
a) Here −∞ < α, β < ∞ and σ² > 0 are unknown.
b) If x_1, ..., x_n are covariates, then the conditions must hold conditionally given x_1, ..., x_n.
c) Then y_i ~ Normal[α + β·x_i, σ²], and the y_i are (conditionally) independent.

Slide 16

The Likelihood Function

The likelihood function is

    Π_{i=1}^n (1/√(2πσ²)) exp[−(y_i − α − β·x_i)²/(2σ²)]
        = (1/√(2πσ²))^n exp[−(1/(2σ²)) Σ (y_i − α − β·x_i)²].

So, the log-likelihood function is

    ℓ(α, β, σ² | x, y) = −(1/(2σ²)) Σ (y_i − α − β·x_i)² − (n/2)[log(σ²) + log(2π)].

Maximum Likelihood Estimators: α̂ and β̂ must minimize the sum of squares. So,

    MLE = LSE.

Slide 17

That is,

    β̂ = b = s_xy / s_xx,    α̂ = a = ȳ − b·x̄.

The Profile Likelihood Function: Then

    ℓ(α̂, β̂, σ² | x, y) = −SSE/(2σ²) − (n/2)[log(σ²) + log(2π)].

The MLE of σ²:

    ∂ℓ/∂σ² = SSE/(2σ⁴) − n/(2σ²).

So,

    σ̂² = SSE/n.

Let

    MSE = SSE/(n − 2).

Slide 18

Means and Variances of the Estimators

Unbiasedness: α̂ and β̂ are unbiased; that is,

    E(β̂) = β,    E(α̂) = α.

Variances:

    σ²_β̂ = σ²/s_xx,
    σ²_α̂ = (1/n + x̄²/s_xx)·σ².

Derivation for β̂: First,

    s_xy = Σ (x_i − x̄)(y_i − ȳ) = Σ (x_i − x̄)·y_i,

since Σ (x_i − x̄) = 0.

Slide 19

So,

    E(β̂) = (1/s_xx)·E(s_xY)
          = (1/s_xx) Σ (x_i − x̄)·E(Y_i)
          = (1/s_xx) Σ (x_i − x̄)(α + β·x_i)
          = (1/s_xx) Σ (x_i − x̄)·β(x_i − x̄)
          = (1/s_xx)·β·s_xx
          = β.

Slide 20

Similarly,

    σ²_β̂ = (1/s²_xx) Var(Σ (x_i − x̄)·Y_i)
          = (1/s²_xx) Σ (x_i − x̄)²·σ²
          = σ²/s_xx.

Notes:
• Unbiasedness requires (only) E(ε_i) = 0.
• The variance formula requires also E(ε_i²) = σ².
• Similarly for α̂.

Slide 21

Sampling Distributions

    α̂ ~ Normal[α, σ²_α̂],    β̂ ~ Normal[β, σ²_β̂],

and

    SSE/σ² ~ χ²_{n−2};

and (α̂, β̂) is independent of SSE/σ².

Corollary: MSE is unbiased; that is, E(MSE) = σ².

Note: The proof is similar to the independence of X̄ and S² in the one-sample normal problem.

Note: Unbiasedness of MSE requires only E(ε_i) = 0 and E(ε_i²) = σ².

Slide 22

Confidence Intervals

Studentization: Let

    σ̂²_β̂ = MSE/s_xx

and

    T = (β̂ − β)/σ̂_β̂.

Then T ~ t_{n−2}. So, if c is the 97.5th percentile of t_{n−2}, for example, then

    P[−c ≤ T ≤ c] = .95.

Slide 23

Here

    −c ≤ T = (β̂ − β)/σ̂_β̂ ≤ c    iff    β̂ − c·σ̂_β̂ ≤ β ≤ β̂ + c·σ̂_β̂.

Confidence Interval for β: So, β̂ ± c·σ̂_β̂ is a 95% confidence interval for β.

Confidence Interval for α: Similarly, α̂ ± c·σ̂_α̂ is a 95% confidence interval for α.

Slide 24

Example: United Data Services

Here n = 14, x = units serviced, y = time.

Figure 5: A Scatterplot (Time vs. Units)

Slide 25

Here

    c = 2.180,    β̂ = 15.509,    s_xx = 114,    MSE = 29.074,

so

    σ̂²_β̂ = 29.074/114 = (.505)².

So,

    β̂ ± c·σ̂_β̂ = 15.509 ± 2.18 × .505 = 15.51 ± 1.10.

Slide 26

Testing H₀: β = 0

From the confidence interval: accept if

    β̂ − c·σ̂_β̂ ≤ 0 ≤ β̂ + c·σ̂_β̂.

Equivalently, reject if

    |T₀| = |β̂|/σ̂_β̂ > c.

Example: Dose Response.

    β̂ ± c·σ̂_β̂ = 1.066 ± .449,

so 0 lies outside the interval and, therefore, H₀ is rejected.

Note: This is the GLRT (as in the one-sample problem).

Slide 27

Review

• Simple Linear Regression: Y = α + βx + ε
• Least squares estimators, a and b
• Properties of the estimators
• Sampling distributions
• Confidence intervals
• Testing

Today

• Estimating expected response
• Predicting a future value

Slide 28

Estimating Expected Response

Let

    µ(x) = α + β·x = E(Y | x).

Fix an x₀ and let

    µ₀ = µ(x₀) = α + β·x₀.

Then

    µ(x) = µ₀ + β(x − x₀),

so µ₀ = α₀, the intercept when x is replaced by x − x₀.

Slide 29

Let

    µ̂₀ = α̂ + β̂·x₀.

Then

    E(µ̂₀) = µ₀,
    σ²_µ̂₀ = (1/n + (x₀ − x̄)²/s_xx)·σ².

Let

    σ̂²_µ̂₀ = (1/n + (x₀ − x̄)²/s_xx)·MSE.

Then

    µ̂₀ ± c·σ̂_µ̂₀

is a level-95% confidence interval for µ₀.

Slide 30
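Note: The interval and test for β on Slides 23–27 can be reproduced the same way. Below is a minimal sketch (not from the slides), assuming Python with numpy and scipy; scipy.stats.t supplies the t_{n−2} percentile.

    # 95% confidence interval for beta and the test of H0: beta = 0
    # for the dose-response data (Slides 23-27).
    import numpy as np
    from scipy import stats

    x = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0])
    y = np.array([14.6, 24.5, 21.8, 34.5, 35.1, 43.0])
    n = len(x)

    sxx = np.sum((x - x.mean()) ** 2)
    b = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    a = y.mean() - b * x.mean()

    SSE = np.sum((y - a - b * x) ** 2)
    MSE = SSE / (n - 2)                  # unbiased estimator of sigma^2
    se_b = np.sqrt(MSE / sxx)            # sigma-hat of beta-hat

    c = stats.t.ppf(0.975, df=n - 2)     # 97.5th percentile of t_{n-2}
    ci = (b - c * se_b, b + c * se_b)    # ~1.066 +/- .449 (Slide 27)

    T0 = b / se_b                        # Studentized statistic under H0
    print(ci, T0, abs(T0) > c)           # True: H0 is rejected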
Predicting a Future Value

Now let

    Y₀ ~ Normal[µ₀, σ²],

independent of the data, and let

    Ŷ₀ = µ̂₀.

Then

    ∆ := Y₀ − Ŷ₀ = Y₀ − µ₀ − (µ̂₀ − µ₀).

So,

    E(∆) = 0,
    σ²_∆ = σ² + σ²_µ̂₀ = (1 + 1/n + (x₀ − x̄)²/s_xx)·σ²,

and

    Y₀ − Ŷ₀ ~ Normal[0, σ²_∆].

Let

    σ̂²_∆ = (1 + 1/n + (x₀ − x̄)²/s_xx)·MSE.

Slide 31

Then

    ∆/σ̂_∆ ~ t_{n−2}

and

    P[−c ≤ ∆/σ̂_∆ ≤ c] = .95.

Here

    −c ≤ ∆/σ̂_∆ ≤ c    iff    Ŷ₀ − c·σ̂_∆ ≤ Y₀ ≤ Ŷ₀ + c·σ̂_∆.

So,

    P[Ŷ₀ − c·σ̂_∆ ≤ Y₀ ≤ Ŷ₀ + c·σ̂_∆] = .95.

The interval

    Ŷ₀ ± c·σ̂_∆

is called a 95% prediction interval for Y₀.

Slide 32

Example: United Data Services

Take x₀ = 4, so

    µ₀ = α + 4β.

Then

    µ̂₀ = 4.162 + 4 × 15.509 = 66.198.

Next, with x̄ = 6,

    σ̂²_µ̂₀ = (1/14 + (4 − 6)²/114) × 29.074 = (1.76)².

So,

    µ̂₀ ± c·σ̂_µ̂₀ = 66.198 ± 2.18 × 1.76 = 66.20 ± 3.84.

Slide 33

The Prediction Interval

For this example,

    Ŷ₀ = µ̂₀ = 66.198,
    σ̂²_∆ = (1 + 1/14 + (4 − 6)²/114) × 29.074 = (5.672)²,

and

    Ŷ₀ ± c·σ̂_∆ = 66.198 ± 2.18 × 5.672 = 66.20 ± 12.36.

Note: Average response versus individual response.

Slide 34
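Note: The two intervals on Slides 33–34 follow directly from the summary statistics quoted there. Below is a minimal sketch (not from the slides), using only Python's standard library; the numbers plugged in are taken from Slides 26 and 33.

    # Mean-response CI and prediction interval for United Data Services at x0 = 4.
    import math

    n, xbar, sxx, MSE = 14, 6.0, 114.0, 29.074
    alpha_hat, beta_hat = 4.162, 15.509
    c = 2.180                                    # 97.5th percentile of t_12

    x0 = 4.0
    mu0_hat = alpha_hat + beta_hat * x0          # ~66.198

    se_mean = math.sqrt((1/n + (x0 - xbar)**2 / sxx) * MSE)      # ~1.76
    se_pred = math.sqrt((1 + 1/n + (x0 - xbar)**2 / sxx) * MSE)  # ~5.672

    ci_mean = (mu0_hat - c * se_mean, mu0_hat + c * se_mean)      # 66.20 +/- 3.84
    pi_future = (mu0_hat - c * se_pred, mu0_hat + c * se_pred)    # 66.20 +/- 12.36
    print(ci_mean, pi_future)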