Stat 141 R1 - Lecture #35

Announcements:
1) Assignment #11 Question 5: The answer is wrong … should be “fail to reject” but MyStatLab wants “reject”... so give the wrong answer for full marks in this question.
2) Exam: STAT 141 R1, 3 hrs, 1400 Wed Apr 17, MAIN GYM, ~45 Multiple Choice Questions, Chapters 7, 8, 18-28 …. some pre MT skills will be required.

Simple Linear Regression …. continued

Last time:
Ex) Predicting final exam marks (%) from midterm exam marks (%) in a class of 88 students:

Student   Midterm mark   Final mark
#1        67%            62%
#2        72%            50%
…         …              …
#88       88%            91%

Given x = midterm percentage, y = final percentage,
n = 88, x̄ = 67.812, ȳ = 52.643, sx = 17.922, sy = 25.430, r = 0.718, ∑(yi - ŷi)² = 27278.82

We had calculated:
The slope and intercept of the sample line of best fit:
⇒ sample line of best fit: ŷ = -16.443 + 1.019 x

An estimate for σ (standard deviation about the population line): se
Given SSE = ∑(yi - ŷi)² = 27278.82:
$s_e^2 = \frac{SSE}{n-2} = \frac{27278.82}{88-2} = 317.196 \;\rightarrow\; \sigma \approx s_e = \sqrt{317.196} = 17.810$

Inference for the population slope β1

When the 4 basic assumptions of the SLR model are satisfied:
o The relationship between x and y is sufficiently linear. Presuming linearity, this means, at any x, με = 0.
o The std. dev. of ε is the same for any particular x (constant).
o The distribution of ε at any particular x is normal.
o The random deviations ε1, ε2, ..., εn associated with different observations are independent of one another.
then:
i) b1 is normally distributed
ii) The mean of b1 is μb1 = β1
iii) The standard deviation of b1 is $\sigma_{b_1} = \frac{\sigma}{\sqrt{\sum (x_i - \bar{x})^2}}$
The standard error of b1 is $SE(b_1) = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}} = \frac{s_e}{s_x \sqrt{n-1}}$

⇒ CI for β1: b1 ± tα/2 × SE(b1) with df = n - 2
Test statistic for H0: β1 = 0: $t_0 = \frac{b_1 - \beta_1}{SE(b_1)}$ with df = n - 2

Ex) Construct a 95% CI for β1.
Sol.:
$SE(b_1) = \frac{s_e}{s_x \sqrt{n-1}} = \frac{17.810}{17.922\sqrt{87}} = 0.1065$
df = n - 2 = 88 - 2 = 86 => using df = 75 in the t-table: tα/2 = t0.025 ≈ 1.992
b1 ± tα/2 × SE(b1) = 1.019 ± (1.992)(0.1065) = 1.019 ± 0.212 = (0.807, 1.231)

Ex) Is there sufficient evidence to conclude that the final percentage increases as midterm percentage increases? Carry out an appropriate test using α = 0.01.
Sol.:
H0: β1 = 0    HA: β1 > 0
Assumptions of SLR model: as above
Test statistic: $t_0 = \frac{b_1 - \beta_1}{SE(b_1)} = \frac{1.019 - 0}{0.1065} = 9.568$ with df = 88 - 2 = 86 => 75
P-value: The test is one-tailed. In the t-table, the P-value falls in the range (0, 0.005), so P-value < 0.005 < α.
Conclusion: Reject H0 in favour of HA at α = 0.01. There is convincing evidence against H0 in favour of HA: final percentage increases as midterm percentage increases (a positive linear association between mt% and fin%).

A typical summary table (by Excel or StatCrunch etc.)

Inferences based on the estimated regression line:
• CI for the mean value of y corresponding to an x value xν: for ŷν = b0 + b1xν,
  ŷν ± tdf, α/2 × SE(μ̂ν)   (df = n - 2)
• Prediction interval (PI) for an individual value of y corresponding to an x value xν:
  ŷν ± tdf, α/2 × SE(ŷν)   (df = n - 2)
Note that the PI is wider than the CI. Why?

Ex) Give a 95% CI for the mean final% when midterm% = 73%. Compare with the 95% PI for final% when midterm% = 73%.
Sol.:
ŷν = b0 + b1xν = -16.443 + 1.019(73) = 57.928% (this value uses the unrounded b0 and b1; the rounded coefficients give 57.944%)
where tn-2, α/2 = t86, 0.025 ≈ 1.992
CI: ŷν ± tdf, α/2 × SE(μ̂ν) = 57.928 ± (1.992)(1.977) = 57.928 ± 3.939 = (53.990, 61.867)
PI: ŷν ± tdf, α/2 × SE(ŷν) = 57.928 ± (1.992)(17.919) = 57.928 ± 35.696 = (22.233, 93.624)

THIS IS THE END!
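
Check (sketch): the numbers in this lecture can be reproduced from the summary statistics alone. The Python below is a minimal sketch, not the course's official method. The notes do not spell out the two standard errors used in the last example, so the sketch assumes the usual textbook forms SE(μ̂ν) = se·√(1/n + (xν - x̄)²/∑(xi - x̄)²) and SE(ŷν) = se·√(1 + 1/n + (xν - x̄)²/∑(xi - x̄)²), which do give the 1.977 and 17.919 quoted above. The critical value 1.992 is copied from the t-table (df = 75) rather than computed, and the variable and function names are made up for illustration.

```python
import math

# Summary statistics copied from the notes.
n = 88
x_bar, y_bar = 67.812, 52.643
s_x, s_y = 17.922, 25.430
r = 0.718
sse = 27278.82
t_crit = 1.992  # t-table value (tail area 0.025, df = 75), as used in the notes

# Least-squares slope and intercept from the summary statistics.
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

# Estimate of sigma, the standard deviation about the population line.
s_e = math.sqrt(sse / (n - 2))

# Sum of squared deviations of x, recovered as (n - 1) * s_x^2.
sxx = (n - 1) * s_x ** 2

# Standard error of the slope, 95% CI for beta1, and test statistic for H0: beta1 = 0.
se_b1 = s_e / math.sqrt(sxx)
ci_slope = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
t0 = (b1 - 0) / se_b1


def fit_and_standard_errors(x_new):
    """Return (y_hat, SE for the mean response, SE for an individual response).

    Assumes the standard textbook formulas; they are not written out in the notes.
    """
    y_hat = b0 + b1 * x_new
    core = 1 / n + (x_new - x_bar) ** 2 / sxx
    se_mean = s_e * math.sqrt(core)
    se_pred = s_e * math.sqrt(1 + core)  # the extra "1" is the scatter of one y about the line
    return y_hat, se_mean, se_pred


y_hat, se_mean, se_pred = fit_and_standard_errors(73)
ci_mean = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
pi_indiv = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, s_e = {s_e:.3f}")
print(f"SE(b1) = {se_b1:.4f}, t0 = {t0:.3f}")
print(f"95% CI for beta1: ({ci_slope[0]:.3f}, {ci_slope[1]:.3f})")
print(f"y_hat at x = 73: {y_hat:.3f}")
print(f"95% CI for mean: ({ci_mean[0]:.3f}, {ci_mean[1]:.3f})")
print(f"95% PI:          ({pi_indiv[0]:.3f}, {pi_indiv[1]:.3f})")
```

Because the code carries unrounded intermediate values, t0 comes out slightly below the 9.568 obtained above from the rounded 1.019 and 0.1065; the conclusion is unchanged. The extra 1 under the square root in SE(ŷν) is also what makes the PI wider than the CI.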