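Simple Linear Regression
NBA 2013/14 Player Heights and Weights

Data Description / Model

• Heights (X) and Weights (Y) for 505 NBA Players in 2013/14 Season.
• Other Variables included in the Dataset: Age, Position
• Simple Linear Regression Model: $Y = \beta_0 + \beta_1 X + \varepsilon$
• Model Assumptions:
  • $\varepsilon \sim N(0, \sigma^2)$
  • Errors are independent
  • Error variance ($\sigma^2$) is constant
  • Relationship between Y and X is linear
  • No important (available) predictors have been omitted

[Figure: Weight (Y) vs Height (X) - 2013/2014 NBA Players. Scatterplot; x-axis: Height (inches), 65 to 90; y-axis: Weight (lbs), 150 to 300.]

Regression Calculations

\[ n = 505 \qquad \bar{x} = 79.06535 \qquad \bar{y} = 220.6733 \]
\[ S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = 6012.844 \]
\[ S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = 38065.78 \]
\[ S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = 357767.1 \]
\[ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = S_{yy} - \frac{S_{xy}^2}{S_{xx}} = 357767.1 - \frac{38065.78^2}{6012.844} = 116782.3 \]
\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{38065.78}{6012.844} = 6.330745 \]
\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 220.6733 - 6.330745(79.06535) = -279.869 \]
\[ \hat{y}_i = -279.869 + 6.330745\, x_i \]
\[ s_e^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-2} = \frac{SSE}{n-2} = \frac{116782.3}{505-2} = 232.1716 \qquad s_e = \sqrt{232.1716} = 15.23718 \]
\[ SE\{\hat{\beta}_1\} = s_e \sqrt{\frac{1}{S_{xx}}} = 15.23718 \sqrt{\frac{1}{6012.844}} = 0.196501 \]

Inference Concerning $\beta_1$

\[ n = 505 \qquad \hat{\beta}_1 = 6.330745 \qquad SE\{\hat{\beta}_1\} = 0.196501 \]

Test of $H_0: \beta_1 = 0$ vs $H_A: \beta_1 \neq 0$

Test Statistic:
\[ t_{obs} = \frac{\hat{\beta}_1}{SE\{\hat{\beta}_1\}} = \frac{6.330745}{0.196501} = 32.21738 \]
Rejection Region: $|t_{obs}| \geq t_{.025,\,505-2} = 1.965$
P-value: $2P(t_{n-2} \geq |t_{obs}|) = 2P(t_{503} \geq 32.21738) \approx .0000$

95% Confidence Interval for $\beta_1$:
\[ \hat{\beta}_1 \pm t_{.025,\,n-2}\, SE\{\hat{\beta}_1\} = 6.330745 \pm 1.965(0.196501) = 6.330745 \pm 0.386124 = (5.944621,\ 6.71687) \]

EXCEL Output and Inference for $\beta_0$

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | -279.869 | 15.5512 | -17.9966 | 2.89E-56 | -310.423 | -249.316 |
| Height | 6.330745 | 0.196501 | 32.21738 | 2.2E-124 | 5.944682 | 6.716809 |

\[ \hat{\beta}_0 = -279.869 \]
\[ SE\{\hat{\beta}_0\} = s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}} = 15.23718 \sqrt{\frac{1}{505} + \frac{79.06534653^2}{6012.843564}} = 15.5512 \]

Testing $H_0: \beta_0 = 0$ vs $H_A: \beta_0 \neq 0$

Test Statistic:
\[ t_{obs} = \frac{\hat{\beta}_0}{SE\{\hat{\beta}_0\}} = \frac{-279.869}{15.5512} = -17.9966 \]
95% CI for $\beta_0$:
\[ -279.869 \pm 1.965(15.5512) = -279.869 \pm 30.55811 = (-310.427,\ -249.311) \]
The interval computed by hand differs from the EXCEL output in the third decimal place because the critical value is rounded to 1.965.

Estimating Mean and Predicting New Response at $x = x^*$

\[ n = 505 \qquad \bar{x} = 79.06535 \qquad \bar{y} = 220.6733 \qquad S_{xx} = 6012.844 \]
\[ \hat{\beta}_0 = -279.869 \qquad \hat{\beta}_1 = 6.330745 \qquad \hat{y}_i = -279.869 + 6.330745\, x_i \qquad s_e = \sqrt{232.1716} = 15.23718 \]

Estimating Mean Response at $x^* = 76''$: $\mu_Y = E\{Y \mid x = 76\} = \beta_0 + \beta_1(76)$
\[ \hat{\mu}_Y = \hat{\beta}_0 + \hat{\beta}_1(76) = -279.869 + 6.330745(76) = 201.2673 \]
\[ SE\{\hat{\mu}_Y\} = s_e \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 15.23718 \sqrt{\frac{1}{505} + \frac{(76 - 79.06535)^2}{6012.844}} = 15.23718 \sqrt{0.003543} = 0.906953 \]
95% CI for $\mu_Y = E\{Y \mid x = 76\}$:
\[ 201.2673 \pm 1.965(0.906953) = (199.4852,\ 203.0495) \]

Predicting a New Player's weight with $x^* = 76''$:
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1(76) = -279.869 + 6.330745(76) = 201.2673 \]
\[ SE\{\hat{y}\} = s_e \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}} = 15.23718 \sqrt{1 + \frac{1}{505} + \frac{(76 - 79.06535)^2}{6012.844}} = 15.23718 \sqrt{1.003543} = 15.26415 \]
95% Prediction Interval for $y_{76}$:
\[ 201.2673 \pm 1.965(15.26415) = (171.2733,\ 231.2614) \]

[Figure: Weight vs Height - Data, Fitted Values, CI for Mean, PI for Individuals. Series: Weight, Y-hat, CI_LB, CI_UB, PI_LB, PI_UB; x-axis: height, 66 to 90; y-axis: weight, 100 to 350.]

Coefficients of Correlation and Determination

\[ r_{yx} = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{38065.78}{\sqrt{6012.843564\,(357767.1)}} = 0.820719235 \]
Note that while the intercept and slope depend on units (e.g. inches vs centimetres, pounds vs kilograms), the correlation coefficient does not.
\[ r^2 = (r_{yx})^2 = (0.820719235)^2 = 0.67358 \]
Alternatively:
\[ r^2 = \frac{TSS - SSE}{TSS} = \frac{S_{yy} - SSE}{S_{yy}} = \frac{357767.1 - 116782.3}{357767.1} = 0.67358 \]
Approximately 2/3 (67.4%) of the variation in weight is "explained" by the regression of weight on height.

Testing $H_0: \rho_{yx} = 0$ vs $H_A: \rho_{yx} \neq 0$

Test Statistic:
\[ t_{obs} = r_{yx} \sqrt{\frac{n-2}{1 - r_{yx}^2}} = 0.820719235 \sqrt{\frac{505-2}{1 - 0.67358}} = 32.21738 \]
P-value $\approx .0000$

Analysis of Variance and F-Test

Total (Corrected) Sum of Squares:
\[ TSS = S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = 357767.1 \qquad DF_T = n - 1 = 505 - 1 = 504 \]
Error (aka Residual) Sum of Squares:
\[ SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = 116782.3 \qquad DF_E = n - 2 = 505 - 2 = 503 \]
Regression (aka Model) Sum of Squares:
\[ SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = 240984.8 \qquad DF_R = 1 \]

F-Test for Slope Coefficient: $H_0: \beta_1 = 0$ vs $H_A: \beta_1 \neq 0$

Test Statistic:
\[ F_{obs} = \frac{MSR}{MSE} = \frac{SSR / DF_R}{SSE / DF_E} = \frac{240984.8 / 1}{116782.3 / 503} = 1037.96 \]
Rejection Region: $F_{obs} \geq F_{\alpha;\, DF_R,\, DF_E} = F_{.05;\,1,\,503} = 3.860$
P-value: $P(F_{DF_R, DF_E} \geq F_{obs}) = P(F_{1,503} \geq 1037.96) \approx .0000$

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 240984.8 | 240984.8 | 1037.96 | 2.2E-124 |
| Residual | 503 | 116782.3 | 232.1716 | | |
| Total | 504 | 357767.1 | | | |

[Figure: Graphical Representation of Analysis of Variance. Series: Weight, Y-hat, Y-bar; x-axis: height, 66 to 90; y-axis: weight, 100 to 300.]

Linearity of Regression

F-Test for Lack-of-Fit ($n_j$ observations at $c$ distinct levels of "X")

\[ H_0: E\{Y_i\} = \beta_0 + \beta_1 X_i \qquad H_A: E\{Y_i\} = \mu_i \neq \beta_0 + \beta_1 X_i \]
Compute the fitted value $\hat{Y}_j$ and the sample mean $\bar{Y}_j$ for each distinct X level.

Lack-of-Fit:
\[ SS(LF) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (\bar{Y}_j - \hat{Y}_j)^2 \qquad df_{LF} = c - 2 \]
Pure Error:
\[ SS(PE) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2 \qquad df_{PE} = n - c \]
Test Statistic:
\[ F_{LOF} = \frac{SS(LF) / (c-2)}{SS(PE) / (n-c)} = \frac{MS(LF)}{MS(PE)} \overset{H_0}{\sim} F_{c-2,\, n-c} \]
Reject $H_0$ if $F_{LOF} \geq F_{1-\alpha;\, c-2,\, n-c}$

Equivalently, comparing the reduced (linear) model R with the full (group means) model F:
\[ F_{LOF} = \frac{\left[ SSE - SS(PE) \right] / \left[ (n-2) - (n-c) \right]}{SS(PE) / (n-c)} = \frac{\left[ SSE(R) - SSE(F) \right] / (df_R - df_F)}{SSE(F) / df_F} \]
Reject $H_0$ if $F_{LOF} \geq F_{1-\alpha;\, c-2,\, n-c}$

Computing Strategy:

1) For each group ($j$), compute:
\[ \bar{Y}_j = \frac{\sum_{i=1}^{n_j} Y_{ij}}{n_j} \qquad s_j^2 = \begin{cases} 0 & n_j = 1 \\ \dfrac{\sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2}{n_j - 1} & \text{otherwise} \end{cases} \qquad \hat{Y}_j = \hat{\beta}_0 + \hat{\beta}_1 X_j \]
2) \[ SS(LF) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (\bar{Y}_j - \hat{Y}_j)^2 = \sum_{j=1}^{c} n_j (\bar{Y}_j - \hat{Y}_j)^2 \]
3) \[ SS(PE) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2 = \sum_{j=1}^{c} (n_j - 1) s_j^2 \]
\[ F_{LOF} = \frac{MS(LF)}{MS(PE)} = \frac{SS(LF) / (c-2)}{SS(PE) / (n-c)} \overset{H_0}{\sim} F_{c-2,\, n-c} \]

Height and Weight Data – n=505, c=18 Groups

| Height (in) | n | Mean | SD | Y-hat | SSLF | SSPE | SSE |
|---|---|---|---|---|---|---|---|
| 69 | 2 | 182.50 | 3.54 | 156.95 | 1305.39 | 12.50 | 1317.89 |
| 71 | 4 | 175.75 | 15.52 | 169.61 | 150.62 | 722.75 | 873.37 |
| 72 | 13 | 181.00 | 13.00 | 175.94 | 332.27 | 2028.00 | 2360.27 |
| 73 | 16 | 186.13 | 12.09 | 182.28 | 237.15 | 2191.75 | 2428.90 |
| 74 | 21 | 183.33 | 9.26 | 188.61 | 583.79 | 1716.67 | 2300.45 |
| 75 | 41 | 193.71 | 11.58 | 194.94 | 61.96 | 5360.49 | 5422.44 |
| 76 | 32 | 200.84 | 11.96 | 201.27 | 5.74 | 4434.22 | 4439.96 |
| 77 | 31 | 204.13 | 10.70 | 207.60 | 373.06 | 3433.48 | 3806.55 |
| 78 | 43 | 211.00 | 12.83 | 213.93 | 368.86 | 6912.00 | 7280.86 |
| 79 | 49 | 221.35 | 18.70 | 220.26 | 57.94 | 16781.10 | 16839.04 |
| 80 | 46 | 227.33 | 15.13 | 226.59 | 24.90 | 10300.11 | 10325.01 |
| 81 | 67 | 232.49 | 19.63 | 232.92 | 12.30 | 25430.75 | 25443.05 |
| 82 | 53 | 241.49 | 14.79 | 239.25 | 265.64 | 11369.25 | 11634.88 |
| 83 | 44 | 245.66 | 17.55 | 245.58 | 0.26 | 13241.89 | 13242.14 |
| 84 | 34 | 254.62 | 14.70 | 251.91 | 248.66 | 7128.03 | 7376.69 |
| 85 | 7 | 247.86 | 10.75 | 258.24 | 755.21 | 692.86 | 1448.07 |
| 86 | 1 | 278.00 | 0.00 | 264.57 | 180.24 | 0.00 | 180.24 |
| 87 | 1 | 263.00 | 0.00 | 270.91 | 62.50 | 0.00 | 62.50 |
| Sum | 505 | | | | 5026.479 | 111755.8 | 116782.3 |

| Source | df | SS | MS | F(LOF) | F(.95) | P-value |
|---|---|---|---|---|---|---|
| LackFit | 16 | 5026.5 | 314.2 | 1.369 | 1.664 | 0.1521 |
| PureError | 487 | 111755.8 | 229.5 | | | |

Do not reject $H_0: \mu_j = \beta_0 + \beta_1 X_j$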
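The regression calculations, the standard errors of the coefficients, and the intervals at a height of 76 inches can be re-derived from the summary statistics alone. The following is a minimal sketch under stated assumptions: the function names `fit_from_summaries` and `interval_at` are hypothetical, the inputs are the rounded values quoted in the slides (not the raw data), and the critical value 1.965 is copied from the slides rather than computed. Results should agree with the slides only up to rounding.

```python
import math

# Summary statistics as quoted in the slides (rounded; raw data not used).
N = 505
X_BAR, Y_BAR = 79.06535, 220.6733
S_XX, S_XY, S_YY = 6012.844, 38065.78, 357767.1
T_CRIT = 1.965  # t(.025, 503) copied from the slides, not recomputed


def fit_from_summaries(n, x_bar, y_bar, s_xx, s_xy, s_yy):
    """Least-squares estimates and standard errors from summary statistics."""
    b1 = s_xy / s_xx                      # slope
    b0 = y_bar - b1 * x_bar               # intercept
    sse = s_yy - s_xy ** 2 / s_xx         # error sum of squares
    ssr = s_yy - sse                      # regression sum of squares
    mse = sse / (n - 2)
    s_e = math.sqrt(mse)                  # residual standard deviation
    return {
        "b0": b0, "b1": b1, "sse": sse, "ssr": ssr, "s_e": s_e,
        "se_b1": s_e * math.sqrt(1 / s_xx),
        "se_b0": s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx),
        "r2": ssr / s_yy,
        "f_obs": ssr / mse,               # DF_R = 1, so MSR = SSR
    }


def interval_at(fit, x_star, n, x_bar, s_xx, t_crit, new_obs=False):
    """CI for the mean response, or PI for a new response if new_obs=True."""
    y_hat = fit["b0"] + fit["b1"] * x_star
    extra = 1.0 if new_obs else 0.0       # the "1 +" term in the PI formula
    se = fit["s_e"] * math.sqrt(extra + 1 / n + (x_star - x_bar) ** 2 / s_xx)
    return y_hat, y_hat - t_crit * se, y_hat + t_crit * se


fit = fit_from_summaries(N, X_BAR, Y_BAR, S_XX, S_XY, S_YY)
ci_mean = interval_at(fit, 76, N, X_BAR, S_XX, T_CRIT)
pi_new = interval_at(fit, 76, N, X_BAR, S_XX, T_CRIT, new_obs=True)

print("slope, intercept:", fit["b1"], fit["b0"])
print("t for slope:", fit["b1"] / fit["se_b1"])
print("95% CI for mean at 76 in:", ci_mean[1:])
print("95% PI for new player at 76 in:", pi_new[1:])
```

Passing `new_obs=True` only adds the leading 1 under the square root, which is why the prediction interval is so much wider than the confidence interval for the mean.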
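The lack-of-fit computing strategy can also be sketched from the grouped height table. This assumes the rounded group means and standard deviations printed in the table, the fitted coefficients from the slides, and a hypothetical helper name `lack_of_fit`. Because the inputs are rounded to two decimals, the sums of squares match the slides only approximately. The critical value 1.664 is taken from the slides, not computed.

```python
# (height in inches, n_j, group mean, group SD), as rounded in the slides' table
GROUPS = [
    (69, 2, 182.50, 3.54), (71, 4, 175.75, 15.52), (72, 13, 181.00, 13.00),
    (73, 16, 186.13, 12.09), (74, 21, 183.33, 9.26), (75, 41, 193.71, 11.58),
    (76, 32, 200.84, 11.96), (77, 31, 204.13, 10.70), (78, 43, 211.00, 12.83),
    (79, 49, 221.35, 18.70), (80, 46, 227.33, 15.13), (81, 67, 232.49, 19.63),
    (82, 53, 241.49, 14.79), (83, 44, 245.66, 17.55), (84, 34, 254.62, 14.70),
    (85, 7, 247.86, 10.75), (86, 1, 278.00, 0.00), (87, 1, 263.00, 0.00),
]
B0, B1 = -279.869, 6.330745   # fitted coefficients quoted in the slides
F_CRIT = 1.664                # F(.95; 16, 487) copied from the slides


def lack_of_fit(groups, b0, b1):
    """Lack-of-fit F statistic from per-group n, mean and SD."""
    n = sum(nj for _, nj, _, _ in groups)
    c = len(groups)
    # Step 2: SS(LF) = sum_j n_j * (Ybar_j - Yhat_j)^2
    ss_lf = sum(nj * (mean - (b0 + b1 * x)) ** 2 for x, nj, mean, _ in groups)
    # Step 3: SS(PE) = sum_j (n_j - 1) * s_j^2 (zero for singleton groups)
    ss_pe = sum((nj - 1) * sd ** 2 for _, nj, _, sd in groups)
    df_lf, df_pe = c - 2, n - c
    f_lof = (ss_lf / df_lf) / (ss_pe / df_pe)
    return {"n": n, "c": c, "ss_lf": ss_lf, "ss_pe": ss_pe,
            "df_lf": df_lf, "df_pe": df_pe, "f_lof": f_lof}


lof = lack_of_fit(GROUPS, B0, B1)
print("SS(LF), SS(PE):", lof["ss_lf"], lof["ss_pe"])
print("F_LOF:", lof["f_lof"], "reject H0:", lof["f_lof"] >= F_CRIT)
```

Working from group summaries rather than the 505 individual records is a design choice made here because only the summaries appear in the slides. With the raw data, step 3 would sum squared deviations within each height group directly.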