Chapter 14 Simple Linear Regression Learning Objectives 1. Understand how regression analysis can be used to develop an equation that estimates mathematically how two variables are related. 2. Understand the differences between the regression model, the regression equation, and the estimated regression equation. 3. Know how to fit an estimated regression equation to a set of sample data based upon the leastsquares method. 4. Be able to determine how good a fit is provided by the estimated regression equation and compute the sample correlation coefficient from the regression analysis output. 5. Understand the assumptions necessary for statistical inference and be able to test for a significant relationship. 6. Know how to develop confidence interval estimates of y given a specific value of x in both the case of a mean value of y and an individual value of y. 7. Learn how to use a residual plot to make a judgement as to the validity of the regression assumptions. 8. Know the definition of the following terms: independent and dependent variable simple linear regression regression model regression equation and estimated regression equation scatter diagram coefficient of determination standard error of the estimate confidence interval prediction interval residual plot 14 - 1 Chapter 14 Solutions: a. 16 14 12 10 y 1 8 6 4 2 0 0 1 2 3 4 5 6 x b. There appears to be a positive linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion. d. x xi 15 3 n 5 y ( xi x )( yi y ) 26 b1 yi 40 8 n 5 ( xi x ) 2 10 ( xi x )( yi y ) 26 2.6 10 ( xi x )2 b0 y b1 x 8 (2.6)(3) 0.2 yˆ 0.2 2.6 x e. yˆ 0.2 2.6(4) 10.6 14 - 2 Simple Linear Regression a. 60 50 40 y 2. 30 20 10 0 0 5 10 15 20 25 x b. There appears to be a negative linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion. d. x xi 55 11 n 5 y ( xi x )( yi y ) 540 b1 yi 175 35 n 5 ( xi x ) 2 180 ( xi x )( yi y ) 540 3 180 ( xi x )2 b0 y b1 x 35 (3)(11) 68 yˆ 68 3x e. yˆ 68 3(10) 38 14 - 3 Chapter 14 a. 30 25 20 y 3. 15 10 5 0 0 5 10 15 x b. x xi 50 10 n 5 y ( xi x )( yi y ) 171 b1 yi 83 16.6 n 5 ( xi x ) 2 190 ( xi x )( yi y ) 171 0.9 190 ( xi x )2 b0 y b1 x 16.6 (0.9)(10) 7.6 yˆ 7.6 0.9 x c. yˆ 7.6 0.9(6) 13 14 - 4 20 25 Simple Linear Regression 4. a. 135 130 125 y 120 115 110 105 100 61 62 63 64 65 66 67 68 69 x b. There appears to be a positive linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion. d. x xi 325 65 n 5 y ( xi x )( yi y ) 110 b1 yi 585 117 n 5 ( xi x ) 2 20 ( xi x )( yi y ) 110 5.5 20 ( xi x ) 2 b0 y b1 x 117 (5.5)(65) 240.5 y 240.5 55 . x e. y 240.5 55 . x 240.5 55 . (63) 106 pounds 14 - 5 Chapter 14 a. 2500 2000 Price ($) 5. 1500 1000 500 0 1 1.5 2 2.5 3 3.5 4 4.5 Baggage Capacity b. Let x = baggage capacity and y = price ($). There appears to be a positive linear relationship between x and y. c. Many different straight lines can be drawn to provide a linear approximation of the relationship between x and y; in part (d) we will determine the equation of a straight line that “best” represents the relationship according to the least squares criterion. d. x xi 29.5 3.277778 n 9 y ( xi x )( yi y ) 2909.888891 b1 yi 11,110 123.444444 n 9 ( xi x ) 2 4.555559 ( xi x )( yi y ) 2909.888891 638.755615 4.555559 ( xi x )2 b0 y b1 x 1234.4444 (638.7561)(3.2778) 859.254658 yˆ 859.26 638.76 x e. A one point increase in the baggage capacity rating will increase the price by approximately $639. f. yˆ 859.26 638.76 x 859.26 638.76(3) $1057 14 - 6 Simple Linear Regression a. 30 25 20 Bonus 6. 15 10 5 0 50.00 70.00 90.00 110.00 130.00 150.00 170.00 190.00 Salary b. Let x = salary and y = bonus. There appears to be a positive linear relationship between x and y. c. x xi 1420 142 n 10 xi 1420 b1 yi 160 y yi 160 16 n 10 ( xi x )( yi y ) 1114 ( xi x ) 2 6046 ( xi x )( yi y ) 1114 .184254 6046 ( xi x )2 b0 y b1 x 16 (.184254)(142) 10.164068 yˆ 10.1641 .1843x d. $1000 increase in salary will increase the bonus by $180. e yˆ 10.1641 .1843x 10.1641 .1843(120) 11.95 or approximately $12,000 14 - 7 Chapter 14 a. 44,000 42,000 40,000 Price 7. 38,000 36,000 34,000 32,000 30,000 0 1 2 3 4 5 6 Reliability b. x xi / n 47 /15 3.1333 y yi / n 548, 434 /15 36,562.2667 ( xi x )( yi y ) 36, 086.5333 b1 ( xi x ) 2 27.7333 ( xi x )( yi y ) 36, 086.5333 1301.1987 27.7333 ( xi x )2 b0 y b1 x 36,562.2667 (1301.1987)(3.1333) 40,639.3126 yˆ 40,369 1301.2 x c. The scatter diagram and the slope of the estimated regression equation indicate a negative linear relationship between reliability and price. Thus, it appears that higher reliable cars actually cost less. Although this result may surprise you, it may be due to the fact that higher priced cars have more options that may increase the likelihood of problems. d. A car with a good reliability rating corresponds to x = 3. yˆ 40,369 1301.2 x 40,369 1301.2(3) 36, 465.40 Thus, the estimate of the price of an upscale sedan with a good reliability rating is approximately $36,465. 14 - 8 Simple Linear Regression a. 1800 1600 1400 1200 Price 8. 1000 800 600 400 200 0 0 1 2 3 4 5 6 Sidetrack Capability b. There appears to be a positive linear relationship between x = sidetrack capability and y = price, with higher priced models having a higher level of handling. c. x xi 28 2.8 n 10 y ( xi x )( yi y ) 4003.2 b1 yi 10, 621 1062.1 n 10 ( xi x ) 2 19.6 ( xi x )( yi y ) 4003.2 204.2449 19.6 ( xi x )2 b0 y b1 x 1062.1 (204.2449)(2.8) 490.2143 yˆ 490.21 204.24 x d. yˆ 490.21 204.24 x 490.21 204.24(4) 1307 14 - 9 Chapter 14 a. 150 140 130 120 110 y 9. 100 90 80 70 60 50 0 2 4 6 8 x b. x xi 70 7 n 10 y ( xi x )( yi y ) 568 b1 yi 1080 108 n 10 ( xi x ) 2 142 ( xi x )( yi y ) 568 4 142 ( xi x )2 b0 y b1 x 108 (4)(7) 80 y 80 4 x c. y 80 4 x 80 4(9) 116 14 - 10 10 12 14 Simple Linear Regression 10. a. 450 400 350 Price 300 250 200 150 100 50 0 0 10 20 30 40 50 Rating b. The scatter diagram and the slope of the estimated regression equation indicate a negative linear relationship between rating and price. Thus, it appears that sleeping bags with a lower temperature rating cost more than sleeping bags with a higher temperature rating. In other words, it costs more to stay warmer. c. x xi / n 209 /11 19 ( xi x )( yi y ) 10, 090 b1 y yi / n 2849 /11 259 ( xi x ) 2 1912 ( xi x )( yi y ) 10, 090 5.2772 1912 ( xi x )2 b0 y b1 x 259 (5.2772)(19) 359.2668 yˆ 359.2668 5.2772 x d. yˆ 359.2668 5.2772 x 359.2668 5.2772(20) 253.72 Thus, the estimate of the price of sleeping bag with a temperature rating of 20 is approximately $254. 14 - 11 Chapter 14 11. a. 35 Departures (%) 30 25 20 15 10 5 0 10 15 20 25 30 35 Arrivals (%) b. There appears to be a positive linear relationship between the variables. c. Let x = percentage of late arrivals and y = percentage of late departures. x xi 273 y 265 21 y i 20.3846 n 13 n 13 ( xi x )( yi y ) 142 b1 ( xi x ) 2 166 ( xi x )( yi y ) 142 .8554 166 ( xi x )2 b0 y b1 x 20.3846 (.8554)(21) 2.4212 yˆ 2.42 .86 x d. A one percent increase in the percentage of late arrivals will increase the percentage of late arrivals by .86 or slightly less than one percent. e. yˆ 2.42 .86 x 2.42 .86(22) 21.34% 14 - 12 Simple Linear Regression 12. a. 12000 11000 Price 10000 9000 8000 7000 6000 5000 700 720 740 760 780 800 820 840 Weight b. The scatter diagram indicates a positive linear relationship between weight and price. Thus, it appears that PWC’s that weigh more have a higher price. c. x xi / n 7730 /10 773 y yi / n 92, 200 /10 9220 ( xi x )( yi y ) 332, 400 ( xi x ) 2 14,810 b1 ( xi x )( yi y ) 332, 400 22.4443 14,810 ( xi x )2 b0 y b1 x 9220 (22.4443)(773) 8129.4439 yˆ 8129.4439 22.4443x d. yˆ 8129.4439 22.4443x 8129.4439 22.4443(750) 8703.78 Thus, the estimate of the price of Jet Ski with a weight of 750 pounds is approximately $8704. e. No. The relationship between weight and price is not deterministic. f. The weight of the Kawasaki SX-R 800 is so far below the lowest weight for the data used to develop the estimated regression equation that we would not recommend using the estimated regression equation to predict the price for this model. 14 - 13 Chapter 14 13. a. 30.0 25.0 y 20.0 15.0 10.0 5.0 0.0 0.0 20.0 40.0 60.0 80.0 100.0 120.0 x b. x xi 399 57 n 7 y ( xi x )( yi y ) 1233.7 b1 yi 97.1 13.8714 n 7 ( xi x ) 2 7648 ( xi x )( yi y ) 1233.7 0.1613 7648 ( xi x )2 b0 y b1 x 13.8714 (0.1613)(57) 4.6773 y 4.68 016 . x c. y 4.68 016 . x 4.68 016 . (52.5) 13.08 or approximately $13,080. The agent's request for an audit appears to be justified. 14 - 14 140.0 Simple Linear Regression Salary 14. a. 30.0 29.0 28.0 27.0 26.0 25.0 24.0 23.0 22.0 21.0 20.0 0.00 20.00 40.00 60.00 80.00 100.00 Index b. Let x = cost of living index and y = starting salary ($1000s) x xi 369.16 36.916 n 10 y ( xi x )( yi y ) 311.9592 b1 yi 267.7 26.77 n 10 ( xi x ) 2 7520.4042 ( xi x )( yi y ) 311.9592 .0415 7520.4042 ( xi x )2 b0 y b1 x 26.77 (.0415)(36.916) 28.30 yˆ 28.30 .0415x c. 15. a. yˆ 28.30 .0415x 28.30 .0415(50) 26.2 The estimated regression equation and the mean for the dependent variable are: yi 0.2 2.6xi y 8 The sum of squares due to error and the total sum of squares are SSE ( yi yi ) 2 12.40 SST ( yi y ) 2 80 Thus, SSR = SST - SSE = 80 - 12.4 = 67.6 b. r2 = SSR/SST = 67.6/80 = .845 The least squares line provided a very good fit; 84.5% of the variability in y has been explained by the least squares line. c. rxy .845 .9192 14 - 15 Chapter 14 16. a. The estimated regression equation and the mean for the dependent variable are: yˆi 68 3x y 35 The sum of squares due to error and the total sum of squares are SSE ( yi yˆi ) 2 230 SST ( yi y ) 2 1850 Thus, SSR = SST - SSE = 1850 - 230 = 1620 b. r2 = SSR/SST = 1620/1850 = .876 The least squares line provided an excellent fit; 87.6% of the variability in y has been explained by the estimated regression equation. c. rxy .876 .936 Note: the sign for r is negative because the slope of the estimated regression equation is negative. (b1 = -3) 17. The estimated regression equation and the mean for the dependent variable are: yˆi 7.6 .9 x y 16.6 The sum of squares due to error and the total sum of squares are SSE ( yi yˆi ) 2 127.3 SST ( yi y ) 2 281.2 Thus, SSR = SST - SSE = 281.2 – 127.3 = 153.9 r2 = SSR/SST = 153.9/281.2 = .547 We see that 54.7% of the variability in y has been explained by the least squares line. rxy .547 .740 18. a. The estimated regression equation and the mean for the dependent variable are: yˆ 1790.5 581.1x y 3650 The sum of squares due to error and the total sum of squares are SSE ( yi yˆi ) 2 85,135.14 SST ( yi y ) 2 335, 000 Thus, SSR = SST - SSE = 335,000 - 85,135.14 = 249,864.86 b. r2 = SSR/SST = 249,864.86/335,000 = .746 We see that 74.6% of the variability in y has been explained by the least squares line. c. rxy .746 .8637 14 - 16 Simple Linear Regression 19. The estimated regression equation and the mean for the dependent variable are: yˆi 40,639 1301.2 x y 36,562.27 The sum of squares due to error and the total sum of squares are SSE ( yi yˆi ) 2 47,116,828 SST ( yi y ) 2 94, 072,519 Thus, SSR = SST - SSE = 94,072,519 – 47,116,828 = 46,955,691 r2 = SSR/SST = 46,955,691/94,072,519 = .4991 We see that 49.91% of the variability in y has been explained by the least squares line. rxy .4991 .71 20. a. 70 65 60 Score 55 50 45 40 35 30 25 20 1500.00 2000.00 2500.00 3000.00 3500.00 4000.00 4500.00 Price The scatter diagram indicates a positive linear relationship between price and score. x xi / n 29, 600 /10 2960 ( xi x )( yi y ) 34,840 b1 y yi / n 496 /10 49.6 ( xi x ) 2 2, 744, 000 ( xi x )( yi y ) 34,840 .012697 2 2, 744, 000 ( xi x ) b0 y b1 x 49.6 (.012697)(2960) 12.0169 yˆ 12.0169 .0127 x 14 - 17 Chapter 14 b. The sum of squares due to error and the total sum of squares are SSE ( yi yˆi ) 2 540.0446 SST ( yi y ) 2 982.4 Thus, SSR = SST - SSE = 982.4 – 540.0446 = 442.3554 r2 = SSR/SST = 442.3554/982.4 = .4503 The fit provided by the estimated regression equation is not that good; only 45.03% of the variability in y has been explained by the least squares line. c. yˆ 12.0169 .0127 x 12.0169 .0127(3200) 52.66 The estimate of the overall score for a 42-inch plasma television is approximately 53. 21. a. x xi 3450 575 n 6 y yi 33, 700 5616.67 n 6 ( xi x )( yi y ) 712,500 b1 ( xi x ) 2 93, 750 ( xi x )( yi y ) 712,500 7.6 93, 750 ( xi x ) 2 b0 y b1 x 5616.67 (7.6)(575) 1246.67 y 1246.67 7.6 x b. $7.60 c. The sum of squares due to error and the total sum of squares are: SSE ( yi yˆi ) 2 233,333.33 SST ( yi y ) 2 5, 648,333.33 Thus, SSR = SST - SSE = 5,648,333.33 - 233,333.33 = 5,415,000 r2 = SSR/SST = 5,415,000/5,648,333.33 = .9587 We see that 95.87% of the variability in y has been explained by the estimated regression equation. d. 22. a. y 1246.67 7.6x 1246.67 7.6(500) $5046.67 Let x = speed (ppm) and y = price ($) x xi 149.9 14.99 n 10 y ( xi x )( yi y ) 34,359.810 b1 yi 10, 221 1022.1 n 10 ( xi x ) 2 291.3890 ( xi x )( yi y ) 34,359.810 117.917320 291.3890 ( xi x ) 2 14 - 18 Simple Linear Regression b0 y b1 x 1022.1 (117.917320)(14.99) 745.480627 yˆ 745.480627 117.917320x b. The sum of squares due to error and the total sum of squares are: SSE ( yi yˆi ) 2 1, 678, 294 SST ( yi y ) 2 5, 729,911 Thus, SSR = SST - SSE = 5,729,911 - 1,678,294 = 4,051,617 r2 = SSR/SST = 4,051,617/5,729,911 = 0.7071 Approximately 71% of the variability in price is explained by the speed. c. rxy .7071 .84 It reflects a linear relationship that is between weak and strong. 23. a. s2 = MSE = SSE / (n - 2) = 12.4 / 3 = 4.133 b. s MSE 4.133 2.033 c. ( xi x ) 2 10 sb1 d. t s ( xi x ) 2 2.033 10 0.643 b1 2.6 4.044 sb1 .643 Using t table (3 degrees of freedom), area in tail is between .01 and .025 p-value is between .02 and .05 Using Excel or Minitab, the p-value corresponding to t = 4.04 is .0272. Because p-value , we reject H0: 1 = 0 e. MSR = SSR / 1 = 67.6 F = MSR / MSE = 67.6 / 4.133 = 16.36 Using F table (1 degree of freedom numerator and 3 denominator), p-value is between .025 and .05 Using Excel or Minitab, the p-value corresponding to F = 16.36 is .0272. Because p-value , we reject H0: 1 = 0 14 - 19 Chapter 14 Source of Variation Regression Error Total 24. a. Sum of Squares 67.6 12.4 80.0 Mean Square 67.6 4.133 F 16.36 p-value .0272 s2 = MSE = SSE/(n - 2) = 230/3 = 76.6667 b. s MSE 76.6667 8.7560 c. ( xi x ) 2 180 sb1 d. Degrees of Freedom 1 3 4 t s ( xi x ) 2 8.7560 180 0.6526 b1 3 4.59 sb1 .653 Using t table (3 degrees of freedom), area in tail is less than .01; p-value is less than .02 Using Excel or Minitab, the p-value corresponding to t = -4.59 is .0193. Because p-value , we reject H0: 1 = 0 e. MSR = SSR/1 = 1620 F = MSR/MSE = 1620/76.6667 = 21.13 Using F table (1 degree of freedom numerator and 3 denominator), p-value is less than .025 Using Excel or Minitab, the p-value corresponding to F = 21.13 is .0193. Because p-value , we reject H0: 1 = 0 Source of Variation Regression Error Total 25. a. Sum of Squares 230 1620 1850 Degrees of Freedom 1 3 4 s2 = MSE = SSE/(n - 2) = 127.3/3 = 42.4333 s MSE 42.4333 6.5141 b. ( xi x ) 2 190 sb1 s ( xi x ) 2 6.5141 190 0.4726 14 - 20 Mean Square 230 76.6667 F 21.13 p-value .0193 Simple Linear Regression t b1 .9 1.90 sb1 .4726 Using t table (3 degrees of freedom), area in tail is between .05 and .10 p-value is between .10 and .20 Using Excel or Minitab, the p-value corresponding to t = 1.90 is .1530. Because p-value > , we cannot reject H0: 1 = 0; x and y do not appear to be related. c. MSR = SSR/1 = 153.9 /1 = 153.9 F = MSR/MSE = 153.9/42.4333 = 3.63 Using F table (1 degree of freedom numerator and 3 denominator), p-value is greater than .10 Using Excel or Minitab, the p-value corresponding to F = 3.63 is .1530. Because p-value > , we cannot reject H0: 1 = 0; x and y do not appear to be related. 26. a. In solving exercise 18, we found SSE = 85,135.14 s2 = MSE = SSE/(n - 2) = 85,135.14/4 = 21,283.79 s MSE 21,28379 . 14589 . ( xi x ) 2 0.74 sb1 t s ( xi x ) 2 145.89 0.74 169.59 b1 581.1 3.43 sb1 169.59 Using t table (4 degrees of freedom), area in tail is between .01 and .025 p-value is between .02 and .05 Using Excel or Minitab, the p-value corresponding to t = 3.43 is .0266. Because p-value , we reject H0: 1 = 0 b. MSR = SSR/1 = 249,864.86/1 = 249.864.86 F = MSR/MSE = 249,864.86/21,283.79 = 11.74 Using F table (1 degree of freedom numerator and 4 denominator), p-value is between .025 and .05 14 - 21 Chapter 14 Using Excel or Minitab, the p-value corresponding to F = 11.74 is .0266. Because p-value , we reject H0: 1 = 0 c. Source of Variation Regression Error Total 27. a. x Sum of Squares 249864.86 85135.14 335000 xi 37 3.7 n 10 y ( xi x )( yi y ) 315.2 b1 Degrees of Freedom 1 4 5 Mean Square 249864.86 21283.79 F 11.74 p-value .0266 yi 1654 165.4 n 10 ( xi x ) 2 10.1 ( xi x )( yi y ) 315.2 31.2079 10.1 ( xi x ) 2 b0 y b1 x 165.4 (31.2079)(3.7) 49.9308 yˆ 49.9308 31.2079 x b. SSE = ( yi yˆi ) 2 2487.66 SST = ( yi y ) 2 = 12,324.4 Thus, SSR = SST - SSE = 12,324.4 - 2487.66 = 9836.74 MSR = SSR/1 = 9836.74 MSE = SSE/(n - 2) = 2487.66/8 = 310.96 F = MSR / MSE = 9836.74/310.96 = 31.63 Using F table (1 degree of freedom numerator and 8 denominator), p-value is less than .01 Using Excel or Minitab, the p-value corresponding to F = 31.63 is .001. Because p-value , we reject H0: 1 = 0 Upper support and price are related. c. r2 = SSR/SST = 9,836.74/12,324.4 = .80 The estimated regression equation provided a good fit; we should feel comfortable using the estimated regression equation to estimate the price given the upper support rating. d. ŷ = 49.93 + 31.21(4) = 174.77 14 - 22 Simple Linear Regression 28. The sum of squares due to error and the total sum of squares are SSE ( yi yˆi ) 2 12,953.09 SST ( yi y ) 2 66, 200 Thus, SSR = SST - SSE = 66,200 – 12,953.09 = 53,246.91 s2 = MSE = SSE / (n - 2) = 12,953.09 / 9 = 1439.2322 s MSE 1439.2322 37.9372 We can use either the t test or F test to determine whether temperature rating and price are related. We will first illustrate the use of the t test. Note: from the solution to exercise 10 ( xi x ) 2 1912 s sb1 t ( xi x ) b1 sb 5.2772 .8676 2 37.9372 1912 .8676 6.0825 1 Using t table (9 degrees of freedom), area in tail is less than .005; p-value is less than .01 Using Excel or Minitab, the p-value corresponding to t = -6.0825 is .000. Because p-value , we reject H0: 1 = 0 Because we can reject H0: 1 = 0 we conclude that temperature rating and price are related. Next we illustrate the use of the F test. MSR = SSR / 1 = 53,246.91 F = MSR / MSE = 53,246.91 / 1439.2322 = 37.00 Using F table (1 degree of freedom numerator and 9 denominator), p-value is less than .01 Using Excel or Minitab, the p-value corresponding to F = 37.00 is .000. Because p-value , we reject H0: 1 = 0 Because we can reject H0: 1 = 0 we conclude that temperature rating and price are related. The ANOVA table is shown below. Source of Variation Regression Error Total Sum of Squares 53,246.91 12,953.09 66,200 Degrees of Freedom 1 9 10 14 - 23 Mean Square 53,246.91 1439.2322 F 37.00 p-value .000 Chapter 14 29. SSE = ( yi yˆi ) 2 233,333.33 SST = ( yi y ) 2 = 5,648,333.33 Thus, SSR = SST – SSE = 5,648,333.33 –233,333.33 = 5,415,000 MSE = SSE/(n - 2) = 233,333.33/(6 - 2) = 58,333.33 MSR = SSR/1 = 5,415,000 F = MSR / MSE = 5,415,000 / 58,333.25 = 92.83 Source of Variation Regression Error Total Sum of Squares 5,415,000.00 233,333.33 5,648,333.33 Degrees of Freedom 1 4 5 Mean Square 5,415,000 58,333.33 F 92.83 p-value .0006 Using F table (1 degree of freedom numerator and 4 denominator), p-value is less than .01 Using Excel or Minitab, the p-value corresponding to F = 92.83 is .0006. Because p-value , we reject H0: 1 = 0. Production volume and total cost are related. 30. SSE = ( yi yˆi ) 2 1,678,294 SST = ( yi y ) 2 = 5,729,911 Thus, SSR = SST – SSE = 5,729,911 – 1,678,294 = 4,051,617 s2 = MSE = SSE/(n-2) = 1,678,294/8 = 209,786.8 s 209,786.8 458.02 ( xi x ) 2 = 291.3890 sb1 t s ( xi x ) 2 458.02 291.3890 26.83 b1 117.9173 4.39 sb1 26.83 Using t table (1 degree of freedom numerator and 8 denominator), area in tail is less than .005 p-value is less than .01 Using Excel or Minitab, the p-value corresponding to t = 4.39 is .002. Because p-value , we reject H0: 1 = 0 There is a significant relationship between x and y. 14 - 24 Simple Linear Regression 31. SSE = 540.04 and SST = 982.40 Thus, SSR = SST - SSE = 982.40 – 540.04 = 442.36 s2 = MSE = SSE / (n - 2) = 540.04 / 8 = 67.5050 s MSE 67.5050 8.216 MSR = SSR / 1 = 442.36 F = MSR / MSE = 442.36 / 67.5050 = 6.55 Using F table (1 degree of freedom numerator and 8 denominator), p-value is between .025 and .05 Using Excel or Minitab, the p-value corresponding to F = 6.55 is .034. Because p-value , we reject H0: 1 = 0 Conclusion: price and overall score are related 32. a. b. s = 2.033 x 3 ( xi x ) 2 10 s yˆp s 2 1 ( xp x ) 1 (4 3)2 2.033 1.11 n ( xi x )2 5 10 y 0.2 2.6x 0.2 2.6(4) 10.6 y p t / 2 sy p 10.6 3.182 (1.11) = 10.6 3.53 or 7.07 to 14.13 2 1 ( xp x ) 1 (4 3)2 s 1 2.033 1 2.32 2 n ( xi x ) 5 10 c. sind d. y p t / 2 sind 10.6 3.182 (2.32) = 10.6 7.38 or 3.22 to 17.98 33. a. b. s = 8.7560 x 11 ( xi x )2 180 14 - 25 Chapter 14 s yˆp s 2 1 ( xp x ) 1 (8 11)2 8.7560 4.3780 n ( xi x )2 5 180 yˆ 68 3x 68 3(10) 38 y p t / 2 sy p 38 3.182(4.3780) = 38 13.93 or 24.07 to 51.93 c. sind s 1 d. y p t / 2 sind 2 1 ( xp x ) 1 (8 11)2 8.7560 1 9.7895 2 n ( xi x ) 5 180 38 3.182(9.7895) = 38 31.15 or 6.85 to 69.15 34. s = 6.5141 x 10 s yˆp s ( xi x ) 2 190 2 1 ( xp x ) 1 (12 10)2 6.5141 3.0627 n ( xi x )2 5 190 yˆ 7.6 .9 x 7.6 .9(12) 18.40 y p t / 2 sy p 18.40 3.182(3.0627) = 18.40 9.75 or 8.65 to 28.15 sind s 1 2 1 ( xp x ) 1 (12 10)2 6.5141 1 7.1982 n ( xi x )2 5 190 y p t / 2 sind 18.40 3.182(7.1982) = 18.40 22.90 or -4.50 to 41.30 14 - 26 Simple Linear Regression 35. a. Note: some of the values shown were computed in exercises 18 and 26. s = 145.89 x 3.2 s yˆp s ( xi x ) 2 0.74 2 1 ( xp x ) 1 (3 3.2)2 145.89 68.54 2 n ( xi x ) 6 0.74 yˆ 1790.5 581.1x 1790.5 581.1(3) 3533.8 y p t / 2 sy p 3533.8 2.776 (68.54) = 3533.8 190.27 or $3343.53 to $3724.07 b. sind s 1 2 1 ( xp x ) 1 (3 3.2)2 145.89 1 161.19 n ( xi x )2 6 0.74 y p t / 2 sind 3533.8 2.776 (161.19) = 3533.8 447.46 or $3086.34 to $3981.26 36. a. b. yˆ 359.2668 5.2772 x =359.2668 – 5.2772(30) = 200.95 or approximately $201. s = 37.9372 x 19 s yˆp s ( xi x ) 2 1912 2 1 ( xp x ) 1 (30 19)2 37.9372 14.90 n ( xi x )2 11 1912 y p t / 2 sy p 200.95 2.262 (14.90) = 200.95 33.70 or 167.25 to 234.65 c. sind s 1 2 1 ( xp x ) 1 (30 19)2 37.9372 1 40.76 n ( xi x )2 11 1912 y p t / 2 sind 14 - 27 Chapter 14 200.95 2.262 (40.76) = 200.95 92.20 or 108.75 to 293.15 d. 37. a. As expected, the prediction interval is much wider than the confidence interval. This is due to the fact that it is more difficult to predict the results for one new model with a temperature rating of 30 Fo than it is to estimate the mean for all models with a temperature rating of 30 Fo . x 57 ( xi x ) 2 7648 s2 = 1.88 s = 1.37 2 1 ( xp x ) 1 (52.5 57)2 s yˆp s 1.37 0.52 2 n ( xi x ) 7 7648 y p t / 2 sy p ŷ = 4.68 + 0.16x = 4.68 + 0.16(52.5) = 13.08 13.08 2.571 (.52) = 13.08 1.34 or 11.74 to 14.42 or $11,740 to $14,420 b. sind = 1.47 13.08 2.571 (1.47) = 13.08 3.78 or 9.30 to 16.86 or $9,300 to $16,860 c. Yes, $20,400 is much larger than anticipated. d. Any deductions exceeding the $16,860 upper limit could suggest an audit. 38. a. b. y 1246.67 7.6(500) $5046.67 x 575 ( xi x ) 2 93, 750 s2 = MSE = 58,333.33 s = 241.52 sind s 1 2 1 ( xp x ) 1 (500 575)2 241.52 1 267.50 n ( xi x )2 6 93,750 y p t / 2 sind 5046.67 4.604 (267.50) = 5046.67 1231.57 or $3815.10 to $6278.24 c. Based on one month, $6000 is not out of line since $3815.10 to $6278.24 is the prediction interval. However, a sequence of five to seven months with consistently high costs should cause concern. 14 - 28 Simple Linear Regression 39. a. Let x = miles of track and y = weekday ridership in thousands. x xi 203 29 n 7 y ( xi x )( yi y ) 1471 b1 yi 309 44.1429 n 7 ( xi x ) 2 838 ( xi x )( yi y ) 1471 1.7554 838 ( xi x )2 b0 y b1 x 44.1429 (1.7554)(29) 6.76 yˆ 6.76 1.755x b. SST =3620.9 SSE = 1038.7 SSR = 2582.1 r2 = SSR/SST = 2582.1/3620.9 = .713 The estimated regression equation explained 71.3% of the variability in y; a good fit. c. s2 = MSE = 1038.7/5 = 207.7 s 207.7 14.41 s yˆp s 2 1 ( xp x ) 1 (30 29)2 14.41 5.47 n ( xi x )2 7 838 yˆ 6.76 1.755x 6.76 1.755(30) 45.9 y p t / 2 sy p 45.9 2.571(5.47) = 45.9 14.1 or 31.8 to 60 d. sind s 1 2 1 ( xp x ) 1 (30 29)2 14.41 1 15.41 2 n ( xi x ) 7 838 y p t / 2 sind 45.9 2.571(15.41) = 45.9 39.6 or 6.3 to 85.5 The prediction interval is so wide that it would not be of much value in the planning process. A larger data set would be beneficial. 14 - 29 Chapter 14 40. a. 9 b. ŷ = 20.0 + 7.21x c. 1.3626 d. SSE = SST - SSR = 51,984.1 - 41,587.3 = 10,396.8 MSE = 10,396.8/7 = 1,485.3 F = MSR / MSE = 41,587.3 /1,485.3 = 28.00 Using F table (1 degree of freedom numerator and 7 denominator), p-value is less than .01 Using Excel or Minitab, the p-value corresponding to F = 28.00 is .0011. Because p-value , we reject H0: B1 = 0. e. 41. a. b. ŷ = 20.0 + 7.21(50) = 380.5 or $380,500 ŷ = 6.1092 + .8951x t b1 B1 .8951 0 6.01 sb1 .149 Using the t table (8 degrees of freedom), area in tail is less than .005 p-value is less than .01 Using Excel or Minitab, the p-value corresponding to t = 6.01 is .0003. Because p-value , we reject H0: B1 = 0 c. 42 a. ŷ = 6.1092 + .8951(25) = 28.49 or $28.49 per month ŷ = 80.0 + 50.0x b. 30 c. F = MSR / MSE = 6828.6/82.1 = 83.17 Using F table (1 degree of freedom numerator and 28 denominator), p-value is less than .01 Using Excel or Minitab, the p-value corresponding to F = 83.17 is .000. Because p-value < = .05, we reject H0: B1 = 0. Branch office sales are related to the salespersons. d. ŷ = 80 + 50 (12) = 680 or $680,000 14 - 30 Simple Linear Regression 43. a. The Minitab output is shown below: The regression equation is Price = 4.98 + 2.94 Weight Predictor Constant Weight Coef 4.979 2.9370 S = 8.457 SE Coef 3.380 0.2934 R-Sq = 80.7% T 1.47 10.01 P 0.154 0.000 R-Sq(adj) = 79.9% Analysis of Variance Source Regression Residual Error Total DF 1 24 25 SS 7167.9 1716.6 8884.5 MS 7167.9 71.5 F 100.22 P 0.000 Predicted Values for New Observations New Obs 1 Fit SE Fit 34.35 1.66 ( 95.0% CI 30.93, b. The p-value = .000 < = .05 (t or F); significant relationship c. r2 = .807. The least squares line provided a very good fit. d. The 95% confidence interval is 30.93 to 37.77. e. The 95% prediction interval is 16.56 to 52.14. 37.77) ( 95.0% PI 16.56, 52.14) 44. a/b. The scatter diagram shows a linear relationship between the two variables. c. The Minitab output is shown below: The regression equation is Rental$ = 37.1 - 0.779 Vacancy% Predictor Constant Vacancy% Coef 37.066 -0.7791 S = 4.889 SE Coef 3.530 0.2226 R-Sq = 43.4% T 10.50 -3.50 P 0.000 0.003 R-Sq(adj) = 39.8% Analysis of Variance Source Regression Residual Error Total DF 1 16 17 SS 292.89 382.37 675.26 MS 292.89 23.90 F 12.26 P 0.003 Predicted Values for New Observations New Obs 1 2 Fit 17.59 28.26 SE Fit 2.51 1.42 95.0% CI ( 12.27, 22.90) ( 25.26, 31.26) 14 - 31 ( ( 95.0% PI 5.94, 29.23) 17.47, 39.05) Chapter 14 Values of Predictors for New Observations New Obs 1 2 Vacancy% 25.0 11.3 d. Since the p-value = 0.003 is less than = .05, the relationship is significant. e. r2 = .434. The least squares line does not provide a very good fit. f. The 95% confidence interval is 12.27 to 22.90 or $12.27 to $22.90. g. The 95% prediction interval is 17.47 to 39.05 or $17.47 to $39.05. 45. a. x xi 70 14 n 5 y ( xi x )( yi y ) 200 b1 yi 76 15.2 n 5 ( xi x ) 2 126 ( xi x )( yi y ) 200 1.5873 126 ( xi x )2 b0 y b1 x 15.2 (1.5873)(14) 7.0222 yˆ 7.02 1.59 x b. The residuals are 3.48, -2.47, -4.83, -1.6, and 5.22 c. 6 Residuals 4 2 0 -2 -4 -6 0 5 10 15 20 25 x With only 5 observations it is difficult to determine if the assumptions are satisfied. However, the plot does suggest curvature in the residuals that would indicate that the error term assumptions are not satisfied. The scatter diagram for these data also indicates that the underlying relationship between x and y may be curvilinear. 14 - 32 Simple Linear Regression d. s2 23.78 hi 1 ( xi x ) 2 1 ( x 14) 2 i 2 n ( xi x ) 5 126 The standardized residuals are 1.32, -.59, -1.11, -.40, 1.49. e. 46. a. The standardized residual plot has the same shape as the original residual plot. The curvature observed indicates that the assumptions regarding the error term may not be satisfied. yˆ 2.32 .64 x b. 4 3 Residuals 2 1 0 -1 -2 -3 -4 0 2 4 6 8 10 x The assumption that the variance is the same for all values of x is questionable. The variance appears to increase for larger values of x. 47. a. Let x = advertising expenditures and y = revenue yˆ 29.4 1.55 x b. SST = 1002 SSE = 310.28 SSR = 691.72 MSR = SSR / 1 = 691.72 MSE = SSE / (n - 2) = 310.28/ 5 = 62.0554 F = MSR / MSE = 691.72/ 62.0554= 11.15 Using F table (1 degree of freedom numerator and 5 denominator), p-value is between .01 and .025 Using Excel or Minitab, the p-value corresponding to F = 11.15 is .0206. Because p-value = .05, we conclude that the two variables are related. 14 - 33 Chapter 14 c. 10 Residuals 5 0 -5 -10 -15 25 35 45 55 65 Predicted Values d. 48. a. The residual plot leads us to question the assumption of a linear relationship between x and y. Even though the relationship is significant at the .05 level of significance, it would be extremely dangerous to extrapolate beyond the range of the data. yˆ 80 4 x 8 6 Residuals 4 2 0 -2 -4 -6 -8 0 2 4 6 8 x b. The assumptions concerning the error term appear reasonable. 14 - 34 10 12 14 Simple Linear Regression 49. a. Let x = return on investment (ROE) and y = price/earnings (P/E) ratio. yˆ 32.13 3.22 x b. Standardized Residuals 2 1.5 1 0.5 0 -0.5 -1 -1.5 0 10 20 30 40 50 60 x c. 50. a. There is an unusual trend in the residuals. The assumptions concerning the error term appear questionable. The MINITAB output is shown below: The regression equation is Y = 66.1 + 0.402 X Predictor Constant X Coef 66.10 0.4023 S = 12.62 SE Coef 32.06 0.2276 R-sq = 38.5% T 2.06 1.77 p 0.094 0.137 R-sq(adj) = 26.1% Analysis of Variance SOURCE DF Regression 1 Residual Error 5 Total 6 Unusual Observations Obs. X Y 1 135 145.00 SS 497.2 795.7 1292.9 Fit 120.42 MS 497.2 159.1 SEFit 4.87 F 3.12 Residual 24.58 p 0.137 St.Resid 2.11R R denotes an observation with a large standardized residual. The standardized residuals are: 2.11, -1.08, .14, -.38, -.78, -.04, -.41 The first observation appears to be an outlier since it has a large standardized residual. 14 - 35 Chapter 14 b. 2.5 2.0 Standardized Residual 1.5 1.0 0.5 0.0 -0.5 -1.0 110 115 120 125 Fitted Value 130 135 140 The standardized residual plot indicates that the observation x = 135, y = 145 may be an outlier; note that this observation has a standardized residual of 2.11. The scatter diagram is shown below 150 145 140 135 130 y c. 125 120 115 110 105 100 100 110 120 130 140 150 160 170 180 x The scatter diagram also indicates that the observation x = 135, y = 145 may be an outlier; the implication is that for simple linear regression an outlier can be identified by looking at the scatter diagram. 14 - 36 Simple Linear Regression 51. a. The Minitab output is shown below: The regression equation is Y = 13.0 + 0.425 X Predictor Constant X Coef 13.002 0.4248 S = 3.181 SE Coef 2.396 0.2116 R-sq = 40.2% T 5.43 2.01 p 0.002 0.091 R-sq(adj) = 30.2% Analysis of Variance SOURCE DF Regression 1 Residual Error 6 Total 7 SS 40.78 60.72 101.50 Unusual Observations Obs. X Y 7 12.0 24.00 8 22.0 19.00 MS 40.78 10.12 Fit Stdev.Fit 18.10 1.20 22.35 2.78 F 4.03 Residual 5.90 -3.35 p 0.091 St.Resid 2.00R -2.16RX R denotes an observation with a large standardized residual. X denotes an observation whose X value gives it large influence. The standardized residuals are: -1.00, -.41, .01, -.48, .25, .65, -2.00, -2.16 The last two observations in the data set appear to be outliers since the standardized residuals for these observations are 2.00 and -2.16, respectively. b. Using MINITAB, we obtained the following leverage values: .28, .24, .16, .14, .13, .14, .14, .76 MINITAB identifies an observation as having high leverage if hi > 6/n; for these data, 6/n = 6/8 = .75. Since the leverage for the observation x = 22, y = 19 is .76, MINITAB would identify observation 8 as a high leverage point. Thus, we conclude that observation 8 is an influential observation. 14 - 37 Chapter 14 c. 30 25 y 20 15 10 5 0 0 5 10 15 20 25 x The scatter diagram indicates that the observation x = 22, y = 19 is an influential observation. 52. a. The Minitab output is shown below: The regression equation is Shipment = 4.09 + 0.196 Media$ Predictor Constant MediaExp S = 5.044 Coef 4.089 0.19552 SE Coef 2.168 0.03635 R-Sq = 78.3% T 1.89 5.38 P 0.096 0.001 R-Sq(adj) = 75.6% Analysis of Variance Source Regression Residual Error Total DF 1 8 9 Unusual Observations Obs Media$ Shipment 1 120 36.30 SS 735.84 203.51 939.35 Fit 27.55 MS 735.84 25.44 F 28.93 SE Fit 3.30 Residual 8.75 P 0.001 St Resid 2.30R R denotes an observation with a large standardized residual b. Minitab identifies observation 1 as having a large standardized residual; thus, we would consider observation 1 to be an outlier. 14 - 38 Simple Linear Regression 53. a. The Minitab output is shown below: The regression equation is Price = 28.0 + 0.173 Volume Predictor Constant Volume Coef 27.958 0.17289 S = 17.53 SE Coef 4.521 0.07804 R-Sq = 17.0% T 6.18 2.22 P 0.000 0.036 R-Sq(adj) = 13.5% Analysis of Variance Source Regression Residual Error Total DF 1 24 25 SS 1508.4 7376.1 8884.5 Unusual Observations Obs Volume Price 22 230 40.00 MS 1508.4 307.3 Fit 67.72 F 4.91 SE Fit 15.40 P 0.036 Residual -27.72 St Resid -3.31RX R denotes an observation with a large standardized residual X denotes an observation whose X value gives it large influence. The Minitab output identifies observation 22 as having a large standardized residual and is an observation whose x value gives it large influence. The following residual plot verifies these observations. 2 1 Standardized Residual b. 0 -1 -2 -3 -4 30 40 50 Fitted Value 14 - 39 60 70 Chapter 14 54. a. The Minitab output is shown below: The regression equation is Salary = 707 + 0.00482 MktCap Predictor Constant MktCap Coef 707.0 0.0048154 S = 379.8 SE Coef 118.0 0.0008076 R-Sq = 66.4% T 5.99 5.96 P 0.000 0.000 R-Sq(adj) = 64.5% Analysis of Variance Source Regression Residual Error Total Unusual Obs 6 17 DF 1 18 19 Observations MktCap Salary 507217 3325.0 120967 116.2 SS 5129071 2596647 7725718 Fit 3149.5 1289.5 MS 5129071 144258 F 35.55 SE Fit 338.6 86.4 Residual 175.5 -1173.3 P 0.000 St Resid 1.02 X -3.17R R denotes an observation with a large standardized residual X denotes an observation whose X value gives it large influence. b. Minitab identifies observation 6 as an observation whose x value gives it large influence and observation 17 as having a large standardized residual. A standardized residual plot against the predicted values is shown below: Standardized Residuals 3 2 1 0 -1 -2 -3 -4 500 1000 1500 2000 2500 3000 3500 Predicted Values 55. No. Regression or correlation analysis can never prove that two variables are causally related. 56. The estimate of a mean value is an estimate of the average of all y values associated with the same x. The estimate of an individual y value is an estimate of only one of the y values associated with a particular x. 57. The purpose of testing whether 1 0 is to determine whether or not there is a significant relationship between x and y. However, rejecting 1 0 does not necessarily imply a good fit. For example, if 1 0 is rejected and r2 is low, there is a statistically significant relationship between x and y but the fit is not very good. 14 - 40 Simple Linear Regression 58. a. The Minitab output is shown below: The regression equation is Price = 9.26 + 0.711 Shares Predictor Constant Shares S = 1.419 Coef 9.265 0.7105 SE Coef 1.099 0.1474 R-Sq = 74.4% T 8.43 4.82 P 0.000 0.001 R-Sq(adj) = 71.2% Analysis of Variance Source Regression Residual Error Total DF 1 8 9 SS 46.784 16.116 62.900 MS 46.784 2.015 F 23.22 P 0.001 b. Since the p-value corresponding to F = 23.22 = .001 < = .05, the relationship is significant. c. r 2 = .744; a good fit. The least squares line explained 74.4% of the variability in Price. d. yˆ 9.26 .711(6) 13.53 59. a. The Minitab output is shown below: The regression equation is Options = - 3.83 + 0.296 Common Predictor Constant Common S = 11.04 Coef -3.834 0.29567 SE Coef 5.903 0.02648 R-Sq = 91.9% T -0.65 11.17 P 0.529 0.000 R-Sq(adj) = 91.2% Analysis of Variance Source Regression Residual Error Total DF 1 11 12 SS 15208 1341 16550 MS 15208 122 F 124.72 P 0.000 b. yˆ 3.83 .296(150) 40.57 ; approximately 40.6 million shares of options grants outstanding. c. r 2 = .919; a very good fit. The least squares line explained 91.9% of the variability in Options. 14 - 41 Chapter 14 60. a. 1300 1280 S&P500 1260 1240 1220 1200 1180 1160 10000 10200 10400 10600 10800 11000 DJIA b. The Minitab output is shown below: The regression equation is S&P500 = - 182 + 0.133 DJIA Predictor Constant DJIA Coef -182.11 0.133428 S = 6.89993 SE Coef 71.83 0.006739 R-Sq = 95.6% T -2.54 19.80 P 0.021 0.000 R-Sq(adj) = 95.4% Analysis of Variance Source Regression Residual Error Total DF 1 18 19 SS 18666 857 19523 MS 18666 48 F 392.06 P 0.000 c. Using the F test, the p-value corresponding to F = 392.06 is .000. Because the p-value =.05, we reject H 0 : 1 0 ; there is a significant relationship. d. With R-Sq = 95.6%, the estimated regression equation provided an excellent fit. e. yˆ 182.11 .133428DJIA= 182.11 .133428(11,000) 1285.60 or 1286. f. The DJIA is not that far beyond the range of the data. With the excellent fit provided by the estimated regression equation, we should not be too concerned about using the estimated regression equation to predict the S&P500. 14 - 42 Simple Linear Regression 61. The Minitab output is shown below: The regression equation is Expense = 10.5 + 0.953 Usage Predictor Constant X Coef 10.528 0.9534 S = 4.250 SE Coef 3.745 0.1382 R-sq = 85.6% T 2.81 6.90 p 0.023 0.000 R-sq(adj) = 83.8% Analysis of Variance SOURCE DF Regression 1 Residual Error 8 Total 9 Fit 39.13 SS 860.05 144.47 1004.53 Stdev.Fit 1.49 ( MS 860.05 18.06 95% C.I. 35.69, 42.57) F 47.62 p 0.000 95% P.I. 28.74, 49.52) ( a. ŷ = 10.5 + .953 Usage b. Since the p-value corresponding to F = 47.62 = .000 < = .05, we reject H0: 1 = 0. c. The 95% prediction interval is 28.74 to 49.52 or $2874 to $4952 d. Yes, since the expected expense is $3913. 62. a. The Minitab output is shown below: The regression equation is Defects = 22.2 - 0.148 Speed Predictor Constant Speed Coef 22.174 -0.14783 S = 1.489 SE Coef 1.653 0.04391 R-Sq = 73.9% T 13.42 -3.37 P 0.000 0.028 R-Sq(adj) = 67.4% Analysis of Variance Source Regression Residual Error Total DF 1 4 5 SS 25.130 8.870 34.000 MS 25.130 2.217 F 11.33 P 0.028 Predicted Values for New Observations New Obs Fit 1 14.783 SE Fit 0.896 ( 95.0% CI 12.294, 17.271) ( 95.0% PI 9.957, 19.608) b. Since the p-value corresponding to F = 11.33 = .028 < = .05, the relationship is significant. c. r 2 = .739; a good fit. The least squares line explained 73.9% of the variability in the number of defects. d. Using the Minitab output in part (a), the 95% confidence interval is 12.294 to 17.271. 14 - 43 Chapter 14 63. a. 9 8 7 Days 6 5 4 3 2 1 0 0 5 10 15 20 Distance There appears to be a negative linear relationship between distance to work and number of days absent. b. The MINITAB output is shown below: The regression equation is Days = 8.10 - 0.344 Distance Predictor Constant X S = 1.289 Coef 8.0978 -0.34420 SE Coef 0.8088 0.07761 R-sq = 71.1% T 10.01 -4.43 p 0.000 0.002 R-sq(adj) = 67.5% Analysis of Variance SOURCE DF Regression 1 Residual Error 8 Total 9 Fit 6.377 Stdev.Fit 0.512 SS 32.699 13.301 46.000 ( MS 32.699 1.663 95% C.I. 5.195, 7.559) F 19.67 ( p 0.002 95% P.I. 3.176, 9.577) c. Since the p-value corresponding to F = 419.67 is .002 < = .05. We reject H0 : 1 = 0. d. r2 = .711. The estimated regression equation explained 71.1% of the variability in y; this is a reasonably good fit. e. The 95% confidence interval is 5.195 to 7.559 or approximately 5.2 to 7.6 days. 14 - 44 Simple Linear Regression 64. a. The Minitab output is shown below: The regression equation is Cost = 220 + 132 Age Predictor Constant X S = 75.50 Coef 220.00 131.67 SE Coef 58.48 17.80 R-sq = 87.3% T 3.76 7.40 p 0.006 0.000 R-sq(adj) = 85.7% Analysis of Variance SOURCE DF Regression 1 Residual Error 8 Total 9 Fit 746.7 SS 312050 45600 357650 Stdev.Fit 29.8 ( MS 312050 5700 95% C.I. 678.0, 815.4) F 54.75 ( p 0.000 95% P.I. 559.5, 933.9) b. Since the p-value corresponding to F = 54.75 is .000 < = .05, we reject H0: 1 = 0. c. r2 = .873. The least squares line provided a very good fit. d. The 95% prediction interval is 559.5 to 933.9 or $559.50 to $933.90 65. a. The Minitab output is shown below: The regression equation is Points = 5.85 + 0.830 Hours Predictor Constant X S = 7.523 Coef 5.847 0.8295 SE Coef 7.972 0.1095 R-sq = 87.8% T 0.73 7.58 p 0.484 0.000 R-sq(adj) = 86.2% Analysis of Variance SOURCE DF Regression 1 Residual Error 8 Total 9 Fit 84.65 Stdev.Fit 3.67 SS 3249.7 452.8 3702.5 ( MS 3249.7 56.6 95% C.I. 76.19, 93.11) F 57.42 ( p 0.000 95% P.I. 65.35, 103.96) b. Since the p-value corresponding to F = 57.42 is .000 < = .05, we reject H0: 1 = 0. c. 84.65 points d. The 95% prediction interval is 65.35 to 103.96 14 - 45 Chapter 14 66. a. The Minitab output is shown below: The regression equation is Horizon = 0.275 + 0.950 S&P 500 Predictor Constant S&P 500 Coef 0.2747 0.9498 S = 2.664 SE Coef 0.9004 0.3569 R-Sq = 47.0% T 0.31 2.66 P 0.768 0.029 R-Sq(adj) = 40.3% Analysis of Variance Source DF Regression 1 Residual Error 8 Total 9 SS 50.255 56.781 107.036 MS 50.255 7.098 F 7.08 P 0.029 The market beta for Horizon is b1 = .95 b. Since the p-value = 0.029 is less than = .05, the relationship is significant. c. r2 = .470. The least squares line does not provide a very good fit. d. Texas Instruments has higher risk with a market beta of 1.46. 67. a. The Minitab output is shown below: The regression equation is Audit% = - 0.471 +0.000039 Income Predictor Constant Income Coef -0.4710 0.00003868 S = 0.2088 SE Coef 0.5842 0.00001731 R-Sq = 21.7% T -0.81 2.23 P 0.431 0.038 R-Sq(adj) = 17.4% Analysis of Variance Source Regression Residual Error Total DF 1 18 19 SS 0.21749 0.78451 1.00200 MS 0.21749 0.04358 F 4.99 P 0.038 Predicted Values for New Observations New Obs 1 Fit 0.8828 SE Fit 0.0523 95.0% CI ( 0.7729, 0.9927) 95.0% PI ( 0.4306, 1.3349) b. Since the p-value = 0.038 is less than = .05, the relationship is significant. c. r2 = .217. The least squares line does not provide a very good fit. d. The 95% confidence interval is .7729 to .9927. 14 - 46 Simple Linear Regression 68. a. Satisfaction Rating (%) 90 80 70 60 50 40 30 20 20 30 40 50 60 70 Top Five (%) b. There appears to be a positive linear relationship between the two variables. A portion of the Minitab output for this problem follows. The regression equation is Rating (%) = 9.4 + 1.29 Top Five (%) Predictor Constant Top Five (%) Coef 9.37 1.2875 SE Coef 10.24 0.2293 S = 6.62647 R-Sq = 74.1% T 0.92 5.61 P 0.380 0.000 R-Sq(adj) = 71.8% Analysis of Variance Source Regression Residual Error Total DF 1 11 12 SS 1383.9 483.0 1866.9 MS 1383.9 43.9 F 31.52 P 0.000 c. ŷ = 9.37 + 1.2875 Top Five (%) d. Since the p-value = .000 < α = .05, the relationship is significant. e. r 2 = .741; a good fit. The least squares line explained 74.1% of the variability in the satisfaction rating. f. rxy r 2 .741 .86 14 - 47