Chapter 12 Correlation and Simple Regression Analysis

Chapter 12: Correlation and Simple Regression Analysis Chapter 12 Correlation and Simple Regression Analysis LEARNING OBJECTIVES The overall objective of this chapter is to give you an understanding of bivariate regression analysis, thereby enabling you to: 1. 2. 3. 4. 5. 6. 7. 8. 9. Define correlation and when is it used Define simple regression and its use. Interpret the slope and intercept of the equation. Understand the usefulness of residual analysis in examining the fit of the regression line to the data and in testing the assumptions underlying regression analysis. Compute a standard error of the estimate and interpret its meaning. Compute a coefficient of determination and interpret it. Test hypotheses about the slope of the regression model and interpret the results. Estimate values of y using the regression model. Develop a linear trend line and use it to forecast. CHAPTER TEACHING STRATEGY This chapter is about all aspects of simple (bivariate, linear) regression. Early in the chapter through scatter plots, the student begins to understand that the object of simple regression is to fit a line through the points. Fairly soon in the process, the student learns how to solve for slope and y intercept and develop the equation of the regression line. Most of the remaining material on simple regression is to determine how good the fit of the line is and if assumptions underlying the process are met. The student begins to understand that by entering values of the independent variable into the regression model, predicted values can be determined. The question then becomes: Are the predicted values good estimates of the actual dependent values? One rule to emphasize is that the regression model should not be used to predict for independent variable values that are outside the range of values used to construct the model. MINITAB issues a warning for such activity when attempted. There are many instances where the relationship between x and y are linear over a given interval but outside the interval the relationship becomes curvilinear or unpredictable. Of course, with this caution having been given, many forecasters use such regression models to © 2010 John Wiley & Sons Canada, Ltd. 388 Chapter 12: Correlation and Simple Regression Analysis extrapolate to values of x outside the domain of those used to construct the model. Such forecasts are introduced in section 12.9, “Using Regression to Develop a Forecasting Trend Line”. Whether the forecasts obtained under such conditions are any better than "seat of the pants" or "crystal ball" estimates remains to be seen. The concept of residual analysis is a good one to show graphically and numerically how the model relates to the data and the fact that it more closely fits some points than others, etc. A graphical or numerical analysis of residuals demonstrates that the regression line fits the data in a manner analogous to the way a mean fits a set of numbers. The regression model passes through the points such that the vertical distances from the actual y values to the predicted values will sum to zero. The fact that the residuals sum to zero points out the need to square the errors (residuals) in order to get a handle on total error. This leads to the sum of squares error and then on to the standard error of the estimate. In addition, students can learn why the process is called least squares analysis (the slope and intercept formulas are derived by calculus such that the sum of squares of error is minimized - hence "least squares"). Students can learn that by examining the values of se, the residuals, r2, and the t ratio to test the slope they can begin to make a judgment about the fit of the model to the data. Many of the chapter problems ask the student to comment on these items (se, r2, etc.). It is my view that for many of these students, the most important facet of this chapter lies in understanding the "buzz" words of regression such as standard error of the estimate, coefficient of determination, etc. because they may only interface regression again as some type of computer printout to be deciphered. The concepts then may be more important than the calculations. © 2010 John Wiley & Sons Canada, Ltd. 389 Chapter 12: Correlation and Simple Regression Analysis CHAPTER OUTLINE 12.1 Correlation 12.2 Introduction to Simple Regression Analysis 12.3 Determining the Equation of the Regression Line 12.4 Residual Analysis: Testing the Regression Model I-Predicted values, Residuals, and Sum of Squares of Error Using Residuals to Test the Assumptions of the Regression Model Using the Computer for Residual Analysis 12.5 Standard Error of the Estimate: Testing the Regression Model II-Standard Error of the Estimate and r Squared 12.6 Coefficient of Determination: Testing the Regression Model II-Standard Error of the Estimate and r Squared Relationship Between r and r2 12.7 Hypothesis Tests for the Slope of the Regression Model and Testing the Overall Model Testing the Slope Testing the Overall Model 12.8 Estimation Confidence Intervals to Estimate the Conditional Mean of y: µy/x Prediction Intervals to Estimate a Single Value of y 12.9 Using Regression to Develop a Forecasting Trend Line Determining the Equation of the Trend Line Forecasting Using the Equation of the Trend Line Alternative Coding for Time Periods 12.10 Interpreting Computer Output © 2010 John Wiley & Sons Canada, Ltd. 390 Chapter 12: Correlation and Simple Regression Analysis KEY TERMS Bivariate regression Coefficient of correlation (r) Coefficient of Determination (r2) Confidence Interval Correlation Dependent Variable Deterministic Model Heteroscedasticity Homoscedasticity Independent Variable Least Squares Analysis Outliers Pearson product-moment correlation coefficient Prediction Interval Probabilistic Model Regression Analysis Residual Residual Plot Scatter Plot Simple Regression Standard Error of the Estimate (se) Sum of Squares of Error (SSE) Time-series data Trend © 2010 John Wiley & Sons Canada, Ltd. 391 Chapter 12: Correlation and Simple Regression Analysis SOLUTIONS TO PROBLEMS IN CHAPTER 12 12.1 x = 80 y2 = 815 r = x2 = 1,148 xy = 624  xy  y = 69 n=7  x y n  ( x)   ( y ) 2  2 2  x    y   n   n   = 2 (80)( 69) 7 = 2  (80)   (69) 2  1,148   815   7  7   624  r = r = 12.2  164.571 = –0.927 177.533 x = 1,087 y2 = 878,686 r =  164.571 = (233.714)(134.857) x2 = 322,345 xy= 507,509  xy   ( x) 2  x  n  2 y = 2,032 n=5  x y n  ( y ) 2   y  n   2    = (1,087)( 2,032) 5 =  (1,087) 2   (2,032) 2  322,345   878,686   5 5    507,509  r = r = 65,752.2 65,752.2 = = .975 67,449.5 (86,031.2)(52,881.2) © 2010 John Wiley & Sons Canada, Ltd. 392 Chapter 12: Correlation and Simple Regression Analysis 12.3 Air Canada (x) 0.75 0.76 0.84 0.85 0.86 0.86 WestJet (y) 11.92 12.09 12.25 11.85 11.78 11.74 x = 4.92 x2 = 4.0474 r = y = 71.63 xy = 58.7181 2 y = 855.3355  xy   ( x) 2  x  n  2  x y n  ( y ) 2  2 y     n    = (4.92)(71.63)  0185 = – .37 6  0.0500  (4.92) 2   (71.63) 2  4 . 0474  855 . 3355     6  6   58.7181  r = 12.4 x = 59,788,660 y = 90,545 xy = 415,640,524,882 r =  xy  x2 = 428,249,006,044,248 y2 = 713,381,745 n = 23  x y n  ( x)   ( y ) 2  2 2  x    y   n   n   = 2 (59,788,660)(90,545) 23 = 2 2  (59,788,660)   (90,545)  428,249,006,044,248   713,381,745   23 23    415,640,524,882  r = © 2010 John Wiley & Sons Canada, Ltd. 393 Chapter 12: Correlation and Simple Regression Analysis 180,268,167,504 r = = .578 (272,827,968,453,000)(356,929,700) 12.5 Correlation between Year 1 and Year 2: x = 17.09 y = 15.12 xy = 48.97 r = x2 = 58.7911 y2 = 41.7054 n=8  xy   x y n  ( x)   ( y ) 2  2 2  x    y   n   n   = 2 (17.09)(15.12) 8 = 2 2  (17.09)   (15.12)  58.7911   41.7054   8 8    48.97  r = r = 16.6699 16.6699 = = .975 17.1038 (22.28259)(13.1286) Correlation between Year 2 and Year 3: x = 15.12 x2 = 41.7054 y = 15.86 y2 = 42.0396 xy = 41.5934 n=8 r =  xy   x y n  ( x)   ( y ) 2  2 2  x    y   n   n   = 2 (15.12)(15.86) 8 = 2  (15.12)   (15.86) 2  41.7054   42.0396   8 8    41.5934  r = © 2010 John Wiley & Sons Canada, Ltd. 394 Chapter 12: Correlation and Simple Regression Analysis r = 11.618 11.618 = = .985 11.795 (13.1286)(10.59715) Correlation between Year 1 and Year 3: x = 17.09 x2 = 58.7911 y = 15.86 y2 = 42.0396 xy = 48.5827 n=8 r =  xy   x y n  ( x)   ( y ) 2  2 2  x    y   n   n   = 2 (17.09)(15.86) 8 2  (17.09)   (15.86) 2  58 . 7911  42 . 0396     8 8    48.5827  r = r = 14.702 (22.2826)(10.5972) = 14.702 = .957 15.367 The years 2 and 3 are the most correlated with r = .985. © 2010 John Wiley & Sons Canada, Ltd. 395 Chapter 12: Correlation and Simple Regression Analysis 12.6 x 12 21 28 8 20 y 17 15 22 19 24 30 25 y 20 15 10 5 0 0 5 10 15 20 25 30 x x = 89 x2 = 1,833 b1 = SS xy SS xx y = 97 y2 = 1,935 x y  xy   n  ( x) x   2 (89)(97) 5 = 0.16238 (89) 2 1,833  5 1,767  = 2 n b0 =  y  b  x  97  0.1624 89 n 1 n 5 xy = 1,767 n=5 5 = 16.5093 ŷ = 16.50965 + 0.162379 x © 2010 John Wiley & Sons Canada, Ltd. 396 Chapter 12: Correlation and Simple Regression Analysis 12.7 x_ 140 119 103 91 65 29 24 x x2 b1 = _y_ 25 29 46 70 88 112 128 y = 571 = 58,293 SS xy SS xx x y  xy   n  ( x) x   2 2 n b0 = = 498 y2 = 45,154 (571)( 498) 7 (571) 2 58,293  7 30,099  =  y  b  x  498  (0.898) 571 n 1 n 7 xy = 30,099 n=7 7 = –0.898 = 144.414 ŷ = 144.414 – 0.898 x © 2010 John Wiley & Sons Canada, Ltd. 397 Chapter 12: Correlation and Simple Regression Analysis 12.8 (Advertising) x 12.5 3.7 21.6 60.0 37.6 6.1 16.8 41.2 (Sales) y 148 55 338 994 541 89 126 379 x = 199.5 x2 = 7,667.15 SS xy b1 = SS xx y y2 x y  xy   n  ( x) x   2 (199.5)( 2,670) 8 (199.5) 2 7,667.15  8 107,610.4  = 2 n b0 =  y  b  x  2,670  15.24 199.5 n 1 n xy = 107,610.4 n=8 = 2,670 = 1,587,328 8 8 = 15.240 = -46.292 ŷ = -46.292 + 15.240 x 12.9 (Prime) x 16 6 8 4 7 x = 41 x2 = 421 r =  xy   ( x) 2  x  n  2 (Bond) y 5 12 9 15 7 y = 48 y2 = 524  x y xy = 333 n=5 n  ( y ) 2   y  n   2    = (41)( 48) 5 2  (41)   (48) 2  421  524   5   5   333  r = © 2010 John Wiley & Sons Canada, Ltd. 398 Chapter 12: Correlation and Simple Regression Analysis r =  60.6 = – .828 (84.8)(63.2) Since r = – 0.828, the bond rate can be predicted by the prime interest rate. b1 = SS xy SS xx x y  xy   n  ( x) x   2 2 n b0 = (41)( 48) 5 = –0.715 (41) 2 421  5 333  =  y  b  x  48  (0.715) 41 n 1 n 5 5 = 15.460 ŷ = 15.460 – 0.715 x 12.10 Firm Births (x) 58.1 55.4 57 58.5 57.4 58 Bankruptcies (y) 34.3 35 38.5 40.1 35.5 37.9 x = 344.4 y = 221.3 y2 = 8188.41 xy = 12,708.08  x y r =  xy  x2 = 19,774.78 n  ( x)   ( y ) 2  2 2  x    y   n   n   n=6 = 2 © 2010 John Wiley & Sons Canada, Ltd. 399 Chapter 12: Correlation and Simple Regression Analysis (344.4)( 221.3) 6 2  (344.4)   (221.3) 2  19 , 774 . 78  8188 . 41     6 6    12,708.08  r = 5.46 = 0.428 (6.22)(26.1283) r = r = 0.428 represents a relatively moderate positive relationship between the number of firm births and the number of business bankruptcies. . b1 = SS xy SS xx x y  xy   n  ( x) x   2 (344.4)( 221.3) 6 = (344.4) 2 19,774.78  6 12,708.08  = 2 n b1 = 0.878 b0 =  y  b  x  221.3  (0.878) 344.4 n 1 n 6 6 = –13.503 ŷ = –13.503 + 0.878 x The slope of the line, b1 = 0.878, means that for every 10,000 increase of firm births, the number of business bankruptcies is predicted to increase by 878. 12.11 Number of Farms (x) 574,993 480,877 430,503 366,110 338,552 318,361 293,089 280,043 276,548 246,923 229,373 x = 3,835,372 Average Size (y) 122 145 164 188 202 207 231 242 246 273 295 y = 2315 x2 = 1,451,587,168,444 © 2010 John Wiley & Sons Canada, Ltd. 400 Chapter 12: Correlation and Simple Regression Analysis y2 = 515,797 b1 = SS xy SS xx xy = 752,175,501 x y  xy   n  ( x) x   2 n = 11 = 2 n (3,835,372)( 2315) 752,175,501  11 = = –0.000481 (3,835,372) 2 1,451,587,168,444  11 b0 =  y  b  x  2315  (0.000481) 3,835,372 n 1 n 11 11 = 378.2 ŷ = 378.2 – 0.0005 x 12.12 r = Steel (x) 90.6 88.8 89.7 79.7 84.3 88.8 91.3 95.2 95.5 98.5 Orders (y) 2.74 2.87 2.93 2.87 2.98 3.09 3.36 3.61 3.75 3.95 x = 902.4 y = 32.15 x2 = 81,705.14 y2 = 104.9815 xy = 2917.906 n = 10  xy   ( x) 2  x  n  2  x y n  ( y ) 2   y  n   2    = © 2010 John Wiley & Sons Canada, Ltd. 401 Chapter 12: Correlation and Simple Regression Analysis (902.4)(32.15) 10 2  (902.4)   (32.15) 2  81 , 705 . 14  104 . 9815   10   10   2917.906  r = 16.69 = 0.794 (272.564)(1.61925) r = r = 0.794 represents a relatively strong positive relationship between the raw steel production and the annual new orders. . b1 = SS xy SS xx x y  xy   n  ( x) x   2 2 n b0 = (902.4)(32.15) 10 = 0.06123 (902.4) 2 81,705.14  10 2917.906  =  y  b  x  32.15  (0.06123) 902.4 n 1 n 10 10 = –2.31040 ŷ = –2.31040 + 0.06123 x © 2010 John Wiley & Sons Canada, Ltd. 402 Chapter 12: Correlation and Simple Regression Analysis 4.5 4 3.5 New Orders 3 2.5 2 1.5 1 0.5 0 0 20 40 60 80 100 120 Raw Steel Production 12.13 x 15 8 19 12 5 y 47 36 56 44 21 ŷ = 13.625 + 2.303 x Residuals: x 15 8 19 12 5 y 47 36 56 44 21 ŷ 48.1694 32.0489 57.3811 41.2606 25.1401 Residuals (y- ŷ ) -1.1694 3.9511 -1.3811 2.7394 -4.1401 © 2010 John Wiley & Sons Canada, Ltd. 403 Chapter 12: Correlation and Simple Regression Analysis 12.14 x 12 21 28 8 20 y 17 15 22 19 24 Predicted ( ŷ ) 18.4582 19.9196 21.0563 17.8087 19.7572 Residuals (y- ŷ ) -1.4582 -4.9196 0.9437 1.1913 4.2428 ŷ = 16.51 + 0.162 x 12.15 x 140 119 103 91 65 29 24 y 25 29 46 70 88 112 128 Predicted ( ŷ ) Residuals (y- ŷ ) 18.6597 6.3403 37.5229 -8.5229 51.8948 -5.8948 62.6737 7.3263 86.0281 1.9720 118.3648 -6.3648 122.8561 5.1439 ŷ = 144.414 – 0.898 x 12.16 x 12.5 3.7 21.6 60.0 37.6 6.1 16.8 41.2 y 148 55 338 994 541 89 126 379 Predicted ( ŷ ) 144.2053 10.0954 282.8873 868.0945 526.7236 46.6708 209.7364 581.5868 Residuals (y- ŷ ) 3.7947 44.9047 55.1127 125.9055 14.2764 42.3292 -83.7364 -202.5868 ŷ = – 46.292 + 15.240x © 2010 John Wiley & Sons Canada, Ltd. 404 Chapter 12: Correlation and Simple Regression Analysis 12.17 x 16 6 8 4 7 y 5 12 9 15 7 Predicted ( ŷ ) 4.0259 11.1722 9.7429 12.6014 10.4576 Residuals (y- ŷ ) 0.9741 0.8278 -0.7429 2.3986 -3.4576 ŷ = 15.460 – 0.715 x 12.18 _ x_ 58.1 55.4 57.0 58.5 57.4 58.0 _ y_ 34.3 35.0 38.5 40.1 35.5 37.9 Predicted ( ŷ ) 37.4978 35.1277 36.5322 37.8489 36.8833 37.4100 Residuals (y- ŷ ) -3.1978 -0.1277 1.9678 2.2511 -1.3833 0.4900 The residual for x = 58.1 is relatively large, but the residual for x = 55.4 is quite small. 12.19 _x_ _ y_ 5 47 7 38 11 32 12 24 19 22 25 10 Predicted ( ŷ ) 42.2756 38.9836 32.3997 30.7537 19.2317 9.3558 Residuals (y- ŷ ) 4.7244 -0.9836 -0.3997 -6.7537 2.7683 0.6442 ŷ = 50.506 - 1.646 x No apparent violation of assumptions © 2010 John Wiley & Sons Canada, Ltd. 405 Chapter 12: Correlation and Simple Regression Analysis 12.20 Distance (x) 1,245 425 1,346 973 255 865 1,080 296 Cost y 2.64 2.31 2.45 2.52 2.19 2.55 2.40 2.37 ( ŷ ) 2.5376 2.3322 2.5629 2.4694 2.2896 2.4424 2.4962 2.2998 (y- ŷ ) .1024 -.0222 -.1129 .0506 -.0996 .1076 -.0962 .0702 ŷ = 2.2257 – 0.00025 x No apparent violation of assumptions 12.21 Error terms appear to be non independent © 2010 John Wiley & Sons Canada, Ltd. 406 Chapter 12: Correlation and Simple Regression Analysis 12.22 There appears to be nonlinear regression 12.23 The Residuals vs. Fits graphic is strongly indicative of a violation of the homoscedasticity assumption of regression. Because the residuals are very close together for small values of x, there is little variability in the residuals at the left end of the graph. On the other hand, for larger values of x, the graph flares out indicating a much greater variability at the upper end. Thus, there is a lack of homogeneity of error across the values of the independent variable. 12.24 SSE = y2 – b0y – b1xy = 1,935 – (16.50965)(97) – 0.16238(1767) = 46.6385 se  SSE 46.6385 = 3.94  n2 3 Approximately 68% of the residuals should fall within ±1se. 3 out of 5 or 60% of the actual residuals fell within ± 1se. 12.25 SSE = y2 – b0y – b1xy = 45,154 – 144.414(498) – (–.89824)(30,099) = SSE = 272.0 se  SSE 272.0 = 7.376  n2 5 6 out of 7 = 85.7% fall within + 1se 7 out of 7 = 100% fall within + 2se © 2010 John Wiley & Sons Canada, Ltd. 407 Chapter 12: Correlation and Simple Regression Analysis 12.26 SSE = y2 – b0y – b1xy = 1,587,328 – (–46.29)(2,670) – 15.24(107,610.4) = SSE = 70,940 se  SSE 70,940 = 108.7  n2 6 Six out of eight (75%) of the sales estimates are within $108.7 million. 12.27 SSE = y2 – b0y – b1xy = 524 – 15.46(48) – (–0.71462)(333) = 19.8885 se  SSE 19.8885 = 2.575  n2 3 Four out of five (80%) of the estimates are within 2.575 of the actual rate for bonds. This amount of error is probably not acceptable to financial analysts. 12.28 _ x_ _ y_ Predicted ( ŷ ) 58.1 55.4 57.0 58.5 57.4 58.0 34.3 35.0 38.5 40.1 35.5 37.9 37.4978 35.1277 36.5322 37.8489 36.8833 37.4100 SSE = se   ( y  yˆ ) 2 Residuals (y– ŷ ) ( y  yˆ ) 2 -3.1978 10.2259 -0.1277 0.0163 1.9678 3.8722 2.2511 5.0675 -1.3833 1.9135 0.4900 0.2401 2  ( y  yˆ ) = 21.3355 = 21.3355 SSE 21.3355 = 2.3095  n2 4 This standard error of the estimate indicates that the regression model is with + 2.3095(1,000) bankruptcies about 68% of the time. In this particular problem, 5/6 or 83.3% of the residuals are within this standard error of the estimate. © 2010 John Wiley & Sons Canada, Ltd. 408 Chapter 12: Correlation and Simple Regression Analysis 12.29  ( y  yˆ ) SSE = se  12.30 (y– ŷ )2 22.3200 .9675 .1597 45.6125 7.6635 .4150 (y- ŷ )2 = 77.1382 (y– ŷ ) 4.7244 -0.9836 -0.3996 -6.7537 2.7683 0.6442 2 = 77.1382 SSE 77.1382 = 4.391  n2 4 (y- ŷ ) .1024 -.0222 -.1129 .0506 -.0996 .1076 -.0962 .0702 (y- ŷ )2 .0105 .0005 .0127 .0026 .0099 .0116 .0093 .0049 2 (y– ŷ ) = .0620 SSE =  ( y  yˆ ) 2 = .0620 SSE .0620 = .1017  n2 6 The model produces estimates that are ±.1017 or within about 10 cents 68% of the time. However, the range of milk costs is only 45 cents for this data. se  12.31 Volume (x) 728.6 497.9 439.1 377.9 375.5 363.8 276.3 Sales (y) 10.5 48.1 64.8 20.1 11.4 123.8 89.0 n=7 x = 3059.1 2 x = 1,464,071.97 y = 367.7 y = 30,404.31 2 © 2010 John Wiley & Sons Canada, Ltd. xy = 141,558.6 409 Chapter 12: Correlation and Simple Regression Analysis b1 = –.1504 b0 = 118.257 ŷ = 118.257 – .1504x SSE = y2 – b0y – b1xy = 30,404.31 – (118.257)(367.7) – (–0.1504)(141,558.6) = 8211.6245 se  SSE 8211.6245 = 40.526  n2 5 This is a relatively large standard error of the estimate given the sales values (ranging from 10.5 to 123.8). 12.32 r2 = 1  SSE 46.6385 = .123  1 2 2 ( y ) ( 97 )  1,935   y2  n 5 This is a low value of r2 12.33 r2 = 1  SSE 272.121 = .972  1 2 2 ( y ) ( 498 )  45,154   y2  n 7 This is a high value of r2. 12.34 r2 = 1  SSE 70,940 = .898  1 2 ( y ) (2,670) 2 2 1,587,328  y  n 8 This value of r2 is relatively high. © 2010 John Wiley & Sons Canada, Ltd. 410 Chapter 12: Correlation and Simple Regression Analysis 12.35 r2 = 1  SSE 19.8885  1 2 ( y ) (48) 2 2 524  y  n 5 = .685 This value of r2 is a modest value. 68.5% of the variation of y is accounted for by x but 31.5% is unaccounted for. 12.36 r2 = 1  SSE 21.33547 = .183  1 2 ( y ) (221.3) 2 2 8,188.41  y  n 6 This value is a low value of r2. Only 18.3% of the variability of y is accounted for by the x values and 81.7% are unaccounted for. 12.37 Year 1997 1998 1999 2000 2001 2002 2003 2004 2005 Long-term Interest Rates (%) 6.14 5.28 5.54 5.93 5.48 5.30 4.80 4.58 4.07 Real GDP Growth (%) 4.2 4.1 5.5 5.2 1.8 2.9 1.8 3.3 2.9 Equation of the regression line: Long term interest rate = 4.3557 + 0.2498 (Real GDP) Standard error of the model = 0.6031 r-squared = 0.2592 Real GDP is not a good predictor of the Long-term Interest Rate because the t-value for the Real GDP Growth is not statistically significant. © 2010 John Wiley & Sons Canada, Ltd. 411 Chapter 12: Correlation and Simple Regression Analysis se 12.38 sb = x 2  ( x) 2  n 3.94 (89) 2 1833  5 = .2498 b1 = 0.162 Ho: 1 = 0 Ha: 1  0  = .05 This is a two-tail test, /2 = .025 df = n – 2 = 5 – 2 = 3 t.025,3 = ±3.182 t = b1  1 0.162  0 = 0.65  sb .2498 Since the observed t = 0.65 < t.025,3 = 3.182, the decision is to fail to reject the null hypothesis. se 12.39 sb = x 2  ( x ) 2  n 7.376 (571) 2 58,293  7 = .068145 b1 = – 0.898 Ho: 1 = 0 Ha: 1  0  = .01 Two-tail test, /2 = .005 df = n – 2 = 7 – 2 = 5 t.005,5 = ±4.032 t= b1  1  0.898  0 = –13.18  sb .068145 Since the observed t = –13.18 < t.005,5 = – 4.032, the decision is to reject the null hypothesis. © 2010 John Wiley & Sons Canada, Ltd. 412 Chapter 12: Correlation and Simple Regression Analysis se 12.40 sb = x 2  ( x )  2 n 108.7 (199.5) 2 7,667.15  8 = 2.095 b1 = 15.240 Ho: 1 = 0 Ha: 1  0  = .10 For a two-tail test, /2 = .05 df = n – 2 = 8 – 2 = 6 t.05,6 = + 1.943 t = b1  1 15.240  0 = 7.27  sb 2.095 Since the observed t = 7.27 > t.05,6 = 1.943, the decision is to reject the null hypothesis. se 12.41 sb = x 2  ( x) n 2  2.575 (41) 2 421  5 = .27963 b1 = – 0.715 Ho: 1= 0 Ha: 1  0  = .05 For a two-tail test, /2 = .025 df = n – 2 = 5 – 2 = 3 t.025,3 = ±3.182 t = b1  1  0.715  0 = – 2.56  sb .27963 Since the observed t = –2.56 > t.025,3 = –3.182, the decision is to fail to reject the null hypothesis. © 2010 John Wiley & Sons Canada, Ltd. 413 Chapter 12: Correlation and Simple Regression Analysis se 12.42 sb = x 2  ( x) 2 n  2.3095 (344.4) 2 19,774.78  6 = 0.926025 b1 = 0.878 Ho: 1 = 0 Ha: 1  0  = .05 For a two-tail test, /2 = .025 df = n – 2 = 6 – 2 = 4 t.025,4 = ±2.776 t = b1  1 0.878  0 = 0.948  sb .926025 Since the observed t = 0.948 < t.025,4 = 2.776, the decision is to fail to reject the null hypothesis. 12.43 F = 8.26 with a p-value of .021. The overall model is significant at  = .05 but not at  = .01. For simple regression, t = F = 2.874 t.05,8 = 1.86 but t.01,8 = 2.896. The slope is significant at  = .05 but not at  = .01. 12.44 x0 = 25 95% confidence df = n – 2 = 5 – 2 = 3 x  x  89 = n x = 89 5 /2 = .025 t.025,3 = ±3.182 17.8 x2 = 1,833 se = 3.94 © 2010 John Wiley & Sons Canada, Ltd. 414 Chapter 12: Correlation and Simple Regression Analysis ŷ = 16.5 + 0.162(25) = 20.55 ŷ ± t /2,n-2 se 1  n ( x0  x) 2 ( x) 2 2 x   n 20.55 ± 3.182(3.94) 1 ( 25  17.8) 2  (89) 2 5 1,833  5 = 20.55 ± 3.182(3.94)(.63903) = 20.55 ± 8.01 12.54 < E(y25) < 28.56 12.45 x0 = 100 For 90% confidence, /2 = .05 df = n – 2 = 7 – 2 = 5 t.05,5 = ±2.015 x  x  571 n 7 = 81.57143 x= 571 x2 = 58,293 se = 7.377 ŷ = 144.414 – .898(100) = 54.614 ŷ ± t /2,n-2 se 1  1  n ( x0  x) 2 = 2 ( x )   x2  n 54.614 ± 2.015(7.377) 1  1 (100  81.57143) 2 =  (571) 2 7 58,293  7 54.614 ± 2.015(7.377)(1.08252) = 54.614 ± 16.091 38.523 < y < 70.705 For x0 = 130, ŷ = 144.414 – 0.898(130) = 27.674 © 2010 John Wiley & Sons Canada, Ltd. 415 Chapter 12: Correlation and Simple Regression Analysis y ± t /2,n-2 se 1  1  n ( x0  x) 2 ( x ) 2 2 x  n = 1 (130  81.57143) 2 27.674 ± 2.015(7.377) 1   = (571) 2 7 58,293  7 27.674 ± 2.015(7.377)(1.1589) = 27.674 ± 17.227 10.447 < y < 44.901 The confidence interval of y for x0 = 130 is wider than the confidence interval of y for x0 = 100, because x0 = 100 is nearer to the value of x = 81.57 than is x0 = 130. 12.46 x0 = 20 For 98% confidence, /2 = .01 df = n – 2 = 8 – 2 = 6 t.01,6 = + 3.143 x  x  199.5 n 8 = 24.9375 x = 199.5 x2 = 7,667.15 se = 108.8 ŷ = – 46.29 + 15.24(20) = 258.51 ŷ ± t /2,n-2 se 1  n ( x0  x) 2 ( x) 2 2 x  n 258.51 ± (3.143)(108.8) 1  8 (20  24.9375) 2 (199.5) 2 7,667.15  8 258.51 ± (3.143)(108.8)(0.36614) = 258.51 ± 125.20 133.31 < E(y20) < 383.71 For single y value: © 2010 John Wiley & Sons Canada, Ltd. 416 Chapter 12: Correlation and Simple Regression Analysis ŷ ± t /2,n-2 se 1  1  n ( x0  x) 2 ( x ) 2 2 x  n 1 258.51 ± (3.143)(108.8) 1   8 (20  24.9375) 2 (199.5) 2 7,667.15  8 258.51 ± (3.143)(108.8)(1.06492) = 258.51 ± 364.16 – 105.65 < y < 622.67 The prediction interval for the single value of y is wider than the confidence interval for the average value of y because the average is more towards the middle and individual values of y can vary more than values of the average. 12.47 x0 = 10 For 99% confidence df = n – 2 = 5 – 2 = 3 t.005,3 = + 5.841 x  x  41 n 5 /2 = .005 = 8.20 x = 41 x2 = 421 se = 2.575 ŷ = 15.46 – 0.715(10) = 8.31 ŷ ± t /2,n-2 se 1  n ( x0  x) 2 ( x) 2 2 x  n 8.31 ± 5.841(2.575) 1 (10  8.2) 2  (41) 2 5 421  5 = 8.31 ± 5.841(2.575)(.488065) = 8.31 ± 7.34 0.97 < E(y10) < 15.65 If the prime interest rate is 10%, we are 99% confident that the average bond rate is between 0.97% and 15.65%. © 2010 John Wiley & Sons Canada, Ltd. 417 Chapter 12: Correlation and Simple Regression Analysis 12.48 Year (x) 2004 2005 2006 2007 2008 Fertilizer ($ millions) (y) 3,481.4 2,697.2 3,609.2 4,637.7 6,867.9 x = 10,030 x2 = 20,120,190 b1 = b0 = SS xy SS xx  y = 21,293.4 y2 = 101,097,670.14  xy  x 2  x y  n ( x) 2 n (10,030)( 21,293.4) 5 = 871.35 (10,030) 2 20,120,190  5 42,723,273.9  =  y  b  x  21,293.4  871.35 10,030 n 1 n 5 xy = 42,723,273.9 n=5 5 = – 1,743,669.42 ŷ = – 1,743,669.42+ 871.35 x ŷ (2010) = – 1,743,669.42+ 871.35 (2010) = 7,744.08 ($millions) 12.49 Month May 2009 June 2009 July 2009 August 2009 September 2009 October 2009 November 2009 Long-Term Interest Rates 3.22 3.47 3.42 3.473 3.366 3.423 3.41 Trend line: Long term interest rate = 3.3371 + 0.015071 (Recoded Month) Forecast for December 2009 = 3.46 © 2010 John Wiley & Sons Canada, Ltd. 418 Chapter 12: Correlation and Simple Regression Analysis 12.50 Year 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 New Housing Prices Index 102.0 101.2 99.3 100.0 101.0 101.9 104.1 107.0 111.3 116.7 123.2 Equation of the trend line = New housing price index =94.0946+2.01(Recoded year) Forecast for new housing prices for 2005 = 118.2145 12.51 x = 36 y = 44 xy = 188 r = x2 = 256 y2 = 300 n =7  xy   x y n  ( x)   ( y ) 2  2 2  x    y   n   n   2  38.2857 r = (70.85714)( 23.42857) 12.52 x 5 7 3 16 12 = (36)( 44) 7 2  (36)   (44) 2  256  300     7  7   188  =  38.2857 = –.940 40.7441 y 8 9 11 27 15 © 2010 John Wiley & Sons Canada, Ltd. 419 Chapter 12: Correlation and Simple Regression Analysis 9 13 x = 52 y = 83 xy = 865 x2 = 564 y2 = 1,389 n=6 a) ŷ = 2.6941 + 1.2853 x b) ŷ (Predicted Values) 9.1206 11.6912 6.5500 23.2588 18.1177 14.2618 c) (y- ŷ ) residuals -1.1206 -2.6912 4.4500 3.7412 -3.1176 -1.2618 (y- ŷ )2 1.2557 7.2426 19.8025 13.9966 9.7194 1.5921 SSE = 53.6089 se  d) b1 = 1.2853 b0 = 2.6941 SSE 53.6089 = 3.661  n2 4 r2 = 1  SSE 53.6089  1 2 ( y ) (83) 2 2 1 , 389  y  n 6 = .777 © 2010 John Wiley & Sons Canada, Ltd. 420 Chapter 12: Correlation and Simple Regression Analysis e) Ho: 1 = 0 Ha: 1  0  = .01 Two-tailed test, /2 = .005 df = n – 2 = 6 – 2 = 4 t.005,4 = ±4.604 se sb = x t = 2  ( x ) 2  n 3.661 (52) 2 564  6 = .34389 b1  1 1.2853  0 = 3.74  sb .34389 Since the observed t = 3.74 < t.005,4 = 4.604, the decision is to fail to reject the null hypothesis. f) 12.53 The r2 = 77.74% is modest. There appears to be some prediction with this model. The slope of the regression line is not significantly different from zero using  = .01. However, for  = .05, the null hypothesis of a zero slope is rejected. The standard error of the estimate, se = 3.661 is not particularly small given the range of values for y (11 – 3 = 8). x 53 47 41 50 58 62 45 60 x = 416 y = 57 xy = 3,106 a) y 5 5 7 4 10 12 3 11 x2 = 22,032 y2 = 489 n=8 b1 = 0.355 b0 = –11.335 ŷ = –11.335 + 0.355 x © 2010 John Wiley & Sons Canada, Ltd. 421 Chapter 12: Correlation and Simple Regression Analysis b) ŷ (Predicted Values) 7.48 5.35 3.22 6.415 9.255 10.675 4.64 9.965 c) (y- ŷ ) residuals -2.48 -0.35 3.78 -2.415 0.745 1.325 -1.64 1.035 (y- ŷ )2 6.1504 0.1225 14.2884 5.8322 0.5550 1.7556 2.6896 1.0712 SSE = 32.4649 SSE 32.4649 = 2.3261  n2 6 d) se = e) r2 = 1  f) Ho: 1 = 0 Ha: 1  0 SSE 32.4649  1 2 ( y ) (57) 2 2 489  y   8 n = .608  = .05 Two-tailed test, /2 = .025 df = n – 2 = 8 – 2 = 6 t.025,6 = ±2.447 se sb = x 2  ( x) 2 n  2.3261 (416) 2 22,032  8 = 0.116305 © 2010 John Wiley & Sons Canada, Ltd. 422 Chapter 12: Correlation and Simple Regression Analysis t = b1  1 0.355  0 = 3.05  sb .116305 Since the observed t = 3.05 > t.025,6 = 2.447, the decision is to reject the null hypothesis. The population slope is different from zero. g) This model produces only a modest r2 = .608. Almost 40% of the variance of y is unaccounted for by x. The range of y values is 12 – 3 = 9 and the standard error of the estimate is 2.33. Given this small range, the se is not small. 12.54 x = 1,263 y = 417 xy = 88,288 b0 = 25.42778 x2 = 268,295 y2 = 29,135 n=6 b1 = 0.209369 SSE = y2 - b0y - b1xy = 29,135 – (25.42778)(417) – (0.209369)(88,288) = 46.845468 r2 = 1  SSE 46.845468 = .695  1 2 153.5 ( y ) 2 y  n Coefficient of determination = r2 = .695 12.55 a) x0 = 60 x = 524 y = 215 xy = 15,125 x2 = 36,224 y2 = 6,411 n=8 se = 3.201 95% Confidence Interval df = n – 2 = 8 – 2 = 6 b1 = .5481 b0 = –9.026 /2 = .025 t.025,6 = ±2.447 ŷ = –9.026 + 0.5481(60) = 23.86 © 2010 John Wiley & Sons Canada, Ltd. 423 Chapter 12: Correlation and Simple Regression Analysis x  x  524 n 8 ŷ ± t /2,n-2 se = 65.5 ( x0  x) 2 ( x) 2 2 x   n 1  n 1  8 23.86 + 2.447(3.201) (60  65.5) 2 (524) 2 36,224  8 23.86 + 2.447(3.201)(.375372) = 23.86 + 2.94 20.92 < E(y60) < 26.8 b) x0 = 70 ŷ 70 = –9.026 + 0.5481(70) = 29.341 ŷ + t/2,n-2 se 1  1  n ( x0  x) 2 ( x ) 2 2 x  n 1 29.341 + 2.447(3.201) 1   8 (70  65.5) 2 (524) 2 36,224  8 29.341 + 2.447(3.201)(1.06567) = 29.341 + 8.347 20.994 < y < 37.688 c) The confidence interval for (b) is much wider because part (b) is for a single value of y which produces a much greater possible variation. In actuality, x0 = 70 in part (b) is slightly closer to the mean (x) than x0 = 60. However, the width of the single interval is much greater than that of the average or expected y value in part (a). 12.56 Year Cost © 2010 John Wiley & Sons Canada, Ltd. 424 Chapter 12: Correlation and Simple Regression Analysis 1 2 3 4 5 56 54 49 46 45 x = 15 x2 = 55 b1 = b0 = y = 250 y2 = 12,594 SS xy SS xx   xy  x 2  x y  n ( x) 2 (15)( 250) 5 = –3 (15) 2 55  5 720  = n  y  b  x  250  (3) 15 1 n n 5 xy = 720 n=5 5 = 59 ŷ = 59 – 3 x ŷ (7) = 59 – 3(7) = 38 ($ millions) 12.57 y = 267 x = 21 xy = 1,256 y2 = 15,971 x2 = 101 n =5 b0 = 9.234375 b1 = 10.515625 SSE = y2 – b0y – b1xy = 15,971 – (9.234375)(267) – (10.515625)(1,256) = 297.7969 r2 = 1  SSE 297.7969 = .826  1 2 1 , 713 . 2 ( y )   y2  n If a regression model would have been developed to predict number of cars sold by the number of sales people, the model would have had an r2 of 82.6%. This result means that 82.6% of the variance in the number of cars sold is explained by variation in the number of salespeople. 12.58 n = 12 x = 548 x2 = 26,592 © 2010 John Wiley & Sons Canada, Ltd. 425 Chapter 12: Correlation and Simple Regression Analysis y = 5940 y2 = 3,211,546 b1 = 10.626383 xy = 287,908 b0 = 9.728511 ŷ = 9.728511 + 10.626383 x SSE = y2 – b0y – b1xy = 3,211,546 – (9.728511)(5940) – (10.626383)(287,908) = 94,337.9762 se  SSE 94,337.9762 = 97.1277  n2 10 SSE 94,337.9762 = .652  1 2 271 , 246 ( y )   y2  n se 97.1277  = 2.4539 2 2 ( x ) ( 548 )  26,592   x2  n 12 r2 = 1  sb = t = 10.626383  0 = 4.33 2.4539 If  = .01, then t.005,10 = +3.169. Since the observed t = 4.33 > t.005,10 = 3.169, the decision is to reject the null hypothesis. The slope is significantly larger than zero. Average family income is a significant predictor of number of rentals at a store. 12.59 Sales(y) Number of Units(x) 17.1 7.9 4.8 4.7 4.6 4.0 2.9 2.7 2.7 12.4 7.5 6.8 8.7 4.6 5.1 11.2 5.1 2.9 y = 51.4 x2 = 538.97 y2 = 460.1 xy = 440.46 x = 64.3 n=9 © 2010 John Wiley & Sons Canada, Ltd. 426 Chapter 12: Correlation and Simple Regression Analysis b0 = –0.863565 b1 = 0.92025 ŷ = –0.863565 + 0.92025 x SSE = y2 – b0y – b1xy = 460.1 – (–0.863565)(51.4) – (0.92025)(440.46) = 99.153926 SSE 99.153926 = .405  1 2 166 . 55 ( y )   y2  n  x y  xy  n = 2 2  ( x)   ( y )  2 2  x    y   n   n   r2 = 1  r = (64.3)(51.4) 9  (64.3) 2   (51.4) 2  538 . 97  460 . 1     9  9   440.46  r = 73.23556 r = = 0.636 (79.58222)(166.54889) Since r = 0.405 = 0.636, the restaurant chain’s sales can be predicted by its number of units. 12.60 Year (x) 1 2 3 4 5 6 7 8 Total Employment (1,000s) (y) 11,152 10,935 11,050 10,845 10,776 10,764 10,697 9,234 © 2010 John Wiley & Sons Canada, Ltd. 427 Chapter 12: Correlation and Simple Regression Analysis 9 10 11 12 13 9,223 9,158 9,147 9,313 9,353 x = 91 x2 = 819 b1 = b0 = SS xy SS xx y = 131,647 y2 = 1,342,147,171   xy  x 2  x y  n ( x) 2 n (91)(131,647) 13 = –198.9725 (91) 2 819  13 885,316  =  y  b  x  131,647  (198.9725) 91 n 1 n 13 xy = 885,316 n = 13 13 = 11,519.4998 ŷ = 11,519.4998 – 198.9725 x ŷ (2011) = ŷ (17) = 11,519.4998 – 198.9725 (17) = 8,136.9673 ŷ (2011) = 8,137 (1,000s) 12.61 x 2.4 3.9 1.1 2.8 2.4 1.6 2.8 2.2 2.2 0.1 y 6.7 7 7.8 7.9 7.2 6.9 6.1 6 6.1 8.4 © 2010 John Wiley & Sons Canada, Ltd. 428 Chapter 12: Correlation and Simple Regression Analysis r =  xy   x y n  ( x)   ( y ) 2  2 2  x    y   n   n   = 2 (21.5)( 70.1) 10 2  (21.5)   (70.1) 2  55 . 87  497 . 57     10   10   146.94  r = r =  3.775 = –.489 (9.645)(6.169) The r value of –.489 denotes a moderate negative correlation. 12.62 x = 3,337,494 x2 = 736,622,100,116 xy = 965,566,451,720 r =  xy   ( x) 2  x  n  2 y = 4,406,569 y2 = 1,274,609,514,995 n = 16  x y n  ( y ) 2   y  n   2    = (3,337,494)( 4,406,569) 16  (3,337,494) 2   (4,406,569) 2  736 , 622 , 100 , 116  1 , 274 , 609 , 514 , 995     16 16    965,566,451,720  r = 46,385,351,839.6 r = = 0.934 (40,442,962,613.8)(60,993,868,009.9) The r value of 0.934 denotes a strong positive correlation. b1 = 1.1469 b0 = 36,168.0228 ŷ = 36,168.0228+ 1.1469 x © 2010 John Wiley & Sons Canada, Ltd. 429 Chapter 12: Correlation and Simple Regression Analysis ŷ (200,000) = 36,168.0228+ 1.1469 (200,000) = 265,548.0228 ŷ + t/2,n-2se 1  n ( x0  x) 2 ( x) 2 2 x  n  = .05, t.025,14 = + 2.145 x0 = 200,000, n = 16 x = 208,593.375 SSE = y2 – b0y – b1xy = 1,274,609,514,995– (36,168.0228)( 4,406,569) – (1.1469)( 965,566,451,720) = = 7,824,463,455.5 se  SSE 7,824,463,455.5 = 23,640.8597  n2 14 Confidence Interval = 265,548.0228+ (2.145)( 23,640.8597) 1  16 (200,000  208,593.375) 2 = (3,337,494) 2 736,622,100,116  16 265,548.0228 + 12,861.2626 252,686.7542 to 278,409.2854 H0: 1 = 0 Ha: 1  0  = .05 t = df = 14 Table t.025,14 = + 2.145 b1  0 1.1469  0 1.1469 = 9.7559   sb   .11756 23 , 640 . 8597    40,442,962,613.8    Since the observed t = 9.7559 > t.025,14 = 2.145, the decision is to reject the null © 2010 John Wiley & Sons Canada, Ltd. 430 Chapter 12: Correlation and Simple Regression Analysis hypothesis. 12.63 x = 10.797 y = 516.8 xy = 1091.2081 x2 = 20.673549 y2 = 61,899.06 n=7 b1 = 73.15543 b0 = –39.00846 ŷ = –39.00846+ 73.15543 x SSE = y2 – b0 y – b1 xy SSE = 61,899.06 – (–39.00846)(516.8) – (73.15543)( 1091.2081) = 2,230.83435 se  SSE 2,230.83435 = 21.12266  n2 5 r2 = 1  SSE 2,230.83435 = 1 – .09395 = .90605  1 2 ( y ) (516.8) 2 2 61,899.06  y  n 7 12.64 x = 542,911 y = 858,326 y2 = 77,988,664,304 b1 = SS xy SS xx   xy  x 2 n = 10  x y  n ( x) 2 n x2 = 31,262,604,453 xy = 49,065,104,049 (542,911)(858,326) 10 (542,911) 2 31,262,604,453  10 49,065,104,049  = b1 = 1.379481 b0 = ŷ  y  b  x  858,326  (1.379481) 542,911 n 1 n 10 10 = 10,939.059081 = 10,939 + 1.3795 x r2 for this model is 0.78801183 . There is predictability in this model. Test for slope: t = 5.45 with a p-value of 0.0006. The null hypothesis that the population slope is zero is rejected. © 2010 John Wiley & Sons Canada, Ltd. 431 Chapter 12: Correlation and Simple Regression Analysis Time-Series Trend Line: x = 55 x2 = 385 b1 = b0 = SS xy SS xx y = 542,911 y2 = 31,262,604,453   xy  x  x y  2 xy = 3,102,042 n = 10 (55)(542,911) 10 = 1406.442424 (55) 2 385  10 3,102,042  n ( x) 2 = n  y  b  x  542,911  (1406.442424) 55 1 n n 10 10 = 46,555.666667 ŷ = 46555.666667+ 1406.442424 x ŷ (2008) = ŷ (11) = 46,555.666667 + 1406.442424 (11) = 62,026.5 12.65 x = 1034.7 x2 = 217,604.49 xy = 2,356,711.6 b1 = b0 = SS xy SS xx  y = 18,840 y2 = 62,486,228 n=7  xy  x 2  x y  n ( x) 2 (1034.7)(18,840) 7 = – 6.620826 (1034.7) 2 217,604.49  7 2,356,711.6  = n  y  b  x  18,840  (6.620826) 1034.7 n 1 n 7 7 = 3670.081237 ŷ = 3670.081237– 6.620826 x SSE = y2 – b0y – b1xy = 62,486,228 – (3670.081237)(18,840) – (–6.620826)(2,356,711.6) SSE = 8,945,274.9307 © 2010 John Wiley & Sons Canada, Ltd. 432 Chapter 12: Correlation and Simple Regression Analysis SSE 8,945,274.9307 = 1337.5556  n2 5 se  r2 = 1  SSE 8,945,274.9307 = 1 – .7594 = .2406  1 2 2 ( y ) ( 18 , 840 )  62,486,228   y2  n 7 The r value of –0.4905 denotes a moderate negative correlation. H0: 1 = 0 Ha: 1 0  = .05  x   t.025,5 = + 2.571 2 SSxx = t = x 2 n  217,604.49  (1034.7) 2 = 64,661.048571 7 b1  0  6.620826  0 = – 1.26  se 1337.5556 SS xx 64,661.048571 Since the observed t = – 1.26> t.025,5 = – 2.571, the decision is to fail to reject the null hypothesis. 12.66 Let Water Use = y and Temperature = x x = 196 y = 3,880 xy = 112,006 x2 = 5,834 y2 = 2,188,612 n=8 b1 = 16.42054 b0 = 82.69671 ŷ = 82.69671 + 16.42054 x ŷ 38 = 82.69671 + 16.42054 (38) = 707 SSE = y2 – b0 y – b1 xy SSE = 2,188,612– (82.69671)( 3,880) – (16.42054)( 112,006) = 28,549.76196 se  SSE 28,549.76196 = 68.980386  n2 6 © 2010 John Wiley & Sons Canada, Ltd. 433 Chapter 12: Correlation and Simple Regression Analysis r2 = 1  SSE 28,549.76196 = 1 – .093053 = .9069  1 2 2 ( y ) ( 3 , 880 )  2,188,612   y2  n 8 Testing the slope: Ho: 1 = 0 Ha: 1  0  = .01 Since this is a two-tailed test, /2 = .005 df = n – 2 = 8 – 2 = 6 t.005,6 = ±3.707 se sb = x t = 2  ( x ) 2 n  68.980386 (196) 2 5,834  8 = 2.147266 b1  1 16.42054  0 = 7.65  sb 2.147266 Since the observed t = 7.65 > t.005,6 = 3.707, the decision is to reject the null hypothesis. 12.67 The F value for overall predictability is 7.12 with an associated p-value of .0205 which is significant at  = .05. It is not significant at alpha of .01. The coefficient of determination is .372 with an adjusted r2 of .32. This represents the modest predictability. The standard error of the estimate is 982.219 which in units of 1,000 labourers means that about 68% of the predictions are within 982,219 of the actual figures. The regression model is: Number of Union Members = 22,348.97 – 0.0524 Labour Force. For a labour force of 100,000 (thousand, actually 100 million), substitute x = 100,000 and get a predicted value of 17,108.97 (thousand) which is actually 17,108,970 union members. 12.68 The Residual Diagnostics from output indicate a relatively healthy set of residuals. The Histogram indicates that the error terms are generally normally distributed. This is somewhat confirmed by the semi straight line Normal Plot of Residuals. However, the Residuals vs. Fitted Values graph indicates that there may be some heteroscedasticity with greater error variance for small x values. © 2010 John Wiley & Sons Canada, Ltd. 434

Chapter 12 Correlation and Simple Regression Analysis

Related documents

Products

Support

Chapter 12 Correlation and Simple Regression Analysis

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib