Stats for Strategy
HOMEWORK 6 (Topic 8, Part 2)
(revised Jan. 2016)

DIRECTIONS/SUGGESTIONS
• Textbook instructions for some exercises have been modified, or new parts added.
• Use 5% significance when the textbook doesn’t specify a significance level.
• Data files are available from the Stats website for some exercises.
• See How To Succeed With Stats Homework in the Stats Guide. Many students report that using this accounting system helps them get the most out of the homework and prepare solidly for quizzes and exams.

The letters A, B, C, etc. represent distinct business applications. (Often several textbook exercises are grouped together for each such application.)

A. Applying Population Regression Lines
• Exercise 10.3 (p. 532)
• Exercise 10.4

B. Research and Development Spending
• Exercise 10.5 (p. 534)
  Clarified directions for parts (a)–(e):
  * Part (a): Enter data into a MINITAB worksheet, then make a scatterplot (first Graph menu option).
  * Part (b): For the moment, ignore the textbook’s request to “add this line to your scatterplot.” Instead enter the data into your calculator and use the calculator’s regression function to find the least-squares equation.
  * Part (c): Look up the formula for regression sample variance s² on page 41 in the Notebook. Can you figure out how to do these hand calculations?
  * Part (d): The textbook is requesting the population regression model (with coefficients written as Greek letters).
  * Part (e): Don’t worry about “adding the point to your plot.” Instead calculate the prediction error and give a reason why the error is relatively large.
  Add more parts to Exercise 10.5:
  (f) Make a Fitted-Line Plot in MINITAB: Stat > Regression > Fitted Line Plot
    ◦ What value does the plot provide for regression standard deviation s? (Include all available decimal places in your answer.)
    ◦ Is the plot’s reported value for s the same (except for roundoff error) as you calculated by hand in part (c)?
  (g) Find the standard deviation s_y for Spending. Does regression on Years reduce the standard deviation for predicting Spending? Explain.
  (h) There’s another way besides comparing the standard deviations s and s_y to measure how well regression is working. How much of the variation of Spending within the data is explained by the variable Years (i.e., by the pattern over time)?

C. T-Bills and Inflation
• Open the data file Inflation. Get the full regression output for predicting the T-bill rate from inflation:
  (Stat > Regression > Regression > Fit Regression Model > OK)
  The textbook exercises below sometimes refer to “Excel regression output.” Just refer to your MINITAB output instead.
• Exercise 10.6 (p. 539)
• Exercise 10.7
  Tip: Briefly review pages 60–64 in the Notebook, including your answers to Example 2.
• Exercise 10.8
• Exercise 10.29 (p. 548)
  ◦ Tip for part (c): Use Four Steps.
  ◦ Tip for part (d): Also interpret the answer for your client.
• Exercise 10.42 (p. 554)
  Go ahead and reproduce the MINITAB output shown at the bottom of page 554:
  Stat > Regression > Regression > Predict . . . > (Enter 3.7 for INFLATION) > OK
• Exercise 10.52 (p. 557)
  Tip: Use the output from MINITAB 17 instead of the textbook’s output from MINITAB 16.

D. Earnings for Female Bank Employees
• Review Topic 8 Part 2 Example 7 on Notebook page 82. Open the data file Bank Wages.
• Reproduce the MINITAB Fitted Line Plot on page 82.
• Also reproduce the full regression output and calculate a prediction:
  1. First generate the model in MINITAB:
     Stat > Regression > Regression > Fit Regression Model > OK
  2. Then predict using the fitted model:
     Stat > Regression > Regression > Predict . . . > (Enter 125 for LOS) > OK
(a) The first entry in the database refers to a woman who earns $389/week based on 94 months of service. Find the predicted wages and residual for this person.
(b) Suppose you’d like to predict weekly wages for four other female bank employees.
These women’s lengths of service are one month, 100 months, 200 months, and 400 months. For each, calculate the predicted wages. Which of these predictions are supported by the range of the data? Which are considered risky, and why?
(c) The intercept β₀ of the population regression line measures the average starting wage of female bank employees (those with 0 months of experience). Find and interpret a 90% confidence interval for β₀.
(d) Notice that MINITAB predicts the weekly wage for a worker with 125 months experience as ŷ = $423.185. Also, the “standard error of the fit” is SE_μ̂ = $15.5530.
  ◦ Use a formula to calculate a 95% confidence interval for the mean wages of all bank workers who have 125 months experience, correct to the nearest cent.
  ◦ Is your calculated answer very close to MINITAB’s answer? Why aren’t the two answers exactly the same?
(e) Using only the MINITAB output you made at the beginning (i.e., without re-running MINITAB), calculate a 90% CI for the mean wages of all bank workers who have 125 months experience.
(f) Re-do the Fitted Line Plot with a special option:
  Stat > Regression > Fitted Line Plot > (Choose response = Wages, predictor = LOS) > Options > (Select Display confidence interval, Display prediction interval) > OK > OK
  Answer the following questions for part (f):
  1. Can you guess which color shows the confidence interval for mean wages for all employees? Which color shows the prediction interval?
  2. Use the graph to “eyeball” a rough 95% estimate of mean weekly wages for all employees with 100 months of service.
  3. Use the graph to “eyeball” a rough 95% estimate for the weekly wage of an individual employee with 200 months of service.

E. Stocks and Bonds
• Exercise 10.34 (p. 549)
  Ignore textbook directions for parts (c) and (d). Instead answer the following:
  (c) What fact about the scatterplot explains why the relationship between bond flows and stock flows is not significant?
  (d) Use the regression to predict net cash flow into bonds in the year 2016 with 95% certainty if the net cash flow into stocks in 2016 is $100 billion.
  (e) Use the regression to estimate average net cash flow into bonds with 95% certainty for all years in which net cash flow into stocks is $200 billion.

F. Computer Memory
• Exercise 10.39 (p. 550) (First enter these data into a MINITAB worksheet.)
  Change instructions for part (a): Make a Fitted-Line Plot and a Residuals Plot:
  Stat > Regression > Fitted Line Plot . . .
  . . . > Graphs > Select Residuals versus fits > OK > OK
  Questions to answer for part (a):
  1. What’s the sample regression equation for predicting DRAM over time?
  2. What does the Residuals Plot imply about using the regression of DRAM by Year?
  Add part (d): Predict DRAM for the year 2002 with 90% certainty.

G. Blood Alcohol Content
• Exercise 10.38 (p. 550)
  Add part (c): Find and interpret a 90% confidence interval for the slope β₁.
• Exercise 10.55 (p. 557)
  (Notice that the textbook mistakenly refers to a confidence interval instead of a prediction interval for Steve.)

H. Predicting Water Quality
• Exercise 10.19 (p. 546)
  Ignore the book’s directions for this exercise. Follow these directions instead:
  (a) Make a Fitted Line Plot. Also obtain the full MINITAB regression output in the Session Window. What is the sample regression equation to predict IBI?
• Exercise 10.20
  Ignore the book’s directions for this exercise. Follow these directions instead:
  (a) Make a Fitted Line Plot. Also obtain the full MINITAB regression output in the Session Window. What’s the sample regression equation to predict IBI?
  (b) If you had to choose between the predictors Area and Forest to predict IBI, which of these two predictors would you choose? Explain your choice, based on a 10% significance level.
  (c) Suppose you wish to predict with 90% certainty the IBI for Stream A, whose watershed covers 50 square km and is 30% forested.
  Provide the answer from your chosen model.
  (d) Now suppose that you discover that a data-entry error has been made in Table 10.4 in the textbook! The correct Area corresponding to (IBI = 32) is actually 121, not the number 21 which is currently listed. Find the error in the MINITAB worksheet and make the appropriate data correction. Then redo the Fitted Line Plot and Regression for the predictor Area. What’s the revised regression equation? The corrected data point (121, 32) is called an outlier. Can you explain why it has this name, based on the scatterplot? (Outliers can have a strong impact on regression results.)
  (e) Re-evaluate your earlier choice: Has the data correction changed your choice of best predictor variable? Why or why not? What’s your 90% prediction for the IBI of Stream A now?

(end of assignment)