Regression Problems:

1. A researcher wants to know if there is a relationship between the number of shopping centers in a state and the retail sales (in billions of dollars) of that state. A random sample of 8 states is listed below. After determining, via a scatter plot, that the data followed a linear pattern, the regression line was found. Using the given data and the given regression output, answer the following questions.

State    1      2      3      4      5      6      7      8
Num      630    370    616    700    430    568    1200   2976
Sales    15.5   7.5    13.9   18.7   8.2    13.2   23.0   87.3

Regression Statistics
Multiple R           0.991159
R Square             0.982396
Adjusted R Square    0.979461
Standard Error       3.770754
Observations         8

            Coefficients  Standard Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept   -4.93018      2.043656        -2.41243  0.0524    -9.93082   0.070468
Num         0.030273      0.001654        18.29815  1.72E-06  0.026224   0.034321

a. What is the equation of the regression line?
b. Interpret the slope in the words of the problem.
c. Find r^2 and interpret its meaning in the words of the problem.
d. Find the error for predicting the sales of a state with 1200 shopping centers.
e. Use the regression line to predict the sales for a state with 100 shopping centers.

2. A pharmaceutical company is investigating the relationship between advertising expenditures and the sales of some over-the-counter (OTC) drugs. The following data represent a sample of 10 common OTC drugs. Find the equation of the regression line, using Advertising dollars as the independent variable and Sales as the response variable. Interpret the slope of the line in the words of the problem. Find r^2 and interpret it in the words of the problem. Use the line to predict the Sales if Advertising dollars = $50 million. Note that AD = Advertising dollars in millions and S = Sales in millions of dollars.

AD    22    25    29    35    38    42    46    52    65    88
S     64    74    82    90    100   120   120   142   180   230

Calculator Output
a = 6.629, b = 2.569, r = .996

3. A chemical company wants to study the effect of extraction time on the efficiency of an extraction process.
They obtained a random sample of extraction times and the corresponding efficiency scores. The output from Excel is given below. What is the regression line? Interpret the slope and the coefficient of determination in the words of the problem. Use the regression line to estimate the efficiency for an extraction time of 20. You can assume 20 is in the range of the x’s. Is the model useful for predicting efficiency? Justify!

Regression Statistics
Multiple R    0.864
R Square      0.746
Std Error     5.139
Obs           15

            Coefficients  Std Error  t Stat    P-value   Lower 95%  Upper 95%
Intercept   39.022        4.173079   9.350943  3.9E-07   30.00684   48.03761
Time        0.764         0.123639   6.178365  3.33E-05  0.496782   1.030995

4. The following is output from Excel for a regression analysis. The researcher wanted to predict total cholesterol (mg/100ml) using weight (kg) as the predictor variable. Using the output, please answer the following questions.

a. Use ŷ to predict the total cholesterol for a subject who weighs 70 kg.
b. Find the coefficient of determination and explain what it means in the words of the problem.
c. Find a 95% confidence interval for B.
d. Do you think weight is a good predictor of total cholesterol? Explain.

SUMMARY OUTPUT

Regression Statistics
Multiple R        0.265293
R Square          0.070381
Standard Error    76.65431
Observations      25

            Coeff    Std Err  t Stat  P-value  Lower 95%  Upper 95%
Intercept   199.30   85.82    2.322   0.0294   21.77      376.825
Weight      1.62     1.229    1.320   0.1999   -0.921     4.1656

ANOVA
Source     df   SS       MS       F
Regress    1    10231    10231    1.741
Residual   23   135145   5875.8
Total      24   145377
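
The printed output for Problems 1 and 2 can be checked against the raw data. The following is a minimal Python sketch, not part of the original handout: it refits both lines by ordinary least squares and evaluates them at the x-values the questions ask about. The helper names fit_line and predict are hypothetical, chosen only for this illustration, and the error in Problem 1 is taken to mean the residual (observed minus predicted).

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x.

    Returns (a, b, r): intercept, slope, and correlation coefficient.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept
    r = sxy / sqrt(sxx * syy)  # correlation coefficient
    return a, b, r


def predict(a, b, x):
    """Evaluate the fitted line y-hat = a + b*x at a single x."""
    return a + b * x


# Problem 1: number of shopping centers vs. retail sales (billions of dollars)
num = [630, 370, 616, 700, 430, 568, 1200, 2976]
sales = [15.5, 7.5, 13.9, 18.7, 8.2, 13.2, 23.0, 87.3]
a1, b1, r1 = fit_line(num, sales)

# Residual for the sampled state with Num = 1200 (observed sales 23.0)
resid_1200 = 23.0 - predict(a1, b1, 1200)
# Prediction at Num = 100, which lies below the smallest sampled value (370)
pred_100 = predict(a1, b1, 100)

# Problem 2: advertising dollars vs. sales (both in millions of dollars)
ad = [22, 25, 29, 35, 38, 42, 46, 52, 65, 88]
s = [64, 74, 82, 90, 100, 120, 120, 142, 180, 230]
a2, b2, r2 = fit_line(ad, s)
pred_50 = predict(a2, b2, 50)

print(f"Problem 1: a = {a1:.5f}, b = {b1:.6f}, r^2 = {r1 ** 2:.6f}")
print(f"  residual at 1200: {resid_1200:.3f}, prediction at 100: {pred_100:.3f}")
print(f"Problem 2: a = {a2:.3f}, b = {b2:.3f}, r = {r2:.3f}")
print(f"  prediction at AD = 50: {pred_50:.2f}")
```

Because x = 100 is outside the range of the sampled data in Problem 1, the value computed for it is an extrapolation and should be read with that in mind.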