DS 303    Spring 2004    Exam #3
Name: _____KEY______________
Show all your work.

1. Mid-Valley Travel Agency (MVTA) has offices in 12 cities. The company believes that its monthly airline bookings are related to the mean income in those cities and has collected the following data:

   Location   Bookings   Income
       1        1098      43299
       2        1131      45021
       3        1120      40290
       4        1142      41893
       5         971      30620
       6        1403      48105
       7         855      27482
       8        1054      33025
       9        1081      34687
      10         982      28725
      11        1098      37892
      12        1387      46198

   A simple linear regression model was used to analyze the data. The partial computer output is given below:

   SUMMARY OUTPUT

   Regression Statistics
   Multiple R           0.879189
   R Square             0.772974
   Adjusted R Square    0.750271
   Standard Error       78.16735
   Observations         12

   ANOVA
                 df    SS          MS          F
   Regression     1    208036.3    208036.3    34.04775
   Residual      10    61101.35    6110.135
   Total         11    269137.7

                 Coefficients   Standard Error   t Stat      P-value
   Intercept     371.6758       128.5571         2.891133    0.016076
   Income        0.019381       0.003322

a) What is the estimated least squares regression line?

   ŷ = 371.68 + .019x, where ŷ = bookings and x = income.

b) What is the value of the coefficient of determination (R²)? What does it mean?

   R² = .77. About 77% of the variability in the number of bookings is explained by mean income.

c) Forecast the number of bookings when the mean income is $51385.

   ŷ = 371.68 + .019(51385) = 1347.99 ≈ 1348
   (This uses the slope rounded to .019; with the unrounded slope .019381 the forecast is about 1368.)

d) Is there a significant relation between monthly airline bookings and the mean income? Test this at the 5% level (state the null and alternative hypotheses, the value of your test statistic, the p-value or the decision rule, and your conclusion).

   Ho: β1 = 0
   Ha: β1 ≠ 0
   t = b1 / s(b1) = .019381 / .003322 = 5.834
   Since t = 5.834 > t(.025,10) = 2.228, reject Ho. There is a statistically significant relation between the number of bookings and the mean income.

e) Give a 95% confidence interval estimate of the average increase in monthly bookings. Explain what it means.
   b1 ± t* s(b1), where t* = t(.025,10) = 2.228
   .019381 ± 2.228(.003322)
   .019381 ± .0074
   (.012, .027)
   For every $1000 increase in mean income there will be 12 to 27 more bookings.

2. A tanning parlor located in a major shopping center near a large New England city has the following history of customers over the last four years (data are in hundreds of customers):

                  Number of   Moving    Centered Moving   CMA     Seasonal   Seasonal   Cycle
   Year  Quarter  Customers   Average   Average           Trend   Factor     Index      Factor
    1      1        3.50
    1      2        2.90
    1      3        2.00      2.9       2.975             2.90     .672       .73       1.026
    1      4        3.20      3.05      3.113             3.10    1.028       .98       1.004
    2      1        4.10      3.175     3.288             3.29    1.247      1.27        .999
    2      2        3.40      3.4       3.45              3.49     .986      1.01        .989
    2      3        2.90      3.5       3.638             3.68     .797                  .989
    2      4        3.60      3.775     3.913             3.88     .920                 1.009
    3      1        5.20      4.05      4.075             4.08    1.276                  .999
    3      2        4.50      4.1       4.213             4.27    1.068                  .987
    3      3        3.10      4.325     4.438             4.47     .699                  .993
    3      4        4.50      4.55      4.613             4.66     .976                  .990
    4      1        6.10      4.675     4.838             4.86    1.261                  .995
    4      2        5.00      5         5.188             5.05     .964                 1.027
    4      3        4.40      5.375                       5.25
    4      4        6.00                                  5.45

a) Find a four period moving average for each quarter.

   See the Moving Average column. Each entry is the mean of four consecutive quarters, for example (3.50 + 2.90 + 2.00 + 3.20)/4 = 2.9.

b) Find the centered moving average for the sample.

   See the Centered Moving Average column. Each entry is the mean of two adjacent moving averages, for example (2.9 + 3.05)/2 = 2.975.

c) Find the seasonal factors and the seasonal indexes.

   Seasonal factor = Customers / Centered Moving Average (see the Seasonal Factor column). The average seasonal factor (ASF) for each quarter is normalized so that the four indexes sum to 4:

   Quarter   ASF      Seasonal Index (SI)
   Q3         .723    (4/3.965)*.723  = .73
   Q4         .975    (4/3.965)*.975  = .98
   Q1        1.261    (4/3.965)*1.261 = 1.27
   Q2        1.006    (4/3.965)*1.006 = 1.01
   Sum       3.965

d) Find the cycle factors.

   Cycle factor = Centered Moving Average / CMA Trend (see the Cycle Factor column).

e) Use the multiplicative decomposition method to forecast the number of customers for each quarter of year 4.

   Forecast = (CMA Trend)(Seasonal Index)(Cycle Factor)
   Q1: (4.86)(1.27)(.995)  = 6.14
   Q2: (5.05)(1.01)(1.027) = 5.24
   Q3: (5.25)(.73)(.993)   = 3.81
   Q4: (5.45)(.98)(.990)   = 5.29

Multiple Choice Questions
Select the best answer.

1. In the linear model, the slope coefficient βi measures the expected change in Y per unit change in Xi given the other independent variables are fixed.
   A) True
   B) False
   C) Not enough information

2. In a test of the distribution of the anti-fungus activity of a chemical compound, fungus is grown in petri dishes with different concentrations of the compound and the diameter of the fungus colonies is measured after one day. There are 20 dishes, two at each of 10 concentrations. A plot of diameter against concentration shows a straight-line pattern, with higher concentrations giving smaller diameters. Least squares regression is used to analyze the data. What distribution is used in the test of the hypothesis that concentration has no effect on diameter?
   A) t distribution with 9 degrees of freedom.
   B) t distribution with 8 degrees of freedom.
   C) t distribution with 19 degrees of freedom.
   D) t distribution with 18 degrees of freedom.
   E) None of the above.

3. Stepwise regression is an approach to choosing the independent variables to be included in a multiple regression equation.
   A) True
   B) False
   C) Not enough information

4. A company has computed a seasonal index for its quarterly sales. Which of the following statements is not correct?
   A) The sum of the four quarterly seasonal index numbers is 4.
   B) An index of .75 for quarter-one sales indicates that sales were 25 percent lower than average sales.
   C) An index of 1.10 indicates sales 10% above the norm.
   D) The index for any quarter must be between 0 and 1.
   E) The average index is 1.

5. The long-term trend of a time series in the decomposition model is estimated using
   A) a nonlinear time trend.
   B) the actual unsmoothed data.
   C) the centered moving average data.
   D) the series of seasonal factors.
   E) All of the above.

6. The F-statistic reported in standard multiple regression computer packages tests which hypothesis?
   A) H0: β1 ≠ β2 ≠ β3 ≠ ... ≠ βK ≠ 0.
   B) H0: β1 + β2 + β3 + ... + βK = 0.
   C) H0: β1 = β2 = β3 = ... = βK = 0.
   D) H0: The set of independent variables has a significant linear influence on the dependent variable.

7. The Y-intercept of the simple regression model
   A) rarely has a useful interpretation.
   B) almost always has a useful interpretation.
   C) is always a positive number.
   D) is always positive when the correlation between the dependent and independent variable is positive.
   E) All of the above.

8. The Y-intercept of a regression line is -14 and the slope is 3.5. Which of the following is not correct?
   A) When Y increases by one, X increases by 3.5.
   B) When X increases by one, Y increases by 3.5.
   C) The regression line crosses the Y-axis at -14.
   D) X and Y are positively related.
   E) None of the above.

9. Income is used to predict savings. For the regression equation Y = 1,000 + .10X, which of the following is true?
   A) Y is income, X is savings, and income is the independent variable.
   B) Y is income, X is savings, and savings is the independent variable.
   C) Y is savings, X is income, and savings is the independent variable.
   D) Y is savings, X is income, and income is the independent variable.

10. The least squares procedure minimizes the sum of
   A) the residuals.
   B) squared maximum error.
   C) absolute errors.
   D) squared residuals.
   E) None of the above.

11. In the simple linear regression model, testing the null hypothesis that the slope coefficient is zero uses what sampling distribution?
   A) Normal.
   B) Chi-square.
   C) t distribution with n-1 degrees of freedom.
   D) Standard Normal.
   E) None of the above.
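
The least squares and slope-test ideas in the questions above are the same computations behind Problem 1. As a cross-check, the following minimal Python sketch refits the MVTA data from first principles and recomputes the slope, intercept, R², the t statistic for the slope, and the 95% confidence interval. The function name `simple_linear_regression` and the dictionary keys are illustrative choices, not part of the exam, and the critical value 2.228 is taken from the key (t(.025,10)) rather than computed.

```python
import math

# MVTA data from Problem 1 (12 cities).
bookings = [1098, 1131, 1120, 1142, 971, 1403, 855, 1054, 1081, 982, 1098, 1387]
income = [43299, 45021, 40290, 41893, 30620, 48105,
          27482, 33025, 34687, 28725, 37892, 46198]


def simple_linear_regression(x, y):
    """Least squares fit of y = b0 + b1*x (hypothetical helper for illustration)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)

    b1 = sxy / sxx                 # slope
    b0 = y_bar - b1 * x_bar        # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    s = math.sqrt(sse / (n - 2))   # standard error of the regression
    se_b1 = s / math.sqrt(sxx)     # standard error of the slope
    return {
        "b0": b0,
        "b1": b1,
        "r2": 1 - sse / sst,
        "se_b1": se_b1,
        "t": b1 / se_b1,           # test statistic for Ho: beta1 = 0
    }


fit = simple_linear_regression(income, bookings)

# Critical value t(.025, 10) as used in the key; taken from a t table, not computed here.
T_CRIT = 2.228
ci = (fit["b1"] - T_CRIT * fit["se_b1"], fit["b1"] + T_CRIT * fit["se_b1"])

print(f"y-hat = {fit['b0']:.2f} + {fit['b1']:.6f} x")
print(f"R^2 = {fit['r2']:.4f}, t = {fit['t']:.3f}")
print(f"95% CI for slope: ({ci[0]:.4f}, {ci[1]:.4f})")
print("Reject Ho" if abs(fit["t"]) > T_CRIT else "Do not reject Ho")
```

Multiplying the interval endpoints by 1000 gives the "per $1000 of mean income" reading used in part 1(e).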