MGEB12: Quantitative Methods in Economics-II
TUTORIAL-6

Question-1

The following set of data was collected to determine the effects of sleep deprivation on students' ability to solve problems. The amount of sleep deprivation varied, with 8, 12, 16, 20 and 24 hours without sleep. A total of ten subjects participated in the study. A set of simple addition problems was administered to each subject after his or her sleep deprivation period (X), and the number of errors was recorded (Y). These results were obtained:

Number of Errors (Y)    Hours Without Sleep (X)
 8                       8
 6                       8
 6                      12
10                      12
 8                      16
14                      16
14                      20
12                      20
16                      24
12                      24

\( \sum X = 160 \), \( \sum X^2 = 2880 \), \( \sum Y = 106 \), \( \sum Y^2 = 1236 \), \( \sum XY = 1848 \)

a) [5 Points] Find the regression line and interpret the intercept and slope coefficients.

Solution:

\[ b_1 = \frac{s_{xy}}{s_x^2} = \frac{1848 - \frac{160 \times 106}{10}}{2880 - \frac{160^2}{10}} = \frac{152}{320} = 0.475 \]

\[ b_0 = (106 / 10) - 0.475 \times (160 / 10) = 3 \]

\[ \hat{Y} = 3 + 0.475 \times X \]

Interpretation:

Slope: A one-hour increase in sleep deprivation is associated with an average increase of 0.475 in the number of errors.

Intercept: Since the sample does not include a sleep deprivation of zero, the intercept has no meaningful interpretation. (If you assume sleep deprivation can be zero, then the intercept is the average number of errors for someone with zero sleep deprivation.)

b) [4 Points] Graph the estimated line. Carefully identify the sources of variation for (X=12, Y=6).

[Figure: the estimated line \( \hat{Y} = 3 + 0.475X \) with intercept 3; at X = 12 the graph marks the observed value Y = 6, the fitted value \( \hat{Y} = 8.7 \) and the sample mean \( \bar{Y} = 10.6 \).]

Regression = \( (8.7 - 10.6)^2 \)
Error = \( (6 - 8.7)^2 \)
Total = \( (6 - 10.6)^2 \)

c) [3 Points] Compute the coefficient of determination, R-squared. What does it mean?

Solution:

\[ R^2 = \left( \frac{s_{xy}}{s_x s_y} \right)^2 = b_1^2 \left( \frac{s_x^2}{s_y^2} \right) = 0.475^2 \times \frac{2880 - \frac{160^2}{10}}{1236 - \frac{106^2}{10}} \cong 0.6423 \]

Approximately 64.23% of the variation of the number of errors around its mean is explained by the regression (i.e. by the variation in hours of sleep deprivation).

d) [4 Points] Make an inference about the number of errors for an individual with a sleep deprivation of 10. [Use α = 0.05 for the interval.]
Solution:

\[ SST = (n - 1) s_y^2 = 1236 - \frac{106^2}{10} = 112.4 \]

\[ R^2 = 1 - \frac{SSE}{112.4} = 0.6423 \quad \Rightarrow \quad SSE \cong 40.2 \]

\[ s_\varepsilon = \sqrt{\frac{40.2}{10 - 2}} = \sqrt{\frac{40.2}{8}} \cong 2.2417 \]

The prediction interval for an individual at \( x_g = 10 \), where \( \hat{y} = 3 + 0.475 \times 10 = 7.75 \), is:

\[ \hat{y} \pm t_{\alpha/2} \, s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n - 1) s_x^2}} \]

\[ 7.75 \pm 2.306 \times 2.2417 \sqrt{1 + \frac{1}{10} + \frac{(10 - 16)^2}{320}} \]

\[ 7.75 \pm 2.306 \times 2.2417 \times 1.101136 \]

\[ 7.75 \pm 5.6922 \]

With 95% confidence, an individual with a sleep deprivation of 10 hours is predicted to make between about 2.06 and 13.44 errors.

e) [4 Points] A student claims that, based on the estimated slope of the regression, a one-hour increase in sleep deprivation is associated with an increase of less than half an error. Test this claim at a 1% significance level.

Solution:

\[ H_0: \beta_1 = 0.5 \qquad H_1: \beta_1 < 0.5 \]

\[ s_{b_1} = \frac{s_\varepsilon}{\sqrt{(n - 1) s_x^2}} = \frac{2.2417}{\sqrt{2880 - \frac{160^2}{10}}} = 0.1253 \]

\[ t_{stat} = \frac{b_1 - \beta_1}{s_{b_1}} = \frac{0.475 - 0.5}{0.1253} \cong -0.1995 \]

From the table, t(0.01, 8) = -2.896. Since -0.1995 > -2.896, we cannot reject the null. Therefore we cannot conclude that the slope coefficient is significantly less than 0.5, i.e. we cannot conclude that a one-hour increase in sleep deprivation increases the number of errors by less than 0.5.

Multiple Choice Questions

1. An analyst uses a regression analysis to predict the resale price of a car (Y) from the age of the car (X). Price is measured in thousands of dollars and Age is measured in years. The regression provides the following line: \( \hat{Y} = 14 - \frac{5}{6} X \). If Age had instead been reported in months, the estimated regression line would have been: (A)
A. \( \hat{Y} = 14 - \frac{5}{72} X \)
B. \( \hat{Y} = 14 - \frac{5}{2} X \)
C. \( \hat{Y} = 14 - 10 X \)
D. \( \hat{Y} = 168 - 10 X \)
E. \( \hat{Y} = 14 - \frac{5}{6} X \)

2. When the estimated slope coefficient in the simple regression model is zero, then: (C)
A. \( R^2 = \bar{Y} \)
B. \( 0 < R^2 < 1 \)
C. \( \hat{Y} = \bar{Y} \)
D. The variance of the regression is zero
E. The regression is significant only if the intercept is significant.

3. You are interested in exploring the determinants of successful high schools. You argue that there should be a relationship between the average parent's income and high school success.
After running a regression of the percentage of students going on to college (Y) on average parent's income (X), you find that one school has a large negative residual. Which of the following is true? (B)
A. This school has very low values for both variables.
B. This school performed much worse than expected.
C. This school has very high values for both variables.
D. This school performed much better than expected.
E. The regression suffers from heteroscedasticity.

4. Which of the following statements is correct: (C)
I: The plot of standardized residuals is used to check for normality
II: The plot of residuals against the fitted value can be used to check for heteroscedasticity
III: When the regression is not significant the estimated intercept is equal to the mean of the dependent variable
A. I only
B. I and II
C. I, II, and III
D. I and III
E. II and III

5. Consider the following summary statistics for a random sample.
\( \bar{X} = 20 \); \( s_X = 3 \)
\( \bar{Y} = 10 \); \( s_Y = 5 \)
\( s_{XY} = -18 \)
For which of the following would the SSE (sum of the squared errors) be smallest? (E)
A. \( \hat{Y} = 4 - 6X \)
B. \( \hat{Y} = 8 - 6X \)
C. \( \hat{Y} = -4 + 2X \)
D. \( \hat{Y} = -8 - 2X \)
E. \( \hat{Y} = 50 - 2X \)

6. Shown below is a scatterplot with the corresponding estimated least squares line. Which of the following corresponds to the residual of this regression: (B)
A. I
B. II
C. III
D. IV
E. None

Questions 7-9: A sample of 10 cars selected to study the relation between the weight of the car (x) and the fuel consumption (y) provided the following sample statistics:
\( \sum x = 29 \), \( \sum x^2 = 89.28 \), \( \sum y = 43.9 \), \( \sum y^2 = 207.31 \), \( \sum xy = 135.8 \)

7. The total sum of squares of this regression is: (A)
A. 14.59
B. 15.81
C. 16.97
D. 17.62
E. 18.33

8. The standard error of the estimate is: (D)
A. 0.11
B. 0.16
C. 0.21
D. 0.29
E. 0.36

9. In testing the significance of the regression the p-value would be approximately: (A)
A. < 0.01
B. between 0.01 and 0.025
C. between 0.025 and 0.05
D. between 0.05 and 0.10
E.
> 0.10

Questions 10-11: A linear regression of y (Monthly earning in $1000) on x (Experience in Years) has yielded: \( \hat{y} = 2 + 0.12 x \)

10. According to this estimate, for an additional one-month increase in experience the monthly earning will increase by an average of: (D)
A. $80
B. $100
C. $120
D. $10
E. $12

11. If x (Experience) had been measured in days rather than years, how would the regression output change? (D)
A. There will be no change in the slope or the intercept.
B. Both intercept and the slope will increase.
C. Both intercept and slope will decline.
D. The slope will decline while the intercept will remain constant.
E. The intercept will decline while the slope will remain constant.
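
The arithmetic in Question-1 and in Questions 7-9 can be cross-checked numerically. The Python sketch below recomputes the slope, intercept, R-squared, standard error of the estimate, the prediction interval at X = 10 and the t statistic for the slope, using only the summary sums given in the tutorial. The helper name `fit_from_sums` and its dictionary keys are illustrative assumptions, not part of the tutorial, and the critical value 2.306 is copied from the solution to part d) rather than computed.

```python
import math


def fit_from_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Simple linear regression from summary sums (hypothetical helper)."""
    sxx = sum_x2 - sum_x ** 2 / n      # equals (n - 1) * s_x^2
    syy = sum_y2 - sum_y ** 2 / n      # equals (n - 1) * s_y^2, i.e. SST
    sxy = sum_xy - sum_x * sum_y / n   # equals (n - 1) * s_xy
    b1 = sxy / sxx                     # slope
    b0 = sum_y / n - b1 * sum_x / n    # intercept
    sse = syy - b1 * sxy               # SST minus regression sum of squares
    s_eps = math.sqrt(sse / (n - 2))   # standard error of the estimate
    return {
        "b0": b0,
        "b1": b1,
        "sst": syy,
        "sse": sse,
        "r2": 1 - sse / syy,
        "s_eps": s_eps,
        "s_b1": s_eps / math.sqrt(sxx),  # standard error of the slope
        "x_bar": sum_x / n,
        "sxx": sxx,
    }


# Question-1: sleep deprivation (X) and number of errors (Y), n = 10.
n = 10
q1 = fit_from_sums(n, 160, 2880, 106, 1236, 1848)

# Part d): 95% prediction interval at x_g = 10.
x_g = 10
t_crit = 2.306  # t(0.025, 8), taken from the tutorial's solution
y_hat = q1["b0"] + q1["b1"] * x_g
half_width = t_crit * q1["s_eps"] * math.sqrt(
    1 + 1 / n + (x_g - q1["x_bar"]) ** 2 / q1["sxx"]
)

# Part e): t statistic for H0: beta1 = 0.5 against H1: beta1 < 0.5.
t_stat = (q1["b1"] - 0.5) / q1["s_b1"]

# Questions 7-9: car weight (x) and fuel consumption (y), n = 10.
cars = fit_from_sums(10, 29, 89.28, 43.9, 207.31, 135.8)

print("Q1 line: Y-hat = %.3f + %.3f X" % (q1["b0"], q1["b1"]))
print("Q1 R^2 = %.4f, s_eps = %.4f" % (q1["r2"], q1["s_eps"]))
print("Q1 d): %.2f +/- %.4f" % (y_hat, half_width))
print("Q1 e): t = %.4f" % t_stat)
print("Q7 SST = %.2f, Q8 s_eps = %.2f" % (cars["sst"], cars["s_eps"]))
```

Working from the sums rather than the raw data mirrors how the tutorial's solutions are written, so each intermediate quantity (for example `sxx` = 320 or `sst` = 112.4 in Question-1) can be compared directly with the corresponding step above.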