UNC-Wilmington Department of Economics and Finance
ECN 377
Dr. Chris Dumas

Homework 10 Solutions

Multiple Choice
1. d
2. d
3. d
4. f
5. c
6. f
7. b
8. d

9)
__F__ in the population, the relationship between the variables is linear in the variables
__T__ in the population, the relationship between the variables is linear in the parameters
__F__ the X variables are correlated with the population error term e
__T__ the model is correctly specified (contains the correct variables, and only the correct variables)
__T__ the distribution of the error term is normal
__T__ the error terms for various individuals in the population are not correlated with one another
__T__ the variance of the error term is the same for all individuals in the population

10)
__3__ Calculate the “Goodness of Fit” measures SER and R2.
__5__ Describe the sign and magnitude of the effect of each (statistically significant) X on Y.
__1__ Use the OLS Regression Estimator Equations to estimate the β̂'s for the regression model.
__4__ Check the ttest values for the β̂'s to determine which of the β̂'s are statistically significant.
__2__ Use the F-test to determine whether the regression as a whole is statistically significant.

11)
[Figure: Components of TSS, RSS and ESS. Axes X1 and Y; fitted line Ŷ = β̂0 + β̂1·X1; a data point (Xi, Yi) lying off the line. Blanks filled in as: ESS (between Yi and Ŷ), RSS (between Ŷ and Ȳ), TSS (between Yi and Ȳ).]

12)
[Figure: Components of TSS, RSS and ESS. Axes X1 and Y; fitted line Ŷ = β̂0 + β̂1·X1; a data point (Xi, Yi) lying off the line. Blanks filled in as: RSS (between Ŷ and Ȳ), TSS (between Yi and Ȳ), ESS (between Yi and Ŷ).]

13) The results of an OLS regression analysis include: n = 62, k = 8, RSS = 700, ESS = 300, and TSS = 1000.

Calculate Ftest.
Ftest = [RSS/(k-1)] / [ESS/(n-k)] = [700/(8-1)] / [300/(62-8)] = 18.0

What is the hypothesis that is tested using Ftest?
H0: all β's are zero
H1: one or more of the β's is not equal to zero

Is the regression (as a whole) significant at the alpha = 5% level of significance? Use an F-test to answer this question.
From the results above, we know that Ftest = 18.0.
We need Fcritical from the F-table.
d.f. numerator = k - 1 = 7
d.f. denominator = n - k = 54
alpha = 5%
So, Fcritical = 2.19 (approximately).
Since Ftest > Fcritical, we reject H0 and accept H1. Thus, the regression (as a whole) is significant at the alpha = 5% level of significance.

Calculate SER (notice that ESS is the same as Σi(êi²)).
SER = √var(êi's) = √[ Σi(êi²) / (n-k) ] = √[ ESS / (n-k) ] = √[ 300 / (62-8) ] = 2.357

Calculate R2.
R2 = RSS / TSS = 700 / 1000 = 0.70

Calculate Rbar2.
Rbar2 = 1 - (1-R2)((n-1)/(n-k)) = 1 - (1-0.70)((62-1)/(62-8)) = 0.661

Which should be used for this particular regression, R2 or Rbar2?
Rbar2 should be used for this particular regression, because k > 2. When k > 2, we know that the regression equation has more than one X variable in it, and Rbar2 should be used whenever the regression equation has more than one X variable.

What does R2 (Rbar2) tell us?
Rbar2 tells us the percentage of the variation in the Y data that is explained by the regression model.

14) (NOTE: Use the information in the handout “Regression Analysis in SAS” on the course website to help answer the questions in this problem.)

Suppose you do some consulting work for a client who is interested in health care in North Carolina. One determinant of the quality of health care is access to primary care doctors. The client is interested in what determines the number of primary care doctors per 1000 persons (DocsPer1000) in North Carolina counties. The client wants to know whether the number of doctors is influenced by measures of health needs, such as the number of babies per 1000 persons (BabiesPer1000) and senior citizens per 1000 persons (SeniorsPer1000), or simply the wealth of the population, as measured by median family income in 1000's of dollars (MedInc1000s).
You decide to run the following OLS regression analysis in SAS:

    proc reg data=dataset01;
      model DocsPer1000 = SeniorsPer1000 BabiesPer1000 MedInc1000s;
    run;

The results of the analysis are shown below.

The SAS System                    09:29 Tuesday, March 26, 2013

The REG Procedure
Model: MODEL1
Dependent Variable: DocsPer1000

Number of Observations Read    100
Number of Observations Used    100

Analysis of Variance
Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3           3.28168        1.09389       8.31    <.0001
Error              96          12.63374        0.13160
Corrected Total    99          15.91542

Root MSE           0.36277    R-Square    0.2062
Dependent Mean     0.68325    Adj R-Sq    0.1814
Coeff Var         53.09434

Parameter Estimates
Variable          DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept          1               1.67145           0.69101       2.42      0.0175
SeniorsPer1000     1              -0.00371           0.00169      -2.19      0.0307
BabiesPer1000      1              -0.01842           0.00578      -3.19      0.0019
MedInc1000s        1               0.01653           0.00606       2.73      0.0076

14 continued) Referring to the OLS regression analysis results above, answer the following questions, assuming α = 0.05:

What was the dependent variable in the regression analysis, and what were the independent variables?
The dependent variable is DocsPer1000. The independent variables are BabiesPer1000, SeniorsPer1000, and MedInc1000s.

What was the sample size (n) used in the analysis?
n = 100

What does the F-Value tell you about the regression model?
The F-Value tells us whether the regression model, as a whole, explains a statistically significant percentage of the variation in the Y variable.

What are the values of d.f.numerator and d.f.denominator for use in finding Fcritical from the F-table?
d.f.numerator = k - 1 = 3
d.f.denominator = n - k = 96

What is the value of Fcritical from the F-table (using α = 0.05)?
Fcritical = 2.70 (approximately)

Is the F-Value significant? Briefly, how can you tell?
From the regression output, Ftest = 8.31.
Since Ftest > Fcritical, the F-Value is significant, which means that we reject H0 and accept H1.

What do R-Square and Adj R-Sq tell you? For this regression, which should you look at, and (very briefly) why?
R-Square and Adj R-Sq tell us the percentage of the variation in Y that is explained by the regression model. For this regression, we should look at Adj R-Sq, because the regression model has more than one X variable.

In SAS, the SER is called “Root MSE.” Briefly, what does it tell us?
SER tells us the average distance of a data point from the regression line/curve.

Briefly, what do the numbers in the “Parameter Estimate” column tell us?
The numbers in the “Parameter Estimate” column tell us the values of the β̂'s.
The Parameter Estimate for the “Intercept” is β̂0.
The Parameter Estimate for SeniorsPer1000 is the β̂1 in the regression equation.
The Parameter Estimate for BabiesPer1000 is the β̂2 in the regression equation.
The Parameter Estimate for MedInc1000s is the β̂3 in the regression equation.

Briefly, what do the numbers given in the “t Value” column tell us?
The numbers in the “t Value” column tell us the ttest values for the t-tests of whether each β parameter is equal to zero. We can compare these ttest values with tcritical numbers from a t-table to test H0: β = 0 vs. H1: β ≠ 0. (Since this is a two-sided test, use α/2 when retrieving tcritical from the t-table.)

Briefly, what do the numbers in the “Pr > |t|” column tell us?
The numbers in the “Pr > |t|” column tell us the p-values for the t-tests of the β parameters. We can compare these p-values to α in order to test H0: β = 0 vs. H1: β ≠ 0. Because the p-values in this column are already two-sided, they are compared with α itself rather than α/2; a parameter is statistically significant when its p-value is less than α.

So, what is the effect of SeniorsPer1000 on the number of primary care doctors per 1000 persons in a county?
Parameter estimate β̂1 gives the effect of SeniorsPer1000 (variable X1) on DocsPer1000 (the Y variable). The value of β̂1 is -0.00371.
This means that a one unit increase in SeniorsPer1000 (variable X1) results in a 0.00371 unit decrease in DocsPer1000 (the Y variable), holding the other X variables constant. We know that the effect of X1 on Y is negative because β̂1 is negative.

So, what is the effect of BabiesPer1000 on the number of primary care doctors per 1000 persons in a county?
Parameter estimate β̂2 gives the effect of BabiesPer1000 (variable X2) on DocsPer1000 (the Y variable). The value of β̂2 is -0.01842. This means that a one unit increase in BabiesPer1000 (variable X2) results in a 0.01842 unit decrease in DocsPer1000 (the Y variable), holding the other X variables constant. We know that the effect of X2 on Y is negative because β̂2 is negative.

So, what is the effect of MedInc1000s on the number of primary care doctors per 1000 persons in a county?
Parameter estimate β̂3 gives the effect of MedInc1000s (variable X3) on DocsPer1000 (the Y variable). The value of β̂3 is 0.01653. This means that a one unit increase in MedInc1000s (variable X3) results in a 0.01653 unit increase in DocsPer1000 (the Y variable), holding the other X variables constant. We know that the effect of X3 on Y is positive because β̂3 is positive.
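
The arithmetic in Problems 13 and 14 can be cross-checked outside of SAS. The sketch below is only an illustration, written in Python rather than SAS; the function name fit_statistics and the variable names are hypothetical and do not come from the course handout. It follows this course's naming, in which RSS is the regression (explained) sum of squares and ESS is the error sum of squares, and it assumes that the "Model" and "Error" rows of the SAS Analysis of Variance table correspond to RSS and ESS respectively.

```python
import math


def fit_statistics(rss, ess, n, k):
    """Goodness-of-fit measures from sums of squares.

    Course naming (an assumption carried over from the solutions above):
    rss = regression (explained) sum of squares, ess = error sum of squares,
    n = sample size, k = number of estimated parameters (including intercept).
    """
    tss = rss + ess                                # TSS = RSS + ESS
    f_test = (rss / (k - 1)) / (ess / (n - k))     # Ftest
    ser = math.sqrt(ess / (n - k))                 # SER ("Root MSE" in SAS)
    r2 = rss / tss                                 # R2
    rbar2 = 1 - (1 - r2) * (n - 1) / (n - k)       # Rbar2 (Adj R-Sq in SAS)
    return {"Ftest": f_test, "SER": ser, "R2": r2, "Rbar2": rbar2}


# Problem 13: values given in the problem statement.
p13 = fit_statistics(rss=700, ess=300, n=62, k=8)

# Problem 14: Model and Error sums of squares from the ANOVA table,
# with n = 100 observations and k = 4 parameters.
p14 = fit_statistics(rss=3.28168, ess=12.63374, n=100, k=4)

# Problem 14: each t Value is Parameter Estimate / Standard Error.
# The inputs are the rounded numbers printed by SAS, so the ratios can
# differ slightly from the printed t Values.
params = {
    "Intercept": (1.67145, 0.69101),
    "SeniorsPer1000": (-0.00371, 0.00169),
    "BabiesPer1000": (-0.01842, 0.00578),
    "MedInc1000s": (0.01653, 0.00606),
}
t_values = {name: est / se for name, (est, se) in params.items()}

for label, stats in (("Problem 13", p13), ("Problem 14", p14)):
    print(label, {key: round(value, 4) for key, value in stats.items()})
print("t values", {name: round(t, 2) for name, t in t_values.items()})
```

The printed values should agree with the worked answers and the SAS output up to rounding. Fcritical and tcritical are not computed here; they are still read from the F-table and t-table as in the solutions.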