Homework 10 Solutions

advertisement
UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
Homework 10 Solutions
Multiple Choice
1. d
2. d
3. d
4. f
5. c
6. f
7. b
8. d
9)
__F__ in the population, the relationship between the variables is linear in the variables
__T__ in the population, the relationship between the variables is linear in the parameters
__F__ the X variables are correlated with the population error term e
__T__ the model is correctly specified (contains the correct variables, and only the correct variables)
__T__ the distribution of the error term is normal
__T__ the error terms for various individuals in the population are not correlated with one another
__T__ the variance of the error term is the same for all individuals in the population
10)
__3__ Calculate the “Goodness of Fit” measures SER and R2.
__5__ Describe the sign and magnitude of the effect of each (statistically significant) X on Y.
__1__ Use the OLS Regression Estimator Equations to estimate the 𝛽̂ ’s for the regression model.
__4__ Check the ttest numbers for the 𝛽̂ ’s to determine which of the 𝛽̂ ’s are statistically significant.
__2__ Use the F-test to determine whether the regression as a whole is statistically significant.
11)
Y
Components of TSS, RSS and ESS
π‘ŒΜ‚ = βΜ‚0 + βΜ‚1 βˆ™ 𝑋1
Yi
_____ESS______
π‘ŒΜ‚
____TSS_____
_____RSS_____
π‘ŒΜ…
Xi
12)
Y
X1
Components of TSS, RSS and ESS
π‘ŒΜ‚ = βΜ‚0 + βΜ‚1 βˆ™ 𝑋1
π‘ŒΜ…
______RSS______
_____TSS_____
π‘ŒΜ‚
Yi
Xi
______ESS______
X1
1
UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
13) The results of an OLS regression analysis include: n = 62, k = 8, RSS = 700, ESS = 300, and TSS = 1000.
Calculate Ftest.
Ftest = [RSS/(k-1)] / [ESS/(n-k)] = [700/(8-1)] / [300/(62-8)] = 18.0
What is the hypothesis that is tested using Ftest?
H0 = all B’s are zero
H1 = one or more of the B’s is not equal to zero
Is the regression (as a whole) significant at the alpha = 5% level of significance?
Use an F-test to answer this question.
From the results above, we know that Ftest = 18.0
We need Fcritical from the F-table. d.f. numerator = k - 1 = 7. d.f. denominator = n - k = 54. alpha = 5%.
So, Fcritical = 2.19 (approximately)
Since Ftest > Fcritical, we reject H0 and accept H1.
Thus, the regression (as a whole) is significant at the alpha = 5% level of significance.
Calculate SER (notice that ESS is the same as ∑i(eΜ‚2i ) ).
∑𝑖(𝑒̂𝑖2 )
𝑆𝐸𝑅 = √π‘£π‘Žπ‘Ÿ(eΜ‚i ’s ) = √𝜎eΜ‚2i = √
𝑛−π‘˜
𝐸𝑆𝑆
300
= √𝑛−π‘˜ = √62−8 = 2.357
Calculate R2.
R2 = RSS / TSS = 700 / 1000 = 0.70
Calculate Rbar2.
Rbar2 = 1-(1-R2)((n-1)/(n-k)) = 1 – (1-0.70)(62-1)/(62-8) = 0.661
Which should be used for this particular regression, R2 or Rbar2?
Rbar2 should be used for this particular regression, because k > 2. When k > 2, we know that the regression
equation has more than one X variable in it, and Rbar2 should be used whenever the regression equation has
more than one X variable.
What does R2 (Rbar2) tell us?
Rbar2 tells us the percentage of the variation in the Y data that is explained by the regression model.
2
UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
14) (NOTE: Use the information in the handout ”Regression Analysis in SAS” on the course website to help answer the
questions in this problem.) Suppose you do some consulting work for a client who is interested in health care in North
Carolina. One determinant of the quality of health care is access to primary care doctors. The client is interested in what
determines the number of primary care doctors per 1000 persons (DocsPer1000) in North Carolina counties. The client
wants to know whether the number of doctors is influenced by measures of health needs, such as the number of babies per
1000 persons (BabiesPer1000) and senior citizens per 1000 persons (SeniorsPer1000), or simply the wealth of the population,
as measured by median family income in 1000’s of dollars (MedInc1000s). You decide to run the following OLS regression
analysis in SAS:
proc reg data=dataset01;
model DocsPer1000 = SeniorsPer1000 BabiesPer1000 MedInc1000s;
run;
The results of the analysis are shown below.
The SAS System
09:29 Tuesday, March 26, 2013
The REG Procedure
Model: MODEL1
Dependent Variable: DocsPer1000
Number of Observations Read
Number of Observations Used
100
100
Analysis of Variance
Source
DF
Sum of
Squares
Mean
Square
Model
Error
Corrected Total
3
96
99
3.28168
12.63374
15.91542
1.09389
0.13160
Root MSE
Dependent Mean
Coeff Var
0.36277
0.68325
53.09434
R-Square
Adj R-Sq
F Value
Pr > F
8.31
<.0001
0.2062
0.1814
Parameter Estimates
Variable
Intercept
SeniorsPer1000
BabiesPer1000
MedInc1000s
DF
Parameter
Estimate
Standard
Error
t Value
Pr > |t|
1
1
1
1
1.67145
-0.00371
-0.01842
0.01653
0.69101
0.00169
0.00578
0.00606
2.42
-2.19
-3.19
2.73
0.0175
0.0307
0.0019
0.0076
3
UNC-Wilmington
Department of Economics and Finance
ECN 377
Dr. Chris Dumas
14 continued) Referring to the OLS regression analysis results on the previous page, answer the following
questions, assuming α = 0.05:
What was the dependent variable in the regression analysis, and what were the independent variables?
The dependent variables is DocsPer1000. The independent variables are BabiesPer1000, SeniorsPer1000, and
MedInc1000s.
What was the sample size (n) used in the analysis?
n = 100
What does the F-Value tell you about the regression model?
The F-Value tells us whether the regression model, as a whole, explains a statistically-significant percentage of the
variation in the Y variable.
What are the values of d.f.numerator and d.f.denominator for use in finding Fcritical from the F-table?
d.f.numerator = k – 1 = 3
d.f.denominator = n – k = 96
What is the value of Fcritical from the F-table (using α = 0.05)?
Fcritical = 2.70 approximately
Is the F-Value significant? Briefly, how can you tell?
From the regression output, Ftest = 8.31. Since Ftest > Fcritical, the F-Value is significant, which means that we reject
H0 and accept H1.
What do R-Square and Adj R-Sq tell you? For this regression, which should you look at, and (very briefly) why?
R-Square and Adj R-Sq tell us the percentage of the variation in Y that is explained by the regression model. For
this regression, we should look at Adj R-Sq, because the regression model has more than one X variable.
In SAS, the SER is called “Root MSE?” Briefly, what does it tell us?
SER tells us the average distance of a data point from the regression line/curve.
Briefly, what do the numbers in the “Parameter Estimate” column tell us?
The numbers in the “Parameter Estimate” column tell us the values of the 𝛽̂ ’s. The Parameter Estimate for the
“Intercept” is 𝛽̂ 0. The Parameter Estimate for SeniorsPer1000 is the 𝛽̂ 1 in the regression equation. The Parameter
Estimate for BabiesPer1000 is the 𝛽̂ 2 in the regression equation. The Parameter Estimate for MedInc1000s is the 𝛽̂ 3
in the regression equation.
Briefly, what do the numbers given in the “t Value” column tell us?
The numbers in the “t Value” column tell us the ttest values for the t-tests of whether each 𝛽̂ parameter is equal to
zero. We can compare these ttest values with tcritical numbers from a t-table to test H0: 𝛽̂ = 0 vs. H1: 𝛽̂ ≠ 0. (Since
this is a two-sided test, use α/2 when retrieving tcritical from the t-table.)
Briefly, what do the numbers in the “Pr > |t|” column tell us?
The numbers in the “Pr > |t|” column tell us the p-values for the t-tests of the 𝛽̂ parameters. We can compare these
p-values to the α/2-value in order to test H0: 𝛽̂ = 0 vs. H1: 𝛽̂ ≠ 0.
So, what is the effect of SeniorsPer1000 on the number of primary care doctors per 1000 persons in a county?
Parameter estimate 𝛽̂ 1 gives the effect of SeniorsPer1000 (variable X1) on DocsPer1000 (the Y variable). The value
of 𝛽̂ 1 is -0.00371. This means that a one unit increase in SeniorsPer1000 (variable X1) results in a 0.00371 unit
decrease in DocsPer1000 (the Y variable). We know that the effect of X1 on Y is negative because 𝛽̂ 1 is negative.
So, what is the effect of BabiesPer1000 on the number of primary care doctors per 1000 persons in a county?
Parameter estimate𝛽̂ 2 gives the effect of BabiesPer1000 (variable X2) on DocsPer1000 (the Y variable). The value
of 𝛽̂ 2 is -0.01842. This means that a one unit increase in BabiesPer1000 (variable X2) results in a 0.01842 unit
decrease in DocsPer1000 (the Y variable). We know that the effect of X2 on Y is negative because 𝛽̂ 2 is negative.
So, what is the effect of MedInc1000s on the number of primary care doctors per 1000 persons in a county?
Parameter estimate 𝛽̂ 3 gives the effect of MedInc1000s (variable X2) on DocsPer1000 (the Y variable). The value
of 𝛽̂ 3 is 0.01653. This means that a one unit increase in MedInc1000s (variable X2) results in a 0.01653 unit
increase in DocsPer1000 (the Y variable). We know that the effect of X3 on Y is positive because 𝛽̂ 3 is positive.
4
Download