Lecture Handout for Wednesday, April 2, 2003

ECO391 Lecture Handout for 15.5, 15.6, and 15.7 Hypothesis Testing and Confidence Intervals
Several Brief Exercises to understand the computer output
Consider the following example from the old ECO 391 survey data: Let’s try to build a simple model for
students’ GPA.
Model 1: E(Y|X) = β0 + β1X where Y is a person’s GPA and X is his or her ACT score.
      Source |       SS       df       MS              Number of obs =     278
-------------+------------------------------           F(  1,   276) =   22.16
       Model |  27.2531658     1  27.2531658           Prob > F      =  0.0000
    Residual |  339.455496   276  1.22991122           R-squared     =  0.0743
-------------+------------------------------           Adj R-squared =  0.0710
       Total |  366.708662   277  1.32385799           Root MSE      =   1.109

------------------------------------------------------------------------------
         gpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         act |   .0823084   .0174853     4.71   0.000     .0478869    .1167298
       _cons |   1.259056   .4398005     2.86   0.005     .3932666    2.124846
------------------------------------------------------------------------------
Model 2: E(Y|X) = β0 + β1X where Y is a person’s GPA and X is the number of hours he or she sleeps.
      Source |       SS       df       MS              Number of obs =     314
-------------+------------------------------           F(  1,   312) =    0.73
       Model |  .898181079     1  .898181079           Prob > F      =  0.3927
    Residual |  382.585537   312  1.22623569           R-squared     =  0.0023
-------------+------------------------------           Adj R-squared = -0.0009
       Total |  383.483718   313   1.2251876           Root MSE      =  1.1074

------------------------------------------------------------------------------
         gpa |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       sleep |  -.0463301   .0541338    -0.86   0.393    -.1528436    .0601834
       _cons |   3.609665   .3764874     9.59   0.000      2.86889    4.350441
------------------------------------------------------------------------------
1. Write down the sample regression line equations for both models.
2. Write down your interpretation of the slope coefficients from both models.
3. Find and interpret the R² value for both of these models.
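To see how the summary figures in the upper-right corner of the output are tied to the SS / df / MS table, here is a minimal Python sketch that recomputes them for Model 1. The numbers are copied from the Model 1 output; the variable names are illustrative, and the formulas are the standard textbook definitions rather than anything specific to this handout.

```python
# Figures copied from the Model 1 (GPA on ACT) output table.
ss_model, df_model = 27.2531658, 1
ss_resid, df_resid = 339.455496, 276
ss_total, df_total = 366.708662, 277

ms_model = ss_model / df_model   # mean square for the model
ms_resid = ss_resid / df_resid   # mean square for the residual
ms_total = ss_total / df_total   # mean square for the total

r_squared = ss_model / ss_total          # share of variation in GPA explained by ACT
adj_r_squared = 1 - ms_resid / ms_total  # R-squared adjusted for degrees of freedom
f_stat = ms_model / ms_resid             # F statistic with (1, 276) d.f.
root_mse = ms_resid ** 0.5               # standard error of the estimate

print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F(1, 276)     = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.3f}")
```

The printed values should agree with the output table up to rounding. The same steps can be repeated with the Model 2 figures.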
ECO 391 –007 Lecture handout Chapter 15.
15.5 Standard errors of the coefficients
The estimated standard error for the slope coefficient ($s_{b_1}$) is found according to the following formula:
$$s_{b_1} = \frac{s_e}{\sqrt{\sum (x - \bar{x})^2}}$$
The estimated standard error of the intercept ($s_{b_0}$) is found according to the following formula:
$$s_{b_0} = s_e \sqrt{\frac{\sum x^2}{n \sum (x - \bar{x})^2}}$$
15.6 Hypothesis testing.
Recall that estimated coefficients are just estimates; if you take another sample, they will be different!
Hypothesis testing is a way to determine whether the estimated regression coefficients are reliable.
Suppose we take one of the many possible samples from a population and use it to estimate the regression
coefficients. How representative is that one sample of the population? How reliable are the sample estimates in
measuring the true population relationship? Hypothesis testing helps us analyze these issues.
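The point that a different sample gives different estimates can be illustrated with a small simulation. This is a sketch under stated assumptions: the "true" population line (intercept 1.0, slope 0.5), the noise level, the sample size, and all names are hypothetical choices, not values from the survey data.

```python
import random

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

random.seed(391)                 # fixed seed so the run is repeatable
TRUE_B0, TRUE_B1 = 1.0, 0.5      # hypothetical population coefficients

def draw_sample(n=30):
    """Draw one sample of size n from the hypothetical population."""
    x = [random.uniform(0, 10) for _ in range(n)]
    y = [TRUE_B0 + TRUE_B1 * xi + random.gauss(0, 1) for xi in x]
    return x, y

# Estimate the slope in 200 independent samples.
slopes = [ols_slope(*draw_sample()) for _ in range(200)]
mean_slope = sum(slopes) / len(slopes)

print(f"smallest estimate: {min(slopes):.3f}")
print(f"largest estimate:  {max(slopes):.3f}")
print(f"average estimate:  {mean_slope:.3f}  (true slope is {TRUE_B1})")
```

Each sample produces its own slope estimate. The estimates scatter around the true slope, and the standard error of the coefficient measures the size of that scatter.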
Recall the following true population regression line.
E(Yi|Xi) = β0 + β1Xi
If no relationship exists between Xi and Yi, then β1 ….
If there is a relationship, then β1 ….
Recall that β1 is the slope of the regression line. It measures the change in the expected value of Y associated with
a one-unit change in X.
If β1 = 0, then as X changes, the expected value of Y does not change.
Hypothesis Testing:
Two-tailed test: (if we have no prior idea about the direction of the relationship)
We test
H0: β1 = 0 (Null Hypothesis) (No relationship between X and Y)
H1: β1 ≠ 0 (Alternative Hypothesis) (There is a relationship.)
If we reject H0: β1 = 0 in favor of H1: β1 ≠ 0, then we are saying that the values of X are helpful in predicting the
values of Y: “β1 is significantly different from zero.”
If we fail to reject H0, then the coefficient is “statistically insignificant,” which means that we do not have
evidence that a relationship between X and Y exists.
Right-tail test: (if we know that there should be a positive relationship)
H0: β1 = 0 (No positive relationship)
H1: β1 > 0 (A positive relationship)
If we reject H0: β1 = 0 in favor of H1: β1 > 0, then we are saying that the sample data support the hypothesis
that X and Y are positively related.
Left-tail test: (if we suspect a negative relationship)
H0: β1 ≥ 0 (No negative relationship)
H1: β1 < 0 (A negative relationship)
If we reject H0: β1 ≥ 0 in favor of H1: β1 < 0, then we are saying that the sample data support the hypothesis
that X and Y are negatively related.
To perform a test, a t-statistic is used:
$$t = \frac{b_i}{s_{b_i}}$$
Under H0, this statistic follows a t distribution with n − K − 1 degrees of freedom, where n is the number of
observations in the sample and K is the number of independent variables, so that for a simple regression model
d.f. = n − 2.
Recall that the $s_{b_i}$ are the estimated standard errors of the coefficients, found as outlined in section 15.5.
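The mechanics of the test can be sketched in Python. To stay within the standard library, the critical value is found by numerically integrating the t density (Simpson's rule) and bisecting, instead of reading it from a table. The function names are illustrative, and the 5% significance level is an assumption. The slope and standard error are the ACT figures from the Model 1 output.

```python
import math

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(tail_prob, df):
    """Value c such that P(T > c) = tail_prob, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_test(b, s_b, df, alpha=0.05, tail="two"):
    """Return (t statistic, critical value, reject H0?) for the chosen tail."""
    t_stat = b / s_b
    if tail == "two":
        crit = t_critical(alpha / 2, df)
        reject = abs(t_stat) > crit
    elif tail == "right":
        crit = t_critical(alpha, df)
        reject = t_stat > crit
    elif tail == "left":
        crit = t_critical(alpha, df)
        reject = t_stat < -crit
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return t_stat, crit, reject

# ACT slope and standard error from the Model 1 output; d.f. = n - 2 = 276.
t_stat, crit, reject = t_test(0.0823084, 0.0174853, df=276)
print(f"t = {t_stat:.2f}, critical value = {crit:.3f}, reject H0: {reject}")
```

Passing tail="right" or tail="left" puts all of alpha in one tail, so the critical value is smaller in absolute value than in the two-tailed case.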
EXERCISE: Are the slope coefficients in the two GPA models significant?
1. Perform a test for the slope coefficient for the GPA / ACT score model:
a. Ho:
H1:
b. t-statistic is … (use the Model 1 output table)
c. The critical value is….
d. The decision and conclusion:
2. Perform a test for the slope coefficient for the GPA / sleep model:
a. Ho:
H1:
b. t-statistic is ________ (you can use the Model 2 output table already discussed)
c. The critical value is….
d. The decision and conclusion:
15.7 Confidence Intervals.
Formula for a confidence interval for βi:
A 100(1 − α)% confidence interval for a coefficient βi is:
$$b_i \pm t_{\alpha/2,\, n-K-1}\, s_{b_i}$$
Example:
Yi = 54.3182 - 4.0129Xi1
SE: (1.737)
n = 10
Yi = gasoline mileage
Xi1 = engine size in hundreds of cubic inches
Construct a 95% confidence interval for each of the regression coefficients.
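Critical t = t.025, 7 = 2.365 (d.f. = n − K − 1 = 7; with n = 10 this corresponds to K = 2 independent variables, of which only Xi1 is shown above)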
For β1: -4.0129 ± 2.365 (1.737) = (-8.121, 0.095)
(We are 95% confident that β1 falls in this interval.)
Note that zero lies within the first interval. This implies that if we did a hypothesis test at the 5% level
of significance using a two-tailed test, we would fail to reject the null hypothesis that β1 = 0 (i.e., we
would find that the coefficient is not significantly different from zero).
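The arithmetic of this example, and the link between the interval and the two-tailed test, can be checked with a short Python sketch. The coefficient, standard error, and critical t are the ones given in the example; the function and variable names are illustrative.

```python
def confidence_interval(b, s_b, t_crit):
    """Return (lower, upper) for the interval b +/- t_crit * s_b."""
    half_width = t_crit * s_b
    return b - half_width, b + half_width

# Values from the gasoline mileage example.
b1, s_b1, t_crit = -4.0129, 1.737, 2.365

lower, upper = confidence_interval(b1, s_b1, t_crit)
zero_inside = lower <= 0 <= upper

# Two-tailed test of H0: beta1 = 0 at the matching significance level.
t_stat = b1 / s_b1
fail_to_reject = abs(t_stat) <= t_crit

print(f"95% interval: ({lower:.3f}, {upper:.3f})")
print(f"zero inside the interval: {zero_inside}")
print(f"t = {t_stat:.3f}, fail to reject H0: {fail_to_reject}")
```

Zero lies inside the interval exactly when |t| does not exceed the critical value, so the two checks always agree when the same critical t is used for both.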
One more example: Let’s use the following model for GPA considered during the last class:
GPAhat = 1.938116 + 0.0498753 ACT
S.E. 0.0062
Construct a 95% confidence interval for the coefficient of ACT and interpret
your result.