Lecture 11

Econ 140
Classical Regression III
Today’s plan
• Chi-squared test
• F test
More on the ANOVA table
• Remember the sum of squares identity:
Total = Explained + Unexplained
$\sum y^2 = \hat{b} \sum xy + \sum \hat{e}^2$
• The sum of squares identity can also be written as
$\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{Y})^2 + \sum (Y - \hat{Y})^2$
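The identity can be checked numerically. The sketch below uses a small made-up data set (the X and Y values are assumptions for illustration only, not the lecture's data), fits the regression by the usual deviation-form formulas, and compares the two sides.

```python
# Hypothetical data, chosen only to illustrate the identity.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

# Deviations from the means (the lower-case x and y of the slides).
x = [Xi - x_bar for Xi in X]
y = [Yi - y_bar for Yi in Y]

# OLS slope and intercept in deviation form.
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
b_hat = sum_xy / sum(xi ** 2 for xi in x)
a_hat = y_bar - b_hat * x_bar

# Fitted values and residuals.
Y_hat = [a_hat + b_hat * Xi for Xi in X]
e_hat = [Yi - Yh for Yi, Yh in zip(Y, Y_hat)]

total = sum(yi ** 2 for yi in y)                          # sum of y^2
explained = b_hat * sum_xy                                # b_hat * sum of xy
unexplained = sum(e ** 2 for e in e_hat)                  # sum of e_hat^2
explained_alt = sum((Yh - y_bar) ** 2 for Yh in Y_hat)    # sum of (Y_hat - Y_bar)^2

print(total, explained + unexplained)
print(explained, explained_alt)
```

Both printed pairs should agree up to floating-point rounding, which shows that the two ways of writing the identity describe the same decomposition.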
More on the ANOVA table (2)
• The components of the sum of squares identity can be
found in the Stata output as the ANOVA table, which looks
like this:
Source of Variation   Sum of Squares       Degrees of Freedom   Mean Squared Deviation
Explained             $\hat{b} \sum xy$    1                    $\hat{b} \sum xy / 1$
Residual              $\sum \hat{e}^2$     n-2                  $\sum \hat{e}^2 / (n-2)$
Total                 $\sum y^2$           n-1                  $\sum y^2 / (n-1)$
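As a rough illustration, the table can be filled in with the sums of squares quoted later in this lecture (5.98 explained and 10.14 residual, with 28 residual degrees of freedom). The sample size n = 30 is an assumption inferred from n - 2 = 28.

```python
explained_ss = 5.98   # explained sum of squares quoted in the lecture
residual_ss = 10.14   # residual sum of squares quoted in the lecture
n = 30                # assumption: implied by residual df = n - 2 = 28

rows = [
    ("Explained", explained_ss, 1),
    ("Residual", residual_ss, n - 2),
    ("Total", explained_ss + residual_ss, n - 1),
]

# Mean squared deviation = sum of squares / degrees of freedom.
table = [(name, ss, df, ss / df) for name, ss, df in rows]

print(f"{'Source':<10} {'SS':>8} {'DF':>4} {'MS':>8}")
for name, ss, df, ms in table:
    print(f"{name:<10} {ss:>8.2f} {df:>4d} {ms:>8.3f}")
```

The residual mean square in the second row is the estimate of the error variance used in the tests that follow.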
Chi-squared test
• The next test we’ll look at is the $\chi^2$ (chi-squared) test
– This does not appear on the Stata output
• With the $\chi^2$ test, we test whether a variance equals a particular hypothesized value
• Recall that the Z statistic looks like
$Z = \frac{Y - \mu_Y}{\sigma}$
– The square of this variable follows the $\chi^2$ distribution with one degree of freedom
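The link between Z and the chi-squared distribution with one degree of freedom can be seen with Python's standard library alone. This is only a sketch: it squares the familiar two-sided 5% normal critical value and checks the probability it implies for Z squared.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Two-sided 5% critical value for Z (about 1.96).
z_crit = z.inv_cdf(0.975)

# Squaring it gives the 5% critical value of chi-squared with 1 df (about 3.84).
chi2_crit_1df = z_crit ** 2

def chi2_1df_cdf(c):
    """P(Z^2 <= c) = P(-sqrt(c) <= Z <= sqrt(c))."""
    r = c ** 0.5
    return z.cdf(r) - z.cdf(-r)

print(round(z_crit, 3), round(chi2_crit_1df, 3), round(chi2_1df_cdf(chi2_crit_1df), 3))
```

The value near 3.84 is the entry a chi-squared table lists for one degree of freedom and a 5% right-hand tail.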
Chi-squared test (2)
• With the chi-squared test we want to test:
$H_0: \sigma^2_{YX} = 1$
at a 5% significance level
• The $\chi^2$ statistic with n-2 degrees of freedom is distributed:
$\frac{(n-2)\hat{\sigma}^2_{YX}}{\sigma^2_{YX}} \sim \chi^2_{n-2}$
– Where $\sigma^2_{YX}$ is the hypothesized value given under the null hypothesis
Chi-squared test (3)
• The root MSE from the Stata output or Excel output is the square root of $\hat{\sigma}^2_{YX}$
• So our $\chi^2$ statistic will be
$\chi^2 = \frac{28(0.36)}{1} \approx 9.9$
• Now look at the $\chi^2$ table:
– The format is similar to the F table
– The first column gives the degrees of freedom
– Across the top row are the areas remaining in the right-hand tail of the curve
– The $H_0$ region is the area to the left of the $\chi^2$ value and the $H_1$ region is the area to the right
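The statistic above is simple enough to compute directly. The sketch below assumes n = 30 (so that n - 2 = 28) and uses the rounded variance estimate 0.36; the function name is hypothetical. With the rounded 0.36 the result is 10.08, close to the 9.9 quoted above, and the gap is presumably rounding in the variance estimate.

```python
def chi2_statistic(n, sigma2_hat, sigma2_null):
    """(n - 2) * estimated error variance / hypothesized variance."""
    return (n - 2) * sigma2_hat / sigma2_null

n = 30              # assumption: implied by 28 degrees of freedom
sigma2_hat = 0.36   # rounded estimate of the error variance (root MSE squared)
sigma2_null = 1.0   # value of the variance under H0

stat = chi2_statistic(n, sigma2_hat, sigma2_null)
print(round(stat, 2))
```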
Chi-squared test (4)
• From the $\chi^2$ table, we find that at the 5% significance level (95% confidence) with 28 degrees of freedom the critical $\chi^2$ value is
$\chi^2_{28,\,0.05} = 41.34$
• Since $\chi^2 = 9.9 < \chi^2_{28,\,0.05} = 41.34$, we fail to reject the null
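The table lookup can be cross-checked without a statistics package. For an even number of degrees of freedom the right-hand tail area of the chi-squared distribution has a closed form (a finite Poisson-type sum), which the sketch below uses. The function name is hypothetical, and the approach only covers even degrees of freedom such as the 28 used here.

```python
import math

def chi2_right_tail(x, df):
    """P(chi-squared with df degrees of freedom > x), for even df only.

    Uses exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    half = x / 2.0
    term = 1.0   # the i = 0 term
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

critical_value = 41.34   # table value for 28 df and a 5% right-hand tail
statistic = 9.9          # chi-squared statistic from the lecture

tail_at_critical = chi2_right_tail(critical_value, 28)
p_value = chi2_right_tail(statistic, 28)
reject_null = statistic > critical_value

print(round(tail_at_critical, 3), reject_null)
```

The tail area at 41.34 comes out at roughly 0.05, matching the column heading in the table, and the statistic falls well inside the H0 region.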
F-test
• In the second line down in the right-hand column of the
Stata output we have an F test, which looks at the
significance of the regression equation using the sum of
squares information from the ANOVA table
• Using the sample data, the F test is designed to test the
difference of variances between samples drawn from
independent populations
• The F statistic looks like:
$F = \frac{\hat{\sigma}_1^2 / m}{\hat{\sigma}_2^2 / n}$
where m and n are the respective degrees of freedom
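A minimal sketch of comparing the variances of two independent samples follows. It uses the standard variance-ratio form, in which each sample variance has already been divided by its own degrees of freedom; the two samples are made-up numbers for illustration only.

```python
from statistics import variance

# Hypothetical samples from two independent populations.
sample_1 = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
sample_2 = [1.0, 2.0, 3.0, 4.0, 5.0]

# Sample variances: sum of squared deviations divided by (size - 1).
var_1 = variance(sample_1)
var_2 = variance(sample_2)

# Degrees of freedom for numerator and denominator.
m = len(sample_1) - 1
n = len(sample_2) - 1

f_stat = var_1 / var_2
print(round(f_stat, 3), (m, n))
```

The resulting ratio would be compared with an F table entry for (m, n) degrees of freedom.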
F-test (2)
• Properties of the F distribution:
1) The F distribution is skewed to the right and takes only positive values
2) As the degrees of freedom increase for both samples,
the F distribution becomes approximately normal
[Figure: density curves of the F(10,2) and F(50,50) distributions]
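Both properties can be illustrated by simulation. The sketch below draws from F(10,2) and F(50,50) by forming ratios of chi-squared draws, each divided by its degrees of freedom, and compares how far the upper and lower deciles sit from the median. The seed, the number of draws, and the helper names are arbitrary choices for illustration.

```python
import random

def draw_f(df1, df2, rng):
    """One draw from F(df1, df2) as a ratio of scaled chi-squared draws."""
    # A chi-squared variable with k df is Gamma(shape=k/2, scale=2).
    chi2_num = rng.gammavariate(df1 / 2.0, 2.0)
    chi2_den = rng.gammavariate(df2 / 2.0, 2.0)
    return (chi2_num / df1) / (chi2_den / df2)

def quantile(sorted_draws, p):
    """Simple empirical quantile from an already sorted list."""
    return sorted_draws[int(p * (len(sorted_draws) - 1))]

def spread_ratio(sorted_draws):
    """(q90 - median) / (median - q10); equals 1 for a symmetric distribution."""
    q10 = quantile(sorted_draws, 0.10)
    q50 = quantile(sorted_draws, 0.50)
    q90 = quantile(sorted_draws, 0.90)
    return (q90 - q50) / (q50 - q10)

rng = random.Random(140)  # fixed seed so the run is repeatable
draws_small = sorted(draw_f(10, 2, rng) for _ in range(20000))
draws_large = sorted(draw_f(50, 50, rng) for _ in range(20000))

print("F(10,2):  min", round(draws_small[0], 4), "spread ratio", round(spread_ratio(draws_small), 2))
print("F(50,50): min", round(draws_large[0], 4), "spread ratio", round(spread_ratio(draws_large), 2))
```

Every draw is positive, and the spread ratio is far above 1 for F(10,2) but much closer to 1 for F(50,50), which is the sense in which the distribution becomes more symmetric as the degrees of freedom grow.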
F-test (3)
• Last time we looked at the spreadsheet L9.xls and we
learned how to use the LINEST function in Excel to run a
regression
• The LINEST output gives you the model (or explained)
sum of squares, the residual sum of squares, and the F
statistic
• The regression equation we estimated was
$\hat{Y} = 2.69 + 0.23X$
(0.70) (0.06)
– Where the numbers in parentheses are the standard
errors of the coefficients
F-test (4)
• F-test where
$H_0: b = 0$
• Our F statistic will be
$F = \frac{\text{explained}/DF}{\text{residual}/DF}$
• So from our spreadsheet, we can plug in values to calculate
our F statistic
$F = \frac{5.98/1}{10.14/28} = \frac{5.98}{0.36} \approx 16.6$
• From the LINEST output, we find that the F statistic is
16.51 [the small difference comes from rounding the denominator to 0.36]
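The same calculation can be written as a short function. The sketch below uses the sums of squares and degrees of freedom quoted above; the function name is hypothetical. Carried through without intermediate rounding, the ratio comes to about 16.51, which matches the LINEST figure.

```python
def f_statistic(explained_ss, df_explained, residual_ss, df_residual):
    """Ratio of the explained mean square to the residual mean square."""
    return (explained_ss / df_explained) / (residual_ss / df_residual)

# Values quoted in the lecture: explained SS, its df, residual SS, its df.
f_stat = f_statistic(5.98, 1, 10.14, 28)
print(round(f_stat, 2))
```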
F-test (5)
• In the F tables, the first column gives the degrees of
freedom for the denominator, or df2
• Across the top row are the degrees of freedom for the
numerator, or df1
• For each pair of degrees of freedom, the table gives four critical values,
one for each significance level from 0.25 to 0.01
– the significance level is the area remaining in the right-hand tail of the F
distribution
F-test (6)
• Now let’s look up the F value with df1 = 1 and df2 = 28
– The F value at a 5% significance level is 4.20:
$F_{0.05,1,28} = 4.20$
• If $F > F_{0.05,1,28}$, we reject the null that b = 0
– the LINEST F statistic of 16.51 exceeds 4.20, so here we reject the null
• You will use an F-test when you calculate within-sample
predictions, or when you do a Chow test
• You will also use an F-test when you want to test the
significance of a regression when there are multiple
independent (or X) variables
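The decision rule for the test of b = 0 can be sketched in a few lines, using the LINEST F statistic and the table value from this lecture. The final check is an addition not found in the slides: with a single X variable the F critical value is the square of the two-sided t critical value, and 2.048 is the 5% two-sided t value for 28 degrees of freedom taken from a standard t table.

```python
f_stat = 16.51   # F statistic reported by LINEST in the lecture
f_crit = 4.20    # F(0.05; 1, 28) from the lecture's F table

reject_null = f_stat > f_crit
print("reject H0: b = 0" if reject_null else "fail to reject H0: b = 0")

# Assumption (not in the slides): two-sided 5% t critical value with 28 df.
t_crit = 2.048
print(round(t_crit ** 2, 2), "versus", f_crit)
```

The squared t value agrees with the F table entry up to rounding, so in the one-regressor case the F test and the two-sided t test on the slope lead to the same decision.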