Econ 140
Classical Regression III
Lecture 11

Today’s plan
• Chi-squared test
• F test

More on the ANOVA table
• Remember the sum of squares identity: Total = Explained + Unexplained
  $\sum y^2 = \hat{b} \sum xy + \sum \hat{e}^2$
• The sum of squares identity can also be written as
  $\sum (Y - \bar{Y})^2 = \sum (\hat{Y} - \bar{Y})^2 + \sum (Y - \hat{Y})^2$

More on the ANOVA table (2)
• The components of the sum of squares identity can be found in the Stata output as the ANOVA table, which looks like this:

Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squared Deviation
Explained | $\hat{b} \sum xy$ | 1 | $\hat{b} \sum xy / 1$
Residual | $\sum \hat{e}^2$ | n-2 | $\sum \hat{e}^2 / (n-2)$
Total | $\sum y^2$ | n-1 | $\sum y^2 / (n-1)$

Chi-squared test
• The next test we’ll look at is the $\chi^2$ (chi-squared) test
– This does not appear on the Stata output
• With the $\chi^2$ test, we test whether a variance equals a particular hypothesized value
• Recall that the Z statistic looks like $Z = \frac{Y - \mu_Y}{\sigma_Y}$
– The square of this statistic follows the $\chi^2$ distribution with one degree of freedom

Chi-squared test (2)
• With the chi-squared test we want to test $H_0: \sigma^2_{YX} = 1$ at a 5% significance level
• The $\chi^2$ statistic with n-2 degrees of freedom is distributed:
  $\frac{(n-2)\hat{\sigma}^2_{YX}}{\sigma^2_{YX}} \sim \chi^2_{n-2}$
– Where $\sigma^2_{YX}$ is the hypothesized value given under the null hypothesis

Chi-squared test (3)
• The root MSE from the Stata output or Excel output is the square root of $\hat{\sigma}^2_{YX}$
• So our $\chi^2$ statistic will be $\chi^2 = \frac{28(0.36)}{1} \approx 10.1$
• Now look at the $\chi^2$ table:
– The format is similar to the F table
– First column gives degrees of freedom
– Across the top row are the areas remaining in the right-hand tail of the curve
– The $H_0$ region is the area to the left of the tabulated $\chi^2$ value and the $H_1$ region is the area to the right

Chi-squared test (4)
• From the $\chi^2$ table, we find that at a 5% significance level the critical value is $\chi^2_{28, 0.05} = 41.34$
• Since $\chi^2 < \chi^2_{28, 0.05}$ we fail to reject the null

F-test
• In the second line down in the right-hand
column of the Stata output there is an F test, which looks at the significance of the regression equation using the sum of squares information from the ANOVA table
• Using the sample data, the F test is designed to test the difference of variances between samples drawn from independent populations
• The F statistic looks like:
  $F = \frac{\hat{\sigma}_1^2 / m}{\hat{\sigma}_2^2 / n}$
  where m and n are the respective degrees of freedom

F-test (2)
• Properties of the F distribution:
1) The F distribution is positively skewed and takes only positive values
2) As the degrees of freedom increase for both samples, the F distribution will approximate the normal
[Figure: the F(10,2) and F(50,50) distributions]

F-test (3)
• Last time we looked at the spreadsheet L9.xls and learned how to use the LINEST function in Excel to run a regression
• The LINEST output gives you the model (or explained) sum of squares, the residual sum of squares, and the F statistic
• The regression equation we estimated was
  $\hat{Y} = 2.69 + 0.23X$
  (0.70) (0.06)
– Where the numbers in parentheses are the standard errors of the coefficients

F-test (4)
• F-test where $H_0: b = 0$
• Our F statistic will be $F = \frac{\text{explained}/DF}{\text{residual}/DF}$
• So from our spreadsheet, we can plug in values to calculate our F statistic
  $F = \frac{5.98/1}{10.14/28} = \frac{5.98}{0.36} \approx 16.6$
• From the LINEST output, we find that the F statistic is 16.51 [the small gap comes from rounding 10.14/28 to 0.36]

F-test (5)
• In the F tables, the first column gives the degrees of freedom for the denominator, or df2
• Across the top row are the degrees of freedom for the numerator, or df1
• For each pair of degrees of freedom, there are four values, one for each significance level from 0.25 to 0.01
– The significance level refers to the area remaining in the right-hand tail of the F distribution

F-test (6)
• Now let’s look up the F value with df1 = 1 and df2 = 28
– The F value at a 5% significance level is 4.20: $F_{0.05,1,28} = 4.20$
• Therefore, if $F > F_{0.05,1,28}$ we reject the
null that b = 0
– Here the LINEST value F = 16.51 exceeds 4.20, so we reject the null
• You will use an F-test when you calculate within-sample predictions, or when you do a Chow test
• You will also use an F-test when you want to test the significance of a regression with multiple independent (or X) variables
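
The two test statistics from this lecture can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the sums of squares (5.98 and 10.14), the 28 residual degrees of freedom, and the table values (41.34 and 4.20) are the ones quoted in the slides, while the function and variable names are made up for illustration. The residual variance is kept unrounded here, so the results can differ slightly from hand calculations that use 0.36.

```python
# Values quoted in the lecture's worked example (LINEST output for L9.xls).
explained_ss = 5.98   # explained (model) sum of squares
residual_ss = 10.14   # residual sum of squares
df_explained = 1      # one slope coefficient
df_residual = 28      # n - 2

# Critical values as read from the chi-squared and F tables in the lecture.
CHI2_CRIT_28_05 = 41.34   # chi-squared, 28 df, 5% in the right-hand tail
F_CRIT_1_28_05 = 4.20     # F with (1, 28) df, 5% in the right-hand tail


def chi2_statistic(sigma2_hat, sigma2_null, df):
    """Chi-squared statistic: df * estimated variance / hypothesized variance."""
    return df * sigma2_hat / sigma2_null


def f_statistic(ess, df_ess, rss, df_rss):
    """F statistic: (explained SS / df) divided by (residual SS / df)."""
    return (ess / df_ess) / (rss / df_rss)


# Estimated residual variance (the square of the root MSE), unrounded.
sigma2_hat = residual_ss / df_residual

# H0: residual variance equals 1 (hypothetical illustration names throughout).
chi2 = chi2_statistic(sigma2_hat, sigma2_null=1.0, df=df_residual)
f_stat = f_statistic(explained_ss, df_explained, residual_ss, df_residual)

print(f"residual variance = {sigma2_hat:.4f}")
print(f"chi-squared = {chi2:.2f}, reject H0 (variance = 1): {chi2 > CHI2_CRIT_28_05}")
print(f"F = {f_stat:.2f}, reject H0 (b = 0): {f_stat > F_CRIT_1_28_05}")
```

Because the critical values are hard-coded from the tables, the sketch only covers the 5% tests with 28 residual degrees of freedom; a different sample size or significance level would need a different table lookup.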