Zachary Robinson
Hamaker Math 20
Final Project
November 30, 2011
The F-distribution: Derivation and Application
I. Introduction
In regression analysis, multiple tests are used to determine if variables are statistically
significant. Whereas the t-test is used to test a single variable for statistical significance, the F-test is used to test a set of multiple restrictions (the t-test is a special case of the F-test in which only
a single variable is being tested). The F-test depends upon an F-statistic. While this statistic may
be computed in multiple ways, it is always a ratio between two chi-squared distributed variables
divided by their respective degrees of freedom (Wooldridge 147, 743). The chi-squared
distribution is used to represent the sum of squares of independent normally distributed variables
with mean 0 and standard deviation 1, and degrees of freedom is a statistical concept that
represents the number of values that are free to vary in the final calculation of a statistic
(Grinstead and Snell 296).
One example of an F-statistic is that used to investigate whether there is a significant
difference between restricted and unrestricted regression models. In statistics, regression analysis
is used to examine the relationship between a dependent variable and one or more independent
variables. When comparing two different models, a restricted model is a modified version of an
unrestricted model that has the same dependent variable, but fewer independent variables than its
unrestricted counterpart. To determine if the restricted model is reasonable, we may use the F-
test to decide whether the variables lacking in the restricted model but present in the unrestricted
model are jointly significant. Therefore, we may use the following F-statistic:
F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n - k - 1)}
where RSS is the residual sum of squares, ur is the unrestricted equation, r is the restricted
equation, n is the number of data points, q is the number of variables being tested, and k is the
total number of variables in the unrestricted equation (Wooldridge 145). The residual sum of
squares quantifies the discrepancy between a data set’s predicted outcomes and the actual
outcomes by summing the squares of these differences. As a result, it can be shown that, under the standard normality assumptions, the RSS scaled by the error variance follows a chi-squared distribution (in this case, the underlying normal variables are the differences between actual and predicted outcomes). Additionally, q represents the numerator degrees of freedom (as it is the difference between the degrees of freedom of the restricted and unrestricted models), and n-k-1 equals the denominator degrees of freedom, i.e., the degrees of freedom of the unrestricted model. Thus, we see that this F-statistic meets the previously noted requirement of being a ratio of chi-squared distributed variables divided by their respective degrees of freedom.
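Concretely (a sketch of ours, not Wooldridge's), once both models have been fit and their residual sums of squares computed, the statistic reduces to a one-line calculation; the Python helper below, under the hypothetical name f_statistic, assumes those values are already known:

    def f_statistic(rss_r, rss_ur, q, n, k):
        """F = [(RSS_r - RSS_ur) / q] / [RSS_ur / (n - k - 1)]."""
        return ((rss_r - rss_ur) / q) / (rss_ur / (n - k - 1))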
II. Derivation of Distribution
In order to use the F-statistic, we must first know the F-distribution, which this paper seeks to
derive. To begin, we compare two independent random variables X and Y through the ratio T = X/Y, where Y > 0. Let X = U/m and Y = V/n, where U and V are chi-squared distributed with m and n degrees of
freedom, respectively. The chi-squared density has only one parameter, n, which equals the
number of degrees of freedom (Weisstein “Chi-Squared”). The chi-squared distribution with n
degrees of freedom can be written:
f(x) = \frac{.5^{.5n}}{\Gamma(.5n)}\, x^{.5n-1} e^{-.5x}
Where  (n ) is the gamma function. The gamma function is a generalization of the factorial
function, and for an integer z, Γ(𝑧) = (𝑧 − 1)!. In general, the gamma function can be defined as
∞
Γ(𝑧) = ∫ 𝑥 𝑧−1 𝑒 −𝑥 𝑑𝑥
0
(Feller 47).
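As a quick sanity check (our addition, not Feller's), the chi-squared density written above can be compared against a standard library implementation; the Python sketch below assumes scipy is available:

    from math import exp, gamma
    from scipy import stats

    def chi2_pdf(x, n):
        """Chi-squared density with n degrees of freedom, as written above."""
        return 0.5 ** (0.5 * n) / gamma(0.5 * n) * x ** (0.5 * n - 1) * exp(-0.5 * x)

    # Spot-check the formula against scipy's implementation at a few points.
    for x in (0.5, 1.0, 3.0):
        assert abs(chi2_pdf(x, 4) - stats.chi2.pdf(x, df=4)) < 1e-12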
Then the cdf of T is:

F_T(t) = P(T \le t) = P(X \le tY)
       = \int_0^\infty \int_0^{ty} f_X(x)\, f_Y(y)\, dx\, dy
       = \int_0^\infty f_Y(y) \int_0^{ty} f_X(x)\, dx\, dy
       = \int_0^\infty F_X(ty)\, f_Y(y)\, dy
Then take the derivative to find the pdf:
f_T(t) = F_T'(t) = \int_0^\infty y\, f_X(ty)\, f_Y(y)\, dy
(Feller 49).
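This ratio-density formula can be checked numerically on a case with a known closed form; the example below (ours, not Feller's) uses two independent Exp(1) variables, for which the ratio T = X/Y has density 1/(1+t)^2:

    from math import exp
    from scipy.integrate import quad

    # f_T(t) = integral of y * f_X(ty) * f_Y(y) dy, with f_X = f_Y = Exp(1),
    # should reduce to 1/(1+t)^2.
    integrand = lambda y, t: y * exp(-t * y) * exp(-y)
    for t in (0.5, 1.0, 2.0):
        val, _ = quad(integrand, 0, float('inf'), args=(t,))
        assert abs(val - 1 / (1 + t) ** 2) < 1e-7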
We must then find the pdfs of X and Y. Because U and V are chi-squared distributed, we know
the pdfs of U and V:
f_U(y) = \frac{.5^{.5m}}{\Gamma(.5m)}\, y^{.5m-1} e^{-.5y}

f_V(y) = \frac{.5^{.5n}}{\Gamma(.5n)}\, y^{.5n-1} e^{-.5y}
Here, we can use the fact that for any variable Z such that Z = aW, where a ≠ 0 is a constant, the pdf of Z is

f_Z(z) = \frac{1}{|a|}\, f_W\!\left(\frac{z}{a}\right)
Then, since X = U \cdot \frac{1}{m}:

f_X(y) = f_U(my)\, m
Next, we can use the pdf of U:
f_X(y) = f_U(my)\, m = m\, \frac{.5^{.5m}}{\Gamma(.5m)}\, (my)^{.5m-1} e^{-.5my} = \frac{(.5m)^{.5m}}{\Gamma(.5m)}\, y^{.5m-1} e^{-.5my}
Similarly:
f_Y(y) = f_V(ny)\, n = n\, \frac{.5^{.5n}}{\Gamma(.5n)}\, (ny)^{.5n-1} e^{-.5ny} = \frac{(.5n)^{.5n}}{\Gamma(.5n)}\, y^{.5n-1} e^{-.5ny}
Then, returning to f_T(t) = \int_0^\infty y\, f_X(ty)\, f_Y(y)\, dy, we can derive the density function of T = X/Y:

f_T(t) = \int_0^\infty y\, \frac{(.5m)^{.5m}}{\Gamma(.5m)}\, (ty)^{.5m-1} e^{-.5mty}\, \frac{(.5n)^{.5n}}{\Gamma(.5n)}\, y^{.5n-1} e^{-.5ny}\, dy

= \int_0^\infty \frac{(.5m)^{.5m} (.5n)^{.5n}}{\Gamma(.5m)\,\Gamma(.5n)}\, t^{.5m-1}\, y^{.5m+.5n-1} e^{-.5(mt+n)y}\, dy

= \frac{(.5m)^{.5m} (.5n)^{.5n}}{\Gamma(.5m)\,\Gamma(.5n)}\, t^{.5m-1}\, \frac{\Gamma(.5m+.5n)}{(.5mt+.5n)^{.5m+.5n}} \int_0^\infty \frac{(.5mt+.5n)^{.5m+.5n}}{\Gamma(.5m+.5n)}\, y^{.5m+.5n-1} e^{-.5(mt+n)y}\, dy

= \frac{(.5m)^{.5m} (.5n)^{.5n}}{\Gamma(.5m)\,\Gamma(.5n)}\, t^{.5m-1}\, \frac{\Gamma(.5m+.5n)}{(.5mt+.5n)^{.5m+.5n}}

since the remaining integrand is a gamma density and thus integrates to 1. Therefore:

f_T(t) = \frac{\Gamma(.5m+.5n)}{\Gamma(.5m)\,\Gamma(.5n)}\, \frac{t^{.5m-1}\, m^{.5m}\, n^{.5n}}{(mt+n)^{.5m+.5n}}
This is the pdf of the F-distribution (Taboga).
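As a check on the algebra (our addition), the derived density can be compared point by point with scipy's F density, whose shape parameters dfn and dfd correspond to m and n:

    from math import gamma
    from scipy import stats

    def f_pdf(t, m, n):
        """The F density derived above, in the paper's notation."""
        return (gamma(0.5 * m + 0.5 * n) / (gamma(0.5 * m) * gamma(0.5 * n))
                * t ** (0.5 * m - 1) * m ** (0.5 * m) * n ** (0.5 * n)
                / (m * t + n) ** (0.5 * m + 0.5 * n))

    # Spot-check against scipy at a few points, e.g. with m = 5 and n = 74.
    for t in (0.5, 1.0, 2.5):
        assert abs(f_pdf(t, 5, 74) - stats.f.pdf(t, dfn=5, dfd=74)) < 1e-9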
Then the cdf of the F-distribution is:

F_T(x) = \int_0^x f_T(t)\, dt = \frac{2\, m^{.5(m-2)} \left(\frac{x}{n}\right)^{.5m} {}_2F_1\!\left(.5m+.5n,\ .5m;\ 1+.5m;\ -\frac{mx}{n}\right)}{B(.5m,\ .5n)}

(Weisstein “F-Distribution”).
where {}_2F_1(.5m+.5n,\ .5m;\ 1+.5m;\ -\frac{mx}{n}) is a hypergeometric function of the form

{}_2F_1(a, b; c; z) = \frac{\Gamma(c)}{\Gamma(b)\,\Gamma(c-b)} \int_0^1 \frac{t^{b-1} (1-t)^{c-b-1}}{(1-tz)^a}\, dt
(Weisstein “Hypergeometric”).
and B(.5m, .5n) is the beta function, of the form

B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}
(Weisstein “Beta”).
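The hypergeometric form of the cdf can likewise be checked against scipy, which exposes both the 2F1 and beta functions directly (a sketch we added, assuming scipy.special's hyp2f1 and beta):

    from scipy.special import beta, hyp2f1
    from scipy import stats

    def f_cdf(x, m, n):
        """The hypergeometric form of the F cdf given above."""
        return (2 * m ** (0.5 * (m - 2)) * (x / n) ** (0.5 * m)
                * hyp2f1(0.5 * m + 0.5 * n, 0.5 * m, 1 + 0.5 * m, -m * x / n)
                / beta(0.5 * m, 0.5 * n))

    # Spot-check against scipy's F cdf.
    for x in (0.5, 1.97, 3.0):
        assert abs(f_cdf(x, 5, 74) - stats.f.cdf(x, dfn=5, dfd=74)) < 1e-9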
III. Application
Knowing the cdf of the F-distribution, we can now use that knowledge to analyze problems
such as the one of statistical significance presented above. In statistics, statistical significance is
defined by a significance level α. This significance level is not the same for all problems and can
be varied to suit the needs of the current study, but is often .1, .05, or .01. We also require a null
hypothesis, which is an assumption made prior to analyzing the data, and if we have statistically
significant evidence, we are able to reject the null. For example, in the problem presented in the
introduction, the null hypothesis would be that the jointly tested independent variables have no
effect on the dependent variable. If we were to find statistically significant evidence, we would
then prefer the unrestricted model which takes these variables into account.
To determine whether the evidence is statistically significant at significance level α, we must first calculate the p-value. A p-value is the probability of observing data at least as extreme as the actual data if the null hypothesis were true. If the p-value is very small, i.e., lower than the significance level, we have adequate evidence to reject the null.
For an F-test, we can use the cdf to determine the p-value for a given set of data. For an F-statistic F with m numerator degrees of freedom, n denominator degrees of freedom, and cdf F_T(F):

p\text{-value} = P(\mathbb{F} > F) = 1 - P(\mathbb{F} \le F) = 1 - F_T(F)
where 𝔽 is a random F-variable with the same m and n degrees of freedom (Wooldridge 151).
For example, if we had RSS_r = 195.31, RSS_{ur} = 172.36, q = 5, n = 89, and k = 14, then

F = \frac{(195.31 - 172.36)/5}{172.36/(89 - 14 - 1)} = 1.97
Then the p-value for F with 5 numerator degrees of freedom and 89-14-1=74 denominator
degrees of freedom is .093, which would be statistically significant when α=.1, but not for any
α<.093, e.g. α=.05.
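This worked example can be reproduced in a few lines (our sketch; scipy's survival function sf computes 1 - cdf, i.e. the p-value above):

    from scipy import stats

    rss_r, rss_ur, q, n, k = 195.31, 172.36, 5, 89, 14
    F = ((rss_r - rss_ur) / q) / (rss_ur / (n - k - 1))
    p_value = stats.f.sf(F, dfn=q, dfd=n - k - 1)
    print(round(F, 2), round(p_value, 3))   # expect roughly 1.97 and 0.093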
References
Feller, William. An Introduction to Probability Theory and Its Applications. Vol. 2, 2nd ed. 1971.
Grinstead, Charles M., and J. Laurie Snell. Grinstead and Snell's Introduction to Probability.
2nd ed. 2003.
Taboga, M. "Lectures on probability and statistics." 2010. http://www.statlect.com
Weisstein, Eric W. "Beta Function." From MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/BetaFunction.html
Weisstein, Eric W. "Chi-Squared Distribution." From MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/Chi-SquaredDistribution.html
Weisstein, Eric W. "F-Distribution." From MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/F-Distribution.html
Weisstein, Eric W. "Hypergeometric Function." From MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/HypergeometricFunction.html
Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 4th ed. 2009.