Robinson 1
Zachary Robinson
Hamaker
Math 20
Final Project
November 30, 2011

The F-distribution: Derivation and Application

I. Introduction

In regression analysis, several tests are used to determine whether variables are statistically significant. Whereas the t-test is used to test a single variable for statistical significance, the F-test is used to test a set of multiple restrictions (the t-test is a special case of the F-test in which only a single variable is being tested). The F-test depends upon an F-statistic. While this statistic may be computed in multiple ways, it is always a ratio of two independent chi-squared distributed variables, each divided by its respective degrees of freedom (Wooldridge 147, 743). The chi-squared distribution describes the sum of squares of independent normally distributed variables with mean 0 and standard deviation 1, and the degrees of freedom are the number of values that are free to vary in the final calculation of a statistic (Grinstead and Snell 296).

One example of an F-statistic is the one used to investigate whether there is a significant difference between restricted and unrestricted regression models. In statistics, regression analysis is used to examine the relationship between a dependent variable and one or more independent variables. When comparing two models, a restricted model is a modified version of an unrestricted model that has the same dependent variable but fewer independent variables than its unrestricted counterpart. To determine whether the restricted model is reasonable, we may use the F-test to decide whether the variables that are absent from the restricted model but present in the unrestricted model are jointly significant.
Therefore, we may use the following F-statistic:

\[ F = \frac{(RSS_r - RSS_{ur})/q}{RSS_{ur}/(n-k-1)} \]

where RSS is the residual sum of squares, ur denotes the unrestricted equation, r denotes the restricted equation, n is the number of data points, q is the number of variables being tested, and k is the total number of variables in the unrestricted equation (Wooldridge 145). The residual sum of squares quantifies the discrepancy between a model's predicted outcomes and the actual outcomes by summing the squares of the differences between them. As a result, it can be shown that, under the usual assumption of normally distributed errors, the RSS (once scaled by the error variance, which cancels in the ratio) follows a chi-squared distribution; in this case, the variables being squared and summed are the differences between actual and predicted outcomes. Additionally, q is the numerator degrees of freedom (as it is the difference between the degrees of freedom of the restricted and unrestricted models) and n-k-1 is the denominator degrees of freedom, i.e. the degrees of freedom of the unrestricted model. Thus, we see that this F-statistic meets the previously noted requirement of being the ratio of two chi-squared distributed variables, each divided by its respective degrees of freedom.

II. Derivation of Distribution

In order to use the F-statistic, we must first know the F-distribution, which this paper seeks to derive. To begin, we compare two independent variables X and Y through the ratio

\[ T = \frac{X}{Y}, \qquad Y > 0. \]

Let

\[ X = \frac{U}{m}, \qquad Y = \frac{V}{n}, \]

where U and V are chi-squared distributed with m and n degrees of freedom, respectively. The chi-squared density has only one parameter, n, which equals the number of degrees of freedom (Weisstein “Chi-Squared”). The chi-squared density with n degrees of freedom can be written, for x > 0:

\[ f(x) = \frac{0.5^{0.5n}}{\Gamma(0.5n)} \, x^{0.5n-1} e^{-0.5x} \]

where \Gamma is the gamma function. The gamma function is a generalization of the factorial function, and for a positive integer z, \Gamma(z) = (z-1)!. In general, the gamma function can be defined as

\[ \Gamma(z) = \int_0^{\infty} x^{z-1} e^{-x} \, dx \]

(Feller 47).
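The chi-squared density and the gamma function just defined can be checked numerically. The following is only a minimal sketch, using nothing beyond Python's standard library; the helper names chi2_pdf and integrate, the midpoint rule, and the upper integration limit of 200 are illustrative assumptions and are not part of the sources cited in this paper.

```python
import math


def chi2_pdf(x: float, n: int) -> float:
    """Chi-squared density with n degrees of freedom, in the form given in the text."""
    if x <= 0:
        return 0.0
    coefficient = 0.5 ** (0.5 * n) / math.gamma(0.5 * n)
    return coefficient * x ** (0.5 * n - 1) * math.exp(-0.5 * x)


def integrate(f, a: float, b: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b] (hypothetical helper)."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Gamma(z) = (z - 1)! for positive integers z.
for z in range(1, 8):
    assert math.isclose(math.gamma(z), math.factorial(z - 1))

# The integral definition of the gamma function, truncated at 200 (an assumed
# cutoff; the neglected tail is negligible), compared with math.gamma at z = 2.5.
z = 2.5
approx_gamma = integrate(lambda x: x ** (z - 1) * math.exp(-x), 0.0, 200.0)
print("integral:", approx_gamma, "math.gamma:", math.gamma(z))

# A density should integrate to approximately 1 for any degrees of freedom.
for n in (3, 5, 10):
    print(n, integrate(lambda x: chi2_pdf(x, n), 0.0, 200.0))
```

Degrees of freedom of 3 or more are used in the last loop because for n = 1 the density is unbounded near zero, where a simple midpoint rule converges slowly.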
Then the cdf of T is:

\[
\begin{aligned}
F_T(t) = P(T \le t) = P(X \le tY)
&= \int_0^{\infty} \int_0^{ty} f_X(x) f_Y(y) \, dx \, dy \\
&= \int_0^{\infty} f_Y(y) \int_0^{ty} f_X(x) \, dx \, dy \\
&= \int_0^{\infty} F_X(ty) f_Y(y) \, dy
\end{aligned}
\]

Then take the derivative with respect to t to find the pdf:

\[ f_T(t) = F_T'(t) = \int_0^{\infty} y \, f_X(ty) f_Y(y) \, dy \]

(Feller 49).

We must then find the pdfs of X and Y. Because U and V are chi-squared distributed, we know the pdfs of U and V:

\[ f_U(y) = \frac{0.5^{0.5m}}{\Gamma(0.5m)} \, y^{0.5m-1} e^{-0.5y} \]

\[ f_V(y) = \frac{0.5^{0.5n}}{\Gamma(0.5n)} \, y^{0.5n-1} e^{-0.5y} \]

Here, we can use the fact that for any variable Z such that Z = aW, where a ≠ 0 is a constant, the pdf of Z is

\[ f_Z(z) = \frac{1}{|a|} f_W\!\left(\frac{z}{a}\right) \]

Then, since X = U/m (that is, a = 1/m):

\[ f_X(y) = m \, f_U(my) \]

Next, we can use the pdf of U:

\[ f_X(y) = m \, f_U(my) = m \, \frac{0.5^{0.5m}}{\Gamma(0.5m)} (my)^{0.5m-1} e^{-0.5my} \]

\[ f_X(y) = \frac{(0.5m)^{0.5m}}{\Gamma(0.5m)} \, y^{0.5m-1} e^{-0.5my} \]

Similarly:

\[ f_Y(y) = n \, f_V(ny) = n \, \frac{0.5^{0.5n}}{\Gamma(0.5n)} (ny)^{0.5n-1} e^{-0.5ny} \]

\[ f_Y(y) = \frac{(0.5n)^{0.5n}}{\Gamma(0.5n)} \, y^{0.5n-1} e^{-0.5ny} \]

Then, returning to f_T(t) = \int_0^{\infty} y f_X(ty) f_Y(y) dy, we can derive the density function of T = X/Y:

\[
\begin{aligned}
f_T(t)
&= \int_0^{\infty} y \, \frac{(0.5m)^{0.5m}}{\Gamma(0.5m)} (ty)^{0.5m-1} e^{-0.5mty} \, \frac{(0.5n)^{0.5n}}{\Gamma(0.5n)} \, y^{0.5n-1} e^{-0.5ny} \, dy \\
&= \int_0^{\infty} \frac{(0.5m)^{0.5m}}{\Gamma(0.5m)} \frac{(0.5n)^{0.5n}}{\Gamma(0.5n)} \, t^{0.5m-1} y^{0.5m+0.5n-1} e^{-0.5(mt+n)y} \, dy \\
&= \frac{(0.5m)^{0.5m}}{\Gamma(0.5m)} \frac{(0.5n)^{0.5n}}{\Gamma(0.5n)} \, t^{0.5m-1} \frac{\Gamma(0.5m+0.5n)}{(0.5mt+0.5n)^{0.5m+0.5n}} \int_0^{\infty} \frac{(0.5mt+0.5n)^{0.5m+0.5n}}{\Gamma(0.5m+0.5n)} \, y^{0.5m+0.5n-1} e^{-0.5(mt+n)y} \, dy \\
&= \frac{(0.5m)^{0.5m}}{\Gamma(0.5m)} \frac{(0.5n)^{0.5n}}{\Gamma(0.5n)} \, t^{0.5m-1} \frac{\Gamma(0.5m+0.5n)}{(0.5mt+0.5n)^{0.5m+0.5n}}
\end{aligned}
\]

The last step holds because the remaining integrand is a gamma density (with shape 0.5m + 0.5n and rate 0.5mt + 0.5n), so it integrates to 1. Cancelling the powers of 0.5 gives

\[ f_T(t) = \frac{\Gamma(0.5m+0.5n)}{\Gamma(0.5m)\Gamma(0.5n)} \cdot \frac{m^{0.5m} \, n^{0.5n} \, t^{0.5m-1}}{(mt+n)^{0.5m+0.5n}} \]

which is the pdf of the F-distribution (Taboga).

Then the cdf of the F-distribution is:

\[ F_T(x) = \int_0^{x} f_T(t) \, dt = \frac{2 \, m^{(m-2)/2} \left(\frac{x}{n}\right)^{0.5m} {}_2F_1\!\left(0.5m+0.5n, \, 0.5m; \, 1+0.5m; \, -\frac{mx}{n}\right)}{B(0.5m, 0.5n)} \]

(Weisstein “F-Distribution”), where {}_2F_1(0.5m+0.5n, 0.5m; 1+0.5m; -mx/n) is a hypergeometric function of the form

\[ {}_2F_1(a, b; c; z) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)} \int_0^1 \frac{t^{b-1} (1-t)^{c-b-1}}{(1-tz)^{a}} \, dt \]

(Weisstein “Hypergeometric”), and B(0.5m, 0.5n) is a beta function of the form

\[ B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)} \]

(Weisstein “Beta”).

III. Application

Knowing the cdf of the F-distribution, we can now use it to analyze problems such as the question of statistical significance presented above. In statistics, statistical significance is defined relative to a significance level α. This significance level is not the same for all problems and can be varied to suit the needs of the current study, but it is often 0.1, 0.05, or 0.01. We also require a null hypothesis, which is an assumption made prior to analyzing the data; if we have statistically significant evidence against it, we are able to reject the null. For example, in the problem presented in the introduction, the null hypothesis would be that the jointly tested independent variables have no effect on the dependent variable. If we were to find statistically significant evidence, we would then prefer the unrestricted model, which takes these variables into account.

To determine whether there is sufficient evidence to be statistically significant at significance level α, we must first calculate the p-value. A p-value is the probability, if the null hypothesis were true, of observing a test statistic at least as extreme as the one actually observed. If the p-value is very small, i.e. lower than the significance level, we have adequate evidence to reject the null.

For an F-test, we can use the cdf to determine the p-value for a given set of data. For an F-statistic F with m numerator degrees of freedom and n denominator degrees of freedom, and with F_T the cdf of the F-distribution with those degrees of freedom:

\[ \text{p-value} = P(\mathbb{F} > F) = 1 - P(\mathbb{F} \le F) = 1 - F_T(F) \]

where \mathbb{F} is a random F-variable with the same m and n degrees of freedom (Wooldridge 151).

For example, if we had

RSS_r = 195.31
RSS_ur = 172.36
q = 5
n = 89
k = 14

then

\[ F = \frac{(195.31 - 172.36)/5}{172.36/(89 - 14 - 1)} \approx 1.97 \]

Then the p-value for F with 5 numerator degrees of freedom and 89 - 14 - 1 = 74 denominator degrees of freedom is 0.093, which would be statistically significant when α = 0.1, but not for any α < 0.093, e.g. α = 0.05.

References

Feller, William. An Introduction to Probability Theory and Its Applications. 2nd ed. Vol. 2. 1971.
Grinstead, Charles M., and J. Laurie Snell. Grinstead and Snell's Introduction to Probability. 2nd ed. 2003.
Taboga, M. "Lectures on probability and statistics." 2010. http://www.statlect.com
Weisstein, Eric W. "Beta Function." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/BetaFunction.html
Weisstein, Eric W. "Chi-Squared Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/Chi-SquaredDistribution.html
Weisstein, Eric W. "F-Distribution." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/F-Distribution.html
Weisstein, Eric W. "Hypergeometric Function." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/HypergeometricFunction.html
Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 4th ed. 2009.