F-Test for Joint Significance – Student Notes

Understanding the F-Test for Joint Significance

The F-test for joint significance is used in multiple regression analysis to determine whether a group of explanatory variables adds meaningful explanatory power to a model.

When do we use it?

Suppose we have a regression model, and we want to test whether a set of variables (e.g., calendar month and whether there's a sporting event) significantly improves the model. These variables may not be significant individually, but together they might be important.

What are we testing?

We compare two models:
- Restricted model: excludes the group of variables we’re testing.
- Unrestricted model: includes the group of variables.

We test the hypotheses:
- H₀: All the coefficients of the additional variables are zero (they have no effect).
- H₁: At least one coefficient of the additional variables is not zero (they do have an effect).

F-test statistic formula

F = ((R²ᵤ - R²ᵣ) / k) / ((1 - R²ᵤ) / (n - k - 1))

Where:
- R²ᵤ: R-squared of the unrestricted model
- R²ᵣ: R-squared of the restricted model
- k: Number of variables being tested (added in the unrestricted model)
- n: Total number of observations

This formula measures the improvement in fit per added variable (numerator) compared to the unexplained variance per degree of freedom (denominator).

Note: in general, the denominator degrees of freedom are n minus the number of coefficients estimated in the unrestricted model (intercept included). The n - k - 1 used here equals that only when k counts every slope coefficient in the unrestricted model, so adjust it if the restricted model already contains other regressors.

Decision Rule

1. Calculate the F-statistic.
2. Compare it to the critical value from the F-distribution with degrees of freedom (k, n - k - 1).
3. If F_calculated > F_critical, we reject H₀. That means the added variables are jointly significant and should be kept in the model. Otherwise, we do not reject H₀.

Why is this important?

- It helps us avoid omitting important variables.
- It ensures our model includes variables that genuinely improve the explanatory power.
- It helps in model selection: should we include more variables or keep the model simpler?
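The formula and decision rule from these notes can be sketched in Python as below. This is a minimal illustration, not a reference implementation: the function names joint_f_statistic and reject_h0 are hypothetical, the denominator degrees of freedom are taken as n - k - 1 exactly as the notes write them, and the critical value is assumed to be supplied by the caller (from an F table or a statistics library).

```python
def joint_f_statistic(r2_unrestricted, r2_restricted, n_obs, k_tested):
    """F-statistic from the R-squared form of the joint significance test.

    Assumption: denominator degrees of freedom are n - k - 1,
    following the formula as written in the notes.
    """
    df_denominator = n_obs - k_tested - 1
    if k_tested <= 0 or df_denominator <= 0:
        raise ValueError("need k >= 1 and n > k + 1")
    if not 0.0 <= r2_restricted <= r2_unrestricted < 1.0:
        raise ValueError("expected 0 <= R2_r <= R2_u < 1")

    # Numerator: improvement in fit per added variable.
    improvement_per_variable = (r2_unrestricted - r2_restricted) / k_tested
    # Denominator: unexplained variance per degree of freedom.
    unexplained_per_df = (1.0 - r2_unrestricted) / df_denominator
    return improvement_per_variable / unexplained_per_df


def reject_h0(f_calculated, f_critical):
    """Decision rule: reject H0 when the statistic exceeds the critical value."""
    return f_calculated > f_critical
```

Keeping the critical value as an input separates the arithmetic from the choice of significance level and lookup method.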
Example Recap (from your question):

- Restricted model: R² = 0.36
- Unrestricted model: R² = 0.51
- Sample size: 104
- Variables tested: 2

F = ((0.51 - 0.36) / 2) / ((1 - 0.51) / (104 - 2 - 1)) = 0.075 / 0.00485 ≈ 15.46

Calculated F ≈ 15.46, critical value ≈ 3.09 (F-distribution with (2, 101) degrees of freedom at the 5% level)

Since 15.46 > 3.09 ⇒ Reject H₀

Conclusion: The new variables (calendar month & sporting event) are jointly significant at the 5% level.
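The numbers in this example can be checked with a few lines of Python. This is a sketch under two assumptions: the denominator degrees of freedom are 104 - 2 - 1 = 101, as in the notes, and the critical value comes from a closed form that holds only when exactly 2 variables are tested (numerator degrees of freedom = 2). For any other number of tested variables, a statistics library would be the usual route, for example scipy.stats.f.ppf.

```python
# Figures taken from the example recap.
r2_restricted = 0.36
r2_unrestricted = 0.51
n_obs = 104
k_tested = 2
alpha = 0.05  # 5% significance level

df_denominator = n_obs - k_tested - 1  # 101, following the notes' formula

f_calculated = ((r2_unrestricted - r2_restricted) / k_tested) / (
    (1.0 - r2_unrestricted) / df_denominator
)

# Closed form valid ONLY for 2 numerator degrees of freedom:
# P(F > x) = (1 + 2x/d)^(-d/2), where d is the denominator df.
# Setting that probability equal to alpha and solving for x gives:
f_critical = (df_denominator / 2.0) * (alpha ** (-2.0 / df_denominator) - 1.0)

print(f"F = {f_calculated:.2f}, critical = {f_critical:.2f}")
# F = 15.46, critical = 3.09
print("Reject H0" if f_calculated > f_critical else "Do not reject H0")
# Reject H0
```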