Lecture 8: F Test and Vector Review
Ailin Zhang
2024-05-31

Recap: ANOVA Table

Source              df      SS              Mean square (MS)
Model               1       β̂₁² Sxx         β̂₁² Sxx ÷ 1
Residual            n − 2   Σᵢ r̂ᵢ²          Σᵢ r̂ᵢ² ÷ (n − 2)
Total (corrected)   n − 1   Σᵢ (yᵢ − ȳ)²

Today's Agenda

- Constructing the F-test
- Example in R (F-test; complete summary function)
- Understanding power transformations
- Review of random vectors

The F Distribution

Suppose U ∼ χ²_m, V ∼ χ²_n, and U, V are independent. Then

    Y = (U ÷ m) / (V ÷ n) ∼ F(m, n),

the F distribution on m and n degrees of freedom. The degrees of freedom m and n are sometimes called the numerator and denominator degrees of freedom, respectively.

- If Y ∼ F(m, n), then Y⁻¹ ∼ F(n, m).
- The density f_Y(y; m, n) is zero for y < 0 and positive for y > 0.

(Figure: density curves of the F distribution for several choices of m and n.)

Exercise: Suppose X ∼ t_k. Find the distribution of Y = X². Answer: F(1, k).

Exercise: Suppose U ∼ χ_m and V ∼ χ_n (chi distributions), with U, V independent. Find the distribution of Y = (U² ÷ m) / (V² ÷ n). Answer: F(m, n).

Mean Squares

Residual MS = MSE = RSS ÷ (n − 2) = Σᵢ r̂ᵢ² ÷ (n − 2) = σ̂², and we know that this is the estimate of σ².

Model MS = MSS ÷ 1 = β̂₁² Sxx.

Dividing the model MS by the residual MS provides a discrepancy measure for testing the hypothesis H₀: β₁ = 0: a large observed value of the ratio

    MS_Model / MS_Residual

would provide evidence against the hypothesis. But what is the sampling distribution of this ratio? Under H₀, it is F(1, n − 2).
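The facts above can be checked numerically in R with the built-in quantile functions (a quick sketch; the degrees of freedom below are arbitrary choices):

```r
# If X ~ t_k then X^2 ~ F(1, k): the squared t quantile equals an F quantile
k <- 12
qt(0.975, df = k)^2          # upper 2.5% point of t_k, squared
qf(0.95, df1 = 1, df2 = k)   # upper 5% point of F(1, k) -- same value

# If Y ~ F(m, n) then 1/Y ~ F(n, m): quantiles are reciprocals
1 / qf(0.9, df1 = 3, df2 = 7)
qf(0.1, df1 = 7, df2 = 3)    # same value
```

Both pairs agree to floating-point accuracy.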
Example in R

```r
# Read data
bw_data <- read.table("lowbwt.csv", sep = ",", header = TRUE)

# Fit a simple linear model using the lm() function
fit <- lm(headcirc ~ gestage, data = bw_data)

# Assign the summary of the fitted model
fit_summary <- summary(fit)
fit_summary
```

```
## Call:
## lm(formula = headcirc ~ gestage, data = bw_data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.5358 -0.8760 -0.1458  0.9041  6.9041
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.91426    1.82915    2.14   0.0348 *
## gestage      0.78005    0.06307   12.37   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.59 on 98 degrees of freedom
## Multiple R-squared:  0.6095, Adjusted R-squared:  0.6055
## F-statistic: 152.9 on 1 and 98 DF,  p-value: < 2.2e-16
```

```r
# Get the ANOVA table for the fitted model
anova(fit)
```

```
## Analysis of Variance Table
##
## Response: headcirc
##           Df Sum Sq Mean Sq F value    Pr(>F)
## gestage    1 386.87  386.87  152.95 < 2.2e-16 ***
## Residuals 98 247.88    2.53
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

Power Transformation

When should power transformations be used? When two variables x and y are

1. monotonically related (e.g., as x increases, y increases), and
2. both greater than zero,

then taking a power transformation of x, of y, or of both can result in a straighter-looking scatterplot.

How to perform power transformations? For n pairs of positive values (xᵢ, yᵢ), i = 1, …, n, displaying a (roughly) monotonic relationship, consider the n pairs of transformed values (T_p(xᵢ), T_q(yᵢ)), i = 1, …, n, with p, q ∈ ℝ.
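As a quick illustration of the idea (simulated data; the growth rate and noise level are arbitrary choices): when y grows roughly exponentially in x, transforming y with log₁₀ straightens the scatterplot, which shows up as a much higher R² for the linear fit.

```r
set.seed(42)
x <- 1:20
y <- exp(0.3 * x + rnorm(20, sd = 0.05))  # positive, monotone, curved in x

r2_raw <- summary(lm(y ~ x))$r.squared          # fit on the original scale
r2_log <- summary(lm(log10(y) ~ x))$r.squared   # fit after the log transform
c(r2_raw, r2_log)  # the transformed fit is nearly a perfect straight line
```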
In practice, we use a "computationally easy" transform:

    T_p(z) = z^p         if p > 0,
    T_p(z) = log₁₀(z)    if p = 0,
    T_p(z) = −z^p        if p < 0.

To choose p and q, we use Tukey's ladder of powers together with the bump rule (based on the curvature of the scatterplot). (Figure: Tukey's ladder and the bump rule.)

The bump rule can also use the location of a density's bump (mode) to tell you which way to "move" on the ladder:

- If the bump is concentrated on lower values, move the power lower on the ladder (move down the ladder).
- If the bump is concentrated on higher values, move the power higher on the ladder (move up the ladder).

Example

The US population from 1670–1860. (Figure: two panels; the y-axis of the right panel is on a log scale.) Least-squares estimates, confidence and prediction intervals, and tests of hypotheses are all the same as before; simply replace y by log(y). This transformed model also makes more physical sense!

Vector Notation

The straight-line model for the realized values is

    yᵢ = µᵢ + rᵢ,  with  µᵢ = β₀ + β₁xᵢ,  for i = 1, …, n.

Using vector notation, we write all n equations at once as

    (y₁, y₂, …, yₙ)ᵀ = (µ₁, µ₂, …, µₙ)ᵀ + (r₁, r₂, …, rₙ)ᵀ,

    (µ₁, µ₂, …, µₙ)ᵀ = β₀ × (1, 1, …, 1)ᵀ + β₁ × (x₁, x₂, …, xₙ)ᵀ.

Length/Magnitude of a Vector

For v = (v₁, v₂, ⋯, vₙ)ᵀ,

    ‖v‖ = √(v₁² + v₂² + ⋯ + vₙ²) = √(Σᵢ vᵢ²) = √(vᵀv),

so that ‖v‖² = vᵀv = Σᵢ vᵢ².

Random Vectors

Using vector notation, the model is Y = µ + R, where Y = (Y₁, Y₂, …, Yₙ)ᵀ and R = (R₁, R₂, …, Rₙ)ᵀ are column vectors of n random variables, called random vectors, µ = (µ₁, µ₂, …
, µₙ)ᵀ is a (non-random) column vector, and y and r are realizations of the random vectors Y and R, respectively, called realized vectors.

Notation

For any m × k array of random variables Zᵢⱼ, i = 1, …, m, j = 1, …, k, we define the random matrix

    Z = [ Z₁₁ Z₁₂ ⋯ Z₁ₖ ]
        [ Z₂₁ Z₂₂ ⋯ Z₂ₖ ]
        [  ⋮   ⋮   ⋱   ⋮ ]
        [ Zₘ₁ Zₘ₂ ⋯ Zₘₖ ]

which can be written column-wise as Z = [Z₁, Z₂, ⋯, Zₖ], where Zⱼ = (Z₁ⱼ, Z₂ⱼ, …, Zₘⱼ)ᵀ are column vectors, or row-wise in terms of the row vectors zᵢᵀ = (Zᵢ₁, Zᵢ₂, …, Zᵢₖ).

Definitions

For a random vector Z = (Z₁, Z₂, …, Zₙ)ᵀ ∈ ℝⁿ with mean θ = E(Z):

- the expectation of Z is E(Z) = (E(Z₁), E(Z₂), …, E(Zₙ))ᵀ ∈ ℝⁿ;
- the variance-covariance matrix of Z is the n × n matrix Σ = [σᵢⱼ], where

    σᵢⱼ = Cov(Zᵢ, Zⱼ) = E((Zᵢ − θᵢ)(Zⱼ − θⱼ))  for all i, j = 1, …, n.

When i = j, σᵢᵢ = Var(Zᵢ).

Variance-Covariance Matrix

Properties of Σ:

- Σ is symmetric.
- Σ is positive semi-definite (i.e., aᵀΣa ≥ 0 for any a ∈ ℝⁿ).
- If Z₁, …, Zₙ are independent, then Cov(Zᵢ, Zⱼ) = 0 for all i ≠ j, so Σ is a diagonal matrix.

Basic Properties of Random Vectors

Let Z ∈ ℝⁿ be a random vector, b ∈ ℝⁿ a constant vector, and A an n × n constant matrix. Then:

1. E(bᵀZ) = bᵀE(Z)
2. Var(bᵀZ) = bᵀ Var(Z) b
3. E(AZ + b) = A E(Z) + b
4. Var(AZ + b) = A Var(Z) Aᵀ

More generally, for an l × m constant matrix A, a k × p constant matrix B, an l × p constant matrix C, and an m × k random matrix Z, E(AZB + C) = A E(Z) B + C.

Proposition

Let Z be a random vector in ℝⁿ, with E(Z) = θ and Var(Z) = Σ. Then for any constant m × n matrix A and constant vector b ∈ ℝᵐ, the random vector Y = AZ + b has expectation E(Y) = Aθ + b and variance-covariance matrix Var(Y) = AΣAᵀ.
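Because sample means and sample covariances obey the same linearity rules, the identities E(AZ + b) = AE(Z) + b and Var(AZ + b) = A Var(Z)Aᵀ can be checked on simulated data in R (a sketch; the matrices A and b below are arbitrary choices):

```r
set.seed(1)
n <- 200
Z <- matrix(rnorm(n * 3), n, 3)  # each row is one draw of a random vector in R^3
A <- matrix(c( 1, 2, 0,
              -1, 1, 1), nrow = 2, byrow = TRUE)  # 2 x 3 constant matrix
b <- c(0.5, -2)

# Transform every draw: row i of W is A %*% Z[i, ] + b
W <- Z %*% t(A) + matrix(b, n, 2, byrow = TRUE)

# Sample analogues of the proposition (agree up to floating-point error):
all.equal(colMeans(W), as.vector(A %*% colMeans(Z) + b))  # E(Y) = A theta + b
all.equal(cov(W), A %*% cov(Z) %*% t(A))                  # Var(Y) = A Sigma A^T
```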
Partitioned Matrix

Occasionally it will be handy to divide a matrix into blocks, where each block is a submatrix. For example, suppose a d × 1 random vector Z is partitioned as

    Z = (Z₁; Z₂),

where Z₁ is k × 1 and Z₂ is (d − k) × 1. Then the d × d variance-covariance matrix Var(Z) = Σ can be written as a partitioned matrix:

    Σ = [ Σ₁₁  Σ₁₂ ]
        [ Σ₂₁  Σ₂₂ ]

Matrix Operations

Let M be a d × d matrix partitioned as

    M = [ A  B ]
        [ C  D ]

where A is k × k, B is k × (d − k), C is (d − k) × k, and D is (d − k) × (d − k). And let N be a d × d matrix similarly partitioned as

    N = [ P  Q ]
        [ R  S ]

where P is k × k, Q is k × (d − k), R is (d − k) × k, and S is (d − k) × (d − k). Then the matrix product is

    MN = [ A  B ][ P  Q ] = [ (AP + BR)  (AQ + BS) ]
         [ C  D ][ R  S ]   [ (CP + DR)  (CQ + DS) ]

Similarly, the transpose is

    Mᵀ = [ Aᵀ  Cᵀ ]
         [ Bᵀ  Dᵀ ]

Moreover, if Mᵀ = M, then Aᵀ = A, Dᵀ = D, and C = Bᵀ. Suppose further that A and the Schur complement E = D − BᵀA⁻¹B are invertible. In this case the inverse M⁻¹ exists and can be written as

    [ A   B ]⁻¹  =  [ A⁻¹ + FE⁻¹Fᵀ   −FE⁻¹ ]
    [ Bᵀ  D ]       [ −E⁻¹Fᵀ          E⁻¹  ]

where E = D − BᵀA⁻¹B and F = A⁻¹B.
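The block-inverse formula can be verified numerically in R on a random symmetric positive-definite matrix (a sketch; the dimensions d and k are arbitrary choices):

```r
set.seed(2)
d <- 5; k <- 2
M0 <- matrix(rnorm(d * d), d, d)
M <- crossprod(M0) + diag(d)    # symmetric positive definite, so M, A, E invertible

A <- M[1:k, 1:k]
B <- M[1:k, (k + 1):d]
D <- M[(k + 1):d, (k + 1):d]    # since M is symmetric, the (2,1) block is t(B)

E  <- D - t(B) %*% solve(A) %*% B   # Schur complement of A in M
Fm <- solve(A) %*% B                # the "F" of the formula (renamed: F means FALSE in R)

# Assemble the inverse block by block, then compare with the direct inverse
M_inv <- rbind(
  cbind(solve(A) + Fm %*% solve(E) %*% t(Fm), -Fm %*% solve(E)),
  cbind(-solve(E) %*% t(Fm),                   solve(E))
)
all.equal(M_inv, solve(M))   # agrees up to floating-point error
```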