Analysis of variance (ANOVA)

Purpose: to test for statistically significant differences among two or more population means.
    H0: all of the population means are equal.
    Ha: not all of the population means are equal.
    Rejection of H0 means that there is a statistically significant difference between at least two of the sample means (evidence that at least two population means differ).
    Estimation (point and interval) of population means is also possible.

One-way ANOVA (completely randomized design)

Types of variation -- variation is measured by sums of squared deviations.
    Between-group variation -- signal-related
        Measured by "SST" -- sum of squared deviations for treatments (groups).
    Within-group variation -- noise-related
        Measured by "SSE" -- sum of squared deviations for error.
    Total variation
        Measured by "TSS" -- total sum of squared deviations.
    TSS = SST + SSE.

Degrees of freedom in ANOVA:
    d.f. for treatments -- the "numerator" d.f.; d.f. = (no. of treatments - 1)
    d.f. for error -- the "denominator" d.f.; d.f. = (total no. of observations - no. of treatments)
    overall d.f. -- sum of the other two d.f.'s; also d.f. = (total no. of observations - 1)

SST and SSE are divided by the appropriate number of degrees of freedom, yielding MST (mean of squared deviations for treatments) and MSE (mean of squared deviations for error), respectively.
    MST = SST / numerator d.f.
    MSE = SSE / denominator d.f.
    MST is the "signal." MSE is the "noise." The larger the MST relative to the MSE, the stronger the evidence against H0.

Fc, the test statistic, is the ratio of MST to MSE:
    Fc = MST / MSE.
    Reject H0 if Fc > Ft. The Ft (table F) is based on α and the numerator and denominator degrees of freedom.
    Equivalently, reject H0 if p < α.

ANOVA table: standardized way of presenting results of ANOVA computations.

When there are only two groups and a (pooled-variance) t-test could be used, the Fc will be equal to the square of the tc.

ANOVA estimation -- single population mean and difference between two population means
    Similar to z and t procedures.
    The $t_t$ (table t) in the following equations is based on the number of degrees of freedom for error.
    Single population mean: $\mu = \bar{X} \pm t_t \, \hat{\sigma}_{\bar{X}}$, where $\hat{\sigma}_{\bar{X}} = \sqrt{\mathrm{MSE} / n}$
    Difference between two population means: $(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm t_t \, \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$, where $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\mathrm{MSE} \times \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

Assumptions -- same as t-tests
    Samples: random, independent
    Populations: normal, equal variances

Two-way ANOVA (randomized block design)

Two independent variables -- treatments and blocks.
Two between-group variations -- signal-related: SST & SSB (SSB = sum of squared deviations for blocks).
TSS = SST + SSB + SSE.

Degrees of freedom:
    d.f. for treatments -- "numerator" d.f.; d.f. = (no. of treatments - 1)
    d.f. for blocks -- "numerator" d.f.; d.f. = (no. of blocks - 1)
    d.f. for error -- "denominator" d.f.; d.f. = (total no. of observations - no. of treatments - no. of blocks + 1)
    overall d.f. -- sum of the other three d.f.'s; also d.f. = (total no. of observations - 1)

SSB is divided by d.f. for blocks, yielding MSB (mean of squared deviations for blocks).
    MSB = SSB / d.f. for blocks
    MST & MSB are both "signals." MSE is "noise."

Two hypothesis tests (one for treatments, one for blocks), two calculated F's, two table F's, two p-values.
    Fc treatments = MST / MSE.
    Fc blocks = MSB / MSE.
    Reject each H0 if Fc > Ft.

Two-way ANOVA estimation -- valid only for differences between population means.
    Confidence intervals cannot be obtained for individual treatment means.
    The $t_t$ (table t) in the following equation is based on the number of degrees of freedom for error.
    Difference between two population means: $(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm t_t \, \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$, where $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\mathrm{MSE} \times \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

Assumptions -- same as t-tests
    Samples: random, independent
    Populations: normal, equal variances

Three-way analysis of variance
    "Latin Square" design
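
Worked sketch -- the one-way and randomized block computations described earlier (not the Latin square design) can be illustrated in a few lines of Python. This is only a minimal sketch, not a reference implementation: the function names (`one_way_anova`, `randomized_block_anova`, `diff_ci`), the dictionary keys, and the small data sets are made up for illustration. The table t must be supplied by the caller, and the block version assumes exactly one observation per block-treatment cell.

```python
from math import sqrt
from statistics import mean


def one_way_anova(groups):
    """One-way ANOVA quantities; `groups` holds one list of observations per treatment."""
    all_obs = [x for g in groups for x in g]
    grand = mean(all_obs)
    sst = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)  # between-group (signal)
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)    # within-group (noise)
    tss = sum((x - grand) ** 2 for x in all_obs)                # equals SST + SSE
    df_t = len(groups) - 1                # numerator d.f.
    df_e = len(all_obs) - len(groups)     # denominator d.f.
    mst, mse = sst / df_t, sse / df_e
    return {"SST": sst, "SSE": sse, "TSS": tss, "df_t": df_t, "df_e": df_e,
            "MST": mst, "MSE": mse, "Fc": mst / mse}


def randomized_block_anova(table):
    """Randomized block ANOVA; table[i][j] is the one observation for block i, treatment j."""
    b, k = len(table), len(table[0])
    all_obs = [x for row in table for x in row]
    grand = mean(all_obs)
    trt_means = [mean([row[j] for row in table]) for j in range(k)]
    blk_means = [mean(row) for row in table]
    sst = b * sum((m - grand) ** 2 for m in trt_means)
    ssb = k * sum((m - grand) ** 2 for m in blk_means)
    tss = sum((x - grand) ** 2 for x in all_obs)
    sse = tss - sst - ssb                 # from TSS = SST + SSB + SSE
    df_t, df_b = k - 1, b - 1
    df_e = b * k - k - b + 1              # total obs - treatments - blocks + 1
    mst, msb, mse = sst / df_t, ssb / df_b, sse / df_e
    return {"SST": sst, "SSB": ssb, "SSE": sse, "TSS": tss,
            "df_t": df_t, "df_b": df_b, "df_e": df_e,
            "MST": mst, "MSB": msb, "MSE": mse,
            "Fc_treatments": mst / mse, "Fc_blocks": msb / mse}


def diff_ci(mean1, mean2, mse, n1, n2, t_table):
    """Interval for (mu1 - mu2): (x1 - x2) +/- t_t * sqrt(MSE * (1/n1 + 1/n2))."""
    margin = t_table * sqrt(mse * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Hypothetical data: three treatments with three observations each.
one_way = one_way_anova([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(one_way["SST"], one_way["SSE"], one_way["TSS"], one_way["Fc"])

# Hypothetical data: rows are blocks, columns are treatments.
block = randomized_block_anova([[1, 2, 3], [2, 4, 6], [3, 6, 9]])
print(block["Fc_treatments"], block["Fc_blocks"])

# 95% interval for treatment 1 minus treatment 3 (sample means 2 and 6, 3 blocks each);
# 2.776 is the two-sided 95% table t for 4 error d.f.
print(diff_ci(2, 6, block["MSE"], 3, 3, t_table=2.776))
```

The sketch stops at Fc on purpose: the comparison with Ft (or with a p-value) still requires an F table. If SciPy happens to be available, `scipy.stats.f.sf(Fc, numerator_df, denominator_df)` is one way to obtain the p-value.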