Introduction to ANOVA (ANalysis Of VAriance)

Why ANOVA?
• Effects of CHO loading
  • how much?
  • 1 gm/kg? 2 gm/kg? 5 gm/kg?
• Effects of bracing on GRF
  • which brace?
  • taping? Swed O? ActiveAnkle?
• Effeks uf alcohol on spelin
  • what blood/alcohol level?
  • 0.04? 0.08? 0.10?

ANalysis Of VAriance
• 1-way ANOVA
• Grouping variable = factor = independent variable (IV)
• The variable will consist of a number of levels
• If 1-way ANOVA is being used, there will be >2 levels of one IV.
• E.g. What type of program has the greatest impact on aggression?
  • Violent movies, soap operas, or “infomercials”?
  • Type of program is the independent variable or factor
  • Violent movies is one level of the factor
  • Soap operas is one level of the factor
  • Infomercials is one level of the factor
  • Aggression is the dependent variable (DV)

Example of Oneway ANOVA (single factor)
• No reason to assume correlation between the cases in the “k” groups
  • (k = number of groups)
• Question: does CHO affect time to fatigue?
• IV: diet (3 levels of IV or factor)
• DV: Endurance time on bike

How to compare more than 2 means?
• α refers to the risk of making a Type 1 error
• with each comparison, we have “α” chances of making a Type 1 error
• α = 0.05
• 5 times in 100 we will reject a true null hypothesis when running each comparison

Type 1 error rate is exponentially cumulative
• Family Wise error rate: $\alpha_{FW} = 1 - (1 - \alpha)^{c}$, where c is the number of comparisons to be made
• ie if α = 0.05 and three means to compare (c = 3 pairwise comparisons): $\alpha_{FW} = 1 - (1 - 0.05)^{3} = 0.143$

Estimating Family Wise error rate
• $\alpha_{FW} \approx c \cdot \alpha$, where c is the number of comparisons to be made
• Note: this always overestimates the error rate
• ie if α = 0.05: k = 3; k = 4?????

ANOVA is an attempt to maintain the FW error rate at a known (acceptable) level

Example of ANOVA: Return to our original question
• Question: does the amount of CHO ingested affect time to fatigue?
• IV: diet (3 levels of IV)
• DV: Endurance time on bike

1-way ANOVA (One IV)
• IV = Grouping variable = factor
• The IV consists of a number of levels

Steps to Oneway ANOVA
• set α (0.05)
• set sample size
  • Thirty randomly selected subjects
  • Three randomly assigned groups
  • n = 10 in each group
  • Grp 1: Regular Diet
  • Grp 2: CHO supp diet (0.5 g/kg)
  • Grp 3: CHO supp diet (1.0 g/kg)
• set HO

Set statistical hypotheses: I
• HO (Null hypothesis)
  • Any observed difference between the 3 groups will be attributable to random sampling errors
• H1 (HA) (Alternative hypothesis)
  • If HO is rejected, the difference is not attributable to random sampling errors (perhaps diet?)

Set statistical hypotheses: II
• HO (Null hypothesis)
  • The population means of the 3 groups are equal
• H1 (Alternative hypothesis)
  • The population means of the three groups differ in some way
• Note: no directional hypothesis; the Null may be false in many different ways

Steps
• Set α (0.05)
• set sample size (n = 10/grp)
• set HO
• test all subjects with a standardized protocol (bike)
  • Subject data file: ANOVA1.sav
• get descriptive statistics of each group
  • histograms
  • mean, SD, n
• compare the group means

How to compare the groups?
• With k = 3, α = 0.05, $\alpha_{FW}$ = ???

Concept of ANOVA
• Evaluate the effect of treatment (the IV) by analyzing the amount of variation among the subgroup sample means (DV)
• But how much variation is expected if the subgroup population means are equal?

Some Nomenclature
• Grand Mean ($\bar{X}_{G}$): mean of ALL scores, regardless of group
  • ie all 30 scores
• Group Mean ($\bar{X}_{j}$): mean of all scores from subjects treated the same
  • groups of 10

3 Sources of Variability (Deviation Scores!!!!)
• $X - \bar{X}_{G}$ : Total Variability (individual scores around Grand Mean)
• $X - \bar{X}_{j}$ : Within Group Variability (individual scores around Group Mean)
• $\bar{X}_{j} - \bar{X}_{G}$ : Between Group Variability (Group Means around Grand Mean)

Total Variability: $X - \bar{X}_{G}$
• The deviations will sum to 0, so square the deviation for each subject, then sum. Gives us the Total Sum of Squares

Within Group Variability: $X - \bar{X}_{j}$
• Reflects INHERENT variability (all treated the same)

Within-group
• Variation between people that is not due to the grouping factor
• Example:
  • You might assign people to three different tanning beds to see which has the greatest tanning effect
  • But folk within each type of bed would still vary greatly in the degree of tanning they achieved
• Within group variance is the pooled variance from all levels of the grouping factor (similar to pooled SD in t-test)
• The deviations will sum to 0, so square the deviation for each subject, then sum. Gives us the Within Group Sum of Squares

Between Group Variability: $\bar{X}_{j} - \bar{X}_{G}$

Between-group variation
• Is the variation normally expected between people (within-group variation), plus variation due to the grouping factor
• Reflects inherent variability and TREATMENT EFFECT
• The deviations will sum to 0, so square the deviation for each group, then sum. Gives us the Between Group Sum of Squares

Recall
• Size of the Sum of Squares is affected by
  • size of each deviation score
  • number of cases that are summed
• Calculate the MEAN SQUARE of a sum of squares by dividing through by the degrees of freedom contributing to the sum.

Within Group Variability: degrees of freedom
• df for EACH group = n - 1

Statistics Humour
Two unbiased estimators were sitting in a bar. The first says “So how do you like married life?” The other replies, “It's pretty good if you don't mind giving up that one degree of freedom!”
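The three sums of squares described above can be sketched in a few lines of Python. This is a minimal illustration only: the scores are made-up values (not the ANOVA1.sav data), the variable names are hypothetical, and the between-group term counts each group's squared deviation once per subject in that group (the usual convention), so that the total splits exactly into within plus between.

```python
# Hypothetical endurance times (minutes) for k = 3 groups.
# These numbers are invented for illustration; they are NOT the ANOVA1.sav data.
groups = {
    "regular": [36.0, 38.0, 40.0, 42.0],
    "cho_0.5": [42.0, 44.0, 45.0, 47.0],
    "cho_1.0": [43.0, 44.0, 46.0, 47.0],
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Grand mean: mean of ALL scores, regardless of group.
all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# Total SS: individual scores around the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Within SS: individual scores around their own group mean.
ss_within = sum(
    (x - mean(scores)) ** 2 for scores in groups.values() for x in scores
)

# Between SS: group means around the grand mean,
# counted once for every subject in the group.
ss_between = sum(
    len(scores) * (mean(scores) - grand_mean) ** 2 for scores in groups.values()
)

print(f"SS total   = {ss_total:.3f}")
print(f"SS within  = {ss_within:.3f}")
print(f"SS between = {ss_between:.3f}")
```

With any data set, SS total should equal SS within plus SS between (up to floating-point rounding), which is a quick way to check a hand calculation.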
Within Group Variability: degrees of freedom
• df for EACH group = n - 1
• df for TOTAL groups = k(n - 1)
• For our Diet study: df Within = 3(10 - 1) = 27

Between Group Variability: degrees of freedom
• df = k - 1
• For our Diet study: df Between = 3 - 1 = 2

A new ratio between variabilities for us to consider
• (Inherent Variability + Treatment Effect) / Inherent Variability
• = Between / Within
  • Between: between group variability
  • Within: within group variability
• = MSBetween / MSWithin
• By using Mean Squares, we account for the different number of cases contributing to each estimate of error (random SE).
• Note: if Treatment effect = 0 (ie no effect) the ratio will be equal to ????

A new ratio between variabilities for us to consider
• $F = MS_{Between} / MS_{Within}$
• Note: if Treatment effect = 0 (ie no effect) the ratio is expected to be about 1.00, apart from sampling error

Evaluating Fobserved with the F distribution
• A distribution of F ratios is not normally distributed
  • follows an F distribution
  • positively skewed
  • depends on the number of degrees of freedom in the numerator (MS between) and the denominator (MS within)

The F distribution (hypothetical)
[Figure: positively skewed F distribution, horizontal axis from 0 to 8, with the region of rejection in the upper tail beyond F.05]
• Fcritical: the F value that must be equaled or exceeded to classify a difference among group means as statistically significant (identify a main effect)
• Fcritical depends on the df of MSbetween and MSwithin, and on the chosen α
• F.05 = ???
• For our Diet study, with α = 0.05 and df = 2 and 27, Fcritical = ???
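The critical value asked for above is normally read from an F table or obtained from statistical software. As a rough sketch, when the numerator df is exactly 2 (as in the diet study) the upper-tail area of the F distribution has a simple closed form, so the critical value can be computed with nothing but arithmetic. The function names below are hypothetical, and the shortcut is an assumption-laden special case: it does not apply to designs with more or fewer than three groups.

```python
def f_upper_tail_df1_2(f_value, df_within):
    """Upper-tail area P(F > f_value) for an F distribution with
    numerator df = 2 and denominator df = df_within.
    Valid ONLY for numerator df = 2."""
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)


def f_critical_df1_2(alpha, df_within):
    """F value that cuts off an upper-tail area of alpha,
    again ONLY for numerator df = 2 (the inverse of the function above)."""
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)


# Diet study design taken from the slides: k = 3 groups, n = 10 per group.
k, n, alpha = 3, 10, 0.05
df_between = k - 1          # 2
df_within = k * (n - 1)     # 27

f_crit = f_critical_df1_2(alpha, df_within)
print(f"df = {df_between}, {df_within}; F critical = {f_crit:.2f}")

# Sanity check: the area beyond the critical value should equal alpha.
print(f"area beyond F critical = {f_upper_tail_df1_2(f_crit, df_within):.3f}")
```

For other numerator df, a table or a statistics package is the safer route; the closed form here is only a convenience for the three-group case.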
Concept of evaluating Fobs against Fcrit
[Figure: F distribution for df 2, 27; upper-tail area = 0.05 (5%) beyond Fcrit = 3.35]
• Fobs < Fcrit, Decision: ?????
• Fobs ≥ Fcrit, Decision: ?????

Running Oneway ANOVA (single factor ANOVA) Using SPSS
• Demonstrate with anova1.sav

1-way ANOVA in SPSS
• Procedure: choose the appropriate procedure, and…
• Dialog box: slide the variables into the appropriate places

ANOVA in SPSS
[SPSS output: ANOVA summary table with Between Groups, Within Groups and Total rows]

Decision
• Since Fobs = 11.13 > Fcrit of 3.35, our decision is to reject Ho, stating that the difference among the means is not more than would be expected by chance, and accept HA, stating that the means differ in some way.
• Omnibus F: identify a significant main effect

How to determine which means differ?
• Is Normal different from 0.5 g/kg? From 1.0 g/kg?
• Is 0.5 g/kg different from 1.0 g/kg?
[Figure: Time to Fatigue (mins) by Diet Group (Normal, 0.5 g CHO, 1.0 g CHO)]
ANOVA in SPSS
[SPSS output: ANOVA summary table with Between Groups, Within Groups and Total rows]

Significant result…now what?
[Cartoon dialogue, speech bubbles recovered out of order: “There is a significant difference among the means” / “Among all means, or just two?” / “Don’t know” / “There are more than 2 means” / “Rats” / “Better do a follow-up, mate” / “Ok then”]

Why not use 3 unpaired t-tests?
• Normal vs 0.5 g/kg
• Normal vs 1.0 g/kg
• 0.5 g/kg vs 1.0 g/kg
• Because we will be operating with an inflated Family Wise α.

Post Hoc tests
• After the Fact comparisons of means used to identify which specific pairs of means are significantly different
• Designed to maintain a specified Family Wise α level regardless of how many pairs of means are compared

Post Hoc tests
• Follow-up tests
• ONLY compute after a significant ANOVA
• Like a collection of little t-tests
  • But they control overall type 1 error comparatively well
  • They do not have as much power as the omnibus test (the ANOVA), so you might get a significant ANOVA & no sig. follow-up
• Purpose is to identify the locus of the effect (what means are different, exactly?)

Significant result…now what?
• Follow-up tests: most common…
  • Tukey’s HSD (honestly sig. diff.)
  • Formula: $HSD = q\sqrt{MS_{within} / n_{group}}$
  • But it’s easier to use SPSS…

Follow-ups to ANOVA in SPSS
• Choose “post-hoc” test (meaning ‘after this’)
• Check the appropriate box for the HSD (Tukey, not Tukey’s b)

Run Tukey’s HSD test: Oneway in SPSS
• Use our diet data (ANOVA1.sav)
[SPSS output: Tukey HSD homogeneous subsets table, annotated “Groups that do not differ” and “And one that does”]

Assumptions to test in One-Way
1. Samples should be independent (as with independent t-test; does not mean perfectly uncorrelated)
2. Each of the k populations should be normal (important only when samples are small…if there’s a problem, can use Kruskal-Wallis test)
3. The k samples should have equal variances (this is the homogeneity of variance assumption, and we’ll look at it shortly…violations are important mostly with small samples and unequal n’s)

Homogeneity of variance - SPSS
1. Click on the ‘options’ button
2. Choose homogeneity of variance (I’ve also chosen descriptives here)
3. Click continue
4. SPSS output: the test has to be significant for there to be a violation

Reporting ANOVA results
Table 1. Descriptive statistics of mean time to exhaustion (minutes) by diet group (n = 10). A solid line joins pairs of means that are not significantly different (Tukey’s HSD, α = 0.05)

        Regular Diet   0.5g/kg CHO   1.0g/kg CHO
Mean    38.9           44.2          44.7
SD      3.5            2.9           2.7

[Figure: bar chart of Time to Fatigue (mins), axis 0 to 50, by Diet Group (Normal, 0.5g CHO, 1.0g CHO), with an asterisk marker]
Figure 1. Descriptive statistics of time to exhaustion with different diets. An asterisk indicates group means that are not significantly different (α = 0.05)

Reporting ANOVA results
Table 2. ANOVA summary table for the effects of diet on time to exhaustion.

Source   df   SS      MS      F      p
Diet     2    206.6   103.3   11.1   0.0003
Error    27   250.6   9.3

Optional: include in appendix if not in body of thesis

Reporting ANOVA results
Descriptive statistics for the mean time to exhaustion for the three diet groups are presented in Table 1 and graphically in Figure 1. A oneway ANOVA at α = 0.05 revealed a significant difference among the diet groups for mean time to exhaustion (F(2, 27) = 11.13, p = 0.0003).
Tukey’s HSD was used to identify the source of the significant omnibus F, and indicated that the mean time to exhaustion for the regular diet group (38.9 ± 3.5 minutes) was significantly shorter than the time for the groups receiving 0.5 grams CHO per kilogram body weight (g/kg) or 1.0 g/kg. These two groups, with means of 44.2 (± 2.9) and 44.7 (± 2.7) minutes respectively, were not significantly different.

Reporting ANOVA results
These results suggest that CHO supplements of at least 0.5 g/kg will increase time to exhaustion on the bicycle by about 5.5 minutes or 14%. The data also suggest a plateau effect of CHO supplementation, with no additional increase in time to exhaustion seen with 1.0 g/kg compared to 0.5 g/kg.

In discussion, address whether the observed increase is physiologically meaningful, and elaborate on the concept of a plateau effect with CHO supplements.

Calculating Tukey’s HSD test
• $HSD = q\sqrt{MS_{within} / n}$
• Honestly Significant Difference: the magnitude of mean difference that must exist to claim levels are Significantly Different
• q: the studentized range statistic (table E, p. 470); depends on the number of levels to be compared, df within, and α
• For our diet study: k = 3 (# of levels), df within = 27, α = 0.05
• From Table F, q = ???
Tukey’s HSD test
• $HSD = q\sqrt{MS_{within} / n}$
• For our diet study: k = 3 (# of levels), df within = 27, α = 0.05
• From Table 8, q = 3.51
• MS within: Mean Square Within, taken from the ANOVA Summary Table
  • For our diet study, MSwithin = 9.2815
• n: Number of Subjects in EACH group
  • For our diet study, n = 10
• $HSD = 3.51\sqrt{9.2815 / 10} = 3.382$

Apply Tukey’s HSD test value of 3.4 to the diet data:
• Normal vs 0.5 g/kg
  • 38.9 vs 44.2 minutes
  • difference = -5.3 minutes *
• Normal vs 1.0 g/kg
  • 38.9 vs 44.7 minutes
  • difference = -5.8 minutes *
• 0.5 g/kg vs 1.0 g/kg
  • 44.2 vs 44.7 minutes
  • difference = -0.5
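The hand calculation above can be checked with a short Python sketch. It uses only the values quoted in the slides (q = 3.51, MSwithin = 9.2815, n = 10, and the three group means); the variable names are hypothetical, and the rule applied is simply that a pair of means is flagged when the absolute difference equals or exceeds the HSD.

```python
import math
from itertools import combinations

# Values quoted in the slides for the diet study.
q = 3.51              # studentized range statistic for k = 3, df within = 27
ms_within = 9.2815    # Mean Square Within from the ANOVA summary table
n_per_group = 10      # number of subjects in EACH group

# HSD = q * sqrt(MS within / n)
hsd = q * math.sqrt(ms_within / n_per_group)
print(f"HSD = {hsd:.3f}")

# Group means (minutes) as reported in Table 1.
group_means = {"Normal": 38.9, "0.5 g/kg": 44.2, "1.0 g/kg": 44.7}

# Compare every pair of means against the HSD.
flagged = {}
for (name_a, mean_a), (name_b, mean_b) in combinations(group_means.items(), 2):
    difference = mean_a - mean_b
    exceeds = abs(difference) >= hsd
    flagged[(name_a, name_b)] = exceeds
    marker = "*" if exceeds else ""
    print(f"{name_a} vs {name_b}: difference = {difference:.1f} minutes {marker}")
```

Only the two comparisons involving the Normal diet are flagged, in line with the asterisks on the slide.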