BS704 Class 8
Analysis of Variance

HW Set #7
  Chapter 7 Problems 5, 14, 19 and 28
  R Problem Set 7 (on Blackboard)
  Due November 2
  Please complete Quiz 9 before Nov 2

An RCT to Assess the Efficacy of a New Drug for Asthma in Children
  Background characteristics
    Age
    Sex
    Years since diagnosis of asthma
  Outcomes
    Self-reported improvement in symptoms
    FEV1

Did the randomization work?

  Characteristic    Placebo      New Drug     p
  Age, years        10 (2.4)     9.9 (2.1)    .76
  % Male            54%          43%          .04
  Yrs since Dx      3.4 (1.9)    3.1 (2.1)    .34

  1. Yes
  2. No

What are the hypotheses to compare ages? (same table as above)
  1. $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$
  2. $H_0: p_1 = p_2$ vs $H_1: p_1 \neq p_2$
  3. $H_0: \mu = 10$ vs $H_1: \mu \neq 10$
  4. $H_0: \mu_d = 0$ vs $H_1: \mu_d \neq 0$

What test would be used to compare % improvement between groups?
  1. Test for equality of means
  2. Test for equality of proportions
  3. Test for mean difference
  4. No clue

What test would be used to compare FEV1 between groups?
  1. Test for equality of means
  2. Test for equality of proportions
  3. Test for mean difference
  4. No clue

Objectives
  Understand the procedure for testing the equality of k > 2 means
  Perform the test by hand and using R
  Appropriately interpret results

Hypothesis Testing Procedures
  1. Set up null and research hypotheses, select $\alpha$
  2. Select test statistic
  3. Set up decision rule
  4. Compute test statistic
  5. Draw conclusion & summarize significance (p-value)

Hypothesis Testing for More than 2 Means - Analysis of Variance
  Continuous outcome
  k independent samples, k > 2
  $H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
  $H_1$: Means are not all equal

Test Statistic
  $$F = \frac{\sum n_j (\bar{X}_j - \bar{X})^2 / (k-1)}{\sum\sum (X - \bar{X}_j)^2 / (N-k)}$$
  (Find critical value in Table 4)

Test Statistic - F Statistic
  Comparison of two estimates of variability in the data
  Between-treatment variation is based on the assumption that H0 is true (i.e., population means are equal)
  Within-treatment (residual or error) variation is independent of H0 (i.e., we do not assume that the population means are equal and we treat each sample separately)

F Statistic
  $$F = \frac{\sum n_j (\bar{X}_j - \bar{X})^2 / (k-1)}{\sum\sum (X - \bar{X}_j)^2 / (N-k)}$$
  Numerator: difference BETWEEN each group mean and the overall mean
  Denominator: difference between each observation and its group mean (WITHIN group variation - ERROR)

F Statistic
  F = MSB/MSE
  MS = Mean Square
  What values of F indicate that H0 is likely true?

Decision Rule
  Reject H0 if F > critical value of F with df1 = k-1 and df2 = N-k from Table 4
  k = number of comparison groups
  N = total sample size

ANOVA Table

  Source of Variation   Sums of Squares                              df     Mean Squares        F
  Between Treatments    $SSB = \sum n_j (\bar{X}_j - \bar{X})^2$     k-1    MSB = SSB/(k-1)     F = MSB/MSE
  Error                 $SSE = \sum\sum (X - \bar{X}_j)^2$           N-k    MSE = SSE/(N-k)
  Total                 $SST = \sum\sum (X - \bar{X})^2$             N-1

Example
  Is there a significant difference in mean weight loss among 4 different diet programs? (Data are pounds lost over 8 weeks)

  Low-Cal   Low-Fat   Low-Carb   Control
  8         2         3          2
  9         4         5          2
  6         3         4          -1
  7         5         2          0
  3         1         3          3

Example
  Summary Statistics on Weight Loss by Treatment

          Low-Cal   Low-Fat   Low-Carb   Control
  n       5         5         5          5
  Mean    6.6       3.0       3.4        1.2

  Overall Mean = 3.6

Is there a statistically significant difference in weight loss programs?
  1. Yes
  2. No
  3. ??

Example
  1. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
     $H_1$: Means are not all equal
     $\alpha = 0.05$
  2. Test statistic
     $$F = \frac{\sum n_j (\bar{X}_j - \bar{X})^2 / (k-1)}{\sum\sum (X - \bar{X}_j)^2 / (N-k)}$$

Example
  3. Decision rule
     df1 = k-1 = 4-1 = 3
     df2 = N-k = 20-4 = 16
     Reject H0 if F > 3.24

Example
  $$SSB = \sum n_j (\bar{X}_j - \bar{X})^2 = 5(6.6-3.6)^2 + 5(3.0-3.6)^2 + 5(3.4-3.6)^2 + 5(1.2-3.6)^2 = 75.8$$

Example
  $SSE = \sum\sum (X - \bar{X}_j)^2$

  Low-Cal   $(X - 6.6)$   $(X - 6.6)^2$
  8         1.4           2.0
  9         2.4           5.8
  6         -0.6          0.4
  7         0.4           0.2
  3         -3.6          13.0
  Total     0             21.4

  Low-Fat   $(X - 3.0)$   $(X - 3.0)^2$
  2         -1.0          1.0
  4         1.0           1.0
  3         0             0
  5         2.0           4.0
  1         -2.0          4.0
  Total     0             10.0

  Low-Carb  $(X - 3.4)$   $(X - 3.4)^2$
  3         -0.4          0.2
  5         1.6           2.6
  4         0.6           0.4
  2         -1.4          2.0
  3         -0.4          0.2
  Total     0             5.4

  Control   $(X - 1.2)$   $(X - 1.2)^2$
  2         0.8           0.6
  2         0.8           0.6
  -1        -2.2          4.8
  0         -1.2          1.4
  3         1.8           3.2
  Total     0             10.6

  $$SSE = \sum\sum (X - \bar{X}_j)^2 = 21.4 + 10.0 + 5.4 + 10.6 = 47.4$$

Example

  Source of Variation   Sums of Squares   df   Mean Squares   F
  Between Treatments    75.8              3    25.3           8.43
  Error                 47.4              16   3.0
  Total                 123.2             19

  Note: the overall mean and the squared deviations above are rounded to one decimal place at each step. Unrounded arithmetic (overall mean 3.55) gives SSB = 75.75, SSE = 47.2 and F = 8.56; the conclusion is the same.

Example
  4. Compute test statistic
     F = 8.43
  5. Conclusion. Reject H0 because 8.43 > 3.24. We have statistically significant evidence at $\alpha = 0.05$ to show that there is a difference in mean weight loss among the 4 diet programs.

ANOVA Using R
  .csv data file

Example
  An investigator wishes to compare the average time to relief of headache pain under three distinct medications, A, B and C. Fifteen patients who suffer from chronic headaches are randomly selected for the investigation. The outcome is time to pain relief, in minutes.

One Way ANOVA
  RCT to Compare 3 Medications for Chronic Pain
  N = 15, randomized to A, B or C
  Outcome: Time to Pain Relief, minutes

One Way ANOVA (cont’d)

  Drug     Data                 Mean
  Drug A   30  35  40  25  35   33.0
  Drug B   25  20  30  20  30   25.0
  Drug C   15  20  25  20  20   20.0

One Way ANOVA (cont’d)
  1. Hypotheses
     $H_0: \mu_1 = \mu_2 = \mu_3$
     $H_1$: means not all equal
     $\alpha = 0.05$
  2. Test Statistic
     F

One Way ANOVA (cont’d)
  3. Decision Rule
     df1 = k-1 = 3-1 = 2, df2 = N-k = 15-3 = 12
     Reject H0 if F > 3.89
  4. Compute Sums of Squares

One Way ANOVA (cont’d)
  $$\bar{X} = \frac{33.0 + 25.0 + 20.0}{3} = 26.0$$
  $$SSB = \sum n_j (\bar{X}_j - \bar{X})^2 = 5\left((33-26.0)^2 + (25-26.0)^2 + (20-26.0)^2\right) = 430$$
  $$SSE = \sum\sum (X - \bar{X}_j)^2$$

One Way ANOVA (cont’d)

  Drug A
  X       $(X - 33)$   $(X - 33)^2$
  30      -3           9
  35      2            4
  40      7            49
  25      -8           64
  35      2            4
  Total   0            130

One Way ANOVA (cont’d)

  Drug B
  X       $(X - 25)$   $(X - 25)^2$
  25      0            0
  20      -5           25
  30      5            25
  20      -5           25
  30      5            25
  Total   0            100

One Way ANOVA (cont’d)

  Drug C
  X       $(X - 20)$   $(X - 20)^2$
  15      -5           25
  20      0            0
  25      5            25
  20      0            0
  20      0            0
  Total   0            50

One Way ANOVA (cont’d)
  $$SSE = \sum\sum (X - \bar{X}_j)^2 = 130 + 100 + 50 = 280$$

  Source    SS      df   MS      F
  Between   430.0   2    215.0   9.21
  Error     280.0   12   23.3
  Total     710.0   14

One Way ANOVA (cont’d)
  Reject H0 since 9.21 > 3.89. The means are not all equal.

Paper – Testosterone Replacement
  Study design?
    RCT
  Number of comparison groups?
    - placebo, no exercise
    - testosterone, no exercise
    - placebo and exercise
    - testosterone and exercise
  Primary outcomes?
    Change in muscle strength, body weight, muscle volume, lean body mass (continuous)

Paper – Testosterone Replacement
  Objective is to compare mean change in muscle strength, body weight, muscle volume, lean body mass (one at a time) across four treatment groups
  Figure 1 – generalizability?

Paper – Testosterone Replacement
  Table 1 – what tests were used?
  Table 2 – what tests were used?
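The hand calculations in the two worked examples above can be checked in R. The block below is a minimal sketch, not the course's own script: the course .csv file is not shown, so the data are typed in directly, and the helper name `anova_by_hand` and the data frame and column names are assumptions made for illustration. It applies the slide formulas for SSB, SSE, MSB, MSE and F, looks up the critical value with `qf()`, and then fits the same headache-relief model with base R's `aov()`.

```r
# One-way ANOVA "by hand", following the slide formulas.
# anova_by_hand is a hypothetical helper name, not part of base R.
anova_by_hand <- function(y, g, alpha = 0.05) {
  g <- factor(g)
  k <- nlevels(g)                          # number of comparison groups
  N <- length(y)                           # total sample size
  n_j <- tapply(y, g, length)              # group sizes
  xbar_j <- tapply(y, g, mean)             # group means
  ssb <- sum(n_j * (xbar_j - mean(y))^2)   # between-treatment sum of squares
  sse <- sum((y - ave(y, g))^2)            # error (within-group) sum of squares
  msb <- ssb / (k - 1)
  mse <- sse / (N - k)
  c(SSB = ssb, SSE = sse, MSB = msb, MSE = mse, F = msb / mse,
    F_crit = qf(1 - alpha, df1 = k - 1, df2 = N - k))
}

# Headache-relief example: time to pain relief, in minutes.
relief <- data.frame(
  drug = factor(rep(c("A", "B", "C"), each = 5)),
  time = c(30, 35, 40, 25, 35,    # Drug A
           25, 20, 30, 20, 30,    # Drug B
           15, 20, 25, 20, 20)    # Drug C
)

# Diet example: pounds lost over 8 weeks.
diet <- data.frame(
  program = rep(c("Low-Cal", "Low-Fat", "Low-Carb", "Control"), each = 5),
  lost = c(8, 9, 6, 7, 3,
           2, 4, 3, 5, 1,
           3, 5, 4, 2, 3,
           2, 2, -1, 0, 3)
)

relief_hand <- anova_by_hand(relief$time, relief$drug)
diet_hand <- anova_by_hand(diet$lost, diet$program)
print(round(relief_hand, 2))
print(round(diet_hand, 2))

# Same headache-relief analysis with the built-in one-way ANOVA.
fit_relief <- aov(time ~ drug, data = relief)
print(summary(fit_relief))
```

For the headache-relief data the output should agree with the ANOVA table worked by hand (SSB = 430, SSE = 280, F of about 9.21). For the diet data R works with unrounded values, so its F statistic is slightly larger than the hand-calculated 8.43, which used intermediate values rounded to one decimal place; the decision to reject H0 is unchanged.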
Practice Problem – Complete the ANOVA Table
  $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
  $H_1$: means not all equal
  $\alpha = 0.05$

  Source    SS    df   MS    F
  Between
  Within          50   2.5
  Total     225

Practice Problem – Complete the ANOVA Table (completed)
  $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
  $H_1$: means not all equal
  $\alpha = 0.05$

  Source    SS    df   MS    F
  Between   100   4    25    10
  Within    125   50   2.5
  Total     225

  Reject H0 if $F > F_{0.05}(4, 50) = 2.56$
  Here F = 25/2.5 = 10 > 2.56, so H0 is rejected.

ANOVA
  When the sample sizes are equal, the design is said to be balanced
  Balanced designs give greatest power and are more robust to violations of the normality assumption

Extensions
  Multiple Comparison Procedures – Used to test for specific differences in means after rejecting equality of all means
  Higher-Order ANOVA - Tests for differences in means as a function of several factors

Extensions
  Repeated Measures ANOVA - Tests for differences in means when there are multiple measurements in the same participants (e.g., measures taken serially in time)

Multiple Comparisons Procedures (MCPs)
  If we reject H0 in an ANOVA, we conclude that the k means are not all equal. Which means are different?
  Pairwise comparisons
    $H_0: \mu_i = \mu_j$
  General contrasts
    $H_0: (\mu_i + \mu_j)/2 = \mu_k$

MCPs (continued)
  With k treatments there are k(k-1)/2 possible pairwise comparisons
  The overall Type I error rate can be as large as $\alpha \cdot k(k-1)/2$!
  There are a number of different MCPs; they differ in how they treat the Type I error rate

MCPs (continued)
  Error rate per comparison (ER_PC) = P(Type I error) on any one test or comparison (usually ER_PC is 0.05).
  Error rate per experiment (ER_PE) = the number of Type I errors we expect to make in any experiment under H0 (in 100 tests, we expect to make 5 Type I errors = number of tests × $\alpha$).

MCPs (continued)
  Familywise error rate (FW_ER) = P(at least 1 Type I error) in the experiment.
  $$FW\_ER = 1 - (1 - \alpha_i)^c$$
  where $\alpha_i$ is the ER_PC and c = number of contrasts in the experiment.

MCPs (continued)
  Example. Suppose we test the equality of 5 treatment means using ANOVA and the null hypothesis is rejected at $\alpha = 0.05$.
  Suppose that it is of interest to perform all pairwise comparisons. There are k(k-1)/2 = 5(5-1)/2 = 10 distinct pairwise comparisons.

MCPs (continued)
  Suppose we wish to conduct each comparison at a 5% level of significance.
  NOTE: Only tests that are of substantive interest should be run, not all possible tests.
  ER_PC = 0.05.
  ER_PE = 10(0.05) = 0.5.
  $FW\_ER = 1 - (1 - 0.05)^{10} = 0.401$.

Scheffe MCP
  Conservative procedure that controls the familywise error rate regardless of the number of contrasts
  Handles both pairwise and general contrasts
  Other MCPs include the Tukey procedure, Duncan procedure (multiple range test), Fisher's Least Significant Difference, the Newman-Keuls test, and Dunnett's test (used to compare a control to several active treatments).

Scheffe MCP
  For pairwise tests
  $H_0: \mu_i = \mu_j$
  $H_1: \mu_i \neq \mu_j$
  $$F = \frac{(\bar{X}_i - \bar{X}_j)^2}{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
  Reject H0 if $F > (k-1)\,F_{\alpha}(k-1, N-k)$

One Way ANOVA-Scheffe
  In the example, we determined that the 3 drugs were significantly different with respect to mean time to pain relief

           Mean
  Drug A   33.0
  Drug B   25.0
  Drug C   20.0

  Which drugs are different?

Scheffe Test – Drug A Vs. B
  $H_0: \mu_A = \mu_B$
  $H_1: \mu_A \neq \mu_B$
  Reject H0 if $F > (k-1)\,F_{\alpha}(k-1, N-k)$
  $(k-1)\,F_{0.05}(2, 12) = 2(3.89) = 7.78$
  $$F = \frac{(\bar{X}_A - \bar{X}_B)^2}{MSE\left(\frac{1}{n_A} + \frac{1}{n_B}\right)} = \frac{(33.0 - 25.0)^2}{23.3\left(\frac{1}{5} + \frac{1}{5}\right)} = 6.87$$
  Do not reject H0 since 6.87 < 7.78. No significant difference in mean times to pain relief for Drugs A and B.

Scheffe Test – Drug A Vs. C
  $H_0: \mu_A = \mu_C$
  $H_1: \mu_A \neq \mu_C$
  Reject H0 if $F > (k-1)\,F_{\alpha}(k-1, N-k)$
  $(k-1)\,F_{0.05}(2, 12) = 2(3.89) = 7.78$
  $$F = \frac{(\bar{X}_A - \bar{X}_C)^2}{MSE\left(\frac{1}{n_A} + \frac{1}{n_C}\right)} = \frac{(33.0 - 20.0)^2}{23.3\left(\frac{1}{5} + \frac{1}{5}\right)} = 18.13$$
  Reject H0 since 18.13 > 7.78. Significant difference in mean times to pain relief for Drugs A and C.

Scheffe Test – Drug B Vs. C
  $H_0: \mu_B = \mu_C$
  $H_1: \mu_B \neq \mu_C$
  Reject H0 if $F > (k-1)\,F_{\alpha}(k-1, N-k)$
  $(k-1)\,F_{0.05}(2, 12) = 2(3.89) = 7.78$
  $$F = \frac{(\bar{X}_B - \bar{X}_C)^2}{MSE\left(\frac{1}{n_B} + \frac{1}{n_C}\right)} = \frac{(25.0 - 20.0)^2}{23.3\left(\frac{1}{5} + \frac{1}{5}\right)} = 2.68$$
  Do not reject H0 since 2.68 < 7.78. No significant difference in mean times to pain relief for Drugs B and C.
  Overall conclusion??

Tukey Test in R

ANOVA Pairwise Tests using Tukey MCP
  Only significant result is A vs C
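The Scheffe comparisons and the Tukey test for the three drugs can be sketched in R as follows. This is an illustrative sketch under stated assumptions rather than the script shown in class: the data are typed in directly (the course .csv file is not shown), and the object names `relief`, `scheffe_F` and `scheffe_crit` are made up for this example. The Scheffe statistics are computed from the pairwise formula on the slides; the Tukey procedure uses base R's `TukeyHSD()` on the fitted `aov` model.

```r
# Headache-relief data: time to pain relief (minutes) for Drugs A, B, C.
relief <- data.frame(
  drug = factor(rep(c("A", "B", "C"), each = 5)),
  time = c(30, 35, 40, 25, 35,    # Drug A
           25, 20, 30, 20, 30,    # Drug B
           15, 20, 25, 20, 20)    # Drug C
)

fit <- aov(time ~ drug, data = relief)

k <- nlevels(relief$drug)                          # number of groups
N <- nrow(relief)                                  # total sample size
mse <- sum(residuals(fit)^2) / df.residual(fit)    # MSE = SSE / (N - k)
means <- tapply(relief$time, relief$drug, mean)    # group means
n_j <- tapply(relief$time, relief$drug, length)    # group sizes

# Scheffe F statistic for every pair of drugs:
# F = (mean_i - mean_j)^2 / (MSE * (1/n_i + 1/n_j))
pairs <- combn(levels(relief$drug), 2)
scheffe_F <- vapply(seq_len(ncol(pairs)), function(i) {
  a <- pairs[1, i]
  b <- pairs[2, i]
  as.numeric((means[a] - means[b])^2 / (mse * (1 / n_j[a] + 1 / n_j[b])))
}, numeric(1))
names(scheffe_F) <- paste(pairs[1, ], "vs", pairs[2, ])

# Scheffe critical value: (k - 1) * F_alpha(k - 1, N - k) with alpha = 0.05.
scheffe_crit <- (k - 1) * qf(0.95, df1 = k - 1, df2 = N - k)

print(round(scheffe_F, 2))
print(round(scheffe_crit, 2))
print(scheffe_F > scheffe_crit)    # TRUE where H0 is rejected

# Tukey pairwise comparisons on the same fitted model.
tukey <- TukeyHSD(fit, "drug", conf.level = 0.95)
print(tukey)
```

Because R carries the unrounded MSE (280/12) and critical value, the Scheffe statistics differ from the hand calculations in the second decimal place (the slides use MSE = 23.3 and 3.89). The decisions are the same: only A vs C exceeds the critical value, and the Tukey adjusted p-values lead to the same single significant pair.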