Oneway/Randomized Block Designs
Q560: Experimental Methods in Cognitive Science
Lecture 8

Reconstructive Memory: Loftus and Palmer (1974)
"How fast were the cars going when they ____ each other?" (hit, bumped, smashed)
[Figure: Estimated speed (MPH), roughly 30-44, as a function of the verb used in the question: hit, bumped, smashed]

The Problem with t-Tests
We could compare three groups with multiple t-tests: M1 vs. M2, M1 vs. M3, M2 vs. M3. But this causes our chance of a Type I error (alpha) to compound with each test we do.
Testwise error: probability of a Type I error on any one statistical test.
Experimentwise error: probability of a Type I error over all statistical tests in an experiment.
ANOVA keeps our experimentwise error = alpha.

What is ANOVA?
In ANOVA, an independent or quasi-independent variable is called a factor.
Factor = independent (or quasi-independent) variable.
Levels = number of values used for the independent variable.
One factor: a "single-factor design". More than one factor: a "factorial design".
[Examples of a single-factor design and a two-factor design shown on slides]

What are we interested in?
Two interpretations of the observed differences among sample means:
1) Differences are due to chance.
2) Differences are real.

ANOVA Test Statistic
Remember the t statistic:
t = (actual difference between sample means) / (difference expected by chance)
The ANOVA test statistic (the F-ratio) is similar:
F = (actual variance between sample means) / (variance expected by chance)
Variance can be calculated for more than two sample means.

The Logic of ANOVA
Hypothetical data from an experiment examining learning performance under three temperature conditions:
Treatment 1: 50° (sample 1)
Treatment 2: 70° (sample 2)
Treatment 3: 90° (sample 3)
Looking at the data, there are two kinds of variability (variance):
- Between treatments
- Within treatments
Variance between treatments can have two interpretations:
- Variance is due to differences between treatments.
- Variance is due to chance alone.
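The compounding of Type I error across multiple t-tests can be checked with a quick calculation. This is a minimal sketch assuming the tests are independent (pairwise comparisons on the same groups are not strictly independent, so treat it as an approximation):

```python
# Experimentwise Type I error rate when running c independent tests,
# each at testwise alpha: 1 - (1 - alpha)^c.
def experimentwise_error(alpha: float, c: int) -> float:
    return 1 - (1 - alpha) ** c

# Three pairwise t-tests (M1 vs M2, M1 vs M3, M2 vs M3) at alpha = .05:
print(round(experimentwise_error(0.05, 3), 6))  # 0.142625 -- well above .05
```

With just three comparisons the experimentwise error nearly triples, which is why ANOVA's single omnibus test is preferred.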
Chance variance within treatments may be due to individual differences or experimental error.

The Logic of ANOVA
The F-ratio compares between-treatments and within-treatments variance as follows:
F = (variance between treatments) / (variance within treatments)
Another way of expressing it:
F = (treatment effect + chance) / chance
If there is no effect due to treatment: F ≈ 1.00.
If there is a significant effect due to treatment: F > 1.00.
The denominator of the F-ratio is also called the error term (it measures only unsystematic variance).

ANOVA Notation
What do all the letters mean?
k = number of levels of the factor (i.e., number of treatments)
n = number of scores in each treatment
N = total number of scores in the entire study
T = ΣX for each treatment condition
G = "grand total" of all the scores
We also need SS, M, and ΣX².

What are the calculations we need to do?
1) Analysis of sum of squares (SS)
2) Analysis of degrees of freedom (df)
3) Calculation of variances (MS) and of the F-ratio

1) Analysis of Sum of Squares
Total: SS_total = ΣX² − G²/N
Between: SS_between = Σ(T²/n) − G²/N
Within: SS_within = Σ SS_inside each treatment
Note: SS_total = SS_within + SS_between.
Just remember: SS_within = SS_total − SS_between.

2) Analysis of Degrees of Freedom
Total: df_total = N − 1
Between: df_between = k − 1
Within: df_within = Σ df_in each treatment = N − k
Note: df_total = df_within + df_between.

3) Calculation of Variances (MS) and of the F-Ratio
Note: in ANOVA, variance = mean square (MS).
MS_between = SS_between / df_between
MS_within = SS_within / df_within
F = MS_between / MS_within

Summary of ANOVA data: [summary table shown on slide]

F Distribution
In our example, the value for the F-ratio is high (11.28). Is this value really significant? We need to compare this value to the overall F distribution.
Note:
1) F-ratios must be positive.
2) If H0 is true, F is around 1.00.
3) The exact shape of the F distribution depends on the values for df.
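The SS/df/MS bookkeeping above can be sketched in a few lines of Python. The scores below are made-up toy data (not from the lecture), chosen to give clean numbers:

```python
# One-way independent-measures ANOVA from raw scores, following the
# slide notation: T = treatment totals, G = grand total of all scores.
def one_way_anova(groups):
    all_scores = [x for g in groups for x in g]
    N = len(all_scores)          # total number of scores
    G = sum(all_scores)          # grand total
    k = len(groups)              # number of treatments
    ss_total = sum(x * x for x in all_scores) - G * G / N
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - G * G / N
    ss_within = ss_total - ss_between
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (N - k)
    return ss_total, ss_between, ss_within, ms_between / ms_within

# Hypothetical scores for three treatment conditions:
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_t, ss_b, ss_w, F = one_way_anova(groups)
print(ss_t, ss_b, ss_w, F)  # 32.0 26.0 6.0 13.0
```

Note how SS_total = SS_between + SS_within (32 = 26 + 6), matching the partitioning rule above.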
F Distribution
[Figure: shape of the F distribution for df = 2, 12]
Let's take a look at an F distribution table: critical values are indexed by the degrees of freedom for the numerator and the degrees of freedom for the denominator. [Table shown on slide]

Hypothesis Testing with ANOVA
Step 1: Hypotheses
- H0: all means equal; H1: at least one is different
Step 2: Determine the critical value
- F-ratios are all positive (only one tail)
- Need: df_B and df_W
Step 3: Calculations
- SS_B and SS_W
- MS_B and MS_W
- F
Step 4: Decision and conclusions
- And maybe a source table

Hypothesis Testing with ANOVA
Data for three drugs designed to act as pain relievers, plus a placebo: Placebo, Drug A, Drug B, Drug C (k = 4 conditions, n = 5 scores each).
Step 1: State hypotheses
H0: μ1 = μ2 = μ3 = μ4.
H1: At least one is different.
Step 2: Determine the critical region
Set α = .05.
Determine df:
df_total = N − 1 = 20 − 1 = 19
df_between = k − 1 = 4 − 1 = 3
df_within = N − k = 20 − 4 = 16
For the data given in the example: df = 3, 16.
Step 3: Calculate the F-ratio for the data
1) Obtain SS_between and SS_within.
2) Use the SS and df values to calculate the two variances, MS_between and MS_within.
3) Finally, use the two MS values to compute the F-ratio.
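Steps 2 and 4 can be expressed as small helpers. This is a sketch: the critical value is passed in by the caller, since in lecture it comes from a table lookup by hand (the ≈3.24 below is the tabled F_crit(3, 16) at α = .05):

```python
# Degrees of freedom for a one-way ANOVA with N total scores
# in k treatment conditions.
def anova_df(N: int, k: int):
    return k - 1, N - k          # (df_between, df_within)

# Decision rule: one-tailed, reject when F_obt exceeds F_crit.
def decide(F_obt: float, F_crit: float) -> str:
    return "reject H0" if F_obt > F_crit else "fail to reject H0"

# The pain-reliever example: N = 20 scores, k = 4 conditions.
print(anova_df(20, 4))           # (3, 16)
print(decide(8.33, 3.24))        # reject H0
```

In practice the critical value could instead come from software (e.g., a percent-point function for the F distribution), but a table lookup keeps the sketch dependency-free.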
1) Sum of Squares
Total: SS_total = ΣX² − G²/N = 262 − 60²/20 = 262 − 180 = 82
Between: SS_between = Σ(T²/n) − G²/N = (5²/5 + 10²/5 + 20²/5 + 25²/5) − 60²/20 = 230 − 180 = 50
Within: SS_within = SS_total − SS_between = 82 − 50 = 32

2) Mean Squares
Between: MS_between = SS_between / df_between = 50/3 = 16.67
Within: MS_within = SS_within / df_within = 32/16 = 2.00

3) F-Ratio
F = MS_between / MS_within = 16.67/2.00 = 8.33

Step 4: Decision and Conclusion
F_obt exceeds F_crit, so we reject H0.
We must reject the null hypothesis that all of the drugs are the same, F(3, 16) = 8.33, p < .05.

Summary Table:
Source    SS   df   MS      F
Between   50    3   16.67   8.33*
Within    32   16    2.00
Total     82   19

Source Table for Independent-Measures ANOVA
Source    SS                        df      MS
Between   SS_B = Σ(T²/n) − G²/N     k − 1   MS_between = SS_between / df_between
Within    SS_W = SS_T − SS_B        N − k   MS_within = SS_within / df_within
Total     SS_T = ΣX² − G²/N         N − 1
F = MS_between / MS_within

Let's visualize the concepts of between-treatment and within-treatment variability. What are the corresponding F-ratios?
F = MS_between / MS_within
Experiment A: F = 56/0.667 = 83.96
Experiment B: F = 56/40.33 = 1.39

Randomized Block Designs

The Logic of ANOVA (recap)
F = (treatment effect + chance) / chance
If there is no effect due to treatment: F ≈ 1.00.
If there is a significant effect due to treatment: F > 1.00.
The denominator of the F-ratio is also called the error term (it measures only unsystematic variance).
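The independent-measures worked example above (F(3, 16) = 8.33) can be double-checked directly from its summary quantities (ΣX² = 262, G = 60, the four treatment totals, n = 5 per group):

```python
# Reproduce the pain-reliever worked example from summary statistics.
sum_x2, G, N, n = 262, 60, 20, 5
T = [5, 10, 20, 25]              # treatment totals (Placebo, A, B, C)
k = len(T)

ss_total = sum_x2 - G**2 / N                      # 262 - 180 = 82
ss_between = sum(t**2 / n for t in T) - G**2 / N  # 230 - 180 = 50
ss_within = ss_total - ss_between                 # 32
ms_between = ss_between / (k - 1)                 # 50/3 = 16.67
ms_within = ss_within / (N - k)                   # 32/16 = 2.00
F = ms_between / ms_within
print(round(F, 2))  # 8.33
```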
Two Types of ANOVA
Independent-measures design: groups are samples of independent measurements (different people).
Dependent-measures design: groups are samples of dependent measurements (usually the same people at different times; also matched samples); also called "repeated measures".
With t-tests, we used different formulae depending on the design; this is also true of ANOVA.

The Logic of ANOVA: Independent Measures
Differences between groups could be due to:
- Treatment effect
- Individual differences
- Error or chance (tired, hungry, etc.)
Differences within groups could be due to:
- Individual differences
- Error or chance
F = (treatment effect + indiv diffs + chance) / (indiv diffs + chance)
A repeated-measures design removes variability due to individual differences, and gives us a more powerful test.

Repeated Measures
In a repeated-measures design, the same people are tested in each treatment, so differences between treatment groups cannot be due to individual differences.
F = (treatment effect + indiv diffs + chance) / (indiv diffs + chance)
So, we need to estimate the differences between individuals to remove them from the denominator. Then we will have a purer measure of the actual treatment effect (if it exists).

Partitioning of Variance/df
Total variance splits into:
Between-treatments variance:
1) Treatment effect
2) Error or chance (excluding individual differences)
Within-treatments variance:
1) Individual differences
2) Error or chance
Within-treatments variance splits further into:
Between-subjects variance:
1) Individual differences
Error variance:
1) Error or chance (excluding individual differences)

Example: Number of errors on a typing task while coffee is consumed
Person   Baseline   Time 1   Time 2   Time 3   Person totals
A        3          4        6        7        P = 20
B        0          3        3        6        P = 12
C        2          1        4        5        P = 12
D        0          1        3        4        P = 8
E        0          1        4        3        P = 8
         T = 5      T = 10   T = 20   T = 25
         SS = 8     SS = 8   SS = 6   SS = 10
n = 5, k = 4, N = 20, G = 60, ΣX² = 262
We also compute person totals (P) to get an estimate of individual differences.

Sum of Squares: Stage 1
The first step is identical to independent-measures ANOVA:
Total: SS_total = ΣX² − G²/N; df_total = N − 1
Between: SS_between = Σ(T²/n) − G²/N; df_between = k − 1
Within: SS_within = SS_total − SS_between; df_within = N − k

Sum of Squares: Stage 2
In the second stage, we simply remove the individual differences from the denominator of the F-ratio:
Between subjects: SS_b/s = Σ(P²/k) − G²/N; df_b/s = n − 1
Error: SS_error = SS_within − SS_b/s; df_error = df_within − df_b/s

Mean Squares and F-Ratio
Now we just substitute MS_error into the denominator of F:
MS_between = SS_between / df_between
MS_error = SS_error / df_error
F = MS_between / MS_error

Source Table for Repeated-Measures ANOVA
Source         SS                              df
Between        SS_B = Σ(T²/n) − G²/N           k − 1
Within         SS_W = SS_T − SS_B              N − k
  b/w subjects SS_b/s = Σ(P²/k) − G²/N         n − 1
  Error        SS_error = SS_within − SS_b/s   (N − k) − (n − 1)
Total          SS_T = ΣX² − G²/N               N − 1
MS_between = SS_between / df_between; MS_error = SS_error / df_error
F = MS_between / MS_error

Example: Number of errors on a typing task while coffee is consumed (same data as above; n = 5, k = 4, N = 20, G = 60, ΣX² = 262).
Step 1: State hypotheses
H0: μ1 = μ2 = μ3 = μ4.
H1: At least one is different.
Step 2: Determine the critical region
Set α = .05.
Determine df:
df_total = N − 1 = 20 − 1 = 19
df_between = k − 1 = 4 − 1 = 3
df_within = N − k = 20 − 4 = 16
df_b/s = n − 1 = 5 − 1 = 4
df_error = df_within − df_b/s = 16 − 4 = 12
F_crit(3, 12) = 3.49
Step 3: Calculate the F-ratio for the data
1) Obtain SS_between and SS_error.
2) Use the SS and df values to calculate the two variances, MS_between and MS_error.
3) Finally, use the two MS values to compute the F-ratio.
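Stage 2 repeats the same total-squared-over-group-size pattern, only with person totals P instead of treatment totals T. A sketch using the coffee-task numbers (P = 20, 12, 12, 8, 8; SS_within = 32 from Stage 1):

```python
# Stage 2 of repeated-measures ANOVA: remove individual differences
# from the within-treatments variability.
P = [20, 12, 12, 8, 8]    # person totals
k, n, N, G = 4, 5, 20, 60
ss_within = 32.0          # from Stage 1 (SS_total 82 - SS_between 50)

ss_bs = sum(p**2 / k for p in P) - G**2 / N   # 204 - 180 = 24
ss_error = ss_within - ss_bs                  # 32 - 24 = 8
df_bs = n - 1                                 # 4
df_error = (N - k) - df_bs                    # 16 - 4 = 12
print(ss_bs, ss_error, df_error)  # 24.0 8.0 12
```

Most of the within-treatments variability (24 of 32) turns out to be individual differences, which is exactly what the repeated-measures design removes from the error term.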
1) Sum of Squares, Stage 1
Total: SS_total = ΣX² − G²/N = 262 − 60²/20 = 82
Between: SS_between = Σ(T²/n) − G²/N = (5²/5 + 10²/5 + 20²/5 + 25²/5) − 60²/20 = 230 − 180 = 50
Within: SS_within = SS_total − SS_between = 82 − 50 = 32

1) Sum of Squares, Stage 2
Between subjects: SS_b/s = Σ(P²/k) − G²/N = (20²/4 + 12²/4 + 12²/4 + 8²/4 + 8²/4) − 60²/20 = 204 − 180 = 24
Error: SS_error = SS_within − SS_b/s = 32 − 24 = 8

2) Mean Squares
Between: MS_between = SS_between / df_between = 50/3 = 16.67
Error: MS_error = SS_error / df_error = 8/12 = 0.67

3) F-Ratio
F = MS_between / MS_error = 16.67/0.67 = 24.88

Note: These are the same data we used in the independent-measures ANOVA on Thursday. By changing from an independent-measures to a repeated-measures design, we've gone from F = 8.33 to F = 24.88.

Step 4: Decision and Conclusion
F_obt exceeds F_crit, so we reject H0.
We must reject the null hypothesis that coffee has no effect on errors, F(3, 12) = 24.88, p < .05.

Summary Table:
Source     SS   df   MS      F
Between    50    3   16.67   24.88*
Within     32   16
  b/w Ss   24    4
  Error     8   12    0.67
Total      82   19

Advantages of Repeated Measures
Remember: variance ("noise") in the samples increases the estimated standard error and makes it harder for a treatment-related effect to be detected. (Remember how we added up two sources of variance in the independent-measures design.)
A repeated-measures design reduces or limits the variance by eliminating the individual differences between samples.

Problems with Repeated Measures
Carryover effect (specifically associated with repeated-measures designs): the subject's score in the second measurement is altered by a lingering aftereffect from the first measurement.
Examples: testing of two drugs in succession, motivation effects, etc.
Important: the aftereffect comes from the first treatment itself.
Progressive error: the subject's score changes over time due to a consistent (systematic) effect.
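The whole repeated-measures analysis can be run end-to-end from the raw typing-task scores (rows = persons A-E, columns = Baseline, Time 1-3). Note that the exact F here is 25.0; the 24.88 in the worked example comes from first rounding the MS values to 16.67 and 0.67:

```python
# Repeated-measures ANOVA on the coffee/typing-errors data.
data = [
    [3, 4, 6, 7],   # person A
    [0, 3, 3, 6],   # person B
    [2, 1, 4, 5],   # person C
    [0, 1, 3, 4],   # person D
    [0, 1, 4, 3],   # person E
]
n, k = len(data), len(data[0])
N = n * k
G = sum(sum(row) for row in data)                    # grand total = 60
sum_x2 = sum(x * x for row in data for x in row)     # = 262
T = [sum(row[j] for row in data) for j in range(k)]  # treatment totals
P = [sum(row) for row in data]                       # person totals

# Stage 1 (identical to independent-measures ANOVA):
ss_total = sum_x2 - G**2 / N
ss_between = sum(t**2 / n for t in T) - G**2 / N
ss_within = ss_total - ss_between
# Stage 2 (remove individual differences):
ss_bs = sum(p**2 / k for p in P) - G**2 / N
ss_error = ss_within - ss_bs

F = (ss_between / (k - 1)) / (ss_error / ((N - k) - (n - 1)))
print(ss_total, ss_between, ss_bs, ss_error)  # 82.0 50.0 24.0 8.0
print(round(F, 2))                            # 25.0
```

Compare F ≈ 25 here with F = 8.33 when the same scores are treated as independent groups: the treatment effect is unchanged, but the error term shrinks from 2.00 to 0.67 once individual differences are removed.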
Examples: fatigue, practice.
Important: this is the effect of time alone.
History: changes outside the individual that may be confounded with the treatment.
Maturation: changes within the individual that may be confounded with the treatment.