Introduction to Analysis of Variance
CJ 526 Statistical Analysis in Criminal Justice

Introduction
1. Analysis of Variance (ANOVA) is an inferential statistical technique
2. Developed by Sir Ronald Fisher, an agricultural geneticist, in the 1920s

Relationship Between ANOVA and Independent t-Test
1. The independent t-test is a special case of ANOVA (the case of exactly two groups)
2. Like the t test, ANOVA is a parametric inferential procedure, but it can compare more than two groups

Purpose of ANOVA
1. Determine whether differences between the means of the groups are due to chance (sampling error)
2. Can be used with both experimental and ex post facto designs

Experimental Research Designs
Researcher manipulates levels of an Independent Variable to determine its effect on a Dependent Variable

Example of an Experimental Research Design Using ANOVA
Dr. Sophie studies the effect of different dosages of a new drug on impulsivity among children at risk of becoming delinquent
1. Independent Variable: dosage of the new drug
   1. 0 mg (placebo)
   2. 100 mg
   3. 200 mg
2. Measure impulsivity in each group and compare the groups

Ex Post Facto Research Designs
Researcher investigates effects of preexisting levels of an Independent Variable on a Dependent Variable

Example of an Ex Post Facto Research Design Using ANOVA
Dr. Horace wants to determine whether political party affiliation has an effect on attitudes toward the death penalty, using a scale assessing attitudes
1. Independent Variable: political party affiliation
   1. Democrat
   2. Independent
   3. Republican
2. Measure attitudes toward the death penalty in each group
3. No manipulation

Null and Alternative Hypotheses in ANOVA
1. Null: no differences among the group means
2. Alternative: at least one group differs from at least one other group

Example of Pairwise Comparisons
1. Dr. Mildred wants to determine whether birth order has an effect on the number of self-reported delinquent acts
2. Independent Variable: birth order
   1. First Born (or only child)
   2. Middle Born (if three or more children)
   3. Last Born
3. Dependent Variable: number of self-reported delinquent acts
4. Possible pairwise comparisons
   1. FB ≠ MB
   2. FB ≠ LB
   3. MB ≠ LB
5. For this particular analysis, it is possible that:
   1. Any one of the pairwise comparisons is statistically significant
   2. Any two of the pairwise comparisons are statistically significant
   3. All three of the pairwise comparisons are statistically significant

Types of ANOVA
One-Way ANOVA
1. One Independent Variable
2. Groups are independent
Repeated-Measures ANOVA
1. Groups are dependent
2. The dependent variable is measured at more than two points in time

ANOVA and Multiple t Tests
1. Testwise alpha: the probability of a Type I error on a single test
2. If multiple t tests are run, each one carries its own risk of error. With alpha = .05, about one in twenty tests of a true null hypothesis will be significant by chance alone, so the risk of at least one false positive grows with the number of tests: 1 - (1 - .05)^c for c independent tests (about .14 when c = 3)
3. Better to conduct one ANOVA than multiple t tests

The Logic of ANOVA
Total variability of the DV can be analyzed by dividing it into its component parts

Components of Total Variability
1. Between-Groups Variability: measure of the overall differences between treatment conditions (groups, samples)
2. Within-Groups Variability: measure of the amount of variability inside each treatment condition (group, sample). There will always be variability within a group

Between-Group (BG) Variability
1. Treatment Effect (TE)

Within-Group (WG) Variability
1. Individual Differences (ID)
2. Example: for race, there is more within-group variability than between-group variability (more genetic variation among Whites, or among Asians, etc., than between the races)

The F-Ratio
F = BG / WG
F = (TE + ID + EE) / (ID + EE)
EE is experimental error. As the second form shows, BG variability contains ID and EE in addition to TE, while WG variability contains only ID and EE.
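The partition of total variability described above can be sketched in Python (standard library only). This is a minimal illustration under stated assumptions: the scores are made up and are not data from any study mentioned in these slides, and the function name partition_variability is hypothetical. It splits the total sum of squares into between-groups and within-groups parts and forms F as the ratio of the two mean squares.

```python
from statistics import mean


def partition_variability(groups):
    """Partition total variability into between- and within-groups parts."""
    scores = [x for group in groups for x in group]
    grand_mean = mean(scores)
    k = len(groups)   # number of groups (levels of the factor)
    n = len(scores)   # total number of subjects

    # Total variability: every score compared with the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    # Between-groups: each group mean compared with the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: each score compared with its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return {
        "ss_total": ss_total,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "F": ms_between / ms_within,
    }


# Hypothetical counts of delinquent acts (illustration only, not real data)
first_born = [1, 2, 3]
middle_born = [4, 5, 6]
last_born = [2, 3, 4]

result = partition_variability([first_born, middle_born, last_born])
print(result)
```

With these made-up scores, SS total (20) equals SS between (14) plus SS within (6), and F = 7 / 1 = 7, up to floating-point rounding.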
If H0 is true, TE = 0 and F is approximately 1
F = (0 + ID + EE) / (ID + EE)

If H0 is false, TE > 0 and F tends to be greater than 1
F = (TE + ID + EE) / (ID + EE)

Systematic Variability
1. Systematic variability: due to treatment
2. Unsystematic variability: uncontrolled or unexplained

ANOVA Vocabulary
1. Factor: an independent variable (IV)
2. Levels: the different values of a factor
3. k: the number of levels of a factor (also the number of samples)

Degrees of Freedom
1. Between groups: k - 1 (number of samples minus 1)
2. Within groups: n - k (total number of subjects minus number of samples)
3. Total: n - 1

F-Distribution
1. Always positive (F is a ratio of two variances, so it cannot be negative)
2. Critical values: see p. 727 for p < .05 and p. 728 for p < .01
3. n1 refers to within degrees of freedom, n2 to between degrees of freedom (between-groups df goes with the numerator of F, within-groups df with the denominator)

Example
A police psychologist wants to determine whether caffeine has an effect on learning and memory
Randomly assigns 120 police officers to one of five groups:
1. 0 mg (placebo)
2. 50 mg
3. 100 mg
4. 150 mg
5. 200 mg
Records how many “nonsense” words each police officer recalls after studying a 20-word list for 2 minutes (consonant-vowel-consonant items, for example, dif, zup)

ANOVA Summary Table
                 Sum of Squares    df    Mean Square    F
Between Groups    82.72             4    20.68          5.14
Within Groups    462.30           115     4.02
Total            545.02           119

Example of ANOVA
1. Number of Samples: 5
2. Nature of Samples: independent
3. Independent Variable: caffeine
4. Dependent Variable and its Level of Measurement: number of nonsense syllables remembered (interval/ratio)
5. Appropriate Inferential Statistical Technique: one-way analysis of variance
6. Null Hypothesis: no differences in memory (DV) between the groups administered differing amounts of caffeine (IV)
7. Decision Rule: if the p-value of the obtained test statistic is less than .05, reject the null hypothesis
8. Obtained Test Statistic: F
9. Decision: reject or fail to reject the null hypothesis

Results
The results of the one-way ANOVA with caffeine as the independent variable and number of nonsense words recalled as the dependent variable were statistically significant, F(4, 115) = 5.14, p < .01. The means and standard deviations for the five groups are contained in Table 1.

Discussion
It appears that ingesting small to moderate amounts of caffeine results in better retention of nonsense syllables, but that ingesting moderate to large amounts of caffeine interferes with the ability to retain nonsense syllables.

SPSS Procedure Oneway
1. Analyze, Compare Means, One-Way ANOVA
2. Move DV into Dependent List
3. Move IV into Factor
4. Options
   1. Descriptives
   2. Homogeneity of Variance

Sample Printout: ANOVA

Descriptives
Score on Drug Index
                                                      95% Confidence Interval for Mean
              N    Mean    Std. Deviation  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
Catholic      7     9.43   12.541          4.740       -2.17        21.03        0        30
Jewish        4     7.75    9.032          4.516       -6.62        22.12        0        20
Protestant    9    18.33   15.969          5.323        6.06        30.61        0        50
Total        20    13.10   13.924          3.114        6.58        19.62        0        50

Test of Homogeneity of Variances
Score on Drug Index
Levene Statistic    df1    df2    Sig.
.831                2      17     .452

ANOVA
Score on Drug Index
                 Sum of Squares    df    Mean Square    F        Sig.
Between Groups    455.336           2    227.668        1.199    .326
Within Groups    3228.464          17    189.910
Total            3683.800          19

Sample Printout: Post Hoc Tests

Multiple Comparisons
Dependent Variable: Score on Drug Index
Bonferroni
(I) Religious    (J) Religious    Mean                               95% Confidence Interval
Affiliation      Affiliation      Difference (I-J)  Std. Error  Sig.    Lower Bound  Upper Bound
Catholic         Jewish             1.68            8.638       1.000   -21.25       24.61
                 Protestant        -8.90            6.945        .651   -27.34        9.53
Jewish           Catholic          -1.68            8.638       1.000   -24.61       21.25
                 Protestant       -10.58            8.281        .655   -32.57       11.40
Protestant       Catholic           8.90            6.945        .651    -9.53       27.34
                 Jewish            10.58            8.281        .655   -11.40       32.57

SPSS Procedure OneWay Output
Descriptives
1. Levels of IV
2. N
3. Mean
4. Standard Deviation
5. Standard Error of the Mean
6. 95% Confidence Interval (Lower Bound, Upper Bound)
Test of Homogeneity of Variance
ANOVA Summary Table
1. Sum of Squares
2. df
3. Mean Square
4. F
5. Sig
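
The arithmetic behind an ANOVA summary table can be checked with a short Python sketch. This is an illustration, not SPSS output: the function name summary_table is hypothetical, and the only inputs are the sums of squares, group counts, sample sizes, and group means printed in the tables above. It rebuilds the degrees of freedom, mean squares, and F for the caffeine example and for the Score on Drug Index printout, then recomputes the pairwise mean differences shown in the post hoc table.

```python
from itertools import combinations


def summary_table(ss_between, ss_within, k, n):
    """Rebuild df, mean squares, and F from sums of squares.

    k is the number of groups, n the total number of subjects.
    """
    df_between = k - 1
    df_within = n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_total": ss_between + ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n - 1,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Caffeine example: 5 groups, 120 officers
caffeine = summary_table(82.72, 462.30, k=5, n=120)

# Score on Drug Index printout: 3 groups, 20 respondents
drug_index = summary_table(455.336, 3228.464, k=3, n=20)

# Pairwise mean differences (I - J) from the Descriptives means
group_means = {"Catholic": 9.43, "Jewish": 7.75, "Protestant": 18.33}
mean_differences = {
    (i, j): round(group_means[i] - group_means[j], 2)
    for i, j in combinations(group_means, 2)
}

print(round(caffeine["F"], 2), round(drug_index["F"], 3))
print(mean_differences)
```

The rounded F values come out as 5.14 and 1.199, matching the two tables. The sketch stops at F: obtaining the p-value (Sig.) requires the F distribution, which the Python standard library does not provide.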