8 ANOVA

Analysis of Variance (ANOVA)
• One-way ANOVA
• Two-way (factorial) ANOVA
• Repeated measures (within-subjects) ANOVA

One-way ANOVA
Also known as:
• One-factor ANOVA
• One-way analysis of variance
• Between-subjects ANOVA

Common Applications
• One-way ANOVA is used to study the effect of one independent, qualitative variable on a continuous dependent variable when the independent variable has more than two subgroups.
• One-way ANOVA compares the means of independent groups.
• It can be thought of as an extension of the two independent samples t-test, used to detect a difference in the means of 3 or more independent groups.

Required variables
The requirements for one-way ANOVA:
1. One dependent (also called outcome or response) continuous (scale/interval/ratio) variable, such as weight, blood pressure or cholesterol.
2. One independent (factor) categorical variable with more than 2 groups (at least 3 unrelated, independent groups), such as marital status (single, married, divorced) or economic status (low, middle, high).

Example
• Assume that we have recorded the biomass of a certain bacterial species in broth medium at three pH levels.
• The researcher wishes to know whether the mean biomass (measured by optical density, O.D.) of the bacterial species differs between the three pH levels.

Replicate   pH 5.5   pH 6.5   pH 7.5
1           12       20       40
2           15       19       35
3            9       23       42

Example
• Data: the data set ‘Diet.sav’ contains information on 78 people who undertook one of three diets. There is background information such as age, gender and height, as well as weight lost on the diet (a positive value means they lost weight).
• The aim of the study was to see which diet was best for losing weight, so the independent variable (group) is diet.
Cast the data into a table, labeling each group as Diet 1, Diet 2 and Diet 3 (weight lost per subject).

Subject   Diet 1 (K1)   Diet 2 (K2)   Diet 3 (K3)
1          3.8           0             7
2          6             0             5.6
3          0.7          -2.1           3.4
4          2.9           2             6.8
5          2.8           1.7           7.8
6          2             4.3           5.4
7          2             7             6.8
8          8.5           0.6           7.2
9          1.9           2.7           7
10         3.1           3.6           7.3
11         1.5           3             0.9
12         3             2             7.6
13         3.6           4.2           4.1
14         0.9           4.7           6.3
15        -0.6           3.3           5
16         1.1          -0.5           2.5
17         4.5           4.2           0.9
18         4.1           2.4           3.5
19         9             5.8           0.5
20         2.4           3.5           2.8
21         3.9           5.3           8.6
22         3.5           1.7           4.5
23         5.1           5.4           2.8
24         3.5           6.1           4.1
25                       7.9           5.3
26                      -1.4           9.2
27                       4.3           6.1
n          n1 = 24       n2 = 27       n3 = 27
Mean       3.3           3.03          5.15

• j indexes the observations within each group, j = 1, 2, 3, ..., n. In this example n = 24, 27 and 27 for the first, second and third diet respectively.
• i indexes the diet group, i = 1, 2, ..., k. In this example k = 3.

Hypothesis
The one-way analysis of variance is used to test the claim that three or more population means are equal.
H0: µ1 = µ2 = µ3 = ... = µk
Ha: At least two of the means µ1, µ2, µ3, ..., µk are different (that is, not all µ are equal),
where µ1, µ2 and µ3 are the population means of Diet 1, Diet 2 and Diet 3, and in general µk is the population mean of group k.
Hypothesis test:
• Null hypothesis: there is no difference in the average weight loss between the three diet groups.
• Alternative hypothesis: the average weight loss differs between at least two of the three diets.

The Use of Computers
• The calculations required by analysis of variance are long and complicated; for this reason the computer assumes an important role in analysis of variance.

Enter data (SPSS screenshot)

Assumptions
• The data are randomly sampled.
• Observations of the dependent (outcome) variable within each group were obtained independently. The independence assumption means that there is no association between observations in different groups or between observations in the same group.
• Observations of the dependent (outcome) variable within each group are drawn from normally distributed populations.
• Residuals should be normally distributed.
• Homogeneity of variance: the variances of the populations are equal, i.e. the variances of the various groups are homogeneous.

Notes on assumptions
• Of all the assumptions, independence is the most crucial. If this assumption is violated, the inferences based on ANOVA are invalid.
• The analysis of variance is not heavily dependent on the normality assumption; ANOVA is robust to minor departures from normality, especially when the sample sizes are equal.
• Similarly, the assumption of equal variances is important but not critical. If the number of observations in each group is the same, inferences about means are not seriously affected by unequal population variances.

• Before carrying out any analysis, summarize weight loss by treatment (diet) using a box plot and some summary statistics.
• Do the group means and standard deviations look similar or very different?

• Diet 3 seems better than the other diets, as its mean weight lost is greater.
• The standard deviations are similar, so weight lost within each group is about equally spread out. The assumption of equal variances can also be checked by examining the spread of the observations in the boxplots.

Checking the assumption of a normally distributed dependent variable in each group
• Assumption
 – The dependent variable in each group is normally distributed.
• How to check
 – Click on Analyze in the main menu > Descriptive Statistics > Explore.
 – Move the WeightLost variable to the Dependent List and the Diet variable to the Factor List. Click on the Plots button and fill out its dialog box to produce histograms / Q-Q plots / Shapiro-Wilk tests for each diet group.
• What to do if the assumption is not met
 – If the assumption is violated, a natural logarithm or other data transformation may be tried to correct the problem.
 – If none of the available transformations is successful (for example, when the data contain seriously outlying observations), the Kruskal-Wallis test can be applied to the data.

Check normality
As p > 0.05 in each group, the dependent variable can be treated as normally distributed in the three groups.

Start ANOVA
• Enter the dependent variable in Dependent List.
• Enter the independent variable in Factor.
• One highlighted selection creates a new variable containing the standardized residual for each subject, which is added to the dataset.
• Another highlighted selection is essential for checking the assumption of homogeneity (equality) of variance. Click OK.

Checking the assumption of normally distributed residuals
• Assumption
 – Residuals should be normally distributed.
• How to check
 – Use the Save menu within General Linear Model to request the standardized residuals for each subject to be added to the dataset (already done), and
 – then use Analyze > Descriptive Statistics > Explore to produce histograms / Q-Q plots / Shapiro-Wilk tests of the residuals.
• What to do if the assumption is not met
 – If the residuals are very skewed, the results of the ANOVA are less reliable. The Kruskal-Wallis test should be used instead of ANOVA.

Check normality
As p > 0.05, the residuals can be treated as normally distributed.

Checking the assumption of homogeneity (equality) of variance
• Assumption
 – Homogeneity (equality) of variance: the variances (SD squared) should be similar for all the groups.
• How to check
 – Levene’s test is carried out if the Homogeneity of variance test option is selected in the Options menu.
 – If p > 0.05, equal variances can be assumed.
• What to do if the assumption is not met
 – If p < 0.05, the results of the ANOVA are less reliable.
 – The Welch test is more appropriate and can be accessed via the Options menu using Analyze > Compare Means > One-way ANOVA.
 – The Games-Howell post hoc test should also be used instead of Tukey’s.

Output
• Levene’s test. H0: variances are equal; Ha: variances are different. Since p = 0.520 > 0.05, we cannot reject H0.
• ANOVA table. F = test statistic = MS_BET / MS_W = 35.547 / 5.736 = 6.197. Since the significance is less than 0.05, you may reject the null hypothesis.
• When writing up the results, it is common to report certain figures from the ANOVA table:
 F(df between, df within) = test statistic, p = ...
 Here: F(2, 75) = 6.197, p = 0.003.
 There was a significant difference in weight lost between the diets [F(2, 75) = 6.197, p = 0.003].

New Terms in Analysis of Variance
• Sum of squares and mean square are only new names for familiar concepts.
• The sum of squares (abbreviated SS)
 – is another name for variation
 – is the sum of the squared deviations between each value and the mean of the values
 – is the numerator of the variance.
• The mean square (abbreviated MS)
 – is just the variance
 – is obtained by dividing the sum of squares by its degrees of freedom (df).
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Sum of squares in the ANOVA table
• There are two sources of variation in the ANOVA table:
 – the variation between the groups, SS(BET)
 – the variation within the groups, SS(W)
• There is also a total sum of squares, SS(TOT), but it is not a separate source of variation.

Sum of squares (between)
 – Sometimes called the variation due to the factor.
 – Denoted SS(BET), for sum of squares (variation) between the groups.
 – The between-group variation measures how much the group means vary from the grand (overall) mean (= 3.84).

$$SS_{BET} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2 = n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \dots + n_k(\bar{x}_k - \bar{x})^2$$

$$SS_{BET} = 24(3.3 - 3.84)^2 + 27(3.03 - 3.84)^2 + 27(5.15 - 3.84)^2 = 71.0936$$

The value 71.0936 is obtained with the unrounded means; the rounded means shown here give about 71.05.

Sum of squares (within)
 – This is called the within-group variation.
 – Denoted SS(W), for sum of squares (variation) within the groups.
 – SS(W) measures how much the individuals vary from their group mean.
 – Each difference between an individual and its group mean is called a residual.
 – These residuals are squared and added together to give SS(W).
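The between- and within-group sums of squares described above can be reproduced from the raw weight-loss values. The following is a minimal sketch in plain Python, assuming the values are transcribed correctly from the diet table in this section; the function name `sums_of_squares` is illustrative and not part of SPSS or any library.

```python
# Weight lost per person, transcribed from the diet table in this section.
diet1 = [3.8, 6, 0.7, 2.9, 2.8, 2, 2, 8.5, 1.9, 3.1, 1.5, 3,
         3.6, 0.9, -0.6, 1.1, 4.5, 4.1, 9, 2.4, 3.9, 3.5, 5.1, 3.5]
diet2 = [0, 0, -2.1, 2, 1.7, 4.3, 7, 0.6, 2.7, 3.6, 3, 2, 4.2, 4.7,
         3.3, -0.5, 4.2, 2.4, 5.8, 3.5, 5.3, 1.7, 5.4, 6.1, 7.9, -1.4, 4.3]
diet3 = [7, 5.6, 3.4, 6.8, 7.8, 5.4, 6.8, 7.2, 7, 7.3, 0.9, 7.6, 4.1, 6.3,
         5, 2.5, 0.9, 3.5, 0.5, 2.8, 8.6, 4.5, 2.8, 4.1, 5.3, 9.2, 6.1]


def mean(values):
    return sum(values) / len(values)


def sums_of_squares(groups):
    """Return (SS_between, SS_within, SS_total) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        # Between: group size times squared distance of group mean from grand mean.
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # Within: squared residuals of each observation from its own group mean.
        ss_within += sum((x - group_mean) ** 2 for x in g)

    # Total: squared distance of every observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    return ss_between, ss_within, ss_total


ss_bet, ss_w, ss_tot = sums_of_squares([diet1, diet2, diet3])
print(f"SS(BET) = {ss_bet:.3f}")
print(f"SS(W)   = {ss_w:.3f}")
print(f"SS(TOT) = {ss_tot:.3f}")
```

Because the sketch uses unrounded group means, its results should agree with the SPSS figures (71.094 and 430.179) up to rounding, and SS(TOT) should equal SS(BET) + SS(W).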
$$SS_W = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = SS_1 + SS_2 + \dots + SS_k$$

SS_W = 115.42 + 165.5523 + 149.2075 = 430.1798

Sum of squares (total)
 – This is called the total variation.
 – Denoted SS(TOT), for the total sum of squares (variation).
 – SS_TOT = SS_BET + SS_W, or equivalently

$$SS_{TOT} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$$

Degrees of freedom (df)
In one-way ANOVA, there are three degrees of freedom:
• The degrees of freedom between groups, df_BET, is one less than the number of groups: df_BET = k - 1 = 2, where k is the number of groups (k = 3).
• The degrees of freedom within groups, df_W:
 df_W = N - k = 75, where N is the total number of observations (N = 78).
 Alternatively, the within-group df is the sum of the individual df of each group. The sample sizes are 24, 27 and 27, so df_W = 23 + 26 + 26 = 75.
• The total df is one less than the total sample size:
 – df_TOT = N - 1 = 78 - 1 = 77
 – or df_TOT = df_BET + df_W

Mean squares (variances)
 – The variances are also called mean squares, abbreviated MS.
 – They are found by dividing the variation (SS) by the degrees of freedom (df):

$$\text{Variance} = \frac{\text{Variation}}{df}, \qquad MS = \frac{SS}{df}$$

In the ANOVA table there are two mean squares:
• MS_BET, the variance between groups (read "mean square between"):

$$MS_{BET} = \frac{SS_{BET}}{df_{BET}} = \frac{SS_{BET}}{k - 1}$$

• MS_W, the variance within groups (read "mean square within"):

$$MS_W = \frac{SS_W}{df_W} = \frac{SS_W}{N - k}$$

Mean squares for our data:

$$MS_{BET} = \frac{SS_{BET}}{k - 1} = \frac{71.094}{3 - 1} = 35.547$$

$$MS_W = \frac{SS_W}{N - k} = \frac{430.179}{78 - 3} = 5.736$$

F-test
 – The F statistic is the ratio of two variances, MS_BET and MS_W.
 – In fact, ANOVA stands for "analysis of variance" because it uses the ratio of between-group variability (MS_BET) to within-group variability (MS_W) when deciding whether there is a statistically significant difference between the groups.

$$F = \frac{\text{Between-groups variability } (MS_{BET})}{\text{Within-groups variability } (MS_W)}$$

For our data,

$$F = \frac{MS_{BET}}{MS_W} = \frac{35.547}{5.736} = 6.197$$

Decision rule
• ANOVA procedures use a distribution called the F distribution.
• A given F distribution has two separate degrees of freedom, df1 and df2.
 – The first, df1, is the degrees of freedom for the numerator (here df between = 2), and
 – the second, df2, is the degrees of freedom for the denominator (here df within = 75).
• The critical F values used here are 3.18 for α = 0.05 and 5.06 for α = 0.01. These are the tabulated values for 2 and 50 degrees of freedom, the nearest smaller row in a table that has no entry for 75; the exact critical values for 2 (df between) and 75 (df within) are slightly smaller, about 3.12 and 4.90, so the tabulated values are conservative.

Decision
• The computed F value, 6.197, is greater than the tabulated (critical) value for α = 0.05 (3.18) and also for α = 0.01 (5.06).
• Decision: reject H0 at the 0.05 level and even at the 0.01 level of α, i.e. p < 0.01.
(Figure: F distribution with the non-rejection region to the left of F = 3.18.)

P-value
You may reach the same decision and reject the null hypothesis based on the significance (p-value = 0.003), which is less than 0.05 and even less than 0.01.
Conclusion: at least one of the means is significantly different from the others; that is, type of diet appears to be associated with (affects) weight loss.

Where does variability come from?
Three types:
1. INDIVIDUAL DIFFERENCES: variability between participants (gender, age, height, mood, favorite food, etc.). People bring different experiences to your study.
2. EXPERIMENTAL ERROR: inaccurate measurement or poor planning of the study, for example measuring weight with a broken scale.
3. ?

$$F = \frac{\text{Between-groups variability } (MS_{BET})}{\text{Within-groups variability } (MS_W)} = \frac{\text{Indiv. diff.} + \text{Exper. error} + \; ?}{\text{Indiv. diff.} + \text{Exper. error}}$$

3. TREATMENT / EXPOSURE EFFECT:
 – This is a between-group variance.
 – It cannot influence within-group variance, since all the subjects in a group are given the same treatment (diet). So the treatment effect is the only source of variance that influences between-groups variance without influencing within-groups variance.

$$F = \frac{\text{Indiv. diff.} + \text{Exper. error} + \text{Treatment effect}}{\text{Indiv. diff.} + \text{Exper. error}}$$

Summary of calculation equations (figure)

Which treatments differ from one another?
• ANOVA tests the null hypothesis "all group means are the same", so the resulting p-value only indicates whether or not there is a difference between one or more pairs of groups.
It does not tell us which groups differ from one another.
• Which pairs of means are different from one another and which are not?
• To find out which groups differ, further "post hoc" tests have to be carried out to confirm where those differences lie.

Post-hoc Comparisons of Treatments
• There are several post hoc tests, named after their developers.
• They are mostly t-tests with an adjustment for testing all possible pairs.
• Although the tests differ in theory, in practice they rarely give very different results.
• A post hoc test needs to be carried out only when the results of the ANOVA indicate a significant difference between the means of the groups.

• Repeat the ANOVA, making the following adjustments in the post hoc window.
• Move the independent variable (factor) from the Factor box to the Post Hoc Tests for box, then choose from the available tests.
 – Tukey’s and Scheffe’s tests are the most commonly used post hoc tests.
 – Hochberg’s GT2 is better where the sample sizes of the groups are very different.

• Report each of the three pairwise comparisons, e.g. there was a significant difference between diet 3 and diet 1 (p = 0.02).
• Use the mean difference between each pair, e.g. people on diet 3 lost on average 1.85 kg more than those on diet 1, or use the individual group means to conclude which diet is best.
 Mean of diet 1 - Mean of diet 3 = 3.3 - 5.1481 = -1.8481

Reporting ANOVA
• A one-way ANOVA was conducted to compare the effectiveness of three diets. Normality checks and Levene’s test were carried out and the assumptions were met.
• There was a significant difference in mean weight lost [F(2, 75) = 6.197, p = 0.003] between the diets.
• Post hoc comparisons using the Tukey test were carried out.
• There was a significant difference between diets 1 and 3 (p = 0.02), with people on diet 3 losing on average 1.85 kg more than those on diet 1.
• There was also a significant difference between diets 2 and 3 (p = 0.005), with people on diet 3 losing on average 2.12 kg more than those on diet 2.

Welch test
• If the assumption of homogeneity (equality) of variance is not met, the Welch test is more appropriate.
• The Games-Howell post hoc test should also be used instead of Tukey’s.

Two-Way (Factorial) ANOVA
• Two-way ANOVA is an extension of one-way ANOVA. It can be used to study the effect of two independent categorical variables (factors) on a continuous dependent variable, based on comparisons of means.
• Two-way ANOVA
 – calculates main effects for each independent variable
 – calculates interaction effects between the independent variables.

Required variables
The requirements for two-way (between-groups) ANOVA:
1. One dependent (also called outcome or response) continuous (scale/interval/ratio) variable, such as weight, blood pressure or cholesterol.
2. Two independent categorical variables (grouping factors).

Example
• Data: the data set ‘Diet.sav’ contains information on 78 people who undertook one of three diets. There is background information such as age, gender and height, as well as weight lost on the diet (a positive value means they lost weight).
• The aim of the study was to see which diet was best for losing weight, but it was also thought that the best diets for males and females may be different, so the independent variables are diet and gender.

Assumptions
Basic assumptions of two-way ANOVA:
1. The outcome (dependent) variable must be normally distributed. The measurements in each cell are assumed to be drawn from a population with a normal distribution.
2. Homogeneity of variance in groups. The measurements in each cell are assumed to come from distributions with approximately the same variance.
3. The groups must be independent. The measurements in each cell come from independent random samples.
4. The residuals are normally distributed.

Hypotheses
• There are three hypotheses in a two-way ANOVA:
 – two for the main effects (diet and gender), and
 – one for the interaction of the two, diet and gender.

Hypothesis test: effect of diet on weight loss
Is there a significant difference in the average weight loss due to type of diet? Does the type of diet affect weight loss?
• Type of diet (main effect 1)
 – Null hypothesis: the effect of diet type on weight loss is not significant.
 – Alternative hypothesis: the effect of diet type on weight loss is significant.
H0: There is no difference in average weight loss due to diet type, i.e. H0: µ(1) = µ(2) = µ(3); the type of diet has no effect on weight loss.
Ha: At least two of the averages µ(1), µ(2), µ(3) are different.

Hypothesis test: effect of gender on weight loss
Is there a significant difference in weight loss due to gender?
• Gender (main effect 2)
 – Null hypothesis: the effect of gender on weight loss is not significant.
 – Alternative hypothesis: the effect of gender on weight loss is significant.
H0: There is no difference in the average weight loss due to gender, i.e. H0: µ(male) = µ(female); gender has no effect on weight loss.
Ha: µ(male) ≠ µ(female).

The interaction (diet & gender): the interaction between type of diet and gender
Is there a significant difference in weight loss due to the interaction of the two variables, type of diet and gender?
 – Null hypothesis: the interaction between diet and gender on weight lost is not significant.
 – Alternative hypothesis: the interaction between diet and gender on weight lost is significant.

Steps in SPSS
• To carry out the ANOVA, select Analyze > General Linear Model > Univariate.
• Put the dependent variable (weight lost) in the Dependent Variable box.
• Put the independent variables (Diet and Gender) in the Fixed Factors box.
• In the Plots menu, move Diet to the Horizontal Axis box and Gender to the Separate Lines box, then click Add.
• Request Tukey’s from the Post hoc menu for both factors.
• Ask for standardised residuals via the Save menu to check the assumption of normality of the residuals.

Checking the assumptions for two-way ANOVA
• Assumption
 – Residuals should be normally distributed.
• How to check
 – Use the Save menu to request the standardized residuals for each subject to be added to the dataset, and then use Analyze > Descriptive Statistics > Explore to produce a histogram of the residuals.
• What to do if the assumption is not met
 – If the residuals are very skewed, the results of the ANOVA are less reliable. There is no equivalent nonparametric test in SPSS, but two options are
 • transforming the dependent variable, or
 • running a separate ANOVA by gender (i.e. for males and females separately).

Check normality of residuals
• (Using Analyze > Descriptive Statistics > Explore to produce the histogram.) The residuals are approximately normally distributed.

Checking the assumptions for two-way ANOVA
• Assumption
 – Homogeneity (equality) of variance (Levene’s test).
• How to check
 – Use the Options menu to select Homogeneity tests for equality of variances. If p > 0.05, equal variances can be assumed.
 – If p < 0.05, the results of the ANOVA are less reliable.
• What to do if the assumption is not met
 – There is no equivalent test, but comparing the p-values from the ANOVA with 0.01 instead of 0.05 is acceptable.

• Select Homogeneity tests from the Options menu to check the assumption of equal variances for each combination of diet and gender.

The output
• Checking the assumption of equality of variances: as p > 0.05, equal variances can be assumed.
Calculation Equations
Factor A: diet. Factor B: gender.
• Sum of squares due to diet effect (SS_Diet) = 49.679
• Sum of squares due to gender effect (SS_Gender) = 0.428
• Sum of squares due to interaction effect (SS_Diet*Gender) = 33.904
• Sum of squares due to error (SS_Error) = 376.329

• df_Diet = k - 1 = 3 - 1 = 2
• df_Gender = n - 1 = 2 - 1 = 1 (here n is the number of gender levels)
• df_Diet*Gender = (k - 1)(n - 1) = (3 - 1)(2 - 1) = 2
• df_Error (residuals) = total number of observations - (k)(n) = 76 - (3 x 2) = 70

• F statistic of diet = MS_Diet / MS_Error = 4.620
• F statistic of gender = MS_Gender / MS_Error = 0.08
• F statistic of Diet*Gender = MS_Diet*Gender / MS_Error = 3.153

• The results of the two-way ANOVA and post hoc tests are reported for the main effects and the interaction.
• There was a statistically significant interaction effect of diet and gender on weight loss [F(2, 70) = 3.153, p = 0.049].
• Since the interaction effect is significant (p = 0.049), interpreting the main effects can be misleading.

• The easiest way to interpret the interaction is to use the plot from the output known as a means or interaction plot, which shows the means for each combination of diet and gender.
• The plot clearly shows a difference between males and females in the way that diet affects weight lost, since the lines are not parallel.
• The differences between the mean weight lost on the diets are much bigger for females.

• The values of the means can be obtained by splitting the file, as follows.

Interactions
• In two-way ANOVA, an interaction is the combined effect of two independent variables on one dependent variable.
• In a two-way ANOVA with a single observation per cell, however, interaction cannot be measured.
• In tests where no interaction occurs, the lines in the means (interaction) plot are reasonably parallel.
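The F statistics in the calculation summary above follow directly from the reported sums of squares and degrees of freedom. A minimal Python sketch of that arithmetic, using only the figures quoted in this section (the dictionary layout and variable names are illustrative assumptions):

```python
# Sums of squares and degrees of freedom as reported in the two-way ANOVA summary.
effects = {
    "Diet": (49.679, 2),
    "Gender": (0.428, 1),
    "Diet*Gender": (33.904, 2),
}
ss_error, df_error = 376.329, 70

# Mean square error: the common denominator of all three F ratios.
ms_error = ss_error / df_error

f_stats = {}
for name, (ss, df) in effects.items():
    ms_effect = ss / df                      # mean square of the effect
    f_stats[name] = ms_effect / ms_error     # F = MS_effect / MS_error
    print(f"{name:12s} MS = {ms_effect:7.3f}   F({df}, {df_error}) = {f_stats[name]:.3f}")
```

Each effect is tested against the same error mean square, which is why a small sum of squares such as the one for gender yields an F ratio close to zero.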
• Some people just use the interaction plot to describe the combined effect of diet and gender, but others prefer to carry out one-way ANOVAs for each group of one factor.
• For this example, it makes sense to look at the differences between the diets by gender.

• To carry out separate ANOVAs by gender, use Data > Split File.
• Select ‘Compare groups’ and move Gender to the ‘Groups Based on’ box.
• After clicking ‘OK’, all analyses and charts will appear separately for males and females until the split is cancelled by going back to this box and selecting ‘Analyze all cases, do not create groups’.
• Run a one-way ANOVA for Diet: Analyze > General Linear Model > Univariate.
• The results appear separately for males and females.

Reporting results
• There was a difference between the mean weight lost on the 3 diets for females (F(2, 40) = 10.64, p < 0.001) but not for males (F(2, 30) = 0.148, p = 0.863).
• Only the post hoc tests for females should be interpreted.
• You should also report the mean weight lost on each diet for females.
• Tukey’s post hoc tests were carried out for females.
• Diet 3 was significantly different from diet 1 (p = 0.002) and diet 2 (p < 0.001), but there is no evidence to suggest that diets 1 and 2 differ (p = 0.841).
• For females, the mean weight lost on diet 3 was 5.88 kg, compared to only 3.05 kg and 2.61 kg on diets 1 and 2 respectively.
• Normality checks and Levene’s test were carried out and the assumptions were met.

Two-Way ANOVA: a single observation per cell
Example
• A physical therapist wished to compare three methods for teaching patients to use a certain prosthetic device. He felt that the rate of learning would be different for patients of different ages and wished to design an experiment in which the influence of age could be taken into account.
• Note that the five age groups and three teaching methods give rise to data with only one observation per ‘cell’. For example, the age group “20 to 29” using teaching method C needs 10 days to learn the use of the prosthetic device, while with the same teaching method the “50 and over” age group needs 14 days. With one observation per cell we cannot measure variation within a cell, so we cannot check for interaction between age group and teaching method, the two factors in this example.
• Running an experiment several times results in multiple observations per cell; in that case we should assume that there may be interaction between the factors and check for it.

SPSS
• To fit a model without an interaction, click the Model button to open the Univariate: Model dialog box.
• Select a model type (Main effects).

Output
• F = MS_method / MS_Error
• F = MS_Age / MS_Error
Statistical decision: since the p-value for method = 0.001, we reject the null hypothesis of no effect of teaching method on the time needed to learn to use the prosthetic device. Similarly, since the p-value for age = 0.001, we also reject the null hypothesis of no effect of patients’ ages on the time needed to learn to use the device.
In a two-way ANOVA with a single observation per cell, interaction cannot be measured.

• The 15 means can be displayed in a line (means) plot.
• For all teaching methods (A, B and C), the fastest rate of learning (i.e. the shortest time) is for those under 20, followed by 20–29, 30–39, 40–49 and then 50 and over.
• The rate of learning with teaching method A is higher than with B and C in all age groups.
• There is no interaction between teaching method and age group, as the lines are reasonably parallel.
• An interaction occurs when the lines are clearly not parallel, so that the means of one group do not follow the same pattern as those of the other group.
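The main-effects model for a one-observation-per-cell design can be sketched in plain Python. The table of learning times below is HYPOTHETICAL and only illustrates the arithmetic: the two method C values quoted in the text (10 and 14 days) are kept, and every other number is made up, so the resulting F ratios will not match the SPSS output. The function name `main_effects_anova` is also an assumption for illustration.

```python
# HYPOTHETICAL learning times in days (not the study's data).
# Rows: five age groups (youngest to oldest); columns: teaching methods A, B, C.
times = [
    [7, 8, 9],
    [8, 10, 10],
    [9, 10, 12],
    [10, 11, 12],
    [12, 12, 14],
]


def main_effects_anova(table):
    """Two-way ANOVA with one observation per cell (no interaction term)."""
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    # With one value per cell there is no within-cell variation, so the error
    # term is what remains after removing both main effects from each cell.
    ss_error = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    ss_total = sum((x - grand) ** 2 for row in table for x in row)

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "F_rows": (ss_rows / df_rows) / ms_error,   # age groups
        "F_cols": (ss_cols / df_cols) / ms_error,   # teaching methods
        "df": (df_rows, df_cols, df_error),
        "ss": (ss_rows, ss_cols, ss_error, ss_total),
    }


result = main_effects_anova(times)
print("df (age, method, error):", result["df"])
print(f"F age = {result['F_rows']:.2f}, F method = {result['F_cols']:.2f}")
```

The error degrees of freedom, (5 - 1)(3 - 1) = 8, are exactly those an interaction term would use, which is why interaction and error cannot be separated in this design.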
Repeated measures (within-subjects) ANOVA

Common Applications
• Used when several measurements of the same dependent variable are taken at different time points or under different conditions. Repeated measures ANOVA tests (1) changes in mean score over 3 or more time points or (2) differences in mean score under 3 or more conditions.
• It is the equivalent of a one-way ANOVA for repeated samples and is an extension of the paired-samples t-test.
• Repeated measures ANOVA is also known as ‘within-groups’ ANOVA.

Required variables
• One dependent variable: continuous (scale).
• One independent variable: categorical, e.g. time or condition.

Assumptions for repeated measures ANOVA
• Assumption
 – Normality of residuals at each time point.
• How to check
 – In the Save menu, ask for the standardised residuals. A set of residuals will be produced for each time point and added to the data set.
 – Use histograms / Shapiro-Wilk tests to check that they are approximately normally distributed.
• What to do if the assumption is not met
 – If the residuals are very skewed, ANOVA is not reliable, so use the non-parametric Friedman test instead.

Assumptions for repeated measures ANOVA
• Assumption
 – Sphericity: the variances of the differences between all combinations of the related conditions / time points are equal (similar to the assumption of equal variances in ANOVA).
• How to check
 – Mauchly’s test of sphericity is given automatically in the output. If p > 0.05, sphericity can be assumed.
• What to do if the assumption is not met
 – Use the p-value from the Greenhouse-Geisser correction row in the ‘Tests of Within-Subjects Effects’ ANOVA table.

Example
• Data: participants used Flora margarine for 8 weeks. Their cholesterol (in mmol/L) was measured before the special diet, after 4 weeks and after 8 weeks.
• Use the SPSS file ‘Cholesterol.sav’ to see whether the use of margarine has changed the mean cholesterol.
Make sure that your data set has one row per person and a separate column for each of the three time points or conditions.

Hypothesis
H0: µ(before) = µ(after 4 weeks) = µ(after 8 weeks)
Ha: Not all µ are equal.

Steps in SPSS
• To carry out a repeated measures ANOVA, use Analyze > General Linear Model > Repeated Measures.
• The first screen is where we define the levels of our repeated measures factor, which in our case is time.
• Give the factor whatever name you like (we have used “time”) and then state how many time points there are (here 3: before the experiment, after 4 weeks and after 8 weeks).
• Make sure you click the Add button and then the Define button.
• Move the three cholesterol variables across into the Within-Subjects Variables box.
• Post hoc tests for repeated measures are in the Options menu.
• Move time to the Display Means for box and choose Bonferroni from the Confidence interval adjustment menu.
• In the Save menu, ask for the standardised residuals to be added to the dataset (ZRE_1 – ZRE_3).
• These will be added to your dataset by SPSS when you run the analysis.
• They should then be checked for normality using histograms / Shapiro-Wilk tests in Analyze > Descriptive Statistics > Explore.

The output
• Histograms of the standardised residuals at the three time points showed an approximately normal distribution.
• Mauchly’s test is significant (p < 0.001), so the assumption of sphericity has not been met.
• If sphericity can be assumed, use the top row of the ‘Tests of Within-Subjects Effects’ table.
• If it cannot be assumed, use the Greenhouse-Geisser row, which makes an adjustment to the degrees of freedom of the repeated measures ANOVA. As p < 0.001, there is a difference in cholesterol between at least 2 time points.
• Report the results of this table using [F(df_time, df_Error(time)) = test statistic F, p = ...].
• Here a Greenhouse-Geisser correction was applied to the degrees of freedom, so use [F(1.235, 21.001) = 212.321, p < 0.001] when reporting the results.
• As the main ANOVA is significant, there is a difference between at least two time points.
• The Pairwise Comparisons table contains multiple paired t-tests with a Bonferroni correction.
• There was a significant difference between each pair of time points.
• Cholesterol reduced by 0.566 mmol/L between baseline and 4 weeks (p < 0.001) and then by an additional 0.063 mmol/L between 4 and 8 weeks (p = 0.004).

Reporting ANOVA
• Participants used Flora margarine for 8 weeks. Their cholesterol was measured before the special diet, after 4 weeks and after 8 weeks. Normality checks were carried out on the residuals, which were approximately normally distributed.
• A repeated measures ANOVA with a Greenhouse-Geisser correction showed that mean cholesterol differed significantly between time points [F(1.235, 21.001) = 212.321, p < 0.001].
• Post hoc tests using the Bonferroni correction revealed that cholesterol reduced by an average of 0.566 mmol/L after 4 weeks (p < 0.001) and then by an additional 0.063 mmol/L between 4 and 8 weeks (p = 0.004).

Statistical Soup: ANOVA, ANCOVA, MANOVA, & MANCOVA

ANOVA
• The core component of all four of these analyses (ANOVA, ANCOVA, MANOVA and MANCOVA) is the first in the list, the ANOVA.
• An "Analysis of Variance" (ANOVA) tests three or more groups for mean differences on a continuous (i.e. scale or interval) response (dependent) variable.
• Group membership could be, for example, race, level of education, or treatment condition.
• There are two main types of ANOVA:
(1) A "one-way" ANOVA compares the levels (i.e. groups) of a single factor on a single continuous response variable (e.g. comparing Weight loss by 'Type of diet').
– Single independent variable (factor): 1. Type of diet = 1, 2, 3
– Single continuous dependent variable (response or outcome): 1. Weight loss
(2) A "two-way" ANOVA compares the levels of two factors for mean differences on a single continuous response variable (e.g. comparing Weight loss by both 'Type of diet' and 'Favorite soft drink').
– Two independent variables (factors): 1. Type of diet = 1, 2, 3; 2. Favorite soft drink = Don’t like, Sugarless, With sugar
– Single continuous dependent variable (response or outcome): 1. Weight loss

ANCOVA (analysis of covariance)
• Suppose we want to study the effect of an independent variable (Type of diet) on a dependent variable (Weight loss). Someone could suggest that a person’s age has an added influence on the amount of weight they lose on a particular diet.
• In this case, ANCOVA takes into account the covariate age, which may have a direct effect on weight loss.
• One or more variables that could be related to the dependent variable can be controlled for.
Two independent variables:
– One categorical variable (factor): Type of diet = 1, 2, 3
– One or more controlled variables (covariates): Age (continuous variable)
Single continuous dependent variable (response or outcome): Weight loss
ANCOVA compares a continuous response variable (e.g. weight loss) by levels of a factor variable (e.g. type of diet), controlling for one or more covariates (e.g. age).

ANCOVA with three covariate variables
• In this example, the change in forced expiratory volume (FEV) was studied as the dependent variable across 3 age categories as the independent variable, taking into account height, sex and smoking status as covariate variables that may affect FEV.
• Note that the covariates can include continuous or categorical variables.
• Forced expiratory volume (FEV) measures how much air a person can exhale during a forced breath.
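To illustrate what "controlling for a covariate" means in the ANCOVA slides above, the following is a minimal sketch of a one-way ANCOVA with a single continuous covariate, written in plain Python. It assumes the same covariate slope in every group, and it is not the SPSS procedure. The function name one_way_ancova and the (age, weight loss) values are hypothetical and were made up for this example.

```python
from statistics import mean


def one_way_ancova(groups):
    """One factor, one continuous covariate, common within-group slope assumed.

    groups: dict mapping a group label to a list of (covariate, response) pairs.
    Returns (slope, adjusted_means, F, df_factor, df_error).
    """
    pairs = [p for g in groups.values() for p in g]
    n_total, k = len(pairs), len(groups)
    x_bar = mean(x for x, _ in pairs)
    y_bar = mean(y for _, y in pairs)

    # Total sums of squares and cross-products.
    sxx_t = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy_t = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    syy_t = sum((y - y_bar) ** 2 for _, y in pairs)

    # Within-group (error) sums of squares and cross-products.
    sxx_w = sxy_w = syy_w = 0.0
    centres = {}
    for label, g in groups.items():
        gx, gy = mean(x for x, _ in g), mean(y for _, y in g)
        centres[label] = (gx, gy)
        sxx_w += sum((x - gx) ** 2 for x, _ in g)
        sxy_w += sum((x - gx) * (y - gy) for x, y in g)
        syy_w += sum((y - gy) ** 2 for _, y in g)

    slope = sxy_w / sxx_w                      # pooled within-group slope
    ss_error = syy_w - sxy_w ** 2 / sxx_w      # error SS after removing the covariate
    ss_total_adj = syy_t - sxy_t ** 2 / sxx_t  # total SS after removing the covariate
    ss_factor = ss_total_adj - ss_error        # factor SS adjusted for the covariate
    df_factor, df_error = k - 1, n_total - k - 1
    f_stat = (ss_factor / df_factor) / (ss_error / df_error)
    # Group means shifted to the overall mean of the covariate.
    adjusted = {label: gy - slope * (gx - x_bar)
                for label, (gx, gy) in centres.items()}
    return slope, adjusted, f_stat, df_factor, df_error


# Hypothetical (age in years, weight loss in kg) pairs for three diets.
diets = {
    "Diet 1": [(25, 3.0), (34, 3.6), (45, 2.4), (52, 2.1)],
    "Diet 2": [(28, 3.3), (39, 2.9), (47, 2.2), (60, 1.8)],
    "Diet 3": [(23, 5.6), (36, 5.1), (44, 4.7), (57, 4.0)],
}

slope, adjusted_means, f_stat, df_factor, df_error = one_way_ancova(diets)
print(f"pooled slope for age = {slope:.3f}")
print("age-adjusted mean weight loss:", adjusted_means)
print(f"F({df_factor}, {df_error}) = {f_stat:.3f}")
```

The adjusted means answer the question "what would each diet's mean weight loss be if all groups had the same average age?". One degree of freedom is taken from the error term for the covariate, which is why df_error is N − k − 1 and not N − k.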
ANCOVA -> regression
• If an ANCOVA is carried out with a single continuous response variable and no factors, the analysis becomes a regression (when the covariate is continuous).
ANCOVA layout, for comparison:
Two independent variables:
– One categorical variable (factor): Type of diet = 1, 2, 3
– One or more controlled variables (covariates): Age (continuous variable)
Single continuous dependent variable (response or outcome): Weight loss

MANOVA
• The obvious difference between ANOVA and a "Multivariate Analysis of Variance" (MANOVA) is the “M”, which stands for multivariate.
• In basic terms, a MANOVA is an ANOVA with two or more continuous response variables.
• Like ANOVA, MANOVA has both a one-way and a two-way form.
• The number of factor variables involved distinguishes a one-way MANOVA from a two-way MANOVA.

One-way MANOVA
– Single independent variable (factor): 1. Type of diet = 1, 2, 3
– Two or more continuous dependent variables (response or outcome): 1. Weight loss; 2. Cholesterol level

Two-way MANOVA
– Two independent variables (factors): 1. Type of diet = 1, 2, 3; 2. Favorite soft drink = Don’t like, Sugarless, With sugar
– Two or more continuous dependent variables (response or outcome): 1. Weight loss; 2. Cholesterol level

• A more subtle difference is that MANOVA can also compare the levels of a factor that has only two levels (binary).
• With a single response variable and a binary factor (e.g. gender), one uses an independent-samples t-test.

Independent-samples t-test
– Single binary independent variable (factor): Gender = Male, Female
– Single continuous dependent variable (response or outcome): Weight loss

• However, a t-test cannot estimate differences for more than one response variable together, so a MANOVA fills that need.

One-way MANOVA
– Single independent variable (factor): Gender = Male, Female
– Two or more continuous dependent variables (response or outcome): 1. Weight loss; 2. Cholesterol level

Why Should You Do a MANOVA?
• You do a MANOVA instead of a series of one-at-a-time ANOVAs.
• MANOVA takes into account the intercorrelations among the dependent variables (DVs).

One-way MANOVA
– Single independent variable (factor): 1. Type of diet = 1, 2, 3
– Two or more continuous dependent variables (response or outcome): 1. Weight loss; 2. Cholesterol level

Two-way MANOVA
– At least two independent variables (factors): 1. Type of diet = 1, 2, 3; 2. Favorite soft drink = Don’t like, Sugarless, With sugar
– Two or more continuous dependent variables (response or outcome): 1. Weight loss; 2. Cholesterol level

MANCOVA
• As with ANOVA and ANCOVA, the main difference between MANOVA and MANCOVA is the “C”, which again stands for “covariance”.
• Both a MANOVA and a MANCOVA feature two or more response variables; the key difference between the two is the nature of the independent variables.
• While a MANOVA can include only factors, an analysis evolves from MANOVA to MANCOVA when one or more covariates are added to the mix.

One-way MANCOVA
Two independent variables:
– One categorical variable (factor): Type of diet = 1, 2, 3
– One or more controlled variables (covariates): Age
Two or more continuous dependent variables (response or outcome): 1. Weight loss; 2. Cholesterol level
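The naming scheme in the "Statistical Soup" slides can be summarised as a small lookup, sketched here in Python. The function choose_analysis and its parameter names are hypothetical, made up for this summary. It covers only the between-subjects designs discussed in these slides (not repeated measures), and it treats the no-factor case as regression only when there is a single response and at least one covariate, which is assumed to be continuous.

```python
def choose_analysis(n_responses, n_factors, n_covariates=0, factor_levels=3):
    """Name the analysis from the number of variables of each kind.

    n_responses:   number of continuous dependent (response) variables
    n_factors:     number of categorical independent variables (factors)
    n_covariates:  number of controlled variables (covariates)
    factor_levels: number of levels of the factor (matters for one factor only)
    """
    if n_responses < 1:
        raise ValueError("at least one continuous response variable is required")
    if n_factors == 0:
        # No factors: a single response with continuous covariate(s) is a regression.
        if n_responses == 1 and n_covariates >= 1:
            return "regression"
        raise ValueError("combination not covered by the slides")
    if n_factors > 2:
        raise ValueError("more than two factors is not covered by the slides")

    multivariate = n_responses >= 2
    if n_covariates >= 1:
        name = "MANCOVA" if multivariate else "ANCOVA"
    elif multivariate:
        name = "MANOVA"
    elif n_factors == 1 and factor_levels == 2:
        # One response and one binary factor: the two-group special case.
        return "independent-samples t-test"
    else:
        name = "ANOVA"
    way = "one-way" if n_factors == 1 else "two-way"
    return f"{way} {name}"


# Weight loss by type of diet, controlling for age.
print(choose_analysis(n_responses=1, n_factors=1, n_covariates=1))
# Weight loss and cholesterol level by type of diet and favorite soft drink.
print(choose_analysis(n_responses=2, n_factors=2))
```

The order of the checks mirrors the slides: adding a second response variable adds the "M", adding a covariate adds the "C", and the number of factors decides between one-way and two-way.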
