APPLIED STATISTICS (D1074)
THE ANALYSIS OF VARIANCE (ANOVA) & MANOVA
Week 7-10
Bina Nusantara University

Learning Outcomes
LO 1 : Apply statistical methods to real problems
LO 2 : Use the proper statistical method for a real problem
LO 3 : Use statistical software to conduct analysis
LO 4 : Interpret the results of software output and statistical calculations
LO 5 : Explain the suitable decision based on the statistical solution

(1) One-Factor Analysis of Variance

1.1 Analysis of Variance (ANOVA)
• Analysis of variance is mainly used to test the hypothesis that three or more population means are all equal against the alternative that at least one mean is different.
• It is a technique for assessing how one or several nominal independent variables (called factors) affect a continuous dependent variable.
• ANOVA with only one nominal independent variable is called 1-way ANOVA.
• ANOVA with two nominal independent variables is called 2-way ANOVA.
• ANOVA is an extension of the independent two-sample t test.

1.2 Experimental Design and ANOVA
• Statistical studies can be classified as either experimental or observational.
• In an experimental statistical study, an experiment is conducted to generate the data.
• An experiment begins with identifying a variable of interest. Then one or more other variables, thought to be related, are identified and controlled, and data are collected about how those variables influence the variable of interest.

Example 1
Suppose in an industrial experiment that an engineer is interested in how the mean absorption of moisture in concrete varies among 5 different concrete aggregates.
Factor levels: Aggregate 1, Aggregate 2, Aggregate 3, Aggregate 4, Aggregate 5
• Question of interest: how do the different aggregates affect the absorption of moisture?
• The engineer may also be interested in making individual comparisons among these 5 population means.
• The samples are exposed to moisture for 48 hours.
• It is decided that 6 samples are to be tested for each aggregate, requiring a total of 30 samples.

Hypothesis:
H0: $\mu_1 = \mu_2 = \cdots = \mu_5$
H1: At least two of the means are not equal

1.3 Assumptions
1. For each population, the response variable is normally distributed.
2. The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
3. The observations must be independent.

1.4 F-Distribution

1.5 Hypothesis Test
A nominal independent variable with k treatments is called a factor with k levels.

1.6 ANOVA Table
The ANOVA table is used to calculate the test statistic F.

Example 2
Continued from Example 1. Demonstrate that the aggregates do not have the same mean absorption.
Significance level: α = 5%

Hypothesis:
H0: $\mu_1 = \mu_2 = \cdots = \mu_5$ (the aggregates have the same mean absorption)
H1: At least two of the means are not equal (the aggregates do not have the same mean absorption)

Test statistic:
SST = 209377
SSA = 85356
SSE = SST − SSA = 209377 − 85356 = 124021
F = MSA / MSE = 21339 / 4961 = 4.30

Critical region: F > 2.76 with 4 and 25 degrees of freedom

ANOVA table:

Source of variation   Sum of squares   Degrees of freedom   Mean square   F
Aggregates                     85356                    4         21339   4.30
Error                         124021                   25          4961
Total                         209377                   29

Decision: since F = 4.30 > 2.76, reject H0 at the 5% level. The aggregates do not have the same mean absorption.

1.7 Multiple Comparison Method
• When ANOVA shows that the population means are not all equal, multiple comparison procedures are applied next.
• Purpose: determine where the differences among the means occur.

(2) Randomized Block Design

2.1 Definition
• A completely randomized design is useful when the experimental units are homogeneous.
• If the experimental units are heterogeneous, blocking is often used to form homogeneous groups.
• A randomized complete block design is an extension of paired samples that accommodates the comparison of a set of k population means or factor levels.
• Typical layout: a randomized complete block design using 3 measurements in 4 blocks.
• General layout: a k × b array for the RCB design (k treatments, b blocks).

2.2 Hypothesis Test

TEST FOR THE EQUALITY OF k TREATMENTS
Hypothesis:
H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
H1: At least two of the means are not equal
Test statistic: $F = \mathrm{MSTR} / \mathrm{MSE}$
Rejection rule: Reject H0 if $F > F_{\alpha,\, k-1,\, (k-1)(b-1)}$

TEST FOR THE EQUALITY OF b BLOCKS
Hypothesis:
H0: $\beta_1 = \beta_2 = \beta_3 = \cdots = \beta_b$
H1: At least two of the block effects are not equal
Test statistic: $F = \mathrm{MSBL} / \mathrm{MSE}$
Rejection rule: Reject H0 if $F > F_{\alpha,\, b-1,\, (k-1)(b-1)}$

2.3 ANOVA Table
The ANOVA table is used to calculate the test statistics F for treatments (from SSTR) and for blocks (from SSBL).

Example 3
Four different machines, M1, M2, M3, and M4, are being considered for the assembly of a particular product. It was decided that six different operators would be used in a randomized block experiment to compare the machines. The machines were assigned in a random order to each operator. The operation of the machines requires physical dexterity, and it was anticipated that there would be a difference among the operators in the speed with which they operated the machines.

(2) Multivariate ANOVA (MANOVA)

• The MANOVA model for comparing g population mean vectors parallels univariate ANOVA:
  $X_{lj} = \mu + \tau_l + e_{lj}, \quad j = 1, \ldots, n_l, \quad l = 1, \ldots, g$

Observation Vectors
• Each component of $X_{lj}$ satisfies the 1-way ANOVA model, but now the model includes covariances among the components.
• These covariances are assumed to be equal across populations.
• A vector of observations can be decomposed as
  $x_{lj} = \bar{x} + (\bar{x}_l - \bar{x}) + (x_{lj} - \bar{x}_l)$
  (overall sample mean, estimated treatment effect, residual).

Sums-of-Squares and Cross-Products (SSCP)
• First we'll find the total corrected squares and cross-products:
  $(x_{lj} - \bar{x})(x_{lj} - \bar{x})' = [(x_{lj} - \bar{x}_l) + (\bar{x}_l - \bar{x})]\,[(x_{lj} - \bar{x}_l) + (\bar{x}_l - \bar{x})]'$
• and now sum all of this over cases and groups.
• Since addition is distributive, we'll do this in pieces and look just at the cross-product terms first: they sum to zero over j because $\sum_{j} (x_{lj} - \bar{x}_l) = 0$.

Sum of Squares
• Now summing the rest over j and l we get
  $\sum_{l=1}^{g} \sum_{j=1}^{n_l} (x_{lj} - \bar{x})(x_{lj} - \bar{x})' = \sum_{l=1}^{g} n_l (\bar{x}_l - \bar{x})(\bar{x}_l - \bar{x})' + \sum_{l=1}^{g} \sum_{j=1}^{n_l} (x_{lj} - \bar{x}_l)(x_{lj} - \bar{x}_l)'$
  (total corrected SSCP = between groups SSCP, B, + within groups SSCP, W).

A Closer Look at Within Groups SSCP
  $W = \sum_{l=1}^{g} \sum_{j=1}^{n_l} (x_{lj} - \bar{x}_l)(x_{lj} - \bar{x}_l)' = (n_1 - 1) S_1 + (n_2 - 1) S_2 + \cdots + (n_g - 1) S_g$
• where $S_l$ is the sample covariance matrix for the lth group (treatment, condition, etc.).
• W (also denoted E) is proportional to a pooled estimate of the common covariance matrix Σ.

Between Groups SSCP & Test Statistic
• With respect to the between groups SSCP, B, the test statistic is
  $\Lambda^* = \dfrac{|W|}{|W + B|} = \dfrac{|W|}{|T|}$
• where T = W + B (i.e., the total corrected SSCP).
• $\Lambda^*$ is known as "Wilks' Lambda". It is equivalent to the likelihood ratio statistic.

Hypothesis Testing with Distribution of Wilks' Lambda

Example: One-Way MANOVA
• An experiment was conducted to compare 2 methods (A & B) of teaching shorthand to 60 female seniors in a vocational high school (a dated example).
• Also of interest were the effects of distributed versus massed practice:
  C1: 2 hours of instruction/day for 6 weeks
  C2: 3 hours of instruction/day for 4 weeks
  C3: 4 hours of instruction/day for 3 weeks
• So each subject received a total of 12 hours of instruction.
• For now, we'll just look at the effect of distributed versus massed practice.
• Note: $n_l = 20$ for l = 1, 2, 3.
• Two variables (dependent measures): X1 = speed, X2 = accuracy.

Example Hypothesis Test
• No difference between massed and distributed practice on either speed or accuracy:
  H0: $\mu_1 = \mu_2 = \mu_3$, where $\mu_l$ is the mean vector (speed, accuracy) for practice condition l.
• The within groups (residual) sums of squares and cross-products matrix is W.

Hypothesis Test (continued)
• The between groups SSCP matrix is B.
• Alternatively, compute T = (n − 1)S, where S is the covariance matrix computed over all groups and n is the total sample size.
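The quantities defined above (W, T, B and Wilks' Lambda) can be checked numerically. The following is a minimal sketch, not the slides' own computation: the shorthand data are not reproduced in this deck, so the data below are synthetic, and the function names `sscp_decomposition` and `wilks_lambda`, the group means, and the random seed are assumptions made only for illustration.

```python
import numpy as np


def sscp_decomposition(groups):
    """Return (W, B, T) for a list of (n_l x p) arrays, one array per group."""
    all_obs = np.vstack(groups)
    n = all_obs.shape[0]
    grand_mean = all_obs.mean(axis=0)

    # Within-groups SSCP: W = sum over groups of (n_l - 1) * S_l
    W = sum((len(x) - 1) * np.cov(x, rowvar=False) for x in groups)

    # Total corrected SSCP: T = (n - 1) * S, with S computed over all groups
    T = (n - 1) * np.cov(all_obs, rowvar=False)

    # Between-groups SSCP computed directly from the group mean vectors
    B = sum(
        len(x) * np.outer(x.mean(axis=0) - grand_mean, x.mean(axis=0) - grand_mean)
        for x in groups
    )
    return W, B, T


def wilks_lambda(W, B):
    """Wilks' Lambda = |W| / |W + B|."""
    return np.linalg.det(W) / np.linalg.det(W + B)


# Synthetic (hypothetical) data: g = 3 groups, n_l = 20 cases each, p = 2 variables.
rng = np.random.default_rng(0)
group_means = [(30.0, 20.0), (32.0, 21.0), (35.0, 19.0)]
groups = [rng.normal(loc=m, scale=4.0, size=(20, 2)) for m in group_means]

W, B, T = sscp_decomposition(groups)
lam = wilks_lambda(W, B)

print("T equals W + B:", np.allclose(T, W + B))
print("Wilks' Lambda:", round(lam, 4))
```

Computing B directly from the group means and comparing it with T − W is a useful sanity check when the matrices are entered by hand.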
• Then B = T − W.

Test Statistic & Distribution
• For p = 2 and g = 3, we can use the exact sampling distribution:
  $\left( \dfrac{n - g - 1}{g - 1} \right) \left( \dfrac{1 - \sqrt{\Lambda^*}}{\sqrt{\Lambda^*}} \right) \sim F_{2(g-1),\; 2(n-g-1)}$

(3) Application with Minitab

ONE WAY ANOVA

One-way ANOVA: Durability versus Carpet

Source  DF     SS    MS     F      P
Carpet   3  146.4  48.8  3.58  0.047
Error   12  163.5  13.6
Total   15  309.9

S = 3.691   R-Sq = 47.24%   R-Sq(adj) = 34.05%

Tukey 90% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Carpet
Individual confidence level = 97.50%

Carpet = 1 subtracted from:
Carpet    Lower  Center   Upper
2       -11.428  -4.748   1.933
3        -8.356  -1.675   5.006
4        -3.048   3.633  10.313

Carpet = 2 subtracted from:
Carpet    Lower  Center   Upper
3        -3.608   3.073   9.753
4         1.699   8.380  15.061

Carpet = 3 subtracted from:
Carpet    Lower  Center   Upper
4        -1.373   5.308  11.988

ONE WAY MANOVA

General Linear Model: Tear, Gloss, Opacity versus Extrusion, Additive

MANOVA for Extrusion
s = 1   m = 0.5   n = 6.0

Criterion         Test Statistic      F  Num DF  Denom DF      P
Wilks'                   0.38186  7.554       3        14  0.003
Lawley-Hotelling         1.61877  7.554       3        14  0.003
Pillai's                 0.61814  7.554       3        14  0.003
Roy's                    1.61877

EIGEN Analysis for Extrusion

Eigenvalue   1.619  0.00000  0.00000
Proportion   1.000  0.00000  0.00000
Cumulative   1.000  1.00000  1.00000

Eigenvector       1       2       3
Tear         0.6541  0.0460  0.4333
Gloss       -0.3385  0.1241  0.5012
Opacity      0.0359  0.1246  0.0000

MANOVA for Additive
s = 1   m = 0.5   n = 6.0

Criterion         Test Statistic      F  Num DF  Denom DF      P
Wilks'                   0.52303  4.256       3        14  0.025
Lawley-Hotelling         0.91192  4.256       3        14  0.025
Pillai's                 0.47697  4.256       3        14  0.025
Roy's                    0.91192

MANOVA for Extrusion*Additive
s = 1   m = 0.5   n = 6.0

Criterion         Test Statistic      F  Num DF  Denom DF      P
Wilks'                   0.77711  1.339       3        14  0.302
Lawley-Hotelling         0.28683  1.339       3        14  0.302
Pillai's                 0.22289  1.339       3        14  0.302
Roy's                    0.28683
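In the Minitab MANOVA output above, every effect is reported with s = 1, so there is a single nonzero eigenvalue λ and all four criteria are functions of it. The sketch below recomputes Wilks' Lambda, Pillai's trace and F from the printed Lawley-Hotelling values and degrees of freedom. The function name `criteria_from_eigenvalue` is hypothetical, and the relations used (Wilks = 1/(1 + λ), Pillai = λ/(1 + λ), F = λ · Denom DF / Num DF) are the standard ones for the s = 1 case, not a description of Minitab's internal routine.

```python
def criteria_from_eigenvalue(eigenvalue, num_df, denom_df):
    """MANOVA criteria when there is a single nonzero eigenvalue (s = 1)."""
    return {
        "wilks": 1.0 / (1.0 + eigenvalue),
        "lawley_hotelling": eigenvalue,
        "pillai": eigenvalue / (1.0 + eigenvalue),
        "roy": eigenvalue,  # reported here as the largest eigenvalue
        "F": eigenvalue * denom_df / num_df,
    }


# Lawley-Hotelling values copied from the Minitab output (Num DF = 3, Denom DF = 14).
printed_eigenvalues = {
    "Extrusion": 1.61877,
    "Additive": 0.91192,
    "Extrusion*Additive": 0.28683,
}

results = {
    effect: criteria_from_eigenvalue(lam, num_df=3, denom_df=14)
    for effect, lam in printed_eigenvalues.items()
}

for effect, crit in results.items():
    print(
        f"{effect}: Wilks = {crit['wilks']:.5f}, "
        f"Pillai = {crit['pillai']:.5f}, F = {crit['F']:.3f}"
    )
```

Agreement with the printed tables (up to rounding of the eigenvalue) explains why the three F statistics and P-values within each table are identical: with s = 1 the criteria carry the same information.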