Chapter 12: Analysis of Variance

Introduction
Chapters 2-4: techniques to describe data (descriptive statistics: charts and summary statistics)
Chapters 5-8: probability, probability distributions, sampling methods, and the Central Limit Theorem (CLT)

Introduction
Inferential statistics:
(1) Estimation
(2) Hypothesis testing

Chapter 9: Estimation
- point estimation
- interval estimation
Chapter 10: One-sample tests of hypothesis
* population mean
* population proportion
Chapter 11: Two-sample tests of hypothesis
* two population means
* two population proportions
Chapter 12: Analysis of variance (ANOVA)
* two population variances
* three or more population means
  $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
  $H_1$: The means are not all equal

Learning Objectives
LO1 the F distribution
LO2 test the variances of two normal populations, with test statistic $F = s_1^2 / s_2^2$
LO3 test three or more population means
- one-way ANOVA (one-factor analysis of variance)
- two-way ANOVA (two-factor analysis of variance)

LO1: The F Distribution
It was named to honor Sir Ronald Fisher (1890-1962), father of modern statistics.

The F Distribution
It is applied when we want to
- compare two population variances
- compare several population means simultaneously; this is called "analysis of variance" (ANOVA).
In both situations, the populations must follow a normal distribution.

Characteristics of the F Distribution
1. There is a "family" of F distributions.
2. Each F distribution is determined by two parameters:
   (1) the degrees of freedom in the numerator
   (2) the degrees of freedom in the denominator.
3. It is a continuous distribution.
4. The F value is nonnegative.
5. The F distribution is positively skewed.
6. It is asymptotic: as $F \to \infty$, the curve approaches the X-axis but never touches it.
Comparing variances of two normal populations
Examples:
- A car manufacturer is about to unveil a new, faster car. However, initial tests indicate more variation in its processing time than in that of the current cars.
- A sample of 15 technology and 15 utility stocks shows the same mean rate of return, but there is more variation in the technology stocks. (Utility stock: stock of a public utility company.)

Equal means but different variances

LO2: Test for Equal Variances
Test statistic: $F = s_1^2 / s_2^2$
Note: the F value is nonnegative.

Question: When do we reject $H_0$?
Answer: Reject $H_0$ if F is not close to 1.
Next question: How close is close?

Rejection region
$F = s_1^2 / s_2^2$ follows $F_{v_1, v_2}$ when $\sigma_1^2 = \sigma_2^2$, where $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$.

One-sided and two-sided tests
(a) $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 > \sigma_2^2$
(b) $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 < \sigma_2^2$
(c) $H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$
Test statistic: $F = s_1^2 / s_2^2$
(a) Reject $H_0$ if F is far above 1 (if $F > F_{\alpha, v_1, v_2}$)
(b) Reject $H_0$ if F is far below 1 (if $F < F_{1-\alpha, v_1, v_2}$)
(c) Reject $H_0$ if F is not close to 1 (if $F > F_{\alpha/2, v_1, v_2}$ or $F < F_{1-\alpha/2, v_1, v_2}$)
Here $F_{\alpha, v_1, v_2}$ denotes the value with upper-tail area $\alpha$.

More on the two-sided test
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$
Reject $H_0$ if F is not close to 1, that is, if $F > F_{\alpha/2, v_1, v_2}$ or $F < F_{1-\alpha/2, v_1, v_2}$.
Question: Is $F_{\alpha/2, v_1, v_2}$ greater or less than 1? Is $F_{1-\alpha/2, v_1, v_2}$ greater or less than 1?
Answer: $F_{\alpha/2, v_1, v_2} > 1$ and $F_{1-\alpha/2, v_1, v_2} < 1$.

Test for Equal Variances - Example
The following are the mean rates of return of 7 technology and 8 utility stocks.
Technology: 52 67 56 45 70 54 64
Utility:    59 60 61 51 56 63 57 65
Using the .10 significance level, is there a difference in the variation of the rates of return for the two types of stocks?
$\bar X_1 = 58.28571$; $\bar X_2 = 59$.

Test for Equal Variances - Example
Step 1: The hypotheses are $H_0: \sigma_1^2 = \sigma_2^2$ and $H_1: \sigma_1^2 \neq \sigma_2^2$.
Step 2: The significance level is .10.
Step 3: The test statistic follows the F distribution.
Step 4: State the decision rule.
Reject $H_0$ if $F > F_{\alpha/2, v_1, v_2} = F_{.10/2,\, 7-1,\, 8-1} = F_{.05, 6, 7}$.
Step 5: Compute the value of F and make a decision.
$F = s_1^2 / s_2^2 = 80.90 / 19.14 = 4.23$
The decision is to reject the null hypothesis, because the computed F value (4.23) is larger than the critical value (3.87). We conclude that there is a difference in the variation of the rates of return of the two types of stocks.

Wait. Shouldn't we check two critical values?
$H_0: \sigma_1^2 = \sigma_2^2$, $H_1: \sigma_1^2 \neq \sigma_2^2$
Reject $H_0$ if $F > F_{\alpha/2, v_1, v_2}$ or $F < F_{1-\alpha/2, v_1, v_2}$.
Decision rule: Reject $H_0$ if $F > F_{.05, 6, 7}$ or $F < F_{.95, 6, 7}$.
However, the table only provides the upper critical values!

Approach 1: Place the larger sample variance in the numerator, so that $F = s_1^2 / s_2^2 > 1$ always.
Then $F < F_{.95, 6, 7}$ can never happen, and only the right-tail critical value is required.
Advantage: this reduces the size of the table of critical values.

Approach 2: Use the relation
$F_{1-\alpha, v_2, v_1} = \dfrac{1}{F_{\alpha, v_1, v_2}}$.
So $F_{.95, 6, 7} = \dfrac{1}{F_{.05, 7, 6}} = \dfrac{1}{4.21} = 0.2375$.
The reason is as follows:
$\alpha = P\left(\dfrac{s_1^2}{s_2^2} > F_{\alpha, v_1, v_2}\right)$, where $\dfrac{s_1^2}{s_2^2} \sim F(v_1, v_2)$
$\alpha = P\left(\dfrac{s_2^2}{s_1^2} < \dfrac{1}{F_{\alpha, v_1, v_2}}\right)$, where $\dfrac{s_2^2}{s_1^2} \sim F(v_2, v_1)$
$\alpha = P\left(\dfrac{s_2^2}{s_1^2} < F_{1-\alpha, v_2, v_1}\right)$
So $F_{1-\alpha, v_2, v_1} = \dfrac{1}{F_{\alpha, v_1, v_2}}$.
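The equal-variance test for the stock data can be checked numerically. The following is a minimal sketch in plain Python, not part of the original slides. The helper name `sample_variance` is an assumption made for illustration, and the two critical values are copied from the F table values quoted in the slides (3.87 and 1/4.21) rather than computed.

```python
# Rates of return taken from the example in the slides.
technology = [52, 67, 56, 45, 70, 54, 64]
utility = [59, 60, 61, 51, 56, 63, 57, 65]


def sample_variance(values):
    """Sample variance with n - 1 in the denominator (hypothetical helper)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


s2_tech = sample_variance(technology)   # about 80.90
s2_util = sample_variance(utility)      # about 19.14
f_stat = s2_tech / s2_util              # larger variance in the numerator

# Critical values quoted in the slides (alpha = .10, two-sided):
upper_crit = 3.87        # F_{.05, 6, 7}
lower_crit = 1 / 4.21    # F_{.95, 6, 7} = 1 / F_{.05, 7, 6}

reject_h0 = f_stat > upper_crit or f_stat < lower_crit
print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```

Swapping the two samples gives the reciprocal statistic, which is then compared against the critical values with the degrees of freedom swapped as well.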
Approach 2 (continued)
For the derivation on the previous slide:
- the 1st equality holds by definition;
- the 2nd equality follows by taking reciprocals on both sides of the inequality;
- the 3rd equality holds by definition.
The last line follows by comparing the 2nd and 3rd equalities.

Approach 2 (continued)
Decision rule: Reject $H_0$ if $F > F_{.05, 6, 7} = 3.87$ or $F < F_{.95, 6, 7} = 0.2375$.
So we have obtained the two critical values, and we reject $H_0$ because $F = 4.23 > 3.87$.

Now use Approach 2 with the smaller variance in the numerator.
If the larger sample variance is not placed in the numerator, what is the computed F statistic?
F statistic: $F = s_2^2 / s_1^2 = 1/4.23 = 0.2364$.
Decision rule: Reject $H_0$ if $F > F_{.05, 7, 6} = 4.21$ or $F < F_{.95, 7, 6} = 1/3.87 = 0.2583$.
So we reject the null hypothesis because 0.2364 < 0.2583.

LO3: Comparing Means of Two or More Populations
- One-way analysis of variance (one-factor analysis of variance)
- Two-way analysis of variance (two-factor analysis of variance)

Comparing Means of Two or More Populations
The F distribution is also used for testing whether three or more population means are equal.
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_1$: The means are not all equal
Assumptions:
- The populations follow the normal distribution.
- The populations have equal variance.
- The random samples are independent.

Comparing Means of Two or More Populations - Example
A manager of a regional financial center wishes to compare the productivity, as measured by the number of customers served, among three employees. Four days are randomly selected and the number of customers served by each employee is recorded.

Comparing Means of Two or More Populations
Do the figures below satisfy the three assumptions?
- The populations follow the normal distribution? (Yes)
- The populations have equal variance? (Yes)
- The random samples are independent? (Not sure)

ANOVA (analysis of variance)
Goal: Test whether there is a significant difference among the treatment effects.
Treatment: prescription.
* ANOVA was first developed for agricultural applications.

Idea: decomposition of total variation into several parts

            Wolfe   White   Korosa
              55      66      47
              54      76      51
              59      67      46
              56      71      48
Group mean    56      70      48

Grand mean: $\bar X_G = 58$. Group means: $\bar X_1 = 56$, $\bar X_2 = 70$, $\bar X_3 = 48$.
group 1: $X_{11}, X_{12}, X_{13}, X_{14}$
group 2: $X_{21}, X_{22}, X_{23}, X_{24}$
group 3: $X_{31}, X_{32}, X_{33}, X_{34}$
total variation: $\sum (X - \bar X_G)^2$
variation within group i: $\sum_{j=1}^{4} (X_{ij} - \bar X_i)^2$
variation between groups: $\sum_{i=1}^{3} n_i (\bar X_i - \bar X_G)^2$, with $n_i = 4$ here

Idea: decomposition of total variation
Summing over all observations, with $\bar X_c$ the mean of the group that X belongs to:
$\sum (X - \bar X_G)^2 = \sum (X - \bar X_c + \bar X_c - \bar X_G)^2$
$= \sum (X - \bar X_c)^2 + \sum (\bar X_c - \bar X_G)^2 + 2 \sum (X - \bar X_c)(\bar X_c - \bar X_G)$
$= \sum (X - \bar X_c)^2 + \sum (\bar X_c - \bar X_G)^2$
The cross term vanishes because the deviations $X - \bar X_c$ sum to zero within each group.
Within-group variation vs. between-group variation.

Idea: decomposition of total variation
SS total (sum of squares total) $= \sum (X - \bar X_G)^2 = \sum (X - \bar X_c)^2 + \sum (\bar X_c - \bar X_G)^2 =$ SSE + SSTrt
SSE: SS for error. SSTrt (also written SST): SS for treatment.
When $H_0$ is true, SST will be smaller; when $H_1$ is true, SST will be larger.
So, do we reject $H_0$ if SST/SSE is too large or too small?

Analysis of variance
SS total = SSE + SSTrt
We reject $H_0$ if SSTrt/SSE is too large.
Test statistic: $F = \dfrac{\text{SSTrt}/(k-1)}{\text{SSE}/(n-k)}$
When $H_0$ is true, the test statistic follows the F distribution with $k - 1$ and $n - k$ degrees of freedom. In this case, $k = 3$ and $n = 12$.

Why $(k-1)$ and $(n-k)$ in $F = \dfrac{\text{SSTrt}/(k-1)}{\text{SSE}/(n-k)}$?
Compare with F in equation (9-1):
$F = \dfrac{s_1^2}{s_2^2} = \dfrac{\sum (X - \bar X)^2/(n_1 - 1)}{\sum (Y - \bar Y)^2/(n_2 - 1)}$
Recall: each F distribution is determined by two parameters:
(1) the degrees of freedom in the numerator
(2) the degrees of freedom in the denominator.

Analysis of Variance - F statistic
k: number of populations
n: number of observations
The test statistic is computed by $F = \dfrac{\text{SST}/(k-1)}{\text{SSE}/(n-k)}$.

ANOVA table
$F = \dfrac{\text{SST}/(k-1)}{\text{SSE}/(n-k)} = \dfrac{\text{MST}}{\text{MSE}}$

MSE: mean squared error
Have you seen this MSE before? What does the MSE estimate?
$\text{MSE} = \dfrac{\text{SSE}}{n-k} = \dfrac{\sum (X - \bar X_c)^2}{n-k}$, where $n = n_1 + n_2 + \dots + n_k$.
For the values of residential homes, $k = 2$ and $n_1 = n_2 = 10$, so
$\text{MSE} = \dfrac{\sum_{i=1}^{10} (X_{1i} - \bar X_1)^2 + \sum_{i=1}^{10} (X_{2i} - \bar X_2)^2}{10 + 10 - 2}$.
It is the pooled variance! It estimates the common population variance.

Procedure of hypothesis testing
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_1$: The means are not all equal
Null: the population means are all the same.
Alternative: at least one of the means is different.
Choose the significance level $\alpha$.
Test statistic: F, which follows the F distribution under $H_0$.
Decision rule: Reject $H_0$ if $F > F_{\alpha, k-1, n-k}$.
Compute F and make a decision.
Question: Is the rejection region in one tail or two tails? Why?

Commonly seen mistakes
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_1: \mu_1 \neq \mu_2 \neq \dots \neq \mu_k$ (not quite right)
Decision rule: Reject $H_0$ if $F > F_{\alpha/2, k-1, n-k}$ or $F < F_{1-\alpha/2, k-1, n-k}$. (It is wrong!)

Example
A marketing researcher randomly selected and surveyed customers from four stores regarding their level of satisfaction with a recent purchase. Twenty-five questions offered a range of possible answers: excellent, good, fair, or poor, with scores of 4, 3, 2, and 1, respectively. These responses were then totaled. Is there a difference in the mean satisfaction level among the four stores? Use the .01 significance level.

         Eaton    Tony    Aden     Oz     Total
           94      75      70      68
           90      68      73      70
           85      77      76      72
           80      83      78      65
                   88      80      74
                           68      65
                           65
Total     349     391     510     414     1,664
Mean     87.25   78.20   72.86   69.00    75.64

Grand mean: 75.64
Group means: 87.25, 78.20, 72.86, 69.00

Comparing Means of Two or More Populations - Example
Step 1: State the null and alternate hypotheses.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: The means are not all equal
Reject $H_0$ if $F > F_{\alpha, k-1, n-k}$.
Step 2: State the level of significance.
The .01 significance level is stated in the problem.
Step 3: Find the appropriate test statistic.
Because we are comparing the means of more than two groups, use the F statistic.

Comparing Means of Two or More Populations - Example
Step 4: State the decision rule.
Reject $H_0$ if $F > F_{\alpha, k-1, n-k} = F_{.01,\, 4-1,\, 22-4} = F_{.01, 3, 18} = 5.09$.

Step 5: Compute the value of F and make a decision.
We need to calculate the following:
total variation: $\sum (X - \bar X_G)^2$
variation within group c: $\sum_{j=1}^{n_c} (X_{cj} - \bar X_c)^2$
variation between groups: $\sum_{c=1}^{4} n_c (\bar X_c - \bar X_G)^2$

Deviations from the grand mean, $X - \bar X_G$:

         Eaton    Tony    Aden     Oz
         18.36   -0.64   -5.64   -7.64
         14.36   -7.64   -2.64   -5.64
          9.36    1.36    0.36   -3.64
          4.36    7.36    2.36  -10.64
                 12.36    4.36   -1.64
                         -7.64  -10.64
                        -10.64

Squared deviations from the grand mean, $(X - \bar X_G)^2$:

         Eaton    Tony    Aden     Oz
        337.09    0.41   31.81   58.37
        206.21   58.37    6.97   31.81
         87.61    1.85    0.13   13.25
         19.01   54.17    5.57  113.21
                152.77   19.01    2.69
                         58.37  113.21
                        113.21
Total   649.91  267.57  235.07  332.54

So SS total = 649.91 + 267.57 + 235.07 + 332.54 = 1485.09.

Deviations from the group means, $X - \bar X_c$, and their squares:

         Eaton    Tony    Aden     Oz   |    Eaton     Tony    Aden     Oz
          6.75    -3.2   -2.86    -1    |  45.5625    10.24    8.18     1
          2.75   -10.2    0.14     1    |   7.5625   104.04    0.02     1
         -2.25    -1.2    3.14     3    |   5.0625     1.44    9.86     9
         -7.25     4.8    5.14    -4    |  52.5625    23.04   26.42    16
                   9.8    7.14     5    |             96.04   50.98    25
                         -4.86    -4    |                     23.62    16
                         -7.86          |                     61.78
Total                                   |   110.75   234.80  180.86    68

So SSE = 110.75 + 234.80 + 180.86 + 68 = 594.41.

Computing SST (SS due to treatment)
$\text{SSTrt} = \sum_{c=1}^{4} \sum_{j=1}^{n_c} (\bar X_c - \bar X_G)^2 = n_1(\bar X_1 - \bar X_G)^2 + n_2(\bar X_2 - \bar X_G)^2 + n_3(\bar X_3 - \bar X_G)^2 + n_4(\bar X_4 - \bar X_G)^2$
Why?
SS total $= \sum_{c=1}^{4} \sum_{j=1}^{n_c} (X_{cj} - \bar X_G)^2 = \sum_{c=1}^{4} \sum_{j=1}^{n_c} (X_{cj} - \bar X_c)^2 + \sum_{c=1}^{4} \sum_{j=1}^{n_c} (\bar X_c - \bar X_G)^2 =$ SSE + SSTrt
Another way to obtain SST: SSTrt = SS total - SSE = 1485.09 - 594.41 = 890.68.

What is the estimated value of the common variance? MSE = SSE/(n - k) = 594.41/18 = 33.02.
What is the computed value of F? F = MST/MSE = (890.68/3)/33.02 = 8.99.

The computed value of F is 8.99, which is greater than the critical value of 5.09, so the null hypothesis is rejected.
Conclusion: The mean scores are not the same for the four stores.
Note: At this point we can only conclude that there is a difference in the treatment means. We cannot determine which treatment groups differ.

Further question: Which treatment means differ?
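The sums of squares for the four-store example can be reproduced with a short script. This is a minimal sketch in plain Python, not taken from the slides; the function name `one_way_anova` and the dictionary keys are assumptions chosen for illustration, and the critical value 5.09 is copied from the slides rather than computed.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples (hypothetical helper)."""
    k = len(groups)                              # number of treatments
    n = sum(len(g) for g in groups)              # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ss_trt = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))

    mst = ss_trt / (k - 1)
    mse = sse / (n - k)                          # pooled estimate of the common variance
    return {"ss_total": ss_total, "sse": sse, "ss_trt": ss_trt,
            "mst": mst, "mse": mse, "f": mst / mse}


# Satisfaction scores from the example.
stores = {
    "Eaton": [94, 90, 85, 80],
    "Tony": [75, 68, 77, 83, 88],
    "Aden": [70, 73, 76, 78, 80, 68, 65],
    "Oz": [68, 70, 72, 65, 74, 65],
}

result = one_way_anova(list(stores.values()))
critical_value = 5.09                            # F_{.01, 3, 18}, quoted in the slides
reject_h0 = result["f"] > critical_value
print({name: round(value, 2) for name, value in result.items()}, reject_h0)
```

Computing SSTrt directly and also as SS total minus SSE is a useful consistency check on hand calculations.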
One procedure: Use confidence intervals for the difference between two means to test $H_0: \mu_1 = \mu_2$.
Confidence interval for $\mu_1 - \mu_2$:
$(\bar X_1 - \bar X_2) \pm t \sqrt{\text{MSE}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$
where
1. t is $t_{\alpha/2, n-k}$ (obtained from the t table with $n - k$ degrees of freedom)
2. MSE = SSE/(n - k): an estimate of the population variance (mean squared error)
For comparison, the confidence interval for one population mean $\mu$:
$\bar X \pm z \dfrac{\sigma}{\sqrt{n}}$ or $\bar X \pm t_{\alpha/2, n-1} \dfrac{s}{\sqrt{n}}$

Confidence Interval for the Difference Between Two Means - Example
Develop a 95% confidence interval for the difference in the means between Eaton and Oz.
$t_{\alpha/2, n-k} = 2.101$
The 95% confidence interval is (10.46, 26.04).
Can we conclude that there is a difference between the two stores?

The 95% confidence interval for $\mu_1 - \mu_2$ is (10.46, 26.04). Both endpoints are positive; hence, we can conclude that these treatment means differ significantly. That is, customers at Eaton rated service significantly differently from those at Oz.

Two-Way Analysis of Variance (two-factor analysis of variance)
For the two-factor ANOVA we test whether there is a significant difference among the treatment effects and whether there is a difference among the blocking effects.
Let $B_r$ be the block totals (r for rows).
Treatment: prescription.
Blocking variable: a second treatment variable.
* ANOVA was first developed for agricultural applications.
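The Eaton versus Oz interval quoted above can be recomputed from the earlier quantities. This is a minimal sketch in plain Python; the function name `difference_interval` is an assumption made for illustration, and the t value 2.101 is copied from the slides rather than computed.

```python
from math import sqrt


def difference_interval(mean1, mean2, n1, n2, mse, t_crit):
    """Confidence interval for mu1 - mu2 using the pooled MSE (hypothetical helper)."""
    diff = mean1 - mean2
    margin = t_crit * sqrt(mse * (1 / n1 + 1 / n2))
    return diff - margin, diff + margin


mse = 594.41 / 18          # SSE / (n - k) from the four-store example
t_crit = 2.101             # t_{.025, 18}, quoted in the slides

low, high = difference_interval(87.25, 69.00, 4, 6, mse, t_crit)
print(f"95% CI for Eaton - Oz: ({low:.2f}, {high:.2f})")
```

Because the interval excludes 0, the two store means are judged to differ at the .05 level.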
Example: The value of residential homes

Home   Taylor   Watson
  1     235      228
  2     210      205
  3     231      219
  4     242      240
  5     205      198
  6     230      223
  7     231      227
  8     210      215
  9     225      222
 10     249      245

Blocking variable: home

Two-Way ANOVA Table
$\text{SST} = b \sum (\bar x_c - \bar x_G)^2$
$\text{SSB} = k \sum (\bar x_b - \bar x_G)^2$
where b is the number of blocks and k is the number of treatments.

Example: Travel times at 4 different times by 5 drivers
The Transportation Department would like to study whether the travel times of Bus S10 differ across the following 4 time periods:
(1) 10:00-12:00, (2) 12:00-14:00, (3) 14:00-16:00, (4) 16:00-18:00.
Since there are many different drivers, the test was set up so that each driver drove in each of the 4 time periods.

Example: Travel time (minutes)

          10:00-12:00   12:00-14:00   14:00-16:00   16:00-18:00
Andy          18            17            21            22
Ben           16            23            23            22
David         21            21            26            22
George        23            22            29            25
Jack          25            24            28            28

At the 0.05 significance level, is there a difference in the mean travel time across the 4 time periods?

Shall we use 1-way ANOVA or 2-way ANOVA? (Both can compare several means.)
Let's try 1-way first.
Critical value = 3.239; p-value = 0.098 (fail to reject $H_0$)

Step 1: State the null and alternate hypotheses.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: Not all treatment means are the same
Reject $H_0$ if $F > F_{\alpha, k-1, n-k}$.
Step 2: State the level of significance.
The .05 significance level is stated in the problem.
Step 3: Use the F statistic.
Step 4: State the decision rule.
Reject $H_0$ if $F > F_{\alpha, v_1, v_2} = F_{.05, k-1, n-k} = F_{.05,\, 4-1,\, 20-4} = F_{.05, 3, 16} = 3.24$.
Step 5: Compute the value of F and make a decision.
The computed F is 2.483. The critical value is 3.239. We do not reject $H_0$.

Note: 1-way ANOVA (SS total = SST + SSE) does not take the variation among drivers into account.
If we remove the effect of the drivers, is there a difference in the mean travel time at the .05 significance level?
* 2-way ANOVA removes the variation due to drivers from SSE.

Two-Way ANOVA Table

Two-Way Analysis of Variance (two-factor analysis of variance)
SST: sum of squares for the treatments
SSB: sum of squares for the blocks
$\text{SST} = b \sum (\bar x_c - \bar x_G)^2$
$\text{SSB} = k \sum (\bar x_b - \bar x_G)^2$
Sum of squared errors: SSE = SS total - SST - SSB

Step 1: State the null and alternate hypotheses.
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: Not all treatment means are the same
Reject $H_0$ if $F > F_{\alpha, k-1, (k-1)(b-1)}$.
Step 2: State the level of significance.
The .05 significance level is stated in the problem.
Step 3: Use the F statistic.
Step 4: State the decision rule.
Reject $H_0$ if $F > F_{\alpha, v_1, v_2} = F_{.05, k-1, (k-1)(b-1)} = F_{.05,\, 4-1,\, (3)(4)} = F_{.05, 3, 12} = 3.49$.

          10:00-12:00   12:00-14:00   14:00-16:00   16:00-18:00   Driver mean
Andy          18            17            21            22           19.5
Ben           16            23            23            22           21
David         21            21            26            22           22.5
George        23            22            29            25           24.75
Jack          25            24            28            28           26.25

$\text{SSTrt} = \sum_{c=1}^{4} \sum_{j=1}^{n_c} (\bar X_c - \bar X_G)^2 = b \sum_{c=1}^{4} (\bar X_c - \bar X_G)^2$
$= b(\bar X_1 - \bar X_G)^2 + b(\bar X_2 - \bar X_G)^2 + b(\bar X_3 - \bar X_G)^2 + b(\bar X_4 - \bar X_G)^2$
$= 5 \times [(20.6 - 22.8)^2 + (21.4 - 22.8)^2 + (25.4 - 22.8)^2 + (23.8 - 22.8)^2] = 72.8$

The computed F is 24.27/3.06 = 7.93. The critical value is 3.49.
Decision: We reject $H_0$ at the 0.05 significance level.
Conclusion: The mean travel times for the four time periods are not all the same.

Compare 1-way and 2-way ANOVA tables
$F_{.05, 3, 16} = 3.24$, $F_{.05, 3, 12} = 3.49$, $F_{.05, 4, 12} = 3.26$

Example (continued)
With 1-way ANOVA, we fail to reject $H_0$: there is not enough evidence that the mean travel times for the time periods differ.
With 2-way ANOVA, we conclude:
(1) The mean time is not the same for all drivers.
(2) The mean times for the time periods are not all the same.

Back to the example of the values of residential homes
Treat it as two independent samples: this does not consider the variation among homes.
Treat it as two dependent samples: this subtracts the variation among homes from SSE.
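The contrast between the 1-way and 2-way analyses of the bus travel times can be made concrete by computing both error terms from the same data. This is a minimal sketch in plain Python, not the software output shown on the slides; all variable names are assumptions chosen for illustration.

```python
# Travel times (minutes): one row per driver (block), one column per time period (treatment).
travel = {
    "Andy":   [18, 17, 21, 22],
    "Ben":    [16, 23, 23, 22],
    "David":  [21, 21, 26, 22],
    "George": [23, 22, 29, 25],
    "Jack":   [25, 24, 28, 28],
}
rows = list(travel.values())
b = len(rows)            # number of blocks (drivers)
k = len(rows[0])         # number of treatments (time periods)
n = b * k

grand_mean = sum(sum(r) for r in rows) / n
period_means = [sum(r[c] for r in rows) / b for c in range(k)]
driver_means = [sum(r) / k for r in rows]

ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
ss_trt = b * sum((m - grand_mean) ** 2 for m in period_means)
ss_blocks = k * sum((m - grand_mean) ** 2 for m in driver_means)

# 1-way ANOVA leaves the driver variation inside the error term.
sse_one_way = ss_total - ss_trt
f_one_way = (ss_trt / (k - 1)) / (sse_one_way / (n - k))

# 2-way ANOVA removes the driver variation from the error term.
sse_two_way = ss_total - ss_trt - ss_blocks
mse_two_way = sse_two_way / ((k - 1) * (b - 1))
f_treatment = (ss_trt / (k - 1)) / mse_two_way
f_blocks = (ss_blocks / (b - 1)) / mse_two_way

print(f"1-way F = {f_one_way:.3f}; 2-way F = {f_treatment:.2f}; block F = {f_blocks:.2f}")
```

The treatment sum of squares is identical in both analyses; only the denominator shrinks once the driver effect is separated out, which is why the 2-way F exceeds its critical value while the 1-way F does not.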
Supplementary materials on two-sided tests and confidence intervals

Relation between a $(1-\alpha)100\%$ confidence interval and a 2-tailed test
$H_0: \mu = \mu_0$ at significance level $\alpha$
Do not reject $H_0$ if $|Z| \le Z_{\alpha/2}$.
Note:
$\left\{ \bar X : \dfrac{|\bar X - \mu_0|}{\sigma/\sqrt{n}} \le Z_{\alpha/2} \right\} = \left\{ \bar X : \bar X - Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}} \le \mu_0 \le \bar X + Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}} \right\}$
What does this mean?
Answer: We would not reject $H_0$ at level $\alpha$ if $\mu_0$ lies in the $(1-\alpha)100\%$ confidence interval.

Relation between a $(1-\alpha)100\%$ confidence interval and a 2-tailed test
$H_0: \mu_1 = \mu_2$ at significance level $\alpha$
Do not reject $H_0$ if $|Z| \le Z_{\alpha/2}$.
Note:
$\left\{ \bar X_1 - \bar X_2 : \dfrac{|\bar X_1 - \bar X_2|}{\sqrt{\text{MSE}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \le Z_{\alpha/2} \right\}$
$= \left\{ \bar X_1 - \bar X_2 : \bar X_1 - \bar X_2 - Z_{\alpha/2} \sqrt{\text{MSE}\left(\tfrac{1}{n_1} + \tfrac{1}{n_2}\right)} \le 0 \le \bar X_1 - \bar X_2 + Z_{\alpha/2} \sqrt{\text{MSE}\left(\tfrac{1}{n_1} + \tfrac{1}{n_2}\right)} \right\}$
What does this mean?
We would not reject $H_0$ at level $\alpha$ if 0 lies in the $(1-\alpha)100\%$ confidence interval.
In the case above, 0 is not in (10.46, 26.04), so we reject $H_0$ at level $\alpha$.
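The equivalence between the two-tailed test and the confidence interval can be illustrated with the Eaton versus Oz comparison. This is a minimal sketch in plain Python; the function name `two_sided_decision` is an assumption made for illustration, and it uses the critical value 2.101 quoted earlier in the slides (a t value) in place of $Z_{\alpha/2}$.

```python
from math import sqrt


def two_sided_decision(diff, std_error, crit):
    """Return (reject H0, whether 0 lies inside the interval) for one critical value."""
    reject = abs(diff / std_error) > crit
    low, high = diff - crit * std_error, diff + crit * std_error
    zero_inside = low <= 0 <= high
    return reject, zero_inside


mse = 594.41 / 18                          # from the four-store example
std_error = sqrt(mse * (1 / 4 + 1 / 6))    # Eaton has 4 observations, Oz has 6
reject, zero_inside = two_sided_decision(87.25 - 69.00, std_error, 2.101)
print(f"reject H0: {reject}, 0 inside interval: {zero_inside}")

# The two outcomes are always opposite, whatever the observed difference is.
for diff in (-20.0, -5.0, 0.0, 3.0, 18.25):
    r, inside = two_sided_decision(diff, std_error, 2.101)
    assert r == (not inside)
```

The loop only varies the observed difference to show that rejecting $H_0$ and having 0 outside the interval are the same event; the differences other than 18.25 are illustrative inputs, not data from the slides.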