ST 511 Analysis of Variance

Introduction
Analysis of variance compares two or more populations of quantitative data. Specifically, we are interested in determining whether differences exist between the population means. The procedure works by analyzing the sample variances.

§1 One Way Analysis of Variance
The analysis of variance (ANOVA) is a procedure that tests whether differences exist between two or more population means. To do this, the technique analyzes the sample variances.

One Way Analysis of Variance: Example
A magazine publisher wants to compare three different styles of covers for a magazine that will be offered for sale at supermarket checkout lines. She assigns 60 stores at random to the three styles of covers and records the number of magazines that are sold in a one-week period.

One Way Analysis of Variance: Example
How do five bookstores in the same city differ in the demographics of their customers? A market researcher asks 50 customers of each store to respond to a questionnaire. One variable of interest is the customer's age.

Idea Behind ANOVA – two types of variability
1. Within-group variability
2. Between-group variability

[Figure: two dot plots of three treatment groups with identical sample means (x̄1 = 10, x̄2 = 15, x̄3 = 20). In the first plot, the small variability within the samples makes it easier to draw a conclusion about the population means. In the second plot, the sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.]

Idea behind ANOVA: recall the two-sample t-statistic
For the difference between two means, with pooled variances and sample sizes both equal to n:

  t = (x̄ − ȳ) / ( s_p √(1/n + 1/n) ),   so   t² = n (x̄ − ȳ)² / (2 s_p²)

Numerator of t²: measures the variation between the groups in terms of the difference between their sample means.
Denominator: measures the variation within groups by the pooled estimator of the common variance.
If the within-group variation is small, the same variation between groups produces a larger statistic and a more significant result.

One Way Analysis of Variance: Example
Example 1
– An apple juice manufacturer is planning to develop a new product: a liquid concentrate.
– The marketing manager has to decide how to market the new product.
– Three strategies are considered:
  • Emphasize the convenience of using the product.
  • Emphasize the quality of the product.
  • Emphasize the product's low price.

Example 1 – continued
An experiment was conducted as follows:
– In three cities an advertising campaign was launched.
– In each city only one of the three characteristics (convenience, quality, and price) was emphasized.
– The weekly sales were recorded for twenty weeks following the beginning of the campaigns.

Weekly sales (20 weeks per city):
  Convenience: 529 658 793 514 663 719 711 606 461 529 498 663 604 495 485 557 353 557 542 614
  Quality:     804 630 774 717 679 604 620 697 706 615 492 719 787 699 572 523 584 634 580 624
  Price:       672 531 443 596 602 502 659 689 675 512 691 733 698 776 561 572 469 581 679 532

Solution
– The data are quantitative.
– The problem objective is to compare sales in the three cities.
– We hypothesize that the three population means are equal.

Defining the Hypotheses
  H0: μ1 = μ2 = μ3
  H1: At least two means differ

Notation
To build the statistic needed to test the hypotheses, use the following notation. Independent samples are drawn from k populations (treatment groups):

  Sample 1: x11, x21, ..., x(n1)1   sample size n1, sample mean x̄1
  Sample 2: x12, x22, ..., x(n2)2   sample size n2, sample mean x̄2
  ...
  Sample k: x1k, x2k, ..., x(nk)k   sample size nk, sample mean x̄k

(Here xij denotes the i-th observation in sample j: x11 is the first observation in the first sample, x22 the second observation in the second sample, and so on.)

X is the "response variable". The variable's values are called "responses".
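The test developed in the remainder of these notes can be previewed in one call on the raw data above. A minimal sketch in Python, assuming SciPy is available (`scipy.stats.f_oneway` performs exactly this one-way ANOVA):

```python
# One-way ANOVA on the apple juice weekly sales data (20 weeks per strategy).
from scipy import stats

convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
           492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
         691, 733, 698, 776, 561, 572, 469, 581, 679, 532]

# f_oneway returns the F statistic and the p-value P(F > observed).
F, p = stats.f_oneway(convenience, quality, price)
print(f"F = {F:.2f}, p = {p:.4f}")  # F = 3.23, p ≈ 0.047
```

The same F statistic and p-value are derived by hand later in these notes.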
Terminology
In the context of this problem:
– Response variable: weekly sales.
– Responses: the actual sales values.
– Experimental unit: the weeks in the three cities during which we record sales figures.
– Factor: the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
– Factor levels: the population (treatment) names. In this problem the factor levels are the three marketing strategies: 1) convenience, 2) quality, 3) price.

The rationale of the test statistic
Two types of variability are employed when testing for the equality of the population means:
1. Within-sample variability
2. Between-sample variability

  H0: μ1 = μ2 = μ3
  H1: At least two means differ

The rationale behind the test statistic – I
If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, close to the grand mean). If the alternative hypothesis is true, at least some of the sample means would differ. Thus, we measure the variability between sample means.

Variability between sample means
The variability between the sample means is measured as the sum of squared distances between each mean and the grand mean. This sum is called the Sum of Squares for Groups (SSG). In our example, the treatments are represented by the different advertising strategies.

Sum of squares for treatment groups (SSG)

  SSG = Σ_{j=1}^{k} n_j (x̄_j − x̄)²

where k is the number of treatments, n_j is the size of sample j, and x̄_j is the mean of sample j.

Note: When the sample means are close to one another, their distances from the grand mean are small, leading to a small SSG. Thus, a large SSG indicates large variation between sample means, which supports H1.

Sum of squares for treatment groups (SSG)
Solution – continued
Calculate SSG:
  x̄1 = 577.55, x̄2 = 653.00, x̄3 = 608.65

The grand mean is
  x̄ = (n1 x̄1 + n2 x̄2 + ... + nk x̄k) / (n1 + n2 + ... + nk) = 613.07

so
  SSG = 20(577.55 − 613.07)² + 20(653.00 − 613.07)² + 20(608.65 − 613.07)² = 57,512.23

Is SSG = 57,512.23 large enough to reject H0 in favor of H1?

The rationale behind the test statistic – II
Large variability within the samples weakens the "ability" of the sample means to represent their corresponding population means. Therefore, even though the sample means may markedly differ from one another, SSG must be judged relative to the within-sample variability.

Within-sample variability
The variability within samples is measured by adding all the squared distances between the observations and their sample means. This sum is called the Sum of Squares for Error (SSE). In our example, this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).

Sum of squares for error (SSE)
Solution – continued
Calculate SSE:
  s1² = 10,775.00, s2² = 7,238.11, s3² = 8,670.24

  SSE = Σ_{j=1}^{k} Σ_{i=1}^{n_j} (x_ij − x̄_j)² = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3²
      = (20 − 1)(10,775.00) + (20 − 1)(7,238.11) + (20 − 1)(8,670.24) = 506,983.50

Is SSG = 57,512.23 large enough relative to SSE = 506,983.50 to reject the null hypothesis that all the means are equal?

The mean sums of squares
To perform the test we need to calculate the mean squares as follows.

Calculation of MSG – Mean Square for treatment Groups:
  MSG = SSG / (k − 1) = 57,512.23 / (3 − 1) = 28,756.12

Calculation of MSE – Mean Square for Error:
  MSE = SSE / (n − k) = 506,983.50 / (60 − 3) = 8,894.45

Calculation of the test statistic:
  F = MSG / MSE = 28,756.12 / 8,894.45 = 3.23
with the following degrees of freedom: v1 = k − 1 and v2 = n − k.

Required conditions:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.
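The hand computations above can be checked from the per-group summary statistics alone. A sketch in Python (SciPy assumed available; using the rounded sample variances shifts SSE by a fraction of a point relative to the 506,983.50 in the text):

```python
# Rebuild SSG, SSE, MSG, MSE, and F from the per-group summary statistics.
from scipy import stats

n = [20, 20, 20]                   # group sizes
xbar = [577.55, 653.00, 608.65]    # group sample means
s2 = [10775.00, 7238.11, 8670.24]  # group sample variances
k, N = len(n), sum(n)

grand = sum(ni * xi for ni, xi in zip(n, xbar)) / N           # grand mean, 613.07
SSG = sum(ni * (xi - grand) ** 2 for ni, xi in zip(n, xbar))  # 57,512.23
SSE = sum((ni - 1) * si for ni, si in zip(n, s2))             # ~506,983.5
MSG = SSG / (k - 1)                # 28,756.12
MSE = SSE / (N - k)                # 8,894.45
F = MSG / MSE                      # 3.23
p = stats.f.sf(F, k - 1, N - k)    # P(F > observed), like Excel's F.DIST.RT
print(round(grand, 2), round(SSG, 2), round(F, 2), round(p, 4))
```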
The F test rejection region
The hypothesis test:
  H0: μ1 = μ2 = ... = μk
  H1: At least two means differ
Test statistic: F = MSG / MSE
R.R.: F > F_{α, k−1, n−k}

The F test
  H0: μ1 = μ2 = μ3
  H1: At least two means differ
Test statistic: F = MSG / MSE = 28,756.12 / 8,894.45 = 3.23
R.R.: F > F_{α, k−1, n−k} = F_{0.05, 3−1, 60−3} = 3.16
Since 3.23 > 3.16, there is sufficient evidence to reject H0 in favor of H1 and to conclude that at least one of the mean sales differs from the others.

The F test: p-value
Use Excel to find the p-value: fx > Statistical > F.DIST.RT(3.23, 2, 57) = .0469
  p-value = P(F > 3.23) = .0469

[Figure: the F(2, 57) density with the area to the right of 3.23 shaded.]

Excel single factor ANOVA

  Anova: Single Factor

  SUMMARY
  Groups       Count  Sum    Average  Variance
  Convenience  20     11551  577.55   10775.00
  Quality      20     13060  653.00    7238.11
  Price        20     12173  608.65    8670.24

  ANOVA
  Source of Variation  SS      df  MS     F     P-value  F crit
  Between Groups       57512    2  28756  3.23  0.0468   3.16
  Within Groups        506984  57   8894
  Total                564496  59

Note that SS(Total) = SSG + SSE.

Multiple Comparisons
When the null hypothesis is rejected, it may be desirable to find which mean(s) is (are) different, and at what ranking order. Two statistical inference procedures geared at doing this are presented:
– "regular" confidence interval calculations
– the Bonferroni adjustment

Multiple Comparisons
Two means are considered different if the confidence interval for the difference between the corresponding sample means does not contain 0. In this case the larger sample mean is believed to be associated with a larger population mean. How do we calculate the confidence intervals?

"Regular" Method
This method builds on the equal-variances confidence interval for the difference between two means. The CI is improved by using MSE rather than s_p² (we use ALL the data to estimate the common variance instead of only the data from two samples):

  (x̄_i − x̄_j) ± t_{α/2, n−k} · s · √(1/n_i + 1/n_j),   d.f. = n − k,   s = √MSE

Experiment-wise Type I error rate (the effective Type I error)
The preceding "regular" method may result in an increased probability of committing a Type I error. The experiment-wise Type I error rate is the probability of committing at least one Type I error at significance level α. It is calculated by

  experiment-wise Type I error rate = 1 − (1 − α)^g

where g is the number of pairwise comparisons (i.e., g = C(k, 2) = k(k − 1)/2). For example, if α = .05 and k = 4, then g = 6 and the experiment-wise Type I error rate = 1 − (.95)^6 = 1 − .735 = .265. The Bonferroni adjustment determines the required Type I error probability per pairwise comparison (α*) to secure a pre-determined overall α.

Bonferroni Adjustment
The procedure:
– Compute the number of pairwise comparisons g = k(k − 1)/2, where k is the number of populations.
– Set α* = α/g, where α is the true probability of making at least one Type I error (called the experiment-wise Type I error).
– Calculate the following CI for μ_i − μ_j:

  (x̄_i − x̄_j) ± t_{α*/2, n−k} · s · √(1/n_i + 1/n_j),   d.f. = n − k,   s = √MSE

Bonferroni Method
Example – continued
– Rank the effectiveness of the marketing strategies (based on mean weekly sales).
– Use the Bonferroni adjustment method.
Solution
– The sample mean sales were 577.55, 653.00, 608.65.
– We calculate g = k(k − 1)/2 = 3(2)/2 = 3.
– We set α* = .05/3 = .0167, thus t_{.0167/2, 60−3} = 2.467 (Excel).
– Note that s = √8,894.447 = 94.31.

  x̄1 − x̄2 = 577.55 − 653.00 = −75.45
  x̄1 − x̄3 = 577.55 − 608.65 = −31.10
  x̄2 − x̄3 = 653.00 − 608.65 = 44.35

  t_{α*/2} · s · √(1/n_i + 1/n_j) = 2.467 × 94.31 × √(1/20 + 1/20) = 73.57

Bonferroni Method: The Three Confidence Intervals

  μ1 − μ2: −75.45 ± 73.57 = (−149.02, −1.88)
  μ1 − μ3: −31.10 ± 73.57 = (−104.67, 42.47)
  μ2 − μ3:  44.35 ± 73.57 = (−29.22, 117.92)

There is a significant difference between μ1 and μ2.

Bonferroni Method: Conclusions Resulting from Confidence Intervals
Do we have evidence to distinguish two means?
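The Bonferroni arithmetic above is easy to script. A sketch in Python (SciPy assumed available; `stats.t.ppf` plays the role of Excel's inverse-t function):

```python
# Bonferroni-adjusted pairwise confidence intervals for the three strategies.
import math
from itertools import combinations
from scipy import stats

alpha = 0.05
k, N = 3, 60
n = {1: 20, 2: 20, 3: 20}
xbar = {1: 577.55, 2: 653.00, 3: 608.65}
s = math.sqrt(8894.447)                      # s = sqrt(MSE) ~ 94.31

# The k = 4 illustration from the text: unadjusted experiment-wise rate.
ew_k4 = 1 - (1 - alpha) ** 6                 # 1 - 0.95**6 ~ 0.265

g = k * (k - 1) // 2                         # 3 pairwise comparisons
a_star = alpha / g                           # ~0.0167 per comparison
t_crit = stats.t.ppf(1 - a_star / 2, N - k)  # ~2.467

for i, j in combinations(sorted(xbar), 2):
    margin = t_crit * s * math.sqrt(1 / n[i] + 1 / n[j])  # ~73.57
    d = xbar[i] - xbar[j]
    sig = "differ" if abs(d) > margin else "not distinguishable"
    print(f"mu{i} - mu{j}: ({d - margin:8.2f}, {d + margin:8.2f})  {sig}")
```

Only the convenience-vs-quality interval excludes 0, matching the conclusion above.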
Group 1, Convenience: sample mean 577.55
Group 2, Quality: sample mean 653.00
Group 3, Price: sample mean 608.65

  μ1 − μ2: −75.45 ± 73.57 = (−149.02, −1.88)
  μ1 − μ3: −31.10 ± 73.57 = (−104.67, 42.47)
  μ2 − μ3:  44.35 ± 73.57 = (−29.22, 117.92)

List the group numbers in increasing order of their sample means; connecting lines mean no significant difference:

  1   3   2
  └───┘
      └───┘

Groups 1 and 3 cannot be distinguished, and neither can groups 3 and 2; only groups 1 and 2 (convenience vs. quality) differ significantly.
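These notes opened by motivating ANOVA through the two-sample pooled t statistic; for k = 2 groups the two procedures agree exactly, with F = t². A quick numerical check on made-up data (NumPy and SciPy assumed available):

```python
# For k = 2 groups, one-way ANOVA and the pooled two-sample t test coincide.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.normal(10.0, 3.0, size=15)   # made-up sample 1
y = rng.normal(12.0, 3.0, size=15)   # made-up sample 2
n = 15

# Pooled-variance two-sample t statistic (both samples of size n).
sp2 = ((n - 1) * x.var(ddof=1) + (n - 1) * y.var(ddof=1)) / (2 * n - 2)
t = (x.mean() - y.mean()) / np.sqrt(sp2 * (1 / n + 1 / n))

# One-way ANOVA on the same two groups.
F, p_anova = stats.f_oneway(x, y)

print(round(t ** 2, 6), round(F, 6))  # identical up to rounding: F = t^2
```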