  2 .

advertisement
252y0731 4/25/07
(Page layout view!)
ECO252 QBA2
THIRD HOUR EXAM
April 16, 2007
Name KEY
Student number_____________
I. (8 points) Do all the following (2points each unless noted otherwise). Make Diagrams! Show your
work!
x ~ N 13, 7.2
Material in italics below is a description of the diagrams you were asked to make or a general explanation
and will not be part of your written solution. The x and z diagrams should look similar. If you know what
you are doing, you only need one diagram for each problem. General comment - I can't give you much
credit for an answer with a negative probability or a probability above one, because there is no such
thing!!! In all these problems we must find values of z = (x - μ)/σ corresponding to our values of x before
we do anything else. A diagram for z will help us because, if areas on both sides of zero are shaded, we
will add probabilities, while, if an area on one side of zero is shaded and it does not begin at zero, we will
subtract probabilities. Note: All the graphs shown here are missing a vertical line. They are also to
scale. A hand drawn graph should exaggerate the distances of the points from the mean. Note also
that, because of the rounding error necessary to use a conventional normal table, the results for x will
disagree with results for z, especially for small values of z.
1. P(x < 0) = P(z < (0 - 13)/7.2) = P(z < -1.81) = P(z < 0) - P(-1.81 < z < 0) = .5 - .4649 = .0351
This is how we usually find a p-value for a left-sided test when the distribution of the test ratio is Normal.
For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!
Shade the entire area below -1.81. Because this is entirely on the left side of zero, we must subtract the area
between -1.81 and zero from the area below zero. If you wish, make a completely separate diagram for x .
Draw a Normal curve with a mean at 13. Indicate the mean by a vertical line! Shade the area below 0.
This area is completely to the left of the mean (13), so we subtract the smaller area between zero and the
mean from the larger area below the mean.
2. P(6 < x < 20) = P((6 - 13)/7.2 < z < (20 - 13)/7.2) = P(-0.97 < z < 0.97) = 2P(0 < z < 0.97) = 2(.3340) = .6680
For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!
Shade the entire area between -0.97 and 0.97. Because this is entirely on both sides of zero, we must add the
area between -0.97 and zero to the area from zero to 0.97. If you wish, make a completely separate diagram
for x . Draw a Normal curve with a mean at 13. Indicate the mean by a vertical line! Shade the area from
6 to 20. This area is on both sides of the mean (13), so we add the area between 6 and the mean to the area
between the mean and 20.
3. P(-1 < x < 12) = P((-1 - 13)/7.2 < z < (12 - 13)/7.2) = P(-1.94 < z < -0.14) = P(-1.94 < z < 0) - P(-0.14 < z < 0) = .4738 - .0557 = .4181
For z make a diagram. Draw a Normal curve with a mean at 0. Indicate the mean by a vertical line!
Shade the area between -1.94 and -0.14. Because this is entirely on the left side of zero, we must subtract
the area between -0.14 and zero from the larger area between -1.94 and zero. If you wish, make a
completely separate diagram for x . Draw a Normal curve with a mean at 13. Indicate the mean by a
vertical line! Shade the area between -1 and 12. This area is completely to the left of the mean (13), so we
subtract the smaller area between 12 and the mean from the larger area between -1 and the mean.
4. x.145 (Find z .145 first) Solution: x.145 is, by definition, the value of x with a probability of 14.5% above
it. Make a diagram. The diagram for z will show an area with a probability of 85.5% below z .145 . It is
split by a vertical line at zero into two areas. The lower one has a probability of 50% and the upper one a
probability of 85.5% - 50% = 35.5%. The upper tail of the distribution above z .145 has a probability of
14.5%, so that the entire area above 0 adds to 35.5% + 14.5% = 50%. From the diagram, we want one
point z.145 so that P(z < z.145) = .8550 or P(0 < z < z.145) = .3550. If we try to find this point on the
Normal table, the closest we can come is P(0 < z < 1.06) = .3554. So we will use z.145 = 1.06, though
1.05 would be acceptable.
Since x ~ N(13, 7.2), the diagram for x would show 85.5% probability split in two regions on
either side of 13 with probabilities of 50% below 13 and 35.5% above 13 and below x.145, and with 14.5%
above x.145. z.145 = 1.06, so the value of x can then be written x.145 = μ + σz.145 = 13 + 1.06(7.2) = 13 + 7.63 = 20.63.
To check this: P(x < 20.63) = P(z < (20.63 - 13)/7.2) = P(z < 1.06) = P(z < 0) + P(0 < z < 1.06) = .5 + .3554
= .8554 ≈ 85.5%. Or P(x > 20.63) = P(z > (20.63 - 13)/7.2) = P(z > 1.06) = P(z > 0) - P(0 < z < 1.06)
= .5 - .3554 = .1446 ≈ 14.5%
II. (22+ points) Do all the following (2 points each unless noted otherwise). Do not answer a question ‘yes’
or ‘no’ without giving reasons. Show your work when appropriate. Use a 5% significance level except
where indicated otherwise.
1. Turn in your computer problems 2 and 3 marked as requested in the Take-home. (5 points, 2 point
penalty for not doing.)
2. (Dummeldinger) As part of a study to investigate the effect of helmet design on football injuries,
head width measurements were taken for 30 subjects randomly selected from each of 3 groups (High
school football players, college football players and college students who do not play football – so that
there are a total of 90 observations) with the object of comparing the typical head widths of the three
groups. Before the data was compared, researchers ran two tests on it. They first ran a Lilliefors test
and found that the null hypothesis was not rejected. They then ran a Bartlett test and found that the null
hypothesis was not rejected. On the basis of these results they should use the following method.
a) The Kruskal-Wallis test.
b) *One-way ANOVA
c) The Friedman test
d) Two-Way ANOVA
e) Chi-square test
[7]
Explanation: The null hypothesis of the Lilliefors test is that the distribution is Normal. The null
hypothesis of the Bartlett test is equal variances. These two results are the requirements of a one-way
ANOVA to compare the means of the three groups.
3. The coefficient of determination
a) Is the square of the sample correlation
b) Cannot be negative
c) Gives the fraction of the variation in the dependent variable explained by the
independent variable.
d) *All of the above.
e) None of the above.
Explanation: The coefficient of determination R² = SSR/SST = S²xy/(SSx·SSy) is the ratio of two sums of
squares, which must be positive quantities. The correlation is the positive or negative square root of
S²xy/(SSx·SSy).
4. The Kruskal-Wallis test is an extension of which of the following for two independent samples?
a) Pooled-variance t test
b) Paired-sample t test
c) * (Mann – Whitney -) Wilcoxon rank sum test
d) Wilcoxon signed ranks test
e) A test for comparison of means for samples with unequal variances
f) None of the above.
Explanation: The Wilcoxon rank sum test ranks all the numbers in the samples, just like the Kruskal-Wallis test. Both are tests for comparison of medians.
5. One-way ANOVA is an extension of which of the following for two independent samples?
a) *Pooled-variance t test
b) Paired-sample t test
c) (Mann – Whitney -) Wilcoxon rank sum test
d) Wilcoxon signed ranks test
e) A test for comparison of means for samples with unequal variances
f) None of the above.
[13]
Explanation: The variances are pooled in the pooled-variance t test because equal variances is an
assumption of ANOVA. Both are tests for comparison of means.
Exhibit 1: As part of an evaluation program, a sporting goods retailer wanted to compare the downhill
coasting speeds of 4 brands of bicycles. She took 3 of each brand and determined their maximum
downhill speeds. The results are presented in miles per hour in the table below. She assumes that the
Normal distribution is not applicable and uses a rank test. She treats the columns as independent
random samples.
Barth  Tornado  Reiser  Shaw
  43       37      41     43
  46       38      45     45
  43       39      42     46
6. What are the null and alternative hypotheses?
Solution: H0: All medians equal; H1: Not all medians equal.
7. Construct a table of ranks for the data in Exhibit 1.
        Barth        Tornado      Reiser       Shaw
        x1    r1     x2    r2     x3    r3     x4    r4
        43   7.0     37   1.0     41   4.0     43    7.0
        46  11.5     38   2.0     45   9.5     45    9.5
        43   7.0     39   3.0     42   5.0     46   11.5
Sum         25.5          6.0         18.5          28.0
8. Complete the test using a 5% significance level (3)
[20]
Now compute the Kruskal-Wallis statistic H = [12/(n(n+1))] Σ(SRi²/ni) - 3(n+1)
= [12/(12·13)] [25.5²/3 + 6.0²/3 + 18.5²/3 + 28.0²/3] - 3(13) = (12/156)(604.1667) - 39 = 7.4744.
The design is too large for the Kruskal-Wallis table, so we compare this to χ².05(3) = 7.8147. Since the computed statistic is below the table value, do not reject H0.
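If a computer is available, the same test can be run directly. This is only a hedged cross-check, assuming Python with scipy; scipy's kruskal applies a tie correction, so its H is a little larger than the 7.4744 computed by hand, with the same conclusion.

# Cross-check of the Exhibit 1 Kruskal-Wallis test (an assumption; the exam wants the hand formula).
from scipy.stats import kruskal

barth   = [43, 46, 43]
tornado = [37, 38, 39]
reiser  = [41, 45, 42]
shaw    = [43, 45, 46]

H, p = kruskal(barth, tornado, reiser, shaw)
print(H, p)   # roughly 7.6 after the tie correction; p just above .05, so do not reject H0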
Exhibit 2: (Webster) A placement director is trying to find out if a student’s GPA can influence how
many job offers the student receives. Data is assembled and run on Minitab, with the results below.
The regression equation is
Jobs = - 0.248 + 1.27 GPA
Predictor      Coef  SE Coef      T      P
Constant    -0.2481   0.7591  -0.33  0.752
GPA          1.2716   0.2859   4.45  0.002

S = 1.03225   R-Sq = 71.2%   R-Sq(adj) = 67.6%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  21.076  21.076  19.78  0.002
Residual Error   8   8.524   1.066
Total            9  29.600
9. In Exhibit 2, how many job offers would you predict for someone with a GPA of 3.0?
Solution: Jobs = - 0.248 + 1.27(3)= 3.562
10. In Exhibit 2, what percent of the variation in job offers is explained by variation of GPA? (1)
Solution: 71.2%
11. For the regression in Exhibit 2 to give us ‘good’ estimates of the coefficients.
a) The variances of ‘Jobs’ and ‘GPA’ must be equal.
b)* The amount of variation of ‘Jobs’ around the regression line must be independent of ‘GPA.’
c) ‘Jobs’ must be independent of ‘GPA.’
d) The p-value for ‘Constant’ must be higher than the p-value for ‘GPA.’
e) All of the above must be true.
f) None of the above need to be true.
[25]
12. Assuming that the Total of 29.600 in the ANOVA in Exhibit 2 is correct, but that you cannot read any
other part of the printout, what are the largest and smallest values that the Regression Sum of Squares could
take?
[27]
Solution: 0 to 29.600.
13. A corporation wishes to test the effectiveness of 3 different designs of pollution-control devices. It takes
its 6 factories and puts each type of device in each factory for one week during each month of the year, for a total of 216
measurements. Pollution in the smoke is measured and an ANOVA is begun. Let factor A be months. Factor
B will be factories and Factor C will be the devices. Fill in the following table. Note that we have too little
data to compute interaction ABC. The Total sum of squares is 106. SSA is 44, SSB is 11, SSC is 21, SSAB
is 9, SSAC is 7 and SSBC is 3. Use a significance level of 1%. Do we have enough information to decide
which of the devices is best? (Do not answer this without showing your work!) (6) [33]
Row  Factor   SS   DF   MS   F   F.01   Significant?
 1   A
 2   B
 3   C
 4   AB
 5   AC
 6   BC
 7   Within
 8   Total
Solution:
Row  Factor     SS   DF       MS        F    F.05  sig(5%)   F.01  sig(1%)
 1   A          44   11   4.0000   40.000    1.88     s      2.41     s
 2   B          11    5   2.2000   22.000    2.30     s      3.19     ns
 3   C          21    2  10.5000  105.000    3.08     s      4.80     ns
 4   AB          9   55   0.1636    1.636    1.47     s      1.76     ns
 5   AC          7   22   0.3182    3.182    1.64     s      2.00     s
 6   BC          3   10   0.3000    3.000    1.96     s      2.47     s
 7   Within     11  110   0.1000
 8   Total     106  215
The SS Column must add so the missing number is 11. As is implied by the diagram and your experience
with 2-way ANOVA, there is no ABC interaction. 12 months means 11 A df, 6 plants means 5 B df, 3
devices means 2 C DF. AB df are 11(5) = 55. AC df are 11(2) = 22, BC df is 5(2) = 10. Total df is
12(6)3 – 1 = 216 – 1 = 215. Within df can be gotten by making sure that the DF column adds or by noting
11(5)2 = 110. MS is SS divided by DF. F is MS divided by MSW = 0.1000. Though all are significant at
the 5% level, at the 1% level we cannot say that the Factor C means are different since the F is not
significant at 1%.
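The arithmetic behind the MS and F columns is mechanical. A minimal sketch (assuming Python; not part of the required hand work) that reproduces it from the sums of squares and degrees of freedom given in the problem:

# MS = SS/DF and F = MS/MSW, with MSW = 11/110 = 0.10, as in the table above.
sources = {          # factor: (SS, DF), as given in the problem
    "A": (44, 11), "B": (11, 5), "C": (21, 2),
    "AB": (9, 55), "AC": (7, 22), "BC": (3, 10),
}
ss_within, df_within = 11, 110
msw = ss_within / df_within

for name, (ss, df) in sources.items():
    ms = ss / df
    print(f"{name}: MS = {ms:.4f}, F = {ms / msw:.3f}")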
Exhibit 3: A plant manager is concerned for the rising blood pressure of employees and believes that it is
related to the level of satisfaction that they get out of their jobs. A random sample of 10 employees rates
their satisfaction on a 1 to 60 scale and their blood pressure is tested.
Row  Satisfaction  BloodPr
 1        34         124
 2        23         128
 3        19         157
 4        43         133
 5        56         116
 6        47         125
 7        32         147
 8        16         167
 9        55         110
10        25         156
We also know that Sum of Satisfaction = 350, Sum of BloodPr = 1363, Sum of squares (uncorrected) of
Satisfaction = 14170 and Sum of squares (uncorrected) of BloodPr = 189113. It is also true that the sum of
the first eight numbers of the product of Satisfaction and Blood is 35609.
14. a) Finish the computation of Σxy (1).
The sum of the first eight numbers of the product of Satisfaction and Blood is 35609.
So 35609 + 110(55) + 156(25) = 35609 + 6050 + 3900 = 45559
b) Try to do a regression explaining blood pressure as a result of job satisfaction (4). Please do not
recompute things that I have already done for you.
We now have n = 10, Σx = 350, Σy = 1363, Σxy = 45559, Σx² = 14170 and Σy² = 189113. We can begin by
computing x̄ = 350/10 = 35 and ȳ = 1363/10 = 136.3. Now it's time for spare-parts calculations:
SSx = Σx² - nx̄² = 14170 - 10(35²) = 1920,
SST = SSy = Σy² - nȳ² = 189113 - 10(136.3²) = 3336.10,
Sxy = Σxy - nx̄ȳ = 45559 - 10(35)(136.3) = -2146.
b1 = Sxy/SSx = -2146/1920 = -1.1177 means b0 = ȳ - b1x̄ = 136.3 - (-1.1177)(35) = 175.4195.
So our regression equation is Ŷ = 175.4195 - 1.1177X, or Y = 175.4195 - 1.1177X + e.
c) Find the slope and test its significance (4).
SSR = b1·Sxy = (-1.1177)(-2146) = 2398.5842, df = n - 2 = 8.
s²e = SSE/(n - 2) = (SST - SSR)/(n - 2) = (SSy - b1·Sxy)/(n - 2) = (3336.10 - 2398.5842)/8 = 117.1895,
so se = √117.1895 = 10.8254.
s²b1 = s²e (1/SSx) = 117.1895/1920 = 0.06104, so sb1 = √0.06104 = 0.24706.
To test H0: β1 = 0 against H1: β1 ≠ 0, use t = (b1 - 0)/sb1 = -1.1177/0.24706 = -4.5211. We compare this with
t.05(8) = 1.860. Since the absolute value of our computed t is larger than the table t, we reject the null hypothesis.
d) Do the prediction or confidence interval as appropriate for the blood pressure of the highly
satisfied employee (satisfaction level is 57) that you hired last month. (3)   [45]
Recall that x̄ = 35, SSx = 1920.00, s²e = 117.1895 and Ŷ = 175.4195 - 1.1177X = 175.4195 - 1.1177(57) = 111.71.
The prediction interval is Y0 = Ŷ0 ± t·sY, where
s²Y = s²e[1 + 1/n + (X0 - x̄)²/SSx] = 117.1895[1 + 0.1 + (57 - 35)²/1920] = 117.1895[1.1 + 484/1920]
    = 117.1895(1.35208) = 158.46. This implies that sY = √158.46 = 12.588. Now t.05(8) = 1.860, and we put
the whole mess together:
Y0 = Ŷ0 ± t·sY = 111.71 ± 1.860(12.588) = 111.71 ± 23.41. A prediction interval is appropriate for a
one-shot situation like this one. A confidence interval is an interval for the average response.
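The whole of question 14 can be reproduced from the summary sums given in the problem. The sketch below assumes Python with scipy and follows the same formulas used above; the 1.860 is the same t value the key uses for the interval.

# Sketch of the Exhibit 3 hand calculation from the given sums (an assumption, not the required hand work).
from math import sqrt
from scipy.stats import t

n, sx, sy, sxx, syy, sxy = 10, 350, 1363, 14170, 189113, 45559
xbar, ybar = sx / n, sy / n
SSx  = sxx - n * xbar**2            # 1920
SSy  = syy - n * ybar**2            # about 3336.1
Sxy  = sxy - n * xbar * ybar        # -2146
b1   = Sxy / SSx                    # about -1.1177
b0   = ybar - b1 * xbar             # about 175.42
se2  = (SSy - b1 * Sxy) / (n - 2)   # about 117.19
t_b1 = b1 / sqrt(se2 / SSx)         # about -4.52; |t| well beyond the table value 1.860

x0   = 57
yhat = b0 + b1 * x0                 # about 111.7
s_pred = sqrt(se2 * (1 + 1 / n + (x0 - xbar)**2 / SSx))
half = t.ppf(0.95, n - 2) * s_pred  # 1.860 times the prediction standard error
print(yhat - half, yhat + half)     # roughly 111.7 plus or minus 23.4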
ECO252 QBA2
THIRD EXAM
April 16, 2007
TAKE HOME SECTION
Name: _________________________
Student Number: _________________________
Class days and time : _________________________
Please Note: Computer problems 2 and 3 should be turned in with the exam (2). In problem 2, the 2 way
ANOVA table should be checked. The three F tests should be done with a 5% significance level and you
should note whether there was (i) a significant difference between drivers, (ii) a significant difference
between cars and (iii) significant interaction. In problem 3, you should show on your third graph where the
regression line is. Check what your text says about normal probability plots and analyze the plot you did.
Explain the results of the t and F tests using a 5% significance level. (2)
III Do the following. (20+ points) Note: Look at 252thngs (252thngs) on the syllabus supplement part of
the website before you start (and before you take exams). Show your work! State H 0 and H 1 where
appropriate. You have not done a hypothesis test unless you have stated your hypotheses, run the
numbers and stated your conclusion. (Use a 95% confidence level unless another level is specified.)
Answers without reasons or accompanying calculations usually are not acceptable. Neatness and
clarity of explanation are expected. This must be turned in when you take the in-class exam. Note
that from now on neatness means paper neatly trimmed on the left side if it has been torn, multiple
pages stapled and paper written on only one side.
To personalize these data, let the last digit of your student number be a . 1) (Ken Black) A national travel
organization wanted to determine whether there is a significant difference between the cost of premium gas
in various regions of the country. The following data was gathered in July 2006. Because the brand of gas
may confuse the results, data is blocked by brand.
Data Display
Row  Brand  City 1  City 2  City 3  City 4  City 5
 1     A      3.47    3.40    3.38    3.32    3.50
 2     B      3.43    3.41    3.42    3.35    3.44
 3     C      3.44    3.41    3.43    3.36    3.45
 4     D      3.46    3.45    3.40    3.30    3.45
 5     E      3.46    3.40    3.39    3.39    3.48
 6     F      3.44    3.43    3.42    3.39    3.49
Mark your problem 'VERSION a.' If a is zero, use the original numbers. If a is 1 through 4, add a/100
to the city 4 column. If a is 5 through 9, add 0.04 - a/100 to the city 5 column. Example: Seymour Butz'
number is 090909, so he adds 0.04 - 9/100 = -.05 to column 5 (subtracts .05) and gets the table below.
Version 9
Row  Brand  City 1  City 2  City 3  City 4  City 5
 1     A      3.47    3.40    3.38    3.32    3.45
 2     B      3.43    3.41    3.42    3.35    3.39
 3     C      3.44    3.41    3.43    3.36    3.40
 4     D      3.46    3.45    3.40    3.30    3.40
 5     E      3.46    3.40    3.39    3.39    3.43
 6     F      3.44    3.43    3.42    3.39    3.44
a) Do a 2-way ANOVA on these data and explain what hypotheses you test and what the conclusions are.
Show your work. (6)
b) Using your results from a) present two different confidence intervals for the difference between numbers
of defects for the best and worst city and for the defects from the best and second best brands. Explain
which of the intervals show a significant difference and why. (3)
c) What other method could we use on these data to see if city makes a difference while allowing for cross-classification? Under what circumstances would we use it? Try it and tell what it tests and what it shows. (3)
d) (Extra credit) Check your results on the computer. The Minitab setup for both 2-way ANOVA and the
corresponding non-parametric method is the same as the extra credit in the third graded assignment or the
third computer problem. You may want to use and compare the Statistics pull-down menu and ANOVA or
Nonparametrics to start with. Explain your results. (4) [12]
2) A local official wants to know how land value affects home price. The official collects the following
data. 'Home price' is the dependent variable and 'Land Value' is the independent variable. Both are in
thousands of dollars. If you don’t know what dependent and independent variables are, stop work
until you find out.
Row  HomePr  LandVal
 1     67      7.0
 2     63      6.9
 3     60      5.5
 4     54      3.7
 5     58      5.9
 6     36      3.8
 7     76      8.9
 8     87      9.6
 9     89      9.9
10     92     10.0
To personalize the data let g be the second to last digit in your student number. Mark your problem
‘VERSION g . ’Subtract g from 92, which is the last home price.
a) Compute the regression equation Ŷ = b0 + b1x to predict the home price on the basis of land value.
(3) You may check your results on the computer, but let me see real or simulated hand calculations.
b) Compute R 2 . (2)
c) Compute s e . (2)
d) Compute s b0 and do a significance test on b0 (2)
e) Use your equation to predict the price of the average home that is on land worth $9000. Make this into a
95% interval. (3)
f) Make a graph of the data. Show the trend line and the data points clearly. If you are not willing to do this
neatly and accurately, don’t bother. (2)
[26]
3) A sales manager is trying to find out what method of payment works best. Salespersons are randomly
assigned to payment by commission, salary or bonus. Data for units sold in a week is below.
Row  Commission  Salary  Bonus
 1       25        25      28
 2       35        25      41
 3       20        22      21
 4       30        20      31
 5       25        25      26
 6       30        23      26
 7       26        22      46
 8       17        25      32
 9       31        20      27
10       30        26      35
11                 24      23
12                         22
13                         26
To personalize the data let g be the second to last digit in your student number. Subtract 2 g from 46 in
the last column. Mark your problem ‘VERSION g .’Example: Payme Well’s student number is 000090, so
she subtracts 18 and gets the following numbers.
Version 9
Row  Commission  Salary  Bonus
 1       25        25      28
 2       35        25      41
 3       20        22      21
 4       30        20      31
 5       25        25      26
 6       30        23      26
 7       26        22      28
 8       17        25      32
 9       31        20      27
10       30        26      35
11                 24      23
12                         22
13                         26
a) State your null hypothesis and test it by doing a 1-way ANOVA on these data and explain whether the
test tells us that payment method matters or not. (4)
b) Using your results from a) present individual and Tukey intervals for the difference between earnings for
all three possible contrasts. Explain (i) under what circumstances you would use each interval and (ii)
whether the intervals show a significant difference. (2)
c) What other method could we use on these data to see if payment method makes a difference? Under what
circumstances would we use it? Try it and tell what it tests and what it shows. (3) [37]
d) (Extra Credit) Do a Levene test on these data and explain what it tests and shows. (4)
f) (Extra credit) Check your results on the computer. The Minitab setup for 1-way ANOVA can either be
unstacked (in columns) or stacked (one column for data and one for column name). There is a Tukey
option. The corresponding non-parametric methods require the stacked presentation. You may want to use
and compare the Statistics pull-down menu and ANOVA or Nonparametrics to start with. You may also
want to try the Mood median test and the test for equal variances which are on the menu categories above.
Even better, do a normality test on your columns. Use the Statistics pull-down menu to find the normality
tests. The Kolmogorov-Smirnov option is actually Lilliefors. The ANOVA menu can check for equality of
variances. In light of these tests was the 1-way ANOVA appropriate? You can get descriptions of unfamiliar
tests by using the Help menu and the alphabetic command list or the Stat guide. (Up to 7) [12]
A 5% significance level is assumed throughout this section.
1) (Ken Black) A national travel organization wanted to determine whether there is a significant difference
between the cost of premium gas in various regions of the country. The following data was gathered in July
2006. Because the brand of gas may confuse the results, data is blocked by brand.
a) Do a 2-way ANOVA on these data and explain what hypotheses you test and what the conclusions are.
Show your work. (6) An example of 2-way ANOVA with one measurement per cell is in 252anovaex3.
Note that all three solutions below are essentially identical. To save space the row and column giving
column and row counts are omitted. In row 1 the mean price for brand A was 17.07/5 = 3.414. The sum of
squares for the first row is 3.47² + 3.40² + 3.38² + 3.32² + 3.50² = 58.2977. The mean price squared was
3.414² = 11.6554. 3.41867, the number at the bottom of the mean column and on the right of the mean row
is the overall mean, which can be found by averaging over the row or column or over the 30 original
numbers. In every case we seem to reject the similarity of city means but not the similarity of brand means.
Solution: Version 0
Row  Brand   City 1   City 2   City 3   City 4   City 5      Sum       SS     mean   meansq
 1     A       3.47     3.40     3.38     3.32     3.50    17.07   58.2977   3.414  11.6554
 2     B       3.43     3.41     3.42     3.35     3.44    17.05   58.1455   3.410  11.6281
 3     C       3.44     3.41     3.43     3.36     3.45    17.09   58.4187   3.418  11.6827
 4     D       3.46     3.45     3.40     3.30     3.45    17.06   58.2266   3.412  11.6417
 5     E       3.46     3.40     3.39     3.39     3.48    17.12   58.6262   3.424  11.7238
 6     F       3.44     3.43     3.42     3.39     3.49    17.17   58.9671   3.434  11.7924
Sum           20.70    20.50    20.44    20.11    20.81   102.56   350.682  3.41867 70.1241
SS          71.4162  70.0436  69.6342  67.4087  72.1791  350.682
Mean        3.45000  3.41667  3.40667  3.35167  3.46833  3.41867
Meansq      11.9025  11.6736  11.6054  11.2337  12.0293  58.4445

SST = Σx²ij - nx̄² = 350.682 - 30(3.41867²) = 350.682 - 350.6191 = 0.0629
SSR = C Σx̄i.² - nx̄² = 5(70.1241) - 30(3.41867²) = 350.6205 - 350.6191 = 0.0014
SSC = R Σx̄.j² - nx̄² = 6(58.4445) - 30(3.41867²) = 350.6670 - 350.6191 = 0.0479

Source     SS      DF   MS         F          F.05            H0
Rows       0.0014   5   0.00028     0.41 ns   F(5,20) = 2.71  Row means equal
Columns    0.0479   4   0.011975   17.61 s    F(4,20) = 2.87  Column means equal
Within     0.0136  20   0.00068
Total      0.0629  29

s = √MSW = .0261
Solution: Version 4
Row  Brand   City 1   City 2   City 3   City 4   City 5      Sum       SS     mean  mean sq
 1     A       3.47     3.40     3.38     3.36     3.50    17.11   58.5649   3.422  11.7101
 2     B       3.43     3.41     3.42     3.39     3.44    17.09   58.4151   3.418  11.6827
 3     C       3.44     3.41     3.43     3.40     3.45    17.13   58.6891   3.426  11.7375
 4     D       3.46     3.45     3.40     3.34     3.45    17.10   58.4922   3.420  11.6964
 5     E       3.46     3.40     3.39     3.43     3.48    17.16   58.8990   3.432  11.7786
 6     F       3.44     3.43     3.42     3.43     3.49    17.21   59.2399   3.442  11.8474
Sum           20.70    20.50    20.44    20.35    20.81   102.80   352.300  3.42667 70.4527
SS          71.4162  70.0436  69.6342  69.0271  72.1791  352.300
Mean        3.45000  3.41667  3.40667  3.39167  3.46833  3.42667
Mean sq     11.9025  11.6736  11.6054  11.5034  12.0293  58.7142

SST = Σx²ij - nx̄² = 352.300 - 30(3.42667²) = 352.300 - 352.2620 = 0.0380
SSR = C Σx̄i.² - nx̄² = 5(70.4527) - 30(3.42667²) = 352.2627 - 352.2620 = 0.0007
SSC = R Σx̄.j² - nx̄² = 6(58.7142) - 30(3.42667²) = 352.2852 - 352.2620 = 0.0232

Source     SS      DF   MS         F          F.05            H0
Rows       0.0007   5   0.00014    0.197 ns   F(5,20) = 2.71  Row means equal
Columns    0.0232   4   0.0058     8.169 s    F(4,20) = 2.87  Column means equal
Within     0.0141  20   0.00071
Total      0.0380  29

s = √MSW = .0266
Solution: Version 9
Row  Brand   City 1   City 2   City 3   City 4   City 5      Sum       SS     mean  mean sq
 1     A       3.47     3.40     3.38     3.32     3.45    17.02   57.9502   3.404  11.5872
 2     B       3.43     3.41     3.42     3.35     3.39    17.00   57.8040   3.400  11.5600
 3     C       3.44     3.41     3.43     3.36     3.40    17.04   58.0762   3.408  11.6145
 4     D       3.46     3.45     3.40     3.30     3.40    17.01   57.8841   3.402  11.5736
 5     E       3.46     3.40     3.39     3.39     3.43    17.07   58.2807   3.414  11.6554
 6     F       3.44     3.43     3.42     3.39     3.44    17.12   58.6206   3.424  11.7238
Sum           20.70    20.50    20.44    20.11    20.51   102.26   348.616  3.40867 69.7145
SS          71.4162  70.0436  69.6342  67.4087  70.1131  348.616
Mean        3.45000  3.41667  3.40667  3.35167  3.41833  3.40867
Mean sq     11.9025  11.6736  11.6054  11.2337  11.6850  58.1002

SST = Σx²ij - nx̄² = 348.616 - 30(3.40867²) = 348.616 - 348.5709 = 0.04506
SSR = C Σx̄i.² - nx̄² = 5(69.7145) - 30(3.40867²) = 348.5725 - 348.5709 = 0.00156
SSC = R Σx̄.j² - nx̄² = 6(58.1002) - 30(3.40867²) = 348.6012 - 348.5709 = 0.03030

Source     SS       DF   MS         F          F.05            H0
Rows       0.00156   5   0.00031    0.473 ns   F(5,20) = 2.71  Row means equal
Columns    0.03030   4   0.00758   11.48 s     F(4,20) = 2.87  Column means equal
Within     0.01320  20   0.00066
Total      0.04506  29

s = √MSW = .0257. Note that s is essentially the same for all three versions so there is no point in sticking
to separate solutions for b).
b) Using your results from a) present two different confidence intervals for the difference between numbers
of defects for the best and worst city and for the defects from the best and second best brands. Explain
which of the intervals show a significant difference and why. (3)
Solution: Since I blew the statement of this section, but no one complained, full credit will be given for any
meaningful set of comparisons of rows and columns.
The intervals presented in the outline were
Scheffé Confidence Interval
If we desire intervals that will simultaneously be valid for a given confidence level for all possible intervals
between means, use the following formulas.
For row means, use μ1 - μ2 = (x̄1 - x̄2) ± √[(R - 1)F(R-1, RC(P-1))] √(2MSW/(PC)).
For column means, use μ1 - μ2 = (x̄1 - x̄2) ± √[(C - 1)F(C-1, RC(P-1))] √(2MSW/(PR)).
Note that if P = 1 we replace RC(P - 1) with (R - 1)(C - 1). We have 6 rows and 5 columns with 1
measurement per cell, so R - 1 = 5, C - 1 = 4 and RC(P - 1) is replaced by 5(4) = 20. We already know that
s = √MSW ≈ .026.
The row mean contrast is μ1 - μ2 = (x̄1 - x̄2) ± √[5F(5,20)] √(2MSW/5). The column mean contrast is
μ1 - μ2 = (x̄1 - x̄2) ± √[4F(4,20)] √(2MSW/6). We already know that the 5% values are F(4,20) = 2.87 and
F(5,20) = 2.71.
Bonferroni Confidence Interval: with m = 1 this is an individual interval, not simultaneously valid at 95%.
Note that if P = 1 we replace RC(P - 1) with (R - 1)(C - 1). We have 6 rows and 5 columns with 1
measurement per cell, so R - 1 = 5, C - 1 = 4 and RC(P - 1) is replaced by 5(4) = 20.
The row mean contrast is μ1 - μ2 = (x̄1 - x̄2) ± t.025(20) √(2MSW/5) and the column mean contrast is
μ1 - μ2 = (x̄1 - x̄2) ± t.025(20) √(2MSW/6). t.025(20) = 2.086.
Tukey Confidence Interval: these are sort of a loose version of the Scheffé intervals.
Note that if P = 1 we replace RC(P - 1) with (R - 1)(C - 1). We have 6 rows and 5 columns with 1
measurement per cell, so R - 1 = 5, C - 1 = 4 and RC(P - 1) is replaced by 5(4) = 20.
The row mean contrast is μ1 - μ2 = (x̄1 - x̄2) ± q.05(6,20) √(MSW/5) and the column mean contrast is
μ1 - μ2 = (x̄1 - x̄2) ± q.05(5,20) √(MSW/6). The Tukey table is shown below and it seems that the values we
need are q.05(6,20) = 4.45 and q.05(5,20) = 4.23.

Tukey table, df = 20:
  m        2     3     4     5     6     7     8     9    10
  α=0.05  2.95  3.58  3.96  4.23  4.45  4.62  4.77  4.90  5.01
  α=0.01  4.02  4.64  5.02  5.29  5.51  5.69  5.84  5.97  6.09
c) What other method could we use on these data to see if city makes a difference while allowing for cross-classification? Under what circumstances would we use it? Try it and tell what it tests and what it shows. (3)
Solution: The Friedman method is similar in intent to 2-way ANOVA with one measurement per cell. It is
used to compare medians when the underlying distribution cannot be considered Normal. The data is ranked
within rows.
Solution: Version 0
If the cities are x1 through x5 and the ranks are r1 through r5 we can make the table below.

Brand    x1    r1     x2    r2     x3    r3     x4    r4     x5    r5
  A     3.47   4     3.40   3.0   3.38   2.0   3.32   1.0   3.50   5.0
  B     3.43   4     3.41   2.0   3.42   3.0   3.35   1.0   3.44   5.0
  C     3.44   4     3.41   2.0   3.43   3.0   3.36   1.0   3.45   5.0
  D     3.46   5     3.45   3.5   3.40   2.0   3.30   1.0   3.45   3.5
  E     3.46   4     3.40   3.0   3.39   1.5   3.39   1.5   3.48   5.0
  F     3.44   4     3.43   3.0   3.42   2.0   3.39   1.0   3.49   5.0
Sum           25           16.5          13.5          6.5          28.5

The sums of ranks add to 90, which checks out, since Σ SRi = rc(c + 1)/2 = 6(5)(6)/2 = 90. The Friedman statistic is
χ²F = [12/(rc(c + 1))] Σ SRi² - 3r(c + 1) = [12/(6(5)(6))](25² + 16.5² + 13.5² + 6.5² + 28.5²) - 3(6)(6)
    = (12/180)(1934) - 108 = 128.9333 - 108 = 20.93.
This has the chi-squared distribution with 4 degrees of freedom. The upper right corner of the chi-squared
table appears below. Since the computed value of χ² exceeds any value on the table, we can say that the
p-value is below .005 and reject the null hypothesis of equal medians for cities.

Degrees of Freedom       1         2         3         4
α = .005             7.87946   10.5966   12.8382   14.8603
α = .010             6.63491    9.2103   11.3449   13.2767
α = .025             5.02389    7.3778    9.3484   11.1433
α = .050             3.84146    5.9915    7.8147    9.4877
α = .100             2.70554    4.6052    6.2514    7.7794
Solution: Version 4
Brand    x1    r1     x2    r2     x3    r3     x4    r4     x5    r5
  A     3.47   4     3.40   3.0   3.38   2.0   3.36   1.0   3.50   5.0
  B     3.43   4     3.41   2.0   3.42   3.0   3.39   1.0   3.44   5.0
  C     3.44   4     3.41   2.0   3.43   3.0   3.40   1.0   3.45   5.0
  D     3.46   5     3.45   3.5   3.40   2.0   3.34   1.0   3.45   3.5
  E     3.46   4     3.40   2.0   3.39   1.0   3.43   3.0   3.48   5.0
  F     3.44   4     3.43   2.5   3.42   1.0   3.43   2.5   3.49   5.0
Sum           25           15.0          12.0          9.5          28.5

The sums of ranks add to 90, which checks out, since Σ SRi = rc(c + 1)/2 = 6(5)(6)/2 = 90. The Friedman statistic is
χ²F = [12/(rc(c + 1))] Σ SRi² - 3r(c + 1) = [12/(6(5)(6))](25² + 15² + 12² + 9.5² + 28.5²) - 3(6)(6)
    = (12/180)(1896.5) - 108 = 126.4333 - 108 = 18.4333.
This has the chi-squared distribution with 4 degrees of freedom. The upper right corner of the chi-squared
table appears above. Since the computed value of χ² exceeds any value on the table, we can say that the
p-value is below .005 and reject the null hypothesis of equal medians for cities.
Solution: Version 9
Brand    x1    r1     x2    r2     x3    r3     x4    r4     x5    r5
  A     3.47   5.0   3.40   3.0   3.38   2.0   3.32   1.0   3.45   4.0
  B     3.43   5.0   3.41   3.0   3.42   4.0   3.35   1.0   3.39   2.0
  C     3.44   5.0   3.41   3.0   3.43   4.0   3.36   1.0   3.40   2.0
  D     3.46   5.0   3.45   4.0   3.40   2.5   3.30   1.0   3.40   2.5
  E     3.46   5.0   3.40   3.0   3.39   1.5   3.39   1.5   3.43   4.0
  F     3.44   4.5   3.43   3.0   3.42   2.0   3.39   1.0   3.44   4.5
Sum           29.5          19.0          16.0          6.5          19.0

The sums of ranks add to 90, which checks out, since Σ SRi = rc(c + 1)/2 = 6(5)(6)/2 = 90. The Friedman statistic is
χ²F = [12/(rc(c + 1))] Σ SRi² - 3r(c + 1) = [12/(6(5)(6))](29.5² + 19² + 16² + 6.5² + 19²) - 3(6)(6)
    = (12/180)(1890.5) - 108 = 126.0333 - 108 = 18.033.
This has the chi-squared distribution with 4 degrees of freedom. The upper right corner of the chi-squared
table appears above. Since the computed value of χ² exceeds any value on the table, we can say that the
p-value is below .005 and reject the null hypothesis of equal medians for cities.
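A hedged computer check of the Friedman statistic (assuming Python with scipy) on the Version 0 columns; scipy corrects for ties, so its value matches the 'adjusted for ties' line of the corresponding Minitab run below rather than the uncorrected hand value.

# Friedman test on the Version 0 data; each list is a city column, blocked by the six brands.
from scipy.stats import friedmanchisquare

city1 = [3.47, 3.43, 3.44, 3.46, 3.46, 3.44]
city2 = [3.40, 3.41, 3.41, 3.45, 3.40, 3.43]
city3 = [3.38, 3.42, 3.43, 3.40, 3.39, 3.42]
city4 = [3.32, 3.35, 3.36, 3.30, 3.39, 3.39]
city5 = [3.50, 3.44, 3.45, 3.45, 3.48, 3.49]

stat, p = friedmanchisquare(city1, city2, city3, city4, city5)
print(stat, p)   # tie-adjusted statistic with a very small p-value; reject equal city medians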
d) (Extra credit) Check your results on the computer. The Minitab setup for both 2-way ANOVA and the
corresponding non-parametric method is the same as the extra credit in the third graded assignment or the
third computer problem. You may want to use and compare the Statistics pull-down menu and ANOVA or
Nonparametrics to start with. Explain your results. (4) [12]
Solution: Note that because of rounding and small numbers there is quite a bit of difference between the
numbers in the ANOVAs here and above. There is no difference in conclusions and little in the value of the
standard errors. The Friedman test results are essentially identical.
Version 0
MTB > WSave "C:\Documents and Settings\RBOVE\My Documents\Minitab\252x0703101.MTW";
SUBC>   Replace.
Saving file as: 'C:\Documents and Settings\RBOVE\My Documents\Minitab\251x07031-01.MTW'
MTB > print c1-c6

Data Display
Row  Brand  City 1  City 2  City 3  City 4  City 5
 1     A      3.47    3.40    3.38    3.32    3.50
 2     B      3.43    3.41    3.42    3.35    3.44
 3     C      3.44    3.41    3.43    3.36    3.45
 4     D      3.46    3.45    3.40    3.30    3.45
 5     E      3.46    3.40    3.39    3.39    3.48
 6     F      3.44    3.43    3.42    3.39    3.49

MTB > Stack c2-c6 c50;           #Puts all columns into C50 and column labels into C52.
SUBC>   Subscripts c11;          #This could be done by hand, but pull-down
SUBC>   UseNames.                #menus are easy to use.
MTB > STACK c1 c1 c1 c1 c1 c51   #Stacks row labels in c51
MTB > print c50 - c52
Data Display
Row   C50  Rowno  City
  1  3.47    A    City 1
  2  3.40    A    City 2
  3  3.38    A    City 3
  4  3.32    A    City 4
  5  3.50    A    City 5
  6  3.43    B    City 1
  7  3.41    B    City 2
  8  3.42    B    City 3
  9  3.35    B    City 4
 10  3.44    B    City 5
 11  3.44    C    City 1
 12  3.41    C    City 2
 13  3.43    C    City 3
 14  3.36    C    City 4
 15  3.45    C    City 5
 16  3.46    D    City 1
 17  3.45    D    City 2
 18  3.40    D    City 3
 19  3.30    D    City 4
 20  3.45    D    City 5
 21  3.46    E    City 1
 22  3.40    E    City 2
 23  3.39    E    City 3
 24  3.39    E    City 4
 25  3.48    E    City 5
 26  3.44    F    City 1
 27  3.43    F    City 2
 28  3.42    F    City 3
 29  3.39    F    City 4
 30  3.49    F    City 5
MTB > Twoway C50 C51 C52;
SUBC>   Means C51 C52.

Two-way ANOVA: C50 versus Rowno, City
Source  DF         SS         MS      F      P
Rowno    5  0.0020267  0.0004053   0.63  0.677  #High p-value. Do not reject equal firm means.
City     4  0.0485133  0.0121283  18.94  0.000  #Low p-value. Reject equal city means.
Error   20  0.0128067  0.0006403
Total   29  0.0633467
S = 0.02530   R-Sq = 79.78%   R-Sq(adj) = 70.69%

Rowno   Mean          City     Mean
  1    3.414            1   3.45000
  2    3.410            2   3.41667
  3    3.418            3   3.40667
  4    3.412            4   3.35167
  5    3.424            5   3.46833
  6    3.434
(Minitab also prints individual 95% confidence intervals for each mean, based on the pooled standard deviation.)
MTB > Friedman C50 C52 C51.

Friedman Test: C50 versus City blocked by Rowno
S = 20.93  DF = 4  P = 0.000   #Low p-value. Reject equal city medians.
S = 21.29  DF = 4  P = 0.000 (adjusted for ties)

                          Sum of
City    N  Est Median      Ranks
City 1  6      3.4375       25.0
City 2  6      3.4095       16.5
City 3  6      3.4025       13.5
City 4  6      3.3525        6.5
City 5  6      3.4555       28.5

Grand median = 3.4115
Version 4
Results for: 252x07031-01a4.MTW
MTB > Stack c2-c6 c10;
SUBC>
Subscripts c11;
SUBC>
UseNames.
MTB > STACK c1 c1 c1 c1 c1 c12
MTB > Twoway c10 c12 c11;
SUBC>   Means c12 c11.

Two-way ANOVA: C10 versus C12, C11
Source  DF         SS         MS      F      P
C12      5  0.0020267  0.0004053   0.63  0.677  #High p-value. Do not reject equal firm means.
C11      4  0.0240333  0.0060083   9.38  0.000  #Low p-value. Reject equal city means.
Error   20  0.0128067  0.0006403
Total   29  0.0388667
S = 0.02530   R-Sq = 67.05%   R-Sq(adj) = 52.22%

C12 (brand) means: A 3.422, B 3.418, C 3.426, D 3.420, E 3.432, F 3.442
C11 (city) means: 1 3.45000, 2 3.41667, 3 3.40667, 4 3.39167, 5 3.46833
(Minitab also prints individual 95% confidence intervals for each mean, based on the pooled standard deviation.)
MTB > Friedman c50 c52 c51.

Friedman Test: C50 versus C52 blocked by C51
S = 18.43  DF = 4  P = 0.001   #Low p-value. Reject equal city medians.
S = 18.75  DF = 4  P = 0.001 (adjusted for ties)

                          Sum of
C52     N  Est Median      Ranks
City 1  6      3.4375       25.0
City 2  6      3.4095       15.0
City 3  6      3.4025       12.0
City 4  6      3.3925        9.5
City 5  6      3.4555       28.5

Grand median = 3.4195
Version 9
MTB > WOpen "C:\Documents and Settings\RBOVE\My Documents\Minitab\252x0703101a9.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\RBOVE\My
Documents\Minitab\252x07031-01a9.MTW'
Worksheet was saved on Wed Apr 04 2007
Results for: 252x07031-01a9.MTW
MTB > Stack c2-c6 c10;
SUBC>
Subscripts c12;
SUBC>
UseNames.
MTB > stack c1 c1 c1 c1 c1 c11
MTB > Twoway c10 c11 c12;
SUBC>   Means c11 c12.

Two-way ANOVA: C10 versus C11, C12
Source  DF         SS         MS      F      P
C11      5  0.0020267  0.0004053   0.63  0.677  #High p-value. Do not reject equal firm means.
C12      4  0.0307133  0.0076783  11.99  0.000  #Low p-value. Reject equal city means.
Error   20  0.0128067  0.0006403
Total   29  0.0455467
S = 0.02530   R-Sq = 71.88%   R-Sq(adj) = 59.23%

C11 (brand) means: A 3.404, B 3.400, C 3.408, D 3.402, E 3.414, F 3.424
C12 (city) means: 1 3.45000, 2 3.41667, 3 3.40667, 4 3.35167, 5 3.41833
(Minitab also prints individual 95% confidence intervals for each mean, based on the pooled standard deviation.)
MTB > Friedman c50 c52 c51.

Friedman Test: C50 versus C52 blocked by C51
S = 18.03  DF = 4  P = 0.001   #Low p-value. Reject equal city medians.
S = 18.50  DF = 4  P = 0.001 (adjusted for ties)

                          Sum of
C52     N  Est Median      Ranks
City 1  6      3.4375       29.5
City 2  6      3.4095       19.0
City 3  6      3.4025       16.0
City 4  6      3.3525        6.5
City 5  6      3.4055       19.0

Grand median = 3.4015
2) A local official wants to know how land value affects home price. The official collects the following
data. 'Home price' is the dependent variable and 'Land Value' is the independent variable. Both are in
thousands of dollars. If you don’t know what dependent and independent variables are, stop work
until you find out.
Row  HomePr  LandVal
 1     67      7.0
 2     63      6.9
 3     60      5.5
 4     54      3.7
 5     58      5.9
 6     36      3.8
 7     76      8.9
 8     87      9.6
 9     89      9.9
10     92     10.0
To personalize the data let g be the second to last digit in your student number. Mark your problem
‘VERSION g . ’Subtract g from 92, which is the last home price.
a) Compute the regression equation Ŷ = b0 + b1x to predict the home price on the basis of land value.
(3) You may check your results on the computer, but let me see real or simulated hand calculations.
Version 0
Row     Y      X      X²      XY      Y²
 1     67    7.0    49.00   469.0   4489
 2     63    6.9    47.61   434.7   3969
 3     60    5.5    30.25   330.0   3600
 4     54    3.7    13.69   199.8   2916
 5     58    5.9    34.81   342.2   3364
 6     36    3.8    14.44   136.8   1296
 7     76    8.9    79.21   676.4   5776
 8     87    9.6    92.16   835.2   7569
 9     89    9.9    98.01   881.1   7921
10     92   10.0   100.00   920.0   8464
Sum   682   71.2   559.18  5225.2  49364

To summarize: n = 10, ΣX = 71.2, ΣY = 682, ΣXY = 5225.2, ΣX² = 559.18 and ΣY² = 49364. df = n - 2 = 10 - 2 = 8.
Means: X̄ = ΣX/n = 71.2/10 = 7.12 and Ȳ = ΣY/n = 682/10 = 68.2.
Spare parts: SSx = ΣX² - nX̄² = 559.18 - 10(7.12²) = 52.236
SSy = ΣY² - nȲ² = 49364 - 10(68.2²) = 2851.6 = SST (Total Sum of Squares)
Sxy = ΣXY - nX̄Ȳ = 5225.2 - 10(7.12)(68.2) = 369.36
Coefficients: b1 = Sxy/SSx = (ΣXY - nX̄Ȳ)/(ΣX² - nX̄²) = 369.36/52.236 = 7.0710
b0 = Ȳ - b1X̄ = 68.2 - 7.0710(7.12) = 17.85458
So our equation is Ŷ = 17.85458 + 7.0710X.
Check: Regression Analysis: HomePr versus LandVal
The regression equation is
HomePr = 17.9 + 7.07 LandVal

Predictor    Coef  SE Coef     T      P
Constant   17.855    5.665  3.15  0.014
LandVal    7.0710   0.7576  9.33  0.000

S = 5.47564   R-Sq = 91.6%   R-Sq(adj) = 90.5%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  2611.7  2611.7  87.11  0.000
Residual Error   8   239.9    30.0
Total            9  2851.6

Unusual Observations
Obs  LandVal  HomePr    Fit  SE Fit  Residual  St Resid
  4      3.7   54.00  44.02    3.12      9.98     2.22R
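A quick cross-check of the Version 0 coefficients and R² (assuming Python with scipy; the exam, of course, wants the hand calculation above):

# Simple regression of home price on land value for the Version 0 data.
from scipy.stats import linregress

land = [7.0, 6.9, 5.5, 3.7, 5.9, 3.8, 8.9, 9.6, 9.9, 10.0]
home = [67, 63, 60, 54, 58, 36, 76, 87, 89, 92]

fit = linregress(land, home)
print(fit.slope, fit.intercept)   # about 7.071 and 17.85, as in the hand work and the printout
print(fit.rvalue ** 2)            # about 0.916, the R-Sq on the printout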
b) Compute R². (2)
Please do not repeat any calculations from a), but recall the following.
Spare parts: SSx = ΣX² - nX̄² = 559.18 - 10(7.12²) = 52.236
SSy = ΣY² - nȲ² = 49364 - 10(68.2²) = 2851.6 = SST (Total Sum of Squares)
Sxy = ΣXY - nX̄Ȳ = 5225.2 - 10(7.12)(68.2) = 369.36
b1 = Sxy/SSx = 369.36/52.236 = 7.0710 and b0 = Ȳ - b1X̄ = 68.2 - 7.0710(7.12) = 17.85458.
So SSR = b1·Sxy = 7.0710(369.36) = 2611.745, and
R² = SSR/SST = b1·Sxy/SSy = 2611.745/2851.6 = .9159, or R² = S²xy/(SSx·SSy) = 369.36²/[52.236(2851.6)] = .9159.
c) Compute se. (2) SSE = SST - SSR = 2851.6 - 2611.745 = 239.855.
s²e = SSE/(n - 2) = 239.855/8 = 29.9819, so se = √29.9819 = 5.47557. This is more accurate than the next calculation.
Note also that if R² is available, s²e = SSy(1 - R²)/(n - 2) = 2851.6(1 - .9159)/8 = 29.97745 and
se = √29.97745 = 5.47517.
d) Compute sb0 and do a significance test on b0 (2). Recall X̄ = 7.12 and SSx = 52.236.
We are now testing H0: β0 = β00 against H1: β0 ≠ β00 with t = (b0 - β00)/sb0. We are using t.025(n-2) = t.025(8) = 2.306.
s²b0 = s²e[1/n + X̄²/SSx] = 29.9819[1/10 + 7.12²/52.236] = 29.9819(1.07049) = 32.0953. So b0 = 17.85458 and
sb0 = √32.0953 = 5.6653. If we are testing H0: β0 = 0 against H1: β0 ≠ 0, t = (b0 - 0)/sb0 = 17.8548/5.6653 = 3.152.
Since the rejection region is above 2.306 and below -2.306, we reject the null hypothesis and say that β0 is
significant. A confidence interval would be β0 = b0 ± t·sb0 = 17.855 ± 2.306(5.6653) = 17.855 ± 13.064.
Since this confidence interval does not include zero, we reject the null hypothesis of insignificance. Note
that at the 1% significance level we could not reject the null hypothesis.
e) Use your equation to predict the price of the average home that is on land worth $9000. Make this into a
95% interval. (3)
Solution: Recall that n = 10, X̄ = 7.12, SSx = 52.236, s²e = 29.9819 (or se = √29.9819 = 5.47557), that both
the dependent and independent variables are measured in thousands and that our equation is Ŷ = 17.85458 + 7.0710X.
So if X = 9, Ŷ = 17.85458 + 7.0710(9) = 81.49. Since we are concerned with an average value, we do not use
the prediction interval. The formula for the confidence interval is μY0 = Ŷ0 ± t·sŶ, where
s²Ŷ = s²e[1/n + (X0 - X̄)²/SSx] = 29.9819[1/10 + (9 - 7.12)²/52.236] = 29.9819(0.1 + 0.06766)
    = 29.9819(0.16766) = 5.02683, or sŶ = √5.02683 = 2.2421. We already know that t.025(n-2) = t.025(8) = 2.306
and that Ŷ = 81.49.
So our confidence interval is μY0 = Ŷ0 ± t·sŶ = 81.49 ± 2.306(2.2421) = 81.49 ± 5.17, so that we can say
that we are 95% sure that the average house standing on $9 (thousand) worth of land will sell for between
$76.32 and $86.66 (thousand).
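The same interval can be reproduced from the quantities already computed by hand. A minimal sketch, assuming Python with scipy:

# Confidence interval for the mean home price at X0 = 9 (thousand), using the hand quantities above.
from math import sqrt
from scipy.stats import t

n, xbar, SSx, se2 = 10, 7.12, 52.236, 29.9819
x0 = 9
yhat = 17.85458 + 7.0710 * x0                          # about 81.49
s_mean = sqrt(se2 * (1 / n + (x0 - xbar) ** 2 / SSx))  # standard error of the mean response
half = t.ppf(0.975, n - 2) * s_mean                    # t.025(8) = 2.306
print(yhat - half, yhat + half)                        # roughly 76.3 to 86.7 (thousand)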
f) Make a graph of the data. Show the regression line and the data points clearly. If you are not willing to
do this neatly and accurately, don’t bother. (2)
[26]
Fitted Line: HomePrice versus LandVal
Version 9
Because time was short, I was only able to run one other version of the regression and that only on the
computer, but because it is the most extreme version, all other results should fall between the results below
and the results above. The run follows with interpretation. This is a run of my ‘exec’ 252sols. If you wish to
check your results, you are welcome to use it. As you can see, it is easy to run and internally documented.
To prepare your data, put y in c1 and x in c2. Then store it using the files pull-down menu and ‘save
worksheet.’ This is to establish a path for Minitab to find 252sols. Give Minitab the command echo in order
to be able to track what it is doing. Then download 252sols.txt from the courses server to your storage
location, and start it with the Minitab command, exec '252sols.txt'.
Here is the annotated output.
MTB > WOpen "C:\Documents and Settings\RBOVE\My Documents\Minitab\252x0703201a9.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\RBOVE\My
Documents\Minitab\252x07032-01a9.MTW'
Worksheet was saved on Thu Apr 05 2007
A worksheet of the same name is already open... current worksheet
overwritten.
MTB > Save "C:\Documents and Settings\RBOVE\My Documents\Minitab\252x0703201a9.MTW";
SUBC>   Replace.
Saving file as: 'C:\Documents and Settings\RBOVE\My Documents\Minitab\252x07032-01a9.MTW'
Existing file replaced.
MTB > #252sols
Fakes simple ols regression by hand
MTB > regress c1 1 c2
Regression Analysis: HomePr versus LandVal
The regression equation is
HomePr = 20.5 + 6.57 LandVal

Predictor    Coef  SE Coef     T      P
Constant   20.488    5.644  3.63  0.007
LandVal    6.5748   0.7548  8.71  0.000

S = 5.45503   R-Sq = 90.5%   R-Sq(adj) = 89.3%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  2258.0  2258.0  75.88  0.000
Residual Error   8   238.1    29.8
Total            9  2496.1

Unusual Observations
Obs  LandVal  HomePr    Fit  SE Fit  Residual  St Resid
  4      3.7   54.00  44.81    3.10      9.19     2.05R
  6      3.8   36.00  45.47    3.04     -9.47    -2.09R
R denotes an observation with a large standardized residual.
MTB > let c3=c2*c2        #Put y in c1, x in c2; c3 is x squared, c4 is xy, c5 is y squared
MTB > let c4=c1*c2
MTB > let c5=c1*c1
MTB > let k1=sum(c1)      #k1 is sum of y
MTB > let k2=sum(c2)      #k2 is sum of x
MTB > let k3=sum(c3)      #k3 is sum of x squared
MTB > let k4=sum(c4)      #k4 is sum of xy
MTB > let k5=sum(c5)      #k5 is sum of y squared
MTB > print c1-c5

Data Display
Row  HomePr  LandVal      C3     C4    C5
  1      67      7.0   49.00  469.0  4489
  2      63      6.9   47.61  434.7  3969
  3      60      5.5   30.25  330.0  3600
  4      54      3.7   13.69  199.8  2916
  5      58      5.9   34.81  342.2  3364
  6      36      3.8   14.44  136.8  1296
  7      76      8.9   79.21  676.4  5776
  8      87      9.6   92.16  835.2  7569
  9      89      9.9   98.01  881.1  7921
 10      83     10.0  100.00  920.0  8464

MTB > let k17=count(c1)       #k17 is n
MTB > let k21=k2/k17          #k21 is mean of x
MTB > let k22=k1/k17          #k22 is mean of y
MTB > let k18=k21*k21*k17
MTB > let k18=k3-k18          #k18 is SSx
MTB > let k19=k22*k22*k17
MTB > let k19=k5-k19          #k19 is SSy
MTB > let k20=k21*k22*k17
MTB > let k20=k4-k20          #k20 is Sxy
MTB > print k1-k5

Data Display
K1    673.000    #k1 is sum of y
K2    71.2000    #k2 is sum of x
K3    559.180    #k3 is sum of x squared
K4    5225.20    #k4 is sum of xy
K5    49364.0    #k5 is sum of y squared

MTB > print k17-k22

Data Display
K17   10.0000    #k17 is n
K18   52.2360    #k18 is SSx
K19   4071.10    #k19 is SSy
K20   433.440    #k20 is Sxy
K21   7.12000    #k21 is mean of x
K22   67.3000    #k22 is mean of y

MTB > end
3) A sales manager is trying to find out what method of payment works best. Salespersons are randomly
assigned to payment by commission, salary or bonus. Data for units sold in a week is below.
Row  Commission  Salary  Bonus
 1       25        25      28
 2       35        25      41
 3       20        22      21
 4       30        20      31
 5       25        25      26
 6       30        23      26
 7       26        22      46
 8       17        25      32
 9       31        20      27
10       30        26      35
11                 24      23
12                         22
13                         26
To personalize the data let g be the second to last digit in your student number. Subtract 2g from 46 in the
last column. Mark your problem 'VERSION g.' Example: Payme Well's student number is 000090, so she
subtracts 18 and gets the following numbers.
Version 9
Row  Commission  Salary  Bonus
 1       25        25      28
 2       35        25      41
 3       20        22      21
 4       30        20      31
 5       25        25      26
 6       30        23      26
 7       26        22      28
 8       17        25      32
 9       31        20      27
10       30        26      35
11                 24      23
12                         22
13                         26
a) State your null hypothesis and test it by doing a 1-way ANOVA on these data and explain whether the
test tells us that payment method matters or not. (4)
Version 0: You should be able to do the calculations below. Only the three columns of numbers were given to us.

                   Payment Method
Line     Commission   Salary     Bonus
  1          25         25         28
  2          35         25         41
  3          20         22         21
  4          30         20         31
  5          25         25         26
  6          30         23         26
  7          26         22         46
  8          17         25         32
  9          31         20         27
 10          30         26         35
 11                     24         23
 12                                22
 13                                26
Sum Σxij    269    +   257    +   384    =   910 = Σx
nj           10    +    11    +    13    =    34 = n
x̄j         26.9      23.3636    29.5385        26.7647 = x̄
SS Σx²ij   7501    +  6049    + 12002    = 25552 = Σx²ij
x̄j²      723.61    545.8595   872.5207

Note that all the squared means were gotten by going back to the original ratios used to calculate the means.
SST = Σx²ij - nx̄² = 25552 - 34(26.7647²) = 25552 - 24355.8824 = 1196.1177
SSB = Σ njx̄j² - nx̄² = 10(26.9²) + 11(23.3636²) + 13(29.5385²) - 34(26.7647²)
    = 10(723.61) + 11(545.8595) + 13(872.5207) - 24355.8824
    = 7236.1 + 6004.4545 + 11342.7691 - 24355.8824 = 24583.3236 - 24355.8824 = 227.4413
Source     SS         DF   MS         F          F.05            H0
Between     227.4413   2   113.7207   3.6393 s   F(2,31) = 3.30  Column means equal
Within      968.6765  31    31.2476
Total      1196.1177  33

Because our computed F statistic exceeds the 5% table value, we reject the null hypothesis of equal means
for each payment method and conclude that method of payment matters.
This was verified by the Minitab printout.
MTB > AOVOneway c1-c3;
SUBC>   Tukey 5;
SUBC>   Fisher 5.

One-way ANOVA: Commission, Salary, Bonus
Source  DF      SS     MS     F      P
Factor   2   227.4  113.7  3.64  0.038
Error   31   968.7   31.2
Total   33  1196.1
S = 5.590   R-Sq = 19.01%   R-Sq(adj) = 13.79%

Level        N    Mean  StDev
Commission  10  26.900  5.425
Salary      11  23.364  2.111
Bonus       13  29.538  7.412
Pooled StDev = 5.590
(Minitab also prints individual 95% confidence intervals for each mean, based on the pooled standard deviation.)
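A hedged cross-check of the same one-way ANOVA, assuming Python with scipy:

# One-way ANOVA on the Version 0 payment data; confirms the F statistic and p-value only.
from scipy.stats import f_oneway

commission = [25, 35, 20, 30, 25, 30, 26, 17, 31, 30]
salary     = [25, 25, 22, 20, 25, 23, 22, 25, 20, 26, 24]
bonus      = [28, 41, 21, 31, 26, 26, 46, 32, 27, 35, 23, 22, 26]

F, p = f_oneway(commission, salary, bonus)
print(F, p)   # about 3.64 and 0.038, matching the Minitab printout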
b) Using your results from a) present individual and Tukey intervals for the difference between earnings for
all three possible contrasts. Explain (i) under what circumstances you would use each interval and (ii)
whether the intervals show a significant difference. (2)
Let's start with s = √MSW = √31.2476 = 5.5899, x̄1 = 26.9, x̄2 = 23.3636, x̄3 = 29.5385, n1 = 10, n2 = 11 and
n3 = 13. There are m = 3 columns and the ANOVA printout gives us n - m = 31 for the within degrees of
freedom. From our tables we find t.025(31) = 2.040 and q.05(3,31) = 3.49.
The outline gives the following for these two intervals.
Contrast 1-2
Individual Confidence Interval
If we desire a single interval, we use the formula for the difference between two means when the variance is
known. For example, if we want the difference between the means of column 1 and column 2:
μ1 - μ2 = (x̄1 - x̄2) ± t.025(n-m) s√(1/n1 + 1/n2) = (26.9 - 23.3636) ± 2.040 √[31.2476(1/10 + 1/11)]
        = 3.54 ± 2.040 √5.96545 = 3.54 ± 2.040(2.442) = 3.54 ± 4.98 ns
Tukey Confidence Interval
This applies to all possible differences.
μ1 - μ2 = (x̄1 - x̄2) ± q.05(m, n-m) (s/√2) √(1/n1 + 1/n2) = (26.9 - 23.3636) ± 3.49 √[(31.2476/2)(1/10 + 1/11)]
        = 3.54 ± 3.49 √(5.96545/2) = 3.54 ± 3.49(1.727) = 3.54 ± 6.03 ns
Contrast 1-3
Individual Confidence Interval
μ1 - μ3 = (x̄1 - x̄3) ± t.025(n-m) s√(1/n1 + 1/n3) = (26.9 - 29.5385) ± 2.040 √[31.2476(1/10 + 1/13)]
        = -2.64 ± 2.040 √5.52842 = -2.64 ± 2.040(2.351) = -2.64 ± 4.80 ns
Tukey Confidence Interval
μ1 - μ3 = (x̄1 - x̄3) ± q.05(m, n-m) (s/√2) √(1/n1 + 1/n3) = (26.9 - 29.5385) ± 3.49 √(5.52842/2)
        = -2.64 ± 3.49(1.662) = -2.64 ± 5.80 ns
Contrast 2-3
Individual Confidence Interval
μ2 - μ3 = (x̄2 - x̄3) ± t.025(n-m) s√(1/n2 + 1/n3) = (23.3636 - 29.5385) ± 2.040 √[31.2476(1/11 + 1/13)]
        = -6.17 ± 2.040 √5.24435 = -6.17 ± 2.040(2.290) = -6.17 ± 4.67
Tukey Confidence Interval
μ2 - μ3 = (x̄2 - x̄3) ± q.05(m, n-m) (s/√2) √(1/n2 + 1/n3) = (23.3636 - 29.5385) ± 3.49 √(5.24435/2)
        = -6.17 ± 3.49(1.619) = -6.17 ± 5.65
The remainder of the Minitab printout for this problem follows.
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons
Individual confidence level = 98.04%

Commission subtracted from:
         Lower  Center   Upper
Salary  -9.547  -3.536   2.474
Bonus   -3.147   2.638   8.424

Salary subtracted from:
        Lower  Center   Upper
Bonus   0.540   6.175  11.810

Fisher 95% Individual Confidence Intervals
All Pairwise Comparisons
Simultaneous confidence level = 88.04%

Commission subtracted from:
         Lower  Center  Upper
Salary  -8.518  -3.536  1.445
Bonus   -2.157   2.638  7.434

Salary subtracted from:
        Lower  Center   Upper
Bonus   1.504   6.175  10.845
c) What other method could we use on these data to see if payment method makes a difference? Under what
circumstances would we use it? Try it and tell what it tests and what it shows. (3) [37]
The alternative to one-way ANOVA for situations in which the parent distribution is not Normal is the
Kruskal-Wallis test. The original data is replaced by ranks.
Data Display
Row  Commission  Rank   Salary  Rank   Bonus  Rank
         x1       r1      x2     r2      x3     r3
 1       25      14.5     25    14.5     28    24.0
 2       35      31.5     25    14.5     41    33.0
 3       20       3.0     22     7.0     21     5.0
 4       30      26.0     20     3.0     31    28.5
 5       25      14.5     25    14.5     26    20.0
 6       30      26.0     23     9.5     26    20.0
 7       26      20.0     22     7.0     46    34.0
 8       17       1.0     25    14.5     32    30.0
 9       31      28.5     20     3.0     27    23.0
10       30      26.0     26    20.0     35    31.5
11                        24    11.0     23     9.5
12                                       22     7.0
13                                       26    20.0
Rank Sum        191             118.5          285.5
ni               10               11              13

Note that the sum of the first thirty-four numbers is 34(35)/2 = 595, so that we can check our ranking by
noting that 191 + 118.5 + 285.5 = 595.
\(H = \frac{12}{n(n+1)}\sum_i \frac{SR_i^2}{n_i} - 3(n+1) = \frac{12}{34(35)}\left[\frac{191^2}{10} + \frac{118.5^2}{11} + \frac{285.5^2}{13}\right] - 3(35)\)
\(= \frac{12}{34(35)}\left(3648.1 + 1276.5682 + 6270.0019\right) - 105 = \frac{12(11194.6701)}{34(35)} - 105 = 112.8874 - 105 = 7.8874\)
The sample sizes are too large for the Kruskal-Wallis table, so we compare this value with \(\chi^2_{.05}(2) = 5.9915\). Since the computed statistic exceeds the table value, we can reject the null hypothesis of equal medians.
The computer agrees. The Minitab printout follows.
MTB > Kruskal-Wallis c4 c5.
Kruskal-Wallis Test: CSB0 versus Pay0
Kruskal-Wallis Test on CSB0
Pay0         N  Median  Ave Rank      Z
Bonus       13   27.00      22.0   2.06
Commission  10   28.00      19.1   0.60
Salary      11   24.00      10.8  -2.72
Overall     34              17.5
H = 7.89  DF = 2  P = 0.019
H = 7.97  DF = 2  P = 0.019 (adjusted for ties)
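The same test is easy to reproduce outside Minitab. The sketch below is an addition, not part of the exam solution; it assumes scipy is available and runs scipy.stats.kruskal on the unranked data, which applies the tie correction and so should land near Minitab's adjusted H of about 7.97.

from scipy import stats

commission = [25, 35, 20, 30, 25, 30, 26, 17, 31, 30]
salary     = [25, 25, 22, 20, 25, 23, 22, 25, 20, 26, 24]
bonus      = [28, 41, 21, 31, 26, 26, 46, 32, 27, 35, 23, 22, 26]

h, p = stats.kruskal(commission, salary, bonus)   # tie-corrected H statistic
print(f"H = {h:.2f}, p = {p:.3f}")                # small p: reject equal medians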
d) (Extra Credit) Do a Levene test on these data and explain what it tests and shows. (4)
The Levene test is a test for equal variances. The original data is below.
Row   Commission  Salary  Bonus
  1       25        25      28
  2       35        25      41
  3       20        22      21
  4       30        20      31
  5       25        25      26
  6       30        23      26
  7       26        22      46
  8       17        25      32
  9       31        20      27
 10       30        26      35
 11                 24      23
 12                         22
 13                         26
To find the median, put the data in order and pick the center number or an average of the two center numbers. We find that the median of Commission is 28.00, the median of Salary is 24.00 and the median of Bonus is 27.00. We subtract the column median from each column.
Row   Commission  Salary  Bonus
  1       -3         1       1
  2        7         1      14
  3       -8        -2      -6
  4        2        -4       4
  5       -3         1      -1
  6        2        -1      -1
  7       -2        -2      19
  8      -11         1       5
  9        3        -4       0
 10        2         2       8
 11                  0      -4
 12                         -5
 13                         -1
We now take absolute values.
Row   Commission  Salary  Bonus
  1        3         1       1
  2        7         1      14
  3        8         2       6
  4        2         4       4
  5        3         1       1
  6        2         1       1
  7        2         2      19
  8       11         1       5
  9        3         4       0
 10        2         2       8
 11                  0       4
 12                          5
 13                          1
An ANOVA is done on these 3 columns.
Source      SS    DF    MS      F          F.05                H0
Between    79.2    2   39.6   2.538 ns   F.05(2,31) = 3.30    Column means equal
Within    485.1   31   15.6
Total     564.3   33
Because we cannot reject the null hypothesis, we cannot say that the variances are unequal. As long as we use the Levene test, Minitab agrees. However, the more powerful Bartlett test causes us to reject the null hypothesis of equal variances if the data is Normally distributed.
Test for Equal Variances: Commission, Salary, Bonus
95% Bonferroni confidence intervals for standard deviations
              N    Lower    StDev    Upper
Commission   10  3.45609  5.42525  11.5454
Salary       11  1.36993  2.11058   4.2692
Bonus        13  4.96227  7.41188  13.8626
Bartlett's Test (normal distribution)
Test statistic = 12.69, p-value = 0.002
Levene's Test (any continuous distribution)
Test statistic = 2.53, p-value = 0.096
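For reference, the same Levene (Brown-Forsythe) test can be run in Python; this sketch is an addition and assumes scipy is installed. scipy's levene with center='median' works from absolute deviations from the column medians, exactly as in the hand computation above, so its statistic should come out close to the 2.53 reported by Minitab.

from scipy import stats

commission = [25, 35, 20, 30, 25, 30, 26, 17, 31, 30]
salary     = [25, 25, 22, 20, 25, 23, 22, 25, 20, 26, 24]
bonus      = [28, 41, 21, 31, 26, 26, 46, 32, 27, 35, 23, 22, 26]

w, p = stats.levene(commission, salary, bonus, center='median')
print(f"Levene W = {w:.2f}, p = {p:.3f}")   # large p: cannot reject equal variances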
f) (Extra Credit) Check your results on the computer. The Minitab setup for 1-way ANOVA can either be
unstacked (in columns) or stacked (one column for data and one for column name). There is a Tukey
option. The corresponding non-parametric methods require the stacked presentation. You may want to use
and compare the Statistics pull-down menu and ANOVA or Nonparametrics to start with. You may also
want to try the Mood median test and the test for equal variances which are on the menu categories above.
Even better, do a normality test on your columns. Use the Statistics pull-down menu to find the normality
tests. The Kolmogorov-Smirnov option is actually Lilliefors. The ANOVA menu can check for equality of
variances. In light of these tests was the 1-way ANOVA appropriate? You can get descriptions of unfamiliar
tests by using the Help menu and the alphabetic command list or the Stat guide. (Up to 7) [12]. Most of this
was done above. The printout below is for the Normality tests and Mood’s median test.
————— 4/18/2007 1:42:53 PM ————————————————————
Welcome to Minitab, press F1 for help.
Retrieving worksheet from file: 'C:\Documents and Settings\RBOVE\My
Documents\Minitab\252x07031-03-v0.MTW'
Worksheet was saved on Tue Apr 17 2007
Results for: 252x07031-03-v0.MTW
MTB > Save "C:\Documents and Settings\RBOVE\My Documents\Minitab\252x07031-03-v0.MTW";
SUBC>   Replace.
Saving file as: 'C:\Documents and Settings\RBOVE\My Documents\Minitab\252x07031-03-v0.MTW'
Existing file replaced.
MTB > NormTest c1;
SUBC>   KSTest.
Probability Plot of Commission
MTB > NormTest c2;
SUBC>   KSTest.
Probability Plot of Salary
MTB > NormTest c3;
SUBC>   KSTest.
Probability Plot of Bonus
Unfortunately, this is hard to read in this format. In the little boxes the last item is the p-value for the test
and in each case the p-value is reported as ‘> .150.’ Because this p-value is above any significance level we
might use, we cannot reject the null hypothesis of Normality.
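A rough Python analogue of this Lilliefors check is sketched below; it is an addition, it assumes statsmodels is available, and, as in the Minitab boxes, a large p-value means we cannot reject Normality.

from statsmodels.stats.diagnostic import lilliefors

columns = {
    "Commission": [25, 35, 20, 30, 25, 30, 26, 17, 31, 30],
    "Salary":     [25, 25, 22, 20, 25, 23, 22, 25, 20, 26, 24],
    "Bonus":      [28, 41, 21, 31, 26, 26, 46, 32, 27, 35, 23, 22, 26],
}
for name, data in columns.items():
    stat, p = lilliefors(data, dist='norm')   # Lilliefors (KS-type) test for Normality
    print(f"{name}: D = {stat:.3f}, p = {p:.3f}")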
MTB > WOpen "C:\Documents and Settings\RBOVE\My Documents\Minitab\252x07031-03.MTW".
Retrieving worksheet from file: 'C:\Documents and Settings\RBOVE\My Documents\Minitab\252x07031-03.MTW'
Worksheet was saved on Wed Apr 04 2007
Results for: 252x07031-03.MTW
MTB > print c5 c6
Data Display
Row  Pay0        C6
  1  Commission  14.5
  2  Commission  31.5
  3  Commission   3.0
  4  Commission  26.0
  5  Commission  14.5
  6  Commission  26.0
  7  Commission  20.0
  8  Commission   1.0
  9  Commission  28.5
 10  Commission  26.0
 11  Salary      14.5
 12  Salary      14.5
 13  Salary       7.0
 14  Salary       3.0
 15  Salary      14.5
 16  Salary       9.5
 17  Salary       7.0
 18  Salary      14.5
 19  Salary       3.0
 20  Salary      20.0
 21  Salary      11.0
 22  Bonus       24.0
 23  Bonus       33.0
 24  Bonus        5.0
 25  Bonus       28.5
 26  Bonus       20.0
 27  Bonus       20.0
 28  Bonus       34.0
 29  Bonus       30.0
 30  Bonus       23.0
 31  Bonus       31.5
 32  Bonus        9.5
 33  Bonus        7.0
 34  Bonus       20.0
MTB > Mood C6 c5.
Mood Median Test: C6 versus Pay0
Mood median test for C6
Chi-Square = 11.53  DF = 2  P = 0.003

Pay0         N<=  N>  Median  Q3-Q1
Bonus          3  10    23.0   16.0
Commission     4   6    23.0   15.0
Salary        10   1    11.0    7.5
(Individual 95.0% confidence interval plots omitted.)
Overall median = 17.3
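Mood's median test can also be reproduced in Python. The sketch below is an addition, not part of the exam solution; it applies scipy.stats.median_test to the same rank column used in the printout, counting ties as "below" so the table corresponds to Minitab's N<= and N> columns, and the chi-square should be near the 11.53 shown above.

from scipy.stats import median_test

rank_commission = [14.5, 31.5, 3.0, 26.0, 14.5, 26.0, 20.0, 1.0, 28.5, 26.0]
rank_salary     = [14.5, 14.5, 7.0, 3.0, 14.5, 9.5, 7.0, 14.5, 3.0, 20.0, 11.0]
rank_bonus      = [24.0, 33.0, 5.0, 28.5, 20.0, 20.0, 34.0, 30.0, 23.0,
                   31.5, 9.5, 7.0, 20.0]

stat, p, grand_median, table = median_test(rank_commission, rank_salary,
                                           rank_bonus, ties='below')
print(f"chi-square = {stat:.2f}, p = {p:.3f}, overall median = {grand_median}")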
The Mood test gives us a low p-value, which means that the medians are not all equal. Because we have concluded that the data is Normally distributed, we have to go with the Bartlett test, which says the variances are unequal. If we had only the Levene result and the Normality tests, we could use ANOVA, but the Bartlett test says that we have unequal variances. Under the circumstances, we have a quandary and would probably go with the result of the Mood or Kruskal-Wallis test. In any case, we can conclude that the means are different and that we are probably better off using commissions.
Version 9
MTB > print c11 - c13
Data Display
Row  Co0  Sa0  Bo9
  1   25   25   28
  2   35   25   41
  3   20   22   21
  4   30   20   31
  5   25   25   26
  6   30   23   26
  7   26   22   28
  8   17   25   32
  9   31   20   27
 10   30   26   35
 11        24   23
 12             22
 13             26
MTB > AOVOneway c11-c13;
SUBC>   Tukey 5;
SUBC>   Fisher 5.
One-way ANOVA: Co0, Sa0, Bo9
Source  DF     SS    MS     F      P
Factor   2  143.0  71.5  3.28  0.051   # Note that we do not reject equal means.
Error   31  675.1  21.8
Total   33  818.1
S = 4.667   R-Sq = 17.48%   R-Sq(adj) = 12.15%

Level   N    Mean  StDev
Co0    10  26.900  5.425
Sa0    11  23.364  2.111
Bo9    13  28.154  5.520
(Individual 95% confidence interval plot for the means based on the pooled StDev omitted.)
Pooled StDev = 4.667
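A quick cross-check of this F test is possible in Python; the sketch below (an addition, assuming scipy is installed) runs scipy's one-way ANOVA on the Version 9 columns typed in from the printout and should give something close to F = 3.28, p = 0.051.

from scipy.stats import f_oneway

co0 = [25, 35, 20, 30, 25, 30, 26, 17, 31, 30]
sa0 = [25, 25, 22, 20, 25, 23, 22, 25, 20, 26, 24]
bo9 = [28, 41, 21, 31, 26, 26, 28, 32, 27, 35, 23, 22, 26]

f, p = f_oneway(co0, sa0, bo9)      # one-way ANOVA F test
print(f"F = {f:.2f}, p = {p:.3f}")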
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons
Individual confidence level = 98.04%

Co0 subtracted from:
        Lower  Center  Upper
Sa0    -8.554  -3.536  1.481
Bo9    -3.576   1.254  6.084

Sa0 subtracted from:
        Lower  Center  Upper
Bo9     0.086   4.790  9.495

Fisher 95% Individual Confidence Intervals
All Pairwise Comparisons
Simultaneous confidence level = 88.04%

Co0 subtracted from:
        Lower  Center  Upper
Sa0    -7.695  -3.536  0.622
Bo9    -2.750   1.254  5.257

Sa0 subtracted from:
        Lower  Center  Upper
Bo9     0.891   4.790  8.689

(Interval plots omitted.)
MTB > vartest c11-c13;
SUBC> unstacked.
Test for Equal Variances: Co0, Sa0, Bo9
95% Bonferroni confidence intervals for standard deviations
       N    Lower    StDev    Upper
Co0   10  3.45609  5.42525  11.5454
Sa0   11  1.36993  2.11058   4.2692
Bo9   13  3.69589  5.52036  10.3248
Bartlett's Test (normal distribution)
Test statistic = 8.75, p-value = 0.013
Levene's Test (any continuous distribution)
Test statistic = 2.25, p-value = 0.123
MTB > Stack c11 c12 c13 c14;
SUBC>   Subscripts c15;
SUBC>   UseNames.
MTB > rank c14 c16
MTB > unstack c16 c17 c18 c19;
SUBC>   subscripts c15;
SUBC>   varnames.
MTB > Kruskal-Wallis c14 c15
Kruskal-Wallis Test: CSB9 versus Pay 9
Kruskal-Wallis Test on CSB9
Pay 9    N  Median  Ave Rank      Z
Bo9     13   27.00      21.6   1.88
Co0     10   28.00      19.6   0.79
Sa0     11   24.00      10.8  -2.72
Overall 34              17.5
H = 7.64  DF = 2  P = 0.022
H = 7.73  DF = 2  P = 0.021 (adjusted for ties)
MTB > print c11 c17 c12 c18 c13 c19
Data Display
Row  Co0  C16_Bo9  Sa0  C16_Co0  Bo9  C16_Sa0
  1   25     24.5   25     14.5   28     14.5
  2   35     34.0   25     32.5   41     14.5
  3   20      5.0   22      3.0   21      7.0
  4   30     29.5   20     27.0   31      3.0
  5   25     20.0   25     14.5   26     14.5
  6   30     20.0   23     27.0   26      9.5
  7   26     24.5   22     20.0   28      7.0
  8   17     31.0   25      1.0   32     14.5
  9   31     23.0   20     29.5   27      3.0
 10   30     32.5   26     27.0   35     20.0
 11           9.5   24            23     11.0
 12           7.0                 22
 13          20.0                 26
MTB > copy c11-c13 c31-c33
* NOTE * Column lengths not equal.
We are setting up for a Levene test.
MTB > name k31 'medco0a'
MTB > name k32 'medsal0a'
MTB > name k33 'medbon9a'
MTB > let k31 = median(c31)
MTB > let k32 = median(c32)
MTB > let k33 = median(c33)
MTB > print k31-k33
Data Display
medco0a    28.0000
medsal0a   24.0000
medbon9a   27.0000
MTB > let c31 = c31 - k31
MTB > let c32 = c32 - k32
MTB > let c33 = c33 - k33
MTB > describe c31 - c33
Descriptive Statistics: Co0a, Sal0a, Bon9a
Variable   N  N*    Mean  SE Mean  StDev  Minimum      Q1  Median     Q3  Maximum
Co0a      10   0   -1.10     1.72   5.43   -11.00   -4.25    0.00   2.25     7.00
Sal0a     11   0  -0.636    0.636  2.111   -4.000  -2.000   0.000  1.000    2.000
Bon9a     13   0    1.15     1.53   5.52    -6.00   -2.50    0.00   4.50    14.00
MTB > print c31 - c33
Data Display
Row  Co0a  Sal0a  Bon9a
  1    -3     1      1
  2     7     1     14
  3    -8    -2     -6
  4     2    -4      4
  5    -3     1     -1
  6     2    -1     -1
  7    -2    -2      1
  8   -11     1      5
  9     3    -4      0
 10     2     2      8
 11           0     -4
 12                 -5
 13                 -1
MTB > let c31 = abs(c31)
MTB > let c32 = abs(c32)
MTB > let c33 = abs(c33)
MTB > print c31-c33
Data Display
Row  Co0a  Sal0a  Bon9a
  1     3     1      1
  2     7     1     14
  3     8     2      6
  4     2     4      4
  5     3     1      1
  6     2     1      1
  7     2     2      1
  8    11     1      5
  9     3     4      0
 10     2     2      8
 11           0      4
 12                  5
 13                  1
MTB > AOVO c31 - c33
One-way ANOVA: Co0a, Sal0a, Bon9a
Source  DF      SS     MS     F      P
Factor   2   42.24  21.12  2.25  0.123   # This is the last act of the Levene test. The high p-value
Error   31  291.20   9.39                # means we cannot reject the null hypothesis of equal means of
Total   33  333.44                       # the absolute deviations, i.e. of equal variances.
S = 3.065   R-Sq = 12.67%   R-Sq(adj) = 7.03%

Level   N   Mean  StDev
Co0a   10  4.300  3.199
Sal0a  11  1.727  1.272
Bon9a  13  3.923  3.904
(Individual 95% confidence interval plot for the means based on the pooled StDev omitted.)
Pooled StDev = 3.065