Trinity College Dublin
Diploma in Quality Improvement
Computer Laboratory 3

Analysis of a fractional factorial experiment, a blocked factorial experiment, a split plots experiment.

Invitations to consider the results of Minitab analysis and their statistical and substantive interpretations are printed in italics. Take some time for this; consult your neighbour or tutor. Enter your responses in a Word document, as a log of your work and as draft contributions to a report on the experiment and its analysis.

Learning Objectives

On completion of this laboratory, students should be able to
− set up and analyse 2-level factorial experiments in Minitab;
− interpret effects plots, both formally and informally;
− apply the Pareto Principle informally;
− use simulation to set up reference plots for Normal effects plots;
− produce and interpret numerical summaries of effects;
− fit reduced models to provide error estimates as a basis for formal inference;
− produce and interpret main effect and interaction plots;
− produce stratified two-factor interaction plots to assist with interpretation of 3-factor interactions;
− interpret diagnostic plots and relate the results to effects patterns;
− make and interpret a categorised table of summary statistics;
− prepare management reports based on results of detailed analyses;
− explain the make up of a split plots experiment;
− manipulate the raw data to produce a randomised blocks layout for the whole plots factor;
− implement a Minitab analysis of the randomised blocks;
− report on the results of the analysis;
− implement a Minitab analysis of the full split plots experiment;
− relate the whole plots analysis to the split plots analysis;
− explain the make up of the analysis of variance table associated with the split plots experiment;
− explain the Minitab "Expected Mean Squares" output and relate it to the analysis of variance;
− provide a comprehensive report on the analysis of results of the split plots experiment with tables and graphs, analysis of variance, comparisons of treatments with control and pairwise comparisons of treatments.

Exercise 1: Analysis of a 2^(5-1) fractional factorial experiment

The problem

A food processing company was having a problem with its soup mix packet filling machine; it was inconsistent in the weights it delivered, with the result that excessive overfill was needed to ensure achieving the nominal weight. The problem was isolated to the uneven flow of a component mix of vegetable oil, salt and other ingredients that was added to the soup mix during the mixing process. Five factors were identified that were thought to influence this part of the process and experimental settings were agreed, as follows:

A: the number of ports for adding oil to the mix,    1 or 3,
B: mixer vessel temperature,                         ambient or water cooled,
C: mixing time,                                      60 seconds or 80 seconds,
D: batch weight,                                     1500 lbs or 2000 lbs,
E: delay between mixing and packaging,               1 day or 7 days.

A 2^(5-1) experiment was run, with the 16 chosen factor level combinations being run in random order. The response variable was a measure of the spread of the actual weights of 5 sample packets, one taken every 15 minutes, during the production run for each factor level combination. The results, together with the run order, are shown below.

Standard Order  RunOrder   A  B  C  D  E     Y
       1            5      –  –  –  –  +   0.97
       2            3      +  –  –  –  –   1.70
       3           12      –  +  –  –  –   1.13
       4           13      +  +  –  –  +   1.25
       5           16      –  –  +  –  –   1.18
       6           14      +  –  +  –  +   0.98
       7            6      –  +  +  –  +   1.47
       8            4      +  +  +  –  –   1.28
       9            7      –  –  –  +  –   1.85
      10           10      +  –  –  +  +   0.62
      11            1      –  +  –  +  +   0.78
      12           15      +  +  –  +  –   1.36
      13            9      –  –  +  +  +   0.76
      14            8      +  –  +  +  –   2.10
      15           11      –  +  +  +  –   1.09
      16            2      +  +  +  +  +   1.10

Confirm that the sign pattern for allocating levels of Factor E to design points is the product of the corresponding sign patterns for the other four factors.

Minitab set-up

These data are stored in an Excel file named Soup.xls in the module webpage; copy and paste into Minitab.
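Once the data have been copied, the sign-pattern check requested above can also be done outside Minitab. The following is a minimal Python sketch, not part of the original laboratory; the sign patterns are transcribed from the table of results in standard order, and the helper name to_signs is hypothetical.

```python
# Sign patterns for the 16 runs, in standard order, transcribed from the
# table of results ('-' = low level, '+' = high level).
A = "-+-+-+-+-+-+-+-+"
B = "--++--++--++--++"
C = "----++++----++++"
D = "--------++++++++"
E = "+--+-++--++-+--+"


def to_signs(pattern):
    """Convert a string of '-' and '+' into a list of -1/+1 values."""
    return [1 if ch == "+" else -1 for ch in pattern]


a, b, c, d, e = (to_signs(p) for p in (A, B, C, D, E))

# Row-by-row product of the A, B, C and D signs.
abcd = [w * x * y * z for w, x, y, z in zip(a, b, c, d)]
generator_holds = abcd == e
print("E = ABCD at every design point:", generator_holds)

# One consequence of E = ABCD: the DE column coincides with the ABC column,
# so the two interactions cannot be separated in this half fraction.
de = [x * z for x, z in zip(d, e)]
abc = [w * x * y for w, x, y in zip(a, b, c)]
print("DE column equals ABC column:", de == abc)
```

The second comparison illustrates the kind of entry to look for in the Alias Structure that Minitab reports.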
Use the Stat / DOE / Factorial / Analyze Factorial Design to set up the design and analyse the data. In setting up the analysis,
• insert – and + as the Low and High values for factors,
• from Graphs, select the Normal and Pareto effects plots, ignore residuals,
• from Results, select Coefficients and ANOVA table (the latter being useless!).

Formal analysis

Identify the statistically significant / active effects and the values of their estimates. Noting the Alias Structure in the Session window output, identify what is actually estimated in each case.

Discuss the application of the Pareto Principle in this case; does it influence your choice of active effects?

Use simulation to check the interpretation of the Normal effects plot:
• use the File menu to open a new Minitab worksheet (and keep the existing one open),
• use the Calc menu to generate 15 standard Normals into C1,
• use the Graph menu to produce a Normal plot of the 15 values,
  − when choosing the distribution, click on the Data Display tab and uncheck "Show confidence interval",
• right-click in the finished graph and select Update Graph Automatically,
• use the Calc menu again to generate a second sample of random numbers into C1,
• use Ctrl+E to simulate repeated samples.

In the light of these simulated reference samples and their Normal plots, how do you interpret the Normal effects plot (with specific reference to the DE interaction effect)?
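The simulation steps above can also be sketched outside Minitab. The following minimal Python sketch, which is an illustration and not Minitab's own procedure, draws repeated samples of 15 standard Normal values and pairs the sorted values with Normal scores. The function name normal_plot_points, the seed, and the plotting positions (i - 0.5)/n are assumptions; Minitab's default plotting positions may differ slightly.

```python
import random
from statistics import NormalDist


def normal_plot_points(values):
    """Pair each sorted value with a standard Normal score for a Normal plot."""
    n = len(values)
    # Plotting positions (i - 0.5)/n are one common convention (an assumption
    # here; other conventions give slightly different scores).
    scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(scores, sorted(values)))


rng = random.Random(1)  # arbitrary seed, used only to make reruns repeatable

# Three reference samples of 15 pure-noise "effects".
for rep in range(1, 4):
    sample = [rng.gauss(0.0, 1.0) for _ in range(15)]
    points = normal_plot_points(sample)
    smallest, largest = points[0][1], points[-1][1]
    print(f"sample {rep}: smallest {smallest:.2f}, largest {largest:.2f}")
```

Plotting the (score, value) pairs for several samples shows how far the extreme points of a pure-noise sample can stray from a straight line, which is the reference needed when judging the extreme points of the Normal effects plot.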
Graphical and numerical summaries

Close the new worksheet (without saving), thereby returning to the original data window, and use the DOE command to make relevant interaction plots,
• navigate to the DOE Factorial Plots dialog window,
• check Interaction Plot, click on Setup, add two relevant Available factors to the Selected factors box, click OK, OK,
• repeat to get the second relevant two factor interaction plot,
• ensure that both plots have the same scale of the vertical axis:
  − double click each scale in turn,
  − if necessary, adjust Position of ticks in one or both scales,
  − establish the maxima and minima, then set the most inclusive maximum and minimum pair in both scales,
• edit the graph titles, axis labels, legends, to improve interpretability by, e.g., replacing letters with names:
  − double click each in turn.

Use the Analyze Factorial Design feature to produce corresponding numerical summaries:
• navigate to the Analyze Factorial Design dialog window,
• click on Results,
• under Display of Least Squares Means, select relevant terms for display,
• click OK, OK.

Interpretation and report for management

Provide an interpretation of these plots and numerical summaries for management.

Noting that reducing Y is desirable, identify the best operating conditions.

Assuming that a seven day delay in filling is not feasible, identify the next best.

Record your analysis in a Word document, including plots and verbal descriptions.

Model fitting

Given the parsimonious model implicit in the above analysis, re-analyse the raw data using the DOE command with a reduced set of Selected terms:
• navigate to the Analyze Factorial Design dialog window,
• click on Terms,
• clear the Selected Terms box and add back relevant terms, including all main effects corresponding to two-factor interactions added,
• click on Graphs and include residual plots,
• click OK.

Review the diagnostic plots.

Iterate the analysis in light of the diagnostics.
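For orientation, the arithmetic behind a reduced-model fit can be sketched in Python. This is a minimal illustration and not Minitab output: the choice of terms (D, E and the DE interaction, with the main effects included because the interaction is) is an assumption made only to show the calculation, and the function name effect is hypothetical. The data are transcribed from the table of results.

```python
# D and E sign patterns and responses, in standard order, transcribed from
# the table of results.
D = "--------++++++++"
E = "+--+-++--++-+--+"
Y = [0.97, 1.70, 1.13, 1.25, 1.18, 0.98, 1.47, 1.28,
     1.85, 0.62, 0.78, 1.36, 0.76, 2.10, 1.09, 1.10]

d = [1 if ch == "+" else -1 for ch in D]
e = [1 if ch == "+" else -1 for ch in E]
de = [x * z for x, z in zip(d, e)]


def effect(signs, y):
    """Effect estimate: mean response at '+' minus mean response at '-'."""
    plus = [v for s, v in zip(signs, y) if s > 0]
    minus = [v for s, v in zip(signs, y) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


# Illustrative (assumed) reduced model: D, E and DE only.
effects = {"D": effect(d, Y), "E": effect(e, Y), "DE": effect(de, Y)}

# Cell means for the four D x E combinations; in a balanced design these
# play the role of the least squares means for the DE term.
cell_means = {}
for d_level in (-1, 1):
    for e_level in (-1, 1):
        cell = [v for x, z, v in zip(d, e, Y) if (x, z) == (d_level, e_level)]
        cell_means[(d_level, e_level)] = sum(cell) / len(cell)

# Fitted values: grand mean plus half of each effect times its sign.
grand_mean = sum(Y) / len(Y)
fitted = [grand_mean
          + effects["D"] / 2 * x
          + effects["E"] / 2 * z
          + effects["DE"] / 2 * x * z
          for x, z in zip(d, e)]
residuals = [v - f for v, f in zip(Y, fitted)]

for term, value in effects.items():
    print(f"{term}: {value:.4f}")
print("largest absolute residual:", round(max(abs(r) for r in residuals), 4))
```

With a different set of selected terms, only the effects dictionary and the fitted-value line change; the residuals are what the diagnostic plots display.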
Note that the simplest way to exclude a single case is to replace its Y value with an asterisk (*). If need be, the value can be restored later.

Revised interpretation and report for management

Interpret the final analysis.

Adjust the numerical summaries in your Management Report accordingly.

Add relevant confidence intervals, including a confidence interval (and significance test) for the difference between means under the competing operating conditions.

Exercise 2: Cambridge grassland experiment

A classic split plots experiment was carried out on the Cambridge University Farm in 1931. The original plan was to investigate the effects of two new cultivation treatments of grassland pasture, by comparison with no treatment. The new treatments were use of a grassland "Rejuvenator" (R) and use of a conventional harrow (H). These treatments were compared with no treatment, taken to be a control (C). The experiment was carried out on a field of old pasture, laid out in plots consisting of strips 4 yards wide and 45 yards long. The treatments were applied to a block of 3 adjacent plots, in a randomly chosen order. The experiment was replicated in a total of 6 independently randomised blocks, placed side by side.

In addition to the randomised blocks layout for investigating treatment effects, each 45 yard plot was subdivided into 4 subplots, each of which was treated with one of four fertilisers: Farmyard manure (F), Straw (S), Artificial fertiliser (A) and no treatment, acting as Control (C). The fertilisers were also applied in random order to the subplots within each whole plot, independently.

Yield was recorded in pounds (lbs) of green produce from a single cut of each subplot made on June 31, 1931.

The data are shown in the following table and are available in Grassland.xls on the course website. The main plots correspond to the columns in the table, the subplots to the entries within each column.
              Block 1            Block 2            Block 3
            C    H    R        C    H    R        C    H    R
   A      266  213  208      210  222  266      220  184  184
   C      165  127  155      150  167  163      155  118  153
   F      198  180  200      247  203  228      190  168  174
   S      184  127  150      188  167  157      140  128  141

              Block 4            Block 5            Block 6
            C    H    R        C    H    R        C    H    R
   A      216  178  207      202  175  184      169  142  151
   C      159  125  135      147  118   98      132  104   69
   F      225  149  162      184  175  144      164  145  116
   S      174  107  113      154  112  113      116   89  101

The organisation of the data in subplots within whole plots, with the independent randomisations within whole plots, means that the comparison of treatments C, H and R is assessed by reference to the variation at whole plot level, effectively by way of a randomised blocks analysis of treatments C, H and R in blocks 1 to 6, while comparison of fertilisers, including their interactions with treatments, is assessed by reference to the variation at subplot level.

The analysis will be reported in two stages, with the randomised blocks layout of cultivation treatments in the whole plots analysed initially, followed by the full split plots analysis.

The data may be copied into a new Minitab project; save the old project or not, as desired.

Whole plots analysis

To set up the whole plots analysis, first calculate the whole plot yields by summing the four subplot yields in each whole plot. To do this, copy the data to Minitab, rename the Block, Treatment, Fertiliser and Yield columns as B, T, F and Y, respectively, then
• from the Data menu, select Unstack Columns,
  − Unstack the data in Y,
  − Using subscripts in F,
  − Store unstacked data after last column in use,
  − click OK,
• from the Calc menu, select Calculator, calculate the sum of the four unstacked columns and store the result in the next available column, click OK,
• name the new column WY (for Whole plot Yield).
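The unstack-and-sum step can be sketched in Python as a cross-check on the Minitab result. This is a minimal sketch assuming the stacked B, T, F, Y layout described above; the function name whole_plot_yields is hypothetical, and only the control (C) whole plots of Blocks 1 and 2 are transcribed from the data table, to keep the example short.

```python
from collections import defaultdict

# Stacked rows (B, T, F, Y) for two whole plots only: the control (C)
# cultivation plots of Blocks 1 and 2, transcribed from the data table.
rows = [
    (1, "C", "A", 266), (1, "C", "C", 165), (1, "C", "F", 198), (1, "C", "S", 184),
    (2, "C", "A", 210), (2, "C", "C", 150), (2, "C", "F", 247), (2, "C", "S", 188),
]


def whole_plot_yields(stacked_rows):
    """Unstack Y by fertiliser within each (block, treatment) whole plot,
    then sum the four subplot yields to give the whole plot yield WY."""
    unstacked = defaultdict(dict)  # (B, T) -> {F: Y}
    for block, treatment, fertiliser, y in stacked_rows:
        unstacked[(block, treatment)][fertiliser] = y
    return {plot: sum(by_fert.values()) for plot, by_fert in unstacked.items()}


wy = whole_plot_yields(rows)
for (block, treatment), total in sorted(wy.items()):
    print(f"Block {block}, treatment {treatment}: WY = {total}")
```

Applied to all 72 rows, the same function should return 18 whole plot yields, one for each block and treatment combination, matching the WY column.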
Next, enter the appropriate Block and Treatment codes in two further columns:
• name the next two columns WB and WT, respectively,
• from the Calc menu, select Make Patterned Data, then Simple Set of Numbers,
  − Store patterned data in WB,
  − From first value 1,
  − To last value 6,
  − Number of times to list each value, 3,
  − click OK,
• from the Calc menu, select Make Patterned Data, then Text Values,
  − Store patterned data in WT,
  − enter text values C H R,
  − Number of times to list each value, 1,
  − Number of times to list the sequence, 6,
  − click OK.

Identify the entries in the Y column corresponding to the first four entries in the WY column.

Check that the Block and Treatment codes in columns B and T and columns WB and WT, respectively, correspond.

Check that the entries in the WY column equal the sums of the corresponding entries in the Y column.

Next, implement a randomised blocks analysis of the data in WY:
• from the Stat menu, select ANOVA, then General Linear Model,
  − enter WY as Response, WB WT as Model, WB as Random factor,
• click the Graphs button, check
  − Deleted Residual for Plots,
  − Normal plot of residuals,
  − Residuals versus fits,
  − click OK,
• click the Results button, check
  − Analysis of variance table,
  − Display expected mean squares,
  − click OK,
• click OK.

Provide a short report on the analysis of variance.

Confirm that the values of the F ratios reflect the entries in the table of expected mean squares.

Comment on the diagnostics. What action would you recommend?

Re-analyse the data, this time adding the WB*WT interaction to the model and using the Factor Plot option to add the corresponding interaction plot.

Discuss correspondences between the two analyses.

Discuss the role of the interaction term.

Comment on the interaction plot and the consequences of finding an interaction effect.

Split plots analysis

Write down the model for the full split plot analysis, separating terms appropriate to the whole plots and the subplots.
Implement the split plot analysis:
• from the Stat menu, select ANOVA, then General Linear Model,
  − enter Y as Response, enter the model terms, enter B as a Random factor,
• click the Graphs button, check
  − Deleted Residual for Plots,
  − Normal plot of residuals,
  − Residuals versus fits,
  − click OK,
• click the Results button, check
  − Analysis of variance table,
  − Display expected mean squares,
  − click OK,
• click OK.

Compare the randomised blocks analysis with the top half of the split plots analysis. What is the connection? Why?

Provide a short report on the analysis of variance.

Confirm that the values of the F ratios for treatments, fertilisers and treatment by fertiliser interactions reflect the entries in the table of expected mean squares.

Comment on the diagnostics. What action(s) would you recommend?

Recalculate the analysis, this time selecting factor plots and "Least Squares Means" corresponding to the active effects.

Provide a management report with discussion of comparisons of treatments with control and pairwise comparisons of treatments, accompanied by relevant tables and graphs.

Calculate standard t-tests and corresponding confidence intervals for the comparison of the new grassland treatments with control and for the comparison of each of the three fertilisers with control, using the appropriate estimate of standard deviation in each case.

Comment on the relative precision levels of the comparisons.

Note that Minitab will not compute confidence intervals for multiple comparisons between the treatment effects in this experiment, because the block effects are regarded as random. In fact, the computation is quite straightforward, provided the correct standard deviations are used in the standard error formulas. Thus, the B x T root mean square is used with the T effects and the B x F root mean square with the F effects. Details of the computations involved in Tukey's method are given in Mullins (2003), pp. 320-1.
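As a reminder of the form of the standard t-test and confidence interval for a difference between two means, the calculation can be written as below. This is a sketch under stated assumptions, not a substitute for the Tukey computations in the reference: it assumes that the mean squares are taken from the split plots analysis of the subplot yields Y, that each mean is an average of n subplot yields, and that a 95% interval is wanted. The symbols MS_E, n and ν are notation introduced here for illustration.

```latex
\[
\mathrm{SE}\left(\bar{y}_i - \bar{y}_j\right) = \sqrt{\frac{2\,\mathrm{MS}_E}{n}},
\qquad
t = \frac{\bar{y}_i - \bar{y}_j}{\mathrm{SE}\left(\bar{y}_i - \bar{y}_j\right)},
\qquad
\left(\bar{y}_i - \bar{y}_j\right) \pm t_{\nu;\,0.975}\,\mathrm{SE}\left(\bar{y}_i - \bar{y}_j\right)
\]
```

Under those assumptions, comparisons of cultivation treatments use MS_E = MS(B x T) with ν = (6 - 1)(3 - 1) = 10 degrees of freedom and n = 24 (6 blocks by 4 subplots per treatment), while comparisons of fertilisers use MS_E = MS(B x F) with ν = (6 - 1)(4 - 1) = 15 degrees of freedom and n = 18 (6 blocks by 3 whole plots per fertiliser).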
Exercise 3: Analysis of a 2^4 factorial experiment run in two blocks as two 2^(4-1) fractional factorial experiments.

A widely quoted study carried out at the UK Agricultural Experimental Station at Rothamsted, north of London (rebranded as Rothamsted Research since 2002), was concerned with the effect of combinations of different fertilisers on the yield of beans, measured in pounds. Four fertilisers were to be studied: Dung (D), Nitrochalk (N), SuperPhosphate (P) and Muriate of Potash (Potassium, K). The levels of each factor were as follows:

                           Low     High
Dung (D):                  none    10 tons per acre
Nitrochalk (N):            none    0.4 cwt per acre
SuperPhosphate (P):        none    0.6 cwt per acre
Muriate of Potash (K):     none    1.0 cwt per acre

The experiment was organised in two blocks of 8 plots each, with a half fraction of the 2^4 design being run in each block. The run order was randomised within each block. Details of the experimental design and results follow below.

Check the product of signs of D, N and P; note the correspondence with K.

Check the product of signs of the four factors; note the correspondence with blocks.

The data are stored in an Excel file named BeanYields.xls in the module webpage. Copy the data into Minitab.
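The two sign checks can also be carried out outside Minitab. The following minimal Python sketch is an illustration only; the sign patterns and yields are transcribed from the table of design details and results for this exercise, and the helper name signs is hypothetical. The same transcription also gives the two block means, whose difference is asked for later in the exercise.

```python
def signs(pattern):
    """Convert a string of '-' and '+' into a list of -1/+1 values."""
    return [1 if ch == "+" else -1 for ch in pattern]


# D, N and P follow the same pattern in both blocks; K and the yields differ.
D = signs("-+-+-+-+")
N = signs("--++--++")
P = signs("----++++")
K = {1: signs("+--+-++-"), 2: signs("-++-+--+")}
YIELD = {1: [55, 53, 42, 41, 45, 55, 36, 48],
         2: [58, 43, 44, 41, 51, 50, 50, 44]}

dnp = [d * n * p for d, n, p in zip(D, N, P)]

# Within each block, collect the distinct values of the product D*N*P*K.
# A single value of -1 means K = -DNP in that block; +1 means K = +DNP.
four_factor_product = {
    block: {x * k for x, k in zip(dnp, k_signs)}
    for block, k_signs in K.items()
}
print("distinct values of DNPK by block:", four_factor_product)

# Block means and their difference (Block 2 minus Block 1).
block_means = {block: sum(y) / len(y) for block, y in YIELD.items()}
block_difference = block_means[2] - block_means[1]
print("block means:", block_means, "difference:", block_difference)
```

A single distinct value within each block confirms that the four-factor interaction is the contrast used to form the blocks.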
Block   D  N  P  K   RunOrder   Yield
  1     –  –  –  +       2        55
  1     +  –  –  –       3        53
  1     –  +  –  –       8        42
  1     +  +  –  +       5        41
  1     –  –  +  –       1        45
  1     +  –  +  +       7        55
  1     –  +  +  +       4        36
  1     +  +  +  –       6        48

  2     –  –  –  –       6        58
  2     +  –  –  +       3        43
  2     –  +  –  +       2        44
  2     +  +  –  –       7        41
  2     –  –  +  +       4        51
  2     +  –  +  –       1        50
  2     –  +  +  –       8        50
  2     +  +  +  +       5        44

Use the DOE command to analyse, first setting up the design:
• select D, N, P, K as the factors,
• enter "-" and "+" as the factor levels (Low/High),
• enter the Design details: Standard Order as order of the data (or C1), Run Order as in C7, Blocks as in C6; click OK,
• select Y as the response,
• click on Terms, confirm Default selected terms and "Block included in the model",
• click on Graphs and check Normal and Pareto Effects Plots,
• click on Results and check Coefficients and ANOVA table,
• click OK.

Interpret the Normal effects plot and the Pareto effects plot. What effects do you think you should recognise as active? Check the values of the active effects.

Is there a block effect? Note that this is twice the Block coefficient as reported by Minitab. Check this by calculating the difference of the block means directly.

Assuming that inactive effects reflect chance variation, repeat the analysis:
• in the Terms window, remove all terms from the selected terms list,
• then add active terms, including all lower order terms corresponding to higher order interactions,
• exclude the Block effect, click OK,
• in the Graphs window, uncheck the Effects Plots, check appropriate Residual Plots,
• click OK, OK.

Review the diagnostics. Review your earlier interpretations.

The active effects are most effectively displayed using interaction plots for two of the active factors for each level of the third.
The most convenient way to do this involves splitting the worksheet in two and using the Interactions Plot subcommand of the ANOVA command:
• select Split Worksheet from the Data menu,
• enter N (say) in the "By variables" box,
• click OK,
• from the Stat menu, select ANOVA, then Interactions Plot,
• enter Y as the Response, P and K as the Factors,
• click OK,
• switch to the other half worksheet,
• enter Ctrl + E,
• click OK.

When the two plots are produced, ensure that the vertical axes have the same scales, including maximum, minimum and tick marks. To do this, double click each scale in turn, establish the maxima and minima, then set the most inclusive maximum and minimum pair in both scales.

To assist in interpreting the plots, produce a corresponding table:
• switch to the original worksheet,
• enter Ctrl + E,
• click on Results, under Display of Least Squares Means, transfer BCD (that is, N*P*K, if Minitab has lettered the factors D, N, P, K as A, B, C, D) from Available Terms to Selected Terms, click OK,
• click on Graphs, deselect Residual Plots, click OK,
• click OK.

Provide a detailed interpretation of the interactions.

Make recommendations for fertiliser levels to apply for increased yield.

This ends the formal laboratory.

Review the Learning Objectives. Have they been achieved?