D:\687291973.doc Version 18.05.2004

THE USE OF STATGRAPHICS 5.0

IN AND OUTPUT
1. Import of Data Files
2. Input of Data
3. Modification and Output of Results

MULTIPLE COMPARISONS
1. Import of Data
2. Visualization of Data
3. Test for Outlier
4. Test for Homogeneity of Variances
5. Test for Normal Distribution
6. ANOVA - Is there any difference between the samples?
7. MULTIPLE RANGE TESTS - Which samples are different?

VARIANCE COMPONENTS (Nested Designs): error of sampling, analysis
1. Import of Data
2. Visualization of Data
3. Test for Outlier
4. Test for Homogeneity of Variances
5. Test for Normal Distribution
6. ESTIMATE VARIANCE COMPONENTS

EXPERIMENTAL DESIGNS
1. Create Experiment
2. Run Experiments
3. Enter Data
4. Analyze Data

FILES:
In and Output           IO.XLS, REGR.TXT
Anova                   HVA.CSV
Experimental Designs    TAU.XLS, TAU1.SFX

IN AND OUTPUT

1. IMPORT OF DATA FILES
Configuration: \Windows \Control Panel \Regional Settings \Number
(German Windows: \Systemsteuerung \Ländereinstellung \Zahl):
  Decimal symbol (Dezimalzeichen) ........................ .
  Digit grouping symbol (Symbol f. Zifferngruppierung) ... blank
  List separator (Listentrennzeichen) .................... ;
File /Open Data File /Files of type (Dateityp): All Files (*.*)
  Excel:     /IO.XLS                      /Variable Names from first row
  CSV:       /HVA.CSV   comma delimited   /Variable Names from first row
  Textfile:  /REGR.TXT  tab delimited     /Variable Names from first row

2. INPUT OF DATA
Mark column /right mouse button /Modify Column
  x    1      2      3     4     5
  y    11.5   12.4   13    16    17
File /Save Data File as: REGR.SF

3. MODIFICATION AND OUTPUT OF RESULTS
ANALYSIS
File /Open Data File /REGR.SF
Relate /Simple Regression
Tabular Options: Analysis Summary
Graphical Options: Plot fitted model, Residuals vs.
x

MODIFICATION
Click window 2x with left mouse button
Click element with right mouse button /Options

OUTPUT
a) to Statreporter: click window with right mouse button /Copy to Statreporter
b) from Statreporter to Winword: copy and paste
Text window: click window 2x with left mouse button /mark text /Icon Cut -> Winword: insert as text
(click window 2x with left mouse button /Icon Copy -> Winword: insert as object)
Save graphic to file: Save Graph as regr.wmf
Without colours: Graphics\Options\Profile: Black and White
                 File\PageSetup: Black and White

Regression Analysis - Linear model: Y = a + b*X
Dependent variable: Y
Independent variable: X

                             Standard        T
Parameter      Estimate      Error           Statistic      P-Value
-----------------------------------------------------------------------------
Intercept      9.6           0.73964         12.9793        0.0010
Slope          1.46          0.22301         6.5468         0.0072
-----------------------------------------------------------------------------

Analysis of Variance
Source           Sum of Squares    Df    Mean Square    F-Ratio    P-Value
-----------------------------------------------------------------------------
Model            21.316            1     21.316         42.86      0.0072
Residual         1.492             3     0.497333
-----------------------------------------------------------------------------
Total (Corr.)    22.808            4

Correlation Coefficient = 0.966739
R-squared = 93.4584 percent
Standard Error of Est. = 0.705219

The StatAdvisor
The output shows the results of fitting a linear model to describe the relationship between Y and X. The equation of the fitted model is:
Y = 9.6 + 1.46*X
Since the P-value in the ANOVA table is less than 0.01, there is a statistically significant relationship between Y and X at the 99% confidence level. The R-Squared statistic indicates that the model as fitted explains 93.4584% of the variability in Y. The correlation coefficient equals 0.966739, indicating a relatively strong relationship between the variables.
The standard error of the estimate shows the standard deviation of the residuals to be 0.705219. This value can be used to construct prediction limits for new observations by selecting the Forecasts option from the text menu.

[Figure: Plot of Fitted Model (Y vs. X)]

MULTIPLE COMPARISONS

PROBLEM: Which of 5 products differ in their moisture content? Each product is analysed 5 times. The averages of the products are compared by multiple range tests.

1. IMPORT OF DATA
HVA.CSV (comma delimited): the variables PROBE and NR must be sorted!

PROBE    NR    TS
1        1     90.91
1        7     90.60
1        11    90.40
1        13    90.52
1        15    90.77
2        2     90.79
2        5     90.36
2        8     90.32
2        18    90.59
2        21    90.51
3        4     90.27
3        12    90.37
3        17    90.38
3        20    90.31
3        24    90.49
4        6     90.57
4        9     90.82
4        14    90.63
4        19    90.98
4        23    90.44
5        3     90.24
5        10    90.35
5        16    90.15
5        22    90.08
5        25    90.46

2. VISUALIZATION OF DATA
a) Test for Trend: Data versus sequence of measurements
Icon Scatterplot: NR->X TS->Y PROBE->Select
Pane Options: PROBE->Point Codes, Points+Lines

[Figure: Plot of TS vs NR, points coded by PROBE 1 to 5]

b) Visual test for Outlier and Distribution
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable

Scatterplot
Graphic Options: Scatterplot

[Figure: Scatterplot by Level Code (TS vs. PROBE)]

Box-and-Whisker Plot
Graphic Options: Box-and-Whisker-Plot
Pane Options: vertical

[Figure: Box-and-Whisker Plot (TS vs. PROBE)]

3. TEST FOR OUTLIER
Grubbs: PW = |x_i - av(x_i)| / s < T(replications; )

4. TEST FOR HOMOGENEITY OF VARIANCES
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Variance Check

Variance Check
Cochran's C test:    0.297977    P-Value = 0.99813
Bartlett's test:     1.19452     P-Value = 0.519824
Hartley's test:      6.5         PW = s_max^2 / s_min^2 < T( ; samples; replications-1)

The StatAdvisor
The three statistics displayed in this table test the null hypothesis that the standard deviations of TS within each of the 5 levels of PROBE is the same. Of particular interest are the two P-values. Since the smaller of the P-values is greater than or equal to 0.05, there is not a statistically significant difference amongst the standard deviations at the 95.0% confidence level.

5. TEST FOR NORMAL DISTRIBUTION
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Summary Statistics \Pane Options: Selection of parameters
Normal distribution is assumed if standardized skewness and standardized kurtosis lie within +/-2.

Summary Statistics for TS

PROBE    Count    Average
1        5        90.64
2        5        90.514
3        5        90.364
4        5        90.688
5        5        90.256
------------------------------------------------------------
Total    25       90.4924

PROBE    Variance     Standard deviation
1        0.04085      0.202114
2        0.03583      0.189288
3        0.00698      0.0835464
4        0.04537      0.213002
5        0.02323      0.152414
------------------------------------------------------------
Total    0.0530607    0.230349

PROBE    Minimum    Maximum
1        90.4       90.91
2        90.32      90.79
3        90.27      90.49
4        90.44      90.98
5        90.08      90.46
------------------------------------------------------------
Total    90.08      90.98

PROBE    Range    Stnd. skewness
1        0.51     0.288577
2        0.47     0.589419
3        0.22     0.663105
4        0.54     0.39776
5        0.38     0.287198
------------------------------------------------------------
Total    0.9      0.899742

PROBE    Stnd. kurtosis    Sum
1        -0.530679         453.2
2        -0.178296         452.57
3        0.314799          451.82
4        -0.446946         453.44
5        -0.589814         451.28
------------------------------------------------------------
Total    -0.279892         2262.31

The StatAdvisor
This table shows various statistics for TS for each of the 5 levels of PROBE.
The one-way analysis of variance is primarily intended to compare the means of the different levels, listed here under the Average column. Select Means Plot from the list of Graphical Options to display the means graphically.

R/s-test (David): Tu(replications; ) < (PW = R/s) < To(replications; )

6. ANOVA - IS THERE ANY DIFFERENCE BETWEEN THE SAMPLES?
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Anova Table

ANOVA Table for TS by PROBE

Analysis of Variance
Source            Sum of Squares    Df    Mean Square    F-Ratio    P-Value
-----------------------------------------------------------------------------
Between groups    0.664416          4     0.166104       5.45       0.0039
Within groups     0.60904           20    0.030452
-----------------------------------------------------------------------------
Total (Corr.)     1.27346           24

The StatAdvisor:
The ANOVA table decomposes the variance of TS into two components: a between-group component and a within-group component. The F-ratio, which in this case equals 5.45462, is a ratio of the between-group estimate to the within-group estimate. Since the P-value of the F-test is less than 0.05, there is a statistically significant difference between the mean TS from one level of PROBE to another at the 95.0% confidence level. To determine which means are significantly different from which others, select Multiple Range Tests from the list of Tabular Options.

7. MULTIPLE RANGE TESTS - WHICH SAMPLES ARE DIFFERENT?
Tabular Options: Multiple Range Tests
Pane Options: LSD, Tukey HSD, Scheffe, Bonferroni, Student-Newman-Keuls, Duncan

Multiple Range Tests for TS by PROBE:
Method: 95.0 percent LSD

PROBE    Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
5        5        90.256    X
3        5        90.364    XX
2        5        90.514     XX
1        5        90.64       X
4        5        90.688      X

Contrast    Difference    +/- Limits
--------------------------------------------------------------------------------
1 - 2        0.126        0.230221
1 - 3       *0.276        0.230221
1 - 4       -0.048        0.230221
1 - 5       *0.384        0.230221
2 - 3        0.15         0.230221
2 - 4       -0.174        0.230221
2 - 5       *0.258        0.230221
3 - 4       *-0.324       0.230221
3 - 5        0.108        0.230221
4 - 5       *0.432        0.230221
--------------------------------------------------------------------------------
* denotes a statistically significant difference.

The StatAdvisor:
This table applies a multiple comparison procedure to determine which means are significantly different from which others. The bottom half of the output shows the estimated difference between each pair of means. An asterisk has been placed next to 5 pairs, indicating that these pairs show statistically significant differences at the 95.0% confidence level. At the top of the page, 3 homogeneous groups are identified using columns of X's. Within each column, the levels containing X's form a group of means within which there are no statistically significant differences. The method currently being used to discriminate among the means is Fisher's least significant difference (LSD) procedure. With this method, there is a 5.0% risk of calling each pair of means significantly different when the actual difference equals 0.
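
The F-ratio and the LSD limit reported above can be checked by hand. The following Python sketch is not part of STATGRAPHICS; it recomputes both from the TS values of HVA.CSV as listed in this handout. The critical value t = 2.086 (two-sided 5%, 20 d.f.) is taken from a standard t-table and hard-coded, which is an assumption of this sketch, as are all variable names.

```python
from itertools import combinations
from math import sqrt

# TS values per PROBE, as listed in the data table of this handout
ts = {
    1: [90.91, 90.60, 90.40, 90.52, 90.77],
    2: [90.79, 90.36, 90.32, 90.59, 90.51],
    3: [90.27, 90.37, 90.38, 90.31, 90.49],
    4: [90.57, 90.82, 90.63, 90.98, 90.44],
    5: [90.24, 90.35, 90.15, 90.08, 90.46],
}

n = 5          # replications per product
k = len(ts)    # number of products

means = {p: sum(v) / n for p, v in ts.items()}
grand_mean = sum(sum(v) for v in ts.values()) / (n * k)

# One-way ANOVA decomposition
ss_between = sum(n * (m - grand_mean) ** 2 for m in means.values())
ss_within = sum((x - means[p]) ** 2 for p, v in ts.items() for x in v)
ms_between = ss_between / (k - 1)
ms_within = ss_within / (k * (n - 1))
f_ratio = ms_between / ms_within

# Assumption: two-sided 5% t-value for 20 d.f., read from a t-table
T_CRIT = 2.086
lsd_limit = T_CRIT * sqrt(2 * ms_within / n)

# Pairs of products whose mean difference exceeds the LSD limit
significant = [
    (a, b)
    for a, b in combinations(sorted(ts), 2)
    if abs(means[a] - means[b]) > lsd_limit
]

print(f"F-ratio   = {f_ratio:.2f}")
print(f"LSD limit = {lsd_limit:.4f}")
print("significant pairs:", significant)
```

For the Bonferroni limits, the same formula applies with a larger t-value, because the 5% risk is divided over all 10 pairwise comparisons.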
Multiple Range Tests for TS by PROBE:
Method: 95.0 percent Bonferroni

PROBE    Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
5        5        90.256    X
3        5        90.364    XX
2        5        90.514    XX
1        5        90.64      X
4        5        90.688     X

Contrast    Difference    +/- Limits
--------------------------------------------------------------------------------
1 - 2        0.126        0.348031
1 - 3        0.276        0.348031
1 - 4       -0.048        0.348031
1 - 5       *0.384        0.348031
2 - 3        0.15         0.348031
2 - 4       -0.174        0.348031
2 - 5        0.258        0.348031
3 - 4       -0.324        0.348031
3 - 5        0.108        0.348031
4 - 5       *0.432        0.348031
--------------------------------------------------------------------------------
* denotes a statistically significant difference.

VARIANCE COMPONENTS (NESTED DESIGNS): ERROR OF SAMPLING, ANALYSIS

PROBLEM: How big are the contributions of the sampling method and the analysis method to the variability of the analysed moisture content? To quantify the variance within the samples and the variance of the averages of the samples, 5 samples are drawn from a bag and homogenized, and each sample is analysed 5 times.

1. IMPORT OF DATA
2. VISUALIZATION OF DATA
3. TEST FOR OUTLIER
4. TEST FOR HOMOGENEITY OF VARIANCES
5. TEST FOR NORMAL DISTRIBUTION

6. ESTIMATE VARIANCE COMPONENTS
\Compare \Analysis of Variance \Variance Components
PROBE->Factors in Order of Nesting TS->Dependent Variable
Tabular Options: Analysis Summary

Variance Components Analysis
Dependent variable: TS
Factors: PROBE
Number of complete cases: 25

Analysis of Variance for TS
Source               Sum of Squares    Df    Mean Square    Var. Comp.    Percent    Index
--------------------------------------------------------------------------------
TOTAL (CORRECTED)    1.27346           24
--------------------------------------------------------------------------------
PROBE                0.664416          4     0.166104       0.0271304     47.12      1
ERROR                0.60904           20    0.030452       0.030452      52.88      0
--------------------------------------------------------------------------------

The StatAdvisor:
The analysis of variance table shown here divides the variance of TS into 1 components, one for each factor. Each factor after the first is nested in the one above. The goal of such an analysis is usually to estimate the amount of variability contributed by each of the factors, called the variance components. In this case, the factor contributing the most variance is ERROR. Its contribution represents 52.8842% of the total variation in TS.

Error of sampling: s_1^2 = (MQ_1 - MQ_0) / k
  (k = replications; MQ = mean square, indices as in the Index column above)
Confidence limits: [(MQ_1*L_1^2 - MQ_0) / k]^(1/2) < s_1 < [(MQ_1*L_2^2 - MQ_0) / k]^(1/2)
  with L_1( , Df_1, Df_0) and L_2( , Df_1, Df_0)

Error of analysis: s_0^2 = MQ_0
Confidence limits: s_0*L_1 < s_0 < s_0*L_2
  with L_1( , Df_0, ) and L_2( , Df_0, )

EXPERIMENTAL DESIGNS

PROBLEM: How big is the effect of heating time and concentration of starch solutions on the viscosity of the gelatinised starch? Suspensions of starch with different concentrations in water are heated for different times at 80 °C. With these samples, defined shear tests are carried out. The effects of starch concentration (Conc) and the time of heating (Time) on the shear resistance (tau) at D = 300 s^-1 are quantified.

1. CREATE EXPERIMENT
\Special \Experimental Design \Create Design \Screening Design
2 Factors, 1 Response, Fractional Design, 0 Center Point, 1 Replication
Randomize correct Block
Tabular Options: Design Summary, Worksheet
Save Design File tau.sfx
Print Worksheet

Design Summary
Design class: Screening
Design name: Factorial 2^2
Base Design
Number of experimental factors: 2
Number of blocks: 1
Number of responses: 1
Number of centerpoints per block: 0
Number of runs: 8
Randomized: Yes

Factors    Low     High    Units    Continuous
------------------------------------------------------------------------
Conc       -1.0    1.0              Yes
Time       -1.0    1.0              Yes

Responses    Units
-----------------------------------
tau

The StatAdvisor:
You have created a Factorial design which will study the effects of 2 factors in 8 runs. The design is to be run in a single block. The order of the experiments has been fully randomized. This will provide protection against the effects of lurking variables.

2. RUN EXPERIMENTS

3. ENTER DATA
\Special \Experimental Design \Open Design: tau.sfx
Tabular Options: Design Summary, Worksheet
!!!! Take care to enter each tau value for the corresponding experiment !!!!

run    BLOCK    Conc    Time    tau
4      1        -1      -1      40
10     1         1      -1      105
7      1        -1       1      130
3      1         1       1      119
2      1        -1      -1      42
9      1         1      -1      98
5      1        -1       1      134
8      1         1       1      122

4. ANALYZE DATA
\Special \Experimental Design \Analyze Design
Analysis Options: max. Order Effect: 2 -> ignore Block number
Estimated Sigma from: Experimental Data
Tabular Options: Analysis Summary, ANOVA Table, Regression coeff., Optimization
Graphical Options: Pareto Chart, Main Effects, Interaction Plot, Response Plots, Diagnostic Plots

Analysis Summary
Estimated effects for tau
----------------------------------------------------------------------
average = 98.75 +/- 1.10397
A:CONC  = 24.5  +/- 2.20794
B:TIME  = 55.0  +/- 2.20794
AB      = -36.0 +/- 2.20794
----------------------------------------------------------------------
Standard errors are based on total error with 4 d.f.
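
The estimated effects in the Analysis Summary can be verified directly from the worksheet in 3. ENTER DATA. The Python sketch below is an illustration outside STATGRAPHICS: each effect is the mean of tau at the coded level +1 minus the mean at the coded level -1, and the interaction AB uses the product of the two coded columns. The helper name `effect` and the variable names are chosen for this sketch only.

```python
# Worksheet of the 2^2 design with one replication (coded levels),
# in the order listed in section 3: (Conc, Time, tau)
runs = [
    (-1, -1, 40), (1, -1, 105), (-1, 1, 130), (1, 1, 119),
    (-1, -1, 42), (1, -1, 98), (-1, 1, 134), (1, 1, 122),
]


def effect(signs, responses):
    """Mean response at sign +1 minus mean response at sign -1."""
    high = [y for s, y in zip(signs, responses) if s > 0]
    low = [y for s, y in zip(signs, responses) if s < 0]
    return sum(high) / len(high) - sum(low) / len(low)


tau = [y for _, _, y in runs]
average = sum(tau) / len(tau)
effect_conc = effect([c for c, _, _ in runs], tau)
effect_time = effect([t for _, t, _ in runs], tau)
effect_ab = effect([c * t for c, t, _ in runs], tau)  # interaction column

print("average =", average)
print("A:CONC  =", effect_conc)
print("B:TIME  =", effect_time)
print("AB      =", effect_ab)
```

Half of each effect gives the corresponding regression coefficient for the coded variables, since an effect spans the two coded units from -1 to +1.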
The StatAdvisor:
This table shows each of the estimated effects and interactions. Also shown is the standard error of each of the effects, which measures their sampling error. To plot the estimates in decreasing order of importance, select Pareto Charts from the list of Graphical Options. To test the statistical significance of the effects, select ANOVA Table from the list of Tabular Options. You can then remove insignificant effects by pressing the alternate mouse button, selecting Analysis Options, and pressing the Exclude button.

Analysis of Variance for TAU:
Source           Sum of Squares    Df    Mean Square    F-Ratio    P-Value
--------------------------------------------------------------------------------
A:CONC           1200.5            1     1200.5         123.13     0.0004
B:TIME           6050.0            1     6050.0         620.51     0.0000
AB               2592.0            1     2592.0         265.85     0.0001
Total error      39.0              4     9.75
--------------------------------------------------------------------------------
Total (corr.)    9881.5            7

R-squared = 99.6053 percent
R-squared (adjusted for d.f.) = 99.3093 percent
Standard Error of Est. = 3.1225
Mean absolute error = 2.0
Durbin-Watson statistic = 2.76282

The StatAdvisor:
The ANOVA table partitions the variability in TAU into separate pieces for each of the effects. It then tests the statistical significance of each effect by comparing the mean square against an estimate of the experimental error. In this case, 3 effects have P-values less than 0.05, indicating that they are significantly different from zero at the 95.0% confidence level. The R-Squared statistic indicates that the model as fitted explains 99.6053% of the variability in TAU. The adjusted R-squared statistic, which is more suitable for comparing models with different numbers of independent variables, is 99.3093%. The standard error of the estimate shows the standard deviation of the residuals to be 3.1225. The mean absolute error (MAE) of 2.0 is the average value of the residuals.
The Durbin-Watson (DW) statistic tests the residuals to determine if there is any significant correlation based on the order in which they occur in your data file. Since the DW value is > 1.4, there is probably not any serious autocorrelation in the residuals.

Pareto Chart:
Pane Options: Standardized

[Figure: Standardized Pareto Chart for TAU; bars B:TIME, AB, A:CONC; x-axis: Standardized effect]

All factors and interactions are significant.

Regression coeffs. for tau
constant = 98.75
A:CONC   = 12.25
B:TIME   = 27.5
AB       = -18.0

The StatAdvisor:
This pane displays the regression equation which has been fitted to the data. The equation of the fitted model is
TAU = 98.75 + 12.25*CONC + 27.5*TIME - 18.0*CONC*TIME
where the values of the variables are specified in their original units. To have STATGRAPHICS evaluate this function, select Predictions from the list of Tabular Options. To plot the function, select Response Plots from the list of Graphical Options.

[Figure: Main Effects Plot for TAU (TAU vs. CONC and TIME, each from -1.0 to 1.0)]

[Figure: Interaction Plot for TAU (TAU vs. CONC, lines for TIME=-1.0 and TIME=1.0)]
At high CONC, the TIME has low effect.
At high TIME, the CONC has no effect.

Surface Plot:
Pane Options: show points

[Figure: Estimated Response Surface (tau vs. Conc and Time)]

Contour Plot:
Pane Options: Painted Regions

[Figure: Contours of Estimated Response Surface (TIME vs. CONC, TAU in bands of 10 from 41.0 to 141.0)]
The same TAU can be obtained with low CONC at high TIME.
At high CONC, the TIME has low effect.

Diagnostic Plot:
Pane Options: Residuals vs. Run Order

[Figure: Residual Plot for TAU (residual vs. run number)]

Pane Options: Residuals vs. Factor A:conc
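
The residuals shown in the diagnostic plots, and the error statistics of the ANOVA output, follow from the fitted equation and the worksheet data. The Python sketch below is an illustration outside STATGRAPHICS: it evaluates TAU = 98.75 + 12.25*CONC + 27.5*TIME - 18.0*CONC*TIME at the coded levels and derives the error sum of squares, the mean absolute error and R-squared. The function name `predict` is hypothetical, and the rows are in worksheet order as listed in 3. ENTER DATA, which is not necessarily the order used for the run-order plot.

```python
# Worksheet data in the order listed in section 3: (Conc, Time, tau)
runs = [
    (-1, -1, 40), (1, -1, 105), (-1, 1, 130), (1, 1, 119),
    (-1, -1, 42), (1, -1, 98), (-1, 1, 134), (1, 1, 122),
]


def predict(conc, time):
    """Fitted model from the regression coefficients pane (coded levels)."""
    return 98.75 + 12.25 * conc + 27.5 * time - 18.0 * conc * time


tau = [y for _, _, y in runs]
mean_tau = sum(tau) / len(tau)

# Residual = observed tau minus fitted tau
residuals = [y - predict(c, t) for c, t, y in runs]

ss_error = sum(r * r for r in residuals)          # "Total error" sum of squares
ss_total = sum((y - mean_tau) ** 2 for y in tau)  # "Total (corr.)" sum of squares
r_squared = 1 - ss_error / ss_total
mae = sum(abs(r) for r in residuals) / len(residuals)

print("residuals:", residuals)
print(f"SS error  = {ss_error:.1f}")
print(f"R-squared = {100 * r_squared:.4f} percent")
print(f"MAE       = {mae:.1f}")
```

Because each factor combination was run twice, every pair of replicates yields residuals of equal size and opposite sign around the fitted value.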