D:\687291973.doc
Version 18.05.2004
page 1 of 10
THE USE OF STATGRAPHICS 5.0
IN AND OUTPUT
1. Import of Data Files
2. Input of Data
3. Modification and Output of Results
MULTIPLE COMPARISONS
1. Import of Data
2. Visualization of Data
3. Test for Outlier
4. Test for Homogeneity of Variances
5. Test for Normal Distribution
6. ANOVA - Is there any difference between the samples?
7. MULTIPLE RANGE TESTS - Which samples are different?
VARIANCE COMPONENTS (Nested Designs): error of sampling, analysis
1. Import of Data
2. Visualization of Data
3. Test for Outlier
4. Test for Homogeneity of Variances
5. Test for Normal Distribution
6. ESTIMATE VARIANCE COMPONENTS
EXPERIMENTAL DESIGNS
1. Create Experiment
2. Run Experiments
3. Enter Data
4. Analyze Data
FILES:

In and Output
IO.XLS
REGR.TXT

Anova
HVA.CSV

Experimental Designs
TAU.XLS
TAU1.SFX
IN AND OUTPUT
1. IMPORT OF DATA FILES
Configuration: \Windows\Control Panel (Systemsteuerung)\Regional Settings (Ländereinstellung)\Number (Zahl):
Decimal symbol (Dezimalzeichen): .
Digit grouping symbol (Symbol f. Zifferngruppierung): blank
List separator (Listentrennzeichen): ;
File /Open Data File /Files of type (Dateityp): All Files (*.*)
• Excel: /IO.XLS /Variable Names from first row
• CSV: /HVA.CSV (comma delimited) /Variable Names from first row
• Text file: /REGR.TXT (tab delimited) /Variable Names from first row
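To check a delimited file outside STATGRAPHICS before importing it, the "variable names from first row" convention can be sketched in Python. This is only an illustration: the sample text and the column order PROBE, NR, TS are assumptions, not the actual contents of HVA.CSV.

```python
import csv
import io

# Hypothetical excerpt of a comma-delimited file whose first row holds the
# variable names. The column order PROBE, NR, TS is an assumption.
SAMPLE = "PROBE,NR,TS\n1,1,90.91\n1,7,90.60\n2,2,90.79\n"


def read_delimited(text, delimiter=","):
    """Return the data rows as dicts keyed by the names in the first row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


rows = read_delimited(SAMPLE)
ts_values = [float(row["TS"]) for row in rows]
print(list(rows[0].keys()))
print(ts_values)
```

For a tab-delimited text file, pass delimiter="\t" instead.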
2. INPUT OF DATA


Mark column /right mouse button /Modify Column
x    y
1    11.5
2    12.4
3    13
4    16
5    17
File /Save Data File as: REGR.SF
3. MODIFICATION AND OUTPUT OF RESULTS
ANALYSIS
• File /Open Data File /REGR.SF
• Relate /Simple Regression
• Tabular Options: Analysis Summary
• Graphical Options: Plot fitted model, Residuals vs. x
MODIFICATION
• Double-click the window with the left mouse button
• Click the element with the right mouse button
• Options
OUTPUT
a) To Statreporter: click the window with the right mouse button /Copy to Statreporter
b) From Statreporter to Winword: copy and paste
Text window
• Double-click the window with the left mouse button /mark text /Icon Cut -> Winword: insert as text
• (Double-click the window with the left mouse button /Icon Copy -> Winword: insert as object)
Save graphic to file
• Save Graph as regr.wmf
Without colours: Graphics\Options\Profile: Black and White
File\PageSetup: Black and White
Regression Analysis - Linear model: Y = a + b*X
Dependent variable: Y
Independent variable: X
                           Standard      T
Parameter     Estimate     Error         Statistic     P-Value
----------------------------------------------------------------------------
Intercept     9.6          0.73964       12.9793       0.0010
Slope         1.46         0.22301       6.5468        0.0072
----------------------------------------------------------------------------
Analysis of Variance
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
----------------------------------------------------------------------------
Model            21.316           1    21.316        42.86     0.0072
Residual         1.492            3    0.497333
----------------------------------------------------------------------------
Total (Corr.)    22.808           4
Correlation Coefficient = 0.966739
R-squared = 93.4584 percent
Standard Error of Est. = 0.705219
The StatAdvisor
The output shows the results of fitting a linear model to describe the relationship between Y and X. The equation of the fitted
model is:
Y = 9.6 + 1.46*X
Since the P-value in the ANOVA table is less than 0.01, there is a statistically significant relationship between Y and X at the
99% confidence level.
The R-Squared statistic indicates that the model as fitted explains 93.4584% of the variability in Y.
The correlation coefficient equals 0.966739, indicating a relatively strong relationship between the variables.
The standard error of the estimate shows the standard deviation of the residuals to be 0.705219. This value can be used to
construct prediction limits for new observations by selecting the Forecasts option from the text menu.
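The figures in the regression output can be cross-checked by hand. The following is a minimal Python sketch of ordinary least squares for one predictor, using the x and y values entered in section 2. It is not how STATGRAPHICS computes its results internally, only an independent check.

```python
import math

x = [1, 2, 3, 4, 5]
y = [11.5, 12.4, 13, 16, 17]


def simple_regression(x, y):
    """Least-squares fit of y = a + b*x with the usual summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)   # total (corrected) sum of squares
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_model = slope * sxy
    ss_resid = syy - ss_model
    ms_resid = ss_resid / (n - 2)
    return {
        "intercept": intercept,
        "slope": slope,
        "r_squared": ss_model / syy,
        "std_error_est": math.sqrt(ms_resid),
        "f_ratio": ss_model / ms_resid,
    }


fit = simple_regression(x, y)
for name, value in fit.items():
    print(f"{name} = {value:.6g}")
```

The printed values should agree with the Estimate column, R-squared, Standard Error of Est. and F-Ratio shown above.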
[Figure: Plot of Fitted Model; x-axis: X, y-axis: Y]
MULTIPLE COMPARISONS
PROBLEM: Which of the 5 products differ in moisture content?
Each product is analysed 5 times. The averages of the products are compared by multiple range tests.
1. IMPORT OF DATA
HVA.CSV (comma delimited):
!The variables PROBE and NR must be sorted!
PROBE   NR   TS
1        1   90.91
1        7   90.60
1       11   90.40
1       13   90.52
1       15   90.77
2        2   90.79
2        5   90.36
2        8   90.32
2       18   90.59
2       21   90.51
3        4   90.27
3       12   90.37
3       17   90.38
3       20   90.31
3       24   90.49
4        6   90.57
4        9   90.82
4       14   90.63
4       19   90.98
4       23   90.44
5        3   90.24
5       10   90.35
5       16   90.15
5       22   90.08
5       25   90.46
2. VISUALIZATION OF DATA
a) Test for Trend: Data versus sequence of measurements
Icon Scatterplot: NR->X TS->Y PROBE->Select
Pane Options: PROBE->Point Codes, Points+Lines
[Figure: Plot of TS vs NR, points coded by PROBE (1 to 5); x-axis: NR, y-axis: TS]
b) Visual test for Outlier and Distribution
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
• Scatterplot
Graphic Options: Scatterplot
[Figure: Scatterplot by Level Code; x-axis: PROBE, y-axis: TS]
• Box-and-Whisker Plot
Graphic Options: Box-and-Whisker-Plot
Pane Options: vertical
[Figure: Box-and-Whisker Plot; x-axis: PROBE, y-axis: TS]
3. TEST FOR OUTLIER
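Grubbs: PW = |xi - av(x)| / s < T(replications;)
(PW = test value, av(x) = average of the sample, s = standard deviation of the sample, T = tabulated critical value)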
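The Grubbs test value can be computed directly from the replicates of one product. Below is a minimal Python sketch using the five TS values of PROBE 1. The critical value T is not computed here and still has to be taken from a table.

```python
import statistics

# TS values of product 1 (PROBE = 1) from the data table
ts_product_1 = [90.91, 90.60, 90.40, 90.52, 90.77]


def grubbs_statistic(values):
    """Return the largest |x_i - mean| / s and the value that produces it."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    suspect = max(values, key=lambda v: abs(v - mean))
    return abs(suspect - mean) / s, suspect


pw, suspect = grubbs_statistic(ts_product_1)
print(f"PW = {pw:.4f} for suspect value {suspect}")
```

The suspect value is treated as an outlier only if PW exceeds the tabulated T for the given number of replications.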
4. TEST FOR HOMOGENEITY OF VARIANCES
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Variance Check
Variance Check
Cochran's C test:   0.297977   P-Value = 0.99813
Bartlett's test:    1.19452    P-Value = 0.519824
Hartley's test:     6.5
Hartley: PW = s²max / s²min < T(;samples;replications-1)
The StatAdvisor
The three statistics displayed in this table test the null hypothesis that the standard deviations of TS within each of the 5 levels of
PROBE is the same. Of particular interest are the two P-values. Since the smaller of the P-values is greater than or equal to
0.05, there is not a statistically significant difference amongst the standard deviations at the 95.0% confidence level.
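Two of the three variance-check statistics are simple ratios of the group variances and can be reproduced by hand. This minimal Python sketch computes Cochran's C (largest variance divided by the sum of all variances) and Hartley's ratio (largest divided by smallest variance) from the TS values of the data table. The P-values and Bartlett's statistic are not reproduced.

```python
import statistics

# TS values per product (PROBE), taken from the data table
groups = {
    1: [90.91, 90.60, 90.40, 90.52, 90.77],
    2: [90.79, 90.36, 90.32, 90.59, 90.51],
    3: [90.27, 90.37, 90.38, 90.31, 90.49],
    4: [90.57, 90.82, 90.63, 90.98, 90.44],
    5: [90.24, 90.35, 90.15, 90.08, 90.46],
}

# Sample variance (n - 1 in the denominator) of each product
variances = {probe: statistics.variance(vals) for probe, vals in groups.items()}

cochran_c = max(variances.values()) / sum(variances.values())
hartley = max(variances.values()) / min(variances.values())

print(f"Cochran's C = {cochran_c:.6f}")
print(f"Hartley's ratio = {hartley:.3f}")
```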
5. TEST FOR NORMAL DISTRIBUTION
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Summary Statistics \Pane Options: Selection of parameters
A normal distribution (NV) may be assumed if the standardized skewness and the standardized kurtosis lie within +/-2
Summary Statistics for TS
PROBE   Count   Average   Variance    Standard deviation
-----------------------------------------------------------
1       5       90.64     0.04085     0.202114
2       5       90.514    0.03583     0.189288
3       5       90.364    0.00698     0.0835464
4       5       90.688    0.04537     0.213002
5       5       90.256    0.02323     0.152414
-----------------------------------------------------------
Total   25      90.4924   0.0530607   0.230349

PROBE   Minimum   Maximum   Range   Stnd. skewness   Stnd. kurtosis   Sum
---------------------------------------------------------------------------
1       90.4      90.91     0.51    0.288577         -0.530679        453.2
2       90.32     90.79     0.47    0.589419         -0.178296        452.57
3       90.27     90.49     0.22    0.663105         0.314799         451.82
4       90.44     90.98     0.54    0.39776          -0.446946        453.44
5       90.08     90.46     0.38    0.287198         -0.589814        451.28
---------------------------------------------------------------------------
Total   90.08     90.98     0.9     0.899742         -0.279892        2262.31
The StatAdvisor
This table shows various statistics for TS for each of the 5 levels of PROBE. The one-way analysis of variance is primarily intended
to compare the means of the different levels, listed here under the Average column. Select Means Plot from the list of Graphical
Options to display the means graphically.
R/s test (David): T_lower(replications;) < (PW = R/s) < T_upper(replications;)
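The normality indicators used in this section can be recomputed for a single product. The sketch below is in Python and rests on an assumption: standardized skewness and kurtosis are taken as the bias-corrected sample skewness and excess kurtosis divided by sqrt(6/n) and sqrt(24/n). The STATGRAPHICS formulas are not stated in this handout, so treat this as a plausibility check. The R/s value of David's test is the range divided by the standard deviation.

```python
import math
import statistics

# TS values of product 1 (PROBE = 1) from the data table
ts_product_1 = [90.91, 90.60, 90.40, 90.52, 90.77]


def standardized_shape(values):
    """Return (standardized skewness, standardized kurtosis); needs n >= 4."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    sum3 = sum((v - mean) ** 3 for v in values)
    sum4 = sum((v - mean) ** 4 for v in values)
    # Bias-corrected sample skewness and excess kurtosis (assumed formulas)
    skew = n / ((n - 1) * (n - 2)) * sum3 / s ** 3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum4 / s ** 4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return skew / math.sqrt(6 / n), kurt / math.sqrt(24 / n)


def r_over_s(values):
    """Test value of the R/s test: range divided by standard deviation."""
    return (max(values) - min(values)) / statistics.stdev(values)


std_skew, std_kurt = standardized_shape(ts_product_1)
print(f"stnd. skewness = {std_skew:.4f}, stnd. kurtosis = {std_kurt:.4f}")
print(f"R/s = {r_over_s(ts_product_1):.4f}")
```

Both standardized values lie within +/-2, which matches the Summary Statistics row for PROBE 1.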
6. ANOVA - IS THERE ANY DIFFERENCE BETWEEN THE SAMPLES?
\Compare \Analysis of Variance \One-Way ANOVA
PROBE->Factor TS->Dependent Variable
Tabular Options: Anova Table
ANOVA Table for TS by PROBE
Analysis of Variance
Source            Sum of Squares   Df   Mean Square   F-Ratio   P-Value
----------------------------------------------------------------------------
Between groups    0.664416         4    0.166104      5.45      0.0039
Within groups     0.60904          20   0.030452
----------------------------------------------------------------------------
Total (Corr.)     1.27346          24
The StatAdvisor: The ANOVA table decomposes the variance of TS into two components: a between-group component and a
within-group component. The F-ratio, which in this case equals 5.45462, is a ratio of the between-group estimate to the
within-group estimate. Since the P-value of the F-test is less than 0.05, there is a statistically significant difference between the mean
TS from one level of PROBE to another at the 95.0% confidence level. To determine which means are significantly different from
which others, select Multiple Range Tests from the list of Tabular Options.
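The decomposition described by the StatAdvisor can be sketched in a few lines of Python. The example recomputes the between-group and within-group sums of squares and the F-ratio from the TS values of the data table. The P-value is not computed, since that would require the F distribution.

```python
# TS values per product (PROBE), taken from the data table
groups = {
    1: [90.91, 90.60, 90.40, 90.52, 90.77],
    2: [90.79, 90.36, 90.32, 90.59, 90.51],
    3: [90.27, 90.37, 90.38, 90.31, 90.49],
    4: [90.57, 90.82, 90.63, 90.98, 90.44],
    5: [90.24, 90.35, 90.15, 90.08, 90.46],
}


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and the F-ratio."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    means = {g: sum(vals) / len(vals) for g, vals in groups.items()}
    ss_between = sum(len(vals) * (means[g] - grand_mean) ** 2
                     for g, vals in groups.items())
    ss_within = sum((v - means[g]) ** 2
                    for g, vals in groups.items() for v in vals)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova(groups)
print(f"Between groups: SS = {ss_b:.6f}, Df = {df_b}")
print(f"Within groups:  SS = {ss_w:.5f}, Df = {df_w}")
print(f"F-ratio = {f_ratio:.5f}")
```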
7. MULTIPLE RANGE TESTS - WHICH SAMPLES ARE DIFFERENT?
Tabular Options: Multiple Range Tests
Pane Options: LSD, Tukey HSD, Scheffe, Bonferroni, Student-Newman-Keuls, Duncan
Multiple Range Tests for TS by PROBE: Method: 95.0 percent LSD
PROBE    Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
5        5        90.256    X
3        5        90.364    XX
2        5        90.514     XX
1        5        90.64       X
4        5        90.688      X
--------------------------------------------------------------------------------
Contrast    Difference    +/- Limits
--------------------------------------------------------------------------------
1 - 2        0.126        0.230221
1 - 3       *0.276        0.230221
1 - 4       -0.048        0.230221
1 - 5       *0.384        0.230221
2 - 3        0.15         0.230221
2 - 4       -0.174        0.230221
2 - 5       *0.258        0.230221
3 - 4       *-0.324       0.230221
3 - 5        0.108        0.230221
4 - 5       *0.432        0.230221
--------------------------------------------------------------------------------
* denotes a statistically significant difference.
The StatAdvisor: This table applies a multiple comparison procedure to determine which means are significantly different from
which others. The bottom half of the output shows the estimated difference between each pair of means. An asterisk has been
placed next to 5 pairs, indicating that these pairs show statistically significant differences at the 95.0% confidence level. At the top
of the page, 3 homogenous groups are identified using columns of X's. Within each column, the levels containing X's form a group
of means within which there are no statistically significant differences. The method currently being used to discriminate among the
means is Fisher's least significant difference (LSD) procedure. With this method, there is a 5.0% risk of calling each pair of means
significantly different when the actual difference equals 0.
Multiple Range Tests for TS by PROBE: Method: 95.0 percent Bonferroni
PROBE    Count    Mean      Homogeneous Groups
--------------------------------------------------------------------------------
5        5        90.256    X
3        5        90.364    XX
2        5        90.514    XX
1        5        90.64      X
4        5        90.688     X
--------------------------------------------------------------------------------
Contrast    Difference    +/- Limits
--------------------------------------------------------------------------------
1 - 2        0.126        0.348031
1 - 3        0.276        0.348031
1 - 4       -0.048        0.348031
1 - 5       *0.384        0.348031
2 - 3        0.15         0.348031
2 - 4       -0.174        0.348031
2 - 5        0.258        0.348031
3 - 4       -0.324        0.348031
3 - 5        0.108        0.348031
4 - 5       *0.432        0.348031
--------------------------------------------------------------------------------
* denotes a statistically significant difference.
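How the asterisks in the two contrast tables arise can be sketched in Python: a pair is flagged when the absolute difference of its means exceeds the +/- limit. The LSD limit is recomputed as t * sqrt(2 * MSE / n), where t = 2.086 is an assumed table value (two-sided 95 %, 20 d.f.). The Bonferroni limit is simply taken from the table above.

```python
import math
from itertools import combinations

# Group means from the Multiple Range Tests tables
means = {1: 90.64, 2: 90.514, 3: 90.364, 4: 90.688, 5: 90.256}

MSE = 0.030452          # within-groups mean square from the ANOVA table
N_REPLICATIONS = 5
T_CRIT_20_DF = 2.086    # assumption: tabulated two-sided 95 % t value, 20 d.f.
BONFERRONI_LIMIT = 0.348031  # taken from the Bonferroni table

lsd_limit = T_CRIT_20_DF * math.sqrt(2 * MSE / N_REPLICATIONS)


def significant_pairs(means, limit):
    """Return the pairs whose absolute mean difference exceeds the limit."""
    return [(a, b) for a, b in combinations(sorted(means), 2)
            if abs(means[a] - means[b]) > limit]


print(f"LSD limit = {lsd_limit:.4f}")
print("LSD:       ", significant_pairs(means, lsd_limit))
print("Bonferroni:", significant_pairs(means, BONFERRONI_LIMIT))
```

The wider Bonferroni limit leaves only two of the five LSD pairs significant, which illustrates how the choice of method changes the conclusion.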
VARIANCE COMPONENTS (NESTED DESIGNS): ERROR OF SAMPLING, ANALYSIS
PROBLEM: How large are the contributions of the sampling method and the analysis method to the
variability of the analysed moisture content?
To quantify the variance within the samples and the variance of the sample averages, 5 samples are
drawn from a bag and homogenized, and each sample is analysed 5 times.
1. IMPORT OF DATA
2. VISUALIZATION OF DATA
3. TEST FOR OUTLIER
4. TEST FOR HOMOGENEITY OF VARIANCES
5. TEST FOR NORMAL DISTRIBUTION
6. ESTIMATE VARIANCE COMPONENTS
\Compare\Analysis of Variance\Variance Components
PROBE->Factors in Order of Nesting
TS->Dependent Variable
Tabular Options: Analysis Summary
Variance Components Analysis
Dependent variable: TS
Factors: PROBE
Number of complete cases: 25
Analysis of Variance for TS
Source               Sum of Squares   Df   Mean Square   Var. Comp.   Percent   Index
----------------------------------------------------------------------------------------
TOTAL (CORRECTED)    1.27346          24
----------------------------------------------------------------------------------------
PROBE                0.664416         4    0.166104      0.0271304    47.12     1
ERROR                0.60904          20   0.030452      0.030452     52.88     0
----------------------------------------------------------------------------------------
The StatAdvisor: The analysis of variance table shown here divides the variance of TS into 1 components, one for each factor.
Each factor after the first is nested in the one above. The goal of such an analysis is usually to estimate the amount of variability
contributed by each of the factors, called the variance components. In this case, the factor contributing the most variance is
ERROR. Its contribution represents 52.8842% of the total variation in TS.
Error of sampling: $s_1^2 = (MQ_1 - MQ_0) / k$, with k = number of replications
Confidence limits:
lower limit $= \sqrt{(MQ_1 L_1^2 - MQ_0) / k} < s_1 <$ upper limit $= \sqrt{(MQ_1 L_2^2 - MQ_0) / k}$
with $L_1( , Df_1, Df_0)$ and $L_2( , Df_1, Df_0)$
Error of analysis: $s_0^2 = MQ_0$
Confidence limits:
lower limit: $s_0 L_1 < s_0 <$ upper limit: $s_0 L_2$
with $L_1( , Df_0, )$ and $L_2( , Df_0, )$
EXPERIMENTAL DESIGNS
PROBLEM: How large is the effect of heating time and starch concentration on the viscosity of
gelatinised starch?
Suspensions of starch in water at different concentrations are heated at 80 °C for different times. Defined
shear tests are then carried out on these samples. The effects of the starch concentration (Conc) and the
heating time (Time) on the shear resistance (tau) at D = 300 s^-1 are quantified.
1. CREATE EXPERIMENT
\Special \Experimental Design \Create Design \Screening Design
2 Factors, 1 Response, Fractional Design, 0 Center Point, 1 Replication
Randomize
correct Block
Tabular Options: Design Summary, Worksheet
Save Design File tau.sfx
Print Worksheet
Design Summary
Design class: Screening
Design name: Factorial 2^2
Base Design
Number of experimental factors: 2    Number of blocks: 1
Number of responses: 1
Number of centerpoints per block: 0
Number of runs: 8
Randomized: Yes
Factors    Low     High    Units    Continuous
------------------------------------------------------------------------
Conc       -1.0    1.0              Yes
Time       -1.0    1.0              Yes
Responses    Units
-----------------------------------
tau
The StatAdvisor: You have created a Factorial design which will study the effects of 2 factors in 8 runs. The design is to be run in a
single block. The order of the experiments has been fully randomized. This will provide protection against the effects of lurking
variables.
2. RUN EXPERIMENTS
3. ENTER DATA
\Special \Experimental Design \Open Design: tau.sfx
Tabular Options: Design Summary, Worksheet
!!!! Make sure that each tau value is entered against the corresponding experiment (run) !!!!
run   BLOCK   Conc   Time   tau
4     1       -1     -1     40
10    1       1      -1     105
7     1       -1     1      130
3     1       1      1      119
2     1       -1     -1     42
9     1       1      -1     98
5     1       -1     1      134
8     1       1      1      122
4. ANALYZE DATA
\Special \Experimental Design \Analyze Design
Analysis Options:
max. Order Effect: 2
-> ignore Block number
Estimated Sigma from: Experimental Data
Tabular Options: Analysis Summary, ANOVA Table, Regression coeff., Optimization
Graphical Options: Pareto Chart, Main Effects, Interaction Plot, Response Plots, Diagnostic Plots
Analysis Summary
Estimated effects for tau
----------------------------------------------------------------------
average = 98.75 +/- 1.10397
A:CONC  = 24.5  +/- 2.20794
B:TIME  = 55.0  +/- 2.20794
AB      = -36.0 +/- 2.20794
----------------------------------------------------------------------
Standard errors are based on total error with 4 d.f.
The StatAdvisor: This table shows each of the estimated effects and interactions. Also shown is the standard error of each of the
effects, which measures their sampling error.
To plot the estimates in decreasing order of importance, select Pareto Charts from the list of Graphical Options.
To test the statistical significance of the effects, select ANOVA Table from the list of Tabular Options.
You can then remove insignificant effects by pressing the alternate mouse button, selecting Analysis Options, and pressing the
Exclude button.
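The estimated effects can be reproduced from the entered worksheet. In the Python sketch below, each effect is the mean of tau at the +1 level minus the mean at the -1 level, and the standard errors are derived from the replicate-to-replicate scatter (pure error with 4 d.f.). The run list is the worksheet data in coded units.

```python
import math

# (Conc, Time, tau) in coded units, as entered in the worksheet
runs = [(-1, -1, 40), (1, -1, 105), (-1, 1, 130), (1, 1, 119),
        (-1, -1, 42), (1, -1, 98), (-1, 1, 134), (1, 1, 122)]
n_runs = len(runs)


def effect(signs):
    """Mean of tau where sign = +1 minus mean of tau where sign = -1."""
    return sum(s * tau for s, (_, _, tau) in zip(signs, runs)) / (n_runs / 2)


average = sum(tau for _, _, tau in runs) / n_runs
effect_conc = effect([c for c, _, _ in runs])
effect_time = effect([t for _, t, _ in runs])
effect_ab = effect([c * t for c, t, _ in runs])

# Pure error: scatter of the replicates around their cell means
cells = {}
for c, t, tau in runs:
    cells.setdefault((c, t), []).append(tau)
ss_error = sum((v - sum(vals) / len(vals)) ** 2
               for vals in cells.values() for v in vals)
df_error = n_runs - len(cells)
mse = ss_error / df_error

se_average = math.sqrt(mse / n_runs)
se_effect = math.sqrt(4 * mse / n_runs)

print(f"average = {average} +/- {se_average:.5f}")
print(f"A:CONC = {effect_conc}, B:TIME = {effect_time}, AB = {effect_ab}")
print(f"standard error of effects = {se_effect:.5f} ({df_error} d.f.)")
```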
Analysis of Variance for TAU:
Source           Sum of Squares   Df   Mean Square   F-Ratio   P-Value
--------------------------------------------------------------------------------
A:CONC           1200.5           1    1200.5        123.13    0.0004
B:TIME           6050.0           1    6050.0        620.51    0.0000
AB               2592.0           1    2592.0        265.85    0.0001
Total error      39.0             4    9.75
--------------------------------------------------------------------------------
Total (corr.)    9881.5           7
R-squared = 99.6053 percent
Standard Error of Est. = 3.1225
R-squared (adjusted for d.f.) = 99.3093 percent
Mean absolute error = 2.0
Durbin-Watson statistic = 2.76282
The StatAdvisor: The ANOVA table partitions the variability in TAU into separate pieces for each of the effects. It then tests the
statistical significance of each effect by comparing the mean square against an estimate of the experimental error. In this case, 3
effects have P-values less than 0.05, indicating that they are significantly different from zero at the 95.0% confidence level.
The R-Squared statistic indicates that the model as fitted explains 99.6053% of the variability in TAU. The adjusted R-squared
statistic, which is more suitable for comparing models with different numbers of independent variables, is 99.3093%.
The standard error of the estimate shows the standard deviation of the residuals to be 3.1225.
The mean absolute error (MAE) of 2.0 is the average absolute value of the residuals.
The Durbin-Watson (DW) statistic tests the residuals to determine if there is any significant correlation based on the order in
which they occur in your data file. Since the DW value is > 1.4, there is probably not any serious autocorrelation in the residuals.
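The summary statistics below the ANOVA table can be cross-checked from the residuals. The Python sketch relies on the fact that the full model (both main effects plus the interaction) of a replicated 2^2 design fits each factor combination by its cell mean, so the residuals are the deviations from the cell means. The Durbin-Watson value uses the row order of the worksheet as listed in section 3.

```python
# (Conc, Time, tau) in coded units, in the row order of the worksheet
runs = [(-1, -1, 40), (1, -1, 105), (-1, 1, 130), (1, 1, 119),
        (-1, -1, 42), (1, -1, 98), (-1, 1, 134), (1, 1, 122)]

tau = [t for _, _, t in runs]
n = len(tau)
n_params = 4  # constant, CONC, TIME, CONC*TIME

# Cell means serve as fitted values of the full 2^2 model
cells = {}
for conc, time, value in runs:
    cells.setdefault((conc, time), []).append(value)
fitted = [sum(cells[(c, m)]) / len(cells[(c, m)]) for c, m, _ in runs]
residuals = [t - f for t, f in zip(tau, fitted)]

mean_tau = sum(tau) / n
ss_total = sum((t - mean_tau) ** 2 for t in tau)
ss_error = sum(r ** 2 for r in residuals)

r_squared = 1 - ss_error / ss_total
r_squared_adj = 1 - (ss_error / (n - n_params)) / (ss_total / (n - 1))
mae = sum(abs(r) for r in residuals) / n
durbin_watson = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, n)) / ss_error

print(f"R-squared = {100 * r_squared:.4f} percent")
print(f"R-squared (adjusted for d.f.) = {100 * r_squared_adj:.4f} percent")
print(f"Mean absolute error = {mae}")
print(f"Durbin-Watson statistic = {durbin_watson:.5f}")
```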
Pareto Chart: Pane Options: Standardized
[Figure: Standardized Pareto Chart for TAU; bars from top to bottom: B:TIME, AB, A:CONC; x-axis: Standardized effect]
All factors and interactions are significant.
Regression coeffs. for tau
constant = 98.75
A:CONC   = 12.25
B:TIME   = 27.5
AB       = -18.0
The StatAdvisor: This pane displays the regression equation which has been fitted to the data. The equation of the fitted model is
TAU = 98.75 + 12.25*CONC + 27.5*TIME - 18.0*CONC*TIME
where the values of the variables are specified in their original units.
To have STATGRAPHICS evaluate this function, select Predictions from the list of Tabular Options.
To plot the function, select Response Plots from the list of Graphical Options.
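The fitted equation can also be evaluated by hand. This short Python sketch evaluates it at the four corner points of the design, with CONC and TIME given in the coded units -1 and +1 used in the worksheet. The function name predict_tau is only illustrative.

```python
def predict_tau(conc, time):
    """Fitted model from the regression coefficients (coded units -1 ... +1)."""
    return 98.75 + 12.25 * conc + 27.5 * time - 18.0 * conc * time


# Predictions at the four corner points of the 2^2 design
corners = {(conc, time): predict_tau(conc, time)
           for conc in (-1, 1) for time in (-1, 1)}
for (conc, time), value in corners.items():
    print(f"CONC = {conc:+d}, TIME = {time:+d}: TAU = {value}")

# Change in TAU caused by TIME at low and at high CONC
time_effect_low_conc = predict_tau(-1, 1) - predict_tau(-1, -1)
time_effect_high_conc = predict_tau(1, 1) - predict_tau(1, -1)
print(time_effect_low_conc, time_effect_high_conc)
```

The predictions equal the means of the two replicates at each corner, and the effect of TIME is much smaller at high CONC than at low CONC, which reflects the negative interaction term.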
[Figure: Main Effects Plot for TAU; x-axes: CONC and TIME, each from -1.0 to 1.0; y-axis: TAU]
[Figure: Interaction Plot for TAU; x-axis: CONC from -1.0 to 1.0, one line each for TIME=-1.0 and TIME=1.0; y-axis: TAU]
At high CONC, TIME has only a small effect.
At high TIME, CONC has hardly any effect.
Surface Plot: Pane Options: show points
[Figure: Estimated Response Surface; horizontal axes: Conc and Time, each from -1 to 1; vertical axis: tau]
Contour Plot: Pane Options: Painted Regions
[Figure: Contours of Estimated Response Surface; x-axis: CONC, y-axis: TIME, each from -1 to 1; TAU regions from 41.0 to 141.0 in steps of 10.0]
The same TAU can be obtained with low CONC at high TIME.
At high CONC, TIME has only a small effect.
Diagnostic Plot:
Pane Options: Residuals vs. Run Order
[Figure: Residual Plot for TAU; x-axis: run number, y-axis: residual]
Pane Options: Residuals vs. Factor A:conc