MINITAB: AN OVERVIEW

Rajender Parsad

I.A.S.R.I., Library Avenue, New Delhi – 110 012 rajender@iasri.res.in

The functionality of MINITAB is accessible through interactive windows and menus, or through a command language called session commands. There are three windows, viz. the Data window, the Session window and the Project Manager. The Data window is a worksheet in spreadsheet format, with rows and columns that intersect to form individual cells. A worksheet can contain up to 4000 columns, 1000 constants, and up to 10,000,000 rows, depending on the memory of the computer. The text output generated by the analyses is displayed in the Session window. The Project Manager contains folders that allow one to navigate, view, and manipulate various parts of the project.

Minitab has advanced Design of Experiments (DOE) capabilities. One can screen the factors to determine which are important for explaining process variation. It can generate two-level full and fractional factorial designs, Plackett-Burman designs, Box-Behnken and central composite designs, simplex centroid and simplex lattice designs, and Taguchi orthogonal array designs. It also allows one to perform one-way analysis of variance, two-way analysis of variance for balanced data, test for equality of variances, and generate various plots. Balanced ANOVA models with crossed or nested and fixed or random factors can also be analyzed. The option General MANOVA analyzes balanced or unbalanced MANOVA models with crossed or nested and fixed or random factors. The analysis of covariance is also possible with the option General MANOVA.

To start working with MINITAB, from the Windows Taskbar choose Start > Programs > MINITAB 14 (MINITAB SOLUTIONS) > MINITAB 14 (MINITAB 15).

Minitab opens with two main windows, viz. the Session window and the Data window. The first screen of MINITAB is shown below.

Under the Data menu, the following options are available:

Subset Worksheet - copies specified rows from the active worksheet to the new worksheet

Split Worksheet - splits or unstacks the active worksheet into two or more new worksheets based on one or more "By" variables

Merge Worksheets - combines two worksheets into one new worksheet

Sort - sorts one or more columns of data

Rank - assigns rank scores to values in a column

Delete Rows - deletes specified rows from columns in the worksheet

Erase Variables - erases any combination of columns, stored constants and matrices

Copy - copies selections from one position in the worksheet to another; can copy entire selections or a subset

Stack - stacks columns on top of each other to make longer columns

Unstack - unstacks (or splits) columns into shorter columns

Transpose Columns - switches columns to rows

Concatenate - combines two or more text columns side by side into one new column

Code - recodes values in columns

Change Data Type - changes columns from one data type (such as numeric, text, or date/time) to another

Display Data - displays data from the current worksheet in the Session window

Extract from Date/Time to Numeric/Text - extracts one or more parts of a date/time column, such as the year, the quarter, or the hour, and saves that data in a numeric or a text column.

In the worksheet, one can enter the data in columns numbered C1, C2, …. The names of the variables can be written in the row below the row containing the column numbers C1, C2, ….

The Calc menu has the following sub-options:

Calculator - does arithmetic using an algebraic expression, which may contain arithmetic operations, comparison operations, logical operations, and functions

Column Statistics - calculates various statistics based on a column you select

Row Statistics - calculates various statistics for each row of the columns you select

Standardize - centers and scales columns of data

Make Patterned Data - provides an easy way to fill a column with numbers or date/time values that follow a pattern. See also Generating Patterned Data Overview for related information.

Make Mesh Data - creates a regular (x,y) mesh to use for drawing contour, 3D surface and wireframe plots, with the option to create the z-variable as well


Make Indicator Variables - creates indicator (dummy) variables that you can use in regression analysis. See also Generating Patterned Data Overview for related information.

Set Base - fixes a starting point for Minitab's random number generator

Random Data - displays commands for generating a random sample of numbers, sampled either from columns of the worksheet or from a variety of distributions

Probability Distributions - displays commands that allow you to compute probabilities, probability densities, cumulative probabilities, and inverse cumulative probabilities for continuous and discrete distributions

Matrices - displays commands for doing matrix operations

The main menu for statistical data analysis is Stat. Under this menu, the following sub-options are available:

Basic Statistics

Regression

ANOVA (Analysis of Variance)

DOE (Design of Experiments)

Control Charts

Quality Tools

Reliability/Survival

Multivariate

Time Series

Tables

Nonparametrics

EDA (Exploratory Data Analysis)

Power and Sample Size

In Basic Statistics, the following sub-options can be used by selecting Stat > Basic Statistics and then one of the following commands: Display Descriptive Statistics, Store Descriptive Statistics, Graphical Summary, 1-Sample Z, 1-Sample t, 2-Sample t, Paired t, 1 Proportion, 2 Proportions, 1-Sample Poisson Rate, 2-Sample Poisson Rate, 1 Variance, 2 Variances, Correlation, Covariance, Normality Test, Goodness-of-Fit Test for Poisson. Further sub-options can then be used.

For performing regression analysis, from the menus choose Stat > Regression and then select one of the following commands to fit a model relating a response to one or more predictors :

Regression - does simple, multiple and polynomial regression

Stepwise - does stepwise regression, forward selection, and backward elimination

Best Subsets - does best subsets regression

Fitted Line Plot - fits a simple linear or polynomial regression model and plots the regression line through the actual data or the log10 of the data

Partial Least Squares - does partial least squares regression

Binary Logistic Regression - does logistic regression for a binary response variable

Ordinal Logistic Regression - does logistic regression for an ordinal response variable

Nominal Logistic Regression - does logistic regression for a nominal response variable


For performing analysis of variance, choose Stat > ANOVA. This option allows one to perform analysis of variance, test for equality of variances, and generate various plots. The analysis can be carried out using the suitable sub-option:

One-Way - performs a one-way analysis of variance, with the response in one column, subscripts in another and performs multiple comparisons of means

One-Way (Unstacked) - performs a one-way analysis of variance, with each group in a separate column

Two-way - performs a two-way analysis of variance for balanced data

Analysis of Means - displays an Analysis of Means chart for normal, binomial, or Poisson data

Balanced ANOVA - analyzes balanced ANOVA models with crossed or nested and fixed or random factors

General Linear Model - analyzes balanced or unbalanced ANOVA models with crossed or nested and fixed or random factors. You can include covariates and perform multiple comparisons of means.

Fully Nested ANOVA - analyzes fully nested ANOVA models and estimates variance components

Balanced MANOVA - analyzes balanced MANOVA models with crossed or nested and fixed or random factors

General MANOVA - analyzes balanced or unbalanced MANOVA models with crossed or nested and fixed or random factors. You can also include covariates.

Test for Equal Variances - performs Bartlett's and Levene's tests for equality of variances

Interval Plot - produces graphs that show the variation of group means by plotting standard error bars or confidence intervals

Main Effects Plot - generates a plot of response main effects

Interactions Plot - generates an interaction plot (or matrix of plots)

Minitab can also be used for generating the layout of two-level full and fractional factorial designs using Stat > DOE > Factorial. For generating Box-Behnken and central composite designs, use Stat > DOE > Response Surface. Simplex centroid and simplex lattice designs for mixture experiments can be obtained using Stat > DOE > Mixture. Taguchi orthogonal arrays can be generated using Stat > DOE > Taguchi.

Minitab can perform principal components analysis, factor analysis, cluster analysis, discriminant analysis, and correspondence analysis. For performing multivariate data analysis, choose: Stat > Multivariate and then any one of the following sub-options depending upon the analysis required to be performed.

Principal Components - performs principal components analysis

Factor Analysis - performs factor analysis

Item Analysis - performs item analysis


Cluster Observations - performs agglomerative hierarchical clustering of observations

Cluster Variables - performs agglomerative hierarchical clustering of variables

Cluster K-Means - performs K-means non-hierarchical clustering of observations

Discriminant Analysis - performs linear and quadratic discriminant analysis

Simple Correspondence Analysis - performs simple correspondence analysis on a two-way contingency table

Multiple Correspondence Analysis - performs multiple correspondence analysis on three or more categorical variables

Choosing Stat > EDA performs exploratory data analysis to explore data before using more traditional methods, or to examine residuals from a model. These methods are particularly useful for identifying extraordinary observations and noting violations of traditional assumptions such as nonlinearity or nonconstant variance. The following sub-options may be used:

Stem-and-Leaf - does a stem-and-leaf plot

Boxplot - does a box-and-whiskers plot

Letter Values - prints a letter-value display

Median Polish - uses median polish to analyze a two-way layout

Resistant Line - fits a line to data using a procedure that is resistant to outliers

Resistant Smooth - smoothes data (usually a time series)

Rootogram - prints a suspended rootogram

Minitab may also be used for Control Charts, Quality Tools, Reliability/Survival, Time Series, Tables, Nonparametrics, and Power and Sample Size. The other menus in Minitab are Graph, Editor, Tools, Windows and Help. Once we click on Help, we get the following screen.


Some practical exercises using MINITAB are given in the sequel.

t-test

Example 2.1: In a certain experiment to compare two types of pig foods A and B, the following increases in weight were observed in the same set of 8 pigs:

Food A: 49 53 51 52 47 50 52 53

Food B: 52 55 52 53 50 54 54 53

Can we conclude that food B is better than A?

Solution: Paired t-test is to be used here.

The data have to be entered in the worksheet of MINITAB in two separate columns, C1 and C2, in the following manner:

C1   C2
49   52
53   55
51   52
52   53
47   50
50   54
52   54
53   53

Steps: Stat > Basic Statistics > Paired t. Enter C1 in First sample and C2 in Second sample, then click OK.

Output:

Paired T-Test and CI: C1, C2

Paired T for C1 - C2

             N   Mean       StDev     SE Mean
C1           8   50.8750    2.1002    0.7425
C2           8   52.8750    1.5526    0.5489
Difference   8   -2.00000   1.30931   0.46291

95% CI for mean difference: (-3.09461, -0.90539)

T-Test of mean difference = 0 (vs not = 0): T-Value = -4.32 P-Value = 0.003
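The paired t statistic in the output above can be cross-checked by hand. The following is a minimal Python sketch, not part of MINITAB, that uses only the standard library and the data of Example 2.1; the function name `paired_t` is an illustrative choice.

```python
import math
import statistics

# Weight increases from Example 2.1 (same 8 pigs under both foods).
food_a = [49, 53, 51, 52, 47, 50, 52, 53]
food_b = [52, 55, 52, 53, 50, 54, 54, 53]


def paired_t(first, second):
    """Return (mean difference, SD, SE, t) for paired samples first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)       # sample SD, divisor n - 1
    se_diff = sd_diff / math.sqrt(n)        # standard error of the mean difference
    return mean_diff, sd_diff, se_diff, mean_diff / se_diff


mean_diff, sd_diff, se_diff, t_value = paired_t(food_a, food_b)
print(f"mean difference = {mean_diff:.5f}")
print(f"StDev = {sd_diff:.5f}, SE Mean = {se_diff:.5f}")
print(f"T-Value = {t_value:.2f} on {len(food_a) - 1} degrees of freedom")
```

The p-value requires the t distribution, which the standard library does not provide, so it is left to MINITAB here.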

Correlation and Regression

Example 2.2: In diabetic rats the blood sugar and endogenous insulin levels were estimated. Find out whether there is a correlation between these two parameters.

               1    2    3    4    5    6    7    8
Insulin (IU)   16   21   18   11   10   8    20   11

Solution: To obtain the correlation coefficient using MINITAB, from the menus choose Stat > Basic Statistics > Correlation. Select two or more numeric variables, check the box Display p-values and click OK.

The output of the above example with MINITAB is

Pearson correlation of x and y = -0.984

P-Value = 0.000


To calculate Spearman's rank correlation coefficient using MINITAB, ensure that there are no missing values in the data. If the data are not ranked, use Data > Rank to rank each column, and then compute Pearson's correlation on the columns of ranked data as explained earlier. Uncheck Display p-values, as the p-value given there is not accurate for Spearman's r and should not be used to interpret it.
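The rank-then-correlate procedure described above can be sketched outside MINITAB as follows. This is a minimal illustration in Python; the data are hypothetical, and the assumption that tied values receive the average of their ranks is stated in the code.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks
    (an assumption made for this sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1      # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's r: Pearson's correlation computed on the ranked columns."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y increases with x, but not linearly.
x = [10, 20, 30, 40, 50]
y = [1.0, 8.0, 27.0, 64.0, 125.0]
print("Pearson  r =", round(pearson(x, y), 4))
print("Spearman r =", round(spearman(x, y), 4))
```

Because the relationship in the hypothetical data is monotone, the rank correlation reaches 1 while Pearson's coefficient stays below it.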

To obtain the partial correlation using MINITAB:

1 Regress the first variable on the other variables and store the residuals.

2 Regress the second variable on the other variables and store the residuals.

3 Calculate the correlation between the two columns of residuals.
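The three steps above can be illustrated with a short Python sketch. It assumes a single controlling variable z, so that each regression is a simple linear regression; the data and function names are hypothetical. The result is compared with the usual first-order partial correlation formula as a check.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(response, predictor):
    """Residuals from the least-squares line of response on predictor."""
    n = len(response)
    mp, mr = sum(predictor) / n, sum(response) / n
    slope = (sum((p - mp) * (r - mr) for p, r in zip(predictor, response))
             / sum((p - mp) ** 2 for p in predictor))
    intercept = mr - slope * mp
    return [r - (intercept + slope * p) for p, r in zip(predictor, response)]


def partial_correlation(x, y, z):
    """Steps 1 to 3: regress x on z, regress y on z, correlate the residuals."""
    return pearson(residuals(x, z), residuals(y, z))


# Hypothetical data for illustration only.
x = [2, 4, 5, 7, 9, 10]
y = [1, 3, 6, 6, 8, 12]
z = [1, 2, 3, 4, 5, 7]

r_xy_z = partial_correlation(x, y, z)

# Check against the closed-form first-order partial correlation.
rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
formula = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
print(round(r_xy_z, 6), round(formula, 6))
```

With more than one controlling variable the two regressions become multiple regressions, which is what the MINITAB steps perform.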

Example 2.3: Given the following data, fit a simple linear regression equation between y and x1. Also fit a multiple linear regression equation with y as dependent and x1, x2, x3 and x4 as independent variables.

Observation No.   y       x1   x2   x3   x4
1                 78.5    7    26   6    60
2                 74.3    1    29   15   22
3                 104.3   11   56   8    20
4                 87.6    11   31   8    47
5                 95.9    7    52   6    33
6                 109.2   11   55   9    22
7                 102.7   3    71   17   6
8                 72.5    1    31   22   44
9                 93.1    2    54   18   22
10                115.9   21   47   4    26
11                83.8    1    40   23   34
12                113.3   11   66   9    12
13                119.4   10   68   8    12

For fitting a regression equation using MINITAB, from the menus choose Stat > Regression. Select the response variable and then one or more independent variables.

Multiple Linear Regression

The output for the above example obtained using MINITAB is

Regression Analysis: y versus x1, x2, x3, x4

The regression equation is
y = 53.6 + 1.59 x1 + 0.661 x2 + 0.084 x3 - 0.076 x4

Predictor   Coef      SE Coef   T       P
Constant    53.6300   10.2700   5.22    0.001
x1          1.5887    0.2670    5.95    0.000
x2          0.6606    0.1140    5.79    0.000
x3          0.0845    0.2493    0.34    0.743
x4          -0.0758   0.1144    -0.66   0.526

RMSE (S) = 3.00032   R-Sq = 97.7%   R-Sq(adj) = 96.5%

Analysis of Variance

Source           DF   SS        MS       F       P
Regression       4    3015.59   753.90   83.75   0.000
Residual Error   8    72.02     9.00
Total            12   3087.61

Source   DF   Seq SS
x1       1    1546.50
x2       1    1462.49
x3       1    2.64
x4       1    3.96

From the above example, it can be seen that 97.7% of the variation in y is explained by x1, x2, x3 and x4. The coefficients of x1 and x2 are significantly different from zero, whereas those of x3 and x4 are not.
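The summary statistics quoted in the regression output follow from the sums of squares and degrees of freedom in its analysis of variance table. A minimal Python sketch of that arithmetic is given below; only the rounded values printed in the output are used, so the last digit can differ slightly from MINITAB's own figures.

```python
import math

# Values read from the Analysis of Variance table of Example 2.3.
ss_regression, df_regression = 3015.59, 4
ss_error, df_error = 72.02, 8
ss_total = ss_regression + ss_error
df_total = df_regression + df_error

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

r_sq = ss_regression / ss_total                      # proportion of variation explained
r_sq_adj = 1 - ms_error / (ss_total / df_total)      # adjusted for degrees of freedom
f_value = ms_regression / ms_error                   # overall F on (4, 8) df
s = math.sqrt(ms_error)                              # root mean square error

print(f"R-Sq = {100 * r_sq:.1f}%  R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"F = {f_value:.2f}  S = {s:.4f}")
```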

ANOVA and ANCOVA

Example 2.4: A trial was designed to evaluate 15 rice varieties grown in soil with a toxic level of iron. The experiment was in a RCB design with three replications. Guard rows of a susceptible check variety were planted on two sides of each experimental plot. Scores for tolerance for iron toxicity were collected from each experimental plot as well as from guard rows. For each experimental plot, the score of susceptible check (averaged over two guard rows) constitutes the value of the covariate for that plot. Data on the tolerance score of each variety (Y variable) and on the score of the corresponding susceptible check (X variable) are shown below:

Scores for tolerance for iron toxicity (Y) of 15 rice varieties and those of the corresponding guard rows of a susceptible check variety (X) in a RCB trial

Variety    Rep I       Rep II      Rep III
Number     X    Y      X    Y      X    Y
1.         15   22     16   13     16   14
2.         16   14     15   23     15   23
3.         15   24     15   24     15   23
4.         16   13     15   23     15   23
5.         17   17     17   16     16   16
6.         16   14     15   23     15   23
7.         16   13     15   23     16   13
8.         16   16     17   17     16   16
9.         17   14     15   23     15   24
10.        17   17     17   17     15   26
11.        16   15     15   24     15   25
12.        16   15     15   23     15   23
13.        15   24     15   24     16   15
14.        15   25     15   24     15   23
15.        15   24     15   25     16   16

For performing the ANOVA for the above data using MINITAB, first enter the data in the worksheet of MINITAB in four columns, C1: rep; C2: trt; C3: Y and C4: X. Now from the menus choose Stat > ANOVA > General Linear Model. In the Response box, enter the variable Y; in the Model box, enter trt rep. Specify trt as the term for comparing means and choose the method for multiple comparisons. As the interest is in making all possible pairwise treatment comparisons, select the Tukey or Bonferroni method. Check the box Test for multiple comparison output. If only ANOVA is to be performed, then C4 is not required. The output obtained is given in the sequel.

The usual analysis of variance without using the covariate (X variable) is as follows:

Source        DF   SS   Mean Square   F (F-calc)   p (Pr>F)
Treatment               18.99         1.04         0.445
Replication             52.02         2.85         0.075
Error                   18.24

R-Square          R-Sq(Adj)   s (Root MSE)   C.V.      Y-Mean
0.4201 (42.01%)   8.88%       4.2704         21.5436   19.82222

The least squares treatment means for the tolerance score are

Treatment   Mean    SE mean
1           16.33   2.466
2           20.00   2.466
3           23.67   2.466
4           19.67   2.466
5           16.33   2.466
6           20.00   2.466
7           16.33   2.466
8           16.33   2.466
9           20.33   2.466
10          20.00   2.466
11          21.33   2.466
12          20.33   2.466
13          21.00   2.466
14          24.00   2.466
15          21.67   2.466

Neither the Bonferroni Simultaneous Tests nor the Tukey Simultaneous Tests for all possible pairwise treatment comparisons gave any p < 0.05.
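The analysis without the covariate is an ordinary randomized complete block ANOVA, so its mean squares, F ratios, standard error of a treatment mean and C.V. can be reproduced from the Y scores alone. The following Python sketch does this with the textbook sum-of-squares formulas; it is an independent cross-check, not MINITAB's General Linear Model routine, and the function name `rcb_anova` is illustrative.

```python
import math

# Tolerance scores Y of Example 2.4: one row per variety, one column per replication.
scores = [
    [22, 13, 14], [14, 23, 23], [24, 24, 23], [13, 23, 23], [17, 16, 16],
    [14, 23, 23], [13, 23, 13], [16, 17, 16], [14, 23, 24], [17, 17, 26],
    [15, 24, 25], [15, 23, 23], [24, 24, 15], [25, 24, 23], [24, 25, 16],
]


def rcb_anova(table):
    """Mean squares and F ratios for a randomized complete block layout."""
    t, r = len(table), len(table[0])
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / (t * r)
    ss_total = sum(v * v for row in table for v in row) - correction
    ss_trt = sum(sum(row) ** 2 for row in table) / r - correction
    rep_totals = [sum(row[j] for row in table) for j in range(r)]
    ss_rep = sum(x * x for x in rep_totals) / t - correction
    ss_err = ss_total - ss_trt - ss_rep
    ms_trt = ss_trt / (t - 1)
    ms_rep = ss_rep / (r - 1)
    ms_err = ss_err / ((t - 1) * (r - 1))
    return {
        "ms_trt": ms_trt, "ms_rep": ms_rep, "ms_err": ms_err,
        "f_trt": ms_trt / ms_err, "f_rep": ms_rep / ms_err,
        "grand_mean": grand_total / (t * r),
    }


result = rcb_anova(scores)
treatment_means = [sum(row) / len(row) for row in scores]
se_mean = math.sqrt(result["ms_err"] / len(scores[0]))       # SE of a treatment mean
cv = 100 * math.sqrt(result["ms_err"]) / result["grand_mean"]

print({k: round(v, 2) for k, v in result.items()})
print("SE mean =", round(se_mean, 3), " C.V. =", round(cv, 2))
```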

For performing analysis of covariance, in addition to the above, define the covariate X in the dialog box. Using the covariate, the analysis is as follows:

Source        DF   Seq SS    Adj. SS   Mean Square   F (F-calc)   p (Pr>F)
x             1    589.430   398.752   398.752       96.24        0.000
Treatment     14   156.797   152.561   10.897        2.63         0.015
Replication   2    22.480    22.480    11.240        2.71         0.084
Error         27   111.871   111.871   4.143

R-Square          R-Sq(Adj)   s (Root MSE)   C.V.      Y-Mean
0.8730 (87.30%)   79.30%      2.03552        10.2689   19.82222

Term       Coef      SE Coef   T       P
Constant   114.673   9.673     11.85   0.000

It is interesting to note that the use of a covariate has resulted in a considerable reduction in the error mean square, and hence the CV has also reduced drastically. This has helped in detecting the small differences among the treatment effects as significant, which was not possible when the covariate was not used. The covariance analysis thus results in a more precise comparison of treatment effects. The least squares treatment means for the tolerance score are

Treatment   Mean    SE mean
1           16.87   1.177
2           18.51   1.185
3           20.15   1.229
4           18.18   1.185
5           22.96   1.356
6           18.51   1.185
7           16.87   1.177
8           20.93   1.265
9           20.87   1.177
10          24.60   1.265
11          19.84   1.185
12          18.84   1.185
13          19.51   1.185
14          20.48   1.229
15          20.18   1.185

The probabilities of significance of pairwise comparisons among the least squares estimates of the treatment effects, based on Tukey Simultaneous Tests, are given below

i/j   1        2        3        4        5        6        7        8
1     .
2     0.9994   .
3     0.8280            .
4     1.0000   1.0000   0.9959   .
5     0.0930   0.5359   0.9754   0.4249   .
6     0.9994   1.0000   0.9994   1.0000   0.5359   .
7     1.0000   0.9994   0.8280   1.0000   0.0930   0.9994   .
8     0.5536   0.9840   1.0000   0.9551   0.9945   0.9840   0.5536   .
9     0.5302   0.9789   1.0000   0.9418   0.9958   0.9789   0.5302   1.0000
10    0.0077   0.0930   0.5359   0.0622   0.9994   0.0930   0.0077   0.6586
11    0.8890   0.9999   1.0000   0.9992   0.9219   0.9999   0.8890   1.0000
12    0.9959   1.0000   1.0000   1.0000   0.6510   1.0000   0.9959   0.9958
13    0.9504   1.0000   1.0000   0.9999   0.8529   1.0000   0.9504   0.9999
14    0.7204   0.9959   1.0000   0.9829   0.9917   0.9959   0.7204   1.0000
15    0.7967   0.9992   1.0000   0.9949   0.9655   0.9992   0.7967   1.0000

i/j   9        10       11       12       13       14       15
9     .
10             .
11    1.0000   0.3659   .
12    0.9945   0.1363   1.0000   .
13    0.9999   0.2713   1.0000   1.0000   .
14    1.0000   0.6510   1.0000   0.9994   1.0000   .
15    1.0000   0.4762   1.0000   0.9999   1.0000   1.0000   .

Treatments 1 and 10, and treatments 7 and 10, are found to be significantly different (p = 0.0077 in each case).

Combined Analysis of Data

For the data in Example 6.2 in Fundamentals of Design of Experiments given in Module 2:

Enter the data in the worksheet of MINITAB in 5 columns, C1: Year (Yr); C2: Rep; C3: blk; C4: trt; C5: Yield. Here Yr, Rep, blk and trt denote the year, replication, block and treatment, respectively.

First, split the worksheet into separate worksheets for the two years. This can be achieved by selecting Data > Split Worksheet, by variable Yr. Now, using the worksheet for Year 1, choose from the menus Stat > ANOVA > General Linear Model. In the Response box, enter the variable yield; in Model, enter Rep blk(rep) trt and click OK.

The output obtained is given in the sequel.

Analysis of Variance for yield: Year 1 (Using Adjusted SS for Tests)

Source     DF    Seq SS     Adj SS     Adj MS   F      P
rep        3     186.046    186.046    62.015   7.53   0.000
blk(rep)                    358.943    14.956   1.82
trt              3442.148   3442.148   71.711   8.70   0.000
Error      120   988.707    988.707    8.239
Total      195   6025.758

S = 2.87040   R-Sq = 83.59%   R-Sq(adj) = 73.34%

Similarly, the analysis of the data for the second year can be performed. The results obtained are given in the sequel.

Analysis of Variance for yield: Year 2 (Using Adjusted SS for Tests)

Source     DF    Seq SS     Adj SS     Adj MS   F       P
rep        3     176.399    176.399    58.800   11.81   0.000
blk(rep)                    556.491    23.187   4.66
trt        48    3353.212   3353.212   69.859   14.03   0.000
Error      120   597.305    597.305    4.978
Total      195   5413.927

S = 2.23104   R-Sq = 88.97%   R-Sq(adj) = 82.07%

The interpretations are the same as given in Example 2, Section 6. Equality of error variances can be tested using the F-test. Here the error variances of the two years (error mean squares 8.239 and 4.978) are heterogeneous. Therefore, the data were transformed by dividing each observation by the root mean square error of the corresponding year. For this, we create a new column of root mean square error in the worksheet and create a new variable = original variable/sqrt(MSE) using Calc > Calculator.
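The transformation described above amounts to scaling each observation by the root mean square error of its own year. A minimal Python sketch is given below; the error mean squares are those of the two yearly analyses (8.239 and 4.978), while the yield records and the function name `scale_by_root_mse` are hypothetical and serve only to show the calculation.

```python
import math

# Error mean squares from the separate analyses of Year 1 and Year 2.
error_ms = {1: 8.239, 2: 4.978}

# Ratio of the larger to the smaller error mean square, the statistic of the
# F-test for equality of the two error variances (120 and 120 df).
f_ratio = max(error_ms.values()) / min(error_ms.values())


def scale_by_root_mse(year, value):
    """New variable = original variable / sqrt(MSE of the corresponding year)."""
    return value / math.sqrt(error_ms[year])


# Hypothetical (year, yield) records, for illustration only.
records = [(1, 25.4), (1, 31.0), (2, 25.4), (2, 18.7)]
transformed = [(yr, round(scale_by_root_mse(yr, y), 4)) for yr, y in records]

print("F ratio of error mean squares =", round(f_ratio, 3))
print(transformed)
```

After this scaling the error variance within each year is approximately one, so the two years can be analysed together on a common footing.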

In addition to the above steps, select the new variable as the response variable, enter the model as yr rep(yr) blk(rep yr) trt trt*yr, and define yr in the Random Factors subdialog box. Now click on Results and check Display expected mean squares and variance components. The results obtained are

General Linear Model: Transformed Variable versus yr, trt, rep, blk

Factor        Type     Levels   Values
yr            random   2        1, 2
rep(yr)       random   8        1, 2, 3, 4, 1, 2, 3, 4
blk(yr rep)   random   56       1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3,
                                4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6,
                                7, 1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5, 6, 7, 1, 2,
                                3, 4, 5, 6, 7
trt           fixed    49       1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
                                16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
                                28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
                                40, 41, 42, 43, 44, 45, 46, 47, 48, 49

Analysis of Variance for transformed variable, using Adjusted SS for Tests

Source        DF    Seq SS     Adj SS     Adj MS     F        P
yr            1     4911.415   4911.415   4911.415   422.37   0.000 x
rep(yr)       6     58.828     58.828     9.805      2.80     0.023 x
blk(yr rep)   48    439.557    139.74     2.911      2.58     0.00
trt           48    968.42     968.42     20.175     7.40     0.00
yr*trt        48    130.857    130.857    2.726      2.41     0.00
Error         240   271.335    271.335    1.131
Total         391   6780.412

x Not an exact F-test.

S = 1.06328   R-Sq = 96.00%   R-Sq(adj) = 93.48%

Expected Mean Squares, using Adjusted SS

Source Expected Mean Square for Each Term

1 yr (6) + 4.0000 (5) + 7.0000 (3) + 49.0000 (2) + 196.0000 (1)

2 rep(yr) (6) + 7.0000 (3) + 49.0000 (2)

3 blk(yr rep) (6) + 5.2500 (3)

4 trt (6) + 3.5000 (5) + Q[4]

5 yr*trt (6) + 3.5000 (5)

6 Error (6)

Error Terms for Tests, using Adjusted SS

    Source        Error DF   Error MS   Synthesis of Error MS
1   yr            8.33       11.628     (2) + 1.1429 (5) - 1.1429 (6)
2   rep(yr)       39.06      3.505      1.3333 (3) - 0.3333 (6)
3   blk(yr rep)   240.00     1.131      (6)
4   trt           48.00      2.726      (5)
5   yr*trt        240.00     1.131      (6)
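The synthesized error terms in the table above are linear combinations of the adjusted mean squares, and the fractional error degrees of freedom agree with what Satterthwaite's approximation gives. The Python sketch below reproduces the two synthesized rows under that assumption; it uses the rounded mean squares printed in the ANOVA table, so small differences in the last digit are expected, and the function name `synthesize` is illustrative.

```python
def synthesize(terms):
    """terms: list of (coefficient, mean square, df).
    Returns the synthesized error MS and its Satterthwaite-type approximate df."""
    ms = sum(c * m for c, m, _ in terms)
    df = ms ** 2 / sum((c * m) ** 2 / d for c, m, d in terms)
    return ms, df


# Adjusted mean squares and df from the combined ANOVA table:
# (2) rep(yr), (3) blk(yr rep), (5) yr*trt, (6) Error.
ms2, df2 = 9.805, 6
ms3, df3 = 2.911, 48
ms5, df5 = 2.726, 48
ms6, df6 = 1.131, 240

# yr: (2) + 1.1429 (5) - 1.1429 (6)
ms_yr, df_yr = synthesize([(1.0, ms2, df2), (1.1429, ms5, df5), (-1.1429, ms6, df6)])
# rep(yr): 1.3333 (3) - 0.3333 (6)
ms_rep, df_rep = synthesize([(1.3333, ms3, df3), (-0.3333, ms6, df6)])

f_yr = 4911.415 / ms_yr        # approximate F for yr
f_rep = ms2 / ms_rep           # approximate F for rep(yr)

print(f"yr:      Error MS = {ms_yr:.3f}, Error DF = {df_yr:.2f}, F = {f_yr:.2f}")
print(f"rep(yr): Error MS = {ms_rep:.3f}, Error DF = {df_rep:.2f}, F = {f_rep:.2f}")
```

These are the two tests flagged in the ANOVA table as not being exact F-tests.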

It can easily be seen that the testing of random effects has been done in one step using MINITAB.

Factorial Experiments

The data given in Example 7.1 can be analyzed using MINITAB as follows.

Enter the data in the worksheet of MINITAB in 6 columns, C1: Rep; C2: Block; C3: N; C4: P; C5: K; C6: Yield.

Choose Stat > ANOVA > General Linear Model. In the dialog box, define Yield as the response variable. In the model, define rep block(rep) n p k n*p n*k p*k n*p*k. Now choose Comparisons and click the radio button Pairwise Comparisons. In Terms, define n p k n*p n*k p*k n*p*k, and check the boxes Tukey Method and Test.

Analysis of Variance for yield, using Adjusted SS for Tests

Source     DF   Seq SS     Adj SS    Adj MS    F       P
rep        3    15.7187    15.7187   5.2396    10.71   0.000
blk(rep)   8    14.5571    14.1946   1.7743    3.63    0.003
n          2    89.1108    89.1108   44.5554   91.05   0.000
p          2    55.9270    55.9270   27.9635   57.14   0.000
k          1    3.2173     3.2173    3.2173    6.57    0.014
n*p        4    4.2752     4.2752    1.0688    2.18    0.087
n*k        2    0.7301     0.7301    0.3650    0.75    0.480
p*k        2    0.1128     0.1128    0.0564    0.12    0.891
n*p*k      4    2.1958     2.1958    0.5490    1.12    0.359
Error      43   21.0427    21.0427   0.4894
Total      71   206.8876

S = 0.699547   R-Sq = 89.83%   R-Sq(adj) = 83.21%

The probabilities of significance of pairwise comparisons among levels of N based on Tukey Simultaneous Tests are

i/j   40       80   120
40    .
80    0.0000   .

The probabilities of significance of pairwise comparisons among levels of P based on Tukey Simultaneous Tests are

i/j   0        40   80
0     .
40    0.0000   .

The probability of significance of the pairwise comparison between the levels of K based on Tukey Simultaneous Tests is

K        Difference of Means   SE of Difference   T-Value   Adjusted P-Value
40 - 0   0.4228                0.1649             2.564     0.0139

Similarly, the probabilities of significance of pairwise comparisons among levels of N*P, N*K, P*K and N*P*K based on Tukey Simultaneous Tests can be obtained.

Diagnostics and Remedial Measures

Steps for carrying out these diagnostics and remedial measures using MINITAB:

First of all, fit the model as per the design adopted, using Stat > ANOVA > General Linear Model from the menus; in the dialog box select Storage and store the residuals in a column of the worksheet. Once the residuals are stored in the worksheet, use the following steps.

Testing Normality

From the menus choose Stat > Basic Statistics > Normality Test. In the dialog box, select the stored residuals as the variable in the Variable list, then select one of the three tests, viz. Anderson-Darling, Ryan-Joiner and Kolmogorov-Smirnov, and click OK.

Test for Homogeneity of Variances

From the menus choose Stat > ANOVA > Test for Equal Variances. In the dialog box, select the stored residuals in the Response box and the treatment in the Factors box, then choose the confidence level and click OK.

Transformations of Data

For making logarithmic, square root and arcsine transformations, one can use Calc > Calculator. Enter a target column of the worksheet in which the result is to be stored, then define the function to be used for the transformation in the Expression box. For logarithmic transformation, define LOGT(column number or name of the variable to be transformed) and click OK. The transformed data will be stored in the target column. For square root transformation, use SQRT(column number or name of the variable to be transformed) in the Expression box, and for arcsine transformation of percentage data use the expression ASIN(SQRT(column/100))*180*7/22. The multiplication by 180*7/22 converts the result from radians to degrees, with 22/7 used as an approximation of pi. If the original data lie between 0 and 1, do not divide by 100.
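For readers who want to check transformed values outside MINITAB, the three transformations can be sketched in Python as below. The function names are illustrative; the logarithm is taken to base 10, and the arcsine transformation is shown both with the exact radians-to-degrees conversion and with the 180*7/22 factor, to show how close the two are.

```python
import math


def log_transform(value):
    """Logarithmic transformation to base 10."""
    return math.log10(value)


def sqrt_transform(value):
    """Square root transformation."""
    return math.sqrt(value)


def arcsine_transform(percent):
    """Arcsine (angular) transformation of a percentage, in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100)))


def arcsine_transform_22_over_7(percent):
    """Same transformation, using 180*7/22 to convert radians to degrees."""
    return math.asin(math.sqrt(percent / 100)) * 180 * 7 / 22


for p in (5, 25, 50, 90):
    print(p, round(arcsine_transform(p), 3), round(arcsine_transform_22_over_7(p), 3))
```

For proportions already lying between 0 and 1, the division by 100 inside the square root would be dropped, as noted above.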

Now perform the analysis again and test the normality and homogeneity of the error terms. If the errors are now normal and homogeneous, perform the analysis on the transformed data; otherwise use an appropriate non-parametric test. For performing the non-parametric analysis, from the menus choose Stat > Nonparametrics > appropriate test (Friedman, say). In the dialog box, select the Response, Treatment and Block variables and click OK.

Example 2.4: Suppose an entomologist is interested in determining whether four different kinds of traps catch equivalent numbers of insects when applied to the same field. Each of the traps is used six times on the field and the resulting data (number of insects per hour) are shown below along with the mean, variance and range.

II IV

7

Y i S i

2

17 2

28 45 31 36

61 98 71 79

118 172 143 168 147

37

From the table it is clear that variances are heterogeneous and variance is proportional to mean.

Obtain the residuals for testing the normality and homogeneity of error terms. The residuals obtained are given below:

Treatment   Replication                                         Sum   Variance
            I        II       III      IV       V        VI           S_i^2
A           -1.00    0.75     10.00    -1.25    3.25     -11.75   0     50.35
B           -14.00   9.75     0.00     -3.25    -4.75    12.25    0     94.85
C           -13.00   11.75    23.00    -19.25   12.25    -14.75   0     314.85
D           28.00    -22.25   -33.00   23.75    -10.75   14.25    0     650.20

Normality of error terms:

Test                      Statistic   p-value
Anderson-Darling (AD)     0.208       0.848
Ryan-Joiner (RJ)          0.992       >0.100
Kolmogorov-Smirnov (KS)   0.110       >0.150

The errors were found to be normally distributed. Therefore, homogeneity of error variances was tested using Bartlett's test.

Using MINITAB, we get the output as

Bartlett's Test (normal distribution)

Test statistic = 8.32, p-value = 0.040

The ratios $S_i^2/\bar{Y}_{i.}$ are 5.77, 5.32, 3.43 and 5.43, indicating that variance is proportional to mean. Therefore, square root transformation should be used. After application of square root transformation, the residuals are

Treatment   Replication                                                        Variance
            I          II         III        IV         V          VI         S_i^2
A           -0.03614   -0.92542   1.05800    0.20614    0.98287    -1.28544   0.928
B           -1.34939   0.87854    -0.40473   -0.12183   -0.42993   1.42735    0.999
C           -0.28226   0.78841    0.99143    -1.08068   0.30794    -0.72483   0.694
D           1.66779    -0.74153   -1.64469   0.99637    -0.86087   0.58293    1.622

Normality of error terms on the transformed data:

Test                      Statistic   p-value
Anderson-Darling (AD)     0.391       0.353
Ryan-Joiner (RJ)          0.984       >0.100
Kolmogorov-Smirnov (KS)   0.127       >0.150

The errors remain normally distributed after transformation. The results of homogeneity of error variances using Bartlett's test are

Bartlett's Test (normal distribution): Test statistic = 0.89, p-value = 0.828

Hence, we conclude that the errors are normally distributed and have a constant variance after transformation.


The results of analysis of variance with original and transformed data are given in the sequel.

ANOVA: Original Data

Source        DF   Seq SS    Adj. SS   Mean Square   F (F-calc)   p (Pr>F)
Replication   5    689.0     689.0     137.8         0.37         0.86
Treatment     3    70828.5   70828.5   23609.5       63.80        0.00
Error         15   5551.0    5551.0    370.1

Tukey Simultaneous Tests for All Pairwise Treatment Comparisons

    1        2        3        4
1   .
2   0.3525   .
3   0.0001
4   0.0000   0.0000   0.0001   .

ANOVA: Transformed Data

Source        DF   Seq SS    Adj. SS   Mean Square   F (F-calc)   p (Pr>F)
Replication   5    5.055     5.055     1.011         0.71         0.622
Treatment     3    326.603   326.603   108.868       76.98        0.000
Error         15   21.214    21.214    1.414

Tukey Simultaneous Tests for All Pairwise Treatment Comparisons

    1        2        3        4
1   .
2   0.0091   0.0015   .

With transformed data treatments 1 and 2 are significantly different whereas with original data, they were not.

Probit Analysis

Example 1: Finney (1971) gave data representing the effect of a series of doses of rotenone (an insecticide) when sprayed on Macrosiphoniella sanborni (the chrysanthemum aphid). The table below contains the concentration, the number of insects tested at each dose, the number affected, the proportion dying, the log concentration and the probit transformation (probit + 5) of each of the observed proportions.

Concentration (mg/l)   No. of insects (n)   No. of affected (r)   % kill (P)   Log concentration (x)   Empirical probit
0                      49                   0                     0            -                       -

Steps for carrying out the Probit Analysis using MINITAB

For the data given in Example 1, first enter the data in the worksheet of MINITAB in three columns, C1: dose; C2: total insects; C3: insects killed or affected. Now create a column C4 for logdose by using LOGT(C1) from the Calc menu.

Now Choose Stat > Reliability/Survival > Probit Analysis.

In the dialog box, choose the data format "Success/trial" or "Response/frequency". In the present case, the data are in success/trial format; therefore, enter C3, the column containing the number of successes, in the Number of Successes box and C2, the column containing the total number of trials, in the Number of Trials subbox. In the subbox for stress/stimulus, enter C4, the column containing the logdose. Since there is only one stimulus, the subbox pertaining to Factor (optional) may be left blank. Choose the distribution as normal.
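For readers who wish to see what the dialog computes, the following is a minimal Python sketch of a maximum likelihood probit fit by Fisher scoring, using only the standard library. It is an illustration under stated assumptions, not MINITAB's own algorithm. In particular, the dose-wise counts are an assumption: they are not printed in the table above and are the values commonly quoted for Finney's (1971) example, with the control dose 0 omitted because its logarithm is undefined. Their totals (243 insects tested, 132 affected) agree with the Response Information in the MINITAB output. The function name fit_probit is hypothetical.

```python
import math
from statistics import NormalDist

# ASSUMPTION: these counts are not printed in the text; they are the values
# commonly quoted for Finney's example (control dose 0 omitted).
DOSE = [10.2, 7.7, 5.1, 3.8, 2.6]   # C1: concentration, mg/l
TOTAL = [50, 49, 46, 48, 50]        # C2: insects tested
AFFECTED = [44, 42, 24, 16, 6]      # C3: insects affected

STD_NORMAL = NormalDist()


def fit_probit(x, n, r, max_iter=50, tol=1e-10):
    """Maximum likelihood fit of P(affected) = Phi(a + b*x) by Fisher scoring."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        u_a = u_b = i_aa = i_ab = i_bb = 0.0
        for xi, ni, ri in zip(x, n, r):
            eta = a + b * xi
            p = STD_NORMAL.cdf(eta)
            dens = STD_NORMAL.pdf(eta)
            pq = p * (1.0 - p)
            score = (ri - ni * p) * dens / pq   # derivative of log-likelihood w.r.t. eta
            weight = ni * dens * dens / pq      # expected information for eta
            u_a += score
            u_b += score * xi
            i_aa += weight
            i_ab += weight * xi
            i_bb += weight * xi * xi
        # Solve the 2x2 system (information matrix) * step = score vector.
        det = i_aa * i_bb - i_ab * i_ab
        step_a = (i_bb * u_a - i_ab * u_b) / det
        step_b = (i_aa * u_b - i_ab * u_a) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


if __name__ == "__main__":
    logdose = [math.log10(d) for d in DOSE]   # C4 = LOGT(C1)
    a, b = fit_probit(logdose, TOTAL, AFFECTED)
    print(f"Constant = {a:.5f}, logdose = {b:.5f}")
```

With the assumed counts, the estimates should settle close to the Constant and logdose coefficients reported in the Regression Table of the MINITAB output.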

The other options available in the dialog box are Estimate, Graphs, Options, Results and Storage.

Using the option Estimate, one can (i) estimate percentiles for the percents you specify, which are added to the default table of percentiles, and (ii) estimate survival probabilities for the stress values you specify. One can also change the method of estimation for the confidence intervals and the level of confidence. The default option is two-sided 95% fiducial intervals.

Other options may also be used as and when required. For this example, we requested the 65th percentile as an additional percentile and the survival probability at stress level 0.9 (logdose).


Probit Analysis: affect, total versus logdose

Distribution: Normal

Response Information

Variable   Value     Count
affect     Success     132
           Failure     111
total      Total       243

Estimation Method: Maximum Likelihood

Regression Table

                       Standard
Variable       Coef       Error       Z       P
Constant   -2.88746    0.350134   -8.25   0.000
logdose     4.21320    0.478303    8.81   0.000

Log-Likelihood = -120.052

Goodness-of-Fit Tests

Method Chi-Square DF P

Pearson 1.72888 3 0.631

Deviance 1.73897 3 0.628

Tolerance Distribution: Parameter Estimates

Standard 95.0% Normal CI

Parameter Estimate Error Lower Upper

Mean 0.685338 0.0220962 0.642030 0.728646

StDev 0.237349 0.0269451 0.190001 0.296497

Table of Percentiles

Standard 95.0% Normal CI

Percent Percentile Error Lower Upper

1 0.133180 0.0686394 -0.0013503 0.267711

2 0.197882 0.0617254 0.0769020 0.318861

3 0.238933 0.0573944 0.126442 0.351423

4 0.269813 0.0541723 0.163638 0.375989

5 0.294933 0.0515787 0.193840 0.396025

6 0.316313 0.0493935 0.219504 0.413123

7 0.335060 0.0474969 0.241967 0.428152

8 0.351845 0.0458160 0.262047 0.441643

9 0.367110 0.0443030 0.280278 0.453943

10 0.381162 0.0429251 0.297031 0.465294

20 0.485580 0.0332991 0.420314 0.550845

30 0.560872 0.0274617 0.507048 0.614696

40 0.625206 0.0238086 0.578542 0.671870

50 0.685338 0.0220962 0.642030 0.728646

60 0.745470 0.0224241 0.701519 0.789420


65 0.776793 0.0233958 0.730939 0.822648

70 0.809804 0.0249330 0.760936 0.858672

80 0.885096 0.0299366 0.826422 0.943771

90 0.989513 0.0389715 0.913131 1.06590

91 1.00357 0.0402991 0.924581 1.08255

92 1.01883 0.0417626 0.936978 1.10068

93 1.03562 0.0433947 0.950564 1.12067

94 1.05436 0.0452427 0.965688 1.14304

95 1.07574 0.0473792 0.982882 1.16860

96 1.10086 0.0499232 1.00301 1.19871

97 1.13174 0.0530936 1.02768 1.23580

98 1.17279 0.0573685 1.06035 1.28523

99 1.23750 0.0642153 1.11164 1.36336

Table of Survival Probabilities

95.0% Normal CI

Stress Probability Lower Upper

0.9 0.182888 0.122757 0.258650
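The Pearson and deviance statistics in the Goodness-of-Fit Tests table can be recomputed from the fitted coefficients, as sketched below in Python. Two assumptions are made: the dose-wise counts are those commonly quoted for Finney's (1971) example, since they are not printed in this text, and the p-value formula used is the closed form for a chi-square distribution with exactly 3 degrees of freedom, not a general routine. The function names are hypothetical.

```python
import math
from statistics import NormalDist

# ASSUMPTION: dose-wise counts as commonly quoted for Finney's example.
DOSE = [10.2, 7.7, 5.1, 3.8, 2.6]
TOTAL = [50, 49, 46, 48, 50]
AFFECTED = [44, 42, 24, 16, 6]

# Coefficients copied from the Regression Table of the MINITAB output.
CONSTANT, SLOPE = -2.88746, 4.21320


def goodness_of_fit(dose, total, affected, a, b):
    """Return (Pearson chi-square, deviance, degrees of freedom) for the probit fit."""
    pearson = deviance = 0.0
    for d, n, r in zip(dose, total, affected):
        p = NormalDist().cdf(a + b * math.log10(d))
        expected = n * p
        pearson += (r - expected) ** 2 / (n * p * (1.0 - p))
        # Terms with a zero observed count contribute nothing to the deviance.
        if r > 0:
            deviance += 2.0 * r * math.log(r / expected)
        if n - r > 0:
            deviance += 2.0 * (n - r) * math.log((n - r) / (n - expected))
    df = len(dose) - 2   # dose groups minus the two fitted parameters
    return pearson, deviance, df


def chi2_upper_tail_3df(x):
    """P(chi-square with 3 df > x); this closed form is valid for 3 df only."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


if __name__ == "__main__":
    pearson, deviance, df = goodness_of_fit(DOSE, TOTAL, AFFECTED, CONSTANT, SLOPE)
    print(f"Pearson  {pearson:.5f}  DF {df}  P {chi2_upper_tail_3df(pearson):.3f}")
    print(f"Deviance {deviance:.5f}  DF {df}  P {chi2_upper_tail_3df(deviance):.3f}")
```

Large p-values here indicate no evidence of lack of fit for the normal (probit) model.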

Interpretation: The goodness-of-fit tests (p-values = 0.631 and 0.628) suggest that the assumed distribution and the model fit the data adequately. In this case, the fitting is done on the normal equivalent deviate only, without adding 5. Therefore, log LD50 (or log ED50) corresponds to the value Probit = 0. Log LD50 is obtained as 0.685338. Therefore, the stress level at which 50% of the insects will be killed is 10^0.685338 = 4.845 mg/l. Similarly, the stress level at which 65% of the insects will be killed is 10^0.776793 = 5.981 mg/l. At logdose = 0.9, what percentage of insects will be killed? The estimated survival probability at this stress level is 0.182888, i.e., about 18.29% of the insects are expected to survive and the remaining 81.71% to be killed.
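The percentiles, the back-transformed doses and the survival probability can all be derived from the two regression coefficients. The Python sketch below shows the arithmetic under the fitted model P(affected) = Phi(Constant + Slope * logdose). The function names are hypothetical. The survival probability is taken to be the probability of not being affected at the given stress.

```python
from statistics import NormalDist

# Coefficients copied from the Regression Table of the MINITAB output.
CONSTANT, SLOPE = -2.88746, 4.21320
STD_NORMAL = NormalDist()


def log_dose_percentile(percent, a=CONSTANT, b=SLOPE):
    """Log dose at which `percent` % of the insects are expected to be affected."""
    return (STD_NORMAL.inv_cdf(percent / 100.0) - a) / b


def survival_probability(log_dose, a=CONSTANT, b=SLOPE):
    """Probability that an insect is NOT affected at the given log dose."""
    return 1.0 - STD_NORMAL.cdf(a + b * log_dose)


if __name__ == "__main__":
    # Mean and StDev of the tolerance distribution are -a/b and 1/b.
    print(f"Mean = {-CONSTANT / SLOPE:.6f}, StDev = {1.0 / SLOPE:.6f}")
    for percent in (50, 65):
        x = log_dose_percentile(percent)
        print(f"{percent}%: logdose = {x:.6f}, dose = {10 ** x:.3f} mg/l")
    s = survival_probability(0.9)
    print(f"logdose 0.9: survive = {s:.4f}, killed = {1.0 - s:.4f}")
```

The percentile at 50% is the log LD50, and raising 10 to that power converts it back to the concentration scale.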

If more than one factor is used in the experiment, follow the same steps as in Example 1 for the analysis of the data and, in addition, enter the column containing the factor (say, f) in the Factor (optional) subbox.
