Repeated Measures and Related Designs

advertisement

Repeated Measures and Related Designs

I. General introduction

1.

Definition:

Repeated measures designs utilize the same subject for each of the treatment under study.

1

2.

Examples

(1). 15 test markets are to be used to study each of two different advertising campaigns. In each test market, the order of the two campaigns will be randomized, with a sufficient time lapse between the two campaigns so that the effects of the initial campaign will not carry over into the second campaign. The subjects in this study are the test markets.

(2). 200 persons who have persistent migraine headaches are each to be given two different drugs and a placebo, for two weeks each, with the order of the drugs randomized for each person. The subject in the study are the persons with migraine headache.

(3). In a weight loss study, 100 overweight persons are to be given the same diet and their weights measured at the end of each week for 12 weeks to assess the weight loss over time. Here the subjects are the overweight persons, who are observed repeatedly to provide information about the effects of a single treatment over time.

3.

Advantages and disadvantages.

Advantages:

(1). This design provide good precision for comparing treatments because all sources of variability between subjects are excluded from the experimental error.

Only variation within subjects enters the experimental error, since any two treatments can be compared directly for each subject.

(2). This design economizes on subjects.

Disadvantages:

(1). Order effect. It is connected with the position in the treatment order.

Randomization to eliminate order effect.

(2). Carryover effect. It is connected with the preceding treatment or treatments.

Allowing sufficient time between treatments reduces the carryover effect.

1 Designs with repeated measures need to be distinguished from design with repeated observations. In repeated measures designs, several or all of the treatments are applied to the same object. Designs with repeated observations, on the other hand, are designs where several observations on the response variable are made for a given treatment applied to an experiment uint.

Single-factor experiments with repeated measures on all treatments

1.

General notes.

Almost always, the subjects in repeated measures designs are viewed as a random sample from a population. So we will assume the effects of subjects to be random.

2.

Layout for single-factor repeated measures design (n=5, r=4)

Subject

1

Treatment order

1 2 3 4

T4 T3 T2 T1

2

3

4

T3

T4

T2

T4

T3

T1

T1

T1

T4

T2

T2

T3

T1 T2 T4 T3 5

3.

Data structure of the design

Subject

1

2

3

4

5

Treatment

1 2 3 4

Y11 Y12 Y13 Y14

Y21 Y22 Y23 Y24

Y31 Y32

Y41 Y42

Y51 Y52

Y33

Y43

Y53

Y34

Y44

Y54

4.

model

Y ij

    i

  j

  ij

Where

: the grand mean;

 i

: main effect of subject, independent with N ( 0 ,

2

) ;

 j

: main effect of treatment, subject to

  j

0 ;

 ij

: error, independent with N ( 0 ,

2

) ;

 i

and

 ij

are independent. i

1 ,..., n ; j

1 ,..., r .

5.

ANOVA table

Source of variation

Subjects

Treatments

SS

SSS

SSTR

DF n-1 r-1

MSE

SSS/n-1

SSTR/r-1

E(MSE)

2  r

2

 2  n

 r

 j

2

1

 2

Error SSTR.S (r-1)(n-1) SSTR.S/(r-1)(n-1)

Total SSTO nr-1

Where

SSTO= i

  j

( Y ij

Y ) 2

; SSS= r

 i

( Y i .

Y )

2

;

SSTR= n

( Y

.

j

 j

Y )

2

; SSTR.S=

 i j

( Y ij

Y i .

Y

.

j

Y )

2

;

6.

Evaluation of appropriateness of repeated measures model

Consider the residuals: e ij

Y ij

Y i .

Y

.

j

Y .

7.

SAS Code

Example: In a wine judging competition, four wines of the same vintage were judged by six experienced judges. Each judge tasted the wines in a blind fashion. The order of the wine presentation was randomized independently for each judge. To reduce carryover and other interference effects, the judges did not drink the wines and rinsed their mouths thoroughly between tastings. Each wine was scored on a 40-point scale. The data for this competition are presented in the following table.

Judge (i)

Wine (j)

1

2

1 2 3 4

20 24 28 28

15 18 23 24

3

4

5

18 19 24 23

26 26 30 30

22 24 28 26

6 19 21 27 25

We want to testing the treatment (wines) effects.

SAS program:

DATA WINES;

INPUT JUDGE $ WINE $ SCORE @@;

DATALINES ;

J1 W1

J1 W4

20 J1

28 J2

J2 W3 23 J2

J3 W2 19 J3

J4 W1 26 J4

J4 W4 30 J5

J5 W3 28 J5

W2 24 J1

W1 15 J2

W4 24 J3

W3 23 J3

W2 26 J4

W1 22 J5

W4 26 J6

W3 27 J6 J6 W2 21 J6

;

RUN ;

PROC GLM DATA =WINES;

CLASS JUDGE WINE;

MODEL SCORE=JUDGE WINE;

RANDOM JUDGE;

RUN ;

PROC MIXED DATA =WINES;

CLASS JUDGE WINE;

MODEL SCORE=WINE;

RANDOM JUDGE;

RUN ;

W3 28

W2 18

W1 18

W4 24

W3 30

W2 24

W1 19

W4 25

QUIT ;

The results are:

Class Level Information

Class Levels Values

JUDGE 6 J1 J2 J3 J4 J5 J6

WINE 4 W1 W2 W3 W4

Number of Observations Read 24

Number of Observations Used 24

Dependent Variable: SCORE

Sum of

Source DF Squares Mean Square F Value Pr > F

Model 8 356.3333333 44.5416667 39.30 <.0001

Error 15 17.0000000 1.1333333

Corrected Total 23 373.3333333

R-Square Coeff Var Root MSE SCORE Mean

0.954464 4.498231 1.064581 23.66667

Source DF Type I SS Mean Square F Value Pr > F

JUDGE 5 173.3333333 34.6666667 30.59 <.0001

WINE 3 183.0000000 61.0000000 53.82 <.0001

Source DF Type III SS Mean Square F Value Pr > F

JUDGE 5 173.3333333 34.6666667 30.59 <.0001

WINE 3 183.0000000 61.0000000 53.82 <.0001

Source Type III Expected Mean Square

JUDGE Var(Error) + 4 Var(JUDGE)

WINE Var(Error) + Q(WINE)

The Mixed Procedure

Model Information

Data Set WORK.WINES

Dependent Variable SCORE

Covariance Structure Variance Components

Estimation Method REML

Residual Variance Method Profile

Fixed Effects SE Method Model-Based

Degrees of Freedom Method Containment

Class Level Information

Class Levels Values

JUDGE 6 J1 J2 J3 J4 J5 J6

WINE 4 W1 W2 W3 W4

Dimensions

Covariance Parameters 2

Columns in X 5

Columns in Z 6

Subjects 1

Max Obs Per Subject 24

Number of Observations

Number of Observations Read 24

Number of Observations Used 24

Number of Observations Not Used 0

Iteration History

Iteration Evaluations -2 Res Log Like Criterion

0 1 108.98547215

1 1 83.53091940 0.00000000

Convergence criteria met.

The Mixed Procedure

Covariance Parameter

Estimates

Cov Parm Estimate

JUDGE 8.3833

Residual 1.1333

Fit Statistics

-2 Res Log Likelihood 83.5

AIC (smaller is better) 87.5

AICC (smaller is better) 88.2

BIC (smaller is better) 87.1

Type 3 Tests of Fixed Effects

Num Den

Effect DF DF F Value Pr > F

WINE 3 15 53.82 <.0001

Two-Factor Experiments with Repeated Measures on Both Factors

1.

Layout for two-factor repeated measures design with repeated measures on both factors

(a=2, b=2, n=4).

Subject

1

2

2.

Data structure of the design

3

4

Treatment order

1 2 3 4

A1B2 A2B2 A1B1 A2B1

A2B1 A1B2 A2B2 A1B1

A2B2 A1B1 A2B1 A1B2

A1B1 A2B1 A1B2 A2B2

A1

A2

B1

Y111 Y112

Y113 Y114

Y211 Y212

Y213 Y214

B2

Y121 Y122

Y123 Y124

Y221 Y222

Y223 Y224

3.

Model

Y ijk

    i

  j

  k

(



) jk

  ijk

Where

: Grand Total;

 i

: Subject effect, independent with N ( 0 ,

2

) ;

 j

: Main effect of factor A subject to

  j

0 ;

 k

: Main effect of factor B subject to

  k

0 ;

(



) jk

: Interaction effect of A and B subject to

 j

(



) jk

 k

(



) jk

0 for all j.

0 for all k and

 ijk

: error, independent with N ( 0 ,

2

) ;

and i

 ijk

are independent. i

1 ,..., n ; j

1 ,..., a ; k

1 ,..., b .

.

4.

ANOVA table

Source of Variation

Subject

Factor A

Factor B

AB interaction

SS

SSS

SSA

SSB

DF n-1 a-1 b-1

MSE

SSS/n-1

SSA/a-1

SSB/b-1

SSAB (a-1)(b-1) SSAB/(a-1)(b-1)

E(MSE)

2  ab

2

 2  nb

  a

1 j

2

2  na

 b

1 k

2

2  n

(

 a

(



1 )( b

)

2

1 ) jk

Error SSE (n-1)(ab-1) SSE/(n-1)(ab-1)

 2

Total SST nab-1

Where

SSS= ab

 i

( Y i ..

Y ) 2

; SSA= nb

 j

( Y

.

j .

Y )

2

; SSB= na

 k

( Y

..

k

Y ) 2

;

SSAB= n

 j k

( Y

.

jk

Y

.

j .

Y

..

k

Y )

2

; SSE=

 i j k

( Y ijk

Y

.

jk

Y i ..

Y )

2

5.

SAS program

Example: To test the efficiency of its new programmable calculator, a computer company selected at random 6 engineers who were proficient in the use of both this calculator and an earlier model and asked them to work out two problems on both calculators. One of the problems was statistical in nature, the other was an engineering problem. The order of the four calculations was randomized independently for each engineer. The length of time (in minutes) required to solve each problem was observed. The results are presented in the following table.

Engineer

Jones

Statistical problem

New model

3.1

Earlier model

7.5

Engineering problem

New model

2.5

Earlier model

5.1

Williams

Adams

Dixon

Erickson

Maynes

3.8

3.0

3.4

3.3

3.8

8.1

7.6

7.8

6.9

7.8

2.8

2.0

2.7

2.5

2.4

5.3

4.9

5.5

5.4

4.8

Testing the effects of A: Statistical problems and B: model

SAS Program

DATA COMPUTER;

INPUT ENGINEER $ PROBLEM $ CMODEL $ TIME;

DATALINES ;

Jones Stat New 3.1 Williams Stat New 3.8

Adams Stat New 3 Dixon Stat New 3.4

Erickson Stat New 3.3 Maynes Stat New 3.8

Jones Stat Earlier

Adams Stat

7.5 Williams

Earlier 7.6 Dixon

Stat Earlier

Stat

8.1

Earlier

Erickson Stat Earlier 6.9 Maynes Stat Earlier 7.8

Jones Engi New 2.5 Williams Engi New 2.8

Adams Engi New 2 Dixon Engi New 2.7

Erickson Engi New 2.5 Maynes

Jones Engi Earlier 5.1 Williams

Engi

Engi Earlier

New 2.4

5.3

Adams

Erickson

;

Engi Earlier

Engi Earlier

4.9 Dixon Engi Earlier

5.4 Maynes Engi Earlier 4.8

RUN ;

PROC GLM DATA =COMPUTER;

CLASS ENGINEER PROBLEM CMODEL;

7.8

5.5

MODEL TIME=ENGINEER PROBLEM CMODEL PROBLEM*CMODEL;

RANDOM ENGINEER;

RUN ;

The result is:

Dependent Variable: TIME

Sum of

Source DF Squares Mean Square F Value Pr > F

Model 8 93.00166667 11.62520833 154.09 <.0001

Error 15 1.13166667 0.07544444

Corrected Total 23 94.13333333

R-Square Coeff Var Root MSE TIME Mean

0.987978 5.885818 0.274672 4.666667

Source DF Type I SS Mean Square F Value Pr > F

ENGINEER 5 1.05833333 0.21166667 2.81 0.0554

PROBLEM 1 17.00166667 17.00166667 225.35 <.0001

CMODEL 1 71.41500000 71.41500000 946.59 <.0001

PROBLEM*CMODEL 1 3.52666667 3.52666667 46.75 <.0001

Source DF Type III SS Mean Square F Value Pr > F

ENGINEER 5 1.05833333 0.21166667 2.81 0.0554

PROBLEM 1 17.00166667 17.00166667 225.35 <.0001

CMODEL 1 71.41500000 71.41500000 946.59 <.0001

PROBLEM*CMODEL 1 3.52666667 3.52666667 46.75 <.0001

The GLM Procedure

Source Type III Expected Mean Square

ENGINEER Var(Error) + 4 Var(ENGINEER)

PROBLEM Var(Error) + Q(PROBLEM,PROBLEM*CMODEL)

CMODEL Var(Error) + Q(CMODEL,PROBLEM*CMODEL)

PROBLEM*CMODEL Var(Error) + Q(PROBLEM*CMODEL)

Two-Factor Experiments with Repeated Measures on One Factor

1.

Layout for Two-Factor Design with Random Assignments of Factor A level to Subjects and

Repeated Measures on Factor B.

Incentive Stimulus Subject

Treatment Order

1 2

A

A

1

2

1

…… n n+1

……

A

……

A

1

1

B

B

1

2

A

2

B

2

……

A

A

1

1

B

……

B

2

1

A

2

B

1

……

2n A

2

B

1

A

2

B

2

2.

Description of Design

In many two-factor studies, repeated measures can only be made on one of the two factors.

Consider, for instance, an experimenter who wished to study the effects of two types of incentives

(factor A) on a person’s ability to solve problems. The researcher also wanted to study two types of problems (factor B) --- abstract and concrete problems. Each experimental subject could be asked to do each type of problem, but could not be exposed to more than type of incentive stimulus because of potential interference effects.

In this design, two randomizations generally need to be employed. First, the level of the nonrepeated factor needs to be randomly assigned to the subjects. Second, the order of the levels of the repeated factor need to be randomized independently for all subjects.

Since n subjects are randomly assigned incentive stimulus A1 and n subjects are randomly assigned incentive stimulus A2, as far as factor A is concerned the experiment is a completely randomized one. On the other hand, as far as factor B is concerned, each subject is a block. Thus, for factor B, the experiment is randomized block design, with block effects random. We call this experimental design a two-factor experiment with repeated measures on factor B.

In the experiment depicted in above table, comparisons between factor A level means involve differences between groups of subjects as well as differences associated with the two factor A levels. On the other hand, comparisons between factor B level means at the same level of factor A are based on the same subject, and hence only involve differences associated with the two factor B levels. Thus, for theses latter comparisons, each subject serves as its own control. The main effects of factor A are therefore said to be confounded with differences between groups of subjects, whereas the main effects of factor B are free of such confounding. It is for this reason that test on factor B main effects will generally be more sensitive than tests on the main effects for factor A.

3.

Model

Y ijk

    i ( j )

  j

  k

(



) jk

  ijk

Where

 j

: Main effect of A subject to

  j

0 ;

 k

: Main effect of B subject to

  k

0 ;

 i ( j )

: Effect of subject nested in factor A, independent with N ( 0 ,

2

) ;

(



) jk

: Interaction effect of A and B subject to

 j

(



) jk

 k

(



) jk

0 for all j.

0 for all k and

 ijk

: error with N ( 0 ,

2

) ;

 i ( j )

and

 ijk

are independent;

i=1,……, n; j=1,……,a; k=1,……,b.

4.

ANOVA table

Source of

Variation

SS DF MSE E(MSE)

Factor A

Factor B

AB interaction

SSA

SSB

Subjects(within A) SSS(A) a-1 b-1 a(n-1)

SSA/a-1

SSB/b-1

SSAB (a-1)(b-1) SSAB/(a-1)(b-1)

SSS(A)/(a-1)(b-1)

2  b

2

 bn

 a

1 j

2

2  na

 b

1 k

2

2  n

(

 a

(



1 )( b

) 2

1 ) jk

2  b

2

Error SSE a(n-1)(b-1) SSE/a(n-1)(b-1)

2

Total SST nab-1

Where

SSS(A)= b

 i

( Y ij .

Y

.

j .

)

2

; SSA= nb

 j

( Y

.

j .

Y )

2

; SSB= na

 k

( Y

..

k

Y )

2

;

SSAB= n

 j k

( Y

.

jk

Y

.

j .

Y

..

k

Y )

2

; SSE=

 i j k

( Y ijk

Y

.

jk

Y ij .

Y

.

j .

)

2

SST=

  i j k

( Y ijk

Y

...

)

2

5.

Example and SAS program

Example: a national retail chain wanted to study the effects of two advertising campaigns (factor

A) on the volume of sales of athletic shoes over time (factor B). ten similar test markets (subjects,

S) were chosen at random to participate in this study. The two advertising campaigns (A1 and A2) were similar in all respects except that a different national sports personality was used in each.

Sales data were collected for three two-week periods (B1: two weeks prior to campaign; B2: two weeks during which campaign occurred; B3: two weeks after campaign was concluded). The experiment was conducted during a six-week period when sales of athletic shoes are usually quite stable. The data is present in the following table.

Advertising campaign

Test market k=1

Time period k=2 k=3 j=1 j=2 i=1 i=2 i=3 i=4 i=5 i=1 i=2 i=3 i=4 i=5

958 1047 933

1005 1122 986

351

549

730

780

229

883

624

375

436 339

632

784

897

275

964

695

436

512

707

718

202

817

599

351

SAS program:

DATA SHOES;

INPUT ADVER TIME MARKET SALE @@;

DATALINES ;

1 1 1 958 1 1

1 1 5 730 2 1

2 1

1 2

2 2

1 3

1 3

4

3

2

1

5

624 2

436 1

275 2

933 1

707 2

1

2

2

3

3

4 599 2 3 2 3

;

RUN ;

2

1

5 375 1 2

4 632 1 2

3 964 2 2

2 986 1 3

1

5

1005 1

780 2

718 2

351

1

1

3

3

2

351 1

229 2

PROC GLM DATA =SHOES;

CLASS ADVER TIME MARKET;

MODEL SALE=ADVER TIME MARKET (ADVER) ADVER*TIME;

RANDOM MARKET (ADVER)/ TEST ;

RUN ;

PROC MIXED DATA =SHOES;

1

1

1 1047 1 2

5 784 2 2

4 695 2 2

3 339 1 3

2 202 2 3

4 549

3 883

2 1122

1 897

5 436

4 512

3 817

CLASS ADVER TIME MARKET;

MODEL SALE=ADVER TIME ADVER*TIME;

RANDOM MARKET (ADVER);

LSMEANS ADVER TIME / DIFF ADJUST =TUKEY;

RUN ;

QUIT ;

Split-Plot Designs

Description of design through example.

Example 1: Consider a paper manufacturer who is interested in three different pulp preparation methods and four different cooking temperatures for the pulp and who whishes to study the effect of these two factors on the tensile strength of the paper. Each replicate of a factorial requires 12 observations, and the experimenter has decided to run three replicates. However, the pilot plant is only capable of making 12 runs per day, so the experimenter decides to run one replicate on each of the three days and to consider the days or replicates as blocks. On any day, he conducts the experiment as follows. A batch of pulp is produced by one of the three methods under study. Then this batch is divided into four samples, and each sample is cooked at one of the four temperatures.

Then a second batch of pulp is made up using another of the three methods. This second batch is also divided into four samples that are tested at the four temperatures. The process is then repeated, using a batch of pulp produced by the third method. The data are shown in the following table.

Pulp Preparation

Method

Temperature

200

225

250

275

1

30

35

37

36

Replicate 1

2

34

3

29

1

28

Replicate 2

2

31

3

31

1

31

Replicate 3

2

35

3

32

41 26 32 36 30 37 40 34

38 33 40 42 32 41 39 39

42 36 41 40 40 40 44 45

The design used in our example is a split-plot design. Each replicate or block in the split-plot design is divided into three parts called whole plots , and the preparation methods are called the whole plot or main treatments. Each whole plot is divided into four parts called subplots (or split-plots ), and one temperature is assigned to each. Temperature is called the subplot treatment .

Note that if there are other uncontrolled or undersigned factors present, and if these uncontrolled factors vary as the pulp preparation methods are changed, then any effect of the undersigned factors on the response will be completely confounded with the effect of the pulp preparation methods. Because the whole plot treatments in a split-plot design are confounded with the whole plots and the subplot treatments are not confounded, it is best to assign the factor we are most interested in to the subplots, if possible.

The linear model for the above split plot design is

Y ijk

    i

  j

(



) ij

  k

(



) ik

(



) jk

(



) ijk

  ijk

Where

 i

,

 j

and (



) ij

represents the whole plot and correspond respectively to blocks or replicates, main treatments (factor A), and the whole plot error;

 k

, (



) ik

, (



) jk

and (



) ijk represents the subplot and correspond respectively to the subplot treatment (factor B), the block or

replicates B and AB interactions, and the subplot error (blocks * A*B).

Example 2 : Consider an investigation to study the effects of two irrigation methods (factor A) and tow fertilizers (factor B) on yield of a crop, using four available fields as experimental units. In a completely randomized design, four treatments (A1B1,A1B2,A2B1,A2B2) would then be assigned at random to the four fields. Since there are four treatments and just four experimental units, there will be no degrees of freedom for estimation of error. If the fields could be subdivided into smaller experimental units, replicates of each factor-level combination could be obtained and the error variance could then be estimated. Unfortunately, in this investigation it is not possible to apply different irrigation methods (factor A) in areas smaller than a field, although different fertilizer types (factor B) could be applied in relatively small areas. A split-plot design can accommodate this situation.

In a split-plot design, each of the two irrigation methods is randomly assigned to two of the four fields, which are usually called whole plots. In turn, each whole plot is then subdivided into two or more small areas called split plots and the two fertilizers are then randomly assigned to the split plots within each whole plot. The key feature of split-plot designs is the use of tow (or more) distinct levels of randomization. At the first level of randomization, the whole-plot treatments are randomly assigned to whole plots; at the second level, the split-plot treatments are randomly assigned to split plots.

The layout for the example is shown below.

Split Plots

Fields

A2B2

A2B1

A1B1

A1B2

A1B2

A1B1

A2B1

A2B2

A2 A1 A1 A2

Note that this layout is conceptually identical to the layout for the two-factor repeated measures design. Consequently, the split-plot model here is same as the model in the previous section.

Y ijk

    i ( j )

  j

  k

(



) jk

  ijk

Where

 j

: j-th whole plot treatment subject to

  j

0 ;

 k

: k-th split-plot treatment subject to

  k

0 ;

 i ( j )

: Effect of of the i-th whole plot nested in the j-th whole plot treatment, independent with N ( 0 ,

2

) ;

(



) jk

: Interaction effect of A and B subject to

 j

(



) jk

 k

(



) jk

0 for all j.

0 for all k and

 ijk

: error with N ( 0 ,

2

) ;

 i ( j )

and

 ijk

are independent;

i=1,……, n; j=1,……,a; k=1,……,b.

Source of Variation SS DF MSE E(MSE)

Whole plot

Factor A SSA a-1 SSA/a-1

Whole-plot error SSS(A) a(n-1) SSS(A)/(a-1)(b-1)

2  b

2

 bn

 a

1 j

2

2  b

2

Split Plots

Factor B

SSB b-1 SSB/b-1

2  na

 b

1 k

2

AB interaction SSAB (a-1)(b-1) SSAB/(a-1)(b-1)

2  n

(

 a

(



1 )( b

)

2

1 ) jk

Split-plot error SSE a(n-1)(b-1) SSE/a(n-1)(b-1)

2

Total SST nab-1

Where

SSS(A)= b

 i

( Y ij .

Y

.

j .

)

2

; SSA= nb

 j

( Y

.

j .

Y )

2

; SSB= na

 k

( Y

..

k

Y )

2

;

SSAB= n

 j k

( Y

.

jk

Y

.

j .

Y

..

k

Y )

2

; SSE=

 i j k

( Y ijk

Y

.

jk

Y ij .

Y

.

j .

)

2

SST=

  i j k

( Y ijk

Y

...

)

2

Appendix: A comparison of PROC GLM and PROC MIXED

PROC GLM

Designed for models with all parameter fixed

Based on least squares estimation

Produces ANOVA table with SS and MS

All effects appear in MODEL statement

Random effects appear in RANDOM statement

Need to coerce correct tests with TEST statement or with RANDOM …/ TEST;

For balanced data same inference as MIXED

For unbalanced data with random effects ANOVA is only approximately correct

Many mean comparison methods through MEAN statement

PROC MIXED

Designed for models with one or more parameters Random. Can also be used for models with all Parameters random.

Based on (restricted) maximum likelihood estimation

Does not produce ANOVA table

Only fixed effects appear in MODEL statement

Random effects appear in RANDOM statement

No need to coerce correct tests

For balanced data same inference as GLM

For unbalanced data provides correct inference

CRD proc glm data=whatever;

class treat;

model y=treat;

means treat / tukey;

run;

CRD/SS proc glm data=whatever;

class rep treat;

model y = treat + rep (treat);

test h=treat e=rep(treat);

means treat / tukey e=rep (treat);

run;

RCBD w/random block proc glm data=whatever;

class block treat;

model y = block treat;

random block / test;

means treat / tukey; run;

Split-Plot proc glm data=whatever

class rep A B;

model y=rep A rep*A B A*B;

test h=A e=rep*A;

means A / tukey e=rep*A;

means B / tukey; run;

No MEAN statement. Mean comparison available through the LSMEANS statement, in particular the /DIFF option in conjunction with the ADJUST=option. proc mixed data=whatever

class treat;

model y=treat;

lsmeans treat / diff adjust=tukey;

run;

proc mixed data=whatever;

class rep treat;

model y = treat;

random rep(treat);

lsmeans treat / diff adjust = tukey;

run; proc mixed data=whatever;

class block treat;

model y = treat;

random block;

lsmeans treat / diff adjust = tukey; run; proc mixed data=whatever

class rep A B;

model y=rep A B A*B;

random rep*A;

lsmeans A B/ diff adjust = tukey; run;

References:

1.

Keinbaum, Kupper, Muller, Nizam, Applied Regression Analysis and Other Multivariable

Methods ; 3 rd edition. Thomson Learning Press, 1998.

2.

Download