Introduction to Experimental Design and Analysis Dr. John Mellor-Crummey Department of Computer Science



Rice University johnmc@cs.rice.edu

COMP 528 Lecture 11 22 February 2005

Experimental Design and Analysis

Understand how to

Design experiments for measurement or simulation

Develop a model that describes the data obtained

Estimate the contribution of each factor to performance

Isolate measurement errors

Estimate confidence intervals for model parameters

Check if alternatives are significantly different

Check if a model is adequate


Goals for Today

Understand

What are the benefits of experimental design

Terms

Avoiding mistakes in experimental design

Basic taxonomy of experimental designs

— simple design, factorial design, fractional factorial design

Understand 2^2 factorial design and its analysis

— sign table method

— properties

— analysis

— allocating variation


Why Experimental Design?

• Obtain maximum information from fewest experiments

— minimize time spent gathering data

• Quantify effects from different factors using analysis

• Determine if a factor’s effect is significant

— differences might be random variations caused by

– measurement errors

– parameters not controlled


Terms - I

Response variable: outcome of an experiment

— generally represents measured performance of the system

— e.g. throughput: transactions/second, round-trip latency, etc.

Factors: variables with alternatives that affect response

— also called predictor variables or predictors

— e.g. CPU type, memory size, # disk drives, workload used, etc.

Levels: values a factor can assume

— also called “treatment” in experimental design literature

— e.g. CPU type levels: Itanium2, Alpha 21264, Opteron

— e.g. memory size levels: 512MB, 1GB, 2GB, 4GB

Primary factors: factors whose effects need to be quantified

Secondary factors: factors that are not being quantified

— they impact performance; we may not be interested in how much


Terms - II

Replication: repetition of some or all experiments

— if all experiments repeated 3x, experiment is said to have 3 replications

Experimental design: plan for experimentation

— number of experiments, factor level combinations for each, replications

Experimental unit: any entity used for experiments

— workstations, patients, land in agriculture expts

— goal of experimental design: minimize impact of variation among units

Interaction: A and B interact if the effect of changing the level of A depends on the level of B

Non-interacting (lines parallel in graph of A vs. B)

        A1   A2
B1       3    5
B2       6    8

Interacting (lines not parallel in graph of A vs. B)

        A1   A2
B1       3    5
B2       6    9
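
The parallel-lines test for interaction can be sketched in a few lines of Python. This is only an illustration: the function name `interacts` and the dictionary layout are assumptions made here, not part of the lecture. It compares the effect of moving from A1 to A2 at each level of B.

```python
def interacts(table):
    """Return True if the effect of changing A differs across levels of B.

    table maps (a_level, b_level) -> response for a 2x2 layout.
    The level names A1, A2, B1, B2 follow the tables in the slide.
    """
    effect_at_b1 = table[("A2", "B1")] - table[("A1", "B1")]
    effect_at_b2 = table[("A2", "B2")] - table[("A1", "B2")]
    return effect_at_b1 != effect_at_b2


non_interacting = {("A1", "B1"): 3, ("A2", "B1"): 5,
                   ("A1", "B2"): 6, ("A2", "B2"): 8}
interacting = {("A1", "B1"): 3, ("A2", "B1"): 5,
               ("A1", "B2"): 6, ("A2", "B2"): 9}

print(interacts(non_interacting))  # False
print(interacts(interacting))      # True
```

With real measurements the two differences will rarely be exactly equal, so an exact comparison like this one only suits noise-free illustrative data; the difference should otherwise be judged against experimental error.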

Common Mistakes in Experimentation - I

Ignore variation due to experimental error

— every measured value is a random variable

— measured values change even if controllable vars kept constant

— must compare variation due to factor vs. experimental error

– don’t make decision about factor’s effect without this comparison!

Fail to control important parameters

— only some parameters selected as factors and varied

— must control other variables expected to have a significant effect

– e.g. if measuring impact of memory speed on performance,

• use same CPU type/speed for experiments if CPU not a factor

Fail to isolate effects of different factors

— if several factors are varied simultaneously

– design experiments so that effects of factors can be separated


Common Mistakes in Experimentation - II

Use simple one-factor-at-a-time designs

— leads to too many experiments

— yields too little information per experiment

— proper design ⇒ narrower confidence intervals with same # expts

Ignore interactions between factors

— cannot estimate interactions with one-factor-at-a-time experiments

Conduct too many experiments

— # experiments needed depends upon # factors, # factor levels

— enormous single-step design vs. several steps

– better to use multiple steps

• each with a small design and few levels

– first step: test analysis assumptions and whether transformations required

– remaining steps: more factors and levels


Types of Experimental Designs

Simple designs

Full factorial design

Fractional factorial design


Simple Designs (Not recommended)

What is a simple design?

— start with a typical configuration

— vary one factor at a time to see how performance changes

Example: comparing workstation configurations

1. run typical configuration using a benchmark

2. run experiments to pick the best CPU by varying CPU only

3. using the best CPU, run a set of experiments to find the minimum memory size yielding good performance

4. using the best (CPU,memory) configuration, examine the impact of disk RPM on the benchmark’s performance

Given: k factors; the i-th factor has n_i levels

Number of experiments

n = 1 + \sum_{i=1}^{k} (n_i - 1)

Drawbacks

— if factors interact, may yield wrong conclusions

— fails to make best use of # experiments: statistically inefficient

Full Factorial Designs

Explores every possible combination at all levels of factors

Number of experiments

n = \prod_{i=1}^{k} n_i

Example

n = (5 CPU types)(4 memory sizes)(2 disk RPMs)(4 workloads)

= 160 experiments
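
To make the cost difference concrete, here is a small Python sketch that evaluates the simple-design count, n = 1 + sum of (n_i - 1), and the full factorial count, n = product of n_i, for the level counts in this example. The function names are illustrative choices, not taken from the lecture.

```python
from math import prod


def simple_design_runs(levels):
    """Runs for a one-factor-at-a-time design: 1 + sum(n_i - 1)."""
    return 1 + sum(n - 1 for n in levels)


def full_factorial_runs(levels):
    """Runs for a full factorial design: product of all n_i."""
    return prod(levels)


# 5 CPU types, 4 memory sizes, 2 disk RPMs, 4 workloads
levels = [5, 4, 2, 4]
print(simple_design_runs(levels))   # 12
print(full_factorial_runs(levels))  # 160
```

The simple design is far cheaper, but as noted on the previous slide it cannot reveal interactions between factors.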

Advantages

— can find effect of every factor, secondary factors, and interactions

Disadvantages

— cost: too many experiments, especially with repetitions

Ways to reduce cost

— reduce number of levels per factor (2 is very popular)

— reduce # factors:

– initially only examine a few levels of each factor

– prune unimportant factors, then try more levels per factor

— use fractional factorial designs


Fractional Factorial Designs

A full factorial design may require many experiments

How can we get by with fewer? Use a fractional factorial design

Example

— full factorial design (here, a 2^4 design)

n = (2 CPU types)(2 memory sizes)(2 disk RPMs)(2 workloads)

= 16 experiments

— fractional factorial design (here, a 2^(4-1) design: a “half replicate design”)

[Figure: the 8 experiments of the half replicate design, each a combination of CPU, Memory, Disk, and Workload levels]
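
One way to build a half replicate of a 2^4 design is sketched below in Python. The lecture does not say which 8 of the 16 runs its figure keeps, so the generator D = ABC used here is an assumption (a common textbook choice), as is mapping columns A, B, C, D to CPU, memory, disk, and workload.

```python
from itertools import product


def half_fraction_2_4():
    """Return the 8 runs of a 2^(4-1) design as (A, B, C, D) sign tuples.

    A, B, C take every combination of -1/+1 (a full 2^3 design);
    D is derived from them via the ASSUMED generator D = A*B*C.
    """
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        d = a * b * c  # assumed generator, not specified in the lecture
        runs.append((a, b, c, d))
    return runs


design = half_fraction_2_4()
for run in design:
    print(run)
```

Each of the four columns still contains four -1 entries and four +1 entries, so every factor is tested equally often at both levels with half the runs.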

In Class Exercise

A system’s performance depends upon the following factors

— CPU type: Pentium4, Opteron, Athlon

— OS type: Linux, Windows, Solaris

— file compression utility: bzip2, gzip, zip

How many experiments are necessary if:

— there is significant interaction among the factors

— the interactions are small compared to the main effects

— there is no interaction among the factors


Beginning 2^k Factorial Designs

2^k Factorial Designs

What are they?

— design to determine effect of k factors, each with 2 levels

Why consider them?

— easy to analyze

— helps order factors based on impact

— useful to identify

– factors that have significant impact ⇒ study with full factorial design

– factors that have little impact ⇒ not of interest for quantitative study

How to select 2 levels

— if factor effect is expected to be unidirectional, select min, max

– performance increases or decreases with factor

– e.g. performance improves with more memory


2^2 Factorial Designs

Special case of 2^k factorial designs, k = 2

— two factors at two levels

Utility

— 2^2 designs are simple to analyze with regression

Example: a 2^2 Design

Consider impact of memory & cache sizes on performance

L3 Cache Size (MB)   Memory Size - 1GB   Memory Size - 2GB
4                    1.5 GFLOPS          3.2 GFLOPS
6                    2.5 GFLOPS          5.2 GFLOPS

Define 2 categorical variables

x_A = -1 if 1GB memory, +1 if 2GB memory

x_B = -1 if 4MB cache, +1 if 6MB cache

Model performance using a non-linear equation in x_A and x_B

y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B

Solve using regression

Example: a 2^2 Design

Regress performance in GFLOPS on x_A and x_B

y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B

Substitute y, x_A and x_B

1.5 = q_0 - q_A - q_B + q_{AB}
3.2 = q_0 + q_A - q_B - q_{AB}
2.5 = q_0 - q_A + q_B - q_{AB}
5.2 = q_0 + q_A + q_B + q_{AB}

4 equations, 4 unknowns: unique solution

y = 3.1 + 1.1 x_A + 0.75 x_B + 0.25 x_A x_B

— mean performance: 3.1 GFLOPS

— effect of memory: 1.1 GFLOPS

— effect of cache: 0.75 GFLOPS

— cache and memory interaction: 0.25 GFLOPS
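
As a quick sanity check on the solution, the Python sketch below plugs each (x_A, x_B) pair back into the fitted model and compares the result with the measured GFLOPS. The function name `predict` and the tolerance are illustrative assumptions.

```python
def predict(x_a, x_b, q0=3.1, q_a=1.1, q_b=0.75, q_ab=0.25):
    """Evaluate y = q0 + qA*xA + qB*xB + qAB*xA*xB for coded levels -1/+1."""
    return q0 + q_a * x_a + q_b * x_b + q_ab * x_a * x_b


# (x_A, x_B) -> measured GFLOPS; x_A codes memory size, x_B codes cache size
observations = {(-1, -1): 1.5, (1, -1): 3.2, (-1, 1): 2.5, (1, 1): 5.2}

for (x_a, x_b), measured in observations.items():
    fitted = predict(x_a, x_b)
    # Compare with a tolerance because of floating-point rounding
    print((x_a, x_b), measured, abs(fitted - measured) < 1e-9)
```

Because a 2^2 design without replication has four observations and four coefficients, the model reproduces the data exactly and leaves nothing over for estimating experimental error.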

Computing Effects for 2^2 Design

y = q_0 + q_A x_A + q_B x_B + q_{AB} x_A x_B

Expt    A    B    y
1      -1   -1    y_1
2       1   -1    y_2
3      -1    1    y_3
4       1    1    y_4

Substituting 4 observations into the model

y_1 = q_0 - q_A - q_B + q_{AB}
y_2 = q_0 + q_A - q_B - q_{AB}
y_3 = q_0 - q_A + q_B - q_{AB}
y_4 = q_0 + q_A + q_B + q_{AB}

Solving the equations for the q_i's

q_0 = \frac{1}{4} (y_1 + y_2 + y_3 + y_4)
q_A = \frac{1}{4} (-y_1 + y_2 - y_3 + y_4)
q_B = \frac{1}{4} (-y_1 - y_2 + y_3 + y_4)
q_{AB} = \frac{1}{4} (y_1 - y_2 - y_3 + y_4)

Notice: expressions for q_A, q_B, q_{AB} (contrasts)

— are linear combinations of responses

— sum of coefficients is 0

Calculating Effects with a Sign Table

I       A      B      AB     y
1      -1     -1      1      1.5
1       1     -1     -1      3.2
1      -1      1     -1      2.5
1       1      1      1      5.2
12.4    4.4    3      1      total
3.1     1.1    0.75   0.25   total/4

— columns A and B: all possible combinations of -1, 1

— AB is product of columns A and B

— y is set of observations

— multiply each column entry by the corresponding y and sum; write the total underneath

— divide totals by 4 to compute regression coefficients
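
The sign table procedure can be sketched in Python as follows. The function name `sign_table_effects` and the returned dictionary are illustrative assumptions; the run order matches the table, (A, B) = (-1, -1), (1, -1), (-1, 1), (1, 1).

```python
def sign_table_effects(y):
    """Compute q0, qA, qB, qAB for a 2^2 design with the sign table method.

    y lists the four responses in the run order
    (A, B) = (-1, -1), (1, -1), (-1, 1), (1, 1).
    """
    col_a = [-1, 1, -1, 1]
    col_b = [-1, -1, 1, 1]
    columns = {
        "I": [1, 1, 1, 1],
        "A": col_a,
        "B": col_b,
        "AB": [a * b for a, b in zip(col_a, col_b)],  # product of A and B
    }
    # Column total = sum of sign * response; effect = total / number of runs
    return {
        name: sum(s * yi for s, yi in zip(col, y)) / len(y)
        for name, col in columns.items()
    }


effects = sign_table_effects([1.5, 3.2, 2.5, 5.2])
print(effects)  # roughly I: 3.1, A: 1.1, B: 0.75, AB: 0.25
```

The printed values may differ from the table in the last decimal places because of floating-point rounding.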

Sign Table Properties

I    A    B    AB
1   -1   -1    1
1    1   -1   -1
1   -1    1   -1
1    1    1    1

Sum of entries in columns A, B, AB is 0

\sum_{i=1}^{4} x_{Ai} = 0, \quad \sum_{i=1}^{4} x_{Bi} = 0, \quad \sum_{i=1}^{4} x_{Ai} x_{Bi} = 0

Sum of squares of entries in each column is 4

\sum_{i=1}^{4} x_{Ai}^2 = 4, \quad \sum_{i=1}^{4} x_{Bi}^2 = 4, \quad \sum_{i=1}^{4} (x_{Ai} x_{Bi})^2 = 4

Columns are orthogonal since inner product of column pairs is 0

\sum_{i=1}^{4} x_{Ai} x_{Bi} = 0, \quad \sum_{i=1}^{4} x_{Ai} (x_{Ai} x_{Bi}) = 0, \quad \sum_{i=1}^{4} x_{Bi} (x_{Ai} x_{Bi}) = 0
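
The three sign table properties are easy to confirm numerically. The Python sketch below is one way to do so; the helper `dot` and the variable names are illustrative assumptions.

```python
from itertools import combinations

col_a = [-1, 1, -1, 1]
col_b = [-1, -1, 1, 1]
col_ab = [a * b for a, b in zip(col_a, col_b)]
columns = {"A": col_a, "B": col_b, "AB": col_ab}


def dot(u, v):
    """Inner product of two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))


# Property 1: entries of each column sum to 0
zero_sum = all(sum(col) == 0 for col in columns.values())
# Property 2: squares of the entries of each column sum to 4
squares_sum_to_4 = all(dot(col, col) == 4 for col in columns.values())
# Property 3: every pair of distinct columns has inner product 0
orthogonal = all(
    dot(columns[p], columns[q]) == 0 for p, q in combinations(columns, 2)
)

print(zero_sum, squares_sum_to_4, orthogonal)  # True True True
```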

Computing Sample Mean with Sign Table

sample mean

\bar{y} = \frac{1}{4} \sum_{i=1}^{4} y_i

= \frac{1}{4} \sum_{i=1}^{4} (q_0 + q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi})

= \frac{1}{4} \sum_{i=1}^{4} q_0 + \frac{1}{4} q_A \sum_{i=1}^{4} x_{Ai} + \frac{1}{4} q_B \sum_{i=1}^{4} x_{Bi} + \frac{1}{4} q_{AB} \sum_{i=1}^{4} x_{Ai} x_{Bi}

= q_0

Computing Total Variation with Sign Table

Total variation = \sum_{i=1}^{4} (y_i - \bar{y})^2

= \sum_{i=1}^{4} (q_A x_{Ai} + q_B x_{Bi} + q_{AB} x_{Ai} x_{Bi})^2

= \sum_{i=1}^{4} (q_A x_{Ai})^2 + \sum_{i=1}^{4} (q_B x_{Bi})^2 + \sum_{i=1}^{4} (q_{AB} x_{Ai} x_{Bi})^2 + product terms

= q_A^2 \sum_{i=1}^{4} x_{Ai}^2 + q_B^2 \sum_{i=1}^{4} x_{Bi}^2 + q_{AB}^2 \sum_{i=1}^{4} (x_{Ai} x_{Bi})^2 + product terms

= 4 q_A^2 + 4 q_B^2 + 4 q_{AB}^2

(the product terms vanish because the sign table columns are orthogonal)

SST = SSA + SSB + SSAB = 4 q_A^2 + 4 q_B^2 + 4 q_{AB}^2

SSA = 4 q_A^2

SSB = 4 q_B^2

SSAB = 4 q_{AB}^2

Allocating Variation

Importance of a factor = how much variation it explains

SST = SSA + SSB + SSAB = 4(1.1)^2 + 4(0.75)^2 + 4(0.25)^2 = 7.34

Variation due to A = SSA / SST = 4(1.1)^2 / 7.34 = 0.66

Variation due to B = SSB / SST = 4(0.75)^2 / 7.34 = 0.31

Variation due to interaction A & B = SSAB / SST = 4(0.25)^2 / 7.34 = 0.03
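
The allocation of variation for the memory/cache example can be reproduced with the Python sketch below. The function name `allocate_variation` and its return shape are illustrative assumptions. SST is computed directly from the observations as the sum of (y_i - mean)^2, so the output also confirms that it matches 4 q_A^2 + 4 q_B^2 + 4 q_AB^2.

```python
def allocate_variation(y):
    """Return (SST, fraction of variation explained by A, B and AB).

    y lists the four responses of a 2^2 design in the run order
    (A, B) = (-1, -1), (1, -1), (-1, 1), (1, 1).
    """
    signs = {
        "A": [-1, 1, -1, 1],
        "B": [-1, -1, 1, 1],
        "AB": [1, -1, -1, 1],
    }
    n = len(y)
    sums_of_squares = {}
    for name, col in signs.items():
        q = sum(s * yi for s, yi in zip(col, y)) / n  # effect from sign table
        sums_of_squares[name] = n * q ** 2            # e.g. SSA = 4 * qA^2
    mean = sum(y) / n
    sst = sum((yi - mean) ** 2 for yi in y)           # total variation
    fractions = {name: ss / sst for name, ss in sums_of_squares.items()}
    return sst, fractions


sst, fractions = allocate_variation([1.5, 3.2, 2.5, 5.2])
print(round(sst, 2))  # 7.34
for name, fraction in fractions.items():
    print(name, round(fraction, 2))  # A 0.66, B 0.31, AB 0.03
```

In this example memory size (A) explains about two thirds of the variation, cache size (B) about a third, and their interaction very little.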
