ANOVA & PRINCIPLES OF EXPERIMENTAL DESIGN

A regression analysis of observational data has some limitations. In particular, establishing a cause-and-effect relationship between an independent variable x and the response y is difficult, since the values of the relevant independent variables (both those in the model and those omitted from the model) are not controlled, thereby allowing the possibility of confounding factors. Recall that experimental data are data collected with the values of the x's set in advance of observing y (i.e., the values of the x's are controlled). With experimental data, we usually select the x's so that we can compare the mean responses, E(y), for several different combinations of the x values.

The procedure for selecting sample data with the x's set in advance is called the design of the experiment. The statistical procedure for comparing the population means is called an analysis of variance. The objective of this handout is to introduce some key aspects of experimental design. The analysis of the data from such experiments using an analysis of variance procedure is the topic of the current chapter.

The study of experimental design originated in England and, in its early years, was associated solely with agricultural experimentation. In agriculture, the need to save time and money led to a study of ways to obtain more information using smaller samples. This field of study was called Design of Experiments. Similar motivations led to its subsequent acceptance and wide use in all fields of scientific experimentation.

We will call the process of collecting sample data an experiment and the (dependent) variable to be measured the response, y. The planning of the sampling procedure is called the design of the experiment. The object upon which the response measurement y is taken is called an experimental unit. The independent variables, quantitative or qualitative, that are related to the response variable y are called factors. The value (that is, the setting) assumed by a factor in an experiment is called a level. The combinations of levels of the factors for which the response will be observed are called treatments.

EXAMPLE 1. A designed experiment. A marketing study is conducted to investigate the effect of brand and shelf location on weekly coffee sales. Coffee sales are recorded for each of two brands (brand A and brand B) at each of three shelf locations (bottom, middle, and top). The 2 x 3 = 6 combinations of brand and shelf location were varied each week for a period of 18 weeks. Below is a layout of the design. For this experiment, identify
a. the experimental units
b. the response, y
c. the factors
d. the factor levels
e. the treatments

FIGURE 1. Layout for the designed experiment of Example 1.

Solution
a. Since the data will be collected each week for a period of 18 weeks, the experimental units are weeks.
b. The variable of interest, i.e., the response, is y = weekly coffee sales. Note that weekly coffee sales is a quantitative variable.
c. Since we are interested in investigating the effect of brand and shelf location on sales, brand and shelf location are the factors. Note that both factors are qualitative variables, although, in general, factors may be quantitative or qualitative.
d. For this experiment, brand is measured at two levels (A and B) and shelf location at three levels (bottom, middle, and top).
e. Since coffee sales are recorded for each of the six brand-shelf location combinations (brand A, bottom), (brand A, middle), (brand A, top), (brand B, bottom), (brand B, middle), and (brand B, top), the experiment involves six treatments (see Figure 1).
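To make the factor-level-treatment terminology concrete, the following minimal Python sketch enumerates the treatments of Example 1 as brand-shelf location combinations. The variable names are illustrative assumptions, not part of the original study.

```python
from itertools import product

# Factor levels from Example 1 (names assumed for illustration)
brands = ["A", "B"]                      # factor 1: brand, 2 levels
locations = ["bottom", "middle", "top"]  # factor 2: shelf location, 3 levels

# Each treatment is one factor-level combination: 2 x 3 = 6 treatments
treatments = list(product(brands, locations))

for brand, location in treatments:
    print(f"(brand {brand}, {location})")
print(f"Number of treatments: {len(treatments)}")  # prints 6
```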
The term treatments is used to describe the factor-level combinations to be included in an experiment because many experiments involve "treating" or doing something to alter the nature of the experimental unit. Thus, we might view the six brand-shelf location combinations as treatments on the experimental units in the marketing study involving coffee sales.

Now that we understand some of the terminology, it is helpful to think of the design of an experiment in four steps.

STEP 1. Select the factors to be included in the experiment, and identify the parameters that are the object of the study. Usually, the target parameters are the population means associated with the factor-level combinations (i.e., treatments).

STEP 2. Choose the treatments (the factor-level combinations to be included in the experiment).

STEP 3. Determine the number of observations (sample size) to be made for each treatment. [This will usually depend on the standard error(s) that you desire.]

STEP 4. Plan how the treatments will be assigned to the experimental units. That is, decide on which design to use.

By following these steps, you can control the quantity of information in an experiment. We shall explain how this is done in the next section.

Generally, in an experiment, we control which experimental units get which values of x (treatments). If the experimental units were similar before the experiment and different after, we can infer a cause-and-effect relationship. To illustrate, suppose 30 similar students are assigned to three teaching methods (10 to each). If, after the experiment (students being taught by the different methods), students make higher exam grades under one teaching method, we can infer that the teaching method is affecting the grades.

Definition: A completely randomized design (CRD) to compare p treatments is one in which the treatments are randomly assigned to the experimental units. E.g., 30 students are randomly assigned to the three teaching methods.

Advantage: Easy design.

Problem with CRD: If the students are not similar, then we have too much randomness in the values of y caused by the lack of similarity. Example: If the students vary too much in IQ, background, etc., then we might not be able to detect differences among the teaching methods.

Possible remedy: Have all thirty students take an IQ test at the start of the experiment, and divide the students into ten groups of three. The first group of three consists of the students with the highest IQ scores, the next group of three has the next highest IQ scores, etc. Within each group of three, we randomly assign the students to the teaching methods. Each teaching method then gets similar students, yet all IQ levels are covered in the experiment. This is called a Randomized Block Design (RBD).

Definition: A randomized block design to compare p treatments involves b blocks, each block containing p relatively homogeneous experimental units. The p treatments are randomly assigned to the experimental units within each block, with one experimental unit assigned per treatment.

Equivalently: In a randomized block design, N experimental units are divided into b blocks. Each block contains similar experimental units, but the blocks differ from one another. Within each block, each experimental unit is randomly assigned to one treatment (value of x).

Example: 30 students are divided into 10 groups. Each group contains students with similar IQ scores, but the groups differ in IQ from one another. Within each group, each student is randomly assigned to one teaching method.
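As a concrete picture of this blocking-and-randomizing step, here is a minimal Python sketch that forms ten IQ-based blocks of three hypothetical students and randomly assigns the three teaching methods within each block. The simulated IQ scores, seed, and names are assumptions made for the illustration.

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

methods = ["method 1", "method 2", "method 3"]  # p = 3 treatments

# Hypothetical experimental units: 30 students with simulated IQ scores
students = [f"student {i}" for i in range(1, 31)]
iq = {s: random.randint(85, 130) for s in students}

# Form b = 10 blocks of p = 3 relatively homogeneous students:
# rank by IQ, then take consecutive groups of three
ranked = sorted(students, key=lambda s: iq[s], reverse=True)
blocks = [ranked[i:i + 3] for i in range(0, len(ranked), 3)]

# Randomly assign the p treatments within each block,
# one experimental unit per treatment
assignment = {}
for block in blocks:
    order = random.sample(methods, k=len(methods))  # random permutation
    for student, method in zip(block, order):
        assignment[student] = method

for block in blocks[:2]:  # show the first two blocks
    print([(s, iq[s], assignment[s]) for s in block])
```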
Definition: A factorial design is a completely randomized design with more than one x variable.

Example: We wish to study the effects of teaching method and of whether or not the students get to use computers. Factor 1 = teaching method (three levels); Factor 2 = computer or no computer (two levels). We now have six combinations, and the students are randomly assigned to the six combinations. We also watch out for factor interaction.

The Importance of Randomization

All the basic designs presented in this chapter involve randomization of some sort. In a completely randomized design and a basic factorial experiment, the treatments are randomly assigned to the experimental units. In a randomized block design, the blocks are randomly selected and the treatments within each block are assigned in random order.

Why randomize? The answer is related to the assumptions we make about the random error ε in the linear model. Experimenters rarely know all of the important variables in a process, nor do they know the true functional form of the model. Hence, the functional form chosen to fit the true relation is only an approximation, and the variables included in the experiment form only a subset of the total set of variables affecting the response. The random error ε is thus a composite error, caused by the failure to include all of the important factors as well as by the error in approximating the functional form.

Although many unmeasured and important independent variables affecting the response y do not vary in a completely random manner during the conduct of a designed experiment, we hope their behavior is such that their cumulative effect varies in a random manner and satisfies the assumptions upon which our inferential procedures are based. The randomization in a designed experiment has the effect of randomly assigning these error effects to the treatments, and it assists in satisfying the assumptions on ε.
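To illustrate the randomization step itself, the short Python sketch below randomly assigns 30 hypothetical students to the six treatment combinations of the factorial example, five students per combination. The seed and the student names are assumptions for the illustration, not part of the handout's example.

```python
import random
from itertools import product

random.seed(7)  # fixed seed so the illustration is reproducible

# The 3 x 2 = 6 treatment combinations of the factorial example
methods = ["method 1", "method 2", "method 3"]
computer = ["computer", "no computer"]
treatments = list(product(methods, computer))

# 30 hypothetical students, to be split 5 per treatment combination
students = [f"student {i}" for i in range(1, 31)]
random.shuffle(students)  # the randomization: put the units in random order

# Consecutive groups of five (in the shuffled order) get one treatment each
assignment = {student: treatments[i // 5] for i, student in enumerate(students)}

for treatment in treatments:
    group = [s for s, t in assignment.items() if t == treatment]
    print(treatment, group)
```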