Chapter 1: Design of Experiments Principles

Example: The use of unpaired observations and the use of paired observations are two experimental designs for two-treatment experiments.

Statistical design of experiments is the process of planning the experiment so that appropriate data can be obtained, on the basis of which inferences can be made in the best possible manner.
To the statistician, the experiment is the set of rules used to draw the sample from the population.

Two main aspects of any experimental problem:
Design of the experiment.
Statistical analysis of the data.
These two subjects are closely related, since the method of analysis depends directly on the design employed.
Experimental design methods play an important role in process development and process improvement.

Planning Experiments
Important steps for planning experiments:

(1) Define the objective of the experiment.
Remark: Classify the objectives as major and minor, since certain experimental designs give greater precision for some treatment comparisons than for others.

(2) Identify all sources of variation.
In planning almost any experiment, we need to decide:
What measurement to make (the response),
What conditions to study (the treatments), and
What experimental material to use (the units or subjects).
Remarks: Three kinds of variability:
Planned, systematic variability: the kind we want,
Chance-like variability: the kind we can live with, and
Unplanned, systematic variability: the kind that threatens disaster.
Sources of variability:
Variability due to the conditions of interest (wanted),
Variability in the measurement process (unwanted), and
Variability in the experimental material (unwanted).

(3) Choose a rule for assigning the experimental units to the treatments.
Basic Principles
Replication: Replication specifies the number of units to be provided for each of the treatments.
Properties of replication:
(i) to provide an estimate of the experimental error,
(ii) to improve the precision of an experiment by reducing the standard deviation of a treatment mean,
(iii) to increase the scope of inference of the experiment by selection and appropriate use of quite variable experimental units, and
(iv) to effect control of the error variance.
Randomization:
Properties of randomization:
It prevents systematic and personal biases from being introduced into the experiment by the experimenter.
Statistical methods require that the observations (or errors) be independently distributed random variables; randomization helps make this assumption reasonable.
Blocking: A block is a portion of the experimental units that should be more homogeneous than the entire set of units. Blocking involves making comparisons among the conditions of interest in the experiment within each block.

(4) Select the response variable: Most often, the average or standard deviation (or both) of the measured characteristic will be the response variable.

(5) Run a pilot study: A pilot study is a small-scale trial (involving only a few observations) of the methods, model, and procedures. It helps increase the precision of the main experiment.

(6) Choose the experimental design / specify the model. Decide:
(1) Whether the design is uni-factor or factorial.
(2) Whether to group the observations to eliminate one, two, or more causes of variation.
(3) Whether the number of treatments or treatment combinations is too large to allow a full replication to be fitted conveniently into one block. If so, the design is referred to as an "incomplete block design" and, if not, as a "complete block design".

Performing the experiment
Data Analysis: Statistical methods should be used to analyze the data so that results and conclusions are objective rather than judgmental in nature.
Conclusions and Recommendations: Follow-up runs and confirmation testing should also be performed to validate the conclusions from the experiment.

Example
Consider the following enzyme concentrations (mg/ml) in the hearts of eight hamsters.
Hamsters raised with long days: 1.49 1.53 1.56 1.79
Hamsters raised with short days: 1.39 1.49 1.25 1.38

Observations: It looks as though day length does affect the enzyme concentration. The average for the long-day hamsters is 1.59 mg/ml, which is higher than the short-day average of 1.38 mg/ml. The eight numbers show a lot of variability: the long-day measurements range from 1.49 mg/ml to 1.79 mg/ml, and that spread of 0.30 mg/ml is bigger than the difference between the long- and short-day averages.

The main components:
Treatments: day length (long days versus short days).
Response: the concentration of enzyme.
Material: hamsters.

Objective: We need to estimate how much of the difference between the averages is due to day length and how much is due to other sources.

Sources of variability:
Variability in the conditions of interest: the effects of day length, long versus short (the main goal of the experiment).
Variability in the response: the enzyme concentration is measured with a spectrophotometer, which measures the amount of light absorbed by the suspended particles, so each reading carries some measurement error.
Variability in experimental material: No two hamsters are biologically the same; some just naturally have higher enzyme concentrations than others. We can also expect the hamsters' environment and behavior to have some effect on enzyme concentration.

Three sets of questions we want this experiment to answer:
(a) Long versus short days: Does day length affect enzyme concentrations? If so, how big is the effect?
(b) Hamsters: Is the variability from one hamster to the next big enough to detect? If so, how much variability is there?
(c) Measurement error: How big is the chance error built into the process of measuring enzyme concentrations?

Random assignment (randomization): Use a chance device to decide which hamsters get long days and which get short. Randomization makes statistical analysis possible.
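The arithmetic behind the observations in this example can be checked with a short script. This is only a minimal sketch in Python: the eight values are the ones listed for the hamsters, while the variable names and the helper value_range are illustrative choices, not part of the course material. Every figure is recomputed from the eight values as listed.

```python
from statistics import mean

# Enzyme concentrations (mg/ml) as listed in the hamster example.
long_days = [1.49, 1.53, 1.56, 1.79]
short_days = [1.39, 1.49, 1.25, 1.38]


def value_range(values):
    """Spread of a group: largest observation minus smallest."""
    return max(values) - min(values)


long_mean = mean(long_days)
short_mean = mean(short_days)
mean_difference = long_mean - short_mean   # apparent effect of day length
long_range = value_range(long_days)        # spread within the long-day group
short_range = value_range(short_days)      # spread within the short-day group

print(f"long-day mean:   {long_mean:.4f} mg/ml")
print(f"short-day mean:  {short_mean:.4f} mg/ml")
print(f"difference:      {mean_difference:.4f} mg/ml")
print(f"long-day range:  {long_range:.2f} mg/ml")
print(f"short-day range: {short_range:.2f} mg/ml")
```

Comparing the spread within a group to the difference between the group averages is only an informal check; it motivates the three questions about day length, hamster-to-hamster variability, and measurement error, but it is not a formal analysis.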
Completely randomized design: Random assignment leads to the simplest experimental plan: using a chance device, randomly assign a day length (long or short) to each hamster. For balance, make sure that half the hamsters get long days and half get short.
Treatments: long and short days.
Experimental units: hamsters (8 in all).
Design: completely randomized (CR) design.

Second principle (blocking): Randomized complete block design: First sort (or subdivide) your experimental material into groups (blocks) of similar units; then assign conditions to units separately within each block.
Treatments: long and short days.
Experimental units: hamsters (8 in all).
Blocks: pairs of similar hamsters (4 pairs).
Design: randomized complete block (RCB) design.

How to choose blocks: Here are three ways to choose blocks for the hamster experiment.
(1) If our hamsters happen to come from four different litters, two hamsters per litter, we could use pairs of littermates as blocks.
(2) We could weigh the hamsters and then put the two heaviest together in a pair, the next two heaviest together, and so on.
(3) Instead of weighing the hamsters, we could measure how fast they use oxygen, and put the two with the fastest rates together, and so on.

Computer Software
R, SAS and SPSS (used for this class); R and S-PLUS (for graphics); other statistical software, etc.
Note: One can use any computer software package, but make sure you know exactly the capabilities of the package and also the likely size of its rounding errors.

Suggested Homework Exercises
Chapter 1 Exercises: 2, 5, 12.
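The two assignment plans in the hamster example (random assignment across all eight hamsters, and random assignment within pairs) can be sketched in code. This is an illustrative Python sketch, not the course's own procedure: the hamster labels H1 to H8, the pairing into blocks, the function names, and the seed are all hypothetical, and Python's random module stands in for the "chance device".

```python
import random


def assign_at_random(units, treatments, rng):
    """Balanced random assignment over all units (no blocking)."""
    if len(units) % len(treatments) != 0:
        raise ValueError("balance needs a multiple of the number of treatments")
    shuffled = list(units)
    rng.shuffle(shuffled)  # the chance device
    per_treatment = len(shuffled) // len(treatments)
    # First per_treatment shuffled units get treatment 0, the next get 1, ...
    return {unit: treatments[i // per_treatment]
            for i, unit in enumerate(shuffled)}


def assign_within_blocks(blocks, treatments, rng):
    """Each block receives every treatment once, in a random order."""
    plan = {}
    for block in blocks:
        if len(block) != len(treatments):
            raise ValueError("a complete block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)  # separate randomization inside each block
        for unit, treatment in zip(block, order):
            plan[unit] = treatment
    return plan


rng = random.Random(2024)  # fixed seed only so the sketch is repeatable
treatments = ["long", "short"]

# Hypothetical labels for the eight hamsters.
hamsters = [f"H{i}" for i in range(1, 9)]
unblocked_plan = assign_at_random(hamsters, treatments, rng)

# Hypothetical blocks: four pairs of similar hamsters (e.g. littermates).
pairs = [("H1", "H2"), ("H3", "H4"), ("H5", "H6"), ("H7", "H8")]
blocked_plan = assign_within_blocks(pairs, treatments, rng)

print(unblocked_plan)
print(blocked_plan)
```

In the first plan any four hamsters may end up with long days; in the second, each pair always contributes one long-day and one short-day hamster, so the comparison between day lengths is made within pairs of similar animals.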