Designing Experiments
LISA Short Course
Justin Loda
March 17, 2015

Who I Am
- 2nd-year graduate student
- BS in Mathematics, CNU
- MS in Statistics, VT
- LISA Lead Collaborator

About LISA (Laboratory for Interdisciplinary Statistical Analysis)
- Free collaboration: experimental design, data analysis, software help, interpreting results, grant proposals
- Free walk-in consulting for quick statistics questions
- Free short courses: R tutorial; Structural Equation Modeling; Plotting Data

Requesting a LISA Meeting
- Go to www.lisa.stat.vt.edu
- Click the link for the "Collaboration Request Form"
- Sign into the website using your VT PID and password
- Enter your information (email, college, etc.)
- Describe your project (project title, research goals, specific research questions, whether you have already collected data, special requests, etc.)
- Contact your assigned LISA collaborators as soon as possible to schedule a meeting

Goals for Course
- Observational studies vs. designed experiments
- The 3 main principles of designed experiments: randomization, replication, blocking (local control of error)
- Common designs (EX: paint hardness)

What Constitutes a Good Design?
- Maximize information gain
- Minimize cost

Sources of Variation
- A source of variation is anything that could cause an observation to differ from another observation. Example: popping popcorn
- There are two major types of sources of variation:
  - Those that can be controlled and are of interest are called treatments or treatment factors. Examples: drug in a medical experiment; settings on a machine producing tires; different types of political advertising to encourage voting
  - Those that are not of interest but are difficult to control are nuisance factors. Examples: sex, age, weather

Terminology
- Treatment factor: any substance or item whose effect on the data is to be studied.
- Treatment levels: the specific types or amounts of the treatment factor that will actually be used in the experiment.
- Treatment combinations: combinations of the levels of different treatment factors.
- Factorial experiment: an experiment involving two or more treatment factors.
*Definitions from Dean and Voss

Terminology
- Experimental units: the "material" to which the levels of the treatment factor(s) are applied; what the treatment is being applied to.
- Block: a group of experimental units that share a common characteristic.
- Blocking factor: the characteristic used to create the blocks.
*Definitions from Dean and Voss

Correlation ≠ Causation

Experiment vs. Observational Study
- Observational study: the researcher observes the response of interest under natural conditions. EX: surveys, weather patterns
- Experiment: the researcher controls variables that have a potential effect on the response of interest.
- Which one helps establish cause-and-effect relationships better?

EXAMPLE: Impact of Exercise Intensity on Resting Heart Rate
- The researcher surveys a sample of individuals to glean information about their intensity of exercise each week and their resting heart rate.
- What type of study is this?

EXAMPLE: Impact of Exercise Intensity on Resting Heart Rate
- The researcher finds a sample of individuals, enrolls groups in exercise programs of different intensity levels, and then measures before/after heart rates.

THREE BASIC PRINCIPLES OF DOE: Randomization

Randomization
- What? Randomly assign which experimental unit gets which treatment.
- Why? It averages out the effects of extraneous/lurking variables and reduces bias and accusations of bias.
- How? Depends on the type of experiment.

Exercise Example
- The 36 participants are randomly assigned to one of the three programs: 12 in low intensity, 12 in moderate intensity, 12 in high intensity.
- This is like drawing names from a hat to fall into each group.
- Oftentimes computer programs can randomize participants for an experiment, as in the sketch below.
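As one illustration of the "how", here is a minimal sketch (not part of the original course materials) of randomly assigning the 36 hypothetical participants to the three intensity levels in Python; the participant IDs, labels, and seed are made up for the example.

```python
import numpy as np

# Hypothetical participant IDs and the three intensity levels (12 slots each)
participants = np.arange(1, 37)                        # 36 participants
treatments = np.repeat(["low", "moderate", "high"], 12)

rng = np.random.default_rng(seed=2015)                 # fixed seed so the assignment is reproducible
rng.shuffle(treatments)                                # shuffle the treatment slots, like drawing names from a hat

for pid, trt in zip(participants, treatments):
    print(f"Participant {pid:2d} -> {trt}")
```

Because the treatment labels are shuffled rather than chosen one at a time, the design stays balanced: each intensity level ends up with exactly 12 participants.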
Exercise Example
- What if we did not randomize? Suppose there is some underlying characteristic that is more likely to be possessed by those who volunteer first.
- If we assigned the first third to one intensity, the second third to another, and so forth, it would be hard to separate the effects of the "early volunteers" from the effects of their assigned intensity.

Run order (level assigned at each run); EX1 is a non-randomized order, EX2 a randomized one:

  Run:   1  2  3  4  5  6  7  8  ...
  EX1:   1  1  1  1  1  1  1  1  ...
  EX2:   1  3  2  3  1  2  1  3  ...

Summary
- Randomizing the assignment of treatments and/or the order of runs accounts for known and unknown differences between subjects.
- It does not matter if the result does not "look random" (i.e., appears to have some pattern), as long as the order was generated using a proper randomization device.

THREE BASIC PRINCIPLES OF DOE: Replication

Replication
- What? Assigning a treatment (treatment combination) to multiple experimental units.
- Why? It increases precision in the experiment.
- How many? A sample size calculation (see the sketch after the replication summary below).

Replication: What Replication is NOT
- Multiple measurements on the same experimental unit.
- Example: one subject is assigned to a drug and then measured four times over the course of a day (repeated measurements).
- Example: two different greenhouses are set at either a high or a low growing temperature (the treatment), and five plants are placed within each greenhouse (the plants are observational units).

Experimental Units (EUs)
- We now introduce the term "experimental unit" (EU): the "material" to which treatment factors are assigned. In our case, each person is an EU.
- This is different from an "observational unit" (OU): an OU is a part of an EU that is measured.
- Multiple OUs within an EU here would arise if we took each person's pulse at his/her neck, at the wrist, etc., and reported these observations.

Replication: Extension to EUs
- A treatment is only replicated if it is assigned to a new experimental unit.
- Taking multiple observations on one EU (i.e., creating more OUs) does not count as replication; this is known as subsampling.
- Treating subsampling as replication increases the chance of incorrect conclusions (pseudoreplication): variability in multiple measurements is measurement error rather than experimental error.

Exercise Example
- Use the formula: # Reps = # EUs / # Treatments
- 36 participants, 3 treatments: 36 / 3 = 12 replications per treatment in the balanced case.
- The balanced case is preferred because the power of the test to detect a significant effect of the treatments on the response is maximized with equal sample sizes.

Exercise Example: Unbalanced Consequences?
Suppose the following:

  Treatment:       Low   Moderate   High
  # Participants:  9     9          18

- This would lead to better estimation of the high-intensity treatment than of the other two.
- Thus, if you have equal interest in estimating the treatments, try to replicate the treatment assignments equally.

Summary
- The number of replications is the number of experimental units to which a treatment is assigned.
- Replicating in an experiment helps us decrease variance and increase precision in estimating treatment effects.
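To make the "how many?" question concrete, here is a minimal sketch, not from the course, of a sample size calculation for a balanced one-way design using statsmodels' ANOVA power routine. The effect size (Cohen's f), significance level, and target power are illustrative assumptions, not values from the exercise example.

```python
from statsmodels.stats.power import FTestAnovaPower

# Assumed inputs for illustration: a "medium" effect size (Cohen's f = 0.25),
# a 5% significance level, 80% desired power, and 3 treatment groups.
analysis = FTestAnovaPower()
n_total = analysis.solve_power(effect_size=0.25, alpha=0.05, power=0.80, k_groups=3)

n_per_group = n_total / 3  # balanced allocation: equal replication per treatment
print(f"Total sample size: {n_total:.0f} (about {n_per_group:.0f} per group in the balanced case)")
```

In practice the effect size would come from pilot data or the smallest difference worth detecting, and the total would be rounded up to keep the groups equal.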
THREE BASIC PRINCIPLES OF DOE: Blocking (or Local Control of Error)

Local Control of Error
- What? Any means of improving the accuracy and precision with which treatment effects are measured in a design.
- Why? It removes sources of nuisance experimental variability and improves the precision with which comparisons among factors are made.
- How? Often through the use of blocking (or ANCOVA).

Blocking
- What? A block is a set of relatively homogeneous experimental conditions. EX: block on time, on the proximity of experimental units, or on characteristics of the experimental units.
- How? Perform a separate randomization for each block. Account for the differences between blocks, and then compare the treatments.

Exercise Example
- Block on gender? This assumes that males and females have different responses to exercise intensity.
- We would have the following (balanced) design:

  Block 1: 24 males    -> 8 low, 8 moderate, 8 high
  Block 2: 12 females  -> 4 low, 4 moderate, 4 high

- Here, after the participants are blocked into male/female groups, they are randomly assigned to one of the three treatment conditions within each block.

Summary
- Blocking is separating EUs into groups with similar characteristics.
- It allows us to remove a source of nuisance variability and increases our ability to detect treatment differences.
- Randomization is conducted within each block.
- Note that we cannot make causal inferences about blocks, only about treatment effects!

Design Fundamentals: Summary
- An experimental unit is what we assign/apply treatments to.
- A block is a group of EUs that are more similar to each other than to other EUs.
- Replication and randomization increase precision and reduce known/unknown sources of bias.
- Accounting for covariate and block effects improves the ability to detect treatment differences.
- Causal inference is about treatment effects only!

Common Designs

Completely Randomized Design (CRD)
- The simplest design: all EUs are assumed to be similar to each other, and the only major source of variation is the treatments.
- A CRD randomizes all treatment-EU assignments for the specified number of treatment replications and, if necessary, randomizes the run order.

CRD Example: Paint Hardness
- A chemical engineer wants to compare the hardness of four blends of paint.
- Eight samples of each paint blend were applied to 32 pieces of wood. The pieces of wood were cured, and then each sample was measured for hardness.

  Run   Paint   Hardness
  1     1        8.7
  2     3       11.3
  3     2        8.5
  4     4       10.6
  5     2        7.4
  ...
  32    2        8.1

Analysis of CRD: Plots
- Boxplots are a very effective way to compare responses for different treatments.

Analysis of CRD: ANOVA
- ANOVA partitions the total variability into separate, independent pieces:
  - MSTrt: variability due to treatment differences
  - MSError: variability due to experimental error
- If MSTrt is large relative to MSError, then the treatments likely have different effects.

Analysis of CRD: Treatment Comparisons
- Tukey's HSD is the most common and most powerful pairwise comparison test.
- A sketch of this analysis in Python follows below.
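The following is a minimal sketch of the CRD analysis described above, a one-way ANOVA followed by Tukey's HSD, written in Python with scipy and statsmodels. The hardness values are simulated stand-ins, not the actual experiment's data.

```python
import numpy as np
import pandas as pd
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Made-up hardness measurements for four paint blends (8 samples each);
# the real experiment's data are not reproduced here.
rng = np.random.default_rng(1)
data = pd.DataFrame({
    "paint": np.repeat([1, 2, 3, 4], 8),
    "hardness": np.concatenate(
        [rng.normal(loc, 1.0, 8) for loc in (8.5, 8.0, 11.0, 10.5)]
    ),
})

# One-way ANOVA: does mean hardness differ among the paint blends?
groups = [g["hardness"].values for _, g in data.groupby("paint")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Tukey HSD pairwise comparisons among the blends
print(pairwise_tukeyhsd(data["hardness"], data["paint"]))
```

A side-by-side boxplot of hardness by paint blend (e.g., with pandas' boxplot method) is a natural companion to this output.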
Randomized Complete Block Design (RCBD)
- The block size is the number of EUs in a block.
- An RCBD is a design in which the block size equals the number of treatment combinations.
- A generalized RCBD is one in which the block size is a multiple of the number of treatment combinations.
- An incomplete block design (IBD) is one in which the block size is less than the number of treatment combinations.

Benefits of RCBD
- We can account for variability in the experimental units that might otherwise obscure the treatment effects.
- It can be thought of as a separate CRD, with one replicate, in each block: randomize the treatments in EACH BLOCK.

RCBD Example: Paint Hardness cont.
- Now suppose that, instead of 32 pieces of wood, the experimenter is only interested in the hardness of the paint as it pertains to plywood. Due to budget restrictions, only 8 sheets of plywood are available.
- Each sheet of plywood is a block, and all four paint blends (1-4) are applied to each sheet, giving 8 replications per blend.

RCBD Example: Paint Hardness cont.

  Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
  Block     7     8.57218750       1.22459821     1.47   0.2324
  Paint     3    88.69093750      29.56364583    35.42   <.0001
  Error    21    17.52656250       0.83459821

RCBD Summary
- Blocking is a technique to reduce experimental error.
- No causal inference for block effects.
- The analysis is similar to that of a CRD.
- The RCBD is a simple block design in which the block size equals the number of treatments.

Split-Plot Designs
- Used when some treatment factors are more difficult to change during the experiment than others.
- The whole-plot factor is the hard-to-change factor; the split-plot factor is the easy-to-change factor.
- These designs have a nested blocking structure and two levels of randomization: whole plot and split plot.

Split-Plot Example: Paint Hardness cont.
- Suppose the researcher is also interested in determining the effect of high and low temperature on paint hardness. The researcher has four temperature chambers, each of which can hold two sheets of plywood.
- Whole-plot factor: temperature (2 levels). Split-plot factor: paint (4 levels).
- Layout: two chambers run at low temperature and two at high temperature; each chamber holds two plywood sheets, and all four paint blends (1-4) are applied to each sheet. SPD(CRD, RCBD)

Split-Plot Example: Paint Hardness cont.

  Source        DF
  Temp           1
  WP Error       2
  Block          1
  Paint          3
  Paint*Temp     3
  SP Error      21

- Notice that there are two error terms.

Other Designs
- Multiple blocking factors: Latin square designs, row-column designs
- Split-split-plot
- Small experiments: single replicate of factorial designs, fractional factorials, saturated designs
- Second-order designs: response surface methodology

Wrap-Up: Conclusions & Questions

Summary of the Short Course
- Remember to randomize! Randomize the run order and the treatment assignments.
- Remember to replicate! Use multiple EUs for each treatment; it will help you be more accurate in estimating your effects.
- Remember to block! When you suspect that some inherent quality of your experimental units may be causing variation in your response, arrange your experimental units into groups based on similarity in that quality.
- Remember to contact LISA! For short questions, attend our walk-in consulting hours. For research, come before you collect your data for design help.

References
- Dean, A. and Voss, D. (1999). Design and Analysis of Experiments.
- http://www.lisa.stat.vt.edu/?q=node/5960
- http://www.lisa.stat.vt.edu/?q=node/6390
- http://support.minitab.com/en-us/minitab-express/1/help-and-how-to/modeling-statistics/anova/how-to/one-way-anova/before-you-start/example/