LISA: BASIC PRINCIPLES OF EXPERIMENTAL DESIGN
By Chris Franck

Laboratory for Interdisciplinary Statistical Analysis
LISA helps VT researchers benefit from the use of Statistics
Collaboration: Visit our website to request personalized statistical advice and assistance with:
Experimental Design • Data Analysis • Interpreting Results • Grant Proposals • Software (R, SAS, JMP, SPSS...)
LISA statistical collaborators aim to explain concepts in ways useful for your research.
Great advice right now: Meet with LISA before collecting your data.
LISA also offers:
Educational Short Courses: Designed to help graduate students apply statistics in their research
Walk-In Consulting: M-F 12-2PM in 401 Hutcheson Hall for questions requiring <30 mins
All services are FREE for VT researchers. We assist with research, not class projects or homework.
www.lisa.stat.vt.edu

To request a collaboration meeting go to www.lisa.stat.vt.edu
1. Sign in to the website using your VT PID and password.
2. Enter your information (email address, college, etc.).
3. Describe your project (project title, research goals, specific research questions, if you have already collected data, special requests, etc.).
4. Wait 0-3 days, then contact the LISA collaborators assigned to your project to schedule an initial meeting.

About LISA
FREE services: Collaboration, walk-in consulting, short courses
We are here to help you!
Goal is to contribute to good research across the Virginia Tech community.

Eric Vance, Director
Tonya Pruitt, Administrative Superstar
Chris Franck, Assistant Director

5 Lead Collaborators (20 hours/week)

Structure of talk
PLEASE SIGN THE SIGN IN SHEET!
Cover fundamental aspects of experimental design.
Scenarios come from consulting experience.
Highlight LISA services available.
Key words in red.
?? marks interesting questions.

Central messages
If you face statistical uncertainty at any stage of your research, please come to LISA.
Best time to involve the statisticians: before the data has even been collected.
Speaking of the pre-data collection phase…

Why are we here?
Choices made at the design stage have the potential to drastically impact the results of any study.
Good experimental design gives the researcher an improved chance of a successful experiment.
A poorly considered or implemented design can have a ruinous effect on the investigation.

The plan
We will discuss basic elements of experimental design: randomization, replication, and blocking.
Real world experiments.
Interpretation of experimental results will be compared and contrasted with interpretation of results from observational studies.

A note on examples presented
The examples present unique challenges and complications; they were chosen from hundreds of collaborations.
The magnitude of the challenges in these examples is greater than what is typical for a LISA collaboration.
Worst case scenarios!

Study design: food science
Research question: Among three genetic varieties of sweet potatoes, which type will brown the least when fried?
Also take storage time into account.
Browning is measured with a machine (beyond the scope of this talk).
The following graphic shows the design layout.

Sweet potato design
[Figure: design layout grid. Rows: Week 1-9. Columns: Cook Order 1-9. Legend: Cultivar 1, Cultivar 2, Cultivar 3, Oil Change.]

Features of the design
Suppose we conduct this experiment and conclude the third variety browns the most, and the first variety browns the least.
Is this necessarily due to the genetic differences in the potato types?
Is there another plausible explanation?

What about cooking order?
Notice that for a given week, all of the potatoes are cooked in the same oil.
Also, the varieties are always cooked in the same order, making the effect of variety and the effect of cook order inseparable.
The effect of variety on browning is confounded with the effect cooking order has on browning.

A randomized design
[Figure: randomized design layout grid. Rows: Week 1-9. Columns: Cook Order 1-9. Legend: Cultivar 1, Cultivar 2, Cultivar 3, Oil Change.]

Why randomize?
Randomization is a fundamental feature of good experimental design.
In this case, randomization will eliminate the known confound between cooking order and potato variety.
Randomization makes groups similar on average, and hence guards against unknown confounding effects as well!

Sweet potato remarks
Since the effect of interest (genetic variety) is confounded with cooking order in the current experiment, the recommendation is to repeat the experiment with a randomized design.
Many randomized designs exist!
Perhaps changing oil more frequently can also improve the project.

Costly
In general randomization is not difficult to perform.
In this case the cost of repeating the experiment in randomized fashion was moderate (about 12 weeks of time + materials).
Repeating an experiment can be VERY costly (3 years + materials for PhD research).

Another randomization example
A professor observes that students who sit in the front of a large lecture class tend to get better grades.
Can she conclude that sitting in front causes students to get better grades?

HW problem
A grape researcher is interested in testing the effect of 4 pesticides on the disease rate of his grapes.
For his experiment he has 16 total vines arranged in four plots.
Each vine has a trunk at the center and two cordons extending from the trunk.
Many grape clusters grow on each cordon.

Basic grape vine anatomy

How to assign pesticides?
To administer the pesticides, the researcher randomly assigns one pesticide (labeled A, B, C, and D) to each of the plots.
He then sprays the assigned pesticide on all four vines in each plot, walking from north to south in each case.
Call the pesticide the treatment.

How many replicates for each treatment?

How many reps for each treatment?
A) 4 reps/treatment, since there are four vines that receive each pesticide.
B) 8 reps/treatment, since there are eight cordons that receive a given treatment.
C) Many replicates: depends on the number of grapes which grow, since each grape might or might not have the disease.
D) Something else.

Answer:
Number of experimental replicates: 1 per treatment.
Why!? What went wrong?
The experimental unit is the smallest unit in the experiment to which separate treatment assignments are made.
What was the experimental unit in this experiment?

Definition of replicate
The number of replicates for a given treatment is equal to the number of times the treatment was assigned to an experimental unit.

Consequences of this design
We cannot perform usual statistical inference in this experiment.
That is, we cannot perform hypothesis tests, construct confidence intervals, etc.
The resulting data might suggest a difference in the treatments, but we can’t quantify the uncertainty of the results with confidence levels, p-values, etc.

Improvements
Instead of using 4 total plots, we might use 8, 12, 16, etc. This would give 2, 3, 4, etc. replicates per treatment.
Instead of randomizing the treatments to the plots, perhaps we can randomize the treatments to the vines themselves?

Randomizing treatments to vines
Now the vine is the experimental unit.
4 replicates for each treatment instead of 1.
?? What if our treatment is sprayed on the vines in such a way that adjacent vines get a little bit of the wrong treatment? Windy day?
This is an example of a carryover effect; we can address these advanced issues at a LISA collaboration meeting.

Another replication example
A researcher is developing a compound needed to create some advanced textiles.
Four temperature levels are crossed with four molecular compounds.
Each combination of temperature and compound is randomized to a flask, and five samples are taken from each flask for measurements.
Interaction between temperature and compound is of primary interest.

Umm, Chris, did you really randomize?
Plot 1 has three instances of D!
Not one single plot has each of the treatments!
What if plot 1 has ideal characteristics (irrigation, soil quality, sunlight)?
?? Won’t treatment D seem better than it otherwise would since it appears in the best plot three times?

Yes, and Yes
Yes, I did randomize, using a completely randomized design.
Yes, if plot 1 has different characteristics than the other plots, and D appears frequently by chance in plot 1, then the three observations on treatment D are a function of both the treatment and the enhanced plot characteristics.

In general
If plots 1, 2, 3, and 4 are not identical in terms of the response (disease rate), then plot-to-plot (inter-plot) variability is present.
Maybe some of the plots are on inclines, soil characteristics may be different, etc.
Call the impact of the different plots on the response the plot effect.

How do we handle the plot effect?
We are interested in the disease rates of grapes.
We believe the various treatments will affect these disease rates.
We also believe that the plots will have an impact on the disease rates.
We don’t really care about the plot effect; our primary goal is to determine how the treatments affect the response.
Plots are simply an extra source of variability.

Treat the plots as blocks
Blocking is a strategy that may be implemented in order to account for known sources of variability in the experimental material.
In our case, the plots may show variability we wish to account for.

To implement blocking
Blocking is implemented during the design phase of the experiment.
We want to assign the treatments to the blocks so that each treatment appears in each block exactly one time.
Assigning treatments to blocks in this fashion is a form of restricted randomization.

Randomized complete block design

Notes about RCBD
Notice each treatment appears in each block exactly once.
Statistical model for this design: $y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$
$i = 1, \ldots, a$, where $a$ is the number of treatments (4)
$j = 1, \ldots, b$, where $b$ is the number of blocks (4)
$y_{ij}$ is the response for the $i$th treatment, $j$th block.

More terms
$\mu$ is the overall mean.
$\alpha_i$ is the treatment effect for the $i$th treatment.
$\beta_j$ is the effect of the $j$th block.
$\varepsilon_{ij}$ is an error term associated with the response at the $i$th treatment and $j$th block.

Take home message
By implementing the randomized complete block design, we can:
Compare the performance of the four treatments.
Account for variability in the plots that might otherwise obscure the treatment effects.
But suppose you did not randomize the treatments into blocks before collecting data. Can I still use the above technique?

Can I?
Maybe, maybe not.
If you used a completely randomized design (as I did initially), you may not be able to fit the RCBD model!
This is because some of the parameters in the model may be non-estimable depending on how your randomization works out, e.g. in the completely randomized design there is no information about how treatment A behaves in plot 1.
Come see LISA when designing experiments!

Row effects?
?? RCBD seems good, but what if I also have a row effect in addition to a plot effect?
E.g. perhaps there is a fertility gradient within each plot.
Treatment C appears three times in the second row in the RCBD plot.
Don’t we have the same problem even with RCBD?
Answer:

Latin Square layout

Another blocking example: hemophilia project
Hemophilia refers to a set of genetic disorders that impair the ability of an individual’s blood to clot.
Hemophiliacs do not have the ability to produce certain proteins which are needed for blood clotting.
In our project we study Factor IX (F.IX).

Chapel Hill hemophilia dogs
[Figure labels: Hemophilia A, Hemophilia B]

Hemophilia overview
This work is done in collaboration with The Francis Owen Blood Research Laboratory (FOBRL) in Chapel Hill, NC.
FOBRL maintains a colony of hemophilic dogs which are used as animal models.
Gene therapy: Use a modified virus to insert genes into an individual’s cells and tissues to treat hemophilia.

Gene therapy outline
Development of gene therapies is an involved and expensive process.
Hemophilic dogs are given doses of gene therapy vector with the hope that this therapy will give the dog the ability to produce the missing factor.
If the dog is producing trace amounts of the factor then the dose of the vector can be increased. This is less expensive than developing a new gene therapy.
This is also why it is vital to have a test that can detect the factor with high sensitivity.

Measuring clotting time
Activated Partial Thromboplastin Time (APTT): expensive, highly technical assay which relies on processed blood plasma.
Whole Blood Clotting Time (WBCT): less expensive, simpler assay which is performed on unprocessed blood.

Whole blood clotting time procedure
Draw 1 mL of blood from the subject.
Divide between two test tubes in a 28 °C water bath.
Incubate tubes for 1 minute, then begin tipping the first tube every 30 seconds.
As soon as a clot forms in the first tube, begin tipping the second tube every 30 seconds.
When the second tube forms a clot, the total time is the WBCT.

Research question
What percentage of F.IX (relative to a non-hemophilic individual) can be detected by the WBCT procedure?
More specifically, is the clotting time for the WBCT assay significantly shorter than 60 minutes (baseline for an untreated dog) at 0.01% F.IX?

Randomized complete block design
4 dogs, each with 10 concentrations of F.IX.
4*10 = 40 data points.
Each clotting time is modeled as a function of overall mean, F.IX concentration, dog, and random error.
What is the treatment? The blocks?

Statistical formulation of research hypotheses
The null hypothesis (denoted H0) is the hypothesis of no treatment effect.
The alternative hypothesis (denoted HA) is the hypothesis that the treatment has an effect.
H0: F.IX concentration does not have an effect on WBCT.
HA: F.IX concentration does have an effect on WBCT.

Analysis of F.IX project
SAS was used to conduct this statistical analysis.
We found that concentration had a significant effect on WBCT (p-value < 0.001).
We found that even at the lowest concentration present (0.01% F.IX) the average clotting time was significantly less than 60 minutes (p-value < 0.001).
These p-values represent the probability that data at least as “extreme” as ours would arise if H0 were true.

Hemophilia wrap-up
The WBCT assay yields clotting times lower than 60 minutes on average even for F.IX dilutions as low as 0.01%.
Hence this procedure is very sensitive to the presence of F.IX.
This is useful information since it helps guide researchers when deciding whether to increase the dose of the current gene therapy vector instead of developing a new gene therapy.

Strawberry Cover Example
Data was collected on strawberry yields: row covers and spring covers.
Covers help regulate the temperature for strawberries, allow sunlight in, and protect from frost.
Microclimate.
Extend the growing season.
Client appeared Thursday, had a conference Monday.

Strawberry rows

Strawberry row covers

Experimental layout
16 plots in four blocks.
Plots randomly assigned to one of the following treatments: no row cover (control), or one of three application dates of row covers (D1, D2, D3).
Each plot was also split, and a Spring row cover regimen was applied to a randomly chosen half of the plot.

Factorial arrangement of cover

Split each plot so we can assign Spring cover treatment

Each plot split into two, another experiment inside

What is this experiment called?

More advanced example: Split plot with Blocks
Split plot designs are somewhat advanced: unlikely in an introductory course!
Main feature: One experiment inside another.
Can involve blocking terms.
Model for split plot with blocks: $Y_{ijk} = \mu + \alpha_i + R_k + (\alpha R)_{ik} + \beta_j + (\alpha\beta)_{ij} + E_{ijk}$
Greek letters denote fixed effects; capital Roman letters denote random effects.

$Y_{ijk} = \mu + \alpha_i + R_k + (\alpha R)_{ik} + \beta_j + (\alpha\beta)_{ij} + E_{ijk}$
$Y$ is the response, $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, r$.
$\mu$ is the overall mean, $\alpha_i$ are whole plot treatment effects, $\beta_j$ are split plot treatment effects, $(\alpha\beta)_{ij}$ are whole plot by split plot interaction effects. Assume sum-to-zero constraints.
$R_k$ represents the block effects, $(\alpha R)_{ik}$ represents the whole plot by block interaction, $E_{ijk}$ represents residual error. These random effects have normal distributions with appropriate variance components.

Analyzing Split plot with blocks model
Not trivially easy, but not really hard either.
However, the above experimental description differs from the experiment which was actually conducted!

Forgot to add split plot treatment to control plots!

Now what!?

Analyzing full data set not so simple…
Expert opinion suggested the spring cover may not be particularly useful for improving yield.
What if we consider only the non-control plots, and then test for a spring cover effect?

What type of experiment is this?

Strawberry Wrap-up
Concluded no spring cover effect.
Considered the full data set ignoring the spring cover term, analyzed as a two-way factorial experiment with subsamples.
Sufficient.
Was this the optimal experimental design?
Could there be a better statistical approach?
Deadline was met: conference presentation went well.

One benefit of experiments vs. observational studies
In general, experiments provide stronger evidence of causal relationships between variables.
Observational studies can show associations between variables, but are NOT sufficient to demonstrate causality.
E.g. survey grape farms, ask which treatment they use to control disease, find one treatment better than others. Causal?

Conclusions
The design phase of an experiment is a crucial time to plan carefully.
Careful design sets the stage for success.
Overlooking this stage can lead to disastrous results.
LISA can help you design experiments.
Nobody (including LISA) can fully rescue an experiment that has design flaws.

Other LISA short courses
Date        Course Title                                Instructor
02/01/2011  Basic Principles of Experimental Design     Chris Franck
02/07/2011  Using JMP for Statistical Analysis Part I   Wandi Huang
02/08/2011  Using JMP for Statistical Analysis Part II  Wandi Huang
02/15/2011  Regression                                  Jennifer Kensler
02/21/2011  Intro to SAS                                Mark Seiss
02/22/2011  Intro to SAS                                Mark Seiss
02/28/2011  Introduction to R                           Sai Wang
03/01/2011  Introduction to R                           Sai Wang
03/14/2011  Bayesian Methods for Regression in R        Nels Johnson
03/15/2011  Bayesian Methods for Regression in R        Nels Johnson

Thanks!
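
Supplementary sketch: randomizing a design
The three layouts discussed in the grape example (completely randomized design, randomized complete block design, Latin square) can each be generated with a few lines of code. The following is a minimal sketch in Python, which is an assumption for illustration only; the talk itself does not prescribe a tool for randomization. The function names, the seed, and the use of labels A-D for 16 vines in 4 plots are hypothetical choices made to mirror the example, not the procedure actually used in the collaboration.

```python
import random

TREATMENTS = ["A", "B", "C", "D"]  # pesticide labels from the grape example


def completely_randomized(treatments, n_units, rng):
    """Completely randomized design: equal replicates shuffled over all units."""
    reps, remainder = divmod(n_units, len(treatments))
    if remainder:
        raise ValueError("n_units must be a multiple of the number of treatments")
    layout = list(treatments) * reps
    rng.shuffle(layout)
    return layout


def randomized_complete_block(treatments, n_blocks, rng):
    """RCBD: every block receives each treatment exactly once, in random order."""
    return [rng.sample(treatments, k=len(treatments)) for _ in range(n_blocks)]


def latin_square(treatments, rng):
    """Latin square: each treatment appears once per row and once per column.

    Built from a cyclic square with rows, columns, and labels shuffled.
    This is a simple construction; it does not sample uniformly from
    all possible Latin squares.
    """
    n = len(treatments)
    labels = rng.sample(treatments, k=n)
    rows = rng.sample(range(n), k=n)
    cols = rng.sample(range(n), k=n)
    return [[labels[(r + c) % n] for c in cols] for r in rows]


rng = random.Random(2011)  # fixed seed (arbitrary) so the layout is reproducible
crd = completely_randomized(TREATMENTS, 16, rng)      # 16 vines, plots ignored
rcbd = randomized_complete_block(TREATMENTS, 4, rng)  # one list per plot (block)
square = latin_square(TREATMENTS, rng)                # rows and columns both blocked

for name, layout in [("RCBD", rcbd), ("Latin square", square)]:
    print(name)
    for row in layout:
        print(" ".join(row))
```

In the completely randomized layout a treatment can land in the same plot several times by chance, which is exactly the "plot 1 has three instances of D" situation raised in the talk. The RCBD restricts the randomization so that each plot contains every treatment once, and the Latin square adds the same restriction for rows.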