Module 1: Lecture 1

LECTURE 1
INTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS

Experiment – a test or series of tests in which purposeful changes are made to the input variables of a process or system so that we may observe and identify the reasons for changes in the output response. Experiments are used to study the performance of processes and systems.

The objectives of an experiment may include (refer to Figure 1-1):
1. Determining which variables are most influential on the response $y$
2. Determining where to set the influential $x$'s so that $y$ is almost always near the desired nominal value
3. Determining where to set the influential $x$'s so that variability in $y$ is small
4. Determining where to set the influential $x$'s so that the effects of the uncontrollable variables $z_1, z_2, \ldots, z_q$ are minimized

Experiments often involve several factors. Usually, an objective of the experimenter is to determine the influence that these factors have on the output response of the system.

Illustration:
How many of you have baked a cookie? What factors are involved in making a delicious and mouthwatering cookie? Some of the factors might include preheating the oven, baking time, ingredients, amount of moisture, and baking temperature. You probably follow a recipe, so there are many additional factors that control the ingredients, such as the mixture. In other words, someone did the experiment (baking a cookie) before you! What parts of the recipe did they vary to make the recipe a success? Probably many factors: temperature and moisture, various ratios of ingredients, and the presence or absence of many additives. Now, if you keep all the factors involved in the experiment at a constant level and vary one factor, what do you think would likely happen?

In this example, the possible inputs include ingredients such as eggs, sugar, milk, and flour. Several factors affect the outcome; some of them are controllable and others are not.
The output may include flavor, texture, taste, or size.

Definition of Terms
• An experimental unit (eu) is the object on which the response and factors are observed or measured. It is the unit to which a single treatment is applied. This could be raw materials, human subjects, or just a point in time.
• A sampling unit (su) is a portion of the experimental unit on which the response variable is observed or measured.
• A treatment or factor is a set of experimental procedures or conditions whose effects are to be measured and compared.
• A response variable is a characteristic used to measure the effect of a treatment.
• An experimental error is the difference between the observed response for a particular experiment and the long-run average of all experiments conducted at the same settings of the independent variables or factors.
  ➢ The word "error" should not lead one to assume that it is a mistake. Experimental errors are not all zero because background or lurking variables cause them to change from experiment to experiment.
  ➢ Experimental error can be classified into bias error and random error. Bias error tends to remain constant or change in a consistent pattern over the experiments in an experimental design, while random error changes from one experiment to another in an unpredictable manner and averages to zero.
  ➢ This error is the primary basis for deciding whether an observed difference is real or due to chance.

Example 1: A series of runs was performed to determine how the wash water temperature and the detergent concentration affect the bacterial count on the palms of subjects in a hand washing experiment.
a. Identify the experimental unit.
b. Identify the factor/s.
c. Identify the response.

Answer:
a. eu: a person
b. Factor/s: water temperature, detergent concentration
c. Response: bacterial count

Example 2: Suppose Paul wants to evaluate polluted stream water for its effect on fish lesions. He sets up 2 aquaria, each with 60 fish.
He randomly assigns a water treatment (polluted vs. control) to each aquarium. After 1 month, he catches 10 fish from each aquarium and counts the number of lesions.
a. Identify the experimental unit.
b. Identify the sampling unit.
c. Identify the factor/s.
d. Identify the response.

Answer:
a. eu: an aquarium with 60 fish
   Paul has applied a water treatment to each aquarium. The fish are not the experimental units. For an individual fish to be considered an experimental unit, Paul would have to take one fish at a time and apply the treatment independently to each fish. This would be impractical from a logistics standpoint and was not done. Instead, the water treatment levels were applied to the entire aquarium, and so the experimental unit is an aquarium with 60 fish.
b. su: the 10 fish caught from each aquarium
c. Factor/s: water treatment (polluted vs. control)
d. Response: number of lesions

Basic Principles
Statistical design of experiments is the process of planning the experiment so that appropriate data will be collected and analyzed by statistical methods, resulting in valid and objective conclusions. The three basic principles of experimental design are randomization, replication, and local control.

1. Randomization - an essential component of any experiment that is to have validity. Randomization means that both the allocation of the experimental material and the order in which the individual trials of the experiment are performed are randomly determined. If you are doing a comparative experiment with two treatments, a treatment and a control for instance, you need to include in your experimental process the assignment of those treatments by some random process. An experiment includes experimental units. You need a deliberate process to eliminate potential biases from the conclusions, and random assignment is a critical step.

2. Replication – a repetition of the basic experiment.
Replication has two important properties: (1) it allows the experimenter to obtain an estimate of the experimental error; (2) if the sample mean is used to estimate the effect of a factor in the experiment, replication permits the experimenter to obtain a more precise estimate of this effect. For example, if $\sigma^2$ is the variance of an individual observation and there are $n$ replicates, the variance of the sample mean is

$$\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}.$$

The factors affecting the number of replications are (i) degree of precision required, (ii) uniformity of experimental units, (iii) number of treatments, (iv) experimental design, (v) time allotted for the experiment, and (vi) cost and availability of resources.

3. Local Control (Error Control) - control of all factors except the ones under investigation.
• A good experiment incorporates all possible means of minimizing the experimental error.
• Local control can reduce or control the variation due to extraneous factors and increase the precision of the experiment.

Blocking - a technique for including in the experiment other factors that contribute to undesirable variation. Much of the focus in this class will be on creatively using various blocking techniques to control sources of variation and so reduce the error variance. For example, in human studies, the gender of the subjects is often important. Age is another issue. Age and gender are nuisance factors which contribute to variability and make it difficult to assess systematic effects. By using these as blocking factors, you can both avoid biases that might arise from differences in how subjects are allocated to the treatments and account for some of the noise in the experiment. We want the unknown error variance at the end of the experiment to be as small as possible. Our goal is usually to find out something about a treatment factor (or a factor of primary interest), but in addition to this we want to include any blocking factors that will explain variation.
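The principles above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not a prescribed procedure: the function names `variance_of_sample_mean` and `randomize_within_blocks`, the subject labels, and the age groups are all hypothetical. The first function evaluates $\sigma^2 / n$ for independent replicates; the second assigns treatments at random separately inside each block, so every block receives each treatment equally often.

```python
import random
from collections import defaultdict


def variance_of_sample_mean(sigma2, n):
    """Variance of the mean of n independent replicates, each with variance sigma2."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sigma2 / n


def randomize_within_blocks(units, block_of, treatments, seed=None):
    """Randomly assign treatments to units, separately inside each block.

    units      -- list of experimental unit labels (hypothetical names)
    block_of   -- dict mapping each unit to its block (e.g. an age group)
    treatments -- list of treatment labels
    seed       -- optional seed so the plan can be reproduced
    """
    rng = random.Random(seed)

    # Local control: group the units by the nuisance (blocking) factor.
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of[unit]].append(unit)

    assignment = {}
    for block, members in blocks.items():
        if len(members) % len(treatments) != 0:
            raise ValueError(f"block {block!r} cannot be split evenly among treatments")
        # Replication: each treatment appears the same number of times per block.
        reps = len(members) // len(treatments)
        labels = list(treatments) * reps
        # Randomization: the order of labels within the block is shuffled.
        rng.shuffle(labels)
        for unit, label in zip(members, labels):
            assignment[unit] = label
    return assignment


# Hypothetical example: 8 subjects blocked by age group, two treatments.
subjects = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
age_group = {"s1": "young", "s2": "young", "s3": "young", "s4": "young",
             "s5": "old", "s6": "old", "s7": "old", "s8": "old"}
plan = randomize_within_blocks(subjects, age_group, ["treatment", "control"], seed=1)
```

Because the shuffle happens inside each block, age cannot be confounded with the treatment: each age group contributes two subjects to the treatment and two to the control, whichever random plan is drawn.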
Guidelines for Designing Experiments
1. Recognition of and statement of the problem. A clear statement of the problem contributes substantially to a better understanding of the phenomena and to the final solution of the problem.
2. Choice of factors, levels, and range. Choose the factors to be varied in the experiment, the ranges over which these factors will be varied, and the specific levels at which runs will be made.
3. Selection of the response variables. Select a response variable that readily provides useful information about the process under study.
4. Choice of experimental design. Choice of the design involves the following:
   a. Consideration of sample size (number of replicates).
   b. Selection of a suitable run order for the experimental trials.
   c. Determination of whether blocking or other randomization restrictions are involved.
5. Performing the experiment. Monitor the process carefully to ensure that everything is being done according to plan.
6. Statistical data analysis. Statistical methods should be used to analyze the data so that results and conclusions are objective rather than judgmental in nature.
7. Conclusions and recommendations. After analyzing the data, an experimenter must draw practical conclusions about the results and recommend a course of action.

Remark: Steps 1 – 3 are pre-experimental planning.

Purpose of Experimental Design
• Statistical experimental designs provide a plan for collecting data in such a way that they can be analyzed statistically to validate the conjecture in question.
• The effects of confounding factors can be avoided when an experimental design is used.
• One of the main purposes of experimental designs is to minimize the effect of experimental error.

Factors
We usually talk about "treatment" factors, which are the factors of primary interest to you. In addition to treatment factors, there are nuisance factors which are not your primary focus, but you must deal with them.
Sometimes these are called blocking factors, mainly because we will try to block on these factors to prevent them from influencing the results.

There are other ways that we can categorize factors:

Experimental vs. Classification Factors
Experimental Factors - factors whose levels you can specify and then assign at random, as the treatment, to the experimental units. Examples would be temperature, level of an additive, fertilizer amount per acre, etc.
Classification Factors - factors that can't be changed or assigned; they come as labels on the experimental units. The age and sex of the participants are classification factors which can't be changed or randomly assigned. But you can select individuals from these groups randomly.

Quantitative vs. Qualitative Factors
Quantitative Factors - you can assign any specified level of a quantitative factor. Examples: percent or pH level of a chemical.
Qualitative Factors - factors whose categories are different types. Examples might be species of a plant or animal, a brand in the marketing field, or gender. These are not ordered or continuous but are arranged perhaps in sets.

Quick History of Experimental Design [2]

Four Eras in the History of DOE
• The agricultural origins, 1918 – 1940s
  o R. A. Fisher & his co-workers
  o Profound impact on agricultural science
  o Factorial designs, ANOVA
• The first industrial era, 1951 – late 1970s
  o Box & Wilson, response surfaces
  o Applications in the chemical & process industries
• The second industrial era, late 1970s – 1990
  o Quality improvement initiatives in many companies
  o Taguchi and robust parameter design, process robustness
• The modern era, beginning circa 1990, when economic competitiveness and globalization are driving all sectors of the economy to be more competitive.

Notes:
Sir Ronald Fisher laid the foundation for statistics and for design of experiments in the first half of the 20th century.
He and his colleague Frank Yates developed many of the concepts and procedures that are used today. Basic concepts such as orthogonal designs and Latin squares were developed from the 1920s through the 1940s. World War II also had an impact on statistics, inspiring sequential analysis, which arose as a method to improve the accuracy of long-range artillery guns.

Immediately after World War II, the first industrial era marked another renaissance in the use of design of experiments. It was at this time that Box and Wilson (1951) wrote the key paper on response surface designs, thinking of the output as a response function and trying to find the optimum conditions for this function. George Box died early in 2013. And, an interesting fact - he married Fisher's daughter! He worked in the chemical industry in England in his early career and then went to America and worked at the University of Wisconsin for most of his career.

The Second Industrial Era - or the Quality Revolution
Statistical quality control was brought to Japan in the 1950s by W. Edwards Deming. This started what Montgomery calls a second industrial era. After World War II, Japanese products were cheaply made and of poor quality. In the 1960s their quality started improving. The Japanese car industry adopted statistical quality control procedures and conducted experiments, which started this new era. Total Quality Management (TQM) and Continuous Quality Improvement (CQI) are management techniques that have come out of this statistical quality revolution, that is, out of statistical quality control and design of experiments.

Taguchi, a Japanese engineer, discovered and published many of the techniques that were later brought to the West, using an independent development of what he referred to as orthogonal arrays. In the West these were referred to as fractional factorial designs. He came up with the concepts of robust parameter design and process robustness.
The Modern Era
Around 1990, Six Sigma, a new way of representing CQI, became popular. It is now a widely used methodology that has been adopted by many of the large manufacturing companies. It uses statistics to make decisions based on quality and feedback loops, and it incorporates many of the earlier techniques.

References:
Montgomery, D. C. (2001). Design and analysis of experiments. 5th Ed. John Wiley & Sons.
Gomez, K. A., & Gomez, A. A. (1984). Statistical procedures for agricultural research. John Wiley & Sons.
Solivas, E. S., Reaño, C. E., Collado, R. V., Gulles, A. A., Cosico, A. B. STAT 162 Experimental Designs I. A handbook of Slide Presentation.
http://www.fao.org/
https://onlinecourses.science.psu.edu/stat503