Chapters 1 & 2 Page 1 of 8 Statistics 103

Part I. Design of Experiments – Chapters 1 & 2

Chapters 1 & 2. Controlled Experiments & Observational Studies

The important points of these chapters:
1. The method of comparison is the only way of determining if a treatment is effective.
2. To make that comparison, we need two groups that are “statistically” indistinguishable.
3. A probability method must be used to assign people to the two groups, since that allows us to use probability theory to analyze the results and reach a conclusion. If we randomly assign people to the groups, we can determine how likely it is that they are not alike.
4. The best type of experimental design is a controlled, randomized, double-blind experiment.

Key Terms

Treatment group – The group of subjects given some treatment in an experiment.
Control group – A group of subjects who do not receive the treatment.
Placebo – Pills (or a substance) identical in appearance to the active substance but containing no active drug at all.
Controlled experiment – A study where the investigators decide who will be in the treatment group and who will be in the control group.
Randomized controlled experiment – A controlled experiment in which an impartial chance procedure is used to assign the subjects to the treatment or control group.
Double-blind – A procedure used in an experiment whereby the subject does not know whether he or she is receiving a treatment or placebo, and the person administering the treatment also does not know what each subject is given.
Observational study – A study in which the subjects assign themselves to the different groups and the investigators watch what happens.
Confounding – A difference between the treatment and control groups, other than the treatment, which affects the responses being studied.
Association – When one thing is linked to another.
Self-selected survey – A survey in which the respondents themselves decide whether to be included.
CONTROLLED EXPERIMENTS

If a new drug is introduced, its effectiveness needs to be tested. How does one do this? In the first section of our text the author describes the Salk vaccine field trial, in which a vaccine for polio was tested. It is a good example of the importance of design in a statistical experiment.

To learn the effects of the vaccine, statisticians compare the responses of a treatment group with those of a control group. If the treatment group is comparable to the control group, then a difference in the responses of the two groups is likely to be due to the effect of the treatment. If the treatment group differs from the control group in other ways, then the effects of those other factors are likely to confound the results of the study. Confounded results are not reliable, so one wants to minimize confounding variables when choosing a statistical design. One way to minimize confounding is to randomly choose which subjects go into the treatment and control groups: this is then a randomized controlled experiment.

A statistician also wants to minimize bias in the design of an experiment. For this reason, the control group should be given a placebo to ensure that a response is to the treatment rather than to the IDEA of being treated. In addition, to minimize bias, an experiment’s design should be double-blind. As defined above, the subjects in a double-blind experiment do not know whether they are in treatment or in control; neither do those who evaluate the responses.

In summary, a statistician wants a well-controlled experiment: a randomized controlled experiment that is double-blind and in which, if possible, the control group is given a placebo.

OBSERVATIONAL STUDIES

In an observational study, the investigators do not assign the subjects to treatment or control. Some of the subjects have the condition whose effects are being studied; this is the treatment group. The other subjects are the controls.
For example, in a study on smoking, the smokers form the treatment group and the non-smokers are the controls.

Observational studies can establish association: one thing is linked to another. Association may point to causation, but association does not prove causation. There may be an association between exposure and disease, but this does not directly imply that exposure causes the disease. For example, it may be that there is a genetic factor that is linked to causing lung cancer AS WELL AS giving one the propensity to smoke. This is a confounding factor. A confounder is a third variable that is associated with exposure and with disease.

Observational studies must always be viewed with suspicion because the assignment of people to treatment and control is not randomized. The assignment is always self-selected.

OVERVIEW

With both observational studies and nonrandomized controlled experiments, try to find out how the subjects came to be in treatment or in control. When looking at a study, ask the following questions:

Are the groups comparable or are they different?
Was there any control group at all?
What factors are confounded with treatment?
What adjustments were made to take care of confounding? Were they sensible?
Were historical controls used, or contemporaneous controls?
How were subjects assigned to treatment: through a process under the control of the investigator (a controlled experiment), or a process outside the control of the investigator (an observational study)?
If a controlled experiment, was assignment made using a chance mechanism (randomized controlled), or did assignment depend on the judgment of the investigator?
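To make the last few questions concrete, here is a minimal simulation sketch of why a chance mechanism matters. Everything in it is an assumption for illustration and not from the text: the function name `simulate`, the hypothetical hidden factor (uniform on 0 to 1), the noise level, and the rule that subjects with a larger hidden factor are more likely to put themselves in the treatment group. The treatment is given no real effect at all, so any difference between the groups comes purely from how subjects were assigned.

```python
import random
from statistics import mean


def simulate(n, randomized, seed):
    """Return (mean outcome in treatment) - (mean outcome in control).

    The treatment has NO real effect in this sketch: each outcome depends
    only on a hidden factor plus random noise.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        hidden = rng.random()                 # hypothetical confounder in [0, 1)
        outcome = hidden + rng.gauss(0, 0.1)  # outcome ignores the treatment
        if randomized:
            # Chance mechanism: a fair coin toss decides the group.
            in_treatment = rng.random() < 0.5
        else:
            # Self-selection: a larger hidden factor makes it more likely
            # that the subject ends up in the treatment group.
            in_treatment = rng.random() < hidden
        (treated if in_treatment else control).append(outcome)
    return mean(treated) - mean(control)


observational_diff = simulate(10_000, randomized=False, seed=103)
randomized_diff = simulate(10_000, randomized=True, seed=103)

print(f"self-selected groups: difference = {observational_diff:.3f}")
print(f"randomized groups:    difference = {randomized_diff:.3f}")
```

Under these assumptions the self-selected groups should show a clear difference (roughly one third) even though the treatment does nothing, while the randomized groups should differ only by a small chance amount. The hidden factor is confounded with treatment in the first case and balanced by the coin toss in the second.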
[Figure: classification of studies. Studies either have controls or have no controls. Controls are either contemporaneous or historical. Studies with contemporaneous controls are either controlled experiments or observational studies. Controlled experiments are either randomized or not randomized.]

Design of Experiments (Summary)

Much of the material on sampling will be covered in detail in a later chapter.

In an observational study, we observe and measure specific characteristics but we do not attempt to modify the subjects being studied. In an experiment, we apply some treatment and then proceed to observe its effects on the subjects.

There are a few basic steps that should be followed in designing an experiment that is capable of yielding valid results.
1. Identify your objective. Identify the exact question to be answered and clearly identify the relevant population.
2. Collect sample data. The way in which sample data are collected is absolutely critical to the success of the experiment. The sample data must be representative of the population in question. The sample must be large enough that the effects of the treatment can be detected. The question that you are trying to answer in your objective should be addressed without interference from extraneous factors.
3. Use a random procedure that avoids bias.
4. Analyze the data and form conclusions.

Controlling Effects of Variables

A placebo effect occurs when an untreated subject incorrectly believes that he or she is receiving a treatment and reports an improvement in symptoms. The placebo effect can be countered by using blinding, a technique in which the subject does not know whether he or she is receiving a treatment or a placebo.

When designing an experiment to test the effectiveness of one or more treatments, it is important to put the subjects (often called experimental units) in different groups (or blocks) in such a way that those groups are very similar. A block is a group of subjects (or experimental units) that are similar.
(The subjects only need to be similar in the ways that might affect the outcome of the experiment.) When testing one or more different treatments, form blocks so that each one consists of subjects that are similar.

When deciding how to assign the subjects to different blocks, you can use random selection or you can try to carefully control the assignment so that the subjects within each block are similar. One approach is to use a completely randomized experimental design, in which subjects are put into different blocks through a process of random selection. Another approach is to use a rigorously controlled design, in which experimental units (the subjects) are carefully chosen so that the subjects in each block are similar in the ways that are important.

When conducting experiments, the results are sometimes ruined because of confounding. Confounding occurs in an experiment when the effects from two or more variables cannot be distinguished from each other.

Sample size

Another important consideration in conducting experiments is the size of the sample. It must be large enough that the erratic behavior of very small samples will not produce misleading results. Use a sample size large enough that we can see the true nature of any effects, and obtain the sample using an appropriate method, such as one based on randomness.

Randomization

One of the worst mistakes is to collect data in a way that is inappropriate. We cannot overstress this very important point.

Here are some other terms relating to sampling. However, only random sampling and simple random sampling will be used throughout this course.

In a random sample, members of the population are selected in such a way that each member has an equal chance of being selected.

A simple random sample of size n subjects is selected in such a way that every possible sample of size n has the same chance of being chosen.
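The definition of a simple random sample can be checked on a very small case. The sketch below is only an illustration: the five-member population, the sample size of 2, the seed, and the number of repetitions are all made-up values, and Python's standard `random.sample` (which draws without replacement) is used as the selection method. It lists every possible sample of size 2 and counts how often each one turns up over many repeated draws.

```python
import random
from collections import Counter
from itertools import combinations

population = ["A", "B", "C", "D", "E"]  # hypothetical population of 5 subjects
n = 2                                   # sample size

# Every possible sample of size n (order does not matter): 10 in total.
all_samples = list(combinations(population, n))

rng = random.Random(103)                # fixed seed so the run is repeatable
draws = 20_000
counts = Counter()
for _ in range(draws):
    # random.sample draws n distinct members, i.e. without replacement.
    sample = tuple(sorted(rng.sample(population, n)))
    counts[sample] += 1

# Each possible sample should appear in about 1/10 of the draws.
for s in all_samples:
    print(s, round(counts[s] / draws, 3))
```

With 10 possible samples, each relative frequency should land close to 0.1, which is what "every possible sample of size n has the same chance of being chosen" means in practice.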
A simple random sample amounts to drawing n subjects at random from a population without replacement. That is, as each person is selected, they are not placed back into the population and cannot be chosen again. In a large population the chance of being chosen will not change significantly as each person is selected and removed. But in a small population, removing a subject changes the chances of being selected on subsequent drawings. Small populations require modified formulas.

In systematic sampling, we select some starting point and then select every kth (such as every 50th) element in the population.

With convenience sampling, we simply use results that are readily available. For instance, if we survey students who happen to be walking by some location we choose, this is a sample of convenience, not a random sample.

With stratified sampling, we subdivide the population into at least two different subgroups (or strata) whose members share the same characteristics (such as gender or age bracket), then we draw a sample from each stratum. (We do not concern ourselves with this any further in this course.)

In cluster sampling, we first divide the population area into sections (or clusters), then randomly select some of those clusters, and then choose all the members from those selected clusters. (We do not concern ourselves with this any further in this course.)

A sampling error is the difference between a sample result and the true population result; such an error results from chance fluctuations. A nonsampling error occurs when the sample data are incorrectly collected, recorded, or analyzed (such as by selecting a biased sample, using a defective measurement instrument, or copying the data incorrectly).

Example 2, Pg. 26, #9

Solution

(a) False, because the observational studies found that people who get lots of vitamins by EATING VEGETABLES have lower death rates from colon cancer and lung cancer.
In contrast, the colon cancer experiment found no difference in death rate between the control group and the treatment group. Also, the lung cancer experiment found that the death rate increased for subjects who took beta-carotene.

(b) True, because there may be some confounding factor other than eating fruits and vegetables that could account for the lower death rate.

(c) False, because the treatment group differed from the control group only by taking the vitamin supplements. It is not known whether they ate lots of fruit and vegetables as part of their diet. Therefore, we cannot conclude that their lifestyles were also different.

Example 3, Pg. 26, #10

Solution

(a) This was an observational study because the subjects (children) assigned themselves to the groups being studied simply by their body fat. The investigators were then able to study the relationships the children in each group had with their mothers.

(b) Yes, the association is that young children with more body fat tend to have more controlling mothers.

(c) Yes, if the mother’s controlling behavior causes the child to eat more, then there is an association between her controlling behavior and the child’s body fat.

(d) No. The gene is not related to the mother’s controlling behavior and therefore is not a confounding factor.

(e) The association is that controlling mothers have children with more body fat. An alternative way to explain the association is that the mother sees the child overeating and tells the child not to eat.

(f) No. The Chronicle seems to have overreacted.