Chapter 13: Experiments and Observational Studies
AP Statistics

Observational Studies
In an observational study, researchers observe. They don’t assign choices or manipulate anything (unlike experiments). We simply use an existing situation (or existing data), choosing neither the subjects nor the treatments they receive.
Example: A recent study showed that men who have had a heart attack have a greater chance of having a second heart attack if a certain protein is present in their blood.
• They are usually not based on random samples, and they do not randomly impose treatments. The results therefore cannot be safely generalized, nor can they show cause-and-effect.
• They are not worthless, however.
• They can show us trends and possible relationships, even if we can’t show cause-and-effect.
• They can show us variables related to certain outcomes.

Types of Observational Studies
Retrospective: Subjects are selected, and then their previous conditions and behaviors are determined.
• Restricted to a small part of the population
• Prone to errors, because they rely on historical data
• Usually focus on estimating differences between groups or associations between variables
Prospective: Subjects are followed to observe future outcomes.
• Focus is on estimating differences among groups that might appear as the groups are followed
• Because no treatment is applied, it is NOT an experiment.

Randomized, Comparative Experiments
• The only method by which we can establish cause-and-effect.
• Example: We want to see if learning math on a computer is better than learning it in a traditional classroom. We randomly assign half of a group of students to a classroom where the content is taught only on a computer and the other half to a classroom where the content is never taught on a computer. Then we compare the results.
• Comparative just means we are comparing the results of the groups at the end of the experiment.
Randomized, Comparative Experiments
The explanatory variable that the experimenter manipulates is also called a “factor.” Each factor has levels: the values that the experimenter chooses for the factor.
Example: An experiment is designed to test the claim that people who sleep less than 8 hours a night have a decreased ability to remember information. The experimenter has obtained 50 subjects and has randomly placed them in two groups. All subjects will be given a memory test as a baseline. One group will be required to sleep at least 8 hours for one night, and the other group will be prevented from sleeping 8 hours that night. The next day, each group will be given a memory test, and the differences in the test scores will be recorded.
Important Concepts
• The experimenter actively and deliberately manipulates the factors to control the details of the possible treatments.
• The subjects are assigned to the treatments randomly.

Four Principles of Experimental Design
• Control
• Randomize
• Replicate
• Block

Control
• We want to control sources of variation other than the factors we are testing by making conditions as similar as possible for all treatment groups.
– We control a factor by assigning subjects to different factor levels because we want to see how the response will change at those different levels.
– We control other sources of variation to prevent them from changing and affecting the response variable.
• Controlling extraneous sources of variation reduces the variability of the responses, making it easier to detect differences among the treatment groups.
• Making generalizations from the experiment to other levels of the controlled factor can be risky.

Randomize
• Allows us to equalize the effects of unknown or uncontrollable sources of variation
– Doesn’t eliminate the effects of these sources, but it spreads them out across all treatment levels so that they “even out” and we can see past them.
– If treatments are not assigned at random, you will not be able to draw valid conclusions from the experiment.
– “Control what you can, randomize the rest.”

Replicate
• 1st type: We need to repeat the experiment, applying the treatment to a number of subjects. If we don’t assess the variation, the experiment is not complete. The outcome of an experiment on a single subject is an anecdote, not an experiment.
• 2nd type: Needed when our experimental units (subjects) are not representative of the population of interest. We will need to repeat the experiment with different experimental units. Replication of an entire experiment with the controlled sources of variation at different levels is an essential step in science. If your subjects are all from an Intro to Psychology class, you can’t generalize the results, so you will need to replicate the experiment.

Block
• Sometimes completely random assignment of our subjects to treatments is not the best approach.
• Sometimes we need to block. This is when we group experimental subjects that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
• The randomization comes within the blocks, where we randomly assign treatments in each block.

Logic of Experimental Design
• Randomization produces groups of subjects that should be similar in all respects before we apply treatments.
• Comparative design ensures that influences other than the experimental treatment operate equally on all groups.
• Therefore, differences in the response variable must be due to the effects of the treatments.

Experimental Diagram
Diagram of a randomized comparative experiment: an experiment designed to test the effectiveness of the drug hydroxyurea for treating sickle cell anemia. The subjects were 299 adult patients who had had at least three episodes of pain from sickle cell anemia in the past year.

Experimental Design
Randomized comparative experiment: an experiment to see which treatment may reduce the number of repeat offenders.
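The random assignment step of a completely randomized design can be sketched in a few lines of Python. This is only an illustration for the 50-subject sleep experiment described earlier: the subject labels, the function name `randomly_assign`, and the seed are made up for the example and do not come from any actual study.

```python
import random


def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subjects)      # copy, so the original list is untouched
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for position, subject in enumerate(shuffled):
        treatment = treatments[position % len(treatments)]
        groups[treatment].append(subject)
    return groups


# Hypothetical labels for the 50 subjects in the sleep experiment.
subjects = [f"S{number:02d}" for number in range(1, 51)]
treatments = ["at least 8 hours of sleep", "less than 8 hours of sleep"]

groups = randomly_assign(subjects, treatments, seed=13)
for treatment, members in groups.items():
    print(treatment, "->", len(members), "subjects")
```

Because chance alone decides who lands in each group, unknown sources of variation should be spread roughly evenly across the two groups.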
Randomized Block Design
• Blocking is used instead of completely randomizing subjects to treatments.
• Blocking is used if it is believed that particular groups of subjects will respond differently to the explanatory variable(s).
• Blocking may be used if we suspect that some attribute we cannot control may introduce variability in the response. (Maybe gender will produce variability in the response; therefore, we will block by gender.)
• Randomization is introduced when we randomly assign treatments within each block.
• By blocking, we isolate the variability attributable to the differences between the blocks, so that we can see the differences caused by the treatments more clearly.
• We block to reduce variability so we can see the effects of the factors.
• When we block, we are not usually interested in studying the effects of the blocks themselves (no need to compare the results between or among the blocks).
Example: Suppose that the drug we are testing works effectively. That should show up as a difference in response between the experimental group and the control group. However, if both groups are mixed gender, and men and women respond differently to the drug, then the variability between the genders can drown out the true effect of the drug in each gender. We won’t see that the drug is effective.
Example (cont.): We cannot cope with this variability problem through randomization alone (we cannot randomly assign gender). Instead, we block by gender to reduce this variability.
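Randomizing within blocks, as in the gender example above, can be sketched as follows. This is a minimal illustration under stated assumptions: the subject names, the block sizes (12 women and 8 men), and the helper `assign_within_blocks` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def assign_within_blocks(block_of, treatments, seed=None):
    """Randomly assign treatments separately inside each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject, block in block_of.items():
        blocks[block].append(subject)      # group subjects by block first
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)               # randomize only within this block
        for position, subject in enumerate(members):
            assignment[subject] = treatments[position % len(treatments)]
    return assignment


# Hypothetical subjects: 12 women and 8 men, blocked by gender.
block_of = {f"W{n}": "women" for n in range(1, 13)}
block_of.update({f"M{n}": "men" for n in range(1, 9)})

assignment = assign_within_blocks(block_of, ["drug", "placebo"], seed=13)
counts = Counter((block_of[s], treatment) for s, treatment in assignment.items())
for (block, treatment), count in sorted(counts.items()):
    print(block, treatment, count)
```

Each block ends up split evenly between the two treatments, so a difference between men and women cannot masquerade as a difference between drug and placebo.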
Matched Pairs Design
• Subjects are sometimes paired because they are similar in ways not under study.
• When we match subjects in this way, we can reduce variability in much the same way as blocking does.
• If we have a study that is trying to determine whether playing sports increases mathematical achievement, we might want to pair a subject who has a high IQ and plays a sport with a subject who has a high IQ and does not play a sport (this would be an observational study).
• The matching would reduce the variation due to IQ differences.
• When we have a matched pairs design that is an experiment, we need to introduce randomization.
• Suppose we use a matched pairs design in an experiment that looks at whether or not children can tell the difference between the facial expressions of fear and anger. In this situation, we could match subjects and then randomly assign the order in which the pictures of facial expressions are shown. One member of each pair will see anger and then fear; the other member will see fear and then anger.

Matched Pairs Design Diagram

Confounding and Lurking Variables
• Lurking variables are most common in observational studies.
• Confounding variables are most common in experiments.

Statistically Significant
How do we know if the results of an experiment really show that there is a difference? Suppose we tested a new medication to see if it reduced blood pressure better than an older medication. In our results, we calculated that, on average, the new medication reduced blood pressure by about 10% more than the older medication. Is this evidence that the new medication worked better than the old one? What if it only reduced it by 3% more? By 20% more?
To decide whether a difference is large enough, we ask whether it is statistically significant.
This means that the differences we observed are too big to be explained by chance alone. If they are too big to be explained by chance, then we can attribute the differences to the treatments.
Suppose we flip a coin 100 times and it comes up heads 47 times. Is the coin fair, or is it weighted? We expect a fair coin to come up heads about 50 times (theoretically), but the difference between what we got and what we expected is not large and can be explained by chance (it is not unlikely to get those results if the coin were fair). However, if we got 30 heads, we may decide that a difference that big is too large to be due to chance (it is very unlikely to get those results if the coin were fair). This last result shows that the difference is statistically significant and that the coin is probably not fair.
We will learn how to determine statistical significance in later chapters. Always be skeptical of studies and experiments that discuss how much better (or worse) one thing is than another without stating that the results are statistically significant. That is the only way to determine whether there really is a difference between two (or more) things.

Experiments and Samples (differences, similarities and other info)
Similarities
• Both use randomization to get unbiased data.
Differences
• Samples try to estimate the population parameters, so randomizing is an attempt to make the sample representative of the population.
• Experiments try to assess the effectiveness of treatments, so randomization is used to assign treatments so as to reduce (ideally eliminate) unwanted bias. Experiments rarely draw their subjects from random samples of the population.
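Returning to the coin-flip example from the Statistically Significant slides: the phrase “explained by chance” can be made concrete by computing how often a fair coin lands at least as far from 50 heads as the result we saw. The sketch below is one way to do that arithmetic, not necessarily the method used in later chapters; the function name `chance_at_least_this_far` is invented for this illustration.

```python
from math import comb


def chance_at_least_this_far(heads, flips=100):
    """Chance that a fair coin lands at least as far from flips/2 as `heads`."""
    distance = abs(heads - flips / 2)
    # Count the equally likely head/tail sequences whose number of heads
    # is at least `distance` away from the expected count.
    favorable = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= distance
    )
    return favorable / 2**flips


for observed in (47, 30):
    chance = chance_at_least_this_far(observed)
    print(f"{observed} heads: chance for a fair coin = {chance:.6f}")
```

For 47 heads the chance comes out well over one half, so the result is easily explained by chance. For 30 heads it is far below one in a thousand, which is why that result is treated as statistically significant.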
Experiments and Samples (differences, similarities and other info)
Differences
• If our objective is to learn something about a population: sample survey
• If our objective is to see if there is a difference in the effects of two treatments: experiment
• If our objective is just to use an existing situation to look for trends and/or contributing factors: observational study
Other
• If the subjects in an experiment are not a random sample from the population, be cautious about generalizing the results until the experiment has been replicated using different subjects, environments, etc.
• Experiments typically draw stronger conclusions than surveys (even if experimental subjects are not random samples).
– Because, by looking only at the differences across treatment groups, experiments cancel out many of the sources of bias.

Control Treatments
If we are attempting to see if a medication effectively reduces anxiety, we don’t just want to give all our subjects the medication and record whether their anxiety decreased. Instead, we want to compare how much their anxiety decreased with the change in people who did not take the medication. This comparison group is the control group, and the baseline treatment that group receives (here, no medication) is called the control treatment.

Blinding
Blinding is when we create an environment where subjects and/or researchers (physicians, technicians, psychologists, etc.) do not know who gets the treatment and who gets the placebo.
Who can affect the outcome of an experiment?
• Those who could influence the results (the subjects, treatment administrators, technicians, etc.)
• Those who evaluate the results (judges, treating physicians, etc.)
When all individuals in one of those two groups are blinded, we say that the experiment is single-blind. When individuals in both of those groups are blinded, we say that the experiment is double-blind.
Blinding
It is important to blind because it is easy for a person’s knowledge of which treatment is given to which people to influence their actions and beliefs. Blinding helps eliminate this form of bias.

Placebo
• A placebo is a “fake” treatment that looks just like the treatments being tested.
• It is often used as the control group’s treatment.
• It is the best way to blind subjects.
Sometimes groups who are treated with the placebo (the control group) will show an improvement. This is not uncommon. This effect (when the control group shows an improvement when treated with a placebo) is called the placebo effect. It is not uncommon for 20% or more of subjects who are given a placebo to report things such as reduced pain, decreased depression, and improved health.

Characteristics of Good Experiments
• Randomized
• Comparative
• Double-Blind
• Placebo-controlled