Slide 1 Confounding and the Language of Experimentation, Part II – The Importance of Proper Comparisons and Randomization
Slide 2 This video is designed to accompany pages 13-18 of the workbook “Making Sense of Uncertainty: Activities for Teaching Statistical Reasoning,” a publication of the Van-Griner Publishing Company.
Slide 3 Arthroscopic surgery, which involves placing small instruments and a video camera into the joint through small incisions, has become the most commonly performed orthopedic surgery in the United States. It isn’t just athletes who have it, but a diverse group of people, including those suffering from arthritis. How had doctors traditionally assessed the efficacy of this treatment? The approach seemed to make a lot of sense. Patients presenting with the appropriate symptoms would have their level of function measured and be asked to rate their level of pain, both before and after the surgery. By and large, patients both felt they had better function and exhibited an increase in their range of motion.
Slide 4 What is missing from this kind of comparison? What we don’t know is how much the results might have been confounded by the placebo effect. Is it possible that the patients felt better and had a better range of motion in part just because they felt their problem was being addressed, even if the actual surgery was not effective?
Slide 5 This is the question that was asked by a team of physicians and scientists from the Houston Veterans Affairs Medical Center and the Baylor College of Medicine.
Slide 6 In this study, 180 patients with osteoarthritis of the knee were randomly assigned to either the traditional arthroscopic surgery or a placebo surgery. This type of placebo surgery is called a “sham surgery.” Patients in the sham group received skin incisions and had the sights and sounds of the real surgery simulated. After surgery, the patients in both groups were followed for 24 months, and measurements of both pain and function were taken.
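To make the idea of random assignment concrete, here is a minimal Python sketch of how 180 subjects could be randomized to two treatments. It is only an illustration, not the procedure the researchers actually used: the function name `randomize`, the patient ID numbers, the seed, and the equal 90/90 split are all assumptions made for the example.

```python
import random

def randomize(subject_ids, arms, seed=None):
    """Shuffle the subjects, then deal them out to the arms in turn.

    Hypothetical helper for illustration only; the equal-sized groups
    it produces are an assumption, not a detail from the study.
    """
    rng = random.Random(seed)      # seeded so the example is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)          # chance alone decides the order
    assignment = {arm: [] for arm in arms}
    for position, subject in enumerate(shuffled):
        arm = arms[position % len(arms)]
        assignment[arm].append(subject)
    return assignment

# Hypothetical patient IDs 1..180 and an arbitrary seed.
patients = range(1, 181)
groups = randomize(patients, ["real surgery", "sham surgery"], seed=2002)

for arm, members in groups.items():
    print(arm, len(members))
```

Because the shuffle, and not the physician or the patient, decides who gets which procedure, characteristics such as age or severity of arthritis should tend to balance out across the two groups.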
Slide 7 What this team of researchers found was surprising. The outcomes, as measured by these pain and function scales, were no better for the real surgery group than for the placebo group. In 2002, recommendations to consider avoiding arthroscopic surgery for knees were released, though the impact of those recommendations is unknown.
Slide 8 Let’s practice our language here. In the arthroscopic knee surgery example, the response variables are “level of pain” and “level of function.” The explanatory variable is “type of procedure” (real or placebo), and the subjects are the participating patients. Confounding was present in the original design because, although the physicians were doing a before/after comparison, there was no placebo comparison.
Slide 9 It is sometimes helpful to be able to diagram the design of an experiment. Before the placebo-controlled study of arthroscopic surgery, patients’ pain and function were compared before and after surgery. So there was a comparison taking place. But comparison alone is not necessarily enough. In this case, as we’ve just seen, the placebo effect was influencing the post-surgery measurements. So the right kind of comparison is important.
Slide 10 If we diagram the experiment that was conducted at Baylor, we see that the confounding created by the placebo effect was controlled for by a direct comparison between the real surgery and an elaborate placebo surgery, also known as a “sham” surgery. Since patients were randomized to these two treatments, it stands to reason that any differences in post-treatment pain and function that exist between the two groups can be attributed to the differences in the treatments.
Slide 11 Let’s turn now from comparison to randomization. In the mid to late 1980s, the disease AIDS was just making its way into the public consciousness.
It was no surprise that pharmaceutical companies were in a frantic rush to be the first to offer a drug that would treat patients with early symptoms of the disease. Milan Panic, then the flamboyant chairman of ICN Pharmaceuticals in California, was the first to claim success, in a January 1987 news conference. The FDA disagreed.
Slide 12 Here is a summary of the results that ICN reported from its medical experiments. About the same number of patients participated in each of three different treatments, one being a placebo. A close look at the table of results shows that the drug, especially in the 800 mg form, seems to be highly effective compared to a placebo. What was the FDA’s objection?
Slide 13 To quote a June 5th, 1990 article by Michael Lev in the New York Times, “The agency questioned the methods used in the test. Dr. Frank E. Young, then Commissioner of the F.D.A., publicly challenged the tests at an AIDS conference in Washington, D.C., because the group receiving a placebo in the study might have contained more patients considered seriously ill than were in the group that received ribavirin, skewing the results in ribavirin's favor.” In short, there was evidence that ICN had not randomized its subjects to the three treatments. Hence, the confounding owing to the health of the patients when entering the study compromised the inference of “effective treatment” that Mr. Panic had been hoping for.
Slide 14 It’s worth reviewing the language one more time. In the ribavirin study, the response variable was the “development of AIDS (yes or no),” the explanatory variable was the “type of intervention (level of ribavirin, placebo),” the subjects were the 163 patients participating in the study, and the confounding, as mentioned already, was caused by the “lack of randomization into the treatments.”
Slide 15 So what are the benefits of randomization? Randomization helps to keep the comparison groups as much alike as possible.
This helps ensure that any differences observed between the groups are due to treatment differences.
Slide 16 This concludes our video on the importance of proper comparisons and randomization. Remember, the placebo effect and lack of randomization can create very real obstacles to making credible inferences from experimental data.
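The way a lack of randomization can manufacture an apparent treatment effect can be illustrated with a small simulation. The Python sketch below is a hypothetical example only: it does not use or reproduce any data from the ribavirin study, and every number in it (the sample size, the share of seriously ill patients, and all of the probabilities) is invented for illustration. In the simulation the drug does nothing at all, so any difference between the groups comes entirely from how patients were assigned.

```python
import random

def progression_rates(n=10000, randomized=True, seed=0):
    """Simulate a two-arm study of a drug with NO real effect.

    All probabilities here are made-up illustration values.
    Returns the fraction of patients in each arm whose disease progresses.
    """
    rng = random.Random(seed)
    outcomes = {"drug": [], "placebo": []}
    for _ in range(n):
        seriously_ill = rng.random() < 0.5
        if randomized:
            # A coin flip decides the arm, regardless of health.
            arm = "drug" if rng.random() < 0.5 else "placebo"
        else:
            # Biased assignment: sicker patients mostly end up on placebo.
            p_placebo = 0.8 if seriously_ill else 0.2
            arm = "placebo" if rng.random() < p_placebo else "drug"
        # Progression depends only on baseline health, never on the arm.
        p_progress = 0.4 if seriously_ill else 0.1
        outcomes[arm].append(rng.random() < p_progress)
    return {arm: sum(results) / len(results) for arm, results in outcomes.items()}

randomized_rates = progression_rates(randomized=True)
biased_rates = progression_rates(randomized=False)

print("randomized:", randomized_rates)
print("not randomized:", biased_rates)
```

With random assignment the two progression rates should come out close to each other, as they should for a drug that does nothing. With the biased assignment the placebo group should look noticeably worse, even though the only thing that differs between the groups is how sick the patients were when they entered the study.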