Methods for Two Categorical Variables – Fisher’s Exact Test

Example: Suppose there are a total of 30 pine trees on a specific island; 15 are on the windward shore and 15 on the lee shore. Of these trees, four on the windward shore are damaged, while all the trees on the lee shore are healthy. The lee shoreline is shown in red (the wind moves toward the shore), and the windward shore is shown in light green.

Research Question – Is there evidence that there are more damaged trees on the windward shore (which may indicate the wind or some other factor on the windward shore has a negative impact), or is the observed difference simply due to chance?

The data are summarized in the table given below.

               Windward shore   Lee shore   Total
Damaged               4             0          4
No Damage            11            15         26
Total                15            15         30

When we analyze data on two variables, the first step is to determine which is the __________________ variable and which is the ___________________ (or predictor) variable.

Response variable: The __________________ variable on which comparisons are made.

Explanatory (or predictor) variable: This defines the ____________ to be compared.

Questions:
1. What are the two variables used in this study?
2. Which is the response variable? Which is the explanatory variable?

Descriptive Methods for Two Categorical Variables

A ______________________________ shows the joint frequencies of the two categorical variables. The rows of the table denote the categories for the first variable and the columns denote the categories of the second variable.

A _________________________ is a visual representation of the relationship between two categorical variables. A mosaic plot graphically represents the information given in the contingency table.

To obtain these descriptive summaries we need to enter the data into JMP as follows. Next, click Analyze > Fit Y by X.
Put the response variable (Damage) in the Y, Response box, the explanatory variable (Shoreline) in the X, Factor box, and Count in the Freq box as shown below. Click OK and JMP will return the following output.

Recall, the research question for this study was: Is there evidence that there are more damaged trees on the windward shore (which may indicate the wind or some other factor on the windward shore has a negative impact), or is the observed difference simply due to chance?

As always, to find the p-value we will find the probability of obtaining a sample at least as extreme as was observed, assuming there is no difference between the two groups. That is, assuming that the shore has no effect on whether trees are damaged.

To calculate the probability, we need only consider that we have 4 damaged and 26 healthy trees overall. Furthermore, we assume that a damaged tree is equally likely to be from either shoreline (that is, there is no difference in the risk of damage between the lee and windward shorelines). If we randomly divide these 30 trees into two groups of size 15, what is the probability that all four damaged trees will end up in the windward group?

Note that the number of damaged trees in the windward group can assume the following values: 0, 1, 2, 3, or 4. The probability of observing each of these outcomes is calculated under the assumption of no difference in risk using a distribution known as the _____________________________ distribution and is shown below.

Questions:
3. Assuming there is no difference in the risk of damage between the two shorelines, what is the probability that all four damaged trees would come from the windward shoreline?
4. Can you find this probability in the JMP output below?

Fisher’s Exact Test

Fisher’s exact test is based on the probability of observing a table at least as extreme as the original data. The hypothesis test is carried out as outlined below.
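Before working through the steps, the random-division argument for the pine tree data can be checked by direct counting. The Python sketch below is not part of the JMP workflow; the function name `prob_damaged_in_windward` and the variable names are illustrative assumptions. It counts the ways exactly k of the 4 damaged trees can land in the windward group when the 30 trees are split at random into two groups of 15, then adds up the probabilities of the outcomes at least as extreme as the observed one to give a one-sided p-value.

```python
from math import comb

# Pine tree data from the example: 30 trees, 4 damaged, 15 per shoreline.
TOTAL_TREES = 30
DAMAGED = 4
WINDWARD_SIZE = 15


def prob_damaged_in_windward(k):
    """Probability that exactly k damaged trees land in the windward group
    when the 30 trees are split at random into two groups of 15."""
    healthy = TOTAL_TREES - DAMAGED
    # Choose k of the damaged trees and fill the rest of the group with healthy ones.
    ways_k = comb(DAMAGED, k) * comb(healthy, WINDWARD_SIZE - k)
    return ways_k / comb(TOTAL_TREES, WINDWARD_SIZE)


# Probability of each possible outcome: 0, 1, 2, 3, or 4 damaged trees.
probs = {k: prob_damaged_in_windward(k) for k in range(DAMAGED + 1)}

# One-sided p-value: outcomes with at least as many damaged windward trees
# as were actually observed (4 in this example).
observed = 4
p_one_sided = sum(p for k, p in probs.items() if k >= observed)

for k, p in probs.items():
    print(f"{k} damaged on windward shore: {p:.4f}")
print(f"One-sided p-value (at least {observed} damaged): {p_one_sided:.4f}")
```

The printed probabilities can be compared with the values in the JMP output. Because 4 is the most extreme possible outcome here, the one-sided p-value is the same as the probability asked for in Question 3.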
Step 0: Define the research question
Is there evidence that there are more damaged trees on the windward shore (which may indicate the wind or some other factor on the windward shore has a negative impact), or is the observed difference simply due to chance?

Step 1: Determine the null and alternative hypotheses
H0: There is no difference in the proportion of damaged trees on the windward vs. lee shorelines.
Ha: The proportion of damaged trees on the windward shoreline is greater than the proportion of damaged trees on the lee shoreline.

We can also rewrite the hypotheses using symbol notation.
H0: p_windward ≤ p_lee (or p_windward = p_lee)
Ha: p_windward > p_lee

Step 2: Finding the test statistic and p-value
There is _____ test statistic for this hypothesis test.

Step 3: Report the conclusion in context of the research question

Example: The following table shows a sample of patients categorized with respect to two categorical variables, congenital heart defect (present or absent) and karyotype (trisomy 21, also called Down syndrome, or trisomy 13, also called Patau syndrome).

                     Congenital Heart Defect
Karyotype          Present   Absent   Total
Down syndrome         24        36       60
Patau syndrome        20         5       25
Total                 44        41       85

Research Question – Is there evidence that a congenital heart defect is found more commonly in patients with one of the two karyotypes examined?

Step 0: Define the research question
Is there evidence that the proportion of Down syndrome patients with a congenital heart defect is different from the proportion of Patau syndrome patients with a congenital heart defect?
Step 1: Determine the null and alternative hypotheses
H0:
Ha:

Step 2: Finding the test statistic and p-value

Step 3: Report the conclusion in context of the research question

Example: An advertisement by the Schering Corporation in 1999 for the allergy drug Claritin mentioned that in a pediatric randomized clinical trial, symptoms of nervousness were shown by 4 of 188 patients on Claritin and 2 of 262 patients taking a placebo.

Step 0: Define the research question
Is there evidence that the proportion who experience nervousness is greater for those taking Claritin than for those who take the placebo?

Questions:
5. What is the response variable?
6. What is the explanatory variable?
7. Fill in the contingency table below using the above scenario.

                   Nervousness?
Drug           Yes      No      Total
Claritin      _____    _____    _____
Placebo       _____    _____    _____
Total         _____    _____     450

Step 1: Determine the null and alternative hypotheses
H0:
Ha:

Step 2: Finding the test statistic and p-value

Step 3: Report the conclusion in context of the research question

Example: Prevention of deep vein thrombosis (DVT) is a critical issue in patients undergoing total hip replacement surgery. Orthopedic surgeons recognize the importance of prophylaxis in the management of their patients but do not agree on an optimal method. In this study, two different prophylaxis methods are to be compared for the prevention of proximal DVT after total hip replacement surgery. Patients undergoing total hip replacement were randomly assigned to one of the two prophylactics. After surgery, it was noted whether patients had complications from proximal DVT or not. The results are presented in the following contingency table.

                     Treatment 1   Treatment 2   Total
No Complications         72            68         140
DVT Complications         3            12          15
Total                    75            80         155

Step 0: Define the research question
Is there evidence that the risk of DVT complications differs between the two prophylactic treatments?
Step 1: Determine the null and alternative hypotheses
H0:
Ha:

Step 2: Finding the test statistic and p-value

Step 3: Report the conclusion in context of the research question

Observational Studies vs. Designed Experiments

Reconsider the above example. Fisher’s exact test provided evidence that the risk of complications does differ between the two treatments (p-value = 0.0281). Now, the question is this: can we conclude that it really is something about the two treatments that causes the risk to be higher for Treatment group 2? The answer to this question lies in whether the study itself was a designed experiment or an observational study.

Observational study: Involves collecting and analyzing data ___________________ randomly assigning treatments to experimental units.

Designed experiment: A treatment is ____________________ imposed on individual subjects in order to observe whether the treatment causes a change in the response.

Key statistical idea: The random assignment of treatments used by researchers in a designed experiment should balance out between the treatment groups any other factors that might be related to the response variable. Therefore, designed experiments can be used to establish a cause-and-effect relationship (as long as the p-value is small).

On the other hand, observational studies establish only that an association exists between the predictor and response variable. With observational studies, it is always possible that there are other lurking variables not controlled for in the study that may be impacting the response. Since we can’t be certain these other factors are balanced out between treatment groups, it is possible that these other factors could explain the difference between treatment groups.

Note that the “DVT complications” study is an example of a designed experiment since participants were randomly assigned to the two groups. We were trying to show that there was a difference in risk of complications between the two groups.
The small p-value makes it unlikely that the difference in risk between these two groups (4% vs. 15%) arose simply by chance, and the randomization of subjects to treatment groups should have balanced out any other factors that might explain the difference. So, the remaining explanation is that Treatment 1 is truly better than Treatment 2.

Example: Past research has suggested a high rate of alcoholism among patients with primary unipolar depression. A study of 210 families of females with primary unipolar depression found that 89 had alcoholism present. In a set of 299 control families, 94 had alcoholism present. The data can be entered into JMP as shown below.

Questions:
8. What is the response variable?
9. What is the explanatory variable?

Step 0: Define the research question
Is the alcoholism rate in females different among patients with primary unipolar depression versus the control group? That is, is the proportion of the Depression group with alcoholism different from the proportion of the Control group with alcoholism?

Step 1: Determine the null and alternative hypotheses
H0:
Ha:

Step 2: Finding the test statistic and p-value

Step 3: Report the conclusion in context of the research question

Question:
10. Can we say that having unipolar depression causes alcoholism? Explain.

An Alternative to Fisher’s Exact Test

The chi-square test can be used to carry out the analysis for testing a difference between two proportions. Let’s again look at the congenital heart defect example.

Step 0: Define the research question
Is there evidence that the proportion of Down syndrome patients with a congenital heart defect is different from the proportion of Patau syndrome patients with a congenital heart defect?

Step 1: Determine the null and alternative hypotheses
H0: p_Down = p_Patau
Ha: p_Down ≠ p_Patau

Step 2: Finding the test statistic and p-value
We will again use the chi-square test statistic, $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$. Therefore, we must again find expected counts.
However, this time we don’t have a set of hypothesized proportions. Thus, we’ll have to use some information from our data.

                     Congenital Heart Defect
Karyotype          Present   Absent   Total
Down syndrome         24        36       60
Patau syndrome        20         5       25
Total                 44        41       85

Recall, under the null hypothesis we’re assuming that there is no difference between the two groups. That is, it would be like taking the 44 people with a congenital heart defect and randomly assigning them to the Down syndrome or Patau syndrome groups. Therefore, it would be like expecting 52% (44/85) of the 60 Down syndrome patients to have a congenital heart defect and the remaining 48% (41/85) to not. Similarly, it would be like expecting 52% of the 25 Patau syndrome patients to have a congenital heart defect and the remaining 48% to not. We can summarize the expected counts in the table below.

                                     Karyotype
Congenital Heart Defect   Down syndrome   Patau syndrome   Total
Present                       _____           _____          44
Absent                        _____           _____          41
Total                           60              25           85

We can again use JMP to compute the test statistic and p-value.

Step 3: Report the conclusion in context of the research question

Example: A study was carried out to examine the feeding habits of vampire bats. The main question is whether cows in estrus and cows not in estrus have the same chance of being attacked by vampire bats.

                     Bitten by vampire bat?
                       Yes      No
Cow in estrus           5        3
Cow not in estrus       1        7

Step 0: Define the research question
Is there evidence that the proportion of cows in estrus attacked by vampire bats is different from the proportion of cows not in estrus attacked by vampire bats?

Step 1: Determine the null and alternative hypotheses
H0: The proportion of cows in estrus that have been bitten is equal to the proportion of cows not in estrus that have been bitten.
Ha: The proportion of cows in estrus that have been bitten is different from the proportion of cows not in estrus that have been bitten.
H0:
Ha:

Step 2: Finding the test statistic and p-value

Step 3: Report the conclusion in context of the research question

Questions:
11. Look closely at the output. Should you trust the results of the chi-square test? Explain.

Assumptions behind the Chi-Square Test: The chi-square test may be inappropriate for tables with very small expected cell frequencies. One rule of thumb suggests that most of the expected cell frequencies in the table should be 5 or more; otherwise, the chi-square approximation may not be reliable. JMP and most other statistical software packages will warn you when the results of the chi-square test are suspect.

Note that the chi-square test gives an approximate p-value; on the other hand, Fisher’s exact test gives the exact p-value. Therefore, when the assumptions for the chi-square test are not met for a 2 x 2 table, you should use Fisher’s exact test. If the assumptions are met, the two p-values should be approximately equal.

12. What is your conclusion based on Fisher’s exact test?
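
The rule of thumb and the comparison between the two tests can also be explored outside JMP. The sketch below uses only the Python standard library and the vampire bat counts from the table above; the helper names (`expected_counts`, `chi_square_test`, `fisher_two_sided`) are illustrative assumptions, not JMP functions. It computes the expected counts under the null hypothesis (row total times column total divided by the grand total), the chi-square statistic without a continuity correction together with its approximate p-value (1 degree of freedom for a 2 x 2 table), and a two-sided Fisher’s exact p-value found by adding up the probabilities of every table with the same margins that is no more likely than the observed one. Software packages may define the two-sided Fisher p-value slightly differently, so small discrepancies are possible.

```python
from math import comb, erfc, sqrt

# Rows: cow in estrus, cow not in estrus. Columns: bitten yes, bitten no.
observed = [[5, 3], [1, 7]]


def expected_counts(table):
    """Expected counts under H0: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_test(table):
    """Chi-square statistic and approximate p-value for a 2 x 2 table."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(chi-square > stat) = erfc(sqrt(stat / 2)).
    return stat, erfc(sqrt(stat / 2))


def fisher_two_sided(table):
    """Two-sided Fisher exact p-value: sum the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    (a, b), (c, d) = table
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Probability that the top-left cell equals x when margins are fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


expected = expected_counts(observed)
small_cells = sum(e < 5 for row in expected for e in row)
stat, p_chi = chi_square_test(observed)
p_fisher = fisher_two_sided(observed)

print("Expected counts:", expected)
print("Cells with expected count below 5:", small_cells)
print(f"Chi-square statistic: {stat:.3f}, approximate p-value: {p_chi:.4f}")
print(f"Fisher's exact two-sided p-value: {p_fisher:.4f}")
```

Comparing the number of small expected counts with the two printed p-values gives a way to check answers to Questions 11 and 12. The same functions can be reused for the other 2 x 2 tables in these notes by changing `observed`.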