Comparing Two Categorical Variables - Solutions

1: Sleep apnea is a pattern of irregular breathing during sleep, with longer than normal breath-holding intervals. The following two-way table shows counts of men and women with sleep apnea or not, from a sleep study with 400 men and 300 women.

                    Gender
Sleep Apnea?     Men     Women
Yes               40        12
No               360       288

a. Calculate the risk of sleep apnea for men in this study.
   40/400 = 0.10 (or 10%)

b. Calculate the risk of sleep apnea for women in this study.
   12/300 = 0.04 (or 4%)

c. Find the value that completes this sentence: The risk of sleep apnea for men is ___ times the risk for women. In other words, determine the relative risk.
   Relative risk = 0.10/0.04 = 2.5

d. Find the value that completes this sentence: The odds of sleep apnea for men are ___ times the odds for women. In other words, find the odds ratio.
   Odds ratio = (0.10/0.90)/(0.04/0.96) = 2.67

2: Open the Class Survey data file. Use Stat>Tables>CrossTabulation and Chi-Square to answer the following questions. Put Gender (C2) in the rows and the variable Ever Cheat (C14) in the columns. Be sure the box for Row Percents is checked. The variable Ever Cheat records student responses to whether they ever cheated on a significant other.

a. Fill in this table with row percents.

              Ever Cheat
Gender       No       Yes
Female     71.65     28.35
Male       80.81     19.19

b. Explain why the table of row percents indicates that there is a weak or no relationship between gender and whether students cheated on a significant other.
   There is little, if any, difference between Females and Males in the pattern of their responses to ever cheating. Both genders show a similar pattern, with a much higher “No” percentage than “Yes” percentage (whether you believe them or not is a different story!).

c. Do a chi-square test for statistical significance of the observed relationship. Use Stat>Tables>CrossTabulation and Chi-Square, then click the box Chi-Square and select Chi-Square Analysis, click OK and then click OK again.
(i) Give the p-value for the test,
   Pearson Chi-Square = 2.532, DF = 1, P-Value = 0.112

(ii) explain whether the observed relationship is statistically significant, and
   The observed relationship is not statistically significant since the p-value (0.112) is greater than 0.05. We call 0.05 the alpha level, and it is the cutoff to which we commonly compare the p-value. If p-value ≤ alpha, we say that the result is statistically significant; if p-value > alpha, we say that we do not have enough evidence to support a statistically significant conclusion.

(iii) state a general conclusion.
   The conclusion is simply a summary statement. For this problem our summary would be that, based on our data, there appears to be no statistically significant relationship between Gender and Cheating.

3: Situation: Open the HSB data from Datasets. The High School and Beyond data is from a large-scale longitudinal study conducted by the National Opinion Research Center (1980) under contract with the National Center for Education Statistics. Below is a table representing a sample of 100 students from this data that includes the student’s gender and whether the high school they attended was public or private. To answer the questions below the table use Stat>Tables>CrossTabulation and Chi-Square, then click the box Chi-Square and select Chi-Square Analysis as you did in part 2 above. [Note: if you did the activity for the probability lesson this would be a test of independence: that is, can we say that being Female is independent of attending a public school?]

            Female    Male    Total
Public        38       46       84
Private        7        9       16
Total         45       55      100

(i) Include a relevant table of conditional percents, (ii) based on the percents, discuss the nature of any relationship, and (iii) do a chi-square test of statistical significance. State a clear conclusion for the test of significance.
Tabulated statistics: Gender, School

Rows: Gender   Columns: School

            Private    Public       All
Female            7        38        45
              15.56     84.44    100.00
Male              9        46        55
              16.36     83.64    100.00
All              16        84       100
              16.00     84.00    100.00

Cell Contents:  Count
                % of Row

Pearson Chi-Square = 0.012, DF = 1, P-Value = 0.913

(i) The relevant table is shown above with the row percent given below the cell count. For example, the 7 females from private school account for 15.56% of the total females.

(ii) There appears to be no relationship between Gender and School Type, as the row percents for the two Genders differ by less than one percentage point for either school type.

(iii) The p-value (0.913) is greater than 0.05, indicating NO statistically significant relationship. We would conclude, then, that our data does not show a statistically significant relationship between Gender and School Type. [NOTE: this is consistent with Gender and School Type being independent, similar to the probability activity.]

4: Suppose a newspaper article states that drinking three or more cups of coffee doubles the risk of gall bladder cancer. Before giving up coffee, what question should be asked by a person who drinks this much coffee? (There is more than one possible answer.)

Some of these include:
Q. Is there a hereditary factor that was included in the study?
Q. What role did gender play?
Q. How was the data gathered and what was the sample size?
Q. What was the make-up of the data set? For example, how were age and gender distributed?
Q. Is there a time factor? For instance, do you have to drink three or more cups of coffee for some length of time before being at risk?
Q. Most importantly, this only shows a relationship and not a cause. So is there further information on whether this behavior will cause gall bladder cancer?
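
The arithmetic in problems 1 and 3 can also be checked outside Minitab. The following is a minimal Python sketch, not part of the original activity; the function names (risk, odds, chi_square_2x2) are hypothetical helpers introduced here for illustration. It assumes the Pearson chi-square is computed without a continuity correction, which is consistent with the Minitab value reported for problem 3, and it uses the fact that with 1 degree of freedom the p-value equals erfc(sqrt(statistic/2)).

```python
import math


def risk(events, total):
    """Risk = proportion of a group that has the outcome."""
    return events / total


def odds(p):
    """Odds corresponding to a risk (probability) p."""
    return p / (1 - p)


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 count table.

    Returns (statistic, p_value). The p-value formula below is valid
    only for 1 degree of freedom, i.e. a 2x2 table.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi-square > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Problem 1: sleep apnea counts from the two-way table.
risk_men = risk(40, 400)
risk_women = risk(12, 300)
relative_risk = risk_men / risk_women
odds_ratio = odds(risk_men) / odds(risk_women)
print(f"risk (men) = {risk_men:.2f}, risk (women) = {risk_women:.2f}")
print(f"relative risk = {relative_risk:.2f}, odds ratio = {odds_ratio:.2f}")

# Problem 3: rows are Female, Male; columns are Private, Public.
stat, p_value = chi_square_2x2([[7, 38], [9, 46]])
print(f"chi-square = {stat:.3f}, p-value = {p_value:.3f}")
```

Run as written, the script should reproduce the values given in the solutions: a relative risk of 2.50 and an odds ratio of 2.67 for problem 1, and a chi-square of 0.012 with a p-value of 0.913 for problem 3.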