Comparing Two Categorical Variables - Solutions
1: Sleep apnea is a pattern of irregular breathing during sleep, with longer than normal breath-holding
intervals. The following two-way table shows counts of men and women with sleep apnea or not, from a
sleep study with 400 men and 300 women.
              Sleep Apnea?
Gender        Yes      No
Men            40     360
Women          12     288
a. Calculate the risk of sleep apnea for men in this study.
40/400 = 0.10 (or 10%)
b. Calculate the risk of sleep apnea for women in this study.
12/300 = 0.04 (or 4%)
c. Find the value that completes this sentence: The risk of sleep apnea for men is ___ times the risk for
women. In other words, determine the relative risk.
Relative risk = .10/.04 = 2.5
d. Find the value that completes this sentence: The odds of sleep apnea for men are __ times the odds for
women. In other words, find the odds ratio.
Odds Ratio = (.10/.90)/(.04/.96) = 2.67
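The four calculations above can be checked with a short script. This is a minimal sketch in Python (the worksheet itself does not use Python); the function names `risk` and `odds` are illustrative choices, not part of the original solutions.

```python
# Counts from the two-way table in question 1.
men_yes, men_no = 40, 360
women_yes, women_no = 12, 288


def risk(yes, no):
    """Proportion of the group that has the condition."""
    return yes / (yes + no)


def odds(yes, no):
    """Number with the condition divided by the number without it."""
    return yes / no


risk_men = risk(men_yes, men_no)
risk_women = risk(women_yes, women_no)

# Relative risk compares the two risks; the odds ratio compares the two odds.
relative_risk = risk_men / risk_women
odds_ratio = odds(men_yes, men_no) / odds(women_yes, women_no)

print(f"Risk (men):    {risk_men:.2f}")
print(f"Risk (women):  {risk_women:.2f}")
print(f"Relative risk: {relative_risk:.2f}")
print(f"Odds ratio:    {odds_ratio:.2f}")
```

Note that the odds use "No" counts in the denominator (40/360), while the risks use group totals (40/400), which is why the odds ratio comes out slightly larger than the relative risk.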
2: Open the Class Survey data file. Use Stat>Tables>CrossTabulation and Chi-Square to answer the
following questions. Put Gender (C2) in the row and the variable Ever Cheat (C14) in the column. Be sure
the box for Row Percents is checked. The variable Ever Cheat records student responses to whether they
ever cheated on a significant other.
a. Fill in this table with row percents.
              Ever Cheat
Gender        No       Yes
Female        71.65    28.35
Male          80.81    19.19
b. Explain why the table of row percents indicates that there is a weak or no relationship between gender
and whether students cheated on a significant other.
There is little, if any, difference between Females and Males in their pattern of responses to ever
cheating: 71.65% of Females and 80.81% of Males answered “No,” so both genders show a much higher “No”
percentage than “Yes” (whether you believe them or not is a different story!).
c. Do a chi-square test for statistical significance of the observed relationship. Use
Stat>Tables>CrossTabulation and Chi-Square, then click the box Chi-Square and select Chi-Square
Analysis, click OK and then click OK again.
(i) Give p-value for the test,
Pearson Chi-Square = 2.532, DF = 1, P-Value = 0.112
(ii) explain whether the observed relationship is statistically significant and
The observed relationship is not significant since the p-value (0.112) is greater than 0.05. We call this
“0.05” the alpha-level, and we commonly use it as the cutoff to which we compare the p-value. If
p-value ≤ alpha, we say that the result is statistically significant; if p-value > alpha, we say
that we do not have enough evidence to support a statistically significant conclusion.
(iii) state a general conclusion. The conclusion is simply a summary statement. For this problem
our summary would be that based on our data there appears to be no significant relationship between
Gender and Cheating.
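The reported p-value and the comparison with alpha can be checked from the chi-square statistic alone. The sketch below is in Python rather than Minitab and relies on a fact that holds only when DF = 1: the statistic is then the square of a standard normal variable, so the upper-tail probability is erfc(sqrt(statistic / 2)). The function name `chi2_p_value_df1` is a hypothetical helper, not a library routine.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    Valid for DF = 1 only: P(X > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2))


alpha = 0.05                        # significance level used in the worksheet
p_value = chi2_p_value_df1(2.532)   # Pearson Chi-Square from the Minitab output
significant = p_value <= alpha      # decision rule: significant if p-value <= alpha

print(f"p-value = {p_value:.3f}")
print("statistically significant" if significant else "not statistically significant")
```

For tables with more than one degree of freedom this shortcut does not apply, and a statistics package should be used instead.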
3: Situation: Open the HSB data from Datasets. The High School and Beyond data is from a
large-scale longitudinal study conducted by the National Opinion Research Center (1980) under
contract with the National Center for Education Statistics. Below is a table representing a
sample of 100 students from this data that includes the student’s gender and whether the high
school they attended was public or private. To answer the questions below the table, use
Stat>Tables>CrossTabulation and Chi-Square, then click the box Chi-Square and select Chi-Square
Analysis as you did in question 2 above. [Note: if you did the activity for the probability lesson, this
would be a test of independence: that is, can we say that the probability of being Female is independent
of the probability that school type is public?]
              Female    Male    Total
Public            38      46       84
Private            7       9       16
Total             45      55      100
(i) include a relevant table of conditional percents,
(ii) based on the percents, discuss the nature of any relationship, and
(iii) do a chi-square test of statistical significance. State a clear conclusion for the test of
significance.
Tabulated statistics: Gender, School
Rows: Gender   Columns: School

           Private    Public       All
Female           7        38        45
             15.56     84.44    100.00
Male             9        46        55
             16.36     83.64    100.00
All             16        84       100
             16.00     84.00    100.00

Cell Contents:   Count
                 % of Row
Pearson Chi-Square = 0.012, DF = 1, P-Value = 0.913
(i) The relevant table is shown above with the row percent given below the cell count.
For example, the 7 females from private school account for 15.56% of the total
females.
(ii) There appears to be no relationship between Gender and School Type, as there is not a
large difference in the row percents between the Genders for either school type.
(iii) The p-value (0.913) is greater than 0.05, indicating NO statistically significant
relationship. We would conclude, then, that our data does not show a statistically
significant relationship between Gender and School Type. [NOTE: this is consistent with
Gender and School Type being independent, similar to the probability activity.]
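The row percents and the Pearson chi-square result for question 3 can be reproduced from the four cell counts. The following is a minimal sketch in Python, not the Minitab procedure the worksheet uses; the variable names are illustrative, and the p-value shortcut (erfc) is valid only because this 2x2 table has DF = 1.

```python
import math

# Observed counts from the HSB sample: rows are Gender, columns are School.
observed = {
    "Female": {"Private": 7, "Public": 38},
    "Male": {"Private": 9, "Public": 46},
}

# Marginal totals.
row_totals = {gender: sum(cells.values()) for gender, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for school, count in cells.items():
        col_totals[school] = col_totals.get(school, 0) + count
n = sum(row_totals.values())

# Row percents: each cell as a percentage of its gender's total.
row_percents = {
    gender: {school: 100 * count / row_totals[gender] for school, count in cells.items()}
    for gender, cells in observed.items()
}

# Pearson chi-square: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / n.
chi2 = 0.0
for gender, cells in observed.items():
    for school, count in cells.items():
        expected = row_totals[gender] * col_totals[school] / n
        chi2 += (count - expected) ** 2 / expected

# Upper-tail p-value; this closed form applies to DF = 1 only.
p_value = math.erfc(math.sqrt(chi2 / 2))

for gender, percents in row_percents.items():
    print(gender, {school: round(pct, 2) for school, pct in percents.items()})
print(f"Pearson Chi-Square = {chi2:.3f}, P-Value = {p_value:.3f}")
```

The expected counts here (7.2, 37.8, 8.8, and 46.2) are all very close to the observed counts, which is why the statistic is so small and the p-value so large.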
4: Suppose a newspaper article states that drinking three or more cups of coffee doubles the risk
of gall bladder cancer. Before giving up coffee, what question should be asked by a person who
drinks this much coffee? (There is more than one possible answer.)
Some of these include:
Q. Is there a hereditary factor that was included in the study?
Q. What role did gender play?
Q. How was the data gathered and what was the sample size?
Q. What was the make-up of the data set? E.g. dispersion of age, gender.
Q. Is there a time factor? For instance, do you have to drink three or more cups of coffee for so
long before being at risk?
Q. Most importantly, this only shows a relationship and not a cause. So is there further info on
whether this behavior will cause gall bladder cancer?