Stat 240 Homework 3 Solutions

6.5
a.
                     A on midterm   Not A on midterm   Total
Prefer take-home          10               30            40
Prefer in-class           15               20            35
Total                     25               50            75

b. The response variable is preference for type of final exam. The explanatory variable is grade on the midterm.

c. Among students who got an A on the midterm, (10/25) × 100% = 40% prefer a take-home final exam (and 60% prefer an in-class exam). Among students who did not get an A on the midterm, (30/50) × 100% = 60% prefer a take-home final exam (and 40% prefer an in-class exam). There is a relationship between the two variables. Students who did not get an A on the midterm are more likely to prefer a take-home final exam than are the students who got an A on the midterm.

6.9
a. Percent ever bullied = (42/92) × 100% = 45.65% among short students.

b. Percent ever bullied = (30/117) × 100% = 25.64% among students who are not short.

c. (omitted, but here’s the answer): Yes, there is a relationship. Short students are much more likely to have been bullied than are students who are not short.

6.25
a. Frequency of religious activities (i.e., frequency of attending church, praying, and so on).

b. Blood pressure.

c. Answers may vary. Some possibilities mentioned in the Case Study are smoking habits, diet, and extent of social network.

6.29
a. The new treatment was the more successful treatment in Hospital A. The percent surviving with the new treatment was 100/1000 or 10%, compared to 5/100 or 5% with the standard treatment.

b. The new treatment was the more successful treatment in Hospital B. The percent surviving with the new treatment was 95/100 or 95%, compared to 500/1000 or 50% with the standard treatment.

c. The combined contingency table is:

            Standard          New               Total
Survive     505 (5+500)       195 (95+100)       700
Die         595 (95+500)      905 (900+5)       1500
Total       1100              1100              2200

The standard treatment is more successful than the new treatment in the combined data set.
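The reversal between the hospital-level and combined comparisons can be checked with a short script. This is only a sketch: the counts are the ones quoted in parts (a) and (b), while the dictionary layout and the names `counts`, `percent`, `by_hospital`, and `combined` are illustrative choices, not part of the exercise.

```python
# Survivors and patients treated, taken from parts (a) and (b).
# Layout (an illustrative choice): hospital -> treatment -> (survived, treated)
counts = {
    "A": {"standard": (5, 100), "new": (100, 1000)},
    "B": {"standard": (500, 1000), "new": (95, 100)},
}


def percent(survived, treated):
    """Survival percentage for one group of patients."""
    return 100.0 * survived / treated


# Survival percentages within each hospital.
by_hospital = {
    hospital: {t: percent(*group) for t, group in treatments.items()}
    for hospital, treatments in counts.items()
}

# Combined percentages: add survivors and patients across hospitals first,
# then divide. Averaging the two hospital percentages would be a different
# (and incorrect) calculation.
combined = {}
for treatment in ("standard", "new"):
    survived = sum(counts[h][treatment][0] for h in counts)
    treated = sum(counts[h][treatment][1] for h in counts)
    combined[treatment] = percent(survived, treated)

for hospital, rates in by_hospital.items():
    print(hospital, {t: round(r, 1) for t, r in rates.items()})
print("combined", {t: round(r, 1) for t, r in combined.items()})
```

The new treatment comes out ahead inside each hospital but behind in the combined data because most new-treatment patients were treated in Hospital A, where survival was low for both treatments.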
The percent surviving with the standard treatment is (505/1100) × 100% = 45.9%, compared to (195/1100) × 100% = 17.7% with the new treatment. This is an example of Simpson’s Paradox.

6.32
a. Null: H0: There is no relationship between gender and opinion on the death penalty.
Alternative: Ha: There is a relationship between gender and opinion on the death penalty.

b. Expected counts, each computed as (column total × row total) / total n:

                          Gender
           Male                        Female                      Total
Oppose     (648 × 337)/1488 = 147      (840 × 337)/1488 = 190       337
Favor      (648 × 1151)/1488 = 501     (840 × 1151)/1488 = 650     1151
Total      648                         840                         1488

6.38
a. Null hypothesis: age and having seen a ghost (or not) are not related.
Alternative hypothesis: age and having seen a ghost (or not) are related.

b. There is a statistically significant relationship because the p-value of 0.001 is less than 0.05. This means that we believe the relationship observed in the sample also holds in the population. Regarding the hypotheses stated in part (a), the alternative hypothesis is selected.

c. The probability is 0.001 that the relationship observed in the sample would have been as strong as it is if there is no relationship between the variables in the population. More specifically, the probability is 0.001 that the chi-square statistic would be as large as it is, or larger, if there is no relationship in the population.

6.39
a. Expected count = (Row total × Column total) / Total n = (1525 × 677) / 5902 = 174.93 for the “age 18-29 yes” cell.

b. Expected counts must have the same row and column totals as the observed counts. So, expected “age 30+ yes” = Total “yes” − expected “age 18-29 yes” = 677 − 174.93 = 502.07.

c. In the age 18-29 group, percent expected “yes” = (174.93/1525) × 100% = 11.47%. In the age 30+ group, percent expected “yes” = (502.07/4377) × 100% = 11.47%. The percents are the same, and this is a general property of expected counts.

6.59
a.
The given information is summarized in this table:

                               Positive SalEst   Negative SalEst   All
Delivered before 42nd week          59%               41%          n = 615
Delivered 42nd week or later        33%               67%          n = 27

Applying the given percents to the given sample sizes, and rounding to the nearest integer, gives the two-way table of counts for the relationship between time of delivery and SalEst test result. The table of counts is:

                               Positive SalEst      Negative SalEst      All
Delivered before 42nd week     363 (59% of 615)     252 (41% of 615)     615
Delivered 42nd week or later     9 (33% of 27)       18 (67% of 27)       27
All                            372                  270                  642

b. The biologically correct view would be that an impending delivery will cause a positive SalEst test, so the explanatory variable would be the time until delivery and the response variable would be the result of the test. The view of a test user, however, might be that the test result is used to predict the time of delivery, and with this view the explanatory variable would be the test result and the time of delivery would be the response variable.

c. For all women in the study: Percent delivering before 42 weeks = (615/642) × 100% = 95.79%.

d. For women with a positive test: Percent delivering before 42 weeks = (363/372) × 100% = 97.58%, which rounds up to 98%.

e. For women with a negative test: Percent delivering before 42 weeks = (252/270) × 100% = 93.33%.

f. Opinions may differ, but generally the test doesn’t predict delivery much better than what might be predicted without doing the test. Notice that overall almost 96% of the women delivered before the 42nd week. So, if we predicted that every woman would deliver before the 42nd week, we would be right almost 96% of the time. Knowledge of a positive SalEst test only improves this predictive accuracy to 98%. And about 93% of the women with a negative test delivered before the 42nd week, so a negative test doesn’t provide the information that a woman will have a later delivery.

6.64
a.
Null hypothesis: There is not a relationship between gun ownership (or not) and opinion about required police permits for guns.
Alternative hypothesis: There is a relationship between gun ownership (or not) and opinion about required police permits for guns.

b. The two-way table of counts, along with the percent favoring and opposing permits for each gun ownership group, is:

                         Favor permits    Oppose permits    Total
No gun in home           544 (90.2%)       59 (9.8%)         603
Yes, have gun in home    323 (72.4%)      123 (27.6%)        446
Total                    867              182               1049

In the whole sample, the percent with no gun in the home is (603/1049) × 100% = 57.5%, so 42.5% have a gun in the home. Notice that the sample size for this table is much less than the overall sample size of the data set. These questions were not asked of all respondents.

c. Gun in home: (323/446) × 100% = 72.4% favor permits.
No gun in home: (544/603) × 100% = 90.2% favor permits.
The difference in percent favoring permits is almost 18 percentage points, and the sample size is relatively large, so there probably is a real relationship between gun ownership and opinion about permits.

d. p-value ≈ 0. Yes, the relationship is statistically significant. Minitab output is shown below (with expected counts shown below observed counts).

Minitab output for Exercise 6.64d

Rows: owngun   Columns: gunlaw

          Favor     Oppose        All
No          544         59        603
         498.38     104.62     603.00

Yes         323        123        446
         368.62      77.38     446.00

All         867        182       1049
         867.00     182.00    1049.00

Chi-Square = 56.609, DF = 1, P-Value = 0.000

e. An observed relationship is statistically significant if it is unlikely that a relationship as strong, or stronger, would be observed in a sample if there were no relationship in the population. In the context of this exercise, it is nearly impossible (p-value ≈ 0) that the difference in the sample percents calculated in part (b) would be so large if there were no difference in opinion between those who have a gun and those who do not have a gun in the population.
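As a cross-check on the Minitab output in part (d), the expected counts and the chi-square statistic can be recomputed from the observed counts. The sketch below uses only the Python standard library and is not how Minitab computes its output; the variable names are illustrative. The p-value line relies on the fact that, for 1 degree of freedom only, the upper-tail chi-square probability equals erfc(sqrt(x/2)).

```python
import math

# Observed counts from the two-way table in part (b).
# Rows: no gun in home, gun in home. Columns: favor permits, oppose permits.
observed = [
    [544, 59],
    [323, 123],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom = (rows - 1) * (columns - 1), which is 1 for a 2x2 table.
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Valid for df = 1 only: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print([[round(e, 2) for e in row] for row in expected])
print(f"Chi-Square = {chi_square:.3f}, DF = {df}, P-Value = {p_value:.3f}")
```

For tables larger than 2x2 the erfc shortcut no longer applies, and a chi-square distribution function from a statistics library would be needed for the p-value.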