Chapter 7: Tables

1. a. 0.406
   b. 0.046
   c. 0.125
   d. 0.019

2. a. 7.779
   b. 9.488
   c. 16.919
   d. 21.666

3. False. The χ² test only measures the significance of the association, not the degree of the association.

4. The p-value from the test is 0.041576, which is below 0.05 and would cause us to reject the hypothesis that this data comes from a fair die.

5. The Pearson Chi-Square test assumes nominal data and thus is not as powerful when testing ordinal data. To test ordinal data, you should use a test such as the Goodman-Kruskal Gamma, Kendall's tau-b, Stuart's tau-c, or Somers' D.

6. a. The pivot table appears as:

      Dept          Count of Dept   Percentage of Dept
      Bus,Econ                 91               26.38%
      HealthSc                 25                7.25%
      MathSci                 128               37.10%
      Soc Sci                 101               29.28%
      Grand Total             345              100.00%

   b. The pie chart and bar chart appear as:

      [Pie chart of Count of Dept by Dept: Bus,Econ 26%, HealthSc 7%, MathSci 38%, Soc Sci 29%]
      [Bar chart of Count of Dept by Dept: Bus,Econ 91, HealthSc 25, MathSci 128, Soc Sci 101]

      The bar chart gives the best indication of comparative group size. The pie chart is better for comparing the size of each group to the whole.

   c. The initial pivot table appears as:

      Computer      Not req   Prereq   Grand Total
      PC                106       18           124
      Mac                16        7            23
      Main               43       12            55
      Mini                3        1             4
      Grand Total       168       38           206

      However, since the Mini category contains sparse data, we'll combine the Main and Mini categories as follows:

      Observed Counts   Not req   Prereq
      PC                    106       18
      Mac                    16        7
      Main/Mini              46       13

      The test statistics for the table are:

      Expected Counts   Not req   Prereq
      PC                 101.13    22.87
      Mac                 18.76     4.24
      Main/Mini           48.12    10.88

      Std. Residuals    Not req   Prereq
      PC                   0.48    -1.02
      Mac                 -0.64     1.34
      Main/Mini           -0.31     0.64

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               3.975    2     0.137
      Continuity Adjusted Chi-Square   2.793    2     0.248
      Likelihood Ratio Chi-Square      3.756    2     0.153

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.139
      Contingency               0.138
      Cramer's V                0.139
      Goodman-Kruskal Gamma     0.236        0.144     0.101
      Kendall's tau-b           0.104        0.068     0.125
      Stuart's tau-c            0.084        0.055     0.129
      Somers' D (C|R)           0.077        0.051     0.129
      Somers' D (R|C)           0.139        0.091     0.123

      The Pearson Chi-Square does not indicate a significant relationship at the 5% level. There is a hint of a relationship, since 18.4% of the calculus prerequisite classes (7 of 38) use the Macintosh, but only about 9.5% (16 of 168) of the other classes use the Macintosh. Perhaps a larger sample would show significance. In any case, it would not be appropriate to declare that Computer and Calculus are independent. There is not sufficient evidence to indicate a definitive relationship.

   d. The pivot table of enrollment versus computer type appears as follows:

      Count of Computer   Computer Group
      Enrollment          PC's   Non PC's   Grand Total
      001-050               15         21            36
      051-100               27         18            45
      101-150               16         11            27
      151-200               16          8            24
      201-500               28         11            39
      501-                  18          8            26
      Grand Total          120         77           197

      The table statistics are:

      Expected Counts    PC's   Non PC's
      001-050           21.93      14.07
      051-100           27.41      17.59
      101-150           16.45      10.55
      151-200           14.62       9.38
      201-500           23.76      15.24
      501-              15.84      10.16

      Std. Residuals     PC's   Non PC's
      001-050           -1.48       1.85
      051-100           -0.08       0.10
      101-150           -0.11       0.14
      151-200            0.36      -0.45
      201-500            0.87      -1.09
      501-               0.54      -0.68

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               8.677    5     0.123
      Continuity Adjusted Chi-Square   6.914    5     0.227
      Likelihood Ratio Chi-Square      8.617    5     0.125

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.210
      Contingency                0.205
      Cramer's V                 0.210
      Goodman-Kruskal Gamma     -0.264        0.097     0.006
      Kendall's tau-b           -0.167        0.062     0.007
      Stuart's tau-c            -0.209        0.078     0.007
      Somers' D (C|R)           -0.127        0.047     0.007
      Somers' D (R|C)           -0.220        0.081     0.007

      The ordinal tests are all statistically significant, indicating that a relationship between enrollment and computer type exists. As the enrollment increases, it is more likely that the course uses a PC rather than a non-PC computer.
      However, this could be due to other factors besides class size. More advanced classes tend to be smaller and might therefore be using non-PC computers (such as mainframes and workstations).

7. a. Use the Sort command on the Excel menu bar.

   b. The pivot table appears as:

      Rank Hired     F    M   Grand Total
      instructor    20   17            37
      asst prof     17   15            32
      assoc prof     0    9             9
      full prof      0    3             3
      Grand Total   37   44            81

   c. The table statistics are:

      Expected Counts       F       M
      instructor        16.90   20.10
      asst prof         14.62   17.38
      assoc prof         4.11    4.89
      full prof          1.37    1.63

      Std. Residuals        F       M
      instructor         0.75   -0.69
      asst prof          0.62   -0.57
      assoc prof        -2.03    1.86
      full prof         -1.17    1.07

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               11.852    3     0.008
      Continuity Adjusted Chi-Square    8.039    3     0.045
      Likelihood Ratio Chi-Square      16.398    3     0.001

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.383
      Contingency               0.357
      Cramer's V                0.383
      Goodman-Kruskal Gamma     0.440        0.160     0.006
      Kendall's tau-b           0.250        0.095     0.009
      Stuart's tau-c            0.277        0.109     0.011
      Somers' D (C|R)           0.223        0.083     0.007
      Somers' D (R|C)           0.279        0.110     0.011

      Warning: More than 1/5 of Fitted Cells are Sparse

      Because the rank hired data is ordinal, we should use one of the ordinal statistics. However, some of the table cells are sparse, leaving some question as to the validity of the statistical results. We should either remove some of the sparse cells or combine them.

   d. The revised table appears as:

      Observed Counts       F       M
      Instructors          20      17
      asst prof            17      15
      Full and Assoc        0      12

      The table statistics are:

      Expected Counts       F       M
      Instructors       16.90   20.10
      asst prof         14.62   17.38
      Full and Assoc     5.48    6.52

      Std. Residuals        F       M
      Instructors        0.75   -0.69
      asst prof          0.62   -0.57
      Full and Assoc    -2.34    2.15

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               11.852    2     0.003
      Continuity Adjusted Chi-Square    9.516    2     0.009
      Likelihood Ratio Chi-Square      16.398    2     0.000

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.383
      Contingency               0.357
      Cramer's V                0.383
      Goodman-Kruskal Gamma     0.440        0.160     0.006
      Kendall's tau-b           0.251        0.097     0.009
      Stuart's tau-c            0.277        0.109     0.011
      Somers' D (C|R)           0.226        0.085     0.008
      Somers' D (R|C)           0.279        0.110     0.011

      Once again, the table statistics indicate a relationship between rank hired and gender. This time there is no problem with sparse cells.

   e. Grouping the data into professors and non-professors yields:

      Observed Counts       F       M
      Instructors          20      17
      Professors           17      27

      The table statistics are:

      Expected Counts       F       M
      Instructors       16.90   20.10
      Professors        20.10   23.90

      Std. Residuals        F       M
      Instructors        0.75   -0.69
      Professors        -0.69    0.63

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               1.925    1     0.165
      Continuity Adjusted Chi-Square   1.354    1     0.245
      Likelihood Ratio Chi-Square      1.931    1     0.165

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.154
      Contingency               0.152
      Cramer's V                0.154
      Goodman-Kruskal Gamma     0.303        0.205     0.141
      Kendall's tau-b           0.154        0.110     0.161
      Stuart's tau-c            0.153        0.109     0.161
      Somers' D (C|R)           0.154        0.110     0.161
      Somers' D (R|C)           0.154        0.110     0.161

      After grouping the Rank Hired variable this way, the relationship between rank and gender is no longer significant. The three p-values for the ordinal statistics are now 0.14, 0.16, and 0.16, so we can no longer reject the null hypothesis at the 5% level. The two top levels of professorship might be grouped together, preserving the conclusion and not losing all of the distinction between different ranks.

   f. Since the first statistics had a serious sparseness problem, no conclusive results could come from the analysis. Grouping the data solved the sparseness problem, but if we grouped the data too much, we lost important information. Grouping the associate and full professors together seemed to preserve most of the important information, while at the same time resolving the sparseness issue.
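      The sparseness check behind the warning in part c can be sketched in a few lines of Python. This is only an illustration, assuming the common rule of thumb that a cell is sparse when its expected count falls below 5; the exact threshold used by the software is not stated here, and the function names are hypothetical.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def sparse_fraction(table, threshold=5.0):
    """Fraction of cells whose expected count is below `threshold`.

    The threshold of 5 is an assumed rule of thumb, not taken from the text.
    """
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)


# Observed counts (F, M) from parts b and d of problem 7.
four_ranks = [[20, 17], [17, 15], [0, 9], [0, 3]]
three_ranks = [[20, 17], [17, 15], [0, 12]]

print(sparse_fraction(four_ranks))   # more than 1/5, so the warning applies
print(sparse_fraction(three_ranks))  # no sparse cells after grouping
```

      Under that assumption, half of the cells in the four-rank table are sparse, while none are sparse once the associate and full professors are combined.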
      Our tentative conclusion is that rank and gender are related, with a higher percentage of males hired for the upper-level positions. Several important factors are not considered in this analysis, such as the number of applicants for the positions, their qualifications, the gender of the applicants, and their previous experience. All of these could be relevant to the study.

   g. Grouping the full and associate professors together creates the following table:

      Observed Counts        F       M
      instructor            15      11
      asst prof             11       8
      Assoc/Full Prof        0       8

      The table statistics are:

      Expected Counts        F       M
      instructor         12.75   13.25
      asst prof           9.32    9.68
      Assoc/Full Prof     3.92    4.08

      Std. Residuals         F       M
      instructor          0.63   -0.62
      asst prof           0.55   -0.54
      Assoc/Full Prof    -1.98    1.94

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square                9.073    2     0.011
      Continuity Adjusted Chi-Square    6.627    2     0.036
      Likelihood Ratio Chi-Square      12.165    2     0.002

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.414
      Contingency               0.382
      Cramer's V                0.414
      Goodman-Kruskal Gamma     0.461        0.194     0.017
      Kendall's tau-b           0.267        0.121     0.027
      Stuart's tau-c            0.295        0.137     0.031
      Somers' D (C|R)           0.242        0.107     0.023
      Somers' D (R|C)           0.295        0.137     0.031

      For employees with a Master's degree when hired, there still appears to be a relationship between gender and rank. Once again, women are hired at a lower-than-expected rate for the upper-level positions.

8. a. The hypotheses are:
      H0: The incidence of colds will be the same in both groups.
      Ha: The incidence of colds will be different.

   b. The table statistics are:

      Observed Counts    Cold   No Cold
      Placebo              31       109
      Ascorbic Acid        17       122

      Expected Counts    Cold   No Cold
      Placebo           24.09    115.91
      Ascorbic Acid     23.91    115.09

      Std. Residuals     Cold   No Cold
      Placebo            1.41     -0.64
      Ascorbic Acid     -1.41      0.64

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               4.811    1     0.028
      Continuity Adjusted Chi-Square   4.141    1     0.042
      Likelihood Ratio Chi-Square      4.872    1     0.027

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.131
      Contingency               0.130
      Cramer's V                0.131
      Goodman-Kruskal Gamma     0.342        0.145     0.019
      Kendall's tau-b           0.131        0.058     0.024
      Stuart's tau-c            0.099        0.045     0.027
      Somers' D (C|R)           0.099        0.045     0.027
      Somers' D (R|C)           0.174        0.076     0.023

      Based on the table statistics, we reject the null hypothesis with a p-value of 0.028, concluding that treatment group is related to the incidence of the common cold. In this case, taking ascorbic acid apparently reduces the incidence of colds: about 12% of the ascorbic acid group (17 of 139) caught a cold, compared with about 22% of the placebo group (31 of 140).

9. a. The hypotheses are:
      H0: The heights of husbands and wives are independent.
      Ha: The heights of husbands and wives are related.

   b. The table statistics are:

      Observed Counts    Tall   Medium   Short
      Tall                 18       28      14
      Medium               20       51      28
      Short                12       25       9

      Expected Counts    Tall   Medium   Short
      Tall              14.63    30.44   14.93
      Medium            24.15    50.22   24.63
      Short             11.22    23.34   11.44

      Std. Residuals     Tall   Medium   Short
      Tall               0.88    -0.44   -0.24
      Medium            -0.84     0.11    0.68
      Short              0.23     0.34   -0.72

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               2.907    4     0.573
      Continuity Adjusted Chi-Square   1.979    4     0.740
      Likelihood Ratio Chi-Square      2.923    4     0.571

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.119
      Contingency               0.118
      Cramer's V                0.084
      Goodman-Kruskal Gamma     0.019        0.101     0.855
      Kendall's tau-b           0.012        0.063     0.855
      Stuart's tau-c            0.011        0.060     0.855
      Somers' D (C|R)           0.012        0.063     0.855
      Somers' D (R|C)           0.012        0.064     0.855

      There is no evidence of a relationship between heights of wives and husbands. We fail to reject the null hypothesis with a p-value of 0.573.

10. a. The table statistics for each question are:

      Do you favor or oppose term limits?

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               2.765    1     0.096
      Continuity Adjusted Chi-Square   2.570    1     0.109
      Likelihood Ratio Chi-Square      2.761    1     0.097

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.040
      Contingency                0.040
      Cramer's V                 0.040
      Goodman-Kruskal Gamma     -0.099        0.059     0.094
      Kendall's tau-b           -0.040        0.024     0.097
      Stuart's tau-c            -0.033        0.020     0.097
      Somers' D (C|R)           -0.049        0.030     0.097
      Somers' D (R|C)           -0.033        0.020     0.097

      Should Federal Spending on Child Care Be Increased?

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               13.641    1     0.000
      Continuity Adjusted Chi-Square   13.023    1     0.000
      Likelihood Ratio Chi-Square      13.613    1     0.000

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.110
      Contingency                0.109
      Cramer's V                 0.110
      Goodman-Kruskal Gamma     -0.304        0.078     0.000
      Kendall's tau-b           -0.110        0.030     0.000
      Stuart's tau-c            -0.077        0.021     0.000
      Somers' D (C|R)           -0.155        0.042     0.000
      Somers' D (R|C)           -0.077        0.021     0.000

      Should Federal Spending on Crime Be Increased?

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               11.053    1     0.001
      Continuity Adjusted Chi-Square   10.303    1     0.001
      Likelihood Ratio Chi-Square      11.095    1     0.001

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.089
      Contingency                0.089
      Cramer's V                 0.089
      Goodman-Kruskal Gamma     -0.370        0.103     0.000
      Kendall's tau-b           -0.089        0.026     0.001
      Stuart's tau-c            -0.041        0.013     0.001
      Somers' D (C|R)           -0.190        0.055     0.001
      Somers' D (R|C)           -0.042        0.013     0.001

      Should Federal Spending on Welfare Be Increased?

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               11.053    1     0.001
      Continuity Adjusted Chi-Square   10.303    1     0.001
      Likelihood Ratio Chi-Square      11.095    1     0.001

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.089
      Contingency                0.089
      Cramer's V                 0.089
      Goodman-Kruskal Gamma     -0.370        0.103     0.000
      Kendall's tau-b           -0.089        0.026     0.001
      Stuart's tau-c            -0.041        0.013     0.001
      Somers' D (C|R)           -0.190        0.055     0.001
      Somers' D (R|C)           -0.042        0.013     0.001

      Should Federal Spending on Defense Be Increased?

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               0.028    1     0.867
      Continuity Adjusted Chi-Square   0.010    1     0.920
      Likelihood Ratio Chi-Square      0.028    1     0.867

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.006
      Contingency               0.006
      Cramer's V                0.006
      Goodman-Kruskal Gamma     0.011        0.067     0.867
      Kendall's tau-b           0.006        0.033     0.867
      Stuart's tau-c            0.005        0.033     0.867
      Somers' D (C|R)           0.006        0.033     0.867
      Somers' D (R|C)           0.005        0.033     0.867

      Should Federal Spending on Social Security Be Increased?

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               33.848    1     0.000
      Continuity Adjusted Chi-Square   32.519    1     0.000
      Likelihood Ratio Chi-Square      33.932    1     0.000

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.185
      Contingency                0.182
      Cramer's V                 0.185
      Goodman-Kruskal Gamma     -0.601        0.081     0.000
      Kendall's tau-b           -0.185        0.030     0.000
      Stuart's tau-c            -0.102        0.019     0.000
      Somers' D (C|R)           -0.328        0.051     0.000
      Somers' D (R|C)           -0.105        0.019     0.000

    b. Should Federal Spending on Child Care be Increased?
       Should Federal Spending on Crime be Increased?
       Should Federal Spending on Welfare be Increased?
       Should Federal Spending on Social Security be Increased?

    c. There are no statistically significant differences between men and women involving term limits and defense spending. The other four issues (child care, welfare, crime, and social security) showed statistically significant disagreements. In each case, a higher percentage of women than expected favored increased spending on the problem area. In general, women favor an increased government role in social issues.

11. a. The table statistics are:

      Do you favor or oppose term limits?

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               1.459    2     0.482
      Continuity Adjusted Chi-Square   1.211    2     0.546
      Likelihood Ratio Chi-Square      1.414    2     0.493

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.030
      Contingency               0.030
      Cramer's V                0.030
      Goodman-Kruskal Gamma     0.096        0.083     0.246
      Kendall's tau-b           0.028        0.025     0.268
      Stuart's tau-c            0.016        0.014     0.269
      Somers' D (C|R)           0.023        0.021     0.268
      Somers' D (R|C)           0.033        0.030     0.268

      Should Federal Spending on Crime Be Increased?

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               0.327    1     0.567
      Continuity Adjusted Chi-Square   0.167    1     0.683
      Likelihood Ratio Chi-Square      0.341    1     0.559

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.015
      Contingency                0.015
      Cramer's V                 0.015
      Goodman-Kruskal Gamma     -0.099        0.172     0.566
      Kendall's tau-b           -0.015        0.025     0.542
      Stuart's tau-c            -0.005        0.008     0.543
      Somers' D (C|R)           -0.024        0.039     0.542
      Somers' D (R|C)           -0.010        0.016     0.543

      Should Federal Spending on Child Care Be Increased?

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               26.064    1     0.000
      Continuity Adjusted Chi-Square   24.895    1     0.000
      Likelihood Ratio Chi-Square      36.186    1     0.000

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.154
      Contingency                0.152
      Cramer's V                 0.154
      Goodman-Kruskal Gamma     -0.798        0.093     0.000
      Kendall's tau-b           -0.154        0.016     0.000
      Stuart's tau-c            -0.080        0.010     0.000
      Somers' D (C|R)           -0.162        0.018     0.000
      Somers' D (R|C)           -0.146        0.016     0.000

      Should Federal Spending on Welfare Be Increased?

      Test Statistics                   Value   df   p-value
      Pearson Chi-Square               37.220    2     0.000
      Continuity Adjusted Chi-Square   35.672    2     0.000
      Likelihood Ratio Chi-Square      31.351    2     0.000

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.182
      Contingency                0.179
      Cramer's V                 0.182
      Goodman-Kruskal Gamma     -0.468        0.074     0.000
      Kendall's tau-b           -0.162        0.035     0.000
      Stuart's tau-c            -0.084        0.019     0.000
      Somers' D (C|R)           -0.134        0.030     0.000
      Somers' D (R|C)           -0.195        0.043     0.000

      Should Federal Spending on Social Security Be Increased?

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               3.455    1     0.063
      Continuity Adjusted Chi-Square   2.926    1     0.087
      Likelihood Ratio Chi-Square      3.884    1     0.049

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.060
      Contingency                0.060
      Cramer's V                 0.060
      Goodman-Kruskal Gamma     -0.320        0.163     0.049
      Kendall's tau-b           -0.060        0.026     0.024
      Stuart's tau-c            -0.026        0.012     0.027
      Somers' D (C|R)           -0.084        0.037     0.024
      Somers' D (R|C)           -0.043        0.019     0.026

      Should Federal Spending on Defense Be Increased?

      Test Statistics                    Value   df   p-value
      Pearson Chi-Square               104.711    2     0.000
      Continuity Adjusted Chi-Square    99.522    2     0.000
      Likelihood Ratio Chi-Square       93.210    2     0.000

      Measures of Association   Value   Std. Error   p-value
      Phi                       0.457
      Contingency               0.415
      Cramer's V                0.457
      Goodman-Kruskal Gamma     0.788        0.043     0.000
      Kendall's tau-b           0.445        0.046     0.000
      Stuart's tau-c            0.328        0.040     0.000
      Somers' D (C|R)           0.473        0.050     0.000
      Somers' D (R|C)           0.420        0.045     0.000

    b. Should Federal Spending on Child Care be Increased?
       Should Federal Spending on Defense be Increased?
       Should Federal Spending on Welfare be Increased?
       Should Federal Spending on Social Security be Increased?

    c. A higher than expected number of respondents from the Black and Other race category believe that spending should be increased in child care and welfare. Also, a higher than expected number of blacks believe that social security spending should be increased. A higher than expected number of whites believe that defense spending should be increased. In each case the differences were statistically significant at the 5% level, leading to the conclusion that the race of the respondent for these questions is associated with the response.

12. a. The hypotheses are:
       H0: The number of cylinders put in each car is independent of the country of origin.
       Ha: The number of cylinders is related to the country.

    b. The pivot table appears as:

      Count of Origin   Cylinders
      Origin            3     4   5    6     8   Grand Total
      American          0    69   0   73   103           245
      European          0    61   3    4     0            68
      Japanese          4    69   0    6     0            79
      Grand Total       4   199   3   83   103           392

    c. The table statistics are:

      Test Statistics                    Value   df   p-value
      Pearson Chi-Square               194.322    8     0.000
      Continuity Adjusted Chi-Square   176.234    8     0.000
      Likelihood Ratio Chi-Square      260.851    8     0.000

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.682
      Contingency                0.563
      Cramer's V                 0.482
      Goodman-Kruskal Gamma     -0.907        0.027     0.000
      Kendall's tau-b           -0.451        0.028     0.000
      Stuart's tau-c            -0.427        0.026     0.000
      Somers' D (C|R)           -0.479        0.024     0.000
      Somers' D (R|C)           -0.423        0.024     0.000

      Warning: More than 1/5 of Fitted Cells are Sparse

    d.
       Since there is a problem with sparse cells, we'll group the cars with 3-5 cylinders and we'll also group the non-American cars. The pivot table appears as:

      Count of Origin   Cylinders2
      Origin2           3-5    6     8   Grand Total
      American           69   73   103           245
      Non-American      137   10     0           147
      Grand Total       206   83   103           392

      The table statistics are:

      Test Statistics                    Value   df   p-value
      Pearson Chi-Square               164.834    2     0.000
      Continuity Adjusted Chi-Square   160.799    2     0.000
      Likelihood Ratio Chi-Square      222.405    2     0.000

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.637
      Contingency                0.537
      Cramer's V                 0.637
      Goodman-Kruskal Gamma     -0.947        0.018     0.000
      Kendall's tau-b           -0.523        0.030     0.000
      Stuart's tau-c            -0.593        0.034     0.000
      Somers' D (C|R)           -0.588        0.028     0.000
      Somers' D (R|C)           -0.466        0.026     0.000

    e. Based on the results of the table statistics, we reject the null hypothesis with a p-value below 0.0001 and conclude that there is a relationship between origin and the number of cylinders placed in the car. Generally, American cars have more cylinders than non-American cars.

13. a. The pivot table is:

      Count of Offer Pending   NE Sector
      Offer Pending            No   Yes   Grand Total
      No                       30    65            95
      Yes                       9    13            22
      Grand Total              39    78           117

      The table statistics are:

      Test Statistics                  Value   df   p-value
      Pearson Chi-Square               0.700    1     0.403
      Continuity Adjusted Chi-Square   0.343    1     0.558
      Likelihood Ratio Chi-Square      0.683    1     0.409

      Measures of Association    Value   Std. Error   p-value
      Phi                        0.077
      Contingency                0.077
      Cramer's V                 0.077
      Goodman-Kruskal Gamma     -0.200        0.234     0.392
      Kendall's tau-b           -0.077        0.096     0.418
      Stuart's tau-c            -0.057        0.071     0.421
      Somers' D (C|R)           -0.093        0.115     0.418
      Somers' D (R|C)           -0.064        0.080     0.420

      There is no evidence of a relationship between Offer Pending and the sector in which the house is located.
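      The three test statistics reported for the Offer Pending table can be reproduced by hand. The following is a minimal sketch, assuming the standard textbook formulas for the Pearson, continuity-adjusted (Yates), and likelihood-ratio chi-square statistics on a 2x2 table; it is not the add-in's own code, and the function name `chi_square_2x2` is hypothetical. With 1 degree of freedom the upper-tail p-value equals erfc(sqrt(x / 2)), so only the standard library is needed.

```python
import math


def chi_square_2x2(table):
    """Return {name: (statistic, p_value)} for a 2x2 table of counts (1 df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    pearson = adjusted = likelihood = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            pearson += diff ** 2 / expected
            # Yates correction: shrink each |O - E| by 0.5 before squaring.
            adjusted += max(diff - 0.5, 0.0) ** 2 / expected
            if observed > 0:  # a zero cell contributes nothing to G
                likelihood += 2 * observed * math.log(observed / expected)

    def p_value(x):
        # Upper tail of the chi-square distribution with 1 degree of freedom.
        return math.erfc(math.sqrt(x / 2))

    return {
        "pearson": (pearson, p_value(pearson)),
        "continuity_adjusted": (adjusted, p_value(adjusted)),
        "likelihood_ratio": (likelihood, p_value(likelihood)),
    }


# Counts from the pivot table in problem 13a.
results = chi_square_2x2([[30, 65], [9, 13]])
for name, (stat, p) in results.items():
    print(f"{name}: {stat:.3f} (p = {p:.3f})")
```

      Rounded to three decimals, the values agree with the Test Statistics table above, which supports reading the reported p-values as upper-tail chi-square probabilities with 1 degree of freedom.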