Stat 301 – Lecture 33

Categorical Data
A random sample of students taking introductory statistics at a large public university is asked two questions.

Categorical Variables
Response: With whom is it easiest to make friends?
  Opposite gender
  Same gender
  No difference

Categorical Variables
Explanatory: What is your gender?
  Female
  Male

Contingency table
With whom is it easiest to make friends?

          Opposite   Same     No
          Gender     Gender   Difference   Total
Female       58        16        63         137
Male         15        13        40          68
Total        73        29       103         205

Bar Graph
With whom is it easiest to make friends?
[Bar graph of Answer, vertical axis: Count. Bar labels: No Diff 50.2%, Opposite 35.6%, Same 14.1%]

Frequencies
Level       Count   Prob
No Diff      103    0.50244
Opposite      73    0.35610
Same          29    0.14146
Total        205    1.00000
N Missing 0, 3 Levels

Percentages
With whom is it easiest to make friends?
Each cell shows Count (Row %).

          Opposite Gender   Same Gender    No Difference   Total
Female    58 (42.34%)       16 (11.68%)    63 (45.99%)     137 (100%)
Male      15 (22.06%)       13 (19.12%)    40 (58.82%)      68 (100%)
Total     73                29             103             205

Mosaic Plot
[Mosaic plot of Answer (No Diff, Opposite, Same) by Gender (Female, Male); vertical axis: proportion from 0.00 to 1.00]

Description
Almost 60% of males in the sample say no difference, while less than 50% of females in the sample say no difference.

Description
Females in the sample are almost twice as likely as males in the sample to say it is easiest to make friends with the opposite gender (42.34% versus 22.06%).

Description
Males in the sample are more than one and a half times as likely as females in the sample to say it is easiest to make friends with the same gender (19.12% versus 11.68%).

Comment
There appears to be some sort of relationship between students’ gender and their opinion on with whom it is easiest to make friends.

Inference
Can the apparent relationship be due to chance, or is there a statistically significant relationship?
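The row percentages and the "times as likely" comparisons in the Description slides can be checked directly from the counts in the contingency table. The following is a minimal sketch in plain Python; the helper name `row_percentages` and the dictionary layout are illustrative choices, not part of the lecture or of JMP.

```python
# Observed counts from the contingency table (rows: gender, columns: answer).
observed = {
    "Female": {"Opposite": 58, "Same": 16, "No Difference": 63},
    "Male": {"Opposite": 15, "Same": 13, "No Difference": 40},
}


def row_percentages(table):
    """Return each cell as a percentage of its row total (hypothetical helper)."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100 * n / total for col, n in cells.items()}
    return result


row_pct = row_percentages(observed)

# Ratios behind the "times as likely" descriptions.
opposite_ratio = row_pct["Female"]["Opposite"] / row_pct["Male"]["Opposite"]
same_ratio = row_pct["Male"]["Same"] / row_pct["Female"]["Same"]

for row, cells in row_pct.items():
    print(row, {col: round(pct, 2) for col, pct in cells.items()})
print("Female/Male ratio, opposite gender:", round(opposite_ratio, 2))
print("Male/Female ratio, same gender:", round(same_ratio, 2))
```

Comparing row percentages rather than raw counts matters here because the sample has about twice as many females (137) as males (68).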
Independence
If students’ gender and their opinion on with whom it is easiest to make friends were independent, the proportions for females and males would be similar to the total proportions.

Expected Percentages
With whom is it easiest to make friends?
Each cell shows Row %; the Total row shows Count and Row %.

          Opposite Gender   Same Gender    No Difference    Total
Female    35.61%            14.15%         50.24%           137 (100%)
Male      35.61%            14.15%         50.24%            68 (100%)
Total     73 (35.61%)       29 (14.15%)    103 (50.24%)     205 (100%)

Expected Values
With whom is it easiest to make friends?
Each cell shows Expected Count (Row %). The expected count is the row total times the overall proportion for that column, for example 137 × 0.3561 = 48.79.

          Opposite Gender    Same Gender        No Difference       Total
Female    48.7854 (35.61%)   19.3805 (14.146%)  68.8341 (50.244%)   137 (100%)
Male      24.2146 (35.61%)    9.6195 (14.146%)  34.1659 (50.244%)    68 (100%)
Total     73 (35.61%)        29 (14.146%)       103 (50.244%)       205 (100%)

Test Statistic
\[ \chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} \]
df = (#rows – 1)(#columns – 1); for this table, df = (2 – 1)(3 – 1) = 2.
Condition: All expected counts should be greater than 5 (preferably greater than 10 or 15).

Cell Chi-Square
With whom is it easiest to make friends?

          Opposite Gender   Same Gender   No Difference
Female    1.7405            0.5896        0.4945
Male      3.5065            1.1880        0.9962

Total χ² = 8.515, Prob > χ² = 0.0142

Hypotheses
H0: Opinion is independent of gender
HA: Opinion is not independent of gender

Test Statistic/P-value
χ² = 8.515
P-value = 0.0142
The P-value is less than 0.05; therefore, reject the null hypothesis, H0.

Conclusion
Opinion on with whom it is easiest to make friends and gender are not independent.

Inference
For students taking introductory statistics at a large public university, females are more likely than males to say it is easiest to make friends with someone of the opposite gender.

Inference
For students taking introductory statistics at a large public university, males are more likely than females to say it is easiest to make friends with someone of the same gender.
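The expected counts, cell chi-square contributions, test statistic, and P-value from the slides above can be reproduced with a short calculation. This is a sketch under stated assumptions, not the method JMP uses internally: the function name `pearson_chi_square` is hypothetical, and the P-value uses the closed form exp(-x/2), which is the upper tail of the chi-square distribution only when df = 2 (as it is for this 2 × 3 table). For other degrees of freedom a statistics library would be needed.

```python
import math

# Observed counts (rows: gender, columns: answer).
observed = {
    "Female": {"Opposite": 58, "Same": 16, "No Difference": 63},
    "Male": {"Opposite": 15, "Same": 13, "No Difference": 40},
}


def pearson_chi_square(table):
    """Return expected counts, cell contributions, the statistic, and df."""
    row_totals = {r: sum(cells.values()) for r, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    grand_total = sum(row_totals.values())

    # Expected count under independence: row total * column total / grand total.
    expected = {
        r: {c: row_totals[r] * col_totals[c] / grand_total for c in col_totals}
        for r in table
    }
    # Each cell contributes (Observed - Expected)^2 / Expected.
    cell = {
        r: {c: (table[r][c] - expected[r][c]) ** 2 / expected[r][c] for c in col_totals}
        for r in table
    }
    stat = sum(sum(contribs.values()) for contribs in cell.values())
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, cell, stat, df


expected, cell, stat, df = pearson_chi_square(observed)

# Closed-form upper-tail probability; valid ONLY because df == 2 here.
assert df == 2
p_value = math.exp(-stat / 2)

print("chi-square:", round(stat, 3), "df:", df, "P-value:", round(p_value, 4))
```

The largest contributions come from the Opposite Gender column (1.7405 and 3.5065), which is where the observed counts depart most from what independence would predict.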
JMP
Gender   Opinion           Count
Female   Opposite Gender   58
Female   Same Gender       16
Female   No Difference     63
Male     Opposite Gender   15
Male     Same Gender       13
Male     No Difference     40

Analyze – Fit Y by X
Y, Response: Opinion
X, Factor: Gender
Freq: Count

Contingency Analysis of Opinion By Gender
Freq: Count
[Mosaic plot of Opinion (No Difference, Opposite Gender, Same Gender) by Gender (Female, Male); vertical axis: proportion from 0.00 to 1.00]

Contingency Analysis of Opinion By Gender
Freq: Count
Contingency Table (rows: Gender, columns: Opinion)

                       No Difference   Opposite Gender   Same Gender   Total
Female   Count         63              58                16            137
         Row %         45.99           42.34             11.68
         Expected      68.8341         48.7854           19.3805
         Cell Chi^2    0.4945          1.7405            0.5896
Male     Count         40              15                13             68
         Row %         58.82           22.06             19.12
         Expected      34.1659         24.2146           9.61951
         Cell Chi^2    0.9962          3.5065            1.1880
Total    Count         103             73                29            205

Tests
N     DF   -LogLike    RSquare (U)
205   2    4.4258056   0.0218

Test               ChiSquare   Prob>ChiSq
Likelihood Ratio   8.852       0.0120*
Pearson            8.515       0.0142*

Conditions
All the expected counts are greater than 5. It would be better if all the expected counts were greater than 10 or 15.
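The JMP Tests report lists a Likelihood Ratio chi-square (8.852) alongside the Pearson value, and the Conditions slide checks the expected counts. The sketch below reproduces both from the counts, assuming the standard likelihood ratio statistic G² = 2 Σ Observed × ln(Observed / Expected). The reported -LogLike of 4.4258056 is numerically consistent with G² / 2, which is how it is computed here; that reading of the JMP output is an assumption. As before, the exp(-x/2) P-value is valid only because df = 2.

```python
import math

# Observed counts; columns are Opposite Gender, Same Gender, No Difference.
observed = [
    [58, 16, 63],  # Female
    [15, 13, 40],  # Male
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Likelihood ratio statistic. Every observed count is positive here;
# a zero count would need special handling before taking the log.
g2 = 2 * sum(
    o * math.log(o / e)
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
neg_log_like = g2 / 2          # assumed to correspond to JMP's -LogLike
p_value = math.exp(-g2 / 2)    # upper-tail probability, valid ONLY for df = 2

# Condition check: smallest expected count in the table.
min_expected = min(min(row) for row in expected)

print("G^2:", round(g2, 3), "P-value:", round(p_value, 4))
print("smallest expected count:", round(min_expected, 4))
```

The smallest expected count belongs to the Male, Same Gender cell, which is why the condition is met at the "greater than 5" level but not at the stricter "greater than 10 or 15" level.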