Example. Men, Women, and the Relative Importance of Personality versus Looks.

In our January 14 class survey, one question was: "In deciding whether to date someone, rate the relative importance of personality versus looks on a scale of 0 (personality most important) to 9 (looks most important)." I created three categories for the "relative importance" responses: 0 to 3, 4 to 5, and 6 to 9. Table 14.1 compares the 68 men and 136 women in the survey with regard to these categories. Within a cell, the top number is a count and the number below it is the row percent for that category.

Table 14.1 Gender and Relative Importance of Personality versus Looks

                 Relative Importance
          0-3       4-5       6-9       All
female    31        85        20        136
          22.79%    62.50%    14.71%
male      7         42        19        68
          10.29%    61.76%    27.94%

Figure 14.1 is a bar chart of the row percentages. This graph clearly shows a different pattern of responses for men and women.

Figure 14.1

A Chi-square procedure can be used to analyze the statistical significance of the relationship between two categorical variables. Generally, the null hypothesis is that there is no relationship. Within a specific problem, we may write a more elaborate version of this null hypothesis. For instance, in our example the null and alternative hypotheses could be expressed as:

H0: No difference between men and women with regard to the distribution of responses in the "Importance Categories" (in other words, relative importance of looks and sex are not related)

Ha: There is a difference between men and women with regard to the distribution of responses

Table 14.2 shows the Minitab output for the Chi-square test of these hypotheses. We see that the p-value of the Chi-square test is 0.019. Because this is below the conventional 0.05 level, it is a "statistically significant" result, so we can declare a difference between men and women. This conclusion applies to the populations of men and women represented by the samples from our class.

Table 14.2 Chi-square Test for Relationship Between Gender and Relative Importance of Personality versus Looks

                 Relative Importance
          0-3       4-5       6-9       All
female    31        85        20        136
          25.33     84.67     26.00     136.00
male      7         42        19        68
          12.67     42.33     13.00     68.00
All       38        127       39        204
          38.00     127.00    39.00     204.00

Chi-Square = 7.960, DF = 2, P-Value = 0.019

Some Technical Details of the Chi-Square Procedure

• The null hypothesis is that there is no relationship between two categorical variables.

• The Chi-square statistic measures the difference between the observed counts in the cells of the two-way table and a set of "expected" counts.

• The expected counts are hypothetical counts that would occur if the null hypothesis were true.

Calculation of Expected Counts

For each cell in a two-way table,

$$\text{Expected Count} = \frac{\text{Row Total} \times \text{Column Total}}{\text{Total } n}$$

Example: For the "female, 0-3" cell in our example,

$$\text{Expected Count} = \frac{136 \times 38}{204} = 25.33$$

• The Chi-square statistic is

$$\chi^2 = \sum_{\text{all cells}} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

where ∑ stands for "sum" and the sum is over all cells of the table.

• The p-value is approximately the area to the right of the calculated Chi-square value in a "Chi-square probability distribution" that has degrees of freedom = (rows - 1)(columns - 1). This is the probability that the Chi-square statistic would be as large as it is, or larger, if the null hypothesis were true.

In our example, there are two rows and three columns, so df = (2 - 1)(3 - 1) = 2. The p-value is the area to the right of the calculated Chi-square of 7.96 (see Table 14.2). This area is the probability that the Chi-square would be 7.96 or larger if the null hypothesis is true. The answer, p-value = 0.019, tells us that we can reject the null hypothesis. Figure 14.2 illustrates the p-value for our example.
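
The hand calculations described above can be checked with a few lines of code. The following is a minimal Python sketch, not the Minitab procedure itself. The names observed and chi_square_test are my own and are purely illustrative. The p-value step relies on a shortcut that is valid only when df = 2, where the area to the right of x under the Chi-square curve equals exp(-x/2); for other degrees of freedom a statistics package or table would be needed.

```python
import math

# Observed counts from Table 14.1
# (rows: female, male; columns: 0-3, 4-5, 6-9).
observed = {
    "female": [31, 85, 20],
    "male": [7, 42, 19],
}


def chi_square_test(table):
    """Return (expected counts, Chi-square statistic, degrees of freedom)."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)

    # Expected count = row total x column total / total n, for each cell.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

    # Sum (observed - expected)^2 / expected over all cells.
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(rows, expected)
        for o, e in zip(obs_row, exp_row)
    )

    df = (len(rows) - 1) * (len(col_totals) - 1)
    return expected, stat, df


expected, stat, df = chi_square_test(observed)

# Assumption: this closed form for the right-tail area holds only for df = 2.
if df != 2:
    raise ValueError("exp(-x/2) shortcut applies only when df = 2")
p_value = math.exp(-stat / 2)

for label, exp_row in zip(observed, expected):
    print(label, [round(e, 2) for e in exp_row])
print("Chi-Square =", round(stat, 3), " DF =", df, " P-Value =", round(p_value, 3))
```

The printed expected counts, Chi-square value, degrees of freedom, and p-value should agree with the Minitab output in Table 14.2 up to rounding.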