Statistics 103
Probability and Statistical Inference
Instructions for Lab 10

Lab Objective

Perform chi-squared goodness of fit and independence tests using JMP.

Lab Procedures

In the U.S., you are supposed to be tried by a jury of your peers. Does this really happen in practice? A study in the UCLA Law Review (1973) of grand juries in Alameda County, California, compared the demographic characteristics of a random sample of jurors with the general population. Below are the data for age and educational level. Only persons 21 and over are considered; the population data are known from the Public Health Department.

Age        County-wide %    # of jurors
---------------------------------------------
21-40           42               5
41-50           23               9
51-60           16              19
> 61            19              33
---------------------------------------------
Total          100              66

Educational Level    County-wide %    # of jurors
------------------------------------------------------
Elementary               28.4              1
Secondary                48.5             10
Some college             11.9             16
College degree           11.2             35
------------------------------------------------------
Total                   100.0             62

Questions:

1) Test whether the juries appear to be randomly selected with respect to the distribution of age. Report the sample and population percentages in each age group, the value of the chi-squared test statistic and its degrees of freedom, the p-value, and your conclusion.

To perform a chi-squared goodness of fit test in JMP:

1. Enter the data in two columns, the first containing the age categories and the second containing the counts in those categories (not the percentages). Make the column of labels a character variable and the column of counts a continuous variable.

2. Select Analyze-Distribution. Enter the name of the column with labels as the Y variable, and enter the column of counts as the Freq variable. This tells JMP that the column of counts is the frequency of each category. Hit OK to get the sample percentages in each age category.

3. On the red arrow next to the variable name, select Test Probabilities.
Enter the probabilities from the null hypothesis where indicated, and select Done. The output for the chi-squared goodness of fit test is in the row labeled "Pearson." The first entry is the value of the chi-squared test statistic; the second entry is the degrees of freedom (number of categories - 1); and the last entry is the p-value from the appropriate chi-squared distribution.

2) Perform the chi-squared goodness of fit test for the education data by hand. That is, show in your report the null hypothesis, the four pieces of the chi-squared test statistic including all values of (observed - expected)^2/expected, the degrees of freedom, the p-value, and your conclusions. You can use JMP to check your answer, but all of the by-hand work must appear in your report to get full credit.

Unit 2: Independence tests

Do people's opinions of their appearance change with age? In a survey reported in Newsweek magazine (Spring/Summer 1999), 747 randomly selected women were asked, "How satisfied are you with your overall appearance?" The numbers of women who chose each of four answers are shown in the table below.

Age         Very    Somewhat    Not Too    Not At All
-----------------------------------------------------------
Under 30     45        82         10           4
30 - 49      73       168         47           6
Over 50     106       153         41          12
-----------------------------------------------------------

Questions:

3) Test the null hypothesis that women's satisfaction with their appearance is not associated with age. Report the sample percentages in each age group, the value of the chi-squared test statistic and its degrees of freedom, the p-value, and your conclusion.

To perform a chi-squared independence test in JMP:

1. Enter the data in three columns, the first containing the age labels, the second containing the satisfaction labels, and the third containing the counts in those categories (not the percentages). You should have 12 rows total in the dataset.
There should be one row corresponding to each cell (e.g., the first row has "under 30" in the age column, "very" in the satisfaction column, and 45 in the count column). Make the columns of labels character variables and the column of counts a continuous variable.

2. Select Analyze-Fit Y by X. Enter the name of the column labels (satisfaction) as the Y variable, the row labels (age) as the X variable, and the column of counts as the Freq variable. Hit OK to get a contingency table of percentages in each category.

3. The output from the chi-squared test of independence is in the row labeled "Pearson." The first entry is the value of the chi-squared test statistic; the second entry is the degrees of freedom; and the last entry is the p-value from the appropriate chi-squared distribution.

In the contingency table, there are three percentages below each count. The top one in each cell is the percentage of units in the entire data set that fall in that cell of the table. The middle one in each cell is the percentage of units in the row, given that they are in the column. The bottom one in each cell is the percentage of units in the column, given that they are in the row. You can see the expected count in each cell by clicking on the red arrow next to Contingency Table and selecting Expected.

4) Assuming the null hypothesis is true, obtain by hand the expected number of women under age 30 in a random sample of 747 women who would be very satisfied with their appearance. Show exactly what you multiplied together to obtain the expected count.
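
As a cross-check on the JMP output, the two Pearson tests used in this lab can be sketched in Python. This is a minimal sketch and not part of the lab procedure: the function names (chi2_sf, goodness_of_fit, independence_test) are illustrative assumptions, only the standard library is used, and the p-value routine assumes the degrees of freedom are a positive integer. The data are the age counts and the appearance table given above. It does not replace the by-hand work required in questions 2 and 4.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable.

    Assumes df is a positive integer (closed-form series for that case).
    """
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df = 2m + 1: start from the df = 1 tail and add m correction terms.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total


def goodness_of_fit(observed, null_probs):
    """Pearson goodness of fit: returns (statistic, df, p-value)."""
    n = sum(observed)
    expected = [n * p for p in null_probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories - 1
    return stat, df, chi2_sf(stat, df)


def independence_test(table):
    """Pearson independence test: returns (statistic, df, p-value, expected)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df), expected


# Age data: juror counts and county-wide proportions from the first table.
age_counts = [5, 9, 19, 33]
age_probs = [0.42, 0.23, 0.16, 0.19]
gof_stat, gof_df, gof_p = goodness_of_fit(age_counts, age_probs)
print("goodness of fit:", gof_stat, gof_df, gof_p)

# Appearance data: rows are age groups, columns are satisfaction levels.
appearance = [
    [45, 82, 10, 4],     # Under 30
    [73, 168, 47, 6],    # 30 - 49
    [106, 153, 41, 12],  # Over 50
]
ind_stat, ind_df, ind_p, ind_expected = independence_test(appearance)
print("independence:", ind_stat, ind_df, ind_p)
```

The statistic and degrees of freedom printed here should agree with the "Pearson" row in JMP; small differences in the last digits of the p-value are possible because of rounding.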