WARM-UP EXAMPLE: A survey of randomly selected college students age 21 years and younger found that 411 of 1012 men and 535 of 1062 women enjoyed college statistics. Is there evidence that the proportion of men who enjoy college statistics differs from that of women? (α = 0.05)

p_i = The true proportion of students who enjoy college statistics (p_m = men, p_w = women).

H0: p_m = p_w
Ha: p_m ≠ p_w

TWO Proportion z-Test

$$z = \frac{\hat{p}_m - \hat{p}_w}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_m} + \frac{1}{n_w}\right)}} = -4.4626$$

where $\hat{p} = (411 + 535)/(1012 + 1062)$ is the pooled sample proportion.

CONDITIONS
1. SRS: The data were collected randomly.
2. Population of Men is ≥ 10 · (1012) and Population of Women is ≥ 10 · (1062).
3. 1012 · (0.41) ≥ 10 AND 1012 · (1 − 0.41) ≥ 10
   1062 · (0.50) ≥ 10 AND 1062 · (1 − 0.50) ≥ 10

P-Value = 2 · P(Z ≤ −4.4626) = 2 · normalcdf(−E99, −4.4626) ≈ 0

Since the P-Value is less than α = 0.05, the data IS significant. There is STRONG evidence to REJECT H0. The proportion of men who enjoy college statistics does differ from that of women.

What is the Statistical Inference Test you would use if you needed to determine if there was a Significant Difference between Three or More Proportions?

Multiple Proportions
Chapter 26: Chi-Square(d) or χ²-Test

The P-Values/Probabilities for the χ²-Test come from a family of Chi-Square Distributions, which take only Positive Values and are all Skewed Right. A specific distribution is specified by a parameter called the Degrees of Freedom (df). (Degrees of Freedom = df = n − 1)

Calculating the χ² Test Statistic:

$$\chi^2 = \sum \frac{(\text{Observed Count} - \text{Expected Count})^2}{\text{Expected Count}} = \sum \frac{(O - E)^2}{E}$$

Calculating the χ² P-Value:

P-Value = P(x ≥ χ²) = χ²cdf(χ², E99, df)

Or find the appropriate line on the χ² Table.

Find the P-Value for a Chi-Square Statistic = 12.132 with df = 6.
P-Value = χ²cdf(12.132, E99, 6) = 0.0591

Ch. 26 - Multiple Proportions
There are THREE types of Chi-Square Tests:
1. The Chi-Square Test for Goodness of Fit.
2. The Chi-Square Test for Independence.
3. The Chi-Square Test for Homogeneity.
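The calculator commands used so far (normalcdf and χ²cdf) can also be checked in software. What follows is a minimal sketch in Python using only the standard library, not the method the slides prescribe. The function names are hypothetical, and the chi-square tail function is deliberately limited to even df, where a closed-form sum exists; that is enough for the df = 6 example but not for every test in this chapter.

```python
import math


def normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and its two-sided P-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled sample proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * normal_cdf(-abs(z))


def chi2_upper_tail_even_df(x, df):
    """P(chi-square >= x). Assumption: df is even (closed-form sum only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles even, positive df")
    half = x / 2
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term          # adds (x/2)^j / j!
        term *= half / (j + 1)
    return math.exp(-half) * total


# Warm-up data: 411 of 1012 men, 535 of 1062 women.
z, p_two_sided = two_proportion_z(411, 1012, 535, 1062)
# Chi-square statistic 12.132 with df = 6, as in the P-value example.
p_chi2 = chi2_upper_tail_even_df(12.132, 6)

print(f"z = {z:.4f}, two-sided P-value = {p_two_sided:.2e}")
print(f"chi-square P-value = {p_chi2:.4f}")
```

The two-sided P-value is tiny but not literally zero; the "≈ 0" on the slide is the calculator's rounded display.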
The Chi-Square Test for GOODNESS OF FIT

A test of whether the distribution of Counts in one categorical variable matches the distribution predicted (expected) by a model.
(Degrees of Freedom = df = n − 1, where n = # of levels of the Category.)

CONDITIONS
1. SRS
2. All Expected Counts are 1 or greater.
3. No more than 20% of the Expected Counts are less than 5.

EXAMPLE: Is there one month of the year that stands out as having more births than the others? If births were distributed uniformly across the year, we would expect 1/12 of them to occur each month. To test the claim, birth data was randomly collected and compiled.

           JAN. FEB. MAR. APR. MAY  JUN. JUL. AUG. SEP. OCT. NOV. DEC.
OBS. DATA   75   87   91   88   76   98   87   74   81   70   74   83
EXP. DATA   82   82   82   82   82   82   82   82   82   82   82   82

χ² Goodness of Fit Test

H0: Births are uniformly distributed over the year.
Ha: Births are NOT uniformly distributed over the year.
or
H0: Proportion of births in Jan. = Feb. = Mar. = ··· = Dec.
Ha: The proportions of births are not all equal (not all p_i's are equal).

$$\chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = \sum \frac{(\text{Obs} - 82)^2}{82} = 9.5366$$

P-Value = χ²cdf(9.5366, E99, 11) = 0.5725

Since the P-Value is NOT less than α = 0.05, we Fail to REJECT H0. There is no evidence that births are NOT uniformly distributed throughout the year.

CONDITIONS
1. SRS √
2. All Expected Counts are 1 or greater. √ (every expected count is 82)
3. No more than 20% of the Expected Counts are less than 5. √
OR…
CONDITIONS
1. SRS √
2. All Expected Counts are greater than 5. √

EXAMPLE: The NY Civil Liberties Union feels that the NYC Police Dept. is not hiring a force whose ethnic composition represents the city. NYC is 29.2% White, 28.3% Black, 31.5% Latino, 9.1% Asian, and 2% other. If the NYC Police Dept. is composed of the following, does the Union have a case?

           White    Black    Latino   Asian    Other
OBS. DATA  8560     7120     2762     1852     560
EXP. DATA  6089.4   5901.7   6569     1897.7   417.08

χ² Goodness of Fit Test

H0: The Police Dept. represents the population of NYC.
Ha: The Police Dept.
does NOT represent the population of NYC.

$$\chi^2 = \sum \frac{(\text{Obs} - \text{Exp})^2}{\text{Exp}} = 3510.3$$

P-Value = χ²cdf(3510.3, E99, 4) ≈ 0

Since the P-Value is less than α = 0.05, the data IS significant. There is STRONG evidence to REJECT H0. The hiring practice of the NYC Police Dept. does NOT represent the ethnic composition of NYC.

CONDITIONS
1. SRS X
2. All Expected Counts are 1 or greater. √ (expected counts: 6089.4, 5901.7, 6569, 1897.7, 417.08)
3. No more than 20% of the Expected Counts are less than 5. √

Homework: Page 628: #3, 10
Homework: Page 628: #10
Homework: Page 628: #3-5, 9

WARM-UP (Matching)

Name the Statistical Inference Test you would use if you need to determine if there was a Significant Difference between… (σ unknown)
1. _b_ The Quantitative Means of Two Independent Samples.
2. _d_ The Proportion of a Sample and a Stated Proportional Value.
3. _a_ The Quantitative Mean of a Sample and a Stated Mean.
4. _e_ The Proportions of Two Independent Samples.
5. _c_ The Quantitative Means of Two Dependent Samples.

a.) One Sample T-Test
b.) Two Sample T-Test
c.) One Sample Matched Pairs Test
d.) One Proportion z-Test
e.) Two Proportion z-Test

In a survey of 1785 students, 750 indicated that they cheat on tests. Do the results provide good evidence that less than half of students cheat? What type of error could result?

1. p = The true proportion of students who cheat on tests.
2. H0: p = 0.50
   Ha: p < 0.50
3. One Proportion z-Test

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.420 - 0.50}{\sqrt{\frac{0.5(1 - 0.5)}{1785}}}$$

4. CONDITIONS
   1. SRS: Not stated.
   2. Population of students ≥ 10(1785)
   3. 1785(0.50) ≥ 10 and 1785(1 − 0.50) ≥ 10
5. z = −6.7457, P-Value ≈ 0
6. Since the P-Value is less than α = 0.05, there is strong evidence to REJECT H0. There is evidence that less than half of students cheat on tests.

Since you rejected H0, you could be making a TYPE I error if, in actuality, the true proportion of cheaters is exactly half.
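The two results above (χ² = 3510.3 for the police-department data and z = −6.7457 for the cheating survey) can be re-derived from the raw counts. This is a rough sketch in standard-library Python with hypothetical function names. It assumes the expected counts are the stated city percentages applied to the observed total, which reproduces the EXP. DATA row of the slide.

```python
import math


def normal_cdf(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def one_proportion_z(successes, n, p0):
    """z statistic for H0: p = p0, using p0 in the standard error."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se


# Police-department example: White, Black, Latino, Asian, Other.
observed = [8560, 7120, 2762, 1852, 560]
shares = [0.292, 0.283, 0.315, 0.091, 0.02]  # city percentages as given
total = sum(observed)
expected = [total * s for s in shares]       # assumption: share x total
chi2_nyc = chi_square_statistic(observed, expected)

# Cheating survey: 750 of 1785 students, H0: p = 0.50, Ha: p < 0.50.
z_cheat = one_proportion_z(750, 1785, 0.50)
p_lower = normal_cdf(z_cheat)                # lower-tail P-value for Ha: p < p0

print(f"expected counts = {[round(e, 2) for e in expected]}")
print(f"chi-square = {chi2_nyc:.1f}")
print(f"z = {z_cheat:.4f}, P-value = {p_lower:.2e}")
```

Because Ha is one-sided (p < 0.50), only the lower tail is used for the P-value, unlike the two-sided warm-up at the start of the chapter.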