HOW MUCH SLEEP DID YOU GET LAST NIGHT?
1. <6
2. 6
3. 7
4. 8
5. 9
6. >9

CHAPTER 17
Testing Hypotheses and Confidence Intervals

ISU ACT MATH SCORES - 2012
New students on campus
Averages
  ISU Math Score = 23.5
  State of Illinois = 21.0
  National = 21.1
SD = 5.3
New students at ISU = 5,147

HYPOTHESIS TESTING USING P-VALUE
P-value: Assuming the null hypothesis is true, the probability of observing a sample result at least as extreme as the one we collected.
If the p-value is small, either
1) Ho is probably wrong and Ha is probably right, or
2) we really do have an odd sample.
Conclusion
P-value small => reject null
P-value large => fail to reject null

APPROACHES TO HYPOTHESIS TESTING
P-value vs. Alpha Level
  One-sided tests (using our z-tables)
Z-score vs. Critical Z*
  One-sided tests
  Two-sided tests
Confidence Intervals
  Two-sided tests

HYPOTHESIS TEST USING Z-SCORE VS. P-VALUE
|Z-score| > Critical Z*    Reject the null hypothesis
|Z-score| < Critical Z*    Fail to reject the null hypothesis
P-value < alpha            Reject the null hypothesis
P-value > alpha            Fail to reject the null hypothesis

CRITICAL VALUES AGAIN (CONT.)
Here are the traditional critical values from the Normal model:
alpha    1-sided    2-sided
0.05     1.645      1.96
0.01     2.33       2.576
0.001    3.09       3.29

THE TRUE PROPORTION WHO PASS THE AP EXAM IS 18%. IN THE AP INCENTIVE PROGRAM, WHICH 225,000 STUDENTS TOOK, 20% OF THE STUDENTS PASS THE AP EXAM.
The researchers believe that students in the AP Incentive Program are better than most students at passing the AP exam.
Does the program work? That is, does the program increase the likelihood of passing the AP exam?

WHAT IS THE APPROPRIATE HYPOTHESIS TEST?
1. Ho: p_Program = 0.18    Ha: p_Program > 0.18
2. Ho: p_Program = 0.18    Ha: p_Program < 0.18
3. Ho: p_Program = 0.18    Ha: p_Program ≠ 0.18
4. Ho: p_Program = 0.20    Ha: p_Program > 0.20
5. Ho: p_Program = 0.20    Ha: p_Program < 0.20
6. Ho: p_Program = 0.20    Ha: p_Program ≠ 0.20

DOES THE PROGRAM WORK? TEST AT THE 0.05 SIGNIFICANCE LEVEL
1. Yes, there is enough evidence to suggest that the program works.
2. Yes, there is NOT enough evidence to suggest that the program works.
3. No, there is enough evidence to suggest that the program works.
4. No, there is NOT enough evidence to suggest that the program works.

RADIOACTIVE FALLOUT FROM TESTING ATOMIC BOMBS DRIFTED ACROSS A REGION. THERE WERE 240 PEOPLE IN THE REGION AT THE TIME. 46 DIED OF CANCER. CANCER EXPERTS ESTIMATE THAT ABOUT 28 CANCER DEATHS ARE NORMAL FOR A GROUP THIS SIZE. IS THE DEATH RATE FOR THIS GROUP UNUSUALLY HIGH?

WHAT ARE OUR HYPOTHESES?
1. H0: p_region = 0.1167    HA: p_region > 0.1167
2. H0: p_region = 0.1167    HA: p_region < 0.1167
3. H0: p_region = 0.1167    HA: p_region ≠ 0.1167
4. H0: p_region = 0.1917    HA: p_region > 0.1917
5. H0: p_region = 0.1917    HA: p_region < 0.1917
6. H0: p_region = 0.1917    HA: p_region ≠ 0.1917

WHAT ALPHA SHOULD WE CHOOSE?
1. α = 0.10
2. α = 0.05
3. α = 0.01
4. α = 0.001

WHAT IS OUR P-VALUE?
1. 0.9999
2. 0.0001
3. 3.622
4. 0.1917

IS THE DEATH RATE FOR THIS GROUP UNUSUALLY HIGH?
1. P-value is low enough to conclude that the death rate is unusually high
2. P-value is too low to conclude that the death rate is unusually high
3. P-value is too high to conclude that the death rate is unusually high
4. P-value is high enough to conclude that the death rate is unusually high

WE CONCLUDED THAT THE DEATH RATE WAS UNUSUALLY HIGH. BUT DOES IT PROVE THAT EXPOSURE TO RADIATION INCREASES THE RISK OF CANCER?
1. No, there is not enough evidence.
2. Yes, there is enough evidence.
3. Whether the death rate by cancer is unusually high or not, the CAUSE cannot be determined.

P-VALUES AND SIGNIFICANCE LEVELS
Significant at 0.01 => the test is also significant at 0.05 and 0.10
Significant at 0.05 => the test is also significant at 0.10

A MARKET RESEARCHER CONCLUDED THAT PEOPLE LIKE 4LOCO SIGNIFICANTLY BETTER THAN SPARKS. HIS DECISION WAS BASED ON ALPHA=0.025. WOULD HIS DECISION HAVE BEEN DIFFERENT UNDER ALPHA=0.20?
1. Yes, he still would have rejected the null.
2. No, he would not have rejected the null.
3. Maybe, we would need to know the p-value.

HIS DECISION WAS BASED ON ALPHA=0.025. WOULD HIS DECISION HAVE BEEN DIFFERENT UNDER ALPHA=0.001?
1. Yes, he still would have rejected the null.
2. No, he would not have rejected the null.
3. Maybe, we would need to know the p-value.

CONFIDENCE INTERVALS AND HYPOTHESIS TESTS
Confidence intervals and hypothesis tests are built from the same calculations.
They have the same assumptions and conditions.
You can approximate a hypothesis test by examining a confidence interval.

ONE-PROPORTION Z-INTERVAL
When the conditions are met, we are ready to find the confidence interval for the population proportion, p.
The confidence interval is $\hat{p} \pm z^* \times SE(\hat{p})$, where $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$.
The critical value, z*, depends on the particular confidence level, C, that you specify.
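
The one-proportion z-interval above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `one_proportion_z_interval` is made up for this example, the data are the fallout figures from earlier (46 cancer deaths among 240 people), and z* = 1.96 is the 95% critical value. The conditions for the Normal model are assumed to hold.

```python
import math

def one_proportion_z_interval(successes, n, z_star):
    """Return (lower, upper) for p-hat +/- z* x SE(p-hat).

    Hypothetical helper for illustration; assumes the conditions
    for the Normal model are met.
    """
    p_hat = successes / n
    q_hat = 1 - p_hat
    se = math.sqrt(p_hat * q_hat / n)  # SE(p-hat) = sqrt(p-hat * q-hat / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin

# Fallout example: 46 cancer deaths among 240 people, 95% confidence (z* = 1.96)
lower, upper = one_proportion_z_interval(46, 240, 1.96)
print(f"95% CI: ({lower:.4f}, {upper:.4f})")

# The expected rate of 28 out of 240 can then be compared with the interval
expected_rate = 28 / 240
print("Expected rate inside interval:", lower <= expected_rate <= upper)
```

The interval runs from roughly 0.14 to 0.24, so the expected rate of about 0.1167 falls below it. A confidence interval corresponds to a two-sided test, so this is only an approximate check on the one-sided question about the fallout data.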
CONFIDENCE INTERVALS AND HYPOTHESIS TESTS
Construct the CI based on the sample.
If the hypothesized value falls in the CI, fail to reject the null hypothesis.
If the hypothesized value is outside of the CI, reject the null hypothesis.

CRITICAL Z* AND SIGNIFICANCE LEVEL FOR TWO-SIDED TEST
α = 0.20     CI = 80%      z* = 1.282
α = 0.10     CI = 90%      z* = 1.645
α = 0.05     CI = 95%      z* = 1.96
α = 0.02     CI = 98%      z* = 2.326
α = 0.01     CI = 99%      z* = 2.576
α = 0.001    CI = 99.9%    z* = 3.29

EXAMPLE
A magazine reported the results of a random telephone poll examining what respondents use as their measure of success.
The poll surveyed 1085 men.
28 said their measure of success was through work.

SUPPOSE WE WISH TO SEE IF THE FRACTION HAS FALLEN BELOW THE 5% MARK. WHAT DOES YOUR 99.9% CI INDICATE?
1. 5% is in the interval
2. 5% is NOT in the interval

SUPPOSE WE WISH TO SEE IF THE FRACTION HAS FALLEN BELOW THE 5% MARK. WHAT DOES YOUR CI INDICATE?
1. 5% is in the interval, there is strong evidence that MORE than 5% of men use work as their measure of success
2. 5% is in the interval, there is strong evidence that FEWER than 5% of men use work as their measure of success
3. 5% is NOT in the interval, there is strong evidence that MORE than 5% of men use work as their measure of success
4. 5% is NOT in the interval, there is strong evidence that FEWER than 5% of men use work as their measure of success

MAKING ERRORS
When we perform a hypothesis test, we can make mistakes in two ways:
I. The null hypothesis is true, but we mistakenly reject it. (Type I error)
II. The null hypothesis is false, but we fail to reject it. (Type II error)

MAKING ERRORS (CONT.)
Which type of error is more serious depends on the situation at hand. In other words, the gravity of the error is context dependent.
Here's an illustration of the four situations in a hypothesis test:

POWER OF THE TEST
The power of a test is the probability that it correctly rejects a false null hypothesis.

A BASKETBALL PLAYER WITH A POOR FOUL-SHOT RECORD PRACTICES INTENSIVELY DURING THE OFF-SEASON.
He claims he improved his proficiency from 40% to 50%.
Skeptical, the coach asks him to take 10 shots, and is surprised that he makes 9 out of 10.
Assume Ho: p = 0.4    Ha: p > 0.4

IF THE SHOOTER HAS NOT IMPROVED FROM 40%, BUT STILL MANAGES TO MAKE 9 OF 10, THEN THE COACH WILL THINK HE HAS IMPROVED. WHAT TYPE OF ERROR IS THE COACH MAKING?
1. Type I: reject null even though it is true
2. Type II: fail to reject null even though it is false

IF THE PLAYER REALLY CAN HIT 50%, AND IT TAKES AT LEAST 9 OUT OF 10 SUCCESSFUL SHOTS TO CONVINCE THE COACH, WHAT IS THE POWER OF THE TEST?

UPCOMING WORK
HW #9 due Sunday
Part 3 of Data Project due Monday
Quiz #5 in class next Wednesday
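
Returning to the foul-shot example, the power question can be worked out directly. The sketch below is an illustration rather than part of the original slides. It assumes the 10 shots are independent with a constant success probability (a binomial model, which the slides do not state explicitly), and the helper name `prob_at_least` is made up for this example. It uses the rule given above: the coach is convinced only by at least 9 makes out of 10.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p). Hypothetical helper for illustration."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Decision rule from the example: reject Ho (p = 0.4) if the player makes >= 9 of 10
n_shots, cutoff = 10, 9

# Probability of a Type I error: the player is still a 40% shooter but makes >= 9
type_i_prob = prob_at_least(cutoff, n_shots, 0.4)

# Power: the player really is a 50% shooter and makes >= 9, so Ho is correctly rejected
power = prob_at_least(cutoff, n_shots, 0.5)

# Probability of a Type II error under p = 0.5 is the complement of the power
type_ii_prob = 1 - power

print(f"P(Type I error) = {type_i_prob:.4f}")
print(f"Power           = {power:.4f}")
print(f"P(Type II error) = {type_ii_prob:.4f}")
```

Under these assumptions the power is 11/1024, or about 0.011. A rule this strict almost never rejects the null when it is true, but it also rarely detects a real improvement to 50%.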