More About Tests and Intervals
Ch. 21
AP Statistics

Null Hypothesis
It is a statement about the value of a parameter for a model.

Example p. 482
In Florida, before the change in the helmet law, 60% of youths involved in a motorcycle accident had been wearing their helmets. Three years after the law change, it was observed that 781 youths were involved in a motorcycle accident and only 396 were wearing helmets. Has helmet use in Florida declined among riders under the age of 21 subsequent to the change in helmet laws?

Steps
1. Hypotheses (and define the parameter, p)
2. Model (conditions, and list the test used)
3. Mechanics
4. Conclusion

P-value
• It is the conditional probability of getting results at least as unusual as the observed statistic, given that the null hypothesis is true
• The lower the P-value, the more comfortable we feel about our decision to reject the null hypothesis, but the null doesn’t get any more false

Ex. p. 484
A New England Journal of Medicine paper reported that the 7-year risk of heart attack in diabetes patients taking Avandia was increased from the baseline of 20.2% to an estimated risk of 28.9% and said the P-value was 0.03. How should the P-value be interpreted in this context?

Ex. p. 485
The question of whether the diabetes drug Avandia increased the risk of heart attack was raised by a study in the New England Journal of Medicine. This study estimated the 7-year risk of heart attack to be 28.9% and reported a P-value of 0.03 for a test of whether this risk was higher than the baseline risk of 20.2%. An earlier study (the ADOPT study) had estimated the 7-year risk to be 26.9% and reported a P-value of 0.27. Why did the researchers in the ADOPT study not express alarm about the increased risk they had seen?

Alpha Level
• Also called the significance level
• Common alphas = 0.10, 0.05, 0.01

What does it mean to be statistically significant?
• In large samples, small deviations can be statistically significant
• In small samples, large deviations may not be statistically significant

You can approximate a hypothesis test by using a confidence interval.
A CI is two-sided, so it can be compared with a two-sided HT.

Ex. p. 488
The baseline 7-year risk of heart attack for diabetics is 20.2%. In 2007 a NEJM study reported a 95% confidence interval equivalent to 20.8% to 40% for the risk among patients taking the drug Avandia. What did this confidence interval suggest to the FDA about the safety of the drug?

JC p. 488
1. An experiment to test the fairness of a roulette wheel gives a z-score of 0.62. What would you conclude?
2. In the last chapter we encountered a bank that wondered if it could get more customers to make payments on delinquent balances by sending them a DVD urging them to set up a payment plan. Well, the bank just got back the results on their test of this strategy. A 90% CI for the success rate is (0.29, 0.45). Their old send-a-letter method had worked 30% of the time. Can you reject the null hypothesis that the success rate is still 30% at the 5% level? Explain.
3. Given the CI the bank found, what would you recommend that they do? Should they scrap the DVD strategy?

Example p. 488
Teens are at the greatest risk of being killed or injured in traffic crashes. According to the National Highway Traffic Safety Administration, 65% of young people killed were not wearing a safety belt. In 2001, a total of 3322 teens were killed in car accidents, an average of 9 teens a day. Because many of these deaths could have been easily prevented by the use of safety belts, many states have begun “Click It or Ticket” campaigns in which increased enforcement and publicity have resulted in significantly higher seatbelt use. Overall use in Massachusetts quickly increased from 51% in 2002 to 64.8% in 2006, with a goal of surpassing the national average of 82%.
Recently, a local newspaper reported that a roadblock resulted in 23 tickets to drivers who were unbelted out of 134 stopped for inspection. Does this provide evidence that the goal of over 82% compliance was met? Use a CI and a HT.

Type I Error
Rejecting the null hypothesis when it is true.
Its probability is alpha. When alpha is not given, find the probability of rejecting a true null hypothesis from the context of the question.
This is like getting a false positive test result (you are really healthy but are diagnosed with a disease).

Type II Error
Failing to reject the null hypothesis when it is false.
If the probability of a Type II error is too large, you will have to take a larger sample to reduce the chance of error.
This is like getting a false negative test result (you are really sick but are not diagnosed with the disease).

Power
The probability of rejecting the null hypothesis when it is false.
Power and the probability of a Type II error are complements of each other.
If you increase n, you increase the power.

A published study found the risk of heart attack to be increased in patients taking Avandia for diabetes. An article said, “A few events either way might have changed the findings for heart attack or for death from cardiovascular causes. In this setting, the possibility that the findings were due to chance cannot be excluded.” What kind of error would the researchers have made if, in fact, their findings were due to chance? What could be the consequences of this error?

The study of Avandia published in NEJM combined results from 47 different trials, a method called meta-analysis. The drug’s manufacturer issued a statement saying, “Each study is designed differently and looks at unique questions: For example, individual studies vary in size and length, in the type of patients who participated, and in the outcomes they investigate.” Nevertheless, by combining data from many studies, meta-analyses can achieve a much larger sample size. How could this larger sample size help?

JC p. 494
1. Remember our bank that’s sending out DVDs to try to get customers to make payments on delinquent loans? It is looking for evidence that the costlier DVD strategy produces a higher success rate than the letters it has been sending. Explain what a Type I error is in this context and what the consequences would be to the bank.
2. What’s a Type II error in the bank experiment context and what would the consequences be?
3. For the bank, which situation has higher power: a strategy that works really well, actually getting 60% of people to pay off their balances, or a strategy that barely increases the payoff rate to 32%? Explain.

• A larger sample size decreases the probability of a Type II error and increases power
• Power is the complement of the probability of a Type II error
• If you reduce alpha, you increase the probability of a Type II error and decrease the power
• The larger the real difference between the hypothesized value and the true population proportion, the smaller the chance of making a Type II error and the greater the power of the test
• To reduce the standard deviation of the sampling distribution, use a larger sample size

#32
You are in charge of shipping computers to customers. You learn that a faulty disk drive was put into some of the machines. There’s a simple test you can perform, but it’s not perfect. All but 4% of the time, a good disk drive passes the test, but unfortunately, 35% of the bad disk drives pass the test too. You have to decide on the basis of one test whether the disk drive is good or bad. Frame this as a hypothesis test.
1. What are the hypotheses?
2. Given that a computer fails the test, what would you decide? What if it passes the test?
3. How large is alpha for this test?
4. What is the power of this test?

#25
A company is sued for job discrimination because only 19% of the newly hired candidates were minorities when 27% of all applicants were minorities. Is this strong evidence that the company’s hiring practices are discriminatory?
1. Is this a one-sided or two-sided test? Why?
2. What would a Type I error be?
3. What would a Type II error be?
4. What is meant by the power of the test for this problem?
5. If the hypothesis is tested at the 5% level instead of the 1% level, how will this affect the power of the test?
6. The lawsuit is based on the hiring of 37 employees. Is the power of the test higher than, lower than, or the same as it would be if it were based on 87 hires?
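
The relationships among alpha, sample size, and power raised in #25 can be explored numerically. The sketch below is only an illustration, not the textbook's method: it approximates the power of a lower-tail one-proportion z-test with the Normal model. The function name `power_lower_tail` is made up for this example, and the "true" proportion of 0.19 is an assumption (power can only be computed for one specific alternative, and the exercise does not give one). With n = 37 the value of n times p0 is just under 10, so the Normal approximation is rough here.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal model: mean 0, sd 1


def power_lower_tail(p0, p_true, n, alpha):
    """Approximate power of a lower-tail one-proportion z-test.

    Tests H0: p = p0 against HA: p < p0 using the Normal approximation.
    p_true is a hypothetical value for the real proportion (an assumption).
    """
    # Reject H0 when the sample proportion falls below this cutoff.
    z_star = STD_NORMAL.inv_cdf(1 - alpha)
    se_null = sqrt(p0 * (1 - p0) / n)
    cutoff = p0 - z_star * se_null

    # Chance that p-hat lands below the cutoff when p really equals p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return STD_NORMAL.cdf((cutoff - p_true) / se_true)


P0 = 0.27      # proportion of minority applicants, from exercise #25
P_TRUE = 0.19  # ASSUMPTION: hypothetical true hiring proportion

for n in (37, 87):
    for alpha in (0.01, 0.05):
        power = power_lower_tail(P0, P_TRUE, n, alpha)
        print(f"n = {n}, alpha = {alpha}: power is about {power:.3f}, "
              f"P(Type II error) is about {1 - power:.3f}")
```

Under these assumptions, the printed power should rise when n goes from 37 to 87 and when alpha goes from 0.01 to 0.05, which is the pattern the summary bullets describe. Changing `P_TRUE` to a value farther from 0.27 shows the effect of a larger real difference.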