ECON/MGMT 201: Applied Statistics
Solution: Final Exam Practice Problems

1. You wish to test whether the reliability of your company’s product differs from the reliability of a competitor’s product. You have collected samples and found the following:

                      Your Product    Competitor’s Product
Sample Size           1200            1000
Number of Failures    83              57

Assume a 5% level of significance. What is your conclusion? What is the 95% confidence interval?

Answer: $H_0: p_1 - p_2 = 0$; $H_a: p_1 - p_2 \neq 0$; $\alpha = 5\%$. Test statistic: z, because 1200 > 30 and 1000 > 30 (CLT).

$\hat{p}_1 = 83/1200 = 0.0692$; $\hat{p}_2 = 57/1000 = 0.057$

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.0692(1-0.0692)}{1200} + \frac{0.057(1-0.057)}{1000}} = 0.0104$$

$$z = \frac{0.0692 - 0.057}{0.0104} = 1.174$$

Rejection rule: z > 1.96 or z < -1.96. Since 1.174 lies between -1.96 and 1.96, we fail to reject the null. There is insufficient evidence to conclude that the reliabilities are different.

The 95% confidence interval for $p_1 - p_2$ is $0.0122 \pm 1.96(0.0104)$, or (-0.0082, 0.0326). The interval contains zero, which agrees with the test.

2. You have been asked to advise a consumer electronics company concerning its marketing strategy. The company is considering advertisements in several different magazines. The per-month cost of advertising is $12,000 for Time, $16,000 for Newsweek, and $20,000 for Reader’s Digest. You have collected historical data on the response to advertisements in those magazines and found the following:

Month             Time Revenues    Newsweek Revenues    Reader’s Digest Revenues
January 2000      $53,389          $102,376             $12,349
February 2000     $121,689         $124,574             $52,348
March 2000        $15,999          $112,553             $47,439
April 2000        $120,738         $59,668              $109,483
May 2000          $115,292         $45,087              $125,610
June 2000         $24,508          $101,590             $137,062
July 2000         $97,345          $69,587              $75,944
August 2000       $12,250          $28,977              $159,637
September 2000    $129,029         $26,858              $109,519
October 2000      $13,681          $39,778              $111,394
November 2000     $47,491          $28,574              $177,211
December 2000     $54,971          $101,763             $18,492

You are interested in whether advertising is more effective in one of the magazines than the others. How might you address the problem? Which magazine (if any) is more effective than the others? You may assume a 5% level of significance.
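Answer: Define the profit to be the revenues less the cost. Subtracting a constant monthly cost lowers each sample mean by that cost and leaves the standard deviation unchanged.

                      Time       Newsweek    Reader’s Digest
Mean revenue          $67,199    $70,115     $94,707
Monthly cost          $12,000    $16,000     $20,000
Mean profit           $55,199    $54,115     $74,707
Standard deviation    $46,648    $36,612     $53,479

We only know how to compare two populations at a time, so consider the three possible pairs. In each case $\alpha = 5\%$ and the test statistic is t (n < 30) with 12 + 12 - 2 = 22 degrees of freedom. The pooled standard deviation and the standard error of the difference are

$$s = \sqrt{\frac{11 s_1^2 + 11 s_2^2}{22}}, \qquad s_{\bar{x}_1 - \bar{x}_2} = s\sqrt{\frac{1}{12} + \frac{1}{12}}$$

Time vs. Newsweek: $H_0: \mu_T - \mu_N = 0$; $H_a: \mu_T - \mu_N \neq 0$; s = 41,932; $s_{\bar{x}_1 - \bar{x}_2}$ = 17,119; t = (55,199 - 54,115)/17,119 = 0.063

Newsweek vs. Reader’s Digest: $H_0: \mu_N - \mu_R = 0$; $H_a: \mu_N - \mu_R \neq 0$; s = 45,828; $s_{\bar{x}_1 - \bar{x}_2}$ = 18,709; t = (54,115 - 74,707)/18,709 = -1.10

Time vs. Reader’s Digest: $H_0: \mu_T - \mu_R = 0$; $H_a: \mu_T - \mu_R \neq 0$; s = 50,180; $s_{\bar{x}_1 - \bar{x}_2}$ = 20,486; t = (55,199 - 74,707)/20,486 = -0.95

Rejection rule: t > 2.074 or t < -2.074 (22 degrees of freedom). None of the differences are significant, so it does not appear that advertising in one magazine is significantly more effective than in the others.

3. Historically, one professor has failed 6% of the students in an introductory class while another professor has failed 9% in the same class. 312 students have taken the class from the first professor while 220 have taken it from the second professor. Is one professor more difficult (in terms of achieving a passing grade) than the other?

Answer: $H_0: p_1 - p_2 = 0$; $H_a: p_1 - p_2 \neq 0$; $\alpha = 5\%$. Test statistic: z, because 312 > 30 and 220 > 30 (CLT).

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} = \sqrt{\frac{0.06(1-0.06)}{312} + \frac{0.09(1-0.09)}{220}} = 0.0235$$

$$z = \frac{0.06 - 0.09}{0.0235} = -1.28$$

Rejection rule: z > 1.96 or z < -1.96. Since -1.28 lies between -1.96 and 1.96, we fail to reject the null. There is insufficient evidence to conclude that the professors differ in difficulty.

4. New graduates from W&L earn an average of $40,000 per year with a standard deviation of $10,000. New graduates from Stanford earn an average of $45,000 per year with a standard deviation of $12,000. The samples include 100 people from each university. Do Stanford graduates earn more than W&L graduates on average?

Answer: $H_0: \mu_S - \mu_W \le 0$; $H_a: \mu_S - \mu_W > 0$; $\alpha = 5\%$. Test statistic: z, because 100 > 30 and 100 > 30 (CLT).

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{10000^2}{100} + \frac{12000^2}{100}} = 1562$$

$$z = \frac{45000 - 40000}{1562} = 3.2$$

This is significant at the 5% level (and even at the 1% level). We therefore reject the null (i.e., Stanford grads earn more).

5.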
Suppose the average life expectancy is 74 years. A sample of 40 vegetarians found that the average age at death was 78 with a standard deviation of 15. Do vegetarians live longer than average? What is the 95% confidence limit? What is the 80% confidence limit? Sketch the power curve for the test by plotting at least three points.

Answer: $H_0: \mu \le 74$; $H_a: \mu > 74$; $\alpha = 5\%$ (initially). Test statistic: z, because 40 > 30 (CLT).

$$s_{\bar{x}} = \frac{15}{\sqrt{40}} = 2.37$$

We reject the null if $\bar{x} > 74 + 1.645(2.37) = 77.9$. Since $\bar{x} = 78 > 77.9$, we reject the null: the sample suggests that vegetarians live longer than average.

95% confidence limit = 78 - 1.645(2.37) = 74.1 (i.e., we are 95% sure that the population mean is greater than 74.1).

80% confidence limit = 78 - 0.84(2.37) = 76.0 (i.e., we are 80% sure that the population mean is greater than 76.0).

Assuming the 95% confidence level and arbitrarily choosing $\mu = 75$, $\mu = 80$, and $\mu = 85$:

$\mu = 75$: $\beta = P(\bar{x} < 77.9 \mid \mu = 75)$; $z = \frac{77.9 - 75}{2.37} = 1.22$; $\beta = 0.8888$ and power = 0.1112

$\mu = 80$: $\beta = P(\bar{x} < 77.9 \mid \mu = 80)$; $z = \frac{77.9 - 80}{2.37} = -0.89$; $\beta = 0.5 - 0.3133 = 0.1867$ and power = 0.8133

$\mu = 85$: $\beta = P(\bar{x} < 77.9 \mid \mu = 85)$; $z = \frac{77.9 - 85}{2.37} = -2.99$; $\beta = 0.5 - 0.4986 = 0.0014$ and power = 0.9986

We can then plot the power vs. $\mu$ to get the power curve.

[Figure: power curve, with power (0 to 1) on the vertical axis and the population mean on the horizontal axis.]

Note that we could do the power curve for the 80% confidence level (or any other level for that matter).

6. On average, a company spends $32,500 per year to maintain its buildings. This includes heating and electricity, so the figure can be quite volatile. The historical standard deviation is $8,000 based on a sample size of six. A recent news article suggested that companies of this type spend $25,000 per year on average. Should the company be alarmed?

Answer: $H_0: \mu \le 25000$; $H_a: \mu > 25000$. Suppose $\alpha = 5\%$. Test statistic: t, because 6 < 30.

$$s_{\bar{x}} = \frac{8000}{\sqrt{6}} = 3266$$

$$t = \frac{32500 - 25000}{3266} = 2.30$$

The t table shows a t-stat of 2.015 for a 5% tail and 5 degrees of freedom. Since 2.30 > 2.015, we reject the null. The company should be concerned.

7. A company wants to test the reliability of a new component.
Assuming that the proportion of defects is likely to be about 0.04, what sample size should be chosen so that the margin of error (given a 95% confidence level) is 0.0005?

Answer: We want $z \cdot s_p = 0.0005$, where $s_p = \sqrt{\frac{0.04(1-0.04)}{n}}$ and z = 1.96 (assuming a two-tailed test), so

$$1.96\sqrt{\frac{0.04(1-0.04)}{n}} = 0.0005$$

Solving for n gives n = 590,070.

8. BRIEFLY discuss the similarities between hypothesis testing and detecting outliers.

Answer: The techniques used to identify outliers are quite similar to those used to test hypotheses. The basic intuition behind hypothesis testing is that we examine a sample and then ask whether the null hypothesis would be an outlier given the distribution implied by the sample.

9. You are interested in evaluating a product for possible sale in your store. The company has sent 100 randomly-chosen samples for you to test. You plan to go through with the deal as long as you believe no more than 5% of the products you subsequently purchase will be defective. You believe a 90% confidence level is appropriate for the test. What test statistic is appropriate? Specify the null and alternative hypotheses for the test. Sketch the power curve for the test.

Answer: Test statistic = z, because n = 100 > 30 (CLT). $H_0: p \le 5\%$; $H_a: p > 5\%$. $\alpha = 10\%$ corresponds to z = 1.28.

$$s_p = \sqrt{\frac{0.05(0.95)}{100}} = 0.02179$$

so we reject if $\hat{p} > 0.05 + 1.28(0.02179) = 0.0779$. Let p = {6%, 7%, 8%}.

p = 6%: $z = \frac{0.0779 - 0.06}{0.02179} = 0.82$; area = 0.2939; $\beta = 0.794$ and power = 0.206

p = 7%: $z = \frac{0.0779 - 0.07}{0.02179} = 0.36$; area = 0.1406; $\beta = 0.641$ and power = 0.359

p = 8%: $z = \frac{0.0779 - 0.08}{0.02179} = -0.10$; area = 0.0398; $\beta = 0.460$ and power = 0.540

[Figure: power curve, with power (0 to 1) on the vertical axis and the population proportion (2% to 10%) on the horizontal axis.]

10. Two professors teach the same course. The professors give the same final exam to both groups of students. Last term, professor A’s students averaged 75 on the final with a standard deviation of 12. Meanwhile, professor B’s students averaged 80 on the final with a standard deviation of 10. There were 16 students in each class.
Assuming that teaching quality can be measured by student performance, is professor B a better teacher than professor A? Use a 10% level of significance in addressing the issue.

Answer the question by calculating the p-value. Be sure to briefly state the logic behind the rejection decision.

Answer: $H_0: \mu_B - \mu_A \le 0$; $H_a: \mu_B - \mu_A > 0$; $\alpha = 10\%$. Test statistic: t, because 16 < 30.

$$s = \sqrt{\frac{15(12)^2 + 15(10)^2}{30}} = 11.045; \qquad s_{\bar{x}_B - \bar{x}_A} = 11.045\sqrt{\frac{1}{16} + \frac{1}{16}} = 3.905$$

$$t = \frac{5}{3.905} = 1.28$$

From the t table (30 degrees of freedom), we see that the p-value is slightly greater than 10%, so we do not reject the null. There is insufficient evidence to conclude that B is a better teacher.

Answer the question by establishing a 90% confidence limit. Be sure to briefly state the logic behind the rejection decision.

Answer: 90% confidence limit = 0 + 1.31(3.905) = 5.116. Since the observed difference of 5 is less than 5.116, we do not reject the null. Note that 1.31 comes from the t table with a 10% tail and 30 degrees of freedom.

Answer the question by establishing a cutoff for the z-score or t-score (whichever is appropriate). Be sure to briefly state the logic behind the rejection decision.

Answer: The rejection rule is t > 1.31. Since 1.28 < 1.31, we do not reject the null.

11. The city council is planning a community-wide picnic. The council wants to estimate the proportion of residents who will attend the picnic. To do so, the council will randomly select residents and make phone calls. What is the margin of error if the council phones 30 people?

Answer: Recall that when we do not know p or have a reasonable estimate of it, we use p = 0.5 to get the most conservative estimate of the standard error of the sample proportion. Assume $\alpha = 5\%$.

$$s_p = \sqrt{\frac{0.5(1-0.5)}{30}} = 0.0913$$

Margin of error = $z \cdot s_p$ = 1.96(0.0913) = 0.179.

If the desired margin of error is 4% at the 95% confidence level, what sample size should the council use?

Answer:

$$0.04 = 1.96\sqrt{\frac{0.5(1-0.5)}{n}}$$

Solving for n gives 600.25. So, the council should call 601 people.
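
The proportion calculations in problems 1, 3, 7, and 11 can be checked numerically. The following is a minimal Python sketch and is not part of the original solutions. The function names are hypothetical, and it assumes the same unpooled standard error and the same critical value z = 1.96 that the answers use.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Return (z, standard error) for H0: p1 - p2 = 0.

    Assumption: unpooled standard error, matching problems 1 and 3.
    """
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se, se


def margin_of_error(p, n, z=1.96):
    """Margin of error z * sqrt(p(1-p)/n) for a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(p, margin, z=1.96):
    """Smallest whole n whose margin of error does not exceed `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # Problem 1: 83 failures in 1200 vs. 57 failures in 1000
    z1, se1 = two_proportion_z(83 / 1200, 1200, 57 / 1000, 1000)
    print("problem 1: z =", round(z1, 3), "se =", round(se1, 4))

    # Problem 3: failure rates of 6% (n = 312) and 9% (n = 220)
    z3, se3 = two_proportion_z(0.06, 312, 0.09, 220)
    print("problem 3: z =", round(z3, 2), "se =", round(se3, 4))

    # Problem 7: p about 0.04, margin of error 0.0005
    print("problem 7: n =", required_sample_size(0.04, 0.0005))

    # Problem 11: conservative p = 0.5
    print("problem 11: margin =", round(margin_of_error(0.5, 30), 3))
    print("problem 11: n =", required_sample_size(0.5, 0.04))
```

Rounding the sample size up with `math.ceil` reflects the choice in problem 11, where 600.25 becomes 601 so that the margin of error is not exceeded.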