252y0561h 11/1/05
ECO252 QBA2 SECOND EXAM Nov 8-9, 2005
TAKE HOME SECTION
Name: _________________________
Student Number: _________________________

III. (Do sections adding to at least 20 points. Anything extra you do helps, and grades wrap around.) Show your work! State $H_0$ and $H_1$ where appropriate. You have not done a hypothesis test unless you have stated your hypotheses, run the numbers and stated your conclusion. (Use a 95% confidence level unless another level is specified.) Answers without reasons are not usually acceptable. Neatness counts! Check the website regularly for hints or corrections.

1) A state is trying to figure out whether the background on highway signs makes a difference. To do this, two samples of 15 individuals are rapidly shown a number of slides. The slides have either a green or a red background. You are trying to find out whether the number of slides read correctly differs between those with a red background and those with a green background. To do so you will compare the means or medians, as appropriate to the distribution.

To personalize the data, look at the third digit from the end of your student number to decide what red data you will use. Call the column that you pick rj and compute a column called dj with the formula dj = green – rj. (Example: Seymour Butz’s student number is 976512, so he picks column 5 and uses d5 = green – r5.) Tell me what column you are using! If you compare means, state your hypotheses both in terms of $\mu_1$ and $\mu_2$ and in terms of $D = \mu_1 - \mu_2$.
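If you want to check your column choice and your d column by computer, a minimal Python sketch might look like the following. It is not part of the hand work and assumes Python 3 with only its standard library. It uses Seymour's numbers: the green and r5 values are copied from the data table for this problem, and the helper name pick_column is made up for illustration.

```python
import statistics


def pick_column(student_number: str) -> int:
    """Return the third digit from the end of the student number (hypothetical helper)."""
    return int(student_number[-3])


# Seymour's data, copied from the data table for this problem.
green = [8, 10, 6, 5, 9, 7, 3, 7, 7, 3, 6, 6, 8, 3, 10]
r5 = [8, 9, 6, 10, 7, 7, 5, 6, 11, 9, 10, 7, 7, 8, 8]

column = pick_column("976512")           # Seymour's student number
d5 = [g - r for g, r in zip(green, r5)]  # dj = green - rj, row by row

d_mean = statistics.mean(d5)
d_stdev = statistics.stdev(d5)           # sample standard deviation (n - 1 in the denominator)

print("column:", column)
print("d5:", d5)
print("mean of d:", round(d_mean, 3))
print("std dev of d:", round(d_stdev, 3))
```

To use it for your own number, replace the student number and paste in the rj column that pick_column selects.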
Row  green  r0  r1  r2  r3  r4  r5  r6  r7  r8  r9
  1      8   9   5   7   6   5   8   9   7  10   5
  2     10  10   9  10  12   6   9   7   9   9   6
  3      6   9   5  11   7   5   6   6  10   7   6
  4      5   4   7   9  10   6  10   7  11  10   7
  5      9   9   6   8  12   5   7   8  11   6  10
  6      7   9   7   8  10  11   7   7   6   6  10
  7      3   8   9   8   9   8   5   6  11   7  10
  8      7   7   8   7  11   8   6   6   6   8  13
  9      7   5   7   9   9   7  11   7   9   8   9
 10      3   6   5   9   8   9   9   7   7   7   6
 11      6  10   7   7   7  10  10   5   7   9   8
 12      6   8   5   7   6   7   7   7   7  11   6
 13      8   8   8  10  11   7   7  10   9   6   7
 14      3   9   7   7  13   6   8   7   8   8   8
 15     10   9   7   6   8   5   8   6   8   7  10

Minitab computed some basic statistics from the data which will help you in some parts of this problem.

Variable   N    Mean  SE Mean  StDev  Minimum      Q1  Median      Q3  Maximum
green     15   6.533    0.601  2.326    3.000   5.000   7.000   8.000   10.000
r0        15   8.000    0.458  1.773    4.000   7.000   9.000   9.000   10.000
r1        15   6.800    0.355  1.373    5.000   5.000   7.000   8.000    9.000
r2        15   8.200    0.368  1.424    6.000   7.000   8.000   9.000   11.000
r3        15   9.267    0.581  2.251    6.000   7.000   9.000  11.000   13.000
r4        15   7.000    0.488  1.890    5.000   5.000   7.000   8.000   11.000
r5        15   7.867    0.435  1.685    5.000   7.000   8.000   9.000   11.000
r6        15   7.000    0.324  1.254    5.000   6.000   7.000   7.000   10.000
r7        15   8.400    0.456  1.765    6.000   7.000   8.000  10.000   11.000
r8        15   7.933    0.408  1.580    6.000   7.000   8.000   9.000   11.000
r9        15   8.067    0.573  2.219    5.000   6.000   8.000  10.000   13.000

a) Display the numbers that you are using in columns and compute a sample mean and sample standard deviation for the d column. (1)

In this problem assume that the red and green data are two independent samples. Use a confidence level of 95%.

b) Assume that you believe that the normal distribution does not apply to the data and compare the means or medians as appropriate. (4)
c) You suspect that the data has the Normal distribution. Test to see if the Normal distribution applies. Use a test that I taught you. (3)
d) You decide that the Normal distribution applies to the data, but do not know if the variances are equal. Test them for equality. (1)
e) You conclude that the underlying distributions are Normal and that the population variances are equal.
Compare the means or medians as appropriate. Use a test ratio, critical value or a confidence interval (4) or all three (6). [15]
f) (Extra credit) You conclude that the underlying distributions are Normal and that the population variances are not equal. Compare the means or medians as appropriate. Use a test ratio, critical value or a confidence interval (5) or all three (7).

2) In fact the data on the previous page applies to a single sample of 15 individuals. That is, the first line of your worksheet tells you how the first person in the sample did when shown the same slides with red and green backgrounds. This applies to a) and b) in this question. Use a confidence level of 95%.

a) Assume that you believe that the normal distribution does not apply to the data and compare the means or medians as appropriate. (3)
b) You assume that the data has the Normal distribution. Compare the means or medians as appropriate. (3)
c) For any part of one of these problems (tell me which one!), compute a confidence interval that you would use to compare means if your alternate hypothesis was H 1 : 2 1 . (2) [23]
d) For the same part as you used in c), find a p-value for the null hypothesis. (2) [25]

These results are all supposed to look to me as if you did them by hand. But what I don’t know won’t hurt me. If you want to check your results by computer, you might try the following Minitab routine. If you put green in C1 and label columns with headings like rj, dj and dsqj (Seymour called his green, r5, d5 and dsq5), the routine below, with appropriate changes to rj, dj and dsqj, will compute much of the stuff above, though not in the right order. Note that to do a Wilcoxon signed rank test by hand, you will have to drop all zeroes from the d column.
Computations for comparing c1 and rj

MTB > let dj = c1 - rj
MTB > let dsqj = dj * dj
MTB > print c1 rj dj dsqj
MTB > describe c1 rj dj
MTB > sum dj
MTB > ssq dj
MTB > TwoSample c1 rj;
SUBC> Pooled.
MTB > TwoSample c1 rj.
MTB > Paired c1 'rj'.
MTB > VarTest c1 'rj';
SUBC> Unstacked.
MTB > WTest 0.0 'dj';
SUBC> Alternative 0.
MTB > Mann-Whitney 95.0 c1 'rj';
SUBC> Alternative 0.

If you want to fake the calculations for the Mann-Whitney test, try this.

Procedure for setting up Mann-Whitney Test

#c1 is green, c2 is rj, c3 is difference.
# Mann Whitney Test
MTB > Stack c1 c2 c5;
SUBC> Subscripts c6.
MTB > Rank c5 c7.
MTB > Unstack (c7);
SUBC> Subscripts c6;
SUBC> After;
SUBC> VarNames.
MTB > sum c8
MTB > sum c9
MTB > print c1 c8 c2 c9
#The rest is up to you.

If you want to fake the calculations for the Wilcoxon signed rank test, try this. Unfortunately, I know no good way to remove the zeros or change the signs except by hand.

Procedure for setting up Wilcoxon Signed Rank Test

#c1 is green, c2 is rj, c3 is difference.
MTB > Let c3 = c1-c2    #Maybe you already did this.
MTB > # Wilcoxon signed rank test
MTB > let c10 = c3
MTB > #Remove zeroes from c10. (Just use delete on the cells with zeros.)
MTB > #Notice that n has gotten smaller.
MTB > let c11 = abs(c10)
MTB > rank c11 c12
MTB > let c13 = c12
MTB > #Change signs in c13 to agree with signs in c10.
MTB > let c14 = c13 * c10    #Check on signs. All should be positive.
                             #Aside from this, consider c14 garbage.
MTB > print c10 c11 c12 c13  #You now have the four columns that I computed in
MTB > #the examples. The totals are up to you.

3) The results of a Gallup phone survey appear below. Consumers were asked if they objected to having their medical records shared with different types of organizations. Results follow.

The proportion in a sample of 1000 who objected to sharing with insurance companies was $p_1 = .820$.
The proportion in a sample of 1000 who objected to sharing with pharmacies was $p_2 = .590$.
The proportion in a sample of 1000 who objected to sharing with medical researchers was $p_3 = .670$.

Personalize the data by using the second-to-last digit of your student number; call it d. Multiply it by .001 and call the result .00d. If the second-to-last digit is zero, use .00d = .010. Add .00d to .820 and subtract .00d from .670. (Example: Seymour Butz’s student number is 976532, so he adds .003 to .820 and gets .823 and subtracts .003 from .670, getting .667. He leaves .590 alone.)

a) Is the proportion of people who object different for different institutions? ($\alpha = .01$) (4)
b) If appropriate, use the Marascuilo procedure to determine which organizations are different. Discuss. (3) [32]
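
If you want to check your arithmetic for problem 3 by computer, the following Python sketch (standard library only) shows one way the personalization, a chi-square comparison of the three proportions and the Marascuilo critical ranges might be laid out. It is an illustration, not the required hand method. It assumes that the .01 in part a) is the significance level, and the function names personalize, chi_square_statistic and marascuilo are invented for this sketch. It is run on Seymour's numbers.

```python
import math
from itertools import combinations


def personalize(student_number: str) -> list:
    """Shift .820 up and .670 down by .00d, where d is the second-to-last digit."""
    d = int(student_number[-2])
    shift = 0.010 if d == 0 else d * 0.001
    return [0.820 + shift, 0.590, 0.670 - shift]


def chi_square_statistic(props, sizes):
    """Chi-square statistic for equality of several proportions (observed vs. expected counts)."""
    pooled = sum(p * n for p, n in zip(props, sizes)) / sum(sizes)
    stat = 0.0
    for p, n in zip(props, sizes):
        cells = ((n * p, n * pooled), (n * (1 - p), n * (1 - pooled)))
        for observed, expected in cells:
            stat += (observed - expected) ** 2 / expected
    return stat


def marascuilo(props, sizes, critical_chi2):
    """For every pair return (i, j, |pi - pj|, critical range, differs?)."""
    results = []
    for i, j in combinations(range(len(props)), 2):
        diff = abs(props[i] - props[j])
        spread = math.sqrt(props[i] * (1 - props[i]) / sizes[i]
                           + props[j] * (1 - props[j]) / sizes[j])
        crit_range = math.sqrt(critical_chi2) * spread
        results.append((i + 1, j + 1, diff, crit_range, diff > crit_range))
    return results


alpha = 0.01                    # assumed significance level for part a)
sizes = [1000, 1000, 1000]
props = personalize("976532")   # Seymour's student number

# Three groups give 2 degrees of freedom. For 2 degrees of freedom only,
# the upper-tail chi-square critical value is exactly -2 ln(alpha).
critical_chi2 = -2 * math.log(alpha)

stat = chi_square_statistic(props, sizes)
pairs = marascuilo(props, sizes, critical_chi2)

print("chi-square:", round(stat, 2), "critical value:", round(critical_chi2, 2))
for i, j, diff, crit_range, differs in pairs:
    print(f"p{i} vs p{j}: |diff| = {diff:.3f}, critical range = {crit_range:.3f}, differs = {differs}")
```

The closed form for the critical value is a shortcut that works only because there are three groups. With any other number of groups you would look the value up in a chi-square table. The discussion asked for in part b) is still up to you.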