3/26/03 252y0321 (Page layout view!)
ECO252 QBA2
SECOND HOUR EXAM
March 27 - 28, 2003
Name: KEY
Hour of Class Registered: _______

I. (40 points) Do all the following (2 points each unless noted otherwise). A table identifying methods for comparing 2 samples is at the end of the exam.

1. A powerful women's group has claimed that men and women differ in attitudes about sexual discrimination. A group of 50 men (group 1) and 40 women (group 2) were asked if they thought sexual discrimination is a problem in the United States. Of those sampled, 11 of the men and 19 of the women did believe that sexual discrimination is a problem. If the p-value turns out to be 0.035 (which is not the real value in this data set), then
a) at $\alpha = 0.05$, we should fail to reject $H_0$
b) *at $\alpha = 0.04$, we should reject $H_0$
c) at $\alpha = 0.03$, we should reject $H_0$
d) None of the above would be correct statements.
Explanation: The rule on p-value says that if the p-value is less than the significance level $\alpha$, reject the null hypothesis; if the p-value is greater than or equal to the significance level, do not reject the null hypothesis.

Table 12-4
A few years ago, Pepsi invited consumers to take the "Pepsi Challenge." Consumers were asked to decide which of two sodas, Coke or Pepsi, they preferred in a blind taste test. Pepsi was interested in determining what factors played a role in people's taste preferences. One of the factors studied was the gender of the consumer. Below are the results of analyses comparing the taste preferences of men and women, with the proportions depicting preference for Pepsi.
Males: n = 109, pSM = 0.422018
Females: n = 52, pSF = 0.25
pSM - pSF = 0.172018
z = 2.11825
(Note that $H_1$ may differ in questions 2 and 3.)

2. Referring to Table 12-4, to determine if a difference exists in the taste preferences of men and women, give the correct alternative hypothesis that Pepsi would test.
a) H1: M – F 0
b) H1: M – F 0
c) *$H_1: p_M - p_F \ne 0$
d) H1: pM – pF 0

3.
Referring to Table 12-4, suppose Pepsi wanted to test to determine if the males preferred Pepsi less than the females. Using the test statistic given, $z = 2.11825$, compute the appropriate p-value for the test.
a) 0.0170
b) 0.0340
c) 0.9660
d) *0.9830
Explanation: The alternative hypothesis is now $H_1: p_M < p_F$ or $H_1: p_M - p_F < 0$. If $z = 2.12$, we want $P(z \le 2.12) = .5 + .4830 = .9830$.

4. The amount of time necessary for assembly line workers to complete a product is a normal random variable with a mean of 15 minutes and a standard deviation of 2.1 minutes. The probability is __________ that a product is assembled in more than 20 minutes.
Solution: $x \sim N(15, 2.1)$. $P(x > 20) = P\left(z > \frac{20 - 15}{2.1}\right) = P(z > 2.38) = .5 - .4913 = .0087$. Make a diagram!

5. The amount of time necessary for assembly line workers to complete a product is a normal random variable with a mean of 15 minutes and a standard deviation of 2.3 minutes. Find $x_{.275}$ for this distribution.
Solution: $x \sim N(15, 2.3)$. We want a point, $x_{.275}$, so that $P(x > x_{.275}) = .2750$. The corresponding value of $z$ has $P(z > z_{.275}) = .2750$ and $P(z < z_{.275}) = .7250$. If this is true, and zero is the median, we must have $P(0 < z < z_{.275}) = .2250$. The closest we can come on the Normal table is $P(0 \le z \le 0.60) = .2257$. So $z_{.275} = 0.60$, and $x_{.275} = \mu + z_{.275}\sigma = 15 + 0.60(2.3) = 16.38$.

6. A local real estate appraiser analyzed the sales prices of homes in 2 neighborhoods to the corresponding appraised values of the homes. The goal of the analysis was to compare the distribution of sale-to-appraised ratios from homes in the 2 neighborhoods. Random and independent samples were selected from the 2 neighborhoods from last year's home sales, 8 from each of the 2 neighborhoods. Identify the nonparametric method that would be used to analyze the data.
a) the Wilcoxon Signed-Ranks Test, D5b, using the test statistic Z
b) the Wilcoxon Signed-Ranks Test, D5b, using the table test statistic W
c) *the Wilcoxon Rank Sum Test, D5a, using the table test statistic T
d) the Wilcoxon Rank Sum Test, using the test statistic Z

7. Turn in your computer output from computer problem 1 only, tucked inside this exam paper. (3 points - 2 point penalty for not handing this in.)

TABLE 10-5
To test the effects of a business school preparation course, 8 students took a general business test before the course. The same eight students took the test after the course. The data are given below.

Student   Before x1   After x2   d = x2 - x1     d^2
1            530         670         140        19600
2            690         770          80         6400
3            910        1000          90         8100
4            700         710          10          100
5            450         550         100        10000
6            820         870          50         2500
7            820         770         -50         2500
8            630         610         -20          400
Sum                                  400        49600

Note: Last column and sums not in original problem.

Two tests were run using Minitab. (Only one should have been run!) The results follow:

Table 1: Results for: 2x0321-1.MTW

MTB > TwoSample c2 c1;
SUBC> Alternative 1.

Two-Sample T-Test and CI: after, before
Two-sample T for after vs before
          N   Mean   StDev   SE Mean
after     8    744     144        51
before    8    694     155        55
Difference = mu after - mu before
Estimate for difference: 50.0
95% lower bound for difference: -82.6
T-Test of difference = 0 (vs >): T-Value = 0.67  P-Value = 0.258  DF = 13

MTB > Paired c2 c1;
SUBC> Alternative 1.

Paired T-Test and CI: after, before
Paired T for after - before
              N    Mean   StDev   SE Mean
after         8   743.8   143.9      50.9
before        8   693.8   155.4      54.9
Difference    8    50.0    65.0      23.0
95% lower bound for mean difference: 6.4
T-Test of mean difference = 0 (vs > 0): T-Value = 2.17  P-Value = 0.033

8. Compute the standard deviation of the d column and use it to fill in the underlined blanks in the second of the two tests in Table 1. Show your work! (4 points - 2 point penalty for not computing the variance.)
Solution: $s_d^2 = \frac{\sum d^2 - n\bar d^2}{n - 1} = \frac{49600 - 8(50)^2}{7} = 4228.5714$, $s_d = \sqrt{4228.5714} = 65.027$, $s_{\bar d} = \frac{65.027}{\sqrt{8}} = \sqrt{\frac{4228.5714}{8}} = 22.9907$.

9. What was the alternate hypothesis tested in Table 1? (1)
Solution: With $d = x_2 - x_1$, so that $D = \mu_2 - \mu_1$, the alternate hypothesis is $H_1: D > 0$ or $H_1: \mu_1 - \mu_2 < 0$ or $H_1: \mu_1 < \mu_2$.

10. Using the correct test, state the null hypothesis and conclusion, explaining what numbers in the tests led you to your conclusion. (2)
Solution: Since this is paired data, we use the second test. The null hypothesis is $H_0: D \le 0$ or $H_0: \mu_1 - \mu_2 \ge 0$ or $H_0: \mu_1 \ge \mu_2$. The p-value is .033. Assuming $\alpha = .05$, the p-value is below the significance level, so reject the null hypothesis.

11. The assertion has been made that tests using paired data are more powerful than tests using independent samples. Can you point to any numbers in Table 1 that illustrate this?
Solution: Power is the probability of rejecting a false null hypothesis. Notice that the first test, which is appropriate for independent samples, gives a p-value (.258) that is much higher than .033. So, using the same numbers and significance levels like 5% or 10%, we reject the null hypothesis in the second test, but not the first. A more sophisticated answer might be that the standard error for $\bar d$ in the first version is $\sqrt{50.9^2 + 54.9^2} \approx 74.9$, which is much larger than the 23.0 of the paired version, so that the second method produces larger t's. (This is the hardest question on the exam.)

12. Given the following information, calculate the degrees of freedom that should be used in the pooled-variance t test.
$s_1^2 = 4$, $s_2^2 = 6$, $n_1 = 16$, $n_2 = 25$
a) df = 41
b) *df = 39
c) df = 16
d) df = 25
Solution: In the pooled-variance t test, $df = n_1 + n_2 - 2 = 16 + 25 - 2 = 39$.

13. Given the following information, calculate $s_p^2$, the pooled sample variance that should be used in the pooled-variance t test.
$s_1^2 = 4$, $s_2^2 = 6$, $n_1 = 16$, $n_2 = 25$
a) $\hat s_p^2 = 6$
b) $\hat s_p^2 = 5$
c) *$\hat s_p^2 = 5.23$
d) $\hat s_p^2 = 4$
Solution: From the outline, $t \sim t^{(n_1 + n_2 - 2)}$ and $s_d = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{15(4) + 24(6)}{39} = \frac{60 + 144}{39} = \frac{204}{39} = 5.23$.

TABLE 10-7
A perfume manufacturer is trying to choose between 2 magazine advertising layouts. An expensive layout would include a small package of the perfume. A cheaper layout would include a "scratch-and-sniff" sample of the product. The manufacturer would use the more expensive layout only if there is evidence that it would lead to a higher approval rate. The manufacturer presents the more expensive layout to 4 groups and determines the approval rating for each group. He presents the "scratch-and-sniff" layout to 5 groups and again determines the approval rating of the perfume for each group. The data are given below. Use this to test the appropriate hypotheses with the Wilcoxon Rank Sum Test with a level of significance of 0.05.

Package   52   68   43   48
Scratch   37   43   53   39   47

14. Referring to Table 10-7, the hypotheses that should be used are:
a) H 0 : 1 2 versus H 1 : 1 2
b) H 0 : 1 2 versus H 1 : 1 2
c) H 0 : 1 2 versus H 1 : 1 2
d) *$H_0: \eta_1 \le \eta_2$ versus $H_1: \eta_1 > \eta_2$

15. Referring to Table 10-7, the rank given to the second observation in the "scratch-and-sniff" group is 6.5 or 3.5.
Explanation: The numbers are written down in order. The $r_1$ and $r_2$ orderings are correct, since the rule is to work from the extreme of the smaller group. In this case, the 6th and 7th numbers are identical, so they are given the average rank.

x1     x2     r1     r2     r1*    r2*
       37            9             1
       39            8             2
43            6.5           3.5
       43            6.5           3.5
       47            5             5
48            4             6
52            3             7
       53            2             8
68            1             9
Sum           14.5   30.5   25.5   19.5

16. Referring to Table 10-7, the calculated value of the test statistic is 14.5 or 19.5.

17. Referring to Table 10-7, the critical values or p-value of the test is ________.
Solution: Critical values from Table 6 are 12 and 28. From Table 5, the p-value for 14.5 is between .0952 and .1429, and the p-value for 19.5 is between .4524 and .5476. All result in not rejecting the null hypothesis.

18.
Referring to Table 10-7, the perfume manufacturer will
a) *use the "scratch-and-sniff" layout because there is insufficient evidence to do otherwise.
b) use the package layout because there is insufficient evidence to do otherwise.
c) use the "scratch-and-sniff" layout because there is sufficient evidence to conclude that this is the best course of action.
d) use the package layout because there is sufficient evidence to conclude that this is the best course of action.

TABLE 10-1
Are Japanese managers more motivated than American managers? A randomly selected group of each were administered the Sarnoff Survey of Attitudes Toward Life (SSATL), which measures motivation for upward mobility. The SSATL scores are summarized below.

            Sample Size   Mean SSATL Score   Population Std. Dev.
American        211            65.75               11.07
Japanese        100            79.83                6.41

19. Referring to Table 10-1, assuming the independent samples procedure was used, choose the value of the test statistic.
a) $z = \frac{65.75 - 79.83}{\sqrt{\frac{9.82}{211} + \frac{9.82}{100}}}$
b) $z = \frac{65.75 - 79.83}{\sqrt{\frac{11.07}{211} + \frac{6.41}{100}}}$
c) $z = \frac{65.75 - 79.83}{\sqrt{\frac{9.82^2}{211} + \frac{9.82^2}{100}}}$
d) *$z = \frac{65.75 - 79.83}{\sqrt{\frac{11.07^2}{211} + \frac{6.41^2}{100}}}$

Methods for comparing 2 samples:

                                                        Paired Samples   Independent Samples
Location - Normal distribution. Compare means.          Method D4        Methods D1-D3
Location - Distribution not Normal. Compare medians.    Method D5b       Method D5a
Proportions                                                              Method D6
Variability - Normal distribution. Compare variances.                    Method D7

Let's try p-value again! Say we end up with $z = 3.00$.
If $H_1$ is $D > 0$, $\Delta p > 0$, $p > p_0$ or $\mu > \mu_0$, $pval = P(z \ge 3) = .5 - P(0 \le z \le 3)$.
If $H_1$ is $D < 0$, $\Delta p < 0$, $p < p_0$ or $\mu < \mu_0$, $pval = P(z \le 3) = .5 + P(0 \le z \le 3)$.
If $H_1$ is $D \ne 0$, $\Delta p \ne 0$, $p \ne p_0$ or $\mu \ne \mu_0$, $pval = 2P(z \ge 3) = 2[.5 - P(0 \le z \le 3)]$.

ECO252 QBA2
SECOND EXAM
February 20, 21 2003
TAKE HOME SECTION
Name: _________________________
Social Security Number: _________________________

TABLE 12-6
One criterion used to evaluate employees in the assembly section of a large factory is the number of defective pieces per 1,000 parts produced.
The quality control department wants to find out whether there is a relationship between years of experience and defect rate. Since the job is repetitious, after the initial training period any improvement due to a learning effect might be offset by a loss of motivation. A defect rate is calculated for each worker in a yearly evaluation. The results for 100 workers are given in the table below. Before you start, replace the 9 in the upper right hand (northeast) corner with the last digit of your Social Security Number. Total will be between 91 and 100.

                Years Since Training Period
Defect Rate     < 1 Year   1-4 Years   5-9 Years
High                6          9           9
Average             9         19          23
Low                 7          8          10

1) Use $\alpha = .05$ in this question.
a) Test the hypothesis that the proportion of workers with a low defect rate is larger for workers with 5-9 years experience than for workers with 1-4 years experience. State your hypotheses, find a test ratio and explain your conclusion. (2)
b) Get a p-value for your test ratio. (1)
c) Do a 95% 2-sided confidence interval for the difference between the proportion of 5-9 year workers with low defect rates and the proportion of 1-4 year workers with low defect rates. (2)
d) Using all the data above, test whether there is a relationship between years of experience and the defect rate. (3)

Variations on Solution: This is the solution to the problem shown above. It is your solution if your Social Security number ends in 9. For other solutions see 252y032app. To summarize the information in the problem: $\alpha = .05$ and

Defect Rate   <1 yr   1-4 yr   5-9 yr   Total
High             6       9        9       24
Average          9      19       23       51
Low              7       8       10       25
Total           22      36       42      100

a) We are comparing $x_2 = 8$, $n_2 = 36$, $\bar p_2 = \frac{8}{36} = .2222$ and $x_3 = 10$, $n_3 = 42$, $\bar p_3 = \frac{10}{42} = .2381$. We are testing $H_1: p_3 > p_2$, so the null hypothesis is $H_0: p_3 \le p_2$.
Let $\Delta p = p_2 - p_3$. So $\Delta\bar p = \bar p_2 - \bar p_3 = .2222 - .2381 = -.0159$ and our hypotheses become $H_0: p_2 - p_3 \ge 0$ and $H_1: p_2 - p_3 < 0$, or $H_0: \Delta p \ge 0$ and $H_1: \Delta p < 0$.
$s_{\Delta p} = \sqrt{\frac{\bar p_2 \bar q_2}{n_2} + \frac{\bar p_3 \bar q_3}{n_3}} = \sqrt{\frac{.2222(.7778)}{36} + \frac{.2381(.7619)}{42}} = \sqrt{.004801 + .004319} = \sqrt{.009120} = .0955$
$\bar p_0 = \frac{n_2 \bar p_2 + n_3 \bar p_3}{n_2 + n_3} = \frac{36(.2222) + 42(.2381)}{36 + 42} = \frac{8 + 10}{78} = \frac{18}{78} = .2308$
$\sigma_{\Delta p} = \sqrt{\bar p_0 \bar q_0 \left(\frac{1}{n_2} + \frac{1}{n_3}\right)} = \sqrt{.2308(.7692)\left(\frac{1}{36} + \frac{1}{42}\right)} = \sqrt{.17753(.05159)} = \sqrt{.009158} = .0957$
$\alpha = .05$, $z_\alpha = z_{.05} = 1.645$, $z_{\alpha/2} = z_{.025} = 1.960$. Note that $q = 1 - p$ and that $q$ and $p$ are between zero and one.

Use one of the following:

Confidence interval: Since the alternate hypothesis is $H_1: \Delta p < 0$, the confidence interval will be $\Delta p \le \Delta\bar p + z_\alpha s_{\Delta p} = -0.0159 + 1.645(.0955)$ or $\Delta p \le 0.1412$. This does not contradict $H_0: \Delta p \ge 0$, since any value of $\Delta p$ between 0 and .1412 satisfies both the null hypothesis and the confidence interval, so do not reject $H_0$.

Test ratio: $z = \frac{\Delta\bar p - \Delta p_0}{\sigma_{\Delta p}} = \frac{-.0159 - 0}{.0957} = -0.166$. Make a diagram of a Normal curve with zero in the middle. The 'reject' zone is the area below $-z_\alpha = -z_{.05} = -1.645$. Since the test ratio is not in this zone, do not reject $H_0$.

Critical value: Because the alternate hypothesis is $H_1: \Delta p < 0$, we need a critical value below zero. Use $\Delta p_{cv} = \Delta p_0 - z_\alpha \sigma_{\Delta p} = 0 - 1.645(.0957) = -.1574$. Make a diagram of a Normal curve with zero in the middle. The 'reject' zone is the area below -.1574. Since $\Delta\bar p = -.0159$ is not in this zone, do not reject $H_0$.

b) The p-value for this problem is $P(\Delta\bar p \le -.0159) = P(z \le -0.17) = .5 - .0675 = .4325$. Since this is not below .05, do not reject $H_0$.

c) $\Delta p = \Delta\bar p \pm z_{\alpha/2}\, s_{\Delta p} = -0.0159 \pm 1.960(.0955) = -.0159 \pm .1872$, or -.2031 to .1713.

d) $n = 100$. The proportions in rows, $p_r$, are used with column totals to get the items in $E$. $O$ is the table of observed counts given above. Note that row and column sums in $E$ are the same as in $O$ except for a possible small rounding error. (Note that $\chi^2$ is computed two different ways here - only one way is needed.)
$H_0$: years and defect rate independent

E          <1 yr    1-4 yr   5-9 yr   Total   p_r
High        5.28     8.64    10.08      24    .24
Average    11.22    18.36    21.42      51    .51
Low         5.50     9.00    10.50      25    .25
Total      22.00    36.00    42.00     100   1.00

Row      O       E      E - O   (E - O)^2   (E - O)^2/E    O^2/E
1        6     5.28    -0.72     0.5184      0.098182      6.8182
2        9    11.22     2.22     4.9284      0.439251      7.2193
3        7     5.50    -1.50     2.2500      0.409091      8.9091
4        9     8.64    -0.36     0.1296      0.015000      9.3750
5       19    18.36    -0.64     0.4096      0.022309     19.6623
6        8     9.00     1.00     1.0000      0.111111      7.1111
7        9    10.08     1.08     1.1664      0.115714      8.0357
8       23    21.42    -1.58     2.4964      0.116545     24.6965
9       10    10.50     0.50     0.2500      0.023810      9.5238
Total  100   100        0.00                 1.35101     101.3510

$DF = (r - 1)(c - 1) = 2(2) = 4$, $\chi^{2(4)}_{.05} = 9.4877$.
$\chi^2 = \sum \frac{(E - O)^2}{E} = 1.351$ or $\chi^2 = \sum \frac{O^2}{E} - n = 101.3510 - 100 = 1.351$.
Since this is less than 9.4877, do not reject $H_0$. (Diagram!)

2) A firm has been experimenting with two separate assembly line arrangements and finds the following for the number of finished units a day. Each sample represents 21 days' work. For the first arrangement the (sample) mean units produced per day were 85 with a (sample) variance of 1200. For the second arrangement the mean was 87 and the variance 3500. Use $\alpha = .10$.
a) Test the variances for equality. (2)
b) (Extra credit) Using the results of the test in a), test the equality of the means. You may use a test ratio, critical value or a confidence interval (4 points) or all three of these (6 points, assuming that you get the same conclusion for all of them).
c) (Extra credit) Given the results of both tests, write a short essay with your recommendations as to which of the two arrangements to use. (2)

Solution: The facts given in the problem are $n_1 = 21$, $\bar x_1 = 85$, $s_1^2 = 1200$, $n_2 = 21$, $\bar x_2 = 87$, $s_2^2 = 3500$, with $H_0: \mu_1 = \mu_2$ and $H_1: \mu_1 \ne \mu_2$. This is the same as $H_0: D = 0$, $H_1: D \ne 0$ if $D = \mu_1 - \mu_2$, and $\bar d = \bar x_1 - \bar x_2 = 85 - 87 = -2$. $\alpha = .10$.

a) Use an F test on the sample variances to see if the population variances are equal. Since we are comparing variances, we use Method D7.
$DF_1 = n_1 - 1 = 20$ and $DF_2 = n_2 - 1 = 20$. Since the table is set up for one-sided tests, if we wish to test $H_0: \sigma_1^2 = \sigma_2^2$, we must do two separate one-sided tests. First test $\frac{s_1^2}{s_2^2} = \frac{1200}{3500} = 0.3429$ against $F^{(20,20)}_{.05} = 2.12$ and then test $\frac{s_2^2}{s_1^2} = \frac{3500}{1200} = 2.9167$ against $F^{(20,20)}_{.05} = 2.12$. If either test is failed, we reject the null hypothesis. Since 2.9167 is above the table F, we reject the null hypothesis of equal variances and say that the variances are not equal. We should use Method D3, a method for comparing the means that allows unequal variances.

b) First find degrees of freedom and a value of $s_d$ for this problem.
$\frac{s_1^2}{n_1} = \frac{1200}{21} = 57.1429$, $\frac{s_2^2}{n_2} = \frac{3500}{21} = 166.667$, so $\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = 57.1429 + 166.667 = 223.810$.
$\frac{(s_1^2/n_1)^2}{n_1 - 1} = \frac{57.1429^2}{20} = 163.265$, $\frac{(s_2^2/n_2)^2}{n_2 - 1} = \frac{166.667^2}{20} = 1388.89$, so $\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1} = 163.265 + 1388.89 = 1552.15$.
Finally, $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}} = \frac{223.810^2}{1552.15} = 32.2717$. To be conservative, use 32 degrees of freedom.
$s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{223.810} = 14.9603$

Now do at least one of the following using $t_{\alpha/2} = t^{(32)}_{.05} = 1.694$.

Confidence Interval: $D = \bar d \pm t_{\alpha/2}\, s_d = -2 \pm 1.694(14.9603) = -2 \pm 25.343$. This interval obviously includes zero, so do not reject $H_0$.

Test Ratio: $t = \frac{\bar d - D_0}{s_d} = \frac{-2 - 0}{14.9603} = -0.134$. Make a diagram of an almost Normal curve with a mean at zero and 'reject' zones above $t_{\alpha/2} = t^{(32)}_{.05} = 1.694$ and below $-t_{\alpha/2} = -t^{(32)}_{.05} = -1.694$. Since the test ratio does not fall into the 'reject' zones, do not reject $H_0$.

Critical Value: $\bar d_{cv} = D_0 \pm t_{\alpha/2}\, s_d = 0 \pm 1.694(14.9603) = \pm 25.343$. Make a diagram of an almost Normal curve with a mean at zero and 'reject' zones above 25.343 and below -25.343. Since $\bar d = -2$ does not fall into the 'reject' zones, do not reject $H_0$.

If you stubbornly tried to assume equal variances, the degrees of freedom for the problem are $DF = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2 = 21 + 21 - 2 = 40$, and $t_{\alpha/2} = t^{(40)}_{.05} = 1.684$. The formula for the pooled variance is
$\hat s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{20(1200) + 20(3500)}{40} = 2350$.
$s_d = \hat s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{2350\left(\frac{1}{21} + \frac{1}{21}\right)} = \sqrt{223.810} = 14.9603$

Confidence Interval: $D = \bar d \pm t_{\alpha/2}\, s_d = -2 \pm 1.684(14.9603) = -2 \pm 25.193$. This interval obviously includes zero, so do not reject $H_0$.

Test Ratio: $t = \frac{\bar d - D_0}{s_d} = \frac{-2 - 0}{14.9603} = -0.134$. Make a diagram of an almost Normal curve with a mean at zero and 'reject' zones above $t_{\alpha/2} = t^{(40)}_{.05} = 1.684$ and below $-t_{\alpha/2} = -t^{(40)}_{.05} = -1.684$. Since the test ratio does not fall into the 'reject' zones, do not reject $H_0$.

Critical Value: $\bar d_{cv} = D_0 \pm t_{\alpha/2}\, s_d = 0 \pm 1.684(14.9603) = \pm 25.193$. Make a diagram of an almost Normal curve with a mean at zero and 'reject' zones above 25.193 and below -25.193. Since $\bar d = -2$ does not fall into the 'reject' zones, do not reject $H_0$.

c) Any report should emphasize that the major difference is the unreliability of the second method, shown by its significantly larger variance.
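
The arithmetic in problem 2 can be cross-checked with a short script. The following is only a sketch in Python (the exam itself uses hand calculation and Minitab); the function names `variance_ratio` and `welch_from_summary` are hypothetical, and the critical values 2.12 and 1.694 are copied from the tables quoted in the solution rather than computed.

```python
import math


def variance_ratio(var1, var2):
    """Larger sample variance divided by the smaller one (the ratio
    that is compared with the upper-tail F table value)."""
    return max(var1, var2) / min(var1, var2)


def welch_from_summary(n1, mean1, var1, n2, mean2, var2):
    """Unequal-variance comparison of two means from summary statistics.

    Returns the standard error of the difference, the approximate
    (Satterthwaite) degrees of freedom, and the t ratio for D0 = 0.
    """
    v1 = var1 / n1  # s1^2 / n1
    v2 = var2 / n2  # s2^2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    t = (mean1 - mean2) / se
    return {"se": se, "df": df, "t": t}


if __name__ == "__main__":
    # Table values quoted in the solution (assumed, not computed here).
    F_CRIT = 2.12   # F.05 with (20, 20) degrees of freedom
    T_CRIT = 1.694  # t.05 with 32 degrees of freedom

    ratio = variance_ratio(1200.0, 3500.0)
    result = welch_from_summary(21, 85.0, 1200.0, 21, 87.0, 3500.0)

    print(f"F ratio = {ratio:.4f}, reject equal variances: {ratio > F_CRIT}")
    print(f"s_d = {result['se']:.4f}, df = {result['df']:.4f}")
    print(f"t = {result['t']:.3f}, reject equal means: {abs(result['t']) > T_CRIT}")
```

Run as a script, this should reproduce the 2.9167 variance ratio, the standard error of about 14.96, roughly 32.27 degrees of freedom, and a t ratio of about -0.134, leading to the same two conclusions as the written solution.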