Stat401E Lab 2 Fall 2005

Objectives: Practice with the distribution of sample means; practice the binomial distribution; construct confidence intervals and test hypotheses.

Reading: Chapters 4 and 5 in Howell (2002). Notice especially the discussion of the sampling distribution (section 4.2) and its role in the logic of confidence intervals and hypothesis testing (sections 4.3 – 4.7). Chapter 5 introduces ideas of probability, and in this lab we will practice the binomial for small samples.

Part I: The distribution of sample means

1. Exercise: The z-transformation for the distribution of sample means is:

$z = \frac{\bar{y} - \mu_y}{\sigma / \sqrt{n}}$

Using this transformation,

a. What is the probability that ȳ [?] 12 when μ_y = 10.5, σ² = 50 and N = 25?
b. What is the probability that ȳ [?] 49 when μ_y = 50, σ² = 115 and N = 115?
c. What is the probability that ȳ [?] 24 when μ_y = 25, σ² = 75 and N = 250?
d. What is the probability that ȳ [?] 14 when μ_y = 12, σ² = 70 and N = 35?
e. What is the probability that 98 < ȳ < 101 when μ_y = 100, σ² = 100 and N = 20?

For each of the 5 distributions above, construct 95% probable limits for the distribution of ȳ. To do this, re-read section 3.4 and apply the formula:

$\mu_y \pm 1.96 \, \frac{\sigma}{\sqrt{n}}$

2. Murder in Snedecor Hall:

a. Using the table of murder rates from Lab 1, look over the 99 cities to determine the range of murder rates, and then divide the range into 8 - 12 intervals of equal width. Determine the frequency in each interval and construct a relative frequency polygon. The mean number of murders per 100,000 people for this population of cities is μ = 7.4 and the variance is σ² = 28.4. The SPSS syntax file containing the 99 murder rates (see back of this lab) is saved on the class website as “lab1murder.sps.” It's great for constructing polygons.

b. Your random sampling exercise in Lab 1 generated the following distribution of sample means (see the table below). Use this data to construct a relative frequency polygon that can be directly compared to the one you did in question 2a.
To make them comparable, have the horizontal axis cover the same range, but this time divide the appropriate section of that range into the intervals shown in the table below. Note where your sample mean for n = 30 falls: make a small mark on or near the horizontal axis showing where your mean (for n = 30) lies relative to all other means in the distribution.

Distribution of sample means from Lab 1 (Aug. 29, 2005):

Category        n = 10
3.00 – 3.99        1
4.00 – 4.99        3
5.00 – 5.99       12
6.00 – 6.99       12
7.00 – 7.99       27
8.00 – 8.99       24
9.00 – 9.99       16
10.0 – 10.99       3
11.0 – 11.99       6
12.0 – 12.99       2
13.0 – 13.99       0
                _______
Total            107

n = 30: 1, 5, 13, 9, 7, 1 (Total 36). The categories these six counts belong to are not recoverable from this copy of the table.

c. Using the grouped data formula (approach #3 in the class handout), estimate the mean (average) of the sample means for the table above. How does it compare to the population mean μ?

d. Using the grouped data formula (approach #3), estimate the variance of the sample means for the table above. How does it compare with the population variance (σ²)? How does it compare with the variance of the sampling distribution (σ²/n)?

e. Briefly discuss whether your results are consistent with the Central Limit Theorem on page 80.

Part II: The binomial distribution

3. Exercises: Read sections 5.6 – 5.8 (pp. 126 – 134) in Howell (2002). We summarize the binomial distribution by reporting its parameters as y ~ b(n; p). Using this notation and the formula on page 129, calculate the following probabilities:

a. What is the probability that y [?] 3 when y ~ b(7; 1/2)?
b. What is the probability that y [?] 3 when y ~ b(7; 1/2)?
c. Using your answer to part b, what is the probability that y [?] 3 when y ~ b(7; 1/2)?
d. Draw a relative histogram showing the probability of all outcomes for this distribution.
e. What is the probability that y [?] 3 when y ~ b(9; 1/3)?
f. What is the probability that y [?] 5 when y ~ b(9; 1/3)?

4. A mystery: Did the bank discriminate? This is a true story. A bank in Iowa was accused of discriminating against minorities in its hiring practices.
The bank had 10 tellers, none of whom was from a minority group. The minority labor participation rate was about 15% (0.15) for these kinds of positions.

a. What is the probability that a bank would have 0 minority tellers out of 10 in a 15% minority labor force?
b. Suppose the bank had one minority employee. What is the probability that the bank would have 0 or 1 minority teller in a 15% minority labor force?
c. Would your answer to question 4a be the same if the minority participation rate were 25% instead of 15%?
d. What would be your decision regarding the likelihood of discrimination if the bank had one minority employee and the minority labor participation rate had been 30%?

Part III: Exercise using the General Social Survey

5. Using the SPSS “select if” procedure under the menu heading “Data” to select only female respondents (SEX = 2), obtain a distribution of responses to the statement “Family life suffers if Mom works full time” (FAMSUFFR). Repeat the process for men (SEX = 1). Compare the two distributions by drawing relative frequency histograms. What do you conclude?

6. Selecting only men (SEX = 1) who are working full-time (WRKSTAT = 1), obtain a distribution of responses to the number of times they volunteered for a charitable organization in the last 12 months (VOLWKCHR). Next, select only men who are working part-time, and compare their distribution on the same variable to that of full-time workers. What do you conclude?

____________________________________

title 'Lab 1 Distribution of murder rates'.
set width=80.
data list free/ cityno rate.
begin data
1 3.6
2 9.7
3 11.4
4 0.9
5 10.5
6 5.6
7 3.0
8 16.4
9 14.1
10 3.1
11 4.6
12 5.6
13 1.4
14 6.3
15 0.0
16 9.6
17 8.6
18 9.6
19 1.8
20 2.5
21 5.3
22 0.0
23 1.4
24 3.9
25 11.7
26 11.9
. . .
63 11.9
64 2.1
65 0.7
66 6.0
67 9.0
68 7.1
69 12.3
70 6.2
71 7.9
72 5.8
73 3.9
74 8.5
75 2.9
76 4.0
77 13.9
78 3.2
79 6.5
80 11.9
81 8.2
82 7.9
83 6.0
84 5.9
85 4.5
86 7.4
87 12.4
88 13.4
89 2.4
90 0.0
91 5.7
92 5.1
93 6.2
94 13.4
95 7.5
96 8.8
97 11.9
98 13.0
99 12.2
end data.
frequencies var=rate / statistics = all.
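
For checking hand calculations in Parts I and II, the z-transformation for a sample mean, the 95% probable limits, and the standard binomial formula can be sketched in Python. This is a minimal sketch and not part of the lab's SPSS workflow. The function names and the example values (μ = 20, σ² = 36, n = 9, and b(4; 1/2)) are illustrative assumptions chosen for easy arithmetic; they are not taken from the exercises.

```python
import math


def normal_cdf(z):
    """P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_for_sample_mean(ybar, mu, sigma2, n):
    """z = (ybar - mu) / (sigma / sqrt(n)), with sigma2 the population variance."""
    return (ybar - mu) / math.sqrt(sigma2 / n)


def probable_limits(mu, sigma2, n, z_crit=1.96):
    """Limits mu -/+ z_crit * sigma / sqrt(n) for the distribution of sample means."""
    half_width = z_crit * math.sqrt(sigma2 / n)
    return mu - half_width, mu + half_width


def binomial_pmf(y, n, p):
    """P(Y = y) when Y ~ b(n; p)."""
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)


def binomial_cdf(y, n, p):
    """P(Y <= y) when Y ~ b(n; p), summing the individual probabilities."""
    return sum(binomial_pmf(k, n, p) for k in range(y + 1))


if __name__ == "__main__":
    # Hypothetical values, not from the lab: mu = 20, variance = 36, n = 9.
    z = z_for_sample_mean(22, 20, 36, 9)           # standard error is 2, so z = 1
    print("z =", z)
    print("P(ybar > 22) =", 1 - normal_cdf(z))     # upper-tail probability
    print("95% limits =", probable_limits(20, 36, 9))

    # Hypothetical binomial: y ~ b(4; 1/2).
    print("P(y = 2) =", binomial_pmf(2, 4, 0.5))
    print("P(y <= 1) =", binomial_cdf(1, 4, 0.5))
```

A "greater than or equal to" probability follows from the complement, for example P(y ≥ 2) = 1 − P(y ≤ 1). For a sample mean the normal distribution is continuous, so strict and non-strict inequalities give the same probability.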