Probability, Binomial Distributions and Hypothesis Testing
Vartanian, SW 540

1. Assume you are tossing a coin 11 times. The following distribution gives the likelihoods of getting a particular number of heads:

Heads        0      1      2      3      4      5      6      7      8      9      10     11
Probability  .0005  .0054  .0269  .0806  .1611  .2256  .2256  .1611  .0806  .0269  .0054  .0005

The null hypothesis is that you are no more likely to get a head than a tail. The alternative hypothesis is that the coin is not a fair coin and that you are more likely to get a head than a tail.

A. Is this a one-tailed test or a two-tailed test?
B. If you are using a 95% confidence interval or a 5% rejection region and you got 9 heads out of 11 tosses of the coin, what would you conclude?
C. If you were using a 5% rejection region and got 8 heads, what would you conclude?
D. If you were using a 15% rejection region and got 8 heads, what would you conclude?
E. If you rejected the null hypothesis in question D but found out that in the population (or in reality) the coin was a fair coin, what type of error would you have made?
F. What is the type of error you could have made in part C of this question?
G. If you increased the likelihood of a type I error, would you increase or decrease the rejection region?
H. If you decreased the likelihood of a type II error, what would happen to the likelihood of a type I error? Give an example of increasing the likelihood of a type I error.

D:\WP60\LECT1.PHD\Probability.Questions.wpd Page 1

2. You are examining the difference in income levels between women and men. Your null hypothesis is that there is no difference in income levels between the two groups. Your research hypothesis states that there is a difference in income levels between the groups, but you're not sure which group has a greater income. In your sample you find that men have a mean income level of $12,500 and women have a mean income level of $11,000. You find that the grouped standard error is $500.
You have set up a 95% confidence interval. Assume independently drawn samples, a very large sample size, and normality.

A. Have you found a significant difference in income levels?
B. What will you do with your null hypothesis?
C. What type of error could you have made?
D. Assuming that you have a random sample from the population, can you generalize your results?
E. What are your independent and dependent variables?
F. Have you controlled for alternative explanations to gender differences?

3. Assume that we've drawn another sample of men and women, examining their levels of income. In this sample we find that the mean level of income for men is $11,000 and the mean level of income for women is $13,000. The grouped standard error is $2,000. Assume independently drawn samples, a very large sample size, and normality. Answer questions A-F from question 2.

4. If you know that 10% of the population is gay or lesbian and only 3 out of 45 people in the class are gay or lesbian, was this class chosen randomly? Can you reject your null hypothesis?

5. If 50% of the population are women and 10 out of the 20 people selected for a sample of the population are women, have you chosen randomly? Can you reject your null hypothesis?

6. You are trying to determine whether or not a particular treatment has positive effects on outcomes. In your null hypothesis you state that half the people will be better off after the treatment and half the people will be worse off. Your alternative hypothesis states that people should be better off after the treatment. In a sample of 12 people, 10 are better off after the treatment. Can you reject your null hypothesis? Test this at the .05 level of significance.

7. In question 6, if 8 of the 12 people were better off, could you reject the null hypothesis?

8. In a sample of 10 high school dropouts and 10 high school graduates, you would like to determine if there is a difference in income levels in the population for these groups. Specifically, you would like to determine if graduates have higher income levels than do dropouts. From your sample you find that high school dropouts have a mean income level of $25,000 and high school graduates have a mean income level of $25,200. The standard deviation for both the dropouts and the graduates is $100. Assume random sampling and independence of samples. Because the standard deviations are the same, assume equal variances. Test your hypothesis at the .05 and .01 levels of significance.

9. You assume that the mean value for years of education is 12.5. You draw a sample of 100 people and find that their average years of schooling is 12.4. The standard deviation for this group is .5. Is there a difference between your sample and your hypothesized value? Test this at the .05 and .01 levels of significance.

Answers to the Worksheet Questions
Vartanian: MSS SW 540

1.
A. One-tailed test.
B. Reject the null since 0.0269 + 0.0054 + 0.0005 = 0.0328 < 0.05.
C. Accept the null since 0.0806 + 0.0269 + 0.0054 + 0.0005 = 0.1134 > 0.05.
D. Reject the null since 0.1134 < 0.1500.
E. Type I error. You rejected the null when there is no relationship between the independent and dependent variable. Whenever you reject the null, the only type of error you could possibly have made is a type I error. If you accept the null, the only type of error you could have made is a type II error.
F. You could have made a type II error since you accepted the null hypothesis.
G. To increase the likelihood of a type I error, you must increase the rejection region. For example, if the rejection region goes from .05 to .10, the likelihood of a type I error increases from 5% to 10%.
H. The likelihood of a type I error would increase, from say 5% to 10%.

2.
A. Since this is a two-tailed test, you will have a 2 1/2% rejection region on each tail of the normal distribution. In order to reject the null hypothesis you must land in the rejection region. The area between the central point of the normal distribution and the rejection region is 47.50%. Compute a Z score: Z = (12,500 - 11,000)/500 = 3. You are thus 3 standard deviation units away from the hypothesized difference between the two groups (which is 0). Looking this value up in the Z table gives you an area of 49.87%. You are therefore at the 99.87th percentile (50 + 49.87). Hence, you are within the 2 1/2% rejection region and you will therefore reject your null hypothesis. Any Z value that took you to the 97 1/2 percentile or beyond (Z of 1.96 or more) would have allowed you to reject the null hypothesis.
B. Reject it.
C. You may have made a type I error if, in the population, there truly is no relationship between gender and level of income. With a 5% rejection region, there is only a 5% chance of rejecting the null hypothesis when no such relationship exists.
D. Yes.
E. The independent variable is gender and the dependent variable is income.
F. No. You may wish to control for alternative explanations and thus use control variables within your statistical model. Factors such as education, family background, and level of work experience may be important in the analysis of the determinants of income.

3.
A. Z = (11,000 - 13,000)/2,000 = -1. Thus, you are 1 standard deviation unit away from the hypothesized difference between the groups. The area between the hypothesized difference and the observed difference is 34.13%. We are thus at the 15.87th percentile (50 - 34.13 = 15.87). Since we are neither in the top 2 1/2 percent (at or above the 97 1/2 percentile) nor in the bottom 2 1/2 percent, we are not in the rejection region.
B. Accept the null hypothesis.
Even if this were a one-tailed test, you would not have rejected the null hypothesis, since the difference divided by the grouped standard error did not fall in the lowest 5% of the normal distribution.
C. You may have made a type II error. Whenever you accept the null hypothesis, there is a chance you have made a type II error.
D. You can say that in all likelihood there is no relationship between gender and income in the population.
E. IV: Gender. DV: Income.
F. No.

4.
\[ P(0) = \frac{45!}{0!\,45!}(.10)^{0}(.90)^{45} = 0.0087 \]
\[ P(1) = \frac{45!}{1!\,44!}(.10)^{1}(.90)^{44} = 0.0436 \]
\[ P(2) = \frac{45!}{2!\,43!}(.10)^{2}(.90)^{43} = 0.1067 \]
\[ P(3) = \frac{45!}{3!\,42!}(.10)^{3}(.90)^{42} = 0.1699 \]
You cannot reject the null hypothesis since the cumulative probability of 3 or fewer is about 32.89% (16.99 + 10.67 + 4.36 + 0.87), which places you near the 33rd percentile of the distribution, well above the 5% level. Even if only a single gay or lesbian person had been chosen, the cumulative probability would be 0.0087 + 0.0436 = 0.0523, still just above .05; only if none had been chosen (0.0087) could you reject the null hypothesis of random sampling at the .05 level.

5. You cannot reject the null hypothesis when the proportion of the sample is exactly the same as the proportion in the population.

6.
\[ P(10) = \frac{12!}{10!\,2!}(.5)^{10}(.5)^{2} = 0.0161 \]
\[ P(11) = \frac{12!}{11!\,1!}(.5)^{11}(.5)^{1} = 0.0029 \]
\[ P(12) = \frac{12!}{12!\,0!}(.5)^{12}(.5)^{0} = 0.000244 \]
If you add these probabilities up, you get a total of about .019, or 1.9% (79/4096 = .0193 exactly). Therefore, at the .05 level you would reject the null hypothesis that the treatment has no effect. If you were testing this at the .01 level of significance, you would not reject the null hypothesis.

7.
\[ P(8) = \frac{12!}{8!\,4!}(.5)^{8}(.5)^{4} = 0.1208 \]
\[ P(9) = \frac{12!}{9!\,3!}(.5)^{9}(.5)^{3} = 0.0537 \]
\[ P(10) = \frac{12!}{10!\,2!}(.5)^{10}(.5)^{2} = 0.0161 \]
\[ P(11) = \frac{12!}{11!\,1!}(.5)^{11}(.5)^{1} = 0.0029 \]
\[ P(12) = \frac{12!}{12!\,0!}(.5)^{12}(.5)^{0} = 0.000244 \]
Because these probabilities add up to about .194, which is more than .05, we would not reject the null hypothesis.

8. This is a small-sample difference of means test.
First, determine the standard error for the estimate:
\[ \hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1} = \sqrt{\frac{9(100)^2 + 9(100)^2}{18}}\,\sqrt{\frac{20}{100}} = \sqrt{\frac{9(10000) + 9(10000)}{18}}\,\sqrt{.2} = \sqrt{10000}\,\sqrt{.2} = 100(.4472) = 44.72 \]
\[ t = \frac{200}{44.72} = 4.472 \]
At 18 degrees of freedom, the critical value for a one-tailed, .05 test is 1.734. Because the t value is greater than the critical value, you would not accept the null hypothesis for a .05 test (reject the null). For the .01 test, the critical value is 2.552. Because the t value is greater than the critical value, you would not accept the null hypothesis (reject the null). At both significance levels, the results support the research hypothesis.

9. Use a z score test to determine whether there is a difference between the hypothesized value and the actual value. The standard error for the estimate is:
\[ SE = \frac{s}{\sqrt{n}} = \frac{.5}{\sqrt{100}} = \frac{.5}{10} = .05 \]
The z value is
\[ z = \frac{12.5 - 12.4}{.05} = \frac{.1}{.05} = 2 \]
For a two-tailed, .05 test, the critical value is 1.96. Because the z value is larger than the critical value, you will reject the null hypothesis. For a one-tailed, .05 test, the critical value is 1.64, so you will again reject the null hypothesis that there is no difference between the sample value and the hypothesized value. For a two-tailed, .01 test, you would need a z value of 2.576 (or 2.58) in order to reject the null. In this case, you will not reject the null for the .01, two-tailed test. For a one-tailed, .01 test, you would need a z score of 2.326 in order to reject the null. Again, you will not reject the null for a one-tailed, .01 test.
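
The binomial, z, and t answers in this worksheet can be checked numerically. What follows is a minimal sketch, assuming Python 3.8 or later and only the standard library. The function names (binom_pmf, upper_tail, lower_tail, z_percentile, pooled_t) are illustrative and are not part of the worksheet. Because the worksheet adds rounded table entries, exact results may differ from it in the last digit (for example, .0327 rather than .0328 for 9 or more heads in question 1).

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k): probability of k or more successes."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def lower_tail(k, n, p):
    """P(X <= k): probability of k or fewer successes."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


def z_percentile(difference, std_error):
    """Return (z, cumulative proportion) for a difference tested against 0."""
    z = difference / std_error
    return z, NormalDist().cdf(z)


def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """Equal-variance (pooled) t statistic for mean2 - mean1."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var) * sqrt((n1 + n2) / (n1 * n2))
    return (mean2 - mean1) / std_error


if __name__ == "__main__":
    # Question 1: 11 coin tosses, one-tailed test for "more heads".
    print(f"Q1 P(9 or more heads) = {upper_tail(9, 11, 0.5):.4f}")
    print(f"Q1 P(8 or more heads) = {upper_tail(8, 11, 0.5):.4f}")
    # Question 4: 45 people, 10% population rate, 3 or fewer observed.
    print(f"Q4 P(3 or fewer)      = {lower_tail(3, 45, 0.1):.4f}")
    print(f"Q4 P(1 or fewer)      = {lower_tail(1, 45, 0.1):.4f}")
    # Questions 6 and 7: 12 people, half expected to improve under the null.
    print(f"Q6 P(10 or more)      = {upper_tail(10, 12, 0.5):.4f}")
    print(f"Q7 P(8 or more)       = {upper_tail(8, 12, 0.5):.4f}")
    # Questions 2 and 3: difference in mean income over its standard error.
    for label, diff, se in [("Q2", 12500 - 11000, 500), ("Q3", 11000 - 13000, 2000)]:
        z, pct = z_percentile(diff, se)
        print(f"{label} z = {z:.2f}, percentile = {100 * pct:.2f}")
    # Question 8: dropouts vs. graduates, 10 per group, SD of 100 each.
    print(f"Q8 t = {pooled_t(25000, 25200, 100, 100, 10, 10):.3f}")
```

Each tail probability is compared with the chosen rejection region (.05, .15, or .01), and each z or t statistic with the critical values quoted in the answers. Critical values for the t distribution are not computed here, since the standard library has no t distribution; they are still taken from a table.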