Stat 101: Chapters 16-20. Note: The Final Is Cumulative CH 1-20

Important Formulas and Concepts¹

1 Chapter 16

1.1 Definitions

1. Standard Error: When we estimate the standard deviation of a sampling distribution using statistics found from the data, the estimate is called a standard error. For a sample proportion, $SE(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$.
2. Confidence Interval (CI): A level C confidence interval for a model parameter is an interval of values, usually of the form Estimate ± Margin of Error, found from data in such a way that C% of all random samples will yield intervals that capture the true parameter value.
3. One Proportion z-interval: A confidence interval for the true value of a proportion. The confidence interval is $\hat{p} \pm z^*_{1-\alpha/2}\,SE(\hat{p})$, where $z^*$ is a critical value from the standard normal model corresponding to the specified confidence level.
4. Margin of Error (MOE): In a confidence interval, the extent of the interval on either side of the observed statistic value. It is typically the product of a critical value from the sampling distribution and a standard error from the data. A small MOE corresponds to a confidence interval that pins down the parameter precisely. A large MOE corresponds to a confidence interval that gives relatively little information about the estimated parameter. $MOE_{proportion} = z^*\sqrt{\hat{p}(1-\hat{p})/n}$.
5. Critical Value: The number of standard errors to move away from the mean of the sampling distribution to correspond to the specified level of confidence. The critical value for a normal sampling distribution, denoted $z^*$, is usually found from a table or technology.

1.2 Some z values (Critical Values) for Confidence Intervals

CI        z*
90% CI    1.645
95% CI    1.96
99% CI    2.576

¹ This version: November 21, 2015, by Jennifer Pajda-De La O. May not include all things that could possibly be tested on. To be used as an additional reference to studying all Chapters 16-20.
Most definitions, formulas, and selected problems come from Intro Stats by De Veaux, Velleman and Bock, 4th edition, published by Pearson.

2 Extra Information

Review any and all notes and supplementary materials. It may be the case that something was accidentally omitted from this study guide. Also, review any problems that may have been discussed in class, as not all example problems may have been provided here.

3 Chapter 17

1. Hypothesis: A model or proposition that we adopt in order to test.
2. Null Hypothesis ($H_0$): The claim being assessed in a hypothesis test that states "no change from the traditional value," "no effect," "no difference," or "no relationship." For a claim to be a testable null hypothesis, it must specify a value for some population parameter that can form the basis for assuming a sampling distribution for a test statistic.
3. Alternative Hypothesis ($H_A$): The alternative hypothesis proposes what we should conclude if we reject the null hypothesis.
4. P-value: The probability of observing a value for a test statistic at least as far from the hypothesized value as the statistic value actually observed if the null hypothesis is true. A small p-value indicates either that the observation is improbable or that the probability calculation was based on incorrect assumptions. The assumed truth of the null hypothesis is the assumption under suspicion.
5. One-proportion Z-test: A test of the null hypothesis that the proportion of a single sample equals a specified value ($H_0: p = p_0$) by referring the statistic $z = (\hat{p} - p_0)/SD(\hat{p})$ to a standard normal model.
6. Effect Size: The difference between the null hypothesis value and the true value of a model parameter.
7. Two-sided (Tailed) Alternative: An alternative hypothesis is two-sided ($H_A: p \neq p_0$) when we are interested in deviations in either direction away from the hypothesized parameter value.
8.
One-sided (Tailed) Alternative: An alternative hypothesis is one-sided ($H_A: p > p_0$ or $H_A: p < p_0$) when we are interested in deviations in only one direction away from the hypothesized parameter value.

4 Chapter 18

1. Student's t distribution: A family of distributions indexed by its degrees of freedom. The t-models are unimodal, symmetric, and bell shaped, but have fatter tails and a narrower center than the Normal model. As the degrees of freedom increase, t-distributions approach the Normal distribution.
2. Degrees of Freedom for Student's t distribution (df): For the t-distribution, the degrees of freedom are equal to $n - 1$, where $n$ is the sample size.
3. One-sample t-interval for the mean: This is the confidence interval for the mean. It is given by $\bar{y} \pm t^*_{n-1}\,SE(\bar{y})$, where $SE(\bar{y}) = s/\sqrt{n}$. The critical value $t^*_{n-1}$ depends on the particular confidence level that you specify and on the number of degrees of freedom $n - 1$.
4. One-sample t-test for the mean: This is the hypothesis test. It tests the hypothesis $H_0: \mu = \mu_0$ using the statistic $t_{n-1} = (\bar{y} - \mu_0)/(s/\sqrt{n})$.

5 Chapter 19

1. Statistically significant: When the p-value falls below the alpha level, we say that the test is "statistically significant" at that alpha level.
2. Alpha level: The threshold p-value that determines when we reject a null hypothesis. If we observe a statistic whose p-value based on the null hypothesis is less than α, we reject that null hypothesis.
3. Significance level: The alpha level is also called the significance level, most often in a phrase such as a conclusion that a particular test is "significant at the 5% significance level."
4. Critical value: The value in the sampling distribution model of the statistic whose p-value is equal to the alpha level. The critical value is often denoted with an asterisk, as $z^*$ and $t^*$.
5. Type I Error: The error of rejecting a null hypothesis when in fact it is true (also called a false positive). The probability of a Type I Error is α.
6.
Type II Error: The error of failing to reject a null hypothesis when in fact it is false (also called a false negative). The probability of a Type II Error is β.
7. β: The probability of a Type II Error is commonly denoted β and depends on the effect size.
8. Power: The probability that a hypothesis test will correctly reject a false null hypothesis is the power of the test. To find the power, we must specify a particular alternative parameter value as the "true" value. For any specific value in the alternative, the power is $1 - \beta$.
9. Effect Size: The difference between the null hypothesis value and the true value of a model parameter.

6 Chapter 20

1. Sampling distribution of the difference between two proportions: The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is, under appropriate assumptions, modeled by a Normal model with mean $\mu = p_1 - p_2$ and standard deviation $SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}$.
2. Two-proportion z-interval: This is the confidence interval. A two-proportion z-interval gives a confidence interval for the true difference in proportions, $p_1 - p_2$, in two independent groups. The confidence interval is $(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$, where $z^*$ is the critical value from the standard Normal model corresponding to the specified confidence level.
3. Pooling: Data from two or more populations may sometimes be combined, or pooled, to estimate a statistic (typically a pooled variance) when the estimated value is assumed to be the same in both populations. The resulting larger sample size may lead to an estimate with lower sample variance. However, pooled estimates are appropriate only when the required assumptions are true.
4. Two-proportion z-test: This is the hypothesis test. Test the null hypothesis $H_0: p_1 - p_2 = 0$ by comparing the statistic $z = (\hat{p}_1 - \hat{p}_2)/SE_{pooled}(\hat{p}_1 - \hat{p}_2)$ to the standard normal model.
5.
Two-sample t-interval for the difference between means: A confidence interval for the difference between the means of two independent groups is found as $(\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2)$. Here, $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$, and the number of degrees of freedom is given by a special formula.
6. Two-sample t-test for the difference between means: A hypothesis test for the difference between the means of two independent groups. It tests the null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$, where the hypothesized difference $\Delta_0$ is almost always 0. This uses the statistic $t_{df} = \dfrac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE(\bar{y}_1 - \bar{y}_2)}$, with the degrees of freedom given by a special formula.
7. Pooled t-test: A hypothesis test for the difference in the means of two independent groups when we are willing and able to assume that the variances of the groups are equal. It tests the null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$, where the hypothesized difference $\Delta_0$ is almost always 0. This uses the statistic $t_{df} = \dfrac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$, with degrees of freedom $(n_1 - 1) + (n_2 - 1)$.

7 Confidence Interval Creation and Hypothesis Testing Summary

7.1 1-Proportion

Proportion - always use $p$

• Confidence Interval Creation
  CI: $\hat{p} \pm \underbrace{z^*\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}}_{MOE}$, where $z^*$ = critical value.
  Table of critical values $z^*$ for Confidence Intervals:

  CI     z*
  90%    1.645
  95%    1.96
  96%    2.054
  98%    2.326
  99%    2.576

• Hypothesis Testing
  Step 1: Write down your hypotheses: $H_0: p = p_0$ and $H_A: p < p_0$, $p > p_0$, or $p \neq p_0$.
  Step 2: Calculate your test statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$.
  Step 3: Calculate the p-value.
  Step 4: State your conclusion. If p-value ≤ α (usually α = 0.05), then Reject $H_0$. If p-value > α, then Do Not Reject $H_0$.

7.2 1-Sample Mean

Sample Mean - always use $\bar{x}$ or $\bar{y}$

The degrees of freedom are given by $df = n - 1$.

• Confidence Interval Creation
  CI: $\bar{y} \pm \underbrace{t^*_{n-1}\dfrac{s}{\sqrt{n}}}_{MOE}$, where $t^*_{n-1}$ = critical value.
  Use Appendix D Table T to determine the critical values of t.
• Hypothesis Testing
  Step 1: Write down your hypotheses: $H_0: \mu = \mu_0$ and $H_A: \mu < \mu_0$, $\mu > \mu_0$, or $\mu \neq \mu_0$.
  Step 2: Calculate your test statistic: $t_{n-1} = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}}$.
  Step 3: Calculate the p-value.
  Step 4: State your conclusion. If p-value ≤ α (usually α = 0.05), then Reject $H_0$. If p-value > α, then Do Not Reject $H_0$.

7.3 Difference of Proportions

Difference of Proportions - always use $p_1 - p_2$

• Confidence Interval Creation
  CI: $(\hat{p}_1 - \hat{p}_2) \pm \underbrace{z^*\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}_{MOE}$, where $z^*$ = critical value.
  See Section ?? for examples of the critical values for $z^*$.
• Hypothesis Testing
  Step 1: Write down your hypotheses: $H_0: p_1 - p_2 = 0$ and $H_A: p_1 - p_2 < 0$, $p_1 - p_2 > 0$, or $p_1 - p_2 \neq 0$.
  Step 2: Calculate your test statistic:
  $z = \dfrac{\hat{p}_1 - \hat{p}_2}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$,
  $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \dfrac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}}$,
  $\hat{p}_{pooled} = \dfrac{\text{Number of Successes in Group 1} + \text{Number of Successes in Group 2}}{n_1 + n_2}$.
  Step 3: Calculate the p-value.
  Step 4: State your conclusion. If p-value ≤ α (usually α = 0.05), then Reject $H_0$. If p-value > α, then Do Not Reject $H_0$.

7.4 Difference of Means - 2 Independent Groups; Any type of Variance

Difference of Means - always use $\bar{x}_1 - \bar{x}_2$ or $\bar{y}_1 - \bar{y}_2$

The degrees of freedom are given by (round down to the nearest integer)
$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}.$$

• Confidence Interval Creation
  CI: $(\bar{y}_1 - \bar{y}_2) \pm \underbrace{t^*_{df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}_{MOE}$, where $t^*_{df}$ = critical value.
• Hypothesis Testing
  Step 1: Write down your hypotheses: $H_0: \mu_1 - \mu_2 = \Delta_0$ and $H_A: \mu_1 - \mu_2 < \Delta_0$, $\mu_1 - \mu_2 > \Delta_0$, or $\mu_1 - \mu_2 \neq \Delta_0$.
  Step 2: Calculate your test statistic: $t_{df} = \dfrac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$.
  Step 3: Calculate the p-value.
  Step 4: State your conclusion. If p-value ≤ α (usually α = 0.05), then Reject $H_0$. If p-value > α, then Do Not Reject $H_0$.

7.5 Difference of Means - 2 Independent Groups; Equal Variance in both groups

Difference of Means - always use $\bar{x}_1 - \bar{x}_2$ or $\bar{y}_1 - \bar{y}_2$

The degrees of freedom are given by $df = n_1 + n_2 - 2$.
• Confidence Interval Creation
  CI: $(\bar{y}_1 - \bar{y}_2) \pm \underbrace{t^*_{df} \times SE_{pooled}(\bar{y}_1 - \bar{y}_2)}_{MOE}$, where $t^*_{df}$ = critical value.
• Hypothesis Testing
  Step 1: Write down your hypotheses: $H_0: \mu_1 - \mu_2 = \Delta_0$ and $H_A: \mu_1 - \mu_2 < \Delta_0$, $\mu_1 - \mu_2 > \Delta_0$, or $\mu_1 - \mu_2 \neq \Delta_0$.
  Step 2: Calculate your test statistic:
  $t_{df} = \dfrac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$,
  $SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled}\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$,
  $s_{pooled} = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$.
  Step 3: Calculate the p-value.
  Step 4: State your conclusion. If p-value ≤ α (usually α = 0.05), then Reject $H_0$. If p-value > α, then Do Not Reject $H_0$.

7.6 Paired Differences of Means

The degrees of freedom are given by $df = n - 1$.

• Confidence Interval Creation ($n$ = number of pairs)
  CI: $\bar{d} \pm \underbrace{t^*_{n-1}\dfrac{s_d}{\sqrt{n}}}_{MOE}$, where $t^*_{n-1}$ = critical value.
• Hypothesis Testing
  Step 1: Write down your hypotheses: $H_0: \mu_d = \Delta_0$ and $H_A: \mu_d < \Delta_0$, $\mu_d > \Delta_0$, or $\mu_d \neq \Delta_0$.
  Step 2: Calculate your test statistic: $t_{n-1} = \dfrac{\bar{d} - \Delta_0}{s_d/\sqrt{n}}$, where $\bar{d}$ = average of the differences and $s_d$ = standard deviation of the differences.
  Step 3: Calculate the p-value.
  Step 4: State your conclusion. If p-value ≤ α (usually α = 0.05), then Reject $H_0$. If p-value > α, then Do Not Reject $H_0$.

8 Extra Information

Review any and all notes and supplementary materials. It may be the case that something was accidentally omitted from this study guide. Also, review any problems that may have been discussed in class, as not all example problems may have been provided here.

9 Example Problems

Q3 pg 444 The 95% confidence interval for the number of teens who reported that they had misrepresented their age online is from 45.6% to 52.5%. There were 799 teens in this study.
(a) Interpret the interval in this context.
(b) Explain the meaning of "95% confident" in this context.

1. A study found that 16 of 40 peanut candy bars in fact did not contain peanuts.
(a) Construct a 90% confidence interval.
(b) Interpret your 90% confidence interval.
(c) Construct a 95% confidence interval.
(d) Interpret your 95% confidence interval.
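Intervals such as the ones asked for in problem 1 can be checked numerically. The following is a minimal sketch of the one-proportion z-interval from Section 7.1, using only the Python standard library. The function name `one_proportion_z_interval` is hypothetical (it does not come from the textbook), and the critical value is computed from the standard normal model rather than read from the rounded table, so the last decimal place may differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence):
    """Return (low, high) for p_hat ± z* · sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    # Two-sided critical value: leave (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    moe = z_star * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - moe, p_hat + moe


if __name__ == "__main__":
    # Problem 1: 16 of 40 candy bars did not contain peanuts.
    for level in (0.90, 0.95):
        low, high = one_proportion_z_interval(16, 40, level)
        print(f"{level:.0%} CI: ({low:.4f}, {high:.4f})")
```

Raising the confidence level while keeping the same data widens the interval, which is the behavior the margin-of-error formula predicts.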
Q15,16 pg 446 Several factors are involved in the creation of a confidence interval. Among them are the sample size, the level of confidence, and the margin of error. Which statements are true?
(a) For a given sample size, higher confidence means a smaller margin of error.
(b) For a given confidence level, halving the margin of error requires a sample twice as large.
(c) For a certain confidence level, you can get a smaller margin of error by selecting a bigger sample.
(d) For a fixed margin of error, larger samples provide greater confidence.

9.1 Chapter 16

1. I sample 600 people and 432 of them like cats. Construct a 95% confidence interval for the population proportion.
2. I think the proportion of people that eat candy is around 0.75. I am going to construct a 90% confidence interval and want the margin of error to be ±0.025. How large should the sample size be?
3. Jimmy samples 930 people and 234 took public transportation. Construct a 99% confidence interval for the population proportion.
4. I am going to construct a 95% confidence interval for the proportion of people that wear eyeglasses and want the margin of error to be ±0.2. I have no idea what to estimate for the population proportion. How large should the sample size be?

9.2 Chapter 17

1. A researcher believes that more than 50% of all people voted in the last election. She samples 800 people and 420 of them voted. Test her claim at a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
(b) Compute the test statistic (z-value). You must show your computation to receive credit.
(c) Compute the P-value associated with your test statistic.
(d) Make a conclusion about the hypotheses.
2. A researcher believes that fewer than 75% of all mollusks are tasty. He samples 1200 mollusks and 865 of them are tasty. Test his claim at a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
(b) Compute the test statistic (z-value). You must show your computation to receive credit.
(c) Compute the P-value associated with your test statistic.
(d) Make a conclusion about the hypotheses.
3. A researcher believes that the percentage of people that watch Game of Thrones is different than 27%. He samples 900 people and 220 of them watch. Test his claim at a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
(b) Compute the test statistic (z-value). You must show your computation to receive credit.
(c) Compute the P-value associated with your test statistic.
(d) Make a conclusion about the hypotheses.

9.3 Chapter 18

1. A butcher wants to estimate the mean weight of a ham. She samples 33 hams and computes a sample mean weight of 8.2 pounds and a sample standard deviation of 3.3 pounds. What is a 90% confidence interval for the population mean weight of ham? Please indicate the value you used for $z^*$ or $t^*$.
2. A professor is interested in the mean length of a letter of recommendation. He samples 51 letters and finds a sample mean length of 620 words with a sample standard deviation of 90 words. What is a 95% confidence interval for the population mean length of a letter? Please indicate the value you used for $z^*$ or $t^*$.
3. A computer professional wants to know the mean number of emails people receive each day. She is going to compute a 95% confidence interval and wants a margin of error of ±2 emails. She believes the standard deviation to be 18 emails. How large should the sample size be to ensure this margin of error?
4. A researcher believes that the mean age at which a person first votes is greater than 22 years. He samples 27 people and computes a sample mean of 24.3 years and a sample standard deviation of 8 years.
(a) State the hypotheses to be tested.
(b) What is the value of your test statistic (t or z value)?
(c) What is the P-value?
(d) What conclusion should be drawn (compare p-value to 0.05)?
5.
A researcher believes that the mean age at which a person first tries chocolate is less than 3 years. He samples 24 people and computes a sample mean of 2.3 years and a sample standard deviation of 1.5 years.
(a) State the hypotheses to be tested.
(b) What is the value of your test statistic (t or z value)?
(c) What is the P-value?
(d) What conclusion should be drawn (compare p-value to 0.05)?
6. A researcher believes that the mean height of a prairie dog is different than 14 inches. She samples 31 prairie dogs and computes a sample mean of 15.8 inches and a sample standard deviation of 3.6 inches.
(a) State the hypotheses to be tested.
(b) What is the value of your test statistic (t or z value)?
(c) What is the P-value?
(d) What conclusion should be drawn (compare p-value to 0.05)?

9.4 Chapter 19

Q4 pg 526 Which of the following are true? If false, explain briefly.
(a) A very low P-value provides evidence against the null hypothesis.
(b) A high P-value is strong evidence in favor of the null hypothesis.
(c) A P-value above 0.10 shows that the null hypothesis is true.
(d) If the null hypothesis is true, you can't get a p-value below 0.01.

Q7 pg 526 Which of the following statements are true? If false, explain briefly.
(a) Using an alpha level of 0.05, a p-value of 0.04 results in rejecting the null hypothesis.
(b) The alpha level depends on the sample size.
(c) With an alpha level of 0.01, a p-value of 0.10 results in rejecting the null hypothesis.
(d) Using an alpha level of 0.05, a p-value of 0.06 means the null hypothesis is true.

Q11 pg 527 For each of the following situations, state whether a Type I error, a Type II error, or neither has been made. Explain briefly.
(a) A bank wants to know if the enrollment on their website is above 30% based on a small sample of customers. They test $H_0: p = 0.3$ versus $H_A: p > 0.3$ and reject the null hypothesis. Later they find out that actually 28% of all customers enrolled.
(b) A student tests 100 students to determine whether other students on her campus prefer Coke or Pepsi and finds no evidence that preference for Coke is not 0.5. Later, a marketing company tests all students on campus and finds no difference.
(c) A human resource analyst wants to know if the applicants this year score, on average, higher on their placement exam than the 52.5 points the candidates averaged last year. She samples 50 recent tests and finds the average to be 54.1 points. She fails to reject the null hypothesis that the mean is 52.5 points. At the end of the year, they find that the candidates this year had a mean of 55.3 points.
(d) A pharmaceutical company tests whether a drug lifts the headache relief rate from the 25% achieved by the placebo. They fail to reject the null hypothesis because the p-value is 0.465. Further testing shows that the drug actually relieves headaches in 38% of people.

9.5 Chapter 20

1. A researcher samples 600 children and 500 of them like ice cream. She also samples 450 adults and 350 of them like ice cream. Construct a 95% confidence interval for the difference of population proportions of children and adults that like ice cream.
2. A researcher samples 1200 children and 500 of them like to exercise. She also samples 900 adults and 350 of them like to exercise. Construct a 90% confidence interval for the difference of population proportions of children and adults that like to exercise.
3. A scientist believes that the proportion of North American bees that are hostile is greater than the proportion of South American bees. She samples 500 North American bees and 200 are hostile. She samples 600 South American bees and 230 are hostile.
(a) State the hypotheses to be tested.
(b) Compute the sample statistic (z value or t value). You must show work to receive credit.
(c) Give the P-value or range of P-values.
(d) What decision should the scientist make at a significance level of 5%?
4.
A scientist believes that the proportion of North American bears that are hostile is greater than the proportion of South American bears. She samples 800 North American bears and 200 are hostile. She samples 1200 South American bears and 240 are hostile.
(a) State the hypotheses to be tested.
(b) Compute the sample statistic (z value or t value). You must show work to receive credit.
(c) Give the P-value or range of P-values.
(d) What decision should the scientist make at a significance level of 5%?

Q61 pg 578 A man who moves to a new city sees that there are two routes he could take to work. A neighbor who has lived there a long time tells him Route A will average 5 minutes faster than Route B. The man decides to experiment; he wants to find out if the mean difference between Route A and B is different from 5 minutes. Each day, he flips a coin to determine which way to go, driving each route 20 days. He finds that Route A takes an average of 40 minutes, with a standard deviation of 3 minutes, and Route B takes an average of 43 minutes, with a standard deviation of 2 minutes. Histograms of travel times for the routes are roughly symmetric and show no outliers. Assume α = 0.05.
(a) Find a 95% confidence interval for the difference in average commuting time for the two routes. Use df = 33.
(b) State the hypotheses to be tested.
(c) Compute the value of the test score.
(d) Give the P-value or range of P-values.
(e) Do the results seem significant?

Q78 pg 582 Researchers randomly assigned participants either a tall, thin "highball" glass or a short, wide "tumbler," each of which held 355 ml. Participants were asked to pour 1.5 oz = 44.3 ml of water into their glass. Did the shape of the glass make a difference in how much liquid they poured? In particular, test to see if they poured less water into the "highball" glass than the "tumbler". Assume α = 0.1.
Here are the summaries:

        Highball    Tumbler
n       99          99
ȳ       42.2 ml     60.9 ml
s       16.2 ml     17.9 ml

(a) Find a 90% confidence interval for the difference in average water held for the two glasses. Use df = 194.
(b) State the hypotheses to be tested.
(c) Compute the value of the test score. (Assume all conditions are met.)
(d) Give the P-value or range of P-values.
(e) Do the results seem significant?

9.6 Various Chapters

1. We want to estimate the healing rate for a wound. A sample of size 17 is collected and the sample mean is computed to be 24.3 micrometers per hour, with a sample standard deviation of s = 8 micrometers per hour. What is a 95% confidence interval for the population mean?
2. A sample of size n = 150 people is collected and the sample proportion of people who are illiterate is computed to be 0.20. Compute a 95% confidence interval for the population proportion of illiterate people.
3. You believe that the proportion of people that like cheese is 0.80. You are going to construct a 95% confidence interval and want the margin of error to be plus or minus 0.03. What should the sample size be?
4. Teresa knows that appointment times are approximately normally distributed. She believes the mean wait time is longer than 25 minutes. She conducts a test with α = 0.05 and the appropriate hypotheses. She selects 25 random appointments and finds a sample mean of 25.66 minutes and a sample standard deviation of 10 minutes.
(a) State the hypotheses to be tested.
(b) Compute the value of the test score.
(c) Give the P-value or range of P-values.
(d) Do the results seem significant?
5. You claim that the proportion of people who watch American Idol is greater than 0.50. You sample n = 200 people and compute a sample proportion of 0.53. Assume α = 0.05.
(a) State the hypotheses to be tested.
(b) Compute the value of the test score.
(c) Give the P-value or range of P-values.
(d) Do the results seem significant?
6.
You want to compare the proportion of gamers amongst women and men. You survey 300 women and 400 men. 175 of the women were gamers and 200 of the men were gamers. Construct a 95% confidence interval for the difference of proportions.
7. You believe that the proportion of men that are colorblind is greater than the proportion of women that are colorblind. You sample 900 men and 90 of them are colorblind. You sample 700 women and 45 of them are colorblind. Assume α = 0.05.
(a) State the hypotheses to be tested.
(b) Compute the value of the test score.
(c) Did you use the pooled proportion in part (b)?
(d) Compute the P-value.
(e) Are the results significant?

10 Example Solutions

Q3 pg 444
(a) We are 95% confident that, if we were to ask all teens whether they have misrepresented their age online, between 45.6% and 52.5% of them would say they have.
(b) If we were to collect many random samples of 799 teens, about 95% of the confidence intervals would contain the true proportion of all teens who admit to misrepresenting their age online.

1. This problem tells us that $\hat{p} = 16/40$ and $n = 40$.
(a) For a 90% confidence interval, $z^* = 1.645$. The 90% confidence interval would then be
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \frac{16}{40} \pm 1.645\sqrt{\frac{\left(\frac{16}{40}\right)\left(1-\frac{16}{40}\right)}{40}} = \frac{16}{40} \pm 1.645\sqrt{0.006} = (0.2726, 0.5274).$$
(b) We are 90% confident that between 27% and 53% of all peanut candy bars did not contain peanuts.
(c) For a 95% confidence interval, $z^* = 1.96$. The 95% confidence interval would then be
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \frac{16}{40} \pm 1.96\sqrt{\frac{\left(\frac{16}{40}\right)\left(1-\frac{16}{40}\right)}{40}} = \frac{16}{40} \pm 1.96\sqrt{0.006} = (0.2482, 0.5518).$$
(d) We are 95% confident that between 25% and 55% of all peanut candy bars did not contain peanuts.

Q15,16 pg 446
(a) False. Higher confidence means a larger margin of error. Suppose $n = 10$ and $\hat{p} = 0.5$. Start with 90% confidence ($z^* = 1.645$) and calculate the MOE. Now change to 95% confidence ($z^* = 1.96$) and calculate the MOE. Compare the two results.
$$MOE_{90} = 1.645\sqrt{\frac{0.5(1-0.5)}{10}} = 1.645\sqrt{\frac{0.25}{10}} = 1.645\sqrt{0.025} = 0.26,$$
$$MOE_{95} = 1.96\sqrt{\frac{0.5(1-0.5)}{10}} = 1.96\sqrt{\frac{0.25}{10}} = 1.96\sqrt{0.025} = 0.31.$$
From this, we can see that MOE increases when confidence increases.
(b) False. The margin of error decreases as the square root of the sample size increases. Halving the margin of error requires a sample four times as large as the original. Suppose $\hat{p} = 0.5$ and 95% confidence ($z^* = 1.96$). Start with MOE = 0.6, then compare with MOE = 0.3.
$$0.6 = 1.96\sqrt{\frac{0.5(1-0.5)}{n}} \Rightarrow \frac{0.6}{1.96} = \sqrt{\frac{0.25}{n}} \Rightarrow 0.306 = \sqrt{\frac{0.25}{n}} \Rightarrow 0.0937 = \frac{0.25}{n} \Rightarrow n = \frac{0.25}{0.0937} \Rightarrow n = 2.668,$$
$$0.3 = 1.96\sqrt{\frac{0.5(1-0.5)}{n}} \Rightarrow \frac{0.3}{1.96} = \sqrt{\frac{0.25}{n}} \Rightarrow 0.153 = \sqrt{\frac{0.25}{n}} \Rightarrow 0.0234 = \frac{0.25}{n} \Rightarrow n = \frac{0.25}{0.0234} \Rightarrow n = 10.68.$$
So our original $n = 2.668$ and the new $n = 10.68$, which is approximately 4 times the original value of $n$.
(c) True. Larger samples are less variable, which translates to a smaller margin of error. We can be more precise at the same level of confidence. Suppose $\hat{p} = 0.5$ and 90% confidence. Start with $n = 2$ and compare to $n = 18$.
$$MOE_{2} = 1.645\sqrt{\frac{0.5(1-0.5)}{2}} = 1.645\sqrt{\frac{0.25}{2}} = 1.645\sqrt{0.125} = 0.582,$$
$$MOE_{18} = 1.645\sqrt{\frac{0.5(1-0.5)}{18}} = 1.645\sqrt{\frac{0.25}{18}} = 1.645\sqrt{0.0139} = 0.194.$$
Our MOE decreased when $n$ increased.
(d) True. Larger samples are less variable, which makes us more confident that a given confidence interval succeeds in catching the population proportion. Suppose MOE = 0.4 and $\hat{p} = 0.5$. Compare the confidence for $n = 5$ to that for $n = 8$.
$$0.4 = z_5^*\sqrt{\frac{0.5(1-0.5)}{5}} \Rightarrow 0.4 = z_5^*\sqrt{0.05} \Rightarrow \frac{0.4}{\sqrt{0.05}} = z_5^* \Rightarrow 1.789 = z_5^*,$$
$$0.4 = z_8^*\sqrt{\frac{0.5(1-0.5)}{8}} \Rightarrow 0.4 = z_8^*\sqrt{0.03125} \Rightarrow \frac{0.4}{\sqrt{0.03125}} = z_8^* \Rightarrow 2.263 = z_8^*.$$
As the sample size increases, $z^*$ increases, which means that the confidence level increases.

10.1 Chapter 16

1. I sample 600 people and 432 of them like cats. Construct a 95% confidence interval for the population proportion.
Given: $\hat{p} = 432/600 = 0.72$, $z^* = 1.96$, $n = 600$.
$$CI: \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow 0.72 \pm 1.96\sqrt{\frac{0.72(1-0.72)}{600}} \Rightarrow (0.684, 0.756)$$
2.
I think the proportion of people that eat candy is around 0.75. I am going to construct a 90% confidence interval and want the margin of error to be ±0.025. How large should the sample size be?
Given: $\hat{p} = 0.75$, $z^* = 1.645$, $MOE = 0.025$.
$$MOE = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow 0.025 = 1.645\sqrt{\frac{0.75(1-0.75)}{n}} \Rightarrow \frac{0.025}{1.645} = \sqrt{\frac{0.1875}{n}} \Rightarrow \left(\frac{0.025}{1.645}\right)^2 = \frac{0.1875}{n}$$
$$\Rightarrow n = \frac{0.1875}{\left(\frac{0.025}{1.645}\right)^2} \Rightarrow n = 811.8075 \Rightarrow n \approx 812$$
3. Jimmy samples 930 people and 234 took public transportation. Construct a 99% confidence interval for the population proportion.
Given: $\hat{p} = 234/930$, $z^* = 2.576$, $n = 930$.
$$CI: \hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow \frac{234}{930} \pm 2.576\sqrt{\frac{(234/930)(1-234/930)}{930}} \Rightarrow 0.252 \pm 2.576\sqrt{\frac{0.188}{930}} \Rightarrow (0.215, 0.289)$$
4. I am going to construct a 95% confidence interval for the proportion of people that wear eyeglasses and want the margin of error to be ±0.2. I have no idea what to estimate for the population proportion. How large should the sample size be?
Given: $\hat{p} = 0.5$ (use this when we don't have any idea for the population proportion), $z^* = 1.96$, $MOE = 0.2$.
$$MOE = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow 0.2 = 1.96\sqrt{\frac{0.5(1-0.5)}{n}} \Rightarrow \frac{0.2}{1.96} = \sqrt{\frac{0.25}{n}} \Rightarrow \left(\frac{0.2}{1.96}\right)^2 = \frac{0.25}{n}$$
$$\Rightarrow n = \frac{0.25}{\left(\frac{0.2}{1.96}\right)^2} \Rightarrow n = 24.01 \Rightarrow n \approx 25 \text{ (rounding up)}$$

10.2 Chapter 17

1. A researcher believes that more than 50% of all people voted in the last election. She samples 800 people and 420 of them voted. Test her claim at a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
$H_0: p = 0.5$, $H_A: p > 0.5$
(b) Compute the test statistic (z-value). You must show your computation to receive credit.
$\hat{p} = 420/800 = 0.525$, $n = 800$.
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.525 - 0.5}{\sqrt{\frac{0.5(1-0.5)}{800}}} = \frac{0.025}{\sqrt{\frac{0.25}{800}}} = 1.41$$
(c) Compute the P-value associated with your test statistic.
$P(Z > 1.41) = \text{normalcdf}(1.41, 999) = 0.0793$
(d) Make a conclusion about the hypotheses.
Since the p-value is "large" (0.0793 > 0.05), Do Not Reject $H_0$. The results are not significant.
2. A researcher believes that fewer than 75% of all mollusks are tasty.
He samples 1200 mollusks and 865 of them are tasty. Test his claim at a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
$H_0: p = 0.75$, $H_A: p < 0.75$
(b) Compute the test statistic (z-value). You must show your computation to receive credit.
$\hat{p} = 865/1200 = 0.721$, $n = 1200$.
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.721 - 0.75}{\sqrt{\frac{0.75(1-0.75)}{1200}}} = \frac{-0.029}{\sqrt{\frac{0.1875}{1200}}} = -2.32$$
(c) Compute the P-value associated with your test statistic.
$P(Z < -2.32) = \text{normalcdf}(-999, -2.32) = 0.0102$
(d) Make a conclusion about the hypotheses.
Since the p-value is "small" (0.0102 < 0.05), Reject $H_0$. The results are significant.
3. A researcher believes that the percentage of people that watch Game of Thrones is different than 27%. He samples 900 people and 220 of them watch. Test his claim at a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
$H_0: p = 0.27$, $H_A: p \neq 0.27$
(b) Compute the test statistic (z-value). You must show your computation to receive credit.
$\hat{p} = 220/900 = 0.244$, $n = 900$.
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.244 - 0.27}{\sqrt{\frac{0.27(1-0.27)}{900}}} = \frac{-0.026}{\sqrt{\frac{0.1971}{900}}} = -1.76$$
(c) Compute the P-value associated with your test statistic.
Note that this is a 2-sided test.
p-value $= 2P(Z < -1.76) = 2\,(\text{normalcdf}(-999, -1.76)) = 2(0.039) = 0.078$
(d) Make a conclusion about the hypotheses.
Since the p-value is "large" (0.078 > 0.05), Do Not Reject $H_0$. The results are not significant.

10.3 Chapter 18

1. A butcher wants to estimate the mean weight of a ham. She samples 33 hams and computes a sample mean weight of 8.2 pounds and a sample standard deviation of 3.3 pounds. What is a 90% confidence interval for the population mean weight of ham? Please indicate the value you used for $z^*$ or $t^*$.
Summary of what is given: $n = 33$, $\bar{y} = 8.2$, $s = 3.3$.
For confidence intervals for the mean, we use $t^*$, with $n - 1$ degrees of freedom and 90% confidence (for this case). Thus, $t^*_{32} = 1.694$.
\[
CI: \bar{y} \pm t^*_{n-1} \frac{s}{\sqrt{n}} \Rightarrow 8.2 \pm 1.694 \frac{3.3}{\sqrt{33}} \Rightarrow (7.227, 9.173)
\]

2. A professor is interested in the mean length of a letter of recommendation. He samples 51 letters and finds a sample mean length of 620 words with a sample standard deviation of 90 words. What is a 95% confidence interval for the population mean length of a letter? Please indicate the value you used for $z^*$ or $t^*$.

Summary of what is given: $n = 51$, $\bar{y} = 620$, $s = 90$.
For confidence intervals for the mean, we use $t^*$, with $n - 1$ degrees of freedom and 95% confidence (for this case). Thus, $t^*_{50} = 2.009$.
\[
CI: \bar{y} \pm t^*_{n-1} \frac{s}{\sqrt{n}} \Rightarrow 620 \pm 2.009 \frac{90}{\sqrt{51}} \Rightarrow (594.682, 645.318)
\]

3. A computer professional wants to know the mean number of emails people receive each day. She is going to compute a 95% confidence interval and wants a margin of error of ±2 emails. She believes the standard deviation to be 18 emails. How large should the sample size be to ensure this margin of error?

Summary of what is given: $MOE = 2$, $s = 18$.
For sample size calculation, since this is based on the mean, use $t^*$, with $n - 1$ degrees of freedom. Note that as $n$ becomes really large, the t-distribution becomes more like the normal distribution. Therefore, use the 95% confidence interval critical value from the normal distribution instead: $z^* = 1.96$. Sample size can be calculated as follows:
\[
\begin{aligned}
MOE = t^*_{n-1} \frac{s}{\sqrt{n}}
&\Rightarrow 2 = 1.96 \frac{18}{\sqrt{n}} \\
&\Rightarrow \frac{2}{1.96 \times 18} = \frac{1}{\sqrt{n}} \\
&\Rightarrow \sqrt{n} = \frac{1.96 \times 18}{2} \\
&\Rightarrow n = \left(\frac{1.96 \times 18}{2}\right)^2 \\
&\Rightarrow n = 311.1696 \Rightarrow n \approx 312
\end{aligned}
\]

4. A researcher believes that the mean age at which a person first votes is greater than 22 years. He samples 27 people and computes a sample mean of 24.3 years and a sample standard deviation of 8 years.

(a) State the hypotheses to be tested.
$H_0: \mu = 22$
$H_A: \mu > 22$

(b) What is the value of your test statistic (t or z value)?
Use the t-test statistic because we are dealing with means.
\[
t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}, \qquad t_{27-1} = t_{26} = \frac{24.3 - 22}{8/\sqrt{27}} = \frac{2.3}{1.54} = 1.49
\]

(c) What is the P-value?
On your calculator: tcdf(1.49, 999, 26) = 0.0741.
On the table: Go to degrees of freedom 26, find where 1.49 is in the row, and then look at the one-tail probability values. The probability is between 0.05 and 0.10.

(d) What conclusion should be drawn (compare p-value to 0.05).
Since the p-value is “large” (0.0741 > 0.05), Do Not Reject $H_0$. The results are not significant.

5. A researcher believes that the mean age at which a person first tries chocolate is less than 3 years. He samples 24 people and computes a sample mean of 2.3 years and a sample standard deviation of 1.5 years.

(a) State the hypotheses to be tested.
$H_0: \mu = 3$
$H_A: \mu < 3$

(b) What is the value of your test statistic (t or z value)?
Use the t-test statistic because we are dealing with means.
\[
t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}, \qquad t_{24-1} = t_{23} = \frac{2.3 - 3}{1.5/\sqrt{24}} = \frac{-0.7}{0.3062} = -2.286
\]

(c) What is the P-value?
On your calculator: tcdf(−999, −2.286, 23) = 0.0159.
On the table: Go to degrees of freedom 23, find where 2.286 is in the row, and then look at the one-tail probability values. The probability is between 0.01 and 0.025.

(d) What conclusion should be drawn (compare p-value to 0.05).
Since the p-value is “small” (0.0159 < 0.05), Reject $H_0$. The results are significant.

6. A researcher believes that the mean height of a prairie dog is different than 14 inches. She samples 31 prairie dogs and computes a sample mean of 15.8 inches and a sample standard deviation of 3.6 inches.

(a) State the hypotheses to be tested.
$H_0: \mu = 14$
$H_A: \mu \neq 14$

(b) What is the value of your test statistic (t or z value)?
Use the t-test statistic because we are dealing with means.
\[
t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}, \qquad t_{31-1} = t_{30} = \frac{15.8 - 14}{3.6/\sqrt{31}} = \frac{1.8}{0.6466} = 2.784
\]

(c) What is the P-value?
On your calculator: 2 tcdf(2.784, 999, 30) = 2(0.0046) = 0.0092.
On the table: go to degrees of freedom 30, find where 2.784 is in the row, and then look at the two-tail probability values. The probability is lower than 0.01.
(d) What conclusion should be drawn (compare p-value to 0.05).
Since the p-value is “small” (0.0092 < 0.05), Reject $H_0$. The results are significant.

10.4 Chapter 19

Q4 pg 526 Which of the following are true? If false, explain briefly.

(a) A very low P-value provides evidence against the null hypothesis.
True.

(b) A high P-value is strong evidence in favor of the null hypothesis.
False. A high p-value shows that the data are consistent with the null hypothesis but does not prove that the null hypothesis is true.

(c) A P-value above 0.10 shows that the null hypothesis is true.
False. No p-value ever shows that the null hypothesis is true (or false).

(d) If the null hypothesis is true, you can’t get a p-value below 0.01.
False. If the null hypothesis is true, you will get a p-value below 0.01 about once in a hundred hypothesis tests.

Q7 pg 526 Which of the following statements are true? If false, explain briefly.

(a) Using an alpha level of 0.05, a p-value of 0.04 results in rejecting the null hypothesis.
True.

(b) The alpha level depends on the sample size.
False. The alpha level is set independently and does not depend on the sample size.

(c) With an alpha level of 0.01, a p-value of 0.10 results in rejecting the null hypothesis.
False. The p-value would have to be less than 0.01 to reject the null hypothesis.

(d) Using an alpha level of 0.05, a p-value of 0.06 means the null hypothesis is true.
False. It means that we do not have enough evidence at that alpha level to reject the null hypothesis.

Q11 pg 527 For each of the following situations, state whether a Type I or Type II, or neither error has been made. Explain briefly.

(a) A bank wants to know if the enrollment on their website is above 30% based on a small sample of customers. They test $H_0: p = 0.3$ versus $H_A: p > 0.3$ and reject the null hypothesis. Later they find out that actually 28% of all customers enrolled.
Type I Error.
The actual value is not greater than 0.3, but they rejected the null hypothesis.

(b) A student tests 100 students to determine whether other students on her campus prefer Coke or Pepsi and finds no evidence that preference for Coke is not 0.5. Later, a marketing company tests all students on campus and finds no difference.
No error. The actual value is 0.5, which was not rejected.

(c) A human resource analyst wants to know if the applicants this year score, on average, higher on their placement exam than the 52.5 points the candidates averaged last year. She samples 50 recent tests and finds the average to be 54.1 points. She fails to reject the null hypothesis that the mean is 52.5 points. At the end of the year, they find that the candidates this year had a mean of 55.3 points.
Type II Error. The actual value was 55.3 points, which is greater than 52.5, which was not rejected.

(d) A pharmaceutical company tests whether a drug lifts the headache relief rate from the 25% achieved by the placebo. They fail to reject the null hypothesis because the p-value is 0.465. Further testing shows that the drug actually relieves headaches in 38% of people.
Type II Error. The null hypothesis was not rejected, but it was false. The true relief rate was greater than 0.25.

10.5 Chapter 20

1. A researcher samples 600 children and 500 of them like ice cream. She also samples 450 adults and 350 of them like ice cream. Construct a 95% confidence interval for the difference of population proportions of children and adults that like ice cream.

What we are given: $\hat{p}_1 = \frac{500}{600}$, $\hat{p}_2 = \frac{350}{450}$.
Since we are considering the confidence interval for the difference of proportions, we need a value for $z^*$. Here, $z^* = 1.96$. The confidence interval is
\[
\begin{aligned}
CI: (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
&\Rightarrow \left(\frac{500}{600} - \frac{350}{450}\right) \pm 1.96 \sqrt{\frac{\frac{500}{600}\left(1-\frac{500}{600}\right)}{600} + \frac{\frac{350}{450}\left(1-\frac{350}{450}\right)}{450}} \\
&\Rightarrow \frac{1}{18} \pm 1.96 \sqrt{\frac{5/36}{600} + \frac{14/81}{450}} \\
&\Rightarrow (0.0069, 0.1042)
\end{aligned}
\]

2.
A researcher samples 1200 children and 500 of them like to exercise. She also samples 900 adults and 350 of them like to exercise. Construct a 90% confidence interval for the difference of population proportions of children and adults that like to exercise.

What we are given: $\hat{p}_1 = \frac{500}{1200}$, $\hat{p}_2 = \frac{350}{900}$.
Since we are considering the confidence interval for the difference of proportions, we need a value for $z^*$. Here, $z^* = 1.645$. The confidence interval is
\[
\begin{aligned}
CI: (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
&\Rightarrow \left(\frac{500}{1200} - \frac{350}{900}\right) \pm 1.645 \sqrt{\frac{\frac{500}{1200}\left(1-\frac{500}{1200}\right)}{1200} + \frac{\frac{350}{900}\left(1-\frac{350}{900}\right)}{900}} \\
&\Rightarrow \frac{1}{36} \pm 1.645 \sqrt{\frac{35/144}{1200} + \frac{77/324}{900}} \\
&\Rightarrow (-0.0078, 0.0633)
\end{aligned}
\]

3. A scientist believes that the proportion of North American bees that are hostile is greater than the proportion of South American bees. She samples 500 North American bees and 200 are hostile. She samples 600 South American bees and 230 are hostile.

(a) State the hypotheses to be tested.
$H_0: p_{NA} - p_{SA} = 0$
$H_A: p_{NA} - p_{SA} > 0$

(b) Compute the sample statistic (z value or t value). You must show work to receive credit.
This is the hypothesis test for the difference of proportions.
\[
\hat{p}_{pooled} = \frac{\#\text{Success Grp 1} + \#\text{Success Grp 2}}{n_1 + n_2} = \frac{200 + 230}{500 + 600} = \frac{43}{110}
\]
\[
\begin{aligned}
SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA})
&= \sqrt{\frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}} \\
&= \sqrt{\frac{\frac{43}{110}\left(1-\frac{43}{110}\right)}{500} + \frac{\frac{43}{110}\left(1-\frac{43}{110}\right)}{600}} \\
&= \sqrt{\frac{0.2381}{500} + \frac{0.2381}{600}} = 0.0295
\end{aligned}
\]
\[
z = \frac{\hat{p}_{NA} - \hat{p}_{SA}}{SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA})} = \frac{\frac{200}{500} - \frac{230}{600}}{0.0295} = \frac{1/60}{0.0295} = 0.565
\]

(c) Give the P-value or range of P-values.
On your calculator: normalcdf(0.565, 999) = 0.2860.

(d) What decision should the scientist make at a significance level of 5%?
Since the p-value is “large” (0.2860 > 0.05), Do Not Reject $H_0$. The results are not significant.

4. A scientist believes that the proportion of North American bears that are hostile is greater than the proportion of South American bears. She samples 800 North American bears and 200 are hostile.
She samples 1200 South American bears and 240 are hostile.

(a) State the hypotheses to be tested.
$H_0: p_{NA} - p_{SA} = 0$
$H_A: p_{NA} - p_{SA} > 0$

(b) Compute the sample statistic (z value or t value). You must show work to receive credit.
This is the hypothesis test for the difference of proportions.
\[
\hat{p}_{pooled} = \frac{\#\text{Success Grp 1} + \#\text{Success Grp 2}}{n_1 + n_2} = \frac{200 + 240}{800 + 1200} = 0.22
\]
\[
\begin{aligned}
SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA})
&= \sqrt{\frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}} \\
&= \sqrt{\frac{0.22(1-0.22)}{800} + \frac{0.22(1-0.22)}{1200}} \\
&= \sqrt{\frac{0.1716}{800} + \frac{0.1716}{1200}} = 0.0189
\end{aligned}
\]
\[
z = \frac{\hat{p}_{NA} - \hat{p}_{SA}}{SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA})} = \frac{\frac{200}{800} - \frac{240}{1200}}{0.0189} = \frac{0.05}{0.0189} = 2.646
\]

(c) Give the P-value or range of P-values.
On your calculator: normalcdf(2.646, 999) = 0.0041.

(d) What decision should the scientist make at a significance level of 5%?
Since the p-value is “small” (0.0041 < 0.05), Reject $H_0$. The results are significant.

Q61 pg 578 A man who moves to a new city sees that there are two routes he could take to work. A neighbor who has lived there a long time tells him Route A will average 5 minutes faster than Route B. The man decides to experiment; he wants to find out if the mean difference between Route A and B is different from 5 minutes. Each day, he flips a coin to determine which way to go, driving each route 20 days. He finds that Route A takes an average of 40 minutes, with a standard deviation of 3 minutes, and Route B takes an average of 43 minutes, with a standard deviation of 2 minutes. Histograms of travel times for the routes are roughly symmetric and show no outliers. Assume $\alpha = 0.05$.

(a) Find a 95% confidence interval for the difference in average commuting time for the two routes. Use df = 33.
Since df = 33, then $t^*_{33} = 2.0345$.
\[
\begin{aligned}
CI: (\bar{y}_B - \bar{y}_A) \pm t^*_{df} \sqrt{\frac{s_B^2}{n_B} + \frac{s_A^2}{n_A}}
&\Rightarrow (43 - 40) \pm 2.0345 \sqrt{\frac{2^2}{20} + \frac{3^2}{20}} \\
&\Rightarrow 3 \pm 2.0345 \sqrt{0.65} \\
&\Rightarrow (1.36, 4.64)
\end{aligned}
\]
Note that this result means that we are 95% confident that Route B has a mean commuting time between 1.36 and 4.64 minutes more than the mean commuting time of Route A. Also, because 5 minutes is not within the interval, it appears that the neighbor may be exaggerating the average difference in commuting time.

(b) State the hypotheses to be tested.
$H_0: \mu_B - \mu_A = 5$
$H_A: \mu_B - \mu_A \neq 5$

(c) Compute the value of the test score.
\[
t_{33} = \frac{(\bar{y}_B - \bar{y}_A) - \Delta_0}{\sqrt{\frac{s_B^2}{n_B} + \frac{s_A^2}{n_A}}} = \frac{(43 - 40) - 5}{\sqrt{\frac{2^2}{20} + \frac{3^2}{20}}} = \frac{-2}{\sqrt{0.65}} = -2.481
\]

(d) Give the P-value or range of P-values.
On your calculator: 2 tcdf(−999, −2.481, 33) = 2(0.00919) = 0.0184.
On the table: go to degrees of freedom 33, find where 2.481 is in the row, and then look at the two-tail probability values. The p-value is between 0.01 and 0.02.

(e) Do the results seem significant?
Since the p-value is “small” (0.0184 < $\alpha$ = 0.05), Reject $H_0$. The results are significant. There is evidence to conclude that the average difference in commuting time is different from 5 minutes. However, we don’t know if it is higher than 5 minutes or lower than 5 minutes because we did not test for that.

Q78 pg 582 Researchers randomly assigned participants either a tall, thin “highball” glass or a short, wide “tumbler,” each of which held 355 ml. Participants were asked to pour 1.5 oz = 44.3 ml of water into their glass. Did the shape of the glass make a difference in how much liquid they poured? In particular, test to see if they poured less water into the “highball” glass than the “tumbler”. Assume $\alpha = 0.1$. Here are the summaries:

            n     ȳ          s
Highball    99    42.2 ml    16.2 ml
Tumbler     99    60.9 ml    17.9 ml

(a) Find a 90% confidence interval for the difference in average water held for the two glasses. Use df = 194.
Because we are looking for a 90% confidence interval with df = 194, $t^*_{194} = 1.6528$.
\[
\begin{aligned}
CI: (\bar{y}_H - \bar{y}_T) \pm t^*_{df} \sqrt{\frac{s_H^2}{n_H} + \frac{s_T^2}{n_T}}
&\Rightarrow (42.2 - 60.9) \pm 1.6528 \sqrt{\frac{16.2^2}{99} + \frac{17.9^2}{99}} \\
&\Rightarrow -18.7 \pm 1.6528 \sqrt{5.8874} \\
&\Rightarrow -18.7 \pm 1.6528(2.4264) \\
&\Rightarrow (-22.71, -14.69)
\end{aligned}
\]

(b) State the hypotheses to be tested.
$H_0: \mu_H - \mu_T = 0$
$H_A: \mu_H - \mu_T < 0$

(c) Compute the value of the test score. (Assume all conditions are met.)
\[
t_{194} = \frac{(\bar{y}_H - \bar{y}_T) - 0}{\sqrt{\frac{s_H^2}{n_H} + \frac{s_T^2}{n_T}}} = \frac{42.2 - 60.9}{\sqrt{\frac{16.2^2}{99} + \frac{17.9^2}{99}}} = \frac{-18.7}{2.4264} = -7.707
\]

(d) Give the P-value or range of P-values.
On your calculator: tcdf(−999, −7.707, 194) = 0.
On the table (or an online table): go to degrees of freedom 194, find where 7.707 is in the row, and then look at the one-tail probability values. Compare 7.707 to the values on the table. The p-value is less than 0.001.

(e) Do the results seem significant?
Since the p-value is “small” (0 < $\alpha$ = 0.1), Reject $H_0$. The results are significant. There is sufficient evidence to conclude that they poured less water into the “highball” glass than the “tumbler”.

10.6 Various Chapters

1. We want to estimate the healing rate for a wound. A sample of size 17 is collected and the sample mean is computed to be 24.3 micrometers per hour, with a sample standard deviation of s = 8 micrometers per hour. What is a 95% confidence interval for the population mean?

What we are given: $\bar{y} = 24.3$, $s = 8$, $n = 17$.
Because we want the confidence interval for the population mean, use the formula
\[
CI: \bar{y} \pm t^*_{n-1} \frac{s}{\sqrt{n}} \Rightarrow 24.3 \pm t^*_{17-1} \frac{8}{\sqrt{17}} \Rightarrow 24.3 \pm 2.120 \frac{8}{\sqrt{17}} \Rightarrow (20.187, 28.413)
\]

2. A sample of size n = 150 people is collected and the sample proportion of people who are illiterate is computed to be 0.20. Compute a 95% confidence interval for the population proportion of illiterate people.

What we are given: $n = 150$, $\hat{p} = 0.2$.
Because we want the confidence interval for the population proportion, use the formula
\[
CI: \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow 0.2 \pm 1.96 \sqrt{\frac{0.2(1-0.2)}{150}} \Rightarrow 0.2 \pm 1.96 \sqrt{\frac{0.16}{150}} \Rightarrow (0.136, 0.264)
\]

3. You believe that the proportion of people that like cheese is .80.
You are going to construct a 95% confidence interval and want the margin of error to be plus or minus .03. What should the sample size be?

What we are given: $\hat{p} = 0.8$, $MOE = 0.03$.
Since this is dealing with one proportion, use the formula
\[
\begin{aligned}
MOE = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
&\Rightarrow 0.03 = 1.96 \sqrt{\frac{0.8(1-0.8)}{n}} \\
&\Rightarrow \frac{0.03}{1.96} = \sqrt{\frac{0.16}{n}} \\
&\Rightarrow \left(\frac{0.03}{1.96}\right)^2 = \frac{0.16}{n} \\
&\Rightarrow n = \frac{0.16}{\left(\frac{0.03}{1.96}\right)^2} \\
&\Rightarrow n = 682.95 \Rightarrow n \approx 683
\end{aligned}
\]

4. Teresa knows that appointment times are approximately normally distributed. She believes the mean wait time is longer than 25 minutes. She conducts a test with $\alpha = 0.05$ and the appropriate hypotheses. She selects 25 random appointments and the sample mean was found to be 25.66 minutes and a sample standard deviation of 10 minutes.

(a) State the hypotheses to be tested.
$H_0: \mu = 25$
$H_A: \mu > 25$

(b) Compute the value of the test score.
Since this is asking for us to test the population mean, we need the formula
\[
t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} \Rightarrow t_{25-1} = t_{24} = \frac{25.66 - 25}{10/\sqrt{25}} = \frac{0.66}{2} = 0.33
\]

(c) Give the P-value or range of P-values.
On your calculator: tcdf(0.33, 999, 24) = 0.372.
On the table: go to degrees of freedom 24, find where 0.33 is on the row, and then look at the one-tail probability values. The probability is greater than 0.10.

(d) Do the results seem significant?
Since the p-value is “large” (0.372 > $\alpha$ = 0.05), Do Not Reject $H_0$. The results are not significant.

5. You claim that the proportion of people who watch American Idol is greater than .50. You sample n = 200 people and compute a sample proportion of .53. Assume $\alpha = 0.05$.

(a) State the hypotheses to be tested.
$H_0: p = 0.5$
$H_A: p > 0.5$

(b) Compute the value of the test score.
Since this is asking for us to test the population proportion, we need the formula
\[
z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.53 - 0.5}{\sqrt{\frac{0.5(1-0.5)}{200}}} = \frac{0.03}{\sqrt{\frac{0.25}{200}}} = \frac{0.03}{\sqrt{0.00125}} = 0.849
\]

(c) Give the P-value or range of P-values.
On your calculator: normalcdf(0.849, 999) = 0.198.

(d) Do the results seem significant?
Since the p-value is “large” (0.198 > $\alpha$ = 0.05), Do Not Reject $H_0$. The results are not significant.

6. You want to compare the proportion of gamers amongst women and men. You survey 300 women and 400 men. 175 of the women were gamers and 200 of the men were gamers. Construct a 95% confidence interval for the difference of proportions.

What we are given: $\hat{p}_W = \frac{175}{300}$, $\hat{p}_M = \frac{200}{400}$.
Because we want the confidence interval for the difference of proportions, use the formula
\[
\begin{aligned}
CI: (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
&\Rightarrow \left(\frac{175}{300} - \frac{200}{400}\right) \pm 1.96 \sqrt{\frac{\left(\frac{175}{300}\right)\left(1-\frac{175}{300}\right)}{300} + \frac{\left(\frac{200}{400}\right)\left(1-\frac{200}{400}\right)}{400}} \\
&\Rightarrow \frac{1}{12} \pm 1.96 \sqrt{\frac{35/144}{300} + \frac{0.25}{400}} \\
&\Rightarrow (0.0091, 0.1576)
\end{aligned}
\]

7. You believe that the proportion of men that are colorblind is greater than the proportion of women that are colorblind. You sample 900 men and 90 of them are colorblind. You sample 700 women and 45 of them are colorblind. Assume $\alpha = 0.05$.

(a) State the hypotheses to be tested.
$H_0: p_M - p_W = 0$
$H_A: p_M - p_W > 0$

(b) Compute the value of the test score.
\[
\hat{p}_{pooled} = \frac{\text{Number of Successes in Group 1} + \text{Number of Successes in Group 2}}{n_1 + n_2} = \frac{90 + 45}{900 + 700} = 0.084375
\]
\[
\begin{aligned}
SE_{pooled}(\hat{p}_1 - \hat{p}_2)
&= \sqrt{\frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}} \\
&= \sqrt{\frac{0.084375(1-0.084375)}{900} + \frac{0.084375(1-0.084375)}{700}} \\
&= \sqrt{8.584 \times 10^{-5} + 1.104 \times 10^{-4}} \\
&= \sqrt{1.9620 \times 10^{-4}} = 0.014
\end{aligned}
\]
\[
z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)} = \frac{\frac{90}{900} - \frac{45}{700}}{0.014} = \frac{1/28}{0.014} = 2.55
\]

(c) Did you use the pooled proportion in part (b)?
YES

(d) Compute the P-value.
On your calculator: normalcdf(2.55, 999) = 0.0054.

(e) Are the results significant?
Since the p-value is “small” (0.0054 < $\alpha$ = 0.05), Reject $H_0$. The results are significant.
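
The pooled two-proportion z-test used in the colorblindness problem can also be checked without a calculator. The following is only a minimal sketch, not part of the textbook's method: the function name `two_proportion_z_test` and its argument names are hypothetical, and Python's standard-library `statistics.NormalDist` stands in for the calculator's normalcdf.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test of H0: p1 - p2 = 0 against HA: p1 - p2 > 0.

    x1, x2 are the success counts and n1, n2 the sample sizes
    (hypothetical argument names chosen for this illustration).
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: all successes over all observations.
    p_pooled = (x1 + x2) / (n1 + n2)
    # Pooled standard error of the difference of sample proportions.
    se_pooled = sqrt(p_pooled * (1 - p_pooled) / n1
                     + p_pooled * (1 - p_pooled) / n2)
    z = (p1_hat - p2_hat) / se_pooled
    # Upper-tail probability P(Z > z) under the standard normal model.
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Colorblindness data: 90 of 900 men, 45 of 700 women.
z, p_value = two_proportion_z_test(90, 900, 45, 700)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Run on the counts from that problem, the sketch should agree with the worked solution (z near 2.55 and a p-value near 0.0054) up to rounding. For a two-sided alternative, the p-value would instead be twice the smaller tail probability.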