Solutions to Sample Test 1

1. All are valid reasons for sampling except response b. Generally speaking, a sample statistic may be your best estimate of a population parameter; however, nothing beats the real thing. In fact, the population parameter is not an estimate at all.

2. Response c is the basic definition of a census.

3. The question stem includes the basic definition of a parameter.

4. The Literary Digest fiasco was essentially an example where the sample was unrepresentative of the population. However, it could also be concluded that it reflected a nonrandom sample, since so many elements in the population were not given any chance of being selected. For example, if you did not own a car, have a phone, or subscribe to magazines, you had no chance of being selected for the sample.

5. All of the options reflect probability samples except d. Convenience and judgment samples are examples of nonrandom (non-probability) samples.

6. The primary advantage of a cluster sample is that it deals with sampling where population elements are widely dispersed geographically. It provides a means of selecting elements within geographic clusters and avoids the time and expense of traveling great distances to collect small numbers of responses.

7. The rule of thumb is that we use the FPCF in this case if n/N ≥ .05.

8. Sigma is the Greek symbol used to represent the population standard deviation.

9. Statement of the Central Limit Theorem (CLT). The value of the CLT is that the population need not be normally distributed for the sampling distribution of the mean to be approximately normal, as long as the sample size is large.

10. Note that the question asks for the probability that the sample mean exceeds 490. Thus, the answer will be based on the sampling distribution of the mean, whose mean is mu, the population mean, and whose standard error equals the population standard deviation (or its estimate) divided by the square root of the sample size.
This assumes we can ignore the Finite Population Correction Factor, which is reasonable since n = 400 and the population is enormous (all U.S. college students). Thus, n/N would be less than .05. Standardizing gives

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{490 - 500}{100/\sqrt{400}} = -2.0 \]

The area between the mean (center point) and a Z-value of -2.0 is 0.4772. Thus, the area under the normal curve above 490 (above Z = -2.0) is .4772 + .5000 = 0.9772.

11. Note that the question asks for the probability that the sample proportion is between .45 and .60. Thus, the answer will be based on the sampling distribution of the proportion, whose mean is the true (population) proportion and whose standard error equals the square root of p times q divided by n, the sample size. This assumes we can ignore the Finite Population Correction Factor, which is reasonable since n = 100 and the population is sizable (all Brevard County college students). Thus, n/N would be less than .05. Standardizing for p-hat = .45 yields

\[ Z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{.45 - .50}{\sqrt{(.5)(.5)/100}} = -1.0 \]

The area between the mean (center point) and a Z-value of -1.0 is 0.3413. Standardizing for p-hat = .60 yields

\[ Z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{.60 - .50}{\sqrt{(.5)(.5)/100}} = 2.0 \]

The area between the mean (center point) and a Z-value of 2.0 is 0.4772. Thus, the area under the normal curve between .45 and .60 is .4772 + .3413 = 0.8185.

12. The 99% confidence interval is

\[ \bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 26{,}000 \pm 2.575\left(\frac{3{,}000}{\sqrt{25}}\right) \]

Finishing the calculation yields an interval from $24,455 to $27,545. Thus, the student adviser would be 99 percent confident that the mean annual income of last year's graduating students is between $24,455 and $27,545. Note that even though the sample size was small (less than 30), we were able to use the normal curve rather than the t-distribution because we knew sigma, the population standard deviation. If we had not known sigma and had to estimate it with s, we would have used the t-distribution instead.
Compare, for example, question 14 below.

13. Definition of Type I Error.

14. The 95% confidence interval is

\[ \bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}} = 150 \pm 2.064\left(\frac{15}{\sqrt{25}}\right) \]

Finishing the calculation yields an interval from $143.81 to $156.19. Thus, the firm would be 95 percent confident that the mean repair cost is between $143.81 and $156.19. Note that the sample size was small (less than 30) and we did not know sigma, the population standard deviation, so we were required to use the t-distribution rather than the normal curve. Compare this question with question 12 above.

15. The 95% confidence interval for the proportion is

\[ \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.40 \pm 1.96\sqrt{\frac{(.4)(.6)}{600}} \]

Finishing the calculation yields an interval from .3608 to .4392. Thus, the firm would be 95 percent confident that the proportion of the population that watched the TV show is between .3608 and .4392. Note that since we were estimating the true proportion, p, we did not know its value. Thus, in the calculation of the standard error of the proportion we had to use an estimate for p. The best estimate of p is p-hat, the sample proportion. P-hat was 240/600 = 0.40.

16. Note the formula for determining the required sample size:

\[ n = \frac{Z^2 \sigma^2}{e^2} \]

All factors listed are involved except the sample mean, x-bar.

17. Note that factors in the numerator (Z and sigma) are directly (positively) related to n, and factors in the denominator (e) are inversely (negatively) related to n. See the formula:

\[ n = \frac{Z^2 \sigma^2}{e^2} \]

18. \[ n = \frac{Z^2 \sigma^2}{e^2} = \frac{2.575^2 \cdot 5^2}{1^2} = 165.7656 \]

19. The formula for the required sample size for estimating a proportion is

\[ n = \frac{Z^2 p q}{e^2} = \frac{1.645^2 (.9)(.1)}{.02^2} = 608.8556 \]

20. Assume p is 0.5, since this yields the largest possible sample size holding other factors constant. Thus, if you know nothing about the size of p, be conservative and assume it is 0.5. Your sample size may be larger than necessary, but it will never be too small. Note that in the question above, we used 0.9 for p.
This was based on the fact that in another branch, 90 percent had opted for the program. Thus, we used it as an estimate of the proportion for our branch.

21. The rejection region would be in the lower tail, since the only time you would reject the null hypothesis is when the sample mean is significantly less than 100.

22. The test statistic is

\[ Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{88 - 90}{4/\sqrt{49}} = -3.5 \]

Since the hypothesis is that the light comes on after 90 seconds, we would reject the null if the sample results are significantly below or above 90. In this case our result, 88, is significantly below 90, as evidenced by a Z-value of -3.5, while the critical Z-value at the .05 level of significance is only ±1.96.

23. The test statistic is

\[ Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{49.5 - 50}{12/\sqrt{14400}} = -5.0 \]

Since the hypothesis is that motorists travel at 50 mph or more, we would reject the null if the sample results are significantly below 50. In this case our result, 49.5, is significantly below 50, as evidenced by a Z-value of -5.0, while the critical Z-value at the .01 level of significance is only -2.33.

24. The null hypothesis would be 2-sided, mu = 10,000.

25. The test statistic is

\[ t = \frac{\bar{X} - \mu}{s_{\bar{X}}} = \frac{112{,}800 - 110{,}000}{7{,}500/\sqrt{9}} = 1.12 \]

Since the hypothesis is that packages use 110,000 bytes of memory, we would reject the null if the sample results are significantly below or above 110,000. In this case our result, 112,800, is NOT significantly above 110,000, as evidenced by a test statistic of +1.12, while the critical t-value at the .01 level of significance is ±3.355. Note that a t-value is used since n is small (9 < 30) and sigma was not known. We used s, the sample standard deviation, as an estimate for sigma. Also, the critical t-value (3.355) is based on n - 1 degrees of freedom.

26. Using the Z-table, look up an area of .4900 (.5000 - .0100). Considering both halves of the distribution, this gives 98 percent in the middle and tails of .01 and .01, or .02 in total. The Z-table gives a Z of 2.33 for this area.

27. The test statistic is

\[ Z = \frac{\hat{p} - p}{\sqrt{pq/n}} = \frac{0.42 - 0.40}{\sqrt{(.4)(.6)/500}} = .9129 \]

Note that p-hat is 210/500. This is a 2-tailed hypothesis with a .02 level of significance.
Thus, the critical Z-value is determined as shown in the previous question, #26.

28. You can decrease the probability of a Type I Error (rejecting the null when it is true) by decreasing alpha. Decreasing alpha reduces the size of the rejection region.

29. This is a stated requirement, since the sampling distribution of a proportion can only be approximately normal if the sample size is large. Also, the sample must be larger the more p deviates from .5.

30. If you could be so lucky.
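
The arithmetic in the numerical solutions above (questions 10, 11, 12, 14, 15, 18, 19, and 27) can be checked with a short script. What follows is only a minimal sketch in Python using the standard library. The helper names (`interval`, `sample_size_mean`, `sample_size_proportion`) are made up for illustration and are not part of the test. The critical values 2.575, 2.064, 1.96, and 1.645 are the table values quoted in the solutions and are passed in by hand rather than computed.

```python
from math import sqrt
from statistics import NormalDist


def interval(center, critical, std_error):
    """Return (lower, upper) for center +/- critical * std_error."""
    margin = critical * std_error
    return center - margin, center + margin


def sample_size_mean(z, sigma, e):
    """Required n for estimating a mean: Z^2 * sigma^2 / e^2 (not rounded)."""
    return (z * sigma / e) ** 2


def sample_size_proportion(z, p, e):
    """Required n for estimating a proportion: Z^2 * p * q / e^2 (not rounded)."""
    return z ** 2 * p * (1 - p) / e ** 2


# Q10: P(sample mean > 490) with mu = 500, sigma = 100, n = 400.
xbar_dist = NormalDist(mu=500, sigma=100 / sqrt(400))
p10 = 1 - xbar_dist.cdf(490)

# Q11: P(.45 < p-hat < .60) with p = .5, n = 100.
phat_dist = NormalDist(mu=0.5, sigma=sqrt(0.5 * 0.5 / 100))
p11 = phat_dist.cdf(0.60) - phat_dist.cdf(0.45)

# Q12: 99% interval for the mean, sigma known (Z table value 2.575).
ci12 = interval(26_000, 2.575, 3_000 / sqrt(25))

# Q14: 95% interval for the mean, sigma unknown (t table value 2.064, 24 df).
ci14 = interval(150, 2.064, 15 / sqrt(25))

# Q15: 95% interval for the proportion, using p-hat = 240/600.
ci15 = interval(0.40, 1.96, sqrt(0.4 * 0.6 / 600))

# Q18 and Q19: required sample sizes.
n18 = sample_size_mean(2.575, 5, 1)
n19 = sample_size_proportion(1.645, 0.9, 0.02)

# Q27: test statistic for the proportion, p-hat = 210/500, hypothesized p = .40.
z27 = (0.42 - 0.40) / sqrt(0.4 * 0.6 / 500)

for label, value in [("Q10", p10), ("Q11", p11), ("Q12", ci12), ("Q14", ci14),
                     ("Q15", ci15), ("Q18", n18), ("Q19", n19), ("Q27", z27)]:
    print(label, value)
```

Because `NormalDist.cdf` uses exact normal areas rather than a four-decimal table, the probabilities can differ from the table-based answers in the fourth decimal place. For question 11, for example, the exact area is about 0.8186, versus .8185 from adding the two table entries.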