Chapter 1

1. A survey of 1420 U.S. undergraduate English majors asked which Shakespearean play was most relevant in the year 2000. What is the population and what is the sample?
Ans. The population is all U.S. undergraduate English majors. The sample is the 1420 U.S. undergraduate English majors who were surveyed.

2. Decide which method of data collection you would use to gather data for the study. Explain. Twenty-five students are randomly selected from each grade level at a high school and surveyed about their study habits.
Ans. Stratified sampling, because each grade level must be represented in the study.

3. You are a researcher for a professional research firm. Your firm has won a contract to do a study for an automobile industry publication. The publication would like to get its readers' (engineers, manufacturers, researchers, and developers) thoughts on the future of automobiles, such as what type of fuel they think will be used in the future. The publication would like input from those who work for automakers and from those who work for automaker suppliers. The publication has given you its readership database and the 25 questions it would like to ask (sample questions from a previous study are given below). It is too expensive to contact all the readers, so you need to determine a way to contact a representative sample of the entire readership population.

How will the internal combustion engine of the future be fueled?
Fuel            Percent Responding
Gasoline        38.1%
Hydrogen        23.2%
Diesel Fuel     19.6%
Natural Gas     11.4%
Other           6.7%
No response     1%

When will affordable fuel cell vehicles be on the market?
Time                                     Percent Responding
5 years or less                          5.4%
More than 5 years to at most 10 years    33.6%
More than 10 years                       48.7%
Not likely                               7.6%
No response                              4.7%

A.
a) What sampling technique would you use to select the sample for the study? Why?
b) Will the technique you choose in part a give you a sample that is representative of the population?
c) Describe the method for collecting data.
d) Identify possible flaws or biases in your study.
B.
a) What type of data do you expect to collect: qualitative, quantitative, or both? Why?
b) What levels of measurement do you think the data in the study will be? Why?
c) Will the data collected for the study represent a population or a sample? Why?
d) Will the numerical descriptions of the data be parameters or statistics?

Ans: Possible Responses
A.
a) Systematic sampling, using every 4th or 5th subscriber from the publication's subscriber database, because this approximates a random sample and is cost effective and simple to run, compared with breaking the subscribers' jobs into categories and then taking a stratified sample based on those categories. However, cluster sampling with North, East, South, and West regions is also a viable option.
b) The technique will give a representative sample of the readership population.
c) Data collection will be based on a survey with 25 multiple-choice questions.
d) Possible flaws would be nonresponse bias, or response bias from leading questions. Certain types of readers, who have similar jobs, might be most likely to reply while others might not reply at all.
B.
a) Quantitative, because the data are the percentages of readers who gave each answer to the questions.
b) Ratio, because the percentages can be ordered from lowest to highest, their differences can be found, and so can their ratios. For example, it is acceptable to say that approximately twice as many readers think gasoline (38.1%) will be the fuel of the future as think diesel fuel (19.6%) will be.
c) Sample
d) Statistics

4. You are conducting a survey to find out how many students at your school say math is their favorite subject. Describe how you would do this using stratified sampling and using cluster sampling (assume that your school goes from grade nine through grade twelve).
Ans:
Stratified: You would survey a sample of students from each grade level.
Cluster: You would survey all of the students in one randomly selected grade level.

5. Using the random number table, randomly select ten people from a group of 999 in which each person is assigned a number. Use the third row of the table.
Ans: 596, 547, 196, 627, 386, 500, 40, 535, 894, and 31

Chapter 2

1. When data is skewed right, the mean is:
a. Greater than the mode and less than the median
b. Greater than the mode and greater than the median
c. Less than the mode and greater than the median
d. Less than the mode and less than the median
Ans: b

2. In a normally distributed bell curve, what percentage of the data lies within one standard deviation of the mean?
a. 68
b. 34
c. 27
d. 56
Ans: a

3. Use the following data set:
50 51 54 56 59 60 61 61 61 63 64 65 68 69 70 71 71 75
a) Make a stem-and-leaf plot of the data.
b) Make a box-and-whisker plot of the data.
Ans:
a)
5 | 0 1 4 6 9
6 | 0 1 1 1 3 4 5 8 9
7 | 0 1 1 5
b) [Box-and-whisker plot, with the axis marked at 50, 60, and 70]

4. The weights of 20 baseball players have a distribution with a mean of 195 pounds and a standard deviation of 25 pounds. Use z-scores to determine if the weights of the following players are unusual.
a) 251 pounds
b) 162 pounds
c) 219 pounds
d) 178 pounds
Ans: a) Unusual; b) Not unusual; c) Not unusual; d) Not unusual

Chapter 3

1. Are the following events independent or dependent? Flipping a coin and getting heads, and rolling a die and obtaining a five.
Ans: Independent. One outcome does not affect the other.

2. The state government wants to construct a new interstate highway and gets 16 bids for the project. Four of the 16 bidding companies will be picked to sponsor the project. In how many ways can the 4 companies be picked?
Ans. 1,820 different ways

3. Find the probability of selecting four consecutive twos when four cards are drawn without replacement from a standard deck of 52 playing cards. Round your answer to four decimal places.
Ans: Not provided

4.
If a student is randomly selected, find the probability that the student is a senior given that the student owns a credit card. Round your answer to three decimal places.

Class     Credit Card Carrier    Not a Credit Card Carrier    Total
Junior    26                     40                           66
Senior    32                     45                           77
Total     58                     85                           143

Ans: Not provided

5. Decide if the events are mutually exclusive.
Event A: Randomly select a person who uses the internet at least twice a week.
Event B: Randomly select a person who has not used the internet in seven days.
Ans: Mutually exclusive. The two events A and B cannot occur at the same time.

6. The starting lineup for a softball team consists of ten players. How many different batting orders are possible using the starting lineup?
Ans: 3,628,800 batting orders (permutation)

Chapter 4

1. You are taking a multiple-choice quiz that consists of five questions. Each question has four possible answers, only one of which is correct. To complete the quiz, you randomly guess the answer to each question. Find the probability of guessing:
a) Exactly 3 answers correctly
b) At least 3 answers correctly
c) Fewer than 3 answers correctly
Ans: a) 0.088 b) 0.104 c) 0.896

2. Decide whether the distribution is a probability distribution.
x       5       6       7       8
P(x)    0.28    0.21    0.43    0.15
Ans: No. The sum of the probabilities is greater than 1.

3. A six-sided die is rolled 3 times. Find the probability of rolling exactly one 6.
Ans: 0.347

4. Determine what type of distribution the following statement represents, and use that distribution to answer the included question. As a hungry boy, Peter really enjoys visiting his favorite fast food chain, Taco Bell. However, his best friend, Fabio, claims that they have terrible service, because forty-three percent of the time he receives the wrong order. Peter makes a bet with Fabio that he will not receive a wrong order from Taco Bell until his fifth visit (not counting the other times Fabio visited, so start from 0). What is the probability that Peter is correct?
Ans: Geometric distribution, 0.045

5. Create a discrete probability distribution and graph according to the following scenario. A business owner at a local ice cream shop wants to see the probability distribution of the number of toppings (1 to 4) that people put on their ice cream, and determine whether or not it is normally distributed. After keeping a tally of 112 customers, he placed his results in the following chart (frequency distribution).

Number of toppings (x)    1     2     3     4
Frequency (f)             12    47    30    23

Ans:
x       1        2        3        4
P(x)    0.107    0.420    0.268    0.205

[Histogram: relative frequency (0 to 0.5) versus number of ice cream toppings (1 to 4), drawn with no space between the bars]

6. A local pet shelter is selling $4 raffle tickets as part of a fundraising program. The first prize is a vacation getaway valued at $3150, and the second prize is a camping tent valued at $450. The rest of the prizes are fifteen $25 Target gift certificates. The number of tickets sold is 5000. Find the expected net gain to the player for one play of the game. Is the player expected to win or lose?
Ans: -$3.205. The player is expected to lose.

7. Assume the probability that you will make a hole in one on your first swing is 0.19. Find the probability that you:
a. Make a hole in one on the fifth swing.
b. Make a hole in one on the first, second, or third swing.
c. Do not make a hole in one on the first three swings.
Ans: a. 0.082; b. 0.469; c. 0.531

Chapter 5

1. A survey indicates that for each trip to the supermarket, a shopper spends an average of 45 minutes with a standard deviation of 12 minutes. The length of time spent in the store is normally distributed and is represented by the variable x. A shopper enters the store. Find the probability that the shopper will be in the store between 24 and 54 minutes.
Ans: 0.733

2. 29% of people in the US say they are confident that passenger trips to the moon will occur in their lifetime.
You randomly select 200 people in the US and ask each if he or she thinks passenger trips to the moon will occur in his or her lifetime. What is the probability that at least 50 will say yes?
Ans. 0.9066

3. Shoppers spend an average of 65 minutes at the grocery store every week with a standard deviation of 15 minutes. Assume the times are normally distributed. Use z-scores to find the probability that a shopper spends:
a) Between 50 and 80 minutes
b) 95 or more minutes
Ans. a) 0.683; b) 0.023

4. Find the z-score that corresponds to:
a) P5
b) P10
c) P15
d) P20
Ans: a) -1.65; b) -1.28; c) -1.04; d) -0.84

Chapter 6

1. A video gamer wishes to estimate the mean level of all players on his server. In a random sample of 100 players, the mean level is found to be 17.8. Similar experiments in the past have found the population standard deviation to be 1.8. Assuming the population is normally distributed, construct a 90% confidence interval for the population mean level.
Ans. (17.5039, 18.0961)

2. A jeweler randomly selects and weighs 30 diamonds. The sample standard deviation is 53 grams. Construct a 99% confidence interval for the population standard deviation. Assume the weights are normally distributed.
Ans. (39.4412, 78.79368)

3. Use a normal distribution or a t-distribution to construct a 95% confidence interval for the population mean. Justify your decision. If neither distribution can be used, explain why not. You make a random survey of 25 sports cars and record the miles per gallon for each. The data are listed below. Assume the miles per gallon are normally distributed.
24 24 27 20 26 23 18 29 24 22 22 27 26 20 28 30 23 24 19 22 24 26 23 24 25
Ans: Use the t-distribution because n < 30, the miles per gallon are normally distributed, and the population standard deviation is unknown. (22.762, 25.238)

4. A container of car oil is supposed to contain 1000 milliliters of oil. A quality control manager wants to be sure that the standard deviation of the oil containers is less than 20 milliliters.
He randomly selects 10 cans of oil with a mean of 997 milliliters and a standard deviation of 32 milliliters. Use these sample results to construct a 95% confidence interval for the true value of the population standard deviation.
Ans: (22.01, 58.42)

5. Find the critical value tc for a 95% confidence level when the sample size is 12.
Ans. tc = 2.201

6. You randomly select 12 cities in Georgia and measure the summer temperature in each city. The sample mean is 98° with a sample standard deviation of 5°. Find the 95% confidence interval for the mean temperature. Assume temperatures are normally distributed.
Ans: (94.823, 101.177)

Chapter 7

1. A local basketball club claims that the length of time to play an entire game has a standard deviation of more than 10 minutes.
a. What would be a consequence of a Type I error in this case?
b. What would be a consequence of a Type II error in this case?
Ans:
a. Type I: Rejecting the null hypothesis that the standard deviation is less than or equal to 10 minutes when it is actually true.
b. Type II: Not rejecting the null hypothesis when it is actually false, that is, when the standard deviation really is more than 10 minutes.

2. The teachers of Chandler High School report that their students have 3 or fewer hours of homework every night, for all of their classes combined. You disagree with this statement and randomly select a sample of 450 students. You find that the calculated p-value is 0.0312. At the 5% significance level, what can you conclude?
Ans. Reject H0. At the 5% level, there is enough evidence to reject the claim that CHS students have 3 or fewer hours of homework.

3. You represent a company that claims its mean amount of products shipped out every year is 68 products. Write the null and alternative hypotheses to test this claim.
Ans:
H0: μ = 68 (claim)
Ha: μ ≠ 68

4. A repair company states that the mean price for every repair they make is less than $100.
You are thinking about hiring this repair company to fix your broken fridge, and find that the mean repair cost of 5 fridges that they repaired was $75 with a standard deviation of $12.50. At α = 0.01, do you have enough evidence to support the company's claim?
Ans.
H0: μ ≥ 100
Ha: μ < 100 (claim)
Reject H0. There is enough evidence at the 1% level to support the claim that the average repair cost for broken fridges is less than $100.

Chapter 8

1. Find the critical value for the indicated test, level of significance, and given sample sizes. Assume the samples are independent, normal, and random, and that the population variances are not equal.
Two-tailed; α = 0.10; n1 = 10, n2 = 12
Ans: t0 = ±1.725

2. Classify the two given samples as independent or dependent. Explain your answer.
Sample 1: The weights of 51 adults
Sample 2: The weights of the same 51 adults after participating in a diet and exercise program for one month
Ans: Dependent, because the same adults were sampled.

3. Test the claim about the difference between two population proportions for the given level of significance and the given sample statistics.
Claim: p1 ≠ p2; α = 0.01
x1 = 35, n1 = 70; x2 = 36, n2 = 60
Ans. z ≈ -1.142; p ≈ 0.254. Fail to reject the null at the 1% level. There is not enough evidence to support the claim that p1 ≠ p2.

4. Decide whether the following samples are independent or dependent. Explain your reasoning.
a) Sample 1: test scores for 14 statistics students on a pretest
   Sample 2: test scores for the same 14 statistics students on a posttest
b) Sample 1: results of a medicine on a sample of 26 patients
   Sample 2: results of a placebo on a sample of 26 patients
Ans:
a. The two samples are dependent because the same 14 students are used for both samples.
b. The two samples are independent because the patients cannot be taking both the medicine and the placebo, so the two samples consist of different patients.

5. A real estate agent claims that there is no difference between the mean household incomes from two neighborhoods.
The mean income of 12 randomly selected households from the first neighborhood was $12,250 with a standard deviation of $1200. In the second neighborhood, 10 randomly selected households had a mean income of $17,500 with a standard deviation of $950. Assume normal distributions and equal population variances. Test the claim at α = 0.05.
Ans: Two-sample t-test (pooled), t = -11.202. Reject H0. At the 5% level, there is enough evidence to reject the real estate agent's claim that there is no difference between the mean incomes of the two neighborhoods.

6. An advertising agency claims that there is no difference between the income of doctors and the income of lawyers. Five hundred doctors and five hundred lawyers are surveyed to find their mean annual income, and the results are recorded below. The two samples are independent. Do the results support the agency's claim at the 5% level?
x̄1 = 125,000; s1 = 28,000; n1 = 500
x̄2 = 118,000; s2 = 25,000; n2 = 500
Ans: Reject the null. At the 5% level, there is enough evidence to reject the claim that the mean annual income of doctors and the mean annual income of lawyers are equal.

7. A travel agent claims that the proportion of people who develop violent diarrhea after visiting a Latin American country is the same as the proportion of people who develop violent diarrhea after visiting a South American country. After polling 200 visitors to the Latin American country and 150 visitors to the South American country, ABC 15 investigators found that 30% of visitors to the Latin American country developed debilitating diarrhea and 65% of visitors to the South American country developed such diarrhea. Test the claim at the 5% level that the proportions are the same.
Ans: Reject the null hypothesis. The sample proportions (0.30 and 0.65) give z ≈ -6.51, so at the 5% level there is enough evidence to reject the claim that the proportions are equal.

Chapter 9

1. Calculate the correlation coefficient (r) and make a conclusion about the type of correlation.
The number of hours 15 statistics students spent studying for a test and their scores on that test:

X (hours)    0   1   1   2   4   4   5   6   3   2   5   4   3   5   6
Y (score)    48  53  56  65  77  80  87  94  72  63  81  78  70  84  98

Ans: r = 0.989. Hours spent studying and test scores have a strong positive linear relationship.

2. Interpret the meaning of the following coefficient of determination: r² = 0.978.
Ans: It means that 97.8% of the variation in y is explained by the regression line and 2.2% is not.

4. Find the equation of the linear regression line given that the line passes through the point (10, 5): Sy = 5.789; Sx = 10.596; r = 0.85
Ans: The slope is m = r(Sy/Sx) = 0.85(5.789/10.596) ≈ 0.464, and the intercept is b = 5 - 0.464(10) ≈ 0.356, so ŷ ≈ 0.464x + 0.356.

5. Perform a t-test for correlation to see if there is a significant correlation between age and number of magazine subscriptions. Use α = 0.05.

Age                       10  22  37  48  50
Magazine subscriptions    0   2   2   3   2

Ans: Fail to reject H0. There is not enough evidence at the 5% level to conclude that there is a significant correlation between age and number of magazine subscriptions.

Chapter 10

1. The following table shows the price per gallon for a random sample of exterior deck treatments. At the 10% level, is there enough evidence to show that at least one of the treatments is different from the rest? (Hint: ANOVA test)

Semitransparent treatments    24 23 22 17 21 17
Lightly tinted treatments     51 14 21 16
Clear treatments              13 13 10 12 22

Ans: Fail to reject H0. At the 10% level, there is not enough evidence to show that the treatments are different.

2. Biologists were studying the eating habits of pythons and anacondas in the wilderness. Eight pythons and six anacondas were observed. The variance of time between meals for the pythons was five months. The variance of time between meals for the anacondas was eight months. The biologists claim that the anacondas eat more often than the pythons do. Prove or disprove this claim. Use α = 0.05.
Ans: Fail to reject the null hypothesis. There is insufficient evidence at the 5% level to conclude that anacondas eat more often than pythons.

3.
The following contingency table shows the results of a random sample of 550 company CEOs classified by age and size of company. At α = 0.01, can you conclude that the CEOs' ages are related to company size?

                 Age of CEOs
Company size     39 and under    40-49    50-59    60-69    70+    Total
Small/Midsize    42              69       108      60       21     300
Large            5               18       85       120      22     250
Total            47              87       193      180      43     550

Ans:
H0: The CEOs' ages are independent of company size.
Ha: The CEOs' ages are dependent on company size.
d.f. = (2-1)(5-1) = 4
α = 0.01
Critical value: χ² = 13.277
Test statistic: χ² = 77.9
Reject the null. At the 1% level, there is enough evidence to support the claim that CEOs' ages are dependent on company size.

4. A medical researcher claims that a specially treated intravenous solution decreases the variance of the time required for nutrients to enter the bloodstream. Independent samples from each type of solution are randomly selected, and the results are shown in the table. At α = 0.01, is there enough evidence to support the researcher's claim? Assume the populations are normally distributed.

                   n     s²
Normal Solution    25    180
Treated Solution   20    56

b) Identify the claim and state the null and alternative hypotheses.
c) Specify the level of significance.
d) Determine the degrees of freedom for the numerator and denominator.
e) Find the critical value and identify the rejection region.
f) Use the F-test to find the F test statistic.
g) Conclude.
Ans:
H0: σ1² ≤ σ2²
Ha: σ1² > σ2² (claim), where σ1² is the variance for the normal solution and σ2² is the variance for the treated solution
α = 0.01
d.f. (numerator) = 24
d.f. (denominator) = 19
Critical value: F0 = 2.92
Test statistic: F = 180/56 = 3.214
Reject the null. The evidence supports the claim at the 1% level.
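
The two Chapter 10 test statistics above (questions 3 and 4) can be checked with a short script. This is a minimal sketch, not part of the original answer key: it uses only plain Python, the helper name chi_square_statistic is made up for illustration, and the critical values 13.277 and 2.92 are copied from the answers above rather than computed.

```python
# Observed counts from the CEO contingency table
# (rows: company size, columns: age group).
observed = [
    [42, 69, 108, 60, 21],   # Small/Midsize
    [5, 18, 85, 120, 22],    # Large
]

def chi_square_statistic(table):
    """Chi-square statistic for independence: sum of (O - E)^2 / E over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence = (row total * column total) / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat

chi2 = chi_square_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Critical values taken from the answers in the text (assumed, not recomputed here).
CHI2_CRITICAL = 13.277   # chi-square, d.f. = 4, alpha = 0.01
F_CRITICAL = 2.92        # F, d.f. = (24, 19), alpha = 0.01, right-tailed

# F statistic for question 4: larger sample variance over smaller sample variance.
f_stat = 180 / 56

print(f"chi-square = {chi2:.1f}, d.f. = {dof}, reject H0: {chi2 > CHI2_CRITICAL}")
print(f"F = {f_stat:.3f}, reject H0: {f_stat > F_CRITICAL}")
```

Both statistics exceed their critical values, which matches the "Reject the null" conclusions in the answers to questions 3 and 4.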