AP Statistics 8/31/06 Coley / P. Myers
Test #1 (Chapters 1-6)
Name ______________________________________________ Period ___________
Honor Pledge _________________________________________

Part I - Multiple Choice (Questions 1-10) - Circle the answer of your choice.

1. In order to rate TV shows, phone surveys are sometimes used. Such a survey might record several variables, some of which are listed below. Which of these variables are categorical?
I. The type of show being watched
II. The number of persons watching the show
III. The ages of persons watching the show
IV. The name of the show being watched
V. The number of times the show has been watched in the last month
(a) II, III, and V
(b) I only
(c) I and V
(d) I and IV
(e) None of the above describes the complete set of correct responses

2. The stemplot displays the 1988 per capita income (in hundreds of dollars) of the 50 states. Which of the following best describes the data?
(a) Skewed distribution, mean greater than median
(b) Skewed distribution, median greater than mean
(c) Symmetric distribution, mean greater than median
(d) Symmetric distribution, median greater than mean
(e) Symmetric distribution with outliers on high end

3. A study was conducted on the weights of three different species of fish (bream, perch, and roach) found in a lake in Finland. All three are commercial fish. Their weights are displayed in the boxplots. Which of the following statements comparing these boxplots is NOT correct?
(a) The median weights of the three species differ.
(b) The spread of roach is less than the spread of the other two species.
(c) The distributions of weights are approximately symmetric for all three species.
(d) There are no outliers in weight for the three species.
(e) The variability in the weights for the three species combined exceeds the variation in the medians of the three species.

4.
The mean age of 14 of the members attending a mathematics department faculty meeting is 42. Mr. Myers, who is 57, arrives late. What is the average age of all 15 members?
(a) 43 (b) 44 (c) 45 (d) 46 (e) cannot be determined

5. The weights of cockroaches living in a typical college dormitory are approximately normally distributed with a mean of 80 grams and a standard deviation of 4 grams. The percentage of cockroaches weighing between 77 grams and 83 grams is about:
(a) 99.7% (b) 95% (c) 68% (d) 55% (e) 34%

6. Scores on the American College Test (ACT) are normally distributed with a mean of 18 and a standard deviation of 6. The interquartile range of the scores is approximately:
(a) 8.1 (b) 12 (c) 6 (d) 10.3 (e) 7

7. The test grades at a large school have an approximately normal distribution with a mean of 50. What is the standard deviation of the data so that 80% of the students are within 12 points (above or below) the mean?
(a) 5.875 (b) 9.375 (c) 10.375 (d) 14.5 (e) cannot be determined from the given information

8. In the accompanying display, which has the larger mean and which has the larger standard deviation?
(a) Larger mean, A; larger standard deviation, A
(b) Larger mean, A; larger standard deviation, B
(c) Larger mean, B; larger standard deviation, A
(d) Larger mean, B; larger standard deviation, B
(e) Larger mean, B; same standard deviation

9. You have a set of data that you suspect came from a normal distribution. In order to assess normality, you construct a normal probability plot. Which of the following would constitute evidence that the data actually came from a normal distribution?
(a) A strongly linear relationship between the data and their standardized values.
(b) A bell-shaped (normal) relationship between the data and their standardized values.
(c) A random scattering of points when the standardized values are plotted against the data.
(d) A strongly non-linear relationship (with no outliers) between the data and their percentiles.
(e) A uniform relationship between the percentiles and the standardized values.

10. The cost of glass cleaner is nicely described by a Normal model with a mean cost per ounce of 7.7 cents with a standard deviation of 2.5 cents. What is the z-score of Windex with a cost of 10.1 cents per ounce?
(a) 0.96 (b) 1.31 (c) 1.94 (d) 2.25 (e) 3.00

Part II – Free Response (Questions 11-13) – Show your work.

11. The heights of NCAA women basketball players are approximately normally distributed with μ = 70 and σ = 2.5. For each of the following, illustrate with a picture and evaluate.
(a) P(Height > 66) __________
(b) P(72 < Height < 74) __________
(c) P(Height < 64) __________
(d) The value of X if P(Height > X) = 0.215. __________

12. The summary statistics for the number of inches of rainfall in Los Angeles for 117 years, beginning in 1877, are shown below.

N      Mean     Median   StDev    Min     Max      Q1      Q3
117    14.941   13.070   6.747    4.850   38.180   9.680   19.250

(a) Describe a procedure that uses these summary statistics to determine whether there are outliers.
(b) Are there outliers in these data? Justify your answer based on the procedure that you described in (a).
(c) The news media reported that in a particular year, there were only 10 inches of rainfall. Use the information provided to comment on this reported statement.

13. Two parents have each built a toy catapult for use in a game at an elementary school fair. To play the game, the students will attempt to launch Ping-Pong balls from the catapults so that the balls land within a 5-centimeter band. A target line will be drawn through the middle of the band, as shown in the figure below. All points on the target line are equidistant from the launching location. If a ball lands within the shaded band, the student will win a prize. The parents have constructed the two catapults according to slightly different plans. They want to test these catapults before building additional ones.
Under identical conditions, the parents launch 40 Ping-Pong balls from each catapult and measure the distance that the ball travels before landing. Distances to the nearest centimeter are graphed in the dotplot below.
(a) Comment on any similarities and any differences in the two distributions of distances traveled by balls launched from catapult A and catapult B.
(b) If the parents want to maximize the probability of having the Ping-Pong balls land within the band, which one of the catapults, A or B, would be better to use than the other? Justify your choice.
(c) Using the catapult that you chose in part (b), how many centimeters from the target line should this catapult be placed? Explain why you chose this distance.

AP Statistics @ Woodward Academy Thursday, September 21, 2006 Coley / P. Myers
Test #2 (Chapters 7-9)
Name _________________________________________ Period ______
Honor Pledge ___________________________________

Part I - Multiple Choice (Questions 1-10) - Circle the answer of your choice.

1. Given the least-squares regression line:
[Monopoly Property Cost] = 67.3 + 6.78 * [Spaces From GO]
Determine the residual for Reading Railroad which costs $200 and is 5 spaces from GO.
(a) -98.8
(b) -9.88
(c) 98.8
(d) -1418.3
(e) A residual has no meaning since one of the variables is categorical.

2. The computer printout of the relationship between the number of hours studying and the number of hours watching television is shown below.

Predictor     Coef       SE Coef    T       P
Constant      5.1674     0.3203     16.13   0.000
Television    -0.56484   0.07636    -7.40   0.000

S = 0.522956   R-Sq = 84.5%   R-Sq(adj) = 83.0%

Analysis of Variance
Source            DF    SS       MS       F       P
Regression        1     14.963   14.963   54.71   0.000
Residual Error    10    2.735    0.273
Total             11    17.698

What is the value of the correlation coefficient for the number of hours studying and the number of hours watching television?
(a) 0.919 (b) 0.523 (c) 0.830 (d) -0.919 (e) -0.565

3.
Data are obtained for a group of college freshmen examining their SAT scores (math plus verbal) from their senior year of high school and their GPAs during their first year of college. The resulting regression equation is:
predicted GPA = 0.00161 * SAT + 1.35, with s_SAT = 120 and s_GPA = 0.3057
What percentage of the variation in GPAs can be explained by looking at SAT scores?
(a) 0.161%
(b) 16.1%
(c) 39.9%
(d) 63.2%
(e) This value cannot be computed from the information given.

4. Suppose the correlation between two variables is r = 0.23. What will the new correlation be if 0.14 is added to all values of the x-variable, every value of the y-variable is doubled, and the two variables are interchanged?
(a) 0.23 (b) 0.37 (c) 0.74 (d) -0.23 (e) -0.74

5. Which of the following characteristics of a least-squares regression equation is false?
(a) The LSRL minimizes the sum of the residuals.
(b) The average residual of a LSRL is 0.
(c) The LSRL minimizes the sum of the squared residuals.
(d) The slope of the LSRL is a constant multiple of the correlation coefficient.
(e) The slope of the LSRL tells you, on the average, how much the response variable will change for each unit change in the explanatory variable.

6. A study of the fuel economy for various automobiles plotted the fuel consumption (in liters of gasoline used per 100 kilometers traveled) vs. speed (in kilometers per hour). A least-squares regression line was fitted to the data and the residual plot is displayed to the right. What does the pattern of the residuals tell you about the linear model?
(a) The evidence is inconclusive.
(b) The residual plot confirms the linearity of the data.
(c) The residual plot suggests a different line would be more appropriate.
(d) The residual plot clearly contradicts the linearity of the data.
(e) None of the above.

7. With regard to regression, which of the following statements about outliers are true?
I. Outliers have large residuals.
II. A point may not be an outlier even though its x-value is an outlier in the x-variable and its y-value is an outlier in the y-variable.
III. Removal of an outlier sharply affects the regression line.
(a) I and II
(b) I and III
(c) II and III
(d) I, II, and III
(e) None of the above gives the complete set of true responses.

8. As reported in the Journal of the American Medical Association (June 13, 1990), for a study of ten nonagenarians, the following tabulation shows a measure of strength versus a measure of functional mobility.

Strength (kg)    7.5   6    11.5   10.5   9.5   18   4    12   9    3
Walk time (s)    18    46   8      25     25    7    22   12   10   48

What does the slope of the least-squares regression line signify?
(a) The sign is positive, signifying a direct cause-and-effect relationship between strength and mobility.
(b) The sign is positive, signifying that the greater the strength, the greater the functional mobility.
(c) The sign is negative, signifying that the relationship between strength and functional mobility is weak.
(d) The sign is negative, signifying that the greater the strength, the less the functional mobility.
(e) The slope is close to zero, signifying that the relationship between strength and functional mobility is weak.

9. Some AP Statistics students were interested in finding out if there was a relationship between the number of hours of study for a chapter test and the score on that test. On the basis of the number of hours their classmates studied for the chapter 3 test and the scores on the test (out of 100%), the LSRL was y-hat = 72.53 + 5.88x, where x is the number of hours studied and y-hat is the predicted score on the test. Which statement correctly interprets the meaning of the slope of this regression line?
(a) For each additional hour studied, the predicted score on the test increases by approximately 73%.
(b) For each additional hour studied, the predicted score on the test increases by approximately 6%.
(c) For each additional percent of increase on the test, the predicted number of hours studied increases by approximately 73%.
(d) For each additional percent of increase on the test, the predicted number of hours studied increases by approximately 6%.
(e) We cannot use this regression equation, since cause-effect has not been proven.

10. Consider the three points (2,11), (3,17), and (4,29). Given any straight line, we can calculate the sum of the squares of the three vertical distances from these points to the line. What is the smallest possible value this sum can be?
(a) 6 (b) 9 (c) 29 (d) 57 (e) cannot be determined

Part II – Free Response (Questions 11-12) – Show your work and explain your results clearly.

11. Lydia and Bob were searching the Internet to find information on air travel in the United States. They found data on the number of commercial aircraft flying in the United States during the years 1990-1998. The dates were recorded as years since 1990. Thus, the year 1990 was recorded as year 0. They fit a least squares regression line to the data. The graph of the residuals and part of the computer output for their regression are given below.
a. Is a line an appropriate model to use for these data? What information tells you this?
b. What is the value of the slope of the least squares regression line? Interpret the slope in the context of this situation.
c. What is the value of the intercept of the least squares regression line? Interpret the intercept in the context of this situation.
d. What is the predicted number of commercial aircraft flying in 1992?
e. What was the actual number of commercial aircraft flying in 1992?

12. A simple random sample of 9 students was selected from a large university. Each of these students reported the number of hours he or she had allocated to studying and the number of hours allocated to work each week. A least squares regression was performed and part of the resulting computer printout is shown below.

Predictor    Coef     StDev    T      P
Constant     8.107    2.731    2.97   0.021
Work         0.4919   0.1950   2.52   0.040

S = 4.349   R-Sq = 47.6%   R-Sq(adj) = 40.1%

The scatterplot below displays the data that were collected from the 9 students.

[Figure: Scatterplot of Study vs Work, with point P labeled]

(a) After point P, labeled on the graph, was removed from the data, a second linear regression was performed and the computer output is shown below.

Predictor    Coef     StDev    T      P
Constant     11.123   3.986    2.79   0.032
Work         0.1500   0.3834   0.39   0.709

S = 4.327   R-Sq = 2.5%   R-Sq(adj) = 0.0%

Does point P exercise a large influence on the regression line? Explain.

(b) The researcher who conducted the study discovered that the number of hours spent studying reported by the student represented by P was recorded incorrectly. The corrected data point for this student is represented by the letter Q in the scatterplot below.

[Figure: Scatterplot of Study vs Work, with corrected point Q labeled]

Explain how the least squares regression line for the corrected data (in this part) would differ from the least squares regression line for the original data.

AP Statistics @ Woodward Academy Thursday, October 5, 2006 Coley / P. Myers
Test #3 (Chapters 1-10)
Name _________________________________________ Period ______
Honor Pledge ___________________________________

Part I - Multiple Choice (Questions 1-8) - Circle the answer of your choice.

1. A response variable appears to be exponentially related to the explanatory variable. The natural logarithm of each y-value is taken and the least-squares regression line is found to be ln(y) = 1.64 - 0.88x. Rounded to two decimal places, what is the predicted value of y when x = 3.1?
(a) -1.09 (b) -0.34 (c) 0.34 (d) 0.082 (e) 1.09

2. The 5-number summary for a one-variable data set is {5, 18, 20, 40, 75}.
If you wanted to construct a modified box-and-whiskers plot for the dataset (that is, one that shows outliers if there are any), what would be the maximum possible length of the right side “whisker”?
(a) 35 (b) 33 (c) 5 (d) 55 (e) 53

3. Mary’s best time for downhill skiing the challenging course has a z-score of 0.5 as compared to all skiers that are timed on the same course. Which statement best interprets her z-score?
(a) Mary’s time is 0.5 seconds times faster than all skiers timed on the same course.
(b) Mary’s time is 0.5 seconds faster than all skiers timed on the same course.
(c) Mary’s time is 0.5 standard deviations below the mean time for all skiers timed on the same course.
(d) Mary’s time is 0.5 standard deviations above the mean for all skiers timed on the same course.
(e) Mary skis worse than the majority of the skiers timed on the same course.

4. The equation of a least-squares regression line is y = 3.34x - 7.012. One of the points in the scatter plot was (5,10). What is the residual for this point?
(a) -10.388 (b) -0.312 (c) 0.312 (d) 9.688 (e) 10.388

5. The heights of adult women are approximately normally distributed about a mean of 65 inches with a standard deviation of 2 inches. If Rachel is at the 99th percentile in height for adult women, then her height, in inches, is closest to
(a) 60 (b) 62 (c) 68 (d) 70 (e) 74

6. A set of data was re-expressed in two ways.
Model A: (x, y) → (x, log(y))
Model B: (x, y) → (log(x), log(y))
Based on the residual plots shown below, which model would be more appropriate?
(a) Model A because the residuals are reasonably random.
(b) Model B because the residuals are reasonably random.
(c) Model A because the residuals are decreasing.
(d) Model B because the residuals show a pattern.
(e) Cannot be determined.

7.
If a least-squares residual plot appeared as in the enclosed graph, the appropriate model would be:
(a) exponential model
(b) linear model
(c) power model
(d) an undetermined non-linear model
(e) square root model

8. A copy machine dealer has data on the number x of copy machines at each of 89 customer locations and the number y of service calls in a month at each location. Summary calculations give x-bar = 8.4, s_x = 2.1, y-bar = 14.2, s_y = 3.8, r = 0.86. About what percent of the variation in number of service calls is explained by the linear relation between number of service calls and number of machines?
(a) 86% (b) 93% (c) 74% (d) none of these (e) cannot be determined

Part II – Free Response (Questions 9-11) – Show your work and explain your results clearly.

9. The Earth’s Moon has many impact craters that were created when the inner solar system was subjected to heavy bombardment of small celestial bodies. Scientists studied 11 impact craters on the Moon to determine whether there was any relationship between the age of the craters (based on radioactive dating of lunar rocks) and the impact rate (as deduced from the density of the craters). The data are displayed in the scatterplot below.

[Figure: scatterplot of Impact Rate vs. Age]

(a) Describe the nature of the relationship between age and impact rate.

Prior to fitting a linear regression model, the researchers transformed both impact rate and age by using logarithms. The following computer printout and residual plot were produced.

Regression equation: log(rate) = 4.82 - 3.92 log(age)

Predictor    Coef      SE Coef   T       P
Constant     4.8247    0.1931    24.98   0.000
log (age)    -3.9232   0.4514    -8.69   0.000

S = 0.5977   R-Sq = 89.4%   R-Sq(adj) = 88.2%

(b) Interpret the value of r^2.
(c) Comment on the appropriateness of this linear regression for modeling the relationship between the transformed variables.

10. A random sample of 400 married couples was selected from a large population of married couples.
Heights of married men are approximately normally distributed with mean 70 inches and standard deviation 3 inches. Heights of married women are approximately normally distributed with mean 65 inches and standard deviation 2.5 inches. There were 20 couples in which the wife was taller than her husband, and there were 380 couples in which the wife was shorter than her husband. The relationship between husband’s height vs. wife’s height for the 400 married couples was approximately linear with correlation 0.4.
(a) Determine the boundary heights for the middle 95% of men’s heights.
(b) Determine the boundary heights for the middle 95% of women’s heights.
(c) Determine the equation of the least-squares regression line for the linear relationship between men’s heights and women’s heights.
(d) Using all the information given and the results from parts (a), (b), and (c), sketch an oval that could enclose the points on the scatterplot below.

[Figure: blank scatterplot grid, both axes marked from 10 to 100]

11. A plot of the number of defective items produced during 20 consecutive days at a factory is shown below.

[Figure: Scatterplot of Number of Defective Items vs Day Number]

(a) Draw a histogram that shows the frequencies of the number of defective items.
(b) Give one fact that is obvious from the histogram but is not obvious from the scatterplot.
(c) Give one fact that is obvious from the scatterplot but is not obvious from the histogram.

AP Statistics 11/02/06 Coley / P. Myers
Test #4 (Chapters 11-13)
Name ____________________________________________________ Period ___________
Honor Pledge ______________________________________________

Part I - Multiple Choice (Questions 1-10) - Circle the answer of your choice.

1. Who makes more mistakes on their income tax forms: accountants or taxpayers who prepare the forms themselves? A random sample of income tax forms that were prepared by accountants was drawn from IRS records. An equal number of forms that were self-prepared by taxpayers was also drawn. The average number of errors per form was compared to determine if one group tends to make more mistakes than the other. What type of study is this?
(a) census
(b) experiment
(c) voluntary response survey
(d) observational study
(e) matched-pairs study

2. A dance club holds a raffle at the end of each dance. Five dancers are selected at random to each draw one numbered tag from a hat without replacement. There are 50 tags in the hat numbered from 1 to 50. Drawing a tag numbered from 1 through 5 wins $20, tags 6 through 25 win $10, and tags 26 through 50 win $5. In order to determine the average amount of money paid out, a simulation will be conducted using a random number table. Which of the following assignments of random numbers to tag values is most appropriate for the simulation?
(a) Using single-digit numbers, assign 0 to represent a $20 prize, 1-4 to represent a $10 prize, and 5-9 to represent a $5 prize.
(b) Using single-digit numbers, assign 0 to represent a $20 prize, 1 to represent a $10 prize, and 2 to represent a $5 prize. Numbers 3-9 are ignored.
(c) Using two-digit numbers, assign 20 to represent a $20 prize, 10 to represent a $10 prize, and 05 to represent a $5 prize. Numbers 00-04, 06-09, 11-19, 21-99 are ignored.
(d) Using two-digit numbers, assign 01-05 to represent a $20 prize, 06-25 to represent a $10 prize, and 26-50 to represent a $5 prize. Numbers 51-99 and 00 are ignored.
(e) Using two-digit numbers, assign 01-10 to represent a $20 prize, 11-40 to represent a $10 prize, and 41-99 and 00 to represent a $5 prize.

3. The student council wants to survey their students to see what brands of soft drinks they want in the school machines. They randomly sampled 30 freshmen, 30 sophomores, 30 juniors, and 30 seniors. The sampling method they used is a:
(a) simple random sample
(b) stratified random sample
(c) cluster sample
(d) systematic random sample
(e) convenience sample

4. What is the major difference between an experiment and an observational study?
(a) A treatment is imposed in an experiment.
(b) An observational study can establish cause-effect relationships.
(c) There are two control groups instead of one in an experiment.
(d) Observational studies use only one population.
(e) Experiments are blinded.

5. A simple random sample of size n is selected in such a way that:
(a) Each member of the population has an equal chance of being selected.
(b) Each member of the population is given an opportunity to respond to the survey.
(c) All samples of size n have the same chance of being selected.
(d) The probability of selecting any sample is known to be 1/n.
(e) The sample is guaranteed to represent the entire population.

6. In sample surveys, bias can be controlled by all of the following except:
(a) Using a random sampling procedure.
(b) Wording questions so they are not confusing or misleading.
(c) Carefully training and supervising interviewers.
(d) Prompting respondents so that they give correct responses.
(e) Reducing non-response and undercoverage.

7. A new medication has been developed to cure a certain disease. The disease progresses in three stages: I, II, and III, each progressively worse than the one before it. Ninety volunteers are gathered to test the new medication, 30 in each of the three stages. The medication will be administered to subjects daily in one of three dosages: 100 mg for each subject in stage I, 200 mg for each subject in stage II, 300 mg for each subject in stage III. After 8 weeks, the proportion of subjects cured of the disease will be recorded. Why is this NOT a good experimental design?
I. Because experiments of this type should only use one dosage level of medication.
II. Because disease stage is potentially confounded with dosage level.
III. Because the experiment lacks a control group.
(a) I only
(b) II only
(c) I and II only
(d) II and III only
(e) I, II, and III

8. A garage door manufacturer has developed a new type of door for houses in the Southeast part of the United States. Doors in this area of the country are particularly susceptible to damage from salty ocean spray and the sun’s rays, which tend to shine mainly on the north side of the house. An experiment will test the new type of garage door against the existing type of door on eight houses in a particular residential area. An overhead view of the area is shown below. The location of the garage door on each house is marked with an “X”. Which of the following blocking schemes is most appropriate to account for variables in this study other than type of door?
(a) Form the houses into two blocks: {1,2,3,4} and {5,6,7,8}
(b) Form the houses into two blocks: {1,3,5,7} and {2,4,6,8}
(c) Form the houses into four blocks: {1,5}, {2,6}, {3,7} and {4,8}
(d) Form the houses into four blocks: {1,3}, {2,4}, {5,7} and {6,8}
(e) No blocking is necessary in this experiment.

9. Five homes from a subdivision will be randomly selected to receive 1 month of free cable TV. There are 80 homes in the subdivision. The homes are assigned numbers 01-80 and the random number table below (beginning with the first line and reading from left to right) is used to select the five homes. No home may receive more than one free month of service. Which of the following is a correct selection of the five homes?

99154 70392 23889 92335
92210 70439 08629 73299

(a) 9, 1, 5, 4, 7
(b) 15, 47, 03, 23, 23
(c) 15, 47, 03, 23, 35
(d) 99, 70, 23, 92, 08
(e) 99, 15, 47, 03, 92

10. A graduate student designed a study to determine whether a new activity-based method is better than the traditional lecture method of teaching statistics. He found two teachers to help him in his study for one semester. Mr. Dull volunteered to continue teaching with traditional lectures and Ms.
Perky agreed to try the new activity-based method. Each teacher planned to teach two sections of approximately forty students each for adequate replication. At the end of the semester, all sections would take the same final exam and their scores would be compared. What is the explanatory variable in this study?
(a) Teacher
(b) Section of the Course
(c) Teaching Method
(d) Final Exam Score
(e) Student

Part II – Free Response (Questions 11-13) – Show your work and explain your results clearly.

11. It rains on Paradise Island on 40% of the days. The chance of rain is independent from day to day. A travel agent is signing people up to go on a 5-day tour of the island. She wants to know the chance of getting at least two consecutive days of rain at any time during the 5 days. To determine this, a simulation will be used.
(a) Describe how you would use a random digit table to simulate whether at least two consecutive days of rain occur over a 5-day period.
(b) Conduct 10 trials of your simulation using the random number table below. By marking directly on or above the table, make your procedure clear enough for someone to understand.

00233 54830 39108 57935 28715 08996 19223 39280 22222 31405 71405 49953 30324 80154 39490 96080 51290 33843 62322 80262

12. A biologist is interested in studying the effect of growth-enhancing nutrients and different salinity (salt) levels in water on the growth of shrimps. The biologist has ordered a large shipment of young tiger shrimps from a supply house for use in the study. The experiment is to be conducted in a laboratory where 10 tiger shrimps are placed randomly into each of 12 similar tanks in a controlled environment. The biologist is planning to use 3 different growth-enhancing nutrients (A, B, and C) and two different salinity levels (low and high).
(a) List the treatments that the biologist plans to use in this experiment.
(b) Give one statistical advantage to having only tiger shrimps in the experiment.
Explain why this is an advantage.
(c) Give one statistical disadvantage to having only tiger shrimps in the experiment. Explain why this is a disadvantage.
(d) Using the treatments listed in part (a), describe a completely randomized design that will allow the biologist to compare the shrimps’ growth after 3 weeks.

13. The administrators in a high school are thinking of changing the school’s parking policy effective 3 weeks after school begins. The administration has asked the student council to conduct a survey during the first week of school to determine what students who own cars think about the proposal. The student body has the following distribution. The number of students who own cars is also provided.

Grade         Population   Own Cars
Freshmen      500          0
Sophomores    550          180
Juniors       500          315
Seniors       450          405

The student council has decided to survey 100 students. The student body president wants to conduct a simple random sample to obtain the names of 100 students to be surveyed. The student body secretary wants to use a stratified random sample to obtain the names of the 100 students.
(a) If the student body president’s plan is chosen, describe the procedure used to select the 100 students.
(b) Describe one disadvantage of using the student body president’s plan.
(c) If the student body secretary’s plan is chosen, describe the procedure used to select the 100 students.

AP Statistics @ Woodward Academy Tuesday, November 14, 2006 Coley / P. Myers
Test #5 (Chapters 14-15)
Name _________________________________________ Period ______
Honor Pledge ___________________________________

Part I - Multiple Choice (Questions 1-10) - Circle the answer of your choice.

1. Suppose on any given day at school 0.15 of the English classes go to the computer lab, 0.10 of the Science classes go to the computer lab, and 0.04 of English and Science classes go to the computer lab. What is the probability that either an English or Science class will go to the computer lab?
(a) 0.15 (b) 0.19 (c) 0.21 (d) 0.25 (e) 0.29

2. Alex, Bryan, and Charlie are all playing tennis matches in a tournament against different opponents. Based on previous performances, there is a 0.4 probability that Alex will win his first match, a 0.3 probability that Bryan will win his first match, and a 0.2 probability that Charlie will win his first match. If the chance that each wins his first match is independent of the others, what is the probability that none of them wins in their first matches?
(a) 0.024 (b) 0.304 (c) 0.336 (d) 0.700 (e) 0.900

3. The 2000 Census identified the ethnic breakdown of the state of California to be approximately as follows: White – 46%, Latino – 32%, Asian – 11%, Black – 7%, Other – 4%. Assuming that these are mutually exclusive categories, what is the probability that a randomly selected person from the state of California is of Asian or Latino descent?
(a) 46% (b) 32% (c) 11% (d) 43% (e) 3.5%

4. Given two events, A and B, if P(A) = 0.43, P(B) = 0.26, and P(A or B) = 0.68, then the two events are
(a) disjoint but not independent
(b) independent but not disjoint
(c) disjoint and independent
(d) neither disjoint nor independent
(e) not enough information is given to determine whether A and B are disjoint or independent

5. A dormitory on campus houses 200 students. Of the 200 students, 120 are male, 50 are seniors, and 40 are male seniors. A student is selected at random. The probability of selecting a non-senior, given the student is a female, is:
(a) 7/8 (b) 7/15 (c) 7/15 (d) 1/4 (e) 2/5

6. In an effort to get at the source of an outbreak of Legionnaire's disease at the 1979 APHA convention, a team of epidemiologists carried out a case-control study involving all 50 cases and a sample of 200 non-cases out of the 4000 persons attending the convention.
Among the results, it was found that 40% of the cases went to a cocktail party given by a large drug company on the second night of the convention, whereas 10% of the controls attended the same party. Which of the following statements is appropriate for describing the 40% of cases who went to the party? (C = case, P = attended party; a superscript C, written ^C, denotes a complement)
(a) P(C|P) = .40 (b) P(P|C^C) = .40 (c) P(C|P^C) = .40 (d) P(P^C|C) = .40 (e) none of these

7. P(X) = 0.23 and P(X and Y) = 0.12 and P(X or Y) = .34, find P(Y^C).
(a) 0.23 (b) 0.52 (c) 0.11 (d) 0.77 (e) 0.48

8. Which of the following statements must be true?
(a) If two events are independent, they must be disjoint.
(b) If two events are dependent, they must be disjoint.
(c) If two events are disjoint, they must be independent.
(d) If two events are not disjoint, they must be independent.
(e) If two events are disjoint, they must be dependent.

9. For any two events A and B, which of the following statements must be true?
I. P(A) + P(B) = 1
II. P(A) + P(A^C) = 1
III. P(A|B) + P(B|A) = 1
IV. P(A|B) + P(A^C|B) = 1
V. P(A|B) + P(A|B^C) = 1
(a) II and IV only
(b) I and II only
(c) II, II, and IV only
(d) II and V only
(e) None of the above describes the complete set of true statements.

10. The cause of death and the age of the deceased are recorded for 440 patients from a hospital.

                          Cause of Death
Age     Accident   Homicide   Heart Disease   HIV   Cancer   Other
15-24   14         5          1               0     2        3
25-34   12         4          3               3     4        7
35-44   15         3          14              6     17       16
45-54   12         0          34              4     47       26
55-64   7          0          63              0     89       43

If a person is known to be between the ages of 45 and 54, what is the probability that they died as a result of an accident?
(a) 0.0273 (b) 0.0976 (c) 0.1364 (d) 0.2000 (e) 0.4878

Part II – Free Response (Questions 11-13) – Show your work and explain your results clearly.

11. The WA upper school student body consists of 55% females. Of the females, 65% like chicken fingers; of the males, 85% like chicken fingers.
(a) Assign variable names to each of the unique events and describe the given probabilities using appropriate probability notation.
(b) Set up an appropriately labeled diagram that describes this situation.
(c) If a student is randomly selected and they like chicken fingers, find the probability that the student is male. Show your work.

12. Among the students in the WA upper school student body, 50% drive to school, 20% have a part-time job, and 40% neither drive to school nor have a part-time job.
(a) Assign variable names to each of the unique events and describe the given probabilities using appropriate probability notation.
(b) Set up an appropriately labeled diagram that describes this situation.
(c) Are the events “Driving to School” and “Having a Part-Time Job” independent? Show your work.

13. The graph displays the scores of 32 students on a recent exam. The scores ranged from 64 to 95 points.
[Graph: distribution of the 32 exam scores, with split stems 6, 6, 7, 7, 8, 8, 9, 9]
(a) Describe the shape of the distribution.
(b) In order to motivate her students, the instructor wants to report that the overall class performance on the exam was high. Which summary statistic, the mean or the median, should the instructor use to report that overall exam performance was high? Explain.
(c) The midrange is defined by (minimum + maximum) / 2. Compute this value using the exam data.
(d) Is the midrange a measure of center or a measure of spread? Explain.

AP Statistics @ Woodward Academy
Tuesday, November 21, 2006
Coley / P. Myers
Test #6 (Chapter 16)
Name _________________________________________ Period ______
Honor Pledge ___________________________________

Part I - Multiple Choice (Questions 1-10) - Circle the answer of your choice.

1. Family size can be represented by the random variable X. Determine the mean family size.

X      2     3     4     5
P(X)   .17   .47   .26   .10

(a) 2.94 (b) 3.00 (c) 3.29 (d) 3.49 (e) 3.86

2.
The heights of married men are approximately normally distributed with a mean of 70 and a standard deviation of 3, while the heights of married women are approximately normally distributed with a mean of 65 and a standard deviation of 2.5. If the heights of married men and married women are independent, determine the probability that a randomly selected married woman is taller than a randomly selected married man.
(a) 0.05 (b) 0.10 (c) 0.15 (d) 0.20 (e) Cannot be determined from the given information.

3. Which of the following is not true concerning a discrete probability distribution?
(f) The probability of any specific value is between 0 and 1, inclusive.
(g) The mean of the distribution is between the smallest and largest value in the distribution.
(h) The sum of all probabilities is 1.
(i) The standard deviation of the distribution is between –1 and 1.
(j) The distribution may be displayed using a probability histogram.

4. A high school golf team of five players is to be in an upcoming tournament. Each of the players on the team will play a round of golf and the team score is the sum of the five individual scores. The individual player scores are independent of each other and approximately normally distributed with the following means and standard deviations.

Golfer               1    2    3    4    5
Mean                 78   79   81   84   93
Standard Deviation   3    4    2    4    6

What are the mean and standard deviation of the team score?
(f) Mean = 83, standard deviation = 3.8
(g) Mean = 83, standard deviation = 9
(h) Mean = 415, standard deviation = 6
(i) Mean = 415, standard deviation = 9
(j) Mean = 415, standard deviation = 19

5. A married couple decides they wish to start a family and they really want to have a baby girl. Because of financial considerations, they decide they will have children until they have a girl or a total of 4 children. If the probability of having a boy or girl is equally likely, determine the expected number of boys.
(f) 0.75 (g) 0.875 (h) 0.9375 (i) 1 (j) 1.25

6.
A rock concert producer has scheduled an outdoor concert. If it is warm that day, she expects to make a $20,000 profit. If it is cool that day, she expects to make a $5,000 profit. If it is very cold that day, she expects to suffer a $12,000 loss. Based upon historical records, the weather office has estimated the chances of a warm day to be 0.60; the chances of a cool day to be 0.25. What is the producer's expected profit?
(a) $5,000 (b) $13,000 (c) $15,050 (d) $13,250 (e) $11,450

7. The scores on the Woodward Academy AP Stat Test #1 (T1) had a mean of 27 with a standard deviation of 3 and the scores on Test #2 (T2) had a mean of 29 with a standard deviation of 4. To reflect the true brilliance of the students taking the course, the total score had to be adjusted according to the following definition: Total = 2*T1 + 3*T2. What are the mean and standard deviation of Total?
(f) 141, 13.4 (g) 141, 18 (h) 141, 13.4 (i) 141, 13.4 (j) cannot be determined

Use the following information for questions 8-10. The independent random variables X and Y are defined by the following probability distribution tables.

X      1    3    6
P(X)   .6   .3   .1

Y      2    3    5    7
P(Y)   .1   .2   .3   .4

8. Determine the mean of X + Y.
(f) 7.2 (g) 8.4 (h) 5.1 (i) 9 (j) 4.3

9. Determine the standard deviation of 3Y + 5.
(f) .44 (g) 3.62 (h) 0 (i) 5.1 (j) 5.44

10. Determine the standard deviation of 4X - 5Y.
(a) 15.38 (b) –2.76 (c) 11.05 (d) 10.62 (e) cannot be determined from the given information

Part II – Free Response (Questions 11-13) – Show your work and explain your results clearly.

11. The depth from the surface of the earth to a refracting layer beneath the surface can be estimated using methods developed by seismologists. One method is based on the time required for vibrations to travel from a distant explosion to a receiving point. The depth measurement (M) is the sum of the true depth (D) and the random measurement error (E). That is, M = D + E.
The measurement error (E) is assumed to be normally distributed with mean 0 feet and standard deviation 1.5 feet.
(a) If the true depth at a certain point is 2 feet, what is the probability that the depth measurement will be negative?
(b) Suppose three independent depth measurements are taken at the point where the true depth is 2 feet. What is the probability that at least one of these measurements will be negative?
(c) What is the probability that the mean of the three independent depth measurements taken at the point where the true depth is 2 feet will be negative?

12. Two antibiotics are available as treatment for a common ear infection in children. Antibiotic A is known to effectively cure the infection 60 percent of the time. Treatment with antibiotic A costs $50. Antibiotic B is known to effectively cure the infection 90 percent of the time. Treatment with antibiotic B costs $80. The antibiotics work independently of one another. Both antibiotics can be safely administered to children. A health insurance company intends to recommend one of the following two plans of treatment for children with this ear infection.
Plan I: Treat with antibiotic A first. If it is not effective, then treat with antibiotic B.
Plan II: Treat with antibiotic B first. If it is not effective, then treat with antibiotic A.
(a) If a doctor treats a child with an ear infection using Plan I, what is the probability that the child will be cured? If a doctor treats a child with an ear infection using Plan II, what is the probability that the child will be cured?
(b) Compute the expected cost per child when Plan I is used for treatment. Compute the expected cost per child when Plan II is used for treatment.
(c) Based on the results in parts (a) and (b), which plan would you recommend? Explain your recommendation.

13. John believes that as he increases his walking speed, his pulse rate will increase. He wants to model this relationship.
John records his pulse rate, in beats per minute (bpm), while walking at each of seven different speeds, in miles per hour (mph). A scatterplot and regression output are shown below.
(a) Using the regression output, write the equation of the fitted regression line.
(b) Estimate John’s pulse rate if he walks at 2 mph.
(c) Note that S = 3.087. Interpret this value in the context of this study.
(d) Explain the meaning of R-Sq in the context of this study.

11. A department supervisor is considering purchasing one of two comparable photocopy machines, A or B. Machine A costs $10,000 and Machine B costs $10,500. This department replaces photocopy machines every three years. The repair contract for Machine A costs $50 per month and covers an unlimited number of repairs. The repair contract for Machine B costs $200 per repair. Based on past performance, the distribution of the number of repairs needed over any one-year period for Machine B is shown below.

Number of Repairs   0      1      2      3
Probability         0.50   0.25   0.15   0.10

You are asked to give an overall recommendation based on overall cost as to which machine, A or B, along with its repair contract, should be purchased. What would your recommendation be? Give a statistical justification to support your recommendation.

11. For an upcoming concert, each customer may purchase up to 3 child tickets and 3 adult tickets. Let C be the number of child tickets purchased by a single customer. The probability distribution of the number of child tickets purchased by a single customer is given in the table below.

C      0     1     2     3
P(C)   0.4   0.3   0.2   0.1

(a) Compute the mean and the standard deviation of C.
(b) Suppose the mean and the standard deviation of the number of adult tickets purchased by a single customer are 2 and 1.2, respectively. Assume that the number of child tickets and the number of adult tickets purchased are independent random variables.
Compute the mean and the standard deviation of the total number of adult and child tickets purchased by a single customer.
(c) Suppose each child ticket costs $15 and each adult ticket costs $25. Compute the mean and the standard deviation of the total amount spent per purchase.

AP Statistics 12/5/06
Coley / P. Myers
Test #7 (Chapter 17)
Name ____________________________________________________ Period ___________
Honor Pledge ______________________________________________

Part I - Multiple Choice (Questions 1-10) - Circle the answer of your choice.

1. Sixty-five percent of all divorce cases cite incompatibility as the underlying reason. If four couples file for a divorce, what is the probability that exactly two will state incompatibility as the reason?
(f) .104 (g) .207 (h) .254 (i) .311 (j) .423

2. Which of the following are true statements?
I. The histogram of a binomial distribution with p = .5 is always symmetric.
II. The histogram of a binomial distribution with p = .9 is skewed to the right.
III. The histogram of a geometric distribution with p = .3 is always skewed right.
(f) I and II
(g) I and III
(h) II and III
(i) I, II, and III
(j) None of the above gives the complete set of true responses.

3. Binomial and geometric probability situations share many conditions. Identify the choice that is not shared.
(k) The probability of success on each trial is the same.
(l) There are only two outcomes on each trial.
(m) The random variable is the number of successes in a given number of trials.
(n) The probability of a success equals 1 minus the probability of a failure.
(o) The mean depends on the probability of a success.

4. An inspection procedure at a manufacturing plant involves picking thirty items at random and then accepting the whole lot if at least twenty-five of the thirty items are in perfect condition. If in reality 85% of the whole lot is perfect, what is the probability that the lot will be accepted?
(k) .524 (l) .667 (m) .186 (n) .476 (o) .711

5. A recent study of the WA Upper School student body determined that 41% of the students were “chic”. If Mr. Floyd has developed a test for “chic-ness”, what is the average number of students we would need to test in order to find one who is “chic”?
(k) 2
(l) 2.43
(m) 3
(n) 3.57
(o) 1, because the study is clearly in error since all WA students are “chic”

6. A student is randomly generating 1-digit numbers on his TI-84. What is the probability that the first 4 that appears will be the 8th digit generated?
(k) .053 (l) .082 (m) .048 (n) .742 (o) .500

7. 3,600,000 dice are rolled. Determine the probability that between 599,000 and 610,000 4’s appear.
(k) 0.67 (l) 0.74 (m) 0.92 (n) 0.08 (o) ERR:DOMAIN

8. A probability experiment involves a series of identical, independent trials with two outcomes (success/failure) per trial and the probability of a success on each trial is 0.1. Determine the number of trials, n, in a binomial experiment such that the expected number of successes in that binomial experiment will be equal to the expected number of trials in a geometric experiment.
(b) 2 (c) 5 (d) 10 (e) 50 (f) 100

9. In which of the following games would you have the best chance of winning?
(f) Toss a coin 20 times. You win if you get more than 11 heads.
(g) Toss a coin 10 times. You win if you don’t get 4, 5, or 6 heads.
(h) Toss a coin 7 times. You win if you get at least 5 heads.
(i) Toss a coin 4 times. You win if you get at least 3 heads.
(j) Toss a coin 5 times. You win if you get exactly 3 heads.

10. The renowned soccer player, Levi Gupta, scores a goal on 30% of his attempts. The random variable X is defined as the number of goals scored on 50 attempts. The renowned gambler, Mohammed Smith, wins at Blackjack 25% of the time. The random variable Y is defined as the number of games needed to win his first game. Define the random variable Z as the total number of soccer goals scored and blackjack games played.
Determine the mean and standard deviation of the random variable Z.
(k) 11, 6.7
(l) 19, 6.7
(m) 11, 4.74
(n) 19, 4.74
(o) Cannot be determined with the given information.

Part II – Free Response (Questions 11-12) – Show your work and explain your results clearly.

11. Sophie, Ms. Coley’s favorite dog, loves to play catch. Unfortunately, she (Sophie, not Ms. Coley) is not particularly adept at catching, as her probability of catching the ball is 0.15.
(a) Ms. Coley is interested in determining how many tosses it will take for Sophie to catch the ball once.
  (i) Can this situation be described as binomial, geometric, or neither? Explain.
  (ii) What is the expected number of tosses it will take for Sophie to catch the ball once?
  (iii) What is the probability it will take exactly 10 tosses in order for Sophie to catch the ball?
(b) Mr. Wylder, avid baseball player and coach, decides to train Sophie. After three-a-day training sessions for 4 weeks, the probability that Sophie catches the ball has increased to 0.35. Mr. Wylder is interested in determining the number of times Sophie will catch the ball in 25 tosses.
  (i) Can this situation be described as binomial, geometric, or neither? Explain.
  (ii) What is the expected number of times that Sophie will catch the ball?
  (iii) What is the probability that Sophie will catch the ball 8 times in 25 tosses?
(c) Mr. Myers knows that Sophie is a reasonably smart dog and that her probability of catching the ball will actually improve by 0.01 after each toss. Mr. Myers would like to find out the number of tosses required for Sophie to catch the ball three times.
  (i) Can this situation be described as binomial, geometric, or neither? Explain.
  (ii) If Sophie’s initial probability of catching the ball is 0.15, what is the probability that it will take five tosses for Sophie to catch the ball three times?

12. When a tractor pulls a plow through an agricultural field, the energy to pull that plow is called the draft.
The draft is affected by environmental conditions such as soil type, terrain, and moisture. A study was done to determine whether a newly developed hitch would be able to reduce draft compared to the standard hitch. (A hitch is used to connect the plow to the tractor.) Two large plots of land were used in this study. It was randomly determined which plot was to be plowed using the standard hitch. As the tractor plowed that plot, a measurement device on the tractor automatically recorded the draft at 25 randomly selected points in the plot. After the plot was plowed, the hitch was changed from the standard one to the new one, a process that takes a substantial amount of time. Then the second plot was plowed using the new hitch. Twenty-five measurements of draft were also recorded at randomly selected points in this plot.
(a) What was the response variable in this study? Identify the treatments. What were the experimental units?
(b) Given that the goal of the study was to determine whether a newly developed hitch reduces draft compared to the standard hitch, was randomization properly used in this study? Justify your answer.
(c) Given that the goal of the study was to determine whether a newly developed hitch reduces draft compared to the standard hitch, was replication properly used in this study? Justify your answer.
(d) Plot of land is a confounding variable in this study. Explain why.