Pre-Test Unit 9: Descriptive Statistics KEY

You may use a calculator.

The following table shows how many text messages different students sent this week. Answer the following questions using the table.

20  200  340  0  75  55  90  120  60  150  170  220  240  90  85  40  35  100  65  30

1. Construct a histogram for the above data set using an appropriate scale for the y-axis and appropriate x-axis intervals for the frequency. (8 pts; 2 pts for y-axis scale and label, 2 pts for x-axis intervals and label, 4 pts for correct frequencies)
   [Histogram: x-axis "Text Messages Sent" with intervals 0-49, 50-99, 100-149, 150-199, 200-249, 250-299, 300-349; y-axis "Frequency" scaled from 1 to 7.]

2. Construct a box and whisker plot for the above data set in the blank space above. (8 pts; 2 pts for quartiles, 2 pts for end points, 2 pts for number line labels, 2 pts for box/whisker)
   [Box and whisker plot over a number line from 0 to 400, labeled every 100.]

3. What are the mean and median of the above data set? (4 pts; 2 pts each)
   Median = 87.5, Mean = 109.25

4. What are the range and interquartile range of the above data set? (4 pts; 2 pts each)
   Range = 340, Interquartile Range = 112.5

5. What is the population standard deviation of the above data set? (4 pts; no partial credit)
   84.473

6. What can you tell about a data set of test scores with a mean of 75%, median of 78%, range of 30%, and population standard deviation of 7%? (4 pts; partial credit at teacher discretion)
   Answers will vary: The data is tightly clumped in the middle. Even the end points of the data set are not far from the center.

7. What could be a mean and median of a data set of test scores where a third of the class failed, but the rest of the class scored above an 80%? Justify each choice in writing. (4 pts; partial credit at teacher discretion)
   Answers will vary: Mean = 70, since most of the class passed; Median = 85, since the middle score would be above 80.

Construct a scatter plot for the following data set using an appropriate scale for both the x- and y-axis.
(8 pts; 1 pt for each axis scale/interval, 1 pt for deciding to break each axis or not, 2 pts for correct independent/dependent axes, 2 pts for correctly plotted points)

8. This table shows the number of hours students slept the night before their math test and their scores.

Name      Hours Slept   Test Score
Anna           8            95
Bob            7            90
Carly          8            85
Damien         6            75
Esther         5            65
Franco         8            90
Georgia        8            80
Hank           9            95
Innya          7            80
Jacob          6            70

   [Scatter plot: x-axis "Hours of Sleep" from 1 to 9; y-axis "Test Score" from 10 to 100 in steps of 10.]

Use the following scatter plot to answer each question. The scatter plot shows the monthly income of each person in hundreds of dollars versus the percent of their income that they save each month. (4 pts; 2 pts for correct answer with no explanation)

   [Scatter plot: x-axis "Monthly Income in Thousands of Dollars"; y-axis "Percent of Income Saved"; points labeled Kory, Rachel, Nancy, Paul, Lexi, Tanya, Mike, Quinn, Stan, Oliver.]

9. What patterns or associations do you see present in this data? Why do you think so?
   Positive linear association since the data is going up at a generally steady pace.

10. Which person makes the most money per month? How much do they make?
   Paul makes $2800 per month

Draw an informal function of best fit for the given scatter plots. (3 pts; partial credit at teacher discretion)

11. This scatter plot shows the amount of copper in water in ppm versus plant growth in cm over three months.
   [Scatter plot "Plant Growth": x-axis "Copper in Water in ppm" from 0 to 8; y-axis "Plant Growth in cm" from 0 to 10.]

12. This scatter plot shows the hours a cubic foot of ice was exposed to sunlight versus the thickness of ice that melted in inches.
   [Scatter plot "Ice Melted": x-axis "Hours of Sunlight Exposure" from 0 to 10; y-axis "Thickness of Ice Melted in inches" from -0.5 to 2.]

Explain why the drawn line of best fit is accurate or why not. (3 pts; partial credit at teacher discretion)

13. This scatter plot shows the age in years versus the height in inches of a group of children.
14. This scatter plot shows the hours of TV watched per week versus the GPA on a 4.0 scale for a group of students.
   [Question 13 scatter plot: x-axis "Age"; y-axis "Height".]
   Answer to 13: Not accurate because there are too many points below the line at its beginning and too many above it at its end.
   [Question 14 scatter plot: x-axis "Hours Weekly TV"; y-axis "GPA".]
   Answer to 14: Accurate because there is a balance of how far away the data points are from the line.

The scatter plot shows what people think the temperature “feels like” (y-axis) as the humidity (x-axis) varies when the room is actually at 68° F. The equation of the line of best fit is $y = \frac{1}{10}x + 61$. (4 pts; 2 pts for equation answer, 2 pts for graph answer)

   [Scatter plot: x-axis "% Humidity" from 0 to 100; y-axis "Feels Like Temp" up to 70.]

15. Predict what a person would say the temperature “feels like” when the humidity is at 80% using both the equation and graph.
   Equation Work: $y = \frac{1}{10}(80) + 61 = 69$, so 69°
   Graph Prediction: 68° to 69°

Using the same scatter plot and equation of the line of best fit of $y = \frac{1}{10}x + 61$, answer the following questions. (2 pts; partial credit at teacher discretion)

16. What does the slope of this equation mean in terms of the given situation? In other words, explain what the rise and run mean for this problem.
   The “feels like” temperature will go up one degree for every 10% increase in humidity.

17. What does the y-intercept of this equation mean in terms of the given situation? In other words, explain what the y-intercept means when considering the humidity and “feels like” temperature.
   The y-intercept of 61 degrees means that with 0% humidity it will “feel like” 61 degrees instead of 68 degrees.

Answer the following questions. (4 pts; partial credit at teacher discretion)

18. A function of best fit has a correlation coefficient of 0.8557. What does that tell us?
   It has a strong correlation.

19. Plot the two sets of residuals on number lines. (4 pts; 2 pts each)

x                       1     2     3     4     5
Anne’s residuals      0.5   1.5  -0.5    -1    -2
Bob’s residuals         0   0.5  -0.5   0.5     1

20. Which person’s line of best fit works better and why do you think so?
(4 pts; 2 pts for answer, 2 pts for explanation)
   Bob’s because his residuals are closer to the zero line.

Answer the following questions about two-way tables. (4 pts; partial credit at teacher discretion)

21. Construct a two-way table from the following data about whether people are Democrats or Republicans and whether or not they support stricter gun control laws.

Democrat or Republican?       D R R R D D R D D D R D R R D R D D R R
Support Strict Gun Control?   Y N N N N Y N Y Y Y N Y N Y N N Y Y Y N

              Support Gun Control   Against Gun Control
Republican             2                     8
Democrat               8                     2

22. Do you think there is a relationship between party affiliation and gun control laws? Based on the data, why or why not? (no credit without explanation of why, partial credit at teacher discretion for explanation)
   Yes. 80% of Republicans are against gun control while 80% of Democrats support gun control.

Answer the following questions using the given two-way table. (4 pts; no partial credit)

                                 Students   Teachers
Support School Uniforms             278        82
Do Not Support School Uniforms     1726        23

23. How many students were surveyed?
   2004

24. As a percent to the nearest hundredth (two decimal places) what is the relative frequency of students who support school uniforms?
   278/2004 ≈ 13.87%

Unit 9 Homework Answer Key

Lesson 9.1

For each of the data sets below, create a histogram, dot plot, and box plot.

1. The following data lists the number of hours per week spent playing video games for each person.

Name: Abraham, Betty, Carrie, Demarcus, Ely, Francis, Gretchen, Heather, Ingrid, Jackson, Kamir, Lamar, Marcus, Noel, Oji, Pat, Queen, Reed, Sage, Tiko, Ugo, Victor, Wim, Xavia, Yuma, Zachary
Hours per week: 8, 3, 0, 0, 5, 6, 2, 8, 3, 2, 7, 5, 3, 1, 9, 7, 0, 3, 1, 2, 9, 4, 3, 5, 6, 1

   [Histogram: x-axis "Hours per week" from 0 to 9; y-axis "Frequency" from 0 to 9. Dot plot and box plot over number lines from 0 to 10.]

2. The following data lists the number of wins for pitchers in MLB in the 2013 season.
Name and wins: Sanchez 14, Colon 18, Iwakuma 14, Darvish 13, Scherzer 21, Hernandez 12, Sale 11, Shields 13, Santana 9, Jimenez 13, Kuroda 11, Price 10, Wilson 17, Holland 10, Masterson 14, Verlander 13, Quintana 9, Lackey 10, Fister 13, Tillman 16, Pettitte 11, Lester 15, Gonzalez 11, Griffin 14, Parker 12, Guthrie 15, Buehrle 12, Correia 9, Norris 10, Dickey 14, Porcello 13, Doubront 11, Dempster 8, Williams 9, Sabathia 14, Hellickson 12, Saunders 11

   [Histogram: x-axis "Number of Wins" with intervals 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23; y-axis "Frequency" from 0 to 18 in steps of 2. Dot plot and box plot over number lines from 0 to 30, labeled every 3.]

3. Using the plots, can you tell about where the average might be? What about the middle?
   The average might be in the third quartile and the middle is the middle of the box and whisker plot.

4. Using the plots, is this data very spread out or closely packed?
   This data is closely packed except in the fourth quartile.

5. The following data lists the number of full years in office for U. S. presidents.

Name: Washington, J. Adams, Jefferson, Madison, Monroe, J. Q. Adams, Jackson, Van Buren, Harrison, Tyler, Polk, Taylor, Fillmore, Pierce, Buchanan, Lincoln, Johnson, Grant, Hayes, Garfield, Arthur, Cleveland, Harrison, Cleveland, McKinley, Roosevelt, Taft, Wilson, Harding, Coolidge, Hoover, Roosevelt, Truman, Eisenhower, Kennedy, Johnson, Nixon, Ford, Carter, Reagan, G. H. W. Bush, Clinton, G. Bush
Years in office: 6, 4, 8, 8, 8, 4, 8, 8, 0, 3, 4, 1, 2, 4, 4, 4, 3, 8, 4, 0, 3, 4, 4, 4, 4, 7, 4, 8, 2, 5, 4, 12, 7, 8, 2, 5, 5, 2, 4, 8, 4, 8, 8

   [Histogram: x-axis "Years in office" with intervals 0-1, 2-3, 4-5, 6-7, 8-9, 10-11, 12-13; y-axis "Frequency" from 0 to 18 in steps of 2. Dot plot and box plot over number lines from 0 to 20, labeled every 2.]

6. Using the plots, can you tell about where the average might be? What about the middle?
   The average will probably be in the third quartile between 4 and 8 and the middle is 4.

7. Using the plots, is this data very spread out or closely packed?
   This data has clumps around 4 and 8 because of four year terms.

8. The following data lists the amount of allowance each person receives.
Name: Robert, Mary, John, Dorothy, James, Helen, William, Betty, Charles, Margaret, George, Ruth, Joseph, Virginia, Richard, Doris, Edward, Mildred, Donald, Frances, Thomas, Elizabeth, Frank, Evelyn, Harold, Anna
Allowance ($): 5, 15, 8, 10, 12, 15, 7, 5, 10, 8, 7, 12, 15, 14, 13, 10, 8, 15, 20, 18, 12, 7, 15, 10, 6, 12

   [Histogram: x-axis "Allowance" with intervals 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20; y-axis "Frequency" from 0 to 9. Dot plot and box plot over number lines from 0 to 20, labeled every 2.]

Lesson 9.2

The following table shows the test scores for various classes. Answer the following questions using the table.

Period 1   Period 2   Period 3   Period 4
   95         80         90         45
   73         90         95         87
  100         70         45        100
   90         75         89         30
   95         68         90         92
   71         67         85         95
   80         83         41         80
   83         60         88         20
   81         88         97         84
   75         88         84         45
   73         85         86         35
   72         72         90         90
   88         79         30         85
   80         82         89         25
   98         65         20         40
   88         65         15         81
   85         78         96         92
   79         70         85         10
   93         75         95         88
   80         75         81          5

1. What are the mean, median, range, interquartile range, and population standard deviation for period 1?
   Mean: 83.95, Median: 82, Range: 29, IQR: 14.5, SD: ≈ 8.84

2. What are the mean, median, range, interquartile range, and population standard deviation for period 2?
   Mean: 75.75, Median: 75, Range: 30, IQR: 13.5, SD: ≈ 8.40

3. What are the mean, median, range, interquartile range, and population standard deviation for period 3?
   Mean: 74.55, Median: 87, Range: 82, IQR: 27, SD: ≈ 26.55

4. What are the mean, median, range, interquartile range, and population standard deviation for period 4?
   Mean: 61.45, Median: 80.5, Range: 95, IQR: 56.5, SD: ≈ 31.63

5. Compare and contrast Period 1 and Period 2 using the measures of center and spread you calculated.
   Answers will vary: They have similar spreads, but period 1 seems to have done better on the test based on the center.

6. Compare and contrast Period 1 and Period 3 using the measures of center and spread you calculated.
   Answers will vary: They have similar medians, but period 3 has a much wider spread meaning scores dipped lower. That caused the mean to be lower in period 3.

7.
Compare and contrast Period 3 and Period 4 using the measures of center and spread.
   Answers will vary: Even though the medians are somewhat close, the means are farther apart, which means the spread for period 4 is larger, as seen in the IQR and SD.

8. What sort of centers and spread would you expect from a class that all scored relatively near 75%?
   Centers close to 75% and spread relatively low.

9. What sort of centers and spread would you expect from a class whose test scores were evenly spread out from 0% to 100%?
   Centers close to 50%, IQR near 50, and SD near 25.

The following table shows the batting averages for baseball players from four different teams. Answer the following questions using the table.

Team 1   Team 2   Team 3   Team 4
0.300    0.320    0.280    0.350
0.290    0.310    0.275    0.280
0.280    0.150    0.270    0.270
0.270    0.150    0.270    0.260
0.260    0.150    0.265    0.250
0.250    0.150    0.265    0.240
0.240    0.150    0.220    0.230
0.230    0.120    0.210    0.220
0.220    0.110    0.200    0.100
0.210    0.100    0.180    0.050

10. What are the mean, median, range, interquartile range, and population standard deviation for Team 1?
   Mean: 0.255, Median: 0.255, Range: 0.09, IQR: 0.05, SD: ≈ 0.029

11. What are the mean, median, range, interquartile range, and population standard deviation for Team 2?
   Mean: 0.171, Median: 0.15, Range: 0.22, IQR: 0.03, SD: ≈ 0.074

12. What are the mean, median, range, interquartile range, and population standard deviation for Team 3?
   Mean: 0.2435, Median: 0.265, Range: 0.1, IQR: 0.06, SD: ≈ 0.035

13. What are the mean, median, range, interquartile range, and population standard deviation for Team 4?
   Mean: 0.225, Median: 0.245, Range: 0.3, IQR: 0.05, SD: ≈ 0.083

14. Compare and contrast Team 1 and Team 2 using the measures of center and spread you calculated.
   Answers will vary: Team 2 has much lower centers and is spread out farther apart.

15. Compare and contrast Team 1 and Team 3 using the measures of center and spread you calculated.
   Answers will vary: The two teams are very similar, but Team 3 may have a few lower batters pulling down its mean.

16. Compare and contrast Team 2 and Team 4 using the measures of center and spread.
   Answers will vary: Team 4 has higher centers but a wider spread.

17. What sort of centers and spread would you expect from a team that all batted relatively near 0.250?
   Centers close to 0.250 and spread relatively low.

18. What sort of centers and spread would you expect from a team whose batting averages were evenly spread out from 0.100 to 0.300?
   Centers close to 0.200, IQR near 0.100 and SD near 0.050.

Answer the following questions.

19. What happened in a class if test score percents had a mean of 75% and a median of 90%? What sort of population standard deviation and interquartile range would you expect?
   Answers will vary: Most of the class scored above 90%, but several probably failed, dragging down the mean. This would produce a larger SD and IQR.

20. Let’s say two classes had a mean test score of 70% and a median test score of 70%, but their population standard deviations were 5% and 20% respectively. What could you conclude about the differences between the two classes?
   Answers will vary: The second class had a much wider spread of data, but evenly spaced to produce the same centers.

21. Describe a data set where the mean and median are far apart.
   Answers will vary: Test scores where two-thirds of the class scored an A but the other third failed. The median would be an A, but the mean would be a C or lower.

22. Describe a data set where the interquartile range and population standard deviation are far apart.
   Answers will vary: Test scores where the middle half of the class scored a D, a fourth scored an A, and the other fourth scored a low F. The IQR would be low because of all the D’s while the SD would be larger because the actual range is larger.

Lesson 9.3

Use the given data to answer the questions and construct the scatter plots.
Pathfinder Character Level vs. Total Experience Points

Level      2      3      6      9     10     11     14     15     17     20
XP        15     35    150    500    710   1050   2950   4250   8500  24000

1. Which variable should be the independent variable (x-axis) and which should be the dependent variable (y-axis)?
   Level should be x, XP should be y
2. Should you use a broken axis? Why or why not?
   No broken axis, uses all space in range
3. What scale and interval should you use for the x-axis?
   0 to 20 by ones
4. What scale and interval should you use for the y-axis?
   0 to 24,000 by 1,200
5. Construct the scatter plot.
   [Scatter plot: x-axis "Level" from 0 to 20; y-axis "XP" from 0 to 24,000.]

Age vs. Weekly Allowance

Age        12  12  13  13  14  14  15  15  16  16
Allowance   0   5   5   8  10  15  20  20  25  30

6. Which variable should be the independent variable (x-axis) and which should be the dependent variable (y-axis)?
   Age should be x, Allowance should be y
7. Should you use a broken axis? Why or why not?
   Broken axis for x since 0 to 11 not used
8. What scale and interval should you use for the x-axis?
   12 to 16 by 0.25
9. What scale and interval should you use for the y-axis?
   0 to 30 by 1.5 or 0 to 40 by twos
10. Construct the scatter plot.
   [Scatter plot: x-axis "Age" from 11 to 16; y-axis "Allowance" from 0 to 40 in steps of 2.]

Age vs. Number of Baby Teeth

Age           5   6   7   7   8   9  10  11  11  12
Baby Teeth   20  19  17  15  10  10   8   4   2   2

11. Which variable should be the independent variable (x-axis) and which should be the dependent variable (y-axis)?
   Age should be x, Baby Teeth should be y
12. Should you use a broken axis? Why or why not?
   No broken axis, range greater than gap beforehand
13. What scale and interval should you use for the x-axis?
   0 to 20 by ones
14. What scale and interval should you use for the y-axis?
   0 to 20 by ones
15. Construct the scatter plot.
   [Scatter plot: x-axis "Age" from 0 to 20; y-axis "Baby Teeth" from 0 to 20.]

Car Speed (in mph) vs. Gas Mileage (in mpg)

Speed     20   25   35   40   45   55   65   80   90  100
Mileage   25   27   28   30   31   32   30   29   25   22

16. Which variable should be the independent variable (x-axis) and which should be the dependent variable (y-axis)?
   Speed should be x, Mileage should be y
17. Should you use a broken axis? Why or why not?
   Broken axis for y since 0 to 22 not used
18. What scale and interval should you use for the x-axis?
   0 to 100 by fives
19. What scale and interval should you use for the y-axis?
   22 to 32 by ones (or by halves)
20. Construct the scatter plot.
   [Scatter plot: x-axis "Speed" from 0 to 100; y-axis "Mileage" from 21 to 32.]

Lesson 9.4

Use the given scatter plots to answer the questions.

   [Scatter plot "Daily Study Time" for questions 1-4: x-axis "Age" from 0 to 20; y-axis "Daily Study Time (minutes)" from 0 to 80.]

1. Does this scatter plot show a positive association, negative association, or no association? Explain why.
   Positive, going up from left to right
2. Is there an outlier in this data set? If so, approximately how old is the outlier and about how many minutes does he or she study per day?
   12 years old and 75 minutes
3. Is this association linear or non-linear? Explain why.
   Linear, increases by about the same amount each year
4. What can you say about the relationship between your age and the amount that you study?
   The older you are, the more you study

   [Scatter plot "Daily Family Time" for questions 5-8: x-axis "Age" from 0 to 20; y-axis "Daily Family Time" from 0 to 350.]

5. Does this scatter plot show a positive association, negative association, or no association? Explain why.
   Negative, going down from left to right
6. Is there an outlier in this data set? If so, approximately how old is the outlier and about how many minutes does he or she spend with family per day?
   No outlier in this data set
7. Is this association linear or non-linear? Explain why.
   Non-linear, it curves down
8. What can you say about the relationship between your age and the amount of time that you spend with family?
   As you get older, you spend much less time with family each day

   [Scatter plot "Math Grade" for questions 9-12: x-axis "Daily TV Time (hours)" from 0 to 6; y-axis "Math Grade" from 0% to 100%.]

9. Does this scatter plot show a positive association, negative association, or no association? Explain why.
   Negative, going down from left to right
10. Is there an outlier in this data set? If so, approximately how much does that person watch TV daily and what is his or her approximate math grade?
   About 5.5 hours of TV and 95% math grade
11. Is this association linear or non-linear? Explain why.
   Linear, grade goes down by the same amount for each hour of TV
12. What can you say about the relationship between the amount of time you watch TV and your math grade?
   Watching more TV correlates with lower math grades

   [Scatter plot "Math Grade" for questions 13-17: x-axis "Daily Family Time (minutes)" from 0 to 400; y-axis "Math Grade" from 0% to 100%.]

13. Does this scatter plot show a positive association, negative association, or no association? Explain why.
   Positive, math grade goes up from left to right
14. Is there an outlier(s) in this data set? If so, approximately how much time does that person(s) spend with his or her family daily and what is his or her approximate math grade?
   40 minutes with 92% and 100 minutes with 96%
15. Is this association linear or non-linear? Explain why.
   Questionable, could go either way
16. What can you say about the relationship between the amount of time that you spend with your family and your math grade?
   More time with family correlates with higher math grades
17. Are there any other patterns that you notice in this data?
   Clumping around 280 minutes and also around 140 minutes

   [Scatter plot "Number of Pets" for questions 18-22: y-axis "Number of Pets" from 0 to 14.]

18. Does this scatter plot show a positive association, negative association, or no association? Explain why.
   Negative, going down from left to right
19. Is there an outlier(s) in this data set? If so, approximately how many pets does that person(s) have?
   No outlier
20.
Is this association linear or non-linear? Explain why.
   Linear, going down the same amount each time
   [Scatter plot x-axis for questions 18-22: "First Letter of Last Name (A = 1 and Z = 26)" from 0 to 30.]
21. What can you say about the relationship between your last name and the number of pets you have?
   Earlier in the alphabet has more pets
22. Are there other patterns that you notice about people’s last names and how many pets they have?
   Clumping, early alphabet between 8 and 13 pets, middle alphabet between 4 and 6, later alphabet between 0 and 2 pets

   [Scatter plot "Last Name" for questions 23-26: x-axis "Age" from 0 to 20; y-axis "First Letter of Last Name (A = 1 and Z = 26)" from 0 to 30.]

23. Does this scatter plot show a positive association, negative association, or no association? Explain why.
   No association, no clear pattern
24. Is there an outlier(s) in this data set? If so, approximately how old is that person?
   No outlier
25. Is this association linear or non-linear? Explain why.
   Neither since there is no association
26. What can you say about the relationship between your last name and your age?
   There is no relationship

   [Scatter plot "Weekly Allowance ($)" for questions 27-31: x-axis "Height (inches)" from 0 to 80; y-axis "Weekly Allowance ($)" from 0 to 30.]

27. Does this scatter plot show a positive association, negative association, or no association? Explain why.
   Positive, going up from left to right
28. Is there an outlier(s) in this data set? If so, approximately how tall is that person and how much does he or she make in allowance each week?
   72 inches with $0 allowance
29. Is this association linear or non-linear? Explain why.
   Non-linear, it curves up
30. What can you say about the relationship between your height and your allowance?
   As height increases, allowance increases
31. Do you think that being taller means that you will get more allowance? In other words, do you think this relationship is a causation or a correlation?
   This is a correlation, not a causation, because being tall doesn’t cause more allowance
32.
Does this scatter plot show a positive association, negative association, or no association? Explain why.
   Positive, going up from left to right

   [Scatter plot "Weekly Allowance ($)" for questions 32-36: x-axis "Age" from 0 to 20; y-axis "Weekly Allowance ($)" from 0 to 30.]

33. Is there an outlier(s) in this data set? If so, approximately how old is that person and how much does he or she make in allowance each week?
   16 years old with $0 allowance
34. Is this association linear or non-linear? Explain why.
   Non-linear, it curves up
35. What can you say about the relationship between your age and your allowance?
   As age increases, allowance increases
36. Do you think that being older means that you will get more allowance? In other words, do you think this relationship is a causation or a correlation?
   This is probably a causation since being older means you generally spend more money and therefore need more allowance

Lesson 9.5

Draw an informal function of best fit on the given scatter plot and explain why you chose that type of function. A real function of best fit is the thick line in red.

1. - 8. [Eight scatter plots, with these titles and axes:
   "Daily Study Time": x-axis "Age"; y-axis "Daily Study Time (minutes)"
   "Math Grade": x-axis "Daily TV Time (hours)"; y-axis "Math Grade"
   "Total Worth in Millions": x-axis "Monthly Payment at 10% Return"; y-axis "Millions of Dollars after 50 Years"
   "How Well Can We See Stars?": x-axis "Distance from Earth in AUs"; y-axis "Visibility %"
   "$20,000 at 10% per Year": x-axis "Years Invested"; y-axis "Total Worth in Millions of $"
   "Age vs. Sleep": x-axis "Age (years)"; y-axis "Daily Sleep (hours)"
   "Age vs. Weight": x-axis "Age (years)"; y-axis "Weight (pounds)"
   "Ultrasonic Response of Metal Detectors": x-axis "Distance from Metal (in meters)"; y-axis "Ultrasonic Response (as %)"]

Determine whether the drawn function of best fit is accurate or not. Explain why you think your position is true. A real function of best fit is the thick line in red.

9. - 16. [Eight scatter plots, each with a drawn function of best fit.]

Use the given graph of the line of best fit or equation of the line of best fit to answer the following questions. The equation of the line of best fit is: [equation missing]

   [Scatter plot "Age vs. Weight": x-axis "Age (years)" from 0 to 20; y-axis "Weight (pounds)" from 0 to 200.]

17. Using the graph only, about how much would you expect an 18 year old to weigh?
   185 - 190 lbs
18. Using the equation only, about how much would you expect a 4 year old to weigh?
   185.5 lbs
19. Using the graph only, if a person weighed 80 pounds, how old would you expect them to be?
   8 years old
20. Using the equation only, if a person weighed 80 pounds, how old would you expect them to be?
   8 years old
21. What is the rate of change (slope) of the line of best fit? What does the slope represent in this context and does that make sense?
   It represents how many lbs are gained per year.
22. What is the initial value (y-intercept) of the line of best fit? What does it represent in this context and does that make sense?
   It represents weight at birth; it doesn’t make sense to have a negative weight.

Use the given graph of the function of best fit or equation of the function of best fit to answer the following questions. The equation of the function of best fit is: $h(t) = -4.9t^2 + 45t$.

   [Scatter plot "Baseball Height w/ Upward Velocity of 45 Meters/Sec".]

23. Using the graph only, about how high would you expect the baseball to be after 5 seconds?
   104 meters
24.
Using the equation only, about how high would you expect the baseball to be after 5 seconds?
   102.5 meters

   [Scatter plot axes: x-axis "Time in Seconds" from 0 to 10; y-axis "Height in Meters" from 0 to 120.]

25. Using the graph only, how long had the ball been in the air if it were 100 meters high?
   ≈ 3.6 and 5.4 seconds
26. Using the equation only, how long had the ball been in the air if it were 100 meters high?
   ≈ 3.8 and 5.4 seconds
27. What does the c value in the quadratic equation represent in this situation?
   It represents the height at which the ball was at time zero or the height at which the person who threw the ball was standing, which is 0 meters above ground.
28. What does the b value in the quadratic equation represent in this situation?
   It represents the speed at which the ball was thrown into the air, which is 45 meters per second.
29. What does the a value in the quadratic equation represent in this situation?
   It represents the force of gravity pulling the ball back down to the ground.

The following data about weekly allowances at various ages was used to create the given scatterplot. Four students estimated the line of best fit for this data. Plot the given residuals calculated from each student’s line of best fit and determine which student had the best line of best fit.

Age         10  10  11  11  12  12  13  13  14  14  15  15
Allowance    2   5   0   5  10   5  10  15  15  20  20  25

   [Scatter plot "Allowance by Age": x-axis "Age" from 0 to 20; y-axis "Allowance" from 0 to 30.]

Abby’s LOBF: $y = 4.1x - 41$
Bennett’s LOBF: $y = 4.5x - 41$
Courtney’s LOBF: $y = 4x - 43$
Drew’s LOBF: $y = 5x - 45$

30. Abby’s Residuals from LOBF $y = 4.1x - 41$

Age (x)                10    10    11    11    12    12    13    13    14    14    15    15
Abby’s Residuals       -2    -5   4.1  -0.9  -1.8   3.2   2.3  -2.7   1.4  -3.6   0.5  -4.5

   Correlation Coefficient: ≈ 0.860
   [Residual plot "Abby’s Residuals": x-axis ages 10 to 15; y-axis from -10 to 10.]

31.
Bennett’s Residuals from LOBF $y = 4.5x - 41$

Age (x)                10    10    11    11    12    12    13    13    14    14    15    15
Bennett’s Residuals     2    -1   8.5   3.5     3     8   7.5   2.5     7     2   6.5   1.5

   Correlation Coefficient: ≈ 0.742
   [Residual plot "Bennett’s Residuals": x-axis ages 10 to 15; y-axis from -10 to 10.]

32. Courtney’s Residuals from LOBF $y = 4x - 43$

Age (x)                10    10    11    11    12    12    13    13    14    14    15    15
Courtney’s Residuals   -5    -8     1    -4    -5     0    -1    -6    -2    -7    -3    -8

   Correlation Coefficient: ≈ 0.758
   [Residual plot "Courtney’s Residuals": x-axis ages 10 to 15; y-axis from -10 to 10.]

33. Drew’s Residuals from LOBF $y = 5x - 45$

Age (x)                10    10    11    11    12    12    13    13    14    14    15    15
Drew’s Residuals        3     0    10     5     5    10    10     5    10     5    10     5

   Correlation Coefficient: ≈ 0.643
   [Residual plot "Drew’s Residuals": x-axis ages 10 to 15; y-axis from -10 to 10.]

34. Which person’s line of best fit do you think is the best based off your residual plots and their corresponding correlation coefficients? Explain why you think so.
   Abby’s LOBF is the best because her residuals are closely centered around zero and her line also has the highest correlation coefficient.
35. Using technology (Excel), calculate the correlation coefficient (r) of the line of best fit for the original data set. What does that mean?
   r ≈ 0.8557, which means that a linear function is a good choice for a function of best fit.
36. If a function of best fit had a correlation coefficient of r ≈ 0.02, what would that mean?
   The choice of function hardly fits the data at all. Either this is a terrible choice for a function of best fit or there is little correlation between the variables.
37. If a function of best fit had a correlation coefficient of r ≈ 0.92, what would that mean?
   The choice of function fits the data extremely well. There appears to be a strong correlation between the variables in the data set.
38. If a function of best fit had a correlation coefficient of r ≈ 0.41, what would that mean?
   The choice of function fits the data moderately.
   There may be better choices for a function of best fit or there is not as strong of a correlation between the variables.

Lesson 9.6

Use the data set to answer the following questions. For this data set a class of middle school students was asked what they thought was most important in school: good grades or popularity.

Boy or Girl            B B G G G B G B B G G B G B G B B G G B
Grades or Popularity   P G G P G P G G P G G P G P P P G G G P

Boy or Girl            B B G G G B G B B G G B G B G B B G G B
Grades or Popularity   P G P G G P G P P G G G G P P P G P G G

1. Construct a two-way table of the data.

             Boys   Girls
Grades         7     15
Popularity    13      5

2. What is the frequency of students who believe grades are more important?
   22
3. What is the relative frequency of students who believe grades are more important?
   22/40 = 55%
4. What is the frequency of students who believe popularity is more important?
   18
5. What is the relative frequency of students who believe popularity is more important?
   18/40 = 45%
6. What is the frequency of girls who believe grades are more important?
   15
7. What is the relative frequency of girls who believe grades are more important?
   15/20 = 75%
8. What is the frequency of boys who believe popularity is more important?
   13
9. What is the relative frequency of boys who believe popularity is more important?
   13/20 = 65%
10. Based on this data, do you feel there is a relationship between a student’s gender and what they think is most important in school? What is that relationship and what evidence do you have that it exists?
   Based on the relative frequencies, girls typically believe that grades are more important, while boys believe popularity is more important.

Use the data set to answer the following questions. For this data set a class of middle school students was asked what hand was their dominant hand.
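Two-way tables like the one in question 1 of this lesson, and the one requested for the handedness data, can be cross-checked by tallying the paired responses mechanically. The following is only a minimal sketch in Python, assuming the two response columns line up position by position; the helper names `two_way_table` and `percent` are made up for this illustration and are not part of the worksheet. It uses the grades-or-popularity responses as transcribed above.

```python
from collections import Counter


def two_way_table(rows, cols):
    """Tally how often each (row, col) pair occurs across two aligned columns."""
    if len(rows) != len(cols):
        raise ValueError("both columns need the same number of responses")
    return Counter(zip(rows, cols))


def percent(count, total):
    """Relative frequency expressed as a percent."""
    return 100 * count / total


# Responses transcribed from the grades-or-popularity data set
# (two blocks of 20 students; B = boy, G = girl).
gender = ("B B G G G B G B B G G B G B G B B G G B "
          "B B G G G B G B B G G B G B G B B G G B").split()

# The answers are spelled out so that "G" for grades cannot be
# confused with "G" for girl.
answer = ["grades" if a == "G" else "popularity" for a in
          ("P G G P G P G G P G G P G P P P G G G P "
           "P G P G G P G P P G G G G P P P G P G G").split()]

table = two_way_table(gender, answer)
boys = gender.count("B")
girls = gender.count("G")

print("Boys :", table[("B", "grades")], table[("B", "popularity")])
print("Girls:", table[("G", "grades")], table[("G", "popularity")])
print("Girls choosing grades: %.0f%%" % percent(table[("G", "grades")], girls))
print("Boys choosing popularity: %.0f%%" % percent(table[("B", "popularity")], boys))
```

The same function can be reused for any pair of aligned response columns by swapping in the other data set; dividing by the column total (boys or girls) rather than the overall total is what makes the relative frequencies in questions 7 and 9 comparable between groups.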
Boy or Girl      B B G G G B G B B G G B G B G B B G G B
Right or Left    L R R L R L R R R R L R R R R R L R L R

Boy or Girl      B B G G G B G B B G G B G B G B B G G B
Right or Left    R R L R R R L R L R R R L R R L R R L L

11. Construct a two-way table of the data.

                 Boys   Girls
Right-handed       14      13
Left-handed         6       7

12. What is the frequency of students who are right-handed?
27

13. What is the relative frequency of students who are right-handed?
27/40 = 67.5%

14. What is the frequency of students who are left-handed?
13

15. What is the relative frequency of students who are left-handed?
13/40 = 32.5%

16. What is the frequency of girls who are right-handed?
13

17. What is the relative frequency of girls who are right-handed?
13/20 = 65%

18. What is the frequency of boys who are right-handed?
14

19. What is the relative frequency of boys who are right-handed?
14/20 = 70%

20. Based on this data, do you feel there is a relationship between a student’s gender and whether or not they are right-handed? What is that relationship and what evidence do you have that it exists?
Based on the relative frequencies, boys and girls appear to have about the same chance of being right-handed (70% of boys and 65% of girls), and being right-handed is much more likely than being left-handed.

Use the two-way tables representing surveys middle school students took to answer the following questions.

Survey 1:
                      Boys   Girls
Prefer Spicy Salsa     255      68
Prefer Mild Salsa       45     132

Survey 2:
                      Right-handed   Left-handed
Prefer Spicy Salsa         280            43
Prefer Mild Salsa          170             7

21. How many students were surveyed?
500

22. What is the relative frequency of students who prefer spicy salsa? Is it the same on both two-way tables?
323/500 = 64.6%. Yes, both tables give the same result, since 255 + 68 = 280 + 43 = 323.

23. How many boys were surveyed?
300

24. How many girls were surveyed?
200

25. What is the relative frequency of boys who prefer spicy salsa?
255/300 = 85%

26. What is the relative frequency of girls who prefer spicy salsa?
68/200 = 34%

27. Do you think there is a relationship between gender and salsa preference? What is that relationship and what evidence do you have that it exists?
Based on the relative frequencies, it appears that boys prefer spicy salsa more often than girls do (85% of boys versus 34% of girls).

28. How many right-handed students were surveyed?
450

29. How many left-handed students were surveyed?
50

30. What is the relative frequency of right-handed students who prefer mild salsa?
170/450 ≈ 37.8%

31. What is the relative frequency of left-handed students who prefer mild salsa?
7/50 = 14%

32. Do you think there is a relationship between a student’s dominant hand and salsa preference? What is that relationship and what evidence do you have that it exists?
Based on the relative frequencies, it appears that right-handed students are between two and three times as likely as left-handed students to prefer mild salsa.

Review Unit 9: Descriptive Statistics KEY

You may use a calculator.

The following table shows the fall MAP scores for students. Answer the following questions using the table.

210 225 208 245 232 219 253 228 218
230 234 241 240 221 235 218 227 261

1. Construct a histogram for the above data set using appropriate scale for the y-axis and appropriate x-axis intervals for the frequency.

[Histogram: Frequency (y-axis, 1 to 9) versus MAP scores (x-axis) in the intervals 208-213, 214-219, 220-225, 226-231, 232-237, 238-243, 244-249, 250-255, 256-261]

2. Construct a dot plot for the above data set in the space above.

3. Construct a box and whisker plot for the above data set in the space above.

[Number line for the dot plot and box and whisker plot: MAP scores from 208 to 268 in steps of 6]

4. What are the mean and median of the above data set?
Mean: 230.28; Median: 229

5. What are the range and interquartile range of the above data set?
Range: 53; Interquartile range: 21

6. What is the population standard deviation of the above data set?
13.73

7. What can you tell about a data set of test scores with a mean of 228, median of 230, range of 35, and population standard deviation of 4?
The data is packed closely around 228, and the end points are not far off either. The distribution is roughly symmetric because the mean and median are close together.

8. What could be a mean and median of a data set of test scores where a third of the class scored above 250, but the rest of the class scored around 225? Justify each choice in writing.
The median would be around 225 since more than half of the data was around 225. The mean would be above 225 but probably below 250 since a third of the data is above 250.

The following table shows the free throw percentages from two different teams. Answer the following questions using the table.

Team 1 Team 2
.714 .819 .807 .671 .721 .672 .817 .676 .694 .750 .733 .730 .750 .623 .619 .636 .710 .875 .794 .615 .500 .500 .900 .710 .500 .815 .650 .450 .790 .735 .500 .621

9. What are the mean, median, range, interquartile range, and population standard deviation for Team 1?
Mean: 0.71, Median: 0.732, Range: 0.317, Interquartile Range: 0.119, Standard Deviation: 0.096

10. What are the mean, median, range, interquartile range, and population standard deviation for Team 2?
Mean: 0.671, Median: 0.661, Range: 0.45, Interquartile Range: 0.116, Standard Deviation: 0.123

11. Compare and contrast Team 1 and Team 2 using the measures of center and spread you calculated.
The center for Team 2 is slightly lower. The spreads are very similar.

12. What sort of centers and spread would you expect from a team that all shot relatively near 0.750?
Centers will be around 0.75 and the spread will be low.

13. What sort of centers and spread would you expect from a team whose free throw percentages were evenly spread out from 0.600 to 0.800?
Centers near 0.700, interquartile range near 0.1, and standard deviation near 0.07.
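The measures of center and spread asked for in questions 4 through 13 can be checked with a short script. The sketch below uses Python, which is not part of the worksheet (it only assumes a calculator), and `quartiles` and `summarize` are hypothetical helper names. It also assumes the quartile convention of taking the median of each half of the sorted data; other conventions can give slightly different interquartile ranges. The input is the fall MAP score table from the start of this review.

```python
from statistics import mean, median, pstdev


def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: with an odd number of values, the middle value is left
    out of both halves (a common classroom convention).
    """
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]
    upper = values[-half:]
    return median(lower), median(upper)


def summarize(data):
    """Collect the five measures used in this review into one dictionary."""
    q1, q3 = quartiles(data)
    return {
        "mean": mean(data),
        "median": median(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "population_sd": pstdev(data),  # divides by n, not n - 1
    }


# Fall MAP scores from the review table (18 students).
map_scores = [210, 225, 208, 245, 232, 219, 253, 228, 218,
              230, 234, 241, 240, 221, 235, 218, 227, 261]

summary = summarize(map_scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimal places, the printed values should agree with the key's answers to questions 4 through 6 (mean 230.28, median 229, range 53, interquartile range 21, population standard deviation 13.73).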
Construct a scatter plot for the following data set using appropriate scale for both the x- and y-axis.

14. This table shows the ages of students and their scores on the MAP test.

Name       Age   MAP Score
Anna         8      180
Bob         10      200
Carly       11      215
Damien      12      220
Esther       9      195
Franco      15      235
Georgia     13      230
Hank        14      235
Innya       13      225
Jacob       14      225

[Scatter plot: MAP Score (y-axis, 150 to 250) versus Age (x-axis, 0 to 20)]

Use the following scatter plot to answer each question. The scatter plot shows the number of years since each person graduated from college versus their remaining debt on a $30,000 student loan.

[Scatter plot: Remaining Student Loan Debt on a $30,000 Loan; Remaining Debt (y-axis, 0 to 35000) versus Years Since Graduating College (x-axis, 0 to 35); labeled points: Mike, Jazmin, Gonzo, Eloise, Amanda, Fifi, Donna, Katy, Leonard, Chuck, Hannah, Isildor, Brady]

15. Does this scatter plot represent a positive association, negative association, or no association? Why?
Negative, going down over time.

16. Which person paid off their debt? About how long did it take?
Brady, 30 years.

17. Does this appear to be a linear or nonlinear association? Why?
Non-linear, curves down.

18. Which person is the outlier in this data set? Why?
Mike, has more debt after many years.

Draw an informal line of best fit for the given scatter plots.

19. This scatter plot shows the age in years versus the height in inches of a group of children.

[Scatter plot: Height in Inches (y-axis, 0 to 80) versus Age (x-axis, 0 to 20)]

20. This scatter plot shows the hours of TV watched per week versus the GPA on a 4.0 scale for a group of students.

[Scatter plot: GPA (y-axis, 0 to 4) versus Hours of TV Watched per Week (x-axis, 0 to 15)]

Explain why the drawn line of best fit is accurate or why not.

21. This scatter plot shows the amount of copper in water in ppm versus plant growth in cm over three months.
Inaccurate, does not split data in half.

[Scatter plot: Plant Growth in cm (y-axis, 0 to 10) versus Cu in Water (ppm) (x-axis, 0 to 60)]

22. This scatter plot shows the hours a cubic foot of ice was exposed to sunlight versus the amount of ice that melted in cubic inches.
Inaccurate, not the right slope.

[Scatter plot: Ice Melted in Cubic Inches (y-axis, 0 to 10) versus Hours of Sunlight (x-axis, 0 to 8)]

The scatter plot shows the price of a gallon of milk from 2001 to 2012. The equation of the line of best fit is approximately y = (21/250)x + 2.68.

[Scatter plot: Avg Price of Milk (y-axis, $0.00 to $4.50) versus Years since 2000 (x-axis, 0 to 15)]

23. Predict what the price of a gallon of milk would have been in 2005 using both the equation and the graph.
Equation Work: y = (21/250)(5) + 2.68 = $3.10
Graph Prediction: $3.10

24. Predict what year it would have been when a gallon of milk cost approximately $3.00 using both the equation and the graph.
Equation Work: 3 = (21/250)x + 2.68, so 0.32 = (21/250)x and x ≈ 3.8, meaning about 2004
Graph Prediction: 2004

Using the same scatter plot and equation of the line of best fit of y = (21/250)x + 2.68, answer the following questions.

25. What does the slope of this equation mean in terms of the given situation? In other words, explain what the rise and run mean for this problem.
The price goes up $21 every 250 years, which is about $0.08 per year.

26. What does the y-intercept of this equation mean in terms of the given situation? In other words, explain what the y-intercept means when considering the price of a gallon of milk and the year.
In the year 2000, the price of a gallon of milk was $2.68.

Answer the following questions.

27. A function of best fit has a correlation coefficient of r ≈ −0.901. What does that tell us?
Negative relationship that is close to linear.

28. A function of best fit has a correlation coefficient of r ≈ 0.029. What does that tell us?
Positive relationship that is not linear; the function barely fits the data.

29. Plot the two sets of residuals on the number lines.

x                    1     2     3     4     5
Nate’s residuals    -1  -0.5     2     1     2
Nancy’s residuals    0   0.5    -1   0.5     1

30. Which person’s line of best fit works better and why do you think so?
Nancy’s LOBF is the best because her residuals are closely centered near zero and never go beyond 1 or -1. Nate’s residuals are lopsided (mostly positive) and exceed 1 twice.

Answer the following questions about two-way tables.

31. Construct a two-way table from the following data about whether or not students own an iPhone and whether or not they own an iPad.

Own an iPhone?    Y N Y Y N Y N N Y Y Y N N Y N N Y N Y N
Own an iPad?      Y N Y N N Y Y N Y N Y Y N Y N N Y Y N N

                        Owns iPad   Does Not Own iPad
Owns iPhone                  7              3
Does Not Own iPhone          3              7

32. Do you think there is a relationship between owning an iPhone and owning an iPad? Based on the data, why or why not?
Yes, there is a relationship. Owners of iPhones are more likely to own iPads. 70% of iPhone owners also own an iPad and 70% of those who do not own an iPhone also do not own an iPad.

Answer the following questions using the given two-way table.

                                    Students   Teachers
Support Year-Round School              250         80
Do Not Support Year-Round School      2150         70

33. How many teachers were surveyed?
150

34. How many students were surveyed?
2400

35. How many people support year-round school?
330

36. How many teachers do not support year-round school?
70

37. How many students do not support year-round school?
2150

38. As a percent to the nearest hundredth (two decimal places), what is the relative frequency of the teachers compared to all those surveyed?
150/2550 ≈ 5.88%

39. As a percent to the nearest hundredth (two decimal places), what is the relative frequency of the students who support year-round school compared to all students?
250/2400 ≈ 10.42%

40. As a percent to the nearest hundredth (two decimal places), what is the relative frequency of the teachers who do not support year-round school compared to all teachers?
70/150 ≈ 46.67%
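The two-way-table questions above come down to two steps: count the paired responses, then divide a count by the right total. A minimal sketch of both steps in Python follows; `two_way_table` and `relative_frequency` are hypothetical helper names, not part of the worksheet. It assumes the iPhone and iPad answers in question 31 are listed in the same student order, so the first answer in each row belongs to the same student.

```python
from collections import Counter


def two_way_table(first, second):
    """Count how often each (first, second) pair of answers occurs."""
    return Counter(zip(first, second))


def relative_frequency(part, whole):
    """Return part/whole as a percent rounded to two decimal places."""
    return round(100 * part / whole, 2)


# Question 31: paired Y/N answers (assumed to share the same student order).
iphone = "Y N Y Y N Y N N Y Y Y N N Y N N Y N Y N".split()
ipad = "Y N Y N N Y Y N Y N Y Y N Y N N Y Y N N".split()
table = two_way_table(iphone, ipad)
print(table[("Y", "Y")], table[("Y", "N")])  # iPhone owners: with iPad, without iPad
print(table[("N", "Y")], table[("N", "N")])  # non-owners: with iPad, without iPad

# Questions 33-40: counts copied from the year-round school table.
students = {"support": 250, "oppose": 2150}
teachers = {"support": 80, "oppose": 70}
total_students = sum(students.values())
total_teachers = sum(teachers.values())
everyone = total_students + total_teachers

q38 = relative_frequency(total_teachers, everyone)             # teachers out of everyone
q39 = relative_frequency(students["support"], total_students)  # supporters out of students
q40 = relative_frequency(teachers["oppose"], total_teachers)   # opponents out of teachers
print(q38, q39, q40)
```

The choice of denominator carries the meaning of each relative frequency: question 38 divides by everyone surveyed, while questions 39 and 40 divide by the total for a single column.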