Statistics Project
Name: __________________________________ Section: ____ Date: ___________

Integrated Algebra – Graphs and Statistics

This is a project-based approach to teaching the above topic. Please send comments, corrections and additions to the author, George Ludovici, at gwludovici@gmail.com. Thank you. The author credits fellow participants at PCMI, especially Bill Thill, for help with the basic idea of this project.

Objective – To address statistical components of the NYS Integrated Algebra Regents level course with accommodations for anticipated impact of the Common Core Standards.

Project – Cup Stacking

Outline
Project and data collection
Frequency and Cumulative Frequency Histograms
Box and Whiskers Plots
Scatter Plot
Line of Best Fit
Correlation
Central Tendency

Materials –
16 plastic stackable/disposable cups per student group, 3 to 5 students per group
Two different color markers for each group to write on the cups
Stopwatch or other method of timing to the nearest second for each group
Tape measure (long enough to reach across a student’s arm span)

Note to the Teacher – On page 6, questions 16 and 18, you may wish to have a discussion with your students that graphical scales or intervals can be a number other than one. For example, a histogram interval of only one is not usually a good choice, and when large numbers are involved, a vertical unit of one is impractical.

Standards – There is emphasis in the Common Core Mathematics Standards on deeper thinking and communication. To this end, students are often asked to explain their reasoning throughout this unit.

The two example histograms and the example scatter plot are borrowed from old NYS Regents exam questions.
Institute for Advanced Study/Park City Mathematics Institute Secondary School Teachers Program/Reasoning from Data and Chance Summer 2012

Statistics Project Part 1
Name: __________________________________ Section: ____ Date: __________
Page 1

Conduct experiments to collect the data: Your group needs to collect data to fill in the following table.

Student Name | Gender (m/f) | Arm Span (inches) | Trial 1 (seconds) | Trial 2 (seconds) | Trial 3 (seconds) | Trial 4 (seconds) | Trial 5 (seconds) | Total (seconds) | Average Time

1) Fill in the names, genders and arm spans. Arm span is fingertip to fingertip with arms outstretched to the sides.
2) Number 16 cups, 1 to 16, using one color for even numbers and one color for odd numbers. Write the numbers several times so they are visible on the sides of the cups no matter how the cup is turned. An alternative is to use two different cup colors. What is important is being able to identify that only one cup at a time is being moved and when all the cups have been moved.
3) Take turns being a stacker, a timer, and a spotter.
Stacker: holds the cups.
Timer: operates the timer and records the data.
Spotter: makes sure the stacker performs correctly.
4) Stand up and hold the stack of cups. Take one cup from the bottom of the stack with one hand and put it on the top of the stack, then use your other hand to take the new bottom cup and put it on the top of the stack.
5) You must alternate hands throughout the experiment. You are done when the cup which started at the bottom goes all the way through one cycle, returning to the bottom.
6) The stacker must be standing up.
7) The spotter is responsible for making sure every number 1-16 goes by: specifically, that only one cup at a time is being moved, and for noticing when the end is reached.
8) Mistakes must be corrected by undoing and repeating the action correctly. Do not start a new trial.
9) The timer stops as soon as the original cup is back at the bottom of the stack and the stack is complete.
10) Make sure the stacker goes through this experiment five times in a row, then everyone in the group switches to a new position.

Statistics Project Part 1, Page 2

Analysis

How would you describe the relationships you see in your data?
Justify your answer from number 1 above.
How is your description helpful to someone who has not seen your data?
What else might that person want or need to know about your data?

Discuss these questions, don’t write answers yet:
1) How much time does it usually take to stack the cups?
2) Are some of the cup stacking times more or less typical than others?
3) How spread out are the cup stacking times?
4) Does practice improve the cup stacking times? If so, how much?
5) Are there any relationships between gender, arm span and cup stacking times?
6) Could you predict some data items given other data items?
7) What other questions might you ask?

During this statistical unit, we are going to use the data you collected from your experiment to answer the above questions. As you do your work you may think of additional questions. Think about how you might use the statistical tools that we are going to learn about to help answer those questions. If the statistical tools we use don’t help to answer your questions, think about how you might create a different tool to answer the question.

Statistics Project Part 1, Page 3

Histogram – You are going to create frequency histograms to communicate your data graphically. Two example histograms are discussed: a regular histogram on this page and then a cumulative histogram on the next page.
Notice how the vertical axis always represents frequency.

What does “frequency” mean?

Write a sentence describing exactly what the fourth gray bar, above 190-199, for Student Heights means.

Using the above histogram, complete this table.

Height Interval | Total | Cumulative Total
160-169 | ____ | ____
170-179 | ____ | ____
180-189 | ____ | ____
190-199 | ____ | ____
200-209 | ____ | ____

Discuss your table answers.
Should the “Total” entry for 170-179 be 2 or 4 or 6? Why?
Should the “Cumulative Total” entry for 170-179 be 2 or 4 or 6? Why?

Statistics Project Part 1, Page 4

What does “cumulative” mean?

Why do the bars on the Cumulative Test Scores histogram keep getting higher?

Why is the fifth gray bar six units higher than the fourth gray bar?

Describe exactly what the fourth gray bar, above 41-80, for Cumulative Test Scores means.

Statistics Project Part 1, Page 5

Draw a Cumulative Student Height histogram based on the data in the Student Heights histogram.

Draw a Test Scores histogram based on the data in the Cumulative Test Scores histogram.

Statistics Project Part 1, Page 6

We are now going to draw a frequency histogram based on the data you collected using the Total Time.

15 What does vertical mean?
16 What measure or label should be on the vertical (y) axis?
17 What does horizontal mean?
18 What measure or label should be on the horizontal (x) axis?

The width of each histogram bar represents an interval of your data.
In our case this will be in seconds.

19 What does interval mean?

Your interval should be greater than or equal to one depending on your data.

20 What interval will you choose for your bars?
21 Why did you choose that interval?
22 How did you calculate it?
23 How many bars will you have?

Statistics Project Part 1, Page 7

On each axis there is a scale.

24 What does scale mean?
25 Write the scale you will use on your horizontal axis.
26 Why did you choose that scale?

In addition to the scale, we label each axis.

27 Why do we label the axes?
28 What label will you use on your horizontal axis?
29 How will you choose a scale for the vertical axis?
30 Write the scale you will use on your vertical axis.
31 What label will you use on your vertical axis?
32 Why do we give the histogram a name or description?
33 What will you name your histogram?
34 Complete this table based on your cup stacking data.
35 Will you need to fill in every line in this table?
36 How many bars did you say you would have?

Interval | Tally Your Entries | Total | Cumulative Total

Statistics Project Part 1, Page 8

37 Draw a histogram for the cup stacking data.
38 Draw a cumulative histogram for the cup stacking data.
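For teachers who want to check a group's tally table by computer, the interval tally and the running (cumulative) total can be sketched in Python. This is only a minimal sketch and not part of the original project: the function name `frequency_table`, its parameters, and the sample times are hypothetical, and it assumes times recorded in whole seconds.

```python
from itertools import accumulate


def frequency_table(times, start, width, bars):
    """Tally whole-second times into `bars` intervals of `width` seconds.

    The first interval begins at `start`. Each row of the result is
    (interval label, total, cumulative total). Times that fall outside
    the covered range are ignored in this sketch.
    """
    counts = [0] * bars
    for t in times:
        index = (t - start) // width  # which bar this time belongs to
        if 0 <= index < bars:
            counts[index] += 1
    cumulative = list(accumulate(counts))  # running total of the counts
    labels = [
        f"{start + i * width}-{start + (i + 1) * width - 1}"
        for i in range(bars)
    ]
    return list(zip(labels, counts, cumulative))


# Hypothetical Total Time values, in seconds, for ten students.
sample_times = [52, 55, 58, 61, 63, 64, 68, 70, 77, 85]

for label, total, running in frequency_table(sample_times, start=50, width=10, bars=4):
    print(label, total, running)
# 50-59 3 3
# 60-69 4 7
# 70-79 2 9
# 80-89 1 10
```

The last cumulative total always equals the number of data values, which gives students a quick way to check their own table.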
Statistics Project Part 1, Page 9

Cup Stacking Project Rubric for Part 1

Gathered the Data
0: Not Done
1: Table has a combination of missing data, difficult to read and/or fewer than three team members.
2: Some data is missing or the table is difficult to read or there are fewer than three team members.
3: A neat and complete data table with Arm Length, Gender and five trials each for at least three team members.

Properly graphed 13 and 14 on page 5
0: Not Done
1: The histograms have incompleteness, inaccuracies and/or are difficult to read.
2: Portions of complete and fully labeled histograms are missing or difficult to read or inaccurate.
3: A frequency histogram and a cumulative frequency histogram are completed, neat, fully labeled and accurate.

Properly graphed 37 and 38 on last page
0: Not Done
1: The histograms have incompleteness, inaccuracies and/or are difficult to read.
2: Portions of complete and fully labeled histograms are missing or difficult to read or inaccurate.
3: A frequency histogram and a cumulative frequency histogram are completed, neat, fully labeled and accurate.

Remaining 34 questions
0: Not Done
1: Few questions are answered and/or many answers are difficult to read and/or most answers are not thoughtful and accurate.
2: Most questions are answered in a neat, accurate and thoughtful manner.
3: All or almost all questions are answered. Answers are thoughtful and accurate, and answers are neat and easy to read.
Points: 3 for each of the four rubric rows
Total: 12
Possible deduction for being late past the due date: ____

Statistics Project Part 2
Name: __________________________________ Section: ____ Date: ___________
Page 1

Box and Whiskers

A sample Box and Whiskers plot might look like this:

Five numbers are used to create a Box and Whiskers plot. First put your data in order (sort) from smallest to largest.

1) The smallest number. In the above plot, the smallest number is ______________.
2) The middle number in the lower half of your sorted data. This is called the first or lower quartile. Quartile sounds like quarter, which is 1/4 or 25% of all the data points. If two numbers are in the middle, then take their average by adding them up and dividing by two. If there is an overall odd number of data values, then do not include the median in the upper or the lower half when determining quartiles. In the above plot, the first quartile is ______________.
3) The middle number in all of your sorted data. This is called the median; think of median as middle. For example, a highway or a wide street often has a median running down the middle to separate cars going in different directions. The median is the same as the second quartile. Half (50%) of the data comes before the median and half (50%) of the data comes after the median. As with the lower quartile, if there is an even number of data points so that two points are in the middle, then average them to find the median. In the above plot, the median is ______________.
4) The middle number in the higher half of your sorted data. This is called the third or upper quartile. In the above plot, the third quartile is ______________.
5) The largest number. In the above plot, the largest number is ______________.

Notice how in the above Box and Whiskers plot the quartiles or vertical lines are not evenly spaced.
6) How far apart are the five numbers?
_________ Low to first quartile
____________ First to second quartile
____________ Second to third quartile
__________ Third quartile to high
7) What does this tell you about how the data are spread out?

Statistics Project Part 2, Page 2

8) The Interquartile Range shows how spread out the middle half of the data is. The Interquartile Range is calculated by subtracting the first quartile from the third quartile. What is the Interquartile Range for the box and whiskers plot on the previous page?

Create a Box and Whiskers Plot

Sort your whole team’s cup-stacking data from shortest Trial Time to longest Trial Time.

Find your five numbers using the sorted Trial Times:
1) Lowest: ___________
2) Lower quartile: ___________
3) Median: ___________
4) Third quartile: ___________
5) Highest: ___________

The Box and Whiskers plot is drawn above part of a number line which is the scale for the plot.

What scale will you choose?
Why did you choose that scale?
How did you calculate that scale?

Statistics Project Part 2, Page 3

Draw your cup-stacking Box and Whiskers plot. Be sure to include a title, the scale, the box around the three middle numbers, vertical lines on the first, third and fifth numbers, and horizontal lines connecting the whiskers.

How does the spread of the data from your Cup Stacking Box and Whiskers plot compare with the spread of the data from the sample Box and Whiskers plot that we have been using?
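The five numbers and the Interquartile Range described above can be checked with a short Python sketch. It follows the rule stated on page 1 of Part 2: when there is an odd number of data values, the median is left out of both halves before the quartiles are found. The function name `five_number_summary` and the sample times are hypothetical, and the sketch assumes at least two data values.

```python
from statistics import median


def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest).

    With an odd number of values, the median is excluded from both
    halves, matching the quartile rule in this worksheet.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it is in neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


# Hypothetical Trial Times, in seconds.
sample_times = [12, 9, 15, 11, 10, 14, 20, 13, 11]

low, q1, med, q3, high = five_number_summary(sample_times)
iqr = q3 - q1  # Interquartile Range: third quartile minus first quartile

print(low, q1, med, q3, high)  # 9 10.5 12 14.5 20
print(iqr)                     # 4.0
```

Calculators and spreadsheets sometimes use a different quartile rule, so their answers can differ slightly from a hand calculation done by the rule above.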
Scatter Plots

Sample data for the maximum height and speed of some roller coasters are shown in the table below and graphed with a scatter plot.

The scatter plot has a data point graphed for every piece of data.
Although the table is sorted by height, it does not need to be.
The points are not connected with lines.
Both axes and the graph are labeled. Each axis has a scale and a label. The whole graph has a label.
If there were more than one point with the same height value, they would each be graphed.
Notice how the scatter plot does not use intervals; this is different from a histogram.

Statistics Project Part 2, Page 4

Describe the relationship you see between Max Height and Max Speed:

A scatter plot is useful for seeing relationships between two variables. Do you think there is a relationship between Arm Length and Average Trial Time in your data?

Explain your reasoning for the answer above.

Create a scatter plot with Average Trial Time on the horizontal axis and Arm Length on the vertical axis. Join with one or two other groups so that you will have at least ten points to plot.

What scale will you use on the horizontal axis?
How did you determine the scale for the horizontal axis?
What scale will you use on the vertical axis?
How did you determine the scale for the vertical axis?

Statistics Project Part 2, Page 5

Complete your scatter plot.

Is there a relationship between Average Trial Time and Arm Length?

Explain your above answer.
Statistics Project Part 2, Page 6

Remember the Roller Coaster Scatter Plot? Here it is again, drawn twice. The one on the right has a “line of best fit”.

A line of best fit is all of the following:
It is always straight.
It may go through some of your points or it may not; however, it is as close as possible to as many of your points as possible while still being straight.
It does not have to go through the origin (0, 0).

If your data is close to the line of best fit, then we can say there is a relationship or correlation between the data on the x-axis and the data on the y-axis. The closer your data is to the line, the stronger the correlation is. The correlation is either positive (positive slope), negative (negative slope) or zero (zero slope, no correlation).

Go back to the previous page and add a line of best fit to your scatter plot. What kind of correlation does your line show?

Describe how the correlation strengthens, weakens or changes your prior answer about the relationship between Average Trial Time and Arm Length.

Statistics Project Part 2, Page 7

Cup Stacking Project Rubric for Part 2

Box and whiskers pages 1 and 2
0: Not Done
1: Many answers are missing, not neat or inaccurate.
2: Most answers are completed, neat and accurate.
3: All answers are completed, neat and accurate.

Box and Whiskers Graph
0: Not Done
1: Multiple mistakes and/or omissions and/or not neat and/or no written analysis.
2: Minor mistakes or omissions or not neat. A written analysis is provided.
3: A correct box-and-whiskers is drawn with a labeled scale and everything is neat. A written analysis is provided.

Scatter Plot preparation on page 4
0: Not Done
1: Few answers are completed, neat and thoughtful.
2: Most answers are completed, neat and thoughtful.
3: All answers are completed, neat and thoughtful.

Scatter plot Graph on page 5 and work on page 6
0: Not Done
1: Multiple mistakes and/or omissions and/or not neat and/or no written analysis.
2: Minor mistakes or omissions or not neat. Written analysis is provided.
3: A correct scatter plot is drawn with a labeled scale and everything is neat. A written analysis is provided and a correlation is described.

Points: 3 for each of the four rubric rows
Total Possible: 12
Possible deduction for being late past the due date: ____
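In Part 2 students draw the line of best fit by eye. For teachers who want a numeric check of a group's line, one common substitute is the least-squares line together with the correlation coefficient r. This is a swapped-in technique, not something the project itself prescribes. The sketch below is in Python; the function name `least_squares_line` and the sample data are hypothetical, and it assumes the x values are not all equal and the y values are not all equal.

```python
from math import sqrt


def least_squares_line(xs, ys):
    """Return (slope, intercept, r) for paired data xs and ys.

    slope and intercept describe the least-squares line
    y = slope * x + intercept; r is the correlation coefficient,
    which lies between -1 and 1.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical data: Average Trial Time (seconds) and Arm Length (inches).
average_times = [10, 12, 14, 16, 18]
arm_lengths = [70, 66, 65, 62, 62]

slope, intercept, r = least_squares_line(average_times, arm_lengths)
print(slope, intercept)  # -1.0 79.0
print(round(r, 2))       # -0.95
```

The sign of r matches the sign of the slope (positive, negative or zero correlation), and the closer r is to 1 or -1, the closer the points lie to the line.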