Exploring Univariate and Bivariate Data
Algebra (Grade 9)
A 5-day Unit Plan
Five 50-minute lessons
Heather Gallivan
I2T2 Project 2007

Materials:
TI-84 Calculator
TI-View Screen
Fathom 2.0 Software
CBR 2

Table of Contents
Title Page
Student Objectives
NYS Core Curriculum & NCTM Standards
Resources
Materials
Unit Overview
Lesson 1
Lesson 2
Lesson 3
Lesson 4
Lesson 5

Student Objectives for the Unit:
Students will be able to distinguish between univariate and bivariate data and qualitative and quantitative data.
Students will be able to construct a bar chart, pie chart, box-and-whisker plot, frequency table, histogram, stem-and-leaf plot, and scatter plot for the various types of data.
Students will be able to calculate lines of regression manually and using technology.
Students will be able to compare two sets of univariate data using the various graphical displays.
Students will be able to analyze the different ways of displaying qualitative and quantitative data and univariate and bivariate data.
Students will be able to judge which method of displaying data is the most appropriate.
Students will be able to analyze data in terms of its variability, including whether the data is skewed or has outliers.

NYS Core Curriculum Standards:
Problem Solving Strand:
A.PS.4 Use multiple representations to represent and explain problem situations.
A.PS.10 Evaluate the relative efficiency of different representations and solution methods of a problem.
Communications Strand:
A.CM.1 Communicate verbally and in writing a correct, complete, coherent, and clear design and explanation for the steps used in solving a problem.
A.CM.2 Use mathematical representations to communicate with appropriate accuracy, including numerical tables, formulas, functions, equations, charts, graphs, Venn diagrams, and other diagrams.
A.CM.7 Read and listen for logical understanding of mathematical thinking shared by other students.
A.CM.12 Understand and use appropriate language, representations, and terminology when describing objects, relationships, mathematical solutions, and rationale.
Connections Strand:
A.CN.1 Understand and make connections among multiple representations of the same mathematical idea.
A.CN.6 Recognize and apply mathematics to situations in the outside world.
Representation Strand:
A.R.1 Use physical objects, diagrams, charts, tables, graphs, symbols, equations, or objects created using technology as representations of mathematical concepts.
A.R.2 Recognize, compare, and use an array of representational forms.
A.R.3 Use representation as a tool for exploring and understanding mathematical ideas.
A.R.4 Select appropriate representations to solve problem situations.
A.R.7 Use mathematics to show and understand social phenomena.
Probability and Statistics Strand:
A.S.1 Categorize data as qualitative or quantitative.
A.S.2 Determine whether data to be analyzed is univariate or bivariate.
A.S.3 Determine when collected data or display of data may be biased.
A.S.4 Compare and contrast the appropriateness of different measures of central tendency for a given data set.
A.S.5 Construct a histogram, cumulative frequency histogram, and a box-and-whisker plot, given a set of data.
A.S.6 Understand how the five statistical summary (minimum, maximum, and the three quartiles) is used to construct a box-and-whisker plot.
A.S.7 Create a scatter plot of bivariate data.
A.S.8 Construct manually a reasonable line of best fit for a scatter plot and determine the equation of that line.
A.S.9 Analyze and interpret a frequency distribution table or histogram, a cumulative frequency distribution table or histogram, or a box-and-whisker plot.
A.S.12 Identify the relationship between the independent and dependent variables from a scatter plot (positive, negative, or none).
A.S.17 Use a reasonable line of best fit to make a prediction involving interpolation or extrapolation.
A.S.20 Calculate the probability of an event and its complement.

NCTM Standards:
Problem Solving
Reasoning and Proof: Make and investigate mathematical conjectures.
Communication
Connections
Representation
Data Analysis and Probability: Formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them. Select and use appropriate statistical methods to analyze data. Develop and evaluate inferences and predictions that are based on data. Understand and apply basic concepts of probability.

Resources:
About M&M’s Brand. 2007. Mars, Incorporated. 10 Nov. 2007. <http://us.mms.com/us/about/products/milkchocolate/>.
Bohan, James F. AP Statistics. New York: Amsco School Publications, Inc., 2000.
Exploring Data. 1997. Education Queensland. 21 Nov. 2007. <http://exploringdata.cqu.edu.au/>.
Free Online Graph Paper. Incompetech.com. <http://incompetech.com/graphpaper/>.
“Get on the Stick.” Texas Instruments: Activities Exchange. 2005. Texas Instruments Incorporated. 17 Nov. 2007. <http://education.ti.com/educationportal/activityexchange/Activity.do?cid=US&aId=6495>.
I2T2 Project Year 8 – 2007 Notebook. Buffalo State College.

Materials:
TI-84 Calculator (with the EasyData application)
TI-View Screen
Overhead projector
Class set of computers
Fathom 2.0 Software
Calculator-Based Ranger (CBR 2)
M&M’s (medium bag)
Small bags to sort the M&M’s for each group of students (preferably ones you can’t see through)
Meter sticks
Paper plates
One die
Measuring tapes
Spaghetti (raw)
Rulers

Unit Overview:

Day 1: Exploring Qualitative Data: Bar Chart and Pie Chart
The students will first review the difference between qualitative and quantitative data by completing a short worksheet. Then the students will discover the ways of displaying and analyzing qualitative data through a group activity. The students will determine how often each color of M&M occurs in a given bag and compare their results with the proportions stated on the M&M’s website. They will gather data in groups and then compile their data with the entire class to create a bar chart and pie chart of the results. Students will be able to compare these displays and determine which display is the most appropriate to use with a set of data.

Day 2: Quantitative Data Part I: Measures of Central Tendency, Frequency Table, Stem-and-Leaf Plot, Histogram
The students will review known vocabulary such as mean, median, mode, and range and be introduced to new vocabulary such as variability and outlier through the use of a knowledge rating chart. The students will review how to calculate measures of central tendency and how to create a frequency table using a given set of data. The students will be introduced to the stem-and-leaf plot and histogram using collected data on the heights of the students in class. They will learn how to draw a stem-and-leaf plot by hand and graph a histogram using the TI-84 calculator. The students will be able to analyze the variability in a set of data using these graphical displays and describe the advantages and disadvantages of each.

Day 3: Quantitative Data Part II: Box-and-whisker Plot, Five-Number Summary
The students will be introduced to the five-number summary and the box-and-whisker plot by completing an activity using the CBR 2 and their TI-84 calculators. The students will use the CBR 2 to determine the average reaction times of the members of their class and then use their TI-84 calculators to calculate the five-number summary and create a box plot. Students will be able to analyze a set of data using a box plot and the five-number summary and determine the advantages and disadvantages of each.

Day 4: Quantitative Data Part III: Comparing Distributions
The students will learn how to compare two distributions using Fathom 2.0 software. The students will first play a game called Greed to collect data. Then they will be introduced to Fathom and will analyze the collected data using a back-to-back stem-and-leaf plot, dot plots, histograms, box-and-whisker plots, and summary statistics. They will learn the advantages of displaying data side by side on the same set of axes and of being able to create all of the different graphical displays at the same time.

Day 5: Bivariate Data: Scatter Plots and Regression Lines
Students will learn how to create a scatter plot using data they will collect in class. They will compare their heights to the length of their foot and to their arm span. In groups of three, they will manually create a regression line using spaghetti and will then use the TI-84 to calculate the regression line. They will analyze their results in their groups and as a class.

Day 1: Exploring Qualitative Data
Time: 50 minutes
Materials:
M&M’s (medium bag)
Small bags to sort the M&M’s (preferably ones you can’t see through)
TI-84 Calculators
Worksheets

Lesson Overview: Students will learn the usefulness of different graphical representations for qualitative data through an activity using M&M’s.

Lesson Objectives:
Students will be able to collect qualitative data and construct a bar chart and pie chart using that data.
Students will be able to analyze the different ways of displaying qualitative data.
Students will be able to judge which graphical display of qualitative data is the most appropriate.

Anticipatory Set:
1. Have a tally chart written on the board for favorite ice cream flavor. As the students walk in, have them put a tally mark under their favorite flavor. If their favorite flavor isn’t on the board, have the students write it on the board and put a tally next to it. Let the students know that this data will be used later.
2. Give the students the Qualitative vs. Quantitative worksheet and have them complete it on their own. (The students were introduced to the terms qualitative data and quantitative data, with examples, in the previous day’s instruction.) Go over the answers as a class. Explain that we are going to analyze categorical data today.

Developmental Activity:
1. Have the M&M’s total chart from Part II written on the board.
2. Have the students break up into groups of two and give them the Data Collection and M&M’s worksheets and a cup full of M&M’s.
3. Introduction: “When you buy a bag of M&M’s, you may notice that you get more of certain colors than others. What you may not know is that M&M/Mars Company prides itself on the ratio of each color M&M to the entire bag of M&M’s. Today we will take a medium-sized bag of M&M’s and, using a bar graph and pie chart, we will determine if what the company says is true for this bag.”
4. Have the students follow the instructions on the worksheet to complete Part I of the activity. Discuss as a class their predictions and their reasoning behind their choices.
5. Have the students complete Part II with their partner. Be sure to remind the students to record their numbers for Part II on the chart on the board for the entire class’s data. Make sure they calculate the probability as a percent.
6. 
Once the students have gathered the data, start Part III by discussing the definition of a bar graph. Have the students create a bar graph on their worksheets. Have one group come up to the front of the room to share their bar graph and explain how they did it.
7. Discuss the definition of a pie chart. Have the students create a pie chart on their worksheets. Be sure to let them know that they do not need to calculate the exact angle measure for each section. They should only estimate each section. Have one group come up and share their pie chart with the rest of the class and explain how they did it.
8. Now reveal to the students the actual percentages from the M&M’s website (http://us.mms.com/us/about/products/milkchocolate/) and have them write the numbers in the blanks provided on the worksheet. Discuss question 5 on the worksheet.
Actual percentages: Blue – 24%, Brown – 13%, Green – 16%, Orange – 20%, Red – 13%, Yellow – 14%

Closure: Discuss as a class the usefulness of graphical representations of qualitative data. Which representation do you like best? How does the visual representation help you to understand the relationship between the categories? Discuss how the students’ prior knowledge of probability helped them with the investigation and how it relates to what they did.

Assessment: For homework, have the students use the Favorite Ice Cream data that was collected when they walked into class to create a bar graph and a pie chart. Have them be as creative as they would like. Make sure they are prepared to share their graphs with the class tomorrow.

Name: ________________________________________________ Date: ____________
Qualitative vs. Quantitative Data
Directions: Identify the type of data in each case by placing a check mark in the column that each description corresponds to.

Description                                                           Qualitative   Quantitative
1. The types of cars in the parking lot.
2. The heights of students in our class.
3. The favorite flavors of ice cream of the students in our class.
4. The hair color of the members of the school’s faculty.
5. The height of a falling object at specified moments in time.

Name: _______________________________________________ Date: _____________
Data Collection and M&M’s
Directions: The M&M/Mars Company prides itself on ensuring that the ratio of each color to the total amount of M&M’s in each bag is held constant. Through this activity, we are going to see if they are correct for this particular bag of M&M’s. Complete each part as directed.
CAUTION: Do not eat any of the M&M’s until we are done!!!!!!

Part I – Predictions
1. Predict which color will occur most frequently in the bag. ______________ Why?
2. Predict which color will occur least frequently in the bag. ______________ Why?

Part II – Data Collection
1. Record the total number of M&M’s and the number of each color of M&M’s in your bag in the chart below.

Your Totals
Color      Number
Blue       ______
Brown      ______
Green      ______
Orange     ______
Red        ______
Yellow     ______
Total      ______

2. Add your totals to the class chart on the board. Once everyone has added their numbers to the chart, add up the total amount of M&M’s for each color and record them in the chart below.
3. Then calculate the actual probability of selecting a candy of each color from the bag. Record this in the chart below.

Class Totals
Color      Number     Probability
Blue       ______     ______
Brown      ______     ______
Green      ______     ______
Orange     ______     ______
Red        ______     ______
Yellow     ______     ______
Total      ______     ______

Part III – Analysis
Statistical data are often displayed graphically. It is almost always easier to study relationships in the data when you create a chart or a graph, rather than simply looking at the data as a set of numbers.
A bar graph is a type of chart used to compare categorical data in which the length of a bar represents the size of the data.
1. Create a bar graph on the axes below using the class data that was collected. Make sure you create a title, label the axes, and create a scale.
***Remember, the bars do not touch in a bar graph because each category is separate from the others.***
Title: __________________________________________
[Blank axes for the bar graph]

2. a. Which color occurs most often in your bag? ______________
   b. Did you guess correctly in Part I? _______________
3. a. Which color occurs least often in your bag? _______________
   b. Did you guess correctly in Part I? ______________

A pie chart is a type of chart in which a circle is divided up into sections where each section represents part of the total. It is usually displayed using percents.
4. Calculate the probabilities for each color M&M as a percent. Using these probabilities, create a pie chart using the circle below. Estimate the measures of each angle for the chart. Be sure to label each section!!
Title: ______________________________________

5. Now that we have calculated the percentage of each color in our bag of M&M’s, let’s compare our results with what the M&M/Mars Company says they should be.
Actual percentages:
Blue: ________ Brown: __________ Green: ________ Orange: ________ Red: __________ Yellow: ________
How does our data compare? Is it relatively close or completely different? Explain.

Name: ____________Answer Key__________________________ Date: ____________
Qualitative vs. Quantitative Data
Directions: Identify the type of data in each case by placing a check mark in the column that each description corresponds to.

Description                                                           Qualitative   Quantitative
1. The types of cars in the parking lot.                                   X
2. The heights of students in our class.                                                  X
3. The favorite flavors of ice cream of the students in our class.         X
4. The hair color of the members of the school’s faculty.                  X
5. The height of a falling object at specified moments in time.                           X

Name: _________Answer Key___________________________ Date: _____________
Data Collection and M&M’s
***The answers here are only possible answers, as many of the questions are opinion based or depend on the actual data the students collect.***
Directions: The M&M/Mars Company prides itself on ensuring that the ratio of each color to the total amount of M&M’s in each bag is held constant. Through this activity, we are going to see if they are correct for this particular bag of M&M’s. Complete each part as directed.
CAUTION: Do not eat any of the M&M’s until we are done!!!!!!

Part I – Predictions
1. Predict which color will occur most frequently in the bag. __Blue_______ Why?
Because it looks like there are more blue M&M’s than any other color.
2. Predict which color will occur least frequently in the bag. __Red_______ Why?
Because it looks like there are fewer red M&M’s than any other color.

Part II – Data Collection
1. Record the total number of M&M’s and the number of each color of M&M’s in your bag in the chart below.

Your Totals
Color      Number
Blue       12
Brown      7
Green      8
Orange     10
Red        6
Yellow     7
Total      50
***The data used is only an example, as the actual numbers will vary.***

2. Add your totals to the class chart on the board. Once everyone has added their numbers to the chart, add up the total amount of M&M’s for each color and record them in the chart below.
3. Then calculate the actual probability of selecting a candy of each color from the bag. Record this in the chart below.

Class Totals
Color      Number     Probability
Blue       63         .252 = 25.2%
Brown      29         .116 = 11.6%
Green      40         .160 = 16.0%
Orange     52         .208 = 20.8%
Red        32         .128 = 12.8%
Yellow     34         .136 = 13.6%
Total      250        100%

Part III – Analysis
Statistical data are often displayed graphically. It is almost always easier to study relationships in the data when you create a chart or a graph, rather than simply looking at the data as a set of numbers.
A bar graph is a type of chart used to compare categorical data in which the length of a bar represents the size of the data.
1. Create a bar graph on the axes below using the class data that was collected. Make sure you create a title, label the axes, and create a scale.
***Remember, the bars do not touch in a bar graph because each category is separate from the others.***
Title: _____M&M’s in One Bag_____________________
[Bar graph: vertical axis labeled Frequency, scaled from 0 to 60; horizontal axis lists the colors Blue, Brown, Green, Orange, Red, Yellow]

2. a. Which color occurs most often in your bag? ___Blue_______
   b. Did you guess correctly in Part I? __Yes__________
3. a. Which color occurs least often in your bag? __Brown________
   b. Did you guess correctly in Part I? ___No_________

A pie chart is a type of chart in which a circle is divided up into sections where each section represents part of the total. It is usually displayed using percents.
4. Calculate the probabilities for each color M&M as a percent. Using these probabilities, create a pie chart using the circle below. Estimate the measures of each angle for the chart. Be sure to label each section!!
Title: _______M&M’s Probability_______________________
[Pie chart sections: Blue 25.2%, Orange 20.8%, Green 16.0%, Yellow 13.6%, Red 12.8%, Brown 11.6%]

5. Now that we have calculated the percentage of each color in our bag of M&M’s, let’s compare our results with what the M&M/Mars Company says they should be.
Actual percentages:
Blue: _24%____ Brown: ___13%____ Green: __16%___ Orange: _20%____ Red: __13%____ Yellow: __14%___
How does our data compare? Is it relatively close or completely different? Explain.
The data in this example is pretty close to the actual percentages. The students should describe how close their data is to the percentages stated on the M&M’s website and whether or not they think the company’s claim is true.
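
For teachers who would like to check the class chart quickly, the probability calculation and the comparison in question 5 can be sketched in Python. This is not part of the original unit materials. It uses the example class totals from the answer key and the percentages listed in the lesson; the names `class_totals`, `stated_percent`, and `observed_percentages` are made up for this illustration.

```python
# Example class totals from the answer key; real counts will vary by class.
class_totals = {
    "Blue": 63, "Brown": 29, "Green": 40,
    "Orange": 52, "Red": 32, "Yellow": 34,
}

# Percentages listed in the lesson as coming from the M&M's website.
stated_percent = {
    "Blue": 24, "Brown": 13, "Green": 16,
    "Orange": 20, "Red": 13, "Yellow": 14,
}


def observed_percentages(counts):
    """Convert raw color counts to percentages of the whole bag."""
    total = sum(counts.values())
    return {color: 100 * n / total for color, n in counts.items()}


observed = observed_percentages(class_totals)

# Print each color's observed percent next to the stated percent.
for color, pct in observed.items():
    difference = pct - stated_percent[color]
    print(f"{color:<7} observed {pct:5.1f}%  stated {stated_percent[color]:3d}%  "
          f"difference {difference:+.1f}")
```

Replacing the numbers in `class_totals` with the totals from the board gives the class's own percentages, which students can then compare with their hand calculations.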

Day 2: Quantitative Data Part I – Measures of Central Tendency, Frequency Table, Histogram, Stem-and-leaf Plot
Time: 50 minutes
Materials:
TI-84 Calculator
TI-View Screen
Overhead projector
Measuring tapes
Worksheets

Lesson Overview: Students will learn how to find the measures of central tendency from a set of data and use that data to construct a frequency table, stem-and-leaf plot, and histogram using the calculator. They will also be able to analyze the given data in terms of its variability and shape.

Lesson Objectives:
Students will be able to create a stem-and-leaf plot, frequency table, and histogram from both collected data and given data.
Students will be able to compare and contrast the different methods of displaying data.
Students will be able to analyze a set of data graphically in terms of its variability, including whether it is skewed or symmetrical and whether it has outliers.
Students will be able to evaluate the advantages and disadvantages of using the frequency table, measures of central tendency, stem-and-leaf plot, and histogram to analyze data.

Anticipatory Set:
1. As the students walk in the door, have them write their height in inches on the board. Have a few measuring tapes hung on the wall in case they are not sure how tall they are.
2. Check that the students have completed their homework from the previous night and then have them hang their graphs on the bulletin board for display.
3. Give the students the Knowledge Rating worksheet and have them fill it out individually. Explain to the students that for each word, they should put a check in one of the columns depending on whether they know the word well and can come up with a definition, have heard of the word but don’t really know the definition, or have never heard of the word before.
4. Go over the words and come up with definitions for each. Introduce the words they have never seen before. Have them write down the definitions on the back of the worksheet.
The students will most likely not know the word outlier and possibly the mathematical definitions for variability and skewed.

Developmental Activity:
1. Give the students the Review worksheet and have them work with a partner to complete it.
2. Go over the answers as a class and then have a class discussion analyzing the data in terms of the new words they just learned. (E.g., discuss whether they think there are any outliers, how varied or spread out the data is, etc.)
3. Explain to the students that there is an easier way to calculate the mean and median using their calculators. Have the students enter the data on the Review worksheet into list 1 in their calculators and show them how to calculate the mean and median using the TI-View Screen for the overhead. ***See attached worksheet for instructions.***
4. Explain how a graphical display of data can be far more helpful in analyzing the data than simply using a frequency table or measures of central tendency alone.
5. Ask: What kinds of graphical displays do you know? (They may come up with histogram, stem-and-leaf plot, and dot plot.)
6. Say: Today we are going to learn how to create a stem-and-leaf plot by hand and a histogram using the calculator.
7. Have the students create a stem-and-leaf plot of the quiz grade data on the back of the worksheet. Discuss the usefulness of a stem-and-leaf plot for analyzing data.
8. Show the students how to create a histogram on the TI-84 calculators using the TI-View Screen. Discuss the usefulness of a histogram for analyzing data. ***See attached worksheet for instructions.***
9. Say: Now that we know how to draw a stem-and-leaf plot and graph a histogram, we can use them to make some observations about the variability of data. We are going to see how varied the heights of the students in our class are using both of these graphical displays.
10. Give the students the What’s Your Height? worksheet. Have the students complete it with a partner.
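
For teachers who want to check a hand-drawn plot from step 7, the stem-and-leaf construction can be sketched in Python. This is not part of the original unit materials, and the function name `stem_and_leaf` is made up for this illustration. The sketch assumes whole-number scores, uses the tens part of each score as the stem and the ones digit as the leaf, and uses the quiz scores from the Review worksheet.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return one text line per stem, with the sorted leaves beside it."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)  # stem = tens part, leaf = ones digit

    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(d) for d in groups.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


# Quiz scores from the Review worksheet.
quiz_scores = [7, 94, 61, 100, 99, 94, 97, 69, 93, 85,
               56, 86, 45, 67, 64, 91, 77, 39, 85, 78,
               87, 74, 89, 73, 71, 82, 64, 89, 100, 100]

plot_lines = stem_and_leaf(quiz_scores)
print("\n".join(plot_lines))
```

Keeping the empty stems (10s and 20s here) in the output makes the gap between the lowest score and the rest of the class easy to see, which supports the discussion of outliers.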

Closure: Discuss the answers to the What’s Your Height? worksheet. What are the advantages and disadvantages of using a stem-and-leaf plot and a histogram? What does each tell us about the data we collected? Which graphical display do you like the best?

Assessment: Give the students the Entrance Ticket and have them hand it in when they walk into class tomorrow.

Name: _______________________________________________ Date: _____________
Knowledge Rating

Word           Know it well    Heard of it    Clueless
Mean
Outlier
Median
Range
Variability
Mode
Frequency
Skewed

Name: ______________________________________________ Date: ______________
Review
Directions: Thirty students took a quiz worth 100 points and their scores are listed below.

7    94   61   100  99   94   97   69   93   85
56   86   45   67   64   91   77   39   85   78
87   74   89   73   71   82   64   89   100  100

Construct a frequency table using the data by placing a tally mark next to the score. Then answer the questions below.

Quiz Scores
Score      Frequency
0-9        ______
10-19      ______
20-29      ______
30-39      ______
40-49      ______
50-59      ______
60-69      ______
70-79      ______
80-89      ______
90-100     ______
Total      ______

1. What is the mean for this set of data? ________________
2. What is the median? _________________
3. What is the range? _______________
4. What is the mode? ________________

Name: _____________________________________________ Date: ______________
What’s Your Height?
1. Enter the class height data into the chart below and create a stem-and-leaf plot of the data.

Heights in Inches          Heights in Inches

2. Enter the data into L1 in your calculator.
   a. What is the mean height? ________________
   b. What is the median height? ________________
3. Create a histogram of the data using your calculator and sketch it on the axes below.
4. Describe the shape of the data. Is the data symmetric or is it skewed?
__________________________________________________________________
__________________________________________________________________
5. Are there any outliers? _______________________________________________
6. 
What do both of these graphs tell you about the heights of the students in our class?
____________________________________________________________
__________________________________________________________________

Name: _____________________________________________ Date: ______________
Entrance Ticket
Directions: The quiz data for another class of thirty students is as follows:

32   7    26   56   46   79   91   67   79   95
69   31   93   89   79   100  81   90   73   96
81   87   49   83   80   57   21   78   94   100

Find the mean and the median and create a stem-and-leaf plot and histogram of the data. How does this class’s data compare to the first class you analyzed?

Calculating the Mean and Median Using the TI-84
Step 1: Enter data into L1 by hitting STAT then 1: EDIT. Scroll down under L1 and enter the data as shown. Then hit 2ND QUIT to get back to the home screen.
**To delete old data from list one, scroll up until L1 is highlighted and press CLEAR and ENTER.**
Step 2: Hit 2ND STAT, then arrow to the right to the MATH menu.
Step 3: Select 3: mean( and press ENTER. Hit 2ND, 1 to get L1 and close the parenthesis. Then hit ENTER.
***Repeat these steps to find the median, selecting 4: median( in Step 3.***

Creating a Histogram Using the TI-84 Calculator
Step 1: Enter the data into L1 as described above.
Step 2: Hit 2ND, Y= to get into the STAT PLOT menu. Hit ENTER to go into Plot 1.
Step 3: Turn the plot on by hitting ENTER when ON is highlighted. Hit the down arrow to get to TYPE and use the right arrow to select the histogram as shown. Keep the Xlist as L1 and Freq as 1.
Step 4: Hit ZOOM, 9: ZoomStat to graph the histogram. You should get a graph like the one below.
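
As a cross-check on the calculator steps above, the same mean, median, and histogram counts can be sketched in Python. This is not part of the original unit materials. The helper `bin_counts` and its `width` and `top` parameters are made up for this illustration, and the bins follow the frequency-table intervals from the Review worksheet (0-9, 10-19, ..., 90-100), which may differ from the bin widths that ZoomStat chooses on the TI-84.

```python
from statistics import mean, median

# Entrance Ticket quiz scores, copied from the worksheet.
scores = [32, 7, 26, 56, 46, 79, 91, 67, 79, 95,
          69, 31, 93, 89, 79, 100, 81, 90, 73, 96,
          81, 87, 49, 83, 80, 57, 21, 78, 94, 100]


def bin_counts(values, width=10, top=100):
    """Count values per interval; the top score joins the last interval."""
    counts = {}
    for start in range(0, top, width):
        # The last interval is stretched to include the maximum (90-100).
        end = start + width - 1 if start + width < top else top
        counts[(start, end)] = sum(start <= v <= end for v in values)
    return counts


score_mean = mean(scores)      # same value as mean(L1) on the calculator
score_median = median(scores)  # same value as median(L1) on the calculator
histogram = bin_counts(scores)

print(f"Mean: {score_mean:.1f}")
print(f"Median: {score_median}")
for (low, high), count in histogram.items():
    # One '#' per score gives a sideways text histogram.
    print(f"{low:>3}-{high:<3} {'#' * count}")
```

The row of `#` marks for each interval is a text version of the histogram bars, so students can compare its shape with the graph on the calculator screen.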

Name: __________Answer Key___________________________ Date: _____________
Knowledge Rating
***Answers here are what you may expect the students to answer.***
Columns: Word, Know it well, Heard of it, Clueless
Mean           X
Outlier        X
Median         X
Range          X
Variability    X
Mode           X
Frequency      X
Skewed         X

Name: _______Answer Key___________________________ Date: ______________
Review
Directions: Thirty students took a quiz worth 100 points and their scores are listed below.

7    94   61   100  99   94   97   69   93   85
56   86   45   67   64   91   77   39   85   78
87   74   89   73   71   82   64   89   100  100

Construct a frequency table using the data by placing a tally mark next to the score. Then answer the questions below.

Quiz Scores
Score      Tally         Frequency
0-9        |             1
10-19                    0
20-29                    0
30-39      |             1
40-49      |             1
50-59      |             1
60-69      ||||          5
70-79      ||||          5
80-89      |||| ||       7
90-100     |||| ||||     9
Total                    30

1. What is the mean for this set of data? ___77.2_____________
2. What is the median? _______83.5_________
3. What is the range? ___100 - 7 = 93________
4. What is the mode? ______100__________

Stem-and-leaf plot of the quiz scores:
 0 | 7
 1 |
 2 |
 3 | 9
 4 | 5
 5 | 6
 6 | 14479
 7 | 13478
 8 | 2556799
 9 | 134479
10 | 000

Name: _________Answer Key_______________________ Date: ______________
What’s Your Height?
1. Enter the class height data into the chart below and create a stem-and-leaf plot of the data.
***The data is just an example, as it will vary in every class.***

Heights in Inches
65   64   62   61   60   63   58   61   64   58
59   64   70   66   67   57   66   66   66   60
60   64   69   67   65   65   68   62   63   71

Heights in Inches
5 |
5 | 7889
6 | 0001122334444
6 | 55566667789
7 | 01
7 |

2. Enter the data into L1 in your calculator.
   a. What is the mean height? __63.7___________
   b. What is the median height? ____64__________
3. Create a histogram of the data using your calculator and sketch it on the axes below.
4. Describe the shape of the data. Is the data symmetric or is it skewed?
The data is relatively symmetric but very slightly skewed to the right.
5. Are there any outliers?
__There are no outliers._______________________
What do both of these graphs tell you about the heights of the students in our class?
These graphs tell us that our class’s heights are not very spread out but close together. They also show us that the majority of the students are around 5’4” tall.

Name: _______Answer Key____________________________ Date: ______________
Entrance Ticket
Directions: The quiz data for another class of thirty students is as follows:

32   7    26   56   46   79   91   67   79   95
69   31   93   89   79   100  81   90   73   96
81   87   49   83   80   57   21   78   94   100

Find the mean and the median and create a stem-and-leaf plot and histogram of the data. How does this class’s data compare to the first class you analyzed?

Mean: 70.3
Median: 79

 0 | 7
 1 |
 2 | 16
 3 | 12
 4 | 69
 5 | 67
 6 | 79
 7 | 38999
 8 | 011379
 9 | 013456
10 | 00

This class’s data is also skewed to the left, with a high concentration of the data towards the high end. This class’s mean and median are quite a bit lower because there are more test scores in the lower range of grades than in the first class’s data. There are no outliers in this class’s data, whereas in the first class’s data there is one outlier.

Day 3: Quantitative Data Part II – Box-and-whisker plot, 5-number summary
Time: 50 minutes
Materials:
TI-84 Calculator (EasyData application)
CBR 2
TI-View Screen
Overhead projector
Meter sticks
Paper plates

Lesson Overview: Students will learn how to construct a box-and-whisker plot and five-number summary by collecting and analyzing data using the CBR 2 and their TI-84 calculators.

Lesson Objectives:
Students will be able to create a box-and-whisker plot and five-number summary using collected data.
Students will be able to analyze the data they collected using the box-and-whisker plot.
Students will be able to evaluate the usefulness of a box-and-whisker plot and five-number summary when analyzing data.

Anticipatory Set:
1. Collect the Entrance Ticket from the previous day.
2. 
Quickly review what they learned the previous day about stem-and-leaf plots and histograms and the advantages and disadvantages they pose.
3. Say: Today we are going to learn another way to display a set of data that can also be useful when analyzing data.

Developmental Activity:
1. Divide the class into groups of three, assigning the following roles: drop the stick, catch the stick, and set up and run the calculator and record data. Give the students the Data Collection and Get on the Stick worksheets.
2. Tape the center of a paper plate to the end of a meter stick and place the CBR on the floor directly under the plate.
3. Connect the calculator to the CBR and set the sensitivity setting on the CBR to normal.
4. Have the students take the calculator and go into APPS and select EasyData. ***For all instruction on how to use the calculator, use the TI-View Screen and overhead projector.***
5. Next, we only need three seconds for this activity, so go into SETUP and then 2: Time Graph.
6. You will see the Time Graph Settings displayed. Now, hit EDIT and change the time interval to .03. Then hit NEXT and keep the samples at 100. Then hit NEXT and it should take you back to the SETUP screen. Then hit OK.
7. Perform one trial with the help of two students to model how the data is to be collected.
8. Have the students complete five trials for each of their group members and find their own average reaction time using the Data Collection worksheets to help them. Each group should record their individual trials on the Get on the Stick worksheets.
9. Have each student fill in their average reaction time on the Average Reaction Time chart that will be placed on the overhead.
10. Once every student has written their reaction time on the chart, show the students how to enter the data into list 1 in their calculators. Note: They should only enter the data from the Reaction Time column. They do not need to enter the student numbers into a list.
They should go to STAT, and then 1: EDIT and enter the data from the Reaction Time column into L1. If L1 is not empty, arrow up until L1 is highlighted and press CLEAR and ENTER to empty the list.
11. Then, have the students press Y= and clear any equations they may have stored.
12. We want all of the data to be rounded to the nearest thousandth of a second. In order to do this, have the students hit MODE and then scroll down to FLOAT and arrow over until 3 is highlighted and press ENTER. Hit 2ND QUIT to get back to the home screen.
13. Discuss what the five-number summary is and show the students how to find it on their calculators using the data they collected as they follow along on the Get on the Stick worksheet.
14. Have the students answer the questions about the five-number summary in their groups and discuss the answers as a class.
15. Introduce the box-and-whisker plot and show the students how to create one on their calculators with the data they collected as they follow along on their worksheets.
16. Have the students answer the remaining questions on the worksheet in their groups. Discuss their answers as a class.
Closure: Discuss: What is the five-number summary and how is it used to make a box-and-whisker plot of a set of data? Discuss how useful the five-number summary and box-and-whisker plot are for analyzing variability in data. When should the box-and-whisker plot be used when analyzing quantitative data? Which method of displaying univariate, quantitative data do you like the best?
Assessment: Give the students the Think About It worksheets to complete for homework.

Name: _____________________________________________ Date: ______________
Data Collection
Overview: Imagine you are standing in the batter’s box waiting for the pitch from someone who throws a fastball at 150 kilometers per hour (90 miles per hour). The pitcher’s mound is 18.4 meters from home plate.
In the time it takes for the ball to get from the pitcher’s hand to home plate (about 0.44 seconds at that speed), you must decide whether you are going to swing the bat and where to swing the bat. How long do you think you have to make these decisions?
Data Collection:
1. Have one partner set up the CBR 2 and the EasyData APP settings.
2. Have another partner hold the meter stick so that the paper plate is about 1 meter from the CBR. Make sure the plate is directly over the CBR sensor.
3. Another partner should be ready to catch the stick with his/her thumb when the stick is released. ***Make sure there is nothing else in between the plate and the CBR***
4. Have the partner holding the stick press Start when ready to collect the data. The partner who is to catch the stick should be ready to catch the stick as quickly as possible after the stick has been let go. The catcher should try to hold the stick steady after catching it.
5. The partner dropping the stick can vary when they drop the stick. The catcher should not know when the stick is going to be dropped.
6. After the CBR 2 finishes collecting the data, the calculator will display “Please wait, transferring data” and then will display a distance/time graph like the one below.
7. Press the right arrow key until the cursor is on the last data point before the graph starts dropping. The x-value at the top of the screen represents the start time. Record this in the chart on the Get on the Stick worksheet under Trial 1, Start Time.
8. Press the right arrow key until the graph levels off again. The x-value at the top of the screen is the end time. Record this in the chart on the Get on the Stick worksheet under Trial 1, End Time. Subtract the start time from the end time to get your reaction time for that trial and record it in the chart.
9. Repeat this activity four more times by hitting Main and then Start. The screen to the right should appear. Hit Ok since you have recorded the data in the chart.
10. Switch roles until every member of your group has performed the experiment five times.
11.
When all of your trials have been completed, select Main, Quit, and then OK to get back to the home screen.

Name: _______________________________________________ Date: _____________
Get on the Stick
Directions: Record the reaction times in the table below.
Trial | Student #1: Start Time, End Time, Reaction Time | Student #2: Start Time, End Time, Reaction Time | Student #3: Start Time, End Time, Reaction Time
1 | | |
2 | | |
3 | | |
4 | | |
5 | | |
Avg. | | |
Note: Round your average reaction time to 3 decimal places (to the thousandth of a second).
1. A great way to describe the variability of the average reaction times of the students in our class is the five-number summary. The five-number summary consists of the minimum value, the first quartile, the median, the third quartile, and the maximum value. Use your calculator to find the five-number summary for the average reaction time.
Step 1: On your calculator, go to STAT and CALC and highlight 1: 1-Var Stats.
Step 2: Hit ENTER and type 2ND, 1 to get L1.
Step 3: Hit ENTER and scroll down until you get to the five-number summary. Record the five-number summary in the chart below.
Minimum value (minX) | First quartile (Q1) | Median (Med) | Third quartile (Q3) | Maximum value (maxX)
 | | | |
The five-number summary tells you a lot about the variability of a set of data. The minimum value and the maximum value can be used to determine the range of the set of data. The first quartile (Q1) tells you that 25% of the data is less than or equal to Q1. In other words, Q1 is the median of the lower half of the data. The third quartile (Q3) tells you that 75% of the data is less than or equal to Q3. In other words, Q3 is the median of the upper half of the data. This means that 50% of the data lies between Q1 and Q3. The range that is calculated by subtracting Q1 from Q3 is called the inter-quartile range (IQR).
i) What is the range for the average reaction time for our class? _____________
ii) What is the IQR for the average reaction time for our class?
______________ iii) What does this tell us about the variability of the average reaction time for the students in our class?______________________________________________________ ______________________________________________________________________ 2. A great way to display the five-number summary is something called a box-andwhisker plot. Use your calculator to create a box-and-whisker plot of the average reaction time. Step 1: Hit 2ND, Y= to get into the STAT PLOTS menu. Step 2: Hit ENTER on Plot 1. Step 3: Hit ENTER to turn the plot ON. Hit the down arrow to get on TYPE and use the right arrow to scroll through the plot options and hit ENTER on the box-andwhisker plot. Hit the down arrow and put L1 into the X list. Your screen should look like the one at the right. Step 4: Then press ZOOM, then 9: ZoomStat to get your box-and-whisker plot. Sketch the graph on the axis below: 37 A box-and-whisker plot is very useful because it allows you to focus on a few important features without the clutter that result when all of the data values are displayed. The “box” consists of Q1, the median, and Q3 and the “whiskers” end with the minimum value and the maximum value, respectively. i) Discuss the shape of the box of your graph. What does that tell us about the average reaction time for the class?_________________________________________ _____________________________________________________________________ _____________________________________________________________________ ii) What is the median for our data? _______________________________________ What does the median tell you about the data in the middle of your data set? ____ _____________________________________________________________________ iii) How many of your classmates are in the list? __________ How many are in the 1st quartile? ______ 2nd quartile? _______ 3rd quartile? _______ 4th quartile? _______ iv) How many of your classmates were between the 1st and 3rd quartile? 
__________
v) Is one of the whiskers longer than the other or are they about the same length? What does that tell us about how our data is skewed? __________________________
_____________________________________________________________________
_____________________________________________________________________
vi) What would happen to the box-and-whisker plot if someone had a reaction time that was longer than any of the other times? How would this change the graph? _____
_____________________________________________________________________
_____________________________________________________________________

Average Reaction Times
Student | Average Reaction Time
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 |
24 |
25 |
26 |
27 |
28 |
29 |
30 |

Name: _______________________________________________ Date: _____________
Think About It
Directions: Complete each question as indicated. Use the TI-84 calculator wherever applicable.
Here is the data for the heights (in inches) of 30 high school girls.
60 61 64 65 65 61 62 64 60 59 63 65 66 63 67 61 58 62 60 61 63 60 60 61 63 62 59 64 62 64
1. Calculate the five-number summary for this set of data.
2. Construct a box-and-whisker plot of the data and sketch the graph on the axis below. Label the five-number summary.
3. How spread out or varied is the data? Is the data skewed?
4. What does this graph tell us about the heights of high school girls?

Name: ______Answer Key_______________________________ Date: _____________
Get on the Stick
Directions: Record the reaction times in the table below.
***The data here is only an example as it will vary for each student***
Trial | Student #1: Start Time, End Time, Reaction Time | Student #2: Start Time, End Time, Reaction Time | Student #3: Start Time, End Time, Reaction Time
1 | .69, .96, .27 | 1.23, 1.46, .23 | .23, .49, .26
2 | .9, 1.17, .27 | 2.01, 2.21, .20 | 1.34, 1.58, .24
3 | .63, .84, .21 | .56, .75, .19 | 1.12, 1.33, .21
4 | 1.26, 1.44, .18 | .98, 1.16, .18 | .76, .98, .22
5 | 1.52, 1.72, .20 | 1.45, 1.61, .16 | 2.12, 2.31, .19
Avg. reaction time (Student #1, Student #2, Student #3):
.223 .192 .224
***Answers will vary***
Note: Round your average reaction time to 3 decimal places (to the thousandth of a second).
1. A great way to describe the variability of the average reaction times of the students in our class is the five-number summary. The five-number summary consists of the minimum value, the first quartile, the median, the third quartile, and the maximum value. Use your calculator to find the five-number summary for the average reaction time.
Step 1: On your calculator, go to STAT and CALC and highlight 1: 1-Var Stats.
Step 2: Hit ENTER and type 2ND, 1 to get L1.
Step 3: Hit ENTER and scroll down until you get to the five-number summary. Record the five-number summary in the chart below.
Minimum value (minX) | First quartile (Q1) | Median (Med) | Third quartile (Q3) | Maximum value (maxX)
.156 | .187 | .192 | .201 | .278
The five-number summary tells you a lot about the variability of a set of data. The minimum value and the maximum value can be used to determine the range of the set of data. The first quartile (Q1) tells you that 25% of the data is less than or equal to Q1. In other words, Q1 is the median of the lower half of the data. The third quartile (Q3) tells you that 75% of the data is less than or equal to Q3. In other words, Q3 is the median of the upper half of the data. This means that 50% of the data lies between Q1 and Q3. The range that is calculated by subtracting Q1 from Q3 is called the inter-quartile range (IQR).
i) What is the range for the average reaction time for our class? .278 - .156 = .122
ii) What is the IQR for the average reaction time for our class? .201 - .187 = .014
iii) What does this tell us about the variability of the average reaction time for the students in our class? Most of the reaction times are close together (the IQR is only .014), with the exception of a few outliers that widen the range to .122.
2. A great way to display the five-number summary is something called a box-and-whisker plot.
Use your calculator to create a box-and-whisker plot of the average reaction time. Step 1: Hit 2ND, Y= to get into the STAT PLOTS menu. Step 2: Hit ENTER on Plot 1. Step 3: Hit ENTER to turn the plot ON. Hit the down arrow to get on TYPE and use the right arrow to scroll through the plot options and hit ENTER on the box-andwhisker plot. Hit the down arrow and put L1 into the X list. Your screen should look like the one at the right. Step 4: Then press ZOOM, then 9: ZoomStat to get your box-and-whisker plot. Sketch the graph on the axis below: 42 A box-and-whisker plot is very useful because it allows you to focus on a few important features without the clutter that result when all of the data values are displayed. The “box” consists of Q1, the median, and Q3 and the “whiskers” end with the minimum value and the maximum value, respectively. vii) Discuss the shape of the box of your graph. What does that tell us about the average reaction time for the class? The box part of the graph is very close together. This means that our class’s reaction times are relatively close with the exception of the few outliers._____________________________________ viii) What is the median for our data? ____.192__________________ What does the median tell you about the data in the middle of your data set? ____ _The median tells us where the middle of the data is.__________________ ix) How many of your classmates are in the list? ____30___ How many are in the 1st quartile? ___9__ 2nd quartile? _7____ 3rd quartile? ___7__ 4th quartile? _7____ x) How many of your classmates were between the 1st and 3rd quartile? _14______ xi) Is one of the whiskers longer than the other or are they about the same length? What does that tell us about how our data is skewed? ___The right whisker is longer than the other whisker. 
The data is skewed to the right.______________
xii) What would happen to the box-and-whisker plot if someone had a reaction time that was longer than any of the other times? How would this change the graph?
__The whisker on the right would be extended, showing that the data is more spread out between the majority of the data and this outlier.__________________

Average Reaction Times
***data is an example as it will vary with each class***
Student | Average Reaction Time
1 | .223
2 | .224
3 | .192
4 | .187
5 | .211
6 | .215
7 | .165
8 | .199
9 | .230
10 | .278
11 | .177
12 | .189
13 | .195
14 | .193
15 | .197
16 | .186
17 | .179
18 | .201
19 | .187
20 | .186
21 | .190
22 | .156
23 | .189
24 | .191
25 | .212
26 | .195
27 | .192
28 | .188
29 | .179
30 | .194

Name: ____Answer Key_______________________________ Date: _____________
Think About It
Directions: Complete each question as indicated. Use the TI-84 calculator wherever applicable.
Here is the data for the heights (in inches) of 30 high school girls.
60 61 64 65 65 61 62 64 60 59 63 65 66 63 67 61 58 62 60 61 63 60 60 61 63 62 59 64 62 64
1. Calculate the five-number summary for this set of data.
Minimum = 58, Q1 = 60, Median = 62, Q3 = 64, Maximum = 67
2. Construct a box-and-whisker plot of the data and sketch the graph on the axis below. Label the five-number summary.
3. How spread out or varied is the data? Is the data skewed?
The data is not very spread out but close together. The data is fairly symmetrical.
4. What does this graph tell us about the heights of high school girls?
The graph tells us that the middle 50% of high school girls fall between 60’’ and 64’’ tall. There are a few shorter girls and a few taller girls, but neither the lowest nor the highest quarter of the data stretches far enough to skew the data.
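
For teachers who want to check the Think About It key without a calculator, the following is a minimal Python sketch of the five-number summary. It assumes the median-of-halves rule described on the Get on the Stick worksheet (Q1 is the median of the lower half, Q3 the median of the upper half); other software may use a different quartile convention and give slightly different quartiles. The function name `five_number_summary` is a hypothetical helper, not part of the unit's materials.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule.

    Assumption: when the number of values is odd, the middle value is left
    out of both halves before the quartiles are taken.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]           # lower half of the sorted data
    upper = data[half + n % 2:]   # upper half (skips the middle value if n is odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])


# Heights (in inches) of 30 high school girls, from the Think About It worksheet.
heights = [60, 61, 64, 65, 65, 61, 62, 64, 60, 59,
           63, 65, 66, 63, 67, 61, 58, 62, 60, 61,
           63, 60, 60, 61, 63, 62, 59, 64, 62, 64]

minimum, q1, med, q3, maximum = five_number_summary(heights)
print("five-number summary:", minimum, q1, med, q3, maximum)
print("range =", maximum - minimum)
print("IQR =", q3 - q1)
```

For the heights data this gives a minimum of 58, Q1 of 60, median of 62, Q3 of 64, and maximum of 67, so the range is 9 inches and the IQR is 4 inches. The same function can be applied to the class's list of average reaction times.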
45 Day 4: Quantitative Data Part III – Comparing Distributions Time: 50 minutes Materials: class set of computers Fathom 2.0 software one die worksheets Lesson Overview: Students will learn how to compare different sets of univariate data using stem-and-leaf plots, histograms, box-and-whisker plots, dot plots, and summary statistics. They will collect data through a class game and analyze the data using Fathom 2.0 software. Lesson Objectives: Students will be able to create graphical displays of two sets of univariate data using Fathom 2.0 software. Students will be able to compare different distributions using dot plots, histograms, backto-back stem-and-leaf plots, and box-and-whisker plots. Students will be able to evaluate the advantages of using statistical software to analyze sets of data. Students will be able to judge which graphical displays are the most appropriate for given sets of data. Anticipatory Set: 1. Discuss the homework from the previous day. Specifically discuss the answers to questions 3 and 4. 2. Have the students create a stem-and-leaf plot and histogram of the data. 3. Compare and contrast the different graphical displays in terms of this data and discuss which graph is the most appropriate. 4. Say: We have spent the last few days analyzing sets of data using various graphical displays. What if we wanted to compare two similar sets of data? What kind of information do you think you could gain from doing something like this? Developmental Activity: 1. Have a chart set up with two columns on the board: Game 1 and Game 2. Give the students the Greed worksheet. 2. Explain to the students that they will be collecting data to analyze using Fathom by first playing a game called Greed. 3. Explain the rules: First, all students should stand. The teacher rolls the die twice and the sum of the two rolls is the students’ starting score. From then on, the student has two choices. A: The student can sit down and record that score on their sheet. 
B: They can remain standing. The die is rolled again for the students that are standing. If a five is rolled, all of the students standing lose all of their points. They sit and record their score as 0. If a five is not rolled, that number gets added to their total score. Keep rolling the die until all of the students are seated. That ends the first round. The students play five rounds and at the end of five rounds, they add their scores together to get their final score for game 1. 4. The students should come up to the board to record their score under the column for Game 1. Have the students create a stem-and-leaf plot for game 1 on their worksheets. 46 5. Repeat step 3 for the second game and have the students come up to the board to record their score for game 2. 6. For number 2, show the students how to create a back-to-back stem-and-leaf plot with the data from game 1 and game 2. 7. Have the students answer question 3 in groups of three. Discuss their answers. (e.g. The scores for game 2 should be higher as the students developed strategies. Discuss whether this difference is significant or caused by random chance.) 8. Have the students follow the Greed worksheet to analyze the data using Fathom. 9. Stop the students once they have reached number 6. Discuss the answer to number 6 as this may be the first time they have encountered a dot plot. 10. Have the students complete the rest of the worksheet in their groups. Walk around to observe the students’ progress and to answer any questions they may have. ***Be sure to encourage the students to play around with the features of Fathom as they go along. For example, if you click on one data point, it becomes highlighted in every plot you have open. Closure: Discuss their answers to the worksheet. How useful was it to be able to graph all of the graphical displays all at once? What advantages do you have by being able to do this? 
In the context of our data, which display do you feel was the most appropriate to use to compare the data? Assessment: Give the students the Comparing Distributions worksheet to complete for homework. They can use Fathom if they have access to it outside of class or their TI-84 calculators. 47 Name: _____________________________________________ Date: _______________ Greed Directions: How greedy are you? Try and get the highest score without getting caught with 0. The die is rolled twice to get your starting score. You can keep that score or keep playing. However, if a five is rolled, you are left with a score of zero and are out of the game. You can decide to keep your score between any round. Let’s play! Game 1 Round 1 Round 2 Round 3 Round 4 Round 5 Total Score Game 2 Round 1 Round 2 Round 3 Round 4 Round 5 Total Score 1. Create a stem-and-leaf plot for Game 1 with the data from the entire class. 0 1 2 3 4 5 6 7 8 9 2. Create a back-to-back stem-and-leaf plot with your data for Game 1 and Game 2 by adding the data from Game 2 to the other side of the stem-and-leaf plot above. 3. What patterns do you notice from the stem-and-leaf plot? What can you infer about the scores from Game 1 versus the scores from Game 2? 48 4. We can analyze this data further using different graphical displays. Open Fathom and make sure you have a new window. If not, go to File, then New. 5. Create a Table by dragging the table icon onto the document and enter the data from Game 1 and Game 2 into the first two columns. [To rename the columns, simply click on <new> and type in the name you want. In this case, label the two columns Game 1 and Game 2.] 6. Create a graph by dragging the graph icon to an empty area in the document. Drag the column header for Game 1 to the horizontal axis of the graph over the spot labeled Drop attribute here. (As you move over the x-axis, a black box appears, showing that you can drop the attribute there.) 
What has appeared automatically is a dot plot of the data. a. What do you think a dot plot shows? 7. Drag the column header for Game 2 to the horizontal axis on the same graph. To do so, drag the column header to the left corner of the graph until you are highlighting the plus sign and then drop it. You should now have both dot plots for Game 1 and Game 2 on the same graph. a. What are the advantages to using dot plots to compare these two sets of data? 8. Create another graph by dragging the graph icon onto open space. Place both column headers for Game 1 and Game 2 onto the same set of axis as you did above. You should now have two dot plots again. Now, change the graph to a histogram by clicking on the drop down menu in the upper right hand corner that says dot plot and select histogram. You should now see both sets of data displayed with histograms. a. Describe the variability in each distribution. Is the data spread out or close together? Is the data skewed in either distribution or is it symmetric? 9. Create another graph as you did above to display the data with box plots. a. Are there any outliers in either distribution? b. What is the IQR for both distributions? What does that tell us about each set of data? 49 10. Display the Summary Statistics for both sets of data by dragging the summary icon onto open space in the document. Then drag the column header for Game 1 onto the chart until the arrow pointing to the right is highlighted and then drop it. Do the same with the column header for Game 2. Only the mean is displayed for each set of data. Next, right click on the summary statistics and select Add FiveNumber Summary. Now the five-number summary and the mean should be displayed. a. What advantage do you get from seeing the summary statistics along with the graph? b. How do outliers affect the mean of the distribution? How do they affect the graphical displays of the data? 11. 
What is the advantage to displaying the data from both data sets on the same axis? 12. What is the advantage of being able to analyze the data using all of the different data plots simultaneously? 13. Write a short paragraph comparing the two sets of data you collected playing the game. Make sure to discuss any similarities and differences and what may have caused them. 50 Name: _______________________________________________ Date: _____________ Comparing Distributions Directions: Use Fathom or your TI-84 calculator to create a back-to-back stem-and-leaf plot, histograms, and box-and-whisker plots for the two sets of data below. Compare the two sets of data, describing the shape of the distribution and any outliers in the context of the data. The data represents the scores of Class A and Class B on the same 100 point math test. A {70, 65, 66, 60, 80, 90, 95, 50, 65, 75, 65, 60, 80, 85, 60, 70, 75, 60, 55, 99} B {70, 85, 80, 80, 60, 65, 75, 80, 90, 95, 85, 80, 70, 60, 65, 75, 80, 80, 75, 80, 70, 50, 85, 90} 51 Name: ________Answer Key___________________________ Date: _______________ Greed ***SEE THE ATTACHED FATHOM FILE FOR AN EXAMPLE*** Directions: How greedy are you? Try and get the highest score without getting caught with 0. The die is rolled twice to get your starting score. You can keep that score or keep playing. However, if a five is rolled, you are left with a score of zero and are out of the game. You can decide to keep your score between any round. Let’s play! **ALL OF THE ANSWERS ARE BASED ON THE EXAMPLE DATA GIVEN** Game 1 Round 1 4 Round 2 3 Round 3 5 Round 4 0 Round 5 12 Total 24 ***Data is an example*** Score Game 2 Round 1 Round 2 Round 3 Round 4 Round 5 Total Score 6 4 7 5 10 32 14. Create a stem-and-leaf plot for Game 1 with the data from the entire class. ***Data is an example for the class*** 0 0 00127 42 1 237 82 2 344678 9820 3 1257 955100 4 764 5 026 10 6 6 744 7 8 0 9 15. 
Create a back-to-back stem-and-leaf plot with your data for Game 1 and Game 2 by adding the data from Game 2 to the other side of the stem-and-leaf plot above. 16. What patterns do you notice from the stem-and-leaf plot? What can you infer about the scores from Game 1 versus the scores from Game 2? The scores from Game 1 were lower than those for Game 2. This could be because students developed strategies to improve their score in the second game. 52 17. We can analyze this data further using different graphical displays. Open Fathom and make sure you have a new window. If not, go to File, then New. 18. Create a Table by dragging the table icon onto the document and enter the data from Game 1 and Game 2 into the first two columns. [To rename the columns, simply click on <new> and type in the name you want. In this case, label the two columns Game 1 and Game 2.] 19. Create a graph by dragging the graph icon to an empty area in the document. Drag the column header for Game 1 to the horizontal axis of the graph over the spot labeled Drop attribute here. (As you move over the x-axis, a black box appears, showing that you can drop the attribute there.) What has appeared automatically is a dot plot of the data. a. What do you think a dot plot shows? A dot plot displays the data much like a histogram only the frequency is displayed by the number of dots over each bin. 20. Drag the column header for Game 2 to the horizontal axis on the same graph. To do so, drag the column header to the left corner of the graph until you are highlighting the plus sign and then drop it. You should now have both dot plots for Game 1 and Game 2 on the same graph. a. What are the advantages to using dot plots to compare these two sets of data? It displays the general shape of the distribution including gaps and clusters. 21. Create another graph by dragging the graph icon onto open space. Place both column headers for Game 1 and Game 2 onto the same set of axis as you did above. 
You should now have two dot plots again. Now, change the graph to a histogram by clicking on the drop down menu in the upper right hand corner that says dot plot and select histogram. You should now see both sets of data displayed with histograms. a. Describe the variability in each distribution. Is the data spread out or close together? Is the data skewed in either distribution or is it symmetric? In Game 1, the data are skewed to the right with one outlier. In Game 2, the data are more symmetric. Both Games have about the same range, only the middle 50% of the data is lower in Game 1. 22. Create another graph as you did above to display the data with box plots. a. Are there any outliers in either distribution? There is one outlier in Game 1. This outlier is the winner of the game. b. What is the IQR for both distributions? What does that tell us about each set of data? 53 Game 1: 37-12 = 25 Game 2: 57-30 = 27 The IQR tells you where the middle 50% of the data lies. It tells us that the IQR for both games is about the same. 23. Display the Summary Statistics for both sets of data by dragging the summary icon onto open space in the document. Then drag the column header for Game 1 onto the chart until the arrow pointing to the right is highlighted and then drop it. Do the same with the column header for Game 2. Only the mean is displayed for each set of data. Next, right click on the summary statistics and select Add FiveNumber Summary. Now the five-number summary and the mean should be displayed. a. What advantage do you get from seeing the summary statistics along with the graph? You can see the effect of outliers on the mean, you are able to compare five-number summaries easier, etc. b. How do outliers affect the mean of the distribution? How do they affect the graphical displays of the data? Outliers affect the mean by making it either higher or lower, depending on where the outlier is. 
It affects the box plot more obviously but it can skew the data for the other displays. 24. What is the advantage to displaying the data from both data sets on the same axis? Displaying the data from both data sets makes it much easier to compare the two sets of data. You can more readily see differences in the shape of each distribution and the effect of outliers. 25. What is the advantage of being able to analyze the data using all of the different data plots simultaneously? Displaying all of the different data plots at the same time can help you decide which display is more useful when analyzing the data. Sometimes certain graphical displays can be misleading so being able to graph them all will enable you to analyze the data more accurately. 26. Write a short paragraph comparing the two sets of data you collected playing the game. Make sure to discuss any similarities and differences and what may have caused them. In Game 1, the scores were lower in general than Game 2’s scores. Despite this, their variability was about the same since their ranges are about the same and their IQRs are also very close. What may have caused the differences in the data is the fact that strategies were developed as the students played the game more, so their scores went up. 54 Name: ________Answer Key___________________________ Date: _____________ Comparing Distributions Directions: Use Fathom or your TI-84 calculator to create a back-to-back stem-and-leaf plot, histograms, and box-and-whisker plots for the two sets of data below. Compare the two sets of data, describing the shape of the distribution and any outliers in the context of the data. The data represents the scores of Class A and Class B on the same 100 point math test. 
A {70, 65, 66, 60, 80, 90, 95, 50, 65, 75, 65, 60, 80, 85, 60, 70, 75, 60, 55, 99}
B {70, 85, 80, 80, 60, 65, 75, 80, 90, 95, 85, 80, 70, 60, 65, 75, 80, 80, 75, 80, 70, 50, 85, 90}
[Figure: graphical displays for Class A and Class B]
Back-to-back stem-and-leaf plot:
Class B | Stem | Class A
0 | 5 | 0 5
5 5 0 0 | 6 | 0 0 0 0 5 5 5 6
5 5 5 0 0 0 | 7 | 0 0 5 5
5 5 5 0 0 0 0 0 0 0 | 8 | 0 0 5
5 0 0 | 9 | 0 5 9
The data for Class A is skewed to the right while Class B is skewed to the left. Class B has an outlier (the score of 50), which will lower the mean of Class B. The same score of 50 is not an outlier in Class A because there is a greater range of scores in Class A. Class A has more variability because its data is more spread out than Class B’s.

Day 5: Bivariate Data – Scatter Plots & Lines of Regression
Time: 50 minutes
Materials: TI-84 Calculator, TI-84 View Screen, overhead projector, measuring tapes, rulers, spaghetti (raw), worksheets
Lesson Overview: Students will learn how to create a scatter plot and regression line manually and with the TI-84 calculator using collected data.
Lesson Objectives:
Students will be able to compare bivariate data using scatter plots.
Students will be able to calculate a regression line that best fits the data, both manually and using the TI-84 calculator.
Students will be able to judge whether there is a positive, negative, or no association between the variables.
Students will be able to evaluate how strongly the data fits a linear pattern and analyze what that means in the context of the problem.
Anticipatory Set:
1. Collect the homework from the previous day.
2. Say: For the last few days we have been analyzing quantitative and qualitative univariate data. What if you wanted to analyze bivariate data?
3. Say: Today we are going to determine the relationship between two variables through a measurement activity.
Give the students the Scatter Plots worksheet and discuss how they are going to determine whether a relationship exists between their height and the length of their foot, and between their height and their arm span (the distance from the tip of the longest finger on one hand to the tip of the longest finger on the other when their arms are spread out perpendicular to their body).
4. Have the students answer question 1 and discuss their answers as a class.

Developmental Activity:
1. Break the students into groups of three and give each group a measuring tape.
2. Have the students take their measurements and record them on their worksheets and on the Class Data transparency that you should have on the overhead projector.
3. Once all of the data has been recorded, have the students complete number 4 on the worksheet and hand out one piece of spaghetti to each student.
4. Explain that they have just created a scatter plot of the data. Ask: What do you notice about the general shape of the data? The data should form a linear pattern.
5. Ask: What if I wanted to find out the approximate foot length of someone whose height is not one of our data points? Since the data appears to be linear, do you think we can draw a line on our graph that fits our data so we can estimate the foot length for any height? Discuss the answers to these questions as a class.
6. Have the students complete numbers 5 and 6 in their groups. Go over their answers as a class. Each group should have a slightly different equation, but discuss how they are just estimating so it is OK. Introduce the terms interpolation and extrapolation for questions a and b, respectively.
7. Go over number 7 as a class. You may need to remind students how to enter data into lists. See Day 2’s lesson for how to do this.
8. Walk the students through numbers 8 and 9 on the calculator using the TI-View Screen.
9. Have the students complete questions 10-12 in their groups.
10. Discuss their answers as a class.
The answers to 11 and 12 have the students start analyzing the data in terms of the association between the variables. These questions will lead into the introduction of correlation coefficients and telling the difference between causation and correlation.

Closure:
Discuss what they learned today. How useful are scatter plots when analyzing data? What is the significance of a regression line when analyzing data?

Assessment:
Have the students create a scatter plot and regression line comparing Height and Arm Span using their calculators for homework. Collect it the next day.

Name: _______________________________________________ Date: _____________

Scatter Plots

Is there a relationship between your height and the length of your foot? What about between your height and your arm span? By taking measurements of everyone in the class we are going to find out.

1. Make some predictions about the type of relationships that may exist, if any.

2. With the members of your group, measure your height, the length of your foot, and the length of your arm span, in inches, and record the measurements in the table below.

            Height (in.)   Foot Length (in.)   Arm Span (in.)
Student 1
Student 2
Student 3

3. Once you have collected your data, record it on the class table on the overhead.

4. Using the height as the independent variable and foot length as the dependent variable, plot each data point for the class totals using the graph paper on the last page. Make sure you come up with an appropriate scale and label the graph.

5. Using a piece of spaghetti, place it on your graph and move it until you feel it is in a place that best fits the data. Once you have done so, use a ruler to draw that line on your graph.
Write the equation of your line: _________________________________

6. Use your line to answer the following questions:
a. How long would a person’s foot be if he or she was 65.5 inches tall?
b. How tall would someone be if their foot was 13 inches long?

7.
The line that you just estimated is called a regression line or a “best fit” line. Your calculator can easily find the line that fits the data the best without estimating. Enter the class data into your calculator, putting Height into L1, Foot Length into L2, and Arm Span into L3.

8. Create a scatter plot like the one above using Height as your independent variable and Foot Length as your dependent variable.
Step 1: Hit 2ND, Y= to get to the STAT PLOTS menu.
Step 2: Hit ENTER on 1: Plot 1 and hit ENTER to turn the plot ON. Scroll down to Type and hit ENTER on the scatter plot. Scroll down and enter L1 for the Xlist and L2 for the Ylist. Your screen should look like the one to the right.
Step 3: Hit ZOOM, then 9: ZoomStat to get your scatter plot.

9. Now, use your calculator to find the regression line for the data.
Step 1: Hit STAT, then hit the right arrow to highlight CALC.
Step 2: Choose 4: LinReg (ax+b) and press ENTER.
Step 3: LinReg (ax+b) should be on your screen. Hit 2ND, 1 to get L1, then comma, then 2ND, 2 to get L2. We want the calculator to store the line into Y1, so hit comma, then VARS, Y-VARS, 1: Function, then 1: Y1.
Step 4: Hit ENTER to get the regression line. Hit Y= to see the equation of the line. Then hit GRAPH to see the scatter plot with the line drawn in. See the example below:

10. Write the equation of the regression line you just calculated, rounding to the nearest thousandth: _____________________________________________
How does your line in number 5 compare to the one you just calculated using the calculator?

11. What is the relationship between height and foot length? Is it positive or negative?

12. Based on the scatter plot, how strong do you think the relationship is between height and foot length?

Class Data
Height (in.)   Foot Length (in.)
Arm Span (in.)
Student 1
Student 2
Student 3
Student 4
Student 5
Student 6
Student 7
Student 8
Student 9
Student 10
Student 11
Student 12
Student 13
Student 14
Student 15
Student 16
Student 17
Student 18
Student 19
Student 20
Student 21
Student 22
Student 23
Student 24
Student 25
Student 26
Student 27
Student 28
Student 29
Student 30

Name: _____________________________________________ Date: _______________

Height vs. Arm Span

Directions: Using the data you collected in class, create a scatter plot and regression line using your TI-84 calculator. Answer the following questions.

1. Write the regression line you calculated: _________________________________

2. Using the above line, answer the following questions:
a. What is the arm span of a person who is 67.3 inches tall?
b. How tall could a person be if their arm span is 64.5 inches?

3. What is the relationship between height and arm span? Is there a positive or negative association?

4. How strong of an association is there between height and arm span?

Name: _______Answer Key______________________________ Date: _____________

Scatter Plots

***ANSWERS ARE BASED ON THE EXAMPLE DATA GIVEN***

Is there a relationship between your height and the length of your foot? What about between your height and your arm span? By taking measurements of everyone in the class we are going to find out.

1. Make some predictions about the type of relationships that may exist, if any.
Answers will vary but may state that as height increases, foot length and arm span increase as well.

2. With the members of your group, measure your height, the length of your foot, and the length of your arm span, in inches, and record the measurements in the table below.

            Height (in.)   Foot Length (in.)   Arm Span (in.)
Student 1   65             9                   64
Student 2   64             8.5                 65
Student 3   67             10.5                66

3. Once you have collected your data, record it on the class table on the overhead.

4. Using the height as the independent variable and foot length as the dependent variable, plot each data point for the class totals using the graph paper on the last page. Make sure you come up with an appropriate scale and label the graph.

5. Using a piece of spaghetti, place it on your graph and move it until you feel it is in a place that best fits the data. Once you have done so, use a ruler to draw that line on your graph.
Write the equation of your line: ___y = .4x - 18__

6. Use your line to answer the following questions:
a. How long would a person’s foot be if he or she was 65.5 inches tall?
y = .4(65.5) - 18 = 8.2 inches
b. How tall would someone be if their foot was 13 inches long?
13 = .4x - 18
x = 77.5 inches

7. The line that you just estimated is called a regression line or a “best fit” line. Your calculator can easily find the line that fits the data the best without estimating. Enter the class data into your calculator, putting Height into L1, Foot Length into L2, and Arm Span into L3.

8. Create a scatter plot like the one above using Height as your independent variable and Foot Length as your dependent variable.
Step 1: Hit 2ND, Y= to get to the STAT PLOTS menu.
Step 2: Hit ENTER on 1: Plot 1 and hit ENTER to turn the plot ON. Scroll down to Type and hit ENTER on the scatter plot. Scroll down and enter L1 for the Xlist and L2 for the Ylist. Your screen should look like the one to the right.
Step 3: Hit ZOOM, then 9: ZoomStat to get your scatter plot.

9. Now, use your calculator to find the regression line for the data.
Step 1: Hit STAT, then hit the right arrow to highlight CALC.
Step 2: Choose 4: LinReg (ax+b) and press ENTER.
Step 3: LinReg (ax+b) should be on your screen. Hit 2ND, 1 to get L1, then comma, then 2ND, 2 to get L2. We want the calculator to store the line into Y1, so hit comma, then VARS, Y-VARS, 1: Function, then 1: Y1.
Step 4: Hit ENTER to get the regression line. Hit Y= to see the equation of the line. Then hit GRAPH to see the scatter plot with the line drawn in. See the example below:

10. Write the equation of the regression line you just calculated, rounding to the nearest thousandth: ____y = .407x - 17.617______________
How does your line in number 5 compare to the one you just calculated using the calculator?
The two lines are nearly the same.

11. What is the relationship between height and foot length? Is it positive or negative?
There is a positive association. As your height increases, your foot length increases.

12. Based on the scatter plot, how strong do you think the relationship is between height and foot length?
The relationship is pretty strong because the data points are close to the line of regression.

Name: _____Answer Key______________________________ Date: _______________

Height vs. Arm Span

Directions: Using the data you collected in class, create a scatter plot and regression line using your TI-84 calculator. Answer the following questions.

***ANSWERS ARE BASED ON THE EXAMPLE DATA GIVEN***

1. Write the regression line you calculated: ___y = .921x + 5.115__________

2. Using the above line, answer the following questions:
a. What is the arm span of a person who is 67.3 inches tall?
y = .921(67.3) + 5.115 = 67.1 inches
b. How tall could a person be if their arm span is 64.5 inches?
64.5 = .921x + 5.115
x = 64.48 inches

3. What is the relationship between height and arm span? Is there a positive or negative association?
There is a positive association. As your height increases, your arm span increases.

4. How strong of an association is there between height and arm span?
There is a relatively strong association between height and arm span because the data points are pretty close to the line.
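The two prediction questions in the Height vs. Arm Span key amount to evaluating the regression line at a given height and solving the same line for a height. The Python sketch below shows that arithmetic using the rounded slope and intercept from the example key (y = .921x + 5.115). The function names predict_y and solve_for_x are illustrative only and are not part of the unit.

```python
def predict_y(slope, intercept, x):
    """Evaluate the line y = slope * x + intercept at x."""
    return slope * x + intercept


def solve_for_x(slope, intercept, y):
    """Solve y = slope * x + intercept for x (requires a nonzero slope)."""
    if slope == 0:
        raise ValueError("A horizontal line cannot be solved for x.")
    return (y - intercept) / slope


# Rounded coefficients taken from the example answer key.
SLOPE = 0.921
INTERCEPT = 5.115

# Question a: arm span predicted for a height of 67.3 inches.
arm_span = predict_y(SLOPE, INTERCEPT, 67.3)

# Question b: height that corresponds to an arm span of 64.5 inches.
height = solve_for_x(SLOPE, INTERCEPT, 64.5)

print(f"Predicted arm span: {arm_span:.2f} in.")
print(f"Estimated height: {height:.2f} in.")
```

Because the coefficients are already rounded to the nearest thousandth, results computed this way may differ slightly from what the calculator reports when it uses the unrounded line stored in Y1.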
Class Data

             Height (in.)   Foot Length (in.)   Arm Span (in.)
Student 1    65             9                   64
Student 2    61             7                   60
Student 3    58             6                   57
Student 4    58             6.5                 58
Student 5    70             12                  69
Student 6    66             9.7                 64
Student 7    67             9.5                 69
Student 8    64             9                   65
Student 9    65             8.7                 65
Student 10   62             7                   64
Student 11   64             8.5                 66
Student 12   60             6.8                 60
Student 13   61             7                   62
Student 14   59             6.5                 60
Student 15   57             6.4                 57
Student 16   66             9.5                 67
Student 17   66             9                   66
Student 18   69             10                  71
Student 19   65             8.8                 66
Student 20   63             7.9                 65
Student 21   62             7                   63
Student 22   63             7.8                 62
Student 23   64             8.3                 63
Student 24   64             8.4                 62
Student 25   66             9.3                 64
Student 26   60             6.7                 61
Student 27   60             6.6                 62
Student 28   67             9.9                 65
Student 29   68             8.9                 67
Student 30   71             11.7                70
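To check the calculator's LinReg (ax+b) results away from the TI-84, the least-squares line can be computed directly from the example class data above. The Python sketch below is one way to do that under the usual ordinary least-squares definition; it is not the calculator's own routine, and the function names are hypothetical. With the example data, the slopes and intercepts should come out close to the equations given in the answer keys. The correlation coefficient is included only as a preview of the topic the lesson leads into.

```python
import math

# Example class data from the table above (all measurements in inches).
heights = [65, 61, 58, 58, 70, 66, 67, 64, 65, 62, 64, 60, 61, 59, 57,
           66, 66, 69, 65, 63, 62, 63, 64, 64, 66, 60, 60, 67, 68, 71]
foot_lengths = [9, 7, 6, 6.5, 12, 9.7, 9.5, 9, 8.7, 7, 8.5, 6.8, 7, 6.5, 6.4,
                9.5, 9, 10, 8.8, 7.9, 7, 7.8, 8.3, 8.4, 9.3, 6.7, 6.6, 9.9,
                8.9, 11.7]
arm_spans = [64, 60, 57, 58, 69, 64, 69, 65, 65, 64, 66, 60, 62, 60, 57,
             67, 66, 71, 66, 65, 63, 62, 63, 62, 64, 61, 62, 65, 67, 70]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = ax + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and sum of cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def correlation(xs, ys):
    """Return the correlation coefficient r for the paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


for label, ys in (("Foot length", foot_lengths), ("Arm span", arm_spans)):
    a, b = least_squares_line(heights, ys)
    r = correlation(heights, ys)
    print(f"{label} vs. height: y = {a:.3f}x {b:+.3f}, r = {r:.3f}")
```

A value of r near 1 is consistent with the key's description of a strong positive association, since it means the points lie close to a line with positive slope.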