Unit 5 – Data Representation – Class Notes

Date:

Statistical Questions

Learning Target: I will be able to identify statistical and non-statistical questions.

Key Terms

Biased Question: A question that leads a person to a particular answer. You cannot draw valid conclusions from biased data.
Statistics: The process of collecting, organizing, and interpreting data.
Statistical Question: A question that has many different, or variable, answers.

Examples/Try This

Statistical Questions

Ashley asked her classmates, “How many miles do you travel to school?” The table below shows the responses.

Distances Traveled to School
0.8 miles     2 miles       1 mile
1.5 miles     1.4 miles     0.75 miles
0.25 miles    0.5 miles     3 miles

1. Is there variability in the data?
2. Did Ashley ask a statistical question?
3. Explain why this is not a statistical question: “How many miles is the library from the school?”
4. Is each question a statistical question? Explain.
   a. How many siblings do you have?
   b. In which month is your birthday?
   c. What is your favorite type of book?
   d. How many states are in the United States?
   e. What color shirt am I wearing?
   f. What is your favorite color?
   g. How many dogs do you have?
   h. What size shoe do you wear?

For more help, go to www.khanacademy.org

Biased Questions

Dan and Kellie surveyed people about vacation spots.
Dan asked, “Would you prefer to vacation in sunny Bermuda or rainy London?”
Kellie asked, “Would you prefer to vacation in Bermuda or London?”

Survey Results
Columns: Dan, Kellie
Responses (in extracted order; the original table layout was lost): Bermuda, Bermuda, London, London, Bermuda, London, Bermuda, London, Bermuda, Bermuda, Bermuda, Bermuda, Bermuda, Bermuda, London, Bermuda

1. How do Dan and Kellie’s answers differ?
2. What words did Dan use in his survey question that might influence the responses to his question?
3. Is either question biased?
4. Complete the table.

   Biased Question: Do you prefer exciting action movies over boring drama movies?
   Words Creating Bias: exciting, boring
   Unbiased Question: What type of movie do you prefer?

   Biased Question: Do you prefer living in the peaceful countryside or the noisy city?
   Words Creating Bias: peaceful, noisy
   Unbiased Question:

   Biased Question: Do you want a delicious chocolate cake or a bland vanilla cake for your birthday?
   Words Creating Bias:
   Unbiased Question:

   Biased Question: Do you agree with most people that playing the guitar is cooler than playing the clarinet?
   Words Creating Bias:
   Unbiased Question:

   Biased Question: Do you want to go for a tiring run or play a fun game of soccer?
   Words Creating Bias:
   Unbiased Question:

Date:

Measures of Central Tendency

Learning Targets
1) I will be able to calculate the mean, median, mode and range of a set of data.
2) I will be able to recognize outliers in a set of data.

Key Terms

Measure of Center: A value that describes how data is centered. This includes mean, median, and mode.
Mean: This is sometimes called the average. To find this, add up all the values in the set and divide by the number of items in the set.
Median: This is the middle value when the data are in numerical order, from least to greatest. If there is an odd number of items, it is the middle number; if there is an even number of items, it is the mean of the two middle items.
Mode: This is the value or values that appear the most often. There may be more than one mode for a set of data. If all values occur an equal number of times, then the data has no mode.
Outlier: This is a value in a set that is very different from the other values.

Examples/Try This:

1. Find the mean, median, mode and range for the following set of data:
   5  30  35  20  5  25  20

2. The table shows the number of glasses of water consumed by several students in one day. Identify the outlier in the data set. Then determine how the outlier affects the measures of central tendency.

   Water Consumption
   Name      Randy  Lori  Anita  Jana  Sonya  Victor  Mark  Jorge
   Glasses   4      12    3      1     4      7       5     4

3. Based on the following line plot, find the measures of central tendency.
   [Line plot not reproduced]

4. Twelve people estimated the time, in minutes, they spend reading each day.
Their responses are as follows:
20  5  45  90  60  45  30  10  30  45  15  25
Find the measures of central tendency. Which one best describes the data?

Date:

Measures of Variability

Learning Target: I can find the interquartile range (IQR) and mean absolute deviation (MAD) of a set of data.

Key Terms:

Measure of Variability: A value that describes how data is spread out. This includes the range, mean absolute deviation and interquartile range.
Range: This is the difference between the greatest and least values in a set. To find this, take the largest value minus the smallest.
Interquartile Range (IQR): The difference between the upper quartile and the lower quartile.
Mean Absolute Deviation (MAD): The mean distance between each data value and the mean of the data set.

Examples/Try This:

Scientists recorded rainfall amounts in a rain forest for 10 days in June. The table below shows the results.

June Rainfall
Day             1   2   3   4   5   6   7   8   9   10
Rainfall (cm)   5   6   8   16  7   6   1   5   9   7

1. One measure of variability is mean absolute deviation (MAD). The MAD is the average amount that the data values vary from the mean.

Step 1 Find the mean value of the rain over the ten days.

Step 2 Complete the table below. Find the difference between each day’s rainfall and the mean to find the distance each day’s rainfall is from the mean. Write each difference as a positive number.

June Rainfall Distances from Mean
Day                  1   2   3   4   5   6   7   8   9   10
Rainfall (cm)        5   6   8   16  7   6   1   5   9   7
Mean
Distance from Mean

Step 3 Calculate the mean of all the distances in the bottom row. This is the MAD, which is the average amount that the data values vary from the mean.

2. Range is another measure of variability. What is the range of the rainfall data?

3. Another measure of variability is the interquartile range (IQR). This measure tells the spread of the middle half of the data.

Step 1 Arrange the data values in order from least to greatest.
Step 2 Find the lower quartile and the upper quartile.
Step 3 Subtract the lower quartile from the upper quartile. This is the IQR.

Date:

Number Lines and Dot Plots (Line Plots)

Learning Target: I am able to create dot plots (line plots) and read information from them.

Key Terms

Number Line: A line without ends whose points are matched to the real numbers by their distance from a given point labeled zero.
Dot Plot (Line Plot): A graph that shows the shape of a data set by stacking x’s above each value on a number line. This is sometimes referred to as a line plot.

Examples/Try This

The line plot below shows the number of miles a cyclist traveled during a training period.

[Line plot for Cyclist A: x’s stacked above a number line labeled 2 through 9; plot not reproduced]

1. In the plot for Cyclist A, the data are grouped around what value?
2. If you draw a vertical line at that value, are the points on either side symmetrical?
3. This line plot shows the distance traveled by another cyclist.

   [Line plot for the other cyclist: x’s stacked above a number line labeled 2 through 9; plot not reproduced]

   a. Is there a value that the data are grouped around?
   b. Are there more points to the left or the right of the graph?
4. Students in Mr. Gordon’s class ran several miles a week. The results are in the following table. Organize the data into a dot plot.

   Number of Miles Run   3  4  5  6  7  8  9  10
   Number of Students    5  0  6  4  3  7  0  2

Date:

Histograms

Learning Target: I am able to create histograms and read information from them.

Key Terms

Histogram: A type of bar graph that shows the frequency of data within given intervals. The label under each bar tells the range of numbers the bar represents, and the height represents the frequency. There is no space between the bars.
Interval: A set of numbers consisting of all the numbers between a pair of given numbers, along with either, both, or neither of the endpoints.

Examples/Try This:

The histogram shows the scores for a test that was given in Mr. Warren’s English class.

[Histogram: Test Scores in English Class. Horizontal axis “Test Scores” with intervals 76–80, 81–85, 86–90, 91–95, 96–100; bar heights not reproduced]

1. What does the bar labeled 81–85 represent?
2. What values are included in the interval 91–95?
3. Compare all of the intervals shown on the horizontal axis of the histogram. Do any of the intervals overlap or have a gap between them?
4. How many students are in the English class? Explain how you found your answer.
5. For each question below, answer the question or tell why the answer cannot be determined.
   a. Which interval includes the most test scores?
   b. How many students had a test score of 95?
   c. How many students have test scores below 86?
   d. How many students have test scores below 88?
   e. What is the highest test score in the class?
6. Make a list of numbers that could represent the data graphed in this histogram.
7. The table below shows the number of minutes that Samuel spent on the computer each day for a month. Use the data to complete the frequency table and make a histogram.

   Samuel’s Computer Time (minutes)
   95   4  26  95   4  87  36  47  26  51
   23  18  45  81  76  24  57   7  16  62
   70  20  45  16  32  37  64   8  28  44

   Samuel’s Computer Time
   Minutes    Frequency


Date:

Box Plot (Box-and-Whisker Plot)

Learning Targets
1) I will be able to read information from a box-and-whisker plot.
2) I will be able to create a box-and-whisker plot.

Key Terms

Box-and-Whisker Plot: Shows how a set of data is distributed. The plot displays five numbers that summarize the data.
Parts of a Box-and-Whisker Plot

Lower Extreme (LE): the smallest value that is not an outlier
Lower or 1st Quartile (LQ): the median of the lower half of the data
Median: the middle value when the data are placed in numerical order
Upper or 3rd Quartile (UQ): the median of the upper half of the data
Upper Extreme (UE): the greatest value that is not an outlier

Other Important Terms

Interquartile Range (IQR): the upper quartile minus the lower quartile
Outlier: a value that is much bigger or smaller than the other values

Mathematical rule for finding outliers:
- Any number smaller than LQ - 1.5 × IQR
- Any number bigger than UQ + 1.5 × IQR

Steps to Making a Box-and-Whisker Plot

Make a box-and-whisker plot for the following set of data:
43, 42, 44, 46, 47, 42, 49, 47, 44, 47

1. Put the numbers in order from least to greatest.
2. Find the median of the set of numbers.
3. Find the lower and upper quartiles. Remember, these are just the medians of the lower and upper halves of the data.
4. On an appropriate number line, draw a box extending from the lower quartile to the upper quartile. Draw in the median using a dotted line. What do you think is meant by the word appropriate?
5. Fill in the “whiskers” by connecting the lower extreme (excluding outliers) to the lower quartile and the upper extreme (excluding outliers) to the upper quartile.
6. If you have any outliers, they are simply dots at the given values.

Five things you need to make a box-and-whisker plot
Lower Extreme:
Upper Extreme:
Lower Quartile:
Upper Quartile:
Median:

Try This

Draw a box-and-whisker plot for the following numbers.
45  43  61  64  51  35  55  37  66  59  58  49  50  47  62

Five things you need to make a box-and-whisker plot
Lower Extreme:
Upper Extreme:
Lower Quartile:
Upper Quartile:
Median:
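
Checking your work with code

The steps for finding the five numbers of a box-and-whisker plot, along with the 1.5 × IQR outlier rule, can be sketched in Python as a way to check answers worked out by hand. This is a minimal sketch and not part of the original notes. The function name five_number_summary and the dictionary keys are made up for illustration. It assumes the "median of each half" method described in these notes, and it assumes that when the data set has an odd number of values, the middle value is left out of both halves. Other textbooks and software may use a different quartile rule and give slightly different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Five-number summary using the median-of-halves quartile method.

    Assumes at least two values. For an odd count, the middle value is
    left out of both halves (an assumed convention, not stated in the notes).
    """
    data = sorted(values)                # Step 1: order least to greatest
    n = len(data)
    half = n // 2
    lower_half = data[:half]             # values before the middle
    upper_half = data[half + n % 2:]     # values after the middle

    med = median(data)                   # Step 2: median of the whole set
    lq = median(lower_half)              # Step 3: median of the lower half
    uq = median(upper_half)              #         median of the upper half
    iqr = uq - lq

    # Outlier rule from the notes: below LQ - 1.5 * IQR or above UQ + 1.5 * IQR
    low_fence = lq - 1.5 * iqr
    high_fence = uq + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    inside = [v for v in data if low_fence <= v <= high_fence]

    return {
        "lower_extreme": min(inside),    # smallest value that is not an outlier
        "lower_quartile": lq,
        "median": med,
        "upper_quartile": uq,
        "upper_extreme": max(inside),    # greatest value that is not an outlier
        "iqr": iqr,
        "outliers": outliers,
    }


# The worked example from the steps above
summary = five_number_summary([43, 42, 44, 46, 47, 42, 49, 47, 44, 47])
print(summary)
```

For the worked example, the sketch gives a lower extreme of 42, a lower quartile of 43, a median of 45, an upper quartile of 47, an upper extreme of 49, an IQR of 4, and no outliers. The same function can be called on the Try This data after you have worked the problem by hand.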