11.2 Notes - Comparing Data Sets

Vocabulary to know:

Different Centers of the Data:
Mean – average found by adding the data values and dividing by the number of scores
Median – average found by putting the data in order and finding the middle score
Mode – average found by finding the score that appears most often in the data

Other vocabulary:
Outlier – data value that is very different from the others
Interquartile range (IQR) – the difference between the third and first quartiles; represents the spread of the middle 50% of the data values
Mean Absolute Deviation (MAD) – measure of variability that describes how far, on average, the data values are from the mean

**We use the center and spread to compare data sets.**

Example: Comparing dot plots

The dot plots above show the miles run per week for two different classes.

1.) Compare the centers of the two data sets.

Class A mode: The mode is 4 miles, but the data shows two clusters, one around 4 miles and another around 13 miles.
Class B mode: The mode is 7 miles.
Class A median: The middle number is 6 miles.
Class B median: The middle number is 6 miles.
Class A mean: 8.2 miles
Class B mean: 5.9 miles

2.) Compare the spread of the two data sets.

Class A range: 14 – 4 = 10 miles
Class B range: 9 – 3 = 6 miles
Class A data is spread out more than Class B data.

Class A mean absolute deviation: Subtract the mean from each score, make each difference positive (take the absolute value), add the results, and divide by the number of scores.
Mean = 8.2
4 – 8.2 = -4.2 -> 4.2
4 – 8.2 = -4.2 -> 4.2
4 – 8.2 = -4.2 -> 4.2
4 – 8.2 = -4.2 -> 4.2
4 – 8.2 = -4.2 -> 4.2
5 – 8.2 = -3.2 -> 3.2
5 – 8.2 = -3.2 -> 3.2
5 – 8.2 = -3.2 -> 3.2
6 – 8.2 = -2.2 -> 2.2
6 – 8.2 = -2.2 -> 2.2
12 – 8.2 = 3.8
13 – 8.2 = 4.8
13 – 8.2 = 4.8
13 – 8.2 = 4.8
13 – 8.2 = 4.8
14 – 8.2 = 5.8
14 – 8.2 = 5.8
69.6 ÷ 17 scores = 4.1 miles

Class B MAD:
Mean = 5.9
3 – 5.9 = -2.9 -> 2.9
4 – 5.9 = -1.9 -> 1.9
4 – 5.9 = -1.9 -> 1.9
4 – 5.9 = -1.9 -> 1.9
5 – 5.9 = -0.9 -> 0.9
5 – 5.9 = -0.9 -> 0.9
5 – 5.9 = -0.9 -> 0.9
5 – 5.9 = -0.9 -> 0.9
6 – 5.9 = 0.1
6 – 5.9 = 0.1
7 – 5.9 = 1.1
7 – 5.9 = 1.1
7 – 5.9 = 1.1
7 – 5.9 = 1.1
7 – 5.9 = 1.1
8 – 5.9 = 2.1
8 – 5.9 = 2.1
9 – 5.9 = 3.1
25.2 ÷ 18 scores = 1.4 miles

Class A's data points are farther from the mean than those of Class B. Class A's data shows greater variability.

Class A data has a greater mean, two clusters, and greater spread and variability than Class B data.

Example: Compare box plots

| | Store A | Store B |
|---|---|---|
| Maximum value | 76 | 74 |
| Minimum value | 25 | 41 |
| First quartile | 30 | 48 |
| Median | 43 | 51 |
| Third quartile | 55 | 65 |
| Interquartile range (IQR) | 25 | 17 |

Store A's median (43) is close to Store B's minimum (41), so Store A's average day is comparable to Store B's worst day.
Every Store B value except the maximum is greater than the corresponding Store A value, which suggests that Store B has greater overall sales.
Store A has a greater spread in both interquartile range (25 vs. 17) and overall range (51 vs. 33).

Example: Compare histograms

Machine 1's average torque is around 16 to 20. Machine 2's average torque is around 24.
Machine 1's torques range from as low as 10 to as high as 24. Machine 2's torques range from 14 to 38 and are much more spread out than Machine 1's torques.
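The center and spread calculations from the dot plot example can be checked with a short script. This is only a sketch: the two lists are read off the deviation lists in that example (17 values for Class A, and the 18 values that appear in the Class B list), and the helper names `mean_absolute_deviation` and `summarize` are made up for illustration. It uses the unrounded mean inside the MAD calculation and rounds to one decimal place only at the end, so intermediate values may differ slightly from a hand calculation that starts from a rounded mean.

```python
from statistics import mean, median, multimode

# Miles run per week, read off the deviation lists in the dot plot example.
class_a = [4] * 5 + [5] * 3 + [6] * 2 + [12] + [13] * 4 + [14] * 2
class_b = [3] + [4] * 3 + [5] * 4 + [6] * 2 + [7] * 5 + [8] * 2 + [9]


def mean_absolute_deviation(scores):
    """Average distance of each score from the mean (hypothetical helper)."""
    center = mean(scores)
    return sum(abs(score - center) for score in scores) / len(scores)


def summarize(scores):
    """Collect the center and spread measures used in the notes."""
    return {
        "count": len(scores),
        "mean": round(mean(scores), 1),
        "median": median(scores),
        "modes": multimode(scores),  # list of the most frequent value(s)
        "range": max(scores) - min(scores),
        "mad": round(mean_absolute_deviation(scores), 1),
    }


summary_a = summarize(class_a)
summary_b = summarize(class_b)
print("Class A:", summary_a)
print("Class B:", summary_b)
```

Comparing the two printed summaries side by side gives the same kind of comparison as the worked example: look at the centers first, then at the range and MAD.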
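The spread measures in the box plot example are simple subtractions on the five-number summary. The sketch below assumes the values from the Store A and Store B table; the dictionary keys and the `spread` helper are illustrative names, not part of the notes.

```python
# Five-number summaries copied from the box plot table.
store_a = {"min": 25, "q1": 30, "median": 43, "q3": 55, "max": 76}
store_b = {"min": 41, "q1": 48, "median": 51, "q3": 65, "max": 74}


def spread(summary):
    """Return the interquartile range and overall range (hypothetical helper)."""
    return {
        "iqr": summary["q3"] - summary["q1"],      # spread of the middle 50%
        "range": summary["max"] - summary["min"],  # spread of all the data
    }


spread_a = spread(store_a)
spread_b = spread(store_b)
print("Store A:", spread_a)
print("Store B:", spread_b)
```

A larger IQR and a larger range both point to the same conclusion here, which is why the notes can say Store A has the greater spread without qualification.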