Section 2.5
Summer 2013 - Math 1040
June 10, 2013
(1040) M 1040 - 2.5 June 10, 2013

Roadmap
We will study today: §2.5 Measures of Position.
  Quartiles.
  Boxplots.
  Standard scores (z-scores).

Quantiles and quartiles
Quantiles are numbers that divide an ordered data set into lower and upper percentages. For instance, the median is known as the 50th percentile because it divides the data into a lower 50% and an upper 50%.
Quartiles divide the data into quarters. We label them Q1, Q2, and Q3; they mark off the lower 25%, 50%, and 75% of the data, respectively. That is, about one-quarter of the data falls below Q1, two-quarters below Q2, and three-quarters below Q3.

Quartiles
Data set: 15 CPR training test scores.
  13 9 18 15 14 21 7 10 11 20 5 18 37 16 17
The scores are first sorted:
  5 7 9 10 11 13 14 15 16 17 18 18 20 21 37
The median, 15, divides the data into two equal parts.
  Lower half: 5 7 9 10 11 13 14
  Upper half: 16 17 18 18 20 21 37

Finding quartiles
1. Sort the data.
2. Find the median Q2.
3. Find the medians of the lower and upper halves. These are Q1 and Q3.
Special rule: When the sample size is even, average the middle values.

IQR
We recall that the range of a data set is its maximum minus its minimum.
The interquartile range (IQR) is the range of the middle 50% of the data.
That is, IQR = Q3 − Q1.
The IQR of the CPR training scores is 18 − 10 = 8. The spread of the middle part of the data is 8 test points.

IQR Example
Tail lengths of a sample of American alligators, in feet:
  6.5 3.4 4.2 7.1 5.4 6.8 7.5 3.9 4.6
1. Sort the data.
   3.4 3.9 4.2 4.6 5.4 6.5 6.8 7.1 7.5
2. Find the median Q2.
   The middle (fifth) value is 5.4, so Q2 = 5.4 feet.
3. Find Q1 and Q3.
   Lower half: 3.4 3.9 (4.05) 4.2 4.6
   Upper half: 6.5 6.8 (6.95) 7.1 7.5
   By the special rule, Q1 = 4.05 feet and Q3 = 6.95 feet.
The IQR is then 6.95 − 4.05 = 2.9 feet.

Example
A question is asked of 10 people in a survey: Given the age of a Male, what is the acceptable minimum age for dating a Female?
Using R's built-in data set based on age-difference in dating, a table is obtained giving the frequency of each response. For instance, for a Male of age 21, eight people believe the acceptable minimum age for a Female date is 17.

Example
Out of the 10 people surveyed, here are the responses for a Male who is age 30:
  18 19 21 22 23 25 25 25 25 25
The statistics are: Q1 = 21 years, Q2 = 24 years, Q3 = 25 years. Also, x̄ = 22.8 years.
Notice here that 25 is the mode. Do you believe this complicates things? How do you interpret Q3?

Example
It turns out that in this example we can describe these data with finer divisions: percentiles. For instance, ask yourself what percent of the ages are less than or equal to 26 years, 25 years, 24 years, 23 years, etc.
Less than or equal to:   Percentage:   Percentile:
25 years                 100%          100th percentile
23 years                 50%           50th percentile
22 years                 40%           40th percentile
21 years                 30%           30th percentile
19 years                 20%           20th percentile
18 years                 10%           10th percentile

Box plots
Box-and-Whisker Plots
Box-and-whisker plots, or box plots, are graphical representations of the minimum, the quartiles, and the maximum (the so-called five-number summary).
1. Find the five-number summary.
2. Construct a horizontal scale that spans the range.
3. Plot the five-number summary above the horizontal scale.
4. Draw a box above the scale from Q1 to Q3, and draw a vertical line at Q2 inside the box.
5. Draw whiskers from the box to the minimum and maximum values.

Standard scores
The standard score, or z-score, is the number (possibly fractional) of standard deviations a data value is from the mean. The formula is given by:
  z = (Value − Mean) / (Standard Deviation) = (x − µ) / σ
In the formula above it is more useful to memorize the middle, verbal part.

Standard scores
A statistics test has a mean of µ1 = 63 and a standard deviation of σ1 = 7.0, and a biology test has a mean of µ2 = 23 and a standard deviation of σ2 = 3.9.
A student gets a 75 on the statistics test and a 25 on the biology test. On which test is the better score?
  z1 = (75 − 63) / 7.0 ≈ 1.71
  z2 = (25 − 23) / 3.9 ≈ 0.51
Since z1 > z2, the statistics score is the better one relative to the other scores on its test.

Assignments
Assignment:
1. Read pages 100 - 106.
2. Exercises p. 107, 1 - 49 odd.
Vocabulary: Quartiles, box plots, z-scores.
Understand:
  Use the median of a data set to help find quartiles.
  Construct a box plot from the data set.
  Find the z-scores of data.
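
For readers who want to check the worked examples by computer, the procedures in these slides can be sketched in Python. This is only an illustrative sketch and not part of the original lecture: the function names quartiles, five_number_summary, percent_at_or_below, and z_score are hypothetical, and the quartile rule is assumed to be the median-of-halves rule from "Finding quartiles" (statistical software often uses other quartile conventions, so its output may differ slightly).

```python
from statistics import mean, median


def quartiles(data):
    """Return (Q1, Q2, Q3) by the median-of-halves rule from the slides."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]           # values before the median position
    upper = values[half + n % 2:]   # when n is odd, the median itself is left out
    return median(lower), median(values), median(upper)


def five_number_summary(data):
    """Return (minimum, Q1, Q2, Q3, maximum), the inputs to a box plot."""
    q1, q2, q3 = quartiles(data)
    return min(data), q1, q2, q3, max(data)


def percent_at_or_below(data, value):
    """Percentage of data values less than or equal to value."""
    return 100 * sum(v <= value for v in data) / len(data)


def z_score(value, mu, sigma):
    """Number of standard deviations that value lies from the mean mu."""
    return (value - mu) / sigma


# Data copied from the slides.
cpr_scores = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]
tail_lengths = [6.5, 3.4, 4.2, 7.1, 5.4, 6.8, 7.5, 3.9, 4.6]
ages = [18, 19, 21, 22, 23, 25, 25, 25, 25, 25]

q1, q2, q3 = quartiles(cpr_scores)
print("CPR quartiles:", q1, q2, q3, "IQR:", q3 - q1)
print("CPR five-number summary:", five_number_summary(cpr_scores))

q1, q2, q3 = quartiles(tail_lengths)
# Rounded because floating-point averages such as (6.8 + 7.1) / 2 are inexact.
print("Tail-length IQR:", round(q3 - q1, 2))

print("Age quartiles:", quartiles(ages), "mean:", mean(ages))
print("Percent of ages <= 23:", percent_at_or_below(ages, 23))

print("z (statistics):", round(z_score(75, 63, 7.0), 2))
print("z (biology):", round(z_score(25, 23, 3.9), 2))
```

The results should agree with the slide values: an IQR of 8 for the CPR scores, an IQR of 2.9 feet for the tail lengths, and z-scores of about 1.71 and 0.51. The tail-length quartiles are rounded before printing because averaging decimal values in floating point can leave a tiny error in the last digit.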