1.3 Describing Quantitative Data with Numbers

Measuring Center: The Mean
1. The most common measure of center is the arithmetic average
2. Is NOT a resistant measure: extreme observations pull the mean toward them
3. Interpreted as the "average" value

   \[ \bar{x} = \frac{\text{sum of observations}}{n} = \frac{\sum x_i}{n} \]

Measuring Center: The Median
1. The median M is the midpoint of the distribution
2. To find the median of a distribution:
   a. Arrange all the observations from smallest to largest
   b. If the number of observations is odd, the median M is the center observation in the ordered list
   c. If the number of observations is even, the median M is the average of the two center observations in the ordered list
3. The median is a resistant measure
4. Interpreted as the "typical" value

Comparing the Mean and Median
1. The mean and median of a roughly symmetric distribution are close together
2. If the distribution is exactly symmetric, the mean and median are exactly the same
3. In a skewed distribution, the mean is usually farther out in the long tail than is the median

Example #9
Here are data for the amount of fat (in grams) in McDonald's beef sandwiches:

   Sandwich                               Fat (g)
   Hamburger                                 9
   Cheeseburger                             12
   Double Cheeseburger                      23
   McDouble                                 19
   Quarter Pounder                          19
   Quarter Pounder with Cheese              26
   Double Quarter Pounder with Cheese       42
   Big Mac                                  29
   Big N' Tasty                             24
   Big N' Tasty with Cheese                 28
   McRib                                    26
   Mac Snack Wrap                           19
   Angus Bacon & Cheese                     39
   Angus Deluxe                             39
   Angus Mushroom & Swiss                   40

a) Create a stemplot of the data
b) Find the mean amount of fat for all 15 beef sandwiches
c) The three Angus burgers are relatively new additions to the menu. How much did the average increase when they were added?
d) Find and interpret the median

Measuring Spread: The Interquartile Range (IQR)
1. Measures the range of the middle 50% of the data
2. How to calculate the quartiles Q1 and Q3 and the IQR:
   a. Arrange the observations in increasing order and locate the median M
   b. The first quartile Q1 is the median of the observations to the left of M
   c. The third quartile Q3 is the median of the observations to the right of M
   d. Important: Be sure to leave out the median when you locate the quartiles

   IQR = Q3 - Q1

Identifying Outliers:
1. Call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or more than 1.5 × IQR below the first quartile.

Example #10
a) Refer to Example #9 to find and interpret Q1, Q3, and the interquartile range (IQR)
b) Determine whether a beef sandwich with 68 grams of fat is an outlier

The Five Number Summary:
1. Consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation. Combine all five numbers to get a quick summary of both center and spread

   Minimum, Q1, M, Q3, Maximum

Boxplots: graphical representation of the five number summary
1. How to make a boxplot
   a. A central box is drawn from the first quartile to the third quartile
   b. A line in the box marks the median
   c. Whiskers extend from the box out to the smallest and largest observations that are NOT outliers
   d. Outliers are marked with stars

Common Errors:
1. Many students refer to the box in a boxplot as the IQR. This is incorrect since the IQR is a number, not the box itself.
2. When describing the shape of a boxplot, many students will describe a symmetric boxplot as "approximately Normal." This is incorrect since a boxplot does not reveal where the modes of a distribution are.

Example #11
Here are the number of home runs that Hank Aaron hit in each of his 23 seasons:

   13 39 27 29 26 44 44 38 30 47 39 34
   40 40 34 20 45 12 44 10 24 32 44

a) Make a boxplot for these data

Measuring Spread: The Standard Deviation
Standard deviation: measures the typical distance of the observations from their mean
Variance: measures the general dispersion of the data; it equals $s_x^2$, the square of the standard deviation

   \[ s_x = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2} \]

   $s_x$: sample standard deviation
   $\sigma_x$: population standard deviation

1. Properties
   a. Measures spread about the mean and should only be used when the mean is chosen as the measure of center
   b. Is always greater than or equal to zero; $s_x = 0$ only when every observation has the same value
   c. Has the same units of measurement as the original observations
   d. Standard deviation is not resistant

Example #12
Here are the foot lengths (in cm) for a random sample of seven 14-year-olds from the UK:

   25, 22, 20, 25, 24, 24, 28

a) Calculate and interpret the standard deviation of these data.

Choosing Measures of Center and Spread
Median and IQR
1. Describing skewed data
2. Describing data with strong outliers
Mean and standard deviation
1. Reasonably symmetric distributions
2. No outliers

Plot your data: dotplot, stemplot, histogram
Interpret what you see: shape, center, spread, outliers
Choose a numerical summary: mean and standard deviation, or median and IQR?

Example #13
The following data show the number of contacts that a sample of high school students had in their cell phones.

   Male:   124 260 135 41 290 114 29 31 105 27 168 103 44 169 96 87 167 144 85 214
   Female: 30 180 83 124 116 33 22 213 173 218 155 183 134 110

a) Do the data give convincing evidence that one gender has more contacts than the other?
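
Worked sketch for Measuring Center (Example #9). The short Python script below is one possible way to check the mean and median by computer, not part of the original notes. It assumes the three Angus burgers are the last three values in the fat list, and it borrows the hypothetical 68 g sandwich from Example #10 to illustrate resistance. The variable names are illustrative only.

```python
from statistics import mean, median

# Fat (g) for the 15 beef sandwiches in Example #9.
# Assumption: the last three values are the three Angus burgers.
fat = [9, 12, 23, 19, 19, 26, 42, 29, 24, 28, 26, 19, 39, 39, 40]
before_angus = fat[:-3]

mean_all = mean(fat)                # 394 / 15, about 26.27 g
mean_before = mean(before_angus)    # 276 / 12 = 23 g
increase = mean_all - mean_before   # about 3.27 g

median_all = median(fat)            # 8th of 15 ordered values: 26 g

# Resistance check: add the hypothetical 68 g sandwich from Example #10.
with_68 = fat + [68]
mean_with_68 = mean(with_68)        # 462 / 16 = 28.875 g
median_with_68 = median(with_68)    # average of the two center values: 26 g

print(f"mean = {mean_all:.2f}, increase from Angus burgers = {increase:.2f}")
print(f"median = {median_all}")
print(f"with 68 g added: mean = {mean_with_68:.3f}, median = {median_with_68}")
```

Adding the single 68 g value moves the mean from about 26.27 g to 28.875 g while the median stays at 26 g, which is what "resistant" means in practice.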
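
Worked sketch for the IQR, the outlier rule, and the five number summary (Examples #9 and #10). The functions below follow the quartile rule stated in these notes, which leaves the median out when the number of observations is odd. Statistical software often uses other quartile conventions, so its answers may differ slightly. The function names are hypothetical, and the code assumes at least two observations.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, M, Q3, max) using the quartile rule in these notes."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                              # observations left of M
    upper = xs[half + 1:] if n % 2 else xs[half:]  # observations right of M
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outlier_fences(data, k=1.5):
    """Return the cutoffs Q1 - k*IQR and Q3 + k*IQR."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, data, k=1.5):
    """True if value lies more than k*IQR below Q1 or above Q3."""
    low, high = outlier_fences(data, k)
    return value < low or value > high

# Fat (g) for the 15 beef sandwiches in Example #9.
fat = [9, 12, 23, 19, 19, 26, 42, 29, 24, 28, 26, 19, 39, 39, 40]

summary = five_number_summary(fat)   # (9, 19, 26, 39, 42)
fences = outlier_fences(fat)         # (-11.0, 69.0), since IQR = 39 - 19 = 20
flag_68 = is_outlier(68, fat)        # False: 68 is below the upper cutoff of 69

print(summary, fences, flag_68)
```

With these data the upper cutoff is 39 + 1.5 × 20 = 69, so a 68 g sandwich falls just short of being called an outlier.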
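
Worked sketch for the standard deviation (Example #12). The script below walks through the sample standard deviation formula step by step, dividing by n - 1, and then compares the result with `statistics.stdev` from the Python standard library, which also computes the sample version. The variable names are illustrative.

```python
from math import sqrt
from statistics import stdev

# Foot lengths (cm) for the seven 14-year-olds in Example #12.
foot = [25, 22, 20, 25, 24, 24, 28]

n = len(foot)
x_bar = sum(foot) / n                             # 168 / 7 = 24.0 cm
squared_devs = [(x - x_bar) ** 2 for x in foot]   # 1, 4, 16, 1, 0, 0, 16
variance = sum(squared_devs) / (n - 1)            # 38 / 6, about 6.33 cm^2
s_x = sqrt(variance)                              # about 2.52 cm

print(f"mean = {x_bar}, variance = {variance:.2f}, s_x = {s_x:.2f}")
print(f"statistics.stdev gives {stdev(foot):.2f}")
```

One reasonable interpretation: the foot lengths of these seven students typically vary by about 2.52 cm from the mean of 24 cm.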