Describing Data: Numerical Measures
Chapter 3

Copyright © 2015 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.

Learning Objectives
LO3-1 Compute and interpret the mean, the median, and the mode.
LO3-2 Compute a weighted mean.
LO3-3 Compute and interpret the geometric mean.
LO3-4 Compute and interpret the range, variance, and standard deviation.
LO3-5 Explain and apply Chebyshev’s theorem and the Empirical Rule.
LO3-6 Compute the mean and standard deviation of grouped data.

[1] Measures of Location (LO3-1)

The purpose of a measure of location is to pinpoint the center of a distribution of data.

There are many measures of location. We will consider three:
1. First: The mean
   (a) The arithmetic mean
       1. Ungrouped (raw) data: population parameter and sample statistic
       2. Grouped data
   (b) The weighted mean
   (c) The geometric mean
2. Second: The median
3. Third: The mode

First: The Mean

(A) The Arithmetic Mean

a. Population Mean

For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values:

\[ \mu = \frac{\sum X}{N} \]

where \( \mu \) is the population mean, \( N \) is the number of values in the population, and \( \sum X \) is the sum of the \( X \) values.

Example – Population Mean

There are 42 exits on I-75 through the state of Kentucky. Listed below are the distances between exits (in miles).

[Table: distances between the 42 exits, in miles]

1. Why is this information a population?
2. What is the mean number of miles between exits?

Solution

1. Why is this information a population?
Answer: This is a population because we are considering all the exits on I-75 in Kentucky.

2. What is the mean number of miles between exits?
Answer:

\[ \mu = \frac{\sum X}{N} = \frac{11 + 4 + 10 + \cdots + 1}{42} = \frac{192}{42} = 4.57 \]

Parameter versus Statistic

PARAMETER: A measurable characteristic of a population.
STATISTIC: A measurable characteristic of a sample.

b. Sample Mean

For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values:

\[ \bar{X} = \frac{\sum X}{n} \]

where \( \bar{X} \) is the sample mean and \( n \) is the number of values in the sample.

Example – Sample Mean

[Figure: worked example and solution]

Properties of the Arithmetic Mean

1. Every set of interval-level and ratio-level data has a mean.
2. All the values are included in computing the mean.
3. The mean is unique.
4. The sum of the deviations of each value from the mean is zero:

\[ \sum (X - \bar{X}) = 0 \]

Weakness of the Arithmetic Mean

- The arithmetic mean is extremely sensitive to extreme values. If one or two of the data values are extremely large or extremely small compared with the majority of the data, the mean might not be an appropriate average to represent the data.
- The arithmetic mean treats all individual observations equally. In a portfolio of stocks, for example, it is highly unlikely that all stocks carry the same weight and therefore have the same impact on the total performance of the portfolio.

The Arithmetic Mean of Grouped Data (LO3-6)

\[ \bar{x} = \frac{\sum fM}{n} \]

where \( M \) is the midpoint of each class, \( f \) is the frequency of each class, and \( n \) is the total number of observations.

Example – The Arithmetic Mean of Grouped Data

Recall that in Chapter 2 we constructed a frequency distribution for the Applewood Auto Group profit data for 180 vehicles sold. The information is repeated in the table. Determine the arithmetic mean profit per vehicle.

[Table: frequency distribution of profit for the 180 vehicles sold]

(B) Weighted Mean (LO3-2)

The weighted mean of a set of numbers \( X_1, X_2, \ldots, X_n \), with corresponding weights \( w_1, w_2, \ldots, w_n \), is computed with the following formula:

\[ \bar{X}_w = \frac{\sum wX}{\sum w} \]

Example – Weighted Mean

The Carter Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per hour. There are 26 hourly employees: 14 are paid at the $16.50 rate, 10 at the $19.00 rate, and 2 at the $25.00 rate. What is the mean hourly rate paid for the 26 employees?
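For the Carter Construction example, the weighted mean can be worked out directly from the formula, with the hourly rates as the values and the number of employees at each rate as the weights. The Python sketch below is one way to do the arithmetic; the function name `weighted_mean` and the variable names are illustrative choices, not part of the original slides.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Carter Construction: hourly rates and number of employees at each rate
rates = [16.50, 19.00, 25.00]
employees = [14, 10, 2]

mean_rate = weighted_mean(rates, employees)
print(f"{mean_rate:.2f}")  # 18.12
```

Because 14 of the 26 employees earn the lowest rate, the weighted mean (about $18.12) sits well below the simple average of the three rates ($20.17).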
(C) The Geometric Mean (LO3-3)

- Useful in finding the average change of percentages, ratios, indexes, or growth rates over time.
- It has wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures, such as the GDP (Gross Domestic Product).
- The geometric mean will always be less than or equal to the arithmetic mean.

The Geometric Mean: Finding the Average Rate of Return over Time

EXAMPLE: The return on investment earned by Atkins Construction Company for four successive years was: 30 percent, 20 percent, -40 percent, and 200 percent. What is the geometric mean rate of return on investment?

Expressing each return as a growth factor (1.3, 1.2, 0.6, and 3.0):

\[ GM = \sqrt[4]{(1.3)(1.2)(0.6)(3.0)} - 1 = \sqrt[4]{2.808} - 1 = 0.294 \]

The geometric mean rate of return is 29.4%.

The Geometric Mean: Finding an Average Percent Change over Time

EXAMPLE: The population of Las Vegas, Nevada, increased from 258,295 in 1990 to 584,539 in 2011. This is an increase of 326,244 people, or a 126.3 percent increase over the period. What is the average annual increase?

There are 21 annual periods between 1990 and 2011, so:

\[ GM = \sqrt[21]{\frac{584{,}539}{258{,}295}} - 1 = 0.0397 \]

The population increased at a rate of 3.97% per year.

Exercises 27, 28, 31, 32. Answer shown: 3.83%, found by

\[ \sqrt[5]{\frac{151{,}812{,}000}{125{,}821{,}000}} - 1 \]

Second: The Median (LO3-1)

MEDIAN: The midpoint of the values after they have been ordered from the minimum to the maximum values.

Properties of the median:
1. There is a unique median for each data set.
2. It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
3. It can be computed for ratio-level, interval-level, and ordinal-level data.
4. It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.

Examples – Median

The ages for a sample of five college students are: 21, 25, 19, 20, 22. Arranging the data in ascending order gives: 19, 20, 21, 22, 25. Thus the median is 21.

The heights of four basketball players, in inches, are: 76, 73, 80, 75.
Arranging the heights in ascending order gives: 73, 75, 76, 80. Thus the median is 75.5, the average of the two middle values.

Note that:
- If there is an odd number of values, the median is the middle value, with the same number of values below and above it.
- If there is an even number of values, the median is found by adding the middle pair together and dividing by two.

Third: The Mode (LO3-1)

MODE: The value of the observation that appears most frequently.

Note: The mode is especially useful in summarizing nominal-level data.

Example – Mode

Using the data measuring the distance in miles between exits on I-75 through Kentucky, what is the modal distance? Organize the distances into a frequency table and select the distance with the highest frequency.

The Relative Positions of the Mean, Median, and Mode

Note: If the distribution is highly skewed, the mean would not be a good measure to use. The median and mode would be more representative.

[2] Dispersion (LO3-4)

- A measure of location, such as the mean or the median, does not tell us anything about the spread of the data.
- For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth.
  - A small value for a measure of dispersion indicates that the data are clustered closely, say, around the mean. The mean is therefore considered representative of the data.
  - A large value for a measure of dispersion indicates that the mean is not a reliable representative of the data.
- A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.

Suppose, for example, that an LCD computer monitor is assembled in both Baton Rouge and Tucson.
The arithmetic mean hourly output in both the Baton Rouge plant and the Tucson plant is 50. Based on the two means, we might conclude that the distributions of the hourly outputs are identical. But production records reveal that this conclusion is not correct (see Chart 3–6).
- Baton Rouge production varies from 48 to 52 assemblies per hour.
- Production at the Tucson plant is more erratic, ranging from 40 to 60 per hour.
Therefore, the hourly output for Baton Rouge is clustered near the mean of 50; the hourly output for Tucson is more dispersed.

Measures of Dispersion (LO3-4)
- Range
- Variance
- Standard Deviation

Example – Range

The numbers of cappuccinos sold at the Starbucks location in the Orange County Airport between 4 and 7 p.m. for a sample of 5 days last year were 20, 40, 50, 60, and 80. Determine the range for the number of cappuccinos sold.

Range = Maximum value – Minimum value = 80 – 20 = 60

Limitation of the range: It is based on just two values, the maximum and the minimum; it does not take all of the values into consideration. The variance does.

Variance and Standard Deviation

VARIANCE: The arithmetic mean of the squared deviations from the mean.
STANDARD DEVIATION: The square root of the variance.

- The variance and standard deviation are nonnegative and are zero only if all observations are the same.
- For populations whose values are near the mean, the variance and standard deviation will be small.
- For populations whose values are dispersed from the mean, the variance and standard deviation will be large.
- The variance overcomes the weakness of the range by using all the values in the population.

Computing the Variance

Steps in computing the variance:
Step 1: Find the mean.
Step 2: Find the difference between each observation and the mean, and square that difference.
Step 3: Sum all the squared differences found in Step 2.
Step 4: Divide the sum of the squared differences by the number of items in the population.

Example – Variance and Standard Deviation

The number of traffic citations (tickets for traffic offenses) issued during each of the last twelve months in Beaufort County, South Carolina, is reported below:

[Table: monthly citation counts]

What is the population variance?

Step 1: Find the mean.

\[ \mu = \frac{\sum X}{N} = \frac{19 + 17 + \cdots + 34 + 10}{12} = \frac{348}{12} = 29 \]

Step 2: Find the difference between each observation and the mean of 29, and square that difference.
Step 3: Sum all the squared differences found in Step 2.
Step 4: Divide the sum of the squared differences by the number of items in the population.

\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1{,}488}{12} = 124 \]

Sample Variance

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]

Example – Sample Variance

The hourly wages for a sample of part-time employees at Home Depot are: $12, $20, $16, $18, and $19. The sample mean is $17. What is the sample variance?

Solution: The squared deviations from the mean are 25, 9, 1, 1, and 4, which sum to 40, so

\[ s^2 = \frac{40}{5 - 1} = 10 \]

in dollars squared.

Sample Standard Deviation

\[ s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]

where:
\( s^2 \) is the sample variance
\( x \) is the value of each observation in the sample
\( \bar{x} \) is the mean of the sample
\( n \) is the number of observations in the sample

Self Review

[Figure: self-review exercise]

Example – Standard Deviation of Grouped Data (LO3-6)

Refer to the frequency distribution for the Applewood Auto Group data used earlier. Compute the standard deviation of the vehicle profits.

Chebyshev’s Theorem (LO3-5)

For any set of observations (sample or population), the proportion of the values that lie within \( k \) standard deviations of the mean is at least \( 1 - 1/k^2 \), where \( k \) is any value greater than 1.

Example

Dupree Paint Company employees contribute a mean of $51.54 to the company’s profit-sharing plan every two weeks. The standard deviation of biweekly contributions is $7.51. At least what percent of the contributions lie within plus 3.5 standard deviations and minus 3.5 standard deviations of the mean, that is, between $25.26 and $77.83?
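One way to work this out: Chebyshev’s theorem guarantees that at least 1 − 1/k² of the observations lie within k standard deviations of the mean, for any k greater than 1. The Python sketch below applies that bound to the Dupree Paint figures with k = 3.5; the function name `chebyshev_lower_bound` and the variable names are illustrative, not taken from the slides.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


# Dupree Paint Company: biweekly profit-sharing contributions
mean = 51.54
sd = 7.51
k = 3.5

low = mean - k * sd    # lower end of the interval, in dollars
high = mean + k * sd   # upper end of the interval, in dollars
bound = chebyshev_lower_bound(k)

print(f"{bound:.1%}")  # 91.8%
```

So at least about 92 percent of the contributions lie between roughly $25.26 and $77.83, whatever the shape of the distribution.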
The Empirical Rule (LO3-5)

For a symmetrical, bell-shaped distribution, about 68% of the observations lie within plus and minus one standard deviation of the mean, about 95% within plus and minus two standard deviations, and practically all (99.7%) within plus and minus three standard deviations.

Example

A sample of the monthly rental rates at University Park Apartments approximates a symmetrical, bell-shaped distribution. The sample mean is $500; the standard deviation is $20. Using the Empirical Rule, answer these questions:
1. About 68% of the monthly rentals are between what two amounts?
2. About 95% of the monthly rentals are between what two amounts?
3. Almost all of the monthly rentals are between what two amounts?

Solution
1. About 68% are between 500 – 20 and 500 + 20, that is, $480 and $520.
2. About 95% are between 500 – 2(20) and 500 + 2(20), that is, $460 and $540.
3. Almost all (99.7%) are between 500 – 3(20) and 500 + 3(20), that is, $440 and $560.

Exercise answers

55. About 69%, found by

\[ 1 - \frac{1}{(1.8)^2} \]

56. About 84%. Each income level lies 2.5 standard deviations from the mean. Then

\[ 1 - \frac{1}{(2.5)^2} = 0.84 \]

TRY & CHECK (1)
Here are six cards. There is a number on each card. Two of the numbers are hidden.

4, 5, ?, 6, 3, ?

The mode of the six numbers is 4.
The mean of the six numbers is 5.
Work out the two numbers that are hidden.

TRY & CHECK (2)
[Figure: exercise]

TRY & CHECK (3)
[Figure: exercise]

TRY & CHECK (4)
Three numbers have a mode of 6 and a mean of 7. Find the three numbers.

TRY & CHECK (5)
Mark ran a mean distance of 13.2 km in five days. The next day Mark ran 20 km. Find the mean distance Mark ran in the six days.
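Answers to exercises like these can be checked numerically. As a sketch, the Python snippet below uses the standard-library `statistics` module (with `geometric_mean`, available from Python 3.8) to reproduce results from worked examples earlier in this chapter: the student ages and player heights (median), the cappuccino sales (range), the Home Depot wages (sample variance and standard deviation), and the Atkins Construction returns (geometric mean). The variable names are illustrative.

```python
import statistics

# Data taken from the chapter's worked examples
ages = [21, 25, 19, 20, 22]                # college students
heights = [76, 73, 80, 75]                 # basketball players, inches
cappuccinos = [20, 40, 50, 60, 80]         # daily sales
wages = [12, 20, 16, 18, 19]               # Home Depot hourly wages, dollars
growth_factors = [1.30, 1.20, 0.60, 3.00]  # Atkins returns expressed as 1 + r

median_ages = statistics.median(ages)               # 21
median_heights = statistics.median(heights)         # 75.5
sales_range = max(cappuccinos) - min(cappuccinos)   # 60

# variance() and stdev() divide by n - 1 (sample formulas);
# pvariance() and pstdev() divide by N (population formulas).
sample_var = statistics.variance(wages)             # 10
sample_sd = statistics.stdev(wages)                 # about 3.16

# Geometric mean rate of return: mean growth factor minus 1
avg_return = statistics.geometric_mean(growth_factors) - 1  # about 0.294

print(median_ages, median_heights, sales_range, sample_var)
print(round(sample_sd, 2), round(avg_return, 3))
```

The same calls can be pointed at any small data set from the exercises, taking care to pick the sample or the population version of the variance as the question requires.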