Measures of Central Tendency
Descriptive Statistics

A measure of central tendency is a value that represents a typical, or central, entry of a data set. The three most commonly used measures of central tendency are the mean, the median, and the mode.

The Mean

The mean (arithmetic average) of a data set is the sum of the data entries divided by the number of entries. To find the mean of a data set, use one of the following formulas.

Population (parameter) mean:
\[ \mu = \frac{\sum x}{N} \]

Sample (statistic) mean:
\[ \bar{x} = \frac{\sum x}{n} \]

The lowercase Greek letter μ (pronounced "mu") represents the population mean, and \(\bar{x}\) (read as "x bar") represents the sample mean. Note that N represents the number of entries in a population and n represents the number of entries in a sample.

Finding a Sample Mean

The prices (in dollars) for a sample of room air conditioners are listed. What is the mean price of the air conditioners?

$500 $840 $470 $480 $420 $440 $440

The sum of the seven prices is 500 + 840 + 470 + 480 + 420 + 440 + 440 = 3590, so the sample mean is 3590 / 7 ≈ 512.9. The mean price is about $512.90.

General Rounding Rule: In statistics, the basic rounding rule is that rounding should not be done until the final answer is calculated. When rounding is done in the intermediate steps, it tends to increase the difference between the reported answer and the exact one.

The Median

The median of a data set is the value that lies in the middle of the data when the data set is ordered. If the data set has an odd number of entries, the median is the middle data entry. If the data set has an even number of entries, the median is the mean of the two middle data entries.

Finding the Median

Find the median of the air conditioner prices given in the previous example. First order the data:

$420 $440 $440 $470 $480 $500 $840

Because there are seven entries (an odd number), the median is the middle, or fourth, data entry.
The median air conditioner price is therefore $470.

What if we added a $600 air conditioner to the data? Find the median of the air conditioner prices in this case.

$420 $440 $440 $470 $480 $500 $600 $840

Because there are now eight entries (an even number), the median is the mean of the two middle entries, the fourth and the fifth. Add the two middle entries and divide by 2 to find the median air conditioner price:

(470 + 480) / 2 = $475

The Mode

The mode of a data set is the data entry that occurs with the greatest frequency. A data set with one mode is unimodal. If no entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, each entry is a mode and the data set is called bimodal. A data set with more than two modes is multimodal. The mode is the only measure of central tendency that can be used to describe data at the nominal level of measurement.

Finding the Mode

Find the mode of the air conditioner prices in the previous example.

$420 $440 $440 $470 $480 $500 $840

From the ordered data, you can see that the entry 440 occurs twice, whereas the other data entries occur once. So the mode of the air conditioner prices is $440.

Although the mean, the median, and the mode each describe a typical entry of a data set, there are advantages and disadvantages to using each, especially when the data set contains outliers. An outlier is a data entry that is far removed from the other entries in the data set.

Comparing the Mean, the Median, and the Mode

Find the mean, the median, and the mode of the sample ages of a class listed below. Which measure of central tendency best describes a typical entry of this data set? Are there any outliers?

Mean = 475 / 20 = 23.75 ≈ 23.8
Median = (21 + 22) / 2 = 21.5
Mode = 20 (the entry that occurs most often)
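The worked results so far can be double-checked with a short Python sketch. It assumes only the standard-library statistics module (multimode needs Python 3.8 or later); the variable names are illustrative labels for the data sets used in these examples, not part of the original material.

```python
import statistics

# Air conditioner prices from the sample-mean, median, and mode examples.
prices = [500, 840, 470, 480, 420, 440, 440]
# The same prices after adding a $600 entry (eight entries, an even count).
prices_with_600 = prices + [600]
# Sample ages of the 20 students in the class example.
ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]

# Mean: sum of the entries divided by the number of entries.
price_mean = statistics.mean(prices)          # 3590 / 7
# Median: middle entry (odd count) or mean of the two middle entries (even count).
price_median = statistics.median(prices)
price_median_even = statistics.median(prices_with_600)
# Mode: entry that occurs with the greatest frequency.
price_mode = statistics.mode(prices)

age_mean = statistics.mean(ages)              # 475 / 20
age_median = statistics.median(ages)
age_modes = statistics.multimode(ages)        # list of every mode

# Effect of the outlier: recompute the mean and median without the entry 65.
ages_no_outlier = [a for a in ages if a != 65]
age_mean_no_outlier = statistics.mean(ages_no_outlier)
age_median_no_outlier = statistics.median(ages_no_outlier)

print(f"prices: mean={price_mean:.1f} median={price_median} "
      f"median with 600={price_median_even} mode={price_mode}")
print(f"ages: mean={age_mean:.2f} median={age_median} modes={age_modes}")
print(f"ages without 65: mean={age_mean_no_outlier:.2f} "
      f"median={age_median_no_outlier}")
```

Dropping the single entry 65 moves the mean of the ages from 23.75 to about 21.58, while the median only moves from 21.5 to 21, which illustrates how much more strongly one extreme entry pulls on the mean.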
Ages in a Class

20 20 20 20 20 20 21 21 21 21
22 22 22 23 23 23 23 24 24 65

Interpretation: The mean takes every entry into account but is influenced by the outlier of 65. The median also takes every entry into account, and it is not affected by the outlier. In this case the mode exists, but it does not appear to represent a typical entry.

Graphical Comparison

Sometimes a graphical comparison can help you decide which measure of central tendency best represents a data set. In this case the median best describes the data set.

Midrange

The midrange is a rough estimate of the middle. It is found by adding the lowest and the highest values in the data set and dividing by 2. It is a very rough estimate of the average and can be affected by one extremely high or low value. For the class ages above, the midrange is (20 + 65) / 2 = 42.5, which is far from most of the entries.

Weighted Mean and Mean of Grouped Data

Sometimes data sets contain entries that have a greater effect on the mean than do other entries. To find the mean of such data sets, you must find the weighted mean. A weighted mean is the mean of a data set whose entries have varying weights. A weighted mean is given by

\[ \bar{x} = \frac{\sum (x \cdot w)}{\sum w} \]

where w is the weight of each entry x.

Finding a Weighted Mean

You are taking a class in which your grade is determined from five sources: 50% from your test mean, 15% from your nine weeks test, 20% from your semester exam, 10% from your computer lab work, and 5% from your homework. Your scores are 86 (test mean), 96 (nine weeks test), 82 (semester exam), 98 (computer lab work), and 100 (homework). What is the weighted mean of your scores?

Source            Score, x   Weight, w    x·w
Test mean         86         0.50         43.0
Nine weeks test   96         0.15         14.4
Semester exam     82         0.20         16.4
Computer lab      98         0.10          9.8
Homework          100        0.05          5.0
Total                        Σw = 1.00    Σ(x·w) = 88.6

The weighted mean is 88.6 / 1 = 88.6.
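The weighted mean computation from the grade example above can be sketched in Python as follows. This is a minimal illustration rather than a prescribed method; the function name weighted_mean and the dictionary grade_components are hypothetical names chosen for this sketch.

```python
# Each entry pairs a score x with its weight w, taken from the grade example.
grade_components = {
    "test mean": (86, 0.50),
    "nine weeks test": (96, 0.15),
    "semester exam": (82, 0.20),
    "computer lab": (98, 0.10),
    "homework": (100, 0.05),
}


def weighted_mean(pairs):
    """Return sum(x * w) / sum(w) for an iterable of (x, w) pairs."""
    pairs = list(pairs)
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in pairs) / total_weight


grade = weighted_mean(grade_components.values())
print(f"weighted mean = {grade:.1f}")
```

Because the function divides by the sum of the weights, the weights do not have to add up to 1; entering them as 50, 15, 20, 10, and 5 would give the same weighted mean.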
When data are presented in a frequency distribution, the mean for a sample is approximated by

\[ \bar{x} = \frac{\sum (x \cdot f)}{n}, \qquad n = \sum f \]

where x and f are the midpoints and frequencies of a class, respectively. For a population, the corresponding formula is

\[ \mu = \frac{\sum (x \cdot f)}{N}, \qquad N = \sum f \]

Finding the Mean of a Frequency Distribution

Use the frequency distribution below to approximate the mean number of minutes that a sample of Internet subscribers spent online during their most recent session.

Class midpoint, x   Frequency, f   x·f
12.5                6               75.0
24.5                10             245.0
36.5                13             474.5
48.5                8              388.0
60.5                5              302.5
72.5                6              435.0
84.5                2              169.0
Total               n = 50         Σ(x·f) = 2089.0

The mean is approximately 2089.0 / 50 ≈ 41.8 minutes.

The Shapes of Distributions

A graph reveals several characteristics of a frequency distribution. One such characteristic is the shape of the distribution.

A frequency distribution is symmetric when a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are approximately mirror images.

A frequency distribution is uniform (or rectangular) when all entries, or classes, in the distribution have equal frequencies. A uniform distribution is also symmetric.

A frequency distribution is skewed if the "tail" of the graph elongates more to one side than to the other. A distribution is skewed left (negatively skewed) if its tail extends to the left. A distribution is skewed right (positively skewed) if its tail extends to the right.

When a distribution is symmetric and unimodal, the mean, the median, and the mode are equal. If a distribution is skewed left, the mean is less than the median and the median is usually less than the mode. If a distribution is skewed right, the mean is greater than the median and the median is usually greater than the mode.

The mean typically falls in the direction the distribution is skewed. For instance, when the distribution is skewed left, the mean is to the left of the median.
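As a rough check that ties the frequency distribution example above to the discussion of shape, the following Python sketch recomputes the grouped mean as the sum of x·f divided by n and compares it with the midpoint of the most frequent class. The variable names are illustrative, and using the modal class midpoint as a stand-in for the mode is an assumption of this sketch, not something stated in the original example.

```python
# Class midpoints x and frequencies f from the Internet-minutes example.
midpoints = [12.5, 24.5, 36.5, 48.5, 60.5, 72.5, 84.5]
frequencies = [6, 10, 13, 8, 5, 6, 2]

# n is the sum of the frequencies; sum_xf is the sum of midpoint * frequency.
n = sum(frequencies)
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))

# Approximate sample mean of the grouped data.
grouped_mean = sum_xf / n

# Midpoint of the class with the greatest frequency (the modal class).
modal_midpoint = midpoints[frequencies.index(max(frequencies))]

print(f"n = {n}, sum of x*f = {sum_xf}, mean = {grouped_mean:.1f}")
print(f"modal class midpoint = {modal_midpoint}")
```

Here the approximate mean (about 41.8 minutes) lies above the modal class midpoint (36.5 minutes), which is consistent with a distribution whose tail extends to the right.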