Mathematics in the Modern World | 4. Data Management
Measures of Central Tendency
A measure of central tendency is a value that indicates where the center of a
distribution tends to be located, or simply the average of the data. It is said to form the
basis of statistics. The most common measures of central tendency are the mean, the
median, and the mode. In a perfectly normal distribution, all three measures of central
tendency are located at the same score, which is at the center of the distribution.
Mean
The mean is the most commonly used measure of central tendency. The mean of
a data set is the sum of the data points divided by the number of data points, or simply
the average of the data points. Thus, it is strongly influenced by outliers (data points
that are extremely low or extremely high compared to other data points). The
population mean, denoted by $\mu$, is estimated by the sample mean, denoted by $\bar{x}$:
$$\bar{x} = \frac{\sum x}{n}$$
where $x$ are the data points and $n$ is the number of data points.
Some characteristics of the mean are the following:
1. The sum of the deviations of the data points from the mean is zero. (A deviation is the
difference between a data point and a reference value, here the mean.)
2. The sum of the squared deviations of the data points is minimum when the
deviations are taken from the mean.
3. If a constant $c$ is added to (or subtracted from) every data point, the new mean is the
original mean increased (or decreased) by $c$.
4. If every data point is multiplied (or divided) by a constant $c$, the new mean is the
original mean multiplied (or divided) by $c$.
5. Since the mean is a calculated number, it may not be an actual value in the data
points.
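The first four characteristics can be checked numerically. The following is a minimal Python sketch; the sample scores, the constant (called `c` in the code), and the helper name `sum_sq` are illustrative assumptions, not data from the text.

```python
import math
from statistics import fmean

# Hypothetical sample data and constant, chosen only for illustration.
scores = [43, 35, 39, 47]
c = 5

mean = fmean(scores)

# Characteristic 1: deviations from the mean sum to zero.
sum_of_deviations = sum(x - mean for x in scores)
print(math.isclose(sum_of_deviations, 0.0, abs_tol=1e-9))


# Characteristic 2: the sum of squared deviations is smallest at the mean.
def sum_sq(center):
    return sum((x - center) ** 2 for x in scores)


print(sum_sq(mean) < sum_sq(mean + 1) and sum_sq(mean) < sum_sq(mean - 1))

# Characteristic 3: adding c to every data point raises the mean by c.
shifted_mean = fmean([x + c for x in scores])
print(math.isclose(shifted_mean, mean + c))

# Characteristic 4: multiplying every data point by c multiplies the mean by c.
scaled_mean = fmean([x * c for x in scores])
print(math.isclose(scaled_mean, mean * c))
```

Each line prints `True` for this data set. The comparisons use `math.isclose` because floating-point arithmetic may leave tiny rounding errors.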
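Example 1: The data below are the current diesel prices (in pesos/liter) in nearby gas
stations. Find the mean price.
43.80  44.10  42.95  43.80  44.30  39.00  44.30  43.80
Solution:
$$\bar{x} = \frac{\sum x}{n} = \frac{43.80 + 44.10 + 42.95 + 43.80 + 44.30 + 39.00 + 44.30 + 43.80}{8} = \frac{346.05}{8} \approx 43.26 \text{ pesos/liter}$$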
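The computation in Example 1 can be reproduced with Python's standard `statistics` module. This is only a sketch: the variable names are assumptions, and the second calculation (dropping the price of 39.00) is added here to illustrate the earlier remark that the mean is strongly influenced by outliers.

```python
from statistics import mean

# Diesel prices (pesos/liter) from Example 1.
prices = [43.80, 44.10, 42.95, 43.80, 44.30, 39.00, 44.30, 43.80]

mean_price = mean(prices)
print(round(mean_price, 2))  # 43.26

# Removing the unusually low price of 39.00 shows the pull of an outlier.
without_outlier = [p for p in prices if p != 39.00]
mean_without_outlier = mean(without_outlier)
print(round(mean_without_outlier, 2))  # 43.86
```

A single low price lowers the mean by about 0.60 pesos/liter in this data set.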
Example 2: Gabriel has a total of 4 quizzes. One quiz is missing while the scores of his
remaining quizzes are 43, 35 and 39. Calculate the score of the missing quiz if his
mean score is 41.
Solution:
Let $x$ denote Gabriel's score in his missing quiz.
$$\bar{x} = \frac{43 + 35 + 39 + x}{4} = 41$$
$$117 + x = 4(41) = 164$$
$$x = 47$$
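The same reasoning can be written as a short Python sketch. The function name `missing_score` is hypothetical, and the function assumes exactly one score is missing.

```python
def missing_score(known_scores, target_mean, total_count):
    """Return the one missing score that makes the mean equal target_mean."""
    # The mean times the number of scores gives the required total.
    required_total = target_mean * total_count
    return required_total - sum(known_scores)


# Example 2: three known quiz scores, a mean of 41 over 4 quizzes.
gabriel_missing = missing_score([43, 35, 39], 41, 4)
print(gabriel_missing)  # 47
```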
Example 3: In a class of 18 men and 22 women, the mean score of men in a quiz is 38
while the mean score of women is 35. Find the mean score of the whole class.
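Solution: Weight the mean of each group by the number of students in that group. Let
$n_m$ and $n_w$ be the numbers of men and women, and let $\bar{x}_m$ and $\bar{x}_w$ be
their mean scores.
$$\bar{x} = \frac{n_m \bar{x}_m + n_w \bar{x}_w}{n_m + n_w} = \frac{18(38) + 22(35)}{18 + 22} = \frac{1454}{40} = 36.35$$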
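Example 3 is a mean of group means weighted by group size. A minimal Python sketch follows; the function name `combined_mean` and the (size, mean) pair layout are assumptions made for illustration.

```python
def combined_mean(groups):
    """Return the overall mean from a list of (group_size, group_mean) pairs."""
    total_score = sum(size * group_mean for size, group_mean in groups)
    total_count = sum(size for size, _ in groups)
    return total_score / total_count


# Example 3: 18 men with mean 38 and 22 women with mean 35.
class_mean = combined_mean([(18, 38), (22, 35)])
print(class_mean)  # 36.35
```

Simply averaging the two group means would give 36.5, which is not the class mean because the groups have different sizes.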
Mean of Grouped Data
In grouped data, we do not know the individual data points. In such situations,
we use the midpoint of each interval to represent the individual scores in that interval.
Consequently, the mean of grouped data is only an approximation.
$$\bar{x} = \frac{\sum fx}{\sum f}$$
where $x$ is the midpoint of each interval and $f$ is the frequency of each interval.
Example 4: Find the mean score of 42 students from the following frequency
distribution:
Score      Frequency
16 - 23    11
24 - 31    13
32 - 39    7
40 - 47    3
48 - 55    2
56 - 63    6
Solution:
Step 1: Add two columns for the midpoint ($x$) and $fx$, and compute their values. The
midpoint is half of the sum of the lower limit and the upper limit of each interval, for
example $(16 + 23)/2 = 19.5$ (see the table below), while $fx$ is the product of the
frequency and the midpoint of each interval.
Step 2: Compute $\sum f$ and $\sum fx$.
Step 3: Use the formula $\bar{x} = \frac{\sum fx}{\sum f}$ to get the mean of the grouped
frequency distribution.
Score      Frequency (f)    Midpoint (x)    fx
16 - 23    11               19.5            11(19.5) = 214.5
24 - 31    13               27.5            13(27.5) = 357.5
32 - 39    7                35.5            248.5
40 - 47    3                43.5            130.5
48 - 55    2                51.5            103.0
56 - 63    6                59.5            357.0
Total      $\sum f$ = 42                    $\sum fx$ = 1411
Finally,
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1411}{42} \approx 33.60$$
Note: The data in this example are those used in Illustration 5 of this chapter. The
reader is urged to compute the actual mean, which is 33.64. The difference shows that
the mean of grouped data is only an approximation of the actual mean.
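The three steps of Example 4 can be sketched in Python as follows. The variable names are assumptions; the intervals and frequencies are those of the example.

```python
# Class intervals (lower limit, upper limit) and frequencies from Example 4.
intervals = [(16, 23), (24, 31), (32, 39), (40, 47), (48, 55), (56, 63)]
frequencies = [11, 13, 7, 3, 2, 6]

# Step 1: midpoint of each interval and the product f * x.
midpoints = [(lower + upper) / 2 for lower, upper in intervals]
products = [f * x for f, x in zip(frequencies, midpoints)]

# Step 2: the two sums.
total_f = sum(frequencies)
total_fx = sum(products)

# Step 3: the mean of the grouped data.
grouped_mean = total_fx / total_f
print(total_f, total_fx, round(grouped_mean, 2))  # 42 1411.0 33.6
```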
Median
The median is the value that separates an ordered array of data points into two equal
parts. To find it, the data must first be arranged in numerical order. If there is an odd
number of data points, then the median is the middle value. If there is an even number
of data points, then the median is the average of the two middle values. The
median can be denoted by $\tilde{x}$.
Unlike the mean, the median is not affected by extreme values in the data set because
it only considers the middle values.
Example 5: Calculate the median age of the seven employees.
25  31  25  62  49  50  38
Solution: First, we need to arrange the data from lowest to highest.
25  25  31  38  49  50  62
Since there are 7 (odd) data points, the median is the middle value which is 38.
Example 6: The current crude oil prices (in pesos/liter) in nearby gas stations are listed
below. Find the median price.
43.80  44.10  42.95  43.80  44.30  39.00  44.30  43.90
Solution: Arrange the prices from lowest to highest.
39.00  42.95  43.80  43.80  43.90  44.10  44.30  44.30
Since there are 8 (even) data points, the median price is the average of the two middle
values, 43.80 and 43.90:
$$\text{Median} = \frac{43.80 + 43.90}{2} = 43.85 \text{ pesos/liter}$$
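The procedure used in Examples 5 and 6 (sort, then take the middle value or the average of the two middle values) can be sketched in Python. The function below is an illustrative assumption that expects a non-empty list; Python's standard `statistics.median` follows the same rule.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[middle]
    # Even count: the average of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


ages = [25, 31, 25, 62, 49, 50, 38]                                 # Example 5
prices = [43.80, 44.10, 42.95, 43.80, 44.30, 39.00, 44.30, 43.90]   # Example 6

print(median(ages))               # 38
print(round(median(prices), 2))   # 43.85
```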
Mode
The mode of a data set is the data point that occurs most often. If no data point is
repeated or every data point is repeated the same number of times, there is no mode.
If the mode of a data set exists, it may not be unique. A unimodal data set has one
mode, bimodal has two modes, trimodal has three modes and multimodal has many
modes. The mode can be used for qualitative as well as quantitative data.
The mode is not affected by the extreme values in the data set, since it only considers
the most frequent data points. The mode can be denoted by $\hat{x}$.
Example 7: Find the mode of each of the following data sets:
a. 1, 2, 3, 4, 5, 6, 7, 8
b. 1, 2, 3, 4, 1, 2, 3, 4
c. 5, 8, 4, 8, 6, 7, 5, 3
Solution:
a. There is no mode because no data point is repeated.
b. There is no mode because every data point occurs exactly twice.
c. The modes are 5 and 8, since each occurs twice while every other data point occurs
only once. The data set is bimodal.
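The convention used in Example 7 can be sketched in Python with `collections.Counter`. The function name `modes` is hypothetical, and it assumes a non-empty data set. It follows the text's rule that there is no mode when every data point occurs the same number of times.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes, or an empty list if there is no mode."""
    counts = Counter(values)
    # If every distinct value occurs equally often, the text says there is no mode.
    if len(set(counts.values())) == 1:
        return []
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 3, 4, 5, 6, 7, 8]))  # []
print(modes([1, 2, 3, 4, 1, 2, 3, 4]))  # []
print(modes([5, 8, 4, 8, 6, 7, 5, 3]))  # [5, 8]
```

Python's standard `statistics.multimode` uses a different convention: for data sets (a) and (b) it returns every value rather than reporting no mode.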
Example 8: Thirty students are asked about their favorite color. The data is summarized
by the frequency distribution table below. Find the mode.
Color     Frequency
Yellow    2
Blue      5
Red       5
White     8
Black     10
The mode is black, since it has the highest frequency.
In some situations, the measures of central tendency cannot provide enough
information to lead to a valid conclusion, especially when two or more sets of
data need to be compared. The following example illustrates this weakness of the mean,
median, and mode.
Suppose that we are choosing between Jerico and Jerwin to represent
CLSU in an upcoming Inter-University Math Quiz Bee. To choose, their coach conducted
six quiz-like practice sessions between them, which produced the following scores:
          Quiz 1    Quiz 2    Quiz 3    Quiz 4    Quiz 5    Quiz 6
Jerico    83        65        100       92        85        85
Jerwin    81        85        74        85        90        95
After the 6 quizzes, Jerico and Jerwin were tied at 3 wins and 3 losses each. Who
should be chosen? Consider their averages (the reader may verify these values):
          Mean    Median    Mode
Jerico    85      85        85
Jerwin    85      85        85
Surprisingly, they are again tied in these measures. The mean, median, and the
mode cannot help in deciding on who should be sent to the Quiz Bee!
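The tie can be verified with Python's standard `statistics` module. This is a sketch for checking the figures above; the variable names are assumptions.

```python
from statistics import mean, median, mode

# Quiz scores from the table above, Quiz 1 through Quiz 6.
jerico = [83, 65, 100, 92, 85, 85]
jerwin = [81, 85, 74, 85, 90, 95]

for name, scores in (("Jerico", jerico), ("Jerwin", jerwin)):
    print(name, mean(scores), median(scores), mode(scores))

# Count the quizzes in which each student had the higher score.
jerico_wins = sum(a > b for a, b in zip(jerico, jerwin))
jerwin_wins = sum(b > a for a, b in zip(jerico, jerwin))
print(jerico_wins, jerwin_wins)  # 3 3
```

All three measures come out to 85 for both students, so none of them separates the two.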
Another factor that could help is their consistency, that is, how spread apart or
dispersed their scores are. This is what a measure of variability describes.
Measures of Variability
A measure of variability (or dispersion) is a quantity that measures the spread of
scores in a given population. It indicates the extent to which observations in a data set
are scattered about the mean. Scores that are relatively close together have a lower
variation than scores that are spread farther apart. The most common measures of
variability are the range, the variance, and the standard deviation.
Range
The range, denoted by $R$, is the difference between the highest and the lowest
values in a data set. A weakness of the range is that an extreme value (outlier) can
greatly alter its value.
$$R = \text{Highest Value} - \text{Lowest Value}$$
For example, Jerico's range is 100 - 65 = 35, while Jerwin's range is 95 - 74 = 21.
This indicates that the scores of Jerico are more spread apart.
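The two ranges can be computed with a short Python sketch. The function name `value_range` is hypothetical; the scores are those of Jerico and Jerwin.

```python
def value_range(scores):
    """Return the highest value minus the lowest value."""
    return max(scores) - min(scores)


jerico = [83, 65, 100, 92, 85, 85]
jerwin = [81, 85, 74, 85, 90, 95]

print(value_range(jerico))  # 35
print(value_range(jerwin))  # 21
```

Because the range depends only on the two extreme scores, a single unusual score such as Jerico's 65 changes it considerably.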