CHAPTER 1
Measures of Central Tendency

1.1 INTRODUCTION: STATISTICS
Statistics is the branch of science that deals with collecting, organising, summarising, presenting
and analysing data, and with drawing valid conclusions and making reasonable decisions on the basis of
such analysis. The word statistics refers to numerical facts relating to any phenomenon in the social
sciences or exact sciences. Facts and figures pertaining to population, production, national income,
profits, sales, bank rates, family patterns, dowry system, animal kingdom, plant life and bacteria all
constitute statistics. The word ‘statistics’ seems to have been derived from either the Latin word
‘status’ or the Italian word ‘statista’, both meaning ‘a political state’.
The word ‘statistics’ is presently used in two distinct senses. As a plural noun, it means an
aggregate or collection of numerical or quantitative expressions of facts, i.e., ‘numerical data’ or
simply ‘data’. As a singular noun, it means a body of principles and methods used in the collection,
presentation, analysis and interpretation of numerical data.
Bowley defines statistics in one context as the science of counting; the emphasis here is only on the
collection of data. At another place he says: “Statistics may rightly be called the science of averages.”
Boddington defines statistics as “the science of estimates and probabilities”. According to Lovitt, “the
science of statistics deals with the collection, classification and tabulation of numerical facts.”
1.2 AVERAGES OR MEASURES OF CENTRAL TENDENCY
An average is a value which is representative of a set of data. Generally, it is difficult for the human
mind to remember a huge and unwieldy set of numerical data, to understand its complications and to
compare it easily with other data. Statistical methods help us to make the data simple, precise and
understandable. In this respect ‘statistical averages’ play an important role. In other words, “A
statistical average is a simple and precise value that is used to represent all the values of a
statistical series”.
Some standard definitions are as follows:
1. According to A.E. Waugh, “An average is a single value selected from a group of values to
represent them in some way”.
2. According to Croxton and Cowden, “An average is a single value within the range of the data
that is used to represent all the values in the series”.
2 Business Statistics
In this definition, two points are significant:
(a) Average is a value that lies within the range of data.
(b) Its objective is to find a representative value of a statistical series.
3. Simpson and Kafka state, “A measure of central tendency is a typical value around which other
figures congregate.”
1.3 IMPORTANCE OF AVERAGES
In the view of Prof. Tippet, “The average has its limitations, but if they are recognised, there is no
single statistical quantity more valuable than average.” The average plays an important role in
statistical science. In modern practical life averages are required at each step. We usually discuss and
refer to average income, average cost, average run rate, average price, etc. Dr. Bowley even called
statistics “the science of averages.”
1.4 REQUISITES FOR A GOOD AVERAGE
Following are the requisites of a good average:
(a) It should be simple to understand and locate.
(b) It should not be affected much by extreme observations.
(c) It should be based on all the items or observations.
(d) It should be rigidly defined so that the conclusion remains uniform irrespective of enumeration
by any person.
(e) It should be suitable for mathematical treatment.
(f) It should be least affected by fluctuations of sampling.
(g) An average should be such that it represents maximum features of a statistical series.
(h) An average should be in terms of absolute value, i.e., it should not be expressed in terms of
percentage or any other relative measurement.
1.5 TYPES OF AVERAGES
There are different types of statistical averages.
1. Averages of position:
(a) Median
(b) Mode
2. Mathematical averages
(a) Arithmetic mean
(b) Geometric mean
(c) Harmonic mean
(d) Quadratic mean
3. Business averages:
(a) Moving average
(b) Composite average
(c) Progressive average
According to the syllabus, in this chapter we discuss in detail only the averages of position and the
mathematical averages (i.e., Mean, Median, Mode, Geometric mean, Harmonic mean).
1.6 AVERAGES OF POSITION
1.6.1 Median
The median of a distribution is the value of the variable which divides it into two equal parts.
According to Connor, “The median is that value of the variable which divides the group into two equal
parts, one part comprising all values greater and the other all values less than the median”. Also,
according to Secrist, “Median of a series is the value of that item, actual or estimated, when a series
is arranged in order of magnitude, which divides the distribution in two parts.” The median is generally
denoted by Me.
1.6.1.1 Calculation of median in individual series
In the case of ungrouped data, if the number of observations is odd, then the median is the middle value
after the values have been arranged in ascending or descending order of magnitude, i.e., if the total
number of items is n (odd), then the median is the value of the $\left(\frac{n+1}{2}\right)$th item.
In the case of an even number of observations there are two middle items, and the mean of the values of
the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th items is defined as the median.
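The rule for an individual series can be sketched in a few lines of Python. This is only an illustration: the function name `median_individual` is our own, and the nine items used in the call are those of Example 1 below.

```python
def median_individual(values):
    """Median of an individual (ungrouped) series."""
    ordered = sorted(values)              # arrange in ascending order of magnitude
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        # odd n: the ((n + 1) / 2)th item, which is index `mid` when counting from 0
        return ordered[mid]
    # even n: mean of the (n / 2)th and (n / 2 + 1)th items
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_individual([5, 7, 9, 12, 10, 8, 7, 15, 21]))   # 9
```

Sorting first means the caller does not have to arrange the items beforehand.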
Example 1: Find out the median of the following items:
5, 7, 9, 12, 10, 8, 7, 15, 21
Solution: Items arranged in ascending order of magnitude

Serial no.       1   2   3   4   5   6   7   8   9
Size of items    5   7   7   8   9  10  12  15  21

n, the number of items = 9 (odd)
$$Me = \left(\frac{n+1}{2}\right)\text{th item} = \left(\frac{9+1}{2}\right)\text{th item} = 5\text{th item}$$
Me = 9. Ans.
Example 2: The following table gives the Economic Advisors index numbers of wholesale prices:
Year             1947    1948    1949    1950    1951    1952
Index numbers   297.4   367.1   381.1   400.7   439.3   386.9
Solution: Index numbers arranged in ascending order of magnitude

Serial no.          1       2       3       4       5       6
Size of items   297.4   367.1   381.1   386.9   400.7   439.3

n, the number of items = 6 (even)
then $\left(\frac{n}{2}\right)$th item = 3rd item = 381.1
and $\left(\frac{n}{2}+1\right)$th item = 4th item = 386.9
Therefore
$$\text{Median} = \frac{381.1 + 386.9}{2} = \frac{768.0}{2} = 384.0$$
Me = Median = 384. Ans.
1.6.1.2 Calculation of median in discrete series
In the case of a discrete frequency distribution (grouped data), the median is obtained from the
cumulative frequencies. First check that the values of the item (x) are arranged in order; if they are
not, arrange them in ascending order, keeping each value with its own frequency. The steps for
calculating the median are:
(a) Find $\frac{\sum f + 1}{2}$.
(b) See the cumulative frequency just greater than $\frac{\sum f + 1}{2}$.
(c) The corresponding value of x is the median.
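Steps (a) to (c) can be sketched in Python as follows. The function name `median_discrete` is an assumption of this sketch, and the marks and frequencies in the call are those of Example 4 below. The comparison uses "greater than or equal to", so that an item falling exactly on a cumulative frequency is still counted in that class; this is our reading of step (b), not a rule stated above.

```python
def median_discrete(sizes, freqs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    # keep each value with its own frequency while arranging x in ascending order
    pairs = sorted(zip(sizes, freqs))
    total = sum(f for _, f in pairs)
    if total == 0:
        raise ValueError("total frequency must be positive")
    position = (total + 1) / 2            # step (a)
    running = 0                           # cumulative frequency so far
    for x, f in pairs:
        running += f
        if running >= position:           # step (b): first c.f. reaching the position
            return x                      # step (c)


print(median_discrete([32, 45, 62, 78, 80], [8, 15, 10, 4, 2]))   # 45
```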
Example 3: The following data relate to size of shoes sold at a store during a given week. Find the
median.
Size of shoes    No. of pairs
 4.5                1
 5.0                2
 5.5                4
 6.0                5
 6.5               15
 7.0               30
 7.5               60
 8.0               95
 8.5               82
 9.0               75
 9.5               44
10.0               25
10.5               15
11.0                5
Solution:

Size of shoes    No. of pairs    c.f.
 4.5                1               1
 5.0                2               3
 5.5                4               7
 6.0                5              12
 6.5               15              27
 7.0               30              57
 7.5               60             117
 8.0               95             212
 8.5               82             294
 9.0               75             369
 9.5               44             413
10.0               25             438
10.5               15             453
11.0                5             458

$$\frac{\sum f + 1}{2} = \frac{458 + 1}{2} = \frac{459}{2} = 229.5$$
The 229.5th item lies in c.f. 294. Its corresponding value is 8.5, so
Me = 8.5. Ans.
Example 4: Calculate median from the following frequency distribution:
Marks              32   45   62   78   80
No. of students     8   15   10    4    2
Solution:

Marks    No. of students    c.f.
32          8                 8
45         15                23
62         10                33
78          4                37
80          2                39

$$\frac{\sum f + 1}{2} = \frac{39 + 1}{2} = \frac{40}{2} = 20$$
The 20th item lies in c.f. 23. Its corresponding value is 45, so
Me = 45. Ans.
1.6.1.3 Calculation of median in continuous series
In the case of a continuous frequency distribution, the class corresponding to the cumulative frequency
just greater than $\frac{\sum f}{2}$ is called the median class, and the value of the median is obtained
by the formula
$$\text{Median} = l + \left(\frac{\frac{\sum f}{2} - c}{f}\right) i$$
where,
l is the lower limit of the median class.
f is the frequency of the median class.
i is the width of the class interval.
c is the c.f. of the class preceding the median class.
Σf is the total frequency of the data.
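The formula can be tried out with a short Python sketch. It assumes exclusive classes of equal width given by their lower limits, and the name `median_continuous` is our own; the figures in the call are the daily-wage data of Example 5 below.

```python
def median_continuous(lower_limits, width, freqs):
    """Median = l + ((sum(f)/2 - c) / f) * i for equal-width exclusive classes."""
    half = sum(freqs) / 2                 # sum(f) / 2
    c = 0                                 # c.f. of the classes before the current one
    for l, f in zip(lower_limits, freqs):
        if c + f > half:                  # c.f. just greater than sum(f)/2: median class
            return l + (half - c) * width / f
        c += f
    raise ValueError("frequencies must have a positive total")


wages = [0, 10, 20, 30, 40, 50, 60]       # lower limits of the classes 0-10, ..., 60-70
earners = [12, 18, 25, 20, 9, 5, 6]
print(median_continuous(wages, 10, earners))   # 27.0
```

For inclusive classes such as 1–5, 6–10, the lower limits passed in should be the adjusted class boundaries (0.5, 5.5, and so on).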
Example 5: Calculate the median wages from the following particulars of daily wages of 95 persons:
Daily wages in Rs.     0–10   10–20   20–30   30–40   40–50   50–60   60–70
No. of wage earners     12      18      25      20       9       5       6
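Solution:

Daily wages in Rs.    No. of wage earners    c.f.
 0–10                   12                    12
10–20                   18                    30
20–30                   25                    55
30–40                   20                    75
40–50                    9                    84
50–60                    5                    89
60–70                    6                    95

$$Me = l + \left(\frac{\frac{\sum f}{2} - c}{f}\right) i$$
$$\frac{\sum f}{2} = \frac{95}{2} = 47.5$$
The c.f. just greater than 47.5 is 55, so the median class is 20–30, with l = 20, c = 30, f = 25 and i = 10.
$$Me = 20 + \left(\frac{\frac{95}{2} - 30}{25}\right) 10$$
$$Me = 20 + \left(\frac{47.5 - 30}{25}\right) 10$$
$$Me = 20 + \left(\frac{17.5}{25}\right) 10$$
$$Me = 20 + \frac{175}{25}$$
Me = 20 + 7
Me = 27. Ans.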
Example 6: Calculate median from the following data:
Age                1–5   6–10   11–15   16–20   21–25   26–30
No. of students     1      3      26      44      20       6
Solution:

Age      No. of students    C.I.         c.f.
 1–5        1               0.5–5.5        1
 6–10       3               5.5–10.5       4
11–15      26              10.5–15.5      30
16–20      44              15.5–20.5      74
21–25      20              20.5–25.5      94
26–30       6              25.5–30.5     100

$$Me = l + \left(\frac{\frac{\sum f}{2} - c}{f}\right) i$$
$$\frac{\sum f}{2} = \frac{100}{2} = 50$$
The c.f. just greater than 50 is 74, so the median class is 15.5–20.5, with l = 15.5, c = 30, f = 44 and i = 5.
$$Me = 15.5 + \left(\frac{\frac{100}{2} - 30}{44}\right) 5$$
$$Me = 15.5 + \left(\frac{50 - 30}{44}\right) 5$$
$$Me = 15.5 + \frac{20 \times 5}{44}$$
$$Me = 15.5 + \frac{100}{44}$$
Me = 15.5 + 2.27
Me = 17.77. Ans.
Example 7: Find the median from the following distribution:
Size         Frequency
Below 10        5
Below 20       15
Below 30       32
Below 40       60
Below 50       83
Below 60       95
Below 70      127
Below 80      198
Below 90      250
Solution:

Size         Frequency    C.I.      f     c.f.
Below 10        5          0–10      5      5
Below 20       15         10–20     10     15
Below 30       32         20–30     17     32
Below 40       60         30–40     28     60
Below 50       83         40–50     23     83
Below 60       95         50–60     12     95
Below 70      127         60–70     32    127
Below 80      198         70–80     71    198
Below 90      250         80–90     52    250

$$Me = l + \left(\frac{\frac{\sum f}{2} - c}{f}\right) i$$
$$\frac{\sum f}{2} = \frac{250}{2} = 125$$
The c.f. just greater than 125 is 127, so the median class is 60–70, with l = 60, c = 95, f = 32 and i = 10.
$$Me = 60 + \left(\frac{\frac{250}{2} - 95}{32}\right) 10$$
$$Me = 60 + \left(\frac{125 - 95}{32}\right) 10$$
$$Me = 60 + \frac{30 \times 10}{32}$$
$$Me = 60 + \frac{300}{32}$$
Me = 60 + 9.375
Me = 69.375. Ans.
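In Example 7 the given figures are "less than" cumulative frequencies, so the class frequencies f have to be recovered by successive subtraction before the median formula is applied. A minimal Python sketch of that step (the helper name `frequencies_from_less_than` is hypothetical):

```python
def frequencies_from_less_than(cumulative):
    """Class frequencies from 'less than' (below) cumulative frequencies."""
    freqs = []
    previous = 0
    for c in cumulative:
        freqs.append(c - previous)        # each class = this c.f. minus the preceding c.f.
        previous = c
    return freqs


below = [5, 15, 32, 60, 83, 95, 127, 198, 250]
print(frequencies_from_less_than(below))
# [5, 10, 17, 28, 23, 12, 32, 71, 52]
```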
Example 8: Calculate the median from the following data:
Class        Frequency
Above 0        100
Above 10        88
Above 20        75
Above 30        52
Above 40        24
Above 50        15
Above 60         5
Solution:

Class        Frequency    C.I.      f     c.f.
Above 0        100         0–10     12     12
Above 10        88        10–20     13     25
Above 20        75        20–30     23     48
Above 30        52        30–40     28     76
Above 40        24        40–50      9     85
Above 50        15        50–60     10     95
Above 60         5        60–70      5    100

$$Me = l + \left(\frac{\frac{\sum f}{2} - c}{f}\right) i$$
$$\frac{\sum f}{2} = \frac{100}{2} = 50$$
The c.f. just greater than 50 is 76, so the median class is 30–40, with l = 30, c = 48, f = 28 and i = 10.
$$Me = 30 + \left(\frac{\frac{100}{2} - 48}{28}\right) 10$$
$$Me = 30 + \left(\frac{50 - 48}{28}\right) 10$$
$$Me = 30 + \frac{2 \times 10}{28}$$
$$Me = 30 + \frac{20}{28}$$
Me = 30 + 0.71
Me = 30.71. Ans.
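For "more than" (above) cumulative frequencies, as in Example 8, each class frequency is the given figure minus the one that follows it, and the last class keeps its own figure. A small Python sketch, with a helper name of our own choosing:

```python
def frequencies_from_more_than(cumulative):
    """Class frequencies from 'more than' (above) cumulative frequencies."""
    cumulative = list(cumulative)
    following = cumulative[1:] + [0]      # nothing lies above the last class
    return [c - nxt for c, nxt in zip(cumulative, following)]


above = [100, 88, 75, 52, 24, 15, 5]
print(frequencies_from_more_than(above))
# [12, 13, 23, 28, 9, 10, 5]
```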
Example 9: Locate median from the following data:
Mid-value    10   20   30   40   50   60   70
Frequency     6   12   18   30   32    7    3
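Solution:

Mid-value    Frequency    C.I.      c.f.
10              6          5–15       6
20             12         15–25      18
30             18         25–35      36
40             30         35–45      66
50             32         45–55      98
60              7         55–65     105
70              3         65–75     108

$$Me = l + \left(\frac{\frac{\sum f}{2} - c}{f}\right) i$$
$$\frac{\sum f}{2} = \frac{108}{2} = 54$$
The c.f. just greater than 54 is 66, so the median class is 35–45, with l = 35, c = 36, f = 30 and i = 10.
$$Me = 35 + \left(\frac{\frac{108}{2} - 36}{30}\right) 10$$
$$Me = 35 + \left(\frac{54 - 36}{30}\right) 10$$
$$Me = 35 + \frac{18 \times 10}{30}$$
$$Me = 35 + \frac{180}{30}$$
Me = 35 + 6
Me = 41. Ans.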
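When only mid-values are given, as in Example 9, the class limits are found by moving half the common difference on either side of each mid-value. A sketch in Python, assuming equally spaced mid-values (the function name is hypothetical):

```python
def class_limits_from_mid_values(mid_values):
    """(lower, upper) limits for equally spaced mid-values."""
    if len(mid_values) < 2:
        raise ValueError("at least two mid-values are needed to find the width")
    half = (mid_values[1] - mid_values[0]) / 2    # half of the common difference
    return [(m - half, m + half) for m in mid_values]


print(class_limits_from_mid_values([10, 20, 30, 40, 50, 60, 70])[:2])
# [(5.0, 15.0), (15.0, 25.0)]
```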
Example 10: Find out the missing frequency at Me = 46 and Σf = 230, from the following data:
Class        10–20   20–30   30–40   40–50   50–60   60–70   70–80
Frequency      12      30      –       65      –       25      18
Solution: Given Σf = 230, Me = 46

Class      Frequency    c.f.
10–20      12           12
20–30      30           42
30–40      D (say)      42 + D
40–50      65           107 + D
50–60      E (say)      107 + D + E
60–70      25           132 + D + E
70–80      18           150 + D + E

$$\frac{\sum f}{2} = \frac{230}{2} = 115$$
Since Me = 46 lies in the class 40–50, this is the median class, with l = 40, f = 65, i = 10 and c = 42 + D.
$$Me = l + \left(\frac{\frac{\sum f}{2} - c}{f}\right) i$$
$$46 = 40 + \left(\frac{115 - (42 + D)}{65}\right) \times 10$$
$$46 - 40 = \left(\frac{115 - (42 + D)}{65}\right) \times 10$$
$$65 \times 6 = 10(73 - D)$$
$$390 = 730 - 10D$$
$$10D = 730 - 390 = 340$$
$$D = 34$$
Now,
230 = 150 + D + E
230 = 150 + 34 + E
230 = 184 + E
E = 230 - 184
E = 46. Ans.
Hence the missing frequency for the class interval 30–40 is 34 and for the class interval 50–60 it is 46.
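The algebra of Example 10 can be checked numerically by rearranging the median formula for c, the c.f. of the class preceding the median class: c = Σf/2 − (Me − l) f / i. The Python sketch below does this; the function name is our own and the figures are those of the example.

```python
def cf_before_median_class(median, lower, freq, width, total):
    """Solve Me = l + ((total/2 - c) / f) * i for c."""
    return total / 2 - (median - lower) * freq / width


known = [12, 30, 65, 25, 18]              # the five given frequencies
c = cf_before_median_class(median=46, lower=40, freq=65, width=10, total=230)
D = c - (12 + 30)                         # c = 42 + D
E = 230 - sum(known) - D                  # all frequencies must add up to 230
print(D, E)                               # 34.0 46.0
```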
1.6.1.4 Merits and demerits of median
Merits
1. The value of the median is rigidly defined, which is an important property for an ideal average.
2. The median is very easy to understand and easy to calculate.
3. The median is not affected by the values of extreme items.
4. In certain cases the median can be located merely by observation.
5. It can be calculated for distributions with open-end classes.
Demerits
1. If the number of items in an individual series is even, the value of the median cannot be
determined exactly. We merely estimate it by taking the mean of the two middle terms.
2. The median is not capable of further algebraic treatment.
3. For calculating the median it is necessary to arrange the data in ascending or descending
order, which is not only time consuming but also tedious.
4. The median is a positional average, so its value is not determined by each and every observation.
PRACTICE EXERCISE 1
1. Find out median wages of workers from the following data:

Name of persons         A     B     C     D     E     F     G     H     I
Monthly wages in Rs.    80   180   150   200   250   500   350   220   400

[Ans. 220]
2. Compute the median from the following data:

S. no.    1    2    3    4    5    6    7    8    9
Marks    68   49   32   21   54   38   59   66   41

[Ans. Me = 49]
3. Find the median from the data given below:

Size of items    3.5   4.5   5.5   6.5   7.5   8.5   9.5
Frequency          3     7    22    60    85    32     8

[Ans. Me = 7.5]
4. Find out the median from the following data:

No. of men    Rate per man (Rs.)
 35             4.50
 40             5.50
 48             6.50
100             7.50
125             8.50
 87             9.50
 43            10.50
 22            11.50

[Ans. Me = 8.5]
5. The table shows the age distribution of married females according to the sample census of 2001 in
the U.P. state:

Age       No. of married females
 0–5          3
 5–10        31
10–15       410
15–20      1809
20–25      2446
25–30      2223
30–35      1723
35–40      1292
40–45       963
45–50       762
50–55       531
55–60       317
60–65       156
65–70        59
70–75        37

Calculate the median age of married females.
[Ans. 28.8 years]
6. Find the median from the following table:

Size         11–15   16–20   21–25   26–30   31–35   36–40   41–45   46–50
Frequency      7      10      13      26      35      22      11       5

[Ans. Me = 31.71]
7. From the following table, find the median:

No. of days (absent)     5    10    15    20    25    30    35    40    45
No. of students         29   224   465   582   634   644   650   653   655

[Ans. Me = 12.15]
8. The following table gives the marks obtained by 65 students in statistics in a certain examination:

Examination marks    No. of students
More than 70%           7
More than 60%          18
More than 50%          40
More than 40%          40
More than 30%          63
More than 20%          65

Calculate the median of the series.
[Ans. Me = 53.6%]
9. Find the median from the data given below:

Mid-point    15   20   25   30   35   40
Frequency    30   28   25   24   20   21

[Ans. Me = 25.7]
10. Find out the median from the following frequency distribution:

Class        0–100   100–200   200–300   300–400   400–500   500–600
Frequency      5        3        12         0        15         5

[Ans. Me = 350]
1.6.2 Mode
Mode is defined as the size of the variable which occurs most frequently in a set of observations. In
other words, the mode of a frequency distribution is the value of the variable (x) corresponding to the
maximum frequency.
1.6.2.1 Calculation of mode in individual series
In an individual series the mode is found by observation: the value occurring the maximum number of
times is the modal value. If all the values in the given data are distinct, then the highest value of x
is taken as the modal value of the given data.
Example 11: Find the mode from the following data:
(A) 80, 170, 145, 250, 255, 317, 285, 350, 400.
(B) 15, 35, 40, 42, 48, 50, 52, 42, 18, 12, 48, 54, 48, 5, 10, 55.
Solution:
(A) For the given data the highest value is 400 and no value is repeated.
Therefore
Mode Mo = 400. Ans.
(B) Since the value 48 occurs the maximum number of times, i.e., 3, the mode for the given data is
Mo = 48. Ans.
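The inspection rule for an individual series, including this chapter's convention for data in which no value repeats, can be sketched in Python as below. The function name is hypothetical, and taking the larger value when two values tie for the highest count is an assumption of this sketch, not a rule given in the text.

```python
from collections import Counter


def mode_individual(values):
    """Mode of an individual series, following the convention of this section."""
    counts = Counter(values)              # how many times each value occurs
    highest = max(counts.values())
    if highest == 1:
        # all values distinct: the chapter takes the highest value as the mode
        return max(values)
    # assumption: if several values share the highest count, take the larger one
    return max(x for x, c in counts.items() if c == highest)


print(mode_individual([80, 170, 145, 250, 255, 317, 285, 350, 400]))   # 400
print(mode_individual([15, 35, 40, 42, 48, 50, 52, 42, 18, 12, 48, 54, 48, 5, 10, 55]))   # 48
```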
1.6.2.2 Calculation of mode in discrete series
In the case of discrete series there are two methods to solve the problem.
(a) By inspection method: In this method, the size or value having the highest frequency will be
identified as mode. Generally, mode is denoted by Mo.
Example 12: Calculate mode from the following data:
Salary (in thousands)    15   20   25   30   35   40
Frequency                 4    6   12   18    7    3

Solution: By inspection it is clear that the highest frequency is 18. So its corresponding value, 30,
will be the mode.
Hence, Mode = 30 thousand
(b) Grouping method: Grouping method can be used in the following conditions:
(i) When the maximum frequency occurs either in very beginning or at the end of the distribution.
(ii) The frequencies of the variables increase or decrease in a haphazard way.
(iii) When the maximum frequency is repeated or approximately equal concentration is found in
two or more neighbouring values.
In grouping method, to solve the problem six columns are drawn in addition to the column of
values (x) and the frequencies are grouped in the following manner:
(a) First column: The frequencies given in the problem are shown in the first column.
(b) Second column: In this column, frequencies are grouped into two’s from the top of the value.
(c) Third column: In this column, frequencies are again grouped into two’s, but the first frequency
is left out i.e., starts from the second frequency.
(d) Fourth column: In this column, frequencies are grouped into three’s, starting from the first
frequency.
(e) Fifth column: In this column, frequencies are grouped into three’s but this grouping starts from
the second frequency.
(f) Sixth column: In this column, frequencies are grouped again in three’s but this grouping starts
from the third frequency.
After preparing the grouping table, tallies are marked against the values having the highest frequency
in the first column and the highest total in each of the other columns. Finally, the value securing the
maximum number of tallies will be the modal value.
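The six columns and the tally count can be sketched in Python as follows. This is an illustrative reading of the method: the function name is our own, a column in which two groups tie for the highest total tallies both groups (an assumption), and the marks and frequencies in the call are those of Example 13 below.

```python
def mode_by_grouping(values, freqs):
    """Grouping method: returns (modal value(s), tallies per value)."""
    n = len(freqs)
    tallies = [0] * n
    # (group size, starting index) for columns I to VI
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for size, start in columns:
        # complete groups only; leftover frequencies at the end are left out
        groups = [(sum(freqs[i:i + size]), range(i, i + size))
                  for i in range(start, n - size + 1, size)]
        if not groups:
            continue
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:             # highest total in this column
                for i in members:
                    tallies[i] += 1       # one tally for every value in the group
    top = max(tallies)
    return [v for v, t in zip(values, tallies) if t == top], tallies


marks = [10, 11, 12, 13, 14, 15, 16, 17, 18]
students = [3, 4, 6, 8, 10, 9, 5, 7, 1]
print(mode_by_grouping(marks, students))
# ([14], [0, 0, 1, 3, 6, 3, 1, 0, 0])
```

Returning the tallies alongside the modal value makes it easy to compare the result with a hand-drawn analysis table.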
Example 13: Calculate the mode from the following data of the marks obtained by students:

Marks              10   11   12   13   14   15   16   17   18
No. of students     3    4    6    8   10    9    5    7    1

Solution: On preparing the grouping table and marking the tallies, the value 14 secures the maximum
number of tallies.
Ans. Mo = 14
Example 14: Find the mode in the following given data:
Size of shoes     2    3    4    5    6    7    8    9   10
Frequency        10   12   18   25   30   28   15   10   11
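Solution: Grouping table (each total is written against the first frequency of its group)

Size   Frequency (1)   (2)   (3)   (4)   (5)   (6)
 2        10           22          40
 3        12                 30          55
 4        18           43                      73
 5        25                 55    83
 6        30           58                73
 7        28                 43                53
 8        15           25          36
 9        10                 21
10        11

Analysis table

Size    Tallies    Total
4       I            1
5       III          3
6       IIII I       6
7       III          3
8       I            1

Size 6 carries the highest figure in every one of the six columns, so it receives the maximum number of
tallies (6) and is therefore the mode.
Mo = 6. Ans.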
1.6.2.3 Calculation of mode in continuous series
In the case of a continuous series, check before calculating the mode whether all the class intervals
are equal. If they are not, they should be equalised. There are then two steps to find the mode.
1. Find the modal class, either by observation or by the grouping method (in case of an irregular
distribution).
2. The modal value is then obtained by the formula
$$Mo = l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$$