BUSINESS MATHEMATICS AND
STATISTICS
B.Com-202 / BBA-201 / BCIBF-202 / MCM-105
Under Graduate Commerce Programmes
(Distance Mode)
Centre for Distance and Online Education
Jamia Millia Islamia
New Delhi-110025
EXPERT COMMITTEE
Prof. Najma Akhtar
Patron Vice-Chancellor,
Jamia Millia Islamia
Prof. Jessy Abraham
Hony. Director,
CDOE, Jamia Millia Islamia
Prof. Mohammad Miyan
Hony. Chief Advisor, Founder
CDOE, Jamia Millia Islamia
Prof. Y.P. Singh
Department of Commerce,
University of Delhi
Prof. Najeeb Uzamman Khan Sherwani
Head, Department of Commerce and Business Studies
Jamia Millia Islamia
Prof. Sunayana
Centre for Management Studies,
Jamia Millia Islamia
Prof. Madhu Tyagi
School of Management,
IGNOU
Dr. Sabiha Khatoon
Assistant Professor, CDOE
Jamia Millia Islamia
Dr. Firdous Khanum
Assistant Professor, CDOE
Jamia Millia Islamia
Dr. Mohd. Afzal Saifi
Assistant Professor, CDOE
Jamia Millia Islamia
PROGRAMME COORDINATOR
Dr. Sabiha Khatoon, CDOE, Jamia Millia Islamia
COURSE WRITERS
K.B. Akhilesh, Professor, Department of Management Studies, Indian Institute of Science, Bengaluru
Units: (1.1-1.2, 2.1-2.2, 2.4-2.8)
S. Balasubrahmanyam, Research Scholar, Department of Management Studies, Indian Institute of Science, Bengaluru
Units: (1.1-1.2, 2.1-2.2, 2.4-2.8)
V.K. Khanna, Associate Professor, Deptt. of Mathematics, Kirori Mal College, University of Delhi
Units: (1.3-1.8, 3, 4.1-4.2, 4.4-4.8, 5.1-5.2, 5.4-5.8, 6.1-6.2, 6.4-6.8)
S.K. Bhamari, Associate Professor, Deptt. of Mathematics, Kirori Mal College, University of Delhi
Units: (1.3-1.8, 3, 4.1-4.2, 4.4-4.8, 5.1-5.2, 5.4-5.8, 6.1-6.2, 6.4-6.8)
Dr. Pratiksha Saxena, Assistant Professor, School of Applied Sciences, Gautam Buddha University, Greater Noida
Units: (2.3, 4.3, 5.3, 6.3, 7, 8)
J.S. Chandan, Professor, Medgar Evers College, City University of New York
Units: (9, 13, 14)
Neeru Sood, Freelance Author
Units: (10-12)
Dr. (Mrs.) Vasantha R. Patri, Former Faculty of Psychology, Lady Shri Ram College, Delhi University (1971-2001);
Chairperson, Indian Institute of Counselling
Unit: (15)
C.R. Kothari, Ex-Associate Prof - Department of Economic Administration & Financial Management, University of Rajasthan
Units: (16-18)
All rights reserved. Printed and published on behalf of the CDOE, Jamia Millia Islamia by Hi-Tech Graphics, New Delhi
March, 2023
ISBN: 978-93-5259-718-5
All rights reserved. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including
photocopying, recording or by any information storage or retrieval system, without permission in writing from the CDOE,
Jamia Millia Islamia, New Delhi.
Cover Credits: Anupama Kumari, Faculty of Fine Arts, Jamia Millia Islamia
SYLLABI-BOOK MAPPING TABLE
Business Mathematics and Statistics
Syllabi / Mapping in Book
Block I Function and Progression
Unit-1: Function and Progression
(Pages 3-40);
Unit-2: Arithmetic Progression and Series
(Pages 41-88);
Unit-3: Geometric Progression and Series
(Pages 89-106)
Block II Permutation and Combination
Unit-4: Fundamental Principles of Counting
(Pages 109-118);
Unit-5: Permutation and Combination
(Pages 119-134);
Unit-6: Matrices and Determinants
(Pages 135-176);
Unit-7: Differentiation
(Pages 177-198);
Unit-8: Integration and Its Application
(Pages 199-222)
Block III Basic Statistical Concepts
Unit-9: Meaning and Scope of Statistic
(Pages 225-242);
Unit-10: Organizing a Statistical Survey
(Pages 243-266);
Unit-11: Accuracy, Approximation and Errors
(Pages 267-290);
Unit-12: Ratios, Percentages and Rates
(Pages 291-304)
Block IV Collection, Classification and
Presentation of Data
Unit-13: Collection and Classification of Data
(Pages 307-332);
Unit-14: Tabular Presentation
(Pages 333-352);
Unit-15: Diagrammatic and Graphic
Presentation (Pages 353-370)
Block V
Measures of Central Tendency,
Dispersion and Skewness
Unit-16: Concept of Central Tendency,
Mean, Median, Mode, and Geometric,
Harmonic and Moving Averages
(Pages 373-394);
Unit-17: Measures of Dispersion–I & II
(Pages 395-426);
Unit-18: Measures of Skewness
(Pages 427-440)
CONTENTS
BLOCK-I : FUNCTION AND PROGRESSION
UNIT 1 FUNCTION AND PROGRESSION
1.1 Introduction
1.2 Functions
1.3 Types of Function
1.4 Summary
1.5 Key Words
1.6 Answers to ‘Check Your Progress’
1.7 Self-Assessment Questions
1.8 Further Readings
UNIT 2 ARITHMETIC PROGRESSION AND SERIES
2.1 Introduction
2.2 Sequence
2.3 Arithmetical Mean
2.4 Summary
2.5 Key Words
2.6 Answers to ‘Check Your Progress’
2.7 Self-Assessment Questions
2.8 Further Readings
UNIT 3 GEOMETRIC PROGRESSION AND SERIES
3.1 Introduction
3.2 Geometric Progression and Geometric Means
3.3 Sum of Geometric Progression
3.4 Summary
3.5 Key Words
3.6 Answers to ‘Check Your Progress’
3.7 Self-Assessment Questions
3.8 Further Readings
BLOCK-II : PERMUTATION AND COMBINATION
UNIT 4 FUNDAMENTAL PRINCIPLES OF COUNTING
4.1 Introduction
4.2 Multiplication Rule
4.3 Addition Rule
4.4 Summary
4.5 Key Words
4.6 Answers to ‘Check Your Progress’
4.7 Self-Assessment Questions
4.8 Further Readings
UNIT 5 PERMUTATION AND COMBINATION
5.1 Introduction
5.2 Permutation
5.3 Combination
5.4 Summary
5.5 Key Words
5.6 Answers to ‘Check Your Progress’
5.7 Self-Assessment Questions
5.8 Further Readings
UNIT 6 MATRICES AND DETERMINANTS
6.1 Introduction
6.2 Matrix
6.3 Subtraction of Matrix and System of Linear Equations
6.4 Summary
6.5 Key Words
6.6 Answers to ‘Check Your Progress’
6.7 Self-Assessment Questions
6.8 Further Readings
UNIT 7 DIFFERENTIATION
7.1 Introduction
7.2 Limit
7.3 Differentiability
7.4 Summary
7.5 Key Words
7.6 Answers to ‘Check Your Progress’
7.7 Self-Assessment Questions
7.8 Further Readings
UNIT 8 INTEGRATION AND ITS APPLICATION
8.1 Introduction
8.2 Integration
8.3 Application of Integration
8.4 Summary
8.5 Key Words
8.6 Answers to ‘Check Your Progress’
8.7 Self-Assessment Questions
8.8 Further Readings
BLOCK-III : BASIC STATISTICAL CONCEPTS
UNIT 9 MEANING AND SCOPE OF STATISTIC
9.1 Introduction
9.2 An Introduction to Statistics
9.3 Evaluating Statistics
9.4 Summary
9.5 Key Words
9.6 Answers to ‘Check Your Progress’
9.7 Self-Assessment Questions
9.8 Further Readings
UNIT 10 ORGANIZING A STATISTICAL SURVEY
10.1 Introduction
10.2 An Overview to Statistical Survey
10.3 Sampling Methods
10.4 Statistical Unit
10.5 Summary
10.6 Key Words
10.7 Answers to ‘Check Your Progress’
10.8 Self-Assessment Questions
10.9 Further Readings
UNIT 11 ACCURACY, APPROXIMATION AND ERRORS
11.1 Introduction
11.2 Approximation and Errors
11.3 Estimation and Sampling of Errors
11.4 Summary
11.5 Key Words
11.6 Answers to ‘Check Your Progress’
11.7 Self-Assessment Questions
11.8 Further Readings
UNIT 12 RATIOS, PERCENTAGES AND RATES
12.1 Introduction
12.2 Meaning of Various Statistical Derivatives
12.3 Purpose of Statistical Derivatives
12.4 Summary
12.5 Key Words
12.6 Answers to ‘Check Your Progress’
12.7 Self-Assessment Questions
12.8 Further Readings
BLOCK-IV : COLLECTION, CLASSIFICATION AND
PRESENTATION OF DATA
UNIT 13 COLLECTION AND CLASSIFICATION OF DATA
13.1 Introduction
13.2 Collection of Data
13.3 Classification of Data
13.4 Summary
13.5 Key Words
13.6 Answers to ‘Check Your Progress’
13.7 Self-Assessment Questions
13.8 Further Readings
UNIT 14 TABULAR PRESENTATION
14.1 Introduction
14.2 Tabulation of Data
14.3 Classification and Tabulation
14.4 Summary
14.5 Key Words
14.6 Answers to ‘Check Your Progress’
14.7 Self-Assessment Questions
14.8 Further Readings
UNIT 15 DIAGRAMMATIC AND GRAPHIC PRESENTATION
15.1 Introduction
15.2 Diagrammatic and Graphic Presentation
15.3 Graphical Presentation
15.4 Summary
15.5 Key Words
15.6 Answers to ‘Check Your Progress’
15.7 Self-Assessment Questions
15.8 Further Readings
BLOCK-V : MEASURES OF CENTRAL TENDENCY,
DISPERSION AND SKEWNESS
UNIT 16 CONCEPT OF CENTRAL TENDENCY, MEAN, MEDIAN,
MODE, AND GEOMETRIC, HARMONIC AND MOVING
AVERAGES
16.1 Introduction
16.2 Measures of Central Tendency
16.3 Mean
16.4 Summary
16.5 Key Words
16.6 Answers to ‘Check Your Progress’
16.7 Self-Assessment Questions
16.8 Further Readings
UNIT 17 MEASURES OF DISPERSION–I & II
17.1 Introduction
17.2 Measures of Dispersion
17.3 Standard Deviation
17.4 Summary
17.5 Key Words
17.6 Answers to ‘Check Your Progress’
17.7 Self-Assessment Questions
17.8 Further Readings
UNIT 18 MEASURES OF SKEWNESS
18.1 Introduction
18.2 Measures of Skewness
18.3 Karl Pearson’s Measure of Skewness
18.4 Summary
18.5 Key Words
18.6 Answers to ‘Check Your Progress’
18.7 Self-Assessment Questions
18.8 Further Readings
Function and Progression
BLOCK-I
FUNCTION AND PROGRESSION
This block discusses functions and progressions. In mathematics, a function is a relation
or expression involving one or more variables, whereas a progression is a series with
a definite pattern of advance. The block deals systematically with functions, arithmetic
progressions and series, and geometric progressions and series. It consists of three units.
The first unit explains functions and progressions. It begins by explaining the nature of
functions: a variable y is a function of another variable x when each value of x corresponds
to a definite value of y, which is written y = f(x). The various types of functions, their
characteristics, graphical representations and solution sets of linear equations and inequalities
are discussed in detail here. A few examples dealing with functions and variables are
also solved for a better understanding.
The second unit discusses arithmetic progression and series. An arithmetic progression is a
sequence in which each term is obtained by adding a fixed number to the previous term. This
fixed number is called the common difference. The unit discusses some standard
results of arithmetic progression, geometric progression and its properties, arithmetico-geometric series and its importance, and the sums of terms of an arithmetic series. Solved
examples on these topics are included for a better understanding.
The third unit examines geometric progression and series. A geometric progression, also
known as a geometric sequence, is a sequence of numbers in which each term after the
first is obtained by multiplying the previous one by a fixed, non-zero number. This fixed
number is called the common ratio. The unit discusses geometric progression and geometric
means, and carries solved examples on the sum of n terms of a geometric progression and the
sum to infinity of a geometric progression.
UNIT–1
FUNCTION AND PROGRESSION
Objectives
After going through this unit, you will be able to:
• Discuss the properties of functions
• Analyze even and odd functions
• Assess the properties of logarithmic function
Structure
1.1 Introduction
1.2 Functions
1.3 Types of Function
1.4 Summary
1.5 Key Words
1.6 Answers to ‘Check Your Progress’
1.7 Self-Assessment Questions
1.8 Further Readings
1.1 INTRODUCTION
This unit discusses functions and progressions. A function is a
mathematical relation such that each element of a given set (the domain of the
function) is associated with an element of another set (the range of the function).
If, for each value of a variable x, there is a definite value of y, then y is said to be
a function of x, with y being the dependent variable and x the
independent variable. There are various types of functions, including single-valued and
multi-valued functions and even and odd functions. A function describes any situation in which
one quantity depends on another, and the relationship between the two sets of
numbers of a function can be represented by a mathematical equation. Domain and
range are the two main characteristics of a function.
A progression, on the other hand, is a mathematical series with a definite pattern
of advance. This unit covers functions in detail: their characteristics, their types and
their use in quadratic equations.
1.2 FUNCTIONS
If to each value of a variable x, there corresponds one definite value of another
variable y, then we say that y is a function of x, and denote it as y = f(x). Here, y is
called the dependent variable and x the independent variable (or argument).
For example, $D_x = f(P_x)$, where $D_x$ is the demand for a product ‘x’ and $P_x$ is its
price (per unit), other factors remaining the same.
Remark
Mathematically, the foregoing demand function can be transformed into
another in which $P_x$ is expressed as a function of $D_x$. This does not, however,
carry much practical significance because, in general, it is the price (the independent
or exogenous variable) that can be directly manipulated rather than the
demand, which is the dependent or endogenous variable.
Note
The set of values of x for which the value of the function y = f(x) is determined, is
called the domain of the function, while the set of values of y is called the range of
the function.
Interval of a Variable
The range of values that a variable can take, be it a closed (or semi-closed) interval
or an open (semi-open) interval or a combination of such intervals is known as the
interval of the variable.
Thus, if a variable ‘x’ can take any value between two real numbers a and b (a < b),
inclusive of both the values, then such an interval can be written as follows:
a ≤ x ≤ b or [a, b]
Using the notation of sets, it can be written as
$\{x \in \mathbb{R} \mid a \le x \le b\}$ or $x \in [a, b]$
Similarly, the other possibilities can be expressed as follows:
$\{x \in \mathbb{R} \mid a < x < b\}$ as $x \in (a, b)$
$\{x \in \mathbb{R} \mid a \le x < b\}$ as $x \in [a, b)$
and $\{x \in \mathbb{R} \mid a < x \le b\}$ as $x \in (a, b]$
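The four kinds of interval can be sketched as a single membership test. This is only an illustration: the function name `in_interval`, its flags, and the end points 2 and 5 are assumptions made for the example, not notation from the text.

```python
def in_interval(x, a, b, closed_left=True, closed_right=True):
    """Return True if x lies in the interval with end points a < b.

    closed_left / closed_right say whether each end point is included.
    """
    left_ok = a <= x if closed_left else a < x
    right_ok = x <= b if closed_right else x < b
    return left_ok and right_ok


# The four cases, with illustrative end points a = 2 and b = 5.
print(in_interval(2, 2, 5))                                         # x in [2, 5]
print(in_interval(2, 2, 5, closed_left=False, closed_right=False))  # x in (2, 5)
print(in_interval(2, 2, 5, closed_right=False))                     # x in [2, 5)
print(in_interval(5, 2, 5, closed_left=False))                      # x in (2, 5]
```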
Classification of functions
Single Valued and Multi-Valued Functions
When a function has only one value corresponding to each value of the independent
variable, the function is called a single-valued function. If a function has several values
corresponding to each value of the independent variable, it is called a multi-valued or
many valued function.
e.g., $y = x^2$ is a single-valued function of x,
while $y = \pm\sqrt{x}$ is a multi-valued (two-valued) function of x.
Even and Odd Functions
If f(x) changes sign when the sign of x is changed, i.e., if f(–x) = –f(x), then f(x)
is said to be an odd function of x.
e.g., $y = x^3$, $y = f(x) = 3x + 6x^3$, $y = \sin x$, $y = \sinh x$ etc. are all odd
functions of x.
On the other hand, if f(x) does not change its sign when the sign of x is changed,
it is said to be an even function of x (i.e., when f(–x) = f(x)).
e.g., $x^2$, $3x^4 + 7x^2$, $\cos x$, $\cosh x$ etc. are all even functions of x.
Notes
1. Geometrically, an even function is symmetric with respect to the y-axis
while an odd function is symmetric with respect to the origin.
2. The Taylor series about the origin (Maclaurin series) of an even function includes
even powers only, while that of an odd function includes odd powers only.
3. The only function which is both even and odd is the constant function which
is identically zero (i.e., f(x) = 0 for all x).
4. In general, the sum of an even function and an odd function is neither even
nor odd (e.g., $x + x^2$).
5. The sum of two even functions is even, and any constant multiple of an even
function is even.
6. The sum of two odd functions is odd, and any constant multiple of an odd
function is odd.
7. The product of two even functions is an even function.
8. The product of two odd functions is again an even function.
9. The product of an even function and an odd function is an odd function.
10. The derivative of an even function is odd.
11. The derivative of an odd function is even.
12. The Fourier series of a periodic even function includes cosine terms only
while that of a periodic odd function includes sine terms only.
13. Any linear combination of even functions is even while that of odd functions
is odd.
14. Both the even and the odd functions form a vector space over the reals. In
fact, the vector space of all real-valued functions is the direct sum of the
spaces of even and odd functions. In other words, every function can be
uniquely written as the sum of an even function and an odd function:
$$f(x) = \frac{f(x) + f(-x)}{2} + \frac{f(x) - f(-x)}{2}$$
15. The even functions form a commutative algebra over the reals. However,
the odd functions do not form an algebra over the reals.
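The decomposition in Note 14 can be checked numerically. The sketch below is illustrative only: the helper names `even_part` and `odd_part` and the choice of the exponential function as the sample are assumptions made for the example.

```python
import math


def even_part(f):
    """Return the even part of f: x -> (f(x) + f(-x)) / 2."""
    return lambda x: (f(x) + f(-x)) / 2


def odd_part(f):
    """Return the odd part of f: x -> (f(x) - f(-x)) / 2."""
    return lambda x: (f(x) - f(-x)) / 2


f = math.exp  # sample function that is neither even nor odd
fe, fo = even_part(f), odd_part(f)

for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    assert math.isclose(fe(-x), fe(x))                   # fe is even
    assert math.isclose(fo(-x), -fo(x), abs_tol=1e-12)   # fo is odd
    assert math.isclose(fe(x) + fo(x), f(x))             # parts add up to f
```

For the exponential function the two parts are the hyperbolic cosine and hyperbolic sine, which agrees with the earlier examples of even and odd functions.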
Explicit function
If the dependent variable y is expressed directly in terms of the independent variable
x, then y is called an explicit function of x and is written as y = f(x).
e.g., $y = 2x + 3$, $y = 4x^2 + 7x - 8$ are all explicit functions of x.
Implicit function
When x and y both occur together in an equation but y is not capable of being
directly expressed in terms of x, then y is said to be an implicit function of x.
e.g., $x^3 + 3x^2y + 3xy^2 + y^2 = 0$ defines y as an implicit function of x.
When the form of an implicit function is not specified, it is written as
f(x, y) = 0.
Inverse function
If y is a one-to-one function of x, then x can in turn be regarded as a function of y.
The latter is called the inverse function of the former, i.e., if y = f(x), then
x = g(y).
e.g., If $y = a^x$, then $x = \log_a y$.
If $y = \dfrac{2x + 3}{x + 5}$, then $x = \dfrac{3 - 5y}{y - 2}$.
Symbolically, $y = f(x) \Leftrightarrow x = f^{-1}(y)$
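The exponential and logarithm pair gives a quick way to see an inverse at work: applying g after f should return the starting value of x. A minimal sketch, assuming the base a = 2 (any a > 0 with a ≠ 1 would do) and the function names `f` and `g` chosen here for illustration:

```python
import math

a = 2.0  # assumed base for the example; must satisfy a > 0 and a != 1


def f(x):
    """y = a**x."""
    return a ** x


def g(y):
    """x = log of y to the base a, the inverse of f."""
    return math.log(y, a)


for x in (-3.0, 0.0, 0.5, 4.0):
    # g undoes f, up to floating-point rounding
    assert math.isclose(g(f(x)), x, abs_tol=1e-12)
```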
Convex function
A function f(x) defined over a convex set S (Note 3) is said to be a convex function
if for any two points $x_1$ and $x_2$ lying in S and for any $0 \le \lambda \le 1$,
$$f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2)$$
Strictly convex function
If in the previous definition, for any two distinct points $x_1 \ne x_2$ and any $0 < \lambda < 1$,
$$f(\lambda x_1 + (1 - \lambda) x_2) < \lambda f(x_1) + (1 - \lambda) f(x_2),$$
then f(x) is called a strictly convex function.
Concave Function
A function f(x) is said to be concave if –f(x) is convex.
Strictly Concave Function
A function f(x) is said to be strictly concave if –f(x) is strictly convex.
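The convexity inequality can be tested on sample points. The sketch below is only a spot check under stated assumptions: the function names, the sample points and the tolerance are illustrative, and passing the check on finitely many points does not prove convexity, although a single violation does disprove it.

```python
def violates_convexity(f, x1, x2, lam, tol=1e-12):
    """True if f(lam*x1 + (1-lam)*x2) exceeds lam*f(x1) + (1-lam)*f(x2)."""
    lhs = f(lam * x1 + (1 - lam) * x2)
    rhs = lam * f(x1) + (1 - lam) * f(x2)
    return lhs > rhs + tol


def looks_convex(f, points, lambdas):
    """Check the convexity inequality on every pair of sample points."""
    return not any(
        violates_convexity(f, x1, x2, lam)
        for x1 in points
        for x2 in points
        for lam in lambdas
    )


points = [-3, -1, 0, 0.5, 2]          # sample points (illustrative)
lambdas = [0, 0.25, 0.5, 0.75, 1]     # sample weights in [0, 1]

print(looks_convex(lambda x: x * x, points, lambdas))    # x^2
print(looks_convex(lambda x: -x * x, points, lambdas))   # -x^2
```

By the definition of a concave function, the second call tests the negative of a convex function, so it is the concave case.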
Characteristics of Function
Domain and range are the two main characteristics of a function.
Function describes any situation in which one quantity depends on another. For
example, the height of a person depends on his age. The distance an object travels
in four hours depends on its speed. When such relationships exist, one variable is
said to be a function of the other. Therefore, height is a function of age and distance
is a function of speed.
The relationship between the two sets of numbers of a function can be
represented by a mathematical equation. Consider the relationship of the area of a
square to its sides. This relationship is expressed by the equation $A = x^2$. Here, A,
the value for the area, depends on x, the length of a side. Consequently, A is called
the dependent variable and x is the independent variable. In fact, for a relationship
between two variables to be called a function, every value of the independent
variable must correspond to exactly one value of the dependent variable.
The relationship between any square and its area could be represented by
$f(x) = x^2$, where A = f(x). To use this notation, we substitute the value found
between the parentheses into the equation. For a square with a side 4 units long, the
area is $f(4) = 4^2 = 16$.
The set of numbers made up of all the possible values for x is called the domain
of the function. The set of numbers created by substituting every value for x into the
equation is known as the range of the function.
Just as we can add, subtract, multiply or divide real numbers to get new numbers,
functions can be combined to form new functions. Consider the functions
$f(x) = x^2$ and $g(x) = 4x + 2$. Their sum is $f(x) + g(x) = x^2 + 4x + 2$, and
their difference is $f(x) - g(x) = x^2 - 4x - 2$. The product and quotient can be
obtained in a similar way. A composite function is the result of another manipulation
of two functions, in which one function is applied to the output of the other. The
composite function created from our previous example is denoted
by f(g(x)) and is equal to $f(4x + 2) = (4x + 2)^2$. It is important to note that this
composite function is not equal to the function $g(f(x)) = 4x^2 + 2$.
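The sum, difference and composition of the two functions above can be sketched in code. The helper names `add`, `subtract` and `compose` are assumptions introduced for this illustration.

```python
def f(x):
    return x ** 2


def g(x):
    return 4 * x + 2


def add(p, q):
    """Return the function x -> p(x) + q(x)."""
    return lambda x: p(x) + q(x)


def subtract(p, q):
    """Return the function x -> p(x) - q(x)."""
    return lambda x: p(x) - q(x)


def compose(outer, inner):
    """Return the function x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


x = 3
print(add(f, g)(x))        # x^2 + 4x + 2 at x = 3
print(subtract(f, g)(x))   # x^2 - 4x - 2 at x = 3
print(compose(f, g)(x))    # f(g(x)) = (4x + 2)^2 at x = 3
print(compose(g, f)(x))    # g(f(x)) = 4x^2 + 2 at x = 3
```

The last two lines differ, which shows that the order of composition matters.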
Other characteristics can be defined as:
(1) Relation: a set of ordered pairs (x, y), e.g., {(2, 3), (10, 1), (3, 8)}.
(2) Function: a relation in which no x value is paired with more than one y value.
(3) x-coordinate: first number in an ordered pair.
(4) y-coordinate: second number in an ordered pair.
(5) Domain: the set of permissible x values in a relation or function.
(6) Range: the set of permissible y values in a relation or function.
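The definitions above translate directly into a small check on a finite relation. This is a sketch under the assumption that a relation is stored as a list of (x, y) pairs; the function names are illustrative.

```python
def is_function(relation):
    """True if no x value is paired with two different y values."""
    seen = {}
    for x, y in relation:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


def domain(relation):
    """Set of first co-ordinates (x values) of the pairs."""
    return {x for x, _ in relation}


def range_of(relation):
    """Set of second co-ordinates (y values) of the pairs."""
    return {y for _, y in relation}


relation = [(2, 3), (10, 1), (3, 8)]   # the relation given in item (1)
print(is_function(relation), domain(relation), range_of(relation))
print(is_function([(1, 2), (1, 3)]))   # x = 1 repeats with different y values
```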
Check Your Progress - 1
1. What is a single-valued function?
2. What are the two main characteristics of a function?
1.3 TYPES OF FUNCTION
Linear Quadratic
To discuss the concept of linear equations more formally, firstly we define a linear
expression.
Definition 1. Any expression of the type ax + by + c, a, b, c in R and at least one
of a and b is non-zero, is called a linear expression (to be more precise, a linear
expression in x and y over the reals).
Definition 2. An equation of the type ax + by + c = 0, where a, b, c ∈ R, is
called a linear equation.
In other words, a linear equation is obtained by equating a linear expression to zero.
Similarly, an inequality of the type ax + by + c > 0 or ax + by + c < 0 is called a
linear inequation (more precisely, a linear inequation in x and y over the reals).
Thus, 3x + 5y + 7 = 0, 2x – 1 = 0, 3y + 11 = 0, x + y – 2 = 0 are some linear
equations, while x > 0, 4x – 3y + 1 < 0, 2x – 3y + 11 > 0, $x - 1.5y + \frac{1}{2} > 0$, $3.78x - 2\frac{1}{3} < 0$
are some linear inequalities.
Solution Sets of Linear Equations and Inequalities
In this section, we explain what we mean by the solution set of a linear equation or a
linear inequality or of a system of linear equations and linear inequalities.
Firstly we recall the definition of an ordered pair.
Definition 3. By an ordered pair (a, b) of real numbers a and b we mean a set
{{a}, {a, b}}.
Thus (a, b) is a set with two elements namely the set {a} and the set {a, b}. With
the help of this definition it can be proved that two ordered pairs (a, b) and (c, d) are
equal if and only if a = c and b = d.
Note: Some authors take this property as the defining property for ordered pairs.
Example 1.1: The plane of co-ordinate geometry is the set of all ordered pairs (x, y)
with x, y ∈ R. For any point P in this plane, its co-ordinates determine an ordered
pair (a, b) where a is the abscissa of P and b is the ordinate of P. Also for any ordered
pair of real numbers (c, d) there is exactly one point Q in the plane of co-ordinates
whose x co-ordinate is c and y co-ordinate is d. We note that the points (2, 1) and (1,
2) are different. In general points (a, b) and (b, a) are different whenever a ≠ b. This
explains why the co-ordinates of a point form an ordered pair.
Definition 4. Let ax + by + c = 0 be a linear equation; then the set of all ordered
pairs (x1, y1) of real numbers such that ax1 + by1 + c = 0 is called the solution set of
the linear equation ax + by + c = 0.
Thus, (2, 1) is an element of the solution set of the equation 3x – 4y – 2 = 0, since
3·2 – 4·1 – 2 = 6 – 4 – 2 = 0. Again, (1, 2) is not in the solution set of the same equation,
as 3·1 – 4·2 – 2 = – 7 ≠ 0. Let S be the solution set of 3x – 4y – 2 = 0; then S = {(2, 1),
(6, 4), ($3\frac{1}{3}$, 2), (– 2/3, – 1), ...}. Since it is impossible to enumerate all the ordered pairs
(x1, y1) satisfying 3x1 – 4y1 = 2, this notation for S does not convey the actual
size of the solution set. Note that
$$(x_1, y_1) \in S \Leftrightarrow 3x_1 - 4y_1 = 2 \Leftrightarrow x_1 = \frac{2(1 + 2y_1)}{3}.$$
So we can write
$$S = \left\{ (x_1, y_1) \;\middle|\; x_1 = \frac{2(1 + 2y_1)}{3} \right\}$$
or
$$S = \{ (x_1, y_1) \mid 3x_1 - 4y_1 - 2 = 0 \}.$$
Similarly, we can define the solution set of the system 2x – 1 = 0 and 4x + y + 1 = 0 as the set S
of ordered pairs (x1, y1) such that 2x1 – 1 = 0 and 4x1 + y1 + 1 = 0. It can be easily
verified that here S consists of only one ordered pair, namely $\left(\frac{1}{2}, -3\right)$.
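Membership in a solution set is a direct substitution, which can be sketched with exact fractions. The function names `on_line` and `x_from_y` are assumptions made for this illustration; the equation is 3x – 4y – 2 = 0 from the discussion above.

```python
from fractions import Fraction


def on_line(a, b, c, point):
    """True if point = (x, y) satisfies a*x + b*y + c = 0."""
    x, y = point
    return a * x + b * y + c == 0


def x_from_y(y):
    """x-coordinate on 3x - 4y - 2 = 0 for a given y: x = 2(1 + 2y)/3."""
    return Fraction(2 * (1 + 2 * y), 3)


print(on_line(3, -4, -2, (2, 1)))   # (2, 1) is tested against 3x - 4y - 2 = 0
print(on_line(3, -4, -2, (1, 2)))   # (1, 2) is tested against the same equation

# Every point generated from the formula for x lies on the line.
for y in range(-3, 4):
    assert on_line(3, -4, -2, (x_from_y(y), y))

# The single common solution of 2x - 1 = 0 and 4x + y + 1 = 0.
p = (Fraction(1, 2), -3)
assert on_line(2, 0, -1, p) and on_line(4, 1, 1, p)
```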
Definition 5. The set of all ordered pairs (x1, y1) of real numbers, such that
ax1 + by1 + c > 0 is called the solution set of the linear inequality ax + by + c > 0.
For example, (5, 1) is in the solution set of the inequality 2x – y – 7 > 0, since
2(5) – 1 – 7 = 2 > 0, while (1, 4) is not in its solution set, since 2(1) – 4 – 7 = – 9 < 0.
Definitions 4 and 5 can, obviously, be extended to a system consisting of more
than one linear equation or linear inequality and also to a system consisting of linear
equations and linear inequalities.
Note: The word ‘linear constraint’ is used in place of ‘linear equation’ as well as a ‘linear
inequality’.
Graphical Representation of Solution Sets
As remarked earlier, we can identify an ordered pair (a, b) of real numbers with a point
in the plane of coordinate geometry. Thus, the solution set of any linear equation
consists precisely of the points whose coordinates satisfy that equation. But every
linear equation represents a line, so the solution set consists of the points on that line. Thus,
to draw a graph of the solution set of a linear equation, it is sufficient to trace the line
represented by that equation on graph paper. For example, suppose we want
to represent the solution set of the equation x + y + 1 = 0 graphically. We trace the line
x + y + 1 = 0.
Now x + y + 1 = 0 ⇒ y = – x – 1. We give arbitrary values to x and find out the
corresponding values of y from y = – x – 1.
Suppose we put x = 0; then y = – 1. We write 0 in the row headed by x and put – 1
in the column containing 0 and the row headed by y (Fig. 1.1). Similarly, on putting
x = – 1, 2, – 2 we get y = 0, – 3, 1, respectively. We plot these points on graph
paper and join them. Thus we get a line, every point of which has co-ordinates
satisfying x + y + 1 = 0. This line represents the solution set of x + y + 1 = 0.
We next consider the solution set of inequality x – y + 1 > 0. Again x – y + 1 > 0
⇒ x + 1 > y ⇒ y < x + 1. Here, the solution set S is given by {(x1, y1) | y1 < x1 + 1}.
Thus, if we put x = 0 then all points (0, y) with y < 1 are in the solution set of x –
y + 1 > 0. We first plot the graph of equation x – y + 1 = 0 following the procedure
discussed earlier.
We make a table of the following type (the values shown are for the line x + y + 1 = 0):
x:    0   –1    2   –2   –3   ...
y:   –1    0   –3    1    2   ...
Fig. 1.1 Graph of the line x + y + 1 = 0 through the points (–3, 2), (–2, 1), (–1, 0), (0, –1) and (2, –3)
Let P (x1, y1) be any point. Draw PQ parallel to x-axis to meet line x – y + 1 = 0
at Q (Fig. 1.2). Let the coordinates of Q be (x2, y1). Since Q (x2, y1) lies on x – y +
1 = 0, we get x2 – y1 + 1 = 0. P will lie on right of x – y + 1 = 0 if and only if x1 > x2
⇔ x1 > y1 – 1 ⇔ y1 < x1 + 1. Thus, point (x1, y1) is in solution set of x – y + 1 > 0
if and only if it lies on the right of line x – y + 1 = 0. Thus, the shaded portion (excluding
line x – y + 1 = 0) depicts the solution set of x – y + 1 > 0. Similarly, it can be verified
that a point (x1, y1) lies on left of x – y + 1 = 0 if and only if x1 < y1 – 1 ⇔ y1 > x1 +
1. So the unshaded portion is the graphical representation of the linear inequality x – y
+ 1 < 0. Note that the shaded portion together with the line x – y + 1 = 0 represents the
solution set of x – y + 1 ≥ 0, i.e., the set of points satisfying either x – y + 1 = 0 or
x – y + 1 > 0.
Sometimes the solution set of a system is empty, in which case no graphical
representation is possible. Consider the following examples.
Example 1.2: Find the solution set of x – 1 = 0 and x < 0.
Solution: For all the points (x1, y1) lying in the solution set of x – 1 = 0, x1 = 1, while
for all points (x1, y1) satisfying x < 0 we must have x1 < 0. But 1 is never less than 0.
Hence no x1 exists which simultaneously satisfies x1 = 1 and x1 < 0. Thus, we cannot
get any point in the graphical representation of the solution of x = 1 and x < 0. In this case
the solution set is the empty set.
Fig. 1.2 The line x – y + 1 = 0 through the points (–1, 0), (0, 1), (1, 2) and (2, 3), with a point P joined to the point Q on the line by a segment parallel to the x-axis
Example 1.3: Find the solution set of
x + 2y + 1 = 0 and 2x + 4y + 3 = 0.
Solution: Let (x1, y1) be in the solution set of the equations x + 2y + 1 = 0 and 2x +
4y + 3 = 0.
Then, x1 + 2y1 + 1 = 0 as well as 2x1 + 4y1 + 3 = 0.
These equations together imply (2x1 + 4y1 + 3) – 2(x1 + 2y1 + 1) = 0 ⇒ 3 – 2 =
0 ⇒ 1 = 0, an absurdity. Hence, there exists no element in the solution set. In other
words the solution set of x + 2y + 1 = 0 and 2x + 4y + 3 = 0 is empty.
Definition 6. Whenever the solution set of a system of linear inequations is empty,
we say that the inequations are inconsistent.
Definition 7. A system of linear inequations is said to be consistent if its solution
set is non-empty.
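For two linear equations in x and y, consistency can be decided from the coefficients alone. The sketch below uses Cramer's rule, which is a standard technique rather than the method used in the text; the function name `classify_system` and the labels it returns are assumptions made for this illustration, and integer coefficients with a and b not both zero are assumed.

```python
from fractions import Fraction


def classify_system(eq1, eq2):
    """Classify the system a1*x + b1*y + c1 = 0, a2*x + b2*y + c2 = 0.

    Each equation is a tuple (a, b, c) of integers, a and b not both zero.
    Returns ("unique", (x, y)), ("inconsistent", None) or ("dependent", None).
    """
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Cramer's rule applied to a1*x + b1*y = -c1, a2*x + b2*y = -c2.
        x = Fraction(b1 * c2 - b2 * c1, det)
        y = Fraction(a2 * c1 - a1 * c2, det)
        return "unique", (x, y)
    # det == 0: the lines are parallel or coincide. They coincide only if
    # the constants are in the same proportion as the coefficients.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return "dependent", None
    return "inconsistent", None


print(classify_system((1, 2, 1), (2, 4, 3)))    # the system of Example 1.3
print(classify_system((2, 0, -1), (4, 1, 1)))   # 2x - 1 = 0 and 4x + y + 1 = 0
```

A result of "inconsistent" corresponds to an empty solution set in the sense of Definition 6.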
Example 1.4: Draw the graph of 4x + 3y ≤ 6. Mark two solutions of this on the
graph.
Solution: Firstly, we trace the line 4x + 3y = 6 on a graph paper.
Now, 4x + 3y = 6 ⇒ 3y = 6 – 4x ⇒ $y = \frac{6 - 4x}{3}$.
We give values 0, 1, 2, 3, – 1, – 2, – 3, ... to x and find the corresponding values of
y with the help of $y = \frac{6 - 4x}{3}$.
We put these values down in the following table.
x:    0     1      2     3    –1     –2    –3   ...
y:    2    2/3   –2/3   –2   10/3   14/3    6   ...
The graph of 4x + 3y = 6 is shown in Figure 1.3.
Now, a point (x1, y1) satisfies 4x + 3y < 6 if and only if 4x1 + 3y1 < 6 ⇔ (x1, y1)
lies on the left of the line 4x + 3y = 6. Points of this type lie in the shaded portion of the
figure.
Hence, the solution set of 4x + 3y ≤ 6 consists of the shaded portion including the
line 4x + 3y = 6.
Fig. 1.3 Graph of the line 4x + 3y = 6 through the points (–3, 6), (0, 2) and (3, –2), with the point (–6, 2) marked to the left of the line
Clearly, the points (– 3, 6) and (– 6, 2) are such that their co-ordinates satisfy 4x
+ 3y ≤ 6, as 4(– 3) + 3 (6) = – 12 + 18 = 6 and 4(– 6) + 3(2) = – 18 < 6. We mark
these points by black dots.
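The table of values and the two marked points of Example 1.4 can be reproduced with exact fractions. This is a minimal sketch; the function names `y_on_line` and `satisfies` are assumptions made for the illustration.

```python
from fractions import Fraction


def y_on_line(x):
    """y-coordinate of the point on 4x + 3y = 6 with the given x."""
    return Fraction(6 - 4 * x, 3)


def satisfies(x, y):
    """True if (x, y) satisfies 4x + 3y <= 6."""
    return 4 * x + 3 * y <= 6


# Rebuild the table of values used to trace the line.
for x in (0, 1, 2, 3, -1, -2, -3):
    print(x, y_on_line(x))

# The two points marked on the graph.
print(satisfies(-3, 6), satisfies(-6, 2))
```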
Example 1.5: Find the graph of x + 2y – 5 < 0, 4x – y < 2 and y > 0. On the graph
mark three points which satisfy these inequalities.
Solution: Firstly, we trace lines x + 2y – 5 = 0, 4x – y = 2 and y = 0.
To trace x + 2y – 5 = 0, we note that $y = \frac{5 - x}{2}$. So for x = 0, 1, 2, 3, ... ;
$y = \frac{5}{2}, 2, \frac{3}{2}, 1$, etc. Plot the points $\left(0, \frac{5}{2}\right)$, (1, 2),
$\left(2, \frac{3}{2}\right)$, (3, 1) and join them to obtain the
graph of x + 2y – 5 = 0.
Again, 4x – y = 2 ⇒ y = 4x – 2, so for x = 0, 1, 2, 3, ...; y = –2, 2, 6, 10, ... . Plot
the points (0, – 2), (1, 2), (2, 6), (3, 10) and join them to get the graph of 4x – y = 2.
Finally, y = 0 is the x-axis, i.e., X′OX. Now the y co-ordinate of any point is positive
if and only if that point lies above the x-axis. Further, (x1, y1) satisfies x + 2y – 5 < 0 if and
only if it lies on the left of the line x + 2y – 5 = 0. Similarly, (x1, y1) satisfies 4x – y < 2
if and only if the point (x1, y1) lies on the left of the line 4x – y – 2 = 0. Hence, the
solution set is the shaded portion of the figure excluding the lines y = 0, x + 2y = 5 and
4x – y = 2. The ordered pairs $\left(\frac{1}{2}, 2\right)$, (– 1, 1), (– 5, 4) are in the solution set of the
given system, since $\frac{1}{2} + 2 \cdot 2 - 5 = -\frac{1}{2} < 0$, $4 \cdot \frac{1}{2} - 2 = 0 < 2$ and 2 > 0;
– 1 + 2·1 – 5 = – 4 < 0, 4(– 1) – 1 = – 5 < 2 and 1 > 0; – 5 + 2·4 – 5 = – 2 < 0, 4(– 5) – 4 = – 24
< 2 and 4 > 0. The points corresponding to these pairs are shown by black dots in
Figure 1.4.
Example 1.6: Find the solution set of the following system of inequalities and represent
the solution set by graph.
3x + y < 13, 7y + x > 11, 3y ≤ 9 + x.
Solution: Firstly, we draw lines 3x + y = 13, 7y + x = 11 and 3y = 9 + x.
Now, 3x + y < 13 is represented by the region in the left side of line 3x + y = 13;
7y + x > 11 is represented by the portion of plane on right side of line 7y + x = 11 and
3y ≤ 9 + x is represented by portion of plane on right of line 3y = 9 + x together with
the line 3y = 9 + x. Hence, the solution set is the interior of triangle ABC (shown by
shaded portion) and the portion of line 3y = 9 + x between the points A, C (but
excluding A and C). Note that coordinates of A, B, C are respectively equal to (– 3,
2), (4, 1), (3, 4). These are obtained by solving the pair of lines 3y = 9 + x and 7y +
x = 11; 7y + x = 11 and 3x + y = 13; 3x + y = 13 and 3y = 9 + x. The point A is not
in the solution set of the given system, since 7(2) + (– 3) = 11 ≯ 11 (where ≯ stands
for not greater than). Also C is not in the solution set as 3(3) + 4 = 13 ≮ 13 (where
≮ stands for not less than).
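The vertices A, B and C quoted above can be reproduced by solving each pair of boundary lines. The sketch below uses Cramer's rule with exact fractions; the helper name `intersect` and the tuple layout (a, b, c) for the line ax + by = c are assumptions made for illustration only.

```python
from fractions import Fraction


def intersect(l1, l2):
    """Intersection of a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or coincident")
    return (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))


line1 = (3, 1, 13)   # 3x + y = 13
line2 = (1, 7, 11)   # x + 7y = 11
line3 = (-1, 3, 9)   # 3y = 9 + x, written as -x + 3y = 9

A = intersect(line3, line2)
B = intersect(line2, line1)
C = intersect(line1, line3)
print(A, B, C)
```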
[Fig. 1.4: graphs of the lines 4x – y = 2 and x + 2y – 5 = 0 on the axes X′OX and Y′OY, with the points (2, 6), (–5, 4), (1/2, 2), (–1, 1), (3, 1), (5, 0) and (0, –2) marked.]
Quadratic Equation
An equation of degree two is called a quadratic equation.
Note: In this section we shall be mainly dealing with quadratic equations having rational
numbers as coefficients.
There are two types of quadratic equations: (1) Pure and (2) Affected.
A quadratic equation is called pure if it does not contain the first power of x. In
other words, in a pure quadratic equation the coefficient of x must be zero. Thus a pure
quadratic equation is of the type $ax^2 + b = 0$ with a ≠ 0.
A quadratic equation which is not pure is called an affected quadratic equation.
Thus the most general form of an affected quadratic equation is $ax^2 + bx + c = 0$,
with ab ≠ 0. (Recall that ab ≠ 0 ⇔ a ≠ 0 and b ≠ 0).
Root. A complex number α is called a root of $ax^2 + bx + c = 0$ if $a\alpha^2 + b\alpha + c = 0$.
Method of Solving Pure Quadratic Equations
Let $ax^2 + b = 0$ be a pure quadratic equation. This implies
$ax^2 = -b$ ⇒ $x^2 = -\frac{b}{a}$ ⇒ $x = \pm\sqrt{-\frac{b}{a}}$
It is clear that the roots of $ax^2 + b = 0$ are real if and only if a and b are of opposite
signs (or b = 0).
Example 1.7: Solve $9x^2 - 4 = 0$.
Solution: Clearly, $9x^2 = 4$ ⇒ $x^2 = \frac{4}{9}$ ⇒ $x = \pm\frac{2}{3}$.
Methods of Solving Affected Quadratic Equations
Note: Since a pure quadratic equation is a particular case of $ax^2 + bx + c = 0$, all these methods
are applicable to pure equations also. All that we have to do is put b = 0 to get the
solution of a pure equation.
(i) Method of Factorisation
If the expression $ax^2 + bx + c$ can be factored into linear factors then each of the
factors, put to zero, provides us with a root of the given quadratic equation.
Thus, if $ax^2 + bx + c = a(x - \alpha)(x - \beta)$, then the roots of $ax^2 + bx + c = 0$ are α
and β.
Example 1.8: Solve $x^2 - 5x + 6 = 0$.
Solution: Clearly, $x^2 - 5x + 6 = 0$
⇒ (x – 2)(x – 3) = 0
⇒ x – 2 = 0 or x – 3 = 0
⇒ x = 2 or x = 3
Hence, the roots of the given equation are 2 and 3.
(ii) Method of Perfect Square
This method is made clear by the following steps. Let $ax^2 + bx + c = 0$ be the given
equation.
Step 1. Divide both sides of the equation by a to obtain
$x^2 + \frac{b}{a}x + \frac{c}{a} = 0$
(since a ≠ 0, we are justified in division by a)
Step 2. Transpose the constant term (i.e., the term independent of x) to the R.H.S.,
to get
$x^2 + \frac{b}{a}x = -\frac{c}{a}$
Step 3. Add $\frac{b^2}{4a^2}$ to both the sides.
Thus, we have
$x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} = \frac{b^2}{4a^2} - \frac{c}{a}$
or
$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}$.
This is a pure equation in the variable $x + \frac{b}{2a}$.
So the solution is
$x + \frac{b}{2a} = \frac{\pm\sqrt{b^2 - 4ac}}{2a}$
or
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
Note: This method is useful particularly when $ax^2 + bx + c$ cannot be factored into linear factors
easily.
Example 1.9: Solve $2x^2 + 3x - 1 = 0$.
Solution: In this case a = 2, b = 3, c = – 1.
Hence, roots are $x = \frac{-3 \pm \sqrt{3^2 - 4(2)(-1)}}{2 \cdot 2} = \frac{-3 \pm \sqrt{17}}{4}$.
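The formula obtained by the method of perfect square translates directly into a short routine. The following Python sketch is illustrative only; the function name `quadratic_roots` is an assumption, and `cmath.sqrt` is used so that a negative value under the radical sign also yields a result.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 using x = (-b ± sqrt(b² - 4ac)) / 2a."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    root_disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a))


# Example 1.9: 2x² + 3x - 1 = 0
r1, r2 = quadratic_roots(2, 3, -1)
print(r1, r2)
```

The roots are returned as complex numbers; for Example 1.9 their imaginary parts are zero.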
Nature of Roots
The roots of $ax^2 + bx + c = 0$ are given by $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. The expression inside the
radical sign, i.e., $b^2 - 4ac$ (with a, b, c ∈ R), is called the discriminant.
Case I. $b^2 - 4ac > 0$, i.e., $b^2 > 4ac$.
In this case $\sqrt{b^2 - 4ac}$ is a non-zero real number. Hence, the two roots of the given equation
are unequal and real.
Case II. $b^2 - 4ac = 0$, i.e., $b^2 = 4ac$.
In this case both the roots are real and equal (each equal to – b/2a).
Case III. $b^2 - 4ac < 0$, i.e., $b^2 < 4ac$.
In this case $\sqrt{b^2 - 4ac}$ is an imaginary number and so both the roots are complex
and unequal (they are conjugates of each other).
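The three cases can be summarised as a small decision routine on the discriminant. This is a sketch under the assumption of real coefficients with a ≠ 0; the name `nature_of_roots` is hypothetical.

```python
def nature_of_roots(a, b, c):
    """Classify the roots of a*x**2 + b*x + c = 0 (real coefficients, a != 0)."""
    disc = b * b - 4 * a * c
    if disc > 0:
        return "real and unequal"
    if disc == 0:
        return "real and equal"
    return "complex and unequal"


print(nature_of_roots(1, -5, 6))   # discriminant 1
print(nature_of_roots(1, -2, 1))   # discriminant 0
print(nature_of_roots(1, 7, 18))   # discriminant -23
```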
Example 1.10: Solve $\frac{x+3}{x+2} + \frac{x-3}{x-2} = \frac{2x-3}{x-1}$.
Solution: Given equation is equivalent to
$\frac{(x+2)+1}{x+2} + \frac{(x-2)-1}{x-2} = \frac{2(x-1)-1}{x-1}$
⇒ $1 + \frac{1}{x+2} + 1 - \frac{1}{x-2} = 2 - \frac{1}{x-1}$
⇒ $\frac{x-2-x-2}{x^2-4} = -\frac{1}{x-1}$
⇒ $\frac{-4}{x^2-4} = -\frac{1}{x-1}$
⇒ $4x - 4 = x^2 - 4$
⇒ $x^2 - 4x = 0$ ⇒ x(x – 4) = 0
⇒ x = 0 or 4.
Hence, the roots of the given equation are 0 and 4.
Example 1.11: Solve $x^4 - 13x^2 + 36 = 0$.
Solution: This is not a quadratic equation in x, but on putting $x^2 = t$, we get a quadratic
in t, namely $t^2 - 13t + 36 = 0$.
Roots of this equation are given by (t – 4)(t – 9) = 0.
Thus, t = 4 or t = 9. In other words $x^2 = 4$ or $x^2 = 9$. Hence x = ± 2 or ± 3.
Consequently, the roots of the given equation are ± 2, ± 3.
Example 1.12: Solve (x + 1)(x + 3)(x + 4)(x + 6) = 72.
Solution: Rearrange the factors on the L.H.S. so as to have the sum of constants in
the first two factors the same as in the case of the other two factors.
Since 1 + 6 = 3 + 4, we get (x + 1)(x + 6)(x + 3)(x + 4) = 72
or $(x^2 + 7x + 6)(x^2 + 7x + 12) = 72$
Now put $x^2 + 7x = t$, to obtain
(t + 6)(t + 12) = 72
This implies $t^2 + 18t + 72 = 72$
⇒ t(t + 18) = 0 ⇒ t = 0 or t = – 18
Hence, $x^2 + 7x = 0$ or $x^2 + 7x + 18 = 0$
The first quadratic has 0 and – 7 as its roots and the second quadratic has roots given
by $\frac{-7 \pm \sqrt{49 - 72}}{2}$, i.e., $\frac{-7 \pm \sqrt{-23}}{2}$.
Example 1.13: Solve $\sqrt{5x^2 - 6x + 8} - \sqrt{5x^2 - 6x - 7} = 1$.
Solution: Consider $(5x^2 - 6x + 8) - (5x^2 - 6x - 7) = 15$.
Divide this equation by the given equation.
We get
$\sqrt{5x^2 - 6x + 8} + \sqrt{5x^2 - 6x - 7} = 15$
Adding this equation to the given equation (and halving) we obtain
$\sqrt{5x^2 - 6x + 8} = 8$
⇒ $5x^2 - 6x + 8 = 64$
⇒ $5x^2 - 6x - 56 = 0$
⇒ $x = \frac{6 \pm \sqrt{36 + 1120}}{10} = \frac{6 \pm \sqrt{1156}}{10}$
⇒ $x = \frac{6 \pm 34}{10}$
⇒ x = 4 or $-2\frac{4}{5}$.
Example 1.14: Solve $x^4 - 5x^3 + 15x + 9 = 0$.
Solution: Note that this equation can be written as
$x^4 - 5x(x^2 - 3) + 9 = 0$
i.e., $(x^4 - 6x^2 + 9) - 5x(x^2 - 3) + 6x^2 = 0$
Put $x^2 - 3 = t$.
Thus the given equation is reduced to $t^2 - 5xt + 6x^2 = 0$
This has the roots t = 2x and t = 3x.
In other words we have two quadratic equations,
$x^2 - 3 = 2x$ and $x^2 - 3 = 3x$.
The roots of the former equation are – 1 and 3 and those of the latter are $\frac{3 \pm \sqrt{21}}{2}$.
Example 1.15: Solve $5^x + 5^{2-x} = 26$.
Solution: Multiplying the given equation by $5^x$ we obtain
$5^{2x} + 25 = 26 \times 5^x$
or $5^{2x} - 26 \times 5^x + 25 = 0$
Put $5^x = t$ to obtain the quadratic equation $t^2 - 26t + 25 = 0$.
The roots of this equation are t = 1 or t = 25.
Then, $5^x = 1 = 5^0$ ⇒ x = 0
or $5^x = 25 = 5^2$ ⇒ x = 2
Hence, x = 0 or 2.
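Example 1.16: Solve $3x - 4 = \sqrt{2x^2 - 3x + 2}$.
Solution: Squaring both sides to eliminate the radical sign, we get
$9x^2 - 24x + 16 = 2x^2 - 3x + 2$
or $7x^2 - 21x + 14 = 0$
or $x^2 - 3x + 2 = 0$
⇒ x = 1 or 2
Hence, the roots of the squared equation are 1 and 2. Squaring can introduce extraneous roots, so each value should be checked in the given equation: x = 2 gives 2 = √4, which holds, whereas x = 1 gives – 1 on the left and √1 on the right, which holds only if the negative square root is taken. With the radical denoting the positive square root, the only root is x = 2.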
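Because squaring both sides can introduce values that do not satisfy the original equation, it is worth substituting each candidate back. The sketch below does this for Example 1.16, assuming the radical denotes the non-negative square root (as `math.sqrt` does); the helper name `satisfies` is illustrative.

```python
import math


def satisfies(x):
    """Check 3x - 4 = sqrt(2x² - 3x + 2) with the non-negative square root."""
    lhs = 3 * x - 4
    rhs = math.sqrt(2 * x * x - 3 * x + 2)
    return math.isclose(lhs, rhs, abs_tol=1e-12)


# Candidates obtained from the squared equation x² - 3x + 2 = 0.
checks = {x: satisfies(x) for x in (1, 2)}
print(checks)
```

Under this convention x = 2 passes and x = 1 does not, since the left-hand side is negative at x = 1.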
Example 1.17: Solve $x^4 + x^3 - 4x^2 + x + 1 = 0$.
Solution: In equations of this type, if the terms are arranged according to descending
powers of x, the coefficients of terms equidistant from the first and last terms are equal or
differ only in sign. Equations of this type are called reciprocal equations.
We collect equidistant terms together.
Thus, the given equation is equivalent to
$(x^4 + 1) + (x^3 + x) - 4x^2 = 0$
Divide by $x^2$ to obtain
$\left(x^2 + \frac{1}{x^2}\right) + \left(x + \frac{1}{x}\right) - 4 = 0$
Now put $x + \frac{1}{x} = t$. Then $x^2 + \frac{1}{x^2} = t^2 - 2$.
We get
$t^2 - 2 + t - 4 = 0$
or $t^2 + t - 6 = 0$ ⇒ t = – 3 or 2.
In other words, $x + \frac{1}{x} = -3$ or 2
i.e., $x^2 + 3x + 1 = 0$ or $x^2 - 2x + 1 = 0$
⇒ $x = \frac{-3 \pm \sqrt{5}}{2}$ or x = 1, 1.
Hence, the roots of the given equation are 1, 1, $\frac{-3 \pm \sqrt{5}}{2}$.
Example 1.18: Solve the equation
$x^2 - 6x + 9 = 4\sqrt{x^2 - 6x + 6}$
Solution: Putting $x^2 - 6x + 6 = t$ in the given equation, we get
$t + 3 = 4\sqrt{t}$
or $t^2 + 6t + 9 = 16t$
or $t^2 - 10t + 9 = 0$
⇒ (t – 1)(t – 9) = 0
⇒ t = 1 or t = 9
⇒ $x^2 - 6x + 6 = 1$ or $x^2 - 6x + 6 = 9$
⇒ $x^2 - 6x + 5 = 0$ or $x^2 - 6x - 3 = 0$
⇒ (x – 1)(x – 5) = 0 or $x = \frac{6 \pm \sqrt{36 - 4(-3)}}{2}$
⇒ x = 1, 5 or $x = \frac{6 \pm 4\sqrt{3}}{2}$
⇒ x = 1, 5 or $3 \pm 2\sqrt{3}$.
Example 1.19: Solve $\sqrt{\frac{x}{1-x}} + \sqrt{\frac{1-x}{x}} = 2\frac{1}{6}$.
Solution: Put $\frac{x}{1-x} = t^2$.
We get $t + \frac{1}{t} = \frac{13}{6}$
⇒ $6t^2 + 6 = 13t$
⇒ $6t^2 - 13t + 6 = 0$
⇒ $6t^2 - 4t - 9t + 6 = 0$
⇒ (2t – 3)(3t – 2) = 0
⇒ $t = \frac{3}{2}$ or $\frac{2}{3}$
Now, when $t = \frac{3}{2}$ ⇒ $\frac{x}{1-x} = \frac{9}{4}$ ⇒ 4x = 9 – 9x ⇒ 13x = 9 ⇒ $x = \frac{9}{13}$
and when $t = \frac{2}{3}$ ⇒ $\frac{x}{1-x} = \frac{4}{9}$ ⇒ 9x = 4 – 4x ⇒ 13x = 4 ⇒ $x = \frac{4}{13}$
So, $x = \frac{9}{13}$ or $\frac{4}{13}$.
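Example 1.20: Find the value of $\sqrt{6 + \sqrt{6 + \sqrt{6 + \cdots}}}$.
Solution: Let $x = \sqrt{6 + \sqrt{6 + \sqrt{6 + \cdots \infty}}} = \sqrt{6 + x}$
⇒ $x^2 = 6 + x$
⇒ $x^2 - x - 6 = 0$
⇒ (x – 3)(x + 2) = 0
⇒ x = 3 or – 2.
Since the given expression is a positive square root, the required value is x = 3.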
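Example 1.21: Solve $\frac{x-p}{q} + \frac{x-q}{p} = \frac{q}{x-p} + \frac{p}{x-q}$.
Solution: Given equation can be rewritten as
$\frac{x-p}{q} - \frac{q}{x-p} = \frac{p}{x-q} - \frac{x-q}{p}$
⇒ $\frac{(x-p)^2 - q^2}{q(x-p)} = \frac{p^2 - (x-q)^2}{p(x-q)}$
⇒ $\frac{(x-p-q)(x-p+q)}{q(x-p)} = \frac{(p+x-q)(p-x+q)}{p(x-q)}$
Either x – p – q = 0, i.e., x = p + q
or we get $\frac{x-p+q}{q(x-p)} = \frac{-(p+x-q)}{p(x-q)}$
Simplifying, we get $(p+q)x^2 - (p^2+q^2)x = 0$
⇒ x = 0 or $x = \frac{p^2+q^2}{p+q}$
Hence, x = 0 or $\frac{p^2+q^2}{p+q}$ or p + q.
Example 1.22: Solve $x + \sqrt{x} = \frac{6}{25}$.
Solution: Putting $\sqrt{x} = t$, we get
$t^2 + t = \frac{6}{25}$
⇒ $25t^2 + 25t - 6 = 0$
⇒ $t = \frac{-25 \pm \sqrt{625 - 4(-6)(25)}}{50} = \frac{-25 \pm \sqrt{625 + 600}}{50} = \frac{-25 \pm \sqrt{1225}}{50} = \frac{-25 \pm 35}{50}$
= $\frac{10}{50}$ or $\frac{-60}{50}$ = $\frac{1}{5}$ or $-\frac{6}{5}$
Then $x = t^2 = \frac{1}{25}$ or $\frac{36}{25}$. The value $\frac{36}{25}$ corresponds to $\sqrt{x} = -\frac{6}{5}$, i.e., to the negative square root; if $\sqrt{x}$ denotes the positive square root, the only root is $x = \frac{1}{25}$.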
Example 1.23: Solve $x^{2/3} + x^{1/3} - 2 = 0$.
Solution: Put $x^{1/3} = t$, to obtain
$t^2 + t - 2 = 0$
⇒ (t + 2)(t – 1) = 0
⇒ t = 1 or – 2
In case t = 1, we get $x^{1/3} = 1$ ⇒ x = 1
In case t = – 2, we get $x^{1/3} = -2$ ⇒ x = – 8
Hence, x = 1 or – 8.
Example 1.24: Solve $x^2 + x + 10\sqrt{x^2 + 3x + 16} = 2(20 - x)$.
Solution: Given equation can be written as
$x^2 + 3x - 40 + 10\sqrt{x^2 + 3x + 16} = 0$
Put $\sqrt{x^2 + 3x + 16} = t$. Then $x^2 + 3x = t^2 - 16$.
So, the given equation simplifies to
$t^2 - 16 - 40 + 10t = 0$
or $t^2 + 10t - 56 = 0$
⇒ (t + 14)(t – 4) = 0
⇒ t = 4 or – 14
Now t = 4 ⇒ $x^2 + 3x + 16 = 16$
⇒ $x^2 + 3x = 0$ ⇒ x = 0 or – 3
While t = – 14 ⇒ $x^2 + 3x + 16 = 196$
⇒ $x^2 + 3x - 180 = 0$
⇒ $x = \frac{-3 \pm \sqrt{9 + 720}}{2} = \frac{-3 \pm \sqrt{729}}{2} = \frac{-3 \pm 27}{2}$ = 12 or – 15
Hence, x = 0, – 3, 12, – 15. Note that 12 and – 15 arise from t = – 14, i.e., from the negative square root; if the radical denotes the positive square root, only x = 0 and x = – 3 satisfy the given equation.
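Example 1.25: Solve $3x^2 - 18 + \sqrt{3x^2 - 4x - 6} = 4x$.
Solution: Putting $\sqrt{3x^2 - 4x - 6} = t$, we get
$3x^2 - 4x = t^2 + 6$
So, the given equation is reduced to
$t^2 + 6 - 18 + t = 0$ ⇒ $t^2 + t - 12 = 0$
⇒ (t + 4)(t – 3) = 0
⇒ t = 3 or – 4
Now, t = 3 ⇒ $3x^2 - 4x - 6 = 9$ ⇒ $3x^2 - 4x - 15 = 0$
⇒ $3x^2 - 9x + 5x - 15 = 0$
⇒ (x – 3)(3x + 5) = 0
⇒ x = 3 or $-\frac{5}{3}$
Also, t = – 4 ⇒ $3x^2 - 4x - 6 = 16$ ⇒ $3x^2 - 4x - 22 = 0$
⇒ $x = \frac{4 \pm \sqrt{16 - 4 \cdot 3(-22)}}{6} = \frac{4 \pm \sqrt{16 + 264}}{6} = \frac{4 \pm \sqrt{280}}{6}$
⇒ $x = \frac{2 \pm \sqrt{70}}{3}$
Hence, x = 3, $-\frac{5}{3}$, $\frac{2 \pm \sqrt{70}}{3}$. The last pair arises from t = – 4, i.e., from the negative square root; if the radical denotes the positive square root, only x = 3 and $x = -\frac{5}{3}$ satisfy the given equation.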
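Example 1.26: Solve $\frac{\sqrt{1 + x^2} + \sqrt{1 - x^2}}{\sqrt{1 + x^2} - \sqrt{1 - x^2}} = 3$.
Solution: Simplifying the given equation, we get
$\sqrt{1 + x^2} + \sqrt{1 - x^2} = 3\sqrt{1 + x^2} - 3\sqrt{1 - x^2}$
⇒ $2\sqrt{1 + x^2} = 4\sqrt{1 - x^2}$
⇒ $\sqrt{1 + x^2} = 2\sqrt{1 - x^2}$
⇒ $1 + x^2 = 4(1 - x^2)$
⇒ $5x^2 = 3$
⇒ $x^2 = \frac{3}{5}$
⇒ $x = \pm\sqrt{\frac{3}{5}}$.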
Logarithmic
In mathematics, the logarithmic function is a very important function. If $y = a^x$, then x is
called the logarithm of y to the base a, which is expressed mathematically as
$x = \log_a y$.
As an example, $100 = 10^2$, so $2 = \log_{10} 100$. This tells us that 2 is the number of factors of 10
that must be multiplied together to get 100: thus 10 × 10 = 100. The base-2 logarithm of
16 is 4 because 2 must be multiplied by itself four times to get 16: 2 × 2 × 2 × 2 = 16.
Since $10^2 = 100$, $\log_{10} 100 = 2$, and since $2^4 = 16$, $\log_2 16 = 4$.
The logarithm of x to the base b is written as $\log_b(x)$. If the
base is understood, we may write simply log(x).
If $x = b^y$, then $y = \log_b(x)$.
Logarithms convert the tedious task of multiplication into addition using the
formula log(x·y) = log x + log y. This property made complex calculations much
easier, and logarithmic tables were prepared for this purpose.
A logarithm with base e is known as a natural logarithm and one with base 10 is
known as a common logarithm. In calculus, the logarithm is taken to be the natural logarithm. In
binary mathematics, 2 is used as the base, since the binary system uses two discrete symbols to
represent numbers or characters.
Properties of the Logarithm
For x > 0 and b > 0 (but b ≠ 1), $\log_b(x)$ is a unique real number. Although the base can
be any positive number except 1, normally 10, e or 2 is used. Logarithms are
defined for real as well as for complex numbers.
The most important property of logarithms lies in converting multiplication to
addition. We know that
$b^x \times b^y = b^{x+y}$,
which by taking logarithms on both sides becomes
$\log_b(b^x \times b^y) = \log_b(b^{x+y}) = x + y = \log_b(b^x) + \log_b(b^y)$.
For example,
$4 = 2^2$ ⇒ $\log_2(4) = 2$,
$8 = 2^3$ ⇒ $\log_2(8) = 3$,
$\log_2(32) = \log_2(4 \times 8) = \log_2(4) + \log_2(8) = 2 + 3 = 5$.
A related property is the reduction of exponentiation to multiplication. Using the
identity
$c = b^{\log_b(c)}$,
it follows that c to the power p (exponentiation) is:
$c^p = \left(b^{\log_b(c)}\right)^p = b^{p\log_b(c)}$,
or, taking logarithms:
$\log_b(c^p) = p\log_b(c)$.
Hence, to raise a number to a power p, one finds the logarithm of the
number and multiplies it by p. The exponentiated value is then the inverse or
antilogarithm of this product, which means
number to power = $b^{\text{product}}$.
With the use of logarithms, lengthy numerical calculations become easier. To
make the process easy, tables of logarithms or slide rules are used.
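The product and power rules can be checked numerically. The following Python sketch uses the standard `math.log(x, base)` function; the variable names are illustrative, and comparisons use `math.isclose` because floating-point logarithms are only approximate.

```python
import math

b = 2.0  # base used for the check

# Product rule: log_b(4 * 8) = log_b(4) + log_b(8)
prod_rule = math.isclose(math.log(4 * 8, b), math.log(4, b) + math.log(8, b))

# Power rule: log_b(7**3) = 3 * log_b(7)
power_rule = math.isclose(math.log(7 ** 3, b), 3 * math.log(7, b))

# Multiplication carried out by adding logarithms and taking the antilogarithm.
via_logs = b ** (math.log(4, b) + math.log(8, b))

print(prod_rule, power_rule, via_logs)
```

The last line mirrors how a table of logarithms is used: add the logarithms, then look up the antilogarithm of the sum.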
Example 1.27: What is $\log_3 27$?
Solution: 3, because $27 = 3^3$.
Example 1.28: What is $\log_5 (1/25)$?
Solution: –2, because $1/25 = 1/(5^2) = 5^{-2}$.
Logarithmic Identities
$\log(cd) = \log(c) + \log(d)$
$\log(c/d) = \log(c) - \log(d)$
$\log(c^d) = d\log(c)$
$\log(\sqrt[d]{c}) = \frac{\log(c)}{d}$
Logarithm as a Function
In the early stages of the development of logarithms, a logarithm was taken to be an arithmetic
sequence of numbers in correspondence with a geometric sequence of other positive
real numbers. Gradually it came to be considered as an analytic function, which can also
be extended to cover complex numbers.
The logarithm has the form $\log_b(x)$, where the base b is fixed and the argument x is
a variable. The base must be a positive real number other than 1. The
logarithmic function with base b is the inverse of the exponential function
$b^x$. The term logarithm is normally used instead of logarithmic function.
Logarithm of a Negative or Complex Number
Originally, there was no place for the logarithm of negative or complex numbers. The definition has since been extended
to complex numbers as well, but the function so obtained is not single-valued.
For instance, $e^{2\pi i} = e^0 = 1$, so $\log_e 1$ takes the values 0 and 2πi (and, more generally, 2πki for every integer k). Let a complex
number z be given by z = x + iy.
To find its logarithm, we express z as $z = re^{i\theta} = r\cos\theta + i\,r\sin\theta$.
Here $r = |z| = \sqrt{x^2 + y^2}$ is called the modulus of z, and θ = arg(z), called the
argument of z, is an angle such that x = r cos θ and y = r sin θ. Here arg(z)
is multi-valued. When the base of the logarithm is chosen as e, it is called the natural
logarithm and is denoted by ln. We get the complex logarithm as:
$\ln(z) = \ln(r) + i(\theta + 2\pi k)$, where k is any integer.
We get the principal value by putting k = 0 and taking θ in the range (–π, π]. For positive real
numbers the principal value has zero imaginary part and coincides with the ordinary natural
logarithm. The logarithm of a negative number –r (with r > 0) has its principal value as:
$\ln(-r) = \ln(r) + i\pi$
For a base b other than e, the complex
logarithm is $\log_b(z) = \ln(z)/\ln(b)$. The principal value of $\log_b(z)$ is then given by the principal values of ln(z) and
ln(b).
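The principal value and the other branches can be illustrated with Python's standard `cmath` module, whose `cmath.log` returns the principal value. The helper `log_branch` below is a hypothetical name introduced only to show the effect of the integer k.

```python
import cmath
import math

z = -5  # a negative real number, so r = 5 and the principal argument is π

principal = cmath.log(z)  # principal value: ln(5) + iπ


def log_branch(z, k):
    """Value of the complex logarithm on branch k: principal value + 2πik."""
    return cmath.log(z) + 2j * math.pi * k


# Every branch is a valid logarithm: exponentiating returns z again.
back = cmath.exp(log_branch(z, 1))
print(principal, back)
```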
Change of Base: To find a logarithm to a base other than those built into the
calculator, we use the change-of-base formula. We find the logarithm to base b using
any other known base, say k:
$\log_b(x) = \frac{\log_k(x)}{\log_k(b)}$
If we are required to find the logarithm to base 2 of the number 16 with the help of
a calculator, we proceed as follows:
$\log_2(16) = \frac{\log(16)}{\log(2)} = 4$
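The change-of-base formula is easy to try out in code. In this sketch the function name `log_base` is illustrative, and base 10 (`math.log10`) plays the role of the known base k available on a calculator.

```python
import math


def log_base(x, b):
    """log_b(x) computed from common logarithms via the change-of-base formula."""
    return math.log10(x) / math.log10(b)


value = log_base(16, 2)
print(value)
```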
Use of Logarithms
In equations where exponents are unknown, logarithms are very useful. Their
derivatives are simple and hence used in the solution of integrals.
Scientific Applications
Logarithms are used to define many quantities used in scientific applications. These
broadly include the following:
pH measurement: In chemistry, pH is defined as pH = –log10[H+], where [H+] is the
activity of hydronium ions. The activity of hydronium ions in neutral water is 10^–7 mol/L at
25 °C, so its pH value is 7. pH thus gives a scale of acidity, commonly quoted from 0 to 14. A liquid is
acidic if pH < 7 and alkaline if pH > 7.
Measure of power level: Power levels and voltage levels in electrical, electronics and
telecommunication work are frequently expressed in decibels, written dB, given by
10 log10(Ratio of Power). The neper is a similar unit based on the natural logarithm; for a power ratio it is given by
(1/2) ln(Ratio of Power).
Measurement of earthquakes: The intensity of an earthquake is measured on the Richter scale,
which is a base-10 logarithmic scale.
In Astronomy: Eyes respond logarithmically to brightness; hence the brightness of stars
is measured on a logarithmic scale.
In Psychophysics: The relationship between stimulus and sensation has been shown by
Weber–Fechner to be logarithmic.
In computer science: Computational complexity is often expressed in terms of logarithms.
For example, sorting N items by comparison takes time proportional to N × log N, and searching a sorted list of N items takes time proportional to log N.
Base-2 logarithms are used in computing memory storage space.
In Information science: In information theory, the quantity of information is measured
in terms of logarithms. If a
message recipient may expect any one of N possible messages with equal likelihood,
then the amount of information conveyed by any one such message is quantified as
log2 N bits.
Log-log chart: In engineering and scientific applications many log-log and semilog
charts are used.
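Three of the definitions above (pH, decibels and bits of information) can be written as one-line functions. This is a minimal sketch; the function names and the sample inputs are illustrative assumptions rather than values taken from the text, apart from the neutral-water activity of 10^–7.

```python
import math


def ph(h_activity):
    """pH = -log10 of the hydronium ion activity."""
    return -math.log10(h_activity)


def decibels(power_ratio):
    """Power level in dB = 10 * log10(ratio of power)."""
    return 10 * math.log10(power_ratio)


def information_bits(n_messages):
    """Information in one of N equally likely messages = log2(N) bits."""
    return math.log2(n_messages)


print(ph(1e-7), decibels(100), information_bits(8))
```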
Logarithm According to Calculus
The natural logarithm of a positive number x is defined in calculus by the integral
$\ln(x) \equiv \int_1^x \frac{dt}{t}$
Its derivative is given by
$\frac{d}{dx}\ln(x) = \frac{1}{x}$
To find the derivative for other bases, we apply the change-of-base rule:
$\frac{d}{dx}\log_b(x) = \frac{d}{dx}\frac{\ln(x)}{\ln(b)} = \frac{1}{x\ln(b)} = \frac{\log_b(e)}{x}$
The integral of ln(x) is given by:
$\int \ln(x)\,dx = x\ln(x) - x + C$
For other bases, the integral of $\log_b(x)$ is given by:
$\int \log_b(x)\,dx = x\log_b(x) - \frac{x}{\ln(b)} + C = x\log_b\left(\frac{x}{e}\right) + C$
Expanding the natural logarithm into a series:
For |x| < 1, from the binomial theorem,
$\frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots$
Integrating both the sides, we get
$-\ln(1-x) = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \cdots$ or
$\ln(1-x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots$
Putting z = 1 – x and thus x = 1 – z, we get
$\ln z = -(1-z) - \frac{(1-z)^2}{2} - \frac{(1-z)^3}{3} - \frac{(1-z)^4}{4} - \cdots$
Another series expansion of ln z is given below:
$\ln(z) = 2\sum_{n=0}^{\infty}\frac{1}{2n+1}\left(\frac{z-1}{z+1}\right)^{2n+1}$
for z with positive real part.
By substituting –x for x in the series for ln(1 – x), we get
$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$
Subtraction gives:
$\ln\frac{1+x}{1-x} = \ln(1+x) - \ln(1-x) = 2x + 2\frac{x^3}{3} + 2\frac{x^5}{5} + \cdots$
Putting $z = \frac{1+x}{1-x}$ and thus $x = \frac{z-1}{z+1}$, we get
$\ln z = 2\left[\frac{z-1}{z+1} + \frac{1}{3}\left(\frac{z-1}{z+1}\right)^3 + \frac{1}{5}\left(\frac{z-1}{z+1}\right)^5 + \cdots\right]$
As z tends to 1, the convergence becomes faster. To use this formula for a general z, one first
obtains an approximate value y ≈ ln(z) and then computes A = z/exp(y), where exp(y)
is computed using the exponential series; if y is not very large, that series converges fast.
Finally, we get ln(z) = y + ln(A). Here A is approximately equal to 1, as
desired, so the series for ln(A) converges quickly. For larger values of z we write z = a × 10^b, so that ln(z) = ln(a) + b ×
ln(10).
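The series in powers of (z – 1)/(z + 1) can be summed directly and compared with a library logarithm. The sketch below is restricted to real z > 0; the function name `ln_series` and the default of 50 terms are assumptions chosen for illustration.

```python
import math


def ln_series(z, terms=50):
    """Approximate ln(z) for real z > 0 by 2 * sum of u**(2n+1)/(2n+1), u = (z-1)/(z+1)."""
    if z <= 0:
        raise ValueError("z must be positive")
    u = (z - 1) / (z + 1)
    return 2 * sum(u ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


approx = ln_series(2.0)
print(approx, math.log(2.0))
```

For z close to 1 the ratio u is small, so far fewer terms are needed, which is the point made in the paragraph above.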
Exponential
This function is of prime importance in mathematics and finds wide application in
calculus and in many branches of science and engineering. An exponential function of x
is written as exp(x) or $e^x$. Here e is a constant and an irrational number. Its value,
approximately 2.718281828, was estimated by Euler and the constant bears his name: it is called ‘Euler’s
number’ and is also the base of the natural logarithm. The exponential function is the
inverse of the logarithmic function and is sometimes called the antilogarithm.
The exponential function is almost flat, rising slowly, for negative values of x,
increases rapidly for positive values of x, and equals 1 when x = 0. Its ordinate at any point equals the slope
of its curve at that point. A quantity that varies as $e^{kx}$ is said to show exponential
growth when k > 0 and exponential decay when k < 0. Also, very fast growth, for example
population growth, is commonly called exponential growth.
The graph of an exponential function always lies above the abscissa, since $e^x$ is
always positive. The function is increasing throughout; as x moves along the negative side of the
X-axis the graph comes closer and closer to the X-axis but never touches it.
The exponential function $e^x$ may be expanded into an infinite series, called a power
series, given below:
$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$
This function can also be defined as a limit, as given below:
$e^x = \lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n$ or $e^x = \lim_{n\to 0}(1 + nx)^{1/n}$
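Both definitions can be evaluated numerically and compared with a library value. In the sketch below the function names, the 30-term cut-off and the choice n = 1,000,000 are illustrative assumptions; the limit form converges much more slowly than the series.

```python
import math


def exp_series(x, terms=30):
    """Partial sum of the power series: sum of x**n / n! for n = 0 .. terms-1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # turns x**n/n! into x**(n+1)/(n+1)!
    return total


def exp_limit(x, n=1_000_000):
    """Approximation of the limit definition (1 + x/n)**n for a large n."""
    return (1 + x / n) ** n


print(exp_series(1.0), exp_limit(1.0), math.e)
```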
Exponential functions are predominant in mathematics, engineering and the various sciences
because of the characteristic behaviour of the exponential function with respect
to its derivative, which is:
$\frac{d}{dx}e^x = e^x$
• The slope of the graph of $e^x$ at any point x is $e^x$.
• The rate of increase of the function with respect to x at a point x is $e^x$.
• Since y′ = y, this function is a solution of the differential equation y′ – y = 0.
In higher mathematical applications there are a great number of differential
equations whose solutions are exponential functions. Laplace’s equation and the equation
of simple harmonic motion are examples.
There are exponential functions with other bases. For a
function $y = a^x$:
$\frac{d}{dx}a^x = (\ln a)\,a^x$.
Proof.
$y = a^x$
$\ln y = \ln a^x$
$\ln y = x\ln a$
$\frac{1}{y}\frac{dy}{dx} = \ln a$
$\frac{dy}{dx} = (\ln a)\,y = (\ln a)\,a^x$
This shows that the derivative of an exponential function is a constant multiple of the
function itself. If the rate of change of a variable is proportional to the variable itself, the solution
is an exponential function. Population growth, radioactive decay, continuously
compounded interest, etc., are examples of exponential functions in practical life. In all
these cases the variable is proportional to an exponential function of time. For a
differentiable function f(x), by the chain rule:
$\frac{d}{dx}e^{f(x)} = f'(x)\,e^{f(x)}$
Exponential Function on the Complex Plane
As in the case of real numbers, the exponential function can be defined for complex
quantities too. Some of these definitions are identical to those given for the real-valued
exponential function. The power series definition can be used, with the real
variable replaced by a complex one, as given below:
$e^z = \sum_{n=0}^{\infty}\frac{z^n}{n!}$
The derivative rule for real quantities also holds for complex quantities, and
this can be stated as below:
$\frac{d}{dz}e^z = e^z$ holds in the complex plane.
We can now extend the real exponential function to a complex one
by writing $e^{x+iy} = e^x e^{iy}$. Here $e^x$ is the real exponential and $e^{iy} = \cos(y) + i\sin(y)$, so
the real definition is retained as the special case y = 0.
We can now write,
$e^{a+bi} = e^a(\cos b + i\sin b)$
Here a and b are real values.
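The identity can be verified numerically with Python's standard `cmath` module. The values a = 1 and b = π/3 below are arbitrary sample values chosen for illustration.

```python
import cmath
import math

a, b = 1.0, math.pi / 3  # sample real and imaginary parts

lhs = cmath.exp(complex(a, b))                          # e^(a + bi)
rhs = math.exp(a) * complex(math.cos(b), math.sin(b))   # e^a (cos b + i sin b)

print(lhs, rhs)
```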
Example 1.29: Looking at the functions below, find the function(s) which is/are not
exponential.
(i) $f(x) = 3e^{-2x}$
(ii) $g(x) = 2^{x/2}$
(iii) $h(x) = x^{3/2}$
(iv) $g(x) = 15/7^x$
(v) $p(x) = x^e$
Solution: Here, h(x) and p(x) are not exponential functions. For a function to be
exponential, the independent variable should be the exponent.
Example 1.30: Find the domain and range of the function defined as $f(x) = kb^x$.
Discuss the nature of the graph of this function. How does f(x) change when (i) x tends to
infinity and (ii) x tends to negative infinity? Are there any horizontal asymptotes?
Solution: The domain of this function is the set of real numbers, and (taking k > 0 and b > 0, b ≠ 1) the range is the set
of all positive real numbers.
When b > 1, the function f(x) is increasing; the graph rises in the right portion. (i)
When x tends to infinity, f(x) increases without bound. (ii) When x tends to negative
infinity, f(x) goes on decreasing and tends to zero. The line given
by y = 0, which is the x-axis, is the horizontal asymptote.
For 0 < b < 1, the situation is the opposite. The function decreases with increasing x:
the graph goes from high on the left to low on the right, f(x) tends to zero as x tends to infinity
and increases without bound as x tends to negative infinity. The x-axis is again the horizontal asymptote.
Example 1.31: Bacteria grow exponentially in a culture. It was observed that the
number of bacteria at 2:00 p.m. was 80 and at 6:00 p.m. it was 500. The growth is
given by a function $f(t) = ke^{at}$. Find the population of bacteria at 10:00 p.m.
Solution: Measure t in hours from 2:00 p.m. Then f(0) = k = 80, and $f(4) = 80e^{4a} = 500$ gives a = ln(6.25)/4 ≈ 0.4581. The growth is therefore given by $f(t) = 80e^{0.4581t}$ at any time t. The number of bacteria
at 10:00 p.m. (t = 8) will be 80 × (6.25)² = 3125.
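The constants k and a can be recovered from the two observations and the model evaluated at t = 8. The sketch below assumes t is measured in hours from 2:00 p.m.; the function name `population` is illustrative.

```python
import math

k = 80                          # population at t = 0 (2:00 p.m.)
a = math.log(500 / 80) / 4      # from 80 * e^(4a) = 500


def population(t):
    """Population t hours after 2:00 p.m. under f(t) = k * e^(a t)."""
    return k * math.exp(a * t)


at_10_pm = population(8)
print(round(a, 4), round(at_10_pm))
```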
Example 1.32: A European country conducted the nuclear test on an island in the
Pacific Ocean in 1990. Just after the explosion, the level of Strontium-90 on the
island was noted as 100 times the ‘safe level’ for human habitation. Taking half-life
of Strontium-90 as 28 years, find the number of years after which the island will
once again be habitable.
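Solution: After t years the level is $100 \times (1/2)^{t/28}$ times the safe level. Setting this equal to 1 gives $2^{t/28} = 100$, i.e., $t = 28\log_2 100 \approx 186$. The island will therefore be habitable after approximately 186 years, which is the
year 2176.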
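The waiting time follows from the half-life relation and a base-2 logarithm. This is a minimal sketch with illustrative variable names; it assumes the level halves every 28 years and must fall by a factor of 100.

```python
import math

half_life = 28        # years, Strontium-90
excess_factor = 100   # initial level relative to the safe level

# Solve 100 * (1/2)**(t / 28) = 1 for t.
years = half_life * math.log2(excess_factor)
habitable_year = 1990 + round(years)
print(years, habitable_year)
```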
Utility
If U(x, y) denotes the satisfaction obtained by an individual when he buys quantities
x and y of two commodities X and Y, then U(x, y), a function of the two variables x
and y, is called the utility function or utility index of the individual,
e.g., U = (x + 3)(y + 1) or $U = (x - 1)^{0.5}(y - 2)^{0.5}$
Notes
1. Still there are other functions such as Marginal Revenue Function and
Marginal cost function, which are based on the (complete) derivatives or
partial derivatives. They are dealt with in the respective chapters of
differential/integral calculus.
2. Break-Even Analysis entails finding out the minimum quantum of production
(and sales) that a firm has to achieve in its attempt to recover its investment
(total fixed cost), whereafter profits start accruing.
At the break-even point, Profit = Loss = 0
or Total Revenue = Total Cost
i.e., R(x) = C(x)
or, P·x = TFC + AVC·x
⇒ x(P – AVC) = TFC, where P = unit price, TFC = total fixed cost and AVC = average variable cost per unit
or $x_B = \frac{TFC}{P - AVC}$ units (Break-even output) ($Q_B$)
Break-even Sales (Revenue):
$s_B = P \cdot x_B = P \cdot Q_B$
or $s_B = \frac{P(TFC)}{P - AVC}$ or $\frac{TFC}{1 - \frac{AVC}{P}}$ (Break-even Sales)
or $s_B = \frac{TFC}{1 - \frac{TVC}{TR}}$, since AVC/P = TVC/TR, where TVC = total variable cost and TR = total revenue.
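The break-even formulas can be sketched in a few lines of Python. The function names and the sample figures (fixed cost 10,000, unit price 50, average variable cost 30) are hypothetical and chosen only to illustrate the arithmetic.

```python
import math


def break_even_output(tfc, price, avc):
    """Break-even quantity x_B = TFC / (P - AVC)."""
    if price <= avc:
        raise ValueError("price must exceed average variable cost")
    return tfc / (price - avc)


def break_even_sales(tfc, price, avc):
    """Break-even revenue s_B = TFC / (1 - AVC / P)."""
    return tfc / (1 - avc / price)


units = break_even_output(10_000, 50, 30)
revenue = break_even_sales(10_000, 50, 30)
print(units, revenue)
```

With these sample figures the break-even revenue equals the unit price times the break-even output, as the formulas require.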
Check Your Progress - 2
1. What is a quadratic equation?
2. What are logarithms used to define?
1.4 SUMMARY
• If to each value of a variable x, there corresponds one definite value of
another variable y, then we say that y is a function of x, and denote it as
y = f(x).
• The set of values of x for which the value of the function y = f(x) is
determined, is called the domain of the function, while the set of values of
y is called the range of the function.
• The range of values that a variable can take, be it a closed (or semi-closed)
interval or an open (semi-open) interval or a combination of such intervals
is known as the interval of the variable.
• When a function has only one value corresponding to each value of the
independent variable, the function is called a single-valued function. If a
function has several values corresponding to each value of the independent
variable, it is called a multi-valued or many valued function.
• If f(x) changes sign when the sign of x is changed, i.e., if f(–x) = – f(x),
then f(x) is said to be an odd function of x.
• Geometrically, an even function is symmetric with respect to the y-axis
while an odd function is symmetric with respect to the origin.
• The only function which is both even and odd is the constant function which
is identically zero (i.e., f(x) = 0 for all x).
• The sum of two odd functions is odd, and any constant multiple of an odd
function is odd.
• The derivative of an even function is odd.
• The product of an even function and an odd function is an odd function.
• The Fourier series of a periodic even function includes cosine terms only
while that of a periodic odd function includes sine terms only.
• Both the even and the odd functions form a vector space over the reals. In
fact, the vector space of all real-valued functions is the direct sum of the
spaces of even and odd functions.
• The even functions form a commutative algebra over the reals. However,
the odd functions do not form an algebra over the reals.
• When x and y both occur together in an equation but y is not capable of
being directly expressed in terms of x, then y is said to be an implicit
function of x.
• If y is a function of x, then on the other hand, x is also (yet another) function
of y. The latter is called the inverse function of the former function y, i.e., if
y = f(x), then x = g(y)
• The distance an object travels in four hours depends on its speed. When
such relationships exist, one variable is said to be a function of the other.
• The relationship between the side x of a square and its area A can be represented by
$A = f(x) = x^2$.
• The set of numbers created by substituting every value for x into the
equation is known as the range of the function.
• A linear equation is obtained by equating to zero a linear expression.
• We can identify an ordered pair (a, b) of real number with a point in the
plane of coordinate geometry.
• Whenever the solution set of a system of linear inequations is empty, we
say that the inequations are inconsistent.
• A system of linear inequations is said to be consistent if its solution set is
non-empty.
• A quadratic equation is called pure if it does not contain the first power
of x. In other words, in a pure quadratic equation the coefficient of x must be
zero. Thus a pure quadratic equation is of the type $ax^2 + b = 0$ with a ≠ 0.
• A quadratic equation which is not pure is called an affected quadratic
equation.
• If the expression ax2 + bx + c can be factored into linear factors then each
of the factors, put to zero, provides us with a root of the given quadratic
equation.
• Logarithm with base e is known as natural logarithm and those with base
10 are known as common logarithm.
• In early stages of development of logarithms it was taken to be an
arithmetic sequence of numbers in correspondence to a geometric
sequence of other positive real numbers. But gradually it was considered
as an analytic function which can also be extended to cover complex
numbers.
• In equations where exponents are unknown, logarithms are very useful.
Their derivatives are simple and hence used in the solution of integrals.
• Exponential function is of prime importance in mathematics and finds its
wide application in calculus and many branches of science and engineering.
• The graph of an exponential function always lies above the abscissa, since
ex is always positive.
• As in the case of real numbers, the exponential function can be defined for
complex quantities too. Some of these definitions are identical to those
given for real-valued exponential functions.
1.5
KEY WORDS
• Interval of a Variable: It is the range of values that a variable can take,
be it a closed interval or an open interval or a combination of both.
• Multi-valued function: If a function has several values corresponding to
each value of the independent variable, it is called a multi-valued or many
valued function.
• Quadratic Equation: An equation of degree two is called a quadratic
equation.
1.6
ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. When a function has only one value corresponding to each value of the
independent variable, the function is called a single-valued function.
2. Domain and range are the two main characteristics of a function.
Check Your Progress - 2
1. An equation of degree two is called a quadratic equation.
2. Logarithms are used to define many quantities, used in scientific
applications.
1.7
SELF-ASSESSMENT QUESTIONS
1. Write a short note on functions.
2. Give a brief classification of functions.
3. Discuss the properties of functions.
4. What do you mean by interval of a variable?
5. What do you mean by graphical representation of solution sets? Discuss.
6. List the methods of solving affected quadratic equations.
7. Solve $x^4 + x^3 - 4x^2 + x + 1 = 0$.
8. Discuss the logarithm of a complex number in detail.
9. List the various uses of logarithms.
1.8
FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Arithmetic Progression and
Series
UNIT–2
ARITHMETIC PROGRESSION AND SERIES
Objectives
After going through this unit, you will be able to:
• Define sequence and its significance
• Discuss arithmetic progression and its importance
• Analyse the general term of an arithmetic progression
• Understand the concept of arithmetic mean
Structure
2.1 Introduction
2.2 Sequence
2.3 Arithmetical Mean
2.4 Summary
2.5 Key Words
2.6 Answers to ‘Check Your Progress’
2.7 Self-Assessment Questions
2.8 Further Readings
2.1
INTRODUCTION
This unit will discuss arithmetic progression and series. An arithmetic progression is
a sequence in which each term, after the first, is obtained by adding a fixed number
to the term before it. This fixed number is called the common difference. A sequence
in which each term, except the first, is obtained by multiplying the term before it by
a fixed non-zero number is called a geometric progression. A related type is the
harmonic progression, in which the reciprocals of the terms form an arithmetic
progression.
This unit will discuss the arithmetic mean and arithmetic progression in detail and will
also explain the insertion of n arithmetic means between two given numbers.
2.2
SEQUENCE
An arithmetic progression is a sequence, in which each term, except the first, is
obtained by adding a fixed number to the term immediately preceding it. The fixed
number is called the common difference.
Remarks
(i) In an AP, we usually denote the first term by ‘a’, the common difference by
‘d’, the general term, i.e., nth term by Tn and the sum to first n terms by Sn
respectively.
(ii) Clearly, d = (T2 – T1) = (T3 – T2) = ... = (Tn – Tn–1)
(iii) (Tn = Sn – Sn–1)
Some Standard Results
If a, a + d, a + 2d, ... is an AP, then
(i) Tn = a + (n – 1) d
(ii) nth term from the end = l – (n – 1) d, where l is the last term.
(iii) $S_n = \frac{n}{2}[2a + (n - 1)d] = \frac{n}{2}(a + l)$, where l is the last term.
(iv) If a, b, c are in AP, then b is called the arithmetic mean (AM) between a
and c, and $b = \frac{a + c}{2}$.
(v) If a, x1, x2, ... , xn, b are in AP, then x1, x2, x3, ... , xn are called the
“n arithmetic means” between a and b.
Here, $d = \frac{b - a}{n + 1}$ and $x_n = a + \frac{n(b - a)}{n + 1}$
(vi) If a fixed number is added to (or subtracted from) each term of an AP then
the resulting sequence is also an AP.
(vii) If each term of an AP is multiplied or divided by a non-zero fixed number,
then the resulting sequence is also an AP.
(viii) It is convenient (when the sum of three/five/seven/... consecutive terms of
an AP is given) to make a choice of:
three numbers in AP as (a – d), a, (a + d),
five numbers in AP as (a – 2d), (a – d), a, (a + d), (a + 2d) and so on.
(ix) It is convenient (when the sum of four/six/eight/ ... consecutive numbers of
an AP is given) to make a choice of:
four numbers in AP as (a – 3d), (a – d), (a + d), (a + 3d),
six numbers in AP as (a – 5d), (a – 3d), (a – d), (a + d), (a + 3d),
(a + 5d) and so on.
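The standard results above can be checked numerically. The following is a minimal Python sketch rather than part of the original text: the function names ap_term, ap_sum and arithmetic_means are illustrative assumptions, and exact fractions are used so that a non-integer common difference causes no rounding.

```python
from fractions import Fraction


def ap_term(a, d, n):
    """n-th term of an AP: Tn = a + (n - 1)d."""
    return a + (n - 1) * d


def ap_sum(a, d, n):
    """Sum of the first n terms: Sn = (n/2)[2a + (n - 1)d]."""
    return Fraction(n, 2) * (2 * a + (n - 1) * d)


def arithmetic_means(a, b, n):
    """The n arithmetic means between integers a and b, with d = (b - a)/(n + 1)."""
    d = Fraction(b - a, n + 1)
    return [a + k * d for k in range(1, n + 1)]


# Illustrative values: first term 3, common difference 2, eight terms.
last_term = ap_term(3, 2, 8)
total = ap_sum(3, 2, 8)
# The alternative form Sn = (n/2)(a + l) should give the same total.
same_total = Fraction(8, 2) * (3 + last_term)
means = arithmetic_means(9, 27, 5)
```

Comparing total with same_total is a quick way to confirm that the two forms of Sn in result (iii) agree.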
Geometric Progression
A geometric progression is a sequence, in which each term, except the first, is
obtained by multiplying the term immediately preceding it, with a fixed non-zero
number.
The fixed number is called the common ratio.
Some Standard Results
If a, ar, $ar^2$, $ar^3$, ... is a GP, then (a = first term, r = common ratio)
(i) nth term, $T_n = ar^{n-1}$
(ii) Common ratio, $r = \frac{T_2}{T_1} = \frac{T_3}{T_2} = \dots = \frac{T_n}{T_{n-1}}$
(iii) nth term from the end = $\frac{l}{r^{n-1}}$, where l is the last term
(iv) Sum to first n terms, $S_n = \frac{a(r^n - 1)}{r - 1}$ or $\frac{a(1 - r^n)}{1 - r} = \frac{a - lr}{1 - r}$, where l = last term
(v) Sum to infinity of a GP, $S_\infty = \frac{a}{1 - r}$ when | r | < 1 or –1 < r < 1
(vi) If a, b, c are in GP, then b is called the geometric mean (GM) between a
and c. In this case, $b = \sqrt{ac}$ or $b^2 = ac$.
(vii) If a, g1, g2, ... , gn, b are in GP, then g1, g2, ..., gn are called “n geometric
means” between a and b.
Here $r = \left(\frac{b}{a}\right)^{\frac{1}{n+1}}$ and $g_n = ar^n = a\left(\frac{b}{a}\right)^{\frac{n}{n+1}}$
(viii) If each term of a GP is multiplied or divided by a non-zero fixed number,
then the resulting sequence is also a GP.
(ix) If each term of a GP is raised to the same index, then the resulting
sequence is also a GP.
i.e., if a, b, c are in GP,
then $a^K$, $b^K$, $c^K$ are also in GP, where K is a constant.
(x) It is convenient (when the product of three/five/seven/ ... consecutive terms
of a GP is given) to make a choice of:
three terms of a GP as $\frac{a}{r}$, a, ar,
five terms of a GP as $\frac{a}{r^2}$, $\frac{a}{r}$, a, ar, $ar^2$ and so on.
(xi) It is convenient (when the product of four/six/eight/... consecutive terms of
a GP is given) to make a choice of:
four terms of a GP as $\frac{a}{r^3}$, $\frac{a}{r}$, ar, $ar^3$,
six terms of a GP as $\frac{a}{r^5}$, $\frac{a}{r^3}$, $\frac{a}{r}$, ar, $ar^3$, $ar^5$ and so on.
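The GP results can be exercised in the same way. This is a small sketch under stated assumptions: the names gp_term, gp_sum and gp_sum_to_infinity are hypothetical helpers, not notation from the text, and the sum formula is applied only for r ≠ 1.

```python
from fractions import Fraction


def gp_term(a, r, n):
    """n-th term of a GP: Tn = a * r**(n - 1)."""
    return a * r ** (n - 1)


def gp_sum(a, r, n):
    """Sum of the first n terms: Sn = a(r**n - 1)/(r - 1) for r != 1."""
    a, r = Fraction(a), Fraction(r)
    if r == 1:
        return a * n  # every term equals a
    return a * (r ** n - 1) / (r - 1)


def gp_sum_to_infinity(a, r):
    """Sum to infinity a/(1 - r), defined only when |r| < 1."""
    r = Fraction(r)
    if abs(r) >= 1:
        raise ValueError("the sum to infinity needs |r| < 1")
    return Fraction(a) / (1 - r)


# Illustrative values: first term 2 and common ratio 2, twelve terms.
twelfth = gp_term(2, 2, 12)
first_twelve = gp_sum(2, 2, 12)
# The form (a - l*r)/(1 - r) uses the last term l instead of n.
via_last_term = Fraction(2 - twelfth * 2, 1 - 2)
```

The value via_last_term should equal first_twelve, which mirrors the two equivalent forms given in result (iv).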
Arithmetico-geometric series
A series in which each term is the product of the corresponding terms of an AP and
a GP is called an Arithmetico-geometric series.
Some standard results
If a, (a + d) r, (a + 2d) $r^2$, ... is an AGP, then
$T_n = [a + (n - 1)d]\,r^{n-1}$
$S_n = \frac{a}{1 - r} + \frac{dr}{(1 - r)^2} - \frac{dr^n}{(1 - r)^2} - \frac{[a + (n - 1)d]\,r^n}{1 - r}$
$S_\infty = \frac{a}{1 - r} + \frac{dr}{(1 - r)^2}$, when | r | < 1
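Because the closed form for Sn has four terms whose signs are easy to misread, it is worth comparing it with a term-by-term sum. The sketch below assumes r ≠ 1; the function names are illustrative only.

```python
from fractions import Fraction


def agp_sum_closed(a, d, r, n):
    """Closed form for the sum of the first n terms of an AGP (r != 1)."""
    a, d, r = Fraction(a), Fraction(d), Fraction(r)
    return (a / (1 - r)
            + d * r / (1 - r) ** 2
            - d * r ** n / (1 - r) ** 2
            - (a + (n - 1) * d) * r ** n / (1 - r))


def agp_sum_direct(a, d, r, n):
    """Add the terms [a + k*d] * r**k for k = 0 .. n - 1 one by one."""
    a, d, r = Fraction(a), Fraction(d), Fraction(r)
    return sum((a + k * d) * r ** k for k in range(n))


# Compare both versions for a few illustrative parameter choices.
agree = all(
    agp_sum_closed(1, 2, r, n) == agp_sum_direct(1, 2, r, n)
    for r in (Fraction(1, 3), 3, Fraction(-1, 2))
    for n in range(1, 11)
)
```

If agree is True, the closed form and the direct sum match for every case tried.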
Harmonic progression
A sequence of numbers is said to be in Harmonic Progression when the reciprocals
of these numbers are in Arithmetic Progression. For example, $\frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}$, ... are in
HP since 2, 3, 4, 5, ... are in AP.
Thus, the nth term of a HP is the reciprocal of the nth term of the corresponding
AP.
Remark
The sum of first n terms of an HP is not equal to the reciprocal of the sum of first n
terms of the corresponding AP.
Some Standard Results
(i) If a, H, b are in HP, then H is called the harmonic mean between a
and b.
Here, $H = \frac{2ab}{a + b}$
(ii) If a, h1, h2, ..., hn, b are in HP, then h1, h2, ..., hn are known as
“n harmonic means” between a and b.
Here, $h_n = \frac{ab(n + 1)}{b + na}$
Notes
1. There are no special formulae for HP. We have to trace the corresponding
AP and apply the results/formulae of AP and eventually find the respective
answers for HP.
2. If each term of a HP is multiplied or divided by a constant non-zero
number, then the resulting terms are also in HP.
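Note 1 says that HP problems are solved by passing to the corresponding AP. A minimal sketch of that route is given below; harmonic_means and last_harmonic_mean are hypothetical names, and the inputs are assumed to be non-zero integers of the same sign.

```python
from fractions import Fraction


def harmonic_means(a, b, n):
    """Insert n harmonic means between a and b via the AP of reciprocals."""
    first, last = Fraction(1, a), Fraction(1, b)
    d = (last - first) / (n + 1)  # common difference of the reciprocal AP
    return [1 / (first + k * d) for k in range(1, n + 1)]


def last_harmonic_mean(a, b, n):
    """Closed form for the n-th harmonic mean: ab(n + 1)/(b + na)."""
    return Fraction(a * b * (n + 1), b + n * a)


# With a single mean (n = 1) the result is the harmonic mean 2ab/(a + b).
single = harmonic_means(2, 6, 1)[0]
```

For n = 1 both routes reduce to H = 2ab/(a + b), so single should equal 2(2)(6)/(2 + 6) = 3.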
Other series of importance
(i) 1 + 2 + 3 + ... + n = $\Sigma n = \frac{n(n + 1)}{2}$
(ii) $1^2 + 2^2 + 3^2 + \dots + n^2 = \Sigma n^2 = \frac{n(n + 1)(2n + 1)}{6}$
(iii) $1^3 + 2^3 + 3^3 + \dots + n^3 = \Sigma n^3 = \frac{n^2(n + 1)^2}{4} = (\Sigma n)^2$
Solved examples (Arithmetic progression)
Example 2.1: A manufacturer of TV sets produced 600 units in the third year and
700 units in the seventh year. Assuming that the increase in production every year is
the same, find what was (i) the total production in 7 years and (ii) the production in
the 10th year?
Solution: Obviously, it is a case of Arithmetic Progression. Using the standard
notations, we have
T3 = a + 2d = 600 ...(1)
and T7 = a + 6d = 700 ...(2)
On subtraction, 4d = 100 ⇒ d = 25
Substituting d = 25 in equation (1), we get
a + 2 (25) = 600 or a = 550
Thus, (i) $S_7 = \frac{7}{2}[2(550) + (7 - 1)25] = 4375$ Ans.
and (ii) T10 = a + 9d = 550 + 9(25) = 775 Ans.
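The elimination used in Example 2.1 works for any two known terms of an AP. A short sketch follows; the helper name ap_from_two_terms is an assumption made for illustration.

```python
from fractions import Fraction


def ap_from_two_terms(m, t_m, n, t_n):
    """Recover (a, d) from the m-th and n-th terms of an AP."""
    d = Fraction(t_n - t_m, n - m)   # subtracting the two equations
    a = t_m - (m - 1) * d            # back-substitution into Tm = a + (m - 1)d
    return a, d


a, d = ap_from_two_terms(3, 600, 7, 700)
s7 = Fraction(7, 2) * (2 * a + (7 - 1) * d)   # total production in 7 years
t10 = a + 9 * d                               # production in the 10th year
```

Here a, d, s7 and t10 correspond to the quantities computed in the solution above.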
Example 2.2: A company is considering a salary plan that would pay new
employees ₹ 5000/- per month with ₹ 200/- as annual increment.
(i) Find the total earned salary through 20 years.
(ii) Find the period for the monthly salary to get doubled.
Solution:
(i) The total annual salaries form an AP 5000 × 12, 5200 × 12, 5400 × 12, ...
Thus, the total earned salary through 20 years = S20
= $\frac{20}{2}[2(5000 \times 12) + (20 - 1)(200 \times 12)]$
= 10 [1,20,000 + 45,600] = ₹ 16,56,000
(ii) For the current monthly salary (₹ 5000) to get doubled (to become
₹ 10,000), a consolidated increment of ₹ 5000 is required.
Thus, $\frac{5000}{200}$ = 25 years are required.
Even otherwise, in the series 5000, 5200, 5400, ...
Tn = 10,000 = [a + (n – 1) d]
= [5000 + (n – 1) 200]
⇒ n = 26, i.e., in the 26th year (or after the completion of 25
years), the salary gets doubled.
Example 2.3: The cost of boring a tubewell 600 metres deep is as follows: 25
paise for the first metre and an additional 4 paise for every subsequent metre. Find
the cost of boring the 500th metre and also the total cost.
Solution: The cost of boring the 500th metre = T500 = [a + (500 – 1) d]
= [25 + 499 (4)] = 2021 paise = ₹ 20.21
Total cost of boring 600 metres = S600
= $\frac{600}{2}[2(25) + (600 - 1)4]$
= 300 [50 + 599 (4)]
= 733800 paise
= ₹ 7338
Example 2.4: A person pays ₹ 975 through monthly instalments each less than the
former by ₹ 5. The first instalment is ₹ 100. In how many instalments will the amount
be paid?
Solution: The series of instalments 100, 95, 90, 85, ... forms an AP.
Let ‘n’ be the no. of instalments in which the entire amount is cleared.
$S_n = \frac{n}{2}[2a + (n - 1)d] = 975$
or $\frac{n}{2}[2(100) + (n - 1)(-5)] = 975$
or $\frac{n}{2}[200 - 5n + 5] = 975$
or $\frac{n}{2}[205 - 5n] = 975$
or $\frac{n}{2}[41 - n] = 195$
or $(41n - n^2) = 390$
or $(n^2 - 41n + 390) = 0$
⇒ $[n^2 - 26n - 15n + 390] = 0$
⇒ n (n – 26) – 15 (n – 26) = 0
⇒ (n – 26) (n – 15) = 0
⇒ n = 15 or n = 26
With n = 26 the instalments after the 21st would be negative (T21 = 100 – 20(5) = 0),
so the amount is paid in n = 15 instalments.
(∵ an instalment cannot be negative)
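A quadratic in n can have two positive roots, so each root should be tested against the conditions of the problem. The sketch below is an illustrative check, with hypothetical helper names, that finds the positive integer roots of n² – 41n + 390 = 0 and keeps only those for which every instalment is positive.

```python
import math


def positive_integer_roots(b, c):
    """Positive integer roots of n**2 + b*n + c = 0, in ascending order."""
    disc = b * b - 4 * c
    if disc < 0:
        return []
    s = math.isqrt(disc)
    if s * s != disc:
        return []  # discriminant is not a perfect square
    roots = set()
    for numerator in (-b - s, -b + s):
        if numerator % 2 == 0 and numerator // 2 > 0:
            roots.add(numerator // 2)
    return sorted(roots)


def all_instalments_positive(first, step, n):
    """True when both the first and the n-th instalment are positive."""
    return min(first, first + (n - 1) * step) > 0


roots = positive_integer_roots(-41, 390)
valid = [n for n in roots if all_instalments_positive(100, -5, n)]
```

The list roots holds every positive integer solution of the quadratic, while valid keeps only those that make sense as a repayment schedule.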
Example 2.5: The monthly salary of a person was ₹ 320 for each of the first three
years. He next got annual increments of ₹ 40 per month for each of the following
successive 12 years. His salary remained stationary till retirement when he found that
his average monthly salary during the service period was ₹ 698. Find the period of
his service.
Solution: The monthly salary for the first 3 years (i.e., 36 months) = ₹ 320
The monthly salary in the 4th year = ₹ 360
The monthly salary in the 5th year = ₹ 400
The monthly salary in the 6th year = ₹ 440
... ... ...
... ... ...
The monthly salary in the 15th year = [320 + 40(12)] = ₹ 800
Let n be the number of years served after the 15th year. Now, as per the given problem,
the average salary = $\frac{3(320) + \frac{12}{2}[2(360) + (12 - 1)40] + n(800)}{3 + 12 + n}$ = ₹ 698
⇒ $\frac{7920 + 800n}{15 + n} = 698$
⇒ n = 25
∴ The total service = (3 + 12 + 25) = 40 years.
Example 2.6: Balu arranges to pay a debt of ₹ 9600 in 48 annual instalments which
form an arithmetic series. When 40 of these instalments are paid, Balu becomes
insolvent and his creditor finds that ₹ 2400 still remain unpaid. Find the value of each
of the first three instalments. Ignore interest.
Solution: $S_{48} = \frac{48}{2}[2a + (48 - 1)d] = 9600$
or 24(2a + 47d) = 9600
or 2a + 47d = 400 ...(1)
Also, $S_{40} = \frac{40}{2}[2a + (40 - 1)d] = (9600 - 2400) = 7200$
or 20 (2a + 39d) = 7200
or 2a + 39d = 360 ...(2)
Solving (1) and (2), we get
d = 5 and a = 82.5
∴ The first three instalments are 82.5, 82.5 + 5, 82.5 + 2(5)
i.e., ₹ 82.50, ₹ 87.50 and ₹ 92.50, respectively.
Example 2.7: If a person repays a loan of ₹ 3,250 by paying ₹ 20 in the first month
and then increases the payment by ₹ 15 every month, how long will he take to clear
his loan?
Solution: The series is 20, 35, 50, 65, ...
Let n be the no. of months required to clear the loan
$S_n = \frac{n}{2}[2a + (n - 1)d]$
⇒ $3250 = \frac{n}{2}[2(20) + (n - 1)15]$
or $(3n^2 + 5n - 1300) = 0$
or $[3n^2 + 65n - 60n - 1300] = 0$
or [n (3n + 65) – 20 (3n + 65)] = 0
or (n – 20) (3n + 65) = 0
⇒ n = 20
(The other value, $-\frac{65}{3}$, being negative, is ruled out)
Thus, the loan will be cleared in 20 months.
Example 2.8: Two posts were offered to a man. In the first one, the starting salary
was ₹ 120 per month and the annual increment was ₹ 8. In the second one, the
salary commenced at ₹ 85 per month, but the annual increment was ₹ 12. He
decided to accept that post which would give him more earnings in the first 20 years
of the service. Which post was acceptable to him? Justify your answer.
Solution: Obviously, the annual salaries form an AP in either case
Post I: 120 × 12, 128 × 12, 136 × 12, ...
$S_{20} = \frac{20}{2}[2(120 \times 12) + (20 - 1)(8 \times 12)]$ = ₹ 47,040
Post II: 85 × 12, 97 × 12, 109 × 12, ...
$S_{20} = \frac{20}{2}[2(85 \times 12) + (20 - 1)(12 \times 12)]$ = ₹ 47,760
Obviously, Post II was preferable to him.
Example 2.9: The pth term of an AP is q and the qth term is p. Show that the rth
term is (p + q – r) and the (p + q)th term is zero.
Solution: Tp = a + (p – 1)d = q and Tq = a + (q – 1)d = p
On subtraction, d(p – q) = (q – p) ⇒ d = –1 and a = (p + q – 1)
Tr = a + (r – 1) d
= a + (r – 1) (–1)
= a – r + 1
= (p + q – 1) – r + 1
Tr = (p + q – r)
Also, Tp+q = (p + q – 1) + (p + q – 1) (–1) = 0
(Substituting a = p + q – 1, d = –1 and r = p + q in Tr = [a + (r – 1)d])
Example 2.10: If pth term, qth term and rth term of an AP are a, b, c respectively,
show that (q – r) a + (r – p) b + (p – q) c = 0
Solution: Let A be the first term, then
T p = A + (p – 1) d = a
T q = A + (q – 1) d = b
Tr = A + (r – 1) d = c
Now, Σ (q – r) a = Σ(q – r) [A + (p – 1) d]
= Σ A(q – r) + Σpd (q – r) – Σd (q – r)
= AΣ (q – r) + dΣp (q – r) – dΣ (q – r)
= A(0) + d(0) – d (0) = 0
Hence the result.
Example 2.11: Firm A starts producing 400 units and decreases production by 50
units annually. Firm B starts by producing 250 units and increases production by 25
units annually. Assuming that both the firms grow/decay in an arithmetic series, find
the following:
(a) In which year will both produce the same amount?
(b) When will firm A produce zero output?
(c) What will be the production of firm B in the year when firm A produces
nothing?
Solution: Production at Firm A: 400, 350, 300, ...
Production at Firm B: 250, 275, 300, ...
(a) In the third year, they produce the same quantity:
Tn = [400 + (n – 1) (– 50)] = [250 + (n – 1) 25]
⇒ n = 3
(b) Tn = 0 = [a + (n – 1)d] = [400 + (n – 1) (– 50)]
or 400 – 50n + 50 = 0
or 450 – 50n = 0
⇒ n = 9, i.e., in the 9th year
(c) T9 for the production of Firm B
T9 = [a + (n – 1) d] = [250 + (9 – 1) 25]
= (250 + 8 × 25) = 450 units
Example 2.12: Twenty-five trees are planted in a straight line at intervals of 5 feet.
To water them, the gardener must bring water for each tree separately from a well
10 ft from the first tree in the line of the trees. How far has he walked when he has
just watered all the trees beginning with the first?
Solution:
T1 = 10 + 10 = 20 ft (both to and fro)
T2 = (10 + 5) + (10 + 5) = 30 ft (both to and fro)
T3 = (10 + 10) + (10 + 10) = 40 ft (both to and fro)
... ... ...
Thus, the total distance covered = $S_{24} + \frac{1}{2}(T_{25})$
Note
When he just completes watering the 25th tree, there is no need to come back to the
well.
Thus, (20 + 30 + 40 + ... up to 24 terms) + $\frac{1}{2}(T_{25})$
= $\frac{24}{2}[2(20) + (24 - 1)10] + \frac{1}{2}[20 + (25 - 1)10]$ = 3370 ft Ans.
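The same total can be reached by simulating the walk trip by trip. This is an illustrative sketch; it assumes the gardener starts at the well and does not return after the last tree, as stated in the Note, and the function name is hypothetical.

```python
def watering_distance(trees=25, spacing=5, well_to_first=10):
    """Total distance walked when each tree is watered by a separate trip."""
    total = 0
    for k in range(trees):
        one_way = well_to_first + k * spacing  # well to the (k + 1)-th tree
        total += one_way                       # walk out with the water
        if k < trees - 1:
            total += one_way                   # walk back for the next trip
    return total


distance = watering_distance()
```

The loop adds a round trip for the first 24 trees and a one-way trip for the 25th, which is exactly S24 plus half of T25.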
Example 2.13: If Sn is the sum of first ‘n’ terms of an arithmetic series, then show
that Sn+3 – 3.Sn+2 + 3.Sn+1 – Sn = 0.
Solution: Sn+1 = (Sn + Tn+1)
Sn+2 = (Sn + Tn+1 + Tn+2)
Sn+3 = (Sn + Tn+1 + Tn+2 + Tn+3)
Thus, (Sn+3 –3.Sn+2 + 3.Sn+1 – Sn)
= [(Sn + Tn+1 + Tn+2 + Tn+3) – 3(Sn + Tn+1 + Tn+2) + 3(Sn + Tn+1) – Sn]
= [Tn+1 – 2.Tn+2 + Tn+3]
= 0
(∵ Tn+1, Tn+2 and Tn+3 are three consecutive terms of an
Arithmetic Progression)
Note
If a, b, c are in AP, then $b = \frac{a + c}{2}$
or (a + c – 2b) = 0, or (a – 2b + c) = 0
Example 2.14: If the roots of the equation $(q - r)x^2 + (r - p)x + (p - q) = 0$ are
equal, then show that p, q, r are in AP.
Solution: Since the roots of a quadratic equation $ax^2 + bx + c = 0$ are equal when
$(b^2 - 4ac) = 0$, we have
$(r - p)^2 - 4(q - r)(p - q) = 0$
$r^2 + p^2 - 2pr - 4pq + 4q^2 + 4pr - 4qr = 0$
$r^2 + p^2 + (2q)^2 + 2pr - 2(p)(2q) - 2(2q)(r) = 0$
or $(p - 2q + r)^2 = 0$
or p – 2q + r = 0
⇒ $q = \frac{p + r}{2}$
⇒ p, q, r are in AP
Example 2.15: Find ‘n’ such that $\frac{a^{n+1} + b^{n+1}}{a^n + b^n}$ may be the arithmetic mean
between a and b.
Solution: Given, AM = $\frac{a + b}{2} = \frac{a^{n+1} + b^{n+1}}{a^n + b^n}$
or $(a + b)(a^n + b^n) = 2(a^{n+1} + b^{n+1})$
or $a^{n+1} + ab^n + a^n b + b^{n+1} = 2(a^{n+1} + b^{n+1})$
or $(ab^n + a^n b) = (a^{n+1} + b^{n+1})$
or $(a^n b - a^{n+1}) = (b^{n+1} - ab^n)$
or $a^n(b - a) = b^n(b - a)$
⇒ $\left(\frac{a}{b}\right)^n = 1 = \left(\frac{a}{b}\right)^0$ (∵ b ≠ a)
⇒ n = 0
Example 2.16: If the sum of first ‘p’ terms of an AP is equal to the sum of first ‘q’
terms of the same progression, then show that sum of the first (p + q) terms is equal
to zero.
Solution: $S_p = \frac{p}{2}[2a + (p - 1)d]$
$S_q = \frac{q}{2}[2a + (q - 1)d]$
Given, Sp = Sq
⇒ $\frac{p}{2}[2a + (p - 1)d] = \frac{q}{2}[2a + (q - 1)d]$
or 2ap + p (p – 1) d = 2aq + q (q – 1) d
or 2a (p – q) + d [p (p – 1) – q (q – 1)] = 0
or $2a(p - q) + d[(p^2 - q^2) - (p - q)] = 0$
or (p – q) [2a + (p + q) d – d] = 0
or (p – q) [2a + (p + q – 1)d] = 0
⇒ [2a + (p + q – 1) d] = 0 (∵ p ≠ q)
Thus, $S_{p+q} = \frac{p + q}{2}[2a + (p + q - 1)d] = \frac{p + q}{2}[0] = 0$ (Hence the result)
Example 2.17: The sums of the first ‘n’ terms of two arithmetic series are in the
ratio of (3n + 1) : (n + 4). Find the ratio of their 4th terms.
Solution: Let a1 and a2 denote the first terms of the two series respectively.
Let d1 and d2 stand for the common differences of the two series respectively.
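Let $S_n$, $S'_n$ denote the sums of the first n terms and $T_4$, $T'_4$ the 4th terms of the two series.
Thus, $\frac{S_n}{S'_n} = \frac{\frac{n}{2}[2a_1 + (n - 1)d_1]}{\frac{n}{2}[2a_2 + (n - 1)d_2]} = \frac{3n + 1}{n + 4}$
⇒ $\frac{2a_1 + (n - 1)d_1}{2a_2 + (n - 1)d_2} = \frac{3n + 1}{n + 4}$ ...(1)
Now, $\frac{T_4}{T'_4} = \frac{a_1 + (4 - 1)d_1}{a_2 + (4 - 1)d_2} = \frac{a_1 + 3d_1}{a_2 + 3d_2} = \frac{2a_1 + 6d_1}{2a_2 + 6d_2}$
= $\frac{2a_1 + (7 - 1)d_1}{2a_2 + (7 - 1)d_2} = \frac{3(7) + 1}{7 + 4}$ (Substituting n = 7 in equation (1))
= $\frac{22}{11} = \frac{2}{1}$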
Hence, the required ratio = 2 : 1.
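The answer can be cross-checked with the relation Tn = Sn – Sn–1 from the Remarks. The sketch below assumes one concrete pair of series whose sums are exactly n(3n + 1) and n(n + 4); any common multiple of these would give the same ratio. The function and variable names are illustrative.

```python
from fractions import Fraction


def nth_term_from_sum(partial_sum, n):
    """Tn = Sn - S(n-1), where partial_sum(k) returns the sum of the first k terms."""
    return partial_sum(n) - partial_sum(n - 1)


def sum_first(n):
    """Assumed sum of the first series: n(3n + 1)."""
    return n * (3 * n + 1)


def sum_second(n):
    """Assumed sum of the second series: n(n + 4)."""
    return n * (n + 4)


t4_first = nth_term_from_sum(sum_first, 4)
t4_second = nth_term_from_sum(sum_second, 4)
ratio = Fraction(t4_first, t4_second)
```

Both sums are quadratics in n with no constant term, so each does describe an AP, and ratio is the ratio of their 4th terms.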
Example 2.18: Insert 5 AMs between 9 and 27.
Solution: Let x1, x2, x3, x4, x5 be the 5 AMs
∴ 9, x1, x2, x3, x4, x5, 27 form an arithmetic series, with first term a = 9
T1 = a = 9
T 7 = (a + 6d) = 27
⇒ (9 + 6d)= 27
⇒
6d = 18
⇒
d=3
∴
x 1 = T2 = (a + d) = 9 + 3 = 12
x 2 = T3 = (a + 2d) = 9 + 6 = 15
x 3 = T4 = (a + 3d) = 9 + 9 = 18
x 4 = T5 = (a + 4d) = 9 + 12 = 21
x 5 = T6 = (a + 5d) = 9 + 15 = 24
Example 2.19: There are n AMs between 4 and 59 such that 4th mean : nth mean
= 4 : 9, find n.
Solution: Let x1, x2, x3, ... , xn be the AMs.
Then, 4, x1, x2, x3, ..., xn, 59 form an AP.
∴ a = 4 and Tn+2 = [a + (n + 1)d] = 59
i.e., [4 + (n + 1)d] = 59
or $d = \frac{55}{n + 1}$
4th mean = x4 = T5 = (a + 4d) = (4 + 4d)
= $4 + 4\left(\frac{55}{n + 1}\right) = \frac{4n + 224}{n + 1}$
nth mean = xn = Tn+1 = a + nd = $4 + n\left(\frac{55}{n + 1}\right) = \frac{59n + 4}{n + 1}$
Given, x4 : xn = 4 : 9
∴ (4n + 224) : (59n + 4) = 4 : 9
⇒ 9(4n + 224) = 4 (59n + 4)
Note
A : B = C : D ⇒ AD = BC
⇒ 36n + 2016 = 236n + 16
⇒ n = 10
Example 2.20: If the sums of the first p, q and r terms of an AP are a, b, c
respectively, then show that
$\frac{a}{p}(q - r) + \frac{b}{q}(r - p) + \frac{c}{r}(p - q) = 0$
Solution: Let A be the first term and d the common difference. Then
$S_p = \frac{p}{2}[2A + (p - 1)d] = a$ ⇒ $\frac{a}{p} = \frac{1}{2}[2A + (p - 1)d]$
$S_q = \frac{q}{2}[2A + (q - 1)d] = b$ ⇒ $\frac{b}{q} = \frac{1}{2}[2A + (q - 1)d]$
$S_r = \frac{r}{2}[2A + (r - 1)d] = c$ ⇒ $\frac{c}{r} = \frac{1}{2}[2A + (r - 1)d]$
Now, $\Sigma \frac{a}{p}(q - r) = \Sigma \frac{1}{2}[2A + (p - 1)d](q - r)$
= $[\Sigma A(q - r) + \Sigma \frac{d}{2}p(q - r) - \Sigma \frac{d}{2}(q - r)]$
= $[A\Sigma(q - r) + \frac{d}{2}\Sigma p(q - r) - \frac{d}{2}\Sigma(q - r)]$
= (0 + 0 – 0) = 0
Hence the result.
Example 2.21: In an organisational hierarchy, each echelon contains two managers
more than the one above it. If on the top, there are three managers and 17 at the
lowest echelon, determine the number of echelons and the total number of managers
in the entire organisation.
Solution: Let ‘n’ stand for the number of echelons.
a = 3, d = 2
Then, Tn = [a + (n – 1) d] = 17
i.e., [3 + (n – 1) 2] = 17
i.e., (3 + 2n – 2) = 17
or n = 8
Total no. of managers = Sn = S8 = $\frac{n}{2}[a + l] = \frac{8}{2}[3 + 17]$ = 80
Example 2.22: If Sp, Sq, Sr denote the sum of first p, q, r terms respectively of an
AP, whose common difference is ‘d’, then prove that
$\Sigma \frac{S_p}{(p - q)(p - r)} = \frac{d}{2}$
Solution:
LHS = $\frac{\frac{p}{2}[2a + (p - 1)d]}{(p - q)(p - r)} + \frac{\frac{q}{2}[2a + (q - 1)d]}{(q - p)(q - r)} + \frac{\frac{r}{2}[2a + (r - 1)d]}{(r - p)(r - q)}$
= $-\left\{\frac{\frac{p}{2}[2a + (p - 1)d]}{(p - q)(r - p)} + \frac{\frac{q}{2}[2a + (q - 1)d]}{(p - q)(q - r)} + \frac{\frac{r}{2}[2a + (r - 1)d]}{(r - p)(q - r)}\right\}$
= $-\left\{\frac{ap + \frac{d}{2}(p^2 - p)}{(p - q)(r - p)} + \frac{aq + \frac{d}{2}(q^2 - q)}{(p - q)(q - r)} + \frac{ar + \frac{d}{2}(r^2 - r)}{(r - p)(q - r)}\right\}$
= $-\left\{a\,\Sigma \frac{p}{(p - q)(r - p)} + \frac{d}{2}\,\Sigma \frac{p^2}{(p - q)(r - p)} - \frac{d}{2}\,\Sigma \frac{p}{(p - q)(r - p)}\right\}$
= $-\left\{a(0) + \frac{d}{2}(-1) - \frac{d}{2}(0)\right\}$
= $\frac{d}{2}$ = RHS.
Hence the result.
Notes
1. $\Sigma \frac{p}{(p - q)(r - p)} = \frac{p(q - r) + q(r - p) + r(p - q)}{(p - q)(q - r)(r - p)} = \frac{0}{(p - q)(q - r)(r - p)} = 0$
2. $\Sigma \frac{p^2}{(p - q)(r - p)} = \frac{p^2(q - r) + q^2(r - p) + r^2(p - q)}{(p - q)(q - r)(r - p)}$
= $\frac{p^2q - p^2r + q^2r - q^2p + r^2p - r^2q}{pqr - p^2q - pr^2 + p^2r - q^2r + pq^2 + qr^2 - pqr}$
= – 1
Example 2.23: The sum of three numbers in AP is 21 and their product is 315. Find
the numbers.
Solution: Let the three numbers be a–d, a, a+d.
∴ Their sum = (a – d) + a + (a + d) = 21
⇒
3a = 21 ⇒ a = 7
Also, their product = (7 – d).7(7 + d) = 315
$7(7^2 - d^2) = 315$
$(7^2 - d^2) = 45$
$49 - d^2 = 45$
⇒ d = ± 2
∴ The numbers are 7 – 2, 7, 7 + 2, or 5, 7, 9
Example 2.24: The sum of four numbers in AP is 16 and the sum of their cubes is
496. Find the numbers.
Solution: Let the four numbers be (a – 3d), (a – d), (a + d), (a + 3d) respectively.
Their sum
= [(a – 3d) + (a – d) + (a + d) + (a + 3d)] = 16
⇒ 4a = 16
⇒ a=4
Sum of their cubes = $[(a - 3d)^3 + (a - d)^3 + (a + d)^3 + (a + 3d)^3]$
= $2[a^3 + 3a(3d)^2] + 2[a^3 + 3a(d)^2]$
[∵ $(a + b)^3 + (a - b)^3 = 2(a^3 + 3ab^2)$]
= $[4a^3 + 60ad^2] = 496$
⇒ $[4(4)^3 + 60(4)d^2] = 496$
⇒ d = ± 1
∴ The numbers are 4 – 3(1), 4 – 1, 4 + 1, 4 + 3(1)
i.e., 1, 3, 5, 7
Example 2.25: Prove that if a, b, c are in AP, then
(a) $b^2 + c^2 + bc$, $c^2 + a^2 + ca$, $a^2 + b^2 + ab$ are in AP
(b) $\frac{1}{bc}, \frac{1}{ca}, \frac{1}{ab}$ are in AP
(c) $a^2(b + c)$, $b^2(c + a)$, $c^2(a + b)$ are in AP
Solution:
(a) $(b^2 + c^2 + bc)$, $(c^2 + a^2 + ca)$, $(a^2 + b^2 + ab)$ are in AP
⇔ $[(c^2 + a^2 + ca) - (b^2 + c^2 + bc)] = [(a^2 + b^2 + ab) - (c^2 + a^2 + ca)]$
⇔ $(a^2 - b^2 + ca - bc) = (b^2 - c^2 + ab - ca)$
⇔ (a + b) (a – b) + c (a – b) = (b + c) (b – c) + a (b – c)
⇔ (a – b) (a + b + c) = (b – c) (a + b + c)
⇔ (a – b) = (b – c) (∵ a + b + c ≠ 0)
⇔ (b – a) = (c – b)
⇔ a, b, c are in AP
(b) $\frac{1}{bc}, \frac{1}{ca}, \frac{1}{ab}$ are in AP
⇔ $\frac{abc}{bc}, \frac{abc}{ca}, \frac{abc}{ab}$ are in AP
⇔ a, b, c are in AP
(c) $a^2(b + c)$, $b^2(c + a)$, $c^2(a + b)$ are in AP
⇔ $[b^2(c + a) - a^2(b + c)] = [c^2(a + b) - b^2(c + a)]$
⇔ $(b^2c + b^2a - a^2b - a^2c) = (c^2a + c^2b - b^2c - b^2a)$
⇔ $ab(b - a) + c(b^2 - a^2) = cb(c - b) + a(c^2 - b^2)$
⇔ (b – a) [ab + c (b + a)] = (c – b) [cb + a (c + b)]
⇔ (b – a) (ab + bc + ca) = (c – b) (ab + bc + ca)
⇔ (b – a) = (c – b) (∵ ab + bc + ca ≠ 0)
⇔ a, b, c are in AP
Solved Examples (Geometric Progression)
Example 2.26: A man borrows ₹ 8190 without interest and repays the loan in 12
monthly instalments; each instalment being twice the preceding one. Find the first
and the last instalments.
Solution:
$S_{12} = \frac{a(2^{12} - 1)}{(2 - 1)} = 8190$ [Note: $S_n = \frac{a(r^n - 1)}{(r - 1)}$]
⇒ $a = \frac{8190}{(2^{12} - 1)} = \frac{8190}{4095} = 2$
Thus, T1 = a = ₹ 2
$T_{12} = ar^{12-1} = ar^{11} = 2 \times 2^{11} = 2^{12}$ = ₹ 4096
Example 2.27: The sum of 2w terms of a GP, whose first term is ‘a’ and common
ratio is ‘r’, is equal to the sum of w terms of another GP, whose first term is ‘b’ and
common ratio ‘$r^2$’. Prove that ‘b’ is equal to the sum of the first two terms of the first
series.
Solution:
                GP 1    GP 2
First term      a       b
Common ratio    r       $r^2$
No. of terms    2w      w
Given sum1 = sum2
i.e., $\frac{a(r^{2w} - 1)}{(r - 1)} = \frac{b[(r^2)^w - 1]}{(r^2 - 1)}$
i.e., $\frac{a(r^{2w} - 1)}{(r - 1)} = \frac{b(r^{2w} - 1)}{(r - 1)(r + 1)}$
or $a = \frac{b}{(r + 1)}$ (∵ r ≠ 1)
or b = a + ar
Hence the result.
Example 2.28: A machine depreciates at 8% of its value at the beginning of a year.
If the machine was purchased for ₹ 15,000, what is the minimum number of
complete years at the end of which the worth of the machine will not exceed 2/5 of
its original value?
Solution: The value of the machine at the end of the 1st year, the 2nd year, the third
year and so on will form the following GP
$15000\left(1 - \frac{8}{100}\right)^1, 15000\left(1 - \frac{8}{100}\right)^2, 15000\left(1 - \frac{8}{100}\right)^3, \dots$
Thus, for the value not exceeding 2/5th of its original value, we have
$15000\left(1 - \frac{8}{100}\right)^n \le \frac{2}{5}(15000)$
i.e., $0.92^n \le 0.4$, or $n \log(0.92) \le \log(0.4)$
or $n \ge \frac{\log(0.4)}{\log(0.92)}$ (the inequality reverses because log (0.92) is negative)
∴ n ≥ 10.989
or n = 11 years, the smallest whole number of years satisfying the condition
(after 10 years the value is still $0.92^{10} \approx 0.434$ of the original, which exceeds 2/5).
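A year-by-year check is a useful safeguard for this kind of inequality, because dividing by a negative logarithm reverses its direction. The sketch below uses exact fractions; years_to_reach is a hypothetical helper name.

```python
from fractions import Fraction
import math


def years_to_reach(rate_percent, target_fraction):
    """Smallest number of complete years after which the value, as a
    fraction of the purchase price, is at or below target_fraction."""
    factor = 1 - Fraction(rate_percent, 100)  # value retained each year
    value, years = Fraction(1), 0
    while value > target_fraction:
        value *= factor
        years += 1
    return years


years = years_to_reach(8, Fraction(2, 5))
# Logarithmic bound from the solution: n >= log(0.4) / log(0.92).
bound = math.log(0.4) / math.log(0.92)
```

The loop result should coincide with the bound rounded up to the next whole number, math.ceil(bound).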
Example 2.29: A tractor was purchased for ₹ 45,000 and sold as a scrap for
₹ 5000 after 10 years. Find the rate of depreciation of the tractor.
Solution: Let r% p.a. be the rate of depreciation
$T_{10} = 45000\left(1 - \frac{r}{100}\right)^{10} = 5000$
⇒ $\left(1 - \frac{r}{100}\right)^{10} = \frac{1}{9}$
⇒ $\left(1 - \frac{r}{100}\right) = \left(\frac{1}{9}\right)^{\frac{1}{10}}$ = 0.80274
or $\frac{r}{100}$ = (1 – 0.80274) = 0.19726
r = 19.726% p.a.
Example 2.30: For three consecutive months, a person deposited some amount of
money on the first day of each month in a small savings fund. These three successive
amounts in the deposits, the total value of which is ₹ 65, form a GP. If the two
extreme amounts be multiplied each by 3 and the mean by 5, the products form an
AP. Find the amounts in the first and the second deposits.
Solution: Because the product of the three amounts has not been given, there
won’t be any special advantage in assuming the three amounts to be $\frac{a}{r}$, a and ar.
Hence, the general form of a GP can be adopted.
Thus, let the numbers be a, ar, $ar^2$
∴ $a + ar + ar^2 = 65$ ...(1)
Also, given that 3a, 5ar, $3ar^2$ form an AP
∴ $5ar = \frac{3a + 3ar^2}{2}$ ...(2)
i.e., $(3r^2 - 10r + 3) = 0$
or (r – 3) (3r – 1) = 0
r = 3 or $\frac{1}{3}$
When r = 3: (a + 3a + 9a) = 65 [from (1)]
13a = 65
a = 5
When r = $\frac{1}{3}$: $\left(a + \frac{a}{3} + \frac{a}{9}\right) = 65$ [from (1)]
$\frac{13a}{9} = 65$
a = 45
∴ The numbers are 5, 5(3), $5(3)^2$ or 45, $45\left(\frac{1}{3}\right)$, $45\left(\frac{1}{3}\right)^2$
i.e., 5, 15, 45 or 45, 15, 5
Thus the first and the second deposits are ₹ 5 and ₹ 15, or ₹ 45 and ₹ 15.
Example 2.31: Gold worth ₹ 1000 has been preserved by a family for 70 years.
Find the amount they would have got as interest for this period if this gold is sold and
the amount (₹ 1000) was invested as a fixed deposit at a rate near about 10%
compound interest, supposing that at this rate of interest the amount gets doubled in
every 7 years period.
Solution: By the end of first 7 years, the initial amount of ₹ 1000 gets doubled i.e.,
becomes ₹ 2000.
Similarly, by the end of second 7 years, it again gets doubled i.e., by the end of
14 years, it becomes ₹ 4000 and so on.
Thus, the terms take the form of a GP
$1000 \times 2, 1000 \times 2^2, 1000 \times 2^3, \dots, 1000 \times 2^{10}$
The final amount, $T_{10} = ar^{10-1} = (1000 \times 2) \times 2^{10-1}$
= $1000 \times 2^{10}$.
∴ Interest = Amount – Principal
= ₹ $(1000 \times 2^{10} - 1000)$
= ₹ $1000(2^{10} - 1)$
= ₹ 10,23,000
Example 2.32: ABC Company Ltd has earmarked a fund of ₹ 1 crore towards the
payment of remuneration to a consultant for his advisory services rendered during a
month. His pay package for that one month is as follows:
He charges ₹ 1 for the first day, ₹ 2 for the 2nd day, ₹ 4 for the 3rd day, ₹ 8
for the 4th day and so on. What is his total remuneration for that one month? Can the
company afford to pay his remuneration?
Solution: His total remuneration is the sum of all the 30 terms of the GP given by
1 + 2 + 4 + 8 + ... up to 30 terms
Thus, his total remuneration = S30
= $\frac{1(2^{30} - 1)}{(2 - 1)}$
= $(2^{30} - 1)$
= ₹ 1,07,37,41,823
Naturally, with just an allocation of ₹ 1 crore, the company can’t afford to hire
his services for ₹ 107.37 crore (approx.), which is more than 100 times the earmarked
budget.
Example 2.33: The fifth term of a GP is 81 and the second term is 24. Find the
series.
Solution: Let the GP be a, ar, $ar^2$, ..., $ar^{n-1}$
$T_5 = ar^4 = 81$
$T_2 = ar = 24$
∴ $\frac{T_5}{T_2} = r^3 = \frac{81}{24} = \frac{27}{8} = \left(\frac{3}{2}\right)^3$
∴ $r = \frac{3}{2}$
Now, ar = 24 ⇒ $a\left(\frac{3}{2}\right) = 24$
⇒ a = 16
Thus, the GP is 16, $16\left(\frac{3}{2}\right)$, $16\left(\frac{3}{2}\right)^2$, $16\left(\frac{3}{2}\right)^3$, ...
or 16, 24, 36, 54, ...
Example 2.34: Evaluate the recurring decimal $0.3\dot{4}\dot{8}$ (= 0.3484848...)
Solution:
$0.3\dot{4}\dot{8}$ = 0.3 + 0.048 + 0.00048 + 0.0000048 + ...
= $\frac{3}{10} + \left(\frac{48}{10^3} + \frac{48}{10^5} + \frac{48}{10^7} + \dots\right)$
= $\frac{3}{10} + \frac{\frac{48}{10^3}}{1 - \frac{1}{10^2}}$ [∵ $S_\infty = \frac{a}{1 - r}$]
= $\frac{3}{10} + \frac{48}{1000} \times \frac{100}{99}$
= $\frac{3}{10} + \frac{48}{990}$
= $\frac{345}{990} = \frac{23}{66}$
∴ $0.3\dot{4}\dot{8} = \frac{23}{66}$
Aliter
Let x = 0.3484848...
10x = 3.484848...
1000x = 348.484848...
On subtraction, 990x = 345
x = $\frac{345}{990} = \frac{23}{66}$
Remark
The trick lies in obtaining two different deca-multiples of the given recurring decimal,
each with the recurring part occurring immediately after the decimal and thereafter
taking the difference between these two multiples.
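The subtraction trick in the Remark generalises to any recurring decimal. The following sketch applies it to the digits after the decimal point; recurring_to_fraction is a hypothetical helper, and the digits are passed as strings so that leading zeros are preserved.

```python
from fractions import Fraction


def recurring_to_fraction(non_repeating, repeating):
    """Value of 0.<non_repeating><repeating><repeating>... as a fraction.

    With k non-repeating and m repeating digits, 10**(k + m) * x and
    10**k * x have the same decimal part, so their difference is an integer.
    """
    k, m = len(non_repeating), len(repeating)
    big = int(non_repeating + repeating)              # integer part of 10**(k+m) * x
    small = int(non_repeating) if non_repeating else 0  # integer part of 10**k * x
    return Fraction(big - small, 10 ** (k + m) - 10 ** k)


x = recurring_to_fraction("3", "48")
# The same value from the infinite GP used in the first method.
via_gp = Fraction(3, 10) + Fraction(48, 1000) / (1 - Fraction(1, 100))
```

Both x and via_gp should reduce to the same fraction, which is a convenient check on either method.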
Example 2.35: Find the first term of a GP whose second term is 2 and sum to
infinity is 8.
Solution:
Given, $T_2 = ar = 2$ ...(1)
$S_\infty = \frac{a}{(1 - r)} = 8$ ...(2)
Multiplying (2) by r, $\frac{ar}{(1 - r)} = 8r$
or $\frac{2}{1 - r} = 8r$ [from (1)]
⇒ $4r^2 - 4r + 1 = 0$
⇒ $(2r - 1)^2 = 0$
⇒ $r = \frac{1}{2}$
From (1), ar = 2 or $a\left(\frac{1}{2}\right) = 2$ or a = 4
Example 2.36: If $(a^2 + b^2)(b^2 + c^2) = (ab + bc)^2$, then show that a, b, c are in GP.
Solution: $(a^2 + b^2)(b^2 + c^2) = (ab + bc)^2$
⇒ $(a^2b^2 + a^2c^2 + b^4 + b^2c^2) = (a^2b^2 + b^2c^2 + 2ab^2c)$
⇒ $(a^2c^2 + b^4) = 2ab^2c$
⇒ $[(ac)^2 + (b^2)^2 - 2(ac)(b^2)] = 0$
⇒ $(ac - b^2)^2 = 0$
⇒ $b^2 = ac$
⇒ a, b, c are in GP
Example 2.37: If a, b, c, d are in GP, then show that
$(a^2 + b^2 + c^2)(b^2 + c^2 + d^2) = (ab + bc + cd)^2$
Solution: Let r be the common ratio,
then b = ar, c = $ar^2$, d = $ar^3$
∴ LHS = $(a^2 + b^2 + c^2)(b^2 + c^2 + d^2)$
= $(a^2 + a^2r^2 + a^2r^4) \times (a^2r^2 + a^2r^4 + a^2r^6)$
= $a^4r^2(1 + r^2 + r^4)^2$
RHS = $(ab + bc + cd)^2 = (a^2r + a^2r^3 + a^2r^5)^2$
= $a^4r^2(1 + r^2 + r^4)^2$
∴ LHS = RHS (Hence the result)
Example 2.38: Find the sum to n terms of the series 4 + 44 + 444 + ...
Solution:
Sn = 4 + 44 + 444 + ...
⇒ $\frac{S_n}{4}$ = 1 + 11 + 111 + ...
⇒ $\frac{9S_n}{4}$ = 9 + 99 + 999 + ...
⇒ $\frac{9S_n}{4} = (10 - 1) + (10^2 - 1) + (10^3 - 1) + \dots$
⇒ $\frac{9S_n}{4} = (10^1 + 10^2 + 10^3 + \dots + 10^n)$ – (1 + 1 + 1 + ... up to n terms)
⇒ $\frac{9S_n}{4} = \frac{10(10^n - 1)}{(10 - 1)} - n$
⇒ $\frac{9S_n}{4} = \frac{10(10^n - 1)}{9} - n$
⇒ $S_n = \frac{40}{81}(10^n - 1) - \frac{4n}{9}$
Example 2.39: Find the sum to n terms of the series 0.7 + 0.77 + 0.777 + ...
Solution:
Let Sn = 0.7 + 0.77 + 0.777 + ...
⇒ $\frac{S_n}{7}$ = 0.1 + 0.11 + 0.111 + ...
⇒ $\frac{9S_n}{7}$ = 0.9 + 0.99 + 0.999 + ...
⇒ $\frac{9S_n}{7}$ = (1 – 0.1) + (1 – 0.01) + (1 – 0.001) + ...
⇒ $\frac{9S_n}{7} = \left(1 - \frac{1}{10}\right) + \left(1 - \frac{1}{10^2}\right) + \left(1 - \frac{1}{10^3}\right) + \dots$
⇒ $\frac{9S_n}{7}$ = (1 + 1 + 1 + ... up to n terms) – $\left(\frac{1}{10} + \frac{1}{10^2} + \dots + \frac{1}{10^n}\right)$
⇒ $\frac{9S_n}{7} = n - \frac{\frac{1}{10}\left(1 - \frac{1}{10^n}\right)}{1 - \frac{1}{10}}$
⇒ $\frac{9S_n}{7} = n - \frac{(1 - 10^{-n})}{9}$
⇒ $S_n = \frac{7n}{9} - \frac{7}{81}(1 - 10^{-n})$
Example 2.40: Insert 5 GMs between 3 and 192.
Solution: Let g1, g2, g3, g4, g5 be the 5 GMs so that 3, g1, g2, g3, g4, g5, 192 are
in GP.
Let r be the common ratio of this GP.
$T_1 = a = 3$
∴ $T_7 = ar^{7-1} = 192$
⇒ $3r^6 = 192$
⇒ $r^6 = \frac{192}{3} = 64 = 2^6$
⇒ $r = 2$
∴ $g_1 = T_2 = 3r = 3 \times 2 = 6$
$g_2 = T_3 = 3r^2 = 3 \times 2^2 = 12$
$g_3 = T_4 = 3r^3 = 3 \times 2^3 = 24$
$g_4 = T_5 = 3r^4 = 3 \times 2^4 = 48$
$g_5 = T_6 = 3r^5 = 3 \times 2^5 = 96$
Hence the five GMs are 6, 12, 24, 48, 96.
Example 2.41: If $G_1$ and $G_2$ are two GMs between b and c and a is their AM, then show that $G_1^3 + G_2^3 = 2abc$.
Solution: Given $G_1$, $G_2$ are two GMs between b and c. ∴ b, $G_1$, $G_2$, c are in GP.
If r is the common ratio, then $r = \left(\frac{c}{b}\right)^{1/3}$
$G_1 = br = b\left(\frac{c}{b}\right)^{1/3}$ ⇒ $G_1^3 = b^2c$
$G_2 = br^2 = b\left(\frac{c}{b}\right)^{2/3}$ ⇒ $G_2^3 = bc^2$
∴ $G_1^3 + G_2^3 = bc(b + c)$
Since a is the AM between b and c, we have $a = \frac{b + c}{2}$
⇒ $b + c = 2a$
∴ $G_1^3 + G_2^3 = 2abc$
Example 2.42: The sum of three terms of a GP is 21 and their product is 216. Find
the terms.
Solution: Let the numbers be $\frac{a}{r}$, a, ar.
Given that their product = 216
∴ $\frac{a}{r} \cdot a \cdot ar = 216$
⇒ $a^3 = 216 = 6^3$ ⇒ $a = 6$
Also, sum = 21
∴ $\frac{a}{r} + a + ar = 21$
⇒ $\frac{6}{r} + 6 + 6r = 21$
⇒ $6(1 + r + r^2) = 21r$
⇒ $2r^2 - 5r + 2 = 0$
⇒ $2r^2 - 4r - r + 2 = 0$
⇒ $2r(r - 2) - 1(r - 2) = 0$
⇒ $(r - 2)(2r - 1) = 0$
⇒ $r = 2$ or $\frac{1}{2}$
∴ The numbers are $\frac{6}{2}, 6, 6 \times 2$ or $\frac{6}{1/2}, 6, 6 \times \frac{1}{2}$
i.e., (3, 6, 12) or (12, 6, 3)
Example 2.43: The sum of three consecutive terms in a GP is 7 and the sum of their
squares is 21. Find the numbers.
Solution: Let the three terms be a, ar and $ar^2$.
Since the product of the three terms is not given, there won't be any specific advantage in assuming them to be $\frac{a}{r}$, a and ar.
Their sum $= a + ar + ar^2 = 7$   ...(1)
Sum of their squares $= a^2 + a^2r^2 + a^2r^4 = 21$   ...(2)
From (1) and (2), we have
$\frac{a^2(1 + r + r^2)^2}{a^2(1 + r^2 + r^4)} = \frac{7^2}{21} = \frac{49}{21} = \frac{7}{3}$
⇒ $\frac{(1 + r + r^2)^2}{1 + r^2 + r^4} = \frac{7}{3}$
⇒ $\frac{(1 + r + r^2)^2}{(1 + r + r^2)(1 - r + r^2)} = \frac{7}{3}$
Note
$1 + r^2 + r^4 = [1^2 + (r^2)^2 + 2(1)(r^2)] - r^2$
$= (1 + r^2)^2 - r^2$
$= (1 + r^2 + r)(1 + r^2 - r)$
⇒ $\frac{1 + r + r^2}{1 - r + r^2} = \frac{7}{3}$
⇒ $4r^2 - 10r + 4 = 0$
⇒ $2r^2 - 5r + 2 = 0$
⇒ $(r - 2)(2r - 1) = 0$
⇒ $r = 2$ or $\frac{1}{2}$
∵ $a(1 + r + r^2) = 7$, taking $r = 2$ gives $a(1 + 2 + 4) = 7$ ⇒ $a = 1$
∴ The numbers are 1, 2, 4.
Note
Even if $r = \frac{1}{2}$ is used, we get the same numbers (in the reverse order).
Example 2.44: a, b, c are the three numbers in GP and their sum is 28. If ab + bc
+ ca = 224, find the numbers.
Solution: a, b, c are in GP ⇒ b2 = ac, a + b + c = 28, ab + bc + ca = 224
∴ ab + bc + b2 = 224
⇒
b (a + b + c) = 224
⇒
b (28) = 224
⇒
b=8
If r be the common ratio, then $a = \frac{8}{r}$, $c = 8r$
∴ $\frac{8}{r} + 8 + 8r = 28$
⇒ $2r^2 - 5r + 2 = 0$
⇒ $(r - 2)(2r - 1) = 0$
⇒ $r = 2$ or $\frac{1}{2}$
Taking $r = 2$,
∴ $a = \frac{8}{2} = 4$, $c = 8 \times 2 = 16$
∴ The numbers are 4, 8 and 16.
Solved Examples (Harmonic Progression)
Example 2.45: The second term of a HP is $\frac{1}{5}$ and its 9th term is $\frac{1}{19}$. Determine the series.
Solution:
$T_2$ of the corresponding AP $= 5 = a + d$   ...(1)
$T_9$ of the corresponding AP $= 19 = a + 8d$   ...(2)
On subtraction, $14 = 7d$ ⇒ $d = 2$
Substituting $d = 2$ in (1), $5 = a + 2$ ⇒ $a = 3$
Thus, the AP is 3, 3 + 2, 3 + 4, 3 + 6, ...
or 3, 5, 7, 9, ...
Hence, the HP is $\frac{1}{3}, \frac{1}{5}, \frac{1}{7}, \frac{1}{9}, \dots$
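The method of this example (pass to the corresponding AP, solve for a and d, then take reciprocals) can be sketched as follows. The function name and its argument order are assumptions made for illustration; the sketch also assumes no term of the corresponding AP is zero, since its reciprocal would then be undefined.

```python
from fractions import Fraction


def hp_from_two_terms(i, t_i, j, t_j, count):
    """Return the first `count` terms of the HP whose i-th term is t_i
    and whose j-th term is t_j (positions are 1-based)."""
    # The reciprocals of HP terms form an AP.
    r_i, r_j = 1 / Fraction(t_i), 1 / Fraction(t_j)
    d = (r_j - r_i) / (j - i)          # common difference of the AP
    a = r_i - (i - 1) * d              # first term of the AP
    return [1 / (a + k * d) for k in range(count)]


terms = hp_from_two_terms(2, Fraction(1, 5), 9, Fraction(1, 19), 4)
print([str(t) for t in terms])  # ['1/3', '1/5', '1/7', '1/9']
```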
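Example 2.46: If a, b, c are in HP (a ≠ b ≠ c), prove that
$\frac{a}{c} = \frac{a - b}{b - c}$
Solution: a, b, c are in HP ⇒ $\frac{1}{a}, \frac{1}{b}, \frac{1}{c}$ are in AP
⇒ $\frac{1}{b} - \frac{1}{a} = \frac{1}{c} - \frac{1}{b}$
⇒ $\frac{a - b}{ab} = \frac{b - c}{bc}$
⇒ $\frac{a - b}{a} = \frac{b - c}{c}$
⇒ $\frac{a}{c} = \frac{a - b}{b - c}$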
Example 2.47: Find the (m + n)th term of the HP of which the mth term is n and
nth term is m. Also find the (mn)th term.
Solution: Let the corresponding AP be a, a + d, a + 2d, ...
$T_m$ of the AP $= a + (m - 1)d = \frac{1}{n}$ (given)
$T_n$ of the AP $= a + (n - 1)d = \frac{1}{m}$ (given)
∴ On subtraction, $(m - n)d = \frac{1}{n} - \frac{1}{m} = \frac{m - n}{mn}$
∴ $d = \frac{1}{mn}$
But, $a + (m - 1)d = \frac{1}{n}$
∴ $a + \frac{m - 1}{mn} = \frac{1}{n}$
⇒ $a = \frac{1}{n} - \frac{m - 1}{mn}$
⇒ $a = \frac{1}{mn}$
∴ (m + n)th term of the AP, $T_{m+n} = a + (m + n - 1)d$
$= \frac{1}{mn} + \frac{m + n - 1}{mn}$
$= \frac{m + n}{mn}$
∴ The (m + n)th term of the given HP $= \frac{mn}{m + n}$
$T_{mn}$ of the HP $= \frac{1}{a + (mn - 1)d} = \frac{1}{\frac{1}{mn} + \frac{mn - 1}{mn}} = 1$
Example 2.48: If log (a + c) + log (a + c –2b) = 2 log (a – c), then prove that a,
b, c are in HP.
Solution: log (a + c) + log (a + c – 2b) = 2 log (a – c)
⇒
log [(a + c) (a + c – 2b)] = log [(a – c)2]
⇒
$(a + c)(a + c - 2b) = (a - c)^2$
⇒ $(a + c)^2 - 2b(a + c) = (a - c)^2$
⇒ $(a + c)^2 - (a - c)^2 = 2b(a + c)$
⇒ $4ac = 2b(a + c)$
⇒ $b = \frac{2ac}{a + c}$
⇒ a, b, c are in HP
Example 2.49: If a, b, c are in HP, then show that
$\frac{b + a}{b - a} + \frac{b + c}{b - c} = 2$.
Solution: $\frac{b + a}{b - a} + \frac{b + c}{b - c} = 2$
⇔ $(b + a)(b - c) + (b - a)(b + c) = 2(b - a)(b - c)$
⇔ $2b^2 - 2ac = 2b^2 - 2ab - 2bc + 2ac$
⇔ $2ab + 2bc = 4ac$
⇔ $b = \frac{2ac}{a + c}$
⇔ a, b, c are in HP.
Example 2.50: If the pth, qth and rth terms of a HP are respectively P, Q and R, prove that PQ (p – q) + QR (q – r) + RP (r – p) = 0.
Solution: $T_p = \frac{1}{a + (p - 1)d} = P$; $T_q = \frac{1}{a + (q - 1)d} = Q$; $T_r = \frac{1}{a + (r - 1)d} = R$
$\sum PQ(p - q) = \sum \frac{p - q}{[a + (p - 1)d][a + (q - 1)d]}$
$= \frac{\sum (p - q)[a + (r - 1)d]}{[a + (p - 1)d][a + (q - 1)d][a + (r - 1)d]}$
$= \frac{a\sum (p - q) + d\sum (p - q)r - d\sum (p - q)}{[a + (p - 1)d][a + (q - 1)d][a + (r - 1)d]}$
$= 0$
since $\sum (p - q) = 0$ and $\sum (p - q)r = 0$, the sums being taken cyclically over p, q, r.
Example 2.51: Insert 4 HMs between $\frac{2}{3}$ and $\frac{2}{13}$.
Solution: Let $x_1, x_2, x_3, x_4$ be 4 HMs between $\frac{2}{3}$ and $\frac{2}{13}$.
∴ $\frac{2}{3}, x_1, x_2, x_3, x_4, \frac{2}{13}$ are in HP
⇒ $\frac{3}{2}, \frac{1}{x_1}, \frac{1}{x_2}, \frac{1}{x_3}, \frac{1}{x_4}, \frac{13}{2}$ are in AP
$a = \frac{3}{2}$ and $T_6 = a + 5d = \frac{13}{2}$
⇒ $\frac{3}{2} + 5d = \frac{13}{2}$
⇒ $d = 1$
∴ $T_2 = \frac{1}{x_1} = a + d = \frac{3}{2} + 1 = \frac{5}{2}$,
$T_3 = \frac{1}{x_2} = a + 2d = \frac{3}{2} + 2 = \frac{7}{2}$,
$T_4 = \frac{1}{x_3} = a + 3d = \frac{3}{2} + 3 = \frac{9}{2}$,
$T_5 = \frac{1}{x_4} = a + 4d = \frac{3}{2} + 4 = \frac{11}{2}$.
Hence, the 4 HMs are $\frac{2}{5}, \frac{2}{7}, \frac{2}{9}$ and $\frac{2}{11}$ respectively.
Example 2.52: Find n such that $\frac{a^{n+1} + b^{n+1}}{a^n + b^n}$ may be the harmonic mean between a and b (a ≠ b).
Solution: The harmonic mean between two numbers a and b is $\frac{2ab}{a + b}$
∴ $\frac{a^{n+1} + b^{n+1}}{a^n + b^n} = \frac{2ab}{a + b}$
⇒ $(a + b)(a^{n+1} + b^{n+1}) = 2ab(a^n + b^n)$
⇒ $a^{n+2} + ab^{n+1} + a^{n+1}b + b^{n+2} = 2a^{n+1}b + 2ab^{n+1}$
⇒ $a^{n+2} + b^{n+2} = a^{n+1}b + ab^{n+1}$
⇒ $a^{n+1}(a - b) = b^{n+1}(a - b)$
⇒ $a^{n+1} = b^{n+1}$   (∵ a ≠ b)
⇒ $\left(\frac{a}{b}\right)^{n+1} = 1 = \left(\frac{a}{b}\right)^0$
⇒ $n + 1 = 0$
⇒ $n = -1$
Remark
The same expression gives the AM and the GM between a and b for n = 0 and n = –1/2 respectively.
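Example 2.53: If $\frac{bc}{ad} = \frac{b + c}{a + d} = \frac{3(b - c)}{a - d}$, then show that a, b, c, d are in HP.
Solution: Given $\frac{bc}{ad} = \frac{b + c}{a + d}$, we have
$\frac{a + d}{ad} = \frac{b + c}{bc}$
⇒ $\frac{1}{d} + \frac{1}{a} = \frac{1}{c} + \frac{1}{b}$
⇒ $\frac{1}{b} - \frac{1}{a} = \frac{1}{d} - \frac{1}{c}$   ...(1)
Also, $\frac{bc}{ad} = \frac{3(b - c)}{a - d}$
⇒ $\frac{a - d}{ad} = \frac{3(b - c)}{bc}$
⇒ $\frac{1}{d} - \frac{1}{a} = \frac{3}{c} - \frac{3}{b}$   ...(2)
and $\frac{1}{d} + \frac{1}{a} = \frac{1}{c} + \frac{1}{b}$   ... from (1)
Adding, $\frac{2}{d} = \frac{4}{c} - \frac{2}{b}$
or $\frac{2}{d} + \frac{2}{b} = \frac{4}{c}$
or $\frac{1}{d} + \frac{1}{b} = \frac{1}{c} + \frac{1}{c}$
or $\frac{1}{d} - \frac{1}{c} = \frac{1}{c} - \frac{1}{b}$   ...(3)
From (1) and (3), we have
$\frac{1}{b} - \frac{1}{a} = \frac{1}{c} - \frac{1}{b} = \frac{1}{d} - \frac{1}{c}$
⇒ $\frac{1}{a}, \frac{1}{b}, \frac{1}{c}, \frac{1}{d}$ are in AP
⇒ a, b, c, d are in HP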
Example 2.54: If $a_1, a_2, a_3, \dots, a_n$ are in HP, then show that
$a_1a_2 + a_2a_3 + a_3a_4 + \dots + a_{n-1}a_n = (n - 1)a_1a_n$
Solution: Given $a_1, a_2, a_3, \dots, a_n$ are in HP.
∴ $\frac{1}{a_1}, \frac{1}{a_2}, \frac{1}{a_3}, \dots, \frac{1}{a_n}$ are in AP.
Let 'd' be the common difference of the AP.
Then $\frac{1}{a_2} - \frac{1}{a_1} = d$ ⇒ $\frac{a_1 - a_2}{d} = a_1a_2$
$\frac{1}{a_3} - \frac{1}{a_2} = d$ ⇒ $\frac{a_2 - a_3}{d} = a_2a_3$
...   ...   ...
$\frac{1}{a_n} - \frac{1}{a_{n-1}} = d$ ⇒ $\frac{a_{n-1} - a_n}{d} = a_{n-1}a_n$
∴ $a_1a_2 + a_2a_3 + \dots + a_{n-1}a_n = \frac{(a_1 - a_2) + (a_2 - a_3) + \dots + (a_{n-1} - a_n)}{d}$
$= \frac{a_1 - a_n}{d}$   ...(1)
But, $\frac{1}{a_n} = T_n$ of the AP $= \frac{1}{a_1} + (n - 1)d$
⇒ $\frac{1}{a_n} - \frac{1}{a_1} = (n - 1)d$
⇒ $\frac{a_1 - a_n}{a_1a_n} = (n - 1)d$
⇒ $\frac{a_1 - a_n}{d} = (n - 1)a_1a_n$   ...(2)
From (1) and (2), we have
$a_1a_2 + a_2a_3 + \dots + a_{n-1}a_n = (n - 1)a_1a_n$
Example 2.55: If a, b, c are in HP, then show that $\frac{a}{b + c}, \frac{b}{c + a}, \frac{c}{a + b}$ are in HP.
Solution: $\frac{a}{b + c}, \frac{b}{c + a}, \frac{c}{a + b}$ are in HP
⇔ $\frac{b + c}{a}, \frac{c + a}{b}, \frac{a + b}{c}$ are in AP
⇔ $\frac{b + c}{a} + 1, \frac{c + a}{b} + 1, \frac{a + b}{c} + 1$ are in AP
⇔ $\frac{a + b + c}{a}, \frac{a + b + c}{b}, \frac{a + b + c}{c}$ are in AP
⇔ $\frac{1}{a}, \frac{1}{b}, \frac{1}{c}$ are in AP
⇔ a, b, c are in HP
(Hence the result)
Example 2.56: If a, b, c are in HP, then show that
$\frac{a}{b + c - 2a}, \frac{b}{c + a - 2b}, \frac{c}{a + b - 2c}$ are in HP.
Solution: $\frac{a}{b + c - 2a}, \frac{b}{c + a - 2b}, \frac{c}{a + b - 2c}$ are in HP
⇔ $\frac{b + c - 2a}{a}, \frac{c + a - 2b}{b}, \frac{a + b - 2c}{c}$ are in AP
⇔ $\frac{b + c}{a} - 2, \frac{c + a}{b} - 2, \frac{a + b}{c} - 2$ are in AP
⇔ $\frac{b + c}{a}, \frac{c + a}{b}, \frac{a + b}{c}$ are in AP
⇔ a, b, c are in HP (as per the previous problem)
Example 2.57: The sum of first three terms of a HP is 22. If the first term be 12, find the HP.
Solution: Let the first three terms of the corresponding AP be a, a + d, a + 2d.
Thus, $\frac{1}{a} = 12$ or $a = \frac{1}{12}$
∴ The first three terms of the AP are $\frac{1}{12}, \left(\frac{1}{12} + d\right), \left(\frac{1}{12} + 2d\right)$
i.e., $\frac{1}{12}, \frac{1 + 12d}{12}, \frac{1 + 24d}{12}$
∴ The first three terms of the given HP are $12, \frac{12}{1 + 12d}, \frac{12}{1 + 24d}$
∴ $12 + \frac{12}{1 + 12d} + \frac{12}{1 + 24d} = 22$
or $1440d^2 - 36d - 7 = 0$
∴ $d = \frac{36 \pm \sqrt{1296 + 40320}}{2880} = \frac{36 \pm 204}{2880}$
∴ $d = \frac{1}{12}$ or $-\frac{7}{120}$
Now, putting $d = \frac{1}{12}$, the required HP is 12, 6, 4, ...
And, putting $d = -\frac{7}{120}$, the required HP is 12, 40, –30, ...
Example 2.58: The harmonic mean of two numbers is 4. The arithmetic mean A and the geometric mean G of these numbers are connected by the relation $2A + G^2 = 27$. Find the numbers.
Solution: Let the numbers be a and b.
$A = \frac{a + b}{2}$; $G = \sqrt{ab}$; $H = \frac{2ab}{a + b} = 4$
⇒ $2A = a + b$; $G^2 = ab$; $ab = 2(a + b)$
∴ $2A + G^2 = 27$ ⇒ $(a + b) + ab = 27$
⇒ $(a + b) + 2(a + b) = 27$
⇒ $3(a + b) = 27$
⇒ $a + b = 9$
∴ $ab = 2(a + b) = 18$
∴ $a - b = \pm\sqrt{(a + b)^2 - 4ab} = \pm\sqrt{81 - 72} = \pm 3$
Solving, a = 6 or 3
b = 3 or 6
nth Term of an Arithmetic Progression/Sum of First n Terms of an Arithmetic Progression/Arithmetic Mean/Sigma Notation
In mathematics, the term progression covers two common kinds of sequence. An arithmetic progression is a sequence of numbers in which the difference between any two successive members is a constant. A geometric progression is a sequence of numbers in which the quotient of any two successive members is a constant.
Arithmetical Progression
Quantities $a_1, a_2, a_3, \dots, a_n, \dots$ are said to be in Arithmetical Progression if $a_n - a_{n-1}$ is constant for all integers n > 1. The constant quantity $a_n - a_{n-1}$ is called the common difference of the arithmetical progression.
Notation. A.P. stands for an arithmetical progression. Consider the following series.
1, 3, 5, 7, 9, 11, ...
$0, \sqrt{2}, 2\sqrt{2}, 3\sqrt{2}, 4\sqrt{2}, \dots$
$1, \frac{1}{2}, 0, -\frac{1}{2}, -1, -\frac{3}{2}, \dots$
x + y, x, x – y, x – 2y, ...
5.3, 5.55, 5.8, 6.05, 6.3, ...
Each of the above series is an A.P. Common differences are respectively 2, $\sqrt{2}$, $-\frac{1}{2}$, –y and 0.25.
General Term of an Arithmetical Progression
The first term can be denoted by a. Then
$a_n = a + (n - 1)d$
So, $S_n = \frac{n}{2}[2a + (n - 1)d]$
Let $a_1, a_2, \dots, a_n, \dots$ be a given A.P. Let d be their common difference. Then
$a_n - a_{n-1} = d$ for all n.
⇒ $a_2 - a_1 = d$, $a_3 - a_2 = d$, $a_4 - a_3 = d$ and so on,
⇒ $a_2 = a_1 + d$, $a_3 = a_2 + d = a_1 + d + d = a_1 + 2d$
$a_4 = a_3 + d = a_1 + 2d + d = a_1 + 3d$
...   ...   ...
$a_{n-1} = a_1 + (n - 2)d$
$a_n = a_{n-1} + d = a_1 + (n - 2)d + d$
$= a_1 + (n - 1)d$
Thus nth term, $a_n$, of an arithmetical progression whose first term is $a_1$ and common difference d is given by,
$a_n = a_1 + (n - 1)d$
Example 2.59: Find 16th term of the series 3.75, 3.5, 3.25, ... .
Solution: In this case a1 = 3.75, a2 = 3.5, a3 = 3.25
d = a2 – a1 = –0.25
Hence 16th term = a16
= 3.75 + (16 – 1) (– 0.25)
= 3.75 – 15 × 0.25
= 3.75 – 3.75 = 0
Example 2.60: Which term of the A.P. 49, 44, 39, ... is 9?
Solution: Let nth term be 9, i.e. $a_n = 9$.
Here $a_1 = 49$, $d = 44 - 49 = -5$,
Thus $a_n = 9 = 49 + (n - 1)(-5)$
⇒ $9 = 49 - 5n + 5$
or $5n = 54 - 9 = 45$
⇒ $n = 9$
Thus 9th term of the given A.P. is 9.
Sum of Finite Number of Quantities in an Arithmetic Progression
Let $a_1, a_2, \dots, a_n$ be n quantities in A.P., and let the last term $a_n$ be denoted by l. If d is their common difference then
$a_n = a_1 + (n - 1)d = l$
Put $S_n = a_1 + a_2 + \dots + a_n$
Thus $S_n = a_1 + (a_1 + d) + (a_1 + 2d) + \dots + [a_1 + (n - 1)d]$
$= a_1 + (a_1 + d) + (a_1 + 2d) + \dots + (l - d) + l$   ...(1)
Writing the above series in reverse order, we get
$S_n = l + (l - d) + (l - 2d) + \dots + (a_1 + d) + a_1$   ...(2)
Adding Equations (1) and (2), we get
$2S_n = (a_1 + l) + (a_1 + l) + \dots + (a_1 + l)$, (n times)
$= n(a_1 + l)$
Therefore, $S_n = \frac{n}{2}(a_1 + l)$
$= \frac{n}{2}\{a_1 + [a_1 + (n - 1)d]\}$
Consequently, $S_n = \frac{n}{2}[2a_1 + (n - 1)d]$
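The two formulas of this section, the nth term and the sum of the first n terms, can be checked with a short Python sketch. The function names `ap_term` and `ap_sum` are hypothetical, chosen only for this illustration.

```python
def ap_term(a1, d, n):
    """n-th term of an A.P.: a_n = a1 + (n - 1) d."""
    return a1 + (n - 1) * d


def ap_sum(a1, d, n):
    """Sum of the first n terms: S_n = n/2 * [2 a1 + (n - 1) d]."""
    return n * (2 * a1 + (n - 1) * d) / 2


# Example 2.60: the 9th term of 49, 44, 39, ... is 9
print(ap_term(49, -5, 9))

# The formula agrees with adding the terms one by one
direct = sum(ap_term(49, -5, k) for k in range(1, 10))
print(direct, ap_sum(49, -5, 9))
```

The second check mirrors the derivation above: pairing the series with its reverse gives n copies of (a1 + l), hence the factor n/2.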
Check Your Progress - 1
1.
How do we denote the first term in an AP?
................................................................................................................
................................................................................................................
................................................................................................................
2.
When is a sequence of numbers said to be in Harmonic Progression?
................................................................................................................
................................................................................................................
................................................................................................................
2.3
ARITHMETICAL MEAN
If a1, a2, ..., an are in A.P., then the quantities a2, a3, ..., an–1 are called Arithmetic
Means (A.M.) between a1 and an.
Thus in the series, 1, 3, 5, 7, 9, 11, 13, 15, ...
3, 5 are arithmetic means between 1 and 7.
9, 11, 13 are arithmetic means between 7 and 15.
To Insert n Arithmetic Means Between Two Given Numbers
Let a and b be two given quantities and A1, A2, ..., An be the n arithmetic means
between them. Then the quantities
a, A1, A2, ..., An, b are in A. P.
Let d be their common difference.
Now $b$ = (n + 2)th term
$= a + (n + 1)d$
⇒ $d = \frac{b - a}{n + 1}$
Further, $A_1$ = 2nd term
$= a + d = a + \frac{b - a}{n + 1} = \frac{na + b}{n + 1}$
$A_2$ = 3rd term
$= a + 2d = a + 2\,\frac{b - a}{n + 1} = \frac{na + 2b - a}{n + 1}$
...   ...   ...
$A_n = a + nd = a + n\,\frac{b - a}{n + 1} = \frac{a + nb}{n + 1}$
Hence $\frac{na + b}{n + 1}, \frac{na + 2b - a}{n + 1}, \dots, \frac{a + nb}{n + 1}$
are n arithmetic means between a and b.
Example 2.61: Insert 6 arithmetic means between 1 and 19.
Solution: Let A1, A2, A3, A4, A5, A6, be the required arithmetic means.
Then 1, A1, A2, A3, A4, A5, A6, 19 are in A.P.
Let d be their common difference.
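Then 19 = 8th term
$= 1 + (8 - 1)d$
$= 1 + 7d$
Thus $d = \frac{18}{7}$
Hence $A_1$ = 2nd term $= 1 + \frac{18}{7} = \frac{25}{7}$
$A_2$ = 3rd term $= 1 + 2 \times \frac{18}{7} = 1 + \frac{36}{7} = \frac{43}{7}$
$A_3$ = 4th term $= 1 + 3 \times \frac{18}{7} = 1 + \frac{54}{7} = \frac{61}{7}$
$A_4$ = 5th term $= 1 + 4 \times \frac{18}{7} = 1 + \frac{72}{7} = \frac{79}{7}$
$A_5$ = 6th term $= 1 + 5 \times \frac{18}{7} = 1 + \frac{90}{7} = \frac{97}{7}$
$A_6$ = 7th term $= 1 + 6 \times \frac{18}{7} = 1 + \frac{108}{7} = \frac{115}{7}$
So, the required means are $\frac{25}{7}, \frac{43}{7}, \frac{61}{7}, \frac{79}{7}, \frac{97}{7}, \frac{115}{7}$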
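The procedure for inserting n arithmetic means can be sketched as below. The function name is an assumption for this illustration, and the sketch assumes a and b are integers (or exact fractions) so that the results stay exact.

```python
from fractions import Fraction


def arithmetic_means(a, b, n):
    """Return the n arithmetic means between a and b.

    Uses d = (b - a) / (n + 1) and A_k = a + k d for k = 1, ..., n.
    """
    d = Fraction(b - a, n + 1)
    return [a + k * d for k in range(1, n + 1)]


# Example 2.61: six arithmetic means between 1 and 19
print([str(m) for m in arithmetic_means(1, 19, 6)])
# ['25/7', '43/7', '61/7', '79/7', '97/7', '115/7']
```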
Example 2.62: If pth, qth, rth term of an A.P. are a, b, c, respectively, show that
(q – r) a + (r –p)b + (p – q)c = 0
Solution. Here,
pth term $= a = a_1 + (p - 1)d$   ...(1)
qth term $= b = a_1 + (q - 1)d$   ...(2)
rth term $= c = a_1 + (r - 1)d$   ...(3)
where $a_1$ is the first term and d is the common difference of the A.P.
Multiply equations (1) by q – r, (2) by r – p, (3) by p – q and add to obtain
$(q - r)a + (r - p)b + (p - q)c = a_1(q - r) + a_1(r - p) + a_1(p - q) + d[(p - 1)(q - r) + (q - 1)(r - p) + (r - 1)(p - q)]$
$= a_1[q - r + r - p + p - q] + d[pq + r - pr - q + qr - r + p - pq + rp - rq - p + q] = 0$
Example 2.63: The sum of n terms of two A.Ps are in the ratio of 7n + 1: 4n + 27.
Find the ratio of their 11th terms.
Solution: Let $a_1$ and $b_1$ be the first terms of the two A.P.s and $d_1$, $d_2$ be their common differences respectively.
Then, $S_n = \frac{n}{2}[2a_1 + (n - 1)d_1]$
$S'_n = \frac{n}{2}[2b_1 + (n - 1)d_2]$
So, $\frac{S_n}{S'_n} = \frac{2a_1 + (n - 1)d_1}{2b_1 + (n - 1)d_2} = \frac{7n + 1}{4n + 27}$
Putting n = 21, we get
$\frac{2a_1 + 20d_1}{2b_1 + 20d_2} = \frac{148}{111}$
or $\frac{a_1 + 10d_1}{b_1 + 10d_2} = \frac{148}{111}$
or $\frac{a_{11}}{b_{11}} = \frac{148}{111}$
where $a_{11}$ and $b_{11}$ are the 11th terms of the two A.P.s respectively.
The ratio of their 11th terms is 4 : 3.
Example 2.64: If $\frac{1}{b + c}, \frac{1}{c + a}, \frac{1}{a + b}$ are in an A.P., prove that $a^2, b^2, c^2$ are also in A.P.
Solution: Since $\frac{1}{b + c}, \frac{1}{c + a}, \frac{1}{a + b}$ are in an A.P.,
We have $\frac{1}{c + a} - \frac{1}{b + c} = \frac{1}{a + b} - \frac{1}{c + a}$
⇒ $\frac{(b + c) - (c + a)}{(b + c)(c + a)} = \frac{(c + a) - (a + b)}{(a + b)(c + a)}$
⇒ $\frac{b - a}{b + c} = \frac{c - b}{a + b}$
⇒ $b^2 - a^2 = c^2 - b^2$
⇒ $a^2, b^2, c^2$ are in A.P.
Example 2.65: The monthly salary of a person was ₹ 320 for each of the first three years. He then got annual increments of ₹ 40 per month for each of the following successive 12 years. His salary remained stationary till retirement when he found that his average monthly salary during the service period was ₹ 698. Find the period of his service.
Solution: Let n be the total number of years of the person's service.
His total salary = ₹ 12n × 698
(As his monthly average is ₹ 698)
Total salary in first three years of service
= 320 × 3 × 12 = ₹ 960 × 12
In the 4th year, his monthly salary was ₹ (320 + 40) = ₹ 360
In the 5th year his monthly salary was ₹ 400, and so on.
Then for the next 12 years, his total salary
= ₹ 12 × [360 + 400 + ... up to 12 terms]
= ₹ 12 × $\frac{12}{2}$ [2 × 360 + (12 – 1) × 40]
= ₹ 12 × 6 (720 + 440)
= ₹ 12 × 6 × 1160
= ₹ 12 × 6960
At the end of these 12 years, his monthly salary was
₹ [360 + (12 – 1) × 40] = ₹ 800
He got ₹ 800 as monthly salary for the remaining (n – 15) years. So his total salary for the remaining (n – 15) years was (n – 15) × 800 × 12
Hence his total salary throughout his service period
= 12[960 + 6960 + 800(n – 15)]
= 12(7920 + 800n – 12000)
= 12(800n – 4080)
This must be same as 12n × 698
i.e., 12n × 698 = 12(800n – 4080)
⇒ 102n = 4080 ⇒ n = 40 years.
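The answer n = 40 can be confirmed by brute force. The sketch below is an illustration only: the helper names are hypothetical, and it uses the fact that the monthly pay is constant within a year, so the average over years equals the average over months.

```python
def monthly_pay(year: int) -> int:
    """Monthly salary (in rupees) during a given year of service."""
    if year <= 3:
        return 320                      # first three years
    if year <= 15:
        return 320 + 40 * (year - 3)    # twelve annual increments of 40
    return 800                          # stationary until retirement


def service_years(average: int, limit: int = 100) -> int:
    """Smallest service period whose average monthly salary equals `average`."""
    total = 0
    for year in range(1, limit + 1):
        total += monthly_pay(year)
        if total == average * year:     # integer comparison, no rounding
            return year
    raise ValueError("no service period found within the limit")


print(service_years(698))  # 40
```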
Example 2.66: The sequence of natural numbers is written as,
1
2 3 4
5 6 7 8 9
... ... ... ... ... ... ...
Find the sum of the numbers in the rth row.
Solution: Let $S_r$ denote the sum of the rth row.
$S_1 = 1$, $S_2 = 2 + 3 + 4$, $S_3 = 5 + 6 + 7 + 8 + 9$
Let the initial term of $S_k$ be $t_k$
and suppose that $M = 1 + 2 + 5 + \dots + t_k$
is the sum of the first terms of $S_1, S_2, \dots, S_k$.
Now, $M = 1 + 2 + 5 + 10 + \dots + t_k$
Also, $M = \quad\; 1 + 2 + 5 + \dots + t_{k-1} + t_k$
Subtracting, we get
$0 = (1 + 1 + 3 + 5 + \dots$ up to k terms$) - t_k$
$t_k = 1 + [1 + 3 + 5 + \dots$ up to (k – 1) terms$]$
$= 1 + \frac{k - 1}{2}[2 + (k - 2) \times 2]$
$= 1 + (k - 1)^2 = k^2 - 2k + 2$
In $S_1$ there is one term, in $S_2$ there are three terms and so on. In $S_k$ there will be (2k – 1) terms.
Hence, we have to find the sum of the series
$r^2 - 2r + 2,\; r^2 - 2r + 3,\; r^2 - 2r + 4, \dots$ up to (2r – 1) terms
So $S_r = \frac{2r - 1}{2}[2(r^2 - 2r + 2) + (2r - 2) \times 1]$
$= (2r - 1)(r^2 - 2r + 2 + r - 1)$
$= (2r - 1)(r^2 - r + 1)$
$= 2r^3 - 3r^2 + 3r - 1$
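The row sums can be verified by generating the rows directly. This is a small sketch with hypothetical function names; it relies on the first term of the rth row being (r – 1)² + 1, which is the same as r² – 2r + 2, and on the row having 2r – 1 terms.

```python
def row(r: int) -> list:
    """Numbers in the r-th row of the triangular arrangement."""
    first = (r - 1) ** 2 + 1            # equals r^2 - 2r + 2
    return list(range(first, first + 2 * r - 1))


def row_sum_closed(r: int) -> int:
    """Closed form 2r^3 - 3r^2 + 3r - 1 for the sum of the r-th row."""
    return 2 * r ** 3 - 3 * r ** 2 + 3 * r - 1


print(row(3))                            # [5, 6, 7, 8, 9]
print(sum(row(3)), row_sum_closed(3))    # 35 35
```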
Example 2.67: Find the domain and range of the following relation. Is it a function?
{(–3, 6), (–2, 6), (–1, 6), (0, 6), (1, 6), (2, 6)}
Solution: Domain = {–3, –2, –1, 0, 1, 2}
Range = {6}
In this relation all the values of x map to the same value y = 6, so the points lie on a horizontal line. Each value in the domain occurs only once.
∴ The given relation is a function.
Example 2.68: Find the domain and range of the following relation. Is it a function?
{(–2, 3), (4, 6), (3, –1), (6, 6), (–2, –3)}
Solution: Domain = {–2, 4, 3, 6}
Range = {3, 6, –1, –3}
Here the two pairs (–2, 3) and (–2, –3) have the same value of x, and one value of x cannot be mapped to two different values (3 and –3); therefore the given relation is not a function.
Example 2.69: Find the domain and range of the following function
$y = -\sqrt{-2x + 3}$
Solution: The function is defined only when the quantity inside the square root is non-negative, since a negative value inside the square root leads to a complex number.
∴ –2x + 3 ≥ 0
–2x ≥ –3
or 2x ≤ 3
or x ≤ 3/2
Domain is {x : x ≤ 3/2}
Range is {y : y ≤ 0}
Sigma Notation
Sigma notation is denoted by Σ. Sigma is the upper-case Greek letter corresponding to S, which stands for sum. It indicates that the values of the expression written after it are to be added up.
For example, Σn means that we sum the values of n.
Example: $\sum_{n=1}^{4} n$ means that we sum n as n goes from 1 to 4.
$\sum_{n=1}^{4} n = 1 + 2 + 3 + 4 = 10$
In the same way, more complex terms can be evaluated under sigma notation:
$\sum_{n=1}^{4} (2n + 1) = 3 + 5 + 7 + 9 = 24$
And
$\sum_{n=1}^{5} n^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 = 55$
Example 2.70: Write 1 + 2 + 3 + … + 7 + 8 using sigma notation.
Solution: $\sum_{n=1}^{8} n$
Example 2.71: Write 1 + 4 + 9 + … + 49 using sigma notation.
Solution: $\sum_{n=1}^{7} n^2$
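Sigma notation corresponds closely to a summation loop. The following sketch is illustrative only; the helper name `sigma` and its arguments (the term as a function of n, the lower limit and the upper limit) are assumptions for this example.

```python
def sigma(term, lower: int, upper: int):
    """Sum term(n) for n = lower, lower + 1, ..., upper (both limits included)."""
    return sum(term(n) for n in range(lower, upper + 1))


print(sigma(lambda n: n, 1, 4))           # 1 + 2 + 3 + 4 = 10
print(sigma(lambda n: 2 * n + 1, 1, 4))   # 3 + 5 + 7 + 9 = 24
print(sigma(lambda n: n ** 2, 1, 5))      # 1 + 4 + 9 + 16 + 25 = 55
```

Note that Python's `range` excludes its end point, which is why the upper limit is passed as `upper + 1`.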
Check Your Progress - 2
1.
What is sigma notation given by?
................................................................................................................
................................................................................................................
................................................................................................................
2.
What does sigma stand for?
................................................................................................................
................................................................................................................
................................................................................................................
2.4
SUMMARY
• An arithmetic progression is a sequence, in which each term, except the
first, is obtained by adding a fixed number to the term immediately
preceding it.
• The fixed number in an arithmetic progression is called the common
difference.
• In an AP, we usually denote the first term by ‘a’, the common difference by
‘d’.
• If a, b, c are in AP, then b is called the arithmetic mean (AM) between a
and c, and b = (a + c)/2.
• If a fixed number is added to (or subtracted from) each term of an AP then
the resulting sequence is also an AP.
• If each term of an AP is multiplied or divided by a non-zero fixed number,
then the resulting sequence is also an AP.
• A geometric progression is a sequence, in which each term, except the first,
is obtained by multiplying the term immediately preceding it, with a fixed
non-zero number.
• If each term of a GP is multiplied or divided by a non-zero fixed number,
then the resulting sequence is also a GP.
• A series in which each term is the product of the corresponding terms of an
AP and a GP is called an Arithmetico-geometric series.
• A sequence of numbers is said to be in Harmonic Progression when the
reciprocals of these numbers are in Arithmetic Progression.
• The sum of first n terms of an HP is not equal to the reciprocal of the sum
of first n terms of the corresponding AP.
• If each term of a HP is multiplied or divided by a constant non-zero
number, then the resulting terms are also in HP.
• In an arithmetic progression, the difference between any two successive
members of the sequence is a constant; in a geometric progression, the
quotient of any two successive members of the sequence is a constant.
• Quantities $a_1, a_2, a_3, \dots, a_n, \dots$ are said to be in Arithmetical Progression
if $a_n - a_{n-1}$ is constant for all integers n > 1. The constant quantity
$a_n - a_{n-1}$ is called the common difference of the arithmetical progression.
• Sigma notation is denoted by Σ. Sigma is the upper-case Greek letter
corresponding to S, which stands for sum. It indicates that the values of
the expression written after it are to be added up.
2.5
KEY WORDS
• Arithmetic progression: It is a sequence of numbers in which each differs
from the preceding one by a constant quantity.
• Harmonic progression: It is a sequence of quantities whose reciprocals
are in arithmetical progression.
• Arithmetico-geometric series: A series in which each term is the product
of the corresponding terms of an AP and a GP is called an Arithmetico-geometric series.
2.6
ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. We denote the first term in an AP by ‘a’.
2. A sequence of numbers is said to be in Harmonic Progression when the
reciprocals of these numbers are in Arithmetic Progression.
Check Your Progress - 2
1. Sigma notation is given by Σ.
2. Sigma stands for sum.
2.7
SELF-ASSESSMENT QUESTIONS
1. What do you understand by arithmetic progression and sequence?
2. List the difference between an arithmetic progression and a harmonic
progression.
3. Define a geometric progression. How is it different from an arithmetic
progression?
4. A manufacturer of TV sets produced 670 units in the third year and 770
units in the seventh year. Assuming that the increase in production every
year is the same, find what was (i) the total production in 9 years and
(ii) the production in the 11th year?
5. The monthly salary of a person was ₹ 320 for each of the first three years.
He next got annual increments of ₹ 40 per month for each of the following
successive 12 years. His salary remained stationary till retirement when he
found that his average monthly salary during the service period was ₹ 698.
Find the period of his service.
6. Two posts were offered to a man. In the first one, the starting salary was
₹ 500 per month and the annual increment was ₹ 15. In the second one,
the salary commenced at ₹ 320 per month, but the annual increment was
₹ 22. He decided to accept that post which would give him more earnings
Justify your answer.
7. If Sn is the sum of first ‘n’ terms of an arithmetic series, then show that
Sn+3 – 3.Sn+2 + 3.Sn+1 – Sn = 0.
2.8
FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Geometric Progression and
Series
UNIT–3
GEOMETRIC PROGRESSION AND SERIES
Objectives
After going through this unit, you will be able to:
• Define nth term of a geometric progression
• Analyse the sum to infinity of a geometric progression
• Assess the sum of integrity of a geometric progression
• Discuss geometric mean and its significance
Structure
3.1 Introduction
3.2 Geometric Progression and Geometric Means
3.3 Sum of Geometric Progression
3.4 Summary
3.5 Key Words
3.6 Answers to ‘Check Your Progress’
3.7 Self-Assessment Questions
3.8 Further Readings
3.1
INTRODUCTION
This unit discusses geometric progression and series. A geometric progression is a
sequence of numbers where each term after the first is found by multiplying the
previous one by a fixed, non-zero number called the common ratio. To find the
common ratio, the second term is divided by the first term. The nth term of a
geometric sequence with initial value a and common ratio r is given by $a_n = ar^{n-1}$.
The unit also covers the harmonic series, whose terms are the reciprocals of the
terms of an arithmetic progression, and the geometric mean, i.e. the mean or
average which indicates the central tendency or typical value of a set of
numbers. It discusses in detail the various aspects of geometric progression
and series, ranging from the sum of the first n terms of a geometric progression
to the sum to infinity of a geometric progression.
3.2
GEOMETRIC PROGRESSION AND GEOMETRIC MEANS
The aspects of geometric progression, its nth term and mean are discussed here.
nth term of a Geometric Progression
Non-zero quantities $a_1, a_2, a_3, \dots, a_n, \dots$, each term of which is equal to the product
of the preceding term and a constant number, form a Geometrical Progression (written
as G.P.).
Thus, all the following quantities are in G.P.
(i) 1, 2, 4, 8, 16, ...
(ii) $3, -1, \frac{1}{3}, -\frac{1}{9}, \frac{1}{27}, \dots$
(iii) $1, \sqrt{2}, 2, 2\sqrt{2}, \dots$
(iv) $a, \frac{a}{b}, \frac{a}{b^2}, \frac{a}{b^3}, \dots$, where a ≠ 0, b ≠ 0.
(v) $1, \frac{1}{5}, \frac{1}{25}, \frac{1}{125}, \dots$
The constant number is termed as the common ratio of the G.P.
The nth Term of a G.P.
Let the first term be a and r the common ratio. By definition the G.P. is $a, ar, ar^2, \dots$
1st term $= a = ar^0 = ar^{1-1}$
2nd term $= ar = ar^1 = ar^{2-1}$
...   ...   ...
In general, nth term $= ar^{n-1}$.
For the examples (i) to (v) listed above, we compute the 5th, 7th, 3rd, 11th and 8th terms respectively.
In (i) 1st term is 1 and common ratio = 2.
Hence, 5th term $= ar^4 = 1 \cdot 2^4 = 16$.
In (ii) $a = 3$, $r = -\frac{1}{3}$, hence, 7th term $= ar^6 = 3\left(-\frac{1}{3}\right)^6 = \frac{1}{243}$.
In (iii) $a = 1$, $r = \sqrt{2}$, hence, 3rd term $= ar^2 = 2$.
In (iv) 1st term = a, $r = \frac{1}{b}$, hence, 11th term $= ar^{10} = \frac{a}{b^{10}}$.
In (v) $a = 1$, $r = \frac{1}{5}$, hence, 8th term $= ar^7 = \frac{1}{5^7} = \frac{1}{78125}$.
Sum of First n Terms of a G.P.
Let $a, ar, ar^2, \dots$ be a given G.P. and let $S_n$ be the sum of its first n terms.
Then, $S_n = a + ar + ar^2 + \dots + ar^{n-1}$.
This gives that $rS_n = ar + ar^2 + \dots + ar^{n-1} + ar^n$
Subtracting, we get, $S_n - rS_n = a - ar^n = a(1 - r^n)$
In case r ≠ 1, $S_n = \frac{a(1 - r^n)}{1 - r}$
In case r = 1, $S_n = a + a + a + \dots + a$ (n times) $= na$.
Thus, sum of n terms of a G.P. is $\frac{a(1 - r^n)}{1 - r}$ provided r ≠ 1.
In case r = 1, sum of G.P. is na.
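The two cases of this formula translate directly into code. The sketch below uses a hypothetical function name `gp_sum`; passing the ratio as a `Fraction` is an assumption made here so that results with fractional ratios stay exact.

```python
from fractions import Fraction


def gp_sum(a, r, n):
    """Sum of the first n terms of the G.P. a, ar, ar^2, ..."""
    if r == 1:
        return a * n                       # n equal terms
    return a * (1 - r ** n) / (1 - r)      # S_n = a(1 - r^n)/(1 - r)


# The series 1, -1/2, 1/4, -1/8, ... summed to 11 terms
print(gp_sum(1, Fraction(-1, 2), 11))      # 683/1024

# The series 3, 9, 27, ... summed to 14 terms, checked term by term
print(gp_sum(3, Fraction(3), 14) == sum(3 ** k for k in range(1, 15)))
```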
Example 3.1: Find the sum of the first 14 terms of a G.P.
3, 9, 27, 81, 243, 729,...
Solution: In this case a = 3, r = 3, n = 14.
So, $S_n = \frac{a(1 - r^n)}{1 - r} = \frac{3(1 - 3^{14})}{1 - 3} = \frac{3}{2}(3^{14} - 1)$.
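Example 3.2: Find the sum of first 11 terms of a G.P. given by
$1, -\frac{1}{2}, \frac{1}{4}, -\frac{1}{8}, \dots$
Solution: Here, $a = 1$, $r = -\frac{1}{2}$, $n = 11$.
So, $S_n = \frac{a(1 - r^n)}{1 - r} = \frac{1 - \left(-\frac{1}{2}\right)^{11}}{1 - \left(-\frac{1}{2}\right)} = \frac{1 + \frac{1}{2^{11}}}{\frac{3}{2}}$
$= \frac{2^{11} + 1}{3 \cdot 2^{10}} = \frac{683}{1024}$.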
Harmonic Series
Non-zero quantities whose reciprocals are in A.P. are said to be in Harmonical
Progression (H.P.)
Consider the following examples:
1. $1, \frac{1}{3}, \frac{1}{5}, \frac{1}{7}, \dots$
2. $\frac{1}{2}, \frac{1}{5}, \frac{1}{8}, \frac{1}{11}, \dots$
3. $2, \frac{5}{2}, \frac{10}{3}, \dots$
4. $\frac{1}{a}, \frac{1}{a + b}, \frac{1}{a + 2b}, \dots$, where a, b > 0.
5. $5, \frac{55}{9}, \frac{55}{7}, 11, \dots$
It can be easily checked that in each case the series obtained by taking the reciprocal of each term is an A.P.
Geometric Means
Geometric mean or GM is the mean or average which indicates the central tendency
or typical value of a set of numbers.
If α, β, γ are in G.P., then β is called a geometric mean between α and γ (written as
G.M.).
If a1, a2, ..., an are in G.P., then a2, ..., an–1 are called geometric means between
a1 and an.
Thus 3, 9, 27 are three geometric means between 1 and 81.
To insert n Geometric Means between Two given Numbers a and b
Let $G_1, G_2, \dots, G_n$ be n geometric means between a and b. Thus $a, G_1, G_2, \dots, G_n, b$
is a G.P., b being the (n + 2)th term $= ar^{n+1}$, where r is the common ratio of the G.P.
Thus $b = ar^{n+1}$ ⇒ $r = \left(\frac{b}{a}\right)^{\frac{1}{n+1}}$
So, $G_1 = ar = a\left(\frac{b}{a}\right)^{\frac{1}{n+1}} = a^{\frac{n}{n+1}} b^{\frac{1}{n+1}}$
$G_2 = ar^2 = a\left(\frac{b}{a}\right)^{\frac{2}{n+1}} = a^{\frac{n-1}{n+1}} b^{\frac{2}{n+1}}$
...   ...   ...
$G_n = ar^n = a\left(\frac{b}{a}\right)^{\frac{n}{n+1}} = a^{\frac{1}{n+1}} b^{\frac{n}{n+1}}$
Example 3.3: Find 7 G.M.’s between 1 and 256.
Solution. Let $G_1, G_2, \dots, G_7$ be 7 G.M.'s between 1 and 256.
Then 256 = 9th term of the G.P.
$= 1 \cdot r^8$, where r is the common ratio of the G.P.
This gives that $r^8 = 256$ ⇒ $r = 2$.
Thus $G_1 = ar = 1 \cdot 2 = 2$
$G_2 = ar^2 = 1 \cdot 4 = 4$
$G_3 = ar^3 = 1 \cdot 8 = 8$
$G_4 = ar^4 = 1 \cdot 16 = 16$
$G_5 = ar^5 = 1 \cdot 32 = 32$
$G_6 = ar^6 = 1 \cdot 64 = 64$
$G_7 = ar^7 = 1 \cdot 128 = 128$.
Hence required G.M.’s are 2, 4, 8, 16, 32, 64, 128.
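The procedure for inserting n geometric means can be sketched in Python as follows. This is a minimal sketch, assuming positive real numbers a and b and taking the positive real root for the common ratio; the function name geometric_means is hypothetical. Floating-point arithmetic is used, so the results are rounded for display.

```python
import math

def geometric_means(a, b, n):
    """Insert n geometric means between positive numbers a and b."""
    r = (b / a) ** (1 / (n + 1))      # common ratio r = (b/a)^(1/(n+1))
    return [a * r ** k for k in range(1, n + 1)]

means = geometric_means(1, 256, 7)
print([round(g, 6) for g in means])

# The three means between 1 and 81 mentioned earlier in the text.
print([round(g, 6) for g in geometric_means(1, 81, 3)])
```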
Example 3.4: Sum the series 1 + 3x + 5x^2 + 7x^3 + ... up to n terms, x ≠ 1.
Solution. Note that the nth term of this series = (2n – 1)x^(n–1).
Let S_n = 1 + 3x + 5x^2 + ... + (2n – 1)x^(n–1).
Then xS_n = x + 3x^2 + ... + (2n – 3)x^(n–1) + (2n – 1)x^n.
Subtracting, we get
S_n(1 – x) = 1 + 2x + 2x^2 + ... + 2x^(n–1) – (2n – 1)x^n
= 1 + 2x(1 – x^(n–1))/(1 – x) – (2n – 1)x^n
= [1 – x + 2x – 2x^n – (2n – 1)x^n(1 – x)]/(1 – x)
= [1 + x – 2x^n – (2n – 1)x^n + (2n – 1)x^(n+1)]/(1 – x)
= [1 + x – (2n + 1)x^n + (2n – 1)x^(n+1)]/(1 – x).
Hence
S_n = [1 + x – (2n + 1)x^n + (2n – 1)x^(n+1)]/(1 – x)^2.
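The closed form obtained in Example 3.4 can be checked against a term-by-term sum. The sketch below is illustrative only: the function names direct_sum and closed_form are hypothetical, and the values of x and n are arbitrary test values chosen here, not taken from the text.

```python
from fractions import Fraction

def direct_sum(x, n):
    """Add the first n terms 1 + 3x + 5x^2 + ... + (2n - 1)x^(n-1) one by one."""
    return sum((2 * k - 1) * x ** (k - 1) for k in range(1, n + 1))

def closed_form(x, n):
    """S_n = [1 + x - (2n + 1)x^n + (2n - 1)x^(n+1)] / (1 - x)^2, valid for x != 1."""
    numerator = 1 + x - (2 * n + 1) * x ** n + (2 * n - 1) * x ** (n + 1)
    return numerator / (1 - x) ** 2

for x in (Fraction(1, 2), Fraction(-3), Fraction(5, 4)):
    for n in (1, 2, 5, 10):
        print(x, n, direct_sum(x, n) == closed_form(x, n))
```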
Example 3.5: If in a G.P., (p + q)th term = m and (p – q)th term = n, then find its
pth and qth terms.
Solution. Suppose that the given G.P. is a, ar, ar^2, ar^3, ...
By hypothesis, (p + q)th term = m = ar^(p+q–1)
and (p – q)th term = n = ar^(p–q–1).
Then
m/n = r^(2q) ⇒ r = (m/n)^(1/(2q)).
Hence
m = a(m/n)^((p+q–1)/(2q))
⇒ a = m^((q–p+1)/(2q)) n^((p+q–1)/(2q)).
Thus,
pth term = ar^(p–1) = m^(1/2) n^(1/2) = √(mn)
qth term = ar^(q–1) = m^((2q–p)/(2q)) n^(p/(2q)).
Example 3.6: Sum the series 5 + 55 + 555 + ... up to n terms.
Solution. Let
S_n = 5 + 55 + 555 + ... up to n terms
= 5(1 + 11 + 111 + ...)
= (5/9)(9 + 99 + 999 + ...)
= (5/9)[(10 – 1) + (100 – 1) + (1000 – 1) + ...]
= (5/9)[(10 + 10^2 + 10^3 + ... + 10^n) – (1 + 1 + ... n terms)]
= (5/9)[(10 + 10^2 + 10^3 + ... + 10^n) – n]
= (5/9)[10(1 – 10^n)/(1 – 10) – n]
= (5/9)[10(10^n – 1)/9 – n]
= (50/81)(10^n – 1) – 5n/9.
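The result of Example 3.6, S_n = (50/81)(10^n – 1) – 5n/9, can be compared with a direct addition of 5, 55, 555, .... The following is a small sketch with hypothetical function names; exact fractions are used so that the comparison is exact.

```python
from fractions import Fraction

def direct_sum(n):
    """Add 5 + 55 + 555 + ... up to n terms by building each term from its digits."""
    return sum(int("5" * k) for k in range(1, n + 1))

def closed_form(n):
    """S_n = (50/81)(10^n - 1) - 5n/9."""
    return Fraction(50, 81) * (10 ** n - 1) - Fraction(5 * n, 9)

for n in range(1, 6):
    print(n, direct_sum(n), closed_form(n))
```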
Example 3.7: If a, b, c, d are in G.P., prove that a^2 – b^2, b^2 – c^2 and c^2 – d^2 are
also in G.P.
Solution. Since
b/a = c/b = d/c = k (say),
we have
b = ak, c = bk, d = ck,
i.e.,
b = ak, c = ak^2, d = ak^3.
Now
(b^2 – c^2)^2 = (a^2k^2 – a^2k^4)^2
= a^4k^4(1 – k^2)^2.
Also
(a^2 – b^2)(c^2 – d^2) = (a^2 – a^2k^2)(a^2k^4 – a^2k^6)
= a^4(1 – k^2)(k^4 – k^6)
= a^4k^4(1 – k^2)^2.
Hence
(b^2 – c^2)^2 = (a^2 – b^2)(c^2 – d^2).
This gives that a^2 – b^2, b^2 – c^2, c^2 – d^2 are in G.P.
Example 3.8: Three numbers are in G.P. Their product is 64 and their sum is 124/5.
Find them.
Solution. Let the numbers be a/r, a, ar.
Since
(a/r) · a · ar = 64,
we have
a^3 = 64 ⇒ a = 4.
Also
a/r + a + ar = 124/5.
This gives that
4/r + 4 + 4r = 124/5
⇒ 1/r + 1 + r = 31/5
⇒ (r^2 + 1)/r = 26/5
⇒ 5r^2 + 5 = 26r
⇒ r = 1/5 or 5.
In either case, the numbers are 4/5, 4 and 20.
Example 3.9: If a, b, c are in G.P. and a^x = b^y = c^z, prove that
1/x + 1/z = 2/y.
Solution. Since a, b, c are in G.P., b^2 = ac.
But
b^y = a^x ⇒ a = b^(y/x)
and
b^y = c^z ⇒ c = b^(y/z).
So we get
b^2 = b^(y/x) · b^(y/z) = b^(y(1/x + 1/z))
⇒ 2 = y(1/x + 1/z)
⇒ 2/y = 1/x + 1/z.
Example 3.10: Sum to n terms the series
.7 + .77 + .777 + . . .
Solution. Given series
= .7 + .77 + .777 + ... up to n terms
= 7(.1 + .11 + .111 + ... up to n terms)
= (7/9)(.9 + .99 + .999 + ... up to n terms)
= (7/9)[(1 – 1/10) + (1 – 1/10^2) + (1 – 1/10^3) + ... up to n terms]
= (7/9)[n – (1/10 + 1/10^2 + ... up to n terms)]
= (7/9)[n – (1/10)(1 – 1/10^n)/(1 – 1/10)]
= (7/9)[n – (1/9)(1 – 1/10^n)].
Example 3.11: The sum of three numbers in G.P. is 35 and their product is 1000.
Find the numbers.
Solution. Let the numbers be α/r, α, αr.
Their product = α^3 = 1000 ⇒ α = 10.
So the numbers are 10/r, 10, 10r.
The sum of these numbers = 35
⇒ 10/r + 10 + 10r = 35
⇒ 2/r + 2r = 5
⇒ 2r^2 – 5r + 2 = 0
⇒ (2r – 1)(r – 2) = 0
⇒ r = 2 or 1/2.
r = 2 gives the numbers as 5, 10, 20;
r = 1/2 gives the numbers as 20, 10, 5, the same as the first set.
Hence, the required numbers are 5, 10 and 20.
Example 3.12: The sum of the first eight terms of a G.P. (of real terms) is five
times the sum of the first four terms. Find the common ratio.
Solution. Let the G.P. be a, ar, ar^2, ...
S_8 = sum of first eight terms = a(1 – r^8)/(1 – r)
S_4 = sum of first four terms = a(1 – r^4)/(1 – r)
By hypothesis
S_8 = 5S_4 ⇒ a(1 – r^8)/(1 – r) = 5a(1 – r^4)/(1 – r)
⇒ 1 – r^8 = 5(1 – r^4)
⇒ (1 – r^4)(1 + r^4) = 5(1 – r^4).
In case r^4 – 1 = 0, we get r^2 – 1 = 0 ⇒ r = ±1.
(Note that r^2 + 1 = 0 ⇒ r is imaginary.)
Now r = 1 ⇒ the given series is a + a + a + ...,
but then S_8 = 8a and S_4 = 4a.
So S_8 ≠ 5S_4.
In case r = –1, we get S_8 = 0 and S_4 = 0; hence the hypothesis is satisfied.
Suppose now r^4 – 1 ≠ 0; then 1 + r^4 = 5
⇒ r^4 = 4 ⇒ r^2 = 2 (r^2 ≠ –2)
⇒ r = ±√2.
Hence r = –1 or ±√2.
Example 3.13: If S is the sum, P the product and R the sum of reciprocals of n
terms in G.P., prove that
P^2 R^n = S^n.
Solution. Let a, ar, ar^2, ... be the given G.P. Then
S = a + ar + ar^2 + ... up to n terms
= a(1 – r^n)/(1 – r) ...(1)
P = a · ar · ar^2 ... ar^(n–1)
= a^n r^(1 + 2 + 3 + ... + (n–1))
= a^n r^(((n–1)/2)(2 + n – 2))
= a^n r^(n(n–1)/2) ...(2)
R = 1/a + 1/(ar) + 1/(ar^2) + ... up to n terms
= (1/a)(1 – 1/r^n)/(1 – 1/r)
= r(r^n – 1)/(a(r – 1)r^n)
= (1 – r^n)/(a(1 – r)r^(n–1)) ...(3)
By (2) and (3),
P^2 R^n = a^(2n) r^(n(n–1)) · (1 – r^n)^n/(a^n (1 – r)^n r^(n(n–1)))
= a^n (1 – r^n)^n/(1 – r)^n
= S^n by (1).
Example 3.14: The ratio of the 4th to the 12th term of a G.P. with positive
common ratio is 1/256. If the sum of the two terms is 61.68, find the sum of the series
to 8 terms.
Solution. Let the series be a, ar, ar^2, ...
T_4 = 4th term = ar^3
T_12 = 12th term = ar^11
By hypothesis
T_4/T_12 = 1/256,
i.e., ar^3/(ar^11) = 1/256
⇒ 1/r^8 = 1/256
⇒ r^8 = 256
⇒ r = ±2.
Since r is given to be positive, we reject the negative sign.
Again it is given that
T_4 + T_12 = 61.68,
i.e., a(r^3 + r^11) = 61.68
⇒ a(8 + 2048) = 61.68
⇒ a = 61.68/2056 = 0.03.
Hence
S_8 = sum to eight terms
= a(1 – r^8)/(1 – r) = a(r^8 – 1)/(r – 1)
= (0.03)(256 – 1)/(2 – 1)
= 0.03 × 255 = 7.65.
Example 3.15: A manufacturer reckons that the value of a machine which costs
him Rs 18750 will depreciate each year by 20%. Find the estimated value at the
end of 5 years.
Solution. At the end of the first year, the value of the machine
= 18750 × 80/100 = (4/5)(18750).
At the end of the 2nd year, it is equal to (4/5)^2 (18750); proceeding in this manner,
the estimated value of the machine at the end of five years is (4/5)^5 (18750)
= (64 × 16)/(125 × 25) × 18750
= (1024/125) × 750
= 1024 × 6
= 6144 rupees.
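The year-by-year values in Example 3.15 form a G.P. with common ratio 4/5, and this can be traced with a short Python sketch. The function name depreciated_value is hypothetical; exact fractions are used so that the result is an exact number of rupees.

```python
from fractions import Fraction

def depreciated_value(cost, rate_percent, years):
    """Value after the given number of years of constant-rate depreciation."""
    factor = 1 - Fraction(rate_percent, 100)   # 20% depreciation leaves a factor of 4/5
    return cost * factor ** years

# Values at the end of years 1 to 5 form a G.P. with common ratio 4/5.
yearly_values = [depreciated_value(18750, 20, year) for year in range(1, 6)]
print([int(v) for v in yearly_values])
```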
Example 3.16: Show that a given sum of money accumulated at 20 per cent per
annum more than doubles itself in 4 years at compound interest.
Solution. Let the given sum be a rupees. After 1 year, it becomes 6a/5 (it is increased
by a/5).
At the end of two years, it becomes
(6/5)(6a/5) = (6/5)^2 a.
Proceeding in this manner, we get that at the end of the 4th year, the amount will be
(6/5)^4 a = 1296a/625.
Now
1296a/625 – 2a = 46a/625, a positive quantity, so the amount after 4 years is more than
double the original amount.
Example 3.17: If
x = a + a/r + a/r^2 + ... ∞
y = b – b/r + b/r^2 – ... ∞
and
z = c + c/r^2 + c/r^4 + ... ∞
Show that
xy/z = ab/c.
Solution. Clearly
x = a/(1 – 1/r) = ar/(r – 1),
y = b/(1 – (–1/r)) = br/(r + 1)
and
z = c/(1 – 1/r^2) = cr^2/(r^2 – 1).
Now
xy/z = [abr^2/(r^2 – 1)] / [cr^2/(r^2 – 1)] = ab/c.
Example 3.18: If a^2 + b^2, ab + bc and b^2 + c^2 are in G.P., prove that a, b, c are
also in G.P.
Solution. Since a^2 + b^2, ab + bc and b^2 + c^2 are in G.P., we get
(ab + bc)^2 = (a^2 + b^2)(b^2 + c^2)
⇒ b^2(a^2 + 2ac + c^2) = a^2b^2 + a^2c^2 + b^4 + b^2c^2
⇒ 2ab^2c = a^2c^2 + b^4
⇒ a^2c^2 – 2ab^2c + b^4 = 0
⇒ (ac – b^2)^2 = 0
⇒ ac = b^2
⇒ a, b, c are in G.P.
3.3 SUMS OF GEOMETRIC PROGRESSION
The types of sums of geometric progression are discussed below.
Sum of first n terms of geometric progression
Let a, ar, ar^2, ... be a given G.P. and let S_n be the sum of its first n terms. Then
S_n = a + ar + ar^2 + ... + ar^(n–1).
This gives that
rS_n = ar + ar^2 + ... + ar^(n–1) + ar^n.
Subtracting, we get S_n – rS_n = a – ar^n = a(1 – r^n).
In case r ≠ 1, S_n = a(1 – r^n)/(1 – r).
In case r = 1, S_n = a + a + a + ... + a (n times) = na.
Thus, the sum of n terms of a G.P. is a(1 – r^n)/(1 – r), provided r ≠ 1.
In case r = 1, the sum of the G.P. is na.
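The formula for the sum of the first n terms, including the separate case r = 1, can be written as a small Python function. This is a sketch only: the name geometric_sum is hypothetical, and exact fractions are used so that the closed form can be compared exactly with term-by-term addition.

```python
from fractions import Fraction

def geometric_sum(a, r, n):
    """Sum of the first n terms of the G.P. a, ar, ar^2, ..."""
    a, r = Fraction(a), Fraction(r)
    if r == 1:
        return a * n                      # a + a + ... + a (n times)
    return a * (1 - r ** n) / (1 - r)     # a(1 - r^n)/(1 - r)

def term_by_term(a, r, n):
    """Add the terms a, ar, ..., ar^(n-1) one at a time, for comparison."""
    return sum(Fraction(a) * Fraction(r) ** k for k in range(n))

print(geometric_sum(3, 3, 14) == term_by_term(3, 3, 14))
print(geometric_sum(1, Fraction(-1, 2), 11))
print(geometric_sum(7, 1, 5))
```

With a = 1, r = –1/2 and n = 11 the function returns 683/1024, and with r = 1 it returns na.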
Example 3.19: Find the sum of the first 14 terms of a G.P.
3, 9, 27, 81, 243, 729, ...
Solution. In this case a = 3, r = 3, n = 14.
So,
S_n = a(1 – r^n)/(1 – r) = 3(1 – 3^14)/(1 – 3) = (3/2)(3^14 – 1).
Example 3.20: Find the sum of the first 11 terms of a G.P. given by
1, –1/2, 1/4, –1/8, ...
Solution. Here a = 1, r = –1/2, n = 11.
So,
S_n = a(1 – r^n)/(1 – r) = [1 – (–1/2)^11]/[1 – (–1/2)] = (1 + 1/2^11)/(3/2)
= (2^11 + 1)/(3 · 2^10) = 683/1024.
Sum to infinity of a geometric progression
Let a, ar, ar^2, ... be a G.P. with |r| < 1.
Now |r| < 1 ⇒ |r|^2 < |r|, |r|^3 < |r|^2, ....
Thus, as the power of r goes on increasing, the corresponding term in the G.P. decreases
in absolute value. So, we can assume that as n becomes indefinitely large, r^n becomes indefinitely
small, i.e., r^n → 0.
Now
S_n = a(1 – r^n)/(1 – r) = a/(1 – r) – ar^n/(1 – r).
So, as n → ∞,
S_∞ = a/(1 – r).
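The way S_n approaches a/(1 – r) can be observed numerically. The sketch below uses the illustrative values a = 1 and r = 3/7, chosen here as an assumption, and a hypothetical function name partial_sum; the gap between the limit and S_n equals ar^n/(1 – r), which shrinks as n grows.

```python
from fractions import Fraction

def partial_sum(a, r, n):
    """S_n = a(1 - r^n)/(1 - r) for r != 1."""
    return a * (1 - r ** n) / (1 - r)

a, r = Fraction(1), Fraction(3, 7)
limit = a / (1 - r)                      # sum to infinity, valid because |r| < 1

for n in (1, 5, 10, 20):
    gap = limit - partial_sum(a, r, n)   # equals a * r^n / (1 - r)
    print(n, float(partial_sum(a, r, n)), float(gap))
```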
Example 3.21: Find the sum of the following series up to infinity
1 + 3/7 + 9/49 + 27/343 + 81/2401 + ...
Solution. Here a = 1, r = 3/7 < 1.
So,
S_∞ = a/(1 – r) = 1/(1 – 3/7) = 7/4.
Example 3.22: Evaluate the recurring decimal 0.1777... (0.17 with the digit 7 recurring).
Solution. Now
0.1777... = 0.1 + 0.07 + 0.007 + 0.0007 + ...
= 1/10 + 7/10^2 + 7/10^3 + ...
= 1/10 + (7/10^2)(1 + 1/10 + 1/10^2 + ...)
= 1/10 + (7/10^2) · 1/(1 – 1/10)
= 1/10 + (7/100)(10/9)
= 1/10 + 7/90
= 16/90 = 8/45.
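The method of Example 3.22, splitting a recurring decimal into a non-recurring part plus an infinite G.P., can be sketched in Python. The function name recurring_to_fraction and its two string arguments are hypothetical choices made for this illustration.

```python
from fractions import Fraction

def recurring_to_fraction(non_repeating, repeating):
    """Value of 0.<non_repeating><repeating><repeating>... as an exact fraction."""
    k, m = len(non_repeating), len(repeating)
    head = Fraction(int(non_repeating) if non_repeating else 0, 10 ** k)
    # The recurring blocks form a G.P. with first term below and ratio 1/10^m.
    first_block = Fraction(int(repeating), 10 ** (k + m))
    ratio = Fraction(1, 10 ** m)
    return head + first_block / (1 - ratio)   # sum to infinity a/(1 - r)

print(recurring_to_fraction("1", "7"))   # 0.1777...
print(recurring_to_fraction("", "3"))    # 0.333...
```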
Check Your Progress - 1
1. What is termed as the common ratio of the G.P.?
................................................................................................................
................................................................................................................
................................................................................................................
2. What quantities are said to be in Harmonic Progression?
................................................................................................................
................................................................................................................
................................................................................................................
3. What is Geometric Mean?
................................................................................................................
................................................................................................................
................................................................................................................
3.4 SUMMARY
• Non-zero quantities a1, a2, a3, ..., an,...., each term of which is equal to the
product of preceding term and a constant number, form a Geometrical
Progression.
• A geometrical progression is also written as G.P.
• The constant number is termed as the common ratio of the G.P.
• Non-zero quantities whose reciprocals are in A.P. are said to be in
Harmonical Progression (H.P.)
• Geometric mean or GM is the mean or average which indicates the central
tendency or typical value of a set of numbers.
• If α, β, γ are in G.P., then β is called a geometric mean between α and
γ (written as G.M.).
3.5 KEY WORDS
• Geometric Mean: It is the mean or average which indicates the central
tendency or typical value of a set of numbers.
• Harmonic Progression: These are non-zero quantities whose reciprocals
are in an arithmetic progression.
3.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The constant number is termed as the common ratio of the G.P.
2. Non-zero quantities whose reciprocals are in A.P. are said to be in
Harmonical Progression.
3. Geometric mean or GM is the mean or average which indicates the central
tendency or typical value of a set of numbers.
3.7 SELF-ASSESSMENT QUESTIONS
1. The sum of three numbers in G.P. is 75 and their product is 1050. Find the
numbers.
2. The sum of the first eight terms of a G.P. (of real terms) is five times the sum
of the first four terms. Find the common ratio.
3. The ratio of the 4th to the 12th term of a G.P. with positive common ratio
is 1/256. If the sum of the two terms is 61.68, find the sum of series to 8
terms.
4. Evaluate the recurring decimal 0.1777... (0.17 with the digit 7 recurring).
5. Define a geometric progression series. Use an example to support your
answer.
6. How is a geometric progression different from a harmonic progression?
3.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Fundamental Principles of
Counting
BLOCK-II
PERMUTATION AND COMBINATION
This block discusses Permutation and Combination. Permutation and Combination is the
method of finding the number of possible outcomes for any given situation. For example,
two different digits chosen from 1, 2 and 3 can be written as 12, 13, 21, 23, 31 and 32.
If the digits may be repeated, three more arrangements, 11, 22 and 33, are added. These
nine are therefore the maximum number of two-digit arrangements of the three digits. The
block discusses the fundamental principles of counting, permutation and combination,
matrices and determinants, differentiation, and integration and its applications.
This block consists of five units.
The fourth unit discusses the fundamental principles of counting. The fundamental principle of
counting implies that if one event has m possible outcomes and another has n
possible outcomes, then the total number of outcomes for both events will be m × n. The unit
covers the multiplication rule of counting. It also discusses the other mathematical
operations used for counting events.
The fifth unit discusses permutation and combination. Permutations and combinations help
find the total number of possible outcomes of any event; in other words, the several
possible ways in which a set or number of things can be ordered or arranged. The cases of repetition
and non-repetition are discussed in this unit.
The sixth unit explains matrices and determinants. Matrices are arrays of numbers, symbols,
or expressions, arranged in rows and columns. The various types of matrices, row, column,
square, null, diagonal, scalar, identity and triangular; along with the various operations on
matrices are also discussed in the unit. Along with matrices, determinants of order one, two,
three and four are explained with suitable examples. The properties of determinants are also
explained with examples.
The seventh unit discusses differentiation. Differentiation in mathematics is the
process of obtaining the derivative of a function. Limit and continuity,
properties of continuous functions, differentiability, applications of derivatives, and
derivatives of functions multiplied by a constant are discussed in this unit.
The eighth unit lists integration and its applications. Integration is a calculus operation by
which the integral of a function is determined. There are various applications of integration,
ranging from economics to accounting and business, determination of cost functions, total
revenue functions, consumer surplus and producer surplus. Integration can be computed by
various methods, namely, indefinite integral, integration by substitution, integration of rational,
irrational and trigonometric functions. These are discussed in detail in the unit.
UNIT–4
FUNDAMENTAL PRINCIPLES OF COUNTING
Objectives
After going through this unit, you will be able to:
• Discuss the fundamental principles of counting
• Explain multiplication rule
• Describe the rule of the product
• Analyse the principles of inclusion and exclusion
• Understand the basics of factorial notation
Structure
4.1 Introduction
4.2 Multiplication Rule
4.3 Addition Rule
4.4 Summary
4.5 Key Words
4.6 Answers to ‘Check Your Progress’
4.7 Self-Assessment Questions
4.8 Further Readings
4.1 INTRODUCTION
This unit explains the fundamental principles of counting. According to the fundamental
principle of counting, if one event has m possible outcomes and a second
independent event has n possible outcomes, then there are a total of m × n
possible outcomes for the two events together. The section on the multiplication rule covers the
rule of sum, the rule of product and the principle of inclusion and exclusion.
The fundamental principles of counting also include the addition rule and the factorial
notation. This unit discusses these in detail with the help of a number of solved
examples and equations.
4.2 MULTIPLICATION RULE
In combinatorial analysis we intend to determine the number of logical possibilities of
occurrence of events without looking into individual cases.
Rule of Sum: Suppose two tasks T1 and T2 cannot be performed simultaneously, and
suppose that T1 can be performed in n1 ways and T2 can be performed in n2 ways.
Then one of the two tasks T1 or T2 can be performed in n1 + n2 ways. In general, suppose a
task T1 can be performed in n1 ways, a second task T2 in n2 ways, a third task T3 in
n3 ways, and so on, and if no two tasks can be performed simultaneously, then one
of the tasks can be performed in n1 + n2 + n3 + … ways.
In set theoretical notation, the rule of sum can be interpreted as follows for disjoint sets A and B:
n (A ∪ B) = n (A) + n (B)
Multiplication Rule
Rule of Product: Suppose a task T1 can be performed in n1 ways and,
independent of this task, a second task T2 can be performed in n2 ways. Then
these two tasks when combined can be performed in n1n2 ways. In general, suppose
a task T1 can be performed in n1 ways, and following T1, a second task T2 can be
performed in n2 ways, and following task T2 a third task T3 can be performed in n3
ways, and so on, then all k tasks can be performed in the sequence T1T2…Tk in
exactly n1n2 …nk different ways.
In set theoretical notation, the rule of product can be interpreted as follows:
n (A × B) = n (A) × n (B)
where n (A) and n (B) denote the number of elements in the sets A and B,
respectively.
Example 4.1: Suppose a questionnaire contains 5 questions in which 3 questions
have 2 possible answers and the remaining 2 questions have 3 possible answers.
Then in how many ways can the questionnaire be answered?
Solution: The 3 questions with 2 possible answers can be answered in 2 × 2 × 2 ways and the remaining
2 questions can be answered in 3 × 3 ways. Hence, the total number of ways in which
the questionnaire can be answered is 2 × 2 × 2 × 3 × 3 = 72.
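The count in Example 4.1 can be confirmed by listing every possible way of filling in the questionnaire. This is only an illustrative sketch; the variable names are hypothetical and the options for each question are simply numbered 0, 1, 2.

```python
import math
from itertools import product

# Number of possible answers for each of the 5 questions.
options_per_question = [2, 2, 2, 3, 3]

# Each completed questionnaire is one tuple holding the option chosen per question.
all_answer_sheets = list(product(*(range(k) for k in options_per_question)))

print(len(all_answer_sheets))            # number of sheets found by listing them
print(math.prod(options_per_question))   # rule of product: 2 x 2 x 2 x 3 x 3
```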
For example, suppose there are 3 different optional papers to select from in one semester
and 2 different optional papers in another semester for the BCA students. According
to the rule of product, there will be 3 × 2 choices for students who want to select
one paper in each of these semesters. On the other hand, as per the rule of sum,
students will have 3 + 2 choices to select only one paper.
Example 4.2: A label identifier in a computer program consists of one letter followed by three digits.
If repetitions are allowed, how many different label identifiers are possible?
Solution: There are 26 letters in the English alphabet and 10 digits from 0 to 9. The letter
can be chosen in 26 ways and each of the three digits can be chosen in 10 ways. Hence, the total
number of different label identifiers possible is 26 × 10 × 10 × 10 = 26,000.
Example 4.3: A football stadium has 4 gates on the South boundary and 3 gates
on the North boundary.
(i) In how many ways can a person enter through a South gate and leave by
a North gate?
(ii) In how many different ways in all can a person enter and get out through
different gates?
Solution:
(i) Since there are 4 gates on South side, the person can enter in 4 different
ways from South side into the stadium. If he wants to exit from North side,
he can do so in 3 ways because there are 3 exit gates in North side.
Hence, the total number of ways in which he can enter from South gate and
go out from a North gate is 4 × 3 = 12 ways.
(ii) Since he has the choice for entrance from any of 4 + 3 = 7 gates, there are
7 ways in which he can enter and can get out from any one of the remaining
6 gates because he cannot go out from the gate through which he had entered.
Hence, the total number of ways in which he can enter and go out is 7 × 6 = 42.
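Both parts of Example 4.3 can be checked by listing the (entry gate, exit gate) pairs. The gate labels S1 to S4 and N1 to N3 are hypothetical names introduced only for this sketch.

```python
from itertools import permutations, product

south = [f"S{i}" for i in range(1, 5)]   # 4 gates on the South boundary
north = [f"N{i}" for i in range(1, 4)]   # 3 gates on the North boundary

# (i) Enter by a South gate and leave by a North gate.
south_in_north_out = list(product(south, north))

# (ii) Enter by any gate and leave by a different gate: ordered pairs of distinct gates.
different_gates = list(permutations(south + north, 2))

print(len(south_in_north_out), len(different_gates))
```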
Example 4.4: In how many different ways, can 3 rings of a lock be combined
when each ring has 10 digits 0 to 9? If the lock opens with only one combination
of 3 digits, how many unsuccessful events are possible?
Solution: The ways in which 3 rings can be combined are 10 × 10 × 10 = 1000.
But the lock opens with only one combination of 3 digits, therefore the unsuccessful
events (attempts) will be 1000 – 1 = 999.
Example 4.5: How many 8-digit telephone numbers are possible, if
(i) Only even digits may be used?
(ii) The number must be a multiple of 100?
Solution:
(i) Taking the even digits to be 2, 4, 6 and 8 (that is, excluding 0), each of the 8 places can be filled in 4 ways
by even digits to form an 8-digit number. Hence, there can be 4 × 4 × 4 ×
4 × 4 × 4 × 4 × 4 = 4^8 = 65,536 different numbers.
(ii) A telephone number that is a multiple of 100 must have its last two
digits as zero (0). Thus, while forming such numbers by using digits 0 to 9,
the first digit can be 1 to 9 and each of the next 5 places can be filled in 10 ways. Thus
there are 9 × 10^5 = 900,000 such numbers.
Principle of Inclusion and Exclusion
The basic principle of counting states that each object should be counted
only once. Suppose there are N objects, of which some
have property a (their number is denoted Na), some possess property b (denoted
Nb) and some have both the properties a and b (denoted Nab). If Vab denotes the
number of objects having at least one of the properties a or b, then Vab = Na + Nb –
Nab .
This is just like the addition formula for the elements of sets, n (A ∪ B) = n(A) +
n(B) – n(A ∩ B).
Applying the formula Vab = Na + Nb – Nab recursively, we can include one more
property c and write: Vabc = Na + Nb + Nc – Nab – Nbc – Nac +
Nabc. Going further, for properties a, b, c, …, n we can write: Vabc….n = Na + Nb +
Nc + …. + Nn – Nab – Nbc – Nac – …. – Nmn + Nabc + ….. + Nlmn – Nabcd – ……..
This formula can also be written in a set theoretic form for sets A, B, C, …,
etc.: Na can be written as |A| or #A or n(A), similarly for Nb, Nc and the others,
and Nab can be written as |A ∩ B| or #( A ∩ B) or n(A ∩ B). This is the principle
of inclusion and exclusion (written in short as PIE), with the condition that each
object is counted only once. To prove that the formula is correct, we take an arbitrary
object, say S, out of the set of N objects, assume that S has k of the
properties, and show that the formula counts S exactly once.
For example, consider the following solved examples.
Example 4.6: A coin is flipped 5 times. In how many ways can it be done so that
there is an exact sequence of 3 heads in a row? The case of 4 heads in a row is not
counted. If Ni be the number of sequences of tosses having an exact sequence of 3
heads and starts on the ith throw then what will be V12345?
Solution: V12345 is equal to V123 because an exact sequence of 3 heads cannot
start on the 4th or 5th throw. According to the PIE formula: V123 = N1 + N2
+ N3 – N12 – N23 – N13 + N123
Exact sequences of 3 heads cannot start on two different throws within the same five
tosses (for example, on both the 1st and the 2nd throws), so the terms N12, N23, N13 and N123 are all zero.
Let us write H for heads and T for tails. If N1 is the number of sequences
starting with exactly 3 H, then the 4th throw must be a tail (T), leaving two possibilities for
the 5th throw. Thus, N1 = 2.
In a similar way, if there is an exact sequence of 3 H starting with the 2nd toss,
then both the 1st and 5th tosses must be T. This leads to N2
= 1. Finally, if a sequence of 3 H starts on the 3rd throw, then the 2nd throw has to
be T while the 1st throw may be either H or T. This gives N3 = 2. Hence, V12345 =
V123 = 2 + 1 + 2 = 5.
Example 4.7: A coin is flipped 5 times. In how many ways can it be done so that
there must be a sequence of at least 3 heads in a row? If Ni be the number of
sequences of tosses with at least 3 heads beginning on the ith throw then what will
be V12345?
Solution: In this case, Ni is the number of sequences of tosses in which
a run of at least 3 heads starts on the ith throw. Here, at least 3 means
3 or more heads. As in the previous example, V12345 = V123 and V123 = N1 + N2 + N3
– N12 – N23 – N13 + N123
In this case too only the first three terms can be non-zero. N1 = 4 because
each of the 4th and 5th tosses can be H or T. Also, N2 = 2, because the 1st throw must be T
and the 5th throw can either be H or T. In a similar way N3 = 2, as the 1st throw may be H or T
but the 2nd throw must be tails. Hence V12345 = V123 = 4 + 2 + 2 = 8.
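The counts in Examples 4.6 and 4.7 are small enough to verify by listing all 2^5 = 32 sequences of five tosses. In the sketch below the helper name head_runs is hypothetical; it returns the lengths of the unbroken runs of heads in a sequence.

```python
from itertools import product

def head_runs(tosses):
    """Lengths of the maximal runs of 'H' in a sequence of 'H'/'T' tosses."""
    return [len(run) for run in "".join(tosses).split("T") if run]

all_sequences = list(product("HT", repeat=5))

# Example 4.6: a run of exactly 3 heads (a run of 4 or 5 heads is not counted).
exactly_three = [s for s in all_sequences if 3 in head_runs(s)]

# Example 4.7: a run of at least 3 heads.
at_least_three = [s for s in all_sequences if any(n >= 3 for n in head_runs(s))]

print(len(exactly_three), ["".join(s) for s in exactly_three])
print(len(at_least_three), ["".join(s) for s in at_least_three])
```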
Example 4.8: In a class room there are 15 students. Out of these 6 study
mechanics, 9 study general science, and 9 study computer science. Also, 2 study
mechanics and general science, 3 study mechanics and computer science, and 6
study general science and computer science. One student in the class studies all
three subjects. How many of these students study none of the three subjects?
Solution: Let M, G, and C denote the sets of students who study mechanics,
general science and computer science respectively and let U be the entire set of 15
students. Then |M| = 6, |G| = 9, and |C| = 9. Also, |MG| = |M ∩ G| = 2, |MC| = |M
∩ C| = 3, |GC| = |G ∩ C| = 6 and |MGC| = |M ∩ G ∩ C| = 1. By the principle of
inclusion and exclusion,
|M ∪ G ∪ C| = |M| + |G| + |C| – |MG| – |MC| – |GC| + |MGC| = 6 + 9 + 9 – 2 – 3 – 6 + 1 = 14.
The number of students who study none of the three subjects is the size of the complement of M ∪ G ∪ C:
|U| – |M ∪ G ∪ C| = 15 – 14 = 1.
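The three-set form of the principle of inclusion and exclusion can be sketched as a Python function. The function name union_size and its parameter names are hypothetical. The figures passed to it are those used in the solution of Example 4.8, where |G ∩ C| is taken as 6; the small sets A, B and C at the end are arbitrary test data used to compare the formula with a direct count.

```python
def union_size(n_a, n_b, n_c, n_ab, n_ac, n_bc, n_abc):
    """|A ∪ B ∪ C| by the principle of inclusion and exclusion."""
    return n_a + n_b + n_c - n_ab - n_ac - n_bc + n_abc

# Figures from the solution: M = 6, G = 9, C = 9, MG = 2, MC = 3, GC = 6, MGC = 1.
studying_something = union_size(6, 9, 9, 2, 3, 6, 1)
studying_nothing = 15 - studying_something
print(studying_something, studying_nothing)

# Comparison with a direct count on three small explicit sets.
A, B, C = {1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}
by_formula = union_size(len(A), len(B), len(C),
                        len(A & B), len(A & C), len(B & C), len(A & B & C))
print(by_formula == len(A | B | C))
```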
4.3 ADDITION RULE
The fundamental principle of addition says that if there are two events which may
occur independently in p and q ways respectively, then either of the two events can occur in
(p + q) ways.
It can also be defined as follows: if E1 and E2 are mutually exclusive events and E is the
event that either E1 or E2 will occur, then the number of ways in which event E can occur is given
as
n(E) = n(E1) + n(E2)
where n(E1) = number of outcomes of event E1
n(E2) = number of outcomes of event E2
n(E) = number of outcomes of event E
For m events, this principle can be extended as
n(E) = n(E1) + n(E2) + … + n(Em)
where E is the event that one of E1, E2, …, Em will occur and n(E1), n(E2), …, n(Em)
represent the numbers of outcomes for events E1, E2, …, Em.
Factorial notation
The product of all consecutive integers from 1 to t is denoted by t! (or, in older notation, ⌊t) and
read as t-factorial. Thus
t! = 1 × 2 × 3 × ... × t.
In this way 1! = 1, 2! = 1 × 2 = 2, 3! = 1 × 2 × 3 = 6,
4! = 1 × 2 × 3 × 4 = 24, etc.
Note that for n > 1, n! = n(n – 1)!
Now nPr, the number of permutations of n things taken r at a time, is
nPr = n(n – 1)(n – 2) ... (n – r + 1)
= [n(n – 1) ... (n – r + 1)(n – r)(n – r – 1) ... 3 · 2 · 1] / [(n – r)(n – r – 1) ... 3 · 2 · 1]
= n!/(n – r)!
Convention: As a convention we take 0! equal to 1.
Example 4.9: Find the value of 6P4.
Solution. 6P4 = 6!/(6 – 4)! = 6!/2! = (6 · 5 · 4 · 3 · 2 · 1)/(2 · 1) = 360.
Example 4.10: If nP4 = 12 · nP2, find n.
Solution. nP4 = n!/(n – 4)! and nP2 = n!/(n – 2)!
By hypothesis
n!/(n – 4)! = 12 · n!/(n – 2)!
⇒ (n – 2)! = 12(n – 4)!
⇒ (n – 2)(n – 3)(n – 4)! = 12(n – 4)!
⇒ n^2 – 5n + 6 = 12
⇒ n^2 – 5n – 6 = 0
⇒ (n – 6)(n + 1) = 0
⇒ n = 6 or n = –1.
Since n is a positive integer, we reject the second value of n. Thus
n = 6.
Example 4.11: In how many ways can 5 passengers sit in a compartment
having 16 vacant seats?
Solution. Required number of ways = 16P5
= 16!/(16 – 5)! = 16!/11!
= (16 · 15 · 14 · 13 · 12 · 11!)/11!
= 16 · 15 · 14 · 13 · 12 = 524160.
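The formula nPr = n!/(n – r)! used in Examples 4.9 to 4.11 can be sketched in Python with the standard math.factorial function. The name permutations_count is hypothetical, and the search range of 4 to 49 for n is an arbitrary assumption made for this illustration.

```python
import math

def permutations_count(n, r):
    """Number of permutations of n things taken r at a time: n!/(n - r)!."""
    return math.factorial(n) // math.factorial(n - r)

print(permutations_count(6, 4))     # Example 4.9
print(permutations_count(16, 5))    # Example 4.11

# Example 4.10: search for the values of n satisfying nP4 = 12 * nP2.
solutions = [n for n in range(4, 50)
             if permutations_count(n, 4) == 12 * permutations_count(n, 2)]
print(solutions)
```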
Check Your Progress - 1
1. What is the rule of sum?
................................................................................................................
................................................................................................................
................................................................................................................
2. What does the basic principle of counting things define?
................................................................................................................
................................................................................................................
................................................................................................................
115
Fundamental Principles of
Counting
3.
What do you mean by a factorial?
................................................................................................................
................................................................................................................
................................................................................................................
4.4 SUMMARY
• For a rule of product, suppose a task T1 can be performed in n1 ways, and
independent of this task, the second task T2 can be performed in n2 ways;
then these two tasks when combined can be performed in n1n2 ways.
• If a task T1 can be performed in n1 ways, and following T1, a second task
T2 can be performed in n2 ways, and following task T2 a third task T3 can
be performed in n3 ways, and so on, then all k tasks can be performed in
the sequence T1T2…Tk in exactly n1n2 …nk different ways.
• The basic principle of counting states that each object should be
counted only once.
• Suppose there are N objects, of which some
have property a (their number is denoted Na), some possess property b
(denoted Nb) and some have both the properties a and b (denoted
Nab). If Vab denotes the number of objects having at least one of the properties a or b,
then Vab = Na + Nb – Nab.
• The fundamental principle of addition says that if there are two events which
may occur independently in p and q ways, then either of the two events can
occur in (p + q) ways.
• If E1 and E2 are mutually exclusive events and E is the event that either E1
or E2 will occur, then the number of ways in which event E can occur is given as n(E)
= n(E1) + n(E2).
4.5 KEY WORDS
• Rule of sum: It is a basic counting principle which states that if
there are a ways of doing one thing and b ways of doing another thing, and the two
things cannot be done simultaneously, then there are a + b ways of doing one of them.
• Factorial: For a non-negative integer n, the factorial is denoted by n!, and
is the product of all positive integers less than or equal to n.
4.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The rule of sum states that if a task T1 can be performed in n1
ways and a task T2 can be performed in n2 ways, and the two tasks cannot be
performed simultaneously, then one of the two tasks can be performed in n1 + n2 ways.
2. The basic principle of counting the things defines that each object should be
counted only once.
3. A factorial refers to the product of a positive integer and all the positive integers below it.
4.7 SELF-ASSESSMENT QUESTIONS
1. What do you understand by multiplication rule?
2. Differentiate between the rule of sum and the rule of product.
3. What is the principle of inclusion and exclusion?
4. Write a short note on addition rule.
5. Define a factorial. What is a factorial notation?
6. In a group of 15 students, 6 play football, 9 play hockey, and
9 play chess. Also, 2 play football and hockey, 3 play football and chess,
and 5 play hockey and chess. One student in the group plays all three
games. How many of these students play none of the three games?
7. How many 6-digit telephone numbers are possible, if only even digits are
used?
4.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Permutation and
Combination
UNIT–5
PERMUTATION AND COMBINATION
Objectives
After going through this unit, you will be able to:
• Discuss the concepts of permutation
• Analyse ordered samples and permutations
• Differentiate between ordered samples with and without repetitions
• Understand the concept of restricted combination
Structure
5.1 Introduction
5.2 Permutation
5.3 Combination
5.4 Summary
5.5 Key Words
5.6 Answers to ‘Check Your Progress’
5.7 Self-Assessment Questions
5.8 Further Readings
5.1 INTRODUCTION
This unit will discuss permutation and combination. A permutation is an arrangement
of a set or number of things in a definite order, while a combination is a selection of
things made without regard to their order. Supposing there are three things
represented as a, b and c, then the selections (combinations) that can be made from
these three things taken two at a time are ab, ac and bc. If the order within each
selection is also taken into account, ba, ca and cb are counted as well, giving six
arrangements (permutations) in all. Permutation and combination help determine the
number of possible arrangements or selections that can be derived from a set of given
things or commodities. Permutations and combinations can also be determined for
things with or without repetition. This unit will discuss in detail the various aspects
of permutation and combination.
5.2 PERMUTATION
The various aspects of permutation are discussed here.
Ordered Samples and Permutations
The following statements illustrate the basic concept of ordered samples and
permutations.
(i) If there are things represented by α, β, γ, then the selections that can be
made from these, taken two at a time are βγ, γα, αβ.
If however, we take into account the order or the arrangement in each of
the selections, we have βγ, γβ, γα, αγ, αβ, βα as the different arrangements
of three things taken two at a time.
The different selections that can be obtained by taking the same number r
of things from a given collection of n different things, without any regard to
the order of the things, are called the combinations of n things taken r at a
time, and the number of such combinations is denoted by nCr. The different
arrangements that can be obtained by taking the same number r of things
from a given collection of n different things are called the permutations of
n things taken r at a time, and the number of such permutations is denoted
by nPr.
(ii) The number of ways of selecting one thing at a time, from n different things,
is n and hence nC1 = n.
The number of ways of selecting all the n things is 1 and hence nCn = 1.
The number of ways of arranging one thing at a time from n different things
is n and hence nP1 = n.
Note: In nCr and nPr, r must necessarily be less than or equal to n, so that
its maximum value is n.
(iii) If a certain operation can be performed in any one of m ways, and then
a second operation can be performed in any one of n ways, then both
the operations can be performed in any one of mn ways.
(iv) The number of permutations of n different things taken r at a time is
n(n – 1)(n – 2)........(n – r + 1).
The number of permutations of n things taken r at a time is the same as the
number of ways in which r blank spaces can be filled up by the n things.
(v) We have nPn = n(n – 1)(n – 2) ... to n factors = n(n – 1)(n – 2) ... 3·2·1 = n!
Also, we have,
$${}^{n}P_{r} = n(n-1)(n-2)\cdots(n-r+1) = \frac{n(n-1)(n-2)\cdots(n-r+1)(n-r)\cdots 2\cdot 1}{(n-r)(n-r-1)\cdots 2\cdot 1} = \frac{n!}{(n-r)!}$$
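The two forms of nPr given in (iv) and (v) can be compared numerically. The following is only a minimal sketch in Python; the function name n_p_r is chosen here for illustration and does not come from the text.

```python
from itertools import permutations
from math import factorial


def n_p_r(n: int, r: int) -> int:
    """Number of permutations of n different things taken r at a time,
    computed as the product n(n - 1)(n - 2)...(n - r + 1)."""
    result = 1
    for factor in range(n, n - r, -1):
        result *= factor
    return result


n, r = 5, 4
by_product = n_p_r(n, r)                             # rule (iv)
by_factorials = factorial(n) // factorial(n - r)     # rule (v)
by_listing = sum(1 for _ in permutations(range(n), r))  # direct enumeration
print(by_product, by_factorials, by_listing)
```

All three numbers agree, which is the content of the identity nPr = n!/(n – r)!.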
Example 5.1: How many numbers between 1,000 and 10,000 can be formed
with the digits 1, 3, 5, 7, 9, each digit being used only once in each
number?
Solution: Each number should consist of 4 digits, and the required number is the
same as the number of permutations of 5 different things, taken 4 at a time = 5P4=120.
Example 5.2: Eleven papers are set for the engineering examination of which, two
are in Mathematics. In how many ways can the papers be arranged, if the two
Mathematics papers do not come together?
Solution: We shall find the total number of ways of arranging all the papers, if there
is no restriction and subtract from this, the number of ways in which the two
Mathematics papers come together.
The total number of ways in which all the papers can be arranged, if there is no restriction
= 11!
To find the number of ways in which Mathematics papers come together, consider
that the two papers are bound together; this can be done in 2! ways. Now the
number of ways in which the resulting 10 papers can be arranged is, 10!. Hence,
the total number of ways in which the Mathematics papers come together is 2! ×
10!. Hence the number of ways in which the Mathematics papers do not come
together is, 11! – 2! × 10! = 9 × 10!
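The subtraction argument of Example 5.2 can be checked by direct listing when the number of papers is small. The sketch below is an illustration only: the function names are assumptions made here, and papers 0 and 1 stand in for the two Mathematics papers.

```python
from itertools import permutations
from math import factorial


def apart_by_formula(n: int) -> int:
    """All arrangements of n papers minus those in which the two
    special papers come together: n! - 2! x (n - 1)!."""
    return factorial(n) - factorial(2) * factorial(n - 1)


def apart_by_listing(n: int) -> int:
    """Count, by listing every arrangement, those in which papers 0 and 1
    are not next to each other."""
    count = 0
    for order in permutations(range(n)):
        if abs(order.index(0) - order.index(1)) != 1:
            count += 1
    return count


print(apart_by_formula(6), apart_by_listing(6))     # small case, listed in full
print(apart_by_formula(11) == 9 * factorial(10))    # the eleven papers of the example
```

Listing all 11! arrangements would be slow, so the full example is checked against the closed form 9 × 10! only.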
Example 5.3: There are 35 micro computers in a computer centre. Each
microcomputer has 18 ports. How many different ports are there in the centre?
Solution: The procedure for choosing a port consists of two jobs: first picking a
microcomputer, and then picking a port on this microcomputer. There are 35 ways
to choose the microcomputer and, no matter which microcomputer has been
chosen, 18 ways to choose the port. By rule (iii), there are 35 × 18 = 630 ports.
Example 5.4: How many functions are there from a set with P elements to one
with Q elements?
Solution: A function corresponds to a choice of one of the Q elements in the
codomain for each of the P elements in the domain. Hence, by rule (iii), there are
$Q^P$ functions from a set with P elements to one with Q elements.
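For small sets the functions of Example 5.4 can be listed outright. In this sketch (the name count_functions is hypothetical) each function is represented by a tuple that records the image chosen for every element of the domain.

```python
from itertools import product


def count_functions(p: int, q: int) -> int:
    """Count the functions from a set with p elements to a set with q elements
    by listing every possible assignment of images."""
    codomain = range(q)
    # One tuple of length p = one choice of image for each domain element.
    return sum(1 for _ in product(codomain, repeat=p))


print(count_functions(3, 2), 2 ** 3)
print(count_functions(2, 3), 3 ** 2)
```

Each pair of printed values should agree, in line with the product rule applied P times.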
Example 5.5: In how many ways can the letters of the word EDINBURGH be
arranged,
(i) With the vowels only in the odd places;
(ii) Beginning and ending with vowels;
(iii) Beginning and ending with consonants.
Solution:
(i) There are five odd places and as the three vowels should be in these places
only, they can be first, arranged in 5P3 = 60 ways.
When the vowels have been arranged in any one way among the odd places
1, 3, 5, 7, 9, the remaining six places are to be filled up by the six
consonants, and this can be done in 6! = 720 ways. Hence, the total number
of arrangements is 60 × 720 = 43,200 ways.
(ii) The first and last places should be occupied by vowels, and this can be done
in 3P2 = 6 ways.
Further, for each of these ways the other 7 letters can be arranged in 7! = 5040
ways.
Hence, the total number of arrangements are: 6 × 5040 = 30240.
(iii) The total number of ways = 6P2 × 7! = 30 × 5040 = 151200.
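Since EDINBURGH has only 9! = 362,880 arrangements, the three answers of Example 5.5 can be confirmed by listing them all. This is a brute-force sketch; the variable names are illustrative.

```python
from itertools import permutations

WORD = "EDINBURGH"
VOWELS = set("EIU")

odd_only = ends_vowels = ends_consonants = 0
for arrangement in permutations(WORD):
    # Places 2, 4, 6, 8 are indices 1, 3, 5, 7; they must hold no vowel
    # if the vowels are to occupy odd places only.
    if all(arrangement[i] not in VOWELS for i in (1, 3, 5, 7)):
        odd_only += 1
    first_is_vowel = arrangement[0] in VOWELS
    last_is_vowel = arrangement[-1] in VOWELS
    if first_is_vowel and last_is_vowel:
        ends_vowels += 1
    if not first_is_vowel and not last_is_vowel:
        ends_consonants += 1

print(odd_only, ends_vowels, ends_consonants)  # parts (i), (ii), (iii)
```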
Example 5.6: How many numbers between 5,000 and 10,000 can be formed from
the digits 1, 2, 3, 4, 5, 6, 7, 8, 9, each digit not appearing more than once in each
number?
Solution: The first digit from the left may be 5 or 6 or 7 or 8 or 9; and so, the first
place from the left can be filled in 5 different ways; as the number should consist of
4 digits, the remaining 3 digits can be arranged in 8P3 = 336 ways. Hence, the total
number of numbers that can be formed is 5 × 336 = 1,680.
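Circular Permutations
Consider n persons seated at a round table in any order and, at the same time,
consider them arranged in the same order in a line, as shown in the figure.
Each circular arrangement corresponds to n different linear arrangements, one
for each person with whom the line may begin, since moving everyone round the
table by the same number of seats does not change the circular arrangement.
As there are n! linear arrangements, the number of circular arrangements of
n persons is n!/n = (n – 1)!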
Example 5.7: In how many ways can 6 different beads be strung together to form
a necklace?
Solution: The number of circular permutations of 6 different things is 5!
When 6 persons are seated at a round table, an arrangement and its mirror image
(the same order read in the opposite direction) have to be considered different;
but in the case of the necklace, one of these can be obtained from the other by
simply turning the necklace over, and so the two must be considered identical.
Hence the 5! circular permutations give us only ½ · 5! = 60 ways in which a
necklace of 6 beads can be formed.
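Both counts, (n – 1)! for a round table and half of that for a necklace, can be checked by listing arrangements and treating rotations (and, for the necklace, turning over) as the same. The helper names below are assumptions made for this sketch.

```python
from itertools import permutations


def canonical_rotation(arrangement: tuple) -> tuple:
    """Pick one fixed representative among all rotations of an arrangement."""
    n = len(arrangement)
    return min(arrangement[i:] + arrangement[:i] for i in range(n))


def circular_count(n: int) -> int:
    """Arrangements of n different things round a circle (rotations identified)."""
    return len({canonical_rotation(p) for p in permutations(range(n))})


def necklace_count(n: int) -> int:
    """Arrangements of n different beads on a necklace
    (rotations and turning over both identified)."""
    seen = set()
    for p in permutations(range(n)):
        seen.add(min(canonical_rotation(p), canonical_rotation(p[::-1])))
    return len(seen)


print(circular_count(6), necklace_count(6))
```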
Check Your Progress - 1
1. How can different selections be obtained from a given collection of n
different things?
2. What is the number of circular arrangements of n persons?
5.3 COMBINATIONS
The number of combinations of n different things taken r at a time is
$${}^{n}C_{r} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{1\cdot 2\cdot 3\cdots r}.$$
Example 5.8: How many diagonals are there in a polygon of n sides?
Solution: The number of diagonals is the number of straight lines joining any two
vertices, the n sides (which join consecutive vertices) being excepted; hence the
required number is
$${}^{n}C_{2} - n = \frac{n(n-1)}{2} - n = \frac{n^2 - 3n}{2}.$$
Example 5.9: The English alphabet consists of 21 consonants and 5 vowels. How
many 5-lettered words, consisting of at least one vowel and at least two
consonants, can be formed from them?
Solution: Each 5-lettered word (no letter being repeated) might consist of,
(i) 1 vowel and 4 consonants
(ii) 2 vowels and 3 consonants
(iii) 3 vowels and 2 consonants
Now,
(i) gives rise to 5C1 × 21C4 × 5! = 3,591,000 words
(ii) gives rise to 5C2 × 21C3 × 5! = 1,596,000 words
(iii) gives rise to 5C3 × 21C2 × 5! = 252,000 words
∴The required number of words is 5,439,000.
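The three cases of Example 5.9 can be added up mechanically. This is a minimal sketch using Python's math.comb; the function name and its parameters are chosen here for illustration, and it assumes, as the solution does, that no letter is repeated within a word.

```python
from math import comb, factorial


def five_letter_words(vowels_available: int = 5, consonants_available: int = 21) -> int:
    """Words of 5 different letters with at least one vowel and
    at least two consonants."""
    total = 0
    for vowels in (1, 2, 3):           # the three cases of the solution
        consonants = 5 - vowels
        selections = comb(vowels_available, vowels) * comb(consonants_available, consonants)
        total += selections * factorial(5)   # arrange the 5 chosen letters
    return total


print(five_letter_words())
```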
Example 5.10: Find the number of permutations when six letters at a time are taken
from the word RAMAYANAM.
Solution: There are 9 letters of 5 sorts, namely, a, a, a, a; m, m; r; y; n.
The selections of six letters can be grouped as follows, and the combinations and
permutations counted for each group:
(i) 4 alike, 2 alike,
The number of combinations = 1·1 = 1
The number of permutations = 6!/(4! 2!) = 15
(ii) 4 alike, 2 different,
The number of combinations = 1·4C2 = 6
The number of permutations = 6 · 6!/4! = 180
(iii) 3 alike, 2 alike, 1 different,
The number of combinations = 1·1·3C1 = 3
The number of permutations = 3 · 6!/(3! 2!) = 180
(iv) 3 alike, 3 different,
The number of combinations = 1·4C3 = 4
The number of permutations = 4 · 6!/3! = 480
(v) 2 alike, 2 alike, 2 different,
The number of combinations = 1·1·3C2 = 3
The number of permutations = 3 · 6!/(2! 2!) = 540
(vi) 2 alike, 4 different,
The number of combinations = 2C1 · 4C4 = 2
The number of permutations = 2 · 6!/2! = 720
The total number of permutations = 15 + 180 + 180 + 480 + 540 + 720 = 2,115.
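The case-by-case count of Example 5.10 can be cross-checked by brute force: list every way of drawing six of the nine letters in order and keep only the distinct results. A short sketch, with illustrative variable names:

```python
from itertools import permutations

WORD = "RAMAYANAM"

# permutations() treats the four A's (and the two M's) as different objects,
# so the same six-letter string is produced many times; the set removes repeats.
distinct_arrangements = {"".join(p) for p in permutations(WORD, 6)}
print(len(distinct_arrangements))
```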
Example 5.11: I have 4 friends: in how many ways can I invite one or more of them
for dinner?
Solution: The required number is, 4C1 + 4C2 + 4C3 + 4C4 = 4 + 6 + 4 + 1 = 15.
Unordered Samples with/without Repetition
In any arrangement or sampling, if order does not matter then it is a combination.
Here also we have two different cases, one in which repetition is allowed and
another in which repetition is not allowed. A case in which repetition is allowed can
be pictured as the coins in your pocket, which may be of denominations 1, 1, 1, 2,
5, 5 and so on. A case in which repetition is not allowed can be pictured as the
numbers 1, 2, 3, which are considered as one sample in which ordering is not
important. If order is taken into account, there are six arrangements: 123, 132,
231, 213, 312 and 321. So an unordered sample is like a set; here it is the set of
three elements 1, 2, 3. Here we will discuss those samples where repetition is not
allowed, and so we are dealing with combinations without repetition.
The number of ordered arrangements of n things taken r at a time (without
repetition) is given by n!/(n – r)!. If the same n things are selected r at a time
without regard to order, the number of selections is n!/{(n – r)! r!}. Thus, when the
restriction of ordering is removed, the number of arrangements is reduced by a
factor of r!. Symbolically, we represent the first case, of ordered arrangements, as
a permutation, nPr, and the second case as a combination, nCr, and nPr
= r! nCr.
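The relation nPr = r! nCr can be seen directly by listing both kinds of sample. The sketch below uses three things labelled a, b, c, taken two at a time; the labels and variable names are illustrative.

```python
from itertools import combinations, permutations
from math import factorial

things = ["a", "b", "c"]
r = 2

selections = list(combinations(things, r))     # order ignored
arrangements = list(permutations(things, r))   # order counted

print(selections)
print(arrangements)
# Every selection of r things can be ordered in r! ways.
print(len(arrangements) == factorial(r) * len(selections))
```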
Permutations with Repetitions
Permutations with Repetitions: The number of permutations of n different things
taken r at a time, when the things can be repeated any number of times, is $n^r$.
Note: The total number of permutations of n things taken 1, 2, 3, ..., r at a time,
when the things may be repeated any number of times, is
$$n + n^2 + \cdots + n^r = \frac{n(n^r - 1)}{n - 1}.$$
Example 5.12: How many numbers of four digits can be formed with the digits
1, 2, 3 ? Find the sum of all such numbers.
Solution: The number of four digit numbers = 3 × 3 × 3 × 3 = 81.
If the numbers are all written down, we find that each of the digits 1, 2, 3 occurs
in the units place 81/3 = 27 times. Hence, the total sum of the digits in the units
place is 27 (1 + 2 + 3) = 162. This is also the sum of all the digits in the tens,
hundreds and thousands places.
Hence, the required sum is, 162 + 162 × 10 + 162 × 100 + 162 × 1000 = 179982.
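Example 5.12 is small enough to be listed in full. The sketch below builds every four-digit number from the digits 1, 2, 3 with repetition allowed, then also checks the note on the total n + n² + ... + n^r for n = 3, r = 4; the variable names are illustrative.

```python
from itertools import product

digits = "123"
numbers = [int("".join(choice)) for choice in product(digits, repeat=4)]

print(len(numbers))   # how many four-digit numbers there are
print(sum(numbers))   # the sum of all such numbers

# Total number of permutations taken 1, 2, 3 or 4 at a time, repetition allowed.
n, r = 3, 4
listed_total = sum(n ** k for k in range(1, r + 1))
closed_form = n * (n ** r - 1) // (n - 1)
print(listed_total, closed_form)
```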
Permutations when all the Things are not Different: The number of permutations
of n things taken together when all the things are not different is given by
$$\frac{n!}{p!\,q!\,r!\cdots}$$
Here, among the n things, p of them are of one kind, q of them are of a second
kind, and so on.
(p + q + r + ... = n)
Example 5.13: How many different words can be made out of the letters which form
the word ALLAHABAD?
Solution: There are 9 letters of which 4 letters are of one sort (A, A, A, A); 2 are
of a second sort (L, L); 1 is of a third sort (H); 1 is of a fourth sort (B); and 1 is of
a fifth sort (D).
The required number = 9!/(4! 2!) = 7,560.
Example 5.14: In how many ways can the letters of the word ENGINEERING
be arranged (i) without changing the order of the consonants (ii) without changing
the relative positions of the vowels and the consonants?
Solution:
(i) The consonants are required to be in the same order as in the given word
and so there can be no interchange of positions among them; they
may therefore be replaced by letters, say c, c, c, c, c, c. All these 6 c’s can
be arranged in one and only one way. Now, we have to find the number of
permutations of 11 letters of which the 6 consonants are alike; the 3 vowels
e, e, e are alike; and the two vowels i, i are alike.
Hence, the required number = 11!/(6! 3! 2!) = 4,620.
(ii) The places originally occupied by vowels must always be occupied by vowels
and those occupied by consonants, always by consonants. The vowels e,
e, e, i, i can be arranged among themselves in 5!/(3! 2!) = 10 ways and the
consonants n, n, n, g, g, r can be arranged among themselves in
6!/(3! 2!) = 60 ways.
Hence, the required number is 10 × 60 = 600 ways.
Permutations Involving Indistinguishable Objects
Often the objects to be arranged are not all different: some are of one kind, others
are of a second kind, still others are of a third kind, and so on. Such cases arise with
words in which certain letters are repeated. For example, in the word COMMITTEE,
M occurs twice, at the 3rd and 4th positions; T occurs twice, at the 6th and 7th
positions; and E occurs twice, at the 8th and 9th positions. If two such repeated
letters are interchanged, the new arrangement is not distinguishable from the old one.
In such a case, if among the n things p are of one kind, q are of a second kind,
and so on, such that p + q + r + ... = n, the number of permutations of the n things
taken all together is n!/(p! q! r! ...), since the number of permutations is reduced by
a factor of p! q! r! ... from the n! permutations obtained when the things are all
different.
In the above example, the number of permutations of all the letters of the word
COMMITTEE is 9!/(2! 2! 2!) = 45,360. If these 9 letters had all been different,
it would have been 9! = 362,880.
The following example will make the concept of permutations involving
indistinguishable objects clear.
Example 5.15: How many different letter arrangements can be formed using the
letters T E N N E S S E E ?
Solution: There would be 9! possible permutations of the letters T E N N E S S E E
if all nine letters were distinguishable.
However, the 4 E’s are indistinguishable, and there are 4! ways to order them.
The 2 S’s and the 2 N’s are likewise indistinguishable, and there are 2! orderings
of each.
Once all the other letters are placed, there is only one place left for the T.
As the E’s, N’s and S’s are indistinguishable among themselves, there are
9!/(4! 2! 2!) = 3,780 different orderings of T E N N E S S E E.
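The rule n!/(p! q! r! ...) lends itself to a short routine that counts the repeated letters of a word and divides accordingly. This is a sketch only; the function name distinct_arrangements is an assumption made here.

```python
from collections import Counter
from math import factorial


def distinct_arrangements(word: str) -> int:
    """Number of distinguishable arrangements of all the letters of a word:
    n! divided by the factorial of the count of each repeated letter."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)
    return total


for word in ("ALLAHABAD", "COMMITTEE", "TENNESSEE"):
    print(word, distinct_arrangements(word))

# Example 5.14 (i): six consonants treated as alike, with e, e, e and i, i.
print(distinct_arrangements("CCCCCC" + "EEE" + "II"))
```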
Restricted Combinations
We know that a combination is a way of choosing elements from a set, where
order does not matter. When additional restrictions are imposed on the choice,
the combinations are called restricted combinations.
Case 1: When p particular things are always to be included
The number of combinations of n distinct things taken r at a time, when p particular
things are always to be included in each selection, is ${}^{n-p}C_{r-p}$.
Case 2: When a particular thing is always to be included
The number of combinations of n distinct things taken r at a time, when a particular
thing is always to be included in each selection, is ${}^{n-1}C_{r-1}$.
Case 3: When p particular things are never included
The number of combinations of n distinct things taken r at a time, when p particular
things are never included in any selection, is ${}^{n-p}C_{r}$.
Case 4: When p particular things never come together
The number of combinations of n distinct things taken r at a time, when p particular
things never come together in any selection, is ${}^{n}C_{r} - {}^{n-p}C_{r-p}$.
Case 5: The number of ways of selecting one or more things from n different things
is given by $2^n - 1$.
Proof: Number of ways of selecting one thing out of n things = nC1
Number of ways of selecting two things out of n things = nC2
Number of ways of selecting three things out of n things = nC3
Number of ways of selecting n things out of n things = nCn
⇒ Total number of ways of selecting one or more things out of n different things
$$= {}^{n}C_{1} + {}^{n}C_{2} + {}^{n}C_{3} + \cdots + {}^{n}C_{n} = \left({}^{n}C_{0} + {}^{n}C_{1} + \cdots + {}^{n}C_{n}\right) - {}^{n}C_{0} = 2^n - 1$$
Case 6: The number of ways of selecting zero or more things from n identical things
is given by n + 1.
Example 5.16: In how many ways can a cricket eleven be chosen out of 15
players, if
(i) A particular player is always chosen?
(ii) A particular player is never chosen?
Solution:
(i) A particular player is always chosen, which means that 10 players are to be
selected out of the remaining 14 players.
⇒ Required number of ways = 14C10 = 14C4 [since nCr = nC(n – r)]
= 14!/(4! × 10!) = 1001
(ii) A particular player is never chosen, which means that 11 players are to be
selected out of the remaining 14 players.
⇒ Required number of ways = 14C11
= 14!/(11! × 3!) = 364
Example 5.17: Kamal has 8 friends. In how many ways can he invite one or more
of them to dinner?
Solution: Kamal can select one or more than one of his 8 friends.
⇒ Required number of ways = $2^8 - 1$ = 256 – 1 = 255.
Example 5.18: In how many ways can zero or more letters be selected from the
letters AAAAA?
Solution: Number of ways of:
Selecting zero A’s = 1
Selecting one A = 1
Selecting two A’s = 1
Selecting three A’s = 1
Selecting four A’s = 1
Selecting five A’s = 1
⇒ Required number of ways = 5 + 1 = 6.
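The restricted-combination cases can be checked by listing all selections and discarding those that break the restriction. The sketch below uses an arbitrary small instance (n = 7, r = 4, p = 2, chosen here purely for illustration) and then evaluates the counts arising in Examples 5.16 and 5.17 with math.comb; the helper count_selections is hypothetical.

```python
from itertools import combinations
from math import comb


def count_selections(n: int, r: int, keep) -> int:
    """Count the selections of r things out of n (labelled 0..n-1)
    for which the test `keep` holds."""
    return sum(1 for s in combinations(range(n), r) if keep(set(s)))


n, r, p = 7, 4, 2
special = set(range(p))   # the p particular things

always_included = count_selections(n, r, lambda s: special <= s)
never_included = count_selections(n, r, lambda s: not (special & s))
never_together = count_selections(n, r, lambda s: not (special <= s))

print(always_included, comb(n - p, r - p))                  # Case 1
print(never_included, comb(n - p, r))                       # Case 3
print(never_together, comb(n, r) - comb(n - p, r - p))      # Case 4

# Case 5 with 8 friends: one or more may be invited.
print(sum(comb(8, k) for k in range(1, 9)), 2 ** 8 - 1)

# Cricket eleven from 15 players: one player always chosen / never chosen.
print(comb(14, 10), comb(14, 11))
```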
Division into Groups
Objects can be divided into groups in two ways:
1. Groups of unequal size
2. Groups of equal size
For groups of unequal size,
(a) The number of ways in which n distinct objects can be divided into r unequal
groups containing p1, p2, ..., pr objects (p1 + p2 + ... + pr = n) is
$$\frac{n!}{p_1!\,p_2!\cdots p_r!}$$
(b) The number of ways in which n distinct objects can be distributed among r
persons such that some person gets p1 objects, another person gets p2
objects, ... and similarly someone gets pr objects is
$$\frac{n!\,r!}{p_1!\,p_2!\cdots p_r!}$$
For groups of equal size,
(c) The number of ways in which m × n distinct objects can be divided equally into
n groups (unmarked) is
$$\frac{(mn)!}{(m!)^n\,n!}$$
(d) The number of ways in which m × n different objects can be distributed equally
among n persons (or numbered groups)
= (number of ways of dividing) × (number of groups)!
$$= \frac{(mn)!}{(m!)^n\,n!} \times n! = \frac{(mn)!}{(m!)^n}$$
(e) The number of ways to divide m + n + p objects into three groups having
m, n and p objects is (m + n + p)!/(m! n! p!)
(f) The number of ways to divide m + 2n objects into three groups having m,
n and n objects is (m + 2n)!/(m! × n! × n! × 2!), where the 2 in 2! is the
number of groups having the same number of objects.
Example 5.19: In how many ways can you divide 28 school children into three
groups having 3, 5, and 20 children?
Solution: Total students = 28, Groups of 3, 5 and 20
Therefore number of ways = 28!/(3! 5! 20!).
Example 5.20: In how many ways can you divide 28 school children into three
groups having 4, 12, and 12 children?
Solution: Total students = 28
Groups of 4, 12 and 12
Number of groups having the same number of children = 2 (the two groups of 12)
∴ Number of ways = 28!/(4! 12! 12! 2!)
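The group-division rules can be gathered into one routine: divide n! by the factorial of each group size, and then by the factorial of the number of groups that share a size. This is a sketch under the assumption that the groups are unmarked; the function name ways_to_divide is not from the text.

```python
from collections import Counter
from math import factorial


def ways_to_divide(group_sizes: list) -> int:
    """Ways to divide sum(group_sizes) distinct objects into unmarked groups
    of the given sizes."""
    ways = factorial(sum(group_sizes))
    for size in group_sizes:
        ways //= factorial(size)
    # Groups of equal size are interchangeable, so divide by (how many)!.
    for how_many in Counter(group_sizes).values():
        ways //= factorial(how_many)
    return ways


# Four objects a, b, c, d into two pairs: ab|cd, ac|bd, ad|bc.
print(ways_to_divide([2, 2]))

# Examples 5.19 and 5.20.
print(ways_to_divide([3, 5, 20]))
print(ways_to_divide([4, 12, 12]))
```

For distribution among persons (numbered groups) with all groups of equal size, the result would be multiplied by the factorial of the number of groups, as in rule (d).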
Check Your Progress - 2
1. What is the number of combinations of n different things taken r at a
time?
2. What is it called when the order in an arrangement or sampling does not
matter?
3. What is the number of permutations of n different things taken r at a time,
considering the things can be repeated any number of times?
5.4 SUMMARY
• Permutation refers to each of the several possible ways in which a set or
number of things can be ordered or arranged.
• If there are three things represented by α, β, γ, then the selections that can
be made from these, taken two at a time, are βγ, γα, αβ.
• The different arrangements that can be obtained by taking the same number
r of things from a given collection of n different things are called the
permutations of n things taken r at a time.
• The different selections, that can be obtained by taking the same number r
of things from a given collection of n different things, without any regard to
the order of the things, are called the combination of n things taken r at a
time.
• The number of ways of selecting one thing at a time, from n different things,
is n and hence nC1 = n.
• The number of ways of arranging one thing at a time from n different things
is n and hence nP1 = n.
• If a certain operation can be performed in any one of m ways, and
then a second operation can be performed in any one of n ways, then
both the operations can be performed in any one of mn ways.
• The number of permutations of n different things taken r at a time is
n(n – 1)(n – 2)........(n – r + 1).
• The number of combinations of n different things taken r at a time is
$${}^{n}C_{r} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{1\cdot 2\cdot 3\cdots r}$$
• The number of ordered arrangements of n things taken r at a time (without
repetition) is given by n!/(n – r)!.
• The number of permutations of n different things taken r at a time when the
things can be repeated any number of times is $n^r$.
5.5 KEY WORDS
• Permutation: It refers to each of the several possible ways in which a set
or number of things can be ordered or arranged.
• Permutations with repetition: It refers to the permutations of n different
things taken r at a time when the things can be repeated any number of
times; the number of such permutations is $n^r$.
5.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. Different selections can be obtained by taking the same number r of things
from a given collection of n different things, without any regard to the order
of the things.
2. The number of circular arrangements of n persons is n!/n= (n – 1)!
Check Your Progress - 2
1. The number of combinations of n different things taken r at a time is
$${}^{n}C_{r} = \frac{n(n-1)(n-2)\cdots(n-r+1)}{1\cdot 2\cdot 3\cdots r}.$$
2. In any arrangement or sampling, if order does not matter then it is called a
combination.
3. The number of permutations of n different things taken r at a time when the
things can be repeated any number of times is $n^r$.
5.7 SELF-ASSESSMENT QUESTIONS
1. What do you understand by permutation and combination?
2. There are 35 students in a class. Each student has 8 notebooks. How many
different notebooks are there in the class?
3. How many functions are there from a set with P elements to one with Q
elements?
4. Write a short note on unordered samples with and without repetitions.
5. What do you mean by circular permutations?
6. How many different letter arrangements can be formed using the letters
ARRANGEMENT?
7. How is the number of permutations of n things computed when all the things
are not different?
8. Write the possible combinations for inviting 6 friends to dinner.
5.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Matrices and Determinants
UNIT–6
MATRICES AND DETERMINANTS
Objectives
After going through this unit, you will be able to:
• Describe matrices and determinants
• Analyse the various types of matrices
• Assess the operations of matrices
• Describe minors and cofactors of determinants
• Understand scalar multiplication of matrix
Structure
6.1 Introduction
6.2 Matrix
6.3 Subtraction of Matrix and System of Linear Equations
6.4 Summary
6.5 Key Words
6.6 Answers to ‘Check Your Progress’
6.7 Self-Assessment Questions
6.8 Further Readings
6.1 INTRODUCTION
This unit will discuss matrices and determinants. A matrix (plural matrices) is a
rectangular array of numbers, symbols, or expressions, arranged in rows and
columns. A determinant, however, is a single quantity associated with a square
matrix, obtained by adding products of its elements with appropriate signs. Matrices
are classified as row, column, square, null, diagonal, scalar, identity and triangular.
Two matrices may or may not be equal: they are equal when they are of the same
order and their corresponding elements are the same. The basic operations of
addition, subtraction and multiplication can also be applied on matrices. Another
operation that can be applied to a matrix is the transpose. This unit discusses the
operations on matrices in detail.
6.2 MATRIX
The aspects, types and characteristics of matrices are discussed below in detail.
What is a Matrix?
Let n, m be two integers ≥ 1. Consider an array of elements of the following type:
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{pmatrix}$$
This is called a matrix. We denote this matrix by (aij), i = 1, ..., m and j = 1, ..., n. We
say that it is an m × n matrix (or matrix of order m × n). It has m rows and n columns.
For example, the first row is (a11 a12 ... a1n) and the first column is,
$$\begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}$$
Also, aij denotes the element of the matrix (aij) lying in ith row and jth column and we
call this element as the (i, j)th element of the matrix.
For example, in the matrix,
$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$
a11 = 1, a12 = 2, a32 = 8, i.e., the (1, 1)th element is 1, the (1, 2)th element is 2
and the (3, 2)th element is 8, respectively.
Notes:
1. Matrices are a key tool in linear algebra.
2. A matrix is simply an arrangement of elements and has no numerical
value.
Example 6.1: If
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 0 & 1 & 2 \end{pmatrix},$$
find a11, a22, a33, a31, a41.
Solution:
a11 = element of A in first row and first column = 1
a22 = element of A in second row and second column = 5
a33 = element of A in third row and third column = 9
a31 = element of A in third row and first column = 7
a41 = element of A in fourth row and first column = 0
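The lookup in Example 6.1 can be mirrored in a program by storing the matrix as a list of rows. This is an illustrative sketch; the helper name element is chosen here, and it converts the text's numbering (rows and columns counted from 1) to Python's numbering (counted from 0).

```python
# The 4 x 3 matrix A of Example 6.1, stored row by row.
A = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [0, 1, 2],
]


def element(matrix: list, i: int, j: int):
    """Return the (i, j)th element, with rows and columns numbered from 1."""
    return matrix[i - 1][j - 1]


rows, columns = len(A), len(A[0])
print(f"A is a {rows} x {columns} matrix")
for i, j in [(1, 1), (2, 2), (3, 3), (3, 1), (4, 1)]:
    print(f"a{i}{j} =", element(A, i, j))
```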
Types of Matrices
1. Row Matrix. A matrix which has exactly one row is called a row matrix.
For example, (1 2 3 4) is a row matrix.
2. Column Matrix. A matrix which has exactly one column is called a
column matrix.
For example, $\begin{pmatrix} 5 \\ 6 \\ 7 \end{pmatrix}$ is a column matrix.
3. Square Matrix. A matrix in which the number of rows is equal to the
number of columns is called a square matrix.
For example, $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ is a 2 × 2 square matrix.
4. Null or Zero Matrix. A matrix each of whose elements is zero is called a
null matrix or zero matrix.
For example, $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ is a 2 × 3 null matrix.
5. Diagonal Matrix. The elements aii are called diagonal elements of a
square matrix (aij). For example, in the matrix,
$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$
the diagonal elements are a11 = 1, a22 = 5, a33 = 9.
A square matrix whose every element other than the diagonal elements is zero
is called a diagonal matrix. For example,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$$
is a diagonal matrix.
Note that the diagonal elements in a diagonal matrix may also be zero. For
example,
$$\begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} \text{ and } \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
are also diagonal matrices.
6. Scalar Matrix. A diagonal matrix whose diagonal elements are equal is
called a scalar matrix. For example,
$$\begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix},\quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\quad \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
are scalar matrices.
7. Identity Matrix. A diagonal matrix whose diagonal elements are all equal
to 1 (unity) is called an identity matrix (or unit matrix). For example,
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is an identity matrix.
8. Triangular Matrix. A square matrix (aij) whose elements aij = 0 whenever
i < j is called a lower triangular matrix.
Similarly, a square matrix (aij) whose elements aij = 0 whenever i > j is
called an upper triangular matrix.
For example,
$$\begin{pmatrix} 1 & 0 & 0 \\ 4 & 5 & 0 \\ 7 & 8 & 9 \end{pmatrix},\quad \begin{pmatrix} 1 & 0 \\ 2 & 0 \end{pmatrix}$$
are lower triangular matrices, and
$$\begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix},\quad \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}$$
are upper triangular matrices.
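The definitions of the types of matrices translate directly into tests on the elements aij. The following sketch stores a matrix as a list of rows; all the function names are assumptions made for illustration.

```python
def is_square(m: list) -> bool:
    return all(len(row) == len(m) for row in m)


def is_diagonal(m: list) -> bool:
    """Square, and every element off the diagonal (i != j) is zero."""
    n = len(m)
    return is_square(m) and all(m[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def is_scalar(m: list) -> bool:
    """Diagonal, with all diagonal elements equal."""
    return is_diagonal(m) and len({m[i][i] for i in range(len(m))}) == 1


def is_identity(m: list) -> bool:
    """Diagonal, with all diagonal elements equal to 1."""
    return is_diagonal(m) and all(m[i][i] == 1 for i in range(len(m)))


def is_lower_triangular(m: list) -> bool:
    """Square, with a_ij = 0 whenever i < j."""
    n = len(m)
    return is_square(m) and all(m[i][j] == 0 for i in range(n) for j in range(n) if i < j)


def is_upper_triangular(m: list) -> bool:
    """Square, with a_ij = 0 whenever i > j."""
    n = len(m)
    return is_square(m) and all(m[i][j] == 0 for i in range(n) for j in range(n) if i > j)


print(is_diagonal([[1, 0, 0], [0, 2, 0], [0, 0, 3]]))
print(is_scalar([[5, 0], [0, 5]]), is_identity([[1, 0], [0, 1]]))
print(is_lower_triangular([[1, 0], [2, 0]]), is_upper_triangular([[1, 2], [0, 3]]))
```

A diagonal matrix passes both triangular tests, since all its elements off the diagonal are zero.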
Algebra of Matrices
Equality
Two matrices A and B are said to be equal if,
(i) A and B are of same order.
(ii) Corresponding elements in A and B are same. For example, the following
two matrices are equal.
$$\begin{pmatrix} 3 & 4 & 9 \\ 16 & 25 & 64 \end{pmatrix} = \begin{pmatrix} 3 & 4 & 9 \\ 16 & 25 & 64 \end{pmatrix}$$
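But the following two matrices are not equal.
$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$
As the matrix on the left is of order 2 × 3, while the one on the right is of order 3 × 3.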
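The following two matrices are also not equal.
$$\begin{pmatrix} 1 & 2 & 3 \\ 7 & 8 & 9 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 2 & 3 \\ 4 & 8 & 9 \end{pmatrix}$$
As the (2, 1)th element in the LHS matrix is 7 while in the RHS matrix it is 4.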
Operations on Matrices
The following operations can be performed on matrices.
Addition of Matrices
If A and B are two matrices of the same order then addition of A and B is defined to
be the matrix obtained by adding the corresponding elements of A and B.
For example, if
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \end{pmatrix}$$
Then,
$$A + B = \begin{pmatrix} 1 + 2 & 2 + 3 & 3 + 4 \\ 4 + 5 & 5 + 6 & 6 + 7 \end{pmatrix} = \begin{pmatrix} 3 & 5 & 7 \\ 9 & 11 & 13 \end{pmatrix}$$
Also,
$$A - B = \begin{pmatrix} 1 - 2 & 2 - 3 & 3 - 4 \\ 4 - 5 & 5 - 6 & 6 - 7 \end{pmatrix} = \begin{pmatrix} -1 & -1 & -1 \\ -1 & -1 & -1 \end{pmatrix}$$
Note that addition (or subtraction) of two matrices is defined only when A and
B are of the same order.
Properties of Matrix Addition
(i) Matrix addition is commutative, i.e.,
A + B = B + A
For, the (i, j)th element of A + B is (aij + bij) and that of B + A is (bij + aij), and they are
the same, as aij and bij are real numbers.
(ii) Matrix addition is associative, i.e.,
A + (B + C) = (A + B) + C
For, the (i, j)th element of A + (B + C) is aij + (bij + cij) and that of (A + B) + C is
(aij + bij) + cij, which are the same.
(iii) If O denotes the null matrix of the same order as that of A, then
A + O = A = O + A
For, the (i, j)th element of A + O is aij + 0, which is the same as the (i, j)th element
of A.
(iv) To each matrix A there corresponds a matrix B such that
A + B = O = B + A.
For, let the (i, j)th element of B be – aij. Then the (i, j)th element of A + B is
aij – aij = 0.
Thus, the set of m × n matrices forms an abelian group under the composition of
matrix addition.
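Example 6.2: If $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \end{pmatrix}$,
verify A + B = B + A.
Solution:
$$A + B = \begin{pmatrix} 1 + 0 & 2 + 1 & 3 + 2 \\ 4 + 3 & 5 + 4 & 6 + 5 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 5 \\ 7 & 9 & 11 \end{pmatrix}$$
$$B + A = \begin{pmatrix} 0 + 1 & 1 + 2 & 2 + 3 \\ 3 + 4 & 4 + 5 & 5 + 6 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 5 \\ 7 & 9 & 11 \end{pmatrix}$$
So, A + B = B + A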
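Example 6.3: If A and B are matrices as in Example 6.2
and $C = \begin{pmatrix} -1 & 0 & 1 \\ 1 & 2 & 3 \end{pmatrix}$, verify (A + B) + C = A + (B + C).
Solution: Now
$$A + B = \begin{pmatrix} 1 & 3 & 5 \\ 7 & 9 & 11 \end{pmatrix}$$
So,
$$(A + B) + C = \begin{pmatrix} 1 - 1 & 3 + 0 & 5 + 1 \\ 7 + 1 & 9 + 2 & 11 + 3 \end{pmatrix} = \begin{pmatrix} 0 & 3 & 6 \\ 8 & 11 & 14 \end{pmatrix}$$
Again,
$$B + C = \begin{pmatrix} 0 - 1 & 1 + 0 & 2 + 1 \\ 3 + 1 & 4 + 2 & 5 + 3 \end{pmatrix} = \begin{pmatrix} -1 & 1 & 3 \\ 4 & 6 & 8 \end{pmatrix}$$
So,
$$A + (B + C) = \begin{pmatrix} 1 - 1 & 2 + 1 & 3 + 3 \\ 4 + 4 & 5 + 6 & 6 + 8 \end{pmatrix} = \begin{pmatrix} 0 & 3 & 6 \\ 8 & 11 & 14 \end{pmatrix}$$
Therefore, (A + B) + C = A + (B + C)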
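Example 6.4: If $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}$, find a matrix B such that A + B = O.
Solution: Let $B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix}$
Then,
$$A + B = \begin{pmatrix} 1 + b_{11} & 2 + b_{12} \\ 3 + b_{21} & 4 + b_{22} \\ 5 + b_{31} & 6 + b_{32} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}$$
It implies, b11 = – 1, b12 = – 2, b21 = – 3, b22 = – 4,
b31 = – 5, b32 = – 6
Therefore the required $B = \begin{pmatrix} -1 & -2 \\ -3 & -4 \\ -5 & -6 \end{pmatrix}$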
Multiplication of Matrices
The product AB of two matrices A and B is defined only when the number of columns
of A is same as the number of rows in B and by definition the product AB is a matrix G
of order m × p if A and B were of order m × n and n × p, respectively. The following
example will give the rule to multiply two matrices:
Let,
$$A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix}, \quad B = \begin{pmatrix} d_1 & e_1 \\ d_2 & e_2 \\ d_3 & e_3 \end{pmatrix}$$
Order of A = 2 × 3, Order of B = 3 × 2
So, AB is defined as,
$$G = AB = \begin{pmatrix} a_1d_1 + b_1d_2 + c_1d_3 & a_1e_1 + b_1e_2 + c_1e_3 \\ a_2d_1 + b_2d_2 + c_2d_3 & a_2e_1 + b_2e_2 + c_2e_3 \end{pmatrix} = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}$$
g11 : Multiply elements of the first row of A with corresponding elements of the first
column of B and add.
g12 : Multiply elements of the first row of A with corresponding elements of the second
column of B and add.
g21 : Multiply elements of the second row of A with corresponding elements of the
first column of B and add.
g22 : Multiply elements of the second row of A with corresponding elements of
the second column of B and add.
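The row-by-column rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the course text: matrices are assumed to be lists of rows, and the name `mat_mul` is hypothetical.

```python
def mat_mul(a, b):
    """Product of an m x n matrix a and an n x p matrix b (lists of rows).

    Element (i, j) of the result is obtained by multiplying the elements of
    row i of a with the corresponding elements of column j of b and adding.
    """
    n = len(a[0])
    if any(len(row) != n for row in a) or len(b) != n:
        raise ValueError("columns of the first matrix must equal rows of the second")
    p = len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(a))]

# A is 2 x 3 and B is 3 x 2, so AB is 2 x 2.
A = [[1, 2, 3], [4, 5, 6]]
B = [[1, 0], [0, 1], [1, 1]]
G = mat_mul(A, B)
```

The order check mirrors the condition stated in the text: the product is defined only when the number of columns of the first matrix equals the number of rows of the second.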
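Notes: 1. In general, if A and B are two matrices then AB may not be equal to BA.
For example, if
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$
then
$$AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad BA = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$$
So, AB ≠ BA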
2. If product AB is defined, then it is not necessary that BA must also be
defined. For example, if A is of order 2 × 3 and B is of order 3 × 1, then AB
can be defined but BA cannot be defined (as the number of columns of B ≠
the number of rows of A).
It can be easily verified that,
(i) A(BC) = (AB)C
(ii) A(B + C) = AB + AC
(iii) (A + B)C = AC + BC.
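Example 6.5: If $A = \begin{pmatrix} 2 & -1 \\ 0 & 3 \end{pmatrix}$ and $B = \begin{pmatrix} 7 & 0 \\ -2 & -3 \end{pmatrix}$, find AB.
Solution:
$$AB = \begin{pmatrix} 2 \times 7 + (-1)(-2) & 2 \times 0 + (-1)(-3) \\ 0 \times 7 + 3(-2) & 0 \times 0 + 3(-3) \end{pmatrix} = \begin{pmatrix} 16 & 3 \\ -6 & -9 \end{pmatrix}$$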
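Example 6.6: Verify the associative law A(BC) = (AB)C for the following matrices.
$$A = \begin{pmatrix} -1 & 0 \\ 7 & -2 \end{pmatrix}, \quad B = \begin{pmatrix} -1 & 5 \\ 7 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} -1 & -1 \\ 2 & 0 \end{pmatrix}$$
Solution:
$$AB = \begin{pmatrix} -1 & 0 \\ 7 & -2 \end{pmatrix} \begin{pmatrix} -1 & 5 \\ 7 & 0 \end{pmatrix} = \begin{pmatrix} 1 + 0 & -5 + 0 \\ -7 - 14 & 35 + 0 \end{pmatrix} = \begin{pmatrix} 1 & -5 \\ -21 & 35 \end{pmatrix}$$
$$BC = \begin{pmatrix} -1 & 5 \\ 7 & 0 \end{pmatrix} \begin{pmatrix} -1 & -1 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} 1 + 10 & 1 + 0 \\ -7 + 0 & -7 + 0 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ -7 & -7 \end{pmatrix}$$
$$A(BC) = \begin{pmatrix} -1 & 0 \\ 7 & -2 \end{pmatrix} \begin{pmatrix} 11 & 1 \\ -7 & -7 \end{pmatrix} = \begin{pmatrix} -11 + 0 & -1 + 0 \\ 77 + 14 & 7 + 14 \end{pmatrix} = \begin{pmatrix} -11 & -1 \\ 91 & 21 \end{pmatrix}$$
$$(AB)C = \begin{pmatrix} 1 & -5 \\ -21 & 35 \end{pmatrix} \begin{pmatrix} -1 & -1 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} -1 - 10 & -1 + 0 \\ 21 + 70 & 21 + 0 \end{pmatrix} = \begin{pmatrix} -11 & -1 \\ 91 & 21 \end{pmatrix}$$
Thus, A(BC) = (AB)C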
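Example 6.7: If A is a square matrix, then A can be multiplied by itself. Define
$A^2 = A \cdot A$ (called a power of a matrix). Compute $A^2$ for the following matrix:
$$A = \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix}$$
Solution:
$$A^2 = \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 15 & 16 \end{pmatrix}$$
(Similarly, we can define $A^3, A^4, A^5, \ldots$ for any square matrix A.)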
Scalar Multiplication of Matrix
If k is any complex number and A, a given matrix, then kA is the matrix obtained from
A by multiplying each element of A by k. The number k is called scalar.
For example, if $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and k = 2,
then, $kA = \begin{pmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{pmatrix}$
It can be easily shown that,
(i) k(A + B) = kA + kB
(ii) (k1 + k2)A = k1A + k2A
(iii) 1A = A
(iv) (k1k2)A = k1(k2A)
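Example 6.8: (i) If $A = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 3 & 4 \\ 4 & 5 & 6 \end{pmatrix}$ and k1 = i, k2 = 2, verify,
(k1 + k2) A = k1A + k2A
(ii) If $A = \begin{pmatrix} 0 & 2 & 3 \\ 2 & 1 & 4 \end{pmatrix}$, $B = \begin{pmatrix} 7 & 6 & 3 \\ 1 & 4 & 5 \end{pmatrix}$, find the value of 2A + 3B.
Solution: (i) Now
$$k_1A = \begin{pmatrix} 0 & i & 2i \\ 2i & 3i & 4i \\ 4i & 5i & 6i \end{pmatrix} \quad \text{and} \quad k_2A = \begin{pmatrix} 0 & 2 & 4 \\ 4 & 6 & 8 \\ 8 & 10 & 12 \end{pmatrix}$$
So,
$$k_1A + k_2A = \begin{pmatrix} 0 & 2 + i & 4 + 2i \\ 4 + 2i & 6 + 3i & 8 + 4i \\ 8 + 4i & 10 + 5i & 12 + 6i \end{pmatrix}$$
Also,
$$(k_1 + k_2)A = (2 + i)A = \begin{pmatrix} 0 & 2 + i & 4 + 2i \\ 4 + 2i & 6 + 3i & 8 + 4i \\ 8 + 4i & 10 + 5i & 12 + 6i \end{pmatrix}$$
Therefore, (k1 + k2)A = k1A + k2A
(ii)
$$2A = \begin{pmatrix} 0 & 4 & 6 \\ 4 & 2 & 8 \end{pmatrix}, \quad 3B = \begin{pmatrix} 21 & 18 & 9 \\ 3 & 12 & 15 \end{pmatrix}$$
So,
$$2A + 3B = \begin{pmatrix} 21 & 22 & 15 \\ 7 & 14 & 23 \end{pmatrix}$$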
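Example 6.9: If $A = \begin{pmatrix} 1 & 2 \\ -3 & 0 \end{pmatrix}$, find $A^2 + 3A + 5I$ where I is the unit matrix of order 2.
Solution:
$$A^2 = \begin{pmatrix} 1 & 2 \\ -3 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ -3 & 0 \end{pmatrix} = \begin{pmatrix} -5 & 2 \\ -3 & -6 \end{pmatrix}$$
$$3A = \begin{pmatrix} 3 & 6 \\ -9 & 0 \end{pmatrix}$$
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad 5I = \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix}$$
So,
$$A^2 + 3A + 5I = \begin{pmatrix} -5 & 2 \\ -3 & -6 \end{pmatrix} + \begin{pmatrix} 3 & 6 \\ -9 & 0 \end{pmatrix} + \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} = \begin{pmatrix} 3 & 8 \\ -12 & -1 \end{pmatrix}$$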
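Example 6.10: If $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$,
show that AB = – BA and $A^2 = B^2 = I$.
Solution: Now,
$$AB = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$$
$$BA = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}$$
So, AB = – BA
Also,
$$A^2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$$
$$B^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$$
This proves the result.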
Example 6.11: In an examination of Mathematics, 20 students from college A, 30
students from college B and 40 students from college C appeared. Only 15 students
from each college could get through the examination. Out of them 10 students from
college A and 5 students from college B and 10 students from college C secured full
marks. Write down the above data in matrix form.
Solution: Consider the matrix,
$$\begin{pmatrix} 20 & 30 & 40 \\ 15 & 15 & 15 \\ 10 & 5 & 10 \end{pmatrix}$$
First row represents the number of students in college A, college B and college C
respectively.
Second row represents the number of students who got through the examination
in three colleges respectively.
Third row represents the number of students who got full marks in the three colleges
respectively.
Example 6.12: A publishing house has two branches. In each branch, there are three
offices. In each office, there are 3 peons, 4 clerks and 5 typists. In one office of a
branch, 6 salesmen are also working. In each office of other branch 2 head clerks are
also working. Using matrix notation find (i) the total number of posts of each kind in all
the offices taken together in each branch, (ii) the total number of posts of each kind in
all the offices taken together from both the branches.
Solution: (i) Consider the following row matrices,
A1 = (3 4 5 6 0), A2 = (3 4 5 0 0),
A3 = (3 4 5 0 0)
These matrices represent the three offices of the branch (say A) where elements
appearing in the row represent the number of peons, clerks, typists, salesmen and
head clerks taken in that order working in the three offices.
Then, A1 + A2 + A3 = (3 + 3 + 3 4 + 4 + 4 5 + 5 + 5 6 + 0 + 0 0 + 0 + 0)
= (9 12 15 6 0)
Thus, total number of posts of each kind in all the offices of branch A are the
elements of matrix A1 + A2 + A3 = (9 12 15 6 0)
Now consider the following row matrices,
B1 = (3 4 5 0 2), B2 = (3 4 5 0 2), B3 = (3 4 5 0 2)
Then B1, B2, B3 represent three offices of other branch (say B) where the elements
in the row represent number of peons, clerks, typists, salesmen and head clerks
respectively.
Thus, total number of posts of each kind in all the offices of branch B are the
elements of the matrix B1 + B2 + B3 = (9 12 15 0 6)
(ii) The total number of posts of each kind in all the offices taken together from
both branches are the elements of matrix,
(A1 + A2 + A3) + (B1 + B2 + B3) = (18 24 30 6 6)
Example 6.13: Let $A = \begin{pmatrix} 10 & 20 \\ 30 & 40 \end{pmatrix}$, where the first row represents the number of table fans
and the second row represents the number of ceiling fans which two manufacturing units A
and B make in one day. The first and second columns represent the manufacturing units
A and B. Compute 5A and state what it represents.
Solution: $5A = \begin{pmatrix} 50 & 100 \\ 150 & 200 \end{pmatrix}$
It represents the number of table fans and ceiling fans that the manufacturing units
A and B produce in five days.
Example 6.14: Let $A = \begin{pmatrix} 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 6 \\ 4 & 5 & 6 & 7 \end{pmatrix}$, where the rows represent the number of items of
type I, II, III, respectively. The four columns represent the four shops A1, A2, A3, A4
respectively.
Let,
$$B = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 2 & 3 \\ 3 & 2 & 1 & 2 \end{pmatrix}, \quad C = \begin{pmatrix} 1 & 2 & 2 & 3 \\ 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 4 \end{pmatrix}$$
where the elements in B represent the number of items of different types delivered at the
beginning of a week and matrix C represents the sales during that week. Find,
(i) The number of items immediately after delivery of items.
(ii) The number of items at the end of the week.
(iii) The number of items needed to bring stocks of all items in all shops to 6.
Solution:
(i) $A + B = \begin{pmatrix} 3 & 5 & 7 & 9 \\ 5 & 5 & 7 & 9 \\ 7 & 7 & 7 & 9 \end{pmatrix}$
represents the number of items immediately after delivery of items.
(ii) $(A + B) - C = \begin{pmatrix} 2 & 3 & 5 & 6 \\ 4 & 3 & 4 & 5 \\ 5 & 4 & 3 & 5 \end{pmatrix}$
represents the number of items at the end of the week.
(iii) We want that all elements in (A + B) – C should be 6.
Let $D = \begin{pmatrix} 4 & 3 & 1 & 0 \\ 2 & 3 & 2 & 1 \\ 1 & 2 & 3 & 1 \end{pmatrix}$
Then (A + B) – C + D is a matrix in which all elements are 6. So, D represents the
number of items needed to bring stocks of all items of all shops to 6.
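The stock calculation in this example is plain element-wise addition and subtraction. A small sketch follows, using the figures of Example 6.14; the helper names `add` and `sub` and the variable names are illustrative only.

```python
def add(x, y):
    """Element-wise sum of two matrices of the same order."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    """Element-wise difference of two matrices of the same order."""
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]

# Rows: item types I, II, III; columns: shops A1, A2, A3, A4.
stock = [[2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]       # matrix A
delivered = [[1, 2, 3, 4], [2, 1, 2, 3], [3, 2, 1, 2]]   # matrix B
sold = [[1, 2, 2, 3], [1, 2, 3, 4], [2, 3, 4, 4]]        # matrix C

after_delivery = add(stock, delivered)      # A + B
end_of_week = sub(after_delivery, sold)     # (A + B) - C
target = 6
needed = [[target - v for v in row] for row in end_of_week]  # matrix D
```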
Example 6.15: The following matrix represents the results of the examination of
B. Com. class:
$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{pmatrix}$$
The rows represent the three sections of the class. The first three columns represent
the number of students securing 1st, 2nd, 3rd divisions respectively in that order and
fourth column represents the number of students who failed in the examination.
(i) How many students passed in three sections respectively?
(ii) How many students failed in three sections respectively?
(iii) Write down the matrix in which number of successful students is shown.
(iv) Write down the column matrix where only failed students are shown.
(v) Write down the column matrix showing students in 1st division from three
sections.
Solution: (i) The number of students who passed in three sections respectively are
1 + 2 + 3 = 6, 5 + 6 + 7 = 18, 9 + 10 + 11 = 30.
(ii) The number of students who failed from three sections respectively are 4, 8,
12.
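(iii) $\begin{pmatrix} 1 & 2 & 3 \\ 5 & 6 & 7 \\ 9 & 10 & 11 \end{pmatrix}$
(iv) $\begin{pmatrix} 4 \\ 8 \\ 12 \end{pmatrix}$ represents the column matrix where only failed students are shown.
(v) $\begin{pmatrix} 1 \\ 5 \\ 9 \end{pmatrix}$ represents the column matrix of students securing 1st division.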
Transpose of a Matrix
Let A be a matrix. The matrix obtained from A by interchange of its rows and columns,
is called the transpose of A. For example,
If $A = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & 0 \end{pmatrix}$, then transpose of $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \\ 2 & 0 \end{pmatrix}$
Transpose of A is denoted by A′.
It can be easily verified that,
(i) (A′)′ = A
(ii) (A + B)′ = A′ + B′
(iii) (AB)′ = B′A′
Example 6.16: For the following matrices A and B verify (A + B)′ = A′ + B′.
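$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 3 & 4 \\ 1 & 8 & 6 \end{pmatrix}$$
Solution:
$$A' = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}, \quad B' = \begin{pmatrix} 2 & 1 \\ 3 & 8 \\ 4 & 6 \end{pmatrix}$$
So,
$$A' + B' = \begin{pmatrix} 3 & 5 \\ 5 & 13 \\ 7 & 12 \end{pmatrix}$$
Again,
$$A + B = \begin{pmatrix} 3 & 5 & 7 \\ 5 & 13 & 12 \end{pmatrix}$$
So,
$$(A + B)' = \begin{pmatrix} 3 & 5 \\ 5 & 13 \\ 7 & 12 \end{pmatrix}$$
Therefore, (A + B)′ = A′ + B′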
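The three transpose rules stated earlier can be spot-checked numerically. The sketch below uses the matrices A and B of this example together with a hypothetical 3 × 2 matrix C (chosen only so that the product AC is defined); the helper names are illustrative.

```python
def transpose(m):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*m)]

def add(x, y):
    """Element-wise sum of two matrices of the same order."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def mat_mul(a, b):
    """Row-by-column product; assumes the orders are compatible."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 2, 3], [4, 5, 6]]
B = [[2, 3, 4], [1, 8, 6]]
C = [[1, 0], [0, 1], [2, -1]]   # hypothetical 3 x 2 matrix for the product rule

rule_1 = transpose(transpose(A)) == A                                  # (A')' = A
rule_2 = transpose(add(A, B)) == add(transpose(A), transpose(B))       # (A + B)' = A' + B'
rule_3 = transpose(mat_mul(A, C)) == mat_mul(transpose(C), transpose(A))  # (AC)' = C'A'
```

A check on sample matrices is only a sanity test, not a proof; the general results follow from comparing (i, j)th elements on both sides.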
Square Matrix
A square matrix is a matrix which has the same number of rows and columns. An
n-by-n matrix is known as a square matrix of order n.
Symmetric Matrices
A square matrix A such that A' = A is called a symmetric matrix.
A square matrix A such that A' = – A is called skew symmetric. Its leading diagonal has all
zeros.
$\begin{pmatrix} 5 & a \\ a & 5 \end{pmatrix}$, $\begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{12} & r_{22} & r_{23} \\ r_{13} & r_{23} & r_{33} \end{pmatrix}$ are symmetric matrices.
$\begin{pmatrix} 0 & -1 & 2 \\ 1 & 0 & 3 \\ -2 & -3 & 0 \end{pmatrix}$ is skew symmetric.
Note: If A is a square matrix then, (i) A + A' is symmetric, (ii) A – A' is skew symmetric,
(iii) The square matrix A is the sum of the symmetric matrix $\frac{A + A'}{2}$ and the skew
symmetric matrix $\frac{A - A'}{2}$. These results can be easily proved.
If A, B are square and AB = BA, then A, B are commutative. If AB = – BA, then A, B
are anti-commutative.
If $A^2 = A$, then A is called idempotent.
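The decomposition in note (iii) can be illustrated as follows. This is a sketch assuming integer entries, with exact halves kept by Python's `fractions.Fraction`; the function names are illustrative.

```python
from fractions import Fraction

def transpose(m):
    return [list(col) for col in zip(*m)]

def symmetric_part(a):
    """(A + A') / 2 for a square matrix a with integer entries."""
    n = len(a)
    return [[Fraction(a[i][j] + a[j][i], 2) for j in range(n)] for i in range(n)]

def skew_part(a):
    """(A - A') / 2 for a square matrix a with integer entries."""
    n = len(a)
    return [[Fraction(a[i][j] - a[j][i], 2) for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
S = symmetric_part(A)   # [[1, 5/2], [5/2, 4]]
K = skew_part(A)        # [[0, -1/2], [1/2, 0]]
total = [[s + k for s, k in zip(rs, rk)] for rs, rk in zip(S, K)]
```

Here S equals its own transpose, K equals the negative of its transpose, and S + K gives back A.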
Determinant of a Matrix
A square matrix A has a uniquely defined determinant | A | associated with the matrix.
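The determinant of
$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \quad \text{is} \quad |A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$$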
The determinant of the product of two matrices is the product of their determinants.
| AB | = | A | | B |
Students should verify the above result for,
(i) $A = \begin{pmatrix} 1 & 1 & 1 \\ 4 & -1 & 1 \\ 0 & 1 & -1 \end{pmatrix}$, $B = \begin{pmatrix} 2 & 2 & 1 \\ 1 & 0 & 2 \\ 2 & 1 & 2 \end{pmatrix}$
(ii) $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$
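The product rule | AB | = | A | | B | can be checked numerically. The sketch below evaluates determinants by expansion along the first row (the rule this unit uses for orders three and four) on two sample 3 × 3 matrices; the function names are illustrative and not part of the course text.

```python
def det(m):
    """Determinant by expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        # Matrix left after removing the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def mat_mul(a, b):
    """Row-by-column product; assumes the orders are compatible."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[1, 1, 1], [4, -1, 1], [0, 1, -1]]
B = [[2, 2, 1], [1, 0, 2], [2, 1, 2]]

det_a = det(A)                  # 8
det_b = det(B)                  # 1
det_ab = det(mat_mul(A, B))     # 8, equal to det_a * det_b
```

Expansion along a row takes time that grows very quickly with the order, so this approach is only reasonable for the small matrices used in hand calculations.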
Singular and Non-Singular Matrices
A square matrix A is
(i) Singular if | A | = 0
(ii) Non-singular if | A | ≠ 0.
12 3
Example 6.17: Is square matrix A = 
 Singular or non-singular?
 20 5
12 3
Solution: A = 
 is singular because | A | = 60 – 60 = 0.
 20 5
150
Matrices and Determinants
Adjoint Matrix
The adjoint matrix of A is obtained by replacing the elements of A by their respective
cofactors and then transposing.
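If A = [aij] and B = [Aij], where Aij is the cofactor of aij in A, then the adjoint
matrix of A is written as
adj A = [Aij]' = [Aji]
If
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \quad \text{then} \quad \operatorname{adj} A = \begin{pmatrix} A_{11} & A_{21} & A_{31} \\ A_{12} & A_{22} & A_{32} \\ A_{13} & A_{23} & A_{33} \end{pmatrix}$$
where
$$A_{11} = + \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \quad A_{12} = - \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}, \text{ etc.}$$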
Determinant of a Square Matrix
In matrix algebra, the determinant is a special number associated with any square
matrix. In linear transformation the determinant acts as a scale factor or coefficient
for measure.
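If A is a square matrix, then determinant of A will be denoted by det A or | A |. If
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$
then det A will be denoted by,
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$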
Notes: 1. det A or | A | is defined for square matrix A only.
2. det A or | A | will be defined in such a way that A is invertible if
det A ≠ 0.
3. The determinant of an n × n matrix will be called determinant of order n.
Determinant of Order One
Let A = (a11) be a square matrix of order one. Then det A = a11
By definition, if A is invertible, then a11 ≠ 0 and so, det A ≠ 0. Also, conversely if
det A ≠ 0, then a11 ≠ 0 and so, A is invertible.
Determinant of Order Two
Let $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ be a square matrix of order two. Then we define
det A = a11a22 – a12a21
For example, if $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ then det A = 4 – 6 = – 2
Suppose $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ is invertible.
Then by definition there exists a matrix,
$$B = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$$
where x, y, z, w are complex numbers such that AB = I = BA
The above identity implies,
a11x + a12z = 1, a11 y + a12w = 0
a21x + a22z = 0, a21 y + a22w = 1
which in turn implies
∆x = a22, ∆y = – a12
∆z = – a21, ∆w = a11
where ∆ = a11a22 – a12a21
Clearly ∆ ≠ 0, for otherwise x, y, z, w will be indeterminate. This means that det
A ≠ 0. Conversely, if A is a square matrix of order 2 such that det A ≠ 0, then A is
invertible as,
$$x = \frac{a_{22}}{\Delta}, \quad y = \frac{-a_{12}}{\Delta}, \quad z = \frac{-a_{21}}{\Delta}, \quad w = \frac{a_{11}}{\Delta}$$
will determine B uniquely satisfying AB = I = BA
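The formulas for x, y, z, w give the inverse of a 2 × 2 matrix directly. A minimal sketch, assuming integer entries and using exact fractions (the function name is illustrative):

```python
from fractions import Fraction

def inverse_2x2(a):
    """Inverse of a 2 x 2 integer matrix via x = a22/D, y = -a12/D, z = -a21/D, w = a11/D."""
    (a11, a12), (a21, a22) = a
    delta = a11 * a22 - a12 * a21          # D = det A
    if delta == 0:
        raise ValueError("det A = 0, so A is not invertible")
    return [[Fraction(a22, delta), Fraction(-a12, delta)],
            [Fraction(-a21, delta), Fraction(a11, delta)]]

B = inverse_2x2([[1, 2], [3, 4]])   # det A = -2
```

For A with rows (1, 2) and (3, 4), this gives B with rows (–2, 1) and (3/2, –1/2), and multiplying out confirms AB = I = BA.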
Determinant of Order Three
Let,
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$
be a 3 × 3 matrix.
Then we define,
det A = a11(a22a33 – a32a23) – a12(a21a33 – a31a23) + a13(a21a32 – a31a22)
The above definition may be explained as follows:
The first bracket is determinant of matrix obtained after removing first row and
first column.
The second bracket is determinant of matrix obtained after removing first row and
second column.
The third bracket is determinant of matrix obtained after removing first row and
third column.
The elements before three brackets are first, second, third element respectively of
first row with alternate positive and negative signs.
For example, let $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$
To find det A.
The first bracket in the definition of det A is the determinant of,
$$\begin{pmatrix} 5 & 6 \\ 8 & 9 \end{pmatrix} = 45 - 48 = -3$$
The second bracket is the determinant of,
$$\begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix} = 36 - 42 = -6$$
The third bracket is the determinant of,
$$\begin{pmatrix} 4 & 5 \\ 7 & 8 \end{pmatrix} = 32 - 35 = -3$$
So, det A = 1(– 3) – 2(– 6) + 3(– 3) = – 3 + 12 – 9 = 0
It can be seen that if A is a square matrix of order 3, then A is invertible if det
A ≠ 0.
Determinant of Order Four
Let,
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}$$
Then we define
$$\det A = a_{11} \det \begin{pmatrix} a_{22} & a_{23} & a_{24} \\ a_{32} & a_{33} & a_{34} \\ a_{42} & a_{43} & a_{44} \end{pmatrix} - a_{12} \det \begin{pmatrix} a_{21} & a_{23} & a_{24} \\ a_{31} & a_{33} & a_{34} \\ a_{41} & a_{43} & a_{44} \end{pmatrix} + a_{13} \det \begin{pmatrix} a_{21} & a_{22} & a_{24} \\ a_{31} & a_{32} & a_{34} \\ a_{41} & a_{42} & a_{44} \end{pmatrix} - a_{14} \det \begin{pmatrix} a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \\ a_{41} & a_{42} & a_{43} \end{pmatrix}$$
Note: A determinant $\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}$ of order 2 can also be obtained when we eliminate x, y
from a1x + b1y = 0, a2x + b2y = 0 provided one of x, y is non-zero. Similarly, a
determinant of order 3 can be obtained by eliminating x, y, z from,
a1x + b1y + c1z = 0
a2x + b2y + c2z = 0
a3x + b3y + c3z = 0
provided one of x, y, z is non-zero.
Properties of Determinants
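We list below some important properties of determinants.
1. If two rows (or columns) are interchanged in a determinant it retains its absolute
value but changes its sign.
i.e.,
$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = - \begin{vmatrix} b_1 & b_2 & b_3 \\ a_1 & a_2 & a_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$
2. If rows are changed into columns and columns into rows the determinant remains
unchanged.
i.e.,
$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$
3. If two rows (or columns) are identical in a determinant it vanishes.
i.e.,
$$\begin{vmatrix} a_1 & a_2 & a_3 \\ a_1 & a_2 & a_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = 0$$
4. If any row (or column) is multiplied by a complex number k, the determinant so
obtained is k times the original determinant.
i.e.,
$$\begin{vmatrix} a_1 & a_2 & a_3 \\ kb_1 & kb_2 & kb_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = k \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$
5. If to any row (or column) is added k times the corresponding elements of another
row (or column), the determinant remains unchanged.
i.e.,
$$\begin{vmatrix} a_1 + kb_1 & a_2 + kb_2 & a_3 + kb_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$
6. If each element of any row (or column) is the sum of two or more terms, then the
determinant can be expressed as the sum of two or more determinants.
i.e.,
$$\begin{vmatrix} a_1 + k_1 & a_2 + k_2 & a_3 + k_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} + \begin{vmatrix} k_1 & k_2 & k_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$
7. If a determinant vanishes by putting x = a, then (x – a) is a factor of the determinant.
i.e.,
$$\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix}$$
has (a – b) as one of its factors (by putting a = b, the first and
second columns become identical).
8. If k rows or columns become identical by putting x = a, then
$(x - a)^{k-1}$ is a factor of the determinant.
For example, consider the following determinant:
$$\begin{vmatrix} (b + c)^2 & a^2 & a^2 \\ b^2 & (c + a)^2 & b^2 \\ c^2 & c^2 & (a + b)^2 \end{vmatrix}$$
All the three columns become identical by putting a + b + c = 0 (since then
$(b + c)^2 = a^2$, $(c + a)^2 = b^2$ and $(a + b)^2 = c^2$). So, $(a + b + c)^2$ is
one of the factors of the given determinant.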
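Properties 1 to 5 can be spot-checked on a numerical example. The sketch below uses a hypothetical 3 × 3 matrix M and a small first-row-expansion routine; it illustrates the properties on one matrix and is not a proof.

```python
def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

M = [[2, 0, 1], [1, 3, -1], [4, 1, 5]]                         # sample matrix, det M = 21

swapped = [M[1], M[0], M[2]]                                   # property 1: R1 <-> R2
transposed = [list(col) for col in zip(*M)]                    # property 2: rows <-> columns
repeated = [M[0], M[0], M[2]]                                  # property 3: two identical rows
scaled = [M[0], [3 * x for x in M[1]], M[2]]                   # property 4: R2 -> 3 R2
sheared = [[a + 2 * b for a, b in zip(M[0], M[1])], M[1], M[2]]  # property 5: R1 -> R1 + 2 R2

results = {
    "original": det(M),
    "swapped": det(swapped),
    "transposed": det(transposed),
    "repeated": det(repeated),
    "scaled": det(scaled),
    "sheared": det(sheared),
}
```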
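Example 6.18: Show that,
$$\begin{vmatrix} 1 & a & b + c \\ 1 & b & c + a \\ 1 & c & a + b \end{vmatrix} = 0$$
Solution: Now,
$$\begin{vmatrix} 1 & a & b + c \\ 1 & b & c + a \\ 1 & c & a + b \end{vmatrix} = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ b + c & c + a & a + b \end{vmatrix} \quad \text{[Interchanging rows and columns]}$$
Applying C2 → C2 – C1, C3 → C3 – C1
$$= \begin{vmatrix} 1 & 0 & 0 \\ a & b - a & c - a \\ b + c & a - b & a - c \end{vmatrix} = (a - b)(a - c) \begin{vmatrix} 1 & 0 & 0 \\ a & -1 & -1 \\ b + c & 1 & 1 \end{vmatrix} = 0, \text{ by property 3}$$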
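Example 6.19: Show that,
$$\begin{vmatrix} a - b & b - c & c - a \\ b - c & c - a & a - b \\ c - a & a - b & b - c \end{vmatrix} = 0$$
Solution: Applying R1 → R1 + R2 + R3,
$$\begin{vmatrix} a - b & b - c & c - a \\ b - c & c - a & a - b \\ c - a & a - b & b - c \end{vmatrix} = \begin{vmatrix} 0 & 0 & 0 \\ b - c & c - a & a - b \\ c - a & a - b & b - c \end{vmatrix} = 0$$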
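Example 6.20: Prove that
$$\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = (a - b)(b - c)(c - a)$$
Solution: Applying C2 → C2 – C1 and C3 → C3 – C1,
$$\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ a & b - a & c - a \\ a^2 & b^2 - a^2 & c^2 - a^2 \end{vmatrix} = (b - a)(c - a) \begin{vmatrix} 1 & 0 & 0 \\ a & 1 & 1 \\ a^2 & b + a & c + a \end{vmatrix}$$
= (b – a)(c – a)(c + a – b – a)
= (b – a)(c – a)(c – b)
= (a – b)(b – c)(c – a)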
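Example 6.21: Prove that
$$\begin{vmatrix} a - b - c & 2a & 2a \\ 2b & b - c - a & 2b \\ 2c & 2c & c - a - b \end{vmatrix} = (a + b + c)^3$$
Solution: Applying R1 → R1 + R2 + R3,
$$\begin{vmatrix} a - b - c & 2a & 2a \\ 2b & b - c - a & 2b \\ 2c & 2c & c - a - b \end{vmatrix} = \begin{vmatrix} a + b + c & a + b + c & a + b + c \\ 2b & b - c - a & 2b \\ 2c & 2c & c - a - b \end{vmatrix} = (a + b + c) \begin{vmatrix} 1 & 1 & 1 \\ 2b & b - c - a & 2b \\ 2c & 2c & c - a - b \end{vmatrix}$$
Applying C2 → C2 – C1, C3 → C3 – C1
$$= (a + b + c) \begin{vmatrix} 1 & 0 & 0 \\ 2b & -(a + b + c) & 0 \\ 2c & 0 & -(a + b + c) \end{vmatrix}$$
$= (a + b + c)(a + b + c)^2 = (a + b + c)^3$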
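Example 6.22: Prove that
$$\begin{vmatrix} 1 + a & 1 & 1 \\ 1 & 1 + b & 1 \\ 1 & 1 & 1 + c \end{vmatrix} = abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right)$$
Solution: Taking a, b, c common from R1, R2, R3 respectively,
$$\begin{vmatrix} 1 + a & 1 & 1 \\ 1 & 1 + b & 1 \\ 1 & 1 & 1 + c \end{vmatrix} = abc \begin{vmatrix} \frac{1}{a} + 1 & \frac{1}{a} & \frac{1}{a} \\ \frac{1}{b} & \frac{1}{b} + 1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c} + 1 \end{vmatrix}$$
Applying R1 → R1 + R2 + R3, every element of the first row becomes $1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c}$, so
$$= abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) \begin{vmatrix} 1 & 1 & 1 \\ \frac{1}{b} & \frac{1}{b} + 1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c} + 1 \end{vmatrix}$$
Applying C2 → C2 – C1, C3 → C3 – C1
$$= abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right) \begin{vmatrix} 1 & 0 & 0 \\ \frac{1}{b} & 1 & 0 \\ \frac{1}{c} & 0 & 1 \end{vmatrix} = abc \left( 1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c} \right)$$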
Example 6.23: Prove that x = 2 and x = 3 are roots of the equation,
$$\begin{vmatrix} x - 5 & 2 \\ -3 & x \end{vmatrix} = 0$$
Solution: Now,
$$\begin{vmatrix} x - 5 & 2 \\ -3 & x \end{vmatrix} = 0$$
⇒ x(x – 5) + 6 = 0
⇒ x2 – 5x + 6 = 0
⇒ (x – 3)(x – 2) = 0
⇒ x = 3, x = 2 are roots of the given equation.
Inverse of a Square Matrix
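Consider the matrices,
$$A = \begin{pmatrix} 2 & 0 & -1 \\ 5 & 1 & 0 \\ 0 & 1 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & -1 & 1 \\ -15 & 6 & -5 \\ 5 & -2 & 2 \end{pmatrix}$$
It can be easily seen that,
AB = BA = I (unit matrix)
In this case, we say, B is inverse of A. In fact, we have the following definition.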
‘If A is a square matrix of order n, then a square matrix B of the same order n is
said to be inverse of A if AB = BA = I (unit matrix).’
Notation: Inverse of A is denoted by A– 1
Notes:1. Inverse of a matrix is defined only for square matrices.
2. If B is an inverse of A, then A is also an inverse of B. [Follows clearly by
definition.]
3. If a matrix A has an inverse, then A is said to be invertible.
4. Inverse of a matrix is unique.
For, let B and C be two inverses of A.
Then,
AB = BA = I and AC = CA = I
So,
B = BI = B(AC) = (BA)C = IC = C
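5. Not every square matrix is invertible.
For, let $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$
If A is invertible, let $B = \begin{pmatrix} x & x' \\ y & y' \end{pmatrix}$ be inverse of A.
Then AB = I implies
$$\begin{pmatrix} x + y & x' + y' \\ x + y & x' + y' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
⇒ x + y = 1, x′ + y′ = 0, x + y = 0, x′ + y′ = 1, which is absurd.
This proves our assertion.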
In the present section, we give a method to determine the inverse of a matrix.
Consider the identity A = IA.
We reduce the matrix A on left hand side to the unit matrix I by elementary row
operations only and apply all those operations in same order to the prefactor I on the
right hand side of the above identity. In this way, unit matrix I is reduced to some
matrix B such that I = BA. Matrix B is then the inverse of A.
We illustrate the above method by the following examples.
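The row-reduction method described above can be sketched in Python. This is a sketch under stated assumptions: it carries A and the prefactor side by side as one augmented array [A | I], uses exact fractions, and picks the first non-zero entry in each column as pivot (swapping rows if needed, which is also an elementary row operation). The order of operations may therefore differ from a hand calculation, but the resulting inverse is the same because the inverse is unique.

```python
from fractions import Fraction

def inverse(a):
    """Inverse of a square matrix by elementary row operations on [A | I]."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]                 # make the pivot 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]  # clear the column
    return [row[n:] for row in aug]                          # right half is now the inverse

inv = inverse([[1, 3, 3], [1, 4, 3], [1, 3, 4]])
```

When the left half has been reduced to I, the right half holds the matrix B with I = BA.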
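Example 6.24: Find the inverse of the matrix,
$$A = \begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix}$$
Solution: Consider the identity A = IA, i.e.,
$$\begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} A$$
Applying R2 → R2 – R1, then R3 → R3 – R1, we have,
$$\begin{pmatrix} 1 & 3 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} A$$
Applying R1 → R1 – 3R2 – 3R3, we have,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} A$$
So, the desired inverse is,
$$\begin{pmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}$$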
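Example 6.25: Find the inverse of the matrix,
$$A = \begin{pmatrix} 1 & 3 & -2 \\ -3 & 0 & -5 \\ 2 & 5 & 0 \end{pmatrix}$$
Solution: Consider the identity A = IA, i.e.,
$$\begin{pmatrix} 1 & 3 & -2 \\ -3 & 0 & -5 \\ 2 & 5 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} A$$
Applying R2 → R2 + 3R1, R3 → R3 – 2R1, we have,
$$\begin{pmatrix} 1 & 3 & -2 \\ 0 & 9 & -11 \\ 0 & -1 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix} A$$
Applying R3 → 9R3 and then R3 → R3 + R2, we have,
$$\begin{pmatrix} 1 & 3 & -2 \\ 0 & 9 & -11 \\ 0 & 0 & 25 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -15 & 1 & 9 \end{pmatrix} A$$
Applying R3 → (1/25) R3, we have,
$$\begin{pmatrix} 1 & 3 & -2 \\ 0 & 9 & -11 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix} A$$
Applying R2 → R2 + 11R3, R1 → R1 + 2R3, we have,
$$\begin{pmatrix} 1 & 3 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{5} & \frac{2}{25} & \frac{18}{25} \\ -\frac{18}{5} & \frac{36}{25} & \frac{99}{25} \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix} A$$
Applying R2 → (1/9) R2, we have,
$$\begin{pmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{5} & \frac{2}{25} & \frac{18}{25} \\ -\frac{2}{5} & \frac{4}{25} & \frac{11}{25} \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix} A$$
Applying R1 → R1 – 3R2, we have,
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -\frac{2}{5} & -\frac{3}{5} \\ -\frac{2}{5} & \frac{4}{25} & \frac{11}{25} \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix} A$$
So, the desired inverse is,
$$\begin{pmatrix} 1 & -\frac{2}{5} & -\frac{3}{5} \\ -\frac{2}{5} & \frac{4}{25} & \frac{11}{25} \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix}$$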
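Example 6.26: Find the inverse of the matrix
$$A = \begin{pmatrix} 1 & 2 & -1 \\ -4 & -7 & 4 \\ -4 & -9 & 5 \end{pmatrix}$$
Solution: Consider the identity A = IA, i.e.,
$$\begin{pmatrix} 1 & 2 & -1 \\ -4 & -7 & 4 \\ -4 & -9 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} A$$
Applying R2 → R2 + 4R1, R3 → R3 + 4R1, we have,
$$\begin{pmatrix} 1 & 2 & -1 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 4 & 1 & 0 \\ 4 & 0 & 1 \end{pmatrix} A$$
Applying R1 → R1 + R3 then R3 → R3 + R2, we have,
$$\begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 5 & 0 & 1 \\ 4 & 1 & 0 \\ 8 & 1 & 1 \end{pmatrix} A$$
Applying R1 → R1 – R2, we have
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & -1 & 1 \\ 4 & 1 & 0 \\ 8 & 1 & 1 \end{pmatrix} A$$
So, the desired inverse is,
$$\begin{pmatrix} 1 & -1 & 1 \\ 4 & 1 & 0 \\ 8 & 1 & 1 \end{pmatrix}$$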
Check Your Progress - 1
1. What is a column matrix?
................................................................................................................
................................................................................................................
................................................................................................................
2. When are two matrices A and B said to be equal?
................................................................................................................
................................................................................................................
................................................................................................................
3. List two properties of matrix addition.
................................................................................................................
................................................................................................................
................................................................................................................
6.3 SUBTRACTION OF MATRIX AND SYSTEM OF LINEAR EQUATIONS
The aspects of matrix subtraction and systems of linear equations are discussed here.
By Matrix Inversion Method
A system of linear equations can be solved by the matrix inversion method. In this method,
the system of equations is written as AX = B, where A is the coefficient matrix, X
is the variable matrix and B is the matrix of right hand side values of the system of linear
equations.
For example: 2x + y = 5
9x – y = 3
can be written in the matrix form AX = B as
$$\begin{pmatrix} 2 & 1 \\ 9 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 5 \\ 3 \end{pmatrix}$$
where the first matrix is the coefficient matrix (consisting of the coefficients of x and y), the
second is the variable matrix and the third holds the right hand side values.
For example: 2x + y – z = 3
8x – y + 3z = 2
x + y + z = 7
can be written in the matrix form AX = B as
$$\begin{pmatrix} 2 & 1 & -1 \\ 8 & -1 & 3 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \\ 7 \end{pmatrix}$$
To solve the matrix equation AX = B, we use
$X = A^{-1}B$
as AX = B
$A^{-1}(AX) = A^{-1}B$
$(A^{-1}A)X = A^{-1}B$
$(I)X = A^{-1}B$
$X = A^{-1}B$
To apply this method, two conditions should be fulfilled:
1. The system must have same number of equations as number of variables
(i.e. the coefficient matrix of the system must be square).
2. The determinant of the coefficient matrix must be non zero.
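The two conditions above and the formula X = A–1B can be put together in a short sketch. It is written under stated assumptions: integer coefficients, exact fractions, A–1 formed as adj A divided by | A |, and determinants found by first-row expansion. The names `det`, `adjugate` and `solve` are illustrative.

```python
from fractions import Fraction

def det(m):
    """Determinant by expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjugate(m):
    """Transpose of the matrix of cofactors of m."""
    n = len(m)
    def sub(i, j):
        # Matrix left after deleting row i and column j.
        return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    # Entry in row j, column i of adj A is the cofactor of a_ij.
    return [[(-1) ** (i + j) * det(sub(i, j)) for i in range(n)] for j in range(n)]

def solve(a, b):
    """Solve AX = B as X = (adj A / |A|) B for a square integer matrix a."""
    n = len(a)
    if any(len(row) != n for row in a) or len(b) != n:
        raise ValueError("the coefficient matrix must be square")
    d = det(a)
    if d == 0:
        raise ValueError("the determinant of the coefficient matrix is zero")
    adj = adjugate(a)
    return [sum(Fraction(adj[i][k] * b[k], d) for k in range(n)) for i in range(n)]

# x + 2y = 4, 3x - 5y = 1
solution = solve([[1, 2], [3, -5]], [4, 1])
```

For the system x + 2y = 4, 3x – 5y = 1 this returns x = 2, y = 1, which can be confirmed by substitution.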
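Example 6.27: Solve the system of equations
x + 2y = 4
3x – 5y = 1
Solution: The given equations in the form of a matrix equation are
$$\begin{pmatrix} 1 & 2 \\ 3 & -5 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$$
or
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & -5 \end{pmatrix}^{-1} \begin{pmatrix} 4 \\ 1 \end{pmatrix}$$
Here $A = \begin{pmatrix} 1 & 2 \\ 3 & -5 \end{pmatrix}$ and | A | = (1)(– 5) – (2)(3) = – 11, so
$$A^{-1} = \frac{1}{|A|} \operatorname{adj} A = \frac{1}{-11} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} = \frac{1}{11} \begin{pmatrix} 5 & 2 \\ 3 & -1 \end{pmatrix}$$
∴ $X = A^{-1}B$
$$\begin{pmatrix} x \\ y \end{pmatrix} = \frac{1}{11} \begin{pmatrix} 5 & 2 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 4 \\ 1 \end{pmatrix} = \frac{1}{11} \begin{pmatrix} 22 \\ 11 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$
or x = 2, y = 1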
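Example 6.28: Solve –x + 3y + z = 1
2x + 5y = 3
3x + y – 2z = – 2
Solution: The given system of equations in matrix form is
$$\begin{pmatrix} -1 & 3 & 1 \\ 2 & 5 & 0 \\ 3 & 1 & -2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ -2 \end{pmatrix}$$
$X = A^{-1}B$
Here $A = \begin{pmatrix} -1 & 3 & 1 \\ 2 & 5 & 0 \\ 3 & 1 & -2 \end{pmatrix}$ and
| A | = – 1(– 10 – 0) – 3(– 4 – 0) + 1(2 – 15) = 10 + 12 – 13 = 9, so
$$A^{-1} = \frac{1}{|A|} \operatorname{adj} A = \frac{1}{9} \begin{pmatrix} -10 & 7 & -5 \\ 4 & -1 & 2 \\ -13 & 10 & -11 \end{pmatrix}$$
$X = A^{-1}B$
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \frac{1}{9} \begin{pmatrix} -10 & 7 & -5 \\ 4 & -1 & 2 \\ -13 & 10 & -11 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \\ -2 \end{pmatrix} = \frac{1}{9} \begin{pmatrix} 21 \\ -3 \\ 39 \end{pmatrix}$$
or x = 7/3, y = – 1/3, z = 13/3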
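Example 6.29: Solve 3x + y = 2
4x + 2y = 3
Solution: The given equations in matrix form are
$$\begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$$
or
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix}^{-1} \begin{pmatrix} 2 \\ 3 \end{pmatrix}$$
Given $A = \begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix}$, | A | = (3)(2) – (1)(4) = 2, so
$$A^{-1} = \frac{1}{|A|} \operatorname{adj} A = \frac{1}{2} \begin{pmatrix} 2 & -1 \\ -4 & 3 \end{pmatrix} = \begin{pmatrix} 1 & -\frac{1}{2} \\ -2 & \frac{3}{2} \end{pmatrix}$$
$X = A^{-1}B$
$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & -\frac{1}{2} \\ -2 & \frac{3}{2} \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} \\ \frac{1}{2} \end{pmatrix}$$
or x = 1/2, y = 1/2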
By Pivot Reduction Method
For a system of linear equations AX = B, the coefficient matrix is converted to echelon
form. An element on the left hand side of the matrix is said to be a pivot element when the
elements below it (and, for a full reduction, above it) are made zero; to do this, elementary
row operations are performed.
It can also be done by partially pivoting the matrix and converting it to an upper
triangular matrix by using elementary row operations.
The same operations are also performed on the R.H.S. matrix. So, for a system of linear
equations AX = B, form the augmented matrix C = [A : B] and reduce it to upper
triangular form.
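Example 6.30: Solve 3x + 3y + 4z = 20
x + y + z = 6
2x + y + 3z = 13
Solution: The given system in the form AX = B is
$$\begin{pmatrix} 3 & 3 & 4 \\ 1 & 1 & 1 \\ 2 & 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 20 \\ 6 \\ 13 \end{pmatrix}$$
Augmented matrix C = [A : B]
$$\left( \begin{array}{ccc|c} 3 & 3 & 4 & 20 \\ 1 & 1 & 1 & 6 \\ 2 & 1 & 3 & 13 \end{array} \right)$$
Applying R2 → R2 – (1/3)R1 and R3 → R3 – (2/3)R1,
$$\sim \left( \begin{array}{ccc|c} 3 & 3 & 4 & 20 \\ 0 & 0 & -\frac{1}{3} & -\frac{2}{3} \\ 0 & -1 & \frac{1}{3} & -\frac{1}{3} \end{array} \right)$$
Applying R2 ↔ R3,
$$\sim \left( \begin{array}{ccc|c} 3 & 3 & 4 & 20 \\ 0 & -1 & \frac{1}{3} & -\frac{1}{3} \\ 0 & 0 & -\frac{1}{3} & -\frac{2}{3} \end{array} \right)$$
Now from this matrix
– (1/3)z = – 2/3 (last row)
z = 2
– y + (1/3)z = – 1/3 (second row)
– y + (1/3)(2) = – 1/3
y = 1
and 3x + 3y + 4z = 20 [from first row]
3x + 3(1) + 4(2) = 20
x = 3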
Example 6.31: Solve x + 2y + 3z = 7
– 2x + 3y – z = 5
– x – 2y + 3z = – 1
Solution: The given system in the form AX = B is
$$\begin{pmatrix} 1 & 2 & 3 \\ -2 & 3 & -1 \\ -1 & -2 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 7 \\ 5 \\ -1 \end{pmatrix}$$
Augmented matrix C = [A : B]
$$\left( \begin{array}{ccc|c} 1 & 2 & 3 & 7 \\ -2 & 3 & -1 & 5 \\ -1 & -2 & 3 & -1 \end{array} \right)$$
Applying R2 → R2 + 2R1 and R3 → R3 + R1,
$$\sim \left( \begin{array}{ccc|c} 1 & 2 & 3 & 7 \\ 0 & 7 & 5 & 19 \\ 0 & 0 & 6 & 6 \end{array} \right)$$
⇒ 6z = 6 (from last row)
z = 1
and 7y + 5z = 19
7y + 5(1) = 19
7y = 14
y = 2
and x + 2y + 3z = 7
x + 2(2) + 3(1) = 7
x = 0
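The pivot reduction procedure used in the two examples above (reduce the augmented matrix to upper triangular form, then substitute back from the last row) can be sketched as follows. This is an illustrative sketch with exact fractions; the function name is hypothetical, and a row swap is made whenever the pivot position holds a zero.

```python
from fractions import Fraction

def solve_by_elimination(a, b):
    """Solve AX = B by reducing [A : B] to upper triangular form and back-substituting."""
    n = len(a)
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Choose a row with a non-zero entry in this column and move it into pivot position.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution: a pivot column is entirely zero")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Make every element below the pivot zero.
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    # Back substitution, starting from the last row.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x

first = solve_by_elimination([[3, 3, 4], [1, 1, 1], [2, 1, 3]], [20, 6, 13])
second = solve_by_elimination([[1, 2, 3], [-2, 3, -1], [-1, -2, 3]], [7, 5, -1])
```

The first call reproduces the row swap R2 ↔ R3 of Example 6.30 automatically, because the (2, 2) position is zero after the first elimination step.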
Minors and Cofactors of a Determinant
A minor of a matrix is defined as the determinant of a smaller square matrix which is obtained by removing one or more rows or columns or both from matrix A.
For example, let
\[
A = \begin{bmatrix} 2 & 3 & 9 \\ 8 & 5 & 7 \\ 3 & 6 & 4 \end{bmatrix}
\]
Then
\[
B = \begin{bmatrix} 2 & 3 \\ 8 & 5 \end{bmatrix} \quad \text{(Removing third row and third column)}
\]
\[
C = \begin{bmatrix} 5 & 7 \\ 6 & 4 \end{bmatrix} \quad \text{(Removing first row and first column)}
\]
\[
D = \begin{bmatrix} 8 & 7 \\ 3 & 4 \end{bmatrix} \quad \text{(Removing first row and second column)}
\]
The determinants of the matrices B, C and D are minors of matrix A.
A minor can also be defined as the value of the determinant of the square matrix obtained after removing the row and the column corresponding to the element under consideration.
\[
B_1 = \begin{vmatrix} 2 & 3 \\ 8 & 5 \end{vmatrix} = 10 - 24 = -14
\]
\[
C_1 = \begin{vmatrix} 5 & 7 \\ 6 & 4 \end{vmatrix} = 20 - 42 = -22
\]
\[
D_1 = \begin{vmatrix} 8 & 7 \\ 3 & 4 \end{vmatrix} = 32 - 21 = 11
\]
are minors of matrix A.
The cofactor of the element aij is defined as its minor prefixed with a sign. The sign is taken as positive if i + j is even and negative if i + j is odd.
That is, the cofactor of the element aij is (–1)^(i+j) times the determinant of the matrix obtained by deleting the ith row and the jth column.
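For example, consider the cofactor of a21 in the matrix
\[
A = \begin{bmatrix} 3 & 2 & 1 \\ 1 & 8 & 7 \\ 5 & 6 & 9 \end{bmatrix}
\]
Here a21 = 1. By deleting the second row and the first column, the smaller matrix is
\[
\begin{bmatrix} 2 & 1 \\ 6 & 9 \end{bmatrix}
\]
and the cofactor of a21 is
\[
-\begin{vmatrix} 2 & 1 \\ 6 & 9 \end{vmatrix} = -(18 - 6) = -12
\]
[with a negative sign, as for a21, 2 + 1 = 3 is odd].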
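The value of a determinant is equal to the sum of the products of the elements of a line (a row or a column) by their corresponding cofactors.
\[
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\]
Example 6.32: Find the cofactor of a22 in
\[
\begin{bmatrix} 2 & 5 \\ 9 & 7 \end{bmatrix}
\]
Solution: By deleting the second row and the second column, the value is 2 (with +ve sign, as 2 + 2 = 4 is even).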
Example 6.33: Find the cofactors of a31 and a23 for
\[
A = \begin{bmatrix} 3 & 2 & 1 \\ 0 & 2 & -5 \\ -2 & 1 & 4 \end{bmatrix}
\]
Solution: By deleting the third row and the first column,
\[
A_1 = \begin{bmatrix} 2 & 1 \\ 2 & -5 \end{bmatrix}
\]
\[
\text{Cofactor of } a_{31} = +\begin{vmatrix} 2 & 1 \\ 2 & -5 \end{vmatrix} \quad \text{(the sign is positive, as 3 + 1 is even)}
\]
= 2(–5) – 2(1)
= –10 – 2
= –12
By deleting the second row and the third column,
\[
A_2 = \begin{bmatrix} 3 & 2 \\ -2 & 1 \end{bmatrix}
\]
\[
\text{Cofactor of } a_{23} = -\begin{vmatrix} 3 & 2 \\ -2 & 1 \end{vmatrix}
\]
= –{(3)(1) – (2)(–2)}
= –{3 + 4}
= –7
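The definitions of minor and cofactor, and the rule that a determinant is the sum of the elements of a line multiplied by their cofactors, translate directly into a short recursive routine. The sketch below is illustrative only: the function names are assumptions, rows and columns are numbered from 1 as in the text, and the matrix A is the one of Example 6.33 as it is read from the worked solution there.

```python
def minor(matrix, i, j):
    """Determinant of the submatrix left after deleting row i and column j (1-based)."""
    sub = [
        [value for col, value in enumerate(row, start=1) if col != j]
        for r, row in enumerate(matrix, start=1)
        if r != i
    ]
    return determinant(sub)


def cofactor(matrix, i, j):
    """Minor taken with a + sign when i + j is even and a - sign when i + j is odd."""
    sign = 1 if (i + j) % 2 == 0 else -1
    return sign * minor(matrix, i, j)


def determinant(matrix):
    """Expand along the first row: sum of the elements times their cofactors."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(matrix[0][j - 1] * cofactor(matrix, 1, j)
               for j in range(1, len(matrix) + 1))


# Matrix of Example 6.33 (assumed entries, taken from the worked solution).
A = [[3, 2, 1], [0, 2, -5], [-2, 1, 4]]
print("cofactor of a31:", cofactor(A, 3, 1))
print("cofactor of a23:", cofactor(A, 2, 3))
print("determinant of A:", determinant(A))
```

Expanding along the first row is a design choice made for brevity; expanding along any other row or column gives the same determinant.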
Check Your Progress - 2
1. When can the subtraction of matrices be performed?
2. What is the minor of a matrix defined as?
3. What is the value of a determinant equal to?
6.4 SUMMARY
• A matrix which has exactly one row is called a row matrix. For example,
(1 2 3 4) is a row matrix.
• A matrix which has exactly one column is called a column matrix.
• A matrix in which the number of rows is equal to the number of columns is
called a square matrix.
• A matrix each of whose elements is zero is called a null matrix or zero
matrix.
• A square matrix whose every element other than diagonal elements is zero,
is called a diagonal matrix.
• A diagonal matrix whose diagonal elements are equal, is called a scalar
matrix.
• A square matrix (aij), whose elements aij = 0 when i < j is called a lower
triangular matrix.
• The product AB of two matrices A and B is defined only when the number
of columns of A is same as the number of rows in B and by definition the
product AB is a matrix G of order m × p if A and B were of order m × n
and n × p, respectively.
• A square matrix is a matrix which has the same number of rows and
columns. An n-by-n matrix is known as a square matrix of order n.
• In matrix algebra, the determinant is a special number associated with any
square matrix. In linear transformation the determinant acts as a scale factor
or coefficient for measure.
• If two rows (or columns) are interchanged in a determinant it retains its
absolute value but changes its sign.
• Subtraction of matrices is done element by element. Subtraction of two matrices can be performed only if both matrices are of the same dimensions, i.e., have the same number of rows and columns. It is done by subtracting corresponding elements.
• By the matrix inversion method, a system of linear equations can be solved. In this method the system of equations is written as AX = B, where A is the coefficient matrix, X is the variable matrix and B is the matrix of right-hand-side values of the system of linear equations.
• An element on the left-hand side of the matrix is said to be a pivot element when the elements above and below it are made zero. Elementary operations are performed to achieve this.
• A minor of a matrix is defined as the determinant of a smaller square matrix which is obtained by removing one or more rows or columns or both from matrix A.
6.5 KEY WORDS
• Square Matrix: It is a matrix in which the number of rows is equal to the
number of columns.
• Scalar Matrix: It is a diagonal matrix whose diagonal elements are equal.
• Identity Matrix: It is a diagonal matrix whose diagonal elements are all
equal to 1.
• Row Matrix: It is a matrix which has exactly one row.
6.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. A matrix which has exactly one column is called a column matrix.
2. Two matrices A and B are said to be equal if A and B are of the same order and their corresponding elements are equal.
3. Matrix addition is commutative and associative.
Check Your Progress - 2
1. Subtraction of two matrices can be performed only if both of the matrices
are of same dimensions, i.e., having same number of rows and columns.
2. A minor of a matrix is defined as the determinant of a smaller square
matrix.
3. The value of a determinant is equal to the sum of the products of the
elements of a line by its corresponding cofactors.
6.7 SELF-ASSESSMENT QUESTIONS
1. Briefly define matrices and determinants.
2. List the various types of matrices.
3. Define in short the algebra of matrices.
4. Discuss the various operations on matrices.
5. What do you understand by scalar multiplication of a matrix?
6. Discuss minors and cofactors of determinants in detail.
7. Solve the following
(i) 3x + 3y + 4z = 20
x + y + z = 6
2x + y + 3z = 13
(ii) 3x + y = 2
4x + 2y = 3
(iii) –x + 3y + z = 1
2x + 5y = 3
3x + y – 2z = – 3
8. In an examination of Science, 30 students from college A, 40 students from
college B and 20 students from college C appeared. Only 17 students from
each college could get through the examination. Out of them 18 students
from college A and 7 students from college B and 10 students from college
C secured full marks. Write down the above data in matrix form.
6.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Differentiation
UNIT–7
DIFFERENTIATION
Objectives
After going through this unit, you will be able to:
• Discuss differentiation and its related concepts
• Understand limit and its types
• Analyse continuity in an interval
• Explain geometrical interpretation of continuity
Structure
7.1 Introduction
7.2 Limit
7.3 Differentiability
7.4 Summary
7.5 Key Words
7.6 Answers to ‘Check Your Progress’
7.7 Self-Assessment Questions
7.8 Further Readings
7.1 INTRODUCTION
This unit discusses differentiation. Differentiation is the mathematical process of obtaining the derivative of a function. The process begins by considering a limit of the function. A limit is the number that a function approaches as its independent variable approaches a given value. Limits are classified as left hand limits and right hand limits. A differentiable function of one real variable is a function whose derivative exists at each point in its domain. Functions may have the attributes of both continuity and differentiability. This unit discusses these aspects of differentiation in detail.
7.2 LIMIT
A limit can be defined as the number that a function approaches as the independent variable of the function approaches a given value.
For example, for f(x) = 6x, the limit of f(x) as x approaches 3 is 18, which can be written as
\[
\lim_{x \to 3} 6x = 18.
\]
The limit of a function f(x) is said to be L as x approaches a, i.e. $\lim_{x \to a} f(x) = L$, provided f(x) can be made as close to L as desired for all x sufficiently close to a, from both sides, without actually letting x be a.
Example 7.1. Find $\lim_{x \to 2} (2x^2 + 5x + 3)$.
Solution:
\[
\lim_{x \to 2} (2x^2 + 5x + 3) = 2(2)^2 + 5(2) + 3 = 21
\]
Right Hand Limit: $\lim_{x \to a^+} f(x) = L$, provided f(x) can be made as close to L as desired for all x sufficiently close to a with x > a, without actually letting x be a.
Left Hand Limit: $\lim_{x \to a^-} f(x) = L$, provided f(x) can be made as close to L as desired for all x sufficiently close to a with x < a, without actually letting x be a.
A limit is understood as the function approaching L when x approaches a specific value (a).
[Figure: graph of f(x), with the value a marked on the X-axis and the limit L marked on the Y-axis]
f(x) approaches L on the y-axis as x approaches the value a on the x-axis. This means that f(x) approaches L as x approaches a from the right or from the left.
$\lim_{x \to a} f(x)$ is said to exist when both the left and right hand limits exist and are equal:
\[
\lim_{x \to a} f(x) = L \quad \text{when} \quad \lim_{x \to a^+} f(x) = \lim_{x \to a^-} f(x) = L
\]
or, equivalently, f(x) → L when x → a+ and when x → a–.
Example 7.2. Find $\lim_{x \to 3} (x^2 + 9)$.
Solution:
\[
\lim_{x \to 3} (x^2 + 9) = (3)^2 + 9 = 18
\]
f(x) approaches 18 as x approaches 3.
Example 7.3. Find $\lim_{x \to 3} \dfrac{x^2 - 9}{x - 3}$.
Solution: As f(x) is not defined at x = 3, the common factor is cancelled before the limit is taken:
\[
\lim_{x \to 3} \frac{x^2 - 9}{x - 3} = \lim_{x \to 3} \frac{(x - 3)(x + 3)}{(x - 3)} = \lim_{x \to 3} (x + 3) = 6
\]
f(x) approaches 6 as x approaches 3.
The limit can also be defined as follows. Let f(x) be a function defined on an interval containing x = a, except possibly at x = a. Then
\[
\lim_{x \to a} f(x) = L
\]
if for every ε > 0 there exists δ > 0 such that |f(x) – L| < ε whenever 0 < |x – a| < δ.
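The idea of approaching a from both sides without ever letting x be a can be illustrated numerically. The sketch below tabulates the function of Example 7.3 at points closer and closer to 3; the helper name `one_sided_values` is an assumption made for this illustration. A table of this kind suggests the value of a limit but does not prove it.

```python
def f(x):
    """Function from Example 7.3; it is undefined at x = 3."""
    return (x**2 - 9) / (x - 3)


def one_sided_values(func, a, steps=5):
    """Evaluate func at points approaching a from the left and from the right."""
    rows = []
    for k in range(1, steps + 1):
        h = 10.0 ** (-k)
        rows.append((h, func(a - h), func(a + h)))
    return rows


for h, left, right in one_sided_values(f, 3.0):
    print(f"h = {h:<8g} f(3 - h) = {left:.6f}   f(3 + h) = {right:.6f}")
```

Both columns settle towards the same number, which is the behaviour required for the left hand and right hand limits to agree.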
Theorems on Limits
Let $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ both exist and let k be any constant. Then
(i) $\lim_{x \to a} [k f(x)] = k \lim_{x \to a} f(x)$
(ii) $\lim_{x \to a} [f(x) \pm g(x)] = \lim_{x \to a} f(x) \pm \lim_{x \to a} g(x)$
(iii) $\lim_{x \to a} [f(x) g(x)] = \lim_{x \to a} f(x) \cdot \lim_{x \to a} g(x)$
(iv) $\lim_{x \to a} \dfrac{f(x)}{g(x)} = \dfrac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}$ [if $\lim_{x \to a} g(x) \neq 0$]
(v) $\lim_{x \to a} k = k$
(vi) $\lim_{x \to a} [f(x)]^n = \left[\lim_{x \to a} f(x)\right]^n$, where n is a positive integer
(vii) $\lim_{x \to a} \sqrt[n]{f(x)} = \sqrt[n]{\lim_{x \to a} f(x)}$, if $\lim_{x \to a} f(x) > 0$ when n is even
Example 7.4. Find $\lim_{x \to 1} (4x^2 - 2x + 1)$.
Solution:
\[
\lim_{x \to 1} (4x^2 - 2x + 1) = 4 \lim_{x \to 1} x^2 - 2 \lim_{x \to 1} x + 1 = 4(1)^2 - 2(1) + 1 = 3
\]
Example 7.5. If $\lim_{x \to a} f(x) = 5$ and $\lim_{x \to a} g(x) = -1$, find $\lim_{x \to a} \dfrac{2 f(x) - 3 g(x)}{f(x) + g(x)}$.
Solution:
\[
\lim_{x \to a} \frac{2 f(x) - 3 g(x)}{f(x) + g(x)}
= \frac{2 \lim_{x \to a} f(x) - 3 \lim_{x \to a} g(x)}{\lim_{x \to a} f(x) + \lim_{x \to a} g(x)}
= \frac{2(5) - 3(-1)}{5 + (-1)}
= \frac{13}{4}
\]
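Example 7.6. Find $\lim_{x \to 3} \dfrac{x^2 - 1}{2x}$.
Solution:
\[
\lim_{x \to 3} \frac{x^2 - 1}{2x}
= \frac{\lim_{x \to 3} x^2 - \lim_{x \to 3} 1}{2 \lim_{x \to 3} x}
= \frac{(3)^2 - 1}{2(3)}
= \frac{8}{6} = \frac{4}{3}
\]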
Continuity of a Function
A function f(x) is said to be continuous at the point x = a if $\lim_{x \to a} f(x)$ exists and is equal to f(a), that is, if
\[
\lim_{x \to a} f(x) = f(a).
\]
A function is said to be continuous on the interval [a1, a2] if it is continuous at each point in the interval.
A function is continuous if it has no holes or jumps, that is, if it is continuous at every point of its domain. A function f(x) is said to be continuous at the point x = a if the following three conditions are satisfied:
1. f(a) is defined
2. $\lim_{x \to a} f(x)$ exists
3. $\lim_{x \to a} f(x) = f(a)$
Let f be a real function on a subset of the real numbers and let a be a point in the domain of f. Then f is continuous at a if
\[
\lim_{x \to a} f(x) = f(a)
\]
More elaborately, if the left hand limit, the right hand limit and the value of the function at x = a exist and are equal to each other, i.e.,
\[
\lim_{x \to a^-} f(x) = f(a) = \lim_{x \to a^+} f(x),
\]
then f is said to be continuous at x = a.
Continuity in an Interval
(i) f is said to be continuous in an open interval (A, B) if it is continuous at every point in this interval.
(ii) f is said to be continuous in the closed interval [A, B] if
• f is continuous in (A, B)
• $\lim_{x \to A^+} f(x) = f(A)$
• $\lim_{x \to B^-} f(x) = f(B)$
Geometrical Interpretation of Continuity
(i) Function f will be continuous at x = c if there is no break in the graph of the
function at the point (c, f(c)).
(ii) In an interval, function is said to be continuous if there is no break in the
graph of the function in the entire interval.
Continuity of Some of the Common Functions
Each function f(x) below is followed by the interval in which f is continuous.
1. The constant function, i.e. f(x) = a. Interval: R
2. The identity function, i.e. f(x) = x. Interval: R
3. The polynomial function, i.e. $f(x) = a_0 x^n + a_1 x^{n-1} + \dots + a_{n-1} x + a_n$. Interval: R
4. |x – a|. Interval: (–∞, ∞)
5. $x^{-n}$, where n is a positive integer. Interval: (–∞, ∞) – {0}
6. p(x)/q(x), where p(x) and q(x) are polynomials in x. Interval: R – {x : q(x) = 0}
7. sin x, cos x. Interval: R
8. tan x, sec x. Interval: R – {(2n + 1)π/2 : n ∈ Z}
9. cot x, cosec x. Interval: R – {nπ : n ∈ Z}
10. $e^x$. Interval: R
11. log x. Interval: (0, ∞)
Properties of Continuous Function
If the functions f and g are continuous at c, then
1. f + g is continuous at c: the sum of two continuous functions is continuous.
2. f – g is continuous at c: the difference of two continuous functions is continuous.
3. f.g is continuous at c: the product of two continuous functions is continuous.
4. f/g is continuous at c if g(c) ≠ 0 and is discontinuous at c if g(c) = 0. The quotient of two continuous functions is continuous where it is defined. (It is not defined where the denominator is zero.) All rational functions (quotients of two polynomials) are continuous where they are defined.
5. Polynomials are continuous functions.
Intermediate-value Theorem
If f(x) is continuous on a closed interval [a, b] and c is any number between f(a) and f(b), inclusive, then there is at least one number x in the interval [a, b] such that f(x) = c.
6. If f(x) is continuous on [a, b], and if f(a) and f(b) have opposite signs, then there is at least one solution of the equation f(x) = 0 in the interval (a, b).
7. The Extreme Value Theorem. If a function is continuous on a closed interval, then it takes on a maximum value and a minimum value. Symbolically, if f is continuous on [a, b], then there is some c in [a, b] such that f(c) ≥ f(x) for all x in [a, b]. Likewise, there is some d in [a, b] such that f(d) ≤ f(x) for all x in [a, b].
8. The composition of two continuous functions is continuous. So, for example, the square root function is continuous, so the square root of a continuous (non-negative) function is another continuous function.
9. The inverse of a continuous one-to-one function defined on an interval is continuous.
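Property 6 above, a consequence of the intermediate-value theorem, is the basis of the bisection method for locating a solution of f(x) = 0. The following is a minimal sketch under stated assumptions: the function name `bisect_root`, the tolerance and the sample cubic are illustrative choices, not part of the text.

```python
def bisect_root(f, a, b, tol=1e-10, max_iter=200):
    """Locate a root of a continuous f on [a, b] where f(a) and f(b) differ in sign."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0 or (b - a) / 2 < tol:
            return mid
        # Keep the half-interval on which the sign change persists;
        # continuity guarantees that a root still lies inside it.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return (a + b) / 2


# Sample continuous function: f(1) = -2 and f(2) = 4 have opposite signs.
root = bisect_root(lambda x: x**3 - x - 2, 1.0, 2.0)
print(round(root, 6))
```

The method relies on continuity: for a function with a jump, a sign change between f(a) and f(b) would not guarantee a root.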
Check Your Progress - 1
1. How can limit be defined?
2. What is the composition of two continuous functions?
7.3 DIFFERENTIABILITY
A differentiable function of one real variable is a function whose derivative exists at each point in its domain. As a result, the graph of a differentiable function must have a (non-vertical) tangent line at each point in its domain, be relatively smooth, and cannot contain any breaks, bends, or cusps.
Differentiation
Let f be a function defined on an open interval I and let a be a point of I. The function f is said to be differentiable at a if and only if the rate of change of the function f at a has a finite limit l, i.e.:
\[
\lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = l
\]
l is called the derived number (derivative) of f at a and is denoted f′(a).
When the function f is differentiable on an interval I, the derivative function, denoted f′, is the function that assigns to each x of I the derived number f′(x).
Differentiation is the algebraic procedure of calculating derivatives. The derivative of a function is the slope or the gradient of the curve (graph) at any given point. The gradient of a curve at any given point is the gradient of the tangent drawn to that curve at the given point. The differentiation process is therefore useful in calculating the gradient of the curve at any point.
Another definition of the derivative is “the change of a property with respect to a unit change of another property.”
Let f(x) be a function of an independent variable x. If a small change (∆x) is caused in the independent variable x, a corresponding change ∆f(x) is caused in the function f(x); the ratio ∆f(x)/∆x is then a measure of the rate of change of f(x) with respect to x. The limiting value of this ratio as ∆x tends to zero, $\lim_{\Delta x \to 0} \dfrac{\Delta f(x)}{\Delta x}$, is called the first derivative of the function f(x) with respect to x; in other words, the instantaneous rate of change of f(x) at a given point x.
Graphical Interpretation of Differentiation
When f is differentiable at a, the graph of the function f has, at the point A(a, f(a)), a tangent line with slope f′(a), whose equation is:
(T) : y = f′(a) (x – a) + f(a)
Numerical Interpretation of Differentiation
When a function f is differentiable at a, a good linear approximation of f(a + h), as a + h approaches a, is:
f(a + h) ≈ f(a) + hf′(a)
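The linear approximation f(a + h) ≈ f(a) + hf′(a) can be tried on a concrete function. The sketch below uses the square root function near a = 4, which is an illustrative choice not taken from the text; the helper name `linear_approximation` is likewise an assumption.

```python
import math


def linear_approximation(f, f_prime, a, h):
    """Return f(a) + h * f'(a), the tangent-line estimate of f(a + h)."""
    return f(a) + h * f_prime(a)


def f(x):
    return math.sqrt(x)


def f_prime(x):
    # Derivative of the square root function: 1 / (2 * sqrt(x)).
    return 1 / (2 * math.sqrt(x))


for h in (0.5, 0.1, 0.01):
    estimate = linear_approximation(f, f_prime, 4.0, h)
    exact = f(4.0 + h)
    print(f"h = {h:<5} estimate = {estimate:.6f}  exact = {exact:.6f}  "
          f"error = {abs(estimate - exact):.2e}")
```

The error shrinks much faster than h itself, which is what makes the tangent line a good local approximation.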
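Formula List of Derivatives
Let u = f(x) and v = g(x) represent differentiable functions of x.
Derivative of a constant: $\dfrac{dc}{dx} = 0$
Derivative of a constant multiple: $\dfrac{d}{dx}(cu) = c\,\dfrac{du}{dx}$
(We could also write (cf)′ = cf′, and could use the “prime notation” in the other formulas as well.)
Derivative of a sum or difference: $\dfrac{d}{dx}(u \pm v) = \dfrac{du}{dx} \pm \dfrac{dv}{dx}$
Product Rule: $\dfrac{d}{dx}(uv) = u\,\dfrac{dv}{dx} + v\,\dfrac{du}{dx}$
Quotient Rule: $\dfrac{d}{dx}\left(\dfrac{u}{v}\right) = \dfrac{v\,\dfrac{du}{dx} - u\,\dfrac{dv}{dx}}{v^2}$
Chain Rule: $\dfrac{dy}{dx} = \dfrac{dy}{du} \cdot \dfrac{du}{dx}$
In each line below, the first formula is the basic form and the second is the corresponding chain rule form.
$\dfrac{d}{dx} x^n = n x^{n-1}$ ; $\dfrac{d}{dx} u^n = n u^{n-1}\,\dfrac{du}{dx}$
$\dfrac{d}{dx} a^x = (\ln a)\,a^x$ ; $\dfrac{d}{dx} a^u = (\ln a)\,a^u\,\dfrac{du}{dx}$
(If a = e) $\dfrac{d}{dx} e^x = e^x$ ; $\dfrac{d}{dx} e^u = e^u\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \log_a x = \dfrac{1}{(\ln a)\,x}$ ; $\dfrac{d}{dx} \log_a u = \dfrac{1}{(\ln a)\,u}\,\dfrac{du}{dx}$
(If a = e) $\dfrac{d}{dx} \ln x = \dfrac{1}{x}$ ; $\dfrac{d}{dx} \ln u = \dfrac{1}{u}\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \sin x = \cos x$ ; $\dfrac{d}{dx} \sin u = \cos u\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \cos x = -\sin x$ ; $\dfrac{d}{dx} \cos u = -\sin u\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \tan x = \sec^2 x$ ; $\dfrac{d}{dx} \tan u = \sec^2 u\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \cot x = -\csc^2 x$ ; $\dfrac{d}{dx} \cot u = -\csc^2 u\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \sec x = \sec x \tan x$ ; $\dfrac{d}{dx} \sec u = \sec u \tan u\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \csc x = -\csc x \cot x$ ; $\dfrac{d}{dx} \csc u = -\csc u \cot u\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \sin^{-1} x = \dfrac{1}{\sqrt{1 - x^2}}$ ; $\dfrac{d}{dx} \sin^{-1} u = \dfrac{1}{\sqrt{1 - u^2}}\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \tan^{-1} x = \dfrac{1}{1 + x^2}$ ; $\dfrac{d}{dx} \tan^{-1} u = \dfrac{1}{1 + u^2}\,\dfrac{du}{dx}$
The same two results in the alternative notation:
$\dfrac{d}{dx} \arcsin x = \dfrac{1}{\sqrt{1 - x^2}}$ ; $\dfrac{d}{dx} \arcsin u = \dfrac{1}{\sqrt{1 - u^2}}\,\dfrac{du}{dx}$
$\dfrac{d}{dx} \arctan x = \dfrac{1}{1 + x^2}$ ; $\dfrac{d}{dx} \arctan u = \dfrac{1}{1 + u^2}\,\dfrac{du}{dx}$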
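A formula from the list can be sanity-checked numerically by comparing it with a difference quotient. The sketch below does this for the product, quotient and chain rules; the sample functions, the point x0 = 0.7 and the helper name `numerical_derivative` are all illustrative assumptions. Agreement to several decimal places supports a formula but is not a proof.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x0 = 0.7

# Product rule: d(uv)/dx = u dv/dx + v du/dx with u = x^2, v = sin x.
product_formula = x0**2 * math.cos(x0) + math.sin(x0) * 2 * x0
product_numeric = numerical_derivative(lambda x: x**2 * math.sin(x), x0)

# Quotient rule: d(u/v)/dx = (v du/dx - u dv/dx) / v^2 with u = sin x, v = x^2.
quotient_formula = (x0**2 * math.cos(x0) - math.sin(x0) * 2 * x0) / x0**4
quotient_numeric = numerical_derivative(lambda x: math.sin(x) / x**2, x0)

# Chain rule: d(e^u)/dx = e^u du/dx with u = 3x^2.
chain_formula = math.exp(3 * x0**2) * 6 * x0
chain_numeric = numerical_derivative(lambda x: math.exp(3 * x**2), x0)

for name, formula, numeric in [
    ("product", product_formula, product_numeric),
    ("quotient", quotient_formula, quotient_numeric),
    ("chain", chain_formula, chain_numeric),
]:
    print(f"{name:<9} formula = {formula:.6f}  numeric = {numeric:.6f}")
```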
Differentiability
The function defined by
\[
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h},
\]
wherever the limit exists, is defined to be the derivative of f at x. In other words, we say that a function f is differentiable at a point a in its domain if both
\[
\lim_{h \to 0^-} \frac{f(a + h) - f(a)}{h},
\]
called the left hand derivative and denoted by Lf′(a), and
\[
\lim_{h \to 0^+} \frac{f(a + h) - f(a)}{h},
\]
called the right hand derivative and denoted by Rf′(a), are finite and equal.
(i) The function y = f(x) is said to be differentiable in an open interval (A, B) if it is differentiable at every point of (A, B).
(ii) The function y = f(x) is said to be differentiable in the closed interval [A, B] if Rf′(A) and Lf′(B) exist and f′(x) exists for every point of (A, B).
(iii) Every differentiable function is continuous, but the converse is not true.
Solved Examples
Example 7.7. Find the value of the constant k so that the function f defined below is continuous at x = 0, where
\[
f(x) = \begin{cases} \dfrac{1 - \cos 4x}{8x^2}, & x \neq 0 \\ k, & x = 0 \end{cases}
\]
Solution: It is given that the function f is continuous at x = 0. Therefore,
\[
\lim_{x \to 0} f(x) = f(0)
\]
\[
\Rightarrow \lim_{x \to 0} \frac{1 - \cos 4x}{8x^2} = k
\]
\[
\Rightarrow \lim_{x \to 0} \frac{2 \sin^2 2x}{8x^2} = k
\]
\[
\Rightarrow \lim_{x \to 0} \left(\frac{\sin 2x}{2x}\right)^2 = k
\]
⇒ k = 1
Thus, f is continuous at x = 0 if k = 1.
Example 7.8. Discuss the continuity of the function f(x) = sin x ⋅ cos x.
Solution: Since sin x and cos x are continuous functions and the product of two continuous functions is a continuous function, f(x) = sin x ⋅ cos x is a continuous function.
Example 7.9. Let f be a piecewise function defined by:
\[
f(x) = \begin{cases} x^2 - 2x - 2 & \text{if } x \le 1 \\ \dfrac{x - 4}{x} & \text{if } x > 1 \end{cases}
\]
Let us study the continuity and differentiability of this function at 1.
• Continuity at 1. Left-hand continuity at 1 is not a problem, because a polynomial is continuous on (–∞, 1]. For the right-hand continuity:
\[
\lim_{x \to 1^+} \frac{x - 4}{x} = -3 \quad \text{and} \quad f(1) = 1^2 - 2 \times 1 - 2 = -3
\]
So $\lim_{x \to 1^+} \dfrac{x - 4}{x} = f(1)$, and the function f is continuous at 1.
• Differentiability at 1. Left-hand differentiability at 1 is not a problem because a polynomial is differentiable on (–∞, 1].
If x ≤ 1, we have f′(x) = 2x – 2, so the left-hand derivative is f′₋(1) = 0.
As for right-hand differentiability, we have to revert to the definition. We can then carry out the following calculation for h > 0:
\[
\frac{f(1 + h) - f(1)}{h} = \frac{\dfrac{1 + h - 4}{1 + h} + 3}{h} = \frac{4h}{h(1 + h)} = \frac{4}{1 + h}
\]
So $\lim_{h \to 0^+} \dfrac{4}{1 + h} = 4$, and the right-hand derivative is f′₊(1) = 4.
As f′₋(1) ≠ f′₊(1), the function f is not differentiable at 1.
Graphically, the graph of f is a single unbroken curve with a corner (angular point) at the point A(1, –3).
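The left hand and right hand derivatives of the piecewise function in Example 7.9 can be estimated with one-sided difference quotients. This is a minimal sketch assuming the function is x² – 2x – 2 for x ≤ 1 and (x – 4)/x for x > 1, as used in the worked solution; the helper name `difference_quotient` is an assumption.

```python
def f(x):
    """Piecewise function of Example 7.9."""
    if x <= 1:
        return x**2 - 2 * x - 2
    return (x - 4) / x


def difference_quotient(func, a, h):
    """Rate of change (func(a + h) - func(a)) / h; h may be negative."""
    return (func(a + h) - func(a)) / h


for h in (1e-1, 1e-3, 1e-5):
    left = difference_quotient(f, 1.0, -h)   # approach 1 from the left
    right = difference_quotient(f, 1.0, h)   # approach 1 from the right
    print(f"h = {h:g}: left quotient = {left:.5f}, right quotient = {right:.5f}")
```

The two columns settle on different values, which is the numerical signature of a function that is continuous but not differentiable at the point.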
Continuity and Differentiability
Example 7.10. Suppose we want to determine whether the function
\[
f(x) = \begin{cases} \dfrac{x^2 - 6x + 8}{x - 2} & \text{if } x \neq 2 \\ 3 & \text{if } x = 2 \end{cases}
\]
is differentiable at x = 2. We first make sure that it is continuous at x = 2. Since
\[
\lim_{x \to 2} f(x) = \lim_{x \to 2} \frac{(x - 4)(x - 2)}{x - 2} = 2 - 4 = -2
\]
and –2 ≠ f(2) = 3, f(x) is not continuous at x = 2, so it cannot be differentiable at x = 2.
Suppose the value 3 is changed to –2:
\[
f(x) = \begin{cases} \dfrac{x^2 - 6x + 8}{x - 2} & \text{if } x \neq 2 \\ -2 & \text{if } x = 2 \end{cases}
\]
and we want to know whether this function is differentiable at x = 2. Now f(x) is continuous, so we can move on to differentiability. There are two ways to see that f(x) is differentiable. First, notice that f(x) is just the line x – 4, since we can rewrite f(x) as
\[
f(x) = \begin{cases} x - 4 & \text{if } x \neq 2 \\ -2 & \text{if } x = 2 \end{cases}
\]
so all we did was remove the point (2, –2) from the line y = x – 4 and then fill it in again. The other way is to show that
\[
\lim_{x \to 2} \frac{f(x) - f(2)}{x - 2}
\]
exists. Using the rewritten form of f(x) makes this limit easier to evaluate, and the check is left to the reader.
Determine a and b so that the function
\[
f(x) = \begin{cases} 9\cos(x) & \text{if } x < \pi \\ ax + b & \text{if } x \ge \pi \end{cases}
\]
is differentiable at x = π. Again, we need to check continuity first. Observe that
\[
\lim_{x \to \pi^-} 9\cos(x) = 9\cos(\pi) = -9, \qquad \lim_{x \to \pi^+} (ax + b) = \pi a + b
\]
so in order for f to be continuous at x = π, we need πa + b = –9. Since f(π) = πa + b, this would guarantee continuity. To actually solve for a and b, we can now check differentiability at x = π. Instead of using the limit definition, we can say that
\[
f'(x) = \begin{cases} -9\sin(x) & \text{if } x < \pi \\ ? & \text{if } x = \pi \\ a & \text{if } x > \pi \end{cases}
\]
since each piece individually is differentiable, so the only question is x = π. However, all we need is for the slopes from the left and right of π to agree, i.e.
\[
\lim_{x \to \pi^-} f'(x) = \lim_{x \to \pi^+} f'(x)
\]
Checking this gives the equation 0 = a. Substituting this into the continuity condition gives b = –9. So with these two values, f(x) is differentiable at x = π. This works because each piece of f′(x) is continuous individually.
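As a further illustration of evaluating limits, consider
\[
\lim_{x \to 4} \frac{\sqrt{5 - x} - 1}{2 - \sqrt{x}}.
\]
This is a 0/0 limit. Here, we could multiply by the conjugate of the denominator to get
\[
\lim_{x \to 4} \frac{(\sqrt{5 - x} - 1)(2 + \sqrt{x})}{4 - x}
\]
but nothing cancels. This tells us that we should also multiply by the conjugate of the numerator:
\[
\lim_{x \to 4} \frac{(\sqrt{5 - x} - 1)(\sqrt{5 - x} + 1)(2 + \sqrt{x})}{(4 - x)(\sqrt{5 - x} + 1)}
= \lim_{x \to 4} \frac{(4 - x)(2 + \sqrt{x})}{(4 - x)(\sqrt{5 - x} + 1)}
= \frac{4}{2} = 2
\]
Example 7.11. Find the derivative of $\sqrt{\tan \sqrt{x}}$.
Solution: Let $y = \sqrt{\tan \sqrt{x}}$. Using the chain rule, we have
\[
\frac{dy}{dx} = \frac{1}{2\sqrt{\tan \sqrt{x}}} \cdot \frac{d}{dx}\left(\tan \sqrt{x}\right)
\]
\[
= \frac{1}{2\sqrt{\tan \sqrt{x}}} \cdot \sec^2 \sqrt{x} \cdot \frac{d}{dx}\left(\sqrt{x}\right)
\]
\[
= \frac{1}{2\sqrt{\tan \sqrt{x}}} \cdot \left(\sec^2 \sqrt{x}\right) \cdot \frac{1}{2\sqrt{x}}
\]
\[
= \frac{\sec^2 \sqrt{x}}{4\sqrt{x}\,\sqrt{\tan \sqrt{x}}}.
\]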
Example 7.12. If y = tan (x + y), find $\dfrac{dy}{dx}$.
Solution: Given y = tan (x + y). Differentiating both sides w.r.t. x, we have
\[
\frac{dy}{dx} = \sec^2(x + y)\,\frac{d}{dx}(x + y) = \sec^2(x + y)\left(1 + \frac{dy}{dx}\right)
\]
or
\[
\left[1 - \sec^2(x + y)\right]\frac{dy}{dx} = \sec^2(x + y)
\]
Therefore,
\[
\frac{dy}{dx} = \frac{\sec^2(x + y)}{1 - \sec^2(x + y)} = -\operatorname{cosec}^2(x + y).
\]
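Example 7.13. If $e^x + e^y = e^{x+y}$, find $\dfrac{dy}{dx}$.
Solution: Given that $e^x + e^y = e^{x+y}$. Differentiating both sides w.r.t. x, we have
\[
e^x + e^y\,\frac{dy}{dx} = e^{x+y}\left(1 + \frac{dy}{dx}\right)
\]
or
\[
\left(e^y - e^{x+y}\right)\frac{dy}{dx} = e^{x+y} - e^x,
\]
which implies that
\[
\frac{dy}{dx} = \frac{e^{x+y} - e^x}{e^y - e^{x+y}} = \frac{e^x\left(e^y - 1\right)}{e^y\left(1 - e^x\right)}.
\]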
Example 7.14. If $x^y = e^{x-y}$, prove that
\[
\frac{dy}{dx} = \frac{\log x}{(1 + \log x)^2}
\]
Solution: We have $x^y = e^{x-y}$. Taking logarithms on both sides, we get
y log x = x – y
⇒ y(1 + log x) = x
i.e. $y = \dfrac{x}{1 + \log x}$
Differentiating both sides w.r.t. x, we get
\[
\frac{dy}{dx} = \frac{(1 + \log x) \cdot 1 - x \cdot \dfrac{1}{x}}{(1 + \log x)^2} = \frac{\log x}{(1 + \log x)^2}
\]
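The result of Example 7.14 can be checked numerically, because the relation can be solved explicitly as y = x/(1 + log x), with log denoting the natural logarithm. The sketch below is illustrative only; the function names and the sample points are assumptions. It verifies that the pair (x, y) satisfies the original relation and compares the claimed derivative with a central difference.

```python
import math


def y(x):
    """Explicit solution y = x / (1 + log x) of x^y = e^(x - y)."""
    return x / (1 + math.log(x))


def dy_dx_formula(x):
    """Derivative stated in Example 7.14."""
    return math.log(x) / (1 + math.log(x)) ** 2


def central_difference(f, x, h=1e-6):
    """Numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


for x in (1.5, 2.0, 5.0):
    # The pair (x, y(x)) should satisfy the original relation x^y = e^(x - y).
    residual = x ** y(x) - math.exp(x - y(x))
    print(f"x = {x}: relation residual = {residual:.1e}, "
          f"formula = {dy_dx_formula(x):.6f}, "
          f"numeric = {central_difference(y, x):.6f}")
```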
Example 7.15. If y = tan x + sec x, prove that
\[
\frac{d^2 y}{dx^2} = \frac{\cos x}{(1 - \sin x)^2}
\]
Solution: We have y = tan x + sec x. Differentiating w.r.t. x, we get
\[
\frac{dy}{dx} = \sec^2 x + \sec x \tan x
= \frac{1}{\cos^2 x} + \frac{\sin x}{\cos^2 x}
= \frac{1 + \sin x}{\cos^2 x}
= \frac{1 + \sin x}{(1 + \sin x)(1 - \sin x)}
\]
Thus
\[
\frac{dy}{dx} = \frac{1}{1 - \sin x}.
\]
Now, differentiating again w.r.t. x, we get
\[
\frac{d^2 y}{dx^2} = \frac{-(-\cos x)}{(1 - \sin x)^2} = \frac{\cos x}{(1 - \sin x)^2}.
\]
Example 7.16. If
\[
f(x) = \begin{cases} \dfrac{x^3 + x^2 - 16x + 20}{(x - 2)^2}, & x \neq 2 \\ k, & x = 2 \end{cases}
\]
is continuous at x = 2, find the value of k.
Solution: Given f(2) = k.
Now,
\[
\lim_{x \to 2^-} f(x) = \lim_{x \to 2^+} f(x) = \lim_{x \to 2} \frac{x^3 + x^2 - 16x + 20}{(x - 2)^2}
= \lim_{x \to 2} \frac{(x + 5)(x - 2)^2}{(x - 2)^2}
= \lim_{x \to 2} (x + 5) = 7
\]
As f is continuous at x = 2, we have
\[
\lim_{x \to 2} f(x) = f(2)
\]
⇒ k = 7.
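Example 7.17. Show that the function f defined by
\[
f(x) = \begin{cases} x \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0 \end{cases}
\]
is continuous at x = 0.
Solution: The left hand limit at x = 0 is given by
\[
\lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} x \sin \frac{1}{x} = 0,
\]
since $-1 \le \sin \dfrac{1}{x} \le 1$, so that $\left| x \sin \dfrac{1}{x} \right| \le |x|$.
Similarly,
\[
\lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} x \sin \frac{1}{x} = 0.
\]
Moreover, f(0) = 0.
Thus $\lim_{x \to 0^-} f(x) = \lim_{x \to 0^+} f(x) = f(0)$. Hence f is continuous at x = 0.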
Check Your Progress - 2
1. What is a differentiable function of one real variable?
2. What is the derivative of a function?
3. How can the gradient of a curve at any point be determined?
7.4 SUMMARY
• Limit can be defined as a number that a function approaches as the independent variable of the function approaches a given value.
• Limit is understood as the function is approaching L when x approaches a specific value (a).
• If f(x) is continuous on a closed interval [a, b] and c is any number between f(a) and f(b), inclusive, then there is at least one number x in the interval [a, b] such that f(x) = c.
• The composition of two continuous functions is continuous. So, for example, the square root function is continuous, so the square root of a continuous (non-negative) function is another continuous function.
• The inverse of a continuous one-to-one function defined on an interval is continuous.
• A differentiable function of one real variable is a function whose derivative
exists at each point in its domain.
• As a result, the graph of a differentiable function must have a (non-vertical)
tangent line at each point in its domain, be relatively smooth, and cannot
contain any breaks, bends, or cusps.
• Differentiation is the algebraic procedure of calculating the derivatives.
• Derivative of a function is the slope or the gradient of the curve (graph) at
any given point.
• Gradient of a curve at any given point is the gradient of the tangent drawn
to that curve at the given point.
• Differentiation process is useful in calculating the gradient of the curve at
any point.
7.5 KEY WORDS
• Differentiation: It is the mathematical process of obtaining the derivative of a function.
• Limit: It is defined as the value towards which a function tends as the independent variable approaches a given value.
7.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. Limit can be defined as a number that a function approaches as the
independent variable of the function approaches a given value.
2. The composition of two continuous functions is continuous.
Check Your Progress - 2
1. A differentiable function of one real variable is a function whose derivative
exists at each point in its domain.
2. Derivative of a function is the slope or the gradient of the curve (graph) at
any given point.
3. The gradient of the curve at any point can be determined with the help of
differentiation process.
7.7 SELF-ASSESSMENT QUESTIONS
1. What do you understand by differentiation?
2. Define the meaning and purpose of limit.
3. Differentiate between right hand limit and left hand limit.
4. Discuss continuity of a function.
5. What do you mean by continuity in an interval? Discuss.
6. Discuss the geometrical interpretation of continuity.
7. List the various properties of continuous functions.
8. Write a detailed note on differentiation and differentiability.
7.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Integration and Its
Application
UNIT–8
INTEGRATION AND ITS APPLICATION
Objectives
After going through this unit, you will be able to:
• Discuss the aspects of integration
• Analyse integration by substitution
• Understand the applications of integration
• Determine a cost function
Structure
8.1 Introduction
8.2 Integration
8.3 Application of Integration
8.4 Summary
8.5 Key Words
8.6 Answers to ‘Check Your Progress’
8.7 Self-Assessment Questions
8.8 Further Readings
8.1 INTRODUCTION
This unit will introduce you to integration and its applications. Integration is a calculus operation whereby the integral of a function is determined. In other words, it is the process of calculating either a definite integral or an indefinite integral. Integration is denoted with the symbol ∫. Differentiation is closely related to integration: the relation between integration and differentiation can be compared to the relation between squaring and taking the square root. The topics and methods of integration covered here include the indefinite integral, integration by substitution, and the integration of rational, irrational and trigonometric functions.
Integration has wide applications, from economics to accounting and business, including the determination of cost functions, total revenue functions, consumer surplus and producer surplus. The aspects of integration and its varied applications are discussed in detail in this unit.
8.2
INTEGRATION
Integration is the process of calculating either a definite integral or an indefinite integral.
[Figure: graph of f(x) against the X and Y axes, with the points a and b marked on the X-axis]
Definite Integral
For a real function f(x) and a closed interval [a, b] on the real line, the definite
integral,
$$\int_a^b f(x)\,dx,$$
is defined as the area between the graph of the function, the
horizontal axis and the two vertical lines at the end points of the interval.
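The area interpretation can be checked numerically. The following is a minimal Python sketch, not part of the original text, that approximates a definite integral by adding up the areas of thin trapezoids. The helper name trapezoid_area and the test function f(x) = x on [0, 2] are illustrative assumptions.

```python
def trapezoid_area(f, a, b, n=1000):
    """Approximate the area under f between a and b using n trapezoids."""
    h = (b - a) / n                      # width of each strip
    total = 0.5 * (f(a) + f(b))          # end points count with weight 1/2
    for i in range(1, n):
        total += f(a + i * h)            # interior points count with weight 1
    return total * h


# Area under f(x) = x on [0, 2]: a triangle with base 2 and height 2, so area 2.
area = trapezoid_area(lambda x: x, 0.0, 2.0)
print(round(area, 6))
```

Increasing n makes the strips thinner and, for curved graphs, brings the approximation closer to the exact area.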
Indefinite Integral
When a specific interval is not given, the integral is known as an indefinite integral. A
definite integral can be calculated using antiderivatives.
Given a function f(x), the indefinite integral (or antiderivative) of f(x) is a function
F(x) whose derivative is equal to f(x). This means that $F'(x) = f(x)$.
Symbol of Integration
The indefinite integral (antiderivative) of a function f is written as
$$\int f(x)\,dx = F(x) + C$$
The C is called the constant of integration. It appears because, from the rules of
differentiation, the derivative of any constant is simply 0, so F(x) + C has the same
derivative f(x) for every value of C. That is how differentiation and integration are
related to each other.
Relation between Integration and Differentiation
The difference between integration and differentiation can be compared to the difference
between squaring and taking the square root. If a positive number is squared and
the square root of the result is then taken, the positive square root is the
number that was squared. Similarly, if integration is applied to the result
obtained by differentiating a continuous function f(x), it leads back to the
original function (up to an additive constant), and vice versa.
For example, let F(x) be the integral of the function f(x) = x. Then
$$F(x) = \int f(x)\,dx = \frac{x^2}{2} + c,$$
where c is an arbitrary constant. Differentiating F(x) with respect to x we get
$$F'(x) = \frac{dF(x)}{dx} = \frac{2x}{2} + 0 = x,$$
therefore, the derivative of F(x) is equal to f(x).
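The same check can be carried out numerically. The sketch below is an illustration only: it differentiates F(x) = x²/2 + c with a central difference and compares the result with f(x) = x. The step size h and the value c = 3 are arbitrary assumptions.

```python
def F(x, c=3.0):
    """Antiderivative of f(x) = x; c is an arbitrary constant of integration."""
    return x**2 / 2 + c


def derivative(g, x, h=1e-5):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


# Each pair is (x, approximate F'(x)); the second entry should be close to x.
checks = [(x, derivative(F, x)) for x in (0.5, 1.0, 2.0, 10.0)]
for x, slope in checks:
    print(x, round(slope, 6))
```

Changing c leaves the computed slopes unchanged, which is why the constant of integration cannot be recovered by differentiation alone.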
Integration Formulas
Indefinite Integral
Method of substitution
$$\int f(g(x))\,g'(x)\,dx = \int f(u)\,du, \quad u = g(x)$$
Integration by parts
$$\int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int g(x)\,f'(x)\,dx$$
Integrals of Rational and Irrational Functions
$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + C, \quad (n \neq -1)$$
$$\int \frac{1}{x}\,dx = \ln|x| + C$$
$$\int c\,dx = cx + C$$
$$\int x\,dx = \frac{x^2}{2} + C$$
$$\int x^2\,dx = \frac{x^3}{3} + C$$
$$\int \frac{1}{x^2}\,dx = -\frac{1}{x} + C$$
$$\int \sqrt{x}\,dx = \frac{2x\sqrt{x}}{3} + C$$
$$\int \frac{1}{1+x^2}\,dx = \arctan x + C$$
$$\int \frac{1}{\sqrt{1-x^2}}\,dx = \arcsin x + C$$
Integrals of Trigonometric Functions
$$\int \sin x\,dx = -\cos x + C$$
$$\int \cos x\,dx = \sin x + C$$
$$\int \tan x\,dx = \ln|\sec x| + C$$
$$\int \sec x\,dx = \ln|\tan x + \sec x| + C$$
$$\int \sin^2 x\,dx = \frac{1}{2}(x - \sin x \cos x) + C$$
$$\int \cos^2 x\,dx = \frac{1}{2}(x + \sin x \cos x) + C$$
$$\int \tan^2 x\,dx = \tan x - x + C$$
$$\int \sec^2 x\,dx = \tan x + C$$
$$\int e^{ax} \sin bx\,dx = \frac{e^{ax}}{a^2+b^2}\,[a \sin bx - b \cos bx] + C$$
$$\int e^{ax} \cos bx\,dx = \frac{e^{ax}}{a^2+b^2}\,[a \cos bx + b \sin bx] + C$$
Integrals of Exponential and Logarithmic Functions
$$\int \ln x\,dx = x \ln x - x + C$$
$$\int x^n \ln x\,dx = \frac{x^{n+1}}{n+1}\ln x - \frac{x^{n+1}}{(n+1)^2} + C$$
$$\int e^x\,dx = e^x + C$$
$$\int b^x\,dx = \frac{b^x}{\ln b} + C$$
$$\int \sinh x\,dx = \cosh x + C$$
$$\int \cosh x\,dx = \sinh x + C$$
$$\int (ax+b)^n\,dx = \frac{(ax+b)^{n+1}}{a(n+1)} + C, \quad (n \neq -1)$$
$$\int \frac{1}{ax+b}\,dx = \frac{1}{a}\ln|ax+b| + C$$
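A formula from such a table can be spot-checked by comparing F(b) − F(a), computed from the tabulated antiderivative F, with a numerical value of the definite integral over [a, b]. The sketch below uses Simpson's rule for the numerical side. The interval [1, 2], the selection of formulas, and the values a = 2, b = 3 in the exponential case are illustrative assumptions.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of strips."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# name -> (integrand f, tabulated antiderivative F without the constant C)
formulas = {
    "x^3": (lambda x: x**3, lambda x: x**4 / 4),
    "1/x": (lambda x: 1 / x, lambda x: math.log(abs(x))),
    "sin^2 x": (lambda x: math.sin(x) ** 2,
                lambda x: 0.5 * (x - math.sin(x) * math.cos(x))),
    "ln x": (math.log, lambda x: x * math.log(x) - x),
    "e^(2x) sin 3x": (lambda x: math.exp(2 * x) * math.sin(3 * x),
                      lambda x: math.exp(2 * x) / 13
                      * (2 * math.sin(3 * x) - 3 * math.cos(3 * x))),
}

a, b = 1.0, 2.0
errors = {name: abs(simpson(f, a, b) - (F(b) - F(a)))
          for name, (f, F) in formulas.items()}
for name, err in errors.items():
    print(f"{name}: difference {err:.2e}")
```

A small difference for every entry suggests the formula was transcribed correctly; a large one usually points to a lost sign or exponent.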
Integration by Substitution
Integration by substitution is a method for handling comparatively complex
integrals. A difficult integral can be made easier by using this method, which
changes both the variable and the integrand. The simple substitution method can be
understood through the example of the linear substitution ax + b = u: the substitution
yields a simpler integral in the variable u.
Let u = ax + b. The method then proceeds as follows:
Step 1: Choose a new variable u
Step 2: Determine the value of dx in terms of du
Step 3: Make the substitution
Step 4: Integrate the resulting integral
Step 5: Return to the initial variable x
Example 8.1. Find $\int (x+4)^5\,dx$
Solution: Let x + 4 = u. In general,
$$du = \frac{du}{dx}\,dx$$
Now, in this example, because u = x + 4 it follows immediately that $\frac{du}{dx} = 1$ and
so du = dx. So, substituting both for x + 4 and for dx in the integral we have
$$\int (x+4)^5\,dx = \int u^5\,du$$
The resulting integral can be evaluated immediately to give $\frac{u^6}{6} + c$. We can
revert to an expression involving the original variable x by recalling that
u = x + 4, giving
$$\int (x+4)^5\,dx = \frac{(x+4)^6}{6} + c$$
Example 8.2. Find $\int \cos(3x+4)\,dx$
Solution: Let 3x + 4 = u, so that
$$\frac{du}{dx} = 3$$
It follows that
$$du = \frac{du}{dx}\,dx = 3\,dx$$
So, substituting u for 3x + 4, and with $dx = \frac{1}{3}\,du$ in the integral we have
$$\int \cos(3x+4)\,dx = \int \frac{1}{3}\cos u\,du = \frac{1}{3}\sin u + c$$
We can revert to an expression involving the original variable x by recalling that
u = 3x + 4, giving
$$\int \cos(3x+4)\,dx = \frac{1}{3}\sin(3x+4) + c$$
Example 8.3. Evaluate $\int (2x+3)^4\,dx$
Solution:
Step 1: Choose a substitution function u = 2x + 3
Step 2: Determine the value du = 2dx + 0, so that
$$dx = \frac{du}{2}$$
Step 3: Integrate the resulting integral
$$\int (2x+3)^4\,dx = \int u^4\,\frac{du}{2} = \frac{1}{2}\int u^4\,du = \frac{1}{2}\cdot\frac{u^5}{5} + c = \frac{u^5}{10} + c$$
Step 4: Return to the initial variable x
So, the solution is:
$$\int (2x+3)^4\,dx = \frac{(2x+3)^5}{10} + c$$
Example 8.4. Find $\int \frac{15}{3-2x}\,dx$
Solution:
Step 1: Choose a substitution function u = 3 – 2x
Step 2: Determine the value du = 0 – 2dx, so that
$$dx = -\frac{1}{2}\,du$$
Step 3: Integrate the resulting integral
$$\int \frac{15}{3-2x}\,dx = \int \frac{15}{u}\left(-\frac{1}{2}\right)du = -\frac{15}{2}\int \frac{du}{u} = -\frac{15}{2}\ln|u| + c$$
Step 4: Return to the initial variable x
$$\int \frac{15}{3-2x}\,dx = -\frac{15}{2}\ln|3-2x| + c$$
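The change of variable in a linear substitution can be illustrated numerically. The sketch below, an illustration rather than part of the original text, evaluates the integral of (2x + 3)^4 over an assumed interval [0, 1] in two ways: directly in x, and in the new variable u = 2x + 3 with dx = du/2 and correspondingly transformed limits. Both are compared with the value given by the antiderivative u^5/10. The midpoint rule and the interval are assumptions made for the demonstration.

```python
def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


x0, x1 = 0.0, 1.0

# Integral written in the original variable x.
in_x = midpoint(lambda x: (2 * x + 3) ** 4, x0, x1)

# Same integral after substituting u = 2x + 3, dx = du / 2.
u0, u1 = 2 * x0 + 3, 2 * x1 + 3
in_u = midpoint(lambda u: 0.5 * u**4, u0, u1)

# Value from the antiderivative u^5 / 10 evaluated between the u-limits.
exact = (u1**5 - u0**5) / 10

print(round(in_x, 4), round(in_u, 4), round(exact, 4))
```

The three values agree closely, which reflects that the substitution changes the form of the integral but not its value.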
Integration by substitution is also done by substituting functions.
The first and most important step is to write the integral in the form
$$\int f[g(x)]\,g'(x)\,dx,$$
in which g(x) appears together with its derivative g′(x).
For example:
$$\int \sin(x^2)\,2x\,dx$$
Here f = sin, and we have g(x) = x² and its derivative 2x.
The integral can therefore be written as
$$\int f[g(x)]\,g'(x)\,dx = \int f(u)\,du, \quad u = g(x)$$
Then we can integrate f(u), and finish by substituting g(x) back for u.
Example 8.5. Evaluate $\int \sin(x^2)\,2x\,dx$
Solution:
Let x² = u
∴ 2x dx = du
Now integrate:
$$\int \sin(u)\,du = -\cos(u) + C$$
And finally put u = x² back again:
$$\int \sin(x^2)\,2x\,dx = -\cos(x^2) + C$$
Example 8.6. Evaluate $\int 2x\sqrt{1+x^2}\,dx$
Solution:
Let u = 1 + x², so that
$$\frac{du}{dx} = 2x$$
It follows that
$$du = \frac{du}{dx}\,dx = 2x\,dx$$
So, substituting u for 1 + x², and with 2x dx = du, we have
$$\int 2x\sqrt{1+x^2}\,dx = \int \sqrt{u}\,du = \int u^{1/2}\,du = \frac{2}{3}u^{3/2} + c$$
Returning to the variable x,
$$\int 2x\sqrt{1+x^2}\,dx = \frac{2}{3}(1+x^2)^{3/2} + c$$
Check Your Progress - 1
1. What is integration?
2. What is the difference between integration and differentiation defined as?
3. What is integration by substitution?
8.3
APPLICATION OF INTEGRATION
Applications in Economics, Accounting and Business
In these applications, a marginal function is obtained by differentiating the
corresponding total function. Conversely, when a marginal function and an initial
value are given, the total function can be obtained with the help of integration.
Determination of Cost Function
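If C denotes the total cost and $MC = \frac{dC}{dx}$ is the marginal cost, then we write
$$C = C(x) = \int (MC)\,dx + k,$$
where k is the constant of integration. The value of k is fixed by a known value of
the total cost; when the integral vanishes at x = 0, k equals the fixed cost C(0).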
Example 8.7. The marginal cost function of manufacturing x units of a product is
5 + 16x – 3x². The total cost of producing 5 items is Rs. 500. Find the total cost
function.
Solution: Given, MC = 5 + 16x – 3x²
$$\therefore C(x) = \int (5 + 16x - 3x^2)\,dx = 5x + 16\cdot\frac{x^2}{2} - 3\cdot\frac{x^3}{3} + k$$
C(x) = 5x + 8x² – x³ + k
When x = 5, C(x) = C(5) = Rs. 500
or, 500 = 25 + 200 – 125 + k
This gives, k = 400
∴ C(x) = 5x + 8x² – x³ + 400
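For a polynomial marginal cost, the two steps used here (integrate term by term, then fix k from one known value of the total cost) are easy to mechanise. The sketch below is a minimal illustration. The helper names integrate_poly and eval_poly, and the coefficient-list representation, are assumptions introduced for this example.

```python
def integrate_poly(coeffs):
    """Term-by-term antiderivative of a polynomial.

    coeffs[i] is the coefficient of x**i; the result has constant term 0.
    """
    return [0.0] + [c / (i + 1) for i, c in enumerate(coeffs)]


def eval_poly(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**i for i, c in enumerate(coeffs))


# Marginal cost MC = 5 + 16x - 3x^2 and the condition C(5) = 500.
mc = [5, 16, -3]
anti = integrate_poly(mc)            # coefficients of 5x + 8x^2 - x^3
k = 500 - eval_poly(anti, 5)         # constant of integration


def total_cost(x):
    """Total cost C(x) = 5x + 8x^2 - x^3 + k."""
    return eval_poly(anti, x) + k


print(anti, k, total_cost(5))
```

Here k also equals total_cost(0), the fixed cost, because the term-by-term antiderivative is zero at x = 0.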
Example 8.8. The marginal cost function of producing x units of a product is given
by $MC = \frac{x}{\sqrt{x^2+2500}}$. Find the total cost function and the average cost function
if the fixed cost is Rs. 1000.
Solution:
$$MC = \frac{x}{\sqrt{x^2+2500}}$$
$$\therefore C(x) = \int \frac{x}{\sqrt{x^2+2500}}\,dx + k$$
Let x² + 2500 = t² ⇒ x dx = t dt
$$\therefore C(x) = \int \frac{t\,dt}{t} + k = \int dt + k = t + k$$
$$C(x) = \sqrt{x^2+2500} + k$$
When x = 0, C(0) = Rs. 1000
$$\therefore 1000 = \sqrt{2500} + k = 50 + k$$
Or, k = 950
$$\therefore C(x) = \sqrt{x^2+2500} + 950$$
$$AC = \frac{C(x)}{x} = \sqrt{1 + \frac{2500}{x^2}} + \frac{950}{x}$$
The total revenue function can also be determined by integration.
If R(x) denotes the total revenue function and MR is the marginal revenue function,
then
$$MR = \frac{d}{dx}[R(x)]$$
$$\therefore R(x) = \int (MR)\,dx + k,$$
where k is the constant of integration.
Also, where R(x) is known, the demand function can be found as
$$p = \frac{R(x)}{x}$$
Example 8.9. The marginal revenue function of a commodity is given as MR = 12
– 3x² + 4x. Find the total revenue and the corresponding demand function.
Solution: MR = 12 – 3x² + 4x
$$\therefore R = \int (12 - 3x^2 + 4x)\,dx + k = 12x - x^3 + 2x^2 + k$$
Since R = 0 when x = 0, we get k = 0 [the constant of integration is zero in this case].
∴ Revenue function is given by R = 12x + 2x² – x³
$$\therefore p = \frac{R}{x} = 12 + 2x - x^2$$ is the demand function.
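The revenue and demand functions of Example 8.9 can be cross-checked numerically. Because R(0) = 0, the total revenue at x units equals the definite integral of MR from 0 to x. The sketch below, an illustration under that assumption, approximates this integral with the midpoint rule and compares it with the closed forms R = 12x + 2x² – x³ and p = 12 + 2x – x² at an arbitrarily chosen x = 2.

```python
def marginal_revenue(x):
    """MR = 12 - 3x^2 + 4x from Example 8.9."""
    return 12 - 3 * x**2 + 4 * x


def total_revenue(x, n=10000):
    """R(x) as the integral of MR from 0 to x (midpoint rule), using R(0) = 0."""
    h = x / n
    return h * sum(marginal_revenue((i + 0.5) * h) for i in range(n))


def demand_price(x):
    """Demand function p = R(x) / x, defined for x > 0."""
    return total_revenue(x) / x


x = 2.0
closed_form_R = 12 * x + 2 * x**2 - x**3
closed_form_p = 12 + 2 * x - x**2
print(round(total_revenue(x), 6), closed_form_R)
print(round(demand_price(x), 6), closed_form_p)
```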
Example 8.10. The marginal revenue function for a product is given by
$$MR = \frac{6}{(x-3)^2} - 4$$
Find the total revenue function and the demand function.
Solution:
$$MR = \frac{6}{(x-3)^2} - 4$$
$$\therefore R = \int \left[\frac{6}{(x-3)^2} - 4\right]dx = -\frac{6}{x-3} - 4x + k$$
When x = 0, R = 0 ⇒ 2 + k = 0 ⇒ k = – 2
$$\therefore R = -\frac{6}{x-3} - 4x - 2,$$
which is the required revenue function.
Now,
$$p = \frac{R}{x} = -\frac{6}{x(x-3)} - 4 - \frac{2}{x} = \frac{-6 - 2x + 6}{x(x-3)} - 4 = -\frac{2}{x-3} - 4 = \frac{2}{3-x} - 4$$
∴ The demand function is given by $p = \frac{2}{3-x} - 4$.
Consumer Surplus and Producer Surplus
The Demand Curve p = D(x) – from the Consumer’s Perspective. Generally,
the lower the price of a product, the more the consumers will demand the product.
That is, high prices reduce demand and low prices raise demand. So, generally, p
= D(x) is a decreasing function.
[Figure: supply curve S(x) and demand curve D(x) plotted with Quantity on the horizontal axis and Price on the vertical axis, intersecting at the equilibrium point (xE, pE)]
The Supply Curve p = S(x) – from the Producer’s Perspective. Generally,
the higher the price of a product, the more the producers are willing to supply. That
is, high prices increase supply and low prices decrease supply. So, generally, the
supply curve is an increasing function.
The Equilibrium Point (xE, pE) is the intersection of the supply and demand
curves.
Utility, U, is an economic idea. When a consumer receives x units of a product
a certain amount of pleasure, or utility, is derived from it.
Definition: The supply function or supply curve gives the quantity of an item
that producers will supply at any given price. The demand function or demand curve
gives the quantity that consumers will demand at any given price.
Let the price per unit be p and the quantity supplied or demanded at that price
be q. As is the convention in economics, p is always written as a function of q. Thus
the supply curve will be denoted by the formula
p = S(q)
and represented by a graph where the x and y axes correspond to q and p
values respectively. Similarly, we will use
p = D(q)
to denote the demand curve. The supply function S is increasing: the higher the
price, the more the producers will supply. The demand function D is decreasing:
the higher the price, the less the consumers will buy.
Definition: The point of intersection $(q_e, p_e)$ of the supply and demand
curves is called the market equilibrium point. The numbers $q_e$ and $p_e$ are termed
equilibrium quantity and equilibrium price respectively.
In an ideal free market both consumers and producers gain by buying and
selling at the equilibrium price. The goal of this section is to compute exactly how
much the consumers gain by buying at the equilibrium price rather than at a higher
price.
Consider first the total amount spent by the consumers if everyone buys at the
equilibrium price $p_e$. In this case $q_e$ units are supplied and bought, and the total
amount spent is the number of units bought times the price per unit, i.e.,
total amount spent at equilibrium price = $p_e q_e$
If, instead, each unit were sold at the maximum price that consumers are willing to
pay for it, as given by the demand curve, the total amount paid would be
$$\text{total amount paid at maximum prices} = \int_0^{q_e} D(q)\,dq.$$
The quantity in the integral is the area under the demand curve from q = 0 to
q = $q_e$. As the figure shows it is greater than $p_e q_e$, which is the area of the rectangle
with sides $[0, q_e]$ and $[0, p_e]$, and which according to the formula represents the
total amount spent by consumers at the equilibrium price. The difference between
these two areas represents the total that consumers save by buying at equilibrium
price.
This is called the consumer surplus for this product (See picture above). To
summarize
$$\text{Consumer surplus} = \int_0^{q_e} D(q)\,dq - p_e q_e = \int_0^{q_e} [D(q) - p_e]\,dq. \qquad \text{...(i)}$$
A similar analysis (which you should try out) shows that the producers also gain
by trading at the equilibrium price. Their gain, called producer surplus, is given by the
following quantity
$$\text{Producer surplus} = p_e q_e - \int_0^{q_e} S(q)\,dq = \int_0^{q_e} [p_e - S(q)]\,dq. \qquad \text{...(ii)}$$
Example 8.11. For a certain item the demand curve is
$$p = D(q) = \frac{20}{q+1}$$
and the supply curve is
p = S(q) = q + 2.
Find the equilibrium price and equilibrium quantity. Then compute the consumer
and producer surplus.
Solution. To find the equilibrium quantity, we let D(q) = S(q) to obtain
$$\frac{20}{q+1} = q + 2.$$
Clearing the denominator gives 20 = (q + 1)(q + 2), which simplifies to q² +
3q – 18 = 0. The positive solution gives the equilibrium quantity $q_e = 3$, and the
equilibrium price is $p_e = 5$.
We compute consumer and producer surplus using formulae (i) and (ii) above:
$$CS = \int_0^{q_e} D(q)\,dq - p_e q_e = \int_0^3 \frac{20}{q+1}\,dq - (5)(3) = \Big[20\ln(q+1)\Big]_0^3 - 15 = 20\ln 4 - 15 \approx 12.73.$$
Similarly
$$PS = p_e q_e - \int_0^{q_e} S(q)\,dq = (5)(3) - \int_0^3 (q+2)\,dq = 15 - \left[\frac{q^2}{2} + 2q\right]_0^3 = 15 - \left(\frac{9}{2} + 6\right) = 4.50.$$
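Formulae (i) and (ii) translate directly into code once a numerical integrator is available. The sketch below is an illustration, not the authors' method: it applies the midpoint rule to the curves of Example 8.11, with the equilibrium values q = 3 and p = 5 taken from the worked solution. The function names are assumptions.

```python
import math


def integrate(f, a, b, n=10000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def consumer_surplus(demand, q_e, p_e):
    """Formula (i): integral of D(q) - p_e from 0 to q_e."""
    return integrate(lambda q: demand(q) - p_e, 0.0, q_e)


def producer_surplus(supply, q_e, p_e):
    """Formula (ii): integral of p_e - S(q) from 0 to q_e."""
    return integrate(lambda q: p_e - supply(q), 0.0, q_e)


def demand(q):
    return 20 / (q + 1)


def supply(q):
    return q + 2


q_e, p_e = 3.0, 5.0          # equilibrium found in the worked solution
cs = consumer_surplus(demand, q_e, p_e)
ps = producer_surplus(supply, q_e, p_e)
print(round(cs, 2), round(ps, 2))
```

The same two functions can be reused for other demand and supply curves, provided the equilibrium point is supplied.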
Example 8.12. Find consumer and producer surplus for demand equation p = –
50q + 2000 and supply equation p = 10q + 500.
Solution: Setting demand equal to supply, –50q + 2000 = 10q + 500, gives 60q = 1500,
so the equilibrium quantity is q = 25 and the equilibrium price is p = 10(25) + 500 = 750.
Both areas can be found using a definite integral. The form the integral
takes is:
$$\int_{x\text{-coordinate of left edge}}^{x\text{-coordinate of right edge}} [(\text{upper function}) - (\text{lower function})]\,dx$$
The shaded area for Consumer Surplus is shown in the figure. The left edge of
the triangle has an x-coordinate of 0, and the right edge is our equilibrium point,
which has an x-coordinate of 25. The top of the triangle is the demand equation
p = –50q + 2000, and the bottom of the triangle is our constant equilibrium price,
750. So,
$$\text{Consumer Surplus} = \int_0^{25} [(-50q + 2000) - (750)]\,dq = \Big[-25q^2 + 2000q - 750q\Big]_0^{25}$$
= [–25(25)² + 2000(25) – 750(25)] – [–25(0)² + 2000(0) – 750(0)]
= Rs. 15,625
Producer Surplus can be found the same way: The left edge and the right edge
are still at 0 and 25, but now the top of the triangle is our equilibrium price, and
the bottom of the triangle is our supply equation p = 10q + 500. So,
$$\text{Producer Surplus} = \int_0^{25} [(750) - (10q + 500)]\,dq = \Big[750q - 5q^2 - 500q\Big]_0^{25}$$
= [750(25) – 5(25)² – 500(25)] – [750(0) – 5(0)² – 500(0)]
= Rs. 3125
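When both curves are straight lines, each surplus region is a triangle, so the integrals can be cross-checked with the formula half × base × height. The sketch below is a minimal illustration for the curves p = –50q + 2000 and p = 10q + 500. The function name linear_equilibrium and its parameter names are assumptions.

```python
def linear_equilibrium(a_d, b_d, a_s, b_s):
    """Intersection of demand p = a_d*q + b_d and supply p = a_s*q + b_s."""
    q_e = (b_d - b_s) / (a_s - a_d)
    p_e = a_s * q_e + b_s
    return q_e, p_e


# Demand p = -50q + 2000, supply p = 10q + 500.
q_e, p_e = linear_equilibrium(-50, 2000, 10, 500)

# Triangles: base q_e, heights measured from p_e to each curve's intercept.
consumer_surplus = 0.5 * q_e * (2000 - p_e)
producer_surplus = 0.5 * q_e * (p_e - 500)

print(q_e, p_e, consumer_surplus, producer_surplus)
```

This shortcut applies only to linear curves; for curved demand or supply functions the definite integral is needed.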
Example 8.13. The demand function is given by q = –0.5p + 70 and the supply
function is given by q = 0.7p – 50. Price is on the x-axis, and quantity is on the y-axis.
Find consumer and producer surplus.
Solution: As before, we set the supply and demand equations equal to each other.
Supply = demand
0.7p – 50 = –0.5p + 70
1.2p = 120
p = 100
So we know that Ep = Rs. 100. To find Eq, we could use either the supply or
the demand equation. Again, both will give the same answer:
supply: q = 0.7(100) – 50 = 70 – 50 = 20
demand: q = –0.5(100) + 70 = –50 + 70 = 20
Both solutions agree, so we can be sure that Eq = 20 units.
$$\text{Definite integral} = \int_{x\text{-coordinate of left edge}}^{x\text{-coordinate of right edge}} [(\text{upper function}) - (\text{lower function})]\,dx$$
Consumer Surplus
The left edge of Consumer Surplus is the equilibrium line. And the x-coordinate of
that line is our equilibrium price, or Rs. 100. The right edge is the point where the
demand function crosses the x-axis. To find this point, we set the demand function
equal to zero and solve:
Demand = 0
–0.5p + 70 = 0
–0.5p = –70
p = Rs. 140
So the bounds of our integral will be at Rs. 100 and Rs. 140.
$$\int_{100}^{140} [(-0.5p + 70) - (0)]\,dp = \int_{100}^{140} (-0.5p + 70)\,dp = \left[-\frac{0.5p^2}{2} + 70p\right]_{100}^{140}$$
$$= \left[-\frac{0.5(140)^2}{2} + 70(140)\right] - \left[-\frac{0.5(100)^2}{2} + 70(100)\right]$$
= Rs. 400
Producer Surplus
The left edge of Producer Surplus is the point where the supply function crosses the
x-axis, and so to find this point, we set the supply function equal to zero and solve:
Supply = 0
0.7p – 50 = 0
0.7p = 50
p = Rs. 71.43
The right edge is the equilibrium line, and the x-coordinate of that line is Rs. 100. So
the bounds of our integral will be Rs. 71.43 and Rs. 100.
$$\int_{71.43}^{100} [(0.7p - 50) - (0)]\,dp = \int_{71.43}^{100} (0.7p - 50)\,dp = \left[\frac{0.7p^2}{2} - 50p\right]_{71.43}^{100}$$
$$= \left[\frac{0.7(100)^2}{2} - 50(100)\right] - \left[\frac{0.7(71.43)^2}{2} - 50(71.43)\right]$$
≈ Rs. 285.71
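When quantity is written as a function of price, as in Example 8.13, both surpluses are integrals along the price axis and can be evaluated from the antiderivatives of the two linear functions. The sketch below is an illustration under that reading of the example (q = –0.5p + 70 for demand and q = 0.7p – 50 for supply). The function names are assumptions.

```python
def demand_antiderivative(p):
    """Antiderivative of the demand quantity -0.5p + 70 with respect to p."""
    return -0.25 * p**2 + 70 * p


def supply_antiderivative(p):
    """Antiderivative of the supply quantity 0.7p - 50 with respect to p."""
    return 0.35 * p**2 - 50 * p


p_e = (70 + 50) / (0.7 + 0.5)     # price at which supply equals demand
q_e = -0.5 * p_e + 70             # quantity at that price
p_max = 70 / 0.5                  # price at which demand falls to zero
p_min = 50 / 0.7                  # price at which supply falls to zero

consumer_surplus = demand_antiderivative(p_max) - demand_antiderivative(p_e)
producer_surplus = supply_antiderivative(p_e) - supply_antiderivative(p_min)

print(round(p_e, 2), round(q_e, 2))
print(round(consumer_surplus, 2), round(producer_surplus, 2))
```

Using the unrounded lower bound 50/0.7 rather than 71.43 avoids a small rounding error in the producer surplus.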
Example 8.14. Suppose that when it is t years old, a particular industrial machine
generates revenue at the rate R′(t) = 5,000 – 20t² rupees per year and that
operating and servicing costs related to the machine accumulate at the rate C′(t) =
2,000 + 10t² rupees per year.
(a) How many years pass before the profitability of the machine begins to
decline?
(b) Compute the net earnings generated by the machine over the time period
determined in part (a).
Solution:
(a) The profit associated with the machine after t years of operation is
P(t) = R(t) – C(t) and the rate of profitability is
P′(t) = R′(t) – C′(t)
= (5,000 – 20t²) – (2,000 + 10t²)
= 3,000 – 30t²
The profitability begins to decline when
P′(t) = 0
3,000 – 30t² = 0
t² = 100
t = 10 years
(b) The net earnings NE over the time period 0 ≤ t ≤ 10 is given by the
difference NE = P(10) – P(0), which can be computed by the integral
$$NE = P(10) - P(0) = \int_0^{10} P'(t)\,dt = \int_0^{10} (3{,}000 - 30t^2)\,dt = \Big[3{,}000t - 10t^3\Big]_0^{10}$$
= Rs. 20,000
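The two parts of Example 8.14 can be reproduced in a few lines. The sketch below is an illustration only: it finds the time at which the profit rate P′(t) = 3,000 – 30t² reaches zero and evaluates the net earnings from the antiderivative 3,000t – 10t³. The function names are assumptions.

```python
import math


def profit_rate(t):
    """P'(t) = R'(t) - C'(t) in rupees per year."""
    return (5000 - 20 * t**2) - (2000 + 10 * t**2)


def cumulative_profit(t):
    """Antiderivative of P'(t) = 3000 - 30t^2 with zero constant."""
    return 3000 * t - 10 * t**3


# (a) P'(t) = 0 gives t^2 = 3000 / 30.
t_star = math.sqrt(3000 / 30)

# (b) Net earnings over 0 <= t <= t_star.
net_earnings = cumulative_profit(t_star) - cumulative_profit(0)

print(t_star, net_earnings)
```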
Check Your Progress - 2
1. What happens if the price of a product is lowered?
2. How according to the supply curve do prices affect supply?
8.4
SUMMARY
• Integration is the process of calculating either a definite integral or an indefinite
integral.
• For a real function f(x) and a closed interval [a, b] on the real line, the
definite integral is defined as the area between the graph of the function, the
horizontal axis and the two vertical lines at the end points of the interval.
• When a specific interval is not given, the integral is known as an indefinite integral.
• A definite integral can be calculated using antiderivatives.
• From the rules of differentiation the derivative of any constant is simply 0.
• The difference between integration and differentiation can be compared to the
difference between squaring and taking the square root.
• Integration by substitution is a method for handling comparatively
complex integrals.
• The substitution method yields a simpler integral in
the variable u.
• Simple substitution method can be understood by the example of linear
substitution of ax + b = u.
• Generally, the higher the price of a product, the more the producers are
willing to supply.
• The supply function or supply curve gives the quantity of an item that
producers will supply at any given price.
• The demand function or demand curve gives the quantity that consumers
will demand at any given price.
• In an ideal free market both consumers and producers gain by buying and
selling at the equilibrium price.
8.5
KEY WORDS
• Supply: It is the activity of supplying or providing something in order to
maintain its availability in a marketplace.
• Market equilibrium point: It is the point of intersection of the supply and
demand curve.
• Integration by substitution: It is a method which deals with
comparatively complex integration.
8.6
ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. Integration is the process of calculating either definite integral or indefinite
integral.
2. The difference between integration and differentiation can be compared to the
difference between squaring and taking the square root.
3. Integration by substitution is a method which deals with comparatively
complex integration.
Check Your Progress - 2
1. If the price of a product is lowered, the demand of the product amongst
consumers will increase.
2. High prices increase the supply and low prices decrease the supply in a
supply curve.
8.7
SELF-ASSESSMENT QUESTIONS
1. What do you understand by integration?
2. Name the various fields where integration is used.
3. Differentiate between definite and indefinite integral.
4. Discuss the relationship between integration and differentiation.
5. Account for the various formulae of integration.
6. Write a short note on integration by substitution.
7. Show a graphical representation of consumer surplus and producer surplus.
8. Find consumer and producer surplus for demand equation P = –60q +
2200 and supply equation p = 20q + 550.
8.8
FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Meaning and Scope of Statistic
BLOCK-III
BASIC STATISTICAL CONCEPTS
The basic statistical concepts are discussed in this block. Statistics relates to our
daily life in many ways. It generally involves reaching decisions on the basis of a
number of tests and surveys. A survey is carried out on a particular group drawn from a
population, called a sample, and the results from the sample are used to reach a
conclusion. This block discusses the meaning and scope of statistics, the
methods of organizing a statistical survey, accuracy, approximation and errors, ratio,
percentage and rates. This block consists of four units.
The ninth unit, of this book, discusses the meaning and scope of statistics. Statistics is an
important aspect of our daily life. It pertains to the various financial and calculative decisions
that we take in a day. It varies from rising stock rates to the literacy rates. Statistics has in
the recent years moved from mathematics to various other fields. The unit discusses the
many aspects of statistics in detail.
The tenth unit lists the method of organizing a statistical survey. Surveys help in reaching an
endpoint or a conclusion. Generally surveys are research based and are done with an
objective to reach some conclusion. A statistical survey however targets only a particular
population and is intended to help solve their problems and issues. The unit discusses the
methods of conducting surveys in detail.
The eleventh unit explains accuracy, approximation and errors. Any collected statistical
data involves all three. As statistical data is generally collected on a large
scale, it is likely to contain errors, because many values are based on
approximate measurements or counts. Accuracy is important for reaching a conclusion at
the end of a survey, but it has to be weighed against the errors and approximations
present. This unit explains how to deal with them.
The twelfth unit discusses ratios, percentages and rates. Any survey or statistical data which
is collected over a large area, consists of ratios, percentages and rates. These factors are
later computed as per the need of the end result and are thus accounted. The unit discusses
the role of ratios, percentages and rates in statistics.
UNIT–9
MEANING AND SCOPE OF STATISTIC
Objectives
After going through this unit, you will be able to:
• Describe the nature and scope of statistics
• Assess the concept of business statistics
• Analyse the importance of statistics in various fields
• Discuss the evaluation of statistics as a subject of study
Structure
9.1 Introduction
9.2 An Introduction to Statistics
9.3 Evaluating Statistics
9.4 Summary
9.5 Key Words
9.6 Answers to ‘Check Your Progress’
9.7 Self-Assessment Questions
9.8 Further Readings
9.1
INTRODUCTION
Statistics is regarded as an important part of our daily lives and is defined as
numerical statements related to facts and used in various fields. The nature of
statistics is mainly concerned with forming decisions about various things such as
stock market trends, levels of literacy and interest rates. This unit will provide an
overview of statistics and its nature.
Statistics has gained a lot of importance in various fields such as government
and commerce. It is thus no longer restricted to areas related
to mathematics. However, it still has some limitations, which are discussed in
this unit.
In this unit, you will also be able to understand the importance of statistics and
its major strengths and weaknesses. Despite its limitations, statistics as a field has
emerged as relevant in various areas of interest.
9.2
AN INTRODUCTION TO STATISTICS
Most business decisions are made today on the basis of relevant information and
statistical analysis of such information. Quantitative analysis has replaced intuition and
experienced guess work in solving most business problems. One of the tools to
understand information is statistics.
In general, business statistics can be defined as ‘a body of methods for
obtaining, organizing, summarizing, presenting, interpreting, analysing and acting
upon numerical facts related to an activity of interest. Numerical facts are usually
subjected to statistical analysis with a view to helping a decision-maker make wise
decisions in the face of uncertainty’.
The word ‘statistics’ can be referred to in two ways. In a common way, it refers
simply to numerical statements of facts such as the number of children in a family, the
number of books on statistics in the college library, the number of students enrolled
in the department of economics in Delhi University, and so on. The following
statements indicate the use of statistics as referring to numbers:
• Around 20 million Americans have a serious drinking problem.
• Nearly 52,000 Americans died in automobile accidents last year.
• More than 76 per cent voters turned out to vote during elections in Punjab
in February 2007.
• A majority of Americans consider Japanese cars superior in quality to
American cars.
All these statements represent statistical conclusions in some form. These
conclusions help us in formulating specific policies and attitudes with respect to
diverse areas of interest.
The second meaning of statistics refers to the field of study rather than simply to
numerical statements. As an area of study, it is primarily concerned with making
scientific and rational decisions about various properties and characteristics of some
population of interest, such as stock market trends, interest rates, demographic
shifts, inflation rates over the years, and so on. Consider the following statistical
statements:
• The crime rate in the city has gone up by 15 per cent over what it was last
year. (This statistical conclusion could help us in making decisions regarding
our safety and security in the city).
• The rate of inflation is expected to remain less than 5 per cent per year over
the next five years. (This could help us in making more educated
judgements about the general economic health of the country in the near
future).
• Less than 20 per cent of all high school graduates enter colleges for higher
education and less than 40 per cent of those who do enter colleges actually
graduate. (This statement gives us a good indication of the educational
philosophy of the country and the community and the reasons for such low
rates of admission into colleges and graduation could be investigated).
All these statements represent statistical conclusions in some form, which help us
to understand our environment better, and further help us in formulating specific
policies and attitudes to address and solve issues of interest.
Meaning and Scope of Statistics
In order for the quantitative and numerical data to be identified as statistics, it must
possess certain identifiable characteristics. Some of these characteristics are
described as follows:
1. Statistics are aggregates of facts: Single or isolated facts or figures
cannot be called statistics as these cannot be compared or related to other
figures within the same framework. Accordingly, there must be an
aggregate of these figures. For instance, if I say that I earn $30,000 per
year, it would not be considered statistics. On the other hand, if I say that
the average salary of a professor at our college is $30,000 per year, then
this would be considered statistics since the average has been computed
from many related figures, such as yearly salaries of many professors.
Similarly, a single birth in a hospital is not statistics, as it has no significance
for analysis purposes. However, when such information about many births
in the same hospital or birth information for different hospitals is collected,
then this information can be compared and analysed, and thus this data
would constitute statistics.
2. Statistics, generally are not the outcome of a single cause, but are
affected by multiple causes: There are a number of forces working
together that affect the facts and figures. For instance, when we say that the
crime rate in New York city has increased by 15 per cent over the last year,
a number of factors might have affected this change. These factors may be:
general level of economy such as state of economic recession,
unemployment rate, extent of use of drugs, areas affected by crime, extent
of legal effectiveness, social structure of the family in the area, and so on.
While these factors can be isolated by themselves, the effects of these
factors cannot be isolated and measured individually. Similarly, a marked
increase in food grain production in India may have been due to combined
effect of many factors such as better seeds, more extensive use of
fertilizers, mechanisation in cultivation, better institutional framework and
governmental and banking support, adequate rainfall, and so on. It is
generally not possible to segregate and study the effect of each of these
forces individually.
3. Statistics are numerically expressed: All statistics are stated in
numerical figures which means that these are quantitative information only.
Qualitative statements are not subject to accurate interpretations and hence
cannot be called statistics. For instance, qualitative statements such as
‘India is a developing country’ or ‘Jack is very tall’ would not be
considered statistical statements. On the other hand, comparing per capita
income of India with that of America would be considered statistical in
nature. Similarly, Jack’s height in numbers compared to average height in
America would also be considered statistics.
4. Statistical data is collected in a systematic manner: The procedures
for collecting data should be predetermined and well planned and such data
collection should be undertaken by trained investigators. Haphazard
collection of data can lead to erroneous conclusions.
5. Statistics are collected for a predetermined purpose: The purpose
and objective of collecting pertinent data must be clearly defined, decided
upon and determined prior to data collection. This would facilitate the
collection of proper and relevant data. For instance, data on the heights of
students would be irrelevant if considered in connection with the ability to
get admission in a college, but may be relevant when considering qualities of
leadership. Similarly, collective data on the prices of commodities in itself
does not serve any purpose unless we know, for the purpose of
comparison, the type of commodities under investigation and whether these
relate to producer, distributor, wholesale or retail prices. As another
example, if you are collecting data on the number of in-patients in the
hospital waiting to be X-rayed, then the pre-determined purpose may be to
establish the average time for the patients before X-ray and what can be
done to reduce this waiting time.
6. Statistics are enumerated or estimated according to reasonable
standard of accuracy: There are basically two ways of collecting data.
One is the actual counting or measuring, which is the most accurate way.
For instance, the number of people attending a football game can be
accurately determined by counting the number of tickets sold and
redeemed at the gate. The second way of collecting data is by estimation
and is used in situations where actual counting or measuring is not feasible
or where it involves prohibitive costs. For instance, the crowd at the
football game can be estimated by visual observation or by taking samples
of some segments of the crowd and then estimating the total number of
people on the basis of these samples. Estimates based on samples cannot
be as precise and accurate as actual counts or measurements, but these
should be consistent with the degree of accuracy desired.
7. Statistics must be placed in relation to each other: The main objective
of data collection is to facilitate a comparative or relative study of the
desired characteristics of the data. In other words, the statistical data must
be comparable with each other. The comparisons of facts and figures may
be conducted regarding the same characteristics over a period of time from
a single source or it may be from various sources at any one given time.
For instance, prices of different items in a store as such would not be
considered statistics. However, prices of one product in different stores
constitute statistical data, since these prices are comparable. Also, the
changes in the price of a product in one store over a period of time would
also be considered statistical data since these changes provide for
comparison over a period of time. However, these comparisons must relate
to the same phenomenon or subject so that like is compared with like
and oranges are not compared with apples.
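The distinction drawn in point 6 between actual counting and estimation can be illustrated with a short sketch. The stadium, the section sizes and the function name below are hypothetical and chosen only for illustration: the crowd is counted in a few randomly sampled sections, and the total is estimated by scaling the sample mean up to all sections.

```python
import random
import statistics


def estimate_total(sample_counts, total_sections):
    """Estimate the crowd as (mean count per sampled section) x (number of sections)."""
    return statistics.mean(sample_counts) * total_sections


# Hypothetical stadium: 40 sections, each holding between 400 and 500 people.
random.seed(1)  # fixed seed so the illustration is repeatable
sections = [random.randint(400, 500) for _ in range(40)]

actual = sum(sections)                 # enumeration: count every section
sampled = random.sample(sections, 8)   # estimation: count only 8 sections
estimate = estimate_total(sampled, len(sections))

print("Actual count   :", actual)
print("Estimated count:", round(estimate))
print("Relative error : {:.1%}".format(abs(estimate - actual) / actual))
```

The estimate will generally differ from the actual count, which is the point made in the text: a sample-based figure is less precise than a full count, but it may be accurate enough for the purpose at a fraction of the cost.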
Definition of Business Statistics
According to Schaum’s Outline of Business Statistics, ‘Statistics refers to the
body of techniques used for collecting, organizing, analyzing and interpreting data.
The data may be quantitative, with values expressed numerically, or they may be
qualitative, with characteristics such as consumer preferences being tabulated.
Statistics is used in business to help make better decisions by understanding the
sources of variation and by uncovering patterns and relationships in business data.’
Functions of Statistics
Statistics is no longer confined to the domain of mathematics. It has spread to most
of the branches of knowledge including social sciences and behavioural sciences.
One of the reasons for its phenomenal growth is the variety of different functions
attributed to it. Some of the most important functions of statistics are described as
follows:
1. It condenses and summarizes voluminous data into a few
presentable, understandable and precise figures: The raw data, as is
usually available, is voluminous and haphazard. It is generally not possible
to draw any conclusions from the raw data as collected. Hence, it is
necessary and desirable to express this data in a few numerical values. For
instance, the average salary of a policeman is derived from a mass of data
from surveys. But just one summarized figure gives us a pretty good idea
about the income of police officers. Similarly, stock market prices of
individual stocks and their trends are highly complex to comprehend, but a
graph of price trends gives us the overall picture at a glance.
2. It facilitates classification and comparison of data: Arrangement of
data with respect to different characteristics facilitates comparison and
interpretation. For instance, data on age, height, sex and family income of
college students gives us a much better picture of students when the data is
categorized relative to these characteristics. Additionally, bare statements
of these figures do not convey any significant meaning. It is
their comparison that helps us draw conclusions.
3. It helps in determining functional relationships between two or more
phenomena: Statistical techniques such as correlational analysis assist in
establishing the degree of association between two or more
variables. For instance, the coefficient of correlation between literacy and
employment, or between extent of training and industrial productivity, gives
us the degree of association between the two variables concerned. Similarly,
the correlation between average rainfall
and agricultural productivity can be obtained by using such statistical tools.
Some statistical methods can also be used in formulating and testing
hypotheses about a certain phenomenon. For instance, it can be tested
whether a credit squeeze is effective in controlling prices of consumer
goods or whether tenured professors are more motivated to improve their
teaching than untenured professors.
4. It helps in predicting future trends: Statistical methods are highly useful
tools in analysing the past data and predicting some future trends. For
instance, the sales for a particular product for the next year can be
computed by knowing the sales for the same product over the previous
years, the current market trends and the possible changes in the variables
that affect the demand of the product.
5. It helps the central management and the government in formulating
policies: Various governmental policies regarding import and export trade,
taxation, planning, resource allocation and so on are formulated on the
basis of data regarding these elements. Many other policies are based upon
statistical forecasts made by statisticians, such as policies regarding
housing, employment, industrial expansion, food grain production, and so
on. Some of these policies would be based upon population forecasts for
the future years. Also based upon the forecasts of future trends, events or
demand, the central organizational management can modify their policies
and plan to meet future needs. For instance, the oil production in OPEC
countries for the next few years would affect the operations of many energy
consuming industries in America. Accordingly, these organizations must plan
to meet these challenges in the future.
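Functions 1, 3 and 4 above (summarizing data, measuring association and predicting trends) can be sketched in a few lines of code. All figures below are hypothetical and serve only as an illustration. The coefficient of correlation shown is Pearson's, and the forecast uses a simple least-squares straight line, which is only one of several possible forecasting methods and ignores the market factors mentioned in the text.

```python
import math


def mean(values):
    """Arithmetic mean: one figure that summarizes the whole series."""
    return sum(values) / len(values)


def pearson_r(x, y):
    """Coefficient of correlation between two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_trend(x, y):
    """Least-squares straight line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: annual rainfall (cm) and crop yield (quintals per hectare).
rainfall = [60, 75, 80, 95, 110]
crop_yield = [18, 22, 23, 27, 30]
print("Mean yield :", mean(crop_yield))
print("Correlation:", round(pearson_r(rainfall, crop_yield), 3))

# Hypothetical sales (in thousand units) for five past years.
years = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]
intercept, slope = linear_trend(years, sales)
print("Forecast for year 6:", intercept + slope * 6)
```

A coefficient close to +1 or -1 indicates a strong association, while a value near 0 indicates a weak one. A high correlation does not by itself show that one variable causes the other.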
Scope of Statistics
There is hardly any walk of life which has not been affected by statistics—ranging
from a simple household to big businesses and the government. Some of the
important areas where the knowledge of statistics is usefully applied are explained in
the following paragraphs:
Statistics in Government
Since the beginning of organized society, the rulers and the heads of states have
relied heavily on statistics in the form of collecting data on various aspects for
formulating sound military and fiscal policies. This data may have involved
population, taxes collected, military strength and so on. In the current structure of
democratic societies, the government is, perhaps, the biggest collector of data and
user of statistics. Various departments of the government collect and interpret vast
amount of data and information for efficient functioning and decision-making.
1. Economics: Statistics are widely used in economic study and research.
The subject of economics is mainly concerned with production and
distribution of wealth as well as savings and investments. Some of the areas
of economic interest in which statistical tools are used are as follows:
• Statistical methods are extensively used in measuring and forecasting
Gross National Product (GNP).
• Economic stability is primarily judged by statistical studies of business
cycles.
• Statistical analyses of population growth, unemployment figures, rural
or urban population shifts and so on influence much of the economic
policy making.
• Econometric models which involve application of statistical methods
are used for optimum utilisation of resources available.
• Financial statistics are necessary in the fields of money and banking
including consumer savings and credit availability.
2. Physical, natural and social sciences: In physical sciences, as an
example, the science of meteorology uses statistics in analysing the data
gathered by satellites in predicting weather conditions. Similarly, in botany,
in the natural sciences, statistics are used in evaluating the effects of
temperature and other climatic conditions and types of soil on the health of
plants. In the social sciences, ‘statistics are extensively used in all areas of
human and social characteristics.’
3. Statistics and research: There is hardly any advanced research going on
without the use of statistics in one form or another. Statistics are used
extensively in medical, pharmaceutical and agricultural research. The
effectiveness of a new drug is determined by statistical experimentation and
evaluation. In agricultural research, experiments about crop yields, types of
fertilizers and types of soils under different types of environments are
commonly designed and analysed through statistical methods. In marketing
research, statistical tools are indispensable in studying consumer behaviour,
effects of various promotional strategies, and so on.
4. Other areas: Statistics are commonly used by insurance companies, stock
brokerage houses, banks, public utility companies and so on. Statistics are
also immensely useful to politicians since they can predict their chances for
winning through the use of sampling techniques in random selection of voter
samples and studying their attitudes on issues and policies.
Statistics in Business and Commerce
Statistics influence the operations of business and management in many dimensions.
Statistical applications include the areas of production, marketing, promotion of
product, financing, distribution, accounting, marketing research, manpower planning,
forecasting, research and development and so on. As the organizational structure
has become more complex and the market highly competitive, it has become
necessary for executives to base their decisions on elaborate information
systems and analysis instead of intuitive judgement. In such situations, statistics are
used to analyse this vast data base for extracting relevant information. Some of the
typical areas of business operations where statistics have been extensively and
effectively used are as follows:
1. Entrepreneuring: If you are opening a new business or acquiring one, it is
necessary to study the market as well as the resources from a statistical point
of view to ensure success of the new venture. A shrewd businessman must
make a proper and scientific analysis of the past records and current
market trends in order to predict the future course for business conditions.
The analysis of the needs and wants of the consumers, the number of
competitors in the market and their marketing strategies, availability of
resources and general economic conditions and trends would all be
extremely helpful to the entrepreneur. A number of new enterprises have
failed either due to unreliability of data or due to faulty interpretations and
conclusions.
2. Production: The production of any item depends upon the demand of that
item and this demand must be accurately forecast using statistical
techniques. Similarly, decisions as to what to produce and how much to
produce are based largely upon the feedback of surveys that are analysed
statistically.
3. Marketing: An optimum marketing strategy would require a skillful
analysis of data on population, shifts in population, disposable income,
competition, social and professional status of target market, advertising,
quality of sales people, easy availability of the product and other related
matters. These variables and their inter-relationships must be statistically
studied and analysed.
4. Purchasing: The purchasing department of an organization makes
decisions regarding the purchase of raw materials and other supplies from
different vendors. Statistical data on the cost structure would assist in
formulating purchasing policies as to where to buy, when to buy, at what
price to buy and how much to buy at a given time.
5. Investment: Statistics have been almost indispensable in making a sound
investment whether it be in buying or selling of stocks and securities or real
estate. The financial newspapers are full of tables and graphs analysing the
prices of stocks and their movements. Based upon these statistical data, a
good investor will buy when the prices are at their lowest and sell when the
prices are at their highest. Similarly, buying an apartment building would
require that an investor take into consideration the rent collected, rate of
occupancy, any rent control laws, cost of the mortgage obtained and the
age of the building before making a decision about investing in real estate.
6. Banking: Banks are highly affected by general economic and market
conditions. Many banks have research departments which gather and
analyse information not only about general economic conditions but also
about businesses in which they may be directly or indirectly involved. They
must be aware of money markets, inflation rates, interest rates and so on,
not only in their own vicinity but also nationally and internationally. Many
banks have lost money in international operations, sometimes in as simple a
matter as currency fluctuations because they did not analyse the
international economic trends correctly. Many banks have failed because
they over-extended themselves in making loans without properly analysing
the general business conditions.
7. Quality control: Statistics are used in quality control so extensively that
even the phenomenon itself is known as statistical quality control. Statistical
quality control (SQC) consists of using statistical methods to gather and
analyse data for the determination and control of quality. The technique
deals primarily with samples that are taken randomly so as to be representative
of the entire population. These samples are analysed, and inferences are made
concerning the characteristics of the population from which the random
samples were taken. The concept is similar to tasting one spoonful from a
pot of stew and deciding whether it needs more salt or not. The
characteristics of the samples are analysed using statistical quality control
and other statistical techniques.
8. Personnel: Study of statistical data regarding wage rates, employment
trends, cost of living indexes, work related accident rates, employee
grievances, labour turnover rates, records of performance appraisal and so
on and the proper analysis of such data assist the personnel departments in
formulating the personnel policies and in the process of manpower planning.
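The sampling idea behind statistical quality control (point 7 above) can be sketched as follows. The batch size, the defect count and the 5 per cent acceptance rule are hypothetical values chosen only for illustration and are not taken from any standard sampling plan.

```python
import random


def defect_rate(items):
    """Proportion of defective items; each item is True if defective."""
    return sum(items) / len(items)


# Hypothetical batch of 10,000 items, of which 300 (3 per cent) are defective.
batch = [True] * 300 + [False] * 9700

random.seed(7)  # fixed seed so the illustration is repeatable
sample = random.sample(batch, 200)  # the "spoonful": a random sample of 200 items

print("Defect rate in sample:", defect_rate(sample))
print("Defect rate in batch :", defect_rate(batch))

# Hypothetical rule: accept the batch if at most 5 per cent of the sample is defective.
ACCEPTABLE_RATE = 0.05
print("Accept batch:", defect_rate(sample) <= ACCEPTABLE_RATE)
```

In practice the true defect rate of the batch is unknown, and only the sample figure is available. The batch figure is printed here solely to show how close the sample-based inference comes.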
As we have seen, statistics, in one form or another, affects every business and
every individual. An average individual is involved in statistics, knowingly or
unknowingly, every day of his life, whether it be comparing prices during shopping
or putting an extra lock on his door as a result of reading about the crime rate in the
newspapers. It is perhaps an exaggeration, but it is basically true, what an
overenthusiastic, statistically aware business executive stated many years ago: ‘When
the history of modern times is finally written, we shall read it as beginning with the
age of steam and progressing through the age of electricity to that of statistics.’
Limitations of Statistics
Statistics is essential for almost all sciences such as social, physical and natural. In
spite of the extensive scope of the subject it has the following limitations:
1. Statistics does not study qualitative phenomena because it deals with facts
and figures. So the quality aspect of a variable or the subjective
phenomenon falls out of the scope of statistics. For example, qualities like
beauty, honesty, intelligence, etc., cannot be numerically expressed. So
these characteristics cannot be examined statistically.
2. Statistics does not study individuals. Statistics deals with aggregate of facts.
Single or isolated figures are not statistics.
3. Statistics can be misused. Statistics is mostly a tool of analysis. Statistical
techniques are used to analyse and interpret the collected information in an
enquiry. Statements supported by statistics are more appealing and are
commonly believed. For this reason, statistics is often misused.
4. Statistical methods rightly used are beneficial but if misused these become
harmful. Statistical methods used by less expert hands will lead to
inaccurate results. Here the fault does not lie with the subject of statistics
but with the person who makes wrong use of it.
5. Statistics cannot be applied to heterogeneous data.
6. If sufficient care is not exercised in collecting, analysing and interpreting
the data, statistical results might be misleading.
7. Only a person who has an expert knowledge of statistics can handle
statistical data efficiently.
8. Some errors are possible in statistical decisions. Inferential statistics,
in particular, involves a risk of error, and we cannot know for certain whether
an error has been committed in a given case.
Check Your Progress - 1
1. What are the two methods used for collecting data?
................................................................................................................
................................................................................................................
................................................................................................................
2. State the main objective of data collection.
................................................................................................................
................................................................................................................
................................................................................................................
3. Enlist any three limitations of statistics.
................................................................................................................
................................................................................................................
................................................................................................................
4. How can you say that statistics influences the operations of business and
management?
................................................................................................................
................................................................................................................
................................................................................................................
9.3 EVALUATING STATISTICS
Being a subject of much practical utility and having wide-ranging applications,
statistics displays a unique strength. It suffers from an important weakness as well.
All in all, it is spreading its tentacles far and wide.
1. Strength: The greatest strength of statistics as a subject lies in developing
a statistical mode of thinking, in imparting an orientation to the mind to think
statistically. This is of specific relevance in a modern society where
governance is no longer circumscribed by the day-to-day administration of
the matters of the state. Governments and other state agencies now remain
constantly engaged in various activities encompassing the whole gamut of
the functions of a corporate manager and a development planner.
This calls for collection and compilation of massive data on all such
characteristics of the subjects of the state (such as the level of education,
income, occupation, sex, age, marital status, or the like) as are necessary
for effective planning of the developmental activities of the state. All these
data, often collected over time, are carefully studied and systematically
analysed with a view to seeking useful insights and reaching statistically valid
conclusions for sound decision-making.
Thus, the wide diversity of data we face and the statistical tools that are
applied for data analysis do, together, impel us to think statistically. We are
gently coaxed into the statistical thinking mode, while:
(i) Bringing out the pattern of variations in the available data in a given
problem situation on one or more relevant characteristic(s)
(ii) Training the mind in comparative dimensions of data analysis,
examining the consequent variations, drawing inferences, and
establishing plausible relationships
So long as the process of collection and analysis of data is devoid of
comparative inputs, it fails to offer useful and reliable results for any
meaningful decision activity. For, a set of data compiled without serious
thought and deeper insights, proves a mere waste of efforts. Apart from
providing defences against any such possibilities, statistics cultivates a
resilient mind with an astute statistical sense alive to the dangers involved.
2. Weakness: The general feeling of distrust in it is an important weakness of
statistics. It emanates from the often-held view that the data, to which
statistical methods are applied, lack the desired element of accuracy. As a
result, the conclusions and inferences drawn from data analysis cannot be
claimed as being adequately reliable. In support of these fears, a
commoner may cite frequent cases of media reports and other officially
sponsored public relation material which, he feels, are generally based on
inadequate, manipulated, and unreliable data.
To the extent that this apprehension may be taken as based on factual
situations, the real culprits are those who compile, collect, and project data
in a given light. Even if data inaccuracy is otherwise taken as being too
serious a flaw of statistics, there is really no escape from it. The whole
process of data collection, compilation, and tabulation is, indeed, too
porous, and does allow room for numerous errors. These can at best be
minimized, but cannot be eliminated altogether.
Since complete accuracy cannot be ensured in the absolute sense,
considerations of reliability and trust are relevant only in the relative terms.
The facts being what they are, the saying that ‘working on some information
is better than doing without any information,’ is a useful common sense
maxim. This should, and often does, greatly soften our attitude towards the
lack of reliability of statistical data and the consequent distrust in statistics.
3. Increasing tentacles: Despite the feeling that statistical data are not often
very trustworthy, statistics has evolved fairly objective methods of
evaluating the element of errors that erodes data reliability. Assuming
sincerity and fair play on the part of those involved in data collection and
processing, the available means of estimating the extent of errors in
data-based results have greatly reduced the reason for distrust in statistical data.
All said and done on this score, statistics has, as of now, established itself
as a generic and versatile subject of study. The more one gets to know of
it, the more one imbibes of its subtle impact in terms of the mental ability to
draw fairly valid conclusions even from limited data. And it is precisely
owing to this reason that the applications of statistical methods have fast
spread their tentacles to the various important areas of human interest.
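One widely used objective measure of the error in a sample-based result is the standard error of the mean. The sketch below is only an illustration: the income figures are hypothetical, and the interval uses the normal approximation (about two standard errors either side of the mean), which is rough for so small a sample.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Hypothetical sample: monthly incomes (in thousand rupees) of 8 households.
incomes = [2, 4, 4, 4, 5, 5, 7, 9]

m = statistics.mean(incomes)
se = standard_error(incomes)

# Rough 95 per cent interval using the normal approximation;
# a t-based multiplier would give a wider interval for a sample this small.
low, high = m - 1.96 * se, m + 1.96 * se

print("Sample mean   :", m)
print("Standard error:", round(se, 3))
print("Approx. 95% interval: {:.2f} to {:.2f}".format(low, high))
```

The larger the sample, the smaller the standard error. This is why a conclusion drawn from limited data can still be stated together with an honest measure of its reliability.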
Check Your Progress - 2
1. State the major drawback of statistics.
................................................................................................................
................................................................................................................
................................................................................................................
2. What are the various statistical factors which should be considered while
planning the development activities of the state?
................................................................................................................
................................................................................................................
................................................................................................................
9.4 SUMMARY
• Most business decisions are made today on the basis of relevant information
and statistical analysis of such information.
• Business statistics can be defined as a body of methods for obtaining,
organizing, summarizing, presenting, interpreting, analysing and acting upon
numerical facts related to an activity of interest.
• Statistics refers to the field of study rather than simply to numerical
statements.
• Single or isolated facts or figures cannot be called statistics as these cannot
be compared or related to other figures within the same framework.
• All statistics are stated in numerical figures which means that these are
quantitative information only.
• The procedures for collecting data should be predetermined and well
planned and such data collection should be undertaken by trained
investigators.
• The two methods which are used for collecting data are as follows:
(a) Actual counting or measuring
(b) Estimation
• The main objective of data collection is to facilitate a comparative or
relative study of the desired characteristics of the data.
• According to Schaum’s Outline of Business Statistics, Statistics refers to
the body of techniques used for collecting, organizing, analysing and
interpreting data.
• Statistics is no longer confined to the domain of mathematics and has
spread to most of the branches of knowledge including social sciences and
behavioural sciences.
• Arrangement of data with respect to different characteristics facilitates
comparison and interpretation.
• Statistical techniques such as correlational analysis assist in establishing the
degree of association between two or more variables.
• Statistical methods are highly useful tools in analysing the past data and
predicting some future trends.
• Since the beginning of organized society, the rulers and the heads of states
have relied heavily on statistics in the form of collecting data on various
aspects for formulating sound military and fiscal policies.
• The subject of economics is mainly concerned with production and
distribution of wealth as well as savings and investments.
• Statistics are used extensively in medical, pharmaceutical and agricultural
research.
• Statistics are commonly used by insurance companies, stock brokerage
houses, banks, public utility companies and so on.
• An optimum marketing strategy would require a skilful analysis of data on
population, shifts in population, disposable income, competition, social and
professional status of target market, advertising, quality of sales people,
easy availability of the product and other related matters.
• Statistics have been almost indispensable in making a sound investment
whether it be in buying or selling of stocks and securities or real estate.
• Statistics are used in quality control so extensively that even the
phenomenon itself is known as statistical quality control.
• Being a subject of much practical utility and having wide-ranging
applications, statistics displays a unique strength.
• The greatest strength of statistics as a subject lies in developing a statistical
mode of thinking, in imparting an orientation to the mind to think statistically.
• The general feeling of distrust in it is an important weakness of statistics.
• Despite the feeling that statistical data are not often very trustworthy,
statistics has evolved fairly objective methods of evaluating the element of
errors that erodes data reliability.
• The applications of statistical methods have fast spread their tentacles to the
various important areas of human interest.
9.5 KEY WORDS
• Business statistics: It can be defined as a body of methods for obtaining,
organizing, summarizing, presenting, interpreting, analysing and acting upon
numerical facts related to an activity of interest.
• Statistics: It is defined as a body of techniques used for collecting,
organizing, analysing and interpreting data.
• Correlational analysis: It is defined as a statistical technique which
helps in establishing the degree of association between two or more
variables.
9.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The two methods which are used for collecting data are as follows:
(a) Actual counting or measuring
(b) Estimation
2. The main objective of data collection is to facilitate a comparative or
relative study of the desired characteristics of the data.
3. The following are the limitations of statistics:
(a) Statistics does not study qualitative phenomena because it deals with
facts and figures.
(b) Statistical methods rightly used are beneficial but if misused these
become harmful.
(c) Statistics cannot be applied to heterogeneous data.
4. Statistics influences the operations of business and management as it
includes many applications such as area of production, marketing,
promotion of product, financing, distribution, accounting and marketing
research and so on.
Check Your Progress - 2
1. The general feeling of distrust in statistics is considered its most
important drawback.
2. The various statistical factors which should be considered while planning
the development activities of the state are the level of education, income,
occupation, sex, age and marital status.
9.7 SELF-ASSESSMENT QUESTIONS
1. Explain the characteristics of statistics.
2. What are the various functions of statistics?
3. Enlist the uses of statistics in the field of economics.
4. State the role of statistics in the area of research.
5. Discuss the uses of statistics in business operations.
6. ‘Statistics has established itself as a generic and versatile subject of study’.
Justify.
9.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Organizing a Statistical
Survey
UNIT–10
ORGANIZING A STATISTICAL SURVEY
Objectives
After going through this unit, you will be able to:
• Discuss the steps in a statistical survey
• Understand the sources of statistical data
• Analyse the factors affecting the type of enquiry
• Describe the various different types of enquiries
• Assess non-probability sampling methods
Structure
10.1 Introduction
10.2 An Overview to Statistical Survey
10.3 Sampling Methods
10.4 Statistical Unit
10.5 Summary
10.6 Key Words
10.7 Answers to ‘Check Your Progress’
10.8 Self-Assessment Questions
10.9 Further Readings
10.1 INTRODUCTION
This unit discusses the statistical survey and its various aspects. A statistical
survey is a collection of information about items in a population. It is a statistical
inquiry into a specific target population, carried out to discover facts that can then
be used to solve problems pertaining to that segment of the population. The unit
describes the different types of enquiry and the laws pertaining to sampling. As you
advance through the unit, it also explains statistical units and degrees of accuracy,
which are important to keep in mind when conducting a survey.
10.2 AN OVERVIEW TO STATISTICAL SURVEY
The various aspects of a statistical survey are discussed below in detail.
Steps in a Statistical Survey
A statistical survey must follow a systematic sequence of steps. If the steps are
not carried out in order, the survey is unlikely to provide the desired results. The
steps, in sequence, are as follows:
1. Defining the problem
2. Determining the objective and scope
3. Preliminaries to the collection of data
i) Source of data
ii) Type of enquiry
iii) Statistical unit
iv) Degree of accuracy
4. Collection of data
5. Editing of data
6. Classification and tabulation of data
7. Data analysis
8. Data interpretation
9. Report writing
Let us now look at each of these steps in detail:
Defining the Problem
The foremost step in a survey is to identify the problem that needs to be
investigated. The problem must be stated and defined clearly, as this helps in
identifying the relevant data. Statistics deals with aggregates of facts expressed
numerically. For this reason, the problem must be defined in a way that permits
quantitative measurement.
Determining the Objective and Scope
The step that comes immediately after stating the problem is to determine the
objective and scope of the survey. A clearly stated objective acts as a guide in
compiling the required information, and it helps you take a uniform approach to the
problems that arise while you are conducting the survey. The scope of the survey
covers the area to be surveyed, the time allotted to the study, the items to be
covered and the information to be collected. All of these depend on the problem
under scrutiny and on the objective of the study. The accuracy of the results
depends on a correct assessment of everything that is included, so it is essential to
determine the scope of the survey precisely.
Preliminaries to the Collection of Data: Before proceeding with the collection of
data, the following preliminaries need to be dealt with:
i) Source of Data: The sources from which data will be collected need to be
decided. You can employ two approaches: (1) collecting the data personally, or
(2) collecting data from published sources. Data that you collect first-hand is
primary data. Data collected by someone else is secondary data.
ii) Type of Enquiry: It is important first to determine the kind of enquiry that
will be conducted. An enquiry may be census or sample, initial or repetitive,
direct or indirect, regular or ad-hoc, confidential or non-confidential, official
or non-official, etc. The choice should be made before the study begins,
keeping in view its object and scope, the client involved and the data sources.
iii) Defining the Statistical Unit: You must decide the statistical unit or units in
terms of which the data is to be collected. The important consideration here is
to eliminate any chance of ambiguity. By defining the statistical unit precisely,
you minimize the chances of collecting inconclusive data. Once defined, the
statistical unit becomes the basis of the investigation. You will learn about the
statistical unit in detail later on.
iv) Degree of Accuracy: You should decide in advance the degree of accuracy that
you want to achieve in the collection of data. Absolute accuracy is hard to
achieve, because the expense and time it requires are seldom justified by the
gain in accuracy. Nonetheless, you should strive for reasonable accuracy, which
depends on the data used and on the purpose of the investigation.
Data Collection: Once the preliminaries are settled, you move on to data
collection. Data can be collected by various means. The most suitable method
should be chosen after taking all the relevant factors into consideration: the scope
and objective of the enquiry, the available finances and the time involved.
Editing the Data: After collecting the desired data, the next step is to scrutinize
the information that has been collected. This is called data editing. It is important
because the collected data may contain errors. However, editing should correct
such errors without tampering with the data.
Classification and Tabulation of Data: Organizing the data is equally
important. The data should be presented in a compact form, such as a table, graph
or chart; a common compact form is the frequency distribution. This makes it
easier to pick out the salient features, and classifying the data in this way also
allows easy comparison.
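As an illustration, the following is a minimal sketch of how raw observations might be condensed into a frequency distribution. The marks, the class width of 10 and the function name are hypothetical and are not taken from the text.

```python
from collections import Counter

# Hypothetical raw data: marks obtained by 12 students.
marks = [35, 42, 47, 51, 55, 58, 61, 63, 68, 72, 75, 88]

def frequency_distribution(values, width):
    """Count how many values fall in each class interval [lower, lower + width)."""
    counts = Counter((v // width) * width for v in values)
    return {(lower, lower + width): counts[lower] for lower in sorted(counts)}

table = frequency_distribution(marks, width=10)
for (lower, upper), freq in table.items():
    print(f"{lower} and under {upper}: {freq}")
```

Each class interval includes its lower limit and excludes its upper limit, so no observation can fall into two classes.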
Data Analysis: The next step is to analyse the data using statistical measures
such as percentages, averages and coefficients. Data that has already been
organized in tabular or numerical form is easy to analyse, whereas data in raw form
serves little or no purpose. Different statistical measures summarize different
characteristics of the data. Out of the numerous methods of analysis, you should
carefully choose the measures that are best suited to the specific survey.
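The following short sketch shows the three kinds of measure mentioned above (an average, a percentage and a coefficient) computed with Python's standard `statistics` module. The income figures are hypothetical, and the coefficient of variation is used only as one example of a coefficient.

```python
import statistics

# Hypothetical monthly incomes (in rupees) of five households.
incomes = [12000, 15000, 18000, 15000, 40000]

mean_income = statistics.mean(incomes)      # average
median_income = statistics.median(incomes)  # middle value, less affected by extremes

# Percentage of households earning more than the mean.
above_mean = sum(1 for x in incomes if x > mean_income)
percent_above_mean = 100 * above_mean / len(incomes)

# Coefficient of variation: standard deviation as a percentage of the mean.
coefficient_of_variation = 100 * statistics.pstdev(incomes) / mean_income

print(mean_income, median_income, percent_above_mean, round(coefficient_of_variation, 1))
```

Here the mean is pulled upwards by one large income while the median is not, which is why the choice of measure should suit the purpose of the survey.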
Data Interpretation: After data analysis comes the step of drawing inferences.
This must be done with careful consideration, otherwise it may lead to misleading
conclusions. Interpretation gives a broader perspective on the survey findings, and
proper interpretation brings out the relations and processes that underlie them.
Report Writing: Report writing is the last step in a statistical survey. Without
the report the survey would be incomplete, because the purpose of a survey is not
fulfilled until the findings are clearly stated and communicated to people in a
systematic manner. The results of the survey may also serve as a source of
knowledge. For all these reasons, survey reports are significant.
Sources of Statistical Data
Once the scope and object of the enquiry have been determined, the next
important step is to decide the sources of data. Data can be classified into two
categories:
(1) Primary data
(2) Secondary data
Let us now discuss these one by one.
Primary Data and Secondary Data
Primary data is data that you collect for the first time, for your own use. This
makes you the primary source, and the data collected is original data. Data that
has already been compiled, analysed and classified by someone else is secondary
data, and the sources that collected it are secondary sources. For example, the
national income data compiled by the Government is primary data for the
Government. When the same data is used by research workers, it becomes
secondary data.
Primary data is thus raw data that simply records the information collected and to
which statistical methods of analysis are yet to be applied. Secondary data, by
contrast, is more of a finished product that has already been treated using
statistical methods.
If you are using primary data for your survey, it is essential to identify the
sources from which the data will be collected. Mass enquiries, such as a population
census, involve a large population, and a large number of people have to be
surveyed. Small enquiries, such as a cost of living enquiry, may only require you to
contact the industrial workers in a specific city, which involves fewer people.
If you are using secondary data for your study, you should first edit and
scrutinize it for discrepancies. Otherwise you may not achieve the desired accuracy,
or the data may not suit the purpose that you want it to serve. Secondary data used
without editing or scrutiny may introduce errors and make the investigation
incorrect. For these reasons, secondary data must be used with caution.
Methods Involved in Collection of Primary Data
You can employ several methods for primary data collection. The important ones
are: (i) observation, (ii) interview, (iii) questionnaire, and (iv) schedule.
Let us now study each of these in brief.
i) Observation: In this method you collect data through personal observation. It
requires intensive study of the phenomenon as it occurs.
ii) Interview: In this method you obtain information by interviewing people who
have knowledge of a particular subject or of the problem that you are
investigating.
iii) Questionnaire: In this method you collect information through a series of
questions that are e-mailed or posted to people. The questions relate to the
problem that you are investigating. The respondents answer the questions and
return the questionnaire.
iv) Schedule: In this method the questionnaires are sent through enumerators,
who help the respondents to answer them.
You can choose any of these methods for collecting primary data, depending on
the time and funds available and on the circumstances.
Sources of Secondary Data
You can collect secondary data from two sources: (1) published sources, and (2)
unpublished sources.
Published sources include government publications and the publications of foreign
governments and international bodies such as the World Bank. Other published
sources include trade journals, stock exchanges, technical journals, newspapers and
magazines, etc. Unpublished sources include, but are not limited to, the works of
scholars, labour bureaus, research workers and trade associations.
Types of Enquiries
The preliminaries to a statistical survey are incomplete until the type of enquiry
has been decided. You need to decide whether the enquiry will be direct or indirect,
census or sample, original or repetitive, and confidential or open. Before looking at
the different kinds of enquiry, however, it is important to understand the factors
that influence the choice.
Factors Affecting the Type of Enquiry
The decision about the type of enquiry is influenced by several factors. These are
as follows:
Objective and Scope of the Survey: This is the most important factor in
determining the type of enquiry. For example, if your enquiry concerns the area
under rice cultivation in West Bengal, the method best employed is complete
enumeration. If, on the other hand, your objective is to find the per hectare yield,
you need only pick sample plots in different locations and estimate the yield from
them. This is the better method here because it does not require complete
enumeration: a sample survey is enough to give a sufficiently accurate result.
Similarly, if the scope of the enquiry is wide, requiring information from many
sources on several items, one method will be appropriate; if the scope is narrow,
another kind of enquiry is to be employed.
Who Conducts the Survey: This is another factor to consider when determining
the type of enquiry. The way data is collected depends on whether the survey is
conducted by the Government, by an organization or by an individual. The
Government can spend money and obtain information through compulsion; other
bodies cannot. An organization or an individual has to rely on moral pressure or
persuasion to obtain information. When an individual conducts a survey alone, the
limited resources available and the limited information available with people
narrow the scope of the enquiry.
Financial Implications: The decision on the type of enquiry also depends on the
financial resources available. Money is a governing factor in conducting a statistical
survey and should be considered with all its implications. A large survey costs more
than a small-scale survey. Financial resources differ between individuals,
institutions and other surveying bodies: the state generally has ample funds
compared with a private institution, which in turn has far more than an individual.
For these reasons, the financial resources available should be a core deciding factor
in the kind of enquiry that is undertaken.
Sources of Data: One more important factor is the source of the statistical
information, that is, where the information is collected from. An enquiry based on
primary data differs from one based on secondary data. With primary data, the
sources from which information is obtained, and the terms and units in which it is
defined, have to be decided. Such decisions do not arise with secondary data.
Different Types of Enquiries
Census or Sample Enquiry: The items covered by an enquiry constitute the
population or universe. In statistics, population does not mean only the human
population; it is the totality of all the items involved in a specific statistical study.
A census enquiry covers the whole group, whereas a sample enquiry covers only a
part of it.
A census enquiry is a complete enumeration of the items in the population. Since
all the items are covered, the highest accuracy is expected, with no element of
uncertainty. In reality this is not so. Bias enters the enquiry as a source of error,
and it grows as the number of observations increases. The only way to check the
bias is through sample checks or a further survey.
A census enquiry also consumes a great deal of time, energy and money. Organizing
a census on a large scale is therefore difficult, as numerous resources are involved,
and such an enquiry is beyond the reach of an individual. Only the Government can
carry out a complete enumeration, and even the Government does so rarely. This is
why the Government undertakes the population census only once in a decade.
Moreover, it is sometimes difficult to examine every item in the population, and
sometimes sufficiently accurate results can be obtained by studying only a part of
the population. In such cases there is no need for a census survey.
A sample enquiry requires a partial study of the population. In field studies,
considerations of time, cost, convenience and other such factors form the basis for
choosing a sample survey. In a sample survey, the items selected are meant to
represent the population in its totality. A sample selected without bias helps the
investigator to estimate the characteristics of the population and to produce
reliable and valid results.
The advantages of a sample enquiry are as follows:
i) A sample study is cheaper than a census study, and the results are obtained
in a shorter time.
ii) The measurements can be more accurate, as the data collection is done by
experienced and trained investigators.
iii) When the population is large, a sample survey is the best means of data
collection.
iv) A sample survey is the best means of survey when the object under study is
destroyed in the process. An example comes from the physical sciences,
where a fresh sample is required each time the chemicals are used.
v) This method allows you to estimate the errors that arise as a result of
sampling.
Despite these advantages, a sample enquiry is of little use when the area to be
covered is small or narrow. The choice of the means of enquiry depends on factors
such as the availability of resources, the nature of the enquiry, and its objective
and scope.
Original or Repetitive Enquiry: An original enquiry is one conducted for the
first time, whereas a repetitive enquiry is carried out in continuation of previous
surveys. In an initial or original survey you are at liberty to adopt any means of data
collection, but in a repetitive enquiry the method used earlier has to be followed
throughout the study, and is modified only when a new situation is encountered. In
a repetitive enquiry you must also be careful not to change the definitions of the
terms used, as this would lead to inaccurate comparisons.
Confidential or Open Enquiry: In a confidential enquiry, as the name suggests, the
results are kept secret and are not revealed to the public. In an open enquiry the
results are made public. The two kinds of enquiry are handled differently. Most
enquiries conducted by the state or the Government, and by institutions, are
non-confidential. Private bodies such as trade unions and other associations collect
information that is kept to themselves and confined to a few members.
Direct or Indirect Enquiry: A direct enquiry is one in which direct quantitative
measurement is possible. For example, height, weight and income can be expressed
in quantitative terms. In an indirect enquiry direct quantitative measurement is not
possible. Honesty, intelligence and efficiency are examples of factors that cannot be
measured directly. Such factors are still taken into consideration because they
influence the problem under study, so they have to be measured indirectly through
some related quantity. For example, to study the intelligence of students, the marks
obtained by the students may be used.
Regular or Ad-hoc Enquiry: A regular enquiry collects data at regular intervals
over a period of time. An ad-hoc enquiry collects data as and when required,
without any fixed interval or specific timing; the enquiry is conducted whenever the
data is needed.
Official, Semi-official or Non-official Enquiry: An official enquiry is one
conducted by the Government. A semi-official enquiry is conducted by bodies that
enjoy government patronage. A non-official or private enquiry is carried out by
private institutions, bodies or individuals. The facilities available vary with the type
of enquiry. For example, in an official enquiry people are obliged to supply
information. In a semi-official enquiry the information is obtained on request. In a
private enquiry the investigator may face numerous difficulties in collecting data.
Check Your Progress - 1
1. What is the foremost step in a survey?
2. What does direct enquiry comprise?
3. What does ad-hoc mean?
10.3 SAMPLING METHODS
Data for a statistical enquiry is collected through surveys. There are two types of
surveys, which are discussed below:
(1) Census survey: This covers the entire group.
(2) Sample survey: This covers selected representative items of the group.
In a sample survey, the representative items constitute the sample. The methods
through which samples are selected can be categorized as follows:
(1) Probability sampling methods
(2) Non-probability sampling methods
Probability Sampling Methods
In probability sampling, every item in the population has a chance, or probability,
of being included in the sample. Probability sampling includes the following
methods:
1. Simple random sampling
2. Systematic sampling
3. Stratified sampling
4. Cluster sampling
5. Area sampling
6. Multi-stage sampling
Simple Random Sampling: This method is also referred to as the lottery or chance
method. Every item has an equal chance of being included in the sample, and every
possible sample has the same probability of being selected. This method is used
when the population is a homogeneous group. The selection is usually made with
the help of random numbers.
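A minimal sketch of simple random sampling is given below, using Python's standard `random` module in place of a table of random numbers. The population of 50 numbered items and the seed value are assumptions made only for illustration.

```python
import random

# Hypothetical population: 50 items identified by serial numbers 1 to 50.
population = list(range(1, 51))

# A fixed seed is used only so that the run can be repeated exactly.
rng = random.Random(42)

# Draw 5 items without replacement; every item has the same chance of selection.
sample = rng.sample(population, k=5)
print(sample)
```

Because the draw is made without replacement, no item can appear twice in the sample, just as a lottery ticket cannot be drawn twice.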
Systematic Sampling: Under this method, the units are first arranged in some
order, such as chronological or alphabetical order, and the units that appear at
specific intervals are included in the sample. For example, you might select every
14th name on a list, or every 10th house on one side of a street. The first unit,
which becomes the starting point, is picked at random. Selection thus begins at a
random point in the list and continues at the fixed interval until the required number
of units is obtained.
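The procedure can be sketched as follows. The list of 140 names, the interval of 14 and the function name `systematic_sample` are hypothetical; the random start is taken within the first interval so that every unit has a chance of selection.

```python
import random

def systematic_sample(items, interval, rng):
    """Pick a random starting point within the first interval,
    then take every interval-th item after it."""
    start = rng.randrange(interval)
    return items[start::interval]

# Hypothetical alphabetical list of 140 names.
names = [f"Name {i}" for i in range(1, 141)]

rng = random.Random(7)  # fixed seed only for repeatability
sample = systematic_sample(names, interval=14, rng=rng)
print(sample)
```

With 140 names and an interval of 14, the sample always contains 10 names, whichever starting point is drawn.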
Stratified Sampling: This method is used when the population is not a single
homogeneous group. The population is split into different groups, called strata,
each of which is homogeneous. The strata must be defined carefully so that they do
not overlap. Once the stratification is done, samples are selected randomly from
each stratum, either in equal numbers or on a proportionate basis. For example, if
you are to survey the economic conditions of the employees of a university and its
affiliated institutions, you might split them into the following five categories:
(i) Principals and professors
(ii) Readers
(iii) Lecturers
(iv) Administrative staff
(v) Class IV staff
Each of these groups, looked at in isolation, is homogeneous. These groups are
therefore called strata. You then pick a suitable sample at random from each group.
This is what is called stratified sampling.
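Proportionate stratified sampling for the five strata named above might be sketched as follows. The staff counts, the 10 per cent sampling fraction and the function name are hypothetical.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from every stratum."""
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one item per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical number of employees in each stratum.
sizes = {
    "Principals and professors": 20,
    "Readers": 40,
    "Lecturers": 100,
    "Administrative staff": 60,
    "Class IV staff": 80,
}
strata = {name: [f"{name} #{i}" for i in range(1, n + 1)] for name, n in sizes.items()}

rng = random.Random(1)  # fixed seed only for repeatability
sample = stratified_sample(strata, fraction=0.10, rng=rng)
for name, chosen in sample.items():
    print(name, len(chosen))
```

Since the same fraction is applied to every stratum, the larger strata contribute more items, which keeps the sample proportionate to the population.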
Cluster Sampling: In this method the population is divided into heterogeneous
groups called clusters, and a few of these are selected by random sampling. The
survey then covers all the items in the selected clusters. In the previous example,
each institution contains employees of all five categories and so forms a
heterogeneous group. Each institution in the list would be a cluster. A few
institutions are selected by random sampling, and all the employees of the selected
institutions are surveyed. This is cluster sampling.
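A minimal sketch of cluster sampling follows. The six institutions, their employee lists and the function name are hypothetical; the point illustrated is that whole clusters are selected at random and every member of a selected cluster is surveyed.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and include every member of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical institutions, each a mixed (heterogeneous) group of employees.
roles = ("professor", "reader", "lecturer", "administrative", "class IV")
clusters = {
    f"Institution {c}": [f"{c}-{role}" for role in roles]
    for c in "ABCDEF"
}

rng = random.Random(3)  # fixed seed only for repeatability
surveyed = cluster_sample(clusters, n_clusters=2, rng=rng)
for institution, employees in surveyed.items():
    print(institution, employees)
```

Compare this with stratified sampling, where every group is represented but only some members of each group are selected.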
Area Sampling: This method is similar to cluster sampling. It is used when the
survey is spread over a wide geographical area. The area is divided into smaller
areas, and a random selection is made from among these smaller areas. All the units
in the selected areas are then studied and examined to complete the survey.
Multi-stage Sampling: This method is used when the survey covers a large area or
when the population is a heterogeneous group. Suppose, for example, that the
survey you are about to conduct covers families from the whole country. This
requires a multi-stage sampling method. In the first stage, a few states are selected
at random. Next, a few districts are selected at random from each selected state.
Then a few towns are selected from each of the selected districts. Finally, families
are selected at random from the selected towns. The selection is thus carried out in
four stages to arrive at the final sample, and every item has a chance of being
selected.
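The four stages described above can be sketched as nested random selections. The sampling frame (4 states, each with 3 districts, each with 3 towns of 10 families) and the numbers selected at each stage are hypothetical.

```python
import random

def multistage_sample(country, n_states, n_districts, n_towns, n_families, rng):
    """Select states, then districts, then towns, then families, each at random."""
    selected = []
    for state in rng.sample(sorted(country), n_states):
        districts = country[state]
        for district in rng.sample(sorted(districts), n_districts):
            towns = districts[district]
            for town in rng.sample(sorted(towns), n_towns):
                for family in rng.sample(towns[town], n_families):
                    selected.append((state, district, town, family))
    return selected

# Hypothetical frame: state -> district -> town -> list of families.
country = {
    f"State {s}": {
        f"District {s}{d}": {
            f"Town {s}{d}{t}": [f"Family {s}{d}{t}-{f}" for f in range(1, 11)]
            for t in range(1, 4)
        }
        for d in range(1, 4)
    }
    for s in range(1, 5)
}

rng = random.Random(11)  # fixed seed only for repeatability
sample = multistage_sample(country, 2, 2, 2, 3, rng)
print(len(sample), sample[0])
```

Selecting 2 states, 2 districts per state, 2 towns per district and 3 families per town gives a final sample of 24 families, without ever listing all the families in the country.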
Non-probability Sampling Methods
These methods involve the deliberate selection of particular items. In simple
terms, units that the investigator considers unrepresentative have no chance of
being included in the sample. For this reason the method is referred to as
non-probability sampling. The following methods can be used:
Convenience Sampling: When sample items are selected from the population
because of their ease of access, the method is called convenience sampling. For
example, to collect data from petrol consumers, you might select the petrol pump
stations that are within easy reach and interview the people who buy petrol from
these stations.
Judgment Sampling: When the investigator’s own judgment is used to select the
items that are considered representative of the population, the method is called
judgment sampling. This kind of sampling is used in qualitative research, where
the need is to develop hypotheses.
Quota Sampling: This is yet another form of non-probability sampling. The
population is divided into homogeneous groups, and each interviewer is allotted a
quota to be filled from each group. The actual selection of items is left to the
discretion of the interviewer, who is the final judge of which items are included.
The size of each quota is proportionate to the size of the group in the population.
You are now familiar with several sampling methods and are at liberty to choose
the one that suits your purpose. Note, however, that the errors arising from
personal judgment can be overcome by using random sampling.
Purposive sampling is most desirable when the choice is narrow and a
characteristic is to be studied intensively. Sample designs other than random
sampling may also be preferred for reasons of convenience and low cost. The
sampling method should therefore be chosen with utmost care, taking into
consideration the scope and nature of the enquiry along with other factors such as
staff, convenience, money and time.
Law of Statistical Regularity
This law states that a random selection of items from the universe will tend to
provide a representative sample. On average, a sample chosen at random will bear
the same characteristics and the same composition as the whole universe. Take the
example of a school with 700 boys and 300 girls. If you select 100 students at
random, you can expect about 70 boys and 30 girls in the sample. Conversely, if a
random sample of 100 students from a school of 1000 contains 70 boys and 30
girls, you can infer that the school has roughly 700 boys and 300 girls. Here the
results derived from a study of 100 items are applied to 1000 items; this is what
sampling is all about.
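The school example can be tried out with a small simulation. The sketch below is only an illustration: the function name, the seed and the number of repetitions are assumptions made for this example, not part of the text. It draws many random samples of 100 students and averages the number of girls per sample.

```python
import random

def girls_in_sample(rng, boys=700, girls=300, sample_size=100):
    """Draw one random sample without replacement and count the girls."""
    school = ["boy"] * boys + ["girl"] * girls
    sample = rng.sample(school, sample_size)
    return sample.count("girl")

# A fixed seed (an arbitrary choice) makes the run repeatable.
rng = random.Random(7)

# Repeat the sampling many times and average the results.
counts = [girls_in_sample(rng) for _ in range(2000)]
average_girls = sum(counts) / len(counts)

print("Girls in the first five samples:", counts[:5])
print("Average number of girls per sample:", round(average_girls, 2))
```

Individual samples differ from one another, but their average should settle close to 30 girls per 100 students, which is the composition of the school.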
The following conditions apply to the law of statistical regularity:
i) The selection should be random, that is, every item should have an equal
opportunity of being selected.
ii) The number of items included should be large enough for the sample to be
representative.
It follows that if a large sample is chosen at random from the population, the
sample is very likely to have the same characteristics as the population.
Law of Inertia of Large Numbers
This law follows naturally from the law of statistical regularity. Accuracy and
sample size are inter-related: with large numbers, errors tend to cancel one
another out. A large sample therefore shows a higher degree of stability than a
small one. Let us understand this with the example of tossing a coin.
If you toss a coin 40 times, you may expect to get heads 20 times. In practice,
however, you may get heads 25 times and tails 15 times. If you toss it another 40
times, the situation may be reversed. Tossing the coin 1000 times would give
close to 500 heads and 500 tails. This happens because, with a larger number of
tosses, the differences between the expected and the actual results move in
opposite directions and tend to cancel each other out. In other words, over a
large number of tosses an irregularity in one direction is compensated by an
irregularity in the other.
From this example we can say that there is inertia in large numbers. This means
that large numbers are more likely to behave consistently.
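The coin-tossing example can be simulated as follows. This is a minimal sketch under stated assumptions: a fair coin is modelled with Python's random module, and the seed and the numbers of tosses are arbitrary choices made for illustration.

```python
import random

def proportion_of_heads(n_tosses, rng):
    """Toss a fair coin n_tosses times and return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

rng = random.Random(2024)  # arbitrary seed, for repeatability only

for n in (40, 1000, 100000):
    share = proportion_of_heads(n, rng)
    print(f"{n:>6} tosses: proportion of heads = {share:.4f}")
```

With only 40 tosses the proportion of heads can be noticeably different from one half; as the number of tosses grows, the proportion should stay closer and closer to one half.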
Consider rice cultivation in a specific district: production varies from year to
year. If the production of the whole state is considered, it does not vary as
much, because some districts may produce more than usual while others produce
below normal. The overall production at the state level is therefore more stable.
At the national level the year-to-year variation in rice production is smaller
still. This is what is referred to as the inertia of large numbers.
This does not mean that large numbers never change with the passage of time. It
means that there will be no sudden or violent fluctuations in large numbers. There
will be fluctuations, but these will be slow and gradual. In short, the larger the
number of items included, the smaller the deviation.
Check Your Progress - 2
1. Where is multi-stage sampling used?
................................................................................................................
................................................................................................................
................................................................................................................
2. What does the law of statistical regularity state?
................................................................................................................
................................................................................................................
................................................................................................................
10.4 STATISTICAL UNIT
The statistical unit must be defined properly before a statistical survey begins.
The unit is defined in terms of the measurements the investigator will take on the
variables selected for enumeration, interpretation and analysis. A proper
definition is essential for the collection of relevant data: if the unit is not well
defined, the data collected may leave out relevant items that should have been
included. Defining the unit is therefore not as easy as it may seem at first.
Features of a Good Statistical Unit
A good statistical unit should meet the following requirements:
i) The unit must be appropriate: The unit should fulfil the purpose of the
enquiry. For example, there are different kinds of prices, such as cost,
retail and wholesale prices. Once price has been selected as the unit, it is
essential to select the kind of price that suits the purpose of the enquiry.
Remember that your results may be misleading if you select the wrong
price. If the retail price suits the enquiry, it should be used instead of
another type of price.
ii) The unit should be specific and unambiguous: The unit must be defined
specifically to avoid ambiguity. If this is not done, the data collected
will be full of errors and inaccurate.
iii) The unit must be stable: If the value of the unit fluctuates, data collected
at different places or times will not be comparable, and the results will
be misleading.
iv) The unit must be homogeneous: Once you have defined the statistical
unit, keep it uniform throughout the enquiry. This is essential for valid
comparisons based on the data collected.
v) The unit must be simple: The statistical unit should be simple, so that it
is easily understood, and it should be complete.
Types of Units
Now take a look at the different kinds of statistical units.
1. A statistical unit may be either a physical unit or an arbitrary unit.
Examples of physical units are grams, kilograms, tons, meters and so on.
These units are common and need no explanation. For some studies,
however, physical units are not suitable. For example, when studying the
wages of workers in an industry, the unit will be the wage. Different kinds
of wages exist, such as piece wage, daily wage, monthly wage, money wage
and so on. This situation requires an arbitrary decision about the kind of
wage on which data is to be collected, followed by a definition of it.
2. Statistical units fall into two categories:
(i) Units of estimation or enumeration
(ii) Units of analysis and interpretation
(i) Units of enumeration are the terms in which data is collected. They
may be simple or composite. A simple unit represents a single
condition without any qualification, for example hour, house, meter
and worker. A composite unit consists of a simple unit with a
qualifying word added that limits its scope, and for this reason it is
more difficult to define.
For example, worker and skilled worker are two units. The first is
simple and the second is composite. A composite unit should be
defined carefully, making clear what the qualifying word adds. Other
examples of composite units are kilowatt-hour and machine-hour.
(ii) Units of analysis and interpretation are comparative in nature, and
therefore include coefficients, rates, percentages and so on.
Degree of Accuracy
The degree of accuracy required of the data should be decided in advance, before
the enquiry begins.
There are two aspects that you should keep in mind:
(i) Accuracy itself
(ii) The degree of accuracy that needs to be achieved in the given
investigation
It should be remembered that absolute accuracy in describing a phenomenon
exactly is not possible. Results fall short of absolute accuracy because of
imperfection on the part of the investigator or because of imperfect measuring
instruments. Even in the physical sciences, with controlled experiments, absolute
accuracy is not possible, let alone in the social sciences.
Significance of Reasonable Accuracy
In statistical investigations absolute accuracy is not required. Reasonably
accurate estimates are enough for the purposes of understanding and analysis.
Take the weight of food grains: it is good enough to measure it in quintals; it
need not be stated in grams. You may state it correct to the kilogram if required.
Similarly, when measuring the distance between cities, the unit is the kilometer.
Including the additional meters is not required, as they have no significance.
Even in counting, absolute accuracy is rare. A population census requires a high
degree of accuracy because it aims to count every person, yet even in a census
some people may be left out during enumeration. Ages, on the other hand, do not
need to be recorded with the same accuracy as the head count: ages stated in
completed years serve most general purposes. For this reason there is no need for
absolute accuracy; reasonable accuracy is enough. The question now is: what is
reasonable accuracy?
It cannot be defined once and for all, as it depends on the nature and objective
of the enquiry. Compare measuring the distance between two cities with
measuring cloth. The former need not be stated down to the meter, while the
latter must be accurate to within a few centimeters.
Similarly, when measuring coal a few grams can be ignored, but when measuring
gold every gram must be accounted for. These considerations should be kept in
mind when carrying out statistical investigations in order to achieve reasonable
accuracy.
The methods and units adopted should be those that help achieve the desired
degree of accuracy. The accuracy of measurement depends on two factors:
(i) The accuracy of the measuring instruments
(ii) The care with which the investigator uses them
For instance, when measuring lengths one can reasonably include the millimeters.
However, when an investigation records the ages of people in years or months, it
is not practicable to include the number of days, as that information cannot be
obtained with accuracy.
Concept of Spurious Accuracy
Let us begin by understanding the concept through an example. Suppose the ages
of five class X students are 16 years 7 months, 17 years 2 months, 16 years 8
months, 15 years 9 months, and 15 years 10 months.
It would be misleading to summarize these figures by taking only the completed
years and stating the average age of a student as follows:
(16+17+16+15+15)/5 = 15.8 years.
Since only completed years were used, it is better to state the average age to the
same accuracy, that is, as 15 completed years.
The accuracy implied by the figure 15.8 years is what we call spurious accuracy:
the decimal place suggests a precision that the data does not support. Whenever
numerical facts are presented, one needs to guard against spurious accuracy.
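A short calculation shows why the decimal in 15.8 is spurious. The sketch below uses the five ages from the example; the variable names are illustrative. It compares the average of the completed years with the average obtained when the months are also taken into account.

```python
# Ages of the five students as (years, months), taken from the example.
ages = [(16, 7), (17, 2), (16, 8), (15, 9), (15, 10)]

# Average using completed years only, as in the misleading calculation.
mean_completed_years = sum(years for years, _ in ages) / len(ages)

# Average using the full ages, converted to months first.
total_months = sum(12 * years + months for years, months in ages)
mean_exact_years = total_months / (12 * len(ages))

print("Average of completed years:", round(mean_completed_years, 1))
print("Average of full ages (years):", round(mean_exact_years, 1))
```

The two averages differ by more than half a year, so the first decimal place of the average of completed years carries no real information.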
Check Your Progress - 3
1. On what basis is the statistical unit defined?
................................................................................................................
................................................................................................................
................................................................................................................
2. Why should the statistical unit be kept simple?
................................................................................................................
................................................................................................................
................................................................................................................
10.5 SUMMARY
• Statistical surveys are fact-finding enquiries into matters of interest. They
need to be planned properly and executed with caution so that the results
reflect reality.
• The steps in a statistical survey are:
o Defining the problem
o Determining the objective and scope of the survey
o Accomplishing the initial steps like deciding the sources of data, type
of enquiry, statistical unit and the degree of accuracy desired
o Data collection
o Editing the data
o Classification and tabulation of data
o Data analysis
o Data interpretation
o Writing the report
• The sources of data can be primary or secondary. Primary data is original
data collected for the first time by the investigator. Secondary data is data
that has already been collected and on which the investigator then works.
• Data can be collected by several methods, such as interview, questionnaire,
schedule, observation and many more. It is up to the investigator to use the
most suitable technique, which depends on the scope, nature and object of
the enquiry and on the constraints of money and time.
• Secondary data can be obtained from newspapers, journals, books, reports
and other published sources. If needed, unpublished sources can also be
referred to.
• Before conducting a survey it is important to know that there are different
kinds of survey: it can be a sample survey or a census. A census is used
when a whole group is to be surveyed, while a sample survey covers only a
part of the group.
• In practice the sample survey is the best method to employ because of the
numerous advantages it offers. Other forms of enquiry include direct and
indirect, open and confidential, original or repetitive, regular or ad-hoc.
• After deciding the kind of enquiry that you will employ, the next step is to
understand the factors involved in selecting the sample. Sample selection
has many methods, grouped as:
o Probability sampling methods
o Non-probability sampling methods
• There are two laws that you need to know about:
o Law of statistical regularity
o Law of inertia of large numbers
• The law of statistical regularity states that a large sample selected at
random from a population will most likely possess the same
characteristics as the population.
• The law of inertia of large numbers states that large numbers are more
stable than small numbers. Fluctuations are gradual and slow when large
numbers are taken into account.
• The statistical unit is the unit in which the variables are measured for the
purpose of enumeration, interpretation and analysis. The unit selected
should be stable, complete, simple, precise, unambiguous and appropriate.
• Finally, keep in mind that absolute accuracy is not possible in statistical
surveys. The purpose is served by achieving reasonable accuracy, which
depends on the nature and object of the enquiry.
10.6 KEY WORDS
• Data Collection: It is the process of collecting data for the purpose of
study or analysis, often in order to reach a conclusion.
• Data Interpretation: It is the process of assigning meaning to the collected
information and determining the conclusions, significance, and implications
of the findings.
• Questionnaire: It is a set of printed or written questions with a choice of
answers, devised for the purposes of a survey or statistical study.
• Quota Sampling: It is a process in which the population is divided into
homogeneous groups and each interviewer is allotted a quota of interviews
to be filled from each group.
10.7 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The foremost step in the survey is to identify the problem that needs to be
investigated.
2. A direct enquiry is one that produces direct quantitative measurements.
3. Ad-hoc means collecting data when required, without any fixed intervals or
specific timings.
Check Your Progress - 2
1. Multi-stage sampling is used when the survey has to cover a large area or
when the population comprises heterogeneous groups.
2. The law of statistical regularity states that a random selection of items
from the universe will provide a representative sample.
Check Your Progress - 3
1. The statistical unit is defined as per the measures undertaken by the
investigator and as per the variables that are selected for enumeration,
interpretation and analysis.
2. The statistical unit should be kept simple so that it is easily understood.
10.8 SELF-ASSESSMENT QUESTIONS
1. What do you mean by a statistical survey?
2. List the various steps in a statistical survey.
3. Account for the primary and secondary data in a statistical survey.
4. What are the various methods of collecting primary data? Discuss.
5. List the factors that affect the type of enquiry.
6. Discuss the law of inertia of large numbers. Support your answer from the
learning of the text.
7. Write a short note on degree of accuracy.
8. What are the various features of a good statistical unit? Discuss.
10.9 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Accuracy, Approximation
and Errors
UNIT–11
ACCURACY, APPROXIMATION AND ERRORS
Objectives
After going through this unit, you will be able to:
• Discuss the errors in statistics
• Explain the measurement of errors of approximation
• Analyse the effect of mathematical operations on error
• Assess sampling and non-sampling errors
Structure
11.1 Introduction
11.2 Approximation and Errors
11.3 Estimation and Sampling of Errors
11.4 Summary
11.5 Key Words
11.6 Answers to ‘Check Your Progress’
11.7 Self-Assessment Questions
11.8 Further Readings
11.1 INTRODUCTION
This unit discusses accuracy, approximation and errors. Statistical data should
meet a reasonable standard of accuracy, and for this the required degree of
accuracy needs to be clearly defined. It is important to understand how accuracy
is measured and how approximations are made so that the desired accuracy is
achieved. Foremost among the aspects to keep in mind are the errors that occur.
Errors generally arise when the measurements are inaccurate, when the methods
employed are inappropriate, or when the figures are approximate values.
This unit will teach you the concepts related to accuracy. You will also learn
about the concepts and methods of approximation, and how to recognize and
measure the errors that arise.
11.2 APPROXIMATION AND ERRORS
Accuracy
Statistical data are obtained by counting, measuring or estimating. If the data
relate to cars, the cars are counted. If they relate to milk, the produce is
recorded and milk powder is weighed. However, when the government tries to
find data on wheat production before and after the harvest, the total production
can only be estimated. Exact figures are obtained only when counting is done
properly; estimates and measurements may not always be exact.
When a truckload of powder is weighed on a weighbridge, a difference of a
kilogram or more hardly matters. However, when a pinch of powder is weighed on
a chemical balance, even a milligram matters. The accuracy of a weight thus
depends on the smallest unit in which it is measured and on the instrument with
which it is measured.
Many other factors influence accuracy and cause errors, and these need to be
understood. For this reason it is difficult to achieve perfect accuracy, even in
the various fields of science. In statistical measurement a reasonable degree of
accuracy is enough. What is reasonable depends on the practical value of the
data, the way it is used, its nature and purpose, and the cost of obtaining it.
Statistical surveys often do not need high accuracy. For instance, the population
of a country may be stated as 1 million rather than as the exact figure, which may
be 1,004,601. The round figure is easier to grasp, and the exact figure adds
little clarity.
Similarly, the annual sales of polyester and cotton cloth in a retail shop are
better described by the ratio 3:2 than by the actual figures, which may be
₹60,340 for polyester and ₹40,105 for cotton cloth.
The accuracy required depends on the situation. When a blacksmith weighs iron,
grams are irrelevant; when gold is weighed, every unit counts, down to the
milligram. This is what makes accuracy relative. A measurement that is accurate
enough for one purpose may be inaccurate for another, as the earlier examples
show.
An exact count may not be required when stating the population of a country, but
it is required for election results, where the exact number of votes for each
candidate must be known. This example shows that the desired accuracy depends
on the purpose of the enquiry.
Approximation
By now it must be clear that in many cases one can use approximate figures to
arrive at a conclusion. This applies especially to measurements: perfect physical
measurement cannot be achieved, but reasonable precision can. Figures stated
down to the last digit are confusing. Approximate figures are easier to grasp and
make comparisons and calculations simpler. The extent of approximation depends
on the degree of accuracy desired in the data. Approximation is essentially a
matter of rounding off numbers.
Methods of Approximation
What exactly is rounding off? Rounding is the practice of obtaining a round
number by dropping the last few digits in order to simplify larger figures. There
are different methods of rounding off:
(i) Rounding Up: The given figure is raised to the next full unit. For
example, a parcel weighing 8.9 grams would be stated as 9 grams.
(ii) Rounding Down: The figure is reduced to the next lower full unit. For
instance, age is usually stated as at the previous birthday: a person who
is 19 years 10 months old is stated to be 19 years old.
(iii) Stating the Value to the Nearest Unit: The rules for rounding to the
nearest full unit are:
(a) If the first digit to be dropped is less than 5, the digit preceding
it is kept the same. For example, 2,23,490 to the nearest thousand is
2,23,000; 490 is dropped because its first digit, 4, is less than 5.
(b) If the part to be dropped is greater than half a unit, the preceding
digit is increased by one. For example, 1,42,896 to the nearest
thousand is 1,43,000; 896 is dropped and, because its first digit, 8,
is greater than 5, the preceding digit 2 increases to 3. Similarly,
1,83,503 is rounded to 1,84,000, because 503 is more than half of a
thousand.
(c) If the part to be dropped is exactly half a unit (a 5 followed only
by zeros), the dropped digits are replaced by zeros and the preceding
digit is left unchanged if it is even and increased by 1 if it is odd.
In simpler terms, the figure is rounded to an even number. For
example, 2,23,500 becomes 2,24,000, and 2,24,500 also becomes
2,24,000. In the first example 3 is odd, so it is increased to 4. In
the second, 4 is already even, so it remains the same when 500 is
dropped. Remember that zero is an even digit.
(iv) Rounding to So Many Significant Figures: The digits of a number that
convey accurate information are called its significant digits. When a
number contains no zeros, every digit is significant. Zeros may or may
not be significant, depending on their position. A zero that lies between
two non-zero digits is significant: in the figure 14,005 both zeros are
significant because there are other digits on both sides of them.
Zeros placed at the extreme left are not significant. For example, in a figure
like 0,00,500 the zeros on the left are insignificant. Likewise, the figures 501,
0.0501 and 0.000501 each have three significant digits (5, 0 and 1): the zero
between 5 and 1 is significant, while the zeros to the left of 5 only fix the
position of the decimal point. Zeros are thus significant only in specific
positions.
Zeros placed at the extreme right after the decimal point are significant, because
they show the number of decimal places to which the figure is correct. The value
123.00 is correct to two decimal places, and this is what makes the zeros after
the decimal point significant.
Rounding to a given number of significant figures means retaining only the digits
that are accurate and relevant.
For example, the number 3.4752 rounded to two significant figures is 3.5; the
digits 752 are dropped. Similarly, the number 2,23,490 rounded to four significant
figures becomes 2,23,500.
Significant figures are thus the digits that carry real information and are
accurate. Now look at the following Example.
Example 11.1
Original No.    Rounded No.                 Significant Digits in Rounded No.
5,99,502        600 thousands               600
5,99,500        600 thousands               600
5,99,498        599 thousands               599
5,98,500        598 thousands               598
999.051         999.1 to one decimal        9991
999.049         999.0 to one decimal        9990
999.150         999.2 to one decimal        9992
999.950         1,000.0 to one decimal      10000
0.00723         0.007 to three decimals     7
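The rule of rounding to the nearest unit, with exact halves going to the even digit, can be checked mechanically. The following is a minimal sketch that uses Python's standard decimal module; the helper name round_half_even is an assumption made for this illustration. It reproduces the rounded figures of Example 11.1 and the worked figures given under rules (a) to (c).

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(value, unit):
    """Round value to the nearest multiple of unit; exact halves go to even."""
    return Decimal(value).quantize(Decimal(unit), rounding=ROUND_HALF_EVEN)

# (original figure, rounding unit): "1E3" is thousands, "0.1" is one decimal.
examples = [
    ("599502", "1E3"), ("599500", "1E3"), ("599498", "1E3"), ("598500", "1E3"),
    ("999.051", "0.1"), ("999.049", "0.1"), ("999.150", "0.1"),
    ("999.950", "0.1"), ("0.00723", "0.001"),
]

for original, unit in examples:
    rounded = round_half_even(original, unit)
    # The "f" format prints the result in plain (non-exponent) notation.
    print(original, "->", format(rounded, "f"))
```

Decimal strings are used instead of floating-point numbers so that figures such as 999.150 are held exactly and the "exactly half" case is detected reliably.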
Errors in Statistics
Error is an important term in statistics. It is defined as the difference between
the true value and the estimated value of a specific item. Errors arise because
estimates are based on sample observations and because the methods used involve
approximated or rounded figures.
For example, suppose you are to find the percentage of nitrogen present in a
fertilizer. The samples you collect would come from different parts of the stock,
from different fertilizer mixes and on different days. The samples are then sent
to a laboratory, where different methods are applied to analyse them. There is a
high chance of slight variations in the composition of the mixture due to
environmental conditions such as temperature and humidity. Another kind of
variation arises from the differences between samples, and results in sampling
errors. The final results are also affected by differences that arise in the
analysis. Lastly, errors can arise in the process of measurement; these are called
errors of observation.
Surveys and experiments are thus subject to different kinds of errors, such as:
(i) Sampling errors
(ii) Analytical errors
(iii) Errors due to observations and measurements
In statistics an error is not the same as a mistake. An arithmetical
miscalculation made while compiling data is a mistake. Errors, on the other hand,
are inevitable because statistics deals with approximate or estimated values.
Errors cannot be eliminated, only minimized, whereas mistakes can be eliminated
completely.
Sources of Errors
Following are the sources of errors:
1. Errors of origin: When variables such as height, distance and weight are
measured, perfect precision cannot be achieved because of the limitations
of the measuring instruments. Some difference between the actual value
and the measurement therefore always remains. Measurements are also
incorrect when unsuitable statistical units are used. Another possibility
is incorrect information from the source: a person may be biased in
supplying information, which results in errors in measurement. These
errors are referred to as errors of origin. They increase with the number
of observations.
2. Errors of inadequacy: The sample taken in any enquiry should represent
the population. If the sample is small or is not representative, errors
result. These are referred to as errors of inadequacy.
3. Errors of manipulation: The investigator may unknowingly make errors in
classifying and counting the objects. These errors, along with errors of
approximation, are referred to as errors of manipulation.
Following are the three types of errors that happen during statistical investigation:
Errors of Approximation
Statistical figures are often rounded off for convenience. The accuracy of a
rounded figure can be stated in the following ways:
1. Stating the unit to which the figure is rounded. For example, 4,672.4 is
approximated to 5,000 when rounded to the nearest thousand, and to 4,672
when rounded to the nearest whole number.
2. Using the ± sign to indicate the approximation in absolute terms, for
example 5,000 ± 500. This means that the actual value lies within 500 of
5,000, i.e., it may be up to 500 more or less than 5,000.
3. Using the ± sign to indicate the proportion of error, for example
5,000 ± 0.1. This means that the error can be up to 0.1 of 5,000, so the
actual value may be up to 500 more or less than 5,000.
4. Using percentages, which is similar to the third case above. For example,
5,000 ± 10% means that the error is up to 10% of 5,000.
5. Stating the number of significant figures to which the figure is correct.
For example, 4,672.4 is correct to five significant figures.
The ± symbol is thus useful for showing the degree of approximation, or the
limits of error. Possible error can be defined as the limits within which the
actual error lies. Take, for instance, 5,000 ± 500. If a figure is rounded to
the closest thousand, the maximum error is 500: the upper limit of the error
is +500 and the lower limit is –500, so the possible error is written as
± 500. When figures are rounded to the closest hundred or the closest ten, the
possible error is ± 50 and ± 5 respectively.
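The ± notation can be turned into explicit limits with a few lines of Python. This is only a minimal sketch; the function names are illustrative and do not come from the text.

```python
def limits_absolute(value, error):
    """Limits of a figure written as value ± error (error in absolute terms)."""
    return value - error, value + error


def limits_proportional(value, proportion):
    """Limits of a figure written as value ± proportion (0.1 is the same as 10%)."""
    return limits_absolute(value, value * proportion)


def max_rounding_error(unit):
    """Largest possible error when a figure is rounded to the nearest `unit`."""
    return unit / 2


print(limits_absolute(5000, 500))      # (4500, 5500)
print(limits_proportional(5000, 0.1))  # the same limits, stated as a proportion
print([max_rounding_error(u) for u in (1000, 100, 10)])  # [500.0, 50.0, 5.0]
```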
Measurement of Errors of Approximation
Following are the methods of measuring these errors:
1. Absolute Error: This is the difference between the true value and the
approximate value, whether the latter is estimated or observed.
Absolute Error (AE) = x – x1, where x is the true value and x1 is the
approximated value.
The absolute error can be either positive or negative. When the figure is
5,000 ± 500, the maximum absolute error is 500 in either direction. If the
true value is greater than the estimated value, the error is positive; if it
is less than the estimated value, the error is negative.
For instance, suppose a state has a population of 2,71,70,314 and its capital
has a population of 26,39,766. Rounded to the nearest lakh, the population of
the state is 272 lakhs and that of the capital is 26 lakhs. The
approximated value in the first case is more than the true value, and in the
latter case it is less than the true value.
For the state population
A.E. = True Value – Approximated Value = 2,71,70,314 – 2,72,00,000 = – 29,686
For the state capital
A.E. = True Value – Approximated Value = 26,39,766 – 26,00,000 = 39,766
In the first case AE is negative and in the second case it is positive.
2. Relative Error: The absolute errors in the two cases above are of similar
size, even though the population of the state is about ten times that of the
capital. Absolute error alone therefore does not show which of the two errors
is more significant. To judge this, the error has to be expressed as a
fraction of either the true value or the approximated value. This is where
relative error becomes useful. Relative error (RE) is the ratio of the
absolute error to the estimated or approximated value.
This can be expressed as follows:
Relative Error (RE) = Absolute Error (AE) ÷ Corresponding Approximated Value x1
Now let us take the example used for absolute error and estimate the
relative error in approximating the population of the state and of its
capital.
RE in approximating population = – 29,686 ÷ 2,72,00,000 = – 0.0011
RE in approximating capital = 39,766 ÷ 26,00,000 = 0.0153
In size, the relative error for the capital is more than ten times that for
the state population. This is because the two absolute errors are of similar
size, while the population of the capital is only about one-tenth of that of
the state. The state population can thus be depicted as 272 lakhs – 0.0011
and the capital population as 26 lakhs + 0.0153.
3. Percentage Error: This is the relative error expressed as a percentage.
Converting a relative error to a percentage error makes it easier to
understand.
Percentage Error (PE) = RE × 100
For example, the percentage error (PE) in approximating the state population
is – 0.0011 × 100 = – 0.11%, and similarly the PE for the capital is
0.0153 × 100 = 1.53%. In size, the percentage error for the capital is thus
more than ten times that for the state. Both the relative error and the
percentage error are based on the absolute error. When errors are to be
compared, relative and percentage errors are more meaningful than absolute
errors.
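The three measures can be checked with a short Python sketch that uses the state and capital figures from the example. The function names are illustrative, and the relative error is taken on the approximated value, as in the formula given in the text.

```python
def absolute_error(true_value, approx_value):
    """AE = true value - approximated value."""
    return true_value - approx_value


def relative_error(true_value, approx_value):
    """RE = AE / approximated value."""
    return absolute_error(true_value, approx_value) / approx_value


def percentage_error(true_value, approx_value):
    """PE = RE x 100."""
    return relative_error(true_value, approx_value) * 100


# (true value, value rounded to the nearest lakh)
figures = {
    "state": (27_170_314, 27_200_000),
    "capital": (2_639_766, 2_600_000),
}

for name, (true_value, approx_value) in figures.items():
    ae = absolute_error(true_value, approx_value)
    rel = relative_error(true_value, approx_value)
    pct = percentage_error(true_value, approx_value)
    # Python groups digits in thousands, not in the lakh style used in the text.
    print(f"{name}: AE = {ae:,}, RE = {rel:.4f}, PE = {pct:.2f}%")
# state: AE = -29,686, RE = -0.0011, PE = -0.11%
# capital: AE = 39,766, RE = 0.0153, PE = 1.53%
```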
Example 11.2
Find the relative and percentage error when the figure 2,234.752 is rounded
to the
1. Closest two digits after the decimal point
2. Closest whole number
3. Closest hundred
4. Closest thousand
Solution:
Method of Rounding                Rounded    Maximum Absolute  Relative                Percentage
                                  Value      Possible Error    Error                   Error
Nearest two digits after decimal  2,234.75   ±0.005            ±0.000002 (negligible)  ±0.0002% (negligible)
Nearest whole number              2,235      ±0.5              ±0.0002                 ±0.02%
Nearest hundred                   2,200      ±50               ±0.0227                 ±2.27%
Nearest thousand                  2,000      ±500              ±0.25                   ±25%
A careful look at the above example leads to the following observations:
1. The maximum absolute error increases as the order of rounding increases,
because more digits are left out.
2. The relative error also increases significantly as the order of rounding
increases.
Thus, the higher the order of rounding, the lower the precision.
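The figures of Example 11.2 can be recomputed with the following Python sketch. It assumes that the maximum possible error is half the rounding unit and that the relative error is taken on the rounded value; the `decimal` module is used only to avoid binary floating-point noise, and the function name is illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP


def rounding_errors(value, unit):
    """Return (rounded value, max absolute error, relative error, percentage error)."""
    value, unit = Decimal(value), Decimal(unit)
    # Round to the nearest multiple of `unit`.
    rounded = (value / unit).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * unit
    max_abs = unit / 2
    rel = max_abs / rounded
    return rounded, max_abs, rel, rel * 100


for unit in ("0.01", "1", "100", "1000"):
    rounded, max_abs, rel, pct = rounding_errors("2234.752", unit)
    print(f"unit {unit:>5}: rounded {rounded}, ±{max_abs}, RE ±{rel:.6f}, PE ±{pct:.4f}%")
```

Running the loop shows the relative error growing from about 0.000002 at two decimal places to 0.25 at the nearest thousand.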
Computation with Rounded Numbers
When addition or subtraction is carried out with rounded figures, the answer
cannot be more accurate than the least accurate figure.
For example, suppose (1) 357, (2) 574 and (3) 600 are to be added. Here 600 is
the least accurate figure because it is rounded to the nearest hundred. The
sum obtained is 1,531.
The answer should therefore be rounded to the nearest hundred, which gives
1,500. Stating the answer more exactly than this would only result in
spurious accuracy.
Similarly, when rounded numbers are multiplied or divided, the result should
not have more significant figures than the figure with the fewest
significant figures used in the calculation. For example, multiplying the
rounded figures 2.92 and 2.6 gives 7.592.
Since 2.6 has only two significant figures, the answer should also be stated
to two significant figures, which makes the result 7.6.
In conclusion, the result of a calculation involving rounded numbers should
be limited to the accuracy of the figures used in it.
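Both rules can be tried out in Python. The helper `round_sig` below is an illustrative function, not something defined in the text; it rounds a number to a chosen number of significant figures.

```python
from math import floor, log10


def round_sig(x, sig):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - floor(log10(abs(x))))


# Addition: 600 is correct only to the nearest hundred, so the sum is too.
total = 357 + 574 + 600
print(total, round(total, -2))     # 1531 1500

# Multiplication: 2.6 has two significant figures, so the product is
# stated to two significant figures as well.
print(round_sig(2.92 * 2.6, 2))    # 7.6
```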
Let us understand it in detail.
Effect of Mathematical Operations on Errors
Errors in approximated figures are affected by the operations of addition,
subtraction, multiplication and division. Let us study these one by one in
detail.
Effect of Addition
The absolute error of a sum is equal to the sum of the absolute errors of its
components. For example, suppose we add 500 (rounded to the closest 10) and
400 (rounded to the closest 100). This can be written as follows:
(500 ± 5) + (400 ± 50) = 900 ± 55
This can be explained in more detail as follows:
Figures       Rounded to     Absolute Error   Maximum Value       Minimum Value
500           Nearest 10     ±5               505                 495
400           Nearest 100    ±50              450                 350
Total Error                  ±55              955 (= 900 + 55)    845 (= 900 – 55)
Here the absolute error is ± 55 (5 + 50), the relative error is ± 0.061
(± 55/900), and the percentage error is ± 6.1% (± 0.061 × 100).
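The addition rule can be sketched in Python as follows. The function name and the returned tuple are illustrative choices, not part of the text.

```python
def add_with_errors(a, err_a, b, err_b):
    """Sum of (a ± err_a) and (b ± err_b) with its absolute and relative error."""
    total = a + b
    abs_err = err_a + err_b          # absolute errors add up
    return total, abs_err, abs_err / total


total, abs_err, rel_err = add_with_errors(500, 5, 400, 50)
print(f"{total} ± {abs_err}")                         # 900 ± 55
print(f"limits: {total - abs_err} to {total + abs_err}")  # limits: 845 to 955
print(f"RE ±{rel_err:.3f}, PE ±{rel_err * 100:.1f}%")  # RE ±0.061, PE ±6.1%
```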
Effect of Subtraction
The absolute error of a difference is also equal to the sum of the absolute
errors of its components. For instance, suppose 400 (rounded to the closest
100) is subtracted from 500 (rounded to the closest 10). The difference is
500 – 400 = 100 and its error is 5 + 50 = 55. This can be written as follows:
(500 ± 5) – (400 ± 50) = 100 ± 55
Let us look at this in detail. The maximum error occurs either when the
greater figure is at its largest and the smaller figure is at its lowest, or
the other way round. The absolute error of the difference is worked out as
follows:
                                              Subtraction will have
Figures       Rounded to     Absolute Error   Maximum Value       Minimum Value
500           Nearest 10     ±5               505 (Max)           495 (Min)
400           Nearest 100    ±50              350 (Min)           450 (Max)
Total Error                  ±55              155 (= 100 + 55)    45 (= 100 – 55)
The absolute error is thus ± 55 (i.e., 5 + 50), the relative error is ± 0.55
(± 55/100), and the percentage error is ± 55% (± 0.55 × 100). Comparing the
errors in addition and subtraction, the relative error is much larger in
subtraction than in addition, because the base is smaller. The absolute
errors, on the other hand, are equal: in both addition and subtraction, the
absolute error is the sum of the absolute errors of the components.
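A short Python sketch makes the two extreme cases of subtraction explicit. The function name is illustrative; it pairs the largest first figure with the smallest second figure, and the other way round.

```python
def subtract_with_errors(a, err_a, b, err_b):
    """Difference of (a ± err_a) and (b ± err_b) with its extreme values."""
    diff = a - b
    largest = (a + err_a) - (b - err_b)    # 505 - 350
    smallest = (a - err_a) - (b + err_b)   # 495 - 450
    abs_err = err_a + err_b                # errors add up, as in addition
    return diff, abs_err, smallest, largest


diff, abs_err, smallest, largest = subtract_with_errors(500, 5, 400, 50)
print(f"{diff} ± {abs_err}, from {smallest} to {largest}")  # 100 ± 55, from 45 to 155
print(f"RE ±{abs_err / diff:.2f}")                          # RE ±0.55
```

The absolute error is the same 55 as in addition, but it is now measured against a base of 100 instead of 900, which is why the relative error is so much larger.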
Effect of Multiplication
The relative error of a product is approximately equal to the sum of the
relative errors of its components. Suppose we multiply 500 (rounded to the
closest 10) by 40 (rounded to the closest unit). The absolute error in 500 is
± 5 and its relative error is ± 1%. The absolute error in 40 is ± 0.5 and its
relative error is ± 1.25%. The product of 500 and 40 is 20,000.
This can be written as follows:
(500 ± 1%) × (40 ± 1.25%) = 20,000 ± 2.25%
Here the relative error ± 2.25% is the sum of ± 1% and ± 1.25%. This can be
explained further as follows.
The maximum value of the product is:
(500 + 5) × (40 + 0.5) = (500 × 40) + (500 × 0.5) + (5 × 40) + (5 × 0.5) (a)
Similarly, the minimum value of the product is:
(500 – 5) × (40 – 0.5) = (500 × 40) – (5 × 40) – (0.5 × 500) + (5 × 0.5) (b)
When the errors are small, the product of the two errors, i.e., the term
(5 × 0.5) in (a) and (b), is small enough to be ignored. The absolute error
in the product 500 × 40 = 20,000 is then (5 × 40) + (0.5 × 500), which is
equal to 450. The relative error is therefore 450/20,000 = 0.0225 and the
percentage error is 2.25%.
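The approximation made by dropping the small term can be checked numerically. In this Python sketch (function name illustrative) the approximate absolute error is compared with the exact distance from 500 × 40 = 20,000 to the largest possible product.

```python
def product_error(a, err_a, b, err_b):
    """Product of (a ± err_a) and (b ± err_b) with approximate and exact error."""
    product = a * b
    approx_abs = err_a * b + err_b * a           # err_a * err_b is ignored
    exact_max = (a + err_a) * (b + err_b) - product
    return product, approx_abs, exact_max


product, approx_abs, exact_max = product_error(500, 5, 40, 0.5)
print(product, approx_abs, exact_max)        # 20000 450.0 452.5
print(approx_abs / product)                  # 0.0225
```

The two error figures differ only by the ignored term 5 × 0.5 = 2.5.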
Effect of Division
The relative error of a quotient is approximately equal to the sum of the
relative errors of its components. This can be seen by dividing the figures
used in the multiplication example instead of multiplying them. We have
500/40 = 12.5. With the relative errors, this is written as follows:
(500 ± 1%)/(40 ± 1.25%) = 12.5 ± 2.25%
The relative error in the quotient, 2.25%, is the sum of the two relative
errors 1% and 1.25%. To see this, we find the smallest and the largest
possible values of the quotient and how far each lies from 500/40 = 12.5.
The smallest value of the quotient is obtained when the smallest value of the
numerator (500 – 1%) is divided by the largest value of the denominator
(40 + 1.25%).
Now, 500 – 1% = 500 – 5 = 495, and 40 + 1.25% = 40 + 0.5 = 40.5
495/40.5 = 12.22 is therefore the smallest value of the quotient. The
difference between 12.50 and 12.22, i.e., 0.28, is the absolute error. The
relative error is then:
0.28/12.5 × 100 = 2.24%, or approximately 1% + 1.25%, which is the sum of
the relative errors in the two numbers.
Next, find the largest value of the quotient and how much it exceeds
500/40 = 12.5. This is (500 + 5)/(40 – 0.5) – 12.5 = 505/39.5 – 12.5 = 0.28.
This difference is approximately the same as the former difference. The
result is that the relative error of a quotient is approximately equal to the
sum of the relative errors of its components.
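The same check can be made for division. The Python sketch below (function name illustrative) returns the extreme values of the quotient together with the sum of the two relative errors.

```python
def quotient_error(a, err_a, b, err_b):
    """Quotient of (a ± err_a) and (b ± err_b) with its extreme values."""
    quotient = a / b
    smallest = (a - err_a) / (b + err_b)   # 495 / 40.5
    largest = (a + err_a) / (b - err_b)    # 505 / 39.5
    rel_sum = err_a / a + err_b / b        # 1% + 1.25%
    return quotient, smallest, largest, rel_sum


q, smallest, largest, rel_sum = quotient_error(500, 5, 40, 0.5)
print(q, round(smallest, 2), round(largest, 2))   # 12.5 12.22 12.78
print(round((q - smallest) / q, 4), round((largest - q) / q, 4), round(rel_sum, 4))
```

The two relative deviations printed on the last line are both close to, but not exactly equal to, the sum 0.0225, which is why the rule is stated as an approximation.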
Biased and Unbiased Errors
Errors are unavoidable in statistics. Although this is accepted, one should
be able to tell biased errors apart from unbiased errors. Let us look at
these two kinds of error in detail.
Biased Errors
Biased errors are all in one direction, so that the sum of the estimated
figures is either too large or too small compared with the sum of the actual
figures.
Suppose all the numbers are rounded down. The result is a biased error,
because every rounded figure is below its true value.
For example, 14 is rounded to 10, 132 to 100, and 5,396 to 5,000.
It can be seen that the errors are all in one direction: +4, +32 and +396.
The total error between the true sum 14 + 132 + 5,396 (5,542) and the
rounded sum 10 + 100 + 5,000 (5,110) is the sum of the individual errors,
4 + 32 + 396 = 432, which equals 5,542 – 5,110. Such errors are cumulative
in nature and are therefore also called cumulative errors.
Biased errors also arise when data is collected, owing to the bias of
persons or of instruments.
Respondents may understate or overstate the facts because of personal bias.
Another example is measuring cloth with a metre rod that is shorter than the
true length. Both cases result in biased or cumulative errors, just as when
all the numbers are rounded up or all are rounded down. When figures are
rounded to the closest unit, however, this kind of error does not appear.
With a large number of observations, about half of the figures are likely to
be raised and the rest lowered, so the errors in the total largely cancel
out.
Unbiased Errors
Errors that cancel each other out are referred to as compensating errors or
unbiased errors.
For instance, suppose the six numbers 21, 22, 24, 26, 27 and 28 are rounded
to the nearest ten. The first three figures are approximated to 20 each,
which gives a total error of +7.
The other three figures 26, 27 and 28 are approximated to 30 each, which
gives a total error of –9. The error in the sum of all six figures is then
only 7 – 9 = –2. With unbiased errors, the approximated value is less than
the true value in some cases and more in others. The errors can therefore be
both positive and negative, and they tend to cancel each other out. As the
number of observations increases, the unbiased error becomes smaller
relative to the total.
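The contrast between cumulative and compensating errors can be reproduced with the figures used above. This Python sketch uses integer arithmetic and illustrative function names; the error is taken as true value minus rounded value.

```python
def round_down(value, unit):
    """Round a whole number down to a multiple of `unit`."""
    return (value // unit) * unit


def round_nearest(value, unit):
    """Round a whole number to the nearest multiple of `unit` (halves go up)."""
    return ((value + unit // 2) // unit) * unit


def total_error(pairs, rounder):
    """Sum of (true value - rounded value) over (value, unit) pairs."""
    return sum(value - rounder(value, unit) for value, unit in pairs)


rounded_down_figures = [(14, 10), (132, 100), (5396, 1000)]
nearest_ten_figures = [(v, 10) for v in (21, 22, 24, 26, 27, 28)]

print(total_error(rounded_down_figures, round_down))    # 432 (errors pile up)
print(total_error(nearest_ten_figures, round_nearest))  # -2 (errors largely cancel)
```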
Look at the Example 11.3 carefully.
Example 11.3
                        Case (i) Unbiased        Case (ii) Lower Biased   Case (iii) Upper Biased
Actual                  Rounding    Absolute     (000's)     Absolute     (000's)     Absolute
                                    error                    error                    error
17,118                  17,000      +118         17,000      +118         18,000      –882
8,362                   8,000       +362         8,000       +362         9,000       –638
10,509                  11,000      –491         10,000      +509         11,000      –491
15,443                  15,000      +443         15,000      +443         16,000      –557
Actual absolute error               +432                     +1,432                   –2,568
Relative error                      +0.847%                  +2.864%                  –4.756%
From Example 11.3 it is easy to conclude that:
i) The absolute error in the case of unbiased error is lower compared to
biased error.
ii) In the case of unbiased error, the relative error is also small. It also
decreases with an increase in the number of items.
iii) In the case of biased errors, both the absolute and the relative errors are
high. In fact they will increase as the number of items increases.
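The totals in Example 11.3 can be verified with a short Python sketch. The three rounding rules and the `summary` helper are illustrative names; the relative error is taken on the rounded total, which matches the percentages shown in the example.

```python
UNIT = 1000
actual = [17118, 8362, 10509, 15443]


def nearest(v):
    """Unbiased rounding to the nearest thousand."""
    return ((v + UNIT // 2) // UNIT) * UNIT


def down(v):
    """Lower-biased rounding (always down)."""
    return (v // UNIT) * UNIT


def up(v):
    """Upper-biased rounding (always up)."""
    return -((-v) // UNIT) * UNIT


def summary(values, rule):
    """Return (absolute error in the total, relative error in per cent)."""
    rounded_total = sum(rule(v) for v in values)
    abs_err = sum(values) - rounded_total
    return abs_err, abs_err / rounded_total * 100


for label, rule in (("unbiased", nearest), ("lower biased", down), ("upper biased", up)):
    abs_err, rel_pct = summary(actual, rule)
    print(f"{label}: {abs_err:+,} ({rel_pct:+.3f}%)")
# unbiased: +432 (+0.847%)
# lower biased: +1,432 (+2.864%)
# upper biased: -2,568 (-4.756%)
```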
Check Your Progress - 1
1. What is the accuracy of weight dependent on?
2. What is approximation all about?
3. What is the absolute error of a sum equal to?
11.3 ESTIMATION AND SAMPLING OF ERRORS
The various aspects of estimation and sampling are discussed below.
Estimation of Biased and Unbiased Errors
When figures are approximated, it is important to know the amount of error
involved in the approximation. Often the exact figures are not available and
only the approximated figures are given. In such cases the actual error,
which is the difference between the actual and the approximate figure, cannot
be found and can only be estimated. Suppose that in Example 11.3 the actual
figures in the first column are not known, and the relative error in the
total has to be estimated. The estimation procedure depends on whether the
error is unbiased or biased.
Estimation when the Error is Unbiased
When figures are rounded to the nearest thousand, the absolute unbiased error
lies between 0 and 500. In Example 11.3, one figure, 17,118, has an error as
low as 118, and another, 10,509, has an error as high as 491. In any number
rounded to the nearest thousand, the lowest possible error is 0 and the
highest possible error is 500. So the average absolute error (AAE) in any
figure can be taken as (0 + 500) ÷ 2 = 250. The best estimate of the unbiased
absolute error in the sum of a number of items is given by the product of
this average absolute error and the square root of the number of items. (The
proof of this formula is outside the scope of this course.) The formula is as
follows:
Absolute Error (unbiased type) = AAE × √N
Where, AAE is Average Absolute Error, N is Number of Items.
By using this formula, let us now estimate the absolute error in the present
example.
Absolute Error = 250 × √4 = 500
Similarly, we can also estimate the relative error with the following formula:
Relative Error = AAE × √N ÷ Approximated Total
= 250 × √4 ÷ 51,000 = 0.0098 or 0.98%
Estimation when the Error is Biased
An item rounded up or rounded down to thousands can have an error between 0
and 999. So the average absolute error (AAE) in biased errors is
(0 + 999) ÷ 2 = 499.5. The formula for estimating the error, when the error
is biased, is as follows:
Absolute Error (biased type) = AAE × N
Where, AAE is Average Absolute Error and N is Number of Items.
Let us apply this formula and estimate the absolute error in the present
example.
Absolute Error = 499.5 × 4 = 1,998
For estimating the relative error the following formula can be used:
Relative Error = AAE × N ÷ Approximated Total
Relative Error (when rounded down) = 499.5 × 4 ÷ 50,000 = 0.03996 or about 4%
Relative Error (when rounded up) = 499.5 × 4 ÷ 54,000 = 0.037 or 3.7%
Note that the estimated absolute and relative errors in this example differ
from the actual absolute and relative errors (refer to Example 11.3), which
were calculated when the exact numbers (in column 1) were available. This
difference will be quite small if the number of items is large.
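The two estimation formulas can be written as small Python functions. This is a sketch under the assumptions stated in the text: the average absolute error is half of the largest possible error, the unbiased estimate multiplies it by the square root of the number of items, and the biased estimate multiplies it by the number of items. The function names are illustrative.

```python
from math import sqrt


def estimate_unbiased(unit, n_items, approx_total):
    """Estimated (absolute, relative) error when rounding to the nearest unit."""
    aae = (0 + unit / 2) / 2             # errors range from 0 to unit/2
    abs_err = aae * sqrt(n_items)
    return abs_err, abs_err / approx_total


def estimate_biased(unit, n_items, approx_total):
    """Estimated (absolute, relative) error when always rounding up or down."""
    aae = (0 + (unit - 1)) / 2           # errors range from 0 to unit - 1
    abs_err = aae * n_items
    return abs_err, abs_err / approx_total


print(estimate_unbiased(1000, 4, 51_000))  # absolute 500.0, relative about 0.0098
print(estimate_biased(1000, 4, 50_000))    # absolute 1998.0, relative about 0.04
print(estimate_biased(1000, 4, 54_000))    # absolute 1998.0, relative about 0.037
```

These estimates can be set against the actual errors of Example 11.3 (432, 1,432 and 2,568 in size) to see how rough an estimate from only four items is.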
Sampling and Non-sampling Errors
You have learnt the meaning of biased and unbiased errors and the method of
estimating them. Let us now discuss sampling and non-sampling errors.
Sampling Errors
These are the errors that result from drawing inferences about the population
on the basis of samples. Sampling errors arise because the study is based on
only a part of the population, and they may also arise from bias in the
selection of sample units. When the entire population is studied, sampling
errors are eliminated. When more than one sample is drawn from the same
population by random sampling, the results of the samples differ from one
another and from the result for the population, because the units selected
in each sample are different. Sampling error thus means precisely the
difference between the sample result and the population result when both are
obtained by using the same procedure or method of calculation. The exact
amount of sampling error therefore differs from sample to sample. Sampling
errors cannot be completely eliminated or avoided, but they can be minimized
by conducting the survey systematically.
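The idea that sampling error differs from sample to sample, and vanishes when the whole population is studied, can be illustrated with a small simulation in Python. The population below is made-up data generated only for illustration, and the sample size of 50 and the fixed seed are arbitrary assumptions.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so that repeated runs give the same samples

# Hypothetical population of 500 values (not data from the text).
population = [random.randint(1_000, 20_000) for _ in range(500)]
population_mean = mean(population)


def sampling_error(sample):
    """Sample result minus population result, both computed as a mean."""
    return mean(sample) - population_mean


# Three random samples of 50 units each give three different errors.
for _ in range(3):
    sample = random.sample(population, 50)
    print(round(sampling_error(sample), 1))

# Complete enumeration: the 'sample' is the whole population, so no error.
print(sampling_error(population))
```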
Sampling errors are of two types:
(i) Biased sampling errors
(ii) Unbiased sampling errors
Biased Sampling Errors: These occur when the values of the statistics
obtained from the survey deviate in only one direction, so that they cannot
cancel out. These errors arise from factors such as bias in the selection of
units, faulty data collection and bias in analysis. For example, biased
sampling errors are more likely when the sample units are selected by
deliberate sampling instead of random sampling. When it is difficult to
obtain information from some of the sampling units included in a random
selection, the investigator is likely to substitute other units of the
population. This also leads to bias if the substitute units are not selected
randomly. Sometimes, when information is lacking, the investigator works
with only the information that remains, and this too results in bias. In
other cases the information may be biased because the respondent wishes to
conceal some facts from the investigator. Any error that is consistent
results in bias. Bias can also arise from improper data collection
instruments and from an incompetent investigator. Limitations in the coding,
the collection procedure and the methods of analysis also result in bias.
Biased sampling errors increase with the number of observations; they are
cumulative in nature.
Unbiased Sampling Errors: These errors arise from chance differences
between the units of the population that are included in the sample and
those that are not. Errors due to chance are called unbiased sampling errors.
They are not due to any form of bias. They do not build up as the number of
observations increases; on the contrary, they tend to be neutralized when
the number of observations increases. For this reason they are often
referred to as compensating errors or non-cumulative errors.
Thus, the total sampling error comprises both biased and unbiased errors. The
primary objective of the statistical method in any survey is to design
sampling schemes so that biased errors are removed as far as possible and
unbiased errors are reduced to the minimum.
Non-sampling Errors
These can occur in both complete enumeration and sampling. Non-sampling
errors include mistakes and biases. They are not chance errors.
Most of the factors that cause bias in complete enumeration, described
earlier, also cause non-sampling errors. They include lack of information,
careless definition of the population, a vague idea of the information
sought, inefficient methods of interview and so on. Mistakes arise from
improper coding, errors in computation and mistakes in processing.
Non-sampling errors arise from one or more of the following reasons:
(i) Improper and ambiguous data specifications that are inconsistent with
the objectives of the census or survey.
(ii) Inappropriate methods of sampling, incomplete questionnaire and incorrect
interviewing.
(iii) Personal bias of investigators or informants.
(iv) Unavailability of trained and qualified investigators.
(v) Errors in compilation and tabulation.
These are only some of the many possible reasons.
The total error is the sum of sampling errors and non-sampling errors. The
objective of any survey is to minimize both. Non-sampling errors can be
controlled by defining the population precisely, preparing the questionnaire
carefully and pre-testing it, training the investigators, and checking and
monitoring each step. However, this is practicable only when the number of
items is small; otherwise it becomes time-consuming and very costly. On the
other hand, sampling errors increase when the sample is small. When planning
a survey, it is therefore essential to allocate the limited resources of
manpower, money and time carefully, so that sampling and non-sampling errors
together are minimized and the maximum level of accuracy is achieved.
Check Your Progress - 2
1. Between what figures do the absolute unbiased errors lie when figures are
rounded to the nearest thousand?
2. Due to what factors do sampling errors result?
3. What do total errors include?
11.4 SUMMARY
• You can obtain statistical data by counting or measuring or through
estimating. With this it is possible to find out the exact numbers.
• However, it is important to remember that in measuring and estimating,
absolute accuracy cannot be achieved. Even in counting, where it is
possible, it is not always desirable. You may not get the desired clarity.
• The accuracy required depends on the purpose of the study and the situation.
Spurious accuracy should be avoided at any cost.
• The desired level of accuracy in the given figures is achieved through
approximation, using the technique of rounding off.
• There are three methods of rounding off, these are as follows:
o Rounding up
o Rounding down
o Rounding to nearest unit
• The digits that are able to depict the extent of accuracy are called significant
digits.
• Statistical error is all about the difference between the true value and the
estimated value.
• Statistical errors can be classified into three types, as follows:
o Errors of origin: these occur due to limitations of the measuring
instruments, the choice of unsuitable statistical units, incorrect
information and investigators' bias.
o Errors of inadequacy: these happen when the sample size is small or the
sample does not represent the population correctly.
o Errors of manipulation: these include unintentional errors in counting,
measuring, etc., and errors due to approximation.
• In simple numbers as well as in computation, the digits that express the
accurate part of a figure are the significant digits.
• Error is an important term in statistics. It is defined as the difference
between the true value and the estimated value of a specific item.
• Statistics deals with approximate and estimated values, so errors are
inevitable.
• The accuracy of a rounded figure can be stated by the unit of rounding:
4,672.4 is approximated to 5,000 when rounded to the nearest thousand and to
4,672 when rounded to the nearest whole number.
• The absolute error can be either positive or negative.
• When addition and subtraction are carried out with rounded figures, the
answer obtained cannot be more accurate than the least accurate figure.
• The relative error of a quotient is approximately equal to the sum of the
relative errors of its components.
11.5 KEY WORDS
• Biased sampling errors: These are errors that arise from factors such as
faulty data collection, bias in analysis and bias in the selection of units.
• Absolute error: It is the difference between the true value and the
approximate value, whether the latter is estimated or observed.
• Rounding down: In this method figures are reduced to the next lower full
unit.
• Unbiased sampling errors: These are errors that arise from chance
differences between the units of the population that are included in the
sample and those that are not.
11.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The accuracy of the weight is dependent on the smallest measure of
milligram.
2. Approximation is all about rounding off the numbers and digits.
3. The absolute error of a sum is equal to the sum of the absolute errors of
its components.
Check Your Progress - 2
1. The absolute unbiased errors lie between 0 and 500 when figures are
rounded to the nearest thousand.
2. Sampling errors result from drawing inferences about the population on
the basis of samples.
3. Total errors include the sum of sampling errors and non-sampling errors.
11.7 SELF-ASSESSMENT QUESTIONS
1. Write a short note on methods of approximation.
2. Discuss the various errors in statistics.
3. List the various errors of approximation.
4. Account for the measurement of errors of approximation.
5. What effects do mathematical operations have on errors?
6. Differentiate between biased and unbiased errors.
7. Write a short note on sampling and non-sampling errors.
8. Account for the estimation of biased and unbiased errors. Support your
answer with an example.
11.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
290
Ratios, Percentages and
Rates
UNIT–12
RATIOS, PERCENTAGES AND RATES
Objectives
After going through this unit, you will be able to:
• Discuss ratio, percentages and rates
• Analyse the various statistical derivatives
• Explain the differences between ratios and percentages
• Assess the purpose of statistical derivatives
Structure
12.1 Introduction
12.2 Meaning of Various Statistical Derivatives
12.3 Purpose of Statistical Derivatives
12.4 Summary
12.5 Key Words
12.6 Answers to ‘Check Your Progress’
12.7 Self-Assessment Questions
12.8 Further Readings
12.1 INTRODUCTION
This unit explains ratios, percentages and rates and the computations they involve.
It also describes the precautions to take when working out ratios, percentages and
rates in business and administration, and notes the use of logarithms in
computations that involve multiplication, division and roots.
Collecting reasonable data is not enough for drawing the desired conclusions;
the figures should be able to speak. The data collected must be analysed and
compared, and presented in a form from which conclusions can be drawn.
Quantitative data should be condensed in a way that keeps it meaningful, so that
it is easy to interpret and understand. Statistical derivatives serve this purpose.
The simplest derivatives are ratios, percentages and rates. Each measures the
relationship between two or more figures and so gives a clearer interpretation than
the raw figures alone. This unit will also help you learn about the usefulness of
logarithms in carrying out extensive computations.
12.2 MEANING OF VARIOUS STATISTICAL DERIVATIVES
Statistical derivatives fall into three groups: ratios, percentages and rates.
Ratio
A ratio expresses the relationship between two quantities of the same kind and
denotes the number of times one is contained in the other. The ratio of two
quantities A and B is written A:B and read as ‘A is to B’. In this ratio, A is the
antecedent and B is the consequent. The ratio may also be written as A/B, since a
ratio is the quotient of A divided by B. The division may be actual or merely
implied; for convenience, a ratio is usually reduced to its simplest terms and not
left as an unworked division.
For example, if male and female workers are compared, a ratio stated as 550/110
is hard to grasp. Reduced to 5:1, the same ratio is easy to understand. Ratios are
reduced in this way to keep the figures small and easy to grasp. When the two terms
of a ratio are interchanged, the ratio obtained is the inverse of the first. The
inverse of A:B is B:A.
As another example, consider a class of 80 students, of whom 50 are boys and
30 are girls. The ratio of boys to girls is 5:3, and the inverse ratio, girls to
boys, is 3:5. Note that one of these ratios is greater than the other; a ratio may
be greater than, less than or equal to another, depending on the situation. A ratio
is thus the relation of one quantity to another, expressed as the quotient of the
first divided by the second. A ratio may also be extended to more than two
numbers: three or more quantities can be compared and written as A:B:C:D.
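The reduction of a ratio to its simplest terms can be sketched in a few lines of Python. This is only an illustration: the function names simplify_ratio and inverse_ratio are assumptions made here and do not come from the unit, and the counts are assumed to be positive whole numbers.

```python
from functools import reduce
from math import gcd


def simplify_ratio(*counts):
    """Reduce positive whole-number counts to their smallest whole-number ratio."""
    divisor = reduce(gcd, counts)  # greatest common divisor of all the counts
    return tuple(c // divisor for c in counts)


def inverse_ratio(a, b):
    """Interchange the two terms of a ratio A:B to obtain B:A."""
    return (b, a)


print(simplify_ratio(550, 110))                # (5, 1)
print(simplify_ratio(50, 30))                  # (5, 3)
print(inverse_ratio(*simplify_ratio(50, 30)))  # (3, 5)
```

The same function accepts three or more counts, which corresponds to an extended ratio of the form A:B:C:D.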
Consider a class of 100 students, of whom 70 are in commerce, 20 in science and
10 in arts. The ratio of the three streams is 7:2:1. As the number of categories
increases, the proportion is a more convenient derivative because it is less
confusing. Suppose a total of N items is divided into three categories, with N1
items in category 1, N2 items in category 2 and N3 items in category 3.
The proportions of categories 1, 2 and 3 are then N1/N, N2/N and N3/N.
In a proportion, the denominator is the total number of items and the numerator
is the number of items in one category. For this reason a proportion can never
exceed 1, and the proportions of all the categories add up to 1.
A ratio can be converted into proportions. For example, if the ratio of males to
females is 3:2, the proportion of males is 3/(3+2) = 0.6 and the proportion of
females is 2/(3+2) = 0.4.
Ratios and proportions are often expressed as percentages, because relative
measures are easier to visualize in that form. To convert a ratio into a percentage,
take one figure as the base, divide the other figure by it and multiply by 100. For
example, if the ratio of the paddy crop in 1988 to that in 1989 is 2:3 and 1988 is
taken as the base, the crop in 1989 is 3/2 × 100 = 150% of the previous year’s
crop. The percentages of all the categories add up to 100, provided the categories
are mutually exclusive and collectively exhaustive.
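The conversions described above can be checked with a short Python sketch. The helper names proportions, percentages and percent_of_base are assumptions introduced for illustration only; the figures are the ones used in the text (a 3:2 ratio, a 70:20:10 split and a 2:3 ratio with the first term as base).

```python
def proportions(parts):
    """Proportion of each category: its count divided by the total."""
    total = sum(parts)
    return [p / total for p in parts]


def percentages(parts):
    """Each category as a percentage of the total."""
    total = sum(parts)
    return [100 * p / total for p in parts]


def percent_of_base(value, base):
    """Express a value as a percentage of a chosen base figure."""
    return value * 100 / base


print(proportions([3, 2]))        # [0.6, 0.4]
print(percentages([70, 20, 10]))  # [70.0, 20.0, 10.0]
print(percent_of_base(3, 2))      # 150.0
```

Because the categories are mutually exclusive and collectively exhaustive, the proportions add up to 1 and the percentages add up to 100.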
Percentage
Percentages are proportions expressed on a base of 100. To calculate a
percentage, we simply multiply a proportion by 100. Proportions are a special kind
of ratio in which the denominator is the total and the numerator is a part of the
total; a proportion tells us what part the numerator is of the total. Thus, if the
ratio of females to males in a city is 1.06, females make up a proportion of 0.515
of the total, or 51.5%.
Rate
A comparison between two quantities of the same kind is a ratio. For example,
the male and female workers in a factory are both workers, so the two quantities
are of the same kind. When the two quantities are of different kinds, the
derivative is a rate. Per capita income is an example: total income is the
numerator and total population is the denominator. Other examples are accident
rates, death rates and birth rates. Rates describe quantities that vary; they are
dynamic and usually relate to a period of time. A rate of change is a quotient in
which the numerator represents the change in a quantity and the denominator is
the quantity to which the change is related. A rate is usually stated in relation
to a standardized denominator. When a number is divided by a related number and
the quotient is multiplied by 1,000, the resulting figure is a rate per thousand.
For example, if the number of deaths is divided by the total population and the
quotient is multiplied by 1,000, the death rate per thousand is obtained. The
coefficient is the rate per unit. If the death rate is 1.9%, or 19 per thousand,
the coefficient of death is 0.019. Multiplying this coefficient by the total
population gives the total number of deaths.
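The relation between a rate per thousand, its coefficient and the total number of events can be sketched as follows. The population of 50,000 and the 950 deaths are hypothetical figures chosen here so that the death rate comes to 19 per thousand, as in the text; the function names are likewise assumptions.

```python
def rate_per(events, population, per=1000):
    """Rate per `per` units: events divided by population, times `per`."""
    return events * per / population


def coefficient(rate, per=1000):
    """Coefficient: the rate per single unit."""
    return rate / per


def expected_events(coeff, population):
    """Total events implied by a coefficient and a population."""
    return coeff * population


population = 50_000  # hypothetical population
deaths = 950         # hypothetical number of deaths in the period

death_rate = rate_per(deaths, population)
print(death_rate)                                   # 19.0 per thousand
print(coefficient(death_rate))                      # 0.019
print(round(expected_events(0.019, population)))    # 950
```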
Check Your Progress - 1
1. What do ratios help express?
2. What are percentages?
12.3 PURPOSE OF STATISTICAL DERIVATIVES
The primary purpose of statistical derivatives is comparison. Percentages,
coefficients and ratios all give a relative picture. When numbers are compared
with each other, the figure taken as the standard of comparison is the base. Which
figure should be selected as the base depends on the situation. A derivative is not
meaningful by itself for analysing a problem. If a company has earned an 18%
return on investment (ROI) in the current year, is that a high ROI or not? The
figure becomes meaningful only through comparison. The 18% return can be
compared with last year’s return or with the returns of competing firms, provided
the figures are comparable.
Derivatives are used to draw comparisons between different groups. Reducing
the figures to a common denominator makes the comparison simple yet meaningful.
Consider two business firms that began with capital investments of ₹50,000 and
₹1,20,000. At the end of the year, the first firm has made a profit of ₹20,000 and
the second a profit of ₹40,000. In absolute terms the second firm has earned twice
as much as the first. When both profits are reduced to a common denominator of
100, however, the profit of the first firm is 40% of its capital and that of the
second is about 33% of its capital. The impression is now reversed, and the
percentage profit is the more meaningful figure.
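The two-firm comparison can be reproduced with a minimal Python sketch. The labels "First firm" and "Second firm" and the function name profit_percent are assumptions made for illustration; the capital and profit figures are those given in the example, with 1,20,000 written as 120,000.

```python
def profit_percent(profit, capital):
    """Profit expressed on a common base of 100 units of capital."""
    return profit * 100 / capital


# (capital, profit) in rupees, as given in the example
firms = {
    "First firm": (50_000, 20_000),
    "Second firm": (120_000, 40_000),
}

for name, (capital, profit) in firms.items():
    pct = profit_percent(profit, capital)
    print(f"{name}: profit Rs {profit:,} = {pct:.1f}% of capital")
```

The absolute profit favours the second firm, while the percentage on capital (40.0% against roughly 33.3%) favours the first.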
Derivatives are also used to estimate a quantity that is unknown.
Suppose the birth rate of a particular region is known and is fairly constant over
time. If the total number of births in a given period is also known, the population
of the region in that period can be estimated. Derivatives thus help in estimating
unknown quantities, in simplifying data and in increasing their comparability.
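As a rough sketch of this kind of estimation, the population can be recovered by dividing the number of births by the birth rate. The figures below (3,000 births and a birth rate of 25 per thousand) are hypothetical and are not taken from the unit; the function name is also an assumption.

```python
def estimate_population(births, birth_rate_per_thousand):
    """Estimate population from births and a known birth rate per thousand.

    Rearranges: birth rate = births / population * 1000.
    """
    return births * 1000 / birth_rate_per_thousand


births_in_year = 3_000  # hypothetical number of births recorded in the year
birth_rate = 25         # hypothetical birth rate per thousand population

print(estimate_population(births_in_year, birth_rate))  # 120000.0
```

The estimate is only as good as the assumption that the birth rate has stayed fairly constant over the period.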
Types of Ratios
Several kinds of ratios are used in statistical work. Every ratio depends on its
base: when numbers are compared, the figure with which the others are compared
is the base. The main kinds of ratios are described below.
The Distribution Ratio
A distribution ratio is the ratio of a part to the total of which it is a part. If
there are 300 females among the 1,000 workers of a company, the distribution ratio
is 300/1000 = 3:10. This implies that 30% of the total labour force is female. The
concept of the distribution ratio can be extended to more than two groups; each
ratio then relates one part to the total.
Interpart and Interclass Ratio
An interpart ratio is the ratio of one part of a total to another part of the same
total. Since the comparison is drawn between two parts, the base is one of the two
parts. The sex ratio of a population is a good example: it is stated as females per
1,000 males and not as females per 1,000 population.
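The difference between a distribution ratio and an interpart ratio can be seen by computing both from the same data. The sketch below reuses the 300 females among 1,000 workers from the distribution ratio example and assumes, for illustration only, that the remaining 700 workers are male. The function names are assumptions.

```python
def distribution_ratio(part, total):
    """Part-to-total ratio, returned as a percentage of the total."""
    return part * 100 / total


def interpart_ratio(part, other_part, per=1000):
    """Ratio of one part to another part of the same total, per `per` units."""
    return part * per / other_part


females, workers = 300, 1000
males = workers - females  # assumption: every other worker is male

print(distribution_ratio(females, workers))       # 30.0 (% of all workers)
print(round(interpart_ratio(females, males), 1))  # 428.6 females per 1,000 males
```

The base is the whole workforce in the first case and one of the two parts in the second, which is why the two figures differ.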
Time Ratio
A time ratio measures the change in a series of values arranged in time sequence
and is usually expressed as a percentage. It is also referred to as a
past-to-present ratio.
There are two important classes of time ratios:
(i) those involving a fixed base period, and (ii) those involving a moving base.
Take the production of tea in the current year as an example.
Under the fixed base method, a particular year such as 1980 serves as the base
year, and the production of the current year is compared with the production of
the base year.
Under the moving base method, the base varies. For the current year, the previous
year’s production is used as the base; for the next year, the current year’s
production becomes the base. Moving base ratios thus compare the data of two
consecutive time periods.
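The two classes of time ratio can be contrasted with a small Python sketch. The tea production figures are hypothetical and chosen only to keep the arithmetic simple; the function names fixed_base_ratios and moving_base_ratios are assumptions.

```python
def fixed_base_ratios(series, base_period):
    """Each period's value as a percentage of the value in one fixed base period."""
    base = series[base_period]
    return {period: value * 100 / base for period, value in series.items()}


def moving_base_ratios(series):
    """Each period's value as a percentage of the immediately preceding period."""
    periods = sorted(series)
    return {
        current: series[current] * 100 / series[previous]
        for previous, current in zip(periods, periods[1:])
    }


# Hypothetical tea production (in any consistent unit) by year
tea = {1980: 200, 1981: 220, 1982: 198}

print(fixed_base_ratios(tea, 1980))  # {1980: 100.0, 1981: 110.0, 1982: 99.0}
print(moving_base_ratios(tea))       # {1981: 110.0, 1982: 90.0}
```

With a fixed base, 1982 stands at 99% of 1980; with a moving base, the same year stands at 90% of 1981, because the base has shifted.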
Hybrid Ratio
A hybrid ratio relates two quantities of different kinds, so the numerator and the
denominator are expressed in different units. For example, the statement that a
car is travelling at 30 miles per hour (mph) is a hybrid ratio: the number of miles
is divided by the number of hours, and both units, miles and hours, appear in the
statement.
Other examples of hybrid ratios are per capita income, persons per square
kilometre, output per hour or per day, cost per passenger mile, number of children
per family and investment per mile.
Hybrid ratios are stated on a per unit base and not as percentages, because the
numerators and the denominators of these ratios belong to different categories.
For this reason hybrid ratios are also referred to as rates.
Computation of Ratios
Variables to be Related: There must be a clear relationship between the
numerator and the denominator. For example, if you are interested in computing
the earnings of a company in the current year, the current year’s investment must
be considered and not the investment at the time of its inception. Another example
is agricultural production per acre. In this ratio, agricultural production per
acre of land cultivated is more meaningful than agricultural production per acre
of total land (which includes wastelands, forests, deserts, and so on).
Choice of Base: The base or denominator of a statistical ratio is always a
standard with which the numerator is compared. As you know, a ratio establishes a
relationship between two things, so it is essential to decide which of the two is
to be used as the base. Sometimes the choice of the base is obvious; in other
cases it is not. However, certain generalizations can be made about the choice of
the base.
(i) In a comparison between a part and the whole, the whole is always the
base. For example, in relating the number of unemployed to the total labour
force, the number of persons in the labour force is the denominator of the
ratio.
(ii) In time comparisons between similar things (time ratios), the earlier
period is invariably taken as the base. For example, in computing the
percentage change of current year sales over the previous year, the
previous year’s sales are the base.
(iii) If the relationship to be studied is between two variables, one of which
depends on the other, the independent variable is generally used as the
base of comparison. For example, in relating the number of accidents to
total passenger miles, the latter would generally be taken as the base of
comparison.
Choice of Units in the Denominator: The number of units in the denominator
(that is, the base) may be determined by custom, convenience and effectiveness.
Some of the practices in this regard are as follows.
(a) In some cases the base of a ratio is expressed as a single unit. For
example, per capita income, per passenger mile, production per acre, and
so on.
(b) Ratios are often expressed as percentages. For example, the number of
telephone lines in operation today is 150% of the number a year ago. Here
the number expressed as a percentage shows how many numerator units
there are for each hundred denominator units. It is easy to visualize a
base in units of 100.
(c) A thousand, ten thousand or an even larger number of units may be used
as the base. For example, a statement like 4.5 accidents per 1,000 man
hours can also be expressed as 45 accidents per 10,000 man hours or
0.0045 accidents per man hour.
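Restating a rate on a different base only rescales the figure; the underlying relationship is unchanged. A minimal Python sketch, in which the function name rebase is an assumption, confirms that the three statements about accidents per man hour are equivalent.

```python
def rebase(rate, old_base, new_base):
    """Restate a rate given per `old_base` units as a rate per `new_base` units."""
    return rate * new_base / old_base


accidents_per_1000 = 4.5  # accidents per 1,000 man hours, from the example

print(rebase(accidents_per_1000, 1_000, 10_000))  # 45.0 per 10,000 man hours
print(rebase(accidents_per_1000, 1_000, 1))       # 0.0045 per man hour
```

Which base to report is then a matter of custom and readability, not of arithmetic.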
The following guidelines should be considered in deciding whether one unit or some
higher power of ten should be used as the base.
(i) The number used as the base should be large enough for the value of the
numerator to appear mainly as a whole number, but the numerator should
not have more than a few digits to the left of the decimal point. It is more
convenient to say that there are 45 accidents per 10,000 man hours than
to say that there are 4,500 accidents per 10,00,000 man hours.
(ii) The number used as the base should be smaller than the number in the
original data pertaining to the denominator. For example, suppose there
are only 12 persons in a firm and nine of them have cars. In this case it
is better to use the original data, as the relationship is clear without
reducing the data to ratio form. Saying that 75% of the employees have
cars means the same thing, but it may not give a clear impression: the
denominator of 100 is higher than the actual figure of 12.
Hence, in computing statistical ratios, one should first decide which variable
should go into the numerator, which variable should go into the denominator, and
how many units the denominator of the desired ratio should contain.
Application of Ratios
Ratios, percentages and coefficients are used in all kinds of studies. Per capita
income, population per square kilometre, production per acre, turnover ratio,
fixed assets ratio, intelligence quotient, freight revenue per mile, investment
per mile, labour-output ratio, capital-output ratio, and so on are examples of
popular ratios in use. Some features of commonly used ratios are noted below.
Each of these ratios is refined, and they are therefore called refined ratios. A
refined ratio is one in which the numerator or the denominator or both are
adjusted to exclude the extraneous factors that tend to obscure the direct
relationship between them. For example, the ratio of labour cost in a factory to
the total cost of manufacture is a useful ratio. However, the denominator contains
two types of cost, fixed cost and variable cost. The ratio of labour cost to total
variable cost is more meaningful to the management in analysing the operations.
A ratio may be standardized by adjusting its component parts for better
comparability with other ratios. The use of standardized ratios is important in
the field of vital statistics, where standardized death rates, birth rates, and so
on are used in comparisons between different cities or regions of the country.
The calculation of standardized rates involves the concept of the weighted average
and is, therefore, outside the scope of this unit.
Caution in the Use of Derivatives
Many errors in the use of derivatives arise from a failure to state the meaning of
the derivatives correctly. Difficulties encountered in the computation and use of
derivatives can generally be traced to one or more of the following causes:
Confusion Regarding the Base: Suppose the price of an item has increased
from ₹2,000 to ₹2,500 in the current year. The current price is 125% of last
year’s price. An alternative statement is that the price in the current year is
25% higher than that of last year. Such figures may be misread to mean either that
the price in the current year is 25% of last year’s price or that this year’s
price has increased by 125%. If any value declines by 100%, the result is a value
of zero. A decline greater than 100% cannot occur with quantities like prices,
wages, employment, and so on, and if one is reported, it indicates an error.
For example, if a price of ₹2,000 is reduced to ₹800, the decline of ₹1,200 may
be computed as 150% of the latter price. This is an incorrect statement, because
the base is not correctly chosen. Taking the original price of ₹2,000 as the
base, the decline is 1,200/2,000 × 100 = 60%.
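The effect of choosing the wrong base can be made concrete with a short Python sketch. The function names percent_of and percent_change are assumptions; the prices are those used in the example above.

```python
def percent_of(new, old):
    """New value as a percentage of the old value (the base)."""
    return new * 100 / old


def percent_change(new, old):
    """Change from old to new as a percentage of the old value (the base)."""
    return (new - old) * 100 / old


print(percent_of(2_500, 2_000))      # 125.0 -> current price is 125% of last year's
print(percent_change(2_500, 2_000))  # 25.0  -> current price is 25% higher
print(percent_change(800, 2_000))    # -60.0 -> correct: a 60% decline on the earlier price

# Wrong base: dividing the change by the later price gives an impossible decline
print((800 - 2_000) * 100 / 800)     # -150.0
```

A decline of more than 100% in a price is a signal that the later figure, and not the earlier one, was used as the base.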
Distortions Caused by Small Bases: Incorrect conclusions can also be drawn
because of the distortions caused by small bases. Considerable caution is to be
exercised in interpreting percentages computed on small bases. A conclusion that
the management of one firm is more efficient than that of another cannot be
justified on such figures alone. Since a percentage shows a relative magnitude
only, no inference should be drawn from it about the absolute amounts. In such a
situation, a correct picture can be obtained only if the absolute figures are also
presented.
Comparisons Based on Dissimilar Situations: The data should be homogeneous
for the computation and use of ratios and percentages. Before significant
conclusions are drawn from a comparison, it is always necessary to see whether the
data analysed are comparable or not. Arithmetic mistakes, including misplaced
decimal points, may lead to gross misinterpretations.
Improper Averaging: Averaging deserves some discussion, as it is done in several
situations. To find the appropriate average of the rates for several machines, it
is necessary to know the number of bolts produced by each machine. From the above
discussion it is evident that ratios and percentages must be computed carefully so
that meaningful conclusions can be drawn. Whenever possible, the data from which
these ratios are derived should also be given, so that the reader can verify the
relationship, detect any errors and make his own interpretation.
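Why the output of each machine matters can be illustrated with a hedged Python sketch. The unit does not give the data for its example, so the two machines, their defect percentages and their outputs below are entirely hypothetical; the function names are also assumptions. The sketch contrasts a simple average of percentages with an average weighted by the number of items each machine produces.

```python
def simple_average(rates):
    """Unweighted mean of the rates; ignores how much each machine produces."""
    return sum(rates) / len(rates)


def weighted_average(rates, weights):
    """Mean of the rates weighted by the output of each machine."""
    return sum(r * w for r, w in zip(rates, weights)) / sum(weights)


defect_percent = [2.0, 10.0]  # hypothetical % defective for machines 1 and 2
items_produced = [9_000, 1_000]  # hypothetical output of each machine

print(simple_average(defect_percent))                    # 6.0
print(weighted_average(defect_percent, items_produced))  # 2.8
```

Here the simple average overstates the overall defect percentage, because the machine with the high rate produces only a small share of the output.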
Check Your Progress - 2
1. What is a time ratio?
2. What do many errors in the use of derivatives arise from?
12.4 SUMMARY
• Data collection alone is not enough for drawing the desired conclusions; the
figures should be able to speak. The data collected should be analysed and
compared, and presented in a form from which conclusions can be drawn.
• A ratio expresses the relationship between two quantities of the same kind
and denotes the number of times one is contained in the other.
• Ratios can be converted into percentages by taking one figure as the base,
dividing the other figure by it and multiplying by 100.
• A rate of change is a quotient in which the numerator represents the change
in a quantity and the denominator is the quantity to which the change is
related.
• When a number is divided by a related number and the quotient is multiplied
by 1,000, the resulting figure is a rate per thousand.
• Percentages, coefficients and ratios all give a relative picture.
• Derivatives are used to draw comparisons between different groups;
reducing the figures to a common denominator makes the comparison simple
yet meaningful.
• Several kinds of ratios are used in statistical work. Every ratio depends on
its base: when numbers are compared, the figure with which the others are
compared is the base.
• An interpart ratio is the ratio of one part of a total to another part of the
same total. Since the comparison is drawn between two parts, the base is
one of the two parts.
• A time ratio measures the change in a series of values arranged in time
sequence and is usually expressed as a percentage.
• A hybrid ratio relates two quantities of different kinds, so its numerator
and denominator are expressed in different units.
• There must be a clear relationship between the numerator and the
denominator.
• The base or denominator of a statistical ratio is always a standard with
which the numerator is compared.
• The number of units in the denominator (that is, the base) may be
determined by custom, convenience and effectiveness.
• The number used as the base should be large enough for the value of the
numerator to appear mainly as a whole number, but the numerator should
not have more than a few digits to the left of the decimal point.
• Per capita income, population per square kilometre, production per acre,
turnover ratio, fixed assets ratio, intelligence quotient, freight revenue
per mile, investment per mile, labour-output ratio, capital-output ratio, and
so on are examples of popular ratios in use.
• Many errors in the use of derivatives arise from a failure to state the
meaning of the derivatives correctly.
12.5 KEY WORDS
• Logarithm: The logarithm of a given number (log y) is the power to which
a given base must be raised to obtain that number.
• Percentage: It gives the magnitude of the numerator when the denominator
of a ratio becomes 100.
• Proportion: It is the ratio of the number of items in one category to the
total number of items in all categories.
• Rate: It is a ratio in which the numerator and the denominator are in
different units.
• Time Ratio: It is generally shown as a percentage and expresses the
change in a series of values relating to different time periods.
• Ratio: It is the quantitative relation between two amounts, showing the
number of times one value contains or is contained within the other.
12.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. Ratio helps express the connection between the two quantities that are
similar and denotes the number of times one is contained within another.
2. Percentages are just a form of the proportion based on or against 100.
Check Your Progress - 2
1. A time ratio is a measure of the change in a series of values arranged in
time sequence, and it is usually expressed as a percentage.
2. Many errors in the use of derivatives arise from a failure to state the
meaning of the derivatives correctly.
12.7 SELF-ASSESSMENT QUESTIONS
1. Write a short note on ratios, percentages and rates.
2. Differentiate between ratios and rates in statistics.
3. What is the purpose of statistical derivatives in statistics?
4. List the various types of ratios.
5. How are ratios computed in statistics? Discuss with the help of examples.
6. Write a note on cautions in the use of derivatives.
7. List the various applications of ratios.
8. Write a note on the various statistical derivatives.
12.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
BLOCK-IV
COLLECTION, CLASSIFICATION AND PRESENTATION OF DATA
This block discusses the collection, classification and presentation of data. The
previous block discussed the basic statistical concepts: the meaning of statistics,
organizing a statistical survey, accuracy, approximation and errors, and ratios,
percentages and rates. This block deals with the collection, classification, tabular
presentation and diagrammatic representation of data. It consists of three units.
The thirteenth unit discusses the collection and classification of data, including
the methods and techniques of data collection. Research begins by finding and
collecting data. The sources of data are classified as primary and secondary.
Primary data is first-hand data, while secondary data is data collected from
various other sources. The unit discusses these in detail.
The fourteenth unit is about tabular presentation. Data once collected is classified and then
presented in tables. Tables are single-column or single row or multiple columns, depending
upon the nature of data. For example numerical data is presented in statistical tables while a
contingency table presents observed data. The unit discusses the aspects of data
presentation in detail.
The fifteenth unit explains the diagrammatic and graphic presentation of data. Data,
once classified into different segments, needs to be stored in such a manner that it
remains easy to retrieve. Diagrammatic and graphic representations are used for this
purpose. They help compare and differentiate data across its different forms and
types. The different types of charts and diagrams are discussed in this unit.
UNIT–13
COLLECTION AND CLASSIFICATION OF DATA
Objectives
After going through this unit, you will be able to:
• Describe collection of data
• Assess the drafting of questionnaire
• Analyse the features of specimen questionnaire
• Discuss sampling and non sampling errors
Structure
13.1 Introduction
13.2 Collection of Data
13.3 Classification of Data
13.4 Summary
13.5 Key Words
13.6 Answers to ‘Check Your Progress’
13.7 Self-Assessment Questions
13.8 Further Readings
13.1 INTRODUCTION
In this unit, you will learn about the methods and techniques of data collection.
Determining the sources of data is one of the most important steps in conducting
research. Sources of data can be classified into primary and secondary sources. A
primary source, also known as first-hand information, includes all the data that is
closest to the information or concept being examined. Primary data can be obtained
through observations or through direct communication with the persons associated
with the selected subject by performing surveys or descriptive research. A
secondary source, also known as second-hand information, is data that relates or
discusses information actually presented elsewhere. In this unit, you will also learn
about editing the data, which includes correction in data.
13.2 COLLECTION OF DATA
Some of the sources of data to collect first-hand information are as follows:
• Census
• World Bank
• WHO (World Health Organization)
• NSSO (National Sample Survey Organization)
• Economic Survey
• Civil Registration
• Sample Registration System
• National Family and Health Surveys
• Reproductive and Child Health Project
• SRS Surveys
• Multiple Indicator Survey
• Medical Causes of Death
• Demographic and Health Surveys
Since the quality of the results obtained from statistical data, and hence of the
managerial decisions based on them, depends upon the quality of the information
collected, it is important that a sound investigative process be established to ensure
that the data are highly representative and unbiased. This requires a high degree of
skill, and certain precautionary measures must also be taken.
The following steps may be considered in the primary data collection process:
Planning the Study
Before any procedures for data collection are established, the purpose and the
scope of the study must be clearly specified. If any similar studies have been
conducted, prior to the current one, then the investigator may want to use some
secondary data in his own study, and may redefine his objectives on the basis of the
previous studies conducted. The scope of the study must take into consideration the
field to be covered, and the time period in which to conduct the study. The time span
is very important, because in certain areas, the conditions change very quickly, and
hence by the time the study is completed, it may become irrelevant. The statistical
units and the desired accuracy of such units must be clearly specified.
Methods of Collecting Primary Data
Primary data can be collected by any one or more of the following methods:
(a) Direct personal observation: Under this method, the investigator
presents himself personally before the informant and obtains first-hand
information. This method is most suitable when the field of enquiry is small
and a greater degree of accuracy is required.
Merits
(i) The first-hand information obtained by the investigator is bound to be
more reliable and accurate since the investigator can extract the
correct information by removing doubts, if any, in the minds of the
respondents regarding certain questions.
(ii) High response rate since the answers to various questions are obtained
on the spot.
(iii) It permits explanation of questions concerning difficult subject matter.
(iv) It permits evaluation of respondent, his circumstances and reliability.
(v) This method is useful where spontaneity of response is required.
(vi) It provides personal rapport which helps to overcome reluctance to
respond.
(vii) Where the investigator and informant talk face to face, it becomes
possible to explore questions in depth.
(viii) Information is collected promptly and there is no dribbling in.
Limitations
(i) This method is suitable only for intensive studies and not for extensive
enquiries.
(ii) This method is time-consuming and the investigation may have to be
spanned over a long period.
(iii) This method is highly subjective in nature and the results of the enquiry
may be adversely affected by the personal biases, whims and
prejudices of the investigator.
(b) Telephone survey: Under this method, the investigator, instead of
presenting himself before the informants, contacts them on telephone and
collects information from them.
Merits
(i) The method is more convenient than personal interview.
(ii) This method is less time-consuming and can be applied even to
extensive fields of enquiries. Telephone survey has all the other merits
of personal interview.
Limitations
(i) This method excludes those who do not have a telephone as also
those who have unlisted telephones.
(ii) This method is also subjective in nature and personal bias, whims and
prejudices of the investigator may adversely affect the results of the
enquiry.
(c) Indirect personal interview: Under this method, instead of directly
approaching the informants, the investigator interviews several third persons
who are directly or indirectly concerned with the subject matter of the
enquiry and who are in possession of the requisite information. Such a
procedure is followed by the enquiry committees and commissions
appointed by the Government of India. The committee selects persons
known as witnesses and collects information from them by getting answers
to questions decided in advance. This method is highly suitable where the
direct personal investigation is not practicable either because the informants
are unwilling or reluctant to supply the information or where the information
desired is complex and the study in hand is extensive.
Merits
(i) This method is less costly and less time-consuming than the direct
personal investigation.
(ii) Under this method, the enquiry can be formulated and conducted
more effectively and efficiently as it is possible to obtain the views and
suggestions of the experts on the given problem.
Limitations
The success of this method depends upon:
(i) The representative character of the witnesses
(ii) The personal knowledge of the witnesses about the subject matter of
enquiry
(iii) The personal prejudices of the witnesses as regards definiteness in
stating what is wanted
(iv) The ability of the interviewer to extract information from the witnesses
by asking appropriate questions and cross-questions
(d) Information received through local agents: Under this method, the
information is not collected formally by the investigator, but local agents,
commonly known as correspondents, are appointed in different parts of the
area under investigation. These agents collect information in their areas and
transmit the same to the investigator. They apply their own judgement as to
the best method of obtaining information. This method is usually employed
by newspaper or periodical agencies which require information in different
fields such as economic trends, business, stock and share markets, sports,
politics, and so on.
Merits
(i) This method is very cheap and economical for extensive investigations.
(ii) The required information can be obtained expeditiously since only
rough estimates are required.
Limitations
Since the correspondents apply their own judgement about the method of
collecting the information, the results are often vitiated due to personal
prejudices and whims of the correspondents. The data so obtained are thus
not so reliable. This method is suitable only if the purpose of investigation is
to obtain rough and approximate estimates. It is unsuited where a high
degree of accuracy is desired.
(e) Mailed questionnaire method: Under this method, the investigator
prepares a questionnaire containing a number of questions pertaining to the
field of enquiry. These questionnaires are sent by post to the informants
together with a polite covering letter explaining in detail the aims and
objectives of collecting the information, and requesting the respondents to
cooperate by furnishing the correct replies and returning the questionnaire
duly filled in. In order to ensure quick response, the return postage
expenses are usually borne by the investigator. This method is usually
adopted by the research workers, private individuals and non-official
agencies. The success of this method depends upon the proper drafting of
the questionnaire and the cooperation of the respondents.
Merits
(i) By this method, a large field of investigation may be covered at a very
low cost. In fact, this is the most economical method in terms of time,
money and manpower.
(ii) Errors due to personal bias of the investigators or enumerators are
completely eliminated as the information is supplied by the person
concerned in his own handwriting.
Limitations
(i) This method can be used only if the respondents are educated and can
understand the questions well, and reply in their own handwriting.
(ii) Sometimes the informants may not send back the schedules and even
if they return the schedules, they may be incorrectly filled in.
(iii) Sometimes, the informants are not willing to give written information in
their own handwriting on certain personal questions like income,
personal habits and property.
(iv) There is no scope for asking supplementary questions for cross-checking
of the information supplied by the respondents.
(f) Questionnaire sent through enumerators: Under this method, instead
of sending the questionnaire through post, the investigator appoints agents
known as enumerators, who go to the respondents personally with the
questionnaire, ask them the questions given therein, and record their
replies. This method is generally used by business houses, large public
enterprises and research institutions.
Merits
(i) The information collected through this method is more reliable as the
enumerators can explain in detail the objectives and aims of the
enquiry to the respondents and win their cooperation.
(ii) Since the enumerators personally call on the respondents, there is very
little non-response.
(iii) This technique can be used with advantage even if the respondents are
illiterate.
(iv) The enumerators can effectively check the accuracy of the information
supplied through some intelligent cross-questioning by asking
supplementary questions.
Limitations
(i) The method is more expensive and can be used by financially strong
bodies or institutions only.
(ii) It is more time-consuming than the mailed questionnaire method.
(iii) The success of the method depends upon the skill and efficiency of the
enumerators to collect the information as also on the efficiency and
wisdom with which the questionnaire is drafted.
Drafting or Framing the Questionnaire
Since the questionnaire is the only medium of communication between the
investigator and the respondents, it must be designed or drafted with utmost care
and caution so that all the relevant and essential information for the enquiry may be
collected without any difficulty, ambiguity or vagueness. Designing of questionnaire,
therefore, requires a high degree of skill and experience on the part of the
investigator. No hard and fast rules can be laid down for designing or framing a
questionnaire. However, much useful purpose would be served if the following
general points are borne in mind while drafting a questionnaire:
1. The size of the questionnaire should be as small as possible. The number of
questions should be kept to the minimum keeping in view the nature,
objectives and purpose of enquiry. Respondents’ time should not be
wasted by asking irrelevant and unimportant questions. Fifteen to twenty-five
may be regarded as a fair number. If a larger number of questions is
unavoidable in any enquiry, the questionnaire should preferably be divided
into two or more parts.
2. Questions should be clear, brief, unambiguous, non-offending, courteous in
tone, corroborative in nature and to the point.
3. Questions should be logically arranged.
4. Questions should be short, simple and easy to understand. The usage of
vague or multiple meaning words should be avoided. Unless the
respondents are technically trained, the use of technical terms should be
avoided.
5. Questions should be so designed that the respondents can easily
comprehend and answer them. Questions involving mathematical
calculations should not be asked.
6. Questions of sensitive or personal nature should be avoided.
7. The questionnaire should provide necessary instructions to the
enumerators.
8. If a particular question needs clarification, it should be explained by way of
a footnote.
9. Questions should be capable of objective answer. Various types of
questions in the questionnaire may be grouped under three categories:
(i) Dichotomous or simple alternate questions in which the
respondent has to choose between two clear-cut alternatives like
‘Yes’ or ‘No’, ‘Right’ or ‘Wrong’, ‘Either’ or ‘Or’, and so on. This
technique can be applied elegantly in situations where two clear-cut
alternatives exist.
(ii) Multiple choice questions in which the respondent is asked to
select one out of a number of responses. All possible answers to a
question are listed and the respondent chooses one of these. Such
questions save time and facilitate tabulation. This method should be
used only if a few alternative answers exist to a particular question.
(iii) Open questions are those in which no alternative answers are
suggested and the respondents are free to express their frank and
independent opinions on the problem in their own words usually in
essay form.
10. Cross checks: The questionnaire should be so designed as to provide a
cross check on the accuracy of the information supplied by the
respondents by including some connected questions.
11. Pre-testing the questionnaire: The questionnaire should be tried on a
small group before using it for the given enquiry. This will help in improving
or modifying the questionnaire in the light of the drawbacks, shortcomings
and problems faced by the investigator in the pre-test.
12. A covering letter, stating briefly the aims and objects of the enquiry,
soliciting the cooperation of the respondents, and explaining various terms
and concepts, should be enclosed along with the questionnaire.
13. In case of mailed questionnaire method, a self-addressed stamped
envelope should be enclosed.
14. To ensure quick response, respondents may be offered incentives in the
form of gift coupons, a sample of the product to be introduced, or a
promise to supply a copy of the findings after the survey work is over.
15. Method of tabulation and analysis, whether hand-operated, machine-operated
or computerized, should also be kept in mind while designing the
questionnaire.
16. Lastly, the questionnaire should be made attractive by a proper layout and
an appealing get-up.
Specimen Questionnaire
This hypothetical study is adapted from a study developed by Deepak Mehendru in
India. Assume that this study involves 200 professors in New York area colleges
who are asked about their interest in buying automobiles. The basic objective of this
survey is to determine certain marketing trends among the population of professors
in the New York area regarding their automobile buying patterns, based upon the
following factors:
• The profile of the decision-maker who finally decides to buy a particular
type of car
• People around the decision-maker who influence the decision-making
process
• The factors affecting the selection of a particular dealer of cars
• People in the family who make or affect decisions regarding the maximum
budget that can be allocated for purchasing a car
• The effect of various options available in the car
• The image and reliability of the company that makes these cars
• The effect of heavy promotion on television about the utility of the car on
the decision maker
(For the sake of simplicity, it is assumed that the professors have only one car in
the family.)
The Questionnaire
1. General
Name................................................................................
Age...................................................................................
Sex.........M..........F............................................................
Marital Status ....... Married ....... Unmarried ...................
Number of members in the family
1–2...................
3–4...................
5–6...................
Over 6..............
Yearly income
Less than $30,000...................
$30,000–$39,999......................
$40,000–$49,999......................
$50,000 and more...................
2. What type of car do you own now?
.................American
.................Japanese
.................European
3. What size of car do you own?
.................Luxury
.................Mid-size
.................Compact
4. Did you buy this car new or used?
.................New....................Used
5. If you bought a used car, did you buy it from a dealer or a private party?
.................Dealer.................Private party
6. If you bought a new car, how long have you owned this car?
.................Number of years
7. If you bought a used car, how old is this car now?
..............Number of years
8. Price paid for the car..........New..........Used
9. Who influenced your decision to purchase the above brand of car? Indicate
if more than one.
...............Yourself
...............Your wife
...............Your children
...............Your friend
...............Your neighbour
...............Your colleague
...............Others
10. Indicate as to who decided about the budget allocation for the car?
...............Yourself
...............Your spouse
.............. Family decision
11. If you bought your car from a dealer, then who influenced your decision
regarding the selection of a particular dealer?
...............Yourself
...............Your friend
...............Your colleague
...............Family decision
12. How did you come to know about this dealer?
...............TV commercial
...............Newspapers
...............Personal references
...............Others
13. Rank the following factors that affected the final decision at the time of
purchasing the car (A rank of 1 measures the most important factor, a rank
of 2 measures the second most important factor, and so on).
...............Very inconvenient without the car
...............Money was available
...............Reputation of car manufacturer
...............Discounts offered
...............Interest rate on financing
...............Guarantees and warranties offered
...........................Others
14. Did you make an extensive survey regarding price comparisons after you
decided to buy the particular car? ............ Yes......... No.
15. If you bought a used car, how did you learn about it?............ Newspapers
...............Friend ............... Others
16. In order of preference, what were the major reasons for buying a used
car?
...............Unavailability of adequate funds
...............Cheaper insurance
...............Lack of parking garage
...............Condition of the car
...............Others
17. Which of the following media do you think is most effective in creating an
impact on the potential customer relative to a particular brand of the car?
................TV
...............Newspapers
................Magazines
...............Favourable news reports
................Word of mouth
...............Others
The responses to such questions would form the basis of analysis in order to
achieve the set marketing objectives.
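Closed questions such as Questions 2 and 13 lend themselves to mechanical tabulation once the questionnaires are returned. The following is only a minimal Python sketch of that idea; the three responses and the names `responses`, `car_type_counts` and `mean_rank` are hypothetical and are not part of the original study.

```python
from collections import Counter

# Hypothetical answers from three respondents (illustrative data only).
# "car_type" corresponds to Question 2; "ranks" to Question 13 (1 = most important).
responses = [
    {"car_type": "American", "ranks": {"Discounts offered": 1, "Money was available": 2}},
    {"car_type": "Japanese", "ranks": {"Discounts offered": 2, "Money was available": 1}},
    {"car_type": "American", "ranks": {"Discounts offered": 1, "Money was available": 2}},
]

# Frequency count for a multiple-choice question.
car_type_counts = Counter(r["car_type"] for r in responses)


def mean_rank(factor):
    """Average rank given to a factor; a lower mean indicates greater importance."""
    ranks = [r["ranks"][factor] for r in responses if factor in r["ranks"]]
    return sum(ranks) / len(ranks)


print(car_type_counts)
print(mean_rank("Discounts offered"))
```

A frequency count suits questions with fixed alternatives, while a mean rank is one simple (assumed) way of summarizing a ranking question; other summaries are equally possible.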
Secondary Data
The chief sources of secondary data may be broadly classified into the following two
groups:
(i) Published sources
(ii) Unpublished sources
Published sources: There are a number of national organizations and international
agencies which collect and publish statistical data relating to business, trade, labour,
price, consumption, production, etc. These publications are useful sources of
secondary data. Some of these published sources are as follows:
1. Official publications of the Central and State Governments such as monthly
abstract of statistics, national income statistics, vital statistics of India, etc.
2. Publications of semi-government organizations, e.g., the Reserve Bank of
India bulletin
3. Publications of research institutions, e.g., the publications of the Indian
Council of Agricultural Research (I.C.A.R.), New Delhi
4. Publications of commercial and financial institutions, e.g., the publications of
the F.I.C.C.I.
5. Reports of various committees and commissions appointed by the
government, such as the Wanchoo Commission Report on Taxation
6. Newspapers and periodicals like Economic Times, Statesman Year Book
also publish useful statistical data
7. International publications like the U.N. Statistical Year Book,
Demographic Year Book, etc
Unpublished sources: The records maintained by private firms or business houses,
which may not like to release their data to any outside agency, and the research carried
out by research scholars in universities or research institutes may also provide
useful statistical data.
Precautions in the use of secondary data: Secondary data should be used with
extra caution since they have been collected by someone other than the investigator.
Before using such data the investigator must be satisfied in regard to the reliability,
accuracy, adequacy and suitability of the data to the given problem under
investigation. Before using secondary data, the investigator should examine the
following questions.
1. Are the data suitable for the purpose of investigation? For this, he should
compare the objectives, nature and scope of the given enquiry with the
original investigation. He should also confirm that the various terms and units
were clearly defined and were uniform throughout the earlier investigation
and these definitions are suitable for the present enquiry as well.
2. Are the data reliable? For this, the investigator should satisfy himself about
(i) the reliability, integrity and experience of the collecting organization, (ii)
the reliability of the source of information, (iii) the methods used for the
collection and analysis of the data, and (iv) the degree of accuracy desired
by the company.
3. Are the data adequate? Adequacy of data is to be judged in the light of the
requirements of the survey and the geographical areas covered by the
available data. Adequacy of data is also to be considered in the light of the
time period for which the data are available.
Hence, in order to arrive at conclusions free from limitations and inaccuracies,
the secondary data must be subjected to thorough scrutiny and editing before they
are accepted for use.
Correction in Data
When the researcher collects the data, it is in raw form and it needs to be edited,
organized and analyzed. The first step in the correction of data is to edit that data.
The edited data is then coded and inferences are drawn. The editing of the data is
not a complex task, but it requires an experienced, knowledgeable and talented
person to do so.
The next step in the processing of data is editing of the data instruments. Editing
is a process of checking data to detect and correct errors and omissions, if any.
Data editing happens at two stages, one at the time of recording of the data and
second at the time of analysis of data.
Data Editing at the Time of Recording of Data
Document editing and testing of data at the time of data recording is done while
keeping the following questions in mind:
• Do the filters agree or is the data inconsistent?
• Have ‘missing values’ been set to values that are the same for all research
questions?
• Have variable descriptions been specified?
• Have value labels and labels for variable names been defined and written?
All editing steps are documented so that the redefining of variables or later
analytical modification could be easily incorporated into the data sets.
Data Editing at the Time of Analysis of Data
Data editing is also a requisite before the analysis of data is carried out. This ensures
that the data is complete in all respects and can be subjected to further analysis.
Some of the usual check list questions that can be prepared by a researcher for
editing data sets before analysis are as follows:
• Is the coding frame complete?
• Is the documentary material sufficient for the methodology description of
the study?
• Is the storage medium readable and reliable?
• Has the correct data set been framed?
• Is the number of cases correct?
• Are there differences between questionnaires, coding frames and data?
• Are there undefined and so-called ‘wild codes’?
• Has the first counting of the data been compared with the original
documents of the researcher?
The editing steps check for the completeness, accuracy and uniformity of the
data as created by the researcher.
Completeness: The first step of editing or correction of data is to check whether
there is an answer to each of the questions/variables set out in the data set. If there
are any omissions, the researcher sometimes is able to deduce the correct answer
from other related data on the same instrument. If this is possible, the data set has to
be rewritten on the basis of the new information. For example, the approximate family
income can be inferred from other answers to probes such as occupation of family
members, sources of income, and approximate spending, saving and borrowing habits of
family members. If the information is vital and has been found to be incomplete,
then the researcher can take the step of contacting the respondent personally again
and solicit the requisite data. If none of these steps help in furnishing the required
data, the data must be marked ‘missing’.
Accuracy: Apart from checking for omissions, the accuracy of each recorded
answer should be checked. A random check process can be applied to trace the
errors at this step. Consistency in response can also be checked at this step. The
cross verification to a few related responses would help in checking for consistency
in responses. The reliability of the data set would heavily depend on this step of error
correction. While clear inconsistencies should be rectified in the data sets, fake
responses should be dropped from the data sets altogether.
Uniformity: In editing data sets, another keen look-out should be for any lack of
uniformity in interpretation of questions and instructions by the data recorders. For
instance, the responses towards a specific feeling could have been queried from a
positive as well as the negative angle. While interpreting the answers, care should be
taken to record each answer consistently as a ‘positive question’ response or as a
‘negative question’ response, so that the coding remains uniform throughout the
questionnaire, interview schedule or data set.
The final step in the editing of data is to maintain a log of all corrections that
have been carried out at this stage. The documentation of these corrections helps the
researcher to retain the original data set.
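The checks for completeness and accuracy, together with the log of corrections, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the code frame, the record layout and the names `CODE_FRAME`, `edit_record` and `correction_log` are hypothetical, and treating a wild code as ‘missing’ is one possible editing rule, not the only one.

```python
# Hypothetical code frame: the set of valid codes for each variable.
CODE_FRAME = {
    "owns_vehicle": {1, 2},          # 1 = No, 2 = Yes
    "performance": {1, 2, 3, 4, 5},  # 1 = Worst ... 5 = Excellent
}


def edit_record(record, log):
    """Return an edited copy of one record and log every change made to it."""
    edited = dict(record)  # the original record is left untouched
    for variable, valid_codes in CODE_FRAME.items():
        value = edited.get(variable)
        if value is None:
            reason = "no answer"    # completeness check failed
        elif value not in valid_codes:
            reason = "wild code"    # accuracy check failed: code not in the frame
        else:
            continue
        edited[variable] = "missing"
        log.append((record["id"], variable, value, "missing", reason))
    return edited


correction_log = []
raw = [
    {"id": "A001", "owns_vehicle": 2, "performance": 4},
    {"id": "A002", "owns_vehicle": None, "performance": 9},
]
clean = [edit_record(r, correction_log) for r in raw]

for entry in correction_log:
    print(entry)
```

Because each correction is written to the log and the raw records are copied rather than overwritten, the original data set can always be recovered.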
Check Your Progress - 1
1. What do editing steps check for?
................................................................................................................
................................................................................................................
................................................................................................................
2. What is the final step in the editing of data?
................................................................................................................
................................................................................................................
................................................................................................................
13.3 CLASSIFICATION OF DATA
In the data preparation step, the data is prepared in a data format that allows the
analyst to use modern analysis software such as SAS or SPSS. The major
criterion in this is to define the data structure. A data structure is a dynamic
collection of related variables and can be conveniently represented as a graph
where nodes are labeled by variables. The data structure also defines the stages of
the preliminary relationship between variables/groups that have been pre-planned
by the researcher. Most data structures can be graphically presented to give clarity
as to the frames of the researched hypothesis. A sample structure could be a linear
structure in which one variable leads to the other and, finally, to the resultant end
variable.
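A linear data structure of this kind can be written down as a small graph. The sketch below is illustrative only: the variable names and the helper `linear_path` are hypothetical, chosen to echo the car-purchase example rather than taken from any actual study.

```python
# A linear data structure: each variable (node) points to the variable it leads to.
# All variable names are hypothetical.
structure = {
    "income": ["budget"],
    "budget": ["car_size"],
    "car_size": ["purchase_decision"],
    "purchase_decision": [],  # the resultant variable has no outgoing edge
}


def linear_path(graph, start):
    """Follow the single outgoing edge from each node until the last variable."""
    path = [start]
    while graph[path[-1]]:
        path.append(graph[path[-1]][0])
    return path


print(" -> ".join(linear_path(structure, "income")))
```

More complex structures would simply list several successors for a node instead of one.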
The identification of the nodal points and the relationships among the nodes
could sometimes be a more complex task than estimated. When the task is complex,
involving several types of instruments being collected for the same research question,
the procedure for drawing the data structure would involve a series of steps. In
several intermediate steps, the heterogeneous data structure of the individual data
sets can be harmonized to a common standard and the separate data sets are then
integrated into a single data set. However, the clear definition of such data structures
would help in the further processing of data.
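The harmonization of separate data sets to a common standard, followed by their integration, might look as follows in Python. This is a rough sketch under assumptions: the two instruments, their field names and the target layout (`id`, `gender`, `source`) are all hypothetical.

```python
# Two hypothetical data sets collected with different instruments.
postal = [{"resp": "A001", "sex": "M"}, {"resp": "A002", "sex": "F"}]
interview = [{"respondent_id": "B001", "gender": "male"}]


def harmonize_postal(row):
    """Map a postal-questionnaire row onto the common standard."""
    return {"id": row["resp"], "gender": row["sex"], "source": "postal"}


def harmonize_interview(row):
    """Map an interview row onto the common standard, recoding gender."""
    gender = {"male": "M", "female": "F"}[row["gender"]]
    return {"id": row["respondent_id"], "gender": gender, "source": "interview"}


# Integration: once harmonized, the rows can be combined into a single data set.
combined = [harmonize_postal(r) for r in postal] + [harmonize_interview(r) for r in interview]

for row in combined:
    print(row)
```

Keeping a `source` field is a design choice that lets later analysis trace each case back to the instrument it came from.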
Editing and Coding
The next step in the processing of data is editing of the data instruments. Editing of
data has already been discussed in the previous section.
Coding and Classification of Data
The next step in data processing after initial identification of data structure and the
editing/correction of data is to codify and classify the data. Codification of data
refers to the careful assignment of values (codes) to the variables in the data obtained
during the preceding stages. Classification of data, on the other hand, refers to the
categorization and assembling of all data in a comprehensive way.
The edited data is then subjected to codification and classification. Coding
process assigns numerals or other symbols to the responses of the data set. If there
is a prerequisite to prepare a coding scheme for the data set, the recording of the
data is done on the basis of this coding scheme.
The responses collected in the data sheets vary: sometimes the response could be
a choice among multiple responses, sometimes a value and sometimes an
alphanumeric entry. At the recording stage itself, if some codification is
done to the responses collected, it can be useful in the data analysis. When
codification is done, it is imperative to keep a log of the codes allotted to the
observations. This code sheet will help in the identification of variables/observations
and the basis for such codification.
The first items to be coded in primary data sets are the individual response sheets
themselves. This response-sheet coding benefits the research in that the
verification and editing of the recordings and further contact with the respondents
can be achieved without any difficulty. The codification can be made at the time of
distribution of the primary data sheets itself. The codes can be alphanumerical so as
to keep track of where and to whom it has been sent. For instance, if the data
involves a number of people at different localities, the sheets that are distributed in a
specific locality may carry a unique part code which is alphabetic. To this alphabetic
code, a numeric code can be attached to distinguish the person to whom the primary
instrument was distributed. This also helps the researcher to keep track of who the
respondent is and who the probable respondents are from whom primary data
sheets are yet to be collected. Even at a later stage, any specific queries on specific
response sheets can be clarified.
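The response-sheet coding described above, an alphabetic locality part followed by a numeric respondent part, can be sketched in Python as follows. The locality codes, the three-digit width and the names `sheet_code`, `issued` and `outstanding` are assumptions made for illustration.

```python
def sheet_code(locality, serial):
    """Combine an alphabetic locality code with a numeric respondent number."""
    return f"{locality}{serial:03d}"


# Hypothetical distribution list: locality code -> number of sheets handed out.
distributed = {"N": 3, "S": 2}
issued = {
    sheet_code(locality, n)
    for locality, count in distributed.items()
    for n in range(1, count + 1)
}

# Sheets received back so far (hypothetical).
returned = {"N001", "N003", "S002"}

# Sheets still to be collected from respondents.
outstanding = sorted(issued - returned)
print(outstanding)
```

Comparing the set of issued codes with the set of returned codes shows at a glance which respondents have yet to return their sheets.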
The variables or observations in the primary instrument also need codification,
especially when they are categorized. The categorization could be on a scale ranging
from most preferable to not preferable, or it could be very specific such as, gender
classified as male and female. Certain classifications can further lead to open-ended
categories; education, for instance, may be classified as illiterate, graduate,
professional and others. In such instances, the codification needs to be carefully done to
include all possible responses under ‘Others (please specify)’. If the preparation of
an exhaustive list is not feasible, then it will be better to create a separate variable for
the ‘others’ category and record all responses as such.
(i) Numeric coding: Coding need not necessarily be numeric; it can also be
alphabetic. However, the coding must be purely numeric if the variable is
subject to further parametric analysis.
(ii) Alphabetic coding: A variable that is only to be tabulated, counted for
frequency or represented graphically may be given an alphabetic coding.
(iii) Zero coding: A code of zero has to be assigned carefully to a variable. In
many instances, when manual analysis is done, a code of zero implies
‘no response’ from the respondent. Hence, a specific response in the data
sheet should not be coded zero, since it would then be read as a
‘non-response’. For instance, if there is a tendency to give a code of 0 to a
‘no’, a code other than zero should be used for it in the data sheet. An
illustration of the coding process for some of the demographic variables is
given in the following table.
Table 13.1 Coding Process
Question    Variable/               Response          Code
Number      Observation             Categories
1.1         Organization            Private           Pt
                                    Public            Pb
                                    Government        Go
3.4         Owner of vehicle        Yes               2
                                    No                1
4.2         Vehicle performance     Excellent         5
                                    Good              4
                                    Adequate          3
                                    Bad               2
                                    Worst             1
5.1         Age                     Up to 20 years    1
                                    21-40 years       2
                                    40-60 years       3
5.2         Occupation              Salaried          S
                                    Professional      P
                                    Technical         T
                                    Business          B
                                    Retired           R
                                    Housewife         H
                                    Others            =
= could be treated as a separate variable/observation and the actual response could be
recorded. The new variable could be termed as ‘other occupation’.
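A code sheet of this kind can be kept as a simple lookup. The sketch below uses two variables from Table 13.1; the dictionary layout, the function name and the sample answers are assumptions made for illustration.

```python
# Code sheet following Table 13.1; the helper name and sample answers are assumptions.
CODE_SHEET = {
    # 0 is avoided so that it cannot be read as 'no response'.
    "owner_of_vehicle": {"Yes": 2, "No": 1},
    "occupation": {"Salaried": "S", "Professional": "P", "Technical": "T",
                   "Business": "B", "Retired": "R", "Housewife": "H"},
}

def code_response(variable, answer):
    """Return (code, other_text); unlisted answers go to a separate 'other' field."""
    if answer is None or answer == "":
        return (None, None)          # genuine non-response stays blank
    codes = CODE_SHEET[variable]
    if answer in codes:
        return (codes[answer], None)
    return ("=", answer)             # '=' marks Others; the actual response is recorded

print(code_response("owner_of_vehicle", "No"))
print(code_response("occupation", "Student"))
```

Handing the same lookup to whoever enters the data keeps the entries consistent with the researcher's own view of the codes.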
The coding sheet needs to be prepared with particular care if the data recording is
not done by the researcher but is outsourced to a data entry firm or individual. So
that the data are entered from the same perspective as the researcher would like to
view them, the data coding sheet is to be prepared first and a copy given to the
data entry operator to help him or her in the data entry procedure.
Sometimes, the researcher might not be able to code the data from the primary
instrument itself. He or she may need to classify the responses and then code them.
For this purpose, classification of data is also necessary at the data entry stage.
Classification
When open-ended responses have been received, classification is necessary to code
the responses. For instance, the income of the respondents could be an open-ended
question. A suitable classification can be arrived at from all responses. A
classification method should meet certain requirements or should be guided by
certain rules.
First, classification should be linked to the theory and the aim of the particular
study. The objectives of the study will determine the dimensions chosen for coding.
The categorization should meet the information required to test the hypothesis or
investigate the questions.
Second, the scheme of classification should be exhaustive, that is, there must be
a category for every response. For example, the classification of marital status into
three categories, viz., ‘married’, ‘single’ and ‘divorced’, is not exhaustive, because
responses like ‘widower’ or ‘separated’ cannot be fitted into the scheme. Here, an
open-ended question will be the best mode of getting the responses. From the
responses collected, the researcher can fit a meaningful and theoretically supportive
classification. The ‘others’ category has to be carefully used by the researcher for
this purpose, because overusing it tends to defeat the very purpose of
classification, which is to distinguish between observations in terms of the properties
under study. The ‘others’ category can be very useful when a minority of
respondents in the data set give varying answers. For instance, suppose a survey is
carried out to find out the newspaper reading habits of people. 95 respondents out of
100 could be easily classified into 5 large reading groups, while the remaining 5
respondents could each have given a unique answer. These answers, rather than being
separately considered, could be clubbed under the ‘others’ heading for meaningful
interpretation of respondents and reading habits.
Third, the categories must also be mutually exclusive, so that each case is
classified only once. This requirement is violated when some of the categories
overlap or different dimensions are mixed up.
The number of categories for a specific question/observation at the coding stage
should be the maximum permissible, since merging categories at the analysis level is
easier than splitting an already classified group of responses. However, the number
of categories is limited by the number of cases and the anticipated statistical
analysis that is to be used on the observation.
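The requirements of exhaustive and non-overlapping classes can be illustrated with the open-ended income question mentioned earlier. The class limits below are hypothetical and chosen only to show the idea; any real study would derive them from the responses and the objectives of the study.

```python
# Hypothetical income classes (monthly, in rupees); the cut-offs are assumptions.
CLASSES = [
    (0, 10_000, "Below 10,000"),
    (10_000, 25_000, "10,000 to below 25,000"),
    (25_000, 50_000, "25,000 to below 50,000"),
    (50_000, float("inf"), "50,000 and above"),
]

def classify_income(income):
    """Each non-negative income falls in exactly one class (lower bound inclusive)."""
    for lower, upper, label in CLASSES:
        if lower <= income < upper:
            return label
    raise ValueError("income must be non-negative")

responses = [8_000, 25_000, 72_500, 10_000]
print([classify_income(x) for x in responses])
```

Because each class includes its lower limit and excludes its upper limit, a boundary value such as 25,000 is placed in one class only, and the open-ended top class ensures that no response is left without a category.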
Transcription of Data
When the number of observations collected by the researcher is not very large, the
simple inferences which can be drawn from the observations can be transferred to a
data sheet, which is a summary of all responses on all observations from a research
instrument. The main aim of transcription is to minimize the shuffling between
several responses and observations. Suppose a research instrument contains 120
responses and the observations have been collected from 200 respondents; a
simple summary of one response from all 200 observations would require shuffling
of 200 pages. The process is quite tedious if several summary tables are to be
prepared from the instrument. The transcription process helps in the presentation of
all responses and observations on data sheets, which can help the researcher to arrive
at preliminary conclusions as to the nature of the collected sample. Transcription is,
hence, an intermediary process between data coding and data tabulation.
Methods of Transcription
The researcher may adopt manual or computerized transcription. Long
worksheets, sorting cards or sorting strips could be used by the researcher to
transcribe the responses manually. Computerized transcription could be done
using spreadsheets, text files or a database package.
The main requisite for a transcription process is the preparation of data sheets
in which observations are the rows and the responses/variables are the columns.
Each variable should be given a label so that long questions can be covered under
the label names. The label names are thus the links to specific questions in the
research instrument. For instance, opinion on consumer satisfaction could be
identified through a number of statements (say 10); the data sheet does not contain
the details of the statements, but gives a link to the question in the research
instrument through variable labels. In this instance, the variable names
could be given as CS1, CS2, CS3, CS4, CS5, CS6, CS7, CS8, CS9 and CS10.
The label CS indicates consumer satisfaction and the numbers 1 to 10 indicate the
statements measuring consumer satisfaction. Once the labeling process has been
done for all the responses in the research instrument, the transcription of the
responses is done.
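A computerized data sheet of this kind might look as follows. The sketch writes a small comma-separated sheet in memory; the observation codes and the 1 to 5 response values are invented for illustration, and only the labels CS1 to CS10 come from the text.

```python
import csv
import io

# Variable labels CS1..CS10 link each column to a statement in the instrument.
labels = [f"CS{i}" for i in range(1, 11)]

# Hypothetical coded responses (1 to 5) for two observations, for illustration only.
observations = {
    "A001": [5, 4, 4, 3, 5, 2, 4, 4, 3, 5],
    "A002": [3, 3, 2, 4, 4, 1, 5, 3, 3, 4],
}

# Rows are observations, columns are variables.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["code"] + labels)
for code, answers in observations.items():
    writer.writerow([code] + answers)

data_sheet = buffer.getvalue()
print(data_sheet)
```

The same layout, one row per observation and one labelled column per variable, can be opened directly in a spreadsheet package.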
1. Manual Transcription
When the sample size is manageable, the researcher need not use any
computerized process to analyze the data and could prefer manual transcription
and analysis of responses. Manual transcription is the natural choice when the
number of responses in a research instrument is very small, say 10 responses, and
the number of observations collected is within 100. A transcription sheet of
100 × 50 rows/columns (assuming each response has 5 options) can be easily
managed by a researcher manually. If, on the other hand, the variables have 20
options each, it leads to a worksheet of 100 × 200 size, which might not be easily
managed by the researcher manually. In the second instance, if the number of
responses is less than 30, then the worksheet could still be attempted manually. In
all other instances, it is advisable to use a computerized transcription process.
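The worksheet sizes quoted above follow from simple multiplication, as the short sketch below shows. The function name is an assumption; the figures are those used in the text.

```python
def worksheet_size(observations, responses, options_per_response):
    """Rows = observations; columns = one tick-box per option of every response."""
    return observations, responses * options_per_response

print(worksheet_size(100, 10, 5))   # (100, 50)
print(worksheet_size(100, 10, 20))  # (100, 200)
```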
2. Long Worksheets
Long worksheets require quality paper, preferably chart sheets thick enough to last
several usages. These worksheets are normally ruled both horizontally and vertically,
allowing responses to be written in the boxes. If one sheet is not sufficient, the
researcher may use multiple ruled sheets to accommodate all the observations.
The headings of the responses, which are the variable names and their coding
(options), are filled in the first two rows. The first column contains the code of
observations. For each variable, the responses from the research instrument are now
transferred to the worksheet by ticking the specific option that the respondent has
chosen. If the variable cannot be coded into categories, requisite space for
recording the actual response of the respondent should be provided in the worksheet.
The worksheet can then be used for preparing the summary tables or can be
subjected to further analysis. The original instruments can now be kept aside
as safe documents. Copies of the data sheets can also be kept for further
reference. As has been discussed under the editing section, the transcribed data
have to be tested to ensure error-free transcription.
A sample worksheet is given below for reference:
[Sample worksheet. Rows: Sl No 1 to 8. Column groups: Vehicle Owner (Y, N);
Occupation (S, P, T, B, R, OTHER occ); Vehicle Performance and Age (numeric codes).
An x marks the option chosen in each group; write-in occupations such as STUDENT
and ARTIST are entered under OTHER occ.]
Transcription can be made as and when the edited instrument is ready for
processing. Once all schedules/questionnaires have been transcribed, the frequency
tables can be constructed straight from the worksheet. Other methods of manual
transcription involve adoption of sorting strips or cards.
Earlier, data entry and processing were done through mechanical and semi-automatic
devices such as key punches using punch cards. The arrival of computers has
changed the data processing methodology altogether.
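Constructing a frequency table straight from a transcribed worksheet can be sketched as below. The four worksheet rows are invented for illustration, with codes following Table 13.1; the function name is likewise an assumption.

```python
from collections import Counter

# Hypothetical transcribed worksheet: one row per observation, coded as in Table 13.1.
worksheet = [
    {"sl_no": 1, "owner": 2, "occupation": "S", "performance": 4},
    {"sl_no": 2, "owner": 1, "occupation": "P", "performance": 5},
    {"sl_no": 3, "owner": 2, "occupation": "S", "performance": 4},
    {"sl_no": 4, "owner": 2, "occupation": "B", "performance": 3},
]

def frequency_table(rows, variable):
    """Count how often each code of one variable occurs across all observations."""
    return Counter(row[variable] for row in rows)

occupation_counts = frequency_table(worksheet, "occupation")
for code, count in sorted(occupation_counts.items()):
    print(code, count)
```

The same function can be applied to any other column of the worksheet, which is why a single transcription serves several summary tables.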
Check Your Progress - 2
1. What is a data structure?
2. What is the next step in data processing after identification of data?
13.4 SUMMARY
• The quality of the results obtained from statistical data, and hence their
usefulness for managerial decision-making, depends upon the quality of the
information collected.
• It is important that a sound investigative process be established to ensure
that the data are highly representative and highly unbiased.
• Before any procedures for data collection are established, the purpose and
the scope of the study must be clearly specified.
• The scope of the study must take into consideration the field to be
covered, and the time period in which to conduct the study.
• The first-hand information obtained by the investigator is bound to be more
reliable and accurate since the investigator can extract the correct
information by removing doubts, if any, in the minds of the respondents
regarding certain questions.
• Where the investigator and informant talk face to face, it becomes possible
to explore questions in depth.
• In indirect personal interview method, instead of directly approaching the
informants, the investigator interviews several third persons who are
directly or indirectly concerned with the subject matter of the enquiry and
who are in possession of the requisite information.
• The committee selects persons known as witnesses and collects information
from them by getting answers to questions decided in advance.
• Under the mailed questionnaire method, the investigator prepares a
questionnaire containing a number of questions pertaining to the field of
enquiry.
• These questionnaires are sent by post to the informants together with a
polite covering letter explaining in detail the aims and objectives of
collecting the information, and requesting the respondents to cooperate by
furnishing the correct replies and returning the questionnaire duly filled in.
• Since the questionnaire is the only medium of communication between the
investigator and the respondents, it must be designed or drafted with
utmost care and caution so that all the relevant and essential information for
the enquiry may be collected without any difficulty, ambiguity or vagueness.
• Designing of questionnaire, therefore, requires a high degree of skill and
experience on the part of the investigator.
• No hard and fast rules can be laid down for designing or framing a
questionnaire.
• The size of the questionnaire should be as small as possible. The number of
questions should be kept to the minimum keeping in view the nature,
objectives and purpose of enquiry.
• Questions should be clear, brief, unambiguous, non-offending, courteous in
tone, corroborative in nature and to the point.
• The questionnaire should be tried on a small group before using it for the
given enquiry.
• There are a number of national organizations and international agencies
which collect and publish statistical data relating to business, trade, labour,
price, consumption, production, etc.
• Secondary data should be used with extra caution since they have been
collected by someone other than the investigator.
• The sampling error would be smallest if the sample size is large relative to
the population and vice versa.
• The editing of the data is not a complex task, but it requires an
experienced, knowledgeable and talented person to do so.
• The first step of editing or correction of data is to check whether there is an
answer to each of the questions/variables set out in the data set. If there are
any omissions, the researcher sometimes is able to deduce the correct
answer from other related data on the same instrument.
• The final step in the editing of data is to maintain a log of all corrections
that have been carried out at this stage.
13.5 KEY WORDS
• Census: It is a periodic count of the population that the government takes
to collect the updated statistics of the population.
• World Bank: It is an international financial institution that provides loans
to developing countries for capital programs.
• Telephone survey: It is a kind of survey where the investigator, instead of
presenting himself before the informants, contacts them on telephone and
collects information from them.
• Indirect personal interview: It is a method of interview where the
investigator interviews several third persons who are directly or indirectly
concerned with the subject matter of the enquiry and who are in possession
of the requisite information.
13.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The editing steps check for the completeness, accuracy and uniformity of
the data as created by the researcher.
2. The final step in the editing of data is to maintain a log of all corrections
that have been carried out at this stage.
Check Your Progress - 2
1. A data structure is a dynamic collection of related variables and can be
conveniently represented as a graph where nodes are labeled by variables.
2. The next step in data processing after initial identification of data structure
and the editing/ correction of data is to codify and classify the data.
13.7 SELF-ASSESSMENT QUESTIONS
1. What do you understand by collection of data?
2. List the methods of collecting primary data.
3. Discuss the merits and demerits of direct personal observation.
4. What do you mean by telephone survey? Discuss.
5. List the process of drafting the questionnaire.
6. Define a specimen questionnaire.
7. Discuss the term classification of data.
8. What should be the precautions in the use of secondary data?
9. Discuss the chief sources of secondary data.
13.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Tabular Presentation
UNIT–14
TABULAR PRESENTATION
Objectives
After going through this unit, you will be able to:
• Explain the concept of tabular presentation and the types of tables
• Discuss the components of a table
• Analyse the framing of tables
• Describe the concept of statistical tables
• Differentiate between classification and tabulation
Structure
14.1 Introduction
14.2 Tabulation of Data
14.3 Classification and Tabulation
14.4 Summary
14.5 Key Words
14.6 Answers to ‘Check Your Progress’
14.7 Self-Assessment Questions
14.8 Further Readings
14.1 INTRODUCTION
This unit will introduce you to tabulation, its concepts and objectives. Tabulation
refers to the arrangement of data in appropriate tables. Tables are generally one of
the following types: single-column or single-row tables, and multiple-column or
multiple-row tables. The components of a table include table number, title of the
table, headnotes, footnotes and sources. A statistical table, which is an orderly and
systematic presentation of numerical data in columns and rows, also has the same
components.
14.2 TABULATION OF DATA
Tabular presentation means tabulating the data in the form of appropriate tables. A
statistical table contains data arranged into a convenient number of rows and/or
columns. Classifying (or distributing) the data across these rows and columns brings
the broad features of the data to the fore so that they can be seen at a glance.
The basic function of a table is to simplify data and to present them in a manner
that facilitates comparison. Simplifying data means that the information desired
becomes easy to locate. Comparison involves bringing all related data together at
one place such that a relational picture can be conveniently and efficiently drawn.
Types of Tables
Statistical tables can be laid out in various ways. The form of a table must suit the
data at hand and be convenient to achieve the objective(s) in mind. Generally, a table
is of the following types:
(i) Single-column or single-row tables: Such tables are the simplest to
construct. The data in these tables are arranged in a single row or a single
column according to time, place, region of space, or an attribute of interest.
The table is vertically laid when the data are arranged in a single column,
and horizontally laid when the data are arranged in a single row.
In fact, laying the table either vertically or horizontally means the same thing.
How the available space allows the table to be laid is perhaps the only
important consideration in deciding between the two. A horizontally laid table
obviously consumes less space. Otherwise, these two ways of tabulating
data constitute essentially a single type of table.
(ii) Multiple-column and multiple-row tables: As against single-column and
single-row tables, the given data on a variable may also be arranged in
multiple columns and rows. The data break-up and the kind of relational
comparative picture intended determine the number of columns and rows
required. If the number of rows is represented by r and of columns by c,
such tables are known as ‘r by c’ tables.
The intersection of each row with each column makes a cell. This means
that any ‘r by c’ table consists of r × c cells. A table so constructed is
known as a cross-classification table, whose format is shown in Table 14.1.
It shows the following:
(a) There are three columns and four rows, with 4 × 3 = 12 cells
comprising the body of the table, each containing a figure.
(b) While columns describe one characteristic of the data, rows describe
the other.
(c) Either columns or rows may represent time, place, region of space, or
some other attribute of the data.
Table 14.1
..... Title .....
..... Head Note .....
Stub              Box Head/Caption
                  Col. Head     Col. Head     Col. Head
Stub/Row Head     Cell          Cell          Cell
Stub/Row Head     Cell          Cell          Cell
Stub/Row Head     Cell          Cell          Cell
Stub/Row Head     Cell          Cell          Cell
Footnote(s): ................
Source(s): ....................
(iii) Reference vs summary tables: Yet another classification of data may
yield a statistical table known as a reference table or a summary table.
Here the criterion of data classification is the quantum of information that a
table contains.
Reference tables are the ones which present extensive information on any
subject. For all practical purposes, these tables are the repository of basic
data and work almost as data inventory. They are primarily meant for
referencing, and serve as source material for summary tables. Accordingly,
these tables are also known as basic or source tables.
For example, all data published in the Annual Survey of Industries are in
the form of reference tables, which offer all relevant data on industries in
detail.
Summary tables, on the contrary, provide only summarized data on one or
more related aspects on a given subject. These are drawn from reference
tables and are usually displayed in the course of running text. Usually, they
are meant to be used as necessary support to inferences drawn in the text
of the report. Accordingly, these are also known as text or analytical tables.
Their basic function is to highlight comparisons and reveal possible
relationships.
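An ‘r by c’ cross-classification table of the kind described under type (ii) can be sketched as follows. The regions, years and sales figures are entirely hypothetical; they are chosen only so that the table has four rows and three columns, like the format of Table 14.1.

```python
# Hypothetical sales figures (in thousand rupees) by region and year, for illustration only.
records = [
    ("North", 2021, 410.5), ("North", 2022, 455.0), ("North", 2023, 470.2),
    ("South", 2021, 380.0), ("South", 2022, 402.3), ("South", 2023, 431.8),
    ("East", 2021, 295.4), ("East", 2022, 310.0), ("East", 2023, 342.6),
    ("West", 2021, 350.1), ("West", 2022, 365.7), ("West", 2023, 390.9),
]

row_heads = sorted({region for region, _, _ in records})   # stub entries
col_heads = sorted({year for _, year, _ in records})       # column heads

# Each (row, column) pair identifies one cell of the body.
body = {(region, year): value for region, year, value in records}

r, c = len(row_heads), len(col_heads)
print(f"{r} by {c} table with {r * c} cells")
for region in row_heads:
    print(region.ljust(8), *(f"{body[(region, year)]:8.1f}" for year in col_heads))
```

Here the rows describe one characteristic (region) and the columns the other (year), and the body holds one figure per cell.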
Components of a Table
Components of a table are functional parts that constitute the structure of the table.
Almost invariably, there are eight (8) components of a statistical table. Each of these
may be understood with reference to the typical format of Table 14.1.
• Table number: A table must be appropriately numbered, to allow making
references and citing results. It makes sense to relate the numbering of the
tables with serial number of the chapter. Generally, the digit occurring
before the dot (.) indicates the chapter where the table appears, and the
digit appearing after the dot (.) tells the serial number of the table in the
given chapter.
• Title of the table: Each table has to be given a suitable title. The title
should be so framed and stated that it briefly tells all about the data
tabulated. The title should be short, yet as complete and self-explanatory as
possible. It must also convey the subject, time and place to which the data
contained in the table refer.
• Headnotes: A head note figures immediately below the title. It either offers
some additional information about the title and/or qualifies the data
presented in the table. For example, if the data are expressed in thousand
dollars, it is mentioned as a head note. Importantly, a head note is a
qualifier usually provided in brackets.
• Stub and stub-head: Stub refers to the main heading of rows, while the
stub-heads/entries occur as row-headings against which data entries are
made. The stub of a table consists of as many stub-heads as the number of
rows.
• Box head and sub-heads: The box head describes the data provided in
various columns. It is also called caption, being the title under which the
column heads are provided. Since sub-heads specify the data occurring
under various columns, these are parts of the caption and are provided
thereunder.
• Body of the table: The body of the table consists of a number of cells,
each containing a figure called cell entry. The body contains r x c cells,
and thus equal number of cell entries. Each cell occurs at the intersection of
a column and a row.
• Footnote(s): A footnote provides additional information, if any, about the
functional parts of a table. Generally, it is by way of some clarification(s)
that may be necessary about an entry made in the table. Or it may be by
way of a qualification to the data presented in rows and/or columns.
• Source(s): A source mentions where the data presented in the table have
come from. This is an important component of the table, since the source
enables the reader to check the data against the place from which they have
been borrowed. It may also help draw, if relevant and necessary,
more information from the source. The source must give full information
about itself, such as the publication, place and year of publication, and the
page(s) and table(s) where the concerned data appear.
How to Frame Tables
There are no hard and fast rules governing how to frame a statistical table. It all
depends on the kind of data available and the objective(s) one wishes to achieve.
Experience of having been engaged in research is perhaps the only important factor
that plays a decisive role in framing a table of high interpretative value.
There are, however, a few guiding points that do help construct a useful table.
• Where any two sets of data are to be compared, these should preferably
be presented in columns. Column presentation of related data offers a
more vivid comparative picture than when the same data are laid in rows.
• Where some figures in a column or a row need to be brought into focus,
they may be set in bold. For example, totals and sub-totals deserve more
attention in drawing a comparison. Bold type lets the desired data be easily
noticed and distinguished, and so improves the value of the data
presentation.
• Where availability of space is a constraint in deciding the size of a table, it
should be so designed that the available space accommodates the table
with all the information it is supposed to contain. Space limitation may at
times be serious enough to require abridging the table either horizontally or
vertically. Abridging must, however, ensure that the basic information
needed for analysis, drawing inferences, and/or establishing relationship(s)
is not lost in the process of reducing size. Otherwise, a table will suffer a
serious handicap in achieving the objective(s) in mind.
• Where reducing the space requirement of a table is unavoidable, a useful
way of doing so is to appropriately round off the figures, following the
basic rules of rounding. Long figures expressed in many digits can be easily
made short by expressing them in, say, thousands or millions, as may be
necessary. Similarly, decimal figures can also be suitably adjusted to the
desired number of decimal places.
While the points made above do matter in constructing a meaningful statistical
table, the basic ground rule is no different from applying common sense and
imagination, keeping the use requirements in mind. Any table that we may intend to
construct and lay should generally be an intelligent display of data so as to be
conveniently read and understood.
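The point about shortening long figures can be illustrated with a small sketch that expresses figures in thousands. It assumes the common half-up rounding rule (a 5 in the first dropped digit rounds upward); the function name and the sample figures are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def in_thousands(value, places=1):
    """Express a figure in thousands, rounded half-up to the given decimal places."""
    quantum = Decimal(10) ** -places              # e.g. places=1 -> 0.1
    return (Decimal(value) / Decimal(1000)).quantize(quantum, rounding=ROUND_HALF_UP)

# Hypothetical figures, for illustration only.
figures = [1234567, 98750, 450]
print([str(in_thousands(v)) for v in figures])
```

When figures are shortened in this way, the unit (for example, ‘figures in thousands’) belongs in the head note of the table so that the reader is not misled.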
A Contingency Table
A contingency table is an important form of presenting observed data. It is amenable
to the application of a number of useful statistical tools of data analysis. It follows
largely the same format as that of Table 14.1. Running into r number of rows and c
number of columns, there are ‘r x c’ cell entries which make the body structure of
the table.
Consider, for example, Table 14.2, which gives the distribution of 2,000
collegiate students according to sex and economic status. As a contingency table, it
deviates from the normal multi-column and multi-row format of Table 14.1 as under:
Table 14.2
Classification of 2000 Collegiate Students According to Sex and Economic Status
(A 2 × 3 Contingency Table)
                                  Economic Status
Sex              High Income    Average Income    Low Income    Row Totals
                 Means          Means             Means
Boys             120            700               380           1200
Girls            80             500               220           800
Column Totals    200            1200              600           2000
• A contingency table is a cross-section presentation of observed data in
terms of any two attributes. Here, one is sex and the other economic status.
Importantly, the data presented in any such table are the observed
frequency, and not the continuous quantitative data on a variable.
• The data appearing as cell entries in a contingency table are essentially
qualitative count data. To be more specific, the cell entries are observed
frequencies/counts of an item or the outcome of an event possessing or not
possessing a certain attribute.
• In addition to the cell entries being determined as ‘r x c’, the last column in
a contingency table provides row totals and the last row gives column
totals. At the intersection of the last column (for row totals) and the last row
(for column totals) lies a cell containing the total number of frequencies, or
the number of subjects or objects/items observed in terms of the two
attributes of interest. The row totals and column totals are known as
marginal frequencies.
A look at Table 14.2 shows that the data provided in the cells are count data.
The row totals and column totals both add to 2000, the total number of students
observed. The last column presents the row totals and the last row the column totals.
All this is unlike a normal cross-classification table, where the data are the
measurements of a continuous quantitative variable.
The cell frequencies in a contingency table are amenable to meaningful
interpretations. For example, the first cell frequency (that is, 120) means that out of
all the 2000 collegiate students there are 120 boys who have high-income means.
Similarly, among the 200 collegiate students out of 2000 who have high-income
means, 120 are boys and 80 are girls, and so on. An important point that must weigh
heavily in the construction of a contingency table is that the two classifying
attributes are clearly and objectively defined. This helps in stating the various
column heads and row heads in unambiguous terms as to their meaning and coverage.
Any ambiguity in defining the attributes and, consequently, the column and row heads
seriously erodes an objective classification of the observed data. It also does not
allow the cell frequencies to offer precise and meaningful interpretations.
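The marginal frequencies of Table 14.2 can be checked with a short sketch. The cell frequencies are those of the table; the variable names are assumptions made for illustration.

```python
# Observed frequencies from Table 14.2 (rows: sex, columns: economic status).
col_heads = ["High", "Average", "Low"]
table = {
    "Boys": [120, 700, 380],
    "Girls": [80, 500, 220],
}

row_totals = {sex: sum(counts) for sex, counts in table.items()}
col_totals = [sum(col) for col in zip(*table.values())]
grand_total = sum(row_totals.values())

# Both sets of marginal frequencies must add up to the same grand total.
assert grand_total == sum(col_totals)

print(row_totals)
print(dict(zip(col_heads, col_totals)))
print(grand_total)
```

A check of this kind, that the row totals and the column totals agree, is a quick way of detecting transcription errors in the cell entries.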
Statistical Tables
A statistical table is an orderly and systematic presentation of numerical data in
columns and rows. Columns are vertical arrangements; rows are horizontal. The
main objective of a statistical table is to so arrange the physical presentation of
numerical facts that the attention of the reader is automatically directed to the
relevant information. Some of the main advantages of tabular presentation over
descriptive statements are as follows:
• Tabulated data can be more easily understood than facts stated in the form
of descriptions.
• They leave a lasting impression.
• They facilitate quick comparison.
• Statistical tables make easier the summation of items and detection of
errors and omissions.
• When data are tabulated all unnecessary details and repetitions are avoided.
• A tabular arrangement makes it unnecessary to repeat explanations,
phrases and headings.
Parts of Tables
The following parts must be present in all tables:
• Title
• Caption
• Stubs
• Body
There are, however, other parts whose presence depends upon the specific
purpose. They are Headnote (or prefatory note), footnote and source note.
• Title: A complete title explains in brief and concise language (a) what the
data are, (b) where the data are, (c) time period of data and (d) how the
data are classified.
• Captions: The title of the columns are given in captions. In case there is a
sub-division of any column, there would be sub-caption headings also.
• Stubs: The titles of the rows are called stubs. The box over the stub on the
left of the table gives description of the stub contents, and each stub labels
the data found in its row of the table.
• Body: The body of the table contains the numerical information.
• Headnote (or prefatory note): It is a statement, given below the title, which
clarifies the contents of the table.
• Footnote: It is a statement which clarifies some specific items given in the
table or explains the omission thereof. Thus, in a table giving
yearly figures of wheat production in India, a footnote may explain that the sudden fall in the figure for
1947 arises because the figures from that year relate to India after partition.
• Source: The source from where the data contained in the table has been
obtained should be stated. This would permit the reader to check the
figures and gather, if necessary, additional information.
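Table 14.3
Title (Description of Units and Year, Place etc.)
Headnote
| (Stub box) | (A) Caption: (1) | (A) Caption: (2) | (B) Caption: (3) | (B) Caption: (4) | (D) |
|---|---|---|---|---|---|
| Stub X | | | | | |
| Stub Y | | (Body) | | | |
| Stub Z | | | | | |
| Total | | | | | |
Notes:
Any definition.
Any explanation.
Source from which derived.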
Types of Tables
Tables may be classified according to the number of characteristics used for
tabulation. A simple or one-way table uses only one characteristic against which the
frequency distribution is given, as in Table 14.4, where the characteristic used is the
age of the student.
Table 14.4 Age Wise Distribution of the Students of a College
| Age in Years | Students |
|---|---|
| 16–17 | – |
| 17–18 | – |
| 18–19 | – |
In a two-way table, on the other hand, two characteristics are used. In this case,
one characteristic is taken as column headings, and the other as row stubs. Example
of a two-way table showing a two-way frequency distribution is shown in Table
14.5.
Table 14.5 Age and Sex Wise Distribution of the Students of a College
| Age in years | Students: Male | Students: Female | Total |
|---|---|---|---|
| 16–17 | | | |
| 17–18 | | | |
| 18 and on | | | |
When it is desired to represent three or more characteristics in a single table,
such a table is called a higher order table. Thus, if it is desired to represent the ‘age’,
‘sex’ and ‘course’ of the students, the table would take the form shown in Table 14.6
and would be called a higher order table.
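Table 14.6 Table Showing Distribution of the Students of a College
According to ‘Age’, ‘Sex’, and ‘Course’
| Age in Years | Arts: Male | Arts: Female | Science: Male | Science: Female | Commerce: Male | Commerce: Female | Total |
|---|---|---|---|---|---|---|---|
| 16–17 | | | | | | | |
| 17–18 | | | | | | | |
| 18 and Over | | | | | | | |
| Total | | | | | | | |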
Example 14.1: Draft a form of tabulation to show:
(a) Sex,
(b) Three ranks–supervisors, assistants and clerks,
(c) Years–1970 and 1979
(d) Age group–18 years and under, over 18 but less than 55 years, over 55
years.
Solution: In this question, we have to prepare a table showing four
characteristics: sex, the three ranks of the employees, the two
years and the age groups given here. We
can prepare a blank table to incorporate all these characteristics (Table 14.7).
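Table 14.7 Table Showing the Division of Three Ranks of Employees According to Sex and
Age Group for 1976 and 1979
| Sex | Age Group | 1976: Supervisors | 1976: Assistants | 1976: Clerks | 1976: Total | 1979: Supervisors | 1979: Assistants | 1979: Clerks | 1979: Total |
|---|---|---|---|---|---|---|---|---|---|
| Males | 0–18 | | | | | | | | |
| | 18–55 | | | | | | | | |
| | 55 and above | | | | | | | | |
| | Total | | | | | | | | |
| Females | 0–18 | | | | | | | | |
| | 18–55 | | | | | | | | |
| | 55 and above | | | | | | | | |
| | Total | | | | | | | | |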
Example 14.2: The city of Timbuktu was divided into three areas: the administrative
district, other urban districts and rural districts. A survey of housing conditions was
carried out and the following information was gathered:
There were 6,77,100 buildings, of which 1,76,100 were in rural districts. Of the
buildings in other urban districts, 4,06,400 were inhabited and 4,500 were under
construction. In the administrative district, 4,000 buildings were uninhabited and 500
were under construction out of a total of 61,600.
The total number of buildings under construction in the city is 6,200 and the number
uninhabited is 44,900.
Tabulate the above information so as to give the maximum possible information.
How many buildings are under construction in rural areas?
Solution:
Table 14.8 Distribution of Buildings in the Three Districts of
Timbuktu According to Inhabitation (in hundreds)
| District | Inhabited | Uninhabited | Under Construction | Total |
|---|---|---|---|---|
| Administrative | 571 | 40 | 5 | 616 |
| Other Urban | 4064 | 285 | 45 | 4394 |
| Rural | 1625 | 124 | 12 | 1761 |
| Total | 6260 | 449 | 62 | 6771 |
The table clearly shows that there are 1,200 buildings under construction in rural
areas.
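The rural figures are not stated in the question; they follow by subtraction from the totals. The short Python sketch below retraces that arithmetic. It assumes, as Table 14.8 does, that the 4,000 buildings quoted for the administrative district are the uninhabited ones; the variable names are illustrative only.

```python
# Figures stated in Example 14.2 (actual numbers of buildings).
total_buildings = 677_100
rural_total = 176_100
# Assumption: the 4,000 administrative buildings are uninhabited (as in Table 14.8).
admin = {"uninhabited": 4_000, "under_construction": 500, "total": 61_600}
other_urban = {"inhabited": 406_400, "under_construction": 4_500}
city_under_construction = 6_200
city_uninhabited = 44_900

# Missing cells follow by subtraction from row and column totals.
admin["inhabited"] = admin["total"] - admin["uninhabited"] - admin["under_construction"]
other_urban["total"] = total_buildings - rural_total - admin["total"]
other_urban["uninhabited"] = (
    other_urban["total"] - other_urban["inhabited"] - other_urban["under_construction"]
)

rural = {"total": rural_total}
rural["under_construction"] = (
    city_under_construction - admin["under_construction"] - other_urban["under_construction"]
)
rural["uninhabited"] = city_uninhabited - admin["uninhabited"] - other_urban["uninhabited"]
rural["inhabited"] = rural["total"] - rural["uninhabited"] - rural["under_construction"]

print(rural)
```

Dividing each result by 100 gives the corresponding entry of the table, which is expressed in hundreds.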
Example 14.3: An investigation conducted by the education department in a public
library revealed the following facts. You are required to tabulate the information as
neatly and clearly as you can.
‘In 1960, the total number of readers was 46,000 and they borrowed some
16,000 volumes. In 1965, the number of books borrowed increased by 4,000 and
the borrowers by 50 per cent.’
The classification was on the basis of three sections: Literature, Fiction and
Illustrated News. There were 10,000 and 30,000 readers in the Literature
and Fiction sections, respectively, in the year 1960. In the same year, 2,000 and 10,000
books were lent in the Illustrated News and Fiction sections, respectively.
Marked changes were seen in 1965. There were 7,000 and 42,000 readers in the
Literature and Fiction sections, respectively. Likewise, 4,000 and 13,000 books were
lent in the Illustrated News and Fiction sections, respectively.
Solution:
Table 14.9: Showing the Changes in the Number of Readers and
Type of Books in the Year 1975 as Compared to 1970.
| Types of books | 1970: Number of readers | 1970: Number of books borrowed | 1975: Number of readers | 1975: Number of books borrowed | Changes in 1975 over 1970: readers | Changes in 1975 over 1970: books borrowed |
|---|---|---|---|---|---|---|
| Fiction | 30,000 | 10,000 | 42,000 | 13,000 | +12,000 | +3,000 |
| Literature | 10,000 | 4,000 | 7,000 | 3,000 | –3,000 | –1,000 |
| Illustrated news | 6,000 | 2,000 | 20,000 | 4,000 | +14,000 | +2,000 |
| Total | 46,000 | 16,000 | 69,000 | 20,000 | +23,000 | +4,000 |
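The change columns are simply the later-year figure minus the earlier-year figure for each section. A minimal Python sketch of that arithmetic is given below; the dictionary layout and the helper name `changes` are assumptions made for illustration, while the figures are those of the example.

```python
# (earlier year, later year) figures for each section of the library.
readers = {
    "Fiction": (30_000, 42_000),
    "Literature": (10_000, 7_000),
    "Illustrated News": (6_000, 20_000),
}
books_borrowed = {
    "Fiction": (10_000, 13_000),
    "Literature": (4_000, 3_000),
    "Illustrated News": (2_000, 4_000),
}

def changes(table):
    """Return later-year minus earlier-year figure for every section."""
    return {section: later - earlier for section, (earlier, later) in table.items()}

reader_changes = changes(readers)
book_changes = changes(books_borrowed)
total_reader_change = sum(reader_changes.values())
total_book_change = sum(book_changes.values())

for section in readers:
    print(f"{section:17s} {reader_changes[section]:+8,d} {book_changes[section]:+7,d}")
print(f"{'Total':17s} {total_reader_change:+8,d} {total_book_change:+7,d}")
```

The totals agree with the question: readers rise by 50 per cent of 46,000 and books borrowed rise by 4,000.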
Example 14.4: Prepare a two-way frequency table and marginal frequency tables
for 25 values of the two variables x and y given below. Take class interval of x as
10–20, 20–30, etc., and that of y as 100–200, 200–300 etc.
| x | 12 | 24 | 33 | 22 | 44 | 37 | 26 | 36 |
| y | 140 | 256 | 360 | 470 | 470 | 380 | 280 | 315 |

| x | 55 | 48 | 27 | 57 | 21 | 51 | 27 | 42 |
| y | 420 | 390 | 440 | 390 | 590 | 250 | 550 | 360 |

| x | 43 | 52 | 57 | 44 | 48 | 48 | 52 | 41 | |
| y | 570 | 290 | 416 | 280 | 452 | 370 | 312 | 330 | 590 |
Solution:
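Table 14.10 Bivariate Frequency Table
| Y \ X | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 | 60–70 | Total |
|---|---|---|---|---|---|---|---|
| 100–200 | 1 | – | – | – | – | – | 1 |
| 200–300 | – | 2 | – | – | 2 | – | 4 |
| 300–400 | – | – | 3 | 5 | 2 | – | 10 |
| 400–500 | – | 2 | – | 2 | 2 | – | 6 |
| 500–600 | – | 2 | – | 1 | – | 1 | 4 |
| Total | 1 | 6 | 3 | 8 | 6 | 1 | 25 |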
Table 14.11 Marginal Distribution of X
| x | f |
|---|---|
| 10–20 | 1 |
| 20–30 | 6 |
| 30–40 | 3 |
| 40–50 | 8 |
| 50–60 | 6 |
| 60–70 | 1 |
| Total | 25 |
Table 14.12 Marginal Distribution of Y
| y | f |
|---|---|
| 100–200 | 1 |
| 200–300 | 4 |
| 300–400 | 10 |
| 400–500 | 6 |
| 500–600 | 4 |
| Total | 25 |
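The marginal distributions are the column totals and row totals of the bivariate table. As a minimal sketch, the Python below sums the cell frequencies of Table 14.10 (with empty cells entered as 0) in both directions; the list layout and names are illustrative assumptions.

```python
x_classes = ["10-20", "20-30", "30-40", "40-50", "50-60", "60-70"]
y_classes = ["100-200", "200-300", "300-400", "400-500", "500-600"]

# Cell frequencies of Table 14.10: one row per y class, one column per x class.
cells = [
    [1, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 2, 0],
    [0, 0, 3, 5, 2, 0],
    [0, 2, 0, 2, 2, 0],
    [0, 2, 0, 1, 0, 1],
]

# Row totals give the marginal distribution of y; column totals give that of x.
marginal_y = dict(zip(y_classes, (sum(row) for row in cells)))
marginal_x = dict(zip(x_classes, (sum(column) for column in zip(*cells))))

print("Marginal distribution of x:", marginal_x)
print("Marginal distribution of y:", marginal_y)
```

Both marginal distributions must add up to the same grand total, here 25, which is a convenient check on the tabulation.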
Example 14.5: In a trip organized by a college, there were 80 persons, each of whom paid
₹ 15.50 on average. There were 60 students, each of whom paid ₹ 16. Members of the
teaching staff were charged at a higher rate. The number of servants (all males) was six and
they were not charged anything. The number of ladies was 20 per cent of the total and there
was only one lady staff member. Tabulate this information.
Solution:
Total contribution = 80 × 15.50 = ₹ 1240.00
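Table 14.13 Showing Participants, Sex and Class wise
| Class | Males | Females | Totals | Contribution per Member (₹) | Total Contribution (₹) |
|---|---|---|---|---|---|
| Students | 45 | 15 | 60 | 16 | 960 |
| Teaching Staff | 13 | 1 | 14 | 20 | 280 |
| Servants | 6 | – | 6 | – | – |
| Totals | 64 | 16 | 80 | 15.50 | 1240 |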
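Every entry of the table can be derived from the facts given in the question. The following Python sketch sets out one possible order of derivation; the variable names are illustrative and not part of the original solution.

```python
# Facts stated in Example 14.5.
total_persons = 80
average_paid = 15.50
students = 60
student_rate = 16
servants = 6          # all male, not charged
female_staff = 1
female_share = 20     # per cent of all participants

# Totals and the teaching-staff row follow by subtraction.
total_contribution = total_persons * average_paid
teaching_staff = total_persons - students - servants
staff_contribution = total_contribution - students * student_rate
staff_rate = staff_contribution / teaching_staff

# Split by sex: servants are all male, so the remaining ladies are students.
females = total_persons * female_share // 100
males = total_persons - females
female_students = females - female_staff
male_students = students - female_students
male_staff = teaching_staff - female_staff

print(male_students, female_students, male_staff, female_staff, staff_rate)
```

The derived teaching-staff rate is higher than the student rate, as the question requires.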
Example 14.6: Prepare a bivariate frequency distribution for the following data:
| Marks in Law | 10 | 11 | 10 | 11 | 11 | 14 | 12 | 12 | 13 |
| Marks in Statistics | 20 | 21 | 22 | 21 | 23 | 23 | 22 | 21 | 24 |

| Marks in Law | 12 | 11 | 12 | 10 | 14 | 12 | 13 | 10 | 14 | 10 | 13 |
| Marks in Statistics | 23 | 22 | 23 | 22 | 22 | 20 | 24 | 23 | 24 | 23 | 24 |
Solution:
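| Marks in Law \ Marks in Statistics | 20 | 21 | 22 | 23 | 24 | Total |
|---|---|---|---|---|---|---|
| 10 | 1 | – | 2 | 2 | – | 5 |
| 11 | – | 2 | 1 | 1 | – | 4 |
| 12 | 1 | 1 | 1 | 2 | – | 5 |
| 13 | – | – | – | – | 3 | 3 |
| 14 | – | – | 1 | 1 | 1 | 3 |
| Totals | 2 | 3 | 5 | 6 | 4 | 20 |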
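A bivariate frequency table of this kind is obtained by counting how often each (Law, Statistics) pair of marks occurs. The sketch below does the tallying with Python's standard `collections.Counter`; it assumes the marks are paired in the order in which they are listed in the example.

```python
from collections import Counter

# Marks of the 20 students, paired in the order given in Example 14.6.
law = [10, 11, 10, 11, 11, 14, 12, 12, 13,
       12, 11, 12, 10, 14, 12, 13, 10, 14, 10, 13]
statistics = [20, 21, 22, 21, 23, 23, 22, 21, 24,
              23, 22, 23, 22, 22, 20, 24, 23, 24, 23, 24]

cell_counts = Counter(zip(law, statistics))   # frequency of each (law, statistics) pair
law_totals = Counter(law)                     # row totals
statistics_totals = Counter(statistics)       # column totals

columns = sorted(statistics_totals)
print("Law", *columns, "Total", sep="\t")
for law_mark in sorted(law_totals):
    row = [cell_counts.get((law_mark, s), 0) for s in columns]
    print(law_mark, *row, law_totals[law_mark], sep="\t")
print("Total", *(statistics_totals[s] for s in columns), len(law), sep="\t")
```

Cells that never occur are printed as 0 here, where the table in the text shows a dash.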
Check Your Progress - 1
1. What is the basic function of a table?
2. Which tables are known as ‘r by c’ tables?
3. What is the function of a headnote in a table?
14.3 CLASSIFICATION AND TABULATION
The aspects of classification and tabulation are discussed in detail here.
Difference between Classification and Tabulation
Classification and tabulation are both methods of summarizing data in statistics. They are
used to prepare data for further analysis or to draw inferences from the given data.
The two methods of summarizing data, and the
difference between them, are discussed below.
Classification of Data
Classification in statistics refers to the process of separating data into various
groups or classes on the basis of properties in the data set. For example, the
interests of a particular class or group can be separated on the basis of gender.
Classification condenses the raw data into forms suitable for statistical analysis,
removes complex data patterns and highlights the core features of the raw
data. After classification, the data can be compared or used to draw inferences. Classified
data can also reveal relationships or correlated patterns in the data.
Raw data is classified using four key characteristics: geographical,
chronological, qualitative and quantitative properties. Consider a data set
gathered for the analysis of the consumption of petrol per day around the world.
The consumption of petrol can be classified on the basis of countries and types of
vehicles. Here, geographical factors and vehicle types are the bases of
classification. A further, chronological classification can separate out older vehicles,
which have a higher rate of consumption. The maintenance and serviceability of the
vehicles can act as the qualitative basis of classification, and the gross average
claimed by the manufacturer can act as the quantitative basis for classification of the
consumption.
Tabulation of Data
Tabulation in statistics is a method of summarising data by systematically
arranging it into rows and columns. Tabulation is carried out to
investigate and compare data, to identify errors or omissions, to study a prevailing trend,
to simplify the raw data, to use space economically and to provide a
future reference.
The following are the components of a statistical table:
| Component | Description |
|---|---|
| Title | It is a brief explanation of the contents of the table. |
| Table Number | It is a number assigned to a table for easy identification. |
| Date | The date of creation of the table should be indicated. |
| Row Designations | Each row of the table is given a brief name, usually provided in the first column. Such a name is known as a “stub”, and the column is known as the “stub column”. |
| Column Headings | Each column is given a heading to explain the nature of the figures; these are known as “captions” or “headings”. |
| Body of the Table | Data is entered into the main body, which should be arranged so that each data item is easy to identify. Numeric values are often ordered in either ascending or descending order. |
| Unit of Measurement | The unit of measurement of the values in the table body should be indicated. |
| Sources | The table should state the primary and secondary sources of the data below the body of the table. |
| Footnotes and References | These are additional details for clarifying the contents of the table. |
Hence, in classification, data are separated and grouped based on a property
common to all values in a group, whereas in tabulation, data are arranged into columns
and rows based on their characteristics or properties. Tabulation generally emphasizes
the presentation aspects of the data, while classification is used as a means of
sorting data for further analysis.
Check Your Progress - 2
1. What is the common factor between classification and tabulation?
2. How is raw data classified?
3. What is the basic difference between tabulation and classification of data?
14.4 SUMMARY
• Tabular presentation means arranging data in the form of appropriate
tables. A statistical table contains data arranged into a convenient
number of rows and/or columns.
• The basic function of a table is to simplify data and to present them in a
manner that facilitates comparison. Simplifying data means that the
information desired becomes easy to locate.
• Single-column or single-row tables are the simplest to construct. The data
in these tables are arranged in a single row or a single column according to
time, place, region of space, or an attribute of interest. The table is
vertically laid when the data are arranged in a single column, and
horizontally laid when the data are arranged in a single row.
• As against single-column and single-row tables, the given data on a variable
may also be arranged in multiple columns and rows. The data break-up
and the kind of relational comparative picture intended determine the
number of columns and rows required.
• Summary tables, unlike reference tables, provide only summarized data on one or
more related aspects of a given subject. These are drawn from reference
tables and are usually displayed in the course of running text.
• A table must be appropriately numbered, to allow making references and
citing results. It makes sense to relate the numbering of the tables with serial
number of the chapter.
• A headnote appears immediately below the title. It offers some
additional information about the title and/or qualifies the data presented in
the table.
• The body of the table consists of a number of cells, each containing a figure
called a cell entry. The body contains r × c cells, and thus an equal number of
cell entries. Each cell occurs at the intersection of a column and a row.
• A source note states where the data presented in the table have come from.
This is an important component of the table, since it enables the
reader to check the data against the source from which they were
borrowed. It may also help draw, if relevant and necessary, more
information from the source.
• There are no hard and fast rules governing how to frame a statistical table.
It all depends on the kind of data available and the objective(s) one wishes
to achieve.
• Where availability of space is a constraint in deciding the size of a table, it
should be so designed that the available space accommodates the table
with all the information it is supposed to contain.
• A contingency table is an important form of presenting observed data. It is
amenable to the application of a number of useful statistical tools of data
analysis.
• The data appearing as cell entries in a contingency table are essentially
qualitative count data. To be more specific, the cell entries are observed
frequencies/counts of an item or the outcome of an event possessing or not
possessing a certain attribute.
• A statistical table is an orderly and systematic presentation of numerical
data in columns and rows. Columns are vertical arrangements; rows are
horizontal. The main objective of a statistical table is to so arrange the
physical presentation of numerical facts that the attention of the reader is
automatically directed to the relevant information.
• The following parts must be present in all tables:
o Title
o Caption
o Stubs
o Body
• Classification in statistics refers to the process of separating data into
various groups or classes on the basis of properties in the data set.
• Raw data is classified using four key characteristics: geographical, chronological, qualitative and quantitative properties.
For example, a data set on the consumption of petrol per day around the
world can be classified by country and by type of vehicle.
• Tabulation in statistics is a method of summarising data by
systematically arranging it into rows and columns. Tabulation is
carried out to investigate and compare data, to identify errors or omissions,
to study a prevailing trend, to simplify the raw data, to use
space economically and to provide a future reference.
14.5 KEY WORDS
• Headnote: It is a statement, given below the title, which clarifies the
contents of the table.
• Footnote: It is a statement which clarifies some specific items given in the
table or explains the omission thereof.
• Stubs: They are the titles of the rows.
• Reference tables: These tables present extensive information on any
subject; for all practical purposes, these tables are the repository of basic
data and work almost as data inventory.
14.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The basic function of a table is to simplify data and to present them in a
manner that facilitates comparison.
2. If the number of rows is represented by r and of columns by c, such tables
are known as ‘r by c’ tables.
3. A headnote either offers some additional information about the title or
qualifies the data presented in the table.
Check Your Progress - 2
1. Classification and tabulation are both methods of summarizing data in
statistics.
2. Raw data is classified using four key characteristics: geographical,
chronological, qualitative and quantitative properties.
3. Tabulation generally emphasizes the presentation aspects of the data,
while classification is used as a means of sorting data for further analysis.
14.7 SELF-ASSESSMENT QUESTIONS
1. What is tabular presentation of data? How does it facilitate comparison?
2. List the various types of tables.
3. Enumerate the components of a table.
4. What is the difference between footnote and headnote?
5. Discuss the steps involved in framing a table.
6. Differentiate between classification and tabulation of data.
7. What is classification of data? Why is it necessary to classify data? Give an
example where data is classified.
8. What are statistical tables? State the objectives of statistical tables. Also
list the advantages of tabular presentation of data.
14.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
UNIT–15
DIAGRAMMATIC AND GRAPHIC PRESENTATION
Objectives
After going through this unit, you will be able to:
• Explain the diagrammatic representation of data
• Analyse pictogram as a sign language
• Discuss graphic representation of data
• Differentiate between histograms, frequency polygons and ogives
Structure
15.1 Introduction
15.2 Diagrammatic and Graphic Presentation
15.3 Graphical Presentation
15.4 Summary
15.5 Key Words
15.6 Answers to ‘Check Your Progress’
15.7 Self-Assessment Questions
15.8 Further Readings
15.1 INTRODUCTION
This unit will introduce you to graphic representation of data. Graphical or pictorial
representation of data helps in giving a visual indication of magnitudes, groupings,
trends and patterns in the data. These also help facilitate comparisons between two
or more sets of data. Diagrammatic representations include bar diagrams, pie charts
and pictograms, whereas graphic representation includes histograms, frequency
polygons and cumulative frequency curves or ogives.
15.2 DIAGRAMMATIC AND GRAPHIC PRESENTATION
The data we collect can often be more easily understood for interpretation if it is presented
graphically or pictorially. Diagrams and graphs give visual indications of magnitudes,
groupings, trends and patterns in the data. These important features are more simply
presented in the form of graphs. Also, diagrams facilitate comparisons between two or
more sets of data.
The diagrams should be clear and easy to read and understand. Too much
information should not be represented through the same diagram; otherwise, it may
become cumbersome and confusing. Each diagram should include a brief and self-explanatory title dealing with the subject matter. The scale of the presentation should
be chosen in such a way that the resulting diagram is of appropriate size. The
intervals on the vertical as well as the horizontal axis should be of equal size;
otherwise, distortions would occur.
Diagrams are more suitable to illustrate discrete data, while continuous data is
better represented by graphs. The following are the diagrammatic and graphic
representation methods that are commonly used.
Diagrammatic representation
(i) Bar diagram
(ii) Pie chart
(iii) Pictogram
(i) Bar diagram: Bars are simply vertical lines where the lengths of the bars are
proportional to their corresponding numerical values. The width of the bar is unimportant
but all bars should have the same width so as not to confuse the reader of the diagram.
Additionally, the bars should be equally spaced.
Example 15.1: Suppose that the following were the gross revenues (in units of $100,000)
for a company XYZ for the years 1989, 1990 and 1991.
| Year | Revenue |
|---|---|
| 1989 | 110 |
| 1990 | 95 |
| 1991 | 65 |
Construct a bar diagram for this data.
Solution:
The bar diagram for this data can be constructed as follows with the revenues represented
on the vertical axis and the years represented on the horizontal axis.
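As a rough, text-only sketch of such a diagram, the Python below prints one bar per year whose length is proportional to the revenue. The bars are drawn horizontally for convenience, and the scale of one symbol per 5 revenue units is an arbitrary assumption, not part of the example.

```python
# Gross revenues of company XYZ (in units of $100,000), from Example 15.1.
revenues = {1989: 110, 1990: 95, 1991: 65}

units_per_symbol = 5  # assumed scale: one '#' stands for 5 revenue units

# Bar length is proportional to the value; all bars have the same thickness.
bars = {year: "#" * round(value / units_per_symbol) for year, value in revenues.items()}

for year, bar in bars.items():
    print(f"{year} | {bar} {revenues[year]}")
```

Because every bar uses the same scale and the same thickness, only the lengths differ, which is the property the text asks of a bar diagram.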
The bars drawn can be subdivided into components depending upon the type of information
to be shown in the diagram. This will be clear by the following example in which we are
presenting three components in a bar.
Example 15.2: Construct a subdivided bar chart for the three types of expenditures in
dollars for a family of four for the years 1988, 1989, 1990 and 1991, given as follows:
| Year | Food | Education | Other | Total |
|---|---|---|---|---|
| 1988 | 3000 | 2000 | 3000 | 8000 |
| 1989 | 3500 | 3000 | 4000 | 10500 |
| 1990 | 4000 | 3500 | 5000 | 12500 |
| 1991 | 5000 | 5000 | 6000 | 16000 |
Solution:
The subdivided bar chart would be as follows:
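In a subdivided bar, each component is stacked on top of the previous one, so the segment boundaries are running totals of the components. The Python sketch below computes where each segment starts and ends for every year; the stacking order (Food, Education, Other) and the names used are assumptions for illustration.

```python
from itertools import accumulate

# Expenditures in dollars, from Example 15.2.
expenditure = {
    1988: {"Food": 3000, "Education": 2000, "Other": 3000},
    1989: {"Food": 3500, "Education": 3000, "Other": 4000},
    1990: {"Food": 4000, "Education": 3500, "Other": 5000},
    1991: {"Food": 5000, "Education": 5000, "Other": 6000},
}

segments = {}
for year, parts in expenditure.items():
    tops = list(accumulate(parts.values()))   # running totals = upper edge of each segment
    bottoms = [0] + tops[:-1]                 # each segment starts where the previous one ends
    segments[year] = dict(zip(parts, zip(bottoms, tops)))

for year, parts in segments.items():
    print(year, parts)
```

The upper edge of the last segment in each bar equals the total expenditure for that year, so the full height of the bar shows the total while the segments show its components.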
(ii) Pie chart: This type of diagram enables us to show the partitioning of a
total into its component parts. The diagram is in the form of a circle and is
also called a pie because the entire diagram looks like a pie and the
components resemble slices cut from it. The size of the slice represents the
proportion of the component out of the whole.
Example 15.3: The following figures relate to the cost of the construction
of a house. The various components of cost that go into it are represented
as percentages of the total cost.
| Item | % Expenditure |
|---|---|
| Labour | 25 |
| Cement, Bricks | 30 |
| Steel | 15 |
| Timber, Glass | 20 |
| Miscellaneous | 10 |
Construct a pie chart for the above data.
Solution:
The pie chart for this data is presented as follows:
[Pie chart with slices: Labour 25%; Cement, bricks 30%; Steel 15%; Timber, glass 20%; Misc. 10%]
Pie charts are very useful for comparison purposes, especially when there
are only a few components. If there are too many components, it may
become confusing to differentiate the relative values in the pie.
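To draw a pie chart by hand, each percentage is converted into the angle of its slice at the centre of the circle, using angle = 360° × share ÷ total. The short Python sketch below applies this to the construction-cost data; the variable names are illustrative.

```python
# Percentage shares of construction cost, from Example 15.3.
shares = {
    "Labour": 25,
    "Cement, Bricks": 30,
    "Steel": 15,
    "Timber, Glass": 20,
    "Miscellaneous": 10,
}

total = sum(shares.values())

# The whole circle (360 degrees) represents the total, so each slice
# gets an angle in proportion to its share.
angles = {item: 360 * share / total for item, share in shares.items()}

for item, angle in angles.items():
    print(f"{item:15s} {shares[item]:3d}%  {angle:6.1f} degrees")
```

The angles of all slices add up to 360 degrees, which confirms that the components account for the whole.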
(iii) Pictogram: Pictogram means presentation of data in the form of pictures.
It is quite a popular method used by governments and other organizations
for informational exhibitions. Its main advantage is its visual appeal.
Pictograms stimulate interest in the information being presented.
News magazines are very fond of presenting data in this form. For
example, in comparing the strength of the armed forces of USA and
Russia, they will simply make sketches of soldiers where each sketch may
represent 100,000 soldiers. Similar comparisons for missiles and tanks are
also made.
Pictograms or pictographs are symbols belonging to a pictorial
graphic system. Pictographs originated in prehistoric drawings on ancient
rocks, each signifying an object or thing by depicting it. A pictogram is meant to convey,
share or represent an idea or concept. It conveys meaning
without words, through its diagrammatic representation, and is
generally used in graphic systems and writings whose characters
appear in a pictorial form. Combined with signs for sounds, pictographs
also formed a base for cuneiform and even hieroglyphic writing.
Better known as ‘icons’, pictograms have been popularised by the widespread use
of software. Today the term is used widely and casually for the
many icons that represent things. Mobile devices and computers have played the major role
in familiarising people with pictograms.
Pictograms are often used in writing, in citing references, on sign boards and in
graphical systems where the characters represent natural objects
and, to a considerable extent, resemble them pictorially. They are used in
various fields such as leisure, tourism and geography.
Herbert W. Kapitzki (Professor of Visual Communications, University of Arts,
Berlin) defines the pictogram by its formal quality and abstractness. According to
him, a pictogram is an iconic sign that depicts the character of what is being
represented and through abstraction takes on its quality as a sign.
Otl Aicher (Ulm College of design) states that the pictogram must have the
character of a sign and should not be an illustration.
Pictogram: Sign Language
The pictogram is a friendly visual language developed for all classes of people,
including those who cannot speak, read or write.
• Pictograms can help one understand without help.
• With a pictogram representation, one can ask questions and get replies.
A few examples of pictograms are shown below:
Fig. 15.1 A Common Utility Pictogram Chart
Source: http://www.scratchinginfo.net/wp-content/uploads/2013/04/Modern-Pictograms.png
Fig. 15.2 A Pictogram Chart of Daily Use Signs
Source: http://kudesign.co.nz/studio/wp-content/uploads/pictograms.jpg
Check Your Progress - 1
1. What is the need for graphical or pictorial presentation of data?
2. What are bar diagrams?
3. Name the chart that shows the partitioning of a total into component parts.
15.3 GRAPHICAL PRESENTATION
Graphical presentation and its classification are discussed here:
Graphical Presentation: Histogram, Frequency Polygon and Ogive
Graphic representation can be classified into the following:
(i) Histogram
(ii) Frequency polygon
(iii) Cumulative frequency curve (Ogive)
Each of these is briefly explained and illustrated.
(i) Histogram: A histogram is a graphical description of data and is
constructed from a frequency table. It displays the distribution of
a data set and is used for statistical as well as mathematical calculations.
The word histogram is derived from the Greek words histos, which means
‘anything set upright’, and gramma, which means ‘drawing’, ‘record’ or
‘writing’. It is considered the most important basic tool of the statistical
quality control process.
In this type of representation, the given data are plotted in the form of a
series of rectangles. Class intervals are marked along the X-axis and the
frequencies along the Y-axis according to a suitable scale. Unlike the bar
chart, which is one-dimensional, meaning that only the length of the bar is
important and not the width, a histogram is two-dimensional in which both
the length and the width are important. A histogram is constructed from a
frequency distribution of a grouped data where the height of the rectangle is
proportional to the respective frequency and the width represents the class
interval. Each rectangle is joined with the other and any blank spaces
between the rectangles would mean that the category is empty and there
are no values in that class interval.
As an example, let us construct a histogram for our example of ages of 30
workers. For convenience sake, we will present the frequency distribution
along with the mid-point of each interval, where the mid-point is simply the
average of the values of the lower and upper boundary of each class
interval. The frequency distribution table is shown as follows:
| Class Interval (Years) | Mid-point | Frequency (f) |
|---|---|---|
| 15 and up to 25 | 20 | 5 |
| 25 and up to 35 | 30 | 3 |
| 35 and up to 45 | 40 | 7 |
| 45 and up to 55 | 50 | 5 |
| 55 and up to 65 | 60 | 3 |
| 65 and up to 75 | 70 | 7 |
The histogram of this data would be shown as follows:
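[Figure: histogram of the ages of 30 workers, with class intervals on the X-axis and frequencies (5, 3, 7, 5, 3, 7) on the Y-axis]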
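As a rough sketch (not part of the original text), the mid-points and a crude text version of this histogram can be produced in Python. The variable names `class_intervals`, `frequencies` and `midpoints` are illustrative only.

```python
# Frequency table for the ages of 30 workers, taken from the example above.
class_intervals = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75)]
frequencies = [5, 3, 7, 5, 3, 7]

# Mid-point of a class = average of its lower and upper boundaries.
midpoints = [(lo + hi) / 2 for lo, hi in class_intervals]

# Crude text histogram: one '#' per worker in each class interval.
for (lo, hi), f in zip(class_intervals, frequencies):
    print(f"{lo}-{hi:<3} {'#' * f} ({f})")
```

Because all classes here have the same width, the length of each row of `#` marks plays the role of the rectangle height.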
(ii) Frequency polygon: A frequency polygon is a line chart of a frequency
distribution in which either the values of discrete variables or the mid-points of
class intervals are plotted against the frequencies. These plotted points are
joined together by straight lines. Since the frequencies generally do not start
or end at zero, this diagram as such would not touch the horizontal
axis. However, since the area under the entire curve is the same as that of
a histogram, which is 100 per cent of the data presented, the curve can be
enclosed. The starting point is joined with a fictitious preceding point
whose value is zero, so that the curve starts at the horizontal axis, and
the last point is joined with a fictitious succeeding point whose value is also
zero, so that the curve ends at the horizontal axis. This enclosed diagram is
known as the frequency polygon.
We can construct the frequency polygon from the preceding table as
follows:
[Figure: frequency polygon joining the points (20, 5), (30, 3), (40, 7), (50, 5), (60, 3) and (70, 7)]
(iii) Cumulative frequency curve (Ogive): The cumulative frequency curve
or ogive is the graphic representation of a cumulative frequency distribution.
Ogives are of two types: the ‘less than’ ogive and the ‘greater than’
ogive. Both are constructed from the
following table for our example of 30 workers.
Class Interval    Mid-point   (f)   Cum. Freq.           Cum. Freq.
(Years)                             (Less Than)          (Greater Than)
15 and upto 25       20        5    5 (less than 25)     30 (more than 15)
25 and upto 35       30        3    8 (less than 35)     25 (more than 25)
35 and upto 45       40        7    15 (less than 45)    22 (more than 35)
45 and upto 55       50        5    20 (less than 55)    15 (more than 45)
55 and upto 65       60        3    23 (less than 65)    10 (more than 55)
65 and upto 75       70        7    30 (less than 75)    7 (more than 65)
(a) Less than ogive: In this case, less than cumulative frequencies are plotted
against upper boundaries of their respective class intervals.
(b) Greater than ogive: In this case, greater than cumulative frequencies are
plotted against the lower boundaries of their respective class intervals.
[Figure: ‘greater than’ (more than) ogive, with greater than cumulative frequency on the Y-axis]
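The two sets of cumulative frequencies, and the points at which each ogive is plotted, can be sketched in Python as follows. This is an illustration under the assumption that the class boundaries and frequencies are held in plain lists; the names are not from the original text.

```python
from itertools import accumulate

# Class boundaries and frequencies for the 30 workers.
lower = [15, 25, 35, 45, 55, 65]
upper = [25, 35, 45, 55, 65, 75]
frequencies = [5, 3, 7, 5, 3, 7]

# 'Less than' cumulative frequency: running total up to each upper boundary.
less_than = list(accumulate(frequencies))
total = less_than[-1]

# 'Greater than' cumulative frequency: items at or above each lower boundary.
greater_than = [total - below for below in [0] + less_than[:-1]]

# Points to plot: (upper boundary, less than) and (lower boundary, greater than).
less_than_ogive = list(zip(upper, less_than))
greater_than_ogive = list(zip(lower, greater_than))
print(less_than_ogive)
print(greater_than_ogive)
```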
These ogives can be used for comparison purposes. Several ogives can be
drawn on the same grid, preferably with different colours for easier visualization and
differentiation.
Although diagrams and graphs are a powerful and effective medium for presenting
statistical data, they can represent only a limited amount of information and are
not of much help when intensive analysis of data is required.
Solved Problems
Example 15.4: Standard tests were administered to 30 students to determine their
IQ scores. These scores are recorded in the following table.
120 115 118 132 135 125 122 140 137 127
129 130 116 119 132 127 133 126 120 125
130 134 135 127 116 115 125 130 142 140
(a) Arrange this data into an ordered array.
(b) Construct a grouped frequency distribution with suitable class intervals.
(c) Compute for this data:
• Cumulative frequency (<)
• Cumulative frequency (>)
(d) Compute:
• Relative frequency
• Cumulative relative frequency (<)
• Cumulative relative frequency (>)
(e) Construct for this data:
• A histogram
• A frequency polygon
• Cumulative relative ogive (<)
• Cumulative relative ogive (>)
Solution:
(a) The ordered array for this data is as follows:
115 115 116 116 118 119 120 120 122 125
125 125 126 127 127 127 129 130 130 130
132 132 133 134 135 135 137 140 140 142
(b) Let there be six groupings, so that the size of each class interval is five. The
frequency distribution is shown as follows:
Class Interval (CI)       Frequency (f)
115 to less than 120           6
120 to less than 125           3
125 to less than 130           8
130 to less than 135           7
135 to less than 140           3
140 to less than 145           3
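The tallying step can be checked with a short Python sketch. It assumes classes of the form ‘lower limit to less than upper limit’, as in the table; the names `scores`, `class_width` and `frequency` are illustrative.

```python
# The 30 IQ scores as given in the example.
scores = [
    120, 115, 118, 132, 135, 125, 122, 140, 137, 127,
    129, 130, 116, 119, 132, 127, 133, 126, 120, 125,
    130, 134, 135, 127, 116, 115, 125, 130, 142, 140,
]

class_width = 5
lower_limits = range(115, 145, class_width)  # 115, 120, ..., 140

# Count the scores falling in each class: lower <= score < lower + width.
frequency = {
    lo: sum(1 for s in scores if lo <= s < lo + class_width)
    for lo in lower_limits
}

for lo, f in frequency.items():
    print(f"{lo} to less than {lo + class_width}: {f}")
```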
(c) The required elements are computed in the following table.
Class Interval   (f)   Cum. Freq. (<)        Cum. Freq. (>)
115–120           6    6 (less than 120)     30 (more than 115)
120–125           3    9 (less than 125)     24 (more than 120)
125–130           8    17 (less than 130)    21 (more than 125)
130–135           7    24 (less than 135)    13 (more than 130)
135–140           3    27 (less than 140)    6 (more than 135)
140–145           3    30 (less than 145)    3 (more than 140)
(d) The computed values of relative frequency, cumulative relative frequency (<)
and cumulative relative frequency (>) are shown in the following table:
Class Interval     (f)   Rel. Freq.       Cum. Rel. Freq. (<)      Cum. Rel. Freq. (>)
115 and upto 120    6    6/30 or 20%      6/30 or 20% (<120)       30/30 or 100% (>115)
120 and upto 125    3    3/30 or 10%      9/30 or 30% (<125)       24/30 or 80% (>120)
125 and upto 130    8    8/30 or 26.7%    17/30 or 56.7% (<130)    21/30 or 70% (>125)
130 and upto 135    7    7/30 or 23.3%    24/30 or 80% (<135)      13/30 or 43.3% (>130)
135 and upto 140    3    3/30 or 10%      27/30 or 90% (<140)      6/30 or 20% (>135)
140 and upto 145    3    3/30 or 10%      30/30 or 100% (<145)     3/30 or 10% (>140)
Total = 30
(e) Before we construct the histogram and other diagrams, let us first
determine the midpoint (X) of each class interval.
Class Interval   (f)   Mid-point (X)
115–120           6       117.5
120–125           3       122.5
125–130           8       127.5
130–135           7       132.5
135–140           3       137.5
140–145           3       142.5
[Figure: a histogram]
[Figure: a frequency polygon]
[Figure: a cumulative frequency ogive (<)]
[Figure: a cumulative frequency ogive (>)]
Example 15.5: Construct a stem and leaf display for the data of IQ scores presented
in the preceding example.
Solution:
The IQ scores of the given thirty students are presented in an ordered array, as follows:
115 115 116 116 118 119 120 120 122 125
125 125 126 127 127 127 129 130 130 130
132 132 133 134 135 135 137 140 140 142
The stem would consist of the first two digits and the leaf would consist of the last digit.
Stem    Leaves
11      556689
12      00255567779
13      0002234557
14      002
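A stem and leaf display of this kind can be built mechanically. The sketch below is one possible way to do it in Python, assuming three-digit scores so that integer division by 10 gives the stem and the remainder gives the leaf; the names are illustrative.

```python
from collections import defaultdict

# The 30 IQ scores from the preceding example.
scores = [
    120, 115, 118, 132, 135, 125, 122, 140, 137, 127,
    129, 130, 116, 119, 132, 127, 133, 126, 120, 125,
    130, 134, 135, 127, 116, 115, 125, 130, 142, 140,
]

# Stem = first two digits, leaf = last digit; sorting keeps leaves in order.
stems = defaultdict(str)
for score in sorted(scores):
    stems[score // 10] += str(score % 10)

for stem in sorted(stems):
    print(stem, stems[stem])
```

The number of leaves on each stem equals the number of scores in that range, so the display doubles as a rough histogram turned on its side.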
Example 15.6: Suppose the Office of the Management and Budget (OMB) has
determined that the Federal Budget for 2008 would be utilized for proportionate
spending in the following categories. Construct a pie chart to represent this data.
Category                          Per cent Allocation
Direct benefit to individuals             40
State, local grants                       15
Military spending                         25
Debt service                              15
Misc. operations                           5
Total                                    100%
Solution:
The pie chart is presented as follows. Care must be taken so that the percentage
allocation of budget is represented by the appropriate proportion of the pie.
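Since a full circle is 360 degrees, each category's slice is its percentage share of 360. A minimal Python sketch of this conversion follows; the dictionary name `allocation` is an illustrative choice.

```python
# Per cent allocation for each budget category, from the table above.
allocation = {
    "Direct benefit to individuals": 40,
    "State, local grants": 15,
    "Military spending": 25,
    "Debt service": 15,
    "Misc. operations": 5,
}

# Central angle of each slice = percentage share of the full 360 degrees.
angles = {category: pct * 360 / 100 for category, pct in allocation.items()}

for category, angle in angles.items():
    print(f"{category}: {angle} degrees")
```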
Check Your Progress - 2
1. What is a histogram?
................................................................................................................
................................................................................................................
................................................................................................................
2. What is the graphic representation of a cumulative frequency distribution
called?
................................................................................................................
................................................................................................................
................................................................................................................
15.4 SUMMARY
• The data we collect can often be more easily understood for interpretation
if it is presented graphically or pictorially. Diagrams and graphs give visual
indications of magnitudes, groupings, trends and patterns in the data.
• The diagrams should be clear and easy to read and understand. Too much
information should not be represented through the same diagram;
otherwise, it may become cumbersome and confusing.
• Bars are simply vertical lines where the lengths of the bars are proportional
to their corresponding numerical values. The width of the bar is unimportant
but all bars should have the same width so as not to confuse the reader of
the diagram.
• A pie chart enables us to show the partitioning of a total into its
component parts. The diagram is in the form of a circle and is called a
pie because the entire diagram looks like a pie and the components
resemble slices cut from it.
• Pictogram means presentation of data in the form of pictures. It is quite a
popular method used by governments and other organizations for
informational exhibitions. Its main advantage is its attractive value.
Pictograms stimulate interest in the information being presented.
• Pictograms or pictographs are symbols of representation of the pictorial
graphic system. Pictographs originated from prehistoric drawings on ancient
rocks signifying an object or thing with its depiction. It is meant to convey,
share or represent an idea or concept.
• Better known as ‘icons’, pictograms have been popularised by the
widespread use of software. Today the term is used widely and casually
to cover the many icons that represent things.
• The Pictogram is a friendly visual language that is developed for all classes
of people and even those with no ability to speak, read or write.
• A histogram is a graphical description of data and is constructed from a
frequency table. It displays the shape of the distribution of a data set and is
used for statistical as well as mathematical calculations.
• The word histogram is derived from the Greek words histos, which means
‘anything set upright’, and gramma, which means ‘drawing, record or
writing’. It is considered the most important basic tool of the statistical
quality control process.
• A frequency polygon is a line chart of frequency distribution in which either
the values of discrete variables or mid-points of class intervals are plotted
against the frequencies. These plotted points are joined together by straight
lines.
• The cumulative frequency curve or ogive is the graphic representation of a
cumulative frequency distribution. Ogives are of two types: the ‘less than’
ogive and the ‘greater than’ ogive.
15.5 KEY WORDS
• Pie charts: They are basically circle charts, which are usually drawn for
component-wise per cent data.
• Component charts: These charts are meant for exhibiting the changes in
the components or parts of a given total in relative terms.
• Pictogram: These are symbols of representation of the pictorial graphic
system.
• Frequency polygon: It is a line chart of frequency distribution in which
either the values of discrete variables or mid-points of class intervals are
plotted against the frequencies.
15.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. Graphical or pictorial presentation of data makes the data easy to
understand and interpret.
2. Bar diagrams are simply vertical lines where the lengths of the bars are
proportional to their corresponding numerical values.
3. A pie chart shows the partitioning of a total into component parts.
Check Your Progress - 2
1. A histogram is the graphical description of data and is constructed from a
frequency table.
2. The graphic representation of a cumulative frequency distribution is called
an ogive.
15.7 SELF-ASSESSMENT QUESTIONS
1. What is graphical presentation of data? What should be taken care of while
presenting data graphically?
2. Discuss the diagrammatic representation of data.
3. Differentiate between a bar chart, pie chart and a pictogram. Explain the
primary differences between them and their utility.
4. How general is usage of the sign language? Give a few examples of
pictograms from your daily life.
5. Discuss graphic representation of data in detail. List the forms of graphic
representation.
6. What is a frequency polygon? When plotted on the horizontal and vertical
axis, why does the polygon not touch the horizontal axis? Explain with the
help of an example.
7. State how ‘less than ogive’ is different from ‘greater than ogive’?
15.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Concept of Central Tendency, Mean, Median, Mode,
and Geometric, Harmonic and Moving Averages
BLOCK-V
MEASURES OF CENTRAL TENDENCY, DISPERSION AND SKEWNESS
This block discusses the measures of central tendency, dispersion and skewness. The
concepts of central tendency, mean, median, mode and Geometric, Harmonic and Moving
Averages along with the methods of dispersion and skewness are discussed in this block.
This block consists of three units.
The sixteenth unit, as per this book, discusses the concept of central tendency. Central
tendency is the tendency for the values of a random variable to cluster round its mean, mode,
or median. Along with the basics and features of central tendency, the unit also discusses
mean, median and mode. Geometric, harmonic and moving averages are also covered in
this unit.
The seventeenth unit explains the measures of dispersion. Dispersion refers to the extent to
which values of a variable differ from a fixed value such as the mean. The measures of
dispersion can be expressed in an absolute form or in a relative form. The common measures
of dispersion, range and standard deviation are discussed in this unit.
The eighteenth unit discusses the measures of skewness. Skewness refers to a measure of
the asymmetry of the probability distribution of a real-valued random variable about its mean.
The skewness value can be either positive or negative, or it can even be undefined.
However, the qualitative interpretation of skewness remains complicated. The unit discusses
the measures, aspects and features of skewness in detail.
UNIT–16
CONCEPT OF CENTRAL TENDENCY, MEAN,
MEDIAN, MODE, AND GEOMETRIC, HARMONIC
AND MOVING AVERAGES
Objectives
After going through this unit, you will be able to:
• Discuss the measures of central tendency
• Describe the concepts of mean
• Analyse arithmetic mean of grouped data
• Assess the advantages and disadvantages of mean
Structure
16.1 Introduction
16.2 Measures of Central Tendency
16.3 Mean
16.4 Summary
16.5 Key Words
16.6 Answers to ‘Check Your Progress’
16.7 Self-Assessment Questions
16.8 Further Readings
16.1 INTRODUCTION
This unit will discuss the concepts of central tendency, mean, median, mode and
geometric, harmonic and moving averages. Central tendency refers to the tendency
for the values of a random variable to cluster round its mean, mode, or median.
Mean, median and mode are the three common forms of statistical averages.
The mean is an average of n numbers computed by adding some function of the
numbers and dividing by some function of n. The median, on the other hand, is the value
below which 50% of the cases fall, and the mode is the most frequent value of a
random variable. The measures of central tendency, the characteristics of the mean,
median and mode, and the various types of means are discussed in this unit.
16.2 MEASURES OF CENTRAL TENDENCY
Measures of central tendency indicate the location of the frequency curve along the X-axis and ignore all
other features of the distribution. There are various possible measures that can be
used to ‘locate’ a frequency distribution, as shown in Fig. 16.1.
A, the minimum value.
B, the value of maximum concentration.
C, the value which divides the distribution into half, such that one half of the
items have value less than this and the other half more.
D, the average value of all items.
E, the 95th percentile, i.e., the value below which 95 per cent items lie.
F, the maximum value.
Fig. 16.1 Frequency Distribution
If the shape of the frequency distribution were fixed, then all these measures would be
equally descriptive and would fix the location of the curve. But the practical distributions
that we deal with always have some change in shape depending on the samples we
take, even though the general shapes are quite similar. It is, therefore, necessary that
we choose those measures of location which are not very sensitive to the specific
values of items, in particular the extreme values. Thus, measures A and F are
generally meaningless because they depend on the values of the lowest and the
highest items, respectively. The other measures, on the contrary, are less susceptible
to extreme values because they are related to the entire distribution.
Thus, we treat B, C, D and E as the most common measures of location. There are
some more such measures which we will consider later.
The most important object of calculating and measuring central tendency is to
determine a ‘single figure’ which may be used to represent a whole series involving
magnitudes of the same variable. In that sense, it is an even more compact
description of the statistical data than the frequency distribution.
Since an ‘average’ represents the entire data, it facilitates comparison within one
group or between groups of data. Thus, the performance of the members of a group
can be compared by relating it to the average performance of the group. Likewise,
the achievements of groups can be compared by a comparison of their respective
averages.
Check Your Progress - 1
1. What is the most important object of calculating and measuring central
tendency?
................................................................................................................
................................................................................................................
................................................................................................................
2. What do averages facilitate?
................................................................................................................
................................................................................................................
................................................................................................................
16.3 MEAN
There are several commonly used measures of central tendency, such as the arithmetic mean, mode and
median. These values are very useful not only in presenting the overall picture of the
entire data but also for the purpose of making comparisons among two or more sets
of data.
While the arithmetic mean is the most commonly used measure of central location, the mode
and median are more suitable measures under certain conditions and for
certain types of data. However, each measure of central tendency should meet the
following requisites.
1. It should be easy to calculate and understand.
2. It should be rigidly defined. It should have only one interpretation so
that the personal prejudice or bias of the investigator does not affect its
usefulness.
3. It should be representative of the data. If it is calculated from a sample,
then the sample should be random enough to represent the
population accurately.
4. It should have sampling stability. It should not be affected by sampling
fluctuations. This means that if we pick 10 different groups of college
students at random and compute the average of each group, then we
should expect to get approximately the same value from each of these
groups.
5. It should not be affected much by extreme values. If few very small or very
large items are present in the data, they will unduly influence the value of the
average by shifting it to one side or other, so that the average would not be
really typical of the entire series. Hence, the average chosen should be such
that it is not unduly affected by such extreme values.
Let us consider the arithmetic mean as a measure of central tendency. It is also
commonly known as simply the mean. Even though average, in general, means any
measure of central location, when we use the word average in our daily routine, we
always mean the arithmetic average. The term is widely used by almost everyone in
daily communication. We speak of an individual being an average student or of
average intelligence. We talk about average family size or average family
income or grade point average (GPA) for students and so on.
For discussion purposes, let us assume a variable X which stands for some
scores such as the ages of students. Let the ages of 5 students be 19, 20, 22, 22 and
17 years. Then variable X would represent these ages as follows:
X: 19, 20, 22, 22, 17
Placing the Greek symbol Σ(Sigma) before X would indicate a command that all
values of X are to be added together. Thus:
ΣX = 19 + 20 + 22 + 22 + 17
The mean is computed by adding all the data values and dividing the sum by the
number of such values. The symbol used for the sample average is $\bar{X}$, so that:
$$\bar{X} = \frac{19 + 20 + 22 + 22 + 17}{5} = \frac{100}{5} = 20$$
In general, if there are n values in the sample, then
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
In other words,
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}, \quad i = 1, 2, \dots, n \tag{16.1}$$
The above formula states, add up all the values of Xi where the value of i
starts at 1 and ends at n with unit increments so that i = 1, 2, 3, ... n.
If instead of taking a sample, we take the entire population in our calculations
of the mean, then the symbol for the mean of the population is μ (mu) and the size
of the population is N, so that:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}, \quad i = 1, 2, \dots, N \tag{16.2}$$
If we have the data in grouped discrete form with frequencies, then the sample mean
is given by:
$$\bar{X} = \frac{\sum f(X)}{\sum f} \tag{16.3}$$
Where Σf = Summation of all frequencies
= n
Σf(X) = Summation of each value of X multiplied by its
corresponding frequency (f)
Example 16.1: Let us take the ages of 10 students as follows:
19, 20, 22, 22, 17, 22, 20, 23, 17, 18
Solution: This data can be arranged in a frequency distribution as follows:
Age (X)    Frequency (f)    f(X)
17              2            34
18              1            18
19              1            19
20              2            40
22              3            66
23              1            23
Total          10           200
In the given case we have Σf = 10 and Σf(X) = 200, so that:
$$\bar{X} = \frac{\sum f(X)}{\sum f} = \frac{200}{10} = 20$$
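The same calculation can be sketched in Python, building the frequency distribution from the raw ages and then applying equation (16.3). The use of `collections.Counter` and the variable names are illustrative choices, not part of the original text.

```python
from collections import Counter

# Ages of the 10 students from Example 16.1.
ages = [19, 20, 22, 22, 17, 22, 20, 23, 17, 18]

# Frequency distribution: each distinct age mapped to its frequency f.
frequency = Counter(ages)

sum_f = sum(frequency.values())                      # sum of frequencies (n)
sum_fx = sum(x * f for x, f in frequency.items())    # sum of f(X)
mean = sum_fx / sum_f

print(sum_f, sum_fx, mean)
```

Working from the frequency table gives the same mean as adding the ten raw values directly, since each value X is simply counted f times.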
Example 16.2: Calculate the mean of the marks of 46 students given in the
following table.
Frequency of Marks of 46 Students
Marks (X)    Frequency (f)
 9                1
10                2
11                3
12                6
13               10
14               11
15                7
16                3
17                2
18                1
Total            46
Solution: This is a discrete frequency distribution, and its mean is calculated using equation
(16.3). The following table shows the method of obtaining Σf(X).
Marks (X)    Frequency (f)    f(X)
 9                1              9
10                2             20
11                3             33
12                6             72
13               10            130
14               11            154
15                7            105
16                3             48
17                2             34
18                1             18
             Σf = 46     Σf(X) = 623
Using equation (16.3), we get,
$$\bar{X} = \frac{\sum f(X)}{\sum f} = \frac{623}{46} = 13.54$$
Arithmetic Mean of Grouped Data
If, however, the data is grouped such that we are given only the frequencies of finite-sized class
intervals, we do not know the value of every item. The calculation of the arithmetic mean
in such a case is necessarily a process of estimation, based on some
assumption. The standard assumption for this purpose is that all the items within a
particular class are concentrated at the midvalue of the class. Thus f(X)
corresponding to the f items of a class equals f(m), where m is the midpoint of the
class interval, and the arithmetic mean is then given by,
$$\bar{X} = \frac{\sum f(m)}{\sum f} \tag{16.4}$$
The determination of the midpoint of a class interval requires some
consideration. The position of the midpoint is determined by the real class limits
(the class boundaries), as distinguished from the apparent class limits.
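As an illustration of equation (16.4), the sketch below estimates the mean age from the grouped table of 30 workers used in Unit 15. Borrowing that table here is an assumption made for the sake of a worked example; the variable names are illustrative.

```python
# Grouped data: ages of 30 workers (class boundaries and frequencies).
class_intervals = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75)]
frequencies = [5, 3, 7, 5, 3, 7]

# Every item in a class is assumed to sit at the class midpoint m.
midpoints = [(lo + hi) / 2 for lo, hi in class_intervals]

sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))  # sum of f(m)
sum_f = sum(frequencies)                                     # sum of f
grouped_mean = sum_fm / sum_f

print(sum_fm, sum_f, round(grouped_mean, 2))
```

The result is an estimate only: the true mean would differ to the extent that the ages within each class are not centred on the midpoint.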
Advantages of Mean
1. Its concept is familiar to most people and is intuitively clear.
2. Every data set has a mean, which is unique and describes the entire data to
some degree. For example, when we say that the average salary of a
professor is ₹ 25,000 per month, it gives us a reasonable idea about the
salaries of professors.
3. It is a measure that can be easily calculated.
4. It includes all values of the data set in its calculation.
5. Its value varies very little from sample to sample taken from the same
population.
6. It is useful for performing statistical procedures such as computing and
comparing the means of several data sets.
Disadvantages of Mean
1. It is affected by extreme values and hence is not very reliable when the data
set has extreme values, especially when these extreme values are on one
side of the ordered data. The mean of such data is not truly
representative of the data. For example, the average age of three persons
of ages 4, 6 and 80 is 30.
2. It is tedious to compute for a large data set as every point in the data set is
to be used in computations.
3. We are unable to compute the mean for a data set that has open-ended
classes either at the high or at the low end of the scale.
4. The mean cannot be calculated for qualitative characteristics such as beauty
or intelligence, unless these can be converted into quantitative figures such
as intelligence into IQs.
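The effect of an extreme value, noted in the first disadvantage, can be seen with a few lines of Python using the standard `statistics` module. The comparison with the median is added here only for contrast.

```python
import statistics

# Ages of three persons; 80 is an extreme value on one side of the data.
ages = [4, 6, 80]

mean_age = statistics.mean(ages)      # pulled upward by the extreme value
median_age = statistics.median(ages)  # middle value, unaffected by the 80

print(mean_age, median_age)
```

None of the three persons is anywhere near 30 years old, which is why the mean is a poor representative of such data.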
Median
The second measure of central tendency that has wide usage in statistical work is
the median. The median is that value of a variable which divides the series in such a
manner that the number of items below it is equal to the number of items above it. Half
the total number of observations lie below the median, and half above it. The median
is thus a positional average.
The median of ungrouped data is found easily if the items are first arranged in
order of magnitude. The median may then be located simply by counting, and its value
can be obtained by reading the value of the middle observation. If we have five
observations whose values are 8, 10, 1, 3 and 5, the values are first arrayed: 1, 3, 5,
8 and 10. It is now apparent that the value of the median is 5, since two observations
are below that value and two observations are above it. When there is an even number
of cases, there is no actual middle item and the median is taken to be the average of the
values of the items lying on either side of position (N + 1)/2, where N is the total number of
items. Thus, if the values of six items of a series are 1, 2, 3, 5, 8 and 10, the median is
the value of item number (6 + 1)/2 = 3.5, which is approximated as the average of the
third and the fourth items, i.e., (3 + 5)/2 = 4.
Thus the steps required for obtaining median are:
1. Arrange the data as an array of increasing magnitude.
2. Obtain the value of the (N + 1)/2th item.
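The two steps above can be sketched as a small Python function. The name `median_of` is hypothetical; Python's standard library offers `statistics.median` for the same purpose.

```python
def median_of(values):
    """Median of ungrouped data: the middle item of the ordered array."""
    ordered = sorted(values)          # Step 1: array of increasing magnitude
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd N: item number (N + 1)/2 exists and is the median.
        return ordered[middle]
    # Even N: (N + 1)/2 falls between two items, so average them.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([8, 10, 1, 3, 5]))      # five observations
print(median_of([1, 2, 3, 5, 8, 10]))   # six observations
```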
Even in the case of grouped data, the procedure for obtaining median is
straightforward as long as the variable is discrete or non-continuous as is clear from
the following examples.
Example 16.3: Obtain the median size of shoes sold from the following data.
Number of Shoes Sold by Size in One Year
Size     Number of Pairs    Cumulative Total
5              30                  30
5½             40                  70
6              50                 120
6½            150                 270
7             300                 570
7½            600                1170
8             950                2120
8½            820                2940
9             750                3690
9½            440                4130
10            250                4380
10½           150                4530
11             40                4570
11½            39                4609
Total        4609
Solution: The median is the value of the $\frac{N + 1}{2}$th $= \frac{4609 + 1}{2}$th
$=$ 2305th item. Since the
items are already arranged in ascending order (size-wise), the size of the 2305th item is
easily determined from the cumulative frequency. Thus, the median size of
shoes sold is 8½, the size of the 2305th item.
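Locating the median item through the cumulative totals can be sketched in Python as follows. Half sizes are written as decimals (for example 8.5), and the use of `bisect_left` to find the first cumulative total that reaches the median position is an illustrative choice.

```python
from bisect import bisect_left
from itertools import accumulate

# Shoe sizes (half sizes as decimals) and number of pairs sold.
sizes = [5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5]
pairs = [30, 40, 50, 150, 300, 600, 950, 820, 750, 440, 250, 150, 40, 39]

cumulative = list(accumulate(pairs))
n = cumulative[-1]

# Position of the median item: (N + 1)/2.
position = (n + 1) // 2

# The median size is the first size whose cumulative total reaches that position.
median_size = sizes[bisect_left(cumulative, position)]

print(n, position, median_size)
```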
In the case of grouped data with a continuous variable, the determination
of the median is a bit more involved. Consider an example: the data relating to the
distribution of male workers by average monthly earnings is given in the following
table. Clearly the median of 6291 cases is the earnings of the (6291 + 1)/2 = 3146th
worker when the workers are arranged in ascending order of earnings.
From the cumulative frequency, it is clear that this worker has his income in the
class interval 67.5–72.5. But it is impossible to determine his exact income. We,
therefore, resort to approximation by assuming that the 795 workers of this class are
distributed uniformly across the interval 67.5 to 72.5. The median worker is
the (3146 − 2713) = 433rd of these 795, and hence, the value corresponding to him can
be approximated as,
$$67.5 + \frac{433}{795} \times (72.5 - 67.5) = 67.5 + 2.72 = 70.22$$
Distribution of Male Workers by Average Monthly Earnings
Group No.   Monthly Earnings (₹)   No. of Workers   Cumulative No. of Workers
 1             27.5–32.5               120                  120
 2             32.5–37.5               152                  272
 3             37.5–42.5               170                  442
 4             42.5–47.5               214                  656
 5             47.5–52.5               410                 1066
 6             52.5–57.5               429                 1495
 7             57.5–62.5               568                 2063
 8             62.5–67.5               650                 2713
 9             67.5–72.5               795                 3508
10             72.5–77.5               915                 4423
11             77.5–82.5               745                 5168
12             82.5–87.5               530                 5698
13             87.5–92.5               259                 5957
14             92.5–97.5               152                 6109
15             97.5–102.5              107                 6216
16            102.5–107.5               50                 6266
17            107.5–112.5               25                 6291
Total                                 6291
The value of the median can thus be put in the form of the formula,
$$Me = l + \frac{\frac{N + 1}{2} - C}{f} \times i$$
Where l is the lower limit of the median class, i its width, f its frequency, C the
cumulative frequency upto (but not including) the median class, and N is the total
number of cases.
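The formula can be sketched as a Python function and applied to the earnings table. The function name `grouped_median` and its parameters are hypothetical, and the sketch assumes classes of equal width listed in ascending order.

```python
def grouped_median(lower_limits, width, frequencies):
    """Interpolated median: Me = l + ((N + 1)/2 - C) / f * i."""
    n = sum(frequencies)
    position = (n + 1) / 2
    cumulative = 0                      # C: total before the current class
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= position:  # this is the median class
            return lower + (position - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must not be empty")


# Number of workers in each of the 17 earnings classes (width 5).
frequencies = [120, 152, 170, 214, 410, 429, 568, 650, 795,
               915, 745, 530, 259, 152, 107, 50, 25]
lower_limits = [27.5 + 5 * k for k in range(len(frequencies))]

median_earnings = grouped_median(lower_limits, 5, frequencies)
print(round(median_earnings, 2))
```

The function reproduces the interpolation step worked out by hand: it finds the class containing the 3146th worker and moves 433/795 of the way across that class.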
Location of median by graphical analysis
The median can quite conveniently be determined by reference to the ogive which
plots the cumulative frequency against the variable. The value of the item below
which half the items lie can easily be read from the ogive.
Example 16.4: Obtain the median of data given in the following table.
Monthly Earnings   Frequency (f)   Less than   More than (greater than)
 27.5                  -                0            6291
 32.5                 120             120            6171
 37.5                 152             272            6019
 42.5                 170             442            5849
 47.5                 214             656            5635
 52.5                 410            1066            5225
 57.5                 429            1495            4796
 62.5                 568            2063            4228
 67.5                 650            2713            3578
 72.5                 795            3508            2783
 77.5                 915            4423            1868
 82.5                 745            5168            1123
 87.5                 530            5698             593
 92.5                 259            5957             334
 97.5                 152            6109             182
102.5                 107            6216              75
107.5                  50            6266              25
112.5                  25            6291               0
Solution: It is clear that this is grouped data. The first class is 27.5–32.5, whose
frequency is 120, and the last class is 107.5–112.5, whose frequency is 25.
The median can also be determined by plotting both ‘less than’ and ‘more than’
cumulative frequency as shown in Fig. 16.2. It is obvious that the two curves should
intersect at the median of the data.
Fig. 16.3
Mode
The mode is that value of the variable which occurs or repeats itself the greatest
number of times. The mode is the most ‘fashionable’ size in the sense that it is the most
common and typical, and is defined by Zizek as ‘the value occurring most frequently in
a series (or group of items) and around which the other items are distributed most
densely.’
The mode of a distribution is the value at the point around which the items tend to
be most heavily concentrated. It is the most frequent or the most common value,
provided that a sufficiently large number of items are available to give a smooth
distribution. It will correspond to the value of the maximum point (ordinate) of a
frequency distribution if it is an ‘ideal’ or smooth distribution. It may be regarded as
the most typical of a series of values. The modal wage, for example, is the wage
received by more individuals than any other wage. The modal ‘hat’ size is that which is
worn by more persons than any other single size.
It may be noted that the occurrence of one or a few extremely high or low values
has no effect upon the mode. If a series of data are unclassified, not having been either
arrayed or put into a frequency distribution, the mode cannot be readily located.
Taking first an extremely simple example, if seven men are receiving daily wages
of ₹ 5, 6, 7, 7, 7, 8 and 10, it is clear that the modal wage is ₹ 7 per day. If we have
a series such as 2, 3, 5, 6, 7, 10 and 11, it is apparent that there is no mode.
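The two series above can be checked by simple counting. This is a small sketch, assuming that a series in which no value repeats is reported as having no mode; the helper name `simple_mode` is illustrative.

```python
from collections import Counter


def simple_mode(values):
    """Return the list of most frequent values, or None if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                 # every value occurs once: no mode
        return None
    return [v for v, c in counts.items() if c == highest]


print(simple_mode([5, 6, 7, 7, 7, 8, 10]))    # [7]
print(simple_mode([2, 3, 5, 6, 7, 10, 11]))   # None
```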
There are several methods of estimating the value of the mode. But, it is seldom
that the different methods of ascertaining the mode give us identical results. Consequently,
it becomes necessary to decide as to which method would be most suitable for the
purpose in hand. In order that a choice of the method may be made, we should
understand each of the methods and the differences that exist among them.
The four important methods of estimating mode of a series are: (i) Locating the
most frequently repeated value in the array; (ii) Estimating the mode by interpolation;
(iii) Locating the mode by graphic method; and (iv) Estimating the mode from the
mean and the median. Only the last three methods are discussed in this unit.
Estimating the Mode by Interpolation. In the case of continuous frequency
distributions, the problem of determining the value of the mode is not so simple as it
might have appeared from the foregoing description. Having located the modal class
of the data, the next problem in the case of continuous series is to interpolate the value
of the mode within this ‘modal’ class.
The interpolation is made by the use of any one of the following formulae:
(i) \[ Mo = l_1 + \frac{f_2}{f_0 + f_2} \times i \]
(ii) \[ Mo = l_2 - \frac{f_0}{f_0 + f_2} \times i \]
or (iii) \[ Mo = l_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i \]
Where l1 is the lower limit of the modal class, l2 is the upper limit of the modal class, f0
equals the frequency of the preceding class in value, f1 equals the frequency of the
modal class in value, f2 equals the frequency of the following class (class next to modal
class) in value and i equals the interval of the modal class.
Example 16.5: Determine the mode for the data given in the following table.
Wage Group    Frequency (f)
14–18         6
18–22         18
22–26         19
26–30         12
30–34         5
34–38         4
38–42         3
42–46         2
46–50         1
50–54         0
54–58         1
Solution: In the given data, 22 – 26 is the modal class, since it has the largest frequency,
the lower limit of the modal class is 22, its upper limit is 26, its frequency 19, the
frequency of the preceding class is 18, and of the following class is 12. The class
interval is 4. Using the various methods of determining mode, we have,
(i) \[ Mo = 22 + \frac{12}{18 + 12} \times 4 = 22 + \frac{8}{5} = 23.6 \]
(ii) \[ Mo = 26 - \frac{18}{18 + 12} \times 4 = 26 - \frac{12}{5} = 23.6 \]
(iii) \[ Mo = 22 + \frac{19 - 18}{(19 - 18) + (19 - 12)} \times 4 = 22 + \frac{4}{8} = 22.5 \]
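The three interpolation formulae can be compared side by side on the wage data of Example 16.5. The sketch below assumes equal class intervals and a single modal class; the function names are illustrative and do not come from the text.

```python
# Lower class limits and frequencies from the table of Example 16.5.
lower_limits = [14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54]
frequencies = [6, 18, 19, 12, 5, 4, 3, 2, 1, 0, 1]
interval = 4

k = frequencies.index(max(frequencies))     # position of the modal class
l1 = lower_limits[k]                        # lower limit of the modal class
l2 = l1 + interval                          # upper limit of the modal class
f0, f1, f2 = frequencies[k - 1], frequencies[k], frequencies[k + 1]


def mode_i(l1, f0, f2, i):
    return l1 + f2 / (f0 + f2) * i


def mode_ii(l2, f0, f2, i):
    return l2 - f0 / (f0 + f2) * i


def mode_iii(l1, f0, f1, f2, i):
    return l1 + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * i


print(round(mode_i(l1, f0, f2, interval), 2))        # 23.6
print(round(mode_ii(l2, f0, f2, interval), 2))       # 23.6
print(round(mode_iii(l1, f0, f1, f2, interval), 2))  # 22.5
```

Formulae (i) and (ii) always agree with each other, since l2 = l1 + i; formula (iii) gives a different value because it weights the differences from the modal frequency rather than the adjoining frequencies themselves.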
In formulae (i) and (ii), the frequency of the classes adjoining the modal class is
used to pull the estimate of the mode away from the midpoint towards either the upper
or lower class limit. In this particular case, the frequency of the class preceding the
modal class is more than the frequency of the class following and, therefore, the estimated
mode is less than the midvalue of the modal class. This seems quite logical. If the
frequencies are more on one side of the modal class than on the other, it can be
reasonably concluded that the items in the modal class are concentrated more towards
the class limit of the adjoining class with the larger frequency.
The formula (iii) is also based on a logic similar to that of (i) and (ii). In this case,
to interpolate the value of the mode within the modal class, the differences between
the frequency of the modal class, and the respective frequencies of the classes adjoining
it are used. This formula usually gives better results than the other two, and its
results are exactly equal to those obtained by the graphic method. Formulae (i)
and (ii) give values which differ from the value obtained by formula (iii) and are
closer to the central point of the modal class. If the frequencies of the classes adjoining
the modal class are equal, the mode is expected to be located at the midvalue of the modal
class, but if the frequency on one of the sides is greater, the mode will be pulled away
from the central point towards that side. The greater the difference between the
frequencies of the classes adjoining the modal class, the further it will be pulled. In the example
given above, the frequency of the modal class is 19 and that of the preceding class is 18.
So, the mode should be quite close to the lower limit of the modal class. The midpoint
of the modal class is 24 and the lower limit of the modal class is 22.
Locating the Mode by the Graphic Method. The method of graphic
interpolation is illustrated in Fig. 16.3. The upper corners of the rectangle over the
modal class have been joined by straight lines to those of the adjoining rectangles as
shown in the diagram; the right corner to the corresponding one of the adjoining rectangle
on the left, etc. If a perpendicular is drawn from the point of intersection of these lines,
we have a value for the mode indicated on the base line. The graphic approach is, in
principle, similar to the arithmetic interpolation explained earlier.
Fig. 16.3 Method of Mode Determination by Graphic Interpolation
The mode may also be determined graphically from an ogive or cumulative frequency
curve. It is found by drawing a perpendicular to the base from that point on the curve
where the curve is most nearly vertical, i.e., steepest (in other words, where it passes
through the greatest distance vertically and smallest distance horizontally). The point
where it cuts the base gives us the value of the mode. How accurately this method
determines the mode is governed by: (1) The shape of the ogive, (2) The scale on
which the curve is drawn.
Estimating the Mode from the Mean and the Median. There usually exists a
relationship among the mean, median and mode for moderately asymmetrical
distributions. If the distribution is symmetrical, the mean, median and mode will have
identical values, but if the distribution is skewed (moderately) the mean, median and
mode will pull apart. If the distribution tails off towards higher values, the mean and the
median will be greater than the mode. If it tails off towards lower values, the mode will
be greater than either of the other two measures. In either case, the median will be
about one-third as far away from the mean as the mode is. This means that,
Mode = Mean – 3 (Mean – Median)
= 3 Median – 2 Mean
In the case of the average monthly earnings (refer table of example 3) the mean is
68.53 and the median is 70.2. If these values are substituted in the above formula, we
get,
Mode = 68.5 – 3(68.5 –70.2)
= 68.5 + 5.1 = 73.6
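According to the formula used earlier,
\[ \text{Mode} = l_1 + \frac{f_2}{f_0 + f_2} \times i = 72.5 + \frac{745}{795 + 745} \times 5 = 72.5 + 2.4 = 74.9 \]
OR
\[ \text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i = 72.5 + \frac{915 - 795}{2 \times 915 - 795 - 745} \times 5 = 72.5 + \frac{120}{290} \times 5 = 74.57 \]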
The difference between the estimate based on the mean and the median and the
estimates obtained by interpolation arises because the assumed relationship between
the mean, median and mode may not always hold; it is evidently not valid in this case.
Example 16.6: (a) In a moderately asymmetrical distribution, the mode and mean are
32.1 and 35.4 respectively. Calculate the median.
(b) If the mode and median of a moderately asymmetrical series are respectively
16'' and 15.7'', what would be its most probable mean?
(c) In a moderately skewed distribution, the mean and the median are respectively
25.6 and 26.1 inches. What is the mode of the distribution?
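Solution: (a) We know,
\[ \text{Mean} - \text{Mode} = 3(\text{Mean} - \text{Median}) \]
or
\[ 3\,\text{Median} = \text{Mode} + 2\,\text{Mean} \]
or
\[ \text{Median} = \frac{32.1 + 2 \times 35.4}{3} = \frac{102.9}{3} = 34.3 \]
(b)
\[ 2\,\text{Mean} = 3\,\text{Median} - \text{Mode} \]
or
\[ \text{Mean} = \frac{1}{2}(3 \times 15.7 - 16.0) = \frac{31.1}{2} = 15.55 \]
(c)
\[ \text{Mode} = 3\,\text{Median} - 2\,\text{Mean} = 3 \times 26.1 - 2 \times 25.6 = 78.3 - 51.2 = 27.1 \]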
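The three parts of this example are rearrangements of the same empirical relationship, Mode = 3 Median – 2 Mean. A brief sketch follows, with illustrative function names, assuming the relationship holds (which, as noted above, is only approximately true for moderately skewed data).

```python
def mode_from(mean, median):
    """Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def median_from(mean, mode):
    """Rearranged: Median = (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


def mean_from(median, mode):
    """Rearranged: Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


print(round(median_from(35.4, 32.1), 2))   # part (a): 34.3
print(round(mean_from(15.7, 16.0), 2))     # part (b): 15.55
print(round(mode_from(25.6, 26.1), 2))     # part (c): 27.1
```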
Geometric Mean and Harmonic Mean
The Geometric Mean (GM) of n positive values is defined as the nth root of their
product. Thus, it is obtained by multiplying together all the values and then extracting
the relevant root of the product. It can be represented as:
\[ \text{Geometric Mean or GM} = \sqrt[n]{x_1 \cdot x_2 \cdot x_3 \cdots x_n} \]
Where n stands for the number of items and x1, x2, x3, ... xn are the various values.
For instance, the geometric mean of 4, 8, 16 is,
\[ GM = \sqrt[3]{4 \times 8 \times 16} = \sqrt[3]{512} = 8 \]
The above method of calculating geometric mean is satisfactory only if there are
two or three items. But if n is a large number, computing the nth root of
the product of these values by simple arithmetic is tedious. To facilitate the
computation of geometric mean we make use of logarithms. The above formula when
reduced to its logarithmic form will be:
\[ \log GM = \frac{\log x_1 + \log x_2 + \log x_3 + \cdots + \log x_n}{n} \]
The logarithm of the geometric mean is equal to the arithmetic mean of the logarithms
of individual values.
Example 16.7: Find the GM of 2, 4, 8, 12, 16, 24.
Solution:
x        log x
2        0.3010
4        0.6021
8        0.9031
12       1.0792
16       1.2041
24       1.3802
Total    5.4697
\[ \text{Geometric Mean} = \operatorname{antilog}\left(\frac{5.4697}{6}\right) = \operatorname{antilog} 0.9116 = 8.158 \]
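The logarithmic method used in this solution can be reproduced with common (base 10) logarithms. This is a minimal sketch; the direct nth-root calculation is included only as a cross-check.

```python
import math

values = [2, 4, 8, 12, 16, 24]

# Arithmetic mean of the logarithms, then the antilogarithm.
log_sum = sum(math.log10(x) for x in values)
gm_by_logs = 10 ** (log_sum / len(values))

# Direct definition: nth root of the product.
gm_direct = math.prod(values) ** (1 / len(values))

print(round(log_sum, 4))      # 5.4697
print(round(gm_by_logs, 3))   # 8.159 (the table-based 8.158 differs only by rounding)
```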
It is easily verified that the geometric mean (GM) of a frequency distribution is
given by,
\[ \log GM = \frac{f_1 \log x_1 + f_2 \log x_2 + f_3 \log x_3 + \cdots + f_n \log x_n}{N} \]
Similarly, for grouped data,
\[ \log GM = \frac{\sum f \log m}{N} \]
Where m is the midvalue of a particular class.
Harmonic Mean
Another important mean is the Harmonic Mean (HM) which is used for averaging the
rates. It is defined by,
\[ \frac{1}{HM} = \frac{\dfrac{1}{x_1} + \dfrac{1}{x_2} + \dfrac{1}{x_3} + \cdots + \dfrac{1}{x_n}}{n} \]
Where n is the number of items in the series x1, x2, x3, ..., xn.
Thus, if a man travels 200 km each on three days at speeds of 60, 50 and 40
kmph, respectively, his average speed is given by the HM of the three speeds, namely
\[ HM = \frac{3}{\dfrac{1}{60} + \dfrac{1}{50} + \dfrac{1}{40}} = 48.65 \text{ kmph} \]
Note: HM gives the correct average speed because the man travelled equal distances at the three
speeds. If, however, he had travelled for equal times, the AM would have been the correct
average.
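The claim in the note can be verified by computing the average speed directly as total distance divided by total time. The sketch below uses the figures of the example; the variable names are illustrative.

```python
speeds = [60, 50, 40]      # kmph on the three days
distance = 200             # km travelled each day

# Harmonic mean of the speeds.
hm = len(speeds) / sum(1 / s for s in speeds)

# Average speed from first principles: total distance / total time.
total_time = sum(distance / s for s in speeds)
average_speed = len(speeds) * distance / total_time

# Arithmetic mean, which would apply if the times (not distances) were equal.
am = sum(speeds) / len(speeds)

print(round(hm, 2), round(average_speed, 2), am)   # 48.65 48.65 50.0
```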
Moving Averages
Moving averages are defined as a succession of averages derived from successive,
overlapping segments of constant size of a series of values. A moving average is
calculated by creating a series of averages of different subsets of the full data set.
After fixing the size of the subset, the first element of the moving average is obtained by
taking the average of the initial fixed subset of the number series. Each later element
is obtained by shifting the subset forward by one value.
For example, for the series 3, 5, 9, 11, 2, 8, 7, 6, 4, 2, a 3-year moving average is
calculated as (3 + 5 + 9)/3 = 5.67, (5 + 9 + 11)/3 = 8.33, (9 + 11 + 2)/3 = 7.33, and so on.
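The full set of 3-year moving averages for this series can be generated mechanically. A short sketch, assuming a simple (unweighted) moving average; the function name `moving_average` is illustrative.

```python
def moving_average(series, window):
    """Averages of successive overlapping segments of length `window`."""
    return [sum(series[k:k + window]) / window
            for k in range(len(series) - window + 1)]


series = [3, 5, 9, 11, 2, 8, 7, 6, 4, 2]
averages = moving_average(series, 3)
print([round(a, 2) for a in averages])
# [5.67, 8.33, 7.33, 7.0, 5.67, 7.0, 5.67, 4.0]
```

A series of ten values gives eight 3-year averages, because the first and last values cannot sit at the centre of a complete segment.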
Check Your Progress - 2
1. How is the mean computed?
2. What are the four important methods of estimating mode of a series?
16.4 SUMMARY
• Measures of central tendency indicate the location of the frequency curve
along the X-axis and ignore all other features of the distribution.
• The most important object of calculating and measuring central tendency is
to determine a ‘single figure’ which may be used to represent a whole series
involving magnitudes of the same variable.
• While arithmetic mean is the most commonly used measure of central
location, mode and median are more suitable measures under certain set of
conditions and for certain types of data.
• The mean is computed by adding all the data values and dividing the sum by the
number of such values.
• The mean cannot be calculated for qualitative characteristics such as beauty
or intelligence, unless these can be converted into quantitative figures such
as intelligence into IQs.
• Half the total number of observations lie below the median, and half above
it. The median is thus a positional average.
• The median of ungrouped data is found easily if the items are first arranged
in order of magnitude.
• The median can quite conveniently be determined by reference to the ogive
which plots the cumulative frequency against the variable.
• The mode is the most ‘fashionable’ size in the sense that it is the most
common and typical, and is defined by Zizek as ‘the value occurring most
frequently in a series (or group of items) and around which the other items
are distributed most densely.’
• The modal wage, for example, is the wage received by more individuals
than any other wage.
• It may be noted that the occurrence of one or a few extremely high or low
values has no effect upon the mode.
• If a series of data are unclassified, not having been either arrayed or put
into a frequency distribution, the mode cannot be readily located.
• There are several methods of estimating the value of the mode. But, it is
seldom that the different methods of ascertaining the mode give us identical
results.
• The four important methods of estimating mode of a series are: (i) Locating
the most frequently repeated value in the array; (ii) Estimating the mode by
interpolation; (iii) Locating the mode by graphic method; and (iv)
Estimating the mode from the mean and the median.
• In the case of continuous frequency distributions, the problem of
determining the value of the mode is not so simple as it might have
appeared from the foregoing description.
• There usually exists a relationship among the mean, median and mode for
moderately asymmetrical distributions.
• If the distribution is symmetrical, the mean, median and mode will have
identical values, but if the distribution is skewed (moderately) the mean,
median and mode will pull apart.
• If the distribution tails off towards higher values, the mean and the median
will be greater than the mode.
• The Geometric Mean (GM) of n positive values is defined as the nth root
of their product.
16.5 KEY WORDS
• Median: It is that value of a variable which divides the series in such a
manner that the number of items below it is equal to the number of items
above it.
• Mode: It is that value of the variable, which occurs or repeats itself the
greatest number of times.
• Moving averages: These are defined as a succession of averages derived
from successive, overlapping segments of constant size of a series of
values.
16.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. The most important object of calculating and measuring central tendency is
to determine a ‘single figure’ which may be used to represent a whole
series involving magnitudes of the same variable.
2. Average facilitates comparison within one group or between groups of
data.
Check Your Progress - 2
1. The mean is computed by adding all the data values and dividing the sum by the
number of such values.
2. The four important methods of estimating mode of a series are: (i) Locating
the most frequently repeated value in the array; (ii) Estimating the mode by
interpolation; (iii) Locating the mode by graphic method; and (iv)
Estimating the mode from the mean and the median.
16.7 SELF-ASSESSMENT QUESTIONS
1. Write a short note on measures of central tendencies.
2. Write a note on Mean. State its characteristics.
3. What is arithmetic mean of grouped data? Discuss.
4. List the advantages and disadvantages of mean.
5. What do you mean by median? Explain with the help of an example using
a tabular presentation.
6. How is mode estimated by interpolation?
7. Discuss the locating of mode by the graphical method.
8. What do you mean by geometric and harmonic mean?
9. Write a short note on moving averages.
16.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Measures of Dispersion–I & II
UNIT–17
MEASURES OF DISPERSION–I & II
Objectives
After going through this unit, you will be able to:
• Define the concept of measures of dispersion and its significance in statistical
analysis
• Differentiate between quartile deviation and standard deviation
• Describe how to calculate coefficient of mean deviation and standard
deviation
• Analyse standard deviation by short-cut method
• Assess the various measures of dispersion
Structure
17.1 Introduction
17.2 Measures of Dispersion
17.3 Standard Deviation
17.4 Summary
17.5 Key Words
17.6 Answers to ‘Check Your Progress’
17.7 Self-Assessment Questions
17.8 Further Readings
17.1 INTRODUCTION
In this unit, you will learn about the measures of dispersion. A measure of central
tendency is computed to see through the variability or dispersion of the individual
values. But dispersion is in itself a very important property of a distribution and
needs to be measured by an appropriate statistic.
The measure of dispersion can be expressed in an ‘absolute form’, or in a
‘relative form’. It is said to be in an absolute form when it states the actual amount
by which the value of an item on an average deviates from a measure of central
tendency. A relative measure of dispersion is a quotient obtained by dividing the
absolute measures by a quantity in respect to which absolute deviation has been
computed. Relative measures are used for making comparisons between two or
more distributions. The common measures of dispersion are range, semi-interquartile
range or the quartile deviation, mean deviation and standard deviation. Of these, the
standard deviation is the best measure. All these measures are discussed in this unit.
17.2 MEASURES OF DISPERSION
A measure of dispersion, or simply dispersion, may be defined as a statistic signifying
the extent of the scatteredness of items around a measure of central tendency.
A measure of dispersion may be expressed in an ‘absolute form’, or in a
‘relative form’. It is said to be in an absolute form when it states the actual amount
by which the value of an item on an average deviates from a measure of central
tendency. Absolute measures are expressed in concrete units, i.e., units in terms of
which the data have been expressed, e.g., rupees, centimetres, kilograms, etc., and
are used to describe frequency distribution.
A relative measure of dispersion is a quotient obtained by dividing the absolute
measures by a quantity in respect to which absolute deviation has been computed. It
is as such a pure number and is usually expressed in a percentage form. Relative
measures are used for making comparisons between two or more distributions.
A measure of dispersion should possess the following characteristics, which are
also considered essential for a measure of central tendency.
(a) It should be based on all observations.
(b) It should be readily comprehensible.
(c) It should be fairly easily calculated.
(d) It should be affected as little as possible by fluctuations of sampling.
(e) It should be amenable to algebraic treatment.
The following are the common measures of dispersion:
(i) The range, (ii) The semi-interquartile range or the quartile deviation, (iii) The
mean deviation and (iv) The standard deviation. Of these, the standard deviation is
the best measure. All these measures are discussed in this unit.
Range
The crudest measure of dispersion is the range of the distribution. The range of any
series is the difference between the highest and the lowest values in the series. If the
marks received in an examination taken by 248 students are arranged in ascending
order, then the range will be equal to the difference between the highest and the
lowest marks.
In a frequency distribution, the range is taken to be the difference between the
lower limit of the class at the lower extreme of the distribution and the upper limit of
the class at the upper extreme.
Table 17.1 Weekly Earnings of Labourers in Four Workshops of the Same Type
                                        No. of Workers
Weekly earnings (₹)    Workshop A    Workshop B    Workshop C    Workshop D
15–16                  ...           ...           2             ...
17–18                  ...           2             4             ...
19–20                  ...           4             4             4
21–22                  10            10            10            14
23–24                  22            14            16            16
25–26                  20            18            14            16
27–28                  14            16            12            12
29–30                  14            10            6             12
31–32                  ...           6             6             4
33–34                  ...           ...           2             2
35–36                  ...           ...           ...           ...
37–38                  ...           ...           4             ...
Total                  80            80            80            80
Mean                   25.5          25.5          25.5          25.5
Consider the data on the weekly earnings of workers in four workshops given in
Table 17.1. We note the following:
Workshop    Range
A           9
B           15
C           23
D           15
From these figures, it is clear that the greater the range, the greater is the
variation of the values in the group.
The range is a measure of absolute dispersion and as such cannot be usefully
employed for comparing the variability of two distributions expressed in different
units. The amount of dispersion measured, say, in pounds, is not comparable with
dispersion measured in inches. So the need of measuring relative dispersion arises.
An absolute measure can be converted into a relative measure if we divide it by
some other value regarded as standard for the purpose. We may use the mean of
the distribution or any other positional average as the standard.
For Table 17.1, the relative dispersion would be:
Workshop A = 9/25.5        Workshop B = 15/25.5
Workshop C = 23/25.5       Workshop D = 15/25.5
An alternate method of converting an absolute variation into a relative one would be
to use the total of the extremes as the standard. This will be equal to dividing the
difference of the extreme items by the total of the extreme items. Thus,
\[ \text{Relative Dispersion} = \frac{\text{Difference of extreme items, i.e., Range}}{\text{Sum of extreme items}} \]
The relative dispersion of the series is called the coefficient or ratio of dispersion. In our
example of weekly earnings of workers considered earlier, the coefficients would be:
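\[ \text{Workshop A} = \frac{9}{21 + 30} = \frac{9}{51} \qquad \text{Workshop B} = \frac{15}{17 + 32} = \frac{15}{49} \]
\[ \text{Workshop C} = \frac{23}{15 + 38} = \frac{23}{53} \qquad \text{Workshop D} = \frac{15}{19 + 34} = \frac{15}{53} \]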
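The ranges and coefficients for the four workshops can be tabulated as decimals for easier comparison. This is a small sketch, assuming the extreme values are taken from the stated class limits of Table 17.1 exactly as in the calculations above; the dictionary name `extremes` is illustrative.

```python
# (lowest earning, highest earning) of the occupied classes in Table 17.1.
extremes = {"A": (21, 30), "B": (17, 32), "C": (15, 38), "D": (19, 34)}

results = {}
for workshop, (low, high) in extremes.items():
    value_range = high - low
    coefficient = value_range / (high + low)   # range / sum of extreme items
    results[workshop] = (value_range, coefficient)
    print(workshop, value_range, round(coefficient, 3))
# A 9 0.176
# B 15 0.306
# C 23 0.434
# D 15 0.283
```

Workshops B and D have the same absolute range, but their coefficients differ slightly because their extreme values sit at different levels of earnings.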
Merits and Limitations of Range
Merits: Of the various characteristics that a good measure of dispersion should
possess, the range has only two, viz. (i) It is easy to understand, and (ii) Its
computation is simple.
Limitations: Besides the aforesaid two qualities, the range does not satisfy the
other tests of a good measure and hence it is often termed a crude measure of
dispersion.
The following are the limitations that are inherent in the range as a concept of
variability:
(i) Since it is based upon two extreme cases in the entire distribution, the range
may be considerably changed if either of the extreme cases happens to
drop out, while the removal of any other case would not affect it at all.
(ii) It does not tell anything about the distribution of values in the series relative
to a measure of central tendency.
(iii) It cannot be computed when distribution has open-end classes.
(iv) It does not take into account the entire data. These limitations can be seen in
the following illustration. Consider the data given in Table 17.2.
Table 17.2 Distribution with the Same Number of Cases, but Different Variability
                       No. of Students
Class      Section A    Section B    Section C
0–10       ...          ...          ...
10–20      1            ...          ...
20–30      12           12           19
30–40      17           20           18
40–50      29           35           16
50–60      18           25           18
60–70      16           10           18
70–80      6            8            21
80–90      11           ...          ...
90–100     ...          ...          ...
Total      110          110          110
Range      80           60           60
The table is designed to illustrate three distributions with the same number of
cases but different variability. The removal of two extreme students from section A
would make its range equal to that of B or C.
The greater range of A is not a description of the entire group of 110 students,
but of the two most extreme students only. Further, though sections B and C have
the same range, the students in section B cluster more closely around the central
tendency of the group than they do in section C. Thus, the range fails to reveal the
greater homogeneity of B or the greater dispersion of C. Due to this defect, it is
seldom used as a measure of dispersion.
Specific Uses of Range
In spite of the numerous limitations of the range as a measure of dispersion, there are
the following circumstances when it is the most appropriate one:
(a) In situations where the extremes involve some hazard for which preparation
should be made, it may be more important to know the most extreme cases
to be encountered than to know anything else about the distribution. For
example, an explorer would like to know the lowest and the highest
temperatures on record in the region he is about to enter; or an engineer
would like to know the maximum rainfall during 24 hours for the
construction of a storm-water drain.
(b) In the study of prices of securities, range has a special field of activity.
Thus to highlight fluctuations in the prices of shares or bullion it is a
common practice to indicate the range over which the prices have moved
during a certain period of time. This information, besides being of use to the
operators, gives an indication of the stability of the bullion market, or that
of the investment climate.
(c) In statistical quality control the range is used as a measure of variation.
For example, we determine the range over which variations in quality are due to
random causes, and this range is made the basis for the fixation of control limits.
Quartile Deviation (QD)
Another measure of dispersion, much better than the range, is the semi-interquartile
range, usually termed as ‘quartile deviation’. As stated in the previous unit, quartiles
are the points which divide the array in four equal parts. More precisely, Q1 gives the
value of the item 1/4th the way up the distribution and Q3 the value of the item 3/4th
the way up the distribution. Between Q1 and Q3 are included half the total number of
items. The difference between Q1 and Q3 includes only the central items but excludes
the extremes. Since under most circumstances, the central half of the series tends to
be fairly typical of all the items, the interquartile range (Q3– Q1) affords a convenient
and often a good indicator of the absolute variability. The larger the interquartile
range, the larger the variability.
Usually, one-half of the difference between Q3 and Q1 is used and to it is given
the name of quartile deviation or semi-interquartile range. The interquartile range is
divided by two for the reason that half of the interquartile range will, in a normal
distribution, be equal to the difference between the median and any quartile. This
means that 50 per cent items of a normal distribution will lie within the interval
defined by the median plus and minus the semi-interquartile range.
Symbolically,
\[ Q.D. = \frac{Q_3 - Q_1}{2} \]
Let us find the quartile deviations for the weekly earnings of labourers in the four
workshops whose data are given in Table 17.1. The computations are as shown in
Table 17.3.
As shown in the table, the Q.D. of workshop A is ₹ 2.12 and the median value is 25.3.
This means that if the distribution is symmetrical, the number of workers whose wages
vary between (25.3 – 2.1) = ₹ 23.2 and (25.3 + 2.1) = ₹ 27.4 shall be just half of the
total cases. The other half of the workers will be more than ₹ 2.1 removed from the
median wage. As this distribution is not symmetrical, the distance between Q1 and the
median Q2 is not the same as that between Q3 and the median. Hence, the interval defined
by the median plus and minus the semi-interquartile range will not be exactly the same as
that given by the values of the two quartiles. Under such conditions the range between
₹ 23.2 and ₹ 27.4 will not include precisely 50 per cent of the workers.
If quartile deviation is to be used for comparing the variability of any two series,
it is necessary to convert the absolute measure to a coefficient of quartile deviation.
To do this the absolute measure is divided by the average size of the two quartiles.
Symbolically,
\[ \text{Coefficient of Quartile Deviation} = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
Applying this to our illustration of four workshops, the coefficients of Q.D. are
as given below.
Table 17.3 Calculation of Quartile Deviation

Location of Q2 = N/2 = 80/2 = 40 for each workshop
Workshop A: Q2 = 24.5 + (40 − 32)/20 × 2 = 24.5 + 0.8 = 25.3
Workshop B: Q2 = 24.5 + (40 − 30)/18 × 2 = 24.5 + 1.11 = 25.61
Workshop C: Q2 = 24.5 + (40 − 36)/14 × 2 = 24.5 + 0.57 = 25.07
Workshop D: Q2 = 24.5 + (40 − 34)/16 × 2 = 24.5 + 0.75 = 25.25

Location of Q1 = N/4 = 80/4 = 20 for each workshop
Workshop A: Q1 = 22.5 + (20 − 10)/22 × 2 = 22.5 + 0.91 = 23.41
Workshop B: Q1 = 22.5 + (20 − 16)/14 × 2 = 22.5 + 0.57 = 23.07
Workshop C: Q1 = 20.5 + (20 − 10)/10 × 2 = 20.5 + 2 = 22.5
Workshop D: Q1 = 22.5 + (20 − 18)/16 × 2 = 22.5 + 0.25 = 22.75

Location of Q3 = 3N/4 = 3 × 80/4 = 60 for each workshop
Workshop A: Q3 = 26.5 + (60 − 52)/14 × 2 = 26.5 + 1.14 = 27.64
Workshop B: Q3 = 26.5 + (60 − 48)/16 × 2 = 26.5 + 1.5 = 28.0
Workshop C: Q3 = 26.5 + (60 − 50)/12 × 2 = 26.5 + 1.67 = 28.17
Workshop D: Q3 = 26.5 + (60 − 50)/12 × 2 = 26.5 + 1.67 = 28.17

Quartile Deviation = (Q3 − Q1)/2
Workshop A: (27.64 − 23.41)/2 = 4.23/2 = ₹ 2.12
Workshop B: (28 − 23.07)/2 = 4.93/2 = ₹ 2.46
Workshop C: (28.17 − 22.5)/2 = 5.67/2 = ₹ 2.83
Workshop D: (28.17 − 22.75)/2 = 5.42/2 = ₹ 2.71

Coefficient of Quartile Deviation = (Q3 − Q1)/(Q3 + Q1)
Workshop A: (27.64 − 23.41)/(27.64 + 23.41) = 0.083
Workshop B: (28 − 23.07)/(28 + 23.07) = 0.097
Workshop C: (28.17 − 22.5)/(28.17 + 22.5) = 0.112
Workshop D: (28.17 − 22.75)/(28.17 + 22.75) = 0.106
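The quartile calculations of Table 17.3 can be reproduced from the frequencies of Table 17.1. The sketch below assumes, as the table's working does, that a class such as 23–24 runs between the boundaries 22.5 and 24.5 (width 2) and that values are spread evenly within a class; the names `quantile` and `quartile_deviation` are illustrative.

```python
CLASS_WIDTH = 2
FIRST_LOWER_BOUNDARY = 14.5   # class 15-16 is treated as 14.5-16.5

# Frequencies for classes 15-16, 17-18, ..., 37-38 (empty classes as 0).
workshops = {
    "A": [0, 0, 0, 10, 22, 20, 14, 14, 0, 0, 0, 0],
    "B": [0, 2, 4, 10, 14, 18, 16, 10, 6, 0, 0, 0],
    "C": [2, 4, 4, 10, 16, 14, 12, 6, 6, 2, 0, 4],
    "D": [0, 0, 4, 14, 16, 16, 12, 12, 4, 2, 0, 0],
}


def quantile(frequencies, fraction):
    """Value below which the given fraction of the items lies."""
    target = fraction * sum(frequencies)      # e.g. N/4 for Q1
    cumulative = 0
    for k, f in enumerate(frequencies):
        if f > 0 and cumulative + f >= target:
            lower = FIRST_LOWER_BOUNDARY + k * CLASS_WIDTH
            return lower + (target - cumulative) / f * CLASS_WIDTH
        cumulative += f
    raise ValueError("empty distribution")


def quartile_deviation(frequencies):
    """Return (quartile deviation, coefficient of quartile deviation)."""
    q1 = quantile(frequencies, 0.25)
    q3 = quantile(frequencies, 0.75)
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


for name, freqs in workshops.items():
    qd, coefficient = quartile_deviation(freqs)
    print(name, round(quantile(freqs, 0.5), 2), round(qd, 2), round(coefficient, 3))
```

Small differences in the last decimal place from the table can arise because the table rounds Q1 and Q3 before subtracting, whereas the sketch rounds only at the end.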
Characteristics of Quartile Deviation
(i) The size of the quartile deviation gives an indication about the uniformity or
otherwise of the size of the items of a distribution. If the quartile deviation
is small it denotes large uniformity. Thus, a coefficient of quartile deviation
may be used for comparing uniformity or variation in different distributions.
(ii) Quartile deviation is not a measure of dispersion in the sense that it does
not show the scatter around an average, but only a distance on scale.
Consequently, quartile deviation is regarded as a measure of partition.
(iii) It can be computed when the distribution has open-end classes.
Limitations of Quartile Deviation
Except for the fact that its computation is simple and it is easy to understand, a
quartile deviation does not satisfy any other test of a good measure of variation.
Mean Deviation (MD)
A weakness of the measures of dispersion discussed earlier, based upon the range or
a portion thereof, is that the precise size of most of the variates has no effect on the
result. As an illustration, the quartile deviation will be the same whether the variates
between Q1 and Q3 are concentrated just above Q1 or they are spread uniformly
from Q1 to Q3. This is an important defect from the viewpoint of measuring the
divergence of the distribution from its typical value. The mean deviation is employed
to answer the objection.
Mean deviation, also called average deviation, of a frequency distribution is the
mean of the absolute values of the deviation from some measure of central tendency.
In other words, mean deviation is the arithmetic average of the variations
(deviations) of the individual items of the series from a measure of their central
tendency.
We can measure the deviations from any measure of central tendency, but the
most commonly employed ones are the median and the mean. The median is
preferred because it has the important property that the average deviation from it is
the least.
Calculation of the mean deviation then involves the following steps:
(a) Calculate the median (Md) or the mean (Me, written $\bar{x}$).
(b) Record the deviations | d | = | x – Md | (or | x – Me |) of each of the items,
ignoring the sign.
(c) Find the average value of the deviations:
Mean Deviation = $\frac{\sum |d|}{N}$
Example 17.1: Calculate the mean deviation from the following data giving marks
obtained by 11 students in a class test.
14, 15, 23, 20, 10, 30, 19, 18, 16, 25, 12
Solution: Median = Size of the $\frac{11 + 1}{2}$th item = Size of 6th item = 18
(with the marks arranged in ascending order).

Serial No.    Marks    | d | = | x – Median |
    1          10             8
    2          12             6
    3          14             4
    4          15             3
    5          16             2
    6          18             0
    7          19             1
    8          20             2
    9          23             5
   10          25             7
   11          30            12
                       ∑ | d | = 50

Mean Deviation from Median = $\frac{\sum |d|}{N} = \frac{50}{11}$ = 4.5 marks
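The three steps of the calculation can be sketched in Python as follows. This is only an illustration: `mean_deviation` is a helper name chosen here, and `statistics.median` from the standard library supplies the median.

```python
from statistics import median

# Marks of the 11 students in Example 17.1.
marks = [14, 15, 23, 20, 10, 30, 19, 18, 16, 25, 12]

def mean_deviation(values, centre):
    """Average of the absolute deviations |x - centre|."""
    return sum(abs(x - centre) for x in values) / len(values)

md = median(marks)                          # step (a): the median
print(md, round(mean_deviation(marks, md), 2))  # 18 4.55
```

The printed value 4.55 is 50/11 rounded to two places; the text rounds the same quotient to 4.5.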
For grouped data, it is easy to see that the mean deviation is given by,
Mean Deviation (M.D.) = $\frac{\sum f |d|}{\sum f}$
Where | d | = | x – median | for grouped discrete data, and | d | = | M – median |
for grouped continuous data with M as the mid-value of a particular group. The
following examples illustrate the use of this formula.
Example 17.2: Calculate the mean deviation from the following data:
Size of Item:   6    7    8    9   10   11   12
Frequency:      3    6    9   13    8    5    4

Solution:

Size    Frequency    Cumulative    Deviation from     f| d |
           (f)       Frequency     Median | d |
  6         3            3              3               9
  7         6            9              2              12
  8         9           18              1               9
  9        13           31              0               0
 10         8           39              1               8
 11         5           44              2              10
 12         4           48              3              12
Total      48                                          60

Median = Size of the $\frac{48 + 1}{2}$th item = 24.5th item, which is 9.
Therefore, deviations (d) are calculated from 9, i.e., | d | = | x – 9 |
Mean Deviation = $\frac{\sum f |d|}{\sum f} = \frac{60}{48}$ = 1.25
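For a discrete frequency distribution the same calculation can be sketched in Python. The helper names `grouped_median` and `grouped_mean_deviation` are illustrative assumptions, not part of any library.

```python
# Data of Example 17.2.
sizes = [6, 7, 8, 9, 10, 11, 12]
freqs = [3, 6, 9, 13, 8, 5, 4]

def grouped_median(xs, fs):
    """Size of the (N + 1)/2-th item, located through cumulative frequencies."""
    position = (sum(fs) + 1) / 2
    cumulative = 0
    for x, f in zip(xs, fs):
        cumulative += f
        if cumulative >= position:
            return x
    raise ValueError("empty distribution")

def grouped_mean_deviation(xs, fs, centre):
    """Sum of f * |x - centre| divided by the total frequency."""
    return sum(f * abs(x - centre) for x, f in zip(xs, fs)) / sum(fs)

med = grouped_median(sizes, freqs)
print(med, grouped_mean_deviation(sizes, freqs, med))  # 9 1.25
```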
Example 17.3: Calculate the mean deviation from the following data:
x:   0–10   10–20   20–30   30–40   40–50   50–60   60–70   70–80
f:    18      16      15      12      10       5       2       2
Solution:
This is a frequency distribution with continuous variable. Thus, deviations are
calculated from midvalues.
   x       Midvalue     f     Less than    Deviation from     f| d |
                                c.f.       Median | d |
 0–10          5       18        18             19              342
10–20         15       16        34              9              144
20–30         25       15        49              1               15
30–40         35       12        61             11              132
40–50         45       10        71             21              210
50–60         55        5        76             31              155
60–70         65        2        78             41               82
70–80         75        2        80             51              102
Total                  80                                      1182

Median = Size of the $\frac{80}{2}$th item, i.e., the 40th item, which lies in the class 20–30:
Median = $20 + \frac{40 - 34}{15} \times 10 = 20 + \frac{6}{15} \times 10$ = 24
and then, Mean Deviation = $\frac{\sum f |d|}{\sum f} = \frac{1182}{80}$ = 14.775
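For continuous classes, the median is first interpolated within the median class and the deviations are then taken from the mid-values. A minimal Python sketch, with helper names chosen here for illustration:

```python
# Data of Example 17.3: class limits and frequencies.
classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [18, 16, 15, 12, 10, 5, 2, 2]

def interpolated_median(classes, freqs):
    """Median = L + (N/2 - c.f. below) / f * class width."""
    target = sum(freqs) / 2
    below = 0  # cumulative frequency below the current class
    for (lower, upper), f in zip(classes, freqs):
        if below + f >= target:
            return lower + (target - below) / f * (upper - lower)
        below += f
    raise ValueError("empty distribution")

def mean_deviation_grouped(classes, freqs, centre):
    """Mean deviation using class mid-values M in place of x."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    return sum(f * abs(m - centre) for m, f in zip(mids, freqs)) / sum(freqs)

med = interpolated_median(classes, freqs)
md = mean_deviation_grouped(classes, freqs, med)
print(round(med, 3), round(md, 3), round(md / med, 3))
```

The last printed figure is the mean deviation divided by the median, i.e., the relative form of the measure.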
Merits and Demerits of the Mean Deviation
Merits
(i) It is easy to understand.
(ii) As compared to standard deviation (discussed later), its computation is
simple.
(iii) As compared to standard deviation, it is less affected by extreme values.
(iv) Since it is based on all values in the distribution, it is better than range or
quartile deviation.
Demerits
(i) It lacks those algebraic properties which would facilitate its computation
and establish its relation to other measures.
(ii) Due to this, it is not suitable for further mathematical processing.
Coefficient of Mean Deviation
The coefficient of mean deviation, a relative measure of dispersion, is found by
dividing the mean deviation by the average from which the deviations were
recorded, i.e., by the mean or by the median. Thus,
Coefficient of M.D. = $\frac{\text{Mean Deviation}}{\text{Mean}}$
(when deviations were recorded from the mean)
= $\frac{\text{M.D.}}{\text{Median}}$
(when deviations were recorded from the median)
Applying the above formula to Example 17.3,
Coefficient of Mean Deviation = $\frac{14.775}{24}$ = 0.616
Check Your Progress - 1
1. What is dispersion defined as?
2. What is range?
17.3 STANDARD DEVIATION
By far the most universally used and the most useful measure of dispersion is the
standard deviation or root mean square deviation about the mean. We have seen that
all the methods of measuring dispersion so far discussed are not universally adopted
for want of adequacy and accuracy. The range is not satisfactory as its magnitude is
determined by the most extreme cases in the entire group. Further, the range is unstable
because it depends on items whose size is largely a matter of chance. Mean
deviation method is also an unsatisfactory measure of scatter, as it ignores the
algebraic signs of deviation. We desire a measure of scatter which is free from these
shortcomings. To some extent standard deviation is one such measure.
The calculation of standard deviation differs in the following respects from that
of mean deviation. First, in calculating standard deviation, the deviations are squared.
This is done so as to get rid of negative signs without committing algebraic violence.
Further, the squaring of deviations provides added weight to the extreme items, a
desirable feature for certain types of series.
Secondly, the deviations are always recorded from the arithmetic mean, because
although the sum of deviations is the minimum from the median, the sum of squares
of deviations is minimum when deviations are measured from the arithmetic average.
The deviation from $\bar{x}$ is represented by d.
Thus, standard deviation, σ (sigma), is defined as the square root of the mean of
the squares of the deviations of individual items from their arithmetic mean.
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$    (17.1)
For grouped data (discrete variables),
$\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f}}$    (17.2)
and, for grouped data (continuous variables),
$\sigma = \sqrt{\frac{\sum f (M - \bar{x})^2}{\sum f}}$    (17.3)
Where M is the midvalue of the group.
The use of these formulae is illustrated by the following examples.
Example 17.4: Compute the standard deviation for the following data:
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
Solution: Here formula (17.1) is appropriate. We first calculate the mean as
$\bar{x} = \sum x / N = 176/11 = 16$, and then calculate the deviations as follows:

   x      $(x - \bar{x})$      $(x - \bar{x})^2$
  11          –5                  25
  12          –4                  16
  13          –3                   9
  14          –2                   4
  15          –1                   1
  16           0                   0
  17          +1                   1
  18          +2                   4
  19          +3                   9
  20          +4                  16
  21          +5                  25
Total 176                        110

Thus by formula (17.1),
$\sigma = \sqrt{\frac{110}{11}} = \sqrt{10}$ = 3.16
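Formula (17.1) translates directly into a few lines of Python. The function below is a sketch written for this example; `statistics.pstdev` is the standard library's population standard deviation and is used only as a cross-check.

```python
from math import sqrt
from statistics import pstdev

data = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]

def standard_deviation(values):
    """Root mean square deviation about the arithmetic mean (divisor N)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

print(round(standard_deviation(data), 2))  # 3.16
print(round(pstdev(data), 2))              # 3.16
```

Note that the divisor is N, as in the text, not N − 1; the library function `statistics.stdev` uses N − 1 and would give a slightly larger value.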
Example 17.5: Find the standard deviation of the data in the following distribution:
x:   12   13   14   15   16   17   18   20
f:    4   11   32   21   15    8    5    4
Solution: For this discrete variable grouped data, we use formula (17.2). Since for
calculation of $\bar{x}$ we need $\sum fx$, and then for σ we need $\sum f (x - \bar{x})^2$, the calculations
are conveniently made in the following format.

   x       f       fx     $d = x - \bar{x}$     $d^2$     $fd^2$
  12       4       48         –3           9       36
  13      11      143         –2           4       44
  14      32      448         –1           1       32
  15      21      315          0           0        0
  16      15      240          1           1       15
  17       8      136          2           4       32
  18       5       90          3           9       45
  20       4       80          5          25      100
Total    100     1500                             304

Here $\bar{x} = \sum fx / \sum f = 1500/100 = 15$
and $\sigma = \sqrt{\frac{\sum fd^2}{\sum f}} = \sqrt{\frac{304}{100}} = \sqrt{3.04}$ = 1.74
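Formula (17.2) can be sketched in Python as below. The frequencies are those used in the solution table, which total 100; `grouped_sd` is an illustrative helper name.

```python
from math import sqrt

# Values and frequencies as used in the solution table of Example 17.5.
xs = [12, 13, 14, 15, 16, 17, 18, 20]
fs = [4, 11, 32, 21, 15, 8, 5, 4]

def grouped_sd(xs, fs):
    """Return (mean, standard deviation) of a discrete frequency distribution."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n
    return mean, sqrt(variance)

mean, sd = grouped_sd(xs, fs)
print(mean, round(sd, 2))  # 15.0 1.74
```

For continuous classes, as in formula (17.3), the same function applies if the class mid-values are passed in place of `xs`.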
Example 17.6: Calculate the standard deviation of the following data:
Class:       1–3   3–5   5–7   7–9   9–11   11–13   13–15
Frequency:    1     9    25    35     17      10       3
Solution: This is an example of continuous frequency series and formula (17.3)
seems appropriate.
Class     Midpoint    Frequency     fx     Deviation of     Squared       Squared Deviation
            (x)          (f)              Midpoint from    Deviation     times Frequency
                                            Mean (d)        ($d^2$)          ($fd^2$)
1–3          2            1           2        –6             36               36
3–5          4            9          36        –4             16              144
5–7          6           25         150        –2              4              100
7–9          8           35         280         0              0                0
9–11        10           17         170         2              4               68
11–13       12           10         120         4             16              160
13–15       14            3          42         6             36              108
Total                   100         800                                       616

First the mean is calculated as,
$\bar{x} = \sum fx / \sum f = 800/100 = 8.0$
Then the deviations are obtained from 8.0. The standard deviation,
$\sigma = \sqrt{\frac{\sum f (M - \bar{x})^2}{\sum f}} = \sqrt{\frac{\sum fd^2}{\sum f}} = \sqrt{\frac{616}{100}}$ = 2.48
Calculation of Standard Deviation by Short-Cut Method
The three examples worked out above have one common simplifying feature, namely
$\bar{x}$ in each turned out to be an integer, thus simplifying calculations. In most cases,
it is very unlikely that it will turn out to be so. In such cases, the calculation of d and
$d^2$ becomes quite time-consuming. Short-cut methods have consequently been
developed. These are on the same lines as those for calculation of the mean itself.
In the short-cut method, we calculate deviations x' from an assumed mean A.
Then, for ungrouped data,
$\sigma = \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2}$    (17.4)
and for grouped data,
$\sigma = \sqrt{\frac{\sum f x'^2}{\sum f} - \left(\frac{\sum f x'}{\sum f}\right)^2}$    (17.5)
This formula is valid for both discrete and continuous variables. In case of
continuous variables, x in the equation x' = x – A stands for the midvalue of the class
in question.
Note that the second term in each of the formulae is a correction term because
of the difference in the values of A and $\bar{x}$. When A is taken as $\bar{x}$ itself, this correction
is automatically reduced to zero. The following examples explain the use of these
formulae.
Example 17.7: Compute the standard deviation by the short-cut method for the
following data:
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
Solution: Let us assume that A = 15.

   x      x' = (x – 15)      $x'^2$
  11          –4               16
  12          –3                9
  13          –2                4
  14          –1                1
  15           0                0
  16           1                1
  17           2                4
  18           3                9
  19           4               16
  20           5               25
  21           6               36
N = 11    ∑ x′ = 11      ∑ x′² = 121

$\sigma = \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2} = \sqrt{\frac{121}{11} - \left(\frac{11}{11}\right)^2} = \sqrt{11 - 1} = \sqrt{10}$ = 3.16
Another Method: If we assumed A as zero, then the deviation of each item from
the assumed mean is the same as the value of item itself. Thus, 11 deviates from the
assumed mean of zero by 11, 12 deviates by 12, and so on. As such, we work with
deviations without having to compute them, and the formula takes the following
shape:
$\sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$

   x        $x^2$
  11       121
  12       144
  13       169
  14       196
  15       225
  16       256
  17       289
  18       324
  19       361
  20       400
  21       441
Total 176  2926

$\sigma = \sqrt{\frac{2926}{11} - \left(\frac{176}{11}\right)^2} = \sqrt{266 - 256}$ = 3.16
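A short Python sketch shows that formula (17.4) gives the same result whatever assumed mean A is chosen, since the second term corrects for the gap between A and the true mean. The function name is an illustrative choice.

```python
from math import sqrt

def sd_from_assumed_mean(values, a):
    """Formula (17.4): deviations x' = x - a taken from an assumed mean a."""
    n = len(values)
    devs = [x - a for x in values]
    mean_sq = sum(d * d for d in devs) / n   # sum(x'^2) / N
    correction = (sum(devs) / n) ** 2        # (sum(x') / N)^2
    return sqrt(mean_sq - correction)

data = list(range(11, 22))  # 11, 12, ..., 21
for a in (15, 0, 16):
    print(a, round(sd_from_assumed_mean(data, a), 2))
```

With A = 15 and A = 0 this reproduces the two calculations above; with A = 16, the true mean, the correction term vanishes.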
Example 17.8: Calculate the standard deviation of the following data by short-cut
method.
Person:                    1     2     3     4     5     6     7
Monthly Income (Rupees): 300   400   420   440   460   480   580
Solution: In this data, the values of the variable are very large, making calculations
cumbersome. It is advantageous to take a common factor out. Thus, we use
$x' = \frac{x - A}{20}$. The standard deviation is calculated using x' and then the true value of σ is
obtained by multiplying back by 20. The effective formula used is,
$\sigma = C \times \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2}$
Where C represents the common factor.
Using x' = (x – 420)/20,

   x      Deviation from Assumed      x' = (x – 420)/20      $x'^2$
             Mean (x – 420)
  300           –120                       –6                 36
  400            –20                       –1                  1
  420              0                        0                  0
  440             20                        1                  1
  460             40                        2                  4
  480             60                        3                  9
  580            160                        8                 64
N = 7                           ∑ x′ = –7 + 14 = 7      ∑ x′² = 115

$\sigma = 20 \times \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2} = 20 \times \sqrt{\frac{115}{7} - \left(\frac{7}{7}\right)^2}$ = 78.56
Example 17.9: Calculate the standard deviation from the following data:
Size:        6    9   12   15   18
Frequency:   7   12   19   10    2

Solution:

   x     Frequency    Deviation from     Deviation divided     x' times      x'² times
            (f)       Assumed Mean 12    by Common Factor 3    Frequency     Frequency
                                               (x')              (fx')         (fx'²)
   6        7              –6                  –2                –14            28
   9       12              –3                  –1                –12            12
  12       19               0                   0                  0             0
  15       10               3                   1                 10            10
  18        2               6                   2                  4             8
        N = 50                                              ∑ fx′ = –12    ∑ fx′² = 58

Since deviations have been divided by a common factor, we use,
$\sigma = C \times \sqrt{\frac{\sum f x'^2}{N} - \left(\frac{\sum f x'}{N}\right)^2} = 3 \times \sqrt{\frac{58}{50} - \left(\frac{-12}{50}\right)^2}$
$= 3 \times \sqrt{1.1600 - 0.0576}$
= 3 × 1.05 = 3.15
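The step-deviation form of the short-cut method, with an assumed mean A and a common factor C, can be sketched in Python as follows. The function and parameter names are illustrative assumptions.

```python
from math import sqrt

def sd_step_deviation(xs, fs, a, c):
    """sigma = c * sqrt(sum(f x'^2)/N - (sum(f x')/N)^2), with x' = (x - a)/c."""
    n = sum(fs)
    steps = [(x - a) / c for x in xs]
    s1 = sum(f * u for u, f in zip(steps, fs))      # sum of f x'
    s2 = sum(f * u * u for u, f in zip(steps, fs))  # sum of f x'^2
    return c * sqrt(s2 / n - (s1 / n) ** 2)

# Example 17.9: assumed mean 12, common factor 3.
print(round(sd_step_deviation([6, 9, 12, 15, 18], [7, 12, 19, 10, 2], a=12, c=3), 2))  # 3.15

# Example 17.8: ungrouped incomes (each frequency 1), assumed mean 420, factor 20.
incomes = [300, 400, 420, 440, 460, 480, 580]
print(round(sd_step_deviation(incomes, [1] * 7, a=420, c=20), 2))  # 78.56
```

The choice of A and C affects only the size of the intermediate numbers, not the result.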
Example 17.10: Obtain the mean and standard deviation of the first N natural numbers,
i.e., of 1, 2, 3, ..., N – 1, N.
Solution: Let x denote the variable which assumes the values of the first N natural
numbers.
Then,
$\sum x = 1 + 2 + 3 + \dots + (N - 1) + N = \frac{N(N + 1)}{2}$
Hence,
$\bar{x} = \frac{\sum x}{N} = \frac{N(N + 1)}{2N} = \frac{N + 1}{2}$
To calculate the standard deviation σ, we use 0 as the assumed mean A. Then,
$\sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$
But,
$\sum x^2 = 1^2 + 2^2 + 3^2 + \dots + (N - 1)^2 + N^2 = \frac{N(N + 1)(2N + 1)}{6}$
Therefore,
$\sigma = \sqrt{\frac{N(N + 1)(2N + 1)}{6N} - \frac{N^2 (N + 1)^2}{4N^2}}$
$= \sqrt{\frac{N + 1}{2}\left[\frac{2N + 1}{3} - \frac{N + 1}{2}\right]} = \sqrt{\frac{(N + 1)(N - 1)}{12}}$
Thus for first 11 natural numbers,
$\bar{x} = \frac{11 + 1}{2} = 6$
and
$\sigma = \sqrt{\frac{(11 + 1)(11 - 1)}{12}} = \sqrt{10}$ = 3.16
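The closed-form result can be checked numerically against a direct calculation. The sketch below uses `statistics.pstdev` from the Python standard library for the direct value; `sd_first_n` is a name chosen here.

```python
from math import sqrt, isclose
from statistics import pstdev

def sd_first_n(n):
    """Standard deviation of 1, 2, ..., n from the closed form sqrt((n+1)(n-1)/12)."""
    return sqrt((n + 1) * (n - 1) / 12)

for n in (5, 11, 100):
    direct = pstdev(list(range(1, n + 1)))  # population SD computed from the data
    print(n, round(sd_first_n(n), 4), isclose(sd_first_n(n), direct))
```

For n = 11 both routes give the square root of 10, the same value found for the series 11 to 21 earlier, since shifting every item by a constant leaves the standard deviation unchanged.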
Example 17.11:
Class     Midpoint    Frequency    Deviation from Class    Deviation times    Squared Deviation
            (x)          (f)       of Assumed Mean (x')    Frequency (fx')    times Frequency (fx'²)
0–10         5           18              –2                    –36                  72
10–20       15           16              –1                    –16                  16
20–30       25           15               0                      0                   0
30–40       35           12               1                     12                  12
40–50       45           10               2                     20                  40
50–60       55            5               3                     15                  45
60–70       65            2               4                      8                  32
70–80       75            1               5                      5                  25
                     ∑ f = 79                        ∑ fx′ = –52 + 60 = 8      ∑ fx′² = 242

Solution: Since the deviations are from assumed mean and expressed in terms of
class interval units,
$\sigma = i \times \sqrt{\frac{\sum f x'^2}{N} - \left(\frac{\sum f x'}{N}\right)^2} = 10 \times \sqrt{\frac{242}{79} - \left(\frac{8}{79}\right)^2}$
= 10 × 1.75 = 17.5
Combining Standard Deviations of Two Distributions
If we are given two sets of data of N1 and N2 items with means $\bar{x}_1$ and $\bar{x}_2$ and
standard deviations σ1 and σ2 respectively, we can obtain the mean $\bar{x}$ and standard
deviation σ of the combined distribution by the given formulae:
$\bar{x} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2}{N_1 + N_2}$    (17.6)
and
$\sigma = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 (\bar{x} - \bar{x}_1)^2 + N_2 (\bar{x} - \bar{x}_2)^2}{N_1 + N_2}}$    (17.7)
Example 17.12: The mean and standard deviations of two distributions of 100 and
150 items are 50, 5 and 40, 6 respectively. Find the standard deviation of all taken
together.
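Solution: Combined mean,
$\bar{x} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2}{N_1 + N_2} = \frac{100 \times 50 + 150 \times 40}{100 + 150}$ = 44
Combined standard deviation,
$\sigma = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 (\bar{x} - \bar{x}_1)^2 + N_2 (\bar{x} - \bar{x}_2)^2}{N_1 + N_2}}$
$= \sqrt{\frac{100 \times (5)^2 + 150 \times (6)^2 + 100 (44 - 50)^2 + 150 (44 - 40)^2}{100 + 150}}$
= 7.46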
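Formulae (17.6) and (17.7) extend naturally to any number of component series. The Python sketch below assumes each series is described by a (size, mean, standard deviation) triple; the function name `combine` is illustrative.

```python
from math import sqrt

def combine(groups):
    """groups: list of (n, mean, sd) triples. Returns (combined mean, combined sd)."""
    total = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total
    # Within-group spread plus the spread of group means about the combined mean.
    variance = sum(n * s ** 2 + n * (mean - m) ** 2 for n, m, s in groups) / total
    return mean, sqrt(variance)

# Two distributions of 100 and 150 items (means 50 and 40, SDs 5 and 6).
mean, sd = combine([(100, 50, 5), (150, 40, 6)])
print(mean, round(sd, 2))  # 44.0 7.46
```

Passing three triples, for instance (200, 25, 3), (250, 10, 4) and (300, 15, 5), applies the same formula to a three-component distribution.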
Example 17.13: A distribution consists of three components with 200, 250, 300
items having mean 25, 10 and 15 and standard deviation 3, 4 and 5, respectively.
Find the standard deviation of the combined distribution.
Solution: In the usual notations, we are given here:
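N1 = 200, N2 = 250, N3 = 300
$\bar{x}_1 = 25$, $\bar{x}_2 = 10$, $\bar{x}_3 = 15$
σ1 = 3, σ2 = 4, σ3 = 5
The formulae (17.6) and (17.7) can easily be extended for combination of three
series as
$\bar{x} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2 + N_3 \bar{x}_3}{N_1 + N_2 + N_3} = \frac{200 \times 25 + 250 \times 10 + 300 \times 15}{200 + 250 + 300} = \frac{12000}{750}$ = 16
and,
$\sigma = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_3 \sigma_3^2 + N_1 (\bar{x} - \bar{x}_1)^2 + N_2 (\bar{x} - \bar{x}_2)^2 + N_3 (\bar{x} - \bar{x}_3)^2}{N_1 + N_2 + N_3}}$
$= \sqrt{\frac{200 \times 9 + 250 \times 16 + 300 \times 25 + 200 \times 81 + 250 \times 36 + 300 \times 1}{200 + 250 + 300}}$
$= \sqrt{51.73}$ = 7.19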
Comparison of various measures of Dispersion
The range is the easiest measure of dispersion to calculate, but since it depends
on extreme values, it is extremely sensitive to the size of the sample, and to the
sample variability. In fact, as the sample size increases the range increases
dramatically, because the more the items one considers, the more likely it is that
some item will turn up which is larger than the previous maximum or smaller than the
previous minimum. So, it is, in general, impossible to interpret properly the
significance of a given range unless the sample size is constant. It is for this reason
that there appears to be only one valid application of the range, namely in statistical
quality control where the same sample size is repeatedly used, so that comparison of
ranges are not distorted by differences in sample size.
The quartile deviations and other such positional measures of dispersions are
also easy to calculate but suffer from the disadvantage that they are not amenable to
algebraic treatment. Similarly, the mean deviation is not suitable because we cannot
obtain the mean deviation of a combined series from the deviations of component
series. However, it is easy to interpret and easier to calculate than the standard
deviation.
The standard deviation of a set of data, on the other hand, is one of the most
important statistics describing it. It lends itself to rigorous algebraic treatment, is
rigidly defined and is based on all observations. It is, therefore, quite insensitive to
sample size (provided the size is ‘large enough’) and is least affected by sampling
variations.
It is used extensively in testing of hypothesis about population parameters based
on sampling statistics.
In fact, the standard deviation has such stable mathematical properties that it is
used as a standard scale for measuring deviations from the mean. If we are told that
the performance of an individual is 10 points better than the mean, it really does not
tell us enough, for 10 points may or may not be a large enough difference to be of
significance. But if we know that the σ for the score is only 4 points, so that on this
scale, the performance is 2.5 σ better than the mean, the statement becomes meaningful.
This indicates an extremely good performance. This sigma scale is a very commonly
used scale for measuring and specifying deviations which immediately suggest the
significance of the deviation.
The only disadvantages of the standard deviation lie in the amount of work
involved in its calculation, and the large weight it attaches to extreme values because
of the process of squaring involved in its calculations.
Solved Problems
Example 17.14: The arithmetic mean and standard deviation of a series of 20 items
were calculated by a student as 20 cm and 5 cm respectively. But while calculating
them an item 13 was misread as 30. Find the correct arithmetic mean and standard
deviation.
Solution: In the usual notations, we are given
N = 20, $\bar{X}$ = 20 and σ = 5
$\sum X = N \bar{X} = 20 \times 20 = 400$
Corrected $\sum X$ = 400 – 30 + 13 = 383
Corrected $\bar{X} = \frac{\text{Corrected } \sum X}{N} = \frac{383}{20}$ = 19.15
Also we know that,
$\sigma^2 = \frac{\sum X^2}{N} - (\bar{X})^2$
or
$\sum X^2 = N(\sigma^2 + \bar{X}^2) = 20 (25 + 400) = 8500$
Corrected $\sum X^2 = 8500 - (30)^2 + (13)^2$ = 8500 – 900 + 169 = 7769
Corrected $\sigma^2 = \frac{\text{Corrected } \sum X^2}{N} - \left(\frac{\text{Corrected } \sum X}{N}\right)^2 = \frac{7769}{20} - \left(\frac{383}{20}\right)^2$
= 388.45 – 366.72 = 21.73
σ = 4.66
Hence, the correct mean is 19.15 and correct standard deviation 4.66.
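The correction procedure can be sketched in Python: recover the two totals from the reported mean and standard deviation, swap the misread item for the true one, and recompute. The function and argument names are illustrative.

```python
from math import sqrt

def correct_mean_sd(n, mean, sd, wrong, right):
    """Recompute mean and SD after replacing a misread item by its true value."""
    total = n * mean - wrong + right                                 # corrected sum of X
    total_sq = n * (sd ** 2 + mean ** 2) - wrong ** 2 + right ** 2   # corrected sum of X^2
    new_mean = total / n
    new_sd = sqrt(total_sq / n - new_mean ** 2)
    return new_mean, new_sd

new_mean, new_sd = correct_mean_sd(n=20, mean=20, sd=5, wrong=30, right=13)
print(new_mean, round(new_sd, 2))  # 19.15 4.66
```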
Example 17.15: Mean, and standard deviation of the following continuous series are
31 and 15.9 respectively. The distribution after taking step deviation is as follows:
X' :   –3   –2   –1    0    1    2    3
f  :   10   15   25   25   10   10    5
Determine the actual class intervals.
Solution:
X'   :   –3    –2    –1     0     1     2     3    Total
f    :   10    15    25    25    10    10     5     100
fX'  :  –30   –30   –25     0    10    20    15     –40
fX'² :   90    60    25     0    10    40    45     270

Standard Deviation = $\sqrt{\frac{\sum fX'^2}{N} - \left(\frac{\sum fX'}{N}\right)^2} \times i$, with N = 100
Putting the known values, we have
$15.9 = \sqrt{\frac{270}{100} - \left(\frac{-40}{100}\right)^2} \times i = \sqrt{2.70 - 0.16} \times i = 1.59 \times i$
∴ $i = \frac{15.9}{1.59}$ = 10
Arithmetic Mean = $A + \frac{\sum fX'}{N} \times i$
∴ Putting the known values, we have
$31 = A + \frac{-40}{100} \times 10$
or A = 31 + 4 = 35
A, the assumed mean, is the midpoint of the class having X' value 0.
As the class interval is 10 and the variable under study is a continuous one, the
class for which X' = 0 will be 35 – 5 to 35 + 5, i.e., 30 to 40. The class next lower than
this is 30 – 10 to 30, i.e., 20 to 30.
Similarly other classes can be calculated. So all the class intervals are:
0–10, 10–20, 20–30, 30–40, 40–50, 50–60, 60–70
Example 17.16: The mean of 50 readings of a variable was 7.43 and their S.D.
was 0.28. The following ten additional readings become available: 6.80, 7.81, 7.58,
7.70, 8.05, 6.98, 7.78, 7.85, 7.21 and 7.40. If these are included with original 50
readings, find (i) The mean, (ii) The standard deviation of the whole set of 60
readings.
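Solution: Mean of 50 readings = 7.43
Mean of 10 additional readings = $\frac{\sum X}{N}$
$= \frac{6.80 + 7.81 + 7.58 + 7.70 + 8.05 + 6.98 + 7.78 + 7.85 + 7.21 + 7.40}{10} = \frac{75.16}{10}$ = 7.516
Mean of 60 readings = $\frac{7.43 \times 50 + 7.516 \times 10}{50 + 10} = \frac{371.5 + 75.16}{60} = \frac{446.66}{60}$ = 7.44
Standard Deviation of the 50 readings,
$0.28 = \sqrt{\frac{\sum X^2}{50} - (7.43)^2}$
$0.0784 = \frac{\sum X^2}{50} - 55.2049$
ΣX² = (0.0784 + 55.2049) × 50 = 2764.165
Sum of squares of 10 additional readings,
46.24 + 61.00 + 57.46 + 59.29 + 64.80 + 48.72 + 60.53 + 61.62 + 51.98 + 54.76
= 566.40
Sum of the squares of 60 readings = 2764.165 + 566.40 = 3330.565
The variance is a small difference between two large numbers, so the mean is not
rounded before squaring:
∴ S.D. of 60 readings = $\sqrt{\frac{3330.565}{60} - \left(\frac{446.66}{60}\right)^2}$
$= \sqrt{55.5094 - 55.4181} = \sqrt{0.0913}$ = 0.30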
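The calculation is sensitive to rounding, because the variance is obtained as a small difference between two numbers close to 55.5. A Python sketch that carries full precision throughout gives a standard deviation of about 0.30 for the 60 readings; rounding the combined mean to 7.44 before squaring shifts the result noticeably. Variable names below are illustrative.

```python
from math import sqrt

n1, mean1, sd1 = 50, 7.43, 0.28
extra = [6.80, 7.81, 7.58, 7.70, 8.05, 6.98, 7.78, 7.85, 7.21, 7.40]

# Totals for the original 50 readings, recovered from their mean and SD,
# plus the totals of the ten additional readings.
sum_x = n1 * mean1 + sum(extra)
sum_x2 = n1 * (sd1 ** 2 + mean1 ** 2) + sum(x * x for x in extra)

n = n1 + len(extra)
mean = sum_x / n
sd = sqrt(sum_x2 / n - mean ** 2)
print(round(mean, 2), round(sd, 2))  # 7.44 0.3
```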
Example 17.17: The first of two subgroups has 100 items with mean 15 and S.D.
3. If the whole group has 250 items with mean 15.6 and S.D. $\sqrt{13.44}$, find the S.D.
of the second group.
Solution: Combined A.M. = $\bar{X} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2}$
$15.6 = \frac{(100 \times 15) + (150 \times \bar{X}_2)}{250}$
$3900 = 1500 + 150 \bar{X}_2$
$\bar{X}_2 = 16$
Therefore the A.M. of the second group is 16.
Combined S.D.,
$\sqrt{13.44} = \sqrt{\frac{100 \times 9 + 100 (15 - 15.6)^2 + 150 \sigma_2^2 + 150 (16 - 15.6)^2}{250}}$
$13.44 = \frac{900 + 36 + 150 \sigma_2^2 + 24}{250}$
or $3360 = 960 + 150 \sigma_2^2$
$150 \sigma_2^2 = 2400$, $\sigma_2^2 = 16$, $\sigma_2 = 4$
Therefore the S.D. of the second group is 4.
Example 17.18: You are given the two variables A and B. Using quartile deviations
state which of the two is more dispersed?
            A                            B
Midpoint    Frequency        Midpoint    Frequency
   15          15               100         340
   20          33               150         492
   25          56               200         890
   30         103               250        1420
   35          40               300         620
   40          32               350         360
   45          10               400         187
                                450         140
Solution: To compare the variability, a comparison of the coefficients of quartile
deviation is required.
Coefficient of quartile deviation is $\frac{Q_3 - Q_1}{Q_3 + Q_1}$

Variable A
Midpoint    Class Interval      f     Cumulative Frequency
   15        12.5–17.5         15            15
   20        17.5–22.5         33            48
   25        22.5–27.5         56           104
   30        27.5–32.5        103           207
   35        32.5–37.5         40           247
   40        37.5–42.5         32           279
   45        42.5–47.5         10           289

Variable B
Midpoint    Class Interval      f     Cumulative Frequency
  100          75–125         340           340
  150         125–175         492           832
  200         175–225         890          1722
  250         225–275        1420          3142
  300         275–325         620          3762
  350         325–375         360          4122
  400         375–425         187          4309
  450         425–475         140          4449

Variable A:
Q1 has N/4, i.e., 289/4 or 72.25 items below it.
∴ It lies in the group 22.5–27.5
Q1 = $22.5 + \frac{72.25 - 48}{56} \times 5$ = 24.67
Q3 has 3N/4 or 216.75 items below it.
∴ Q3 lies in the group 32.5–37.5
Q3 = $32.5 + \frac{216.75 - 207}{40} \times 5$ = 33.72
Coefficient of Q.D. = $\frac{33.72 - 24.67}{33.72 + 24.67}$ = 0.155

Variable B:
Q1 has N/4, i.e., 4449/4 or 1112.25 items below it.
∴ It lies in the group 175–225
Q1 = $175 + \frac{1112.25 - 832}{890} \times 50$ = 190.7
Q3 has 3N/4 or 3336.75 items below it.
∴ Q3 lies in the group 275–325
Q3 = $275 + \frac{3336.75 - 3142}{620} \times 50$ = 290.7
Coefficient of Q.D. = $\frac{290.7 - 190.7}{290.7 + 190.7}$ = 0.21

As the coefficient of quartile deviation for B is higher, it is more variable.
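The quartile interpolation and the comparison of the two variables can be sketched in Python. The class limits are built from the midpoints with widths 5 (variable A) and 50 (variable B); the function names are illustrative.

```python
def grouped_quantile(classes, freqs, fraction):
    """Interpolate the value having fraction * N items below it."""
    target = fraction * sum(freqs)
    below = 0  # cumulative frequency below the current class
    for (lower, upper), f in zip(classes, freqs):
        if below + f >= target:
            return lower + (target - below) / f * (upper - lower)
        below += f
    raise ValueError("fraction must lie between 0 and 1")

def coefficient_of_qd(classes, freqs):
    q1 = grouped_quantile(classes, freqs, 0.25)
    q3 = grouped_quantile(classes, freqs, 0.75)
    return (q3 - q1) / (q3 + q1)

classes_a = [(12.5 + 5 * i, 17.5 + 5 * i) for i in range(7)]
freqs_a = [15, 33, 56, 103, 40, 32, 10]
classes_b = [(75 + 50 * i, 125 + 50 * i) for i in range(8)]
freqs_b = [340, 492, 890, 1420, 620, 360, 187, 140]

coeff_a = coefficient_of_qd(classes_a, freqs_a)
coeff_b = coefficient_of_qd(classes_b, freqs_b)
print(round(coeff_a, 3), round(coeff_b, 3))
```

The coefficient for B comes out larger than that for A, which agrees with the conclusion that B is the more variable of the two.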
Example 17.19: From the data given about four subgroups, calculate the average
and the standard deviation of the whole group.
Subgroup    No. of Men    Average Wage (₹)    Standard Deviation of Wage (₹)
   A            50             61.0                     8
   B           100             70.0                     9
   C           120             80.5                    10
   D            30             83.0                    11
Total          300
Solution:

Subgroup    Men    Average     $N\bar{X}$      σ      Nσ²      $\bar{X} - \bar{X}_c$    $N(\bar{X} - \bar{X}_c)^2$
            (N)    ($\bar{X}$)
   A         50     61         3050      8     3200      –13         8450
   B        100     70         7000      9     8100       –4         1600
   C        120     80.5       9660     10    12000       6.5        5070
   D         30     83         2490     11     3630       9.0        2430
Total       300               22200           26930                 17550

Combined Mean $\bar{X}_c = \frac{\sum N\bar{X}}{\sum N} = \frac{22200}{300}$ = ₹ 74
(Combined Standard Deviation)² $= \frac{\sum N\sigma^2}{\sum N} + \frac{\sum N(\bar{X} - \bar{X}_c)^2}{\sum N}$
$= \frac{26930}{300} + \frac{17550}{300} = \frac{44480}{300}$ = 148.27
σ = $\sqrt{148.27}$ = ₹ 12.18
Example 17.20: For a certain group of wage-earners, the median and quartile wages
per week were ₹ 44.3, ₹ 43.0 and ₹ 45.9 respectively. Wages for the group ranged
between ₹ 40 and ₹ 50. 10 per cent of the group had under ₹ 42 per week, 13 per
cent had ₹ 47 and over and 6 per cent ₹ 48 and over. Put these data into the form of
a frequency distribution, and hence obtain an estimate of the mean wage and the
standard deviation.
Solution: Assuming that the group has 100 workers, the frequency distribution will
take the following shape.

Earnings      No. of Wage-    Midvalue       fX       d = X – 44.5      fd        fd²
  (₹)         earners (f)       (X)
40–42             10           41.00       410.00       –3.50         –35.00    122.50
42–43             15           42.50       637.50       –2.00         –30.00     60.00
43–44.3           25           43.65      1091.25       –0.85         –21.25     18.06
44.3–45.9         25           45.10      1127.50       +0.60          15.00      9.00
45.9–47           12           46.45       557.40       +1.95          23.40     45.63
47–48              7           47.50       332.50       +3.00          21.00     63.00
48–50              6           49.00       294.00       +4.50          27.00    121.50
Total        ∑ f = 100                    4450.15                               439.69

$\bar{X} = \frac{\sum fX}{N} = \frac{4450.15}{100}$ = ₹ 44.50
$\sigma = \sqrt{\frac{439.69}{100}}$ = ₹ 2.1 (approx.)
Check Your Progress - 2
1. What is standard deviation?
2. What are the disadvantages of standard deviation?
17.4 SUMMARY
• A measure of dispersion may be expressed in an ‘absolute form’, or in a
‘relative form’.
• A measure of dispersion or simply dispersion may be defined as statistics
signifying the extent of the scatteredness of items around a measure of
central tendency.
• Absolute measures are expressed in concrete units, i.e., units in terms of
which the data have been expressed, e.g., rupees, centimeters, kilograms,
etc., and are used to describe frequency distribution.
• A relative measure of dispersion is a quotient obtained by dividing the
absolute measures by a quantity in respect to which absolute deviation has
been computed. It is as such a pure number and is usually expressed in a
percentage form.
• A good measure of dispersion should possess the same characteristics that
are considered essential for a measure of central tendency.
• In a frequency distribution, the range is taken to be the difference between
the lower limit of the class at the lower extreme of the distribution and the
upper limit of the class at the upper extreme.
• The range is a measure of absolute dispersion and as such cannot be
usefully employed for comparing the variability of two distributions
expressed in different units.
• An absolute measure can be converted into a relative measure if we divide
it by some other value regarded as standard for the purpose.
• An alternate method of converting an absolute variation into a relative one
would be to use the total of the extremes as the standard.
• Of the various characteristics that a good measure of dispersion should
possess, the range has only two, viz. (i) It is easy to understand, and (ii)
Its computation is simple.
• Besides the aforesaid two qualities, the range does not satisfy the other test
of a good measure and hence it is often termed as a crude measure of
dispersion.
• In situations where the extremes involve some hazard for which preparation
should be made, it may be more important to know the most extreme cases
to be encountered than to know anything else about the distribution.
• In statistical quality control the range is used as a measure of variation. We,
e.g., determine the range over which variations in quality are due to random
causes, which is made the basis for the fixation of control limits.
• Another measure of dispersion, much better than the range, is the semi-interquartile range, usually termed as ‘quartile deviation’.
• We can measure the deviations from any measure of central tendency, but
the most commonly employed ones are the median and the mean.
• The median is preferred because it has the important property that the
average deviation from it is the least.
• By far the most universally used and the most useful measure of dispersion
is the standard deviation or root mean square deviation about the mean.
• The calculation of standard deviation differs from that of mean deviation in
two respects. First, the deviations are squared. Secondly, the deviations are
always recorded from the arithmetic mean, because although the sum of
deviations is the minimum from the median, the sum of squares of deviations
is minimum when deviations are measured from the arithmetic average.
17.5 KEY WORDS
• Dispersion: It is the extent to which values of a variable differ from a fixed
value such as the mean.
• Mean deviation: It is the arithmetic average of the variations (deviations)
of the individual items of the series from a measure of their central
tendency.
17.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. Dispersion is defined as statistics signifying the extent of the scatteredness
of items around a measure of central tendency.
2. The range is a measure of absolute dispersion and as such cannot be
usefully employed for comparing the variability of two distributions
expressed in different units.
Check Your Progress - 2
1. Standard deviation is defined as the square root of the mean of the squares
of the deviations of individual items from their arithmetic mean.
2. The disadvantages of standard deviation lie in the amount of work
involved in its calculation, and the large weight it attaches to extreme values
because of the process of squaring involved in its calculations.
17.7 SELF-ASSESSMENT QUESTIONS
1. Define measures of dispersion. Also list its characteristics.
2. Discuss the merits and limitations of range.
3. What are the specific uses of range?
4. Write a short note on Quartile deviation.
5. Discuss the characteristics and limitations of Quartile deviation.
6. What do you understand by mean deviation? Discuss its merits and
demerits.
7. Explain the calculation of standard deviation by short cut method.
8. List the comparisons of various measures of dispersion.
17.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Measures of Skewness
UNIT–18
MEASURES OF SKEWNESS
Objectives
After going through this unit, you will be able to:
• Discuss the measures of skewness
• Describe Karl Pearson’s measure of skewness
• Analyse Kelly’s measure of skewness
Structure
18.1 Introduction
18.2 Measures of Skewness
18.3 Karl Pearson’s Measure of Skewness
18.4 Summary
18.5 Key Words
18.6 Answers to ‘Check Your Progress’
18.7 Self-Assessment Questions
18.8 Further Readings
18.1 INTRODUCTION
This unit discusses the measures of skewness. Skewness, in probability theory and
statistics, is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. A distribution, or data set, is symmetric if it
looks the same to the left and right of the center point. Conceptually, skewness
describes which side of a distribution has a longer tail. If the tail is longer on the right,
then the skewness is rightward or positive; if the tail is longer to the left, then the
skewness is leftward or negative. Right skewness is common when a variable is
bounded on the left but unbounded on the right. Left skewness is less common in
practice, but it can occur when a variable tends to be closer to its maximum than its
minimum value. This unit will discuss skewness, its characteristics and types in detail.
18.2 MEASURES OF SKEWNESS
When a frequency distribution is not symmetrical it is said to be asymmetrical or
skewed. The nature of symmetry and the various types of asymmetry are illustrated
in the given example.
The following table shows the heights of the students of a college:
Class Interval     A (f)     B (f)     C (f)     D (f)
56.5–58.5            5         3         0         4
58.5–60.5           25         5         4         8
60.5–62.5           15        20        40        20
62.5–64.5           10        44        24        24
64.5–66.5           15        20        20        40
66.5–68.5           25         5         8         4
68.5–70.5            5         3         4         0
N                  100       100       100       100
Mean (Me)         63.5      63.5      63.5      63.5
Median (Md)       63.5      63.5      63        64
Mode (Mo)        bimodal    63.5      61.9      65.1
The histograms and the corresponding curves are drawn in Figures 18.1 and 18.2.
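[Fig. 18.1: Histograms and frequency curves of distributions A and B over the range 56.5 to 70.5. In A, the mean and median coincide; in B, the mean, mode and median coincide.]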
A glance at the data of each of the four distributions given above makes a very
interesting study.
The shape of the curves, histograms and placement of equal items at equal
distances on either side of the median clearly show that distributions A and B are
symmetrical. If we fold these curves, or histograms on the ordinate at the mean, the
two halves of the curve or histograms will coincide. In distribution B, all the three
measures of central tendency are identical. In A, which is a bimodal distribution,
mean and median have the same value.
Distributions C and D are asymmetrical. This is evident from the shape of the
histograms and curves, and also from the fact that items at equal distances from the
median are not equal in number. The three measures of central tendency for each of
these distributions are of different sizes.
A point of difference between the asymmetry of distribution C and that of D
should be carefully noted. In distribution C, where the mean (63.5) is greater than
the median (63) and the mode (61.9), the curve is pulled more to the right. In
distribution D, where the mean (63.5) is less than the median (64) and the mode (65.1), the
curve is pulled more to the left.
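The tabulated averages for distributions C and D can be checked with a short Python sketch. It assumes the usual grouped-data interpolation formulas for the median and the mode, which this unit does not restate, and the function names are illustrative only.

```python
# Grouped-data mean, median and mode for distributions C and D of the heights table.
# Assumption: standard interpolation formulas for grouped data are used.

LOWER = [56.5, 58.5, 60.5, 62.5, 64.5, 66.5, 68.5]  # lower class boundaries
WIDTH = 2.0                                          # common class width
FREQ_C = [0, 4, 40, 24, 20, 8, 4]
FREQ_D = [4, 8, 20, 24, 40, 4, 0]


def grouped_mean(freq):
    """Mean computed from class midpoints."""
    n = sum(freq)
    return sum(f * (lo + WIDTH / 2) for f, lo in zip(freq, LOWER)) / n


def grouped_median(freq):
    """Median by linear interpolation inside the median class."""
    half = sum(freq) / 2
    cum = 0
    for f, lo in zip(freq, LOWER):
        if f > 0 and cum + f >= half:
            return lo + (half - cum) / f * WIDTH
        cum += f
    raise ValueError("empty distribution")


def grouped_mode(freq):
    """Mode from the modal class and its two neighbours."""
    k = max(range(len(freq)), key=lambda i: freq[i])
    f1 = freq[k]
    f0 = freq[k - 1] if k > 0 else 0
    f2 = freq[k + 1] if k + 1 < len(freq) else 0
    return LOWER[k] + (f1 - f0) / (2 * f1 - f0 - f2) * WIDTH


mean_c, median_c, mode_c = grouped_mean(FREQ_C), grouped_median(FREQ_C), grouped_mode(FREQ_C)
mean_d, median_d, mode_d = grouped_mean(FREQ_D), grouped_median(FREQ_D), grouped_mode(FREQ_D)

print(f"C: mean={mean_c:.1f} median={median_c:.1f} mode={mode_c:.1f}")
print(f"D: mean={mean_d:.1f} median={median_d:.1f} mode={mode_d:.1f}")
```

Under these assumptions the sketch gives mode < median < mean for C and mean < median < mode for D, which is the ordering described in the text.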
In other words, we may say that if the extreme variations in a given distribution
are towards higher values they give the curve a longer tail to the right and this pulls
the median and mean in that direction from the mode. If, however, extreme
variations are towards lower values, the longer tail is to the left and the median and
mean are pulled to the left of the mode.
It could also be shown that in a symmetrical distribution the lower and upper
quartiles are equidistant from the median, and so are corresponding pairs of deciles
and percentiles. This means that in an asymmetrical distribution the distances of the
upper and lower quartiles from the median are unequal.
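[Fig. 18.2: Histograms and frequency curves of distributions C and D over the range 56.5 to 70.5. In C, the order from left to right is mode, median, mean; in D, it is mean, median, mode.]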
From the above discussion, we can summarize the tests for the presence of
skewness as follows:
1. When the graph of the distribution does not show a symmetrical curve.
2. When the three measures of central tendency differ from one another.
3. When the sum of the positive deviations from the median is not equal to
the sum of the negative deviations from the same value.
4. When the distances from the median to the quartiles are unequal.
5. When corresponding pairs of deciles or percentiles are not equidistant from
the median.
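Test 4 can be illustrated on the heights table. The sketch below, which assumes linear interpolation within classes and uses illustrative function names, compares the distances from the median to each quartile for the symmetrical distribution B and the skewed distribution C.

```python
# Quartile-distance test for skewness on distributions B and C.
# Assumption: quartiles are found by linear interpolation inside the class.

LOWER = [56.5, 58.5, 60.5, 62.5, 64.5, 66.5, 68.5]  # lower class boundaries
WIDTH = 2.0
FREQ_B = [3, 5, 20, 44, 20, 5, 3]
FREQ_C = [0, 4, 40, 24, 20, 8, 4]


def grouped_quantile(freq, p):
    """Value below which a proportion p of the items lies."""
    target = p * sum(freq)
    cum = 0
    for f, lo in zip(freq, LOWER):
        if f > 0 and cum + f >= target:
            return lo + (target - cum) / f * WIDTH
        cum += f
    raise ValueError("empty distribution")


def quartile_gaps(freq):
    """Return (Md - Q1, Q3 - Md)."""
    q1 = grouped_quantile(freq, 0.25)
    md = grouped_quantile(freq, 0.50)
    q3 = grouped_quantile(freq, 0.75)
    return md - q1, q3 - md


lower_b, upper_b = quartile_gaps(FREQ_B)
lower_c, upper_c = quartile_gaps(FREQ_C)

print(f"B: Md-Q1={lower_b:.2f}, Q3-Md={upper_b:.2f}")  # equal gaps
print(f"C: Md-Q1={lower_c:.2f}, Q3-Md={upper_c:.2f}")  # upper gap is larger
```

For B the two gaps agree, while for C the upper quartile lies farther from the median than the lower one, which points to a longer right tail.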
Measures of Skewness
On the basis of the above tests, the following measures of skewness have been
developed:
1. Relationship between three measures of central tendency—commonly
known as the Karl Pearson’s measure of skewness.
2. Quartile measure of skewness—known as Bowley’s measure of skewness.
3. Percentile measure of skewness—also called the Kelly’s measure of
skewness.
4. Measures of skewness based on moments.
All these measures tell us both the direction and the extent of the skewness.
Check Your Progress - 1
1. State the nature of the distances of the upper and lower quartiles from the median in an asymmetrical distribution.
2. When is a frequency distribution said to be skewed?
18.3 KARL PEARSON’S MEASURE OF SKEWNESS
It has been shown earlier that in a perfectly symmetrical distribution, the three
measures of central tendency, viz., mean, median and mode will coincide. As the
distribution departs from symmetry these three values are pulled apart, the difference
between the mean and mode being the greatest. Karl Pearson has suggested the use
of this difference in measuring skewness. Thus Absolute Skewness = Mean – Mode.
(+) or (–) signs obtained by this formula would exhibit the direction of the skewness.
If it is positive, the extreme variation in the given distribution is towards higher values.
If it is negative, it shows that extreme variations are towards lower values.
Pearsonian Coefficient of Skewness
The difference between mean and mode is an absolute measure of skewness. An
absolute measure cannot be used for making valid comparisons between the
skewness of two or more distributions for the following reasons: (i) the same amount of
skewness has a different significance in a distribution with small variation than in
one with large variation, and (ii) the units of measurement
in the two series may be different.
To make this measure a suitable device for comparing skewness, it is necessary
to eliminate from it the disturbing influence of ‘variation’ and ‘units of measurements’.
Such elimination is accomplished by dividing the difference between mean and mode
by the standard deviation. The resultant coefficient is called Pearsonian Coefficient
of Skewness. Thus, the formula of Pearsonian Coefficient of Skewness is,
\[ \text{Coefficient of Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} \]
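As we have already seen, in moderately skewed distributions,
\[ \text{Mode} = \text{Mean} - 3(\text{Mean} - \text{Median}) \]
We may therefore remove the mode from the formula by substituting the above in the formula
for skewness, as follows:
\[ \text{Coefficient of Skewness} = \frac{\text{Mean} - [\text{Mean} - 3(\text{Mean} - \text{Median})]}{\text{Standard Deviation}} = \frac{\text{Mean} - \text{Mean} + 3(\text{Mean} - \text{Median})}{\text{Standard Deviation}} = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}} \]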
The removal of the mode and substituting median in its place becomes necessary
because mode cannot always be easily located and is so much affected by grouping
errors that it becomes unreliable.
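The two forms of the Pearsonian coefficient can be written as small Python functions. This is only a sketch: the function names are illustrative, and the standard deviation of 2.5 used in the last step is a hypothetical value chosen to show that the two forms agree whenever the mode is replaced by its empirical estimate.

```python
def empirical_mode(mean, median):
    """Mode estimated from the empirical relation Mode = Mean - 3(Mean - Median)."""
    return mean - 3 * (mean - median)


def pearson_from_mode(mean, mode, sd):
    """Coefficient of skewness = (Mean - Mode) / Standard Deviation."""
    return (mean - mode) / sd


def pearson_from_median(mean, median, sd):
    """Coefficient of skewness = 3(Mean - Median) / Standard Deviation."""
    return 3 * (mean - median) / sd


# Distribution C of the heights table: mean 63.5, median 63, tabulated mode 61.9.
estimated_mode = empirical_mode(63.5, 63.0)
print(f"empirical mode = {estimated_mode:.1f} (tabulated mode is 61.9)")

# Hypothetical standard deviation, used only to compare the two forms.
sd = 2.5
via_mode = pearson_from_mode(63.5, estimated_mode, sd)
via_median = pearson_from_median(63.5, 63.0, sd)
print(f"{via_mode:.2f} {via_median:.2f}")
```

The empirical estimate (62.0) is close to, but not identical with, the tabulated mode (61.9), which is why the two coefficients can differ slightly in practice.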
Example 18.1: Find the skewness from the following data:
Height (in inches)     Number of Persons
58                     10
59                     18
60                     30
61                     42
62                     35
63                     28
64                     16
65                      8
Solution: Height is a continuous variable, and hence 58″ must be treated as 57.5″–
58.5″, 59″ as 58.5″–59.5″, and so on.
Height        Class          Frequency   x′          fx′     fx′²    Cumulative
(in inches)   boundaries     f           (from 61)                   Frequency
58            57.5″–58.5″    10          –3          –30      90      10
59            58.5″–59.5″    18          –2          –36      72      28
60            59.5″–60.5″    30          –1          –30      30      58
61            60.5″–61.5″    42           0            0       0     100
62            61.5″–62.5″    35           1           35      35     135
63            62.5″–63.5″    28           2           56     112     163
64            63.5″–64.5″    16           3           48     144     179
65            64.5″–65.5″     8           4           32     128     187
Total                        187                     +75     611
(Sum of negative fx′ = –96; sum of positive fx′ = +171; net Σfx′ = +75.)
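\[ \text{Mean} = 61 + \frac{75}{187} = 61.4 \]
\[ \sigma = \sqrt{\frac{611}{187} - \left(\frac{75}{187}\right)^2} = \sqrt{3.27 - 0.16} = \sqrt{3.11} = 1.76 \]
\[ \text{Mode} = 60.5 + \frac{35}{65} = 61.04 \]
\[ \text{Skewness} = 61.4 - 61.04 = 0.36 \text{ inches} \]
\[ \text{Coefficient of Skewness} = \frac{0.36}{1.76} = 0.205 \]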
Alternatively, we can determine the median,
\[ \text{Median} = \text{the size of the } \frac{187}{2}\text{th item} = 93.5\text{th item} = 60.5 + \frac{1 \times 35.5}{42} = 61.35 \]
\[ \text{Skewness} = 3(61.4 - 61.35) = 3(0.05) = 0.15 \]
\[ \text{Coefficient of Skewness} = \frac{0.15}{1.76} = 0.09 \]
The two coefficients are different because of the difficulties associated with
determination of mode.
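The arithmetic of Example 18.1 can be reproduced without intermediate rounding. The sketch below follows the step-deviation method with assumed mean 61; the mode line mirrors the arithmetic of the worked solution (lower boundary plus f2/(f0 + f2) times the class width), which is an assumption about the formula the solution used. Variable names are illustrative.

```python
from math import sqrt

heights = [58, 59, 60, 61, 62, 63, 64, 65]
freq = [10, 18, 30, 42, 35, 28, 16, 8]
A = 61                       # assumed mean
n = sum(freq)                # 187

sum_fd = sum(f * (x - A) for x, f in zip(heights, freq))        # 75
sum_fd2 = sum(f * (x - A) ** 2 for x, f in zip(heights, freq))  # 611

mean = A + sum_fd / n
sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

# Median by interpolation; each height x stands for the class x-0.5 to x+0.5.
cum = 0
for x, f in zip(heights, freq):
    if cum + f >= n / 2:
        median = (x - 0.5) + (n / 2 - cum) / f
        break
    cum += f

# Mode as in the worked solution: L + f2 / (f0 + f2) * class width (width = 1).
k = freq.index(max(freq))
mode = (heights[k] - 0.5) + freq[k + 1] / (freq[k - 1] + freq[k + 1])

coeff_mode = (mean - mode) / sd
coeff_median = 3 * (mean - median) / sd

print(f"mean={mean:.2f} sd={sd:.2f} mode={mode:.2f} median={median:.2f}")
print(f"coefficient (mode form)={coeff_mode:.3f}, (median form)={coeff_median:.3f}")
```

Carrying full precision gives values close to the rounded ones in the solution; small differences in the last digit come from rounding the mean, median and standard deviation before dividing.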
Bowley’s (Quartile) Measure of Skewness
In the above two methods of measuring skewness, the whole series is taken into
consideration. But, absolute as well as relative skewness may be secured even for a
part of the series. The usual device is to measure the distance between the lower and
the upper quartiles. In a symmetrical series, the quartiles would be equidistant from
the value of the median, i.e.,
Median – Q1 = Q3 – Median
In other words, the value of the median is the mean of Q1 and Q3. In a skewed
distribution, quartiles would not be equidistant from median unless the entire
asymmetry is located at the extremes of the series. Bowley has suggested the
following formula for measuring skewness, based on above facts.
\[ \text{Absolute } SK = (Q_3 - Md) - (Md - Q_1) = Q_3 + Q_1 - 2\,Md \qquad (18.1) \]
If the quartiles are equidistant from the median, i.e., (Q3 – Md) = (Md – Q1), then
SK = 0. If the distance from the median to Q1 exceeds that from Q3 to the median, the
skewness is negative. If the reverse is the case, the skewness is positive.
If the series expressed in different units are to be compared, it is essential to
convert the absolute amount into the relative. Using the interquartile range as a
denominator we have for the coefficient of skewness as follows:
\[ \text{Relative } SK = \frac{Q_3 + Q_1 - 2\,Md}{Q_3 - Q_1} \qquad (18.2) \]
or,
\[ \text{Relative } SK = \frac{(Q_3 - Md) - (Md - Q_1)}{(Q_3 - Md) + (Md - Q_1)} \]
If in the series the median and the lower quartile coincide, then the SK becomes
(+1). If the median and the upper quartile coincide, then the SK becomes (–1).
This measure of skewness is rigidly defined and easily computable. Further, such
a measure of skewness has the advantage that its value lies between (+1) and
(–1), with the result that it is sufficiently sensitive for many requirements. The only
criticism levelled against such a measure is that it does not take into consideration all
the items of the series, i.e., extreme items are neglected.
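Formulas (18.1) and (18.2) and the limiting values (+1) and (–1) can be checked with a minimal Python sketch. The function name and the quartile values passed to it are illustrative, not taken from the unit.

```python
def bowley_skewness(q1, median, q3):
    """Return (absolute, relative) quartile skewness per formulas (18.1) and (18.2)."""
    absolute = (q3 - median) - (median - q1)
    spread = q3 - q1
    if spread == 0:
        raise ValueError("Q1 and Q3 coincide; the relative measure is undefined")
    return absolute, absolute / spread


# Illustrative quartile values.
symmetric = bowley_skewness(10.0, 15.0, 20.0)     # quartiles equidistant from median
median_at_q1 = bowley_skewness(10.0, 10.0, 20.0)  # median coincides with Q1
median_at_q3 = bowley_skewness(10.0, 20.0, 20.0)  # median coincides with Q3

print(symmetric, median_at_q1, median_at_q3)
```

Because the median always lies between Q1 and Q3, the numerator can never exceed the interquartile range in size, which is why the relative measure stays within the two limits.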
Example 18.2: Calculate the coefficient of skewness based on quartiles for the data
given in Example 18.1.
Solution: With reference to the table given in Example 18.1, we have,
\[ Q_1 = \text{the size of the } \frac{N}{4}\text{th item} = \frac{187}{4} = 46.75\text{th item} = 59.5 + \frac{18.75}{30} = 59.5 + 0.63 = 60.13 \]
\[ Q_3 = \text{the size of the } \frac{3N}{4}\text{th item} = \frac{3 \times 187}{4} = 140.25\text{th item} = 62.5 + \frac{5.25}{28} = 62.5 + 0.19 = 62.69 \]
Using formula (18.1),
\[ \text{Skewness} = 62.69 + 60.13 - 2(61.35) = 0.12 \]
Using formula (18.2),
\[ \text{Coefficient of Skewness} = \frac{0.12}{62.69 - 60.13} = \frac{0.12}{2.56} = 0.047 \]
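The quartile calculations of Example 18.2 can be verified with a short Python sketch. It assumes linear interpolation within classes and treats each height x as the class x – 0.5 to x + 0.5, as in Example 18.1; the helper name is illustrative.

```python
heights = [58, 59, 60, 61, 62, 63, 64, 65]
freq = [10, 18, 30, 42, 35, 28, 16, 8]
n = sum(freq)


def grouped_quantile(p):
    """Interpolated value of the (p * N)th item for classes of unit width."""
    target = p * n
    cum = 0
    for x, f in zip(heights, freq):
        if cum + f >= target:
            return (x - 0.5) + (target - cum) / f
        cum += f
    raise ValueError("p must lie between 0 and 1")


q1 = grouped_quantile(0.25)   # 46.75th item
md = grouped_quantile(0.50)   # 93.5th item
q3 = grouped_quantile(0.75)   # 140.25th item

absolute_sk = q3 + q1 - 2 * md             # formula (18.1)
relative_sk = absolute_sk / (q3 - q1)      # formula (18.2)

print(f"Q1={q1:.2f} Md={md:.2f} Q3={q3:.2f}")
print(f"absolute={absolute_sk:.2f} relative={relative_sk:.3f}")
```

The unrounded result agrees with the worked solution to the precision shown there.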
Kelly’s (Percentile) Measure of Skewness
To remove the defect of Bowley’s measure that it does not take into account all the
values, it can be enlarged by taking two deciles (or percentiles), equidistant from the
median value. Kelly has suggested the following measure of skewness:
\[ S_K = P_{50} - \frac{P_{90} + P_{10}}{2} \]
or,
\[ S_K = D_5 - \frac{D_9 + D_1}{2} \]
Though such a measure has got little practical use, yet theoretically this measure
seems very sound.
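As a rough illustration, the measure as printed above can be evaluated on the data of Example 18.1. The sketch assumes linear interpolation for the percentiles, and the function names are illustrative. Note the sign convention of the printed formula: when the right tail is longer, P90 lies farther from P50 than P10 does, so this expression comes out negative. Many texts write the same idea as P90 + P10 – 2P50, which has the opposite sign.

```python
heights = [58, 59, 60, 61, 62, 63, 64, 65]
freq = [10, 18, 30, 42, 35, 28, 16, 8]
n = sum(freq)


def grouped_percentile(k):
    """Interpolated k-th percentile; each height x is the class x-0.5 to x+0.5."""
    target = k / 100 * n
    cum = 0
    for x, f in zip(heights, freq):
        if cum + f >= target:
            return (x - 0.5) + (target - cum) / f
        cum += f
    raise ValueError("k must lie between 0 and 100")


def kelly_absolute(p10, p50, p90):
    """Kelly's measure as printed in this unit: P50 - (P90 + P10) / 2."""
    return p50 - (p90 + p10) / 2


p10, p50, p90 = (grouped_percentile(k) for k in (10, 50, 90))
sk = kelly_absolute(p10, p50, p90)

print(f"P10={p10:.2f} P50={p50:.2f} P90={p90:.2f} S_K={sk:.3f}")
```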
Example 18.3: Calculate the Karl Pearson’s coefficient of skewness from the
following data:
Marks         No. of Students
above 0       150
above 10      140
above 20      100
above 30       80
above 40       80
above 50       70
above 60       30
above 70       14
above 80        0
Solution:
Marks     Frequency   Midpoint   X′ = (X – A)/10   fX′     fX′²    Cumulative
          f           X          (A = 35)                          Frequency (cf)
0–10      10           5         –3                –30      90      10
10–20     40          15         –2                –80     160      50
20–30     20          25         –1                –20      20      70
30–40      0          35          0                  0       0      70
40–50     10          45          1                 10      10      80
50–60     40          55          2                 80     160     120
60–70     16          65          3                 48     144     136
70–80     14          75          4                 56     224     150
Total    150                                       +64     808
(Sum of negative fX′ = –130; sum of positive fX′ = +194; net ΣfX′ = +64.)
Since the distribution is bimodal, the mode is not uniquely defined, so the median-based
form of Karl Pearson’s coefficient is appropriate and we need to calculate the mean
(X̄), the median and σ.
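\[ \bar{X} = 35 + \frac{64}{150} \times 10 = 35 + 4.27 = 39.27 \]
\[ \text{Median} = \text{size of the } \frac{150}{2}\text{th item} = 75\text{th item} = 40 + \frac{10 \times 5}{10} = 45 \]
\[ \text{Standard Deviation } (\sigma) = i \times \sqrt{\frac{\sum fX'^2}{N} - \left(\frac{\sum fX'}{N}\right)^2} = 10\sqrt{\frac{808}{150} - \left(\frac{64}{150}\right)^2} = 10\sqrt{5.387 - 0.182} = 10 \times 2.28 = 22.8 \]
\[ \text{Skewness} = \frac{3(\bar{X} - \text{Median})}{\sigma} = \frac{3(39.27 - 45)}{22.8} = \frac{3(-5.73)}{22.8} = \frac{-17.19}{22.8} = -0.75 \]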
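The first step of Example 18.3, turning the "more than" cumulative frequencies into class frequencies, is easy to get wrong by hand. The following Python sketch shows that conversion and then repeats the step-deviation arithmetic; the variable names are illustrative and the median is found by linear interpolation.

```python
from math import sqrt

lower_limits = [0, 10, 20, 30, 40, 50, 60, 70, 80]
more_than = [150, 140, 100, 80, 80, 70, 30, 14, 0]  # students scoring above each limit
width = 10
A = 35                                              # assumed mean (midpoint of 30-40)

# Frequency of each class = students above its lower limit minus those above its upper limit.
freq = [more_than[i] - more_than[i + 1] for i in range(len(more_than) - 1)]
n = sum(freq)

steps = [(lo + width / 2 - A) / width for lo in lower_limits[:-1]]  # X' values
sum_fx = sum(f * s for f, s in zip(freq, steps))
sum_fx2 = sum(f * s * s for f, s in zip(freq, steps))

mean = A + sum_fx / n * width
sd = width * sqrt(sum_fx2 / n - (sum_fx / n) ** 2)

# Median by interpolation (empty classes are skipped).
cum = 0
for f, lo in zip(freq, lower_limits):
    if f > 0 and cum + f >= n / 2:
        median = lo + (n / 2 - cum) / f * width
        break
    cum += f

coefficient = 3 * (mean - median) / sd
print(freq)
print(f"mean={mean:.2f} median={median:.1f} sd={sd:.1f} coefficient={coefficient:.2f}")
```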
Example 18.4: From the following data compute quartile deviation and the
coefficient of skewness.
Size         5–7     8–10    11–13    14–16    17–19
Frequency    14      24      38       20       4
Solution:
Size          Frequency    Cumulative Frequency
4.5–7.5       14            14
7.5–10.5      24            38
10.5–13.5     38            76
13.5–16.5     20            96
16.5–19.5      4           100
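\[ Q_1 = 7.5 + \frac{3 \times 11}{24} = 8.87 \]
\[ Q_3 = 10.5 + \frac{3 \times 37}{38} = 10.5 + \frac{111}{38} = 10.5 + 2.92 = 13.42 \]
\[ \text{Median} = 10.5 + \frac{3 \times 12}{38} = 10.5 + \frac{36}{38} = 10.5 + 0.947 = 11.447 \]
\[ \text{Quartile Deviation} = \frac{Q_3 - Q_1}{2} = \frac{13.42 - 8.87}{2} = \frac{4.55}{2} = 2.275 \]
\[ \text{Skewness} = \frac{Q_3 + Q_1 - 2\,Md}{Q_3 - Q_1} = \frac{13.42 + 8.87 - 22.89}{13.42 - 8.87} = \frac{-0.6}{4.55} = -0.13 \]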
Example 18.5: In a certain distribution the following results were obtained:
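X̄ = 45.00; Median = 48.00; Coefficient of Skewness = –0.4
You are required to estimate the value of standard deviation.
Solution:
\[ \text{Coefficient of Skewness} = \frac{3(\text{Mean} - \text{Median})}{\sigma} \]
\[ -0.4 = \frac{3(45 - 48)}{\sigma} \]
\[ -0.4\,\sigma = -9 \]
\[ \sigma = \frac{9}{0.4} = 22.5 \]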
Example 18.6: Karl Pearson’s coefficient of skewness of a distribution is +0.32.
Its standard deviation is 6.5 and mean is 29.6. Find the mode and median of the
distribution.
Solution:
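\[ \text{Coefficient of Skewness} = \frac{\text{Mean} - \text{Mode}}{\sigma} \]
\[ 0.32 = \frac{29.6 - \text{Mode}}{6.5} \]
or
\[ 6.5 \times 0.32 = 29.6 - \text{Mode} \]
\[ \text{Mode} = 29.6 - 2.08 = 27.52 \]
\[ \text{Coefficient of Skewness} = \frac{3(\text{Mean} - \text{Median})}{\sigma} \]
\[ 0.32 = \frac{3(29.6 - \text{Median})}{6.5} \]
\[ 6.5 \times 0.32 = 88.8 - 3\,\text{Median} \]
\[ \text{Median} = \frac{88.8 - 2.08}{3} = 28.91 \]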
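The rearrangements used in Examples 18.5 and 18.6 can be written as three small Python helpers. This is a sketch with illustrative function names; each one simply solves the relevant form of Pearson's coefficient for the unknown.

```python
def sd_from_coefficient(mean, median, coefficient):
    """Solve coefficient = 3(mean - median)/sd for sd."""
    return 3 * (mean - median) / coefficient


def mode_from_coefficient(mean, sd, coefficient):
    """Solve coefficient = (mean - mode)/sd for the mode."""
    return mean - coefficient * sd


def median_from_coefficient(mean, sd, coefficient):
    """Solve coefficient = 3(mean - median)/sd for the median."""
    return mean - coefficient * sd / 3


sd_18_5 = sd_from_coefficient(45.0, 48.0, -0.4)          # Example 18.5
mode_18_6 = mode_from_coefficient(29.6, 6.5, 0.32)       # Example 18.6
median_18_6 = median_from_coefficient(29.6, 6.5, 0.32)   # Example 18.6

print(f"sd={sd_18_5:.1f} mode={mode_18_6:.2f} median={median_18_6:.2f}")
```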
Example 18.7: You are given the position in a factory before and after the
settlement of an industrial dispute. Comment on the gains or losses from the point of
the workers and that of the management.
                      Before    After
No. of Workers        2400      2350
Mean Wages            45.5      47.5
Median Wages          49.0      45.0
Standard Deviation    12.0      10.0
Solution:
Employment. Since the number of workers employed after the settlement is less
than the number employed before, it has gone against the interest of the workers.
Wages. The total wages paid after the settlement were 2350 × 47.5 =
Rs 1,11,625; before the settlement the amount disbursed was 2400 × 45.5 =
Rs 1,09,200.
This means that the workers as a group are better off now than before the
settlement, and unless the productivity of workers has gone up, this may be against
the interest of management.
Uniformity in the wage structure. The extent of relative uniformity in the wage
structure before and after the settlement can be determined by a comparison of the
coefficient of variation.
\[ \text{Coefficient of Variation, Before} = \frac{12}{45.5} \times 100 = 26.4 \]
\[ \text{Coefficient of Variation, After} = \frac{10}{47.5} \times 100 = 21.05 \]
This clearly means that there is comparatively less disparity in the wages received
by the workers after the settlement. Such a position is good for both the workers and the management.
Pattern of the wage structure. A comparison of the mean with the median
leads to the obvious conclusion that before the settlement more than 50 per cent of
the workers were getting a wage higher than the mean (Rs 45.5), since the median
(Rs 49.0) was above it. After the settlement the median fell to Rs 45.0, so the
proportion of workers whose wages were more than Rs 45.5 became less
than 50 per cent. This means that the settlement has not been beneficial to all the
workers. It is only 50 per cent of the workers who have benefited as a result of the
increase in the total wages bill.
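The comparisons in Example 18.7 can be gathered into one short Python sketch. It uses the worker counts that appear in the wage-bill arithmetic of the solution (2400 and 2350) and additionally computes the median-based Pearson coefficient for each period, a step the solution describes in words rather than numerically. The dictionary layout and function name are illustrative.

```python
positions = {
    "before": {"workers": 2400, "mean": 45.5, "median": 49.0, "sd": 12.0},
    "after": {"workers": 2350, "mean": 47.5, "median": 45.0, "sd": 10.0},
}


def describe(p):
    """Return (total wage bill, coefficient of variation in %, Pearson coefficient)."""
    wage_bill = p["workers"] * p["mean"]
    cv = p["sd"] / p["mean"] * 100
    skewness = 3 * (p["mean"] - p["median"]) / p["sd"]
    return wage_bill, cv, skewness


results = {label: describe(p) for label, p in positions.items()}

for label, (bill, cv, sk) in results.items():
    print(f"{label}: wage bill={bill:.0f} CV={cv:.2f}% skewness={sk:+.3f}")
```

The sign of the coefficient changes from negative before the settlement to positive after it, which matches the observation that the median moved from above the mean to below it.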
Check Your Progress - 2
1. What does the difference between mean and mode measure?
2. What is the nature of the quartiles in a symmetrical series?
18.4 SUMMARY
• When a frequency distribution is not symmetrical it is said to be
asymmetrical or skewed.
• The shape of the curves, histograms and placement of equal items at equal
distances on either side of the median clearly show that distributions A and
B are symmetrical.
• If the extreme variations in a given distribution are towards higher values
they give the curve a longer tail to the right and this pulls the median and
mean in that direction from the mode.
• If extreme variations are towards lower values, the longer tail is to the left
and the median and mean are pulled to the left of the mode.
• It could also be shown that in a symmetrical distribution the lower and upper
quartiles are equidistant from the median, and so are corresponding pairs of
deciles and percentiles. This means that in an asymmetrical distribution the
distances of the upper and lower quartiles from the median are unequal.
• The difference between mean and mode is an absolute measure of
skewness.
• To make this measure a suitable device for comparing skewness, it is
necessary to eliminate from it the disturbing influence of ‘variation’ and
‘units of measurements’.
18.5 KEY WORDS
• Skewness: It is a measure of the asymmetry of the probability distribution
of a real-valued random variable about its mean.
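• Coefficient of skewness: It is a relative measure of skewness, obtained by
dividing the absolute skewness by a measure of dispersion (such as the standard
deviation) so that it is free of the units of measurement.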
18.6 ANSWERS TO ‘CHECK YOUR PROGRESS’
Check Your Progress - 1
1. In an asymmetrical distribution the distances of the upper and lower quartiles
from the median are unequal.
2. A frequency distribution is said to be skewed when it is not symmetrical.
Check Your Progress - 2
1. The difference between mean and mode is an absolute measure of
skewness.
2. In a symmetrical series, the quartiles would be equidistant from the value of
the median.
18.7 SELF-ASSESSMENT QUESTIONS
1. What do you understand by measures of skewness?
2. Write a short note on Karl Pearson’s measure of skewness.
3. Define Pearson’s coefficient of skewness.
4. Differentiate between quartile and Pearson’s measure of skewness.
5. Karl Pearson’s coefficient of skewness of a distribution is +0.35. Its
standard deviation is 6.8 and mean is 29.6. Find the mode and median of
the distribution.
6. Calculate the Karl Pearson’s coefficient of skewness from the following
data:
Marks         No. of Students
above 0       170
above 10      160
above 20       99
above 30       85
above 40       80
above 50       72
above 60       33
above 70       12
above 80        0
7. You are given the position in a factory before and after the settlement of an
industrial dispute. Comment on the gains or losses from the point of the
workers and that of the management.
                      Before    After
No. of Workers        2543      2766
Mean Wages            51.5      50.5
Median Wages          48.0      43.0
Standard Deviation    14.0      10.0
18.8 FURTHER READINGS
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.