
6401 Unit 9

Prepared By: ZIA ULLAH
INTERPRETATION OF DATA
CONTENTS
1. Mean of Grouped and Ungrouped Data
2. Median of Grouped and Ungrouped Data
3. Mode of Grouped and Ungrouped Data
4. Quartiles of Grouped and Ungrouped Data
5. Deciles of Grouped and Ungrouped Data
6. Percentiles of Grouped and Ungrouped Data
7. Measures of Dispersion (Range, Standard Deviation, Variance)
Mean of Grouped and Ungrouped Data
The arithmetic mean is the value obtained by dividing the sum of all values by the number of values. If a series is denoted by $X_1, X_2, \dots, X_n$, then the arithmetic mean of the series, denoted by $\bar{X}$, is

$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{\sum X}{n}$$

where
$\bar{X}$ = arithmetic mean,
$\sum X$ = sum of all available values,
$n$ = number of values.
Direct Method:
The A.M. is the quotient of the sum of all values divided by the number of values:

$$\text{A.M.} = \bar{X} = \frac{\sum X}{n}$$
Example:
Calculate the arithmetic mean of the following data:
20, 50, 72, 28, 53, 54, 59, 64, 72, 74, 75, 78, 79
Solution:
X: 20, 50, 72, 28, 53, 54, 59, 64, 72, 74, 75, 79, 78
Total: $\sum X = 778$, $n = 13$

$$\text{A.M.} = \bar{X} = \frac{\sum X}{n} = \frac{778}{13} = 59.85$$
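The direct method can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` is an illustrative choice and does not come from the notes.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty series is undefined")
    return sum(values) / len(values)


# The thirteen values from the example.
data = [20, 50, 72, 28, 53, 54, 59, 64, 72, 74, 75, 78, 79]
print(sum(data), len(data), round(arithmetic_mean(data), 2))  # 778 13 59.85
```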
Grouped data:
The arithmetic mean for grouped data is found as follows.
1) Direct Method: The first step is to take the midpoint of each class. The midpoint of each class is multiplied by its corresponding frequency, and the sum of these products is then divided by the total number of items.

$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \dots + f_n x_n}{f_1 + f_2 + \dots + f_n} = \frac{\sum fx}{\sum f}$$

$\sum f$ = sum of frequencies
$x$ = the midpoint of an individual class
Example:
The table shows the miles travelled by 20 students in coming to Commerce College, Faisalabad. Calculate the arithmetic mean.

Miles travelled    No. of students
0-2                2
2-4                5
4-6                4
6-8                8
8-10               1

Solution:

Miles travelled    f     Midpoint (x)    fx
0-2                2     1               2
2-4                5     3               15
4-6                4     5               20
6-8                8     7               56
8-10               1     9               9
Total              20                    102

$$\bar{X} = \frac{\sum fx}{\sum f} = \frac{102}{20} = 5.1$$
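The grouped calculation follows the same pattern in code. The sketch below assumes each class is given as a (lower, upper) pair; the name `grouped_mean` is hypothetical.

```python
def grouped_mean(classes, freqs):
    """Mean of grouped data: sum of f * midpoint divided by sum of f."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    total_fx = sum(f * x for f, x in zip(freqs, midpoints))
    return total_fx / sum(freqs)


# Miles travelled by the 20 students in the example.
classes = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
freqs = [2, 5, 4, 8, 1]
print(grouped_mean(classes, freqs))  # 5.1
```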
Advantages and Disadvantages of the Mean
Advantages:
I) It is easy to calculate and easy to understand.
II) It is determinate, not indefinite.
III) It can be used for further analysis and mathematical treatment.
IV) It provides a good standard of comparison.
V) It is the best known of the averages.
VI) It is least affected by fluctuations of sampling.
Disadvantages:
i) It cannot be computed accurately in the case of open-ended distributions.
ii) It may not lie in the middle of the series if the series is skewed.
iii) It is greatly affected by extreme values in the data.
Median for Grouped and Ungrouped data
The median is the value of the middle item of a series when it is arranged in ascending or descending order. The median divides the series into two equal halves, such that in one half the values are less than the median and in the other half the values are greater than the median.
Ungrouped Data:
First arrange the values in an array. Then locate the middle value, i.e. the value for which the number of values above it is the same as the number of values below it.
Odd Number:
If the number of values, n, is odd, the median is

$$\text{Median} = \text{the value of the } \left(\frac{n+1}{2}\right)\text{th item}$$
Example:
Find the median of the following items.
5, 7, 7, 8, 9, 10, 12, 15, 21
Solution:
Arrange in ascending order.
5, 7, 7, 8, 9, 10, 12, 15, 21
Here n = 9.

$$\text{Median} = \text{the value of the } \left(\frac{n+1}{2}\right)\text{th item} = \text{the value of the } \left(\frac{9+1}{2}\right)\text{th item} = \text{the value of the 5th item} = 9$$
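Even Number:
If n is even, the median is

$$\text{Median} = \frac{1}{2}\left[\text{the value of the } \left(\frac{n}{2}\right)\text{th item} + \text{the value of the } \left(\frac{n+2}{2}\right)\text{th item}\right]$$

For example, for n = 12 ordered values whose 6th and 7th items are 51 and 52:

$$\text{Median} = \frac{1}{2}\left[\text{the value of the 6th item} + \text{the value of the 7th item}\right] = \frac{1}{2}\,[51 + 52] = 51.5$$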
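The odd and even rules can be combined in one small function. This is a sketch only; the name `median` is an illustrative choice, and the two-item list `[51, 52]` merely stands in for the 6th and 7th items of the even-sized example, whose full data are not listed in these notes.

```python
def median(values):
    """Middle item for odd n; average of the two middle items for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty series is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the ((n + 1) / 2)th item, counting from 1
    # Even n: the (n / 2)th and (n / 2 + 1)th items, counting from 1.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 7, 7, 8, 9, 10, 12, 15, 21]))  # 9
print(median([51, 52]))                         # 51.5
```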
Grouped data:
i) Continuous series: When a frequency distribution is available as a continuous series, the median is the value of the $\left(\frac{n}{2}\right)$th item. To find the median from a frequency distribution, we first form the cumulative frequency column. The median is then obtained by the formula

$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right)$$

$l$ = lower class boundary of the median class,
$n$ = number of items,
$f$ = frequency of the median class,
$h$ = size of the class interval,
$C$ = cumulative frequency of the class preceding the median class.
Example:
Find the median of the data:

Class interval    Frequency
100-200           15
200-300           18
300-400           30
400-500           20
500-600           17

Solution:

Class interval    f      C.F
100-200           15     15
200-300           18     33
300-400           30     63
400-500           20     83
500-600           17     100
Total             100 = n

$\frac{n}{2} = \frac{100}{2} = 50$, so the median class is 300-400, the first class whose cumulative frequency reaches 50. Hence $l = 300$, $C = 33$, $h = 100$ and $f = 30$.

$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right) = 300 + \frac{100}{30}\,(50 - 33) = 356.67$$
Advantages and Disadvantages of Median:
Advantages:
i. It is easy and quick to calculate.
ii. It is easily located in individual and discrete series.
iii. It is not affected by the values of extreme items.
iv. It can be found even for distributions with open classes at either end.
v. It is suitable for skewed distributions.
Disadvantages:
i. It is not as familiar an average as the arithmetic mean.
ii. It cannot be used for further mathematical processing.
iii. The median cannot be calculated unless the values are arranged according to size.
iv. It is not based on all the observations.
Mode
"The most frequently repeated value in the given data is called the mode."
If each value occurs the same number of times, then there is no mode. If two or more values occur the same number of times, but more frequently than any of the other values, then there is more than one mode. In this respect the mode differs from the mean and the median, because there is only one mean and only one median.
A distribution with only one mode is called a uni-modal distribution, a distribution having two modes is called a bi-modal distribution, and a distribution having more than two modes is called a multi-modal distribution.
Example: The following are the daily wages received by 8 laborers: Rs 20, 25, 35, 35, 40, 50, 55, 60. Find the mode.
Solution: Here 35 occurs twice and every other value occurs once, so the mode is 35.
Example: In 6 term tests in education, a student has made grades of 81, 92, 85, 77, 89, 79. Find the mode.
Solution: Since each grade occurs only once, i.e. 77, 79, 81, 85, 89, 92, no mode exists.
Example: Find the mode of the salaries of 5 men in an industrial concern: Rs 950, 2100, 1500, 1500, 2100.
Solution: Write the salaries in ascending order:
950, 1500, 1500, 2100, 2100
Mode: 1500 and 2100
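The three cases above (one mode, no mode, two modes) can be reproduced by counting occurrences. The sketch below uses `collections.Counter` from the Python standard library; the function name `modes` and the choice to return an empty list for "no mode" are assumptions made for illustration.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # If several distinct values all occur equally often, none stands out.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([20, 25, 35, 35, 40, 50, 55, 60]))  # [35]
print(modes([81, 92, 85, 77, 89, 79]))          # []
print(modes([950, 2100, 1500, 1500, 2100]))     # [1500, 2100]
```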
Grouped Data:
i) Continuous Series:
When the data are grouped into a frequency distribution, the mode lies in the class that carries the highest frequency. This class is called the modal class. The formula for computing the mode is:

$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$$

$l$ = lower class boundary of the modal class,
$f_m$ = frequency of the modal class,
$f_1$ = frequency of the class before the modal class,
$f_2$ = frequency of the class after the modal class,
$h$ = size of the class interval of the modal class.
Example:
Calculate the mode of the data given below:

Weight     No. of mangoes
410-419    14
420-429    20
430-439    42
440-449    54
450-459    45
460-469    18
470-479    7

Solution:

Weight     f      Class boundaries
410-419    14     409.5-419.5
420-429    20     419.5-429.5
430-439    42     429.5-439.5
440-449    54     439.5-449.5
450-459    45     449.5-459.5
460-469    18     459.5-469.5
470-479    7      469.5-479.5
Total      200

The modal class is 440-449, so $f_m = 54$, $f_1 = 42$, $f_2 = 45$, $l = 439.5$ and $h = 10$.

$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h = 439.5 + \frac{54 - 42}{(54 - 42) + (54 - 45)} \times 10 = 439.5 + \frac{12}{21} \times 10 = 445.21$$
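As a cross-check on the arithmetic, the grouped-mode formula can be evaluated in Python. This is a sketch under stated assumptions: the function name `grouped_mode` is hypothetical, and a missing neighbouring class (when the modal class is first or last) is treated as having frequency 0, which the notes do not discuss. With the mango data, 12/21 × 10 is about 5.71, so the result is about 445.21.

```python
def grouped_mode(boundaries, freqs):
    """Mode = l + (fm - f1) / ((fm - f1) + (fm - f2)) * h for the modal class."""
    m = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    lower, upper = boundaries[m]
    fm = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0               # assumption: no class before
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0  # assumption: no class after
    denominator = (fm - f1) + (fm - f2)
    if denominator == 0:
        raise ValueError("modal class does not stand out from its neighbours")
    return lower + (fm - f1) / denominator * (upper - lower)


boundaries = [(409.5, 419.5), (419.5, 429.5), (429.5, 439.5), (439.5, 449.5),
              (449.5, 459.5), (459.5, 469.5), (469.5, 479.5)]
freqs = [14, 20, 42, 54, 45, 18, 7]
print(round(grouped_mode(boundaries, freqs), 2))  # 445.21
```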
Discrete Series:
In a discrete frequency distribution, the mode is the value which has the maximum frequency.
Example:
Find the mode from the following data.

No. of children    No. of couples
1                  10
2                  15
3                  45
4                  18
5                  15
6                  10

Solution:
The data are discrete. The maximum frequency is 45, hence the mode is 3 children.
Advantages and Disadvantages of Mode
Advantages:
i. It is easy and quick to calculate.
ii. It is easy to understand.
iii. It can be determined from an open-end distribution.
iv. Extreme values do not affect its value.
v. It can be found at once by inspection from ungrouped data.
vi. It is useful for meteorological forecasts.
Disadvantages:
i. It is ill-defined.
ii. It is not based on all the observations of a set of data.
iii. It cannot be used for further mathematical processing.
iv. There may be more than one value of the mode in a set of data.
v. There is no mode if no value in the data is repeated.
9.6 Percentiles of Grouped and Ungrouped Data:
The ninety-nine values that divide the data into one hundred equal parts are called percentiles.
Percentiles of Ungrouped Data:

$$P_1 = \text{the value of the } \left(\frac{n+1}{100}\right)\text{th item}$$
$$P_2 = \text{the value of the } 2\left(\frac{n+1}{100}\right)\text{th item}$$
$$\vdots$$
$$P_{99} = \text{the value of the } 99\left(\frac{n+1}{100}\right)\text{th item}$$
Example:
Find the 55th percentile, $P_{55}$, of the following data:
71, 81, 90, 100, 99, 78, 76, 66, 65, 52, 47, 42, 37, 33, 90, 7, 9, 16, 13, 21, 51
Solution:
n = 21
Arrange the data in ascending order:
7, 9, 13, 16, 21, 33, 37, 42, 47, 51, 52, 65, 66, 71, 76, 78, 81, 90, 90, 99, 100

$$P_{55} = \text{the value of the } 55\left(\frac{n+1}{100}\right)\text{th item} = \text{the value of the } 55\left(\frac{21+1}{100}\right)\text{th item} = \text{the value of the 12.1th item}$$
$$P_{55} = \text{12th item} + 0.1\,[\text{13th item} - \text{12th item}] = 65 + 0.1\,[66 - 65] = 65.1$$
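The fractional-position rule used in this example (take the whole-numbered item, then add the fractional part of the gap to the next item) can be sketched as follows. The function name `percentile` is an illustrative choice, and clamping positions that fall before the first or after the last item is an assumption not covered by the notes. The data are the 21 ordered values of the solution.

```python
def percentile(values, k):
    """k-th percentile using the k * (n + 1) / 100 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    position = k * (n + 1) / 100   # 1-based position, possibly fractional
    whole = int(position)
    fraction = position - whole
    if whole < 1:                  # assumption: clamp to the smallest value
        return ordered[0]
    if whole >= n:                 # assumption: clamp to the largest value
        return ordered[-1]
    below, above = ordered[whole - 1], ordered[whole]
    return below + fraction * (above - below)


data = [7, 9, 13, 16, 21, 33, 37, 42, 47, 51, 52,
        65, 66, 71, 76, 78, 81, 90, 90, 99, 100]
print(round(percentile(data, 55), 2))  # 65.1
```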
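Percentile for Grouped Data:

$$P_1 = l + \frac{h}{f}\left(\frac{n}{100} - c\right)$$
$$P_2 = l + \frac{h}{f}\left(\frac{2n}{100} - c\right)$$
$$\vdots$$
$$P_{99} = l + \frac{h}{f}\left(\frac{99n}{100} - c\right)$$

Here $l$, $h$ and $f$ are the lower class boundary, the class-interval size and the frequency of the class containing the percentile, $n$ is the number of items, and $c$ is the cumulative frequency of the class preceding that class.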
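These formulas differ only in the multiplier of n/100, so one function can serve for every percentile. The sketch below is a hedged illustration with a hypothetical name, `grouped_percentile`. For k = 50 the formula reduces to the grouped median formula, so the class-interval data from the median example earlier in these notes should give 356.67 again.

```python
def grouped_percentile(boundaries, freqs, k):
    """P_k = l + (h / f) * (k * n / 100 - c) for the class containing P_k."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no observations")
    target = k * n / 100
    cumulative = 0  # c: cumulative frequency of the classes already passed
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + ((upper - lower) / f) * (target - cumulative)
        cumulative += f
    raise ValueError("k must lie between 0 and 100")


# Data of the grouped median example: P50 should equal the median.
boundaries = [(100, 200), (200, 300), (300, 400), (400, 500), (500, 600)]
freqs = [15, 18, 30, 20, 17]
print(round(grouped_percentile(boundaries, freqs, 50), 2))  # 356.67
```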
Example:
Given the data:

Grade    f
90-99    9
80-89    32
70-79    43
60-69    21
50-59    11
40-49    3
30-39    1

Find $P_{65}$ of the given data.
Solution:

Grade    Class boundaries    f      C.F
30-39    29.5-39.5           1      1
40-49    39.5-49.5           3      4
50-59    49.5-59.5           11     15
60-69    59.5-69.5           21     36   ← c
70-79    69.5-79.5           43     79   ← P65 class
80-89    79.5-89.5           32     111
90-99    89.5-99.5           9      120
Total                        120

$\frac{65n}{100} = \frac{65 \times 120}{100} = 78$, so $P_{65}$ lies in the class 70-79, the first class whose cumulative frequency reaches 78. Hence $l = 69.5$, $h = 10$, $f = 43$ and $c = 36$.

$$P_{65} = l + \frac{h}{f}\left(\frac{65n}{100} - c\right) = 69.5 + \frac{10}{43}\,(78 - 36) = 79.27$$
Standard Deviation:
"The positive square root of the mean of the squared deviations of all observations from their mean" is known as the standard deviation. It may be defined as the "root mean squared deviation". The standard deviation of a set of n values $X_1, X_2, \dots, X_n$ is denoted by S (sample standard deviation).
For Ungrouped Data:

$$S = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2} = \sqrt{\frac{\sum\left(X - \bar{X}\right)^2}{n}}$$
If the deviation is $X - \bar{X}$, then

squared deviation = $\left(X - \bar{X}\right)^2$

mean squared deviation = $\frac{\sum\left(X - \bar{X}\right)^2}{n}$

and root mean squared deviation = $\sqrt{\frac{\sum\left(X - \bar{X}\right)^2}{n}} = S$

For the population standard deviation, the Greek letter $\sigma$ (sigma) is used:

$$\sigma = \sqrt{\frac{\sum\left(X - \mu\right)^2}{N}}$$
Example:
Find the standard deviation S of the following set of numbers.
9, 3, 8, 8, 9, 8, 9, 18
Solution:

$$\bar{X} = \frac{9 + 3 + 8 + 8 + 9 + 8 + 9 + 18}{8} = \frac{72}{8} = 9$$

$$S = \sqrt{\frac{\sum\left(X - \bar{X}\right)^2}{n}} = \sqrt{\frac{(9-9)^2 + (3-9)^2 + (8-9)^2 + (8-9)^2 + (9-9)^2 + (8-9)^2 + (9-9)^2 + (18-9)^2}{8}}$$

$$S = \sqrt{\frac{120}{8}} = \sqrt{15} = 3.87$$
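The root-mean-squared-deviation definition translates directly into code. This is a minimal sketch with a hypothetical function name, `std_dev`; it divides by n, as the formula in these notes does. For the eight values of the example the sum is 72, the mean is 9, the squared deviations total 120, and the result is about 3.87.

```python
import math


def std_dev(values):
    """Square root of the mean of squared deviations from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("no observations")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / n)


data = [9, 3, 8, 8, 9, 8, 9, 18]
print(sum(data) / len(data))     # 9.0
print(round(std_dev(data), 2))   # 3.87
```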
The standard Deviation for Grouped Data:
In the case of a frequency distribution with $X_1, X_2, \dots, X_k$ as class marks and $f_1, f_2, \dots, f_k$ as the corresponding class frequencies, the standard deviation is given by:

$$S = \sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i\left(X_i - \bar{X}\right)^2} = \sqrt{\frac{\sum f\left(X - \bar{X}\right)^2}{n}}$$

where $n = f_1 + f_2 + \dots + f_k = \sum f_i$ and $i = 1, 2, 3, \dots, k$.
Example:
Find the S.D. of the heights of 100 female students at AIOU.

Height    f
60-62     5
63-65     18
66-68     42
69-71     27
72-74     8

Solution:

Height    f      Class mark (X)    fX      $X - \bar{X}$    $(X - \bar{X})^2$    $f(X - \bar{X})^2$
60-62     5      61                305     -6.45            41.6025              208.0125
63-65     18     64                1152    -3.45            11.9025              214.2450
66-68     42     67                2814    -0.45            0.2025               8.5050
69-71     27     70                1890    2.55             6.5025               175.5675
72-74     8      73                584     5.55             30.8025              246.4200
Total     100                      6745                                          852.7500

$$\bar{X} = \frac{\sum fX}{\sum f} = \frac{6745}{100} = 67.45$$

$$S = \sqrt{\frac{\sum f\left(X - \bar{X}\right)^2}{n}} = \sqrt{\frac{852.75}{100}} = \sqrt{8.5275} = 2.92 \text{ inches}$$
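The table columns of this example map onto a short computation. The sketch below is illustrative only (the name `grouped_std_dev` is hypothetical) and takes the class marks and frequencies of the height data as input.

```python
import math


def grouped_std_dev(class_marks, freqs):
    """sqrt( sum f * (X - mean)^2 / n ), where n is the sum of the frequencies."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, class_marks)) / n
    weighted_squares = sum(f * (x - mean) ** 2 for f, x in zip(freqs, class_marks))
    return math.sqrt(weighted_squares / n)


class_marks = [61, 64, 67, 70, 73]   # midpoints of 60-62, 63-65, ...
freqs = [5, 18, 42, 27, 8]
print(round(grouped_std_dev(class_marks, freqs), 2))  # 2.92
```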
Ungrouped Data:
Direct Method I:
The standard deviation of a set of values is the positive square root of the arithmetic mean of the squared deviations from the mean of the distribution.
If $X_1, X_2, \dots, X_n$ are the values of a set of data, then the standard deviation is given as:

$$\text{S.D} = \sqrt{\frac{\sum\left(X - \bar{X}\right)^2}{n}}$$

Direct Method-II:
In direct method II, the squares of the values of the items are totaled and divided by the number of items, and the square of the mean is subtracted from the result.

$$\text{S.D} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$$
Example:
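Find the standard deviation of the data X = 11, 12, ..., 21 by direct method II.
Solution:

X        X²
11       121
12       144
13       169
14       196
15       225
16       256
17       289
18       324
19       361
20       400
21       441
Total    $\sum X = 176$, $\sum X^2 = 2926$, $n = 11$

$$\text{S.D} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2} = \sqrt{\frac{2926}{11} - \left(\frac{176}{11}\right)^2} = \sqrt{266 - 256} = \sqrt{10} = 3.16$$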
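Direct method II needs only the two totals, the sum of X and the sum of X². A possible sketch follows; the name `std_dev_method_2` is hypothetical, and the `max(0.0, ...)` guard against a tiny negative value from floating-point rounding is an added precaution, not part of the notes.

```python
import math


def std_dev_method_2(values):
    """sqrt( sum(X^2) / n - (sum(X) / n)^2 )."""
    n = len(values)
    mean_of_squares = sum(x * x for x in values) / n
    square_of_mean = (sum(values) / n) ** 2
    return math.sqrt(max(0.0, mean_of_squares - square_of_mean))


data = list(range(11, 22))   # 11, 12, ..., 21
print(sum(data), sum(x * x for x in data))   # 176 2926
print(round(std_dev_method_2(data), 2))      # 3.16
```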
Advantages and Disadvantages of Standard Deviation:
Advantages
i) It is based on all the values.
ii) It is much used in statistical inference. It plays a key role in the normal distribution.
iii) It is easily amenable to algebraic treatment.
iv) It is less affected by fluctuations of sampling.
v) It is useful for comparing a number of different sets of data.
Disadvantages
i) It is difficult to calculate.
ii) It is affected by extreme values.
iii) It gives more weight to extreme values and less to those which are near the mean.
Variance
The mean of the squared deviations of all the observations from their mean is known as the variance.
For Sample:

$$S^2 = \frac{\sum\left(x_i - \bar{x}\right)^2}{n}$$

The population variance $\sigma^2$ is read as "sigma square" and is also denoted by var(x). The variance is expressed in the square of the units of the data. Because it is a mean of squares, the variance is never negative.
For Population:

$$\sigma^2 = \frac{\sum\left(x_i - \mu\right)^2}{N}$$
Example:
A population of N = 10 has the observations 7, 8, 10, 13, 14, 19, 20, 25, 26 and 28. Find its variance.
Solution:

$x_i$    $x_i - \mu$    $(x_i - \mu)^2$    $x_i^2$
7        -10            100                49
8        -9             81                 64
10       -7             49                 100
13       -4             16                 169
14       -3             9                  196
19       +2             4                  361
20       +3             9                  400
25       +8             64                 625
26       +9             81                 676
28       +11            121                784
170                     534                3424

$$\mu = \frac{\sum x_i}{N} = \frac{170}{10} = 17$$

$$\sigma^2 = \frac{\sum\left(x_i - \mu\right)^2}{N} = \frac{534}{10} = 53.4$$
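The population formula can be checked in Python, both by hand and against `statistics.pvariance` from the standard library, which also divides by N. The helper name `population_variance` is an illustrative choice.

```python
import statistics


def population_variance(values):
    """Mean of the squared deviations from the population mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


data = [7, 8, 10, 13, 14, 19, 20, 25, 26, 28]
print(sum(data) / len(data))        # 17.0
print(population_variance(data))    # 53.4
# Cross-check with the standard library's population variance.
print(abs(statistics.pvariance(data) - population_variance(data)) < 1e-9)  # True
```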
Example:
Calculate the variance of the following data:

$x_i$    $f_i$
74.5     9
94.5     10
114.5    17
134.5    10
154.5    5
174.5    4
194.5    5

Solution:

$x_i$    $f_i$    $f_i x_i$    $f_i x_i^2$
74.5     9        670.5        49952.25
94.5     10       945          89302.50
114.5    17       1946.5       222874.25
134.5    10       1345         180902.50
154.5    5        772.5        119351.25
174.5    4        698          121801.00
194.5    5        972.5        189151.25
Total    60       7350         973335

$$S^2 = \frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2 = \frac{973335}{60} - \left(\frac{7350}{60}\right)^2 = 16222.25 - 15006.25 = 1216$$
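The shortcut formula for grouped variance needs only the totals of f, f·x and f·x². The sketch below uses a hypothetical function name, `grouped_variance`, and assumes the second and third class marks are 94.5 and 114.5, which is what the f·x products 945 and 1946.5 in the solution imply.

```python
def grouped_variance(class_marks, freqs):
    """sum(f * x^2) / sum(f) - (sum(f * x) / sum(f))^2."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, class_marks))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, class_marks))
    return sum_fx2 / n - (sum_fx / n) ** 2


# Class marks assumed as described above.
class_marks = [74.5, 94.5, 114.5, 134.5, 154.5, 174.5, 194.5]
freqs = [9, 10, 17, 10, 5, 4, 5]
print(grouped_variance(class_marks, freqs))  # 1216.0
```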
Coefficient of Variation
The coefficient of variation is calculated for comparing series of data, and it was first used by Karl Pearson. It is the quotient of the standard deviation divided by the arithmetic mean, expressed as a percentage. Represented by C.V, it is defined as

$$\text{C.V} = \frac{S}{\bar{x}} \times 100$$

Of two distributions, the one having the smaller coefficient of variation is the more consistent.
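As a small illustration, the coefficient of variation can be computed from a standard deviation and a mean already obtained. The function name `coefficient_of_variation` is hypothetical; the inputs are the S = 2.92 and mean = 67.45 found in the height example earlier in these notes.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the arithmetic mean."""
    if mean == 0:
        raise ValueError("C.V is undefined when the mean is zero")
    return std_dev / mean * 100


# Height example: S = 2.92 inches, mean = 67.45 inches.
print(round(coefficient_of_variation(2.92, 67.45), 2))  # 4.33
```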
461
Prepared By: ZIA ULLAH
462
Download