Prepared By: ZIA ULLAH

INTERPRETATION OF DATA

CONTENTS

1. Mean of Grouped and Ungrouped Data
2. Median of Grouped and Ungrouped Data
3. Mode of Grouped and Ungrouped Data
4. Quartiles of Grouped and Ungrouped Data
5. Deciles of Grouped and Ungrouped Data
6. Percentiles of Grouped and Ungrouped Data
7. Measures of Dispersion (Range, Standard Deviation, Variance)

Mean of Grouped and Ungrouped Data

The arithmetic mean is the value obtained by dividing the sum of all values by the number of values. If a series is denoted by $X_1, X_2, \ldots, X_n$, then the arithmetic mean of the series, denoted by $\bar{X}$, is

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum X}{n}$$

where $\bar{X}$ = arithmetic mean, $\sum X$ = sum of all available values, and n = number of values.

Direct Method: The A.M is the quotient of the sum of all values divided by the number of values.

$$\text{A.M} = \bar{X} = \frac{\sum X}{n}$$

Example: Calculate the arithmetic mean of the following data:
20, 50, 72, 28, 53, 54, 59, 64, 72, 74, 75, 78, 79

Solution:

$$\sum X = 20 + 50 + 72 + 28 + 53 + 54 + 59 + 64 + 72 + 74 + 75 + 78 + 79 = 778$$

$$\text{A.M} = \bar{X} = \frac{\sum X}{n} = \frac{778}{13} = 59.85$$

Grouped data: The arithmetic mean for grouped data is given below.

1) Direct Method: The first step is to take the midpoint of each class. The midpoint of each class is multiplied by its corresponding frequency, and the total of these products is then divided by the number of items.

$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_n x_n}{f_1 + f_2 + \cdots + f_n} = \frac{\sum fx}{\sum f}$$

where $\sum f$ = sum of frequencies and x = the midpoint of an individual class.

Example: The table gives the miles travelled by 20 students in coming to Commerce College Faisalabad. Calculate the arithmetic mean.

Miles travelled    No. of students
0-2                2
2-4                5
4-6                4
6-8                8
8-10               1

Solution:

Miles travelled    f     Mid point (x)    fx
0-2                2     1                2
2-4                5     3                15
4-6                4     5                20
6-8                8     7                56
8-10               1     9                9
Total              20                     102

$$\bar{X} = \frac{\sum fx}{\sum f} = \frac{102}{20} = 5.1$$

Advantages and Disadvantages of MEAN

Advantages:
I) It is easy to calculate and easy to understand.
II) It is determinate.
It is not indefinite.
III) It can be used for further analysis and mathematical treatment.
IV) It provides a good standard of comparison.
V) It is the best known of the averages.
VI) It is least affected by fluctuations of sampling.

Disadvantages:
i) It cannot be computed accurately in the case of open-ended distributions.
ii) It may not lie in the middle of the series if the series is skewed.
iii) It is greatly affected by extreme values in the data.

Median for Grouped and Ungrouped Data

The median is the value of the middle item of a series when it is arranged in ascending or descending order. The median divides the series into two equal halves, such that the values in one half are less than the median and the values in the other half are more than the median.

Ungrouped Data: First arrange the values in an array. Locate the middle value, i.e. the value for which the number of values above it is the same as the number of values below it.

Odd number of values: If the number of values n is odd, the median is

$$\text{Median} = \text{the value of the } \left(\frac{n+1}{2}\right)\text{th item}$$

Example: Find the median of the following items.
5, 7, 7, 8, 9, 10, 12, 15, 21

Solution: Arrange in ascending order.
5, 7, 7, 8, 9, 10, 12, 15, 21

$$\text{Median} = \text{the value of the } \left(\frac{n+1}{2}\right)\text{th item} = \text{the value of the } \left(\frac{9+1}{2}\right)\text{th item} = \text{the value of the 5th item} = 9$$

Even number of values: If n is even, the median is

$$\text{Median} = \frac{1}{2}\left[\text{the value of the } \left(\frac{n}{2}\right)\text{th item} + \text{the value of the } \left(\frac{n+2}{2}\right)\text{th item}\right]$$

For n = 12, with 51 as the 6th item and 52 as the 7th item:

$$\text{Median} = \frac{1}{2}\left[\text{the value of the } \left(\frac{12}{2}\right)\text{th item} + \text{the value of the } \left(\frac{12+2}{2}\right)\text{th item}\right] = \frac{1}{2}\left[\text{6th item} + \text{7th item}\right] = \frac{1}{2}[51 + 52] = 51.5$$

Grouped data:

i) Continuous series: When a frequency distribution is available as a continuous series, the median is the value of the $\left(\frac{n}{2}\right)$th item. To find the median from a frequency distribution, we form the cumulative frequency column. The median is then obtained by the formula

$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - c\right)$$

l = lower class boundary of the median class
n = number of items
f = frequency of the median class
h = size of the class interval
c = cumulative frequency of the class preceding the median class

Example: Find the median of the data:

Class interval    Frequency
100-200           15
200-300           18
300-400           30
400-500           20
500-600           17

Solution:

Class interval    f      C.F
100-200           15     15
200-300           18     33
300-400           30     63
400-500           20     83
500-600           17     100 = n
Total             100

$$\frac{n}{2} = \frac{100}{2} = 50$$

The 50th item falls in the class 300-400, so l = 300, c = 33, h = 100, f = 30.

$$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - c\right) = 300 + \frac{100}{30}(50 - 33) = 356.67$$

Advantages and Disadvantages of Median

Advantages:
i. It is easy and quick to calculate.
ii. It is easily located in individual and discrete series.
iii. It is not affected by the values of extreme items.
iv. It can be found even for distributions with open classes at either end.
v. It is suitable for skewed distributions.

Disadvantages:
i. It is not as familiar an average as the arithmetic mean.
ii. It cannot be used for further mathematical processing.
iii. The median cannot be calculated unless the values are arranged according to size.
iv. It is not based on all the observations.

Mode

"The most repeated value of the given data is called the mode."

If each value occurs the same number of times, then there is no mode. If two or more values occur the same number of times, but more frequently than any of the other values, then there is more than one mode. In this respect the mode differs from the mean and the median, because there is only one mean and only one median. A distribution with only one mode is called a uni-modal distribution, a distribution having two modes is called a bi-modal distribution, and a distribution having more than two modes is called a multi-modal distribution.

Example: Following are the daily wages received by 8 laborers:
Rs 20, 25, 35, 35, 40, 50, 55, 60
Find the mode.

Solution: Here 35 is repeated two times and every other value occurs once, so 35 is the mode.

Example: In 6 term tests in education, a student has made grades of 81, 92, 85, 77, 89, 79.
Find the mode.

Solution: Since each grade occurs only once, i.e. 77, 79, 81, 85, 89, 92, no mode exists.

Example: Find the mode of the salaries of 5 men in an industrial concern:
Rs 950, 2100, 1500, 1500, 2100

Solution: Write the salaries in ascending order:
950, 1500, 1500, 2100, 2100
Mode: 1500 and 2100 (each occurs twice).

Grouped Data:

i) Continuous Series: When the data are grouped into a frequency distribution, the mode lies in the class that carries the highest frequency. This class is called the modal class. The formula for computing the mode is

$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$$

l = lower class boundary of the modal class
$f_m$ = frequency of the modal class
$f_1$ = frequency of the class before the modal class
$f_2$ = frequency of the class after the modal class
h = size of the class interval of the modal class

Example: Calculate the mode of the data given below:

Weight     No. of mangoes
410-419    14
420-429    20
430-439    42
440-449    54
450-459    45
460-469    18
470-479    7

Solution:

Weight     f      Class boundaries
410-419    14     409.5-419.5
420-429    20     419.5-429.5
430-439    42     429.5-439.5
440-449    54     439.5-449.5
450-459    45     449.5-459.5
460-469    18     459.5-469.5
470-479    7      469.5-479.5
Total      200

The modal class is 440-449, so $f_m = 54$, $l = 439.5$, $f_1 = 42$, $f_2 = 45$, $h = 10$.

$$\text{Mode} = l + \frac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h = 439.5 + \frac{54 - 42}{(54 - 42) + (54 - 45)} \times 10 = 439.5 + \frac{12}{21} \times 10 = 445.21$$

Discrete Series: In a discrete frequency distribution, the mode is the value which has the maximum frequency.

Example: Find the mode from the following:

No. of children    1     2     3     4     5     6
No. of couples     10    15    45    18    15    10

Solution: The data is discrete. The maximum frequency is 45, hence mode = 3.

Advantages and Disadvantages of Mode

Advantages:
i. It is easy and quick to calculate.
ii. It is easy to understand.
iii. It can be determined from an open-end distribution.
iv. Extreme values do not affect its value.
v. It can be found at once by inspection from ungrouped data.
vi. It is useful for meteorological forecasts.

Disadvantages:
i. It is ill-defined.
ii. It is not based on all the observations of a set of data.
iii. It cannot be used for further mathematical processing.
iv. There may be more than one value of the mode in a set of data.
v. There will be no mode if there is no repeated value in the data.

9.6 Percentiles of Grouped and Ungrouped Data

The ninety-nine values dividing the data into one hundred equal parts are called percentiles.

Percentiles of Ungrouped Data:

$$P_1 = \text{the value of the } \left(\frac{n+1}{100}\right)\text{th item}$$

$$P_2 = \text{the value of the } 2\left(\frac{n+1}{100}\right)\text{th item}$$

$$\vdots$$

$$P_{99} = \text{the value of the } 99\left(\frac{n+1}{100}\right)\text{th item}$$

Example: Find the 55th percentile of the following data:
71, 81, 90, 100, 99, 78, 76, 66, 65, 52, 42, 37, 33, 90, 7, 9, 16, 13, 21, 51, 47

Solution: n = 21. Arrange the data in ascending order:
7, 9, 13, 16, 21, 33, 37, 42, 47, 51, 52, 65, 66, 71, 76, 78, 81, 90, 90, 99, 100

$$P_{55} = \text{the value of the } 55\left(\frac{n+1}{100}\right)\text{th item} = \text{the value of the } 55\left(\frac{21+1}{100}\right)\text{th item} = \text{the value of the 12.1th item}$$

$$P_{55} = \text{12th item} + 0.1\,[\text{13th item} - \text{12th item}] = 65 + 0.1\,[66 - 65] = 65.1$$

Percentiles for Grouped Data:

$$P_1 = l + \frac{h}{f}\left(\frac{n}{100} - c\right)$$

$$P_2 = l + \frac{h}{f}\left(\frac{2n}{100} - c\right)$$

$$\vdots$$

$$P_{99} = l + \frac{h}{f}\left(\frac{99n}{100} - c\right)$$

Here l, h, f and c have the same meaning as in the median formula, taken for the class that contains the required percentile.

Example: Given the data:

Grade    90-99    80-89    70-79    60-69    50-59    40-49    30-39
f        9        32       43       21       11       3        1

Find $P_{65}$ of the given data.

Solution:

Grade    f      Class boundaries    C.F
30-39    1      29.5-39.5           1
40-49    3      39.5-49.5           4
50-59    11     49.5-59.5           15
60-69    21     59.5-69.5           36     (c)
70-79    43     69.5-79.5           79     ($P_{65}$ class)
80-89    32     79.5-89.5           111
90-99    9      89.5-99.5           120
Total    120

$$\frac{65n}{100} = \frac{65 \times 120}{100} = 78$$

The 78th item falls in the class 70-79, so l = 69.5, h = 10, f = 43, c = 36.

$$P_{65} = l + \frac{h}{f}\left(\frac{65n}{100} - c\right) = 69.5 + \frac{10}{43}(78 - 36) = 79.27$$

Standard Deviation

"The positive square root of the mean of the squared deviations of all observations from their mean" is known as the standard deviation. It may be defined as the "root mean squared deviation". The standard deviation of a set of n values $X_1, X_2, \ldots, X_n$ is denoted by S (sample standard deviation).
For Ungrouped Data:

$$S = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$

If deviation = $X - \bar{X}$, then

squared deviation = $(X - \bar{X})^2$

mean squared deviation = $\dfrac{\sum (X - \bar{X})^2}{n}$

and root mean squared deviation = $\sqrt{\dfrac{\sum (X - \bar{X})^2}{n}} = S$

For the population standard deviation, the Greek letter $\sigma$ (sigma) is used:

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

Example: Find the standard deviation S of the following set of numbers.
9, 3, 8, 8, 9, 8, 9, 18

Solution:

$$\bar{X} = \frac{9 + 3 + 8 + 8 + 9 + 8 + 9 + 18}{8} = \frac{72}{8} = 9$$

$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{(9-9)^2 + (3-9)^2 + (8-9)^2 + (8-9)^2 + (9-9)^2 + (8-9)^2 + (9-9)^2 + (18-9)^2}{8}}$$

$$S = \sqrt{\frac{120}{8}} = \sqrt{15} = 3.87$$

The Standard Deviation for Grouped Data:

In the case of a frequency distribution with $X_1, X_2, \ldots, X_k$ as class marks and $f_1, f_2, \ldots, f_k$ as the corresponding class frequencies, the standard deviation is given by

$$S = \sqrt{\frac{1}{n}\sum_{i=1}^{k} f_i\left(X_i - \bar{X}\right)^2} = \sqrt{\frac{\sum f (X - \bar{X})^2}{n}}$$

where $n = f_1 + f_2 + \cdots + f_k = \sum f$ and i = 1, 2, 3, ..., k.

Example: Find the S.D of the heights of 100 female students at AIOU.

Height    60-62    63-65    66-68    69-71    72-74
f         5        18       42       27       8

Solution:

Height    f      Class mark x    fx      x - x̄     (x - x̄)²    f(x - x̄)²
60-62     5      61              305     -6.45     41.6025     208.0125
63-65     18     64              1152    -3.45     11.9025     214.2450
66-68     42     67              2814    -0.45     0.2025      8.5050
69-71     27     70              1890    2.55      6.5025      175.5675
72-74     8      73              584     5.55      30.8025     246.4200
Total     100                    6745                          852.75

$$\bar{X} = \frac{\sum fx}{\sum f} = \frac{6745}{100} = 67.45$$

$$S = \sqrt{\frac{\sum f (X - \bar{X})^2}{n}} = \sqrt{\frac{852.75}{100}} = \sqrt{8.5275} = 2.92 \text{ inches}$$

Ungrouped Data:

Direct Method I: The standard deviation of a set of values is the positive square root of the arithmetic mean of the squared deviations from the mean of the distribution. If $X_1, X_2, \ldots, X_n$ are the values of a set of data, then the standard deviation is given as

$$\text{S.D} = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}$$

Direct Method II: In direct method II, the squares of the values of the items are totalled and divided by the number of items, and the square of the mean is then subtracted.

$$\text{S.D} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$$

Example: Find the standard deviation from the data by direct method II.
Solution:

X        X²
11       121
12       144
13       169
14       196
15       225
16       256
17       289
18       324
19       361
20       400
21       441
176      2926    (totals)

$$\text{S.D} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2} = \sqrt{\frac{2926}{11} - \left(\frac{176}{11}\right)^2} = \sqrt{266 - 256} = \sqrt{10} = 3.16$$

Advantages and Disadvantages of Standard Deviation

Advantages:
i) It is based on all the values.
ii) It is much used in statistical inference. It plays a key role in the normal distribution.
iii) It is easily amenable to algebraic processes.
iv) It is less affected by fluctuations of sampling.
v) It is useful for comparing a number of different sets of data.

Disadvantages:
i) It is difficult to calculate.
ii) It is affected by extreme values.
iii) It gives more weight to extreme values and less to those which are near the mean.

Variance

The mean of the squared deviations of all the observations from their mean is known as the variance.

For Sample:

$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n}$$

For Population:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

The population variance $\sigma^2$ is read as "sigma square" and is also denoted by var(x). The variance is expressed in the square of the units of the data, so it is never negative.

Example: A population of N = 10 has the observations 7, 8, 10, 13, 14, 19, 20, 25, 26 and 28. Find its variance.

Solution:

x_i      x_i - μ    (x_i - μ)²    x_i²
7        -10        100           49
8        -9         81            64
10       -7         49            100
13       -4         16            169
14       -3         9             196
19       2          4             361
20       3          9             400
25       8          64            625
26       9          81            676
28       11         121           784
170                 534           3424    (totals)

$$\mu = \frac{\sum x_i}{N} = \frac{170}{10} = 17$$

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{534}{10} = 53.4$$

Example: Calculate the variance of the following data:

x_i    74.5    94.5    114.5    134.5    154.5    174.5    194.5
f_i    9       10      17       10       5        4        5

Solution:

x_i      f_i    f_i x_i    f_i x_i²
74.5     9      670.5      49952.25
94.5     10     945        89302.50
114.5    17     1946.5     222874.25
134.5    10     1345       180902.50
154.5    5      772.5      119351.25
174.5    4      698        121801.00
194.5    5      972.5      189151.25
Total    60     7350       973335

$$S^2 = \frac{\sum f x^2}{\sum f} - \left(\frac{\sum f x}{\sum f}\right)^2 = \frac{973335}{60} - \left(\frac{7350}{60}\right)^2 = 16222.25 - 15006.25 = 1216$$

Coefficient of Variation

The coefficient of variation is calculated to compare series of data, and it was first used by Karl Pearson.
It is the quotient of the standard deviation divided by the arithmetic mean, expressed as a percentage. Represented by C.V, it is defined as

$$\text{Coefficient of variation} = \text{C.V} = \frac{S}{\bar{x}} \times 100$$

Of two distributions, the one having the smaller coefficient of variation is the more consistent distribution.
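The comparison rule above can be tried on two data sets that already appear in these notes: the heights of 100 students (grouped) and the population of N = 10 observations from the variance example. The following is only a minimal Python sketch. The function names (grouped_mean, grouped_sd, coefficient_of_variation) are hypothetical helpers introduced for illustration, and comparing these two particular data sets is an assumption made purely to show how the rule is applied.

```python
from math import sqrt


def grouped_mean(class_marks, freqs):
    """Mean of a frequency distribution: sum(f * x) / sum(f)."""
    return sum(f * x for x, f in zip(class_marks, freqs)) / sum(freqs)


def grouped_sd(class_marks, freqs):
    """Standard deviation with divisor n = sum(f), as used in these notes."""
    mean = grouped_mean(class_marks, freqs)
    squared = sum(f * (x - mean) ** 2 for x, f in zip(class_marks, freqs))
    return sqrt(squared / sum(freqs))


def coefficient_of_variation(sd, mean):
    """C.V = (S / mean) * 100, expressed as a percentage."""
    return sd / mean * 100


# Heights example: class marks and frequencies from the grouped S.D table.
height_marks = [61, 64, 67, 70, 73]
height_freqs = [5, 18, 42, 27, 8]
height_mean = grouped_mean(height_marks, height_freqs)
height_sd = grouped_sd(height_marks, height_freqs)
height_cv = coefficient_of_variation(height_sd, height_mean)

# Variance example: ungrouped values, each treated as a class with frequency 1.
population = [7, 8, 10, 13, 14, 19, 20, 25, 26, 28]
ones = [1] * len(population)
pop_mean = grouped_mean(population, ones)
pop_sd = grouped_sd(population, ones)
pop_cv = coefficient_of_variation(pop_sd, pop_mean)

# The data set with the smaller C.V is the more consistent one.
more_consistent = "heights" if height_cv < pop_cv else "population"

print(f"heights:    mean = {height_mean:.2f}, S = {height_sd:.2f}, C.V = {height_cv:.2f}%")
print(f"population: mean = {pop_mean:.2f}, S = {pop_sd:.2f}, C.V = {pop_cv:.2f}%")
print(f"more consistent: {more_consistent}")
```

Because the C.V divides the standard deviation by the mean, it has no units, which is what allows data measured on different scales to be compared in this way.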