Professor Vipin 2014
www.VipinMKS.com

Unit 5
Analysis of Uni-variate Data

Central Tendency

In statistics, a central tendency (or, more commonly, a measure of central tendency) is a central value or a typical value for a probability distribution. It is occasionally called an average or just the center of the distribution. The most common measures of central tendency are the arithmetic mean, the median and the mode. A central tendency can be calculated for either a finite set of values or for a theoretical distribution, such as the normal distribution.

Mean

Arithmetic Mean

This is the simplest way of representing data by a single value. It is also called the ‘average’.

Properties of Mean
1. The sum of the deviations of all the values of x from their arithmetic mean is zero: $\sum (x - \bar{x}) = 0$.
2. The product of the arithmetic mean and the number of items gives the total of all items: $n\bar{x} = \sum x$.
3. If $\bar{x}_1$ and $\bar{x}_2$ are the arithmetic means of two samples of sizes $n_1$ and $n_2$ respectively, then the arithmetic mean of the distribution combining the two is
$$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$

Merits
1. Easy to understand and simple to compute.
2. It is based on all the values.
3. It is well defined by a mathematical formula.
4. It is capable of further mathematical treatment.
5. It possesses sampling stability.

Demerits
1. It is highly affected by abnormal (extreme) values.
2. If the distribution has open-ended classes, it cannot be easily calculated.
3. It cannot be used for qualitative data.
4. It cannot be obtained graphically.

Simple Arithmetic Mean
$$\bar{x} = \frac{x_1 + x_2 + \dots + x_n}{n}$$

Arithmetic Mean for Un-Grouped Data (Frequencies Not Given)
1. Direct Method
$$\bar{x} = \frac{\sum x}{n}$$
2. Step Deviation Method (used when values are large)
$$\bar{x} = A + \frac{\sum d'}{n} \times c, \qquad d' = \frac{x - A}{c}$$
Here A is an assumed mean and c is a common factor of the deviations (c = 1 if there is none).

Arithmetic Mean for Grouped Data (Frequencies Given)
In the formulas below, $N = \sum f$ is the total frequency.
1. Direct Method
$$\bar{x} = \frac{\sum f x}{N}$$
2. Step Deviation
$$\bar{x} = A + \frac{\sum f d'}{N} \times c, \qquad d' = \frac{x - A}{c}$$
3. Continuous Series (Direct Method)
$$\bar{x} = \frac{\sum f x}{\sum f}$$
Here x is the midpoint of each class interval.
4. Continuous Series (Step Deviation Method)
$$\bar{x} = A + \frac{\sum f d'}{\sum f} \times i, \qquad d' = \frac{x - A}{i}$$
Here x is the midpoint of each class interval and i is the class width.

Weighted Arithmetic Mean
$$\bar{x}_w = \frac{\sum w x}{\sum w}$$

Median

It is the middlemost value of the data when the values are arranged in order. It is denoted by ‘M’. It divides the distribution into two equal parts.

Median (Direct Method)
In this case, the median can be computed by sorting the data in ascending or descending order and picking the middle value:
$$M = \text{value of the } \left(\frac{N + 1}{2}\right)^{\text{th}} \text{ observation}$$
where N is the number of observations. When N is even, this position falls between two observations and M is the average of those two values.

Median (With Continuous Series)
In this case we first determine the median class, which is the class containing the $(N/2)^{\text{th}}$ observation, and then locate the median inside that class using the interpolation formula:
$$M = l_1 + \left(\frac{\frac{N}{2} - cf}{f}\right) \times i$$
l1 is the lower limit of the median class
cf is the cumulative frequency of the class preceding the median class
i is the length of the median class (l2 - l1)
N is the number of observations in the distribution
f is the simple frequency of the median class

Mode

Mode (Direct Method)
The mode is the value of the observation that occurs with the highest frequency. In the values 12, 34, 32, 16, 14, 19, 20, 27, every value occurs exactly once, so no value qualifies as the mode; 34 is only the largest value, not the most frequent one. Had 34 appeared twice, it would be the mode.

Mode (With Continuous Series)
The modal class is the class having the highest frequency. The mode is the value inside the modal class given by the formula:
$$\text{Mode} = l_1 + \left(\frac{f_1 - f_0}{2 f_1 - f_0 - f_2}\right) \times i$$
l1 is the lower limit of the modal class
f1 is the frequency of the modal class
f0 is the frequency of the class preceding the modal class
f2 is the frequency of the class following the modal class
i is the length of the modal class

Empirical Relation between Mean, Median and Mode
$$\text{Mode} = 3\,\text{Median} - 2\,\bar{x}$$

Geometric Mean

It is the nth root of the product of ‘n’ observations in a series. It is denoted by G.
$$G = \sqrt[n]{x_1 \, x_2 \cdots x_n}$$
1. Ungrouped Data
$$G = \operatorname{antilog}\left(\frac{\sum \log x}{n}\right)$$
2. For Discrete and Continuous Data
$$G = \operatorname{antilog}\left(\frac{\sum f \log x}{\sum f}\right)$$

Merits
1. Based on all observations.
2. Well defined by a mathematical formula.
3. Capable of further mathematical treatment.

Demerits
1. Not easy to understand and complicated to calculate.
2. Cannot be computed if any one of the observations is zero or negative.

Harmonic Mean

The harmonic mean (H) is the reciprocal of the AM of the reciprocals of the set of observations.
1. Ungrouped Data
$$H = \frac{n}{\sum \left(\frac{1}{x}\right)}$$
2. Discrete Data (Grouped / With Frequency)
$$H = \frac{\sum f}{\sum \left(\frac{f}{x}\right)}$$
3. Continuous Data
$$H = \frac{\sum f}{\sum \left(\frac{f}{x}\right)}$$
Here x is the midpoint of each class interval.

Merits
1. Based on all observations.
2. Well defined by a mathematical formula.
3. Capable of further mathematical treatment.

Demerits
1. Not easy to understand and complicated to calculate.
2. Cannot be computed if any one of the observations is zero or negative.

Relationship Between AM, GM and HM
For positive observations,
$$\bar{x} \ge G \ge H$$
For any two positive numbers a and b,
$$\bar{x} = \frac{a + b}{2}, \qquad G = \sqrt{ab}, \qquad H = \frac{2ab}{a + b}, \qquad \text{so that } G = \sqrt{\bar{x} \times H}$$

Measures of Position

Median
It is the measure that divides the data into two equal parts.

Quartiles
They are measures which divide the data into four equal parts.
Q1 is the first quartile or lower quartile
Q2 is the second quartile or median
Q3 is the third quartile or upper quartile

Deciles
There are nine deciles, which divide the data into 10 equal parts. The 5th decile is Q2.

Percentiles
They are measures which divide the data into 100 equal parts.

Computation of Quartiles, Deciles and Percentiles
1. Ungrouped Data
$$Q_i = \text{value of the } \left[\frac{i(N + 1)}{4}\right]^{\text{th}} \text{ observation}, \quad i = 1, 2, 3$$
$$D_i = \text{value of the } \left[\frac{i(N + 1)}{10}\right]^{\text{th}} \text{ observation}, \quad i = 1, \dots, 9$$
$$P_i = \text{value of the } \left[\frac{i(N + 1)}{100}\right]^{\text{th}} \text{ observation}, \quad i = 1, \dots, 99$$
2. Discrete Data (Grouped Data)
Step 1: Find the less-than cumulative frequencies (LCF).
Step 2: Compute $\frac{i(N + 1)}{4}$, $\frac{i(N + 1)}{10}$ or $\frac{i(N + 1)}{100}$ as required.
Step 3: Qi / Di / Pi = value of x corresponding to the LCF just greater than $\frac{i(N + 1)}{4}$, $\frac{i(N + 1)}{10}$ or $\frac{i(N + 1)}{100}$ respectively.
3. Continuous Data
Step 1: Find the LCF.
Step 2: Compute $\frac{iN}{4}$, $\frac{iN}{10}$ or $\frac{iN}{100}$, where N denotes the total frequency.
Step 3: Qi / Di / Pi class = class corresponding to the LCF just greater than $\frac{iN}{4}$, $\frac{iN}{10}$ or $\frac{iN}{100}$ respectively.
Step 4: Interpolate inside that class, exactly as for the median:
$$Q_i = l_1 + \left(\frac{\frac{iN}{4} - cf}{f}\right)(l_2 - l_1), \quad D_i = l_1 + \left(\frac{\frac{iN}{10} - cf}{f}\right)(l_2 - l_1), \quad P_i = l_1 + \left(\frac{\frac{iN}{100} - cf}{f}\right)(l_2 - l_1)$$
Here l1 and l2 are the lower and upper limits of the class found in Step 3, f is its frequency and cf is the cumulative frequency of the preceding class.

Measures of Dispersion

Absolute Measures of Variation
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation

Relative Measures of Variation
1. Coefficient of Range
2. Coefficient of Quartile Deviation
3. Coefficient of Mean Deviation
4. Coefficient of Standard Deviation

Range
It is the simplest method of studying variation.
$$\text{Range} = \text{largest value} - \text{smallest value}$$
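The measures of central tendency covered so far, together with the range, can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `grouped_mean`, `grouped_median` and `grouped_mode` and both data sets are hypothetical, and the grouped formulas are the usual textbook interpolation forms, with cf taken as the cumulative frequency of the classes before the median class.

```python
from statistics import geometric_mean, harmonic_mean, mean, median, multimode


def grouped_mean(classes, freqs):
    """Direct method for a continuous series: sum(f * x) / sum(f), x = class midpoint."""
    total = sum(freqs)
    return sum(f * (lo + hi) / 2 for (lo, hi), f in zip(classes, freqs)) / total


def grouped_median(classes, freqs):
    """Interpolation formula: l1 + ((N/2 - cf) / f) * i."""
    half = sum(freqs) / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for (lo, hi), f in zip(classes, freqs):
        if f > 0 and cf + f >= half:
            return lo + (half - cf) / f * (hi - lo)
        cf += f
    raise ValueError("frequencies must contain at least one positive value")


def grouped_mode(classes, freqs):
    """l1 + ((f1 - f0) / (2*f1 - f0 - f2)) * i for the modal class.

    Assumes the modal class is not tied with both of its neighbours,
    otherwise the denominator is zero.
    """
    k = freqs.index(max(freqs))  # first class with the highest frequency
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0
    lo, hi = classes[k]
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)


# Hypothetical ungrouped data
values = [2, 4, 4, 8]
am = mean(values)
gm = geometric_mean(values)
hm = harmonic_mean(values)
data_range = max(values) - min(values)
print("AM, GM, HM:", am, gm, hm)
print("median:", median(values), "mode(s):", multimode(values), "range:", data_range)

# Hypothetical continuous series: class limits and frequencies
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [5, 8, 12, 10, 5]
g_mean = grouped_mean(classes, freqs)
g_median = grouped_median(classes, freqs)
g_mode = grouped_mode(classes, freqs)
print("grouped mean, median, mode:", g_mean, g_median, g_mode)
```

For this hypothetical table the mean is 25.5, the median about 25.83 and the mode about 26.67, while 3 × Median − 2 × Mean gives 26.5, so the empirical relation holds only approximately. `multimode` is used for the ungrouped mode because it returns every value tied for the highest frequency, which makes the "no unique mode" case visible.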
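Coefficient of Range
$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$
where L is the largest and S the smallest observation.

Quartile Deviation
It is half of the inter-quartile range.
$$QD = \frac{Q_3 - Q_1}{2}$$

Coefficient of QD
$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Mean Deviation (From Mean)
In the grouped formulas below, $N = \sum f$.
1. Ungrouped Data
$$MD(\bar{x}) = \frac{\sum |x - \bar{x}|}{n}$$
2. Discrete Data (Grouped)
$$MD(\bar{x}) = \frac{\sum f\,|x - \bar{x}|}{N}$$
3. Continuous (Grouped)
The same formula as for discrete data, with x the midpoint of each class interval.

The coefficient of mean deviation is a relative measure:
$$\text{Coefficient of MD} = \frac{MD(\bar{x})}{\bar{x}}$$

Mean Deviation (From Median)
1. Ungrouped Data
$$MD(M) = \frac{\sum |x - M|}{n}$$
2. Discrete Data (Grouped)
$$MD(M) = \frac{\sum f\,|x - M|}{N}$$
3. Continuous (Grouped)
The same formula as for discrete data, with x the midpoint of each class interval. The corresponding relative measure is
$$\text{Coefficient of MD} = \frac{MD(M)}{M}$$

Mean Deviation (From Mode)
1. Ungrouped Data
$$MD(\text{Mode}) = \frac{\sum |x - \text{Mode}|}{n}$$
2. Discrete Data (Grouped)
$$MD(\text{Mode}) = \frac{\sum f\,|x - \text{Mode}|}{N}$$
3. Continuous (Grouped)
It is the same as for discrete data. However, x denotes the midpoints of the class intervals (CI).

Standard Deviation
It is the positive square root of the mean of the squared deviations of the given observations from their AM.
1. Ungrouped Data
Direct Method
$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$
Deviation Method
$$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2}, \qquad d = x - A$$
where A is an assumed mean.
2. Discrete Data (Grouped)
$$\sigma = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$$
Deviation Method
$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \qquad d = x - A$$
3. Continuous Frequency (Grouped)
$$\sigma = \sqrt{\frac{\sum f x^2}{N} - \left(\frac{\sum f x}{N}\right)^2}$$
where x is the midpoint of the class interval.
Deviation Method
$$\sigma = i \times \sqrt{\frac{\sum f d'^2}{N} - \left(\frac{\sum f d'}{N}\right)^2}, \qquad d' = \frac{x - A}{i}$$
where i is the class width.

Coefficient of SD
$$\text{Coefficient of SD} = \frac{\sigma}{\bar{x}}$$

Coefficient of Variation
$$CV = \frac{\sigma}{\bar{x}} \times 100$$

Moments

Meaning
Moments are used to describe characteristics of a frequency distribution such as averages, dispersion, skewness and kurtosis.

Types of Moments
1. Raw Moments
Moments about any arbitrary point ‘A’ are known as raw moments.
$$\mu'_r = \frac{\sum (x - A)^r}{n}$$
2. Central Moments
Moments about the mean are known as central moments.
a) First
$$\mu_1 = \frac{\sum (x - \bar{x})}{n} = 0$$
b) Second
$$\mu_2 = \frac{\sum (x - \bar{x})^2}{n}$$
c) Third
$$\mu_3 = \frac{\sum (x - \bar{x})^3}{n}$$
d) Fourth
$$\mu_4 = \frac{\sum (x - \bar{x})^4}{n}$$

Karl Pearson’s Beta and Gamma Coefficients
These are coefficients based on the first four central moments.
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2}$$
Also
$$\gamma_1 = \sqrt{\beta_1} \ (\text{taking the sign of } \mu_3), \qquad \gamma_2 = \beta_2 - 3$$
β1 (equivalently γ1) tells us whether a distribution is skewed or symmetrical. β2 (equivalently γ2) tells us the difference between a symmetrical curve and the normal curve, for which β2 = 3.

Skewness
In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive or negative, or even undefined.

Types of Skewness
A teacher expects most of the students to get good marks. If that happens, the curve of marks looks like the normal curve:
[Figure: normal (symmetrical) curve]
But for various reasons (e.g., lazy students, students not understanding the lectures or not being attentive) this does not happen, and we get one of two other curves, one positively skewed and one negatively skewed:
[Figures: the two skewed curves]

Karl Pearson’s Coefficient of Skewness
$$Sk_P = \frac{\bar{x} - \text{Mode}}{\sigma}$$
When the mode is ill-defined, the empirical relation Mode = 3 Median − 2 Mean gives the alternative form
$$Sk_P = \frac{3(\bar{x} - M)}{\sigma}$$

Bowley’s Coefficient of Skewness
$$Sk_B = \frac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$$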
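The measures of dispersion, the central moments and the two skewness coefficients named in these notes can be traced on a small data set. The following is a minimal Python sketch under stated assumptions: the sample `data` is hypothetical, the formulas in the comments are the standard textbook forms, and the quartiles come from `statistics.quantiles` with `method="exclusive"`, which places the i-th quartile at position i(N + 1)/4 with linear interpolation.

```python
from math import copysign, sqrt
from statistics import mean, multimode, quantiles

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical ungrouped sample
n = len(data)
x_bar = mean(data)


def central_moment(values, r):
    """r-th central moment: sum((x - mean)^r) / n."""
    m = mean(values)
    return sum((x - m) ** r for x in values) / len(values)


# Range and coefficient of range: L - S and (L - S) / (L + S)
largest, smallest = max(data), min(data)
data_range = largest - smallest
coeff_range = (largest - smallest) / (largest + smallest)

# Quartile deviation and its coefficient
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
quartile_deviation = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)

# Mean deviation from the mean: sum(|x - mean|) / n
mean_deviation = sum(abs(x - x_bar) for x in data) / n

# Central moments, standard deviation and coefficient of variation
mu1, mu2, mu3, mu4 = (central_moment(data, r) for r in (1, 2, 3, 4))
sigma = sqrt(mu2)
cv = sigma / x_bar * 100

# Karl Pearson's beta and gamma coefficients
beta1 = mu3**2 / mu2**3
beta2 = mu4 / mu2**2
gamma1 = copysign(sqrt(beta1), mu3)  # square root of beta1 with the sign of mu3
gamma2 = beta2 - 3

# Skewness coefficients
mode = multimode(data)[0]  # first of the most frequent values (assumes a single mode)
sk_pearson = (x_bar - mode) / sigma
sk_bowley = (q3 + q1 - 2 * q2) / (q3 - q1)

print("range:", data_range, "QD:", quartile_deviation, "MD:", mean_deviation)
print("sigma:", sigma, "CV:", cv)
print("beta1:", beta1, "beta2:", beta2, "gamma1:", gamma1, "gamma2:", gamma2)
print("Pearson skewness:", sk_pearson, "Bowley skewness:", sk_bowley)
```

Both skewness coefficients come out positive for this sample, which matches its longer right tail. The standard deviation here divides by n, following the definition in these notes; `statistics.stdev` divides by n − 1 and would give a slightly larger value.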