Math 211 Introduction to Statistics
Chapter III: The Measures of Central Tendency
Sonuç Zorlu Lecture Notes

Index (Subscript) Notation

Let the symbol $X_i$ (read "X sub i") denote any of the $N$ values $X_1, X_2, \ldots, X_N$ assumed by a variable $X$. The letter $i$ in $X_i$, which can stand for any of $1, 2, \ldots, N$, is called an index or subscript. The letters $j$, $k$, $p$, $q$ or $s$ can also be used.

Summation Notation

$$\sum_{i=1}^{N} X_i = X_1 + X_2 + \cdots + X_N$$

Example:

$$\sum_{i=1}^{N} X_i Y_i = X_1 Y_1 + X_2 Y_2 + \cdots + X_N Y_N$$

$$\sum_{i=1}^{N} a X_i = a X_1 + a X_2 + \cdots + a X_N = a (X_1 + X_2 + \cdots + X_N) = a \sum_{i=1}^{N} X_i, \quad a \in \mathbb{R}$$

Averages (Measures of Central Tendency)

An average of a set of numbers is a single value that best represents the set. The three most common averages are the mean, the median and the mode. Each has advantages and disadvantages depending on the data and the intended purpose.

Mean

This is also known as the arithmetic mean. It is found by dividing the sum of the numbers by the number of values:

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{\sum X}{N}$$

Example: Find the mean of 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10.

Sum of values: 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10 = 55
Number of values = 10
Mean of values = $\bar{X}$ = 55 / 10 = 5.5

Note: If the numbers $X_1, X_2, \ldots, X_k$ occur $f_1, f_2, \ldots, f_k$ times respectively (that is, occur with frequencies $f_1, f_2, \ldots, f_k$), the arithmetic mean is

$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{N}$$

where $N = \sum_{i=1}^{k} f_i$ is the total frequency.

Example: The grades of a student on six examinations were 84, 91, 72, 68, 91 and 72. Find the arithmetic mean.

The grades 91 and 72 each occur twice, so the arithmetic mean is

$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{N} = \frac{1(84) + 2(91) + 2(72) + 1(68)}{1 + 2 + 2 + 1} = \frac{478}{6} = 79.67$$

Example: If 5, 8, 6 and 2 occur with frequencies 3, 2, 4 and 1 respectively, the arithmetic mean is

$$\bar{X} = \frac{3(5) + 2(8) + 4(6) + 1(2)}{3 + 2 + 4 + 1} = \frac{57}{10} = 5.7$$

The Weighted Arithmetic Mean

The weighted arithmetic mean of a set of $N$ numbers $X_1, X_2, \ldots, X_N$ is defined as

$$\bar{X} = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_N X_N}{w_1 + w_2 + \cdots + w_N} = \frac{\sum_{i=1}^{N} w_i X_i}{\sum_{i=1}^{N} w_i}$$

where $w_j$ represents the weight of the $j$-th value.
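The frequency form of the mean and the weighted mean are the same formula with different weights. The following Python sketch illustrates this under that reading; the function name `weighted_mean` is a hypothetical helper introduced here, not something defined in these notes.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * X_i) / sum(w_i) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Plain arithmetic mean: every value has weight 1.
plain = weighted_mean(list(range(1, 11)), [1] * 10)

# Frequencies used as weights: 5, 8, 6, 2 occurring 3, 2, 4, 1 times.
by_frequency = weighted_mean([5, 8, 6, 2], [3, 2, 4, 1])

# Six examination grades, with 91 and 72 each occurring twice.
grades = weighted_mean([84, 91, 72, 68], [1, 2, 2, 1])

print(plain, by_frequency, round(grades, 2))
```

Treating a frequency as a weight means a single function covers raw data, frequency data and weighted data.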
Example: If a final examination is weighted 4 times as much as a quiz, a midterm examination 3 times as much as a quiz, and a student has a final examination grade of 80, a midterm examination grade of 95 and quiz grades of 90, 65 and 70, the mean grade is

$$\bar{X} = \frac{1(90) + 1(65) + 1(70) + 3(95) + 4(80)}{1 + 1 + 1 + 3 + 4} = \frac{830}{10} = 83$$

Properties of the Arithmetic Mean

(1) The algebraic sum of the deviations of a set of numbers from their arithmetic mean is zero, that is,

$$\sum_{i=1}^{N} (X_i - \bar{X}) = 0$$

(2) $\sum_{i=1}^{N} (X_i - a)^2$ is a minimum if and only if $a = \bar{X}$.

(3) If $f_1$ numbers have mean $m_1$, $f_2$ numbers have mean $m_2$, ..., $f_k$ numbers have mean $m_k$, then the mean of all the numbers is the weighted arithmetic mean of all the means:

$$\bar{X} = \frac{f_1 m_1 + f_2 m_2 + \cdots + f_k m_k}{f_1 + f_2 + \cdots + f_k}$$

(4) If $A$ is any guessed or assumed arithmetic mean and if $d_j = X_j - A$ are the deviations of $X_j$ from $A$, then

$$\bar{X} = A + \frac{\sum_{j=1}^{N} d_j}{N} = A + \frac{\sum d}{N} = A + \bar{d} \quad \text{(for raw data)}$$

$$\bar{X} = A + \frac{\sum_{j=1}^{k} f_j d_j}{\sum_{j=1}^{k} f_j} = A + \frac{\sum f d}{N} = A + \bar{d} \quad \text{(for grouped data)}$$

Arithmetic Mean Computed from Grouped Data

Formula 1.

$$\bar{X} = A + \frac{\sum_{j=1}^{k} f_j d_j}{\sum_{j=1}^{k} f_j} = A + \frac{\sum f d}{N}$$

where $A$ is any guessed or assumed class mark and $d_j = X_j - A$ are the deviations of $X_j$ from $A$.

Formula 2.

$$\bar{X} = \frac{\sum_{j=1}^{k} X_j f_j}{\sum_{j=1}^{k} f_j}$$

where $X_j$ is the class mark of the corresponding class.

Median

The median of a set of numbers arranged in an array (that is, in order of magnitude) is either the middle value or the arithmetic mean of the two middle values. That is,

$$\tilde{X} = \begin{cases} X_{(n+1)/2} & \text{if } n \text{ is odd} \\[4pt] \dfrac{X_{n/2} + X_{n/2+1}}{2} & \text{if } n \text{ is even} \end{cases}$$

A disadvantage of the median is that it is insensitive to changes in the data: changing values away from the middle of the array leaves it unchanged.

Example: Find the median of 2, 4, 8, 7, 4, 6, 10, 8, and 5.
Array: 2, 4, 4, 5, 6, 7, 8, 8, 10
Middle value = ((9 + 1) / 2)th value = 5th value = $X_5$
Median = 6

The Median for Grouped Data

$$\text{Median} = \tilde{X} = L_1 + \left( \frac{\frac{N}{2} - (\sum f)_1}{f_{\text{median}}} \right) c$$

where
$L_1$ = lower class boundary of the median class
$N$ = number of items in the data (total frequency)
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class
$f_{\text{median}}$ = frequency of the median class
$c$ = size of the median class interval

Mode

The mode is the value which occurs most frequently in the set of values. The mode of the set of values is also known as the modal value. The mode may be unique, may not exist, or there may be more than one.

Example: Find the mode of 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 7, 8, 8 and 9.

Modal value = 5, since it has the highest frequency.

In the case of grouped data where a frequency curve has been constructed to fit the data, the mode is the value (or values) of $X$ corresponding to the maximum point (or points) on the curve. This value of $X$ is sometimes denoted by $\hat{X}$.

From a frequency distribution or histogram the mode can be obtained by

$$\text{Mode} = \hat{X} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$$

where
$L_1$ = lower class boundary of the modal class
$\Delta_1$ = excess of modal frequency over frequency of the next lower class
$\Delta_2$ = excess of modal frequency over frequency of the next higher class
$c$ = size of the modal class interval

The Empirical Relation between the Mean, Median and Mode

$$\text{Mean} - \text{Mode} = 3\,(\text{Mean} - \text{Median})$$

The above relation holds approximately for unimodal frequency curves which are moderately asymmetrical (skewed).

The Geometric Mean G

Let $X_1, X_2, \ldots, X_N$ be the sample values. The geometric mean is

$$G = \sqrt[N]{X_1 X_2 \cdots X_N}$$

Example: The geometric mean of the numbers 2, 4 and 8 is

$$G = \sqrt[3]{2 \cdot 4 \cdot 8} = \sqrt[3]{64} = 4$$
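The raw-data examples above for the median, the mode and the geometric mean can be checked with Python's standard `statistics` module. This is only a sketch: it assumes Python 3.8 or later (for `multimode` and `geometric_mean`), and the variable names are illustrative.

```python
import statistics

# Median: the data are sorted internally, then the middle value is taken.
median_data = [2, 4, 8, 7, 4, 6, 10, 8, 5]
median_value = statistics.median(median_data)

# Mode: multimode returns every value that attains the highest frequency,
# so more than one mode shows up as a list with several entries.
mode_data = [1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 7, 8, 8, 9]
modes = statistics.multimode(mode_data)

# Geometric mean: the N-th root of the product of N positive values.
g = statistics.geometric_mean([2, 4, 8])

print(median_value, modes, round(g, 6))
```

Note that when every value occurs exactly once, `multimode` returns all of the values, whereas these notes would say that the mode does not exist.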
The Harmonic Mean H

Let $X_1, X_2, \ldots, X_N$ be the sample values. The harmonic mean is

$$H = \frac{1}{\dfrac{1}{N} \sum_{i=1}^{N} \dfrac{1}{X_i}} = \frac{N}{\sum \dfrac{1}{X}}$$

Example: The harmonic mean of the numbers 2, 4 and 8 is

$$H = \frac{3}{\frac{1}{2} + \frac{1}{4} + \frac{1}{8}} = \frac{3}{7/8} = 3.43$$

The Relation between the Arithmetic, Geometric and Harmonic Means:

$$H \le G \le \bar{X}$$

Quartiles, Deciles and Percentiles

These divide the data set into four, ten or one hundred equal parts, respectively. Quartiles, deciles and percentiles are measures of position useful for comparing scores within one set of data. You probably all took some type of college placement exam at some point. If your composite math score was, say, 28, it might have been reported that this score was in the 94th percentile. What does this mean? This does not mean you received a 94% on the test. It does mean that of all the students who took that exam, 94% of them scored lower than you did (and 6% higher).

For a set of data you can divide the data into three quartiles ($Q_1, Q_2, Q_3$), nine deciles ($D_1, D_2, \ldots, D_9$) and 99 percentiles ($P_1, P_2, \ldots, P_{99}$). The quartile $Q_1$ separates the bottom 25% from the top 75%, $Q_2$ is the median, and $Q_3$ separates the top 25% from the bottom 75%.

To work with percentiles, deciles and quartiles you need to learn to do two different tasks. First you should learn how to find the percentile that corresponds to a particular score, and then how to find the score in a set of data that corresponds to a given percentile.

Exercise 1: The table shows the speed distribution of vehicles on Magusa-Lefkosa Road on a typical day.

Speed (km/hr)   No. of vehicles f_j   Class mark X_j   d_j = X_j - A   f_j d_j
60-69                  138                 64.5             -40          -5520
70-79                  163                 74.5             -30          -4890
80-89                  325                 84.5             -20          -6500
90-99                  541                 94.5             -10          -5410
100-109                427                104.5               0              0
110-119                214                114.5              10           2140
120-129                110                124.5              20           2200
130-139                 52                134.5              30           1560
140-149                 30                144.5              40           1200
Total               N = 2000                                            -15220

(a) Find the mean speed.
(b) Find the median speed.
(c) Find the modal speed.
(d) Find $Q_1$, $D_3$ and $P_{95}$.

Solution.
(a) Let $A = 104.5$. Then, with $\sum f_j d_j = -15220$,

$$\bar{X} = A + \frac{\sum_{j=1}^{k} f_j d_j}{\sum_{j=1}^{k} f_j} = 104.5 - \frac{15220}{2000} = 96.89 \text{ km/hr}$$

(b) Since $N = 2000$, the 1000th value is the median. The classes below 90-99 contain 138 + 163 + 325 = 626 vehicles, so the median class is 90-99 and

$$\tilde{X} = L_1 + \left( \frac{\frac{N}{2} - (\sum f)_1}{f_{\text{median}}} \right) c = 89.5 + \left( \frac{1000 - 626}{541} \right) 10 = 89.5 + (0.69)10 = 96.4 \text{ km/hr}$$

(c) The modal class is 90-99, with $\Delta_1 = 541 - 325 = 216$ and $\Delta_2 = 541 - 427 = 114$, so the modal speed is

$$\hat{X} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c = 89.5 + \left( \frac{216}{216 + 114} \right) 10 = 89.5 + 6.55 = 96.05 \text{ km/hr}$$

(d)

$$Q_1 = L_1 + \left( \frac{\frac{N}{4} - (\sum f)_1}{f_{Q_1}} \right) c = 79.5 + \left( \frac{500 - 301}{325} \right) 10 = 85.62 \text{ km/hr}$$

$$D_3 = P_{30} = L_1 + \left( \frac{\frac{3N}{10} - (\sum f)_1}{f_{D_3}} \right) c = 79.5 + \left( \frac{600 - 301}{325} \right) 10 = 88.7 \text{ km/hr}$$

$$P_{95} = L_1 + \left( \frac{\frac{95N}{100} - (\sum f)_1}{f_{P_{95}}} \right) c = 119.5 + \left( \frac{1900 - 1808}{110} \right) 10 = 127.86 \text{ km/hr}$$

Exercise 2: The following table shows a frequency distribution of the weekly wages of 65 employees at the P&R Company.

Wages ($)         No. of employees
250.00-259.99            8
260.00-269.99           10
270.00-279.99           16
280.00-289.99           14
290.00-299.99           10
300.00-309.99            5
310.00-319.99            2
Total                 N = 65

(a) Find the mean wage.
(b) Find the modal wage.
(c) Find the median wage.
(d) Find $Q_3$ and $D_8$.

Exercise 3: Consider the following frequency distribution.

Classes   Frequency f_i   X_i   f_i X_i   d_i = X_i - A   f_i d_i
10-14           7          12      84         -10           -70
15-19          11          17     187          -5           -55
20-24          14          22     308           0             0
25-29          13          27     351           5            65
30-34           5          32     160          10            50
Total          50                1090                       -10

(a) Find the (approximate) mean using Formula 1 and Formula 2. Compare the results.

Formula 1. Let $A = 22$. Then

$$\bar{X} = A + \frac{\sum_{j=1}^{k} f_j d_j}{\sum_{j=1}^{k} f_j} = 22 - \frac{10}{50} = 21.8$$

Formula 2.

$$\bar{X} = \frac{\sum_{j=1}^{k} X_j f_j}{\sum_{j=1}^{k} f_j} = \frac{1090}{50} = 21.8$$

Both formulas give the same mean.

(b) Find the mode.

The modal class is the third class, with frequency 14. Here $\Delta_1 = 14 - 11 = 3$ and $\Delta_2 = 14 - 13 = 1$. Thus,

$$\hat{X} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c = 19.5 + \left( \frac{3}{3 + 1} \right) 5 = 23.25$$

(c) Find $P_{90}$ and $P_{10}$.
$$P_{90} = L_1 + \left( \frac{\frac{90N}{100} - (\sum f)_1}{f_{P_{90}}} \right) c = 24.5 + \left( \frac{45 - 32}{13} \right) 5 = 29.5$$

$$P_{10} = L_1 + \left( \frac{\frac{10N}{100} - (\sum f)_1}{f_{P_{10}}} \right) c = 9.5 + \left( \frac{5 - 0}{7} \right) 5 = 13.07$$

Exercise 4: A student's grades in the laboratory, lecture, and recitation parts of a computer course were 71, 78, and 89, respectively.

(a) If the weights accorded these grades are 2, 4, and 5, respectively, what is an average grade?
(b) What is the average grade if equal weights are used?
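The grouped-data formulas for the mean, mode and percentiles can be checked numerically against Exercise 3. The following Python sketch is one possible implementation, not a method taken from these notes: the function names are hypothetical, and it assumes equal class widths, classes listed in increasing order, and a single modal class whose frequency exceeds at least one neighbour.

```python
def grouped_mean(class_marks, freqs):
    """Formula 2: sum(f_j * X_j) / sum(f_j)."""
    return sum(f * x for x, f in zip(class_marks, freqs)) / sum(freqs)


def grouped_percentile(lower_boundaries, freqs, width, p):
    """L1 + ((p*N/100 - cumulative frequency below the class) / f_class) * c."""
    target = p * sum(freqs) / 100
    cumulative = 0
    for lower, f in zip(lower_boundaries, freqs):
        # The first class whose cumulative frequency reaches the target
        # is the class containing the p-th percentile.
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("p must lie between 0 and 100")


def grouped_mode(lower_boundaries, freqs, width):
    """L1 + (D1 / (D1 + D2)) * c, assuming a unique modal class."""
    i = max(range(len(freqs)), key=lambda j: freqs[j])
    d1 = freqs[i] - (freqs[i - 1] if i > 0 else 0)
    d2 = freqs[i] - (freqs[i + 1] if i < len(freqs) - 1 else 0)
    return lower_boundaries[i] + d1 / (d1 + d2) * width


# Exercise 3 data: classes 10-14, 15-19, 20-24, 25-29, 30-34.
marks = [12, 17, 22, 27, 32]
lowers = [9.5, 14.5, 19.5, 24.5, 29.5]
freqs = [7, 11, 14, 13, 5]

mean = grouped_mean(marks, freqs)
mode = grouped_mode(lowers, freqs, 5)
p90 = grouped_percentile(lowers, freqs, 5, 90)
p10 = grouped_percentile(lowers, freqs, 5, 10)
print(round(mean, 2), round(mode, 2), round(p90, 2), round(p10, 2))
```

The median, quartiles and deciles need no separate functions, since they are the 50th, 25th/75th and 10th/20th/... percentiles of the same formula.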