1 Objective: to be able to determine which of the three measures (mean, median and mode) to apply to a given set of data for a given purpose.

2 Outline
Introduction
Definition of Measures of Central Tendency
Mean
  Arithmetic Mean
    Ungrouped data
    Grouped data
Median
  Ungrouped data
  Grouped data
  Graphical Method

3 Outline (continued)
Mode
  Ungrouped data
  Grouped data

4 Measures of central tendency are single values that are typical and representative of a group of numbers. They are also called measures of location. A representative value of location for a group of numbers is neither the largest nor the smallest, but a number whose value lies somewhere in the middle of the group. Such measures are often used to summarize a data set and to compare one data set with another.

5 Mean: the average value of a set of data. It is appropriate for describing measurement data, e.g. heights of people or marks of student papers. It is often influenced by extreme values.

6 • For ungrouped data:

  Sample mean: $\bar{x} = \dfrac{x_1 + x_2 + \dots + x_n}{n} = \dfrac{\sum x}{n}$

  Population mean: $\mu = \dfrac{\sum x}{N}$

  where $x$ = value of the x variable, $n$ = sample size, $N$ = population size, and $\sum$ = "the summation of".

• We seldom use the population mean, $\mu$, because the population is usually very large and it would be troublesome to gather all the values. We usually calculate the sample mean, $\bar{x}$, and use it to estimate $\mu$.

7 Example: Find the average age of five students whose ages are 18, 19, 19, 20 and 22 respectively.

Solution: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{98}{5} = 19.6$ years old.

8 For grouped data:
1) Basic Method

  $\bar{x} = \dfrac{\sum fx}{n}$

  where $f$ is the frequency of a class, $x$ is the midpoint of that class, and $n = \sum f$.

For example: The distances traveled by 100 workers of XYZ Company from their homes to the workplace are summarized below (next slide). Find the mean distance traveled by a worker.

9
Distance (km)      No. of workers
0 and under 2      12
2 and under 4      35
4 and under 6      24
6 and under 8      18
8 and under 10     11
Total              100

Kilometers travelled   No. of workers (f)   Midpoint (x)   Total distance travelled (fx)
0 and under 2          12                   1              12
2 and under 4          35                   3              105
4 and under 6          24                   5              120
6 and under 8          18                   7              126
8 and under 10         11                   9              99
Total                  100                                 462

Mean = 462/100 = 4.62 km

10 Advantages of the mean:
It is easy to calculate and understand.
It makes use of all the data points and can be determined with mathematical exactness.
It is useful for statistical procedures such as comparing the means of different data sets.

11 Disadvantages of the mean:
It can be significantly influenced by extreme, abnormal values.
It may not correspond to any single item in the data set.
Every item in the data set is taken into consideration when computing the mean, so it can be very tedious to compute for a very large data set.
It cannot be computed when the data have open-ended classes.

12 The weighted mean enables us to calculate an average which takes into account the relative importance of each value to the overall total.

Example: A lecturer at XYZ Polytechnic has decided to use a weighted average in awarding final marks to his students. Class participation will account for 10% of the student's grade, mid-term test 15%, project 20%, quiz 5% and final exam 50%.

13 From the information given, compute the final average for Zaraa, who is one of the students. Zaraa's marks are as follows:
class participation   90
quiz                  80
project               75
mid-term test         70
final examination     85

14 Solution

Component        Marks (x)   Weight (w)   wx
Participation    90          10           900
Quiz             80          5            400
Project          75          20           1500
Mid-term test    70          15           1050
Final exam       85          50           4250
Total                        100          8100

$\bar{x} = \dfrac{\sum wx}{\sum w} = \dfrac{8100}{100} = 81$

15 The median of a set of values is defined as the value of the middle item when the values are arranged in ascending or descending order of magnitude.
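The three mean calculations worked in the preceding slides (ungrouped, grouped and weighted) can be checked with a minimal Python sketch. The function and variable names are illustrative assumptions, not part of the original notes. The ungrouped example takes the five ages as 18, 19, 19, 20 and 22, which sum to the 98 used in the solution.

```python
def mean(values):
    """Ungrouped data: sum of the values divided by how many there are."""
    return sum(values) / len(values)


def grouped_mean(midpoints, frequencies):
    """Grouped data, basic method: sum of f*x divided by n, where n = sum of f."""
    total = sum(f * x for f, x in zip(frequencies, midpoints))
    return total / sum(frequencies)


def weighted_mean(values, weights):
    """Weighted mean: sum of w*x divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Data copied from the worked examples (names are hypothetical).
ages = [18, 19, 19, 20, 22]
midpoints = [1, 3, 5, 7, 9]          # class midpoints in km
workers = [12, 35, 24, 18, 11]       # class frequencies
marks = [90, 80, 75, 70, 85]         # participation, quiz, project, mid-term, final
weights = [10, 5, 20, 15, 50]        # percentage weights in the same order

print(mean(ages))                        # 19.6
print(grouped_mean(midpoints, workers))  # 4.62
print(weighted_mean(marks, weights))     # 81.0
```

The grouped and weighted means share the same shape: the grouped mean is a weighted mean of the class midpoints, with the class frequencies acting as the weights.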
16 If the data set has an odd number of observations, the middle item is the required median once the data have been arranged in either ascending or descending order. If the data array has an even number of observations, we take the average of the two middle items as the required median, as shown on the next slide.

17 Example:
a) Odd number of values
data set: 2, 1, 5, 2, 10, 6, 8
array: 1, 2, 2, 5, 6, 8, 10
median = value of the fourth item = 5

b) Even number of values
data set: 9, 6, 2, 5, 18, 4, 12, 10
array: 2, 4, 5, 6, 9, 10, 12, 18
median = (6 + 9) / 2 = 7.5

18 Example

Kilometers travelled   No. of workers   Cumulative frequency
0 and under 2          12               12
2 and under 4          35               47
4 and under 6          24               71
6 and under 8          18               89
8 and under 10         11               100
Total                  100

Middle term = n/2 = 100/2 = 50th term

We identify the median class by examining the cumulative frequencies: the cumulative frequency first reaches 50 in the class "4 and under 6" (it is 47 at the start of that class and 71 at its end), so this is the median class.

19 We can also find the median using a graph. Here, we first plot the "more than" and "less than" ogives. The median is the value at the intersection point of the two ogives.

[Figure: "less than" and "more than" ogives; vertical axis cf with 50 marked, horizontal axis marked 2, 4, 6, 8, 10]

20 Example:

Distance (km)      No. of workers   Less-than cf   More-than cf
0 and under 2      12               12             88
2 and under 4      35               47             53
4 and under 6      24               71             29
6 and under 8      18               89             11
8 and under 10     11               100            0
Total              100

21 Advantages of the median:
It is easy to compute and simple to understand.
It is not affected by extreme values.
It can be used with open-ended classes.
It can deal with qualitative data.

22 Disadvantages of the median:
It can be time-consuming to compute, as we first have to arrange the data in order.
If we have only a few values, the median is not likely to be representative.
It is usually less reliable than the mean for statistical inference purposes.
It is not suitable for arithmetical calculations, and has limited use for practical work.

23 Mode: the most frequent or repeated value. In the case of a continuous variable, it is possible that no two values will repeat themselves. In such a situation, the mode is defined as the point of highest frequency density, i.e., where occurrences cluster most closely together.
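Returning to the median slides above, the odd/even rule and the graphical method can be sketched in Python. The function names are hypothetical. The grouped version assumes the ogives are drawn as straight-line segments between class boundaries, which the slides do not state. Because the "less than" and "more than" cumulative frequencies always add up to n, the two ogives cross where both equal n/2, so the sketch interpolates inside the median class to find that point.

```python
def median(values):
    """Ungrouped data: middle item, or the average of the two middle items."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def ogive_median(boundaries, frequencies):
    """Grouped data: x-value where the two ogives cross (cf = n/2).

    Assumes straight-line ogive segments within each class (an assumption,
    not something stated in the slides).
    """
    n = sum(frequencies)
    less_than = 0  # cumulative frequency at the lower boundary of the class
    for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
        if f > 0 and less_than + f >= n / 2:
            # Median class found: interpolate linearly inside it.
            return lower + (n / 2 - less_than) / f * (upper - lower)
        less_than += f
    return None  # only reached if there are no observations


print(median([2, 1, 5, 2, 10, 6, 8]))        # 5
print(median([9, 6, 2, 5, 18, 4, 12, 10]))   # 7.5

boundaries = [0, 2, 4, 6, 8, 10]             # class boundaries in km
workers = [12, 35, 24, 18, 11]
print(ogive_median(boundaries, workers))     # 4.25
```

Under that straight-line assumption the workers' median distance comes out at 4.25 km, inside the median class "4 and under 6" identified from the cumulative frequencies.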
24 Like the median, the mode has very limited practical use and cannot be subjected to arithmetical manipulation. However, being the value that occurs most often, it provides a good representation of the data set.

25 The mode can be obtained simply by inspection.

Example:
1, 4, 10, 8, 10, 12, 13    Mode = 10
1, 3, 3, 7, 8, 8, 9        Mode = 3 and 8
1, 2, 3, 4, 9, 10, 11      No mode

26 In the case of grouped data, finding the mode may not be as easy. Since a grouped frequency distribution does not show individual values, it is impossible to determine which value occurs most frequently. Instead, we assign the mode to the class that has the highest frequency, even though we may not know whether any data value occurs more than once.

27 Advantages
It can be used with open-ended classes.
It is not affected by extreme values.
It is also applicable to qualitative data.

Disadvantages
It is not clearly defined.
Some data sets may have no mode.
It is difficult to interpret and compare if the data set has more than one mode.

28 References: Lecture & Tutorial Notes from Department of Business & Management, Institute Technology Brunei, Brunei Darussalam.
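As a supplement to the mode slides, finding the mode by inspection and the highest-frequency rule for grouped data can be sketched in Python. The function names are hypothetical. Applying the grouped rule to the workers' distance table from the earlier mean and median examples is an illustration chosen here, not a calculation made in the notes.

```python
from collections import Counter


def modes(values):
    """Ungrouped data: every value that occurs most often.

    Returns an empty list when no value repeats ("no mode").
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


def modal_class(classes, frequencies):
    """Grouped data: the class with the highest frequency."""
    return classes[frequencies.index(max(frequencies))]


print(modes([1, 4, 10, 8, 10, 12, 13]))   # [10]
print(modes([1, 3, 3, 7, 8, 8, 9]))       # [3, 8]
print(modes([1, 2, 3, 4, 9, 10, 11]))     # []

classes = ["0 and under 2", "2 and under 4", "4 and under 6",
           "6 and under 8", "8 and under 10"]
workers = [12, 35, 24, 18, 11]
print(modal_class(classes, workers))      # 2 and under 4
```

Returning a list makes the "more than one mode" and "no mode" cases explicit, which are the two situations the disadvantages slide warns about.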