Measures of Central Tendencies

What Is This Module About?

Have you encountered a situation in which you had to get the average of a given set of numbers or data? For instance, you have a sari-sari store and you want to know your average profit per month. How will you compute the average profit?

The average, or mean, of a set of numbers or data is just one of the three measures of central tendency that you will learn about in this module. While the measures of central tendency give information about the average or center of a distribution of data, the measures of dispersion are used to analyze the variation of the data in the distribution. In this module, you will learn more about their differences and how they are applied in problem solving.

This module is divided into two lessons:

Lesson 1 – Measures of Central Tendency
Lesson 2 – Measures of Dispersion

Wait! Before studying this module, be sure that you have completed the module on Graphs and Frequency Distribution.

What Will You Learn From This Module?

After studying this module, you should be able to:

♦ describe the differences between the mean, median and mode;
♦ use the mean, median and mode to analyze and interpret data to solve problems in daily life;
♦ describe the differences between the range, mean deviation and the standard deviation; and
♦ use the range, mean deviation and standard deviation to solve problems in daily life.

Let’s See What You Already Know

A. The following is the frequency distribution of 50 scores in a Science test.

Interval of scores    Frequency
95 – 99               2
90 – 94               6
85 – 89               6
80 – 84               8
75 – 79               7
70 – 74               6
65 – 69               4
60 – 64               5
55 – 59               4
50 – 54               2

1. Find the mean.
2. Find the median.
3. Find the mode.

B. The following are the test scores of Jeremy and Theresa in four of their subjects.

Subject      Jeremy    Theresa
Physics      78        79
Chemistry    61        65
Biology      80        91
Geology      89        85

Complete the table below.
            Range    Variance    Standard Deviation
Jeremy      (1)      (2)         (3)
Theresa     (4)      (5)         (6)

Well, how was it? Do you think you fared well? Compare your answers with those in the Answer Key on page 37 to find out.

If all your answers are correct, very good! This shows that you already know much about the topics in this module. You may still study the module to review what you already know. Who knows, you might learn a few new things as well.

If you got a low score, don’t feel bad. This means that this module is for you. It will help you understand some important concepts that you can apply in your daily life. If you study this module carefully, you will learn the answers to all the items in the test and a lot more! Are you ready?

You may now go to the next page to begin Lesson 1.

LESSON 1
Measures of Central Tendency

Measures of central tendency provide information about the averages or centers of a distribution of data. A measure of central tendency is the number that best represents a given set of numbers, or what we call a group of scores.

This lesson will introduce you to the three measures of central tendency called the mean, the median and the mode; how they are computed; and how they are used in everyday life.

After studying this lesson, you should be able to:

♦ describe the differences between the mean, median and mode;
♦ compute for the mean, median and mode of a given set of data; and
♦ use the mean, median and mode to analyze and interpret data to solve problems in daily life.

In a fishing village, Mario, Gimo, Oscar and Caloy are comparing their catches for the day. Mario has caught 7.6 kilograms of fish while Gimo has caught only 6 kilograms. Oscar has 10.2 kilograms of fish while Caloy has 11.5 kilograms. What is their average fish catch for the day?

The average fish catch can also be referred to as the mean fish catch. The mean is the arithmetic average of a set of data or given numbers. In this case, the set of data is the number of kilograms of fish each of the fishermen caught.
These given numbers, which are referred to as the scores, are added up. When we add all the kilograms of fish caught for the day, we get 35.3 kilograms of fish. Then we count the number of scores, and we get 4. When the sum of the scores is divided by the number of scores, we get an average of 8.825 kilograms of fish. This means that the mean of the group of data is 8.825.

Let’s Learn

The mean, also known as the arithmetic mean, is the average of a set of data, or scores. This is the number or the score that best represents a group of scores. It is one of the measures of central tendency used when you have interval or ratio scores.

Do you still remember the terms “interval” and “ratio”? These are 2 of the 4 scales of measurement. If you want to review them, you may read the module entitled “Graphs and Frequency Distributions.” This topic is discussed in Lesson 1 of that module.

The computation for the mean may be expressed using symbols. Let us use x to represent a single score in a given set of data. In the example above, Mario has caught 7.6 kilograms of fish. This score would be represented as x = 7.6. This score can also be referred to as a raw score if it has not been treated in any way.

The sum of a set of scores is represented by the capital Greek letter sigma (Σ). This symbol represents the summation of a set of scores. Whenever this symbol appears, it means that whatever follows it must be summed or added up. For example, the notation Σx indicates that all the values or scores represented by x should be added up.

The symbol N is also useful in statistical computations. It represents the number of scores in the given set. In the previous example, the number of scores given is 4. Therefore, N = 4.

Combining these three symbols (x, Σ, N), we have the formula to be used in calculating the mean of a given set of data.
If we let the mean be x̄, the formula is:

\[ \bar{x} = \frac{\Sigma x}{N} \]

where
x̄ = the mean
Σ = “the sum of”
N = number of scores

Let’s Try This

Read the following problems carefully. Solve for the mean.

EXAMPLE 1

A company makes electrical circuits. The quality control manager decides to find out the average number of defective circuits made per day. He examines each electrical circuit produced per day and records the number of electrical circuits with defects. He collected data for 7 days. These data are listed below:

Day 1    25 defective circuits
Day 2    22 defective circuits
Day 3    30 defective circuits
Day 4    32 defective circuits
Day 5    31 defective circuits
Day 6    27 defective circuits
Day 7    16 defective circuits

Compute for the mean failure rate, or the mean number of defective electrical circuits.

Let’s do it together. In order to get the mean, we use the formula

\[ \bar{x} = \frac{\Sigma x}{N} \]

In the formula, Σx indicates that we sum up all the given scores. Thus,

25 + 22 + 30 + 32 + 31 + 27 + 16 = 183

We then divide the sum by the number of scores N. Here, N = 7. Thus,

\[ \bar{x} = \frac{\Sigma x}{N} = \frac{183}{7} = 26.14 \approx 26 \]

Therefore, the mean failure rate is 26 defective electrical circuits.

Try to solve the following on your own.

EXAMPLE 2

You are an employee of a certain company in your province. In your company, there are 20 employees in all. You want to determine the average monthly salary that the company gives. Listed below are the monthly salaries of these 20 employees (including your own salary) in pesos.

Employee 1     P 4,000        Employee 11    P 3,000
Employee 2     P 2,500        Employee 12    P 3,000
Employee 3     P 5,200        Employee 13    P 3,000
Employee 4     P 2,800        Employee 14    P 4,200
Employee 5     P 1,200        Employee 15    P 2,100
Employee 6     P 2,100        Employee 16    P 2,100
Employee 7     P 2,100        Employee 17    P 1,800
Employee 8     P 2,100        Employee 18    P 1,500
Employee 9     P 2,000        Employee 19    P 2,100
Employee 10    P 3,200        Employee 20    P 4,000

Compare your answers with those found in the Answer Key on page 37.
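The mean formula used in Example 1 can also be checked with a short program. The following is only a minimal sketch in Python; the names `defects` and `mean` are chosen here for illustration and are not part of the module.

```python
# Defective circuits recorded on Days 1 to 7 (data from Example 1).
defects = [25, 22, 30, 32, 31, 27, 16]


def mean(scores):
    """Return the arithmetic mean: the sum of the scores divided by N."""
    total = sum(scores)  # corresponds to Σx
    n = len(scores)      # corresponds to N
    return total / n


print(sum(defects))              # 183
print(round(mean(defects), 2))   # 26.14
```

The same function can be used to check your own answer to Example 2 by replacing the list with the 20 salaries.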
Let’s Learn

When scores are grouped or are presented in a frequency distribution, such as in the table below, we may need to use a more convenient formula to compute for the mean.

The following table shows the test scores in a class composed of 40 students.

Table 1
Scores in a test

Interval of scores    Frequency (f)
45 – 49               1
40 – 44               2
35 – 39               5
30 – 34               10
25 – 29               9
20 – 24               6
15 – 19               4
10 – 14               3

In the case of a frequency distribution table, we take note of how many scores fall in each interval. This is the frequency, represented by f. We also take note of the midpoint of each interval. This is represented by xc. For example, the midpoint xc of the interval 45 – 49 is 47. For the interval 40 – 44, what do you think is the midpoint? _____.

In order to get the mean using the new formula, the midpoint of each interval is multiplied by its frequency. Let us present the information above in a new table.

Table 2
Scores in a test

Interval of scores    Midpoint xc    Frequency f
45 – 49               47             1
40 – 44               42             2
35 – 39               37             5
30 – 34               32             10
25 – 29               27             9
20 – 24               22             6
15 – 19               17             4
10 – 14               12             3
                                     N = 40

To compute the mean for this frequency distribution, we use the formula below:

\[ \bar{x} = \frac{\Sigma f x_c}{N} \]

Using the data in Table 2, we first multiply each midpoint by its frequency and add the products:

Σfxc = 47 + 84 + 185 + 320 + 243 + 132 + 68 + 36 = 1115

We now solve for the mean:

\[ \bar{x} = \frac{1115}{40} = 27.875 \approx 28 \]

Therefore, the mean is 28.

Let’s Learn

When data is presented in a frequency distribution, we compute for the mean by using the following formula:

\[ \bar{x} = \frac{\Sigma f x_c}{N} \]

where
x̄ = the mean
Σ = “the sum of”
xc = midpoint of an interval
f = frequency of cases in an interval
N = number of scores

Given a frequency distribution, we follow this step-by-step procedure:

STEP 1  Determine the number of scores N.
STEP 2  Determine the midpoint xc for each class interval.
STEP 3  Determine the fxc for each class interval.
STEP 4  Determine the sum Σfxc.
STEP 5  Substitute the values in the formula.

Let’s Try This

EXAMPLE 3

Given the following frequency distribution table of the ages of 30 participants in a training workshop, compute for the mean.
Interval of ages    Frequency
45 – 49             2
40 – 44             5
35 – 39             4
30 – 34             3
25 – 29             12
20 – 24             4

Let us use the step-by-step procedure. Try to fill in the blanks.

STEP 1  Determine the number of scores N.

The total number of scores is 30 since there are 30 participants in the training workshop. Therefore, N = 30.

STEP 2  Determine the midpoint xc for each class interval.

Interval of ages    Midpoint xc
45 – 49             47
40 – 44             42
35 – 39             37
30 – 34             ___
25 – 29             ___
20 – 24             ___

STEP 3  Determine the fxc for each class interval. For example, for the interval 45 – 49, fxc = 2 × 47 = 94.

Interval of ages    Midpoint xc    Frequency f
45 – 49             47             2
40 – 44             42             5
35 – 39             37             4
30 – 34             32             3
25 – 29             27             12
20 – 24             22             4
                                   N = 30

STEP 4  Determine the sum Σfxc. Adding the fxc values of all the intervals gives Σfxc = 960.

Interval of ages    Midpoint xc    Frequency f
45 – 49             47             2
40 – 44             42             5
35 – 39             37             4
30 – 34             32             3
25 – 29             27             12
20 – 24             22             4
                                   N = 30

STEP 5  Substitute the values in the formula.

\[ \bar{x} = \frac{\Sigma f x_c}{N} = \frac{960}{30} = 32 \]

Therefore, the mean is 32.

EXAMPLE 4

Given the following frequency distribution table of 40 scores in a Science test, compute for the mean.

Interval of scores    Frequency
95 – 99               1
90 – 94               4
85 – 89               5
80 – 84               7
75 – 79               6
70 – 74               5
65 – 69               3
60 – 64               5
55 – 59               2
50 – 54               2

STEP 1  Determine the number of scores N.

N = _______

STEP 2  Determine the midpoint xc for each class interval.

Interval of scores    Midpoint xc    Frequency f
95 – 99               ____           1
90 – 94               ____           4
85 – 89               ____           5
80 – 84               ____           7
75 – 79               ____           6
70 – 74               ____           5
65 – 69               ____           3
60 – 64               ____           5
55 – 59               ____           2
50 – 54               ____           2
                                     N = ____

STEP 3  Determine the fxc for each class interval.

Interval of scores    Midpoint xc    Frequency f
95 – 99               ____           1
90 – 94               ____           4
85 – 89               ____           5
80 – 84               ____           7
75 – 79               ____           6
70 – 74               ____           5
65 – 69               ____           3
60 – 64               ____           5
55 – 59               ____           2
50 – 54               ____           2
                                     N = ____

STEP 4  Determine the sum Σfxc.

Interval of scores    Midpoint xc    Frequency f
95 – 99               ____           1
90 – 94               ____           4
85 – 89               ____           5
80 – 84               ____           7
75 – 79               ____           6
70 – 74               ____           5
65 – 69               ____           3
60 – 64               ____           5
55 – 59               ____           2
50 – 54               ____           2
                                     N = ____
STEP 5  Substitute the values in the formula.

x̄ = Σfxc / N = ______ = ______

Therefore, the mean is _______.

Compare your answers with those found in the Answer Key on pages 37–39.

Let’s Study and Analyze

Suppose you are watching a basketball game. The following are the final scores of the players that you took note of:

Player      Score (number of points)
David       45
Limpot      10
Ong         9
Meneses     8
Crisano     8

If we look for the middle score but use the method for computing the mean, the score that we get will be higher than the true value we are looking for. Here, the mean is 80 ÷ 5 = 16, which is higher than four of the five scores. The computed value is too high because one score is much higher than the others. The scores of Limpot, Ong, Meneses and Crisano are close to each other. However, the score of David is much higher than the rest. We say that the score of David is an extreme score or an outlier.

So instead of computing for the mean, just look for the middle score. This score is called the median.

The median is the midpoint in a set of scores. When looking for the median, arrange the scores from the highest to the lowest. This arrangement of scores is called an array. Then, look for the score that is exactly in the middle of the array, such that the number of scores above it equals the number of scores below it.

In the example, there are five scores, and the middle score is 9. There are 2 scores below it and also 2 scores above it. Therefore, the score 9 is the median. This is quite easy to determine because the number of scores N is an odd number.

For a set of scores with an even number of scores N, the median is computed by getting the average of the two middle scores. For example, in the set of scores below,

5  9  12  13  15  21  22  30

the middle scores are 13 and 15. The median is computed by getting the average of these two scores. The average of 13 and 15 is 14. Therefore, we say that the median is 14.
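The two ways of finding the median described above (odd N and even N) can be sketched in Python as follows. This is only an illustration; the function name `median` is an assumption made here. The sketch sorts from lowest to highest instead of highest to lowest, which does not change the middle position.

```python
def median(scores):
    """Return the middle score of the array, or the average of the
    two middle scores when the number of scores N is even."""
    ordered = sorted(scores)   # arrange the scores into an array
    n = len(ordered)           # N
    middle = n // 2
    if n % 2 == 1:
        # Odd N: the median is the single middle score.
        return ordered[middle]
    # Even N: the median is the average of the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([45, 10, 9, 8, 8]))                  # 9
print(median([5, 9, 12, 13, 15, 21, 22, 30]))     # 14.0
```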
Note that there is no 14 in the given set of data. This means that the median need not be one of the scores; it is the point that divides a distribution in the middle.

Let’s Learn

The median is the score or value that divides a given data set or distribution into two equal halves, wherein 50% of the scores are above it and 50% are below it.

The median is not greatly affected by extremely high or low scores. If one or more scores in the data are too high or too low compared to most of the other scores, then the median is a better measure of central tendency than the mean. These extreme scores are called outliers.

The median is often denoted as “Md”. There are two different cases considered in finding the median of a given set of data.

CASE 1  The number of scores N is odd. In the array of scores, the median is the middle score.

CASE 2  The number of scores N is even. In the array of scores, the median is the average of the two middle scores.

Let’s Try This

Read the following problems carefully. Find the median of the given set of data.

EXAMPLE 1

The following are the amounts Ria earned from selling pastilyas over 5 days.

Day 1    P 1,620
Day 2    P 1,560
Day 3    P 2,220
Day 4    P 5,560
Day 5    P 2,300

It is important that the given data is in the form of an array. In the example, the scores are not in an array, so we still have to arrange them from the highest to the lowest value. Rearranging the given data, we have

Day 4    P 5,560
Day 5    P 2,300
Day 3    P 2,220
Day 1    P 1,620
Day 2    P 1,560

Which value may be considered an outlier in Ria’s data? Notice that the amount she got on Day 4 is much higher than on the other days. Therefore, the outlier in the data is P 5,560.

How do we solve for the median? Notice that the number of scores given is 5, an odd value. Therefore, Case 1 applies to this example. To get the median, we simply look for the middle score. The middle score in the given data is the amount corresponding to Day 3. This is so because there are 2 scores above it and 2 scores below it.
From the definition, the median divides a distribution of data into two equal parts. Therefore, the median is P 2,220.

EXAMPLE 2

The following are the monthly salaries of 10 employees from a small company.

Employee 1     P 15,400
Employee 2     P 3,500
Employee 3     P 3,400
Employee 4     P 3,300
Employee 5     P 3,300
Employee 6     P 3,100
Employee 7     P 2,700
Employee 8     P 2,600
Employee 9     P 2,500
Employee 10    P 2,200

Check first if the data is already presented in an array.

Which salary may be considered an outlier in the data? _________.

Take note of the number of scores given. Which case applies to this example? ________.

Case 2 applies to data with an even number of scores. It states that the median is the average of the two middle scores. What are the middle scores in this example? ______ and P 3,100. Notice that there are 4 values above the middle scores and 4 values below them.

We can now determine the median by getting the average of the middle scores.

Md = (_____ + P 3,100) ÷ 2 = ___________

Therefore, the median is _________.

Try to solve the following example on your own.

EXAMPLE 3

The following are the numbers of boxes of soap sold by a certain company in the past year.

January      3,214
February     2,459
March        2,330
April        2,422
May          2,354
June         2,575
July         2,132
August       2,079
September    1,991
October      2,560
November     2,438
December     1,857

Compute for the median.

Compare your answers with those found in the Answer Key on page 39.

Let’s Learn

Now you know how to find the median of a given set of data. But what if you are given data presented in a frequency distribution table? How will you solve for the median in this case? Let’s look at the following example.

The table below shows the test scores in a class composed of 30 students.
Table 3
Scores in a test

Interval of scores    Midpoint xc    Frequency f
95 – 99               ____           1
90 – 94               ____           4
85 – 89               ____           5
80 – 84               ____           7
75 – 79               ____           6
70 – 74               ____           5
65 – 69               ____           3
60 – 64               ____           5
55 – 59               ____           2
50 – 54               ____           2
                                     N = ____

Let us solve for the median using a step-by-step procedure.

STEP 1  Determine the number of cases N.

The total number of cases is 30 since there are 30 students in the class. Therefore, N = 30. Note that we use the term “case” because we are counting how often scores occur.

STEP 2  Solve for N/2, or half the number of cases in the distribution.

The number of cases N is 30; therefore, N/2 = 30/2 = 15.

STEP 3  Count up the number of cases until the interval containing the N/2 case is reached.

From the bottom, we count up until we reach the N/2 case, which is 15. So,

2 + 3 + 6 + 8 = 19

This is 4 cases more than 15 and falls in the interval 25 – 29.

STEP 4  Determine how many cases were needed out of all the cases in the interval to reach N/2. Divide this by the number of cases in the interval.

Out of the 8 cases in the interval 25 – 29, only 4 are used to reach 15. Therefore, only 4/8 or ½ of the interval was needed.

STEP 5  Multiply this by the size of the interval.

The size of the class interval 25 – 29 is 5, so 4/8 × 5 = 2.5.

STEP 6  Add this to the lower limit of the interval containing the median.

The interval containing the median is 25 – 29. The lowest number in this interval is 25, but we will use 24.5 instead, since this is the true limit between this interval and the one below it.

If we do Steps 4 to 6 together, we will have:

4/8 × 5 + 24.5 = 27

Therefore, we say that 27 is the median.

Let’s Learn

When data is presented in a frequency distribution, we compute for the median by using the following step-by-step procedure:

STEP 1  Determine the number of cases N. Get the sum of the given frequencies.
STEP 2  Solve for N/2, or half the number of cases in the distribution.
STEP 3  Count up the number of cases until the interval containing the N/2 case is reached.
STEP 4  Determine how many cases were needed out of all the cases in the interval to reach N/2. Divide this by the number of cases in the interval.
STEP 5  Multiply this by the size of the interval.
STEP 6  Add this to the lower limit of the interval containing the median.

Let’s Try This

EXAMPLE 4

The table shows the daily earnings of employees in a certain barangay. Solve for the median using the step-by-step procedure.

Interval of scores    Frequency
(earnings in P)       (number of employees)
235 – 239             3
230 – 234             5
225 – 229             11
220 – 224             8
215 – 219             4
210 – 214             5
205 – 209             3
200 – 204             1

STEP 1  Determine the number of cases N.

N = ______

STEP 2  Solve for N/2, or half the number of cases in the distribution.

N/2 = _____ /2 = _____.

STEP 3  Count up the number of cases until the interval containing the N/2 case is reached.

From the bottom, we count up until we reach the N/2 case, which is ____. So,

___+___+___+___+___+___ = ______

This is _____ case/s more than ______ and falls in the interval __________.

STEP 4  Determine how many cases were needed out of all the cases in the interval to reach N/2. Divide this by the number of cases in the interval.

Out of the _____ cases in the interval 220 – 224, only ______ are used to reach 20. Therefore, only ___(4)___ of the interval was needed.

STEP 5  Multiply this by the size of the interval.

The size of the class interval 220 – 224 is ___(5)__.

STEP 6  Add this to the lower limit of the interval containing the median.

The lower limit to be used = __(6)___.

If we do Steps 4 to 6 together, we will have:

__(4)___ × __(5)___ + ___(6)___ = _________

Therefore, we say that _______ is the median.

By now you should be able to solve for the median without having to write down the step-by-step procedure. Try to do the following example on your own.

EXAMPLE 5

The table shows the ages of the 35 students in a dance class. Find the median age.
Interval of scores    Frequency
(ages in years)       (number of students)
33 – 35               1
30 – 32               1
27 – 29               3
24 – 26               3
21 – 23               6
18 – 20               12
15 – 17               9

Compare your answers with those found in the Answer Key on page 40.

Let’s Study and Analyze

Suppose your family wants to open a sari-sari store and you want to know which brand of bath soap sells most among the families in your barangay, to help you decide which brand to buy. You collected data from one of the bigger sari-sari stores in your barangay by asking the store owner how many bars of each brand of bath soap are sold daily. You got the following data:

Brand of Bath Soap    Average Number of Bars Sold per Day
Cascade               20
Buoy                  10
Satin                 15
Palms                 32

The brand of bath soap with the highest sales is Palms, with an average of 32 bars sold per day. The Palms bath soap is called the mode or modal category of the data collected.

Let’s Learn

The mode is the score or category that occurs with the highest frequency. By the word “category”, we simply mean a brand, a name or a group of scores. This means that among all the categories in your data, this score or category is the one that occurred the most number of times.

The modal category refers to the most frequently occurring category in nominal data, while the mode refers to the most frequently occurring score in numerical (cardinal) data.

In the case of data presented in a frequency distribution table, the mode is the midpoint of the class interval with the highest frequency of occurrence.

Let’s Try This

EXAMPLE 1

A survey of favorite books among teens is given below. Find the modal book.

Title of Book     Frequency
Heather Potter    54
Sweet Dale        37
Mushroom Soup     42
Teen Blood        33

EXAMPLE 2

Find the mode for the following set of scores.
Interval of scores    Midpoint    Frequency
45 – 49               47          3
40 – 44               42          1
35 – 39               37          6
30 – 34               32          12
25 – 29               27          9
20 – 24               22          5
15 – 19               17          2
10 – 14               12          2

Compare your answers with those found in the Answer Key on page 40.

Let’s See What You Have Learned

A. Solve what is asked for.

1. The expenses of a household for four weeks are as follows:

First week     –  P 1,900
Second week    –  P 1,800
Third week     –  P 2,100
Fourth week    –  P 2,100

What is the mean weekly expense of the household?

2. The durations in minutes of telephone calls made on a certain pay phone for one day are as follows:

8, 9, 20, 4, 29, 15, 2, 4, 3, 12, 10

Find the median.

3. A survey of favorite TV channels in a class gives the following data:

Channel Z          –  10
Cinema TV          –  9
Cartoon Channel    –  13
Sports Stars       –  8
Nature Show        –  5

Find the modal channel.

B. The following is the frequency distribution of 30 scores in a test.

Interval of scores    Frequency
95 – 99               1
90 – 94               3
85 – 89               4
80 – 84               6
75 – 79               5
70 – 74               4
65 – 69               2
60 – 64               3
55 – 59               1
50 – 54               1

a. Find the mean.
b. Find the median.
c. Find the mode.

Compare your answers with those found in the Answer Key on page 41. Did you get all the answers right? If yes, congratulations! You really understood the concepts discussed in this lesson. If you did not get all the correct answers, just go over the parts you did not understand very well, then answer the exercises again.

Let’s Remember

♦ The mean is the average of a set of scores. It is computed by adding all the scores in your data set and dividing the sum by the number of scores.
In formula form,

\[ \bar{x} = \frac{\Sigma x}{N} \]

where
x̄ = the mean
Σ = “the sum of”
N = number of scores

♦ When data is presented in a frequency distribution, we compute for the mean by using the following formula:

\[ \bar{x} = \frac{\Sigma f x_c}{N} \]

where
x̄ = the mean
Σ = “the sum of”
xc = midpoint of an interval
f = frequency of cases in an interval
N = number of scores

♦ To compute for the mean of a frequency distribution, we have the following steps:

STEP 1  Determine the number of scores N.
STEP 2  Determine the midpoint xc for each class interval.
STEP 3  Determine the fxc for each class interval.
STEP 4  Determine the sum Σfxc.
STEP 5  Substitute the values in the formula.

♦ The median is the score or value that divides a given data set or distribution into two equal halves, wherein 50% of the scores are above it and 50% are below it.

♦ The median is often denoted as “Md”. There are two different cases considered in finding the median of a given set of data.

CASE 1  The number of scores N is odd. In the array of scores, the median is the middle score.
CASE 2  The number of scores N is even. In the array of scores, the median is the average of the two middle scores.

♦ When data is presented in a frequency distribution, we compute for the median by using the following step-by-step procedure:

STEP 1  Determine the number of cases N.
STEP 2  Solve for N/2, or half the number of cases in the distribution.
STEP 3  Count up the number of cases until the interval containing the N/2 case is reached.
STEP 4  Determine how many cases were needed out of all the cases in the interval to reach N/2. Divide this by the number of cases in the interval.
STEP 5  Multiply this by the size of the interval.
STEP 6  Add this to the lower limit of the interval containing the median.

♦ The mode is the score or category that occurs with the highest frequency. By the word “category”, we simply mean a brand, a name or a group of scores.
This means that among all the categories in your data, this score or category is the one that occurred the most number of times.

♦ The modal category refers to the most frequently occurring category in nominal data, while the mode refers to the most frequently occurring score in numerical (cardinal) data.

♦ In the case of data presented in a frequency distribution table, the mode is the midpoint of the class interval with the highest frequency of occurrence.

LESSON 2
Measures of Dispersion

How do we know how close together or far apart the scores in a given set of data are? In Statistics, this is measured by values called measures of dispersion or measures of variability. These measures tell us how close to each other the scores in the data are. Remember that the smaller the value of these measures, the closer together the scores are.

Measures of dispersion provide information about the variations in a given distribution of data. They reflect the way in which data are distributed or dispersed in either direction from the center of the distribution.

This lesson will introduce you to the three measures of dispersion called the range, mean deviation and standard deviation; how they are computed; and how they are used in everyday life.

After studying this lesson, you should be able to:

♦ describe the differences between the range, mean deviation and standard deviation;
♦ compute for the range, mean deviation and standard deviation of a given set of data; and
♦ use the range, mean deviation and standard deviation to analyze and interpret data to solve problems in daily life.

Let’s Study and Analyze

“Hey, how did you 3 do in the basketball tournament last year?”

Mario: “Well, these are my scores during the 3 games of the basketball tournament: 15 for the first game, 10 for the second game and 15 again for the last game. How about you, Noel? What were your scores?”

Noel: “Oh, I did not get scores as high as you did. I got 9 for the first game, 7 for the second game and 8 for the last game.”
Paul: “You know what? I got 20 for the first game, 21 for the second game and 24 for the last game. I think I got lucky.”

Oscar: “I am not as lucky as you are, but I think I am the most consistent among all of us. I got 7 in all 3 games.”

List down below the scores that each person got for the 3 games of the basketball tournament. Then answer the questions that follow.

Mario    ____  ____  ____
Noel     ____  ____  ____
Paul     ____  ____  ____
Oscar    ____  ____  ____

1. Whose scores are closest to each other? ________
2. Who is the highest-scoring player? ________
3. Who is the most consistent player? ________

Compare your answers with those found in the Answer Key on page 41.

Let’s Learn

The most basic measure of variability is the range. To compute for the range of a given set of data, we find the highest and lowest scores in the data, and then find the difference between these two values.

\[ R = X_H - X_L \]

where R is the range; XH is the highest score; and XL is the lowest score.

Let us take as an example Mario’s scores in the 3 basketball games. Here are the steps in computing for the range:

STEP 1  Look for the highest score and the lowest score.

In the example above, the highest score is 15 and the lowest score is 10.

STEP 2  Subtract the lowest score from the highest score.

R = 15 − 10 = 5

Therefore, the range of Mario’s scores is 5.

Let’s Try This

Compute for the range of Noel’s, Paul’s and Oscar’s scores.

1. Noel’s scores   ____  ____  ____
   highest score XH = _____
   lowest score XL = _____
   range R = _____

2. Paul’s scores   ____  ____  ____
   highest score XH = _____
   lowest score XL = _____
   range R = _____

3. Oscar’s scores  ____  ____  ____
   highest score XH = _____
   lowest score XL = _____
   range R = _____

Compare your answers with those found in the Answer Key on page 41.

Let’s Learn

The range is used to measure the variability of a set of data.
However, if the data has an outlier, that is, if most of the scores are located close to the mean but there is one extremely low or high score, the range will show a much larger spread of scores than what really exists. In this case, we would have to use other measures of variability.

The variance is another measure of variability. It may also be considered as a mean: it is the average of the squared differences from the mean of the data. It is used for data in the interval or ratio scales.

In computing for the variance, the distance of each score from the mean will be used. We can represent this deviation from the mean as x. It is obtained by subtracting the mean from the raw score,

\[ x = X - \bar{x} \]

where
x = deviation from the mean
X = raw score
x̄ = the mean

We can now use the deviation from the mean in the formula for variance. Representing the variance as S²,

\[ S^2 = \frac{\Sigma x^2}{N} \]

where
S² = the variance
Σx² = sum of the squared deviations from the mean
N = number of scores in the data

Recall that the Σ symbol tells you to add the values indicated beside this symbol.

So what are the steps again in getting the variance? Let’s take Mario’s scores in the basketball games as an example. Here are the steps in finding the variance.

STEP 1  Compute for the mean.

\[ \bar{x} = \frac{15 + 10 + 15}{3} = \frac{40}{3} = 13.33 \]

The mean of the scores of Mario is 13.33 points. His average score for the 3 games is 13.33 points.

STEP 2  Subtract the mean from each of the raw scores.

From the formula for computing the deviation from the mean,

\[ x = X - \bar{x} \]

we have,

(15 – 13.33) = 1.67
(10 – 13.33) = –3.33
(15 – 13.33) = 1.67

Did you get the same answers? You might wonder why you got a negative value. You should not worry about this. In the next step, you will square these values, so the results will all be positive.

STEP 3  Square each value.

(1.67)² = 2.79
(–3.33)² = 11.09
(1.67)² = 2.79

STEP 4  Add all the squared values.
2.79 + 11.09 + 2.79 = 16.67

STEP 5 Divide the sum of the squared values by N or the total number of scores.

16.67 ÷ 3 = 5.56

Therefore, the variance of Mario’s scores is 5.56.

Let’s Try This

Compute for the variance of Noel and Paul’s scores.

1. Noel’s scores 9 7 8

STEP 1 Compute for the mean.
The mean of Noel’s scores is _____ points.

STEP 2 Subtract the mean from each of the raw scores. From the formula of computing for the deviation from the mean, $x = X - \bar{x}$, we have,
______ ______ ______

STEP 3 Square each value.
_______ _______ _______

STEP 4 Add all the squared values.
______ + ______ + ______ = ______

STEP 5 Divide the sum of the squared values by N or the total number of scores.
______ ÷ ______ = ______

Therefore, the variance of Noel’s scores is _______.

2. Paul’s scores 20 21 24

Try to solve this one without writing down the step-by-step procedure.

Compare your answers with those found in the Answer Key on page 42.

Let’s Study and Analyze

The standard deviation is the most statistically useful measure of variability. The mean and the standard deviation are related. While the mean is the average of the scores in a set, the standard deviation is a measure of how far, on average, the scores are from the mean.

Recall the formula for the variance. You can see that it is a squared value. But the original scores in your data set are not squared values. To convert this squared value into the unit of the original scores, we get the square root of the variance. This is what the standard deviation is: the square root of the variance.

Look at the formula for standard deviation below. It has the same formula as the variance, except for the square root sign.

$$S = \sqrt{\frac{\sum x^2}{N}}$$

where
$S$ = the standard deviation
$\sum x^2$ = sum of the squared deviations from the mean
$N$ = number of scores in the data

Simply, we can say that the standard deviation is the square root of the variance.
$$S = \sqrt{S^2}$$

Of course, if you have already computed for the variance of a set of scores, you do not have to compute the standard deviation using the long process. You simply find the square root of the variance.

Let’s go back to Mario’s scores in the basketball games. You have already computed for the variance of these scores, right? Then it will be very easy to find the standard deviation.

Here are the steps in finding the standard deviation:

STEP 1 Compute for the variance. We have already done that for Mario’s scores.

$$S^2 = 5.56$$

STEP 2 Get the square root of the variance.

$$S = \sqrt{5.56} = 2.36$$

Therefore, the standard deviation of Mario’s scores is 2.36.

Let’s Try This

Compute for the standard deviation of Noel and Paul’s scores.

1. Variance of Noel’s scores = ________

STEP 1 Compute for the variance.
$S^2$ = ______

STEP 2 Get the square root of the variance.
$\sqrt{\_\_\_}$ = _____

Therefore, the standard deviation of Noel’s scores is _____.

2. Variance of Paul’s scores = ______
   Standard deviation of Paul’s scores = _______

Compare your answers with those found in the Answer Key on page 43.

Let’s See What You Have Learned

1. The following are the test scores of Maria, Rachel and Mike for four of their subjects.

            Math    Science    Literature    History
   Maria    82      85         90            91
   Rachel   80      95         88            87
   Mike     96      91         80            81

Compute for the range, variance and standard deviation. Complete the table below.

            Range    Variance    Standard Deviation
   Maria    ____     ____        ____
   Rachel   ____     ____        ____
   Mike     ____     ____        ____

Compare your answers with those found in the Answer Key on page 43. If you got all 9 answers correct, you’re doing great! However, if you got a score of less than 5, you need to read the lesson again, then answer some more exercises, for you to fully grasp the concepts discussed.

Let’s Remember

♦ The range is the difference between the highest and the lowest scores in a given set of data. The formula for computing the range is

$$R = X_H - X_L$$

where $R$ is the range; $X_H$ is the highest score; and $X_L$ is the lowest score.
♦ In computing for the variance and standard deviation, we look for the deviation from the mean by using the following formula,

$$x = X - \bar{x}$$

where
$x$ = deviation from the mean
$X$ = raw score
$\bar{x}$ = the mean

♦ The variance is the average of the squared differences from the mean of the data. The formula is

$$S^2 = \frac{\sum x^2}{N}$$

where
$S^2$ = the variance
$\sum x^2$ = sum of the squared deviations from the mean
$N$ = number of scores in the data

♦ The standard deviation is a measure of how far, on average, the scores are from the mean. It is the square root of the variance. The formula is

$$S = \sqrt{\frac{\sum x^2}{N}}$$

where
$S$ = the standard deviation
$\sum x^2$ = sum of the squared deviations from the mean
$N$ = number of scores in the data

What Have You Learned?

A. The following is the frequency distribution of 40 scores in a Math test.

   Interval of scores    Frequency
   95 – 99               2
   90 – 94               4
   85 – 89               5
   80 – 84               7
   75 – 79               6
   70 – 74               5
   65 – 69               3
   60 – 64               4
   55 – 59               2
   50 – 54               2

   1. Find the mean.
   2. Find the median.
   3. Find the mode.

B. The following are the basketball scores of Jack and Kevin for four games in a basketball tournament.

            Game 1    Game 2    Game 3    Game 4
   Jack     23        32        26        19
   Kevin    18        30        25        27

   Complete the table below.

            Range    Variance    Standard Deviation
   Jack     (1)      (2)         (3)
   Kevin    (4)      (5)         (6)

Compare your answers with those found in the Answer Key on page 43. If you got a score of:

8–9 Very good! You learned a lot from this module. You are now ready to move on to the next module.
6–7 Very satisfactory. Just review the items that you missed.
4–5 Satisfactory. Review the parts of the module you did not understand very well.
1–3 You should study the whole module again.

Let’s Sum Up

♦ The mean is the average of a set of scores. It is computed by adding all the scores in your data set and dividing the sum by the number of scores.
In formula form,

$$\bar{x} = \frac{\sum x}{N}$$

where
$\bar{x}$ = the mean
$\sum$ = “the sum of”
$N$ = number of scores

♦ When data is presented in a frequency distribution, we compute for the mean by using the following formula:

$$\bar{x} = \frac{\sum f x_c}{N}$$

where
$\bar{x}$ = the mean
$\sum$ = “the sum of”
$x_c$ = midpoint of an interval
$f$ = frequency of cases in an interval
$N$ = number of scores

♦ To compute for the mean of a frequency distribution, we have the following steps:

STEP 1 Determine the number of scores $N$.
STEP 2 Determine the midpoint $x_c$ for each class interval.
STEP 3 Determine the $f x_c$ for each class interval.
STEP 4 Determine the sum $\sum f x_c$.
STEP 5 Substitute the values in the formula.

♦ The median is the score or value that divides a given data set or a distribution into two equal halves, wherein 50% of the scores are above it while 50% are below it.

♦ The median is oftentimes denoted as “Md”. There are two different cases considered in finding the median of a given set of data.

CASE 1 The number of scores N is odd. In the array of scores, the median is the middle score.
CASE 2 The number of scores N is even. In the array of scores, the median is the average of the two middle scores.

♦ When data is presented in a frequency distribution, we compute for the median by using the following step-by-step procedure:

STEP 1 Determine the number of cases N.
STEP 2 Solve for N/2 or half the number of cases in the distribution.
STEP 3 Count up the number of cases until the interval containing the N/2 case is reached.
STEP 4 Determine how many cases were needed out of all the cases in the interval to reach N/2. Divide this by the number of cases in the interval.
STEP 5 Multiply this by the size of the interval.
STEP 6 Add this to the lower limit of the interval containing the median.

♦ The mode is the score or category that occurs with the highest frequency. By the word “category”, we simply mean a brand, a name or a group of scores.
This means that among all the categories in your data, this score or category is the one that occurs the greatest number of times.

♦ The modal category of the data refers to the most frequently occurring cardinal data while mode refers to the most frequently occurring nominal data.

♦ In the case of data presented in a frequency distribution table, the mode is the midpoint of the class interval with the highest frequency of occurrence.

♦ The range is the difference between the highest and the lowest scores in a given set of data. The formula for computing the range is

$$R = X_H - X_L$$

where $R$ is the range; $X_H$ is the highest score; and $X_L$ is the lowest score.

♦ In computing for the variance and standard deviation, we look for the deviation from the mean by using the following formula,

$$x = X - \bar{x}$$

where
$x$ = deviation from the mean
$X$ = raw score
$\bar{x}$ = the mean

♦ The variance is the average of the squared differences from the mean of the data. The formula is

$$S^2 = \frac{\sum x^2}{N}$$

where
$S^2$ = the variance
$\sum x^2$ = sum of the squared deviations from the mean
$N$ = number of scores in the data

♦ The standard deviation is a measure of how far, on average, the scores are from the mean. It is the square root of the variance. The formula is

$$S = \sqrt{\frac{\sum x^2}{N}}$$

where
$S$ = the standard deviation
$\sum x^2$ = sum of the squared deviations from the mean
$N$ = number of scores in the data

Answer Key

A. Let’s See What You Already Know (page 2)

A. Mean = 76
   Median = 77.36 ≈ 77
   Mode = 82

B.          Range    Variance
   Jeremy   28       105.50
   Theresa  26       87.75

B. Lesson 1

Let’s Try This (pages 5–6)

EXAMPLE 2

$$\frac{54{,}000}{20} = 2{,}700$$

Let’s Try This (pages 8–11)

EXAMPLE 3

STEP 2 Determine the midpoint $x_c$ for each class interval.

   Interval of ages    Midpoint $x_c$    Frequency $f$
   45 – 49             47                2
   40 – 44             42                5
   35 – 39             37                4
   30 – 34             32                3
   25 – 29             27                12
   20 – 24             22                4
                                         N = 30

STEP 3 Determine the $f x_c$ for each class interval.
   Interval of ages    Midpoint $x_c$    Frequency $f$    $f x_c$
   45 – 49             47                2                94
   40 – 44             42                5                210
   35 – 39             37                4                148
   30 – 34             32                3                96
   25 – 29             27                12               324
   20 – 24             22                4                88
                                         N = 30

EXAMPLE 1

STEP 1 Determine the number of scores N.
N = 40

STEP 2 Determine the midpoint $x_c$ for each class interval.

   Interval of ages    Midpoint $x_c$    Frequency $f$
   95 – 99             97                1
   90 – 94             92                4
   85 – 89             87                5
   80 – 84             82                7
   75 – 79             77                6
   70 – 74             72                5
   65 – 69             67                3
   60 – 64             62                5
   55 – 59             57                2
   50 – 54             52                2
                                         N = 40

STEP 3 Determine the $f x_c$ for each class interval.

   Interval of ages    Midpoint $x_c$    Frequency $f$    $f x_c$
   95 – 99             97                1                97
   90 – 94             92                4                368
   85 – 89             87                5                435
   80 – 84             82                7                574
   75 – 79             77                6                462
   70 – 74             72                5                360
   65 – 69             67                3                201
   60 – 64             62                5                310
   55 – 59             57                2                114
   50 – 54             52                2                104
                                         N = 40

STEP 4 Determine the sum $\sum f x_c$.

97 + 368 + 435 + 574 + 462 + 360 + 201 + 310 + 114 + 104 = 3025

STEP 5 Substitute the values in the formula.

$$\bar{x} = \frac{\sum f x_c}{N} = \frac{3025}{40} = 75.625$$

Therefore, the mean is 76.

Let’s Try This (pages 13–15)

EXAMPLE 2
Employee 1’s salary: P 3,300
Case 2
Md = (P 3,300 + P 3,100) ÷ 2 = P 3,200
Therefore, the median is P 3,200.

EXAMPLE 3
Md = (P 2,422 + P 2,354) ÷ 2 = P 2,388
Therefore, the median is P 2,388.

Let’s Try This (pages 17–18)

EXAMPLE 4

STEP 1 Determine the number of cases N.
N = 40

STEP 2 Solve for N/2 or half the number of cases in the distribution.
N/2 = 40/2 = 20

STEP 3 Count up the number of cases until the interval containing the N/2 case is reached.
From the bottom, we count up until we reach the N/2 case, which is 20. So,
1 + 3 + 5 + 4 + 8 = 21
This is 1 case more than 20 and falls in the interval 220 – 224.

STEP 4 Determine how many cases were needed out of all the cases in the interval to reach N/2. Divide this by the number of cases in the interval.
Out of the 8 cases in the interval 220 – 224, only 7 are used to reach 20. Therefore, only 7/8 of the interval was needed.

STEP 5 Multiply this by the size of the interval.
The size of the class interval 220 – 224 is 5.

STEP 6 Add this to the lower limit of the interval containing the median.
The lower limit to be used = 219.5.

If we do Steps 4 to 6 simultaneously, we will have:

7/8 × 5 + 219.5 = 223.875

Therefore, we say that 224 is the median.

EXAMPLE 5
Md = 9/12 × 3 + 17.5 = 19.75 ≈ 20

Let’s Try This (page 19)

EXAMPLE 1 Heather Potter
EXAMPLE 2 32

Let’s See What You Have Learned (pages 20–21)

A. 1. 1,975
   2. 9
   3. Cartoon Channel

B. 1. 77
   2. 78.5
   3. 82

C. Lesson 2

Let’s Study and Analyze (pages 24–25)

Mario 15 10 15
Noel 9 7 8
Paul 20 21 24
Oscar 7 7 7

1. Mario
2. Paul
3. Oscar

Let’s Try This (page 26)

1. Noel’s scores 9 7 8
   highest score $X_H$ = 9
   lowest score $X_L$ = 7
   range $R$ = 2

2. Paul’s scores 20 21 24
   highest score $X_H$ = 24
   lowest score $X_L$ = 20
   range $R$ = 4

3. Oscar’s scores 7 7 7
   highest score $X_H$ = 7
   lowest score $X_L$ = 7
   range $R$ = 0

Let’s Try This (pages 28–29)

1. Noel’s scores 9 7 8

STEP 1 Compute for the mean.
The mean of Noel’s scores is 8.

STEP 2 Subtract the mean from each of the raw scores. From the formula of computing for the deviation from the mean, $x = X - \bar{x}$, we have,
9 – 8 = 1
8 – 8 = 0
7 – 8 = –1

STEP 3 Square each value.
$1^2 = 1$
$0^2 = 0$
$(-1)^2 = 1$

STEP 4 Add all the squared values.
1 + 0 + 1 = 2

STEP 5 Divide the sum of the squared values by N or the total number of scores.
2 ÷ 3 = 0.67

Therefore, the variance of Noel’s scores is 0.67.

2. $$S^2 = \frac{\sum x^2}{N} = \frac{8.67}{3} = 2.89$$

Let’s Try This (page 31)

1. Variance of Noel’s scores = 0.67

STEP 1 Compute for the variance.
$S^2$ = 0.67

STEP 2 Get the square root of the variance.
$\sqrt{0.67} = 0.82$

Therefore, the standard deviation of Noel’s scores is 0.82.

2. Variance of Paul’s scores = 2.89
   Standard deviation of Paul’s scores = 1.70

Let’s See What You Have Learned (page 31)

1.          Range    Variance
   Maria    9        13.50
   Rachel   15       28.25
   Mike     16       45.50

D. What Have You Learned? (page 33)

A. Mean = 76.5
   Median = 77.83 ≈ 78
   Mode = 82

B.          Range    Variance
   Jack     13       22.50
   Kevin    12       19.50

References

Mathematics III. SEDP Series. Quezon City: IMC. 1991.
Ho, Ju Se T., et al. 21st Century Mathematics (Third Year). Quezon City: Phoenix Publishing. 1996.
Lacuesta, Debbie P. Basic Statistical Concepts. Quezon City: SEAMEO INNOTECH. 1998.
Downie, N.M. and Heath, R.W. Basic Statistical Methods. 4th ed. 1974.
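
Supplementary sketch for Lesson 2

The step-by-step procedures for the range, variance and standard deviation in Lesson 2 can also be checked with a short program. The following is only a minimal Python sketch and is not part of the original module. The function names (value_range, variance, standard_deviation) and the players dictionary are chosen here for illustration. It divides by N, as the module's formula does, and it works with the unrounded mean, so intermediate values can differ slightly from the hand computation even though the rounded results agree.

```python
from math import sqrt


def value_range(scores):
    """Range R = X_H - X_L: highest score minus lowest score."""
    return max(scores) - min(scores)


def variance(scores):
    """Variance S^2 = (sum of squared deviations from the mean) / N."""
    n = len(scores)
    mean = sum(scores) / n                      # STEP 1: compute the mean
    deviations = [x - mean for x in scores]     # STEP 2: subtract the mean from each score
    squared = [d ** 2 for d in deviations]      # STEP 3: square each deviation
    total = sum(squared)                        # STEP 4: add the squared values
    return total / n                            # STEP 5: divide by N


def standard_deviation(scores):
    """Standard deviation S = square root of the variance."""
    return sqrt(variance(scores))


# Scores of the four players in the 3 basketball games (from the Answer Key).
players = {
    "Mario": [15, 10, 15],
    "Noel": [9, 7, 8],
    "Paul": [20, 21, 24],
    "Oscar": [7, 7, 7],
}

for name, scores in players.items():
    print(
        name,
        "range =", value_range(scores),
        "variance =", round(variance(scores), 2),
        "standard deviation =", round(standard_deviation(scores), 2),
    )
```

Running the sketch prints one line per player, which can be compared with the values worked out by hand in the lesson and in the Answer Key.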