MEASURES OF VARIABILITY OF GROUPED DATA & CHOICE OF A MEASURE OF CENTRAL TENDENCY

Members: Agravante, Joshua Ortuoste, Joshua Belongan, Hadjeoria Said, Hanna De Castro, Erwin Quinne, Lalim

LEARNING OUTCOMES:
• Identify Appropriate Measures
• Understand Measures of Central Tendency
• Calculate the Range
• Interpret Variance and Standard Deviation
• Understand Grouped Data

MEASURES OF VARIABILITY OF GROUPED DATA

Understanding the variability within a dataset is fundamental to gaining meaningful insights. Variability measures describe the spread, dispersion, and distribution of data points. For large datasets, especially those grouped into classes or intervals, variability is measured with the range, the variance, and the standard deviation.

Range

The range shows how spread out the data are: it is the difference between the highest value (Upper Class Boundary of the Top Interval) and the lowest value (Lower Class Boundary of the Bottom Interval).

Use this formula to find the range of grouped data:

R = Upper Class Boundary of the Top Interval - Lower Class Boundary of the Bottom Interval

Scores   Frequency
46-50    1
41-45    10
36-40    10
31-35    16
26-30    9
21-25    4

• First, determine the highest interval and the lowest interval.
  Highest interval is 46-50
  Lowest interval is 21-25
• Second, get the Upper Limit of the highest interval and the Lower Limit of the lowest interval.
  ✓ Note that every interval has an Upper Limit and a Lower Limit.
  Upper Limit of the Highest Interval = 50
  Lower Limit of the Bottom Interval = 21
• Third, get the Upper Class Boundary by adding 0.5 to the Upper Limit.
  50 + 0.5 = 50.5 is the Upper Class Boundary
• Fourth, get the Lower Class Boundary by subtracting 0.5 from the Lower Limit.
  21 - 0.5 = 20.5 is the Lower Class Boundary

Now apply the formula:

R = Upper Class Boundary of the Top Interval - Lower Class Boundary of the Bottom Interval
R = 50.5 - 20.5
R = 30

VARIANCE

Variance is the mean of the squares
of the deviations from the mean of a frequency distribution.

Use this formula to find the variance of grouped data:

\[ \sigma^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} \]

Steps in Calculating the Variance:
1. Make a table for x, fx, (x - x̄), (x - x̄)², and f(x - x̄)², and use it to find the mean.
2. Get the class marks (x).
3. Find the product (fx) of the frequency (f) and the midpoint (x).
4. Get the difference between each class mark and the mean (x - x̄).
5. Get the square of each difference, (x - x̄)².
6. Find the product f(x - x̄)² for each class.
7. Solve for the variance.

Worked example:

1. Start by finding the value of n by adding all the frequencies (f): n = 1 + 10 + 10 + 16 + 9 + 4 = 50.
2. Find the class mark (midpoint), x, of each interval: the middle number between the lower and upper limits of the interval. For 46-50, x = (46 + 50) / 2 = 48.
3. Find fx by multiplying each frequency by its midpoint x. When you are done, total the values, then divide the total of fx by the total frequency n:

\[ \bar{x} = \frac{\sum fx}{n} = \frac{1{,}730}{50} = 34.6 \]

   Therefore, the value of the mean (x̄) is x̄ = 34.6.
4. For the next column, subtract the mean x̄ = 34.6 from each midpoint x.
5. Next, multiply each (x - x̄) by itself to get (x - x̄)².
6. For the last column, multiply each (x - x̄)² by its frequency f. When you are done, total the values.

Scores   f    x    fx      (x - x̄)   (x - x̄)²   f(x - x̄)²
46-50    1    48   48      13.4      179.56     179.56
41-45    10   43   430     8.4       70.56      705.60
36-40    10   38   380     3.4       11.56      115.60
31-35    16   33   528     -1.6      2.56       40.96
26-30    9    28   252     -6.6      43.56      392.04
21-25    4    23   92      -11.6     134.56     538.24
Total    50        1,730                        1,972

Since the table is now complete, find the variance using the formula. Note that the formula divides by n - 1, not n:

\[ \sigma^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{1{,}972}{50 - 1} = \frac{1{,}972}{49} \approx 40.24 \]

STANDARD DEVIATION

Standard deviation is the square root of the variance.
It measures the typical deviation, or dispersion, of data points from the mean.

\[ \sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{\frac{1{,}972}{50 - 1}} = \sqrt{\frac{1{,}972}{49}} \approx \sqrt{40.24} \approx 6.34 \]

Choice of a Measure of Central Tendency

Central tendency is a statistical measure that represents a central or typical value in a data set.

Why is choosing the right measure crucial?
A measure of central tendency gives a quick idea of what most values in the group are like. A measure that suits the data summarizes them faithfully; one that does not, such as a mean pulled by extreme values, can mislead.

What are outliers?
Outliers, or extreme values, are data points that differ significantly from the majority of the other data points in a dataset. They are unusually high or low values that do not follow the same pattern as the rest of the data.

• We have 3 measures we can use:
1. Mean
2. Median
3. Mode

Mean - The mean is calculated by adding up all the values in a dataset and dividing the sum by the total number of values. It represents the arithmetic average of the data. The mean is sensitive to extreme values, or outliers, and can be heavily influenced by them.

Median - The median is the middle value of a dataset when it is arranged in numerical order.

Mode - The mode is the value that appears most frequently in a dataset.

Choosing the appropriate measure of central tendency depends on the nature of the data and the presence of outliers. Each measure has its strengths and is useful in different situations.

Nature/Types of data:
1. Interval or ratio data
2. Nominal data
3. Ordinal data

Example 1: Interval Data
Data: Temperatures in Celsius recorded every hour throughout a day: 18, 20, 21, 19, 22, 23, 20, 25, 24, 20, 19
Appropriate Measure: Mean
Explanation: Interval data are quantitative and have meaningful intervals.
In this case, you can calculate the mean temperature to find the average value for the day.
Mean Calculation:

\[ \text{Mean} = \frac{18+20+21+19+22+23+20+25+24+20+19}{11} = \frac{231}{11} = 21 \text{ degrees Celsius} \]

Example 3: Ordinal Data
Data: Survey responses indicating customer satisfaction levels: Poor, Fair, Good, Excellent, Good, Fair, Excellent, Good, Good, Poor
Appropriate Measure: Median
Explanation: Ordinal data can be ranked, so arranging the responses in order and finding the middle value (median) gives a measure of central tendency.
Arranged Data: Poor, Poor, Fair, Fair, Good, Good, Good, Good, Excellent, Excellent
Median Calculation: The median is Good. With 10 responses, the two middle values (the 5th and 6th) are both Good.
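The calculations in this report can be checked with a short script. The following is a minimal Python sketch, not part of the original slides. The names `classes`, `grouped_stats`, and `half_unit` are hypothetical, chosen only for illustration. It assumes integer class limits (so class boundaries lie 0.5 beyond the limits) and divides by n - 1, as the variance formula in this report does. For the ordinal example it uses `statistics.median_low`, which is one reasonable choice when the number of responses is even.

```python
import math
import statistics

# Frequency table from the worked example: (lower limit, upper limit, frequency).
classes = [
    (46, 50, 1),
    (41, 45, 10),
    (36, 40, 10),
    (31, 35, 16),
    (26, 30, 9),
    (21, 25, 4),
]


def grouped_stats(table, half_unit=0.5):
    """Return range, n, mean, variance, and standard deviation of grouped data.

    Assumption: class limits are integers, so each class boundary lies
    half_unit beyond its limit. The variance divides by n - 1.
    """
    # Range = upper class boundary of top interval - lower class boundary of bottom interval.
    upper_boundary = max(upper for _, upper, _ in table) + half_unit
    lower_boundary = min(lower for lower, _, _ in table) - half_unit

    freqs = [f for _, _, f in table]
    midpoints = [(lower + upper) / 2 for lower, upper, _ in table]  # class marks x

    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n  # sum(fx) / n

    # Sum of f * (x - mean)^2, then divide by n - 1.
    sum_f_dev_sq = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))
    variance = sum_f_dev_sq / (n - 1)

    return {
        "range": upper_boundary - lower_boundary,
        "n": n,
        "mean": mean,
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }


stats = grouped_stats(classes)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")

# Example 1 (interval data): the mean of the hourly temperatures.
temps = [18, 20, 21, 19, 22, 23, 20, 25, 24, 20, 19]
mean_temp = statistics.mean(temps)
print("mean temperature:", mean_temp)

# Example 3 (ordinal data): rank the levels, then take the middle response.
levels = ["Poor", "Fair", "Good", "Excellent"]
responses = ["Poor", "Fair", "Good", "Excellent", "Good",
             "Fair", "Excellent", "Good", "Good", "Poor"]
ranks = [levels.index(r) for r in responses]
median_label = levels[statistics.median_low(ranks)]
print("median satisfaction:", median_label)
```

Run as written, the script reports a range of 30.00, a mean of 34.60, a variance of 40.24, and a standard deviation of 6.34 for the grouped scores, a mean temperature of 21, and a median satisfaction level of Good, matching the hand calculations.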