Uploaded by Joshua Ortuoste

Group-3-StatsBio

MEASURES OF VARIABILITY OF
GROUPED DATA
&
CHOICE OF A MEASURE OF
CENTRAL TENDENCY
Members:
Agravante, Joshua
Ortuoste, Joshua
Belongan, Hadjeoria
Said, Hanna
De Castro, Erwin
Quinne, Lalim
LEARNING OUTCOMES:
• Identify Appropriate Measures
• Understand Measures of Central Tendency
• Calculate the Range
• Interpret Variance and Standard Deviation
• Understand Grouped Data
MEASURES OF VARIABILITY OF
GROUPED DATA
Understanding the variability within a dataset is fundamental to
gaining meaningful insights.
Variability measures provide essential information about the
spread, dispersion, and distribution of data points.
Measuring the variability of large datasets, especially those
grouped into classes or intervals, relies on the range,
variance, and standard deviation.
Range
The range gives you a sense of how spread out the data is by
showing the difference between the highest value (the Upper Class
Boundary of the Top Interval) and the lowest value (the Lower Class
Boundary of the Bottom Interval).
Use this formula to find the range of grouped data:
R = Upper Class Boundary of the Top Interval - Lower Class
Boundary of the Bottom Interval

Scores   Frequency
46-50    1
41-45    10
36-40    10
31-35    16
26-30    9
21-25    4

• First, determine the highest interval and the lowest interval.
  Highest interval is 46-50
  Lowest interval is 21-25
• Second, get the Upper Limit and the Lower Limit.
  ✓ Note that each interval has an Upper Limit and a Lower Limit.
  Upper limit of Highest Interval = 50
  Lower limit of Bottom Interval = 21
• Third, get the Upper Class Boundary by adding 0.5 to the Upper Limit.
  50 + 0.5 = 50.5 is the Upper Class Boundary
• Fourth, get the Lower Class Boundary by subtracting 0.5 from the Lower Limit.
  21 - 0.5 = 20.5 is the Lower Class Boundary
Now apply the formula:
R = Upper Class Boundary of the Top Interval - Lower Class
Boundary of the Bottom Interval
R = 50.5 - 20.5
R = 30
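The range steps above can be sketched in Python. This is only an illustration under stated assumptions: the names `intervals` and `grouped_range` are hypothetical, and the 0.5 adjustment assumes whole-number class limits, as in the scores table.

```python
# Class intervals as (lower limit, upper limit), taken from the scores table.
intervals = [(46, 50), (41, 45), (36, 40), (31, 35), (26, 30), (21, 25)]

def grouped_range(intervals, half_unit=0.5):
    """Upper class boundary of the top interval minus
    lower class boundary of the bottom interval."""
    # Upper class boundary: highest upper limit plus 0.5
    upper_boundary = max(upper for _, upper in intervals) + half_unit
    # Lower class boundary: lowest lower limit minus 0.5
    lower_boundary = min(lower for lower, _ in intervals) - half_unit
    return upper_boundary - lower_boundary

print(grouped_range(intervals))  # 30.0
```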
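VARIANCE
Variance is the mean of the squared deviations from the mean of
a frequency distribution.
Use this formula to find the variance of grouped data:
\[ \sigma^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} \]
Here f is the frequency of a class, x is its class mark (midpoint), x̄ is the mean, and n is the total frequency.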
Steps in Calculating the Variance:
1. Make a table for x, fx, (x - x̄), (x - x̄)², and f(x - x̄)²
2. Get the class marks (x)
3. Find the product (fx) of the frequency (f) and the midpoint (x) to find the mean
4. Get the difference between each score and the mean (x - x̄)
5. Get the squares of each difference (x - x̄)²
6. Find the product of each f(x - x̄)²
7. Solve for the variance.

Scores   f    x    fx   (x - x̄)   (x - x̄)²   f(x - x̄)²
46-50    1
41-45    10
36-40    10
31-35    16
26-30    9
21-25    4
1. Start by finding the value of n by adding all the frequencies (f).
2. Next, find the class mark/midpoint (x) of each interval by determining the middle number
between the lower and upper limits of the interval.
3. Next, find fx by multiplying the frequency by the midpoint (x).
When you are done, total the values.

Scores   f    x    fx
46-50    1    48   48
41-45    10   43   430
36-40    10   38   380
31-35    16   33   528
26-30    9    28   252
21-25    4    23   92
Total    50        1,730

Then divide the total of fx by the total frequency:
\[ \bar{x} = \frac{\sum fx}{n} = \frac{1{,}730}{50} \]
Therefore, the value of the mean (x̄) is:
x̄ = 34.6
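As a rough sketch of steps 1 to 3, the grouped mean can be computed in Python. The helper names `class_marks` and `grouped_mean` are hypothetical; the data are the intervals and frequencies from the table.

```python
# Intervals (lower limit, upper limit) and frequencies from the table.
intervals = [(46, 50), (41, 45), (36, 40), (31, 35), (26, 30), (21, 25)]
frequencies = [1, 10, 10, 16, 9, 4]

def class_marks(intervals):
    """Midpoint of each interval: halfway between its lower and upper limits."""
    return [(lower + upper) / 2 for lower, upper in intervals]

def grouped_mean(intervals, frequencies):
    """Mean of grouped data: total of f*x divided by total frequency n."""
    n = sum(frequencies)
    total_fx = sum(f * x for f, x in zip(frequencies, class_marks(intervals)))
    return total_fx / n

print(class_marks(intervals))                          # [48.0, 43.0, 38.0, 33.0, 28.0, 23.0]
print(round(grouped_mean(intervals, frequencies), 2))  # 34.6
```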
4. Proceed to the next column: subtract the mean x̄ = 34.6 from each
midpoint x to get (x - x̄).
5. Next, square each difference by multiplying (x - x̄) by itself.
6. For the last column, multiply each (x - x̄)² by its frequency.
When you are done, total the values.

Scores   f    x    fx      (x - x̄)   (x - x̄)²   f(x - x̄)²
46-50    1    48   48      13.4      179.56     179.56
41-45    10   43   430     8.4       70.56      705.6
36-40    10   38   380     3.4       11.56      115.6
31-35    16   33   528     -1.6      2.56       40.96
26-30    9    28   252     -6.6      43.56      392.04
21-25    4    23   92      -11.6     134.56     538.24
Total    50        1,730                        1,972

Since we have already completed the table, let us proceed in
finding the variance using the formula:
\[ \sigma^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} = \frac{1{,}972}{50 - 1} = \frac{1{,}972}{49} \]
σ² = 40.24
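The variance calculation can be checked with a short Python sketch. The function name `grouped_variance` is hypothetical, and the n - 1 divisor follows the formula used in this presentation.

```python
# Intervals (lower limit, upper limit) and frequencies from the table.
intervals = [(46, 50), (41, 45), (36, 40), (31, 35), (26, 30), (21, 25)]
frequencies = [1, 10, 10, 16, 9, 4]

def grouped_variance(intervals, frequencies):
    """Variance of grouped data: sum of f*(x - mean)^2 divided by n - 1."""
    marks = [(lower + upper) / 2 for lower, upper in intervals]  # class marks x
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, marks)) / n
    # Total of the last table column, f(x - mean)^2
    sum_sq = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, marks))
    return sum_sq / (n - 1)

print(round(grouped_variance(intervals, frequencies), 2))  # 40.24
```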
Next, find the STANDARD DEVIATION.
Standard deviation is the square root of
the variance. It provides a measure of the
average deviation or dispersion of data
points from the mean.
Use this formula:
\[ \sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{\frac{1{,}972}{50 - 1}} = \sqrt{\frac{1{,}972}{49}} = \sqrt{40.24} \]
σ = 6.34
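A minimal Python sketch of the standard deviation step, assuming the same intervals and frequencies as the table. The name `grouped_std` is hypothetical; `math.sqrt` is from the Python standard library.

```python
import math

# Intervals (lower limit, upper limit) and frequencies from the table.
intervals = [(46, 50), (41, 45), (36, 40), (31, 35), (26, 30), (21, 25)]
frequencies = [1, 10, 10, 16, 9, 4]

def grouped_std(intervals, frequencies):
    """Standard deviation of grouped data: square root of the variance (n - 1 divisor)."""
    marks = [(lower + upper) / 2 for lower, upper in intervals]  # class marks x
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, marks)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, marks)) / (n - 1)
    return math.sqrt(variance)

print(round(grouped_std(intervals, frequencies), 2))  # 6.34
```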
Choice of a Measure of Central Tendency
Central tendency is a statistical measure that represents a
central or typical value in a data set.
Why is choosing the right measure crucial?
Picking the right measure of central tendency is important
because it gives us a quick idea of what most numbers in the
group are like.
What are outliers?
Outliers or extreme values are data points that significantly differ
from the majority of the other data points in a dataset. They are
unusually high or low values that don't follow the same pattern as
the rest of the data.
We have 3 measures we can use:
1. Mean
2. Median
3. Mode
Mean - The mean is calculated by adding up all the values in a dataset and dividing the
sum by the total number of values. It represents the arithmetic average of the data. The
mean is sensitive to extreme values, or outliers, and can be heavily influenced by them.
Median - The median is the middle value of a dataset when it is arranged
in numerical order.
Mode - The mode is the value that appears most frequently in a dataset.
Choosing the appropriate measure of central tendency depends on the nature of the data and the
presence of outliers. Each measure has its strengths and is useful in different situations.
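To illustrate how an outlier affects each measure, here is a small Python sketch using the standard-library `statistics` module. The two lists are made-up illustration data, not values from this presentation.

```python
import statistics

# Hypothetical data for illustration only.
values = [20, 21, 22, 22, 23]
with_outlier = values + [95]  # same data plus one extreme value

# The mean shifts noticeably when the outlier is added.
print(statistics.mean(values))                   # 21.6
print(round(statistics.mean(with_outlier), 2))   # 33.83

# The median and mode stay at 22.
print(statistics.median(with_outlier))           # 22.0
print(statistics.mode(with_outlier))             # 22
```

In this example the single extreme value pulls the mean up by more than 12 units, while the median and mode do not move.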
Nature/Types of data;
1. Interval or ratio data
2. Nominal Data
3. Ordinal Data
Example 1: Interval Data
Data: Temperatures in Celsius recorded every hour throughout a day:
18,20,21,19,22,23,20,25,24,20,19
Appropriate Measure: Mean
Explanation: Interval data are quantitative and have meaningful intervals. In this case,
you can calculate the mean temperature to find the average value for the day.
Mean Calculation:
\[ \text{Mean} = \frac{18+20+21+19+22+23+20+25+24+20+19}{11} = \frac{231}{11} = 21 \]
The mean temperature is 21 degrees Celsius.
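The same calculation as a short Python check, using the temperature readings listed in the example. The variable names are only illustrative.

```python
# Temperatures in Celsius from Example 1.
temperatures = [18, 20, 21, 19, 22, 23, 20, 25, 24, 20, 19]

total = sum(temperatures)          # sum of all values
count = len(temperatures)          # number of values
mean_temperature = total / count   # arithmetic mean

print(total, count, mean_temperature)  # 231 11 21.0
```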
Example 3: Ordinal Data
Data: Survey responses indicating customer satisfaction levels: Poor, Fair,
Good, Excellent, Good, Fair, Excellent, Good, Good, Poor
Appropriate Measure: Median
Explanation: In this case, arranging the responses in order and finding the
middle value (median) gives a central tendency measure.
Arranged Data: Poor, Poor, Fair, Fair, Good, Good, Good, Good, Excellent,
Excellent
Median Calculation: With 10 responses, the two middle values (the 5th and 6th)
are both Good, so the median is Good.
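One way to sketch the ordinal median in Python is to sort the responses by their position on the satisfaction scale and read off the middle value(s). The names `scale` and `ordinal_median` are hypothetical, and the sketch assumes the scale order Poor < Fair < Good < Excellent.

```python
# Assumed order of the satisfaction scale, lowest to highest.
scale = ["Poor", "Fair", "Good", "Excellent"]
responses = ["Poor", "Fair", "Good", "Excellent", "Good",
             "Fair", "Excellent", "Good", "Good", "Poor"]

def ordinal_median(responses, scale):
    """Return the two middle responses after sorting by scale position.
    For an odd count both entries are the same single middle value."""
    rank = {label: position for position, label in enumerate(scale)}
    ordered = sorted(responses, key=lambda label: rank[label])
    n = len(ordered)
    return ordered[(n - 1) // 2], ordered[n // 2]

print(ordinal_median(responses, scale))  # ('Good', 'Good')
```

Ordinal labels cannot be averaged, so when the two middle values differ, both are reported rather than a midpoint.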
Thank You!