Uploaded by Kyle Salavante Abong (Abong, Kyle Salavante)

Skewness & Kurtosis: Measures of Shape in Statistics

MEASURES OF SHAPES
A measure of shape is a descriptive statistic that helps us determine how the
data points in a data set are distributed.
Skewness is a measure of the asymmetry of a distribution. This value
can be positive or negative.
A negative skew indicates that the tail is on the left side of the
distribution, which extends towards more negative values.
A positive skew indicates that the tail is on the right
side of the distribution, which extends towards
more positive values.
A value of zero indicates that there is no skewness
in the distribution at all, meaning the distribution is
perfectly symmetrical.
Kurtosis is a measure of whether a distribution is heavy-tailed or
light-tailed relative to a normal distribution.
For the percentile coefficient of kurtosis used in this lesson,
Q / (P90 − P10), a normal distribution has a value of approximately 0.263.
If a given distribution has a percentile coefficient greater than 0.263, it is
said to be platykurtic, which means it tends to produce fewer and
less extreme outliers than the normal distribution.
If a given distribution has a percentile coefficient less than 0.263, it is
said to be leptokurtic, which means it tends to produce more outliers than
the normal distribution.
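As a quick check of the reference value, the percentile coefficient of kurtosis Q / (P90 − P10) can be evaluated for a standard normal distribution. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the function name `percentile_kurtosis_normal` is illustrative and not part of the lesson.

```python
from statistics import NormalDist


def percentile_kurtosis_normal() -> float:
    """Percentile coefficient of kurtosis, Q / (P90 - P10), for a standard normal."""
    z = NormalDist()  # mean 0, standard deviation 1
    q1, q3 = z.inv_cdf(0.25), z.inv_cdf(0.75)    # first and third quartiles
    p10, p90 = z.inv_cdf(0.10), z.inv_cdf(0.90)  # 10th and 90th percentiles
    quartile_deviation = (q3 - q1) / 2           # Q
    return quartile_deviation / (p90 - p10)


print(round(percentile_kurtosis_normal(), 3))
```

The result does not depend on the mean or standard deviation chosen, because both Q and P90 − P10 scale with the standard deviation.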
Symmetric Distribution
A distribution can be described as symmetrical if its two halves are mirror
images of each other. If you split it at its mean, what you see on the
left of the distribution is what you will see on the right.
Normal Distribution
The normal distribution is a symmetric distribution in which the mean, the
median, and the mode coincide. The most important characteristics of a
normal distribution are as follows:
First, it is symmetrical; as described in the previous paragraph, the left and
right sides of the distribution mirror each other, split at the middle
where the mean, median, and mode meet.
Secondly, the mean, median, and mode coincide in the middle when the data
are plotted on a graph. Furthermore, since the distribution is highest near
the center, it forms a bell-shaped curve when graphed. The distribution is
also unimodal, i.e., it has only one mode: a single value that occurs most
frequently.
Each of these three characteristics of the normal distribution is crucial in
statistics because they enable us to calculate probabilities. They also play
an important role in inferential statistics. Taken together, they imply that
most of the data lie near the center, and that values become more extreme
and less common as we move away from the center on either side.
Skewed Distribution
Skew is one of the characteristics commonly used to describe how values are
distributed. An asymmetric distribution can be skewed either positively or
negatively. This occurs when the more common values are crowded toward the
low end or the high end of the x-axis. Skewness is among the most common ways
in which a distribution departs from the normal distribution. In this case,
the median, mean, and mode do not coincide. The easiest way to determine
whether a distribution is skewed is to make a histogram and observe the shape
of the distribution. If it is skewed, it is either positively skewed or
negatively skewed.
Positively Skewed Distribution
A distribution is considered to be positively skewed when the mean
of the distribution is greater than the median, and the majority of the
scores fall in the lower part of the distribution, while a few scores are
in the higher part of the distribution.
If we plot the distribution as a histogram, the left side will be larger
than the right, with a long tail extending to the right; the mean is located
to the right of the median.
Negatively Skewed Distribution
A distribution is negatively skewed when the mean of the distribution
is less than the median and the majority of the scores are at the
upper end of the distribution, while a few scores are located in
the lower part. Probabilities based on an assumed normal distribution
may therefore underestimate the number of scores at the lower end of a
negatively skewed distribution and overestimate the number of scores at
the upper end.
If we plot this on a histogram, the right side will be larger
than the left, with the mean to the left and the median
to the right. This is why a negatively skewed distribution is also referred
to as a left-skewed distribution. In other words, the distribution is
negatively skewed when the mean is lower than the
median (placing the mean to the left of the median) and a small
number of scores lie at the lower end of the range, forming an
elongated tail on the lower side.
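The mean-versus-median comparison described above can be sketched in a few lines of Python. This is only a rough rule of thumb, not a formal test; the function name `skew_direction` and both data sets are hypothetical and chosen purely for illustration.

```python
from statistics import mean, median


def skew_direction(scores):
    """Classify the direction of skew by comparing the mean with the median."""
    m, md = mean(scores), median(scores)
    if m > md:
        return "positive (right) skew"
    if m < md:
        return "negative (left) skew"
    return "no skew indicated"


# Hypothetical data: one high score stretches the right tail.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 15]
# Hypothetical data: one low score stretches the left tail.
left_tailed = [1, 12, 13, 13, 14, 14, 14, 15]

print(skew_direction(right_tailed))
print(skew_direction(left_tailed))
```

In the first list the single large value pulls the mean above the median; in the second the single small value pulls the mean below it.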
Formula for Skewness
\[ Sk = \frac{3(\bar{X} - Md)}{SD} \]
Where:
Sk = Skewness
\bar{X} = Mean
Md = Median
SD = Standard Deviation
3 = Constant
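The formula translates directly into a small helper. This is a minimal sketch; the function name `pearson_skewness` and the example values (mean 50, median 45, standard deviation 10) are hypothetical and are not taken from the lesson's data.

```python
def pearson_skewness(mean: float, median: float, sd: float) -> float:
    """Skewness as Sk = 3 * (mean - median) / SD."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / sd


# Hypothetical values: the mean lies above the median, so Sk is positive.
print(pearson_skewness(50, 45, 10))
```

A positive result indicates a right (positive) skew, a negative result a left (negative) skew, and zero indicates no skew under this measure.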
Determine whether the following distribution is normal or not normal.

Scores   f    <cumf   Midpt (X)   fX
45-50    2    50      47.5        95
40-44    6    48      42          252
35-39    11   42      37          407
30-34    10   31      32          320
25-29    12   21      27          324
20-24    5    9       22          110
15-19    4    4       17          68
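n = 50
ΣfX = 1,576

Mean:
\[ \bar{X} = \frac{\sum fX}{N} = \frac{1576}{50} = 31.52 \approx 31.5 \]

Median, where L is the lower boundary of the median class (30-34), <cumf is the cumulative frequency below that class, f is the frequency of that class, and i is the class width:
\[ Md = L + \left( \frac{\frac{n}{2} - {<}\text{cumf}}{f} \right) i \]
\[ Md = 29.5 + \left( \frac{\frac{50}{2} - 21}{10} \right) 5 = 29.5 + \left( \frac{4}{10} \right) 5 = 29.5 + 2 = 31.5 \]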
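The grouped mean and median above can be reproduced with a short Python sketch. The function names are illustrative, and two assumptions are labeled in the code: class boundaries lie 0.5 below each lower limit, and every class has width 5, as in the calculation above (the top class is printed as 45-50 in the table, but its listed midpoint of 47.5 is used as given).

```python
# (lower limit, upper limit, frequency), lowest class first, from the table above
classes = [(15, 19, 4), (20, 24, 5), (25, 29, 12), (30, 34, 10),
           (35, 39, 11), (40, 44, 6), (45, 50, 2)]


def grouped_mean(classes):
    """Mean of grouped data: sum of f * midpoint divided by n."""
    n = sum(f for _, _, f in classes)
    return sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n


def grouped_median(classes, width=5):
    """Median of grouped data by interpolation within the median class.

    Assumes a common class width and boundaries 0.5 below each lower limit.
    """
    n = sum(f for _, _, f in classes)
    cum = 0  # cumulative frequency below the current class
    for lo, _, f in classes:
        if cum + f >= n / 2:
            return (lo - 0.5) + (n / 2 - cum) / f * width
        cum += f
    raise ValueError("no classes given")


print(grouped_mean(classes))    # unrounded mean
print(grouped_median(classes))  # median by interpolation
```

The unrounded mean is 31.52, which the worked solution rounds to 31.5 before comparing it with the median.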
Scores   f    <cumf   Midpt (X)   fX    X − X̄    (X − X̄)²   f(X − X̄)²
45-50    2    50      47.5        95    16       256        512
40-44    6    48      42          252   10.5     110.25     661.5
35-39    11   42      37          407   5.5      30.25      332.75
30-34    10   31      32          320   0.5      0.25       2.5
25-29    12   21      27          324   -4.5     20.25      243
20-24    5    9       22          110   -9.5     90.25      451.25
15-19    4    4       17          68    -14.5    210.25     841

n = 50
ΣfX = 1,576
Σf(X − X̄)² = 3,044
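Skewness

Median:
\[ Md = L + \left( \frac{\frac{n}{2} - {<}\text{cumf}}{f} \right) i = 29.5 + \left( \frac{\frac{50}{2} - 21}{10} \right) 5 = 29.5 + \left( \frac{4}{10} \right) 5 = 29.5 + 2 = 31.5 \]

Standard deviation:
\[ SD = \sqrt{\frac{\sum f(X - \bar{X})^2}{N - 1}} = \sqrt{\frac{3{,}044}{49}} \approx 7.88 \]

Skewness:
\[ Sk = \frac{3(31.5 - 31.5)}{7.88} = 0 \]

Sk = 0; therefore the distribution is not skewed, which is consistent with a normal distribution.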
Kurtosis
Like skewness, kurtosis is a term used to describe the shape of a
distribution. It describes the distribution in terms of how flat or how
peaked it is. Distributions are classified by kurtosis into three
kinds: leptokurtic, platykurtic, and mesokurtic.
Leptokurtic
If the excess kurtosis is clearly positive, the shape of the distribution is
called leptokurtic. In terms of shape, it has heavier tails than a normal
distribution and a higher, sharper peak than the bell-shaped curve (and a
much higher peak than a platykurtic distribution), with values clustered
around the center (mean). In terms of the percentile coefficient
Q / (P90 − P10), a value less than 0.263 indicates a leptokurtic
distribution, because heavy tails widen P90 − P10 relative to Q.
Platykurtic
If the excess kurtosis is negative, the shape of the distribution is
called platykurtic.
In terms of the percentile coefficient Q / (P90 − P10), a value greater
than 0.263 indicates a platykurtic distribution.
Mesokurtic
This is the case of a normal distribution. The tails are neither too
thin nor too thick, and the scores are neither too concentrated around the
central point nor too dispersed.
For a mesokurtic distribution, the percentile coefficient Q / (P90 − P10)
is approximately 0.263.
Formula for Kurtosis
\[ Ku = \frac{Q}{P_{90} - P_{10}} \]
Where:
Ku = Kurtosis
Q = Quartile Deviation
P_{90} = 90th Percentile
P_{10} = 10th Percentile
Scores   f    <cumf
45-50    2    50
40-44    6    48
35-39    11   42
30-34    10   31
25-29    12   21
20-24    5    9
15-19    4    4

\[ Q = \frac{Q_3 - Q_1}{2} \]
\[ Q_3 = L + \left( \frac{\frac{3N}{4} - {<}\text{cumf}}{f} \right) i \]
\[ Q_1 = L + \left( \frac{\frac{N}{4} - {<}\text{cumf}}{f} \right) i \]
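The quartile formulas above, together with the analogous interpolation for P10 and P90, can be sketched in Python to evaluate Ku = Q / (P90 − P10) for this table. The helper name `grouped_percentile` is hypothetical. The sketch assumes class boundaries 0.5 below each lower limit and a common class width of 5, and it applies the same interpolation rule to percentiles that the formulas above state for quartiles.

```python
# (lower limit, frequency), lowest class first, from the table above
classes = [(15, 4), (20, 5), (25, 12), (30, 10), (35, 11), (40, 6), (45, 2)]


def grouped_percentile(classes, percent, width=5):
    """Interpolated percentile of grouped data: L + ((pN/100 - <cumf) / f) * i."""
    n = sum(f for _, f in classes)
    target = percent * n / 100
    cum = 0  # cumulative frequency below the current class
    for lo, f in classes:
        if cum + f >= target:
            return (lo - 0.5) + (target - cum) / f * width
        cum += f
    raise ValueError("percent must be between 0 and 100")


q1 = grouped_percentile(classes, 25)
q3 = grouped_percentile(classes, 75)
p10 = grouped_percentile(classes, 10)
p90 = grouped_percentile(classes, 90)

q = (q3 - q1) / 2     # quartile deviation
ku = q / (p90 - p10)  # percentile coefficient of kurtosis

print(round(q1, 2), round(q3, 2), p10, p90, round(ku, 3))
```

Under these assumptions the coefficient comes out at about 0.267, close to the value of about 0.263 for a normal curve.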