Uploaded by hpp academicmaterials

NORMAL DISTRIBUTION Notes

THE NORMAL DISTRIBUTION
CONTINUOUS RANDOM VARIABLE
• A variable that can assume any value on a
continuum (can assume an uncountable number
of values)
• Examples are as follows:
o thickness of an item
o time required to complete a task
o temperature of a solution
o height
NORMAL DISTRIBUTION
• It is the most common continuous distribution.
• Also known as the Gaussian distribution or the
bell curve.
• In this distribution, the probability that various
values occur within certain ranges or intervals can
be calculated.
THE NORMAL DISTRIBUTION PROPERTIES
1. ‘Bell Shaped’
2. Symmetrical
3. Mean, Median and Mode are equal
4. Location is characterized by the mean, μ
5. Spread is characterized by the standard
deviation, σ
6. The random variable has an infinite theoretical
range: −∞ to +∞
NOTE: Values above the mean have positive Z-values;
values below the mean have negative Z-values.
EXAMPLE
• If X is distributed normally with mean of 100 and
standard deviation of 50, the Z value for X = 200
is
$$z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$$
• This says that X = 200 is two standard deviations
(2 increments of 50 units) above the mean of
100.
• Note that the distribution is the same; only the
scale has changed. We can express the problem
in original units (X) or in standardized units (Z).
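The conversion between original units (X) and standardized units (Z) can be sketched in a few lines of Python. The function names `z_score` and `from_z` are illustrative assumptions, not part of the notes.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def from_z(z: float, mu: float, sigma: float) -> float:
    """Convert a standardized value back to original units."""
    return mu + z * sigma


print(z_score(200, 100, 50))  # 2.0
print(from_z(2.0, 100, 50))   # 200.0
```

A value below the mean, such as X = 50, gives a negative result (−1.0), in line with the note on the sign of Z-values.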
NORMAL PROBABILITIES
• Probability is measured by the area under the
curve
• The total area under the curve is 1.0, and the
curve is symmetric, so half is above the mean,
half is below.
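As a rough numerical check of these two facts, the sketch below approximates the area under a normal curve with the trapezoidal rule. It uses the standard normal density formula, which these notes do not state explicitly. The function names, the cutoff of 8 standard deviations on each side, and the step count are illustrative assumptions.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Height of the normal curve at x (standard density formula)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def area_under_curve(a: float, b: float, mu: float, sigma: float,
                     steps: int = 10_000) -> float:
    """Trapezoidal approximation of the area under the curve between a and b."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h


mu, sigma = 100.0, 50.0
# The curve is essentially flat at zero beyond 8 standard deviations.
total_area = area_under_curve(mu - 8 * sigma, mu + 8 * sigma, mu, sigma)
upper_half = area_under_curve(mu, mu + 8 * sigma, mu, sigma)
print(round(total_area, 4), round(upper_half, 4))
```

The two printed values should come out very close to 1.0 and 0.5 respectively.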
THE NORMAL DISTRIBUTION SHAPE
THE STANDARDIZED NORMAL DISTRIBUTION
• Also known as the “Z” distribution
• Mean is 0
• Standard Deviation is 1
EXAMPLE 1
• Let X represent the time it takes (in seconds) to
download an image file from the internet.
• Suppose X is normal with mean 8.0 and standard
deviation 5.0.
• Find P(X < 8.6).
Calculate Z-values as follows:
$$z = \frac{X - \mu}{\sigma} = \frac{8 - 8}{5} = 0$$
$$z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$$
ANSWER: P(X < 8.6) = P(Z < 0.12) = 0.5478 or 54.78%
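The table lookup in this example can be reproduced with Python's standard library, which provides `statistics.NormalDist` (Python 3.8 or later). This is a minimal sketch. The helper names `prob_below`, `prob_above` and `prob_between` are illustrative assumptions, not functions from the notes or the course software.

```python
from statistics import NormalDist

# Download time X is assumed normal with mean 8.0 and standard deviation 5.0.
download_time = NormalDist(mu=8.0, sigma=5.0)


def prob_below(dist: NormalDist, x: float) -> float:
    """P(X < x): area under the curve to the left of x."""
    return dist.cdf(x)


def prob_above(dist: NormalDist, x: float) -> float:
    """P(X > x): the total area of 1.0 minus the area to the left of x."""
    return 1.0 - dist.cdf(x)


def prob_between(dist: NormalDist, a: float, b: float) -> float:
    """P(a < X < b): area to the left of b minus area to the left of a."""
    return dist.cdf(b) - dist.cdf(a)


z = (8.6 - download_time.mean) / download_time.stdev
print(round(z, 2))                                # 0.12
print(round(prob_below(download_time, 8.6), 4))   # 0.5478
```

Because the total area is 1.0, upper-tail and interval probabilities follow from the same cumulative area by subtraction, which is all the other two helpers do.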
EXAMPLE 2
• Suppose X is normal with mean 8.0 and standard
deviation 5.0. Find P(X > 8.6).
$$P(X > 8.6) = 1.0 - P(X < 8.6) = 1.0 - 0.5478 = 0.4522$$
ANSWER: P(X > 8.6) = P(Z > 0.12) = 0.4522 or 45.22%
CA 51018: Statistical Analysis with Software Applications
EXAMPLE 3
• Suppose X is normal with mean 8.0 and standard
deviation 5.0. Find P(8 < X < 8.6).
• Calculate Z-values as follows:
$$z = \frac{X - \mu}{\sigma} = \frac{8 - 8}{5} = 0$$
$$z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$$
ANSWER: P(8 < X < 8.6) = P(0 < Z < 0.12)
= 0.5478 − 0.5000 = 0.0478 or 4.78%
ASSESSING NORMALITY
• It is important to evaluate how well the data set
is approximated by a normal distribution.
• Normally distributed data should approximate the
theoretical normal distribution:
o The normal distribution is bell shaped
(symmetrical) where the mean is equal to
the median.
o The empirical rule applies to the normal
distribution.
o The interquartile range of a normal
distribution is approximately 1.33 standard
deviations.
THE EMPIRICAL RULE AS APPLIED TO THE
NORMAL DISTRIBUTION
• This rule states that for symmetrical bell-shaped
data sets, roughly two out of every three
observations are contained within a distance of
1 standard deviation around the mean, and roughly
95% are contained within a distance of 2 standard
deviations around the mean.
ASSESSING NORMALITY (cont.)
• Construct charts or graphs
o For small- or moderate-sized data sets,
do the stem-and-leaf display and
box-and-whisker plot look symmetric?
o For large data sets, does the histogram
or polygon appear bell-shaped?
• Compute descriptive summary measures
o Do the mean, median and mode have
similar values?
o Is the interquartile range approximately
1.33 σ?
o Is the range approximately 6 σ?
• Observe the distribution of the data set
o Do approximately 2/3 of the observations
lie within mean ± 1 standard deviation?
o Do approximately 80% of the observations
lie within mean ± 1.28 standard deviations?
o Do approximately 95% of the observations
lie within mean ± 2 standard deviations?
• Evaluate normal probability plot
o Is the normal probability plot
approximately linear with positive slope?
o A normal probability plot for data from a
normal distribution will be approximately
linear.
o Non-linear plots indicate a deviation from
normality.
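The checklist quantities can be computed directly, both for the theoretical normal distribution and for a data set. The sketch below uses Python's `statistics` module. The function names, the idealized sample, and the plotting position (i − 0.5)/n used for the normal probability plot are assumptions made for illustration. Quartile conventions also differ between textbooks and software, so small differences from hand calculations are expected.

```python
from statistics import NormalDist, mean, stdev, quantiles

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def theoretical_share_within(k: float) -> float:
    """Share of a normal population within k standard deviations of the mean."""
    return STD_NORMAL.cdf(k) - STD_NORMAL.cdf(-k)


def theoretical_iqr_in_sd() -> float:
    """Interquartile range of a normal distribution, in standard deviations."""
    return STD_NORMAL.inv_cdf(0.75) - STD_NORMAL.inv_cdf(0.25)


def sample_checks(data):
    """Sample counterparts of the checklist quantities."""
    m, s = mean(data), stdev(data)
    q1, _, q3 = quantiles(data, n=4)  # default "exclusive" quartile method

    def share_within(k):
        return sum(1 for x in data if abs(x - m) <= k * s) / len(data)

    return {
        "share_within_1_sd": share_within(1.0),
        "share_within_1.28_sd": share_within(1.28),
        "share_within_2_sd": share_within(2.0),
        "iqr_over_s": (q3 - q1) / s,
        "range_over_s": (max(data) - min(data)) / s,
    }


def probability_plot_points(data):
    """(theoretical Z quantile, ordered observation) pairs for a normal probability plot."""
    n = len(data)
    return [(STD_NORMAL.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(sorted(data), start=1)]


# Hypothetical, idealized sample: evenly spaced quantiles of Normal(100, 50).
population = NormalDist(mu=100, sigma=50)
sample = [population.inv_cdf((i + 0.5) / 1000) for i in range(1000)]
print(sample_checks(sample))
```

For the theoretical distribution, the shares within 1, 1.28 and 2 standard deviations come out near 0.68, 0.80 and 0.95. The interquartile range comes out near 1.35 standard deviations, close to the rounded 1.33 quoted in the notes. Plotting the pairs from `probability_plot_points` for roughly normal data should give points that lie close to a straight line with positive slope.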
EXPLORATORY DATA ANALYSIS: THE FIVE-NUMBER
SUMMARY
• The five numbers that describe the spread of data
are:
o Minimum
o First Quartile (Q1)
o Median (Q2)
o Third Quartile (Q3)
o Maximum
• The Box-and-Whisker Plot is a graphical display
of the five number summary.
• The Box and central line are centered between
the endpoints if data are symmetric around the
median.
• A Box-and-Whisker plot can be shown in either
vertical or horizontal format.
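A five number summary can be computed with Python's standard library, as in the minimal sketch below. The function name and the sample data are hypothetical. `statistics.quantiles` uses its default "exclusive" method here, which may differ slightly from the quartile rule used in a given textbook or spreadsheet.

```python
from statistics import quantiles


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a sample of at least two values."""
    ordered = sorted(data)
    q1, q2, q3 = quantiles(ordered, n=4)  # three cut points: Q1, median, Q3
    return ordered[0], q1, q2, q3, ordered[-1]


# Hypothetical sample of nine observations.
values = [11, 12, 13, 16, 16, 17, 18, 21, 22]
print(five_number_summary(values))  # (11, 12.5, 16.0, 19.5, 22)
```

In this sample the median (16.0) sits midway between Q1 and Q3, so the central line of the box would be centered within the box.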
OTHER WAYS OF ASSESSING NORMALITY OF
DATA
• Checking for skewness with the Pearson coefficient
(PC) of skewness:
$$PC = \frac{3(\bar{X} - \text{median})}{s}$$
o NOTE: The data is considered significantly
skewed when PC is greater than or equal to
+1 or less than or equal to −1.
• Checking for outliers
o NOTE: An outlier is a data value that lies
more than 1.5(IQR) units below Q1 or more than
1.5(IQR) units above Q3.
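Both checks can be sketched with Python's `statistics` module. The function names and the sample data are hypothetical, and the quartiles use the module's default "exclusive" method, so the fences may differ slightly from those obtained with another quartile rule.

```python
from statistics import mean, median, stdev, quantiles


def pearson_skewness(data):
    """Pearson coefficient of skewness: 3 * (mean - median) / sample standard deviation."""
    return 3 * (mean(data) - median(data)) / stdev(data)


def is_significantly_skewed(data):
    """Apply the rule from the notes: PC >= +1 or PC <= -1."""
    return abs(pearson_skewness(data)) >= 1


def find_outliers(data):
    """Values more than 1.5 * IQR below Q1 or more than 1.5 * IQR above Q3."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical sample with one unusually large value.
values = [11, 12, 13, 16, 16, 17, 18, 21, 22, 50]
print(round(pearson_skewness(values), 2))
print(find_outliers(values))  # [50]
```

A perfectly symmetric sample such as 1, 2, 3, 4, 5 has equal mean and median, so its coefficient is 0.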