8.2 Properties of the Normal Distribution

Chapter 8 Notes / December 2012
A continuous probability distribution is a smooth density curve that models
the distribution of a continuous random variable. Here are some hypothetical
continuous probability distributions. Notice that the curve may be nothing
more than a straight line.
Normal distributions are a family of distributions that have the shape
shown below.
Normal distributions are symmetric with scores more concentrated in the middle
than in the outside tails. Looking at the distribution we can estimate an
approximate mean, median and mode.
Normal distributions are defined by two parameters that we have already
covered in class: the mean (μ) and the standard deviation (σ). The mean
is the average of the data, while the standard deviation measures
how spread out the distribution is.
A continuous random variable X can theoretically take any value in
an interval of values. Examples that come to mind are: heights in inches of a
species of a plant, weights in pounds of patients in a cardiac unit, or disk access
time in nanoseconds for a certain disk drive.
Distributions that are not symmetric are said to be skewed.
“Positively skewed” distributions have their mode closer to the left, with a
longer tail to the right, while “negatively skewed” distributions have the mode
closer to the right, with a longer tail to the left.
These graphs illustrate the notion of skew. The one on the left is positively
skewed. The one on the right is negatively skewed.
Two other common distributions are the uniform distribution (a straight
horizontal line) and the exponential distribution (a curve sloping downwards to
the right).
Properties of the Normal Distribution
The normal distributions are a very important class of statistical
distributions. All normal distributions are symmetric and have bell-shaped
density curves with a single peak.
In previous lessons we covered the idea of standard deviation. The standard
deviation is a statistic that tells you how tightly the values in a set of data
are clustered around the mean. When the values are tightly bunched together
and the bell-shaped curve is steep, the standard deviation is small.
However, when the values are spread apart and the bell curve is
relatively flat, the standard deviation is relatively large.
Recall from previous lessons that the deviation is the difference between an
individual value in a set of data and the mean of the data. Simply put, it is the
difference between a particular element of the set and the average of the set as
a whole.
Standard Deviation: The square root of the variance. This is never negative,
and it is zero only when all the values are identical.
Sample Standard Deviation: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Population Standard Deviation: $\sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}}$
Variance: The mean of the squared deviations of the observations from their
mean (the sample version divides by n − 1 rather than n). This is never
negative; it is zero only when every observation equals the mean.
Sample Variance: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1}$
Population Variance: $\sigma^2 = \dfrac{\sum (x - \mu)^2}{N}$
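As a rough illustration of the four formulas above, the following Python sketch works through them step by step. The data set is hypothetical and was chosen only because the arithmetic comes out cleanly; the variable names are likewise just for illustration.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)  # sum is 40, so the mean is 5

# Deviation of each value from the mean, then the sum of squared deviations.
deviations = [x - mean for x in data]
sum_sq = sum(d ** 2 for d in deviations)  # 9+1+1+1+0+0+4+16 = 32

# Population formulas divide by N; sample formulas divide by n - 1.
pop_var = sum_sq / len(data)
samp_var = sum_sq / (len(data) - 1)

# The standard deviation is the square root of the variance.
pop_sd = pop_var ** 0.5
samp_sd = samp_var ** 0.5

print("population variance:", pop_var, " population SD:", pop_sd)
print("sample variance:", round(samp_var, 3), " sample SD:", round(samp_sd, 3))
```

The standard library functions `statistics.pvariance`, `statistics.variance`, `statistics.pstdev` and `statistics.stdev` should give the same results and are the usual choice in practice. Note that the sample values are slightly larger than the population values because of the smaller divisor.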
Standard Deviation
In the diagram below, both distributions have a mean and mode of 50. The blue
(taller) distribution has a standard deviation of 5; the red (shorter) distribution
has a standard deviation of 10. For the blue distribution, 68% of the distribution
is between 45 and 55; for the red distribution, 68% is between 40 and 60.
Normal distributions with standard
deviations of 5 (blue line) and 10 (red
line).
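The figures quoted for the two curves can be checked numerically. The sketch below assumes Python 3.8 or later, whose standard library provides `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

# The two curves described above: same mean, different standard deviations.
blue = NormalDist(mu=50, sigma=5)
red = NormalDist(mu=50, sigma=10)

# Area under each curve within one standard deviation of the mean.
share_blue = blue.cdf(55) - blue.cdf(45)
share_red = red.cdf(60) - red.cdf(40)

# Height of each density curve at the mean.
peak_blue = blue.pdf(50)
peak_red = red.pdf(50)

print(f"blue, 45 to 55: {share_blue:.4f}")
print(f"red,  40 to 60: {share_red:.4f}")
print(f"peak heights: blue {peak_blue:.4f}, red {peak_red:.4f}")
```

Both shares come out at roughly 0.68, and the curve with the smaller standard deviation is twice as tall at the mean, which is why it looks steeper.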
One standard deviation away from the mean in either direction on the horizontal
axis (the red area on the above graph) accounts for about 68
percent of the people in this group. Two standard deviations away from the
mean (the red and green areas) account for roughly 95 percent of the people.
And three standard deviations (the red, green and blue areas) account for about
99.7 percent of the people.
The 68-95-99.7% Rule
All normal density curves satisfy the following property which is often referred
to as the Empirical Rule.
68% of the observations fall within 1 standard deviation of the mean, that is,
between $\mu - \sigma$ and $\mu + \sigma$.
95% of the observations fall within 2 standard deviations of the mean, that is,
between $\mu - 2\sigma$ and $\mu + 2\sigma$.
99.7% of the observations fall within 3 standard deviations of the mean, that
is, between $\mu - 3\sigma$ and $\mu + 3\sigma$.
Thus, for a normal distribution, almost all values lie within 3 standard
deviations of the mean.
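The rule can be reproduced with a few lines of Python. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# The three bands of the rule, computed for the standard normal curve.
rule = {k: share_within(k) for k in (1, 2, 3)}
for k, share in rule.items():
    print(f"within {k} SD: {share:.4f}")

# The same fraction holds for any mean and standard deviation.
same_for_other_curve = share_within(2, mu=50, sigma=10)
```

The computed fractions are approximately 0.6827, 0.9545 and 0.9973, so the percentages in the rule are rounded values. They do not depend on the particular mean or standard deviation.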
Z-Scores:
A Z-score tells you how many standard deviations a score is from the mean. To
create a Z-score, subtract the mean (μ) from a raw score (x) and divide by the
standard deviation (σ): $z = \dfrac{x - \mu}{\sigma}$.
Z-scores reflect a score’s relationship to the rest of the scores:
-Z = below average
+Z = above average
The Z-score measures the number of standard deviations any point is away from
the mean. By using the Z-Score, any normally distributed data can be converted
to the standard normal distribution (with µ = 0 and σ = 1). When the Z-Scores
are used, another way to determine the probabilities is by using the tables of
areas under the normal distribution curve.
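A short sketch of the conversion, with `NormalDist().cdf` standing in for the printed table of areas. The score, mean and standard deviation are hypothetical values, and `z_score` is a helper written for this example (Python 3.8 or later is assumed).

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical example: a raw score of 60 where the mean is 50 and the SD is 5.
z = z_score(60, mu=50, sigma=5)

# Area under the standard normal curve to the left of z,
# which is the value a table of areas would give.
standard_normal = NormalDist()  # mu = 0, sigma = 1
area_left = standard_normal.cdf(z)

print(f"z = {z}, area to the left = {area_left:.4f}")
```

Here the score is two standard deviations above the mean, and about 97.7% of the distribution lies below it.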
Z-scores, percentiles and cut-off scores are all useful techniques for analyzing
normal distributions.
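Percentiles and cut-off scores are two directions of the same lookup: a percentile goes from a score to an area, and a cut-off goes from an area back to a score. The sketch below assumes Python 3.8 or later and uses a hypothetical set of scores with mean 50 and standard deviation 5.

```python
from statistics import NormalDist

# Hypothetical scores, assumed to be normally distributed.
scores = NormalDist(mu=50, sigma=5)

# Percentile of a raw score: the percentage of the distribution below it.
percentile_of_55 = scores.cdf(55) * 100

# Cut-off score for the top 10%: the score at the 90th percentile.
cutoff_top_10 = scores.inv_cdf(0.90)

print(f"a score of 55 is at about the {percentile_of_55:.1f}th percentile")
print(f"the top 10% begins at a score of about {cutoff_top_10:.1f}")
```

A score of 55 is one standard deviation above the mean, so it sits at roughly the 84th percentile, and the cut-off for the top 10% is a little above 56.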