Normal Distribution and Area Under the Curve

Normal Distribution
Previously we introduced the concept of the normal distribution. The normal
distribution is an important concept for three reasons:

In business, many continuous variables have distributions that approximate
the normal distribution.

The normal distribution provides the basis for statistical inference because of
its relationship to the Central Limit Theorem (described in the reading
assignment for this module).

Many of the statistical tests that we’ll cover assume a normal distribution, and
many of these tests are robust to violations of normality as long as the data
are approximately normal.

The normal curve, frequently referred to as the bell-shaped curve because of its
appearance, is presented graphically below. The normal curve 1) is symmetrical,
with the mean and median being equal, 2) has an infinite range, and 3) contains
about 50% of its values between two-thirds of a standard deviation below and
two-thirds of a standard deviation above the mean.
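The third property can be checked numerically. The following is a minimal
sketch, assuming Python and its standard library; the helper name area_within
is hypothetical and not part of the course materials. It uses the error
function to compute the share of a normal distribution that lies within k
standard deviations of the mean.

```python
import math

def area_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    # For a standard normal variable Z, P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

# Two-thirds of a standard deviation on each side of the mean
print(round(area_within(2 / 3), 3))
# One standard deviation on each side of the mean
print(round(area_within(1), 3))
```

The first value comes out just under one-half, which is why "two-thirds of a
standard deviation" is a rule of thumb; the exact 50% boundary is closer to
0.6745 standard deviations.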
Standard Normal Distributions (Z Scores)
Normal distributions can be converted to standard normal distributions by
transforming the variables to z scores. This is also known as standardizing a
variable. Standard normal distributions have two properties: the mean is always 0
and the variance is always 1, regardless of the original values of the variable
(miles to the planets or the weights of grains of sand). The formula for
calculating z scores follows:

$$z = \frac{X - \mu}{\sigma}$$

where $X$ is the original value, $\mu$ is the mean, and $\sigma$ is the standard
deviation. Note that z scores are in standard deviation units and may be plus or
minus. Thus a z score of +1.2 represents a value that is 1.2 standard deviations
above the mean, while a z score of -2.3 represents a value that is 2.3 standard
deviations below the mean. For more about standardizing and Excel, see the
Calculating Z Scores video.
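As an illustration of standardizing, here is a minimal sketch in Python. The
function name standardize and the sample values are hypothetical, and the
sketch assumes the population standard deviation is the one intended (the text
does not say). It converts a list of raw values to z scores and then checks the
two properties described above.

```python
import statistics

def standardize(values):
    """Convert raw values to z scores: (value - mean) / standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation (an assumption)
    return [(x - mean) / sd for x in values]

# Hypothetical raw values, for illustration only
weights = [12.0, 15.5, 9.8, 20.1, 14.6]
z_scores = standardize(weights)

print([round(z, 2) for z in z_scores])
print(round(statistics.fmean(z_scores), 10))      # mean of the z scores, approximately 0
print(round(statistics.pvariance(z_scores), 10))  # variance of the z scores, approximately 1
```

Any other list of numbers with some spread would give the same mean and
variance after standardizing, which is the point of the transformation.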
Area Under the Curve
The following table is a portion of a larger table containing areas under the
standard normal curve between 0 and the indicated z score. The row gives the z
score to one decimal place and the column gives the second decimal place. The
cell with the value of .1255 (row 0.3, column 0.02), highlighted in yellow,
represents the area between a z score of 0 (the mean) and a z score of 0.32. In
other words, there’s a probability of .1255 that a value would fall between a z
score of 0 and a z score of 0.32.
The second shaded value, .3413 (row 1.0, column 0.00), is the probability that a
value would fall between a z score of 0 and a z score of 1, that is, between the
mean and one standard deviation from the mean, either plus or minus. When
describing the properties related to the standard deviation in Module 3, we
noted that in a normal curve about 68% of the observations fall between plus and
minus one standard deviation. This property is derived from this table
(.3413 × 2 ≈ .68).
z   0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
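The table entries can be reproduced without a printed table. The sketch below
is one way to do it, assuming Python's standard library; the function name
area_from_mean is hypothetical. It relies on the identity that the area between
0 and z under the standard normal curve equals one-half of erf(z / sqrt(2)).

```python
import math

def area_from_mean(z):
    """Area under the standard normal curve between 0 and z."""
    # The curve is symmetrical, so a negative z gives the same area as |z|.
    return 0.5 * math.erf(abs(z) / math.sqrt(2))

print(round(area_from_mean(0.32), 4))      # table entry for row 0.3, column 0.02
print(round(area_from_mean(1.00), 4))      # table entry for row 1.0, column 0.00
print(round(2 * area_from_mean(1.00), 2))  # share within one standard deviation either side
```

The first two results match the highlighted table values of .1255 and .3413,
and the third gives the 68% figure.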
The following table lists the annual performance of the Dow Jones Industrial
Average from 1950 until 2003. The average return was 8.28% during this period
and the standard deviation was 13.04%. If we were interested in the probability
of obtaining returns above 15% in a year during this time period, we’d first
calculate the z score for a return of 15% as follows:

$$z = \frac{15 - 8.28}{13.04} \approx 0.52$$

Thus, a return of 15% is about one-half standard deviation above the mean of
8.28%. After viewing the video Z Scores and Area Under the Normal Curve, answer
the following question. What’s the probability (rounded to two decimal places) that
the annual return during this period would have exceeded 15%?
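The calculation behind this question can be sketched in Python as follows. This
is a minimal illustration, assuming annual returns are approximately normal
with the mean and standard deviation quoted above; the variable names and the
helper upper_tail are hypothetical. The area above a z score is one-half minus
the area between 0 and that z score, which is what the helper computes.

```python
import math

mean_return = 8.28   # average annual return in percent, from the text
sd_return = 13.04    # standard deviation in percent, from the text
target = 15.0        # return of interest, in percent

# z score: how many standard deviations the target lies above the mean
z = (target - mean_return) / sd_return

def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * (1 - math.erf(z / math.sqrt(2)))

prob_above = upper_tail(z)
print(f"z = {z:.2f}")
print(f"P(return > {target}%) = {prob_above:.2f}")
```

Looking up the rounded z score in the table and subtracting the tabled area
from .5 should give the same answer to two decimal places.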
Year  Annual Return
1950   20.52%
1951   19.11%
1952    5.09%
1953    1.92%
1954   21.01%
1955   32.57%
1956   11.36%
1957   -3.51%
1958    3.35%
1959   28.57%
1960   -2.23%
1961   11.89%
1962   -7.49%
1963   11.73%
1964   16.68%
1965    9.21%
1966   -4.09%
1967    0.63%
1968    3.06%
1969   -3.23%
1970  -14.09%
1971   17.47%
1972    7.45%
1973   -2.82%
1974  -17.81%
1975    5.68%
1976   21.49%
1977   -8.24%
1978   -8.32%
1979    2.95%
1980    5.57%
1981    4.66%
1982   -5.21%
1983   34.60%
1984   -1.00%
1985   12.71%
1986   34.97%
1987   26.95%
1988   -9.45%
1989   21.74%
1990    6.78%
1991    9.35%
1992   12.12%
1993    7.24%
1994    7.71%
1995   18.45%
1996   27.80%
1997   29.57%
1998   15.92%
1999   21.32%
2000    2.58%
2001   -5.08%
2002   -9.45%
2003   -2.52%
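The summary statistics quoted earlier can be recomputed from the table. The
sketch below is a hedged illustration in Python with the returns typed in from
the table; the variable names are hypothetical. Because the text does not say
whether 13.04% is a sample or a population standard deviation, the sketch
computes both. It also counts the years in which the return actually exceeded
15%, which gives an observed share to set beside the normal-curve probability.

```python
import statistics

# Annual returns in percent, 1950 through 2003, copied from the table above
returns = [
    20.52, 19.11, 5.09, 1.92, 21.01, 32.57, 11.36, -3.51, 3.35, 28.57,
    -2.23, 11.89, -7.49, 11.73, 16.68, 9.21, -4.09, 0.63, 3.06, -3.23,
    -14.09, 17.47, 7.45, -2.82, -17.81, 5.68, 21.49, -8.24, -8.32, 2.95,
    5.57, 4.66, -5.21, 34.60, -1.00, 12.71, 34.97, 26.95, -9.45, 21.74,
    6.78, 9.35, 12.12, 7.24, 7.71, 18.45, 27.80, 29.57, 15.92, 21.32,
    2.58, -5.08, -9.45, -2.52,
]

mean_return = statistics.fmean(returns)
sd_sample = statistics.stdev(returns)       # divides by n - 1
sd_population = statistics.pstdev(returns)  # divides by n

# Observed share of years with a return above 15%
years_above_15 = sum(1 for r in returns if r > 15)
share_above_15 = years_above_15 / len(returns)

print(f"years: {len(returns)}")
print(f"mean: {mean_return:.2f}%")
print(f"sample sd: {sd_sample:.2f}%, population sd: {sd_population:.2f}%")
print(f"years above 15%: {years_above_15} ({share_above_15:.2f})")
```

The observed share need not equal the normal-curve probability, since the
actual returns only approximate a normal distribution.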