Normal Distribution

Previously we introduced the concept of the normal distribution. The normal distribution is important for three reasons. First, in business many continuous variables have distributions that approximate the normal distribution. Second, the normal distribution provides the basis for statistical inference because of its relationship to the Central Limit Theorem (described in the reading assignment for this module). Third, many of the statistical tests that we'll cover assume a normal distribution, and many of these tests are robust to violations of normality as long as the data are approximately normal.

The normal curve, frequently referred to as the bell-shaped curve because of its appearance, is presented graphically below. The normal curve 1) is symmetrical, with the mean and median equal, 2) has an infinite range, and 3) contains about 50% of its values between two-thirds of a standard deviation below the mean and two-thirds of a standard deviation above it.

Standard Normal Distributions (Z Scores)

Normal distributions can be converted to standard normal distributions by transforming the variables to z scores. This is also known as standardizing a variable. Standard normal distributions have two properties: the mean is always 0 and the variance is always 1. These properties hold regardless of the original values of the variable, whether they are miles to the planets or the weights of grains of sand.

The formula for calculating z scores follows:

$$z = \frac{X - \mu}{\sigma}$$

where $X$ is the original value, $\mu$ is the mean, and $\sigma$ is the standard deviation. Note that z scores are in standard deviation units and may be positive or negative. Thus a z score of +1.2 represents a value 1.2 standard deviations above the mean, while a z score of -2.3 represents a value 2.3 standard deviations below the mean. For more about standardizing and Excel, see the Calculating Z Scores video.
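As an illustration of standardizing, here is a minimal Python sketch. The data values and the function name `z_scores` are hypothetical, and the sketch assumes the population standard deviation is the appropriate one; for sample data, `statistics.stdev` could be substituted.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z score of each value: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(x - mu) / sigma for x in values]


# Hypothetical data, chosen so that the mean is 5 and the standard deviation is 2.
weights = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
standardized = z_scores(weights)

print(standardized)          # each value expressed in standard deviation units
print(mean(standardized))    # mean of the standardized variable
print(pstdev(standardized))  # standard deviation (and so variance) of the standardized variable
```

Whatever the scale of the original values, the standardized variable should come out with a mean of 0 and a variance of 1, which is the property described above.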
Area Under the Curve

The following table is a portion of a larger table containing areas under the standard normal curve between 0 and the indicated z score. The row label gives the z score to one decimal place and the column label gives the second decimal place. The value 0.1255, in row 0.3 and column 0.02, represents the area between a z score of 0 (the mean) and a z score of 0.32. In other words, there is a probability of 0.1255 that a value would fall between a z score of 0 and a z score of 0.32. The second value of interest, 0.3413 in row 1.0 and column 0.00, is the probability that a value would fall between a z score of 0 and a z score of 1, that is, between the mean and one standard deviation from the mean. Because the curve is symmetrical, the same area applies on either side of the mean. When describing the properties related to the standard deviation in Module 3, we noted that in a normal curve about 68% of the observations fall between plus and minus one standard deviation. This property is derived from this table (0.3413 × 2 ≈ 0.68).

z    0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0  0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1  0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2  0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3  0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4  0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5  0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6  0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7  0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8  0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9  0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0  0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621

In the following table, the Dow Jones Industrial Average annual performance from 1950 until 2003 is listed. The average return during this period was 8.28% and the standard deviation was 13.04%.
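Returning to the area table for a moment, its entries can be reproduced in software rather than looked up. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper name `area_from_mean` is hypothetical. It relies on the fact that the cumulative area up to z, minus the 0.5 that lies below the mean, is the area between 0 and z.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def area_from_mean(z):
    """Area under the standard normal curve between 0 and z.

    abs() is used because the curve is symmetrical, so the area
    between 0 and -z equals the area between 0 and +z.
    """
    return STANDARD_NORMAL.cdf(abs(z)) - 0.5


print(round(area_from_mean(0.32), 4))     # 0.1255, the entry for z = 0.32
print(round(area_from_mean(1.00), 4))     # 0.3413, the entry for z = 1.00
print(round(2 * area_from_mean(1.00), 2)) # 0.68, within one standard deviation of the mean
```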
If we were interested in the probability of obtaining returns above 15% in a year during this time period, we'd first calculate the z score for a return of 15% as follows:

$$z = \frac{15 - 8.28}{13.04} \approx 0.52$$

Thus, a return of 15% is about one-half standard deviation above the mean of 8.28%.

After viewing the video Z Scores and Area Under the Normal Curve, answer the following question. What's the probability (rounded to two decimal places) that the annual return during this period would have exceeded 15%?

Year  Annual Return
1950  20.52%
1951  19.11%
1952  5.09%
1953  1.92%
1954  21.01%
1955  32.57%
1956  11.36%
1957  -3.51%
1958  3.35%
1959  28.57%
1960  -2.23%
1961  11.89%
1962  -7.49%
1963  11.73%
1964  16.68%
1965  9.21%
1966  -4.09%
1967  0.63%
1968  3.06%
1969  -3.23%
1970  -14.09%
1971  17.47%
1972  7.45%
1973  -2.82%
1974  -17.81%
1975  5.68%
1976  21.49%
1977  -8.24%
1978  -8.32%
1979  2.95%
1980  5.57%
1981  4.66%
1982  -5.21%
1983  34.60%
1984  -1.00%
1985  12.71%
1986  34.97%
1987  26.95%
1988  -9.45%
1989  21.74%
1990  6.78%
1991  9.35%
1992  12.12%
1993  7.24%
1994  7.71%
1995  18.45%
1996  27.80%
1997  29.57%
1998  15.92%
1999  21.32%
2000  2.58%
2001  -5.08%
2002  -9.45%
2003  -2.52%
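The two steps in this kind of question (convert the return to a z score, then find the area in the upper tail) can be sketched in Python as a way to check a hand calculation. The function name `upper_tail_probability` is hypothetical, and the sketch assumes annual returns are approximately normally distributed with the mean and standard deviation stated above.

```python
from statistics import NormalDist


def upper_tail_probability(value, mean, std_dev):
    """Return (z, P(X > value)) for a normal variable.

    Assumes X is approximately normal with the given mean and
    standard deviation. The upper-tail area equals 0.5 minus the
    table area between 0 and z when z is positive.
    """
    z = (value - mean) / std_dev
    return z, 1.0 - NormalDist().cdf(z)


# Summary statistics for 1950-2003 as stated in the text, in percent.
z, probability = upper_tail_probability(15.0, 8.28, 13.04)
print(f"z = {z:.2f}")
print(f"P(return > 15%) = {probability:.2f}")
```

The same function can be reused for other thresholds by changing the first argument; a threshold below the mean gives a negative z score and an upper-tail probability above 0.5.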