Chapter 8 Notes / December 2012

A continuous probability distribution is a smooth density curve that models the distribution of a continuous random variable. Here are some hypothetical continuous probability distributions. Notice that the curve may indeed be nothing more than a straight line.

Normal distributions are a family of distributions that have the shape shown below. Normal distributions are symmetric, with scores more concentrated in the middle than in the tails. Looking at the distribution, we can estimate an approximate mean, median and mode. Normal distributions are defined by two parameters that we have already covered in class: the mean (μ) and the standard deviation (σ). The mean is the average of the data, while the standard deviation measures how spread out the distribution is; it is the square root of the variance.

A continuous random variable X can theoretically take any value in an interval of values. Examples that come to mind are: heights in inches of a species of plant, weights in pounds of patients in a cardiac unit, or disk access time in nanoseconds for a certain disk drive.

Distributions that are not symmetric are said to be skewed. “Positively skewed” distributions have their mode closer to the left, with a longer tail to the right, while “negatively skewed” distributions have their mode closer to the right, with a longer tail to the left. These graphs illustrate the notion of skew. The one on the left is positively skewed. The one on the right is negatively skewed.

Two other common forms of continuous distribution are the uniform distribution (a straight horizontal line) and the exponential distribution (a curve sloping downwards to the right).

Properties of the Normal Distribution

The normal distributions are a very important class of statistical distributions. All normal distributions are symmetric and have bell-shaped density curves with a single peak. In previous lessons we covered the idea of standard deviation.
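Before returning to standard deviation, the symmetry and single-peak properties stated above can be checked numerically. The sketch below is only an illustration: it uses Python's standard `statistics.NormalDist` class, and the parameters (mean 50, standard deviation 5) and the grid of test points are assumptions chosen for the example, not values taken from a particular data set.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only for illustration.
mu, sigma = 50.0, 5.0
dist = NormalDist(mu, sigma)

# Symmetry: the density at mu - d matches the density at mu + d.
for d in (1.0, 5.0, 12.5):
    left = dist.pdf(mu - d)
    right = dist.pdf(mu + d)
    print(f"d={d}: pdf(mu-d)={left:.6f}, pdf(mu+d)={right:.6f}")

# Single peak: on a grid of points from 30 to 70, the density is highest at the mean.
grid = [mu + k * 0.5 for k in range(-40, 41)]
peak_x = max(grid, key=dist.pdf)
print("peak at x =", peak_x)

# For a normal distribution the mean, median, and mode coincide.
print(dist.mean, dist.median, dist.mode)
```

Changing `sigma` makes the curve taller and narrower or shorter and wider, but the peak stays at the mean and the two sides remain mirror images.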
Standard deviation is a statistic that tells you how tightly the values in a set of data are clustered around the mean. When the values are tightly bunched together and the bell-shaped curve is steep, the standard deviation is small. When the values are spread apart and the bell curve is relatively flat, the standard deviation is relatively large.

Recall from previous lessons that the deviation is the difference between an individual value in a set of data and the mean of the data. Simply put, it is the difference between a particular element of the set and the average of the set as a whole.

Standard Deviation: The square root of the variance. This is never negative, and it is zero only when every value equals the mean.

Sample Standard Deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

Variance: The mean of the squared deviations of the observations from their mean (for a sample, the sum of squared deviations is divided by n − 1 rather than n). This is never negative.

Sample Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

Standard Deviation

In the diagram below, both distributions have means and modes of 50. The blue (taller) distribution has a standard deviation of 5; the red (shorter) distribution has a standard deviation of 10. For the blue distribution, about 68% of the distribution is between 45 and 55; for the red distribution, about 68% is between 40 and 60.

Normal distributions with standard deviations of 5 (blue line) and 10 (red line).

One standard deviation away from the mean in either direction on the horizontal axis (the red area on the above graph) accounts for about 68 percent of the people in this group. Two standard deviations away from the mean (the red and green areas) account for roughly 95 percent of the people. Three standard deviations (the red, green and blue areas) account for about 99.7 percent of the people.

The 68-95-99.7% Rule

All normal density curves satisfy the following property, which is often referred to as the Empirical Rule.
About 68% of the observations fall within 1 standard deviation of the mean, that is, between μ − σ and μ + σ.
About 95% of the observations fall within 2 standard deviations of the mean, that is, between μ − 2σ and μ + 2σ.
About 99.7% of the observations fall within 3 standard deviations of the mean, that is, between μ − 3σ and μ + 3σ.

Thus, for a normal distribution, almost all values lie within 3 standard deviations of the mean.

Z-Scores: Calculating how many standard deviations a score is from the mean.

To create a Z-score, subtract the mean from a raw score and divide by the standard deviation: $z = \frac{x - \mu}{\sigma}$. Z-scores reflect a score’s relationship to the rest of the scores:

−Z = below average
+Z = above average

The Z-score measures the number of standard deviations any point is away from the mean. By using the Z-score, any normally distributed data can be converted to the standard normal distribution (with µ = 0 and σ = 1). When Z-scores are used, another way to determine probabilities is to use the tables of areas under the standard normal curve. Z-scores, percentiles and cut-off scores are all useful techniques for analyzing normal distributions.
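The definitions in this chapter (sample and population standard deviation, Z-scores, and the 68-95-99.7% rule) can be tied together in a short sketch. It relies only on Python's standard `statistics` module. The list `scores` and the helper names `z_score` and `area_within` are hypothetical, invented for this illustration, and the cumulative distribution function stands in for the printed table of areas.

```python
from statistics import NormalDist, mean, pstdev, stdev

# Hypothetical scores, invented purely for illustration.
scores = [42, 47, 50, 50, 53, 58]

x_bar = mean(scores)         # sample mean
s = stdev(scores)            # sample standard deviation (divides by n - 1)
sigma_pop = pstdev(scores)   # population standard deviation (divides by N)
print(f"mean={x_bar}, s={s:.3f}, sigma={sigma_pop:.3f}")


def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    return (x - mu) / sigma


# Z-scores for a distribution with mean 50 and standard deviation 5.
print(z_score(60, 50, 5))   # 2.0: two standard deviations above the mean
print(z_score(45, 50, 5))   # -1.0: one standard deviation below the mean

standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Check the 68-95-99.7% rule for 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} sd: {area_within(k):.4f}")

# Table-style lookup: proportion of scores below 60 when mu = 50, sigma = 5.
print(f"P(X < 60) = {standard_normal.cdf(z_score(60, 50, 5)):.4f}")
```

The three areas come out close to 0.68, 0.95 and 0.997, which is where the rule gets its name. Because the raw score is converted to a Z-score first, the same standard normal curve serves for any mean and standard deviation.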