4.4 The Normal Distribution: Introduction

Introduction

In many cases, measurements tend to cluster near the mean, so that many points lie near the mean and fewer and fewer points lie farther from it. In these cases, the frequency distribution looks something like this: the frequency is high near the mean, and then the frequencies taper off on both sides.

Normal distribution: A distribution is a normal distribution if it is represented by an ideal bell-shaped curve. (Frequencies are highest in the middle and taper symmetrically as values move away from the center.)

Other types of distributions:
  Uniform
  Left-tailed
  Right-tailed

Normal Distribution

The normal distribution is an idealized bell-shaped curve that represents the frequency of the data versus the data value. The center of the curve is the mean μ, and the standard deviation is σ. A normal distribution describes an idealized case in which the mean, median, and mode are all equal and the frequencies on either side of the peak are perfectly symmetric. The tails of the curve extend forever, from -∞ to +∞.

  About 68.3% of the data points lie within 1 standard deviation of the mean.
  About 95.4% of the data points lie within 2 standard deviations of the mean.
  About 99.7% of the data points lie within 3 standard deviations of the mean.

Example: Book #10

A population is normally distributed with mean 18.9 and standard deviation 1.8.

a. Find the intervals representing 1, 2, and 3 standard deviations of the mean.
   1 st dev: [18.9-1.8, 18.9+1.8] = [17.1, 20.7]
   2 st dev: [18.9-2*1.8, 18.9+2*1.8] = [15.3, 22.5]
   3 st dev: [18.9-3*1.8, 18.9+3*1.8] = [13.5, 24.3]
b. What percentage of the data lies in each of the intervals in part (a)?
   About 68.3%, 95.4%, and 99.7%, respectively.
c. Draw a sketch of the bell curve.

The Shape of a Normal Distribution: Narrow or Spread-Out

The mean is the location of the highest frequency. The standard deviation determines how narrow or spread out the curve is: the larger the standard deviation, the more spread out the curve.
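The Book #10 example above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names sd_interval and prob_within are made up for illustration, and the percentages are computed with the error function math.erf rather than read from a table.

```python
import math

def sd_interval(mean, sd, k):
    """Return the interval [mean - k*sd, mean + k*sd]."""
    return (mean - k * sd, mean + k * sd)

def prob_within(k):
    """Probability that a normally distributed value lies within
    k standard deviations of the mean (computed via the error function)."""
    return math.erf(k / math.sqrt(2))

# Values from the Book #10 example: mean 18.9, standard deviation 1.8.
mean, sd = 18.9, 1.8
for k in (1, 2, 3):
    low, high = sd_interval(mean, sd, k)
    pct = 100 * prob_within(k)
    print(f"{k} st dev: [{low:.1f}, {high:.1f}]  about {pct:.1f}% of the data")
```

Note that prob_within depends only on k, not on the mean or the standard deviation: a wider curve (larger standard deviation) stretches the intervals but leaves the percentages unchanged.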
The Standard Normal Distribution (z-distribution)

Consider a normal distribution with mean μ and standard deviation σ. If the mean is 0 and the standard deviation is 1, we call it the standard normal curve, or the z-distribution.

Using the Z-Distribution To Determine Probabilities

The Area Under the Curve Represents the Probability of That Region

The area under the curve between points a and b corresponds to the probability that a randomly selected datapoint z will be between a and b.

(1) The total area under the curve is 1: the probability that z is anywhere between -∞ and +∞ is 1.
(2) The total area of the entire right side is .5: the probability that z >= 0 is .5.
(3) The total area of the entire left side is .5: the probability that z <= 0 is .5.

For other regions, we need to use a body table to determine the probability.

Worksheet: Normal Distributions

Any Normal Distribution Can Be Converted to a Z-Distribution

We convert to the standard normal curve by taking each datapoint x and normalizing it as z = (x - μ) / σ.

Example: D = {8, 10, 12}
  mean = 30/3 = 10
  st dev = 2 (sample standard deviation)
  x=8:  z = (8-10)/2 = -1
  x=10: z = (10-10)/2 = 0
  x=12: z = (12-10)/2 = 1
  D2 = {-1, 0, 1}
  mean = 0
  st dev = 1

Thus, to study the normal distribution, you only have to study the standard normal distribution, in which the mean is 0 and the standard deviation is 1.

Technical details: Strictly speaking, the normal distribution describes the frequency of a continuous rather than a discrete variable.
  Continuous variable: measurements (heights of students in a class)
  Discrete variable: whole numbers (number of heads when flipping coins)
Distributions for discrete variables can approximate normal distributions. However, a normal curve is an ideal, perfectly smooth curve, while discrete variables result in jagged edges.

Finding Probabilities of Regions under the Standard Normal Curve

1. Shade in the region on a graph of the z-distribution, and identify the sub-regions.
2. Calculate the probability of each sub-region and add the shaded probabilities together.
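The standardization example with D = {8, 10, 12} can be reproduced in a few lines. This is a minimal Python sketch; the function name z_scores is made up for illustration, and it assumes the sample standard deviation (dividing by n - 1), which is the convention that gives st dev = 2 for D.

```python
import statistics

def z_scores(data):
    """Convert each datapoint x to z = (x - mean) / st dev."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    return [(x - mean) / sd for x in data]

D = [8, 10, 12]
D2 = z_scores(D)

print("D2 =", D2)
print("mean of D2 =", statistics.mean(D2))
print("st dev of D2 =", statistics.stdev(D2))
```

Whatever the original mean and standard deviation, the converted data set has mean 0 and standard deviation 1, which is why a single table for the z-distribution is enough.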
The Probabilities of 3 Types of Regions: Bodies, Tails, Body and Tail Regions

Body Region of Length z: A body region is [0,z] (the interval from 0 to z under the bell curve). You will need to be able to use a body table to determine the probability that a variable is between 0 and z.

Tail Region of Body Region of Length z: A tail region is the region on the same side of the curve that is not included in the body region (the interval from z to ∞ under the bell curve).

Body and Tail Region: Together, the body and the tail make up half the area under the bell curve, which has probability .5.

The body table for the standard normal distribution provides the probability for body regions of length z only. To calculate a tail region, subtract the body region from the body and tail region: tail = .5 - body.

You need to be able to use a body table to determine the probability that a value will lie in an interval [a,b].
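The relationships among body, tail, and interval regions can be sketched in Python. This is an illustration only, not the table-lookup method used in class: the function names body, tail, and between are made up, and math.erf stands in for the body table. It assumes the standard normal distribution (mean 0, standard deviation 1).

```python
import math

def body(z):
    """P(0 <= Z <= z) for z >= 0: the value a body table lists."""
    return 0.5 * math.erf(z / math.sqrt(2))

def tail(z):
    """P(Z >= z) for z >= 0: body-and-tail region (.5) minus the body."""
    return 0.5 - body(z)

def between(a, b):
    """P(a <= Z <= b), built from body regions using symmetry."""
    def signed_body(z):
        # A body region on the left side has the same area as its mirror
        # image on the right, but counts negatively when measured from 0.
        return body(z) if z >= 0 else -body(-z)
    return signed_body(b) - signed_body(a)

print(f"body(1)        = {body(1):.4f}")
print(f"tail(1)        = {tail(1):.4f}")
print(f"between(-1, 2) = {between(-1, 2):.4f}")
```

For an interval that straddles 0, such as [-1, 2], between adds the two body regions; for an interval on one side of 0, such as [1, 2], it subtracts the smaller body region from the larger one.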