7 - THE NORMAL DISTRIBUTION

Examples: Alpha fetoprotein levels of mothers carrying a foetus with spina bifida.

The smooth, bell-shaped, symmetric curve is called the Normal p.d.f. curve, or just the Normal curve.

[Figure: Normal curve, symmetric about the mean, with 50% of the area on each side of the mean.]

If a random variable, X, has a Normal distribution with mean μ and standard deviation σ, we write: X ~ N(μ, σ)

The Normal distribution is important because:
- it fits a lot of data reasonably well;
- it can be used to approximate other distributions;
- it is an important assumption in statistical inference (see later work).

The shape is determined solely by two parameters, μ and σ: the population mean μ controls where the normal curve is centered, and the population standard deviation σ controls the spread about μ.

Example: Alpha fetoprotein levels found in the urine of mothers carrying a foetus with spina bifida.

Let X = alpha fetoprotein (AFP) level in the urine of a mother carrying a foetus with spina bifida. We will assume that alpha fetoprotein levels have a normal distribution.

The sample mean AFP level is x̄ = 22.71 moles/liter and the sample standard deviation is s = 3.92 moles/liter. These are the sample-based estimates of _____ and ______ respectively.

Approximately _______% of the mothers in this population will have AFP levels within 1 standard deviation of the mean, i.e. we estimate that approximately ________% of this population of mothers will have AFP levels:
between __________________ and _____________________ = between __________ and ___________

Diagram here:

Approximately _______% of the mothers in this population will have AFP levels within 2 standard deviations of the mean, i.e. we estimate that approximately ________% of this population of mothers will have AFP levels:
between _________ and __________ = between __________ and ___________

Approximately _______% of the mothers in this population will have AFP levels within 3 standard deviations of the mean, i.e.
we estimate that approximately ________% of this population of mothers will have AFP levels:
between _________ and __________ = between __________ and ___________

For the Normal Distribution:

A random observation has approximately:
- 68% chance of falling within 1σ of μ;
- 95% chance of falling within 2σ of μ;
- 99.7% chance of falling within 3σ of μ.

or

In a normal distribution, approximately:
- 68% of observations are within 1σ of μ;
- 95% of observations are within 2σ of μ;
- 99.7% of observations are within 3σ of μ.

OBTAINING PROBABILITIES

To find probabilities associated with a normal distribution with mean μ and standard deviation σ, we need a mechanism for finding areas beneath the normal curve. Because there are infinitely many means and standard deviations we might be interested in, we need a standard process by which we can find areas associated with any normal distribution!

The Standard Normal Distribution and Using the Standard Normal Table

Fact: If X ~ N(μ, σ) and we define a new random variable Z = (X − μ)/σ, then Z ~ N(0, 1), i.e. we create a new random variable Z whose observed values are the z-scores for the random variable X.

Recall that the process of converting a random variable X to z-scores is called standardization. Once standardized, we can find probabilities/areas of interest using a standard normal table.

The standard normal table in the appendix of most texts gives P(Z < z), i.e. lower-tail probabilities for a standard normal distribution (shaded). We can also use the Normal Probability Calculator in JMP in the Tutorials section of the website.

[Figure: standard normal curve centered at 0, with the area to the left of z shaded. Most tables give shaded area = P(Z < z).]

Basic method for obtaining probabilities
1. Sketch a Normal curve, marking on the mean and values of interest.
2. Shade the area under the curve corresponding to the required probability.
3. Convert all values in original scale to their corresponding z-scores.
4. Obtain the desired probability from the lower-tail areas provided by a standard normal table.
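Steps 3 and 4 of the basic method can also be sketched in code. The following is a minimal illustration, assuming Python 3.8+ and its standard-library `statistics.NormalDist` in place of a printed table. The function names and the values μ = 100, σ = 15 are hypothetical and not taken from these notes.

```python
from statistics import NormalDist

# Standard normal distribution, Z ~ N(0, 1); plays the role of the printed table.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def z_score(x, mu, sigma):
    """Step 3: convert a value on the original scale to its z-score."""
    return (x - mu) / sigma

def lower_tail(x, mu, sigma):
    """Step 4: P(X < x) for X ~ N(mu, sigma), found as P(Z < z)."""
    return STANDARD_NORMAL.cdf(z_score(x, mu, sigma))

def upper_tail(x, mu, sigma):
    """P(X > x), the complement of the lower-tail area."""
    return 1.0 - lower_tail(x, mu, sigma)

def between(a, b, mu, sigma):
    """P(a < X < b), the difference of two lower-tail areas."""
    return lower_tail(b, mu, sigma) - lower_tail(a, mu, sigma)

if __name__ == "__main__":
    mu, sigma = 100.0, 15.0  # hypothetical values, for illustration only
    print(z_score(115.0, mu, sigma))                   # 1.0
    print(round(between(85.0, 115.0, mu, sigma), 4))   # area within 1 sigma of mu
    print(round(upper_tail(130.0, mu, sigma), 4))      # area more than 2 sigma above mu
```

Upper-tail and "between" probabilities are both built from lower-tail areas, which mirrors how a table of P(Z < z) is used by hand.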
Z = standard normal random variable, Z ~ Normal(0, 1).

Find the following standard normal probabilities:
a) P(Z > 2.25)
b) P(Z < 1.28)
c) P(Z > .50)
d) P(Z < -2.33)
e) P(-1.96 < Z < 1.96)
h) Find z so that P(Z < z) = .90, i.e. what is the 90th percentile of the standard normal distribution?

Spina Bifida Example (continued)

X = AFP level of a randomly selected mother carrying a foetus with spina bifida.
Let's assume that X ~ Normal(μ = 23.05, σ = 4.08), using the sample mean and sample standard deviation.

Find the following:
a) P(X < 15.00) =
b) P(X < 20.00)
c) P(20.00 < X < 25.00)
d) P(X > 30.00)
e) Find the 90th percentile.
f) Find the 25th percentile.

Original Problem: Spina Bifida Diagnosis

[Figure: two Normal curves, centered at 15.73 and 23.05.]

Recall: For normal foetuses μ = 15.73, σ = 0.72, and for foetuses with spina bifida μ = 23.05 and σ = 4.08. Assume the threshold for detecting spina bifida is set at 17.8. (A foetus would be diagnosed as not having spina bifida if the fetoprotein level is below 17.8.)

a) What is the probability that a foetus suffering from spina bifida is correctly diagnosed? Incorrectly diagnosed?
b) What is the probability that a foetus without spina bifida is correctly diagnosed? Incorrectly diagnosed?
c) If they wanted to ensure that 99% of foetuses with spina bifida were correctly diagnosed, at what level should they set the threshold T?

Standard Normal Table – P(Z < z)
Table for negative z-scores, i.e. z < 0

Standard Normal Table – P(Z < z)
Table for positive z-scores, i.e. z > 0
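Where a printed table is not at hand, the lower-tail areas it tabulates, and the reverse lookup used for percentiles, can be approximated numerically. This is a minimal sketch, assuming Python 3.8+ and the standard-library `statistics.NormalDist`. The function names and the values μ = 100, σ = 15 are illustrative assumptions, not part of these notes.

```python
from statistics import NormalDist

# Z ~ N(0, 1): the distribution tabulated in a standard normal table.
Z = NormalDist(mu=0.0, sigma=1.0)

def table_entry(z):
    """Lower-tail area P(Z < z), rounded to 4 decimals as printed tables usually are."""
    return round(Z.cdf(z), 4)

def z_for_area(p):
    """Reverse lookup: the z-score with P(Z < z) = p, for 0 < p < 1."""
    return Z.inv_cdf(p)

def percentile(p, mu, sigma):
    """Value x with P(X < x) = p for X ~ N(mu, sigma): un-standardize z via x = mu + z * sigma."""
    return mu + z_for_area(p) * sigma

if __name__ == "__main__":
    print(table_entry(-1.00))  # 0.1587 (negative z-score table)
    print(table_entry(0.00))   # 0.5
    print(table_entry(1.00))   # 0.8413 (positive z-score table)
    # Hypothetical mu and sigma, for illustration only.
    print(round(percentile(0.95, 100.0, 15.0), 2))
```

The percentile function reverses standardization: it finds the z-score for the required lower-tail area, then converts back to the original scale.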