Random Variables – Continuous

How Can We Find Probabilities for Bell-Shaped Distributions?

Normal Distribution

The normal distribution is symmetric, bell-shaped, and characterized by its mean µ and standard deviation σ. The probability of falling within any particular number of standard deviations of µ is the same for all normal distributions.

The Z-Score

Recall: The z-score for an observation is the number of standard deviations that it falls from the mean, z = (x – µ)/σ. For each fixed number z, the probability within z standard deviations of the mean is the area under the normal curve between µ – zσ and µ + zσ.

For z = 1: about 68% of the area (probability) of a normal distribution falls within 1 standard deviation of the mean.
For z = 2: about 95% of the area (probability) of a normal distribution falls within 2 standard deviations of the mean.
For z = 3: nearly 100% of the area (probability) of a normal distribution falls within 3 standard deviations of the mean.

The Normal Distribution: The Most Important One in Statistics

It’s important because…
•Many variables have approximately normal distributions.
•It’s used to approximate many discrete distributions.
•Many statistical methods use the normal distribution even when the data are not bell-shaped.

Finding Normal Probabilities for Various Z-values - Table A1

This table is called a standard normal table. By converting the normally distributed X-variable to a z-score we have “standardized” the data to have a common mean of 0 and standard deviation of 1. Give examples of finding probabilities for various z-scores.

Example: In the U.S., the heights of adult males and females are both approximately normal. The mean height for males is 70 inches with a standard deviation of 4 inches, while the mean height for females is 65 inches with a standard deviation of 3.5 inches. What is the probability that a randomly selected male would be:

1. Less than 70 inches?
2. Less than 65 inches?
3. More than 68 inches?
4. Between 68 and 73 inches?

Answers:

1. z-score = (70 – 70)/4 = 0 and P(Z < 0) = 0.5000
2. z-score = (65 – 70)/4 = -5/4 = -1.25 and P(Z < -1.25) = 0.1056
3. z-score = (68 – 70)/4 = -0.5 and P(Z > -0.5) = 1 – P(Z < -0.5) = 1 – 0.3085 = 0.6915
4. The z-score for 68 is -0.5, and for 73 we have (73 – 70)/4 = 0.75, so P(-0.5 < Z < 0.75) = 0.7734 – 0.3085 = 0.4649

What about percentiles?

Percentiles represent the percentage of all scores or values at or below a particular score or value. For example, suppose that on an exam with approximately normal scores you scored a 91, which was in the 85th percentile. This would mean that 85% of the exam scores were equal to or below your score of 91. It would also mean that your score was in the top 15%.

So what height for males would put one in the 85th percentile? First find in the table the cumulative probability closest to 0.8500, and then the z-score that corresponds to this cumulative probability. In the table the closest is 0.8508, which corresponds to a z-score of 1.04. Now find the observed score by:

Observed Score = z*σ + µ = 1.04*4 + 70 = 4.16 + 70 = 74.16 inches, or a little over six feet two inches.

Who is taller, relatively speaking: Nikki Green or Jermaine Marshall, both listed as six-four, or 76 inches? For Jermaine, his z-score is (76 – 70)/4 = 1.5, and for Nikki her z-score is (76 – 65)/3.5, which equals 3.14. Comparatively, Jermaine’s height is 1.5 standard deviations above the mean for males, but Nikki’s is 3.14 standard deviations above the mean for females. Relatively speaking, Nikki is taller!

How unlikely is a female to be six-four or taller? P(Z > 3.14) = 1 – P(Z < 3.14) = 1 – 0.9992 = 0.0008. Pretty unlikely!
As for Jermaine, the probability a male is six-four or taller is P(Z > 1.5) = 1 – 0.9332 = 0.0668.

SPECIAL NOTES:

Note that z-scores can be negative or positive depending on whether the observed x-value is below the mean (resulting in a negative z-score) or above the mean (producing a positive z-score).

Note that since we are talking about a continuous variable, unlike the discrete case there is no distinction between < and ≤. If you have had calculus, the reason is that the only difference between the two is that in the latter instance we also include that exact point, i.e. “equal to”. In essence, when finding these probabilities we are finding the area under a curve between two values (in the case of Table A1 this is from negative infinity to the z-score). But “equal to” asks for the area under the curve at a single point, which from calculus is simply zero. So including the “equal sign” adds nothing.

Comparing two scores

When you applied to college, you scored 650 on the math section of the SAT, which had mean µ = 500 and standard deviation σ = 100. Your friend took the comparable ACT, scoring 30. That year, the ACT had µ = 21.0 and σ = 4.7. How can we tell who did better?

What is the z-score for your SAT score of 650? For the SAT scores, µ = 500 and σ = 100, so z-score = (650 – 500)/100 = 1.50, and from before we know that P(Z < 1.50) = 0.9332, which is about the 93rd percentile.

What is the z-score for your friend’s ACT score of 30? The ACT scores had a mean of 21 and a standard deviation of 4.7, so Z = (30 – 21)/4.7 = 1.91. The cumulative probability is P(Z < 1.91) = 0.9719, so your friend scored in roughly the 97th percentile. Relatively speaking, your friend did better.

If time, show how to do previous examples using Minitab.
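As an alternative to Table A1 or Minitab, the previous examples can also be checked in software. The following is a minimal sketch in Python, which is an assumption of this note rather than the course's tool. It uses the standard library's statistics.NormalDist, and the variable names are illustrative only. Because the software works with unrounded z-scores, its results can differ from the table answers in the third or fourth decimal place.

```python
from statistics import NormalDist

# Parameters taken from the height example (inches).
male = NormalDist(mu=70, sigma=4)
female = NormalDist(mu=65, sigma=3.5)

# Questions 1-4: cdf(x) is the cumulative probability P(X < x),
# the same quantity that the standard normal table reports for a z-score.
p_less_70 = male.cdf(70)
p_less_65 = male.cdf(65)
p_more_68 = 1 - male.cdf(68)
p_68_to_73 = male.cdf(73) - male.cdf(68)

# 85th percentile for male height: inv_cdf works backwards from a
# cumulative probability to the observed value (z*sigma + mu).
height_85 = male.inv_cdf(0.85)

# Probability of being six-four (76 inches) or taller.
p_male_76 = 1 - male.cdf(76)
p_female_76 = 1 - female.cdf(76)

# Comparing two scores: SAT 650 versus ACT 30.
sat = NormalDist(mu=500, sigma=100)
act = NormalDist(mu=21.0, sigma=4.7)
z_sat = (650 - sat.mean) / sat.stdev
z_act = (30 - act.mean) / act.stdev
pct_sat = sat.cdf(650)
pct_act = act.cdf(30)

print(f"P(X < 70)      = {p_less_70:.4f}")
print(f"P(X < 65)      = {p_less_65:.4f}")
print(f"P(X > 68)      = {p_more_68:.4f}")
print(f"P(68 < X < 73) = {p_68_to_73:.4f}")
print(f"85th percentile height = {height_85:.2f} inches")
print(f"P(male >= 76)   = {p_male_76:.4f}")
print(f"P(female >= 76) = {p_female_76:.4f}")
print(f"SAT: z = {z_sat:.2f}, cumulative probability = {pct_sat:.4f}")
print(f"ACT: z = {z_act:.2f}, cumulative probability = {pct_act:.4f}")
```

Computing the probability directly from the unstandardized distribution (mean 70, standard deviation 4) gives the same answer as standardizing first and looking up the z-score, since the z-score conversion only rescales the variable.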