Random Variables – Continuous

How Can We Find Probabilities for Bell-Shaped Distributions?

Normal Distribution
The normal distribution is symmetric, bell-shaped, and characterized by its mean µ and standard deviation σ. The probability of falling within any particular number of standard deviations of µ is the same for all normal distributions.

The Z-Score
Recall: The z-score for an observation is the number of standard deviations that it falls from the mean.
NOTE: Because the mean and median coincide in a normal distribution (and are roughly equal for data that are approximately normal), an observation is about the same distance from the median as from the mean.

For each fixed number z, the probability within z standard deviations of the mean is the area under the normal curve between µ – zσ and µ + zσ.
For z = 1: About 68% of the area (probability) of a normal distribution falls within +/- 1 standard deviation of the mean.
For z = 2: About 95% of the area (probability) of a normal distribution falls within +/- 2 standard deviations of the mean.
For z = 3: Nearly 100% (about 99.7%) of the area (probability) of a normal distribution falls within +/- 3 standard deviations of the mean.

The Normal Distribution: The Most Important One in Statistics
It’s important because…
•Many variables have approximately normal distributions.
•It’s used to approximate many discrete distributions.
•Many statistical methods use the normal distribution even when the data are not bell-shaped.

Finding Normal Probabilities for Various Z-values - Table A1
This table is called a standard normal table. By converting the normally distributed X-variable to a z-score, z = (x – µ)/σ, we have “standardized” the data to have a common mean of 0 and standard deviation of 1.

What about percentiles?
Percentiles represent the percentage of all scores or values at or below a particular score or value. For example, suppose that on an exam with approximately normally distributed scores you scored a 91, which was in the 85th percentile.
This would mean that 85% of the exam scores were equal to or below your score of 91. This would also translate to your score being in the top 15%.

Examples
The heights of U.S. adult females and males are approximately normally distributed (per the Centers for Disease Control and Prevention) with the following statistics:
Females: Mean 64 inches and SD of 2 inches
Males: Mean 69 inches and SD of 3 inches

1. Based on your height and gender, find your z-score.
Me: Z = (72 – 69)/3 = 3/3 = 1

2. Using Table A1 and your z-score, what percentile do you fall in? (Find the cumulative probability in the table for your z-score, then multiply by 100%.)
The percentile would be 0.8413 * 100% = 84.13%.
Interpretation: my height of 72 inches falls approximately in the 84th percentile, meaning that about 84% of U.S. adult male heights are at or below my height of 72 inches.

3. Currently the average height of an NBA (National Basketball Association) player is about 77 inches (6 foot 5 inches). What do you think the probability is of meeting a U.S. adult male of this height or taller?
Asked to find P(X > 77) = P(Z > (77 – 69)/3) = P(Z > 8/3) = P(Z > 2.67)
Using Table A1, P(Z > 2.67) = 1 – P(Z < 2.67) = 1 – 0.9962 = 0.0038
Therefore, the probability of meeting a U.S. adult male who is 77 inches or taller is 0.0038, or about a 0.38% chance.

4. What do you think the probability is of meeting a U.S. adult female of this height or taller?
Asked to find P(X > 77) = P(Z > (77 – 64)/2) = P(Z > 13/2) = P(Z > 6.5)
Table A1 does not go as high as 6.5; the closest Z it lists is 3.49. Using 3.49, P(Z > 3.49) = 1 – P(Z < 3.49) = 1 – 0.9998 = 0.0002
Because 6.5 lies well beyond 3.49, the probability of meeting a U.S. adult female who is 77 inches or taller is less than 0.0002, that is, less than a 0.02% chance. The true value is far smaller.

5. Finding an observed score for a specific percentile or probability.
a. First, find the z-score associated with the percentile or cumulative probability.
b. Input this z-score, SD, and mean into the following equation to find the observed score: Observed = Z*SD + mean

EXAMPLE: What height would put you in the “bottom 10%”? The “top 5%”?

Bottom 10%:
a. Find the z-score for the 10th percentile, or cumulative probability of 0.1000. From the table this corresponds to a Z of -1.28.
b. For women, the height for the bottom 10% would be: Observed = (-1.28)*(2) + 64 = -2.56 + 64 = 61.44 inches
For men, the height for the bottom 10% would be: Observed = (-1.28)*(3) + 69 = -3.84 + 69 = 65.16 inches

Top 5%:
a. Find the z-score for the 95th percentile, or cumulative probability of 0.9500, as this corresponds to the “top 5%”. From the table this corresponds to a Z of 1.65.
b. For women, the height for the top 5% would be: Observed = (1.65)*(2) + 64 = 3.30 + 64 = 67.30 inches
For men, the height for the top 5% would be: Observed = (1.65)*(3) + 69 = 4.95 + 69 = 73.95 inches

SPECIAL NOTES:
Note that z-scores can be negative or positive depending on whether the observed x-value is below the mean (resulting in a negative z-score) or above the mean (producing a positive z-score).
Note that since we are talking about a continuous variable, unlike the discrete case there is no distinction between < and ≤. If you have had calculus, the reason is that the only difference between the two is that ≤ also includes the exact point, i.e. “equal to”. When finding these probabilities we are finding the area under a curve between two values (in the case of Table A1, from negative infinity to the z-score). The “equal to” part asks for the area under the curve at a single point, which from calculus is simply zero. So including the “equal sign” adds nothing.

1. Comparing two scores
Nikki Green and Jermaine Marshall are members of PSU’s women’s and men’s basketball teams. Both are six-foot-four, or 76 inches. Relative to their gender, who is taller?
For Jermaine, his z-score is (76 – 69)/3 = 2.33, and for Nikki her z-score is (76 – 64)/2 = 6.00. Comparatively, Jermaine’s height is 2.33 standard deviations above the male mean while Nikki’s is 6 standard deviations above the female mean. Relatively speaking, Nikki is taller!

2. If time, show how to do previous examples using Minitab.
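
The worked examples above can also be checked in software. The sketch below is not the Minitab procedure the notes mention. It is one possible check that assumes Python 3.8 or later and its standard-library statistics.NormalDist class. The helper names (z_score, percentile, upper_tail, observed) are illustrative and not part of the original notes.

```python
from statistics import NormalDist

# Height models taken from the examples above (inches).
FEMALE = NormalDist(mu=64, sigma=2)
MALE = NormalDist(mu=69, sigma=3)
STANDARD = NormalDist()  # mean 0, SD 1: the distribution tabulated in Table A1


def z_score(x, dist):
    """Number of standard deviations that x lies from the mean of dist."""
    return (x - dist.mean) / dist.stdev


def percentile(x, dist):
    """Cumulative probability P(X <= x), expressed as a percentage."""
    return 100 * dist.cdf(x)


def upper_tail(x, dist):
    """P(X > x), the area under the curve to the right of x."""
    return 1 - dist.cdf(x)


def observed(p, dist):
    """Value with cumulative probability p, via Observed = Z*SD + mean."""
    z = STANDARD.inv_cdf(p)
    return z * dist.stdev + dist.mean


# Examples 1-5 and the z-score comparison, in order.
print(z_score(72, MALE))        # z-score for a 72-inch male
print(percentile(72, MALE))     # percentile for that height
print(upper_tail(77, MALE))     # P(X > 77) for males
print(upper_tail(77, FEMALE))   # P(X > 77) for females (beyond Table A1)
print(observed(0.10, FEMALE))   # bottom-10% cutoff for females
print(observed(0.95, MALE))     # top-5% cutoff for males
print(z_score(76, MALE), z_score(76, FEMALE))  # Jermaine vs. Nikki
```

Software uses unrounded z-scores, so results can differ slightly from the table-based answers. For instance, the top-5% cutoff for men comes out a few hundredths of an inch below 73.95 because the exact z is closer to 1.645 than 1.65. The female tail probability at 77 inches, which Table A1 cannot reach, is computed directly and is far below 0.0002.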