3 Normal Distriubtio..

CHAPTER 3
CONTINUOUS RANDOM VARIABLES AND THE NORMAL DISTRIBUTION

1. Difference Between Discrete and Continuous Random Variables
   1.1. Discrete Random Variables
   1.2. Continuous Random Variables
2. The Normal Distribution
   2.1. Finding Probabilities in a Normal Distribution
      2.1.1. Using the Standard Normal Distribution to Find Probability of x
      2.1.2. The Standard Normal Table (z-Table)
   2.2. Finding the x Value for a Given Probability
   2.3. z scores Frequently Used in Inferential Statistics

Random variables are divided into two general categories: discrete random variables and continuous random variables.

1. Difference Between Discrete and Continuous Random Variables

The difference between the probability distributions of discrete and continuous random variables can be explained by the way they are presented graphically.

1.1. Discrete Random Variables

In the case of guessing the answers to the 5-question multiple choice exam and letting x denote the number of correct answers guessed, the probability distribution of x is as follows:

x      0       1       2       3       4       5       Total
f(x)   0.2373  0.3955  0.2637  0.0879  0.0146  0.0010  1.0000

Since x is discrete, it takes on discrete values and the probability of each value is specified. If you can enumerate or list every value of the random variable, and each value has a non-zero probability, then you have a discrete random variable.

Note: Some textbooks use the notation P(x) in place of f(x). The term f(x) is called a probability density function. (For a discrete random variable, many textbooks call f(x) a probability mass function instead.) Probability density is the height of the graph at a given value of the random variable x. In the chart below, for example, f(2) = 0.2637.

[Figure: Probability Density Function of a Discrete Random Variable (bar chart of f(x) against x for x = 0 to 5)]

1.2. Continuous Random Variables

Unlike a discrete random variable, you cannot enumerate or list all the values of a continuous random variable. A continuous random variable takes on an infinite number of values within an interval. Graphically, the distinguishing features of a continuous random variable are:

• The density function of a continuous random variable, f(x), is not represented by a bar graph. It is, rather, shown as a continuous graph, a smooth curve.
• Because probability cannot be defined for a single value of x, the height of f(x) at a given value of x does not represent the probability of that value. The continuous random variable can take on any of the infinite number of values in a given range. Thus, informally, the probability that x will be equal to a single value is 1/∞, which is 0.
• Probability is defined, instead, for an interval of x and is represented by the area under f(x) bounded by that interval.

2. The Normal Distribution

The normal distribution is the most familiar continuous distribution. It is a family of continuous bell-shaped distributions. Each particular normal distribution is defined by two summary characteristics: μ, the mean, and σ, the standard deviation. μ and σ are called the parameters of the distribution. The shorthand expression of the normal distribution is as follows:

x ~ N(μ, σ)

The above expression is read: "x is normally distributed with a mean of μ and a standard deviation of σ."

The mathematical formula for the probability density function of x ~ N(μ, σ) is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

Plotting this function for fixed values of μ and σ and different values of x generates a bell-shaped curve that is symmetrical about the mean μ.

The following diagram shows the graphs of four normal curves, four members of the infinite family of normal curves. Two curves, with means μ₁ = 10 and μ₂ = 20, have different means but the same standard deviation.
The other two share the same mean (μ₃ = 40) but have different standard deviations. Note that the smaller the standard deviation σ is, the narrower the curve. This shows that with a small σ, the values of the random variable x are more clustered around the mean.

[Figure: Four Members of the Family of Normal Curves (curves A, B, C, and D, centered at 10, 20, and 40)]

The total area under each normal curve is equal to 1. The probability of x taking on values within a given interval, say x₁ ≤ x ≤ x₂, is the area under the curve bounded by x₁ and x₂, as shown in the following diagram. Note that P(x = x₁) = P(x = x₂) = 0, that is, the probability of x being equal to a given value is zero. Therefore, P(x₁ < x < x₂) = P(x₁ ≤ x ≤ x₂).

[Figure: Probability is measured as the area under the curve for an interval of values of x; the shaded area between x₁ and x₂ is P(x₁ ≤ x ≤ x₂)]

2.1. Finding Probabilities in a Normal Distribution

Consider a normally distributed random variable x with μ = 10 and σ = 2. What is the probability that x is less than or equal to 8? Alternatively stated, what proportion of the x values are less than or equal to 8?

Find P(x ≤ 8). That is, find the area to the left of x = 8 under the normal curve with a mean of 10 and a standard deviation of 2.

[Figure: Probability of x less than or equal to 8 is the area under the curve to the left of x = 8, for μ = 10 and σ = 2]

To find this probability, first you have to convert x into the standard normal random variable z.

2.1.1. Using the Standard Normal Distribution to Find Probability of x

Unlike the binomial distribution, there is no simple formula for finding probabilities involving the normal distribution. In the absence of a computer you must rely on a table. To use this table you should first transform x into another random variable called the standard normal random variable, denoted by z.
The familiar formula to transform x into z is

$$z = \frac{x - \mu}{\sigma}$$

In Chapter 1 it was shown that z measures the deviation of x values from the mean in units of standard deviation. Thus, when z = 2, x is two standard deviations above the mean: x = μ + 2σ. And, when z = −2, x is two standard deviations below the mean: x = μ − 2σ. This leads us to two important properties of z:

μ_z = 0 and σ_z = 1 (see Footnote 1 below)

Footnote 1. Show that μ_z = 0 and σ_z = 1.
First, rewrite the z equation as

$$z = \frac{x - \mu}{\sigma} = -\frac{\mu}{\sigma} + \frac{1}{\sigma}x$$

Note that z is a linear transformation of the random variable x, z = a + bx, where a = −μ/σ and b = 1/σ. Using the arithmetic properties of expected value and standard deviation of the random variable x,

$$E(z) = -\frac{\mu}{\sigma} + \frac{1}{\sigma}E(x) = -\frac{\mu}{\sigma} + \frac{1}{\sigma}\mu = 0$$

$$\mathrm{sd}(z) = \frac{1}{\sigma}\,\mathrm{sd}(x) = \frac{1}{\sigma}\,\sigma = 1$$

In short, if x is normally distributed with a mean of μ and a standard deviation of σ, then z is also normally distributed but with a mean of 0 and a standard deviation of 1. These properties of z allow for the development of a single probability table that can be used to find all normal probabilities.

Example 1
Let x be a normally distributed random variable with a mean of μ = 10 and a standard deviation of σ = 2. Find P(x ≤ 8).

First transform x to z. Given μ = 10 and σ = 2:

$$z = \frac{x - \mu}{\sigma} = \frac{8 - 10}{2} = -1.00$$

Thus,

P(x ≤ 8) = P(z ≤ −1.00)

Now you need to find the area under the z curve to the left of z = −1.00. Using the z table you can see that this area is equal to 0.1587. Finding this value is explained below.

The mean of the standard normal distribution z is always μ = 0 and the standard deviation is always σ = 1. The probability of z less than or equal to −1, P(z ≤ −1), is the area under the z curve to the left of z = −1.

[Figure: standard normal curve with μ = 0 and σ = 1; the shaded area to the left of z = −1 is P(z ≤ −1)]

2.1.2. The Standard Normal Table (z Table)

The z table (see Footnote 2) provides the area under the z curve to the left of the z score. For example, if z = −1.00, then the table provides the area under the curve to the left of −1.00.
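
The left-tail areas listed in a z table can also be computed directly. The following sketch is not part of the original notes. It assumes Python and relies on the standard identity Φ(z) = ½[1 + erf(z/√2)], using math.erf from the standard library; the function name phi is an illustrative choice.

```python
import math

def phi(z):
    """Area under the standard normal curve to the left of z, P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Print a few left-tail areas, rounded to four decimals as in a z table.
for z in (-1.50, -1.00, -0.50, 0.00):
    print(f"{z:5.2f}  {phi(z):.4f}")

# Example 1: P(x <= 8) for mu = 10 and sigma = 2, after converting x to z.
mu, sigma = 10, 2
z = (8 - mu) / sigma
print(round(phi(z), 4))
```

Rounded to four decimals, these areas should agree with the corresponding z-table entries, including the 0.1587 found in Example 1.
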
A portion of the z-table is reproduced below. The complete table shows the areas for z values ranging from −4.00 to 3.99. As the table shows, P(z < −1.00) = 0.1587.

Footnote 2. The z table is available on my website in E270 LECTURE NOTES/2 EXCEL FILES/3 Normal Distribution. Click on the tab “z table”.

    z     P(Z < z)      z     P(Z < z)      z     P(Z < z)      z     P(Z < z)
  -1.50    0.0668     -1.00    0.1587     -0.50    0.3085      0.00    0.5000
  -1.49    0.0681     -0.99    0.1611     -0.49    0.3121      0.01    0.5040
  -1.48    0.0694     -0.98    0.1635     -0.48    0.3156      0.02    0.5080
  -1.47    0.0708     -0.97    0.1660     -0.47    0.3192      0.03    0.5120
  -1.46    0.0721     -0.96    0.1685     -0.46    0.3228      0.04    0.5160
  -1.45    0.0735     -0.95    0.1711     -0.45    0.3264      0.05    0.5199
  -1.44    0.0749     -0.94    0.1736     -0.44    0.3300      0.06    0.5239
  -1.43    0.0764     -0.93    0.1762     -0.43    0.3336      0.07    0.5279

Example 2
Find P(z < 1.00).

From the table, P(z < 1.00) = 0.8413.

[Figure: P(z ≤ 1.00) = 0.8413, the area under the z curve to the left of z = 1.00]

Example 3
Find P(z > 1.25).

From the complete z table, P(z < 1.25) = 0.8944. Therefore,

P(z > 1.25) = 1 − 0.8944 = 0.1056

[Figure: P(z ≥ 1.25) = 1 − 0.8944 = 0.1056; the area to the left of z = 1.25 is 0.8944 and the area to the right is 0.1056]

Note: Since the standard normal distribution is symmetric about the mean 0, when asked to find the area to the right of a given z score, rather than finding the area to the left and subtracting the result from 1, you can directly find the area by using the negative of that z score. Thus,

P(z > 1.25) = P(z < −1.25) = 0.1056

[Figure: P(z ≥ 1.25) = P(z ≤ −1.25) = 0.1056, the area to the left of z = −1.25]

Example 4
Find P(−1.40 < z < 1.40).

From the complete z table, P(z < 1.40) = 0.9192 and P(z < −1.40) = 0.0808. Therefore,

P(−1.40 < z < 1.40) = P(z < 1.40) − P(z < −1.40) = 0.9192 − 0.0808 = 0.8384

[Figure: z curve with an area of 0.0808 to the left of z = −1.4 and an area of 0.9192 to the left of z = 1.4]

Example 5
Let x be a normally distributed random variable with a mean of μ = 10 and a standard deviation of σ = 2. Find P(8 < x < 12).
μ = 10, σ = 2

$$z_1 = \frac{x - \mu}{\sigma} = \frac{8 - 10}{2} = -1.00 \quad\text{and}\quad z_2 = \frac{12 - 10}{2} = 1.00$$

Thus, for the given mean and standard deviation of x,

P(8 < x < 12) = P(−1.00 < z < 1.00)
P(8 < x < 12) = P(z < 1.00) − P(z < −1.00)
P(8 < x < 12) = 0.8413 − 0.1587 = 0.6826

Note: When asked to find the area between two z scores that are symmetric about the mean 0, it is easier to simply double the area from the left tail and subtract the result from 1.

P(−1.00 < z < 1.00) = 1 − 2 × P(z < −1.00)
P(−1.00 < z < 1.00) = 1 − 2 × 0.1587 = 0.6826

With the four-decimal table values the result is 0.6826; carrying more decimal places gives 0.6827, the value shown in the figure.

[Figure: z curve with a central area of 0.6827 between z = −1.00 and z = 1.00 and an area of 0.1587 in each tail]

Example 6
Given μ = 10 and σ = 2, find P(6 < x < 14).

$$z_1 = \frac{6 - 10}{2} = -2.00 \qquad z_2 = \frac{14 - 10}{2} = 2.00$$

P(−2.00 < z < 2.00) = P(z < 2.00) − P(z < −2.00)
P(−2.00 < z < 2.00) = 0.9772 − 0.0228 = 0.9544

Alternatively, you can find the same probability by

P(−2.00 < z < 2.00) = 1 − 2 × P(z < −2.00)
P(−2.00 < z < 2.00) = 1 − 2 × 0.0228 = 0.9544

[Figure: z curve with a central area of about 0.9545 between z = −2.00 and z = 2.00 and an area of 0.0228 in each tail]

Example 7
Suppose the vehicle speed on I-65 between Lebanon and Gary is normally distributed with a mean of 74 mph and a standard deviation of 5 mph.

a) What is the probability that a vehicle clocked at random is traveling slower than 65 mph? Or, what proportion of vehicles are traveling below 65 mph?

Given μ = 74 and σ = 5, find P(x < 65).

Solution

$$z = \frac{65 - 74}{5} = -1.80$$

P(z < −1.80) = 0.0359

b) What is the probability that the vehicle is traveling faster than 80 mph? Or, what proportion of vehicles are traveling faster than 80 mph?

Given μ = 74 and σ = 5, find P(x > 80).

Solution

$$z = \frac{80 - 74}{5} = 1.20$$

P(z > 1.20) = 1 − P(z < 1.20)
P(z > 1.20) = 1 − 0.8849 = 0.1151

Note: To find the area or probability to the right of a z score, look up the area to the left of the negative of that z score:

P(z > 1.20) = P(z < −1.20) = 0.1151

c) What proportion of drivers drive between 75 and 85 mph?
Given μ = 74 and σ = 5, find P(75 < x < 85).

Solution

$$z_1 = \frac{75 - 74}{5} = 0.20 \qquad z_2 = \frac{85 - 74}{5} = 2.20$$

P(0.20 < z < 2.20) = P(z < 2.20) − P(z < 0.20)
P(0.20 < z < 2.20) = 0.9861 − 0.5793 = 0.4068

d) What fraction or proportion of vehicles drive within ±6 mph of the mean?

Given μ = 74 and σ = 5, find P(μ − 6 < x < μ + 6) = P(68 < x < 80).

Solution

$$z_1 = \frac{68 - 74}{5} = -1.20 \qquad z_2 = \frac{80 - 74}{5} = 1.20$$

P(−1.20 < z < 1.20) = 1 − 2 × P(z < −1.20)
P(−1.20 < z < 1.20) = 1 − 2 × 0.1151 = 0.7698

(Carrying more decimal places than the four-decimal table value gives 0.7699.)

e) What fraction or proportion of vehicles drive within ±2 standard deviations of the mean?

Given μ = 74 and σ = 5, find

P(μ − 2σ < x < μ + 2σ) = P(74 − 2 × 5 < x < 74 + 2 × 5)
P(μ − 2σ < x < μ + 2σ) = P(64 < x < 84)

$$z_1 = \frac{64 - 74}{5} = -2.00 \qquad z_2 = \frac{84 - 74}{5} = 2.00$$

Note: The term “2 standard deviations from the mean” is an expression of the distance of x from the mean in units of standard deviation, that is, z = 2.00. Thus,

P(−2.00 < z < 2.00) = 0.9544

2.2. Finding the x Value for a Given Probability

In many statistical problems involving the normal distribution you will be asked to find the value, or an interval of values, of x that bounds a given area under the normal curve. In other words, you will be given a probability and asked to find the x value (or values) corresponding to that probability.

Example 8
Let x be a normally distributed random variable with μ = 20 and σ = 5. The area under the normal curve to the left of an unknown value of x is 0.2005. Find the x value.

[Figure: Find the x value that bounds a left-tail area of 0.2005]

To find x, you have to work your way back starting with the z table. Look up the z score that corresponds to the area or probability 0.2005. The z score corresponding to this area is z₁ = −0.84.
[Figure: Find the z value that bounds a left-tail area of 0.2005]

Now, using

$$z = \frac{x - \mu}{\sigma}$$

solve for x:

x = μ + zσ

Thus,

x = 20 + (−0.84)(5) = 15.8

Example 9
Let x be a normally distributed random variable with μ = 20 and σ = 5. The area under the normal curve to the right of an unknown value of x is 0.2546. Find the x value.

Solution

[Figure: Find the x value that bounds a right-tail area of 0.2546]

In the z table, the z score corresponding to the area 0.2546 is −0.66. However, since we are interested in the right-tail area, using the symmetric property of the standard normal distribution, we ignore the negative sign. Thus:

x = μ + zσ = 20 + (0.66)(5) = 23.3

Example 10
Suppose the vehicle speed on the rural stretch of I-65 is normally distributed with a mean of 74 mph and a standard deviation of 8 mph. If the State Police planned to ticket the top 30 percent of the speeders, above what speed should the State Police issue tickets?

[Figure: Find the x value that bounds a right-tail area of 0.3000, with μ = 74]

The closest area in the z table to 0.3000 is 0.3015, for which the corresponding z score is −0.52. But since we are looking for the z score for the top 30 percent, we ignore the negative sign.

x = μ + zσ = 74 + (0.52)(8) = 78.16

The State Police should issue tickets for vehicles going over 78 mph.

Example 11
Suppose the vehicle speed on the rural stretch of I-65 is normally distributed with a mean of 74 mph and a standard deviation of 8 mph. Find the middle interval of speeds within which one-half (0.5 or 50%) of the vehicles drive.

Solution
Denote the lower end or boundary of the interval as x₁ and the upper end as x₂. Then we must find the values for x₁ and x₂ such that

P(x₁ < x < x₂) = 0.50

[Figure: Find the x values that bound a middle area of 0.5000; x₁ and x₂ lie on either side of μ = 74]

We can find the lower and upper boundaries, respectively, from x₁ = μ − zσ and x₂ = μ + zσ.
Here for z we must insert the z score that bounds a tail area of 0.25. This z score is 0.67.

x₁ = μ − zσ = 74 − (0.67)(8) = 74 − 5.36 = 68.64
x₂ = μ + zσ = 74 + (0.67)(8) = 74 + 5.36 = 79.36

The above computation indicates that 50% of vehicles drive within ±5.36 mph of the mean speed.

Example 12
In the previous problem, within what speed interval do 90% of vehicles drive?

Given μ = 74 and σ = 8, find x₁ and x₂ such that

P(x₁ < x < x₂) = 0.90

Solution
Again, x₁ = μ − zσ and x₂ = μ + zσ. Here for z we must insert the z score that bounds a tail area of 0.05. This z score is 1.64.

x₁ = μ − z_0.05 σ = 74 − (1.64)(8) = 74 − 13.12 = 60.88 ≈ 60.9
x₂ = μ + z_0.05 σ = 74 + (1.64)(8) = 74 + 13.12 = 87.12 ≈ 87.1

The above computation indicates that 90% of vehicles drive within ±13.12 mph of the average speed.

Example 13
Find the z scores corresponding to the following right-tail areas under the standard normal curve.

a) 0.10
b) 0.05
c) 0.025

a) Denote the z score corresponding to the 0.10 right-tail area as z_0.10. Then, using the symmetric property of the standard normal distribution,

P(z > z_0.10) = P(z < −z_0.10) = 0.10

In the z table the area closest to 0.10 is 0.1003. The z score corresponding to this area is −1.28. Therefore, on the right tail,

z_0.10 = 1.28

[Figure: The z score that bounds a right-tail area of 0.1003 is z = 1.28]

b) P(z > z_0.05) = P(z < −z_0.05) = 0.05

In the z table there are two areas that are equally close to 0.05. These are 0.0495 and 0.0505. Generally, the higher number, 0.0505, is chosen because, when the two areas are carried to five decimals, 0.05050 is closer to 0.05 than 0.04947 is. The z score corresponding to 0.0505 is −1.64. Therefore, z_0.05 = 1.64.

c) P(z > z_0.025) = P(z < −z_0.025) = 0.025

The z score corresponding to 0.025 is −1.96. Therefore, z_0.025 = 1.96.

Generally, let α denote the right-tail area under the z curve bounded by z_α.
Then

P(z > z_α) = P(z < −z_α) = α

2.3. z scores Frequently Used in Inferential Statistics

The three z scores calculated above will be used frequently in the subsequent chapters covering inferential statistics. You should simply memorize them.

• The z score that bounds a right-tail area of 0.10: z_0.10 = 1.28
• The z score that bounds a right-tail area of 0.05: z_0.05 = 1.64
• The z score that bounds a right-tail area of 0.025: z_0.025 = 1.96

[Figure: three standard normal curves showing right-tail areas of 0.10, 0.05, and 0.025, bounded by z = 1.28, z = 1.64, and z = 1.96 respectively]
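
The three z scores above come from reading the z table in reverse. As a cross-check, the same values can be computed with the inverse of the standard normal cumulative distribution function. The sketch below is not part of the original notes. It assumes Python 3.8 or later, where statistics.NormalDist provides inv_cdf; the helper names z_alpha and x_for_right_tail are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def z_alpha(alpha):
    """z score that bounds a right-tail area of alpha under the z curve."""
    # A right-tail area of alpha leaves an area of 1 - alpha to the left.
    return STANDARD_NORMAL.inv_cdf(1.0 - alpha)

def x_for_right_tail(mu, sigma, alpha):
    """x value with right-tail area alpha, using x = mu + z * sigma."""
    return mu + z_alpha(alpha) * sigma

# The three right-tail areas used in inferential statistics.
for alpha in (0.10, 0.05, 0.025):
    print(f"alpha = {alpha:<5}  z = {z_alpha(alpha):.2f}")

# Example 10: speed above which the top 30 percent of vehicles travel.
print(round(x_for_right_tail(74, 8, 0.30), 2))
```

Because the function does not round z to two decimals before multiplying by σ, the Example 10 result comes out near 78.2 mph rather than the table-based 78.16 mph. The difference is only the rounding of the z score.
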
