Discrete Random Variables: Their Probability Density Functions f(x), Cumulative Distribution Functions F(x), and Expected Values E(X)

A RANDOM VARIABLE assigns a number to each outcome. If the possible values are distinct, separated points, then the random variable is DISCRETE. If the possible values form an interval, then the random variable is CONTINUOUS.

Notation: Use capital letters at the end of the alphabet, X, Y, Z, to denote the random variable. Use the corresponding lowercase letters, x, y, z, to represent the actual values.

Example: Suppose a randomly selected family has 3 children. Let X = the number of boys in the family. Then x = 0, 1, 2, or 3. If necessary, you can use a tree diagram to list all of the possible 3-child families, as follows. The column labeled x gives the number of boys in the particular outcome. The column labeled "Probability" gives the probability of the particular outcome, assuming a boy and a girl are equally likely, i.e. P(G) = P(B) = 0.5. Note that the probability is calculated assuming independence, i.e. assuming that the gender of one child does not affect the gender of the next child, a seemingly reasonable assumption. So, for example, P(GGG) = P(G) × P(G) × P(G) = 0.5 × 0.5 × 0.5 = 0.125.

Outcome   x   Probability
GGG       0   0.5 × 0.5 × 0.5 = 0.125
BGG       1   0.5 × 0.5 × 0.5 = 0.125
GBG       1   0.5 × 0.5 × 0.5 = 0.125
GGB       1   0.5 × 0.5 × 0.5 = 0.125
BBG       2   0.5 × 0.5 × 0.5 = 0.125
GBB       2   0.5 × 0.5 × 0.5 = 0.125
BGB       2   0.5 × 0.5 × 0.5 = 0.125
BBB       3   0.5 × 0.5 × 0.5 = 0.125

Probability Density Functions f(x)

We summarize the probability that a discrete random variable X takes on certain values x by way of a PROBABILITY DENSITY FUNCTION f(x). (Many texts call this a probability mass function when X is discrete.) The probability density function, which can be described by a formula or in a table, is defined as:

f(x) = P(X = x)

NOTE!!!! The notation for a probability density function, f(x), uses a lowercase "f". As we will soon see, F(x), with a capital "F", takes on a different meaning.
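The outcome table and the definition f(x) = P(X = x) can be checked with a short Python sketch. It is only an illustration, not part of the handout's method: the names P_CHILD and boys_pdf are made up here, and the code relies on the same assumptions as the example (P(B) = P(G) = 0.5 and independence between children).

```python
from collections import defaultdict
from itertools import product

# Assumption taken from the example: a boy and a girl are equally likely.
P_CHILD = {"B": 0.5, "G": 0.5}


def boys_pdf(n_children=3):
    """Return {x: P(X = x)}, where X is the number of boys among n_children."""
    f = defaultdict(float)
    for outcome in product("BG", repeat=n_children):
        prob = 1.0
        for child in outcome:
            prob *= P_CHILD[child]  # independence: multiply the probabilities
        # Outcomes with the same number of boys are pooled into one value x.
        f[outcome.count("B")] += prob
    return dict(sorted(f.items()))


f = boys_pdf()
for x, p in f.items():
    print(x, p)
```

Each of the 8 outcomes contributes 0.125, so the printed values are the totals for x = 0, 1, 2, and 3.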
Handout 07 Page 1 of 4

Example (cont'd): We can determine the probability density function by noting that the possible 3-child families are mutually exclusive outcomes, so their probabilities add. So:

P(X=0) = P(GGG) = 0.125
P(X=1) = P(BGG or GBG or GGB) = P(BGG) + P(GBG) + P(GGB) = 0.375
P(X=2) = P(BBG or GBB or BGB) = P(BBG) + P(GBB) + P(BGB) = 0.375
P(X=3) = P(BBB) = 0.125

So, the probability density function for X, the number of boys in a 3-child family, is:

x      0       1       2       3
f(x)   0.125   0.375   0.375   0.125

Note that a probability density function must follow basic probability rules. The probabilities must be numbers between 0 and 1 (inclusive) and the probabilities must add to 1.

Example (cont'd): What is the probability that a randomly selected 3-child family has at least one boy?

P(X ≥ 1) = P(X=1) + P(X=2) + P(X=3) = 0.375 + 0.375 + 0.125 = 0.875

What is the probability that a randomly selected 3-child family has fewer than 2 boys?

P(X < 2) = P(X ≤ 1) = P(X=0) + P(X=1) = 0.125 + 0.375 = 0.50

What is the probability that a randomly selected 3-child family has at least 1, but no more than 2 boys?

P(1 ≤ X ≤ 2) = P(X=1) + P(X=2) = 0.375 + 0.375 = 0.75

Expected Value E(X)

What is the average number of boys in 3-child families? Assume we have randomly selected 1,000 families with 3 children. The probability density function tells us that:

125 of the families should have 0 boys
375 of the families should have 1 boy
375 of the families should have 2 boys
125 of the families should have 3 boys

Then, to calculate the average number of boys in 3-child families, our calculation would look like:

Average = (0 + 0 + ... + 0 + 1 + 1 + ... + 1 + 2 + 2 + ... + 2 + 3 + 3 + ... + 3) / 1000

in which there would be 125 zeroes, 375 ones, 375 twos, and 125 threes.
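The long-form average over 1,000 families can be written out literally in a few lines of Python. This is a minimal sketch for checking the arithmetic; the variable names counts, values, and average are illustrative, and the counts are the expected counts stated in the text.

```python
# Expected number of families (out of 1,000) with x boys, from the text.
counts = {0: 125, 1: 375, 2: 375, 3: 125}

# Write out all 1,000 values: 125 zeroes, 375 ones, 375 twos, 125 threes.
values = [x for x, n in counts.items() for _ in range(n)]

# Ordinary average: add every value and divide by how many there are.
average = sum(values) / len(values)
print(len(values), average)
```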
So, we could alternatively write the average as:

Average = [125(0) + 375(1) + 375(2) + 125(3)] / 1000

and as:

Average = (125/1000)(0) + (375/1000)(1) + (375/1000)(2) + (125/1000)(3)

and finally:

Average = 0.125(0) + 0.375(1) + 0.375(2) + 0.125(3) = 1.5

Since our probabilities are based on a large theoretical population, we call this average the POPULATION AVERAGE µ. Alternatively, we call this average the EXPECTED VALUE of X, denoted E(X) and read "E of X". The general formula for the expected value of X is:

µ = E(X) = Σ x f(x)

where the sum is taken over all possible values x. Simply put, one calculates the expected value of X by multiplying each possible value of the random variable by its probability and then adding the results up.

Example: Calculate the average, or expected, number of ear infections in the first two years of life, if the probability density function is:

x      0       1       2       3       4       5
f(x)   0.129   0.264   0.271   0.185   0.095   0.056

E(X) = 0(0.129) + 1(0.264) + 2(0.271) + 3(0.185) + 4(0.095) + 5(0.056)
     = 0 + 0.264 + 0.542 + 0.555 + 0.380 + 0.280
     = 2.021

That is, we can expect children to have, on average, 2.021 (or approximately 2?) ear infections in their first two years of life.

Cumulative Distribution Function F(x)

The CUMULATIVE DISTRIBUTION FUNCTION F(x) is the more common way that the probabilities of a random variable are summarized. The cumulative distribution function, which also can be described by a formula or in a table, is defined as:

F(x) = P(X ≤ x)

NOTE!!!! The notation for a probability density function, f(x), entails using a lowercase "f", while the notation for a cumulative distribution function, F(x), entails using a capital "F". The notation is not interchangeable.

Example (cont'd): Again, consider X, the number of boys in a 3-child family. Then:

P(X ≤ 0) = P(X=0) = 0.125
P(X ≤ 1) = P(X=0) + P(X=1) = 0.125 + 0.375 = 0.50
P(X ≤ 2) = P(X=0) + P(X=1) + P(X=2) = 0.125 + 0.375 + 0.375 = 0.875
P(X ≤ 3) = P(X=0) + P(X=1) + P(X=2) + P(X=3) = 0.125 + ... + 0.125 = 1

Then, summarizing the cumulative distribution function F(x) for X in a table:

x      0       1       2       3
F(x)   0.125   0.500   0.875   1.000

As we will see, typically we are faced with the situation in which we have some assumed cumulative distribution function, but we need to calculate some probability. The following example illustrates how to use a cumulative distribution function F(x) to find various probabilities.

Example: Let X = the number of days of rain in 5 randomly selected days. Weather records indicate that the cumulative distribution function F(x) is as follows:

x      0        1        2        3        4        5
F(x)   0.2373   0.6328   0.8965   0.9844   0.9990   1.000

In five randomly selected days....

    what is the probability that there will be no more than 2 days of rain?
    what is the probability that there will be fewer than 2 days of rain?
    what is the probability that there will be between 2 and 4 days (inclusive) of rain?
    what is the probability that there will be at least 3 days of rain?
    what is the probability that there will be more than 3 days of rain?
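The two summaries used in this handout, E(X) as the sum of x times f(x) and F(x) = P(X ≤ x), can be sketched in Python as a way to check hand calculations. The helper names expected_value, cdf_from_pdf, and cdf_at are hypothetical, and the sketch assumes X takes only integer values, as in every example here. The rain calculations show one possible way to turn each question into values of F: for integer-valued X, "fewer than 2" is F(1), "at least 3" is 1 − F(2), and "between 2 and 4 inclusive" is F(4) − F(1).

```python
from itertools import accumulate


def expected_value(pdf):
    """E(X): multiply each value x by f(x), then add the results."""
    return sum(x * p for x, p in pdf.items())


def cdf_from_pdf(pdf):
    """Build F(x) = P(X <= x) by accumulating f(x) in increasing order of x."""
    xs = sorted(pdf)
    return dict(zip(xs, accumulate(pdf[x] for x in xs)))


def cdf_at(cdf, x):
    """Look up F(x) for an integer x; F is 0 below the smallest tabulated value."""
    if x < min(cdf):
        return 0.0
    return cdf[min(x, max(cdf))]


# Tables copied from the examples in the text.
boys_pdf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
ear_pdf = {0: 0.129, 1: 0.264, 2: 0.271, 3: 0.185, 4: 0.095, 5: 0.056}
rain_cdf = {0: 0.2373, 1: 0.6328, 2: 0.8965, 3: 0.9844, 4: 0.9990, 5: 1.000}

print(expected_value(boys_pdf))           # 1.5
print(round(expected_value(ear_pdf), 3))  # 2.021
print(cdf_from_pdf(boys_pdf))             # {0: 0.125, 1: 0.5, 2: 0.875, 3: 1.0}

# Each rain question rewritten in terms of F(x) for integer-valued X.
rain_answers = {
    "no more than 2 days": cdf_at(rain_cdf, 2),
    "fewer than 2 days": cdf_at(rain_cdf, 1),
    "between 2 and 4 days (inclusive)": cdf_at(rain_cdf, 4) - cdf_at(rain_cdf, 1),
    "at least 3 days": 1 - cdf_at(rain_cdf, 2),
    "more than 3 days": 1 - cdf_at(rain_cdf, 3),
}
for question, prob in rain_answers.items():
    print(f"{question}: {prob:.4f}")
```

Subtracting F(1) rather than F(2) in the "between 2 and 4" case keeps P(X = 2) in the answer, which is the usual pitfall when working from a cumulative table.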