Random Variables and Their Probability Distributions

Random Variable: A random variable is obtained by assigning a numerical value to each outcome of a particular experiment. We shall use a capital letter, say X, to denote a random variable and its corresponding small letter, x in this case, for one of its values. For example, the sample space giving a detailed description of each possible outcome when three electronic components are tested may be written

S = {NNN, NND, NDN, DNN, NDD, DND, DDN, DDD},

where N denotes non-defective and D denotes defective. The possible outcomes and the values x of the random variable X, where X is the number of defectives that occur, are:

Sample point:  NNN  NND  NDN  DNN  NDD  DND  DDN  DDD
x:              0    1    1    1    2    2    2    3

Notice that the random variable X assumes the value 2 for all elements in the subset E = {DDN, DND, NDD} of the sample space S. That is, each possible value of X represents an event that is a subset of the sample space for the given experiment.

Example: Consider the simple condition in which components are arriving from the production line and each is stipulated to be defective or non-defective. Define the random variable X by

X = 1, if the component is defective,
X = 0, if the component is non-defective.

A random variable for which 0 and 1 are chosen to describe the two possible values is called a Bernoulli random variable.

Example: Suppose a sampling plan involves sampling items from a process until a defective is observed. The evaluation of the process will depend on how many consecutive items are observed. In that regard, let X be a random variable defined by the number of items observed before a defective is found. With N a non-defective and D a defective, the sample spaces are S = {D} given X = 1, S = {ND} given X = 2, S = {NND} given X = 3, and so on.

Example: Let X be the random variable defined by the waiting time, in hours, between successive speeders spotted by a radar unit. The random variable X takes on all values x for which x ≥ 0.
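The three-component experiment above can be sketched in a few lines of Python: enumerate the sample space, assign X to each outcome, and recover the event {X = 2} as a subset of S. This is an illustrative sketch, not part of the original notes.

```python
from itertools import product

# Enumerate the sample space for testing three components:
# each component is either non-defective (N) or defective (D).
sample_space = ["".join(o) for o in product("ND", repeat=3)]

# The random variable X assigns to each outcome the number of defectives.
X = {outcome: outcome.count("D") for outcome in sample_space}

# The event {X = 2} is the subset of outcomes with exactly two defectives.
event_X_eq_2 = sorted(o for o, x in X.items() if x == 2)
print(event_X_eq_2)  # ['DDN', 'DND', 'NDD']
```

Running it confirms that {X = 2} = {DDN, DND, NDD}, the subset E identified in the text.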
Discrete Random Variable: A random variable is called a discrete random variable if its set of possible outcomes is countable. That implies a discrete random variable is defined over a discrete sample space.

Continuous Random Variable: When a random variable can take on values on a continuous scale, it is called a continuous random variable. Continuous random variables represent measured data, such as all possible heights, weights, temperatures, distances, or life periods, whereas discrete random variables represent count data, such as the number of defectives in a sample of k items or the number of highway fatalities per year in a given state.

The sample space giving a detailed description of each possible outcome when three electronic components are tested may be written S = {NNN, NND, NDN, DNN, NDD, DND, DDN, DDD}. If x represents the values of the random variable X, where X is the number of defectives that occur, and the eight outcomes are equally likely, then the associated probabilities are

x:         0    1    2    3
P(X = x):  1/8  3/8  3/8  1/8

A discrete random variable assumes each of its values with a certain probability, so it is convenient to represent all the probabilities of a random variable X by a formula. Such a formula is necessarily a function of the numerical values x, which we denote by f(x), g(x), and so forth. Therefore, we write f(x) = P(X = x); that is, f(3) = P(X = 3). It is easy to verify that f(x) for the above table is of the form

f(x) = C(3, x)/8, for x = 0, 1, 2, 3,

where C(3, x) is the number of ways of choosing x items from 3.

Discrete Probability Distributions: A function f(x) is a probability function, probability mass function, or probability distribution of the discrete random variable X if, for each possible outcome x,

1. f(x) ≥ 0,
2. ∑ₓ f(x) = 1,
3. P(X = x) = f(x).

Example: A shipment of 8 similar laptop computers to a retail outlet contains 3 that are defective. If a school makes a random purchase of 2 of these computers, find the probability distribution for the number of defectives.

Solution: Let X be a random variable whose values x are the possible numbers of defective computers purchased by the school.
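The three conditions for a probability mass function can be checked mechanically for the pmf f(x) = C(3, x)/8 above. A minimal sketch, assuming (as in the text) that the eight outcomes are equally likely:

```python
from fractions import Fraction
from math import comb

# pmf for the number of defectives among three tested components,
# assuming defective and non-defective are equally likely per component.
def f(x: int) -> Fraction:
    return Fraction(comb(3, x), 8)

# Condition 1: every probability is non-negative.
assert all(f(x) >= 0 for x in range(4))
# Condition 2: the probabilities sum to 1.
assert sum(f(x) for x in range(4)) == 1
# Condition 3: f(x) = P(X = x), e.g. P(X = 2) = 3/8.
print(f(2))  # 3/8
```

Using exact fractions avoids any floating-point rounding when verifying that the probabilities sum to 1.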
Then x can only take the values 0, 1, and 2. Now

f(0) = P(X = 0) = C(3, 0)C(5, 2)/C(8, 2) = 10/28 = 5/14,
f(1) = P(X = 1) = C(3, 1)C(5, 1)/C(8, 2) = 15/28,
f(2) = P(X = 2) = C(3, 2)C(5, 0)/C(8, 2) = 3/28.

Thus, the probability distribution of X is

x:     0     1      2
f(x):  5/14  15/28  3/28

Cumulative Distribution Function: The cumulative distribution function (cdf) F(x) of a discrete random variable X with probability distribution f(x) is

F(x) = P(X ≤ x) = ∑_{t ≤ x} f(t), for −∞ < x < ∞.

Example: A stockroom clerk returns three safety helmets at random to three steel mill employees who had previously checked them. If Smith, Jones, and Brown, in that order, receive one of the three helmets, list the sample points for the possible orders of returning the helmets, and find the value x of the random variable X that represents the number of correct matches. Find the cumulative distribution function of the random variable.

Solution: If S, J, and B stand for Smith's, Jones's, and Brown's helmets, respectively, then the possible arrangements in which the helmets may be returned, together with the number of correct matches, are

Order:  SJB  SBJ  JSB  JBS  BSJ  BJS
x:      3    1    1    0    0    1

The possible values x of X and their probabilities are

x:         0    1    3
P(X = x):  1/3  1/2  1/6

The cumulative distribution function of X is

F(x) = 0,    for x < 0,
F(x) = 1/3,  for 0 ≤ x < 1,
F(x) = 5/6,  for 1 ≤ x < 3,
F(x) = 1,    for x ≥ 3.

Example: If a car agency sells 50% of its inventory of a certain foreign car equipped with side airbags, find a formula for the probability distribution of the number of cars with side airbags among the next 4 cars sold by the agency.

Solution: Since the probability of selling an automobile with side airbags is 0.5, the 2⁴ = 16 points in the sample space are equally likely to occur. Therefore, the denominator for all probabilities, and also for our function, is 16. The event of selling x models with side airbags and 4 − x models without side airbags can occur in C(4, x) ways, where x can be 0, 1, 2, 3, or 4. Thus, the probability distribution is

f(x) = C(4, x)/16, for x = 0, 1, 2, 3, 4.

Example: Find the cumulative distribution function of the random variable X in the above example. Using F(x), verify that f(2) = 3/8.

Solution: Direct calculation of the probability distribution in the above example gives f(0) = 1/16, f(1) = 1/4, f(2) = 3/8, f(3) = 1/4, and f(4) = 1/16.
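The helmet-matching distribution can be obtained by brute force over all six return orders, as a quick check on the hand count above. An illustrative sketch:

```python
from fractions import Fraction
from itertools import permutations

# Helmets S, J, B are returned in a random order to Smith, Jones, Brown.
# X counts how many employees receive their own helmet.
orders = list(permutations("SJB"))
match_counts = [sum(got == own for got, own in zip(order, "SJB"))
                for order in orders]

# pmf of X: each of the 6 return orders is equally likely.
pmf = {x: Fraction(match_counts.count(x), len(orders))
       for x in sorted(set(match_counts))}
for x, p in sorted(pmf.items()):
    print(x, p)  # 0 1/3, 1 1/2, 3 1/6
```

Note that X = 2 never occurs: if two employees have their own helmets, the third must as well.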
Therefore,

F(x) = 0,      for x < 0,
F(x) = 1/16,   for 0 ≤ x < 1,
F(x) = 5/16,   for 1 ≤ x < 2,
F(x) = 11/16,  for 2 ≤ x < 3,
F(x) = 15/16,  for 3 ≤ x < 4,
F(x) = 1,      for x ≥ 4.

Now

f(2) = F(2) − F(1) = 11/16 − 5/16 = 6/16 = 3/8.

Now we shall concern ourselves with computing probabilities for various intervals of continuous random variables, such as P(a < X < b), P(X ≥ c), and so forth. Note that when X is continuous,

P(a < X ≤ b) = P(a < X < b) + P(X = b) = P(a < X < b),

since P(X = b) = 0. That is, it does not matter whether we include an endpoint of the interval or not. This is not true, though, when X is discrete.

Continuous Probability Distributions: The function f(x) is a probability density function (pdf) for the continuous random variable X, defined over the set of real numbers, if

1. f(x) ≥ 0, for all real x,
2. ∫_{−∞}^{∞} f(x) dx = 1,
3. P(a < X < b) = ∫_a^b f(x) dx.

The probability that X assumes a value between a and b is equal to the shaded area under the density function between the ordinates at x = a and x = b, and from integral calculus it is given by P(a < X < b) = ∫_a^b f(x) dx.

Example: Suppose that the error in the reaction temperature, in °C, for a controlled laboratory experiment is a continuous random variable X having the probability density function

f(x) = x²/3, for −1 < x < 2, and f(x) = 0 elsewhere.

(a) Verify that f(x) is a density function. (b) Find P(0 < X ≤ 1).

Solution: We use the above definition.

(a) Obviously, f(x) ≥ 0. To verify condition 2 in the definition, we have

∫_{−∞}^{∞} f(x) dx = ∫_{−1}^{2} (x²/3) dx = x³/9 |_{−1}^{2} = 8/9 + 1/9 = 1.

(b) Using formula 3 in the above definition, we obtain

P(0 < X ≤ 1) = ∫_0^1 (x²/3) dx = x³/9 |_0^1 = 1/9.

Cumulative Distribution Function: The cumulative distribution function F(x) of a continuous random variable X with density function f(x) is

F(x) = P(X ≤ x) = ∫_{−∞}^x f(t) dt, for −∞ < x < ∞.

As an immediate consequence of this definition,

P(a < X < b) = F(b) − F(a), and f(x) = dF(x)/dx, if the derivative exists.

Example: For the density function of the above example, find F(x), and use it to evaluate P(0 < X ≤ 1).

Solution: For −1 < x < 2,

F(x) = ∫_{−∞}^x f(t) dt = ∫_{−1}^x (t²/3) dt = t³/9 |_{−1}^x = (x³ + 1)/9.

Therefore,

F(x) = 0,            for x < −1,
F(x) = (x³ + 1)/9,   for −1 ≤ x < 2,
F(x) = 1,            for x ≥ 2.

The cumulative distribution function F(x) is shown in the figure. Now

P(0 < X ≤ 1) = F(1) − F(0) = 2/9 − 1/9 = 1/9,

which agrees with the result obtained using the density function.

Joint Probability Distributions: Our study of random variables and their probability distributions in the preceding sections was restricted to one-dimensional sample spaces, in that we recorded outcomes of an experiment as values assumed by a single random variable. There will be situations, however, where we may find it desirable to record the simultaneous outcomes of several random variables.
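The piecewise cdf F(x) = (x³ + 1)/9 derived above can be encoded directly and used to evaluate interval probabilities, mirroring the P(a < X ≤ b) = F(b) − F(a) rule. A sketch with exact arithmetic:

```python
from fractions import Fraction

# cdf of the reaction-temperature error, from the antiderivative of
# f(x) = x^2/3 on (-1, 2): F(x) = (x^3 + 1)/9, clamped to 0 and 1 outside.
def F(x: Fraction) -> Fraction:
    if x < -1:
        return Fraction(0)
    if x < 2:
        return (x**3 + 1) / Fraction(9)
    return Fraction(1)

# Total probability: F(2) - F(-1) = 1 (condition 2 for a density).
assert F(Fraction(2)) - F(Fraction(-1)) == 1
# P(0 < X <= 1) = F(1) - F(0) = 2/9 - 1/9.
print(F(Fraction(1)) - F(Fraction(0)))  # 1/9
```

The same subtraction works for any interval, which is the practical payoff of computing F(x) once rather than integrating f(x) repeatedly.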
For example, if an 18-wheeler is to have its tires serviced and X represents the number of miles these tires have been driven and Y represents the number of tires that need to be replaced, then f(30000, 5) is the probability that the tires have been used over 30,000 miles and the truck needs 5 new tires.

Example: Two ballpoint pens are selected at random from a box that contains 3 blue pens, 2 red pens, and 3 green pens. If X is the number of blue pens selected and Y is the number of red pens selected, find (a) the joint probability function f(x, y) and (b) P[(X, Y) ∈ A], where A is the region {(x, y) | x + y ≤ 1}.

Solution: The possible pairs of values (x, y) are (0, 0), (0, 1), (1, 0), (1, 1), (0, 2), and (2, 0). Now, f(0, 1), for example, represents the probability that one red and one green pen are selected. The total number of equally likely ways of selecting any 2 pens from the 8 is C(8, 2) = 28. The number of ways of selecting 1 red from 2 red pens and 1 green from 3 green pens is C(2, 1)C(3, 1) = 6. Hence f(0, 1) = 6/28 = 3/14. Similar calculations give the remaining probabilities, which are represented by the formula

f(x, y) = C(3, x)C(2, y)C(3, 2 − x − y)/C(8, 2),

for x = 0, 1, 2; y = 0, 1, 2; and 0 ≤ x + y ≤ 2. In tabular form, the joint probabilities are

f(x, y):  x = 0  x = 1  x = 2
y = 0:    3/28   9/28   3/28
y = 1:    6/28   6/28   —
y = 2:    1/28   —      —

(b) P[(X, Y) ∈ A] = P(X + Y ≤ 1) = f(0, 0) + f(0, 1) + f(1, 0) = 3/28 + 6/28 + 9/28 = 18/28 = 9/14.

When X and Y are continuous random variables, the joint density function f(x, y) is a surface lying above the xy-plane, and P[(X, Y) ∈ A], where A is any region in the xy-plane, is equal to the volume of the right cylinder bounded by the base A and the surface.

Marginal Distributions: The marginal distributions of X alone and of Y alone are

g(x) = ∑_y f(x, y) and h(y) = ∑_x f(x, y)

in the discrete case, and

g(x) = ∫_{−∞}^{∞} f(x, y) dy and h(y) = ∫_{−∞}^{∞} f(x, y) dx

in the continuous case.

Example: Show that the column and row totals of the joint probability table above (discrete case) give the marginal distributions of X alone and of Y alone.

Solution: For the random variable X,

g(0) = f(0, 0) + f(0, 1) + f(0, 2) = 3/28 + 6/28 + 1/28 = 10/28 = 5/14,
g(1) = f(1, 0) + f(1, 1) = 9/28 + 6/28 = 15/28,
g(2) = f(2, 0) = 3/28,

which are just the column totals of the table. In a similar manner we could show that the values of h(y) are given by the row totals. In tabular form, these marginal distributions may be written as follows:

x:     0     1      2          y:     0      1    2
g(x):  5/14  15/28  3/28       h(y):  15/28  3/7  1/28

Example: Find g(x) and h(y) for the joint density function of the earlier continuous example.

Statistical Independence: Let X and Y be two random variables, discrete or continuous, with joint probability distribution f(x, y) and marginal distributions g(x) and h(y), respectively. The random variables X and Y are said to be statistically independent if and only if

f(x, y) = g(x)h(y)

for all (x, y) within their range. The fact that the marginal distributions g(x) and h(y) are indeed the probability distributions of the individual variables X and Y alone can be verified by showing that they satisfy the conditions of a probability distribution.
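The joint pmf of the ballpoint-pen example, its marginals, and the region probability P(X + Y ≤ 1) can all be computed from the counting formula above. An illustrative sketch:

```python
from fractions import Fraction
from math import comb

# Joint pmf for drawing 2 pens from a box of 3 blue, 2 red, 3 green:
# X = number of blue pens, Y = number of red pens.
def f(x: int, y: int) -> Fraction:
    if x < 0 or y < 0 or x + y > 2:
        return Fraction(0)
    return Fraction(comb(3, x) * comb(2, y) * comb(3, 2 - x - y), comb(8, 2))

# Marginals: g(x) is a column total, h(y) is a row total.
g = {x: sum(f(x, y) for y in range(3)) for x in range(3)}
h = {y: sum(f(x, y) for x in range(3)) for y in range(3)}

# P(X + Y <= 1) = f(0,0) + f(0,1) + f(1,0).
p_A = f(0, 0) + f(0, 1) + f(1, 0)
print(p_A)  # 9/14
```

One can also check here that X and Y are not independent: f(2, 1) = 0 while g(2)h(1) = (3/28)(3/7) is positive, so f(x, y) ≠ g(x)h(y) for all (x, y).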
For example, in the continuous case,

∫_{−∞}^{∞} g(x) dx = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dy dx = 1.

Mathematical Expectation: If two coins are tossed 16 times and X is the number of heads that occur per toss, then the values of X are 0, 1, and 2. Let the following frequency distribution summarize the number of heads observed per toss:

x:          0  1  2
Frequency:  4  7  5

The average number of heads per toss of the two coins is then

[(0)(4) + (1)(7) + (2)(5)]/16 = 17/16 ≈ 1.06.

Let us now restructure our computation of the average number of heads so as to have the following equivalent form:

(0)(4/16) + (1)(7/16) + (2)(5/16) = 1.06.

The fractions 4/16, 7/16, and 5/16 are the relative frequencies for the different values of X in our experiment. In fact, then, we can calculate the mean, or average, of a set of data by knowing the distinct values that occur and their relative frequencies, without any knowledge of the total number of observations in our set of data. Therefore, if 4/16, or 1/4, of the tosses result in no heads, 7/16 of the tosses result in one head, and 5/16 of the tosses result in two heads, the mean number of heads per toss would be 1.06 no matter whether the total number of tosses were 16, 1000, or even 10,000.

This method of relative frequencies is used to calculate the average number of heads per toss of two coins that we might expect in the long run. We shall refer to this average value as the mean of the random variable X, or the mean of the probability distribution of X, and write it as μ_X, or simply as μ when it is clear to which random variable we refer. It is also common among statisticians to refer to this mean as the mathematical expectation, or the expected value, of the random variable X, and to denote it as E(X).

Example: A lot containing 7 components is sampled by a quality inspector; the lot contains 4 good components and 3 defective components. A sample of 3 is taken by the inspector. Find the expected value of the number of good components in this sample.

Solution: Let X represent the number of good components in the sample. The probability distribution of X is

f(x) = C(4, x)C(3, 3 − x)/C(7, 3), for x = 0, 1, 2, 3.

Simple calculations yield f(0) = 1/35, f(1) = 12/35, f(2) = 18/35, and f(3) = 4/35. Therefore,

E(X) = (0)(1/35) + (1)(12/35) + (2)(18/35) + (3)(4/35) = 12/7 ≈ 1.7.

Thus, if a sample of size 3 is selected at random over and over again from this lot, it will contain, on average, 1.7 good components.
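The expected-value calculation for the component-sampling example follows the definition E(X) = ∑ x f(x) directly. A minimal sketch:

```python
from fractions import Fraction
from math import comb

# X = number of good components when 3 are sampled without replacement
# from a lot of 4 good and 3 defective components (hypergeometric pmf).
def f(x: int) -> Fraction:
    return Fraction(comb(4, x) * comb(3, 3 - x), comb(7, 3))

# E(X) = sum over x of x * f(x).
mean = sum(x * f(x) for x in range(4))
print(mean)  # 12/7
```

The exact answer 12/7 ≈ 1.714 rounds to the 1.7 quoted in the solution.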
Example: In a gambling game a man is paid $5 if he gets all heads or all tails when three coins are tossed, and he pays out $3 if either one or two heads show. What is his expected gain?

Solution: The sample space for the possible outcomes when three coins are tossed simultaneously is

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.

Each of these possibilities is equally likely and occurs with probability 1/8. The random variable of interest is X, the amount the gambler can win; the possible values of X are $5 if event E1 = {HHH, TTT} occurs and −$3 if event E2 = {HHT, HTH, THH, HTT, THT, TTH} occurs. Since E1 and E2 occur with probabilities 1/4 and 3/4, respectively, it follows that

E(X) = (5)(1/4) + (−3)(3/4) = −1.

In this game the gambler will, on average, lose $1 per toss of the three coins.

Example: Let X and Y be the random variables with joint probability distribution of the ballpoint-pen example. Find the expected value of g(X, Y) = XY.

Solution: By definition,

E(XY) = ∑_x ∑_y x y f(x, y) = (1)(1)f(1, 1) = 6/28 = 3/14,

since every other term in the sum contains either x = 0 or y = 0.

Variance of Random Variables: Let X be a random variable with probability distribution f(x) and mean μ. The quantity E[(X − μ)²] is referred to as the variance of the random variable X, or the variance of the probability distribution of X, and is denoted by Var(X), the symbol σ²_X, or simply σ² when it is clear to which random variable we refer.

In the figure, we have the histograms of two discrete probability distributions that have the same mean but differ considerably in variability, or the dispersion of their observations about the mean. Clearly, the variance of the number of automobiles that are used for official business purposes is greater for company B than for company A.

An alternative and preferred formula for finding σ², which often simplifies the calculations, is stated in the following theorem: the variance of a random variable X is

σ² = E(X²) − μ².

Some Theorems (Without Proof):

1. If a and b are constants, then E(aX + b) = aE(X) + b.
2. E[g(X) ± h(X)] = E[g(X)] ± E[h(X)].
3. If X and Y are independent, then E(XY) = E(X)E(Y).
4. If a and b are constants, then Var(aX + b) = a² Var(X).

Relationship to Material in Other Chapters: The purpose of this chapter is for readers to learn how to manipulate a probability distribution, not to learn how to identify a specific type.
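The shortcut σ² = E(X²) − μ² can be checked against the definitional form E[(X − μ)²] on the gambling example, where X is +5 with probability 1/4 and −3 with probability 3/4. An illustrative sketch:

```python
from fractions import Fraction

# Gambling example: X = +5 with probability 1/4, -3 with probability 3/4.
pmf = {5: Fraction(1, 4), -3: Fraction(3, 4)}

mu = sum(x * p for x, p in pmf.items())                    # E(X)
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]
var_alt = sum(x * x * p for x, p in pmf.items()) - mu**2   # E(X^2) - mu^2

print(mu, var_def, var_alt)  # -1 12 12
```

Both routes give σ² = 12; the alternative formula avoids subtracting μ inside the sum, which is the simplification the theorem promises.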
In future chapters it will become apparent that probability distributions provide the structure through which computed probabilities aid in the evaluation and understanding of a process.