Probability and Statistics
Lecture 3
Dr.-Ing. Erwin Sitompul
President University
http://zitompul.wordpress.com

President University Erwin Sitompul PBST 3/1

Chapter 3 Random Variables and Probability Distributions

Chapter 3.1 Concept of a Random Variable

A random variable is a function that associates a real number with each element in the sample space. In other words, a random variable is a numerical description of the outcome of an experiment, where each outcome is assigned a numerical value.

A capital letter, say X, is used to denote a random variable, and its corresponding small letter, x in this case, denotes one of its values.

The sample space giving a detailed description of each possible outcome when three electronic components are tested may be written as

$$S = \{DDD, DDN, DND, DNN, NDD, NDN, NND, NNN\}$$

where D stands for a defective and N for a non-defective component. One is concerned with the number of defectives that occur. Thus each point in the sample space is assigned a numerical value of 0, 1, 2, or 3. The random variable X then assumes the value 2 for all elements in the subset

$$E = \{DDN, DND, NDD\}$$

Two balls are drawn in succession without replacement from an urn containing 4 red balls and 3 black balls. The possible outcomes and the values y of the random variable Y, where Y is the number of red balls, are

Sample space    y
RR              2
RB              1
BR              1
BB              0

Sample Space and Random Variable

If a sample space contains a finite number of possibilities or an unending sequence with as many elements as there are whole numbers, it is called a discrete sample space.

If a sample space contains an infinite number of possibilities equal to the number of points on a line segment, it is called a continuous sample space.
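The idea that a random variable is a function on the sample space can be sketched in a few lines of Python for the urn example above. This is only an illustrative sketch: the names `balls`, `Y`, `sample_space`, and `values` are assumptions made here and do not come from the lecture.

```python
from itertools import permutations

# Urn contents from the example: 4 red (R) and 3 black (B) balls.
balls = ["R"] * 4 + ["B"] * 3


def Y(outcome):
    """Random variable Y: number of red balls in an outcome such as ('R', 'B')."""
    return outcome.count("R")


# Ordered draws of two different balls (no replacement), reduced to color patterns.
sample_space = sorted(set(permutations(balls, 2)))

# Y assigns one real number to every element of the sample space.
values = {"".join(outcome): Y(outcome) for outcome in sample_space}

for outcome, y in values.items():
    print(outcome, y)
```

Each color pattern in the sample space is mapped to exactly one value y, which is what makes Y a function. The probabilities attached to these values are a separate question, treated in Chapter 3.2.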
A random variable is called a discrete random variable if its set of possible outcomes is countable. A random variable is called a continuous random variable if it can take on values on a continuous scale.

If X is the random variable assigned to the waiting time, in minutes, for a bus at a bus stop, then X may take on all values of waiting time x, x ≥ 0. In this case, X is a continuous random variable.

Chapter 3.2 Discrete Probability Distributions

Frequently, it is convenient to represent all the probabilities of a random variable X by a formula. Such a formula is necessarily a function of the numerical values x, denoted by f(x), g(x), r(x), and so forth. For example,

$$f(x) = P(X = x)$$

The set of ordered pairs (x, f(x)) is a probability function, probability mass function, or probability distribution of the discrete random variable X if, for each possible outcome x,

1. $f(x) \ge 0$
2. $\sum_x f(x) = 1$
3. $P(X = x) = f(x)$

In the experiment of tossing a fair coin twice, the random variable X represents how many times heads turns up. The possible values x of X and their probabilities can be summarized as

x           0      1      2
P(X = x)    1/4    1/2    1/4

A shipment of 20 similar laptop computers to a retail outlet contains 3 that are defective. If a school makes a random purchase of 2 of these computers, find the probability distribution for the number of defectives.

Let X be a random variable whose values x are the possible numbers of defective computers purchased by the school.
$$f(0) = P(X = 0) = \frac{{}_{3}C_{0}\,{}_{17}C_{2}}{{}_{20}C_{2}} = \frac{136}{190}$$

$$f(1) = P(X = 1) = \frac{{}_{3}C_{1}\,{}_{17}C_{1}}{{}_{20}C_{2}} = \frac{51}{190}$$

$$f(2) = P(X = 2) = \frac{{}_{3}C_{2}\,{}_{17}C_{0}}{{}_{20}C_{2}} = \frac{3}{190}$$

There are many problems where we may wish to compute the probability that the observed value of a random variable X will be less than or equal to some real number x.

The cumulative distribution F(x) of a discrete random variable X with probability distribution f(x) is

$$F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad \text{for } -\infty < x < \infty$$

[Figures: example of a probability distribution; example of a cumulative distribution]

Chapter 3.3 Continuous Probability Distributions

When the sample space is continuous, the random variable can take an unlimited number of possible values. Thus, it is more meaningful to deal with an interval rather than a point value of a random variable.

For example, it does not make sense to ask for the probability of selecting a person at random who is exactly 164 cm tall. It is more useful to talk about the probability of selecting a person who is at least 163 cm but not more than 165 cm tall.

We shall concern ourselves now with computing probabilities for various intervals of continuous random variables such as P(a < X < b), P(W ≥ c), P(U ≤ d), and so forth.

Note that when X is continuous, the probability of a point value is zero:

$$P(X = a) = 0$$

$$P(a < X \le b) = P(a < X < b) + P(X = b) = P(a < X < b)$$

In dealing with continuous variables, the notation commonly used is f(x), and it is usually called the probability density function, or the density function, of X.

For most practical applications, density functions are continuous and differentiable.
The graph of a density function may take any form, but since it is used to represent probabilities, it must lie entirely on or above the x axis.

[Figure: example graphs of density functions f(x) versus x]

A probability density function is constructed so that the area under its curve bounded by the x axis is equal to 1 when computed over the range of X for which f(x) is defined.

The probability that X assumes a value between a and b is equal to the shaded area under the density function between the ordinates at x = a and x = b:

$$P(a < X < b) = \int_a^b f(x)\,dx$$

[Figure: shaded area under f(x) between x = a and x = b]

The function f(x) is a probability density function for the continuous random variable X, defined over the set of real numbers R, if

1. $f(x) \ge 0$, for all $x \in R$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a < X < b) = \int_a^b f(x)\,dx$

Suppose that the error in the reaction temperature, in °C, for a controlled laboratory experiment is a continuous random variable X having the probability density function

$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

(a) Verify that $\int_{-\infty}^{\infty} f(x)\,dx = 1$
(b) Find P(0 < X ≤ 1)

(a) $$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{x^2}{3}\,dx = \left.\frac{x^3}{9}\right|_{-1}^{2} = \frac{8}{9} + \frac{1}{9} = 1$$

(b) $$P(0 < X \le 1) = \int_0^1 \frac{x^2}{3}\,dx = \left.\frac{x^3}{9}\right|_0^1 = \frac{1}{9}$$

The cumulative distribution F(x) of a continuous random variable X with density function f(x) is

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad \text{for } -\infty < x < \infty$$

For the density function in the last example, find F(x) and use it to evaluate P(0 < X ≤ 1).
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-1}^{x} \frac{t^2}{3}\,dt = \left.\frac{t^3}{9}\right|_{-1}^{x} = \frac{x^3 + 1}{9}, \quad \text{for } -1 \le x < 2$$

$$F(x) = \begin{cases} 0, & x < -1 \\ \dfrac{x^3 + 1}{9}, & -1 \le x < 2 \\ 1, & x \ge 2 \end{cases}$$

$$P(0 < X \le 1) = F(1) - F(0) = \frac{2}{9} - \frac{1}{9} = \frac{1}{9}$$

Chapter 3.4 Joint Probability Distributions

If X and Y are two discrete random variables, the probability distribution for their simultaneous occurrence can be represented by a function with values f(x, y) for any pair of values (x, y) within the range of the random variables X and Y. Such a function is referred to as the joint probability distribution of X and Y.

The function f(x, y) is a joint probability distribution or joint probability mass function of the discrete random variables X and Y if

1. $f(x, y) \ge 0$, for all (x, y)
2. $\sum_x \sum_y f(x, y) = 1$
3. $P(X = x, Y = y) = f(x, y)$

For any region A in the xy plane,

$$P[(X, Y) \in A] = \sum\sum_{A} f(x, y)$$

Two ballpoint pens are selected at random from a box that contains 3 blue pens, 2 red pens, and 3 green pens. If X is the number of blue pens selected and Y is the number of red pens selected, find
(a) the joint probability function f(x, y)
(b) P[(X, Y) ∈ A], where A is the region {(x, y) | x + y ≤ 1}

(a) $$f(x, y) = \frac{{}_{3}C_{x}\,{}_{2}C_{y}\,{}_{3}C_{2-x-y}}{{}_{8}C_{2}}, \quad \text{for } x = 0, 1, 2;\ y = 0, 1, 2;\ 0 \le x + y \le 2$$

(b) $$P[(X, Y) \in A] = P(X + Y \le 1) = f(0, 0) + f(0, 1) + f(1, 0) = \frac{3}{28} + \frac{3}{14} + \frac{9}{28} = \frac{9}{14}$$

The function f(x, y) is a joint probability density function of the continuous random variables X and Y if

1. $f(x, y) \ge 0$, for all (x, y)
2. $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
3. $P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$, for any region A in the xy plane.
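The joint probability function of the ballpoint pens example above can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `joint_pmf` and its keyword parameters are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction
from math import comb


def joint_pmf(x, y, blue=3, red=2, green=3, draws=2):
    """f(x, y): probability of drawing x blue and y red pens (the rest green)."""
    rest = draws - x - y  # number of green pens drawn
    if x < 0 or y < 0 or rest < 0:
        return Fraction(0)
    ways = comb(blue, x) * comb(red, y) * comb(green, rest)
    return Fraction(ways, comb(blue + red + green, draws))


# All (x, y) pairs with x, y in {0, 1, 2}; impossible pairs get probability 0.
support = [(x, y) for x in range(3) for y in range(3)]

total = sum(joint_pmf(x, y) for x, y in support)                     # condition 2
p_region = sum(joint_pmf(x, y) for x, y in support if x + y <= 1)    # part (b)

print(total, p_region)
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with the hand calculation: the probabilities sum to 1, and the region {(x, y) | x + y ≤ 1} has probability 9/14.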
A privately owned business operates both a drive-in facility and a walk-in facility. On a randomly selected day, let X and Y, respectively, be the proportions of the time that the drive-in and the walk-in facilities are in use, and suppose that the joint density function of these random variables is

$$f(x, y) = \begin{cases} \dfrac{2}{5}(2x + 3y), & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$

(a) Verify that f(x, y) is a joint density function.
(b) Find P[(X, Y) ∈ A], where A is {(x, y) | 0 < x < 1/2, 1/4 < y < 1/2}.

(a) $$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = \int_{y=0}^{1}\int_{x=0}^{1} \frac{2}{5}(2x + 3y)\,dx\,dy = \int_0^1 \left[\frac{2}{5}x^2 + \frac{6}{5}yx\right]_{x=0}^{x=1} dy$$

$$= \int_0^1 \left(\frac{2}{5} + \frac{6}{5}y\right) dy = \left[\frac{2}{5}y + \frac{6}{10}y^2\right]_0^1 = \frac{2}{5} + \frac{6}{10} = 1$$

(b) $$P[(X, Y) \in A] = P\left(0 < X < \tfrac{1}{2},\ \tfrac{1}{4} < Y < \tfrac{1}{2}\right) = \int_{y=1/4}^{1/2}\int_{x=0}^{1/2} \frac{2}{5}(2x + 3y)\,dx\,dy = \int_{1/4}^{1/2} \left[\frac{2}{5}x^2 + \frac{6}{5}yx\right]_{x=0}^{x=1/2} dy$$

$$= \int_{1/4}^{1/2} \left(\frac{1}{10} + \frac{3}{5}y\right) dy = \left[\frac{1}{10}y + \frac{3}{10}y^2\right]_{1/4}^{1/2} = \frac{1}{10}\left(\frac{1}{2} - \frac{1}{4}\right) + \frac{3}{10}\left(\frac{1}{4} - \frac{1}{16}\right) = \frac{13}{160}$$

Marginal Probability Distributions

The marginal probability distribution functions of X alone and of Y alone are

$$g(x) = \sum_y f(x, y) \quad \text{and} \quad h(y) = \sum_x f(x, y)$$

for the discrete case, and

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{and} \quad h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$$

for the continuous case.

The term marginal is used here because, in the discrete case, the values of g(x) and h(y) are just the marginal totals of the respective columns and rows when the values of f(x, y) are displayed in a rectangular table.

Show that the column and row totals from the "ballpoint pens" example give the marginal distribution of X alone and of Y alone.
f(x, y)          x = 0    x = 1    x = 2    Row totals
y = 0            3/28     9/28     3/28     15/28
y = 1            3/14     3/14     0        3/7
y = 2            1/28     0        0        1/28
Column totals    5/14     15/28    3/28     1

$$g(0) = \sum_{y=0}^{2} f(0, y) = f(0, 0) + f(0, 1) + f(0, 2) = \frac{3}{28} + \frac{3}{14} + \frac{1}{28} = \frac{5}{14}$$

$$g(1) = \sum_{y=0}^{2} f(1, y) = f(1, 0) + f(1, 1) + f(1, 2) = \frac{9}{28} + \frac{3}{14} + 0 = \frac{15}{28}$$

$$g(2) = \sum_{y=0}^{2} f(2, y) = f(2, 0) + f(2, 1) + f(2, 2) = \frac{3}{28} + 0 + 0 = \frac{3}{28}$$

It is found that the values of g(x) are just the column totals of the table above. In a similar manner we could show that the values of h(y) are given by the row totals.

Find g(x) and h(y) for the joint density function of the "drive-in and walk-in facility" example.

$$f(x, y) = \begin{cases} \dfrac{2}{5}(2x + 3y), & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0, & \text{elsewhere} \end{cases}$$

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 \frac{2}{5}(2x + 3y)\,dy = \left[\frac{4}{5}xy + \frac{6}{10}y^2\right]_0^1 = \frac{4}{5}x + \frac{3}{5}, \quad 0 \le x \le 1$$

$$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^1 \frac{2}{5}(2x + 3y)\,dx = \left[\frac{2}{5}x^2 + \frac{6}{5}yx\right]_0^1 = \frac{2}{5} + \frac{6}{5}y, \quad 0 \le y \le 1$$

Conditional Probability Distributions

Let X and Y be two random variables, discrete or continuous. The conditional probability distribution function of the random variable Y, given that X = x, is

$$f(y \mid x) = \frac{f(x, y)}{g(x)}, \quad g(x) > 0$$

Similarly, the conditional distribution of the random variable X, given that Y = y, is

$$f(x \mid y) = \frac{f(x, y)}{h(y)}, \quad h(y) > 0$$

If one wishes to find the probability that the discrete random variable X falls between a and b when it is known that the discrete variable Y = y, we evaluate

$$P(a < X < b \mid Y = y) = \sum_x f(x \mid y)$$

where the summation extends over all available values of X between a and b.
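For the discrete case, the marginal totals and the conditional sum above can be sketched with the joint probabilities of the "ballpoint pens" example. This is a hedged illustration only: the names `joint`, `f`, `g`, `h`, `conditional_x_given_y`, and `prob_x_between` are chosen here for convenience and are not taken from the lecture.

```python
from fractions import Fraction
from math import comb

# Joint pmf of the ballpoint pens example: x blue and y red pens in 2 draws
# from 3 blue, 2 red, and 3 green pens.
joint = {
    (x, y): Fraction(comb(3, x) * comb(2, y) * comb(3, 2 - x - y), comb(8, 2))
    for x in range(3)
    for y in range(3)
    if x + y <= 2
}


def f(x, y):
    """Joint probability f(x, y); pairs outside the support have probability 0."""
    return joint.get((x, y), Fraction(0))


# Marginal distributions: sum the joint pmf over the other variable.
g = {x: sum(f(x, y) for y in range(3)) for x in range(3)}
h = {y: sum(f(x, y) for x in range(3)) for y in range(3)}


def conditional_x_given_y(y):
    """f(x | y) = f(x, y) / h(y), defined only where h(y) > 0."""
    if h[y] == 0:
        raise ValueError("h(y) must be positive")
    return {x: f(x, y) / h[y] for x in range(3)}


def prob_x_between(a, b, y):
    """P(a < X < b | Y = y): sum f(x | y) over the values x strictly between a and b."""
    return sum(p for x, p in conditional_x_given_y(y).items() if a < x < b)


print(g[0], g[1], g[2])
print(h[0], h[1], h[2])
print(prob_x_between(-1, 1, 1))
```

The marginal values g(0), g(1), g(2) reproduce the totals 5/14, 15/28, and 3/28 computed by hand. The last line sums f(x | 1) over the single value x = 0 lying strictly between -1 and 1.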
When X and Y are continuous, we can find the probability that X lies between a and b by evaluating

$$P(a < X < b \mid Y = y) = \int_a^b f(x \mid y)\,dx$$

Referring back to the "ballpoint pens" example, find the conditional distribution of X, given that Y = 1, and use it to determine P(X = 0 | Y = 1).

$$f(x \mid y) = \frac{f(x, y)}{h(y)} \quad \Rightarrow \quad f(x \mid 1) = \frac{f(x, 1)}{h(1)}, \quad x = 0, 1, 2$$

$$h(1) = \sum_{x=0}^{2} f(x, 1) = \frac{3}{14} + \frac{3}{14} + 0 = \frac{3}{7}$$

$$f(0 \mid 1) = \frac{f(0, 1)}{h(1)} = \frac{3/14}{3/7} = \frac{1}{2}, \qquad f(1 \mid 1) = \frac{f(1, 1)}{h(1)} = \frac{3/14}{3/7} = \frac{1}{2}, \qquad f(2 \mid 1) = \frac{f(2, 1)}{h(1)} = \frac{0}{3/7} = 0$$

$$P(X = 0 \mid Y = 1) = f(0 \mid 1) = \frac{1}{2}$$

Given the joint density function

$$f(x, y) = \begin{cases} \dfrac{x(1 + 3y^2)}{4}, & 0 < x < 2,\ 0 < y < 1 \\ 0, & \text{elsewhere} \end{cases}$$

find g(x), h(y), f(x|y), and evaluate P(1/4 < X < 1/2 | Y = 1/3).

$$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 \frac{x(1 + 3y^2)}{4}\,dy = \left[\frac{x(y + y^3)}{4}\right]_0^1 = \frac{x}{2}, \quad 0 < x < 2$$

$$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_0^2 \frac{x(1 + 3y^2)}{4}\,dx = \left[\frac{x^2(1 + 3y^2)}{8}\right]_0^2 = \frac{1 + 3y^2}{2}, \quad 0 < y < 1$$

$$f(x \mid y) = \frac{f(x, y)}{h(y)} = \frac{x(1 + 3y^2)/4}{(1 + 3y^2)/2} = \frac{x}{2}, \quad 0 < x < 2,\ 0 < y < 1$$

$$P\left(\tfrac{1}{4} < X < \tfrac{1}{2} \,\middle|\, Y = \tfrac{1}{3}\right) = \int_{1/4}^{1/2} f\left(x \,\middle|\, y = \tfrac{1}{3}\right) dx = \int_{1/4}^{1/2} \frac{x}{2}\,dx = \left.\frac{x^2}{4}\right|_{1/4}^{1/2} = \frac{3}{64}$$

Statistical Independence

Let X and Y be two random variables, discrete or continuous, with joint probability distribution f(x, y) and marginal distributions g(x) and h(y), respectively. The random variables X and Y are said to be statistically independent if and only if

$$f(x, y) = g(x)\,h(y)$$

for all (x, y) within their range.

Consider the following joint probability density function of random variables X and Y.

$$f(x, y) = \begin{cases} \dfrac{3x - y}{18}, & 1 < x < 4,\ 1 < y < 2 \\ 0, & \text{elsewhere} \end{cases}$$

(a) Find the marginal density functions of X and Y
(b) Are X and Y statistically independent?
(c) Find P(X > 2 | Y = 2)

(a) $$g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_1^2 \frac{3x - y}{18}\,dy = \left[\frac{6xy - y^2}{36}\right]_1^2 = \frac{6x - 3}{36} = \frac{2x - 1}{12}, \quad 1 < x < 4$$

$$h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \int_1^4 \frac{3x - y}{18}\,dx = \left[\frac{3x^2 - 2yx}{36}\right]_1^4 = \frac{45 - 6y}{36} = \frac{15 - 2y}{12}, \quad 1 < y < 2$$

(b) Are X and Y statistically independent?

$$g(x)\,h(y) = \left(\frac{2x - 1}{12}\right)\left(\frac{15 - 2y}{12}\right) \ne \frac{3x - y}{18} = f(x, y)$$

X and Y are not statistically independent.

(c) Find P(X > 2 | Y = 2)

$$P(X > 2 \mid Y = 2) = \int_2^4 f(x \mid y)\Big|_{y=2}\,dx = \int_2^4 \left.\frac{f(x, y)}{h(y)}\right|_{y=2} dx = \int_2^4 \frac{f(x, 2)}{h(2)}\,dx = \int_2^4 \frac{(3x - 2)/18}{(15 - 4)/12}\,dx$$

$$= \frac{2}{33}\int_2^4 (3x - 2)\,dx = \frac{2}{33}\left[\frac{3}{2}x^2 - 2x\right]_2^4 = \frac{28}{33}$$

Let X1, X2, ..., Xn be n random variables, discrete or continuous, with joint probability distribution f(x1, x2, ..., xn) and marginal distributions f1(x1), f2(x2), ..., fn(xn), respectively. The random variables X1, X2, ..., Xn are said to be mutually statistically independent if and only if

$$f(x_1, x_2, \ldots, x_n) = f_1(x_1)\,f_2(x_2) \cdots f_n(x_n)$$

for all (x1, x2, ..., xn) within their range.

Suppose that the shelf life, in years, of a certain perishable food product packaged in cardboard containers is a random variable whose probability density function is given by

$$f(x) = \begin{cases} e^{-x}, & x > 0 \\ 0, & \text{elsewhere} \end{cases}$$

Let X1, X2, and X3 represent the shelf lives for three of these containers selected independently, and find P(X1 < 2, 1 < X2 < 3, X3 > 2).

$$f(x_1, x_2, x_3) = f(x_1)\,f(x_2)\,f(x_3) = e^{-x_1} e^{-x_2} e^{-x_3}$$

$$P(X_1 < 2,\ 1 < X_2 < 3,\ X_3 > 2) = \int_2^{\infty}\int_1^3\int_0^2 e^{-x_1} e^{-x_2} e^{-x_3}\,dx_1\,dx_2\,dx_3 = \left[-e^{-x_1}\right]_0^2 \left[-e^{-x_2}\right]_1^3 \left[-e^{-x_3}\right]_2^{\infty}$$

$$= (1 - e^{-2})(e^{-1} - e^{-3})(e^{-2}) = 0.0372$$

Probability and Statistics
Homework 3

1.
A game is played with the rule that a counter will move forward one, two, or four places according to whether the scores on the two dice rolled differ by three or more, by one or two, or are equal. Here we define a random variable, M, the number of places moved, which can take the value 1, 2, or 4. Determine the probability distribution of M. (Sou.04.E1 s.2)

2. Let the random variable X denote the time until a computer server connects to your notebook (in milliseconds), and let Y denote the time until the server authorizes you as a valid user (in milliseconds). Each of these random variables measures the wait from a common starting time. Assume that the joint probability density function for X and Y is

$$f(x, y) = \begin{cases} 2 \times 10^{-6}\, e^{-0.001x - 0.002y}, & x > 0,\ y > 0 \\ 0, & \text{elsewhere} \end{cases}$$

(a) Show that X and Y are independent.
(b) Determine P(X > 1000, Y < 1000).
(Mo.E5.20)