3. Random Variables Let (, F, P) be a probability model for an experiment, and X a function that maps every , to a unique point x R, the set of real numbers. Since the outcome is not certain, so is the value X ( ) x. Thus if B is some subset of R, we may want to determine the probability of “ X ( ) B ”. To determine this probability, we can look at the set A X 1 ( B) that contains all that maps into B under the function X. X ( ) x A B Fig. 3.1 R 1 PILLAI Obviously, if the set A X 1( B) also belongs to the associated field F, then it is an event and the probability of A is well defined; in that case we can say Probability of theevent" X ( ) B " P( X 1 ( B)). (3-1) However, X 1( B) may not always belong to F for all B, thus creating difficulties. The notion of random variable (r.v) makes sure that the inverse mapping always results in an event so that we are able to determine the probability for any B R. Random Variable (r.v): A finite single valued function X ( ) that maps the set of all experimental outcomes into the set of real numbers R is said to be a r.v, if the set | X ( ) x is an event ( F ) for every x in R. 2 PILLAI Alternatively X is said to be a r.v, if X 1( B) F where B represents semi-definite intervals of the form { x a} and all other sets that can be constructed from these sets by performing the set operations of union, intersection and negation any number of times. The Borel collection B of such subsets of R is the smallest -field of subsets of R that includes all semi-infinite intervals of the above form. Thus if X is a r.v, then | X ( ) x X x (3-2) is an event for every x. What about a X b , X a? Are they also events ? In fact with b a since {X a} and X b are events, X a c X a is an event and hence X a X b {a X b} is also an event. 3 PILLAI 1 a X a n Thus, Consequently is an event for every n. 1 a X a { X a} n n 1 (3-3) is also an event. All events have well defined probability. Thus the probability of the event | X ( ) x must depend on x. Denote P | X ( ) x FX ( x) 0. (3-4) The role of the subscript X in (3-4) is only to identify the actual r.v. FX (x) is said to the Probability Distribution Function (PDF) associated with the r.v X. 4 PILLAI Distribution Function: Note that a distribution function g(x) is nondecreasing, right-continuous and satisfies g () 1, g () 0, (3-5) i.e., if g(x) is a distribution function, then (i) g () 1, g () 0, (ii) if x1 x2 , then g ( x1 ) g ( x2 ), (3-6) and g ( x ) g ( x), for all x. (iii) We need to show that FX (x) defined in (3-4) satisfies all properties in (3-6). In fact, for any r.v X, 5 PILLAI (i) FX () P | X ( ) P() 1 (3-7) and FX () P | X ( ) P( ) 0. (3-8) (ii) If x1 x2 , then the subset (, x1 ) (, x2 ). Consequently the event | X ( ) x1 | X ( ) x2 , since X ( ) x1 implies X ( ) x2 .As a result FX ( x1 ) P X ( ) x1 P X ( ) x2 FX ( x2 ), (3-9) implying that the probability distribution function is nonnegative and monotone nondecreasing. (iii) Let x xn xn1 x2 x1, and consider the event since Ak | x X ( ) xk . (3-10) x X ( ) xk X ( ) x X ( ) xk , (3-11) 6 PILLAI using mutually exclusive property of events we get P( Ak ) Px X ( ) xk FX ( xk ) FX ( x). But Ak 1 Ak Ak 1 , lim Ak Ak k (3-12) and hence and hence lim P( Ak ) 0. (3-13) k k 1 Thus lim P( Ak ) lim FX ( xk ) FX ( x ) 0. k k xk x , the right limit of x, and hence But lim k FX ( x ) FX ( x), (3-14) i.e., FX (x) is right-continuous, justifying all properties of a distribution function. 7 PILLAI Additional Properties of a PDF (iv) If FX ( x0 ) 0 for some x0 , then FX ( x) 0, x x0. (3-15) This follows, since FX ( x0 ) P X ( ) x0 0 implies X ( ) x0 is the null set, and for any x x0 , X ( ) x will be a subset of the null set. (v) P X ( ) x 1 FX ( x). (3-16) We have X ( ) x X ( ) x , and since the two events are mutually exclusive, (16) follows. (vi) P x1 X ( ) x2 FX ( x2 ) FX ( x1 ), x2 x1. (3-17) The events X ( ) x1 and {x1 X ( ) x2} are mutually exclusive and their union represents the event X ( ) x2 . 8 PILLAI (vii) PX ( ) x FX ( x) FX ( x ). (3-18) Let x1 x , 0, and x2 x. From (3-17) lim P x X ( ) x FX ( x ) lim FX ( x ), (3-19) P X ( ) x FX ( x) FX ( x ). (3-20) 0 0 or According to (3-14), FX ( x0 ), the limit of FX (x) as x x0 from the right always exists and equals FX ( x0 ).However the left limit value FX ( x0 ) need not equal FX ( x0 ). Thus FX (x) need not be continuous from the left. At a discontinuity point of the distribution, the left and right limits are different, and from (3-20) P X ( ) x0 FX ( x0 ) FX ( x0 ) 0. (3-21) 9 PILLAI Thus the only discontinuities of a distribution function FX (x) are of the jump type, and occur at points x0 where (3-21) is satisfied. These points can always be enumerated as a sequence, and moreover they are at most countable in number. Example 3.1: X is a r.v such that Solution: For FX (x). X ( ) Find c, . so that x and c, X (for ) x , FX ( xso ) that 0, FX ( x ) 1. (Fig.3.2) x c, X ( ) x , FX (x) 1 c x Fig. 3.2 Example 3.2: Toss a coin. H ,T . Suppose the r.v X is 10 such that X (T ) 0, X ( H ) 1. Find FX (x). PILLAI Solution: For x 0, X ( ) x , so that FX ( x) 0. 0 x 1, x 1, X ( ) x T , so that FX ( x) P T 1 p, X ( ) x H , T , so that FX ( x) 1. (Fig. 3.3) FX (x) 1 q 1 x Fig.3.3 •X is said to be a continuous-type r.v if its distribution function FX (x) is continuous. In that case FX ( x ) FX ( x) for all x, and from (3-21) we get PX x 0. •If FX (x) is constant except for a finite number of jump discontinuities(piece-wise constant; step-type), then X is said to be a discrete-type r.v. If xi is such a discontinuity point, then from (3-21) (3-22) 11 pi PX xi FX ( xi ) FX ( xi ). PILLAI From Fig.3.2, at a point of discontinuity we get P X c FX (c) FX (c ) 1 0 1. and from Fig.3.3, P X 0 FX (0) FX (0 ) q 0 q. Example:3.3 A fair coin is tossed twice, and let the r.v X represent the number of heads. Find FX (x ). Solution: In this case HH , HT , TH , TT , and X ( HH ) 2, X ( HT ) 1, X (TH ) 1, X (TT ) 0. x 0, X ( ) x FX ( x ) 0, 0 x 1, X ( ) x TT FX ( x ) P TT P (T ) P (T ) 1 , 4 1 x 2, X ( ) x TT , HT , TH FX ( x ) P TT , HT , TH x 2, X ( ) x FX ( x) 1. (Fig. 3.4) 12 PILLAI 3 , 4 From Fig.3.4, PX 1 FX (1) FX (1 ) 3 / 4 1 / 4 1 / 2. FX (x) 1 3/ 4 1/ 4 1 x 2 Fig. 3.4 Probability density function (p.d.f) The derivative of the distribution function FX (x) is called the probability density function f X (x) of the r.v X. Thus f X ( x) dFX ( x ) . dx (3-23) Since dFX ( x ) F ( x x ) FX ( x ) lim X 0, x 0 dx x from the monotone-nondecreasing nature of FX ( x ), (3-24) 13 PILLAI it follows that f X ( x) 0 for all x. f X (x) will be a continuous function, if X is a continuous type r.v. However, if X is a discrete type r.v as in (3-22), then its f (x) p.d.f has the general form (Fig. 3.5) pi X f X ( x) pi ( x xi ), (3-25) i xi x Fig. 3.5 where xi represent the jump-discontinuity points in FX (x). As Fig. 3.5 shows f X (x) represents a collection of positive discrete masses, and it is known as the probability mass function (p.m.f ) in the discrete case. From (3-23), we also obtain by integration FX ( x) x f x (u )du. (3-26) Since FX () 1, (3-26) yields f x ( x )dx 1, (3-27) 14 PILLAI which justifies its name as the density function. Further, from (3-26), we also get (Fig. 3.6b) P x1 X ( ) x2 FX ( x2 ) FX ( x1 ) x x2 f X ( x)dx. (3-28) 1 Thus the area under f X (x) in the interval the probability in (3-28). FX (x) represents f X (x) 1 x1 x2 ( x1 , x2 ) x (a) x x1 x2 (b) Fig. 3.6 Often, r.vs are referred by their specific density functions both in the continuous and discrete cases - and in what follows we shall list a number of them in each category. 15 PILLAI Continuous-type random variables 1. Normal (Gaussian): X is said to be normal or Gaussian r.v, if f X ( x) 1 2 2 e ( x ) 2 / 2 2 (3-29) . This is a bell shaped curve, symmetric around the parameter , and its distribution function is given by FX ( x) x 1 2 2 e ( y ) 2 / 2 2 x dy G , (3-30) where G( x) 1 e dy is often tabulated. Since f X (x) 2 depends on two parameters and 2 , the notation X N ( , 2 ) will be used to represent (3-29). f (x) x y2 / 2 X Fig. 3.7 x 16 PILLAI 2. Uniform: X U (a, b), a b, if (Fig. 3.8) 1 , a x b, f X ( x) b a 0, ot herwise. (3.31) 3. Exponential: X ( ) if (Fig. 3.9) 1 e x / , x 0, f X ( x) 0, ot herwise. (3-32) f X (x) f X (x) 1 ba a Fig. 3.8 b x x Fig. 3.9 17 PILLAI 4. Gamma: X G( , ) if ( 0, 0) (Fig. 3.10) f X (x) 1 x x / e , x 0, f X ( x ) ( ) 0, ot herwise. If n an integer (3-33) x Fig. 3.10 (n) (n 1)!. f X (x ) 5. Beta: X (a, b) if (a 0, b 0) (Fig. 3.11) 0 x 1 Fig. 3.11 1 a 1 b 1 x ( 1 x ) , 0 x 1 , f X ( x ) ( a , b) (3-34) 0, ot herwise. where the Beta function (a, b) is defined as 1 (a, b) u a 1 (1 u)b1 du. 0 (3-35) 18 PILLAI 6. Chi-Square: X 2 (n), if (Fig. 3.12) 1 x n / 21e x / 2 , x 0, n/2 f X ( x ) 2 ( n / 2 ) (3-36) 0, ot herwise. f X (x ) x Fig. 3.12 Note that 2 (n) is the same as Gamma (n / 2, 2). 7. Rayleigh: X R( 2 ), if (Fig. 3.13) 2 2 x 2 e x / 2 , x 0, f X ( x ) 0, ot herwise. f X (x) (3-37) x Fig. 3.13 8. Nakagami – m distribution: 2 m m 2 m 1 mx 2 / x e , x0 f X ( x ) ( m ) 0 otherwise (3-38) 19 PILLAI f X (x ) 9. Cauchy: X C( , ), if (Fig. 3.14) f X ( x) / (x ) 2 2 , x . x (3-39) Fig. 3.14 10. Laplace: (Fig. 3.15) f X ( x) 1 |x|/ e , x . 2 (3-40) 11. Student’s t-distribution with n degrees of freedom (Fig 3.16) f T (t ) ( n 1) / 2 t 1 n n ( n / 2) 2 f X ( x) ( n 1) / 2 , t . fT ( t ) x Fig. 3.15 (3-41) t Fig. 3.16 20 PILLAI 12. Fisher’s F-distribution {( m n ) / 2} m m / 2 n n / 2 z m / 2 1 , z0 (mn) / 2 f z ( z) ( m / 2) ( n / 2) ( n mz ) (3-42) 0 otherwise 21 PILLAI Discrete-type random variables 1. Bernoulli: X takes the values (0,1), and P( X 0) q, P( X 1) p. (3-43) 2. Binomial: X B(n, p), if (Fig. 3.17) n k n k P( X k ) , k 0,1,2,, n. k p q (3-44) 3. Poisson: X P( ) , if (Fig. 3.18) P( X k ) e k k! , k 0,1,2,, . (3-45) P( X k ) P( X k ) k 12 n Fig. 3.17 Fig. 3.18 22 PILLAI 4. Hypergeometric: P( X k ) m k N m n k , N n max(0, m n N ) k min( m, n ) (3-46) 5. Geometric: X g ( p ) if P( X k ) pqk , k 0,1,2,, , q 1 p. (3-47) 6. Negative Binomial: X ~ NB (r, p), if k 1 r k r P( X k ) p q , r 1 k r, r 1, . (3-48) 7. Discrete-Uniform: P( X k ) 1 , k 1,2,, N . N (3-49) We conclude this lecture with a general distribution due 23 PILLAI to Polya that includes both binomial and hypergeometric as special cases. Polya’s distribution: A box contains a white balls and b black balls. A ball is drawn at random, and it is replaced along with c balls of the same color. If X represents the number of white balls drawn in n such draws, X 0, 1, 2, , n, find the probability mass function of X. Solution: Consider the specific sequence of draws where k white balls are first drawn, followed by n – k black balls. The probability of drawing k successive white balls is given by pW a ac a 2c a b a b c a b 2c a (k 1)c a b (k 1)c Similarly the probability of drawing k white balls (3-50) 24 PILLAI followed by n – k black balls is given by pk p w b bc a b kc a b ( k 1)c k 1 a ab icic i 0 n k 1 j 0 b (n k 1)c a b (n 1)c b jc . a b( j k )c (3-51) Interestingly, pk in (3-51) also represents the probability of drawing k white balls and (n – k) black balls in any other specific order (i.e., The same set of numerator and denominator terms in (3-51) contribute to all other sequences n as well.) But there are k such distinct mutually exclusive sequences and summing over all of them, we obtain the Polya distribution (probability of getting k white balls in n draws) to be k 1 n k 1 P( X k ) n k pk n k i 0 a ic a bic j 0 b jc , a b( j k )c k 0,1,2, , n. 25 (3-52) PILLAI Both binomial distribution as well as the hypergeometric distribution are special cases of (3-52). For example if draws are done with replacement, then c = 0 and (3-52) simplifies to the binomial distribution P( X k ) where n k a p , ab p k qn k , k 0,1,2, ,n (3-53) b q 1 p. ab Similarly if the draws are conducted without replacement, Then c = – 1 in (3-52), and it gives P( X k ) n k a(a 1)(a 2) (a k 1) b(b 1) (b n k 1) (a b)(a b 1) (a b k 1) (a b k ) (a b n 1) 26 PILLAI n! a !(a b k )! b!(a b n )! P( X k ) k !(n k )! (a k )!(a b)! (b n k )!(a b k )! a b k n k a b n (3-54) which represents the hypergeometric distribution. Finally c = +1 gives (replacements are doubled) P( X k ) = n k ( a k 1)! ( a b 1)! (b n k 1)! ( a b k 1)! ( a 1)! ( a b k 1)! (b 1)! ( a b n 1)! a k 1 b n k 1 k n k a b n 1 n . (3-55) we shall refer to (3-55) as Polya’s +1 distribution. the general Polya distribution in (3-52) has been used to study the spread of contagious diseases (epidemic modeling). 27 PILLAI