Chapter 2. Random Variables 2.1 2.2 2.3 2.4 2.5 2.6 NIPRL Discrete Random Variables Continuous Random Variables The Expectation of a Random Variable The Variance of a Random Variable Jointly Distributed Random Variables Combinations and Functions of Random Variables 2.1 Discrete Random Variable 2.1.1 Definition of a Random Variable (1/2) • Random variable – A numerical value to each outcome of a particular experiment S R -3 NIPRL -2 -1 0 1 2 3 2.1.1 Definition of a Random Variable (2/2) • Example 1 : Machine Breakdowns – Sample space : S {electrical , m echanical , m isuse} – Each of these failures may be associated with a repair cost – State space : {50, 200, 350} – Cost is a random variable : 50, 200, and 350 NIPRL 2.1.2 Probability Mass Function (1/2) • Probability Mass Function (p.m.f.) – A set of probability value p i assigned to each of the values taken by the discrete random variable x i – 0 p i 1 and i p i 1 – Probability : P ( X x i ) p i NIPRL 2.1.2 Probability Mass Function (1/2) • Example 1 : Machine Breakdowns – P (cost=50)=0.3, P (cost=200)=0.2, P (cost=350)=0.5 – 0.3 + 0.2 + 0.5 =1 f (x) xi 50 200 350 pi 0.3 0.2 0.5 0.5 0.3 0.2 50 NIPRL 200 350 C ost($) 2.1.3 Cumulative Distribution Function (1/2) • Cumulative Distribution Function – Function : F ( x ) P ( X x ) F ( x ) – Abbreviation : c.d.f P( X y) y: y x F (x) 1.0 0.5 0.3 0 NIPRL 50 200 350 x ($cost) 2.1.3 Cumulative Distribution Function (2/2) • Example 1 : Machine Breakdowns x 50 F ( x ) P (cost x ) 0 50 x 200 F ( x ) P (cost x ) 0.3 200 x 350 F ( x ) P (cost x ) 0.3 0.2 0.5 350 x F ( x ) P (cost x ) 0.3 0.2 0.5 1.0 NIPRL 2.2 Continuous Random Variables 2.2.1 Example of Continuous Random Variables (1/1) • Example 14 : Metal Cylinder Production – Suppose that the random variable X is the diameter of a randomly chosen cylinder manufactured by the company. Since this random variable can take any value between 49.5 and 50.5, it is a continuous random variable. NIPRL 2.2.2 Probability Density Function (1/4) • Probability Density Function (p.d.f.) – Probabilistic properties of a continuous random variable f ( x) 0 NIPRL f ( x ) dx 1 statespace 2.2.2 Probability Density Function (2/4) • Example 14 – Suppose that the diameter of a metal cylinder has a p.d.f f ( x ) 1.5 6( x 50.2) 2 for 49.5 x 50.5 f ( x ) 0, elsew here f (x) 49.5 NIPRL 50.5 x 2.2.2 Probability Density Function (3/4) • This is a valid p.d.f. 50.5 49.5 (1 .5 6 ( x 5 0 .0 ) ) d x [1 .5 x 2 ( x 5 0 .0 ) ] 49.5 2 3 50.5 [1 .5 5 0 .5 2 (5 0 .5 5 0 .0 ) ] 3 [1 .5 4 9 .5 2 (4 9 .5 5 0 .0 ) ] 3 7 5 .5 7 4 .5 1 .0 NIPRL 2.2.2 Probability Density Function (4/4) • The probability that a metal cylinder has a diameter between 49.8 and 50.1 mm can be calculated to be 5 0 .1 4 9 .8 (1 .5 6 ( x 5 0 .0 ) ) d x [1 .5 x 2 ( x 5 0 .0 ) ] 4 9 .8 2 3 5 0 .1 [1 .5 5 0 .1 2 (5 0 .1 5 0 .0 ) ] 3 [1 .5 4 9 .8 2 ( 4 9 .8 5 0 .0 ) ] 3 7 5 .1 4 8 7 4 .7 1 6 0 .4 3 2 f (x) 49.5 NIPRL 49.8 50.1 50.5 x 2.2.3 Cumulative Distribution Function (1/3) • Cumulative Distribution Function F ( x) P ( X x) f ( x) x f ( y )dy dF ( x) dx P (a X b) P ( X b) P ( X a ) F (b ) F ( a ) P (a X b) P (a X b) NIPRL 2.2.2 Probability Density Function (2/3) • Example 14 F ( x) P ( X x) x (1.5 6( y 50.0) ) dy 2 49.5 [1.5 y 2( y 50.0) ] 49.5 3 x [1.5 x 2( x 50.0) ] [1.5 49.5 2(49.5 50.0) ] 3 3 1.5 x 2( x 50.0) 74.5 3 P (49.7 X 50.0) F (50.0) F (49.7 ) (1.5 50.0 2(50.0 50.0) 74.5) 3 (1.5 49.7 2(49.7 50.0) 74.5) 3 0.5 0.104 0.396 NIPRL 2.2.2 Probability Density Function (3/3) P (49.7 X 50.0) 0.396 1 P ( X 50.0) 0.5 F (x) P ( X 49.7) 0.104 49.5 NIPRL 49.7 50.0 50.5 x 2.3 The Expectation of a Random Variable 2.3.1 Expectations of Discrete Random Variables (1/2) • Expectation of a discrete random variable with p.m.f P ( X xi ) p i E(X ) p i xi i • Expectation of a continuous random variable with p.d.f f(x) E(X ) xf ( x ) d x state sp ace • The expected value of a random variable is also called the mean of the random variable NIPRL 2.3.1 Expectations of Discrete Random Variables (2/2) • Example 1 (discrete random variable) – The expected repair cost is E (cost ) ($50 0.3) ($200 0.2) ($350 0.5) $230 NIPRL 2.3.2 Expectations of Continuous Random Variables (1/2) • Example 14 (continuous random variable) – The expected diameter of a metal cylinder is E(X ) 5 0 .5 x (1 .5 6 ( x 5 0 .0 ) ) d x 2 4 9 .5 – Change of variable: y=x-50 E (x) 0.5 0.5 0.5 0.5 ( y 50)(1.5 6 y ) dy 2 ( 6 y 300 y 1.5 y 75) dy 3 2 [ 3 y / 2 100 y 0.75 y 75 y ] 0.5 4 3 2 [25.09375] [ 24.90625] 50.0 NIPRL 0.5 2.3.2 Expectations of Continuous Random Variables (2/2) • Symmetric Random Variables – If x has a p.d.f f ( x ) that is symmetric about a point so that f ( x) f ( x) – Then, E ( X ) E(X ) (why?) – So that the expectation of the random variable is equal to the point of symmetry NIPRL f (x) x E(X ) xf ( x ) d x - xf ( x ) d x + xf ( x ) d x y 2 x - NIPRL xf ( x ) d x + - yf ( y ) d y 2.3.3 Medians of Random Variables (1/2) • Median – Information about the “middle” value of the random variable F ( x ) 0.5 • Symmetric Random Variable – If a continuous random variable is symmetric about a point , then both the median and the expectation of the random variable are equal to NIPRL 2.3.3 Medians of Random Variables (2/2) • Example 14 F ( x ) 1.5 x 2( x 50.0) 74.5 0.5 3 x 5 0 .0 NIPRL 2.4 The variance of a Random Variable 2.4.1 Definition and Interpretation of Variance (1/2) • Variance( ) – A positive quantity that measures the spread of the distribution of the random variable about its mean value – Larger values of the variance indicate that the distribution is more spread out 2 – Definition: V ar ( X ) E (( X E ( X )) ) 2 E ( X ) ( E ( X )) 2 • Standard Deviation – The positive square root of the variance – Denoted by NIPRL 2 2.4.1 Definition and Interpretation of Variance (2/2) V a r ( X ) E (( X E ( X )) ) 2 E(X 2 2 X E ( X ) ( E ( X )) ) E(X 2 ) 2 E ( X ) E ( X ) ( E ( X )) E(X 2 ) ( E ( X )) 2 2 2 f (x) Two distribution with identical mean values but different variances x NIPRL 2.4.2 Examples of Variance Calculations (1/1) • Example 1 V ar ( X ) E (( X E ( X )) ) p i ( x i E ( X )) 2 2 i 0 .3(5 0 2 3 0 ) 0 .2 (2 0 0 2 3 0 ) 0 .5(3 5 0 2 3 0 ) 2 1 7 ,1 0 0 NIPRL 2 17,100 130.77 2 2 2.4.3 Chebyshev’s Inequality (1/1) • Chebyshev’s Inequality – If a random variable has a mean and a variance 2, then P ( c X c ) 1 1 c 2 for c 1 – For example, taking c 2 gives P ( 2 X 2 ) 1 1 2 NIPRL 2 0.75 • Proof 2 ( x ) f ( x ) dx 2 ( x ) f ( x ) dx c 2 | x | c P (| x | c ) 1 / c 2 2 | x | c 2 P (| x | c ) 1 P (| x | c ) 1 1 / c NIPRL f ( x ) dx . 2 2.4.4 Quantiles of Random Variables (1/2) • Quantiles of Random variables – The p th quantile of a random variable X F ( x) p – A probability of p that the random variable takes a value less than the p th quantile • Upper quartile – The 75th percentile of the distribution • Lower quartile – The 25th percentile of the distribution • Interquartile range – The distance between the two quartiles NIPRL 2.4.4 Quantiles of Random Variables (2/2) • Example 14 3 F ( x ) 1.5 x 2( x 50.0) 74.5 for 49.5 x 50.5 – Upper quartile : F ( x ) 0.75 x 5 0 .1 7 – Lower quartile : F ( x ) 0.25 x 4 9 .8 3 – Interquartile range : 5 0 .1 7 4 9 .8 3 0 .3 4 NIPRL 2.5 Jointly Distributed Random Variables 2.5.1 Jointly Distributed Random Variables (1/4) • Joint Probability Distributions – Discrete P ( X x i , Y y j ) p ij 0 s a tis fyin g i p ij 1 j – Continuous f ( x , y ) 0 satisfying state space NIPRL f ( x , y ) dxdx 1 2.5.1 Jointly Distributed Random Variables (2/4) • Joint Cumulative Distribution Function – Discrete F ( x , y ) P ( X xi , Y y j ) – Continuous F ( x, y ) p ij i : xi x j : y j y F ( x, y ) NIPRL x w y z f ( w , z ) d zd w 2.5.1 Jointly Distributed Random Variables (3/4) • Example 19 : Air Conditioner Maintenance – A company that services air conditioner units in residences and office blocks is interested in how to schedule its technicians in the most efficient manner – The random variable X, taking the values 1,2,3 and 4, is the service time in hours – The random variable Y, taking the values 1,2 and 3, is the number of air conditioner units NIPRL 2.5.1 Jointly Distributed Random Variables (4/4) Y= number of units • Joint p.m.f X=service time 1 2 3 4 1 0.12 0.08 0.07 0.05 2 0.08 0.15 0.21 0.13 i p ij 0.12 0.18 j 0.07 1.00 • Joint cumulative distribution function F (2, 2) p11 p12 p 21 p 22 0.12 0.18 0.08 0.15 3 NIPRL 0.01 0.01 0.02 0.07 0.43 2.5.2 Marginal Probability Distributions (1/2) • Marginal probability distribution – Obtained by summing or integrating the joint probability distribution over the values of the other random variable – Discrete P ( X i ) pi p ij j – Continuous f X (x) NIPRL f ( x, y )dy 2.5.2 Marginal Probability Distributions (2/2) • Example 19 – Marginal p.m.f of X 3 P ( X 1) p1 j 0 .1 2 0 .0 8 0 .0 1 0 .2 1 j 1 – Marginal p.m.f of Y 4 P (Y 1) i 1 NIPRL p i1 0.12 0.08 0.07 0.05 0.32 • Example 20: (a jointly continuous case) • Joint pdf: f ( x, y ) • Marginal pdf’s of X and Y: NIPRL f X ( x) f ( x , y ) dy fY ( y ) f ( x , y ) dx 2.5.3 Conditional Probability Distributions (1/2) • Conditional probability distributions – The probabilistic properties of the random variable X under the knowledge provided by the value of Y – Discrete p ij P ( X i, Y j ) p i| j P ( X i | Y j ) P (Y j ) p j – Continuous f X |Y y ( x ) f ( x, y ) fY ( y ) – The conditional probability distribution is a probability distribution. NIPRL 2.5.3 Conditional Probability Distributions (2/2) • Example 19 – Marginal probability distribution of Y P (Y 3) p 3 0.01 0.01 0.02 0.07 0.11 – Conditional distribution of X p1|Y 3 P ( X 1 | Y 3) NIPRL p13 p 3 0.01 0.11 0.091 2.5.4 Independence and Covariance (1/5) • Two random variables X and Y are said to be independent if – Discrete p ij p i p j for all values i of X and j of Y – Continuous f ( x , y ) f X ( x ) fY ( y ) fo r a ll x a n d y – How is this independency different from the independence among events? NIPRL 2.5.4 Independence and Covariance (2/5) • Covariance C ov ( X , Y ) E (( X E ( X ))(Y E (Y ))) E ( X Y ) E ( X ) E (Y ) C ov( X , Y ) E (( X E ( X ))(Y E (Y ))) E ( XY XE (Y ) E ( X )Y E ( X ) E (Y )) E ( XY ) E ( X ) E (Y ) E ( X ) E (Y ) E ( X ) E (Y ) E ( XY ) E ( X ) E (Y ) – May take any positive or negative numbers. – Independent random variables have a covariance of zero – What if the covariance is zero? NIPRL 2.5.4 Independence and Covariance (3/5) • Example 19 (Air conditioner maintenance) E ( X ) 2.59, 4 E ( XY ) 3 ijp i 1 E (Y ) 1.79 ij j 1 (1 1 0.12) (1 2 0.08) (4 3 0.07 ) 4.86 C ov ( X , Y ) E ( X Y ) E ( X ) E (Y ) 4.86 (2.59 1.79) 0.224 NIPRL 2.5.4 Independence and Covariance (4/5) • Correlation: C orr ( X , Y ) C ov ( X , Y ) V ar ( X )V ar (Y ) – Values between -1 and 1, and independent random variables have a correlation of zero NIPRL 2.5.4 Independence and Covariance (5/5) • Example 19: (Air conditioner maintenance) V ar ( X ) 1.162, V ar (Y ) 0.384 C orr ( X , Y ) C ov ( X , Y ) V ar ( X )V ar (Y ) NIPRL 0.224 1.162 0.384 0.34 • What if random variable X and Y have linear relationship, that is, Y aX b where a 0 C ov ( X , Y ) E [ X Y ] E [ X ] E [Y ] E [ X ( aX b )] E [ X ] E [ aX b ] aE [ X ] bE [ X ] aE [ X ] bE [ X ] 2 2 a ( E [ X ] E [ X ]) aV ar ( X ) 2 C orr ( X , Y ) 2 C ov ( X , Y ) V ar ( X )V ar (Y ) aV ar ( X ) That is, Corr(X,Y)=1 if a>0; -1 if a<0. NIPRL 2 V ar ( X ) a V ar ( X ) 2.6 Combinations and Functions of Random Variables 2.6.1 Linear Functions of Random Variables (1/4) • Linear Functions of a Random Variable – If X is a random variable and Y a X b for some numbers a , b R then E (Y ) aE ( X ) b and V ar (Y ) a 2 V ar ( X ) • Standardization -If a random variable X has an expectation of and a 2 variance of , X 1 Y X has an expectation of zero and a variance of one. NIPRL 2.6.1 Linear Functions of Random Variables (2/4) • Example 21:Test Score Standardization – Suppose that the raw score X from a particular testing procedure are distributed between -5 and 20 with an expected value of 10 and a variance 7. In order to standardize the scores so that they lie between 0 and 100, the linear transformation Y 4 X 2 0 is applied to the scores. NIPRL 2.6.1 Linear Functions of Random Variables (3/4) – For example, x=12 corresponds to a standardized score of y=(4ⅹ12)+20=68 E (Y ) 4 E ( X ) 20 (4 10) 20 60 V ar(Y ) 4 V ar( X ) 4 7 112 2 Y NIPRL 112 10.58 2 2.6.1 Linear Functions of Random Variables (4/4) • Sums of Random Variables – If X 1 and X 2 are two random variables, then E ( X 1 X 2 ) E ( X 1 ) E ( X 2 ) ( w hy ?) and V ar ( X 1 X 2 ) V ar ( X 1 ) V ar ( X 2 ) 2C ov ( X 1 , X 2 ) – If X 1 and X 2 are independent, so that C o v ( X 1 , X 2 ) 0 then V ar ( X 1 X 2 ) V ar ( X 1 ) V ar ( X 2 ) NIPRL • Properties of C ov ( X , X ) 1 2 C ov ( X 1 , X 2 ) E [ X 1 X 2 ] E [ X 1 ] E [ X 2 ] C ov ( X 1 , X 2 ) C ov ( X 2 , X 1 ) Cov ( X 1 , X 1 ) Var ( X 1 ) C ov ( X 2 , X 2 ) Var ( X 2 ) C ov ( X 1 X 2 , X 1 ) C ov ( X 1 , X 1 ) C ov ( X 2 , X 1 ) C ov ( X 1 X 2 , X 1 X 2 ) C ov ( X 1 , X 1 ) C ov ( X 1 , X 2 ) C ov ( X 2 , X 1 ) C ov ( X 2 , X 2 ) Var ( X 1 X 2 ) Var ( X 1 ) Var ( X 2 ) 2 Cov ( X 1 , X 2 ) NIPRL 2.6.2 Linear Combinations of Random Variables (1/5) • Linear Combinations of Random Variables – If X 1 , , X n is a sequence of random variables and a1 , and b are constants, then E ( a1 X 1 a n X n b ) a1 E ( X 1 ) an E ( X n ) b – If, in addition, the random variables are independent, then V ar ( a1 X 1 NIPRL 2 a n X n b ) a1 V ar ( X 1 ) 2 a n V ar ( X n ) , an 2.6.2 Linear Combinations of Random Variables (2/5) • Averaging Independent Random Variables – Suppose that X 1 , , X n is a sequence of independent random variables with an expectation and a variance 2 . – Let – Then X X1 Xn n E(X ) and V ar ( X ) 2 n – What happened to the variance? NIPRL 2.6.2 Linear Combinations of Random Variables (3/5) 1 E(X ) E X1 n 1 1 1 n Xn n 1 n E ( X 1) 1 n E(X n) n 1 V ar ( X ) V ar X 1 n 2 1 n NIPRL 2 1 X n V ar ( X 1 ) n n 1 2 2 1 n 2 2 n 2 1 V ar ( X n ) n 2.6.2 Linear Combinations of Random Variables (4/5) • Example 21 – The standardized scores of the two tests are Y1 10 3 X1 Y2 and 5 3 X2 50 3 – The final score is Z 2 3 NIPRL Y1 1 3 Y2 20 9 X1 5 9 X2 50 9 2.6.2 Linear Combinations of Random Variables (5/5) – The expected value of the final score is E (Z ) 20 9 E ( X1) 5 9 E(X 2) 50 9 20 5 50 18 30 9 9 9 6 2 .2 2 – The variance of the final score is 5 50 20 V ar ( Z ) V ar X1 X 2 9 9 9 2 2 20 5 V ar ( X ) 1 V ar ( X 2 ) 9 9 2 2 20 5 24 60 137.04 9 9 NIPRL 2.6.3 Nonlinear Functions of a Random Variable (1/3) • Nonlinear function of a Random Variable – A nonlinear function of a random variable X is another random variable Y=g(X) for some nonlinear function g. Y X , Y 2 X , Y e X – There are no general results that relate the expectation and variance of the random variable Y to the expectation and variance of the random variable X NIPRL 2.6.3 Nonlinear Functions of a Random Variable (2/3) – For example, f(x)=1 E(x)=0.5 f X ( x ) 1 for 0 x 1 f ( x) 0 elsew here 0 F X ( x ) x fo r 0 x 1 1 x f(y)=1/y E(y)=1.718 – Consider Y e X w here 1 Y 2.718 – What is the pdf of Y? NIPRL 1.0 2.718 y 2.6.3 Nonlinear Functions of a Random Variable (3/3) • CDF methd FY ( y ) P (Y y ) P ( e X y ) P ( X ln( y )) F x (ln( y )) ln( y ) – The p.d.f of Y is obtained by differentiation to be fY ( y ) dFY ( y ) dy 1 for 1 y 2.718 y • Notice that E (Y ) E (Y ) e NIPRL 2.718 z 1 E(X ) zf Y ( z ) dz e 0.5 2.718 z 1 1.649 1dz 2.718 1 1.718 • Determining the p.d.f. of nonlinear relationship between r.v.s: Given f X ( x ) and Y g ( X ) ,what is f Y ( y ) ? If x1 , x 2 , , x n are all its real roots, that is, y g ( x1 ) g ( x 2 ) then n fY ( y ) f X ( xi ) | g '( x ) | i 1 where g '( x ) dg ( x ) dx NIPRL i g ( xn ) • Example: determine f Y ( y ) f X ( x ) 1 for 0 x 1 f ( x) 0 elsew here y g ( x) e ln y x dg x dx fY ( y ) NIPRL |y| X w here 1 Y 2.718 x -> one root is possible in 0 x 1 e y 1 Y e 1 y