Chapter 5: Joint Probability Distributions and Random Samples

5.1 - Jointly Distributed Random Variables
5.2 - Expected Values, Covariance, and Correlation
5.3 - Statistics and Their Distributions
5.4 - The Distribution of the Sample Mean
5.5 - The Distribution of a Linear Combination

PARAMETERS (REVIEW)
See PowerPoint section "3.3-cont'd" for properties of expected value.

Mean:
  $\mu = E[X] = \sum_x x\, f(x)$ (X discrete)
  $\mu = E[X] = \int_{-\infty}^{\infty} x\, f(x)\, dx$ (X continuous)

Variance: The variance of a random variable measures how it varies about its mean.
  $\sigma^2 = E[(X - \mu)^2] = \sum_x (x - \mu)^2 f(x)$ (X discrete)
  $\sigma^2 = E[(X - \mu)^2] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx$ (X continuous)

Equivalently, $\sigma^2 = E[X^2] - (E[X])^2$:
  $\sigma^2 = \sum_x x^2 f(x) - \mu^2$ (X discrete)
  $\sigma^2 = \int_{-\infty}^{\infty} x^2 f(x)\, dx - \mu^2$ (X continuous)
Proof: See PowerPoint section 3.3-cont'd, slide 18, for discrete X.

Is there an association between X and Y, and if so, how is it measured?

PARAMETERS
Means: $\mu_X$, $\mu_Y$
Variances: $\sigma_X^2$, $\sigma_Y^2$

Example: marginal pmfs $f_X(x)$ and $f_Y(y)$

  x         1     2
  f_X(x)   .60   .40

  y         1     2     3
  f_Y(y)   .50   .30   .20

$\mu_X = E[X] = \sum x\, f_X(x) = (1)(.60) + (2)(.40) = 1.4$ cups / AM
$\mu_Y = E[Y] = \sum y\, f_Y(y) = (1)(.50) + (2)(.30) + (3)(.20) = 1.7$ cups / PM
$\sigma_X^2 = E[X^2] - (E[X])^2 = \sum x^2 f_X(x) - \mu_X^2 = 1^2(.6) + 2^2(.4) - 1.4^2 = 0.24$
$\sigma_Y^2 = E[Y^2] - (E[Y])^2 = \sum y^2 f_Y(y) - \mu_Y^2 = 1^2(.5) + 2^2(.3) + 3^2(.2) - 1.7^2 = 0.61$

Covariance: $\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)]$
  $\sigma_{XY} = \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, f(x, y)$ (discrete)
  $\sigma_{XY} = \int\!\!\int (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dy\, dx$ (continuous)

Claim: $\sigma_{XY} = E[XY] - E[X]\,E[Y]$
  $\sigma_{XY} = \sum_x \sum_y x y\, f(x, y) - \mu_X \mu_Y$ (discrete)
  $\sigma_{XY} = \int\!\!\int x y\, f(x, y)\, dy\, dx - \mu_X \mu_Y$ (continuous)

Proof: Uses the properties $E[X \pm Y] = E[X] \pm E[Y]$ and $E[aX] = a\,E[X]$.
  $E[(X - \mu_X)(Y - \mu_Y)] = E[XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y]$
  $= E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X \mu_Y E[1]$
  $= E[XY] - \mu_Y \mu_X - \mu_X \mu_Y + \mu_X \mu_Y$
  $= E[XY] - E[X]\,E[Y]$  QED

PARAMETERS (SUMMARY)
Means: $\mu_X = E[X]$, $\mu_Y = E[Y]$
Variances: $\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2$
           $\sigma_Y^2 = \mathrm{Var}(Y) = E[(Y - \mu_Y)^2] = E[Y^2] - \mu_Y^2$
Covariance: $\sigma_{XY} = \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y$
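The mean and variance calculations for the marginal pmfs can be checked numerically. The following is a minimal sketch, assuming each pmf is stored as a Python dict mapping a value to its probability; the names `f_X`, `f_Y`, `mean`, and `variance` are illustrative and do not come from the slides.

```python
# Marginal pmfs from the example, stored as {value: probability}.
f_X = {1: 0.60, 2: 0.40}
f_Y = {1: 0.50, 2: 0.30, 3: 0.20}

def mean(pmf):
    """E[X] = sum of x * f(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2 (the shortcut formula)."""
    mu = mean(pmf)
    return sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

mu_X, mu_Y = mean(f_X), mean(f_Y)
var_X, var_Y = variance(f_X), variance(f_Y)

print(round(mu_X, 2), round(mu_Y, 2))    # 1.4 1.7
print(round(var_X, 2), round(var_Y, 2))  # 0.24 0.61
```

Rounding is applied only when printing, because floating-point sums such as 2.2 - 1.96 are not exactly 0.24.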
Covariance: $\sigma_{XY} = \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y$

Joint pmf $f(x, y)$ with marginal pmfs $f_X(x)$ and $f_Y(y)$:

             Y = 1   Y = 2   Y = 3   f_X(x)
  X = 1       .25     .20     .15     .60
  X = 2       .25     .10     .05     .40
  f_Y(y)      .50     .30     .20      1

$\mu_X = E[X] = \sum x\, f_X(x) = (1)(.60) + (2)(.40) = 1.4$ cups / AM
$\mu_Y = E[Y] = \sum y\, f_Y(y) = (1)(.50) + (2)(.30) + (3)(.20) = 1.7$ cups / PM

From the definition, summing over all six cells of the table:
$\sigma_{XY} = \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, f(x, y)$
  $= (1 - 1.4)(1 - 1.7)(.25) + (1 - 1.4)(2 - 1.7)(.20) + (1 - 1.4)(3 - 1.7)(.15)$
  $\;+ (2 - 1.4)(1 - 1.7)(.25) + (2 - 1.4)(2 - 1.7)(.10) + (2 - 1.4)(3 - 1.7)(.05)$
  $= -0.08$
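The covariance of this example can be computed both ways, from the definition and from the formula $E[XY] - \mu_X \mu_Y$. This is a sketch under the assumption that the joint pmf is stored as a dict keyed by `(x, y)`; the helper `expect` is a hypothetical name introduced here for illustration.

```python
# Joint pmf of the example, keyed by (x, y).
joint = {
    (1, 1): 0.25, (1, 2): 0.20, (1, 3): 0.15,
    (2, 1): 0.25, (2, 2): 0.10, (2, 3): 0.05,
}

def expect(g, pmf):
    """E[g(X, Y)] = sum of g(x, y) * f(x, y) over all cells."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

mu_X = expect(lambda x, y: x, joint)
mu_Y = expect(lambda x, y: y, joint)

# Definition: E[(X - mu_X)(Y - mu_Y)]
cov_definition = expect(lambda x, y: (x - mu_X) * (y - mu_Y), joint)
# Shortcut: E[XY] - mu_X * mu_Y
cov_shortcut = expect(lambda x, y: x * y, joint) - mu_X * mu_Y

print(round(cov_definition, 4), round(cov_shortcut, 4))  # -0.08 -0.08
```

Both routes agree, and the negative sign indicates that larger values of X tend to occur with smaller values of Y in this table.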
The shortcut formula gives the same value, $\sigma_{XY} = -0.08$:
$\sigma_{XY} = \mathrm{Cov}(X, Y) = E[XY] - \mu_X \mu_Y = \sum_x \sum_y x y\, f(x, y) - \mu_X \mu_Y$
  $= (1)(1)(.25) + (1)(2)(.20) + (1)(3)(.15) + (2)(1)(.25) + (2)(2)(.10) + (2)(3)(.05) - (1.4)(1.7)$
  $= 2.30 - 2.38 = -0.08$

... but what does it mean?

Joint pmf $f(x, y)$; marginal pmfs $f_X(x)$, $f_Y(y)$

            y1          y2          y3          y4          y5          f_X
  x1     f(x1, y1)   f(x1, y2)   f(x1, y3)   f(x1, y4)   f(x1, y5)   f_X(x1)
  x2     f(x2, y1)   f(x2, y2)   f(x2, y3)   f(x2, y4)   f(x2, y5)   f_X(x2)
  x3     f(x3, y1)   f(x3, y2)   f(x3, y3)   f(x3, y4)   f(x3, y5)   f_X(x3)
  x4     f(x4, y1)   f(x4, y2)   f(x4, y3)   f(x4, y4)   f(x4, y5)   f_X(x4)
  x5     f(x5, y1)   f(x5, y2)   f(x5, y3)   f(x5, y4)   f(x5, y5)   f_X(x5)
  f_Y    f_Y(y1)     f_Y(y2)     f_Y(y3)     f_Y(y4)     f_Y(y5)     1

The distribution of the points $(x_i, y_j)$ in the XY-plane depends on the joint pmf $f(x, y)$.

Example:

             Y = 1   Y = 2   Y = 3   Y = 4   Y = 5   f_X(x)
  X = 1       .04     .04     .04     .04     .04     .20
  X = 2       .04     .04     .04     .04     .04     .20
  X = 3       .04     .04     .04     .04     .04     .20
  X = 4       .04     .04     .04     .04     .04     .20
  X = 5       .04     .04     .04     .04     .04     .20
  f_Y(y)      .20     .20     .20     .20     .20      1

In a uniform population, each of the points {(1, 1), (1, 2), ..., (5, 5)} has the same probability. A scatterplot would reveal no particular association between X and Y. In fact, X and Y are statistically independent! It is easy to see that Cov(X, Y) = 0.
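The two claims about the uniform example, independence and zero covariance, can be verified directly. The sketch below assumes independence is checked cell by cell as $f(x, y) = f_X(x)\, f_Y(y)$, with a small floating-point tolerance; all variable names are illustrative.

```python
from itertools import product

values = range(1, 6)
# Uniform joint pmf: every one of the 25 points has probability .04.
joint = {(x, y): 0.04 for x, y in product(values, values)}

# Marginal pmfs: row sums and column sums of the joint table.
f_X = {x: sum(joint[(x, y)] for y in values) for x in values}
f_Y = {y: sum(joint[(x, y)] for x in values) for y in values}

# Independence: f(x, y) = f_X(x) * f_Y(y) in every cell.
independent = all(
    abs(joint[(x, y)] - f_X[x] * f_Y[y]) < 1e-12 for (x, y) in joint
)

mu_X = sum(x * p for x, p in f_X.items())
mu_Y = sum(y * p for y, p in f_Y.items())
cov = sum(x * y * p for (x, y), p in joint.items()) - mu_X * mu_Y

print(independent, abs(cov) < 1e-9)  # True True
```

The covariance is compared against a tolerance rather than tested for exact equality with 0, since the sums are carried out in floating point.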
Joint pmf $f(x, y)$; marginal pmfs $f_X(x)$, $f_Y(y)$

Exercise:

             Y = 1   Y = 2   Y = 3   Y = 4   Y = 5   f_X(x)
  X = 1                                               .04
  X = 2                                               .12
  X = 3                                               .20
  X = 4                                               .28
  X = 5                                               .36
  f_Y(y)      .10     .15     .20     .25     .30      1

Fill in the table so that X and Y are statistically independent. Then show that Cov(X, Y) = 0.

THEOREM. If X and Y are statistically independent, then Cov(X, Y) = 0.
However, the converse does not necessarily hold!
Exception: the Bivariate Normal Distribution, for which Cov(X, Y) = 0 does imply independence.

Example:

             Y = 1   Y = 2   Y = 3   Y = 4   Y = 5   f_X(x)
  X = 1       .08     .04     .03     .02     .01     .18
  X = 2       .04     .08     .04     .03     .02     .21
  X = 3       .03     .04     .08     .04     .03     .22
  X = 4       .02     .03     .04     .08     .04     .21
  X = 5       .01     .02     .03     .04     .08     .18
  f_Y(y)      .18     .21     .22     .21     .18      1

• As X increases, Y also has a tendency to increase; thus, X and Y are said to be positively correlated.
• Likewise, for two negatively correlated variables, Y has a tendency to decrease as X increases.
• The simplest mathematical object to have this property is a straight line.

PARAMETERS (for the earlier example with X = 1, 2 and Y = 1, 2, 3)
Means: $\mu_X = E[X] = 1.4$, $\mu_Y = E[Y] = 1.7$
Variances: $\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu_X)^2] = E[X^2] - \mu_X^2 = 0.24$
           $\sigma_Y^2 = \mathrm{Var}(Y) = E[(Y - \mu_Y)^2] = E[Y^2] - \mu_Y^2 = 0.61$
Covariance: $\sigma_{XY} = \mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y = -0.08$

Linear Correlation Coefficient ("rho"):
$\rho = \mathrm{Corr}(X, Y) = \dfrac{\sigma_{XY}}{\sqrt{\sigma_X^2\, \sigma_Y^2}} = \dfrac{-0.08}{\sqrt{(.24)(.61)}} \approx -0.209$

• $\rho$ measures the strength of linear association between X and Y.
• $\rho$ is always between $-1$ and $+1$.

JAMA. 2003;290:1486-1493
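The correlation coefficient can be computed for both joint pmfs that appear in these slides: the 2-by-3 example and the 5-by-5 table with most of its mass near the diagonal. This is a sketch only; the function name `correlation` and the dict-based storage are assumptions made for illustration.

```python
from math import sqrt

def correlation(joint):
    """rho = Cov(X, Y) / sqrt(Var(X) * Var(Y)) for a joint pmf {(x, y): p}."""
    mu_X = sum(x * p for (x, y), p in joint.items())
    mu_Y = sum(y * p for (x, y), p in joint.items())
    var_X = sum(x ** 2 * p for (x, y), p in joint.items()) - mu_X ** 2
    var_Y = sum(y ** 2 * p for (x, y), p in joint.items()) - mu_Y ** 2
    cov = sum(x * y * p for (x, y), p in joint.items()) - mu_X * mu_Y
    return cov / sqrt(var_X * var_Y)

# The 2-by-3 example (X = 1, 2; Y = 1, 2, 3).
two_by_three = {
    (1, 1): 0.25, (1, 2): 0.20, (1, 3): 0.15,
    (2, 1): 0.25, (2, 2): 0.10, (2, 3): 0.05,
}

# The 5-by-5 example; row i is X = i + 1, column j is Y = j + 1.
rows = [
    [0.08, 0.04, 0.03, 0.02, 0.01],
    [0.04, 0.08, 0.04, 0.03, 0.02],
    [0.03, 0.04, 0.08, 0.04, 0.03],
    [0.02, 0.03, 0.04, 0.08, 0.04],
    [0.01, 0.02, 0.03, 0.04, 0.08],
]
diagonal_heavy = {
    (i + 1, j + 1): p for i, row in enumerate(rows) for j, p in enumerate(row)
}

rho_small = correlation(two_by_three)
rho_diag = correlation(diagonal_heavy)
print(round(rho_small, 3), round(rho_diag, 3))  # -0.209 0.441
```

The first value matches the hand calculation and is a weak negative correlation; the second is positive, consistent with Y tending to increase as X increases in the 5-by-5 table.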
PARAMETERS
Linear Correlation Coefficient: interpretation scale

  $-1$ to $-0.75$       strong negative linear correlation
  $-0.75$ to $-0.5$     moderate negative linear correlation
  $-0.5$ to $+0.5$      weak linear correlation
  $+0.5$ to $+0.75$     moderate positive linear correlation
  $+0.75$ to $+1$       strong positive linear correlation

[Figures: scatterplots of IQ vs. Head circumference, Body Temp vs. Age, and Profit vs. Price, each shown with the scale above.]

A strong positive correlation exists between ice cream sales and drowning. Cause & Effect? NOT LIKELY... "Temp (F)" is a confounding variable.

$X \pm Y = \{x \pm y \mid x \in X,\ y \in Y\}$

$E[X \pm Y] = E[X] \pm E[Y]$
Proof: See text, p. 240

$\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \pm 2\,\mathrm{Cov}(X, Y)$
Proof (WLOG, for X + Y):
  $\mathrm{Var}(X + Y) = E\big[\big((X + Y) - (\mu_X + \mu_Y)\big)^2\big]$
  $= E\big[\big((X - \mu_X) + (Y - \mu_Y)\big)^2\big]$
  $= E[(X - \mu_X)^2] + 2\,E[(X - \mu_X)(Y - \mu_Y)] + E[(Y - \mu_Y)^2]$
  $= \mathrm{Var}(X) + 2\,\mathrm{Cov}(X, Y) + \mathrm{Var}(Y)$

If X and Y are independent, then Cov(X, Y) = 0.
Proof: Exercise (HW problem)

If X and Y are independent, then $\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$.
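The general identity $\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \pm 2\,\mathrm{Cov}(X, Y)$ can be checked on the 2-by-3 example, where X and Y are not independent and the covariance term therefore does not vanish. This is a minimal sketch; the helpers `expect` and `var` are hypothetical names introduced for illustration.

```python
# Joint pmf of the 2-by-3 example, keyed by (x, y).
joint = {
    (1, 1): 0.25, (1, 2): 0.20, (1, 3): 0.15,
    (2, 1): 0.25, (2, 2): 0.10, (2, 3): 0.05,
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def var(g):
    """Var(g(X, Y)) = E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

var_X = var(lambda x, y: x)
var_Y = var(lambda x, y: y)
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Variances of the sum and difference, computed directly from the joint pmf.
var_sum = var(lambda x, y: x + y)
var_diff = var(lambda x, y: x - y)

print(round(var_sum, 2), round(var_X + var_Y + 2 * cov, 2))   # 0.69 0.69
print(round(var_diff, 2), round(var_X + var_Y - 2 * cov, 2))  # 1.01 1.01
```

Here Cov(X, Y) = -0.08, so neither variance equals Var(X) + Var(Y) = 0.85; the simpler formula applies only when the covariance is zero, as it is under independence.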