5.2

Chapter 5
Joint Probability Distributions and Random Samples
• 5.1 - Jointly Distributed Random Variables
• 5.2 - Expected Values, Covariance, and Correlation
• 5.3 - Statistics and Their Distributions
• 5.4 - The Distribution of the Sample Mean
• 5.5 - The Distribution of a Linear Combination
PARAMETERS
REVIEW POWERPOINT SECTION "3.3-CONT'D" FOR PROPERTIES OF EXPECTED VALUE
• Mean:
$$\mu = E[X] = \begin{cases} \sum_x x\, f(x), & X \text{ discrete} \\ \int_{-\infty}^{\infty} x\, f(x)\, dx, & X \text{ continuous} \end{cases}$$
• Variance: the variance of a random variable measures how it varies about its mean.
$$\sigma^2 = E\left[(X - \mu)^2\right] = \begin{cases} \sum_x (x - \mu)^2 f(x), & X \text{ discrete} \\ \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\, dx, & X \text{ continuous} \end{cases}$$
$$\sigma^2 = E\left[X^2\right] - \left(E[X]\right)^2 = \begin{cases} \sum_x x^2 f(x) - \mu^2, & X \text{ discrete} \\ \int_{-\infty}^{\infty} x^2 f(x)\, dx - \mu^2, & X \text{ continuous} \end{cases}$$
Proof: See PowerPoint section 3.3-cont'd, slide 18, for discrete X.
Is there an association between X and Y, and if so, how is it measured?
PARAMETERS
Joint pmf $f(x, y)$ with marginal pmfs $f_X(x)$, $f_Y(y)$:

            Y = 1    Y = 2    Y = 3  |  f_X(x)
  X = 1      .25      .20      .15   |   .60
  X = 2      .25      .10      .05   |   .40
  f_Y(y)     .50      .30      .20   |   1

• Means: $\mu_X$, $\mu_Y$
$\mu_X = E[X] = \sum_x x\, f_X(x) = (1)(.60) + (2)(.40) = 1.4$ cups / AM
$\mu_Y = E[Y] = \sum_y y\, f_Y(y) = (1)(.50) + (2)(.30) + (3)(.20) = 1.7$ cups / PM
• Variances: $\sigma_X^2$, $\sigma_Y^2$
$\sigma_X^2 = E[X^2] - (E[X])^2 = \sum_x x^2 f_X(x) - \mu_X^2 = 1^2(.6) + 2^2(.4) - 1.4^2 = 0.24$
$\sigma_Y^2 = E[Y^2] - (E[Y])^2 = \sum_y y^2 f_Y(y) - \mu_Y^2 = 1^2(.5) + 2^2(.3) + 3^2(.2) - 1.7^2 = 0.61$
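The means and variances above can be reproduced with a few lines of Python. This is only a sketch: the dictionary layout and the helper names `marginal`, `mean`, and `variance` are illustrative choices, not part of the slides; the probabilities are the ones used in the coffee example.

```python
# Joint pmf f(x, y) of the coffee example, stored as {(x, y): probability}.
joint = {
    (1, 1): 0.25, (1, 2): 0.20, (1, 3): 0.15,
    (2, 1): 0.25, (2, 2): 0.10, (2, 3): 0.05,
}

def marginal(joint, axis):
    """Sum the joint pmf over the other variable (axis 0 -> f_X, axis 1 -> f_Y)."""
    pmf = {}
    for pair, p in joint.items():
        pmf[pair[axis]] = pmf.get(pair[axis], 0.0) + p
    return pmf

def mean(pmf):
    """E[X] = sum of x * f(x) for a discrete pmf."""
    return sum(v * p for v, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2 (the shortcut formula)."""
    mu = mean(pmf)
    return sum(v**2 * p for v, p in pmf.items()) - mu**2

f_X = marginal(joint, 0)
f_Y = marginal(joint, 1)
print(round(mean(f_X), 2), round(mean(f_Y), 2))          # 1.4 1.7
print(round(variance(f_X), 2), round(variance(f_Y), 2))  # 0.24 0.61
```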
• Means: $\mu_X$, $\mu_Y$
• Variances: $\sigma_X^2 = E\left[(X - \mu_X)^2\right]$, $\sigma_Y^2 = E\left[(Y - \mu_Y)^2\right]$
• Covariance:
$$\sigma_{XY} = E\left[(X - \mu_X)(Y - \mu_Y)\right] = \begin{cases} \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, f(x, y), & \text{discrete} \\ \int\!\!\int (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dy\, dx, & \text{continuous} \end{cases}$$
Claim:
$$\sigma_{XY} = E[XY] - E[X]\,E[Y] = \begin{cases} \sum_x \sum_y x\, y\, f(x, y) - \mu_X \mu_Y, & \text{discrete} \\ \int\!\!\int x\, y\, f(x, y)\, dy\, dx - \mu_X \mu_Y, & \text{continuous} \end{cases}$$
Proof: using $E[X \pm Y] = E[X] \pm E[Y]$ and $E[aX] = a\,E[X]$,
$$\begin{aligned} \sigma_{XY} &= E\left[XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y\right] \\ &= E[XY] - E[\mu_Y X] - E[\mu_X Y] + E[\mu_X \mu_Y] \\ &= E[XY] - \mu_Y E[X] - \mu_X E[Y] + \mu_X \mu_Y E[1] \\ &= E[XY] - \mu_Y \mu_X - \mu_X \mu_Y + \mu_X \mu_Y \\ &= E[XY] - E[X]\,E[Y] \quad \text{QED} \end{aligned}$$
PARAMETERS (summary)
• Means: $\mu_X = E[X]$, $\mu_Y = E[Y]$
• Variances:
$\sigma_X^2 = Var(X) = E\left[(X - \mu_X)^2\right] = E[X^2] - \mu_X^2$
$\sigma_Y^2 = Var(Y) = E\left[(Y - \mu_Y)^2\right] = E[Y^2] - \mu_Y^2$
• Covariance:
$\sigma_{XY} = Cov(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right] = E[XY] - \mu_X \mu_Y$
Note: $Cov(X, X) = Var(X)$.
• Covariance, computed from the definition for the joint pmf with $f(1,1) = .25$, $f(1,2) = .20$, $f(1,3) = .15$, $f(2,1) = .25$, $f(2,2) = .10$, $f(2,3) = .05$:
$\mu_X = E[X] = \sum_x x\, f_X(x) = (1)(.60) + (2)(.40) = 1.4$ cups / AM
$\mu_Y = E[Y] = \sum_y y\, f_Y(y) = (1)(.50) + (2)(.30) + (3)(.20) = 1.7$ cups / PM
$$\begin{aligned} \sigma_{XY} = Cov(X, Y) &= E\left[(X - \mu_X)(Y - \mu_Y)\right] = \sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, f(x, y) \\ &= (1 - 1.4)(1 - 1.7)(.25) + (1 - 1.4)(2 - 1.7)(.20) + (1 - 1.4)(3 - 1.7)(.15) \\ &\quad + (2 - 1.4)(1 - 1.7)(.25) + (2 - 1.4)(2 - 1.7)(.10) + (2 - 1.4)(3 - 1.7)(.05) \\ &= -.08 \end{aligned}$$
• Covariance, computed from the shortcut formula with $\mu_X = 1.4$ and $\mu_Y = 1.7$:
$$\begin{aligned} \sigma_{XY} = Cov(X, Y) &= E[XY] - \mu_X \mu_Y = \sum_x \sum_y x\, y\, f(x, y) - \mu_X \mu_Y \\ &= (1)(1)(.25) + (1)(2)(.20) + (1)(3)(.15) \\ &\quad + (2)(1)(.25) + (2)(2)(.10) + (2)(3)(.05) - (1.4)(1.7) \\ &= 2.30 - 2.38 = -.08 \end{aligned}$$
… but what does it mean?
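As a quick numerical check, the definition and the shortcut formula can be evaluated side by side. This is a sketch only; the variable names are illustrative, and the probabilities are those of the coffee example.

```python
# Joint pmf f(x, y) of the coffee example, stored as {(x, y): probability}.
joint = {
    (1, 1): 0.25, (1, 2): 0.20, (1, 3): 0.15,
    (2, 1): 0.25, (2, 2): 0.10, (2, 3): 0.05,
}

# Means, taken directly from the joint pmf.
mu_X = sum(x * p for (x, y), p in joint.items())
mu_Y = sum(y * p for (x, y), p in joint.items())

# Definition: E[(X - mu_X)(Y - mu_Y)]
cov_definition = sum((x - mu_X) * (y - mu_Y) * p for (x, y), p in joint.items())

# Shortcut: E[XY] - mu_X * mu_Y
cov_shortcut = sum(x * y * p for (x, y), p in joint.items()) - mu_X * mu_Y

print(round(cov_definition, 4), round(cov_shortcut, 4))  # -0.08 -0.08
```

Both routes give a negative covariance: in this example, larger values of X tend to go with smaller values of Y.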
joint pmf $f(x, y)$; marginal pmfs $f_X(x)$, $f_Y(y)$

  X \ Y     y1          y2          y3          y4          y5       |  f_X
  x1      f(x1, y1)   f(x1, y2)   f(x1, y3)   f(x1, y4)   f(x1, y5)  |  fX(x1)
  x2      f(x2, y1)   f(x2, y2)   f(x2, y3)   f(x2, y4)   f(x2, y5)  |  fX(x2)
  x3      f(x3, y1)   f(x3, y2)   f(x3, y3)   f(x3, y4)   f(x3, y5)  |  fX(x3)
  x4      f(x4, y1)   f(x4, y2)   f(x4, y3)   f(x4, y4)   f(x4, y5)  |  fX(x4)
  x5      f(x5, y1)   f(x5, y2)   f(x5, y3)   f(x5, y4)   f(x5, y5)  |  fX(x5)
  f_Y     fY(y1)      fY(y2)      fY(y3)      fY(y4)      fY(y5)     |  1

The distribution of these points $(x_i, y_j)$ in the XY-plane depends on the joint pmf $f(x, y)$.
Example:

  X \ Y      1      2      3      4      5   |  f_X(x)
    1       .04    .04    .04    .04    .04  |   .20
    2       .04    .04    .04    .04    .04  |   .20
    3       .04    .04    .04    .04    .04  |   .20
    4       .04    .04    .04    .04    .04  |   .20
    5       .04    .04    .04    .04    .04  |   .20
  f_Y(y)    .20    .20    .20    .20    .20  |   1

In a uniform population, each of the points {(1,1), (1, 2),…, (5, 5)} has the same
density. A scatterplot would reveal no particular association between X and Y.
In fact, X and Y are statistically independent!
It is easy to see that Cov(X, Y) = 0.
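A short check of both statements for the uniform table, as a sketch (the variable names are illustrative): independence means $f(x, y) = f_X(x)\, f_Y(y)$ in every cell, and the covariance is evaluated with the shortcut formula.

```python
values = range(1, 6)

# Uniform joint pmf: every cell (x, y) has probability .04.
joint = {(x, y): 0.04 for x in values for y in values}

# Marginal pmfs: row sums and column sums.
f_X = {x: sum(joint[x, y] for y in values) for x in values}
f_Y = {y: sum(joint[x, y] for x in values) for y in values}

# Independence: the joint pmf factors into the marginals in every cell.
independent = all(
    abs(joint[x, y] - f_X[x] * f_Y[y]) < 1e-12 for x in values for y in values
)

mu_X = sum(x * f_X[x] for x in values)
mu_Y = sum(y * f_Y[y] for y in values)
cov = sum(x * y * p for (x, y), p in joint.items()) - mu_X * mu_Y

print(independent, abs(cov) < 1e-12)  # True True
```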
Exercise:

  X \ Y      1      2      3      4      5   |  f_X(x)
    1       ___    ___    ___    ___    ___  |   .04
    2       ___    ___    ___    ___    ___  |   .12
    3       ___    ___    ___    ___    ___  |   .20
    4       ___    ___    ___    ___    ___  |   .28
    5       ___    ___    ___    ___    ___  |   .36
  f_Y(y)    .10    .15    .20    .25    .30  |   1

Fill in the table so that X and Y are statistically independent. Then show that Cov(X, Y) = 0.
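THEOREM. If X and Y are statistically independent, then Cov(X, Y) = 0.
However, the converse does not necessarily hold: Cov(X, Y) = 0 does not by itself imply independence.
Exception: the Bivariate Normal Distribution, for which Cov(X, Y) = 0 does imply that X and Y are independent.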
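To see why the converse can fail, here is a small counterexample that is not from the slides: take X uniform on {-1, 0, 1} and set Y = X². The sketch below uses exact fractions; the covariance is zero even though Y is completely determined by X.

```python
from fractions import Fraction

# Hypothetical counterexample (not from the slides): X uniform on {-1, 0, 1}, Y = X^2.
third = Fraction(1, 3)
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

# Marginal pmfs.
f_X, f_Y = {}, {}
for (x, y), p in joint.items():
    f_X[x] = f_X.get(x, 0) + p
    f_Y[y] = f_Y.get(y, 0) + p

mu_X = sum(x * p for x, p in f_X.items())
mu_Y = sum(y * p for y, p in f_Y.items())
cov = sum(x * y * p for (x, y), p in joint.items()) - mu_X * mu_Y

# Independence would require f(x, y) = f_X(x) * f_Y(y) in every cell.
independent = all(
    joint.get((x, y), 0) == f_X[x] * f_Y[y] for x in f_X for y in f_Y
)

print(cov, independent)  # 0 False
```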
Example:

  X \ Y      1      2      3      4      5   |  f_X(x)
    1       .08    .04    .03    .02    .01  |   .18
    2       .04    .08    .04    .03    .02  |   .21
    3       .03    .04    .08    .04    .03  |   .22
    4       .02    .03    .04    .08    .04  |   .21
    5       .01    .02    .03    .04    .08  |   .18
  f_Y(y)    .18    .21    .22    .21    .18  |   1

• As X increases, Y also has a tendency to increase;
thus, X and Y are said to be positively correlated.
• Likewise, two negatively correlated variables
have a tendency for Y to decrease as X increases.
• The simplest mathematical object to have this
property is a straight line.
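The tendency described in these bullets can be quantified for the 5 × 5 example in which the largest probabilities (.08) sit on the diagonal. The sketch below relies on an observation about that table, namely that each cell depends only on |x − y|; the names `weight` and `expect` are illustrative.

```python
import math

# Observation about the example table: f(x, y) depends only on |x - y|.
weight = {0: 0.08, 1: 0.04, 2: 0.03, 3: 0.02, 4: 0.01}
values = range(1, 6)
joint = {(x, y): weight[abs(x - y)] for x in values for y in values}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_X = expect(lambda x, y: x)
mu_Y = expect(lambda x, y: y)
var_X = expect(lambda x, y: x * x) - mu_X**2
var_Y = expect(lambda x, y: y * y) - mu_Y**2
cov = expect(lambda x, y: x * y) - mu_X * mu_Y
rho = cov / math.sqrt(var_X * var_Y)

print(round(cov, 2), round(rho, 3))  # 0.82 0.441
```

The positive covariance matches the visual impression: probability mass concentrates where X and Y are both small or both large.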
PARAMETERS (coffee example, with marginal pmfs $f_X(x)$ and $f_Y(y)$)
• Means:
$\mu_X = E[X] = 1.4$
$\mu_Y = E[Y] = 1.7$
• Variances:
$\sigma_X^2 = Var(X) = E\left[(X - \mu_X)^2\right] = E[X^2] - \mu_X^2 = 0.24$
$\sigma_Y^2 = Var(Y) = E\left[(Y - \mu_Y)^2\right] = E[Y^2] - \mu_Y^2 = 0.61$
• Covariance:
$\sigma_{XY} = Cov(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right] = E[XY] - \mu_X \mu_Y = -0.08$
• Linear Correlation Coefficient ("rho"):
$$\rho = Corr(X, Y) = \frac{\sigma_{XY}}{\sqrt{\sigma_X^2\, \sigma_Y^2}} = \frac{-.08}{\sqrt{(.24)(.61)}} = -0.209$$
Always between –1 and +1.
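The correlation coefficient for the coffee example can be computed directly from the joint pmf. This is a minimal sketch; the helper `expect` is an illustrative name, not something defined in the slides.

```python
import math

# Joint pmf f(x, y) of the coffee example, stored as {(x, y): probability}.
joint = {
    (1, 1): 0.25, (1, 2): 0.20, (1, 3): 0.15,
    (2, 1): 0.25, (2, 2): 0.10, (2, 3): 0.05,
}

def expect(g, joint):
    """E[g(X, Y)] for a discrete joint pmf stored as {(x, y): probability}."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_X = expect(lambda x, y: x, joint)
mu_Y = expect(lambda x, y: y, joint)
var_X = expect(lambda x, y: x * x, joint) - mu_X**2
var_Y = expect(lambda x, y: y * y, joint) - mu_Y**2
cov = expect(lambda x, y: x * y, joint) - mu_X * mu_Y

# rho = sigma_XY / sqrt(sigma_X^2 * sigma_Y^2)
rho = cov / math.sqrt(var_X * var_Y)
print(round(rho, 3))  # -0.209
```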
• Linear Correlation Coefficient:
$$\rho = Corr(X, Y) = \frac{\sigma_{XY}}{\sqrt{\sigma_X^2\, \sigma_Y^2}}$$
• ρ measures the strength of linear
association between X and Y.
• Always between –1 and +1.
JAMA. 2003;290:1486-1493
[Figure: IQ vs. Head circumference]
[Scale for ρ: –1 to –0.75 strong, –0.75 to –0.5 moderate, –0.5 to 0 weak negative linear correlation; 0 to +0.5 weak, +0.5 to +0.75 moderate, +0.75 to +1 strong positive linear correlation]
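The scale on this slide can be turned into a small helper. This is a sketch under assumptions: the cut points 0.5 and 0.75 are read off the scale, while the function name and the treatment of values exactly on a cut point are choices made here for illustration.

```python
def describe_correlation(rho):
    """Label rho using the slide's scale (cut points at 0.5 and 0.75)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie between -1 and +1")
    if rho == 0:
        return "no linear correlation"
    size = abs(rho)
    # Assumption: a value exactly on a cut point goes to the stronger label.
    if size >= 0.75:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if rho > 0 else "negative"
    return f"{strength} {direction} linear correlation"

print(describe_correlation(-0.209))  # weak negative linear correlation
print(describe_correlation(0.8))     # strong positive linear correlation
```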

[Figure: Body Temp vs. Age]
A strong positive correlation exists
between ice cream sales and drowning.
Cause & Effect? NOT LIKELY…
“Temp (F)” is a confounding variable.
[Figure: Profit vs. Price]
Sums and differences: $X \pm Y = \{x \pm y \mid x \in X,\ y \in Y\}$
• $E[X \pm Y] = E[X] \pm E[Y]$
Proof: See text, p. 240
• $Var(X \pm Y) = Var(X) + Var(Y) \pm 2\,Cov(X, Y)$
$Var(X \pm Y) = Var(X) + Var(Y) \pm 2\,Cov(X, Y)$
Proof (WLOG, for the sum $X + Y$):
$$\begin{aligned} Var(X + Y) &= E\left[\left((X + Y) - (\mu_X + \mu_Y)\right)^2\right] \\ &= E\left[\left((X - \mu_X) + (Y - \mu_Y)\right)^2\right] \\ &= E\left[(X - \mu_X)^2 + 2(X - \mu_X)(Y - \mu_Y) + (Y - \mu_Y)^2\right] \\ &= E\left[(X - \mu_X)^2\right] + 2\,E\left[(X - \mu_X)(Y - \mu_Y)\right] + E\left[(Y - \mu_Y)^2\right] \\ &= Var(X) + 2\,Cov(X, Y) + Var(Y) \end{aligned}$$
If X and Y are independent, then Cov(X, Y) = 0.
Proof: Exercise (HW problem)
Consequently, if X and Y are independent, then
$Var(X \pm Y) = Var(X) + Var(Y)$.
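The general identity $Var(X \pm Y) = Var(X) + Var(Y) \pm 2\,Cov(X, Y)$ can be checked numerically on the coffee example, where the covariance is not zero. This is a sketch; the helper names `expect` and `var_of` are illustrative.

```python
# Joint pmf f(x, y) of the coffee example, stored as {(x, y): probability}.
joint = {
    (1, 1): 0.25, (1, 2): 0.20, (1, 3): 0.15,
    (2, 1): 0.25, (2, 2): 0.10, (2, 3): 0.05,
}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def var_of(g):
    """Var(g(X, Y)) = E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

var_X = var_of(lambda x, y: x)
var_Y = var_of(lambda x, y: y)
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

results = {}
for sign in (+1, -1):
    direct = var_of(lambda x, y: x + sign * y)   # variance of X + Y or X - Y
    formula = var_X + var_Y + sign * 2 * cov     # right-hand side of the identity
    results[sign] = (direct, formula)
    print(sign, round(direct, 2), round(formula, 2))
# 1 0.69 0.69
# -1 1.01 1.01
```

Because Cov(X, Y) = -0.08 here, Var(X + Y) is smaller than Var(X) + Var(Y) = 0.85, and Var(X − Y) is larger.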