Sampling- part 4

Introduction to Biostatistics (PUBHLTH 540)
Multiple Random Variables
- Linear Combinations of Random Variables
  - Expected Value
  - Variance
- Stochastic Models
- Covariance of two Random Variables
- Independence
- Correlation
An Example
- Choose a Simple Random Sample (SRS) with Replacement of size n=2 from a Population of N=3
- Observe: 1 Response (i.e., Age) on each Subject in the Sample
- Question: What is the average age of subjects in the population?
- Use the sample mean to estimate the Population Average Age
Introducing: Daisy, Lily, and Rose
SPH&HS, UMASS Amherst
Population of N=3

ID (s)   Subject   Response (Age)
1        Daisy     25
2        Lily      32
3        Rose      33

Note:
Population mean: $\mu = 30$
Variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2 = \frac{38}{3} = 12.67$
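The population mean and variance above can be checked with a few lines of Python. This is a minimal sketch using only the three ages from the table; the variable names are illustrative, and the variance uses the divisor N, as on the slide.

```python
from fractions import Fraction

# Ages from the population table (Daisy, Lily, Rose).
ages = [25, 32, 33]
N = len(ages)

# Population mean: (1/N) * sum of the values.
mu = Fraction(sum(ages), N)

# Population variance with divisor N (not N - 1), as on the slide.
sigma2 = sum((x - mu) ** 2 for x in ages) / N

print(float(mu))                # 30.0
print(round(float(sigma2), 2))  # 12.67
```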
Pick SRS with Replacement of n=2

ID (s)   Subject   Response
1        Daisy     25
2        Lily      32
3        Rose      33

For i=1,…,n=2:
$Y_1$: a random variable representing the 1st selection
$Y_2$: a random variable representing the 2nd selection
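Use as an Estimator: Sample Mean
$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{1}{n}Y_1 + \frac{1}{n}Y_2 + \dots + \frac{1}{n}Y_n$$
A Linear Estimator: a sum of random variables
When n=2,
$$\bar{Y} = \frac{1}{2}Y_1 + \frac{1}{2}Y_2 = \begin{pmatrix}\frac{1}{2} & \frac{1}{2}\end{pmatrix}\begin{pmatrix}Y_1\\ Y_2\end{pmatrix} = c'Y$$
where
$$c = \frac{1}{2}\begin{pmatrix}1\\ 1\end{pmatrix}, \qquad c' = \frac{1}{2}\begin{pmatrix}1 & 1\end{pmatrix}, \qquad Y = \begin{pmatrix}Y_1\\ Y_2\end{pmatrix}$$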
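Linear Combination of Random Variables
Example: Sample Mean
$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{1}{n}Y_1 + \frac{1}{n}Y_2 + \dots + \frac{1}{n}Y_n = \frac{1}{n}\begin{pmatrix}1 & 1 & \cdots & 1\end{pmatrix}\begin{pmatrix}Y_1\\ Y_2\\ \vdots\\ Y_n\end{pmatrix} = c'Y$$
where
$$c = \frac{1}{n}\mathbf{1}_n, \qquad Y' = \begin{pmatrix}Y_1 & Y_2 & \cdots & Y_n\end{pmatrix}$$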
Models for Response
Non-Stochastic model (Deterministic): $y_s = \mu + \beta_s$
Stochastic model: $Y_i = \mu + E_i$

ID (s)    Subject   Response $y_s$   $\mu$   $\beta_s = y_s - \mu$
1         Daisy     $y_1 = 25$       30      -5
2         Lily      $y_2 = 32$       30      2
3 (=N)    Rose      $y_3 = 33$       30      3
Finite Population
Stochastic model: $Y_i = \mu + E_i$
Pick a SRS with replacement of size n=2:
i=1: $Y_1 = \mu + E_1$
i=2: $Y_2 = \mu + E_2$
Finite Population
Stochastic model: $Y_i = \mu + E_i$
With replacement, each selection turns a random variable into a realized value:
i=1: $Y_1 = \mu + E_1$ is realized as $y_1 = \mu + \beta_1$
i=2: $Y_2 = \mu + E_2$ is realized as $y_2 = \mu + \beta_2$
Sampling- n=2
Stochastic model, with replacement
Random Variables:
i=1: $Y_1 = \mu + E_1$
i=2: $Y_2 = \mu + E_2$
Linear Combination of Random Variables:
$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = c'Y$$
Sampling- n=2
with replacement
Realized Values:
i=1: $Y_1 = y_1$, where $y_1 = \mu + \beta_1$
i=2: $Y_2 = y_2$, where $y_2 = \mu + \beta_2$
Other Possible Samples
with replacement
i=1: $y_1 = \mu + \beta_1$
i=2: $y_2 = \mu + \beta_2$
All Possible Samples

Sample (t)   Probability   $Y_1 = y_1$   $Y_2 = y_2$
1            1/9           25            25
2            1/9           25            32
3            1/9           25            33
4            1/9           32            25
5            1/9           32            32
6            1/9           32            33
7            1/9           33            25
8            1/9           33            32
9            1/9           33            33
Expected Values

Sample (t)   Probability   $Y_1 = y_1$   $Y_2 = y_2$   $P(Y_1 = y_1)\,y_1$   $P(Y_2 = y_2)\,y_2$
1            1/9           25            25            2.78                  2.78
2            1/9           25            32            2.78                  3.56
3            1/9           25            33            2.78                  3.67
4            1/9           32            25            3.56                  2.78
5            1/9           32            32            3.56                  3.56
6            1/9           32            33            3.56                  3.67
7            1/9           33            25            3.67                  2.78
8            1/9           33            32            3.67                  3.56
9            1/9           33            33            3.67                  3.67

$$E(Y_i) = \sum_{t=1}^{T} P(Y_i = y_i)\, y_i$$
$E(Y_1) = 30$
$E(Y_2) = 30$
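The expected values can be reproduced by listing every ordered sample and weighting each by its probability. The sketch below assumes the nine equally likely samples of size 2 drawn with replacement from the ages 25, 32, 33; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

ages = [25, 32, 33]

# All ordered samples (y1, y2) of size 2 drawn with replacement: 3 * 3 = 9.
samples = list(product(ages, repeat=2))
p = Fraction(1, len(samples))  # each sample has probability 1/9

# E(Y_i) = sum over samples t of P(sample t) * y_i
e_y1 = sum(p * y1 for y1, _ in samples)
e_y2 = sum(p * y2 for _, y2 in samples)

print(len(samples), e_y1, e_y2)  # 9 30 30
```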
$\operatorname{var}(Y_1)$

Sample (t)   Probability   $Y_1 = y_1$   $y_1 - \mu$   $(y_1 - \mu)^2$
1            1/9           25            -5            25
2            1/9           25            -5            25
3            1/9           25            -5            25
4            1/9           32            2             4
5            1/9           32            2             4
6            1/9           32            2             4
7            1/9           33            3             9
8            1/9           33            3             9
9            1/9           33            3             9
Sum weighted by probability              0.00          12.67

$$\operatorname{var}(Y_i) = \sum_{t=1}^{T} P(Y_i = y_i)\,(y_i - \mu)^2 = \sigma^2$$
$\operatorname{var}(Y_2)$

Sample (t)   Probability   $Y_2 = y_2$   $y_2 - \mu$   $(y_2 - \mu)^2$
1            1/9           25            -5            25
2            1/9           32            2             4
3            1/9           33            3             9
4            1/9           25            -5            25
5            1/9           32            2             4
6            1/9           33            3             9
7            1/9           25            -5            25
8            1/9           32            2             4
9            1/9           33            3             9
Sum weighted by probability              0.00          12.67

$$\operatorname{var}(Y_i) = \sum_{t=1}^{T} P(Y_i = y_i)\,(y_i - \mu)^2 = \sigma^2$$
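The two variance tables follow the same recipe, so a single helper can compute both. This is a minimal sketch under the same assumptions (nine equally likely samples with replacement); the function name `variance` and its `position` argument are hypothetical.

```python
from fractions import Fraction
from itertools import product

ages = [25, 32, 33]
mu = Fraction(sum(ages), len(ages))  # population mean, 30

samples = list(product(ages, repeat=2))
p = Fraction(1, len(samples))

def variance(position):
    """Probability-weighted squared deviation of the selection at `position` (0 or 1)."""
    return sum(p * (s[position] - mu) ** 2 for s in samples)

var_y1 = variance(0)
var_y2 = variance(1)
print(round(float(var_y1), 2), round(float(var_y2), 2))  # 12.67 12.67
```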
Covariance of Two Random Variables
$$\operatorname{cov}(Y, Z) = \sum_{t=1}^{T} P(Y = y;\, Z = z)\,\bigl(y - E(Y)\bigr)\bigl(z - E(Z)\bigr)$$
$$\operatorname{cov}(Y_1, Y_2) = \sum_{t=1}^{T} P(Y_1 = y_1;\, Y_2 = y_2)\,\bigl(y_1 - E(Y_1)\bigr)\bigl(y_2 - E(Y_2)\bigr)$$
$$\operatorname{cov}(Y_1, Y_2) = \sum_{t=1}^{T} P(Y_1 = y_1;\, Y_2 = y_2)\,\bigl(y_1 - E(Y_1)\bigr)\bigl(y_2 - E(Y_2)\bigr)$$

Sample (t)   Probability   $Y_1 = y_1$   $Y_2 = y_2$   $y_1 - \mu$   $y_2 - \mu$   $(y_1 - \mu)(y_2 - \mu)$
1            1/9           25            25            -5            -5            25
2            1/9           25            32            -5            2             -10
3            1/9           25            33            -5            3             -15
4            1/9           32            25            2             -5            -10
5            1/9           32            32            2             2             4
6            1/9           32            33            2             3             6
7            1/9           33            25            3             -5            -15
8            1/9           33            32            3             2             6
9            1/9           33            33            3             3             9

$\operatorname{cov}(Y_1, Y_2) = 0$
Based on simple random sampling with replacement
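The zero covariance under sampling with replacement can be confirmed by summing the probability-weighted cross products over all nine samples. A minimal sketch, with illustrative variable names:

```python
from fractions import Fraction
from itertools import product

ages = [25, 32, 33]
mu = Fraction(sum(ages), len(ages))  # E(Y1) = E(Y2) = 30

# Nine equally likely ordered samples, drawn with replacement.
samples = list(product(ages, repeat=2))
p = Fraction(1, len(samples))

# cov(Y1, Y2) = sum over samples of P * (y1 - mu) * (y2 - mu)
cov_with_rep = sum(p * (y1 - mu) * (y2 - mu) for y1, y2 in samples)
print(cov_with_rep)  # 0
```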
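Variance Matrix
$$\operatorname{var}\begin{pmatrix}Y_1\\ Y_2\end{pmatrix} = \begin{pmatrix}\operatorname{var}(Y_1) & \operatorname{cov}(Y_1, Y_2)\\ \operatorname{cov}(Y_1, Y_2) & \operatorname{var}(Y_2)\end{pmatrix}$$
When n=2, and SRS with replacement:
$$\operatorname{var}\begin{pmatrix}Y_1\\ Y_2\end{pmatrix} = \begin{pmatrix}\sigma^2 & 0\\ 0 & \sigma^2\end{pmatrix} = \sigma^2\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} = \sigma^2 I_2$$
Identity Matrix:
$$I_2 = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}$$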
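Variance Matrix for n Random Variables
$$\operatorname{var}\begin{pmatrix}Y_1\\ Y_2\\ \vdots\\ Y_n\end{pmatrix} = \begin{pmatrix}\operatorname{var}(Y_1) & \operatorname{cov}(Y_1, Y_2) & \cdots & \operatorname{cov}(Y_1, Y_n)\\ \operatorname{cov}(Y_1, Y_2) & \operatorname{var}(Y_2) & \cdots & \operatorname{cov}(Y_2, Y_n)\\ \vdots & \vdots & \ddots & \vdots\\ \operatorname{cov}(Y_1, Y_n) & \operatorname{cov}(Y_2, Y_n) & \cdots & \operatorname{var}(Y_n)\end{pmatrix}$$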
Covariance of Random Variables When SRS without Replacement (n=2)
$$\operatorname{cov}(Y_1, Y_2) = \sum_{t=1}^{T} P(Y_1 = y_1;\, Y_2 = y_2)\,\bigl(y_1 - E(Y_1)\bigr)\bigl(y_2 - E(Y_2)\bigr)$$

Sample (t)   Probability   $Y_1 = y_1$   $Y_2 = y_2$   $y_1 - \mu$   $y_2 - \mu$   $(y_1 - \mu)(y_2 - \mu)$
1            1/6           25            32            -5            2             -10
2            1/6           25            33            -5            3             -15
3            1/6           32            25            2             -5            -10
4            1/6           32            33            2             3             6
5            1/6           33            25            3             -5            -15
6            1/6           33            32            3             2             6

$\operatorname{cov}(Y_1, Y_2) = -38/6 = -6.33$
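Covariance of two random variables when sampling without replacement
$$\operatorname{cov}(Y_i, Y_j) = -\frac{\sigma^2}{N-1}, \qquad i \neq j$$
$$\operatorname{var}\begin{pmatrix}Y_1\\ Y_2\\ \vdots\\ Y_n\end{pmatrix} = \sigma^2\begin{pmatrix}1 & -\frac{1}{N-1} & \cdots & -\frac{1}{N-1}\\ -\frac{1}{N-1} & 1 & \cdots & -\frac{1}{N-1}\\ \vdots & \vdots & \ddots & \vdots\\ -\frac{1}{N-1} & -\frac{1}{N-1} & \cdots & 1\end{pmatrix}$$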
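For the three-subject population, the negative covariance under sampling without replacement can be checked two ways: by enumerating the six equally likely ordered samples, and by evaluating $-\sigma^2/(N-1)$ with $\sigma^2$ defined using the divisor N. A minimal sketch with illustrative variable names:

```python
from fractions import Fraction
from itertools import permutations

ages = [25, 32, 33]
N = len(ages)
mu = Fraction(sum(ages), N)
sigma2 = sum((a - mu) ** 2 for a in ages) / N  # 38/3, divisor N

# Six equally likely ordered samples of size 2, drawn without replacement.
samples = list(permutations(ages, 2))
p = Fraction(1, len(samples))

cov_enumerated = sum(p * (y1 - mu) * (y2 - mu) for y1, y2 in samples)
cov_formula = -sigma2 / (N - 1)

print(cov_enumerated, cov_formula)      # -19/3 -19/3
print(round(float(cov_enumerated), 2))  # -6.33
```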
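Estimating the Covariance
Estimate the variance (assuming srs):
$$\sigma^2 = \frac{1}{N}\sum_{s=1}^{N}(y_s - \mu)^2 \qquad\text{is estimated by}\qquad S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$$
Estimate the covariance (assuming srs):
$$\sigma_{xy} = \frac{1}{N}\sum_{s=1}^{N}(y_s - \mu_y)(x_s - \mu_x) \qquad\text{is estimated by}\qquad \hat{\sigma}_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})(X_i - \bar{X})$$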
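The two estimators can be written directly from their formulas. In the sketch below the function names and the paired data are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the lecture.

```python
def sample_variance(values):
    """S^2: sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def sample_covariance(xs, ys):
    """Estimated covariance: sum of cross products of deviations, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical paired observations (illustration only).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

print(round(sample_variance(x), 4))       # 1.6667
print(round(sample_variance(y), 4))       # 8.6667
print(round(sample_covariance(x, y), 4))  # 3.6667
```

Note that the covariance of a variable with itself reduces to its variance, which gives a quick consistency check on the two functions.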
Independence
Two random variables, Y and Z, are independent if
$P(Y=y \mid Z=z) = P(Y=y)$
for all values y and z.
$P(Y=y \mid Z=z)$ means the probability that Y has a value of y, given that Z has a value of z
(see Text, sections 6.1 and 6.2)
Example: SRS with rep n=2
Are Y1 and Y2 independent?
Does $P(Y_2 = y_2 \mid Y_1 = y_1) = P(Y_2 = y_2)$?

ID (s)   Subject   Response
1        Daisy     25
2        Lily      32
3        Rose      33
Sampling n=2 (with rep)
Are Y1 and Y2 independent?
i=1: $Y_1 = \mu + E_1$, realized as $y_1 = \mu + \beta_1$; $P(Y_1 = y_1) = 1/3$ for each of the three subjects
i=2: $Y_2 = \mu + E_2$; $P(Y_2 = y_2) = 1/3$ for each of the three subjects
$P(Y_2 = y_2 \mid Y_1 = y_1) = 1/3$?
Yes, whichever subject is selected first.
Example: SRS without rep n=2
Are Y1 and Y2 independent?
Does $P(Y_2 = y_2 \mid Y_1 = y_1) = P(Y_2 = y_2)$?

ID (s)   Subject   Response
1        Daisy     25
2        Lily      32
3        Rose      33
Sampling n=2 (without replacement)
Are Y1 and Y2 independent?
i=1: $Y_1 = \mu + E_1$, realized as $y_1 = \mu + \beta_1$; $P(Y_1 = y_1) = 1/3$ for each of the three subjects
i=2: $Y_2 = \mu + E_2$; $P(Y_2 = y_2) = 1/3$ for each of the three subjects
Given the first selection:
- $P(Y_2 = y_2 \mid Y_1 = y_1) = 0$ for the subject already selected
- $P(Y_2 = y_2 \mid Y_1 = y_1) = 1/2$ for each of the two remaining subjects
No: the conditional probabilities differ from 1/3.
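The independence check on these slides amounts to comparing conditional and unconditional probabilities. The sketch below does this by counting equally likely ordered samples; the helper name `conditional` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

ages = [25, 32, 33]

def conditional(samples, y1, y2):
    """P(Y2 = y2 | Y1 = y1) when every ordered sample in `samples` is equally likely."""
    given = [s for s in samples if s[0] == y1]
    hits = [s for s in given if s[1] == y2]
    return Fraction(len(hits), len(given))

with_rep = list(product(ages, repeat=2))    # 9 samples
without_rep = list(permutations(ages, 2))   # 6 samples

print(conditional(with_rep, 25, 32))     # 1/3
print(conditional(without_rep, 25, 32))  # 1/2
print(conditional(without_rep, 25, 25))  # 0
```

With replacement, every conditional probability equals the unconditional 1/3, so Y1 and Y2 are independent; without replacement, none of them does.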
Relationship between Independence and Covariance
- If two random variables are independent, then their covariance is 0.
- If the covariance of two random variables is zero, the two may (or may not) be independent.
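A standard counterexample shows why zero covariance does not guarantee independence. The example below is not from the lecture: it assumes a hypothetical X that takes the values -1, 0, 1 with equal probability and sets Y = X squared, so Y is completely determined by X.

```python
from fractions import Fraction

# Hypothetical example: X is -1, 0 or 1 with probability 1/3 each, and Y = X**2.
p = Fraction(1, 3)
pairs = [(x, x * x) for x in (-1, 0, 1)]

e_x = sum(p * x for x, _ in pairs)  # 0
e_y = sum(p * y for _, y in pairs)  # 2/3
cov_xy = sum(p * (x - e_x) * (y - e_y) for x, y in pairs)

# Compare P(Y = 1) with P(Y = 1 | X = 1).
p_y1 = sum(p for _, y in pairs if y == 1)
matches_x1 = [(x, y) for x, y in pairs if x == 1]
p_y1_given_x1 = Fraction(sum(1 for _, y in matches_x1 if y == 1), len(matches_x1))

print(cov_xy, p_y1, p_y1_given_x1)  # 0 2/3 1
```

The covariance is 0, yet knowing X = 1 changes the probability that Y = 1 from 2/3 to 1, so the two variables are dependent.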
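Expected Value of a Linear Combination of Random Variables
Write linear combinations using vector notation
$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{1}{n}\begin{pmatrix}1 & 1 & \cdots & 1\end{pmatrix}\begin{pmatrix}Y_1\\ Y_2\\ \vdots\\ Y_n\end{pmatrix} = c'Y$$
Constants: $c = \frac{1}{n}\mathbf{1}_n$
Random variables: $Y' = \begin{pmatrix}Y_1 & Y_2 & \cdots & Y_n\end{pmatrix}$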
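Example: SRS of size n:
$$E(\bar{Y}) = E(c'Y) = c'E(Y)$$
where
$$E(Y)' = \begin{pmatrix}E(Y_1) & E(Y_2) & \cdots & E(Y_n)\end{pmatrix}$$
$$E(\bar{Y}) = E(c'Y) = \frac{1}{n}\begin{pmatrix}1 & 1 & \cdots & 1\end{pmatrix}\begin{pmatrix}E(Y_1)\\ E(Y_2)\\ \vdots\\ E(Y_n)\end{pmatrix} = \frac{1}{n}\begin{pmatrix}1 & 1 & \cdots & 1\end{pmatrix}\begin{pmatrix}\mu\\ \mu\\ \vdots\\ \mu\end{pmatrix} = \mu$$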
Example 2: Suppose two independent simple random samples without replacement are selected from populations of boy and girl babies, and each baby's weight is recorded. Let Y represent the boy weight and X the girl weight. Suppose sample results are given as follows:

Sample          Mean          Variance
Boys (n=25)     $\bar{Y}$     $\sigma_y^2$
Girls (n=40)    $\bar{X}$     $\sigma_x^2$

An estimate is wanted of the average birth weight in Europe, where for every 1000 births, 485 are girls and 515 are boys.
Write a linear combination that can be used to construct an estimator.
$$Z = 0.485\,\bar{X} + 0.515\,\bar{Y} = \begin{pmatrix}0.485 & 0.515\end{pmatrix}\begin{pmatrix}\bar{X}\\ \bar{Y}\end{pmatrix}$$
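The weighted combination can be evaluated once sample means are available. The slide gives no numeric means, so the values below are hypothetical placeholders (in kg) used only to show the calculation.

```python
# Weights from the slide: 485 girls and 515 boys per 1000 births.
weights = [0.485, 0.515]

# Hypothetical sample means in kg (NOT from the lecture): girls first, then boys.
x_bar_girls = 3.3
y_bar_boys = 3.5
means = [x_bar_girls, y_bar_boys]

# Z = 0.485 * X-bar + 0.515 * Y-bar, written as a dot product of weights and means.
z = sum(w * m for w, m in zip(weights, means))
print(round(z, 3))  # approximately 3.403
```

Because the weights sum to 1, the estimate always lies between the two sample means.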
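Variance of a Linear Combination of Random Variables
$$\operatorname{var}(c'Y) = c'\operatorname{var}(Y)\,c$$
Example: Sample mean, n=2 srs with replacement
Constants: $c = \frac{1}{2}\mathbf{1}_2$
Random variables: $Y' = \begin{pmatrix}Y_1 & Y_2\end{pmatrix}$
$$\operatorname{var}(c'Y) = \frac{1}{2}\begin{pmatrix}1 & 1\end{pmatrix}\operatorname{var}\begin{pmatrix}Y_1\\ Y_2\end{pmatrix}\frac{1}{2}\begin{pmatrix}1\\ 1\end{pmatrix} = \frac{1}{4}\begin{pmatrix}1 & 1\end{pmatrix}\begin{pmatrix}\sigma^2 & 0\\ 0 & \sigma^2\end{pmatrix}\begin{pmatrix}1\\ 1\end{pmatrix}$$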
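Matrix Multiplication
$$\begin{pmatrix}c_1 & c_2\end{pmatrix}\begin{pmatrix}a & b\\ d & e\end{pmatrix} = \begin{pmatrix}c_1 a + c_2 d & c_1 b + c_2 e\end{pmatrix}$$
Hence
$$\operatorname{var}(c'Y) = \frac{1}{4}\begin{pmatrix}1 & 1\end{pmatrix}\begin{pmatrix}\sigma^2 & 0\\ 0 & \sigma^2\end{pmatrix}\begin{pmatrix}1\\ 1\end{pmatrix} = \frac{1}{4}\begin{pmatrix}\sigma^2 & \sigma^2\end{pmatrix}\begin{pmatrix}1\\ 1\end{pmatrix} = \frac{1}{4}\left(\sigma^2 + \sigma^2\right) = \frac{\sigma^2}{2}$$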
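The result $\sigma^2/2$ can be checked numerically for the three-subject population, once through the quadratic form $c'\operatorname{var}(Y)\,c$ and once by enumerating the sample means directly. The helper `quad_form` is a hypothetical name for this sketch.

```python
from fractions import Fraction
from itertools import product

def quad_form(c, v):
    """Return c' V c for a vector c and a square matrix V given as nested lists."""
    n = len(c)
    return sum(c[i] * v[i][j] * c[j] for i in range(n) for j in range(n))

ages = [25, 32, 33]
N = len(ages)
mu = Fraction(sum(ages), N)
sigma2 = sum((a - mu) ** 2 for a in ages) / N  # 38/3

# Variance matrix under SRS with replacement: sigma^2 times the identity.
c = [Fraction(1, 2), Fraction(1, 2)]
v = [[sigma2, 0], [0, sigma2]]
var_matrix = quad_form(c, v)

# Direct check: variance of the sample mean over the nine equally likely samples.
samples = list(product(ages, repeat=2))
p = Fraction(1, len(samples))
var_enumerated = sum(p * (Fraction(y1 + y2, 2) - mu) ** 2 for y1, y2 in samples)

print(var_matrix, var_enumerated)  # 19/3 19/3
```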
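Practice: Variance of a Linear Combination of Random Variables
Example: Sample mean, n=2 srs withOUT replacement from a population of N
Constants: $c = \frac{1}{2}\mathbf{1}_2$
Random variables: $Y' = \begin{pmatrix}Y_1 & Y_2\end{pmatrix}$
$$\operatorname{var}\begin{pmatrix}Y_1\\ Y_2\end{pmatrix} = \sigma^2\begin{pmatrix}1 & -\frac{1}{N-1}\\ -\frac{1}{N-1} & 1\end{pmatrix}$$
$$\operatorname{var}(c'Y) = \frac{\sigma^2}{4}\begin{pmatrix}1 & 1\end{pmatrix}\begin{pmatrix}1 & -\frac{1}{N-1}\\ -\frac{1}{N-1} & 1\end{pmatrix}\begin{pmatrix}1\\ 1\end{pmatrix} = \frac{\sigma^2}{4}\begin{pmatrix}1 - \frac{1}{N-1} & 1 - \frac{1}{N-1}\end{pmatrix}\begin{pmatrix}1\\ 1\end{pmatrix} = \left(1 - \frac{1}{N-1}\right)\frac{\sigma^2}{2}$$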
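The practice result can be verified for N=3 in the same way: evaluate the quadratic form with off-diagonal entries $-\sigma^2/(N-1)$, then compare with a direct enumeration of the six ordered samples drawn without replacement. The helper `quad_form` is again a hypothetical name.

```python
from fractions import Fraction
from itertools import permutations

def quad_form(c, v):
    """Return c' V c for a vector c and a square matrix V given as nested lists."""
    n = len(c)
    return sum(c[i] * v[i][j] * c[j] for i in range(n) for j in range(n))

ages = [25, 32, 33]
N = len(ages)
mu = Fraction(sum(ages), N)
sigma2 = sum((a - mu) ** 2 for a in ages) / N  # 38/3

# Variance matrix under SRS without replacement.
off_diag = -sigma2 / (N - 1)
c = [Fraction(1, 2), Fraction(1, 2)]
v = [[sigma2, off_diag], [off_diag, sigma2]]
var_matrix = quad_form(c, v)
var_formula = (1 - Fraction(1, N - 1)) * sigma2 / 2

# Direct check over the six equally likely ordered samples.
samples = list(permutations(ages, 2))
p = Fraction(1, len(samples))
var_enumerated = sum(p * (Fraction(y1 + y2, 2) - mu) ** 2 for y1, y2 in samples)

print(var_matrix, var_formula, var_enumerated)  # 19/6 19/6 19/6
```

For this population the variance of the sample mean is half as large without replacement (19/6) as with replacement (19/3).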
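Correlation (see 17.1, 17.2 in text)
The correlation between two random variables is defined as
$$\rho = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{var}(X)\operatorname{var}(Y)}}$$
Based on a simple random sample, we estimate the correlation by
$$r = \frac{\hat{\sigma}_{xy}}{\sqrt{S_x^2 S_y^2}}$$
where
$$\hat{\sigma}_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y}), \qquad S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2, \qquad S_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$$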
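The estimator r can be computed directly from the three sample quantities. The function name and the paired data below are hypothetical, used only to illustrate the formula.

```python
from math import sqrt

def sample_correlation(xs, ys):
    """r = s_xy / sqrt(s_x^2 * s_y^2), with all three terms using the divisor n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x2 = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    s_y2 = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    return s_xy / sqrt(s_x2 * s_y2)

# Hypothetical paired observations (illustration only).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]

r = sample_correlation(x, y)
print(round(r, 3))  # 0.965
```

Because the same divisor n - 1 appears in the numerator and under the square root, it cancels, and r always lies between -1 and 1.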