
Reversible Jump Details
Brian S. Yandell
www.stat.wisc.edu/~yandell/statgen
University of Wisconsin-Madison
April 2008
Stat 877 © Brian S. Yandell
reversible jump idea
• expand idea of MCMC to compare models
• adjust for parameters in different models
– augment smaller model with innovations
– constraints on larger model
• calculus “change of variables” is key
– add or drop parameter(s)
– carefully compute the Jacobian
• consider stepwise regression
– Mallick (1995) & Green (1995)
– efficient calculation with Householder decomposition
model selection in regression
• known regressors (e.g. markers)
– models with 1 or 2 regressors
• jump between models
– centering regressors simplifies calculations
$m = 1: \quad Y_i = \mu + a (Q_{i1} - \bar{Q}_1) + e_i$
$m = 2: \quad Y_i = \mu + a_1 (Q_{i1} - \bar{Q}_1) + a_2 (Q_{i2} - \bar{Q}_2) + e_i$
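A minimal Python sketch of the two models on simulated data. The function names `center`, `simulate` and `fit_one`, the parameter values and the correlation `rho` are hypothetical and do not come from the slides. It illustrates one reason centering helps: with a centered regressor, the least squares intercept is just the sample mean of Y.

```python
import random

def center(x):
    """Subtract the sample mean from every value."""
    xbar = sum(x) / len(x)
    return [xi - xbar for xi in x]

def simulate(n=100, mu=10.0, a1=1.0, a2=0.5, sigma=1.0, rho=0.7, seed=1):
    """Simulate model 2 with two correlated, centered regressors (assumed values)."""
    rng = random.Random(seed)
    q1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # q2 is built from q1 so that their correlation is roughly rho
    q2 = [rho * x + (1 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0) for x in q1]
    c1, c2 = center(q1), center(q2)
    y = [mu + a1 * u + a2 * v + rng.gauss(0.0, sigma) for u, v in zip(c1, c2)]
    return c1, c2, y

def fit_one(c, y):
    """Least squares intercept and slope for model 1: Y = mu + a * c + e."""
    n = len(y)
    cbar, ybar = sum(c) / n, sum(y) / n
    sxy = sum((ci - cbar) * (yi - ybar) for ci, yi in zip(c, y))
    sxx = sum((ci - cbar) ** 2 for ci in c)
    slope = sxy / sxx
    return ybar - slope * cbar, slope

c1, c2, y = simulate()
mu_hat, a_hat = fit_one(c1, y)
print(mu_hat, sum(y) / len(y))  # the two values agree up to rounding
```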
slope estimate for 1 regressor
recall least squares estimate of slope
note relation of slope to correlation
$\hat{a} = \frac{r_{1y} s_y}{s_1}, \quad r_{1y} = \frac{\sum_{i=1}^n (Q_{i1} - \bar{Q}_1)(Y_i - \bar{Y}) / n}{s_1 s_y}$
$s_1^2 = \sum_{i=1}^n (Q_{i1} - \bar{Q}_1)^2 / n, \quad s_y^2 = \sum_{i=1}^n (Y_i - \bar{Y})^2 / n$
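The relation between slope and correlation can be checked numerically. The sketch below uses a small made-up data set and hypothetical function names; it computes the slope once through $r_{1y} s_y / s_1$ and once through the usual ratio of cross-products.

```python
import math

def moments(q, y):
    """Return s1, sy and r1y, using divisor n as on the slide."""
    n = len(y)
    qbar, ybar = sum(q) / n, sum(y) / n
    s1 = math.sqrt(sum((qi - qbar) ** 2 for qi in q) / n)
    sy = math.sqrt(sum((yi - ybar) ** 2 for yi in y) / n)
    cov = sum((qi - qbar) * (yi - ybar) for qi, yi in zip(q, y)) / n
    return s1, sy, cov / (s1 * sy)

def slope_from_correlation(q, y):
    """Slope written as r1y * sy / s1."""
    s1, sy, r1y = moments(q, y)
    return r1y * sy / s1

def slope_direct(q, y):
    """Slope written as a ratio of centered cross-products."""
    n = len(y)
    qbar, ybar = sum(q) / n, sum(y) / n
    sxy = sum((qi - qbar) * (yi - ybar) for qi, yi in zip(q, y))
    sxx = sum((qi - qbar) ** 2 for qi in q)
    return sxy / sxx

q = [0.0, 1.0, 2.0, 3.0, 4.0]        # made-up regressor values
y = [1.1, 1.9, 3.2, 3.8, 5.1]        # made-up responses
print(slope_from_correlation(q, y), slope_direct(q, y))
```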
2 correlated regressors
slopes adjusted for other regressors
$\hat{a}_1 = \frac{(r_{1y} - r_{12} r_{2y}) s_y}{s_1} = \hat{a} - \frac{r_{2y} s_y}{s_2} c_{21}, \quad c_{21} = \frac{r_{12} s_2}{s_1}$
$\hat{a}_2 = \frac{(r_{2y} - r_{12} r_{1y}) s_y}{s_2}, \quad s_{2 \cdot 1}^2 = \sum_{i=1}^n (Q_{i2} - \bar{Q}_2 - c_{21}(Q_{i1} - \bar{Q}_1))^2 / n$
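The algebra linking these quantities can be sanity-checked on a toy data set. The sketch below is only an illustration: the data and helper names are invented. It confirms that the two expressions given for $\hat{a}_1$ agree once $c_{21} = r_{12} s_2 / s_1$, and that the residual variance $s_{2 \cdot 1}^2$ of the second regressor given the first equals $s_2^2 (1 - r_{12}^2)$, which is a standard identity rather than something stated on the slide.

```python
import math

def mean(x):
    return sum(x) / len(x)

def sd(x):
    """Standard deviation with divisor n, as on the slides."""
    m = mean(x)
    return math.sqrt(sum((xi - m) ** 2 for xi in x) / len(x))

def corr(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / len(x)
    return cov / (sd(x) * sd(y))

q1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # made-up regressor 1
q2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]   # made-up regressor 2, correlated with q1
y = [1.0, 2.5, 2.9, 4.2, 5.1, 5.8]    # made-up responses

s1, s2, sy = sd(q1), sd(q2), sd(y)
r12, r1y, r2y = corr(q1, q2), corr(q1, y), corr(q2, y)

c21 = r12 * s2 / s1
a_hat = r1y * sy / s1                               # one-regressor slope
a1_first = (r1y - r12 * r2y) * sy / s1              # first form on the slide
a1_second = a_hat - (r2y * sy / s2) * c21           # second form on the slide

# variance of q2 after removing its linear dependence on q1
m1, m2 = mean(q1), mean(q2)
s2_given_1_sq = sum((b - m2 - c21 * (a - m1)) ** 2 for a, b in zip(q1, q2)) / len(q1)
print(a1_first, a1_second, s2_given_1_sq)
```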
Gibbs Sampler for Model 1
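• mean
$\mu \sim \phi\left(\mu_0 + B_n (\bar{Y} - \mu_0),\ B_n \frac{\sigma^2}{n}\right)$
• slope
$a \sim \phi\left(B_n \frac{\sum_{i=1}^n (Q_{i1} - \bar{Q}_1)(Y_i - \bar{Y})}{n s_1^2},\ B_n \frac{\sigma^2}{n s_1^2}\right)$
• variance
$\sigma^2 \sim \text{inv-}\chi^2\left(v + n,\ \frac{v \tau^2 + \sum_{i=1}^n \left(Y_i - \bar{Y} - a (Q_{i1} - \bar{Q}_1)\right)^2}{v + n}\right)$
Here $\mu_0$ is the prior mean, $v$ and $\tau^2$ are the prior degrees of freedom and scale, and $B_n$ is the shrinkage factor.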
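The three updates for model 1 can be strung together into a small sampler. This is a sketch under stated assumptions, not the author's code: the shrinkage factor is taken to be `bn = n / (n + kappa)`, and the hyperparameters `kappa`, `v`, `tau2` and `mu0` and the simulated data are all hypothetical. The scaled inverse chi-square draw uses the standard construction of dividing the scaled sum of squares by a chi-square variate. As on the slide, the slope and variance updates are centered at the sample mean of Y.

```python
import math
import random

def gibbs_model1(q, y, n_iter=2000, kappa=1.0, v=1.0, tau2=1.0, mu0=0.0, seed=0):
    """Gibbs sampler sketch for Y_i = mu + a (Q_i1 - Qbar_1) + e_i."""
    rng = random.Random(seed)
    n = len(y)
    qbar, ybar = sum(q) / n, sum(y) / n
    c = [qi - qbar for qi in q]                       # centered regressor
    ns1sq = sum(ci * ci for ci in c)                  # n * s1^2
    sxy = sum(ci * (yi - ybar) for ci, yi in zip(c, y))
    bn = n / (n + kappa)                              # ASSUMED form of the shrinkage factor
    sigma2 = tau2                                     # starting value
    draws = []
    for _ in range(n_iter):
        # mean: normal around the shrunken sample mean
        mu = rng.gauss(mu0 + bn * (ybar - mu0), math.sqrt(bn * sigma2 / n))
        # slope: normal around the shrunken least squares slope
        a = rng.gauss(bn * sxy / ns1sq, math.sqrt(bn * sigma2 / ns1sq))
        # variance: scaled inverse chi-square with v + n degrees of freedom
        rss = sum((yi - ybar - a * ci) ** 2 for ci, yi in zip(c, y))
        chi2 = rng.gammavariate((v + n) / 2.0, 2.0)   # chi-square with v + n df
        sigma2 = (v * tau2 + rss) / chi2
        draws.append((mu, a, sigma2))
    return draws

# simulated data with true slope 1.0 (hypothetical values)
data_rng = random.Random(42)
q = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
y = [5.0 + 1.0 * qi + data_rng.gauss(0.0, 1.0) for qi in q]

draws = gibbs_model1(q, y)
kept = draws[500:]                                    # drop burn-in
post_a = sum(d[1] for d in kept) / len(kept)
print(post_a)
```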
Gibbs Sampler for Model 2
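• mean
$\mu \sim \phi\left(\mu_0 + B_n (\bar{Y} - \mu_0),\ B_n \frac{\sigma^2}{n}\right)$
• slopes
$a_2 \sim \phi\left(B_n \frac{\sum_{i=1}^n (Q_{i2} - \bar{Q}_2)\left(Y_i - \bar{Y} - a_1 (Q_{i1} - \bar{Q}_1)\right)}{n s_{2 \cdot 1}^2},\ B_n \frac{\sigma^2}{n s_{2 \cdot 1}^2}\right)$
• variance
$\sigma^2 \sim \text{inv-}\chi^2\left(v + n,\ \frac{v \tau^2 + \sum_{i=1}^n \left(Y_i - \bar{Y} - \sum_{k=1}^2 a_k (Q_{ik} - \bar{Q}_k)\right)^2}{v + n}\right)$
Here $\mu_0$ is the prior mean, $v$ and $\tau^2$ are the prior degrees of freedom and scale, and $B_n$ is the shrinkage factor.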
updates from 2->1
• drop 2nd regressor
• adjust other regressor
$a = a_1 + a_2 c_{21}$
$a_2 = 0$
updates from 1->2
• add 2nd slope, adjusting for collinearity
• adjust other slope & variance
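$z \sim \phi(0, 1), \quad J = \frac{\sigma}{s_{2 \cdot 1} \sqrt{n}}$
$a_2 = \hat{a}_2 + z \times J, \quad \hat{a}_2 = \frac{\sum_{i=1}^n (Q_{i2} - \bar{Q}_2)\left(Y_i - \hat{\mu} - \hat{a}_1 (Q_{i1} - \bar{Q}_1)\right)}{n s_{2 \cdot 1}^2}$
$a_1 = a - a_2 c_{21} = a - z \times c_{21} J - \hat{a}_2 c_{21}$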
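The two moves are inverses of each other, which a few lines of Python can confirm. The function names and the numeric values below are hypothetical. `implied_z` is not on the slides; it is simply the innovation that would regenerate a given $a_2$ under the 1→2 move.

```python
def jump_up(a, z, a2_hat, c21, J):
    """Move 1 -> 2: turn the single slope a and innovation z into (a1, a2)."""
    a2 = a2_hat + z * J          # new slope, centered at its estimate
    a1 = a - a2 * c21            # adjust the existing slope for collinearity
    return a1, a2

def jump_down(a1, a2, c21):
    """Move 2 -> 1: drop the second regressor and fold its effect into a."""
    return a1 + a2 * c21

def implied_z(a2, a2_hat, J):
    """Innovation that maps back to a2 under jump_up."""
    return (a2 - a2_hat) / J

# hypothetical values for illustration
a, z, a2_hat, c21, J = 0.8, 0.5, 0.3, 0.7, 0.1
a1, a2 = jump_up(a, z, a2_hat, c21, J)
print(a1, a2, jump_down(a1, a2, c21), implied_z(a2, a2_hat, J))
```

Going up and then down returns the original slope, so the pair of moves forms a valid reversible proposal.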
model selection in regression
• known regressors (e.g. markers)
– models with 1 or 2 regressors
• jump between models
– augment with new innovation z
m        parameters                     innovations           transformations
1 → 2    $(\mu, a, \sigma^2; z)$        $z \sim \phi(0, 1)$   $a_2 = \hat{a}_2 + z \times J$, $a_1 = a - a_2 c_{21}$
2 → 1    $(\mu, a_1, a_2, \sigma^2)$    (none)                $a = a_1 + a_2 c_{21}$, $z = 0$
change of variables
• change variables from model 1 to model 2
• calculus issues for integration
– need to formally account for change of variables
– infinitesimal steps in integration (db)
– involves partial derivatives (next page)
$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 1 & -c_{21} J \\ 0 & J \end{pmatrix} \begin{pmatrix} a \\ z \end{pmatrix} + \begin{pmatrix} -c_{21} \\ 1 \end{pmatrix} \hat{a}_2 = g(a; z \mid Y, Q_1, Q_2)$
$\iint \pi(a_1, a_2 \mid Y, Q_1, Q_2)\, da_1\, da_2 = \iint \pi(a; z \mid Y, Q_1, Q_2)\, J\, da\, dz$
Jacobian & the calculus
• Jacobian sorts out change of variables
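– careful: easy to mess up here!
$g(a; z) = (a_1, a_2), \quad \frac{\partial g(a; z)}{\partial a\, \partial z} = \begin{pmatrix} 1 & -c_{21} J \\ 0 & J \end{pmatrix}$
$\left| \det \begin{pmatrix} 1 & -c_{21} J \\ 0 & J \end{pmatrix} \right| = \left| 1 \times J - 0 \times (-c_{21} J) \right| = J$
$da_1\, da_2 = \left| \det \left( \frac{\partial g(\mu, a, \sigma^2; z)}{\partial a\, \partial z} \right) \right| da\, dz = J\, da\, dz$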
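Because the Jacobian is easy to get wrong, a numerical check is a cheap safeguard. The sketch below differentiates the map $g$ by central differences and compares the absolute determinant with $J$. The function names and the values of `a2_hat`, `c21` and `J` are hypothetical.

```python
def g(a, z, a2_hat, c21, J):
    """Map (a, z) in model 1 to (a1, a2) in model 2."""
    a2 = a2_hat + z * J
    a1 = a - a2 * c21
    return a1, a2

def numeric_jacobian_det(a, z, a2_hat, c21, J, h=1e-6):
    """Absolute determinant of d(a1, a2) / d(a, z) by central differences."""
    up_a, dn_a = g(a + h, z, a2_hat, c21, J), g(a - h, z, a2_hat, c21, J)
    up_z, dn_z = g(a, z + h, a2_hat, c21, J), g(a, z - h, a2_hat, c21, J)
    d11 = (up_a[0] - dn_a[0]) / (2 * h)   # d a1 / d a
    d12 = (up_z[0] - dn_z[0]) / (2 * h)   # d a1 / d z
    d21 = (up_a[1] - dn_a[1]) / (2 * h)   # d a2 / d a
    d22 = (up_z[1] - dn_z[1]) / (2 * h)   # d a2 / d z
    return abs(d11 * d22 - d12 * d21)

J = 0.25                                   # hypothetical scale
det = numeric_jacobian_det(a=0.8, z=0.5, a2_hat=0.3, c21=0.7, J=J)
print(det, J)
```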
geometry of reversible jump
[Figure: two panels, "Move Between Models" and "Reversible Jump Sequence", plotting b2 against b1 (a1); models m=1 and m=2 are marked, with c21 = 0.7]
QT additive reversible jump
[Figure: two panels, "a short sequence" and "first 1000 with m<3", plotting b2 against b1 (a1)]
credible set for additive
• 90% & 95% sets based on normal
• regression line corresponds to slope of updates
[Figure: credible sets, plotting b2 against b1 (a1)]
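A normal-based credible set for two effects is an ellipse: a point is inside when its squared Mahalanobis distance from the mean is below the chi-square quantile with 2 degrees of freedom, which has the closed form $-2 \ln(1 - \text{level})$. The sketch below applies this test. The mean and covariance are hypothetical placeholders, not values read from the figure, and the function names are invented.

```python
import math

def chi2_2df_quantile(level):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - level)

def in_credible_set(point, mean, cov, level=0.95):
    """True if point lies inside the normal-based credible ellipse."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    # squared Mahalanobis distance, using the explicit 2x2 inverse
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 <= chi2_2df_quantile(level)

mean = (0.10, 0.10)                                   # hypothetical posterior mean
cov = ((0.0025, -0.0015), (-0.0015, 0.0025))          # hypothetical covariance

print(chi2_2df_quantile(0.90), chi2_2df_quantile(0.95))
print(in_credible_set((0.10, 0.10), mean, cov), in_credible_set((0.5, 0.5), mean, cov))
```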
multivariate updating of effects
• more computations when m > 2
• avoid matrix inverse
– Cholesky decomposition of matrix
• simultaneous updates
– effects at all loci
• accept new locus based on
– sampled new genotypes at locus
– sampled new effects at all loci
• also long-range position updates
[Figure: two panels, "before" and "after"]
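To illustrate how a Cholesky decomposition avoids an explicit matrix inverse, the sketch below solves a small symmetric positive definite system by factoring and then substituting forward and backward. The matrix `A` stands in for a hypothetical cross-product matrix of regressors at three loci. The slides do not give such a matrix, and the function names are invented.

```python
import math

def cholesky(A):
    """Lower-triangular L with A = L L^T, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_cholesky(A, b):
    """Solve A x = b without forming the inverse of A."""
    L = cholesky(A)
    n = len(b)
    w = [0.0] * n
    for i in range(n):                       # forward substitution: L w = b
        w[i] = (b[i] - sum(L[i][k] * w[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):             # back substitution: L^T x = w
        x[i] = (w[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[4.0, 2.0, 0.6],
     [2.0, 5.0, 1.0],
     [0.6, 1.0, 3.0]]                        # hypothetical cross-product matrix
b = [0.3, -7.5, 0.1]                         # chosen so that x = (1, -2, 0.5)
effects = solve_cholesky(A, b)
print(effects)
```

The same factor `L` can also be reused to draw correlated normal effects for all loci at once, which fits the simultaneous updates described above.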
References
• Satagopan, Yandell (1996); Heath (1997); Sillanpää, Arjas (1998); Stephens, Fisch (1998)
• Green (1995); Richardson, Green (1997); Green (2003, 2004)