Modeling Clinical Time Series Using Gaussian Process Sequences
Zitao Liu, Lei Wu, Milos Hauskrecht
Department of Computer Science, University of Pittsburgh
Motivation
Development of accurate models of complex clinical time series data is critical for understanding a disease and its dynamics, and subsequently for patient management and clinical decision making.
Goal
“Develop accurate models of complex clinical time series!” Specifically, we want a prediction model that can:
1. Handle missing values
2. Deal with irregular time sampling intervals
3. Make accurate long-term predictions
Problem Statement
We define the time series prediction/regression function for clinical time series as $g : Y_{obs} \times t \rightarrow y$, where $Y_{obs} = (y_i, t_i)_{i=1}^{n}$ is a sequence of past observation-time pairs such that $0 \leq t_i < t_{i+1}$, $y_i$ is a $p$-dimensional observation vector made at time $t_i$, and $n$ is the number of past observations; $t > t_n$ is the time at which we would like to predict the observation $y$. The series is irregularly sampled: in general, $t_{i+1} - t_i \neq t_i - t_{i-1}$.
Background
• Gaussian Process (GP)
A GP is an extension of the multivariate Gaussian to distributions over functions, defined by two components $(m(x), k(x, x'))$:
 Mean function: $m(x) = \mathbb{E}[f(x)]$
 Covariance function: $k(x, x') = \mathbb{E}[(f(x) - m(x))(f(x') - m(x'))]$
GP regression equations:
 Estimated mean: $\bar{f}_* = K(x_*, x)\left[K(x, x) + \sigma^2 I\right]^{-1} y$
 Estimated covariance: $\mathrm{Cov}(f_*) = K(x_*, x_*) - K(x_*, x)\left[K(x, x) + \sigma^2 I\right]^{-1} K(x, x_*)$
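To make the regression equations concrete, here is a minimal NumPy sketch (not part of the original poster; the squared-exponential kernel and the noise level `sigma2` are illustrative assumptions, not the poster's choices):

```python
import numpy as np

def sq_exp_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance k(x, x') (assumed kernel, for illustration)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_regression(x, y, x_star, sigma2=0.1):
    """GP posterior mean and covariance at test inputs x_star.

    Implements the regression equations above:
      mean = K(x*, x) [K(x, x) + sigma^2 I]^{-1} y
      cov  = K(x*, x*) - K(x*, x) [K(x, x) + sigma^2 I]^{-1} K(x, x*)
    """
    K = sq_exp_kernel(x, x) + sigma2 * np.eye(len(x))
    K_star = sq_exp_kernel(x_star, x)
    alpha = np.linalg.solve(K, y)  # [K + sigma^2 I]^{-1} y
    mean = K_star @ alpha
    cov = sq_exp_kernel(x_star, x_star) - K_star @ np.linalg.solve(K, K_star.T)
    return mean, cov

# Usage: noisy observations of a sine wave
x = np.linspace(0, 5, 20)
y = np.sin(x) + 0.1 * np.random.randn(20)
mean, cov = gp_regression(x, y, np.linspace(0, 5, 50))
```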
• Linear Dynamical System (LDS)
$z_{t+1} = A z_t + w_t$
$y_t = C z_t + v_t$
$z_1 \sim \mathcal{N}(\pi_1, V_1)$, $w_t \sim \mathcal{N}(0, Q)$, $v_t \sim \mathcal{N}(0, R)$
Equivalently, $p(z_{t+1} \mid z_t) = \mathcal{N}(A z_t, Q)$ and $p(y_t \mid z_t) = \mathcal{N}(C z_t, R)$, where $Y$ is the time series of observations and $Z$ are the hidden states driving the dynamics.

• Discrete non-linear model (GPIL)
$z_{t+1} = r(z_t) + w_t$
$y_t = u(z_t) + v_t$
with the same noise model, where $r(\cdot)$ is an unknown transition function and $u(\cdot)$ is an unknown measurement function.
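The SSGP prediction procedure below relies on Kalman filtering in this LDS. A minimal predict/update step, as a sketch under the standard LDS assumptions above (all matrices assumed given):

```python
import numpy as np

def kalman_step(mu, V, y, A, C, Q, R):
    """One Kalman filter step for the LDS z' = Az + w, y = Cz + v.

    mu, V: current filtered state mean and covariance.
    Returns the filtered mean/covariance after observing y.
    """
    # Predict: propagate the state estimate through the dynamics.
    mu_pred = A @ mu
    V_pred = A @ V @ A.T + Q
    # Update: correct with the new observation via the Kalman gain.
    S = C @ V_pred @ C.T + R                  # innovation covariance
    K_gain = V_pred @ C.T @ np.linalg.inv(S)  # Kalman gain
    mu_new = mu_pred + K_gain @ (y - C @ mu_pred)
    V_new = (np.eye(len(mu)) - K_gain @ C) @ V_pred
    return mu_new, V_new
```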
i 2
i 1
i 1
j 1
A
C
Time
C
Y – time series of observations;
Z – hidden states driving the dynamics.
???
Time
(Θ denotes covariance function parameters)

β,z
1
K  1 T 1 K 1
 Y K
K Y
  2

)
[log p(β, z, Y)]
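A short sketch of this composite covariance; the parameter defaults are placeholders, not the poster's fitted values:

```python
import numpy as np

def k_mean_reverting(t1, t2, sigma1=1.0, theta1=0.5):
    """K1: mean-reverting (Ornstein-Uhlenbeck-style) covariance."""
    return sigma1**2 * np.exp(-theta1 * np.abs(t1[:, None] - t2[None, :]))

def k_periodic(t1, t2, sigma2=1.0, theta2=1.0, omega=np.pi):
    """K2: periodic covariance capturing recurring patterns."""
    d = t1[:, None] - t2[None, :]
    return sigma2**2 * np.exp(-theta2 * np.sin(omega * d) ** 2)

def k_composite(t1, t2):
    """K = K1 + K2, as used for the window-level GPs."""
    return k_mean_reverting(t1, t2) + k_periodic(t1, t2)

# Example: covariance matrix over irregularly sampled times
t = np.array([0.0, 0.7, 1.1, 2.5, 4.0])
K = k_composite(t, t)
```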
Joint distribution:
$p(D) = p(z, \beta, Y) = p(z_1) \prod_{i=2}^{m} p(z_i \mid z_{i-1}) \prod_{i=1}^{m} p(\beta_i \mid z_i) \prod_{i=1}^{m} \prod_{j=1}^{s_i} p(y_{i,j} \mid \beta_i)$
Parameter set: $\Omega = \{\Theta, \{\beta_i\}, A, C, R, Q, \pi_1, V_1\}$ ($\Theta$ denotes the covariance function parameters)

• Learning
 Learn $\Omega \setminus \Theta$: EM algorithm with $\mathbb{E}_{\beta, z}[\log p(\beta, z, Y)]$
 Learn $\Theta$: gradient-based methods, using
$\frac{\partial}{\partial \theta} \log p(Y \mid \Theta) = \frac{1}{2} Y^T K^{-1} \frac{\partial K}{\partial \theta} K^{-1} Y - \frac{1}{2} \mathrm{Tr}\left(K^{-1} \frac{\partial K}{\partial \theta}\right)$
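For the gradient step, a sketch of this marginal-likelihood derivative for a single covariance parameter, given a precomputed kernel-derivative matrix (the function name and interface are illustrative):

```python
import numpy as np

def log_marginal_grad(Y, K, dK_dtheta):
    """Gradient of log p(Y | Theta) w.r.t. one covariance parameter.

    Implements: 0.5 * Y^T K^{-1} dK K^{-1} Y - 0.5 * Tr(K^{-1} dK)
    Y: (n,) observation vector; K, dK_dtheta: (n, n) matrices.
    """
    K_inv_Y = np.linalg.solve(K, Y)
    quad = 0.5 * K_inv_Y @ dK_dtheta @ K_inv_Y
    trace = 0.5 * np.trace(np.linalg.solve(K, dK_dtheta))
    return quad - trace
```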
• Prediction
Prediction inference proceeds in the following steps (a code sketch follows the list):
1. Split $Y_{obs}$ and $t$ into windows.
2. For the windows that do not contain $t$, extract the last values in those windows as the $\beta$s and feed them into the Kalman filter to infer the most recent hidden state $z_k$, where $k$ is the index of the last window that does not contain $t$.
3. Get $\beta_{k+1} = C A z_k$ from $z_{k+1} = A z_k$ and $\beta_{k+1} = C z_{k+1}$.
4. If $t$ is in window $k+1$, use the observations $(y_{k+1}, t_{k+1})$ in window $k+1$ and $\beta_{k+1}$ to make the prediction, where $y = \beta_{k+1} + K(t, t_{k+1}) K^{-1}(t_{k+1}, t_{k+1})(y_{k+1} - \beta_{k+1})$; otherwise, find the window index $i$ to which $t$ belongs. The prediction at $t$ is $y = C A^{i-k} z_k$.
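A minimal sketch of steps 3-4 for a one-dimensional series, assuming the hidden state $z_k$ has already been filtered and reusing the composite kernel above; all names (`ssgp_predict`, `kernel`, etc.) are illustrative, not the authors' code:

```python
import numpy as np

def ssgp_predict(t, z_k, A, C, t_window, y_window, kernel):
    """Predict y at time t inside window k+1 (steps 3-4 above).

    z_k: filtered hidden state of the last complete window (shape (d,)).
    C: measurement vector (shape (d,)), so beta is a scalar window mean.
    t_window, y_window: observation times/values already seen in window k+1.
    kernel: callable building covariance matrices, e.g. k_composite.
    """
    beta_next = C @ A @ z_k  # step 3: window-level mean coefficient
    if len(t_window) == 0:
        return beta_next      # no in-window observations yet
    # Step 4: GP correction of the mean using in-window residuals.
    K_ww = kernel(t_window, t_window)
    k_tw = kernel(np.atleast_1d(t), t_window)[0]
    resid = y_window - beta_next
    return beta_next + k_tw @ np.linalg.solve(K_ww, resid)
```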
Experiments
• Data
Figure 2. Time series for six tests from the Complete Blood Count (CBC) panel for one of the patients.
• Evaluation Metric
Root Mean Square Error (RMSE):
$\mathrm{RMSE} = \left( n^{-1} \sum_{i=1}^{n} |y_i - \hat{y}_i|^2 \right)^{1/2}$
• Results
Figure 3. Root Mean Square Error (RMSE) on CBC test samples.
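For completeness, the metric as a short sketch:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error, as defined above."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.sqrt(np.mean(np.abs(y_true - y_pred) ** 2))
```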
Future Work
• Study and model dependencies among multiple time series
• Extend to switching-state and controlled dynamical systems
Acknowledgement
This research work was supported by grants R01LM010019 and R01GM088224 from the
National Institutes of Health. Its content is solely the responsibility of the authors and does not
necessarily represent the official views of the NIH.
References
• M. Hauskrecht, M. Valko, I. Batal, G. Clermont, S. Visweswaran, and G. F. Cooper. Conditional outlier detection for clinical alerting. In AMIA Annual Symposium Proceedings, 2010, p. 286.
• C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, 2006.
• R. Turner, M. P. Deisenroth, and C. E. Rasmussen. State-space inference and learning with Gaussian processes. In AISTATS, vol. 9, 2010, pp. 868-875.