TRACKING MULTIPLE CELLS BY CORRESPONDENCE RESOLUTION IN A
SEQUENTIAL BAYESIAN FRAMEWORK
Nilanjan Ray, Gang Dong, Scott T. Acton
C. L. Brown Dept. of Electrical & Computer Engineering, Dept. of Biomedical Engineering
University of Virginia, Charlottesville, Virginia, USA
ABSTRACT
We propose a multi-target tracking (MTT) algorithm in a
sequential Bayesian framework that computes cell
velocities from video microscopy. Unlike traditional
tracking methods, our formulation does not involve the
estimation of target states; instead, we estimate one-to-one
target correspondences by way of a sequential Markov
chain Monte Carlo (MCMC) algorithm. The proposed
probabilistic framework also automatically accounts for a
variable number of targets. We have tested the proposed
tracking algorithm on two different in vitro and one in vivo
microscopy experiments. The three experiments show
that the method holds promise in terms of low false
positive and false negative rates as well as low rates of
correspondence error.
1. INTRODUCTION
Cell velocity analysis from in vitro flow chamber assays
and in vivo video microscopy is a crucial task in important
biomedical application areas [3, 4]. Computing cell
velocities from video microscopy manually is an
extremely tedious task, which is also prone to human
fatigue and bias. The essential precursor to velocity
computation is the implementation of a robust multi-target
tracking (MTT) algorithm. In this paper we propose a
novel MTT algorithm based on a sequential Bayesian
framework with the motivation of solving MTT in the
context of cell velocity analysis.
Classically, tracking refers to target state estimation over a
period of time. However, we take a different route in our
approach to multiple cell tracking: instead of estimating
the cell state (position, velocity, etc.), we estimate the
target correspondence between two consecutive video
frames in a sequential MCMC framework. The target
correspondence may be defined as a mapping from a set of
targets on a given video frame to the set of targets on the
subsequent video frame. We will refer to this mapping as
the track-map. The motivation to directly estimate the
track-map rather than target state comes from the fact that
cell positions can be obtained reasonably accurately by
straightforward detection methods. In many microscopy
scenarios, these detection techniques typically yield a high
rate of detection. However, the detection methods also
generate false positives. Thus, a tracking method requires
two post-processing steps: elimination of the false
positives, and retrieval of the track-map.
In contrast to our approach, the typical sequential
Bayesian tracking framework always involves estimation
of target state. For example, multiple hypotheses tracking
(MHT) methods [6] compute both the probability
associated with the measurement-to-target association
hypothesis and the posterior probability for target states
given the measurements available up until the current time
(and a given data hypothesis). Next the MHT method
computes the target state posterior density by multiplying
the two aforementioned probabilities and taking their sum
over all possible association hypotheses. As time
progresses, the number of hypotheses grows exponentially
and so does the computational complexity. The joint
probabilistic data association (JPDA) method [6] is a
subclass of MHT that prunes the association hypotheses
by clustering the target states. JPDA is typically defined
for systems with linear and Gaussian dynamics. When the
system involves non-Gaussian probability distributions
and non-linearity in the target state dynamics, the particle
filter (PF) provides a solution for MHT [1]. However, the
computation becomes formidable in the PF context when
the number of targets is large, and there is no
straightforward method to add or remove targets
dynamically. The reversible jump MCMC method [2]
provides an avenue to compute the posterior target state
density in such cases with a variable number of targets.
Even so, the sequential nature of the problem, in addition
to the varying number of targets, renders the reversible
jump MCMC computation nearly intractable due to the
storage of samples for different numbers of targets.
These factors have led us to rethink the choice of
tracking framework applicable to cell motility analysis.
Our proposed tracking framework has the following
characteristics: (1) we formulate the problem in a
sequential Bayesian framework that does not involve
target state estimation; (2) we simultaneously estimate the
track-map and refine available target detection results; (3)
our formulation completely bypasses intensive reversible
jump MCMC computation, yet handles a variable number
of targets in a sound probabilistic framework; and (4) the
implementation involves an MCMC sampler that is a hybrid
of Gibbs sampling and the Metropolis-Hastings (MH)
algorithm and that efficiently creates samples for the
one-to-one track-map.
2. PROPOSED MTT FORMULATION
The proposed MTT formulation assumes that all targets
are detected and that the detection leads to false positives.
The proposed tracking framework simultaneously refines
the set of crudely detected targets to eliminate false
positives and estimates the track-map.
2.1 Simultaneous Track-map Estimation and
Detection Refinement
Let $d_t$ and $d_{t-1}$ be the sets of detected targets on
frames $t$ and $t-1$, respectively. We now define $f_t$, the
track-map, as follows:
\[ f_t : d_{t-1} \to d_t \cup \{o\}, \tag{1} \]
such that the following restricted mapping is one-to-one:
\[ f_t^{\mathrm{restricted}} : \{e : e \in d_{t-1} \text{ and } f_t(e) \neq o\} \to d_t. \tag{2} \]
We denote a null element by $o$. $f_t(e) = o$ means the target
$e$ in set $d_{t-1}$ does not find its match in set $d_t$. We also
define the following mapping for the refinement of detection:
\[ l_t : d_t \to \{0, 1\}, \tag{3} \]
where 0 denotes “not a target” or false positive and 1
denotes a target or true positive. We are interested in a
sequential Bayesian maximum a posteriori (MAP)
estimation of $(l_t, f_t)$ from the joint density
$p(l_t, f_t \mid Z_{1:t})$, where $Z_{1:t}$ denotes all the accumulated
observations or measurements up to frame $t$. The posterior
density can be written as:
\[ p(l_t, f_t \mid Z_{1:t}) \propto p(Z_t \mid l_t, f_t) \sum_{l_{t-1}, f_{t-1}} p(l_t, f_t \mid l_{t-1}, f_{t-1}) \, p(l_{t-1}, f_{t-1} \mid Z_{1:t-1}). \tag{4} \]
Given $f_{t-1}$ we can uniquely construct a “backward”
track-map $g_{t-1} : d_{t-1} \to d_{t-2} \cup \{o\}$ as follows:
\[ g_{t-1}(a) = \begin{cases} b, & \text{if } b \in d_{t-2} \text{ and } f_{t-1}(b) = a, \\ o, & \text{otherwise.} \end{cases} \tag{5} \]
Similarly, given $g_t$ (or $g_{t-1}$) we can uniquely define $f_t$
(or $f_{t-1}$) as well. We choose the following density as the
state evolution dynamics for the system:
\[ p(l_t, f_t \mid l_{t-1}, f_{t-1}) = p(l_t, f_t \mid l_{t-1}, g_{t-1}), \tag{6} \]
then (4) can be rewritten as:
\[ p(l_t, f_t \mid Z_{1:t}) \propto p(Z_t \mid l_t, f_t) \sum_{l_{t-1}, g_{t-1}} p(l_t, f_t \mid l_{t-1}, g_{t-1}) \, p(l_{t-1}, g_{t-1} \mid Z_{1:t-1}). \tag{7} \]
If we now approximate the density $p(l_{t-1}, g_{t-1} \mid Z_{1:t-1})$
by a set of samples $\{l_{t-1}^m, g_{t-1}^m\}_{m=1}^M$, then (7) can be
expressed as follows:
\[ p(l_t, f_t \mid Z_{1:t}) \propto \sum_{m=1}^{M} p(Z_t \mid l_t, f_t) \, p(l_t, f_t \mid l_{t-1} = l_{t-1}^m, g_{t-1} = g_{t-1}^m). \tag{8} \]
Our aim is to generate samples $\{l_t^m, f_t^m\}_{m=1}^M$ from the
mixture density given in the right hand side of (8) in order
to represent the posterior density $p(l_t, f_t \mid Z_{1:t})$. Once the
samples are obtained, we can estimate the track-map and
the refined detection. To accomplish the recursion (8) for
the next frame $(t+1)$ from the samples $\{l_t^m, f_t^m\}_{m=1}^M$, we
construct the samples $\{l_t^m, g_t^m\}_{m=1}^M$. This is straightforward
as one can construct samples for $f_t$ uniquely given those
for $g_t$ and vice-versa. The following MCMC algorithm
generates samples $\{l_t^m, g_t^m\}_{m=1}^M$ given the set of samples
$\{l_{t-1}^m, g_{t-1}^m\}_{m=1}^M$.
    {l_t^m, g_t^m}_{m=1}^M = MCMC[{l_{t-1}^m, g_{t-1}^m}_{m=1}^M]
    for m = 1 : M
        Choose u with uniform distribution in {1, ..., M}
        Generate l_t^m and f_t^m from
            p(Z_t | l_t, f_t) p(l_t, f_t | l_{t-1} = l_{t-1}^u, g_{t-1} = g_{t-1}^u)
        Construct g_t^m from f_t^m
    end
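The MCMC loop and the construction of the backward track-map in (5) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `invert_track_map` and `mcmc_step` are hypothetical, track-maps are stored as dictionaries, `None` stands in for the null element o, and the sampler for the second step of the loop is passed in as a callable whose internals are left open.

```python
import random

NULL = None  # stands in for the null element o (this representation is an assumption)


def invert_track_map(f_t, detections_t):
    """Build the backward track-map g_t from a forward track-map f_t, as in (5).

    f_t maps each target index on frame t-1 to a target index on frame t, or NULL.
    The result maps each index on frame t to its predecessor on frame t-1, or NULL.
    """
    g_t = {a: NULL for a in detections_t}
    for b, a in f_t.items():
        if a is not NULL:
            g_t[a] = b  # well defined because the restricted map is one-to-one
    return g_t


def mcmc_step(prev_samples, detections_t, sample_labels_and_track_map, rng=random):
    """Draw M new samples (l_t, g_t) from M previous samples (l_{t-1}, g_{t-1}).

    sample_labels_and_track_map is a hypothetical callable that returns (l_t, f_t)
    given one previous sample; it plays the role of the second step in the loop.
    """
    M = len(prev_samples)
    new_samples = []
    for _ in range(M):
        u = rng.randrange(M)                       # uniform choice of a mixture component
        l_prev, g_prev = prev_samples[u]
        l_t, f_t = sample_labels_and_track_map(l_prev, g_prev)
        g_t = invert_track_map(f_t, detections_t)  # construct g_t from f_t
        new_samples.append((l_t, g_t))
    return new_samples
```

Storing only $(l_t, g_t)$ per sample keeps the memory cost at M samples per frame regardless of how the number of detections changes from frame to frame.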
Before we elaborate on the second step inside the loop of
MCMC, we need the following notation to denote multiple
targets: $l_t^m = \{l_{t,n}^m\}_{n=1}^{|d_t|}$, $g_t^m = \{g_{t,n}^m\}_{n=1}^{|d_t|}$,
$f_t^m = \{f_{t,n}^m\}_{n=1}^{|d_{t-1}|}$, and likewise, where $|d_t|$ and
$|d_{t-1}|$ are the numbers of initial detections on frames $t$
and $t-1$, respectively. A hybrid of the Gibbs and MH
algorithms generates $l_t^m$ and $f_t^m$ from
$p(Z_t \mid l_t, f_t) \, p(l_t, f_t \mid l_{t-1} = l_{t-1}^u, g_{t-1} = g_{t-1}^u)$,
where the second density factors as
\[ p(l_t \mid f_t, l_{t-1} = l_{t-1}^u, g_{t-1} = g_{t-1}^u) \, p(f_t \mid l_{t-1} = l_{t-1}^u, g_{t-1} = g_{t-1}^u). \]
In order to implement this sampling algorithm, we utilize
the following form for the conditional density:
\[ p(l_t \mid f_t, l_{t-1}, g_{t-1}) = p(l_t \mid f_t) \propto \min\Big( \sum_{n=1}^{|d_t|} l_{t,n}, \; \sum_{n=1}^{|d_{t-1}|} 1(f_{t,n} \neq o) \Big), \tag{9} \]
where $1(\cdot)$ is the indicator function.
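As a small worked instance of (9), consider made-up values that do not come from the experiments: four detections on the current frame and three targets on the previous frame, two of which are matched.

```latex
% Hypothetical values chosen only for illustration.
l_t = (1, 1, 0, 1), \qquad f_t = (2, o, 4)
\sum_{n=1}^{4} l_{t,n} = 3, \qquad \sum_{n=1}^{3} 1(f_{t,n} \neq o) = 2
p(l_t \mid f_t) \propto \min(3, 2) = 2
```

Relabeling the third detection as a target leaves the value at min(4, 2) = 2, whereas labeling only one detection as a target lowers it to min(1, 2) = 1.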
We also assume that the measurement is independent of the
track-map and that it factors over the targets, so:
\[ p(Z_t \mid l_t, f_t) = p(Z_t \mid l_t) = \prod_{n=1}^{|d_t|} p(Z_t \mid l_{t,n}). \tag{10} \]
While generating the $m$th sample, a new proposal value for
the $n$th detected cell is accepted or rejected with respect to
the MH ratio
\[ \frac{\min\Big( \sum_{i<n} l_{t,i}^{m} + {l_{t,n}^{m-1}}' + \sum_{i>n} l_{t,i}^{m-1}, \; \sum_{n=1}^{|d_{t-1}|} 1(f_{t,n} \neq o) \Big) \, p(Z_t \mid {l_{t,n}^{m-1}}')}{\min\Big( \sum_{i<n} l_{t,i}^{m} + \sum_{i \geq n} l_{t,i}^{m-1}, \; \sum_{n=1}^{|d_{t-1}|} 1(f_{t,n} \neq o) \Big) \, p(Z_t \mid l_{t,n}^{m-1})}, \tag{11} \]
where ${l_{t,n}^{m-1}}'$ is the logical complement of $l_{t,n}^{m-1}$. The
proposal ${l_{t,n}^{m-1}}'$ is symmetric and deterministic.
Next, we generate samples for the track-map $f_t$ from the
conditional distribution $p(f_t \mid l_t, l_{t-1}, g_{t-1})$ as follows:
    f_t^m = GEN_TRACKMAP[l_t, l_{t-1}, g_{t-1}]
    Choose u in {1, ..., M} with uniform distribution
    {j_1, ..., j_{|d_{t-1}|}} = RandomPermutation[{1, ..., |d_{t-1}|}]
    S = {k : l_t(k) = 1}
    for n = 1 : |d_{t-1}|
        if l_{t-1,j_n}^u = 1
            if S is empty
                f_{t,j_n}^m = o
            else
                P = {h(k, j_n, g_{t-1}(j_n)) : k in S}
                Choose f_{t,j_n}^m in S with probability proportional to P
                S = S \ {f_{t,j_n}^m}
            end
        else
            f_{t,j_n}^m = o
        end
    end
where $h(k,j,i)$ is the so-called “motion model” that has
three arguments $k$, $j$, $i$ representing respectively the
coordinates $(x_k,y_k)$, $(x_j,y_j)$ and $(x_i,y_i)$ of three cell
centers on frames $t$, $t-1$, and $t-2$. The function $h(\cdot)$ can
have the following form:
\[ h(k,j,i) = \exp\Big( -\frac{\big(\mathrm{dist}((x_k,y_k),(x_j,y_j)) - \mathrm{dist}((x_j,y_j),(x_i,y_i))\big)^2}{2\sigma_1^2} \Big) \exp\Big( -\frac{\big(\arg((x_k,y_k),(x_j,y_j)) - \arg((x_j,y_j),(x_i,y_i))\big)^2}{2\sigma_2^2} \Big), \tag{12} \]
where “dist” represents the Euclidean distance between two
points, and “arg” represents the signed angle between two
vectors. Motion model (12) tries to preserve the direction
and the speed of a target. $\sigma_1$ and $\sigma_2$ are standard
deviations for the Gaussian distributions. When $i$ in the
argument of $h(\cdot)$ is the null element $o$, we may define
$h(\cdot)$ as:
\[ h(k,j,o) = \exp\Big( -\frac{\big(\mathrm{dist}((x_k,y_k),(x_j,y_j)) - d\big)^2}{2\sigma_1^2} \Big) \exp\Big( -\frac{\big(\arg((x_k,y_k),(x_j,y_j)) - \theta\big)^2}{2\sigma_2^2} \Big), \tag{13} \]
where $d$ and $\theta$ are some user-defined values. The
GEN_TRACKMAP algorithm creates a sample for the
track-map whose restriction (2) is one-to-one. To
speed up the computation for a large number of targets,
the concept of “gating” may be utilized by suitably
defining (12). For example, the value of $h(\cdot)$ may be
taken as zero when “dist” exceeds a certain value.
After generating the samples for the track-map and the
detection refinement function, we estimate $f_t$ and $l_t$ as
follows:
\[ \hat{f}_t = \{\hat{f}_{t,n}\}_{n=1}^{|d_{t-1}|} = \{\mathrm{mode}[\{f_{t,n}^m\}_{m=1}^M]\}_{n=1}^{|d_{t-1}|}, \tag{14} \]
and
\[ \hat{l}_t = \{\hat{l}_{t,n}\}_{n=1}^{|d_t|} = \{\mathrm{mode}[\{l_{t,n}^m\}_{m=1}^M]\}_{n=1}^{|d_t|}. \tag{15} \]
The operator “mode” selects the mode sample value.
When the mode is not unique, we use a random number
generator to choose one (with uniform density). It is
noteworthy that the MCMC samples follow the law of
large numbers, and estimates such as (14) and (15) are
thus possible [7].
2.2 Occlusions
Occlusions typically occur in video microscopy
observations: a cell visible (detected) in one frame
becomes invisible (undetected) in the next frame. Such a
cell may remain invisible for a number of consecutive
frames. In the proposed tracking framework, we can
accommodate occlusions by simply deferring the track-map
decisions for those cells that do not find a match. In order
to achieve this deferral, we redefine the track-map
as follows:
\[ f_t : (d_{t-1} \cup r_{t-1}) \to (d_t \cup \{o\}), \tag{16} \]
and as before the restriction of $f_t$,
\[ f_t^{\mathrm{restricted}} : \{e : e \in d_{t-1} \cup r_{t-1} \text{ and } f_t(e) \neq o\} \to d_t, \tag{17} \]
is one-to-one. The detection refinement is as follows:
\[ l_t : (d_t \cup r_t) \to \{0, 1\}. \tag{18} \]
The “backward” track-map is defined as:
\[ g_t : d_t \to (d_{t-1} \cup r_{t-1} \cup \{o\}), \tag{19} \]
and the set $r_t$ is defined as:
\[ r_t = \{e : e \in (d_{t-1} \cup r_{t-1}) \text{ and } \mathrm{mode}[\{f_{t,e}^m\}_{m=1}^M] = o\}. \tag{20} \]
The set $r_t$ basically acts as an accumulator for those cells
that cannot be matched with cells in the immediate next
frame. The same algorithms, MCMC and GEN_TRACKMAP,
apply in this case as well. It is also possible
to purge cells from the accumulator that are, say, $k$ frames
old by the following definition of $r_t$:
\[ r_t = \{e : e \in (d_{t-1} \cup r_{t-1}) \text{ and } \mathrm{mode}[\{f_{t,e}^m\}_{m=1}^M] = o\} \setminus (d_{t-k} \cup r_{t-k}). \tag{21} \]
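The mode operator with random tie-breaking and the accumulator update of (20) and (21) can be sketched as follows. This is a minimal sketch, not the authors' code: the function names are hypothetical, `None` stands in for the null element o, and cells are assumed to carry identifiers that persist across frames so that the set difference in (21) is meaningful.

```python
import random
from collections import Counter

NULL = None  # stands in for the null element o


def mode_with_random_ties(values, rng=random):
    """Most frequent value; ties are broken uniformly at random."""
    counts = Counter(values)
    top = max(counts.values())
    winners = [v for v, c in counts.items() if c == top]
    return rng.choice(winners)


def update_accumulator(candidates, track_map_samples, expired=frozenset(), rng=random):
    """Collect the cells whose modal match is the null element, as in (20) and (21).

    candidates        : identifiers in the union of d_{t-1} and r_{t-1}
    track_map_samples : list of M dicts, each mapping a candidate to its match or NULL
    expired           : identifiers to purge, i.e. the union of d_{t-k} and r_{t-k} in (21)
    """
    r_t = set()
    for e in candidates:
        samples_for_e = [f[e] for f in track_map_samples]
        if mode_with_random_ties(samples_for_e, rng) is NULL:
            r_t.add(e)
    return r_t - set(expired)
```

Calling `update_accumulator` with an empty `expired` set corresponds to (20); passing the identifiers from k frames back corresponds to (21).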
3. RESULTS AND DISCUSSION
The proposed algorithm has been tested using three
different types of cell video sequences: 1) human
monocytes observed in an in vitro assay, where the cells
are rolling on human P-selectin; 2) in vitro microbubble
data, where ultrasound contrast microbubbles roll in a
flow chamber illuminated with a bright field, and the
adhesion property is the mechanism to be investigated; 3)
in vivo natural killer T (NKT) cell data, where NKT cells
migrate along liver sinusoids, and the average velocity and
maximal brightness are of specific importance. Each test
sequence contains a total of 150 frames at a 30 Hz frame
rate. Sequences (1), (2), and (3) contain on average 35, 20,
and 25 cells, respectively, inside the tracking
region on every frame. We utilize the following
measurement models:
\[ p(Z_t \mid l_{t,n} = 0) \propto \exp\big( -(I_t(x_n, y_n) - \mu_b)^2 / 2\sigma_b^2 \big), \]
\[ p(Z_t \mid l_{t,n} = 1) \propto \exp\big( -(I_t(x_n, y_n) - \mu_f)^2 / 2\sigma_f^2 \big), \tag{22} \]
where $\mu_b$ and $\mu_f$ are the means and $\sigma_b$ and $\sigma_f$ are the
standard deviations of the background and the
foreground (cell) image intensity, respectively. $I_t(x_n, y_n)$
denotes the average image intensity within a circle (cell
shape) centered at $(x_n, y_n)$ on frame $t$.
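The measurement models in (22) can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the image is taken to be a list of rows of gray values, the circle radius is an assumed parameter that the text does not specify, and the densities are evaluated in the unnormalized exponential form.

```python
import math


def mean_intensity_in_circle(image, cx, cy, radius):
    """Average of image[y][x] over pixels whose coordinates lie within the circle."""
    total, count = 0.0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += value
                count += 1
    if count == 0:
        raise ValueError("circle does not cover any pixel")
    return total / count


def likelihood(intensity, label, mu_b, sigma_b, mu_f, sigma_f):
    """Unnormalized p(Z_t | l_{t,n} = label): label 1 is foreground, label 0 is background."""
    mu, sigma = (mu_f, sigma_f) if label == 1 else (mu_b, sigma_b)
    return math.exp(-(intensity - mu) ** 2 / (2 * sigma ** 2))
```

A detection whose mean intensity lies close to the foreground mean receives a likelihood near 1 under label 1 and a much smaller value under label 0, which is what drives the label flips during sampling.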
The proposed MTT method, with sample sizes (M) of 10,
100, 1000, and 5000, is applied to each type of data. A
Matlab implementation on a 2.4 GHz Pentium 4 PC with
1 GB of RAM took on average 1.8 s, 4.2 s, 31.0 s, and
148.4 s per frame to track the monocyte sequence with
10, 100, 1000, and 5000 samples, respectively. The Table
presents the tracking results and summarizes the
performance. As performance measures, we report the
numbers of false positives (FP), or incorrectly detected
targets; false negatives (FN), or missed targets; and
correspondence errors (CE). A CE refers to an incorrect
correspondence among cells in two consecutive frames.
The total error is the sum of the three types of errors, and
the error rate is the ratio of the number of errors to the
number of cells tracked per frame. Figure 1 shows two
consecutive frames of the tracking display for the human
monocyte sequence. Figure 2 shows the displacements of
each of the monocytes along the direction of blood flow
(assumed horizontal in the video) in the flow chamber
over the entire sequence.
In our future endeavors, we wish to extend the proposed
algorithm to general multi-target tracking problems, where
it is possible to dynamically add and delete targets into
and from the set dt of currently detected targets. We hope
to utilize the powerful tools from the theory of marked
point processes [5] for this purpose. We also plan to
extend this work in the near future with detailed
performance and computation comparisons against popular
MTT techniques, such as JPDA and MHT.
REFERENCES
[1] A. Doucet, B. Vo, and C. Andrieu, “Particle filtering for
multi-target tracking and sensor management,” Proc. Int. Conf.
on Info. Fusion, Annapolis, MD, 2002.
[2] P.J. Green, “Reversible jump Markov chain Monte Carlo
computation and Bayesian model determination,” Biometrika,
vol. 82, pp. 711-732, 1995.
[3] M.A. Mackey and F. Ianzini, “Development of the
large-scale digital cell analysis system,” Radiation Protection
and Dosimetry, vol. 99, pp. 289-293, 2002.
[4] K. Ley and D. Vestweber, Eds., The selectins: Initiators of
leukocyte endothelial adhesion. Amsterdam, The Netherlands:
Harwood, pp. 63–104, 1997.
[5] J. Møller and R.P. Waagepetersen, Statistical inference and
simulation for spatial point processes. Chapman and Hall/CRC:
Boca Raton, 2004.
[6] L.D. Stone, C.A. Barlow, and T.L. Corwin, Bayesian
multiple target tracking, Artech House, Boston:MA, 1999.
[7] L. Tierney, “Markov chains for exploring posterior
distributions,” Ann. of Statist., vol.22, pp.1701-1786, 1994.
Figure 1. Tracking display for two consecutive frames. Size of
bounding box is 50x100 square pixels.
Figure 2. Cell displacement computed by the MTT method.
Table. Tracker performances in three types of test sequences. FP
(false positive) FN (false negative) CE (correspondence error)