Kristin Branson, Vincent Rabaud, and Serge Belongie
Dept of Computer Science, UCSD http://vision.ucsd.edu
We wish to track three agouti mice from video of a side view of their cage.
Mouse Vivarium Room
A vivarium houses thousands of cages of mice.
Manual, close monitoring of each mouse is impossible.
Tracking Algorithm → Behavior Analysis Algorithm
Automated behavior analysis will allow for:
Improved animal care.
More detailed and exact data collection.
An algorithm that tracks individual mice is a necessity for automated behavior analysis.
Behaviors of interest: activity, eating, scratching, reproduction, rolling, …
Tracking multiple mice is difficult because
The mice are indistinguishable.
They are prone to occluding one another.
They have few (if any) trackable features.
Their motion is relatively erratic.
We benefit from simplifying assumptions:
The number of objects does not change.
The illumination is relatively constant.
The camera is stationary.
We break the tracking problem into parts:
Track separated mice.
Detect occlusions.
Track occluded/occluding mice.
Segmenting is more difficult when a frame is viewed out of context.
Using a depth ordering heuristic, we associate the mouse at the start of an occlusion with the mouse at the end of the occlusion.
We track mice sequentially through the occlusion, incorporating a hint of the future locations of the mice.
Background/Foreground classification.
Tracking separated mice.
Detecting occlusions.
Tracking through occlusion.
Experimental results.
Future work.
Background estimation pipeline: a modified temporal median over the image history gives the estimated background; the thresholded absolute difference between the current frame and the estimated background yields the foreground/background classification.
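This pipeline can be sketched as follows. A plain per-pixel median stands in for the "modified" temporal median (whose details are not spelled out here), and the threshold value is an illustrative choice:

```python
import numpy as np

def estimate_background(history):
    """Estimate the background as the per-pixel temporal median of the
    frame history. (A plain median stands in for the poster's
    'modified' temporal median.)"""
    return np.median(history, axis=0)

def classify_foreground(frame, background, thresh=30.0):
    """Thresholded absolute difference: pixels far from the background
    estimate are labeled foreground (True)."""
    return np.abs(frame.astype(float) - background) > thresh

# Toy example: a static background of intensity 100 and one bright blob.
history = np.full((10, 4, 4), 100.0)
frame = history[0].copy()
frame[1, 1] = 200.0  # a "mouse" pixel
mask = classify_foreground(frame, estimate_background(history))
```

Only the single bright pixel is classified as foreground in this toy example.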
Tracking separated mice.
We model the distribution of the pixel locations of each mouse as a bivariate Gaussian, parameterized by its mean and covariance. If the mice are separated, the foreground can be modeled by a mixture of Gaussians. We fit the parameters using the EM algorithm.
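A minimal EM sketch for fitting such a mixture of bivariate Gaussians over foreground pixel locations. The initialization, iteration count, and covariance regularization are illustrative choices, not the poster's exact procedure:

```python
import numpy as np

def em_gmm(points, means, covs, weights, n_iter=50):
    """Fit a mixture of bivariate Gaussians to 2-D pixel locations
    with the EM algorithm (minimal sketch)."""
    n, k = len(points), len(means)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel.
        resp = np.empty((n, k))
        for j in range(k):
            diff = points - means[j]
            inv = np.linalg.inv(covs[j])
            mahal = np.einsum('ni,ij,nj->n', diff, inv, diff)
            norm = 2 * np.pi * np.sqrt(np.linalg.det(covs[j]))
            resp[:, j] = weights[j] * np.exp(-0.5 * mahal) / norm
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate mean, covariance, and mixing weight.
        for j in range(k):
            r = resp[:, j]
            means[j] = r @ points / r.sum()
            diff = points - means[j]
            covs[j] = (r[:, None] * diff).T @ diff / r.sum() + 1e-6 * np.eye(2)
            weights[j] = r.mean()
    return means, covs, weights

# Two well-separated pixel clouds standing in for two mice.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal([10, 10], 1, (100, 2)),
                 rng.normal([40, 30], 1, (100, 2))])
means = np.array([[12.0, 12.0], [38.0, 28.0]])
covs = [np.eye(2), np.eye(2)]
weights = np.array([0.5, 0.5])
means, covs, weights = em_gmm(pts, means, covs, weights)
```

With separated clusters and a reasonable initialization, the fitted means converge to the two cluster centers.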
Detecting occlusions.
Occlusion events are detected using the GMM parameters.
We threshold how “close” together the mouse distributions are.
The Fisher distance in the x-direction is the distance measure:

d_F((μ_x1, σ²_x1), (μ_x2, σ²_x2)) = (μ_x1 − μ_x2)² / (σ²_x1 + σ²_x2)
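A minimal sketch of the occlusion test; the threshold value is an illustrative assumption:

```python
import numpy as np

def fisher_distance_x(mu1, var1, mu2, var2):
    """Fisher distance between two 1-D (x-direction) Gaussians:
    squared mean separation over summed variances."""
    return (mu1 - mu2) ** 2 / (var1 + var2)

def occluding(mu1, var1, mu2, var2, thresh=1.0):
    """Flag an occlusion when the two mouse distributions are 'close'.
    The threshold value here is illustrative."""
    return fisher_distance_x(mu1, var1, mu2, var2) < thresh

# Far-apart mice vs. nearly coincident mice.
far = occluding(10.0, 4.0, 40.0, 4.0)   # large mean separation
near = occluding(10.0, 4.0, 12.0, 4.0)  # means nearly coincide
```

`far` is False (distance 112.5) while `near` is True (distance 0.5).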
Tracking through occlusion.
The pixel memberships during occlusion events must be reassigned.
[Figure: pixel memberships in frame t; "best" affine transformation (a_1, …, a_6) from frame t to t+1; pixel memberships in frame t+1; "best" affine transformation from frame t+1 to t+2; and so on.]
Affine flow estimation assumes:
Brightness constancy: the image brightness of an object does not change from frame to frame, i.e. I_x u + I_y v + I_t = 0.
The per-frame motion of each mouse can be described by an affine transformation:
u(x, y) = a_1 + a_2 x + a_3 y,  v(x, y) = a_4 + a_5 x + a_6 y.
In general, these assumptions do not hold. We therefore minimize I_x u + I_y v + I_t in the least-squares sense. The best a given only the affine flow cue minimizes

H_0[a] = Σ_{(x,y)∈M} w(x, y) (zᵀa + I_t)²,

where z = (I_x, I_x x, I_x y, I_y, I_y x, I_y y)ᵀ.
The affine flow cue alone does not give an accurate motion estimate.
Suppose we have a guess of the affine transformation, â, to bias our estimate. The best a minimizes H_0 and is near â. Our criterion is:

H[a] = Σ_{(x,y)∈M} w(x, y) (zᵀa + I_t)² + (a − â)ᵀ Λ_a⁻¹ (a − â),

where the second term is a regularization term.
Taking the partial derivative of H[a] w.r.t. a, setting it to 0, and solving for a gives:

a = (Σ w z zᵀ + Λ_a⁻¹)⁻¹ (−Σ w z I_t + Λ_a⁻¹ â).
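The closed-form solve can be sketched in NumPy as follows. Argument names and the synthetic check are illustrative; inputs are flattened per-pixel gradients, coordinates, and weights:

```python
import numpy as np

def regularized_affine_flow(Ix, Iy, It, xs, ys, w, a_hat, Lambda_inv):
    """Closed-form minimizer of the regularized affine-flow criterion:
    a = (sum w z z^T + Lambda^-1)^-1 (-sum w z It + Lambda^-1 a_hat)."""
    # z = (Ix, Ix*x, Ix*y, Iy, Iy*x, Iy*y)^T per pixel, stacked as rows.
    Z = np.stack([Ix, Ix * xs, Ix * ys, Iy, Iy * xs, Iy * ys], axis=1)
    A = (w[:, None] * Z).T @ Z + Lambda_inv
    b = -(w * It) @ Z + Lambda_inv @ a_hat
    return np.linalg.solve(A, b)

# Synthetic check: a pure translation u=1, v=0 satisfies brightness
# constancy Ix*u + Iy*v + It = 0 exactly when It = -Ix.
rng = np.random.default_rng(1)
n = 200
Ix, Iy = rng.normal(size=n), rng.normal(size=n)
xs, ys = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
It = -Ix  # consistent with a = (1, 0, 0, 0, 0, 0)
a_true = np.array([1.0, 0, 0, 0, 0, 0])
a = regularized_affine_flow(Ix, Iy, It, xs, ys, np.ones(n),
                            a_hat=a_true, Lambda_inv=1e-3 * np.eye(6))
```

Since both the data term and the prior are minimized at a_true here, the solve recovers it exactly up to numerical error.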
We use the depth order cue to estimate â.
We assume the front blob at the start and end of the occlusion are the same mouse.
[Figure: the front blob at the start (frame 1) and at the end (frame n) of the occlusion.]
The succession of frame to frame motions transforms the initial front mouse into the final front mouse.
We set the per-frame prior estimate â to reflect this.
[Figure: per-frame mouse distributions (μ_1, Σ_1), (μ_2, Σ_2), …, (μ_n, Σ_n) over frames 1…n, with per-frame estimates â_{1:2}, â_{2:3}, â_{3:4}, … composing the total motion a_{1:n}.]
We estimate the frame-to-frame motion, â, by linearly interpolating the total motion a_{1:n}.
[Figure: mouse distributions (μ_1, Σ_1) through (μ_n, Σ_n) over frames 1…n.]
Given the initial and final mouse distributions, we compute the total transformation:

t_{1:n} = μ_n − A_{1:n} μ_1   (translation),
A_{1:n} = Σ_n^{1/2} T Σ_1^{−1/2}   (rotation & skew),

where T is an orthogonal matrix. This transformation maps the distribution (μ_1, Σ_1) in frame 1 to (μ_n, Σ_n) in frame n.
From the total transformation, we estimate the per-frame transformation:

t̂ = t_{1:n} / (n − 1),   Â = A_{1:n}^{1/(n−1)}.
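A sketch of this interpolation, assuming the fractional matrix power is taken via eigendecomposition; the function and variable names are illustrative:

```python
import numpy as np

def matrix_fractional_power(A, p):
    """Fractional matrix power via eigendecomposition, assuming A is
    diagonalizable with eigenvalues off the negative real axis
    (reasonable for small inter-frame motions)."""
    vals, vecs = np.linalg.eig(A)
    return (vecs @ np.diag(vals ** p) @ np.linalg.inv(vecs)).real

def per_frame_motion(t_total, A_total, n):
    """Interpolate the total motion over n frames:
    t_hat = t_total / (n - 1), A_hat = A_total^(1/(n-1))."""
    return t_total / (n - 1), matrix_fractional_power(A_total, 1.0 / (n - 1))

# Example: a 30-degree total rotation over 4 frames splits into
# three 10-degree per-frame rotations.
theta = np.deg2rad(30)
A_total = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
t_hat, A_hat = per_frame_motion(np.array([3.0, 0.0]), A_total, n=4)
```

The per-frame translation is one third of the total, and the per-frame matrix is a 10-degree rotation.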
Estimating which mouse is in front relies on a simple heuristic: the front mouse has the largest image y-coordinate, i.e. appears lowest in the frame.
We assign membership based on the weighted sum of the proximity and motion similarity criteria.

Proximity criterion: J_l[p] = (p − μ_{t+1})ᵀ Σ_{t+1}⁻¹ (p − μ_{t+1}).

Motion similarity criterion: J_m[p] = (u_local − t_x)² + (v_local − t_y)², where (u_local, v_local) is the local optical flow estimate at pixel p.
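The weighted assignment can be sketched as follows. The weight alpha, the data layout, and the function names are illustrative assumptions:

```python
import numpy as np

def proximity_cost(p, mu, Sigma):
    """Mahalanobis distance of pixel p from a mouse's predicted
    distribution (mu, Sigma)."""
    d = p - mu
    return d @ np.linalg.inv(Sigma) @ d

def motion_cost(uv_local, t_xy):
    """Squared difference between the local optical-flow estimate
    and the mouse's translation (t_x, t_y)."""
    return np.sum((uv_local - t_xy) ** 2)

def assign_pixel(p, uv_local, mice, alpha=0.5):
    """Assign pixel p to the mouse minimizing the weighted sum of the
    two criteria; 'mice' is a list of (mu, Sigma, t_xy) tuples and the
    weight alpha is an illustrative choice."""
    costs = [alpha * proximity_cost(p, mu, S) +
             (1 - alpha) * motion_cost(uv_local, t)
             for mu, S, t in mice]
    return int(np.argmin(costs))

# A pixel near mouse 0 that also moves like mouse 0.
mice = [(np.array([5.0, 5.0]), np.eye(2), np.array([1.0, 0.0])),
        (np.array([20.0, 5.0]), np.eye(2), np.array([-1.0, 0.0]))]
label = assign_pixel(np.array([6.0, 5.0]), np.array([0.9, 0.1]), mice)
```

Both cues agree here, so the pixel is assigned to mouse 0.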
Experimental results.
We report initial success of our algorithm in tracking three agouti mice in a cage.
Viewed in another way:
[Pipeline: Video → Simple Tracker → Mouse Positions → Detect Occlusions → Occlusion Starts & Ends → Occlusion Reasoning → Mouse Positions]
We presented three modules to track identical, non-rigid, featureless objects through severe occlusion.
The novel module is the occlusion reasoning module.
[Figure: frames 1, 2, 3, 4, …, n, with per-frame transformations a_{1:2}, a_{2:3}, a_{3:4}, … estimated sequentially.]
While the occlusion tracker operates sequentially, it incorporates a hint of the future locations of the mice.
This is a step in the direction of an algorithm that reasons forward and backward in time.
More robust depth estimation.
More robust separated mouse tracking (e.g. BraMBLe [5]).
Different affine interpolation schemes.
[1] D. Comaniciu, V. Ramesh, and P. Meer. Kernel-based object tracking. IEEE Trans. Pattern Analysis and Machine Intelligence, 25(5), 2003.
[2] J. Gårding. Shape from surface markings. PhD thesis, Royal Institute of Technology, Stockholm, 1991.
[3] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer Series in Statistics. Springer Verlag, Basel, 2001.
[4] M. Irani and P. Anandan. All about direct methods. In Vision Algorithms: Theory and Practice. Springer-Verlag, 1999.
[5] M. Isard and J. MacCormick. BraMBLe: A Bayesian multiple-blob tracker. In ICCV, 2001.
[6] B. Lucas and T. Kanade. An iterative image registration technique with an application to stereo vision. In DARPA Image Understanding Workshop, 1984.
[7] J. MacCormick and A. Blake. A probabilistic exclusion principle for tracking multiple objects. IJCV, 39(1):57–71, 2000.
[8] Measuring Behavior: Intl. Conference on Methods and Techniques in Behavioral Research, 1996–2002.
[9] S. Niyogi. Detecting kinetic occlusion. In ICCV, pages 1044–1049, 1995.
[10] J. Shi and C. Tomasi. Good features to track. In CVPR, Seattle, June 1994.
[11] H. Tao, H. Sawhney, and R. Kumar. A sampling algorithm for tracking multiple objects. In Workshop on Vision Algorithms, pages 53–68, 1999.
[12] C. Twining, C. Taylor, and P. Courtney. Robust tracking and posture description for laboratory rodents using active shape models. In Behavior Research Methods, Instruments and Computers, Measuring Behavior Special