Loss-based Visual Learning with Weak Supervision

M. Pawan Kumar
Joint work with
Pierre-Yves Baudin, Danny Goodman,
Puneet Kumar, Nikos Paragios,
Noura Azzabou, Pierre Carlier
SPLENDID
Self-Paced Learning for Exploiting Noisy, Diverse or Incomplete Data
[Diagram: machine learning with weak and noisy annotations, with applications in computer vision]
Collaboration (2012-2013):
Equipe Galen, INRIA Saclay (Nikos Paragios)
DAGS, Stanford (Daphne Koller)
Application area: medical imaging
2 visits from INRIA to Stanford; 1 visit from Stanford to INRIA; 3 visits planned
Venues: ICML, MICCAI
Medical Image Segmentation
MRI Acquisitions of the thigh
Segments correspond to muscle groups
Random Walks Segmentation
Probabilistic segmentation algorithm
Computationally efficient
Interactive segmentation (Grady, 2006)
Automated shape-prior-driven segmentation (Grady, 2005; Baudin et al., 2012)
Random Walks Segmentation
x: Medical acquisition
y(i,s): Probability that voxel ‘i’ belongs to segment ‘s’
$\min_y E(x,y) = y^T L(x)\, y + w_{\text{shape}} \|y - y_0\|^2$
$L(x)$: positive semi-definite Laplacian matrix
$y_0$: shape prior on the segmentation
$w_{\text{shape}}$: parameter of the RW algorithm, hand-tuned
The objective is convex in $y$.
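As a concrete illustration (not the authors' code): setting the gradient of the energy above to zero gives the linear system $(L(x) + w_{\text{shape}} I)\, y = w_{\text{shape}}\, y_0$. The following minimal numpy sketch solves it for a toy graph; the dense solve and the function name are assumptions, and real volumes would need sparse solvers and per-segment probability constraints.

import numpy as np

def random_walks_step(L, y0, w_shape):
    # Minimize E(y) = y^T L y + w_shape * ||y - y0||^2.
    # The gradient condition 2*L*y + 2*w_shape*(y - y0) = 0 gives
    # (L + w_shape*I) y = w_shape * y0, uniquely solvable since
    # L is positive semi-definite and w_shape > 0.
    n = L.shape[0]
    return np.linalg.solve(L + w_shape * np.eye(n), w_shape * y0)

# Toy example: 3-voxel chain graph, one segment.
L = np.array([[ 1., -1.,  0.],
              [-1.,  2., -1.],
              [ 0., -1.,  1.]])  # graph Laplacian of the chain
y0 = np.array([1.0, 0.5, 0.0])  # shape prior
y = random_walks_step(L, y0, w_shape=0.1)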
Random Walks Segmentation
Several Laplacians
$L(x) = \sum_\alpha w_\alpha L_\alpha(x)$
Several shape and appearance priors
$\sum_\beta w_\beta \|y - y_\beta\|^2$
Hand-tuning a large number of parameters is onerous
Parameter Estimation
Learn the best parameters from training data
$E(x,y) = \sum_\alpha w_\alpha\, y^T L_\alpha(x)\, y + \sum_\beta w_\beta \|y - y_\beta\|^2 = w^T \Psi(x,y)$
$w$ is the set of all parameters
$\Psi(x,y)$ is the joint feature vector of input and output
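To make $\Psi$ concrete, here is a small sketch (the function name and array shapes are assumptions; $y$ is flattened to one vector):

import numpy as np

def joint_feature_vector(y, laplacians, priors):
    # One quadratic term per Laplacian and one squared distance per
    # shape/appearance prior, so that E(x, y) = w^T Psi(x, y).
    quad = [float(y @ L @ y) for L in laplacians]
    dist = [float(np.sum((y - y_beta) ** 2)) for y_beta in priors]
    return np.array(quad + dist)

The learned $w$ then weighs these terms exactly as in the energy above.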
Outline
• Parameter Estimation
– Supervised Learning
– Hard vs. Soft Segmentation
– Mathematical Formulation
• Optimization
• Experiments
• Related and Future Work in SPLENDID
Supervised Learning
Dataset of segmented MRIs
Probabilistic segmentation? The ground truth is hard: for sample $x_k$ and voxel $i$,
$z_k(i,s) = \begin{cases} 1, & s \text{ is the ground-truth segment of voxel } i \\ 0, & \text{otherwise} \end{cases}$
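A minimal sketch (hypothetical helper) of building the 0-1 matrix $z_k$ from a ground-truth label map:

import numpy as np

def hard_segmentation(labels, num_segments):
    # z_k(i, s) = 1 if voxel i carries ground-truth segment s,
    # 0 otherwise: a one-hot encoding of the label map.
    z = np.zeros((labels.size, num_segments))
    z[np.arange(labels.size), labels.ravel()] = 1.0
    return z

labels = np.array([0, 2, 1, 1])   # toy label map with 3 segments
z = hard_segmentation(labels, 3)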
Supervised Learning
$\min_w \sum_k \xi_k + \lambda \|w\|^2$
s.t. $w^T \Psi(x_k,\hat{y}) - w^T \Psi(x_k,z_k) \ge \Delta(\hat{y},z_k) - \xi_k$ for all $\hat{y}$
$w^T \Psi(x_k,\hat{y})$: energy of segmentation $\hat{y}$
$w^T \Psi(x_k,z_k)$: energy of the ground truth
$\Delta(\hat{y},z_k)$: fraction of incorrectly labeled voxels
Structured-output Support Vector Machine (Taskar et al., 2003; Tsochantaridis et al., 2004)
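To spell the constraint out, a hedged sketch of the loss $\Delta$ and the per-sample slack; in practice the most violated constraint is found by loss-augmented inference over all segmentations, whereas this toy version scans a small candidate pool (all names are illustrative):

import numpy as np

def fraction_incorrect(y_hat, z):
    # Delta(y_hat, z): fraction of voxels whose argmax label
    # disagrees with the ground truth.
    return float(np.mean(np.argmax(y_hat, axis=1) != np.argmax(z, axis=1)))

def slack(w, psi_gt, candidate_psis, losses):
    # xi_k = max(0, max over y_hat of
    #   Delta(y_hat, z_k) - (w^T Psi(x_k, y_hat) - w^T Psi(x_k, z_k)))
    e_gt = float(w @ psi_gt)
    violations = [loss - (float(w @ psi) - e_gt)
                  for psi, loss in zip(candidate_psis, losses)]
    return max(0.0, max(violations))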
Supervised Learning
Convex, with several efficient algorithms
But no parameter setting produces a 'hard' 0-1 segmentation
We only need a correct ‘soft’ probabilistic segmentation
Outline
• Parameter Estimation
– Supervised Learning
– Hard vs. Soft Segmentation
– Mathematical Formulation
• Optimization
• Experiments
• Related and Future Work in SPLENDID
Hard vs. Soft Segmentation
Hard segmentation $z_k$: we don't require exact 0-1 probabilities
Soft segmentation $y_k$: compatible with $z_k$, that is, binarizing $y_k$ gives $z_k$, written $y_k \in C(z_k)$
Which $y_k$ to use? The $y_k$ provided by the best parameters, which are unknown.
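The set $C(z_k)$ admits a simple membership test; a sketch, assuming $y$ and $z$ are (num_voxels, num_segments) arrays:

import numpy as np

def is_compatible(y, z):
    # y is in C(z) iff binarizing y (argmax over segments for each
    # voxel) recovers the hard segmentation z.
    return bool(np.array_equal(np.argmax(y, axis=1),
                               np.argmax(z, axis=1)))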
Outline
• Parameter Estimation
– Supervised Learning
– Hard vs. Soft Segmentation
– Mathematical Formulation
• Optimization
• Experiments
• Related and Future Work in SPLENDID
Learning with Hard Segmentation
$\min_w \sum_k \xi_k + \lambda \|w\|^2$
s.t. $w^T \Psi(x_k,\hat{y}) - w^T \Psi(x_k,z_k) \ge \Delta(\hat{y},z_k) - \xi_k$
Learning with Soft Segmentation
$\min_w \sum_k \xi_k + \lambda \|w\|^2$
s.t. $w^T \Psi(x_k,\hat{y}) - \min_{y_k \in C(z_k)} w^T \Psi(x_k,y_k) \ge \Delta(\hat{y},z_k) - \xi_k$
Latent Support Vector Machine (Smola et al., 2005; Felzenszwalb et al., 2008; Yu et al., 2009)
Outline
• Parameter Estimation
• Optimization
• Experiments
• Related and Future Work in SPLENDID
Latent SVM
$\min_w \sum_k \xi_k + \lambda \|w\|^2$
s.t. $w^T \Psi(x_k,\hat{y}) - \min_{y_k \in C(z_k)} w^T \Psi(x_k,y_k) \ge \Delta(\hat{y},z_k) - \xi_k$
Difference-of-convex problem
Concave-Convex Procedure (CCCP)
CCCP
Step 1: Estimate the soft segmentation
$y_k^* = \operatorname{argmin}_{y_k \in C(z_k)} w^T \Psi(x_k,y_k)$
Efficient optimization using dual decomposition
Step 2: Update the parameters
$\min_w \sum_k \xi_k + \lambda \|w\|^2$
s.t. $w^T \Psi(x_k,\hat{y}) - w^T \Psi(x_k,y_k^*) \ge \Delta(\hat{y},z_k) - \xi_k$
Convex optimization
Repeat until convergence
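Putting the two steps together, a schematic CCCP loop might look as follows; impute_segmentation (the dual-decomposition step) and solve_ssvm (the convex update) are assumed helpers, not shown here:

import numpy as np

def cccp(w, samples, impute_segmentation, solve_ssvm,
         max_iters=50, tol=1e-6):
    # samples: list of (x_k, z_k) pairs; w: parameter vector.
    # impute_segmentation(w, x, z): y* = argmin_{y in C(z)} w^T Psi(x, y)
    # solve_ssvm(samples, imputed): convex structured-SVM update of w
    for _ in range(max_iters):
        imputed = [impute_segmentation(w, x, z) for x, z in samples]
        w_new = solve_ssvm(samples, imputed)
        if np.max(np.abs(w_new - w)) < tol:  # parameters converged
            return w_new
        w = w_new
    return w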
Outline
• Parameter Estimation
• Optimization
• Experiments
• Related and Future Work in SPLENDID
Dataset
30 MRI volumes of the thigh
Dimensions: 224 x 224 x 100
4 muscle groups + background
80% for training, 20% for testing
Parameters
4 Laplacians
2 shape priors
1 appearance prior
(Grady, 2005; Baudin et al., 2012)
Baselines
Hand-tuned parameters
Structured-output SVM
Hard segmentation
Soft segmentation based on signed distance transform
Results
Small but statistically significant improvement
Outline
• Parameter Estimation
• Optimization
• Experiments
• Related and Future Work in SPLENDID
Loss-based Learning
x: Input
a: Annotation (e.g. a = "jumping")
h: Hidden information (e.g. h = "soft segmentation")
Annotation Mismatch
$\min \sum_k \Delta(\text{correct } a_k, \text{predicted } a_k)$
Small improvement using a small medical dataset
Large improvement using a large vision dataset
Loss-based Learning
Output Mismatch
Modeled using a distribution
$\min \sum_k \Delta(\text{correct } \{a_k,h_k\}, \text{predicted } \{a_k,h_k\})$
Inexpensive annotation
No experts required
Richer models can be learnt
Kumar, Packer and Koller, ICML 2012
Questions?