Liu - Frontiers in Computer Vision

Foundations & Core in Computer Vision:
A System Perspective
Ce Liu
Microsoft Research New England
Vision vs. Learning
• Computer vision: visual application of machine learning?
• Data → features → algorithms → data
• ML: design algorithms given input and output data
• CV: find the best input and output data given available algorithms
Theoretical vs. Experimental
• Theoretical analysis of a visual system
– Best & worst cases
– Average performance
• Theoretical analysis is challenging as many visual
distributions are hard to model (signal processing: 2nd
order processes, machine learning: exponential families)
• Experimental approach: full spectrum of system
performance as a function of the amount of data,
annotation, number of categories, noise, and other factors
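A minimal sketch of the experimental approach, assuming a toy setting that is not from the talk: two 1-D Gaussian classes, a nearest-centroid classifier, and a sweep over training-set size and label-noise rate. All names here (`make_data`, `fit_centroids`, `sweep`) are hypothetical; the point is only that performance is recorded over a grid of conditions rather than at a single operating point.

```python
import random

def make_data(n, label_noise, rng):
    """Draw n points from two 1-D Gaussians (means -1 and +1, unit variance).
    Each label is flipped with probability label_noise."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y == 1 else -1.0, 1.0)
        if rng.random() < label_noise:
            y = 1 - y
        data.append((x, y))
    return data

def fit_centroids(train):
    """Nearest-centroid model: the mean of the samples carrying each label."""
    means = {}
    for c in (0, 1):
        xs = [x for x, y in train if y == c]
        means[c] = sum(xs) / len(xs) if xs else 0.0
    return means

def accuracy(means, test):
    """Fraction of test points whose nearest centroid matches the label."""
    correct = sum(
        1 for x, y in test
        if min(means, key=lambda c: abs(x - means[c])) == y
    )
    return correct / len(test)

def sweep(sizes, noises, seed=0, n_test=2000):
    """Accuracy on a clean test set for every (training size, label noise) pair."""
    rng = random.Random(seed)
    test = make_data(n_test, 0.0, rng)
    table = {}
    for n in sizes:
        for p in noises:
            train = make_data(n, p, rng)
            table[(n, p)] = accuracy(fit_centroids(train), test)
    return table

if __name__ == "__main__":
    for (n, p), acc in sorted(sweep([10, 100, 1000], [0.0, 0.2, 0.4]).items()):
        print(f"train={n:5d}  label_noise={p:.1f}  accuracy={acc:.3f}")
```

The same loop structure extends to the other axes listed above (annotation, number of categories) by adding one more nested sweep per factor.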
Quality vs. Speed
• HD videos, billions of images to index
• Real time & 90% vs. one hour per frame & 95%?
• Mechanism to balance quality and speed in modeling
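One simple form such a mechanism could take, sketched under assumptions: each candidate configuration is described by a cost per frame and an accuracy, and the system picks the most accurate one that fits a time budget. The `OperatingPoint` type and the `best_under_budget` helper are hypothetical, and "real time" is assumed here to mean 1/30 s per frame; the 90% and 95% figures are the ones quoted above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class OperatingPoint:
    name: str
    seconds_per_frame: float
    accuracy: float

def best_under_budget(points: Sequence[OperatingPoint],
                      budget_s: float) -> Optional[OperatingPoint]:
    """Return the most accurate point whose per-frame cost fits the budget,
    or None if no point is fast enough."""
    feasible = [p for p in points if p.seconds_per_frame <= budget_s]
    return max(feasible, key=lambda p: p.accuracy) if feasible else None

# Illustrative operating points; the 1/30 s figure is an assumption.
POINTS = [
    OperatingPoint("real-time", 1.0 / 30.0, 0.90),
    OperatingPoint("offline", 3600.0, 0.95),
]

if __name__ == "__main__":
    for budget in (0.001, 1.0, 7200.0):
        print(budget, best_under_budget(POINTS, budget))
```

A hard budget is only one choice; a weighted objective over accuracy and time would trade the two off continuously instead.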
Automatic vs. semi-automatic
• Common review feedback: parameters are hand-tuned;
not clear how to set the parameters
• Vision system user feedback: I don’t know how to tweak the parameters
• Computer-oriented vs. human-oriented representations
• Human-in-the-loop (collaborative) vision
– How to optimally use humans (what, which and how
accurate) beyond traditional active learning
– Model design by crowd-sourcing
– Learning by subtraction
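For reference, the traditional active learning baseline mentioned above is commonly realized as uncertainty sampling: ask the human to label the items the current model is least sure about. The sketch below assumes a binary classifier that outputs a positive-class probability per unlabeled item; `select_queries` is a hypothetical helper, not something defined in the talk.

```python
def select_queries(probs, k):
    """Return the indices of the k items whose predicted positive-class
    probability is closest to 0.5, most uncertain first."""
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]

if __name__ == "__main__":
    # Items 1 and 3 are nearest to 0.5, so they are sent to the annotator.
    print(select_queries([0.95, 0.52, 0.10, 0.45, 0.70], 2))
```

This baseline only decides which items to label; the questions of what to ask and how accurate the answers need to be are left open by it.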
Algorithms vs. Sensors
• Two approaches to solving a vision problem
– Look at images, design algorithms, experiment, improve…
– Look at cameras, design new/better sensors, …
• Cameras for full-spectrum, high res, low noise, depth,
motion, occluding boundary, object, …
• What’s the optimal sensor/device for solving a vision problem?
• What’s the limit of sensors?
Thank you!