Intro to C.V. - Department of Computer Science

Introduction to Computer Vision
Olac Fuentes
Computer Science Department
University of Texas at El Paso
El Paso, TX, U.S.A.
What is Computer Vision?
Computer Vision is the process of extracting
knowledge about the world from one or
more digital images
Digital Images
Digital images are 2D arrays (matrices) of numbers.
Color images are formed with three 2D arrays, representing the
Red, Green and Blue components of the image.
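A minimal sketch of this representation, using plain Python lists in place of arrays. The pixel values and the names `gray`, `red`, `green`, `blue` and `pixel` are made up for illustration:

```python
# A tiny 2x3 grayscale image: one number per pixel (0 = black, 255 = white).
gray = [
    [0, 128, 255],
    [64, 192, 32],
]

# A color image of the same size: three 2D arrays, one per component.
red = [[255, 0, 0], [10, 20, 30]]
green = [[0, 255, 0], [10, 20, 30]]
blue = [[0, 0, 255], [10, 20, 30]]


def pixel(r, g, b, row, col):
    """Return the (R, G, B) triple stored at one pixel position."""
    return (r[row][col], g[row][col], b[row][col])


height, width = len(gray), len(gray[0])
top_left = pixel(red, green, blue, 0, 0)  # a pure red pixel
```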
Computer Vision – Main Tasks
• Model generation
• Object Recognition
• Object Detection
• Tracking
Computer Vision – Object Detection
Detecting Faces
Computer Vision – Object Detection
Detecting Pedestrians
Computer Vision – Object Detection
Detecting Cars
Computer Vision – Object Detection
How to do it?
Idea: Use Machine Learning
Training:
Training Set:
• Positive examples are images of objects that belong to the class of
interest
• Negative examples are images of objects that don't belong to that
class
Train classifier using the training set
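As a rough illustration of the training step, the sketch below fits the simplest possible classifier, a single threshold on one feature value, to a labeled training set. The feature values, the labels and the name `train_stump` are hypothetical; the slides do not say which classifier or features are used at this point.

```python
def train_stump(values, labels):
    """Pick the threshold that makes the fewest training errors.

    The resulting classifier predicts "positive" when value > threshold.
    """
    # Candidate thresholds: just below the minimum, then every observed value.
    candidates = [min(values) - 1] + sorted(set(values))
    best_threshold, best_errors = None, len(values) + 1
    for t in candidates:
        errors = sum((v > t) != y for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold, best_errors


# Hypothetical feature value per training image (e.g. mean brightness).
positive_values = [8, 9, 7]        # images of the class of interest
negative_values = [1, 2, 3, 6]     # images of anything else
values = positive_values + negative_values
labels = [True] * len(positive_values) + [False] * len(negative_values)

threshold, errors = train_stump(values, labels)
```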
Detection:
Given an image to analyze, apply the classifier to every
subimage (there are lots of them, so a low false positive rate
is important!)
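The detection loop can be sketched as a sliding window. This is only an illustration: the `bright` classifier is a hypothetical stand-in for a trained one, and the window size and step are arbitrary.

```python
def sliding_windows(height, width, win, step=1):
    """Yield the (top, left) corner of every win x win subimage."""
    for top in range(0, height - win + 1, step):
        for left in range(0, width - win + 1, step):
            yield top, left


def detect(image, classifier, win, step=1):
    """Apply the classifier to every subimage; return the accepted corners."""
    hits = []
    for top, left in sliding_windows(len(image), len(image[0]), win, step):
        patch = [row[left:left + win] for row in image[top:top + win]]
        if classifier(patch):
            hits.append((top, left))
    return hits


def bright(patch):
    """Hypothetical stand-in classifier: accept patches with a high mean."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values) > 200


# A 6x6 black image with one bright 2x2 block at rows 2-3, columns 3-4.
image = [[0] * 6 for _ in range(6)]
for r in range(2, 4):
    for c in range(3, 5):
        image[r][c] = 255

hits = detect(image, bright, win=2)
n_windows = sum(1 for _ in sliding_windows(6, 6, 2))
```

Even this 6x6 image has 25 windows of size 2x2. A 640x480 image has 617 x 457 = 281,969 windows of size 24x24 at a single scale, so a classifier that is wrong on even 0.1% of windows would produce hundreds of false detections per image.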
Face Detection – Training Images
Efficient Object Detection
Viola & Jones, 2005
Idea #1: Classifier Structure
Build a cascade of classifiers,
where stage i is simpler (and faster) than
stage i+1
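A minimal sketch of how a cascade evaluates one subimage. The two stages here are hypothetical toy tests, not the boosted stages of the actual detector; the point is only that a rejection by an early stage skips all later ones.

```python
def cascade_classify(patch, stages):
    """Run the stages in order; reject as soon as any stage says no."""
    for stage in stages:
        if not stage(patch):
            return False  # early exit: later, slower stages never run
    return True


calls = []  # records which stages actually ran


def cheap_stage(patch):
    calls.append("cheap")
    return sum(patch) > 10


def costly_stage(patch):
    calls.append("costly")
    return max(patch) - min(patch) > 5


stages = [cheap_stage, costly_stage]
rejected = cascade_classify([0, 0, 0], stages)  # fails the cheap stage
accepted = cascade_classify([1, 9, 8], stages)  # passes both stages
```

Since most subimages of a typical image contain no object, most of them are discarded by the first, cheapest stages.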
Efficient Object Detection
Viola & Jones, 2005
Idea #2: Features
Use a large number of very simple features:
Efficient Object Detection
Viola & Jones, 2005
Idea #3: Feature Computation
Compute the features very efficiently using the integral image:
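A sketch of the integral image, assuming the simple features are differences of rectangle sums (Haar-like features). The function names and the tiny test image are illustrative. Once the integral image is built, the sum over any rectangle takes four lookups, whatever the rectangle's size.

```python
def integral_image(img):
    """ii[r][c] = sum of img over rows < r and columns < c (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of a rectangle from four lookups in the integral image."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half minus right half: one simple rectangle-difference feature."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
total = rect_sum(ii, 0, 0, 3, 4)           # whole image
inner = rect_sum(ii, 1, 1, 2, 2)           # 6 + 7 + 10 + 11
feature = two_rect_feature(ii, 0, 0, 3, 4)
```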
Efficient Object Detection
Viola & Jones, 2005
Idea #4: Dealing with multiple scales
Obvious solution:
Build a detector for each possible scale
Better idea:
Build a detector for a single scale
During detection, scale the image
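A sketch of scaling the image during detection: the image is shrunk repeatedly and the same fixed-size detector would be run on every level. Nearest-neighbor resizing and the names `resize_nearest` and `pyramid` are choices made here to keep the example short, not something the slides specify.

```python
def resize_nearest(img, factor):
    """Shrink img by `factor` (> 1) using nearest-neighbor sampling."""
    h, w = len(img), len(img[0])
    new_h, new_w = int(h / factor), int(w / factor)
    return [[img[int(r * factor)][int(c * factor)] for c in range(new_w)]
            for r in range(new_h)]


def pyramid(img, win, factor=1.25):
    """Yield (scale, image) pairs until the image is smaller than the window."""
    scale = 1.0
    while len(img) >= win and len(img[0]) >= win:
        yield scale, img
        img = resize_nearest(img, factor)
        scale *= factor


image = [[r * 8 + c for c in range(8)] for r in range(8)]
levels = [(scale, len(level), len(level[0]))
          for scale, level in pyramid(image, win=2, factor=2)]
```

A detection at (top, left) on a level with scale s corresponds to a window of size win x s at roughly (top x s, left x s) in the original image.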
Efficient Object Detection
The Modified census transform (Froba and Ernst, 2004)
Used local intensity descriptors as features
Used simple voting classifiers and AdaBoost to build a cascade
of classifiers
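A sketch of the local intensity descriptor, assuming the usual description of the modified census transform: each pixel of a 3x3 neighborhood is compared with the neighborhood's mean, giving a 9-bit code. The bit ordering and the name `mct_code` are arbitrary choices for this example.

```python
def mct_code(img, r, c):
    """9-bit code for the 3x3 neighborhood centered at (r, c)."""
    patch = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    mean = sum(patch) / 9
    code = 0
    for value in patch:
        # One bit per pixel: 1 if the pixel is brighter than the local mean.
        code = (code << 1) | (1 if value > mean else 0)
    return code


flat = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
spot = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
# The same spot under a brightness/contrast change (2 * value + 5).
spot_relit = [[2 * v + 5 for v in row] for row in spot]

code_flat = mct_code(flat, 1, 1)
code_spot = mct_code(spot, 1, 1)
code_relit = mct_code(spot_relit, 1, 1)
```

Because the code depends only on comparisons with the local mean, it stays the same when the brightness or contrast of the neighborhood changes, as the last two codes show.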
Efficient Object Detection
Histograms of Gradients (Dalal, 2005)
Used histograms of oriented gradients as features
Used a Support Vector Machine as the classifier
Best results to date
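A simplified sketch of the feature for a single image cell: a histogram of gradient orientations, weighted by gradient magnitude. It assumes 9 unsigned orientation bins over 0 to 180 degrees and centered differences; the full method also interpolates votes and normalizes over blocks of cells, which is omitted here.

```python
import math


def cell_histogram(img, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    hist = [0.0] * bins
    h, w = len(img), len(img[0])
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = img[r][c + 1] - img[r][c - 1]  # centered differences
            gy = img[r + 1][c] - img[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # unsigned
            hist[int(angle / (180.0 / bins)) % bins] += magnitude
    return hist


# Brightness changes from left to right: horizontal gradients (0 degrees).
left_to_right = [[0, 0, 10, 10] for _ in range(4)]
# Brightness changes from top to bottom: vertical gradients (90 degrees).
top_to_bottom = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]

hist_a = cell_histogram(left_to_right)
hist_b = cell_histogram(top_to_bottom)
```

The two test images put all of their gradient energy into different bins (bin 0 and bin 4), which is what lets a classifier tell edge orientations apart.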
Object Recognition
[Figure: training images labeled Owl, Duck, Toucan and Egret; test images marked "??" whose labels must be predicted]
Object Recognition – Face Recognition
Eigenfaces are a set of "standardized face ingredients",
derived from statistical analysis of many pictures of faces.
[Figure: the first four eigenfaces from the AT&T database]
Eigenfaces
• One person's face might be made up of 10% from face 1,
24% from face 2 and so on.
Very few eigenvector terms are needed to give a fair
likeness of most people's faces
Eigenfaces provide a means of applying data compression
to faces for identification purposes.
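A toy sketch of the compression idea: a face is stored as a few weights, one per eigenface, and an approximation is rebuilt as a weighted sum of eigenfaces. The two 4-pixel "eigenfaces" below are hypothetical orthonormal vectors chosen by hand, not the result of analyzing real face images, and subtracting the mean face (common in practice) is left out.

```python
def project(face, eigenfaces):
    """One weight per eigenface: the dot product of the face with it."""
    return [sum(f * e for f, e in zip(face, eig)) for eig in eigenfaces]


def reconstruct(weights, eigenfaces):
    """Rebuild an image as a weighted sum of eigenfaces."""
    n_pixels = len(eigenfaces[0])
    return [sum(w * eig[i] for w, eig in zip(weights, eigenfaces))
            for i in range(n_pixels)]


# Hypothetical orthonormal "eigenfaces" for flattened 4-pixel images.
eigenfaces = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
]

face = [4, 4, 2, 2]
weights = project(face, eigenfaces)       # 2 numbers instead of 4 pixels
approx = reconstruct(weights, eigenfaces)
```

Here two weights reproduce the four pixels exactly because the toy face lies in the span of the two vectors; with real faces the reconstruction is only approximate.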
Eigenfaces
• Let E1,...,En be the eigenfaces obtained from a face
database
• Let F1,...,Fm be the images in our training/testing sets.
(For the training images we also know the person's
identity)
• The attributes of Fi are given by the sums of the pixel-by-pixel
products of Fi with each of E1,...,En; that is, Fi is
represented by n numbers: [Fi·E1, Fi·E2, ..., Fi·En]
• Using the attribute vectors and the class information
we can now construct a classifier
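The steps above can be sketched as follows. The slides do not say which classifier is built; a nearest-neighbor rule is assumed here, and the eigenfaces, training images and identities are made-up 4-pixel toy data.

```python
def attributes(face, eigenfaces):
    """Represent a face by n numbers: [F.E1, F.E2, ..., F.En]."""
    return [sum(f * e for f, e in zip(face, eig)) for eig in eigenfaces]


def classify(face, eigenfaces, train_faces, train_labels):
    """Nearest neighbor in attribute space (an assumed classifier choice)."""
    query = attributes(face, eigenfaces)
    best_label, best_dist = None, float("inf")
    for train_face, label in zip(train_faces, train_labels):
        ref = attributes(train_face, eigenfaces)
        dist = sum((q - r) ** 2 for q, r in zip(query, ref))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Hypothetical eigenfaces and training set (flattened 4-pixel images).
eigenfaces = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, -0.5, -0.5]]
train_faces = [[4, 4, 2, 2], [1, 1, 5, 5]]
train_labels = ["alice", "bob"]

test_attrs = attributes([4, 5, 2, 1], eigenfaces)
label = classify([4, 5, 2, 1], eigenfaces, train_faces, train_labels)
```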
Tracking
Continuous detection of objects of interest in video streams
Reconstruction
Build 3D models of the world from 2D images
Most common approach: Stereo Vision
• Inspired by human 3D perception
• Use two cameras of known geometry
• Take images
• Find correspondences
• Reconstruct using correspondences and known geometry
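A sketch of the last two steps for the simplest setup: two identical, rectified cameras side by side, so that a point appears on the same row in both images and depth follows from Z = f x B / d (focal length f in pixels, baseline B, disparity d in pixels). The matching cost (sum of absolute differences on one scanline) and all numbers are illustrative assumptions.

```python
def best_match(left_row, right_row, x, half=1, max_disp=4):
    """Disparity whose window in right_row best matches left_row around x."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - half < 0:
            break  # window would fall off the left edge of the image
        cost = sum(abs(left_row[x + k] - right_row[x - d + k])
                   for k in range(-half, half + 1))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point seen by two rectified cameras: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# One scanline from each camera; the same feature is shifted by 2 pixels.
left_row = [0, 0, 0, 0, 9, 5, 1, 0, 0]
right_row = [0, 0, 9, 5, 1, 0, 0, 0, 0]

disparity = best_match(left_row, right_row, x=5)
# Hypothetical camera: 700-pixel focal length, 10 cm baseline.
depth_m = depth_from_disparity(35, focal_px=700, baseline_m=0.1)
```

Nearby points have large disparities and distant points small ones, so a one-pixel matching error matters far more for distant objects.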
Reconstruction
Problems with Stereo Vision:
Finding matches reliably is difficult
Calibration is difficult
It is hard to deal with featureless areas
Computationally expensive
Reconstruction
Microsoft to the rescue!
Seriously!
Reconstruction
Microsoft Kinect
Reconstruction using active illumination
Project a known pattern of light at an invisible
wavelength
Learn the appearance of that pattern at different
distances
Fast and easy