Course: T0283 - Computer Vision
Year: 2010
Lecture 09
Pattern Recognition and Classification I
Learning Objectives
After carefully listening to this lecture, students will be able to do the following:
explain some object recognition and classification procedures in Computer Vision, such as k-NN, neural networks, and SVM
demonstrate the use of feature vectors and decision boundary functions in the classification process
Pattern Recognition and Classification
The goal of pattern recognition is to classify objects of
interest into one of a number of categories or classes
Uses:
Many systems must recognize things and make decisions automatically
Everyday applications: speech recognition, fingerprint identification, DNA sequence identification, OCR (optical character recognition)
Medical applications: diagnosis from symptoms, etc.
Locating structures in images, detection of abnormalities, etc.
Classifiers for Recognition
Examine each window of an image and classify the object class within each window, based on a training set of images (a minimal sliding-window sketch is given below)
Approach: supervised learning (an off-line training process)
Feature extraction: discriminative features; features invariant with respect to translation, rotation and scaling
Classifier: partitions the feature space into different regions
Learning: use the features to determine the classifier; many different procedures exist for training classifiers and choosing models
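As a hedged illustration of the windowed approach (the window size, step, and the `extract_features` and `classifier` objects below are placeholders introduced for this sketch, not part of the lecture):

```python
import numpy as np

def classify_windows(image, classifier, extract_features, win=32, step=16):
    """Slide a fixed-size window over the image and classify each window.

    `classifier` is assumed to be any trained model with a predict() method,
    and `extract_features` maps an image patch to a 1-D feature vector;
    both are assumptions of this sketch.
    """
    detections = []
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            patch = image[y:y + win, x:x + win]
            features = extract_features(patch).reshape(1, -1)
            label = classifier.predict(features)[0]
            detections.append((x, y, label))
    return detections
```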
An Example: A Classification Problem
Categorize images of fish—say,
“Sea Bass” vs. “Salmon”
Use features such as length, width,
lightness, fin shape & number,
mouth position, etc.
Steps
1. Preprocessing (e.g., background
subtraction)
2. Feature extraction
3. Classification
Example from Duda & Hart.
Problem Analysis
Set up a camera and take some sample images to extract
features
Length
Lightness
Width
Number and shape of fins
Position of the mouth, etc…
This is the set of all suggested features to explore for use in our classifier!
Pre-Processing
Use a segmentation operation to isolate fishes
from one another and from the background
Information from a single fish is sent to a feature
extractor whose purpose is to reduce the data by
measuring certain features
The features are passed to a classifier
Classification
Select the length of the fish as a possible feature
for discrimination
Decision Boundary
The length is a poor feature alone!
Decision Boundary
Select the lightness as a possible feature.
Threshold decision boundary and cost relationship
Move our decision boundary toward smaller values of lightness in order to minimize the cost (reduce the number of sea bass that are classified as salmon!); a small cost-weighted threshold sketch is given below
Adopt lightness and add the width of the fish as a second feature
Each fish is then represented by a feature vector xT = [x1, x2], where x1 is the lightness and x2 is the width.
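A rough illustration of moving the threshold to reduce the cost of mislabelling sea bass as salmon: the sketch below scans candidate lightness thresholds and keeps the one with the lowest weighted error. The sample lightness values and the 5:1 cost ratio are invented for the example.

```python
import numpy as np

# Hypothetical lightness measurements (not from the lecture data).
salmon_light = np.array([2.1, 2.8, 3.0, 3.5, 4.0])
seabass_light = np.array([3.8, 4.5, 5.0, 5.5, 6.2])

# Mislabelling a sea bass as salmon is assumed 5x more costly.
COST_BASS_AS_SALMON, COST_SALMON_AS_BASS = 5.0, 1.0

best_t, best_cost = None, np.inf
for t in np.linspace(1.0, 7.0, 121):
    # Rule: lightness < t -> salmon, lightness >= t -> sea bass.
    bass_as_salmon = np.sum(seabass_light < t)
    salmon_as_bass = np.sum(salmon_light >= t)
    cost = COST_BASS_AS_SALMON * bass_as_salmon + COST_SALMON_AS_BASS * salmon_as_bass
    if cost < best_cost:
        best_t, best_cost = t, cost

print(f"chosen threshold: {best_t:.2f}, expected cost: {best_cost:.1f}")
```

With the asymmetric cost, the chosen threshold moves toward smaller lightness values, exactly as described above.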
We might add other features that are not correlated with the ones we already have. Care must be taken not to reduce performance by adding such "noisy features".
Ideally, the best decision boundary is the one that gives optimal performance, as in the following figure:
However, our satisfaction is premature because
the central aim of designing a classifier is to
correctly classify novel input
Issue of generalization!
Bayes Risk
Some errors may be inevitable: the minimum risk
(shaded area) is called the Bayes risk
Figure: probability density functions (the area under each curve sums to 1).
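A standard way to write this (not on the slide itself, but implied by the figure) is, for two classes ω1 and ω2 under the usual 0-1 loss:

R* = ∫ min( p(x | ω1) P(ω1), p(x | ω2) P(ω2) ) dx

so whatever decision boundary we choose, at least this much probability mass is misclassified.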
Histogram based classifiers
Use a histogram to represent the class-conditional
densities
(i.e. p(x|1), p(x|2), etc)
Advantage: Estimates converge towards correct
values with enough data
Disadvantage: Histogram becomes big with high
dimension so requires too much data
but maybe we can assume feature independence?
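A minimal sketch of this idea, assuming scalar features binned into per-feature histograms and treating the features as independent (so the joint density is a product); the toy data and bin counts are made up for the example:

```python
import numpy as np

def fit_histograms(X, bins, value_range):
    """Per-feature normalized histograms for one class: list of (counts, edges)."""
    hists = []
    for d in range(X.shape[1]):
        counts, edges = np.histogram(X[:, d], bins=bins, range=value_range, density=True)
        hists.append((counts, edges))
    return hists

def log_likelihood(x, hists):
    """Sum of per-feature log densities (feature-independence assumption)."""
    total = 0.0
    for value, (counts, edges) in zip(x, hists):
        idx = np.clip(np.searchsorted(edges, value) - 1, 0, len(counts) - 1)
        total += np.log(counts[idx] + 1e-9)   # small constant avoids log(0)
    return total

# Toy 2-D data for two classes (illustrative values only).
rng = np.random.default_rng(0)
X1 = rng.normal([2, 2], 0.5, size=(200, 2))
X2 = rng.normal([4, 4], 0.5, size=(200, 2))
h1 = fit_histograms(X1, bins=20, value_range=(0, 6))
h2 = fit_histograms(X2, bins=20, value_range=(0, 6))

x_new = np.array([2.2, 1.9])
label = 1 if log_likelihood(x_new, h1) > log_likelihood(x_new, h2) else 2
print("predicted class:", label)
```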
Application: Skin Colour Histograms
Skin has a very small range of (intensity-independent) colours, and little texture
Compute a colour measure, check whether the colour is in this range, and check that there is little texture (median filter)
Get class-conditional densities (histograms) and priors from data (by counting)
The classifier then picks the class with the larger posterior: label a pixel as skin if p(x | skin) P(skin) > p(x | not skin) P(not skin); a sketch follows below
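A hedged sketch of such a skin detector, using a 2-D chrominance histogram per class; the bin count, the chromaticity measure, and the training pixel arrays are assumptions of this sketch, not the method of the cited paper:

```python
import numpy as np

BINS = 32  # chrominance quantization per axis (an assumption of this sketch)

def chrominance(rgb):
    """Intensity-normalized (r, g) chromaticity in [0, 1]."""
    rgb = rgb.astype(float)
    s = rgb.sum(axis=-1, keepdims=True) + 1e-9
    return (rgb / s)[..., :2]

def build_histogram(pixels_rgb):
    """Normalized 2-D chrominance histogram from an (N, 3) array of pixels."""
    rg = chrominance(pixels_rgb)
    hist, _, _ = np.histogram2d(rg[:, 0], rg[:, 1], bins=BINS, range=[[0, 1], [0, 1]])
    return hist / hist.sum()

def classify_pixel(rgb, skin_hist, nonskin_hist, p_skin):
    """Bayes rule: skin if p(x|skin) P(skin) > p(x|not skin) P(not skin)."""
    rg = chrominance(rgb.reshape(1, 3))[0]
    i, j = np.minimum((rg * BINS).astype(int), BINS - 1)
    return skin_hist[i, j] * p_skin > nonskin_hist[i, j] * (1 - p_skin)
```

Here p_skin would come from counting skin versus non-skin pixels in the training data, as the slide describes.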
Skin Colour Models
Figure: skin chrominance points; smoothed and [0,1]-normalized.
Skin Colour Classification
Courtesy of G. Loy.
Results
Figure from "Statistical color models with application to skin detection," M. J. Jones and J. Rehg, Proc. Computer Vision and Pattern Recognition, 1999. Copyright 1999, IEEE.
Nearest Neighbor Classifier
A useful non-linear classifier
Retain the entire training set
Assign a new example the class of the closest vector in the training set
Requires a distance metric d(X1, X2); a common choice is the Euclidean distance d(X1, X2) = ||X1 − X2|| (a minimal sketch follows below)
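A minimal nearest-neighbour sketch with the Euclidean metric; the training arrays are whatever feature vectors and labels were produced earlier:

```python
import numpy as np

def nearest_neighbor(x, train_X, train_y):
    """Return the label of the training vector closest to x (Euclidean distance)."""
    dists = np.linalg.norm(train_X - x, axis=1)
    return train_y[np.argmin(dists)]
```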
Nearest Neighbor Classifier
Assign label of nearest training data point to each
test data point
Figure from Duda et al.: Voronoi partitioning of the feature space for 2-category 2-D and 3-D data.
K-Nearest Neighbors
Rather than choosing the single closest point, find the k closest (k odd)
If kA of them are from class A and kB from class B, choose class A if kA > kB
More robust than the single nearest neighbor
K-Nearest Neighbors
For a new point, find the k closest points in the training data
The labels of these k points "vote" to classify the new point
Avoids a fixed scale choice: the scale comes from the data itself (can be very important in practice)
A simple method that works well if the distance measure correctly weights the various dimensions (a minimal voting sketch follows below)
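A minimal k-NN voting sketch (k = 5, as in the figure below); the distance is plain Euclidean, so any per-dimension weighting would have to be folded into the features beforehand:

```python
import numpy as np
from collections import Counter

def knn_classify(x, train_X, train_y, k=5):
    """Label x by a majority vote among its k nearest training points."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]
```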
Figure from Duda et al.: example density estimate with k = 5.
Neural networks
Compose layered classifiers
Use a weighted sum of the elements at the previous layer to compute the results at the next layer
Apply a smooth threshold function between layers (this introduces the non-linearity)
Initialize the network with small random weights
Learn all the weights by gradient descent (i.e., repeatedly make small adjustments that improve the results); a minimal forward-pass sketch follows below
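A minimal two-layer forward pass in this spirit, with a sigmoid as the smooth threshold function and small random initial weights; the layer sizes and the toy input are arbitrary choices for the sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 2              # arbitrary layer sizes
W1 = rng.normal(0, 0.1, (n_hidden, n_in))    # small random weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_out, n_hidden))
b2 = np.zeros(n_out)

def forward(x):
    """Weighted sums followed by the smooth threshold at each layer."""
    h = sigmoid(W1 @ x + b1)     # hidden layer
    return sigmoid(W2 @ h + b2)  # output layer

print(forward(np.array([0.2, -0.5, 1.0, 0.3])))
```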
Training
Adjust the parameters to minimize the error on the training set
Perform gradient descent, making small changes in the direction of the negative derivative of the error with respect to each parameter
Stop when the error is low and has not changed much
The network architecture itself is designed by hand to suit the problem, so only the weights are learned (a minimal gradient-descent sketch follows below)
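A minimal gradient-descent sketch for a single sigmoid unit with squared error; the toy data, learning rate, and stopping tolerance are invented for the example, and a full network would repeat this per weight via backpropagation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 1-D training data: inputs and 0/1 targets (illustrative only).
X = np.array([0.0, 1.0, 2.0, 3.0])
t = np.array([0.0, 0.0, 1.0, 1.0])

w, b, lr = 0.0, 0.0, 0.5
for step in range(2000):
    y = sigmoid(w * X + b)
    err = y - t
    # Derivatives of the squared error 0.5 * sum(err^2) w.r.t. w and b.
    grad_w = np.sum(err * y * (1 - y) * X)
    grad_b = np.sum(err * y * (1 - y))
    w -= lr * grad_w            # small step opposite to the gradient
    b -= lr * grad_b
    if max(abs(grad_w), abs(grad_b)) < 1e-4:   # stop when the error stops changing much
        break

print(f"w = {w:.2f}, b = {b:.2f}, predictions = {sigmoid(w * X + b).round(2)}")
```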
Architecture of the complete system: they use another neural net to
estimate orientation of the face, then rectify it. They search over scales
to find bigger/smaller faces.
Figure from “Rotation invariant neural-network based face detection,” H.A. Rowley,
S. Baluja and T. Kanade, Proc. Computer Vision and Pattern Recognition, 1998,
copyright 1998, IEEE
Support Vector Machines
Try to obtain the decision boundary directly
Potentially easier, because we need to encode only the geometry of the boundary, not any irrelevant wiggles in the posterior
Not all points affect the decision boundary (a minimal sketch follows below)
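A minimal sketch using scikit-learn's SVC (assuming scikit-learn is available; the toy data is made up). After fitting, only the points stored in support_vectors_ determine the boundary:

```python
import numpy as np
from sklearn.svm import SVC

# Toy 2-D, two-class data (illustrative values only).
X = np.array([[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print("support vectors:\n", clf.support_vectors_)    # the only points that matter
print("prediction for (3, 3):", clf.predict([[3.0, 3.0]]))
```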
Support Vectors
Decision Boundary: A Computation Example
The decision boundary between classes i and j is where g_ij(X) = 0; class i is chosen where g_ij(X) > 0 and class j where g_ij(X) < 0, with
g_ij(X) = g_i(X) - g_j(X) = 0
Discriminant function (with r_i the mean vector of class i):
g_i(X) = X^T r_i - (1/2) r_i^T r_i
This is the linear discriminant of the minimum-distance (nearest-mean) classifier: it comes from expanding ||X - r_i||^2 and dropping the X^T X term that is common to all classes.
Decision Boundaries
The mean vectors for the three classes are C1 = (1, 2)^T, C2 = (3, 6)^T, and C3 = (7, 2)^T.
Substituting r1 = (1, 2)^T into the discriminant function
g_i(X) = X^T r_i - (1/2) r_i^T r_i
we get:
g_1(X) = x1 + 2 x2 - 2.5
and with r2 = (3, 6)^T we get:
g_2(X) = 3 x1 + 6 x2 - 22.5
The decision boundary function g_12 is:
g_12(X) = g_1(X) - g_2(X) = -2 x1 - 4 x2 + 20 = 0
which simplifies to the boundary line x1 + 2 x2 - 10 = 0 (a short numerical check follows below).
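The arithmetic above can be checked directly; a small sketch evaluating g_i(X) = X^T r_i - (1/2) r_i^T r_i for the three class means at a test point chosen on the class-1 / class-2 boundary:

```python
import numpy as np

means = {1: np.array([1.0, 2.0]), 2: np.array([3.0, 6.0]), 3: np.array([7.0, 2.0])}

def g(X, r):
    """Linear discriminant of the minimum-distance classifier."""
    return X @ r - 0.5 * (r @ r)

X = np.array([2.0, 4.0])   # a point satisfying x1 + 2*x2 = 10
print({i: g(X, r) for i, r in means.items()})
# g1 and g2 are equal here (both 7.5), confirming the boundary x1 + 2*x2 - 10 = 0.
```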