# Computer Vision

## Lecture 7: Classifiers

### This Lecture

- Bayesian decision theory (22.1, 22.2)
  - General theory
  - Gaussian distributions
  - Nearest neighbor classifier
  - Histogram method
- Feature selection (22.3)
  - Principal component analysis
  - Canonical variables
- Neural networks (22.4)
  - Structure
  - Error criterion
- Support vector machines (SVM) (22.5)

Computer Vision, Lecture 6. Oleh Tretiak © 2005.
## Motivation

- Task: find faces in an image
- Method:
  - While the image is not fully explored:
    - Obtain the pixels from a rectangular ROI in the image (call this set of pixels x)
    - Classify according to the set of values
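The scan loop above can be sketched as a sliding-window search. This is a minimal sketch assuming NumPy; the function names, step size, and the bright-square toy "classifier" are hypothetical, standing in for any of the classifiers developed in this lecture.

```python
import numpy as np

def sliding_window_search(image, roi_h, roi_w, classify, step=4):
    """Scan a rectangular ROI across the image and collect the windows
    that the classifier labels as the target class (e.g. 'face')."""
    hits = []
    H, W = image.shape
    for r in range(0, H - roi_h + 1, step):
        for c in range(0, W - roi_w + 1, step):
            x = image[r:r + roi_h, c:c + roi_w].ravel()  # pixel vector x
            if classify(x) == 1:
                hits.append((r, c))
    return hits

# Toy usage: the stand-in "classifier" fires when the window mean is bright.
img = np.zeros((16, 16))
img[4:8, 4:8] = 1.0
found = sliding_window_search(img, 4, 4, lambda x: int(x.mean() > 0.9))
```

In practice the scan is repeated over image scales as well, since faces appear at many sizes.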
## Example of Result

- Figure 22-5 from the text: the classifier has found the faces in the image and indicated their locations with polygonal figures.
## Conceptual Framework

- Assume that there are two classes, 1 and 2
- We know p(1|x) and p(2|x)
- Intuitive classification: see the figure below

[Figure: components of a mixture. The class-conditional curves p1 and p2 are plotted against x (0 to 2), with the decision threshold at their crossing point.]
## Decision Theory

- What are the consequences of an incorrect decision?
- Bayes method
  - Assume a loss (cost) L(1→2) if we assign an object of class 1 to category 2, and L(2→1) if we assign an object of class 2 to category 1.
- To minimize the average loss, decide as follows:
  - Choose 1 if L(1→2)P(1|x) > L(2→1)P(2|x)
  - Choose 2 otherwise
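The decision rule above can be written directly in code: each choice is charged its expected loss, and we pick the cheaper one. A minimal sketch; the function name is hypothetical.

```python
def bayes_decide(p1_given_x, l_12, l_21):
    """Minimum-risk decision between two classes.
    Deciding 2 risks L(1->2) * P(1|x); deciding 1 risks L(2->1) * P(2|x).
    Choose the class whose expected loss is smaller."""
    p2_given_x = 1.0 - p1_given_x
    risk_choose_1 = l_21 * p2_given_x
    risk_choose_2 = l_12 * p1_given_x
    return 1 if risk_choose_1 < risk_choose_2 else 2
```

With equal losses this reduces to picking the larger posterior; making L(1→2) large biases the rule toward declaring class 1 (e.g. when missing a face is costlier than a false alarm).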
## Experimental Approaches

- We are given a training (learning) sample (xi, yi) of data vectors xi and their class labels yi, and must construct the classifier
- Estimate p(1|x) and p(2|x), then build the classifier
  - Parametric method
  - Nearest neighbor method
  - Histogram method
- Alternatively, use a classifier of specified structure and adjust its parameters from the training sample
## Example: Gaussian Classifier

- Assume we have data vectors x_{k,i} for classes i = 1, 2. The prior probabilities Pr(i) are known. The Bayes loss is equal for both categories.
- Estimate the means and covariances:

$$\mu_i = \frac{1}{N_i}\sum_{k=1}^{N_i} x_{k,i}, \qquad
\Sigma_i = \frac{1}{N_i - 1}\sum_{k=1}^{N_i} (x_{k,i}-\mu_i)(x_{k,i}-\mu_i)^T.$$

- Classifier: given an unknown x, compute

$$g(i \mid x) = (x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i) - 2\ln \Pr(i) + \ln\lvert\Sigma_i\rvert$$

- Choose the class that has the lower value of g(i|x)
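The estimation and discriminant formulas above can be sketched with NumPy. The two toy clusters and the function names are hypothetical; only the math follows the slide.

```python
import numpy as np

def fit_gaussian(samples):
    """Sample mean and (unbiased, divide-by-N-1) covariance of one class."""
    mu = samples.mean(axis=0)
    sigma = np.atleast_2d(np.cov(samples, rowvar=False))
    return mu, sigma

def g(x, mu, sigma, prior):
    """Quadratic discriminant g(i|x); the smaller value wins."""
    d = x - mu
    return float(d @ np.linalg.inv(sigma) @ d
                 - 2.0 * np.log(prior) + np.log(np.linalg.det(sigma)))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(200, 2))   # class 1: cluster near the origin
b = rng.normal(4.0, 1.0, size=(200, 2))   # class 2: cluster near (4, 4)
(mu1, s1), (mu2, s2) = fit_gaussian(a), fit_gaussian(b)

x = np.array([0.2, -0.1])
label = 1 if g(x, mu1, s1, 0.5) < g(x, mu2, s2, 0.5) else 2
```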
## Example: Nearest Neighbor

- Assume we have data vectors x_{k,i} for i = 1, 2.
- Classifier: given an unknown x, find

$$d_i = \min_k \lVert x - x_{i,k}\rVert \quad \text{for } i = 1, 2$$

- Choose the i with the smaller value of d_i
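The rule above is a few lines of NumPy; the dictionary layout and toy training points are hypothetical.

```python
import numpy as np

def nearest_neighbor_class(x, class_samples):
    """class_samples maps label -> (N_i, d) array of training vectors.
    Return the label whose closest training vector is nearest to x."""
    dists = {label: np.min(np.linalg.norm(samples - x, axis=1))
             for label, samples in class_samples.items()}
    return min(dists, key=dists.get)

train = {1: np.array([[0.0, 0.0], [1.0, 0.0]]),
         2: np.array([[5.0, 5.0], [6.0, 5.0]])}
```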
## Example
## Histogram Method

- Assume we have data vectors x_{k,i} for i = 1, 2. The Bayes loss is equal for both categories.
- Divide the input space into J 'bins', with J < N1, N2
- Find h_{i,j}, the number of training vectors of category i in bin j
- Given an unknown x: find its bin j, and decide according to which h_{i,j} is higher
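A one-dimensional sketch of the method, assuming NumPy; the bin count, data range, and toy samples are hypothetical choices for illustration.

```python
import numpy as np

def fit_histogram(samples_by_class, bins, data_range):
    """Count the training vectors of each class in each bin (1-D case)."""
    edges = np.linspace(data_range[0], data_range[1], bins + 1)
    counts = {label: np.histogram(s, bins=edges)[0]
              for label, s in samples_by_class.items()}
    return edges, counts

def histogram_classify(x, edges, counts):
    """Find x's bin j and return the class with the higher count there."""
    j = np.clip(np.searchsorted(edges, x, side='right') - 1,
                0, len(edges) - 2)
    return max(counts, key=lambda label: counts[label][j])

edges, counts = fit_histogram(
    {1: np.array([0.1, 0.2, 0.3, 0.4]),
     2: np.array([1.5, 1.6, 1.7, 1.8])},
    bins=4, data_range=(0.0, 2.0))
```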
## Curse of Dimension

- If the number of measurements in x is very large, we are in trouble
  - The covariance matrix Σ will be singular (it is not possible to find Σ⁻¹, and ln|Σ| = −∞). The Gaussian method does not work.
  - It is hard to divide the space of x into bins. The histogram method does not work.
- Feature extraction?
## Principal Component Analysis

- Picture
## Principal Component Analysis

- Formulas:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\Sigma = \frac{1}{n-1}\sum_{i=1}^{n} (x_i-\mu)(x_i-\mu)^T.$$

- Find λi and vi, the eigenvalues and eigenvectors of Σ
- Choose x·vi for the large eigenvalues as features
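The recipe above translates directly to NumPy: estimate the covariance, take its eigendecomposition, and project onto the leading eigenvectors. A sketch; the elongated toy data is a hypothetical example.

```python
import numpy as np

def pca_features(X, k):
    """Project data onto the k eigenvectors of the sample covariance
    with the largest eigenvalues."""
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(sigma)       # ascending eigenvalues
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return (X - mu) @ top, eigvals

rng = np.random.default_rng(1)
# Data elongated along one axis: one large eigenvalue, one small one.
X = rng.normal(size=(500, 2)) * np.array([5.0, 0.5])
features, eigvals = pca_features(X, k=1)
```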
## Issues

- Data may not be linearly separable
- Principal components may not be appropriate
## Canonical Transformation

- For two categories: the Fisher linear discriminant function

$$\mu_i = \frac{1}{n_i}\sum_{k=1}^{n_i} x_{i,k}, \qquad
\Sigma = \frac{1}{n-2}\sum_{i=1}^{2}\sum_{k=1}^{n_i} (x_{i,k}-\mu_i)(x_{i,k}-\mu_i)^T,$$

$$v = \Sigma^{-1}(\mu_2 - \mu_1).$$

- Use v^T x as the only feature
- See the textbook for the general formula
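The discriminant direction can be computed in a few lines; here Σ is the pooled within-class covariance, and the midpoint threshold and toy clusters are hypothetical additions for illustration.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher linear discriminant: v = Sigma^{-1} (mu2 - mu1),
    with Sigma the pooled within-class covariance."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    n = len(X1) + len(X2)
    scatter = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
    sigma = scatter / (n - 2)
    return np.linalg.solve(sigma, mu2 - mu1)

rng = np.random.default_rng(2)
X1 = rng.normal([0.0, 0.0], 1.0, size=(300, 2))
X2 = rng.normal([3.0, 3.0], 1.0, size=(300, 2))
v = fisher_direction(X1, X2)
# The scalar feature v.x separates the classes; threshold at the midpoint.
t = 0.5 * (X1.mean(axis=0) + X2.mean(axis=0)) @ v
```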
## Neural Networks

- Nonlinear classifiers
- Parameters are found by an iterative (slow) training method
- Can be very effective
## Two-Layer Neural Network
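The network diagram is not reproduced here; as a numeric sketch, a two-layer (one hidden layer) network is two affine maps with a nonlinearity between them. The hand-set weights below are hypothetical, chosen so the hidden units approximate OR and AND, making the output approximate XOR, a problem no single-layer (linear) classifier can solve.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def two_layer_forward(x, W1, b1, W2, b2):
    """Forward pass of a two-layer network: input -> hidden -> output."""
    h = sigmoid(W1 @ x + b1)          # hidden-unit activations
    return sigmoid(W2 @ h + b2)[0]    # scalar class score in (0, 1)

# Hidden unit 1 ~ OR(x1, x2); hidden unit 2 ~ AND(x1, x2);
# output ~ OR and not AND, i.e. XOR.
W1 = np.array([[20.0, 20.0], [20.0, 20.0]])
b1 = np.array([-10.0, -30.0])
W2 = np.array([[20.0, -20.0]])
b2 = np.array([-10.0])
```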
## Curse of Dimension

- A high-order neural network has limited ability to generalize (and can be very slow to train)
- Problem structure can improve performance
## Useful Preprocessing

- Brightness/contrast equalization
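One common form of this preprocessing is to normalize each window to zero mean and unit variance before classification; a minimal sketch, with the function name and toy patterns as hypothetical illustrations.

```python
import numpy as np

def equalize_window(x, eps=1e-8):
    """Normalize a window to zero mean and unit variance, removing
    overall brightness and contrast differences between windows."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / (x.std() + eps)

bright = np.array([200.0, 210.0, 220.0, 230.0])
dark = bright - 180.0   # same pattern under different illumination
```

After equalization, the same pattern seen under different lighting maps to the same feature vector.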
## Architecture for Face Search
## Architecture for Character Recognition
## Support Vector Machines (SVM)

- A general nonlinear (and linear) classifier
- Structure similar to the perceptron
- Training is based on sophisticated optimization theory
- Currently the 'Mercedes-Benz' of classifiers
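As a concrete sketch of the linear case: the code below fits a maximum-margin-style separator by sub-gradient descent on the regularized hinge loss. This is a simple stand-in for the quadratic-programming training the text describes, not that algorithm itself; all names and parameters are hypothetical.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM fit by sub-gradient descent on the hinge loss
    plus an L2 penalty on w. Labels y must be in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                     # margin violators
        grad_w, grad_b = lam * w, 0.0
        if active.any():
            grad_w = grad_w - (y[active, None] * X[active]).sum(axis=0) / n
            grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
               rng.normal(4.0, 0.5, size=(100, 2))])
y = np.hstack([-np.ones(100), np.ones(100)])
w, b = train_linear_svm(X, y)
```

Nonlinear SVMs replace the inner products implicit here with a kernel function.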
## Summary

- Goal: find an object in an image
- Method: look at windows and classify (template search with classifiers)
- Issues:
  - Classifier design
  - Training
  - Curse of dimension
  - Generalization
## Template Connections