PSO-based License Plate Detection

Evolutionary Image Analysis and
Signal Processing
Stefano Cagnoni
HEURISTIC PARAMETER OPTIMIZATION
AND REAL-WORLD PROBLEMS
The function to be optimized:
• describes the solution to a practical problem parametrically
• is typically optimized based on the performance achieved on a set of examples that are representative of the problem at hand
[Diagram blocks: Problem; Parametric solution S(p1…pn); Cost function f(p1…pn); Examples; Optimization method; Optimized Solution]
A SHIFTED VIEW UPON DESIGN
From:
• defining exact solutions, justified by an underlying
theory
To:
• searching for solutions which work well, by:
– defining a quality criterion to measure the effectiveness (cost) of possible solutions
– choosing a method that maximizes (minimizes) it.
A SHIFTED VIEW UPON DESIGN
From:
• tuning the parameters of a specific solution to a
problem using knowledge about the problem
To:
• tuning the parameters of a general optimization
method using knowledge about the method
WHEN ?
• No direct solution is available
• Problem specifications can be provided only
qualitatively or through examples
• Behaviors or phenomena can be described or
measured with little precision (e.g., noisy signals)
• Little (or no) a priori knowledge of the problem
• Integration of modules to which any of the previous
conditions applies
COMPUTER VISION
The “art” of making computers see
(and understand what they see)
Sub-topics (the ‘vision chain’)
Image Acquisition
Image Enhancement
Segmentation
3D-Information Recovery
Image Understanding
COMPUTER AND HUMAN VISION
HUMAN                                             COMPUTER
Perception                                        Image Acquisition
Selective information extraction                  Feature Enhancement
Grouping by ‘similarity’                          Segmentation
Extraction of spatial relationships               3D-Information Recovery
Object recognition and semantic interpretation    Image Understanding
[Diagram annotations: (signal/image processing); LOW-LEVEL VISION; HIGH-LEVEL VISION]
APPLICATION FRAMEWORKS
• Optimization of parameters of specific objective functions (e.g., using GA, ES,…):
– related to a well-defined task.
– for a whole system.
• Generation of solutions from scratch (e.g., using GP). Optimization of performance based on:
– specific objective functions
– interactive qualitative comparisons between solutions
• Generation of emergent collective solutions (e.g., using PSO or other Swarm Intelligence techniques). Achievement of higher-level and complex tasks from collective use of trivial, local, hard-wired behaviors
APPLICATIONS TO SIGNAL/IMAGE
PROCESSING AND PATTERN RECOGNITION
• Optimization of filter/detector AND algorithm
parameters for event detection and image
segmentation
• Qualitative optimization of image processing
algorithms
• Design of implicitly parallel binary image operators
and classifiers
• Object detection and segmentation
APPLICATIONS
• GA-based design of a QRS detector for ECG signals.
• Optimization of a 3D segmentation algorithm for
tomographic images based on an elastic contour
model.
• GP-based design of lookup tables for color
processing of MR images.
• SmcGP-based low-level image processing and low-resolution character recognition.
• Object detection and segmentation using PSO
SIGNAL PROCESSING
• Signal Enhancement (filtering) and Event Detection
(thresholding)
[Figure: ECG signal, enhanced ECG, QRS complex]
EVOLUTIONARY DESIGN OF QRS DETECTORS
Given:
– Filter/detector layout
– Training set
– Fitness function
Optimize (using a GA):
– Filter coefficients
– Detector threshold
– Other parameters regulating the adaptive behavior
of the detector
FILTER LAYOUT
• Linear:
• Linear with selected samples:
• Quadratic with selected samples:
OTHER PARAMETERS
EXPERIMENTAL SETUP
TRAINING SET
Ten 10-second segments of the ECG from each of the 48 30-minute records of the MIT-BIH Arrhythmia Database (5981 beats out of about 110,000).
FITNESS FUNCTION
$f = f_{max} - (FP^2 + FN^2)$,
with $f_{max}$ chosen such that $f > 0$
FP = False Positives, FN = False Negatives
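The fitness above can be sketched in a few lines of Python. This is only an illustration: it assumes that the "2" following FP and FN in the slide is a lost exponent (a squared penalty), and the function name and the value of f_max are hypothetical.

```python
def qrs_fitness(false_positives, false_negatives, f_max):
    """Fitness f = f_max - (FP^2 + FN^2); higher is better.

    Assumes squared error counts; f_max must be large enough to keep f > 0.
    """
    return f_max - (false_positives ** 2 + false_negatives ** 2)


# With the same total number of errors, squaring penalizes an
# unbalanced detector more than a balanced one.
balanced = qrs_fitness(5, 5, f_max=1000)     # 1000 - (25 + 25)
unbalanced = qrs_fitness(10, 0, f_max=1000)  # 1000 - (100 + 0)
```

Under this reading, a GA maximizing f is pushed towards detectors that keep both false positives and false negatives low, rather than trading one for the other.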
RESULTS
• 99.5% average sensitivity (100% on most
“normal” recordings)
• Much faster detection than published algorithms yielding comparable results
RATIONALE
Pre-defined parameter sets hardly ever work in the presence of high variability, as in biology/anatomy
• GA optimization of a specific structure segmentation
algorithm
• interactive specification of a few training contours,
followed by
• extraction of the contours of the structure of interest
from the whole data set
CONTOUR EXTRACTION
The problem can be reformulated as ‘multiple 1D-edge detection and tracking’.
An extension of the 1D detector
SEGMENTATION
• Definition of a starting contour
• Iterate:
– Application of the GA-designed filter to the next
contour (extraction of matching edge points)
– Elastic contour model-based interpolation (also
optimized by the GA) of the edge points
extracted by the filter
TRAINING SET
One slice following the one which is used to seed the
iterative segmentation process
FITNESS FUNCTION
dyk = distance, along scan line Lh, between the actual
edge point and the one detected, K = constant
RESULTS
INTERACTIVE EVOLUTION OF
LOOKUP TABLES
Aim: given two grey-scale images $I_1(x,y)$ and $I_2(x,y)$, use GP to evolve a color image
$I_C(x,y) = F(I_1(x,y), I_2(x,y))$
with ‘interesting’ features from the point of view of a specific application.
Fitness is implicitly defined by the user, who acts as the referee of a tournament (of size 2) used in the selection phase.
Population size: 15
N. of generations : 10-15
Crossover rate: 0.7
No Mutation
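A user-refereed tournament of size 2 can be sketched as follows. The callback `ask_user` is a hypothetical stand-in for the interactive comparison (in a real session it would display the two candidate images and wait for the user's choice); it is not part of the original system.

```python
import random


def user_tournament(population, ask_user, rng=random):
    """Pick two distinct individuals at random and let the user choose the winner.

    ask_user(a, b) is a hypothetical callback returning True if the user
    prefers a over b.
    """
    a, b = rng.sample(population, 2)
    return a if ask_user(a, b) else b


# Stand-in referee for illustration: prefers the alphabetically smaller label.
winner = user_tournament(["A", "B"], lambda a, b: a < b)
```

With a population of 15 and 10 to 15 generations, the number of comparisons requested from the user stays small enough for an interactive session.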
Results of 4 different
experiments
SUB-MACHINE CODE GENETIC
PROGRAMMING (Poli)
Unsigned long variables are used (32/64 bit words
depending on compiler or computing architecture) to
encode binary arrays of inputs
The bit string may encode consecutive samples of a
temporal sequence, a row or a window within an
image, etc.
Function set: bitwise logical operators + circular shifts
A whole block of data is affected by a single bit-wise
Boolean operation (SIMD paradigm)
SUB-MACHINE CODE GENETIC
PROGRAMMING
Advantages
• operating in parallel on multiple data makes fitness
evaluation more efficient
• fitness can be evaluated on multiple fitness cases
at the same time with a single operation sequence
Limitations
• Impossible to apply different operations to data
from the same block: the long int variable is an array
of independent data which undergo the same
operation, hence the need for circular shift operators
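The SIMD idea can be illustrated with a small Python sketch. Python integers are unbounded, so a mask is used here to emulate a 32-bit unsigned word; the "evolved" expression is a hypothetical example, not one produced by the actual system.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned word


def rotl(word, k):
    """Circular left shift of a 32-bit word by k positions."""
    k %= 32
    return ((word << k) | (word >> (32 - k))) & MASK32


def rotr(word, k):
    """Circular right shift of a 32-bit word by k positions."""
    k %= 32
    return ((word >> k) | (word << (32 - k))) & MASK32


def bit_not(word):
    """Bitwise NOT restricted to 32 bits."""
    return ~word & MASK32


def example_program(x):
    """Hypothetical evolved expression: x AND (SHR x).

    Output bit i is set iff input bits i and i+1 (circularly) are both set,
    so a single AND computes 32 results at once.
    """
    return x & rotr(x, 1)
```

The circular shift is what lets one bit position "see" its neighbors: without it, every bit of the word would be processed in complete isolation from the others.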
EXPERIMENTAL TEST-BED
APACHE PLATE-RECOGNITION SYSTEM (Adorni-Cagnoni, 2000)
• Main task: car license-plate recognition
• Data: 130 images of moving cars
• Sub-tasks: plate extraction and character recognition
PRE-PROCESSING
[Figure: input image, grey-level image, binarized horizontal-gradient image]
The horizontal-gradient image is thresholded to obtain a binary
image showing only the strongest edges.
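A minimal sketch of this pre-processing step, assuming the horizontal gradient is approximated by the absolute difference between horizontally adjacent pixels; the function name and the threshold value are illustrative, not taken from APACHE.

```python
def binarized_horizontal_gradient(gray, threshold):
    """Return a 0/1 image marking pixels whose horizontal gradient is strong.

    gray: list of rows of grey levels. The gradient at x is approximated
    by |gray[x+1] - gray[x]|; the last column has no right neighbor and
    is left at 0 (an assumption of this sketch).
    """
    result = []
    for row in gray:
        edges = [0] * len(row)
        for x in range(len(row) - 1):
            if abs(row[x + 1] - row[x]) >= threshold:
                edges[x] = 1
        result.append(edges)
    return result


edges = binarized_horizontal_gradient([[0, 0, 255, 255, 0]], threshold=100)
```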
APACHE: PLATE EXTRACTION
The plate region is the one where the density of vertical
edges (peaks of the horizontal gradient) is highest
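One simple way to locate the region of highest edge density is an exhaustive sliding-window count over the binarized gradient image. The slides do not say how APACHE performs this search, so the brute-force scan below is only an assumed illustration, with a hypothetical function name and window size.

```python
def densest_window(binary, win_w, win_h):
    """Return (x, y, count) for the win_w x win_h window with most edge pixels.

    binary: list of rows of 0/1 values. Brute-force scan; ties are resolved
    in favor of the first window found in row-major order.
    """
    rows, cols = len(binary), len(binary[0])
    best = (0, 0, -1)
    for y in range(rows - win_h + 1):
        for x in range(cols - win_w + 1):
            count = sum(sum(binary[yy][x:x + win_w])
                        for yy in range(y, y + win_h))
            if count > best[2]:
                best = (x, y, count)
    return best


edge_image = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0, 0],
]
plate_window = densest_window(edge_image, win_w=3, win_h=2)
```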
CHARACTER RECOGNITION
Recognition of digits represented by binary two-dimensional patterns: 10 specialized binary classifiers take the pattern as input and produce as output:
1 if the pattern belongs to the class corresponding to the classifier
0 if the pattern belongs to another class
CLASSIFICATION BY
INDEPENDENT CLASSIFIERS
Advantages
• Each classifier is specialized and yields high
performances
• Classifiers are very ‘compact’: they don’t need to
consider features belonging to several classes
Disadvantages
Possible ambiguous classifications:
- The output of all classifiers is 0
- The output of more than one classifier is 1
A disambiguation mechanism is needed
INPUT ENCODING (TERMINAL SET)
Input pattern: binary digits of size 13x8
The 104 bits can be represented using four 32-bit words (e.g., 4 unsigned long variables in C).
72 bits of the pattern are packed into the 24 least significant bits of each of the first 3 unsigned long variables (3 x 24 = 72).
The remaining 32 bits are packed into the fourth one.
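The packing scheme can be sketched as follows. The bit order (row-major, most significant bit first within each word) is an assumption of this example; the slides only state how many bits go into each word.

```python
def pack_pattern(bits):
    """Pack a 13x8 binary pattern (104 bits) into four 32-bit words.

    bits: flat list of 104 values (0 or 1), assumed to be in row-major order.
    Words 0-2 hold 24 bits each in their least significant bits;
    word 3 holds the remaining 32 bits.
    """
    if len(bits) != 104:
        raise ValueError("expected 13 x 8 = 104 bits")

    def to_word(chunk):
        word = 0
        for bit in chunk:
            word = (word << 1) | (bit & 1)
        return word

    words = [to_word(bits[i * 24:(i + 1) * 24]) for i in range(3)]
    words.append(to_word(bits[72:104]))
    return words
```

With 8-pixel rows, each of the first three words holds exactly three rows of the pattern and the fourth holds the last four rows.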
FUNCTION SET + ERCs
Binary bitwise operators: AND OR NOT XOR
Circular shift operators: SHR, SHR2, SHR4,
SHL, SHL2, SHL4
Ephemeral Random Constants (ERC):
32-bit unsigned long
FITNESS FUNCTION
F = number of correct classifications (TP+TN)
(TP = True Positives, TN = True Negatives)
NB: if training data are uniformly distributed, then the negative cases shown to each classifier are 9 times as many as the positive ones
=> F strongly favors specificity
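A small worked example makes the bias concrete. The numbers below (100 positives, 900 negatives) are illustrative and only reflect the 1:9 ratio mentioned above.

```python
def accuracy_fitness(predictions, labels):
    """F = TP + TN: the number of predictions that match the labels."""
    return sum(1 for p, y in zip(predictions, labels) if p == y)


# One class against nine others, uniformly distributed: 1 positive per 9 negatives.
labels = [1] * 100 + [0] * 900

# A degenerate classifier that never fires still scores 900 out of 1000.
always_zero = [0] * 1000
fitness_always_zero = accuracy_fitness(always_zero, labels)

# A classifier that finds every positive but raises 150 false alarms scores less.
eager = [1] * 100 + [1] * 150 + [0] * 750
fitness_eager = accuracy_fitness(eager, labels)
```

Since a false alarm costs as much as a miss and negatives are nine times more frequent, evolution tends to prefer classifiers that fire rarely, which is consistent with the high specificity noted above.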
EVOLUTION PARAMETERS
Population : 1000
Survival rate : 17 %
Crossover rate: 80%
Mutation rate : 3%
Tournament selection with tournament size = 7
300 to 2000 iterations
(up to 3 hours on a 3 GHz PC)
TEST SET
Database of plate digits collected at toll booths
of Italian highways
about 11000 digits of size 13x8 from real plates
binarized with threshold=0.5 (0=black; 1 = white)
6024 in the training set
5010 in the test set (exactly 501 per class)
CLASSIFIERS
Set of 10 binary classifiers (one for each class)
For each classifier:
Input : unsigned long pattern[4]
Output : unsigned long out
Each classifier actually produces 32 binary outputs
(32 distinct fitness cases): the output bit which
yields the highest fitness on the training set is taken
as the actual output of the classifier
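Selecting the best of the 32 output bits can be sketched as below, assuming fitness is TP+TN as defined earlier. The function name and the toy data are hypothetical.

```python
def best_output_bit(outputs, labels):
    """Return (bit_index, fitness) of the output bit that best matches the labels.

    outputs: one 32-bit output word per training pattern.
    labels: the desired 0/1 output for each pattern.
    Fitness of a bit is the number of patterns on which it equals the label.
    """
    best_bit, best_fit = 0, -1
    for bit in range(32):
        fit = sum(1 for out, y in zip(outputs, labels)
                  if ((out >> bit) & 1) == y)
        if fit > best_fit:
            best_bit, best_fit = bit, fit
    return best_bit, best_fit


# Toy data: bit 2 of the output word reproduces the labels exactly.
chosen = best_output_bit([0b100, 0b000, 0b101, 0b001], [1, 0, 1, 0])
```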
CLASSIFICATION RULE
The same pattern is input into all classifiers.
10 outputs are produced of which one (hopefully) is
1 and the others are 0.
If the output is ambiguous an external tie-breaker is
applied (the LVQ neural net classifier used in the
APACHE system).
For all-0 outputs this is the only possible remedy.
When more than one output is active, a second classifier set could be trained to act as a tie-breaker.
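The classification rule can be sketched as follows. The `tie_breaker` callback is a hypothetical stand-in for the external classifier (the LVQ net in APACHE); it receives the list of active classes, which is empty in the all-0 case.

```python
def classify(outputs, tie_breaker):
    """Combine the outputs of the 10 binary classifiers into one digit.

    outputs: list of 10 values (0 or 1), where index = digit class.
    tie_breaker: hypothetical callable invoked on ambiguous outputs with
    the list of active classes (possibly empty); returns a digit.
    """
    active = [digit for digit, out in enumerate(outputs) if out == 1]
    if len(active) == 1:
        return active[0]
    return tie_breaker(active)


unambiguous = classify([0, 0, 0, 1, 0, 0, 0, 0, 0, 0], lambda candidates: -1)
```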
FITNESS FUNCTION
F = TP+TN
• High specificity => high positive predictivity
• All-zero classifications are less than 5%
• Ambiguous cases less than 1%
95.3% of the digits are directly classified by the basic classifier set (accuracy: training 99.98%, testing 98.7%).
The remaining digits (about 5%) are classified by the LVQ classifier.
RESULTS
Training set
99.65% correct classifications
(99.98% over the 99% which could be directly classified)
Test Set
97.43% correct classifications
(98.7% over the 95.3% which could be directly classified)
Performance close to the reference LVQ classifier
Average execution time: 0.1 ms per classifier (1 ms per
pattern) on a 3-GHz Pentium IV computer after
compiling the programs.
10-fold reduction of computation time
SmcGP-BASED PLATE DETECTION
• Input data: 4 unsigned long words encoding a
window, of size 32x4 pixels, from the binarized
gradient image
• Desired output:
1 if the window belongs to the plate
0 otherwise
TRAINING SET
• 80/130 images
• Input data: binarized horizontal gradient image
• 366 negative + 4824 positive = 5190 training samples
• Same settings as those used for digit recognition
Empty samples (which can be found both inside and outside the plate) have been purged.
TYPICAL RESULTS
The same algorithm that APACHE directly applies to the gradient
image can be applied to this image to improve plate localization
COMPARISON WITH APACHE
• Increased plate-detection rate
RESULTS
• Improved accuracy with a limited increase
of computation time
• The computational efficiency of SmcGP classifiers limits the overhead added by the GP-evolved stage
Particle Swarm Optimization
Optimization technique inspired by the collective
behavior of flocks (swarms) of birds in search of
food
• Each particle moves ‘randomly’ in the parameter
space under the action of an attractive force field
generated by:
• the best point in the parameter space ever
discovered by the particle itself
• the best point in the parameter space ever
discovered by the whole swarm
Particle Swarm Optimization
• Basic PSO equations
$X_P(t) = X_P(t-1) + v_P(t)$
$v_P(t) = w \, v_P(t-1) + c_1 \, \mathrm{rand}() \, [BestX_P - X_P(t-1)] + c_2 \, \mathrm{rand}() \, [gBestX - X_P(t-1)]$
where:
c1, c2: positive constants
w: inertia weight
XP: position of particle P
BestXP: best location reached by particle P up to time t-1
gBestX: best location ever found by the whole swarm
rand(): random number between 0 and 1
The swarm converges towards the fittest point.
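One iteration of the basic update can be sketched in Python as follows. The default values of w, c1 and c2 are illustrative assumptions (the slides only say that c1 and c2 are positive constants), and the `rng` parameter is added here just to make the step testable.

```python
import random


def pso_step(positions, velocities, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=random):
    """Apply one basic PSO update in place to every particle.

    positions, velocities, personal_best: one list of coordinates per particle.
    global_best: coordinates of the best location found by the whole swarm.
    """
    for p in range(len(positions)):
        for d in range(len(positions[p])):
            # The velocity is computed from the previous position X_P(t-1)...
            velocities[p][d] = (
                w * velocities[p][d]
                + c1 * rng.random() * (personal_best[p][d] - positions[p][d])
                + c2 * rng.random() * (global_best[d] - positions[p][d])
            )
            # ...and only then is the position moved: X_P(t) = X_P(t-1) + v_P(t).
            positions[p][d] += velocities[p][d]
```

A full optimizer would also re-evaluate the fitness after each step and refresh `personal_best` and `global_best`; that bookkeeping is omitted here.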
PSO and Image Analysis
• Multi-modal optimization: the goal is to identify (segment out) one or MORE regions of interest (ROIs) in the search space (the image), not a single optimum.
• k-means clustering PSO (Passaro and Starita 2006): the swarm reorganizes itself into multiple sub-swarms; each sub-swarm may then ‘address’ a different target.
PSO-based License Plate Detection
• Plates feature a high density of pixels with high horizontal gradient values, due to the contrast between letters (black) and plate background (white)
• The horizontal gradient can be easily and efficiently approximated using differences between horizontally adjacent pixels
PSO-based License Plate Detection
Two search stages:
• Global search: the most ‘promising’ areas within the whole image are roughly located
• Local search: attention is focused on the ROIs defined in the previous stage to detect good plate candidates
PSO-based License Plate Detection
• A repulsion term between particles is added to let the swarm spread all over the target:
$v_P^*(t) = v_P(t) + repulsion_P$
• Repulsion between two particles is computed, separately along each axis, as
$repulsion(i, j) = REPULSION\_RANGE - |X_i - X_j|$
REPULSION_RANGE being the maximum distance within which interaction between particles occurs.
• The global repulsion term for P is the average of all repulsion terms with the n particles interacting with it:
$repulsion_P = \sum_j repulsion(P, j) / n$
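The repulsion term might be computed along the following lines. Two points are assumptions of this sketch, since the slides do not specify them: the repulsion is given a sign that pushes particle P away from its neighbor, and the interaction test is applied independently on each axis. The value of REPULSION_RANGE is illustrative.

```python
REPULSION_RANGE = 10.0  # illustrative value, not taken from the slides


def repulsion_term(p, positions, repulsion_range=REPULSION_RANGE):
    """Average per-axis repulsion acting on particle p.

    Along each axis, every other particle closer than repulsion_range
    contributes (repulsion_range - distance), directed away from it
    (assumed sign convention). The sum is divided by the number n of
    interacting particles on that axis.
    """
    dims = len(positions[p])
    term = [0.0] * dims
    for d in range(dims):
        total, n = 0.0, 0
        for j, other in enumerate(positions):
            if j == p:
                continue
            delta = positions[p][d] - other[d]
            distance = abs(delta)
            if distance < repulsion_range:
                # Coincident particles (delta == 0) are pushed in the positive direction.
                sign = 1.0 if delta >= 0 else -1.0
                total += sign * (repulsion_range - distance)
                n += 1
        if n:
            term[d] = total / n
    return term
```

The returned vector would then be added to the velocity produced by the standard update.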
PSO-based License Plate Detection
• The goal is to find regions featuring high
density of high-gradient pixels
• To prevent swarms from converging towards
ISOLATED pixels, fitness has two terms:
• punctual fitness, depending on visual
features
• local fitness, proportional to the number of
neighbors
PSO-based License Plate Detection
• The swarm explores a gray-scale image
fitness(x,y) = punctual_fitness(x,y) + local_fitness(x,y)
if ( |r(x,y)-g(x,y)| < K0 && |r(x,y)-b(x,y)| < K0 && |g(x,y)-b(x,y)| < K0 )
    punctual_fitness = max(right_gradient, left_gradient);
else
    punctual_fitness = 0;
where
    right_gradient = |grayscale(x,y) - grayscale(x+1,y)|;
    left_gradient = |grayscale(x,y) - grayscale(x-1,y)|;
local_fitness = K1 * neighbor_number
(the test on r, g and b keeps only pixels whose color is close to gray)
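The fitness rules above translate into Python roughly as follows. The values of K0 and K1 are illustrative, and treating a missing neighbor at the image border as a zero gradient is an assumption of this sketch.

```python
K0 = 30    # illustrative tolerance on channel differences
K1 = 1.0   # illustrative weight of the local term


def punctual_fitness(rgb, gray, x, y, k0=K0):
    """Gradient-based fitness of pixel (x, y); 0 for pixels that are not nearly gray.

    rgb: rows of (r, g, b) tuples; gray: rows of grey levels.
    """
    r, g, b = rgb[y][x]
    if abs(r - g) < k0 and abs(r - b) < k0 and abs(g - b) < k0:
        width = len(gray[y])
        # Border pixels have no neighbor on one side: that gradient is taken as 0.
        right = abs(gray[y][x] - gray[y][x + 1]) if x + 1 < width else 0
        left = abs(gray[y][x] - gray[y][x - 1]) if x > 0 else 0
        return max(right, left)
    return 0


def fitness(rgb, gray, x, y, neighbor_number, k0=K0, k1=K1):
    """Total fitness = punctual fitness + K1 * number of neighboring particles."""
    return punctual_fitness(rgb, gray, x, y, k0) + k1 * neighbor_number
```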
PSO-based License Plate Detection
• The standard equation has been further
modified, to increase stability of sub-swarms.
If a particle with high punctual fitness lies within
a region with high density of particles, then it
has a probability, which is linearly dependent on
such a density, of staying there:
Prob { X(t) = X(t-1) } = (number of particles in the neighborhood) / (total number of particles in the swarm)
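The stay-in-place rule can be sketched as below. How "high punctual fitness" is decided is not stated in the slides, so it is passed in as a boolean; the function names are hypothetical.

```python
import random


def stay_probability(neighbors_in_region, swarm_size):
    """Probability of staying: local particle count over swarm size."""
    return neighbors_in_region / swarm_size


def should_stay(has_high_punctual_fitness, neighbors_in_region, swarm_size,
                rng=random):
    """Decide whether a particle skips its move at this iteration."""
    if not has_high_punctual_fitness:
        return False
    return rng.random() < stay_probability(neighbors_in_region, swarm_size)
```

The denser a sub-swarm already is, the more likely its well-placed particles are to hold their position, which is what stabilizes it.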
PSO-based License Plate Detection
GLOBAL SEARCH RESULTS
PSO-based License Plate Detection
Two ‘promising’ areas are detected (sub-swarms with
Nparticles > T): the larger one will be explored first
PSO-based License Plate Detection
The whole swarm is re-initialized near the selected area
and a local search is performed, using the previous
algorithm.
PSO-based License Plate Detection
A bounding box is computed enclosing all particles with high punctual fitness. If this box has a width-to-height (w:h) ratio of 5:1, the plate is considered found.
PSO-based License Plate Detection
Otherwise the box is expanded, letting some agents move up and down (or left and right), in order to reach the given ratio. On failure, the next region is explored.
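The bounding-box test can be sketched as follows. The tolerance on the 5:1 ratio is an assumption added for illustration (the slides state the target ratio only), and the box-expansion step is not reproduced.

```python
def bounding_box(points):
    """Smallest axis-aligned box (x0, y0, x1, y1) enclosing all (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def has_plate_ratio(box, target=5.0, tolerance=0.5):
    """Check whether the box width:height ratio is close to the target.

    Width and height are counted in pixels, both corners included.
    The tolerance is a hypothetical parameter of this sketch.
    """
    x0, y0, x1, y1 = box
    width = x1 - x0 + 1
    height = y1 - y0 + 1
    return abs(width / height - target) <= tolerance


box = bounding_box([(10, 5), (59, 14), (30, 9)])
```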
PSO-based License Plate Detection
Final result
PSO-based License Plate Detection
TEST RESULTS
Set of 98 plates, 10 runs per plate = 980 runs
                      N      Avg. time
Correct detections    955    0.083 s
Wrong detections      15     0.151 s
Plate not found       10     1.424 s
In many cases performing the whole plate detection
process has a lower computation cost than computing
the gradient image
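The overall detection rate and the mean time per run follow directly from the figures reported above; the short computation below only restates them (the dictionary keys are arbitrary labels).

```python
# (number of runs, average time in seconds) for each outcome, as reported above
results = {
    "correct": (955, 0.083),
    "wrong": (15, 0.151),
    "not_found": (10, 1.424),
}

total_runs = sum(n for n, _ in results.values())
detection_rate = results["correct"][0] / total_runs
mean_time = sum(n * t for n, t in results.values()) / total_runs

print(f"runs: {total_runs}")
print(f"correct detection rate: {detection_rate:.2%}")
print(f"mean time per run: {mean_time:.3f} s")
```

Failed searches are much slower than successful ones, but they are rare enough that the mean time per run stays below 0.1 s.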
PSO-based License Plate Detection
[Figure: detection speed (955 correct detections / 980). Number of plates detected and (single-frame) detection percentage vs. time (s); time bins: < 0.05, < 0.1, < 0.15, < 0.2, < 0.3, < 0.4, < 0.5, < 1, < 2]
Real-time tracking can be achieved by re-initializing the swarm in frame F(t+1) within the region where the plate was detected in frame F(t): the average detection time is 0.03 s (more than 30 fps can be analyzed).
• Swarm Intelligence approaches can be very
efficient in roughly detecting object positions,
as they effectively sample the search space
• Very interesting approaches to the analysis of
stereo images for obstacle detection and
robot navigation have been described by
Louchet, Lutton and Olague
CONCLUSIONS
In many cases in which a signal/image
processing/understanding problem can be
reformulated as an optimization problem (and
not only in those cases… ), evolutionary
computation provides powerful and effective
tools to search for ‘good’ solutions, hard to find
otherwise.
BIBLIOGRAPHY
• Poli R., Cagnoni S., Valli G., "Genetic design of optimum linear and non-linear QRS detectors", IEEE Trans. on Biomedical Engineering, 42(11):1137-41, 1995.
• Poli R., Cagnoni S., "Genetic programming with user-driven selection: experiments on the evolution of algorithms for image enhancement", Genetic Programming 1997, pp. 269-277, 1997.
• Cagnoni S., Dobrzeniecki A.B., Poli R., Yanch J.C., "Genetic-algorithm-based interactive segmentation of 3D medical images", Image and Vision Computing Journal, 17(12):881-896, 1999.
• Cagnoni S., Bergenti F., Mordonini M., Adorni G., "Evolving binary classifiers through parallel computation of multiple fitness cases", IEEE Transactions on Systems, Man and Cybernetics Part B-Cybernetics, 35(3):548-555, 2005.
• Cagnoni S., Mordonini M., Sartori J., "Particle Swarm Optimization for Image Segmentation and Object Detection", Proc. EvoWorkshops 2007, pp. …..
UPCOMING BOOK
Genetic and Evolutionary Image
Processing and Computer Vision
EURASIP Book Series in Signal Processing
Editors: S. Cagnoni, E. Lutton, G. Olague