BABEŞ-BOLYAI UNIVERSITY CLUJ-NAPOCA
FACULTY OF MATHEMATICS AND COMPUTER SCIENCE
SPECIALIZATION
FORMAL METHODS IN PROGRAMMING
Pattern Recognition
Author
Mezei Sergiu-Vlad
2010
Table of Contents

1. Introduction
   1.1. Pattern Recognition
   1.2. Computer Vision
2. OpenCV (Open Source Computer Vision)
3. Pattern Recognition
   3.1. Classifiers
      3.1.1. Decision Functions
      3.1.2. Statistical Approach
      3.1.3. Fuzzy Classifiers
      3.1.4. Syntactic Approach
      3.1.5. Neural networks
   3.2. Traffic Sign Recognition Using OpenCV
4. Recent applications
5. Conclusions
6. Bibliography
Abstract
Pattern recognition is an activity that humans normally excel in. We do it almost all the time, without
realizing it. We receive information through our various senses, which is instantaneously processed by
our mind such that we are able to identify the source of the information, without having made any
perceptible effort. What is even more impressive is the ability to accurately perform recognition tasks
even in non-ideal conditions, for example, when the received information is ambiguous, imprecise or
even incomplete. In fact, most of our day-to-day activities are based on our success in performing
various pattern recognition tasks. For example, when we read, we recognize letters, words and,
ultimately, concepts and notions from the visual signals received by our mind, which processes them
extremely fast and probably performs a neurobiological equivalent of template matching.
This paper intends to provide insight into the way a machine can perform recognition tasks in order to
perceive the environment as a human being does.
1. Introduction
1.1. Pattern Recognition
The subject of Pattern Recognition (PR) or Pattern recognition by machine basically deals with the
problem of constructing methods and algorithms that can enable the machine-implementation of
various recognition tasks that a human normally performs. The motivation is to find ways in which the
recognition tasks can be performed faster, more accurately and maybe more economically than a
human can. The scope of PR also includes tasks at which humans are not good, like reading bar
codes.
The goal of pattern recognition research is to devise ways and means of automating certain decision-making processes that lead to classification and recognition. For example, a sorting factory can use
these algorithms to separate red apples from green ones travelling on a conveyor belt, using a
video camera, which will be the “eye” of the machine, and a computer which processes the
input information from the camera to determine whether each apple is red or green.
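The apple-sorting idea can be sketched as a tiny color classifier. This is a hypothetical illustration in plain Python — the function name, the pixel format and the decision rule are invented for this sketch, not taken from any real sorting system:

```python
# Hypothetical sketch: classify an apple as "red" or "green" from the
# average color of its image region, given as a list of (R, G, B) tuples.
def classify_apple(pixels):
    n = len(pixels)
    avg_r = sum(p[0] for p in pixels) / n
    avg_g = sum(p[1] for p in pixels) / n
    # Simple decision rule: whichever of the two channels dominates wins.
    return "red" if avg_r > avg_g else "green"
```

A real system would first segment the apple from the belt background, but the final decision can be this simple once the region of interest is isolated.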
Pattern recognition has been an intensely studied field for the past few decades, as is amply borne out
by numerous books (References [1, 2, 3, 4, 5, 6, 7] for example).
The object which is inspected for the “recognition” process is called a pattern. We normally refer to a
pattern as a description of an object which we want to recognize. In this paper we are interested in
spatial patterns like humans, apples, fingerprints, electrocardiograms or chromosomes. In most cases a
pattern recognition problem is a problem of discriminating between different populations. So, the
recognition process turns into classification. To determine which class an object belongs to, we must
first find which features are going to determine this classification.
In constructing a pattern recognition system, i.e. a system that will be able to obtain an unknown
incoming pattern and classify it in one (or more) of several given classes, we clearly want to employ all
the available related information that was previously accumulated. We assume that some sample
patterns with known classification are available. These patterns, with their typical attributes, form a
training set which provides relevant information on how to associate input information with
decision making. By using the training set, the pattern recognition system may learn various types of
information such as statistical parameters, relevant features, etc.
1.2. Computer Vision
Computer vision is the science that helps an artificial system sense the surrounding environment like a
human senses the environment through sight. It develops a basis through which we can automatically
process and obtain data from an image or a sequence of images (a video stream). The artificial vision
system that is implemented in software and hardware goes hand in hand with research made on
biological vision. This is an extremely difficult task that requires an understanding of human perception,
so we could say that Computer Vision is a multidisciplinary domain. Sciences like psychology,
neurophysiology, cognitive science and geometry, areas of physics such as optics, and fields like
artificial intelligence and pattern recognition are all involved and, at some level, considered
prerequisites for Computer Vision.
Computer vision is a wide and relatively new field of study. In the early days, it was difficult to process
even small sets of image data, although earlier work exists, starting from the 1950s, but it was done in a
more simple and theoretical way. It was not until the late 1970s that a more focused study of the field
appeared on the computing stage, when computers could manage the processing of large data sets such
as images and video streams. Computer vision covers a wide range of topics which are often related to
other disciplines, and consequently there is no standard formulation of "the computer vision problem".
Moreover, there is no standard formulation of how computer vision problems should be solved.
There is, however, an abundance of methods for solving various well-defined computer vision tasks,
but these methods are often very task specific and can seldom be generalized over a wide range of
applications. Many of the methods and applications are still in the state of basic research, but more and
more methods have found their way into commercial products, where they often constitute a part of a
larger system which can solve complex tasks (e.g., in the area of medical images, or quality control and
measurements in industrial processes). In most practical computer vision applications, the computers
are pre-programmed to solve a particular task, but methods based on learning are now becoming
increasingly common.
As a technological discipline, computer vision seeks to apply its theories and models to the
construction of computer vision systems. Examples of applications of computer vision include systems
for:
 Controlling processes (e.g. an industrial robot or an autonomous vehicle).
 Detecting events (e.g. for visual surveillance or people counting).
 Organizing information (e.g. for indexing databases of images and image sequences).
 Modeling objects or environments (e.g. industrial inspection, medical image analysis or topographical modeling).
 Interaction (e.g. as the input to a device for computer-human interaction).
Sub-domains of computer vision include scene reconstruction, event detection, video tracking, object
recognition, learning, indexing, motion estimation, and image restoration.
2. OpenCV (Open Source Computer Vision)
OpenCV is an open source computer vision library. The library is written in C and C++ and runs under
Linux, Windows and Mac OS X. There is active development on interfaces for Python, Ruby, Matlab,
and other languages. OpenCV was designed for computational efficiency and with a strong focus on
real-time applications. OpenCV is written in optimized C and can take advantage of multicore
processors [13, 14].
The OpenCV library has over 500 functions that span many fields in vision, including factory product
inspection, medical imaging, security, user interface, camera calibration, stereo vision, and robotics.
Officially launched in 1999, the OpenCV project was initially an Intel Research initiative to advance
CPU-intensive applications, part of a series of projects including real-time ray tracing and 3D display
walls. The main contributors to the project included Intel’s Performance Library Team, as well as a
number of optimization experts in Intel Russia. In the early days of OpenCV, the goals of the project
were described as:
 Advance vision research by providing not only open but also optimized code for basic vision infrastructure. No more reinventing the wheel.
 Disseminate vision knowledge by providing a common infrastructure that developers could build on, so that code would be more readily readable and transferable.
 Advance vision-based commercial applications by making portable, performance-optimized code available for free, with a license that did not require commercial applications to be open or free themselves.
The first alpha version of OpenCV was released to the public at the IEEE Conference on Computer
Vision and Pattern Recognition in 2000, and five betas were released between 2001 and 2005. The first
1.0 version was released in 2006. In mid 2008, OpenCV obtained corporate support from Willow
Garage, and is now again under active development. A version 1.1 "pre-release" was released in
October 2008, and a book by two authors of OpenCV, published by O'Reilly Media, went on the market
that same month (see Learning OpenCV: Computer Vision with the OpenCV Library).
The second major release of OpenCV came in October 2009. OpenCV 2 includes major changes to
the C++ interface, aiming at easier, more type-safe patterns, new functions, and better implementations
of existing ones in terms of performance (especially on multi-core systems).
Since then, OpenCV has been under heavy development and is currently at version 2.1.
Anybody can now contribute to OpenCV development through the public SVN repository, where
developers can commit their contributions.
Here is a list of the most important domains that are covered:
 Statistics and moments computing
 Structural analysis
 Motion analysis and object tracking
 Pattern recognition
 Graphic interface
 General Image Processing and Analysis functions
 Segmentation
 Geometric descriptors
 Transforms
 Machine Learning: Detection, Recognition
 Matrix Math
 Image Pyramids
 Camera Calibration, Stereo, 3D
 Utilities and Data Structures
 Fitting
Currently there are no standard APIs for Computer Vision, the only software that can be found is
research code (which is slow and unstable), expensive commercial toolkits (like Halcon or Matlab with
Simulink) or specialized solutions made for video surveillance, medical equipment, etc. Therefore there
is no standard library that simplifies the development of computer vision applications. OpenCV was
created to serve as a solution to this problem. Created by the Intel Corporation, it is especially
optimized for Intel processors but, as an open source project, it is usable on any architecture. As a
consequence, the algorithms are designed to be fast, especially because this library also serves to
showcase the capabilities of Intel processors.
3. Pattern Recognition
3.1. Classifiers
The final goal in pattern recognition is classification of a pattern. From the original information that we
obtain about the pattern we first identify the features and then use a feature extractor to measure them.
These measurements are then sent to a classifier which performs the actual classification (determines
to which of the existing classes the pattern belongs).
In this section we assume the presence of natural grouping, meaning that we have some knowledge
about the classes and the data. For example we may know the exact or approximate number of the
classes and the correct classification of some given patterns which are called the training patterns.
3.1.1. Decision Functions
When the number of classes is known and when the training patterns are such that there is geometrical
separation between the classes we can often use a set of decision functions to classify an unknown
pattern. For example, consider a case where we have two classes, Class1 and Class2, in R^n, and a
hyperplane d(x) = 0 which separates the two classes. Then we can use the decision function d(x) as a
linear classifier and classify a new pattern by

d(x) > 0 => x belongs to Class1
d(x) < 0 => x belongs to Class2
The hyperplane d(x) = 0 is called a decision boundary. When a set of hyperplanes can separate between
m given classes in R^n, these classes are linearly separable. Sometimes we cannot classify
elements using only linear decision functions; in this case we can either use generalized decision
functions (which are nonlinear), use a nonlinear classifier, or transform the problem to a space of much
higher dimension where classification can be done using linear boundaries.
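As an illustration, a linear decision function of this kind takes only a few lines of Python. The names w and b are chosen for this sketch; in practice the weights would be learned from the training patterns:

```python
# Linear decision function d(x) = w.x + b over R^n.
def d(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x, w, b):
    # d(x) > 0 -> Class1, d(x) < 0 -> Class2;
    # d(x) = 0 lies on the decision boundary (here assigned to Class2).
    return "Class1" if d(x, w, b) > 0 else "Class2"
```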
3.1.2. Statistical Approach
Many times the training patterns of various classes overlap, for example when they originate from
some statistical distributions [3, 4]. In this case a statistical approach is needed, especially when the
various distribution functions of the classes are known. A statistical classifier must also evaluate the
risk associated with every classification, which measures the probability of misclassification.
The Bayes classifier [3, 5, 6, 10], based on Bayes' formula from probability theory, minimizes the total
expected risk. In order to use the Bayes classifier, one must know beforehand the pattern distribution
function for each class. If these distributions are unknown, they must at least be approximated using the
training patterns.
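A minimal sketch of a Bayes classifier for one-dimensional patterns with known Gaussian class distributions might look as follows. The class names, priors and distribution parameters are invented for illustration; under a 0-1 loss, picking the class that maximizes prior times likelihood minimizes the expected risk:

```python
import math

def gaussian_pdf(x, mean, std):
    # Likelihood of x under a normal distribution N(mean, std^2).
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

def bayes_classify(x, classes):
    # classes: dict mapping class name -> (prior, mean, std).
    # Choose the class maximizing prior * likelihood (maximum a posteriori).
    return max(classes, key=lambda c: classes[c][0] * gaussian_pdf(x, *classes[c][1:]))
```

For example, with "small" apples around diameter 5 and "big" apples around 9, a measurement near 5 is assigned to "small".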
3.1.3. Fuzzy Classifiers
There are times when classification is performed with some degree of uncertainty. Either the
classification outcome itself may be in doubt, or the classified pattern x may belong in some degree to
more than one class. For example a medium apple cannot be considered to fully belong to class “big”,
although at the same time it cannot be fully accepted in the class “small” (provided only these two
classes exist). So, we naturally introduce fuzzy classification where a pattern is a member of every
class with some degree of membership between 0 and 1 [1, 2, 9].
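The apple example can be made concrete with a piecewise-linear membership function; the diameter thresholds below are illustrative, not taken from any standard:

```python
def membership_big(diameter, small=6.0, big=9.0):
    # Degree of membership in the fuzzy class "big":
    # 0 below `small`, 1 above `big`, linear ramp in between.
    if diameter <= small:
        return 0.0
    if diameter >= big:
        return 1.0
    return (diameter - small) / (big - small)
```

A medium apple of diameter 7.5 then belongs to "big" with degree 0.5, capturing exactly the partial-membership idea of fuzzy classification.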
3.1.4. Syntactic Approach
Unlike the previous approaches, syntactic pattern recognition utilizes the structure of the patterns
[7]. Instead of carrying out an analysis based strictly on quantitative characteristics of the pattern, we
emphasize the interrelationships between the primitives, the components which compose the pattern.
Typical patterns that are subject to syntactic pattern recognition research are characters, fingerprints,
chromosomes, etc.
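As a toy illustration of the syntactic idea, a pattern can be encoded as a string of primitives and matched against a structural rule. The primitive alphabet and the "triangle" rule below are entirely made up for this sketch:

```python
import re

# Hypothetical primitives: 'u' = up-stroke, 'd' = down-stroke, 'h' = horizontal.
# A made-up structural rule for a triangle contour: one or more up-strokes,
# then down-strokes, closed by horizontal strokes.
TRIANGLE = re.compile(r"u+d+h+$")

def is_triangle(primitive_string):
    return TRIANGLE.match(primitive_string) is not None
```

Real syntactic recognizers use formal grammars and parsers rather than regular expressions, but the principle — classifying by the arrangement of primitives, not by numeric features — is the same.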
3.1.5. Neural networks
The neural network approach assumes, as the approaches before, that a set of training patterns and their
correct classifications is given [11, 12]. The architecture of the net, which includes an input layer, an
output layer and hidden layers, may be very complex. It is characterized by a set of weights and an
activation function which determine how the input data is transmitted to the output layer. The
neural network is trained on the training patterns, adjusting the weights until the correct classifications
are obtained. It is then used to classify arbitrary unknown patterns. There are several neural net
classifiers, but the most used and simplest one is the perceptron.
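A minimal perceptron trainer can be sketched as follows — a hypothetical pure-Python illustration in which labels are +1/-1 and the weights are adjusted only on misclassified training patterns, which is the classical perceptron learning rule:

```python
def predict(x, w, b):
    # Sign of the linear activation w.x + b, mapped to labels {+1, -1}.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def train_perceptron(samples, epochs=20, lr=1.0):
    # samples: list of (x, label) pairs with label in {+1, -1}.
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, label in samples:
            if predict(x, w, b) != label:  # update only on mistakes
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b
```

On linearly separable training data this procedure is guaranteed to converge; for overlapping classes, the statistical or fuzzy approaches above are more appropriate.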
Pattern recognition and classification have been used for numerous applications. In the following pages
of this paper, I would like to present an application that will revolutionize the way we travel.
3.2. Traffic Sign Recognition Using OpenCV
Traffic signs regulate traffic, warn the driver, and command or forbid certain
actions. Fast, robust, real-time automatic traffic sign detection and recognition can support and
simplify the tasks that a driver has to perform and significantly increase driving safety and comfort.
Generally, traffic signs provide the driver with a variety of information for safe and efficient
navigation. Automatic recognition of traffic signs is, therefore, important for automated intelligent
driving vehicles or for driver assistance systems. However, identification of traffic signs with respect to
various natural background viewing conditions still remains a challenging task. Traffic Sign
Recognition systems are usually developed in two specific phases [15-21]. The first phase is
normally related to the detection of traffic signs in a video sequence or an image using image
processing. The second is related to the recognition of those detected signs, which is typically
performed with an artificial neural network. The detection algorithms are normally based on
shape or color segmentation. The segmented potential regions are extracted as input to the recognition
stage. The efficiency and speed of the detection play important roles in the system. To recognize traffic
signs, various methods for automatic traffic sign identification have been developed and show
promising results. Neural networks in particular represent a technology widely used in traffic sign
recognition [15, 17, 20, 21]. One specific area in which many neural network applications have been developed is
the automatic recognition of signs. Fig. 1 illustrates the concept of an automated intelligent driving
vehicle or a driver assistance system.

[Fig. 1 flow: human visual recognition processing → automatic road sign detection and recognition system → neural networks addressing the traffic sign recognition problem → guiding, warning or regulating output to make driving safer and easier.]

Fig. 1 The concept of an automated intelligent driving vehicle or a driver assistance system
The difficulties of traffic sign detection and recognition lie in the real-time performance requirements
of such a system. Highly efficient algorithms and powerful hardware are required
[17]. Furthermore, the environmental constraints include lighting, shadow, occlusion, air pollution and
weather conditions (sunny, rainy, foggy, etc.), as well as additional image distortions, such as motion
blur, vehicle vibration, and abrupt contrast changes, which can occur frequently in an actual
system [17, 21]. In recent years, the detection and recognition of traffic signs have been studied in
many research centers. A vision system for traffic sign recognition integrated into an autonomous
vehicle was developed as part of the European research project PROMETHEUS at the DAIMLER-BENZ
Research Center [17]. Moreover, many techniques have been developed for road sign recognition. For
example, Pacheco et al. [22] used special color barcodes under road signs for detecting road signs in a
vision-based system; however, this took a lot of time and resources. A genetic algorithm was also
proposed by Aoyagi and Asakura [23] to identify road signs from gray-level images; because of the
limitations of the crossover and mutation operators, it is not guaranteed to find an optimal solution.
Color indexing was proposed by Lalonde and Li [24] as an approach to identifying road signs;
unfortunately, the computation time was not acceptable in complex traffic scenes.
This application represents a real implementation on an intelligent vehicle. The main objective is
to reduce the search space and indicate only potential regions, increasing the efficiency and speed of
the system. A robust and fast intelligent algorithm is required to provide the necessary
accuracy in recognition of traffic signs. In the detection phase, the acquired image is preprocessed,
enhanced, and segmented according to the sign properties of color and shape. The traffic sign images
are investigated to detect potential pixel regions which could be recognized as possible road signs from
the complex background. The potential objects are then normalized to a specified size, and input to
recognition phase. This study investigates only circle and hexagonal shaped objects because these
shapes are normally present in many types of traffic signs. A Multilayer Perceptron (MLP) trained with
a back propagation learning algorithm is the method used to approach the problem of recognizing
signs in this work. The image processing tool used for this application is the free and noncommercial
Intel® Open Source Computer Vision Library (OpenCV) [14].
Here are the steps that this proposed application follows in order to achieve the end result:
The images are pre-processed in stages with image processing techniques such as thresholding,
Gaussian filtering, Canny edge detection, contour extraction and ellipse fitting. Each image is reduced
to a small region of the traffic sign called a “blob.” The main reason to select this method is to reduce the
computational cost in order to facilitate real-time implementation. This first step is achieved using the
OpenCV library, which provides the algorithms necessary to process the acquired images. Then, in the
second step, the neural network stages are performed to recognize the traffic sign patterns. The first
strategy is to reduce the number of MLP inputs by preprocessing the traffic sign image, and the second
strategy is to search for the best network architecture by selecting a suitable error criterion for training.
The system is trained on a training data set and validated on a validation data set to find the suitable
network architecture. The cross-validation technique is implemented with a training set, a validation
set, and a test set. The experiments show consistent results together with accurate classifications of
traffic sign patterns against complex background images. The processing time per frame is low
enough to apply the system in a real intelligent vehicle or driver assistance application.
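The detection-phase idea of thresholding the image and keeping only a small potential region (the "blob") can be caricatured without OpenCV as a threshold-plus-bounding-box routine. This is a simplified, hypothetical stand-in for the contour-based steps the application performs with the library:

```python
def extract_blob(gray, thresh=128):
    # gray: 2-D list of 0..255 intensities. Returns the bounding box
    # (top, left, bottom, right) of all pixels brighter than `thresh`,
    # or None when nothing exceeds the threshold.
    coords = [(r, c) for r, row in enumerate(gray)
              for c, v in enumerate(row) if v > thresh]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))
```

The cropped box would then be normalized to a fixed size and fed to the MLP; restricting recognition to such candidate regions is what keeps the computational cost compatible with real-time operation.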
4. Recent applications
In the last few years there has been a lot of interest in the field of computer vision and pattern
recognition by large organizations like DARPA, the research and development office of the
U.S. Department of Defense. DARPA's mission is to maintain the technological superiority of the U.S.
military and prevent technological surprise from harming U.S. national security.
DARPA funds unique and innovative research through the private sector, academic and other nonprofit organizations as well as government labs.
In 2007, DARPA hosted an autonomous vehicle race through traffic called the “Urban Challenge”.
The DARPA Urban Challenge was held on November 3rd, 2007, at the former George AFB in
Victorville, California. Building on the success of the 2004 and 2005 Grand Challenges, this event
required teams to build an autonomous vehicle capable of driving in traffic, performing complex
maneuvers such as merging, passing, parking and negotiating intersections. This event was truly
groundbreaking, as it was the first time autonomous vehicles had interacted with both manned and
unmanned vehicle traffic in an urban environment.
Teams from around the world were whittled down through a series of qualifying steps, beginning with
technical papers and videos, then advancing to actual vehicle testing at team sites. Of the 89 teams to
initially apply, 35 teams were invited to the National Qualification Event (NQE), a rigorous eight-day
vehicle testing period. The NQE was co-located with the Final Event in Victorville, CA. DARPA
transformed the roads of the former George AFB into an autonomous vehicle testing ground.
After tallying all of the NQE scores, DARPA announced on November 1, 2007, that 11 teams would be
competing in the Final Event. And so at sunrise on November 3, in front of a crowd of thousands on
hand to witness history being made, Dr. Tether, DARPA Director, raised the green flag and the race was
on. One by one, all 11 finalist robots were released from their starting chutes, followed by a chase
vehicle equipped with an emergency stop control.
This event was not just a timed race, however – robots were also being judged on their ability to follow
California driving rules. DARPA officials pored through reams of data throughout the night, analyzing
each team's infractions and elapsed run times.
At the awards ceremony the next morning, DARPA announced the winning order:
1st Place: $2,000,000 Tartan Racing – Pittsburgh, PA
2nd Place: $1,000,000 Stanford Racing Team– Stanford, CA
3rd Place : $500,000 Victor Tango – Blacksburg, VA
OpenCV was of key use in the vision system of “Stanley”, the vehicle that the ‘Stanford Racing’ team
used [25].
5. Conclusions
In this paper I have discussed the approaches to pattern recognition in computer vision that have
emerged over the past few decades.
In my opinion, these advances in pattern recognition and computer vision represent the building blocks
for the near future of the technology based world. I hope I have provided some insight into the world of
robots that were a matter of science fiction at the end of the 20th century. I can only think of
advantages in continuing the research into this amazing subject; the main advantage is the preservation
of human life by assigning life-threatening tasks to robots.
There continues to be a great demand for pattern recognition applications in the current technology
based world, so future research on the matter of pattern recognition is predicted to continue and
intensify in this century.
6. Bibliography
[1] J.C. Bezdek, Pattern Recognition with Fuzzy Objective Functions. New York: Plenum Press, 1981
[2] J.C. Bezdek, J. Keller, R. Krisnapuram, and N.R. Pal, Fuzzy Models and Algorithms for Pattern
Recognition and Image Processing. Boston: Kluwer Academic, 1999.
[3] P.A. Devijver and J. Kittler, Pattern Recognition: A Statistical Approach. London: Prentice-Hall,
1982.
[4] L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York:
Springer, 1996
[5] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973
[6] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification. New York: John Wiley and Sons,
second ed., 2000.
[7] K.S. Fu, Syntactic Pattern Recognition and Applications. Englewood Cliffs, NJ: Prentice-Hall,
1982.
[8] K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Morgan Kaufmann, 1990.
[9] S.K. Pal and D. Dutta Majumder, Fuzzy Mathematical Approach to Pattern Recognition. New York:
Wiley (Halsted Press), 1986.
[10] S. Theodoridis and K. Koutroumbas, Pattern Recognition. San Diego: Academic Press, 1999.
[11] B.D. Ripley, Pattern Recognition and Neural Networks. Cambridge: University Press, 1996
[12] C.M. Bishop, Neural Networks for Pattern Recognition. Oxford: University Press, 1995.
[13] G. Bradski and A. Kaehler, Learning OpenCV. O'Reilly Media, Inc., USA, 2008.
[14] Open Source Computer Vision Library – Reference Manual, Intel
[15] R. Vicen-Bueno, R. Gil-Pita, M.P. Jarabo-Amores and F. López-Ferreras, “Complexity Reduction
in Neural Networks Applied to Traffic Sign Recognition”, Proceedings of the 13th European Signal
Processing Conference, Antalya, Turkey, September 4-8, 2005.
[16] R. Vicen-Bueno, R. Gil-Pita, M. Rosa-Zurera, M. Utrilla-Manso, and F. Lopez-Ferreras,
“Multilayer Perceptrons Applied to Traffic Sign Recognition Tasks”, LNCS 3512, IWANN 2005, J.
Cabestany, A. Prieto, and D.F. Sandoval (Eds.), Springer-Verlag, Berlin, Heidelberg, 2005.
[17] H. X. Liu, and B. Ran, “Vision-Based Stop Sign Detection and Recognition System for Intelligent
Vehicle”, Transportation Research Board (TRB) Annual Meeting 2001, Washington, D.C., USA,
January 7-11, 2001.
[18] H. Fleyeh and M. Dougherty, “Road and Traffic Sign Detection and Recognition”, Proceedings
of the 16th Mini-EURO Conference and 10th Meeting of EWGT.
[19] D.S. Kang, N.C. Griswold, and N. Kehtarnavaz, “An Invariant Traffic Sign Recognition System
Based on Sequential Color Processing and Geometrical Transformation”, Proceedings of the IEEE
Southwest Symposium on Image Analysis and Interpretation, 21-24 Apr 1994.
[20] M. Rincon, S. Lafuente-Arroyo, and S. Maldonado-Bascon, “Knowledge Modeling for the Traffic
Sign Recognition Task”, Springer Berlin / Heidelberg Volume 3561/2005.
[21] C.Y. Fang, C.S. Fuh, P.S. Yen, S. Cherng, and S.W. Chen, “An Automatic Road Sign
Recognition System based on a Computational Model of Human Recognition Processing”, Computer
Vision and Image Understanding, Vol. 96, Issue 2 (November 2004).
[22] L. Pacheco, J. Batlle, X. Cufi, “A new approach to real time traffic sign recognition based on color
information”, Proceedings of the Intelligent Vehicles Symposium, Paris, 1994.
[23] Y. Aoyagi, T. Asakura, “A study on traffic sign recognition in scene image using genetic
algorithms and neural networks”, Proceedings of the IEEE IECON Int. Conf. on Industrial Electronics,
Control, and Instrumentation, Taipei, Taiwan, vol. 3, 1996.
[24] M. Lalonde, Y. Li, Detection of Road Signs Using Color Indexing, Technical Report
CRIMIT95/12-49, Centre de Recherche Informatique de Montreal. Available from:
<http://www.crim.ca/sbc/english/cime/publications.html>, 1995.
[25] http://tech.groups.yahoo.com/group/OpenCV