An introduction to pattern recognition using neural networks

Pattern Recognition (PR) is the largest application of neural networks. PR involves the classification of an unknown input: the process of mapping a given input to one of a group of known patterns, each of which represents a certain class. For example, in character recognition, an input alphabetic character is identified as one of the 26 letters of the alphabet. Two other examples of PR are:

- Identifying someone from a facial, fingerprint or retinal image
- Understanding spoken words in voice or speech recognition

PR methodology and techniques

The simplest approach is the direct matching of the input against a database of representative patterns. In general, however, PR is a two-stage process involving:

1. The extraction of features characterising the input pattern. This involves processing carried out to measure certain feature values.
2. Classification by matching this set of features against a database of patterns storing a set of features for each class.

The measurements made for a pattern are treated as the n components of an n-dimensional vector, known as the feature vector for that pattern. Feature vectors can be represented by points in an n-dimensional feature space. Patterns belonging to the same class form clusters in this feature space.

For a simple example, if we want to identify people using the two features height and weight, the two axes of the 2-dimensional feature space will correspond to these two features. Pattern vectors for people with similar characteristics, such as tall, heavy people, will correspond to relatively closely located points in this 2-dimensional feature space. If we call the class of such people the giants, and the class of short, lightweight people the minos, we can then try to draw a line dividing these two classes – the so-called decision boundary. For a 3-dimensional feature space, this boundary will be a plane.
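The height/weight example can be sketched in a few lines of code. The feature vectors, class labels and the coefficients of the dividing line below are illustrative assumptions, not values from the text; they simply show how a linear decision boundary splits a 2-dimensional feature space into two classes.

```python
# Each person is a 2-dimensional feature vector: (height in cm, weight in kg).
# These sample points are assumed for illustration.
giants = [(190.0, 95.0), (185.0, 100.0), (198.0, 110.0)]
minos = [(155.0, 50.0), (150.0, 45.0), (160.0, 55.0)]

def side_of_boundary(height, weight):
    """Classify a point by which side of an (assumed) linear decision
    boundary it falls on: w1*height + w2*weight + b > 0 -> 'giant'."""
    w1, w2, b = 1.0, 1.0, -260.0  # hand-picked line separating the two clusters
    return "giant" if w1 * height + w2 * weight + b > 0 else "mino"

for h, w in giants + minos:
    print((h, w), "->", side_of_boundary(h, w))
```

Any choice of w1, w2 and b defines a different line; a learning algorithm's job is to find coefficients that place all training points of each class on the correct side.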
In general, the decision boundary for patterns in an n-dimensional feature space will be a hyperplane. Any unknown input can then be classified as a giant or a mino depending on which side of the decision boundary the corresponding pattern vector falls.

The classification decision is arrived at by measuring the distance of an input pattern vector from the clusters representing the various classes. There is more than one way of measuring this distance. A common distance metric is the Euclidean distance between two points P(x1, x2, .. , xn) and Q(y1, y2, .. , yn) (or two pattern vectors with components (x1, x2, .. , xn) and (y1, y2, .. , yn)) in n-dimensional space, defined as

√((x1 − y1)² + (x2 − y2)² + .. + (xn − yn)²)

Another common distance metric is the Hamming distance, given by the number of component positions in which the two vectors differ; for binary-valued feature vectors this is simply the number of differing bits.

Pattern classes which can be separated in the feature space using hyperplanes (linear decision boundaries) are known as linearly separable. Unfortunately, not all PR problems deal with linearly separable patterns. Artificial neural networks are capable of handling both linear and non-linear pattern classification.
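The two distance metrics, and the minimum-distance classification they support, can be sketched as follows. The class centroids are illustrative assumptions; in practice they would be computed from the training clusters.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two n-dimensional pattern vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def hamming(p, q):
    """Hamming distance: the number of component positions that differ."""
    return sum(1 for x, y in zip(p, q) if x != y)

def classify(x, centroids):
    """Minimum-distance classifier: assign x to the class whose cluster
    centre is nearest in Euclidean distance. `centroids` maps a class
    name to a representative feature vector (assumed values below)."""
    return min(centroids, key=lambda c: euclidean(x, centroids[c]))

centroids = {"giant": (190.0, 100.0), "mino": (155.0, 50.0)}
print(euclidean((0, 0), (3, 4)))            # 5.0
print(hamming((1, 0, 1, 1), (1, 1, 1, 0)))  # 2
print(classify((180.0, 90.0), centroids))   # nearer the 'giant' centre
```

Note that measuring the distance to a single centroid per class is only one option; another is to measure the distance to the nearest individual training point in each cluster.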