Full project report

Pattern Recognition Using Artificial Neural Networks
Project Team
Eyal Ittah (60407301)
Ittai Doron (53084489)
Introduction
One of the uses of computational vision is the recognition of shapes, patterns and objects in
an input image. Pattern recognition aims to classify data based on information extracted from the
data. In our case, we chose to classify images of playing cards by their suit (spades, clubs, hearts and
diamonds) and value (2-10, J, Q, K and A). While this problem is simple enough for the human brain,
recognizing the shapes and numbers in an image is a difficult operation for a computer.
Our application uses relaxation labeling to extract the relevant data
needed for pattern recognition from the card images. It then uses the processed image as
input to an Artificial Neural Network that determines the suit and value of the card. Once the card
information is extracted, the application presents the result to the user.
Pattern Recognition
A complete pattern recognition system consists of:
1. A sensor
   In our case we bypassed the sensor stage and supplied the application with images of the cards directly.
   The same system could be used with a camera that continually takes photographs and saves them to the
   computer for the application to analyze.
2. A feature extraction mechanism
   The image was pre-processed using a relaxation-labeling algorithm, which received the color
   image and assigned each pixel one of two labels: object or background. By reducing the incoming data
   from three 8-bit values (R, G, B, each in the range 0-255) per pixel to 1 bit (Boolean) per pixel, we removed
   irrelevant information and made the Artificial Neural Network smaller (due to fewer
   input values) and more efficient.
3. A classification scheme
   To classify the data produced by the relaxation-labeling algorithm as a value and a suit,
   we used two Artificial Neural Networks:

   Recognizing the value
   This Artificial Neural Network received an input of 20x40 pixels (800 input neurons) and
   returned the value true in one of 13 output neurons (representing the 13 classifications
   of the card's value).

   Recognizing the suit
   This Artificial Neural Network received an input of 20x20 pixels (400 input neurons) and
   returned the value true in one of 4 output neurons (representing the 4 classifications of
   the card's suit).
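The input and output sizes quoted above follow directly from the image dimensions and the number of classes. The short sketch below illustrates this; the names `flatten`, `layer_sizes`, `VALUE_CLASSES` and `SUIT_CLASSES` are hypothetical and do not come from the project's code, and the row/column orientation of the 20x40 image is an assumption.

```python
# Class lists taken from the report; the identifiers themselves are hypothetical.
VALUE_CLASSES = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUIT_CLASSES = ["spades", "clubs", "hearts", "diamonds"]


def flatten(binary_image):
    """Turn a list of pixel rows (True = object) into a flat 0/1 input vector."""
    return [1.0 if pixel else 0.0 for row in binary_image for pixel in row]


def layer_sizes(rows, cols, classes):
    """Return (input neurons, output neurons) for an image of rows x cols pixels."""
    return rows * cols, len(classes)


# Assumption: the value image is 40 rows by 20 columns, the suit image 20 by 20.
value_image = [[False] * 20 for _ in range(40)]
print(len(flatten(value_image)))           # one input neuron per pixel
print(layer_sizes(40, 20, VALUE_CLASSES))  # sizes of the value network
print(layer_sizes(20, 20, SUIT_CLASSES))   # sizes of the suit network
```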
Relaxation Labeling
The relaxation labeling process used the following properties of the card images:
a. The objects used for labeling were the card image pixels.
b. The labels used were object and background. An object label marks a pixel that belongs to a card's suit or value symbol.
c. In the world our application lives in, the background tends to be white, while objects
are either black or red. We took this into account when determining the
initial confidence for each label.
d. The initial confidence function was the amount of white in the pixel's RGB color
representation: higher values in the red, green and blue channels signify a higher
degree of white. We summed the R, G and B values and divided
the result by 255 * 3. Low RGB values give a result closer to zero, which
means a higher probability of the pixel being labeled object.
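The initial confidence described in item d can be sketched as follows. This is an illustration only: the function names are hypothetical, and treating the object confidence as one minus the background confidence is our assumption, since the report states only that values near zero favor the object label. The iterative relaxation update itself is not shown.

```python
def background_confidence(r, g, b):
    """Whiteness of a pixel in [0, 1]: the sum of R, G and B divided by 255 * 3."""
    return (r + g + b) / (255 * 3)


def initial_confidences(pixel):
    """Initial confidence for both labels of one (r, g, b) pixel.

    Assumption: the two confidences sum to 1.
    """
    r, g, b = pixel
    p_background = background_confidence(r, g, b)
    return {"background": p_background, "object": 1.0 - p_background}


print(initial_confidences((255, 255, 255)))  # white pixel: background
print(initial_confidences((0, 0, 0)))        # black pixel: object
print(initial_confidences((255, 0, 0)))      # pure red pixel: leans towards object
```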
Artificial Neural Networks
An Artificial Neural Network (ANN) is a
computational model based on the way neurons are
connected in the brain. Each individual neuron is a
simple calculation unit which is connected to numerous
other neurons. The network itself is a DAG (Directed
acyclic graph). The neurons are arranged in layers:

Input layer
Each neuron in this layer represents a single input
variable.
In our project, each input neuron represents a
single boolean value belonging to a pixel in the input image.

Output layer
Each neuron in this layer represents a single output variable.
In our project, each output neuron represents a single boolean value belonging to a specific class
value. For example, when classifying playing cards by their suit (spades, clubs, hearts and
diamonds), 4 output neurons are needed where each one represents the input being classified
as a specific suit.

Hidden layers
An ANN without hidden layers is only able to learn to identify linearly separable problems
(problems where the results can be separated as being classified to a single class using a linear
function). Since our problem is more complex, we needed to add hidden layers between the
input and output layers.
We used a single hidden layer in each of the ANNs.
Each neuron is connected by an edge to
neurons in the next layer. Each edge has a
weight which is chosen randomly in the
beginning and then corrected throughout the
learning process. These weights are the
knowledge gained during the learning process
and they allow the network to classify future
inputs.
Each artificial neuron is a basic computing unit capable of simple calculations – it sums the
incoming values and sets the outgoing value based on a threshold value or function.
In order to evaluate and classify an input, the input values are set to the input neurons. The
values are then propagated through the network – each neuron's new value is the sum of incoming
values, each multiplied by its weight. The output values can then be retrieved from the output
neurons.
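The propagation step described above can be sketched in a few lines. The sigmoid activation and the bias terms are assumptions on our part (the report mentions only "a threshold value or function"), and all names are hypothetical.

```python
import math


def sigmoid(x):
    """Smooth threshold function (an assumed choice of activation)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer: weights[j][i] is the edge weight from input i to neuron j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Propagate the inputs through one hidden layer and then the output layer."""
    hidden = layer_forward(inputs, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)


# Tiny example: 2 inputs, 2 hidden neurons, 1 output neuron.
outputs = forward([1.0, 0.0],
                  [[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0],
                  [[1.0, -1.0]], [0.0])
print(outputs)
```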
The learning process
In order to achieve a neural network that is capable of classifying input, it needs to improve its
initial edge weights. We do this through supervised learning. In this process we present the network
with input from a training set. In each iteration of the learning process (or epoch), we iterate
through all the values in the training set in a random order. With each input we let the values
propagate through the network and retrieve an output. If the output is wrong, we adjust the weights
of the network using back-propagation.
In back-propagation we set the error values of the output neurons based on the
difference between the desired and actual outputs. We then propagate the error values back
through the network, updating the weights of the edges. A learning rate determines the rate
at which the weights change.
A smaller learning rate results in more subtle changes to the weights and more exploitation of
good results. A larger learning rate, on the other hand, results in more drastic changes to the
weights and more exploration.
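A minimal sketch of one training epoch with back-propagation is given below. It assumes sigmoid neurons, a squared-error measure, no bias terms and an update after every sample; the report does not specify these details (it mentions adjusting weights when the output is wrong), and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_output):
    """Return (hidden activations, output activations); bias terms omitted for brevity."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_output]
    return hidden, output


def train_step(x, target, w_hidden, w_output, rate):
    """Apply one back-propagation update in place; return the squared error before it."""
    hidden, output = forward(x, w_hidden, w_output)
    # Output error terms: (desired - actual) scaled by the sigmoid derivative.
    delta_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Hidden error terms: output errors propagated back along the edges.
    delta_hid = [
        h * (1.0 - h) * sum(delta_out[k] * w_output[k][j] for k in range(len(w_output)))
        for j, h in enumerate(hidden)
    ]
    # Move each weight in proportion to the learning rate and its error term.
    for k, row in enumerate(w_output):
        for j in range(len(row)):
            row[j] += rate * delta_out[k] * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += rate * delta_hid[j] * x[i]
    return sum((t - o) ** 2 for t, o in zip(target, output))


def train_epoch(samples, w_hidden, w_output, rate, rng):
    """One epoch: visit every training sample once, in random order."""
    order = list(samples)
    rng.shuffle(order)
    return sum(train_step(x, t, w_hidden, w_output, rate) for x, t in order)


rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
w_output = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(1)]
samples = [([1.0, 0.0], [1.0]), ([0.0, 1.0], [0.0])]
print(train_epoch(samples, w_hidden, w_output, 0.1, rng))
```

The `rate` argument plays the role of the learning rate: every weight change is multiplied by it, so a smaller value gives smaller steps per sample.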
Results
After the training of the two ANNs has been completed, a card image can be loaded and the
card's image and classification will be shown on the screen.
The following screenshot is an example of a classification of a card without noise:
The following screenshot is an example of a classification of a card with added noise
(Gaussian noise added using Photoshop):
Conclusions

Using two ANNs instead of one
In the first attempts we used a single ANN that received an image part that included both
the value and suit of the card. This was a much larger image and thus resulted in many more
input neurons. This network didn't produce good results. In fact, during the training phase
the number of errors was usually around 50%.

Decreasing the learning rate of the ANN
During the back-propagation stage of training the ANN, the weight of each edge is
updated based on the error and a learning rate. We found that learning rates of 0.1 and
0.05 produced erratic results and the network could not finish its learning process with zero
errors, whereas lowering the learning rate to 0.005 produced consistently better
results and allowed the learning process to finish.

Using an output neuron for each class
There are two ways of using the output neurons:
1. Each output neuron is a boolean value indicating that the evaluated input belongs to this
class.
For example, for the card suits we will have output neurons O1..O4:
Suit       O1   O2   O3   O4
spades     +    -    -    -
clubs      -    +    -    -
hearts     -    -    +    -
diamonds   -    -    -    +
2. Treat the output neurons as bits and encode the answer using these bits. For example,
for the different suits of cards, we can specify that
spades=0, clubs=1, hearts=2 and diamonds=3.
Then we can encode the output as follows:
Suit       O1   O2
spades     -    -
clubs      -    +
hearts     +    -
diamonds   +    +
We have found that we get better results using the first method for the output neurons.
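The two output encodings compared above can be written out as a short sketch. The function names and the 0.5 decision threshold are hypothetical; only the encodings themselves follow the tables in this section.

```python
SUITS = ["spades", "clubs", "hearts", "diamonds"]


def one_hot(suit):
    """Method 1: one output neuron per class (O1..O4)."""
    return [1 if s == suit else 0 for s in SUITS]


def binary_code(suit):
    """Method 2: the class index written in binary across two neurons (O1, O2)."""
    index = SUITS.index(suit)
    return [(index >> 1) & 1, index & 1]


def decode_one_hot(outputs):
    """Choose the class whose neuron has the largest activation."""
    return SUITS[max(range(len(outputs)), key=lambda i: outputs[i])]


def decode_binary(outputs, threshold=0.5):
    """Threshold each neuron independently, then read the bits as a class index."""
    bits = [1 if o >= threshold else 0 for o in outputs]
    return SUITS[bits[0] * 2 + bits[1]]


for suit in SUITS:
    print(suit, one_hot(suit), binary_code(suit))
```

With the first method a single uncertain neuron can still be outvoted by a clearly stronger one, whereas with the second method one wrong bit always changes the decoded class. This is one possible explanation for the result above, not something the experiments established.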
Resources

Moshe Sipper, Evolutionary Computation and Artificial Life (course), Semester A, 2007/8
http://www.cs.bgu.ac.il/~sipper/courses/ecal081/

A. Tettamanzi & M. Tomassini, Soft Computing: Integrating Evolutionary, Neural, and Fuzzy
Systems, Springer-Verlag, Heidelberg, 2001