Artificial Neural Network Overview - GK

Artificial Neural Network Overview
An artificial neural network is a collection of connected model neurons. Taken one at a
time, each neuron is rather simple. As a collection, however, a group of neurons is capable
of producing complex results. In the following sections I will briefly summarize a
mathematical model of a neuron, a neuron layer, and a neural network before discussing the
types of behavior achievable from a neural network. Finally, I will conclude with a short
description of the program included in this lesson so you can form networks that are
tailored to your class.
Models
The models presented in this section appear fairly difficult mathematically. However,
they eventually boil down to just multiplication and addition. The use of matrices and
vectors simplifies the notation but is not absolutely required for this application.
Neuron Model
A model of a neuron has three basic parts: input weights, a summer, and an output
function. The input weights scale values used as inputs to the neuron, the summer adds
all the scaled values together, and the output function produces the final output of the
neuron. Often, one additional input, known as the bias, is added to the system. If a bias is
used, it can be represented by a weight with a constant input of one. This description is
laid out visually below.
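[Figure: neuron model. The inputs I1, I2, and I3 are scaled by the weights W1, W2, and W3 and summed, together with the bias B (a weight on a constant input of 1), to give x. The output function f(x) then produces the output a.]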
Here I1, I2, and I3 are the inputs, W1, W2, and W3 are the weights, B is the bias, x is
an intermediate output, and a is the final output. The equation for a is given by
$$a = f(W_1 I_1 + W_2 I_2 + W_3 I_3 + B)$$
where f could be any function. Most often, f is the sign of
the argument (i.e. 1 if the argument is positive and -1 if the argument is negative), linear
(i.e. the output is simply the input times some constant factor), or some complex curve
used in function matching (not needed here). For this model we will use the first case,
where f is the sign of the argument, for two reasons: it closely matches the 'all or nothing'
property seen in biological neurons and it is fairly easy to implement.
When artificial neurons are implemented, vectors are commonly used to represent the
inputs and the weights, so the first of two brief reviews of linear algebra is appropriate
here. The dot product of two vectors $\vec{x} = (x_1, x_2, \ldots, x_n)$ and $\vec{y} = (y_1, y_2, \ldots, y_n)$ is given by
$$\vec{x} \cdot \vec{y} = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n .$$
Using this notation the output is simplified to
$$a = f(\vec{W} \cdot \vec{I} + B)$$
where all the inputs are contained in $\vec{I}$ and all the weights are contained in $\vec{W}$.
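The single-neuron equation can be sketched in a few lines of Python (used here instead of Matlab so that the example runs on its own). This is only an illustration: the function names and the numeric values are hypothetical, and mapping an argument of exactly zero to 0 is an assumption, since the text only defines the sign function for positive and negative arguments.

```python
def sign(x):
    """Return 1 for positive x and -1 for negative x.

    Returning 0 for exactly zero is an assumption; the lesson text
    does not say what the neuron should do in that case.
    """
    return (x > 0) - (x < 0)


def dot(u, v):
    """Dot product of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))


def neuron(weights, inputs, bias, f=sign):
    """Compute a = f(W . I + B) for a single neuron."""
    return f(dot(weights, inputs) + bias)


# Hypothetical values chosen only for illustration.
W = [0.5, -1.0, 2.0]
I = [1, 1, 0]
B = 0.25

# x = 0.5*1 + (-1.0)*1 + 2.0*0 + 0.25 = -0.25, so the sign is -1.
a = neuron(W, I, B)
print(a)
```

Passing a different function as `f` (for example a linear one) changes the output function without touching the weighted sum.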
Neuron Layer
In a neuron layer each input is tied to every neuron and each neuron produces its own
output. This can be represented mathematically by the following series of equations:
 
$$a_1 = f_1(\vec{W}_1 \cdot \vec{I} + B_1)$$
$$a_2 = f_2(\vec{W}_2 \cdot \vec{I} + B_2)$$
$$a_3 = f_3(\vec{W}_3 \cdot \vec{I} + B_3)$$
...
NOTE: In general these functions may be different; however, I will take them all to be the
sign of the argument from now on.
Now for our second digression into linear algebra. Recall that to multiply two matrices
you take the dot product of each row of the first matrix with each column of the second
matrix to produce each element of the result. For example, the dot product of the jth row
of the first matrix and the ith column of the second matrix gives the (j,i) element of the
result. If the second matrix has only one column, then the result also has only one
column.
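The row-times-column rule can be written out directly. The sketch below stores a matrix as a list of rows; the function name `matmul` and the sample numbers are made up for illustration.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows.

    Element (j, i) of the result is the dot product of row j of A
    with column i of B.
    """
    if any(len(row) != len(B) for row in A):
        raise ValueError("each row of A must be as long as B has rows")
    n_cols = len(B[0])
    result = []
    for row in A:
        out_row = []
        for i in range(n_cols):
            column = [B[k][i] for k in range(len(B))]
            out_row.append(sum(r * c for r, c in zip(row, column)))
        result.append(out_row)
    return result


# A 2x3 matrix times a one-column matrix gives a one-column result.
A = [[1, 2, 3],
     [4, 5, 6]]
col = [[1],
       [0],
       [-1]]

# Row 1: 1*1 + 2*0 + 3*(-1) = -2.  Row 2: 4*1 + 5*0 + 6*(-1) = -2.
product = matmul(A, col)
print(product)
```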
Keeping matrix multiplication in mind, we stack the weights so that each row of a
matrix represents the weights of one neuron. Now, representing the input vector and the
biases as one-column matrices, we can simplify the above notation to:
$$\vec{a} = f(W \vec{I} + \vec{B})$$
which is the final form of the mathematical representation of one layer of artificial
neurons.
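As a rough illustration of the layer equation, the sketch below applies one row of weights and one bias per neuron to a shared input. The names and numbers are hypothetical, and treating an argument of exactly zero as 0 is an assumption the text does not settle.

```python
def sign(x):
    """1 for positive, -1 for negative; 0 for exactly zero (assumed)."""
    return (x > 0) - (x < 0)


def layer(W, I, B, f=sign):
    """Compute one layer: output j is f(row j of W . I + B[j])."""
    if len(W) != len(B):
        raise ValueError("need one bias per neuron")
    return [f(sum(w * i for w, i in zip(row, I)) + b)
            for row, b in zip(W, B)]


# Hypothetical layer of three neurons, each seeing the same three inputs.
W = [[1.0, -1.0, 0.0],
     [0.5, 0.5, 0.5],
     [-1.0, 0.0, 1.0]]
B = [-0.5, -1.5, -0.5]
I = [1, 0, 1]

# Neuron 1: 1.0 + 0.0 - 0.5 = 0.5   -> 1
# Neuron 2: 0.5 + 0.5 - 1.5 = -0.5  -> -1
# Neuron 3: -1.0 + 1.0 - 0.5 = -0.5 -> -1
a = layer(W, I, B)
print(a)
```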
Neural Network
A neural network is simply a collection of neuron layers where the output of each
previous layer becomes the input to the next layer. So, for example, the inputs to layer
two are the outputs of layer one. In this exercise we are keeping it relatively simple by
not having feedback (i.e. output from layer n being input for some previous layer). To
mathematically represent the neural network we only have to chain together the
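equations. The finished equation for the three-layer network in this exercise is given by:
$$\vec{a} = f(W_3 f(W_2 f(W_1 \vec{I} + \vec{B}_1) + \vec{B}_2) + \vec{B}_3)$$
where $W_k$ and $\vec{B}_k$ now denote the weight matrix and the bias column of layer k.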
Neural Network Behavior
Although transistors now switch in as little as 0.000000000001 seconds and biological
neurons take about 0.001 seconds to respond, we have not been able to approach the
complexity or the overall speed of the brain. This is due, in part, to the large number of
neurons (approximately 100,000,000,000) and to how highly connected they are
(approximately 10,000 connections per neuron). Although not as advanced as biological
brains, artificial neural networks still perform many important functions in a wide range
of applications including sensing, controls, pattern recognition, and categorization.
Generally, networks (including our brains) are trained to achieve a desired result. The
training mechanisms and rules are beyond the scope of this paper; however, it is worth
mentioning that generally good behavior is rewarded while bad behavior is punished.
That is to say that when a network performs well it is modified only slightly (if at all) and
when it performs poorly larger modifications are made. As a final thought on neural
network behavior, it is worth noting that if the output functions of the neurons are all
linear, the network is reducible to a one-layer network, because chaining linear
operations only produces another linear operation. In other words, to have
a useful network of more than one layer we must use a function like the sigmoid (an
S-shaped curve), the sign function we used above, a linear function that saturates, or any
other curve that is not a straight line.
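The reduction of linear layers to a single layer can be checked numerically. The sketch below assumes the simplest linear output function, f(x) = x, and uses made-up integer weights; it shows that two layers give the same output as one layer with weights W2 W1 and bias W2 B1 + B2.

```python
def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[j][k] * B[k][i] for k in range(len(B)))
             for i in range(len(B[0]))]
            for j in range(len(A))]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


# Hypothetical weights and biases for two linear layers.
W1, B1 = [[1, 2], [3, 4]], [1, -1]
W2, B2 = [[0, 1], [2, -1]], [5, 0]
I = [2, 3]

# Two layers with f(x) = x: W2 (W1 I + B1) + B2
two_layer = add(matvec(W2, add(matvec(W1, I), B1)), B2)

# Equivalent single layer: (W2 W1) I + (W2 B1 + B2)
W_eq = matmul(W2, W1)
B_eq = add(matvec(W2, B1), B2)
one_layer = add(matvec(W_eq, I), B_eq)

print(two_layer, one_layer)
```

With the sign function in place of f(x) = x the two expressions no longer agree in general, which is why a non-linear output function is needed for extra layers to be useful.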
Matlab Code
This section covers the parameters in my Matlab code that you might choose to modify if
you decide to create a network with inputs and outputs other than those already
documented in this lesson. Before using my code you should be aware that it was not
written to solve general neural network problems, but rather to find a network by
randomly trying values. This means that it could loop forever even if a solution to your
inputs and outputs exists. If you do not get a good result after a few minutes you may
want to stop the execution and change your parameters. Finally, I will not claim that I
have worked all bugs out of this program so you should check your results carefully
before executing them in a classroom setting.
p1, p2, and p3 are input patterns for three different inputs. Each input pattern consists of
three elements pertaining to different attributes of the input. For example, in my lesson I
used redness, roundness, and softness. Here, for instance, a one in the first position means
that an object is red while a zero indicates that it is not red.
a1, a2, and a3 are output patterns. They need to be initialized to be incorrect (that way the
program enters the loop rather than bypasses it). The second argument of the conditionals
for the loop should be the desired results. In my case, I chose to have one neuron in the
last layer be an indicator for each object. When that object was used as an input for the
network, that neuron would end up being a one while the other neurons in the last layer
would be negative one (if everybody did their math correctly). More explicitly, when the
first element of a1 is not a positive one then it is wrong and I want to do the loop again.
In a similar manner, when the second element of a1 is not a negative one it is wrong and I
want to do the loop again. And the same for the rest of the outputs.
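To make the idea of finding a network by random trial concrete, here is a loose Python sketch. It is not the author's Matlab program: it assumes a single layer of three neurons, hypothetical input patterns, and a cap on the number of attempts (`max_attempts`, an added safeguard so the search cannot loop forever). The names `p1`, `p2`, and `p3` follow the text; everything else is invented for illustration.

```python
import random


def sign(x):
    """1 for positive, -1 for negative; 0 for exactly zero (assumed)."""
    return (x > 0) - (x < 0)


def layer(W, I, B):
    """One layer of sign neurons: output j is sign(row j of W . I + B[j])."""
    return [sign(sum(w * i for w, i in zip(row, I)) + b)
            for row, b in zip(W, B)]


def random_search(patterns, targets, max_attempts=100_000, seed=0):
    """Try random weights until every pattern gives its target output.

    Returns (W, B, attempts) on success, or None if the cap is reached.
    """
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0]), len(targets[0])
    for attempt in range(1, max_attempts + 1):
        W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        B = [rng.uniform(-1, 1) for _ in range(n_out)]
        if all(layer(W, p, B) == t for p, t in zip(patterns, targets)):
            return W, B, attempt
    return None


# Hypothetical patterns: (redness, roundness, softness), 1 = yes, 0 = no.
p1, p2, p3 = [1, 1, 0], [0, 1, 1], [1, 0, 1]
patterns = [p1, p2, p3]

# Desired outputs: one indicator neuron per object, +1 for that object
# and -1 for the others.
targets = [[1, -1, -1], [-1, 1, -1], [-1, -1, 1]]

result = random_search(patterns, targets)
if result is None:
    print("no network found; try more attempts or different parameters")
else:
    W, B, attempts = result
    print("found a network after", attempts, "attempts")
```

Because the search is random, the number of attempts depends on the seed, and a returned `None` means only that the cap was reached, not that no solution exists.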
Note that there is one known bug involving the rounding of decimals that do not terminate
in binary (for example, 0.1 is non-terminating in binary). It is possible that a value
displayed as 0.0000 is actually a tiny positive number, so it is taken to be positive rather
than zero.
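The effect is easy to reproduce in any language that uses binary floating point. The Python sketch below shows one such case and one possible workaround, a sign function with a small tolerance; the tolerance value and the function name are assumptions for illustration, not part of the lesson's program.

```python
def sign(x):
    """1 for positive, -1 for negative, 0 for exactly zero."""
    return (x > 0) - (x < 0)


def sign_with_tolerance(x, tol=1e-9):
    """Treat anything within tol of zero as zero (tol is an assumed value)."""
    if abs(x) <= tol:
        return 0
    return sign(x)


# 0.1, 0.2, and 0.3 cannot be stored exactly in binary, so this
# difference is a tiny positive number rather than exactly zero.
residue = 0.1 + 0.2 - 0.3

print(format(residue, ".4f"))        # displays as 0.0000
print(sign(residue))                 # but the plain sign is 1
print(sign_with_tolerance(residue))  # the tolerant version gives 0
```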