Modelling the neuron

Artificial Neural Networks - what and why?
A biological brain is a massive collection of simple interconnected processing elements called neurons.
The human brain has about 10^11 neurons, each with up to 10^4 interconnections to other neurons.
Neurons are slow (10^-3 sec) compared with microprocessors (10^-9 sec), but their number and
interconnectivity make the brain a massively parallel processor many times more powerful.
The main advantage of the biological brain is its learning capability.
ANNs are inspired by biological brains.
They are a collection of a relatively small number of simple processing elements, also called neurons.
ANNs are not an attempt to build an artificial brain - rather a simplified attempt to replicate the
structure and operation of the brain to solve problems.
Many such problems are difficult to solve with conventional von Neumann computing, simply
because no known algorithm exists, or, even if one does, the computational cost is too high.
For example, the travelling salesman problem for 50 cities requires 1 million centuries at the rate of 1
billion tour calculations per sec.
ANNs are capable of learning, generalising and handling noisy or incomplete data similar to biological
systems.
Representing the biological neuron
A biological neuron adds its inputs and "fires" (produces an output) when the sum exceeds a threshold.
The strengths of the inputs are modified by the synaptic junctions.
The basic model of a biological neuron consists of a summing function with an internal threshold and
weighted inputs.
Total input:
x = Σ wi xi , i = 1..n
Output y is a function fh of the net activation
Σ wi xi − θ , i = 1..n
where θ is known as the "bias" or "offset" for the neuron and may be interpreted as a threshold for the
neuron to fire.
Output y = 1 if the net activation > 0, and y = 0 otherwise.
The bias θ can be represented as an extra input x0 = 1 with weight w0 = −θ, so the output can be written as:
y = fh( Σ wi xi ) , i = 0..n
which is the so-called McCulloch-Pitts model of the neuron.
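As a minimal sketch of this threshold model in Python (the function name, example weights and the AND example below are illustrative assumptions, not part of the original notes):

# Minimal sketch of a McCulloch-Pitts threshold neuron.
# The bias is folded in as weight w0 with a constant input x0 = 1.
def mcculloch_pitts(weights, inputs):
    net = sum(w * x for w, x in zip(weights, inputs))   # net activation
    return 1 if net > 0 else 0                          # hard-limiting output fh

# Example: a neuron computing the logical AND of two binary inputs.
# w0 = -1.5 is the bias weight (threshold 1.5); x0 is always 1.
weights = [-1.5, 1.0, 1.0]
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mcculloch_pitts(weights, [1, x1, x2]))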
Learning in Neurons - the Perceptron
Given a certain input pattern, a neuron can be trained to produce a desired output by adjusting its
weights iteratively.
Hebbian learning achieves this by reinforcing active inputs if the output is correct.
The perceptron learning algorithm:
Let us consider a two-class problem in which the two classes are recognised by outputting either a 0 or
a 1.
Let
wi(t) = weight for input i at time t
θ = threshold value for output
w0 = −θ (the bias), with x0 always equal to 1
1. Initialise wi(0) to small random values.
2. Present the input (x0, x1, ..., xn).
3. Calculate the actual output y(t) = fh( Σ wi(t) xi(t) ) , i = 0..n.
4. Adapt the weights:
If the output is correct, leave the weights unchanged: wi(t + 1) = wi(t)
If the output is 0 but should be 1: wi(t + 1) = wi(t) + xi(t)
If the output is 1 but should be 0: wi(t + 1) = wi(t) − xi(t)
It can be shown that the perceptron weight vector does eventually align itself with the ideal
weight vector and does not oscillate indefinitely.
In a modified version of the above algorithm, a gain term 0 ≤ η ≤ 1 is introduced to control the rate of
convergence (smaller steps are taken towards the solution):
If the output is 0 but should be 1: wi(t + 1) = wi(t) + η xi(t)
If the output is 1 but should be 0: wi(t + 1) = wi(t) − η xi(t)
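A possible Python sketch of the perceptron learning loop above, including the gain term η (written as eta); the training set, gain value, epoch count and random initial range are illustrative assumptions:

import random

def train_perceptron(samples, n_inputs, eta=0.25, epochs=20):
    # Step 1: initialise the weights (including the bias weight w0) to small random values.
    w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for x, d in samples:                            # d is the desired output
            x = [1] + list(x)                           # prepend x0 = 1 so w0 acts as the bias
            net = sum(wi * xi for wi, xi in zip(w, x))
            y = 1 if net > 0 else 0                     # step 3: actual output
            if y == 0 and d == 1:                       # output 0, should be 1
                w = [wi + eta * xi for wi, xi in zip(w, x)]
            elif y == 1 and d == 0:                     # output 1, should be 0
                w = [wi - eta * xi for wi, xi in zip(w, x)]
            # otherwise the output is correct and the weights are left unchanged
    return w

# Example: learning the (linearly separable) logical OR function.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(or_samples, n_inputs=2))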
The Widrow-Hoff delta rule
In this version of the learning algorithm, the weight adjustments are made in proportion to the
error δ, the difference between the desired output and the actual output. The error term δ is given by
δ = d(t) − y(t)
where d(t) is the desired response of the system and y(t) is the actual response. The weight
adjustment is given by
wi(t + 1) = wi(t) + η δ xi(t)
wi remains unchanged if the output is correct, since then δ = 0.
Neurons using the Widrow-Hoff learning rule are called ADALINEs (ADAptive LInear
NEurons).
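A hedged Python sketch of the same training loop using the Widrow-Hoff update; the gain value, epoch count and zero initialisation are assumptions for illustration:

def train_adaline(samples, n_inputs, eta=0.1, epochs=50):
    # Widrow-Hoff (delta rule): adjust every weight in proportion to the error delta = d - y.
    w = [0.0] * (n_inputs + 1)                          # w0 is the bias weight, x0 = 1
    for _ in range(epochs):
        for x, d in samples:
            x = [1] + list(x)
            y = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
            delta = d - y                               # error term; 0 when the output is correct
            w = [wi + eta * delta * xi for wi, xi in zip(w, x)]
    return w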
The inputs to a perceptron and their associated weights can be considered as n-dimensional vectors X and W,
giving the weighted sum
Σ wi xi = W·X
W can be regarded as defining the decision boundary that partitions the pattern space. In the perceptron learning
process, W starts from a random orientation and iteratively converges to the correct one.
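As an illustration of this vector view (the weight values below are assumed for demonstration only):

def classify(W, X):
    # Decision rule in vector form: class 1 if W.X > 0, class 0 otherwise.
    # The points with W.X = 0 form the decision boundary (a hyperplane in pattern space).
    return 1 if sum(w * x for w, x in zip(W, X)) > 0 else 0

# With W = (-1.5, 1, 1) and X = (1, x1, x2), the boundary is the line x1 + x2 = 1.5.
print(classify((-1.5, 1.0, 1.0), (1, 1, 1)))   # above the line -> 1
print(classify((-1.5, 1.0, 1.0), (1, 1, 0)))   # below the line -> 0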
Limitations of perceptrons
A single-layer perceptron can only solve linearly separable problems. An example of a problem that is
not linearly separable is the exclusive-OR (XOR) problem.
Fig. The XOR problem in pattern space. It is impossible to draw a straight line to separate the two
points corresponding to '1' outputs from those corresponding to '0' outputs.
Nonlinearly separable problems can be solved by multilayer perceptrons proposed by Rumelhart and
McClelland in 1986.
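As an illustration of why an extra layer helps, the following Python sketch uses hand-chosen weights (an assumed, textbook-style choice, not taken from these notes) to build a two-layer network of threshold neurons that computes XOR:

def step(net):
    return 1 if net > 0 else 0

def xor_net(x1, x2):
    # Hidden layer: h1 fires for OR, h2 fires for AND.
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # OR
    h2 = step(1.0 * x1 + 1.0 * x2 - 1.5)    # AND
    # Output: fires for "OR and not AND", i.e. exclusive OR.
    return step(1.0 * h1 - 2.0 * h2 - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))          # prints the XOR truth table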