
2.2. How to make an artificial neuron?
(translation by Agata Barabasz agata.barabasz@op.pl)
The basic “building material” that we use to create a neural network is the artificial neuron.
Now we will try to get to know it more precisely. In the previous chapter you saw some pictures illustrating the shape of a biological neuron, but it will do no harm to recall one more picture, so look at picture 2.4 to see what an exemplary neuron looks like (in simplified form).
Picture 2.4. The approximate structure of a biological nerve cell (a neuron)
So that you do not think that all real neurons look exactly like that, in picture 2.5 I show you one more illustration of a real biological neuron, dissected out of a rat’s cerebral cortex.
Picture 2.5. A microscopic preparation of a real neuron
In this picture it is hard to guess which of the many visible fibres is the axon, which is always single and is the only one that carries signals from the given neuron to all the others, and which fibres play the role of dendrites.
Nevertheless, this too is a real biological neuron, and it is exactly this kind of cell that our artificial neuron, which we will now examine more closely, has to model well.
The artificial neurons used in network engineering are, of course, very simplified models of the nerve cells that occur in nature.
The structure of an artificial neuron is best illustrated by the scheme presented in picture 2.6.
Comparing this illustration with pictures 2.4 or 2.5, you will realise how far neural network researchers simplify biological reality.
Picture 2.6. A general scheme of an artificial neuron shows the degree of its simplification
However, in spite of these simplifications, artificial neurons retain all the features that matter from the point of view of the tasks we want to entrust to them within the networks we build, which are computer science tools rather than models of biology.
• Firstly, they are characterised by having many inputs and one output. The input signals xi (i = 1, 2, …, n) and the output signal y may take only numerical values, generally in the range from 0 to 1 (sometimes also from −1 to +1), whereas the fact that within the tasks solved by networks they represent some information (e.g., as the output, a decision about who has been recognised by the neural network that has been analysing someone’s photo) is the result of a specific convention. Generally, particular meanings are ascribed to the network’s input and output signals in such a way that what matters most is on which input or output a given signal has occurred (each input and output is associated with a specific meaning of a signal); additionally, signal scaling is used, chosen so that the signal values circulating in the network do not leave the agreed range, e.g. from 0 to 1.
• Secondly, artificial neurons perform specific operations on the signals they receive on their inputs, and as a consequence they produce signals (only one per neuron) that appear on their outputs and are sent onward (to other neurons, or to the network’s output, as the solution of the posed problem). The network’s task, reduced to the functioning of its basic element, the neuron, consists in transforming the input data xi into a result y by applying rules that follow from how the neuron has been built and what it has been taught. The neuron’s properties considered up to this point are illustrated in picture 2.7.
Picture 2.7. Basic signals occurring in a neuron
• Thirdly, neurons can learn. This purpose is served by the coefficients wi, called synaptic weights. As you certainly remember from the previous chapter, these reflect the rather complicated biochemical and bioelectric processes that take place in the synapses of a real biological neuron. From the point of view of further considerations, the most significant fact is that the synaptic weights can be modified (i.e. their values can be changed), which constitutes the basis for teaching networks. A scheme of a neuron capable of learning is shown in picture 2.8.
Picture 2.8. Adding adjustable weight coefficients to a neuron’s structure makes it a learnable unit
Summing up this discussion, it may be stated that artificial neurons can be treated as elementary processors with the following features:
• each neuron receives many input signals xi and on their basis determines its own “answer” y, that is, it produces one output signal;
• a parameter called a weight wi is associated with each separate input of the neuron; the name expresses the degree of significance of the information arriving at the neuron through that particular input;
• a signal coming in through a particular input is first modified using the weight of that input. Most often the modification consists in simply multiplying the signal by the weight of the given input, so that in further calculations it already participates in a modified form: strengthened (if the weight is greater than 1) or attenuated (if the weight’s value is less than 1). A signal from a particular input may even appear in a form opposite to the signals from the other inputs, if its weight has a negative value. Inputs with negative weights are referred to by neural network users as inhibitory inputs, whereas those with positive weights are called excitatory inputs;
• the input signals (modified by the appropriate weights) are aggregated in the neuron (see picture 2.9 and the short sketch that follows it). Considering networks in general, many ways of aggregating the input signals can be given, but most often the aggregation simply consists in summing the signals, giving as the result a useful internal signal called the cumulative stimulation of the neuron or the postsynaptic stimulation. This signal may also be referred to as the net value.
Picture 2.9. An aggregation of input data as the first of the neuron’s internal functions (the internal signal is s = g(wi, xi), i = 1, …, n)
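To make the most common variant of this aggregation concrete, here is a minimal sketch in Python (the function name and the example numbers are my own illustration, not taken from the book): each input signal is multiplied by the weight of its input and the products are summed, giving the internal signal s (the net value). Note how a negative weight turns its input into an inhibitory one.

    def aggregate(inputs, weights):
        # net value: the sum of the input signals multiplied by their weights
        return sum(x * w for x, w in zip(inputs, weights))

    # example: the third input has a negative weight, so it acts as an inhibitory input
    x = [0.5, 0.9, 0.3]
    w = [1.2, 0.7, -2.0]
    s = aggregate(x, w)   # 1.2*0.5 + 0.7*0.9 - 2.0*0.3 = 0.63
    print(s)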
• to the sum of signals created in this way the neuron sometimes adds (not in all types of networks, but fairly often) an extra component independent of the input signals, called a bias. If the bias is taken into account, it also undergoes the learning process, which is why it can be imagined as an extra synaptic weight associated with an input that is fed an internal signal of constant value equal to 1 (see the short sketch after picture 2.10). The role of the bias is that, thanks to its presence, the neuron’s properties can be shaped much more freely during learning (without it, the characteristic of the aggregation function must always pass through the origin of the coordinate system, which is sometimes a burdensome “anchor”). A scheme of a neuron in which a bias has been taken into account is shown in picture 2.10;
Picture 2.10. The application of the additional parameter, which is the bias
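A minimal sketch of this equivalence (the variable names and numbers below are my own illustrative choices): computing the stimulation with an explicit bias gives exactly the same value as treating the bias as one more weight attached to a constant input equal to 1.

    def stimulation_with_bias(inputs, weights, bias):
        # explicit bias added to the weighted sum
        return sum(x * w for x, w in zip(inputs, weights)) + bias

    def stimulation_extra_weight(inputs, weights, bias):
        # the same bias treated as an extra weight on a constant input of 1
        return sum(x * w for x, w in zip(inputs + [1.0], weights + [bias]))

    x, w, b = [0.2, 0.8], [1.5, -0.4], 0.3
    print(stimulation_with_bias(x, w, b))     # 0.2*1.5 - 0.8*0.4 + 0.3 = 0.28
    print(stimulation_extra_weight(x, w, b))  # identical value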
• the sum of the input signals multiplied by the weights, plus (possibly) the bias, may sometimes be sent directly to the neuron’s axon and treated as its output signal. In many types of networks that is enough; this is how so-called linear networks work (for example the network named ADALINE = ADAptive LINEar). However, in networks with richer abilities (for example in the very popular networks called MLP, from the words Multi-Layer Perceptron) the neuron’s output signal is calculated by means of some nonlinear function. Throughout this book we will denote this function by the symbol ƒ( ) or φ( ). A scheme of a neuron including both the aggregation of the input signals and the generation of the output signal is presented in picture 2.11;
Picture 2.11. The full set of the neuron’s internal functions
• the function φ( ) is called the characteristic of the neuron (its transfer function). Many different neuron characteristics are known, as picture 2.12 illustrates (a short sketch of a few of them follows the picture). Some of them are chosen so that the artificial neuron’s behaviour is as similar as possible to the behaviour of a real biological neuron (the sigmoid function), but they may also be selected so as to ensure the maximum efficiency of the computations carried out by the neural network (the Gaussian function). In all cases the function φ( ) constitutes an important link between the joint stimulation of the neuron and its output signal;
Picture 2.12. Some of the more often used neuron characteristics
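As a rough illustration of the characteristics mentioned above (the particular formulas, such as the slope of the sigmoid or the width of the Gaussian bell, are my own example choices rather than prescriptions from the book), the sketch below applies a linear (identity), a sigmoid and a Gaussian characteristic to the same aggregated stimulation s:

    import math

    def linear(s):
        # identity characteristic: the stimulation passes straight to the output
        return s

    def sigmoid(s):
        # S-shaped characteristic: squeezes any s into the range (0, 1)
        return 1.0 / (1.0 + math.exp(-s))

    def gaussian(s):
        # bell-shaped (non-monotonic) characteristic: largest for s close to 0
        return math.exp(-s * s)

    for s in (-2.0, 0.0, 2.0):
        print(s, linear(s), sigmoid(s), gaussian(s))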
• knowledge of the input signals, the weight coefficients, the method of aggregating the inputs and the neuron’s characteristic allows its output signal to be determined unequivocally at any moment, under the usual assumption that (in contrast to what happens in real neurons) this process takes place instantaneously. Thanks to this, in artificial neural networks changes of the input signals appear practically immediately at the output. Of course, this is a purely theoretical assumption, because after a change of the input signals, even in an electronic realization, some time would be needed for the appropriate integrated circuit to settle at the right value of the output signal. Much more time would be necessary to achieve the same effect in a network working as a simulation model, because a computer imitating the network’s activity must then calculate the values of all the signals on all the neurons’ outputs of this network, which even on very fast computers can take a lot of time. In speaking about the neuron’s prompt action I mean that, when considering the network’s functioning, we will not pay attention to the neuron’s reaction time, because it will be insignificant for us. The complete structure of a single neuron is presented in picture 2.13.
Picture 2.13. Structure of a neuron as a processor, which is the basis for building neural networks
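Putting the pieces together, the complete neuron of picture 2.13 can be summarised in one small function that first aggregates the weighted inputs (plus an optional bias) and then passes the result through the chosen characteristic. This is only a sketch under my own naming assumptions; the sigmoid used as the default characteristic is just one of the choices discussed above, and replacing it with the identity turns the unit into a linear neuron.

    import math

    def neuron(inputs, weights, bias=0.0, phi=lambda s: 1.0 / (1.0 + math.exp(-s))):
        # aggregation: weighted sum of the input signals plus the bias
        s = sum(x * w for x, w in zip(inputs, weights)) + bias
        # characteristic: the transfer function applied to the joint stimulation
        return phi(s)

    print(neuron([0.5, 0.2], [1.0, -0.5], bias=0.1))                   # nonlinear (sigmoid) output
    print(neuron([0.5, 0.2], [1.0, -0.5], bias=0.1, phi=lambda s: s))  # linear neuron output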
The neuron presented in picture 2.13 is the most typical “material” used for creating a network. More precisely, such typical “material” is the neuron of the network known as the MLP (Multi-Layer Perceptron), whose most crucial elements I have collected and presented in picture 2.14. It is visible in this picture that the MLP neuron is characterised by an aggregation function consisting of simply summing up the input signals multiplied by the weights, and that it uses a nonlinear transfer function with a distinctive sigmoid shape.
Picture 2.14. The most popular component of neural networks – the MLP-type neuron, with the aggregation s = Σ wi·xi (i = 1, …, n) and the sigmoid output y = 1/(1 + exp(−0.5·s))
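A minimal sketch of such an MLP-type neuron (the variable names are my own): the aggregation is a plain weighted sum and the output is obtained from the sigmoid with the slope coefficient 0.5 shown in picture 2.14.

    import math

    def mlp_neuron(inputs, weights):
        # aggregation: s = sum of wi * xi
        s = sum(x * w for x, w in zip(inputs, weights))
        # sigmoid characteristic with slope coefficient 0.5, as in picture 2.14
        return 1.0 / (1.0 + math.exp(-0.5 * s))

    print(mlp_neuron([1.0, -2.0, 0.5], [0.8, 0.1, 1.2]))  # always a value between 0 and 1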
However, in neural networks built for special purposes, so-called radial neurons are sometimes used. They have an atypical method of aggregating the input data, they use an untypical characteristic (a Gaussian one) and they are taught in an unusual way. At this moment I do not intend to elaborate on these specific neurons, which are used mainly to create special networks called RBF (Radial Basis Functions) networks, but in picture 2.15 I present a scheme of such a radial neuron, to enable you to compare it with the typical neuron discussed earlier and shown in picture 2.14.
In this type of neuron the aggregation of the input signals consists in evaluating the distance ||x − t|| between the present input vector X and the centroid T of a certain subset, determined during the teaching process. The nonlinear transfer function in these neurons also has a different form: a “bell-shaped” Gaussian, i.e. a non-monotonic function.
Picture 2.15. The structure and peculiar properties of a radial neuron, also denoted as RBF
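For comparison with the MLP sketch above, here is a minimal sketch of such a radial neuron (the Euclidean form of the distance and the width parameter r are my own illustrative assumptions about the general idea shown in picture 2.15): the aggregation measures how far the input vector lies from the stored centroid, and the output is produced by a bell-shaped Gaussian characteristic.

    import math

    def rbf_neuron(x, centroid, r=1.0):
        # aggregation: Euclidean distance between the input vector and the centroid
        dist = math.sqrt(sum((xi - ti) ** 2 for xi, ti in zip(x, centroid)))
        # bell-shaped (Gaussian) characteristic: strongest response near the centroid
        return math.exp(-(dist / r) ** 2)

    print(rbf_neuron([0.9, 1.1], centroid=[1.0, 1.0]))   # close to the centroid -> output near 1
    print(rbf_neuron([3.0, -2.0], centroid=[1.0, 1.0]))  # far from the centroid -> output near 0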