9. Self-learning neural networks

9.1. What is the idea of a self-learning network?
(Translated by Barbara Borowik, cnborowi@cyf-kr.edu.pl)
You have seen (and tried out in several programs!) how a neural network, guided by a teacher, gradually perfects its operation until its behavior matches the patterns given by the teacher. Now I will show you that a network can also learn entirely on its own, without a teacher. I already wrote about this subject in section 3.2, but now the time has come for you to learn a few more facts. You should know that several such self-learning methods exist (the algorithms of Hebb, Oja, Kohonen, etc.), but in this book we do not go into the details of these algorithms; we will only try to see what the whole idea is about. For this purpose I suggest you run the Example 10a program.
In this program you are dealing with a network of neurons that consists of only one, but possibly quite extensive, layer (fig. 9.1). All the neurons of this layer receive the same input signals, and each neuron determines (independently of the others) the value of its degree of excitation by multiplying the input signals (identical for all neurons) by its weighting factors (which are, of course, different in each neuron).
Fig. 9.1. A structure of a single-layer network that may be subject to self-learning.
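If you prefer to think in terms of code, here is a minimal sketch in Python (my own illustration, not the source of Example 10a) of such a single layer: every neuron receives the same input vector and computes its excitation by multiplying that input by its own weights.

    import numpy as np

    # Hypothetical sizes: 30 neurons (the default mentioned later in the text)
    # and two-dimensional input signals, as in the plane interpretation below.
    n_neurons, n_inputs = 30, 2

    rng = np.random.default_rng()
    weights = rng.uniform(-1.0, 1.0, size=(n_neurons, n_inputs))  # one weight vector per neuron

    def excitations(x, weights):
        """Each neuron multiplies the shared input by its own weights and sums the products."""
        return weights @ x                      # vector of n_neurons responses

    x = np.array([0.4, -0.7])                   # the same input is delivered to every neuron
    print(excitations(x, weights))              # positive, negative or near-zero values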
As you remember, the excitation of a neuron is therefore stronger the better the match between the input signals delivered at a given moment and the neuron's internal pattern. Knowing the values of a neuron's weight factors thus allows you to tell to which input signals it will react positively and to which negatively. If we assume that the input signals are two-dimensional (i.e. have two components), you may mark them as points on a plane (do you remember, we called it the input space?). The weights of all the neurons are then also two-dimensional (the number of weights must always be the same as the number of components of the signal), and these weights can likewise be marked as points on a plane, most conveniently on the same plane. In this way, points will appear in the plane of the input signals indicating the locations of particular neurons, as a result of their internal knowledge, that is, of the values taken by their weighting factors. I remind you of this in Figure 9.2, which you have already seen in one of the previous chapters, but now, richer in so many new experiences, you will probably look at it differently.
(The figure marks the point representing the neuron's properties, the point representing the input signal, the values of the first and second weight of the neuron, the first and second component of the signal, and the angle deciding about the neuron's response.)
Fig. 9.2. The interpretation of the relationship between the weight vector and the input vector
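To make the role of the angle from Fig. 9.2 concrete, here is a tiny numerical illustration (the values are mine, not taken from the program): the response w·x equals |w|·|x|·cos φ, so the smaller the angle between the weight vector and the input vector, the stronger the positive response.

    import numpy as np

    w = np.array([0.8, 0.3])   # a neuron's weight vector (a point in the plane)
    x = np.array([0.7, 0.5])   # an input signal (another point in the same plane)

    response = w @ x
    cos_angle = response / (np.linalg.norm(w) * np.linalg.norm(x))
    print(response, cos_angle)  # both positive: the two vectors point in similar directions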
The self-learning principle is that at the beginning all the neurons are given random values of their weighting factors, which in the figure produced by the Example 10a program you will see as a "cloud" of randomly scattered points. Each of these points represents a single neuron; as usual, the location of the points results from the values of the weighting factors. Figure 9.3 shows you this (the window on the right side). As for the window on the left side of fig. 9.3, as in the previous programs this window appears right after starting the program, and it lets you modify the network parameters. There are two parameters you can set: the number of neurons in the network (more neurons can produce more interesting behavior, but then it is more difficult to keep track of them), and the parameter named Etha, which represents the intensity of the learning process (higher values result in faster and more energetic learning). You may change these parameters at any time; at the beginning, however, I suggest you accept the values proposed by the program. I selected these default settings after initially examining the program's behavior. We will return to the descriptions of the individual parameters and to "experimenting" with their values later in this chapter.
After starting the program you will have a network composed of 30 neurons, characterized by a rather moderate enthusiasm for learning. All the weights of all the neurons in the network will have random values. Input signals then begin to be fed into this randomly initialized network; they are examples of the sample objects the network is dealing with (do you perhaps remember the example from the third chapter, with Martian-men, Martian-women and Martian-kids?). In Example 10a you will have an influence on the appearance of these objects. This is a little at odds with the idea of self-learning (after all, if the network is supposed to learn by itself, then you should have nothing to do!); only in later experiments, however, will the process of presenting the objects of the learning sequence (without any information about what needs to be done with them!) run, just like in real applications, independently of you and at random. In Example 10a I have intentionally deviated from this ideal, because I would like you to observe for yourself how the way the learning objects appear (which ones more often, which ones less often...) affects the process of independent "knowledge acquisition" by the network.
Some explanation is needed here, however. All the knowledge the network can gain during self-learning must lie in the objects presented to it at the input, and more precisely in the fact that these objects form certain classes of similarity; this means that among them there are some which are similar to each other (Martian-men) and different from those belonging to another class (Martian-women). It follows that neither you (doing it manually) nor even an automatic generator (which will help you with this work in the next program) can show objects that are randomly scattered over the whole input signal space. The presented objects should form explicit clusters around certain detectable centers which can be named and interpreted ("typical Martian-man", "typical Martian-woman", etc.).
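A rough sketch of what such a generator might do is shown below (the prototype coordinates and the spread are my assumptions, not the values actually used in Example 10a): each class gets one prototype per quadrant, and every presented object is the prototype plus a small individual deviation.

    import numpy as np

    rng = np.random.default_rng()

    # One assumed "typical" representative per quadrant of the input space.
    prototypes = {
        1: np.array([ 0.6,  0.6]),   # quadrant I:   "Martian-men"
        2: np.array([-0.6,  0.6]),   # quadrant II:  "Martian-women"
        3: np.array([-0.6, -0.6]),   # quadrant III: "Martian-kids"
        4: np.array([ 0.6, -0.6]),   # quadrant IV:  "Martianamors"
    }

    def make_object(quadrant, spread=0.1):
        """One learning object: the class prototype plus small individual features."""
        return prototypes[quadrant] + rng.normal(0.0, spread, size=2)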
Fig. 9.3. The initial window with parameters (on the left) and the location of the points representing the neurons in the program Example 10a after accepting the network parameters and selecting Next (the window on the right)
While exploring Example 10a you will decide each time what to show to the network, but this will be only a "rough" choice of the object, because its details will be determined for you by the automaton. More precisely, you will be able to indicate the quadrant1 within which the next presented object should be located. All you have to do is click the mouse on the chosen quadrant. In this way you will control what the network will "see". You can agree ("with yourself"!) that the objects shown in the first quadrant belong to Martian-men, in the second to Martian-women, in the third to Martian-kids and in the fourth to Martianamors2. With a little imagination you can see the window displayed by the program Example 10a in the manner shown in Figure 9.4, where Martians' silhouettes are shown instead of the points representing their measured features.
1
The term "quadrant" (of some space defined on the basis of a two-dimensional coordinate system) comes from mathematics, and I have promised that mathematics will not be required for reading this book. So if you have not yet learned in mathematics (or have learned and forgotten) what a coordinate system is and how it divides its signal space into quadrants, then simply assume that the quadrants are numbered starting from the upper right corner of the considered space. This upper right quadrant, where the variables on both axes take positive values only, is numbered as the first. The numbering then goes counter-clockwise through the consecutive quadrants. So the second quadrant is the upper left one, the third lies just under it, and so on. If you would like to see how the quadrants are situated (with their numbering) in a picture, then go a few pages forward and find Figure 9.12. In the picture presenting the window (lower right corner of the window) you can see a set of four buttons with numbers corresponding to the described quadrants. You will learn how to use these buttons later; for now just look at how they are arranged and numbered. This will help you keep track of the remaining part of the section.
2
Martianamors are representatives of the third gender present on Mars. As is generally known, the inhabitants of Mars merge into triangles in order to have offspring. Being busy with their complex sexual lives, they have not yet built a technical civilization and do not form large structures, and therefore observations from Earth are not able to detect them.
Fig. 9.4. Representation of the points of the input signal space as "Martians"
As you can see in the picture, every Martian is a little different, but they are similar to each other, and therefore their data are represented in the same area of the input signal space (this will help the neural network learn what a "typical Martian" looks like). However, since each of these Martians has its own individual characteristics, distinguishing it slightly from the other Martians, the points representing them during the self-learning process will differ a little from each other; that is, they will be located in slightly different places of our coordinate system. But since the features the Martians have in common outweigh the features that distinguish each of them individually, the points representing them will concentrate in a certain subarea of the considered input signal space.
When generating the objects used in the learning process, you will be able to decide whether a Martian-man, a Martian-woman, a Martian-kid or a Martianamor is to be shown next. You will not, however, give any details of any particular object (e.g. there is no need to describe the height or the eye color of each Martian), because the program will do this automatically. At each learning step the program will ask you about the type of the considered Martian, that is, to which quadrant the presented object belongs. When you decide and choose a quadrant, the program will generate an object of that specific class and will show it to the network. Each Martian shown will differ from the others, but you will easily notice that their features (associated with the location of the corresponding points in the input signal space) will be clearly concentrated around certain "prototypes". Perhaps this "magical" process of generating objects and presenting them to the network seems a little complicated to you now, but after a short time of working with Example 10a you will get used to it, and additionally some tips will appear on the screen to help you better understand what is going on.
At the moment a specific input signal belonging to the learning set appears, you will see it on the screen as a green square (try to guess why I chose the color green), much larger than the points identifying the locations of the individual neurons (Figure 9.5). After receiving the signal at their inputs, all the neurons of the self-learning network determine their output signals (based on the components of this input signal and their weights). These signals can be positive (the neuron "likes" this object: red points on the screen) or negative (the neuron "doesn't like" this object: blue points on the screen). If the output signals of some neurons (whether positive or negative) have low values, the relevant points take the color gray (the corresponding neurons have an "indifferent attitude" to this object).
Fig. 9.5. Situation after the appearance of the input learning object (“Martian”)
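The coloring rule just described can be summarized in a few lines (the threshold is an assumption of mine; the program decides in its own way when a response counts as "low"):

    def response_colour(output, threshold=0.2):
        """Red for a clearly positive response, blue for a clearly negative one, gray for indifference."""
        if abs(output) < threshold:          # weak response: an "indifferent attitude"
            return "gray"
        return "red" if output > 0 else "blue"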
All the neurons of the self-learning network correct their weights on the basis of the input signals and the established output values. The behavior of each neuron during the correction of its weights depends on the value of its output (its response to the stimulation). If the neuron's output was strongly positive (red on the screen), its weights change in such a way that the neuron approaches the point it liked. This means that if the same point is presented again, the neuron will receive it even more enthusiastically (the output signal will have an even larger positive value). Neurons which showed a negative attitude to the currently presented pattern (blue points on the screen) will be repelled from it, and their weights will induce an even more negative response to this type of input in the future. In Example 10a you can see the effect of negatively oriented neurons "escaping" outside the viewing area, in which case the points representing them are no longer displayed. This is a natural result if one takes into account the fact that in the self-learning process of a neural network, equally strong "attraction" and "repulsion" are typically used.
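A hedged sketch of this correction (my own formulation of the attraction/repulsion idea described above, not the exact formula implemented in Example 10a) could look as follows; eta plays the role of the Etha parameter from the parameter panel.

    import numpy as np

    def self_learning_step(weights, x, eta=0.1):
        """Move each weight vector toward the input when the neuron's response is positive
        (attraction) and away from it when the response is negative (repulsion)."""
        outputs = weights @ x                                  # every neuron's response to the input
        weights = weights + eta * outputs[:, None] * (x - weights)
        return weights, outputs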
Fig. 9.6. Weight correction caused by the self-learning process
You can observe very well the processes of moving the points that show the positions of the neurons' weight vectors, because at each step the program presents an image showing both the previous (colored) and the new (empty squares) locations of the points representing the neurons (Figure 9.6). What is more, the old location is connected to the new one with a dotted line, so you can follow the "path" made by the neurons during the learning. In the next step the "new" neuron weights become the "old" ones, the whole cycle begins again (Figure 9.7), and a new object is presented to the network, perhaps belonging to a new class (Figure 9.8).
Fig. 9.7. Another iteration of the self-learning process
Fig. 9.8. Interpretation of the next iteration of the self-learning process
If you repeat this process long enough, a cluster of neurons will appear in each quadrant, specialized in the recognition of the typical objects belonging to that group (Figure 9.9). In this way the network will specialize in recognizing all the types of objects it encounters. Following the process of self-learning, you will certainly see that it only strengthens the "innate tendencies" of the neural network, expressed in the initial (random) spreading of the weight vectors. Because of these initial differences, when an input object appears some neurons respond to it positively and others negatively, and then the weights of all the neurons are corrected in such a way that, as a result, the network learns to recognize the common elements appearing at the input and effectively improves its recognition ability without any interference from a teacher, who may not even know how many classes of objects there are to recognize or what their characteristics are.
Fig. 9.9. The final phase of the self-learning process
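Putting the earlier sketches together (make_object and self_learning_step from the previous code fragments, plus parameter values that I have simply assumed), the whole self-learning process can be simulated in a short loop; after enough presentations the weight vectors gather into one cluster near each prototype.

    import numpy as np

    rng = np.random.default_rng()
    weights = rng.uniform(-1.0, 1.0, size=(30, 2))   # random "innate tendencies"

    for step in range(500):
        quadrant = rng.integers(1, 5)    # here the class is drawn at random;
        x = make_object(quadrant)        # in Example 10a you choose it by clicking a quadrant
        weights, _ = self_learning_step(weights, x, eta=0.1)
        # keep the "repelled" neurons from running off to infinity
        # (in the program they simply leave the viewing area)
        weights = np.clip(weights, -5.0, 5.0)

    # Most rows of `weights` should now lie close to one of the four class prototypes,
    # while the repelled neurons sit at the edges of the clipped region.
    print(weights)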
To ensure that the self-learning process can proceed efficiently it is required, however, to provide the necessary initial diversity in the neuron population. If the weight values are randomized in a way that assures the needed diversity in the population of neurons, then the self-learning process proceeds relatively easily and smoothly. If, however, all the neurons have a similar "innate passion", it will be very difficult for them to handle all the emerging classes of objects. In this case the phenomenon of "collective fascination" becomes more likely, when all the neurons "like" one selected object and completely ignore all the others. Usually, at the end of such a process all the neurons become specialized in recognizing one selected class of objects, completely ignoring all the other classes.
In Example 10a this diversity was guaranteed as self-evident and immutable. Since the problem of the initial random distribution of neurons is a matter of primary importance, I have prepared a variant of the program called Example 10ax, where you can explicitly request either a random spreading of the features widely throughout the input space, or that the initial positions of the neurons be drawn from a narrow range. To do this, the state of the field (checkbox) called Density must be set appropriately in the initial panel (left screen in Figure 9.10). At the beginning it is better to choose widely spread neurons and therefore leave the Density field empty. Once you are familiar with the basic regularities of the network self-learning process and ready to see something new indeed, check this box. Then you will see for yourself how important the initial random scattering of the neurons' parameters (weights) is. Figures 9.10, 9.11 and 9.12 show the network self-learning process in cases when the initial neuron weights are insufficiently differentiated.
Fig. 9.10. The initial window with parameters (on the left) and an example of the effect of drawing the initial values of the weight vectors in the program Example 10ax after accepting the network parameters, in particular with the Density field selected. In this case, after clicking the Next button, we get a task where the initial diversity of the weight values is too small (window on the right)
Fig. 9.11. Self-learning process in program Example 10ax – all the neurons are attracted by the same attractor
Fig. 9.12. Self-learning process in program Example 10ax – all the neurons are repelled by the same attractor
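In terms of the sketches above, the difference controlled by the Density field can be pictured like this (the concrete ranges are mine and only illustrate "wide" versus "narrow"):

    import numpy as np

    rng = np.random.default_rng()
    n_neurons = 30

    wide   = rng.uniform(-1.0,  1.0,  size=(n_neurons, 2))   # Density unchecked: well-diversified start
    narrow = rng.uniform(-0.05, 0.05, size=(n_neurons, 2))   # Density checked: almost identical "innate passions"

Starting the learning loop from the narrow variant makes the "collective fascination" described above far more likely.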
You can use the programs Example 10a and Example 10ax to carry out a series of studies. You can examine, for example, the effects of a somewhat more "energetic" self-learning process. You will notice this after changing (especially increasing, but carefully) the value of the learning rate, the Etha factor.
Other interesting phenomena can be observed after unchecking the Randomize parameter (checked by default). Then the network will start every time with the same initial weight distribution. This will let you observe how the order and frequency of presenting objects from the different quadrants (which depend on your decisions) influence the final distribution of neurons. Another parameter you can experiment with is the Number of Neurons (the number of plotted points). As you probably already know, these parameters can be set by going back to the first window that appears after launching the considered programs (Figure 9.3 and Figure 9.10: the initial window with the parameters shown on the left side of the corresponding figure).
Experiments in this context can lead to very interesting observations. The initial distribution of the neurons can be considered (by analogy) as the innate features and talents of an individual, and the learning sequence of objects as his life experience. I could tell you a lot about it after the long hours spent modeling different neural networks; however, these are considerations that go far beyond the scope of this book. Maybe one day I will write my memoirs about it?