9. Self-learning neural networks

9.1. What is the idea of a self-learning network?

(Translated by Barbara Borowik, cnborowi@cyf-kr.edu.pl)

You have seen (and tried out in several programs!) how a neural network, taking advantage of a teacher's guidelines, perfects its operation step by step until its behavior agrees with the patterns given by the teacher. Now I will show you that a network can also learn entirely on its own, without a teacher. I have already written on this subject in section 3.2, but now the time has come for you to learn a few more facts. You should know that several methods of such self-learning exist (the algorithms of Hebb, of Oja, of Kohonen, and others), but in this book we will not go into the details of these algorithms; we will only try to see what the idea is all about. For this purpose I suggest you run the Example 10a program. In this program you are dealing with a network of neurons that consists of only one layer, though sometimes quite an extensive one (Fig. 9.1). All neurons of this layer receive the same input signals, and each neuron determines (independently of the others) its degree of excitation by multiplying the input signals (the same for all neurons) by its weighting factors (different, of course, in each neuron).

Fig. 9.1. A structure of a single-layer network that may be subject to self-learning.

As you remember, the excitation of a neuron is the stronger, the better the match between the input signals delivered at a given moment and the neuron's internal pattern. Knowing the values of a neuron's weight factors therefore lets you tell to which input signals it will react positively and to which negatively. If we assume that the input signals are two-dimensional (i.e. have two components), you may mark them as points on a plane (do you remember? we call it the input space). The weights of all neurons are then also two-dimensional (the number of weights must always equal the number of components of the input signal), so these weights too can be marked as points on a plane - most conveniently on the same plane. In this way, points will appear in the plane of the input signals indicating the locations of particular neurons, which result from their internal knowledge, that is, from the values taken by their weighting factors. I remind you of this in Figure 9.2, which you saw in one of the previous chapters, but now - richer with so many new experiences - you will probably look at it differently.

Fig. 9.2. The interpretation of the relationship between the weight vector and the input vector (the axes show the 1st and 2nd components of the signal together with the corresponding weight values of the neuron; the angle between the two vectors decides the neuron's response)

The self-learning principle is that at the beginning all neurons receive random values of their weighting factors, which in the figure produced by the Example 10a program you will see as a "cloud" of randomly scattered points. Each of these points represents a single neuron; as usual, the location of a point results from the values of the neuron's weighting factors. Figure 9.3 shows you this (the window on the right side). As for the window on the left side of Fig. 9.3: as in the previous programs, it appears soon after starting the program and serves to modify the network parameters.
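The "degree of excitation" described above is simply the sum of the input components multiplied by the corresponding weights. A minimal sketch (the weight and signal values are illustrative only, not values used by the Example 10a program) could look like this:

```python
# A neuron's degree of excitation: input components times weights, summed.
# The numbers below are illustrative, not taken from the Example 10a program.

def excitation(weights, signal):
    """Dot product of the weight vector and the input vector."""
    return sum(w * x for w, x in zip(weights, signal))

signal = [1.0, 1.0]             # a two-component input signal

matching_neuron = [0.9, 0.8]    # internal pattern similar to the signal
opposing_neuron = [-0.9, -0.8]  # internal pattern opposite to the signal

print(excitation(matching_neuron, signal))  # strongly positive response
print(excitation(opposing_neuron, signal))  # strongly negative response
```

The better the weights match the signal, the larger the sum; this is why a neuron's weights can be drawn as a point in the same plane as the input signals.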
There are two parameters you can set: the number of neurons in the network (more neurons can show more interesting behavior, but then it is harder to keep track of them), and the parameter named Etha, which represents the intensity of the learning process (higher values result in faster and more energetic learning). You may change these parameters at any time; at the beginning, however, I suggest you accept the values proposed by the program. I selected these default settings after my initial examination of the program's behavior. We will return to the descriptions of the individual parameters, and to "experimenting" with their values, later in this chapter.

After starting the program you will have a network composed of 30 neurons, characterized by a rather moderate enthusiasm for learning. All the weights of all neurons in the network will have random values. Into such a randomly initiated network we begin to enter input signals, which are examples of the sample objects the network is dealing with (do you perhaps remember the example from the third chapter, with Martian-men, Martian-women and Martian-kids?). In Example 10a you will have an influence on the appearance of these objects. This is a little at odds with the idea of self-learning (after all, if the network is to learn by itself, you should have nothing to do!). Only in later experiments will the process of presenting the objects of the learning sequence (without any information about what should be done with them!) run randomly, independently of you - just as in real applications. In Example 10a I have intentionally deviated from this ideal, because I would like you to observe for yourself how the way the learning objects appear (which ones more often, which ones less often...) affects the process of independent "knowledge acquisition" by the network. However, some explanation is needed here.
All the knowledge the network can gain during self-learning must lie in the objects presented to it at the input, and more precisely in the fact that these objects form certain similarity classes; this means that among them there are some which are similar to each other (Martian-men) and differ from those which belong to another class (Martian-women). This means that neither you (doing it manually) nor even an automatic generator (which will help you in this work in the next program) may show objects which are randomly distributed over the whole input signal space. The presented objects should form explicit clusters around certain detectable centers which can be named and interpreted ("typical Martian-man", "typical Martian-woman", etc.).

Fig. 9.3. Initial window with parameters (on the left) and location of the points representing neurons in the program Example 10a after accepting the network parameters and selecting Next (the window on the right)

While exploring Example 10a you will decide each time what to show to the network, but this will be only a "rough" choice of the object, because its details will be determined for you by the automaton. More precisely, you will be able to indicate the quadrant1 within which the next presented object should be located. All you have to do is click the mouse on the chosen quadrant. In this way you will control what the network will "see". You can agree ("with yourself"!) that the objects shown in the first quadrant belong to Martian-men, in the second to Martian-women, in the third to Martian-kids and in the fourth to Martianamors2. With a little imagination you can see the window displayed by the program Example 10a in the manner shown in Figure 9.4, where Martian silhouettes are shown instead of the points representing their measured features.
1 The term "quadrant" (of some space, defined on the basis of a two-dimensional coordinate system) comes from mathematics, and I have promised that mathematics will not be required to read this book. So if you have not yet learned in mathematics (or have learned and forgotten) what a coordinate system is and how it divides its signal space into quadrants, then simply assume that the quadrants are numbered starting from the upper right corner of the considered space. This upper right quadrant, where the variables on both axes take positive values only, is numbered first. The numbering then proceeds counter-clockwise through the consecutive quadrants: the second quadrant is the upper left one, the third lies just under it, and so on. If you would like to see how the quadrants are situated (with their numbering) in a picture, go a few pages forward and find Figure 9.12. In the picture presenting the window (in its lower right corner) you can see a set of four buttons with numbers corresponding to the described quadrants. You will learn how to use these buttons later; for now just look at how they are arranged and numbered. This will help you keep track of the remaining part of the section.

2 Martianamors are representatives of the third gender present on Mars. As is generally known, the inhabitants of Mars merge into triangles in order to have offspring. Being busy with their complex sexual lives, they have not yet built a technical civilization and do not form large structures, and therefore they cannot be detected by observations from Earth.

Fig. 9.4. Representation of the points of the input signal space as "Martians"

As you can see in the picture, every Martian is a little different, but they are similar to each other, and therefore their data are represented in the same area of the input signal space (this will help the neural network learn what a "typical Martian" looks like).
However, since each of these Martians has individual characteristics that distinguish it slightly from the other Martians, the points representing them during the self-learning process will differ a little from each other - that is, they will be located in slightly different places of our coordinate system. But since the features that Martians have in common outweigh the features that distinguish them individually, the points representing them will concentrate in a certain subarea of the considered input signal space. When generating the objects used in the learning process, you will be able to decide whether a Martian-man, Martian-woman, Martian-kid or Martianamor is to be shown next. You will not, however, give any details of any particular object (e.g. there is no need to describe the height or eye color of each Martian), because the program will fill them in automatically. At each learning step the program will ask you about the type of the considered Martian - that is, to which quadrant the presented object belongs. When you decide and choose a quadrant, the program will generate an object of that specific class and show it to the network. Each Martian shown will differ from the others, but you will easily notice that their features (associated with the locations of the corresponding points in the input signal space) will be clearly concentrated around certain "prototypes". Perhaps this "magical" process of generating objects and presenting them to the network seems a little complicated to you now, but after a short time of working with Example 10a you will get used to it, and in addition some tips will appear on the screen to help you better understand what is going on. At the moment a specific input signal belonging to the learning set appears, you will see it on the screen as a green square (try to guess why I chose the color green), much larger than the points identifying the locations of the individual neurons (Figure 9.5).
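The "automaton" that fills in the details can be imagined as follows: every class (quadrant) has a prototype point, and each presented object is that prototype plus a small random individual variation. The prototype coordinates and the noise range below are my own assumptions for illustration; the actual values generated by Example 10a may differ:

```python
import random

# Hypothetical class prototypes, one per quadrant of the input space
# (counter-clockwise numbering, as described in the footnote).
PROTOTYPES = {
    1: ( 1.0,  1.0),   # "typical Martian-man"
    2: (-1.0,  1.0),   # "typical Martian-woman"
    3: (-1.0, -1.0),   # "typical Martian-kid"
    4: ( 1.0, -1.0),   # "typical Martianamor"
}

def generate_object(quadrant, spread=0.2, rng=random):
    """One 'Martian': the class prototype plus individual variation."""
    px, py = PROTOTYPES[quadrant]
    return (px + rng.uniform(-spread, spread),
            py + rng.uniform(-spread, spread))

random.seed(1)
for _ in range(3):
    print(generate_object(1))  # three slightly different Martian-men
```

Every call returns a slightly different point, but all the points of one class stay concentrated around the same prototype, which is exactly the kind of clustering the self-learning network needs.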
After receiving the signal at its inputs, every neuron of the self-learning network determines its output signal (based on the components of this input signal and its own weights). These signals can be positive (the neuron "likes" this object - red points on the screen) or negative (the neuron "doesn't like" this object - blue points on the screen). If the output signals of some neurons (whether positive or negative) have low values, the corresponding points take the color gray (those neurons have an "indifferent attitude" toward this object).

Fig. 9.5. Situation after the appearance of the input learning object ("Martian")

All neurons of the self-learning network then correct their weights based on the input signals and the established output values. The behavior of each neuron during the correction of its weights depends on the value of its output (its response to the stimulation). If the neuron's output was strongly positive (red on the screen), its weights change in such a way that the neuron approaches the point it liked. This means that if the same point is presented again, the neuron will greet it even more enthusiastically (the output signal will have an even larger positive value). Neurons which showed a negative attitude toward the currently presented pattern (blue points on the screen) will be repelled from it, and their weights will incline them toward an even more negative response to this type of input in the future. In Example 10a you can observe the effect of negatively oriented neurons "escaping" outside the viewing area; in this case the points representing them are no longer displayed. This is a natural result if one takes into account the fact that in the self-learning process of a neural network, equally strong "attraction" and "repulsion" are typically used.

Fig. 9.6.
Weights correction caused by the self-learning process

You can follow the movement of the points showing the positions of the neurons' weight vectors very well, because at each step the program presents an image showing both the previous (colorful) and the new (squares empty inside) locations of the points representing the neurons (Figure 9.6). What is more, each old location is connected to the new one with a dotted line, so you can follow the "path" traveled by the neurons during learning. In the next step the "new" neuron weights become the "old" ones, and after the next point is shown the whole cycle begins again (Figure 9.7): a new object is presented to the network, perhaps belonging to a new class (Figure 9.8).

Fig. 9.7. Another iteration of the self-learning process

Fig. 9.8. Interpretation of the next iteration of the self-learning process

If you repeat this process long enough, a cluster of neurons will appear in each quadrant, specialized in the recognition of the typical object belonging to that group (Figure 9.9). In this way the network will specialize itself in recognizing all the types of objects it encounters. Following the self-learning process you will certainly notice that it only strengthens the "innate tendencies" of the neural network, expressed in the initial (random) spread of the weight vectors. These initial differences cause some neurons to respond positively and others negatively when an input object appears; the correction of all neurons is then made in such a way that, as a result, the network learns to recognize the common elements appearing at its input and effectively improves its recognition ability without any intervention of a teacher, who may not even know how many classes of objects have to be recognized or what their characteristics are.

Fig. 9.9.
The final phase of the self-learning process

To ensure that the self-learning process proceeds efficiently, however, the necessary initial diversity of the neuron population must be provided. If the weight values are randomized in a way that assures the needed diversity in the population of neurons, the self-learning process will proceed relatively easily and smoothly. If, however, all neurons have similar "innate passions", it will be very difficult for them to handle all the emerging classes of objects. In this case the phenomenon of "collective fascination" is more likely, in which all neurons "like" one selected object and completely ignore all the others. Usually at the end of such a process all neurons are specialized in recognizing one selected class of objects while completely ignoring all the other classes. In Example 10a this diversity was guaranteed as something self-evident and immutable. Since the problem of the initial random distribution of neurons is a matter of primary importance, I have prepared a variant of the program called Example 10ax, in which you explicitly request either a wide random spreading of the features (throughout the whole input space) or drawing the initial positions of the neurons from a narrow range. To do this, the state of the field (checkbox) called Density must be set appropriately in the initial panel (left screen in Figure 9.10). At the beginning it is better to choose widely spread neurons, and therefore to leave the Density field empty. Once you are familiar with the basic regularities of the network's self-learning process and ready to see something new indeed, check this box. Then you will see for yourself how important the initial random scattering of the neurons' parameters (weights) is. Figures 9.10, 9.11 and 9.12 show the network's self-learning process in cases where the initial neuron weights are insufficiently differentiated.

Fig. 9.10.
The initial window with parameters (on the left) and an example of the effect of drawing the initial values of the weight vectors in the program Example 10ax after acceptance of the network parameters, in particular with the Density field selected. In this case, after clicking the Next button we get a task in which the initial diversity of the weight values is too small (window on the right)

Fig. 9.11. Self-learning process in the program Example 10ax - all the neurons are attracted by the same attractor

Fig. 9.12. Self-learning process in the program Example 10ax - all the neurons are repelled by the same attractor

You can use the programs Example 10a and Example 10ax to carry out a series of studies. You can examine, for example, the effects of a somewhat more "energetic" self-learning process. You will notice this after changing (especially increasing - but carefully!) the value of the learning rate, the Etha factor. Other interesting phenomena can be observed after unchecking the Randomize parameter (checked by default). The network will then start every time with the same initial weight distribution. This will let you observe how the order and frequency of presenting objects from different quadrants (which depend on your decisions) influence the final neuron distribution. Another parameter you can experiment with is the Number of Neurons (the number of plotted points). As you probably remember, these parameters can be set by going back to the first window that appears after launching the considered programs (Figure 9.3 and Figure 9.10 - the initial window with the parameters is shown on the left side of the corresponding figure). Such experiments can lead to very interesting observations. The initial distribution of the neurons can be compared (by analogy) to the innate features and talents of an individual, and the learning sequence of objects to his life experience.
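If you would like to experiment beyond what the two programs offer, the mechanism described in this chapter can be condensed into a few lines of code. The sketch below is my own simplification, not the algorithm actually coded in Example 10a/10ax: it keeps only the attraction branch (in the real program the "disliking" neurons are additionally repelled and escape off the screen), it leaves weakly responding, "indifferent" neurons unchanged, and all prototypes, noise ranges and starting positions are invented for illustration. Even so, it reproduces both effects discussed above: widely scattered initial weights produce a specialist in every quadrant, while a narrow initial cluster ends in "collective fascination" with a single class.

```python
import random

ETA = 0.1  # learning intensity, the program's "Etha" parameter

# invented class prototypes, one per quadrant (counter-clockwise numbering)
PROTOTYPES = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]

def self_learn(neurons, steps=200, rng=None):
    """Attraction-only sketch of the self-learning loop."""
    rng = rng or random.Random(0)
    for step in range(steps):
        # present the classes in turn, each object with individual variation
        px, py = PROTOTYPES[step % 4]
        x = (px + rng.uniform(-0.1, 0.1), py + rng.uniform(-0.1, 0.1))
        for w in neurons:
            y = w[0] * x[0] + w[1] * x[1]   # the neuron's output signal
            if y > 0.5:                     # strong "liking": attraction
                w[0] += ETA * y * (x[0] - w[0])
                w[1] += ETA * y * (x[1] - w[1])
            # weak responses ("indifferent" gray points) stay unchanged
    return neurons

def closest_to(neurons, point):
    """Distance from a prototype to the nearest neuron."""
    return min(((w[0] - point[0]) ** 2 + (w[1] - point[1]) ** 2) ** 0.5
               for w in neurons)

# Wide initial scattering: one neuron per quadrant -> four specialists.
wide = self_learn([[0.6, 0.6], [-0.6, 0.6], [-0.6, -0.6], [0.6, -0.6]])
print([round(closest_to(wide, p), 2) for p in PROTOTYPES])  # all small

# Narrow initial scattering: a tight cloud around one point -> every neuron
# "likes" the same class and the whole population chases it together.
rng = random.Random(1)
narrow = self_learn([[0.35 + rng.uniform(-0.02, 0.02),
                      0.35 + rng.uniform(-0.02, 0.02)] for _ in range(30)])
print(closest_to(narrow, PROTOTYPES[0]))  # small: all crowd in quadrant 1
print(closest_to(narrow, PROTOTYPES[2]))  # large: quadrant 3 is ignored
```

Raising ETA makes the neurons jump toward each presented object in larger steps, which is the "more energetic learning" mentioned above; with too large a value the steps overshoot and the movement becomes chaotic.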
I could tell you a lot about this after the long hours I have spent modeling different neural networks; however, these are considerations that go far beyond the scope of this book. Maybe one day I will write my memoirs about it?