Information reduction in visualization of neural network learning

Martin Kromka
martinkromka@gmail.com

Supervisor: Ing. Rudolf Jakša, PhD.
jaksa@neuron.tuke.sk

Tutor: Ing. Matúš Užák
uzak@neuron.tuke.sk
1. Introduction
The performance of a neural network is usually evaluated by the mean squared error (MSE) or by estimating the overall classification accuracy. Two different neural networks may, however, have the same MSE and yet form completely different internal data mappings. To find out whether two such networks differ, we need to analyse their internal mappings, and visualization is the simplest way to achieve this [4].
2. Selected Methods and Approaches
2.1 The basic type of visualization
The basis of the visualization is to convert the responses (output values) of the individual parts of the neural network into visual information. The sigmoidal function (Figure 2.1) is the most commonly used activation function; it also limits the neuron's output to the interval (0, 1). The visual information is obtained by multiplying the output by the maximum grey level of the desired picture (Gmax) (Figure 2.2).
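As a minimal sketch of this conversion (assuming sigmoid outputs and an 8-bit grey scale, i.e. Gmax = 255; the concrete values are our assumptions, not taken from the text):

```python
import numpy as np

def sigmoid(x):
    """Sigmoidal activation; its output lies in the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def response_to_grey(output, g_max=255):
    """Convert a neuron's output to a grey level by multiplying it with Gmax."""
    return int(round(output * g_max))

# e.g. a neuron with net input 0.8 maps to grey level round(0.69 * 255) = 176
grey = response_to_grey(sigmoid(0.8))
```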
Figure 2.1: The shape of the sigmoidal function [1].
Figure 2.2: Neuron’s response visualization [1].
The left part of Figure 2.2 shows the response of a neuron to the functional signal before the learning process (right after random initialization). The right part of Figure 2.2 shows the same neuron's response after the network has converged to the nearest extremum.
Information reduction in the visualization of neural network learning can be achieved by clustering the individual neuron responses. For this purpose we used two clustering techniques: the first clusters the two nearest neuron responses using the Euclidean metric, the second clusters many neurons using a Kohonen network [2].
2.2 Clustering using the Euclidean metric
The Euclidean metric is used to pair the two most similar neuron responses in the hidden layers over the whole testing set. To visualize such a pair as a single picture, the arithmetic mean of the two most similar neuron responses is calculated. The responses of the newly formed neuron pairs are then multiplied by the maximum grey level (Gmax) and min-max normalization is applied [3].
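A sketch of one such merging step, assuming the responses of the hidden layer neurons over the whole testing set are stored as rows of a NumPy array (the array layout and Gmax = 255 are our assumptions; normalization is done before scaling here, which is equivalent to scaling first and then normalizing into [0, Gmax]):

```python
import numpy as np

def merge_nearest_pair(responses, g_max=255):
    """One clustering step: find the two most similar neuron responses
    (Euclidean metric) over the whole testing set, replace them by their
    arithmetic mean, and return the pair's picture scaled to grey levels.
    responses: array of shape (n_neurons, n_test_samples), n_neurons >= 2."""
    n = len(responses)
    best_d, best_pair = np.inf, (0, 1)
    for i in range(n):                      # exhaustive pairwise search
        for j in range(i + 1, n):
            d = np.linalg.norm(responses[i] - responses[j])
            if d < best_d:
                best_d, best_pair = d, (i, j)
    i, j = best_pair
    mean = (responses[i] + responses[j]) / 2.0   # arithmetic mean of the pair
    # min-max normalization, then scaling by Gmax
    lo, hi = mean.min(), mean.max()
    if hi > lo:
        mean = (mean - lo) / (hi - lo)
    return mean * g_max, best_pair
```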
Figure 2.3 and Figure 2.4 provide a more detailed look at the visualization of the normalized responses of the neuron pairs.
Figure 2.3: First hidden layer visualization using the 2D circle training model.
Figure 2.4: Second hidden layer visualization using the 2D circle training model.
2.3 Clustering using a Kohonen network
The Kohonen network contains as many neurons as there are pictures the user wants to visualize on the screen. Two separate Kohonen networks were used, one for the first hidden layer and one for the second hidden layer. The training patterns for the Kohonen network are the individual neuron responses (Figure 2.5). The Kohonen network then clusters the responses of the first and second hidden layers over the whole testing set.
Two variants of visualization using Kohonen networks are possible:
1. cluster centres visualization
2. cluster means visualization
Figure 2.5: Clustering process using a Kohonen network.
1. Centres visualization
During the neural network (NN) learning process, not only the synaptic weights (SW) of the trained NN adapt, but also the SW of the Kohonen network. For visualization these Kohonen network SW are multiplied by the maximum grey level (Gmax).
2. Cluster means visualization
Clusters of neurons are formed by the Kohonen network during the learning process. To visualize a cluster, the arithmetic mean of the responses of its neurons is calculated and then multiplied by the maximum grey level (Gmax).
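A minimal sketch of both variants, using a simplified winner-take-all Kohonen update without a neighbourhood function (the learning rate, epoch count, initialization and Gmax = 255 are our assumptions; neuron responses are assumed to lie in (0, 1) as sigmoid outputs):

```python
import numpy as np

def train_kohonen(responses, n_pictures, epochs=50, alpha=0.1, seed=0):
    """Winner-take-all Kohonen clustering of neuron responses.
    responses: (n_neurons, n_samples); n_pictures: how many pictures
    (i.e. Kohonen neurons) the user wants on the screen."""
    rng = np.random.default_rng(seed)
    # initialize the Kohonen SW from randomly chosen training responses
    weights = responses[rng.choice(len(responses), n_pictures, replace=False)].copy()
    for _ in range(epochs):
        for r in responses:
            winner = np.argmin(np.linalg.norm(weights - r, axis=1))
            weights[winner] += alpha * (r - weights[winner])
    return weights

def centre_pictures(weights, g_max=255):
    """Variant 1: visualize the adapted Kohonen SW (cluster centres)."""
    return weights * g_max

def mean_pictures(responses, weights, g_max=255):
    """Variant 2: visualize the arithmetic mean of the responses in each cluster."""
    winners = np.array([np.argmin(np.linalg.norm(weights - r, axis=1))
                        for r in responses])
    pictures = []
    for k in range(len(weights)):
        members = responses[winners == k]
        # fall back to the centre if a cluster happens to be empty
        pictures.append((members.mean(axis=0) if len(members) else weights[k]) * g_max)
    return np.array(pictures)
```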
3. Implementation of interactive interventions
3.1 Resetting the SW destined to a selected neuron
The visual view of neuron behaviour allows the user to identify neurons that are saturated and respond inappropriately to the signals spreading through the network. The implemented interaction lets the user unpack a chosen cluster (Figure 3.1) and reset the SW destined to a selected neuron (Figure 3.2, Figure 3.3).
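A sketch of the SW reset, assuming the layer's weights are stored as a matrix with one column per neuron and that resetting means reinitializing to small random values, as at network initialization (both assumptions on our part):

```python
import numpy as np

def reset_neuron_sw(weights, neuron_idx, scale=0.1, seed=None):
    """Reset the SW destined to one selected neuron, as at random
    initialization. weights: (n_prev, n_curr) matrix of the layer;
    column neuron_idx holds the SW leading into the chosen neuron."""
    rng = np.random.default_rng(seed)
    weights[:, neuron_idx] = rng.uniform(-scale, scale, size=weights.shape[0])
    return weights
```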
Figure 3.1: Unpacked clusters.
Figure 3.2: Unpacked neuron cluster with a wrongly reacting neuron on the left.
Figure 3.3: Reset of the neuron on the left.
3.2 Individual learning rate setting
After unpacking a cluster, the user can set an individual learning rate (from a set of discrete values) for each neuron of that cluster (Figure 3.4 and Figure 3.5). From their reactions to the input signals, the user can see which neurons represent appropriate building blocks of the desired knowledge. The SW leading to those neurons should not change in the same way as the SW of neurons that deform the desired knowledge.
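A sketch of how such per-neuron learning rates could enter the weight update, assuming plain gradient descent with a gamma vector indexed by neuron (the concrete update rule is our assumption; the text only states that the gammas are set individually):

```python
import numpy as np

def sgd_step_with_gammas(weights, grads, gammas):
    """Gradient step with an individual learning rate (gamma) per neuron.
    weights, grads: (n_prev, n_curr); gammas: (n_curr,), one discrete
    value per cluster neuron. A neuron whose gamma is 0 keeps its SW,
    so it cannot deform the knowledge it already represents."""
    return weights - grads * gammas[np.newaxis, :]
```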
Figure 3.4: Default gamma parameter setting for each cluster neuron.
Figure 3.5: Various gamma parameter settings for each cluster neuron.

3.3 Pruning
Pruning makes it possible to find a suitable NN architecture and thus to decrease the costs of the NN (time, memory, hardware implementation). Unimportant neurons can be cut away, which also helps to build a picture of input relevance [5]. For simplicity, pruning was implemented in an easy, testing version: neurons can be removed after unpacking a cluster, but removing them does not speed up the learning process in any way (Figure 3.6).
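A sketch of this testing-version pruning, assuming it is realized by masking the responses of the removed neurons, consistent with Section 4.2 where the response values of pruned neurons are set to 0:

```python
import numpy as np

def prune_responses(activations, pruned):
    """Testing-version pruning: the responses of pruned neurons are
    forced to 0, so they stop influencing the next layer, but they are
    still computed, which is why removal brings no speed-up.
    activations: (n_neurons,); pruned: boolean mask of the same shape."""
    return np.where(pruned, 0.0, activations)
```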
Figure 3.6: Neuron pruning.
4. Experiments
The experiments were designed and carried out together with the proposed model of information reduction in the visualization of NN learning. The experiments build on one another and progressively show the reader the benefits of NN learning visualization.
4.1 Euclidean metric
Figure 4.1 shows the visualization of a NN learning the circle model, using Euclidean metric clustering. The first row of squares represents the input layer, the second the first hidden layer (its clusters), and the third the second hidden layer (its clusters). The final output is in the bottom square.
Figure 4.1: Circle model visualization by clustering with the Euclidean metric.
4.2 Kohonen network
The Euclidean metric becomes problematic when the number of neuron pairs to visualize is too large to display. Kohonen network visualization offers two views: the centres view and the cluster means view. In this visualization the first row of squares represents the input layer, the second the first hidden layer (its clusters), the third the second hidden layer (its clusters), and the bottom square the final response of the output neuron (Figure 4.2, 4.3).
Figure 4.2: Circle model cluster means visualization.
Figure 4.3: Circle model centres visualization.
In the next experiment, the possibility of resetting the SW of individual neurons and of changing their learning rates was used to improve the results (Figure 4.4, 4.5, 4.6).
Figure 4.4: Desired output.
Figure 4.5: Visualization of the learning process without user intervention.
Figure 4.6: Visualization of the learning process with interactive user intervention.
Visually, the response of the final neuron looks better with the SW reset and individual learning rate interactions than without any interaction. On the other hand, the MSE on the testing model is 0.016697 without interaction and 0.018216 with interaction.
Besides the interactive interventions, pruning was also used in the next experiment. By removing neurons, we tried to reach results as good as those obtained without pruning (Figure 4.7, 4.8).
Figure 4.7: 2-10-4-2 NN visualization without pruning.
Figure 4.8: 2-2-1-2 NN visualization after pruning.
Figure 4.7 shows the visualization of a net with 2 input neurons, 10 neurons in the first hidden layer, 4 neurons in the second hidden layer and 2 output neurons. The visualization itself shows 2 input neurons, 6 clusters in the first hidden layer, 3 clusters in the second hidden layer and 1 output neuron. Pruning was then applied: the response values of the inappropriately responding neurons were set to 0 (Figure 4.8). By removing neurons we obtained the minimal net topology: 2 input neurons, 2 neurons in the first hidden layer, 1 neuron in the second hidden layer and 2 output neurons.
5. Conclusion
Visualization allows us to observe how knowledge is created. This helps the user to better understand, and properly modify, the building blocks represented by the interactions of the hidden layer neurons.
We tried to visualize large NNs using clusters, which brought clarity to the visualization of large NNs. If the user understands well how the individual building blocks are created, he will be able to intervene in the learning process and thus improve the results.
References
1. Užák, M.: Vizualizácia a interakcia v procese učenia neurónových sietí (Visualization and Interaction in the Neural Network Learning Process). Master's thesis, Technical University of Košice, 2005.
2. Sinčák, P., Andrejková, G.: Neurónové siete – inžiniersky prístup (Dopredné neurónové siete) (Neural Networks: An Engineering Approach. Feedforward Neural Networks), Vol. 1. Elfa-press, 1996. ISBN 80-88786-38-X.
3. Paralič, J.: Objavovanie znalostí v databázach (Knowledge Discovery in Databases). Elfa s.r.o., 2003. ISBN 80-89066-60-7.
4. Duch, W.: Visualization of hidden node activity in neural networks. In: International Joint Conference on Neural Networks, Portland, Oregon, 2003, Vol. I, pp. 1735-1740.
5. Pruning algorithm [online]. Available at: http://poprad.fei.tuke.sk/~ligus/fir/obsah/met23.htm