Generalised scheme for optimal learning in recurrent neural networks

K. Shanmukh
Y.V. Venkatesh
Indexing terms: Neural network, Perceptron, Hopfield network, Bidirectional associative memory, Maximal stability, Optimal learning, AdaTron
Abstract: A new learning scheme is proposed for neural network architectures like the Hopfield network and bidirectional associative memory. This scheme, which replaces the commonly used learning rules, follows from the proof of the result that learning in these connectivity architectures is equivalent to learning in the 2-state perceptron. Consequently, optimal learning algorithms for the perceptron can be directly applied to learning in these connectivity architectures. Similar results are established for learning in the multistate perceptron, thereby leading to an optimal learning algorithm. Experimental results are provided to show the superiority of the proposed method.
1 Introduction
A major goal of present day research on artificial neural
networks (ANN) is to design the interconnections among
a large number of (primitive) neurons, in such a way as to
realise appropriate pattern recognition characteristics.
The ANNs range from perceptrons (with no feedback) [1] to cellular networks (with local feedback) to Hopfield networks (with global feedback) [2]. When the ANN is to
be used for pattern analysis, the stability of the network
as a dynamic system is of paramount importance. If the
network, starting from an arbitrary point in state-space,
evolves in time and reaches a stable equilibrium point,
then the network is interpreted as having recognised the
pattern corresponding to the stable state. The distinctive
characteristic of ANNs is that this type of convergence to
the stable state may take place, even though the input
pattern is an incomplete or distorted version of the
correct pattern. The ANN may have many points of stable equilibria, corresponding to the many patterns which it can recognise. The region around an equilibrium
point, consisting of all the points starting from which the
trajectory of the dynamical system will converge to the
equilibrium point, is called the basin of attraction for the
equilibrium point. The ANN is to be so designed as to
have the basin of attraction as large as possible for each
equilibrium point.
The synthesis of the ANNs for pattern analysis
involves designing the dynamical system, in such a way
as to realise a separate equilibrium point for each of the
patterns to be recognised. This implies a learning stage,
in which the training patterns are used to fix the connection strengths.
In this paper, we develop an optimal learning scheme for associative memory architectures like the Hopfield network and bidirectional associative memory. We generalise this for the multistate perceptron. This is an indication of the special feature of the proposed technique, which can be used for learning in any neural network architecture without hidden neurons, as long as the problem of learning can be converted to the problem of satisfying a set of linear inequalities of a particular type.*
In order to avoid terminological confusion, we employ,
throughout the paper, the terms 'Hopfield network' and 'BAM' to refer to connectivity architectures and not to
the learning algorithms.
2 Perceptron and the perceptron convergence theorem
A two-state perceptron is a single neuron which takes an n-dimensional vector as input, and gives a binary output, see Fig. 1. The input-output relation is nonlinear, and is given by

y = sgn(W^T X - t)
where X and W are the input and weight vectors, respectively, t is a scalar called the threshold, y the output, T the transpose operation, and the function sgn(h) is -1 for h < 0 and +1 for h >= 0.

Fig. 1 Two-state perceptron

It is known that the perceptron can be used to solve linearly separable two-class classification problems. In this case, the set of the patterns
belonging to the two classes is the ‘training set’, and
* At the time of the present revision of the paper, the authors came across the latest reference [14], in which a similar criterion for optimality has been used. However, there is no proof of convergence in Reference 14, and the framework used for establishing the domain of attraction is different.
'learning' is the process of finding W and t such that the
perceptron classifies the patterns correctly.
Formally, let {X_k^s}, s = 1, 2; k = 1, 2, ..., N(s) be the set of patterns. Here s stands for the class number, and k is the pattern number within the sth class. Let the output, y^s, be -1 for s = 1 and +1 for s = 2. Learning amounts to finding W and t such that

(W^T X_k^s - t) y^s > 0  for s = 1, 2 and for all k    (1)
The following well-known perceptron convergence theorem [1] gives an iterative learning algorithm that
converges to a solution whenever there exists one.
Without loss of generality, we neglect the threshold term,
t, in the following discussion.
Perceptron convergence theorem: Starting from the initialisation weight vector, W(0) = 0, each of the patterns, X_k^s, is repeatedly presented to the algorithm. If X_k^s is the pattern presented at step t, then the vector W(t - 1) of the preceding step is modified according to the rule

W(t) = W(t - 1) + ε X_k^s y^s    (2)

If a solution W* exists satisfying eqn. 1, then this algorithm converges in a finite number of steps to a solution of eqn. 1.
The proof of this theorem (see Reference 1) involves showing that the Schwartz inequality,

(W(t)^T W*)^2 <= ||W(t)||^2 ||W*||^2    (3)

will be violated, if the modification of the weight vector does not stop after a finite number of iterations.
In the above algorithm, even if the modification of
weights is done after a batch of patterns is presented, the
method still converges.
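The update rule above can be sketched in a few lines of Python (a minimal sketch, assuming ε = 1 and cycling through the patterns until none is misclassified; the toy patterns are illustrative only):

```python
import numpy as np

def perceptron_learn(patterns, outputs, eps=1.0, max_epochs=1000):
    """Perceptron learning rule: on a misclassified pattern X with
    target y in {-1, +1}, update W <- W + eps * X * y."""
    W = np.zeros(patterns.shape[1])          # initialisation W(0) = 0
    for _ in range(max_epochs):
        changed = False
        for X, y in zip(patterns, outputs):
            if np.dot(W, X) * y <= 0:        # pattern on the wrong side
                W += eps * X * y
                changed = True
        if not changed:                      # all inequalities of eqn. 1 hold
            return W
    return None                              # no separating W found in time

# Toy, linearly separable set; the threshold t is absorbed as a -1 input
X = np.array([[2.0, 1.0, -1.0], [1.0, 3.0, -1.0],
              [-1.0, -2.0, -1.0], [-2.0, -1.0, -1.0]])
y = np.array([1, 1, -1, -1])
W = perceptron_learn(X, y)
assert all(np.dot(W, x) * t > 0 for x, t in zip(X, y))
```

Absorbing the threshold as an extra input with constant value -1 is the same device used throughout the remainder of the paper.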
2.1 Learning with constrained weights
It has been observed that, in the above algorithm, if an element of the weight vector, w_i, always gets an input of the same sign, and if the sign of w_i is opposite to that of the input, then that weight w_i can be kept constant while updating the vector W. Still the algorithm converges in a
finite number of steps. The proof again uses the Schwartz
inequality to show the number of corrections is bounded
above.
Now consider the problem of finding a weight vector W such that

W^T X_i > t_i, where t_i > 0, for all i

In this case, too, the algorithm converges to a solution whenever there exists one. The proof of convergence follows from the above result, and the following representation of the problem: append one extra component per inequality, so that the augmented weight vector is (W^T, t_1, ..., t_N)^T, and the augmented pattern X~_i has -1 in its ith extra slot and 0 in the others; each inequality then becomes the perceptron condition on the augmented vectors. In this formulation, t_i and -1 are of opposite signs, as a consequence of which we can keep the t_i fixed without affecting the convergence.
Based on the above discussion, we observe that the PLA, in general, can solve a system of linear inequalities of the following type:

W^T X_i > t_i, where the t_i are positive    (4)
Viewing the PLA in this way is useful in many ways. For instance, any linear constraint on W of the type of expr. 4 can be allowed in the PLA, since it can be converted to an equivalent pattern and added to the pattern set.
As an example, consider the problem of learning in a perceptron with sign-constrained weights [3]. If a weight w_i is constrained to be positive, then that constraint can be converted to the pattern X = (0 0 ... 1 ... 0 0)^T, with the 1 in the ith position, so that

W^T X > 0 ⟺ w_i > 0
So the algorithm converges even when the signs of the
weights are constrained, if there exists a solution with the
given signs. In this way, we obtain the algorithm given in
Reference 3 as a special case.
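The conversion of a sign constraint into an extra training pattern can be sketched as follows (a hypothetical toy example, not from Reference 3; the constraint w_0 > 0 is encoded as the unit pattern e_0 with target output +1):

```python
import numpy as np

def pla(patterns, outputs, eps=1.0, epochs=500):
    """Plain perceptron learning on a pattern set."""
    W = np.zeros(patterns.shape[1])
    for _ in range(epochs):
        done = True
        for X, y in zip(patterns, outputs):
            if np.dot(W, X) * y <= 0:
                W += eps * X * y
                done = False
        if done:
            break
    return W

# Two training patterns plus the constraint w0 > 0, encoded as the
# unit pattern e0 = (1, 0) with target output +1.
X = np.array([[1.0, 2.0], [-2.0, -1.0]])
y = np.array([1, -1])
e0 = np.array([[1.0, 0.0]])
Xc = np.vstack([X, e0])                 # constraint added as a pattern
yc = np.append(y, 1)
W = pla(Xc, yc)
assert W[0] > 0                         # the sign constraint holds
assert all(np.dot(W, x) * t > 0 for x, t in zip(X, y))
```

If no solution with the given signs exists, the loop simply exhausts its epochs, mirroring the non-convergence of the PLA on a non-separable set.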
2.2 Perceptron of maximal stability
The perceptron generalises well to the classification of patterns which are not in the training set, if the inequalities 1 are satisfied for a positive constant, c, on the right-hand side instead of zero. An optimal learning algorithm is one which results in the choice of W and t such that c is maximum, with respect to the normalised W. With such W and t, even if the patterns are corrupted to some extent, one can expect the classification to be correct because of the higher tolerance c. The perceptron with such W and t is the perceptron of maximal stability [4]. The determination of such W and t leads to the following optimisation problem for the pattern set {X_k^s}:

maximise c
subject to (W^T X_k^s - t) y^s > c ||W||
Gardner et al. [5] show that the PLA can be used for finding the weights, W, that satisfy the above inequalities for a fixed c. In Reference 6, an iterative algorithm, called AdaTron, is proposed to solve the above optimisation problem. In fact, AdaTron is a combination of the Adaline rule [7] and the PLA [1].
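A minimal sketch of the AdaTron idea, assuming the common dual form in which W = Σ_k α_k y_k X_k with embedding strengths α_k >= 0 driven towards stabilities of 1 (the data and learning rate here are illustrative, not taken from Reference 6):

```python
import numpy as np

def adatron(patterns, outputs, lr=1.0, epochs=2000):
    """Sketch of the AdaTron idea: keep per-pattern embedding strengths
    alpha_k >= 0, with W = sum_k alpha_k * y_k * X_k, and drive every
    stability y_k * W.X_k towards 1 (the maximal-margin fixed point)."""
    n = len(patterns)
    alpha = np.zeros(n)
    norms = np.sum(patterns**2, axis=1)
    for _ in range(epochs):
        for k in range(n):
            W = (alpha * outputs) @ patterns
            gamma = outputs[k] * np.dot(W, patterns[k])   # stability of k
            delta = lr * (1.0 - gamma) / norms[k]
            alpha[k] = max(0.0, alpha[k] + delta)         # keep alpha >= 0
    return (alpha * outputs) @ patterns

X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
W = adatron(X, y)
assert all(y_k * np.dot(W, x_k) > 0 for x_k, y_k in zip(X, y))
```

For separable data and a learning rate in (0, 2), updates of this form converge to the maximal-margin weight vector, which is what the optimisation problem above asks for.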
3 Optimal learning in various neural network models
In this Section, we use the above procedure of converting
the constraints on W into patterns, in order to develop
optimal learning algorithms for the neural network
models such as BAM, the Hopfield network and the
multistate perceptron.
3.1 Learning in bidirectional associative memory and
the Hopfield network
For clarity, we deal with the BAM and the Hopfield network separately. As is well known, the BAM is a two-layer feedback neural network, which can be used to store associations between patterns (see Fig. 2). The neurons in one layer are connected to the neurons in the other layer, but there are no connections within a layer. Let N and P be the number of neurons in the first and second layers, respectively. Let W_ij be the connection weight between neuron i in the first layer and neuron j in the second layer. Let {(X_k, Y_k)}, k = 1, ..., M be the set of pattern pairs to be stored, where the X_k s and Y_k s are N- and
P-dimensional vectors, respectively. In the standard recall procedure, each neuron state X(i) and Y(j) is updated as follows:

X'(i) = sgn[Σ_{j=1}^{P} W_ij Y(j)],  Y'(j) = sgn[Σ_{i=1}^{N} W_ij X(i)]    (5)

where X'(i) and Y'(j) are the new states of the ith unit in the first layer, and of the jth unit in the second layer, respectively.

Fig. 2 Bidirectional associative memory

In the learning phase, a weight matrix to store the given pattern pairs is obtained. Two aspects are to be considered in learning: first, the desired patterns are to be stored as stable states, and, second, the stored patterns are to be made optimally stable.

From eqn. 5, we see that a pattern pair (X, Y) is a stable state, if and only if the following equations hold:

X(i)[Σ_{j=1}^{P} W_ij Y(j)] > 0,  i = 1, ..., N
Y(j)[Σ_{i=1}^{N} W_ij X(i)] > 0,  j = 1, ..., P    (6)

The first aspect of the learning phase is to obtain the weight matrix, W, satisfying the above inequalities for all the pattern pairs to be stored. The second requires a W which gives a basin of attraction as wide as possible around each stored pattern pair. In view of the difficulty of solving this problem, we consider a simpler version of the second aspect (see, for instance, Reference 8), namely, finding a W which satisfies the following inequalities:

X_k(i)[Σ_{j=1}^{P} W_ij Y_k(j)] > c,  Y_k(j)[Σ_{i=1}^{N} W_ij X_k(i)] > c    (7)

where i = 1, ..., N; j = 1, ..., P; k = 1, ..., M; and c is a positive number. An optimal learning algorithm should maximise the c in the above equations with respect to a normalised weight matrix.

Kosko [9] has proposed a correlation learning rule to determine the connection weights using the formula

W_ij = Σ_{k=1}^{M} X_k(i) Y_k(j)

It should be noted that this rule does not work well when the patterns are not orthogonal. Even when there exists a matrix W such that the given patterns are stable, this learning rule may fail in storing the patterns. Thus, with this learning rule, one cannot utilise the full storage capacity of the network's architecture.

To overcome this problem, and to obtain an optimal weight matrix, a modified learning algorithm based on multiple training has been proposed in Reference 10. In this method, called the weighted learning method, the connection weight is of the form

W_ij = Σ_{k=1}^{M} C_k X_k(i) Y_k(j)    (8)

The algorithm proposed in Reference 10 results in C_k s which make the patterns stable. It is also proved that this algorithm converges whenever there exists a set of C_k s such that the given patterns are stable. Further, a condition for the existence of C_k s is derived, and it is shown to be equivalent to the linear separability of a set of patterns which are constructed from the given pattern set.

However, a drawback of this method is that it may not be able to store the patterns which can, theoretically, be stored. This problem arises because the algorithm obtains the solution only when it is of the form given in eqn. 8. In fact, for some pattern sets which can be stored, there may not exist a set of C_k s such that W in eqn. 8 makes them stable.

As an example, consider the set of patterns in Fig. 3. It can be shown, by using the Ho-Kashyap algorithm [11], that there do not exist C_k s which make the patterns stable, in spite of the fact that these patterns can be stored.

Fig. 3 Set of patterns used as an example (pattern pairs X1-X4, Y1-Y4)
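The stability test of conditions 6, together with the correlation rule, can be sketched as follows (an illustrative pair set, not the patterns of Fig. 3; this particular set happens to be stored, but the correlation rule offers no such guarantee for general non-orthogonal patterns):

```python
import numpy as np

def bam_stable(W, X, Y):
    """A pattern pair (X, Y) is a stable state of the BAM iff
    X(i) * (W Y)(i) > 0 for all i and Y(j) * (W^T X)(j) > 0 for all j."""
    return bool(np.all(X * (W @ Y) > 0) and np.all(Y * (W.T @ X) > 0))

# Correlation (Hebb-type) rule: W = sum_k X_k Y_k^T
Xs = np.array([[ 1, -1,  1], [ 1,  1, -1]])
Ys = np.array([[ 1,  1, -1], [-1,  1,  1]])
W = sum(np.outer(x, y) for x, y in zip(Xs, Ys))
stable = [bam_stable(W, x, y) for x, y in zip(Xs, Ys)]
print(stable)   # -> [True, True] for this (storable) set
```

Replacing the correlation rule by the weighted form of eqn. 8 only changes the construction of `W`; the same stability check applies.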
Here, we give an algorithm which takes care of both the
aspects of learning mentioned above. We show the equivalence of the BAM to a perceptron, so that the optimal
learning algorithms of perceptrons can be used for
optimal learning in the BAM. To this end, consider the
BAM of Fig. 4 with N = P = 3.
Fig. 4 BAM with N = P = 3

Let (X_k, Y_k) be one pair of patterns to be stored. The conditions on the W_ij s for (X_k, Y_k) to be stable are given in conditions 6. Each of the conditions 6 on the W_ij s can be converted to an input-output pair for the equivalent perceptron shown in Fig. 5, whose weight vector is arranged as (W_11 W_12 W_13 W_21 W_22 W_23 W_31 W_32 W_33)^T. Thus the conditions 6, after conversion, become the following pattern set for this perceptron, with the corresponding outputs:

Y_k(1) Y_k(2) Y_k(3) 0      0      0      0      0      0      → X_k(1)
0      0      0      Y_k(1) Y_k(2) Y_k(3) 0      0      0      → X_k(2)
0      0      0      0      0      0      Y_k(1) Y_k(2) Y_k(3) → X_k(3)
X_k(1) 0      0      X_k(2) 0      0      X_k(3) 0      0      → Y_k(1)
0      X_k(1) 0      0      X_k(2) 0      0      X_k(3) 0      → Y_k(2)
0      0      X_k(1) 0      0      X_k(2) 0      0      X_k(3) → Y_k(3)

Fig. 5 Perceptron equivalent to BAM of Fig. 4

Each pattern pair in the BAM structure gives rise to (N + P) sub-patterns in the equivalent perceptron, corresponding to the (N + P) neurons in the BAM. Now, storing M pattern pairs in the BAM is equivalent to learning in a perceptron with these M(N + P) sub-patterns. Thus, by applying the AdaTron algorithm to this equivalent perceptron, we obtain the weights for the optimal BAM.

On the other hand, the Hopfield network [2] is a feedback neural network, in which every neuron is connected to every other neuron. This network can be used as an auto-associative memory. Patterns are stored in the network by making them the stable points. Here, learning is the process of obtaining the weight matrix W, such that the given patterns are stable. Once the learning is over, if the network is given a partial pattern, it is supposed to complete the pattern.

Now consider the problem of storing the n-dimensional patterns X_1, X_2, ..., X_m in a Hopfield network of n neurons, i.e. we want W ∈ R^{n×n} such that

sgn(W X_i) = X_i  for all i

where the operator sgn is applied on all elements of the vector.

This is a system of linear constraints on W. For every pattern, we get n constraints, one per neuron. Therefore we have mn constraints, which are to be converted into patterns used for training a perceptron. However, we are guaranteed to get a solution whenever one exists. If self-interactions are assumed to be zero, then this equivalent perceptron will have n^2 - n connections. If we want symmetry also to be imposed on W, we can combine w_ij and w_ji, and have only one connection. Then we will have a perceptron with (n^2 - n)/2 weights.

As an example, consider a Hopfield network of 4 neurons. The equivalent perceptron has 6 inputs (assuming zero self-interactions and symmetry of W), and its weight vector is of the form (w_12 w_13 w_14 w_23 w_24 w_34)^T. The pattern (x(1) x(2) x(3) x(4))^T, to be stored in the Hopfield network, is converted to the following input-output pairs for the equivalent perceptron:

x(2) x(3) x(4) 0    0    0    → x(1)
x(1) 0    0    x(3) x(4) 0    → x(2)
0    x(1) 0    x(2) 0    x(4) → x(3)
0    0    x(1) 0    x(2) x(3) → x(4)

The advantage of this approach is that the Hopfield network with partial connectivity can also be trained, just by keeping only the corresponding weights in the equivalent perceptron. Also, the network can be made robust by applying the optimal learning algorithms for the equivalent perceptron.
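The conversion from Hopfield storage to perceptron patterns can be sketched for the symmetric, zero-self-interaction case (a sketch generalising the 4-neuron example above; the helper name is ours):

```python
import numpy as np
from itertools import combinations

def hopfield_to_perceptron(patterns):
    """Convert storage of n-dim patterns in a symmetric, zero-diagonal
    Hopfield net into perceptron patterns over the n(n-1)/2 weights
    (w12, w13, ..., w_{n-1,n}): one sub-pattern per (pattern, neuron)."""
    n = len(patterns[0])
    idx = list(combinations(range(n), 2))        # weight-vector layout
    subs, outs = [], []
    for x in patterns:
        for i in range(n):                       # constraint of neuron i
            v = np.zeros(len(idx))
            for p, (a, b) in enumerate(idx):
                if a == i:
                    v[p] = x[b]                  # w_ab multiplies x(b)
                elif b == i:
                    v[p] = x[a]
            subs.append(v)
            outs.append(x[i])                    # required sign of the field
    return np.array(subs), np.array(outs)

x = np.array([1, -1, -1, 1])
subs, outs = hopfield_to_perceptron([x])
# Each of the 4 rows has 3 non-zero entries, as in the 4-neuron example.
```

Feeding `subs` and `outs` into a PLA or AdaTron routine then yields the six weights of the equivalent perceptron, and hence the Hopfield weight matrix.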
3.2 Learning in multistate perceptrons
A multistate perceptron is a straightforward generalisation of the two-state perceptron to a perceptron whose output can be in one of Q states (Q > 2). This is useful for multi-class pattern classification. Here the output y can take any value from the set {0, 1, 2, ..., Q - 1}. The perceptron now has Q - 1 thresholds (t_1 < t_2 < ... < t_{Q-1}) to separate the neighbouring classes, see Fig. 6. The input-output relation is given by

y = 0      if W^T X < t_1
y = s      if t_s <= W^T X < t_{s+1},  s = 1, ..., Q - 2
y = Q - 1  if t_{Q-1} <= W^T X
The problem of learning is to find W and the thresholds, t_1, t_2, ..., such that the input patterns are mapped to their corresponding classes. This is equivalent to finding W and thresholds satisfying the following inequalities:

W^T X_k^s < t_1            for s = 0
t_s < W^T X_k^s < t_{s+1}  for s = 1, 2, ..., Q - 2
t_{Q-1} < W^T X_k^s        for s = Q - 1        (9)

Fig. 6 Multistate perceptron with Q states

We define the gap between two neighbouring classes, s and s + 1, as

g^s(W) = (1/||W||) min_k { t_{s+1} - W^T X_k^s, W^T X_k^{s+1} - t_{s+1} }

We define G(W) as the minimum gap between any two classes, given by

G(W) = min_s g^s(W)

The optimal multistate perceptron is then defined as the one which maximises G(W). This is equivalent to maximising c in the following inequalities:

t_1 - W^T X_k^s > c||W||      for s = 0
t_{s+1} - W^T X_k^s > c||W||  for s = 1, 2, ..., Q - 2
W^T X_k^s - t_s > c||W||      for s = 1, 2, ..., Q - 2
W^T X_k^s - t_{Q-1} > c||W||  for s = Q - 1

In Reference 4, an iterative algorithm is given to find W and the t_s that satisfy expr. 9, but the problem of finding the multistate perceptron of maximal stability is posed as an open question. Here we give a solution to this problem. First, we derive a learning algorithm for the multistate perceptron, by showing that learning in this case is equivalent to learning in the 2-state perceptron.

Consider the learning problem in a three-state perceptron. We need to find W, t_1 and t_2 such that

W^T X_k^0 < t_1,  t_1 < W^T X_k^1 < t_2,  t_2 < W^T X_k^2

This is equivalent to finding the weight vector for the two-state perceptron, with the following input-output pattern pairs:

[X_k^0, -1, 0] → -1
[X_k^1, -1, 0] → +1
[X_k^1, 0, -1] → -1
[X_k^2, 0, -1] → +1

It can be verified that the weight vector which gives this mapping satisfies the constraints of the three-state perceptron. The last two elements in the weight vector of the two-state perceptron correspond to the thresholds of the three-state perceptron. Thus, by applying the learning algorithm of the two-state perceptron to the transformed input-output pairs, we obtain weights and the corresponding thresholds which solve the three-state perceptron learning problem. Therefore the convergence of the algorithm follows directly from the perceptron convergence theorem.

Fig. 7 Perceptron equivalent to the multistate perceptron of Fig. 6

In order to get the weights and thresholds for a multistate perceptron, we generate transformed input-output pairs by appending additional elements to the input patterns, see Fig. 7. The number of these additional elements
is equal to the number of thresholds in the perceptron. The output is +1 or -1, depending on which threshold is being considered, and on which side of that threshold the input pattern is supposed to lie. Thus, for patterns in classes 0 and Q - 1, we need to check against only one threshold, and, correspondingly, we will generate only one new input-output pair for every input-output pair in the original pattern set. For patterns in the other classes, we obtain two new input-output pairs for every input-output pair in the original pattern set, as they have to be compared with two thresholds. The transformed input-output pairs will be of the following form:
[X_k^0, -1, 0, ..., 0] → -1                           for s = 0
[X_k^s, 0, ..., 0, -1 (sth slot), 0, ..., 0] → +1     for s = 1, 2, ..., Q - 2
[X_k^s, 0, ..., 0, -1 ((s+1)th slot), 0, ..., 0] → -1 for s = 1, 2, ..., Q - 2
[X_k^{Q-1}, 0, ..., 0, -1] → +1                       for s = Q - 1
Now, the problem of obtaining the optimal multistate perceptron is equivalent to obtaining the maximally stable two-state perceptron for the transformed input-output pairs. This maximises the minimum gap between any threshold and the patterns of its neighbouring classes. This is exactly what we require for a maximally stable multistate perceptron. So the AdaTron algorithm of the two-state perceptron can be used to find the multistate perceptron with maximal stability characteristics.
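The generation of transformed input-output pairs can be sketched as follows (a sketch for a single pattern of class s; the function name and example values are ours):

```python
import numpy as np

def multistate_to_twostate(X, s, Q):
    """Transform one pattern X of class s in a Q-state perceptron into
    two-state pairs: append Q-1 slots (one per threshold); class s is
    checked against t_s (output +1) and t_{s+1} (output -1)."""
    pairs = []
    def augmented(threshold_pos):               # -1 in the slot of t_pos
        extra = np.zeros(Q - 1)
        extra[threshold_pos - 1] = -1.0
        return np.concatenate([X, extra])
    if s > 0:                                   # W.X > t_s  ->  output +1
        pairs.append((augmented(s), +1))
    if s < Q - 1:                               # W.X < t_{s+1} -> output -1
        pairs.append((augmented(s + 1), -1))
    return pairs

# Three-state example (Q = 3): the middle class meets both thresholds
p = multistate_to_twostate(np.array([0.5, 2.0]), s=1, Q=3)
# p[0]: ([0.5, 2.0, -1, 0], +1);  p[1]: ([0.5, 2.0, 0, -1], -1)
```

The last Q - 1 components of the learned two-state weight vector are then read off as the thresholds t_1, ..., t_{Q-1}.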
4 Experimental results
In the first part of our experiments, we consider the problem of optimal learning in the BAM and the Hopfield network. We consider a BAM with N = P = 15, and a Hopfield network with 30 neurons. We compare the proposed method with the weighted learning method of Reference 10, using a set of test patterns. The number of patterns stored, M, is varied from 2 to 7. For each value of M, 1000 experiments are performed. In each experiment, patterns are generated randomly, and the two algorithms are applied. If both the methods are successful in storing the patterns, then the network is tested for pattern-retrieval performance. For this purpose, 300 patterns are generated randomly: 100 of them are corrupted by flipping 10% of the bits, another 100 patterns are corrupted by flipping 15% of the bits, and 20% of the bits are flipped in the remaining 100 patterns. The network is initialised with the corrupted pattern, and allowed to run. The results are then checked to verify whether the network has converged to the correct pattern.

Results of the comparisons for the BAM are summarised in Tables 1-3. Table 1 gives the percentage of experiments in which the two methods were successful in storing the patterns. Table 2 contains c/||W_i|| (where ||W_i|| is the Euclidean norm of the vector of weights connecting neuron i to the other neurons, and c, as obtained by the algorithm, is the largest positive number that satisfies expr. 7 with respect to a normalised weight matrix), which is a measure of the size of the basin of attraction. This parameter is averaged over all neurons, and over all experiments in which both the methods were successful in storing the patterns. Table 3 gives the percentage of successful recalls, i.e. runs in which the network converged to the correct pattern in the pattern-retrieval phase. Tables 4-6 give similar results for the Hopfield network.

Table 1: Storage of patterns in BAM
M                    2     3     4     5     6     7
Weighted Hebb rule   100   98.8  89.0  57.9  20.8  3.3
Proposed method      100   99.9  99.9  99.7  99.6  99.2

Table 2: Measure of basin of attraction in BAM
M                    2     3     4     5     6     7
Weighted Hebb rule   2.64  1.61  1.06  0.76  0.58  0.49
Proposed method      2.69  1.88  1.43  1.19  1.03  0.94

Table 3: Percentage of successful recalls in BAM
M                    2     3     4     5     6     7
Weighted Hebb rule   99    91    80    70    60    51
Proposed method      99    93    85    78    67    64

Table 4: Storage of patterns in the Hopfield network
M                    2     3     4     5     6     7
Weighted Hebb rule   99    97    90    77    51    22
Proposed method      100   100   100   100   100   100

Table 5: Measure of basin of attraction in the Hopfield network
M                    2     3     4     5     6     7
Weighted Hebb rule   3.54  2.35  1.68  1.24  0.99  0.82
Proposed method      3.78  3.03  2.57  2.25  2.01  1.81

Table 6: Percentage of successful recalls in the Hopfield network
M                    2     3     4     5     6     7
Weighted Hebb rule   99    95    89    79    71    64
Proposed method      99    99    98    96    94    91

In the second part of our experiments, we applied the proposed method for learning in the multistate perceptron. For simplicity, we considered a 3-state perceptron, and generated 3 sets of random patterns from the three square regions shown in Fig. 8. Our results are compared with those obtained from the learning algorithm given in Reference 4.

Fig. 8 Three-state perceptron decision boundaries

In Fig. 8, the dotted lines l1 and l2 show the decision boundaries obtained by the algorithm in Reference 4. The solid lines L1 and L2 in the Figure are the ones obtained by our method. Clearly, L1 and L2 are better decision boundaries than l1 and l2. From these results, the superiority of the proposed method is evident.
5 Conclusions
In this paper, we have described a method of exploiting
the learning algorithms of a two-state perceptron, to
obtain optimal weight matrices for general neural architectures like the bidirectional associative memory, Hopfield network and the multistate perceptron. More details
of the proposed method can be found in References 12
and 13. This procedure is applicable even when the connectivity is partial, as in the case of cellular neural networks. It can also be applied to learning in the Hopfield
network/BAM with multistate neurons.
6 References

1 MINSKY, M.L., and PAPERT, S.: 'Perceptrons' (MIT Press, Cambridge, MA, 1969)
2 HOPFIELD, J.J.: 'Neural networks and physical systems with emergent collective computational abilities', Proc. Natl. Acad. Sci., 1982, 79, pp. 2554-2558
3 AMIT, D.J., WONG, K.Y.M., and CAMPBELL, C.: 'Perceptron learning with sign-constrained weights', J. Phys. A, Math. Gen., 1989, 22, pp. 2039-2045
4 ELIZALDE, E., and GOMEZ, S.: 'Multistate perceptrons: learning rule and perceptron of maximal stability', J. Phys. A, Math. Gen., 1992, 25, pp. 5039-5045
5 GARDNER, E.: 'The space of interactions in neural network models', J. Phys. A, Math. Gen., 1988, 21, pp. 257-270
6 ANLAUF, J.K., and BIEHL, M.: 'The AdaTron: an adaptive perceptron algorithm', Europhys. Lett., 1989, 10, (7), pp. 687-692
7 WIDROW, B., and HOFF, M.E.: 'Adaptive switching circuits'. IRE Western Electronic Show and Convention, Convention Record, Part 4, pp. 96-104
8 KRAUTH, W., and MEZARD, M.: 'Learning algorithms with optimal stability in neural networks', J. Phys. A, Math. Gen., 1987, 20, p. L745
9 KOSKO, B.: 'Adaptive bidirectional associative memories', Appl. Opt., 1987, 26, (23), pp. 4947-4960
10 WANG, T., ZHUANG, X., and XING, X.: 'Weighted learning of bidirectional associative memories by global minimization', IEEE Trans. Neural Netw., 1993, 4, (6), pp. 1010-1018
11 TOU, J.T., and GONZALEZ, R.C.: 'Pattern recognition principles' (Addison-Wesley, 1974)
12 SHANMUKH, K., and VENKATESH, Y.V.: 'On an optimal learning scheme for bi-directional associative memories'. Int. Joint Conf. on Neural Networks, Nagoya, 1993
13 SHANMUKH, K., and VENKATESH, Y.V.: 'Multistate perceptron with optimal stability'. ANNNSI-93, Bombay, 1993
14 WANG, T., ZHUANG, X., and XING, X.: 'Designing bidirectional associative memories with optimal stability', IEEE Trans. Syst., Man Cybern., 1994, SMC-24, (5), pp. 778-790