Hindawi Publishing Corporation
Discrete Dynamics in Nature and Society
Volume 2012, Article ID 456919, 25 pages
doi:10.1155/2012/456919
Research Article
A Smoothing Interval Neural Network
Dakun Yang and Wei Wu
School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
Correspondence should be addressed to Wei Wu, wuweiw@dlut.edu.cn
Received 23 May 2012; Accepted 18 October 2012
Academic Editor: Daniele Fournier-Prunaret
Copyright © 2012 D. Yang and W. Wu. This is an open access article distributed under the Creative
Commons Attribution License, which permits unrestricted use, distribution, and reproduction in
any medium, provided the original work is properly cited.
In many applications, it is natural to use interval data to describe various kinds of uncertainties. This paper is concerned with an interval neural network with one hidden layer. As indicated by our numerical experiments, the original interval neural network may cause oscillation of the weights during the learning procedure. In this paper, a smoothing interval neural network is proposed to prevent this weight oscillation. Here, by smoothing we mean that, in a neighborhood of the origin, we replace the absolute values of the weights in the hidden and output layers by a smooth function of the weights. The convergence of a gradient algorithm for training the smoothing interval neural network is proved. Supporting numerical experiments are provided.
1. Introduction
In the last two decades, artificial neural networks have been successfully applied to various domains, including pattern recognition [1], forecasting [2, 3], and data mining [4, 5]. One of the most widely used neural networks is the feedforward neural network with the well-known error backpropagation learning algorithm. In most neural network architectures, however, the input variables and the predicted results are represented as single-point values rather than as intervals. In real-life situations, available information is often uncertain, imprecise, and incomplete, and can be represented by fuzzy data, a generalization of interval data. So in many applications it is more natural to treat the input variables and the predicted results as intervals than as sets of single-point values.

Since multilayer feedforward neural networks have high capability as universal approximators of nonlinear mappings [6–8], some neural network methods for handling interval data have been proposed. For instance, in [9], the BP algorithm [10, 11] was extended to the case of interval input vectors. In [12], the authors proposed a new extension of backpropagation using interval arithmetic, called Interval Arithmetic Backpropagation (IABP). This new algorithm permits training samples and targets that can be, indistinctly, points or intervals. In [13], the authors proposed a new model of multilayer perceptron based on interval arithmetic that facilitates handling input and output interval data, where weights and biases are single-valued and not interval-valued.
However, weight oscillation phenomena during the learning procedure were observed in our numerical experiments for these interval neural network models. In order to prevent this weight oscillation, a smoothing interval neural network is proposed in this paper. Here, by smoothing we mean that, in the activation function and in a neighborhood of the origin, we replace the absolute values of the weights by a smooth function of the weights. Gradient algorithms [14–17] are applied to train the smoothing interval neural network. The weak and strong convergence theorems of the algorithms are proved. Supporting numerical results are provided.
The remainder of this paper is organized as follows. Some basic notations of interval
analysis are described in Section 2. The traditional interval neural network is introduced in
Section 3. Section 4 is devoted to our smoothing interval neural network and the gradient
algorithm. The convergence results of the gradient learning algorithm are shown in Section 5.
Supporting numerical experiments are provided in Section 6. The appendix is devoted to the
proof of the theorem.
2. Interval Arithmetic
Interval arithmetic appeared as a tool in numerical computing in the late 1950s. Interval mathematics is a theory introduced by Moore [18] and Sunaga [19] in order to control errors in numeric computations. The fundamentals used in this paper are described below.
Let us denote intervals by uppercase letters such as $A$ and real numbers by lowercase letters such as $a$. An interval can be represented by its lower bound $a^L$ and upper bound $a^U$ as $A = [a^L, a^U]$, or equivalently by its midpoint $a^C$ and radius $a^R$ as $A = (a^C, a^R)$, where
$$a^C = \frac{a^L + a^U}{2}, \qquad a^R = \frac{a^U - a^L}{2}. \tag{2.1}$$
For intervals $A = [a^L, a^U]$ and $B = [b^L, b^U]$, the basic interval operations are defined by
$$A + B = \left[a^L + b^L,\; a^U + b^U\right], \qquad A - B = \left[a^L - b^U,\; a^U - b^L\right],$$
$$k \cdot A = \begin{cases} \left[k \cdot a^L,\; k \cdot a^U\right], & k > 0, \\ \left[k \cdot a^U,\; k \cdot a^L\right], & k < 0, \end{cases} \tag{2.2}$$
where $k$ is a constant.
If $f$ is an increasing function, then the interval output is given by
$$f(A) = \left[f\left(a^L\right),\; f\left(a^U\right)\right]. \tag{2.3}$$
In this paper, we use the following weighted Euclidean distance for a pair of intervals $A$ and $B$:
$$d(A, B) = \beta\left(a^C - b^C\right)^2 + (1 - \beta)\left(a^R - b^R\right)^2, \qquad \beta \in [0, 1]. \tag{2.4}$$
The parameter $\beta \in [0, 1]$ makes it possible to give more importance to the prediction of the output centres or to the prediction of the radii. For $\beta = 1$, learning concentrates on the prediction of the output interval centre and no importance is given to the prediction of its radius. For $\beta = 0.5$, both predictions (centres and radii) have the same weight in the objective function. For our purpose, we assume $\beta \in (0, 1)$.
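To make the operations above concrete, here is a minimal Python sketch of the midpoint-radius arithmetic and the weighted distance (2.4). The class and function names are our own illustration, not part of the paper.

```python
import numpy as np

class Interval:
    """An interval in midpoint-radius form: A = (c, r) with r >= 0."""
    def __init__(self, c, r):
        self.c, self.r = float(c), float(r)

    def __add__(self, other):
        # [aL + bL, aU + bU]: midpoints add, radii add
        return Interval(self.c + other.c, self.r + other.r)

    def __sub__(self, other):
        # [aL - bU, aU - bL]: midpoints subtract, radii still add
        return Interval(self.c - other.c, self.r + other.r)

    def scale(self, k):
        # k * A swaps the endpoints when k < 0, so the radius picks up |k|
        return Interval(k * self.c, abs(k) * self.r)

def apply_increasing(f, a):
    """f(A) = [f(aL), f(aU)] for an increasing f, returned in (c, r) form."""
    lo, hi = f(a.c - a.r), f(a.c + a.r)
    return Interval((lo + hi) / 2, (hi - lo) / 2)

def dist(a, b, beta=0.5):
    """Weighted distance d(A, B) of (2.4)."""
    return beta * (a.c - b.c) ** 2 + (1 - beta) * (a.r - b.r) ** 2
```

For instance, `apply_increasing(np.tanh, Interval(0.0, 1.0))` gives the image of $[-1, 1]$ under $\tanh$ in midpoint-radius form.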
3. Interval Neural Network
In this paper, we consider an interval neural network with three layers, where the input and output are interval-valued and the weights are real-valued. The numbers of neurons in the input, hidden, and output layers are $N$, $M$, and $1$, respectively. Let $W_m = (w_{m1}, w_{m2}, \ldots, w_{mN})^T \in \mathbb{R}^N$, $m = 1, 2, \ldots, M$, be the weight vector connecting the input and the hidden layers. The weight vector connecting the hidden and the output layers is denoted by $W_0 = (w_{0,1}, w_{0,2}, \ldots, w_{0,M})^T \in \mathbb{R}^M$. To simplify the presentation, we write $W = (W_0^T, W_1^T, \ldots, W_M^T)^T \in \mathbb{R}^{NM+M}$. In the interval neural network, a nonlinear activation function $f(x)$ is used in the hidden layer, and a linear activation function in the output layer.
For an arbitrary interval-valued input $X = (X_1, X_2, \ldots, X_N)$, where $X_i = (x_i^C, x_i^R)$, $i = 1, 2, \ldots, N$, since the weights of the proposed structure are real-valued, the linear combination results in an interval given by
$$S_m = \sum_{i=1}^N w_{mi} X_i = \left(s_m^C, s_m^R\right) = \left(\sum_{i=1}^N w_{mi}\, x_i^C,\; \sum_{i=1}^N |w_{mi}|\, x_i^R\right). \tag{3.1}$$
Then the output of the interval neuron in the hidden layer is given by
$$H_m = f(S_m) = \left[f\left(s_m^C - s_m^R\right),\; f\left(s_m^C + s_m^R\right)\right] = \left(h_m^C, h_m^R\right)$$
$$= \left(\frac{f\left(s_m^C + s_m^R\right) + f\left(s_m^C - s_m^R\right)}{2},\; \frac{f\left(s_m^C + s_m^R\right) - f\left(s_m^C - s_m^R\right)}{2}\right). \tag{3.2}$$
Finally, the output of the interval neuron in the output layer is given by
$$Y = \left(y^C, y^R\right), \tag{3.3}$$
$$y^C = \sum_{m=1}^M w_{0m}\, h_m^C, \tag{3.4}$$
$$y^R = \sum_{m=1}^M |w_{0m}|\, h_m^R. \tag{3.5}$$
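The forward pass (3.1)-(3.5) can be transcribed directly into NumPy. The sketch below assumes $f = \tanh$ (an increasing function, as Section 2 requires); variable names and shapes are our own convention.

```python
import numpy as np

def forward_interval_net(W, w0, xC, xR, f=np.tanh):
    """Forward pass (3.1)-(3.5) of the interval network of Section 3.

    W  : (M, N) hidden-layer weights;  w0 : (M,) output weights (real-valued)
    xC : (N,) input midpoints;         xR : (N,) input radii (nonnegative)
    Returns the output interval (yC, yR).
    """
    sC = W @ xC                          # (3.1): midpoints
    sR = np.abs(W) @ xR                  # (3.1): radii use |w_mi|
    hC = (f(sC + sR) + f(sC - sR)) / 2   # (3.2): hidden midpoints
    hR = (f(sC + sR) - f(sC - sR)) / 2   # (3.2): hidden radii
    yC = w0 @ hC                         # (3.4)
    yR = np.abs(w0) @ hR                 # (3.5)
    return yC, yR
```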
4. Smoothing Interval Neural Network
4.1. Smoothing Interval Neural Network Structure
As revealed by the numerical experiment later in this paper, weight oscillation phenomena appear during the learning procedure for the original interval neural network presented in the last section. In order to prevent this weight oscillation, we propose a smoothing interval neural network by replacing $|w_{mi}|$ and $|w_{0m}|$ with a smooth function $\varphi(w_{mi})$ and $\varphi(w_{0m})$ in (3.1) and (3.5). Then, the output of the smoothing interval neuron in the hidden layer is defined as
$$S_m = \sum_{i=1}^N w_{mi} X_i = \left(\sum_{i=1}^N w_{mi}\, x_i^C,\; \sum_{i=1}^N \varphi(w_{mi})\, x_i^R\right), \tag{4.1}$$
$$H_m = f(S_m) = \left[f\left(s_m^C - s_m^R\right),\; f\left(s_m^C + s_m^R\right)\right] = \left(h_m^C, h_m^R\right)$$
$$= \left(\frac{f\left(s_m^C + s_m^R\right) + f\left(s_m^C - s_m^R\right)}{2},\; \frac{f\left(s_m^C + s_m^R\right) - f\left(s_m^C - s_m^R\right)}{2}\right). \tag{4.2}$$
The output of the smoothing interval neuron in the output layer is given by
$$y^C = \sum_{m=1}^M w_{0m}\, h_m^C, \qquad y^R = \sum_{m=1}^M \varphi(w_{0m})\, h_m^R. \tag{4.3}$$
For our purpose, $\varphi(x)$ can be chosen as any smooth function that approximates $|x|$ near the origin. For definiteness and simplicity, we choose $\varphi(x)$ as a piecewise polynomial function:
$$\varphi(x) = \begin{cases} -x, & x \le -\mu, \\ \tilde{\varphi}(x), & -\mu < x < \mu, \\ x, & x \ge \mu, \end{cases} \tag{4.4}$$
where $\mu > 0$ is a small constant and
$$\tilde{\varphi}(x) = -\frac{1}{8\mu^3}\, x^4 + \frac{3}{4\mu}\, x^2 + \frac{3}{8}\,\mu. \tag{4.5}$$
We observe that the above defined $\varphi(x)$ is a convex function in $C^2(\mathbb{R})$, and it is identical to the absolute value function $|x|$ outside the zero neighborhood $(-\mu, \mu)$. (The coefficients in (4.5) are determined by matching the value, the first derivative, and the second derivative of $|x|$ at $x = \pm\mu$.)
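A minimal sketch of $\varphi$ and its derivative as defined in (4.4)-(4.5), with a numerical check that the pieces join smoothly at $x = \pm\mu$; the function names are ours.

```python
import numpy as np

def phi(x, mu=0.5):
    """Smoothed |x| of (4.4)-(4.5): quartic inside (-mu, mu), |x| outside."""
    x = np.asarray(x, dtype=float)
    inner = -x**4 / (8 * mu**3) + 3 * x**2 / (4 * mu) + 3 * mu / 8
    return np.where(np.abs(x) < mu, inner, np.abs(x))

def dphi(x, mu=0.5):
    """phi'(x): equals sign(x) outside (-mu, mu) and stays within [-1, 1]."""
    x = np.asarray(x, dtype=float)
    inner = -x**3 / (2 * mu**3) + 3 * x / (2 * mu)
    return np.where(np.abs(x) < mu, inner, np.sign(x))

mu = 0.5
for v in (mu, -mu):
    assert np.isclose(phi(v, mu), abs(v))       # value matches |x| at the seam
    assert np.isclose(dphi(v, mu), np.sign(v))  # slope matches there as well
```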
4.2. Gradient Algorithm of the Smoothing Interval Neural Network
Suppose that we are supplied with a training sample set $\{X_j, O_j\}_{j=1}^J$, where the $X_j$'s and $O_j$'s are input and ideal output samples, respectively: $X_j = (X_{1j}, X_{2j}, \ldots, X_{Nj})^T$, $X_{ij} = [x_{ij}^L, x_{ij}^U] = (x_{ij}^C, x_{ij}^R)$, $i = 1, 2, \ldots, N$, and $O_j = [o_j^L, o_j^U] = (o_j^C, o_j^R)$. Our task is to find the weights $W = (W_0^T, W_1^T, \ldots, W_M^T)^T$ such that
$$O_j = Y(X_j), \qquad j = 1, 2, \ldots, J. \tag{4.6}$$
But usually, a weight vector $W = (W_0^T, W_1^T, \ldots, W_M^T)^T$ satisfying (4.6) does not exist and, instead, the aim of the network learning is to choose the weights $W$ so as to minimize an error function of the smoothing interval neural network. By (2.4), a simple and typical error function is the quadratic error function:
$$E(W) = \frac{1}{2} \sum_{j=1}^J \left[\beta\left(o_j^C - y_j^C\right)^2 + (1 - \beta)\left(o_j^R - y_j^R\right)^2\right]. \tag{4.7}$$
Let us denote $f_j^C(t) = \frac{1}{2}(o_j^C - t)^2$ and $f_j^R(t) = \frac{1}{2}(o_j^R - t)^2$, $j = 1, 2, \ldots, J$, $t \in \mathbb{R}$; then the error function (4.7) can be rewritten as
$$E(W) = \sum_{j=1}^J \left[\beta f_j^C\left(y_j^C\right) + (1 - \beta) f_j^R\left(y_j^R\right)\right]. \tag{4.8}$$
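In code, the objective (4.7)/(4.8) is a one-liner over a batch of predictions; a sketch (names ours):

```python
import numpy as np

def error(yC, yR, oC, oR, beta=0.5):
    """Quadratic interval error E(W) of (4.7) over a batch of J samples."""
    return 0.5 * np.sum(beta * (oC - yC) ** 2 + (1 - beta) * (oR - yR) ** 2)
```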
Now, we introduce the gradient algorithm [15, 16] for the smoothing interval neural network. The gradient of the error function $E(W)$ with respect to $W_0$ is given by
$$\begin{aligned}
\frac{\partial E(W)}{\partial W_0} &= \sum_{j=1}^J\left[\beta\left(y_j^C - o_j^C\right)\frac{\partial y_j^C}{\partial W_0} + (1-\beta)\left(y_j^R - o_j^R\right)\frac{\partial y_j^R}{\partial W_0}\right] \\
&= \sum_{j=1}^J\left[\beta f_j^{C\prime}\left(y_j^C\right)\frac{\partial y_j^C}{\partial W_0} + (1-\beta) f_j^{R\prime}\left(y_j^R\right)\frac{\partial y_j^R}{\partial W_0}\right],
\end{aligned} \tag{4.9}$$
where
$$\frac{\partial y_j^C}{\partial W_0} = h_j^C, \qquad \frac{\partial y_j^R}{\partial W_0} = \varphi'(W_0)\, h_j^R \quad \text{(understood componentwise)}. \tag{4.10}$$
The gradient of the error function $E(W)$ with respect to $W_m$, $m = 1, 2, \ldots, M$, is given by
$$\begin{aligned}
\frac{\partial E(W)}{\partial W_m} &= \sum_{j=1}^J\left[\beta\left(y_j^C - o_j^C\right)\frac{\partial y_j^C}{\partial h_{jm}^C}\frac{\partial h_{jm}^C}{\partial W_m} + (1-\beta)\left(y_j^R - o_j^R\right)\frac{\partial y_j^R}{\partial h_{jm}^R}\frac{\partial h_{jm}^R}{\partial W_m}\right] \\
&= \sum_{j=1}^J\left[\beta f_j^{C\prime}\left(y_j^C\right)\frac{\partial y_j^C}{\partial h_{jm}^C}\frac{\partial h_{jm}^C}{\partial W_m} + (1-\beta) f_j^{R\prime}\left(y_j^R\right)\frac{\partial y_j^R}{\partial h_{jm}^R}\frac{\partial h_{jm}^R}{\partial W_m}\right],
\end{aligned} \tag{4.11}$$
where
$$\frac{\partial y_j^C}{\partial h_{jm}^C} = w_{0m}, \qquad \frac{\partial y_j^R}{\partial h_{jm}^R} = \varphi(w_{0m}),$$
$$\frac{\partial h_{jm}^C}{\partial W_m} = \frac{f'\left(s_{jm}^C + s_{jm}^R\right)\left(x_j^C + \varphi'(W_m)\, x_j^R\right) + f'\left(s_{jm}^C - s_{jm}^R\right)\left(x_j^C - \varphi'(W_m)\, x_j^R\right)}{2},$$
$$\frac{\partial h_{jm}^R}{\partial W_m} = \frac{f'\left(s_{jm}^C + s_{jm}^R\right)\left(x_j^C + \varphi'(W_m)\, x_j^R\right) - f'\left(s_{jm}^C - s_{jm}^R\right)\left(x_j^C - \varphi'(W_m)\, x_j^R\right)}{2}. \tag{4.12}$$
In the learning procedure, the weights $W$ are iteratively refined as follows:
$$W^{k+1} = W^k + \Delta W^k, \tag{4.13}$$
where
$$\Delta W^k = -\eta\,\frac{\partial E\left(W^k\right)}{\partial W}, \tag{4.14}$$
with $\eta > 0$ a constant learning rate and $k = 0, 1, 2, \ldots$.
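Collecting (4.1)-(4.3) and (4.9)-(4.14), the following self-contained sketch trains the smoothing interval network by batch gradient descent, with $f = \tanh$ and the gradients vectorized over the $J$ samples. All names are our own; this is an illustration of the update rule, not the authors' code.

```python
import numpy as np

def tanh_d(t):          # f'(t) for f = tanh
    return 1.0 - np.tanh(t) ** 2

def phi(x, mu):         # smoothed |x|, (4.4)-(4.5)
    inner = -x**4 / (8 * mu**3) + 3 * x**2 / (4 * mu) + 3 * mu / 8
    return np.where(np.abs(x) < mu, inner, np.abs(x))

def dphi(x, mu):        # phi'
    inner = -x**3 / (2 * mu**3) + 3 * x / (2 * mu)
    return np.where(np.abs(x) < mu, inner, np.sign(x))

def train(xC, xR, oC, oR, M=2, eta=0.2, beta=0.5, mu=0.5, iters=2000, seed=0):
    """Batch gradient descent (4.13)-(4.14) for the smoothing interval net.

    xC, xR : (J, N) input midpoints and radii;  oC, oR : (J,) targets.
    """
    J, N = xC.shape
    rng = np.random.default_rng(seed)
    W = rng.uniform(-0.01, 0.01, (M, N))   # hidden weights
    w0 = rng.uniform(-0.01, 0.01, M)       # output weights
    for _ in range(iters):
        sC = xC @ W.T                      # (J, M), midpoints of (4.1)
        sR = xR @ phi(W, mu).T             # radii use phi(w) instead of |w|
        fp, fm = np.tanh(sC + sR), np.tanh(sC - sR)
        hC, hR = (fp + fm) / 2, (fp - fm) / 2          # (4.2)
        yC, yR = hC @ w0, hR @ phi(w0, mu)             # (4.3)
        dC = beta * (yC - oC)              # beta * f_j^{C'}(y_j^C)
        dR = (1 - beta) * (yR - oR)        # (1-beta) * f_j^{R'}(y_j^R)
        # gradient w.r.t. output weights, cf. (4.9)-(4.10)
        g0 = hC.T @ dC + dphi(w0, mu) * (hR.T @ dR)
        # gradient w.r.t. hidden weights, cf. (4.11)-(4.12)
        fpd, fmd = tanh_d(sC + sR), tanh_d(sC - sR)
        P = (dC[:, None] * w0 + dR[:, None] * phi(w0, mu)) * fpd
        Q = (dC[:, None] * w0 - dR[:, None] * phi(w0, mu)) * fmd
        gW = 0.5 * ((P + Q).T @ xC + dphi(W, mu) * ((P - Q).T @ xR))
        w0 -= eta * g0                     # update (4.13)-(4.14)
        W -= eta * gW
    return W, w0
```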
5. Convergence Theorem for SINN
For any $x \in \mathbb{R}^n$, its Euclidean norm is $\|x\| = \sqrt{\sum_{i=1}^n x_i^2}$. Let $\Omega_0 = \{W \in \Omega : E_W(W) = 0\}$ be the stationary point set of the error function $E(W)$, where $\Omega \subset \mathbb{R}^{NM+M}$ is a bounded region satisfying (A2) below and $E_W$ denotes the gradient of $E$ with respect to $W$. Let $\Omega_{0,s} \subset \mathbb{R}$ be the projection of $\Omega_0$ onto the $s$th coordinate axis, that is,
$$\Omega_{0,s} = \left\{w_s \in \mathbb{R} : W = (w_1, \ldots, w_s, \ldots, w_{NM+M})^T \in \Omega_0\right\}, \tag{5.1}$$
for $s = 1, 2, \ldots, NM + M$. To analyze the convergence of the algorithm, we need the following assumptions.
(A1) $|f(t)|$, $|f'(t)|$, $|f''(t)|$ are uniformly bounded for $t \in \mathbb{R}$;
(A2) there exists a bounded region $\Omega \subset \mathbb{R}^{NM+M}$ such that $\{W^k\} \subset \Omega$ for all $k \in \mathbb{N}$;
(A3) the learning rate $\eta$ is small enough such that (A.47) below is valid;
(A4) $\Omega_{0,s}$ does not contain any interior point for every $s = 1, 2, \ldots, NM + M$.
Now we are ready to present a convergence theorem for the learning algorithm. Its proof is given in the appendix.

Theorem 5.1. Let the error function $E(W)$ be defined by (4.7), and let the weight sequence $\{W^k\}$ be generated by the learning procedure (4.13) and (4.14) for the smoothing interval neural network with $W^0$ being an arbitrary initial guess. If Assumptions (A1), (A2), and (A3) are valid, then we have
$$E\left(W^{k+1}\right) \le E\left(W^k\right), \tag{5.2}$$
$$\lim_{k \to \infty} \left\|E_W\left(W^k\right)\right\| = 0. \tag{5.3}$$
Furthermore, if Assumption (A4) also holds, then there exists a point $W^* \in \Omega_0$ such that
$$\lim_{k \to \infty} W^k = W^*. \tag{5.4}$$
6. Numerical Experiment
We compare the performance of the interval neural network and the smoothing interval neural network by approximating a simple interval-valued function
$$Y = 0.01 \times (X + 11)^2. \tag{6.1}$$
In this example, the training set contains five training samples. Their midpoints are all $0$ and their radii are $0.8552$, $2.6248$, $8.0101$, $0.2922$, and $9.2885$, respectively. The corresponding outputs of the samples are $Y = (y^C, y^R) = 0.01 \times (X + 11)^2$, evaluated in interval arithmetic.
[Figure 1: Norm of the gradient of the interval neural network and the smoothing interval neural network during training; norm of weight gradient versus number of iterations ($10^0$ to $10^4$, log scale). (a) Interval neural network. (b) Smoothing interval neural network.]
[Figure 2: Values of the error function D for the interval neural network and the smoothing neural network; error versus number of iterations (0 to 2000). (a) Value of D for the interval neural network. (b) Value of D for the smoothing interval neural network.]
For the above two interval neural networks, the error function $E(W)$ is defined as in (4.7). But in order to see the error more clearly in the figures, we also use the error $D$ defined by
$$D = \ln E = \ln\left(\frac{1}{2}\sum_{j=1}^J\left[\beta\left(o_j^C - y_j^C\right)^2 + (1-\beta)\left(o_j^R - y_j^R\right)^2\right]\right). \tag{6.2}$$
The number of training iterations is 2000, the initial weights are selected randomly from $[-0.01, 0.01]$, and two neurons are used in the hidden layer. The fixed learning rate is $\eta = 0.2$, with $\beta = 0.5$ and $\mu = 0.5$.
In the learning procedure for the interval neural network, we clearly see from Figure 1(a) that the gradient norm does not converge, and Figure 2(a) shows that the error function $D$ oscillates and does not converge. In contrast, we see from Figure 1(b) that the gradient norm of the smoothing interval neural network converges, and Figure 2(b) shows that the error function $D$, as well as $E$, is monotonically decreasing and convergent.

From this numerical experiment, we can see that the proposed smoothing interval neural network can efficiently avoid oscillation during the training process.
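As a rough illustration, the experiment of this section can be set up as follows, reusing the `train` function from the sketch in Section 4.2. We read the target (6.1) as $Y = 0.01 \times (X+11)^2$ evaluated in interval arithmetic; the data layout and this closed-form target evaluation are our own assumptions.

```python
import numpy as np

# Five training samples from Section 6: midpoints all 0, the radii below.
radii = np.array([0.8552, 2.6248, 8.0101, 0.2922, 9.2885])
xC, xR = np.zeros((5, 1)), radii.reshape(5, 1)

# Targets Y = 0.01*(X + 11)^2 in interval arithmetic; X + 11 > 0 for every
# sample here, so [l, u] maps to [0.01*l**2, 0.01*u**2].
lo, hi = 0.01 * (11 - radii) ** 2, 0.01 * (11 + radii) ** 2
oC, oR = (lo + hi) / 2, (hi - lo) / 2

# Same hyperparameters as in the text: eta = 0.2, beta = 0.5, mu = 0.5,
# two hidden neurons, 2000 iterations, weights drawn from [-0.01, 0.01].
W, w0 = train(xC, xR, oC, oR, M=2, eta=0.2, beta=0.5, mu=0.5, iters=2000)
```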
Appendix
First, we give Lemmas A.1 and A.2. Then, we use them to prove Theorem 5.1.
Lemma A.1. Let {bm } be a bounded sequence satisfying limm → ∞ bm1 − bm 0. Write γ1 limn → ∞ infm>n bm , γ2 limn → ∞ supm>n bm , and S {a ∈ R : There exists a subsequence {bik } of
{bm } such that bik → a as k → ∞}. Then we have
S γ1 , γ2 .
A.1
Proof. It is obvious that γ1 ≤ γ2 and S ⊂ γ1 , γ2 . If γ1 γ2 , then A.1 follows simply from
limm → ∞ bm γ1 γ2 . Let us consider the case γ1 < γ2 and proceed to prove that S ⊃ γ1 , γ2 .
For any a ∈ γ1 , γ2 , there exists ε > 0 such that a − ε, a ε ⊂ γ1 , γ2 . Noting
limm → ∞ bm1 − bm 0, we observe that bm travels between γ1 and γ2 with very small pace
for all large enough m. Hence, there must be infinite number of points of the sequence {bm }
falling into a − ε, a ε. This implies a ∈ S and thus γ1 , γ2 ⊂ S. Furthermore, γ1 , γ2 ⊂ S
immediately leads to γ1 , γ2 ⊂ S. This completes the proof.
For any $k = 0, 1, 2, \ldots$ and $1 \le j \le J$, we define the following notations:
$$\Phi_{0,k,j}^R = \varphi\left(W_0^k\right)\cdot h_{k,j}^R, \qquad \Phi_{0,k,j}^C = W_0^k\cdot h_{k,j}^C, \qquad \Psi_{k,j}^C = h_{k+1,j}^C - h_{k,j}^C, \qquad \Psi_{k,j}^R = h_{k+1,j}^R - h_{k,j}^R. \tag{A.2}$$
Lemma A.2. Suppose Assumptions (A1)-(A3) hold. Then, for any $k = 0, 1, 2, \ldots$ and $1 \le j \le J$, we have
$$\max\left\{\left\|x_j^C\right\|, \left\|x_j^R\right\|, \left|o_j^C\right|, \left|o_j^R\right|, \left\|W_0^k\right\|, \left|\Phi_{0,k,j}^C\right|, \left|\Phi_{0,k,j}^R\right|\right\} \le M_0, \tag{A.3}$$
$$\sum_{j=1}^J\left[\beta f_j^{C\prime}\left(\Phi_{0,k,j}^C\right) h_{k,j}^C + (1-\beta) f_j^{R\prime}\left(\Phi_{0,k,j}^R\right) h_{k,j}^R\,\varphi'\left(W_0^k\right)\right]\cdot\Delta W_0^k = -\eta\left\|\frac{\partial E\left(W^k\right)}{\partial W_0}\right\|^2, \tag{A.4}$$
$$\sum_{j=1}^J \beta f_j^{C\prime}\left(\Phi_{0,k,j}^C\right)\Delta W_0^k\cdot\Psi_{k,j}^C \le M_1\eta^2\left\|\frac{\partial E\left(W^k\right)}{\partial W}\right\|^2, \tag{A.5}$$
$$\sum_{j=1}^J\left[\beta f_j^{C\prime}\left(\Phi_{0,k,j}^C\right) W_0^k\cdot\Psi_{k,j}^C + (1-\beta) f_j^{R\prime}\left(\Phi_{0,k,j}^R\right)\varphi\left(W_0^k\right)\cdot\Psi_{k,j}^R\right] \le -\eta\sum_{m=1}^M\left\|\frac{\partial E\left(W^k\right)}{\partial W_m}\right\|^2 + M_2\eta^2\left\|\frac{\partial E\left(W^k\right)}{\partial W}\right\|^2, \tag{A.6}$$
$$\frac{1}{2}\sum_{j=1}^J \beta f_j^{C\prime\prime}\left(\xi_{0,k,j}^C\right)\left(\Phi_{0,k+1,j}^C - \Phi_{0,k,j}^C\right)^2 \le M_3\eta^2\left\|\frac{\partial E\left(W^k\right)}{\partial W}\right\|^2, \tag{A.7}$$
$$\sum_{j=1}^J (1-\beta) f_j^{R\prime}\left(\Phi_{0,k,j}^R\right)\varphi'\left(\zeta_1^k\right)\Delta W_0^k\cdot\Psi_{k,j}^R \le M_4\eta^2\left\|\frac{\partial E\left(W^k\right)}{\partial W}\right\|^2, \tag{A.8}$$
$$\frac{1}{2}\sum_{j=1}^J (1-\beta) f_j^{R\prime}\left(\Phi_{0,k,j}^R\right)\varphi''\left(\zeta_2^k\right)\left(\Delta W_0^k\right)^2\cdot h_{k,j}^R \le M_5\eta^2\left\|\frac{\partial E\left(W^k\right)}{\partial W_0}\right\|^2, \tag{A.9}$$
$$\frac{1}{2}\sum_{j=1}^J (1-\beta) f_j^{R\prime\prime}\left(\xi_{0,k,j}^R\right)\left(\Phi_{0,k+1,j}^R - \Phi_{0,k,j}^R\right)^2 \le M_6\eta^2\left\|\frac{\partial E\left(W^k\right)}{\partial W}\right\|^2, \tag{A.10}$$
where the constants $M_i$ $(i = 0, 1, \ldots, 6)$ are independent of $k$ and $j$, $\xi_{0,k,j}^C$ lies on the segment between $\Phi_{0,k+1,j}^C$ and $\Phi_{0,k,j}^C$, $\xi_{0,k,j}^R$ lies on the segment between $\Phi_{0,k+1,j}^R$ and $\Phi_{0,k,j}^R$, and $\zeta_1^k$, $\zeta_2^k$ both lie on the segment between $W_0^{k+1}$ and $W_0^k$ (componentwise).
Proof. We prove the estimates (A.3)-(A.10) in turn.

Proof of (A.3): for the given training sample set, by Assumption (A2), (4.2), and (4.4), it is easy to see that (A.3) is valid.

Proof of (A.4): by (4.9) and (4.14), we have
$$\sum_{j=1}^J\left[\beta f_j^{C\prime}\left(\Phi_{0,k,j}^C\right) h_{k,j}^C + (1-\beta) f_j^{R\prime}\left(\Phi_{0,k,j}^R\right) h_{k,j}^R\,\varphi'\left(W_0^k\right)\right]\cdot\Delta W_0^k = \frac{\partial E\left(W^k\right)}{\partial W_0}\cdot\left(-\eta\,\frac{\partial E\left(W^k\right)}{\partial W_0}\right) = -\eta\left\|\frac{\partial E\left(W^k\right)}{\partial W_0}\right\|^2. \tag{A.11}$$
This proves (A.4).
Proof of (A.5): using the mean value theorem, for any $1 \le m \le M$, $1 \le j \le J$, and $k = 0, 1, 2, \ldots$, we have
$$\begin{aligned}
\Psi_{k,j,m}^C &= h_{k+1,j,m}^C - h_{k,j,m}^C \\
&= \frac{1}{2}\left[f\left(s_{k+1,j,m}^C - s_{k+1,j,m}^R\right) - f\left(s_{k,j,m}^C - s_{k,j,m}^R\right) + f\left(s_{k+1,j,m}^C + s_{k+1,j,m}^R\right) - f\left(s_{k,j,m}^C + s_{k,j,m}^R\right)\right] \\
&= \frac{1}{2}\, f'\left(t_{k,j,m}^1\right)\left[\left(s_{k+1,j,m}^C - s_{k+1,j,m}^R\right) - \left(s_{k,j,m}^C - s_{k,j,m}^R\right)\right] + \frac{1}{2}\, f'\left(t_{k,j,m}^2\right)\left[\left(s_{k+1,j,m}^C + s_{k+1,j,m}^R\right) - \left(s_{k,j,m}^C + s_{k,j,m}^R\right)\right],
\end{aligned} \tag{A.12}$$
where $t_{k,j,m}^1$ is on the segment between $s_{k+1,j,m}^C - s_{k+1,j,m}^R$ and $s_{k,j,m}^C - s_{k,j,m}^R$, and $t_{k,j,m}^2$ is on the segment between $s_{k+1,j,m}^C + s_{k+1,j,m}^R$ and $s_{k,j,m}^C + s_{k,j,m}^R$. By (A.3), we have
$$\begin{aligned}
\left|\Psi_{k,j,m}^C\right| &\le \frac{M_0}{2}\left[\left|\left(s_{k+1,j,m}^C - s_{k+1,j,m}^R\right) - \left(s_{k,j,m}^C - s_{k,j,m}^R\right)\right| + \left|\left(s_{k+1,j,m}^C + s_{k+1,j,m}^R\right) - \left(s_{k,j,m}^C + s_{k,j,m}^R\right)\right|\right] \\
&\le M_0\left[\left|s_{k+1,j,m}^C - s_{k,j,m}^C\right| + \left|s_{k+1,j,m}^R - s_{k,j,m}^R\right|\right] = M_0\left[\left|\Delta W_m^k\cdot x_j^C\right| + \left|\left(\varphi\left(W_m^{k+1}\right) - \varphi\left(W_m^k\right)\right)\cdot x_j^R\right|\right] \\
&\le M_0^2\left\|\Delta W_m^k\right\| + M_0^2\left\|\Delta W_m^k\right\|\left|\varphi'\left(\tau_{1,m}^k\right)\right|,
\end{aligned} \tag{A.13}$$
where $\tau_{1,m}^k$ is on the segment between $W_m^{k+1}$ and $W_m^k$ (componentwise). Since
$$\varphi(x) = \begin{cases} -x, & x \le -\mu, \\ \tilde{\varphi}(x), & -\mu < x < \mu, \\ x, & x \ge \mu, \end{cases} \tag{A.14}$$
we have $|\varphi'(x)| = 1$ and $\varphi''(x) = 0$ for $|x| \ge \mu$, while for $-\mu < x < \mu$,
$$\tilde{\varphi}'(x) = -\frac{1}{2\mu^3}x^3 + \frac{3}{2\mu}x \in [-1, 1], \qquad \tilde{\varphi}''(x) = -\frac{3}{2\mu^3}x^2 + \frac{3}{2\mu} \in \left[0, \frac{3}{2\mu}\right]. \tag{A.15}$$
So for all $x \in \mathbb{R}$ we have
$$\left|\varphi'(x)\right| \le 1, \qquad \left|\varphi''(x)\right| \le \frac{3}{2\mu}. \tag{A.16}$$
According to (A.16) and (A.13), we obtain
$$\left|\Psi_{k,j,m}^C\right| \le 2M_0^2\left\|\Delta W_m^k\right\|. \tag{A.17}$$
By (A.17), for any $1 \le j \le J$ and $k = 0, 1, 2, \ldots$, we have
$$\left\|\Psi_{k,j}^C\right\|^2 = \sum_{m=1}^M\left(h_{k+1,j,m}^C - h_{k,j,m}^C\right)^2 \le 4M_0^4\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2. \tag{A.18}$$
According to the definition of $f_j^C(t)$, we have $f_j^{C\prime}(t) = t - o_j^C$; combining this with (A.3), we deduce that $|f_j^{C\prime}(\Phi_{0,k,j}^C)| \le 2M_0$. By (A.18), we have
$$\begin{aligned}
\sum_{j=1}^J \beta f_j^{C\prime}\left(\Phi_{0,k,j}^C\right)\Delta W_0^k\cdot\Psi_{k,j}^C &\le 2\beta M_0\sum_{j=1}^J\left\|\Delta W_0^k\right\|\left\|\Psi_{k,j}^C\right\| \le \beta M_0\sum_{j=1}^J\left(\left\|\Delta W_0^k\right\|^2 + \left\|\Psi_{k,j}^C\right\|^2\right) \\
&\le \beta J M_0\left\|\Delta W_0^k\right\|^2 + 4\beta J M_0^5\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2 \le M_1\sum_{m=0}^M\left\|\Delta W_m^k\right\|^2 = M_1\eta^2\left\|\frac{\partial E\left(W^k\right)}{\partial W}\right\|^2,
\end{aligned} \tag{A.19}$$
where $M_1 = \beta J M_0\max\{1, 4M_0^4\}$. This proves (A.5).
Proof of (A.6): using the Taylor expansion, we get
$$\Psi_{k,j,m}^C = h_{k+1,j,m}^C - h_{k,j,m}^C = \frac{1}{2}\Big[f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\,\delta_{k,j,m}^- + \tfrac{1}{2}f''\big(t_{k,j,m}^3\big)\big(\delta_{k,j,m}^-\big)^2 + f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\,\delta_{k,j,m}^+ + \tfrac{1}{2}f''\big(t_{k,j,m}^4\big)\big(\delta_{k,j,m}^+\big)^2\Big], \tag{A.20}$$
where we abbreviate $\delta_{k,j,m}^\mp = (s_{k+1,j,m}^C \mp s_{k+1,j,m}^R) - (s_{k,j,m}^C \mp s_{k,j,m}^R)$, $t_{k,j,m}^3$ is on the segment between $s_{k+1,j,m}^C - s_{k+1,j,m}^R$ and $s_{k,j,m}^C - s_{k,j,m}^R$, and $t_{k,j,m}^4$ is on the segment between $s_{k+1,j,m}^C + s_{k+1,j,m}^R$ and $s_{k,j,m}^C + s_{k,j,m}^R$. By (A.3) and (A.16), we deduce that
$$\delta_{k,j,m}^- = \big(x_j^C - \varphi'\big(W_m^k\big)x_j^R\big)\cdot\Delta W_m^k - \tfrac{1}{2}\varphi''\big(\tau_{2,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R, \qquad \big(\delta_{k,j,m}^-\big)^2 = \Big(\big(x_j^C - \varphi'\big(\tau_{3,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2, \tag{A.21}$$
where $\tau_{2,m}^k$, $\tau_{3,m}^k$ both lie on the segment between $W_m^{k+1}$ and $W_m^k$. Similarly,
$$\delta_{k,j,m}^+ = \big(x_j^C + \varphi'\big(W_m^k\big)x_j^R\big)\cdot\Delta W_m^k + \tfrac{1}{2}\varphi''\big(\tau_{4,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R, \qquad \big(\delta_{k,j,m}^+\big)^2 = \Big(\big(x_j^C + \varphi'\big(\tau_{5,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2, \tag{A.22}$$
where $\tau_{4,m}^k$, $\tau_{5,m}^k$ both lie on the segment between $W_m^{k+1}$ and $W_m^k$. Combining these with (A.20), we have
$$\begin{aligned}
\Psi_{k,j,m}^C = \frac{1}{2}\Big[&f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\big(x_j^C - \varphi'\big(W_m^k\big)x_j^R\big)\cdot\Delta W_m^k + f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\big(x_j^C + \varphi'\big(W_m^k\big)x_j^R\big)\cdot\Delta W_m^k \\
&- \tfrac{1}{2}f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\varphi''\big(\tau_{2,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R + \tfrac{1}{2}f''\big(t_{k,j,m}^3\big)\Big(\big(x_j^C - \varphi'\big(\tau_{3,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2 \\
&+ \tfrac{1}{2}f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\varphi''\big(\tau_{4,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R + \tfrac{1}{2}f''\big(t_{k,j,m}^4\big)\Big(\big(x_j^C + \varphi'\big(\tau_{5,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2\Big].
\end{aligned} \tag{A.23}$$
By (A.23), we get
$$W_0^k\cdot\Psi_{k,j}^C = \Delta_1 + \Delta_2, \tag{A.24}$$
where
$$\Delta_1 = \frac{1}{2}\sum_{m=1}^M w_{0,m}^k\Big[f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\big(x_j^C - \varphi'\big(W_m^k\big)x_j^R\big) + f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\big(x_j^C + \varphi'\big(W_m^k\big)x_j^R\big)\Big]\cdot\Delta W_m^k, \tag{A.25}$$
$$\begin{aligned}
\Delta_2 = \frac{1}{4}\sum_{m=1}^M w_{0,m}^k\Big[&-f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\varphi''\big(\tau_{2,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R + f''\big(t_{k,j,m}^3\big)\Big(\big(x_j^C - \varphi'\big(\tau_{3,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2 \\
&+ f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\varphi''\big(\tau_{4,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R + f''\big(t_{k,j,m}^4\big)\Big(\big(x_j^C + \varphi'\big(\tau_{5,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2\Big].
\end{aligned} \tag{A.26}$$
From (A.25) it follows that
$$\sum_{j=1}^J \beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)\Delta_1 = \frac{1}{2}\sum_{j=1}^J\sum_{m=1}^M \beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)w_{0,m}^k\Big[f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\big(x_j^C - \varphi'\big(W_m^k\big)x_j^R\big) + f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\big(x_j^C + \varphi'\big(W_m^k\big)x_j^R\big)\Big]\cdot\Delta W_m^k, \tag{A.27}$$
and from (A.26) that
$$\begin{aligned}
\sum_{j=1}^J \beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)\Delta_2 = \frac{1}{4}\sum_{j=1}^J\sum_{m=1}^M \beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)w_{0,m}^k\Big[&-f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\varphi''\big(\tau_{2,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R \\
&+ f''\big(t_{k,j,m}^3\big)\Big(\big(x_j^C - \varphi'\big(\tau_{3,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2 \\
&+ f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\varphi''\big(\tau_{4,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R \\
&+ f''\big(t_{k,j,m}^4\big)\Big(\big(x_j^C + \varphi'\big(\tau_{5,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2\Big].
\end{aligned} \tag{A.28}$$
By (A.3), (A.16), and $|f_j^{C\prime}(\Phi_{0,k,j}^C)| \le 2M_0$, each term of (A.28) involving $\varphi''$ is bounded by
$$\frac{1}{4}\sum_{j=1}^J\sum_{m=1}^M \beta\cdot 2M_0\cdot M_0\cdot M_0\cdot\frac{3}{2\mu}\cdot M_0\left\|\Delta W_m^k\right\|^2 \le \frac{3}{2\mu}\,\beta J M_0^4\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2, \tag{A.29}$$
and, since $\|x_j^C \pm \varphi'(\tau^k)x_j^R\| \le 2M_0$, each term involving $f''$ is bounded by
$$\frac{1}{4}\sum_{j=1}^J\sum_{m=1}^M \beta\cdot 2M_0\cdot M_0\cdot M_0\cdot\big(2M_0\big)^2\left\|\Delta W_m^k\right\|^2 \le 4\beta J M_0^5\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2. \tag{A.30}$$
So by (A.28), (A.29), and (A.30), we have
$$\sum_{j=1}^J \beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)\Delta_2 \le 2\left(\frac{3}{2\mu}\beta J M_0^4 + 4\beta J M_0^5\right)\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2 = \left(\frac{3}{\mu} + 8M_0\right)\beta J M_0^4\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2. \tag{A.31}$$
In the same way as (A.23), we get
$$\begin{aligned}
\Psi_{k,j,m}^R = h_{k+1,j,m}^R - h_{k,j,m}^R = \frac{1}{2}\Big[&f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\big(x_j^C + \varphi'\big(W_m^k\big)x_j^R\big)\cdot\Delta W_m^k - f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\big(x_j^C - \varphi'\big(W_m^k\big)x_j^R\big)\cdot\Delta W_m^k \\
&+ \tfrac{1}{2}f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\varphi''\big(\tau_{6,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R + \tfrac{1}{2}f''\big(t_{k,j,m}^5\big)\Big(\big(x_j^C + \varphi'\big(\tau_{7,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2 \\
&+ \tfrac{1}{2}f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\varphi''\big(\tau_{8,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R - \tfrac{1}{2}f''\big(t_{k,j,m}^6\big)\Big(\big(x_j^C - \varphi'\big(\tau_{9,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2\Big],
\end{aligned} \tag{A.32}$$
where $\tau_{6,m}^k$, $\tau_{7,m}^k$, $\tau_{8,m}^k$, $\tau_{9,m}^k$ lie on the segment between $W_m^{k+1}$ and $W_m^k$, $t_{k,j,m}^5$ lies on the segment between $s_{k+1,j,m}^C + s_{k+1,j,m}^R$ and $s_{k,j,m}^C + s_{k,j,m}^R$, and $t_{k,j,m}^6$ lies on the segment between $s_{k+1,j,m}^C - s_{k+1,j,m}^R$ and $s_{k,j,m}^C - s_{k,j,m}^R$. By (A.32), we have
$$\varphi\big(W_0^k\big)\cdot\Psi_{k,j}^R = \Delta_3 + \Delta_4, \tag{A.33}$$
where
$$\Delta_3 = \frac{1}{2}\sum_{m=1}^M \varphi\big(w_{0,m}^k\big)\Big[f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\big(x_j^C + \varphi'\big(W_m^k\big)x_j^R\big) - f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\big(x_j^C - \varphi'\big(W_m^k\big)x_j^R\big)\Big]\cdot\Delta W_m^k, \tag{A.34}$$
$$\begin{aligned}
\Delta_4 = \frac{1}{4}\sum_{m=1}^M \varphi\big(w_{0,m}^k\big)\Big[&f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\varphi''\big(\tau_{6,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R + f''\big(t_{k,j,m}^5\big)\Big(\big(x_j^C + \varphi'\big(\tau_{7,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2 \\
&+ f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\varphi''\big(\tau_{8,m}^k\big)\big(\Delta W_m^k\big)^2\cdot x_j^R - f''\big(t_{k,j,m}^6\big)\Big(\big(x_j^C - \varphi'\big(\tau_{9,m}^k\big)x_j^R\big)\cdot\Delta W_m^k\Big)^2\Big].
\end{aligned} \tag{A.35}$$
By (A.34), we have
$$\sum_{j=1}^J (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\Delta_3 = \frac{1}{2}\sum_{j=1}^J\sum_{m=1}^M (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\varphi\big(w_{0,m}^k\big)\Big[f'\big(s_{k,j,m}^C + s_{k,j,m}^R\big)\big(x_j^C + \varphi'\big(W_m^k\big)x_j^R\big) - f'\big(s_{k,j,m}^C - s_{k,j,m}^R\big)\big(x_j^C - \varphi'\big(W_m^k\big)x_j^R\big)\Big]\cdot\Delta W_m^k, \tag{A.36}$$
and, estimating (A.35) exactly as in (A.29)-(A.31),
$$\sum_{j=1}^J (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\Delta_4 \le \left(\frac{3}{\mu} + 8M_0\right)(1-\beta) J M_0^4\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2. \tag{A.37}$$
By (A.27), (A.31), (A.36), and (A.37), we obtain
$$\begin{aligned}
\sum_{j=1}^J&\Big[\beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)W_0^k\cdot\Psi_{k,j}^C + (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\varphi\big(W_0^k\big)\cdot\Psi_{k,j}^R\Big] \\
&\le \sum_{j=1}^J \beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)\Delta_1 + \sum_{j=1}^J (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\Delta_3 + \left(\frac{3}{\mu} + 8M_0\right) J M_0^4\sum_{m=1}^M\left\|\Delta W_m^k\right\|^2.
\end{aligned} \tag{A.38}$$
Combining with (4.11), (4.12), and (4.14), and noting that $\Phi_{0,k,j}^C = y_j^C$ and $\Phi_{0,k,j}^R = y_j^R$ at the $k$th step, the first two sums in (A.38) equal $\sum_{m=1}^M \frac{\partial E(W^k)}{\partial W_m}\cdot\Delta W_m^k = -\eta\sum_{m=1}^M\big\|\frac{\partial E(W^k)}{\partial W_m}\big\|^2$. Hence
$$\sum_{j=1}^J\Big[\beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)W_0^k\cdot\Psi_{k,j}^C + (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\varphi\big(W_0^k\big)\cdot\Psi_{k,j}^R\Big] \le -\eta\sum_{m=1}^M\left\|\frac{\partial E\big(W^k\big)}{\partial W_m}\right\|^2 + M_2\eta^2\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2, \tag{A.39}$$
where $M_2 = (3/\mu + 8M_0)JM_0^4$ and we have used $\sum_{m=1}^M\|\Delta W_m^k\|^2 \le \|\Delta W^k\|^2 = \eta^2\|\partial E(W^k)/\partial W\|^2$. This proves (A.6).
Proof of (A.7): according to the definition of $f_j^C(t)$, we have $f_j^{C\prime\prime}(t) = 1$; combining this with (A.3) and (A.18), we have
$$\begin{aligned}
\frac{1}{2}\sum_{j=1}^J \beta f_j^{C\prime\prime}\big(\xi_{0,k,j}^C\big)\big(\Phi_{0,k+1,j}^C - \Phi_{0,k,j}^C\big)^2 &= \frac{1}{2}\beta\sum_{j=1}^J\big(W_0^{k+1}\cdot h_{k+1,j}^C - W_0^k\cdot h_{k,j}^C\big)^2 \\
&= \frac{1}{2}\beta\sum_{j=1}^J\Big(\big(W_0^{k+1} - W_0^k\big)\cdot h_{k+1,j}^C + W_0^k\cdot\big(h_{k+1,j}^C - h_{k,j}^C\big)\Big)^2 \\
&\le \beta\sum_{j=1}^J\Big(M_0^2\big\|\Delta W_0^k\big\|^2 + M_0^2\big\|\Psi_{k,j}^C\big\|^2\Big) \le \beta J M_0^2\Big(\big\|\Delta W_0^k\big\|^2 + 4M_0^4\sum_{m=1}^M\big\|\Delta W_m^k\big\|^2\Big) \\
&\le M_3\sum_{m=0}^M\big\|\Delta W_m^k\big\|^2 = M_3\eta^2\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2,
\end{aligned} \tag{A.40}$$
where $M_3 = \beta J M_0^2\max\{1, 4M_0^4\}$ and we used $(a+b)^2 \le 2a^2 + 2b^2$ together with the Cauchy-Schwarz inequality. This proves (A.7).
Proof of (A.8): in the same way as (A.17), for any $1 \le j \le J$ and $k = 0, 1, 2, \ldots$, we can get
$$\big\|\Psi_{k,j}^R\big\|^2 \le 4M_0^4\sum_{m=1}^M\big\|\Delta W_m^k\big\|^2. \tag{A.41}$$
According to the definition of $f_j^R(t)$, we have $f_j^{R\prime}(t) = t - o_j^R$; combining this with (A.3), we obtain $|f_j^{R\prime}(\Phi_{0,k,j}^R)| \le 2M_0$. By (A.16) and (A.41), we deduce that
$$\begin{aligned}
\sum_{j=1}^J (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\varphi'\big(\zeta_1^k\big)\Delta W_0^k\cdot\Psi_{k,j}^R &\le 2(1-\beta)M_0\sum_{j=1}^J\big\|\Delta W_0^k\big\|\big\|\Psi_{k,j}^R\big\| \le (1-\beta)M_0\sum_{j=1}^J\Big(\big\|\Delta W_0^k\big\|^2 + \big\|\Psi_{k,j}^R\big\|^2\Big) \\
&\le (1-\beta)JM_0\big\|\Delta W_0^k\big\|^2 + 4(1-\beta)JM_0^5\sum_{m=1}^M\big\|\Delta W_m^k\big\|^2 \\
&\le M_4\sum_{m=0}^M\big\|\Delta W_m^k\big\|^2 = M_4\eta^2\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2,
\end{aligned} \tag{A.42}$$
where $M_4 = (1-\beta)JM_0\max\{1, 4M_0^4\}$. This proves (A.8).
Proof of (A.9): by $|f_j^{R\prime}(\Phi_{0,k,j}^R)| \le 2M_0$, (A.3), and (A.16), we get
$$\frac{1}{2}\sum_{j=1}^J (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\varphi''\big(\zeta_2^k\big)\big(\Delta W_0^k\big)^2\cdot h_{k,j}^R \le (1-\beta)M_0\cdot\frac{3}{2\mu}\sum_{j=1}^J\big\|\Delta W_0^k\big\|^2\big\|h_{k,j}^R\big\| \le \frac{3}{2\mu}(1-\beta)JM_0^2\big\|\Delta W_0^k\big\|^2 \le M_5\eta^2\left\|\frac{\partial E\big(W^k\big)}{\partial W_0}\right\|^2, \tag{A.43}$$
where $M_5 = (3/2\mu)(1-\beta)JM_0^2$. This proves (A.9).
Proof of (A.10): according to the definition of $f_j^R(t)$, we have $f_j^{R\prime\prime}(t) = 1$; combining this with (A.3) and (A.41), we have
$$\begin{aligned}
\frac{1}{2}\sum_{j=1}^J (1-\beta) f_j^{R\prime\prime}\big(\xi_{0,k,j}^R\big)\big(\Phi_{0,k+1,j}^R - \Phi_{0,k,j}^R\big)^2 &= \frac{1}{2}(1-\beta)\sum_{j=1}^J\Big(\varphi\big(W_0^{k+1}\big)\cdot h_{k+1,j}^R - \varphi\big(W_0^k\big)\cdot h_{k,j}^R\Big)^2 \\
&= \frac{1}{2}(1-\beta)\sum_{j=1}^J\Big(\big(\varphi\big(W_0^{k+1}\big) - \varphi\big(W_0^k\big)\big)\cdot h_{k+1,j}^R + \varphi\big(W_0^k\big)\cdot\big(h_{k+1,j}^R - h_{k,j}^R\big)\Big)^2 \\
&\le (1-\beta)\sum_{j=1}^J\Big(M_0^2\big\|\Delta W_0^k\big\|^2 + M_0^2\big\|\Psi_{k,j}^R\big\|^2\Big) \le (1-\beta)JM_0^2\Big(\big\|\Delta W_0^k\big\|^2 + 4M_0^4\sum_{m=1}^M\big\|\Delta W_m^k\big\|^2\Big) \\
&\le M_6\sum_{m=0}^M\big\|\Delta W_m^k\big\|^2 = M_6\eta^2\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2,
\end{aligned} \tag{A.44}$$
where $M_6 = (1-\beta)JM_0^2\max\{1, 4M_0^4\}$ and we used $\|\varphi(W_0^{k+1}) - \varphi(W_0^k)\| \le \|\Delta W_0^k\|$, which follows from $|\varphi'| \le 1$. This proves (A.10), and thus completes the proof of Lemma A.2.
Now we are ready to prove Theorem 5.1.
Proof. Using the Taylor expansion and Lemma A.2, for any $k = 0, 1, 2, \ldots$, we have
$$\begin{aligned}
E\big(W^{k+1}\big) - E\big(W^k\big) &= \sum_{j=1}^J\Big[\beta f_j^C\big(\Phi_{0,k+1,j}^C\big) + (1-\beta) f_j^R\big(\Phi_{0,k+1,j}^R\big) - \beta f_j^C\big(\Phi_{0,k,j}^C\big) - (1-\beta) f_j^R\big(\Phi_{0,k,j}^R\big)\Big] \\
&= \sum_{j=1}^J\Big[\beta f_j^{C\prime}\big(\Phi_{0,k,j}^C\big)\big(\Phi_{0,k+1,j}^C - \Phi_{0,k,j}^C\big) + \frac{1}{2}\beta f_j^{C\prime\prime}\big(\xi_{0,k,j}^C\big)\big(\Phi_{0,k+1,j}^C - \Phi_{0,k,j}^C\big)^2 \\
&\qquad + (1-\beta) f_j^{R\prime}\big(\Phi_{0,k,j}^R\big)\big(\Phi_{0,k+1,j}^R - \Phi_{0,k,j}^R\big) + \frac{1}{2}(1-\beta) f_j^{R\prime\prime}\big(\xi_{0,k,j}^R\big)\big(\Phi_{0,k+1,j}^R - \Phi_{0,k,j}^R\big)^2\Big].
\end{aligned}$$
Writing
$$\Phi_{0,k+1,j}^C - \Phi_{0,k,j}^C = W_0^{k+1}\cdot h_{k+1,j}^C - W_0^k\cdot h_{k,j}^C = \Delta W_0^k\cdot h_{k,j}^C + W_0^k\cdot\Psi_{k,j}^C + \Delta W_0^k\cdot\Psi_{k,j}^C,$$
$$\begin{aligned}
\Phi_{0,k+1,j}^R - \Phi_{0,k,j}^R &= \varphi\big(W_0^{k+1}\big)\cdot h_{k+1,j}^R - \varphi\big(W_0^k\big)\cdot h_{k,j}^R \\
&= \Big(\varphi'\big(W_0^k\big)\Delta W_0^k + \frac{1}{2}\varphi''\big(\zeta_2^k\big)\big(\Delta W_0^k\big)^2\Big)\cdot h_{k,j}^R + \varphi\big(W_0^k\big)\cdot\Psi_{k,j}^R + \varphi'\big(\zeta_1^k\big)\Delta W_0^k\cdot\Psi_{k,j}^R,
\end{aligned}$$
and applying (A.4)-(A.10), we obtain
$$\begin{aligned}
E\big(W^{k+1}\big) - E\big(W^k\big) &\le -\eta\left\|\frac{\partial E\big(W^k\big)}{\partial W_0}\right\|^2 - \eta\sum_{m=1}^M\left\|\frac{\partial E\big(W^k\big)}{\partial W_m}\right\|^2 + \big(M_1 + M_2 + M_3 + M_4 + M_5 + M_6\big)\eta^2\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2 \\
&\le -\big(\eta - M_7\eta^2\big)\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2,
\end{aligned} \tag{A.45}$$
where $M_7 = M_1 + M_2 + M_3 + M_4 + M_5 + M_6$, $\xi_{0,k,j}^C$ lies on the segment between $\Phi_{0,k+1,j}^C$ and $\Phi_{0,k,j}^C$, $\xi_{0,k,j}^R$ lies on the segment between $\Phi_{0,k+1,j}^R$ and $\Phi_{0,k,j}^R$, and $\zeta_1^k$, $\zeta_2^k$ both lie on the segment between $W_0^{k+1}$ and $W_0^k$. Let $\gamma = \eta - M_7\eta^2$; then
$$E\big(W^{k+1}\big) \le E\big(W^k\big) - \gamma\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2. \tag{A.46}$$
Obviously, we require the learning rate $\eta$ to satisfy
$$0 < \eta < \frac{1}{M_7}, \tag{A.47}$$
so that $\gamma > 0$.
Thus, we obtain
$$E\big(W^{k+1}\big) \le E\big(W^k\big), \qquad k = 0, 1, 2, \ldots. \tag{A.48}$$
This together with (A.46) leads to
$$E\big(W^{k+1}\big) \le E\big(W^k\big) - \gamma\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\|^2 \le \cdots \le E\big(W^0\big) - \gamma\sum_{t=0}^k\left\|\frac{\partial E\big(W^t\big)}{\partial W}\right\|^2. \tag{A.49}$$
Since $E(W^{k+1}) \ge 0$, we have
$$\gamma\sum_{t=0}^k\left\|\frac{\partial E\big(W^t\big)}{\partial W}\right\|^2 \le E\big(W^0\big). \tag{A.50}$$
Letting $k \to \infty$ results in
$$\sum_{t=0}^\infty\left\|\frac{\partial E\big(W^t\big)}{\partial W}\right\|^2 \le \frac{1}{\gamma}E\big(W^0\big) < \infty. \tag{A.51}$$
So this immediately gives
$$\lim_{k\to\infty}\left\|\frac{\partial E\big(W^k\big)}{\partial W}\right\| = 0. \tag{A.52}$$
According to (4.14) and (A.52), we get
$$\lim_{k\to\infty}\big\|\Delta W^k\big\| = 0. \tag{A.53}$$
According to (A2), the sequence $\{W^k\}$ $(k \in \mathbb{N})$ has a subsequence $\{W^{k_l}\}$ $(l \in \mathbb{N})$ that converges to, say, $W^*$. It follows from (5.3) and the continuity of $E_W(W)$ that
$$E_W\big(W^*\big) = \lim_{l\to\infty} E_W\big(W^{k_l}\big) = \lim_{k\to\infty} E_W\big(W^k\big) = 0. \tag{A.54}$$
This implies that $W^*$ is a stationary point of $E(W)$, that is, $W^* \in \Omega_0$. Hence, $\{W^k\}$ has at least one accumulation point, and every accumulation point must be a stationary point.

Next, by reduction to absurdity, we prove that $\{W^k\}$ has precisely one accumulation point. Let us assume the contrary, that $\{W^k\}$ has at least two accumulation points $\widetilde{W} \ne \widehat{W}$. We write $W^k = (w_1^k, w_2^k, \ldots, w_{NM+M}^k)^T$. It is easy to see from (4.13) and (4.14) that $\lim_{k\to\infty}\|W^{k+1} - W^k\| = 0$, or equivalently, $\lim_{k\to\infty}|w_i^{k+1} - w_i^k| = 0$ for $i = 1, 2, \ldots, NM+M$. Without loss of generality, we assume that the first components of $\widetilde{W}$ and $\widehat{W}$ do not equal each other, that is, $\widetilde{w}_1 \ne \widehat{w}_1$. For any real number $\lambda \in (0, 1)$, let $w_1^\lambda = \lambda\widetilde{w}_1 + (1-\lambda)\widehat{w}_1$. By Lemma A.1, there exists a subsequence $\{w_1^{k_{l_1}}\}$ of $\{w_1^k\}$ converging to $w_1^\lambda$ as $l_1 \to \infty$. Due to the boundedness of $\{w_2^{k_{l_1}}\}$, there is a convergent subsequence $\{w_2^{k_{l_2}}\} \subset \{w_2^{k_{l_1}}\}$; we define $w_2^\lambda = \lim_{l_2\to\infty} w_2^{k_{l_2}}$. Repeating this procedure, we end up with nested subsequences $\{k_{l_1}\} \supset \{k_{l_2}\} \supset \cdots \supset \{k_{l_{NM+M}}\}$ such that $w_i^\lambda = \lim_{l_i\to\infty} w_i^{k_{l_i}}$ for each $i = 1, 2, \ldots, NM+M$. Write $W^\lambda = (w_1^\lambda, w_2^\lambda, \ldots, w_{NM+M}^\lambda)^T$. Then we see that $W^\lambda$ is an accumulation point of $\{W^k\}$, and hence a stationary point, for any $\lambda \in (0, 1)$. But this means that $\Omega_{0,1}$ has interior points, which contradicts Assumption (A4). Thus, $W^*$ must be the unique accumulation point of $\{W^k\}_{k=0}^\infty$. This proves (5.4) and completes the proof of Theorem 5.1.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 11171367) and by the Fundamental Research Funds for the Central Universities of China.
References
[1] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1995.
[2] M. Perez, "Artificial neural networks and bankruptcy forecasting: a state of the art," Neural Computing and Applications, vol. 15, no. 2, pp. 154–163, 2006.
[3] G. Zhang, B. E. Patuwo, and M. Y. Hu, "Forecasting with artificial neural networks: the state of the art," International Journal of Forecasting, vol. 14, no. 1, pp. 35–62, 1998.
[4] M. W. Craven and J. W. Shavlik, "Using neural networks for data mining," Future Generation Computer Systems, vol. 13, no. 2-3, pp. 211–229, 1997.
[5] S. K. Pal, V. Talwar, and P. Mitra, "Web mining in soft computing framework: relevance, state of the art and future directions," IEEE Transactions on Neural Networks, vol. 13, no. 5, pp. 1163–1177, 2002.
[6] K. I. Funahashi, "On the approximate realization of continuous mappings by neural networks," Neural Networks, vol. 2, no. 3, pp. 183–192, 1989.
[7] K. Hornik, "Approximation capabilities of multilayer feedforward networks," Neural Networks, vol. 4, no. 2, pp. 251–257, 1991.
[8] H. White, "Connectionist nonparametric regression: multilayer feedforward networks can learn arbitrary mappings," Neural Networks, vol. 3, no. 5, pp. 535–549, 1990.
[9] H. Ishibuchi and H. Tanaka, "An extension of the BP algorithm to interval input vectors," in Proceedings of the International Joint Conference on Neural Networks (IJCNN'91), vol. 2, pp. 1588–1593, Singapore, 1991.
[10] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, no. 6088, pp. 533–536, 1986.
[11] D. E. Rumelhart, J. L. McClelland, and The PDP Research Group, Parallel Distributed Processing, vol. 1, MIT Press, Cambridge, Mass, USA, 1986.
[12] C. A. Hernandez, J. Espi, K. Nakayama, and M. Fernandez, "Interval arithmetic backpropagation," in Proceedings of the International Joint Conference on Neural Networks, vol. 1, pp. 375–378, Nagoya, Japan, October 1993.
[13] A. M. S. Roque, C. Maté, J. Arroyo, and A. Sarabia, "IMLP: applying multi-layer perceptrons to interval-valued data," Neural Processing Letters, vol. 25, no. 2, pp. 157–169, 2007.
[14] H. M. Shao and G. F. Zheng, "Convergence analysis of a back-propagation algorithm with adaptive momentum," Neurocomputing, vol. 74, no. 5, pp. 749–752, 2011.
[15] W. Wu, J. Wang, M. S. Cheng, and Z. X. Li, "Convergence analysis of online gradient method for BP neural networks," Neural Networks, vol. 24, no. 1, pp. 91–98, 2011.
[16] D. P. Xu, H. S. Zhang, and L. J. Liu, "Convergence analysis of three classes of split-complex gradient algorithms for complex-valued recurrent neural networks," Neural Computation, vol. 22, no. 10, pp. 2655–2677, 2010.
[17] J. Wang, J. Yang, and W. Wu, "Convergence of cyclic and almost-cyclic learning with momentum for feedforward neural networks," IEEE Transactions on Neural Networks, vol. 22, no. 8, pp. 1297–1306, 2011.
[18] R. E. Moore, Interval Analysis, Prentice-Hall, Englewood Cliffs, NJ, USA, 1966.
[19] T. Sunaga, "Theory of an interval algebra and its applications to numerical analysis," RAAG Memoirs, vol. 2, pp. 29–46, 1958.