ONE STEP LEARNING PROCEDURE
FOR NEURAL NET CONTROL SYSTEM
Astapkovitch A.M.
Head of Student Design Center of SUAI (Sankt-Petersburg)
Abstract
This work is concerned with the “Teaching by Showing” approach for multi-channel real-time control systems based on neural networks. It is shown that the supervised learning paradigm can be interpreted as a nonlinear graph fitting problem.
For such control systems a one-step learning procedure is proposed that significantly decreases the cost of training real control systems realized on the basis of a neural net. The procedure uses a least-squares approach with Tikhonov regularization of the cost function to obtain a correct solution for the weight set of the neural control net. The result can be expressed analytically as
\[ w = (S^T S + \alpha E)^{-1} S^T a, \]
where w is the neural net weight set, S is a rectangular matrix of inputs formed from sensor measurements, a is the vector of actor signals formed by the operator during the showing phase, and α is the regularization coefficient.
Some of these ideas were experimentally investigated during the student research project “Autonomous Robot PHOENIX-1”. The experiments confirmed the reliability and effectiveness of the proposed approach.
Introduction
An artificial neural network is a network of simple processing elements (neurons) that can exhibit complex global behavior determined by the connections between the processing elements. An important feature is that neural networks are trainable systems that can “learn” to solve complex problems from a set of examples.
Three major learning paradigms are known, each corresponding to a particular abstract learning task: supervised learning, unsupervised learning and reinforcement learning.
Supervised learning is also known as the “Teaching by Showing” approach. In supervised learning a set of example pairs (X, Y) is given, where X belongs to the set of inputs (sensor measurements) and Y belongs to the set of outputs (actor signals). The aim of learning is to find a function f(X) in the allowed class of functions that minimizes a selected cost function, which estimates the difference between Y and f(X).
In unsupervised learning, a given cost function has to be minimized for a given data set X. The cost function is determined by the task formulation and can be any function of X and the network’s output that is reasonable from a practical point of view.
In reinforcement learning, the data X are usually not given but are generated by the system’s interaction with the environment. At each point in time t the system performs an action Y(t), receives an observation X(t) and incurs an instantaneous cost C(t), according to some (usually unknown) dynamics. The aim is to discover a policy for selecting actions that minimizes some measure of long-term, i.e. cumulative, cost. The environment’s dynamics and the long-term cost of each policy are usually unknown, but can be estimated.
This work is concerned with the “Teaching by Showing” approach for multi-channel real-time control systems based on neural networks. Such a neural network has a complex multilayer structure that includes nonlinear elements. Nonlinear behavior arises when a neuron includes an activation function. In this case the linear output of the neuron
\[ A_k(T_k) = (w, s(T_k)) \qquad (I.1) \]
is transformed to
\[ A_k(T_k)^{*} = f(A_k(T_k)) = \frac{1}{1 + \exp(A_k(T_k))}, \qquad (I.2) \]
where A_k(T_k) is the k-th actor control signal at moment T_k.
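As an illustration, a minimal Python sketch of this neuron model is given below (hypothetical names; numpy is assumed). It computes the linear output (I.1) as an inner product of the weight vector with the sensor vector and then applies the activation (I.2) exactly as written above.

    import numpy as np

    def neuron_output(w, s):
        """Linear neuron output (I.1): inner product of weights and sensor vector."""
        return np.dot(w, s)

    def activation(x):
        """Activation as written in (I.2): f(x) = 1 / (1 + exp(x))."""
        return 1.0 / (1.0 + np.exp(x))

    # Example: 3 sensors, one actor channel (illustrative values only)
    w = np.array([0.5, -0.2, 0.1])     # weight vector
    s = np.array([1.0, 0.3, -0.7])     # sensor measurements at moment T_k
    a_linear = neuron_output(w, s)     # A_k(T_k)
    a_star = activation(a_linear)      # A_k(T_k)^*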
Usually, the back-propagation algorithm [1] is used for learning (teaching) this type of net. The training session consists of presenting the samples of the training set and updating the weights using the desired output value for each presented sample. An optimum set of weights is obtained after a number of passes over the training set with some sort of iteration procedure
\[ w_{k+1} = \Phi(w_k), \qquad (I.3) \]
where Φ denotes the chosen weight-update rule.
In practical applications the algorithm needs many passes over the training set and is usually complemented with techniques that minimize the number of training cycles.
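For contrast with the one-step procedure proposed below, a schematic sketch of such an iterative update of the form (I.3) is shown here (plain gradient descent on a squared-error cost for a linear single-layer net; illustrative only, not the exact rule of [1]):

    import numpy as np

    def train_iterative(S, a, n_passes=1000, lr=0.01):
        """Iterative update (I.3). S: (p, n) sensor samples, a: (p,) desired actor signals."""
        w = np.zeros(S.shape[1])
        for _ in range(n_passes):                 # many passes over the training set
            grad = S.T @ (S @ w - a) / len(a)     # gradient of the mean squared error
            w = w - lr * grad                     # w_{k+1} = w_k - lr * grad
        return w

The step size lr has to be chosen small enough for the iteration to converge, which is exactly the kind of tuning the one-step procedure avoids.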
In this work a one-step learning procedure is proposed. The procedure includes Tikhonov regularization of the cost function to obtain a correct solution for the weight set of the neural control net. Some of these ideas were experimentally investigated during the student research project “Autonomous Robot PHOENIX-1” [2].
1. PROBLEM FORMULATION
The structure of a control system oriented to the “Teaching by Showing” methodology is illustrated in Fig. 1.1. The model of this system includes: sensors, described by the vector s(t); a parameterized control system with free parameters described by the vector w; and the actor reaction, described by the vector a(t). These vectors have dimensions Ns, Nw and Na respectively. To obtain a ready-to-use control system it is necessary to determine the vector w in some way.
[Fig. 1.1 shows the “Teaching by Showing” chain. The real environment conditions feed the sensor set s(t). During the “showing” phase the OPERATOR observes the “teaching” sensor set s(T_k) = (S_i(T_k), i = 1..Ns), k = 1..p, and produces the “teaching” actor set a(T_k) = (A_i(T_k), i = 1..Na). The TEACHER collects both into a database (the sensor teaching set and the operator actor set) and, through the teaching interface, estimates the optimal parameter set w of the parameterized control system with parameters W_i, i = 1..Nw.]

Fig. 1.1. “Teaching by Showing” chain
[Fig. 1.2 shows a one-layer neuron control system: the sensor inputs S_1, ..., S_i, ..., S_n are connected through the weight matrix W to the actor outputs A_1, ..., A_k, ..., A_m.]

Fig. 1.2. One-layer neuron control system
In simplified form the “Teaching by Showing” methodology assumes two phases: “showing” and “teaching”. During the “showing” phase the control system receives some “teaching” set as its input. The sensor part of the control system processes the inputs, and the results go to the “Operator” and “Teacher” subsystems. The “Operator” assesses the situation from the sensor outputs and generates actor signals, which also go to the “Teacher” database. It has to be pointed out that the “Operator” can be a human being or another control system with an unknown algorithm.
During the “teaching” phase the collected information is processed to estimate the vector of weights w. The estimation procedure has to produce a parameter vector w that provides the same reaction to the given teaching set as the “Operator”.
As a first step let us suppose that the control system is a single neuron, as illustrated in Fig. 1.2. For a given weight matrix the reaction of such a control net at time T_k to the input vector s(T_k) is the actor vector a(T_k), calculated as
\[ W \, s(T_k) = a(T_k), \qquad (1.1) \]
where s(T_k) is the vector describing the sensor measurements of the input influence at moment T_k, and W is an m x n matrix of weights, m being the number of actors and n the number of sensors.
In expanded form:
\[
\begin{pmatrix}
W_{11} & W_{12} & W_{13} & \cdots & W_{1i} & \cdots & W_{1n} \\
W_{21} & W_{22} & W_{23} & \cdots & W_{2i} & \cdots & W_{2n} \\
\vdots &        &        &        &        &        & \vdots \\
W_{m1} & W_{m2} & W_{m3} & \cdots & W_{mi} & \cdots & W_{mn}
\end{pmatrix}
\begin{pmatrix} S_1(T_k) \\ S_2(T_k) \\ \vdots \\ S_n(T_k) \end{pmatrix}
=
\begin{pmatrix} A_1(T_k) \\ A_2(T_k) \\ \vdots \\ A_m(T_k) \end{pmatrix}
\qquad (1.2)
\]
As described above, during the “showing” phase the teacher collects information from the sensors and actors. The information is sampled at time moments T_1, T_2, ..., T_p, where p is the number of time points in the whole teaching set. During the “teaching” phase the teacher has to estimate the coefficients of the matrix W from the collected information. With this matrix W the neural net control system has to react to a situation in the same manner as the operator did.
It has to be pointed out that known approaches such as back-propagation or nonlinear programming are very expensive procedures when one speaks about multi-channel real-time control systems and about teaching the control system of a real machine.
The common scheme is described as an iterative process
\[ W_{m+1} = W_m + h D_m, \qquad (1.3) \]
where W_m is the current matrix of weights, D_m is the correction matrix, and h is the step parameter of a one-dimensional search.
So the main question is: is it possible to propose a one-step procedure for estimating the matrix W, and is it possible to implement it in real control systems?
2. ONE STEP TEACHING PROCEDURE
For a system with the structure presented in Fig. 1.1, let w be the vector formed from the unknown coefficients W_ij of the matrix W,
\[ w = [W_{11}, \dots, W_{1n}, W_{21}, \dots, W_{2n}, \dots, W_{m1}, \dots, W_{mn}]^T \qquad (2.1) \]
and let a be the vector
\[ a = [A_1(T_1), \dots, A_m(T_1), A_1(T_2), \dots, A_m(T_2), \dots, A_1(T_p), \dots, A_m(T_p)]^T \qquad (2.2) \]
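In numpy terms (a hypothetical sketch, assuming the weights are stored as an m-by-n array W and the collected actor signals as a p-by-m array A), these two vectors correspond to row-major flattening:

    import numpy as np

    m, n, p = 2, 3, 4                                  # actors, sensors, time points (illustrative sizes)
    W = np.arange(m * n, dtype=float).reshape(m, n)    # weight matrix
    A = np.ones((p, m))                                # actor signals A_i(T_k) recorded by the teacher

    w = W.reshape(-1)   # (2.1): [W_11..W_1n, W_21..W_2n, ..., W_m1..W_mn]
    a = A.reshape(-1)   # (2.2): [A_1(T_1)..A_m(T_1), ..., A_1(T_p)..A_m(T_p)]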
It can be shown that the set of neuron net weights is the solution of the linear system of equations
\[ S \, w = a, \qquad (2.3) \]
where S is a sparse rectangular matrix of size (m·p) x (n·m), formed from the sensor sample set:
\[
S = \begin{pmatrix}
s(T_1)^T & 0 & \cdots & 0 \\
0 & s(T_1)^T & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & s(T_1)^T \\
s(T_2)^T & 0 & \cdots & 0 \\
0 & s(T_2)^T & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & s(T_2)^T \\
\vdots & & & \vdots \\
s(T_p)^T & 0 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & s(T_p)^T
\end{pmatrix}
\qquad (2.4)
\]
Here s(T_k)^T = (S_1(T_k), S_2(T_k), ..., S_n(T_k)), and each time moment T_k contributes m rows: the i-th of them carries the sensor row s(T_k)^T in the columns corresponding to the weights of the i-th actor and zeros elsewhere.
The problem of estimating the vector w can therefore be formulated as a least-squares problem: fitting the actor functions a(T_i) by a linear weighted combination of the sensor functions S_k(T_i).
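One possible way to assemble S from the collected samples is sketched below (a hypothetical numpy helper; build_S is an illustrative name). The block contributed by each time moment T_k is the Kronecker product of the m x m identity with the row vector s(T_k)^T, which reproduces the structure of (2.4) for the row-major weight ordering (2.1):

    import numpy as np

    def build_S(sensor_samples, m):
        """sensor_samples: (p, n) array, row k holds s(T_k); m: number of actors.
        Returns the (m*p, n*m) matrix S of (2.4)."""
        blocks = [np.kron(np.eye(m), s_k.reshape(1, -1)) for s_k in sensor_samples]
        return np.vstack(blocks)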
It has to be pointed out that in the general case the system (2.3) is singular, which leads to an ill-posed problem in the sense of the Hadamard classification. This means that Tikhonov regularization has to be used to obtain a correct solution [3].
The solution of the problem is the set of weights w that minimizes both the residual norm ||S w − a|| and the norm ||w||:
\[ \min_w F(w) = \|S w - a\|^2 + \alpha \|w\|^2, \qquad (2.5) \]
where α is the regularization coefficient.
The result can be expressed analytically as
\[ w = (S^T S + \alpha E)^{-1} S^T a. \qquad (2.6) \]
The solution (2.6) provides the weight set with minimal norm, which depends on the regularization term used. Other possibilities also exist [4].
It is also important that the Tikhonov regularization procedure provides a solution that is stable against experimental distortions of the sensor and actor measurements.
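Putting the pieces together, a minimal sketch of the whole one-step procedure in Python/numpy terms is given below (hypothetical names; the regularization coefficient α has to be chosen by the user, and E in (2.6) is the identity matrix; the construction of S is the same as in the build_S sketch above):

    import numpy as np

    def one_step_weights(sensor_samples, actor_samples, alpha=1e-3):
        """One-step estimate (2.6): w = (S^T S + alpha*E)^{-1} S^T a.
        sensor_samples: (p, n) array of s(T_k); actor_samples: (p, m) array of a(T_k)."""
        p, n = sensor_samples.shape
        m = actor_samples.shape[1]
        S = np.vstack([np.kron(np.eye(m), s.reshape(1, -1)) for s in sensor_samples])
        a = actor_samples.reshape(-1)                       # stacked as in (2.2)
        E = np.eye(S.shape[1])
        w = np.linalg.solve(S.T @ S + alpha * E, S.T @ a)   # regularized normal equations
        return w.reshape(m, n)                              # back to the weight matrix W

Using np.linalg.solve on the regularized normal equations avoids forming an explicit matrix inverse, which is usually preferable numerically.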
3. Neural control system with complex structure
Let us consider how the approach proposed above can be used for a neural net with a complex structure.
As a first step, consider a multilayer net with linear neurons. It is clear that for linear neurons (without activation functions) the output is a linear function of the weights and inputs, so a multilayer neuron net can be represented in a one-neuron approximation if one uses as inputs for this neuron the sensor measurements from different time slices.
The idea of the approach is illustrated in Fig. 3.1 for a neuron control net with one hidden layer. It is clear that in the general case, if the dimension of the sensor vector of the initial net is Ns, then the dimension of the sensor and weight vectors for the one-neuron approximation will be Ns·Nl, where Nl is the total number of layers. It is also clear that this approach is workable only if the number of teaching samples p is large enough.
The same approach can be used for a neuron net with feedback connections, as illustrated in Fig. 3.2. In this case the neuron has to have sensors that measure the actual “positions” of the actors.
The number of hidden layers also has to be taken into account when one constructs the matrix S for use in formula (2.6).
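A hedged sketch of this time-slice construction in numpy (lagged_inputs is an illustrative helper, not code from the project): it stacks the current sensor vector with a number of previous ones and, optionally, the previous actor signals for feedback nets; the resulting extended inputs can then be fed to the one-step procedure of Section 2.

    import numpy as np

    def lagged_inputs(sensor_samples, n_lags, actor_samples=None):
        """Build extended inputs from time slices.
        sensor_samples: (p, Ns); actor_samples: optional (p, Na) for feedback nets.
        Returns an array of shape (p - n_lags, Ns*(n_lags+1) [+ Na*n_lags])."""
        rows = []
        for k in range(n_lags, len(sensor_samples)):
            feats = [sensor_samples[k - j] for j in range(n_lags + 1)]        # s(T_k), s(T_k-1), ...
            if actor_samples is not None:
                feats += [actor_samples[k - j] for j in range(1, n_lags + 1)]  # previous actor outputs
            rows.append(np.concatenate(feats))
        return np.array(rows)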
It can also be pointed out that the same approach can be implemented for neurons with activation functions, or even with arbitrary functions, which is quite clear from the representation of the outputs in the form (3.1).
This means that the “Teaching by Showing” paradigm can be interpreted as the known problem of nonlinear graph fitting.
[Fig. 3.1 illustrates the one-neuron approximation for a neuron with a hidden layer: the sensor vectors s(T_k), s(T_k+1), s(T_k+2) from successive time slices serve as inputs of a single equivalent neuron that produces a(T_k).]

Fig. 3.1. One-neuron approximation for a neuron with a hidden layer

[Fig. 3.2 illustrates the one-neuron approximation for a neuron with feedback: besides the sensor inputs S_1(t_k), S_1(t_k+1), S_1(t_k+2), ... and S_2(t_k), S_2(t_k+1), S_2(t_k+2), ..., the feedback is represented by an additional sensor channel with S_3(t_k) = A(t_k-1), S_3(t_k+1) = A(t_k), and so on, while the outputs are A(t_k), A(t_k+1), A(t_k+2), ...]

Fig. 3.2. One-neuron approximation for a neuron with feedback
In explicit form (3.1) reads:
\[
\begin{pmatrix}
W_{11} & W_{12} & W_{13} & \cdots & W_{1i} & \cdots & W_{1n} \\
W_{21} & W_{22} & W_{23} & \cdots & W_{2i} & \cdots & W_{2n} \\
\vdots &        &        &        &        &        & \vdots \\
W_{m1} & W_{m2} & W_{m3} & \cdots & W_{mi} & \cdots & W_{mn}
\end{pmatrix}
\begin{pmatrix} F_1(S_1(T_k)) \\ F_2(S_2(T_k)) \\ \vdots \\ F_n(S_n(T_k)) \end{pmatrix}
=
\begin{pmatrix} A_1(T_k) \\ A_2(T_k) \\ \vdots \\ A_m(T_k) \end{pmatrix}
\qquad (3.1)
\]
It is clear that the approach described by (2.1)-(2.6) makes it possible to apply the one-step learning procedure to a system of the form (3.1).
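For instance (a hypothetical sketch; the per-channel functions here are illustrative choices, not taken from the paper), each sensor channel can be passed through its own fixed nonlinear function before the linear one-step fit, which turns (3.1) back into the linear problem (2.3) in the transformed inputs:

    import numpy as np

    # Fixed per-channel nonlinearities F_1..F_n (illustrative; here for 3 sensor channels)
    channel_funcs = [np.tanh, np.abs, lambda x: x**3]

    def transform_sensors(sensor_samples):
        """Apply F_i to the i-th sensor column; the result is used in place of S in (2.3)."""
        cols = [f(sensor_samples[:, i]) for i, f in enumerate(channel_funcs)]
        return np.column_stack(cols)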
One more promising idea can be introduced as follows. Let W be a linear function of s,
\[ W = W(s) = W_0 + (W_1, s). \qquad (3.2) \]
In this case the system (2.3) can be written in the form
\[
\begin{pmatrix}
S_1(T_1) & \cdots & S_{Ns}(T_1) & \cdots & S_i(T_1) S_j(T_1) & \cdots \\
S_1(T_2) & \cdots & S_{Ns}(T_2) & \cdots & S_i(T_2) S_j(T_2) & \cdots \\
\vdots   &        & \vdots      &        & \vdots            &        \\
S_1(T_p) & \cdots & S_{Ns}(T_p) & \cdots & S_i(T_p) S_j(T_p) & \cdots
\end{pmatrix}
\begin{pmatrix} W_0 \\ W_2 \end{pmatrix}
= a,
\qquad (3.3)
\]
where W_2 is the part of the weight vector that is responsible for the quadratic terms.
In the general case, when all products S_i S_j are used, the dimension NQ of the vector w for a neuron described by (3.2) is
\[ NQ = Ns + \frac{Ns(Ns+1)}{2} = \frac{Ns(Ns+3)}{2}. \qquad (3.4) \]
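A hedged sketch of this quadratic feature expansion (quadratic_features is a hypothetical helper): it appends all products S_i·S_j, i <= j, to the linear sensor features before the one-step fit.

    import numpy as np

    def quadratic_features(sensor_samples):
        """Expand (p, Ns) sensor samples to (p, Ns + Ns*(Ns+1)/2) features as in (3.3)-(3.4)."""
        p, ns = sensor_samples.shape
        quad = [sensor_samples[:, i] * sensor_samples[:, j]
                for i in range(ns) for j in range(i, ns)]
        return np.column_stack([sensor_samples] + quad)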
As a result, it can be stated that the proposed one-step teaching procedure has no limitations from the point of view of the neuron net control structure. It is also clear that the dimension of the system (2.6) will increase significantly if one applies (3.2) to a multilayer neural control net with feedback loops.
4. Experimental results
Some of these ideas were experimentally investigated during the student research project “Mobile Robot FENIX-1” (student design team: Astapkovitch D., Gontcharov A., Dmitriev A., Mickheev A.). More details about this project, the robot design and the experimental results are presented in [2].
[Fig. 4.1 shows the structure of robot FENIX-1: a web camera is connected to a Toshiba notebook, which communicates over RS-232 with the ASK Lab controller; the controller reads rotation speed sensors 1 and 2 and drives the left and right motor control bridges.]

Fig. 4.1. Robot FENIX-1
The autonomous robot FENIX-1 had to be taught to follow a white stripe using the current image from a web camera. The web camera image was processed by software running on the notebook. The artificial neural net was also realized in this software and controlled the robot movement through the RS-232 interface.
During the experiments with robot FENIX-1 a system with 5 sensors and one actor was investigated. The actor signal was the difference between the PWM values for the left and right motors:
\[ A_1(T) = \mathrm{left\_motor\_PWM}(T) - \mathrm{right\_motor\_PWM}(T). \qquad (4.1) \]
In this case the teaching model is rather simple:
\[
\begin{pmatrix}
S_1(T_1) & S_2(T_1) & \cdots & S_5(T_1) \\
S_1(T_2) & S_2(T_2) & \cdots & S_5(T_2) \\
\vdots   &          &        & \vdots   \\
S_1(T_p) & S_2(T_p) & \cdots & S_5(T_p)
\end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_5 \end{pmatrix}
=
\begin{pmatrix} A_1(T_1) \\ A_1(T_2) \\ \vdots \\ A_1(T_p) \end{pmatrix}
\qquad (4.2)
\]
\[ w = (S^T S + \alpha E)^{-1} S^T a \qquad (4.3) \]
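A hedged end-to-end sketch of this single-channel case in numpy (the data here are synthetic and purely illustrative; in the real experiment the sensor values came from the image processing described in [2]):

    import numpy as np

    rng = np.random.default_rng(0)
    p = 500                                  # number of collected time points
    S = rng.normal(size=(p, 5))              # 5 sensor channels per time point
    a = S @ np.array([0.5, -0.1, 0.05, 0.3, -0.2]) + 0.01 * rng.normal(size=p)  # operator signal

    alpha = 0.05                             # regularization coefficient
    w = np.linalg.solve(S.T @ S + alpha * np.eye(5), S.T @ a)   # one-step estimate (4.3)
    a_fit = S @ w                            # neural net approximation of the operator behavior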
During the experiments a PID regulator was used as the “OPERATOR”. The MATHCAD 11.0 solver was used during the “TEACHING” phase. The Mathcad program and its results are presented in Fig. 4.2.
The first graph presents the set of data collected while the robot followed the white stripe under “OPERATOR” control. The second graph is the approximation of the actor behavior by the neural net.
After teaching, the robot could follow the white stripe from different starting positions.
Conclusion
The experiments confirmed the reliability and effectiveness of the proposed approach. As described above, a very simple model was used during the experiments.
For a real system the dimension of the matrix S is of the order of Np·Ns·Na·Nl, which means that the possibility of solving linear systems with at least 10^4-10^5 equations has to be investigated.
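For systems of that size, one possible direction (a hedged sketch, not part of the reported experiments; scipy is assumed to be available) is to keep S sparse and use an iterative regularized least-squares solver such as LSQR, whose damp parameter plays the role of the Tikhonov coefficient:

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.linalg import lsqr

    def one_step_weights_sparse(S_dense, a, alpha=1e-3):
        """Regularized least squares for large sparse S:
        minimizes ||S w - a||^2 + alpha^2 * ||w||^2 without forming S^T S explicitly."""
        S = csr_matrix(S_dense)          # S is mostly zeros for multi-actor nets, see (2.4)
        result = lsqr(S, a, damp=alpha)  # iterative solver
        return result[0]                 # estimated weight vector w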
This research will form the core of the next student research project, Phoenix-2, which started at the very beginning of 2007.
List of publications
1. Industrial Applications of Neural Networks, edited by Lakhmi C. Jain and V. Rao Vemuri, CRC Press LLC, Boca Raton, London, New York, Washington, D.C., 1999.
2. Astapkovitch D.A., Gontcharov A.A., Dmitriev A.S., Mickheev A.V. Project “Robot Fenix-1”. Proceedings of the 59th International Student Scientific and Technical Conference of SUAI, Part I, Technical Sciences, April 18-22, 2006, Saint Petersburg, pp. 211-216 (in Russian).
3. Tikhonov A.N., Arsenin V.Ya. Methods for Solving Ill-Posed Problems. Moscow, Nauka, Main Editorial Board for Physical and Mathematical Literature, 1979, 286 p. (in Russian).
4. Boyarintsev Yu.E. Regular and Singular Systems of Linear Ordinary Differential Equations. Novosibirsk, Nauka, 1980, 220 p. (in Russian).
[Fig. 4.2. Mathcad worksheet “EXAMPLE 1. ONE CHANNEL SYSTEM”: the collected sensor set SENS1 and the operator actor signal ACT1 (NDATA samples), together with the regulator output REG, are plotted first; the weight vector is then computed as w = (SENS1^T SENS1 + α E)^{-1} SENS1^T ACT1, and the second graph compares the neural net approximation FRES1_i, formed as the weighted sum of the five sensor channels, with the recorded ACT1.]