MITLibraries Document Services

advertisement
MITLibraries
Document Services
Room 14-0551
77 Massachusetts Avenue
Cambridge, MA 02139
Ph: 617.253.5668 Fax: 617.253.1690
Email: docs@mit.edu
http: //libraries. mit. edu/docs
DISCLAIMER OF QUALITY
Due to the condition of the original material, there are unavoidable
flaws in this reproduction. We have made every effort possible to
provide you with the best copy available. If you are dissatisfied with
this product and find it unusable, please contact Document Services as
soon as possible.
Thank you.
Due to the poor quality of the original document, there is
some spotting or background shading in this document.
August 1981
r,-;ia
text of
LIDS-P-1125
0~r"!
-] r-~Lj %
rOI)FL PA
77;.H' D.
[
[
A UT HCQK-
N
secOinl 1a.cc
t
SiZEE;
Fi
1]1
PA:.:PE
,
. IDISTRIBUITED ,YNAitC
X l1
aI
Ufsl
PROGrANLlNG*:D
--- .....
sec--on- /"
L7Z1__
Dimitr:?P; !Bertsekas
i
-
Laboratory fr Informationand DecisionfSvstens ___
_Massachusetts Institute of TechnologyL
C.lnrtxtl:
ofhIrst pae hAbsfrt:ai hr:r. Pis l'eroiessors Ivtlcs
i. Th' Cd h;f
Besfn'
end of. Aa : We consider distributed algorithms for solving
.
dynamic programming problems whereby several processors participate simultaneously in the comnputation while maintaining coordination by information
exchange via communication links. A model of asynchronous distributed computation is developed which
requires very weak. assumptions on the ordering of computations, the timing of information exchange,
the amount of local information needed at each computation node, and the initial conditions for the
algorithm. The class of problems considered is
very broad and includes shortest path problems, and
finite and infinite horizon stochastic optimal control problems. . When specialized to a shortest path
reduces to the algori.tllm origproblemi the algori.t.hml
inally implemented for routing of messages in the
ARPANET.
1.
i
CambridgeMassachusetts 02139
A 17sAT o
i
Introduction
Recent advances in microcomputer technology
have intensified interest in distributed computation scI;hmes. Aside from modular expandability,
other potential advantages of sucth schemes are a
reduction in computation time for solving a given
problem due to parallelism of computation, and
elimination of the need to communicate problem
data available at geographically dispersed data
collection points to a computation center. The
first advantage is of crucial importance in real
time applications where problem solution time can
be an implementation bottleneck. The second advantage manifests itself for excamnple in applications
involving communication networks where there is a
natural decentralization of problem data acquisition.
The structure of dynamic programming naturally
lends itself well to distributed computation since
it involves calculations that to a great extent
can be carried out in parallel. In fact it is
trivial to devise simple schemes taking advantage
of this structure whereby the calculation involved
in each iteration of the standard form of the algoSuch
rithm is simply shared by several processors.
schemes require a certain degree of synchronlization
*This research ias conducted at the
.I.T. Laoratory for Information and Decision Systemcls ith
partial suilpport provided by thle National Science
Foundation Grant No. NSF/ECS 79-19SSO.
To be presented at the
Decembercrucia
1981.
-~~~frs
is-~-advatag
importance-of____
L
their assignaicist
must'complete
ed portion of the computation before..a new iteration can bein. As a esult complex protocols for
ialgorithm initiation and processor synchronization
may be necessary, and the speed of computation is
limited to that of the slowest processor. These
drawbacks motivate distributed algorithms whereby.,
computation is performed asynchronously at various
nodes and independently of the progress in other .
jnodes. Their potential advantages are simpler
implementation, faster convergence to a solution
.and, possibly, a reduction in information exchange
between computation nodes.
Thlis paper considers an asynchronous distributed algorithm for a broad class of dynanric programming problems. This class is described in
Section 2. The distributed computation model is
described in Section 3. It is shown in Section 4
that the algorithm converges to the correct solution under very weak assumptions.
For some classes
of problems convergence in finite time is demonstrated. These include shortest path problems for
which the distributed algorithm of this paper turns
out to be essentially the same as the routing algorithn originally implemented in the ARPANET in
1969 [1]. To our knowledge there is no published
proof of convergence of this algorithm.
2.
Problem Formulation
Wteuse an abstract framework of dynamic programming, first introduced in [2], [3] which includes as special cases a number of specific problems of practical interest.
Let S and C be two sets referred to as the
state space and the control space respectively.
Elements of S and C are referred to as states and
controls and are denoted by x and u respectivelyxa
For each xcS we are given a subset U(x)c C referred
to as the control constraint set at x. Let F be
the set of all extended rcal valued functions
J: S-i[-,pap] on S. For any two functions J1, JieF
we use the otatio
i
(a)
< J
if
J2
xcS
(a)
1 2
J 1 (X)
ib)
S.
i
(X),
J
if J(X)·
R:JP
2
=
2
Lt IIn: S x C x F
be
amapping which is
monoteone in the sense that for all xcS and ucU(x)
we have
20th IEEE Conference on Decision and Control,
'in
real !
h'e u
an
San Diego, CA,
r abs
- t
rac
tf
?,t
if).
E-
77
-ijTH'E!(.;B:Ff
.
PA,
S i
RE
FINAL SIZE 8' X
tf'.A'E
;:'fAP. L
l jJ26Ej7with J 1
cl(~xjuJ);'J
|~l(Y~~
,-..T
Given 'a subset FCF the problem is to:-,find a:func-fCi:)
_|_
.......
.
x
scalar (2)-----_
isp(2) _The scalar a is positive.
-----r ... Because the-set-W-isl assumed-countable--the
J*( . --f- ( expected
Svalue-in-.(9) -is well -defined:-for--all.JF
in terms of infinite summation provided we use the
It is pos+' (see [31],p.31).
AUTFIConvention +
that
tion F,*cFsuch
J(...
I:l.into
(2I:l The. :functiQns..g, aind f ima.p,Six;C[~.%] ;and S respectively-.
< J
11
,
,
;
S
fl sUECX)
-
sible-to-consider-a -more general probabilistic
structure_for..W (see [3]).- atthe-expense of compliJ
J cating the presentation but this does not seem
f4)
O ( AN1iZAworthwhile in view of the computational purposes
By considering.-the. mapping-T.J--.-.---F-defined-by ---
'
=
T(J) (x)
inf
ueU(:)
'
H(x,u,'J)
the problem-is alternately-'stated
'as-one-of
findin
|a fixed Poitt~,i
3' t4itl:l~i
r,, .ea:,
:fu~fnction J*cF,
such that
~J*
....... T(J*)
_
I..........
...............
)--
j
.- e
I
..- -It is shown in [3] that with this definition
of It the abstract problem (3) reduces under more
[specific-assumptions-to various types. of standard -
Thus if
istochastic optimal control problems.
.
.-
-----------.
---.------.-------------.---------------
.We will assume throughout that T has a unique fixed
'.''..-
......--.
f,:_.:,:',"-.,.'
.x>-.:..
Within F.
-point
! S
-1,.'!
We provide some examples that illustrate the
broad scope of the problem formulation just given.
Example 1 (Shortest Path Problems): Let (N,L) be
a directed graph where N = {1,2, .. ,.n} denotes the
Let
set of nodes and L denotes the set of links.
N(i) denote the downstream neighbors of node i,
i.e., the set of nodes j for which (i,j) is a link.
Assume that each link (i,j) is assigned a positive
Assume also
scalar a. referred to as its length.
1)
that there is a directed path to node 1 from every
other node. Then is is known ([4], p.67) that the
shortest path distances dt to node 1 from all other
I
nodes i solve uniquely the equations
dit =
=
d*
1
mrin
{a.
jcN(i)
(6a)
Vi Z 1
+ dc,
(6b)
0.
'
-
- -
If we make tie identifications
S
=
N, U(x)
C =
=
N(x),
F
=
F,J*(x)
=
(isee [51' Sections 6.1-6.3). ' Under these circumstances the mapping T of (4) has a unique fixed
point J* in the class of all bounded real valued
'functions on S and J* is the optimal value function
of the corresponding stochastic optimal, control
problem.
'
Is we assume that 0 < g(x,u,w) or g(x,u,w) < 0
s
for all (x,u,w)ES x C x W1then we obtain stochastic
optimal control problems of the type discussed extensively, for example, in [5], Sections 6.4-6.6,
If J* is the optinal
7.1-7.4, and [31, Chapter 5.
value function for such a problem then J* is the
such
fixed point of T over all functions JF
unique fixed point of T over all functions JaF such
or
all
(x,u,w),
that 0 < J < J* if 0. < g(x,u,w) for
J* < J < 0 if g(x,u,w) < 0 for all (x,u,w) (see
[53, p. 256).
Example 3 (Finite Ilorizon Stochastic Optimal Control
Let S,C,U(x), I, p(dwlx,u), g and f be
Problems):
as in Example 2 and consider the set of equations
JN(x)
=
Jk(xk)
=
x
=
4axu
+ J(u)
0
x
if
x = 1
=
E{g(x,u,w)
+ ctJ[f(x,u,w)]lx,u}
E{g(xk'uk'
+ Jk l[f(X
(9)
where the following are assumded:
(1)
'Theparameter ! takes valtues in a countable
set W w\ithi given probability distribution p(dw lx,u)
and
depending
vdapldinueon xx on
and u, and
anclE{I-x,u}
thil,
xu di denotes
not-cs
nrizon expected
ctipbtct
d
recspect to this distribution.
value withi
k)
+
(lOb)
k uk('k)I0xk'Uk},
k =,1,.
(8)
we find that the abstract problem (3) reduces to
the shortcest path prolblenm,
: 2 (Infinite
,
.
ExamlE1e 2 (Infnitc lhorizon Stochastic Oti.mal Control P'roblems) : Let II be given by
lH(x,tu,J)
inf
UkSU (xk)
1:
if
(lOa)
, xES
d*
(7)
H(x,u,J)
g is
,uniformly bounded above and below. by some scalars
<
<
a
1 the problem is equivalent to the
,and O
istandard infinite horizon discounted stochastic
!optimal control problem with bounded cost per stage
-1,
xkcS,
These are the usual
where N is a positive integer.
dynamic programming equations associated with finite horizon stochastic optimal control problems
with zero terminal cost and stationary cost per
stage and systcem function.
It is possible to write
these equations in the form (3) by defining a new
state space consisting of an (N+l)-fold Cartesian
product of S wlithl itsclf, wiriting J*
)
=
(JO'J1'' .
J , and appropriatcly defining II on the basis of
(10).
In fact thlis is a standard procedure for convcrt-in a finite horizon problem to an infinite hoizt, p.325). This reformulation
rizn problem (see
(see [5],
F3
AUTJHOi
7-, FV'-D.
INi%1 8nEr' PPAG ,:' I'EHiE
1MC)Oi L1PAP ERR
-
FidNAi SIZE 8'2 X 1
...............--.....
1
[
ii.....
[ican.also be-trivially generalized to finite horimission.fro!l
ell-s
bfe
it
zon problems involving a:nonzero terminal cost
i
t
and a nonstationary system-andc--cost per stage-------fstoos its own astim
ate of
X.alues
o..the_.sonutio
an
s'afunction
a
J* for all states xvX Ti...
The con.tents
3. A':todel for Distributed Dynamic 'Proun
. raruning
_..L
. .....
------------.-..--.
...
eacch'bouffer Bt' where
uicit
=i- or
....-----
r-
---
t---...
J
at tiredt
are a
.
our. algorithm can -be described -in terms-of-a.--..
.
.... .every-Tt,_.i.for
a fun.tion
collection of n computation centers referred to asd
from S. into [-O ] and mcay be viewed as the nesti,
nodes and denoted 1,2,...,n.;
The state space S iJ-' i
i
!
_
i
mate bynodei _ofthe_restriction ofthesolution
partitionedinto.n-.disjoint-. ets--denotedms
? oFS~r-..f-3i, ~~
~ oi--;1*~Jtmnsuch that
'function
J* on S. availabl a
at ttime t, The rules
Each
node
is as-signed
-the-responsiSl
ity--6f-dom:accordingb
to wllch
which functions
functiaonsJij
Jmt are
areupdated
puting
are as
the ivalues
a'
of the',solution
function
J*-at--\!j-wccording
to
updated-are
lows
all states x in the corresponding set S ..A node
....
t
rfo1
-.
_-....._
,.
1
.............
.
-
---
is -said-tol
be- aneighbor of inode i if'j
there exist a.state x.Y.':and two:'functions
_:r
valusof J o1n
5J
X)...-
·...-
i' and1If
i
J
·1
s obufferf
.J2(X)
*J~cF'--Suc~ '"tha
TOJ )(x )
/
T(J
j
-r
-"-
BTatAtsmeaetl arertransmitted and entered
C__-lla) Jiinthe bufrfer-iB..-at- tme 2,uti.e.
-- o ..... .
.
..............
(x
b)
(12)
The set-of all neighbors of i is denoted N(i).tte In- e
tuitively j is not a neighbor of i if, for every
JtF, the values of
aJ on Sp do not influence Sthe
i
iIi
values of (J) on S...........As a result, for an JdeF,
1
in order for node i to
to be able to compute T(J) on
S.~ is only necessary
S. it
to know the values of J on
sets S., jNo'(i),
......
[t-t2 'is a tracnsmission interval' for node
ato
tl;
is n
y j
to node
i withi ieN(j)
the contents J.. of the
i
.
1
municates the estimate obtained from the latest
compute phase to one or more nodes In for whichl
iFN(m).
In the idle state node i does nothing related to the solution of the problem.
It is assumed that a node can receive a transmission from
neighbors simultaneously with computing or transmitting, but this is not a real restriction since,
if needed, a time period in a separate receive
ifneeded,
period
a.separatereceie
in
.the
state can be lumpeda time
into a time periodin the idle
state.
at time t
by the restriction
restriction of the
the function T(J
t
nowhere, for all t, J. is defined by
and, possibly, on the set hS..
At each time instant, node i can be in one of
three possible states compute, transmit, or idle.
In the compute state node i computes a new estimate of the values of the solution function J* for
all states xES..
In the transmit state node i com il
nter
2
contents of buffer B..
i
(xi
)
are replaced
{.2
t-
on S
if XES.
1
t......
t
Ji
)
if X
and
-N(i)
otherwise
(13)
(
In other words we have
2
Jii(x)
f)
T
T(Ji
(x)
=
inf
H(x,u,Ji ),'ES
u U(x)
.
The contents of a buffer B.
he
r
i
i.l4)
can change only at
on
end of a computation interval for node i.
The
contents of a buffer Bij,
I> J£N%(i), can change only
at the end of a transmission interval from j to i.
We assume that computation
and transmission
for' each node takes place in uninterupted time intervals [tP
t ]with
with
t < < t2'
t
but
c do not exclude
toeach
[lod2]
il
the possibility that a node may- be simultaneously
transmitting to more than' one nodes nor do wie assunle that the transmission intervals to these nodes
have the same origin and/or termination.
WTeo
also
make no assumptions on the length, timing and sequencing of computation and transmission intervals
other than the following:
Assulmption (A): There exists a positive scalar P
such that, for every node i, every time interval of
lengtl
at Pleast
contains
on C
ttiol iT-Vl
for i and at least one transmission intcrval fritio
to each nodoU m title iEN(m).
N~i)
Each node i also has one buffer per neighborI
..Djj t~hcre
, enoted
iaalgorithm
j N(i) denoted B.. where it stores the latest trans1.
Note that by definition of the neighbor set
t
the value T(Jt)(x)
for xeS. does not depend on
tt
1
the vat
states xSm ith mi, and m
i
We have assigned arbitrarily the default value zero
to these states in (13).
Our objective 'i s to show
that for all i = 1,...,n
N(i),
lim JIjx)
=
J*(x),
SS,
j =i orJN(i).
It is clear that an assuio
such as () is ncssar) in order for such a result to hold.
SineICCit(14) is of the dv-ic
prra.
ng yp it
is also cltear that some restrictions rust be placed
on the mapping II that guarante-e convergence of the
algorithm under the usual circumstances where the
is carried out in a centralized, synchr-o-
77%
%
'I)
?tD.
I1
J .AF
SZI 8
E L PAPt. F
D
sup
1
here
yin }.;'E{g(X4IW) } -A-.:J *
C
OUS nmanner-(i-e..,:iwhen:there is. bnly one computation node). The following assumption places
xS ucU(x) x
.-----somewhat -indirect -restrictions -on -Il but-simplifiesinf E{g(x,u,w)}
inf
<
the convergence analysis'
T ITLE 0' FIR~SfrT
PAE'iEEE,
":_.-- ..
\
xES uEU(x)'
- .6i-i(B)
s-ti--As-s
Assumption (B):-- Thlere- exist--two- functions J -and -J.
-- ---------.---------with---;---, 'in F Fsuch that -the-set-of all functions---JEF-.
' A~ .For infinite horion stochastic optimal control
furthermore
JJ-A.T
to F and
and f..rther..ore
belongs to
-<< JJ <
< JJ belongs
J
J >~-T(J}--,----T(J
-jT(J)-v---T (J_)-->-- z-J>_LT
*--·J
~--_
,,
(5-'
r- k-lim T (J)(x)
k-t~oL_____.___
~X:
lim T (-J)(x)V )GAN!EZATIs
xcS
=J*(x),
_
L___
L
problems with nonpositive cost per stage (Example 2
<O) 'if'tcan be shoW.. thatt the-functions J, J
| i
_,
N A O;:.'
VES
J*(x), J(x) =0,
j .J(x)
'iwith'-g
~~(15)--[--
l --
-v --
... .
__
_
'ithth
-
-
_...............
___.
--J*(x),; VXS
lim
(J)(x)--=
T
:r.;.:
o :L d :r','.'
:..c".: -,;, ;:Ir~co.:
ko.n:
sk-',
1:!
Xtcost
k,.: Kr
k
where T denotes composition .of..the mapping T with
i
itself k times.
2 8(t6) If the
pp- 21,298).
i[
If *th
;'.Xt. 29.
n.td
|
per stage is nonnegative (Example 2 with g > 0)
.then, under a -mild assumption (which is satisfied.;
in particular if U(x) is a finite set for each x),
..
,
it..can be shown. that..the choices.J,.J with---
Note that in view of the monotonicity of H
[cf. (2)] and the fact J*.= T(J*), Assumption (B)
implies
J(x).
J > T(J) > T2(j)
J*" > ...
,-'..>
-
....
...
>T 2(J)>T(J)> J
satisfy Assumption (B)
0 ,
oif
J(i)
i
- - (17b)
0
if i = 1
..
19)
(x),
3
13
conditions allowed by
The broad range of initial
(19) eliminates the need to reset the contents of
the buffers in an environment where it is necessary
to execute periodlically the algorithm; with slightly
different problem data as for example in routing
This is
algorithms for communication networks.
J(x)
is
1
=
...
._.
assumption that the contents J.ii.J of the- buffers B..ii
:k~ce
at the initial time t=O satisfy
(17a)
n
i = 1 .....
J*x)
Our convergence result will be sho.e;n under the
-k-tco
Assumption (B) leaves open the question of how
On the other
to find suitable functions J and J.
hand for most problems of interest the choices are
In particular we have the following:
clear.
J(i) =
. =
satisfy Assumption (B) ([5], pp. 263, 298)._ The
Ichoice of J and J can be further sharpened and simunder more specific assumptions on problem
: -plified
.structure but we will not pursue this matter further.
Furthermore if JeF satisfies J < J < J then
k.
lim T (J)(x) = J*(x) for all xES.
1)
For shortest path problems (Example 1) it
straightforward to verify that the choices
= __0, .J...J(x)
particularly true for cases 1)-3) above where condition (19) implies that the initial buffer contents
be essentially arbitrary.
.can
- - .-
-
4.
Convergence Analysis
satisfy Assumption (B).
Our main result is the following proposition.
2) For finite horizon stochastic optimal control
Proposition 1: Let Assumptions (A) and (B) hold
problems (Example 3) for which the function g in
and assume that for all i = 1,,..,n
(10) is uniformly bounded below it can be easily
O
) and
verified that the functions J = (J oJ1 .....
efrlN
J(x) < J .(x) < J(x), VxfgS., j = i or ja.N(i).(20)
w
e
~~(J
0~~~o,'
1. '.,)
~
J =
:x
and
all
k
for
*)
**
JJhere
J = t~o,,
8Then for all i = 1,..,n
t
. .J
.)
=
(x)
,
Jk(X) =
j = i or JEN(i).
=J*(x), VxcS
li J .(x)
t
Assumption (B).
satisfy
3) For discounted infinite horizon stochastic optimal control problems with bounded cost per stagC
(Example 2 with ac(O,1) and g uniformly bounded
above and below) it is easily shown that ever) pair
of funct:i.ons .J, J of Ctei forrml
J(x)
--
,
J(x)
=
$,
-
-
Proof: Siince the contents of a buffer can change
end of a co. itation or transmission
only at te
interval at a node we can analyze convergence in
Ie focus attenterms of a discrete tine process.
tion at thle seu;cncc of times {t} *with0 < tl <
t2
<...
where each tk is
the end of a computation
interval for one or more nodes.
VXES
where
Let
"".
~ ~~~ ~
'
.--
us
consider
for
all
t
>
0
--
-
-.
'NUM' ER PAGES 'IE1- .
AUTHOR
77,
t
,.
:uS.>-,:[.i
-*iJ,:3
'=t
J...: n S.-:'o
]-
·-13
FIN A L S Z
MD E L PAPI'E
R-ED.
,.
,;~l.,nc:e
-'n
-
X 11
S
c;
9f}'SuocI
trere
'-]n-
with potential strict inequality only for nodes j
[J. (x)] repwhere for each xCS., the value J..(x)
13
t or t was the end of a co.
.whicither
3
t1 or t2 was thnhed of a o
3fither
resent the contents of buffer B.. at time t ift
..
_
....
_.__
..
........'The- preceding- argument c-an be
-T tatio-n-tlterval.
algorithm were executed with the same timing and
and
all-k;---i--=-l;,.,;ns
that-- for -intervals -6ut-r-l repeated-to~-show -order' o.fcomputation~.and 'ransmission
with the initial condition J(x) [J(x)] instead ofAitJTHoN(i) we have
J
I
hf
o
o
(x --for-each -buffer -B .---and-xS-.--- The-monoto
algottheI'
icity assumption (2)aniiid rtheadefinieiti
rithm (cf. (13), (14)] clearly imply that for alk.tjIZATION
< J
(x)
x
IJ.(k' X&J;
<
()
1
at $;tf2
--.
.
j = i or jcN(i).(21)
i - l...,n,
will thus suffice to show that
It
-J x
-
-- o r-
:
....
(28)
,ttktl)
Let k -be th first-int*oer for which 2P < t
ICwhere P is as in Assumption (A). Then each node kl
must have completed at least one computation phase
in the interval [0,P] and at least one transmission
Iphaseto all nodes in the interval [P,2P].__ It is
seen that this together with (28), the monoIeasily
~~~~~~
-
(3
_
.
...
(13),j(14)]
ihl[. cl4l
N(i)
;
.. t ix<.nl3x,
> .
.....
-t
-1~~~~~~~~~t
-.
(x),xS
J
----
)
T
(J)(x)
.)
.)..
,tk
(4)l > J
then for all t[t
with potential strict inequality only for nodes j
for which t was the end of a c6mplutation interval.
(x)
>
Vxk
S-
J(x)
>
.
N(i).
1t-otj
~, >
-t
(x)
>
(x),
xS.
13
.
i]...,n, jN(i), ....
13
xholds
:'y combining (21), (29), and (30) and using Assumption (B) we obtain (22), (23) and the proof of
Q.E.D.
the proposition is complete.
Note that (21), (29), and (30) provide useful
estimates of rate of convergence. In fact by using these relations it is possible to show that in
some special cases convergence is attained in fi-
j=l,...,n, maN(j),
t).~t
1 ,t2
- 227that
-1 and (14)
(26)
t
J,(x)
> J t (x), Vx
j.-3
n jN(i).
jENCi).
,n
.
The last relation can also be written as
> 3. (X), VxcS ,
iJ= 1.
..
(25.
2
tcttl't 2
I.
(x)
+l)
-
;..;
,n,
im : !-i = l, ..............................
J....................
.......................
13: 3s(x)
t[tl't
t-.
we obtain for all t[etk,t
1Similarly
i(4)
we must havei,
so from
..(x) >
(29)
the content of B.. is either J.. or
t2i
For te[ti
st
J
ni,.
.
..
argument can be repeated and-shows that if
m(k) is the largest integer m such that 2mP < t
In view of the fact J > T(J) we have clearly
.-'This-l
J(X) -oi(X) > Jn(X) , VXES,
]r
_
V
I£
,ei
-lJii
i
tk+
:. '
(236)
J
-(27)
with potential strict inequality only for nodes j
for which t2 was the end of a computation interval.
(25):and(27)twe
Combin
Combining (25) and (27) uIcobtain
Proposition 2:
In the cases of Exap.les 1 and 3
respectively
and
satisfying
there
tk > t
athere
ti(20)
for all i=<, ,..
,n
andexists
m e t > O such that
f:Jo (x)
I
oProof:
holds
k
T (J)(x)
J*(x),
j = i or jN(i
VxS.,
For Example 3 it is
k
T (J)(x)
easily seen that there
J*(),
VxS,
k > N+l
i\CAUTHor,
A
~l
EU
77'' RED.
G-.
F!iA L. SIZE S" X 11
ThdprooflfolLo- s fromi'(2l)-(29) ,and (30).
!example 1 it is easily seen that
k-
;-itD
iS;
lw
izMlI
U ivi
M O cE L PAPE R
r..
For
|
Proof:''Sec '61
-sccg;d
(:sceti@i1fla
r:ete
;
. ..
......
.....
i
s5.I'~~
Conclusions
~~ ~1
J*(i), ~~~~~k
i = 2, .....
,n
;FAG
-- 1
!
k
}
'The analysis of this paper shows that naturall
Also for-each-i,--T (J) (i)- represents-the' length- of;--7-- distributed dynamic-progran i-ng s'cheme§s'co
lverge
a path starting -from-i with k links;-and each -link'--c--to the-correct--solution under very-weak-asstumptions
has positive length. Therefore there exists a k , .,.on
.
the problem structure, and the timing and orderk
.
ing of computation and internode communication.
.
su~h that
P ~~(Jf)
(i.)~~~represents length of a path
from' -he restrictions--on the initial-conditions are also
2
i to node i, for otherwise the paths corresponding
Teryweaki
This-means that- forproblems that are
k-very--weak-- This-means tat--forproblems that are.
to T (J)(i)would cycle indefinitely without reach-__ being solved continuously in real time, it is not
1k. necessary to reset the initial conditions and re+
,
, ,,k. .... J h
ing nodek ...
----- .
---. ------------..
...
-Since-..r-- --synchronize the algorithm each time the problem
J)
Tk
(i)'<
J*(i)":and'J*(i) is:the:shoirtest distance
'data
changes.
As a result the potential for track5from i to-l we'obtain '';;-i -;:
r-I-:
ing slow variations in optimal control laws is inr-i
'/k
1,[...
proved, and algorithmic implementation isgreatly ,
T
)i)
1.=
.J*
i-,
= jZi,
-. .': .,n k>K.
simplified.
T (J) (i)
=
-
The result again follows from (21),
(29)
and (30).
--------------
.Q.E..
-
(-i->,--.
_.!.[2]
2
In many problems of interest the main objective of the algorithm is to obtain a minimizing
control law P*, i.e. a function *: S
C with
p*(x)cU(x) for all xES such that
a=
iln
uem
7i'
(31)S.
t (X,U,J*),vXeS.
HS(X.
(1)
'
It
thus of interest to investigate the question
of
t: w herC contro
satisfng
las
St: C satisfying
S
of whether control laws
7
D. P.. Bertsekas,-"':,fonotone
Mappings with ApD P
o
plication in Dyna-. ic ProgrammSing", SI M.I J.
; Control and Otimiation, Vol. 15
77 pp.
Control and
tmzaton Vol. 1
1977
.....
Optimal Control: The Discrete-Time Case,
Academic PFrss, N.Y., 1978.
is
Pt(X)CU(x),
VxS
-
t
H[x,lp
(x),J.]
where J.
is
min
ll(x,u,Jt),
ueU(x)
VxES.,
given for all t by (13),
t(33)
converge in
Let the assumptions of Proposition
k1 hold.
Assuml:e
that for
uZU(x)
sequence
{.7k}CF also
for which
liml every
J (x) xcS,
= J*(x)
for and
for all xcS we have
k
lim H(x,u,.J)
= -I(x,u,J*).
(34)
k-~o
Then for each state xES for 1which U(x) is a finite
set there exists t
> 0 such that for all t > t
--
x
it (x)
satisfies
H[x, t (x),J']
E. L. Lawler, Combinatorial Optimization:
Networks and latreids, 'lolt, Rinehart, and
Winston, N.Y., 1976.
[5]
D. P. Bertsekas, Dynamic Programming and Stochastic Control, Academic Press, N.Y., 19.
[6]
D. P. Bertsekas, "Distributed Dynamic Programming", Report LIDS-P-1060, Lab. for Information
and Decision Systems, M.1.iT'., Cambridiue, MA,
1980 (to appear in IEEE Trans. on Aut. Control).
(32),
m in
u 0J(x)
(33) thlen
1l(x,u,.J'k) ..
-
i=l .....
i
some sense to a control law p* satisfying (31).
The following proposition shows that convergence
is attained in finite time if the sets U(x) are finite and 1Hhas a continuity property which is satisfied for most problerms of practical interest.
A
related convergence result can be shown assuming,
the sets U(x) are compact (c.f. [5], Prop. 5.11).
Proposition 3:
[4]
(32)
and
if
_______.
J.'itcQuillan, G. Falk, and I. Richer, i'ARe---q ----.
viel of-the Development--and Performance of
---------- the ARPANET Routing Algorithm"r-- IEEE -Trans.on ConLmunications, Vol. COM-26, 1978, pp.
1802-1811
It is possible to construct examples showing
that in the case of the shortest path problem the
number of iterations needed for finite convergence
'k
.
of T (J) depends on the link lengths in a mamnner
which makes the overall algorithm nonpolynomial.
i[X,P*(X),J*]
Refereces
x
Download