MITLibraries Document Services

Room 14-0551, 77 Massachusetts Avenue, Cambridge, MA 02139. Ph: 617.253.5668, Fax: 617.253.1690, Email: docs@mit.edu, http://libraries.mit.edu/docs

DISCLAIMER OF QUALITY

Due to the condition of the original material, there are unavoidable flaws in this reproduction. We have made every effort possible to provide you with the best copy available. If you are dissatisfied with this product and find it unusable, please contact Document Services as soon as possible. Thank you.

Due to the poor quality of the original document, there is some spotting or background shading in this document.

August 1981    LIDS-P-1125

DISTRIBUTED DYNAMIC PROGRAMMING*

Dimitri P. Bertsekas
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139

Abstract

We consider distributed algorithms for solving dynamic programming problems whereby several processors participate simultaneously in the computation while maintaining coordination by information exchange via communication links. A model of asynchronous distributed computation is developed which requires very weak assumptions on the ordering of computations, the timing of information exchange, the amount of local information needed at each computation node, and the initial conditions for the algorithm. The class of problems considered is very broad and includes shortest path problems, and finite and infinite horizon stochastic optimal control problems. When specialized to a shortest path problem the algorithm reduces to the algorithm originally implemented for routing of messages in the ARPANET.

*This research was conducted at the M.I.T. Laboratory for Information and Decision Systems with partial support provided by the National Science Foundation Grant No. NSF/ECS 79-19880. To be presented at the 20th IEEE Conference on Decision and Control, San Diego, CA, December 1981.

1. Introduction

Recent advances in microcomputer technology have intensified interest in distributed computation schemes. Aside from modular expandability, other potential advantages of such schemes are a reduction in computation time for solving a given problem due to parallelism of computation, and elimination of the need to communicate problem data available at geographically dispersed data collection points to a computation center. The first advantage is of crucial importance in real time applications where problem solution time can be an implementation bottleneck. The second advantage manifests itself for example in applications involving communication networks where there is a natural decentralization of problem data acquisition.

The structure of dynamic programming naturally lends itself well to distributed computation since it involves calculations that to a great extent can be carried out in parallel. In fact it is trivial to devise simple schemes taking advantage of this structure whereby the calculation involved in each iteration of the standard form of the algorithm is simply shared by several processors. Such schemes require a certain degree of synchronization in that all processors must complete their assigned portion of the computation before a new iteration can begin. As a result complex protocols for algorithm initiation and processor synchronization may be necessary, and the speed of computation is limited to that of the slowest processor. These drawbacks motivate distributed algorithms whereby computation is performed asynchronously at various nodes and independently of the progress in other nodes.
Their potential advantages are simpler implementation, faster convergence to a solution and, possibly, a reduction in information exchange between computation nodes.

This paper considers an asynchronous distributed algorithm for a broad class of dynamic programming problems. This class is described in Section 2. The distributed computation model is described in Section 3. It is shown in Section 4 that the algorithm converges to the correct solution under very weak assumptions. For some classes of problems convergence in finite time is demonstrated. These include shortest path problems for which the distributed algorithm of this paper turns out to be essentially the same as the routing algorithm originally implemented in the ARPANET in 1969 [1]. To our knowledge there is no published proof of convergence of this algorithm.

2. Problem Formulation

We use an abstract framework of dynamic programming, first introduced in [2], [3], which includes as special cases a number of specific problems of practical interest.

Let $S$ and $C$ be two sets referred to as the state space and the control space respectively. Elements of $S$ and $C$ are referred to as states and controls and are denoted by $x$ and $u$ respectively. For each $x \in S$ we are given a subset $U(x) \subset C$ referred to as the control constraint set at $x$. Let $F$ be the set of all extended real valued functions $J: S \to [-\infty, +\infty]$ on $S$. For any two functions $J_1, J_2 \in F$ we use the notation

$$J_1 \le J_2 \quad \text{if } J_1(x) \le J_2(x), \ \forall x \in S, \tag{1a}$$

$$J_1 = J_2 \quad \text{if } J_1(x) = J_2(x), \ \forall x \in S. \tag{1b}$$

Let $H: S \times C \times F \to [-\infty, +\infty]$ be a mapping which is monotone in the sense that for all $x \in S$ and $u \in U(x)$ we have

$$H(x,u,J_1) \le H(x,u,J_2), \quad \forall J_1, J_2 \in F \text{ with } J_1 \le J_2. \tag{2}$$

Given a subset $\bar{F} \subset F$ the problem is to find a function $J^* \in \bar{F}$ such that

$$J^*(x) = \inf_{u \in U(x)} H(x,u,J^*), \quad \forall x \in S. \tag{3}$$

By considering the mapping $T: F \to F$ defined by

$$T(J)(x) = \inf_{u \in U(x)} H(x,u,J) \tag{4}$$

the problem is alternately stated as one of finding a fixed point of $T$ within $\bar{F}$, i.e., a function $J^* \in \bar{F}$ such that

$$J^* = T(J^*). \tag{5}$$

We will assume throughout that $T$ has a unique fixed point within $\bar{F}$.

We provide some examples that illustrate the broad scope of the problem formulation just given.

Example 1 (Shortest Path Problems): Let $(N,L)$ be a directed graph where $N = \{1,2,\dots,n\}$ denotes the set of nodes and $L$ denotes the set of links. Let $N(i)$ denote the downstream neighbors of node $i$, i.e., the set of nodes $j$ for which $(i,j)$ is a link. Assume that each link $(i,j)$ is assigned a positive scalar $a_{ij}$ referred to as its length. Assume also that there is a directed path to node 1 from every other node. Then it is known ([4], p. 67) that the shortest path distances $d_i^*$ to node 1 from all other nodes $i$ solve uniquely the equations

$$d_i^* = \min_{j \in N(i)} \{a_{ij} + d_j^*\}, \quad \forall i \ne 1, \tag{6a}$$

$$d_1^* = 0. \tag{6b}$$

If we make the identifications

$$S = C = N, \quad U(x) = N(x), \quad \bar{F} = F, \quad J^*(x) = d_x^*, \tag{7}$$

$$H(x,u,J) = \begin{cases} a_{xu} + J(u) & \text{if } x \ne 1, \\ 0 & \text{if } x = 1, \end{cases} \tag{8}$$

we find that the abstract problem (3) reduces to the shortest path problem.

Example 2 (Infinite Horizon Stochastic Optimal Control Problems): Let $H$ be given by

$$H(x,u,J) = E\{g(x,u,w) + \alpha J[f(x,u,w)] \mid x,u\} \tag{9}$$

where the following are assumed:

(1) The parameter $w$ takes values in a countable set $W$ with given probability distribution $p(dw \mid x,u)$ depending on $x$ and $u$, and $E\{\cdot \mid x,u\}$ denotes expected value with respect to this distribution.

(2) The functions $g$ and $f$ map $S \times C \times W$ into $[-\infty, +\infty]$ and $S$ respectively.

(3) The scalar $\alpha$ is positive.

Because the set $W$ is assumed countable the expected value in (9) is well defined for all $J \in F$ in terms of infinite summation provided we use the convention $+\infty - \infty = +\infty$ (see [3], p. 31). It is possible to consider a more general probabilistic structure for $W$ (see [3]) at the expense of complicating the presentation but this does not seem worthwhile in view of the computational purposes of the paper.

It is shown in [3] that with this definition of $H$ the abstract problem (3) reduces under more specific assumptions to various types of standard stochastic optimal control problems. Thus if $g$ is uniformly bounded above and below by some scalars and $0 < \alpha < 1$ the problem is equivalent to the standard infinite horizon discounted stochastic optimal control problem with bounded cost per stage (see [5], Sections 6.1-6.3). Under these circumstances the mapping $T$ of (4) has a unique fixed point $J^*$ in the class of all bounded real valued functions on $S$ and $J^*$ is the optimal value function of the corresponding stochastic optimal control problem.

If we assume that $0 \le g(x,u,w)$ or $g(x,u,w) \le 0$ for all $(x,u,w) \in S \times C \times W$ then we obtain stochastic optimal control problems of the type discussed extensively, for example, in [5], Sections 6.4-6.6, 7.1-7.4, and [3], Chapter 5. If $J^*$ is the optimal value function for such a problem then $J^*$ is the unique fixed point of $T$ over all functions $J \in F$ such that $0 \le J \le J^*$ if $0 \le g(x,u,w)$ for all $(x,u,w)$, or $J^* \le J \le 0$ if $g(x,u,w) \le 0$ for all $(x,u,w)$ (see [5], p. 256).

Example 3 (Finite Horizon Stochastic Optimal Control Problems): Let $S$, $C$, $U(x)$, $W$, $p(dw \mid x,u)$, $g$ and $f$ be as in Example 2 and consider the set of equations

$$J_N(x_N) = 0, \quad x_N \in S, \tag{10a}$$

$$J_k(x_k) = \inf_{u_k \in U(x_k)} E\{g(x_k,u_k,w_k) + J_{k+1}[f(x_k,u_k,w_k)] \mid x_k,u_k\}, \quad k = 0,1,\dots,N-1, \ x_k \in S, \tag{10b}$$

where $N$ is a positive integer. These are the usual dynamic programming equations associated with finite horizon stochastic optimal control problems with zero terminal cost and stationary cost per stage and system function. It is possible to write these equations in the form (3) by defining a new state space consisting of an $(N+1)$-fold Cartesian product of $S$ with itself, writing $J^* = (J_0, J_1, \dots, J_N)$, and appropriately defining $H$ on the basis of (10). In fact this is a standard procedure for converting a finite horizon problem to an infinite horizon problem (see [5], p. 325). This reformulation can also be trivially generalized to finite horizon problems involving a nonzero terminal cost and a nonstationary system and cost per stage.

3. A Model for Distributed Dynamic Programming

Our algorithm can be described in terms of a collection of $n$ computation centers referred to as nodes and denoted $1,2,\dots,n$. The state space $S$ is partitioned into $n$ disjoint sets denoted $S_1,\dots,S_n$. Each node $i$ is assigned the responsibility of computing the values of the solution function $J^*$ at all states $x$ in the corresponding set $S_i$. A node $j$ is said to be a neighbor of node $i$ if $j \ne i$ and there exist a state $x_i \in S_i$ and two functions $J_1, J_2 \in F$ such that

$$J_1(x) = J_2(x), \quad \forall x \notin S_j, \tag{11a}$$

$$T(J_1)(x_i) \ne T(J_2)(x_i). \tag{11b}$$

The set of all neighbors of $i$ is denoted $N(i)$. Intuitively $j$ is not a neighbor of $i$ if, for every $J \in F$, the values of $J$ on $S_j$ do not influence the values of $T(J)$ on $S_i$. As a result, for any $J \in F$, in order for node $i$ to be able to compute $T(J)$ on $S_i$ it is only necessary to know the values of $J$ on the sets $S_j$, $j \in N(i)$, and, possibly, on the set $S_i$.

At each time instant, node $i$ can be in one of three possible states: compute, transmit, or idle. In the compute state node $i$ computes a new estimate of the values of the solution function $J^*$ for all states $x \in S_i$. In the transmit state node $i$ communicates the estimate obtained from the latest compute phase to one or more nodes $m$ for which $i \in N(m)$. In the idle state node $i$ does nothing related to the solution of the problem. It is assumed that a node can receive a transmission from neighbors simultaneously with computing or transmitting, but this is not a real restriction since, if needed, a time period in a separate receive state can be lumped into a time period in the idle state.

We assume that computation and transmission for each node take place in uninterrupted time intervals $[t_1,t_2]$ with $t_1 < t_2$, but do not exclude the possibility that a node may be simultaneously transmitting to more than one node, nor do we assume that the transmission intervals to these nodes have the same origin and/or termination. We also make no assumptions on the length, timing and sequencing of computation and transmission intervals other than the following:

Assumption (A): There exists a positive scalar $P$ such that, for every node $i$, every time interval of length $P$ contains at least one computation interval for $i$ and at least one transmission interval from $i$ to each node $m$ with $i \in N(m)$.

Each node $i$ also has one buffer per neighbor $j \in N(i)$ denoted $B_{ij}$ where it stores the latest transmission from $j$, as well as a buffer $B_{ii}$ where it stores its own estimate of values of the solution function $J^*$ for all states $x \in S_i$. The contents of each buffer $B_{ij}$, where $j = i$ or $j \in N(i)$, at time $t$ are denoted $J^t_{ij}$. Thus $J^t_{ij}$ is, for every $t$, a function from $S_j$ into $[-\infty, +\infty]$ and may be viewed as the estimate by node $i$ of the restriction of the solution function $J^*$ on $S_j$ available at time $t$. The rules according to which the functions $J^t_{ij}$ are updated are as follows:

1) If $[t_1,t_2]$ is a transmission interval for node $j$ to node $i$ with $j \in N(i)$, the contents $J^{t_1}_{jj}$ of the buffer $B_{jj}$ at time $t_1$ are transmitted and entered in the buffer $B_{ij}$ at time $t_2$, i.e.

$$J^{t_2}_{ij} = J^{t_1}_{jj}. \tag{12}$$

2) If $[t_1,t_2]$ is a computation interval for node $i$, the contents of buffer $B_{ii}$ at time $t_2$ are replaced by the restriction of the function $T(J^{t_1}_i)$ on $S_i$ where, for all $t$, $J^t_i$ is defined by

$$J^t_i(x) = \begin{cases} J^t_{ii}(x) & \text{if } x \in S_i, \\ J^t_{ij}(x) & \text{if } x \in S_j \text{ and } j \in N(i), \\ 0 & \text{otherwise.} \end{cases} \tag{13}$$

In other words we have

$$J^{t_2}_{ii}(x) = T(J^{t_1}_i)(x) = \inf_{u \in U(x)} H(x,u,J^{t_1}_i), \quad \forall x \in S_i. \tag{14}$$

3) The contents of a buffer $B_{ii}$ can change only at the end of a computation interval for node $i$. The contents of a buffer $B_{ij}$, $j \in N(i)$, can change only at the end of a transmission interval from $j$ to $i$.

Note that by definition of the neighbor set $N(i)$, the value $T(J^t_i)(x)$ for $x \in S_i$ does not depend on the values of $J^t_i$ at states $x \in S_m$ with $m \ne i$ and $m \notin N(i)$. We have assigned arbitrarily the default value zero to these states in (13).

Our objective is to show that for all $i = 1,\dots,n$

$$\lim_{t \to \infty} J^t_{ij}(x) = J^*(x), \quad \forall x \in S_j, \ j = i \text{ or } j \in N(i).$$

It is clear that an assumption such as (A) is necessary in order for such a result to hold. Since iteration (14) is of the dynamic programming type it is also clear that some restrictions must be placed on the mapping $H$ that guarantee convergence of the algorithm under the usual circumstances where the algorithm is carried out in a centralized, synchronous manner (i.e., when there is only one computation node). Assumption (B) below places somewhat indirect restrictions on $H$ but simplifies the convergence analysis.
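To make the update rules (12)-(14) concrete, the following Python sketch simulates the asynchronous algorithm on a small shortest path problem of the type of Example 1, with each node $i$ responsible for the single state $S_i = \{i\}$. The graph, its link lengths, the random event schedule, and the names `async_shortest_path`, `buf`, and `in_flight` are assumptions made for illustration and do not come from the paper. Each round contains one computation per node and one send and one receive per link in random order, which is one simple way of meeting Assumption (A); the initial buffer contents are taken to be arbitrary nonnegative numbers, with zero for state 1.

```python
import random

def async_shortest_path(links, n, rounds=60, seed=0):
    """Asynchronous iteration (12)-(14) for Example 1 with S_i = {i}.

    links maps (i, j) to the positive length a_ij; node 1 is the destination.
    """
    rng = random.Random(seed)
    # Downstream neighbors N(i): nodes j such that (i, j) is a link.
    nbrs = {i: sorted(j for (a, j) in links if a == i) for i in range(1, n + 1)}
    # buf[i][j] holds buffer B_ij, node i's current estimate of J*(j).
    # Initial contents: arbitrary nonnegative numbers, zero for state 1.
    buf = {i: {j: 0.0 if j == 1 else rng.uniform(0.0, 100.0)
               for j in [i] + nbrs[i]}
           for i in range(1, n + 1)}
    in_flight = {}  # (sender, receiver) -> value captured at time t1

    for _ in range(rounds):
        events = [("compute", i, i) for i in range(1, n + 1)]
        for i in range(1, n + 1):
            for j in nbrs[i]:
                events.append(("send", j, i))
                events.append(("recv", j, i))
        rng.shuffle(events)  # arbitrary ordering within each round
        for kind, src, dst in events:
            if kind == "compute":
                if src == 1:
                    buf[1][1] = 0.0  # H(1, u, J) = 0, cf. (8)
                else:
                    # Rule (14): minimize a_iu + J(u) over u in N(i).
                    buf[src][src] = min(links[(src, u)] + buf[src][u]
                                        for u in nbrs[src])
            elif kind == "send":
                in_flight[(src, dst)] = buf[src][src]
            elif (src, dst) in in_flight:
                # Rule (12): the value sent at t1 enters B_ij at t2.
                buf[dst][src] = in_flight.pop((src, dst))
    return {i: buf[i][i] for i in range(1, n + 1)}


# Hypothetical 4-node graph; shortest distances to node 1 are 0, 1, 3, 4.
LINKS = {(2, 1): 1.0, (2, 3): 1.0, (3, 1): 5.0,
         (3, 2): 2.0, (4, 2): 4.0, (4, 3): 1.0}
print(async_shortest_path(LINKS, 4))
```

Separating the send and receive events lets a node compute with stale neighbor estimates, which is the situation the buffers $B_{ij}$ are meant to model. Changing `seed` changes the schedule and the initial buffer contents but should leave the final estimates unchanged.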
Assumption (B): There exist two functions $\underline{J}$ and $\bar{J}$ in $\bar{F}$ such that the set of all functions $J \in F$ with $\underline{J} \le J \le \bar{J}$ belongs to $\bar{F}$ and furthermore

$$\bar{J} \ge T(\bar{J}), \quad T(\underline{J}) \ge \underline{J}, \tag{15}$$

$$\lim_{k \to \infty} T^k(\bar{J})(x) = J^*(x), \quad \lim_{k \to \infty} T^k(\underline{J})(x) = J^*(x), \quad \forall x \in S, \tag{16}$$

where $T^k$ denotes composition of the mapping $T$ with itself $k$ times.

Note that in view of the monotonicity of $H$ [cf. (2)] and the fact $J^* = T(J^*)$, Assumption (B) implies

$$\bar{J} \ge T(\bar{J}) \ge T^2(\bar{J}) \ge \cdots \ge J^* \ge \cdots \ge T^2(\underline{J}) \ge T(\underline{J}) \ge \underline{J}.$$

Furthermore if $J \in F$ satisfies $\underline{J} \le J \le \bar{J}$ then $\lim_{k \to \infty} T^k(J)(x) = J^*(x)$ for all $x \in S$.

Assumption (B) leaves open the question of how to find suitable functions $\underline{J}$ and $\bar{J}$. On the other hand for most problems of interest the choices are clear. In particular we have the following:

1) For shortest path problems (Example 1) it is straightforward to verify that the choices

$$\underline{J}(i) = 0, \quad i = 1,\dots,n, \tag{17a}$$

$$\bar{J}(i) = \begin{cases} \infty & \text{if } i \ne 1, \\ 0 & \text{if } i = 1, \end{cases} \tag{17b}$$

satisfy Assumption (B).

2) For finite horizon stochastic optimal control problems (Example 3) for which the function $g$ in (10) is uniformly bounded below it can be easily verified that the functions $\underline{J} = (\underline{J}_0, \underline{J}_1, \dots, \underline{J}_N)$ and $\bar{J} = (\bar{J}_0, \bar{J}_1, \dots, \bar{J}_N)$ where for all $k$ and $x$

$$\underline{J}_k(x) = -\infty, \quad \bar{J}_k(x) = +\infty,$$

satisfy Assumption (B).

3) For discounted infinite horizon stochastic optimal control problems with bounded cost per stage (Example 2 with $\alpha \in (0,1)$ and $g$ uniformly bounded above and below) it is easily shown that every pair of functions $\underline{J}$, $\bar{J}$ of the form

$$\underline{J}(x) = \underline{\beta}, \quad \bar{J}(x) = \bar{\beta}, \quad \forall x \in S,$$

where

$$\underline{\beta} \le \frac{1}{1-\alpha} \inf_{x \in S} \inf_{u \in U(x)} E\{g(x,u,w)\}, \quad \frac{1}{1-\alpha} \sup_{x \in S} \sup_{u \in U(x)} E\{g(x,u,w)\} \le \bar{\beta},$$

satisfy Assumption (B).

4) For infinite horizon stochastic optimal control problems with nonpositive cost per stage (Example 2 with $g \le 0$) it can be shown that the functions $\underline{J}$, $\bar{J}$ with

$$\underline{J}(x) = J^*(x), \quad \bar{J}(x) = 0, \quad \forall x \in S,$$

satisfy Assumption (B) ([5], pp. 2?1, 298; the middle digit of the first page number is illegible in the scan). If the cost per stage is nonnegative (Example 2 with $g \ge 0$) then, under a mild assumption (which is satisfied in particular if $U(x)$ is a finite set for each $x$), it can be shown that the choices $\underline{J}$, $\bar{J}$ with

$$\underline{J}(x) = 0, \quad \bar{J}(x) = J^*(x), \quad \forall x \in S,$$

satisfy Assumption (B) ([5], pp. 263, 298). The choice of $\underline{J}$ and $\bar{J}$ can be further sharpened and simplified under more specific assumptions on problem structure but we will not pursue this matter further.

Our convergence result will be shown under the assumption that the contents $J^0_{ij}$ of the buffers $B_{ij}$ at the initial time $t = 0$ satisfy

$$\underline{J}(x) \le J^0_{ij}(x) \le \bar{J}(x), \quad \forall x \in S_j. \tag{19}$$

The broad range of initial conditions allowed by (19) eliminates the need to reset the contents of the buffers in an environment where it is necessary to execute the algorithm periodically with slightly different problem data, as for example in routing algorithms for communication networks. This is particularly true for cases 1)-3) above where condition (19) implies that the initial buffer contents can be essentially arbitrary.

4. Convergence Analysis

Our main result is the following proposition.

Proposition 1: Let Assumptions (A) and (B) hold and assume that for all $i = 1,\dots,n$

$$\underline{J}(x) \le J^0_{ij}(x) \le \bar{J}(x), \quad \forall x \in S_j, \ j = i \text{ or } j \in N(i). \tag{20}$$

Then for all $i = 1,\dots,n$

$$\lim_{t \to \infty} J^t_{ij}(x) = J^*(x), \quad \forall x \in S_j, \ j = i \text{ or } j \in N(i).$$

Proof: Since the contents of a buffer can change only at the end of a computation or transmission interval at a node we can analyze convergence in terms of a discrete time process. We focus attention on the sequence of times $\{t_k\}$ with $0 < t_1 < t_2 < \cdots$ where each $t_k$ is the end of a computation interval for one or more nodes.

Let us consider for all $t \ge 0$

$$\bar{J}^t_{ij}: S_j \to [-\infty,\infty], \quad \underline{J}^t_{ij}: S_j \to [-\infty,\infty], \quad i = 1,\dots,n, \ j = i \text{ or } j \in N(i),$$

where for each $x \in S_j$, the value $\bar{J}^t_{ij}(x)$ [$\underline{J}^t_{ij}(x)$] represents the contents of buffer $B_{ij}$ at time $t$ if the algorithm were executed with the same timing and order of computation and transmission intervals but with the initial condition $\bar{J}(x)$ [$\underline{J}(x)$] instead of $J^0_{ij}(x)$ for each buffer $B_{ij}$ and $x \in S_j$. The monotonicity assumption (2) and the definition of the algorithm [cf. (13), (14)] clearly imply that for all $t$

$$\underline{J}^t_{ij}(x) \le J^t_{ij}(x) \le \bar{J}^t_{ij}(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j = i \text{ or } j \in N(i). \tag{21}$$

It will thus suffice to show that

$$\lim_{t \to \infty} \bar{J}^t_{ij}(x) = J^*(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j = i \text{ or } j \in N(i), \tag{22}$$

$$\lim_{t \to \infty} \underline{J}^t_{ij}(x) = J^*(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j = i \text{ or } j \in N(i). \tag{23}$$

In view of the fact $\bar{J} \ge T(\bar{J})$ we have clearly

$$\bar{J}(x) = \bar{J}^0_{jj}(x) \ge \bar{J}^{t_1}_{jj}(x), \quad \forall x \in S_j, \ j = 1,\dots,n, \tag{24}$$

with potential strict inequality only for nodes $j$ for which $t_1$ was the end of a computation interval. For $t \in [t_1,t_2)$ the content of $B_{ij}$ is either $\bar{J}^0_{jj}$ or $\bar{J}^{t_1}_{jj}$, so from (24) we must have

$$\bar{J}^0_{jj}(x) \ge \bar{J}^t_{ij}(x) \ge \bar{J}^{t_1}_{jj}(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j \in N(i), \ t \in [t_1,t_2). \tag{25}$$

In terms of the functions $\bar{J}^t_i$ defined as in (13), the last relation together with (24) can also be written as

$$\bar{J}^0_i(x) \ge \bar{J}^t_i(x), \quad \forall x \in S, \ i = 1,\dots,n, \ t \in [t_1,t_2), \tag{26}$$

so that by the monotonicity assumption (2) and (14)

$$\bar{J}^{t_1}_{jj}(x) \ge \bar{J}^{t_2}_{jj}(x), \quad \forall x \in S_j, \ j = 1,\dots,n, \tag{27}$$

with potential strict inequality only for nodes $j$ for which $t_2$ was the end of a computation interval. Combining (25) and (27) we obtain

$$\bar{J}^0_{jj}(x) \ge \bar{J}^t_{ij}(x) \ge \bar{J}^{t_2}_{jj}(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j \in N(i), \ t \in [t_1,t_2),$$

with potential strict inequality only for nodes $j$ for which either $t_1$ or $t_2$ was the end of a computation interval. The preceding argument can be repeated to show that for all $k$, $i = 1,\dots,n$, and $j \in N(i)$ we have

$$\bar{J}^0_{jj}(x) \ge \bar{J}^t_{ij}(x) \ge \bar{J}^{t_{k+1}}_{jj}(x), \quad \forall x \in S_j, \ t \in [t_k,t_{k+1}). \tag{28}$$

Let $k_1$ be the first integer for which $2P < t_{k_1}$ where $P$ is as in Assumption (A). Then each node must have completed at least one computation phase in the interval $[0,P]$ and at least one transmission phase to all nodes in the interval $[P,2P]$. It is easily seen that this together with (28), the monotonicity assumption (2), and the definition of the algorithm [cf. (13), (14)] implies that for all $t \ge t_{k_1}$

$$T(\bar{J})(x) \ge \bar{J}^t_{ij}(x) \ge J^*(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j = i \text{ or } j \in N(i).$$

This argument can be repeated and shows that if $m(k)$ is the largest integer $m$ such that $2mP < t_k$ then for all $t \in [t_k,t_{k+1})$

$$T^{m(k)}(\bar{J})(x) \ge \bar{J}^t_{ij}(x) \ge J^*(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j = i \text{ or } j \in N(i). \tag{29}$$

Similarly we obtain for all $t \in [t_k,t_{k+1})$

$$T^{m(k)}(\underline{J})(x) \le \underline{J}^t_{ij}(x) \le J^*(x), \quad \forall x \in S_j, \ i = 1,\dots,n, \ j = i \text{ or } j \in N(i). \tag{30}$$

By combining (21), (29), and (30) and using Assumption (B) we obtain (22), (23) and the proof of the proposition is complete. Q.E.D.

Note that (21), (29), and (30) provide useful estimates of rate of convergence.
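As a numerical illustration of these estimates, the sketch below iterates the mapping $T$ of (4), with $H$ as in (9), from constant lower and upper functions of the kind described in case 3) of Section 3, and records both sequences. The two-state, two-control problem, the discount factor, and the names `ALPHA`, `P`, `G`, `T`, and `bounds` are hypothetical and chosen only for illustration; `G[x][u]` stands for the expected cost per stage $E\{g(x,u,w) \mid x,u\}$. Under these assumptions the lower sequence should increase, the upper sequence should decrease, and their gap should shrink like $\alpha^k$.

```python
ALPHA = 0.9  # discount factor, 0 < ALPHA < 1

# Hypothetical problem data: P[x][u] lists (probability, next_state)
# pairs and G[x][u] is the expected cost per stage at state x, control u.
P = {0: {0: [(0.8, 0), (0.2, 1)], 1: [(1.0, 1)]},
     1: {0: [(0.5, 0), (0.5, 1)], 1: [(1.0, 0)]}}
G = {0: {0: 2.0, 1: 0.5}, 1: {0: 1.0, 1: 3.0}}


def T(J):
    """Mapping (4) with H as in (9): minimize expected cost over u."""
    return {x: min(G[x][u] + ALPHA * sum(p * J[y] for p, y in P[x][u])
                   for u in P[x])
            for x in P}


def bounds(iterations):
    """Return [(T^k(lower), T^k(upper)) for k = 0, ..., iterations]."""
    costs = [G[x][u] for x in G for u in G[x]]
    # Constant functions with lower <= T(lower) and T(upper) <= upper.
    lower = {x: min(costs) / (1 - ALPHA) for x in P}
    upper = {x: max(costs) / (1 - ALPHA) for x in P}
    history = [(lower, upper)]
    for _ in range(iterations):
        lower, upper = T(lower), T(upper)
        history.append((lower, upper))
    return history


if __name__ == "__main__":
    for k, (lo, up) in enumerate(bounds(5)):
        print(k, lo, up)
```

In the asynchronous algorithm the buffer contents at time $t \in [t_k, t_{k+1})$ lie between $T^{m(k)}(\underline{J})$ and $T^{m(k)}(\bar{J})$, so for a discounted problem the sandwich computed here bounds the error after roughly $m(k)$ effective iterations.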
In fact, by using relations (21), (29), and (30) it is possible to show that in some special cases convergence is attained in finite time.

Proposition 2: In the cases of Examples 1 and 3, with $\underline{J}$, $\bar{J}$ chosen as in cases 1) and 2) of Section 3 respectively and the initial buffer contents $J^0_{ij}$ satisfying (20), there exists a time $\bar{t} > 0$ such that for all $i = 1,\dots,n$ and $t \ge \bar{t}$ there holds

$$J^t_{ij}(x) = J^*(x), \quad \forall x \in S_j, \ j = i \text{ or } j \in N(i).$$

Proof: For Example 3 it is easily seen that there holds

$$T^k(\bar{J})(x) = T^k(\underline{J})(x) = J^*(x), \quad \forall x \in S, \ k \ge N+1.$$

The proof follows from (21), (29), and (30). For Example 1 it is easily seen that

$$T^k(\bar{J})(i) = J^*(i), \quad \forall i = 2,\dots,n, \ k \ge n.$$

Also for each $i$, $T^k(\underline{J})(i)$ represents the length of a path starting from $i$ with $k$ links, and each link has positive length. Therefore there exists a $\bar{k}$ such that $T^{\bar{k}}(\underline{J})(i)$ represents the length of a path from $i$ to node 1, for otherwise the paths corresponding to $T^k(\underline{J})(i)$ would cycle indefinitely without reaching node 1 and we would have $T^k(\underline{J})(i) \to \infty$. Since $T^k(\underline{J})(i) \le J^*(i)$ and $J^*(i)$ is the shortest distance from $i$ to 1 we obtain

$$T^k(\underline{J})(i) = J^*(i), \quad \forall i = 2,\dots,n, \ k \ge \bar{k}.$$

The result again follows from (21), (29) and (30). Q.E.D.

It is possible to construct examples showing that in the case of the shortest path problem the number of iterations needed for finite convergence of $T^k(\underline{J})$ depends on the link lengths in a manner which makes the overall algorithm nonpolynomial.

In many problems of interest the main objective of the algorithm is to obtain a minimizing control law $\mu^*$, i.e. a function $\mu^*: S \to C$ with $\mu^*(x) \in U(x)$ for all $x \in S$ such that

$$H[x,\mu^*(x),J^*] = \min_{u \in U(x)} H(x,u,J^*), \quad \forall x \in S. \tag{31}$$

It is thus of interest to investigate the question of whether control laws $\mu^t: S \to C$ satisfying

$$\mu^t(x) \in U(x), \quad \forall x \in S, \tag{32}$$

and

$$H[x,\mu^t(x),J^t_i] = \min_{u \in U(x)} H(x,u,J^t_i), \quad \forall x \in S_i, \ i = 1,\dots,n, \tag{33}$$

where $J^t_i$ is given for all $t$ by (13), converge in some sense to a control law $\mu^*$ satisfying (31). The following proposition shows that convergence is attained in finite time if the sets $U(x)$ are finite and $H$ has a continuity property which is satisfied for most problems of practical interest. A related convergence result can be shown assuming the sets $U(x)$ are compact (cf. [5], Prop. 5.11).

Proposition 3: Let the assumptions of Proposition 1 hold. Assume also that for every $x \in S$, $u \in U(x)$ and sequence $\{J^k\} \subset F$ for which $\lim_{k \to \infty} J^k(x) = J^*(x)$ for all $x \in S$ we have

$$\lim_{k \to \infty} H(x,u,J^k) = H(x,u,J^*). \tag{34}$$

Then for each state $x \in S$ for which $U(x)$ is a finite set there exists $\bar{t}_x > 0$ such that for all $t \ge \bar{t}_x$, if $\mu^t(x)$ satisfies (32), (33) then

$$H[x,\mu^t(x),J^*] = \min_{u \in U(x)} H(x,u,J^*).$$

Proof: See [6].

5. Conclusions

The analysis of this paper shows that natural distributed dynamic programming schemes converge to the correct solution under very weak assumptions on the problem structure, and the timing and ordering of computation and internode communication. The restrictions on the initial conditions are also very weak. This means that for problems that are being solved continuously in real time, it is not necessary to reset the initial conditions and resynchronize the algorithm each time the problem data changes. As a result the potential for tracking slow variations in optimal control laws is improved, and algorithmic implementation is greatly simplified.

References

[1] J. McQuillan, G. Falk, and I. Richer, "A Review of the Development and Performance of the ARPANET Routing Algorithm", IEEE Trans. on Communications, Vol. COM-26, 1978, pp. 1802-1811.

[2] D. P. Bertsekas, "Monotone Mappings with Application in Dynamic Programming", SIAM J. Control and Optimization, Vol. 15, 1977, pp. 438-464.

[3] D. P. Bertsekas and S. E. Shreve, Stochastic Optimal Control: The Discrete-Time Case, Academic Press, N.Y., 1978.

[4] E. L. Lawler, Combinatorial Optimization: Networks and Matroids, Holt, Rinehart, and Winston, N.Y., 1976.

[5] D. P. Bertsekas, Dynamic Programming and Stochastic Control, Academic Press, N.Y., 1976.

[6] D. P. Bertsekas, "Distributed Dynamic Programming", Report LIDS-P-1060, Lab. for Information and Decision Systems, M.I.T., Cambridge, MA, 1980 (to appear in IEEE Trans. on Aut. Control).
