LIDS-P-1427 January 1985 OPTIMAL CONTROL OF PRODUCTION RATE IN A FAILURE PRONE MANUFACTURING SYSTEM* by R. AKELLA Rm. -35-431 Laboratory for Information and Decision Systems Massachusetts Institute of Technology Cambridge, Massachusetts 02139 P.R. KUMAR Dept. of Math. and Computer Science U.MoB.C. 5401 Wilkens Avenue Baltimore, Maryland 21228 ABSTRACT We address the problem of controlling the production rate of a failure prone manufacturing system so as to minimize the discounted inventory cost, where certain cost rates are specified for both positive and negative inventories, and there is a constant demand rate for the commodity produced. problem is the optimal control of a continuous The underlying theoretical time system with jump Markov disturbances, with an infinite horizon disFirst, proWe use two complementary approaches. counted cost criterion. ceeding informally, and using a combination of stochastic coupling, linear system. arguments, stable and unstable eigenspaces, renewal theory, parametric optimization etc., we arrive at a conjecture for the optimal policy. Then we address the previously ignored mathematical difficulties associated with differential equations with discontinuous right hand sides, singularity of the optimal control problem, smoothness and validity of the dynamic programming equation etc., to give a rigorous proof of optimality of the conjectured policy. It is hoped that both approaches will find uses in other such problems also. We obtain the complete solution and show that the optimal solution is simply characterized by a certain critical number, which we call the optimal If the current inventory level exceeds the optimal, one inventory levelo should not produce at all, if less, one should produce at the maximum rate, We while if exactly equal one should produce exactly enough to meet demand. also give a simple explicit formula for the optimal inventory level. The research of the second author has been supported in part by the N.S.F. under Grant ECS-8304435, U.S. ARO under contract DAAG-29-84-K0005, administered by the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology; while the research of the first author has been supported in part by IBM. Submitted to IEEE AC. OPTLMAL CONTROL OF PRODUCTION RATE LI A F.: TURS PRONE MANUFACTURING SYST-- I. R. .MLA P. R . K*MkR tL .D.~S l.I.T. . ant C-=-_- Scienc~=o-d Yaf Ui.oB.C. 3altimore, Maryl.and 21228 "Tr. ODUCTION There We consider a manufacturing system product-ng a single comodity. is a constant demand rate d for the c=oodd:y, and the goal- of the turing system is to try to meet this demand. anufac- The manufacturing system is however subject to occasional breakdowns and so there are two states, a "functional" state and a "breakdown" state, in which it can be. The transi- tions between these two states occur as a continuous time Markov chain,with q! the rate of transition from the functional to the breakdown state, and q 2 the rate of transition from the breakdown to the functional state. (Alternatively, the mean time hetween failures is q time is q 2i). and the mean repair When the manufacturing system is in the breakdown state it cannot produce the commodity; while if it is in the functional state it produce at any rate u upto a maximun can production rate r. We assume that r > d > 0. Let x(tI be the inventory of the commodty at time t, i.e., z(t) = Ctota. production upto time t) - (total demand upto time t). negative, which corresponds to a backlog. ventories incur a holding cost of c x(t) may be We suppose that positive in- per unit caoodity per unit time, while negative inventories incur a cost of c-, with c, > O, c > O. Our goal is to control the production rate u(t) at time t, so as to minimize the expected discounted cost, I+t c+ x (t) + c-x-(t))et 0 ( ) w-hre S M. e is :=max(x,O), :o x max (-z,O) y'> 0 is an problem that we acddrcess is t fuctionin is Ca:the op: the discount rate mhe nufa:-X-trig system 'ie Wse. -eztio -l n rate u as a unc:tion of tha This problem has been motivated by the pioneering work of inventory x? and Kimemia and Gershwin [2], where a more general problem is /imemia fi formulated (see Section XIII also). We obtain the complete answer to this question. u(t) xz*(t)), is given by a critical number z*o it 2 eg*r The optimal solution, The opti-al policy is, Wg:)< z* if d if x(t) t - 0 if X(t) (2) z* > z* Thus, whenever the manufacturing system is in the functional state, one should produce at the maxim=u rate r if the inventory x(t) is less than z*, one should produce exactly enough to z:eet demand if the inventort x(t) is exactly equal to z*, and one should not produce at all if the inventory x(t) exceeds z*. Hence the production rate should always be controlled so as to driVeo the inventory level as rapidly as possible towards. z*, and once there should maintain it at the level z*. For this reason we shall call z* the optimal inventory level.° We also obtain the following simple formula for the optimal inventory level. 2*0, t g I c where X _ + cf + s (y1+ q+dX) is the only negative eigenvalue of the matrix, d (3))d 4 q, q, ++7 + Y The movivation for the problem studied here is thaz it is a basic The optimal policy is trivial to probleto for manufacturing systems. compute and qualitatively simple to implement, and it is hoped that these two features will render it attractive enough for use as a guideline. From a theoretical viewpoint also, both our solution procedure and method of proof pp.os.ess.ss several interesting features. First note. that the system under consideration is the following. 5.i) i(t) - u(t) - d 5oii)2 {S(t); t > O} is a continuous time Markov chain with state. space {1,2} and generator [_q. 5iii). 2 The constraint on u(.t2 is, u(t) - O * £o,r] 5oivl The go-l is to inimi.ze E if s(t) - 2 if s(t) ~ 1 (c xz(t) + c-x-(t))e-Y : dt Here x(t) is the inventory at time t, u(t) is the production rate at time t, and s(t) - 1 or 2 depending on whether the manufacturing system is in the functional state or the breakdown state respectively. Thus we have a con- tinuous time system with jump Markov disturbances and an infinite horizon discounted cost criterion. For previous work on problems of this type, we refer the reader to Rishel 3]1 for the case of a finite horizon cost criterion; and Krassovskii and Lidskii [4] and Lidskii [5] for a case of. an infinite horizon problem. are two problematical issues. In dealing with these types of systems there The firs: set of problems arises because one encounters many mathe= ca! rt a dfic ul-_ias, se FPishl- [3], 'he= l studying the probl.em of ot:ima! Con- control for continuous time systems wieh jump MarkBov disturbances. i w(x). sider a feedback policy u Then the system (5.i) satisfies ir (x) - d (6) continuous functions. dis - we wish to consider are essenelly However, the types of functions ' Standard ezxistence and uniqueness conditions for R/shel [3] the solution of the differential. equation (6) are not satisfied. has considered one notion of a solution; are also other difficulties. Vi(x) :- mini we use another methd6. .. There. Let value of the cost (5.iV) when starting in the state s(O) = i, x(O) = x. Then, informally, we have the HamilEto-Jacobi-Bellman dynamic programming equation: fmi (u - d)V (x) q -q7v 1 [I' +j [ +v2 ( ) J r(-dVq2ZWL-qJ y~q( L EtOF It ib is not a priori clear that V i () ) + (7) is a differentiable function. It is not it turns out also clear that there exists an optimal control law. Moreover, that Vi(x) does vanish for some x, and for such a value of x, the left hand side of (7) is minimized by every u, and so the Hamilton-Jacobi-Bellman (HJB) dynamic programming equation does not prescribe the optimal u for such x. Thus, the optimal control problem is a singular one. I: is this collection of mathematical problems that we shall address rigorously in the second half of this paper. The second set of proboles is this. Even ignoring all the mathemati- ca- difficulties mentioned above, how do we actually :tbain the optimal .solution? Why should we suspect that the optimal number type? Why is. the critical number z* always nonnegative? we want to solve the HJR equation (7), conditions? policy is of the critical Given that what are the appropriate boundary After determining boundary conditions how does one determine an optimal choice for z*? It is this collection of issues dealing with the actual obtaining of the optimal policy that we address. in the first half of the paper. The two approaches complement each other and we hope that they wll also be useful in dealing with other problems of the sort considered here. The paper is organized as follows. technical questions In Sections II-VIII we ignore some and arrive at a conjecture for the optimal policy. Beginning with Section IX we address all the mathematical difficulties and rigorously_prove the optdimality of the conjectured policy. II. OPTIMALITY OF CRITICAL NUMBER POLICY Beginning with this section and continuing through Section VIII, we provide a sequence of informal arguments which will lead us to conjectures. about the optimal policy and the optimal cost function. In this section we give an argument to show that the optimal policy is characterized by a critical number. Let us assume the existence of'an optimal feedback policy u(t) =wt*(x(t)) and let Vi(x) denote the optimal cost when starting in the state (s(O) - i, x(O) = x). Fix {s(t,w); t > O}, a realization of the continuous time Markov chain, with s(O,w) = i. Now we consider two different initfi. and also a convex If combination x (0) conditions x 0 (O) and x (O) + CX(O) (lj)xo(O) 5 here 0 <( < 1_ * (-) is used, then the trajectories sarting with the initial condi- tions x 0 (0 ) and: (O) satisfy, 6,tW) - i (xk(t,w)) - d if s(tc,) 1 (8) s(t,W) = 2 i -d a xk(O,W) - Xk(O) for k 0,1. Also, we have where E . signifies that the expectation is taken over W. Suppose now that for the initial state x (0), a) (x 0 (te,S)) Note that such a control is t > (s;(t,); (10) + ctr(lg(t,)L) in fact. implementable because by observing one can in fact deduce what {xk(t,w), t > O} would have 01 been for k - 0,1. Such a control gives rise to the erajectory satis.fying, - (1-ctO C(o) X:a we use the concrol * * ~-_ ( 9 )'--(l (9) dt d(t,w))e + c =Ef(.cCtW) V((0))) W (xO(t,w)) + 7r* (x:(t,w)) d if s(t,W) if -d s(t,w) 1: = 2 (11) x (0,s) it = (1 -a)x0(0) + axl(0) iS easy to check from (8) and (11) that g(t,W) (1-)Xo(t,w) + axl(t,w) for every t > 0 From the convexity of the integrand in (5.iv) it follows that (c x+(t,w) + c x (tw).)e W~~~~ , t dt < (1-a)V (X (0)) + aVl (x (0)) (12) However, for the initial state x (O) the control (10) is not necessarily opcti-al., ard. so v 1 (x(O)) <_ W f(c x:(t,Ca ) + c : (, (13) )eY: d From (13) and (12) we deduce that V 1 (-) is a convex funcction. Assumin tna: VL(-) is continuously differentiable, we see that there is some z* for -1 > 0 ~~~~~~~~~~(14) for x > z* From the left-hand side of the HJB equation (7), which we suspect (V (O); .i - 1,2} satisfy, we see that u r minimizes (u-d)V l (x) - 0 minimizes (u-d)V1 ( x ) if x < z* if x > z* Hence, we suSpect that the optimal policy u(t) a (x) ' r if x < z* s0 if x > z* I r (x(t)) is of the form (15) for some critical number z*. What happens at x - z*? the form of (15) it Any uE [O,r] minimizes (u-d)V1 (z*), but from is clear that the inventory level is quickly driven back to z*'if it deviates from z*. Hence, due to this "chattering" phenomenon, we suspect that * (x) - d if x = z* because such a choice keeps the inventory level exactly at z*, once it reaches Z*o (15) and (16) show that the optimal policy is characterized by (16) the critical number Z*, which we have called the optimal inventory level. Y OF OPTT4L INVa-_TORY LEVEL NONENGATIVT IlIo In this section we show that the op-mai in ventory level is nonegative. Consider two policiesa t rg if x < *d if x = 0 W0 (X) (°) and r Z (.), where 0 (17) . > O O 0g and witaW) Lea r if x < z -=d if x ~ z 0 if x > z (18) z < 0 be some srictely negat:ive number. 0 Denote by V (x) and Vi(x) the z costs resulting from the policieis T (.0) and ITz() ing in the state (s(O) I minim i, x(0) = z). is optimal, then f:rom (14) we see that wz3(-) at x = z, 4&(e) should attain a i.eo V 1 (z) < t1(x) for all x (19) (o) is optimal, we should also have Moreover, if Z(x) < V(x) In particular, < 9v v~gz)z(Z) = respectively, when start- for all x. from the above two inequalities we should have v(0) 1 (0) (20) show Hence, to show that 9z(,) with z < 0 is not optimal, it w}ll suffice to that (20) is not true. 10 Indeed, let {s(t,w); t > O} with s(0,c) sider the two t:ra j = 1 be a realization, and con- c:ries (:,m) ' r0 -d - -- d ifif 0 xO;(,=) (t,) x 1 0, s(;,~.O) s(t,W) -< 0, 1 otherwise 0 x2 (O,) - O and i:(t,w) x (O,) rr-d 1 if xZ(t,W) < Zs s(t,m) - 0 if xZ(t,) . -d otherwise - °, s(t,W) - 1 " z 0 which emanate from the initial states 0 and z, when the policies it () and %r(-)are respectively used. It is easy to verify that xO(t,W) + z - xz(t,w) < z < 0 for all t > 0 Henace, c x (t,c) + c - (t,,) - cx (tw) 0c fx (t,;) + + - C xr (t,) +z + cx ] 0- (t:,) + Cjz[ Hence ))et Y o-0(,xO+ (tw))e + ccx (tW.) + x0 (t, J~c dt < E (¢+ cx(t, -Z+ W) + CrxZ (t,d))e:tdt i.e. v°(o) < vz(z) showing that (20) is violated, and thus that tz(-) cannot be an optimal. policy. Hence z*, the optiml inventory level, has to be nonnegativeo IV-. I- ?TIEC-W!$SE LITN'iA. EQUIATIONS FOR. - Lt: z > 0 and denote by Vi(x) defined in (18). the COST FNiCTION -s:t function for the The analog of (7) for the policy TZ(o) is Vb:(x) := 6oaahg)}g T A2 Cx) .1d - b1 + bcx .AVZ(x) 2(2) 2 Be( +b2cxfor d for x < for x > z w ca ul o xx porlic¢ (.) 12 function corresponding to iz(-l, w.e need to determine appropriate boundary conditions for (23). V. 3BOURDARY CONDITIONS Since the vector VZ(x) is two-dimensional, we need two boundary con- ditions for (23). V.1 The First Boundary Condition Let {s(t,w); t > 0} be a realization. Under the policy rz(-), the inventory is given by the differential equation, ;(,tQ)' w 7; (X('tw)) - .d - d if - 1 if s(t,W) - 2 - In any case 1x(t,,) | < r S(te,) for all (t,w), and so ix(.t,S)I < !x(Q))l + rt for all (t,,). Hence, VY(X(O)) E[J(c +x (t,W) + c x (t,w))e < kljx(O) Hence, + k dts(Ow) for some constants k i, x(O,w) x(O)] and k2. we see that VZ(xl - O(IxI)' as x - + (24). Let us. see how we can make (24). more usable. Solving (231 for x < 0 in terms of V (0), we get v e vZo1) C for X < O (25) Now note the following easily verified fact. Al has one strictly positive eigenvalue, say X., and one strictly negative eigenvalue, say X. (26) 13 Let w {J (27) :- eigenvector of A1 corresponding to A+o ~ To satisfy (24) as x 51 c v (0) - , we clearly need (28) () Xx ) for otherwse, VZ(x) - O( generated by as x ( .+) is the eigenspace Here, Ho w+)}. (28) is one boundary condition for (23)> V o2 The Second Boundary Condition To obtain the second boundary condition, let us see what Vl(z) is. Consider a system starting in state (s(0) = !, x(O) stopping time, be the first time at which s(T+) for 0 < t < T when IZ(o) is used. V (z) IE' e Ytc zdt + e * 2 z) and let t, Clearly x(t a z Hence V2 (z)] Noting that T is exponentially distrihuted with mean qme by e@valuating the expectation in the above equation, we get Z ,1 q 1 Z (q1V2(z) +c +z) Z) (29) or, equivalently, [1,a]A v(z) (30) -d Z (30) is the second boundary condition for (23)o By using the two boundary conditions (28) and (30) one can solve the piecewise linear differential equations (23) function for any policy rZ(o) with z > 0o to obtain VZ()), the cost 14 The next question we have to face is; what is the optima, V1. OPTLMAL CHOICE OF z Suouose i) choice of z? z* (') V1 (z*) < with z* > 0 is optimal. (x) Then, for a1l x, since by (19), the optimal cost function attains a miaimm at z ii) V2} (x) < VtZ(x) for all x,z, since WIZ*() is optimal and therefore has lower cost than any other TrZ(-). From the above, we get Vzl(Z)_ V,1(Z) < 9v(z) ence VZ(z) attains a nmum for all z he 1 z*. Assuming now that l(z). is a C 1 function of z, we see that _W.(Z..] 0° (31) We will call (31) the optimality condition and in the next section we will see how it VII. can be exploited to give the optimal solution. OPTIMAL INVENTORY LEVEL We will now utilize the piecewise linear differential equations (23), the two boundary conditions (28) and (30), and the optimality condition (31) to obtain the optimal choice for z*. Differentiating (29) and using (31), we get dz z (z)j]z~* q1 However, by the chain rule, (32) 1Z-Z dz'2 2 z1W* * Sice VZ(x) coasidee d as a fuc:ic of o is ai/.;ze5 a: z *, by ass ing cont:nuous 4d3ferentiab/i-:=y, We have, z(z *)] MO0 Hence, jz 2 (0) dz( 1 (A1 L1z* a% 2 2' * 8[0, {2V where we have also usecd (2{)o 10°11£d. - z [ bC *w |(34) c+ Setting (34) te:porarily aside, we tu= to (28) Q(0)in terms of 9 (z), and substituting this in (28), Since £vZ(z) + A1 (w' .Solving (23) for we get V: (P ) - e A LA(zj) + e (33) q +1 dI AVZrz*) A (zV z~} From (32) and (33) we have g*(*) + b* fO, 11A1 Vz* (*) (zrz) + + b2 2 b1 c + A1 bc+1 - Al C we have Iaz + 2%.C4J - A 1 blJ is invariant under e , we have + C) (+ >' 16 + A-lbc V (z) Al - 2 -elAA' 1 Zb z + AAb 1 c + +c)E(w) (35) Combining (34) and (35), and noting that (w') is invariant under Al, we get -51J* + AZ + Alblc -Ct _ e I A7b (CC +C ) ( + ) (36) This equation now gives'us a "formula" to choose z*. Recall now from Section VI,' that in obtaining (36) we made the implicit Therefore we now have to determine when there will be assumption that z* > 0. a positive solution z* for (36). Let us first simplify (36) a bit. Note that -Y(Y +q + q2) det: AM ' + d(r -d) - Al "b Y(] (37) simplifies to Hence (36) U eAlZ [1]_ + c 1 C +c - (+C+ :-,,zj Y [ (+) E9(w t(38) 1 C )q Now let v' ~ [vl, v2] :- a left eigenvector of A1 corresponding to X (39) Since left and right eigenvectors corresponding to different eigenvalues are a 0, (38)is true, if and only if, orthogonal, i.e., v w v eZ* Aiz* Since v e + v e Pf+1l Jc j ' (c + c )ql X z* = e v , this reduces to v o X* v2 V @ - + +vy° 1 1v 2 _+ 69Cc Since vA " X v-, by equating the second components of both sides, 1 ~ v2kr -d)- Vl ;>%' 2 -- 2 q +, + 2 and now substituting for , v2 we get I +,.. * - ce + + cd q2d + ( i d r) -t ( + ( d (40) It is easy to check that q d - + q+ (y X d)(r -d) > 0 (41) and so the right hand side of°(40) is strictly positive. Taking logarithms. we therefore get °Z* ts g' ++ +A_ + qd y q% " (y + Xd ag ) -d) However, such a value of z* may not be positive, and we already know from Section III, that if z* > 0 is not optimal, then z* 0 is. Hence we arrive at our conjecture: z* ,. max{. ; 'X t0~og L i +· c Due to (41).it follows Note: (1 + ehat + q .. + q2 + qld -(y d -q -(y +q2 )(r + --d) .d (42) > 1* X_ d)(r -d) I Hence, for every c- > 0, there exists c* > 0 such that if c+ > c*, then z* O. Thus the optimal inventory level may be zero even though c This is somewhat counterintuitive and surprising. < + . OPTZAL COST FUNCTION VIII. In the previous section we have conjectured the optimal policy. In this section we shall conjecture the optimal cost function a-Lso. Let us consider separately the two cases where z* as given hy (42) is positive. is zero, and where it VIIol O0 z* Case 1: 'When z* - 0, the optimal cost function is V (-) which satisfies (.23) with z - 0, and has the boundary conditions (28) and (30) with z - 0o Defining w V(O) as in (28) says that (27), for some constant k kw' + A 1 b Then, from (30), we get 0 - [1r,o ]A1 v() - [1, 0]Jpb ki+ 1 J - kx -. fence and so k - Solving the differential equations (23) with the boundary condi;tion (43), we get yX _ + + -lhl for x < O A b c+ c ee(44) (~6). A2x Ve2. C- c w + + A2 A2 + b2cx+ - + x foLbc + for x > 0 22b2 19 V!I!.2 Case 2: V(Z*) and so, z* > 0 .. A.. L (45) Z*i A1qJ- solving the piecewise boundary condition (45), inear differential equations (23) with we get AA _ e-- } c( + +A7b + A7blc for x < 0 l m e e"l-l' C tb - A 2 b c+ for 0 < x < AeO {* aA 2 By simplification it *V() AitJ2=+ A 2 b tc h 2 c Z* -lcl_+z* 2 A 2 b2 c+ cx (46) 2 for x > z* can also he seen that - e - + AZ b } b 22 (47)f for x > z* Our conjectures for the optimal cost function in the two cases. z* 0 the previous sections we have arrived at the conjecture that if z* and z* > 0 are given by (44) ZX. and (46) respectively. SOLUTION OF HJB EQUATION In is as specified by (42), then r *(-) defined as in (2) is the optimal policy and V(o) defined as in (44) and (46), for the two cases z* = 0 and z* > 0, is the optimal cost function. Beginning with this section, we commence the rigorous proof of the validity of our conjectures. 2Q In this section we will show that V(-) satisfies the Hamilton-JacobiBellman dynamic progra-ing equation (7). LeLa 1 V(-) is continuously differentiable. Proof Consider first the case z* - 0 where V(-) is specified by (44). need to check the continuous differentiability at x - 0e iV Y(a+h) - V() h+Oh and similarly for V(a'). V(O-) :' w+ - Y Denote by V(a+), Now, from (44) ---{Y Y4+ We only Y {- 2 where we have used (27), (37) and [I 0A oJ1 (49) Hence V(.') is continuously differencia'ble whenever z* a 0. Now consider the case z* > 0, where V(-) is specified by (46). We only need to check the continuous differentiability at x - 0 and x V(O-) - (O+) - e a 1 [A'clc and so we proceed to consider x = as eca seen from (46) as can be seen from (46) and (47). - z*. Now, clearly, from (46), - A1{J] A 1b c (50) z*, for which, (51) 0 ~21 ~~~~L _~~~~~a2~~~~ Lea 2 Proof ' From (44), this is clearly true for the case z* Qo Considering z* > 0, We see flrom (46) that 2Ie b v(0) A1, we get isinvariant under e and so noting that (w ) - - :c rolc2 k b c -)q - Abl (c +Q )q 1 c +c ) le eigenvalues Noting that left and right eigenvectors corresponding to different are orthogonal, i.e. v F% s 0 only where v and w are given by (39) and (27), we . eed to verify that;, Az* Noting that v e = e * -v , by using (37) and simplifying, we see that we only have to show that f , J q1 H e (c[ H 0 But noting the equivalence of this and (40) and (42) when z* > 0, the assertion follows. Now. we are ready to consider the case z* thegJB dynamic programming equation (7)o = 0 and show. that V() satisfies 22 Lema 3 Suppose z* given by (42) is equal to 0, and V(-) is Cefined by (44). Then (r (X) - d)V1 (x) - +q ( -q1 (x -( 52.i) Y+ - q2 -d2(X ) q 2Vz)J + c x) 1 for all x 52.1i) d)VlCx) (.wZ*(x)- m Cin (u d)V (x) for all x uE [t,rJ Proof I:t is easily checked that V(-) defined by (44) satisfies (231 for = < O and x > O; and so (52.±). is valid for x < O and x > Now.con- sidering x - 0 and using (48), we have [(Z* [t(~ (OI - d ).v (.-) o c d while, by using (43), we have Y ql SI'a2 -2- + V(Q<) - -A2V(O Y Y+ -- + '21 AZW dA2 h e 2 _cYdfo + where we have also used (49) and (37). Now turning to (52.ii), since Tz*(-) Hence (52.i) is also valid at x = 0. satisfies (2), we only need to show that 23 VI(9)c for x < 0 < Thea, fr-m (44) it follows that Consider x < 0 first. cA+ 00 . -- +> for x < 0 e oe< where, by V1 ( 0 ) , we mean V 1 (0-). Since o (48) shows that Vi(Q) = 0, it Noe w turning to 2 > 0, from (43) and fol6ows that (53) holds for x < 0 (44) for x > 0 (53 ) V()W>O and we see that - aeco[A Vo.(x) · ^2" 2: 'J ]2 for x> 0 (49) and (37), we have From (43), 2 A2 V(0) - pcp - + 2P ~ 2w e 7-A 1 0 b-f C b;2-lJ Recalling (22) we thus have, c)J +.-A 2c 2 L _c A72 1 f'or x > O (54) are right eigenvectors of AI corresponding to the and q2 (55) y+ q +q () eigenvalues (: . ai- .. 2) respectively and [:j= -q+ ql q2 [f: +E ' ql +q 2 [3rsq2 (56) Moreover, equating the first components of both sides of the equation A W X+ + + and noting (4), we have 1 = y + q - X(r - d) (57) 24 Using (55), (56) and (57) in (54), we get il-· 1() 1 e d c C(Y + ql +q 2) d X] (Yf (q,+ q2 ) - d) > 0, then clearly V(X) If y -X(r I - e > O0 for x > O and (53) is valid. So suppose that y - X+(r - d) < 0 and note, by differentiating, that + V (x) X+ (r-)) .)die +( q c-(yy' c(y - X(r-d)) Y(q 1 +q 2 ) (y + q +q 2 ) d for x > 0 (y + ql + q 2 ) If p Hence, to show that for all x > 0 if and only if p + n > O. Vi ( x) > 0 > 0 + ne 0 and n < 0 are constants, then pe > 0 for allx> we only need to verify that c+ + d c(Y ¥ - + (r-d)) y + q1 + q 2 Y(q +q2) > . d -f d or equivalently, that .cy +.c (y It is +(r - d)) >o easy to check that (?+ + X)(r d) (y + ql) - (y + q)(rd so, substituting for X+, we only need to verify that I d q2 d ( + qL But this is in turn equivalent to, X (r - d)) + c y > 0 d and 25 C+-l~+> c + c q 1d - ( + q2 + A d)(r d) which is in fact true, since z* given by (42) satisfies z* - Oo Hence (53) is valid, proving (52.ii). Turning now to the case z* > 0 we show a similar result. Lema 4 Suppose z* given by (42] is strictly positive~, ad iV() is defined by (46)., then (52.i) and (52oii) are valid. Proof is easily checked that Ve). satisfies (23) for x < 0, 0 < x < z* It and x > z*, and so in all three of these cases (52 0 i) x is satisfied. At O, (50) and (46) again show that V(0) - A½V(O) and so (52oi) is also valid at x = 0. (Z* Turning now to x - z*, we uote from (51) that d-V( z (z*) q, -dl(Z*) 1i 'dVZ*) Also, Y + t* V .+ q = -d im {A2V(x) + bq cJV(z*) z d{A 2V(z*) + b2 CZ} + b 2 c} x d Lim V(a)W -dV(z*) where we have used the continuous differentiability of V(o). valid for all x. Now turning to (52.ii), since KTz in (2), we need to show that () Hence (52.i) is is of the form shown 26 (58) for x > z* V l(x) > 0 From (46) we obtain Consider 0 < x < z* first. z*l A (x v(x) and for x < z* Vl (x ) < 0 for 0 < z < z* fh c+ - e Now where, by V(O) and V(z*) we mean V(0+) and V(z*-) respectively. hb c % l .where by 8 :-= + Bd +c Y + q2 ) q + c d > O. (x - z*) > 0 and denoting Let t :=- ql the inverse Laplace transform, we have t-1 = ,O]e ,(x) eq .z[ lr[,o](sI + Al)-I q 1 e(e __________ - -xQ+ (s with strict-inequality for x- z*. (r d) e d)(X+ d) + - Moreover, from (51), V 1 (z*) so the validity of (58) for 0 < x < z* is established. ) = O, and By continuity of V.(x), we see now that V 1 (0) < O, thereby establishing (58) for x - 0 also. Now we consider x < O. V(x2 1 Ie4 V(O) - Since (23) is satisfied for x < O, we see that blc _] + lbl lA + 2 lc for x < 0 (59) Hence t(x) - Ale 1V (o) From Lemma 2, we know that V(O) V(x) = kX+e+ for x <0 A_ bl-c] + Alblj +x +x w + ALb Noting (27) and (37), we have -A2blc b kw for x < 0 for some constant k, and so 27 ) V (O)- k0Xe Lx) Now we consider x > 0 < x < Z*. z*) A (x *o Fprom (47) we have, + for x > z* s= P + *(g e q Y + := > - 0, and hy V(z*) we mean V(z*+) Now using (56) bdwde get (55) a + r (i V ()°- with strict of (8) for xXe O < 0 for x < 0 and s.O (58) is valid for z < 0 in addition to that Vl(X) where ) (0) < 0 as previously shown,it follows 0 for x < 0 and since Since Xax L is e- r d2) b ql ea inequality except at x - z*. > for x > z. Since V!(z*) - 0, the validity Iso established for x > Z*o Theorem 5 Let 60.i) 60 i). 60.iii) z* be deainefd as in C42) 7*b() be definaed as in (2) VCe) be defined as in (44) or (46) depending on whether z* z* > 0. 60.iv) V(-) is continuously differentiable 60.v). wt Z*() and V(.-) satisfy (52.i) and (52.ii) and hence the Hamilton-Jacobi-Bellman dynamic programming equation. 60.vi). V(r.) =0( I) as x ± Q0 or 28 Proo f We have already established (60.iv) and (60.v) in Leas 1, 3 and 4. To show (60.vi) note that (44) and (46), (26), (27) together with Lea Moreover, since both eigenvalues of show that V(x) - O(IxI) as x . - ". follows from (44) and (46) that V(x) - O(x) A2 are strictly negative, it as x X. 2 aLSo. .- + ADJMISSIBLE POLICIES It is now time for us to address more general issues. We begin by defining the class of admissible policies. Definition: A measurable function r : R - f[O,r] will be called an admissible R 2 with Z > 0, there exists a function y (t;T,r) policy if, for every.(r,) which satisfies (61.i) y (t;e,g) is absolutely continuous in t (61.ii) y1T(t;r,g) = (61.iii) y (t;T,~) is continuous in (t,Tr,). (61.iv). yC(-) t _- + J-(Ir(s; r i)) d)ds for t > t is the unique function satisfying (i and ii) above. Given such an admissible we now describe the manner in which we interpret the differential equation (5.i), Let {s(t,w); t > 0} be a realization of (5.ii) with, say, s(O,w) - 1 and suppose x 0 is the initial inventory level. Define tr 0 () :- inf{t > 0 : s(t,w) - z} and Ti + l () inf{t > Ti(W); s(t+, w) # s(t-, w)}. Then we construct the process {(x (t,w)} by, x (t,w) := y (t;O,x 0 for 0 < t < tO(W) := x (ri(), w) - d(t - ri(W)) for ri(<) < t < Ti+ l(W) and i = 0,2,4,... := Y (t;tri()., X (,i(w),W)) for T.i() < t < and i = 1,3,5,... i+ l(W) 29 Noce that an i{ediate consequence is (ar(sw) - d)ds +0 X (t,) (62) for all t > 0 where 0 if s (t,) 'rrSC~(t,9z)) if s(tw) u (t,c) " = 2 1X Thus,- the differential equation (5.i) is interpreted in integral form in (62) . One can use the theory of semigroups of nonlinear contractions in Banach spaces, see Barhu 16J., to obtain sufficient conditions for a policy W to be admissible. of t:he z () We now use this 'to establish the admissibility of policies type. Theorem 6 i (o) defined by (18) is admissible. Proof Let A, a multivalued operator, or equivalently a subset of R 2 b.e defined by A(x) :-'{r 0S-d, r :e {-%} Then x 1 < if xg< z d} d] if x - z H 9if > z and Y!E A(x1), Y2I A(x2) implies that (x - x 2)(y - y2) 0 and so A is a dissipative operator, see Definition 3.1, page 71 of Barb.u 16]. Moreover U AR U {x - y} - R A -dssipa, see , page 7 and so A is m-dissipative, see £6, page 71]. Ar e Also, for every x6 R, 3a and IZTZ(x (itZ(x) - d) A.(x) - d < fyI_ for every yEA(x) By Corollary 1.1 and Theorem 1.6 of [6, page 118] we see that (61.i, ii, iv) are satisfied. Moreover, hy Proposition 1.2 a function of t, for each g, is [6, page 110], y z(t;'r,) 2 )I <_. -z for all t > r, it 1 2 |Y z(t;r, Since by uniqueness y Z(t;,S) ) - y (t-t; Y IY z(t-rT 2 ; O·,l)l - Y (t=-T:; O,2) that y continuous function, and so (61.iii) is also satisfied. .XI. 1 follows from {y Z(t;:-l,,) - y Z(s;T2,i 2 ) < ,-'(t ; o )+ y( 1. 0,,) Z(t-r; 0,1) is a . INTEGRAL EQUATION FOR COST FUNCTION In this section we will show that the cost function corres.pondiag to a policy ar satisfies a certain integral equation. LetTil} be the successive jump times of (s(t)}. is If {x (t); t > O} the trajectory resulting from a policy T, define V' (9) : £C. C(x(t))e dt Is(O) x (0) -i, as the. cost of using T upto the n-th jump of {s(t)}. : c). Eere ca x + C x Clearly zaim (V ) is V, () the corresponding erxpected cost of using Define xa(<,9) :, y(t ;0O,) x(t,S) :- :d t- as a semigroup of nonlinear contractions, and so from Definition 1.1 of 16, page 98] we see that yrz(t;tZ and (63) it indefinitely. 3l ) represents the inventory level at time t if initially the Clearly x (t, inventorv level is e, s(Q) - i, and there are no jups of {s(t)} i .f0,t), By a renewal argument it follows. that V vi,1(T) ( - 0)-0 0 and rt VUf~ )i, rT T ia i ,g))dt + a Y:(x;( (64) (x ,))) (64 (x j(i), ,a I ve where j (i)E{fl,l2, j(i) P i l, be the Banach space of all measurable functions > O0, let { For fo~rt mappig a into R, with norm defined by eF fl letF: 0nf ,h2 ,,- and no te ( ,f2, TZ (,r 2 f x F {p :" sup{e {:'f(x)l, tfik ge4{ (19f21 aF i Ftor (flf2 2 _2 and max llt define T (f1, 2 : by ) ) fji) fq : ((t))dt i + (a( je )da (65) Lelma 7 i. ii) If (fl,f 2 )EYf 2 2) then T(flf E 2 is a contraction with respect to the noarm T j EI for every > 0 sufficiently smaall. Proof It suffices to show that Ti f for i f1,2o -( t<,O.i j (f) E jFand that Ti Note first thar by (62), I + k1 t is a contraction 32 In the above and what follows, all the kt's are constants chosen appropriately. Since c (x) < k2Ixl, fqie a t eY it follows that c(xi(t, ))dt]da < kI + k4 (y + q 1 ) Also, for 0 < E k < ~~.~~~~~" ence ±T El+ e < k f(l sufficiently small, i.e. Ti consider 0 <E <Y ~(T: f - T , I E f f f and so, T + k is. a contraction, To show that T. then (irXg)()< ±+d+r eiei<i> (TfJ±e ·- o I(4)qi | f ( a,)) T ) !g(xi (q.i+ Y)a E(I9, + ka)I t,, l l qe if 1i - T Il~:i,r hence <c < 1. eei ' <_ elt ~! <sl !e- !If - ~:i,,gl i~ . w,~=, o < B < ~.. ~-,',~- where 0 V0 := 0. for n = 0,1,2,... with'~~~~~T da l E +glI < 5eEll[lf From (64) it follows that if Vl d - : ,) 1 (V< V2 E then Vn+ dg T1 33 The oram 8 Let V. ,,.() ). x(O) denote the cost of using I ( Let (i))O V2 (Vl (M), :V Then integral equation. :V is thhe 660i) Vt for every 66.ii) IT (xT (F, ))jda + e-YV te-(x qecf7r,))dt q= ' 1s starting in the state (s(O) E R. R For every f E 2 (n)f , Zim T' V. Proof -(=)0ee Lim TI(T)O Fromt (63) and (64), we see that V n-fold ittrate of Tw TI a contraction, is s and 0 is the identically zero function, Lim Tn) f is It denotes the However, since a unique fixed point of T 7 , for every Htnce Vn is the unique solution, f E . (n) where RV in, F of VT - Cl is important to note that the boundary condition for the integral equation is really the condition that the solution be in . This is a condition on the asymptotic growth rate, and serves to differentiate the infinite horizon problem treated here from the finite horizon problem in [31. OPTeIALITY of fz* XIIo We are now ready to prove the optimality of the s.ugges.ted policy. We shall actually show that optimality is a consequence of it* and V(-) satisfying ,vvi) for the infinite time problem and so our proof is the HJB equation (60iv, quite general. Theorem 9 Let z* and z* > 0, define V(-) by (46). while if i) If rz*(.) be as. 'in (3) and (2). If z* = 0, define V(.) b.y (44), Then V (') represents the cost function corresponding to an admissible policy w, then V *(V) < V - ii (5) for i = 1,2; Z R and all admissible wT 34 E R for every ,(~) - v(M) ii) V iT Proof We will show that TV i i.e. > * g for every admissible i * V z* ( Since Tr is monotone, ) for every 5 E R and i z*(v) -> (67) (67) implies that T()V it > V Z* - 1,2. Taking the limit Z* V z*> that is (i) for i = 1,2. in n and using (66.ii), we obtain % goal is to show (67), along with. equality when rt = T (68) So our Considering (52.i,-ii) . we have (u - d)l(x) - (Y + ql)VCl(x) - q1 V 2 (x)mmn u E [O,r] -dV2 (x) (69) C( + q2 )V2(x) - c(x)(70) -q 2V1 (x) + ( For any iT therefore, (69) implies that c(x) > (y + ql)v - q (x) (r(x) (X) (71) - d)7(x) and so for any admissible T, 1. > (y + ql)V1 (x c(XI(ti)) 1 (t:,)) - qiV2 (x 1 (tOT) - 1 (i(x (ti)) 1 d)V (x Cit )) (72) Now noting that x (t,Q)is absolutely continuous in t, ( 1r(x(,t)) - d),and V 1'( ) with derivative is continuous, we can apply Corollary 7 of [7] showing that the chain rule is valid, and so obtain d V CX1 (t, -dv l,'(XIT V (x '(t (x (t') )) O ~ 'IT(,)- - d) a.e. (7 (73) 35 Hence, from (72) and (73), we have -| YC(X (t,%))dt > v2 x- (t,) + q)v( eY[(y t )d (74) fJeYt ~ Vl(t a for (,))d: > 0 Integrating the last term in (74) By parts, see Hewitt and Stromherg [9, po 287L. we have |e- C(:1(t,D)dt > |aeY q1tv (x-(it,)) - V (x (teg)))dt joqla for a > 0 e- v(x(,)) + vl( (=,O (e vl(~ I + fqe ? l 71 -(Y + q)i f) ~0z Jo - Jo: vlx~( + j)d: V xa(ago))daa ,)))dajdC ytq (V1 (x1(t,')) - V2 (xi(t 9 2) Xd (tS))de Hence ¥@ e :qe (Xx +e l(t,t))d VlgC(za)) i.e. (T~ v2)(9) > V (g) noting that from (60.iv), V(x) = O(x) and so V Ef 2 (70), similarly we deduce that (T T V > V. V2,l)(9) = V 2 (g), Domain (T )o Using Thus we have shown On the other hand since Tr*() attains equality in (71), we have 36 T V V. V < T V Thus V Z* iT XI%. T V r 12 with equality when r Ir . i CONCLUDING RMARKS There are two directions in which more work is needed. The first is to realize the full program for flexible manufacturing systems outlined in Kimemia and Gershwin £2J.parts on m machines.. Consider a flexible manufacturing sy.stem making p Part j requires. ai units of time on machine i.. Thus if a subset sk c {1,2,...,m} of machines is functioning, while the rest have failed, then a vector u - (u 1,u if and only if u E Uk where U k : if i £ sk}. ,up) of production rates is feasible {u : a 2 ,... iu.< if . E $k or Suppose now that each machine is subject to occasional failure (s(t); t > 0} be a Markov chain with state space {Sls2,... and let Given a demand rate vector d - (dl,d 2 ,...,dp), x ,Sm}S we have the problem u(t)-d s(tj); t > O} is a Markov chain with state space {s ..'. 1 5s } 2m u(t) E Uk if s(t) = sk Minimize E e-Ytc(x(t))dt where c(-) is of x(t), has some convex cost function. Due to the multidimensional nature this problem is much more difficult than the one solved here. proposed an approximation, £2] but the optimal solution needs more study. The other direction in which more research is needed is theoretical, and is the problem of optimal control of continuous time systems with jump Markov disturbances. As in Rishel (3], we also have proved optimality only with- in the class of Markov policies. For discrete time systems, see Blackwell [1o] for example, much more progress has been made on optimal control, and one usually considers a much more general class of policies within which optimality is proven. The question of existence of optimal controls also needs more study. Finally, more work needs to be done on the average cost problem for systems with jump Markov disturbances, see Tsitsiklis [8]. The author is grateful to S. Gershwin for introducing him to the problem studied here, to S. Mitter for his friendly hospitality during his visit to M.I.T., and to To Seidman for'several useful discussions, REFEENCES 1. J.o G. Control of Production in Fle.xible Manufacturing imemia, "iserarchial Systems," Ph.Do Thesis, L.Io.DS.-TH-1215, M.I.To, 2, Cambridge, September 19382. J, G0 Kimemia and S. B. Gershwin, "An Algcorithm for the Computer Control of Production in Flexible Manufacturing Systems," IEE Transactions, vol. 15, pp. 353-362, December 1983. 3, I. Rishel, "Dynamic Programming and Minimtia Principles for Systems With Jump Markov Distrubances," SIAM Journal on Cont:rol, vol. 13, pp. 338-371, February 1975 . 4. rassovskii and E. A. Lidskii, "Analytical Deisgn of Controllers in Sto- No.N. castic Systems With Velocity Limited Controlling Action,"' Applied Mathematics and Mechanics, vol 25, pp. 627*-643, 1961, 3. E. A. Lidskii, "Opeimal Control of Systems with Rand'om Properties," Applied Mathematics and Mechanics, vol. 27, pp. 33-45, 1963. 6. V. Barbu, Nonlinear Semigrouus and Differential Equations in Noordhoff Interna!ltional ?ublishing, Leyden, The Netherlands, 7. J. Serrin and D. E. Varberg, Banach Soaces, 1976. "A General Chain Rule for Derivatives and the Change of Variables Formula for the Lebesgue Integral," The American Mathematical Monthly, vol. 8, 76, J. N. Tsitsiklis, "Convexity and Characterization of Optimal Policies in E. Hewitt and K. Stromberg, "Real and Abstract Analysis," Springer-Verlag, New York, 1965, 10. D. Blackwell, "Discounted Dynamic Programming," Annals of Mathematical Statistics, vol. 36, pp. 226-235, 1965. ~ .."^~~~~~~-~-"" ~ ~ " -'I"`"^~' a LI.DS,°-R-i1178, M.I.T., Cambridge, February 1982, Dynamic Routing Problem,'" 9. pp. 514-520, May 1969. ~1 . -.···i-; ·· - -- ··· -,.··- ·----- ·- ··- ···-- ··- - ---- ·- ·--