WORKING PAPER
ALFRED P. SLOAN SCHOOL OF MANAGEMENT

OPTIMAL CONTROL OF A TWO-STATION BROWNIAN NETWORK

Lawrence M. Wein
Sloan School of Management
Massachusetts Institute of Technology
Cambridge, MA 02139

#2015-88

MASSACHUSETTS INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASSACHUSETTS 02139

Abstract

Harrison [5] has introduced a model called a Brownian network, which approximates a multiclass queueing network with dynamic scheduling capability. In the present paper, a particular Brownian network control problem is considered that approximates a mathematically intractable scheduling problem for a two-station multiclass queueing network. A reformulation of the Brownian network control problem is solved. Linear programming is used to reduce the reformulated problem to a singular control problem for a one-dimensional Brownian motion. The objective of the singular control problem is to minimize the long-run expected average holding cost of the controlled process incurred per unit of time, subject to constraints on the long-run expected average amount of control exerted.

Key words. Brownian motion, stochastic control, singular control, queueing networks.

December 1987

1. Introduction

This paper is part of a research effort to develop effective scheduling rules for multiclass queueing networks. Harrison [5] has introduced a type of crude but relatively tractable stochastic system model called a Brownian network. A Brownian network approximates a multiclass queueing network with dynamic scheduling capability if the total load imposed on each station in the queueing network is approximately equal to each station's capacity. Hence, we can approximate a dynamic scheduling problem for a queueing network by a dynamic control problem for a Brownian network.

We consider a control problem for a Brownian network that approximates a particular scheduling problem for a two-station multiclass queueing network. The queueing network scheduling problem, which is described in the next paragraph, has both input control and priority sequencing decisions, and appears to be mathematically intractable. In this paper, the approximating Brownian network control problem, which appears in equations (1.1)-(1.7), is reformulated, and the new formulation is solved exactly. In Wein [10], the solution found in the present paper was interpreted in terms of the original queueing system. This interpretation yielded an effective scheduling rule for the queueing network scheduling problem. A simulation study in Wein [10] showed that the scheduling rule derived from this analysis outperforms conventional input control and priority sequencing rules that appear in the literature.

The queueing network scheduling problem is motivated by a scheduling problem that is encountered in many factories. Consider a queueing network with two single-server stations and $K$ different customer classes. Customers of class $k = 1, \ldots, K$ require service at a specific station $s(k)$, and their service times are independent and identically distributed random variables with finite mean $m_k$ and variance $s_k^2$. Upon completion of service, a class $k$ customer turns into a class $j$ customer with probability $P_{kj}$ and exits the system with probability $1 - \sum_{j=1}^{K} P_{kj}$, independent of all previous history. We assume that the $K \times K$ Markovian switching matrix $P = (P_{kj})$ has spectral radius less than one, so that all customers will eventually exit the system. Because the number of classes is arbitrary, this routing structure is almost perfectly general.

The scheduling problem incorporates input and sequencing decisions. We assume there is an endless line of customers who are waiting to gain entry into the network. Each customer in the line has an exogenously specified class designation. These class designations are such that, over the long run, the proportion of class $k$ customers released into the system is $q_k$, where $\sum_{k=1}^{K} q_k = 1$. The vector $q = (q_k)$ will be referred to as the entering class mix. The input decisions are to choose a non-decreasing process $N = \{N(t), t \ge 0\}$, where $N(t)$ is the cumulative number of customers injected into the system up to time $t$. Thus the input decisions essentially allow full discretion over the timing of the release of customers into the system, but do not allow for the choice of which class of customer to inject.

The sequencing decisions consist of choosing, at each point in time, which class of customer to process at each server in the network. Preemptive resume scheduling is allowed, so that service of a customer at a particular station may be interrupted when a higher priority customer arrives at that station. Due to the rather crude nature of the Brownian approximation that is employed here, the assumptions made regarding preemption do not have an effect on the scheduling policy that emerges from the analysis.

It is assumed that a holding cost $c_k$ is incurred for each unit of time that a class $k$ customer spends in the queueing network. Also, there is a specified lower bound $\bar\lambda$ on the long-run average expected throughput rate of the queueing network. The throughput rate of a queueing system is the number of customer departures from the system per unit of time. Our queueing network scheduling problem is to choose the input and sequencing decisions so as to minimize the long-run average expected holding costs incurred per unit of time, subject to a lower bound constraint on the long-run average expected throughput rate.

Since this queueing network scheduling problem appears to be mathematically intractable, the best hope for further progress appears to be in the analysis of cruder, more tractable models, such as the Brownian network model. Under conditions of balanced heavy loading, a Brownian network approximates a multiclass queueing network with dynamic scheduling capability. To state these conditions more precisely, we define the two-vector $\rho = (\rho_i)$ to be the relative server utilizations, or traffic intensities, for the two stations. The values of $\rho_1$ and $\rho_2$ can be computed from the switching matrix $P$, the vector $m = (m_k)$ of expected processing times, the entering class mix $q = (q_k)$, and the specified average throughput rate $\bar\lambda$, as will be shown later in the introduction. The balanced heavy loading conditions assume the existence of a large integer $n$ such that $0 \le \sqrt{n}\,(1 - \rho_i) \le 1$ for $i = 1, 2$. As a canonical example, one may think of $\rho_1 = \rho_2 = .9$, in which case $n = 100$ satisfies this condition.

Under such conditions, the scheduling problem described above can be approximated by a dynamic control problem for a Brownian network. For a full development of the Brownian network model, see Harrison [5]. Only enough information will be given here to determine the data of the Brownian network in terms of the original queueing network parameters. Also, most of Harrison's notation will be retained for ease of reference.

The Brownian network control problem is stated in the next paragraph, but first we describe the set-up that will be adopted. When we say that $X$ is a $(\mu, \sigma^2)$ Brownian motion, it is assumed there is a given $(\Omega, F, F_t, X, P_x)$, where $(\Omega, F)$ is a measurable space, $X = X(\omega)$ is a measurable mapping of $\Omega$ into $C(R)$, which is the space of continuous functions on the real line $R$, $F_t = \sigma(X(s), s \le t)$ is the filtration generated by $X$, and $P_x$ is a family of probability measures on $\Omega$ such that the process $\{X(t), t \ge 0\}$ is a Brownian motion with drift $\mu$, variance $\sigma^2$, and initial state $x$. Let $E_x$ be the expectation operator associated with $P_x$. If $Y = \{Y(t), t \ge 0\}$ is a process such that $Y(t)$ is $F_t$-measurable for all $t \ge 0$, then we say that the process $Y$ is non-anticipating with respect to the Brownian motion $X$. More generally, when we say that one process $Y$ is non-anticipating with respect to another process $X$, we will mean that $Y$ is adapted to the coarsest filtration with respect to which $X$ is adapted. Also, a stochastic process is said to be RCLL if its sample paths are right continuous and have left limits with probability one.

It follows from Section 9 of Harrison [5] that under the balanced heavy loading assumptions described earlier, the queueing network scheduling problem described earlier can be approximated by the following limiting control problem: choose a pair of RCLL processes $Y$ and $\theta$ ($K$-dimensional and one-dimensional, respectively) to

$$\text{minimize} \quad \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T \sum_{k=1}^{K} c_k Z_k(t)\,dt\right] \tag{1.1}$$

subject to the constraints:

$Y$ and $\theta$ are non-anticipating with respect to $X$, (1.2)

$$Z(t) = X(t) + RY(t) - q\theta(t) \quad \text{for all } t \ge 0, \tag{1.3}$$

$$U(t) = AY(t) \quad \text{for all } t \ge 0, \tag{1.4}$$

$U$ is non-decreasing with $U(0) = 0$, (1.5)

$$Z(t) \ge 0 \quad \text{for all } t \ge 0, \text{ and} \tag{1.6}$$

$$\limsup_{T \to \infty} \frac{1}{T} E_x[U_i(T)] \le \gamma_i \quad \text{for } i = 1, 2. \tag{1.7}$$

The process $Z$ represents the $K$-dimensional scaled queue length process and describes the state of the system. That is, if $Q$ is the $K$-dimensional queue length process for the original queueing system, then the scaled queue length process is defined by

$$Z(t) = \frac{Q(nt)}{\sqrt{n}}, \quad t \ge 0, \tag{1.8}$$

where $n$ is the large integer specified in the balanced heavy loading conditions. Notice that, as in Harrison [5], exactly the same notation that is used for the scaled processes is used in the approximating Brownian control problem. We retain this notation in order to emphasize the queueing network interpretation of the Brownian network model.

The $K$-dimensional process $Y$ represents the scaled centered allocation process, and the one-dimensional process $\theta$ represents the scaled centered input process. These two control processes correspond to the sequencing and input decisions, respectively, in the queueing network scheduling problem. The scaled process $Y = (Y_k)$ is defined by

$$Y_k(t) = \frac{\alpha_k n t - T_k(nt)}{\sqrt{n}}, \quad t \ge 0, \tag{1.9}$$

where $T_k(t)$ is the cumulative amount of time devoted to serving a class $k$ customer in the interval $[0, t]$ ($T_k$ is a decision variable in the original queueing network scheduling problem), and $\alpha_k$, which is defined in equation (1.17), represents the long-run fraction of the server's active time at station $s(k)$ that must be devoted to class $k$ in order to maintain material balance in the network.

The scaled process $\theta$ is defined by

$$\theta(t) = \frac{\bar\lambda n t - N(nt)}{\sqrt{n}}, \quad t \ge 0, \tag{1.10}$$

where, as mentioned earlier, $\bar\lambda$ is the specified average throughput rate and $N(t)$ is the cumulative number of customers released into the network in $[0, t]$.

The two-dimensional process $U$ represents the scaled cumulative idleness process for the two stations. Thus $U$ is defined by

$$U(t) = \frac{I(nt)}{\sqrt{n}}, \quad t \ge 0, \tag{1.11}$$

where $I_i(t)$ is the cumulative amount of idleness incurred by the server at station $i$ in the time interval $[0, t]$. (For brevity's sake, processes such as $Z$, $Y$, $U$ and $\theta$ will often be referred to without the adjective "scaled".)

The $K \times K$ input-output matrix $R = (R_{kj})$ is defined by

$$R_{kj} = m_j^{-1}(\delta_{jk} - P_{jk}), \tag{1.12}$$

where $\delta_{jk}$ denotes the Kronecker delta, meaning that $\delta_{jk} = 1$ if $j = k$ and $\delta_{jk} = 0$ otherwise. The $2 \times K$ resource consumption matrix $A = (A_{ik})$ is defined by

$$A_{ik} = \begin{cases} 1, & \text{if } s(k) = i, \\ 0, & \text{otherwise.} \end{cases} \tag{1.13}$$

The $K$-dimensional process $X$ is a $(\delta, \Sigma)$ Brownian motion, but several definitions are needed before we can describe the $K$-dimensional drift vector $\delta = (\delta_k)$ and the $K \times K$ covariance matrix $\Sigma = (\Sigma_{jl})$. Let $\lambda = (\lambda_k)$ be defined by

$$\lambda_k = \bar\lambda q_k, \tag{1.14}$$

so that $\lambda_k$ represents the average number of class $k$ customers that must depart from the system per unit of time in order to satisfy the throughput rate constraint. Since $P$ was assumed to be transient, it follows that $R$ is non-singular and there exists a unique non-negative $K$-vector $\beta = (\beta_k)$ satisfying the flow balance equations

$$\lambda = R\beta. \tag{1.15}$$

Letting $C(i)$ be the set of customer classes $k$ such that $s(k) = i$, we define the two-vector of traffic intensities $\rho = (\rho_i)$ by

$$\rho_i = \sum_{k \in C(i)} \beta_k \quad \text{for } i = 1, 2. \tag{1.16}$$

We now define the $K$-vector $\alpha = (\alpha_k)$ by

$$\alpha_k = \frac{\beta_k}{\rho_i} \quad \text{for all } k \in C(i). \tag{1.17}$$

Then the drift $\delta$ and covariance $\Sigma$ of the Brownian motion $X$ are

$$\delta = \sqrt{n}\,(\lambda - R\alpha) \tag{1.18}$$

and

$$\Sigma_{jl} = \sum_{k=1}^{K} \left[\alpha_k m_k^{-1} P_{kj}(\delta_{jl} - P_{kl}) + \alpha_k m_k^{-1} s_k^2 R_{jk} R_{lk}\right]. \tag{1.19}$$

Inequality (1.7), which expresses the throughput rate constraint in terms of the cumulative server idleness process $U$, is the only relationship in the limiting control problem that does not appear in the Brownian network formulation of [5]. The two-vector $\gamma = (\gamma_i)$ in (1.7) is defined by

$$\gamma_i = \sqrt{n}\,(1 - \rho_i). \tag{1.20}$$

Inequality (1.7) states that the long-run average fraction of time that server $i$ is idle is less than or equal to $1 - \rho_i$. In Wein [10], it is shown that this inequality holds if and only if the long-run average throughput rate is greater than or equal to $\bar\lambda$.

The remainder of this paper is organized as follows. In Section 2, the Brownian network control problem (1.1)-(1.7) is reformulated so that the state of the system is described by a two-dimensional workload process, rather than a $K$-dimensional queue length process.
The new formulation will be referred to as the workload formulation, and it is the solution to the workload formulation that is the goal of this paper. The control processes in the workload formulation are $(Z, U, \theta)$, which are $K$-, two- and one-dimensional processes, respectively. In Section 3, linear programming is used to find the optimal process $Z$ in terms of the process $U$. In Section 4, the results of Section 3 are employed to reduce the remainder of the workload formulation to a problem involving singular (or instantaneous) control of a one-dimensional Brownian motion. That is, a $(\mu, \sigma^2)$ Brownian motion $B$ is observed by a controller who, at any time, may instantaneously increase or decrease the value of the process $B$. Holding costs are incurred continuously at a rate $h(W(t))$, where $W$ is the controlled process. No costs are incurred when the controller either increases or decreases the value of the Brownian motion process. However, there are constraints on the long-run expected average amount of control to be exerted in the positive direction and in the negative direction. The objective is to minimize the long-run expected average holding cost incurred per unit of time.

In Sections 5 through 9, we derive the solution to this constrained singular control problem. The solution is characterized by two constants $a$ and $b$, where $a < b$, and the optimal policy keeps the controlled process $W$ inside the interval $[a, b]$ with minimal effort. The corresponding optimal control functionals are the local times at these boundaries. In Section 5, a Lagrangian approach is used to move the two constraints into the objective function. The resulting unconstrained problem has been analyzed by Taksar [9], and his results are used in Section 6 to develop sufficient conditions for optimality of the constrained singular control problem. In Section 7, a candidate policy that satisfies the constraints is found, and a solution to the optimality equations, which includes a pair of Lagrange multipliers, is derived in Section 8. The special case where the drift $\mu$ of the Brownian motion $B$ is zero is treated in Section 9, and the solution to the workload formulation is summarized in Section 10.

2. The Workload Formulation

The state of the system in the Brownian network control problem is described by the $K$-dimensional queue length process $Z$, by way of the basic system relationship (1.3). In this section the Brownian network control problem is reformulated so that the state of the system is described by a two-dimensional workload process. In Harrison [5], it is shown that the matrix $R$ appearing in (1.3) is invertible. Let us define the matrix $M = (M_{ik})$ by

$$M = AR^{-1}. \tag{2.1}$$

This $2 \times K$ matrix will be referred to as the workload profile matrix. Define the two-dimensional workload process $W = (W_i)$ by

$$W(t) = MZ(t), \quad t \ge 0. \tag{2.2}$$

The workload process describes the state of the system in the workload formulation. Also, define the two-dimensional vector $v = (v_i)$ by

$$v = Mq. \tag{2.3}$$

It was shown in Wein [10] that

$$\rho_i = v_i \bar\lambda \quad \text{for } i = 1, 2, \tag{2.4}$$

where $\bar\lambda$ is a positive number that is a parameter of the original queueing network scheduling problem. Define the two-dimensional Brownian motion $B = (B_i)$ by

$$B(t) = MX(t), \quad t \ge 0, \tag{2.5}$$

so that $B$ has drift vector $M\delta$ and covariance matrix $M\Sigma M^T$. It was shown in Wein [10] that the two-dimensional drift $M\delta = -\gamma$, where $\gamma$ was defined in equation (1.20).

Define the workload formulation of the Brownian network control problem as choosing RCLL processes $Z$, $U$ and $\theta$ ($K$-, two- and one-dimensional, respectively) so as to

$$\text{minimize} \quad \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T \sum_{k=1}^{K} c_k Z_k(t)\,dt\right] \tag{2.6}$$

subject to the constraints:

$U$ and $\theta$ are non-anticipating with respect to $B$, (2.7)

$U$ is non-decreasing with $U(0) = 0$, (2.8)

$$Z(t) \ge 0 \quad \text{for all } t \ge 0, \tag{2.9}$$

$$\limsup_{T \to \infty} \frac{1}{T} E_x[U_i(T)] \le \gamma_i \quad \text{for } i = 1, 2, \text{ and} \tag{2.10}$$

$$MZ(t) = B(t) + U(t) - v\theta(t) \quad \text{for all } t \ge 0. \tag{2.11}$$

Let us call a pair of RCLL processes $(Y, \theta)$ a feasible policy for the Brownian network control problem if it satisfies equations (1.3)-(1.7), and call a triple of RCLL processes $(Z, U, \theta)$ a feasible policy for the workload formulation if it satisfies equations (2.8)-(2.11). The following proposition allows us to analyze the workload formulation of the limiting control problem, rather than studying problem (1.1)-(1.7) directly.

Proposition 2.1. Every feasible policy $(Y, \theta)$ for the Brownian network control problem yields a corresponding feasible policy $(Z, U, \theta)$ for the workload formulation, and every feasible policy $(Z, U, \theta)$ yields a corresponding feasible policy $(Y, \theta)$.

Proof. Let $(Y, \theta)$ be any feasible policy for the Brownian network control problem, thus yielding an associated triple $(Z, U, \theta)$ from equations (1.3)-(1.4). Premultiplying (1.3) by the workload profile matrix $M$ and using (1.4), (2.1), (2.3) and (2.5) gives equation (2.11). Since equations (2.8)-(2.10) are satisfied, the triple $(Z, U, \theta)$ is a feasible policy.

Reversing the argument, suppose that $(Z, U, \theta)$ is a feasible policy for the workload formulation. Then constraints (1.5)-(1.7) are satisfied and, using equation (1.3), one can define a corresponding control process $Y$ by

$$Y(t) = R^{-1}[Z(t) - X(t) + q\theta(t)] \quad \text{for all } t \ge 0. \tag{2.12}$$

The associated queue length and cumulative idleness processes in (1.3) and (1.4) are precisely the processes $Z$ and $U$, respectively, that we started with. Thus the pair $(Y, \theta)$ is a feasible policy. ∎

We now address conditions (1.2) and (2.7) on non-anticipativeness. Since the matrix $R$ is invertible, if the control process $Y$ is non-anticipating with respect to the Brownian motion $X$ in the limiting control problem, then the control process $U$ is non-anticipating with respect to the Brownian motion $B$ in the workload formulation. However, notice that in the limiting control problem, $\theta$ is non-anticipating with respect to $X$, whereas in the workload formulation, $\theta$ is non-anticipating with respect to $B$. By (2.5), it can be seen that more information will be available to the controller of $\theta$ in the limiting control problem than in the workload formulation. However, as will be seen in Section 4, the solution to the workload formulation remains unchanged whether $\theta$ is non-anticipating with respect to $X$ or with respect to $B$.

The rest of this paper is devoted to finding a solution to the workload formulation (2.6)-(2.11). In Wein [10], the solution to the workload formulation is interpreted in terms of the original queueing network in order to obtain effective scheduling rules for the queueing system.

3. Solving for Z in Terms of U

In this section the optimal process $Z$ in the workload formulation is expressed in terms of the control process $U$. Suppose we are given a process $U$ that satisfies (2.7), (2.8) and (2.10). Then the optimal $Z$ and $\theta$ processes are found by solving the following linear program at each time $t$:

$$\min \quad \sum_{k=1}^{K} c_k Z_k(t) \tag{3.1}$$

$$\text{subject to} \quad \sum_{k=1}^{K} M_{1k} Z_k(t) + v_1 \theta(t) = B_1(t) + U_1(t), \tag{3.2}$$

$$\sum_{k=1}^{K} M_{2k} Z_k(t) + v_2 \theta(t) = B_2(t) + U_2(t), \tag{3.3}$$

$$Z_k(t) \ge 0 \quad \text{for } k = 1, \ldots, K. \tag{3.4}$$

At each time $t$, this linear program may have a different set of right hand side values. Since it would be easier to analyze a linear program with a static constraint set, let us define the dual variables $\pi_1(t)$ and $\pi_2(t)$ and state the dual linear program to be solved at time $t$:

$$\max \quad [B_1(t) + U_1(t)]\,\pi_1(t) + [B_2(t) + U_2(t)]\,\pi_2(t) \tag{3.5}$$

$$\text{subject to} \quad M_{1k}\pi_1(t) + M_{2k}\pi_2(t) \le c_k \quad \text{for } k = 1, \ldots, K, \tag{3.6}$$

$$v_1 \pi_1(t) + v_2 \pi_2(t) = 0. \tag{3.7}$$

Before analyzing the dual LP (3.5)-(3.7), define the one-dimensional workload imbalance process $W$ by

$$W(t) = \rho_2 W_1(t) - \rho_1 W_2(t), \quad t \ge 0. \tag{3.8}$$

Here and below, $W$ without a subscript denotes this scalar process, while $W_1$ and $W_2$ denote the components of the two-dimensional workload process of (2.2). At each time $t$, $W(t)$ measures how imbalanced the workload is between the two stations. When $W_1(t)$ and $W_2(t)$ are in the same proportions as the long-run utilizations $\rho_1$ and $\rho_2$, respectively, then $W(t) = 0$. The process $W$ becomes positive (negative, respectively) when the workload in the system becomes imbalanced towards station 1 (station 2, respectively). By (2.2), (2.4), (2.11) and (3.8), the workload imbalance process can also be expressed as

$$W(t) = \rho_2 B_1(t) - \rho_1 B_2(t) + \rho_2 U_1(t) - \rho_1 U_2(t), \quad t \ge 0. \tag{3.9}$$

Returning to the dual LP (3.5)-(3.7), use (3.7) to eliminate $\pi_2(t)$ from the problem and use (2.4) to substitute the utilization levels $\rho$ for the vector $v$ to obtain the one-variable LP

$$\max \quad \frac{W(t)}{\rho_2}\,\pi_1(t) \tag{3.10}$$

$$\text{subject to} \quad c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k})\,\pi_1(t) \le \rho_2 \quad \text{for } k = 1, \ldots, K. \tag{3.11}$$

The dual LP (3.10)-(3.11) allows us to express the solution $Z(t)$ of LP (3.1)-(3.4) in terms of $W(t)$. Without loss of generality, assume that the classes $k = 1, \ldots, K$ are ordered so that

$$\arg\max_k \; c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k}) = 1 \tag{3.12}$$

and

$$\arg\min_k \; c_k^{-1}(\rho_2 M_{1k} - \rho_1 M_{2k}) = 2. \tag{3.13}$$

By complementary slackness, if $W(t) > 0$, then the solution $Z(t)$ of the LP (3.1)-(3.4) must satisfy $Z_k(t) = 0$ for all classes $k \ne 1$. Similarly, if $W(t) < 0$, the solution must satisfy $Z_k(t) = 0$ for all classes $k \ne 2$. Notice that the solution is $Z_k(t) = 0$ for $k = 1, \ldots, K$ when $W(t) = 0$.

When $W(t) > 0$, by (2.2) and (3.12),

$$W_2(t) = \frac{M_{21}}{M_{11}}\,W_1(t). \tag{3.14}$$

Combining this with the definition (3.8) of the workload imbalance process $W$, one obtains

$$W_i(t) = \frac{M_{i1}}{\rho_2 M_{11} - \rho_1 M_{21}}\,W(t) \quad \text{for } i = 1, 2. \tag{3.15}$$

Therefore, when $W(t) > 0$, the solution $Z(t)$ to the LP (3.1)-(3.4) is

$$Z_k(t) = \begin{cases} \dfrac{W(t)}{\rho_2 M_{11} - \rho_1 M_{21}}, & \text{if } k = 1, \\ 0, & \text{if } k \ne 1. \end{cases} \tag{3.16}$$

Similarly, when $W(t) < 0$, the solution is

$$Z_k(t) = \begin{cases} \dfrac{-W(t)}{\rho_1 M_{22} - \rho_2 M_{12}}, & \text{if } k = 2, \\ 0, & \text{if } k \ne 2. \end{cases} \tag{3.17}$$

By the balanced heavy loading conditions on the original queueing system stated in Wein [10], and by conventions (3.12)-(3.13), it follows that

$$\rho_2 M_{11} - \rho_1 M_{21} > 0 \tag{3.18}$$

and

$$\rho_1 M_{22} - \rho_2 M_{12} > 0, \tag{3.19}$$

and therefore that the solution $Z(t) \ge 0$ for all $t \ge 0$. The optimal queue length process $Z$ is constructed from (3.16)-(3.17) for all $t \ge 0$. Notice that this optimal process does not depend on the control process $\theta$ and depends on the control process $U$ only through the workload imbalance process $W$, as described in equation (3.9).

4. The Resulting Control Problem

In this section it is shown that by substituting for the optimal queue length process $Z$ using (3.16)-(3.17), the workload formulation (2.6)-(2.11) can be reduced to a problem of finding the optimal two-dimensional cumulative idleness process $U$. Define the one-dimensional Brownian motion $B$ by

$$B(t) = \rho_2 B_1(t) - \rho_1 B_2(t), \quad t \ge 0. \tag{4.1}$$

As with $W$, the unsubscripted $B$ denotes this scalar process from here on. Recalling that the two-dimensional Brownian motion has drift $-\gamma$, by equation (1.20) it follows that $B$ has drift $\mu$, where

$$\mu = \sqrt{n}\,(\rho_1 - \rho_2), \tag{4.2}$$

and has variance $\sigma^2 = \varrho^T M \Sigma M^T \varrho$, where

$$\varrho = \begin{bmatrix} \rho_2 \\ -\rho_1 \end{bmatrix}. \tag{4.3}$$

Notice that when the system is perfectly balanced (i.e., $\rho_1 = \rho_2$), then $\mu = 0$. This driftless case will be discussed separately in Section 9; until then, assume that $\mu$ is nonzero. Let us also define the processes $R$ and $L$ by

$$R(t) = \rho_2 U_1(t), \quad t \ge 0 \tag{4.4}$$

and

$$L(t) = \rho_1 U_2(t), \quad t \ge 0. \tag{4.5}$$

The letters $R$ and $L$ are used to denote cumulative amounts of rightward and leftward movement, respectively, exerted by the controller on the Brownian motion $B$. The constraints (2.10) determine an upper bound on the long-run average expected amounts of rightward and leftward controls exerted.
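The closed form (3.16)-(3.17), on which the reduction in this section rests, can be checked numerically on a small instance. The sketch below is illustrative only: the workload profile matrix `M`, the cost vector `c` and the utilizations `rho` are hypothetical numbers rather than data from this paper, and the function names are assumptions introduced here. Once $\theta(t)$ is eliminated from (3.2)-(3.3), the LP (3.1)-(3.4) has the single equality constraint $\sum_k (\rho_2 M_{1k} - \rho_1 M_{2k}) Z_k(t) = W(t)$, so its optimum is attained at a vertex with at most one positive coordinate; the brute-force routine enumerates those vertices. Instead of relabeling the classes as in (3.12)-(3.13), the code picks the maximizing and minimizing classes directly.

```python
from typing import List, Sequence


def imbalance_coefficients(M: Sequence[Sequence[float]],
                           rho: Sequence[float]) -> List[float]:
    """Return d_k = rho_2 * M_1k - rho_1 * M_2k for each class k."""
    rho1, rho2 = rho
    return [rho2 * m1k - rho1 * m2k for m1k, m2k in zip(M[0], M[1])]


def closed_form_z(w: float, M, c, rho) -> List[float]:
    """Queue lengths given by (3.16)-(3.17) for workload imbalance w."""
    d = imbalance_coefficients(M, rho)
    classes = range(len(c))
    z = [0.0] * len(c)
    if w > 0:
        # Class playing the role of "class 1" in (3.12).
        k = max(classes, key=lambda j: d[j] / c[j])
        z[k] = w / d[k]
    elif w < 0:
        # Class playing the role of "class 2" in (3.13).
        k = min(classes, key=lambda j: d[j] / c[j])
        z[k] = w / d[k]
    return z


def brute_force_cost(w: float, M, c, rho) -> float:
    """Minimize sum_k c_k z_k subject to sum_k d_k z_k = w and z >= 0
    by enumerating the vertices, each with one positive coordinate.
    Raises ValueError if no class can absorb an imbalance of this sign."""
    if w == 0:
        return 0.0
    d = imbalance_coefficients(M, rho)
    costs = [c[k] * w / d[k] for k in range(len(c))
             if d[k] != 0 and w / d[k] > 0]
    return min(costs)


# Hypothetical three-class instance (illustrative numbers only).
M = [[1.0, 0.5, 0.2],
     [0.3, 0.8, 1.0]]
c = [1.0, 1.0, 2.0]
rho = (0.9, 0.8)

for w in (-1.5, -0.2, 0.0, 0.4, 2.0):
    z = closed_form_z(w, M, c, rho)
    cost = sum(ck * zk for ck, zk in zip(c, z))
    print(f"W={w:+.1f}  Z={z}  cost={cost:.4f}  "
          f"brute force={brute_force_cost(w, M, c, rho):.4f}")
```

On this instance the two cost columns should agree for every value of $W$, with at most one class holding a positive queue at a time, which is the structure that (3.16)-(3.17) describe.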
Define a policy as a pair of non-decreasing RCLL processes $(R, L)$ such that $R$ and $L$ are non-anticipating with respect to $B$, $E_x[R(t)]$ and $E_x[L(t)]$ are finite for each $t \ge 0$ and each initial state $x$, and $R(0) = L(0) = 0$. Associate with the policy $(R, L)$ the controlled process $B(t) + R(t) - L(t)$ for all $t \ge 0$. From equations (2.2), (2.11), (4.1) and (4.4)-(4.5), it can be seen that the controlled process is precisely the workload imbalance process defined in (3.8); that is,

$$W(t) = B(t) + R(t) - L(t) \quad \text{for all } t \ge 0. \tag{4.6}$$

Thus, in the resulting control problem, the Brownian motion $B$ can be controlled by pushing it to the right (left, respectively), which increases (decreases, respectively) the workload imbalance of the system. Now define the positive coefficients $h_1$ and $h_2$ by

$$h_1 = \frac{c_2}{\rho_1 M_{22} - \rho_2 M_{12}} \tag{4.7}$$

and

$$h_2 = \frac{c_1}{\rho_2 M_{11} - \rho_1 M_{21}}. \tag{4.8}$$

If the optimal queue length process constructed from (3.16)-(3.17) is used, then one obtains

$$\sum_{k=1}^{K} c_k Z_k(t) = h(W(t)), \tag{4.9}$$

where

$$h(x) = \begin{cases} -h_1 x, & \text{if } x < 0, \\ h_2 x, & \text{if } x \ge 0. \end{cases} \tag{4.10}$$

Thus for each policy $(R, L)$, costs are incurred at a rate $h(W(t))$. The resulting control problem is to find a policy $(R, L)$ to

$$\text{minimize} \quad \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T h(W(t))\,dt\right] \tag{4.11}$$

$$\text{subject to} \quad \limsup_{T \to \infty} \frac{1}{T} E_x[R(T)] \le \frac{\rho_2(1 - \rho_1)\mu}{\rho_1 - \rho_2}, \tag{4.12}$$

$$\limsup_{T \to \infty} \frac{1}{T} E_x[L(T)] \le \frac{\rho_1(1 - \rho_2)\mu}{\rho_1 - \rho_2}. \tag{4.13}$$

If a solution $(R, L)$ to the constrained control problem (4.11)-(4.13) can be found, then the optimal $\theta$ process is, via equations (3.2) and (4.4),

$$\theta(t) = v_1^{-1}\left[B_1(t) + \frac{R(t)}{\rho_2} - \sum_{k=1}^{K} M_{1k} Z_k(t)\right] \quad \text{for all } t \ge 0, \tag{4.14}$$

where $Z(t)$ is defined in (3.16)-(3.17). Notice that the optimal $\theta$ process would be expressed by (4.14) whether $\theta$ was non-anticipating with respect to the two-dimensional Brownian motion $B$ or with respect to the $K$-dimensional Brownian motion $X$. The rest of this paper is devoted to finding a solution $(R, L)$ to problem (4.11)-(4.13).

5. The Lagrange Multiplier Method

Instead of analyzing the constrained control problem (4.11)-(4.13) directly, a Lagrangian approach will be used. In particular, let $r$ and $l$ be the Lagrange multipliers corresponding to constraints (4.12) and (4.13), respectively. The multipliers $r$ and $l$ are interpreted as the per unit costs of exerting the controls $R$ and $L$, respectively. Associate with a policy $(R, L)$ the Lagrangian cost function

$$k(x) = \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T h(W(t))\,dt + rR(T) + lL(T)\right]. \tag{5.1}$$

With the aid of the following theorem, the constrained problem (4.11)-(4.13) can be solved by making an appropriate choice of multipliers and then minimizing the Lagrangian cost function. This idea is not at all new in the context of control theory; readers are referred to Bellman [1] and Everett [3] for early applications to dynamic programming problems. The problem of finding a policy $(R, L)$ that minimizes $k(x)$ will be referred to as the Lagrangian problem.

Theorem 5.1. Suppose $r$ and $l$ are nonnegative real numbers and suppose $(R^*, L^*)$ is a solution to the Lagrangian problem. Furthermore, suppose

$$\limsup_{T \to \infty} \frac{1}{T} E_x[R^*(T)] = \frac{\rho_2(1 - \rho_1)\mu}{\rho_1 - \rho_2} \tag{5.2}$$

and

$$\limsup_{T \to \infty} \frac{1}{T} E_x[L^*(T)] = \frac{\rho_1(1 - \rho_2)\mu}{\rho_1 - \rho_2}. \tag{5.3}$$

Then $(R^*, L^*)$ is a solution to the constrained control problem (4.11)-(4.13).

Proof. Let $W^*$ be the controlled process under the optimal policy $(R^*, L^*)$, so that

$$W^*(t) = B(t) + R^*(t) - L^*(t) \quad \text{for all } t \ge 0. \tag{5.4}$$

Then for all policies $(R, L)$

$$\limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T h(W^*(t))\,dt + rR^*(T) + lL^*(T)\right] \le \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T h(W(t))\,dt + rR(T) + lL(T)\right]. \tag{5.5}$$

Rearranging terms yields

$$\limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T h(W^*(t))\,dt\right] \le \limsup_{T \to \infty} \frac{1}{T} E_x\left[\int_0^T h(W(t))\,dt\right] + r \limsup_{T \to \infty} \frac{1}{T} E_x[R(T) - R^*(T)] + l \limsup_{T \to \infty} \frac{1}{T} E_x[L(T) - L^*(T)]. \tag{5.6}$$

Since (5.6) holds for all policies, it certainly holds for the subset of policies such that

$$\limsup_{T \to \infty} \frac{1}{T} E_x[R(T)] \le \frac{\rho_2(1 - \rho_1)\mu}{\rho_1 - \rho_2} \tag{5.7}$$

and

$$\limsup_{T \to \infty} \frac{1}{T} E_x[L(T)] \le \frac{\rho_1(1 - \rho_2)\mu}{\rho_1 - \rho_2}. \tag{5.8}$$

However, by equations (5.2)-(5.3), the terms

$$r \limsup_{T \to \infty} \frac{1}{T} E_x[R(T) - R^*(T)] \tag{5.9}$$

and

$$l \limsup_{T \to \infty} \frac{1}{T} E_x[L(T) - L^*(T)] \tag{5.10}$$

are nonpositive for this subset of policies. Hence the policy $(R^*, L^*)$ is a solution to (4.11)-(4.13). ∎

Notice that, at this point, the Lagrange multipliers $r$ and $l$ are not known. In order to invoke Theorem 5.1, we need to find a pair of multipliers $(r, l)$ and a solution $(R, L)$ to the Lagrangian problem that simultaneously satisfy (5.2)-(5.3).

Given $r$ and $l$, the Lagrangian problem is a problem of singular (or instantaneous) control of Brownian motion. The name stems from the fact that the state of the controlled process can be instantaneously changed by the controller and, as a result, the optimal control processes $R$ and $L$ are continuous but singular (i.e., the set of time points at which $R$ and $L$ increase has measure zero). Such problems have been the subject of much study; see, for example, Harrison and Taksar [6], Karatzas [7], Shreve, Lehoczky and Gaver [8], and Taksar [9]. In particular, Taksar [9] has solved the Lagrangian problem defined by (5.1) for a convex holding cost function $h$ that is finite everywhere or on a finite interval. Since the holding cost function $h$ defined by (4.9)-(4.10) satisfies Taksar's requirement, his results will be used in the next section, where sufficient conditions for optimality are developed for the constrained problem (4.11)-(4.13).

6. The Optimality Equations

The goal of this section is to present sufficient conditions for an optimal solution to the constrained problem (4.11)-(4.13). First, the sufficient conditions for optimality will be stated for the Lagrangian problem with arbitrary nonnegative numbers $r$ and $l$. These latter conditions were derived in Taksar [9]. The sufficient conditions for optimality of problem (4.11)-(4.13) are found by combining Taksar's results with Theorem 5.1. Let us start by defining a class of policies that include the optimal policy.
Hcirrison and Taksar define a special class of policies brings the controlled process keeps it process [6] called control limit policies. This type of policy W within a certain interval within this interval while exerting a W under such a policy interval [a,b]. where in it is A is minimum amount the Brownian motion referred to as a regulated specifically, B instantaneously and then of control. The controlled reflected at the endpoints of the complete anedysis of this controlled process is contained in Harrison Brownian motion. The control functionals under a control limit policy are the local times of More [a, b] W at the points a the control limit policy {R,L) on the interval [a,b] R{t)= sup [a - B{s) + 0<a<« 18 L{s)]^ and is b, R [4], and L respectively. defined by (6.1) and = L{t) The existence and uniqueness sup [B{s) 0<J<t + R{s) - b]+. (6.2) of such functionals was estabhshed in Harrison and Tciksar [6]. We now present sufficient conditions for optimality for the Lagrangian problem (5.1). Suppose that an optimal solution to the Lagrangian problem fovmd and is corresponding value of the Lagrangian cost function k{x). Thus g cost per unit time (independent of initial state) and standard with problems using average cost criterion, the optimal policy when the when the incurred under the optimal policy V has a first The and second derivative and V is W is 0. It is x minus the cost will be assumed that [9]. a solution to Min {rV{x) + h{x)-g,r + V'{x)J-V'{x)} = V{0) and suppose there exists an interval [a, b] TV(x) + h{x) Suppose that for some constants A'l V'{x) — —r V'{x) = is 0, 1 for a < X for X < a, ior < 6, x>h. + N2h{x). the minimal average cost for the Lagrangian problem and the control Umit on the interval [a, b] is an optimal (6.4) (6.5) (6-6) (6.7) A^2 V{x) < Ni Then g = (6.3) such that -g = and is be referred to as the potential function. will Suppose {g,\'{x)) 6.1. 
W initial state of following proposition was proved in Taksar Proposition the minimal average V{x) be the cost incurred under the controlled process initial state of g be the be referred to as the gain. As will let is let policy. 19 (6.8) poUcy We axe now Theorem in a position, using 5.1 and Proposition 6.1, to state sufficient conditions for optimality of the constraSned problem (4.11)-(4.13). Theorem Suppose 6.2. Mm {TV{x) = V{0) R r, /, a, 6) satisfy + h{x)-g.r + V'{x),l-V'{x)} = is V'{x) = -r V'{x) = = g < for X for X I for a > < < X (6.12) (6.13) 6, = MIH^^ limsnp ^EAL{t)] = Mlz.^, pi J- smooth (6.8). - fit and poUcy on the that the function is and often imposed Benes, Shepp and Witsenhausen [2]. V interved G C"^ when (6.14) (6.15) P2 interval [a,b]. Then the optima] poUcy {R,L) the control hmit (6.11) i, a, lim sup I; E^[Rit)] The requirement (6.9) (6.10) and L are the control Umit pohcies on the (4.11)-(4.13) 0, 0, + hix)- and satisEes condition ple of V{x), TV{x) r— oo where {g, is Suppose to the constrained V G C"^ problem [a, 6]. sometimes called the heuristic princi- solving control problems; see, for example, In the next two sections a solution, which is denoted (g*,F*(x),r*,/*,a*,6*), to equations (6.9)-(6.15) will be found. 7. The A Candidate Policy first step towards a solution to equations (6.9)-(6.15) endpoints a* and b* . is to find candidate interval These values lead to a candidate pohcy {R*,L*) defined by R*{t)= sup [a* -B{s) + 0<3<t 20 L'{s)]-^ (7.1) and L'[t)= sup [B{s) •1 + + R'{s)-b'] (7.2) 0<s<t In order to find these endpoints, the following motion first be needed. See Chapter 5 of Harrison will define u Lemma Then [4] concerning a regulated Brownian for a derivation of this result. Let us by V (6.1)-(6.2), lemma 7.1. and thus Suppose B is a {ii.cr'^) W =B+R—L is 2fx = (7.3) Browman motion, R and L are as defined a regulated Brownian motion on the interval in [a, b]. 
$W$ has a truncated exponential steady state distribution with density function

$$p(x) = \frac{\nu e^{\nu(x-a)}}{e^{\nu(b-a)} - 1} \quad \text{for } a \le x \le b. \tag{7.4}$$

Furthermore,

$$\lim_{t\to\infty} \frac{1}{t} E_x[R(t)] = \frac{\mu}{e^{\nu(b-a)} - 1} \tag{7.5}$$

and

$$\lim_{t\to\infty} \frac{1}{t} E_x[L(t)] = \frac{\mu}{1 - e^{-\nu(b-a)}}. \tag{7.6}$$

The candidate interval endpoints will be derived by solving the following problem: among the class of control limit policies, find the policy $(R,L)$ to

$$\text{minimize} \quad \limsup_{T\to\infty} \frac{1}{T} E_x\Big[\int_0^T h(W(t))\,dt\Big] \tag{7.7}$$

$$\text{subject to} \quad \limsup_{T\to\infty} \frac{1}{T} E_x[R(T)] = \frac{\mu\rho_2(1-\rho_1)}{\rho_1-\rho_2}, \tag{7.8}$$

$$\limsup_{T\to\infty} \frac{1}{T} E_x[L(T)] = \frac{\mu\rho_1(1-\rho_2)}{\rho_1-\rho_2}. \tag{7.9}$$

That is, we restrict the policy $(R,L)$ to take the form defined in (6.1)-(6.2), and then find the values of $a$ and $b$ that solve problem (7.7)-(7.9). Notice that problem (7.7)-(7.9) is just the constrained problem (4.11)-(4.13) with the inequality constraints (4.12)-(4.13) replaced by conditions (6.14)-(6.15), which are equality constraints. The first step in solving problem (7.7)-(7.9) is to express the constraints directly in terms of the interval endpoints $a$ and $b$.

Lemma 7.2. A control limit policy $(R,L)$ on the interval $[a,b]$ satisfies constraints (7.8)-(7.9) if and only if

$$b - a = \nu^{-1} \ln\Big(\frac{\rho_1(1-\rho_2)}{\rho_2(1-\rho_1)}\Big). \tag{7.10}$$

Proof. The result follows by equating the right sides of equations (7.5) and (7.8) and solving for $(b - a)$. The result can also be proved by equating the right sides of equations (7.6) and (7.9). ∎

Since the cost function $h$ defined in (4.9)-(4.10) is convex and achieves its minimum at zero, the optimal control limit policy for problem (7.7)-(7.9) will have endpoints $a$ and $b$ that satisfy $a \le 0 \le b$. By Lemma 7.2, the search over values of $a$ and $b$ to solve problem (7.7)-(7.9) can be reduced to a search over values of $a$ such that

$$-\nu^{-1} \ln\Big(\frac{\rho_1(1-\rho_2)}{\rho_2(1-\rho_1)}\Big) \le a \le 0. \tag{7.11}$$

Proposition 7.3. The control limit policy with interval endpoints

$$a^* = \nu^{-1} \ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1) + h_2\rho_1(1-\rho_2)}\Big) \tag{7.12}$$

and

$$b^* = \nu^{-1} \ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1) + h_2\rho_1(1-\rho_2)}\Big) \tag{7.13}$$

solves problem (7.7)-(7.9).

Proof. By equation (7.4), the objective function (7.7) can be expressed as $\int_a^b h(x)p(x)\,dx$ for any control limit policy on the interval $[a,b]$.
By Lemma 7.2, the optimal value of the endpoint $a$ can be found by minimizing the function $f(a)$ over the interval (7.11), where

$$f(a) = \int_a^{a + \nu^{-1}\ln\left(\frac{\rho_1(1-\rho_2)}{\rho_2(1-\rho_1)}\right)} h(x)p(x)\,dx. \tag{7.14}$$

Using (7.10) to substitute for $(b - a)$ in (7.4) and using (4.10) to substitute for $h$ yield

$$f(a) = \frac{\nu\rho_2(1-\rho_1)}{\rho_1-\rho_2} \Big( \int_a^0 -h_1 x e^{\nu(x-a)}\,dx + \int_0^{a + \nu^{-1}\ln\left(\frac{\rho_1(1-\rho_2)}{\rho_2(1-\rho_1)}\right)} h_2 x e^{\nu(x-a)}\,dx \Big). \tag{7.15}$$

Integration by parts yields

$$f(a) = \big(\nu(\rho_1-\rho_2)\big)^{-1} \Big( \rho_2(1-\rho_1)(h_1+h_2)e^{-\nu a} - \big(h_1\rho_2(1-\rho_1) + h_2\rho_1(1-\rho_2)\big) + \nu a\big(h_1\rho_2(1-\rho_1) + h_2\rho_1(1-\rho_2)\big) + h_2\rho_1(1-\rho_2)\ln\Big(\frac{\rho_1(1-\rho_2)}{\rho_2(1-\rho_1)}\Big) \Big). \tag{7.16}$$

Setting $f'(a) = 0$, we obtain

$$-\nu\rho_2(1-\rho_1)(h_1+h_2)e^{-\nu a} + \nu\big(h_1\rho_2(1-\rho_1) + h_2\rho_1(1-\rho_2)\big) = 0. \tag{7.17}$$

Solving for the endpoint $a$ gives equation (7.12) and substituting this value in equation (7.10) yields (7.13). Taking the second derivative of $f$ with respect to $a$ yields

$$f''(a) = \frac{\nu\rho_2(1-\rho_1)(h_1+h_2)e^{-\nu a}}{\rho_1-\rho_2}, \tag{7.18}$$

which is strictly positive by definitions (4.2) and (7.3), thus completing the proof. ∎

Thus a candidate interval $[a^*, b^*]$ has been found such that the candidate policy $(R^*, L^*)$ defined by equations (7.1)-(7.2) satisfies conditions (6.14)-(6.15). Notice that, as expected, $a^* = 0$ in the limiting case $h_2 = 0$, and similarly, $b^* = 0$ when $h_1 = 0$. A candidate gain $g^*$ can now be derived by evaluating the Lagrangian cost function $k(x)$ defined in (5.1) under the candidate policy.

Corollary 7.4. Under the candidate policy $(R^*, L^*)$, the Lagrangian cost function $k(x)$ is

$$\frac{h_1\rho_2(1-\rho_1)\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) + h_2\rho_1(1-\rho_2)\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)}{\nu(\rho_1-\rho_2)} + \frac{\mu r\rho_2(1-\rho_1)}{\rho_1-\rho_2} + \frac{\mu l\rho_1(1-\rho_2)}{\rho_1-\rho_2}. \tag{7.19}$$

Proof. By (7.5)-(7.6) and (7.14), the value of the Lagrangian cost function (5.1) under the candidate policy $(R^*, L^*)$ is

$$f(a^*) + \frac{\mu r}{e^{\nu(b^*-a^*)} - 1} + \frac{\mu l}{1 - e^{-\nu(b^*-a^*)}}. \tag{7.20}$$

Substituting for $(b^* - a^*)$ and $a^*$ using equations (7.10) and (7.12), respectively, yields expression (7.19). ∎

8. Solution to the Optimality Equations

In this section a solution $(g^*, V^*(x), r^*, l^*, a^*, b^*)$ to equations (6.9)-(6.15) will be presented. In the previous section candidate endpoints $a^*$ and $b^*$ were derived and a candidate gain $g^*$ was found in terms of the Lagrange multipliers $r$ and $l$. We start this section by finding a pair of candidate multipliers $(r^*, l^*)$. By condition (6.12) and the assumption that $V \in C^2$, we set $V'(a^*) = -r$ and $V''(a^*) = 0$ in equation (6.11) to obtain, upon using (4.10) and (7.12) to substitute for $h$ and $a^*$ respectively,

$$g = -\mu r - h_1\nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{8.1}$$

Similarly, setting $V'(b^*) = l$ and $V''(b^*) = 0$ in equation (6.11) yields

$$g = \mu l + h_2\nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{8.2}$$

Equating the right sides of equations (8.1) and (8.2) gives

$$r + l = -\frac{h_1}{\mu\nu}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - \frac{h_2}{\mu\nu}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{8.3}$$

Lemma 8.1. Let $r + l$ be as in (8.3). Then $r + l > 0$.

Proof. By rearranging terms in (8.3), and noting that $\mu\nu = 2\mu^2/\sigma^2 > 0$, we see that $r + l > 0$ if and only if

$$-h_1\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - h_2\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) > 0 \tag{8.4}$$

or, equivalently,

$$\big(\rho_2(1-\rho_1)\big)^{\frac{h_1}{h_1+h_2}} \big(\rho_1(1-\rho_2)\big)^{\frac{h_2}{h_1+h_2}} < \frac{h_1}{h_1+h_2}\rho_2(1-\rho_1) + \frac{h_2}{h_1+h_2}\rho_1(1-\rho_2). \tag{8.5}$$

Since $\ln(x)$ is an increasing concave function, it follows that $r + l > 0$. ∎

It turns out that there is not a unique pair of multipliers $(r^*, l^*)$ that satisfy the optimality equations (6.9)-(6.15); any nonnegative pair that satisfy equation (8.3) will suffice. To this end, choose any $\lambda^* \in [0,1]$. The
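pair of candidate multipliers $(r^*, l^*)$ is defined by

$$r^* = \lambda^*\Big(-\frac{h_1}{\mu\nu}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - \frac{h_2}{\mu\nu}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)\Big) \tag{8.6}$$

and

$$l^* = (1-\lambda^*)\Big(-\frac{h_1}{\mu\nu}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - \frac{h_2}{\mu\nu}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)\Big). \tag{8.7}$$

With the candidate multipliers $(r^*, l^*)$ in hand, it is now possible to calculate the candidate gain $g^*$ directly in terms of the problem data. This can be done in three ways: substituting $r^*$ for $r$ in equation (8.1), substituting $l^*$ for $l$ in equation (8.2), or substituting both $r^*$ and $l^*$ in equation (7.19) of Corollary 7.4. Readers may verify that all three ways lead to the result that

$$g^* = (\lambda^* - 1)h_1\nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) + \lambda^* h_2\nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{8.8}$$

To complete the candidate solution of the optimality equations, a candidate potential function $V^*$ is required. With $g^*$, $a^*$ and $b^*$ now chosen, condition (6.11) becomes

$$\mu V^{*\prime}(x) + \tfrac{1}{2}\sigma^2 V^{*\prime\prime}(x) + h(x) - g^* = 0 \quad \text{for } a^* \le x \le b^*. \tag{8.9}$$

The solution to this second order differential equation is given in the following proposition.

Proposition 8.2. The solution $V^{*\prime}$ to (8.9) is

$$V^{*\prime}(x) = \frac{g^*}{\mu} + C^* e^{-\nu x} - \frac{\nu}{\mu}e^{-\nu x}\int_{a^*}^x h(y)e^{\nu y}\,dy \quad \text{for } a^* \le x \le b^*, \tag{8.10}$$

where

$$C^* = \frac{h_1}{\mu\nu}\,\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\,\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{8.11}$$

Proof. The solution $V^{*\prime}$ to (8.9) is

$$V^{*\prime}(x) = \frac{g}{\mu} + C e^{-\nu x} - \frac{\nu}{\mu}e^{-\nu x}\int_{a^*}^x h(y)e^{\nu y}\,dy \quad \text{for } a^* \le x \le b^*, \tag{8.12}$$

where $C$ is an unknown constant. To find $C$, we evaluate (8.12) at the point $x = a^*$, set $V^{*\prime}(a^*) = -r$ and substitute for $a^*$ using (7.12) to get

$$-r = \frac{g}{\mu} + C\,\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{(h_1+h_2)\rho_2(1-\rho_1)} \tag{8.13}$$

or, equivalently,

$$g = -\mu r - \mu C\,\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{(h_1+h_2)\rho_2(1-\rho_1)}. \tag{8.14}$$

Substituting $g^*$ and $r^*$ for $g$ and $r$ in (8.1) and (8.14), using (8.8) and (8.6) respectively, equating the right sides of equations (8.1) and (8.14), and solving for $C$ yields the candidate constant $C^*$ defined in (8.11). ∎

The following three lemmas verify that the boundary conditions $V^{*\prime}(b^*) = l^*$, $V^{*\prime\prime}(a^*) = 0$ and $V^{*\prime\prime}(b^*) = 0$, respectively, are satisfied.

Lemma 8.3. The function $V^{*\prime}$ defined in (8.10)-(8.11) satisfies $V^{*\prime}(b^*) = l^*$.

Proof. Evaluating $V^{*\prime}$ in (8.10) at $x = b^*$ and integrating by parts yield

$$V^{*\prime}(b^*) = \frac{g^*}{\mu} + C^* e^{-\nu b^*} - \frac{\nu}{\mu}e^{-\nu b^*}\Big(h_1\nu^{-2}\big(e^{\nu a^*}(\nu a^* - 1) + 1\big) + h_2\nu^{-2}\big(e^{\nu b^*}(\nu b^* - 1) + 1\big)\Big). \tag{8.15}$$

Substituting for $a^*$, $b^*$ and $C^*$ using (7.12)-(7.13) and (8.11), we obtain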
$$V^{*\prime}(b^*) = \frac{g^*}{\mu} + \frac{h_1}{\mu\nu}\,\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\,\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)\,\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{(h_1+h_2)\rho_1(1-\rho_2)} - \frac{1}{\mu\nu}\,\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{(h_1+h_2)\rho_1(1-\rho_2)}\Big[h_1\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big(\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - 1\Big) + 1\Big) + h_2\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big(\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - 1\Big) + 1\Big)\Big]. \tag{8.16}$$

Cancelling and rearranging terms yield

$$V^{*\prime}(b^*) = \frac{g^*}{\mu} - \frac{h_2}{\mu\nu}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) + \frac{h_1\rho_2(1-\rho_1)}{\mu\nu\rho_1(1-\rho_2)} + \frac{h_2}{\mu\nu} - \frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{\mu\nu\rho_1(1-\rho_2)}. \tag{8.17}$$

Since the last three terms on the right side of (8.17) sum to zero, using (8.7) and (8.8) we see that $V^{*\prime}(b^*) = l^*$. ∎

Lemma 8.4. $V^{*\prime}$ satisfies $V^{*\prime\prime}(a^*) = 0$.

Proof. Differentiating (8.10) yields

$$V^{*\prime\prime}(x) = -\nu C^* e^{-\nu x} + \frac{\nu^2}{\mu}e^{-\nu x}\int_{a^*}^x h(y)e^{\nu y}\,dy - \frac{\nu}{\mu}h(x) \quad \text{for } a^* \le x \le b^*. \tag{8.18}$$

Using (4.10) and (7.12) and evaluating equation (8.18) at $x = a^*$ give

$$V^{*\prime\prime}(a^*) = -\nu C^*\,\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{(h_1+h_2)\rho_2(1-\rho_1)} + \frac{h_1}{\mu}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big), \tag{8.19}$$

which equals zero by (8.11). ∎

Lemma 8.5. $V^{*\prime}$ satisfies $V^{*\prime\prime}(b^*) = 0$.

Proof. Evaluating (8.18) at $x = b^*$ and integrating by parts yield

$$V^{*\prime\prime}(b^*) = -\nu C^* e^{-\nu b^*} + \frac{\nu^2}{\mu}e^{-\nu b^*}\Big(h_1\nu^{-2}\big(e^{\nu a^*}(\nu a^* - 1) + 1\big) + h_2\nu^{-2}\big(e^{\nu b^*}(\nu b^* - 1) + 1\big)\Big) - \frac{\nu}{\mu}h_2 b^*. \tag{8.20}$$

Substituting for $a^*$, $b^*$ and $C^*$ using (7.12)-(7.13) and (8.11), respectively, it follows that
$$V^{*\prime\prime}(b^*) = -\frac{h_1}{\mu}\,\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\,\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)\,\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{(h_1+h_2)\rho_1(1-\rho_2)} + \frac{1}{\mu}\,\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{(h_1+h_2)\rho_1(1-\rho_2)}\Big[h_1\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big(\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - 1\Big) + 1\Big) + h_2\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big(\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - 1\Big) + 1\Big)\Big] - \frac{h_2}{\mu}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{8.21}$$

The $\ln(\cdot)$ terms cancel out and upon rearranging, we have

$$V^{*\prime\prime}(b^*) = \frac{1}{\mu}\Big(\frac{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}{\rho_1(1-\rho_2)} - \frac{h_1\rho_2(1-\rho_1)}{\rho_1(1-\rho_2)} - h_2\Big), \tag{8.22}$$

which equals zero. ∎

We now extend the function $V^{*\prime}$ derived in Proposition 8.2 beyond the interval $[a^*, b^*]$ in the obvious way suggested by conditions (6.12)-(6.13) and define the candidate potential function $V^*$ in terms of $V^{*\prime}$. Define $V^*$ by

$$V^*(x) = \begin{cases} \int_0^x V^{*\prime}(y)\,dy, & \text{if } x \ge 0, \\ -\int_x^0 V^{*\prime}(y)\,dy, & \text{if } x < 0, \end{cases} \tag{8.23}$$

where

$$V^{*\prime}(x) = \begin{cases} -r^*, & \text{if } x < a^*, \\ \dfrac{g^*}{\mu} + C^* e^{-\nu x} - \dfrac{\nu}{\mu}e^{-\nu x}\displaystyle\int_{a^*}^x h(y)e^{\nu y}\,dy, & \text{if } a^* \le x \le b^*, \\ l^*, & \text{if } x > b^*. \end{cases} \tag{8.24}$$

Notice that $V^* \in C^2$, as required, and that $V^*(0) = 0$.

The candidate solution $(g^*, V^*(x), r^*, l^*, a^*, b^*)$ is now completely specified, and equations (6.10)-(6.15) of the optimality equations have been verified. To complete the verification of the candidate solution, condition (6.9) needs to be verified. Since (6.11)-(6.13) have already been verified, it suffices to check that

$$\Gamma V^*(x) + h(x) - g^* \ge 0 \quad \text{for } x \notin [a^*, b^*] \tag{8.25}$$

and

$$-r^* \le V^{*\prime}(x) \le l^* \quad \text{for } a^* \le x \le b^*. \tag{8.26}$$

Lemma 8.6. Condition (8.25) holds.

Proof. To show (8.25), notice that for $x < a^*$,

$$\Gamma V^*(x) + h(x) - g^* = -\mu r^* - h_1 x - g^* > -\mu r^* - h_1 a^* - g^* = 0. \tag{8.27}$$
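Similarly, for $x > b^*$,

$$\Gamma V^*(x) + h(x) - g^* = \mu l^* + h_2 x - g^* > \mu l^* + h_2 b^* - g^* = 0. \tag{8.28}$$

∎

Lemma 8.7. Condition (8.26) holds.

Proof. Since $V^{*\prime}(a^*) = -r^*$, $V^{*\prime}(b^*) = l^*$, $V^{*\prime\prime}(a^*) = 0$ and $V^{*\prime\prime}(b^*) = 0$, to prove (8.26) it suffices to show that

$$V^{*\prime\prime}(x) \ge 0 \quad \text{for } a^* \le x \le b^*. \tag{8.29}$$

For $a^* \le x \le 0$, we have, by (8.18) and integration by parts,

$$V^{*\prime\prime}(x) = -\frac{h_1}{\mu}\,\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\,\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)e^{-\nu x} - \frac{h_1}{\mu}e^{-\nu x}\big(e^{\nu x}(\nu x - 1) - e^{\nu a^*}(\nu a^* - 1)\big) + \frac{\nu}{\mu}h_1 x. \tag{8.30}$$

Substituting for $a^*$ from (7.12) and cancelling and rearranging terms yield

$$V^{*\prime\prime}(x) = \frac{h_1}{\mu}\Big(1 - \frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}e^{-\nu x}\Big) \quad \text{for } a^* \le x \le 0. \tag{8.31}$$

Letting $x = 0$ in (8.31) gives

$$V^{*\prime\prime}(0) = \frac{h_1 h_2(\rho_1-\rho_2)}{\mu\big(h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)\big)}, \tag{8.32}$$

which is greater than zero by definition (4.2). Also, for $a^* < x < 0$,

$$V^{*\prime\prime\prime}(x) = \frac{2h_1(h_1+h_2)\rho_2(1-\rho_1)e^{-\nu x}}{\sigma^2\big(h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)\big)}, \tag{8.33}$$

which is also greater than zero. Since $V^{*\prime\prime}(a^*) = 0$, $V^{*\prime\prime}(0) > 0$ and $V^{*\prime\prime\prime}(x) > 0$ for $x \in (a^*, 0)$, it follows that $V^{*\prime\prime}(x) \ge 0$ for $a^* \le x \le 0$.

Similarly, for $0 \le x \le b^*$, we have by (8.18) and integration by parts,

$$V^{*\prime\prime}(x) = -\frac{h_1}{\mu}\,\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\,\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)e^{-\nu x} + \frac{h_1}{\mu}e^{-\nu x}\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big(\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) - 1\Big) + 1\Big) + \frac{h_2}{\mu}e^{-\nu x}\big(e^{\nu x}(\nu x - 1) + 1\big) - \frac{2h_2 x}{\sigma^2}, \tag{8.34}$$

which can be rewritten as

$$V^{*\prime\prime}(x) = \frac{h_2}{\mu}\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}e^{-\nu x} - 1\Big) \quad \text{for } 0 \le x \le b^*. \tag{8.35}$$

For $0 < x < b^*$, we can differentiate $V^{*\prime\prime}(x)$ to obtain

$$V^{*\prime\prime\prime}(x) = -\frac{2h_2(h_1+h_2)\rho_1(1-\rho_2)e^{-\nu x}}{\sigma^2\big(h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)\big)}, \tag{8.36}$$

which is less than zero. Since $V^{*\prime\prime}(0) > 0$, $V^{*\prime\prime}(b^*) = 0$ and $V^{*\prime\prime\prime}(x) < 0$ for $x \in (0, b^*)$, we have that $V^{*\prime\prime}(x) \ge 0$ for $0 \le x \le b^*$. Thus, (8.29) has been shown, which completes the proof. ∎

We are now in a position to state the solution to problem (4.11)-(4.13).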
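The boundary and sign conditions established in Lemmas 8.3 through 8.7 can also be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the holding cost $h(x) = -h_1 x$ for $x < 0$ and $h(x) = h_2 x$ for $x \ge 0$ (the form used in the proofs above), takes $\mu$ and $\sigma^2$ as free inputs, and uses illustrative parameter values that do not come from the paper. The function names are hypothetical.

```python
import math

def holding_cost(x, h1, h2):
    """Piecewise-linear holding cost: -h1*x for x < 0, h2*x for x >= 0."""
    return -h1 * x if x < 0 else h2 * x

def candidate_solution(h1, h2, rho1, rho2, mu, sigma2, lam):
    """Candidate (a*, b*, r*, l*, g*, C*) from (7.12)-(7.13), (8.6)-(8.8), (8.11)."""
    nu = 2.0 * mu / sigma2                                   # (7.3)
    denom = h1 * rho2 * (1 - rho1) + h2 * rho1 * (1 - rho2)
    log_a = math.log((h1 + h2) * rho2 * (1 - rho1) / denom)  # equals nu * a*
    log_b = math.log((h1 + h2) * rho1 * (1 - rho2) / denom)  # equals nu * b*
    total = -(h1 * log_a + h2 * log_b) / (mu * nu)           # r + l, from (8.3)
    return {
        "nu": nu,
        "a": log_a / nu,
        "b": log_b / nu,
        "r": lam * total,
        "l": (1 - lam) * total,
        "g": ((lam - 1) * h1 * log_a + lam * h2 * log_b) / nu,
        "C": h1 / (mu * nu) * math.exp(log_a) * log_a,
    }

def v_prime(x, s, h1, h2, mu):
    """V*'(x) on [a*, b*] from (8.10), with the integral evaluated by parts."""
    nu, a = s["nu"], s["a"]

    def part(y):
        # nu**2 times an antiderivative of y * exp(nu * y)
        return math.exp(nu * y) * (nu * y - 1.0)

    if x <= 0:
        integral = -h1 * (part(x) - part(a)) / nu**2
    else:
        integral = (-h1 * (part(0.0) - part(a)) + h2 * (part(x) - part(0.0))) / nu**2
    return s["g"] / mu + s["C"] * math.exp(-nu * x) - (nu / mu) * math.exp(-nu * x) * integral

def v_second(x, s, h1, h2, mu, sigma2):
    """V*''(x) on [a*, b*], solved from the differential equation (8.9)."""
    return 2.0 / sigma2 * (s["g"] - holding_cost(x, h1, h2) - mu * v_prime(x, s, h1, h2, mu))

# Illustrative parameters only (assumed, not from the paper); mu has the sign of rho1 - rho2.
H1, H2, RHO1, RHO2, MU, SIGMA2, LAM = 1.0, 2.0, 0.9, 0.8, 0.1, 1.0, 0.5
sol = candidate_solution(H1, H2, RHO1, RHO2, MU, SIGMA2, LAM)
```

With these values, `v_prime` evaluated at `sol["a"]` and `sol["b"]` should reproduce `-sol["r"]` and `sol["l"]` up to rounding, and `v_second` should vanish at both endpoints and stay nonnegative in between, which is what condition (8.26) requires.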
Theorem 8.8. The solution to problem (4.11)-(4.13) is the control limit policy $(R^*, L^*)$ defined in (7.1)-(7.2) on the interval $[a^*, b^*]$, where $a^*$ and $b^*$ are defined in (7.12)-(7.13).

Proof. The results from Lemma 7.1 through Lemma 8.7 lead to a solution $(g^*, V^*(x), r^*, l^*, a^*, b^*)$ that satisfies equations (6.9)-(6.15) such that $V^* \in C^2$. By Theorem 6.2, the result holds if the function $V^*$ satisfies condition (6.8) that $V(x) \le N_1 + N_2 h(x)$ for some constants $N_1$ and $N_2$. This condition holds by setting

$$N_1 = \max\{V^*(a^*), V^*(b^*)\} \tag{8.37}$$

and

$$N_2 = \max\Big\{\frac{r^*}{h_1}, \frac{l^*}{h_2}\Big\}. \tag{8.38}$$

∎

As is usual in control problems dealing with the long-run average cost criterion, the optimal policy $(R^*, L^*)$ is not unique. In particular, any policy that brings the controlled Brownian motion process within the interval $[a^*, b^*]$ in a finite expected amount of time and then uses $a^*$ and $b^*$ as reflecting barriers will achieve the same average expected cost as the policy $(R^*, L^*)$.

9. The Driftless Case

In this section, the constrained problem (4.11)-(4.13) is solved under the conditions of perfect system balance, where $\rho_1 = \rho_2$. In this case, the Brownian motion $B$ defined in (4.1) has drift $\mu = 0$. Defining $\xi$ by

$$\xi = \sqrt{n}\,\rho_1(1-\rho_1), \tag{9.1}$$

it is seen that the right sides of constraints (4.12) and (4.13) both equal $\xi$. Taking $\xi$ as an essential system parameter, problem (4.11)-(4.13) will be solved in terms of it. The procedure used in Sections 4 through 8 is still valid in this case, but the computations are now substantially easier. In particular, the sufficient conditions for optimality are the same as in Theorem 6.2, except that conditions (6.14)-(6.15) are replaced by

$$\limsup_{T\to\infty} \frac{1}{T} E_x[R(T)] = \limsup_{T\to\infty} \frac{1}{T} E_x[L(T)] = \xi. \tag{9.2}$$

As in Section 7, the first step towards finding a solution to the optimality equations (6.9)-(6.13) and (9.2) is to find candidate interval endpoints $a^*$ and $b^*$. See Chapter 5 of Harrison [4] for a proof of the following result, which is the analogue of
See Chapter 5 of Lemma 7.1 for the Lemma (6.1)-(6.2), Then Suppose 9.1. and thus B is a (0,(7^) Brownian motion, W =B+R—L is T—'OO The candidate and L are as defined a regulated Brownian n^otion on the interval W has a uniform steady state distribution on limsup ^E4R{t)] R = [a, b] T—<-oo = ^ ^\0 ' interval endpoints are derived [a, b]. and limsup ^Er[L(t)] ' in (9.3) (^) by solving the following problem: sujiong the class of control limit policies, find the policy {R, L) to • minimize limsup iim T^oo subject to Lemma 9.2. i£,[/ J r h{W{t))dt] limsup —E^[R[t)] The control (9.4) Jo = limsup —E^[L{t)] = limit policy with interval endpoints * (9.5) Integrating gives (9.11) <' i Setting /'(a) = 0. we have 2 J 2/iia + 2/i2a+ ^— = (9.12) Solving for a yields a* in (9.6) and using (9.8) to solve for 6 yields b* in (9.7). These are the cajididate interval endpoints, since /"(a) which is upon 4(2/^1 +2/12), cancelling and rearrcinging terms, /(a*) a'^h^h: = the candidate gain g* /l2)' + ri and b* is (T^hih2 Let X* e 9.3. = '* (9.14) + 4^/11 Theorem (9.13) | positive. Since, = and [0, 1] let a* + (9.15) li. be deRned as in (9.6)-(9.T). Let A* h, r = (i-y) ^''^ hi hih2 hi (9.16) + h2Ae' + "' h2 4^2 (9.17) ' cr^ (9.18) +h2 2C and ifx>0, J^'V*'iy)dy, = V'ix) (9.19) ifx<0, -j'V*'{y)dy, where ' V*'{x) = — r*, if /*, ifx>b*, { '-^ - ^ i:. hiy)dy + C; , X < a*, (9.20) ifa'<x<b\ and , c; (7'(-yhih2 + i2-y)hihi) HHh,+h2)^ 34 (9.21) Then {g% V*{x).r\l\a\b*) and and V e C^ in the driftless case is the (9.2). Furthermore, satisfies condition (6.8). Thus, by Theorem control limit pohcy 6.2, the solution to that lead to will Theorem problem (4.11)-(4.13) I") defined by (7.1)-(7.2) on the interval (i?*, The proof are defined in (9.6)-(9.7). and solves equations (6.9)-(6. 13) Theorem of [a*, 6*], 9.3 parallels the where a* and arguments b* in Section 8 However, the computations in this case are substantially simpler 8.8. be omitted. 
This section is concluded by showing that, as the drift $\mu \to 0$, the endpoints derived in Section 8 for the nonzero drift case converge to the endpoints derived in the driftless case.

Proposition 9.4.

$$\lim_{\mu\to 0} \nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) = -\frac{h_2}{h_1+h_2}\,\frac{\sigma^2}{2\xi} \tag{9.22}$$

and

$$\lim_{\mu\to 0} \nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) = \frac{h_1}{h_1+h_2}\,\frac{\sigma^2}{2\xi}. \tag{9.23}$$

Proof. Since $\mu = \sqrt{n}(\rho_1 - \rho_2)$, it follows that

$$\lim_{\mu\to 0} \frac{\sigma^2}{2\mu}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) = \lim_{\rho_2\to\rho_1} \frac{\sigma^2}{2\sqrt{n}(\rho_1-\rho_2)}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{9.24}$$

By l'Hôpital's rule, one obtains

$$\lim_{\rho_2\to\rho_1} \frac{\sigma^2}{2\sqrt{n}(\rho_1-\rho_2)}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) = \lim_{\rho_2\to\rho_1} \frac{\dfrac{\partial}{\partial\rho_2}\,\sigma^2\ln\Big(\dfrac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big)}{\dfrac{\partial}{\partial\rho_2}\,2\sqrt{n}(\rho_1-\rho_2)} \tag{9.25}$$

$$= \lim_{\rho_2\to\rho_1} \frac{-\sigma^2\rho_1 h_2}{2\sqrt{n}\,\rho_2\big(h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)\big)} \tag{9.26}$$

$$= \frac{-\sigma^2 h_2}{2\sqrt{n}\,(h_1+h_2)\rho_1(1-\rho_1)}. \tag{9.27}$$

The definition of $\xi$ in (9.1) completes the proof for the endpoint $a^*$. The proof for $b^*$ is similar and is omitted. ∎

10. Solution to the Workload Formulation: Summary

The purpose of this paper was to obtain a solution $(U, Z, \theta)$ to the workload formulation (2.6)-(2.11) of the limiting control problem. This is a self-contained section that summarizes the solution. The parameters $\rho_i$, $M_{ik}$, $n$, $\sigma^2$, $h_i$, $\nu$ and $\xi$ appearing in this section are all defined in terms of the primitive problem data. Their definitions can be found in equations (1.8), (2.1), (2.3), (4.3), (4.7)-(4.8), (7.3) and (9.1), respectively.

In the workload formulation, the controller observes a two-dimensional Brownian motion process $\mathbf{B} = (B_1, B_2)$, from which can be observed the one-dimensional Brownian motion process $B$ defined by

$$B(t) = \rho_2 B_1(t) - \rho_1 B_2(t), \quad t \ge 0. \tag{10.1}$$

If $\rho_1 \ne \rho_2$, then define the interval endpoints $a$ and $b$ by

$$a = \nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_2(1-\rho_1)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big) \tag{10.2}$$

and

$$b = \nu^{-1}\ln\Big(\frac{(h_1+h_2)\rho_1(1-\rho_2)}{h_1\rho_2(1-\rho_1)+h_2\rho_1(1-\rho_2)}\Big). \tag{10.3}$$

If $\rho_1 = \rho_2$, then let
$$a = -\frac{h_2}{h_1+h_2}\,\frac{\sigma^2}{2\xi} \tag{10.4}$$

and

$$b = \frac{h_1}{h_1+h_2}\,\frac{\sigma^2}{2\xi}. \tag{10.5}$$

For a particular realization of $B$, define the control functionals $(R, L)$ by

$$R(t) = \sup_{0 \le s \le t}\,[a - B(s) + L(s)]^+ \tag{10.6}$$

and

$$L(t) = \sup_{0 \le s \le t}\,[B(s) + R(s) - b]^+. \tag{10.7}$$

The two-dimensional optimal control process $U$ is given by

$$U_1(t) = \frac{R(t)}{\rho_2} \tag{10.8}$$

and

$$U_2(t) = \frac{L(t)}{\rho_1}. \tag{10.9}$$

From the functionals $(R, L)$ in (10.6)-(10.7), next define the process $W$ by

$$W(t) = B(t) + R(t) - L(t) \quad \text{for all } t \ge 0. \tag{10.10}$$

The $K$-dimensional optimal control process $Z$ is given by

$$Z_k(t) = \begin{cases} \dfrac{W(t)}{\rho_2 M_{11} - \rho_1 M_{21}}, & \text{if } k = 1 \text{ and } W(t) \ge 0, \\ 0, & \text{if } k \ne 1 \text{ and } W(t) \ge 0, \end{cases} \tag{10.11}$$

and

$$Z_k(t) = \begin{cases} \dfrac{W(t)}{\rho_2 M_{12} - \rho_1 M_{22}}, & \text{if } k = 2 \text{ and } W(t) < 0, \\ 0, & \text{if } k \ne 2 \text{ and } W(t) < 0. \end{cases} \tag{10.12}$$

Finally, the optimal control process $\theta$ is given by

$$\theta(t) = \rho_1^{-1}\Big[B_1(t) + \frac{R(t)}{\rho_2} - \sum_{k=1}^K M_{1k}Z_k(t)\Big] \quad \text{for all } t \ge 0. \tag{10.13}$$

Thus the solution $(U, Z, \theta)$ to the workload formulation (2.6)-(2.11) is given by equations (10.8)-(10.9) and (10.11)-(10.13).

Acknowledgements

I am deeply indebted to my dissertation advisor, J. Michael Harrison. He suggested this problem area to me, and provided invaluable suggestions on both the technical and expository aspects of this paper. This research was partially supported by the Semiconductor Research Corporation and by a predoctoral fellowship from the International Business Machines Corporation.

REFERENCES

[1] Bellman, R. E. (1956). Dynamic Programming and Lagrange Multipliers. Proceedings Nat. Acad. Sci. 42, 767-769.

[2] Benes, V. E., Shepp, L. A. and Witsenhausen, H. S. (1980). Some Solvable Stochastic Control Problems. Stochastics 4, 39-83.

[3] Everett, H. M. (1963). Generalized Lagrange Multiplier Method for Solving Problems of Optimal Allocation of Resources. Ops. Rsch. 11, 399-417.

[4] Harrison, J. M. (1985). Brownian Motion and Stochastic Flow Systems. John Wiley and Sons, New York.

[5] Harrison, J. M. (1986). Brownian Models of Queueing Networks with Heterogeneous Customer Classes. Proc. IMA Workshop on Stochastic Differential Systems.
Springer-Verlag, Berlin, to appear.

[6] Harrison, J. M. and Taksar, M. I. (1983). Instantaneous Control of Brownian Motion. Math. of Oper. Res. 8, 439-453.

[7] Karatzas, I. (1983). A Class of Singular Stochastic Control Problems. Adv. Appl. Prob. 15, 225-254.

[8] Shreve, S. E., Lehoczky, J. P. and Gaver, D. P. (1984). Optimal Consumption for General Diffusions with Absorbing and Reflecting Barriers. SIAM J. Control 22, 55-75.

[9] Taksar, M. I. (1985). Average Optimal Singular Control and a Related Stopping Problem. Math. Oper. Res. 10, 63-81.

[10] Wein, L. M. (1987). Asymptotically Optimal Scheduling of a Two-Station Multiclass Queueing Network. Unpublished Ph.D. Thesis, Dept. of Operations Research, Stanford University.