'"fey HD28 .d414 ALFRED P. WORKING PAPER SLOAN SCHOOL OF MANAGEMENT SCHEDULING A TWO-STATION MULTICLASS QUEUEING NETWORK IN HEAVYTRAFFIC Lawrence M. Wein Sloan School of Management Massachusetts Institute of Technology Cambridge. MA 02139 #2016-88 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 50 MEMORIAL DRIVE CAMBRIDGE, MASSACHUSETTS 02139 SCHEDULING A TWO-STATION MLLTICLASS QUEUEING NETWORK IN HEAVY TRAFFIC Lawrence M. Wein Sloan School of Management Massachusetts Institute of Technology Cambridge, MA 02139 #2016-83 SCHEDULING A TWO-STATION MULTICLASS QUEUEING NETWORK IN HEAVY TRAFFIC Lawrence M. Wein Abstract Motivated by a factory scheduling problem, we consider the problem of input control (subject to a specified product mix) and sequencing in a two-station multiclass queueing network with general service time distributions and a general routing structure. The objective to is minimize the long-run average expected number of customers subject to a constraint on the long-run average expected output rate. heavy loading conditions, this scheduling involving Brownian motion. exactly in Wein ing network rule is in example is work in Under balanced approximated by a control problem is interpreted in terms of the queue- order to obtain an effective scheduling rule. input policy, where a customer of system reformulation of this Brownian control problem was solved a static priority ranking of the classes. amount is In the present paper, this solution [17]. model A problem in the is The input policy injected into the system the system for the two stations falls is The a "workload regulating" whenever the expected total within a prescribed region. presented that illustrates the procedure and demonstrates November 1987 resulting sequencing its effectiveness. An M.I.T. LIBRARIES RECEIVED n SCHEDULING A TWO-STATION MULTICLASS QUEUEING NETWORK IN HEAVY TRAFFIC Lawrence M. Wein This research in many motivated by a particular scheduling problem that is By viewing factories. K network under consideration consists of two single-server stations and Customers of class k service times are independent mjt and variance class 1, and ..., independent of matrix P= all identically distributed random structure is previous history. their variables with finite mean of service, a class k customer turns next into a We assume that the Because the number of classes A' is x all A' 1 — XI 7=1 ^kj- Markovian switching customers will eventually allowed to be arbitrary, this routing almost perfectly general. The scheduling problem is customer and A' require service at a specific station s{k) {Pkj) has spectral radius less than one, so that exit the system. The queueing different customer with probability Pkj and exits the system with probability j there = Upon completion s\. encountered a factory as a network of queues, the scheduling problem can be formulated as one of controlling the flow in a queueing network. classes. is an endless Each customer line of in the line incorporates input and sequencing decisions. customers who We assume are waiting to gain entry into the network. has an exogenously specified class designation. These class designations are such that, over the long-run, the proportion of class k customers released into the system is qk, where ^^=1 5* as the entering class mix. A'' = {N{t),t > 0}, system up to time The input where N{t) t. is = 1- The - {qk) will be referred to decisions are to choose a non-decreasing process the cumulative Thus the input vector q number of customers injected into the decisions essentially allow full discretion over the timing of the release of customers into the system, but do not allow for the choice of which 1 class of customer to The sequencing tomer to process inject. decisions consist of choosing, at each point in time, which ciass of cus- at each server in the network. customer so that service of a priority interrupted at a particular station effect It is allowed, when a higher is employed here, the assumptions made regarding preemption do not on the scheduling policy that emerges from the analysis. assumed that a holding customer spends in cost Ck is incurred for each unit of time that a class k the queueing network. Also, there is a specified lower long-run average expected throughput rate of the queueing network. of a queueing system is the number of bound A on the The throughput rate customer departures from the system per unit of Our queueing network scheduling problem time. is customer arrives at that station. Due to the rather crude nature of the Brownian approximation that have an may be Preemptive resume scheduling is to choose the input and sequencing decisions so as to minimize the long-run average expected holding costs incurred per unit of time, subject to a lower rate. bound constraint on the long-run average expected throughput Notice that in the special case where Ck = c for all k minimize the long-run average number of customers in the = l,...,/v, the objective system. Because the problem formulated in terms of long-run averages and because the constraint on throughput general be tight. Little's formula [9] implies that this objective is customer is the amount of time a customer spends in to is will in equivalent to minimizing the long-run average expected cycle time of customers in the system. of a is The cycle time the queueing network. In a manufacturing setting, there are many good reasons to minimize both the work-in-process inventory and the cycle time, and some of these will be discussed in the next section. A good deal of literature exists on input control of queueing networks, but these models consider the decision of whether to accept or reject Poisson arrivals; Stidham [15] provides a thorough survey of work in this area. Such models are not applicable to the scheduling problem considered here, since the relevant into the queueing network, not issue in our setting is when to release a customer whether or not to accept the customer. Although useful 2 Klimov results exist for sequencing single-station systems (see [8]), a satisfactory theory for sequencing in a network setting has not been attained, and simulation (see Conway, Maxwell and Miller [2] for a classic study on this topic) is still the primary tool of analysis. In view of the difficulty in obtaining sequencing rules for conventional multiclass queueing networks has been 14 years since Klimov's result), the best hope for further progress (it appears to be One such model Harrison [4]. more tractable models. the analysis of cruder, in is a Brownian network, a stochastic system model introduced by Under conditions of balanced heavy loading, a Brownian network approxi- mates a multiclass queueing network with dynamic scheduling capability. To state these conditions more precisely, let the two- vector p the two stations. traffic intensities, for switching matrix P, the vector mix q = {qk) and the in >/n(l — Pi) < which case n = 1 — {rrik) specified average The balanced heavy loading < m The for i = (p,) be the relative server and values of pi p2 can utilizations, or be computed from the of expected processing times, the entering class throughput rate A, as will be shown in Section 2. conditions assume the existence of a large integer n such that = 1,2. As a canonical example, one may think of p\ = P2 = -9, 100 satisfies this condition. Under such conditions, the scheduling problem described above can be approximated by a dynamic control problem Brownian control problem is for a in Wein [17] so Brownian control problem that the state of the system process that represents the scaled version of the total each of the two stations. in this a 7v-dimensional vector queue length process (appropriately scaled). Instead of analyzing the mulated Brownian network. The state of the system The reformulated problem the present paper, the solution is is is directly, the problem is refor- described by a two-dimensional amount of work solved exactly in in the system for Wein [17], and in interpreted in terms of the queueing system in order to obtain an effective scheduling rule for the original queueing network (and hence factory) scheduling problem. This interpretation traffic limit theorems for is based on intuition obtained from existing heavy some simpler queueing systems, and no attempt 3 is made to rig- orously justify our interpretation via a weak convergence result. However, that the resulting scheduling rule as — n is asymptotically optimal in the heavy we conjecture traffic limit (i.e., oo). The scheduling rule derived here consists of a sequencing rule describe the rule, a few definitions are needed. Let of time that the server at station Af,jt To policy. equal the expected total amount (hereafter referred to as server i and an input i) must devote to a class k customer before that customer eventually exits the network. Denote the ivT-dimensional queue length process by Q, so that at time = for k t w{t) = MQ{t), where work for server at time i the number of class k customers in the system we (A/,jt), w = interpret w,{t) as the expected total {wi) by amount of customers who are present anywhere in the network in those t. rule ranks each cjt = this index. the linear holding cost for a class k customer, the sequencing is customer c for all k priority at station is — embodied Recalling that Ck where M is Defining a two-dimensional workload process A'. 1 Qki't) 1 = class k by the index 1,...,A', this rule c'j^^ {p2^hk — P\M2k)- In the special case a static priority ranking that awards higher is (respectively, station 2) to the smaller (respectively, larger) values of (The case where Ck ^ = c for all k 1,...,A' will be discussed in Section 5). It interesting to note that, as in Klimov's results for a single-station queueing system, the solution to a dynamic scheduling problem is a static priority ranking of the classes, and the solution depends on the general service time distributions only through their means. This sequencing rule has the following interpretation. In the special case when Ck for all k = I, ..., A' and pi = p2 (i.e., minimization of the cycle time in a perfectly = c balanced system), the rule tends to retain jobs at each station (by giving them lower priority) that have relatively more work to be done at that station, either quickly (by giving them higher the other station. When Incidentally, known (Harrison and Wein it is pi ^ priority) jobs that P2, then pi now or later, dispatching have relatively more work to be done at and p2 show up as appropriate weighting 4 [5]) more that when cjt = c for all k = factors. 1,...,A', this same sequencing queueing network The input in rule rule maximizes the throughput rate heavy workload regulating policy because two-dimensional workload process w. More A orthant of R^. may it depends 6, description of this region where the region is will the nonnegative be deferred until 7, where the region consists of the shaded area. The input network to behave as a "pull" system: when either server appears to be rule causes the threatened with idleness and there new customer and fairly involved is is not too much work already present in the system, a released into the system. Although neither the sequencing nor input rule derived here has ever appeared literature, they are setting, they both intuitively appealing policies. would be very easy to implement. As outperform conventional scheduling rules The original in will Furthermore, be seen in in a Section manufacturing 7, these policies simulation studies. restrictive at first glance. the balanced heavy loading assumption is However, one important implication of that, in the heavy traffic limit represented by the Brownian network model, any stations in the original system that are not most heavily loaded [7] ting. Limit will theorems of this [1] among the This has been proven in limit theorems by simply disappear. and Chen and Mandelbaum are not heavily loaded original in the system description of a two-station, heavily-loaded, well-balanced net- work may seem quite Johnson on the calculated explicitly. For a typical example, interested readers Figure 2 of Section refer to is in solely new customer specifically, the rule releases a system whenever the workload process enters a certain region into the Section a two-station multiclass closed traffic. called a is in in the single-type open queueing network type can justify the procedure of eliminating all set- stations that when forming the approximating Brownian network, reducing the system to a subnetwork of bottleneck stations for purposes of subsequent analysis. However, these bottleneck stations are precisely where the large queues form, where most of the waiting fact, is incurred, and thus where scheduling will have the biggest impact. other approaches to job shop scheduling problems, such as the the 5 OPT In system (see Jacobs for a critical evaluation of its [6] taken by Morton and Smunt [10], also Brownian network approximation queueing network, its One consequence is main features) or the expert systems approach focus on the bottleneck stations. Thus, although the a rather crude model in comparison to a conventional underlying assumptions are made-to-order for scheduling purposes. of the previous paragraph is that the scheduling rule emerging from our analysis can be applied to any queueing network with two bottleneck stations. In a simulation model has been built that ductor wafer fabrication is based on operating data from an actual semicon- Using this simulation model (see Wein facility. fact, [16] for details), which contains 24 stations but only two bottleneck stations, rules similar to the ones derived here were compared against conventional sequencing and input The rules. results were quite impressive: the rules outperformed conventional rules and achieved a 47.2% reduction in average customer queueing time versus the base case of Poisson inputs and first-in first-out This paper study (FIFO) sequencing. is organized as follows. discussed in Section 1. network scheduling problem is is The factory scheduling problem that motivates our In Section 2 the stated. Brownian approximation of the queueing The Brownian control problem is reformulated Section 3 and the solution to the reformulated problem, which was derived in stated in Section and in Sections 5 An example is 4. This solution 6, in is Wein in [17], is interpreted in terms of the original queueing system order to obtain a sequencing rule and an input policy, respectively. presented in Section 7 that illustrates the procedure and demonstrates its effectiveness. Some of the notational conventions introduced. and have motion, space, ,Y A left it = is stochastic process is and terminology used said to be limits with probability one. assumed there A''(a;) is is RCLL if its When we in this say that A' a given (Q, F, Ft, AT, Pj:), where = a(A'(5),5 < 6 t) is will now be sample paths are right continuous a measurable mapping off! into C(R), which functions on the real line R, Ft paper is is a {^,a^) Brownian (n,F) is a measurable the space of continuous the filtration generated by X, and Pi is a family of probability measures on motion with drift /i, associated with P^. variance If Y = cr'^ {Y'{t),t Y then we say that the process X. More generally, another process X we when and is > is such that the process {X{t),t initial state x. 0} is Let > 0} is a Brownian E^ be the expectation operator a process that is Ft-measurable for all t > 0, non-cLnticipating with respect to the Brownian motion will say that Y fi one process Y adapted to the coarsest is non-anticipating with respect to filtration with respect to which X is adapted. 1. The Factory Scheduling Problem This section describes the relationship between the queueing network scheduling prob- lem and the factory scheduling problem. Each server to a machine or work center job. The routing in in the queueing network corresponds the factory, and each customer corresponds to a particular structure described in the introduction can accomodate the case where the factory produces a variety of products, each with through the network of machines. its own In that case, a different arbitrary deterministic route customer class is defined for each combination of product and stage of completion. More generally, our set-up allows probabilistic routing to represent such events as rework or scrapping. In fact, a customer class can include any observable information about a particular job that is relevant for dynamic scheduling purposes. The queueing network model can By assuming that the amount of also accomodate machine breakdown and repair. machine busy time between consecutive breakdowns is exponentially distributed, the breakdown and repair can be incorporated into the service time distributions for each customer class; see Harrison and si are interpreted as the mean and variance of the customer, i.e., [4] effective service the actual processing time plus the total duration of 7 The modified for details. all rrik time of a class ^- interruptions that occur during that service. In the manufacturing setting, the sequencing decisions consist of dynamically choosing which job to process machine at each The input shop scheduling problem. release of jobs onto the factory floor. entering product types mix that the factory A and is is decisions in our problem specify the timing of the However, it is assumed that the exact sequence of This sequence reflects the desired product specified precisely. required to maintain. For example, if a factory makes two products, B, to be produced in equal quantities, then the specified sequence of entering jobs would be ABABAB... There the factory objective which corresponds to the classic job in the factory; this is is is a long-run average output rate (in jobs per unit time) that is required to maintain. When the holding costs Ck = c for all k = 1, .... A', the minimize the long-run average expected work-in-process (WIP) inventory, to equivalent to minimizing the long-run average expected cycle time of jobs. This scheduling problem is relevant for any factory that is obliged to maintain a specified average output rate of a certain product mix, but can control the timing of inputs. In thinking about endogenously generated arrivals, to-stock manufacturer, where orders are it is easiest to imagine a finished goods inventory. met from its make- However, in a make-to-order environment, input to the factory floor can also be regulated, but then customer orders will The motivation for in both the floor. WIP sometimes queue outside the factory doing this Time manufacturing the number (see will be gained by a reduction on the factory [13] for be detected floor, the benefits from Just-In- a detailed description) can be realized. For and thus there faster, will fast A more less rework and turnaround on individual orders, and the factory may readily adapt to a changed order, since the corresponding job processing. be the cycle time of jobs, the factory can gain Rexibility: the system be more capable of very more its By reducing of jobs Schonberger example, quality problems will to reap the benefits that can is inventory on the factory floor and the cycle time of jobs on the factory By reducing scrap of jobs. floor waiting to gain entrance. specific example occurs 8 in may not have begun the semiconductor industry, where a decrease in the average cycle time of a lot of wafers in the wafer fab will result increase in the yield of good wafers. This is because in the fab. Finally, in the case of standardized cycle times allow production to be based lots are so easily products that can be an in contaminated while made to stock, shorter on more accurate forecasts of market demand. Since our definition of cycle time does not include the time that transpires between receiving an individual order and releasing the corresponding job onto the factory floor, may be readers concerned about the performance. Our view is effect the rules derived here that, in the case of a would have on due-date busy factory with more than one bottleneck machine, scheduling for due-dates has a detrimental effect on the utilization of bottleneck machines, and hence ultimately does more harm than good. As an example, consider a two-station well-balanced factory that has a very large backlog of jobs, each with a given due date. Furthermore, suppose that either workload regulating input (see Section or closed loop input (the total Solberg [14]) is used. number of jobs floor is held constant, see In these cases, the sequencing rule described here can substantially increase server utilization compared to any sequencing due-date information (see Harrison and Wein allow the factory to produce more timely customer on the factory 6) more jobs per service than a [5] or rule that sequences according to Wein [17]). This sequencing rule will unit time and thus would eventually provide myopic sequencing rule that is based on due-dates. (Notice that the above argument does not hold for a factory with only a single machine. This is because, in a single-server queueing system, every work-conserving sequencing rule achieves the In same server utilization. summary, the research undertaken here attempts dynamic and stochastic elements that are inherent to realistically incorporate the in all factory scheduling problems. Furthermore, we believe that factories, by focusing on system performance measures (such as WIP inventory and cycle time) rather than due-date performance measures, can take advantage of some benefits of Just-In-Time manufacturing and can provide better customer service over the long run. 9 2. The Limiting Control Problem We assume readers are forth in Harrison follows [4]; Most familiar with the approximating Brownian network model put of that paper's notation will be retained for ease of reference. from Section 9 of Harrison [4] It that under the balanced heavy loading assumptions, the queueing network scheduling problem described in the introduction can be approxi- mated by the following limiting control problem: choose a pair of RCLL processes Y and 6 (K-dimensional and one-dimensional, respectively) to minimize subject to — limsup / E'j.f T— oo T Jo }' and ^ r^ 1 are non 2.^kZk{t)dt] — = U{t) = AY{t) U non — decreasing with U{0) is X{t) > + RY{t) for all for all t lim sup ^E[U,{T)] T—oo process Z anticipating with respect to Zit) Z{t) The > qe{t) > t 0, < for all > t X, (2-2) (2.3) 0, (2.4) 0, = (2-5) 0, and (2.6) fori 7. = 1,2. (2.7) -^ represents the A'-dimensional scaled queue length process and describes the state of the system. The A'-dimensional process allocation process and the one-dimensional process process. (2-1) f^^ These two control processes correspond Y represents the scaled centered represents the scaled centered input to the sequencing respectively. Interested readers are referred to Harrison [4] for control problem. This is done in in defining decisions, an explicit definition of the process Y, since the definition will not be needed here. As in Harrison notation used for the scaled processes are used and input [4], exactly the same the approximating Brownian order to emphasize the queueing network interpretation 10 of the Brownian network model. The scaled process B where, as mentioned earlier, A lative number \nt-Nint) ,^^^ ^^-^, i>0, = e{t) defined by is 2.8 the specified average throughput rate, N{t) is of customers released into the network in [0,t] and n is is the cumu- the large integer heavy loading condition. specified in the balanced The two-dimensional U process represents the scaled cumuiative idleness process for the two stations. (For brevity's sake, processes such as Z, Y, Ua.nd 9 will often be referred The to without the adjective "scaled".) x A' input-output matrix A' R— [Rkj) is defined by Rkj=rn-\8jk-P,k). where meaning that 6jk denotes the Dirac delta function, otherwise. The 2 x A' resource consumption matrix fl, ifi process X a is [6, E = (Ej/). Let A = (A;t) Ajt Since P = to j = k and 8jk = defined by ,^ Q. ^'' ' = {6k) and the A' x A' covariance number gA, be transient, a unique non-negative A'— vector l3 — (2.11) of class k customers that in order to satisfy the was assumed \i be defined by represents the average system per unit of time (A,fc) is I s(k); drift vector 6 A so that = S) Brownian motion, but several definitions are needed before stating the A'-dimensional matrix = A= 8jk otherwise. "^'^-lo, The A'-dimensional (2.9) it must arrive to the throughput rate constraint. follows that R is non-singular and there exists {3k) satisfying the flow balance equations A = RI3. 11 (2.12) Letting C{i) be the set of of traffic intensities p — customer classes k such that s{k) all {p,) = i, define the two-vector by ^ = P. (2.13) f3k. itGC(i) Now define the iv-vector a = (q^) by ak = — E 0(1). for all k (2.14) P. Then the drift 6 and covariance E of the Brownian motion S= X are -fV"(A-i?Q) and (2.15) K S;( = ^[afcm;^Pfc,(<5,, - + akm-'slR,kRik]- Pki) (2.16) k= l Inequality (2.7), which expresses the throughput rate constraint in terms of the cu- mulative server idleness process U, is the only relationship in the limiting control problem that does not appear in the Brownian network formulation of in (2.7) is defined derive (2.7), The two- vector 7 let the 2 x A' matrix = y^(l-p.). is called the amount workload of time that server M = (M,jt) be defined by and profile matrix, i must devote i is to a class k v is (2.18) M,fc the network. Define the two-dimensional vector v so that V, (ji) (2.17) M = AR-\ M = by 7. To [4]. = interpreted as the expected total customer before that customer exits (v,) by = Mq, interpreted as the expected total (2.19) amount of time over the long-run that server spends on each customer. From (2.11)-(2.13) and (2.18)-(2.19), p, = v,X for 12 f = 1,2. it follows that (2.20) Inequality (2.7) follows from (2.17) and (2.20), since the long-run average throughput rate greater than or equal to A is i is al idle is if and only than or equal to less 1 — if the long-run average fraction of time that server p^. 3. The Workload Formulation The state of the system in the limiting control problem is described by a A'— dimension- queue length process, by way of the basic system relationship limiting control problem is (2.3). In this section the reformulated so that the state of the system described by a is two-dimensional workload process. Recalling the definition (2.18) of the workload matrix M, let us define the two-dimensional scaled workload process W{t) where Wi{t) in is interpreted as the expected total those customers who are present dimensional Brownian motion B{t) anywhere = B{t) The B = MZ{t), M6 t W — {W,) by >0, (3.1) amount in the profile of work for server network at time t. i embodied Define the two- {B,{t)) by' = MX{t), t > (3.2) 0. MUM'^ By and (2.17)- Define the workload formulation of the limiting control problem as choosing RCLL process (2.18), has drift and covariance . one can show that the two-dimensional processes Z,U and (2.10), (2.12)-(2.15) drift vector M8 = —7. 6 (K-, two- and one-dimensional, respectively) so as to mmimize limsupiE.i/ subject to U U YckZk{t)dt\ (3.3) and 6 are non — anticipating with respect is non - decreasing with U{Q) 13 = 0, to B, (3.4) (3.5) Z{t) > for all t Vim sup -E[U,{T)] MZ{t) = RCLL Let us call a pair of problem if it satisfies B{t) + > 0, < U{t) (3.6) - = fori 7. ve{t) 1,2, for all and (3.7) t>0. (3.8) processes {Y, 9) a feasible policy for the limiting control equations (2.3)-(2.7) and a feasible policy for the workload formulation following proposition, which was proved in call if Wein it a triple of satisfies [17], RCLL processes {Z,U,6) The equations (3.5)-(3.8). allows us to analyze the workload formulation of the limiting control problem, rather than studying problem (2.1)-(2.7) di- rectly. Proposition 2.1. Every feasible policy {Y,9) for the limiting control problem yields a corresponding feasible policy {Z,U,0) for the workload formulation and every feasible policy [Z, U,6) yields a corresponding feasible policy {¥,6). It to the was shown in Wein Brownian motion [17] that if the control process Y is non-anticipating with respect A' in the limiting control problem, then the control process non-anticipating with respect to the Brownian motion B in U the workload formulation. is It was also shown that the solution to the workload formulation remains unchanged whether 9 is non-anticipating with respect to 4. The or with respect to B. Solution to the Workload Formulation solution {U,Z,9) to the workload formulation (3.3)-(3.8) of the limiting control problem was derived solution. X in Wein [17]. This The parameters pi,Mtk and of the primitive vi is a self-contained section that summarizes the appearing problem data by definitions in this section are all defined in (2.13). (2.18) need to define the parameters a^.h\,h2,i' and 14 if, and (2.19), respectively. which can be calculated in terms We also terms of the The parameter primitive problem data. defined by cr^ is <r^ = g'^M'LM^ g, where P2 (4.1) -PiJ Without loss of generality, assume that the arg classes k = 1,...,K are ordered so that max c^^{p2Mik - P\M2k) = 1 (4.2) 2. (4.3) k and argmin c,^\p2Mik - P\M2k) = k Now define the positive coefficients h^ and h, = ho = /i2 by C2 (4.4) P1M22 - P2^h2 and Cl (4.5) P2M11 - Pi A/21' Finally, let 2sqrtn{pi - P2) (4.6) and i In the = y/ripi{l - (4.7) pi). workload formulation, the controller observes a two-dimensional Brownian mo- tion process B, from which can be observed the one-dimensional B Brownian motion process defined by Bit) If pi ^ p2, = p2B,{t) - piB2it), then define the interval endpoints a and a — V Mn {hi hip2{l b t>0. (4.8) by + /l2)/?2(l - Pi) - Pi) + h2pi(l - (4.9) p2 and b = i/-Mn (hi hip2{l + h2)pi{l - P2) - pi) + h2Pi{l 15 (4.10) P2) If Pi = P2i then let __h2_zL = a + hi (4.11: hi 2i and 6 a2 hi = (4.12) 4-/12 2^ /ii . For a particular realization of B, define the control functionals {R,L) by R{t) = sup [a- B{s) 0<a<t L(t) = sup [Bis) Q<s<t + L{s)]+ (4.13) + R{s)-b] + (4.14) and The two-dimensional optimal U control process = Ui{t) given by is R{t) (4.15) P2 and U2it)= From the (4.16) functionals {R,L) in (4.13)-(4.14), next define the process = W{t) The K-dimensional optimal + B{t) = Z control process ,,"""\, Zkit) - R{t) , L(t) for all is given by if it if /t if if = t > W by (4.17) 0. 1 and W{t) > 0; / 1 and W{t) > 0. A: = 2 and W{t) < 0; A: 7^ 2 and W{t) < 0. (4.18) { 0, and ,,^'^'^ Zk{t) = ,, , (4.19) { 0, Finally, the optimal control process 6 e{t) = v;'[Bi{t) + is given by K -^ - Y, MikZk{t)l R{t) P2 Jt=l 16 for all t > 0. (4.20) Thus the solution {U,Z,6) to the In the next (4.15)-(4.16) and(4.18)-(4.20). in workload formulation (3.3)-(3.8) two sections, given by equations is this solution will be interpreted terms of the queueing network model. 5. The Sequencing Rule In this section we describe the sequencing rule, which is based on the control process Consider again the workload formulation (3.3)-(3.8) of the limiting control problem. Z. According to definition we must have W{t) (3.1), the scaled queue length process Z present scaled workload process W. = MZ{t). This means that at any time can be any nonnegative vector that Thus, in the idealized of different customer classes can be instantaneously is i, consistent with the Brownian approximation, queues swapped for one another, as long as the expected work content remains unchanged. These swaps, which can be interpreted as among the reallocation of server time the various classes, appear to occur instantaneously because we are observing the system evolving in scaled time. From the Z solution Z in (4.18)-(4.19), it is seen that only two of the A' components of These two components correspond to the two customer are ever positive. classes that are &Tgmax c'^\p2Mik - PiM2k) (5-1) k and argmin c-;;\p2Mik - PiMik), (5.2) k which were denoted by classes more, at each time t, W{t) < 0. In the case and 2, respectively, by conventions (4.2)-(4.3). Further- only one customer class has a positive queue length. According to formulas (4.18)-(4.19), class load imbalance W{t) 1 > 0, 1 customers have a positive queue length whenever the work- and where Ck class 2 = customers have a positive queue length whenever c for all k 17 - \, ..., A' (i.e., the objective is to minimize the long-run average cycle time of customers), and class 2 is served at station customers of class 1. Similarly, 1 This 2. are only served whenever W{t) < served 0, or in does not matter what order The that whenever W{t) > 1 0, there are no other customers present at station classes 1,3, only required that the two servers be kept busy are two reasons for this. served at station is 2. traffic conditions, it when W{t) > mean 1 customers of class 2 are only served when there are no 0, other customers present at station Under heavy true that class interpreted to is when it is first ..., when what order in K are served there is work classes 2,...,K are when W{t) < them for to do. reason, as will be seen in the next section, is 0; it is There that the asymptotically optimal input rule prevents a large queue of customers from forming at station in 1 (respectively, station 2) when W{t) < the scaled space of the Brownian limit, 2) vanish when W{t) < (respectively, customers at station all W{t) > (respectively, to that of the bottom and thus their scaled priority classes. 1 0). will not see the queue lengths will Consequently, (respectively, station The second reason 0). customer classes that are not given bottom priority a heavy traffic situation, W{t) > is because the queueing system in be negligible compared This phenomenon of the normalized queue length processes of high priority customers vanishing in the heavy traffic limit has been observed in previous work. Whitt limit theorems [18], Harrison in a single station [3], and Reiman system, and Johnson [7] [12] have obtained heavy and Peterson [11] traffic have obtained similar results in a network setting. However, a formal limit theorem has yet to be proved for our case of a general multiclass network with feedback. To repeat, the interpretation of formulas (4.18)-(4.19) lowest priority at station station 2 when W{t) < 0. 1 when W{t) > There seems to and give class 2 is to give class 1 customers customers lowest priority at be some ambiguity that remains in specifying a sequencing rule that emerges from the solution of the Brownian control problem. However, from (4.1)-(4.2), when c^ = c for all k = l,...,/\, there customer classes by the index p2M\k - PiM2k18 We now is a natural ranking of the K propose two sequencing policies that give class station 2) 1 (respectively, class 2) customers lowest priority at station when W{t) > (respectively, W{t) < rule that awards higher priority at station class k = policy is is a static priority p2M\k — p\M2k- obtained by computing dynamic reduced costs for each customer The reduced 1,...,A'. policy first (respectively, (respectively, station 2) to the classes with 1 the smaller (respectively, larger) values of the index The second The 0). 1 cost for a class k customer at time t can be interpreted as the increase in the objective function of the linear program (originally stated as equations Wein (3.1)-(3.4) in [17]) A' min YckZkit) (5.3) t^ k=\ z(i),e{t) K Y^KhkZk{t) + v^6{t)^B^[t) + subject to U^{t) (5.4) U2{t) (5.5) k= (5.6) k=\ K Y, M2kZk{t) + V2e{t) = Zk[t) > B2{t) + k-1 for 0, l,...,K per unit increase in the righthand side of the nonnegativity constraint Zk{t) shown in Wein [17] control process Z > 0. It was that the partial solution Z{t) to this linear program yields the optimal given in equations (4.18)-(4.19). was also shown there that the dual It of (5.3)-(5.6) can be expressed as max ^\{i} Tl(«) subject to The reduced costs at time (5-') P2 c^\p2Mik t PiM2k)-^\{t) TT*{t) is (5.9), the the solution to (5.7)-(5.8). more expensive it for p2 i- = 1, ..., A'. (5.8) for the A' variables in the dual of (5.7)-(5.8) are P2-c;'{P2Mik-PiM2k)7Tlit) where < is for A^ = 1,...,A-, The higher the value to hold class k 19 customers (5.9) of the k-th reduced cost in in the queue. Furthermore, the reduced cost for a class k assume that Ck time at each t = customer = c for k to the dynamic scheduling > Zk{t) Let us again in (4.18)-(4.19). and consider the policy that gives highest 1,...,A' customer when zero is class with the largest rule that ranks all reduced cost. priority Then one obtains a K classes by the index p2-^iit — Pi-^2* and, at each when station, serves the class with the smallest (respectively, largest) value of the index W{t) > (respectively, W{t) < 0). Simulation results (see Section 7) on several systems have indicated that both the static priority rule and the dynamic rule work well regulating input rule described in the next section. it is easier to implement, since dynamic Thus far in this section relax this assumption, is it sequencing policies. class 1 priority it 2. " it has been assumed that Ck When this now based on this is when Wit) < more 0. advantage that has a natural generalization to networks with more is indeed = still c for all k 1 is 1,...,A'. If served at station — 1 we and P\M2k), are the proposed not the case, the equations (4.18)-(4.19) > = the case, then the same two priority the index c^^{p2Mik customers lowest priority when W{t) before making a 6. If static rule has the does not necessarily follow that class served at station rules described earlier, The conjunction with the workload does not depend on any global state information. The rule has the advantage that than two bottleneck stations. class 2 it in and giving still class 2 suggest giving customers lowest However, additional simulation studies need to be performed specific policy recommendation. The Input Rule In this section the input rule, described. In the which is based on the control processes U and 9, is workload formulation of the limiting control problem, the controller observes the two-dimensional Brownian motion process B, exerts the controls 20 U and 6, W, and obtains the controlled process system state equations which is the scaled workload process. The basic govern the controlled process can be expressed as (3.8) that W,{t) = Bi{t) + U,{t) W2{t) = B2{t) + U2{t)-V2e{t). - and Vi9{t) (6.1) (6.2) U Since these equations are linear and additive, the controls and 6 act on as "pushes" the Brownian motion B. Recall that the non-decreasing process U, represents the scaled cumulative idleness process for station the vector v Since W resides on (4.18)-(4.19), Also, is the scaled centered input process, and proportional to the server utilization levels is W i. = MZ, Z the solution the boundary of a cone it in (4.18)-(4.19) implies that in the workload process the nonnegative orthant of R^. can be seen that the control Ui (respectively, U2) scaled workload imbalance process W equals a (respectively, interpreted as incurring server idleness at station b). is From equations exerted only when the Exerting the control W, the interval endpoints a and correspond to reflecting barriers on the boundary of the cone, beyond which not enter. of the cone This situation is boundary that depicted in Figure is in boldface. U2 are only exerted when W2(t) threshold levels Cj and C2 = Cj L/, is i. In terms of the two-dimensional workload process b p. 1, W where must W reside on the portion In the optimal solution, the controls Ui and Wi{t) = may Cj, respectively, and where the scaled can be calculated explicitly from the solution to the workload formulation. Otherwise, only the input process 6 is used to keep the controlled process W on the boundary of the cone. Thus, the policy that emerges from the Brownian control problem attempts process process to manipulate input in lieu of idling servers and keeps the workload W on the boldface portion of the cone boundary W reaches the barrier or the barrier refuses to release at c* any more customers In order to see exactly how into the the input is 21 Figure 1. However, when the at Cj in Figure 1, then the controller system and in is manipulated, willing to incur server idleness. recall that by equation (2.19) FIGURE and the balanced loading conditions, centered input process 6 can Vi move along is 1 approximately equal to a direction that is V2 and so the scaled close to the 45 degree line. The process 6 was defined in equation (2.8) by ~Xnt - N(nt) (6.3) where the process to time t. is is the cumulative Thus, when 6 moves relative to the input N nominal input in number of customers released into the system up the negative 45 degree direction, input rate, and when 9 moves in is witheld whenever the workload process 22 is being witheld the positive 45 degree direction, being increased relative to the nominal input rate. This where input is in is depicted in Figure the cone and input is 2, increased INCREASE INPUT INCREASE INPUT FIGURE whenever the workload process is in the shaded region. Notice that in the actual queueing system, to reside outside of the cone. MZ,Z > 0}, This is 2 it may be possible for the workload process because the state space of which contains the cone pictured in Figure 1. Its W is the cone {W = extremal rays are generated by the two customer classes (6.4) 23 , and argnun— which may not coincide with the rays in Figure (6.5) which are generated by the two classes 1, defined in (5.1)-(5.2). The main goal of this section is to develop an effective input policy for the actual queueing system that operationalizes the optimal solution obtained from the limiting control problem. To this end, let us interpret the word "increase" in Figure 2 to simply "release a customer into the system" Then and the word "withold" the naive rule that emerges from this interpretation system when the workload process is to simply mean mean "cesise input". to release a customer into the W enters the shaded region in Figure However, 2. this naive rule ignores a major difference that exists between the actual queueing system and the idealized heavy traffic limit. This difference can be understood by making the following W observation. In the idealized Brownian setting, when the scaled workload process the lower ray of the cone boundary and Wi{t) < then there are zero scaled customers at station 2 and yet station 2 cone and Woit) < C2, not idle. Similarly, when W then there are zero customers at station This apparent paradox limit. In the actual is c*, is is 1 is on on the upper ray of the and station 1 is not due to the rescaling that occurs when passing to the heavy idle. traffic queueing system, there are enough customers at the particular station to avoid idleness, but when looked at in the scaled space of the heavy traffic limit, these customers vanish. In order to adapt the naive control rule stated above to the actual queueing system, is as necessary to build in a boundary layer of thickness shown that is in Figure 3. e on the inside of the cone boundary, This boundary layer generates a new cone, which we strictly within the original cone. The input rule is still which is is call the e-cone, to release a the system whenever the workload process enters the shaded region, but region 24 customer into now enlarged by including the area between the two cones, as in Figure negligible in scaled space, prevents the process it 3. W from straying very the shaded This layer, far from the boldfaced portion of the original cone boundary, but allows the servers to be utilized the requisite portion of the time. queue lengths may grow of e will As e increases, the servers will incur less idleness but the as a result. In depend on the amount of an actual queueing system, the appropriate setting variability in the time customers spend at non-bottleneck stations. In Ci queueing system and the amount of fact, one could use a layer of thickness on the lower ray of the cone boundary, and a layer of thickness the cone boundary. FIGURE 25 3 (.2 on the upper ray of Thus the suggested input rule enters the shaded region of Figure is customer whenever the workload process to release a This region can be calculated explicitly 3. the problem data and the parameters and ej (.1. The cone in Figure 1 is terms of in generated by the rays W2 - ^W^i = ^^W2 = 0. (6.7) < fi (6.8) (6.6) • Mil and W, - M22 Therefore the regions outside of the e-cone are T^^2 - ^^^1 A/11 and ^^i-T7^^2<e2. A/22 From (2.37), (3.8) and (4.18)-(4.19), c* and (6.9) can be solved c\ P2 A/11 for explicitly. The solution is -pi A/21 and Co A/2 2 a = - P2A/12 where a and h are (6.11) PiA/22' the optimal interval endpoints from the solution to the Brownian control problem. Notice that \\\ c\ and c\ are all in scaled terms, policy for the original queueing system, (3.1) and the standard heavy and in order to find an appropriate some unsealing needs to be done. By described in Section 5 of Harrison traffic scaling definitions [4], it can be seen that »•(.) where w is = ^, (6.12) the unscaied workload process defined in the introduction by w{t) = MQ{t), 26 t > 0, (6.13) and the actual queue length process. Define w* Q is levels for the input policy. system at times t = \/nc* for Then the suggested input rule i = 1, to release a is be the threshold 2 to customer into the such that either wi{t) < wl and y^2it) - ^w,{t) < W2{t) < ^'iii) - TT-Mt) < (6-14) e,, (6.15) or 1^2 and (6.16) AI Here, As will when and ei €2 be seen (6.17) ^2. are parameters that can be set in order to achieve a desired output rate. in the next section, the setting of these parameters is quite simple, at least there are no non-bottleneck stations in the queueing system. An Example 7. The scheduling rules stated in Sections 5 example. The example mix that As seen is in will specified, so that Figure The four stages. 4, six and 6 have two customer types, will A and be illustrated by means of an B, and there customers are released into the system customer type A has two stages on its is a 50-50 product in the order ABABAB... route and customer type customer classes are designated (and ordered from k = 1, ...,6) B has by Al, A2, Bl, B2, B3 and B4, since each class corresponds to a type-stage pair. The mean in Figure are 4. assumed with finite service times (in arbitrary time units) for each customer class are indicated For concreteness (since simulation results will be exhibited), to be exponential, although our results hold for mean and variance. Calculation of the ,, /4 ^^=[1 1 2x6 10 2 2 13 13 7 27 service times any service time distributions workload 0\ ?;• all profile matrix M yields ,_ . ^^-^^ A1 STATION B1 1 8 such that either wi{t)<l9 W2{t) and (7.3) - ^Wi{t) < €u (7.4) or W2{t) < wiit) - ^W2{t) < and 62 (7.5) €2. (7.6) Using the example network, a simulation study was undertaken to compare the per- formance of the suggested scheduling rule against conventional input and sequencing Three input rules were tested: the suggested input rule (abbreviated by WR{€i,€2) for workload regulating input, where ej and €2 are the CL{N), where loop input (abbreviated by N is boundary the total layer thicknesses used); closed number of customers in the net- work); and deterministic input, where the interarrival times are constant. customers entered the system rules, compared: first-in first-out rules. in the order ABABAB... For all input Five sequencing rules were (FIFO); shortest expected processing time (SPT); shortest expected remaining processing time (SRPT); the asymptotically optimal sequencing rule (abbreviated by ST(A/i was described in Section 5 scheduling literature to the work it. The to station 2, The tics for is customer who in — and all (abbreviated by DY(A/i the least work next queue is LWNQ M2)); and the rule based on the dynamic reduced costs that — M2)). Another common (LWNQ) rule in the This policy gives priority rule. going next to the queue that has the least expected amount of rule is not relevant here, since all customers at station customers at station 2 go next to station results of the simulation study are summarized a particular scheduling policy, which paired with a specific sequencing rule. The is first specified in go next or exit the system. Table 1. Each row gives statis- by a particular input control rule two columns of Table ing policy. For each policy tested, ten independent runs were 29 1 1 1 state the schedul- made, each consisting of 2000 customer completions. The third column gives the average throughput rate 95% per unit time) over the ten runs, along with a of Table 95% A 1 confidence interval. The (in customers fourth column contains the average cycle time of customers over the ten runs, along with a confidence interval for this value. Rather than use the target average throughput rate = .1286 customers per unit time, it was more convenient to choose the parameters of the closed input policies and workload regulating input policies so as to achieve a throughput rate of .127 customers per unit time. average server utilization of 88.9%. times for the various policies, all This average throughput rate corresponds to an To allow for easy comparisons of the average cycle simulation runs achieved this target output rate. Each simulation run had no and initialization period, all runs began with an empty system. For closed loop input runs, the customers arrived according to deterministic input (at the same rate as the corresponding open models) until the network had reached its population limit, and then closed loop input was used. Referring to the results in Table 1, it is seen that workload regulating input in com- bination with either of the sequencing rules described in Section 5 easily outperformed The other combinations of input and sequencing rules. the ST(Mi — M2) and DY(A/i — achieved nearly a ing rule, which rule. 30% il/2) rules difference in performance between rule, in combination with the which was shown in reduction in average cycle time compared to the workload regulating ST{Mi — M2) and DY{Mi — were not tested in in rule rules. Similarly, the in heavy [5] to traffic, sequencing maximize the achieved a 30% the closed loop input case. was A/2) rules, the input rule with the other three sequencing rules FIFO input ST{Mi —M2) Harrison and Wein throughput rate of a two-station closed queueing network Since They both statistically significant. reduction in average cycle time, compared to the next best schedul- was the closed loop input This sequencing was not all derived jointly was not tested in with the combination ST{Mi — M2) and DY{Mi — A/2) combination with input rules with which they were not derived. 30 Acknowledgements I this am deeply indebted to problem area to my dissertation advisor, J. Michael Harrison. He suggested me, and provided invaluable suggestions on both the technical and expository aspects of this paper. This research was partially supported by the Semiconductor Research Corporation and by a predoctoral fellowship from the International Business Machines Corporation. 32 REFERENCES [I] Chen, H. and Mandelbaum, A. (1987). Stochastic Flow Networks: Bottlenecks and Diffusion Approximations. In preparation. [2] Conway, R. W., Maxwell, W. L. and Miller, L. W. (1967). Theory of Scheduling. Addison-Wesley, Reading, Mass. [3] Harrison, M. (1973). A Limit Theorem J. for Priority Queues in Heavy Traffic. J. Appl. Prob. 10, 907-912. [4] Harrison, M. (1987). Brownian Models J. Customer Classes. Proc. IMA Workshop of Queueing Networks with Heterogeneous on Stochastic DifFerenticd Systems. Springer- Verlag, Berlin. [5] Harrison, M. and Wein, J. Queueing Network [6] in Jacobs, R. F. (1983). Heavy The M. L. (1987). Scheduling a Two-Station Multiclass Closed Submitted Traffic. OPT Scheduling System: Scheduling System. Production and Inventory [7] Johnson, D. cesses and P. (1983). Diffusion for for publication. A Review Management Approximations for of a New 24, 47-51. Optimal Filtering Queueing Networks. Unpublished Ph.D. Production thesis, of Jump Pro- Dept. of Mathematics, Univ. of Wisconsin, Madison. [8] [9] KHmov, G. Little, J. P. (1974). Time Sharing D. C. (1961). A Service Systems I. Th. Prob. Appl. 19,532-551. Proof of the Queueing Formula L = \W. Ops. Rsch. 9, 383-387. [10] Morton, T. E. and Smunt, T. L. (1986). A Planning and Scheduling System for Flexible Manufacturing. In Kusiak, A., editor. Flexible Manufacturing Systems: Methods and Studies, North-Holland, [II] Peterson, ple W. Amsterdam. P. (1985). Diffusion Approximations for Networks of Queues with Multi- Customer Types. Unpublished Ph.D. Thesis, Dept. ford University. 33 of Operations Research, Stan- [12] Reiman, M. Proc. Intl. I. (1983). Some Diffusion Approximations with State Space Collapse. Seminar on Modeling and Performance Evaluation Methodology, Springer- Verlag, Berlin. [13] Schonberger, R. [14] Solberg, Pap' [15] " J. J. J. (1982). Japanese Manufacturing Techniques. Free Press. (1977). A Mathematical Model of Computerized Manufacturing Systems. presented at the 4th International Conference on Production Research, Tokyo. Stidham, Jr., S. (1985). Optimal Control of Admission to a Queueing System. IEEE Trans. Aut. Cont. AC-30, 705-713. [16] Wein, L. M. (1986). Scheduling Semiconductor Wafer Fabrication. Tech. Report for Semiconductor Research Corporation. [17] Wein, L. M. (1987). Asymptotically Optimal Scheduling of a Two-Station Multiclass Queueing Network. Ph. D. Thesis, Department of Operations Research, Stanford University. [18] Whitt, W. (1971). Weak Preemptive- Resume Discipline. Convergence J. Appl. Prob. 34 Theorems 8, 74-94. for Priority Queues: 6i I 7 065 \o M(.e.« Date Due FEB 19 Bab tv Lib-26-67 3 TDflO DOS BSD T44