liilllllili

advertisement
liilllllili
HD28
.M414
^M"
ALFRED
P.
WORKING PAPER
SLOAN SCHOOL OF MANAGEMENT
OPTIMAL CONTROL OF A TWO-STATION
BROWNIAN NETWORK
Lawrence M. Wein
Sloan School of Management
Massachusetts Institute of Technology
Cambridge, MA 02139
#2015-88
MASSACHUSETTS
INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASSACHUSETTS 02139
OPTIMAL CONTROL OF A TWO-STATION
BROWNIAN NETWORK
Lawrence M. Wein
Sloan School of Management
Massachusetts Institute of Technology
Cambridge, MA 02139
#2015-88
M.I.T. LIBRARIES
OPTIMAL CONTROL OF A TWO-STATION BROWNIAN NETWORK
Lawrence M. Wein
Abstract
Harrison
[5]
has introduced a model called a Brownian network, which approximates
a multiclass queueing network with dynamic scheduling capability. In the present paper,
a particular Brownian network control problem
is
considered that approximates a math-
ematically intractable scheduling problem for a two-station multiclass queueing network.
A
reformulation of the Brownian network control problem
ming
is
is
solved.
Linear program-
used to reduce the reformulated problem to a singular control problem for a
one-dimensional Brownian motion.
The
objective of the singular control problem
is
to
minimize the long-run expected average holding cost of the controlled process incurred per
tmit of time, subject to constraints on the long-run expected average
amount
of control
exerted.
A'ev words.
Brownian motion, stochastic
control, singular control, queueing networks.
December 1987
OPTIMAL CONTROL OF A TWO-STATION BROWNIAN NETWORK
Lawrence M. Wein
Introduction. This paper
1.
is
part of a research effort to develop effective schedul-
ing rules for multiclass queueing networks.
Harrison
[5]
has introduced a type of crude
but relatively tractable stochastic system model called a Brownian network.
A
Brownian
network approximates a multiclass queueing network with dynamic scheduling capability
if
the total load imposed on each station in the queueing network
to each station's capacity. Hence,
is
approximately equal
we can approximate a dynamic scheduling problem
for
a queueing network by a dynamic control problem for a Brownian network.
We consider
a control problem for a Brownian network that approximates a particular
scheduling problem for a two-station multiclass queueing network.
scheduling problem, which
is
priority sequencing decisions,
The queueing network
described in the next paragraph, has both input control and
and appears
to be mathematically intractable. In this paper,
the approximating Brownian network control problem, which appears in equations (1.1)(1.7), is reformulated,
found
in the present
and the new formulation
paper was interpreted
in
is
solved exactly. In
Wein
[10],
the solution
terms of the original queueing system. This
interpretation yielded an effective scheduling rule for the queueing network scheduling
problem.
from
A
simulation study in Wein
this analysis
appear in the
[10]
showed that the resulting scheduling
outperforms conventional input control and priority sequencing rules that
literature.
The queueing network scheduling problem
is
many
factories.
different
customer
encountered in
stations
and
rule derived
K
is
motivated by a scheduling problem that
Consider a queueing network with two single-server
classes.
Customers of
1
class k
=
l,...,/\
require service
and
at a specific station s{k)
random
x'axiables
with
finite
their service times axe independent
mean
k customer turns into a class
probabihty
1
—
Yli^=\
-P<.j<
and variance
5|.
Upon completion
P =
all
is
previous history.
We
assume that the
Because the number of classes
an endless
Each customer
line of
in the line
is
all
allowed to be
almost perfectly general.
The scheduling problem incorporates input and sequencing
is
KxK
(Pkj) has spectral radius less than one, so that
will eventually exit the system.
arbitrary, this routing structure
there
of service, a class
customer with probability P^j and exits the system with
j
independent of
Markovian switching matrix
customers
rrik
and identically distributed
decisions.
We
assume
customers who are waiting to gain entry into the network.
has an exogenously specified class designation.
These
class
designations are such that, over the long-run, the proportion of class k customers released
into the system
is
where
g^,
as the entering class mix.
N
—
{N{t),t
>
0},
system up to time
The input
where N{t)
t.
—
Xlfc=i Ik
is
1-
The
=
(qk) will
be referred to
decisions are to choose a non-decreasing process
the cvunulative
Thus the input
vector q
number
of customers injected into the
decisions essentially allow full discretion over the
timing of the release of customers into the system, but do not allow for the choice of which
class of
customer to
The sequencing
tomer to process
at
inject.
decisions consist of choosing, at each point in time, which class of cus-
each server in the network. Preemptive resimie scheduling
so that service of a customer
priority
customer arrives
approximation that
have an
effect
It is
is
may be
interrupted at a particular station
at that station.
Due
higher
Brownian
employed here, the assumptions made regarding preemption do not
assimaed that a holding cost c^
in the
is
incurred for each
queueing network. Also, there
is
analysis.
iinit
of time that a class k
a specified lower boimd A on the
long-run average expected throughput rate of the queueing network.
of a queueing system
allowed,
when a
to the rather crude nature of the
on the scheduling policy that emerges from the
customer spends
is
is
the
number
The throughput
rate
of customer departures from the system per unit of
2
Our queueing network scheduling problem
time.
is
to choose the input
and sequencing
decisions so as to minimize the long-run average expected holding costs incurred per unit
of time, subject to a lower
bound constraint on the long-run average expected throughput
rate.
Since this queueing network scheduling problem appears to be mathematically
hope
tractable, the best
for further progress
more tractable models, such
appears to be in the analysis of cruder,
Brownian network model.
as the
in-
anced heavy loading, a Brownian network approximates a
Under conditions
of baJ-
queueing network
multiclciss
with dynamic scheduling capability. To state these conditions more precisely, we define
the two-vector p
stations.
m =
The
=
{p,)
be the relative server utilizations, or
values of pi and p2 can be
traffic intensities, for
computed from the switching matrix P, the vector
mix
[nik] of expected processing times, the entering class
q
=
(qk)
average throughput rate A, as will be shown later in the introduction.
z
=
1, 2.
As a canonical example, one may think of
=
pi
p2
=
.9,
and the
specified
The balanced heavy
<
loading conditions assume the existence of a large integer n such that
for
the two
\/n{l
— p,) <
which case n
in
=
1
100
satisfies this condition.
Under such conditions, the scheduling problem described above can be approximated
by a dynamic control problem
Brownian network model,
for a
Brownian network.
see Harrison
to determine the data of the
[5].
For a
full
development of the
Only enough information
Brownian network
in
will
be given here
terms of the original queueing network
parameters. Also, most of Harrison's notation will be retained for ease of reference. The
Brownian network control problem
the set-up that will be adopted.
assumed there
is
is
a measurable
a given
(i7,
mapping of
the real line R, Ft
=
is
stated in the next paragraph, but
When we
F, Ft,
A',
say that
P^), where
Q,
into
C(R), which
(7{X{s), s
<t)
\s
probability measures on
17
X
(f),
is
is
F)
we describe
a {p.,a^) Brownian motion,
is
a measurable space,
it is
X = X{u!)
the space of continuous functions on
the filtration generated by
such that the process {X{t),t
3
first
>
0}
is
X, and P^
is
a family of
a Brownian motion with
drift
variance
/i,
If }'
with Pj.
we say
More
a"^
=
,
and
{}'(/)'
^
Y
that the process
generally,
process A'
when
we
Y
left
It
0}
is
non-anticipating with respect to the Brownian motion
a process that
is
Y
one process
adapted to the coarsest
Also, a stochastic process
have
>
will say that
is
Let Ei be the expectation operator associated
initiEd state x.
is
said to be
is
Ft -measurable
is
<
>
0,
then
X.
non-amticipating with respect to another
with respect to which
filtration
RCLL
for all
if its
X
is
adapted.
sample paths are right continuous and
limits with probability one.
follows from Section 9 of Harrison
sumptions described
earlier, the
[5]
that under the balanced heavy loading as-
queueing network scheduling problem described earlier
can be approximated by the following limiting control problem: choose a pair of
RCLL
processes 1' and 6 (A'-dimensional and one- dimensional, respectively) to
...
minirmze
1
,.
limsup
^
—Ei
f
•^0
subject to the constraints
:
Y
Y,CkZk{t)dt
(1.1)
;t=i
and 6 axe non — anticipating with respect
to A', (1.2)
Z{t)=X{t) + RY{t)-qe{t) foralH>0,
U{t)
= AY{t)
U
non — decreasing with U(0)
is
Z{t)
>
for
for all
alH >
Vim sup ^E[U,{T)]
The process Z
is,
if
Q
is
original queueing system, then the scaled
Z(0 =
is
0,
<
>
(1.4)
0,
=
0,
and
7,
(1-5)
(1.6)
fort
=
1,2.
(1.7)
represents the A' -dimensional scaled queue length process and describes
the state of the system. That
where n
t
(1.3)
the AT-dimensional queue length process for the
queue length process
is
defined by
^, ^>0,
(1.8)
the large integer specified in the balanced heavy loading conditions.
that, as in Harrison
[5],
exactly the
same notation that
is
Notice
used for the scaled processes
is
to
used in the approximating Brownian control problem.
We
retain this notation in order
emphasize the queueing network interpretation of the Brownian network model. The
A'-dimensional process
Y
dimensional process
represents the scaled centered input process.
represents the scaled centered allocation process, and the one-
These two control
processes correspond to the sequencing and input decisions, respectively, in the queueing
network scheduling problem. The scaled process
= -^
Yk{t)
where Tk{t)
is
the interval
[0,t]
problem), and
the cumulative
{Tk
ajt,
F=
^^^-^'
(Y'k) is
>
t
amount of time devoted
defined by
1.9)
0,
to serving a class k customer in
a decision variable in the original queueing network scheduling
is
which
is
defined in equation (1.17), represents the long-run fraction of
the server's active time at station s{k) that must be devoted to class k in order to maintain
The
materiad balance in the network.
0{t}
=
scaled process
Xnt
is
defined by
—
- N(nt)
=^ -,t>0,
(1.10)
\/n
where, as mentioned earlier, A
is
the specified average throughput rate and N{t)
cumulative number of customers released into the network in
The two-dimensional
the two stations.
Thus
U
process
is
U
I{t)
is
[0,t].
represents the scaled cumuJative idleness process for
defined by
U(t)
where
=
:^,
t
>
(1.11)
0,
the cumulative amount of idleness incurred by the server at station
the time interval
[0,^].
the
is
Uand
(For brevity's sake, processes such as Z, Y,
referred to without the adjective "scaled".)
The
K x K input-output
i
in
6 will often
be
matrix
R=
(Rkj)
is
defined by
Rkj=m-HSjk-P:k),
5
(1.12)
where
Sjk denotes the Dirac delta function, meeining that 6jk
otherwise.
The
\
The A'-dimensional
process A'
a
is
(6,
covariance matrix
S =
(Sj/)- Let A
=
system per unit of time
P
number
was assumed to be transient,
/?
=
=
of traffic intensities p
all
{p,)
=
k and 6jk
vector 6
=
defined by
^
drift
=
—
(Sk)
'
^
definitions are
and the
K
x
K
(1.14)
q\,
of class k customers that
it
follows that
=
must depart from the
throughput rate constraint.
{/3k) satisfying
A
Letting C{i) be the set of
j
(A^) be defined by
in order to satisfy the
a unique non-negative A'-vector
is
li
E) Brownian motion, but several
X
so that Xk represents the average
{A,k)
1
otherwise.
0,
needed before we can describe the A'-dimensional
Since
A=
2 x A' resource consumption matrix
=
R is non-singular and
there exists
the flow balance equations
(1.15)
RI3.
customer classes k such that s{k)
=
z,
we
define the two-vector
by
kec(i)
We now
define the A'-vector
a =
(ctjt)
ak
=
by
—
for all
it
e C{i).
(1.17)
Pi
Then the
drift 6
and covariance S of the Brownian motion
X
are
<5=v^(A-i?a)
(1.18)
and
K
^ji
=
^[(^km^'PkjiSji - Pki)
=!
*:
+ akm-'slRjkRik].
(1-19)
Inequality (1.7). which expresses the throughput rate constraint in terms of the cu-
mulative server idleness process
t7, is
the only relationship in the limiting control problem
that does not appear in the Brownian network formulation of
in (1.7)
is
[5].
The two-vector 7 =
(7,
defined by
7.
=
v/^(l-p.).
(1.20)
Inequality (1.7) states that the long-run average fraction of time that server
than or equal to
1
—
p,.
In
Wein
[10], it is
the long-run average throughput rate
The remainder
of this paper
work control problem
(1.1)-(1.7)
is
is
shown that
is
i
this inequality holds
if
and only
if
greater than or equal to A.
orgajiized as follows. In Section 2, the
Brownian
reformulated so that the state of the system
is
idle is less
is
net-
described
by a two-dimensional workload process, rather than a K-dimensional queue length process.
The new formulation
tion to the
will
be referred to as the workload formulation, and
workload formulation that
is
the goal of this paper.
The
it
is
the solu-
control processes in
the workload formulation are (Z, U, 6), which are A'-, two- and one-dimensional processes,
respectively.
programming
In Section 3, linear
is
used to find the optimal process
Z
in
terms of the process U.
In Section 4, the results of Section 3 are
employed
to reduce the
remainder of the
workload formulation to a problem involving singular (or instantaneous) control of a onedimensional Brownian motion.
controller who, at any time,
process B.
That
may
is,
a {fi,a^) Brownian motion
B
is
observed by a
instantaneously increase or decrease the value of the
Holding costs are incurred continuously at a rate h{W{t)), where
controlled process.
No
costs are incurred
when the
the value of the Brownian motion process.
W
is
the
controller either increases or decreases
However, there are constraints on the long-
run expected average amount of control to be exerted in the positive direction and in the
negative direction.
The
objective
is
to
minimize the long-run expected average holding
cost incurred per unit of time.
In Sections 5 through
9,
we
derive the solution to this constrciined singular control
problem. The solution
is
characterized by two constants a and
optimal policy keeps the controlled process
The corresponding optimal
Section
5,
where a <
[a, b]
b,
and the
with minimal
effort.
control functionals are the local times at these boundaries. In
a Lagrangian approach
The
function.
W inside the interval
b,
is
used to move the two constraints into the objective
resulting unconstrained problem has been analyzed
by Taksar
[9],
and
his
results are used in Section 6 to develop sufficient conditions for optimality of the constrained
singular control problem. In Section
and
derived,
pair of
is
zero
in Section 8.
7,
a candidate policy that satisfies the constraints
a solution to the optimahty equations
Lagrange multipliers. The special case where the
treated in Section
is
drift
/i
is
fotind,
of the
which includes a
Brownian motion
and the solution to the workload formulation
9,
is
is
B
summarized
in Section 10.
The Workload Formulation
2.
The
state of the system in the
Brownian network control problem
is
described by the
A'-dimensional queue length process Z, by way of the basic system relationship (1.3). In
this section the
system
is
Brownian network control problem
is
reformulated so that the state of the
described by a two-dimensional workload process.
that the matrix
R appearing
In Harrison
in (1.3) is invertible. Let us define the
matrix
M = AR-\
shown
M = (M,fc) by
Define the two-
W = (Wt) by
W{t)
The workload
it is
(2.1)
This 2 X A' matrix will be referred to as the workload proHle matrix.
dimensional worJcioad process
[5],
= MZ{t),
i
>
0.
(2.2)
process describes the state of the system in the workload formulation. Also,
define the two-dimensional vector v
=
(vi)
V
by
= Mq.
8
(2.3)
It
was shown
Wein
in
that
[10]
p,
where A
is
a positive number that
is
=
v,\
for
2
=
1,2,
(2.4)
a parameter of the original queueing network scheduling
problem.
B=
Define the two-dimensional Brownian motion
B{t)
so that
B
has
drift
two-dimensional
drift
MS
= MX{t),
and covariance
vector
MS =
MY.M^
>
t
(B,)
(2.5)
0,
was shown
It
.
by
—7, where 7 was defined
in
Wein
[10]
that the
in equation (1.20).
Define the workload formulation of the Brownian network control problem as choosing
RCLL
processes Z,
U
and 6
(A'-,
minimize
two- and one-dimensional, respectively) so as to
limsup—
T—>oo
/
J-
subject to the constraints
:
U
U
-^0
Y,CkZk{t)dt
and 6 are non — anticipating with respect
is
Z{t)
non — decreasing with ^(0)
>
for all
t
>
a pair of
control problem
if it
RCLL
satisfies
7.
for?
=
1,2,
for all
t
and
(2.10)
>
(2.11)
0.
processes (1,^) a feasible policy for the Brownian network
equations (1.3)-(1.7) and
(Z, U,d) a feasible policy for the workload formulation
The
(2.8)
0,
(2.9)
MZ{t) = B{t) + U{t)-ve{t)
call
=
to B, (2.7)
0,
hm sup -E[U,{T)] <
Let us
(2.6)
k=l
call
a triple of
if it satisfies
RCLL
processes
equations (2.8)-(2.11).
following proposition allows us to analyze the workload formulation of the limiting
control problem, rather than studying problem (1.1)-(1.7) directly.
Proposition
2.1.
feasible policy (Y, 6) for the
Brownian network control prob-
feasible policy {Z, U, 0) for the
workload formulation and every
Every
lem yields a corresponding
feasible policy (Z, U, 6) yields a corresponding feasible policy (K, 9).
9
Proof. Let {Y.O) be any
feasible policy for the
Brownian network control problem,
thus yielding an associated triple {Z,U,9) from equations (1.3)-(1.4). Premulti plying (1.3)
by the workload profile matrix
M
and using
(2.11). Since equations (2.8)-(2.10) are satisfied, the triple
Reversing the argument, suppose that {Z,U,0)
formulation.
Then
is
{Z,U,d)
(2.5) gives equation
is
a feasible policy.
a feasible policy for the workload
constraints (1.5)-(1.7) are satisfied and, using equation (1.3), one can
define a corresponding control process 1'
by
= R-^[Z{t)-X{t) +
Y{t)
The associated queue length and cumulative
cisely the processes
and
(1.4), (2.1), (2.3)
Z and
q9{t)],
foralU>0.
(2.12)
and
idleness processes in (1.3)
(1.4) are pre-
U, respectively, that we started with. Thus the pair {Y,d)
is
a
feasible policy. |
We now
R
address conditions (1.2) and (2.7) on non-anticipativeness. Since the matrix
invertible,
is
motion
X
if
the control process
in the limiting control
problem, 6
workload formulation, 6
more information
will
is
is
non-anticipating with respect to the Brownian
problem, then the control process
with respect to the Brownian motion
in the limiting control
Y
is
B
in the
non-anticipating
non-anticipating with respect to A", whereas in the
non-anticipating with respect to B.
be available to the controller of 6
will
By
(2.5),
it
can be seen that
in the limiting control
be seen in Section
the workload formulation remains unchanged whether 6
X
is
workload formulation. However, notice that
than in the workload formulation. However, as
to
U
is
4,
problem
the solution to
non-anticipating with respect
or with respect to B.
The
(2.11).
rest of this
In
Wein
paper
[10],
is
devoted to finding a solution to the workload formulation (2.6)-
the solution to the workload formulation
is
interpreted in terms of
the original queueing network in order to obtain effective scheduling rules for the queueing
system.
10
3.
Solving for
Z
Terms of
in
TJ
Z
In this section the optimal process
terms of the control process
and
(2.10).
program
at
Then
V
V
Suppose we are given a process
.
Z and
the optimal
each time
workload formulation
in the
is
expressed in
that satisfies (2.7), (2.8)
d processes are found by solving the following linear
t:
K
min
Y
CkZk{t)
(3.1)
A'
subject to
^
MxkZk{t)
+
Vi9{t)
=
Bi{t)
+
Ui{t)
(3.2)
MikZkit)
+
v^eit)
=
B,(t)
+
U2it)
(3.3)
Zk{t)
>
=
(3.4)
k=l
K
Y
k=l
At each time
Since
it
t,
this linear
would be easier to analyze a
define the dual variables 7ri(<)
time
program may have a
and
linear
7r2{t)
for k
0,
l,...,K.
different set of right
program with a
hand
side values.
static constraint set, let us
and state the duaJ linear program
to
be solved
at
t:
max
subject to
[B,it)
+
A/u7ri(^)
lh(t)]7r,{t)
+
ViTTyit)
Before analyzing the dual
ance process
LP
When
\Vi{t)
[B2(t)
M2jt7r2(i)
<
+
=
V2n2it)
Ck
+
U2(t)]7T2it)
for
A;
=
1,
...,
(3.5)
A'
(3.6)
0.
(3.7)
(3.5)-(3.7), define the one- dimensional
workload imbal-
W by
W{t)
At each time
+
t,
=
p2W,{t)
-
p,W2{t),
t>0.
W(t) measures how imbalanced the workload
and ^2(0 ^^^
^'^
the
same proportions
11
(3.8)
is
between the two
stations.
as the long-run utilizations pi
and
p2,
=
respectively, then \V{t)
0.
The
process
W becomes positive (negative, respectively) when
the workload in the system becomes imbalanced towards station
By
and
(2.2), (2.4), (2.11)
=
Wit)
Returning to the dual
and use
-
LP
(station 2, respectively).
workload imbalance process can also be expressed as
(3.8), the
p2B,{i)
1
piB2{t)
+
P2lh{t)
-
piU2(t),
>
t
0.
(3.5)-(3.7), use (3.7) to eliminate TT2{t)
(3.9)
from the problem
(2.4) to substitute the utihzation levels p for the vector v to obtain the one- variable
LP
max
——TTiit)
subject to
c'j^^{p2Mik
The dual LP
terms of
IV'(/).
(3.10)
-
piM2k)T^i{t)
<
for
p2
A:
=
(3.10)-(3.11) allows us to express the solution Z{t) of
Without
loss of generality,
assume that the
(3.11)
1, ...,/\.
classes k
=
1,
LP
...,
(3.1)-(3.4) in
A' are ordered
so that
argmax c'^\p2Mik - pi^ik) =
1
(3.12)
2.
(3.13)
k
and
argmin
c';;^ip2Mik
- PiM2k) =
k
By complementary
must
satisfy Zk{t)
satisfy Zk{t)
=
when W{t) =
slackness,
=
iiW{t) >
for all classes k
for all classes k
^
2.
^
0,
1.
then the solution Z{t) of the
Similarly,
if
W(t) <
Notice that the solution
is
0,
Zkit)
LP
(3.1)-(3.4)
the solution must
—
for k
=
1,
...,
K
0.
When W{t) >
0,
by
(2.2)
and
(3.12),
W2{t)^^W,it).
(3.14)
Mil
Combining
this
with the definition (3.8) of the workload imbalance process
Ml
^«(*)=-T7
P2Mn -^^-T;-^^'(^)
Pi^hi
12
for
z
=
1,2.
W,
one obtains
(3.15)
Therefore,
when
\V{t)
>
the solution Z{t) to the
0,
LP
r
mt)
Zjt(0=
Similarly,
when
\V{f)
<
0,
(3.1)-(3.4)
I,
-1
^^^^^"'"^'"'
'
0,
in-^i.
'
~~
•{
the solution
is
(3.16)
is
^C)
if
[ 0,
By
jfc
7^ 2.
the balanced heavy loading conditions on the original queueing system stated in
[10],
and by conventions
(3.12)-(3.13),
it
Wein
follows that
P2M11 -piA/21 >
(3.18)
and
P1M22 - P2M12 >
and therefore that the solution Z{t) >
Z
is
constructed from (3.16)-(3.17) for
depend on the control process
workload imbalance process
4.
t
>
0.
>
0.
The optimal queue length
process
Notice that this optimal process does not
U
only through the
as described in equation (3.9).
The Resulting Control Problem
In this section
Z
all
t
and depends on the control process
6
W,
for all
(3.19)
0,
it
is
shown that by substituting
for the
optimal queue length process
using (3.16)-(3.17), the workload formulation (2.6)-(2.11) can be reduced to a problem
of finding the optimal two-dimensional cumulative idleness process U.
dimensional Brownian motion
B
B[t)
Define the one-
by
=
p2B,{t)
13
p,B2{t),
<>0.
(4.1)
Recalling that the two-dimensional Brownian motion
follows that
B
has
drift
a'^
has
—7, by equation
drift
(1.8)
it
where
fj.,
/i
and has variance
B
= g^MEM^g^
=
V"(Pi -P2),
(4.2)
where
Q
P2
=
(4.3)
L-PiJ
Notice that
when
the system
is
perfectly baianced
case will be discussed separately in Section
R
also define the processes
(i.e.,
then,
9; until
=
then
/i
=
0.
assume that
/i
is
nonzero. Let us
pi
^2),
This driftless
and L by
R{t)
=
p2lh{t),
t>0
(4.4)
L{t)
=
p,U2{t),
t>0.
(4.5)
and
The
letters
movement,
R
and L are used
to denote cumulative
respectively, exerted
straints (2.10) determine
amounts of rightward and leftward
by the controller on the Brownian motion B. The con-
an upper bound on the long-run average expected amounts of
rightward and leftward controls exerted. Define a policy as a pair of non-decreasing
R
processes {R, L) such that
Ex[L{t)] are finite for each
ciate with the policy
t
and L are non-anticipating with respect to B,
>
and each
initial state x,
{R,L) the controlled process B{t)
equations (2.2), (2.11), (4.1) and (4.4)-(4.5),
it
+
and R{0)
R{t)
=
B{i)
it
+ R{t)-L{t)
to the right (left, respectively),
=
L{0)
L{t) for
all
i
>
0.
0.
and
Asso-
From
for all
<
>
is
is,
0.
Thus, in the resulting control problem, the Brownian motion
pushing
Ej-[R{t)]
can be seen that the controlled process
precisely the workload imbalance process defined in (3.8); that
W{t)
—
=
RCLL
(4.6)
B
can be controlled by
which increases (decreases, respectively) the
14
Now
workload imbalance of the system.
hi
define the positive coefficients hi
and
/?2
by
C2
=
(4.7)
P1M22 -P2A/12
and
/l2
If
=
Cl
(4.8)
P2M11 - P1M21
the optimal queue length process constructed from (3.16)-(3.17)
is
used, then one obtains
K
J2ckZ,{t)
=
h{Wit)),
(4.9)
k=l
,vhere
— hix,
if
X
/z2X,
if
X
h{x)
Thus
for
problem
<
>
(4.10)
0.
each policy {R,L), costs are incurred at a rate h{W{t)). The resulting control
is
to find a policy
{R,L) to
minimize
limsup— .E^
T—'oo
/
limsup
h{W{t))dt
— £'2.[/?(T)]
T— 00 T
<
a solution
d{t)
is
-
(4.12)
P2
'""-''^'"
(4.13)
.
-t
P\
-
P2
{R,L) to the constrained control problem (4.11)-(4.13) can be found,
then the optimal 9 process
where Z{t)
P2(l -Pi)a^
P\
Umsupi£,li(r)l <
T^oo
(4.11)
Jo
-^
1
subject to
If
0,
=
is,
via equations (3.2)
v-^[Bi{t)
+
^
and
(4.4),
K
-YMikZkit)]
for all
t
>
(4.14)
0,
defined in (3.16)-(3.17). Notice that the optimal 6 process would be expressed
by (4.14) whether 9 was non-anticipating with respect to the two-dimensional Brownian
motion
paper
is
B
or with respect to the A'-dimensional Brownian motion
X
devoted to finding a solution {R,L) to problem (4.11)-(4.13).
15
.
The
rest of this
5.
The Lagrange Multiplier Method
Instead of analyzing the constrained control problem (4.11)-(4.13) directly, a La-
grangian approach will be used.
In particular,
and
r
let
/
be the Lagrange multipliers
corresponding to constraints (4.12) and (4.13), respectively. The multipliers
interpreted as the per unit costs of exerting the controls
with a policy
(i?,
and L,
and
/
are
respectively. Associate
L) the Lagrangian cost function
k{x)
With the
R
r
=
—Ei f
Jo
limsup
hiWit))dt
-\-
rR(T)
+
IL{T)
(5.1)
aid of the following theorem, the constrained problem (4.11)-(4.13) can be solved
by making an appropriate choice of multipliers and then minimizing the Lagrangian cost
function. This idea
Bellman
[1]
is
not at
and Everett
[3]
all
new
in the context of control theory; readers are referred to
for early applications to
dynamic programming problems. The
problem of finding a policy (R, L) that minimizes k{x)
will
be referred to as the Lagremgian
problem.
Theorem
is
5.1.
Suppose
and
r
I
are nonnegative real
numbers and suppose {R* L*)
,
a solution to the Lagrangian problem. Furthermore, suppose
1
P2(l
J-
Pi
rr,*/T^M
hmsup -Er[R
{T}] =
r— oo
-
Pl)^
(5.2)
P2
and
1
rrvTM
hmsup -Er[L
(T)] =
T— oo
Then (R*,L')
is
Proof. Let
P\
J-
-P2)m
- P2
(5.3)
.
a solution to the constrained control problem (4.11)-(4.13).
W*
be the controlled process under the optimal policy (i?*,L*), so that
W*{t)
Then
Pi(l
for all policies
=
B{t)
+
R*{t)
-
L*{t)
for all
^
>
0.
(5.4)
{R,L)
limsup— Ei
T—oo
J-
f
h{W* {t))dt + rR*{T) + lL*iT)
Jo
16
<
limsup
—E^
/
Jo
h{W{t))dt
+ rR{T)^lL{T)
(5.5)
Rearraxiging terms yields
1
limsup —Ej:
r— oo
<
h{\V\t))dt
1
J-
imsup-E^
T-^oo T
h{W{t))dt
/
Jq
+ r\imsup^E,[RiT)-R*iT)]
'
T—'oo
+
Since (5.6) holds for
all policies, it
r— oo
(5.6)
J-
certainly holds for the subset of policies such that
< ei^Llfllt
Pi - P2
limsup iE.[/?(r)]
T— oo
llimsup ^E,[L{T)-L*{T)].
J-
(5.7)
and
limsup
r— oo
However, by equations
-Ex [i(T)] <
i
(5.2)-(5.3), the
P\
-
(5.8)
.
p2
terms
1
rlimsup-£;x[i?(r)-i?'(r)]
T— oo
(5.9)
-^
and
/iimsup;^£;,[L(r)-L*(r)]
T^oo
(5.10)
J-
are nonpositive for this subset of policies.
(4.13).
Hence the policy
(i?*,
L*)
is
a solution to (4.11)-
I
Notice that, at this point, the Lagrange multipliers r and
to invoke
Theorem
5.1,
we need
to find a pair of multipliers (r,
/
/)
Eire
not known. In order
and a solution
(i?,
L) to
the Lagrangian problem that simultaneously satisfy (5.2)-(5.3).
Given
control of
r
and
/,
the Lagrangian problem
is
a problem of singular (or instantaneous)
Brownian motion. The name stems from the
fact that the state of the controlled
process can be instantaneously changed by the controller and, as a result, the optimal
17
control processes
R
R
has meeisure zero). Such problems have been the subject of
and L
see, for
increcise
L
are continuous but singular
example, Harrison and Taksar
and Taksar
for
ajid
[9].
In particular, Taksar
[9]
Karatzas
[6],
(i.e.,
[7],
the set of time points at which
much
study;
Shreve, Lehoczky and Gaver
[8],
has solved the Lagrangian problem defined by (5.1)
a convex holding cost fimction h that
is
finite
everywhere or
on a
infinite
finite interval.
Since the holding cost function h defined by (4.9)-(4.10) satisfies Taksar's requirement,
his results will
be used
next section, where sufficient conditions for optimality are
in the
developed for the constrained problem (4.11)-(4.13).
The Optimality Equations
6.
The
goal of this section
is
to present sufficient conditions for
the constrcdned problem (4.11)-(4.13).
an optimal solution
First, the sufficient conditions for optimality will
be stated for the Lagrangian problem with arbitrary nonnegative ntunbers
latter conditions were derived in
to
Taksar
The
[9].
r
and
/.
These
sufficient conditions for optimality of
problem (4.11)-(4.13) are found by combining Taksar's
results
with Theorem
5.1.
Let us start by defining a class of policies that include the optimal policy. Hcirrison and
Taksar define a special class of policies
brings the controlled process
keeps
it
process
[6]
called control limit policies. This type of policy
W within a certain interval
within this interval while exerting a
W under such a policy
interval [a,b].
where
in
it is
A
is
minimum amount
the Brownian motion
referred to as a regulated
specifically,
B
instantaneously and then
of control.
The
controlled
reflected at the endpoints of the
complete anedysis of this controlled process
is
contained in Harrison
Brownian motion. The control functionals
under a control limit policy are the local times of
More
[a, b]
W
at the points a
the control limit policy {R,L) on the interval [a,b]
R{t)= sup
[a
- B{s) +
0<a<«
18
L{s)]^
and
is
b,
R
[4],
and L
respectively.
defined by
(6.1)
and
=
L{t)
The existence and uniqueness
sup [B{s)
0<J<t
+
R{s)
-
b]+.
(6.2)
of such functionals was estabhshed in Harrison and Tciksar
[6].
We now
present sufficient conditions for optimality for the Lagrangian problem (5.1).
Suppose that an optimal solution to the Lagrangian problem
fovmd and
is
corresponding value of the Lagrangian cost function k{x). Thus g
cost per unit time (independent of initial state)
and
standard with problems using average cost criterion,
the optimal policy
when the
when the
incurred under the optimal policy
V
has a
first
The
and second derivative and
V
is
W
is 0.
It
is
x minus the cost
will
be assumed that
[9].
a solution to
Min {rV{x) + h{x)-g,r + V'{x)J-V'{x)} =
V{0)
and suppose there
exists
an interval
[a, b]
TV(x) + h{x)
Suppose that
for
some constants
A'l
V'{x)
— —r
V'{x)
=
is
0,
1
for a
<
X
for X
<
a,
ior
<
6,
x>h.
+
N2h{x).
the minimal average cost for the Lagrangian problem and the control Umit
on the interval
[a, b] is
an optimal
(6.4)
(6.5)
(6-6)
(6.7)
A^2
V{x) < Ni
Then g
=
(6.3)
such that
-g =
and
is
be referred to as the potential function.
will
Suppose {g,\'{x))
6.1.
W
initial state of
following proposition was proved in Taksar
Proposition
the minimal average
V{x) be the cost incurred under
the controlled process
initial state of
g be the
be referred to as the gain. As
will
let
is
let
policy.
19
(6.8)
poUcy
We
axe
now
Theorem
in a position, using
5.1
and Proposition
6.1, to state sufficient
conditions for optimality of the constraSned problem (4.11)-(4.13).
Theorem
Suppose
6.2.
Mm
{TV{x)
=
V{0)
R
r,
/,
a, 6) satisfy
+ h{x)-g.r + V'{x),l-V'{x)} =
is
V'{x)
= -r
V'{x)
=
=
g
<
for X
for X
I
for a
>
<
<
X
(6.12)
(6.13)
6,
=
MIH^^
limsnp ^EAL{t)]
=
Mlz.^,
pi
J-
smooth
(6.8).
-
fit
and
poUcy on the
that the function
is
and
often imposed
Benes, Shepp and Witsenhausen
[2].
V
interved
G C"^
when
(6.14)
(6.15)
P2
interval [a,b].
Then the optima] poUcy {R,L)
the control hmit
(6.11)
i,
a,
lim sup I; E^[Rit)]
The requirement
(6.9)
(6.10)
and L are the control Umit pohcies on the
(4.11)-(4.13)
0,
0,
+ hix)-
and satisEes condition
ple of
V{x),
TV{x)
r— oo
where
{g,
is
Suppose
to the constrained
V
G C"^
problem
[a, 6].
sometimes called the heuristic
princi-
solving control problems; see, for example,
In the next two sections a solution,
which
is
denoted
(g*,F*(x),r*,/*,a*,6*), to equations (6.9)-(6.15) will be found.
7.
The
A
Candidate Policy
first
step towards a solution to equations (6.9)-(6.15)
endpoints a* and
b*
.
is
to find candidate interval
These values lead to a candidate pohcy {R*,L*) defined by
R*{t)= sup
[a*
-B{s) +
0<3<t
20
L'{s)]-^
(7.1)
and
L'[t)= sup [B{s)
•1 +
+ R'{s)-b']
(7.2)
0<s<t
In order to find these endpoints, the following
motion
first
be needed. See Chapter 5 of Harrison
will
define u
Lemma
Then
[4]
concerning a regulated Brownian
for a derivation of this result. Let us
by
V
(6.1)-(6.2),
lemma
7.1.
and thus
Suppose
B
is
a
{ii.cr'^)
W =B+R—L
is
2fx
=
(7.3)
Browman
motion,
R
and L are as defined
a regulated Brownian motion on the interval
in
[a, b].
W has a fruncafed exponential steady state distribution with density function
ue,i/(i-a)
p{x)
giy(6-a)
_
<
for a
X
<
(7.4)
b.
I
Furthermore,
^EJR(t)] = -nr^,
lira
(7.5)
and
lim
The candidate
among
^EMt)] =
interval endpoints will
^
,,
(7.6)
,
be derived by solving the following problem:
the class of control limit policies, find the policy {R,L) to
minimize
limsup— £^i
h{Wii))dt
T —oo
J"
Jo
(7.7)
-^
1
subject to
limsup
r-oo
— £'i[i?(T)] =
T
Iimsup-£i[i:(r)]
T-oo 1
That
is,
we
restrict the policy
find the values of a
and
(R,L)
b that solve
to take the
problem
21
,
Pi
-
P2
Pi
-
P2
=
.
form defined
(7.7)-(7.9).
in (6.1)-(6.2),
(7.8)
(7.9)
and then
Notice that problem (7.7)-(7.9)
is
problem (4.11)-(4.13) with the inequality constraints (4.12)-(4.13)
just the constrained
replaced by conditions (6.14)-(6.15), which
solving problem (7.7)-(7.9)
endpoints a and
Lemma
(7.8)-(T.9) if
(7.6)
and
A
and only
control limit policy
The
b
first
step in
a).
(7.9).
|
[a, 6] satisfies
constraints
if
result follows
—
{R,L) on the interval
The
result
.-Mn(^41^).
(7.10)
by equating the right side of equations (7.5) and
(7.8)
and
can also be proved by equating the right side of equations
Since the cost function h defined in (4.9)-(4.10)
at zero,
The
b.
7.2.
solving for (6
equality constraints.
to express the constraints directly in terms of the interval
is
6-a =
Proof.
Eire
is
convex and achieves
its
minimimi
the optimal control limit policy for problem (7.7)-(7.9) will have endpoints a and
<
that satisfy a
(7.7)-(7.9)
<
6.
By Lemma
8.2,
the search over values of a and
b
to solve
problem
can be reduced to a search over values of a such that
(7.11)
Proposition
7.3.
The control
-1,
limit policy with interval endpoints
(^1
/
+^2)^2(1 -Pi)
.-,^^
\
and
\hip2(lsolves
problem
Proof.
J h{x)p{x)dx
pl)
+
h2Pi{l
- P2)J
(7.7)-(7.9).
By equation
for
any control
(7.4),
the objective function (7.7) can be expressed as
limit policy
on the interval
[a, b].
By Lemma
7.2,
the optimal
value of the endpoint a can be found by minimizing the function /(a) over the interval
22
(7.11).
where
/(a)
=
h{x)p{x)dx.
/
(7.14)
Ja
Using (7.10) to substitute
for (6
—
and using
a) in (7.4)
(4.10) to substitute for h yield
7"-^;' (/°-/.,.e-'-'&
/(.)=
(pi
-
C2(' -Pi) /
V
/
+
Ja
P2)
X
,
/i2Te''('-<'>dx).
/
(7.15)
Integration by parts yields
/(a)
Setting /'(a)
=
=
+
(t.(/>i-p2))~'('p2(l-pi)(/ii
-
Pi)
Pi)
-
+
i^a(^i/!>2(l
-
^iP2(l
0,
-
+
-
/'2pi(l
/J2P1(1
-
/i2)e-''"
P2))
+
P2)
/i2Pi(l
-
P2)ln f ^'?!"^'! ))VP2(1 - P\)J J
(7.16)
we obtain
-vp2{l - pi){h,
+
h2)e-''''
+
t^(;iiP2(l
-
Pi)
+
/?2Pi(l
-
P2))
=
(7.17)
0.
Solving for the endpoint a gives equation (7.12) and substituting this value in equation
(7.10) yields (7.13).
Taking the second derivative of / with respect to a yields
/f'i(a)
^
=
^-P2(l-Pi)(/^i+^2)e-°
(Pi
which
is
strictly positive
Thus a candidate
^
-,
by definitions
(4.2)
,„^_(7-18)
,
-P2)
and
(7.3),
thus completing the proof.
|
been found such that the candidate policy
interval [a*, 6*] has
(i?*,L*) defined by equations (7.1)-(7.2) satisfy conditions (6.14)-(6.15). Notice that, as
expected, a*
=
date gain g* can
in (5.1)
in the limiting case
now be
/?2
=
0,
and
similarly, &*
=
when
/ii
=
0.
A
candi-
derived by evaluating the Lagrangian cost funcion k[x) defined
under the candidate
policy.
23
Corollary
k{x)
Under the candidate policy (R*,L*), the Lagrangian
7.4.
cost function
is
\hiP2{^-
V
+
-
h2Piil
{hi
P2)ln
-
hip2{l
(Pl
By
Proof.
-P2)
and
=
Substituting for {b*
expression (7.19).
8.
—
a*)
•^(«*)
pi)
+
-
P2)
/j2Pi(l
-
P2)
(7.14), the value of the
under the candidate policy {R*,L*)
3'
/J2)pi(l
p2)J
-P2)
(Pl
(7.5)-(7.6)
+
+ h2pi{l-
pi)
Lagrangian cost function
(5.1)
is
+ %.(6.-t).i + \_J,,..a-y
and a* using equations
(7.10)
and
(7-20)
(7.12), respectively, yields
|
Solution to the Optimality Equations
In this section a solution (g* ,V*{x),r* ,l*,a* ,b*) to equations (6.9)-(6.15) will be
In the previous section candidate endpoints a*
presented.
and
b*
were derived and a
candidate gain g* was found in terms of the Lagrange multipliers r and
We
start this section
tion (6.12)
obtain,
by finding a pair of candidate multipliers (r*,/*).
and the assimiption that
upon
/.
setting V'{a*)
= —r
V
G C^, we
set
V"{a*)
=
By
condi-
in equation (6.11) to
and using (4.10) and (7.12) to substitute
for h
and
a*
respectively,
g^-nrSimilarly, setting V'{b*)
g
=
(^i + ^2)/32(l - Pi)
-ll„ f
''
h,v-^
In
\\'
^^^^^
\hip2{\ - pi) + h2pi{l - P2)
= and
pl
/
+
=
V"{h*)
In
in
(^1
-1,^ /
h2U-'
•
,
,
equation (6.1) yields
+
;/
\hip2[\ ,
24
(8.1)
^^2)^1(1
-
P2.
[;'
+ h2p\{l pi)Vr^
.
P2).
(8.2)
Equating the right sides of equations
r
+
,
,
I
(
'
In
fiu
\hip2{l
flU
Lemma
8.1.
Proof.
By
(hr
.hiP2{l
Lef
r
+
/
be a^ in
-
-
h2)p2il
pi)
+
^-'
P^)
-
we
+
r
h2pi{l
+ >
P2)
- p2)j
'
0.
1
see that r
+ >
if
/
and only
if
jh, + h2)p,{l - P2)
f
V'
\h1p2il - pi) + h2pi{l - P2)
._..
{p2{l -pl))''>+''2 (/9i(l -^2))"'+"=
(8.5)
^
- P2)J
h2pl{l
pl)
Then
(8.3).
rearranging terms in (8.3),
+
(S.2) gives
{h^+h2)p2{\-pi)
\hip2{l - pi) + h2Pi{l -
h,
=
and
(S.l)
or, equivalently,
—
hi
Since ln(x)
It
is
+
>
h2
an increasing concave function,
turns out that there
is
+ >
follows that r
it
/
0.
|
not a unique pair of multipliers (r*,/*) that satisfy the
optimality equations (6.9)-(6.15); any nonnegative pair that satisfy equation (8.3) will
suffice.
To
this end,
choose any A* G
The
[0, 1].
pair of candidate multipliers (r*,/*) are
defined by
(^1
A*(_^ln'^
pu
\h1p2il
^2
(hi
^^
pu
f
+^2)p2(l - Pi)
-
pi)
+
h2Pi{l
-
P2)
+h2)pi{l - P2)
UiP2(l-pi) +
II
.gg.
/i2Pi(l-P2),
and
(^i+/^2)P2(l-Pi)
\hiP2{\ - pi) + h2pi{\-
r = (l-A*)(-^ln^
pu
(^1+^2)P1(1-P2)
\hip2{\ - pi) + h2P\{l -
^Mnf
pu
With the candidate
multipliers (r*,/*) in hand,
it
is
\\
P2)
)
now
P2)
.g^.
)'
possible to calculate the
candidate gain g* directly in terms of the problem data. This can be done in three ways:
25
/*
substituting r* for r in equation (8.1), substituting
both r* and
/*
for
in
/
may
Readers
in equation (7.19) of Corollary 7.4.
equation (8.2), or substituting
verify that all three
ways
lead to the result that
-ii„
=(A*-l)/ii//-Mn
g*
(^1
I
-
pi)
+
(^1 +^2)/>l(l
-
hiP2{l
-1,_
,.L
In
+ X*h2U-^
To complete the candidate
V*
function
is
required.
^iV'{x)
The
/
7^
)/
,
+^2)^2(1 - Pi)
V*'{x)
=
P2)
P2)
.7'
r
(8.S)
.
solution of the optimality equations, a candidate potential
With g*,a* and
+ ^a^V"(x) +
6*
now
chosen, condition (6.1) becomes
-g* =
h{x)
The solution V*'
<
for a*
solution to this second order differential equation
Proposition 8.2.
-
/l2Pl(l
<
x
b*
(8.9)
given in the following proposition.
is
to (8.9) is
— + C*t-"^ - ^e-"^
/
<x<
for a
h{y)e''^dy
h\
(8.10)
where
^.
_
h,fi-'iy-'{h,+h2)p2{l-Pi)
-
hxp2il
Proof.
The
pi)
+
h2pi{l
=
C
V*'{a*)
is axi
unknown
= —r* and
h2)p2{l
To
+
pi)
-
Pi)
-
h2pi{l
.
'
P2)
'
is
(^^
constant.
+
\hip2{l-
P2)
— + Ce-"' - ^e-"""
P
where
-
solution V*' to (8.9)
V*'{x)
jh,
.^^f'
f h{y)e''Uy
for a'
<
x
<
b*
(8.12)
Ja-
find C,
we evaluate
(8.12) at the point x
=
a*, set
substitute for a* using (7.12) to get
fi
V
[hi
^,
-//C
^ip2(l
+
h2)p2{l
-
Pi)
J
or, equivalently,
g
=
-/xr
.,
[hi
26
-Pi) +
+
,
,
,
-\
/i2pi(l
,.
h2)p2[l
-
Pi)
P2)
,^...
1
•
(8.14)
Substituting g* and
/•*
and
for g
and
right sides of equations (8.1)
C* defined
=
in (8.11).
following three
8Lnd
V*"{b*)
8.3.
lemmas
Proof.
Evaluating
=^— + C'e-"^' -
Substituting for a*,
^+
/
jj.
/
=
V*"{a*)
/*,
+^2)p2(l -Pi)
hip2{\ - pi) + h2p\{\- p2)
-
Pl)
+
-
^2pl(l
P2)
{h\
-21
+
,
(^1
(^1
+
^2
^2)p2(l
Pl)
-
+
hip2{l
-
-
X
+
-
^2)p2(l
+
pi)
-Pi)
h2pi{l
hip2{l
2/l2 //llp2(l
^^
^'
-P2)
M
.J
-Pl) + /i2pl(l -P2)
^
M
,,7'
-
-
-
Pl)
^
^
+
+
p2)
P2)
h2)p2{l
pl)
X
-
p\)
h2Pl{l
^
-
P2)
^2Pl(l -P2)'
(/^1+^2)P1(1-P2)
(^1
/
+
{hi
.
P2)
^2)Pl(l
':'
we obtain
ih,+h2)pAl-P2)
^
Pl)
h2Pi{l
P\)
1)))
-Pl) + ^2Pl(l -
/liP2(l
+ /i2Pl(l-p2)V
.' +
/llp2(l
-21
-
h2)p2{l
^/?lP2(l-Pl)
.,-2
+'--'(
+
(^1
(^1
2/li
'
(8.11),
ln(
p.u
.
yield
-
{e^^'iva"
I*.
(8.15)
^1
)
-
=
\)\.
and C* using (7.12)-(7.13) and
6*
hipiil
""
+
h2U-\e''^\vb* -l)
_^-2 _ ^-2/
-2
and integrating by parts
4-^""** i-hiu-^i-l
(h,+h2)p,{l-p2)
V
boundary conditions V*'{b*)
fc*
(^1
^lP2(l
^
yields the candidate constant
in (8.10)-(8.11) satisEes V*'{b*)
=
x
V'*' in (8.10) at
+
=
verify that the
The function V*' de£ned
V"(b*)
C
for
respectively, are satisfied.
0,
Lemma
r*'(6M
and solving
(8.14),
|
The
=
using (8.6) and (8.8) respectively, equating the
r in (S.l)
/ilP2(l
+^2)Pl(l -P2)
-Pl) +
/i2Pl(l
-
P2)
^2)pl(l
^/llP2(l-pi)
+
-P2)
/i2Pl(l-p2)V'
Cancelling and rearranging terms yield
g*
//
^2
,
/ii/
^1P2(1
(/Jl
(
+
(hi
\/jip2(l
-Pl) +
h2)pi{l
-pi) +
/^2Pl(l
+/J2)pi(l
=
r.
-P2)\
-P2)
I
27
P2)
/i2Pl(l
Since the last three terms on the right side of (8.17)
v*'{b*)
-
\
,hi+h2
-P2)/
Pl^
/ti/)2(l-pi)
/i2
y
pi/pi(l-p2)
PI^
sum
to zero, using (8.8)
.g
we
^-,.
see that
Lemma
8.4.
Proof.
Differentiating (S.IO) yields
V"'{x)
V*"{a')
V'*' satisfies
=
0.
^^ + -^ /
= -uC'e-"' -
h{y)e''ydy
=
Using (4.10) and (7.12) emd evaluating equation (8.18) at x
^°^ =
'
-^^^
ih,+h2)p2il-p.)
(-hi
a^^ ^u
^
which equals zero by
+
< b\
x
(8.18)
a* give
^
-
h,)p2{l
\
p,)
\h,p2{l-p,) + h2P,[\-p2))'
^
'
|
(8.11).
Lemma
8.5.
Proof.
Evaluating (8.18) at x
V*
{h^
[
<
for a'
=
V*"{h*)
satisfies
=
6*
0.
and integrating by parts yield
a'
(-1 -
6*
Substituting for a*,
V"'(b*)
=
+
(^1
-
hip2{l
hiP2{l{hi
2
^2)/>2(l
+
Pi)
pl)
+
/iip2(l
.
+
1)))
+ h2)pin-
+
-Pi)
h2pl(\
-uhiP2il -
{hi
_
-
h2pi{l
-
-
h2)p2{^
p2)
pi)
-
-
hiP2{l
X
/llP2(l
Pi)
h2)p2{l
+
-pi)
/l2Pl(l
-
-
Pi)
+
/l2Pl(l
.
X
P2)
+ h2i'-A.
+
+
Pi)
h2pi{l
h2)piil
(8.20)
follows that
it
-
^2)^2(1
pi)
-2
+
-
L
,
_
,
h2)p2{l
+
pi)
-
^
-
P2)
P2)
X
-2
-
Pi)
h2U-^{hi
^1P2(1
-
+
Pi
-pi)
h2pi{l
+
h2U-'^{hi
hiP2{l
P2)
X
-
{hi
hip2{\
P2)
1)1.
V
/
.
+
^^hip2il-pi) + h2Pi{l- P2)^
-P2)
p\)
-
(hi
.
- P2).f,
h2Plil
+^2)pi(l -P2)
'•^^
ln(
+
hiP2{l
fi
+
+
(^1
h2
s
1)
(8.11), respectively,
\n(
^11^
P2)
-Pi) + h2pi{l -
hiu-'^jhi
P2)
-
h2V-'{e'"' (ub*
)^^
+/Z2)Pl(l
(/)l
/
{va'
and C* using (7.12)-(7.13) and
v(
,
(e"''
)
X
-
P2)
-P2)
/i2)pi(l
+
/i2Pl(l
h2)pi{l
+
-
P2)
-P2)
h2pi{l
-
x
X
P2)
(8.21)
28
The
In(-)
terms cancel out and upon rearranging, we have
V"(b*) =
which equals
I
Pi)
+
h2(hi
+
A
hip2{l
-P2) n/
^2pi(l
h2)p2{l — P\)
',
,
,,
Pi) + h2pi{l - P2)
-
+h2)p2{l-
hijhi
/
\
,
J+
s
\
,
+
/ii
pi)
/12
8.22)
,
)
|
zero.
We now extend
in the obvious
-
^1^2(1
/
the function V*' derived in Proposition 8.2 beyond the interval
way suggested by conditions
function V* in terms of V*'
.
and define the candidate potential
V* by
Define
{
l/*(x)=
(6.12)-(6.13)
[a*, h*\
ifx>0,
-
\^V'"{y)dy,
-'"
(8.23)
„
ifx<0,
\ -/;F*'(y)dy,
where
{-r*,
Notice that
I'*
+
C'e--^'
- ^e-"-
/;. h{y)t^ydy,
G C^, as required, and that F*(0)
The candidate
<
a*,
ifa:>6*,
^*-
9L
X
if
solution (g*, F*(x), r*,
I*
=
,a* b*)
tions (6.10)-(6.15) of the optimality equations have
<
x
now completely
been
verified.
verified,
it
suffices to
TV*{x)
specified,
To complete
cation of the candidate solution, condition (6.9) needs to be verified.
have already been
< h\
0.
is
,
a*
if
(S.24)
and equathe
verifi-
Since (6.11)-(6.13)
check that
+ h{x)-g* >0
x^
[a*, 6*]
(8.25)
<
X
<
(8.26)
= —pr* —
h\X
—
> —ur —
hiQ
for
and
-r* <
Lemma
Proof.
8.6.
T'*'(x)
< r
for a*
b*
Condition (8.25) holds.
To show
(8.25), notice that for x
TV*{x) + h{x) — g*
=
29
<
0.
a*,
g*
—
g
(8.27)
Similarly, for j
>
6*,
TV*{x) + h{x) -g* =
fil*
+
h2X - g*
>
ul*
+
h2b*
=
0.
Lemma
8.7.
Proof.
Since V*'{a*)
(8.26)
= -r%V"{b*) =
X
<
V*"Ij:)
=
<
0,
we have, by
{hi
^1
(
P
+
(8.28)
=
and V*"{a*)
I'
V*"(b*)
=
0, to
prove
show that
V*"{x) >
For a*
I
Condition (8.26) holds.
suffices to
it
-g*
hip2il
^
for a*
<
X
and integration by
(8.18)
+h2)p2{l- Pi)
- p\) + h2piil -
b*
(hi
hip2{l
P2)
(8.29)
parts,
^,
.
- ^^lil^(i.xe- - e- -
a-'
<
j.G*e-'
+
+ h2)p2{l - Pi)
- pi) + h2pi{l -
>.^
_^^
P2)
e^').
'
(8.30)
p,
Substituting for a* from (7.12) and cancelling and rearranging terms yield
T-.//.
^
(x)
V
Letting x
=
=
—p^(I--;
^1
(/ii
r:
/.
'701
is
X G (a*,
-
^1)
re
;
-^^^
also greater
0), it
f
for
)
'
hih2ipi
=
-
a *^
<
-
a:
^n
<
P2)
ro'J1\
(8.31)
'
=
,
p{hiP2{l-pi) + h2Pi{l-p2))'
^
greater than zero by definition (4.2). Also, for a
V*"'(x)
^"^^
which
^
'
in (8.31) gives
^
is
/?2)p2(l
h,p2il-pi) + h2Pi{l-p2)
y
which
+
<
x
<
follows that V*"{x)
.
'
0,
+ h2)p2il-pi)
aHh,P2{l-Pi) + h2P,il-p2))'
than zero.
^
2hr{h^
Since V*"{a)
>
for a
<
30
x
=
<
0,
0.
F*"(0)
>
'
and V*"'{x) >
^^-^^^
for
<
Similarly, for
<
x
6*,
p^h,p2{l2h2X
we have by
Pi)
+
hiP2{l
hip2{l-
-Pl)
+ h2pi{l -
/?2)P2(1
pi)
+ IHllZ (e'^'ivx -
+
1)
+
(hi
^^
V
/i
-
parts,
+ h2Pl{l-p2)' ''^hiP2{l-pi) + h2Pi{l-p2)^^
hie-"' f
(ftl
and integration by
(8.18)
h2)p2{l
pi)
ln(-
hip2il
P2)
1
Pi)
-
h2pi{l
[hi
/
,
+
-
P2)
+ h2)p2il - Pi)
- Pl) + h2pi{l -
P2,
y
(8.34)
which can be rewritten as
.^^n,
{x]
V
.
h2
=
<
X
<
6*.
we can
differentiate
^
is
than
less
>
proof.
We
-—
pi)
+
-
-e _^^
P2)
h2pi{l
-
^ ^ U*
forO<x<i.
r
for
(8.35
P2))
V*"(x) to obtain
V*"(0) >
< x <
/C OrA
r.
+
h2)Pl{l
-
P2)
6*.
0,
V*"{b*)
=
.oo^^
,-^x
^^-^^^
'
^~'crHhiP,{l-pi) + h2Pi{l-p2))"
zero. Since
have that V*"{x)
h2)pl{l
2h2{hi
.'*,>,,._
which
+
p{hip2{l-
P-
For
h2{hi
+ -—
and V*"'{x) <
for x
G
(0, 6*),
we
Thus, (8.29) has been shown, which completes the
I
are
now
Theorem
in position to state the solution to
The
8.8.
{R',L*) defined
problem (4.11)-(4.13).
solution to problem (4.11)-(4.13)
on the interval
in (7.1)-(7.2)
[a*,b*],
is
the control limit policy
where a* and
b*
are defined in
(7.12)-(7.13).
Proof.
V'*(x), r*,
6.2,
for
/*,
The
results
from
Lemma
7.1
through
Lemma
a*, 6*) that satisfies equations (6.9)-(6.15)
the result holds
some constants
if
A'^i
the function
and
A^2-
V*
satisfies
8.7 lead to a solution (g*,
such that
V
G C^. By Theorem
condition (6.8) that V{x)
< Ni + N2h{x)
This condition holds by setting
Ni=m&x{V*{a*),V'{b*)}
31
(8.37)
and
N2 =
r*
+
/•
max{—7
—7+
r*
,
is
}.
(8.38)
I
^1
/l2
As
I*
usual in control problems dealing with long-run average cost criterion, the op-
timal poHcy {R*,L*)
not unique.
is
Brownian motion process within the
and then uses a* and
In particular,
any policy that brings the controlled
interval [a*, 6*] in a finite expected
b' as reflecting barriers will
achieve the
amount
of time
same average expected
cost
as the policy {R*,L*).
The
9.
Case
Driftless
In this section, the constrained problem (4.11)-(4.13)
where
perfect system balance,
B
defined in (4.1) has drift
/x
p\
=
=
0.
P2- In this case, the
Defining
e
it
C,
solved vmder the conditions of
two-dimensional Brownian motion
by
= v^Pi(l-pi),
(9.1)
seen that the right side of constraints (4.12) and (4.13) both equal
is
an essential system parameter, problem (4.11)-(4.13)
procedure used in Sections 4 through 8
are
is
now
same
is
still
will
Taking
be solved in terms of
it.
^ as
The
valid in this case, but the computations
substantially easier. In particular, the sufficient conditions for optimality are the
as in
Theorem
6.2,
except that conditions (6.14)-(6.15) are replaced by
limsnp^E,[R{t)]=\\msnp^E,[L{t)] =
As
1^.
in Section 7, the first step
(6.9)-(6.13)
and
Harrison
for a
[4]
(9.2)
is
t
(9.2)
towards finding a solution to the optimality equations
to find candidate interval endpoints a*
proof of the following result, which
driftless case.
32
is
and
b*
the analogue of
.
See Chapter 5 of
Lemma
7.1 for the
Lemma
(6.1)-(6.2),
Then
Suppose
9.1.
and thus
B
is
a (0,(7^) Brownian motion,
W =B+R—L
is
T—'OO
The candidate
and L are as defined
a regulated Brownian n^otion on the interval
W has a uniform steady state distribution on
limsup ^E4R{t)]
R
=
[a, b]
T—<-oo
=
^
^\0
'
interval endpoints are derived
[a, b].
and
limsup ^Er[L(t)]
'
in
(9.3)
(^)
by solving the following problem: sujiong
the class of control limit policies, find the policy {R, L) to
•
minimize
limsup
iim
T^oo
subject to
Lemma
9.2.
i£,[/
J
r
h{W{t))dt]
limsup —E^[R[t)]
The control
(9.4)
Jo
=
limsup —E^[L{t)]
=
limit policy with interval endpoints
*
(9.5)
Integrating gives
(9.11)
<'
i
Setting /'(a)
=
0.
we have
2
J
2/iia
+
2/i2a+
^— =
(9.12)
Solving for a yields a* in (9.6) and using (9.8) to solve for
6 yields b* in (9.7).
These are
the cajididate interval endpoints, since
/"(a)
which
is
upon
4(2/^1 +2/12),
cancelling and rearrcinging terms,
/(a*)
a'^h^h:
=
the candidate gain g*
/l2)'
+
ri
and
b*
is
(T^hih2
Let X* e
9.3.
=
'*
(9.14)
+
4^/11
Theorem
(9.13)
|
positive.
Since,
=
and
[0, 1]
let a*
+
(9.15)
li.
be deRned as in (9.6)-(9.T). Let
A*
h,
r = (i-y) ^''^
hi
hih2
hi
(9.16)
+ h2Ae'
+
"'
h2 4^2
(9.17)
'
cr^
(9.18)
+h2 2C
and
ifx>0,
J^'V*'iy)dy,
=
V'ix)
(9.19)
ifx<0,
-j'V*'{y)dy,
where
'
V*'{x)
=
— r*,
if
/*,
ifx>b*,
{
'-^ -
^ i:. hiy)dy + C;
,
X
<
a*,
(9.20)
ifa'<x<b\
and
,
c;
(7'(-yhih2 + i2-y)hihi)
HHh,+h2)^
34
(9.21)
Then {g% V*{x).r\l\a\b*)
and
and
V
e
C^
in the driftless case
is
the
(9.2).
Furthermore,
satisfies condition (6.8).
Thus, by Theorem
control limit
pohcy
6.2, the solution to
that lead to
will
Theorem
problem (4.11)-(4.13)
I") defined by (7.1)-(7.2) on the interval
(i?*,
The proof
are defined in (9.6)-(9.7).
and
solves equations (6.9)-(6. 13)
Theorem
of
[a*, 6*],
9.3 parallels the
where a* and
arguments
b*
in Section 8
However, the computations in this case are substantially simpler
8.8.
be omitted. This section
concluded by showing that, as the
is
endpoints derived in Section 8 for the nonzero
drift
;/
—
>
0,
the
converge to the endpoints derived
drift case
in the driftless case.
Proposition 9.4.
h
\_
(^1+^2)P2(1-Pi)
lim'^'lnf
a^
and
lim
<y\
^In
+ h2)pi{l-p2)
\
"^'y '^''
=
\hip2{\-pi) + h2Pi{l-p2)J
M-o2^
Proof.
Since p
=
{hi
f
'
—
\/n(pi
'
'
P2).
it
-0
\hip2{l
-yj^ -
lim
P2-P1 2y/n(pi
By
{hi
—
2/j
hi
+
h2 2i
.
(9.23)'
^
follows that
<^\
,.
iim - in f
/i-
a^
hi
—li—
—
+ h2)p2{l - Pi)
- Pi) + h2pi{l-
P2]
ih,+h}Ml-p,)
In
f
\hip2{l -/9i)
P2)
+
/i2Pi(l
,
-
,
P2)
THopital's rule, one obtains
,.
hm
^^
-
=r,
P2-PI 2-/n{pi
-
P2)
J_(y2
—
(Jp2
=
lim
^
,.
(hi+h2)p2(l-pi)
f
\hip2{l - pi) + h2Pi{l -
In
/
Jj^
(/ii+/i;)P2(l-Pi)
p2,
^
•^''^
\ ''lP2(l-Pl)
/
"^^^ ^'^ + 'l2Pl(l-P2)
^^"
\
(9.25)
-a'^pih2
P2-PI 2yynp2(hip2{l
-
pi)
+
h2pi{l
-
P2))
-a^h 2
2y/^{hl
+
h2)pi{l
- PlY
35
(9.27)
The
definition of
similax
and
10.
is
,^
in (9.1)
omitted.
completes the proof for the endpoint a*
of this paper
was
summarizes the solution. The parameters
p,, M,jt,
This
t'l, cr^,
a self-contained section that
is
u and ^ appearing in this sec-
h,,
and
in equations (l.S). (2.1), (2.3), (4.3), (4.7)-(4.S), (7.3)
(9.1), respectively.
workload formulation, the controller observes a two-dimensional Brownian mo-
tion process B,
from which can be observed the one-dimensional Brownian motion process
defined by
B{t)
If
(U,Z,d) to the workload formu-
problem data. Their definitions can be found
tion are all defined in terms of the primitive
B
for b* is
Summary
to obtain a solution
lation (2.6)-(2.11) of the limiting control problem.
In the
The proof
|
Solution to the Workload Formulation:
The purpose
.
pi
^
P2,
=
p.Biit)
-
p,B2{t),
then define the interval endpoints a and
a
=
-1,
1^
f
In
b
t>0.
by
{h,+h2)p2{l-p,)
-;
^
;
(10.1)
,.„_.
\
-;
(10.2)
\hiP2il-pi) + h2Pi{l-p2)J
and
t
=
,-. In
(
—l^i±MMilL£lL^
hip2{l
If p\
=
P2>
then
-
P\)
+
h2pi{l
-
^
.
I
'
(10.3)
P2),
let
h2
a''
hi+h2
2^
(10.4)
anc
For a particular realization of B, define the control functionals (R, L) by
R{t)= sup [a-B(5) +
0<s<J
36
L(5)]+
(10.6)
and
L[t)
=
+
snp [B{s)
R{s)
-
b]^
(10.7)
0<s<t
The two-dimensional optimal
control process
U
=
U^{t)
is
given by
^
(lO.S)
P2
and
IW) =
^.
(10.9)
Pi
From
the functionals {R,L) in (10.6)-(10.7), next define the process
W{t) = B(t) + R{t) - L{t)
The
Z
7\.'-dimensional optimal control process
—,,^'^^\,
—T!—
57
Zi-(f)=
''^^^"~^^^^^'
,
,
foralH>0.
(10.10)
given by
is
if
W by
it
A;
=
1
an( W{t) >
and
<(
0,
(10. ii:
.
0,
if
^
7^ 1
and Wit) >
0.
if
A:
=
and Wit) <
0,'
and
Zkit)={
vv«)
-,
P2W12-P1A/22'
Finally, the optimal control process
Oit)^v-^[Biit)
Thus the solution iU,Z,9)
(10.S)-(10.9)
to the
(10.12)
if
[0,
is
2
-
fc
7^
2
and W{t) <
0.
given by
+ —^-y^AhkZkit)], foralU>0.
workload formulation (2.6)-(2.11)
is
(10.13)
given by equations
and (10.11)-(10.13).
Acknowledgements
I
this
am
deeply indebted to
my
dissertation advisor,
J.
Michael Harrison. He suggested
problem area to me, and provided invaluable suggestions on both the technical and
expository aspects of this paper. This research was partially supported by the Semiconductor Research
Corporation and by a predoctoral fellowship from the International Business
Machines Corporation.
37
REFERENCES
Bellman, R. E. (1956). Dynamic programming and Lagrange multipliers. Proceedings
[1]
Nat. Acad.
Sci. 42,
767-769.
Benes, V. E.. Shepp, L. A. and Witsenhausen, H.
[2]
Control Problems. Stochastics
S.
(1980).
Solvable Stochastic
4, 39-83.
Everett, H. M. (1963). Generalized Lagrange multiplier
[3]
Some
method
for solving
problems
of optimcd allocation of resources. Ops. Rsch. 11, 399-417.
Harrison,
[4]
M. (19S5). Brownian Motion and Stochastic Flow Systems. John Wiley
J.
and Sons, New York.
Harrison,
[5]
M. (1986). Brownian Models of Queueing Networks with Heterogeneous
J.
Customer
Classes. Proc.
IMA Workshop
on Stochastic Differential Systems. Springer-
Verlag, Berlin, to appear.
Harrison,
[6]
J.
M. and Taksar, M.
Math, of Oper. Res.
[7]
Kaxatzas, L (1983).
A
3.
\.
(1983). Instantaneous Control of
Brownian Motion.
439-453.
Class of Singular Stochastic Control Problems. Adv. Appl. Prob.
15, 225-254.
[8]
Shreve,
S. E.,
Lehoczky,
J. P.
and Gaver, D.
P. (1984).
Diffusions with Absorbing and Reflecting Barriers.
[9]
Optimal Consumption
SIAM J.
for General
Control 22, 55-75.
Taksar, M. L (1985). Average Optimal Sing\ilar Control and a Related Stopping Problem. Math. Oper. Res. 10, 63-81.
[10]
Wein, L. M. (1987). Asymptotically Optimal Scheduling of a Two-Station Multiclass
Queueing Network. Unpublished Ph. D. Thesis, Dept. of Operations Research, Stanford University.
38
bl 17
06i;
10
'^fe
Date Due
FEB 19
198')
Lib-26-67
3
IDSD DDS 35D TID
'^f^^e/o^nfT'
Download