January 1985 by AKELLA LIDS-P-1427

advertisement
LIDS-P-1427
January 1985
OPTIMAL CONTROL OF PRODUCTION RATE IN A FAILURE PRONE MANUFACTURING SYSTEM*
by
R. AKELLA
Rm. -35-431
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139
P.R. KUMAR
Dept. of Math. and Computer Science
U.MoB.C.
5401 Wilkens Avenue
Baltimore, Maryland 21228
ABSTRACT
We address the problem of controlling the production rate of a
failure prone manufacturing system so as to minimize the discounted
inventory cost, where certain cost rates are specified for both positive
and negative inventories, and there is a constant demand rate for the
commodity produced.
problem is the optimal control of a continuous
The underlying theoretical
time system with jump Markov disturbances, with an infinite horizon disFirst, proWe use two complementary approaches.
counted cost criterion.
ceeding informally, and using a combination of stochastic coupling, linear
system. arguments, stable and unstable eigenspaces, renewal theory, parametric
optimization etc., we arrive at a conjecture for the optimal policy. Then
we address the previously ignored mathematical difficulties associated with
differential equations with discontinuous right hand sides, singularity of
the optimal control problem, smoothness and validity of the dynamic programming
equation etc., to give a rigorous proof of optimality of the conjectured
policy.
It is hoped that both approaches will find uses in other such problems
also.
We obtain the complete solution and show that the optimal solution is
simply characterized by a certain critical number, which we call the optimal
If the current inventory level exceeds the optimal, one
inventory levelo
should not produce at all, if less, one should produce at the maximum rate,
We
while if exactly equal one should produce exactly enough to meet demand.
also give a simple explicit formula for the optimal inventory level.
The research of the second author has been supported in part by the N.S.F.
under Grant ECS-8304435, U.S. ARO under contract DAAG-29-84-K0005, administered by the Laboratory for Information and Decision Systems,
Massachusetts Institute of Technology; while the research of the first
author has been supported in part by IBM.
Submitted to IEEE AC.
OPTLMAL CONTROL OF PRODUCTION RATE LI A F.: TURS PRONE MANUFACTURING SYST--
I.
R. .MLA
P. R . K*MkR
tL .D.~S
l.I.T.
. ant C-=-_- Scienc~=o-d Yaf
Ui.oB.C.
3altimore, Maryl.and 21228
"Tr.
ODUCTION
There
We consider a manufacturing system product-ng a single comodity.
is a constant demand rate d for the c=oodd:y, and the goal- of the
turing system is to try to meet this demand.
anufac-
The manufacturing system is
however subject to occasional breakdowns and so there are two states, a
"functional" state and a "breakdown" state, in which it can be.
The transi-
tions between these two states occur as a continuous time Markov chain,with
q! the rate of transition from the functional to the breakdown state, and
q 2 the rate of transition from the breakdown to the functional state.
(Alternatively, the mean time hetween failures is q
time is q 2i).
and the mean repair
When the manufacturing system is in the breakdown state it
cannot produce the commodity; while if it is in the functional state it
produce at any rate u upto a maximun
can
production rate r. We assume that
r > d > 0.
Let x(tI be the inventory of the commodty at time t,
i.e., z(t) =
Ctota. production upto time t) - (total demand upto time t).
negative, which corresponds to a backlog.
ventories incur a holding cost of c
x(t) may be
We suppose that positive in-
per unit caoodity per unit time,
while negative inventories incur a cost of c-, with c, > O, c
> O.
Our
goal is to control the production rate u(t) at time t, so as to minimize
the expected discounted cost,
I+t
c+ x (t) + c-x-(t))et
0
( )
w-hre
S
M.
e
is
:=max(x,O),
:o
x
max (-z,O)
y'> 0 is
an
problem that we acddrcess is t
fuctionin
is Ca:the op:
the discount rate
mhe
nufa:-X-trig system
'ie
Wse.
-eztio
-l
n
rate u as a
unc:tion of
tha
This problem has been motivated by the pioneering work of
inventory x?
and Kimemia and Gershwin [2], where a more general problem is
/imemia fi
formulated (see Section XIII also).
We obtain the complete answer to this question.
u(t)
xz*(t)), is given by a critical number z*o
it 2
eg*r
The optimal solution,
The opti-al policy is,
Wg:)< z*
if
d
if x(t)
t
- 0
if X(t)
(2)
z*
> z*
Thus, whenever the manufacturing system is in the functional state, one
should produce at the maxim=u
rate r if
the inventory x(t) is less than z*,
one should produce exactly enough to z:eet demand if
the inventort
x(t) is
exactly equal to z*, and one should not produce at all if the inventory x(t)
exceeds z*.
Hence the production rate should always be controlled so as
to driVeo the inventory level as rapidly as possible towards. z*, and once
there should maintain it
at the level z*.
For this reason we shall call z*
the optimal inventory level.°
We also obtain the following simple formula for the optimal inventory
level.
2*0,
t g I
c
where X
_
+ cf
+
s
(y1+ q+dX)
is the only negative eigenvalue of the matrix,
d
(3))d
4
q,
q,
++7
+ Y
The movivation for the problem studied here is thaz it
is a basic
The optimal policy is trivial to
probleto for manufacturing systems.
compute and qualitatively simple to implement, and it is hoped that these
two features will render it
attractive enough for use as a guideline.
From a theoretical viewpoint also, both our solution procedure and
method of proof pp.os.ess.ss
several interesting features.
First note. that the
system under consideration is the following.
5.i)
i(t) - u(t) - d
5oii)2
{S(t); t > O} is a continuous time Markov chain with state.
space {1,2} and generator [_q.
5iii).
2
The constraint on u(.t2 is,
u(t)
- O
* £o,r]
5oivl
The go-l is to
inimi.ze E
if
s(t) - 2
if
s(t) ~ 1
(c xz(t) + c-x-(t))e-Y : dt
Here x(t) is the inventory at time t, u(t) is
the production rate at time
t,
and s(t) - 1 or 2 depending on whether the manufacturing system is in the
functional state or the breakdown state respectively.
Thus we have a con-
tinuous time system with jump Markov disturbances and an infinite horizon
discounted cost criterion.
For previous work on problems of this type, we
refer the reader to Rishel
3]1
for the case of a finite horizon cost
criterion; and Krassovskii and Lidskii [4] and Lidskii [5] for a case of.
an infinite horizon problem.
are two problematical issues.
In dealing with these types of systems there
The firs: set of problems arises because one encounters many mathe= ca!
rt a dfic
ul-_ias, se FPishl- [3],
'he=
l
studying the probl.em of ot:ima!
Con-
control for continuous time systems wieh jump MarkBov disturbances.
i w(x).
sider a feedback policy u
Then the system (5.i) satisfies
ir
(x) - d
(6)
continuous functions.
dis -
we wish to consider are essenelly
However, the types of functions '
Standard ezxistence and uniqueness conditions for
R/shel [3]
the solution of the differential. equation (6) are not satisfied.
has considered one notion of a solution;
are also other difficulties.
Vi(x) :- mini
we use another methd6.
.. There.
Let
value of the cost (5.iV) when starting in the
state s(O)
=
i,
x(O) = x.
Then, informally, we have the HamilEto-Jacobi-Bellman dynamic programming
equation:
fmi
(u - d)V (x)
q
-q7v
1
[I'
+j [ +v2 ( ) J
r(-dVq2ZWL-qJ y~q(
L
EtOF
It
ib
is not a priori clear that V i
()
)
+
(7)
is a differentiable function.
It is not
it
turns out
also clear that there exists an optimal control law.
Moreover,
that Vi(x) does vanish for some x, and for such a value of x, the left hand
side of (7)
is minimized by every u, and so the Hamilton-Jacobi-Bellman
(HJB)
dynamic programming equation does not prescribe the optimal u for such x.
Thus, the optimal control problem is a singular one.
I: is
this collection
of mathematical problems that we shall address rigorously in the second
half of this paper.
The second set of proboles is
this.
Even ignoring all the mathemati-
ca- difficulties mentioned above, how do we actually :tbain the optimal
.solution?
Why should we suspect that the optimal
number type?
Why is. the critical number z* always nonnegative?
we want to solve the HJR equation (7),
conditions?
policy is of the critical
Given that
what are the appropriate boundary
After determining boundary conditions how does one determine
an optimal choice for z*?
It
is this collection of issues dealing with
the actual obtaining of the optimal policy that we address. in the first half
of the paper.
The two approaches complement each other and we hope that they wll also
be useful in dealing with other problems of the sort considered here.
The paper is organized as follows.
technical questions
In Sections II-VIII we ignore some
and arrive at a conjecture for the optimal policy.
Beginning with Section IX we address all the mathematical difficulties and
rigorously_prove the optdimality of the conjectured policy.
II.
OPTIMALITY OF CRITICAL NUMBER POLICY
Beginning with this section and continuing through Section VIII, we
provide a sequence of informal arguments which will lead us to conjectures.
about the optimal policy and the optimal cost function.
In this section we give an argument to show that the optimal policy is
characterized by a critical number.
Let us assume the existence of'an optimal feedback policy u(t) =wt*(x(t))
and let Vi(x) denote the optimal cost when starting in the state (s(O) - i,
x(O) = x).
Fix {s(t,w); t > O}, a realization of the continuous time Markov
chain, with s(O,w)
=
i.
Now we consider two different initfi.
and also a convex
If
combination x (0)
conditions x 0 (O) and x (O)
+ CX(O)
(lj)xo(O)
5
here 0 <(
< 1_
* (-) is used, then the trajectories sarting with the initial condi-
tions x 0
(0
) and:
(O) satisfy,
6,tW) - i (xk(t,w)) - d
if s(tc,)
1
(8)
s(t,W) = 2
i
-d
a
xk(O,W) - Xk(O)
for k
0,1.
Also, we have
where E . signifies that the expectation is
taken over W.
Suppose now that for the initial state x (0),
a)
(x
0 (te,S))
Note that such a control is
t >
(s;(t,);
(10)
+ ctr(lg(t,)L)
in fact. implementable because by observing
one can in fact deduce what {xk(t,w), t > O} would have
01
been for k - 0,1.
Such a control gives rise to the erajectory satis.fying,
- (1-ctO
C(o)
X:a
we use the concrol
*
*
~-_
( 9 )'--(l
(9)
dt
d(t,w))e
+ c
=Ef(.cCtW)
V((0)))
W
(xO(t,w)) + 7r* (x:(t,w))
d
if s(t,W)
if
-d
s(t,w)
1:
= 2
(11)
x (0,s)
it
=
(1 -a)x0(0)
+ axl(0)
iS easy to check from (8) and (11) that
g(t,W)
(1-)Xo(t,w) + axl(t,w)
for every t > 0
From the convexity of the integrand in (5.iv) it follows that
(c x+(t,w) + c x (tw).)e
W~~~~
,
t
dt < (1-a)V (X (0)) + aVl (x (0))
(12)
However, for the initial state x (O) the control (10) is not necessarily
opcti-al., ard. so
v 1 (x(O)) <_
W f(c
x:(t,Ca
) + c : (,
(13)
)eY: d
From (13) and (12) we deduce that V 1 (-) is a convex funcction.
Assumin
tna: VL(-) is continuously differentiable, we see that there is some z* for
-1
> 0
~~~~~~~~~~(14)
for x > z*
From the left-hand side of the HJB equation (7), which we suspect (V (O);
.i - 1,2} satisfy, we see that
u r minimizes (u-d)V
l (x)
-
0 minimizes (u-d)V1 ( x )
if
x < z*
if x > z*
Hence, we suSpect that the optimal policy u(t)
a (x) ' r
if x < z*
s0
if x > z*
I
r (x(t)) is of the form
(15)
for some critical number z*.
What happens at x - z*?
the form of (15) it
Any uE [O,r] minimizes (u-d)V1 (z*),
but from
is clear that the inventory level is quickly driven back
to z*'if it deviates from z*. Hence, due to this "chattering" phenomenon,
we suspect that
*
(x) - d
if
x = z*
because such a choice keeps the inventory level exactly at z*, once it
reaches Z*o
(15) and (16) show that the optimal policy is characterized by
(16)
the critical number Z*, which we have called the optimal inventory level.
Y OF OPTT4L INVa-_TORY LEVEL
NONENGATIVT
IlIo
In this section we show that the op-mai in ventory level is nonegative.
Consider two policiesa
t
rg
if x
<
*d
if x
= 0
W0 (X)
(°) and r Z (.),
where
0
(17)
.
> O
O
0g
and
witaW)
Lea
r
if x < z
-=d
if x ~ z
0
if x > z
(18)
z < 0 be some srictely negat:ive number.
0
Denote by V (x) and Vi(x) the
z
costs resulting from the policieis T (.0) and ITz()
ing in the state (s(O)
I
minim
i,
x(0)
= z).
is optimal, then f:rom (14) we see that
wz3(-)
at x = z,
4&(e)
should attain a
i.eo
V 1 (z) < t1(x)
for all x
(19)
(o) is optimal, we should also have
Moreover, if
Z(x) < V(x)
In particular,
<
9v
v~gz)z(Z)
=
respectively, when start-
for all x.
from the above two inequalities we should have
v(0)
1 (0)
(20)
show
Hence, to show that 9z(,) with z < 0 is not optimal, it w}ll suffice to
that (20) is not true.
10
Indeed, let {s(t,w); t > O} with s(0,c)
sider the two t:ra j
=
1 be a realization, and con-
c:ries
(:,m) ' r0 -d
-
-- d
ifif
0
xO;(,=)
(t,)
x
1
0, s(;,~.O)
s(t,W) -< 0,
1
otherwise
0
x2 (O,) - O
and
i:(t,w)
x (O,)
rr-d
1
if xZ(t,W) < Zs s(t,m)
- 0
if xZ(t,)
. -d
otherwise
- °, s(t,W) - 1
" z
0
which emanate from the initial states 0 and z, when the policies it () and
%r(-)are respectively used.
It is easy to verify that
xO(t,W) + z - xz(t,w) < z < 0
for all t > 0
Henace,
c x
(t,c) + c
- (t,,) - cx
(tw)
0c fx (t,;)
+ +
- C xr (t,)
+z
+ cx
]
0-
(t:,)
+ Cjz[
Hence
))et Y
o-0(,xO+
(tw))e
+ ccx
(tW.) +
x0 (t,
J~c
dt < E
(¢+
cx(t,
-Z+
W) + CrxZ (t,d))e:tdt
i.e.
v°(o)
< vz(z)
showing that (20) is violated, and thus that tz(-) cannot be an optimal.
policy.
Hence z*, the optiml inventory level, has to be nonnegativeo
IV-.
I-
?TIEC-W!$SE LITN'iA. EQUIATIONS FOR. -
Lt: z > 0 and denote by Vi(x)
defined in (18).
the
COST FNiCTION
-s:t function for the
The analog of (7) for the policy TZ(o) is
Vb:(x) := 6oaahg)}g
T
A2
Cx)
.1d
-
b1
+ bcx
.AVZ(x)
2(2)
2
Be(
+b2cxfor
d
for x <
for x > z
w ca ul
o
xx
porlic¢
(.)
12
function corresponding to iz(-l,
w.e need to determine appropriate boundary
conditions for (23).
V.
3BOURDARY CONDITIONS
Since the vector VZ(x) is two-dimensional, we need two boundary con-
ditions for (23).
V.1
The First Boundary Condition
Let {s(t,w); t > 0} be a realization.
Under the policy rz(-),
the
inventory is given by the differential equation,
;(,tQ)' w 7; (X('tw))
-
.d
- d
if
- 1
if s(t,W) - 2
-
In any case 1x(t,,) | < r
S(te,)
for all (t,w), and so
ix(.t,S)I < !x(Q))l + rt
for all (t,,).
Hence,
VY(X(O))
E[J(c +x (t,W) + c x (t,w))e
< kljx(O)
Hence,
+ k
dts(Ow)
for some constants k
i, x(O,w)
x(O)]
and k2.
we see that
VZ(xl - O(IxI)'
as x -
+
(24).
Let us. see how we can make (24). more usable.
Solving (231 for x < 0 in
terms of V (0), we get
v
e
vZo1)
C
for X < O (25)
Now note the following easily verified fact.
Al has one strictly positive eigenvalue, say X., and one strictly
negative
eigenvalue, say X.
(26)
13
Let
w
{J
(27)
:- eigenvector of A1 corresponding to A+o
~
To satisfy (24) as x
51 c
v (0) -
,
we clearly need
(28)
()
Xx
)
for otherwse, VZ(x) - O(
generated by
as x
( .+) is the eigenspace
Here,
Ho
w+)}.
(28) is one boundary condition for (23)>
V o2
The Second Boundary Condition
To obtain the second boundary condition, let us see what Vl(z) is.
Consider a system starting in state (s(0) = !, x(O)
stopping time, be the first time at which s(T+)
for 0 < t < T when IZ(o) is used.
V (z)
IE' e Ytc zdt + e
*
2
z) and let t,
Clearly x(t
a
z
Hence
V2 (z)]
Noting that T is exponentially distrihuted with mean
qme
by e@valuating the
expectation in the above equation, we get
Z
,1
q
1
Z
(q1V2(z)
+c
+z)
Z)
(29)
or, equivalently,
[1,a]A v(z)
(30)
-d Z
(30) is the second boundary condition for (23)o
By using the two boundary conditions (28) and (30) one can solve the
piecewise linear differential equations (23)
function for any policy
rZ(o) with z > 0o
to obtain VZ()), the cost
14
The next question we have to face is; what is the optima,
V1.
OPTLMAL CHOICE OF z
Suouose
i)
choice of z?
z* (')
V1 (z*) <
with z* > 0 is optimal.
(x)
Then,
for a1l x, since by (19), the optimal cost
function attains a miaimm at z
ii)
V2} (x)
< VtZ(x)
for all x,z, since WIZ*()
is optimal and therefore
has lower cost than any other TrZ(-).
From the above, we get
Vzl(Z)_ V,1(Z) < 9v(z)
ence VZ(z) attains a
nmum
for all z
he
1
z*. Assuming now that
l(z). is a C
1
function of z, we see that
_W.(Z..]
0°
(31)
We will call (31) the optimality condition and in the next section we will
see how it
VII.
can be exploited to give the optimal solution.
OPTIMAL INVENTORY LEVEL
We will now utilize the piecewise linear differential equations (23),
the two boundary conditions (28) and (30), and the optimality condition (31)
to obtain the optimal choice for z*.
Differentiating (29) and using (31), we get
dz z (z)j]z~*
q1
However, by the chain rule,
(32)
1Z-Z
dz'2
2
z1W*
*
Sice VZ(x) coasidee d as a fuc:ic
of o is
ai/.;ze5
a: z
*, by ass
ing cont:nuous 4d3ferentiab/i-:=y, We have,
z(z
*)]
MO0
Hence,
jz 2 (0)
dz(
1
(A1
L1z*
a% 2
2'
* 8[0, {2V
where we have also usecd (2{)o
10°11£d.
-
z
[
bC *w
|(34)
c+
Setting (34) te:porarily aside, we tu= to (28)
Q(0)in terms of
9
(z),
and substituting this in (28),
Since
£vZ(z) + A1
(w'
.Solving (23) for
we get
V: (P ) - e A LA(zj) +
e
(33)
q
+1
dI
AVZrz*)
A (zV
z~}
From (32) and (33) we have
g*(*) + b*
fO, 11A1 Vz* (*)
(zrz)
+ + b2
2
b1 c
+ A1
bc+1 - Al
C
we have
Iaz + 2%.C4J - A 1 blJ
is invariant under e
, we have
+ C)
(+ >'
16
+ A-lbc
V (z)
Al - 2
-elAA'
1 Zb
z + AAb 1 c
+ +c)E(w)
(35)
Combining (34) and (35), and noting that (w') is invariant under Al,
we get
-51J*
+
AZ
+ Alblc
-Ct
_ e I
A7b (CC +C )
(
+
)
(36)
This equation now gives'us a "formula" to choose z*.
Recall now from Section VI,' that in obtaining (36) we made the implicit
Therefore we now have to determine when there will be
assumption that z* > 0.
a positive solution z* for (36).
Let us first simplify (36) a bit.
Note that
-Y(Y +q + q2)
det: AM ' +
d(r -d)
-
Al "b
Y(]
(37)
simplifies to
Hence (36)
U
eAlZ [1]_ + c
1 C
+c
-
(+C+
:-,,zj
Y
[
(+)
E9(w
t(38)
1
C )q
Now let
v'
~ [vl,
v2] :- a left eigenvector of A1 corresponding to X
(39)
Since left and right eigenvectors corresponding to different eigenvalues are
a 0, (38)is true, if and only if,
orthogonal, i.e., v w
v eZ*
Aiz*
Since v e
+
v
e Pf+1l
Jc
j
'
(c + c )ql
X z*
= e
v , this reduces to
v
o
X*
v2
V
@ - + +vy°
1
1v 2
_+
69Cc
Since vA
" X v-, by equating the second components of both sides,
1
~ v2kr -d)-
Vl
;>%'
2
--
2
q
+,
+
2
and now substituting for
,
v2
we get
I
+,..
* -
ce
+
+ cd
q2d + ( i d r)
-t ( +
(
d
(40)
It is easy to check that
q
d
-
+
q+
(y
X d)(r -d) > 0
(41)
and so the right hand side of°(40) is strictly positive.
Taking logarithms.
we therefore get
°Z*
ts g'
++
+A_
+
qd
y
q% "
(y + Xd
ag
)
-d)
However, such a value of z* may not be positive, and we already know from
Section III, that if z* > 0 is not optimal, then z*
0 is.
Hence we arrive
at our conjecture:
z* ,. max{.
;
'X
t0~og
L
i
+·
c
Due to (41).it follows
Note:
(1 +
ehat
+
q
..
+ q2 +
qld -(y
d
-q
-(y +q2
)(r
+
--d)
.d
(42)
> 1*
X_ d)(r -d) I
Hence, for every c- > 0, there exists c* > 0 such that if c+ > c*, then
z*
O.
Thus the optimal inventory level may be zero even though c
This is somewhat counterintuitive and surprising.
< +
.
OPTZAL COST FUNCTION
VIII.
In the previous section we have conjectured the optimal policy.
In
this section we shall conjecture the optimal cost function a-Lso.
Let us consider separately the two cases where z* as given hy (42)
is positive.
is zero, and where it
VIIol
O0
z*
Case 1:
'When z* - 0, the optimal cost function is V (-) which satisfies (.23)
with z - 0, and has the boundary conditions (28) and (30) with z - 0o
Defining w
V(O)
as in
(28) says that
(27),
for some constant k
kw' + A 1 b
Then, from (30), we get
0 - [1r,o ]A1 v()
-
[1, 0]Jpb
ki+
1
J - kx
-. fence
and so k -
Solving the differential equations (23) with the boundary condi;tion (43), we
get
yX
_
+
+
-lhl
for x < O
A b c+
c ee(44) (~6).
A2x
Ve2.
C-
c
w + + A2
A2
+ b2cx+
-
+
x
foLbc
+
for x > 0
22b2
19
V!I!.2
Case 2:
V(Z*)
and so,
z* > 0
.. A..
L
(45)
Z*i A1qJ-
solving the piecewise
boundary condition (45),
inear differential equations (23) with
we get
AA
_ e--
}
c( +
+A7b
+ A7blc
for x < 0
l
m e
e"l-l' C
tb -
A
2
b c+
for 0 < x <
AeO
{*
aA 2
By simplification it
*V()
AitJ2=+ A 2 b
tc
h 2 c Z* -lcl_+z*
2
A 2 b2 c+
cx
(46)
2
for x > z*
can also he seen that
- e
-
+
AZ b
}
b
22
(47)f
for x > z*
Our conjectures for the optimal cost function in
the two cases. z*
0
the previous sections we have arrived at the conjecture that if
z*
and z* > 0 are given by (44)
ZX.
and (46) respectively.
SOLUTION OF HJB EQUATION
In
is as specified by (42), then r *(-)
defined as in (2) is the optimal
policy
and V(o) defined as in (44) and (46), for the two cases z* = 0 and z* > 0,
is
the optimal cost function.
Beginning with this section, we commence the rigorous proof of the
validity of our conjectures.
2Q
In this section we will show that V(-) satisfies the Hamilton-JacobiBellman dynamic progra-ing equation (7).
LeLa 1
V(-) is continuously differentiable.
Proof
Consider first the case z* - 0 where V(-) is specified by (44).
need to check the continuous differentiability at x - 0e
iV Y(a+h) - V()
h+Oh
and similarly for V(a').
V(O-) :' w+
-
Y
Denote by V(a+),
Now, from (44)
---{Y
Y4+
We only
Y
{-
2
where we have used (27), (37) and
[I 0A oJ1
(49)
Hence V(.') is continuously differencia'ble whenever z* a 0. Now consider the
case z* > 0, where V(-) is specified by (46). We only need to check the
continuous differentiability at x - 0 and x
V(O-) -
(O+) - e a
1
[A'clc
and so we proceed to consider x
=
as eca seen from
(46)
as can be seen from (46) and (47).
-
z*. Now, clearly, from (46),
- A1{J]
A 1b c
(50)
z*, for which,
(51)
0
~21
~~~~L _~~~~~a2~~~~
Lea 2
Proof
'
From (44), this is clearly true for the case z*
Qo
Considering z* > 0,
We see flrom (46) that
2Ie b
v(0)
A1, we get
isinvariant under e
and so noting that (w )
- -
:c
rolc2
k
b c
-)q
- Abl
(c +Q )q
1
c +c )
le
eigenvalues
Noting that left and right eigenvectors corresponding to different
are orthogonal, i.e. v F% s 0
only
where v
and w
are given by (39) and (27), we
.
eed to verify that;,
Az*
Noting that v e
=
e
*
-v , by using (37) and simplifying, we see that we
only have to show that
f
,
J q1
H
e
(c[
H
0
But noting the equivalence of this and (40) and (42) when z* > 0,
the
assertion follows.
Now. we are ready to consider the case z*
thegJB dynamic programming equation (7)o
=
0 and show. that V()
satisfies
22
Lema 3
Suppose z* given by (42) is equal to 0, and V(-) is Cefined by (44).
Then
(r
(X) - d)V1 (x) -
+q
(
-q1
(x
-(
52.i)
Y+
- q2
-d2(X )
q
2Vz)J
+
c x)
1
for all x
52.1i)
d)VlCx)
(.wZ*(x)-
m
Cin
(u d)V (x)
for all x
uE [t,rJ
Proof
I:t is easily checked that V(-) defined by (44) satisfies (231 for
= < O and x > O; and so (52.±). is valid for x < O and x >
Now.con-
sidering x - 0 and using (48), we have
[(Z*
[t(~
(OI - d ).v (.-)
o
c d
while, by using (43), we have
Y
ql
SI'a2
-2-
+
V(Q<) - -A2V(O
Y
Y+
--
+
'21
AZW
dA2 h
e
2
_cYdfo +
where we have also used (49) and (37).
Now turning to (52.ii), since Tz*(-)
Hence (52.i) is also valid at x = 0.
satisfies (2), we only need to show that
23
VI(9)c
for x < 0
<
Thea, fr-m (44) it follows that
Consider x < 0 first.
cA+
00
. --
+>
for x < 0
e
oe<
where, by V1 ( 0 ) , we mean V 1 (0-).
Since o (48) shows that Vi(Q) = 0, it
Noe
w turning to 2 > 0, from (43) and
fol6ows that (53) holds for x < 0
(44)
for x > 0 (53 )
V()W>O
and
we see that
- aeco[A
Vo.(x)
·
^2"
2:
'J
]2
for x>
0
(49) and (37), we have
From (43),
2
A2 V(0)
-
pcp
-
+ 2P
~
2w
e
7-A
1
0
b-f
C
b;2-lJ
Recalling (22) we thus have,
c)J +.-A
2c
2
L
_c
A72 1
f'or x > O
(54)
are right eigenvectors of AI corresponding to the
and
q2
(55)
y+ q +q
()
eigenvalues
(:
.
ai-
..
2)
respectively
and
[:j=
-q+
ql
q2
[f:
+E
'
ql +q 2
[3rsq2
(56)
Moreover, equating the first components of both sides of the equation
A W
X+
+
+
and noting (4), we have
1
=
y + q
-
X(r
-
d)
(57)
24
Using (55),
(56) and (57) in (54), we get
il-·
1() 1
e
d
c
C(Y + ql +q 2)
d
X]
(Yf
(q,+ q2 )
- d) > 0, then clearly V(X)
If y -X(r
I
-
e
> O0 for x > O and (53) is valid.
So suppose that y - X+(r - d) < 0 and note, by differentiating, that
+
V (x)
X+ (r-))
.)die
+( q
c-(yy'
c(y - X(r-d))
Y(q 1 +q 2 )
(y + q +q 2 )
d
for x > 0
(y + ql + q 2 )
If p
Hence, to show that
for all x > 0 if and only if p + n > O.
Vi (
x)
> 0
> 0
+ ne
0 and n < 0 are constants, then pe
>
0
for allx>
we only need to verify that
c+ +
d
c(Y
¥ -
+ (r-d))
y + q1 + q 2
Y(q +q2)
>
.
d -f
d
or equivalently, that
.cy +.c (y It
is
+(r - d)) >o
easy to check that (?+ + X)(r
d)
(y + ql) - (y + q)(rd
so, substituting for X+, we only need to verify that
I d
q2
d
(
+ qL
But this is in turn equivalent to,
X (r - d))
+ c y
>
0
d
and
25
C+-l~+>
c
+ c
q 1d -
( + q2 +
A d)(r
d)
which is in fact true, since z* given by (42) satisfies z* - Oo
Hence
(53) is valid, proving (52.ii).
Turning now to the case z* > 0 we show a similar result.
Lema 4
Suppose z* given by (42] is strictly positive~, ad
iV() is defined by
(46)., then (52.i) and (52oii) are valid.
Proof
is easily checked that Ve). satisfies (23) for x < 0, 0 < x < z*
It
and x > z*, and so in all three of these cases (52 0 i)
x
is satisfied.
At
O, (50) and (46) again show that V(0) - A½V(O) and so (52oi) is also
valid at x = 0.
(Z*
Turning now to x - z*, we uote from (51) that
d-V(
z
(z*)
q,
-dl(Z*)
1i
'dVZ*)
Also,
Y +
t*
V
.+ q
= -d
im {A2V(x)
+
bq
cJV(z*)
z
d{A 2V(z*) + b2 CZ}
+ b 2 c} x
d Lim
V(a)W
-dV(z*)
where we have used the continuous differentiability of V(o).
valid for all x.
Now turning to (52.ii), since KTz
in (2), we need to show that
()
Hence (52.i) is
is of the form shown
26
(58)
for x > z*
V l(x) > 0
From (46) we obtain
Consider 0 < x < z* first.
z*l
A (x
v(x)
and
for x < z*
Vl (x ) < 0
for 0 < z < z*
fh c+ -
e
Now
where, by V(O) and V(z*) we mean V(0+) and V(z*-) respectively.
hb
c % l .where
by
8 :-=
+
Bd
+c Y + q2 )
q
+ c
d
> O.
(x - z*) > 0 and denoting
Let t :=-
ql
the inverse Laplace transform, we have
t-1
=
,O]e
,(x)
eq
.z[
lr[,o](sI + Al)-I
q 1 e(e
__________
-
-xQ+ (s
with strict-inequality for x- z*.
(r
d)
e
d)(X+
d) + -
Moreover, from (51), V 1 (z*)
so the validity of (58) for 0 < x < z* is established.
)
=
O, and
By continuity of
V.(x), we see now that V 1 (0) < O, thereby establishing (58) for x - 0 also.
Now we consider x < O.
V(x2
1
Ie4
V(O) -
Since (23) is satisfied for x < O, we see that
blc _] +
lbl
lA
+
2
lc
for x < 0
(59)
Hence
t(x)
- Ale 1V (o)
From Lemma 2, we know that V(O)
V(x)
= kX+e+
for x <0
A_ bl-c] + Alblj
+x +x
w + ALb
Noting (27) and (37), we have
-A2blc
b
kw
for x < 0
for some constant k, and so
27
)
V (O)-
k0Xe
Lx)
Now we consider x >
0 < x < Z*.
z*)
A (x
*o
Fprom (47) we have,
+
for x > z*
s= P
+
*(g
e
q
Y
+
:=
>
-
0, and hy V(z*) we mean V(z*+)
Now using
(56) bdwde get
(55) a
+
r (i
V ()°-
with strict
of (8)
for xXe O
< 0 for x < 0 and s.O (58) is valid for z < 0 in addition to
that Vl(X)
where
)
(0) < 0 as previously shown,it follows
0 for x < 0 and since
Since Xax
L
is
e-
r
d2)
b ql
ea
inequality except at x - z*.
>
for x > z.
Since V!(z*) - 0, the validity
Iso established for x > Z*o
Theorem 5
Let
60.i)
60
i).
60.iii)
z* be deainefd as in C42)
7*b()
be definaed as in (2)
VCe) be defined as in (44) or (46) depending on whether z*
z* > 0.
60.iv)
V(-) is continuously differentiable
60.v).
wt Z*() and V(.-) satisfy (52.i) and (52.ii) and hence the
Hamilton-Jacobi-Bellman dynamic programming equation.
60.vi).
V(r.) =0(
I)
as x ±
Q0
or
28
Proo f
We have already established (60.iv) and (60.v) in Leas 1, 3 and 4.
To show (60.vi) note that (44) and (46), (26), (27) together with Lea
Moreover, since both eigenvalues of
show that V(x) - O(IxI) as x . - ".
follows from (44) and (46) that V(x) - O(x)
A2 are strictly negative, it
as x
X.
2
aLSo.
.- +
ADJMISSIBLE POLICIES
It
is now time for us to address more general issues.
We begin by
defining the class of admissible policies.
Definition:
A measurable function r : R -
f[O,r] will be called an admissible
R 2 with Z > 0, there exists a function y (t;T,r)
policy if, for every.(r,)
which satisfies
(61.i)
y (t;e,g) is absolutely continuous in t
(61.ii)
y1T(t;r,g) =
(61.iii)
y (t;T,~) is continuous in (t,Tr,).
(61.iv).
yC(-)
t
_-
+ J-(Ir(s;
r
i))
d)ds
for t > t
is the unique function satisfying (i and ii) above.
Given such an admissible
we now describe the manner in which we interpret
the differential equation (5.i),
Let {s(t,w); t > 0} be a realization of
(5.ii) with, say, s(O,w) - 1 and suppose x 0 is the initial inventory level.
Define tr
0 ()
:-
inf{t > 0 : s(t,w) - z} and Ti
+ l ()
inf{t > Ti(W);
s(t+, w) # s(t-, w)}. Then we construct the process {(x (t,w)} by,
x (t,w)
:= y (t;O,x
0
for 0 < t < tO(W)
:= x (ri(), w) - d(t - ri(W))
for ri(<)
< t < Ti+ l(W)
and i = 0,2,4,...
:= Y (t;tri().,
X (,i(w),W))
for T.i()
< t <
and i = 1,3,5,...
i+ l(W)
29
Noce that an i{ediate consequence is
(ar(sw) - d)ds
+0
X (t,)
(62)
for all t > 0
where
0
if s (t,)
'rrSC~(t,9z))
if s(tw)
u (t,c)
"
= 2
1X
Thus,- the differential equation (5.i) is interpreted in integral form in
(62) .
One can use the theory of semigroups of nonlinear contractions in
Banach spaces, see Barhu 16J., to obtain sufficient conditions for a policy W
to be admissible.
of t:he
z ()
We now use this 'to establish the admissibility of policies
type.
Theorem 6
i
(o)
defined by (18) is admissible.
Proof
Let A, a multivalued operator, or equivalently a subset of R 2
b.e
defined by
A(x) :-'{r
0S-d,
r
:e {-%}
Then x 1 <
if xg< z
d}
d]
if
x - z
H 9if
> z
and Y!E A(x1), Y2I
A(x2) implies that (x
- x 2)(y
- y2)
0
and so A is a dissipative operator, see Definition 3.1, page 71 of Barb.u 16].
Moreover
U
AR
U
{x - y} - R
A
-dssipa,
see ,
page 7
and so A is m-dissipative, see £6, page 71].
Ar
e
Also, for every x6
R,
3a
and IZTZ(x
(itZ(x) - d) A.(x)
- d
<
fyI_
for every yEA(x)
By Corollary 1.1 and Theorem 1.6 of [6, page 118] we see that (61.i, ii,
iv) are satisfied.
Moreover, hy Proposition 1.2
a function of t, for each g,
is
[6, page 110], y z(t;'r,)
2
)I <_.
-z
for all t > r, it
1
2
|Y z(t;r,
Since by uniqueness y Z(t;,S)
)
- y
(t-t;
Y
IY z(t-rT 2 ; O·,l)l - Y (t=-T:; O,2) that y
continuous function, and so (61.iii) is also satisfied.
.XI.
1
follows from {y Z(t;:-l,,) - y Z(s;T2,i 2 ) <
,-'(t
; o
)+
y(
1.
0,,)
Z(t-r; 0,1)
is a
.
INTEGRAL EQUATION FOR COST FUNCTION
In
this section we will show that the cost function corres.pondiag to
a policy ar satisfies a certain integral equation.
LetTil} be the successive jump times of (s(t)}.
is
If {x (t); t > O}
the trajectory resulting from a policy T, define
V' (9) :
£C.
C(x(t))e
dt
Is(O)
x (0)
-i,
as the. cost of using T upto the n-th jump of {s(t)}.
:
c).
Eere
ca x + C x
Clearly
zaim
(V )
is
V,
()
the corresponding erxpected cost of using
Define
xa(<,9)
:, y(t ;0O,)
x(t,S)
:-
:d
t-
as
a semigroup of nonlinear contractions, and
so from Definition 1.1 of 16, page 98] we see that
yrz(t;tZ
and
(63)
it
indefinitely.
3l
) represents the inventory level at time t if initially the
Clearly x (t,
inventorv level is e, s(Q) - i, and there are no jups of {s(t)} i .f0,t),
By a renewal argument it follows. that
V
vi,1(T) (
- 0)-0
0
and
rt
VUf~
)i,
rT
T
ia
i
,g))dt + a
Y:(x;(
(64)
(x ,))) (64
(x
j(i), ,a
I
ve
where
j (i)E{fl,l2,
j(i)
P i
l,
be the Banach space of all measurable functions
> O0, let {
For
fo~rt
mappig a into R, with norm defined by
eF fl
letF:
0nf
,h2 ,,-
and no te
(
,f2, TZ
(,r
2
f
x F
{p
:" sup{e {:'f(x)l,
tfik
ge4{
(19f21
aF
i
Ftor (flf2 2
_2
and
max
llt
define T (f1,
2
:
by
)
)
fji)
fq
:
((t))dt
i
+
(a(
je
)da
(65)
Lelma 7
i.
ii)
If (fl,f 2 )EYf 2
2)
then T(flf
E
2
is a contraction with respect to the noarm
T
j
EI
for every
> 0 sufficiently smaall.
Proof
It suffices to show that Ti f
for i
f1,2o
-( t<,O.i
j
(f) E jFand that Ti
Note first thar by (62),
I
+
k1 t
is a contraction
32
In the above and what follows, all the kt's are constants chosen appropriately.
Since c (x) < k2Ixl,
fqie a
t
eY
it
follows that
c(xi(t, ))dt]da < kI
+ k4
(y + q 1 )
Also, for 0 < E
k
<
~~.~~~~~"
ence
±T
El+
e
< k
f(l
sufficiently small, i.e. Ti
consider 0 <E <Y
~(T:
f -
T
,
I
E
f
f
f
and so, T
+ k
is. a contraction,
To show that T.
then
(irXg)()<
±+d+r eiei<i>
(TfJ±e
·-
o
I(4)qi
| f
( a,))
T ) !g(xi
(q.i+ Y)a E(I9, + ka)I
t,, l l
qe
if
1i
- T
Il~:i,r
hence
<c < 1.
eei
' <_ elt
~! <sl !e- !If
- ~:i,,gl i~
.
w,~=, o < B < ~.. ~-,',~-
where 0
V0 := 0.
for n = 0,1,2,... with'~~~~~T
da
l
E
+glI
< 5eEll[lf
From (64) it follows that if Vl
d
-
:
,)
1
(V<
V2
E
then Vn+
dg
T1
33
The oram 8
Let V. ,,.()
).
x(O)
denote the cost of using I
(
Let
(i))O
V2
(Vl (M),
:V
Then
integral equation.
:V
is thhe
660i)
Vt
for every
66.ii)
IT (xT (F, ))jda
+ e-YV
te-(x
qecf7r,))dt
q=
'
1s
starting in the state (s(O)
E R.
R
For every f E
2
(n)f
, Zim T'
V.
Proof
-(=)0ee
Lim TI(T)O
Fromt (63) and (64), we see that V
n-fold ittrate of Tw
TI
a contraction,
is
s and
0 is the identically zero function,
Lim Tn) f is
It
denotes the
However,
since
a unique fixed point of T 7 , for every
Htnce Vn is the unique solution,
f E .
(n)
where
RV
in, F of VT -
Cl
is important to note that the boundary condition for the integral
equation is really the condition that the solution be in
.
This is a
condition on the asymptotic growth rate, and serves to differentiate the
infinite horizon problem treated here from the finite
horizon problem in
[31.
OPTeIALITY of fz*
XIIo
We are now ready to prove the optimality of the s.ugges.ted policy.
We
shall actually show that optimality is a consequence of it* and V(-) satisfying
,vvi) for the infinite time problem and so our proof is
the HJB equation (60iv,
quite general.
Theorem 9
Let z* and
z* > 0, define V(-) by (46).
while if
i)
If
rz*(.) be as. 'in (3) and (2).
If
z* = 0,
define V(.)
b.y (44),
Then
V (') represents the cost function corresponding to an admissible policy w,
then
V
*(V) < V
-
ii
(5)
for i
= 1,2; Z
R and all
admissible wT
34
E R
for every
,(~) - v(M)
ii) V
iT
Proof
We will show that
TV
i
i.e.
>
*
g
for every admissible i
*
V
z* (
Since Tr is monotone,
)
for every 5 E R and i
z*(v)
->
(67)
(67) implies that T()V
it
> V
Z* -
1,2.
Taking the limit
Z*
V z*> that is (i) for i = 1,2.
in n and using (66.ii), we obtain %
goal is to show (67), along with. equality when rt = T
(68)
So our
Considering (52.i,-ii)
.
we have
(u - d)l(x) - (Y + ql)VCl(x) - q1 V 2 (x)mmn
u E [O,r]
-dV2 (x)
(69)
C(
+ q2 )V2(x) - c(x)(70)
-q 2V1 (x) + (
For any iT therefore, (69) implies that
c(x) > (y + ql)v
- q
(x)
(r(x)
(X)
(71)
- d)7(x)
and so for any admissible T,
1.
> (y + ql)V1 (x
c(XI(ti))
1
(t:,))
-
qiV2 (x
1
(tOT)
-
1
(i(x (ti))
1
d)V (x Cit
))
(72)
Now noting that x (t,Q)is absolutely continuous in t,
(
1r(x(,t))
- d),and V 1'(
)
with derivative
is continuous, we can apply Corollary 7 of [7]
showing that the chain rule is valid, and so obtain
d V CX1 (t,
-dv
l,'(XIT
V (x '(t
(x (t')
)) O ~
'IT(,)-
- d)
a.e. (7
(73)
35
Hence, from (72) and (73), we have
-| YC(X (t,%))dt >
v2
x-
(t,)
+ q)v(
eY[(y
t
)d
(74)
fJeYt ~ Vl(t
a
for
(,))d:
> 0
Integrating the last term in (74) By parts, see Hewitt and Stromherg [9, po 287L.
we have
|e-
C(:1(t,D)dt > |aeY q1tv (x-(it,)) - V (x (teg)))dt
joqla
for a > 0
e- v(x(,))
+ vl(
(=,O
(e
vl(~
I
+ fqe ?
l
71
-(Y + q)i
f)
~0z
Jo
- Jo:
vlx~(
+ j)d:
V xa(ago))daa
,)))dajdC
ytq (V1 (x1(t,')) - V2 (xi(t
9
2)
Xd
(tS))de
Hence
¥@ e
:qe
(Xx
+e
l(t,t))d
VlgC(za))
i.e.
(T~
v2)(9) > V (g)
noting that from (60.iv), V(x) = O(x) and so V Ef 2
(70), similarly we deduce that (T
T V > V.
V2,l)(9)
=
V 2 (g),
Domain (T )o
Using
Thus we have shown
On the other hand since Tr*() attains equality in (71), we have
36
T
V
V.
V < T V
Thus V Z*
iT
XI%.
T V
r
12
with equality when r Ir
.
i
CONCLUDING RMARKS
There are two directions in which more work is needed.
The first
is to realize the full program for flexible manufacturing systems outlined in
Kimemia and Gershwin £2J.parts on m machines..
Consider a flexible manufacturing sy.stem making p
Part j requires. ai
units of time on machine i.. Thus if a
subset sk c {1,2,...,m} of machines is functioning, while the rest have
failed, then a vector u - (u 1,u
if and only if u E Uk where U k :
if i £ sk}.
,up)
of production rates is feasible
{u
:
a
2 ,...
iu.<
if .
E $k
or
Suppose now that each machine is subject to occasional failure
(s(t); t > 0} be a Markov chain with state space {Sls2,...
and let
Given a demand rate vector d - (dl,d 2 ,...,dp),
x
,Sm}S
we have the problem
u(t)-d
s(tj); t > O}
is
a Markov chain with state space {s ..'.
1
5s
}
2m
u(t) E Uk if
s(t) = sk
Minimize E
e-Ytc(x(t))dt
where c(-) is
of x(t),
has
some convex cost function.
Due to the multidimensional nature
this problem is much more difficult than the one solved here.
proposed an approximation,
£2]
but the optimal solution needs more study.
The other direction in which more research is needed is theoretical, and is
the problem of optimal control of continuous time systems with jump Markov
disturbances.
As in Rishel (3], we also have proved optimality only with-
in the class of Markov policies.
For discrete time systems, see Blackwell [1o]
for example, much more progress has been made on optimal control, and one
usually considers a much more general class of policies within which optimality
is proven.
The question of existence of optimal controls also needs more study.
Finally, more work needs to be done on the average cost problem for systems
with jump Markov disturbances, see Tsitsiklis [8].
The author is grateful to S. Gershwin for introducing him to the problem studied here, to S. Mitter for his friendly hospitality during his
visit to M.I.T., and to
To Seidman for'several useful discussions,
REFEENCES
1.
J.o G.
Control of Production in Fle.xible Manufacturing
imemia, "iserarchial
Systems," Ph.Do Thesis, L.Io.DS.-TH-1215, M.I.To,
2,
Cambridge, September 19382.
J, G0 Kimemia and S. B. Gershwin, "An Algcorithm for the Computer Control of
Production in Flexible Manufacturing Systems," IEE
Transactions, vol. 15,
pp. 353-362, December 1983.
3,
I. Rishel, "Dynamic Programming and Minimtia Principles for Systems With Jump
Markov Distrubances,"
SIAM Journal on Cont:rol, vol.
13,
pp.
338-371, February
1975 .
4.
rassovskii and E. A. Lidskii, "Analytical Deisgn of Controllers in Sto-
No.N.
castic Systems With Velocity Limited Controlling Action,"' Applied Mathematics
and Mechanics, vol 25, pp. 627*-643, 1961,
3.
E. A. Lidskii, "Opeimal Control of Systems with Rand'om Properties," Applied
Mathematics and Mechanics, vol. 27, pp. 33-45, 1963.
6.
V.
Barbu, Nonlinear Semigrouus and Differential Equations in
Noordhoff Interna!ltional ?ublishing, Leyden, The Netherlands,
7.
J.
Serrin and D.
E. Varberg,
Banach Soaces,
1976.
"A General Chain Rule for Derivatives and the
Change of Variables Formula for the Lebesgue Integral," The American Mathematical Monthly, vol.
8,
76,
J. N. Tsitsiklis, "Convexity and Characterization of Optimal Policies in
E. Hewitt and K. Stromberg, "Real and Abstract Analysis," Springer-Verlag,
New York, 1965,
10.
D. Blackwell, "Discounted Dynamic Programming," Annals of Mathematical
Statistics, vol. 36, pp. 226-235, 1965.
~
.."^~~~~~~-~-""
~ ~
" -'I"`"^~'
a
LI.DS,°-R-i1178, M.I.T., Cambridge, February 1982,
Dynamic Routing Problem,'"
9.
pp. 514-520, May 1969.
~1
.
-.···i-; ··
- -- ···
-,.··- ·-----
·- ··-
···--
··-
- ----
·-
·--
Download