IE 434/IE 534 – Fall 2009-2010
Comments on HW2
General comments:
These questions were meant to make you more attentive to the numerical side of
stochastic models as well. They turned out to be more tedious than I expected, since
you are not used to solving such problems with a high-level programming language.
For the same reason, I am not yet able to grade your first question, as I have not yet
solved the problem numerically myself.
Comments on the problems
Problem 1 (Total: 100 pts) – 4 % of the overall grade
This problem is an extension of the cash management problem we have seen in class. The
extension is in two directions: the policy now has two target levels, s1 and s2 (rather
than a single target cash level after a bank transaction), and the fixed transaction cost is
different for state 0 and for state S. The solution of this more difficult problem is
fundamentally similar to the example solved in class; however, the computational
requirements are heavier, since computing the results parametrically is either less
straightforward or not possible at all. Hence, this question is an exercise to warm you up
for the evaluation of more realistic MC problems.
(1)
The model considered in class had a regeneration point after each bank
transaction: each time state 0 or state S was reached, the cash level was reset to the same
target s. In this new case, on the other hand, the solution will not be as simple, as we may
start a cycle with s1 or s2.
Note that we have four types of cycles (as defined in class): Cycles that start with
s1 and end with 0 (and hence start the new one with s2); cycles that start with s1 and end
with S (and hence start the new one with s1); cycles that start with s2 and end with 0 (and
hence start the new one with s2); cycles that start with s2 and end with S (and hence start
the new one with s1). A regeneration cycle (call it now a grand cycle) can now be
defined:
Consider a series of cycles (as defined in the above paragraph), the first one
starting from s1. The grand cycle ends with the first cycle that ends with S. Note
that there can be one cycle, two cycles, three cycles, etc. in a grand cycle. In fact, the
number of cycles in a grand cycle is a random variable!
(2)
A) Compute the column vector $m = (I - Q)^{-1}\mathbf{1}$, where $Q$ is the transition
matrix restricted to the transient states and $\mathbf{1}$ is a column of ones. The row
corresponding to s1 (or s2) is the expected number of steps asked.
B) Compute the matrix $W = (I - Q)^{-1}$ of size (S-1)X(S-1).
C) Let $A$ be the matrix of absorption probabilities of size (S-1)X2 (as we have
two absorbing states). Then $A = W R$, where $R$ is the matrix of size (S-1)X2 denoting the
one-step transition probabilities from transient states to absorbing ones. The row
corresponding to s1 (or s2) is the probability asked.
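Parts A) to C) can be sketched numerically as follows. This is only an illustration, not the homework's actual chain: the simple random walk on 0..S (up with probability p, down with 1-p, states 0 and S absorbing), the names Q and R for the transient and absorbing blocks, and the helper names `solve`, `random_walk_chain` and `absorption_quantities` are all assumptions made for the example. Exact fractions are used so that the results can be compared with known closed forms.

```python
from fractions import Fraction


def solve(M, B):
    """Solve M X = B by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]  # augmented [M | B]
    for col in range(n):
        # Swap a row with a non-zero pivot into position, then normalise it.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def random_walk_chain(S, p):
    """Hypothetical chain: transient states 1..S-1, absorbing states 0 and S."""
    q = 1 - p
    n = S - 1
    Q = [[Fraction(0)] * n for _ in range(n)]  # transient -> transient
    R = [[Fraction(0)] * 2 for _ in range(n)]  # columns: absorb at 0, absorb at S
    for i in range(1, S):
        row = i - 1
        if i == 1:
            R[row][0] = q
        else:
            Q[row][row - 1] = q
        if i == S - 1:
            R[row][1] = p
        else:
            Q[row][row + 1] = p
    return Q, R


def absorption_quantities(Q, R):
    """Return W = (I - Q)^-1, m = W 1 and A = W R."""
    n = len(Q)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    i_minus_q = [[identity[i][j] - Q[i][j] for j in range(n)] for i in range(n)]
    W = solve(i_minus_q, identity)
    m = [sum(row) for row in W]  # expected number of steps until absorption
    A = [[sum(W[i][k] * R[k][c] for k in range(n)) for c in range(len(R[0]))]
         for i in range(n)]
    return W, m, A
```

State x sits in row x-1 of W, m and A, so the rows for s1 and s2 are read off at indices s1-1 and s2-1. For the symmetric walk (p = 1/2) the output can be checked against the classical gambler's-ruin results: expected duration i(S-i) and probability i/S of ending at S.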
D) Here, we have to compute the probability distribution of N, the number of
cycles in a grand cycle, as explained in part (1). Assume we start with s1. Let $A_{x,y}$
denote the probability that we start in transient state x and end up in absorbing state y
(as computed in part C)). The probability of each realization of N is then given in
the following table:
N        Prob(N = n)
1        $A_{s_1,S}$
2        $A_{s_1,0}\,A_{s_2,S}$
3        $A_{s_1,0}\,A_{s_2,0}\,A_{s_2,S}$
…        …
n        $A_{s_1,0}\,(A_{s_2,0})^{n-2}\,A_{s_2,S}$
…        …
Hence, the expected value of N can be computed.
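As a worked step, the expectation can be summed in closed form. The derivation below assumes only that each cycle ends in either 0 or S, so that $A_{x,0} + A_{x,S} = 1$, and that $A_{s_2,S} > 0$; the second line uses the mean of a geometric distribution.

```latex
\begin{align*}
E[N] &= A_{s_1,S} + \sum_{n=2}^{\infty} n \, A_{s_1,0}\,(A_{s_2,0})^{n-2}\,A_{s_2,S} \\
     &= A_{s_1,S} + A_{s_1,0}\left(1 + \frac{1}{A_{s_2,S}}\right) \\
     &= 1 + \frac{A_{s_1,0}}{A_{s_2,S}} .
\end{align*}
```

In words: there is always one cycle from s1; with probability $A_{s_1,0}$ it is followed by a geometric number of cycles from s2, each ending the grand cycle with probability $A_{s_2,S}$.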
E) The same logic gives the total cost for a given value of N. The expected total
cost for each realization of N is given in the following table:
N        TC(N = n)
1        $K_1 + \sum_{j=1}^{S-1} r_j w_{s_1,j}$
2        $K_1 + \sum_{j=1}^{S-1} r_j w_{s_1,j} + K_2 + \sum_{j=1}^{S-1} r_j w_{s_2,j}$
3        $K_1 + \sum_{j=1}^{S-1} r_j w_{s_1,j} + 2\,(K_2 + \sum_{j=1}^{S-1} r_j w_{s_2,j})$
…        …
n        $K_1 + \sum_{j=1}^{S-1} r_j w_{s_1,j} + (n-1)\,(K_2 + \sum_{j=1}^{S-1} r_j w_{s_2,j})$
…        …
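Using the probabilities computed in part D), we can compute the expected total cost of a
grand cycle; dividing it by the expected length of a grand cycle, $m_{s_1} + (E[N]-1)\,m_{s_2}$
with $m_{s_1}$ and $m_{s_2}$ computed in part A), will yield the long-run expected average
cost per period.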
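The combination of the tables in parts D) and E) can be sketched as below. The function names and the sample numbers in the usage note are hypothetical; c1 stands for the N = 1 entry of the cost table (K1 plus the sum over j of r_j w_{s1,j}) and c2 for the bracketed term that is added once per extra cycle (K2 plus the sum over j of r_j w_{s2,j}). The closed form assumes A_{x,0} + A_{x,S} = 1 and uses E[N] = 1 + A_{s1,0} / A_{s2,S}, which follows from summing the distribution of N.

```python
def prob_n(n, a_s1_0, a_s2_0):
    """P(N = n) from the table in part D)."""
    a_s1_s = 1 - a_s1_0  # probability that a cycle from s1 ends in S
    a_s2_s = 1 - a_s2_0  # probability that a cycle from s2 ends in S
    if n == 1:
        return a_s1_s
    return a_s1_0 * a_s2_0 ** (n - 2) * a_s2_s


def total_cost_given_n(n, c1, c2):
    """TC(N = n) from the table in part E): c1 + (n - 1) * c2."""
    return c1 + (n - 1) * c2


def expected_total_cost_series(a_s1_0, a_s2_0, c1, c2, n_max=2000):
    """Expected grand-cycle cost by truncating the sum over n at n_max."""
    return sum(prob_n(n, a_s1_0, a_s2_0) * total_cost_given_n(n, c1, c2)
               for n in range(1, n_max + 1))


def expected_total_cost_closed_form(a_s1_0, a_s2_0, c1, c2):
    """Same quantity via E[N] = 1 + A_{s1,0} / A_{s2,S}."""
    expected_n = 1 + a_s1_0 / (1 - a_s2_0)
    return c1 + (expected_n - 1) * c2
```

For instance, with the made-up values A_{s1,0} = 0.4, A_{s2,0} = 0.7, c1 = 10 and c2 = 6, both functions agree up to rounding, which is a quick way to check a hand computation of the series.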
(3) This one is easier.
Problem 2 (Total: 100 pts; 30, 30, 40) – 1 % of the overall grade
This is a standard Markov Decision Process (MDP) modeling problem. The process is
formulated by identifying three states indicating the market value and three investment
decisions. In particular, we define states i = 1, 2, 3 to correspond to market status of
11000, 12000 and 13000 marks, respectively, and we define decisions k = 0, 1, 2 to
correspond to Do Not Invest (DNI), invest in the Go-Slow Fund (GSF) and invest in the
Go-Go Fund (GGF), respectively. In accordance with this, you can compute the values
of $C_{ik}$ for all i = 1, 2, 3 and k = 0, 1, 2. Notice that $C_{i0} = 0$ for all i. The
transition matrix is the same for any decision (i.e., the market is not affected by
individual decisions). The MDP has a total of 27 stationary deterministic policies. For
each policy, the (long-run) expected average cost per period is given by
$\sum_{j=1}^{3} \pi_j C_{j k_j}$, where $k_j$ is the decision the policy takes in state j and
$\pi_j$ is the steady-state probability of state j. In fact, we observe that some policies
dominate other policies even without the need to evaluate them. It is not hard to find
that the steady-state probabilities are $\pi_1 = 7/41$, $\pi_2 = 19/41$ and $\pi_3 = 15/41$.
By calculating the long-run expected average cost per period for each policy, it is found
that (2,2,0), which corresponds to (GGF, GGF, DNI), is the optimal policy, with a long-run
cost of -$6195/year.
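The enumeration of policies can be sketched as follows. The steady-state probabilities are the ones quoted above; the cost matrix `HYPOTHETICAL_COST` is made up purely for illustration (it is not the homework's data, so the resulting cost is not the figure quoted above), and the function names are assumptions. Decisions are coded 0 = DNI, 1 = Go-Slow Fund, 2 = Go-Go Fund.

```python
from fractions import Fraction
from itertools import product

# Steady-state probabilities of states 1, 2, 3 (policy-independent,
# because the transition matrix does not depend on the decision).
PI = [Fraction(7, 41), Fraction(19, 41), Fraction(15, 41)]

# HYPOTHETICAL costs C[i][k] for state i (rows) and decision k (columns).
# Column 0 is zero because not investing costs nothing.
HYPOTHETICAL_COST = [
    [0, -500, -900],
    [0, -300, -400],
    [0, 100, 600],
]


def average_cost(policy, pi, cost):
    """Long-run average cost of a policy: sum over j of pi_j * C[j][k_j]."""
    return sum(pi[j] * cost[j][k] for j, k in enumerate(policy))


def best_policy(pi, cost, decisions=(0, 1, 2)):
    """Enumerate every stationary deterministic policy and keep the cheapest."""
    policies = list(product(decisions, repeat=len(pi)))
    best = min(policies, key=lambda pol: average_cost(pol, pi, cost))
    return best, average_cost(best, pi, cost), len(policies)
```

Calling `best_policy(PI, HYPOTHETICAL_COST)` evaluates all 27 policies, while `best_policy(PI, HYPOTHETICAL_COST, decisions=(1, 2))` restricts the search to the two funds and evaluates only 8. Since the probabilities do not depend on the policy, the minimum is also obtained by picking the cheapest decision in each state separately, which is why some policies can be discarded without evaluation.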
Your performance was satisfactory in answering this question, except that almost all of
you considered only the GGF and GSF decisions in your definition and hence had 8
policies instead of 27. Accordingly, the optimal policy in your case is (GGF, GGF, GSF)
with -$3268.29. For those who considered only two decisions in their decision
definition, 20 points were cut from the grade of part (a) of the question, while nothing
was deducted from the grades of the other parts as long as the answers were correct.