Dalton State College
INTRODUCTION TO OPERATIONS RESEARCH
Deterministic Dynamic Programming
DYNAMIC PROGRAMMING

Dynamic programming is a widely used mathematical technique for solving problems that can be divided into stages and where decisions are required in each stage.

The goal of dynamic programming is to find a combination of decisions that optimizes a certain quantity associated with the system.
DETERMINISTIC DYNAMIC PROGRAMMING

Dynamic Programming (DP) determines the optimum solution to an n-variable problem by decomposing it into n stages, with each stage constituting a single-variable subproblem.
Recursive Nature of Computations in DP
Computations in DP are done recursively, in the sense that the optimum solution of one subproblem is used as an input to the next subproblem.
DETERMINISTIC DYNAMIC PROGRAMMING
By the time the last subproblem is solved, the optimum solution for the entire problem is at hand. The manner in which the recursive computations are carried out depends on how we decompose the original problem.
In particular, the subproblems are normally linked by common constraints. As we move from one subproblem to the next, the feasibility of these common constraints must be maintained.

STAGECOACH PROBLEM
We illustrate with the famous STAGECOACH problem. It concerns a mythical fortune seeker in Missouri who decided to go west to join the gold rush in California during the mid-19th century. The journey would require traveling by stagecoach through different states.

STAGECOACH PROBLEM
Traveling out west was dangerous during this time frame, so the stagecoach company offered life insurance to its passengers.
Since our fortune seeker was concerned about his safety, he decided the safest route would be the one with the cheapest total life insurance cost.

STAGECOACH PROBLEM
[Figure: road network of states A through J, with the insurance cost on each arc.]
The arc costs, by stage, are:

    From A: to B = 2, to C = 4, to D = 3
    From B: to E = 7, to F = 4, to G = 6
    From C: to E = 3, to F = 2, to G = 4
    From D: to E = 4, to F = 1, to G = 5
    From E: to H = 1, to I = 4
    From F: to H = 6, to I = 3
    From G: to H = 3, to I = 3
    From H: to J = 3
    From I: to J = 4
STAGECOACH PROBLEM
Four stages were required to travel from the
point of embarkation in state A (Missouri)
to his destination in state J (California). The
insurance costs between the states are also
shown.
Thus the problem is to find the cheapest route the fortune seeker should take.

STAGECOACH PROBLEM
By greedily selecting the cheapest arc at each successive stage, we obtain the route A -> B -> F -> I -> J, with cost 13.
Replacing A -> B -> F with A -> D -> F, however, gives another route with cost only 11, so the greedy choice is not optimal.
One possible approach is to enumerate all the possible routes, of which there are 18. This is the so-called exhaustive enumeration method.
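As a check, the 18 routes can be enumerated by brute force. Below is a sketch in Python; the arc costs are taken from the network in the slides.

```python
from itertools import product

# Arc costs from the stagecoach network (state -> {next state: cost}).
COST = {
    'A': {'B': 2, 'C': 4, 'D': 3},
    'B': {'E': 7, 'F': 4, 'G': 6},
    'C': {'E': 3, 'F': 2, 'G': 4},
    'D': {'E': 4, 'F': 1, 'G': 5},
    'E': {'H': 1, 'I': 4},
    'F': {'H': 6, 'I': 3},
    'G': {'H': 3, 'I': 3},
    'H': {'J': 3},
    'I': {'J': 4},
}

# Every route has the form A -> x1 -> x2 -> x3 -> J with x1 in {B,C,D},
# x2 in {E,F,G}, x3 in {H,I}: 3 * 3 * 2 = 18 routes in total.
routes = []
for x1, x2, x3 in product('BCD', 'EFG', 'HI'):
    route = ('A', x1, x2, x3, 'J')
    cost = sum(COST[a][b] for a, b in zip(route, route[1:]))
    routes.append((cost, route))
routes.sort()

print(len(routes))   # 18
print(routes[0])     # (11, ('A', 'C', 'E', 'H', 'J'))
```

Exhaustive enumeration works here only because the network is tiny; the point of dynamic programming is to avoid this combinatorial growth.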
STAGECOACH PROBLEM
Now let's do the same problem through dynamic programming, using the following building blocks:
 Stage
 State
 Decision variable
 Optimal policy (optimal solution)
DYNAMIC PROGRAMMING

There does not exist a standard mathematical formulation of "the" dynamic programming problem. Rather, dynamic programming is a general type of approach to problem solving, and the particular equations used must be developed to fit each situation.
DYNAMIC PROGRAMMING

Dynamic programming starts with a small portion of the original problem and finds the optimal solution for this smaller problem. It then gradually enlarges the problem, finding the current optimal solution from the preceding one, until the original problem is solved in its entirety.
FORMULATION

Let the decision variable x_n (n = 1, 2, 3, 4) be the immediate destination on stage n. The route selected is A -> x_1 -> x_2 -> x_3 -> x_4, where x_4 = J.
Let f_n(s, x_n) be the total cost of the best overall policy for the remaining stages, given that you are in state s, ready to start stage n, and select x_n as the immediate destination.
Given s and n, let x_n* denote any value of x_n (not necessarily unique) that minimizes f_n(s, x_n), and let f_n*(s) be the corresponding minimum value of f_n(s, x_n).
FORMULATION
Thus

    f_n*(s) = min over x_n of f_n(s, x_n) = f_n(s, x_n*)

where

    f_n(s, x_n) = immediate cost (at stage n) + minimum future cost (stages n+1 onward)
                = C_{s,x_n} + f_{n+1}*(x_n)

The value of C_{s,x_n} is given by the preceding cost table, with i = s (the current state) and j = x_n (the immediate destination); here f_5*(J) = 0.

The objective is to find f_1*(A) and the corresponding route.
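The backward recursion can be sketched directly in code. This is a minimal Python sketch using the arc costs from the network; the state ordering simply guarantees that each state's successors are evaluated before the state itself.

```python
# Backward recursion: f*(s) = min over x of [ C(s, x) + f*(x) ],
# with the boundary condition f*(J) = 0.
COST = {
    'A': {'B': 2, 'C': 4, 'D': 3},
    'B': {'E': 7, 'F': 4, 'G': 6},
    'C': {'E': 3, 'F': 2, 'G': 4},
    'D': {'E': 4, 'F': 1, 'G': 5},
    'E': {'H': 1, 'I': 4},
    'F': {'H': 6, 'I': 3},
    'G': {'H': 3, 'I': 3},
    'H': {'J': 3},
    'I': {'J': 4},
}

f = {'J': (0, None)}   # state -> (f*(s), an optimal immediate destination x*)
# Stage 4 states first, then stages 3, 2, 1, so successors are always known.
for s in ['H', 'I', 'E', 'F', 'G', 'B', 'C', 'D', 'A']:
    f[s] = min((c + f[x][0], x) for x, c in COST[s].items())

print(f['A'])   # (11, 'C'): minimum total cost 11, first optimal move to C
```

Note that ties (such as C versus D out of A) are broken arbitrarily here; the stage tables below record all tied alternatives.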
SOLUTION

Stage n = 4 (here f_4(s, J) = C_{s,J} + f_5*(J), with f_5*(J) = 0):

    s  |  f_4(s, J)  |  f_4*(s)  |  x_4*
    H  |      3      |     3     |   J
    I  |      4      |     4     |   J
SOLUTION

Stage n = 3 (sample computation for state F):

    x_3 = H:  f_3(F, H) = C_{F,H} + f_4*(H) = 6 + 3 = 9
    x_3 = I:  f_3(F, I) = C_{F,I} + f_4*(I) = 3 + 4 = 7
SOLUTION

Stage n = 3:  f_3(s, x_3) = C_{s,x_3} + f_4*(x_3)

    s  |  x_3 = H  |  x_3 = I  |  f_3*(s)  |  x_3*
    E  |     4     |     8     |     4     |   H
    F  |     9     |     7     |     7     |   I
    G  |     6     |     7     |     6     |   H
SOLUTION

Stage n = 2 (sample computation for state C):

    x_2 = E:  f_2(C, E) = C_{C,E} + f_3*(E) = 3 + 4 = 7
    x_2 = F:  f_2(C, F) = C_{C,F} + f_3*(F) = 2 + 7 = 9
    x_2 = G:  f_2(C, G) = C_{C,G} + f_3*(G) = 4 + 6 = 10
SOLUTION

Stage n = 2:  f_2(s, x_2) = C_{s,x_2} + f_3*(x_2)

    s  |  x_2 = E  |  x_2 = F  |  x_2 = G  |  f_2*(s)  |  x_2*
    B  |    11     |    11     |    12     |    11     |  E or F
    C  |     7     |     9     |    10     |     7     |  E
    D  |     8     |     8     |    11     |     8     |  E or F
SOLUTION

Stage n = 1 (computations for state A):

    x_1 = B:  f_1(A, B) = C_{A,B} + f_2*(B) = 2 + 11 = 13
    x_1 = C:  f_1(A, C) = C_{A,C} + f_2*(C) = 4 + 7 = 11
    x_1 = D:  f_1(A, D) = C_{A,D} + f_2*(D) = 3 + 8 = 11
SOLUTION

Stage n = 1:  f_1(s, x_1) = C_{s,x_1} + f_2*(x_1)

    s  |  x_1 = B  |  x_1 = C  |  x_1 = D  |  f_1*(s)  |  x_1*
    A  |    13     |    11     |    11     |    11     |  C or D
OPTIMAL SOLUTION
[Figure: the network annotated stage by stage with the optimal value f_n*(s) at each state.]
Tracing the optimal decisions forward from state A (x_1* = C or D, then the x_2*, x_3*, x_4* entries of the stage tables) yields three optimal routes, each with total insurance cost 11:

    A -> C -> E -> H -> J
    A -> D -> E -> H -> J
    A -> D -> F -> I -> J
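All optimal routes can be recovered mechanically by walking forward from A and keeping only decisions whose arc cost plus the successor's optimal value equals the current state's optimal value. A Python sketch, with the f* values copied from the solved stage tables:

```python
# Recover every optimal route by walking forward through the network,
# keeping only decisions consistent with the optimal values f*(s).
COST = {
    'A': {'B': 2, 'C': 4, 'D': 3},
    'B': {'E': 7, 'F': 4, 'G': 6},
    'C': {'E': 3, 'F': 2, 'G': 4},
    'D': {'E': 4, 'F': 1, 'G': 5},
    'E': {'H': 1, 'I': 4},
    'F': {'H': 6, 'I': 3},
    'G': {'H': 3, 'I': 3},
    'H': {'J': 3},
    'I': {'J': 4},
}
F_STAR = {'A': 11, 'B': 11, 'C': 7, 'D': 8, 'E': 4, 'F': 7,
          'G': 6, 'H': 3, 'I': 4, 'J': 0}

def optimal_routes(state='A'):
    """Return every optimal route from `state` to J as a list of states."""
    if state == 'J':
        return [['J']]
    routes = []
    for nxt, c in COST[state].items():
        if c + F_STAR[nxt] == F_STAR[state]:   # nxt is an optimal decision
            routes += [[state] + rest for rest in optimal_routes(nxt)]
    return routes

for r in optimal_routes():
    print(' -> '.join(r))
# A -> C -> E -> H -> J
# A -> D -> E -> H -> J
# A -> D -> F -> I -> J
```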
GENERAL CHARACTERISTICS OF DYNAMIC
PROGRAMMING

The problem structure is divided into stages

Each stage has a number of states associated with it

Making decisions at one stage transforms one state of
the current stage into a state in the next stage.

Given the current state, the optimal decision for each of
the remaining states does not depend on the previous
states or decisions. This is known as the principle of
optimality for dynamic programming.

The principle of optimality allows us to solve the problem stage by stage recursively.
DIVISION INTO STAGES
The problem is divided into smaller subproblems each of them
represented by a stage.
The stages are defined in many different ways depending on the context
of the problem.
If the problem is about the long-term development of a system, then the stages naturally correspond to time periods.
If the goal of the problem is to move some objects from one location to
another on a map then partitioning the map into several geographical
regions might be the natural division into stages.
Generally, if an accomplishment of a certain task can be considered as a
multi-step process then each stage can be defined as a step in the
process.
STATES
Each stage has a number of states associated with it.
Depending on what decisions are made in one stage, the system might end up in different states in the next stage.
If a geographical region corresponds to a stage then
the states associated with it could be some
particular locations (cities, warehouses, etc.) in
that region.
In other situations a state might correspond to
amounts of certain resources which are essential
for optimizing the system.
DECISIONS
Making decisions at one stage transforms one state of the
current stage into a state in the next stage.
In a geographical example, it could be a decision to go from
one city to another.
In resource allocation problems, it might be a decision to
create or spend a certain amount of a resource.
For example, in a shortest path problem, three different decisions are possible at the state corresponding to Columbus; these decisions correspond to the three arrows going from Columbus to the three states (cities) of the next stage: Kansas City, Omaha, and Dallas.
PRINCIPLE OF OPTIMALITY
The goal of the solution procedure is to find an optimal policy for
the overall problem, i.e., an optimal policy decision at each stage
for each of the possible states.
Given the current state, the optimal decision for each of the
remaining states does not depend on the previous states or
decisions. This is known as the principle of optimality for
dynamic programming.
For example, in the geographical setting the principle works as
follows: the optimal route from a current city to the final
destination does not depend on the way we got to the city.
A system can be formulated as a dynamic programming problem
only if the principle of optimality holds for it.
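The principle of optimality is also what makes memoization sound: the optimal cost-to-go from a state depends only on the state itself, never on how we reached it, so it can be computed once and cached. A Python sketch on the stagecoach network:

```python
from functools import lru_cache

COST = {
    'A': {'B': 2, 'C': 4, 'D': 3},
    'B': {'E': 7, 'F': 4, 'G': 6},
    'C': {'E': 3, 'F': 2, 'G': 4},
    'D': {'E': 4, 'F': 1, 'G': 5},
    'E': {'H': 1, 'I': 4},
    'F': {'H': 6, 'I': 3},
    'G': {'H': 3, 'I': 3},
    'H': {'J': 3},
    'I': {'J': 4},
}

@lru_cache(maxsize=None)
def cost_to_go(state):
    """Minimum remaining cost from `state` to J.  Caching is valid because,
    by the principle of optimality, this value is independent of the path
    taken to reach `state`."""
    if state == 'J':
        return 0
    return min(c + cost_to_go(nxt) for nxt, c in COST[state].items())

print(cost_to_go('A'))   # 11
```

Each state's value is computed exactly once, so the work grows with the number of arcs rather than with the number of routes, which is the whole advantage over exhaustive enumeration.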