Dynamic programming by Minjie

Dynamic Programming
• Dynamic Programming is a general algorithm design technique
• It breaks a problem up into a series of overlapping subproblems (subproblems whose results can be reused several times)
• Main idea:
- set up a recurrence relating a solution to a larger instance to
solutions of some smaller instances
- solve smaller instances once
- record solutions in a table
- extract solution to the initial instance from that table
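The four steps above can be sketched on a toy recurrence. Fibonacci numbers are used here only as a stand-in (they are not one of the discussed topics), and the function name `fib_table` is hypothetical.

```python
def fib_table(n):
    """Bottom-up DP: every subproblem is solved once and stored in a table."""
    table = [0] * (n + 1)      # table[i] will hold the i-th Fibonacci number
    if n >= 1:
        table[1] = 1
    for i in range(2, n + 1):
        # Recurrence: a larger instance is built from two smaller instances.
        table[i] = table[i - 1] + table[i - 2]
    return table[n]            # extract the solution to the initial instance


print(fib_table(10))  # 55
```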
Discussed topics
• Assembly-line scheduling
• Partition of Data Sequence
• Route Segmentation and Classification for GPS data
• Longest Common Subsequence (LCS)
Assembly-line scheduling
Recursive equation:
$$f_1[j] = \min\bigl(f_1[j-1] + a_{1,j},\; f_2[j-1] + t_{2,j-1} + a_{1,j}\bigr)$$
where $f_1[j]$ and $f_2[j]$ are the accumulated time costs up to and including stations $S_{1,j}$ and $S_{2,j}$,
$a_{1,j}$ is the time cost for station $S_{1,j}$,
$t_{2,j-1}$ is the time cost to change from station $S_{2,j-1}$ to station $S_{1,j}$.
The equation for $f_2[j]$ is symmetric.
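A minimal sketch of this recurrence, filled in station by station. The function name, the 0-based list layout, and the optional `entry` and `exit_` costs are assumptions for illustration; the slide itself only gives the equation for f1[j].

```python
def fastest_way(a, t, entry=(0, 0), exit_=(0, 0)):
    """a[l][j]: time cost of station j on line l (l = 0 or 1).
    t[l][j]: time cost to leave line l after station j and move to the
    other line. entry/exit_ are optional extra costs (assumed, default 0)."""
    n = len(a[0])
    f1 = [0] * n
    f2 = [0] * n
    f1[0] = entry[0] + a[0][0]
    f2[0] = entry[1] + a[1][0]
    for j in range(1, n):
        # Stay on the same line, or switch from the other line.
        f1[j] = min(f1[j - 1] + a[0][j], f2[j - 1] + t[1][j - 1] + a[0][j])
        f2[j] = min(f2[j - 1] + a[1][j], f1[j - 1] + t[0][j - 1] + a[1][j])
    return min(f1[-1] + exit_[0], f2[-1] + exit_[1])


# Two stations per line: starting on line 2 and switching to line 1 is fastest.
print(fastest_way([[4, 5], [2, 10]], [[1], [1]]))  # 8
```

Only the previous column of f1 and f2 is needed at each step, so the table could be reduced to two running values if the route itself is not required.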
Partition of Data Sequence
Partition the sequence X into K non-overlapping groups with a given cost function f(x_i, x_j) so that the total cost is minimal:
$$f(x_1, x_{i_2-1}) + f(x_{i_2}, x_{i_3-1}) + \dots + f(x_{i_K}, x_N) \to \min$$
Partition of Data Sequence
$G(k, n)$ is the cost of the optimal partition of n points into k non-overlapping groups:
$$G(k, n) = \min_{\{i_j\}} \sum_{j=1}^{k} f(i_j,\, i_{j+1} - 1)$$
with $i_1 = 1$ and $i_{k+1} = n + 1$.
Recursive equation:
$$G(k, n) = \min_{k \le j \le n} \bigl\{ G(k-1, j-1) + f(j, n) \bigr\}, \quad 0 < n \le N.$$
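The recursion for G(k, n) can be sketched as a table fill followed by backtracking. The function name `optimal_partition`, the base case G(0, 0) = 0, and the sample cost `sse` (sum of squared deviations from the group mean) are assumptions chosen for illustration.

```python
import math


def optimal_partition(N, K, f):
    """Split points 1..N into K groups with minimal total cost.
    f(i, j) is the cost of one group covering points i..j (1-based, inclusive)."""
    if not 1 <= K <= N:
        raise ValueError("need 1 <= K <= N")
    G = [[math.inf] * (N + 1) for _ in range(K + 1)]
    start = [[0] * (N + 1) for _ in range(K + 1)]  # first point of the last group
    G[0][0] = 0.0  # assumed base case: zero groups cover zero points at no cost
    for k in range(1, K + 1):
        for n in range(k, N + 1):
            for j in range(k, n + 1):
                cost = G[k - 1][j - 1] + f(j, n)
                if cost < G[k][n]:
                    G[k][n] = cost
                    start[k][n] = j
    # Backtrack through the stored start indices to recover the groups.
    groups, n = [], N
    for k in range(K, 0, -1):
        j = start[k][n]
        groups.append((j, n))
        n = j - 1
    groups.reverse()
    return G[K][N], groups


x = [1, 2, 10, 11, 20]


def sse(i, j):
    """Hypothetical cost: squared deviation from the mean of x_i..x_j."""
    group = x[i - 1:j]
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


print(optimal_partition(len(x), 3, sse))  # (1.0, [(1, 2), (3, 4), (5, 5)])
```

With the cost evaluated in constant time, the three nested loops give O(K N^2) time and O(K N) memory.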
Examples: Image Quantization
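[Figure: input image and its quantized version]
Here, cost function is the square error:
$$f(i, j) = \sum_{i \le k \le j} n(k)\,(k - q)^2$$
where the quantized value q is the weighted mean of the levels in the group:
$$q = \frac{\sum_{i \le k \le j} n(k)\,k}{\sum_{i \le k \le j} n(k)}$$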
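A small sketch of this cost for one group of levels. It assumes that n(k) is the histogram count of level k and that the quantized value q is the count-weighted mean of the group; the function name is hypothetical.

```python
def quantization_cost(hist, i, j):
    """Square error of replacing levels i..j (inclusive) by one value q.
    hist[k] plays the role of n(k). Returns (error, q)."""
    weight = sum(hist[k] for k in range(i, j + 1))
    if weight == 0:
        return 0.0, None  # empty group: nothing to quantize
    q = sum(hist[k] * k for k in range(i, j + 1)) / weight
    error = sum(hist[k] * (k - q) ** 2 for k in range(i, j + 1))
    return error, q


# Two pixels at level 0 and two at level 2 are both one step away from q = 1.
print(quantization_cost([2, 0, 2], 0, 2))  # (4.0, 1.0)
```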
Examples: Polygonal Approximation
5004 points are approximated by 78 points.
Given the number of approximated points M, the error is minimized.
Given the error ε, the number of approximated points is minimized.
Here, cost function is:
$$f(x_i, x_j) = \max_{i \le k \le j} \{d_k\} \quad \text{or} \quad f(x_i, x_j) = \sum_{i \le k \le j} d_k^2$$
Examples: Route segmentation
[Figure: estimated segment results, plotted as speed against time, for three routes: Ski, Running and Jogging, Non-moving]
Divide the routes into several segments by speed consistency.
Here, cost function is:
$$f(i, j) = \sigma(v_k)_{i \le k \le j} \cdot (t_j - t_i)$$
where $\sigma(v_k)_{i \le k \le j}$ is the speed variance within the segment and $(t_j - t_i)$ is its time duration.
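The segment cost can be sketched as follows, assuming the first factor is the population variance of the speed samples and that samples are indexed from 0; the function name is hypothetical.

```python
from statistics import pvariance


def segment_cost(speeds, times, i, j):
    """Speed variance of samples i..j (0-based, inclusive) times the duration."""
    return pvariance(speeds[i:j + 1]) * (times[j] - times[i])


speeds = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
times = [0, 10, 20, 30, 40, 50]
print(segment_cost(speeds, times, 0, 2))  # 0.0, constant speed
print(segment_cost(speeds, times, 0, 5))  # 200.0, mixes two speed levels
```

A segment with consistent speed costs little, while a long segment that mixes speed levels is penalized, so minimizing the total cost favors cuts where the speed changes.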
Examples: Route classification
[Figure: probability of each moving type (Stop, Walk, Run, Bike, Car) plotted against speed (km/h)]
Determining the moving type by speed alone causes misclassification.
Frequent moving-type dependencies are captured by a 1st order HMM.
Examples: Route classification (cont.)
1st order HMM, maximize:
$$f = \prod_{i=1}^{M} P(m_i \mid X_i, m_{i-1})$$
$m_i$: moving type of segment i
$X_i$: feature vector (e.g. speed)
Solve by dynamic programming, similar to the assembly-line scheduling problem
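The maximization can be sketched as a Viterbi-style table fill, with one entry per moving type in place of the two assembly lines. The callable `prob(m, x, prev)` is a hypothetical stand-in for P(m_i | X_i, m_{i-1}), with `prev` set to None for the first segment; the toy scores below are invented for illustration only.

```python
def classify_segments(features, types, prob):
    """Return the most probable sequence of moving types and its score."""
    # best[m]: highest product so far for sequences ending in type m.
    best = {m: prob(m, features[0], None) for m in types}
    back = []  # back[i][m]: best predecessor type of m at step i + 1
    for x in features[1:]:
        new_best, pointers = {}, {}
        for m in types:
            prev = max(types, key=lambda p: best[p] * prob(m, x, p))
            new_best[m] = best[prev] * prob(m, x, prev)
            pointers[m] = prev
        best = new_best
        back.append(pointers)
    # Backtrack from the best final type.
    last = max(types, key=lambda m: best[m])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


def toy_prob(m, speed, prev):
    """Invented scores: slow speeds suggest walking, types tend to persist."""
    slow = speed < 10
    fit = {"walk": 0.9 if slow else 0.1, "car": 0.1 if slow else 0.9}[m]
    if prev is None:
        return fit
    return fit * (0.8 if m == prev else 0.2)


print(classify_segments([5, 5, 50, 50], ["walk", "car"], toy_prob)[0])
# ['walk', 'walk', 'car', 'car']
```

With these toy scores a single fast segment between two slow ones, as in `[5, 50, 5]`, is still labeled as walking, which is the kind of speed-only misclassification the dependency on the previous type is meant to avoid.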
Longest Common Subsequence (LCS)
Find a maximum length common subsequence
between two sequences.
For instance,
Sequence 1: president
Sequence 2: providence
Its LCS is priden.
How to compute LCS?
Let A=a1a2…am and B=b1b2…bn .
len(i, j): the length of an LCS between
a1a2…ai and b1b2…bj
With proper initializations, len(i, j) can be computed as follows.
0
if i  0 or j  0,

len(i, j )  len(i  1, j  1)  1
if i, j  0 and ai  b j ,
max(len(i, j  1), len(i  1, j )) if i, j  0 and a  b .
i
j

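Table of len(i, j) for A = president (columns, index i) and B = providence (rows, index j):

| j | b_j | i=0 | 1 (p) | 2 (r) | 3 (e) | 4 (s) | 5 (i) | 6 (d) | 7 (e) | 8 (n) | 9 (t) |
|---|-----|-----|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 0 |     | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | p | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | r | 0 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 3 | o | 0 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 4 | v | 0 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| 5 | i | 0 | 1 | 2 | 2 | 2 | 3 | 3 | 3 | 3 | 3 |
| 6 | d | 0 | 1 | 2 | 2 | 2 | 3 | 4 | 4 | 4 | 4 |
| 7 | e | 0 | 1 | 2 | 3 | 3 | 3 | 4 | 5 | 5 | 5 |
| 8 | n | 0 | 1 | 2 | 3 | 3 | 3 | 4 | 5 | 6 | 6 |
| 9 | c | 0 | 1 | 2 | 3 | 3 | 3 | 4 | 5 | 6 | 6 |
| 10 | e | 0 | 1 | 2 | 3 | 3 | 3 | 4 | 5 | 6 | 6 |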
Running time and memory: O(mn) and O(mn).
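A minimal sketch of the table fill for len(i, j). The function name `lcs_table` is an assumption; strings are 0-indexed in Python, so a_i corresponds to `a[i - 1]`.

```python
def lcs_table(a, b):
    """table[i][j] = length of an LCS of a[:i] and b[:j]."""
    m, n = len(a), len(b)
    # Row 0 and column 0 stay 0: the initialization for i = 0 or j = 0.
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i][j - 1], table[i - 1][j])
    return table


table = lcs_table("president", "providence")
print(table[9][10])  # 6
```

The two nested loops visit each of the (m + 1)(n + 1) cells once, which is where the O(mn) time and memory come from.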
Output: priden
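The subsequence itself can be recovered by walking back through the filled table from len(m, n). The sketch below is one way to do it; the function name `lcs` and the tie-breaking rule (prefer moving up in i) are assumptions.

```python
def lcs(a, b):
    """Return one longest common subsequence of the strings a and b."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i][j - 1], table[i - 1][j])
    # Backtracking: follow the choices that produced each table entry.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])  # matched symbol belongs to the LCS
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


print(lcs("president", "providence"))  # priden
```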
Time Series Matching using Longest Common Subsequence
Minimum Bounding Envelope (MBE) for LCSS
delta (δ) = time matching region (left & right)
epsilon (ε) = spatial matching region (up & down)
[Figure: Point Correspondence, Similarity [δ=10, ε=0.3] = 0.875]
Recursive equation:
$$\mathrm{len}(i, j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ \mathrm{len}(i-1, j-1) + 1 & \text{if } i > 0,\ j > 0,\ |t_i - t_j| \le \delta,\ d(p_i, p_j) \le \varepsilon, \\ \max(\mathrm{len}(i-1, j),\, \mathrm{len}(i, j-1)) & \text{else.} \end{cases}$$
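A minimal sketch of this recurrence for one-dimensional time series. It assumes each sample is a `(t, value)` pair, that d is the absolute difference of the values, and that both thresholds are inclusive; the function name is hypothetical.

```python
def lcss_length(a, b, delta, epsilon):
    """Length of the longest common subsequence of two time series, where
    two samples match if they are within delta in time and epsilon in value."""
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        t_i, p_i = a[i - 1]
        for j in range(1, n + 1):
            t_j, p_j = b[j - 1]
            if abs(t_i - t_j) <= delta and abs(p_i - p_j) <= epsilon:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


a = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0)]
b = [(0, 0.1), (1, 1.2), (2, 5.0), (3, 3.1)]
print(lcss_length(a, b, delta=1, epsilon=0.3))  # 3
```

The only change from the string version is the matching test: equality of symbols is replaced by closeness in time and in value, so the outlier at t = 2 in `b` is simply skipped.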
Spatial Data Matching using Longest Common Subsequence
epsilon (ε) = spatial matching distance
Recursive equation:
$$\mathrm{len}(i, j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ \mathrm{len}(i-1, j-1) + 1 & \text{if } i > 0,\ j > 0,\ d(p_i, p_j) \le \varepsilon, \\ \max(\mathrm{len}(i-1, j),\, \mathrm{len}(i, j-1)) & \text{else.} \end{cases}$$
Conclusion
Main idea:
- Divide into sub-problems
- Design cost function and recursive function
- Optimization (fill in the table)
- Backtracking