Midterm: week 7, in the lecture, for 2 hours. (2016/5/29, chapter25)

Recursive Algorithm:
Compute-Opt(j)
  if j = 0 then
    return 0
  else
    return max { v_j + Compute-Opt(p(j)), Compute-Opt(j-1) }
Running time: > 2^{n/2} in the worst case. (not required)
Index j:  1  2  3  4  5  6
v_j:      2  4  4  7  2  1
p(j):     0  0  1  0  3  3
(not required)
[Figure: the tree of recursive calls rooted at OPT(6). The subproblems OPT(3), OPT(2) and OPT(1) appear several times in the tree.]
The tree of subproblems grows very quickly, so the recursive algorithm may take exponential time.
(not required)
T(n) = T(n-1) + T(n-2) > 2T(n-2) > 4T(n-4) > 8T(n-6) > … > 2^{n/2} T(1)
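The growth of the recursion can be checked by counting calls. The sketch below assumes a hypothetical worst-case instance with p(j) = j-2 (this instance is not the one on the slides), which gives exactly the recurrence T(n) = T(n-1) + T(n-2) plus one call for the node itself. The names count_calls and worst_case_p are illustrative only.

```python
def count_calls(j, p):
    """Count how many times Compute-Opt is invoked when started at j."""
    if j == 0:
        return 1  # the base-case call itself
    # one call for j, plus the two recursive branches
    return 1 + count_calls(p[j], p) + count_calls(j - 1, p)


def worst_case_p(n):
    """Assumed worst case: p(j) = j - 2 (and 0 for j = 1, 2). Index 0 is unused."""
    return [0] + [max(j - 2, 0) for j in range(1, n + 1)]


if __name__ == "__main__":
    for n in (6, 10, 20):
        calls = count_calls(n, worst_case_p(n))
        print(n, calls, calls > 2 ** (n / 2))
```

For n = 6 this instance makes 41 calls, already more than 2^{6/2} = 8.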
Weighted Interval Scheduling: Bottom-Up
Input: n, s1, s2, …, sn, f1, f2, …, fn, v1, v2, …, vn
Sort jobs by finish times so that f1 ≤ f2 ≤ … ≤ fn.
Compute p(1), p(2), …, p(n).
M[0]=0;
for j = 1 to n do
    M[j] = max { vj+M[p(j)], M[j-1] }
    if (M[j] == M[j-1]) then B[j]=0 else B[j]=1   /* for backtracking */
m=n;   /* backtracking */
while (m ≠ 0) {
    if (B[m]==1) then
        print job m; m=p(m)
    else
        m=m-1
}
B[j]=0 indicates that job j is not selected in the optimal solution for jobs 1, …, j.
B[j]=1 indicates that job j is selected in the optimal solution for jobs 1, …, j.
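The bottom-up algorithm and its backtracking step can be sketched in Python as follows. This is a minimal sketch, assuming the jobs are already sorted by finish time and that v and p are given as lists with an unused entry at index 0 so that they are 1-based like on the slide. The function name is illustrative.

```python
def weighted_interval_schedule(v, p):
    """Return (M, chosen): the table M[0..n] and the selected job indices."""
    n = len(v) - 1
    M = [0] * (n + 1)
    B = [0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = max(v[j] + M[p[j]], M[j - 1])
        # B[j] = 1 means job j is used in the optimum for jobs 1..j
        B[j] = 0 if M[j] == M[j - 1] else 1

    chosen = []
    m = n
    while m != 0:  # backtracking
        if B[m] == 1:
            chosen.append(m)
            m = p[m]
        else:
            m -= 1
    chosen.reverse()  # report jobs in increasing index order
    return M, chosen


if __name__ == "__main__":
    # values and p() of the six-job example; index 0 is a placeholder
    v = [0, 2, 4, 4, 7, 2, 1]
    p = [0, 0, 0, 1, 0, 3, 3]
    M, chosen = weighted_interval_schedule(v, p)
    print(M)       # [0, 2, 4, 6, 7, 8, 8]
    print(chosen)  # [1, 3, 5]
```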
Example 1 (bottom-up computation):
M[0]=0; M[1]=v1+M[0]=2+0=2;
M[2]=v2+M[0]=4+0=4; M[3]=v3+M[1]=4+2=6;
M[4]=v4+M[0]=7+0=7; M[5]=v5+M[3]=2+6=8;
M[6]=M[5]=8, since v6+M[3]=1+6=7<8.

j:     0  1  2  3  4  5  6
v_j:   -  2  4  4  7  2  1
p(j):  -  0  0  1  0  3  3
M[j]:  0  2  4  6  7  8  8
B[j]:  0  1  1  1  1  1  0

Backtracking visits job 5, then job 3, then job 1, so the schedule is job 1, job 3, job 5 (total value 8).
Backtracking and time complexity
• Backtracking is used to get the schedule.
• The p()’s can be computed in O(n) time once the jobs are also sorted
based on the starting times.
• Time complexity
• O(n) if the jobs are sorted and the p()’s are computed.
• Total time: O(n log n) including sorting.
Computing p()’s in O(n) time
The p()’s can be computed in O(n) time using two sorted lists, one
sorted by finish time (if two jobs have the same finish time,
sort them based on starting time) and the other sorted by start
time.
Sorted by start time: b(0, 5), a(1, 3), e(3, 8), c(5, 6), d(6, 8)
Sorted by finish time: a(1, 3), b(0, 5), c(5, 6), d(6, 8), e(3, 8)
p(d)=c, p(c)=b, p(e)=a, p(a)=0, p(b)=0. (See demo7)
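One way to realise the two-list idea is a single left-to-right scan with two pointers. The sketch below is an assumption about how the scan could be organised, not a transcription of demo7: it takes the jobs as (start, finish) pairs already sorted by finish time, assumes start < finish for every job, and returns p as a 1-based list. The sort by start time inside the function costs O(n log n); only the scan after it is O(n).

```python
def compute_p(jobs):
    """jobs: list of (start, finish), sorted by finish time.

    Returns p with p[j] = largest index i (1-based) whose finish time is
    <= the start time of job j, or 0 if there is no such job.
    """
    n = len(jobs)
    # second list: job positions ordered by start time
    by_start = sorted(range(n), key=lambda i: jobs[i][0])
    p = [0] * (n + 1)
    k = 0  # number of jobs (in finish order) finishing by the current start
    for i in by_start:
        # starts only increase, so k never moves backwards
        while k < n and jobs[k][1] <= jobs[i][0]:
            k += 1
        p[i + 1] = k
    return p


if __name__ == "__main__":
    # jobs a, b, c, d, e in finish-time order
    jobs = [(1, 3), (0, 5), (5, 6), (6, 8), (3, 8)]
    print(compute_p(jobs))  # [0, 0, 0, 2, 3, 1]
```

In the printed list, p(c)=2 is job b, p(d)=3 is job c and p(e)=1 is job a.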
Example 2:
Sorted by start time: b(0, 5), a(1, 3), e(3, 8), c(5, 6), d(6, 8)
Sorted by finish time: a(1, 3), b(0, 5), c(5, 6), d(6, 8), e(3, 8)
p(d)=c, p(c)=b, p(e)=a, p(a)=0, p(b)=0.
v(a)=2, v(b)=3, v(c)=5, v(d)=6, v(e)=8.8.
Solution: M[0]=0, M[a]=2, M[b]=max{2, 3+M[p(b)]}=3.
M[c]=max{3, 5+M[p(c)]}=5+M[b]=8.
M[d]=max{8, 6+M[p(d)]}=6+M[c]=6+8=14.
M[e]=max{14, 8.8+M[p(e)]}=max{14, 8.8+M[a]}=max{14, 10.8}=14.
Job: a b c d e
B:   1 1 1 1 0
Backtracking: b, c, d.
Longest common subsequence
• Definition 1: Given a sequence X=x1x2...xm,
another sequence Z=z1z2...zk is a subsequence of
X if there exists a strictly increasing sequence
i1<i2<...<ik of indices of X such that for all j=1,2,...,k,
we have x_{i_j}=z_j.
• Example 1: If X=abcdefg, Z=abdg is a
subsequence of X.
X=abcdefg,
Z=ab d  g
• Definition 2: Given two sequences X and
Y, a sequence Z is a common subsequence
of X and Y if Z is a subsequence of both X
and Y.
• Example 2: X=abcdefg and Y=aaadgfd.
Z=adf is a common subsequence of X and
Y: the letters a, d, f are x1, x4, x6 in X and,
for example, y1, y4, y6 in Y.
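Definitions 1 and 2 can be checked mechanically by scanning the longer sequence once from left to right. The following is a small sketch with illustrative function names.

```python
def is_subsequence(z, x):
    """Return True if z can be obtained from x by deleting letters."""
    i = 0  # current position in x
    for letter in z:
        # move forward in x until this letter is found
        while i < len(x) and x[i] != letter:
            i += 1
        if i == len(x):
            return False  # ran out of x before matching all of z
        i += 1  # the matched position cannot be reused
    return True


def is_common_subsequence(z, x, y):
    """Return True if z is a subsequence of both x and y."""
    return is_subsequence(z, x) and is_subsequence(z, y)


if __name__ == "__main__":
    print(is_subsequence("abdg", "abcdefg"))                   # True
    print(is_common_subsequence("adf", "abcdefg", "aaadgfd"))  # True
    print(is_subsequence("ba", "abc"))                         # False
```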
• Definition 3: A longest common
subsequence (LCS) of X and Y is a common
subsequence of X and Y with the longest
length. (The length of a sequence is the
number of letters in the sequence.)
• A longest common subsequence may not
be unique.
• Example: X=abcd, Y=acbd.
Both acd and abd are LCSs.
Longest common subsequence problem
• Input: Two sequences X=x1x2...xm, and
Y=y1y2...yn.
• Output: a longest common subsequence of X and Y.
• Applications:
• Similarity of two lists
– Given two lists L1: 1, 2, 3, 4, 5 and L2: 1, 3, 2, 4, 5,
– the length of an LCS is 4, which indicates the similarity of the two lists.
• Unix command “diff”.
• A brute-force approach
Suppose that m ≤ n. Try all subsequences of X
(there are 2^m subsequences of X), test if each such
subsequence is also a subsequence of Y, and select
the one with the longest length.
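The brute-force approach can be written down directly. The sketch below is only meant to illustrate the 2^m enumeration. It tries candidate subsequences of the shorter sequence from longest to shortest, so the first hit is an LCS. The function names are illustrative.

```python
from itertools import combinations


def is_subsequence(z, y):
    """Return True if z is a subsequence of y."""
    it = iter(y)
    # "ch in it" consumes the iterator up to and including the match
    return all(ch in it for ch in z)


def lcs_brute_force(x, y):
    """Return one longest common subsequence by exhaustive search."""
    if len(x) > len(y):
        x, y = y, x  # enumerate subsequences of the shorter sequence
    for k in range(len(x), -1, -1):  # longest candidates first
        for idx in combinations(range(len(x)), k):
            z = "".join(x[i] for i in idx)
            if is_subsequence(z, y):
                return z
    return ""


if __name__ == "__main__":
    print(lcs_brute_force("abcd", "acbd"))  # abd
```

The running time is exponential in the length of the shorter sequence, which is why the dynamic programming method is preferred.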
Characterizing a longest common
subsequence
• Theorem (Optimal substructure of an
LCS)
• Let X=x1x2...xm and Y=y1y2...yn be two sequences, and
• Z=z1z2...zk be any LCS of X and Y.
• 1. If xm=yn, then zk=xm=yn and Z[1..k-1] is an LCS of
X[1..m-1] and Y[1..n-1].
• 2. If xm≠yn, then zk≠xm implies that Z is an LCS of
X[1..m-1] and Y.
• 3. If xm≠yn, then zk≠yn implies that Z is an LCS of X and
Y[1..n-1].
The recursive equation
• Let c[i,j] be the length of an LCS of X[1...i] and
Y[1...j].
• c[i,j] can be computed as follows:
  c[i,j] = 0                          if i=0 or j=0,
  c[i,j] = c[i-1,j-1]+1               if i,j>0 and xi=yj,
  c[i,j] = max{c[i,j-1], c[i-1,j]}    if i,j>0 and xi≠yj.
Computing the length of an LCS
• There are nm c[i,j]’s. We can compute them in
a specific order (row by row), so that c[i-1,j-1], c[i-1,j]
and c[i,j-1] are already known when c[i,j] is computed.
The algorithm to compute an LCS
for i=1 to m do
    c[i,0]=0;
for j=0 to n do
    c[0,j]=0;
for i=1 to m do
    for j=1 to n do
    {
        if x[i]==y[j] then
        {
            c[i,j]=c[i-1,j-1]+1;
            b[i,j]=1;
        }
        else if c[i-1,j]>=c[i,j-1] then
        {
            c[i,j]=c[i-1,j];
            b[i,j]=2;
        }
        else
        {
            c[i,j]=c[i,j-1];
            b[i,j]=3;
        }
    }
Example 3: X=BDCABA and Y=ABCBDAB.
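The table for Example 3 did not survive in these notes, but it can be regenerated. The sketch below fills c and b with the same rules and the same tie-breaking (prefer c[i-1,j] when the two values are equal) as the algorithm to compute an LCS. The function name lcs_table is illustrative, and Python strings are 0-based, so x[i-1] plays the role of xi.

```python
def lcs_table(x, y):
    """Return (c, b): LCS lengths and backtracking marks for x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]  # row 0 and column 0 stay 0
    b = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = 1  # diagonal move: the letter is part of the LCS
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = 2  # move up
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = 3  # move left
    return c, b


if __name__ == "__main__":
    c, b = lcs_table("BDCABA", "ABCBDAB")
    for row in c:
        print(" ".join(str(value) for value in row))
    print("length of an LCS:", c[6][7])  # 4
```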
Constructing an LCS (back-tracking)
• We can find an LCS using b[i,j]’s.
• We start with b[m,n] and track back to some cell b[0,j] or b[i,0].
• The algorithm to construct an LCS (backtracking)
1. i=m
2. j=n;
3. if i==0 or j==0 then exit;
4. if b[i,j]==1 then
   {
   print “xi”;
   i=i-1;
   j=j-1;
   }
5. else if b[i,j]==2
   i=i-1
6. else if b[i,j]==3
   j=j-1
7. Goto Step 3.
• The letters are printed in reverse order (the last letter of the LCS first).
• The time complexity: O(nm) in total; the backtracking itself takes O(n+m) steps.
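The backtracking steps can be sketched in Python as follows. The function rebuilds the b table first so that the example is self-contained. It collects the letters in a list and reverses it at the end, because the walk from b[m,n] meets the last letter of the LCS first. The function name lcs is illustrative.

```python
def lcs(x, y):
    """Return one longest common subsequence of x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, 1
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], 2
            else:
                c[i][j], b[i][j] = c[i][j - 1], 3

    letters = []
    i, j = m, n
    while i != 0 and j != 0:  # stop at row 0 or column 0
        if b[i][j] == 1:
            letters.append(x[i - 1])  # record the letter before moving
            i -= 1
            j -= 1
        elif b[i][j] == 2:
            i -= 1
        else:
            j -= 1
    letters.reverse()
    return "".join(letters)


if __name__ == "__main__":
    print(lcs("BDCABA", "ABCBDAB"))
    print(lcs("abcd", "acbd"))
```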
Remarks on weighted interval scheduling
• It takes a long time to explain (50+13 minutes).
• Do not mention exponential time etc.
• For the first example, use the format of Example
2 to show the computation process more
clearly.
Shortest common supersequence
• Definition: Let X and Y be two sequences. A
sequence Z is a common supersequence of X and Y if both
X and Y are subsequences of Z.
• Shortest common supersequence problem:
Input: Two sequences X and Y.
Output: a shortest common supersequence (SCS) of X and Y.
• Example: X=abc and Y=abb. Both abbc and
abcb are shortest common supersequences of
X and Y.
Recursive Equation:
• Let c[i,j] be the length of an SCS of X[1...i]
and Y[1...j].
• c[i,j] can be computed as follows:
  c[i,j] = j                              if i=0,
  c[i,j] = i                              if j=0,
  c[i,j] = c[i-1,j-1]+1                   if i,j>0 and xi=yj,
  c[i,j] = min{c[i,j-1]+1, c[i-1,j]+1}    if i,j>0 and xi≠yj.
The pseudo-code (X=x1x2...xm, Y=y1y2...yn)
for i=0 to m do
    { c[i,0]=i; b[i,0]=2; }
for j=0 to n do
    { c[0,j]=j; b[0,j]=3; }
for i=1 to m do
    for j=1 to n do
        if (xi == yj) { c[i,j]=c[i-1,j-1]+1; b[i,j]=1; }
        else {
            c[i,j]=min{c[i-1,j]+1, c[i,j-1]+1};
            if (c[i,j]==c[i-1,j]+1) then b[i,j]=2;
            else b[i,j]=3;
        }
p=m; q=n;   /* backtracking; the SCS is printed in reverse order */
while (p≠0 or q≠0)
{   if (b[p,q]==1) then {print x[p]; p=p-1; q=q-1}
    else if (b[p,q]==2) then {print x[p]; p=p-1}
    else {print y[q]; q=q-1}
}
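The SCS pseudo-code can be sketched in Python as follows. Two details are assumptions made here so that the backtracking loop is well defined on the border of the table: b[i,0] is set to 2 (copy a letter of X) and b[0,j] is set to 3 (copy a letter of Y). The letters are collected and reversed at the end. The function name is illustrative.

```python
def shortest_common_supersequence(x, y):
    """Return one shortest common supersequence of x and y."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        c[i][0], b[i][0] = i, 2  # only letters of x remain
    for j in range(n + 1):
        c[0][j], b[0][j] = j, 3  # only letters of y remain
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, 1
            elif c[i - 1][j] <= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j] + 1, 2
            else:
                c[i][j], b[i][j] = c[i][j - 1] + 1, 3

    letters = []
    p, q = m, n
    while p != 0 or q != 0:  # backtracking
        if b[p][q] == 1:
            letters.append(x[p - 1])  # shared letter, written once
            p -= 1
            q -= 1
        elif b[p][q] == 2:
            letters.append(x[p - 1])
            p -= 1
        else:
            letters.append(y[q - 1])
            q -= 1
    letters.reverse()
    return "".join(letters)


if __name__ == "__main__":
    print(shortest_common_supersequence("abc", "abb"))  # abbc
```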
Exercises
• Exercise 1: For the weighted interval scheduling
problem, there are eight jobs with starting time and finish
time as follows: j1=(0, 8), j2=(2, 3), j3=(3, 6), j4=(5, 9),
j5=(8, 12), j6=(9, 11), j7=(10, 13) and j8=(11, 16). The
weight for each job is as follows: v1=3.5, v2=2.0, v3=3.0,
v4=3.0, v5=6.5, v6=2.5, v7=12.0, and v8=8.0.
Find a maximum weight subset of mutually compatible
jobs. (Backtracking process is required. You have to
compute the p()’s, but the process of computing the p()’s is NOT
required.)
• Exercise 2: Let X=abbacab and Y=baabcbb. Find a
longest common subsequence of X and Y.
Backtracking process is required.
Summary of Week 6
• Understand the algorithms for the weighted
interval scheduling problem, LCS and SCS.
• The “alignment of sequences” part is not
taught.
Alignment of sequences
• An alignment:
– inserting spaces into X and Y such that the two resulting
sequences X’ and Y’ are of the same length.
– every letter in X’ is opposite to a unique letter in Y’.
Examples:
o-currence     o-curr-ance     abbbaa--bbbbaab
occurrence     o-curre-nce     ababaaabbbbba-b
• The alignment value: \sum_{i=1}^{n} s(X’[i], Y’[i])
– where n is the length of the alignment, X’[i] and Y’[i] are
the two letters in column i of the alignment and s(X’[i], Y’[i])
is the score (weight) of these opposing letters.
• There are several popular score schemes for DNA and
protein sequences.
• Recursive equations:
c[i,j]=max{ c[i-1, j-1]+s(X[i], Y[j]), c[i, j-1]+s(_,Y[j]), c[i-1, j]+s(X[i],_)}.
Similarity Score Scheme (max):
– match: 1;
– mismatch or insertion or deletion: 0.
Example: X=ABCCAA (rows), Y=ABBCAAA (columns)
       A  B  B  C  A  A  A
    0  0  0  0  0  0  0  0
A   0  1  1  1  1  1  1  1
B   0  1  2  2  2  2  2  2
C   0  1  2  2  3  3  3  3
C   0  1  2  2  3  3  3  3
A   0  1  2  2  3  4  4  4
A   0  1  2  2  3  4  5  5
The same as LCS if we use the special similarity score and maximization.
• Recursive equations:
c[i,j]=min{ c[i-1, j-1]+s(X[i], Y[j]), c[i, j-1]+s(_,Y[j]), c[i-1, j]+s(X[i],_)}.
Distance Score Scheme (min), where each column is charged the number of letters it adds to the supersequence:
– match: 1; insertion and deletion: 1;
– mismatch: 2.
Example: X=ABCCAA (rows), Y=ABBCAAA (columns)
       A  B  B  C  A  A  A
    0  1  2  3  4  5  6  7
A   1  1  2  3  4  5  6  7
B   2  2  2  3  4  5  6  7
C   3  3  3  4  4  5  6  7
C   4  4  4  5  5  6  7  8
A   5  5  5  6  6  6  7  8
A   6  6  6  7  7  7  7  8
The same as SCS if we use the special distance score and minimization.
A score emphasizing A-A match: (max)
– A-A match: 1,
– Any other match or mismatch: 0.
Example: X=ABCCAA (rows), Y=ABBCAAA (columns)
       A  B  B  C  A  A  A
    0  0  0  0  0  0  0  0
A   0  1  1  1  1  1  1  1
B   0  1  1  1  1  1  1  1
C   0  1  1  1  1  1  1  1
C   0  1  1  1  1  1  1  1
A   0  1  1  1  1  2  2  2
A   0  1  1  1  1  2  3  3
There are 3 A-A matches.
• Recursive equations:
For a distance score (minimization):
c[i,j]=min{ c[i-1, j-1]+s(X[i], Y[j]), c[i, j-1]+s(_,Y[j]),
c[i-1, j]+s(X[i],_)}.
For a similarity score (maximization):
c[i,j]=max{ c[i-1, j-1]+s(X[i], Y[j]), c[i, j-1]+s(_,Y[j]),
c[i-1, j]+s(X[i],_)}.
• Time and space complexity
Both are O(nm), or O(n^2) if both sequences have equal
length n.
• Why?
We have to compute c[i,j] (the cost) and b[i,j] (for backtracking). Each table has O(nm) entries and each entry takes constant time, so each takes O(nm), or O(n^2) for equal lengths.
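The recursive equations for alignment can be sketched with the score function passed in as a parameter. The boundary values are an assumption, since the slides do not state them: c[0,0]=0, and the first row and column accumulate the gap scores. The names alignment_value, similarity and aa_only are illustrative, and "_" stands for a space.

```python
def alignment_value(x, y, score, pick=max, gap="_"):
    """Optimal alignment value of x and y under score(a, b).

    pick is max for a similarity score and min for a distance score.
    """
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):  # x[0..i) aligned against spaces only
        c[i][0] = c[i - 1][0] + score(x[i - 1], gap)
    for j in range(1, n + 1):  # y[0..j) aligned against spaces only
        c[0][j] = c[0][j - 1] + score(gap, y[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c[i][j] = pick(
                c[i - 1][j - 1] + score(x[i - 1], y[j - 1]),
                c[i][j - 1] + score(gap, y[j - 1]),
                c[i - 1][j] + score(x[i - 1], gap),
            )
    return c[m][n]


def similarity(a, b):
    """match: 1; mismatch, insertion or deletion: 0."""
    return 1 if a == b and a != "_" else 0


def aa_only(a, b):
    """A-A match: 1; anything else: 0."""
    return 1 if a == b == "A" else 0


if __name__ == "__main__":
    print(alignment_value("ABCCAA", "ABBCAAA", similarity))  # 5
    print(alignment_value("ABCCAA", "ABBCAAA", aa_only))     # 3
```

With the similarity score the value equals the LCS length, which matches the bottom-right entry of the similarity table.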