Advanced Graph Algorithms

Advanced Graph Algorithms (I)
• What we do not cover but you are
expected to know
– Mathematical induction, basic data structures,
sorting, shortest paths, minimum spanning trees,
dynamic programming, divide-and-conquer
– Homework assignment
• Better if you know NP-completeness and NP-hardness
– Pick it up yourself if not. It has become
common knowledge in computer science.
Advanced Graph Algorithms (II)
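• Who should take this course
– Those interested in algorithms
– Those interested in abstract thinking
– Those building a foundation for a future career in research
– Those interested in how theory is applied to practical problems
• Bioinformatics, web search, natural language
• Those interested in working at Google, Facebook, or Microsoft
• Course website (all announcements are posted here)
– The teaching section under my web page at Academia Sinica
– http://iasl.iis.sinica.edu.tw/hsu/#teach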
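Grading
• Quizzes (questions are drawn from the homework; homework need not be handed in)
• One written final exam
• One programming project
• End-of-term paper presentation (depending on enrollment)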
Biography
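• 許聞廉 Wen-Lian Hsu, Distinguished Research Fellow, Institute of Information Science, Academia Sinica
• 1973: Bachelor's degree in Mathematics, National Taiwan University
• 1980: Ph.D. in Operations Research, Cornell University
• 1980-89: Department of Industrial Engineering, Northwestern University, USA
• 1989-: Institute of Information Science, Academia Sinica
– Developed the 自然輸入法 ("Natural Input Method") and the Hsu keyboard (許氏鍵盤)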
• Research interests:
– Design of algorithms, artificial intelligence, natural
language processing, bioinformatics, knowledge
management
Scope
• Consecutive ones test
• PQ-trees and PC-trees
• Planar graphs
• Maximal Planar Subgraph Algorithm
• Chordal graphs
• Interval graphs
• Applications
– Sequence assembly
– Motif discovery
– de novo sequencing
– Protein structure prediction
A few Examples of
Mathematical Induction
• Most algorithms we design are
“recursive.”
• N! = N × (N-1)!
• The theoretical basis is “Mathematical
Induction.”
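As a small illustration of the link between recursion and induction, here is a minimal Python sketch of the factorial recurrence. The base case 0! = 1 and the function name `factorial` are assumptions added for the example; the recursive call mirrors the inductive step N! = N × (N-1)!.

```python
def factorial(n):
    """Compute n! recursively for a non-negative integer n."""
    if n == 0:
        # Base case of the induction (assumed here): 0! = 1.
        return 1
    # Inductive step: N! = N x (N-1)!
    return n * factorial(n - 1)


print(factorial(5))
```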
A Hat Problem (I)
N prisoners are lined up in a row; each one can see the hats
of all the people in front of him. A person who guesses the
color of his own hat correctly survives.
No strategy
In the worst case, all the men are shot.
Strategy 1 (with collaboration)
In the worst case, half of the men are shot.
A Hat Problem (II)
Strategy 1 (at least half survive; about ¾ do on average)
Divide the men into two groups: odd-numbered and
even-numbered. Each odd-numbered person announces the
color of the hat of the person in front of him (since he can see it),
so every even-numbered person survives. The odd-numbered person
himself still has a ½ chance of surviving.
Design a strategy so that as few men die as possible.
A Hat Problem (III)
Message Passing
Suppose we use 0 to denote a white hat and 1 a black hat
Let the original sequence be
0 1 1 0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
Then the sequences seen by the first three men are as follows
1 1 0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
1 0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
How can each man (except the first one) guess the right number?
Hint: use the odd-evenness (parity) of the number of 0's and 1's.
A Hat Problem (IV)
0 1 1 0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
1 1 0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
1 0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
0 0 1 0 0 0 1 1 0 1 0 0 1 1 1
• If the current hat is 0, then moving to the next sequence changes
only the parity of the number of 0's (the parity of the number of 1's stays the same)
• Everyone can see the parity of the number of 0's and of 1's in the sequence in
front of him.
• If the 1st person announces the parity of the number of 1's in his sequence (either
odd or even), then by checking whether that parity changes in his own
sequence, the 2nd person knows his hat color
• By induction, everyone afterward can compute his hat color, since he has heard all the (correct) answers given behind him
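The parity argument can be checked with a short simulation. This is a minimal sketch under stated assumptions: `hats[0]` belongs to the first man (who sees everyone else), the first man's "guess" encodes the parity of the 1's he sees, and the function name `parity_strategy` is made up for the example.

```python
def parity_strategy(hats):
    """Simulate the parity strategy; hats[i] is 0 (white) or 1 (black).

    hats[0] is the first man, who sees hats[1:]. Returns the list of guesses.
    """
    # The first man announces the parity of the 1's he sees, encoded as a color.
    announced = sum(hats[1:]) % 2
    guesses = [announced]
    heard = 0  # parity of the 1's among the correct guesses heard so far
    for i in range(1, len(hats)):
        seen = sum(hats[i + 1:]) % 2  # parity of the 1's in front of man i
        # announced = heard XOR (own hat) XOR seen, so solve for the own hat.
        guess = announced ^ heard ^ seen
        guesses.append(guess)
        heard ^= guess
    return guesses


hats = [0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
guesses = parity_strategy(hats)
# Everyone except possibly the first man guesses correctly.
print(guesses[1:] == hats[1:])
```

Only the first man is at risk: his announcement is right only when the parity he sees happens to equal his own hat color.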
Marriage Theorem
There are n girls and n boys. Each girl has a list of boys she can
marry. Assume a boy never rejects a girl’s offer. Under what
condition can you find a perfect match?
The following condition is both necessary and sufficient:
Every set of r girls, 1 ≤ r ≤ n, likes at least r boys.
Sufficient: If the condition holds, then we can find a perfect match.
Necessary: If there is a perfect match, then the condition must hold.
Prove by induction (for the sufficient part, since necessity is clear).
How do you reduce the problem size to a smaller one?
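Easy case: There is a subset of k girls, k < n, who like exactly k boys.
By induction, we can match these k girls with the k boys. The remaining n-k girls
still satisfy the condition with respect to the remaining n-k boys (any r of them, together
with the k girls, like at least r+k boys, at most k of whom are already taken), so, again by
induction, the remaining n-k girls can be matched to the n-k boys.
Otherwise: Every set of r girls, 1 ≤ r < n, likes > r boys.
Marry a girl to a boy she likes first; then for the remaining n-1 girls and
n-1 boys, the condition still holds.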
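To experiment with the theorem on small instances, the sketch below checks the condition by brute force and, separately, searches for a perfect match. Note the swap: the matching routine uses the standard augmenting-path method, not the inductive construction from the proof. The representation (`likes[g]` is the set of boys that girl `g` can marry) and both function names are assumptions made for the example.

```python
from itertools import combinations


def hall_condition_holds(likes):
    """Check that every set of r girls likes at least r boys (exponential time)."""
    girls = range(len(likes))
    for r in range(1, len(likes) + 1):
        for group in combinations(girls, r):
            liked = set().union(*(likes[g] for g in group))
            if len(liked) < r:
                return False
    return True


def perfect_match(likes):
    """Return a dict girl -> boy matching every girl, or None if impossible."""
    girl_of = {}  # boy -> girl currently matched to him

    def try_assign(g, visited):
        # Try to give girl g a boy, re-matching earlier girls if needed.
        for b in likes[g]:
            if b not in visited:
                visited.add(b)
                if b not in girl_of or try_assign(girl_of[b], visited):
                    girl_of[b] = g
                    return True
        return False

    for g in range(len(likes)):
        if not try_assign(g, set()):
            return None
    return {g: b for b, g in girl_of.items()}


likes_ok = [{0, 1}, {1, 2}, {0, 2}]
likes_bad = [{0}, {0}, {1, 2}]  # girls 0 and 1 together like only one boy
print(hall_condition_holds(likes_ok), perfect_match(likes_ok) is not None)
print(hall_condition_holds(likes_bad), perfect_match(likes_bad) is not None)
```

On both instances the two answers agree, as the theorem predicts.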
Maximum Subsequence Sum Problem
• Given (possibly negative) integers A1, A2,…, AN,
find the maximum value of $\sum_{k=i}^{j} A_k$ over all i, j with 1 ≤ i ≤ j ≤ N
• For input -2, 11, -4, 13, -5, -2, the answer is 20
Algorithm 1: Brute-force method
• Given N integers A1, A2,…, AN, how many
(contiguous) subsequences can you form?
• For example: 1, 2, 3, 4
– The possible sums include:
• 1, 1+2, 1+2+3, 1+2+3+4
• 2, 2+3, 2+3+4
• 3, 3+4
• 4
– Find the maximum of the above sums
– An O(N^3) solution
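The brute-force method can be sketched in Python as follows. The function name is made up for the example, and the code assumes that the empty subsequence (sum 0) is allowed, so an all-negative input yields 0.

```python
def max_subsequence_sum_cubic(a):
    """Try every pair (i, j) and add up a[i..j] from scratch: O(N^3)."""
    best = 0  # assumption: the empty subsequence counts, with sum 0
    n = len(a)
    for i in range(n):
        for j in range(i, n):
            total = 0
            for k in range(i, j + 1):
                total += a[k]
            best = max(best, total)
    return best


print(max_subsequence_sum_cubic([-2, 11, -4, 13, -5, -2]))  # 20, as stated above
```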
Algorithm 2: A Bit Clever Algorithm
• By noting $\sum_{k=i}^{j} A_k = A_j + \sum_{k=i}^{j-1} A_k$
• We can reuse partial computation in
previous steps
• Sum(i, j) = 0 for all i, j (in particular, Sum(i, i-1) = 0)
• For i = 1 to N
For j = i to N
Sum(i, j) ← Sum(i, j-1) + Aj
end
end
• The answer is the largest Sum(i, j) computed
• An O(N^2) algorithm
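A minimal Python sketch of this idea follows. Instead of storing the whole Sum(i, j) table it keeps one running sum per start index i, which is an implementation choice of this example rather than something the pseudocode prescribes. The function name is hypothetical, and the empty subsequence (sum 0) is again assumed to be allowed.

```python
def max_subsequence_sum_quadratic(a):
    """Reuse Sum(i, j-1) to get Sum(i, j) in constant time: O(N^2)."""
    best = 0  # assumption: the empty subsequence counts, with sum 0
    n = len(a)
    for i in range(n):
        running = 0  # plays the role of Sum(i, i-1) = 0
        for j in range(i, n):
            running += a[j]  # Sum(i, j) = Sum(i, j-1) + A_j
            best = max(best, running)
    return best


print(max_subsequence_sum_quadratic([-2, 11, -4, 13, -5, -2]))  # 20
```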
Algorithm 3: Divide and Conquer
• Divide Part:
– Split the problem into two roughly equal
subproblems, which are then solved
recursively.
• Conquer Part:
– Patch together the two solutions of the
subproblems with a small amount of additional
work.
• Maximum subsequence sum problem
– Divide the sequence into two equal parts
– The maximum subsequence sum can be
found in one of three places:
• Entirely in the left half of the input
• Entirely in the right half of the input
• Crossing the middle, with a part in each half
• Example:
First Half: 4, -3, 5, -2
Second Half: -1, 2, 6, -2
– In first half: 6 (4 - 3 + 5)
– In second half: 8 (2 + 6)
– Crosses middle: 11 (4 - 3 + 5 - 2 - 1 + 2 + 6)
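• When the maximum subsequence sum crosses
the middle, it is the sum of two parts:
– The maximum subsequence sum ending at the last element of the first half
– The maximum subsequence sum starting at the first element of the second half
• Each part takes a linear scan of n/2 elements, so T(n) = 2T(n/2) + 2·(n/2) = 2T(n/2) + n
• An O(n log n) algorithm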
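The divide-and-conquer scheme might be written as below. This is a sketch, not a definitive implementation: the function name and the half-open index convention `a[lo:hi]` are assumptions of the example, and the empty subsequence (sum 0) is assumed to be allowed.

```python
def max_subsequence_sum_dc(a, lo=0, hi=None):
    """Maximum subsequence sum of a[lo:hi] by divide and conquer: O(n log n)."""
    if hi is None:
        hi = len(a)
    if hi - lo == 0:
        return 0
    if hi - lo == 1:
        return max(a[lo], 0)  # assumption: the empty subsequence has sum 0
    mid = (lo + hi) // 2
    left_best = max_subsequence_sum_dc(a, lo, mid)
    right_best = max_subsequence_sum_dc(a, mid, hi)

    # Best sum ending at the last element of the first half (scan leftward).
    total = best_left_border = 0
    for k in range(mid - 1, lo - 1, -1):
        total += a[k]
        best_left_border = max(best_left_border, total)

    # Best sum starting at the first element of the second half (scan rightward).
    total = best_right_border = 0
    for k in range(mid, hi):
        total += a[k]
        best_right_border = max(best_right_border, total)

    return max(left_best, right_best, best_left_border + best_right_border)


print(max_subsequence_sum_dc([4, -3, 5, -2, -1, 2, 6, -2]))  # 11 (crosses the middle)
```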
Algorithm 4: The most clever one
• An improvement over algorithm 2
• Clever observations:
– No subsequence with a negative sum can be a
prefix of the optimal subsequence
When would adding A_n change maxSUM_n?
• Use induction. Keep two sums at each iteration and update them based
on the three conditions below. Note that maxendSUM_n is the largest sum of a
subsequence ending at the right end (position n), or 0 if all such sums are negative
[Figure: A_1 … A_{n-1} with maxSUM_{n-1} and maxendSUM_{n-1} marked, then A_1 … A_n with maxendSUM_{n-1} + A_n]
(a) If maxendSUM_{n-1} + A_n ≤ 0, then maxendSUM_n = 0
(b) If maxendSUM_{n-1} + A_n > 0, then maxendSUM_n = maxendSUM_{n-1} + A_n
(c) If maxendSUM_{n-1} + A_n > maxSUM_{n-1}, then maxSUM_n = maxendSUM_{n-1} + A_n
Else maxSUM_n = maxSUM_{n-1}
• An O(N) algorithm
– It takes constant time to update the two sums at each
iteration
• Example: Sequence: -2, 11, -4, 13, -5, -2
Initially, maxendSUM_0 = 0, maxSUM_0 = 0
i = 1: (a) maxendSUM_1 = 0, (c) maxSUM_1 = 0
i = 2: (b) maxendSUM_2 = 11, (c) maxSUM_2 = 11
i = 3: (b) maxendSUM_3 = 7, (c) maxSUM_3 = 11
i = 4: (b) maxendSUM_4 = 20, (c) maxSUM_4 = 20
i = 5: (b) maxendSUM_5 = 15, (c) maxSUM_5 = 20
i = 6: (b) maxendSUM_6 = 13, (c) maxSUM_6 = 20
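The two-sum update can be sketched in a few lines of Python. The function name and the returned trace are assumptions added so the run can be compared with the example above; `max_end_sum` and `max_sum` stand for maxendSUM and maxSUM, both starting at 0.

```python
def max_subsequence_sum_linear(a):
    """One pass keeping two sums; returns (answer, trace of both sums)."""
    max_sum = 0      # maxSUM: best sum found so far
    max_end_sum = 0  # maxendSUM: best sum ending at the current position
    trace = []
    for x in a:
        # Rules (a) and (b): extend the running sum, or reset it to 0.
        max_end_sum = max(max_end_sum + x, 0)
        # Rule (c): keep the larger of the old best and the new running sum.
        max_sum = max(max_sum, max_end_sum)
        trace.append((max_end_sum, max_sum))
    return max_sum, trace


answer, trace = max_subsequence_sum_linear([-2, 11, -4, 13, -5, -2])
print(answer)
for i, (end_sum, best) in enumerate(trace, start=1):
    print(f"i = {i}: maxendSUM = {end_sum}, maxSUM = {best}")
```

Each iteration does a constant amount of work, which is where the O(N) bound comes from.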
Common Computational Models
• Discrete algorithm
– Probabilistic, approximation, on-line, randomized
• Non-linear programming (numerical)
• Statistical
– Regression
– Machine learning
• Neural nets, SVM, Hidden Markov Models, Maximum entropy,
Conditional random fields
– Evolutionary
• Genetic algorithm, particle swarm
• Areas: NLP, ASR, IR, IE, DM