Greedy Algorithms
CS3381 Des & Anal of Alg (2001-2002 SemA)
City Univ of HK / Dept of CS / Helena Wong
http://www.cs.cityu.edu.hk/~helena
5. Greedy Algorithms - 1
Greedy Algorithms
Coming up
Casual Introduction: Two Knapsack Problems
An Activity-Selection Problem
Greedy Algorithm Design
Huffman Codes
(Chap 16.1-16.3)
2 Knapsack Problems
1. 0-1 Knapsack Problem:
A thief robbing a store finds n items.
The ith item is worth vi dollars and weighs wi pounds.
W, wi, vi are integers.
He can carry at most W pounds.
Each item must be taken whole or left behind.
Which items should he take?
2 Knapsack Problems
2. Fractional Knapsack Problem:
A thief robbing a store finds n items.
The ith item is worth vi dollars and weighs wi pounds.
W, wi, vi are integers.
He can carry at most W pounds.
He can take fractions of items.
2 Knapsack Problems
Dynamic Programming Solution
Both problems exhibit the optimal-substructure property:
Consider the most valuable load that weighs at most W pounds.
If the jth item is removed from his load, the remaining load must be the most valuable load weighing at most W-wj that he can take from the n-1 original items excluding j.
=> Can be solved by dynamic programming
2 Knapsack Problems
Dynamic Programming Solution
Example: 0-1 Knapsack Problem
Suppose there are n=100 ingots:
30 gold ingots: each $10000, 8 pounds (most expensive)
20 silver ingots: each $2000, 3 pounds
50 copper ingots: each $500, 5 pounds
Then the most valuable way to fill a capacity of W pounds
= the most valuable among the following options:
(1) take 1 gold ingot + the most valuable way to fill W-8 pounds
from 29 gold ingots, 20 silver ingots and 50 copper ingots
(2) take 1 silver ingot + the most valuable way to fill W-3 pounds
from 30 gold ingots, 19 silver ingots and 50 copper ingots
(3) take 1 copper ingot + the most valuable way to fill W-5 pounds
from 30 gold ingots, 20 silver ingots and 49 copper ingots
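The recursive structure of the ingot example can be turned into a small table-filling routine. The sketch below is one common way to do it, not necessarily the formulation used in the course; the function name `knapsack_01` and the one-dimensional table `best` are assumptions made for illustration.

```python
def knapsack_01(values, weights, capacity):
    """Return the best total value of a 0-1 knapsack load of at most `capacity` pounds."""
    # best[w] = value of the most valuable load weighing at most w pounds,
    # using only the items processed so far.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Walk the capacities downwards so each item is taken at most once.
        for w in range(capacity, weight - 1, -1):
            best[w] = max(best[w], value + best[w - weight])
    return best[capacity]


# The ingots from the example: 30 gold, 20 silver, 50 copper.
values = [10000] * 30 + [2000] * 20 + [500] * 50
weights = [8] * 30 + [3] * 20 + [5] * 50

print(knapsack_01(values, weights, 11))  # 12000: one gold ingot plus one silver ingot
```

The running time is proportional to n times W, so this is practical only when W is a moderately sized integer.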
2 Knapsack Problems
Dynamic Programming Solution
Example: Fractional Knapsack Problem
Suppose there are n = 100 pounds of metal dust in total:
30 pounds of gold dust: each pound $10000 (most expensive)
20 pounds of silver dust: each pound $2000
50 pounds of copper dust: each pound $500
Then the most valuable way to fill a capacity of W pounds
= the most valuable among the following options:
(1) take 1 pound of gold + the most valuable way to fill W-1 pounds
from 29 pounds of gold, 20 pounds of silver, 50 pounds of copper
(2) take 1 pound of silver + the most valuable way to fill W-1 pounds
from 30 pounds of gold, 19 pounds of silver, 50 pounds of copper
(3) take 1 pound of copper + the most valuable way to fill W-1 pounds
from 30 pounds of gold, 20 pounds of silver, 49 pounds of copper
2 Knapsack Problems
By Greedy Strategy
Both problems are similar, but the Fractional Knapsack Problem can be solved by a greedy strategy.
Step 1. Compute the value per pound for each item
Eg. Gold dust: $10000 per pound (most expensive)
Silver dust: $2000 per pound
Copper dust: $500 per pound
Step 2. Take as much as possible of the most expensive item (ie. gold dust)
Step 3. If the supply of that item is exhausted (ie. no more gold) and he can still carry more, he takes as much as possible of the item that is next most expensive, and so forth until he can’t carry any more.
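Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `fractional_knapsack` and the `(name, total_value, total_weight)` item layout are invented here, and the data are the metal-dust figures from the example.

```python
def fractional_knapsack(items, capacity):
    """items: list of (name, total_value, total_weight). Returns (value, amounts taken)."""
    # Step 1: rank items by value per pound, most expensive first.
    ranked = sorted(items, key=lambda it: it[1] / it[2], reverse=True)
    remaining = capacity
    total_value = 0.0
    taken = []
    for name, value, weight in ranked:
        if remaining <= 0:
            break  # he can't carry any more
        # Steps 2 and 3: take as much as possible of the best item still available.
        amount = min(weight, remaining)
        total_value += value * amount / weight
        taken.append((name, amount))
        remaining -= amount
    return total_value, taken


dust = [("gold", 300000, 30), ("silver", 40000, 20), ("copper", 25000, 50)]
print(fractional_knapsack(dust, 40))
# (320000.0, [('gold', 30), ('silver', 10)])
```

Sorting dominates the cost, so the whole procedure takes O(n lg n) time for n kinds of item.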
Knapsack Problems
By Greedy Strategy
We can solve the Fractional Knapsack Problem by a greedy algorithm:
Always make the choice that looks best at the moment,
ie. a locally optimal choice.
To see why we can’t solve the 0-1 Knapsack Problem by a greedy strategy, read Chap 16.2.
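A tiny instance makes the failure concrete. The three items below (values 60, 100, 120 dollars; weights 10, 20, 30 pounds; W = 50) are illustrative numbers chosen for this sketch, not data from the slides, and both helper functions are hypothetical names. The greedy rule takes whole items in order of value per pound; the exhaustive search checks every subset.

```python
from itertools import combinations


def greedy_01(items, capacity):
    """Take whole items in decreasing order of value per pound while they still fit."""
    total = 0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            total += value
            capacity -= weight
    return total


def best_01(items, capacity):
    """Exhaustive search over all subsets (only feasible for tiny inputs)."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for _, w in subset) <= capacity:
                best = max(best, sum(v for v, _ in subset))
    return best


items = [(60, 10), (100, 20), (120, 30)]  # (value, weight), illustrative numbers
print(greedy_01(items, 50))  # 160: the two densest items, leaving 20 pounds unused
print(best_01(items, 50))    # 220: the optimal load skips the densest item
```

The greedy load wastes capacity that cannot be filled with a fraction of the third item, which is exactly what the fractional version is allowed to do.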
Greedy Algorithms
2 techniques for solving optimization problems:
1. Dynamic Programming
2. Greedy Algorithms (“Greedy Strategy”)
[Figure: among the optimization problems, the set of problems that the Greedy Approach can solve and the set that Dynamic Programming can solve]
For some optimization problems, Dynamic Programming is “overkill”; Greedy Strategy is simpler and more efficient.
Activity-Selection Problem
For a set of proposed activities that wish to
use a lecture hall, select a maximum-size
subset of “compatible activities”.
Set of activities: S={a1,a2,…an}
Duration of activity ai: [start_timei, finish_timei)
Activities sorted in increasing order of finish time:
i              1   2   3   4   5   6   7   8   9   10  11
start_timei    1   3   0   5   3   5   6   8   8   2   12
finish_timei   4   5   6   7   8   9   10  11  12  13  14
Activity-Selection Problem
[Figure: timeline from time 0 to 14 showing the intervals of activities a1 to a11]
Compatible activities:
{a3, a9, a11},
{a1,a4,a8,a11},
{a2,a4,a9,a11}
Activity-Selection Problem
Dynamic Programming Solution (Step 1)
Step 1. Characterize the structure of an optimal solution.
Let Si,j be the set of activities that start after ai finishes and finish before aj starts.
eg. S2,11 = {a4,a6,a7,a8,a9}
[Figure: timeline of a1 to a11 in which the activities that fit between the finish of a2 and the start of a11 are marked “ok”]
Definition:
$S_{i,j} = \{a_k \in S : \text{finish\_time}_i \le \text{start\_time}_k < \text{finish\_time}_k \le \text{start\_time}_j\}$
Activity-Selection Problem
Dynamic Programming Solution (Step 1)
Add fictitious activities a0 and an+1, with finish_time0 = 0 and start_timen+1 = ∞:
i              0   1   2   3   4   5   6   7   8   9   10  11(=n)  12
start_timei    -   1   3   0   5   3   5   6   8   8   2   12      ∞
finish_timei   0   4   5   6   7   8   9   10  11  12  13  14      -
[Figure: timeline from time 0 to 14 showing activities a0 to a12]
ie. S0,n+1 = {a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11} = S
Note: If i>=j then Si,j=Ø
Activity-Selection Problem
Dynamic Programming Solution (Step 1)
The problem:
For a set of proposed activities that wish to use a lecture hall, select a maximum-size subset of “compatible activities”
= Select a maximum-size subset of compatible activities from S0,n+1.
Substructure:
Suppose a solution to Si,j includes activity ak; then 2 subproblems are generated: Si,k and Sk,j.
The maximum-size subset Ai,j of compatible activities is:
Ai,j = Ai,k U {ak} U Ak,j
eg. Suppose a solution to S0,n+1 contains a7; then 2 subproblems are generated: S0,7 and S7,n+1.
[Figure: timeline from time 0 to 14 showing activities a1 to a11]
Activity-Selection Problem
Dynamic Programming Solution (Step 2)
Step 2. Recursively define an optimal solution
Let c[i,j] = number of activities in a maximum-size subset of compatible activities in Si,j.
If i>=j, then Si,j=Ø, ie. c[i,j]=0.
$c[i,j] = \begin{cases} 0 & \text{if } S_{i,j} = \emptyset \\ \max_{i<k<j,\ a_k \in S_{i,j}} \{c[i,k] + c[k,j] + 1\} & \text{if } S_{i,j} \neq \emptyset \end{cases}$
Step 3. Compute the value of an optimal solution in a bottom-up fashion
Step 4. Construct an optimal solution from computed information.
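Step 3 can be sketched directly from the recurrence for c[i,j]. The code below is an illustrative bottom-up fill, assuming the activities are already sorted by finish time; the function name `max_compatible_count` and the sentinel entries for the fictitious activities a0 and an+1 are choices made for this sketch.

```python
import math


def max_compatible_count(start, finish):
    """start/finish: times of a1..an sorted by finish time. Returns c[0, n+1]."""
    n = len(start)
    s = [None] + list(start) + [math.inf]  # s[n+1] = infinity for the fictitious a_{n+1}
    f = [0] + list(finish) + [None]        # f[0] = 0 for the fictitious a_0
    c = [[0] * (n + 2) for _ in range(n + 2)]
    for length in range(2, n + 2):         # length = j - i, smallest subproblems first
        for i in range(0, n + 2 - length):
            j = i + length
            best = 0
            for k in range(i + 1, j):
                # a_k belongs to S_ij iff it starts after a_i finishes
                # and finishes before a_j starts.
                if f[i] <= s[k] and f[k] <= s[j]:
                    best = max(best, c[i][k] + c[k][j] + 1)
            c[i][j] = best
    return c[0][n + 1]


start = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
finish = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(max_compatible_count(start, finish))  # 4
```

Three nested loops over indices up to n+1 give a cubic running time, which is the cost the greedy strategy avoids.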
Activity-Selection Problem
Greedy Strategy Solution
$c[i,j] = \begin{cases} 0 & \text{if } S_{i,j} = \emptyset \\ \max_{i<k<j,\ a_k \in S_{i,j}} \{c[i,k] + c[k,j] + 1\} & \text{if } S_{i,j} \neq \emptyset \end{cases}$
Consider any nonempty subproblem Si,j, and let am be the activity in Si,j with the earliest finish time.
Then,
1. am is used in some maximum-size subset of compatible activities of Si,j.
2. The subproblem Si,m is empty, so that choosing am leaves the subproblem Sm,j as the only one that may be nonempty.
eg. S2,11={a4,a6,a7,a8,a9}
[Figure: timeline of a1 to a11 in which the activities in S2,11 are marked “ok”]
Among {a4,a6,a7,a8,a9}, a4 will finish earliest.
1. a4 is used in the solution.
2. After choosing a4, there are 2 subproblems: S2,4 and S4,11. But S2,4 is empty. Only S4,11 remains as a subproblem.
Activity-Selection Problem
Greedy Strategy Solution
Hence, to solve Si,j:
1. Choose the activity am in Si,j with the earliest finish time.
2. Solution of Si,j = {am} U Solution of subproblem Sm,j
[Figure: timeline from time 0 to 14 showing activities a0 to a12]
That is,
To solve S0,12, we select a1 that will finish earliest, and solve for S1,12.
To solve S1,12, we select a4 that will finish earliest, and solve for S4,12.
To solve S4,12, we select a8 that will finish earliest, and solve for S8,12.
…
Greedy Choices (Locally optimal choice): to leave as much opportunity as possible for the remaining activities to be scheduled.
Solve the problem in a top-down fashion.
Activity-Selection Problem
Greedy Strategy Solution
Recursive-Activity-Selector(i,j)
m = i+1
while m < j and start_timem < finish_timei    // Find first activity in Si,j
    do m = m + 1
if m < j
    then return {am} U Recursive-Activity-Selector(m,j)
    else return Ø
eg. In Recursive-Activity-Selector(1,12), the loop condition holds for m=2 and m=3 (both start before a1 finishes), and the loop breaks at m=4.
Order of calls:
Recursive-Activity-Selector(0,12) returns {1,4,8,11}
    Recursive-Activity-Selector(1,12) returns {4,8,11}
        Recursive-Activity-Selector(4,12) returns {8,11}
            Recursive-Activity-Selector(8,12) returns {11}
                Recursive-Activity-Selector(11,12) returns Ø
[Figure: timeline from time 0 to 14 showing activities a0 to a12]
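The recursive selector translates almost line for line into Python. In this sketch the lists `s` and `f` are indexed from 0 so that position 0 stands for the fictitious activity a0 (with finish time 0); that layout, and returning a list of indices instead of a set of activities, are assumptions made for illustration.

```python
def recursive_activity_selector(s, f, i, j):
    """Return the indices of the activities chosen from S_ij (sorted by finish time)."""
    m = i + 1
    # Find the first activity in S_ij: skip those that start before a_i finishes.
    while m < j and s[m] < f[i]:
        m += 1
    if m < j:
        return [m] + recursive_activity_selector(s, f, m, j)
    return []


# Index 0 is the fictitious a0; indices 1..11 are a1..a11 from the table.
s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(recursive_activity_selector(s, f, 0, 12))  # [1, 4, 8, 11]
```

The start time stored at index 0 is never read, because m always begins at i+1.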
Activity-Selection Problem
Greedy Strategy Solution
Iterative-Activity-Selector()
Answer = {a1}
last_selected = 1
for m = 2 to n
    if start_timem >= finish_timelast_selected
        then Answer = Answer U {am}
             last_selected = m
return Answer
[Figure: timeline from time 0 to 14 showing activities a0 to a12]
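A Python rendering of the iterative selector, given as a sketch: it assumes the activities are already sorted by finish time, reports 1-based indices to match the slide numbering, and adds an empty-input guard that the pseudocode does not have.

```python
def iterative_activity_selector(start, finish):
    """start/finish: times of a1..an sorted by finish time. Returns 1-based indices."""
    if not start:
        return []  # guard added for this sketch; the pseudocode assumes n >= 1
    answer = [1]
    last_selected = 1
    for m in range(2, len(start) + 1):
        # a_m is compatible if it starts no earlier than the last selected one finishes.
        if start[m - 1] >= finish[last_selected - 1]:
            answer.append(m)
            last_selected = m
    return answer


start = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
finish = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(iterative_activity_selector(start, finish))  # [1, 4, 8, 11]
```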
Activity-Selection Problem
Greedy Strategy Solution
For both Recursive-Activity-Selector and Iterative-Activity-Selector, the running time is Θ(n), assuming the activities are already sorted by finish time.
Reason: each am is examined exactly once.
Greedy Algorithm Design
Steps of Greedy Algorithm Design:
1. Formulate the optimization problem in the form: we make a choice and we are left with one subproblem to solve.
2. Show that the greedy choice can lead to an optimal solution, so that the greedy choice is always safe. (Greedy-Choice Property)
3. Demonstrate that an optimal solution to the original problem = greedy choice + an optimal solution to the subproblem. (Optimal Substructure Property)
Having both properties is a good clue that a greedy strategy will solve the problem.
Greedy Algorithm Design
Comparison:
Dynamic Programming:
• At each step, the choice is determined based on solutions of subproblems.
• Sub-problems are solved first.
• Bottom-up approach
• Can be slower, more complex
Greedy Algorithms:
• At each step, we quickly make a choice that currently looks best: a locally optimal (greedy) choice.
• Greedy choice can be made first before solving further subproblems.
• Top-down approach
• Usually faster, simpler
Huffman Codes
• For compressing data (sequence of characters)
• Widely used
• Very efficient (saving 20-90%)
• Use a table to keep frequencies of occurrence of characters.
• Output binary string.
eg. “Today’s weather is nice” => “001 0110 0 0 100 1000 1110”
Huffman Codes
Example:
A file of 100,000 characters, containing only ‘a’ to ‘f’.
Character   Frequency   Fixed-length codeword   Variable-length codeword
‘a’         45000       000                     0
‘b’         13000       001                     101
‘c’         12000       010                     100
‘d’         16000       011                     111
‘e’         9000        100                     1101
‘f’         5000        101                     1100
Fixed-length: eg. “abc” = “000001010”; file size = 3*100000 = 300,000 bits
Variable-length: eg. “abc” = “0101100”; file size = 1*45000 + 3*13000 + 3*12000 + 3*16000 + 4*9000 + 4*5000 = 224,000 bits
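The two file sizes can be checked with a few lines of Python. The dictionaries copy the frequency and codeword columns of the example; the helper name `file_size_bits` is an assumption made for this sketch.

```python
freq = {"a": 45000, "b": 13000, "c": 12000, "d": 16000, "e": 9000, "f": 5000}
fixed = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
variable = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def file_size_bits(freq, code):
    """Total bits = sum over characters of frequency times codeword length."""
    return sum(freq[ch] * len(code[ch]) for ch in freq)


print(file_size_bits(freq, fixed))     # 300000
print(file_size_bits(freq, variable))  # 224000
```

The variable-length code saves about 25% on this file because the most frequent character gets a one-bit codeword.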
Huffman Codes
A file of 100,000 characters.
The coding schemes can be represented by trees (frequencies in thousands; edge label 0 = left child, 1 = right child):
Fixed-length code tree (not a full binary tree):
100
    0: 86
        0: 58
            0: a:45
            1: b:13
        1: 28
            0: c:12
            1: d:16
    1: 14
        0: 14
            0: e:9
            1: f:5
Variable-length code tree (a full binary tree: every nonleaf node has 2 children):
100
    0: a:45
    1: 55
        0: 25
            0: c:12
            1: b:13
        1: 30
            0: 14
                0: f:5
                1: e:9
            1: d:16
Huffman Codes
To find an optimal code for a file:
1. The coding must be unambiguous.
Consider codes in which no codeword is also a prefix of another codeword. => Prefix Codes
Prefix Codes are unambiguous.
Once the codewords are decided, it is easy to compress (encode) and decompress (decode).
Eg. “abc” is coded as “0101100”
2. File size must be smallest.
=> Can be represented by a full binary tree.
=> Usually less frequent characters are at bottom
Let C be the alphabet (eg. C={‘a’,‘b’,‘c’,‘d’,‘e’,‘f’})
For each character c, no. of bits to encode all c’s occurrences = freqc*depthc
File size $B(T) = \sum_{c \in C} \text{freq}_c \cdot \text{depth}_c$
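Because no codeword is a prefix of another, a decoder can read bits left to right and emit a character as soon as the bits read so far match a codeword. The sketch below illustrates this with the variable-length code from the example; the names `CODE`, `encode` and `decode` are hypothetical.

```python
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def encode(text, code=CODE):
    """Concatenate the codewords of the characters in `text`."""
    return "".join(code[ch] for ch in text)


def decode(bits, code=CODE):
    """Decode a bit string produced by a prefix code."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix property: the first match is the only match
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


print(encode("abc"))      # 0101100
print(decode("0101100"))  # abc
```

A lookup table is enough here; walking the code tree from the root for each bit would do the same job.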
Huffman Codes
How do we find the optimal prefix code?
Huffman code (1952) was invented to solve it.
A Greedy Approach.
Q: A min-priority queue (frequencies in thousands). At each step the two least-frequent nodes are merged:
f:5  e:9  c:12  b:13  d:16  a:45
c:12  b:13  14(f:5, e:9)  d:16  a:45
14  d:16  25(c:12, b:13)  a:45
25  30(14, d:16)  a:45
a:45  55(25, 30)
100(a:45, 55)
Huffman Codes
Q: A min-priority queue
HUFFMAN(C)
Build Q from C
For i = 1 to |C|-1
    Allocate a new node z
    z.left = x = EXTRACT_MIN(Q)
    z.right = y = EXTRACT_MIN(Q)
    z.freq = x.freq + y.freq
    Insert z into Q in correct position.
Return EXTRACT_MIN(Q)
If Q is implemented as a binary min-heap (with n = |C|),
“Build Q from C” is O(n)
“EXTRACT_MIN(Q)” is O(lg n)
“Insert z into Q” is O(lg n)
The loop runs n-1 times, so Huffman(C) is O(n lg n)
How is it “greedy”?
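HUFFMAN(C) can be sketched with Python's `heapq` module standing in for the min-priority queue Q. The tree representation (a leaf is a character, an internal node is a `(left, right)` pair), the tie-breaking counter, and the helper `codewords` are assumptions made for this sketch; the input is assumed to be nonempty.

```python
import heapq
from itertools import count


def huffman(freq):
    """Build a Huffman tree from a nonempty {character: frequency} table."""
    order = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(f, next(order), ch) for ch, f in freq.items()]
    heapq.heapify(heap)                       # Build Q from C
    for _ in range(len(freq) - 1):
        fx, _ix, x = heapq.heappop(heap)      # x = EXTRACT_MIN(Q)
        fy, _iy, y = heapq.heappop(heap)      # y = EXTRACT_MIN(Q)
        # New node z with z.left = x, z.right = y, z.freq = x.freq + y.freq.
        heapq.heappush(heap, (fx + fy, next(order), (x, y)))
    return heap[0][2]


def codewords(tree, prefix=""):
    """Read codewords off the tree: 0 for a left edge, 1 for a right edge."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}  # a one-character alphabet still needs one bit
    left, right = tree
    table = codewords(left, prefix + "0")
    table.update(codewords(right, prefix + "1"))
    return table


freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}  # in thousands
codes = codewords(huffman(freq))
print(codes)
print(sum(freq[ch] * len(codes[ch]) for ch in freq))  # 224 (thousand bits)
```

On these frequencies there are no ties, so the codewords come out the same as the variable-length column of the example table.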
Greedy Algorithms
Summary
Casual Introduction: Two Knapsack Problems
An Activity-Selection Problem
Greedy Algorithm Design
Steps of Greedy Algorithm Design
Optimal Substructure Property
Greedy-Choice Property
Comparison with Dynamic Programming
Huffman Codes