COSC 3100 Transform and Conquer

Instructor: Tanvir
What is Transform and Conquer ?
• The 4th algorithm design technique we are going
to study
• Three major variations
– Instance Simplification: Transform to a simpler or more
convenient instance of the same problem
– Representation Change: Transform to a different
representation of the same instance
– Problem Reduction: Transform to an instance of a
different problem for which you know an efficient
algorithm
• Two-step process:
– Step 1: Modify problem instance to something easier to
solve
– Step 2: Conquer!
What is Transform and Conquer ?
Problem’s instance -> simpler instance, or another representation, or another problem’s instance -> Solution
• Instance simplification (simpler instance)
  1. Presorting
• Representation change (another representation)
  1. Heapsort
  2. Horner’s Rule
  3. Binary Exponentiation
• Problem reduction (another problem’s instance)
  1. Computing Least Common Multiple
  2. Counting Paths in a Graph
  3. Reduction to Optimization Problems
  4. Reduction to Graph Problems
Trns. & Conq.: Presorting
• Many questions about a list are
easier to answer if the list is sorted
• We shall see three examples
• Example 1: Checking element uniqueness in an array
What is the brute-force idea ?
Take, say, the first element and check whether it is in the rest of the array, take the second element and check if it is in the rest of the array, and so on…
E.g., 4 1 8 9 3 7 10 2 3 1
T(n) = (n-1)+(n-2)+…+1 = (n-1)n/2 ∈ Θ(n^2)
Trns. & Conq.: Presorting
• Let us apply sorting first
ALGORITHM PresortElementUniqueness(A[0..n-1])
    sort the array A
    for i <- 0 to n-2 do
        if A[i] = A[i+1]
            return false
    return true
E.g., 4 1 8 9 3 7 10 2 3 1 is sorted into 1 1 2 3 3 4 7 8 9 10
T(n) = Tsort(n) + Tscan(n) ∈ Θ(n lg n) + Θ(n) = Θ(n lg n)
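The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name `presort_element_uniqueness` is made up here, and Python's built-in `sorted` stands in for "sort the array A".

```python
def presort_element_uniqueness(a):
    """Return True if all elements of `a` are distinct, False otherwise."""
    s = sorted(a)  # the presorting step; works on a copy of the input
    # After sorting, equal elements must be neighbours, so one scan suffices.
    for i in range(len(s) - 1):
        if s[i] == s[i + 1]:
            return False
    return True
```

For the slide's example, `presort_element_uniqueness([4, 1, 8, 9, 3, 7, 10, 2, 3, 1])` finds the adjacent pair 1 1 right after sorting and reports a duplicate.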
Trns. & Conq.: Presorting
• Example 2: Computing a mode
• A mode is a value that occurs most often in a list
• E.g., in 5 1 5 7 6 5 7, the mode is 5
Brute-force: scan the list and compute the frequencies of all distinct values, then find the value with the highest frequency
How to implement brute-force ?
Store the values already encountered along with their frequencies, in a separate list.
Value:     5 1 7 6
Frequency: 3 1 2 1
What is the worst-case input ?
An array with no equal elements…
C(n) = 0 + 1 + … + (n-1) = (n-1)n/2 ∈ Θ(n^2)
Trns. & Conq.: Presorting
• Example 2: Computing a mode (contd.)
• If we sort the array first, all equal
values will be adjacent to each other.
• To compute the mode, all we need is to find the longest run of adjacent equal values in the sorted array
Trns. & Conq.: Presorting
Example 2: Computing a mode (contd.)
ALGORITHM PresortMode( A[0..n-1] )
    sort the array A
    i <- 0
    modefrequency <- 0
    while i ≤ n-1 do
        runlength <- 1; runvalue <- A[i]
        while i+runlength ≤ n-1 and A[i+runlength] = runvalue do
            runlength <- runlength+1
        if runlength > modefrequency
            modefrequency <- runlength; modevalue <- runvalue
        i <- i+runlength
    return modevalue
E.g., sorted array: 1 1 2 3 3 3 7 8 9 10
The outer while loop takes linear time in total, so the overall runtime is dominated by the time for sorting, Θ(n lg n)
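A possible Python rendering of PresortMode, kept close to the pseudocode. The name `presort_mode` and the choice to return `None` for an empty list are assumptions made for this sketch.

```python
def presort_mode(a):
    """Return a most frequent value of `a` (None if `a` is empty)."""
    s = sorted(a)  # equal values become adjacent
    n = len(s)
    i = 0
    mode_frequency = 0
    mode_value = None
    while i <= n - 1:
        run_length = 1
        run_value = s[i]
        # Extend the current run of equal values as far as it goes.
        while i + run_length <= n - 1 and s[i + run_length] == run_value:
            run_length += 1
        if run_length > mode_frequency:
            mode_frequency = run_length
            mode_value = run_value
        i += run_length  # jump to the start of the next run
    return mode_value
```

Because `i` always jumps past a whole run, each element is visited a constant number of times, which is why the scan is linear.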
Trns. & Conq.: Heapsort
• Another O(nlgn) sorting algorithm
• Uses a clever data structure called “Heap”
• It transforms an array into a Heap and then
sorting becomes very easy
• Heap has other important uses, like in
implementing “priority queue”
• Recall, priority queue is a multiset of items with
an orderable characteristic called “priority”, with
the following operations:
– Finding an item with the highest priority
– Deleting an item with the highest priority
– Adding a new item to the multiset
Let us study Heap first…
Trns. & Conq.: Heap
• A “heap” can be described as a binary
tree, with one key per node, provided the
following two conditions are met:
– Shape property: Binary tree is essentially
complete: all levels are full except possibly the
last level, where only some rightmost leaves
may be missing
– Parental dominance (or heap property): key in
each node is greater than or equal to the keys
in its children (considered automatically
satisfied for the leaves)
This is actually the “max heap”;
there is a corresponding “min heap”…
Trns. & Conq.: Heap
Shape and Heap properties…
[Figure: three example binary trees with keys such as 10, 5, 7, 4, 2, 1. For each one: Is it a Heap ?]
Trns. & Conq.: Heap
Note:
• Sequence of values on a path from the root to a leaf is nonincreasing
• There is no left-to-right order in key values, though
[Figure: heap with keys 10, 8, 7, 5, 2, 1, 6, 3, 5, 1 and its array representation]
index: 0  1 2 3 4 5 6 7 8 9 10
value:   10 8 7 5 2 1 6 3 5  1
parents: indices 1..5, leaves: indices 6..10
ALGORITHM Parent(i)
    return ⌊i/2⌋
ALGORITHM Left(i)
    return 2i
ALGORITHM Right(i)
    return 2i + 1
Trns. & Conq.: Heap
Important properties of heaps:
1. There exists exactly one essentially complete binary tree with n nodes; its height is ⌊lg n⌋
2. The root of a heap contains the largest element
3. A node with all its descendants is also a heap
4. Heap can be implemented as an array by recording its elements in the top-down, left-to-right fashion. H[0] is unused or contains a number bigger than all the rest.
a. Parental node keys will be in the first ⌊n/2⌋ positions, leaf keys will be in the last ⌈n/2⌉ positions
E.g.,
index: 0 1 2 3 4 5 6 7 8 9 10
H:       9 8 7 5 2 1 6 3 5  1
[Figure: the corresponding heap tree with root 9]
b. Children of key at index i (1 ≤ i ≤ ⌊n/2⌋) will be in positions 2i and 2i+1. The parent of a key at index i (2 ≤ i ≤ n) will be in position ⌊i/2⌋.
Which gives: H[i] ≥ max{ H[2i], H[2i+1] } for i = 1, … , ⌊n/2⌋
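The index arithmetic and the dominance condition can be checked mechanically. The sketch below assumes the 1-based layout described above, with `h[0]` as an unused placeholder; the helper names `parent`, `left`, `right` and `is_max_heap` are chosen here for illustration.

```python
def parent(i):
    return i // 2          # floor(i / 2)

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def is_max_heap(h):
    """Check parental dominance for a 1-based array h (h[0] is ignored)."""
    n = len(h) - 1
    for i in range(1, n // 2 + 1):        # only parental positions
        if h[i] < h[left(i)]:
            return False
        # The last parent may have only a left child.
        if right(i) <= n and h[i] < h[right(i)]:
            return False
    return True
```

Only the first ⌊n/2⌋ positions are tested, since the remaining positions hold leaves and satisfy the condition automatically.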
Trns. & Conq.: Heap
• How can we transform an array into a heap? Two ways; the first:
1. Start at the last parent, at index ⌊n/2⌋
2. Check if parental dominance holds for this node’s key
3. If it does not, exchange the node’s key K with the larger key of its children
4. Check if parental dominance holds for K in its new position, and so on…
5. Move on to the preceding parent and repeat; stop after doing this for the root.
E.g., n = 6
index: 0 1 2 3 4 5 6
value:   2 9 7 6 5 8
[Figure: trees showing one parent heapified at a time: 2 9 7 6 5 8, then 2 9 8 6 5 7, then 9 6 8 2 5 7]
This process (fixing one parent) is called “Heapifying”
This method is called “bottom-up heap construction”
Trns. & Conq.: Heap
ALGORITHM HeapBottomUp(H[1..n])
//Input: An array H[1..n] of orderable items
//Output: A heap H[1..n]
for i <- ⌊n/2⌋ downto 1 do
    k <- i; v <- H[k]
    heap <- false
    while not heap and 2*k ≤ n do
        j <- 2*k
        if j < n // there are 2 children; otherwise only a left child
            if H[j] < H[j+1]
                j <- j+1
        if v ≥ H[j]
            heap <- true
        else
            H[k] <- H[j]
            k <- j
    H[k] <- v
E.g., n = 6
index: 0 1 2 3 4 5 6
value:   2 9 7 6 5 8
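A direct Python transcription of HeapBottomUp, offered as a sketch. It assumes a 1-based list whose slot 0 is a placeholder (here `None`), and the name `heap_bottom_up` is made up for this example.

```python
def heap_bottom_up(h):
    """Rearrange the 1-based list h (h[0] unused) into a max-heap, in place."""
    n = len(h) - 1
    for i in range(n // 2, 0, -1):        # last parent down to the root
        k, v = i, h[i]
        heap = False
        while not heap and 2 * k <= n:
            j = 2 * k
            if j < n and h[j] < h[j + 1]:  # two children: pick the larger
                j += 1
            if v >= h[j]:
                heap = True                # parental dominance holds for v
            else:
                h[k] = h[j]                # move the larger child up
                k = j
        h[k] = v                           # v settles in its final position
    return h
```

Running it on the slide's array `[None, 2, 9, 7, 6, 5, 8]` gives `[None, 9, 6, 8, 2, 5, 7]`.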
Trns. & Conq.: Heap
• Let us analyze the HeapBottomUp(H[1..n])’s worst case
Let’s take a heap whose binary tree is full
E.g., n = 2^2-1 = 3, n = 2^3-1 = 7
So, number of nodes, n = 2^m - 1 (m levels)
Why ? Largest possible # of nodes at each level…
So what is the height ?
h = ⌊lg n⌋ = lg(n+1) - 1 = m-1
Let’s look at a worst-case…
A key on level i must travel to the leaf-level h; Level i has (h-i) lower levels. Going down by 1 level requires 2 comparisons: One to determine the larger child, the other to determine if exchange is needed.
$C_{worst}(n) = \sum_{i=0}^{h-1} \sum_{j=1}^{2^i} 2(h-i) = \sum_{i=0}^{h-1} 2(h-i)2^i$
$= 2h \sum_{i=0}^{h-1} 2^i - 2 \sum_{i=0}^{h-1} i 2^i$
(using $\sum_{i=1}^{n} i 2^i = (n-1)2^{n+1} + 2$)
$= 2h(2^h - 1) - 2(h-2)2^h - 4$
$= h2^{h+1} - 2h - h2^{h+1} + 2 \cdot 2^{h+1} - 4$
$= 2(n - \lg(n+1))$
So, fewer than 2n comparisons are required
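As a sanity check on the worst-case count, the double sum can be evaluated numerically for full trees. The sketch below is an illustration only; `worst_case_sum` and `closed_form` are names invented here, and m denotes the number of levels so that n = 2^m - 1 and h = m - 1. Evaluating the sum directly gives 2(n - lg(n+1)), which stays below 2n.

```python
def worst_case_sum(m):
    """Sum over levels i = 0..h-1 of 2(h - i) * 2^i for a full tree with m levels."""
    h = m - 1
    return sum(2 * (h - i) * 2**i for i in range(h))

def closed_form(m):
    """2(n - lg(n + 1)) with n = 2^m - 1, so lg(n + 1) is exactly m."""
    n = 2**m - 1
    return 2 * (n - m)

for m in range(1, 11):
    n = 2**m - 1
    print(n, worst_case_sum(m), closed_form(m), 2 * n)
```

For n = 7 (m = 3) both functions give 8: the root travels two levels (4 comparisons) and each of the two level-1 keys travels one level (2 comparisons each).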
Trns. & Conq.: Heap
• Building a heap, Method 2
• It’s called “top-down heap construction”
• Idea: Successively insert a new key into a
previously constructed heap
• Attach a new node with key K after the last leaf
of the existing heap
• Sift K up until the parental dominance is
established
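E.g., inserting 10 into the heap 9 6 8 2 5 7:
[Figure: 10 is attached after the last leaf, then sifted up past 8 and then past 9, giving the heap 10 6 9 2 5 7 8]
An insertion needs at most one key comparison per level, so no more than the height of the heap’s tree; it is in O(lg n)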
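Sifting a new key up, and building a heap by repeated insertion, might look like this in Python. This is a sketch under the same 1-based layout (slot 0 unused); `heap_insert` and `heap_top_down` are illustrative names, not part of the slides.

```python
def heap_insert(h, key):
    """Insert key into the 1-based max-heap list h (h[0] unused), in place."""
    h.append(key)                 # attach after the last leaf
    k = len(h) - 1
    # Sift up: move smaller parents down until dominance is restored.
    while k > 1 and h[k // 2] < key:
        h[k] = h[k // 2]
        k //= 2
    h[k] = key
    return h

def heap_top_down(items):
    """Build a heap by inserting the items one after another."""
    h = [None]
    for x in items:
        heap_insert(h, x)
    return h
```

Inserting 10 into `[None, 9, 6, 8, 2, 5, 7]` yields `[None, 10, 6, 9, 2, 5, 7, 8]`: one comparison per level on the way up.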
Trns. & Conq.: Heap
• Maximum key deletion from a heap
– Exchange root’s key with the last key K in the
heap
– Decrease the heap’s size by 1
– “Heapify” the smaller tree by sifting K down
establishing “parental dominance” as long as it
is needed
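E.g., we want to delete the root 9 from the heap 9 8 6 2 5 1:
[Figure: 9 is exchanged with the last key 1, the heap’s size drops to 5, and 1 is sifted down, giving the heap 8 5 6 2 1]
Efficiency is O(lg n)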
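The three deletion steps can be sketched in Python as below, again assuming a 1-based list with slot 0 unused. The name `delete_max` and the decision to raise `IndexError` on an empty heap are assumptions of this example.

```python
def delete_max(h):
    """Remove and return the largest key of the 1-based max-heap list h."""
    n = len(h) - 1
    if n < 1:
        raise IndexError("delete_max from an empty heap")
    h[1], h[n] = h[n], h[1]      # exchange the root's key with the last key K
    largest = h.pop()            # decrease the heap's size by 1
    n -= 1
    if n >= 1:
        k, v = 1, h[1]
        while 2 * k <= n:        # sift K down
            j = 2 * k
            if j < n and h[j] < h[j + 1]:
                j += 1           # larger of the two children
            if v >= h[j]:
                break            # parental dominance holds
            h[k] = h[j]
            k = j
        h[k] = v
    return largest
```

On `[None, 9, 8, 6, 2, 5, 1]` the call returns 9 and leaves `[None, 8, 5, 6, 2, 1]`. Each level costs at most two comparisons, so the work is bounded by twice the height.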
Trns. & Conq.: Heapsort
• An interesting algorithm discovered
by J. W. J. Williams in 1964
• It has 2 stages
– Stage 1: Construct a heap from a given
array
– Stage 2: Apply the root-deletion
operation n-1 times to the remaining
heap
Trns. & Conq.: Heapsort
Stage 1: Heap construction
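index: 1 2 3 4 5 6
       2 9 7 6 5 8    input array
       2 9 8 6 5 7    heapify parent at index 3 (7 exchanged with 8)
       2 9 8 6 5 7    heapify parent at index 2 (9 already dominates)
       9 2 8 6 5 7    heapify parent at index 1 (2 exchanged with 9)
       9 6 8 2 5 7    … and 2 sifted further down (exchanged with 6)
[Figure: the corresponding trees after each step]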
Trns. & Conq.: Heapsort
Stage 2: Maximum deletions
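index: 1 2 3 4 5 6       (keys to the right of | are in their final positions)
       9 6 8 2 5 7       heap from Stage 1
       7 6 8 2 5 | 9     exchange root 9 with last key 7
       8 6 7 2 5 | 9     heapify
       5 6 7 2 | 8 9     exchange root 8 with last key 5
       7 6 5 2 | 8 9     heapify
       2 6 5 | 7 8 9     exchange root 7 with last key 2
       6 2 5 | 7 8 9     heapify
       5 2 | 6 7 8 9     exchange root 6 with last key 5 (already a heap)
       2 | 5 6 7 8 9     exchange root 5 with last key 2
       2 5 6 7 8 9       sorted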
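Putting the two stages together gives a compact heapsort. The following Python sketch assumes the 1-based layout used on the slides, adding a placeholder at index 0 internally; `sift_down` and `heapsort` are names chosen for this example, and the function returns a new sorted list instead of sorting the caller's list.

```python
def sift_down(h, k, n):
    """Restore parental dominance for the key at index k within h[1..n]."""
    v = h[k]
    while 2 * k <= n:
        j = 2 * k
        if j < n and h[j] < h[j + 1]:
            j += 1
        if v >= h[j]:
            break
        h[k] = h[j]
        k = j
    h[k] = v

def heapsort(items):
    """Return the items in nondecreasing order using heapsort."""
    h = [None] + list(items)     # 1-based working copy
    n = len(items)
    # Stage 1: bottom-up heap construction.
    for i in range(n // 2, 0, -1):
        sift_down(h, i, n)
    # Stage 2: n-1 root deletions; the deleted maximum lands at position size.
    for size in range(n, 1, -1):
        h[1], h[size] = h[size], h[1]
        sift_down(h, 1, size - 1)
    return h[1:]
```

`heapsort([2, 9, 7, 6, 5, 8])` passes through the same intermediate arrays as the trace above and returns `[2, 5, 6, 7, 8, 9]`.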
Trns. & Conq.: Heapsort
• What is Heapsort’s worst-case
complexity ?
Stage 1: Heap construction is in ? O(n)
Stage 2: Maximum deletions is in ? Let’s see…
Let C(n) be the # of comparisons needed in Stage 2
Recall, height of heap’s tree is ⌊lg n⌋
$C(n) \le 2\lfloor \lg(n-1) \rfloor + 2\lfloor \lg(n-2) \rfloor + \dots + 2\lfloor \lg 1 \rfloor \le 2\sum_{i=1}^{n-1} \lg i$
$C(n) \le 2\sum_{i=1}^{n-1} \lg(n-1) = 2(n-1)\lg(n-1) \le 2n\lg n$
So, C(n) ∈ O(n lg n)
So, for two stages, O(n) + O(n lg n) = O(n lg n)
Trns. & Conq.: Heapsort
• More careful analysis shows that
Heapsort’s worst and best cases are
in Θ(nlgn) like Mergesort
• Additionally Heapsort is in-place
• Timing experiments on random files
show that Heapsort is slower than
Quicksort, but can be competitive
with Mergesort
• What sorting algorithm is used by
Collections.sort(List<T> list) of Java?
Trns. & Conq.: Problem Reduction
[Diagram: Problem 1 (to be solved) --reduction--> Problem 2 (solvable by Alg. A) --Alg. A--> Solution to Problem 2]
• Compute least common multiple of two positive integers, lcm(m, n)
• Say m = 24, and n = 60
• 24 = 2*2*2*3 and 60 = 2*2*3*5
• Take product of all common factors of m and n, and all factors of m not in n, and all factors of n not in m
• lcm(24, 60) = (2*2*3)*2*5 = 120
Is it a good algorithm?
Notice, 24*60 = (2*2*2*3) * (2*2*3*5) = (2*2*3)^2 * 2*5
So, m*n = lcm(m, n) * gcd(m, n)
lcm(m, n) = m*n / gcd(m, n) !
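The reduction of lcm to gcd can be sketched in a few lines of Python. Euclid's algorithm is used for gcd here as one well-known efficient choice; the function names `gcd` and `lcm` are local to this example and both arguments are assumed to be positive integers.

```python
def gcd(m, n):
    """Greatest common divisor by Euclid's algorithm."""
    while n != 0:
        m, n = n, m % n
    return m

def lcm(m, n):
    """Least common multiple via lcm(m, n) = m * n / gcd(m, n)."""
    # Dividing first keeps the intermediate value small; the result is exact
    # because gcd(m, n) divides m.
    return m // gcd(m, n) * n
```

For the slide's numbers, `gcd(24, 60)` is 12 and `lcm(24, 60)` is 120, with no prime factorization needed.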