Sorting1

CS6045: Advanced Algorithms
Sorting Algorithms
Sorting
• Input: a sequence of n numbers $a_1, a_2, \ldots, a_n$
• Output: a sorted sequence $a_1 \le a_2 \le \cdots \le a_n$
Insertion Sort
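InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]                          // next current
    j = i - 1                           // go left
    while (j > 0) and (A[j] > key) {    // find place for current
      A[j+1] = A[j]                     // shift sorted right
      j = j - 1                         // go left
    }
    A[j+1] = key                        // put current in place
  }
}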
An Example: Insertion Sort
Array A (positions 1 to 4): 30 10 40 20
Initially i, j, key, A[j], and A[j+1] are not yet set.
InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]
    j = i - 1;
    while (j > 0) and (A[j] > key) {
      A[j+1] = A[j]
      j = j - 1
    }
    A[j+1] = key
  }
}
An Example: Insertion Sort (trace)
Each row is one state of the run on A = 30 10 40 20, in order, using the InsertionSort pseudocode above. A dash marks an undefined value (j = 0 is outside the array).
A[1..4]     | i | j | key | A[j] | A[j+1]
30 10 40 20 | 2 | 1 | 10  | 30   | 10
30 30 40 20 | 2 | 1 | 10  | 30   | 30
30 30 40 20 | 2 | 0 | 10  | -    | 30
10 30 40 20 | 2 | 0 | 10  | -    | 10
10 30 40 20 | 3 | 0 | 10  | -    | 10
10 30 40 20 | 3 | 0 | 40  | -    | 10
10 30 40 20 | 3 | 2 | 40  | 30   | 40
10 30 40 20 | 4 | 2 | 40  | 30   | 40
10 30 40 20 | 4 | 2 | 20  | 30   | 40
10 30 40 20 | 4 | 3 | 20  | 40   | 20
10 30 40 40 | 4 | 3 | 20  | 40   | 40
10 30 40 40 | 4 | 2 | 20  | 30   | 40
10 30 30 40 | 4 | 2 | 20  | 30   | 30
10 30 30 40 | 4 | 1 | 20  | 10   | 30
10 20 30 40 | 4 | 1 | 20  | 10   | 20
Done!
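The walkthrough above can be reproduced with a short Python sketch. It is a translation of the slides' 1-based pseudocode to Python's 0-based lists, so the outer loop starts at index 1 and the inner test becomes j >= 0. The function name insertion_sort is an assumption made for illustration.

```python
def insertion_sort(a):
    """Sort the list a in place and return it (0-based InsertionSort)."""
    for i in range(1, len(a)):          # pseudocode: for i = 2 to n
        key = a[i]                      # next element to place
        j = i - 1                       # start scanning to the left
        while j >= 0 and a[j] > key:    # pseudocode: j > 0 and A[j] > key
            a[j + 1] = a[j]             # shift the larger element right
            j -= 1                      # keep going left
        a[j + 1] = key                  # put key in its place
    return a


print(insertion_sort([30, 10, 40, 20]))  # [10, 20, 30, 40]
```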
Correctness
• Which elements are in sorted order after
running each iteration?
• Loop invariant: at the start of each iteration of the
for loop, the subarray A[1 … i-1] consists of the
elements originally in A[1 … i-1] but in sorted order
Correctness
• To use a loop invariant to prove correctness,
we must show three things about it:
– Initialization: It is true prior to the first iteration
of the loop.
– Maintenance: If it is true before an iteration of the
loop, it remains true before the next iteration.
– Termination: When the loop terminates, the
invariant—usually along with the reason that the
loop terminated—gives us a useful property that
helps show that the algorithm is correct.
Insertion Sort Correctness
• Initialization: Just before the first iteration, i = 2. The subarray A[1 … i-1] is
the single element A[1], which is the element originally in A[1], and it is
trivially sorted.
• Maintenance: To be precise, we would need to state and prove a loop invariant
for the “inner” while loop. Rather than getting bogged down in another loop
invariant, we instead note that the body of the inner while loop works by
moving A[i-1], A[i-2], A[i-3], and so on, by one position to the
right until the proper position for key (which has the value that started out in
A[i]) is found. At that point, the value of key is placed into this position, so
A[1 … i] holds the elements originally in A[1 … i] in sorted order, and
incrementing i for the next iteration preserves the invariant.
• Termination: The outer for loop ends when i > n, which occurs when i = n+1.
Therefore, i – 1 = n. Plugging n in for i - 1 in the loop invariant, the subarray
A[1 … n] consists of the elements originally in A[1 … n] but in sorted order. In
other words, the entire array is sorted.
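One way to make the invariant concrete is to assert it at the top of every outer iteration. The sketch below is a 0-based Python translation of the pseudocode, so the 1-based subarray A[1 … i-1] becomes the slice a[:i]. The name insertion_sort_checked is hypothetical, and the assertions only test the invariant on a given input; they do not prove it.

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the loop invariant on every iteration."""
    original = list(a)                  # snapshot of the input for the checks
    for i in range(1, len(a)):
        # Invariant: a[:i] holds the elements originally in a[:i], sorted.
        assert a[:i] == sorted(original[:i])
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    # Termination: the invariant now covers the whole list.
    assert a == sorted(original)
    return a
```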
Analyze Algorithm’s Running Time
• Depends on
– input size
– input quality (partially ordered)
• Kinds of analysis
– Worst case (standard)
– Average case (sometimes)
– Best case (never)
Asymptotic Analysis
• Ignore machine dependent constants
• Look at growth of T(n) as n → ∞
– Drop lower-order terms
– Ignore the constant coefficient in the leading
term
• O - big O notation to represent the order of
growth
Asymptotic Notations
• Big O: O
– f = O(g) if f grows no faster than g
– f / g < some constant (for all large n)
• Big Omega: Ω
– f = Ω(g) if f grows no slower than g
– f / g > some positive constant (for all large n)
• Big Theta: Θ
– f = Θ(g) if f has the same growth rate as g
– some positive constant < f / g < some constant (for all large n)
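As a small worked example of the ratio test, take f(n) = 3n^2 + 5n and g(n) = n^2. These two functions are chosen here only for illustration and do not come from the slides.

```latex
\[
\frac{f(n)}{g(n)} = \frac{3n^2 + 5n}{n^2} = 3 + \frac{5}{n},
\qquad
3 < 3 + \frac{5}{n} \le 8 \quad \text{for all } n \ge 1 .
\]
```

The ratio stays between two positive constants, so f = Θ(g), and therefore also f = O(g) and f = Ω(g).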
Time Analysis of Insertion Sort
InsertionSort(A, n) {
  for i = 2 to n {
    key = A[i]
    j = i - 1;
    while (j > 0) and (A[j] > key) {
      A[j+1] = A[j]
      j = j - 1
    }
    A[j+1] = key
  }
}
• How many times will the while loop execute?
Insertion Sort
Statement                              | Effort
InsertionSort(A, n) {                  |
  for i = 2 to n {                     | c1n
    key = A[i]                         | c2(n-1)
    j = i - 1;                         | c3(n-1)
    while (j > 0) and (A[j] > key) {   | c4T
      A[j+1] = A[j]                    | c5(T-(n-1))
      j = j - 1                        | c6(T-(n-1))
    }                                  | 0
    A[j+1] = key                       | c7(n-1)
  }                                    | 0
}                                      |
T = t2 + t3 + … + tn, where ti is the number of while-expression evaluations for the
ith for-loop iteration
Analyzing Insertion Sort
• T(n) = c1n + c2(n-1) + c3(n-1) + c4T + c5(T - (n-1)) + c6(T - (n-1)) + c7(n-1)
= c8T + c9n + c10
• What can T be?
– Best case -- inner loop body never executed
• ti = 1 ⇒ T(n) is a linear function
– Worst case -- inner loop body executed for all
previous elements
• ti = i ⇒ T(n) is a quadratic function
– Average case
• ???
Insertion Sort Analysis
• Best Case
– O(n)
• Worst Case
– O(n^2)
• Average Case
– O(n^2)
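The best and worst cases can be checked empirically by counting how often the while condition is evaluated, which is the quantity T in the table above. The sketch below uses a 0-based Python translation of the pseudocode; count_while_tests is a hypothetical helper name. For already sorted input it should give T = n - 1 (every ti = 1), and for reverse-sorted input T = 2 + 3 + … + n = n(n+1)/2 - 1 (every ti = i).

```python
def count_while_tests(values):
    """Run insertion sort on a copy and count while-condition evaluations."""
    a = list(values)
    total = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while True:
            total += 1                       # one evaluation of the condition
            if not (j >= 0 and a[j] > key):
                break
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return total


n = 100
best = count_while_tests(range(1, n + 1))    # already sorted input
worst = count_while_tests(range(n, 0, -1))   # reverse-sorted input
print(best, worst)                           # 99 5049
```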
Merge Sort
• Divide (into two equal parts)
• Conquer (solve for each part separately)
• Combine separate solutions
• Merge sort
– Divide into two equal parts
– Sort each part using merge-sort (recursion!!!)
– Merge two sorted subsequences
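The three steps can be sketched in a few lines of Python. This version returns a new list rather than sorting in place, and it borrows the standard library's heapq.merge as a stand-in for the merge step; both choices are assumptions made to keep the example short, not the method used in the slides.

```python
from heapq import merge as merge_sorted  # merges already sorted iterables


def merge_sort(seq):
    """Return a new sorted list built by divide, conquer, and combine."""
    if len(seq) <= 1:                    # zero or one element: already sorted
        return list(seq)
    mid = len(seq) // 2                  # divide into two (nearly) equal parts
    left = merge_sort(seq[:mid])         # conquer: sort each part recursively
    right = merge_sort(seq[mid:])
    return list(merge_sorted(left, right))  # combine the two sorted parts


print(merge_sort([5, 1, 8, 3, 7, 4, 6, 2]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```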
Merge Sort
[Figures: Example 1, Example 2]
Merging
• Design an algorithm that merges two sorted subarrays in O(n) time.
// split the array A into L and R
// add sentinels at the end of arrays L and R
// find the smaller value from L and R and then save it back to array A
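The three commented steps can be filled in as follows. This is a minimal sketch assuming numeric keys, so that math.inf can serve as the sentinel, and 0-based inclusive bounds left, mid, right; the slides' own Merge routine is not shown, so the details here are an assumption.

```python
import math


def merge(a, left, mid, right):
    """Merge sorted a[left..mid] and a[mid+1..right] (inclusive) in place."""
    # Split the array into L and R, with a sentinel at the end of each.
    L = a[left:mid + 1] + [math.inf]
    R = a[mid + 1:right + 1] + [math.inf]
    i = j = 0
    # Take the smaller front value from L and R and save it back to a.
    for k in range(left, right + 1):
        if L[i] <= R[j]:
            a[k] = L[i]
            i += 1
        else:
            a[k] = R[j]
            j += 1


data = [1, 3, 5, 8, 2, 4, 6, 7]
merge(data, 0, 3, 7)
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The loop runs once per output position and does constant work each time, which gives the O(n) bound. The sentinels remove the need to test whether either half has run out.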
Analysis of Merge Sort
Statement                              | Effort
MergeSort(A, left, right) {            | T(n)
  if (left < right) {                  | Θ(1)
    mid = floor((left + right) / 2);   | Θ(1)
    MergeSort(A, left, mid);           | T(n/2)
    MergeSort(A, mid+1, right);        | T(n/2)
    Merge(A, left, mid, right);        | Θ(n)
  }                                    |
}                                      |
• So T(n) =
Θ(1) when n = 1, and
2T(n/2) + Θ(n) when n > 1
• So what (more succinctly) is T(n)?
Recurrences
• The expression:
$$T(n) = \begin{cases} c & n = 1 \\ 2T\left(\frac{n}{2}\right) + cn & n > 1 \end{cases}$$
is a recurrence.
– Recurrence: an equation that describes a
function in terms of its value on smaller
inputs
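Recurrence Examples
$$s(n) = \begin{cases} 0 & n = 0 \\ c + s(n-1) & n > 0 \end{cases}$$
$$s(n) = \begin{cases} 0 & n = 0 \\ n + s(n-1) & n > 0 \end{cases}$$
$$T(n) = \begin{cases} c & n = 1 \\ 2T\left(\frac{n}{2}\right) + c & n > 1 \end{cases}$$
$$T(n) = \begin{cases} c & n = 1 \\ aT\left(\frac{n}{b}\right) + cn & n > 1 \end{cases}$$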
Solving Recurrences
• Substitution method
• Iteration method
• Master method
Solving Recurrences
• The substitution method (Textbook1 4.1)
– A.k.a. the “making a good guess method”
– Guess the form of the answer, then use
induction to find the constants and show that
solution works
– Examples:
• T(n) = 2T(n/2) + Θ(n) → T(n) = Θ(n lg n)
• T(n) = 2T(⌊n/2⌋) + n → T(n) = Θ(n lg n)
• T(n) = 2T(⌊n/2⌋ + 17) + n → Θ(n lg n)
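Analyze Merge Sort
Merge tree for n = 8 (each row merges the sorted runs of the row below it):
1 2 3 4 5 6 7 8
1 3 5 8 | 2 4 6 7
1 5 | 3 8 | 4 7 | 2 6
single elements at the bottom level
• n comparisons per level
• log n levels
• total runtime = n log n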
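The level-by-level count can be cross-checked by evaluating the merge sort recurrence directly. The sketch below assumes unit constants, T(1) = 1 and T(n) = 2T(n/2) + n, and only tries powers of two; under those assumptions the exact value is n lg n + n, which is dominated by the n log n term.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """Cost recurrence with unit constants: T(1) = 1, T(n) = 2T(n/2) + n."""
    return 1 if n == 1 else 2 * T(n // 2) + n


for k in range(11):
    n = 2 ** k                      # n = 1, 2, 4, ..., 1024, so lg n = k
    assert T(n) == n * k + n        # exactly n lg n + n
print(T(8), T(1024))                # 32 11264
```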
Solving Recurrences
• Another option is what the book calls the
“iteration method”
– Expand the recurrence
– Work some algebra to express as a summation
– Evaluate the summation
• We will show several examples
$$s(n) = \begin{cases} 0 & n = 0 \\ c + s(n-1) & n > 0 \end{cases}$$
• s(n) = c + s(n-1)
= c + c + s(n-2)
= 2c + s(n-2)
= 2c + c + s(n-3)
= 3c + s(n-3)
…
= kc + s(n-k) = ck + s(n-k)
• So far for n >= k we have
– s(n) = ck + s(n-k)
• What if k = n?
– s(n) = cn + s(0) = cn
• Thus in general
– s(n) = cn
$$s(n) = \begin{cases} 0 & n = 0 \\ n + s(n-1) & n > 0 \end{cases}$$
• s(n) = n + s(n-1)
= n + n-1 + s(n-2)
= n + n-1 + n-2 + s(n-3)
= n + n-1 + n-2 + n-3 + s(n-4)
…
= n + n-1 + n-2 + n-3 + … + n-(k-1) + s(n-k)
• So far for n >= k we have
$$s(n) = \sum_{i=n-k+1}^{n} i + s(n-k)$$
• What if k = n?
$$s(n) = \sum_{i=1}^{n} i + s(0) = \sum_{i=1}^{n} i + 0 = n\,\frac{n+1}{2}$$
• Thus in general
$$s(n) = n\,\frac{n+1}{2}$$
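The closed forms for the two recurrences s(n) = c + s(n-1) and s(n) = n + s(n-1) can be sanity-checked by evaluating the recurrences directly for small n. The function names s_const and s_linear are hypothetical, and a finite check like this supports the result but does not replace the derivation.

```python
def s_const(n, c):
    """s(0) = 0, s(n) = c + s(n-1)."""
    return 0 if n == 0 else c + s_const(n - 1, c)


def s_linear(n):
    """s(0) = 0, s(n) = n + s(n-1)."""
    return 0 if n == 0 else n + s_linear(n - 1)


for n in range(50):
    assert s_const(n, 3) == 3 * n               # closed form cn with c = 3
    assert s_linear(n) == n * (n + 1) // 2      # closed form n(n+1)/2
print(s_const(10, 3), s_linear(10))             # 30 55
```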
$$T(n) = \begin{cases} c & n = 1 \\ 2T\left(\frac{n}{2}\right) + c & n > 1 \end{cases}$$
• T(n) = 2T(n/2) + c
= 2(2T(n/2/2) + c) + c
= 2^2 T(n/2^2) + 2c + c
= 2^2 (2T(n/2^2/2) + c) + 3c
= 2^3 T(n/2^3) + 4c + 3c
= 2^3 T(n/2^3) + 7c
= 2^3 (2T(n/2^3/2) + c) + 7c
= 2^4 T(n/2^4) + 15c
…
= 2^k T(n/2^k) + (2^k - 1)c
• So far for n >= 2^k we have
– T(n) = 2^k T(n/2^k) + (2^k - 1)c
• What if k = lg n?
– T(n) = 2^(lg n) T(n/2^(lg n)) + (2^(lg n) - 1)c
= n T(n/n) + (n - 1)c
= n T(1) + (n-1)c
= nc + (n-1)c = (2n - 1)c
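The result (2n - 1)c can be checked numerically for powers of two, where n/2 is always an integer. The helper name T_const and the choice c = 5 are arbitrary assumptions for the check.

```python
def T_const(n, c):
    """T(1) = c, T(n) = 2T(n/2) + c, for n a power of two."""
    return c if n == 1 else 2 * T_const(n // 2, c) + c


for k in range(11):
    n = 2 ** k
    assert T_const(n, 5) == (2 * n - 1) * 5     # closed form (2n - 1)c
print(T_const(8, 5))                            # 75
```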
The Master Theorem
• Given: a divide and conquer algorithm
– An algorithm that divides the problem of size n
into a subproblems, each of size n/b
– Let the cost of each stage (i.e., the work to
divide the problem + combine solved
subproblems) be described by the function f(n)
• Then, the Master Theorem gives us a
cookbook for the algorithm’s running time:
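The Master Theorem
• if T(n) = aT(n/b) + f(n) then
$$T(n) = \begin{cases}
\Theta\left(n^{\log_b a}\right) & f(n) = O\left(n^{\log_b a - \varepsilon}\right) \\
\Theta\left(n^{\log_b a} \log n\right) & f(n) = \Theta\left(n^{\log_b a}\right) \\
\Theta\left(f(n)\right) & f(n) = \Omega\left(n^{\log_b a + \varepsilon}\right) \text{ AND } a f(n/b) \le c f(n) \text{ for large } n
\end{cases}$$
where ε > 0 and c < 1 are constants.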
Using The Master Method
• T(n) = 9T(n/3) + n
– a=9, b=3, f(n) = n
– n^(log_b a) = n^(log_3 9) = Θ(n^2)
– Since f(n) = O(n^(log_3 9 - ε)), where ε=1, case 1
applies:
$$T(n) = \Theta\left(n^{\log_b a}\right) \text{ when } f(n) = O\left(n^{\log_b a - \varepsilon}\right)$$
– Thus the solution is T(n) = Θ(n^2)
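For the common special case f(n) = n^k, choosing the case comes down to comparing k with log_b a, which is easy to automate. The sketch below handles only that polynomial case; master_theorem and its output strings are hypothetical. When k > log_b a, the regularity condition of case 3 holds automatically, because a f(n/b) = (a / b^k) n^k and a / b^k < 1.

```python
import math


def master_theorem(a, b, k):
    """Classify T(n) = a*T(n/b) + n**k, assuming a >= 1, b > 1, k >= 0."""
    crit = math.log(a, b)                 # critical exponent log_b(a)
    if math.isclose(k, crit):             # f(n) matches n^(log_b a)
        return f"case 2: Theta(n^{k:g} log n)"
    if k < crit:                          # f(n) grows polynomially slower
        return f"case 1: Theta(n^{crit:g})"
    return f"case 3: Theta(n^{k:g})"      # f(n) grows polynomially faster


print(master_theorem(9, 3, 1))   # case 1: Theta(n^2)
print(master_theorem(2, 2, 1))   # case 2: Theta(n^1 log n)
```

The first call is the slide's example 9T(n/3) + n, and the second is the merge sort recurrence.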