ppt - HKOI

advertisement
Searching and Sorting
Gary Wong
Prerequisite
• Time complexity
• Pseudocode
• (Recursion)
Agenda
• Searching
– Linear (Sequential)
Search
– Binary Search
• Sorting
–
–
–
–
Bubble Sort
Merge Sort
Quick Sort
Counting Sort
Linear Search
One by one...
Linear Search
• Check every element in the list, until the
target is found
• For example, our target is 38:
i
0
1
2
3
4
5
a[i]
25
14
9
38
77
45
Not found!
Found!
Linear Search
1) Initilize an index variable i
2) Compare a[i] with target
•
•
If a[i]==target, found
If a[i]!=target,
•
•
If all have checked already, not found
Otherwise, change i into next index and go to step 2
Linear Search
• Time complexity in worst case?
– If N is number of elements,
– Time complexity = O(N)
• Advantage?
• Disadvantage?
Binary Search
Chop by half...
Binary Search
• Given a SORTED list:
• (Again, our target is 38)
Smaller!
Found!
i
0
1
2
3
4
5
a[i]
9
14
25
38
45
77
L
Larger!
R
Binary Search
• Why always in the middle, but not other
positions, say one-third of list?
1) Initialize boundaries L and R
2) While L is still on the left of R
•
•
•
•
mid = (L+R)/2
If a[mid]>Target, set R be m-1 and go to step 2
If a[mid]<Target, set L be m+1 and go to step 2
If a[mid]==Target, found
Binary Search
• Time complexity in the worst case?
– If N is the number of elements,
– Time complexity = O(lg N)
– Why?
• Advantage?
• Disadvantage?
Example: Three Little Pigs
• HKOI 2006 Final Senior Q1
• Given three lists, each with M numbers,
choose one number from each list such that
their sum is maximum, but not greater than N.
• Constraints:
– M ≤ 3000
– Time limit: 1s
Example: Three Little Pigs
•
•
•
•
How many possible triples are there?
Why not check them all?
Time complexity?
Expected score = 50
Example: Three Little Pigs
• A simpler question: How would you search for
the maximum number in ONE SORTED list
such that it is not greater than N?
• Binary search!
– With slight modification though
– How?
Example: Three Little Pigs
• Say, N=37
i
0
1
2
3
4
5
a[i]
9
14
25
38
45
77
L
R
Example: Three Little Pigs
• Let’s go back to original problem
• If you have chosen two numbers a1[i] and a2[j]
already, how would you search for the third
number?
• Recall: How would you search for the
maximum number in ONE SORTED list such
that it is not greater than N-a1[i]-a2[j]?
Example: Three Little Pigs
• Overall time complexity?
• Expected score = 90
• 
Example: Three Little Pigs
• Slightly harder: Given TWO lists, each with M
numbers, choose one number from each list
such that their sum is maximum, but not
greater than N.
• Linear search?
• Sort one list, then binary search!
• Time complexity = O(M lg M)
– O(M2) if less efficient sorting algorithm is used
• But, can it be better?
Example: Three Little Pigs
• Why is it so slow if we use linear search?
• If a1[i] and a2[j] are chosen, and their sum is
smaller than N:
– Will you consider any number in a1 that is smaller
than or equal a1[i]?
• If a1[i] and a2[j] are chosen, and their sum is
greater than N:
– Will you consider any number in a2 that is greater
than or equal to a2[j]?
Example: Three Little Pigs
• Recall: Why is it so slow if we use linear search?
– Because you use it for too many times!
• At which number in each list should you begin
the linear search?
• Never look back at those we do not consider!
• Time complexity?
• Expected score = 100 
What can you learn?
• Improve one ‘dimension’ using binary search
• Linear search for a few times can be more
efficient than binary search for many times!
– DO NOT underestimate linear search!!!
Points to note
• To use binary search, the list MUST BE
SORTED (either ascending or decending)
– NEVER make assumptions yourself
– Problem setters usually do not sort for you
• Sorting is the bottleneck of efficiency
• But... how to sort?
How to sort?
• For C++: sort()
• Time complexity for sort() is O(N lg N)
– which is considered as efficient
• HOWEVER...
– Problem setters SELDOM test contestants on pure
usage of efficient sorting algorithms
– Special properties of sorting algorithms are
essential in problem-solving
• So, pay attention! 
Bubble Sort
Smaller? Float! Larger? Sink!
Bubble Sort
• Suppose we need to sort in ascending order...
• Repeatedly check adjacent pairs sequentially,
swap if not in correct order
• Example:
18
9
20
11
77
45
Incorrect order, swap!
• The last number is always the largest
Correct order, pass!
Bubble Sort
• Fix the last number, do the same procedures
for first N-1 numbers again...
• Fix the last 2 numbers, do the same
procedures for first N-2 numbers again...
• ...
• Fix the last N-2 numbers, do the same
procedures for first 2 numbers again...
Bubble Sort
for i -> 1 to n-1
for j -> 1 to n-i
if a[j]>a[j+1], swap them
• How to swap?
Merge Sort & Quick Sort
Many a little makes a mickle...
Merge Sort
• Now given two SORTED list, how would you
‘merge’ them to form ONE SORTED list?
List 1: 8 14 22
List 2: 10 13 29 65
Temporary list: 8 10 13 14 22 29 65
Combined list:
Merge Sort
Merge
1) While both lists have numbers still not yet considered
• Compare the current first number in two lists
• Put the smaller number into temporary list, then discard it
2) If list 1 is not empty, add them into temporary list
3) If list 2 is not empty, add them into temporary list
4) Put the numbers in temporary list back to the desired list
Merge Sort
• Suppose you are given a ‘function’ called
‘mergesort(L,R)’, which can sort the left half and
right half of list from L to R:
10
•
•
•
•
13
29
65
8
14
L
(L+R)/2 (L+R)/2+1
How to sort the whole list?
Merge them!
How can we sort the left and right half?
Why not making use of ‘mergesort(L,R)’?
22
R
Merge Sort
mergesort(L,R){
If L is equal to R, done;
Otherwise,
m=(L+R)/2;
mergesort(L,M);
mergesort(M+1,R);
Merge the lists [L,M] and [M+1,R];
}
Merge Sort
mergesort(0,6)
mergesort(0,1)
65
mergesort(0,0)
10
mergesort(1,1)
mergesort(2,3)
29
13
mergesort(2,2) mergesort(3,3)
mergesort(4,5)
14
65
13
10
29
13
22
mergesort(4,4) mergesort(5,5)
mergesort(0,3)
10
8
8
mergesort(6,6)
mergesort(4,6)
29
65
14
22
8
14
29
65
22
Merge Sort
• Time complexity?
• O(N lg N)
• Why?
Quick Sort
• Choose a number as a ‘pivot’
• Put all numbers smaller than ‘pivot’ on its left
side
• Put all numbers greater than (or equal to)
‘pivot’ on its right side
10
13
29
65
8
14
22
10
13
8
14
22
65
29
Quick Sort
a[y] < pivot!
shift x, swap!
• How?
10
x
13
29
65
8
14
22
y
• y shifts to right by 1 unit in every round
• Check if a[y] < pivot
– If yes, shift x to right by 1 unit and swap a[x] and a[y]
• If y is at 2nd last position, swap a[x+1] and pivot
• Time complexity?
Quick Sort
10
13
8
14
22
65
29
• Use similar idea as in merge sort
• If we have a function called ‘quicksort(L,R)’...
– Make use of ‘quicksort(L,R)’ to sort the two halves!
Quick Sort
quicksort(L,R){
If L is equal to R, done;
Otherwise,
Choose a number as a pivot, say a[p];
Perform pivoting action;
quicksort(L,p-1);
quicksort(p+1,R);
}
Quick Sort
•
•
•
•
Time complexity in worst case?
O(N2) 
What is the worst case?
Preventions:
– Choose a random number as pivot
– Pick 3 numbers, choose their median
• Under randomization, expected time
complexity is O(N lg N)
Counting Sort
No more comparison...
Counting Sort
• Create a list, then tally!
• Count the occurences of elements
• Example:
– Sort {2,6,1,4,2,1,9,6,4,1} in ascending order
– A list from 1 to 9 (or upper bound of input no.)
– Three ‘1’s, two ‘2’s, two ‘4’s, two ‘6’s and one ‘9’
– List them all!
Counting Sort
• Time complexity?
• O(range * N)
• Good for numbers with small range
More...
If we have time...
More...
•
•
•
•
•
Lower bound of sorting efficiency?!
Guess the answer by binary search
Count the number of ‘inversions’
Find the kth largest number
Other sorting algorithms
Time for lunch
Yummy!
Download