2 Linear Search - Department of Computer Science

advertisement
School of Computer Science
CPS109 Course Notes Set 10
Alexander Ferworn Updated Fall 11
CPS109 Course Notes 10
Searching and Sorting
Alexander Ferworn
Table of Contents
1
2
3
4
5
6
7
8
9
Big O............................................................................................................................. 1
Linear Search .............................................................................................................. 1
Binary Search .............................................................................................................. 2
B ubble Sort ................................................................................................................. 3
Selection Sort ............................................................................................................... 4
Insertion Sort............................................................................................................... 5
Recursion ..................................................................................................................... 6
Fibonacci numbers ...................................................................................................... 7
Merge sort .................................................................................................................... 7
1 Big O
Big O is a way of characterizing an algorithm in Computer Science by relating the
algorithms performance to the size of data it is presented and its worst case performance.
The amount of data is known by the variable “n”.
To give a rough and ready example, if you have a loop that does something to each piece
of data and stops, then you have an order n algorithm or O(n). We like to throw a lot of
math around the idea but a lot of the time the O of an algorithm is governed by the
amount of looping through the data (n) that you do. If you have the data being
manipulated within a loop that has a loop within it (a nested loop) then you have an
O n 2  algorithm.
2 Linear Search

In computer science, linear search is a search algorithm, also known as sequential search
that is suitable for searching a set of data for a particular value when it is not known if the
data is in any particular order. It operates by checking every element of a list until a
match is found or it is confirmed that there is no match.
Linear search runs in O(N). If the data are distributed randomly, on average N/2
comparisons will be needed. The best case is that the value is equal to the first element
tested, in which case only 1 comparison is needed. The worst case is that the value is not
in the list, in which case N comparisons are needed.
Here is an example linear search. Find “gourmet food” on the menu below. If you started
from one end and went one at a time, examining each item, you have done linear search.
School of Computer Science
CPS109 Course Notes Set 10
Alexander Ferworn Updated Fall 11
QuickTime™ and a
decompressor
are needed to see this picture.
3 Binary Search
In computer science, binary search or binary chop is a search algorithm for finding a
particular value in a linear array, by "ruling out" half of the data at each step. In order to
employ binary search, the array must be in an order from lowest to highest or highest to
lowest.
A binary search finds the median, makes a boolean comparison, the results of which will
choose either the upper or lower half for further comparison, and so on until the value is
found or the value is determined not to be in the list. A binary search is an example of a
“divide and conquer” algorithm where the data that needs to be dealt is substantially
reduced through algorithmic means. This form of search is also known as a dichotomic
search.
School of Computer Science
CPS109 Course Notes Set 10
Alexander Ferworn Updated Fall 11
We can perform a search on the ordered set of integers -4,2,3,5,18. We can represent this
data graphically in something called a binary tree as shown below.
If we were searching for the value 3, we would start at the root of the tree (5) and ask the
question “is 3 less than 5?” and continue to ask appropriate questions as we traverse the
tree from node to node until we find what we are looking for or determine it is not in the
tree.
4 B ubble Sort
The bubble sort is the oldest and simplest sort in use. Unfortunately, it's also the slowest.
The bubble sort works by comparing each item in the list with the item next to it, and
swapping them if required. The algorithm repeats this process until it makes a pass all the
way through the list without swapping any items (in other words, all items are in the
correct order). This causes larger values to "bubble" to the end of the list while smaller
values "sink" towards the beginning of the list.
The bubble sort is generally considered to be the most inefficient sorting algorithm in
common usage. Under best-case conditions (the list is already sorted), the bubble sort can
approach a constant O(n) level of complexity. General-case is an abysmal O(n2).
Bubble sort in action can be seen in the figure below.
School of Computer Science
CPS109 Course Notes Set 10
Alexander Ferworn Updated Fall 11
5 Selection Sort
The selection sort works by selecting the smallest unsorted item remaining in the list, and
then swapping it with the item in the next position to be filled. The selection sort has a
complexity of O(n2).
Pros: Simple and easy to implement.
Cons: Inefficient for large lists, so similar to the more efficient insertion sort that the
insertion sort should be used in its place.
An example using selection sort is shown below.
School of Computer Science
CPS109 Course Notes Set 10
Alexander Ferworn Updated Fall 11
6 Insertion Sort
The insertion sort works just like its name suggests - it inserts each item into its proper
place in the final list. The simplest implementation of this requires two list structures - the
source list and the list into which sorted items are inserted. To save memory, most
implementations use an in-place sort that works by moving the current item past the
already sorted items and repeatedly swapping it with the preceding item until it is in
place.
Like the bubble sort, the insertion sort has a complexity of O(n2). Although it has the
same complexity, the insertion sort is a little over twice as efficient as the bubble sort.
Pros: Relatively simple and easy to implement.
Cons: Inefficient for large lists.
School of Computer Science
CPS109 Course Notes Set 10
Alexander Ferworn Updated Fall 11
An example of insertion sort in action is presented below.
7 Recursion
Code Self-invocation
Joke: What is the definition of recursion? SEE RECURSION.
To be reasonably useful Recursive routines must generally have all of the following
characteristics:
i)
ii)
The code calls itself if certain conditions are met,
The code can check for a stopping condition that allows it to end without
calling itself.
School of Computer Science
CPS109 Course Notes Set 10



Alexander Ferworn Updated Fall 11
Recursive code allows for certain problems to be conceptualized much more
easily. Mathematicians like it because it can be easily related to the concept of
“recurrence” relationships in math.
Recursion can always be replaced by iteration.
Recursion sometimes has unintended side effects at execution time.
8 Fibonacci numbers
By definition, the first two Fibonacci numbers are 0 and 1, and each remaining number is
the sum of the previous two. In mathematical terms, the sequence Fn of Fibonacci
numbers is defined by the recurrence relation:
Where F0=0 and F1=1
9 Merge sort
Merge sort takes advantage of the ease of merging already sorted lists into a new and
sorted list. It starts by comparing every two elements (i.e., 1 with 2, then 3 with 4...) and
swapping them if the first should come after the second. It then merges each of the
resulting lists of two into lists of four, then merges those lists of four, and so on; until at
last two lists are merged into the final sorted list. Java 6 uses a version of mergesort for
its sorting.
The algorithm is often expressed recursively.
Download