Sorting

CS 163
Data Structures
Chapter 5
Sorting
Herbert G. Mayer, PSU
Status 5/23/2015
Syllabus
• Sorting Constraints
• Complexity
• Bubble Sort
• Insertion Sort
Sorting Constraints
• Sorting rearranges the elements of a data structure into a
defined order, but does not change the total content
• One possible order is descending
• Another order is ascending
• Without loss of generality we focus on ascending
order only; descending order is handled symmetrically
• Also, data structures can store the same value
repeatedly, or just once per unique element
• If repeated, the duplicates may be stored sequentially,
or else a count at one instance indicates the total
number of occurrences
• Without loss of generality we focus on unique
occurrences only
Complexity

The cost of a search for 1 element in a data structure of n
unordered elements is n, or O(n) in Big-O notation, with n
being the number of distinct elements

The cost of a sort is generally higher than n; if each element's
position is determined by comparing it against all n elements,
the cost can be O(n²)

The purpose of a sort is often to allow for more efficient
searching algorithms, more efficient than O(n) for 1 element

This pays off when the number of lookups is large, i.e. large
enough to outweigh the cost of the initial sort

In Big-O notation, only the size of the data structure
is considered, i.e. the number n of elements included
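
To illustrate the trade-off, the sketch below contrasts a linear scan over unordered data with a binary search over sorted data. It relies on std::find and std::binary_search from the C++ standard library rather than on the hand-written sorts of this chapter; the names contains_linear and contains_sorted are hypothetical helpers made up for this example.

```cpp
#include <algorithm>
#include <vector>

// Linear scan: inspects up to n elements, so O(n) per lookup.
// Works on unordered data.
bool contains_linear( const std::vector<int> & data, int key )
{ // contains_linear
    return std::find( data.begin(), data.end(), key ) != data.end();
} //end contains_linear

// Binary search: O(log n) per lookup.
// Assumption: the data is already sorted in ascending order.
bool contains_sorted( const std::vector<int> & sorted, int key )
{ // contains_sorted
    return std::binary_search( sorted.begin(), sorted.end(), key );
} //end contains_sorted
```

A caller would sort once, e.g. with std::sort( v.begin(), v.end() ), and then call contains_sorted() for each lookup; the one-time sort cost is amortized over the lookups.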
Bubble Sort
• The Bubble Sort is one of the most intuitive sorting methods,
but also one of the most costly
• For a data structure of size n the cost to sort is O(n²)
• To sort in ascending order, each element in turn is compared
against the other n-1 elements to determine the correct
position; uniqueness assumed
• That means each of the n elements is compared against
O(n) other elements
• In reality, only n-i comparisons are needed for the i-th
element, with i = 1..n-1, but in Big-O notation such a constant
reduction factor of ½ does not have any impact on the
Big-O cost function
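
The reduction factor can be checked by counting. The sketch below runs two nested loops in which the inner loop starts one past the outer index, and only counts the comparisons a sort with that loop structure would make. The total is n(n-1)/2, about half of n², which is still O(n²). The function count_comparisons is a hypothetical helper for this illustration, with n playing the role of MAX.

```cpp
// Counts the comparisons made by two nested loops where the
// inner loop starts at outer+1; no data is actually sorted.
long count_comparisons( int n )
{ // count_comparisons
    long count = 0;
    for ( int outer = 0; outer < n-1; outer++ ) {
        for ( int inner = outer+1; inner < n; inner++ ) {
            count++;   // one comparison per inner iteration
        } //end for
    } //end for
    return count;      // equals n*(n-1)/2 for n >= 1
} //end count_comparisons
```

For n = 10 this yields 45 comparisons instead of 100, and for n = 1000 it yields 499500 instead of 1000000.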
Bubble Sort Implementation
// bubble sort sorts in ascending order
// assume elements to be unique in data structure a[]
// there are MAX integers included in a[]
// .. Core of some bubble sort algorithm
for ( int outer = 0; outer < MAX-1; outer++ ) {
    for ( int inner = outer+1; inner < MAX; inner++ ) {
        if ( a[ outer ] > a[ inner ] ) {
            // element at a[ outer ] is larger! Exchange!
            swap( a[ inner ], a[ outer ] ); // C++
        } //end if
    } //end for
} //end for
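
The core loops above can be wrapped into a complete function. The following is a minimal sketch, assuming the data lives in a std::vector<int> instead of a fixed array a[] of MAX elements, and using std::swap from the standard library in place of the swap() defined in these slides. The function name bubble_sort is chosen for this example only.

```cpp
#include <utility>   // std::swap
#include <vector>

// Sorts a in ascending order with the nested-loop scheme from the slide.
// Assumption: a.size() fits into an int.
void bubble_sort( std::vector<int> & a )
{ // bubble_sort
    const int n = static_cast<int>( a.size() );
    for ( int outer = 0; outer < n-1; outer++ ) {
        for ( int inner = outer+1; inner < n; inner++ ) {
            if ( a[ outer ] > a[ inner ] ) {
                // element at lower index outer is larger: exchange
                std::swap( a[ inner ], a[ outer ] );
            } //end if
        } //end for
    } //end for
} //end bubble_sort
```

After each pass of the outer loop, the slot at index outer holds the smallest of the elements not yet placed.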
Bubble Sort, swap() with & Parameter
// must be C++
// swap works with & ref parameters
// else resort to a detour via pointers
// swap() makes no assumption about where
// val1 and val2 are stored
// Also, since val1 and val2 are reference parameters,
// actuals do not need to be passed with "address of" (&)
void swap( int & val1, int & val2 )
{ // swap
    int temp = val1;
    val1 = val2;
    val2 = temp;
} //end swap
Bubble Sort, swap() in Situ
// sort in ascending order
// assume elements to be unique in data structure a[]
// there are MAX integers inside a[]
// .. Core of some bubble sort algorithm
for ( int outer = 0; outer < MAX-1; outer++ ) {
    for ( int inner = outer+1; inner < MAX; inner++ ) {
        if ( a[ outer ] > a[ inner ] ) {
            // element at lower index outer is larger!
            int temp = a[ inner ];
            a[ inner ] = a[ outer ];
            a[ outer ] = temp;
            // swapping done in situ!
        } //end if
    } //end for
} //end for
Bubble Sort, ptr_swap()
// sort in ascending order
// assume elements to be unique in data structure a[]
// there are MAX integers inside a[]
// .. Core of some bubble sort algorithm
for ( int outer = 0; outer < MAX-1; outer++ ) {
    for ( int inner = outer+1; inner < MAX; inner++ ) {
        if ( a[ outer ] > a[ inner ] ) {
            // element at lower index outer is larger!
            ptr_swap( & a[ inner ], & a[ outer ] );
        } //end if
    } //end for
} //end for
Bubble Sort, ptr_swap()
// can be C or C++
// val1 and val2 are int * parameters
// This is how a C programmer can get around ref parameters
void ptr_swap( int * val1, int * val2 )
{ // ptr_swap
    int temp = *val1;
    *val1 = *val2;
    *val2 = temp;
} //end ptr_swap
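
The difference between the two swap variants shows up at the call site. The sketch below repeats both functions and adds order_pair(), a hypothetical helper invented for this example, which puts two array slots into ascending order using either variant.

```cpp
// Reference version: C++ only.
void swap( int & val1, int & val2 )
{ // swap
    int temp = val1;
    val1 = val2;
    val2 = temp;
} //end swap

// Pointer version: valid in C and C++.
void ptr_swap( int * val1, int * val2 )
{ // ptr_swap
    int temp = *val1;
    *val1 = *val2;
    *val2 = temp;
} //end ptr_swap

// Hypothetical helper: ensures a[ lo ] <= a[ hi ] afterwards.
void order_pair( int a[], int lo, int hi, bool use_pointers )
{ // order_pair
    if ( a[ lo ] > a[ hi ] ) {
        if ( use_pointers ) {
            ptr_swap( & a[ lo ], & a[ hi ] );   // caller passes addresses
        } else {
            swap( a[ lo ], a[ hi ] );           // caller passes the elements
        } //end if
    } //end if
} //end order_pair
```

Both branches have the same effect on the array; only the way the actual parameters are written differs.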
Insertion Sort
• Insertion sort is a sorting algorithm that is relatively
efficient for mostly sorted lists
• Elements are removed from the list one at a time and
inserted at their correct position in the sorted list
being built
• The elements in between are moved up (or down) by one
position, which is possible due to the place freed by the
moved element
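[Figure: one insertion step. Before: a partially sorted part (elements ≤ Ai, then elements > Ai), followed by Ai and the remaining unsorted data. After the insert: elements ≤ Ai, then Ai, then elements > Ai, followed by the remaining unsorted data.]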
Insertion Sort
• If the original list is largely unsorted, the
cost for insertion sort becomes similar, even
equal to that of the bubble sort, i.e. O(n²)
• For lists that are almost totally sorted, the
cost for insertion sort can be low, down to O(n)
in Big-O notation, since each element must still
be inspected once
Insertion Sort, Method
• Goal is a list in ascending order:
• Start at index i=1 and fetch value = list[i];
repeat for each i all the way up to the last element, i=MAX-1
• Set j = i-1 and compare value against
list[j]
• As long as element list[j] is larger than
value, it is out of place and must be shifted to the
next higher index, j+1; then j is decremented
• In the end, value is placed into the slot freed
Insertion Sort
. . .
// very clever
for ( i = 1; i < MAX; i++ ) {         // up to the last element, index MAX-1
    value = list[ i ];
    j = i - 1;
    while( ( j >= 0 ) && ( list[ j ] > value ) ) {
        list[ j+1 ] = list[ j ];      // push up
        --j;                          // check next
    } //end while
    list[ j + 1 ] = value;            // the right place
} //end for
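
The fragment above leaves out the declarations. A minimal self-contained sketch follows, assuming the data is held in a std::vector<int> rather than an array of MAX elements; the function name insertion_sort is chosen for this example. The outer loop runs up to and including the last index, so the final element is inserted as well.

```cpp
#include <vector>

// Sorts list in ascending order by insertion.
// Assumption: list.size() fits into an int.
void insertion_sort( std::vector<int> & list )
{ // insertion_sort
    const int n = static_cast<int>( list.size() );
    for ( int i = 1; i < n; i++ ) {
        int value = list[ i ];             // element to insert
        int j = i - 1;
        while ( ( j >= 0 ) && ( list[ j ] > value ) ) {
            list[ j+1 ] = list[ j ];       // push larger element up
            --j;                           // check next lower index
        } //end while
        list[ j + 1 ] = value;             // drop value into the freed slot
    } //end for
} //end insertion_sort
```

An input such as { 3, 4, 5, 1 } is a useful check, because the smallest element sits at the last index and must travel all the way to the front.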
Insertion Sort
• The simplicity of the algorithm is striking
• The cost is not worse than that of the bubble
sort
• For lucky cases, the cost function can be
way lower than the O(n²) of the bubble sort
• For an already sorted list it is O(n), something not
possible with the bubble sort shown earlier, which
always performs all of its comparisons
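
The gap between the lucky and the unlucky case can be made visible by counting how often an element is shifted. The sketch below is an instrumented copy of the insertion sort loop; insertion_sort_shifts is a hypothetical helper for this illustration and takes its argument by value, so the caller's data stays untouched.

```cpp
#include <vector>

// Runs insertion sort on a copy of the input and returns the
// number of element shifts performed by the inner while loop.
long insertion_sort_shifts( std::vector<int> list )
{ // insertion_sort_shifts
    long shifts = 0;
    const int n = static_cast<int>( list.size() );
    for ( int i = 1; i < n; i++ ) {
        int value = list[ i ];
        int j = i - 1;
        while ( ( j >= 0 ) && ( list[ j ] > value ) ) {
            list[ j+1 ] = list[ j ];
            --j;
            shifts++;                  // one shift per inner iteration
        } //end while
        list[ j + 1 ] = value;
    } //end for
    return shifts;
} //end insertion_sort_shifts
```

For the sorted input { 1, 2, 3, 4, 5 } the function returns 0, while for the reversed input { 5, 4, 3, 2, 1 } it returns 10, which is n(n-1)/2 for n = 5. Even with zero shifts the outer loop still visits n-1 elements, which is why the best case is O(n).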