Chapter 3 - Simple Sorting

advertisement
COP 3540 Data Structures with OOP
Simple Sorting
Chapter 3
1/47
Major Topics
 Introductory Remarks
 Bubble Sort
 Selection Sort
 Insertion Sort
 Sorting Objects
 Comparing Sorts
2/47
Introductory Remarks
 So many lists are better dealt if ordered.
 Will look at simple sorting first.
 Note: there are books written on many
advanced sorting techniques.
• E.g. shell sort; quicksort; heapsort; etc.
 Will start with ‘simple’ sorts.
 Relatively slow,
 Easy to understand, and
 Excellent performance under circumstances.
 All these are O(n2) sorts.
3/47
Basic idea of these simple sorts:
 Compare two items
 Swap or Copy over
 Depending on the specific algorithm…
4/47
Bubble Sort
 Very slow but simple
 Basic idea:
 Start at one end of a list - say, an array
 Compare the first items in the first two positions
• If the first one is larger, swap with the second
and move to the right; otherwise, move
right.
 At end, the largest item is in the last position.
5/47
Bubble Sort process
 After first pass, made n-1 comparisons
and between 0 and n-1 swaps.
 Continue this process.
 Next time we do not check the last entry (n1), because we know it is in the right spot.
We stop comparing at (n-2).
 You see how it goes. Let’s look at some
code.
 Author’s applets are available, if desired.
6/47
class ArrayBub
{
private long[] a;
// ref to array a
private int nElems;
// number of data items
//-------------------------------------------------------------public ArrayBub(int max)
// constructor - sets size of array up via argument passed by client.
{
a = new long[max];
// create the array
nElems = 0;
// no items yet
}
public void insert(long value) // put element into array at the end
{
a[nElems] = value;
// insert it
nElems++;
// increment size
}
public void display()
// displays array contents
{
for(int j=0; j<nElems; j++)
// for each element,
System.out.print(a[j] + " "); // display it
System.out.println("");
}
public void bubbleSort()
Here’s the real sort itself!
{
int out, in;
for(out=nElems-1; out>1; out--) // outer loop (backward)
for(in=0; in<out; in++)
// inner loop (forward)
if( a[in] > a[in+1] )
// out of order?
swap(in, in+1);
// swap them
} // end bubbleSort()
private void swap(int one, int two)
Note: this method is private. So what does this tell you?
{
long temp = a[one];
a[one] = a[two];
a[two] = temp;
7/47
}
} // end class ArrayBub
// Normally should also contain find() and delete() methods too…
The Bubble Sort Application
class BubbleSortApp
{
public static void main(String[] args)
{
int maxSize = 100;
// array size
// note; merely setting up a size
ArrayBub arr;
// reference to array
// creating a pointer to the array
arr = new ArrayBub(maxSize); // create the array
// creating an instance of the array
// class which will contain an array of maxSize.
arr.insert(77);
arr.insert(99);
arr.insert(44);
arr.insert(55);
arr.insert(22);
arr.insert(88);
arr.insert(11);
arr.insert(00);
arr.insert(66);
arr.insert(33);
// insert 10 items
arr.display();
// display items
arr.bubbleSort();
don’t care how or where! It is up to the ArrayBub.
Asking the object to display the array elements
// bubble sort them Asking the array to sort itself
arr.display();
// display them again Asking the array to display itself again
} // end main()
// note all the ‘services’ asked of the array object.
} // end class BubbleSortApp
8/47
////////////////////////////////////////////////////////////////
Discussion of Bubble Sort - More
public void bubbleSort()
{
int out, in;
Here’s the real sort itself!
for(out=nElems-1; out>1; out--) // outer loop (backward)
for(in=0; in<out; in++)
// inner loop (forward)
if( a[in] > a[in+1] )
// out of order?
swap(in, in+1);
// swap them
} // end bubbleSort()
Note the elements probed….
private void swap(int one, int two) Note: this method is private. What does this tell you?
long temp = a[one];
a[one] = a[two];
a[two] = temp;
}
If the input numbers are: 77 99 44 55 22 88 11 0 66 33
The sorted numbers will be: 0 11 22 33 44 55 66 77 88 99
Clearly this sort is ‘ascending.’ It doesn’t have to be.
Note the loop counter in the outer loop starts at end of array and decrements each cycle. Why?
Note: the inner loop starts at the beginning of the array each time and increments until it hits
to the current ‘limit,’ which is out-1. (See above)
9/47
Efficiency of the Bubble Sort - Comparisons
 Can readily see that there are fewer
comparisons in each successive ‘pass.’
 Thus number of comparisons is computed
as:
(n-1)+(n-2)+… + 1 = n(n-1)/2;
For 10 elements, the number is 10*9/2 = 45.
So, the algorithm makes about n2/2
comparisons (ignoring the n which is
negligible especially if n is large.
(If n = 100, then n(n-1)/2 is (10000 – 100)/2
or 9900/2. So you can see the n2/2 is “close
enough”)
10/47
Efficiency of the Bubble Sort - Swaps
 Clearly there will be fewer ‘swaps’ than
comparisons, since every comparison does
not result in a swap.
 We can say that in general, a swap will occur
half the time.
 Given have n2/2 comparisons;
 Half the time will result in swaps;
 Conclusion: have (1/2)*(n2/2) = n2/4 swaps.
• Worst case, every compare results in a swap.
But we will take average.
11/47
Overall – Bubble Sort
 Note both swaps and compares are proportional to n2.
 Constants don’t count in Big O notation.
 Ignore the 2 and 4; say Bubble Sort runs in O(n2) time.
 This is rather slow.
 Heuristic: Whenever in algorithms we see nested
loops (say, nested-for loops), you can infer O(n2)
performance.
 (Outer loop executes n times and inner loop executes in n
times PER execution of the outer n times: hence n2 .)
12/47
Selection Sort
13/47
How Does the Selection Sort Work?
 Principle: Scan all elements to be sorted selecting
the smallest (or largest) item and put this in the first
position of an array.
 Specifically, compare zeroth to first, zeroth to
second, zeroth to third, zeroth to fourth, etc.
swapping if the first, second, third, or fourth, (etc.)
are smaller than the zeroth element in the array.
 At end of ‘pass 1’, smallest item is found first in list
(array, …) in position zero.
 Next pass, start at position 1 (instead of position 0)
 Loop until all are sorted.
 Can improve efficiency by including a switch that is set if
at least one swap occurs.
14/47
Selection Sort
 Can see fewer swaps; same number of
comparisons as bubble. But:
 Swaps reduced from O(n2) to O(n)!
 Number of comparisons remains the same, O(n2).
 Reduction in swap time tends to be more
important than the comparison time
 No data movement in comparisons; is in swapping.
 Significant improvement for movement of large
records physically moved around in memory.
 Think table sorting in COBOL.
 If moving large records and can reduce the
number moved, then performance is improved
 (Good for COBOL, etc. Not good for Java, since
we typically move references and not entire
records (objects)). 15/47
Selection Sort – in more detail
 So, in one pass, you have made n
comparisions but possibly ONLY ONE Swap!
 With each succeeding pass,
 one more item is sorted and in place;
 one fewer items needs to be considered.
 Java code for the Selection Sort:
16/47
// SelectSort.java
class ArraySel
{
private long[ ] a;
// ref to array a
private int nElems;
// number of data items
public ArraySel(int max)
// constructor
{
a = new long[max];
// create an array named ‘a’.
nElems = 0;
// no items yet
}
public void insert(long value) { // put element into array // Please note I have sacrificed some form for space
a[nElems] = value;
// insert it
nElems++;
// increment size
}
public void display() {
// displays array contents
for(int j=0; j<nElems; j++)
// for each element,
System.out.print(a[j] + " "); // display it
System.out.println("");
}
public void selectionSort() {
See separate slide ahead…
int out, in, min;
for(out=0; out<nElems-1; out++) // outer loop
{
min = out;
// minimum
for(in=out+1; in<nElems; in++) // inner loop
if(a[in] < a[min] )
// if min greater,
min = in;
// we have a new min
swap(out, min);
// swap them
} // end for(out)
} // end selectionSort()
private void swap(int one, int two) {
long temp = a[one];
a[one] = a[two];
a[two] = temp;
}
} // end class ArraySel
17/47
// note scope terminators!
The Selection Sort - Client
class SelectSortApp
{
public static void main(String[] args)
{
int maxSize = 100;
// array size
ArraySel arr;
// creates reference to an object of type ArraySel called arr,
arr = new ArraySel(maxSize); // creates the object w/argument for its Constructor
arr.insert(77);
arr.insert(99);
arr.insert(44);
arr.insert(55);
arr.insert(22);
arr.insert(88);
arr.insert(11);
arr.insert(00);
arr.insert(66);
arr.insert(33);
// insert 10 items
// this client doesn’t even know that there is an array in arr!
// the ‘insert’ could be inserting into some other kind of container…
// could be inserting into, perhaps, a linked list. Don’t know. Don’t care!
arr.display();
// display items
arr.selectionSort();
// selection-sort them
arr.display();
// display them again
} // end main()
} // end class SelectSortApp
18/47
////////////////////////////////////////////////////////////////
Selection Sort itself:
public void selectionSort()
{
int out, in, min;
for(out=0; out<nElems-1; out++) // outer loop
{
min = out;
// assert minimum
for(in=out+1; in<nElems; in++) // inner loop
if(a[in] < a[min] )
// if min greater,
min = in;
// we have a new min
swap(out, min); // swap them at end of pass
} // end outerfor // Walk through this!!
} // end selectionSort()
19/47
Selection Sort itself - more
 Algorithm implies it is an O(n2) sort (and it is).
 How did we see this?
 Container class is called ArraySel vice ArrayBub
 Loops start at beginning and goes to end;
Inner loop starts at outer loop’s lowest
index+1. Both proceed ‘left to right.’
 Outer loop merely
 Controls the inner loop and
 keeps track of the smallest (or largest) index of
the item per pass.
 At end of inner pass, swap is called to ‘Swap’
the two elements based on their indices.
20/47
Selection Sort itself - a bit more; Efficiency
  Same number of comparisons as Bubble Sort –
still n2/2 comparisons
  Fewer swaps
 Consider: for large N:
 Comparison times will dominate because there are so
many even though there wil be fewer swaps.
 Yet, with so many fewer swaps, it will still be much faster
than the Bubble Sort anyway.
 Selection Sort still runs in O(n2) time.
 Consider: for small N,
 Swap times would dominate; the selection sort may well
be much faster, especially if there are a lot of swaps!
21/47
Insertion Sort
22/47
Insertion Sort
 In many cases, this sort is considered the best of
these elementary sorts.
 Still an O(n2) but:
 about twice as fast as bubble sort and
 somewhat faster than selection sort in most situations.
 Easy, but a bit more complex than the others
 Insertion sort is sometimes used as final stage of
some more sophisticated sorts, such as a QuickSort
(coming).
23/47
Insertion Sort – Here’s how it works: 1 of 3
 Approach this sort by thinking ‘half’ of the
list of items to be sorted is, say, sorted.
 Pick a position in the middle of an array, say,
that you visualize going left to right.
 Assume items ‘to the left’ of your picked
position are sorted (among themselves)
recognizing that those elements to the right are
not sorted and that those to the left, while
sorted, or likely not in their final positions
because of the other remaining ‘unsorted’
items have not been inserted into this evolving
subset of the original ‘numbers.’.
 Consider the position you picked as the
‘marked’ position, and that the number is
not in its final position.
24/47
Insertion Sort – how it works: 2 of 3
 The ‘marked item’ and all items to the right are unsorted.
 We now want to insert the marked item in the appropriate
place in the partially sorted group (the group to the left).
 But to do this,
 Need to shift ‘some’ of the items on the left - to the right to make
room for the marked item.
 Remove marked item from list. (Leaves a hole).
 Look for position in sorted list to place marked item into:
 Move largest ‘sorted’ item into marked item’s slot; (recall it is empty)
 Move the ‘next’ largest item into the (former) largest item’s slot, etc.
 But all the while, you are comparing the marked item with the one to
be moved to the right.
 Shifting stops when you’ve shifted the last item larger than
marked item.
 This opens up a space where the marked item, when
inserted, will be in sorted order.
25/47
Insertion Sort – description 3 of 3.
 Result:
 partially-ordered list is one item larger and
 the unsorted list is one item smaller.
 Marked item moves one slot to the right, so
once more it is again in front of the leftmost
unsorted item.
 Continue process until all unsorted items
have been inserted.
 Hence the name ‘insertion sort.’
 Code 
26/47
// insertSort.java
class ArrayIns
{
private long[] a;
// ref to array a
private int nElems;
// number of data items
public ArrayIns(int max) {
// constructor
a = new long[max];
// creates an array for an object (here, obj name is arr – from client)
nElems = 0;
// no items yet
}
public void insert(long value) { // put element into array. Inside this object, we see list is an array.
a[nElems] = value;
// insert it
nElems++;
// increment size
}
public void display() {
// displays array contents
for(int j=0; j<nElems; j++)
// for each element,
System.out.print(a[j] + " "); // display it
System.out.println("");
}
public void insertionSort() {
// will examine ahead separately…
int in, out;
for(out=1; out<nElems; out++) // out is dividing line
{
long temp = a[out];
// remove marked item
in = out;
// start shifts at out
while(in>0 && a[in-1] >= temp) // until one is smaller,
{
a[in] = a[in-1];
// shift item to right
--in;
// go left one position
}
a[in] = temp;
// insert marked item
} // end for
} // end insertionSort()
27/47
} // end class ArrayIns
Insertion Sort Client
class InsertSortApp
{
public static void main(String[] args)
{
int maxSize = 100;
// array size that will be passed to Constructor in arr object.
ArrayIns arr;
// reference to object of type ArrayIns.
arr = new ArrayIns(maxSize); // creates the object and passes argument to its Constructor.
arr.insert(77);
arr.insert(99);
arr.insert(44);
arr.insert(55);
arr.insert(22);
arr.insert(88);
arr.insert(11);
arr.insert(00);
arr.insert(66);
arr.insert(33);
// insert 10 items
arr.display();
// display items
arr.insertionSort();
// Please note: no notion of the structure of this list!
// Could be implemented as an array, linked list, tree, …
// get the object to do the work on its data!
// insertion-sort them
arr.display();
// display them again
} // end main()
} // end class InsertSortApp
28/47
Code for Insertion Sort method
public void insertionSort() {
int in, out;
for(out=1; out<nElems; out++) // out is dividing line.
{
// a[out] is marked item…
long temp = a[out]; // remove/move marked item to temp..
in = out;
// start shifts at “out”
while(in>0 && a[in-1] >= temp) // until one is smaller,
{
a[in] = a[in-1];
// shift item to right filling the hold
--in;
// then go left one position
}
a[in] = temp;
// insert marked item; done with this
} // end for
// marked item. Get next item on the right
} // end insertionSort()
29/47
Another Source: On your OWN to read.
 In abstract terms, every iteration of an insertion sort
removes an element from the input data, inserting it at the
correct position in an already sorted list, and this continues
until no elements are left in the input.
 The choice of which element to remove from the input is
arbitrary and can be made using almost any choice
algorithm.
 Sorting is typically done in-place.
 The resulting array after k iterations contains the first k
entries of the input array and is sorted.
 In each step, the first remaining entry of the input is
removed, inserted into the result at the right position, thus
extending the result:
30/47
What we really have is shown below:
becomes:
.
with each element > x copied to the right as it is compared against x
31/47
Discussion of Insertion Sort – How really implemented!!
 You’ve seen the theory. Theory is general. Implementation
is specific. So, here’s what we do:
 Start with out = 1, which means there is only a single
element to its ‘left.’
 We infer that this item to its left is sorted unto itself.
 Hard to argue this is not true. (This is out = 0)
 a[out] is the marked item, and it is moved into temp.
 a[out] is the leftmost unsorted item.
 The inner loop compares the marked item to the one to the
left of it. If the marked item is smaller, a[out] and a[out-1] are
swapped and ‘in’ is decremented.
 (Incidentally, note that the algorithm structure itself tells us
that this is an O(n2) sort.)
 Let’s consider some data…32/47
First (of three): Brute Force Approach 1 of 3
 You are responsible for going through the algorithm. This
is the only way to really convince yourself that you
understand it.
 See algorithm for first two elements only: 40 and 20.
 The for:

out = 1;

long temp = a[out] => a[1] = 20 (marked item; create hole)

in = out; (in = out = 1)

‘while’ is true, ( in > 0 it is 1; a[in-1] or a[0] >= temp. Yes.
that is 40 > 20. so we want:

a[in] = a[in-1] that is, a[1] = 40. // moved large to hole

‘in’ is decremented (to 0) and the

‘while’ will be false, so fall through to next stmt in the for:

a[in] to temp; that is, a[0] to 20.
 Go back to outer loop for more serious ‘n’
 But for two elements, this is it.
33/47
Second (of three): Let’s make it larger…Brute Force
 Here, Out = 2
 Assume we now have 20 40 30
long temp = a[out] = a[2] = 30. Move 30 out. Create hole.
in = out; in = out = 2.
while (in > 0 && a[in-1] >= temp, a[1] > 30. True.
a[in] = a[in-1]  a[2] = a[1]  a[2] = 40. (move to hole)
decrement ‘in’ to 1; loop within while this time...
while (in > 0 && a[0] >= temp)  20 >= 30? False.
end while
a[in] = temp …. a[1] = 30
a[0] remains = to 20.
 We now have 20 30 40 in order)
Back to outer loop…. One more….
34/47
Third (of three): Brute Force Approach
 So, we have 20 30 40 and, say, 25 (marked) (25 is now ‘rightmost’
 From for loop

out = 3; long temp = 25. in = out = 3

while (in>0 (T) && a[in-1] > temp

a[2] > temp = T; loop

a[in] = a[in-1]  a[3] = a[2]  a[3] = 40

decrement ‘in’ to 2 and loop within while

while (in>0 (T) and a[in-1] > temp

a[1] > temp = True; loop

a[2] = a[1] or a[2] = 30.

decrement ‘in’ to 1 and loop within while.

while (in > 0 (True) and a[in-1] > temp

a[0] > temp = False!

go to outer loop:

a[in] = temp  a[1] = 25

a[0] remains = 20….
 We now have: 20 25 30 40
See the trend??
35/47
Efficiency of the Insertion Sort
 Comparisons: How it really works:
Remember, we start on the left…
 On pass one, it compares a max of one; pass
two, it compares a max of two, etc. Up to a
max of n-1 comparisons.
 This equals 1+2+3+…+n-1 = n(n-1)/2
comparisons.
 But, on average, only half of the maximum
number of items are actually compared before
the insertion point is found.
 This gives us: n(n-1)/4.
36/47
Efficiency of the Insertion Sort (2 of 3)
 Copies:
 Have lots of ‘copies’ (same as number of
comparisons – about)
  But a copy is not nearly as time-consuming
as a swap. Think about this!!
 Thus, for random data, this algorithm runs
about twice as fast as the bubble sort and
faster than the selection sort.
 Yet, it still runs on O(n2) time for random data.
37/47
Efficiency of the Insertion Sort (3 of 3)
 Lastly:
 If data is nearly sorted insertion sort does quite well.
 The condition in while loop is rarely true, so it almost
runs in O(n) time; much better than O(n2) time!
  Thus to order a file that is nearly in order, an insertion
sort will be a very efficient sort.
  But if data is in very unsorted order (nearly
backward), performance runs no faster than the bubble
sort, as every possible comparison and shift is carried
out.
38/47
Sorting Objects
39/47
Sorting Objects
 Very important to be able to sort objects.
 Let’s sort an array of Person objects.
 Must be careful here, especially in noting that
 ‘that’ which is in an array are objects, and
 The sort is based on values of String objects
inside of the object.
• Will use the inherited String method, compareTo.
40/47
Person Class
class Person
{
private String lastName;
private String firstName;
private int age;
//----------------------------------------------------------public Person(String last, String first, int a)
{
// constructor
lastName = last;
firstName = first;
age = a;
}
//----------------------------------------------------------public void displayPerson()
{
System.out.print(" Last name: " + lastName);
System.out.print(", First name: " + firstName);
System.out.println(", Age: " + age);
}
//----------------------------------------------------------public String getLast()
// get last name
{ return lastName; }
} // end class Person
41/47
////////////////////////////////////////////////////////////////
Array In Object – Here’s the important slide!!!!!!
class ArrayInOb
{
private Person[] a;
// ref to array a of Person objects (to be defined)
private int nElems;
// number of data items
public ArrayInOb(int max) {
// constructor
a = new Person[max];
// create a ref to an array of ‘max’ Person objects. But since Person is
nElems = 0;
// not a primitive data type, we define it as a class and we create
}
// an array of Person objects, ‘a.’
public void insert(String last, String first, int age) { // creates individual Person objects
a[nElems] = new Person(last, first, age);
nElems++;
// increment size
}
public void display() {
// displays array contents
for (int j=0; j<nElems; j++)
// for each element,
a[j].displayPerson();
// display it
}
public void insertionSort() {
int in, out;
for(out=1; out<nElems; out++) {
Person temp = a[out];
// out is dividing line Here’s our marked item (object)
in = out;
// start shifting at out
while(in>0 && a[in-1].getLast().compareTo(temp.getLast())>0) {
complicated. >0  +
a[in] = a[in-1];
// shift item to the right
--in;
// go left one position
What does this really do?
}
Why do we need getLast()?
a[in] = temp;
// insert marked item
} // end for
42/47
} // end insertionSort()
} // end class ArrayInOb
Object Sort Application – Client application
class ObjectSortApp
{
public static void main(String[] args)
{
int maxSize = 100;
// array size
ArrayInOb arr;
// reference to object that will contain our array
arr = new ArrayInOb(maxSize); // creates the object, which creates an array of 100 Person objects.
arr.insert("Evans", "Patty", 24);
arr.insert("Smith", "Doc", 59);
arr.insert("Smith", "Lorraine", 37);
arr.insert("Smith", "Paul", 37);
arr.insert("Yee", "Tom", 43);
arr.insert("Hashimoto", "Sato", 21);
arr.insert("Stimson", "Henry", 29);
Nothing really different!
arr.insert("Velasquez", "Jose", 72);
The public interface of ArrayInOb
indicates that the insert requires
arr.insert("Vang", "Minh", 22);
several attributes. But still, the
arr.insert("Creswell", "Lucinda", 18);
insert could be almost any
implementing structure.
System.out.println("Before sorting:");
arr.display();
// display items
arr.insertionSort();
// insertion-sort them
System.out.println("After sorting:");
arr.display();
// display them again
} // end main()
43/47
} // end class ObjectSortApp
Discussion
 Note: we are sorting based on last name.
 Last name is a String
 Need to use the compareTo method available for
String objects.
 We are not comparing primitives!
 We are comparing attributes in a specific array
element, which happens to be a Person object.
 Recall: the ‘compareTo’ method returns an
integer: a negative integer, zero, or a positive
integer.
 The ‘compareTo’ method is inherited from the
String class.
44/47
Secondary Sort Fields?
 ‘Stable’ sort? What does this mean?
 Original orders are preserved…
 All of these algorithms so far are stable
 Sort ‘keys’
 Think Windows:
 Arrange by Type; then by data modified…
45/47
Comparing the Simple Sorts
 Bubble Sort – simplest.
 Use only if you don’t have other algorithms available
and ‘n’ is small.
 Selection Sort
 Minimizes number of swaps, but number of
comparisons still high.
 Useful when amount of data is small and swapping is
very time consuming – like when sorting records in
tables – internal sorts.
 Insertion Sort
 The most versatile, and is usually best bet in most
situations, if amount of data is small or data is almost
sorted. For large n, other sorts are better. We will
cover advanced sorts later…
46/47
Comparing the Simple Sorts
 All require very little space and they sort in
place.
 Can’t see the real efficiencies / differences
unless you apply the different sources to
large amounts of data.

47/47
Download