Topic: Sorting Algorithms: Heap Sort, Shell Sort and Insertion Sort

Supervised by: S. Kanengoni, Assistant of the Computer Software Department

Table of Contents

Chapter 1
1.1 Introduction
1.2 Problem Statement
1.3 Objectives
Chapter 2: Sorting Algorithms: Heap Sort
2.1 Heap Sort
2.2 Heap Sort Algorithm
2.3 Heap Sort Pseudocode
2.4 Heap Sort Code Implementation
2.5 Heap Sort Flowchart
2.6 Heap Sort Space and Time Complexity
2.7 Heap Sort Advantages and Disadvantages
Chapter 3: Sorting Algorithms: Insertion Sort
3.1 Insertion Sort
3.2 Insertion Sort Algorithm
3.3 Insertion Sort Pseudocode
3.4 Insertion Sort Code Implementation
3.5 Insertion Sort Flowchart
3.6 Insertion Sort Space and Time Complexity
3.7 Insertion Sort Advantages and Disadvantages
Chapter 4: Sorting Algorithms: Shell Sort
4.1 Shell Sort
4.2 Shell Sort Algorithm
4.3 Shell Sort Pseudocode
4.4 Shell Sort Code Implementation
4.5 Shell Sort Flowchart
4.6 Shell Sort Space and Time Complexity
4.7 Shell Sort Advantages and Disadvantages
Chapter 5: Data Analysis
Chapter 6: Conclusion
Chapter 7: References

Chapter 1

1.1 Introduction

Sorting refers to arranging data in a particular format.
A sorting algorithm specifies the way of arranging data in a particular order, for example numeric or lexicographical order, and is an essential operation in computer software and development. What is the purpose of sorting algorithms? Sorting algorithms take lists of items as input data, perform specific operations on those lists and deliver ordered arrays as output. The many applications of sorting algorithms include organizing items by price on a retail website and determining the order of sites on a search engine results page. Sorting makes it easier to search through large amounts of data quickly and efficiently. The simplest everyday example of sorted data is a dictionary.

1.2 Problem Statement

There is a wide variety of sorting algorithms available; this report dwells on Heap sort, Shell sort and Insertion sort, explaining how each of these algorithms functions and stating its space and time complexity, pseudocode and flowchart.

1.3 Objectives (SMART: Specific, Measurable, Achievable, Realistic & Time-bound)

To design the code, pseudocode and algorithm that implement the above-mentioned sorts (Heap, Shell and Insertion).
To calculate the complexities of the algorithms.
To compare the complexities.

Chapter 2: Sorting Algorithms: Heap Sort

2.1 Heap Sort

Definition: Heap sort is an efficient sorting algorithm based on the use of max/min heaps. A heap is a tree-based data structure that satisfies the heap property: for a max heap, the key of any node is less than or equal to the key of its parent (if it has a parent). Heap sort is a comparison-based sorting algorithm that uses a binary heap data structure to sort elements. The algorithm works by building a heap from the array and repeatedly extracting the maximum element from the heap and placing it at the end of the array. The heap is then reconstructed, and the process is repeated until the array is sorted.

2.2 Heap Sort Algorithm

Here is the algorithm for heap sort:
Step 1: Build Heap.
Build a heap from the input data: build a max heap to sort in increasing order, and a min heap to sort in decreasing order.
Step 2: Swap Root. Swap the root element with the last item of the heap.
Step 3: Reduce Heap Size. Reduce the heap size by 1.
Step 4: Re-Heapify. Restore the heap property over the remaining elements by calling heapify on the root node.
Step 5: Repeat. Repeat steps 2, 3 and 4 as long as the heap size is greater than 1.

2.3 Heap Sort Pseudocode

Array A, size N
heapSort()
    For all non-leaf elements (i = N/2 - 1; i >= 0; i--)
        Build heap (heapify)
    Initialize indexEnd = N - 1
    While indexEnd > 0
        Swap(A[0], A[indexEnd])
        indexEnd = indexEnd - 1
        Build heap (apply heapify on the root node), considering the array from A[0] to A[indexEnd]
    Output the sorted array A[]
End heapSort()

2.4 Heap Sort Code Implementation

public int[] SortArray(int[] array, int size)
{
    if (size <= 1)
        return array;

    // Build a max heap from the unsorted array.
    for (int i = size / 2 - 1; i >= 0; i--)
    {
        Heapify(array, size, i);
    }

    // Repeatedly move the current maximum to the end and re-heapify.
    for (int i = size - 1; i > 0; i--)
    {
        var tempVar = array[0];
        array[0] = array[i];
        array[i] = tempVar;

        Heapify(array, i, 0);
    }
    return array;
}

// Sift the element at index i down until the subtree rooted at i
// satisfies the max-heap property; heapSize limits the active heap.
private void Heapify(int[] array, int heapSize, int i)
{
    int largest = i;
    int left = 2 * i + 1;
    int right = 2 * i + 2;

    if (left < heapSize && array[left] > array[largest])
        largest = left;
    if (right < heapSize && array[right] > array[largest])
        largest = right;

    if (largest != i)
    {
        var tempVar = array[i];
        array[i] = array[largest];
        array[largest] = tempVar;
        Heapify(array, heapSize, largest);
    }
}

2.5 Heap Sort Flowchart

2.6 Heap Sort: Space and Time Complexity

Worst case performance:   O(n log n)
Best case performance:    O(n log n)
Average performance:      O(n log n)
Space complexity:         O(1)

Best Case Complexity – occurs when no sorting is required, i.e., the array is already sorted. The best-case time complexity of heap sort is O(n log n).
Average Case Complexity – occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average-case time complexity of heap sort is O(n log n).
Worst Case Complexity – occurs when the array elements must be sorted in reverse order; for example, you have to sort the array in ascending order, but its elements are in descending order. The worst-case time complexity of heap sort is O(n log n).
The time complexity of heap sort is O(n log n) in all three cases (best case, average case, and worst case), because the height of a complete binary tree with n elements is log n.

2.7 Advantages of Heap Sort

Efficiency – the time required to perform heap sort grows as O(n log n) as the number of items increases, whereas simpler algorithms such as insertion sort slow down quadratically. This makes heap sort very efficient.
Memory Usage – memory usage is minimal: apart from what is necessary to hold the initial list of items to be sorted, it needs no additional memory space to work.
Simplicity – it is simpler to understand than some equally efficient sorting algorithms, since heapify can also be written iteratively, avoiding recursion entirely.

Disadvantages of Heap Sort

Costly in practice: despite its O(n log n) bound, heap sort's constant factors often make it slower than quicksort on real hardware.
Unstable: heap sort is unstable; the operations on the heap can change the relative order of items with equal keys.
Not efficient for highly complex data: if the input array is huge and does not fit into memory, and partitioning the array is faster than maintaining the heap, heap sort is not an option. In such cases something like merge sort or bucket sort, where parts of the array can be processed separately and in parallel, works best.

Chapter 3: Sorting Algorithms: Insertion Sort

3.1 Insertion Sort

Definition: Insertion sort is a simple sorting algorithm that builds the final sorted array one item at a time by comparisons. It is much less efficient on large lists than more advanced algorithms such as quicksort, heap sort or merge sort. In other words, it scans the sorted part of the list, starting at the beginning, for the correct insertion point of each item taken from the unsorted part.

3.2 Insertion Sort Algorithm

Step 1 – If it is the first element, it is already sorted.
Step 2 – Pick the next element.
Step 3 – Compare it with all elements in the sorted sub-list.
Step 4 – Shift all elements in the sorted sub-list that are greater than the value to be sorted.
Step 5 – Insert the value.
Step 6 – Repeat until the list is sorted.

3.3 Insertion Sort Pseudocode

procedure insertionSort( A : array of items )
    int holePosition
    int valueToInsert
    for i = 1 to length(A)-1 inclusive do:
        /* select value to be inserted */
        valueToInsert = A[i]
        holePosition = i
        /* locate hole position for the element to be inserted */
        while holePosition > 0 and A[holePosition-1] > valueToInsert do:
            A[holePosition] = A[holePosition-1]
            holePosition = holePosition - 1
        end while
        /* insert the number at hole position */
        A[holePosition] = valueToInsert
    end for
end procedure

3.4 Insertion Sort Code Implementation

// Function to print an array
void printArray(int array[], int size) {
    for (int i = 0; i < size; i++) {
        cout << array[i] << " ";
    }
    cout << endl;
}

void insertionSort(int array[], int size) {
    for (int step = 1; step < size; step++) {
        int key = array[step];
        int j = step - 1;

        // Compare key with each element to its left until a smaller one is found.
        // The bounds check j >= 0 must come first to avoid reading array[-1].
        // For descending order, change key < array[j] to key > array[j].
        while (j >= 0 && key < array[j]) {
            array[j + 1] = array[j];
            --j;
        }
        array[j + 1] = key;
    }
}

// Driver code
int main() {
    int data[] = {9, 5, 1, 4, 3};
    int size = sizeof(data) / sizeof(data[0]);
    insertionSort(data, size);
    printArray(data, size);
    return 0;
}

3.5 Insertion Sort Flowchart

3.6 Insertion Sort: Space and Time Complexity

Worst case complexity:    O(n^2)
Best case complexity:     O(n)
Average case complexity:  O(n^2)
Space complexity:         O(1)

Worst Case Complexity: O(n^2)
Suppose an array is in ascending order and you want to sort it in descending order; this is the worst case. Each element has to be compared with each of the other elements, so for every nth element, (n-1) comparisons are made.
Thus, the total number of comparisons = n*(n-1) ≈ n^2.

Best Case Complexity: O(n)
When the array is already sorted, the outer loop runs n times while the inner loop does not run at all, so only n comparisons are made. Thus, the complexity is linear.

Average Case Complexity: O(n^2)
It occurs when the elements of the array are in jumbled order (neither ascending nor descending).

Space Complexity: O(1)
Space complexity is O(1) because only the single extra variable key is used.

3.7 Advantages of Insertion Sort

For nearly sorted data it is incredibly efficient (very near O(n) complexity).
It works in place, so no auxiliary storage is necessary, i.e. it requires only a constant amount O(1) of additional memory space.
Efficient for (quite) small data sets.
Stable, i.e. it does not change the relative order of elements with equal keys.

Disadvantages of Insertion Sort

It is less efficient for large data sets, as the time complexity is O(n^2).
Insertion sort needs a large number of element shifts.

Chapter 4: Sorting Algorithms: Shell Sort

4.1 Shell Sort

Definition: Shell sort is a sorting algorithm that is an extended version of insertion sort; it improves on insertion sort's average time complexity. Like insertion sort, it is a comparison-based, in-place sorting algorithm. Shell sort is efficient for medium-sized data sets. In insertion sort, elements can only move ahead by one position at a time, so moving an element to a far-away position requires many movements, which increases the algorithm's execution time. Shell sort overcomes this drawback by allowing the movement and swapping of far-away elements as well. The algorithm first sorts elements that are far apart from each other, then successively reduces the gap between them. This gap is called the interval.

4.2 Shell Sort Algorithm

Step 1 − Initialize the value of the gap h.
Step 2 − Divide the list into smaller sub-lists, each of equal interval h.
Step 3 − Sort these sub-lists using insertion sort.
Step 4 − Repeat until the complete list is sorted.

4.3 Shell Sort Pseudocode

Calculate gap size ($gap)
WHILE $gap is greater than 0
    FOR each element of the list that is $gap apart
        Extract the current item
        Locate the position to insert it
        Insert the item at that position
    END FOR
    Calculate gap size ($gap)
END WHILE

4.4 Shell Sort Code Implementation

using System;

class ShellSort
{
    /* function to implement shell sort */
    static void shell(int[] a, int n)
    {
        /* Rearrange the array elements at n/2, n/4, ..., 1 intervals */
        for (int interval = n / 2; interval > 0; interval /= 2)
        {
            for (int i = interval; i < n; i += 1)
            {
                /* store a[i] in temp, making a hole at position i */
                int temp = a[i];
                int j;
                /* shift earlier gap-sorted elements up until the correct
                   location for a[i] is found */
                for (j = i; j >= interval && a[j - interval] > temp; j -= interval)
                    a[j] = a[j - interval];
                /* put temp (the original a[i]) in its correct position */
                a[j] = temp;
            }
        }
    }

    /* function to print the array elements */
    static void printArr(int[] a, int n)
    {
        for (int i = 0; i < n; i++)
            Console.Write(a[i] + " ");
    }

    static void Main()
    {
        int[] a = { 31, 29, 38, 6, 10, 15, 23, 40 };
        int n = a.Length;
        Console.Write("Before sorting, the array elements are - \n");
        printArr(a, n);
        shell(a, n);
        Console.Write("\nAfter applying shell sort, the array elements are - \n");
        printArr(a, n);
    }
}

4.5 Shell Sort Flowchart

4.6 Shell Sort: Space and Time Complexity

Best case:        O(n log n)
Average case:     O(n log n)
Worst case:       O(n^2)
Space complexity: O(1)

These figures assume the simple n/2, n/4, ..., 1 gap sequence used above; other gap sequences give different bounds.

Best Case Complexity – occurs when no sorting is required, i.e., the array is already sorted. The best-case time complexity of Shell sort is O(n log n).
Average Case Complexity – occurs when the array elements are in jumbled order, neither properly ascending nor properly descending. The average-case time complexity of Shell sort is about O(n log n).
Worst Case Complexity – occurs when the array elements are required to be sorted in reverse order.
That means, for example, that you have to sort the array elements in ascending order, but they are in descending order. The worst-case time complexity of Shell sort is O(n^2).
Space Complexity – the space complexity of Shell sort is O(1).

4.7 Advantages of Shell Sort

Implementation is easy.
No recursive call stack is required.
Shell sort is efficient when the given data is already almost sorted.
Shell sort is an in-place sorting algorithm.

Disadvantages of Shell Sort

Shell sort is inefficient when the data is highly unsorted.
Shell sort is not efficient for large arrays.

Chapter 5: Data Analysis

Heap sort
Data size   Array size   Run time
Small       50           0.175 seconds
Medium      100          0.234 seconds
Large       200          0.317 seconds

While the asymptotic complexity of heap sort makes it look faster than quicksort, in real systems heap sort is often slower.

Shell sort
Data size   Array size   Run time
Small       50           0.223 seconds
Medium      100          0.355 seconds
Large       200          0.731 seconds

The time complexity of Shell sort depends on the gap sequence.

Insertion sort
Data size   Array size   Run time
Small       50           0.0005 seconds
Medium      100          0.002 seconds
Large       200          0.031 seconds

[Chart: run time versus array size (50, 100, 200) for heap, insertion and shell sort.]

It is important to note that runtimes will vary depending on the specific implementation and hardware used; additionally, the chosen gap sequence will affect Shell sort's runtime for a given input. On these small inputs, insertion sort recorded by far the lowest runtimes, while heap sort's runtime grew the most slowly as the array size doubled, suggesting it scales best.

Chapter 6: Conclusion

In conclusion, a sorting algorithm is a method for reorganizing a large number of items into a specific order, such as alphabetical, highest-to-lowest value or shortest-to-longest distance. Sorting algorithms take lists of items as input data, perform specific operations on those lists and deliver ordered arrays as output. This report focused mainly on the Heap, Insertion and Shell sort algorithms. We established the pseudocode, algorithm and implementation of each algorithm respectively.
Asking which of heap, shell and insertion sort is best, we deduced that the choice of sorting algorithm depends on several factors, including the size of the input data, the order of the data, memory usage, stability and performance. For small input data, a simple algorithm like insertion sort can work best; for larger data sets, more efficient algorithms like quicksort, merge sort or heap sort are the better choices. If the data is almost sorted, insertion sort works best, with O(n) time complexity. If the data is random, quicksort, merge sort or heap sort can be better options.

When memory usage is an important consideration, algorithms like heap sort [O(1) extra space] or quicksort [O(log n) extra space] are preferred over merge sort [O(n) extra space]. For sorting linked lists, merge sort is the optimal choice: it is relatively simple to implement and requires O(n log n) time and O(1) extra space, whereas linked lists' slow random-access performance results in poor performance for algorithms such as quicksort and makes others like heap sort infeasible.

In a parallel computing environment, merge sort is often the preferred choice due to its divide-and-conquer approach. This method divides the input equally at each stage, and each smaller sub-problem is independent of the others, which makes it easy to distribute and process data in parallel across multiple clusters. Quicksort and merge sort can be relatively simple to implement, but heap sort may require a deeper understanding of binary heaps.

Chapter 7: References

https://www.programiz.com
https://www.javatpoint.com
https://startutorial.com
https://t4tutorials.com
https://www.interviewkickstart.com