
DATA STRUCTURES PROJECT PROPOSAL: GROUP 2

Topic: Sorting Algorithms
Heap Sort, Shell Sort and Insertion Sort
Supervised by:
Assistant, Computer Software Department
___________________ S. Kanengoni
Table of Contents
Chapter 1
1.1 Introduction
1.2 Problem Statement
1.3 Objectives
Chapter 2: Sorting Algorithms: Heap Sort
2.1 Heap Sort
2.2 Heap Sort Algorithm
2.3 Heap Sort Pseudocode
2.4 Heap Sort Code Implementation
2.5 Heap Sort Flowchart
2.6 Heap Sort Space and Time Complexity
2.7 Heap Sort Advantages and Disadvantages
Chapter 3: Sorting Algorithms: Insertion Sort
3.1 Insertion Sort
3.2 Insertion Sort Algorithm
3.3 Insertion Sort Pseudocode
3.4 Insertion Sort Flowchart
3.5 Insertion Sort Code Implementation
3.6 Insertion Sort Space and Time Complexity
3.7 Insertion Sort Advantages and Disadvantages
Chapter 4: Sorting Algorithms: Shell Sort
4.1 Shell Sort
4.2 Shell Sort Algorithm
4.3 Shell Sort Pseudocode
4.4 Shell Sort Flowchart
4.5 Shell Sort Code Implementation
4.6 Shell Sort Space and Time Complexity
4.7 Shell Sort Advantages and Disadvantages
Chapter 5: Data Analysis
Chapter 6: Conclusion
Chapter 7: References
Chapter 1
1.1 Introduction
Sorting refers to arranging data in a particular format. A sorting algorithm specifies the way of
arranging data in a particular order, for example numeric or lexicographical order, and it is an
essential operation in computer software and development. What is the purpose of sorting
algorithms? Sorting algorithms take lists of items as input data, perform specific operations on
those lists and deliver ordered arrays as output. The many applications of sorting algorithms
include organizing items by price on a retail website and determining the order of sites on a
search engine results page. Sorting also makes it easier to search through large amounts of data
quickly and more efficiently. The simplest example of sorting is a dictionary, whose entries are
arranged in alphabetical order.
1.2 Problem Statement
There is a wide variety of sorting algorithms available; however, this report dwells on heap sort,
shell sort and insertion sort, explaining how each of these algorithms functions and stating its
space and time complexity, pseudocode and flowchart.
1.3 Objectives (SMART: Specific, Measurable, Achievable, Realistic & Time-bound)
- To design the algorithm, pseudocode and code that implement the above-mentioned sorting
  methods (heap, shell and insertion sort).
- To calculate the complexities of the algorithms.
- To compare the complexities.
Chapter 2: Sorting Algorithms: Heap Sort
2.1 Heap Sort
Definition: Heap sort is an efficient sorting algorithm based on the use of max/min heaps. A
heap is a tree-based data structure that satisfies the heap property: for a max heap, the key of
any node is less than or equal to the key of its parent (if it has a parent). Heap sort is a
comparison-based sorting algorithm that uses a binary heap data structure to sort elements.
The algorithm works by building a heap from the array and repeatedly extracting the maximum
element from the heap and placing it at the end of the array. The heap is then reconstructed,
and the process is repeated until the array is sorted.
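As a small illustration (an example of my own, not taken from the report), the array { 9, 7, 8, 3, 5 },
read in level order, is a max heap: with the usual array layout the children of the node at index i
sit at indices 2i + 1 and 2i + 2, and every parent key is greater than or equal to its children's keys
(9 ≥ 7, 9 ≥ 8, 7 ≥ 3, 7 ≥ 5). The C# sketch below checks this property for an arbitrary array.

    // Illustrative sketch (assumed, not from the report): returns true if the
    // array, read in level order, satisfies the max-heap property.
    static bool IsMaxHeap(int[] a)
    {
        for (int i = 0; i < a.Length; i++)
        {
            int left = 2 * i + 1;               // index of the left child of node i
            int right = 2 * i + 2;              // index of the right child of node i
            if (left < a.Length && a[left] > a[i])
                return false;                   // a child key exceeds its parent's key
            if (right < a.Length && a[right] > a[i])
                return false;
        }
        return true;
    }

    // Example: IsMaxHeap(new int[] { 9, 7, 8, 3, 5 }) returns true,
    // while IsMaxHeap(new int[] { 3, 7, 8 }) returns false.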
2.2 Heap Sort Algorithm
Here’s the algorithm for heap sort:
Step 1: Build Heap. Build a heap from the input data: build a max heap to sort in increasing
order, or a min heap to sort in decreasing order.
Step 2: Swap Root. Swap the root element with the last item of the heap.
Step 3: Reduce Heap Size. Reduce the heap size by 1.
Step 4: Re-Heapify. Restore the heap property over the remaining elements of the new, smaller
heap by calling heapify on the root node.
Step 5: Repeat. Repeat steps 2, 3 and 4 as long as the heap size is greater than 1.
2.3 Heap Sort Pseudocode
procedure heapSort( A : array of size N )
    /* build a max heap: heapify all non-leaf elements, bottom-up */
    for i = N/2 - 1 down to 0 do:
        heapify(A, N, i)
    indexEnd = N - 1
    while indexEnd > 0 do:
        swap(A[0], A[indexEnd])        /* move the current maximum to the end */
        indexEnd = indexEnd - 1
        heapify(A, indexEnd + 1, 0)    /* re-heapify the heap A[0] .. A[indexEnd] */
    end while
    output the sorted array A
end procedure
2.4 Heap Sort Code Implementation
public int[] SortArray(int[] array, int size)
{
    if (size <= 1)
        return array;

    // Build a max heap from the unsorted array (start at the last non-leaf node).
    for (int i = size / 2 - 1; i >= 0; i--)
    {
        Heapify(array, size, i);
    }

    // Repeatedly move the current maximum (the root) to the end of the array,
    // shrink the heap by one, and restore the heap property.
    for (int i = size - 1; i > 0; i--)
    {
        var tempVar = array[0];
        array[0] = array[i];
        array[i] = tempVar;
        Heapify(array, i, 0);
    }
    return array;
}
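The SortArray method above relies on a Heapify routine that is not shown in this report. The
sketch below is one possible implementation of that helper, written to match the call sites above
(an assumption, not the group's original code): it sifts the element at index i down until the
subtree rooted at i satisfies the max-heap property, using the usual indexing where the children
of node i sit at 2i + 1 and 2i + 2. It works iteratively, so it involves no recursion.

    // Assumed helper (not part of the original listing): restore the max-heap
    // property for the subtree rooted at index i, within the first heapSize
    // elements of the array.
    private void Heapify(int[] array, int heapSize, int i)
    {
        while (true)
        {
            int largest = i;
            int left = 2 * i + 1;               // left child of node i
            int right = 2 * i + 2;              // right child of node i

            if (left < heapSize && array[left] > array[largest])
                largest = left;
            if (right < heapSize && array[right] > array[largest])
                largest = right;

            if (largest == i)
                break;                          // heap property already holds here

            int temp = array[i];                // swap the node with its larger child
            array[i] = array[largest];
            array[largest] = temp;
            i = largest;                        // continue sifting down from that child
        }
    }

With this helper in place, calling SortArray(new int[] { 5, 1, 4, 2, 8 }, 5) would return
{ 1, 2, 4, 5, 8 }.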
2.5 Heap Sort Flowchart
2.6 Heap Sort: Space and Time Complexity

Class: Sorting algorithm
Worst-case performance: O(n log n)
Best-case performance: O(n log n)
Average performance: O(n log n)
Space complexity: O(1)
Best Case Complexity – It occurs when no rearrangement is required, i.e., the array is already
sorted. The best-case time complexity of heap sort is O(n log n).
Average Case Complexity – It occurs when the array elements are in jumbled order, neither
properly ascending nor properly descending. The average-case time complexity of heap sort is
O(n log n).
Worst Case Complexity – It occurs when the array elements are required to be sorted in reverse
order, for example when you have to sort the elements in ascending order but they are given in
descending order. The worst-case time complexity of heap sort is O(n log n).
The time complexity of heap sort is therefore O(n log n) in all three cases (best, average and
worst). The reason is that the height of a complete binary tree with n elements is about log n,
so each of the n heapify (sift-down) operations costs at most O(log n); for example, with
n = 1024 elements the tree height is log2(1024) = 10.
2.7 Advantages of Heap Sort:
- Efficiency – The time required to perform heap sort grows only as n log n, whereas the time
  required by simpler algorithms such as insertion sort can grow quadratically as the number
  of items to sort increases. This makes heap sort very efficient.
- Memory Usage – Memory usage is minimal: apart from what is necessary to hold the initial
  list of items to be sorted, it needs no additional memory space to work.
- Simplicity – It is simpler to understand than other equally efficient sorting algorithms
  because it does not rely on advanced concepts such as recursion.
Disadvantages of Heap Sort:
- Costly: although its asymptotic complexity is good, heap sort is relatively costly in practice
  and on real systems it is often slower than algorithms such as quicksort.
- Unstable: heap sort is typically not stable, since the operations on the heap can change the
  relative order of items with equal keys.
- Not efficient for complex data: heap sort is not very efficient when working with highly
  complex data.
- If the input array is huge, does not fit into memory, and partitioning the array is faster than
  maintaining the heap, heap sort is not an option. In such cases something like merge sort or
  bucket sort, where parts of the array can be processed separately and in parallel, works best.
Chapter 3: Sorting Algorithms: Insertion Sort
3.1 Insertion Sort
Definition: Insertion sort is a simple sorting algorithm that builds the final sorted array one
item at a time by comparisons. It is much less efficient on large lists than more advanced
algorithms such as quicksort, heapsort or merge sort. In other words, it is a simple sorting
technique that scans the already-sorted portion of the list, starting at the beginning, for the
correct insertion point of each item taken from the unsorted portion.
3.2 Insertion Sort Algorithm
Step 1 – If it is the first element, it is already sorted.
Step 2 – Pick the next element.
Step 3 – Compare it with the elements in the sorted sub-list.
Step 4 – Shift all the elements in the sorted sub-list that are greater than the value to be
inserted one position to the right.
Step 5 – Insert the value.
Step 6 – Repeat until the list is sorted.
3.3 Insertion Sort Pseudocode
procedure insertionSort( A : array of items )
    int holePosition
    int valueToInsert
    for i = 1 to length(A) - 1 do:
        /* select the value to be inserted */
        valueToInsert = A[i]
        holePosition = i
        /* locate the hole position for the element to be inserted */
        while holePosition > 0 and A[holePosition - 1] > valueToInsert do:
            A[holePosition] = A[holePosition - 1]
            holePosition = holePosition - 1
        end while
        /* insert the value at the hole position */
        A[holePosition] = valueToInsert
    end for
end procedure
3.4 Insertion Sort Flowchart
3.5 Insertion Sort Code Implementation
#include <iostream>
using namespace std;

// Function to print an array
void printArray(int array[], int size) {
    for (int i = 0; i < size; i++) {
        cout << array[i] << " ";
    }
    cout << endl;
}

void insertionSort(int array[], int size) {
    for (int step = 1; step < size; step++) {
        int key = array[step];
        int j = step - 1;

        // Compare key with each element on its left until an element smaller than
        // (or equal to) it is found, shifting larger elements one place to the right.
        // For descending order, change array[j] > key to array[j] < key.
        // Note: j >= 0 must be checked before array[j] to avoid reading out of bounds.
        while (j >= 0 && array[j] > key) {
            array[j + 1] = array[j];
            --j;
        }
        array[j + 1] = key;
    }
}

// Driver code
int main() {
    int data[] = {9, 5, 1, 4, 3};
    int size = sizeof(data) / sizeof(data[0]);
    insertionSort(data, size);
    cout << "Sorted array in ascending order:" << endl;
    printArray(data, size);
    return 0;
}
3.6 Insertion Sort: Space and Time Complexity

Class: Sorting algorithm
Worst-case complexity: O(n^2)
Best-case complexity: O(n)
Average-case complexity: O(n^2)
Space complexity: O(1)
Worst-case complexity: O(n^2)
Suppose an array is in ascending order and you want to sort it in descending order. In this case
the worst case occurs: each new element has to be compared with (and shifted past) every
element already in the sorted part, so roughly 1 + 2 + ... + (n - 1) = n(n - 1)/2 comparisons are
made in total, which grows as n^2.
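As a rough illustration of this count (a small sketch of my own, not part of the original report),
the C# fragment below mirrors the shifting loop of the C++ implementation in section 3.5 and
counts the key comparisons insertion sort performs on a reversed array; for n = 5 it reports 10,
i.e. n(n - 1)/2.

    // Illustrative sketch (assumed): count the key comparisons insertion sort
    // performs on a reversed array of length n.
    static int CountComparisons(int n)
    {
        int[] a = new int[n];
        for (int i = 0; i < n; i++)
            a[i] = n - i;                       // reversed input: n, n-1, ..., 1

        int comparisons = 0;
        for (int step = 1; step < n; step++)
        {
            int key = a[step];
            int j = step - 1;
            while (j >= 0)
            {
                comparisons++;                  // one comparison of key against a[j]
                if (a[j] <= key)
                    break;
                a[j + 1] = a[j];                // shift the larger element right
                j--;
            }
            a[j + 1] = key;
        }
        return comparisons;                     // equals n*(n-1)/2 for reversed input
    }

    // Example: CountComparisons(5) returns 10 and CountComparisons(100) returns 4950.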
Best-case complexity: O(n)
When the array is already sorted, the outer loop runs n - 1 times while the inner loop does not
run at all, so only about n comparisons are made in total. Thus, the complexity is linear.
Average-case complexity: O(n^2)
It occurs when the elements of the array are in jumbled order (neither ascending nor
descending); on average about half of the sorted part is shifted for each insertion, which still
gives quadratic growth.
Space complexity: O(1)
The space complexity is O(1) because only a constant amount of extra storage (the variable
key) is used.
3.7 Advantages of Insertion Sort
- For nearly sorted data it is incredibly efficient (very near O(n) complexity).
- It works in place, meaning no auxiliary storage is necessary, i.e. it requires only a constant
  amount O(1) of additional memory space.
- It is efficient for (quite) small data sets.
- It is stable, i.e. it does not change the relative order of elements with equal keys.
Disadvantages of Insertion Sort
- It is less efficient for large data sets, as the time complexity is O(n^2).
- Insertion sort needs a large number of element shifts.
Chapter 4: Sorting Algorithms: Shell Sort
4.1 Shell Sort
Definition: Shell sort is a sorting algorithm that is an extended version of insertion sort, with
an improved average time complexity. Like insertion sort, it is a comparison-based and in-place
sorting algorithm, and it is efficient for medium-sized data sets. In insertion sort, elements can
only be moved ahead by one position at a time; moving an element to a far-away position
therefore requires many movements, which increases the algorithm's execution time. Shell sort
overcomes this drawback by allowing the movement and swapping of far-away elements as
well. The algorithm first sorts the elements that are far apart from each other and then
progressively reduces the gap between them. This gap is called the interval.
4.2 Shell Sort Algorithm
Step 1 − Initialize the value of the interval h.
Step 2 − Divide the list into smaller sub-lists whose elements are h apart.
Step 3 − Sort these sub-lists using insertion sort.
Step 4 − Reduce h and repeat until the complete list is sorted.
4.3 Shell Sort Pseudocode
Calculate gap size ($gap)
WHILE $gap is greater than 0
    FOR each element of the list that is $gap apart
        Extract the current item
        Locate the position to insert it
        Insert the item at that position
    END FOR
    Calculate the next gap size ($gap)
END WHILE
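To make the role of the gap concrete, the short C# sketch below (an illustration of my own,
assuming the same halving gap sequence n/2, n/4, ..., 1 that the implementation in section 4.5
uses) prints the array after every gap pass; it can be added as an extra method inside the
ShellSort class shown in section 4.5.

    // Illustrative sketch (assumed): shell sort with a halving gap sequence,
    // printing the array after each gap pass.
    static void ShellSortTrace(int[] a)
    {
        for (int gap = a.Length / 2; gap > 0; gap /= 2)
        {
            // One gapped insertion-sort pass: every gap-th slice is kept sorted.
            for (int i = gap; i < a.Length; i++)
            {
                int temp = a[i];
                int j;
                for (j = i; j >= gap && a[j - gap] > temp; j -= gap)
                    a[j] = a[j - gap];
                a[j] = temp;
            }
            Console.WriteLine("gap " + gap + ": " + string.Join(" ", a));
        }
    }

    // For the array { 8, 5, 3, 1, 9, 6, 0, 7 } this prints:
    // gap 4: 8 5 0 1 9 6 3 7
    // gap 2: 0 1 3 5 8 6 9 7
    // gap 1: 0 1 3 5 6 7 8 9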
4.4 Shell Sort Flowchart
4.5 Shell Sort Code Implementation
using System;

class ShellSort
{
    /* Function to implement shell sort. */
    static void shell(int[] a, int n)
    {
        /* Rearrange the array elements at n/2, n/4, ..., 1 intervals. */
        for (int interval = n / 2; interval > 0; interval /= 2)
        {
            for (int i = interval; i < n; i += 1)
            {
                /* Store a[i] in the variable temp and make the i-th position empty. */
                int temp = a[i];
                int j;
                for (j = i; j >= interval && a[j - interval] > temp; j -= interval)
                    a[j] = a[j - interval];

                /* Put temp (the original a[i]) in its correct position. */
                a[j] = temp;
            }
        }
    }

    /* Function to print the array elements. */
    static void printArr(int[] a, int n)
    {
        for (int i = 0; i < n; i++)
            Console.Write(a[i] + " ");
    }

    static void Main()
    {
        int[] a = { 31, 29, 38, 6, 10, 15, 23, 40 };
        int n = a.Length;
        Console.Write("Before sorting, array elements are - \n");
        printArr(a, n);
        shell(a, n);
        Console.Write("\nAfter applying shell sort, the array elements are - \n");
        printArr(a, n);
    }
}
4.6 Shell Sort: Space and Time Complexity

Best-case time complexity: O(n log n)
Worst-case time complexity: O(n^2)
Average time complexity: O(n log n)
Space complexity: O(1)
Best Case Complexity – It occurs when no rearrangement is required, i.e., the array is already
sorted. The best-case time complexity of shell sort is O(n log n).
Average Case Complexity – It occurs when the array elements are in jumbled order, neither
properly ascending nor properly descending. The average-case time complexity of shell sort is
about O(n log n).
Worst Case Complexity – It occurs for certain unfavourable input orders; with the simple
halving gap sequence used here, the worst-case time complexity of shell sort is O(n^2).
Space Complexity – The space complexity of shell sort is O(1).
4.7 Advantages of Shell Sort
- Implementation is easy.
- No recursion (and hence no extra call stack) is required.
- Shell sort is efficient when the given data is already almost sorted.
- Shell sort is an in-place sorting algorithm.
Disadvantages of Shell Sort
- Shell sort is inefficient when the data is highly unsorted.
- Shell sort is not efficient for large arrays.
Chapter 5: Data Analysis
Heap sort

Data size    Array size    Run time
Small        50            0.175 seconds
Medium       100           0.234 seconds
Large        200           0.317 seconds
While the asymptotic complexity of heap sort makes it look faster than quicksort, in real
systems heap sort is often slower.
Shell sort

Data size    Array size    Run time
Small        50            0.223 seconds
Medium       100           0.355 seconds
Large        200           0.731 seconds
The time complexity of Shell Sort depends on the gap sequence.
Insertion sort

Data size    Array size    Run time
Small        50            0.0005 seconds
Medium       100           0.002 seconds
Large        200           0.031 seconds
[Chart: run time in seconds (0 to 0.8) plotted against array size (50, 100, 200) for the heap,
insertion and shell sort measurements above.]
It is important to note that run times will vary depending on the specific implementation and
hardware used. Additionally, the chosen gap sequence will affect shell sort's run time for a
given input.
This data analysis indicates that insertion sort recorded the shortest run times for these small
array sizes, and that heap sort was consistently faster than shell sort, scaling better as the
array size grew.
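The report does not show how the run times above were measured; the following sketch
(one possible approach assumed by the editor, not the group's actual benchmark) shows how
such timings could be taken in C# with System.Diagnostics.Stopwatch on random arrays of the
listed sizes. Array.Sort is used only as a placeholder for the heap, shell and insertion sort
implementations from chapters 2 to 4, and actual numbers will differ between machines.

    using System;
    using System.Diagnostics;

    class SortBenchmark
    {
        // Time one sort on a random array of the given size.
        static void TimeSort(string name, Action<int[]> sort, int size)
        {
            var rng = new Random(42);               // fixed seed so runs are repeatable
            int[] data = new int[size];
            for (int i = 0; i < size; i++)
                data[i] = rng.Next(0, 10000);

            var watch = Stopwatch.StartNew();
            sort(data);
            watch.Stop();

            Console.WriteLine(name + ", n=" + size + ": "
                + watch.Elapsed.TotalSeconds + " seconds");
        }

        static void Main()
        {
            foreach (int size in new[] { 50, 100, 200 })
            {
                // Replace Array.Sort with the heap, shell or insertion sort
                // implementations from chapters 2-4 to reproduce the comparison.
                TimeSort("placeholder sort", a => Array.Sort(a), size);
            }
        }
    }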
Chapter 6: Conclusion
In conclusion, a sorting algorithm is a method for reorganizing a large number of items into a
specific order, such as alphabetical, highest-to-lowest value or shortest-to-longest distance.
Sorting algorithms take lists of items as input data, perform specific operations on those lists
and deliver ordered arrays as output. In this report we focused mainly on the heap, insertion
and shell sort algorithms, establishing the algorithm, pseudocode and implementation of each.
By asking which of the three is preferable in a given situation, we drew the conclusions below.
The choice of the best sorting algorithm depends on several factors, including the size of the
input data, the order of the data, memory usage, stability, performance, etc.
For small input data, a simple algorithm like insertion sort can work best. However, for larger
data sets, more efficient algorithms like quicksort, merge sort, or heapsort are the best
choices.
If the data is almost sorted, insertion sort works best with O(n) time complexity. If the data is
random, quicksort, merge sort, or heapsort can be better options.
When memory usage is an important consideration, algorithms like heapsort [O(1) extra
space] or quicksort [O(log n) extra space] are preferred over merge sort [O(n) extra space].
For sorting linked lists, merge sort is the optimal choice. It is relatively simple to implement
and requires O(n log n) time and O(1) extra space. However, linked lists have slow
random-access performance, resulting in poor performance for algorithms such as quicksort
and making others like heapsort infeasible.
In a parallel computing environment, merge sort is often the preferred choice due to its
divide-and-conquer approach. This method divides the input equally at each stage, and each
smaller sub-problem is independent of the others. This makes it easy to distribute and
process data in parallel across multiple clusters.
Quick sort and merge sort can be relatively simple to implement, but heapsort may require a
deeper understanding of binary heaps.
Chapter 7: References
https://www.programiz.com
https://www.javatpoint.com
https://startutorial.com
https://t4tutorials.com
https://www.interviewkickstart.com