International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 5, Issue 8, August 2015)

Modified Quick Sort: Worst Case Made Best Case

Omar Khan Durrani 1, Dr. Khalid Nazim S. A. 2
1,2 Department of Computer Science & Engineering, Vidya Vikas Institute of Engineering & Technology, Mysore

Abstract— Extensive work on the analysis of quick sort has been carried out by the pioneers of computer science and applied mathematics, far more than can be described and discussed here, and no contribution made in this field of research and development can be overlooked; still, facts emerge from necessity. With this in mind, some improvements have been made to the worst-case performance of quick sort, which the academic reference material and existing contributions treat as a settled matter. In this paper quick sort is modified to perform at its best precisely where it is supposed to be at its worst, and the modification yields a substantial improvement over the existing algorithm. The remaining sections are organised as follows: the section on the analysis of quick sort reviews the algorithm's complexity in all possible cases; the literature survey briefly describes the references that lead to classifying the quick sort worst case in the class O(n²); the section on the worst case made best explains the modified code, which sorts ordered input in Θ(n) time. Finally, the experimental results and performance measurement section presents the experiments conducted to confirm correctness, and the times clocked for the different set-ups are plotted and discussed from the point of view of efficiency.

Keywords— sorting, quicksort, randomized, worst case, quicksort_wmb, partition, global variables, ordered input.

I. INTRODUCTION

Mathematicians have contributed to algorithmic analysis from an information-theoretic viewpoint; algorithm engineers, on the other hand, contribute from the angle of software and computer architecture. The responsibility that falls on us in meeting upcoming challenges is to improve and provide compatible code designs with time efficiency as the objective. Sorting is one of the most important, well-studied and commonly applied problems in the field of computer technology. Many sorting algorithms are known, offering various trade-offs in efficiency, simplicity, memory use and other factors. However, these algorithms take little account of the features of compilers and computer architectures that significantly influence performance. Hence the quick sort algorithm is analysed and improved here for its worst case, even though practitioners may not have shown much interest in this case, since they handle it with other sorting algorithms, such as merge sort, which perform much better on the inputs for which quick sort is proven to be slow. As an academician I feel that such outcomes have to be shared with our community. It is also well known that quick sort has proven to be the fastest on average when compared with the other n log n algorithms such as merge sort and heap sort. Other sorting algorithms such as bubble sort, selection sort, insertion sort and shell sort fall in the n² class and show slow performance. Analysis and performance measurement of both classes has been carried out experimentally [7, 8].

II. QUICK SORT

Quick sort is an algorithm based on the divide-and-conquer paradigm: it selects a pivot element and reorders the given list so that all elements smaller than the pivot are on one side and all larger elements are on the other. The two sublists are then sorted recursively until the whole list is completely sorted. The pivot may simply be the first element when the input consists of random numbers; for ordered input, however, choosing the first element as pivot leads to slow performance, with complexity Θ(n²) (the worst case) instead of the O(n log n) achieved in the best and average cases. To make quick sort run in O(n log n) time on ordered input as well, practitioners have preferred random selection of the pivot, leading to randomized quick sort. A complete theoretical analysis of quick sort is given in the following section. Quick sort is also the default sorting scheme in some operating systems, such as UNIX.
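For concreteness, the scheme just described can be written down as follows. This is only a minimal illustrative sketch, not the code used in this paper: the function names quicksort and partition, the std::vector interface and the explicit bounds check on i (used here in place of a sentinel element) are assumptions made for this illustration.

#include <iostream>
#include <utility>
#include <vector>

// Classic partition: the first element of a[low..high] is taken as the pivot.
// Smaller elements end up on its left, larger ones on its right, and the
// pivot's final index is returned.
int partition(std::vector<int>& a, int low, int high) {
    int key = a[low];
    int i = low, j = high + 1;
    while (true) {
        do { ++i; } while (i <= high && a[i] < key);   // scan right for an element >= key
        do { --j; } while (a[j] > key);                // scan left for an element <= key
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[low], a[j]);                           // place the pivot at its final position
    return j;
}

void quicksort(std::vector<int>& a, int low, int high) {
    if (low < high) {
        int j = partition(a, low, high);
        quicksort(a, low, j - 1);    // sort the elements left of the pivot
        quicksort(a, j + 1, high);   // sort the elements right of the pivot
    }
}

int main() {
    std::vector<int> a = {6, 10, 2, 16, 8, 4, 0, 12};
    quicksort(a, 0, static_cast<int>(a.size()) - 1);
    for (int x : a) std::cout << x << ' ';             // prints: 0 2 4 6 8 10 12 16
    std::cout << '\n';
}

On ordered input this version degenerates exactly as analysed in the next section: partition always returns the lowest index, so one of the two recursive calls receives a subarray that is only one element shorter than its parent.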
III. ANALYSIS OF QUICKSORT

Rearranging the array as described in the previous section always takes O(n), or αn, time, where α is some constant accounting for the work done in every partition. Suppose the pivot just chosen divides the array into two parts, one of size k and the other of size n − k. Both of these parts still need to be sorted, which gives the following recurrence:

T(n) = T(k) + T(n − k) + αn,                                   (1)

where T(n) denotes the time taken by the algorithm to sort n elements.

A. Worst case analysis

Consider the case in which the pivot is the least element of the array (the input array is in ascending order), so that k = 1 and n − k = n − 1 in equation (1). In this case we have T(n) = T(1) + T(n − 1) + αn, and unrolling the recurrence gives

T(n) = T(n − i) + iT(1) + α(n + (n − 1) + ... + (n − i + 1)).  (2)

Clearly this recurrence can only go on until i = n − 1 (because otherwise n − i would be less than 1). Substituting i = n − 1 in equation (2) gives

T(n) = T(1) + (n − 1)T(1) + α(n + (n − 1) + ... + 2)
     = nT(1) + α(n(n + 1)/2 − 1),

since 2 + 3 + ... + n = n(n + 1)/2 − 1, which is O(n²). This is the worst case of quick sort, and it happens when the pivot we pick turns out to be the least element of the array to be sorted in every step (i.e. in every recursive call). A similar situation also occurs if the pivot happens to be the largest element of the array to be sorted.
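The quadratic behaviour derived above can also be observed empirically. The following is a small instrumented sketch (again an illustration added here, not part of the original experiments): it counts the scan steps performed inside the partition of the earlier sketch, which serve as a proxy for key comparisons, and runs the first-element-pivot quicksort on already sorted inputs of increasing size. The counts grow roughly fourfold whenever n doubles, as a Θ(n²) bound predicts.

#include <iostream>
#include <utility>
#include <vector>

static long long scan_steps = 0;   // proxy for the number of key comparisons

int partition(std::vector<int>& a, int low, int high) {
    int key = a[low];
    int i = low, j = high + 1;
    while (true) {
        do { ++i; ++scan_steps; } while (i <= high && a[i] < key);
        do { --j; ++scan_steps; } while (a[j] > key);
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[low], a[j]);
    return j;
}

void quicksort(std::vector<int>& a, int low, int high) {
    if (low < high) {
        int j = partition(a, low, high);
        quicksort(a, low, j - 1);
        quicksort(a, j + 1, high);
    }
}

int main() {
    for (int n : {100, 200, 400, 800}) {
        std::vector<int> a(n);
        for (int k = 0; k < n; ++k) a[k] = k;   // already sorted: the worst-case input
        scan_steps = 0;
        quicksort(a, 0, n - 1);
        std::cout << "n = " << n << "  scan steps = " << scan_steps << '\n';
    }
}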
B. Best case analysis

The best case of quick sort occurs when the pivot we pick happens to divide the array into two exactly equal parts in every step. Thus we have k = n/2 and n − k = n/2 in equation (1) for the original array of size n. Consider, therefore, the recurrence

T(n) = 2T(n/2) + αn                                            (3)
     = 2(2T(n/4) + αn/2) + αn     (substituting n/2 for n in equation (3) gives T(n/2) = 2T(n/4) + αn/2)
     = 2²T(n/4) + 2αn             (simplifying and grouping terms)
     = 2²(2T(n/8) + αn/4) + 2αn
     = 2³T(n/8) + 3αn
     = 2^k T(n/2^k) + kαn         (continuing likewise up to the k-th step).

Notice that this recurrence can continue only until n = 2^k (otherwise we would have n/2^k < 1), i.e. until k = log n. Putting k = log n, we obtain

T(n) = nT(1) + αn log n,

which is O(n log n). This is the best case for quick sort. It also turns out that in the average case (over all possible pivot configurations) quick sort has a time complexity of O(n log n); the proof is beyond the scope of this paper.

C. Avoiding the worst case

Practical implementations of quick sort often pick a pivot randomly from the list each time [1, 2]. This greatly reduces the chance that the worst case ever occurs. The method is seen to work excellently in practice, although some time is still spent in the randomizer [1]. The other technique, which deterministically prevents the worst case from ever occurring, is to find the median of the array to be sorted each time and use that as the pivot. The median can be found in linear time, but the procedure is saddled with a huge constant-factor overhead, rendering it suboptimal for practical implementations [3].
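For reference, the randomized strategy mentioned above can be sketched as follows. This is a generic illustration rather than the randomized quick sort used in the experiments of this paper; the use of the C++11 <random> facilities and the names randomized_partition and randomized_quicksort are assumptions of the sketch.

#include <random>
#include <utility>
#include <vector>

// Same first-element-pivot partition as in the earlier sketch.
int partition(std::vector<int>& a, int low, int high) {
    int key = a[low];
    int i = low, j = high + 1;
    while (true) {
        do { ++i; } while (i <= high && a[i] < key);
        do { --j; } while (a[j] > key);
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[low], a[j]);
    return j;
}

// A uniformly chosen element is swapped into position low before partitioning,
// so ordered input no longer forces the degenerate split in every step.
int randomized_partition(std::vector<int>& a, int low, int high) {
    static std::mt19937 gen(std::random_device{}());
    std::uniform_int_distribution<int> dist(low, high);
    std::swap(a[low], a[dist(gen)]);
    return partition(a, low, high);
}

void randomized_quicksort(std::vector<int>& a, int low, int high) {
    if (low < high) {
        int j = randomized_partition(a, low, high);
        randomized_quicksort(a, low, j - 1);
        randomized_quicksort(a, j + 1, high);
    }
}

int main() {
    std::vector<int> a = {5, 4, 3, 2, 1, 0};
    randomized_quicksort(a, 0, static_cast<int>(a.size()) - 1);   // a becomes {0, 1, 2, 3, 4, 5}
}

Note that the random number generation itself costs time in every partition step, which is why the randomized average-case column of Table 4 is slightly slower than the plain average-case column.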
IV. LITERATURE SURVEY

Thomas H. Cormen et al. [3] note that quick sort deteriorates to quadratic time in the worst case and spends a lot of time even on sorted or almost sorted data: it performs a large number of comparisons on sorted data, although the swap count is low for sorted or almost sorted input. Mark Allen Weiss [11] likewise states that quick sort has O(n²) worst-case performance. Horowitz et al. [1] observe that one possible input on which quick sort displays its worst-case behaviour is one in which the elements are already in order. Almost all authors of algorithm books and research papers on quick sort analysis agree that quick sort performs no better than O(n²) on ordered input (treating randomized pivoting and median selection as exceptional cases). On the strength of this survey and many other references, theoreticians and practitioners have classified the worst case of quick sort in the asymptotic class O(n²).

V. QUICKSORT WORST CASE MADE BEST CASE

If bubble sort and selection sort can be modified to perform only n − 1 comparisons on sorted input data, as stated in most algorithm books and confirmed by experiments conducted in the laboratory, why can quick sort not do the same? This prompted me to set aside a short time to modify the existing quicksort and partition code into quicksort_wmb and partition_wmb, as shown in the C++ code of figure 1.1. The statements involving the new global variables mark the changes made over the existing code.

void quick::quicksort_wmb(int low, int high)
{
    int j;
    if (low < high)
    {
        j = partition_wmb(low, high);
        if (no_part == 1)
        {
            if (Aorder) { cout << "A order" << endl; return; }
            else if (Dorder) { cout << "Dorder" << endl; return; }
        }
        quicksort_wmb(low, j - 1);
        quicksort_wmb(j + 1, high);
    }  // end of if compound statement
}  // end of quicksort_wmb

int quick::partition_wmb(int low, int high)
{
    int key, i, j, temp;
    no_part++;                      // global partition counter
    key = a[low];
    i = low;
    j = high + 1;
    while (i <= j)
    {
        do { i++; } while (key >= a[i]);
        do { j--; } while (key < a[j]);
        if (i < j) { temp = a[i]; a[i] = a[j]; a[j] = temp; }
    }
    temp = a[low]; a[low] = a[j]; a[j] = temp;
    if (no_part == 1)
    {
        if ((j == n - 1) && (i == n))        // scans ran to the top of the array: descending order
            Dorder = 1;
        else if ((i == 1) && (j == 0))       // scans stopped immediately: ascending order
            Aorder = 1;
    }
    return j;
}  // end of partition_wmb

Figure 1.1: Quicksort and partition functions modified to perform as the best case for ordered inputs.

In the above code, three global variables, int no_part, Aorder and Dorder, are initialised to zero, and the decision to return early from quicksort_wmb is taken when the corresponding variable has been set to 1, so that further recursion is avoided. The global variable no_part counts the number of partitions made in each execution of quicksort_wmb; Aorder indicates that the array was found to be in ascending order, and Dorder indicates that it was found to be in descending order.

Note that the array element at position high + 1 is assigned the value maximum + 1 (1000 in our case), which stops index i at high + 1. Aorder is set to 1 in the partition function when i = 1 and j = 0: i is not incremented further because key >= a[i] immediately becomes false in the statement do { i++; } while (key >= a[i]); (the do-while executes only once), whereas key < a[j] remains true until j becomes 0 after n executions of the statement do { j--; } while (key < a[j]); (i.e. until key = a[j]). Dorder is set to 1 in the partition function when i = n and j = n − 1, that is, when i has run past the upper bound of the array (searching for an element greater than the key, which is the largest value in the array) and j stands still at n − 1, because a[n − 1], i.e. a[j], is less than the key. These two cases are illustrated in figure 1.2 (ascending order input) and figure 1.3 (descending order input).

For descending order input, a small further modification can be made to the algorithm that avoids swapping the last (i.e. the highest) element with the first (i.e. the lowest) element and instead prints the array in reverse order; this takes Θ(2n) time, i.e. Θ(n) for one execution of the partition function and Θ(n) for reversing the input array. Also, quicksort_wmb fails to sort random input arrays whose first element is the largest or the smallest of the array elements. In such cases one can always use functions int findmin() and int findmax(), which return the minimum and maximum array elements respectively; comparing these with the first element of the given input array overcomes the problem. This modification takes about O(n) extra time in addition to the Θ(n) time of quicksort_wmb, resulting in a time complexity of Θ(n + n) = Θ(n) by Theorem 1 given in chapter 2 of [2]. A sketch of such a guard is given below.
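The guard just mentioned is not spelled out in the paper; the following is a minimal sketch of one way it could look, under the assumptions that the array is held in a std::vector and that the helper names findmin and findmax from the text are free functions (the wrapper name first_element_is_extreme is purely illustrative).

#include <vector>

// Linear scan for the smallest element; the name comes from the paper's text.
int findmin(const std::vector<int>& a) {
    int m = a[0];
    for (int x : a) if (x < m) m = x;
    return m;
}

// Linear scan for the largest element; the name comes from the paper's text.
int findmax(const std::vector<int>& a) {
    int m = a[0];
    for (int x : a) if (x > m) m = x;
    return m;
}

// The Aorder/Dorder shortcut in partition_wmb can only trigger when the first
// element is the minimum or maximum of the whole array.  If that happens for an
// array that is not actually ordered, the caller can detect the situation in
// O(n) with this check and fall back to an ordinary quicksort instead of
// trusting the order flags.
bool first_element_is_extreme(const std::vector<int>& a) {
    return !a.empty() && (a[0] == findmin(a) || a[0] == findmax(a));
}

Since both helpers are single linear scans, the extra cost is O(n), in line with the Θ(n + n) = Θ(n) bound stated above.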
VI. EXPERIMENTAL RESULTS

The configuration of the test bed used for the experiments is: Intel® Core™2 E7200, 2.53 GHz, 0.99 GB of RAM; operating system Microsoft Windows XP / Windows 7 (Vienna), Service Pack 3, version 2008-2009.

First the code quicksort_wmb was tested for correctness on different samples under the following three categories, with two test cases under each category:
(1) ascending order input (Table 1);
(2) descending order input (Table 2);
(3) randomly generated input numbers (Table 3).

The results obtained for these three categories of data (for simplicity only two test cases per category are shown here) establish the correctness of the algorithm. In the tables, the first line shows the input size of the array and the second line the prompt for entering the input list of elements; "no. of partition" displays the number of partitions performed when quicksort_wmb was executed, and finally the sorted array of elements is displayed. The flag Aorder or Dorder is printed when quicksort_wmb identifies the input array as being in ascending or descending order respectively and has returned from quicksort_wmb after the first partition.

Table 1: Result obtained when ascending order input was given to quicksort_wmb

enter the size of array : 20
enter the elements
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
no. of partition: 1
Aorder
sorted array is
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
________________________________________
enter the size of array : 50
enter the elements
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
Aorder
no. of partition: 1
sorted array is
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49

Table 2: Result obtained when descending order input was given to quicksort_wmb

enter the size of array : 20
enter the elements
20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Dorder
no. of partition: 1
sorted array is
1 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 20
________________________________________
enter the size of array : 50
enter the elements
50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
Dorder
no. of partition: 1
sorted array is
1 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 50

Table 3: Result obtained when random input numbers were given to quicksort_wmb

enter the size of array : 20
enter the elements
6 10 2 10 16 17 15 15 8 6 4 18 11 19 12 0 12 1 3 7
no. of partition: 15
sorted array is
0 1 2 3 4 6 6 7 8 10 10 11 12 12 15 15 16 17 18 19
________________________________________
enter the size of array : 50
enter the elements
46 30 32 40 6 17 45 15 48 26 4 8 21 29 42 10 12 21 13 47 19 41 40 35 14 9 2 21 29 16 31 1 45 43 34 10 29 45 11 42 39 38 16 14 42 13 16 14 39 1
no. of partition: 35
sorted array is
1 1 2 4 6 8 9 10 10 11 12 13 13 14 14 14 15 16 16 16 17 19 21 21 21 26 29 29 29 30 31 32 34 35 38 39 39 40 40 41 42 42 42 43 45 45 45 46 47 48

After confirmation of the test cases for quicksort_wmb, the execution time was clocked for various samples of n ranging from 0 to 1000 for the following set of algorithms (each corresponds to one column of Table 4):
1. Quick sort for randomly generated array elements with the first element as pivot (average case);
2. Quick sort for randomly generated array elements with a randomized pivot (randomized average case);
3. Quick sort for ordered input array elements with the first element as pivot (worst case);
4. Randomized quick sort for ordered input array elements with a randomized pivot (worst case randomised);
5. WMB quick sort for any given array of elements (worst made best).

A method specified in [1] under "Bird's Eye View" is used to clock the time in my C++ code for all the algorithms listed above.

Table 4: Time clocked in milliseconds for various samples of n

N value   Avg case   Randomised avg case   Worst case   Worst case randomised   Worst made best
0         0.000389   0.000395              0.000392     0.000395                0.000392
10        0.001304   0.001331              0.000879     0.000893                0.00054
20        0.002      0.00208               0.001659     0.001592                0.000672
30        0.00271    0.002853              0.002493     0.002245                0.000792
40        0.003451   0.003644              0.00333      0.002886                0.000922
50        0.004201   0.004454              0.004367     0.003525                0.001044
60        0.004971   0.005282              0.005494     0.004163                0.001188
70        0.005766   0.006136              0.006669     0.004812                0.001318
80        0.006558   0.006982              0.007945     0.005457                0.001445
90        0.007361   0.007834              0.009262     0.006089                0.001573
100       0.008174   0.0087                0.010625     0.00671                 0.001698
200       0.016648   0.017748              0.029192     0.013082                0.002961
300       0.02558    0.027214              0.055645     0.019452                0.004227
400       0.034796   0.03698               0.090217     0.025911                0.005487
500       0.04424    0.046964              0.132669     0.032436                0.00675
600       0.053857   0.057149              0.183577     0.038907                0.008009
700       0.063597   0.067406              0.242041     0.045375                0.009274
800       0.073447   0.077662              0.309246     0.051857                0.010538
900       0.083448   0.08816               0.38297      0.058463                0.011894
1000      0.093458   0.09873               0.467081     0.064995                0.01319

From Table 4 we can notice that the quick sort worst-case time for n = 1000 is 0.467081, the slowest among the experiments (last row). For a randomly generated input array with the first element as pivot, the time clocked is 0.093458, which is considerably faster than the worst-case run. Randomized pivot selection on the same random input results in 0.09873, slightly higher because of the call to the randomizing function that selects the pivot. Randomized quick sort on ordered input, where the pivot is generated at random, improves the speed further, clocking 0.064995 for n = 1000, the fastest among the experiments considered so far. Finally, the worst-made-best algorithm clocks 0.01319 for the ordered input. The graph plotted in figure 1.4 for the data in Table 4 gives a clearer picture of the speeds and time complexities.
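The clocking method of [1] is not reproduced here. Purely as an illustration of how such measurements could be collected with standard C++ facilities (the <chrono> harness, the fixed seed and the run_and_time helper are assumptions of this sketch, and std::sort merely stands in for the quicksort variants that were actually timed):

#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <vector>

// Time one call of a sorting routine on a copy of the input, in milliseconds.
template <typename Sorter>
double run_and_time(Sorter sort_fn, std::vector<int> data) {   // data copied on purpose
    auto start = std::chrono::steady_clock::now();
    sort_fn(data);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main() {
    std::mt19937 gen(12345);                                   // fixed seed for repeatability
    for (int n = 100; n <= 1000; n += 100) {
        std::vector<int> ordered_input(n), random_input(n);
        for (int k = 0; k < n; ++k) ordered_input[k] = k;
        std::uniform_int_distribution<int> dist(0, n);
        for (int& x : random_input) x = dist(gen);

        // Stand-in sorter; each quicksort variant of Table 4 would be timed the same way.
        auto sorter = [](std::vector<int>& a) { std::sort(a.begin(), a.end()); };

        std::cout << "n = " << n
                  << "  random: "  << run_and_time(sorter, random_input)  << " ms"
                  << "  ordered: " << run_and_time(sorter, ordered_input) << " ms\n";
    }
}

For very small n a single run is too short to time reliably; repeating each measurement in a loop and averaging gives more stable numbers.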
VII. CONCLUSION AND FUTURE WORK

In this paper I have shown how quick sort can be made to perform well from an angle that has not been discussed in the existing theory and analysis of quick sort. With further fine tuning, quicksort_wmb can be used to sort all possible cases of input data. Considering the work done in this paper, we can claim that this modification classifies quick sort's efficiency as Θ(n log n) for almost ordered and randomly ordered input (the worst and average cases respectively) and Θ(n) for strictly ordered input (the best case). This paper also provides grounds for proving the correctness of the modification formally. It further encourages one to consider the design of a program from various angles, such as compiler design and the system architecture on which the program executes, and encourages a program designer to review established algorithms, especially for sorting and searching.

Acknowledgement

First of all I would like to thank the Almighty for making this work possible. Heartfelt gratitude to Sartaj Sahni for his support during the course of the analysis. Special thanks to my colleagues at Vidya Vikas Institute of Engineering and Technology, Mysore, especially Assistant Professor Aditya C. R., who reviewed the work and pointed out places where it needed improvement, and finally to my Head of Department, Dr. Khalid Nazim S. A., for his encouragement and motivation.
REFERENCES
[1] Sartaj Sahni, "Data Structures and Algorithms in C++", University Press publication, 2004, chapter 4: Performance Measurement, page 123.
[2] Levitin, A., "Introduction to the Design and Analysis of Algorithms", Addison-Wesley, Boston, MA, 2007.
[3] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, "Introduction to Algorithms", Second Edition, Prentice-Hall, New Delhi, 2004.
[4] Vandana Sharma, Parvinder S. Sandhu, Satwinder Singh, and Baljit Saini, "Analysis of Modified Heap Sort Algorithm on Different Environment", World Academy of Science, Engineering and Technology, 42, 2008.
[5] Yedidyah Langsam, Moshe J. Augenstein, Aaron M. Tenenbaum, "An Introduction to Data Structures with C++", Prentice Hall India Learning Private Limited, 2nd edition, 2008.
[6] Gina Soileau, Muhammad Younus, Suresh Nandlall, Tamiko Jenkins, Thierry Ngouolali, Tom Rivers, "Sorting Algorithm Analysis", Data Structures and Algorithms (SMT-274304-01-08FA1), Professor James Iannibelli, December 21, 2008.
[7] Omar Khan Durrani, Shreelakshmi V, Sushma Shetty, "Performance Measurement and Analysis of Sorting Algorithms", National Conference on Convergent Innovative Technologies and Management (CITAM-11), Cambridge Institute of Technology and Management, Bangalore, December 2-3, 2011.
[8] Omar Khan Durrani, Shreelakshmi V, Sushma Shetty and Vinutha D C, "Analysis and Determination of Asymptotic Behavior Range for Popular Sorting Algorithms", Special Issue of International Journal of Computer Science & Informatics (IJCSI), ISSN (Print): 2231-5292, Vol. II, Issue 1, 2.
[9] C. Canaan, M. S. Garai, M. Daya, "Popular Sorting Algorithms", World Applied Programming, Vol. 1, No. 1, April 2011, pages 42-50.
[10] Donald Knuth, "The Art of Computer Programming, Volume 3: Sorting and Searching", Second Edition, Addison-Wesley, 1998, ISBN 0-201-89685-0, pages 106-110 of section 5.2.2: Sorting by Exchanging.
[11] Weiss, M. A., "Data Structures and Algorithm Analysis in C", Second Edition, Addison-Wesley, 1997, ISBN 0-201-49840-5.