Quicksort is one of the most efficient and fundamental sorting algorithms. It is based on a recursive divide-and-conquer approach and is one of the classic algorithms taught and tested in universities. In this article, we'll introduce quicksort, discuss how it works and its space and time complexity, and give full code examples in multiple languages. Before proceeding, if you are not familiar with recursion, please visit our non-technical introduction to recursion or our outline on writing recursive functions.
QuickSort
The quicksort algorithm is a comparison-based sorting algorithm used to sort a list of items, most commonly an array. It uses a divide-and-conquer approach to recursively sort partitions of the main array, creating smaller and smaller partitions and partially sorting each one around a pivot value, until every partition is of size one. At that point the sort is complete and the entire array is sorted in place. Because the original array itself is sorted, the original ordering is lost. We'll break down the process step by step and give a full overview of each step in the following sections.
Pivoting and Partitioning
Quicksort applies the divide-and-conquer strategy by dividing the input array into partitions. At each recursive step, a partition is created by selecting a pivot point within the array, then swapping items so that all items smaller than the pivot are to its left and all items larger are to its right. This isn't a full sort, as the items to the left and right might still be unsorted themselves. We can consider the array to be sorted with respect to p, where p is the pivot value. Once the partitioning process is complete, we will have two partitions: one with values smaller than or equal to p, and one with values larger than or equal to p. The value p itself is now in the position it will occupy once the entire array is sorted.
To summarize, the steps of this part of the algorithm are:
- Set the initial pivot point as the median of the first, middle, and last elements. We'll discuss this in more detail below.
- Iterate through the array and compare values against p.
- If the item is before p but greater than p, move it to the right of p.
- If the item is after p but less than p, move it to the left of p.
- Consider the left partition to be the subarray from the beginning to p and the right partition to be the subarray from p+1 to the end.
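To make the idea concrete, here is a small conceptual illustration in Python. It is not the in-place routine described in the next section; it simply splits a sample array around a hypothetical pivot value (p = 6 here) to show that the pivot lands in its final sorted position:

```python
# Conceptual (not in-place) illustration of one partition step.
arr = [7, 2, 9, 4, 3, 8, 6]
p = 6  # hypothetical pivot value for this example

smaller = [x for x in arr if x < p]   # items that belong left of p
equal = [x for x in arr if x == p]    # the pivot value itself
larger = [x for x in arr if x > p]    # items that belong right of p

result = smaller + equal + larger
print(result)  # [2, 4, 3, 6, 7, 9, 8] -- 6 sits at index 3, its final sorted position
```

Note that the items within each side are still unsorted; only the pivot is guaranteed to be in its final place.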
There are multiple ways to implement the partition function. In our implementation, we will use two pointers, one for the left and one for the right. The left pointer will start at the leftmost index of the array and the right pointer at the rightmost. Each pointer moves inward until a mismatched pair is found, which is then swapped. The full steps are:
1. Initialize left to the beginning index and right to the end index.
2. Move the left pointer rightwards until an item greater than or equal to p is found.
3. Move the right pointer leftwards until an item less than or equal to p is found.
4. If right is less than or equal to left, the partition process is complete, and right is the partition index.
5. Otherwise, swap the array values at left and right, then return to step 2.
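The steps above can be sketched in Python as follows. The function name and signature are our own, and the pivot value p is assumed to be chosen beforehand and to be a value actually present in arr[low..high] (as it is with the median-of-three selection discussed next), which keeps both scans in bounds:

```python
def partition(arr, low, high, p):
    """Partition arr[low..high] around a pre-selected pivot value p.

    Returns the split index: afterwards, arr[low..split] holds values
    <= p and arr[split + 1..high] holds values >= p.
    """
    left, right = low, high
    while True:
        # Step 2: move left rightwards to an item >= p.
        while arr[left] < p:
            left += 1
        # Step 3: move right leftwards to an item <= p.
        while arr[right] > p:
            right -= 1
        # Step 4: pointers met or crossed; right is the partition index.
        if right <= left:
            return right
        # Step 5: swap the mismatched pair and continue scanning.
        arr[left], arr[right] = arr[right], arr[left]
        # Step past the swapped items so equal values can't loop forever.
        left += 1
        right -= 1
```

For example, `partition([7, 2, 9, 4, 3, 8, 6], 0, 6, 6)` rearranges the list so everything at or before the returned index is at most 6, and everything after it is at least 6.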
Pivot Selection
There are multiple ways a pivot can be selected, and the choice can have significant consequences for overall performance. Common pivot choices include:
- First index: we can make the pivot point the first possible index. However, this is generally avoided, as it can cause the algorithm to degrade to O(n^2) for lists which are sorted or mostly sorted. This point will be covered further in sections below.
- Mid-point index: we can make the pivot point the middle index in the array.
- Last index: we can make the pivot point the last possible index.
- Random index: select any random index as the pivot.
- Median of three: take the first, center, and last elements, determine the median value of the three, and use that value. This is the approach we will be taking in our algorithm.
The ideal pivot value is the median of all of the values in the array. The median-of-three approach increases the chances of using a near-median pivot by selecting the best option out of three samples. In the worst case, all three sampled values are the same and the selection provides no benefit; in the best case, all three differ and the selected median is closer to the true median of the entire array than the other two samples.
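A median-of-three helper can be sketched in just a few lines of Python (the function name is our own):

```python
def median_of_three(arr, low, high):
    """Return the median of the first, middle, and last elements of
    arr[low..high], to be used as the pivot value."""
    mid = (low + high) // 2
    # Sorting the three sampled values and taking the middle one
    # yields their median.
    return sorted((arr[low], arr[mid], arr[high]))[1]
```

For instance, `median_of_three([9, 1, 5, 3, 7], 0, 4)` samples 9, 5, and 7 and returns 7, which is much closer to the array's true median (5) than picking the first element (9) would be.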
Recursively Sorting
Once the partitioning process is complete, the array will be sorted with respect to the pivot, and we will have two partitions, a left partition and a right partition. To continue the sorting process, we can recursively call the quicksort algorithm once for each partition. This recursively sorts each partition as if it were its own array; each will have its own pivot point, partition operations, and so on. Instead of cloning or splicing arrays, we will pass start and end indices to the recursive calls in our implementation.
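Putting the pieces together, a self-contained Python sketch of the index-based recursion might look like this. The names and structure are our own, combining median-of-three pivot selection with the two-pointer partition described earlier:

```python
def quicksort(arr, low=0, high=None):
    """Recursively sort arr[low..high] in place (a sketch following
    the steps described above)."""
    if high is None:
        high = len(arr) - 1
    if low >= high:
        return  # partitions of size 0 or 1 are already sorted

    # Median-of-three pivot selection.
    mid = (low + high) // 2
    p = sorted((arr[low], arr[mid], arr[high]))[1]

    # Two-pointer partition around the pivot value p.
    left, right = low, high
    while True:
        while arr[left] < p:
            left += 1
        while arr[right] > p:
            right -= 1
        if right <= left:
            break
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1

    # Recurse on each partition by passing indices, not array copies.
    quicksort(arr, low, right)
    quicksort(arr, right + 1, high)
```

Note how the recursive calls receive start and end indices rather than new arrays, so no extra copies of the data are made.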
Complexity Analysis
In the table below, we'll outline the space complexity, time complexity, and other aspects of the quicksort algorithm:
Aspect | Value | Notes
---|---|---
Space Complexity | O(n) | Space is required on the function call stack for recursive calls. The average case is O(log n).
Time Complexity | O(n^2) | The average case is O(n log n), but in edge cases, such as an array which is already sorted, performance can degrade to O(n^2).
In Place | Yes | This only requires space on the call stack, unlike a sorting algorithm such as merge sort, which must duplicate subarrays.
Stable | No | The sorted array is not guaranteed to preserve the order of items which have an equal sort value.
Modifies the Source Input | Yes | The source array itself is directly modified, so the original ordering is lost.
Quicksort is generally considered an O(n log n) sorting algorithm despite its worst case of O(n^2), because an actual performance of O(n^2) is very uncommon, and the average performance is generally faster than other O(n log n) sorting algorithms.
Full Implementation
In this implementation, we will be sorting arrays of numbers. However, the approach can be adapted to other types, such as strings and class instances, or generalized to sort anything which can implement a compare method.
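As a sketch of that generalization, comparisons can be routed through a key function, so the same in-place algorithm described above can order strings, class instances, or anything else that yields a comparable key. The function name `quicksort_by_key` and its `key` parameter are our own illustrative choices:

```python
def quicksort_by_key(arr, key, low=0, high=None):
    """In-place quicksort that orders items by key(item)."""
    if high is None:
        high = len(arr) - 1
    if low >= high:
        return

    # Median-of-three pivot selection, comparing key values.
    mid = (low + high) // 2
    p = sorted((key(arr[low]), key(arr[mid]), key(arr[high])))[1]

    # Two-pointer partition, comparing key values against the pivot key.
    left, right = low, high
    while True:
        while key(arr[left]) < p:
            left += 1
        while key(arr[right]) > p:
            right -= 1
        if right <= left:
            break
        arr[left], arr[right] = arr[right], arr[left]
        left += 1
        right -= 1

    quicksort_by_key(arr, key, low, right)
    quicksort_by_key(arr, key, right + 1, high)


# Example: sort words by length. Because quicksort is not stable,
# the relative order of equal-length words is not guaranteed.
words = ["banana", "fig", "kiwi", "pear"]
quicksort_by_key(words, len)
print(words)  # shortest word first, longest last
```

Passing `key=lambda x: x` recovers the plain numeric sort, which is one common way such a generalization is structured.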