Binary search is an algorithm which will let us find elements in a list or array without the need to check every single item in that array, so long as the array is sorted. The algorithm works as follows:
- Determine the array to perform the search on and the item we want to check the array against. For example, we may want to check if the string "apples" is in the array "fruits", which is sorted alphabetically.
- Find the index of the element which is exactly in the center of the search space. At the start, the search space will be the entire array. If the search space length is even, we can pick the leftmost center element.
- If that element is the one we're looking for, immediately return that element's index.
- If that element is less than the element we're looking for (alphabetically), limit our search so we start with the next element, and repeat from step 2.
- If that element is greater than the element we're looking for, limit our search so we end with that element, and repeat from step 2.
- If there are no more elements left, the item is not in the array. We can return -1 to mimic the normal behavior of "findIndex".
This approach finds the element by repeatedly narrowing down the search field based on whether or not our item ranks "higher" or "lower" based on some criteria. For numbers, the criteria can be a less than or greater than comparison. For strings, the criteria can be an alphabetical order comparison. Generally, examples are given in terms of numbers or strings, but this approach can be taken towards any type of object so long as we can program a way to rank the items against each other and sort them.
Binary Search Implementations
The code boxes below show two types of approaches we can take: iterative and recursive. The iterative approach works by continuously looping and narrowing down our search space until we find the element or run out of elements to find. The recursive approach works similarly but instead of using a loop construct, we call the function itself recursively with a subset of the original array. The codebox below shows both Implementations and provides comments throughout the code to outline exactly what is happening.
The iterative approach above has the benefit that we are not adding to the call stack each time we narrow down the search space. However, for languages which support tail recursion (a special form of recursion which reuses the same stack frame), the recursive option can be refactored to leverage tail recursion and avoid adding to the stack.
Performance Analysis
Let's compare this approach against the common approach of checking against every element in the array one by one and see how the compare and contrast. The table below gives a synopsis of each approach and compares their performance over arrays of different sizes.
Linear Search | Binary Search | |
---|---|---|
Description | Compare input to every element of an array. | Compare input to some elements of a sorted array by continuously narrowing down the search space. |
Array with 7 Elements | 7 comparisons at most | 3 comparisons at most |
Array with 13 Elements | 13 comparisons at most | 4 comparisons at most |
Array with 31 Elements | 31 comparisons at most | 5 comparisons at most |
Array with 63 Elements | 63 comparisons at most | 6 comparisons at most |
Array with 1023 Elements | 1023 comparisons at most | 10 comparisons at most |
A few observations on the table above:
- As the number of elements grows, the number of comparisons for binary search grows more slowly than that of linear search.
- The example array sizes we've chosen are powers of two minus one.
- We specify at most because there is a chance we will find our item in the array before searching over the entire space.
Based on the number of operations listed above, we can infer that the number of comparisons required to perform a binary search is roughly log2(n) where n is the number of elements in the array. This is much better than the n number of comparisons required by linear search. More formally: O(log(n)), meaning that the "big O" worst case time complexity of this algorithm is log2(n). Within "big O" notation log is assumed to be log2.
Sorting an Array with Binary Search
We can use binary search to sort an array by performing the following operations:
- Create a new empty array of the same size as the input array. We will use this to place items in one by one.
- For each item in the input array:
- Use binary search to find where we should insert our item into the sorted array. Instead of returning -1 when the item doesn't exist, return the index that we "expect it to be in".
- Insert the item at that index.
- Proceed to the next item in the input.
The code below outlines how to perform this operation. Note that the binary search function is modified to not return -1 if an item is not found, as described above.
This performs a binary search for each item in the array. Therefore, it's complexity is O(n*log(n)). This process is called binary insertion sort. There also exists similar sorting algorithms which have the same complexity as this.
When to Use Binary Search vs Linear Search
Consider the following scenario: we receive an array which might not be sorted and need to find a single item in that array. In that scenario, we would need to sort the array first if we wanted to use binary search. The time complexity of sorting the array first is O(n*log(n)) or worse. If all we need to do is find that single element, then sorting + binary search will take more time than linear search; thus linear search would be more appropriate. However, if we need to find more than one item within the array, it may become appropriate to sort + binary search as the number of items grows. We can rephrase that as: the process of comparing the intersection of two arrays can be optimized using binary search . We can sort the larger array, then for each item in the smaller array, use binary search to check if it exists in the larger array. In that case, it appears that using binary search is desired over linear search. A general rule of thumb: use binary search when you need to lookup more than one item (in which case you may need to sort the array first), or you know that the subject array is already sorted, when the size of the array is not small.
Generic Implementation of Binary Search
In the examples above, the code is performing binary search over a set of strings. However, the algorithm can be applied to any type of object in which we can implement some compareTo function. The function below has been slightly changed to allow us to sort over anything which implements IComparable.