Unit 4 - Divide & Conquer
4.3 - Split Search
The idea behind divide & conquer is to reduce a larger problem into easier-to-handle, smaller "sub-problems" that mimic the larger one.
Imagine you have the entire list of active TikTok users in a sorted array as Username, Email and you wanted to search for a specific person (SantaDancer123, sd@gmail.com) to get their phone number or IP address or something. TikTok has over 1 billion active users. You don't know which element in the array is the SantaDancer123 you are looking for.
Q. 🤔 How do we find the index of SantaDancer123 efficiently?
Search Algorithms
Linear Search
A linear approach would be to start at index 0 and check each element if it is equal to the target value. This would be a very inefficient method with the worst-case having to look-at and compare all elements in the array.
Modified Linear Search
A slightly more efficient method would be to check the first letter of the target (in this case, "S"). Since "S" is closer to "Z" than "A", we could perform the linear search in reverse and potentially find the target faster, if the data in the array is normally distributed across the alphabet... What if most usernames start with a letter in the upper-half of the alphabet?
Split Search
Let's play a guessing game. I'm thinking of a number between 1 and 1,000,000. I will tell you if your guesses are too high or low. What is the smartest number to select to start? How about for your second guess?
A much more efficient search method is to exploit the fact that the list is sorted. Start in the middle (or the left/right of middle if there's an even #) and split left/right - repeat! Split Search is more efficient (faster) than linear because it starts at the middle of a sorted array (or list) and eliminates half of the array each pass through the algorithm. Note - this only works on sorted data.
The algorithm needs to take into account an even or odd number of elements in each iteration. The programmer will need to decide which half gets the extra element in an odd number situation - typically the right half, but there is no specific rule.
Also, the element might not exist in the list. In this case we either return -1, undefined, null, or similar.