Menengah Exercism • rust

Binary Search

Ringkasan Tantangan

Implement a binary search algorithm.

Introduction

You have stumbled upon a group of mathematicians who are also singer-songwriters. They have written a song for each of their favorite numbers, and, as you can imagine, they have a lot of favorite numbers (like 0 or 73 or 6174).

You are curious to hear the song for your favorite number, but with so many songs to wade through, finding the right song could take a while. Fortunately, they have organized their songs in a playlist sorted by the title — which is simply the number that the song is about.

You realize that you can use a binary search algorithm to quickly find a song given the title.

Instructions

Your task is to implement a binary search algorithm.

A binary search algorithm finds an item in a list by repeatedly splitting it in half, only keeping the half which contains the item we’re looking for. It allows us to quickly narrow down the possible locations of our item until we find it, or until we’ve eliminated all possible locations.

Binary search only works when a list has been sorted.

The algorithm looks like this:

Find the middle element of a sorted list and compare it with the item we’re looking for.
If the middle element is our item, then we’re done!
If the middle element is greater than our item, we can eliminate that element and all the elements after it.
If the middle element is less than our item, we can eliminate that element and all the elements before it.
If every element of the list has been eliminated then the item is not in the list.
Otherwise, repeat the process on the part of the list that has not been eliminated.

Here’s an example:

Let’s say we’re looking for the number 23 in the following sorted list: [4, 8, 12, 16, 23, 28, 32].

We start by comparing 23 with the middle element, 16.
Since 23 is greater than 16, we can eliminate the left half of the list, leaving us with [23, 28, 32].
We then compare 23 with the new middle element, 28.
Since 23 is less than 28, we can eliminate the right half of the list: [23].
We’ve found our item.

Dig Deeper

Looping

use std::cmp::Ordering;

pub fn find<U: AsRef<[T]>, T: Ord>(array: U, key: T) -> Option<usize> {
    let array = array.as_ref();
    let mut left = 0usize;
    let mut right = array.len();
    let mut mid: usize;

    while left < right {
        mid = (left + right) / 2;
        match array[mid].cmp(&key) {
            Ordering::Equal => return Some(mid),
            Ordering::Less => {
                left = mid + 1;
            }
            Ordering::Greater => {
                right = mid;
            }
        }
    }
    None
}

This approach starts by using the Ordering enum.

The find() function has a signature to support the optional generic tests. To support slices, arrays and Vecs, which can be of varying lengths and sizes at runtime, the compiler needs to be given informaton it can know at compile time. A reference to any of those containers will always be of the same size (essentially the size of a pointer), so AsRef is used to constrain the generic type to be anything that is a reference or can be coerced to a reference.

The <[T]> is used to constrain the reference type to an indexable type T. The T is constrained to be anything which implements the Ord trait, which essentially means the values must be able to be ordered.

So, the key is of type T (orderable), and the array is of type U (a reference to an indexable container of orderable values of the same type as the key.)

Although array is defined as generic type U, which is constrained to be of type AsRef, the as_ref() method is used to get the reference to the actual type. Without it, the compiler would complain that “no method named len found for type parameter U in the current scope” and “cannot index into a value of type U”.

Next. some variables of type usize are defined to keep track of where we are in the container.

The loop runs while left is less than right. If the array is empty, then left and right will both be 0, and the loop won’t run, and the function will return None.

Inside the loop, the midpoint between left and right is used to get the element of the array at the index of the midpoint value. The cmp() method of the Ord trait is used to compare that element value with the key value. Since the element is a reference, the key must also be referenced.

The match arms each use a value from the Ordering enum.

If the midpoint element value equals the key, then the midpoint is returned from the function wrapped in a Some.
If the midpoint element value is less than the key, then the left value is adjusted to be one to the right of the midpoint.
If the midpoint element value is greater than the key, then the right value is adjusted to be the midpoint.

While the element value is not equal to the key, the number of elements being searched keeps getting halved until either the key is found, or, if it is not in the array, the bounds meet and the loop no longer runs.

Recursion

use std::cmp::Ordering;

fn find<U: AsRef<[T]>, T: Ord>(array: U, key: T) -> Option<usize> {
    let array = array.as_ref();
    if array.is_empty() {
        return None;
    }
    let mid = array.len() / 2;

    match array[mid].cmp(&key) {
        Ordering::Equal => Some(mid),
        Ordering::Greater => find(&array[..mid], key),
        Ordering::Less => find(&array[mid + 1..], key).map(|p| p + mid + 1),
    }
}

This approach starts by using the Ordering enum.

So, the key is of type T (orderable), and the array is of type U (a reference to an indexable container of orderable values of the same type as the key.)

If the array is empty, then None is returned.

The midpoint of the array is used to get the element of the array at the index of the midpoint value. The cmp() method of the Ord trait is used to compare that element value with the key value. Since the element is a reference, the key must also be referenced.

The match arms each use a value from the Ordering enum.

If the midpoint element value equals the key, then the midpoint is returned from the function wrapped in a Some.
If the midpoint element value is greater than the key, then find() calls itself, passing a slice of the array from the beginning up to but not including the midpoint element.
If the midpoint element value is less than the key, then find() calls itself, passing a slice of the array from the element to the right of the midpoint through the end of the array. The return postion from the recursive call is the midpoint start from the new left(mid + 1), so we add mid + 1 to the return postion.

While the element value is not equal to the key, find() keeps calling itself while halving the number of elements being searched, until either the key is found, or, if it is not in the array, the array is whittled down to empty.

Source: Exercism rust/binary-search