STL algorithms
for programming interviews

Monday, September 28, 2020

Using C++ in a technical interview can be hazardous unless you have experience using it. Experienced C++ programmers will benefit from a collection of data structures and algorithms that will help them solve problems in real-time. These data structures and algorithms are part of the Standard Template Library (STL). In this post, I will list useful STL algorithms that anyone who’s preparing for technical interviews should be familiar with. I will be dividing my list into two categories: Non-modifying algorithms and modifying algorithms.

In many cases, using a library algorithm to solve a complete problem isn’t acceptable since the interviewer may want to see how you would implement it yourself but there are many cases in which using built-it library functions is perfectly fine. The best way to determine this is by asking the interviewer.

This list isn’t comprehensive, it presents algorithms I found useful for programming interviews over the years and how they can be used.


Non-modifying algorithms

Modifying algorithms


std::count
std::min_element
std::max_element

I’m bundling these algorithms together because they serve the same purpose: iterating through a range and reducing it into a single value. This task comes up often in programming interviews and these three operations are especially common: counting the number of times an element appears in a range and finding out the position of the min/max element in a range.

std::count has a single overload that takes a pair of iterators and the value we want to count. It’s using operator== for equality by default. If you want to pass your own predicate, look no further than std::count_if.

std::min_element and std::max_element in their basic form take two iterators. These algorithms use operator< for comparison by default. Unlike std::count, these algorithms return an iterator instead of a value which is useful if you need to perform additional operations on the range as is often the case. Also unlike std::count, they provide an overload that takes a comparison function.

The implementation of these algorithms is trivial but relying on built-in functions instead will simplify your code substantially. For example, if you need to get the index of the smallest element in an array and you find yourself writing something like this:

int MinElementIdx(const vector<int>& vec) {
    auto min_value = numeric_limits<int>::max();
    auto min_value_idx = 0;
    for (int i = 0; i < size(vec); ++i) {
        if (vec[i] < min_value) {
            min_value = vec[i];
            min_value_idx = i;
        }
    }
    return min_value_idx;
}

It will benefit you greatly to replace it with a built-in function:

int MinElementIdx(const vector<int>& vec) {
    return min_element(begin(vec), end(vec)) - begin(vec);
}

The resulting code isn’t just shorter, it’s also less susceptible to small bugs, and it’s easier to understand – an outcome that is desirable during an interview.

std::accumulate

std::accumulate is related to the algorithms presented above, it too summarizes a range into a single value, but it provides greater flexibility. It comes with some pitfalls as well so we will give it the attention it deserves.

All the algorithms we’re going to cover live in <algorithm> except for std::accumulate that’s hiding under <numeric>. Despite its location, std::accumulate can be used for non-numeric types.

In its basic form it takes a pair of iterators and returns the sum of the elements in the range using operator+. Since operator+ requires two values, it takes a starting value as well. In another form, std::accumulate also takes a function that lets you decide how to compute the sum.

It might sounds straight forward, but there are a couple of things to keep in mind:

  1. The starting value type has to much the container type. If you provide a range of double and a starting value of type int, the return value will be reduced to an int.
  2. The second overload of std::accumulate takes a function that makes it possible to do all kinds of things with it, but to maintain the expressivity of your code the return value should always represent a useful summary of the elements in the range.

If you’re asking yourself what kinds of things can be done with such a simple algorithm, I encourage you to watch “std::accumulate: Exploring an Algorithmic Empire” by Ben Deane. We will keep things simple with our example. Let’s write a function that takes an array of positive integers and returns the sum of all the even numbers in the array:

int sum_even_numbers(const vector<int>& arr) {
    return accumulate(begin(arr), end(arr), 0, [](int acc, int x){
        return x % 2 == 0 ? acc + x : acc;
    });
}

std::find
std::find_if
std:find_if_not

Looking for values in a container is one of the most common tasks you need to perform during technical interviews (second only to sorting), and the STL gives you several options to choose from. The most important thing to keep in mind is that “find algorithms” should only be used for unsorted ranges. The next group of algorithms will deal with sorted ranges and cover the difference.

std::find takes a pair of iterators and the value you are looking for. It’s using operator== for equality and returns an iterator to the first element that is matching it. If the value you are looking for is not found in the range std::find returns an iterator to the last element.

std::find_if and std::find_if_not are similar to std::find except they take a unary predicate as well as a pair of iterators to determine equality. The difference between the two is std::find_if looks for the first match while std::find_if_not looks for the first mismatch.

I’ve already mentioned that algorithms that return iterators can be useful so here’s an example. Let’s write a function that trims leading spaces from a string:

string trim_leading_spaces(const string& str) {
    return {
        find_if_not(begin(str), end(str), [](char c){
            return c == ' ';
        }),
        end(str)
    };
}

You might be wondering what’s the purpose of having std::find_if and std::find_if_not? We could have just as easily implemented the same function using std::find_if by negating the return value of the predicate:

find_if(begin(str), end(str), [](char c){
    return c != ' ';
});

The answer to this question, and the reason I’m presenting both algorithms, is expressivity. If we use std::find_if and negate its return value, the meaning of the algorithm that we derive from its name changes.

std::lower_bound
std::upper_bound

If you are preparing for technical interviews you should know all about binary search: a sorted-range algorithm that finds values in logarithmic time. Many questions require you to implement a specialized variation of binary search, but in some cases, a built-in algorithm will do.

The STL provides an algorithm called std::binary_search, but it’s not as useful as the two algorithms we’ll be discussing next since it returns a boolean if a given value exists in a range instead of the position of that value. Since a given value may appear more than once, we need two variations of an algorithm that takes a pair of iterators and a value to search for:

  • std::lower_bound returns an iterator that points to the first element equal to or greater than value.
  • std::upper_bound returns an iterator that points to the first element that’s greater than value.

There’s a subtle but important difference between both algorithms. Suppose you are given a sorted array of integers [1, 3, 3, 5, 6, 7] and you want to find the position of the value 3, which index should be returned? Sometimes we want to find the lower-bound and sometimes the upper-bound:

int find_first_of_element(const vector<int>& vec, int val) {
    return lower_bound(begin(vec), end(vec), val) - begin(vec);
}

int find_last_of_element(const vector<int>& vec, int val) {
    return upper_bound(begin(vec), end(vec), val) - begin(vec) - 1; 
}

int main() {
    vector<int> vec = {1, 3, 3, 5, 6, 7};
    cout << find_first_of_element(vec, 3) << '\n'; // prints 1
    cout << find_last_of_element(vec, 3) << '\n'; // prints 2
}

While there are several sorted-range algorithms that employ binary search, the ones I found most useful for technical interviews are std::lower_bound and std::upper_bound. It’s important to remember that to maintain logarithmic time complexity these algorithms require random-access iterators (array, vector, string, and deque). They would work otherwise but the time complexity will most likely be linear. According to the standard, passing unsorted ranges to std::lower_bound and std::upper_bound is undefined behaviour so if you are seeing unexpected results, make sure this invariant holds true.


std::shuffle

std::shuffle comes in handy when you need to generate random permutations for a range such that each permutation has an equal probability of appearing. It has a single overload that takes a pair of iterators and a uniform random number generator.

std::shuffle is usually implemented using the Fisher-Yates shuffle algorithm which is relatively simple so you should know how to implement yourself, however, if shuffling is an implementation detail, using a built-in function is acceptable.

Suppose we need to create a type initialized with an array of integers. This type should provide two methods:

  • Shuffle() which returns a random permutation of the original array.
  • Reset() which resets the array to its original state and returns it.

We can use std::shuffle to solve this problem:

class Array {
public:
    explicit Array(const vector<int>& arr)
        : arr(arr), src(arr), seed(random_device{}()){}

    vector<int> Shuffle() {
        shuffle(begin(arr), end(arr), seed);
        return arr;
    }

    vector<int> Reset() { return src; }

private:
    vector<int> arr, src;
    default_random_engine seed;
};

If you look up shuffling in C++, you might come across std::random_shuffle which is the deprecated predecessor of std::shuffle. It’s simpler to use because you don’t need to give it a random number generator since it’s using rand() in the background. You should avoid rand() because it only generates random integral values between 0 and RAND_MAX whos value is implementation-dependent. To learn more about rand() and why you shouldn’t use it, I strongly recommend watching rand() Considered Harmful by Stephan T. Lavavej.

std::next_permutation

Generating permutations is something that comes up often in programming interviews. Most ‘permutations’ problems can be solved with an algorithm that returns the next lexicographically-ordered permutation of a range. In fact, a popular question is implementing this function yourself. If the implementation is not central to the question, you should use std::next_permutation.

In its simplest form, std::next_permutation takes a pair of iterators. It uses operator< to determine the next lexicographically-ordered permutation based on the current ordering of the range. This algorithm modifies the range in place and returns true if the next permutation is greater, otherwise, it returns false.

Since this algorithm is applicable to so many problems I want to provide you with two examples. For the first one (and perhaps most common) suppose we need to write a function that takes an array of integers and returns all possible permutations. The most common solution for this problem involves backtracking, but you can also use std::next_permutation without compromising time complexity:

vector<vector<int>> Permutations(const vector<int>& nums) {
    vector<vector<int>> output;
    auto permutation = nums;
    do {
        next_permutation(begin(permutation), end(permutation));
        output.emplace_back(permutation);
    } while (permutation != nums);
    return output;
}

For the second problem, suppose we are given a positive integer n, and we need to write a function that returns the smallest integer that contains the same digits as n and is greater in value than n. We can easily conclude from the problem statement that what we need is the next permutation of n, if such permutation exist.

int NextGreaterElement(int n) {
    auto permutation = to_string(n);
    if (!next_permutation(begin(permutation), end(permutation))) {
        return -1; // the next permutation isn't greater than n
    }
    return stoi(permutation);
}

std::rotate

Rotating elements in a container is a common building block in many algorithms and programming interview questions. In this context, rotation means shifting n elements in a container to the left or to the right in a circular manner. For example, suppose you’re asked to shift three elements in the following array to the right [1, 2, 3, 4, 5, 6], the output should look like this: [4, 5, 6, 1, 2, 3].

If you are wondering how this algorithm can be put into good use, consider the following example of an insertion sort algorithm pulled directly from cppreference.com:

vector<int> v = {2, 4, 2, 0, 5, 10, 7};
for (auto i = v.begin(); i != v.end(); ++i) {
    rotate(upper_bound(v.begin(), i, *i), i, i + 1);
}

std::rotate takes three forward iterators: An iterator for the beginning of the original range, an iterator for the element that should appear at the beginning of the rotated range, and finally, an iterator for the end of the original range.

In an interview, you may be asked to rotate an array to the right k times, in which case your interviewer is probably interested in how you’d implement this algorithm yourself, but suppose you are given a sorted array that was rotated k times and you are asked to rotate it back into its original position. We can solve this problem with a single line of code:

void RotateBack(vector<int>& arr, int k) {
    rotate(begin(arr), begin(arr) + size(arr) - k + 1, end(arr));
}

We could have used a sorting algorithm to solve this problem but the time complexity of std::rotate is linear in the distance of the original range, better than most standard sorting algorithms.

std::remove
std::fill

std::remove is one of the most confusing and commonly used algorithms in the STL. It’s confusing because it doesn’t remove anything, and there’s a good reason for it. Every container has a different way of erasing elements depending on its underlying data-structure, and std::remove doesn’t have a way to reflect on the type of the container it’s operating on.

So what does it do? std::remove moves all the elements that should not be removed to the beginning of the range – while maintaining their relative order – and returns an iterator to the new logical end of the range. Unlike std::stable_partition however, std::remove doesn’t move the “removed” elements to the back of the range, instead, it simply writes over them.

When we use std::remove, we usually want to remove elements from a range, and that’s why it’s often called in conjunction with an erase member function. Together, these two function calls make up the erase-remove idiom.

std::remove takes a pair of iterators and the value that should be removed. If you want to provide a predicate, you can use std::remove_if instead. Most containers provide an erase member function that takes a range. Now, suppose you are given an array of integers [1, 7, 0, 3, 5, 9, 0, 3, 4] and you are asked to remove all the 0’s from it, here’s how you’d do it:

void RemoveZeroesFromArray(vector<int>& arr) {
    arr.erase(remove(begin(arr), end(arr), 0), end(arr));
}

How does std::fill fits in with std::remove? As I mentioned before, when we call std::remove we usually want to erase elements, but we can use it to solve other problems too. Consider the following: Given an array of integers, write a function that moves all the 0’s to the end while maintaining the relative order of the non-zero elements. If we were to use std::stable_partition we couldn’t guarantee a linear time complexity, but we can using fill-remove:

void MoveZeroes(vector<int>& nums) {
    fill(remove(begin(nums), end(nums), 0), end(nums), 0);
}