Spaceship Operator

Simon Brand from Simon Brand

You write a class. It has a bunch of member data. At some point, you realise that you need to be able to compare objects of this type. You sigh and resign yourself to writing six operator overloads for every type of comparison you need to make. Afterwards your fingers ache and your previously clean code is lost in a sea of functions which do essentially the same thing. If this sounds familiar, then C++20’s spaceship operator is for you. This post will look at how the spaceship operator allows you to describe the strength of relations, write your own overloads, have them be automatically generated, and how correct, efficient two-way comparisons are automatically rewritten to use them.

Relation strength

The spaceship operator looks like <=> and its official C++ name is the “three-way comparison operator”. It is so called because it is used by comparing two objects, then comparing that result to 0, like so:

(a <=> b) < 0  //true if a < b
(a <=> b) > 0  //true if a > b
(a <=> b) == 0 //true if a is equal/equivalent to b

One might think that <=> will simply return -1, 0, or 1, similar to strcmp. This being C++, the reality is a fair bit more complex, but it’s also substantially more powerful.

Not all equality relations were created equal. C++’s spaceship operator doesn’t just let you express orderings and equality between objects, but also the characteristics of those relations. Let’s look at two examples.

Example 1: Ordering rectangles

We have a rectangle class and want to define comparison operators for it based on its size. But what does it mean for one rectangle to be smaller than another? Clearly if one rectangle is 1cm by 1cm, it is smaller than one which is 10cm by 10cm. But what about one rectangle which is 1cm by 5cm and another which is 5cm by 1cm? These are clearly not equal, but they’re also not less than or greater than each other. But speaking practically, having all of <, <=, ==, >, and >= return false for this case is not particularly useful, and it breaks some common assumptions about these operators, such as (a == b || a < b || a > b) == true. Instead, we can say that == in this case models equivalence rather than true equality. This is known as a weak ordering.

Example 2: Ordering squares

Similar to above, we have a square type which we want to define comparison operators for with respect to size. In this case, we don’t have the issue of two objects being equivalent, but not equal: if two squares have the same area, then they are equal. This means that == models equality rather than equivalence. This is known as a strong ordering.

Describing relations

Three-way comparisons allow you to express the strength of your relation, and whether it allows just equality or also ordering. This is achieved through the return type of operator<=>. Five types are provided, and stronger relations can implicitly convert to weaker ones, like so:

[Figure: relation conversions — std::strong_ordering converts to std::weak_ordering and std::strong_equality; std::weak_ordering converts to std::partial_ordering; and every category converts to std::weak_equality.]

Strong and weak orderings are as described in the above examples. Strong and weak equality means that only == and != are valid. Partial ordering is weaker than weak_ordering in that it also allows unordered values, like NaN for floats.
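To make this concrete, here’s a minimal sketch of what the rectangle and square comparisons from the earlier examples might look like (rect and square are hypothetical types, compared by area, and <compare> is assumed to be included):

struct rect {
  int w, h;

  // 1cm x 5cm and 5cm x 1cm compare equal through <=> yet are
  // distinguishable, so this models equivalence: a weak ordering.
  std::weak_ordering operator<=>(rect const& rhs) const {
    return w * h <=> rhs.w * rhs.h;
  }
};

struct square {
  int side;

  // Squares with equal sides really are equal: a strong ordering.
  std::strong_ordering operator<=>(square const& rhs) const {
    return side <=> rhs.side;
  }
};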

These types have a number of uses. Firstly they indicate to users of the types what kind of relation is modelled by the comparisons, so that their behaviour is more clear. Secondly, algorithms could potentially have more efficient specializations for when the orderings are stronger, e.g. if two strongly-ordered objects are equal, calling a function on one of them will give the same result as the other, so it would only need to be carried out once. Finally, they can be used in defining language features, such as class type non-type template parameters, which require the type to have strong equality so that equal values always name the same template instantiation.

Writing your own three-way comparison

You can provide a three-way comparison operator for your type in the usual way, by writing an operator overload:

struct foo {
  int i;

  std::strong_ordering operator<=>(foo const& rhs) const {
    return i <=> rhs.i;
  }
};

Note that whereas two-way comparisons should be non-member functions so that implicit conversions are done on both sides of the operator, this is not necessary for operator<=>; we can make it a member and it’ll do the right thing.

Sometimes you may find that you want to do <=> on types which don’t have an operator<=> defined. The compiler won’t try to work out a definition for <=> based on the other operators, but you can use std::compare_3way instead, which will fall back on two-way comparisons if there is no three-way version available. For example, we could write a three-way comparison operator for a pair type like so:

template<class T, class U>
struct pair {
  T t;
  U u;

  auto operator<=>(pair const& rhs) const
    -> std::common_comparison_category_t<
         decltype(std::compare_3way(t, rhs.t)),
         decltype(std::compare_3way(u, rhs.u))> {
    if (auto cmp = std::compare_3way(t, rhs.t); cmp != 0) return cmp;
    return std::compare_3way(u, rhs.u);
  }
};

std::common_comparison_category_t there determines the weakest relation given a number of them. E.g. std::common_comparison_category_t<std::strong_ordering, std::partial_ordering> is std::partial_ordering.
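We can sanity-check that with a static_assert (a sketch assuming <compare> and <type_traits> are included):

// The weakest of the two categories wins.
static_assert(std::is_same_v<
  std::common_comparison_category_t<std::strong_ordering, std::partial_ordering>,
  std::partial_ordering>);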

If you found the previous example a bit too complex, you might be glad to know that C++20 will support automatic generation of comparison operators. All we need to do is =default our operator<=>:

auto operator<=>(x const&) const = default;

Simple! This will carry out a lexicographic comparison of each base class, then each member of the type, in order of declaration.
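For example, a hypothetical point type gets its entire ordering from one defaulted declaration; x is compared first, and y only if the xs are equal:

struct point {
  int x;
  int y;

  auto operator<=>(point const&) const = default;
};

static_assert(point{1, 2} < point{1, 3}); // xs equal, so y decides
static_assert(point{1, 2} < point{2, 0}); // xs differ, y never examined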

Automatically-rewritten two-way comparisons

Although <=> is very powerful, most of the time we just want to know if some object is less than another, or equal to another. To facilitate this, two-way comparisons like < can be rewritten by the compiler to use <=> if a better match is not found.

The basic idea is that for some operator @, an expression a @ b can be rewritten as (a <=> b) @ 0. It can even be rewritten as 0 @ (b <=> a) if that is a better match, which means we get symmetry for free.
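As a sketch of that symmetry (meters is a hypothetical type, comparable with double in only one declared direction):

struct meters {
  double value;

  std::partial_ordering operator<=>(double rhs) const {
    return value <=> rhs;
  }
};

void example() {
  meters m{10.0};
  bool a = m < 42.0;  // rewritten as (m <=> 42.0) < 0
  bool b = 42.0 < m;  // no operator<(double, meters) exists, but this is
                      // rewritten as 0 < (m <=> 42.0)
}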

In some cases, this can actually provide a performance benefit. Quite often comparison operators are implemented by writing == and <, then writing the other operators in terms of those rather than duplicating the code. This can lead to situations where we want to check <= and end up doing an expensive comparison twice. This automatic rewriting can avoid that cost, since it will only call the one operator<=> rather than both operator< and operator==.

Conclusion

The spaceship operator is a very welcome addition to C++. It gives us more expressiveness in how we define our relations, lets us write less code to define them (sometimes even just a defaulted declaration), and avoids some of the performance pitfalls of manually implementing some comparison operators in terms of others.

For more details, have a look at the cppreference articles for comparison operators and the compare header, and Herb Sutter’s standards proposal. For a good example on how to write a spaceship operator, see Barry Revzin’s post on implementing it for std::optional. For more information on the mathematics of ordering with a C++ slant, see Jonathan Müller’s blog post series on the mathematics behind comparison.

Evolutionary pressures on C++, Java and Python

Derek Jones from The Shape of Code

The future evolution of C++, Java and Python is being driven by very different interested parties, and it’s going to be interesting watching events unfold over the next 5-10 years.

I have previously written about how the C++ Standard’s committee is past its sell-by date, has taken off its ball and chain and is now in the hands of bored consultants.

Bjarne Stroustrup was once effectively treated as C++’s Benevolent Dictator For Life (during the production of the first C++ Standard some people were labeled as Bjarne groupies); things have moved on since then, but the ‘old-guard’ are trying to make a comeback. Suggesting that people ought to base their thinking on a book published almost 25 years ago (Stroustrup’s “The Design and Evolution of C++”; a very interesting book that is well worth reading) creates a rather backward-looking image. Bored consultants are looking to work on exciting new ideas. The old-guard need to appear modern to attract followers (even if the ideas are old ideas with a fresh coat of paint).

The threat to C++ is from bored consultants, each adding their own pet idea to the language standard; a situation that Stroustrup thinks is starting to happen.

Java, the language, is owned by Oracle, the company (let’s not get too involved in exactly what they own, have copyright on, etc). Oracle are not shy about asking people for licensing fees. Java is now on a 6-month release cycle (at least the Oracle version; there are Open Source implementations) and the free support only applies to the current release; paying a license fee buys support for versions older than 6 months. In the short term, the cheapest solution is for companies to pay for support.

Oracle are always happy to send in the lawyers and if too many customers switch to non-Oracle implementations, I’m sure something can be found to introduce enough uncertainty to discourage work/distribution involving Open Source Java implementations.

Will Java survive Oracle’s licensing? It is not in their interest for Java to die; Oracle will adjust their terms to keep the money flowing in, but over the longer term I think willing Java developers are going to be hard to find.

Guido van Rossum recently removed himself from the post of Python’s Benevolent Dictator For Life. One of the jobs of a benevolent dictator is maintaining some degree of language coherence, which involves preventing people’s pet ideas from being added to the language. Does this mean that Python is slowly going to become more and more bloated? Perhaps, but I think a more likely problem is a language fork, multiple implementations of slightly different (at first) languages all claiming to be Python.

These days, the strength of Python is its large collection of very useful, commercial grade, packages, and future language details may turn out to be irrelevant. There is a lot to learn from the Python 2/3 transition, but true believers like to think that things will turn out differently for them.

Super Simple Named Boolean Parameters

Simon Brand from Simon Brand

Quite often we come across interfaces with multiple boolean parameters, like this:

cake make_cake (bool with_dairy, bool chocolate_sauce, bool poison);

A call to this function might look like:

auto c = make_cake(true, true, false);

Unfortunately, this is not very descriptive. We need to look at the declaration of the function to know what each of those bools mean, and it would be way too easy to accidentally flip the wrong argument during maintenance and end up poisoning ourselves rather than having a chocolatey treat.

There are many solutions which people use, from just adding comments to each argument, to creating tagged types. I recently came across a very simple trick which I had not seen before:

#define K(name) true

Yep, it’s a macro which takes an argument and gets replaced by true. This may look strange, but it enables us to write this:

auto c = make_cake(K(with_dairy), !K(chocolate_sauce), !K(poison));

// or using `not` if you prefer
auto c = make_cake(K(with_dairy), not K(chocolate_sauce), not K(poison));

Of course, it’s up to you whether you think using a macro for this kind of thing is worth it, or if you think that macro should have a name which is less likely to collide. Personally I think it’s quite a neat trick to solve this issue in a low-overhead manner.
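For instance, a more collision-resistant spelling might look like this (NAMED_BOOL is a hypothetical name, not part of the original trick):

#define NAMED_BOOL(name) true

auto c = make_cake(NAMED_BOOL(with_dairy),
                   not NAMED_BOOL(chocolate_sauce),
                   not NAMED_BOOL(poison));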

I guess I need to come up with a marketing name for this. Since we already have X Macros, I hereby dub this trick “K Macros”.

Function Template Partial Ordering: Worked Examples

Simon Brand from Simon Brand

C++ function overloading rules are complex. C++ template rules are complex. Put the two together, and you unfortunately do not get something simple; you get a hideous monster of standardese which requires great patience and knowledge to overcome. However, since C++ is mostly corner-cases, it can pay to understand how the rules apply for those times where you just can’t work out why your code won’t compile. This post will present a few step-by-step examples of how partial ordering of function templates works in order to arm you for these times of need.

Partial ordering of function templates is a step of overload resolution. It occurs when you call a function template which is overloaded and the compiler needs to decide which one is more specialized than the other. Consider this code:

template<class T> void f(T);        //(1)
template<class T> void f(T const*); //(2)

int const* p = nullptr;
f(p);

We expect f(p) to call (2), because p is an int const*. In order to decide that (2) is more specialized than (1), the compiler needs to follow the function template partial ordering rules. Let’s see what the standard has to say.

Partial ordering selects which of two function templates is more specialized than the other by transforming each template in turn (see next paragraph) and performing template argument deduction using the function type.

This is not so complicated, even if the terms may be unfamiliar. Unfortunately, the “next paragraph” and the sections which it references are almost impossible to parse without a lot of background knowledge and re-readings, so I shall step through the algorithm rather than pasting the rules for you to cry over.

There are four steps which we have to work through:

  1. Transforming (1).
  2. Performing deduction on (2) with the transformed template from step 1.
  3. Transforming (2).
  4. Performing deduction on (1) with the transformed template from step 3.

If one and only one of the deductions succeeds, then the template with which the deduction was performed is more specialized than the other.

Step 1: The rules state that for each template parameter, we create some unique type to use in its stead¹. Let’s call this unique type type_0. You can pretend that this was defined somewhere like class type_0{};. Now we take our function template template <class T> void f(T) and substitute in type_0 for T. This gives us void f(type_0). The transformation is complete.

Step 2: Now that we have transformed template <class T> void f(T) into void f(type_0), we will perform deduction on (2) using the transformed function type. To do this, we imagine a call to (2) where the arguments have the type of the parameters for (1). Concretely, it would look like this:

template <class T> void func_2(T const*);
func_2(type_0{}); //derived from void f(type_0)

Would this call succeed? We can put it into our compiler to find out. GCC 8.1 says:

<source>: In function 'int main()':
<source>:4:18: error: no matching function for call to 'func_2(type_0)'
   func_2(type_0{});
                  ^
<source>:1:25: note: candidate: 'template<class T> void func_2(const T*)'
 template <class T> void func_2(T const*);
                         ^~~~~~
<source>:1:25: note:   template argument deduction/substitution failed:
<source>:4:18: note:   mismatched types 'const T*' and 'type_0'
   func_2(type_0{});
                  ^   

So deduction from (1) to (2) fails, because the invented type type_0 cannot be used to deduce const T*.

Step 3: Let’s try from (2) to (1). Again, we’ll transform (2) from template <class T> void f(T const*) to void f(type_0 const*).

Step 4: Now we attempt deduction:

template <class T> void func_1(T);
type_0 const* arg = nullptr;
func_1(arg);

This succeeds because a type_0 const* can be used to deduce T. Since deduction from (1) to (2) fails, but deduction from (2) to (1) succeeds, (2) is more specialized than (1) and will be chosen by overload resolution.


Let’s try a different example. How about:

template<class T> void g(T);  //(1)
template<class T> void g(T&); //(2)
int i = 0;
g(i);

(1) transforms to void g(type_0). Before we try deduction, we need to apply one of the numerous additional rules from the standard, which says we need to replace references with the type being referred to. So template <class T> void g(T&) becomes template <class T> void g(T). Deduction time:

template<class T> void func_2(T);
func_2(type_0{});

This succeeds.

Now the other direction. template<class T> void g(T&) transforms to void g(type_0&), then we remove the reference to get void g(type_0). Our second deduction:

template<class T> void func_1(T);
func_1(type_0{});

This is effectively identical to the previous one, so of course it succeeds.

Since deduction succeeded in both directions, the call is ambiguous. Sure enough, GCC diagnoses:

<source>: In function 'int main()':
<source>:5:8: error: call of overloaded 'g(int&)' is ambiguous
     g(i);
        ^
<source>:1:24: note: candidate: 'void g(T) [with T = int]'
 template<class T> void g(T);  //(1)
                        ^
<source>:2:24: note: candidate: 'void g(T&) [with T = int]'
 template<class T> void g(T&); //(2)
                        ^ 

This is why the algorithm is a partial ordering: sometimes two function templates are not ordered.


I’ll give one more example. This one has multiple parameters and is a bit more subtle.

template<class T> struct identity { using type = T; };
template<class T> struct A {};

template<class T, class U> void h(typename identity<T>::type, U); //(1)
template<class T, class U> void h(T, A<U>);                       //(2)
h<int>(0, A<void>{});

identity here just evaluates to its template argument, but the important thing to note is that typename identity<T>::type is a non-deduced context, so T cannot be deduced from the argument for that parameter.
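As a quick illustration of a non-deduced context on its own (f here is a hypothetical helper, separate from the overloads above):

template<class T> void f(typename identity<T>::type);

void example() {
  f(42);      // error: T cannot be deduced through identity<T>::type
  f<int>(42); // fine: T is supplied explicitly, so no deduction is needed
}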

(1) transforms to void h(typename identity<type_0>::type, type_1), which is void h(type_0, type_1). Attempt deduction on (2):

template<class T, class U> void func_2(T, A<U>);
func_2(type_0{}, type_1{});

This fails because we can’t match type_1 against A<U>.

(2) transforms to void h(type_0, A<type_1>). Try deduction against (1):

template<class T, class U> void func_1(typename identity<T>::type, U);
func_1(type_0{}, A<type_1>{});

This fails because typename identity<T>::type is a non-deduced context, so we can’t deduce T.

In the example from the last section deduction succeeded both ways so the call was ambiguous. In this example, deduction fails both ways, which is also an ambiguous call.


That’s the last of the examples. Of course, there are a bunch of rules which I didn’t cover here, like Concepts, parameter packs, non-type/template template parameters, and cases where both the argument and parameter types are references. Hopefully you now have enough of an intuition that you can understand what the standard says when you inevitably hit those corner cases. If you have any partial ordering conundrums, drop them down in the comments below or send them to me on Twitter.


  1. I’ll ignore non-type template parameters and template template parameters for simplicity, but the rules are essentially the same. 

std::accumulate vs. std::reduce

Simon Brand from Simon Brand

std::accumulate has been a part of the standard library since C++98. It provides a way to fold a binary operation (such as addition) over an iterator range, resulting in a single value. std::reduce was added in C++17 and looks remarkably similar. This post will explain the difference between the two and when to use one or the other.

Let’s start by looking at their interfaces, beginning with std::accumulate.

template< class InputIt, class T >
T accumulate( InputIt first, InputIt last, T init );

template< class InputIt, class T, class BinaryOperation >
T accumulate( InputIt first, InputIt last, T init,
              BinaryOperation op );

std::accumulate takes an iterator range and an initial value for the accumulation. You can optionally give it a binary operation to do the reduction, which will default to addition. It will call this operation on the initial value and the first element of the range, then on the result and the second element of the range, etc. Here are two equivalent calls:

auto sum1 = std::accumulate(begin(vec), end(vec), 0);
auto sum2 = std::accumulate(begin(vec), end(vec), 0, std::plus<>{});

std::reduce has a fair few more overloads to get your head round, but has a very similar interface once you understand them:

template<class InputIt>
typename std::iterator_traits<InputIt>::value_type reduce(
    InputIt first, InputIt last);

template<class ExecutionPolicy, class ForwardIt>
typename std::iterator_traits<ForwardIt>::value_type reduce(
    ExecutionPolicy&& policy,
    ForwardIt first, ForwardIt last);

template<class InputIt, class T>
T reduce(InputIt first, InputIt last, T init);

template<class ExecutionPolicy, class ForwardIt, class T>
T reduce(ExecutionPolicy&& policy,
         ForwardIt first, ForwardIt last, T init);

template<class InputIt, class T, class BinaryOp>
T reduce(InputIt first, InputIt last, T init, BinaryOp binary_op);

template<class ExecutionPolicy, class ForwardIt, class T, class BinaryOp>
T reduce(ExecutionPolicy&& policy,
         ForwardIt first, ForwardIt last, T init, BinaryOp binary_op);

The differences here are:

  • You can optionally pass an execution policy.
  • The overloads which take an execution policy take forward iterators, whereas the policy-free overloads take input iterators.
  • The initial value is optional, defaulting to a default-constructed value of the iterator’s value type.

I’ll talk about these points in turn.

Execution policies

Execution policies are a C++17 feature which allows programmers to ask for algorithms to be parallelised. There are three execution policies in C++17:

  • std::execution::seq – do not parallelise
  • std::execution::par – parallelise
  • std::execution::par_unseq – parallelise and vectorise (requires that the operation can be interleaved, so no acquiring mutexes and such)

The idea behind execution policies is that you can change a serial algorithm to a parallel algorithm simply by passing an additional argument to the function:

auto sum1 = std::reduce(begin(vec), end(vec));                      //sequential
auto sum2 = std::reduce(std::execution::seq, begin(vec), end(vec)); //sequential
auto sum3 = std::reduce(std::execution::par, begin(vec), end(vec)); //parallel

Allowing parallelisation is the main reason for the addition of std::reduce. Let’s look at an example where we want to sum up all the elements in an array. With std::accumulate it looks like this:

[Figure: std::accumulate with std::plus — a strictly sequential left fold over the elements.]

Note that each step of the computation relies on the previous computation, i.e. this algorithm will execute serially and we make no use of hardware parallel processing capabilities. If we use std::reduce with the std::execution::par policy then it could look like this:

[Figure: std::reduce with std::plus — the elements combined as a balanced tree, so independent additions can happen in parallel.]

This is a trivial amount of data for processing in parallel, but the benefit gained when the data size is scaled up should be clear: some of the operations can be executed independently of others, so they can be done in parallel.

A common question is: why do we need an entirely new algorithm for this? Why can’t we just overload std::accumulate? For an example of why we can’t do this, let’s use std::minus<> instead of std::plus<> as our reduction operation. With std::accumulate we get this:

[Figure: std::accumulate with std::minus — a strictly sequential left fold, giving the expected result.]

However, if we try to use std::reduce, we could get something like:

[Figure: std::reduce with std::minus — the operands regrouped and reordered, giving a different, incorrect result.]

Uh oh. We just broke our code.

We got the wrong answer because of the mathematical properties of subtraction. You can’t arbitrarily reorder the operands, or compute the operations out of order when doing subtraction. This is formalised in the properties of commutativity and associativity.

A binary operation ∗ on a set S is associative if the following equation holds for all x, y, and z in S:

(x ∗ y) ∗ z = x ∗ (y ∗ z)

An operation is commutative if:

x ∗ y = y ∗ x

Unfortunately there’s no way for the compiler to reliably check that some operation is associative and commutative, so we’re stuck with having std::reduce as a separate algorithm. In concepts speak, this is called an axiom: a requirement imposed on semantics which cannot generally be statically verified.
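To see it go wrong concretely, here’s a minimal sketch; the value printed by the std::reduce line is unspecified, which is exactly the problem:

#include <functional>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
  std::vector<int> v{1, 2, 3, 4};

  // Strict left fold: (((100 - 1) - 2) - 3) - 4 == 90.
  std::cout << std::accumulate(v.begin(), v.end(), 100, std::minus<>{}) << '\n';

  // std::reduce may regroup and reorder the operands, and subtraction is
  // neither associative ((8 - 4) - 2 != 8 - (4 - 2)) nor commutative,
  // so this need not print 90.
  std::cout << std::reduce(v.begin(), v.end(), 100, std::minus<>{}) << '\n';
}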

Input vs. Forward Iterators

I couldn’t find any reasoning in the original standards papers for the separate overloads for input and forward iterators. However, my educated guess is that this is to allow library implementors to chunk up the data in order to distribute it to a thread pool. Indeed, this appears to be what MSVC’s implementation does.

The idea is that input iterators are single-pass, whereas forward iterators can be iterated through multiple times. An algorithm couldn’t chunk up data indexed by input iterators because by the time it had gone through the range to work out the sub-range boundaries, the data from the iterator could have already been consumed.

Optional Initial Element

std::reduce lets you not bother passing an initial element, in which case it will default-construct one using typename std::iterator_traits<InputIt>::value_type{}. I think this was a mistake. A default-constructed value is not always the identity element (for multiplication the identity is 1, but int{} is 0), and it’s very easy for a programmer to miss out the initial element. The code will still compile, it will just give the wrong answer. I suspect that this choice will result in some hard-to-find bugs when this interface comes into heavier use.
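A short sketch of the defaults (v is a hypothetical vector<int>):

std::vector<int> v{1, 2, 3, 4};

// Explicit initial value: 1 is the identity for multiplication.
auto product = std::reduce(v.begin(), v.end(), 1, std::multiplies<>{}); // 24

// No initial value: you get int{}, i.e. 0, combined with the default
// operation std::plus<>. 0 happens to be the identity for addition, but a
// default-constructed value is the wrong starting point for many other
// operations, and nothing warns you.
auto sum = std::reduce(v.begin(), v.end()); // 10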

Finishing Up

That covers the differences between std::reduce and std::accumulate. My three point guide to std::reduce is:

  • Use std::reduce when you want your accumulation to run in parallel
  • Ensure that the operation you want to use is both associative and commutative
  • Remember that the default initial value is produced by default construction, and that this may not be correct for your operation