A call for data on exceptions

Simon Brand from Simon Brand

Last week I released an article entitled “Functional exceptionless error-handling with optional and expected”. The idea was to talk about some good ways to handle errors for those cases where you don’t want to use exceptions, but without starting a huge argument about whether exceptions are good or bad. That post has since spawned arguments about exceptions on Hacker News, two other blog posts, and three Reddit threads. Ten points for effort, zero for execution. Since the argument has now started, I feel inclined to give my own view on the matter.

I know people who think exceptions are dangerous. I don’t personally know anyone who thinks ADT-based error handling is dangerous, but I’m sure the Internet is full of them. What I think is dangerous is cargo cult software engineering. Many people take a stance on either side of this divide for good reasons based on the constraints of their platform, but an unfortunately large number just hear a respected games programmer say “exceptions are evil!”, believe that this applies to all cases and assert it unconditionally. How do we fight this? With data.

I believe that as a community we need to be able to answer these questions:

  1. What do the different kinds of error handling cost when errors occur?
  2. What do they cost when no errors occur?
  3. How consistent are the costs, and what causes deviations?
  4. What is the cognitive overhead of the different techniques?
  5. How do they affect the design of our interfaces?

I have an intuition about all of these, backed by some understanding of the actual code which is generated and how it performs, as I’m sure many of you do. But we hit a problem when it comes to teaching. I don’t believe that teaching an intuition can build an intuition. Such a thing can only be built by experimentation or the reduction of hard data into a broad understanding of what that data points towards. If we answer all of the questions above with hard data, then we can come up with a set of strong, measurable guidelines for when to use different kinds of error handling, and we can have something concrete to point people towards to help build that intuition. The C++ Core Guidelines provide the former, but not the latter. Furthermore, when I talk about “hard data”, I don’t just mean microbenchmarks. Microbenchmarks can help, sure, but writing them so that they have realistic error patterns, cache behaviour, branch prediction, etc. is hard.
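
For illustration, a deliberately naive microbenchmark of the error path might look like the sketch below. This is my own toy example, not hard data: it has a 100% error rate, hot caches and perfectly predicted branches, which are exactly the problems described above.

#include <chrono>
#include <cstdio>
#include <optional>
#include <stdexcept>

// Naive comparison of the error path of exceptions vs std::optional.
int parse_throwing(bool fail) {
    if (fail) throw std::runtime_error("disappointment");
    return 42;
}

std::optional<int> parse_optional(bool fail) {
    if (fail) return std::nullopt;
    return 42;
}

template <class F>
auto time_it(F&& f) {
    auto start = std::chrono::steady_clock::now();
    f();
    return std::chrono::steady_clock::now() - start;
}

int main() {
    constexpr int iterations = 100000;
    volatile int sink = 0;

    auto exception_time = time_it([&] {
        for (int i = 0; i < iterations; ++i) {
            try { sink = sink + parse_throwing(true); }
            catch (const std::runtime_error&) { sink = sink + 1; }
        }
    });

    auto optional_time = time_it([&] {
        for (int i = 0; i < iterations; ++i) {
            sink = sink + parse_optional(true).value_or(1);
        }
    });

    using ns = std::chrono::nanoseconds;
    std::printf("exceptions: %lld ns, optional: %lld ns\n",
                static_cast<long long>(std::chrono::duration_cast<ns>(exception_time).count()),
                static_cast<long long>(std::chrono::duration_cast<ns>(optional_time).count()));
}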

Of course, many people have tried to answer these questions already, so I’ve teamed up with Matt Dziubinski to put together a set of existing resources on error handling. Use it to educate yourselves and others, and please submit issues/pull requests to make it better. But I believe that there’s still work to be done, so if you’re a prospective PhD or Masters student and you’re looking for a topic, consider helping out the industry by researching the real costs of different error handling techniques in realistic code bases with modern compilers and modern hardware.

throw end_of_blog_post;

Functional exceptionless error-handling with optional and expected

Simon Brand from Simon Brand

In software things can go wrong. Sometimes we might expect them to go wrong. Sometimes it’s a surprise. In most cases we want to build in some way of handling these misfortunes. Let’s call them disappointments. I’m going to exhibit how to use std::optional and the proposed std::expected to handle disappointments, and show how the types can be extended with concepts from functional programming to make the handling concise and expressive.

One way to express and handle disappointments is exceptions:

void foo() {
    try {
        do_thing();
    }
    catch (...) {
        //oh no
        handle_error();
    }
}

There are myriad discussions, resources, rants, tirades, and debates about the value of exceptions, and I will not repeat them here. Suffice to say that there are cases in which exceptions are not the best tool for the job. For the sake of being uncontroversial, I’ll take the example of disappointments which are expected within reasonable use of an API.

The internet loves cats. The hypothetical you and I are involved in the business of producing the cutest images of cats the world has ever seen. We have produced a high-quality C++ library geared towards this sole aim, and we want it to be at the bleeding edge of modern C++.

A common operation in feline cutification programs is to locate cats in a given image. How should we express this in our API? One option is exceptions:

// Throws no_cat_found if a cat is not found.
image_view find_cat (image_view img);

This function takes a view of an image and returns a smaller view which contains the first cat it finds. If it does not find a cat, then it throws an exception. If we’re going to be giving this function a million images, half of which do not contain cats, then that’s a lot of exceptions being thrown. In fact, we’re pretty much using exceptions for control flow at that point, which is A Bad Thing™.

What we really want to express is a function which either returns a cat if it finds one, or it returns nothing. Enter std::optional.

std::optional<image_view> find_cat (image_view img);

std::optional was introduced in C++17 for representing a value which may or may not be present. It is intended to be a vocabulary type – i.e. the canonical choice for expressing some concept in your code. The difference between this signature and the last is powerful; we’ve moved the description of what happens on an error from the documentation into the type system. Now it’s impossible for the user to forget to read the docs, because the compiler is reading them for us, and you can be sure that it’ll shout at you if you use the type incorrectly.

The most common operations on a std::optional are:

std::optional<image_view> full_view = my_view;
std::optional<image_view> empty_view;
std::optional<image_view> another_empty_view = std::nullopt;

full_view.has_value(); //true
empty_view.has_value(); //false

if (full_view) { this_works(); }

my_view = full_view.value();
my_view = *full_view;

my_view = empty_view.value(); //throws bad_optional_access
my_view = *empty_view; //undefined behaviour

Now we’re ready to use our find_cat function along with some other friends from our library to make embarrassingly adorable pictures of cats:

std::optional<image_view> get_cute_cat (image_view img) {
    auto cropped = find_cat(img);
    if (!cropped) {
        return std::nullopt;
    }

    auto with_tie = add_bow_tie(*cropped);
    if (!with_tie) {
        return std::nullopt;
    }

    auto with_sparkles = make_eyes_sparkle(*with_tie);
    if (!with_sparkles) {
        return std::nullopt;
    }

    return add_rainbow(make_smaller(*with_sparkles));
}

Well this is… okay. The user is made to explicitly handle what happens in case of an error, so they can’t forget about it, which is good. But there are two issues with this:

  1. There’s no information about why the operations failed.
  2. There’s too much noise; error handling dominates the logic of the code.

I’ll address these two points in turn.

Why did something fail?

std::optional is great for expressing that some operation produced no value, but it gives us no information to help us understand why this occurred; we’re left to use whatever context we have available, or (God forbid) output parameters. What we want is a type which either contains a value, or contains some information about why the value isn’t there. This is called std::expected.

Now don’t go rushing off to cppreference to find out about std::expected; you won’t find it there yet, because it’s currently a standards proposal rather than a part of C++ proper. However, its interface follows std::optional pretty closely, so you already understand most of it. Again, here are the most common operations:

std::expected<image_view,error_code> full_view = my_view;
std::expected<image_view,error_code> empty_view = std::unexpected(that_is_a_dog);

full_view.has_value(); //true
empty_view.has_value(); //false

if (full_view) { this_works(); }

my_view = full_view.value();
my_view = *full_view;

my_view = empty_view.value(); //throws bad_expected_access
my_view = *empty_view; //undefined behaviour

auto code = empty_view.error();
auto oh_no = full_view.error(); //undefined behaviour

With std::expected our code might look like this:

std::expected<image_view, error_code> get_cute_cat (image_view img) {
    auto cropped = find_cat(img);
    if (!cropped) {
        return std::unexpected(no_cat_found);
    }

    auto with_tie = add_bow_tie(*cropped);
    if (!with_tie) {
        return std::unexpected(cannot_see_neck);
    }

    auto with_sparkles = make_eyes_sparkle(*with_tie);
    if (!with_sparkles) {
        return std::unexpected(cat_has_eyes_shut);
    }

    return add_rainbow(make_smaller(*with_sparkles));
}

Now when we call get_cute_cat and don’t get a lovely image back, we have some useful information to report to the user as to why we got into this situation.

Noisy error handling

Unfortunately, with both the std::optional and std::expected versions, there’s still a lot of noise. This is a disappointing solution to handling disappointments. It’s also the limit of what C++17’s std::optional and the most recent proposed std::expected give us.

What we really want is a way to express the operations we want to carry out while pushing the disappointment handling off to the side. As is becoming increasingly trendy in the world of C++, we’ll look to the world of functional programming for help. In this case, the help comes in the form of what I’ll call map and and_then.

If we have some std::optional and we want to carry out some operation on it if and only if there’s a value stored, then we can use map:

widget do_thing (const widget&);

std::optional<widget> result = maybe_get_widget().map(do_thing);
auto result = maybe_get_widget().map(do_thing); //or with auto

This code is roughly equivalent to:

widget do_thing (const widget&);

auto opt_widget = maybe_get_widget();
if (opt_widget) {
    widget result = do_thing(*opt_widget);
}

If we want to carry out some operation which could itself fail then we can use and_then:

std::optional<widget> maybe_do_thing (const widget&);

std::optional<widget> result = maybe_get_widget().and_then(maybe_do_thing);
auto result = maybe_get_widget().and_then(maybe_do_thing); //or with auto

This code is roughly equivalent to:

std::optional<widget> maybe_do_thing (const widget&);

auto opt_widget = maybe_get_widget();
if (opt_widget) {
    std::optional<widget> result = maybe_do_thing(*opt_widget);
}

and_then and map for expected act in much the same way as for optional: if there is an expected value then the given function will be called with that value, otherwise the stored unexpected value will be returned. Additionally, we could add a map_error function which allows mapping functions over unexpected values.
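
As a rough sketch (using the tl::expected implementation I mention later, plus a hypothetical parse_int function and a hypothetical describe helper, neither of which is part of this post), that looks like:

// sketch only: parse_int and describe are made-up helpers
tl::expected<int, error_code> parse_int (std::string_view s);
std::string describe (error_code);

auto doubled   = parse_int("21").map([](int i) { return i * 2; }); // expected<int, error_code>
auto described = parse_int("oops").map_error(describe);            // expected<int, std::string>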

The real power of these functions comes when we begin to chain operations together. Let’s look at that original get_cute_cat implementation again:

std::optional<image_view> get_cute_cat (image_view img) {
    auto cropped = find_cat(img);
    if (!cropped) {
        return std::nullopt;
    }

    auto with_tie = add_bow_tie(*cropped);
    if (!with_tie) {
        return std::nullopt;
    }

    auto with_sparkles = make_eyes_sparkle(*with_tie);
    if (!with_sparkles) {
        return std::nullopt;
    }

    return add_rainbow(make_smaller(*with_sparkles));
}

With map and and_then, our code transforms into this:

tl::optional<image_view> get_cute_cat (image_view img) {
    return find_cat(img)
           .and_then(add_bow_tie)
           .and_then(make_eyes_sparkle)
           .map(make_smaller)
           .map(add_rainbow);
}

With these two functions we’ve successfully pushed the error handling off to the side, allowing us to express a series of operations which may fail without interrupting the flow of logic to test an optional. For more discussion about this code and the equivalent exception-based code, I’d recommend reading Vittorio Romeo’s Why choose sum types over exceptions? article.

A theoretical aside

I didn’t make up map and and_then off the top of my head; other languages have had equivalent features for a long time, and the theoretical concepts are common subjects in Category Theory.

I won’t attempt to explain all the relevant concepts in this post, as others have done it far better than I could. The basic idea is that map comes from the concept of a functor, and and_then comes from monads. These two functions are called fmap and >>= (bind) in Haskell. The best description of these concepts which I have read is Functors, Applicatives, And Monads In Pictures by Aditya Bhargava. Give it a read if you’d like to learn more about these ideas.

A note on overload sets

One use-case which is annoyingly verbose is passing overloaded functions to map or and_then. For example:

int foo (int);

tl::optional<int> o;
o.map(foo);

The above code works fine. But as soon as we add another overload to foo:

int foo (int);
int foo (double);

tl::optional<int> o;
o.map(foo);

then it fails to compile with a rather unhelpful error message:

test.cpp:7:3: error: no matching member function for call to 'map'
o.map(foo);
~~^~~
/home/simon/projects/optional/optional.hpp:759:52: note: candidate template ignored: couldn't infer template argument 'F'
  template <class F> TL_OPTIONAL_11_CONSTEXPR auto map(F &&f) & {
                                                   ^
/home/simon/projects/optional/optional.hpp:765:52: note: candidate template ignored: couldn't infer template argument 'F'
  template <class F> TL_OPTIONAL_11_CONSTEXPR auto map(F &&f) && {
                                                   ^
/home/simon/projects/optional/optional.hpp:771:37: note: candidate template ignored: couldn't infer template argument 'F'
  template <class F> constexpr auto map(F &&f) const & {
                                    ^
/home/simon/projects/optional/optional.hpp:777:37: note: candidate template ignored: couldn't infer template argument 'F'
  template <class F> constexpr auto map(F &&f) const && {
                                    ^
1 error generated.

One solution for this is to use a generic lambda:

tl::optional<int> o;
o.map([](auto x){return foo(x);});

Another is a LIFT macro:

#define FWD(...) std::forward<decltype(__VA_ARGS__)>(__VA_ARGS__)
#define LIFT(f) \
[](auto&&... xs) noexcept(noexcept(f(FWD(xs)...))) -> decltype(f(FWD(xs)...)) \
{ return f(FWD(xs)...); }

tl::optional<int> o;
o.map(LIFT(foo));

Personally I hope to see overload set lifting get into the standard so that we don’t need to bother with the above solutions.

Current status

Maybe I’ve persuaded you that these extensions to std::optional and std::expected are useful and you would like to use them in your code. Fortunately I have written implementations of both with the extensions shown in this post, among others. tl::optional and tl::expected are on GitHub as single-header libraries under the CC0 license, so they should be easy to integrate with projects new and old.

As far as the standard goes, there are a few avenues being entertained for adding this functionality. I have a proposal to extend std::optional with new member functions. Vicente J. Botet Escribá has a proposal for a generalised monadic interface for C++. Niall Douglas’ operator try() paper suggests an analogue to Rust’s try! macro for removing some of the boilerplate associated with this style of programming. It turns out that you can use coroutines for doing this stuff, although my gut feeling puts this more to the “abuse” end of the spectrum. I’d also be interested in evaluating how Ranges could be leveraged for these goals.

Ultimately I don’t care how we achieve this as a community so long as we have some standardised solution available. As C++ programmers we’re constantly finding new ways to leverage the power of the language to make expressive libraries, thus improving the quality of the code we write day to day. Let’s apply this to std::optional and std::expected. They deserve it.


Meeting C++17 Trip Report

Simon Brand from Simon Brand

This year was my first time at Meeting C++. It was also the first time I gave a full-length talk at a conference. But most of all it was a wonderful experience filled with smart, friendly people and excellent talks. This is my report on the experience. I hope it gives you an idea of some talks to watch when they’re up on YouTube, and maybe convince you to go along and submit talks next year! I’ll be filling out links to the talks as they go online.

The Venue

The conference was held at the Andel’s Hotel in Berlin, which is a very large venue on the east side of town, about a 15-minute tram ride from Alexanderplatz. All the conference areas were very spacious and there were always places to escape to when you needed a break from the noise, or armchairs to sit down on when you needed a break from sitting on conference chairs.

The People

Some people say they go to conferences for the talks, and for the opportunity to interact with the speakers in the Q&As afterwards. The talks are great, but I went for the people. I had a wonderful time meeting those whom I knew from Twitter or the CppLang Slack but had never encountered in real life; catching up with those I’ve met at other events; and talking to those I’d never interacted with before. There were around 600 people at the conference, primarily from Europe, but many from further afield. I didn’t have any negative encounters with anyone at the conference, and Jens was very intentional about publicising the Code of Conduct and having a diverse team to enforce it, so it was a very positive atmosphere.

Day 1

Keynote: Sean Parent – Better Code: Human interface

After a short introduction from Jens, the conference kicked off with a keynote from Sean Parent. Although he didn’t say it in these words, the conclusion I took out of it was that all technology requires some thought about user experience (UX if we’re being trendy). Many of us will think of UX as something centred around user interfaces, but it’s more than that. If you work on compilers (like I do), then your UX work involves consistent compiler flags, friendly error messages and useful tooling. If you work on libraries, then your API is your experience. Do your functions do the most obvious thing? Are they well-documented? Do they come together to create terse, correct, expressive, efficient code? Sean’s plea was for us to consider all of these things, and to embed them into how we think about coding. Of course, it being a talk about good code from Sean Parent, it had some great content about composing standard algorithms. I would definitely recommend checking out the talk if what I’ve just written sounds thought-provoking to you.

Peter Goldsborough – Deep Learning with C++

Peter gave a very well-structured talk, which went down the stack of technology involved in deep learning – from high-level libraries to hardware – showing how it all works in theory. He then went back up the stack showing the practical side, including a few impressive live demos. It covers a lot of content, giving a pretty comprehensive introduction. It does move at a very quick pace though, so you probably don’t want to watch this one on 1.5x on YouTube!

Jonathan Boccara – Strong types for strong interfaces

This talk was quite a struggle against technical difficulties; starting 15 minutes late after laptop/projector troubles, then having Keynote crash as soon as the introduction was over. Fortunately, Jonathan handled it incredibly well, seeming completely unfazed as he gave an excellent presentation and even finished on time.

The talk itself was about using strong types (a.k.a. opaque typedefs) to strengthen your code’s resilience to programmer errors and increase its readability. This was a distillation of some of the concepts which Jonathan has written about in a series on his blog. He had obviously spent a lot of time rehearsing this talk and put a lot of thought into the visual design of the slides, which I really appreciated.

Simon Brand (me!) – How C++ Debuggers Work

This was my first full length talk at a conference, so I was pretty nervous about it, but I think it went really well: there were 150-200 people there, I got some good questions, and a bunch of people came to discuss the talk with me in the subsequent days. I screwed up a few things – like referring to the audience as “guys” near the start and finishing a bit earlier than I would have liked – but all in all it was a great experience.

My talk was an overview of almost all of the commonly-used parts of systems-level debuggers. I covered breakpoints, stepping, debug information, object files, operating system interaction, expression evaluation, stack unwinding and a whole lot more. It was a lot of content, but I think I managed to get a good presentation on it. I’d greatly appreciate any feedback of how the talk could be improved in case I end up giving it at some other conference in the future!

Joel Falcou – The Three Little Dots and the Big Bad Lambdas

I’m a huge metaprogramming nerd, so this talk on using lambdas as a kind of code injection was very interesting for me. Joel started the talk with an introduction to how OCaml allows code injection, and then went on to show how you could emulate this feature in C++ using templates, lambdas and inlining. It turned out that he was going to mostly talk about the indices trick for unpacking tuple-like things, fold expressions (including how to emulate them in C++11/14), and template-driven loop unrolling – all of which are techniques I’m familiar with. However, I hadn’t thought about these tools in the context of compile-time code injection, and Joel presented them very well, so it was still a very enjoyable talk. I particularly enjoyed hearing that he uses these in production in his company in order to fine-tune performance. He showed a number of benchmarks of the linear algebra library he has worked on against another popular implementation which hand-tunes the code for different architectures, and his performance was very competitive. A great talk for demonstrating how templates can be used to optimise your code!

Diego Rodriguez-Losada – Conan C++ Quiz

This was one of the highlights of the whole conference. The questions were challenging, thought-provoking, and hilarious, and Diego was a fantastic host. He delivered all the content with charisma and wit, and was obviously enjoying himself: maybe my favourite part of the quiz was the huge grin he wore as everyone groaned in incredulity when the questions were revealed. I don’t know if the questions will go online at some point, but I’d definitely recommend giving them a shot if you get the chance.

Day 2

Keynote: Kate Gregory – It’s complicated!

Kate gave a really excellent talk which examined the complexities of writing good code, and those of C++. The slide which I saw the most people get their phones out for said “Is it important to your ego that you’re really good at a complicated language?”, which I think really hit home for many of us. C++ is like an incredibly intricate puzzle, and when you solve a piece of it, you feel good about it and want to share your success with others. However, sometimes, we can make that success, that understanding, part of our identity. Neither Kate nor I are saying that we shouldn’t be proud of our domination over the obscure parts of the language, but a bit of introspection about how we reflect this understanding and share it with others will go a long way.

Another very valuable part of her talk was her discussion on the importance of good names, and how being able to describe some idiom or pattern helps us to talk about it and instruct others. Of course, this is what design patterns were originally all about, but Kate frames it in a way which helps build an intuition around how this common language works to make us better programmers. A few days after the conference, this is the talk which has been going around my head the most, so if you watch one talk from this year’s conference, you could do a lot worse than this one.

Felix Petriconi – There Is A New Future

Felix presented the std::future-a-like library which he and Sean Parent have been working on. It was a very convincing talk, showing the inadequacies of std::future, and how their implementation fixes these problems. In particular, Felix discussed continuations, cancellations, splits and forks, and demonstrated their utility. You can find the library here.

Lukas Bergdoll – Is std::function really the best we can do?

If you’ve read a lot of code written by others, I’m sure you’ll have seen people misusing std::function. This talk showed why so many uses are wrong, including live benchmarks, and gave a number of alternatives which can be used in different situations. I particularly liked a slide which listed the conditions for std::function to be a good choice, which were:

  • It has to be stored inside a container
  • You really can’t know the layout at compile time
  • All your callable types are move and copyable
  • You can constrain the user to exactly one signature

#include meeting

Guy Davidson led an open discussion about #include, which is a diversity group for C++ programmers started by Guy and Kate Gregory (I’ve also been helping out a bit). It was a very positive discussion in which we shared stories, tried to grapple with some of the finer points of diversity and inclusion in the tech space, and came up with some aims to consider moving forward. Some of the key outcomes were:

  • We should be doing more outreach into schools and trying to encourage those in groups who tend to be discouraged from joining the tech industry.
  • Some people would greatly value a resource for finding companies which really value diversity rather than just listing it as important on their website.
  • Those of us who care about these issues can sometimes turn people away from the cause by not being sensitive to cultural, psychological or sociological issues; perhaps we need education on how to educate.

I thought this was a great session and look forward to helping develop this initiative and seeing where we can go with it. I’d encourage anyone who is interested to get on to the #include channel on the CppLang Slack, or to send pull requests to the information repository.

Juan Bolívar – The most valuable values

Juan gave a very energetic, engaging talk about value semantics and his Redux-/Elm-like C++ library, lager. He showed how these techniques can be used for writing clear, correct code without shared mutable state. I particularly like the time-travelling debugger which he presented, which allows you to step around the changes to your data model. I don’t want to spoil anything, but there’s a nice surprise in the talk which gets very meta.

Day 3

Lightning talks

I spent most of day three watching lightning talks. There were too many to discuss individually, so I’ll just mention my favourites.

Arvid Gerstmann – A very quick view into a compiler

Arvid gave a really clear, concise description of the different stages of compilation. Recommended if you want a byte-sized introduction to compilers.

Réka Nikolett Kovács – std::launder

It’s become a bit of a meme on the CppLang Slack that we don’t talk about std::launder, as it’s only needed if you’re writing standard-library-grade generic libraries. Still, I was impressed by this talk, which explains the problem well and shows how std::launder (mostly) solves it. I would have liked more of a disclaimer about who the feature is for, and that there are still unsolved problems, but it was still a good talk.

Andreas Weis – Type Punning Done Right

Strict aliasing violations are a common source of bugs if you’re writing low level code and aren’t aware of the rules. Andreas gave a good description of the problem, and showed how memcpy is the preferred way to solve it. I do wish he’d included an example of std::bit_cast, which was just voted into C++20, but it’s a good lightning talk to send to your colleagues the next time they go type punning.

Miro Knejp – Is This Available?

Miro presented a method of avoiding awful #ifdef blocks by providing a template-based interface to everything you would otherwise preprocess. This is somewhat similar to what I talked about in my if constexpr blog post, and I do something similar in production for abstracting the differences in hardware features in a clean, composable manner. As such, it’s great to have a lightning talk to send to people when I want to explain the concept, and I’d recommend watching it if the description piques your interest. Now we just need a good name for it. Maybe the Template Abstracted Preprocessor (TAP) idiom? Send your thoughts to me and Miro!

Simon Brand (me!) – std::optional and the M word

Someone who was scheduled to do a lightning talk didn’t make the session, so I ended up giving an improvised talk about std::optional and monads. I didn’t have any slides, so described std::optional as a glass which could either be empty, or have a value (in this case represented by a pen). I got someone from the audience to act as a function which would give me a doily when I gave it a pen, and acted out functors and monads using that. It turned out very well, even if I forgot how to write Haskell’s bind operator (>>=) halfway through the talk!

Overall

I really enjoyed the selection of lightning talks. There was a very diverse range of topics, and everything flowed very smoothly without any technical problems. It would have been great if there were more speakers, since some people gave two or even three talks, but that means that more people need to submit! If you’re worried about talking, a lightning talk is a great way to practice; if things go terribly wrong, then the worst that can happen is that they move on to the next speaker. It would have also been more fun if it were on the main track, or if there wasn’t anything scheduled against them.

Secret Lightning Talks

Before the final keynote there was another selection of lightning talks given on the keynote stage. Again, I’ll just discuss my favourites.

Guy Davidson – Diversity and Inclusion

Guy gave a heartfelt talk about diversity in the C++ community, giving reasons why we should care, and what we can do to try and encourage involvement for those in under-represented groups in our communities. This is a topic which is also very important to me, so it was great to see it being given an important place in the conference.

Kate Gregory – Five things I learned when I should have been dying

An even more personal talk was given by Kate about important things she learned when she was given her cancer diagnosis. It was a very inspiring call to not care about the barriers which we may perceive to be in the way of achieving something – like, say, going to talk at a conference – and instead just trying your best and “doing the work”. Her points really resonated with me; I, like many of us, have suffered from a lot of impostor syndrome in my short time in the industry, and talks like this help me to battle through it.

Phil Nash – A Composable Command Line Parser

Phil’s talk was on Clara, which is a simple, composable command line parser for C++. It was split off from his Catch test framework into its own library, and he’s been maintaining it independently of its parent. The talk gave a strong introduction to the library and why you might want to use it instead of one of the other hundreds of command line parsers which are out there. I’m a big fan of the design of Clara, so this was a pleasure to watch.

Keynote: Wouter van Ooijen – What can C++ offer embedded, what can embedded offer C++?

To close out the conference, Wouter gave a talk about the interaction between the worlds of C++ and embedded programming. The two have been getting more friendly in recent years thanks to libraries like Kvasir and talks by people like Dan Saks and Odin Holmes. Wouter started off talking about his work on space rockets, then motivated using C++ templates for embedded programming, showed us examples of how to do it, then finished off talking about what the C++ community should learn from embedded programming. If I’m honest, I didn’t enjoy this keynote as much as the others, as I already had an understanding of the techniques he showed, and I was unconvinced by the motivating examples, which seemed like they hadn’t been optimised properly (I had to leave early to catch a plane, so please someone tell me if he ended up clarifying them!). If you’re new to using C++ for embedded programming, this may be a good introduction.

Closing

That’s it for my report. I want to thank Jens for organising such a great conference, as well as his staff and volunteers for making sure everything ran smoothly. Also thanks to the other speakers for being so welcoming to new blood, and all the other attendees for their discussions and questions. I would whole-heartedly recommend attending the conference and submitting talks, especially if you haven’t done so before. Last but not least, a big thanks to my employer, Codeplay, for sending me (we’re hiring, you get to work on compilers and brand new hardware and go to conferences, it’s pretty great).

Detection Idiom – A Stopgap for Concepts

Simon Brand from Simon Brand

Concepts is a proposed C++ feature which allows succinct, expressive, and powerful constraining of templates. They have been threatening to get into C++ for a long time now, with the first proposal being rejected for C++11. They were finally merged into C++20 a few months ago, which means we need to hold on for another few years before they’re in the official standard rather than a Technical Specification. In the meantime, there have been various attempts to implement parts of concepts as a library so that we can have access to some of the power of Concepts without waiting for the new standard. The detection idiom – designed by Walter Brown and part of the Library Fundamentals 2 Technical Specification – is one such solution. This post will outline the utility of it, and show the techniques which underlie its implementation.


A problem

We are developing a generic library. At some point in our library we need to calculate the foo factor for whatever types the user passes in.

template <class T>
int calculate_foo_factor (const T& t);

Some types will have specialized implementations of this function, but for types which don’t we’ll need some generic calculation.

struct has_special_support {
    int get_foo() const;
};

struct will_need_generic_calculation {
    // no get_foo member function
};

Using concepts we could write calculate_foo_factor like so:

template <class T>
concept bool SupportsFoo = requires (T t) {
    { t.get_foo() } -> int;
};

template <SupportsFoo T>
int calculate_foo_factor (const T& t) {
    return t.get_foo();
}

template <class T>
int calculate_foo_factor (const T& t) {
    // insert generic calculation here
    return 42;
}

This is quite succinct and clear: SupportsFoo is a concept which checks that we can call get_foo on t with no arguments, and that the type of that expression is int. The first calculate_foo_factor will be selected by overload resolution for types which satisfy the SupportsFoo concept, and the second will be chosen for those which don’t.

Unfortunately, our library has to support C++14. We’ll need to try something different. I’ll demonstrate a bunch of possible solutions to this problem in the next section. Some of them may seem complex if you aren’t familiar with the metaprogramming techniques used, but for now, just note the differences in complexity and abstraction between them. The metaprogramming tricks will all be explained in the following section.

Solutions

Here’s a possible solution using expression SFINAE:

namespace detail {
template <class T>
auto calculate_foo_factor (const T& t, int)
    -> decltype(t.get_foo()) {
    return t.get_foo();
}

template <class T>
int calculate_foo_factor (const T& t, ...) {
    // insert generic calculation here
    return 42;
}
}

template <class T>
int calculate_foo_factor (const T& t) {
    return detail::calculate_foo_factor(t, 0);
}

Well, it works, but it’s not exactly clear. What’s the int and ... there for? Why do we need an extra overload? The answers are not the important part here; what is important is that unless you’ve got a reasonable amount of metaprogramming experience, it’s unlikely you’ll be able to write this code offhand, or even copy-paste it without error.

The code could be improved by abstracting out the check for the presence of the member function into its own metafunction:

template <class... Ts>
using void_t = void;

template <class T, class=void>
struct supports_foo : std::false_type{};

template <class T>
struct supports_foo<T, void_t<decltype(std::declval<T>().get_foo())>>
: std::true_type{};

Again, some more expert-only template trickery which I’ll explain later. Using this trait, we can use std::enable_if to enable and disable the overloads as required:

template <class T, std::enable_if_t<supports_foo<T>::value>* = nullptr>
auto calculate_foo_factor (const T& t) {
    return t.get_foo();
}

template <class T, std::enable_if_t<!supports_foo<T>::value>* = nullptr>
int calculate_foo_factor (const T& t) {
    // insert generic calculation here
    return 42;
}

This works, but you’d need to understand how to implement supports_foo, and you’d need to repeat all the boilerplate again if you needed to write a supports_bar trait. It would be better if we could completely abstract away the mechanism for detecting the member function and just focus on what we want to detect. This is what the detection idiom provides.

template <class T>
using foo_t = decltype(std::declval<T>().get_foo());

template <class T>
using supports_foo = is_detected<foo_t,T>;

is_detected here is part of the detection idiom. Read it as “is it valid to instantiate foo_t with T?” std::declval<T>() essentially means “pretend that I have a value of type T” (more on this later). This still requires some metaprogramming magic, but it’s a whole lot simpler than the previous examples. With this technique we can also easily check for the validity of arbitrary expressions:

template <class T, class U>
using equality_t = decltype(std::declval<T>() == std::declval<U>());

template <class T, class U>
using supports_equality = is_detected<equality_t, T, U>;

struct foo{};
struct bar{};
struct baz{};

bool operator== (foo, bar);

static_assert( supports_equality<foo,bar>::value, "wat");
static_assert(!supports_equality<foo,baz>::value, "wat");

We can also compose traits using std::conjunction:

template <class T>
using is_regular = std::conjunction<std::is_default_constructible<T>,
                                    std::is_copy_constructible<T>,
                                    supports_equality<T,T>,
                                    supports_inequality<T,T>, //assume impl
                                    supports_less_than<T,T>>; //ditto

If you want to use is_detected today, then you can check if your standard library supports std::experimental::is_detected. If not, you can use the implementation from cppreference or the one which we will go on to write in the next section. If you aren’t interested in how this is written, then turn back, for here be metaprogramming dragons.


Metaprogramming demystified

I’ll now work backwards through the metaprogramming techniques used, leading up to implementing is_detected.

Type traits and _v and _t suffixes

A type trait is some template which can be used to get information about characteristics of a type. For example, you can find out if some type is an arithmetic type using std::is_arithmetic:

template <class T>
void foo(T t) {
    static_assert(std::is_arithmetic<T>::value, "Argument must be of an arithmetic type");
}

Type traits either “return” types with a ::type member alias, or values with a ::value static member. _t and _v suffixes are shorthand for these respectively. So std::is_arithmetic_v<T> is the same as std::is_arithmetic<T>::value, and std::add_pointer_t<T> is the same as typename std::add_pointer<T>::type1.
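
A couple of quick checks (my own illustration, using C++17’s _v and _t helpers) show the equivalence:

#include <type_traits>

static_assert(std::is_arithmetic_v<int> == std::is_arithmetic<int>::value, "");
static_assert(std::is_same_v<std::add_pointer_t<int>,
                             std::add_pointer<int>::type>, "");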

decltype specifiers

decltype gives you access to the type of an entity or expression. For example, with int i;, decltype(i) is int.
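
A few more examples (my own, purely for illustration):

int i = 0;
int& ri = i;

decltype(i)   a = 0;     // int
decltype(ri)  b = i;     // int&
decltype((i)) c = i;     // int& - a parenthesised name is an lvalue expression
decltype(i + 1.0) d = 0; // double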

std::declval trickery

std::declval is a template function which helps create values inside decltype specifiers. std::declval<foo>() essentially means “pretend I have some value of type foo”. This is needed because the types you want to inspect inside decltype specifiers may not be default-constructible. For example:

struct foo {
    foo() = delete;
    foo(int);
    void do_something();
};

decltype(foo{}.do_something()) // invalid
decltype(std::declval<foo>().do_something()) // fine

SFINAE and std::enable_if

SFINAE stands for Substitution Failure Is Not An Error. Due to this rule, some constructs which are usually hard compiler errors just cause a function template to be ignored by overload resolution. std::enable_if is a way to gain easy access to this. It takes a boolean template argument and either contains or does not contain a ::type member alias depending on the value of that boolean. This allows you to do things like this:

template <class T>
std::enable_if_t<std::is_integral_v<T>> foo(T t);

template <class T>
std::enable_if_t<std::is_floating_point_v<T>> foo(T t);

If T is integral, then the second overload will be SFINAEd out, so the first is the only candidate. If T is floating point, then the reverse is true.
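
For example (a small illustration of my own showing how the overloads above behave):

foo(42);       // int is integral, so the first overload is the only candidate
foo(3.14);     // double is floating point, so the second overload is chosen
// foo("hi");  // error: both overloads are SFINAEd out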

Expression SFINAE

Expression SFINAE is a special form of SFINAE which applies to arbitrary expressions. This is what allowed this example from above to work:

namespace detail {
template <class T>
auto calculate_foo_factor (const T& t, int)
    -> decltype(std::declval<T>().get_foo()) {
    return t.get_foo();
}

template <class T>
int calculate_foo_factor (const T& t, ...) {
    // insert generic calculation here
    return 42;
}
}

template <class T>
int calculate_foo_factor (const T& t) {
    return detail::calculate_foo_factor(t, 0);
}

The first overload will be SFINAEd out if calling get_foo on an instance of T is invalid. The difficulty here is that both overloads will be valid if the get_foo call is valid. For this reason, we add some dummy parameters to the overloads (int and ...) to specify an order for overload resolution to follow2.

void_t magic

void_t is a C++17 feature (although it’s implementable in C++11) which makes writing traits and using expression SFINAE a bit easier. The implementation is deceptively simple3:

template <class... Ts>
using void_t = void;

You can see this used in this example which we used above:

template <class T, class=void>
struct supports_foo : std::false_type{};

template <class T>
struct supports_foo<T, void_t<decltype(std::declval<T>().get_foo())>>
: std::true_type{};

The relevant parts are the class=void and void_t<...>. If the expression inside void_t is invalid, then that specialization of supports_foo will be SFINAEd out, so the primary template will be used. Otherwise, the whole expression will be evaluated to void due to the void_t and this will match the =void default argument to the template. This gives a pretty succinct way to check arbitrary expressions.
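
As a quick check of my own (reusing the two types from the start of the post):

static_assert( supports_foo<has_special_support>::value, "");
static_assert(!supports_foo<will_need_generic_calculation>::value, "");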

That covers all of the ingredients we need to implement is_detected.

The implementation

For sake of simplicity I’ll just implement is_detected rather than all of the entities which the Lib Fundamentals 2 TS provides.

namespace detail {
template <template <class...> class Trait, class Enabler, class... Args>
struct is_detected : std::false_type{};

template <template <class...> class Trait, class... Args>
struct is_detected<Trait, void_t<Trait<Args...>>, Args...> : std::true_type{};
}

template <template <class...> class Trait, class... Args>
using is_detected = typename detail::is_detected<Trait, void, Args...>::type;

Let’s start with that last alias template. We take a variadic template template parameter (yes really) called Trait and any number of other type template arguments. We forward these to our detail::is_detected helper with an extra void argument which will act as the class=void construct from the previous section on void_t4. We then have a primary template which will “return” false as the result of the trait. The magic then happens in the following partial specialization. Trait<Args...> is evaluated inside void_t. If instantiating Trait with Args... is invalid, then the partial specialization will be SFINAEd out, and if it’s valid, it’ll evaluate to void due to the void_t. This successfully abstracts away the implementation of the detection and allows us to focus on what we want to detect.
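
Plugging the foo_t alias from earlier into this implementation behaves as you’d hope (a quick sanity check of my own):

template <class T>
using foo_t = decltype(std::declval<T>().get_foo()); // as defined earlier

static_assert( is_detected<foo_t, has_special_support>::value, "");
static_assert(!is_detected<foo_t, will_need_generic_calculation>::value, "");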


That covers it for the detection idiom. This is a handy utility for clearing up some hairy metaprogramming, and is even more useful since it can be implemented in old versions of the standard. Let me know if there are any other parts of the code you’d like explained down in the comments or on Twitter. If you want some more resources on the detection idiom, you might want to watch the conference talks given by Walter Brown5 6, Marshall Clow7 8, and Isabella Muerte9.


  1. See here for information on why the typename keyword is needed in some places. 

  2. Have a look here for a better way to do this. 

  3. Due to standards defect CWG1558, implementations needed an extra level of abstraction to work on some compilers. 

  4. The reason we can’t use the same trick is that parameter packs must appear at the end of the template parameter list, so we need to put the void in the middle. 

  5. CppCon 2014: Walter E. Brown “Modern Template Metaprogramming: A Compendium, Part I” 

  6. CppCon 2014: Walter E. Brown “Modern Template Metaprogramming: A Compendium, Part II” 

  7. ACCU 2017: The Detection Idiom - a simpler way to SFINAE - Marshall Clow 

  8. C++Now 2017: Marshall Clow “The ‘Detection idiom:’ A Better Way to SFINAE” 

  9. CppCon 2016: Isabella Muerte “No Concepts Beyond This Point: The Detection Idiom” 

Learning C++

Simon Brand from Simon Brand

C++ is one of the most powerful programming languages in the world. However, it can also be one of the most daunting for beginners. There is so much to learn, so many corner cases, so many little languages within the language itself, that mastering it can seem an insurmountable task. Fortunately, it’s not quite as impossible as it looks. This post outlines some resources and advice for learning C++ today. Some of these also will be useful for those who have used C++ in the past and want to get familiar with modern C++, which is like an entirely different language.

If you want to learn C++ from scratch, don’t rely on random online tutorials. Get a good book. One of these. I read the entirety of Accelerated C++ the day before the interview for my current job, and it was enough to get stuck into real C++ development. It’s a bit outdated now (i.e. it doesn’t cover C++11/14/17), but it is easily the most important book in my career development. The Effective C++ series by Scott Meyers should be mandatory reading for C++ developers, particularly Effective Modern C++, which is the go-to introduction to C++11 and 14. If you prefer online resources, use reputable ones like Kate Gregory’s Pluralsight courses rather than whatever Google throws out. There are a lot of terrible C++ tutorials out there which will only harm you. A good rule of thumb: if you see #include <iostream.h>, run1.

However, no matter how much you read, you need to practice in order to improve. I find it helpful to have a mix of small, self-contained problems to try out new techniques, and larger applications to make sure you can use those ideas in a real-world context. Kind of like unit tests and integration tests for your programming skills; can you use these ideas in isolation, and can you fit them together into something comprehensible and maintainable.

Stack Overflow is a great source of skill unit tests. I learned most of my initial metaprogramming knowledge by answering Stack Overflow questions incorrectly and having people tell me why I was wrong. Unfortunately it can be an unforgiving community, and isn’t friendly to very simple questions. Be sure to read through the rules thoroughly before contributing, and feel free to send me an answer you were going to post to get feedback on it if you’re worried!

For skill integration tests, I’d recommend coming up with an application in a domain you’re interested in and trying to develop it from start to finish. If you haven’t done much of this kind of thing before, it can be a daunting task. The main piece of advice I have on this is to ensure your application has a strict scope so that you always know what you’re aiming towards and when you’re finished. Set a number of checkpoints or milestones so that you have explicit tasks to work on and don’t get lost. Some ideas for fun projects:

  • A compiler for a simple C subset
  • A mini text editor
  • Implement some simplified standard library classes, like std::vector
  • A clone of a simple video game like Snake
  • An ELF file manipulation tool

One of the most valuable knowledge sources is a good C++ community to be part of. Find out if your city or a nearby one has a User Group and go along to meet other developers. They might even have a mentorship program or advice channels for people learning the language. Definitely get on to the C++ Slack Channel and lurk, ask questions, join in discussions. Many of the foremost C++ experts in the world hang out there, and it’s also a very friendly and welcoming environment to those learning. There are even dedicated subchannels for those learning or having problems debugging C++.

Don’t be discouraged when you find things difficult, and, trust me, you will. C++ is hard, and even the experts get things wrong, or can’t even agree on what’s right.

If you do get good, pass on the knowledge. Start a blog, give advice, mentor those who need it, talk at conferences. You do not need to be an expert to do these things, and we need more speakers and writers from a diverse set of experience levels, as well as in background, race, gender, sexuality. The major C++ conferences all have codes of conduct and organisers who care about these issues. Many of them also have student or accessibility tickets available for those who need further assistance to attend. If in doubt, just ask the organisers, or drop me a message and I can get in touch with the relevant people.

Finally, find some good, regular blogs to learn from. Here are some I would recommend:

I hope that this post helps you in your journey to learn C++. It’s a long, but very rewarding path, with a lot of wonderful people to meet, work with, and debate metaprogramming techniques with along the way.


  1. iostream.h rather than just iostream was the name for the header before C++ was standardised. People are still writing tutorials which teach using this form and ancient compilers. 

Writing a Good Tech Cover Letter

Simon Brand from Simon Brand

I review a lot of job applications at Codeplay and most of them make me sad.

For every application I look at, I make a list of plus points and minus points as I read through the candidate’s documents and code. Invariably the first thing I look at is the cover letter, and I would estimate that 90% of them get an instant minus point in my review feedback. As an alternative to ranting about this on the internet, I’ve written some advice on writing a cover letter for a tech job. All advice is from the viewpoint of a small-medium sized tech company and all, well, you know, just like, my opinion, man.

A good job application answers two questions: why the candidate is right for the company and why the company is right for the candidate. The best place to answer these questions is in your cover letter. In fact, with the best cover letters, I even forget to read the CV. If you can’t answer both of these questions on a single side of A4, you probably don’t know the answers.

Step 1 to writing a good cover letter is to actually write a cover letter. If the job advertisement asks for a cover letter, write a cover letter. If it doesn’t ask for a cover letter, write one anyway! Without a concise, targeted description of why you should be considered for a job, your CV needs to work five times as hard for you, as the reviewer needs to try and extract the necessary information from a list of competencies and experience rather than an explicit argument for why the candidate fits the position.

Do not copy-paste a generic cover letter which you send everywhere. Anyone who has reviewed a bunch can spot them instantly, and they make it look like you don’t care about the job enough to actually write something targeted to the company and position.

If the job advert lists criteria for applicants, reference it. Show why you fulfil (or, even better, exceed) the expectations and provide evidence. The best cover letters I’ve seen have gone through all of the required and desired skills and provided short descriptions of experience in the area with links to proof of this experience. And, yes, you can do this on one sheet of paper and still have space for a how-do-you-do.

Don’t spout vague buzzwords. Every time I read about a “team player who is eager to learn” my eyes roll back in my head so hard I lose them. If things like being a team player or eager to learn are in the criteria, fine, but (again) provide evidence.

Let your character and enthusiasm show. These don’t come across in a CV, so show me who you are and why I should want to work with you. Of course, this is much more apparent when it actually comes to interviews, but being enthusiastic about the position and giving some colour to your writing can make the difference between actually getting that interview or not.

If the advert asks for links to your work, provide good ones. Of course, if most of your work is confidential and you don’t work on hobby projects then you may not have any; that’s fine, just justify it. But many applicants link to a Stack Overflow profile with two closed questions on std::vector and send a code sample they wrote ten years ago with using namespace std; in all the headers. Don’t.

Spell-check your letter. English might not be your first language, and whoever is reviewing your application should be sensitive to that, but maeking shoor ur letter doesnt look lyk this can be done by your word processor. This advice should be obvious, but you’d be surprised how many applications I review are riddled with typos. Bad spelling looks lazy. You should put effort into your cover letter, and it should show.

Finally, and most importantly, don’t assume you’re not good enough. An understanding of the gaps in your knowledge, an honest desire to learn the field and evidence that you’re trying goes a really long way.

void C::C::C::A::A::A::foo() – valid syntax monstrosity

Simon Brand from Simon Brand

Here’s an odd bit of C++ syntax for you. Say we have the following class hierarchy:

class A {
public:
    virtual void foo() = 0;
};

class B {
public:
    virtual void foo() = 0;
};

class C : public A, public B {
public:
    virtual void foo();
};

The following definitions are all well-formed:

void C::foo(){
    std::cout << "C";
}
void C::A::foo(){
    std::cout << "A";
}
void C::B::foo(){
    std::cout << "B";
}

The first one defines C::foo, the second defines A::foo and the third defines B::foo. This is valid because of an entity known as the injected-class-name:

A class-name is inserted into the scope in which it is declared immediately after the class-name is seen. The class-name is also inserted into the scope of the class itself; this is known as the injected-class-name. For purposes of access checking, the injected-class-name is treated as if it were a public member name. […]

Since A is a base class of C and the name A is injected into the scope of A, A is also visible from C.

The injected-class-name exists to ensure that the class is found during name lookup instead of entities in an enclosing scope. It also makes referring to the class name in a template instantiation easier. But since we’re awful people, we can use this perfectly reasonable feature to do horribly perverse things like this:

void C::C::C::A::A::A::foo(){
    std::cout << "A";
}

Yeah, don’t do that.

Writing a Linux Debugger Part 10: Advanced topics

Simon Brand from Simon Brand

We’re finally here at the last post of the series! This time I’ll be giving a high-level overview of some more advanced concepts in debugging: remote debugging, shared library support, expression evaluation, and multi-threaded support. These ideas are more complex to implement, so I won’t walk through how to do so in detail, but I’m happy to answer questions about these concepts if you have any.


Series index

  1. Setup
  2. Breakpoints
  3. Registers and memory
  4. Elves and dwarves
  5. Source and signals
  6. Source-level stepping
  7. Source-level breakpoints
  8. Stack unwinding
  9. Handling variables
  10. Advanced topics

Remote debugging

Remote debugging is very useful for embedded systems or debugging the effects of environment differences. It also sets a nice divide between the high-level debugger operations and the interaction with the operating system and hardware. In fact, debuggers like GDB and LLDB operate as remote debuggers even when debugging local programs. The general architecture is this:

[Figure: debugger ⟷ debug stub (over the remote protocol) ⟷ debuggee (via the OS debug interface, e.g. ptrace)]

The debugger is the component which we interact with through the command line. If you’re using an IDE, there may be another layer on top which communicates with the debugger through the machine interface. On the target machine (which may be the same as the host) there will be a debug stub, which in theory is a very small wrapper around the OS debug library that carries out all of your low-level debugging tasks like setting breakpoints on addresses. I say “in theory” because stubs are getting larger and larger these days. The LLDB debug stub on my machine is 7.6MB, for example. The debug stub communicates with the debuggee process using some OS-specific features (in our case, ptrace), and with the debugger through some remote protocol.

The most common remote protocol for debugging is the GDB remote protocol. This is a text-based packet format for communicating commands and information between the debugger and debug stub. I won’t go into detail about it, but you can read all you could want to know about it here. If you launch LLDB and execute the command log enable gdb-remote packets then you’ll get a trace of all packets sent through the remote protocol. On GDB you can write set remotelogfile <file> to do the same.

As a simple example, here’s the packet to set a breakpoint:

$Z0,400570,1#43

$ marks the start of the packet. Z0 is the command to insert a memory breakpoint. 400570 and 1 are the arguments, where the former is the address to set a breakpoint on and the latter is a target-specific breakpoint kind specifier. Finally, #43 is a checksum to ensure that there was no data corruption: the two hex digits are the sum of the payload bytes modulo 256.
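To make the checksum concrete, here’s a minimal sketch (not taken from any real stub; make_packet is a made-up helper) which builds a packet from a payload by appending the modulo-256 sum of its bytes as two hex digits:

#include <cstdint>
#include <cstdio>
#include <string>

// Build a GDB remote protocol packet: $<payload>#<checksum>, where the
// checksum is the sum of the payload bytes modulo 256 as two hex digits.
std::string make_packet(const std::string& payload) {
    std::uint8_t checksum = 0;
    for (unsigned char c : payload) {
        checksum += c;
    }
    char suffix[4];
    std::snprintf(suffix, sizeof(suffix), "#%02x", static_cast<unsigned>(checksum));
    return "$" + payload + suffix;
}

int main() {
    // Prints "$Z0,400570,1#43", the breakpoint packet shown above.
    std::printf("%s\n", make_packet("Z0,400570,1").c_str());
}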

The GDB remote protocol is very easy to extend for custom packets, which is very useful for implementing platform- or language-specific functionality.


Shared library and dynamic loading support

The debugger needs to know what shared libraries have been loaded by the debuggee so that it can set breakpoints, get source-level information and symbols, etc. As well as finding libraries which have been dynamically linked against, the debugger must track libraries which are loaded at runtime through dlopen. To facilitate this, the dynamic linker maintains a rendezvous structure. This structure maintains a linked list of shared library descriptors, along with a pointer to a function which is called whenever the linked list is updated. This structure is stored where the .dynamic section of the ELF file is loaded, and is initialized before program execution.
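To get a feel for what that list contains, here’s a minimal sketch which uses glibc’s dl_iterate_phdr to print the objects loaded into the current process. This is the in-process view of the same information; a debugger would instead read the list out of the tracee’s memory with ptrace (sketched after the algorithm below).

// Assumes glibc; dl_iterate_phdr requires _GNU_SOURCE (g++ defines it by default).
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <link.h>
#include <cstddef>
#include <cstdio>

// Print the load address and name of every object loaded into this process.
// The first entry (the main executable) usually has an empty name.
static int print_object(dl_phdr_info* info, std::size_t /*size*/, void* /*data*/) {
    std::printf("%#lx %s\n",
                static_cast<unsigned long>(info->dlpi_addr),
                info->dlpi_name[0] ? info->dlpi_name : "<main executable>");
    return 0;  // returning non-zero would stop the iteration
}

int main() {
    dl_iterate_phdr(print_object, nullptr);
}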

A simple tracing algorithm is this (a code sketch of the list-walking step follows the list):

  • The tracer looks up the entry point of the program in the ELF header (or it could use the auxiliary vector stored in /proc/<pid>/auxv).
  • The tracer places a breakpoint on the entry point of the program and begins execution.
  • When the breakpoint is hit, the address of the rendezvous structure is found by looking up the load address of .dynamic in the ELF file.
  • The rendezvous structure is examined to get the list of currently loaded libraries.
  • A breakpoint is set on the linker update function.
  • Whenever the breakpoint is hit, the list is updated.
  • The tracer loops, continuing the program and waiting for signals, until the tracee reports that it has exited.
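As a rough illustration of the list-walking steps, here’s a sketch of how a tracer might read the link_map list out of the tracee once it has the r_map pointer from the rendezvous structure. The helper names are made up, error handling is omitted, and it assumes the tracer and tracee share the same ABI, so offsetof on our own struct link_map matches the tracee’s layout:

#include <sys/ptrace.h>
#include <sys/types.h>
#include <link.h>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Read one word of the tracee's memory.
long read_word(pid_t pid, std::uintptr_t addr) {
    return ptrace(PTRACE_PEEKDATA, pid, reinterpret_cast<void*>(addr), nullptr);
}

// Read a NUL-terminated string from the tracee's memory, one word at a time.
std::string read_string(pid_t pid, std::uintptr_t addr) {
    std::string result;
    for (;;) {
        long word = read_word(pid, addr);
        const char* bytes = reinterpret_cast<const char*>(&word);
        for (std::size_t i = 0; i < sizeof(word); ++i) {
            if (bytes[i] == '\0') return result;
            result.push_back(bytes[i]);
        }
        addr += sizeof(word);
    }
}

// Walk the tracee's list of loaded objects, starting from the r_map pointer
// found in the rendezvous structure.
void list_loaded_objects(pid_t pid, std::uintptr_t link_map_addr) {
    while (link_map_addr != 0) {
        auto load_addr = static_cast<std::uintptr_t>(
            read_word(pid, link_map_addr + offsetof(struct link_map, l_addr)));
        auto name_addr = static_cast<std::uintptr_t>(
            read_word(pid, link_map_addr + offsetof(struct link_map, l_name)));
        std::string name = name_addr ? read_string(pid, name_addr) : "";
        std::printf("%#lx %s\n", static_cast<unsigned long>(load_addr),
                    name.empty() ? "<main executable>" : name.c_str());
        link_map_addr = static_cast<std::uintptr_t>(
            read_word(pid, link_map_addr + offsetof(struct link_map, l_next)));
    }
}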

I’ve written a small demonstration of these concepts, which you can find here. I can do a more detailed write up of this in the future if anyone is interested.


Expression evaluation

Expression evaluation is a feature which lets users evaluate expressions in the original source language while debugging their application. For example, in LLDB or GDB you could execute print foo() to call the foo function and print the result.

Depending on how complex the expression is, there are a few different ways of evaluating it. If the expression is a simple identifier, then the debugger can look at the debug information, locate the variable and print out the value, just like we did in the last part of this series. If the expression is a bit more complex, then it may be possible to compile the code to an intermediate representation (IR) and interpret that to get the result. For example, for some expressions LLDB will use Clang to compile the expression to LLVM IR and interpret that. If the expression is even more complex, or requires calling some function, then the code might need to be JITted to the target and executed in the address space of the debuggee. This involves calling mmap to allocate some executable memory, then the compiled code is copied to this block and is executed. LLDB does this by using LLVM’s JIT functionality.
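As a small taste of that last approach, here’s a minimal in-process sketch (assuming x86-64 Linux and an OS that permits a writable-and-executable mapping; a real debugger would instead allocate and write the memory in the debuggee via ptrace): it maps an executable page, copies in a hand-written blob of machine code and calls it.

#include <sys/mman.h>
#include <cstdio>
#include <cstring>

int main() {
    // x86-64 machine code for: mov eax, 42; ret
    unsigned char code[] = {0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3};

    // Allocate a page we can both write to and execute.
    void* mem = mmap(nullptr, sizeof(code), PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;

    // Copy the "JITted" code into the block and call it through a function pointer.
    std::memcpy(mem, code, sizeof(code));
    auto fn = reinterpret_cast<int (*)()>(mem);
    std::printf("%d\n", fn());  // prints 42

    munmap(mem, sizeof(code));
}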

If you want to know more about JIT compilation, I’d highly recommend Eli Bendersky’s posts on the subject.


Multi-threaded debugging support

The debugger shown in this series only supports single threaded applications, but to debug most real-world applications, multi-threaded support is highly desirable. The simplest way to support this is to trace thread creation and parse the procfs to get the information you want.

The Linux threading library is called pthreads. When pthread_create is called, the library creates a new thread using the clone syscall, and we can trace this syscall with ptrace (assuming your kernel is 2.5.46 or newer, which is when PTRACE_O_TRACECLONE was added). To do this, you’ll need to set some ptrace options after attaching to the debuggee:

ptrace(PTRACE_SETOPTIONS, m_pid, nullptr, PTRACE_O_TRACECLONE);

Now when clone is called, the process will be signaled with our old friend SIGTRAP. For the debugger in this series, you can add a case to handle_sigtrap which can handle the creation of the new thread:

case (SIGTRAP | (PTRACE_EVENT_CLONE << 8)): {
    // get the ID of the newly-created thread
    unsigned long event_message = 0;
    ptrace(PTRACE_GETEVENTMSG, m_pid, nullptr, &event_message);

    // handle creation
    // ...
    break;
}

Once you’ve got that, you can look in /proc/<pid>/task/ and read the memory maps and suchlike to get all the information you need.
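For example, here’s a minimal sketch of that last step (list_threads is a made-up helper): each subdirectory of /proc/<pid>/task is named after one thread ID, so enumerating them gives you the threads of the process.

#include <filesystem>
#include <string>
#include <cstdio>
#include <unistd.h>

// Print the thread IDs of a process by listing /proc/<pid>/task.
void list_threads(pid_t pid) {
    const std::string task_dir = "/proc/" + std::to_string(pid) + "/task";
    for (const auto& entry : std::filesystem::directory_iterator(task_dir)) {
        std::printf("thread %s\n", entry.path().filename().c_str());
    }
}

int main() {
    list_threads(getpid());  // demo: list this process's own threads
}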

GDB uses libthread_db, which provides a bunch of helper functions so that you don’t need to do all the parsing and processing yourself. Setting up this library is pretty weird and I won’t show how it works here, but you can go and read this tutorial if you’d like to use it.

The most complex part of multithreaded support is modelling the thread state in the debugger, particularly if you want to support non-stop mode or some kind of heterogeneous debugging where you have more than just a CPU involved in your computation.


The end!

Whew! This series took a long time to write, but I learned a lot in the process and I hope it was helpful. Get in touch on Twitter @TartanLlama or in the comments section if you want to chat about debugging or have any questions about the series. If there are any other debugging topics you’d like to see covered then let me know and I might do a bonus post.