void C::C::C::A::A::A::foo() – valid syntax monstrosity

Simon Brand from Simon Brand

Here’s an odd bit of C++ syntax for you. Say we have the following class hierarchy:

class A {
public:
virtual void foo() = 0;
};

class B {
public:
virtual void foo() = 0;
};

class C : public A, public B {
public:
virtual void foo();
};

The following definitions are all well-formed:

void C::foo(){
std::cout << "C";
}
void C::A::foo(){
std::cout << "A";
}
void C::B::foo(){
std::cout << "B";
}

The first one defines C::foo, the second defines A::foo and the third defines B::foo. This is valid because of an entity known as the injected-type-name:

A class-name is inserted into the scope in which it is declared immediately after the class-name is seen. The class-name is also inserted into the scope of the class itself; this is known as the injected-class-name. For purposes of access checking, the injected-class-name is treated as if it were a public member name. […]

Since A is a base class of C and the name A is injected into the scope of A, A is also visible from C.

The injected-class-name exists to ensure that the class is found during name lookup instead of entities in an enclosing scope. It also makes referring to the class name in a template instantiation easier. But since we’re awful people, we can use this perfectly reasonable feature do horribly perverse things like this:

void C::C::C::A::A::A::foo(){
std::cout << "A";
}

Yeah, don’t do that.

Writing a Linux Debugger Part 10: Advanced topics

Simon Brand from Simon Brand

We’re finally here at the last post of the series! This time I’ll be giving a high-level overview of some more advanced concepts in debugging: remote debugging, shared library support, expression evaluation, and multi-threaded support. These ideas are more complex to implement, so I won’t walk through how to do so in detail, but I’m happy to answer questions about these concepts if you have any.


Series index

  1. Setup
  2. Breakpoints
  3. Registers and memory
  4. Elves and dwarves
  5. Source and signals
  6. Source-level stepping
  7. Source-level breakpoints
  8. Stack unwinding
  9. Handling variables
  10. Advanced topics

Remote debugging

Remote debugging is very useful for embedded systems or debugging the effects of environment differences. It also sets a nice divide between the high-level debugger operations and the interaction with the operating system and hardware. In fact, debuggers like GDB and LLDB operate as remote debuggers even when debugging local programs. The general architecture is this:

debugarch

The debugger is the component which we interact with through the command line. Maybe if you’re using an IDE there’ll be another layer on top which communicates with the debugger through the machine interface. On the target machine (which may be the same as the host) there will be a debug stub, which in theory is a very small wrapper around the OS debug library which carries out all of your low-level debugging tasks like setting breakpoints on addresses. I say “in theory” because stubs are getting larger and larger these days. The LLDB debug stub on my machine is 7.6MB, for example. The debug stub communicates with the debugee process using some OS-specific features (in our case, ptrace), and with the debugger though some remote protocol.

The most common remote protocol for debugging is the GDB remote protocol. This is a text-based packet format for communicating commands and information between the debugger and debug stub. I won’t go into detail about it, but you can read all you could want to know about it here. If you launch LLDB and execute the command log enable gdb-remote packets then you’ll get a trace of all packets sent through the remote protocol. On GDB you can write set remotelogfile <file> to do the same.

As a simple example, here’s the packet to set a breakpoint:

$Z0,400570,1#43

$ marks the start of the packet. Z0 is the command to insert a memory breakpoint. 400570 and 1 are the argumets, where the former is the address to set a breakpoint on and the latter is a target-specific breakpoint kind specifier. Finally, the #43 is a checksum to ensure that there was no data corruption.

The GDB remote protocol is very easy to extend for custom packets, which is very useful for implementing platform- or language-specific functionality.


Shared library and dynamic loading support

The debugger needs to know what shared libraries have been loaded by the debuggee so that it can set breakpoints, get source-level information and symbols, etc. As well as finding libraries which have been dynamically linked against, the debugger must track libraries which are loaded at runtime through dlopen. To facilitate this, the dynamic linker maintains a rendezvous structure. This structure maintains a linked list of shared library descriptors, along with a pointer to a function which is called whenever the linked list is updated. This structure is stored where the .dynamic section of the ELF file is loaded, and is initialized before program execution.

A simple tracing algorithm is this:

  • The tracer looks up the entry point of the program in the ELF header (or it could use the auxillary vector stored in /proc/<pid>/aux)
  • The tracer places a breakpoint on the entry point of the program and begins execution.
  • When the breakpoint is hit, the address of the rendezvous structure is found by looking up the load address of .dynamic in the ELF file.
  • The rendezvous structure is examined to get the list of currently loaded libraries.
  • A breakpoint is set on the linker update function.
  • Whenever the breakpoint is hit, the list is updated.
  • The tracer infinitely loops, continuing the program and waiting for a signal until the tracee signals that it has exited.

I’ve written a small demonstration of these concepts, which you can find here. I can do a more detailed write up of this in the future if anyone is interested.


Expression evaluation

Expression evaluation is a feature which lets users evaluate expressions in the original source language while debugging their application. For example, in LLDB or GDB you could execute print foo() to call the foo function and print the result.

Depending on how complex the expression is, there are a few different ways of evaluating it. If the expression is a simple identifier, then the debugger can look at the debug information, locate the variable and print out the value, just like we did in the last part of this series. If the expression is a bit more complex, then it may be possible to compile the code to an intermediate representation (IR) and interpret that to get the result. For example, for some expressions LLDB will use Clang to compile the expression to LLVM IR and interpret that. If the expression is even more complex, or requires calling some function, then the code might need to be JITted to the target and executed in the address space of the debuggee. This involves calling mmap to allocate some executable memory, then the compiled code is copied to this block and is executed. LLDB does this by using LLVM’s JIT functionality.

If you want to know more about JIT compilation, I’d highly recommend Eli Bendersky’s posts on the subject.


Multi-threaded debugging support

The debugger shown in this series only supports single threaded applications, but to debug most real-world applications, multi-threaded support is highly desirable. The simplest way to support this is to trace thread creation and parse the procfs to get the information you want.

The Linux threading library is called pthreads. When pthread_create is called, the library creates a new thread using the clone syscall, and we can trace this syscall with ptrace (assuming your kernel is older than 2.5.46). To do this, you’ll need to set some ptrace options after attaching to the debuggee:

ptrace(PTRACE_SETOPTIONS, m_pid, nullptr, PTRACE_O_TRACECLONE);

Now when clone is called, the process will be signaled with our old friend SIGTRAP. For the debugger in this series, you can add a case to handle_sigtrap which can handle the creation of the new thread:

case (SIGTRAP | (PTRACE_EVENT_CLONE << 8)):
//get the new thread ID
unsigned long event_message = 0;
ptrace(PTRACE_GETEVENTMSG, pid, nullptr, message);

//handle creation
//...

Once you’ve got that, you can look in /proc/<pid>/task/ and read the memory maps and suchlike to get all the information you need.

GDB uses libthread_db, which provides a bunch of helper functions so that you don’t need to do all the parsing and processing yourself. Setting up this library is pretty weird and I won’t show how it works here, but you can go and read this tutorial if you’d like to use it.

The most complex part of multithreaded support is modelling the thread state in the debugger, particularly if you want to support non-stop mode or some kind of heterogeneous debugging where you have more than just a CPU involved in your computation.


The end!

Whew! This series took a long time to write, but I learned a lot in the process and I hope it was helpful. Get in touch on Twitter @TartanLlama or in the comments section if you want to chat about debugging or have any questions about the series. If there are any other debugging topics you’d like to see covered then let me know and I might do a bonus post.

Stack Overflow With Custom JsonConverter

Chris Oldwood from The OldWood Thing

[There is a Gist on GitHub that contains a minimal working example and summary of this post.]

We recently needed to change our data model so that what was originally a list of one type, became a list of objects of different types with a common base, i.e. our JSON deserialization now needed to deal with polymorphic types.

Naturally we googled the problem to see what support, if any, Newtonsoft’s JSON.Net had. Although it has some built-in support, like many built-in solutions it stores fully qualified type names which we didn’t want in our JSON, we just wanted simple technology-agnostic type names like “cat” or “dog” that we would be happy to map manually somewhere in our code. We didn’t want to write all the deserialization logic manually, but was happy to give the library a leg-up with the mapping of types.

JsonConverter

Our searching quickly led to the following question on Stack Overflow: “Deserializing polymorphic json classes without type information using json.net”. The lack of type information mentioned in the question meant the exact .Net type (i.e. name, assembly, version, etc.), and so the answer describes how to do it where you can infer the resulting type from one or more attributes in the data itself. In our case it was a field unsurprisingly called “type” that held a simplified name as described earlier.

The crux of the solution involves creating a JsonConverter and implementing the two methods CanConvert and ReadJson. If we follow that Stack Overflow post’s top answer we end up with an implementation something like this:

public class CustomJsonConverter : JsonConverter
{
  public override bool CanConvert(Type objectType)
  {
    return typeof(BaseType).
                       IsAssignableFrom(objectType);
  }

  public override object ReadJson(JsonReader reader,
           Type objectType, object existingValue,
           JsonSerializer serializer)
  {
    JObject item = JObject.Load(reader);

    if (item.Value<string>(“type”) == “Derived”)
    {
      return item.ToObject<DerivedType>();
    }
    else
    . . .
  }
}

This all made perfect sense and even agreed with a couple of other blog posts on the topic we unearthed. However when we plugged it in we ended up with an infinite loop in the ReadJson method that resulted in a StackOverflowException. Doing some more googling and checking the Newtonsoft JSON.Net documentation didn’t point out our “obvious” mistake and so we resorted to the time honoured technique of fumbling around with the code to see if we could get this (seemingly promising) solution working.

A Blind Alley

One avenue that appeared to fix the problem was manually adding the JsonConverter to the list of Converters in the JsonSerializerSettings object instead of using the [JsonConverter] attribute on the base class. We went back and forth with some unit tests to prove that this was indeed the solution and even committed this fix to our codebase.

However I was never really satisfied with this outcome and so decided to write this incident up. I started to work through the simplest possible example to illustrate the behaviour but when I came to repro it I found that neither approach worked – attribute or serializer settings - I always got into an infinite loop.

Hence I questioned our original diagnosis and continued to see if there was a more satisfactory answer.

ToObject vs Populate

I went back and re-read the various hits we got with those additional keywords (recursion, infinite loop and stack overflow) to see if we’d missed something along the way. The two main candidates were “Polymorphic JSON Deserialization failing using Json.Net” and “Custom inheritance JsonConverter fails when JsonConverterAttribute is used”. Neither of these explicitly references the answer we initially found and what might be wrong with it – they give a different answer to a slightly different question.

However in these answers they suggest de-serializing the object in a different way, instead of using ToObject<DerivedType>() to do all the heavy lifting, they suggest creating the uninitialized object yourself and then using Populate() to fill in the details, like this:

{
  JObject item = JObject.Load(reader);

  if (item.Value<string>(“type”) == “Derived”)
  {
    var @object = new DerivedType();
    serializer.Populate(item.CreateReader(), @object);
    return @object;
  }
  else
    . . .
}

Plugging this approach into my minimal example worked, and for both the converter techniques too: attribute and serializer settings.

Unanswered Questions

So I’ve found another technique that works, which is great, but I still lack closure around the whole affair. For example, how come the answer in the the original Stack Overflow question “Deserializing polymorphic json classes” didn’t work for us? That answer has plenty of up-votes and so should be considered pretty reliable. Has there been a change to Newtonsoft’s JSON.Net library that has somehow caused this answer to now break for others? Is there a new bug that we’ve literally only just discovered (we’re using v10)? Why don’t the JSON.Net docs warn against this if it really is an issue, or are we looking in the wrong part of the docs?

As described right at the beginning I’ve published a Gist with my minimal example and added a comment to the Stack Overflow answer with that link so that anyone else on the same journey has some other pieces of the jigsaw to work with. Perhaps over time my comment will also acquire up-votes to help indicate that it’s not so cut-and-dried. Or maybe someone who knows the right answer will spot it and point out where we went wrong.

Ultimately though this is probably a case of not seeing the wood for the trees. It’s so easy when you’re trying to solve one problem to get lost in the accidental complexity and not take a step back. Answers on Stack Overflow generally carry a large degree of gravitas, but they should not be assumed to be infallible. All documentation can go out of date even if there are (seemingly) many eyes watching over it.

When your mind-set is one that always assumes the bugs are of your own making, unless the evidence is overwhelming, then those times when you might actually not be entirely at fault seem to feel all the more embarrassing when you realise the answer was probably there all along but you discounted it too early because your train of thought was elsewhere.

C++ iterator wrapping a stream not 1-1

Andy Balaam from Andy Balaam&#039;s Blog

Series: Iterator, Iterator Wrapper, Non-1-1 Wrapper

Sometimes we want to write an iterator that consumes items from some underlying iterator but produces its own items slower than the items it consumes, like this:

ColonSep items("aa:foo::x");
// Prints "aa, foo, , x"
for(auto s : items)
{
    std::cout << s << ", ";
}

When we pass a 9-character string (i.e. an iterator that yields 9 items) to ColonSep, above, we only repeat 4 times in our loop, because ColonSep provides an iterable range that yields one value for each whole word it finds in the underlying iterator of 9 characters.

To do something like this I'd recommend consuming the items of the underlying iterator early, so it is ready when requested with operator*. We also need our iterator class to hold on to the end of the underlying iterator as well as the current position.

First we need a small container to hold the next item we will provide:

struct maybestring
{
    std::string value_;
    bool at_end_;

    explicit maybestring(const std::string value)
    : value_(value)
    , at_end_(false)
    {}

    explicit maybestring()
    : value_("--error-past-end--")
    , at_end_(true)
    {}
};

A maybestring either holds the next item we will provide, or at_end_ is true, meaning we have reached the end of the underlying iterator and we will report that we are at the end ourself when asked.

Like the simpler iterators we have looked at, we still need a little container to return from the postincrement operator:

class stringholder
{
    const std::string value_;
public:
    stringholder(const std::string value) : value_(value) {}
    std::string operator*() const { return value_; }
};

Now we are ready to write our iterator class, which always has the next value ready in its next_ member, and holds on to the current and end positions of the underlying iterator in wrapped_ and wrapped_end_:

class myit
{
private:
    typedef std::string::const_iterator wrapped_t;
    wrapped_t wrapped_;
    wrapped_t wrapped_end_;
    maybestring next_;

The constructor holds on the underlying iterator pointers, and immediately fills next_ with the next value by calling next_item passing in true to indicate that this is the first item:

public:
    myit(wrapped_t wrapped, wrapped_t wrapped_end)
    : wrapped_(wrapped)
    , wrapped_end_(wrapped_end)
    , next_(next_item(true))
    {
    }

    // Previously provided by std::iterator
    typedef int                     value_type;
    typedef std::ptrdiff_t          difference_type;
    typedef int*                    pointer;
    typedef int&                    reference;
    typedef std::input_iterator_tag iterator_category;

next_item looks like this:

private:
    maybestring next_item(bool first_time)
    {
        if (wrapped_ == wrapped_end_)
        {
            return maybestring();  // We are at the end
        }
        else
        {
            if (!first_time)
            {
                ++wrapped_;
            }
            return read_item();
        }
    }

next_item recognises whether we've reached the end of the underlying iterator and saves the empty maybstring if so. Otherwise, it skips forward once (unless we are on the first element) and then calls read_item:

    maybestring read_item()
    {
        std::string ret = "";
        for (; wrapped_ != wrapped_end_; ++wrapped_)
        {
            char c = *wrapped_;
            if (c == ':')
            {
                break;
            }
            ret += c;
        }
        return maybestring(ret);
    }

read_item implements the real logic of looping through the underlying iterator and combining those values together to create the next item to provide.

The hard part of the iterator class is done, leaving only the more normal functions we must provide:

public:
    value_type operator*() const
    {
        assert(!next_.at_end_);
        return next_.value_;
    }

    bool operator==(const myit& other) const
    {
        // We only care about whether we are at the end
        return next_.at_end_ == other.next_.at_end_;
    }

    bool operator!=(const myit& other) const { return !(*this == other); }

    stringholder operator++(int)
    {
        assert(!next_.at_end_);
        stringholder ret(next_.value_);
        next_ = next_item(false);
        return ret;
    }

    myit& operator++()
    {
        assert(!next_.at_end_);
        next_ = next_item(false);
        return *this;
    }
}

Note that operator== is only concerned with whether or not we are an end iterator or not. Nothing else matters for providing correct iteration.

Our final bit of bookkeeping is the range class that allows our new iterator to be used in a for loop:

class ColonSep
{
private:
    const std::string str_;
public:
    ColonSep(const std::string str) : str_(str) {}
    myit begin() { return myit(std::begin(str_), std::end(str_)); }
    myit end()   { return myit(std::end(str_),   std::end(str_)); }
};

A lot of the code above is needed for all code that does this kind of job. Next time we'll look at how to use templates to make it useable in the general case.

Why I don’t like getter and setter functions in C++, reason #314.15

Timo Geusch from The Lone C++ Coder&#039;s Blog

This is a post I wrote several years ago and it’s been languishing in my drafts folder ever since. I’m not working on this particular codebase any more. That said, the problems caused by using Java-like getter and setter functions as the sole interface to an object in the context described in the post have […]

The post Why I don’t like getter and setter functions in C++, reason #314.15 appeared first on The Lone C++ Coder's Blog.

LINQ: Did You Mean First(), or Really Single()?

Chris Oldwood from The OldWood Thing

TL;DR: if you see someone using the LINQ method First() without a comparator it’s probably a bug and they should have used Single().

I often see code where the author “knows” that a sequence (i.e. an Enumerable<T>) will result in just one element and so they use the LINQ method First() to retrieve the value, e.g.

var value = sequence.First();

However there is also the Single() method which could be used to achieve a similar outcome:

var value = sequence.Single();

So what’s the difference and why do I think it’s probably a bug if you used First?

Both First and Single have the same semantics for the case where the the sequence is empty (they throw) and similarly when the sequence contains only a single element (they return it). The difference however is when the sequence contains more than one element – First discards the extra values and Single will throw an exception.

If you’re used to SQL it’s the difference between using “top” to filter and trying extract a single scalar value from a subquery:

select top 1 x as [value] from . . .

and

select a, (select x from . . .) as [value] from . . .

(The latter tends to complain loudly if the result set from the subquery is not just a single scalar value or null.)

While you might argue that in the face of a single-value sequence both methods could be interchangeable, to me they say different things with Single begin the only “correct” choice.

Seeing First says to me that the author knows the sequence might contain multiple values and they have expressed an ordering which ensures the right value will remain after the others have been consciously discarded.

Whereas Single suggests to me that the author knows this sequence contains one (and only one) element and that any other number of elements is wrong.

Hence another big clue that the use of First is probably incorrect is the absence of a comparator function used to order the sequence. Obviously it’s no guarantee as the sequence might be being returned from a remote service or function which will do the sorting instead but I’d generally expect to see the two used together or some other clue (method or variable name, or parameter) nearby which defines the order.

The consequence of getting this wrong is that you don’t detect a break in your expectations (a multi-element sequence). If you’re lucky it will just be a test that starts failing for a strange reason, which is where I mostly see this problem showing up. If you’re unlucky then it will silently fail and you’ll be using the wrong data which will only manifest itself somewhere further down the road where it’s harder to trace back.

Catch Up

Phil Nash from level of indirection

Trolley

Stock image from Shutterstock

It's been just over six years since I first announced Catch to the world as a brand new C++ test framework!

In that time it has matured to the point that it can take on the heavyweights - while still staying true to its original goals of being lightweight, easy to get started with and low-friction to work with.

In the last couple of years or so it has also increased dramatically in popularity! That sounds like a good thing - and it is - but with that comes a greater diversity of environments and usage, and more people raising issues and submitting pull requests.

Again, it's great to have so much input from the community - especially in the form of pull requests - where other developers have gone to some effort to implement a change, or a fix, and present it back for inclusion in the main project. So it's been heart-breaking for me that, between this increase in volume and finding my meagre free-time stretched even further, so many issues and PRs have been left unacknowledged - many not even seen by me in the first place.

But two things have happened, recently, that completely change this state of affairs. We're moving firmly in the right direction again.

Firstly, as mentioned in On Joining JetBrains, I've recently changed jobs to one that should give me much more time and opportunity to work on Catch - as well as the opportunity to do so in my home office - with stable internet (as opposed to on the train while commuting to and from work). The first few months were a bit of a wash for the reasons discussed in that post, but, as I also suggested there, this year has seen that change and I've been able to put in quite a lot of work on Catch already.

But that's not really enough. There's a huge back-log - and I'm still only doing this part time - and I want to spend time working on Catch2 as well (more on that soon). I don't want to end up back in the situation where everything is backing up and there's no hope of recovery.

I've been hoping to find someone else to be a key maintainer of Catch for a couple of years now. I've not been very active in this search - for all the same reasons - but it's been on my mind.

But, just last month, after I appeared on CppCast talking about JetBrains and Catch, there was a thread on Reddit about it - with many expressing concern over the Catch situation. I brought the subject up on there again and got the attention of one of the commenters.

I didn't know it at the time, but Martin Hořeňovský has been responsible for a good number of those PRs and issues that had been left unaddressed - as well as an active community member in helping address other people's issues. So it's with great pleasure (and relief!) that I can announce that Martin now has full commit rights to Catch on GitHub and has been prolific in working through the currently outstanding tickets.

Martin seems to really "get" Catch, and the design goals around it - so working with him on this the last couple of weeks has been very rewarding. From some queries I just ran on GitHub it looks like 39 issues have been closed and 38 PRs merged or closed in that time! That's compared to 9 new issues and 7 PRs - about half of which were created by Martin and I in the process. And that's not to mention all the labels we've been using to categorise the other tickets - with many marked as "Resolved - pending review" - which usually means we think it's resolved but we're just waiting for feedback (or a chance for more testing).

With 219 open issues and 41 PRs still outstanding, at time of writing, there's a lot more work to do yet - but I hope this reassures you that we're going in the right direction - and fast!

And we're not stopping with Martin. We have at least one other volunteer that I'll be bringing up to speed soon.

Catch2

I've referred to Catch2 a number of times now, and talked a little about what it will be. The biggest reason for making it a major release, according to Semantic Versioning, is that it will drop support for pre-C++11. For that reason Catch Classic (1.x) will continue to receive at least bug fix updates - but no more new features once Catch2 is fully released. A few major features in the pipeline have been explicitly deferred to Catch2: concurrency support and generators/ property-based testing in particular.

Moving to C++11 provides a very large scope for cleaning up the code-base - which has a significant volume of code dedicated to platform-specific workarounds for compiler shortcomings, missing library features such as smart pointers, and boilerplate that will no longer be necessary with things like range-based-for, auto and others. Lambdas will be useful too, but are not quite so important.

Because taking advantage of C++11 has the potential to touch almost every line of code, I'm taking the opportunity to rewrite the core of Catch - primarily the assertion macros and the infrastructure to support that. This is code that is #included in every test file, and expanded (in the case of macros) in every test case or even every assertion. Keeping this code lightweight is essential to avoiding a compile time hit. There's a number of ways this foot-print can be reduced and the rewrite will strive for this as much as possible.

The rest of the code, concerned with maintaining the registry of tests, parsing and interpreting the command line, running tests and reporting results, will be updated more incrementally.

I already have a (not-yet-public) proof-of-concept version of the re-written code. It's not yet complete but, so far, has only one standard library dependency and minimal templates. The compile-time overhead is imperceptible.

In addition to compile-time, runtime performance is also a goal of Catch2. It's not an overriding goal - I won't be obfuscating the code in the name of wringing out the last few milliseconds of performance - but this is a definite change from Catch Classic where runtime performance was a non-goal. This is in recognition of the fact that Catch is used for more than just isolated unit tests - and will also become more important with property based testing.

I don't have a timeline, yet, for when I expect Catch2 to be ready - and in the immediate term getting Catch Classic back under control is the priority. Despite the partial re-write, and the major version increment, I expect tests written against Catch Classic to mostly "just work" with Catch2 - or require very minimal changes in a some rare cases.

You

As already mentioned many developers have also spent time and effort contributing issues, fixes and even feature PRs over the years. So Catch has really been a community project for years now and I'm very grateful for all the help and support. I think Catch has shown that having a low-friction approach to testing C++ code is very important to a lot of people and I'm hoping we'll continue to build on that. Thank you all.

Catch Up

Phil Nash from level of indirection

Trolley

It's been just over six years since I first announced Catch to the world as a brand new C++ test framework!

In that time it has matured to the point that it can take on the heavyweights - while still staying true to its original goals of being lightweight, easy to get started with and low-friction to work with.

In the last couple of years or so it has also increased dramatically in popularity! That sounds like a good thing - and it is - but with that comes a greater diversity of environments and usage, and more people raising issues and submitting pull requests.

Again, it's great to have so much input from the community - especially in the form of pull requests - where other developers have gone to some effort to implement a change, or a fix, and present it back for inclusion in the main project. So it's been heart-breaking for me that, between this increase in volume and finding my meagre free-time stretched even further, so many issues and PRs have been left unacknowledged - many not even seen by me in the first place.

But two things have happened, recently, that completely change this state of affairs. We're moving firmly in the right direction again.

Firstly, as mentioned in On Joining JetBrains, I've recently changed jobs to one that should give me much more time and opportunity to work on Catch - as well as the opportunity to do so in my home office - with stable internet (as opposed to on the train while commuting to and from work). The first few months were a bit of a wash for the reasons discussed in that post, but, as I also suggested there, this year has seen that change and I've been able to put in quite a lot of work on Catch already.

But that's not really enough. There's a huge back-log - and I'm still only doing this part time - and I want to spend time working on Catch2 as well (more on that soon). I don't want to end up back in the situation where everything is backing up and there's no hope of recovery.

I've been hoping to find someone else to be a key maintainer of Catch for a couple of years now. I've not been very active in this search - for all the same reasons - but it's been on my mind.

But, just last month, after I appeared on CppCast talking about JetBrains and Catch, there was a thread on Reddit about it - with many expressing concern over the Catch situation. I brought the subject up on there again and got the attention of one of the commenters.

I didn't know it at the time, but Martin Hořeňovský has been responsible for a good number of those PRs and issues that had been left unaddressed - as well as an active community member in helping address other people's issues. So it's with great pleasure (and relief!) that I can announce that Martin now has full commit rights to Catch on GitHub and has been prolific in working through the currently outstanding tickets.

Martin seems to really "get" Catch, and the design goals around it - so working with him on this the last couple of weeks has been very rewarding. From some queries I just ran on GitHub it looks like 39 issues have been closed and 38 PRs merged or closed in that time! That's compared to 9 new issues and 7 PRs - about half of which were created by Martin and I in the process. And that's not to mention all the labels we've been using to categorise the other tickets - with many marked as "Resolved - pending review" - which usually means we think it's resolved but we're just waiting for feedback (or a chance for more testing).

With 219 open issues and 41 PRs still outstanding, at time of writing, there's a lot more work to do yet - but I hope this reassures you that we're going in the right direction - and fast!

And we're not stopping with Martin. We have at least one other volunteer that I'll be bringing up to speed soon.

Catch2

I've referred to Catch2 a number of times now, and talked a little about what it will be. The biggest reason for making it a major release, according to Semantic Versioning, is that it will drop support for pre-C++11. For that reason Catch Classic (1.x) will continue to receive at least bug fix updates - but no more new features once Catch2 is fully released. A few major features in the pipeline have been explicitly deferred to Catch2: concurrency support and generators/ property-based testing in particular.

Moving to C++11 provides a very large scope for cleaning up the code-base - which has a significant volume of code dedicated to platform-specific workarounds for compiler shortcomings, missing library features such as smart pointers, and boilerplate that will no longer be necessary with things like range-based-for, auto and others. Lambdas will be useful too, but are not quite so important.

Because taking advantage of C++11 has the potential to touch almost every line of code, I'm taking the opportunity to rewrite the core of Catch - primarily the assertion macros and the infrastructure to support that. This is code that is #included in every test file, and expanded (in the case of macros) in every test case or even every assertion. Keeping this code lightweight is essential to avoiding a compile time hit. There's a number of ways this foot-print can be reduced and the rewrite will strive for this as much as possible.

The rest of the code, concerned with maintaining the registry of tests, parsing and interpreting the command line, running tests and reporting results, will be updated more incrementally.

I already have a (not-yet-public) proof-of-concept version of the re-written code. It's not yet complete but, so far, has only one standard library dependency and minimal templates. The compile-time overhead is imperceptible.

In addition to compile-time, runtime performance is also a goal of Catch2. It's not an overriding goal - I won't be obfuscating the code in the name of wringing out the last few milliseconds of performance - but this is a definite change from Catch Classic where runtime performance was a non-goal. This is in recognition of the fact that Catch is used for more than just isolated unit tests - and will also become more important with property based testing.

I don't have a timeline, yet, for when I expect Catch2 to be ready - and in the immediate term getting Catch Classic back under control is the priority. Despite the partial re-write, and the major version increment, I expect tests written against Catch Classic to mostly "just work" with Catch2 - or require very minimal changes in a some rare cases.

You

As already mentioned many developers have also spent time and effort contributing issues, fixes and even feature PRs over the years. So Catch has really been a community project for years now and I'm very grateful for all the help and support. I think Catch has shown that having a low-friction approach to testing C++ code is very important to a lot of people and I'm hoping we'll continue to build on that. Thank you all.

C++17 – Why it’s better than you might think

Phil Nash from level of indirection

C++20 Horizon

From Mark Isaacson's Meeting C++ talk, "Exploring C++ and beyond"

I was recently interviewed for CppCast and one the news items that came up was a trip report from a recent C++ standards meeting (Issaquah, Nov 2016). This was one of the final meetings before the C++17 standard is wrapped up, so things are looking pretty set at this point. During the discussion I made the point that, despite initially being disappointed that so many headline features were not making it in (Concepts, Modules, Coroutines and Ranges - as well as dot operator and uniform call syntax), I'm actually very happy with how C++17 is shaping up. There are some very nice refinements and features (const expr if is looking quite big on its own) - and including a few surprise ones (structured bindings being the main one for me).

But the part of what I said that surprised even me (because I hadn't really thought of it until a couple of hours before we recorded) was that perhaps it is for the best that we don't get those bigger features just yet! The thinking was that if you take them all together - or even just two or three of them - they have the potential to change the language - and the way we write "modern C++" perhaps even more so than C++11 did - and that's really saying something! Now that's a good thing, in my opinion, but I do wonder if it would be too soon for such large scale changes just yet.

After the 98 standard C++ went into a thirteen year period in the wilderness (there was C++03, which fixed a couple of problems with the 98 standard - but didn't actually add any new features - except value initialisation). As this period coincided with the rise of other mainstream languages - Java and C# in particular - it seemed that C++ was a dying language - destined for a drawn out, Cobolesque, old-age at best.

But C++11 changed all that and injected a vitality and enthusiasm into the community not seen since the late 90s - if ever! Again the timing was a factor - with Moore's Law no longer influencing single-core performance there was a resurgence of interest in low/ zero overhead systems languages - and C++11 was getting modern enough to be palatable again. "There's no such thing as a free lunch" turns out to be true if you wait long enough.

So the seismic changes in C++11 were overdue, welcome and much needed at that time. Since then the standardisation process has moved to the "train model", which has settled on a new standard every three years. Whatever is ready (and fits) makes it in. If it's not baked it's dropped - or is moved into a TS that can be given more real-world testing before being reconsidered. This has allowed momentum to be maintained and reassures us that we won't be stuck without an update to the standard for too long again.

On the other hand many code-bases are still catching up to C++11. There are not many breaking changes - and you can introduce newer features incrementally and to only parts of the code-base - but this can lead to some odd looking code and once you start converting things you tend to want to go all in. Even if that's not true for your own codebase it may be true of libraries and frameworks you depend on! Those features we wanted in C++17 could have a similar - maybe even greater - effect and my feeling is that, while they would certainly be welcomed by many (me included) - there would also be many more that might start to see the churn on the language as a sign of instability. "What? We've only just moved on to C++11 and you want us to adopt these features too?". Sometimes it can be nice to just know where you are with a language - especially after a large set of changes. 2011 might seem like a long time ago but there's a long lag in compiler conformance, then compiler adoption, then understanding and usage of newer features. Those just starting to experiment with C++11 language features are still very common.

I could be wrong about this, but it feels like there's something in it based on my experience. And I think the long gap between C++98 and C++11 is responsible for at least amplifying the effect. People got used to C++ being defined a single way and now we have three standards already in use, with another one almost ready. It's a lot to keep up with - even for those of us that enjoy that sort of thing!

So I'm really looking forward to those bigger features that we'll hopefully get in C++20 (and don't forget you can even use the TS's now if your compiler supports them - and the Ranges library is available on GitHub) - but I'm also looking forward to updating the language with C++17 and the community gaining a little more experience with the new, rapidly evolving, model of C++ before the next big push.

Surprising Defaults – HttpClient ExpectContinue

Chris Oldwood from The OldWood Thing

One of the things you quickly discover when moving from building services on-premise to “the cloud” is quite how many more bits of wire and kit suddenly sit between you and your consumer. Performance-wise this already elongated network path can then be further compounded when the framework you’re using invokes unintuitive behaviour by default [1].

The Symptoms

The system was a new REST API built in C# on the .Net framework (4.6) and hosted in the cloud with AWS. This AWS endpoint was then further fronted by Akamai for various reasons. The initial consumer was an on-premise adaptor (also written in C#) which itself had to go through an enterprise grade web proxy to reach the outside world.

Naturally monitoring was added in fairly early on so that we could start to get a feel for how much added latency moving to the cloud would bring. Our first order approximation to instrumentation allowed us to tell how long the HTTP requests took to handle along with a breakdown of the major functions, e.g. database queries and 3rd party requests. Outside the service we had some remote monitoring too that could tell us the performance from a more customer-like position.

When we integrated with the 3rd party service some poor performance stats caused us to look closer into our metrics. The vast majority of big delays were outside our control, but it also raised some other questions as the numbers didn’t quite add up. We had expected the following simple formula to account for virtually all the time:

HTTP Request Time ~= 3rd Party Time + Database Time

However we were seeing a 300 ms discrepancy in many (but not all) cases. It was not our immediate concern as there was bigger fish to fry but some extra instrumentation was added to the OWIN pipeline and we did a couple of quick local profile runs to look out for anything obviously out of place. The finger seemed to point to time lost somewhere in the Nancy part of the pipeline, but that didn’t entirely make sense at the time so it was mentally filed away and we moved on.

Serendipity Strikes

Whilst talking to the 3rd party about our performance woes with their service they came back to us and asked if we could stop sending them a “Expect: 100-Continue” header in our HTTP requests.

This wasn’t something anyone in the team was aware of and as far as we could see from the various RFCs and blog posts it was something “naturally occurring” on the internet. We also didn’t know if it was us adding it or one of the many proxies in between us and them.

We discovered how to turn it off, and did, but it made little difference to the performance problems we had with them, which were in the order of seconds, not milliseconds. Feeling uncomfortable about blindly switching settings off without really understanding them we reverted the change.

The mention of this header also cropped up when we started investigating some errors we were getting from Akamai that seemed to be more related to a disparity in idle connection timeouts.

Eventually, as we learned more about this mysterious header someone in the team put two-and-two together and realised this was possibly where our missing time was going too.

The Cause

Our REST API uses PUT requests to add resources and it appears that the default behaviour of the .Net HttpClient class is to enable the sending of this “Expect: 100-Continue” header for those types of requests. Its purpose is to tell the server that the headers have been sent but that it will delay sending the body until it receives a 100-Continue style response. At that point the client sends the body, the server can then process the entire request and the response is handled by the client as per normal.

Yes, that’s right, it splits the request up so that it takes two round trips instead of one!

Now you can probably begin to understand why our request handling time appeared elongated and why it also appeared to be consumed somewhere within the Nancy framework. The request processing is started and handled by the OWN middleware as that only depends on the headers, it then enters Nancy which finds a handler, and so requests the body in the background (asynchronously). When it finally arrives the whole request is then passed to our Nancy handler just as if it had been sent all as a single chunk.

The Cure

When you google this problem with relation to .Net you’ll see that there are a couple of options here. We were slightly nervous about choosing the nuclear option (setting it globally on the ServicePointManager) and instead added an extra line into our HttpClient factory so that it was localised:

var client = new HttpClient(...);
...
client.DefaultRequestHeaders.ExpectContinue = false;

We re-deployed our services, checked our logs to ensure the header was no longer being sent, and then checked the various metrics to see if the time was now all accounted for, and it was.

Epilogue

In hindsight this all seems fairly obvious, at least, once you know what this header is supposed to do, and yet none of the people in my team (who are all pretty smart) joined up the dots right away. When something like this goes astray I like to try and make sense of why we didn’t pick it up as quickly as perhaps we should have.

In the beginning there were so many new things for the team to grasp. The difference in behaviour between our remote monitoring and on-premise adaptor was assumed to be one of infrastructure especially when we had already battled the on-premise web proxy a few times [2]. We saw so many other headers in our requests that we never added so why would we assume this one was any different (given none of us had run across it before)?

Given the popularity and maturity of the Nancy framework we surmised that no one would use it if there was the kind of performance problems we were seeing, so once again were confused as to how the time could appear to be lost inside it. Although we were all aware of what the async/await construct does none of us had really spent any serious time trying to track down performance anomalies in code that used it so liberally and so once again we had difficulties understanding perhaps what the tool was really telling us.

Ultimately though the default behaviour just seems so utterly wrong that none of use could imagine the out-of-the-box settings would cause the HttpClient to behave this way. By choosing this default we are in essence optimising PUT requests for the scenario where the body does not need sending, which we all felt is definitely the exception not the norm. Aside from large file uploads or massive write contention we were struggling to come up with a plausible use case.

I don’t know what forces caused this decision to be made as I clearly wasn’t there and I can’t find any obvious sources that might explain it either. The internet and HTTP has evolved so much over the years that it’s possible this behaviour provides the best compatibility with web servers out-of-the-box. My own HTTP experience only covers the last few years along with few more around the turn of the millennium, but my colleagues easily cover the decades I’m missing so I don’t feel I’m missing anything obvious.

Hopefully some kind soul will use the comments section to link to the rationale so we can all get a little closure on the issue.

 

[1] Violating The Principle of Least Astonishment for configuration settings was something I covered more generally before in “Sensible Defaults”.

[2] See “The Curse of NTLM Based HTTP Proxies”.