Surprising Defaults – HttpClient ExpectContinue

Chris Oldwood from The OldWood Thing

One of the things you quickly discover when moving from building services on-premise to “the cloud” is quite how many more bits of wire and kit suddenly sit between you and your consumer. Performance-wise this already elongated network path can then be further compounded when the framework you’re using invokes unintuitive behaviour by default [1].

The Symptoms

The system was a new REST API built in C# on the .Net framework (4.6) and hosted in the cloud with AWS. This AWS endpoint was then further fronted by Akamai for various reasons. The initial consumer was an on-premise adaptor (also written in C#) which itself had to go through an enterprise grade web proxy to reach the outside world.

Naturally monitoring was added in fairly early on so that we could start to get a feel for how much added latency moving to the cloud would bring. Our first order approximation to instrumentation allowed us to tell how long the HTTP requests took to handle along with a breakdown of the major functions, e.g. database queries and 3rd party requests. Outside the service we had some remote monitoring too that could tell us the performance from a more customer-like position.

When we integrated with the 3rd party service some poor performance stats caused us to look closer into our metrics. The vast majority of big delays were outside our control, but it also raised some other questions as the numbers didn’t quite add up. We had expected the following simple formula to account for virtually all the time:

HTTP Request Time ~= 3rd Party Time + Database Time

However we were seeing a 300 ms discrepancy in many (but not all) cases. It was not our immediate concern as there were bigger fish to fry but some extra instrumentation was added to the OWIN pipeline and we did a couple of quick local profile runs to look out for anything obviously out of place. The finger seemed to point to time lost somewhere in the Nancy part of the pipeline, but that didn’t entirely make sense at the time so it was mentally filed away and we moved on.

Serendipity Strikes

Whilst talking to the 3rd party about our performance woes with their service they came back to us and asked if we could stop sending them an “Expect: 100-Continue” header in our HTTP requests.

This wasn’t something anyone in the team was aware of and as far as we could see from the various RFCs and blog posts it was something “naturally occurring” on the internet. We also didn’t know if it was us adding it or one of the many proxies in between us and them.

We discovered how to turn it off, and did, but it made little difference to the performance problems we had with them, which were in the order of seconds, not milliseconds. Feeling uncomfortable about blindly switching settings off without really understanding them we reverted the change.

The mention of this header also cropped up when we started investigating some errors we were getting from Akamai that seemed to be more related to a disparity in idle connection timeouts.

Eventually, as we learned more about this mysterious header someone in the team put two-and-two together and realised this was possibly where our missing time was going too.

The Cause

Our REST API uses PUT requests to add resources and it appears that the default behaviour of the .Net HttpClient class is to enable the sending of this “Expect: 100-Continue” header for those types of requests. Its purpose is to tell the server that the headers have been sent but that it will delay sending the body until it receives a 100-Continue style response. At that point the client sends the body, the server can then process the entire request and the response is handled by the client as per normal.

Yes, that’s right, it splits the request up so that it takes two round trips instead of one!

Now you can probably begin to understand why our request handling time appeared elongated and why it also appeared to be consumed somewhere within the Nancy framework. The request processing is started and handled by the OWIN middleware, as that only depends on the headers. It then enters Nancy, which finds a handler and so requests the body in the background (asynchronously). When the body finally arrives the whole request is then passed to our Nancy handler just as if it had been sent all as a single chunk.

The Cure

When you google this problem with relation to .Net you’ll see that there are a couple of options here. We were slightly nervous about choosing the nuclear option (setting it globally on the ServicePointManager) and instead added an extra line into our HttpClient factory so that it was localised:

var client = new HttpClient(...);
...
client.DefaultRequestHeaders.ExpectContinue = false;

We re-deployed our services, checked our logs to ensure the header was no longer being sent, and then checked the various metrics to see if the time was now all accounted for, and it was.

Epilogue

In hindsight this all seems fairly obvious, at least, once you know what this header is supposed to do, and yet none of the people in my team (who are all pretty smart) joined up the dots right away. When something like this goes astray I like to try and make sense of why we didn’t pick it up as quickly as perhaps we should have.

In the beginning there were so many new things for the team to grasp. The difference in behaviour between our remote monitoring and on-premise adaptor was assumed to be one of infrastructure especially when we had already battled the on-premise web proxy a few times [2]. We saw so many other headers in our requests that we never added so why would we assume this one was any different (given none of us had run across it before)?

Given the popularity and maturity of the Nancy framework we surmised that no one would use it if it had the kind of performance problems we were seeing, so once again we were confused as to how the time could appear to be lost inside it. Although we were all aware of what the async/await construct does, none of us had really spent any serious time trying to track down performance anomalies in code that used it so liberally, and so once again we had difficulties understanding what the tool was really telling us.

Ultimately though the default behaviour just seems so utterly wrong that none of us could imagine the out-of-the-box settings would cause the HttpClient to behave this way. By choosing this default we are in essence optimising PUT requests for the scenario where the body does not need sending, which we all felt is definitely the exception not the norm. Aside from large file uploads or massive write contention we were struggling to come up with a plausible use case.

I don’t know what forces caused this decision to be made as I clearly wasn’t there and I can’t find any obvious sources that might explain it either. The internet and HTTP have evolved so much over the years that it’s possible this behaviour provides the best compatibility with web servers out-of-the-box. My own HTTP experience only covers the last few years along with a few more around the turn of the millennium, but my colleagues easily cover the decades I’m missing so I don’t feel I’m missing anything obvious.

Hopefully some kind soul will use the comments section to link to the rationale so we can all get a little closure on the issue.

 

[1] Violating The Principle of Least Astonishment for configuration settings was something I covered more generally before in “Sensible Defaults”.

[2] See “The Curse of NTLM Based HTTP Proxies”.

Automated Integration Testing with TIBCO

Chris Oldwood from The OldWood Thing

In the past few years I’ve worked on a few projects where TIBCO has been the message queuing product of choice within the company. Naturally being a test-oriented kind of guy I’ve used unit and component tests for much of the donkey work, but initially had to shy away from writing any automated integration tests due to the inherent difficulties of getting the system into a known state in isolation.

Organisational Barriers

For any automated integration tests to run reliably we need to control the whole environment, which ideally is our development workstations but also our CI build environment (see “The Developer’s Sandbox”). The main barriers to this with a commercial product like TIBCO are often technological, but also more often than not, organisational too.

In my experience middleware like this tends to be proprietary, very expensive, and owned within the organisation by a dedicated team. They will configure the staging and production queues and manage the fault-tolerant servers, which is probably what you’d expect as you near production. A more modern DevOps friendly company would recognise the need to allow teams to test internally first and would help them get access to the product and tools so they can build their test scaffolding that provides the initial feedback loop.

Hence just being given the client access libraries to the product is not enough; we need a way to bring up and tear down the service endpoint, in isolation, so that we can test connectivity and failover scenarios and message interoperability. We also need to be able to develop and test our logic around poisoned messages and dead-letter queues. And all this needs to be automatable so that as we develop and refactor we can be sure that we’ve not broken anything; manually testing this stuff is just not scalable in a shared test environment at the pace modern software is developed.

That said, the TIBCO EMS SDK I’ve been working with (v6.3.0) has all the parts I needed to do this stuff, albeit with some workarounds to avoid needing to run the tests with administrator rights which we’ll look into later.

The only other thorny issue is licensing. You would hope that software product companies would do their utmost to get developers on their side and make it easy for them to build and test their wares, but it is often hard to get clarity around how the product can be used outside of the final production environment. For example trying to find out if the TIBCO service can be run on a developer’s workstation or in a cloud hosted VM solely for the purposes of running some automated tests has been a somewhat arduous task.

This may not be solely the fault of the underlying product company, although the old fashioned licensing agreements often do little to distinguish production and modern development use [1]. No, the real difficulty is finding the right person within the client’s company to talk to about such matters. Unless they are au fait with the role modern automated integration testing plays in the development process you will struggle to convince them your intended use is in the interests of the 3rd party product, not stealing revenue from them.

Okay, time to step down from the soap box and focus on the problems we can solve…

Hosting TIBEMSD as a Windows Service

From an automated testing perspective what we need access to is the TIBEMSD.EXE console application. This provides us with one or more TIBCO message queues that we can host on our local machine. Owning this process means we can create, publish to and delete queues on demand and therefore tightly control the environment.

If you only want to do basic integration testing around the sending and receiving of messages you can configure it as a Windows service and just leave it running in the background. Then your tests can just rely on it always being there like a local database or the file-system. The build machine can be configured this way too.

Unfortunately because it’s a console application and not written to be hosted as a service (at least v6.3 isn’t), you need to use a shim like SRVANY.EXE from the Windows 2003 Resource Kit or something more modern like NSSM. These tools act as an adaptor to the console application so that the Windows SCM can control them.

One thing to be careful of when running TIBEMSD in this way is that it will stick its data files in the CWD (Current Working Directory), which for a service is %SystemRoot%\System32, unless you configure the shim to change it. Putting them in a separate folder makes them a little more obvious and easier to delete when having a clear out [2].

Running TIBEMSD On Demand

Running the TIBCO server as a service makes certain kinds of tests easier to write as you don’t have to worry about starting and stopping it, unless that’s exactly the kinds of test you want to write.

I’ve found it’s all too easy when adding new code or during a refactoring to accidentally break the service so that it doesn’t behave as intended when the network goes up and down, especially when you’re trying to handle poisoned messages.

Hence I prefer to have the TIBEMSD.EXE binary included in the source code repository, in a known place so that it can be started and stopped on demand to verify the connectivity side is working properly. For those classes of integration tests where you just need it to be running you can add it to your fixture-level setup and even keep it running across fixtures to ensure the tests run at an adequate pace.

If, like me, you don’t run as an Administrator all the time (or use elevated command prompts by default) you will find that TIBEMSD doesn’t run out-of-the-box in this way. Fortunately it’s easy to overcome these two issues and run in a LUA (Limited User Account).

Only Bind to the Localhost

One of the problems is that by default the server will try and listen for remote connections from anywhere which means it wants a hole in the firewall for its default port. This of course means you’ll get that firewall popup dialog which is annoying when trying to automate stuff. Whilst you could grant it permission with a one-off NETSH ADVFIREWALL command I prefer components in test mode to not need any special configuration if at all possible.

Windows will allow sockets that only listen for connections from the local host to avoid generating the annoying firewall popup dialog (and this was finally extended to include HTTP too). However we need to tell the TIBCO server to do just that, which we can achieve by creating a trivial configuration file (e.g. localhost.conf) with the following entry:

listen=tcp://127.0.0.1:7222

Now we just need to start it with the -config switch:

> tibemsd.exe -config localhost.conf

Suppressing the Need For Elevation

So far so good but our other problem is that when you start TIBEMSD it wants you to elevate its permissions. I presume this is a legacy thing and there may be some feature that really needs it but so far in my automated tests I haven’t hit it.

There are a number of ways to control elevation for legacy software that doesn’t have a manifest, like using an external one, but TIBEMSD does and that takes priority. Luckily for us there is a solution in the form of the __COMPAT_LAYER environment variable [3]. Setting this, either through a batch file or within our test code, suppresses the need to elevate the server and it runs happily in the background as a normal user, e.g.

> set __COMPAT_LAYER=RunAsInvoker
> tibemsd.exe -config localhost.conf

Spawning TIBEMSD From Within a Test

Once we know how to run TIBEMSD without it causing any popups we are in a position to do that from within an automated test running as any user (LUA), e.g. a developer or the build machine.

In C#, the language where I have been doing this most recently, we can either hard-code a relative path [4] to where TIBEMSD.EXE resides within the repo, or read it from the test assembly’s app.config file to give us a little more flexibility.

<appSettings>
  <add key="tibemsd.exe"
       value="..\..\tools\TIBCO\tibemsd.exe" />
  <add key="conf_file"
       value="..\..\tools\TIBCO\localhost.conf" />
</appSettings>

We can also add our special .conf file to the same folder and therefore find it in the same way. Whilst we could generate it on-the-fly it never changes so I see little point in doing this extra work.

Something to be wary of if you’re using, say, NUnit to write your integration tests is that it (and ReSharper) can copy the test assemblies to a random location to aid in ensuring your tests have no accidental dependencies. In this instance we do, and a rather large one at that, so we need the relative distance between where the test assemblies are built and run (XxxIntTests\bin\Debug) and the TIBEMSD.EXE binary to remain fixed. Hence we need to disable this copying behaviour with the /noshadow switch (or “Tools | Unit Testing | Shadow-copy assemblies being tested” in ReSharper).

Given that we know where our test assembly resides we can use Assembly.GetExecutingAssembly() to create a fully qualified path from the relative one like so:

private static string GetExecutingFolder()
{
  // CodeBase is a file:// URI, so convert it to a local
  // path before extracting the directory.
  var codebase = Assembly.GetExecutingAssembly()
                         .CodeBase;
  var localPath = new Uri(codebase).LocalPath;
  return Path.GetDirectoryName(localPath);
}
. . .
var thisFolder = GetExecutingFolder();
// Verbatim string so the backslashes are not treated as escapes.
var tibcoFolder = @"..\..\tools\TIBCO";
var serverPath = Path.Combine(
            thisFolder, tibcoFolder, "tibemsd.exe");
var configPath = Path.Combine(
            thisFolder, tibcoFolder, "localhost.conf");

Now that we know where the binary and config live we just need to stop the elevation by setting the right environment variable:

Environment.SetEnvironmentVariable("__COMPAT_LAYER", "RunAsInvoker");

Finally we can start the TIBEMSD.EXE console application in the background (i.e. no distracting console window) using Diagnostics.Process:

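// Launch tibemsd.exe with the localhost-only config file.
var args = "-config \"" + configPath + "\"";
var process = new System.Diagnostics.Process
{
  StartInfo = new System.Diagnostics.ProcessStartInfo(serverPath, args)
  {
    UseShellExecute = false, // start directly so the environment is inherited
    CreateNoWindow = true,   // no console window
  }
};
process.Start();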

Stopping the daemon involves calling Kill(). There are more graceful ways of remotely stopping a console application which you can try first, but Kill() is always the fall-back approach and of course the TIBCO server has been designed to survive such abuse.

Naturally you can wrap this up with the Dispose pattern so that your test code can be self-contained:

// Arrange
using (RunTibcoServer())
{
  // Act
}

// Assert

Or if you want to amortise the cost of starting it across your tests you can use the fixture-level set-up and tear down:

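private IDisposable _server;

// NUnit 2.x attribute names; NUnit 3 renamed these to
// [OneTimeSetUp] and [OneTimeTearDown].
[TestFixtureSetUp]
public void GivenMessageQueueIsAvailable()
{
  _server = RunTibcoServer();
}

[TestFixtureTearDown]
public void StopMessageQueue()
{
  _server?.Dispose();
  _server = null;
}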

One final issue to be aware of, and it’s a common one with integration tests like this which start a process on demand, is that the server might still be running unintentionally across test runs. This can happen when you’re debugging a test and you kill the debugger whilst still inside the test body. The solution is to ensure that the server definitely isn’t already running before you spawn it, and that can be done by killing any existing instances of it:

foreach (var p in Process.GetProcessesByName("tibemsd"))
  p.Kill();

Naturally this is a sledgehammer approach and assumes you aren’t using separate ports to run multiple disparate instances, or anything like that.

Other Gotchas

This gets us over the biggest hurdle, control of the server process, but there are a few other little things worth noting.

Due to the asynchronous nature and potential for residual state I’ve found it’s better to drop and re-create any queues at the start of each test to flush them. I also use the Assume.That construct in the arrangement to make it doubly clear I expect the test to start with empty queues.

Also if you’re writing tests that cover background connect and failover be aware that the TIBCO reconnection logic doesn’t trigger unless you have multiple servers configured. Luckily you can specify the same server twice, e.g.

var connection = "tcp://localhost,tcp://localhost";

If you expect your server to shut down gracefully, even in the face of having no connection to the queue, you might find that calling Close() on the session and/or connection blocks whilst it’s trying to reconnect (at least in EMS v6.3 it does). This might not be an expected production scenario, but it can hang your tests if something goes awry, hence I’ve used a slightly distasteful workaround where the call to Close() happens on a separate thread with a timeout:

Task.Run(() => _connection.Close()).Wait(1000);

Conclusion

Writing automated integration tests against a middleware product like TIBCO is often an uphill battle that I suspect many don’t have the appetite or patience for. Whilst this post tackles the technical challenges, as they are at least surmountable, the somewhat harder problem of tackling the organisation is sadly still left as an exercise for the reader.

 

[1] The modern NoSQL database vendors appear to have a much simpler model – use it as much as you like outside production.

[2] If the data files get really large because you leave test messages in them by accident they can cause your machine to really grind after a restart as the service goes through recovery.

[3] How to Run Applications Manifested as Highest Available With a Logon Script Without Elevation for Members of the Administrators Group

[4] A relative path means the repo can then exist anywhere on the developer’s file-system and also means the code and tools are then always self-consistent across revisions.

A Game of Tag

Phil Nash from level of indirection

One of the tent-pole features of Catch is the ability to write test names as free-form strings. When you run a Catch executable from the command line you can specify a test case by name, to run just that one:

./MyTestExe "a very nice test case"

or you can use wildcards to run a group of test cases (or just one with less typing):

./MyTestExe "*very nice*"

If you want to use wildcards but you're not sure what they'll match you can combine this with the listing option, -l, to see which test cases match the pattern:

./MyTestExe "*very nice*" -l
Matching test cases:
  a very nice test case
  a not very nice test case
2 matching test cases

This is already quite a powerful way to group test cases into ad-hoc "suites". However we don't want to twist our test names into artificial schemes for this purpose (although, early on, that's exactly what I proposed). Instead Catch allows you to add "tags" to test cases.

TEST_CASE( "a very nice test case", "[nice][good]" ) { /* ... */ }
TEST_CASE( "a not very nice test case", "[nice][bad]" ) { /* ... */ }

Now we can run all tests with a certain tag:

./MyTestExe [good]

or a combination of tags:

./MyTestExe [nice][good]

also with exclusions:

./MyTestExe [nice]~[bad]

unions are supported with ,:

./MyTestExe [nice],[pleasant]

Very powerful! And this functionality has been around for a while.
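To make the semantics of those expressions concrete, here is a rough sketch of how such a tag filter could be evaluated. It is not Catch's actual implementation: the names TagPattern, parseTagExpression and matches are invented for illustration, and the sketch ignores test-name patterns and hidden tests. It assumes that adjacent tags must all be present, that ~ excludes the tag that follows it, and that a comma separates alternatives.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// One comma-separated alternative, e.g. "[nice]~[bad]": every tag in
// `required` must be on the test and no tag in `excluded` may be.
struct TagPattern {
    std::set<std::string> required;
    std::set<std::string> excluded;
};

// Hypothetical parser: splits "[nice]~[bad],[pleasant]" into alternatives.
std::vector<TagPattern> parseTagExpression(const std::string& expr) {
    std::vector<TagPattern> alternatives(1);
    bool negateNext = false;
    for (std::size_t i = 0; i < expr.size(); ++i) {
        const char c = expr[i];
        if (c == ',') {                 // start a new alternative (union)
            alternatives.emplace_back();
            negateNext = false;
        } else if (c == '~') {          // the next tag is an exclusion
            negateNext = true;
        } else if (c == '[') {          // read up to the closing bracket
            const std::size_t close = expr.find(']', i);
            if (close == std::string::npos)
                throw std::invalid_argument("unterminated tag in: " + expr);
            const std::string tag = expr.substr(i + 1, close - i - 1);
            TagPattern& current = alternatives.back();
            (negateNext ? current.excluded : current.required).insert(tag);
            negateNext = false;
            i = close;                  // continue after the ']'
        }
    }
    return alternatives;
}

// True if the test's tags satisfy at least one alternative.
bool matches(const std::set<std::string>& testTags, const std::string& expr) {
    for (const TagPattern& alt : parseTagExpression(expr)) {
        bool ok = true;
        for (const std::string& tag : alt.required)
            if (testTags.count(tag) == 0) ok = false;
        for (const std::string& tag : alt.excluded)
            if (testTags.count(tag) != 0) ok = false;
        if (ok) return true;
    }
    return false;
}

// Usage, mirroring the two test cases and command lines above.
void tagExpressionExamples() {
    const std::set<std::string> niceGood = {"nice", "good"};
    const std::set<std::string> niceBad = {"nice", "bad"};
    assert(matches(niceGood, "[good]") && !matches(niceBad, "[good]"));
    assert(matches(niceGood, "[nice][good]") && !matches(niceBad, "[nice][good]"));
    assert(matches(niceGood, "[nice]~[bad]") && !matches(niceBad, "[nice]~[bad]"));
    assert(matches(niceBad, "[nice],[pleasant]"));
}
```

Calling tagExpressionExamples() from a main function checks each of the four command-line forms against the two tagged test cases.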

More recent, and less well known (mostly because they weren't documented until recently) are a set of "special tags": Instruction Tags, Hiding Tags, Tag Aliases and some automatically generated tags.

Let's see what they're all about.

Instruction Tags

In general all tags that start with a symbol are reserved by Catch (or, put another way, user defined tag names must start with an alpha-numeric character). This allows a nice rich range of namespaces for special tags. Tags that start with the ! character are Instruction tags. They inform Catch something about the test case that they apply to. At time of writing the following are defined:

  • [!hide] This "hides" the test from the default run (i.e. if you run the test executable without specifying any names or tags). This feature was originally introduced with the [hide] tag (note, no: !) - and is still supported, though deprecated. There is also a shortcut form, [.] which we'll revisit in a moment.
  • [!throws] This tells Catch that an exception may be thrown in the course of executing the test - even if it is caught and dealt with. If you've ever tried to track down a rogue exception in your debugger - and so have set the debugger to break on exceptions as they're thrown - you'll know how frustrating all the false positives coming from such tests are! So Catch provides a way to suppress exceptions it is expecting - through the -e or --nothrow options on the command line. This already skips over REQUIRE_THROWS... or CHECK_THROWS... assertions. The [!throws] tag covers you for cases where the exception is caught and handled in the code under test (or your test code).
  • [!shouldfail] This tells Catch that you're expecting this test to fail! Furthermore, if it does fail then it should treat that as a pass!
  • [!mayfail] Rather than explicitly inverting the pass/ fail logic as the previous tag does, this tag just says that the test may fail but that's ok (although it is still reported). It's also ok if it passes.

Hiding Tags

We already looked at [!hide] (and the deprecated [hide]) above, and mentioned that [.] was a shortcut for the same.

It turns out that when one of these tags is used it is often combined with another tag that is used when you do want to run the test. The classic example is where you write integration tests in the same executable as unit tests. By default you don't want the integration tests to run as you want the shortest possible path to running just unit tests. So you hide them but also tag them [integration], or something similar (the word "integration" has no significance to Catch). So pairings like [.][integration] or [.][performance] are frequently found together.

So, as a convenience, Catch now supports . as a tag prefix. The rest of the tag can be completely custom and works exactly like any other normal tag - except that the test is also hidden. Our examples would, thus, be written as [.integration] and [.performance].

One final point to mention about hiding tags is that, due to the way they have evolved through a number of forms (including the severely deprecated "./" name prefix) whichever form is used will not only hide the test, but any of the other forms will match it in a tag pattern. e.g. if you tag a test with [.] you can match it with [!hide].

Tag Aliases

As we saw earlier, tags can be combined in fairly complex ways. While this is powerful and flexible, it can be a bit awkward if you often want to use the same tag expression. Wouldn't it be nice if there was a way of writing the expression once then getting Catch to remember it for you - and associate it with an easier to remember name?

Well there is! You can associate any tag pattern with a name that you can use just like any normal tag - except that it must begin with the @ character.

You create a tag alias, in code, using the CATCH_REGISTER_TAG_ALIAS macro. E.g.

CATCH_REGISTER_TAG_ALIAS( "[@not nice]", "~[nice]~[!hide]" );

This registers a tag alias, [@not nice] which, when expanded will match all tests that are not tagged [nice] but also are not hidden. The second part is important because if you have any hidden tests then they will usually be included any time you use a not expression (~) because the rule is that tests are only hidden if no pattern is specified!

Also did you notice that we had a space in the tag name? Surprised? I never said that tags could not include spaces. Of course they can.

You can register as many aliases as you like and you can put them anywhere you like (as long as catch.hpp is #included). However I recommend keeping them all in your main source file (the one you #define CATCH_CONFIG_MAIN, or equivalent) - simply so you only have to look in one place for them.

Filenames As Tags

The newest special tag form is the result of automatically generating a set of tags. The tags all begin with the # character (I've resisted the urge to call them "hash tags"). The rest of the tag is generated from the name of the source file that the test is implemented in. The full path (as reported by __FILE__) is stripped of its directories and extension - so all tests in /Development/Tests/SquirrelTests.cpp would be tagged, [#SquirrelTests].
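As a rough illustration of that stripping rule, the following sketch derives the tag from a __FILE__-style path. The function name filenameTag is made up for this example and is not part of Catch. How Catch itself treats backslashes or filenames containing several dots is an assumption here: the sketch accepts both kinds of separator and strips only the last extension.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: turn a __FILE__-style path into a "[#Name]" tag.
std::string filenameTag(const std::string& path) {
    // Drop everything up to and including the last path separator.
    const std::size_t lastSep = path.find_last_of("/\\");
    std::string name =
        (lastSep == std::string::npos) ? path : path.substr(lastSep + 1);

    // Drop the extension, i.e. everything from the last '.' onwards.
    const std::size_t lastDot = name.find_last_of('.');
    if (lastDot != std::string::npos)
        name.erase(lastDot);

    return "[#" + name + "]";
}

// Usage, with the path from the text above.
void filenameTagExample() {
    assert(filenameTag("/Development/Tests/SquirrelTests.cpp")
           == "[#SquirrelTests]");
}
```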

At time of writing this feature is only available on the develop branch on GitHub - and must be specifically enabled by running with the --filenames-as-tags or -# command line options. It's possible that situation may change by the time it makes it onto master.

The Tag Line

So tags not only provide a rich grouping mechanism in Catch - they also allow you to control some aspects of how Catch runs and treats test cases. Some tags can be generated for you - and some tags can be expanded from simpler forms. We've covered here the complete set of special tags at the time of writing. If you're reading this in the future there may be more - I'll try and be better at keeping the docs up-to-date there. Also any stock price tips you might have from the future would be welcome too.

Call for Papers: C++ track at NDC Oslo 2015, June 17-19

olvemaudal from Geektalk

There was a lot of interest in the very strong C++ track that we did at the NDC conference in Oslo last summer. Here is a summary I wrote after the event, including links to the videos that we recorded.

We are repeating the success this year, June 17-19. A few big names have already signed up, but we also need your contribution. If you would like to be part of the C++ track this year, please submit your proposal soon. The CFP closes February 15. Feel free to contact me directly if you have any questions about the C++ track.

Modern C++ Testing

Phil Nash from level of indirection

Back in August I took my family to stay for a week at my brother's house in (Old) South Wales. I've not been to Wales for a long time and it was great to be back there - but that's not what this post is about, of course.

The thing about Wales is that it has mountains (and where there are no mountains there are plenty of hills and valleys and cliffs). Mobile cell coverage is non-existent in much of the country - particularly where my brother lives.

So the timing was particularly bad when, just as we were driving along the south coast (somewhere between Cardiff and Swansea, I think), I started getting emails and tweets from people pointing out that Catch was riding high on HackerNews! Someone had recently discovered Catch and was enjoying it enough that they wanted to share it with the community. Which is awesome!

Except that, between lack of mobile reception and spending time with my family, I didn't have the opportunity to join the discussion.

When I got back home a week later I read through the comments. One of them stuck out because it called me out on describing Catch as a "Modern C++" framework (the commenter recommended another framework, Bandit, as being "more modern").

When I first released Catch, back in 2010, C++11 was still referred to as C++1x (or even C++0x!) and the final release date was still uncertain. So Catch was written to target C++03. It used a few "modern" idioms of the time - but the modernity was intended more as being a break from the past - where most C++ frameworks were just reimplementations of JUnit in C++. So I think the label was somewhat justified at the time.

Of course since then C++11 has not only been standardised but is fully, or nearly fully, implemented by many leading, mainstream, compilers. I think adoption is still not high enough, at this point, that I'd be willing to drop support for C++03 in Catch (there is even an actively maintained fork for VC6!). But it is enough that the baseline for what constitutes "modern C++" has definitely moved on. And now C++14 is here too - pushing it even further forward.

"Modern" is not what it used to be

What does it mean to be a "Modern C++ Test Framework" these days anyway? Well the most obvious thing for the user is probably the use of lambdas. Along with a few other features, lambdas allow for a lot of what previously required macros to be done in pure C++. I'm usually the first to hold this up as A Good Thing. In a moment I'll get to why I don't think it's necessarily as good a step as you might think.

But before I get to that; one other thing: For me, as a framework author, the biggest difference C++11/14 would make to something like Catch would be in the internals. Large chunks of code could be removed, reduced or at least cleaned up. The "no dependencies" policy means that Catch has complete implementations of things like shared pointers, optional types and function objects - as well as many things that must be done the long way round (such as iterating collections - I long for range for loops - or at least BOOST_FOREACH).

The competition

I've come across three frameworks that I'd say qualify as truly trying to be "modern C++ test frameworks". I'm sure there are others - and I've not really even used these ones extensively - but these are the ones I'll reference in this discussion. The three frameworks are:

  • Lest - by Martin Moene, an active contributor to Catch - and partly based on some Catch ideas - re-imagined for a C++11 world.
  • Bandit - this is the one mentioned in the Hacker News comment I kicked off with
  • Mettle - I saw this in a tweet from @MeetingCpp last week and it's what kicked off the train of thought that led me to this post

The case for test case macros

But why did I say that the use of lambdas is not such a good idea? Actually I didn't quite say that. I think lambdas are a very good idea - and in many ways they would certainly clean up at least the mechanics of defining and registering test cases and sections.

Before lambdas C++ had only one place you could write a block of imperative code: in a function (or method). That means that, in Catch, test cases are really just functions - which must have a function signature - including a name (which we hide - because in Catch the test name is a string). Those functions must be captured somehow. This is done by passing a pointer to the function to the constructor of a small class - whose sole purpose is to forward the function pointer on to a global registry. Later, when the tests are being run, the registry is iterated and the function pointers invoked.

So a test case like this:

    TEST_CASE( "test name", "[tags]" )
    {
        /* ... */
    }

...written out in full (after macro expansion) looks something like:

    static void generatedFunctionName();
    namespace{ ::Catch::AutoReg generatedNameAutoRegistrar
        (   &generatedFunctionName,
            ::Catch::SourceLineInfo( __FILE__, static_cast<std::size_t>( __LINE__ ) ),
            ::Catch::NameAndDesc( "test name", "[tags]") );
    }
    static void generatedFunctionName()
    {
        /* .... */
    }

(generatedFunctionName is generated by yet another macro, which combines root with the current line number. Because the function is declared static the identifier is only visible in the current translation unit (cpp file), so this should be unique enough)

So there's a lot of boilerplate here - you wouldn't want to write this all by hand every time you start a new test case!
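
To make the registration mechanism concrete, here is a minimal, self-contained sketch of the idea. It is not Catch's actual implementation: the sketch namespace, TestCaseEntry, registry() and runAllTests() are names invented for illustration, and the source-location argument is left out for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // hypothetical names throughout, not Catch's real internals

typedef void (*TestFunction)();

struct TestCaseEntry {
    TestFunction function;
    std::string name;
    std::string tags;
};

// A function-local static sidesteps static initialisation order problems.
inline std::vector<TestCaseEntry>& registry() {
    static std::vector<TestCaseEntry> entries;
    return entries;
}

// Constructing one of these at namespace scope forwards the function
// pointer to the global registry before main() starts.
struct AutoReg {
    AutoReg(TestFunction function, const char* name, const char* tags) {
        assert(function != 0);
        TestCaseEntry entry = { function, name, tags };
        registry().push_back(entry);
    }
};

// Later, the registry is iterated and each function pointer invoked.
inline std::size_t runAllTests() {
    std::size_t count = 0;
    for (std::size_t i = 0; i < registry().size(); ++i) {
        registry()[i].function();
        ++count;
    }
    return count;
}

} // namespace sketch

// Roughly what a TEST_CASE-style macro would expand to, written by hand.
static int sketchTimesRun = 0;
static void generatedFunctionName();
namespace { sketch::AutoReg generatedNameAutoRegistrar(
    &generatedFunctionName, "test name", "[tags]"); }
static void generatedFunctionName()
{
    ++sketchTimesRun;
}
```

Calling sketch::runAllTests() from main() would then invoke every registered function once.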

With lambdas, though, blocks of code are now first class entities, and you can introduce them anonymously. So you could write them like:

    Catch11::TestCase( "test name", "[tags]", []() 
    {
        /* ... */
    } );

This is clearly far better than the expanded macro. But it's still noisier than the version that uses the macro. Most of the C++11/14 test frameworks I've looked at tend to group tests together at a higher level. The individual tests are more like Catch's sections - but the pattern is still the same - you get noise from the lambda syntax in the form of the []() or [&]() to introduce the lambda and an extra ); at the end.

Is that really worth worrying about?

Personally I find it's enough extra noise that I think I'd prefer to continue to use a macro - even if it used lambdas under the hood. But it's also small enough that I can certainly see the case for going macro free here.
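
For comparison, the lambda-based registration could be sketched along the following lines. Catch11::TestCase above is hypothetical, and so is everything here: sketch11, its registry() and runAll() are assumptions made for illustration. Note that a named object is needed, because a bare expression statement is not allowed at namespace scope.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch11 {  // hypothetical C++11 variant, not a real Catch API

struct TestCaseEntry {
    std::string name;
    std::string tags;
    std::function<void()> body;
};

inline std::vector<TestCaseEntry>& registry() {
    static std::vector<TestCaseEntry> entries;
    return entries;
}

// Registers the lambda when the object is constructed.
struct TestCase {
    TestCase(std::string name, std::string tags, std::function<void()> body) {
        assert(body != nullptr);  // an empty std::function cannot be invoked
        registry().push_back({ std::move(name), std::move(tags), std::move(body) });
    }
};

inline void runAll() {
    for (auto const& entry : registry())
        entry.body();
}

} // namespace sketch11

static int lambdaTestRuns = 0;

// The test body is an anonymous block of code; only the registrar has a name.
static sketch11::TestCase exampleTest( "test name", "[tags]", []()
{
    ++lambdaTestRuns;
} );
```

The generated function name has gone, but the []() introducer and the trailing ); remain, which is the extra noise discussed above.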

Assert yourself

But that's just test cases (and sections). Assertions have traditionally been written using macros too. In this case the main reasons are twofold:

  1. It allows the expression evaluation to be wrapped in an exception handler.
  2. It allows us to capture the file and line number to report on.

(1) can arguably be handled in whatever is holding the current lambda (e.g. it or describe in Bandit, suite, subsuite or expect in Mettle). If these blocks are small enough we should get sufficient locality of exception handling - but it's not as tight as the per-expression handling with the macro approach.

(2) simply cannot be done without involving the preprocessor in some way (whether it's to pass __FILE__ and __LINE__ manually, or to encapsulate that with a macro). How much does that matter? Again it's a matter of taste but you get several benefits from having that information. Whether you use it to manually locate the failing assertion or if you're running the reporter in an IDE window that automatically allows you to double-click the failure message to take you to the line - it's really useful to be able to go straight to it. Do you want to give that up in order to go macro free? Perhaps. Perhaps not.
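
As a rough illustration of both points, an assertion macro might be sketched like this. SKETCH_CHECK, AssertionResult, results() and record() are invented for this example and do not reflect how Catch implements its assertions.

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <vector>

namespace sketch {  // hypothetical helper types for illustration only

struct AssertionResult {
    std::string expression;
    std::string file;
    int line;
    bool passed;
    std::string message;
};

inline std::vector<AssertionResult>& results() {
    static std::vector<AssertionResult> all;
    return all;
}

inline void record(const char* expr, const char* file, int line,
                   bool passed, const std::string& message) {
    assert(expr && file);
    AssertionResult r = { expr, file, line, passed, message };
    results().push_back(r);
}

} // namespace sketch

// (1) the expression is evaluated inside a try block, per assertion;
// (2) __FILE__ and __LINE__ are captured at the point of use.
#define SKETCH_CHECK( expr ) \
    do { \
        try { \
            ::sketch::record( #expr, __FILE__, __LINE__, \
                              static_cast<bool>( expr ), "" ); \
        } catch( std::exception const& ex ) { \
            ::sketch::record( #expr, __FILE__, __LINE__, false, ex.what() ); \
        } catch( ... ) { \
            ::sketch::record( #expr, __FILE__, __LINE__, false, \
                              "unknown exception" ); \
        } \
    } while( false )
```

A plain function could do the recording and the exception handling, but only the preprocessor can supply the location of the call site without the caller passing it in by hand.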

Interestingly, Lest still uses a macro for assertions.

Weighing up

So we've seen that a truly modern C++ test framework, using lambdas in particular, can allow you to write tests without the use of macros - but at a cost!

So the other side of the equation must be: what benefit do you get from eschewing the macros?

Personally I've always striven to minimise or eliminate the use of macros in C++. In the early days that was mostly about using const, inline and templates. Now lambdas allow us to address some of the remaining cases and I'm all for that.

But I also tend to associate a much higher "cost" to macro usage when it generates imperative code. This is code that you're likely to find yourself needing to step through in a debugger at runtime - and macros really obfuscate this process. When I use macros it tends to be in declarative code. Code that generates purely declarative statements, or effectively declarative statements (such as the test case function registration code). It tends to always generate the exact same machinery - so should not be sensitive to its inputs in ways that will require debugging.

How do Catch's macros play out in that regard? Well the test case registration macros get a pass. Sections are a grey area - they are on the path of code that needs to be stepped over - and, worse, hide a conditional (a section is really just an if statement on a global variable!). So score a few points down there. Assertions are also very much runtime executable - and are frequently on the debugging path! In fact stepping into expressions being asserted on in Catch tests can be quite a pain as you end up stepping into some of the "hidden" calls before you get to the expression you supplied (in Visual Studio, at least, this can be mitigated by excluding the Catch namespace using the StepOver registry key).

Now, interestingly, the use of macros for the assertions was never really about C++03 vs C++11. It was about capturing extra information (file/ line) and wrapping in a try-catch. So if you're willing to make that trade-off there's no reason you can't have non-macro assertions even in C++03!

Back to the future

One of my longer arcs of development on Catch (that I edge towards on each refactoring) is to decouple the assertion mechanism from the guts of the test runner. You should be able to provide your own assertions that work with Catch. Many other test frameworks work this way and it allows them to be much more flexible. In particular it will allow me to decouple the matcher framework (and maybe allow third-party matchers to work with Catch).

Of course this would also allow macro-less assertions to be used (as it happens the assertions in bandit and mettle are both matcher-like already).

So, while I think Catch is committed to supporting C++03 for some time yet, that doesn't mean there is no scope for modernising it and keeping it relevant. And, modern or not, I still believe it is the simplest C++ test framework to get up and running with, and the least noisy to work with.

Videos from C++ track on NDC 2014

olvemaudal from Geektalk

As the chair for the C++ track on NDC Oslo, I am happy to report that the C++ track was a huge and massive success! The C++ community in Norway is rather small so even if NDC is a big annual conference for programmers (~1600 geeks) and even with names like Nico, Scott and Andrei as headliners for the track, I was not sure how many people would actually turn up for the C++ talks. I was positively surprised. The first three sessions were packed with people (it was of course a cheap trick to make the three first sessions general/introductory on popular topics). All the other talks were also well attended. The NDC organizers have already confirmed that they want to do this next year as well.

NDC Oslo is an annual five day event. First two days of pre-conference workshops (Andrei did 2 days of Scalable design and implementation using C++) and then 9 tracks of talks for three full days. As usual, NDC records all the talks and generously shares all the videos with the world (there are 150+ talks, kudos to NDC!).

I have listed the videos from the C++ track this year. I will also put out a link to the slides when I get them. Enjoy!

Day 1, June 4, 2014

  • C++14, Nico Josuttis (video)
  • Effective Modern C++, Scott Meyers (video)
  • Error Handling in C++, Andrei Alexandrescu (video)
  • Move, noexcept, and push_back(), Nico Josuttis (video)
  • C++ Type Deduction and Why You Care, Scott Meyers (video)
  • Generic and Generative Programming in C++, Andrei Alexandrescu (video)

Day 2, June 5, 2014

  • C++ – where are we headed?, Hubert Matthews (video)
  • Three Cool Things about D, Andrei Alexandrescu (video)
  • The C++ memory model, Mike Long (video, slides)
  • C++ for small devices, Isak Styf (video)
  • Brief tour of Clang, Ismail Pazarbasi (video)
  • Insecure coding in C and C++, Olve Maudal (video, slides)
  • So you think you can int? (C++), Anders Knatten (video)

With this great lineup I hope that NDC Oslo in June has established itself as a significant annual C++ event in Europe together with Meeting C++ Berlin in December and ACCU Bristol in April.

Save the date for NDC next year, June 15-19, 2015. I can already promise you a really strong C++ track at NDC Oslo 2015.

A final note: Make sure you stay tuned on isocpp.org for global news about C++.

How to write Boost.Python type converters

Austin Bingham from Good With Computers

Boost.Python [1] makes it possible to write C++ that "feels" like Python. The library is powerful and sometimes subtle. This is as compared with the Python C API, where the experience is very far removed from writing Python code.

Part of making C++ feel more like Python is allowing natural assignment of C++ objects to Python variables. For instance, assigning a standard library string to a Python object looks like this:

// Create a C++ string
std::string msg("Hello, Python");

// Assign it to a python object
boost::python::object py_msg = msg;

Likewise (though somewhat less naturally), it is also important to be able to extract C++ objects from Python objects. Boost.Python provides the extract [2] type for this:

boost::python::object obj = ... ;
std::string msg = boost::python::extract<std::string>(obj);

To allow this kind of natural assignment, Boost.Python provides a system for registering converters between the languages. Unfortunately, the Boost.Python documentation does a pretty poor job of describing how to write them. A bit of searching on the internet will turn up a few links. [3]

While these are fine (and, in truth, are the basis for what I know about the conversion system), they are not as explicit as I would like.

So, in an effort to clarify the conversion system both for myself and (hopefully) others, I wrote this little primer. I'll step through a full example showing how to write converters for Qt's QString [4] class. In the end, you should have all the information you need to write and register your own converters.

Converting QString

A Boost.Python type converter consists of two major parts. The first part, which is generally the simpler of the two, converts a C++ type into a Python type. I'll refer to this as the to-python converter. The second part converts a Python object into a C++ type. I'll refer to this as the from-python converter.

In order to have your converters be used at runtime, the Boost.Python framework requires you to register them. The Boost.Python API provides separate methods for registering to-python and from-python converters. Because of this, you are free to provide conversion in only one direction for a type if you so choose.

Note that, for certain elements of what I'm about to describe, there is more than one way to do things. For example, in some cases where I choose to use static member functions, you could also use free functions. I won't point these out, but if you wear your C++ thinking-cap you should be able to see what is mandatory and what isn't.

To-python Converters

A to-python converter converts a C++ type to a Python object. From an API perspective, a to-python converter is used any time that you construct a boost::python::object [5] from another C++ type. For example:

// Construct object from an int
boost::python::object int_obj(42);

// Construct object from a string
boost::python::object str_obj = std::string("llama");

// Construct object from a user-defined type
Foo foo;
boost::python::object foo_obj(foo);

You implement a to-python converter using a struct with a static member function named convert(), which takes the C++ object to be converted as its argument and returns a PyObject*. A to-python converter for QStrings looks like this:

/* to-python converter for QStrings */
struct QString_to_python_str
{
    static PyObject* convert(QString const& s)
    {
        return boost::python::incref(
            boost::python::object(
                s.toLatin1().constData()).ptr());
    }
};

The crux of what this does is as follows:

  1. Extract the QString's underlying character data using toLatin1().constData()
  2. Construct a boost::python::object with the character data
  3. Retrieve the boost::python::object's PyObject* with ptr()
  4. Increment the reference count on the PyObject* and return that pointer.

That last step bears a little explanation. Suppose that you didn't increment the reference count on the returned pointer. As soon as the function returned, the boost::python::object in the function would destruct, thereby reducing the ref-count to zero. When the PyObject's reference count goes to zero, Python will consider the object dead and it may be garbage-collected, meaning you would return a deallocated object from convert().
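
The effect can be seen in a toy model that needs neither Python nor Boost. Everything below is a stand-in invented for illustration: toy::Object plays the role of a PyObject, toy::Handle plays the role of boost::python::object, and an "alive" flag is flipped where a real runtime would deallocate.

```cpp
#include <cassert>

namespace toy {  // a toy model only, not the Python C API or Boost.Python

// Stand-in for PyObject: a reference count plus an "alive" flag.
struct Object {
    int refcount;
    bool alive;
    Object() : refcount(0), alive(true) {}
};

inline Object* incref(Object* o) { ++o->refcount; return o; }

inline void decref(Object* o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0)
        o->alive = false;  // a real runtime could deallocate here
}

// Stand-in for boost::python::object: holds one reference while it lives.
class Handle {
public:
    explicit Handle(Object* o) : obj_(incref(o)) {}
    Handle(Handle const& other) : obj_(incref(other.obj_)) {}
    ~Handle() { decref(obj_); }
    Object* ptr() const { return obj_; }
private:
    Handle& operator=(Handle const&);  // not needed for this sketch
    Object* obj_;
};

// The temporary Handle releases its reference as the function returns,
// so the caller receives a pointer to an object that is already dead.
inline Object* convertWithoutIncref(Object* fresh) {
    return Handle(fresh).ptr();
}

// The extra reference taken here outlives the temporary Handle.
inline Object* convertWithIncref(Object* fresh) {
    return incref(Handle(fresh).ptr());
}

} // namespace toy
```

After convertWithoutIncref() the count is zero and the object is marked dead; after convertWithIncref() one reference remains for the caller to own.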

Once you've written the to-python converter for a type, you need to register it with Boost.Python's runtime. You do this with the aptly-named to_python_converter [6] template:

// register the QString-to-python converter
boost::python::to_python_converter<
    QString,
    QString_to_python_str>();

The first template parameter is the C++ type for which you're registering a converter. The second is the converter struct. Notice that this registration process is done at runtime; you need to call the registration functions before you try to do any custom type converting.

From-python Converters

From-python converters are slightly more complex because, beyond simply providing a function to convert from Python to C++, they also have to provide a function that determines if a Python type can safely be converted to the requested C++ type. Likewise, they often require more knowledge of the Python C API.

From-python converters are used whenever Boost.Python's extract type is called. For example:

// get an int from a python object
int x = boost::python::extract<int>(int_obj);

// get an STL string from a python object
std::string s = boost::python::extract<std::string>(str_obj);

// get a user-defined type from a python object
Foo foo = boost::python::extract<Foo>(foo_obj);

The recipe I use for creating from-python converters is similar to to-python converters: create a struct with some static methods and register those with the Boost.Python runtime system.

The first method you'll need to define is used to determine whether an arbitrary Python object is convertible to the type you want to extract. If the conversion is OK, this function should return the PyObject*; otherwise, it should return NULL. So, for QStrings you would write:

struct QString_from_python_str
{

    . . .

    // Determine if obj_ptr can be converted to a QString
    static void* convertible(PyObject* obj_ptr)
    {
        if (!PyString_Check(obj_ptr)) return 0;
        return obj_ptr;
    }

    . . .

};

This simply says that a PyObject* can be converted to a QString if it is a Python string.

The second method you'll need to write does the actual conversion. The primary trick in this method is that Boost.Python will provide you with a chunk of memory into which you must in-place construct your new C++ object. All of the funny "rvalue_from_python" stuff just has to do with Boost.Python's method for providing you with that memory chunk:

struct QString_from_python_str
{

    . . .

    // Convert obj_ptr into a QString
    static void construct(
        PyObject* obj_ptr,
        boost::python::converter::rvalue_from_python_stage1_data* data)
    {
        // Extract the character data from the python string
        const char* value = PyString_AsString(obj_ptr);

        // Verify that obj_ptr is a string (should be ensured by
        // convertible())
        assert(value);

        // Grab pointer to memory into which to construct the new QString
        void* storage = (
            (boost::python::converter::rvalue_from_python_storage<QString>*)
            data)->storage.bytes;

        // in-place construct the new QString using the character data
        // extracted from the python object
        new (storage) QString(value);

        // Stash the memory chunk pointer for later use by boost.python
        data->convertible = storage;
    }

  . . .

};
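
The in-place construction step can be tried out without Boost.Python. The following sketch is only loosely modelled on the idea described above: toy::Storage, toy::Stage1Data, toy::ConversionData, constructString() and destroy() are hypothetical names, and std::string stands in for QString.

```cpp
#include <cassert>
#include <new>
#include <string>

namespace toy {  // hypothetical stand-ins, not Boost.Python's real types

// Raw, suitably aligned bytes big enough for one T. No T exists in here
// until somebody constructs one with placement new.
template <typename T>
struct Storage {
    alignas(T) unsigned char bytes[sizeof(T)];
};

// Stand-in for the stage 1 data: records where the object was built.
struct Stage1Data {
    void* convertible;
};

// The stage 1 data is the first member, so a pointer to it can be cast
// back to a pointer to the whole struct.
template <typename T>
struct ConversionData {
    Stage1Data stage1;
    Storage<T> storage;
};

inline void constructString(const char* value, Stage1Data* data) {
    assert(value);

    // Grab the chunk of memory provided by the caller
    void* storage =
        reinterpret_cast<ConversionData<std::string>*>(data)->storage.bytes;

    // In-place construct the new object in that memory
    new (storage) std::string(value);

    // Stash the pointer so the caller can find the object
    data->convertible = storage;
}

// Objects built with placement new must be destroyed explicitly.
template <typename T>
void destroy(T* object) {
    object->~T();
}

} // namespace toy
```

In this sketch the caller owns the storage and is responsible for calling destroy() on the object once it has finished with it.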

The final step for from-python converters is, of course, to register the converter. To do this, you use boost::python::converter::registry::push_back(). [7] The first argument is a pointer to the function which tests for convertibility, the second is a pointer to the conversion function, and the third is a boost::python::type_id for the C++ type. In this case, we'll put the registration into the constructor for the struct we've been building up:

struct QString_from_python_str
{
    QString_from_python_str()
    {
        boost::python::converter::registry::push_back(
            &convertible,
            &construct,
            boost::python::type_id<QString>());
    }

    . . .

};

Now, if you simply construct a single QString_from_python_str object in your initialization code (just like how you called to_python_converter() for the to-python registration), conversion from Python strings to QString will be enabled.

Taking a reference to the PyObject in construct()

One gotcha to be aware of in your construct() function is that the PyObject argument is a 'borrowed' reference. That is, its reference count has not already been incremented for you. [8] If you plan to keep a reference to that object, you must use Boost.Python's borrowed construct. For example:

class MyClass
{
public:
    MyClass(boost::python::object obj) : obj_ (obj) {}

private:
    boost::python::object obj_;
};

struct MyClass_from_python
{
    . . .

    static void construct(
        PyObject* obj_ptr,
        boost::python::converter::rvalue_from_python_stage1_data* data)
    {
        using namespace boost::python;

        void* storage = (
            (converter::rvalue_from_python_storage<MyClass>*)
                data)->storage.bytes;

        // Use borrowed to construct the object so that a reference
        // count will be properly handled.
        handle<> hndl(borrowed(obj_ptr));
        new (storage) MyClass(object(hndl));

        data->convertible = storage;
    }
};

Failing to use borrowed() in this situation will generally lead to memory corruption and/or garbage collection errors in the Python runtime.

There are a number of useful resources on the web for finding more information on Boost.Python objects, handles, and reference counting. [9]

When converters don't exist

Finally, a cautionary note. The Boost.Python type-conversion system works well, not only at the job of moving objects across the C++/Python language barrier, but at making code easier to read and understand. You must always keep in mind, though, that this comes at the cost of very little compile-time checking.

That is, the boost::python::object constructor is templatized and accepts any type without complaint. This means that your code will compile just fine even if you're constructing boost::python::object instances from types that have no registered converter. At runtime these constructors will find that they have no converter for the requested type, and this will result in exceptions.

These exceptions [10] will tend to happen in unexpected places, and you could spend quite a bit of time trying to figure them out. I say all of this so that maybe, when you encounter strange exceptions when using Boost.Python, you'll remember to check that your converters are registered first. Hopefully it'll save you some time.
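
The trade-off can be reproduced in miniature. The sketch below is not how Boost.Python is implemented; toy::Object, converters() and registerConverter() are invented names, and the "conversion" just produces a string. It shows how a templated constructor compiles for any type while the lookup can only fail at runtime.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>

namespace toy {  // hypothetical miniature, not Boost.Python's real registry

typedef std::string (*ToText)(void const*);

// Runtime table of converters, keyed by C++ type.
inline std::map<std::type_index, ToText>& converters() {
    static std::map<std::type_index, ToText> table;
    return table;
}

template <typename T>
void registerConverter(ToText fn) {
    assert(fn);
    converters()[std::type_index(typeid(T))] = fn;
}

class Object {
public:
    // Compiles for any T. Whether a converter exists is only
    // discovered when this constructor actually runs.
    template <typename T>
    Object(T const& value) {
        std::map<std::type_index, ToText>::const_iterator it =
            converters().find(std::type_index(typeid(T)));
        if (it == converters().end())
            throw std::runtime_error(
                std::string("no converter registered for ") + typeid(T).name());
        text_ = it->second(&value);
    }

    std::string const& text() const { return text_; }

private:
    std::string text_;
};

inline std::string intToText(void const* p) {
    return std::to_string(*static_cast<int const*>(p));
}

struct Unregistered {};  // a type nobody registered a converter for

} // namespace toy
```

Constructing a toy::Object from an int works once intToText has been registered, whereas constructing one from toy::Unregistered compiles cleanly and then throws.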

Resources

Boost.Python is fairly complex and can be difficult to understand all at once. Here are a few more useful resources that might help you come up to speed on this useful technology:

  • This IPython notebook-based tutorial covers a lot of the major (and some of the more obscure) topics in Boost.Python.
  • The Boost.Python wiki contains a lot of collected Boost.Python knowledge.
  • And of course, the Boost.Python documentation itself is very useful.

Appendix: Full code for QString converter

struct QString_to_python_str
{
    static PyObject* convert(QString const& s)
    {
        return boost::python::incref(
            boost::python::object(
                s.toLatin1().constData()).ptr());
    }
};

struct QString_from_python_str
{
    QString_from_python_str()
    {
        boost::python::converter::registry::push_back(
            &convertible,
            &construct,
            boost::python::type_id<QString>());
    }

    // Determine if obj_ptr can be converted to a QString
    static void* convertible(PyObject* obj_ptr)
    {
        if (!PyString_Check(obj_ptr)) return 0;
        return obj_ptr;
    }

    // Convert obj_ptr into a QString
    static void construct(
        PyObject* obj_ptr,
        boost::python::converter::rvalue_from_python_stage1_data* data)
    {
        // Extract the character data from the python string
        const char* value = PyString_AsString(obj_ptr);

        // Verify that obj_ptr is a string (should be ensured by convertible())
        assert(value);

        // Grab pointer to memory into which to construct the new QString
        void* storage = (
            (boost::python::converter::rvalue_from_python_storage<QString>*)
            data)->storage.bytes;

        // in-place construct the new QString using the character data
        // extracted from the python object
        new (storage) QString(value);

        // Stash the memory chunk pointer for later use by boost.python
        data->convertible = storage;
    }
};

void initializeConverters()
{
    using namespace boost::python;

    // register the to-python converter
    to_python_converter<
        QString,
        QString_to_python_str>();

    // register the from-python converter
    QString_from_python_str();
}

[1] The Boost.Python homepage.
[2] boost::python::extract<> documentation.
[3] For example the Boost.Python FAQ.
[4] The Qt QString documentation.
[5] The boost::python::object documentation.
[6] The to_python_converter documentation.
[7] The boost::python::converter::registry documentation.
[8] Python reference counting details.
[9] For example, this discussion from the C++-sig discussion list, the Boost.Python documentation, and David Abrahams' guidelines for handle<> on the Python wiki.
[10] Boost.Python uniformly uses boost::python::error_already_set to communicate exceptions from Python to C++.

A CppQuiz a day, keeps the debugger away!

olvemaudal from Geektalk

C++ is a difficult language to master. Very difficult. It does not take more than a few days away from the keyboard before you start forgetting some of the details that will bite when you visit the dark and dusty corners of the language (sometimes because you work with code written by others).

Last month, Anders Schau Knatten officially launched a great tool for practicing your C++ language skills:

http://cppquiz.org

I recommend that you visit this site once in a while to challenge yourself. A good score on this quiz does not make you a great programmer, but it does suggest that you have a deeper understanding of the language than most. Being fluent in a programming language makes it much easier to avoid the dark and dusty corners so that you can concentrate on writing high quality software instead of spending time in the debugger.

Optional streaming

Phil Nash from level of indirection

Catch has a number of macros that allow values of arbitrary types to be streamed into an ostringstream. The canonical example is the INFO macro:

INFO( "There were " << bottles.size() << " green bottles, hanging on the wall" );

This macro builds up a string that will be passed to the next assertion to be included as an annotation. Note that, unlike with a naked ostringstream there is no leading <<. This makes it clean and uncluttered when you just want to log a single value (such as a string), for example:

INFO( "Weirdness" );

The obvious way to do this is for the macro to provide the leading << prior to its argument. Conceptually something like this:

#define INFO( log ) { \
	std::ostringstream oss; \
	oss << log; \
	useTheString( oss.str() ); \
}

This all works quite nicely. But there are a few other macros that use this idiom, too: WARN, SUCCEED and FAIL.

The last two are of interest because the logging behaviour is more of a secondary concern. The primary behaviour is to appear like a passing or failing assertion, respectively, without the need to actually assert on anything. SUCCEED can be useful if you otherwise have no assertions in a test and you don't want to see warnings about it. FAIL is useful if the situation that leads to the failure is not captured in an expression for some reason. It can also be useful to force a test to fail, perhaps as a placeholder. These are useful macros to have available, but they are not often needed in practice. So when they are it's nice to be able to annotate their usage inline - hence the streamed argument.

This is all well and good. But I've found there are still enough cases where I don't want to annotate that having to pass an empty string or make something up is a little annoying. I also use a similar idiom in other projects where it would be nice to be able to make the stream completely optional.

This is not as easy as it sounds, though. The first, and most obvious, issue is that this requires support for variadic macros. Catch has made use of variadic macros, where available, for some time now. In theory they are available to any C++11 compiler. In practice most, if not all, compilers that support any reasonable chunk of C++11 support variadic macros - and most supported them as an extension even before that. That's certainly true of Visual C++, GCC and Clang.

The technically more interesting problem, though, is dealing with that initial <<. Remember the first << is being supplied inside the macro. It will still be there even if the caller does not supply an argument to the macro. If we wrote FAIL the same way we presented INFO earlier (but with variadic macros) it might look something like this:

#define FAIL( ... ) { \
	std::ostringstream oss; \
	oss << __VA_ARGS__; \
	notifyFail( oss.str() ); \
}

... which, with no argument provided, would expand to...

{ 
	std::ostringstream oss; 
	oss << ; 
	notifyFail( oss.str() ); 
}

Do you see the problem? With nothing following the << this will not compile.

So do we need a different operator? What properties would we need? It seems we'd need an operator that comes in two forms: a binary operator that allows us to capture an argument, and a unary operator that allows us to omit the argument. Furthermore the binary form must not require its argument to be enclosed in any sort of brackets. Finally it must have higher precedence than << so we can switch over to normal stream insertion at that point.

That's a long list. Does such an operator exist? Fortunately there's not just one but two such operators to choose from! + and -. The only slight hitch is that the unary form is right-to-left associative, whereas the binary form is left-to-right. So how can we work these in?

Let's pick one of the operators. I've gone with +, but I don't think there is any advantage either way. Because unary + is right-to-left associative it needs to prefix something. So we can't use it at the start of our streaming expression. We can, however, use it at the end. Then we'll need an object to apply it to. The object doesn't actually need to do anything else. I've gone with this implementation of StreamEndStop in Catch:

struct StreamEndStop {
    std::string operator+() {
        return std::string();
    }
};

With this definition the expression +StreamEndStop() now yields an empty string, which has no effect when inserted into a stringstream. This means we can write:

{
	std::ostringstream oss; 
	oss << +StreamEndStop();
	notifyFail( oss.str() ); 
}

And oss.str() evaluates to an empty string. Perfect. But what about when we do stream something? Well that would expand to:

{
	std::ostringstream oss; 
	oss << something +StreamEndStop();
	notifyFail( oss.str() ); 
}

... where something could be a string or variable or literal of any type. So we need some way for the expression:

something +StreamEndStop()

to yield the value of something. That's where the binary form of operator+ comes in:

template<typename T>
T const& operator + ( T const& value, StreamEndStop ) {
	return value;
}

Now, whether we supply nothing, a single value or multiple values joined by <<s we'll end up with a stringstream containing what we expect. The relevant bit of code in Catch actually looks like this:

Catch::ExpressionResultBuilder( messageType ) \
	<< __VA_ARGS__ \
	+::Catch::StreamEndStop()

which yields an ExpressionResultBuilder that gets passed on elsewhere. This is all protected by CATCH_CONFIG_VARIADIC_MACROS. Otherwise it falls back to:

Catch::ExpressionResultBuilder( messageType ) << log

So a lot of work to save a few explicit empty strings, but sometimes it's the little things.
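
Putting the pieces together, here is a small self-contained sketch of the idiom that can be compiled on its own (C++11 assumed). SKETCH_MESSAGE and demo() are names invented for this example; unlike the Catch macros it simply returns the built string, and the binary operator+ takes StreamEndStop by value so that it can accept the temporary.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct StreamEndStop {
    // Unary form: used when nothing was streamed at all
    std::string operator+() const {
        return std::string();
    }
};

// Binary form: used when the last streamed value precedes the end stop
template<typename T>
T const& operator + ( T const& value, StreamEndStop ) {
    return value;
}

// Hypothetical macro for illustration: builds and returns the message.
// The lambda is only there to make the whole thing usable as an expression.
#define SKETCH_MESSAGE( ... ) \
    ( [&]() -> std::string { \
        std::ostringstream oss; \
        oss << __VA_ARGS__ + StreamEndStop(); \
        return oss.str(); \
    }() )

inline void demo() {
    int bottles = 10;

    // No argument: expands to  oss << + StreamEndStop();
    assert( SKETCH_MESSAGE() == "" );

    // Single value: binary + hands the literal straight back
    assert( SKETCH_MESSAGE( "Weirdness" ) == "Weirdness" );

    // Several values: + binds tighter than <<, so only the last one
    // meets the end stop
    assert( SKETCH_MESSAGE( "There were " << bottles << " green bottles" )
            == "There were 10 green bottles" );
}
```

Because the lambda uses a capture default, this particular macro can only be used inside a function, which is fine for a logging-style macro.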

Capturing lvalue references in C++11 lambdas

Pete Barber from C#, C++, Windows & other ramblings

Recently the question "what is the type of an lvalue reference when captured by reference in a C++11 lambda?" was asked. It turns out that it's a reference to whatever the original reference referred to. This is just like taking a reference to an existing reference, e.g.

int foo = 7;
int& rfoo = foo;
int& rfoo1 = rfoo;
int& rfoo2 = rfoo1;

All references refer to foo directly, rather than forming a chain rfoo2->rfoo1->rfoo->foo. Take the following code:

std::cout << "foo:" << foo << ", rfoo:" << rfoo 
          << ", rfoo1:" << rfoo1 << ", rfoo2:" << rfoo2 
          << '\n';
++foo;

std::cout << "foo:" << foo << ", rfoo:" << rfoo 
          << ", rfoo1:" << rfoo1 << ", rfoo2:" << rfoo2 
          << '\n';

std::cout << "&foo:" << &foo << ", &rfoo:" << &rfoo 
          << ", &rfoo1:" << &rfoo1 << ", &rfoo2:" << &rfoo2 
          << '\n';

Which gives:

foo:7, rfoo:7, rfoo1:7, rfoo2:7
foo:8, rfoo:8, rfoo1:8, rfoo2:8
&foo:00D3FB0C, &rfoo:00D3FB0C, &rfoo1:00D3FB0C, &rfoo2:00D3FB0C

I.e. all the references are aliases for the original foo. Hence the same value is displayed by each, including after the original is modified, and the address of each variable is the same: that of foo.

There is nothing surprising here, it's just basic C++, but it's a long time since I've thought about it, which is why, with lambdas and l-value, r-value and universal references, I sometimes do a double take on what was once obvious.

The same happens with lambda capture but it's a slightly more interesting story. Take the following example:

int foo = 99;
int& rfoo = foo;
int& rfoo1 = foo;

std::cout << "foo:" << foo << ", rfoo:" << rfoo 
          << ", rfoo1:" << rfoo1 
          << '\n';

std::cout << "&foo:" << &foo << ", &rfoo:" << &rfoo 
          << ", &rfoo1:" << &rfoo1 
          << '\n';

auto l = [foo, rfoo, &rfoo1]()
{
    std::cout << "foo:" << foo << '\n';
    std::cout << "rfoo:" << rfoo << '\n';
    std::cout << "rfoo1:" << rfoo1 << '\n';

    std::cout << "&foo:" << &foo << ", &rfoo:" 
              << &rfoo << ", &rfoo1:" << &rfoo1 
              << '\n';
};

foo = 100;

l();

Which gives:

foo:99, rfoo:99, rfoo1:99
&foo:00D3FB0C, &rfoo:00D3FB0C, &rfoo1:00D3FB0C
foo:99
rfoo:99
rfoo1:100
&foo:00D3FAE0, &rfoo:00D3FAE4, &rfoo1:00D3FB0C

To begin with it behaves as per the first example in that foo, rfoo and rfoo1 all give the same value as rfoo and rfoo1 are effectively aliases for foo as shown when displaying their addresses; they're all the same.

However, when these same variables are captured it's a different story: The capture of foo is of no surprise as this is by-value so displays the captured value of 99 despite the original foo being changed to 100 prior to the lambda being invoked. Its address is that of a new variable; a member of the lambda.

It starts to get interesting with the capture of rfoo. When the lambda is invoked this too displays 99, the original captured value. Also, its address is not that of the original foo. It seems that the reference itself has not been captured but rather what it refers to, in this case an int with the value of 99. It appears to have been magically dereferenced as part of the capture.

This is the correct behaviour and when thought about becomes somewhat obvious. It's just like assigning a variable from a reference, e.g.

int foo = 7;
int& rfoo = foo;
int bar = rfoo;

bar doesn't become an int&; instead rfoo is "magically" dereferenced, except that in this scenario there is nothing magical at all, it's as expected. If int were replaced with auto, e.g.

auto bar = rfoo;

then bar would still be expected to be an int, as auto deduction strips off top-level CV and reference qualifiers.
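The deduction rule can be checked at compile time. The following is a small sketch using std::is_same; the names crfoo, cbar and rbar are extra variables introduced purely for illustration.

```cpp
#include <cassert>
#include <type_traits>

void check_auto_deduction() {
    int foo = 7;
    int& rfoo = foo;
    const int& crfoo = foo;   // illustrative extra: a reference to const

    auto bar = rfoo;          // deduced as int: the reference is dropped
    auto cbar = crfoo;        // deduced as int: const is dropped as well
    auto& rbar = rfoo;        // asking for a reference explicitly gives int&

    static_assert(std::is_same<decltype(bar), int>::value, "bar is a plain int");
    static_assert(std::is_same<decltype(cbar), int>::value, "cbar is a plain int");
    static_assert(std::is_same<decltype(rbar), int&>::value, "rbar is an int&");

    foo = 8;
    assert(bar == 7);         // copies keep the old value
    assert(cbar == 7);
    assert(rbar == 8);        // the reference sees the change
    assert(&rbar == &foo);    // and still aliases foo
}
```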

Finally, there is rfoo1. This too looks odd, as it appears to be taking a reference to a reference. As seen in the first example this is perfectly fine: there is no such thing as a reference to a reference, so every reference in the chain is simply an alias of the original variable.

This is pretty much what's happening here. It's irrelevant that the target of the capture is a reference. In the end the capture by reference is a capture by reference of the underlying variable, i.e. what rfoo1 refers to (in this case foo), not rfoo1 itself. This is demonstrated twofold: rfoo1 within the lambda displays the updated value of foo, and the address of rfoo1 within the lambda is that of foo outside it.
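Since printed addresses vary from run to run, the same observations can be expressed as checks. This is a sketch of the example above rewritten to return a bool; the function name and the extra captured pointer, outer, are assumptions added for illustration.

```cpp
#include <cassert>

bool reference_capture_behaves_as_described() {
    int foo = 99;
    int& rfoo = foo;
    int& rfoo1 = foo;
    const int* outer = &foo;   // address of the original, for comparison

    auto l = [foo, rfoo, &rfoo1, outer]() {
        return foo == 99 && &foo != outer       // by-value copy of foo
            && rfoo == 99 && &rfoo != outer     // by-value copy of what rfoo refers to
            && rfoo1 == 100 && &rfoo1 == outer; // alias of the original foo
    };

    foo = 100;   // change the original after the capture
    return l();
}
```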

This is as per section 5.1.2 (Lambda expressions), paragraph 14 of the C++11 standard:

An entity is captured by copy if it is implicitly captured and the capture-default is = or if it is explicitly
captured with a capture that does not include an &. For each entity captured by copy, an unnamed nonstatic
data member is declared in the closure type. The declaration order of these members is unspecified.
The type of such a data member is the type of the corresponding captured entity if the entity is not a
reference to an object, or the referenced type otherwise. [ Note: If the captured entity is a reference to a
function, the corresponding data member is also a reference to a function. —end note ]

The sentence beginning "The type of such a data member" states that when a reference is captured by value, the type of the captured value is the type referred to, i.e. the reference aspect has been removed; the crucial part is "or the referenced type otherwise". (NOTE: I haven't experimented with references to functions).
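One way to picture that paragraph is to write out by hand roughly what a closure type for [foo, rfoo, &rfoo1] might look like. HandWrittenClosure below is a hypothetical illustration, not what any particular compiler generates; in particular, whether a real closure stores a member for a by-reference capture is unspecified.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the closure type of [foo, rfoo, &rfoo1].
struct HandWrittenClosure {
    int foo;     // captured by copy from an int
    int rfoo;    // captured by copy from an int&: "the referenced type"
    int& rfoo1;  // captured by reference: aliases the original variable

    int operator()() const { return foo + rfoo + rfoo1; }
};

static_assert(std::is_same<decltype(HandWrittenClosure::rfoo), int>::value,
              "a reference captured by copy is stored as the referenced type");

int closure_sum_after_change() {
    int foo = 99;
    int& rfoo = foo;
    int& rfoo1 = foo;

    HandWrittenClosure c = { foo, rfoo, rfoo1 };
    foo = 100;     // only the by-reference member sees this
    return c();    // 99 + 99 + 100
}
```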

Finally, a vivid example showing that a reference captured by value involves a dereference.

class Bar
{
private:
    int mValue;

public:
    Bar(const Bar&) : mValue(9999)
    {
    }

public:
    Bar(const int value) : mValue(value) {}
    int GetValue() const { return mValue; }
    void SetValue(const int value) { mValue = value; }
};

Bar bar(1);
Bar& rbar = bar;
Bar& rbar1 = bar;

std::cout << "&bar:" << &bar << ", &rbar:" << &rbar << ", &rbar1:" << &rbar1 << '\n';

auto l2 = [bar, rbar, &rbar1]()
{
    std::cout << "bar:" << bar.GetValue() << '\n';
    std::cout << "rbar:" << rbar.GetValue() << '\n';
    std::cout << "rbar1:" << rbar1.GetValue() << '\n';

    std::cout << "&bar:" << &bar << ", &rbar:" << &rbar << ", &rbar1:" << &rbar1 << '\n';
};

bar.SetValue(2);

l2();

The class Bar provides a crude copy-constructor that sets the stored value to 9999. The output below is similar to that of the previous example: the addresses of bar and rbar inside the lambda differ from that of the original bar, showing they're copies, whilst that of rbar1 is the same. Secondly, the mValue stored within Bar is shown as 9999 for the first two captured variables, meaning they were copy-constructed.

&bar:00D3FB0C, &rbar:00D3FB0C, &rbar1:00D3FB0C
bar:9999
rbar:9999
rbar1:2
&bar:00D3FAE0, &rbar:00D3FAE4, &rbar1:00D3FB0C

Making the copy-constructor private (by commenting out the first of the two seemingly redundant 'public:' specifiers) prevents compilation.

1>------ Build started: Project: References, Configuration: Debug Win32 ------
1>  main.cpp
1>c:\users\pete\desktop\references\references\main.cpp(85): error C2248: 'Bar::Bar' : cannot access private member declared in class 'Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(59) : see declaration of 'Bar::Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(54) : see declaration of 'Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(59) : see declaration of 'Bar::Bar'
1>          c:\users\pete\desktop\references\references\main.cpp(54) : see declaration of 'Bar'
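The same point can be made with a deleted copy-constructor, assuming a compiler that supports C++11's = delete. NoCopy is a hypothetical type introduced for this sketch: it cannot be captured by copy, even through a reference, but capturing it by reference is fine because no copy is made.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical non-copyable type, for illustration only.
struct NoCopy {
    int value;
    explicit NoCopy(int v) : value(v) {}
    NoCopy(const NoCopy&) = delete;
    NoCopy& operator=(const NoCopy&) = delete;
};

static_assert(!std::is_copy_constructible<NoCopy>::value,
              "NoCopy cannot be copy-constructed");

int read_through_reference_capture() {
    NoCopy obj(1);
    NoCopy& robj = obj;

    // auto bad = [robj]() { return robj.value; };  // rejected: needs the copy-constructor
    auto ok = [&robj]() { return robj.value; };      // fine: aliases obj, no copy made

    obj.value = 2;
    return ok();   // reads the updated value through the alias
}
```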

Writing this post has clarified the situation for me; I hope it helps you as well.

The sample code is available here.