The Perils of Multi-Phase Construction

Chris Oldwood from The OldWood Thing

I’ve never really been a fan of C#’s object initializer syntax. Yes, it’s a little more convenient to write but it has a big downside which is it forces you to make your types mutable by default. Okay, that’s a bit strong, it doesn’t force you to do anything, but it does promote that way of thinking and allows people to take advantage of mutability outside the initialisation block [1].

This post is inspired by some buggy code I encountered where my suspicion is that the subtleties of the object initialisation syntax got lost along the way and partially constructed objects eventually found their way into the wild.

No Dragons Yet

The method, which was to get the next message from a message queue, was originally written something like this:

Message result = null;
RawMessage message = queue.Receive();

if (message != null)
{
  result = new Message
  {
    Priority = message.Priority,
    Type = GetHeader(message, “MessageType”),
    Body = message.Body, 
  };
}

return result;

This was effectively correct. I say “effectively correct” because it doesn’t contain the bug which came later but still relies on mutability which we know can be dangerous.

For example, what would happen if the GetHeader() method threw an exception? At the moment there is no error handling and so the exception propagates out the method and back up the stack. Because we make no effort to recover we let the caller decide what happens when a duff message comes in.

The Dragons Begin Circling

Presumably the behaviour when a malformed message arrived was undesirable because the method was changed slightly to include some recovery fairly soon after:

Message result = null;
RawMessage message = queue.Receive();

if (message != null)
{
  try
  {
    result = new Message
    {
      Priority = message.Priority,
      Type = GetHeader(message, “MessageType”),
      Body = message.Body,  
    };
  }
  catch (Exception e)
  {
    Log.Error(“Invalid message. Skipping.”);
  }
}

return result;

Still no bug yet, but that catch handler falling through to the return at the bottom is somewhat questionable; we are making the reader work hard to track what happens to result under the happy / sad paths to ensure it remains correct under further change.

Object Initialisation Syntax

Before showing the bug, here’s a brief refresher on how the object initialisation syntax works under the covers [2] in the context of our example code. Essentially it invokes the default constructor first and then performs assignments on the various other properties, e.g.

var __m = new Message();
__m.Priority = message.Priority;
__m.Type = GetHeader(message, “MessageType”);
__m.Body = message.Body,  
result = __m;

Notice how the compiler introduces a hidden temporary variable during the construction which it then assigns to the target at the end? This ensures that any exceptions during construction won’t create partially constructed objects that are bound to variables by accident. (This assumes you don’t use the constructor or property setter to attach itself to any global variables either.)

Hence, with respect to our example, if any part of the initialization fails then result will be left as null and therefore the message is indeed discarded and the caller gets a null reference back.

The Dragons Surface

Time passes and the code is then updated to support a new property which is also passed via a header. And then another, and another. However, being more complicated than a simple string value the logic to parse it is placed outside the object initialisation block, like this:

Message result = null;
RawMessage message = queue.Receive();

if (message != null)
{
  try
  {
    result = new Message
    {
      Priority = message.Priority,
      Type = GetHeader(message, “MessageType”),
      Body = message.Body,  
    };

    var str = GetHeader(message, “SomeIntValue”);
    if (str != null && TryParseInt(str, out var value))
      result.IntValue = value;

    // ... more of the same ...
  }
  catch (Exception e)
  {
    Log.Error(“Invalid message. Skipping.”);
  }
}

return result;

Now the problems start. With the latter header parsing code outside the initialisation block result is assigned a partially constructed object while the remaining parsing code runs. Any exceptions that occur [3] mean that result will be left only partially constructed and the caller will be returned the duff object because the exception handler falls out the bottom.

+1 for Tests

The reason I spotted the bug was because I was writing some tests around the code for a new header which also temporarily needed to be optional, like the others, to decouple the deployments. When running the tests there was an error displayed on the console output [4] telling me the message was being discarded, which I didn’t twig at first. It was when I added a retrospective test for the previous optional fields and I found my new one wasn’t be parsed correctly that I realised something funky was going on.

Alternatives

So, what’s the answer? Well, I can think of a number of approaches that would fix this particular code, ranging from small to large in terms of the amount of code that needs changing and our appetite for it.

Firstly we could avoid falling through in the exception handler and make it easier on the reader to comprehend what would be returned in the face of a parsing error:

catch (Exception e)  
{  
  Log.Error(“Invalid message. Skipping.”);
  return null;
}

Secondly we could reduce the scope of the result variable and return that at the end of the parsing block so it’s also clearer about what the happy path returns:

var result = new Message  
{  
  // . . .  
};

var str = GetHeader(message, “SomeIntValue”);
if (str != null && TryParseInt(str, out var value)
  result.IntValue = value;

return result;

We could also short circuit the original check too and remove the longer lived result variable altogether with:

RawMessage message = queue.Receive();

if (message == null)
    return null;

These are all quite simple changes which are also safe going forward should someone add more header values in the same way. Of course, if we were truly perverse and wanted to show how clever we were, we could fold the extra values back into the initialisation block by doing an Extract Function on the logic instead and leave the original dragons in place, e.g.

try
{  
  result = new Message  
  {  
    Priority = message.Priority,  
    Type = GetHeader(message, “MessageType”),  
    Body = message.Body,
    IntValue = GetIntHeader(message, “SomeIntValue”),
    // ... more of the same ...  
  };
}  
catch (Exception e)  
{  
  Log.Error(“Invalid message. Skipping.”);  
}

But we would never do that because the aim is to write code that helps stop people making these kinds of mistakes in the first place. If we want to be clever we should make it easier for the maintainers to fall into The Pit of Success.

Other Alternatives 

I said at the beginning that I was not a fan of mutability by default and therefore it would be remiss of me not to suggest that the entire Message type be made immutable and all properties set via the constructor instead:

result = new Message  
(  
  priority: message.Priority,  
  type: GetHeader(message, “MessageType”),  
  body: message.Body,
  IntValue: GetIntHeader(message, “SomeIntValue”),
  // ... more of the same ...  
);

Yes, adding a new property is a little more work but, as always, writing the tests to make sure it all works correctly will dominate here.

I would also prefer to see use of an Optional<> type instead of a null reference for signalling “no message” but that’s a different discussion.

Epilogue

While this bug was merely “theoretical” at the time I discovered it [5] it quickly came back to bite. A bug fix I made on the sending side got deployed before the receiving end and so the misleading error popped up in the logs after all.

Although the system appeared to be functioning correctly it had slowed down noticeably which we quickly discovered was down to the receiving process continually restarting. What I hadn’t twigged just from reading this nugget of code was that due to the catch handler falling through and passing the message on it was being acknowledged on the queue twice –– once in that catch handler, and again after processing it. This second acknowledgment attempt generated a fatal error that caused the process to restart. Deploying the fixed receiver code as well sorted the issue out.

Ironically the impetus for my blog post “Black Hole - The Fail Fast Anti-Pattern” way back in 2012 was also triggered by two-phase construction problems that caused a process to go into a nasty failure mode, but that time it processed messages much too quickly and stayed alive failing them all.

 

[1] Generally speaking the setting of multiple properties implies it’s multi-phase construction. The more common term Two-Phase Construction comes (I presume) from explicit constructor methods names like Initialise() or Create() which take multiple arguments, like the constructor, rather than setting properties one-by-one.

[2] This is based on my copy of The C# Programming Language: The Annotated Edition.

[3] When the header was missing it was passing a null byte[] reference into a UTF8 decoder which caused it to throw an ArgumentNullException.

[4] Internally it created a logger on-the-fly so it wasn’t an obvious dependency that initially needed mocking.

[5] It’s old, so possibly it did bite in the past but nobody knew why or it magically fixed itself when both ends where upgraded close enough together.

Modeling visual studio C++ compile times

Derek Jones from The Shape of Code

Last week I spotted an interesting article on the compile-time performance of C++ compilers running under Microsoft Windows. The author had obviously put a lot of work into gathering the data, and had taken care to have multiple runs to reduce the impact of random effects (128 runs to be exact); but, as if often the case, the analysis of the data was lackluster. I posted a comment asking for the data, and a link was posted the next day :-)

The compilers benchmarked were: Visual Studio 2015, Visual Studio 2017 and clang 7.0.1; the compilers were configured to target: C++20, C++17, C++14, C++11, C++03, or C++98. The source code used was 100 system headers.

If we are interested in understanding the contribution of each component to overall compile-time, the obvious fist regression model to build is:

compile_time = header_x+compiler_y+language_z

where: header_x are the different headers, compiler_y the different compilers and language_z the different target languages. There might be some interaction between variables, so something more complicated was tried first; the final fitted model was (code+data):

compile_time = k+header_x+compiler_y+language_z+compiler_y*language_z

where k is a constant (the Intercept in R’s summary output). The following is a list of normalised numbers to plug into the equation (clang is the default compiler and C++03 the default language, and so do not appear in the list, the : symbol represents the multiplication; only a few of the 100 headers are listed, details are available):

                             Estimate Std. Error  t value Pr(>|t|)    
               (Intercept)                  headerany 
               1.000000000                0.051100398 
               headerarray             headerassert.h 
               0.522336397               -0.654056185 
...
            headerwctype.h            headerwindows.h 
              -0.648095154                1.304270250 
              compilerVS15               compilerVS17 
              -0.185795534               -0.114590143 
             languagec++11              languagec++14 
               0.032930014                0.156363433 
             languagec++17              languagec++20 
               0.192301727                0.184274629 
             languagec++98 compilerVS15:languagec++11 
               0.001149643               -0.058735591 
compilerVS17:languagec++11 compilerVS15:languagec++14 
              -0.038582437               -0.183708714 
compilerVS17:languagec++14 compilerVS15:languagec++17 
              -0.164031495                         NA 
compilerVS17:languagec++17 compilerVS15:languagec++20 
              -0.181591418                         NA 
compilerVS17:languagec++20 compilerVS15:languagec++98 
              -0.193587045                0.062414667 
compilerVS17:languagec++98 
               0.014558295 

As an example, the (normalised) time to compile wchar.h using VS15 with languagec++11 is:
1-0.514807638-0.183862162+0.033951731-0.059720131

Each component adds/substracts to/from the normalised mean.

Building this model didn’t take long. While waiting for the kettle to boil, I suddenly realised that an additive model was probably inappropriate for this problem; oops. Surely the contribution of each component was multiplicative, i.e., components have a percentage impact to performance.

A quick change to the form of the fitted model:

log(compile_time) = k+header_x+compiler_y+language_z+compiler_y*language_z

Taking the exponential of both side, the fitted equation becomes:

compile_time = e^{k}e^{header_x}e^{compiler_y}e^{language_z}e^{compiler_y*language_z}

The numbers, after taking the exponent, are:

               (Intercept)                  headerany 
              9.724619e+08               1.051756e+00 
...
            headerwctype.h            headerwindows.h 
              3.138361e-01               2.288970e+00 
              compilerVS15               compilerVS17 
              7.286951e-01               7.772886e-01 
             languagec++11              languagec++14 
              1.011743e+00               1.049049e+00 
             languagec++17              languagec++20 
              1.067557e+00               1.056677e+00 
             languagec++98 compilerVS15:languagec++11 
              1.003249e+00               9.735327e-01 
compilerVS17:languagec++11 compilerVS15:languagec++14 
              9.880285e-01               9.351416e-01 
compilerVS17:languagec++14 compilerVS15:languagec++17 
              9.501834e-01                         NA 
compilerVS17:languagec++17 compilerVS15:languagec++20 
              9.480678e-01                         NA 
compilerVS17:languagec++20 compilerVS15:languagec++98 
              9.402461e-01               1.058305e+00 
compilerVS17:languagec++98 
              1.001267e+00 

Taking the same example as above: wchar.h using VS15 with c++11. The compile-time (in cpu clock cycles) is:
9.724619e+08*3.138361e-01*7.286951e-01*1.011743e+00*9.735327e-01

Now each component causes a percentage change in the (mean) base value.

Both of these model explain over 90% of the variance in the data, but this is hardly surprising given they include so much detail.

In reality compile-time is driven by some combination of additive and multiplicative factors. Building a combined additive and multiplicative model is going to be like wrestling an octopus, and is left as an exercise for the reader :-)

Given a choice between these two models, I think the multiplicative model is probably closest to reality.

Guaranteed Copy Elision Does Not Elide Copies

Simon Brand from Simon Brand

This post is also available at the Microsoft Visual C++ Team Blog

C++17 merged in a paper called Guaranteed copy elision through simplified value categories. The changes mandate that no copies or moves take place in some situations where they were previously allowed, e.g.:

struct non_moveable {
    non_moveable() = default;
    non_moveable(non_moveable&&) = delete;
};
non_moveable make() { return {}; }
non_moveable x = make(); //compiles in C++17, error in C++11/14

You can see this behavior in compiler versions Visual Studio 2017 15.6, Clang 4, GCC 7, and above.

Despite the name of the paper and what you might read on the Internet, the new rules do not guarantee copy elision. Instead, the new value category rules are defined such that no copy exists in the first place. Understanding this nuance gives a deeper understanding of the current C++ object model, so I will explain the pre-C++17 rules, what changes were made, and how they solve real-world problems.

Value Categories

To understand the before-and-after, we first need to understand what value categories are (I’ll explain copy elision in the next section). Continuing the theme of C++ misnomers, value categories are not categories of values. They are characteristics of expressions. Every expression in C++ has one of three value categories: lvalue, prvalue (pure rvalue), or xvalue (eXpring value). There are then two parent categories: all lvalues and xvalues are glvalues, and all prvalues and xvalues are rvalues.

diagram expressing the taxonomy described above

For an explanation of what these are, we can look at the standard (C++17 [basic.lval]/1):

  • A glvalue ([generalised lvalue]) is an expression whose evaluation determines the identity of an object, bit-field, or function.
  • A prvalue is an expression whose evaluation initializes an object or a bit-field, or computes the value of an operand of an operator, as specified by the context in which it appears.
  • An xvalue is a glvalue that denotes an object or bit-field whose resources can be reused (usually because it is near the end of its lifetime).
  • An lvalue is a glvalue that is not an xvalue.
  • An rvalue is a prvalue or an xvalue.

Some examples:

std::string s; 
s //lvalue: identity of an object 
s + " cake" //prvalue: could perform initialization/compute a value 

std::string f(); 
std::string& g(); 
std::string&& h(); 

f() //prvalue: could perform initialization/compute a value 
g() //lvalue: identity of an object 
h() //xvalue: denotes an object whose resources can be reused 

struct foo { 
    std::string s; 
}; 

foo{}.s //xvalue: denotes an object whose resources can be reused

C++11

What are the properties of the expression std::string{"a pony"}?

It’s a prvalue. Its type is std::string. It has the value "a pony". It names a temporary.

That last one is the key point I want to talk about, and it’s the real difference between the C++11 rules and C++17. In C++11, std::string{"a pony"} does indeed name a temporary. From C++11 [class.temporary]/1:

Temporaries of class type are created in various contexts: binding a reference to a prvalue, returning a prvalue, a conversion that creates a prvalue, throwing an exception, entering a handler, and in some initializations.

Let’s look at how this interacts with this code:

struct copyable {
    copyable() = default;
    copyable(copyable const&) { /*...*/ }
};
copyable make() { return {}; }
copyable x = make();

make() results in a temporary. This temporary will be moved into x. Since copyable has no move constructor, this calls the copy constructor. However, this copy is unnecessary since the object constructed on the way out of make will never be used for anything else. The standard allows this copy to be elided by constructing the return value at the call-site rather than in make (C++11 [class.copy]). This is called copy elision.

The unfortunate part is this: even if all copies of the type are elided, the constructor still must exist.

This means that if we instead have:

struct non_moveable {
    non_moveable() = default;
    non_moveable(non_moveable&&) = delete;
};
non_moveable make() { return {}; }
auto x = make();

then we get a compiler error:

<source>(7): error C2280: 'non_moveable::non_moveable(non_moveable &&)': attempting to reference a deleted function
<source>(3): note: see declaration of 'non_moveable::non_moveable'
<source>(3): note: 'non_moveable::non_moveable(non_moveable &&)': function was explicitly deleted

Aside from returning non-moveable types by value, this presents other issues:

auto x = non_moveable{}; //compiler error
  • The language makes no guarantees that the constructors won’t be called (in practice this isn’t too much of a worry, but guarantees are more convincing than optional optimizations).
  • If we want to support some of these use-cases, we need to write copy/move constructors for types which they don’t make sense for (and do what? Throw? Abort? Linker error?)
  • You can’t pass non-moveable types to functions by value, in case you have some use-case which that would help with.

What’s the solution? Should the standard just say “oh, if you elide all copies, you don’t need those constructors”? Maybe, but then all this language about constructing temporaries is really a lie and building an intuition about the object model becomes even harder.

C++17

C++17 takes a different approach. Instead of guaranteeing that copies will be elided in these cases, it changes the rules such that the copies were never there in the first place. This is achieved through redefining when temporaries are created.

As noted in the value category descriptions earlier, prvalues exist for purposes of initialization. C++11 creates temporaries eagerly, eventually using them in an initialization and cleaning up copies after the fact. In C++17, the materialization of temporaries is deferred until the initialization is performed.

That’s a better name for this feature. Not guaranteed copy elision. Deferred temporary materialization.

Temporary materialization creates a temporary object from a prvalue, resulting in an xvalue. The most common places it occurs are when binding a reference to or performing member access on a prvalue. If a reference is bound to the prvalue, the materialized temporary’s lifetime is extended to that of the reference (this is unchanged from C++11, but worth repeating). If a prvalue initializes a class type of the same type as the prvalue, then the destination object is initialized directly; no temporary required.

Some examples:

struct foo {
    int i;
};

foo make();
auto& a = make();  //temporary materialized and lifetime-extended
auto&& b = make(); //ditto

foo{}.i //temporary materialized

auto c = make(); //no temporary materialized

That covers the most important points of the new rules. Now on to why this is actually useful past terminology bikeshedding and trivia to impress your friends.

Who cares?

I said at the start that understanding the new rules would grant a deeper understanding of the C++17 object model. I’d like to expand on that a bit.

The key point is that in C++11, prvalues are not “pure” in a sense. That is, the expression std::string{"a pony"} names some temporary std::string object with the contents "a pony". It’s not the pure notion of the list of characters “a pony”. It’s not the Platonic ideal of “a pony”.

In C++17, however, std::string{"a pony"} is the Platonic ideal of “a pony”. It’s not a real object in C++’s object model, it’s some elusive, amorphous idea which can be passed around your program, only being given form when initializing some result object, or materializing a temporary. C++17’s prvalues are purer prvalues.

If this all sounds a bit abstract, that’s okay, but internalising this idea will make it easier to reason about aspects of your program. Consider a simple example:

struct foo {};
auto x = foo{};

In the C++11 model, the prvalue foo{} creates a temporary which is used to move-construct x, but the move is likely elided by the compiler.

In the C++17 model, the prvalue foo{} initializes x.

A more complex example:

std::string a() {
    return "a pony";
}

std::string b() {
    return a();
}

int main() {
    auto x = b();
}

In the C++11 model, return "a pony"; initializes the temporary return object of a(), which move-constructs the temporary return object of b(), which move-constructs x. All the moves are likely elided by the compiler.

In the C++17 model, return "a pony"; initializes the result object of a(), which is the result object of b(), which is x.

In essence, rather than an initializer creating a series of temporaries which in theory move-construct a chain of return objects, the initializer is teleported to the eventual result object. In C++17, the code:

T a() { return /* expression */ ; }
auto x = a();

is identical to auto x = /* expression */;. For any T.

Closing

The “guaranteed copy elision” rules do not guarantee copy elision; instead they purify prvalues such that the copy doesn’t exist in the first place. Next time you hear or read about “guaranteed copy elision”, think instead about deferred temporary materialization.

The first commercially available (claimed) verified compiler

Derek Jones from The Shape of Code

Yesterday, I read a paper containing a new claim by some of those involved with CompCert (yes, they of soap powder advertising fame): “CompCert is the first commercially available optimizing compiler that is formally verified, using machine assisted mathematical proofs, to be exempt from miscompilation”.

First commercially available; really? Surely there are earlier claims of verified compilers being commercial availability. Note, I’m saying claims; bits of the CompCert compiler have involved mathematical proofs (i.e., code generation), so I’m considering earlier claims having at least the level of intellectual honesty used in some CompCert papers (a very low bar).

What does commercially available mean? The CompCert system is open source, so I guess it’s commercially available via free downloading (the paper does not define the term).

Computational Logic, Inc is the name that springs to mind, when thinking of commercial and formal verification. They were active from 1983 to 1997, and published some very interesting technical reports about their work (sadly there are gaps in the archive). One project was A Mechanically Verified Code Generator (in 1989) and their Gypsy system (a Pascal-like language+IDE) provided an environment for doing proofs of programs (I cannot find any reports online). Piton was a high-level assembler and there was a mechanically verified implementation (in 1988).

There is the Danish work on the formal specification of the code generators for their Ada compiler (while there was a formal specification of the Ada semantics in VDM, code generators tend to be much simpler beasts, i.e., a lot less work is needed in formal verification). The paper I have is: “Retargeting and rehosting the DDC Ada compiler system: A case study – the Honeywell DPS 6″ by Clemmensen, from 1986 (cannot find an online copy). This Ada compiler was used by various hardware manufacturers, so it was definitely commercially available for (lots of) money.

Are then there any earlier verified compilers with a commercial connection? There is A PRACTICAL FORMAL SEMANTIC DEFINITION AND VERIFICATION SYSTEM FOR TYPED LISP, from 1976, which has “… has proved a number of interesting, non-trivial theorems including the total correctness of an algorithm which sorts by successive merging, the total correctness of the McCarthy-Painter compiler for expressions, …” (which sounds like a code generator, or part of one, to me).

Francis Morris’s thesis, from 1972, proves the correctness of compilers for three languages (each language contained a single feature) and discusses how these features may be combined into a more “realistic” language. No mention of commercial availability, but I cannot see the demand being that great.

The definition of PL/1 was written in VDM, a formal language. PL/1 is a huge language and there were lots of subsets. Were there any claims of formal verification of a subset compiler for PL/1? I have had little contact with the PL/1 world, so am not in a good position to know. Anybody?

Over to you dear reader. Are there any earlier claims of verified compilers and commercial availability?