Source code chapter added to “Evidence-based software engineering using R”

Derek Jones from The Shape of Code

The Source Code chapter of my evidence-based software engineering book has been added to the draft pdf (download here).

This chapter has suffered from coming last and there is still lots of work to be done. Almost all the source code related data has been plundered to fill up earlier chapters. Some data did not make the cut-off for release of the draft; a global review will probably result in some data migrating back to this chapter.

When talking to developers about the book I am constantly being asked ‘what is empirical software engineering?’ My explanation uses the phrase ‘evidence-based’, which everybody seems to immediately understand. It is counterproductive having a title that has to be explained, so I have changed the title to “Evidence-based Software Engineering using R”.

What is the purpose of a chapter discussing source code in a book on evidence-based software engineering? Source code is obviously an essential component of the topics discussed in the other chapters, but what is so particular to source code that it could not be said elsewhere? Having spent most of my professional life studying source code, first as a compiler writer and then involved with static analysis, am I just being driven by an attachment to the subject?

My view of source code is very different from most other developers: when developers talk about code, they spend most of the time talking about how they do things, when I talk about code I spend most of the time talking about how other developers do things (I’m a mongrel writer of code). Developers’ blinkered view of code prevents them seeing bigger pictures. I take a Gricean view of code and refrain from using meaningless marketing terms such as maintainability, readability and testability.

I have lots of source code data of interest to compiler writers (who are not the target audience) and I have lots of data related to static analysis (tool developers are not the audience). The target audience is professional software developers and hopefully what has been written is of interest to that readership.

I have been promised all sorts of data. Hopefully some of it will arrive. If somebody tells you they promised to send me data, please encourage them to take some time to sort out the data and send it.

As always, if you know of any interesting software engineering data, please tell me.

Finalizing the statistical analysis material in the second half of the book (released almost two years ago) next.

Perl’s failure to grow and Python takes over

Derek Jones from The Shape of Code

Perl, once the most widely used scripting language, has been in decline for many years; the decline now looks terminal (many decades from now, when its die-hard users have died), what happened?

Python is what happened. Why was this? Did Perl have a major fail, did Python acquire pixie dust that could not be replicated, or something else?

Some commentators point to the failure to produce a timely release of Perl 6; a major reworking of the language announced in 2000 with a stumbling release made available around 2015.

I think the real issue is a failure for Perl to take off outside its core use as a systems language. Perl is famous for its one-liners, but not for writing large programs (yes, it can be done, but would many developers would really want to?); a glance of the categories in its module library shows; those 174,970 modules (at the time of writing) are not widely spread over application domains (i.e., not catering to a wide audience).

Perl 5 was failing to grow outside its base before Perl 6 began its protracted failure to launch.

Language use is a winner take-all game, developers create more packages, support tools, and new users who combine to attract more developers. Continuing support for minority languages comes from die-hard users, existing software that is worth somebody paying to maintain and niche advantages.

These days, language success is founded on the associated package ecosystem (Go and Rust have minuscule package ecosystems, which is why they are living on borrowed time, other languages will eventually take away their sheen of trendiness). Developers use languages to build stuff, the days of writing the code for almost everything are long gone; interesting software is created by taking advantage of packages written by others. Python was in the right place, at the right time to acquire a wide variety of commercial grade packages.

It’s difficult to see Python being displaced as the lingua franca of software development. Its language features are almost irrelevant, its package ecosystem is everything. The winner will eventually take all.

I’m sure the cycle of languages becoming popular for a few years, before disappearing, will continue. There have always been, and will always be, fashionable languages.

TODO or TODO Not – Redux

Chris Oldwood from The OldWood Thing

One of my most “successful” posts was one I wrote way back in 2011 about my dislike for TODO style comments in code – “TODO or TODO Not - There Is No Later”. The premise of that post, which includes a number of examples from real codebases I’ve worked on, is that they are fundamentally pointless because they are almost certainly too low in value to get done. If they were valuable enough they either should be a proper feature on the backlog or be left to be handled as part of a relevant story, e.g. refactoring.

At the time I wrote that post I was convinced about them not living in the codebase (past the feature’s release) but I suggested that any potentially useful ones should be converted into bona fide user stories so they could be formally considered and prioritised along with the other items. After receiving a reply to a tweet that referenced my aging blog post from a company that tries to quantify technical debt by analysing such TODO style comments I felt it must be time for an update. (The TL;DR of this post was my reply to them.)

// Learn to Let Go

One of the hardest lessons I have learned over the intervening 7 years is how to let go of stuff. What caused me to cling onto some of those TODOs that I would run across was a fear of forgetting something important. My daily use of a log book (see “Pen & Paper”) to record notes as I go along exemplifies my apparent need to keep track of my current state as my brain is like the proverbial sieve. In an era where change was much harder because the software development QA process was largely manual this makes sense as it was more popular to batch-up changes. Couple this with a general disregard for anything non-functional, such as an appreciation for refactoring, and it’s all too easy to see why people bury their personal backlog in the code rather than open it up for discussion with non-developers.

There is a familiar ring here and that’s because I’ve walked this path before more recently in 2016’s “Developers Can Be Their Own Worst Enemy”. With a modern development process that puts an emphasis on trust and transparency we can let go of the past and should have more confidence in our peers and the management to see the value in our opinions.

It’s also not acceptable to just leave a cryptic comment in the code about why something should be implemented differently or moved elsewhere – the need to change must be backed up with a reason so that we can understand the value in it, and most TODO comments are throwaway rather than well reasoned arguments.

// Measuring Technical Debt

I’ll be honest, I genuinely cannot see the point of attempting to measure the level of technical debt in a codebase, even if it were possible to do so. For a start you need to decide if you’re taking the original interpretation – a conscious decision to temporarily take a shortcut - or the ever more popular one – crap code. Even if you could do that, what are you ever going to do with the output? I once saw a SonarQube report that said we had £X million of technical debt in a codebase I was working on; how exactly does that inform you?

The reason I find it pointless is because it feels like it’s measuring the wrong thing. Just like the story about the man looking for his keys under the streetlight we apply a tool to measure the quality of the code when what really matters is measuring our ability to deliver. The reason poor quality code affects us is because it makes future change hard and slows us down. Hence if it matters there must be a causal link because if the code is that bad our pace of delivery will slow down and we’ll see that (all other things being equal).

But what is more important in my mind is that even if you did know how much debt you had you would only want to focus on the areas that are subject to change, and you do that already by focusing on the most valuable work in the backlog! And the technique for tackling the debt is refactoring. Hence we already have what we need to address the real problem.

// Trimming the Backlog

Unwinding the stack slightly let me return to the topic of reflecting unfinished work in the product’s backlog.

That sentence should already be making your spider-senses tingle – how can it be unfinished? If we’ve not met the acceptance criteria we’re not done, so carry on. If the acceptance criteria was wrong or incomplete then that’s a discussion to have with the product owner for clarification. Note that “of good quality” is an acceptance criteria inherent in every story we do as it is virtually non-negotiable [1].

Perhaps a better term would be undiscovered work. Woody Zuill has this saying “it’s in the doing of the work that we discover the work that needs doing”. What we often unearth are things which never showed up in our initial analysis and therefore we now have to factor them into our schedule or decide to drop them. When we’re knee deep in the feature it may seem really important to do it now, but given time the need may slowly dissolve, and in the end possibly disappear altogether.

What you might initially put on the backlog (or write as a TODO in the code) could be quite specific, e.g. “Need to handle poisoned messages”. As you continue you might also add other scenarios such as “Should refresh token before it expires”  and “Triage malformed messages separately from those well-formed but still invalid”. These are all valuable behaviours which go towards building a reliable service but the initial focus might not be on reliability, maybe its currently just a tactical fix to enable learning about something else. You don’t want to waste time over-engineering but at the same time you also don’t want to forget what else you have learned doing it.

The problem with this mentality is that the backlog just grows and grows, and we’re back to where we were before with TODOs littered all over the code – it’s just an every growing feature list. As time passes the need to implement some of those features will either happen because they have suddenly become important (it’s no longer tactical) or they will vanish (it remains a mere stop-gap). Leaving the stories on the backlog just makes it harder and harder to prioritise as you keep going over old ground again and again.

Once again we need to let go and rely on the fact that if they are important they will eventually resurface. If you feel really uncomfortable about deleting them entirely then you might consider rolling up related stories back into epic sized ones as part of the backlog grooming. Applying this to our recent example you could easily fold them all into a single “Service Reliability” epic that would be easier to handle. You could turn each card’s title into a checklist item.

// Confidence

That said as I get older I become less tolerant of a never-ending backlog and want to be more aggressive in the backlog grooming. Part of that is down to having more confidence in knowing that issues will be addressed [2] in a more timely fashion and that prioritisation will take into account both the technical and functional features with more or less equal consideration because the team is trusted to do The Right Thing.

Being more experienced there is no doubt more than a grain of truth that my sphere of influence is probably wider but that shouldn’t really matter. Every story must be valuable and ideally stand-up on its own merit, where my experience comes in is probably in being able to express that value more succinctly.

// TODO: Redux

I haven’t seen, heard or read anything in the intervening years that has been able to convince me in the value of using TODO comments in the code as a more successful technique of managing what needs to be done. If anything my appetite for tracking any work outside the next few sprints / weeks has begun to wane simply because it is now so easy to change in direction with only a moment’s notice should the situation deteriorate. This does not mean we should be reckless, far from it, but leaving a TODO in the code as a way of conveying a change in architecture hardly seems optimal either.

As for the notion that a TODO in the code can be equated with an increase in technical debt I don’t buy that either. I would posit it is more likely to indicate a failure within the development process itself as the mismatch between behaviour and acceptance criteria either indicates a bug in the code or a bug in the requirements and neither of these outcomes sounds like something that should be brushed off lightly with a comment in the code.

 

[1] I try not to deal in absolutes as there always seems to be an exception, but either way it must be a conscious decision.

[2] Assuming the organisation is behaving in a moderately agile way and not just paying lip service to a couple of the ceremonies or practices.

Evolutionary pressures on C++, Java and Python

Derek Jones from The Shape of Code

The future evolution of C++, Java and Python is being driven by very different interested parties, and it’s going to be interesting watching events unfold over the next 5-10 years.

I have previously written about how the C++ Standard’s committee is past its sell-by date, has taken off its ball and chain and is now in the hands of bored consultants.

Bjarne Stroustrup was once effectively treated as C++’s Benevolent Dictator For Life (during the production of the first C++ Standard some people were labeled as Bjarne groupees); things have moved on since then, but the ‘old-guard’ are trying to make a comeback. Suggesting that people ought to base their thinking on a book published almost 25-years ago (Stroustrup’s “The Design and Evolution of C++”; a very interesting book that is well worth reading) creates a rather backward looking image. Bored consultants are looking to work on exciting new ideas. The old-guard need to appear modern to attract followers (even if the ideas are old ideas with a fresh coat of paint).

The threat to C++ is from bored consultants, each adding their own pet idea to the language standard; a situation that Stroustrup thinks is starting to happen.

Java, the language, is owned by Oracle, the company (let’s not get too involved in exactly what they own, have copyright on, etc). Oracle are not shy about asking people for licensing fees. Java is now on a 6-month release cycle (at least the Oracle version, there are Open Source implementations) and the free support only applies to the current release; paying a license fee buys support for versions older than 6-months. In the short term, the cheapest solution is for companies to pay for support.

Oracle are always happy to send in the lawyers and if too many customers switch to non-Oracle implementations, I’m sure something can be found to introduce enough uncertainty to discourage work/distribution involving Open Source Java implementations.

Will Java survive Oracle’s licensing? It is not in their interest for Java to die; Oracle will adjust their terms to keep the money flowing in, but over the longer term I think willing Java developers are going to be hard to find.

Guido van Rossum recently removed himself from the post of Python’s Benevolent Dictator For Life. One of the jobs of a benevolent dictator is maintaining some degree of language coherence, which involves preventing people’s pet ideas from being added to the language. Does this mean that Python is slowly going to be become more and more bloated? Perhaps, but I think a more likely problem is a language fork, multiple implementations of slightly different (at first) languages all claiming to be Python.

These days, the strength of Python is its large collection of very useful, commercial grade, packages, and future language details may turn out to be irrelevant. There is a lot to learn from the Python 2/3 transition, but true believers like to think that things will turn out differently for them.

On Blockade – student

student from thus spake a.k.

Recall that the Baron's game is comprised of taking turns to place dominoes on a six by six grid of squares with each domino covering a pair of squares. At no turn was a player allowed to place a domino such that it created an oddly-numbered region of empty squares and Sir R----- was to be victorious if, at the end of play, the lines running between the ranks and files of the board were each and every one straddled by at least one domino.

Delivery Anti-Pattern: Local Optimisations

Chris Oldwood from The OldWood Thing

The daily stand-up mostly went along as usual. I wasn’t entirely sure why there was a stand-up as it wasn’t so much a team as a bunch of people working on the same codebase but with more-or-less individual goals. Applying the microservices analogy to the team you could say we were a bunch of micro-teams – each person largely acting with autonomy rather than striving to work together to achieve a common goal [1].

Time

But I digress, only slightly. What happened at the stand-up was a brief discussion around the delivery of a feature with the suggestion that it should be hidden behind a feature toggle. The implementer explained that they weren’t going to add a feature toggle because “it was a waste of their time”.

This surprised me. Knowing what I do now about how the team operates it isn’t that surprising but at the time I was used to working in teams where every member works towards common goals. One of those common goals is to try and ensure the delivery of features is a continuous flow and is not disrupted by a bad change which then has to be backed out because rolling back has the potential to create all sorts of disruption, not least the delay of those unaffected changes.

You should note that I’m not disagreeing with their choice of whether or not to use a feature toggle – I did not know anywhere near enough about the change or the system at that time to contribute to that decision. No, what disturbed me was the reason why they chose not to take that approach – that their time is more valuable than that of anyone else in the team, or the business for that matter.

In isolation that paints an unpleasant picture of the individual in question and that simply is not the case. However their choice of words, even if done without real consideration, does appear to reinforce the culture that surrounds them. In essence, with a feeling that the focus is on them and their performance, they are naturally going to behave in a way that favours optimising delivering their own features over that of the team at large.

Quality

Another example of favouring a local optimisation over the longer term goal of sustained delivery occurred when I was assigned my first piece of work. This was not so much a story as a couple of epics funded as an entire project (over 4 months solid work in the end). My instinct, after being shown roughly where in the code I needed to dig, was to ask where the existing tests were so that I knew where to add mine. The tech lead’s immediate response was “you won’t have time to write tests”.

My usual response to this statement is a jovially phrased “how will I know if it works then?” which often has the effect of opening a line of dialogue around the testing strategy and where it’s heading. Unfortunately this time around it only succeeded in the tech lead launching into a diatribe about how important delivery was, how much the business trusted us to deliver on time, blah, blah, blah, in fact almost everything that a good test suite enables!

Of course I still went ahead and implemented the entire project TDD-style and easily delivered it on time because I knew the approach was sound and the investment was more than worth it. The subsequent enhancements and repaying of some technical debt also became trivial at that point and meant that anyone, not just me, could safely and quickly make changes to that area of code. It also showed how easy it was to add new tests to cover changes to the older parts of the component when required later.

In the end over 10% of unit tests of the entire system had been written by me during that project for a codebase of probably over 1/2 million lines of code. I also added a command line test harness and a regression testing “framework” [2] in that time too all in an effort to reduce the amount of hoops you needed to go through to diagnose and safely fix any edge cases that showed up later. None of this was rocket science or in the least bit onerous.

Knowledge

I would consider a lack of supporting documentation one further local optimisation too. When only a select few have the knowledge to help support a system you have to continually rely on their help to nurse it through the bad times. This is especially true when the system has enough quirks that the cost of taking the wrong action is quite high (in terms of additional noise). If you need to remember a complex set of conditions and actions you’re going to get it wrong eventually without some form of checklist to work from. Relying on tribal knowledge is a great form of optimisation until core members of the team leave and you unearth the gaping holes in the team’s knowledge.

Better yet, design away the problems entirely, but that’s a different can of worms…

Project Before Product

I believe this was another example of how “projects” are detrimental to the development of a complex system. With the team funded by various projects and those projects being used as a very clear division on the task board through swim lanes [3] it killed the desire to swarm on anything but a production incident because you felt beholden to your specific stakeholders.

For example there were a number of conversations about fixing issues with the system that were slowing down delivery through unreliability that ended with “but who’s going to pay for that?” Although improvements were made they had to be so small as to not really affect the delivery of the project work. Hence the only real choice was to find easier ways to treat the symptoms rather than cure the disease.

Victims of Circumstance

Whenever I bump into this kind of culture my gut instinct is not to assume they are “incompetent” people, on the contrary, they’re clearly intelligent so I’ll assume they are shaped by their environment. Of course we all have our differences, that’s what makes diversity so useful, but we have to remember to stop once in a while and reflect on what we’re doing and question whether it’s still the right approach to take. What works for building Fizz Buzz does not work for a real-time, distributed calculation engine. And even if that approach did work once upon a time the world keeps moving on and so now we might be able to do even better.

 

[1] Pairing was only something you did when you’d already been stuck for some time, and when the mistake was found you went your separate ways again.

[2] I say “framework” because it was really just leveraging a classic technique: a command line tool reading CSV format data which fired requests into a server, the results of which are then diff’d against a known set of results (Golden Master Testing).

[3] The stand-up was originally run in project order, lead by the PM. Unsurprisingly those not involved in the other projects were rarely engaged in the meeting unless it was their turn to speak.