Requirements, User Stories & Backlog

Allan Kelly from Allan Kelly Associates

At the end of January I’m running my 1-day Requirements, User Stories and Backlogs workshop in London with Learning Connexions. I get great feedback from people who attend the course, perhaps because it is mostly exercise based.

If you’re interested check out the Learning Connexions page – it’s just one day and won’t break the bank! Hope to see you there.

The post Requirements, User Stories & Backlog appeared first on Allan Kelly Associates.

Reliability chapter of ‘evidence-based software engineering’ updated

Derek Jones from The Shape of Code

The Reliability chapter of my evidence-based software engineering book has been updated (draft pdf).

Unlike the earlier chapters, there were no major changes to the initial version from over 18-months ago; we just don’t know much about software reliability, and there is not much public data.

There are lots of papers published claiming to be about software reliability, but they are mostly smoke-and-mirror shows derived from work down one of several popular rabbit holes:

The growth in research on Fuzzing is the only good news (especially with the availability of practical introductory material).

There is one source of fault experience data that looks like it might be very useful, but it’s hard to get hold of; NASA has kept detailed records about what happened during space missions. I have had several people promise to send me data, but none has arrived yet :-(.

Updating the reliability chapter did not take too much time, so I updated earlier chapters with data that has arrived since they were last released.

As always, if you know of any interesting software engineering data, please tell me.

Next, the Source code chapter.

Building an OpenBSD Wireguard server

Timo Geusch from The Lone C++ Coder's Blog

In my previous post, I mentioned that I somehow ended up with a corrupted filesystem on the WireGuard server I had set up earlier this year. That iteration of my VPN server was built on Linux as I expected I would get better performance using the kernel-based WireGuard implementation. It had taken me a while […]

The post Building an OpenBSD Wireguard server appeared first on The Lone C++ Coder's Blog.

A review: .Net Core in Action

Paul Grenyer from Paul Grenyer

.Net Core in Action
by Dustin Metzgar
ISBN-13: 978-1617294273

I still get a fair amount of flak for buying and reading technical books in the 21st Century - almost as much as I get for still buying and listening to CDs. If I was a vinyl loving hipster, it would be different of course…. However, books like .Net Core in Action are a perfect example of why I do it. I needed to learn what .Net Core was and get a feel for it very quickly and that is what this book allowed me to do.

I’ve been very sceptical of .Net development for a number of years, mostly due to how large I perceived the total cost of ownership and the startup cost to be and the fact that you have to use Windows.  While this was previously true, .Net Core is different and .Net Core in Action made me understand that within the first few pages of the first chapter. It also got me over my prejudice towards Docker by the end of the second chapter.

The first two chapters are as you would expect, an introduction followed by various Hello World examples. Then it gets a bit weird as the book dives into the build system next and then unit testing (actually, this is good so early) and then two chapters on connecting to relational databases, writing data access layers and ORMs. There’s a sensible chapter on micro services before the weirdness returns with chapters on debugging, performance profiling and internationalisation. I can kind of see how the author is trying to show the reader the way different parts of .Net Core work on different platforms (Windows, Linux, Mac), but this relatively small volume could have been more concise.




DevelopHER Overall Award 2019

Paul Grenyer from Paul Grenyer

I was honoured and delighted to be asked to judge and present the overall DevelopHER award once again this year. Everyone says choosing a winner is difficult. It may be a cliche, but that doesn’t change the fact that it is.

When the 13 category winners came across my desk I read through them all and reluctantly got it down to seven. Usually on a first pass I like to have it down to three or four and then all I need to agonise over is the order. Luckily on the second pass I was able to be ruthless and get it down to four.

To make it even more difficult, three of my four fell into three categories I am passionate about:

  • Technical excellence and diversity
  • Automated Testing
  • Practical, visual Agile

And the fourth achieved results for her organisation which just couldn’t be ignored.

So I read and reread and ordered and re-ordered. Made more tea, changed the CD and re-read and re-ordered some more. Eventually it became clear.

Technical excellence and the ability for a software engineer to turn their hand to new technologies is vital. When I started my career there were basically two main programming languages, C++ and Java. C# came along soon after, but most people fell into one camp or another and a few of us crossed over. Now there are many, many more to choose from and lots of young engineers decide to specialise in one and are reluctant to learn and use others. This diminishes us all as an industry. So someone who likes to learn new and different technologies is a jewel in any company’s crown.

The implementation of Agile methodologies in Software Development is extremely important. Software, by its very nature, is complex. Only on the most trivial projects does the solution the users need look anything like what they thought they wanted at the beginning. Traditional waterfall approaches to software development do not allow for this. The client requires flexibility and we as software engineers need the flexibility to deliver what they need. Software development is a learning process for both the client and the software engineer. Agile gives us a framework for this. Unlike many of the traditional methods, Agile has the flexibility to be agile itself, giving continuous improvement.

When implementing Agile processes, the practices are often forgotten or neglected and in many ways they are more important. Not least of these is automated testing: the practice of writing code which tests your code and running it at least on every checkin. This gives you a safety net that code you’ve already written isn’t broken by new code you write. And when it is, the tests tell you what’s wrong and where it’s wrong. We need more of this as an industry and that is why I chose Rita Cristina Leitao, an automated software tester from Switch Studios, as the overall DevelopHER winner.


Finally On An Ethereal Orrery – student

student from thus spake a.k.

Over the course of the year, my fellow students and I have been experimenting with an ethereal orrery which models the motion of heavenly bodies using nought but Sir N-----'s laws of gravitation and motion. Whilst the consequences of those laws are not generally subject to solution by mathematical reckoning, we were able to approximate them with a scheme that admitted errors of the order of the sixth power of the steps in time by which we advanced the positions of those bodies.
We have thus far employed it to model the solar system itself, uniformly distributed bodies of matter and the accretion of bodies that are close to Earth's orbit about the Sun. Whilst we were most satisfied by its behaviour, I should now like to report upon an altogether more surprising consequence of its engine's action.

Branching 0 – Git 1

Chris Oldwood from The OldWood Thing

My recent tirade against unnecessary branching – “Git is Not the Problem” – might have given the impression that I don’t appreciate the power that git provides. That’s not true and hopefully the following example highlights the appreciation I have for the power git provides but also why I dislike being put in that position in the first place.

The Branching Strategy

I was working in a small team with a handful of experienced developers making an old C++/ATL based GUI more accessible for users with disabilities. Given the codebase was very mature and maintenance was minimal, our remit only extended so far as making the minimal changes we needed to both the code and resource files. Hence this effectively meant no refactoring – a strictly surgical approach.

The set-up involved an integration branch per-project with us on one and the client’s team on another – master was reserved for releases. However, as they were using Stash for their repos they also wanted us to make use of its ability to create separate pull requests (PR) for every feature. This meant we needed to create independent branches for every single feature as we didn’t have permission to push directly to the integration branch even if we wanted to.

The Bottleneck

For those who haven’t had the pleasure of working with Visual Studio and C++/ATL on a native GUI with other people, there are certain files which tend to be a bottleneck, most notably resource.h. This file contains the mapping for the symbols (nay #defines) to the resource file IDs. Whenever you add a new resource, such as a localizable string, you add a new symbol and bump the two “next ID” counters at the bottom. This project ended up with us adding a lot of new resource strings for the various (localizable) annotations we used to make the various dialog controls more accessible [1].

Aside from the more obvious bottleneck this resource.h file creates, in terms of editing it in a team scenario, it also has one other undesirable effect – project rebuilds. Being a header file, and also one that has a habit of being used across most of the codebase (whether intentionally or not) if it changes then most of the codebase needs re-building. On a GUI of the size we were working on, using the development VMs we had been provided, this amounted to 45 minutes of thumb twiddling every time it changed. As an aside we couldn’t use the built-in Visual Studio editor either as the file had been edited by hand for so long that when it was saved by the editor you ended up with the diff from hell [2].

The Side-Effects

Consequently we ran into two big problems working on this codebase that were essentially linked to that one file. The first was that adding new resources meant updating the file in a way that was undoubtedly going to generate a merge conflict with every other branch because most tasks meant adding new resources. Even though we tried to coordinate ourselves by introducing padding into the file and artificially bumping the IDs we still ended up causing merge conflicts most of the time.

In hindsight we probably could have made this idea work if we added a huge amount of padding up front and reserved a large range of IDs but we knew there was another team adding GUI stuff on another branch and we expected to integrate with them more often than we did. (We had no real contact with them and the plethora of open branches made it difficult to see what code they were touching.)

The second issue was around the rebuilds. While you can git checkout -b <branch> to create your feature branch without touching resource.h again, the moment you git pull the integration branch and merge you’re going to have to take the hit [3]. Once your changes are integrated and you push your feature branch to the git server it does the integration branch merge for you and moves it forward.

Back on your own machine you want to re-sync by switching back to the integration branch, which I’d normally do with:

> git checkout <branch>
> git pull --ff-only

…except the first step restores the old resource.h before updating it again in the second step back to where you just were! Except now we’ve got another 45 minute rebuild on our hands [3].

Git to the Rescue

It had been some years since any of us had used Visual Studio on such a large GUI and therefore it took us a while to work out why the codebase always seemed to want rebuilding so much. Consequently I looked to the Internet to see if there was a way of going from my feature branch back to the integration branch (which should be identical from a working copy perspective) without any files being touched. It’s git, of course there was a way, and “Fast-forwarding a branch without checking it out” provided the answer [4]:

> git fetch origin <branch>:<branch>
> git checkout <branch>

The trick is to fetch the branch changes from upstream and point the local copy of that branch to its tip. Then, when you do checkout, only the branch metadata needs to change as the versions of the files are identical and nothing gets touched (assuming no other upstream changes have occurred in the meantime).
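The effect can be reproduced in a throwaway sandbox. The sketch below is only an illustration under stated assumptions: the repository paths, the branch names (integration, feature) and the contents of resource.h are all made up, and pushing the feature branch directly onto the server's integration branch stands in for the server-side pull request merge.

```shell
#!/bin/sh
# Sketch only: every path, branch name and file below is hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# A bare repo stands in for the git server.
git init -q --bare server.git
git init -q dev
cd dev
git config user.email "dev@example.com"
git config user.name "Dev"
git remote add origin "$sandbox/server.git"

# Integration branch with an initial resource.h.
git checkout -q -b integration
echo "#define IDS_FIRST 100" > resource.h
git add resource.h
git commit -q -m "Initial resource.h"
git push -q origin integration

# Feature branch adds a resource; pushing it onto the server's integration
# branch is a stand-in for the server merging the pull request.
git checkout -q -b feature
echo "#define IDS_SECOND 101" >> resource.h
git commit -q -am "Add a resource string"
git push -q origin feature:integration

# Marker file, so we can detect whether resource.h gets rewritten afterwards.
sleep 1
touch "$sandbox/marker"
sleep 1

# Fast-forward the local integration branch without checking it out,
# then switch to it.
git fetch -q origin integration:integration
git checkout -q integration

# Empty if resource.h was not modified after the marker was created.
touched=$(find resource.h -newer "$sandbox/marker")
echo "rewritten files: ${touched:-none}"
```

If the local integration branch has been fast-forwarded to the same commit as the feature branch, the checkout has nothing to rewrite, so the script should report no rewritten files and the build system sees an unchanged timestamp on resource.h.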

Discontinuous Integration

In a modern software development world where we strive to integrate as frequently as possible with our colleagues it’s issues like these that remind us what some of the barriers are for some teams. Visual C++ has been around a long time (since 1993) so this problem is not new. It is possible to break up a GUI project – it doesn’t need to have a monolithic resource file – but that requires time & effort to fix and needs to be done in a timely fashion to reap the rewards. In a product this old which is effectively on life-support this is never going to happen now.

As Jerry Weinberg once said “Things are the way they are because they got that way” which is little consolation when the clock is ticking and you’re watching the codebase compile, again.

 

[1] I hope to write up more on this later as the information around this whole area for native apps was pretty sparse and hugely diluted by the same information for web apps.

[2] Luckily it’s a fairly easy format but laying out controls by calculating every window rectangle is pretty tedious. We eventually took a hybrid approach for more complex dialogs where we used the resource editor first, saved our code snippet, reverted all changes, and then manually pasted our snippet back in thereby keeping the diff minimal.

[3] Yes, you can use touch to tweak the file’s timestamp but you need to be sure you can get away with that by working out what the effects might be.

[4] As with any “googling” knowing what the right terms are, to ask the right question, is the majority of the battle.

Git is Not the Problem

Chris Oldwood from The OldWood Thing

Git comes in for a lot of stick for being a complicated tool that’s hard to learn, and they’re right, git is a complicated tool. But it’s a tool designed to solve a difficult problem – many disparate people collaborating on a single product in a totally decentralized fashion. However, many of us don’t need to work that way, so why are we using the tool in a way that makes our lives more difficult?

KISS

For my entire professional programming career, which now spans over 25 years, and my personal endeavours, I have used a version control tool (VCS) to manage the source code. In that time, for the most part, I have worked in a trunk-based development fashion [1]. That means all development goes on in one integration branch and the general philosophy for every commit is “always be ready to ship” [2]. As you might guess feature toggles (in many different guises) play a significant part in achieving that.

A consequence of this simplistic way of working is that my development cycle, and therefore my use of git, boils down to these few steps [3]:

  • clone
  • edit / build / test
  • diff
  • add / commit
  • pull
  • push

There may occasionally be a short inner loop where a merge conflict shows up during the pull (integration) phase which causes me to go through the edit / diff / commit cycle again, but by-and-large conflicts are rare due to close collaboration and very short change cycles. Ultimately though, from the gazillions of commands that git supports, I mostly use just those 6. As you can probably guess, despite using git for nearly 7 years, I actually know very little about it (command wise). [4]
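As a rough sketch, the six steps map onto a shell session like the one below. Everything in it is an assumption made for illustration: the sandbox repositories, the file app.txt, the grep standing in for a real build and test run, and the use of --rebase on the pull (the post does not say how the pull is integrated).

```shell
#!/bin/sh
# Sketch only: repo names, file names and the "test" are hypothetical.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the team's shared repository, seeded with one commit.
git init -q seed
cd seed
git config user.email "dev@example.com"
git config user.name "Dev"
echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed shared.git

# 1. clone
git clone -q shared.git work
cd work
git config user.email "dev@example.com"
git config user.name "Dev"

# 2. edit / build / test (grep is a placeholder for the real build and tests)
echo "v2" > app.txt
grep -q "v2" app.txt

# 3. diff: review what is about to be committed
git diff --stat

# 4. add / commit
git add app.txt
git commit -q -m "Bump to v2"

# 5. pull: integrate anything colleagues pushed in the meantime
#    (--rebase is an assumption, not something the workflow above mandates)
git pull -q --rebase

# 6. push
git push -q

local_head=$(git rev-parse HEAD)
shared_head=$(git --git-dir="$sandbox/shared.git" rev-parse HEAD)
echo "local:  $local_head"
echo "shared: $shared_head"
```

When a conflict does show up at step 5, the inner loop described above is simply steps 2 to 4 again before retrying the pull and push.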

Isolation

Where I see people getting into trouble and subsequently venting their anger is when branches are involved. This is not a problem which is specific to git though; you see this crop up with any VCS that supports branches, whether it be ClearCase, Perforce, Subversion, etc. Hence, the tool is not the problem, the workflow is. And that commonly stems from a delivery process mandated by the organization, meaning that ultimately the issue is one of an organizational nature, not the tooling per se.

An organisation which seeks to reduce risk by isolating work (and by extension its people) onto branches is increasing the delay in feedback thereby paradoxically increasing the risk of integration, or so-called “merge debt”. A natural side-effect of making it harder to push through changes is that people will start batching up work in an attempt to boost “efficiency”. The trick is to go in the opposite direction and break things down into smaller units of work that are easier to produce and quicker to improve. Balancing production code changes with a solid investment in test coverage and automation reduces that risk further, along with collaboration-boosting techniques like pair and mob programming.

Less is More

Instead of enforcing a complicated workflow and employing complex tools in the hope that we can remain in control of our process we should instead seek to keep the workflow simple so that our tools remain easy to use. Git was written to solve a problem most teams don’t have as they have neither the volume of distributed people nor the complexity of product to deal with. Organisations that do have complex codebases cannot expect to dig themselves out of their hole simply by introducing a more powerful version control tool; it will only increase the cost of delay while bringing a false sense of security as programmers work in the dark for longer.

 

[1] My “Branching Strategies” article in ACCU’s Overload covers this topic if you’re looking for a summary.

[2] This does not preclude the use of private branches for spikes and/or release branches for hotfix engineering when absolutely needed. #NoAbsolutes.

[3] See “In The Toolbox - Commit Checklist” for some deeper discussion about what goes through my head during the diff / commit phase.

[4] I pondered including “log” in the list for when doing a spot of software archaeology but that is becoming much rarer these days. I also only use “fetch” when I have to work with feature branches.

The Renzo Pomodoro dataset

Derek Jones from The Shape of Code

Estimating how long it will take to complete a task is hard work, and the most common motivation for this work comes from external factors, e.g., the boss, or a potential client asks for an estimate to do a job.

People also make estimates for their own use, e.g., when planning work for the day. Various processes and techniques have been created to help structure the estimation process; for developers there is the Personal Software Process, and specifically for time estimation (but not developer specific), there is the Pomodoro Technique.

I met Renzo Borgatti at the first talk I gave on the SiP dataset (Renzo is the organizer of the Papers We Love meetup). After the talk, Renzo told me about his use of the Pomodoro Technique, and how he had 10 years’ worth of task estimates; wow, I was very interested. What happened next, and a work-in-progress analysis (plus data and R scripts) of the data can be found in the Renzo Pomodoro dataset repo.

The analysis progressed in fits and starts; like me Renzo is working on a book, and is very busy. The work-in-progress pdf is reasonably consistent.

I had never seen a dataset of estimates made for personal use, and had not read about the analysis of such data. When estimates are made for consumption by others, the motives involved in making the estimate can have a big impact on the values chosen, e.g., underestimating to win a bid, or overestimating to impress the boss by completing a task under budget. Is a personal estimate motive free? The following plot led me to ask Renzo if he was superstitious (in not liking odd numbers).

Number of tasks having a given number of estimate and actual Pomodoro values.

The plot shows the number of tasks for which there are a given number of estimates and actuals (measured in Pomodoros, i.e., units of 25 minutes). Most tasks are estimated to require one Pomodoro, and actually require this amount of effort.

Renzo educated me about the details of the Pomodoro technique, e.g., there is a 15-30 minute break after every four Pomodoros. Did this mean that estimates of three Pomodoros were less common because the need for a break was causing Renzo to subconsciously select an estimate of two or four Pomodoros? I am not brave enough to venture an opinion about what is going on in Renzo’s head.

Each estimated task has an associated tag name (sometimes two), which classifies the work involved, e.g., @planning. In the task information these tags have the form @word; I refer to them as at-words. The following plot is very interesting; it shows the date of use of each at-word, over time (ordered by first use of the at-word).

at-words usage, by date.

The first and third black lines are fitted regression models of the form 1-e^{-K*days}, where: K is a constant and days is the number of days since the start of the interval fitted. The second (middle) black line is a fitted straight line.

The slow down in the growth of new at-words suggests (at least to me) a period of time working in the same application domain (which involves a fixed number of distinct activities, that are ‘discovered’ by Renzo over time). More discussion with Renzo is needed to see if we can tie this down to what he was working on at the time.
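One way to read the fitted form is sketched below. The symbols are assumptions introduced here, not taken from the analysis: N(t) is the number of distinct at-words seen after t days of the interval, and A is a scale constant for the number of distinct activities in the domain.

```latex
\begin{align*}
N(t) &= A\left(1 - e^{-K t}\right) \\
\frac{dN}{dt} &= A K e^{-K t} \\
\lim_{t \to \infty} N(t) &= A
\end{align*}
```

Under that reading the rate at which new at-words appear decays exponentially, and the total levels off at A, which is what a fixed pool of activities being gradually discovered would look like.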

I have looked for various other patterns and associations, involving at-words, but have not found any (but I did learn some new sequence analysis techniques, and associated R packages).

The data is now out there. What patterns and associations can you find?

Renzo tells me that there is a community of people using the Pomodoro technique. I’m hoping that other users of this technique, involved in software development, have recorded their tasks over a long period (I don’t think I could keep it up for longer than a week).

Perhaps there are PSP followers out there with data…

I offer to do a free analysis of software engineering data, provided I can make the data public (in anonymized form). Do get in touch.

Shakespears Sister Ipswich November 2019

Paul Grenyer from Paul Grenyer

I was very surprised and excited and then immediately disappointed to see Shakespears Sister on the Graham Norton show. They performed Stay, which is their big hit (longest-running single at number one in the UK by a female act, 8 weeks), but Marcella wasn’t even trying to hit the high notes and it was awful. We decided to go and see them on tour anyway as it was potentially a once in a lifetime experience before they fell out again.

The Ipswich Regent was half empty in the stalls and the circle was closed and oddly there were quite a few security guards - apparently at the request of the band. Encouragingly Shakespears Sister came on on time and they sounded good! As they ploughed through many of their well known songs, new songs and a few older more obscure songs, the vocals were strong from both Marcella and Siobhan.

The rhythm section was incredible.  The drumming was tight, varied and interesting, but what really stood out was the bass. I think part of this was that the player had fantastic bass lines to play, but also oozed talent. It’s really uncommon for a bass player to need to change bass guitars between songs but Clare Kenny swapped frequently. It’s just a shame that the lead guitar player was totally unremarkable and I’ve no idea what the keyboard player was for.

The highlight I, and I imagine many others, had been looking forward to was Stay. It was better than with Graham Norton, but it’s clear that Marcella cannot get to the highest notes and, live, she doesn’t try. It was still a good performance of a fantastic song.

Would I go and see them again? Probably not, unless I was dragged.