Estimating in round numbers

Derek Jones from The Shape of Code

People tend to use round numbers. When asked the time, the response is often rounded to the nearest 5-minute or 15-minute value, even when using a digital watch; the speaker is using what they consider to be a relevant level of accuracy.

When estimating how long it will take to perform a task, developers tend to use round numbers (based on three datasets). Giving what appears to be an overly precise value could be taken as communicating extra information, e.g., an estimate of 1-hr 3-minutes communicates a high degree of certainty (or incompetence, or making a joke). If the consumer of the estimate is working in round numbers, it makes sense to give a round number estimate.

Three large software related effort estimation datasets are now available: the SiP data contains estimates made by many people, the Renzo Pomodoro data is one person’s estimates, and now we have the Brightsquid data (via the paper “Utilizing product usage data for requirements evaluation” by Hemmati, Didar Al Alam and Carlson; I cannot find an online pdf at the moment).

The plot below shows the total number of tasks (out of the 1,945 tasks in the Brightsquid data) for which a given estimate value was recorded; peak values shown in red (code+data):

Number of tasks having a given estimate.

Why are there estimates for tasks taking less than 30 minutes? What are those 1 minute tasks (are they typos, where the second digit was omitted and the person involved simply create a new estimate without deleting the original)? How many of those estimate values appearing once are really typos, e.g., 39 instead of 30? Does the task logging system used require an estimate before anything can be done? Unfortunately I don’t have access to the people involved. It does look like this data needs some cleaning.

There are relatively few 7-hour estimates, but lots for 8-hours. I’m assuming the company works an 8-hour day (the peak at 4-hours, rather than three, adds weight to this assumption).

New users generate more exceptions than existing users (in one dataset)

Derek Jones from The Shape of Code

Application usage data is one of the rarest kinds of public software engineering data.

Even data that might be used to approximate application usage is rare. Server logs might be used as a proxy for browser usage or operating system usage, and number of Debian package downloads as a proxy for usage of packages.

Usage data is an important component of fault prediction models, and the failure to incorporate such data is one reason why existing fault models are almost completely worthless.

The paper Deriving a Usage-Independent Software Quality Metric appeared a few months ago (it’s a bit of a kitchen sink of a paper), and included lots of usage data! As far as I know, this is a first.

The data relates to a mobile based communications App that used Google analytics to log basic usage information, i.e., daily totals of: App usage time, uses by existing users, uses by new users, operating system+version used by the mobile device, and number of exceptions raised by the App.

Working with daily totals means there is likely to be a non-trivial correlation between usage time and number of uses. Given that this is the only public data of its kind, it has to be handled (in my case, ignored for the time being).

I’m expecting to see a relationship between number of exceptions raised and daily usage (the data includes a count of fatal exceptions, which are less common; because lots of data is needed to build a good model, I went with the more common kind). So a’fishing I went.

On most days no exception occurred (zero is the ideal case for the vendor, but I want lots of exception to build a good model). Daily exception counts are likely to be small integers, which suggests a Poisson error model.

It is likely that the same set of exceptions were experienced by many users, rather like the behavior that occurs when fuzzing a program.

Applications often have an initial beta testing period, intended to check that everything works. Lucky for me the beta testing data is included (i.e., more exceptions are likely to occur during beta testing, which get sorted out prior to official release). This is the data I concentrated my modeling.

The model I finally settled on has the form (code+data):

Exceptions approx uses^{0.1}newUserUses^{0.54}e^{0.002sqrt{usagetime}}AndroidVersion

Yes, newUserUses had a much bigger impact than uses. This was true for all the models I built using data for all Android/iOS Apps, and the exponent difference was always greater than two.

Why square-root, rather than log? The model fit was much better for square-root; too much better for me to be willing to go with a model which had usagetime as a power-law.

The impact of AndroidVersion varied by several orders of magnitude (which won’t come as a surprise to developers using earlier versions of Android).

There were not nearly as many exceptions once the App became generally available, and there were a lot fewer exceptions for the iOS version.

The outsized impact of new users on exceptions experienced is easily explained by developers failing to check for users doing nonsensical things (which users new to an App are prone to do). Existing users have a better idea of how to drive an App, and tend to do the kind of things that developers expect them to do.

As always, if you know of any interesting software engineering data, please let me know.

Product owner as a homeowner

Allan Kelly from Allan Kelly Associates

house-illustration-clipart-free-stock-photo-public-domain-pictures1-2020-05-21-14-50.jpg

For years people have been comparing software construction with building construction. Think about how we talk about “architecture” or foundations, or the cost of change and so on. As I’ve said before, building software is not like building a house. Now it occurs to me that a better metaphor is the ongoing ownership of the building.

Every building requires “maintenance” and over time buildings change – indeed buildings learn. While an Englishman’s home is his castle those of us, even the English, who are lucky enough to own a house don’t have a free hand in the changes we make to our houses

Specifically I’m thinking about the Product Owner. Being a Product Owner is less about deciding what you want your new house to look like, or how the building should be constructed, its not even about deciding how many rooms the house should have. The role of the Product Owner is to ensure the house continues to be liveable, preferably the house is getting nicer to live in, and the house is coping with the requests made on it.

I own a house – a nice one in West London. As the owner I am responsible for the house. I do little jobs myself – like painting the fences. More significantly I have to think about what I want to do with the house: do we want to do a loft conversion? What would that entail and when might I be able to afford that?

I am the Product Owner of my own house. I have to decide on what is to be done, what can wait and what trade-offs I can accept.

When I bought the house the big thing to change was the kitchen and backroom. There was little point in any other works until those rooms were smashed to bits and rebuilt. I had to think though what was needed by my family, what was possible and what the result might be like. I received quotes from several builders – each of whom had their own ideas about what I wanted. I hired an architect for advice. I looked at what neighbours had done. And I had a hard think about how much money I could spend.

An Englishman’s home is his castle – I am the lord of my house and I can decide what I want, except…

My wife and children have a say in what happens to the house. Actually my wife has a pretty big say, while the children have less say there needs are pretty high on my list of priorities.

My local council and even the government have a say because they place certain constraints on what I can do – planning permission, rules and building codes. The insurance company and mortgage bank set some constraints and expectations too.

My neighbours might not own my house but they are stakeholders: I can’t upset them (too much) and they impose some constraints. (In my first flat/apartment the neighbours were a bigger issue because we shared a roof, a garden and the walls.)

So while I may be lord of my own house I am not a completely a free agent. And the same is true with Product Owners.

The secret with Product Owners is: they are Owners. They are more than managers – managers are just hired help. But neither do POs have a free hand, they don’t have unlimited power, the are not dictators, they are not completely free to do what they want and order people around.

Like me, Product Owners have limited resources available: how much money, how many helpers, access to customers and more. I have to balance my desire for a large loft conversion (with shower, balcony and everything else) with the money I can afford to spend on it. That involves trade-off and compromises. I could go into debt – increase my mortgage – but that comes with costs.

Product owners have responsibilities: to customers and users, to the those who fund the work (like the mortgage bank), to team members and peers to name a few. Some decisions they can make on their own, but on other decisions they can only lead a conversation and guide it towards a conclusion.

What the homeowner metaphor misses entirely is the commercial aspect: my house exists for me to live in. I don’t expect to make money out of it. The house next door to mine is owned by a commercial landlord who rents it out: the landlord is actively trying to make money out of that house.

Most Product Owners are trying to further some other agenda: commercial (generating money), or public sector (furthering Government policies), or third sector (e.g. a charity). In other words: Product Owners are seeking to add value for their organization. This adds an additional dimension because the PO has to justify their decisions to a higher authority.


Subscribe to my blog newsletter and download Project Myopia for Free

The post Product owner as a homeowner appeared first on Allan Kelly Associates.

Adding Value to User Stories

Allan Kelly from Allan Kelly Associates

4Coins-2020-05-21-12-06.jpg

My next online workshop is now available for booking: Adding Value to User Stories.

As with my previous online workshops this is a series of four 90 minute online (zoom) sessions delivered on consecutive weeks. And as before a few tickets are available for free to those who are furloughed or unemployed.

This workshop is for Product Owners (including business analysts and product managers), Scrum Masters, Project and Development Managers.

Learning Objectives

  • Appreciate the influence of value on effort estimates and technical architecture at the story and project level
  • Know how to estimate value for user stories and epics
  • Recognise how cost-of-delay changes value over time, why deadlines are elastic by value and how to use Best Before and Use By dates when prioritising work
  • Appreciate how values define value, and how this differs between organizations

Details and tickets on the EventBrite booking page. Alternatively you can book directly with me.

The post Adding Value to User Stories appeared first on Allan Kelly Associates.

Setting up rdiff-backup on FreeBSD 12.1

Timo Geusch from The Lone C++ Coder's Blog

My main PC workstation (as opposed to my Mac Pro) is a dual-boot Windows and Linux machine. While backing up the Windows portion is relatively easy via some cheap-ish commercial backup software, I ended up backing up my Linux home directories only very occasionally. Clearly, Something Had To Be Done ™. I had a look […]

The post Setting up rdiff-backup on FreeBSD 12.1 appeared first on The Lone C++ Coder's Blog.

Setting up rdiff-backup on FreeBSD 12.1

The Lone C++ Coder's Blog from The Lone C++ Coder's Blog

My main PC workstation (as opposed to my Mac Pro) is a dual-boot Windows and Linux machine. While backing up the Windows portion is relatively easy via some cheap-ish commercial backup software, I ended up backing up my Linux home directories only very occasionally. Clearly, Something Had To Be Done (tm).

I had a look around for Linux backup software. I was familiar with was Timeshift, but at least the Manjaro port can’t back up to a remote machine and was useless as a result. I eventually settled on rdiff-backup as it seemed to be simple, has been around for a while and also looks very cron-friendly. So far, so good.

Clang-Tidying up the house

Products, the Universe and Everything from Products, the Universe and Everything

If there is any single consolation amidst the circumstances we are all having to cope with at the moment it is that many of us have lots of time to fill - not only with unproductive things like binging Netflix (I really should get around to watching Dirk Gently's Holistic Detective Agency...) but also with tasks which might have a more lasting long-term benefit.

A Clang-Tidy analysis of a Visual Studio 2019 skeleton MFC project within VisualLintGui.

Those could be tasks like learning a language (programming or human), joining online yoga classes, writing a book, designing a website, blogging, reading (I can recommend Francis Buontempo's Genetic Algorithms and Machine Learning for Programmers if you fancy learning a little about ML) or generally just getting on with stuff with perhaps fewer distractions than usual.

On the latter note, for the past few months we've been working on the codebase of the next version of Visual Lint in our development branch, and it is coming along quite nicely.

One of the things we have planned to do for some time is to add direct support for the Clang-Tidy analysis tool to Visual Lint. When the UK lockdown started, focusing on this task in particular proved to be a very useful distraction from all the fear and uncertainty we found around us.

Sometimes being in the zone helps in more ways than usual.

The screenshot above should give you an idea of where we are at the moment. Whilst there is still a great deal to do before we can consider this is production-ready the foundation is in place and it is definitely usable. For example, selected issues can already be suppressed from the Analysis Results Display by inserting inline suppression directives ("// NOLINT") using the same context menu command used to suppress (for example) PC-lint, PC-lint Plus and CppCheck analysis issues.

With Microsoft Visual Studio being one of the major development environments we support one of the most important things to address is configuring Clang-Tidy to be tolerant of non-standard Visual C++ projects. The errors shown for some files in the Analysis Status Display in the screenshot above are exactly because of this - a standards compliant C++ compiler is likely to generate at least some errors while compiling most Visual C++ projects.

The only error we saw in the Visual C++ project mentioned above was: clang-diagnostic-error: -- call to non-static member function without an object argument.

The StackOverflow topic Refactor MFC message maps to include fully qualified member function pointers discusses exactly the same issue.

This is mitigated by the fact that recent versions of Clang-Tidy do a pretty good job of handling nonstandard code, so the things that Clang-Tidy warns about seem to be exactly the sort of things you would expect it to and a fair degree of tuning is possible. For example, the clang-diagnostic-error issues in the case above could be disabled entirely if that was what you wanted (as it's a trivial fix you could alternatively choose to correct the code).

Furthermore, with the C++ Core Guidelines checks enabled (e.g. using the --checks=modernize-* switch) Clang-Tidy can tell you exactly what you could do to your C++ code to bring it up to modern standards.

We will obviously be talking more about Clang-Tidy in due course - particularly once we start looking at using it to automatically correct the issues it finds (if you are not aware, this is one of the most compelling features of Clang-Tidy. The good news is that autofixing selected issues should be fairly straightforward using the output of the Clang-Tidy --export-fixes switch.

Lockdowns or eased lockdowns, more interesting times lie ahead.

Clang-Tidying up the house

Products, the Universe and Everything from Products, the Universe and Everything

If there is any single consolation amidst the circumstances we are all having to cope with at the moment it is that many of us have lots of time to fill - not only with unproductive things like binging Netflix (I really should get around to watching Dirk Gently's Holistic Detective Agency...) but also with tasks which might have a more lasting long-term benefit.

A Clang-Tidy analysis of a skeleton Visual Studio 2019 project within VisualLintGui

Those could be tasks like learning a language (programming or human), joining online yoga classes, writing a book, designing a website, blogging, reading (I can recommend Francis Buontempo's Genetic Algorithms and Machine Learning for Programmers if you fancy learning a little about ML) or generally just getting on with stuff with perhaps fewer distractions than usual.

On the latter note, for the past few months we've been working on the codebase of Visual Lint 7.5 (the next version) in our development branch, and it is coming along quite nicely.

One of the things we have planned to do for some time is to add direct support for the Clang-Tidy analysis tool to Visual Lint. When the UK lockdown started, focusing on this task in particular proved to be a very useful distraction from all the fear and uncertainty we found around us.

Sometimes being in the zone helps in more ways than usual.

The screenshot above should give you an idea of where we are at the moment. Whilst there is still a great deal to do before we can consider this is production-ready the foundation is in place and it is definitely usable. For example, selected issues can already be suppressed from the Analysis Results Display by inserting inline suppression directives ("// NOLINT") using the same context menu command used to suppress (for example) PC-lint, PC-lint Plus and CppCheck analysis issues.

With Microsoft Visual Studio being one of the major development environments we support one of the most important things to address is configuring Clang-Tidy to be tolerant of non-standard Visual C++ projects. The errors shown for some files in the Analysis Status Display in the screenshot above are exactly because of this - a standards compliant C++ compiler is likely to generate at least some errors while compiling most Visual C++ projects.

The only error we saw in the Visual C++ project mentioned above was: clang-diagnostic-error: -- call to non-static member function without an object argument.

The StackOverflow topic this is one of the most compelling features of Clang-Tidy). The good news is that autofixing selected issues should be fairly straightforward using the output of the Clang-Tidy ‑‑export-fixes switch.

Lockdowns or eased lockdowns, more interesting times lie ahead.

Happy 60th birthday: Algol 60

Derek Jones from The Shape of Code

Report on the Algorithmic Language ALGOL 60 is the title of a 16-page paper appearing in the May 1960 issue of the Communication of the ACM. Probably one of the most influential programming languages, and a language that readers may never have heard of.

During the 1960s there were three well known, widely used, programming languages: Algol 60, Cobol, and Fortran.

When somebody created a new programming languages Algol 60 tended to be their role-model. A few of the authors of the Algol 60 report cited beauty as one of their aims, a romantic notion that captured some users imaginations. Also, the language was full of quirky, out-there, features; plenty of scope for pin-head discussions.

Cobol appears visually clunky, is used by business people and focuses on data formatting (a deadly dull, but very important issue).

Fortran spent 20 years catching up with features supported by Algol 60.

Cobol and Fortran are still with us because they never had any serious competition within their target markets.

Algol 60 had lots of competition and its successor language, Algol 68, was groundbreaking within its academic niche, i.e., not in a developer useful way.

Language family trees ought to have Algol 60 at, or close to their root. But the Algol 60 descendants have been so successful, that the creators of these family trees have rarely heard of it.

In the US the ‘military’ language was Jovial, and in the UK it was Coral 66, both derived from Algol 60 (Coral 66 was the first language I used in industry after graduating). I used to hear people saying that Jovial was derived from Fortran; another example of people citing the language the popular language know.

Algol compiler implementers documented their techniques (probably because they were often academics); ALGOL 60 Implementation is a real gem of a book, and still worth a read today (as an introduction to compiling).

Algol 60 was ahead of its time in supporting undefined behaviors 😉 Such as: “The effect, of a go to statement, outside a for statement, which refers to a label within the for statement, is undefined.”

One feature of Algol 60 rarely adopted by other languages is its parameter passing mechanism, call-by-name (now that lambda expressions are starting to appear in widely used languages, call-by-name has a kind-of comeback). Call-by-name essentially has the same effect as textual substitution. Given the following procedure (it’s not a function because it does not return a value):

procedure swap (a, b);
   integer a, b, temp;
begin
temp := a;
a := b;
b:= temp
end;

the effect of the call: swap(i, x[i]) is:

  temp := i;
  i := x[i];
  x[i] := temp

which might come as a surprise to some.

Needless to say, programmers came up with ‘clever’ ways of exploiting this behavior; the most famous being Jensen’s device.

The follow example of go to usage appears in: International Standard 1538 Programming languages – ALGOL 60 (the first and only edition appeared in 1984, after most people had stopped using the language):

go to if Ab < c then L17
  else g[if w < 0 then 2 else n]

Orthogonality of language use won out over the goto FUD.

The Software Preservation Group is a great resource for Algol 60 books and papers.