When estimating how long it will take to perform a task, developers tend to use round numbers (based on three datasets). Giving what appears to be an overly precise value could be taken as communicating extra information, e.g., an estimate of 1-hr 3-minutes communicates a high degree of certainty (or incompetence, or making a joke). If the consumer of the estimate is working in round numbers, it makes sense to give a round number estimate.
Three large software related effort estimation datasets are now available: the SiP data contains estimates made by many people, the Renzo Pomodoro data is one person’s estimates, and now we have the Brightsquid data (via the paper “Utilizing product usage data for requirements evaluation” by Hemmati, Didar Al Alam and Carlson; I cannot find an online pdf at the moment).
The plot below shows the total number of tasks (out of the 1,945 tasks in the Brightsquid data) for which a given estimate value was recorded; peak values shown in red (code+data):
Why are there estimates for tasks taking less than 30 minutes? What are those 1 minute tasks (are they typos, where the second digit was omitted and the person involved simply create a new estimate without deleting the original)? How many of those estimate values appearing once are really typos, e.g., 39 instead of 30? Does the task logging system used require an estimate before anything can be done? Unfortunately I don’t have access to the people involved. It does look like this data needs some cleaning.
There are relatively few 7-hour estimates, but lots for 8-hours. I’m assuming the company works an 8-hour day (the peak at 4-hours, rather than three, adds weight to this assumption).
The data relates to a mobile based communications App that used Google analytics to log basic usage information, i.e., daily totals of: App usage time, uses by existing users, uses by new users, operating system+version used by the mobile device, and number of exceptions raised by the App.
Working with daily totals means there is likely to be a non-trivial correlation between usage time and number of uses. Given that this is the only public data of its kind, it has to be handled (in my case, ignored for the time being).
I’m expecting to see a relationship between number of exceptions raised and daily usage (the data includes a count of fatal exceptions, which are less common; because lots of data is needed to build a good model, I went with the more common kind). So a’fishing I went.
On most days no exception occurred (zero is the ideal case for the vendor, but I want lots of exception to build a good model). Daily exception counts are likely to be small integers, which suggests a Poisson error model.
Applications often have an initial beta testing period, intended to check that everything works. Lucky for me the beta testing data is included (i.e., more exceptions are likely to occur during beta testing, which get sorted out prior to official release). This is the data I concentrated my modeling.
The model I finally settled on has the form (code+data):
Yes, had a much bigger impact than . This was true for all the models I built using data for all Android/iOS Apps, and the exponent difference was always greater than two.
Why square-root, rather than log? The model fit was much better for square-root; too much better for me to be willing to go with a model which had as a power-law.
The impact of varied by several orders of magnitude (which won’t come as a surprise to developers using earlier versions of Android).
There were not nearly as many exceptions once the App became generally available, and there were a lot fewer exceptions for the iOS version.
The outsized impact of new users on exceptions experienced is easily explained by developers failing to check for users doing nonsensical things (which users new to an App are prone to do). Existing users have a better idea of how to drive an App, and tend to do the kind of things that developers expect them to do.
As always, if you know of any interesting software engineering data, please let me know.
For years people have been comparing software construction with building construction. Think about how we talk about “architecture” or foundations, or the cost of change and so on. As I’ve said before, building software is not like building a house. Now it occurs to me that a better metaphor is the ongoing ownership of the building.
Every building requires “maintenance” and over time buildings change – indeed buildings learn. While an Englishman’s home is his castle those of us, even the English, who are lucky enough to own a house don’t have a free hand in the changes we make to our houses
Specifically I’m thinking about the Product Owner. Being a Product Owner is less about deciding what you want your new house to look like, or how the building should be constructed, its not even about deciding how many rooms the house should have. The role of the Product Owner is to ensure the house continues to be liveable, preferably the house is getting nicer to live in, and the house is coping with the requests made on it.
I own a house – a nice one in West London. As the owner I am responsible for the house. I do little jobs myself – like painting the fences. More significantly I have to think about what I want to do with the house: do we want to do a loft conversion? What would that entail and when might I be able to afford that?
I am the Product Owner of my own house. I have to decide on what is to be done, what can wait and what trade-offs I can accept.
When I bought the house the big thing to change was the kitchen and backroom. There was little point in any other works until those rooms were smashed to bits and rebuilt. I had to think though what was needed by my family, what was possible and what the result might be like. I received quotes from several builders – each of whom had their own ideas about what I wanted. I hired an architect for advice. I looked at what neighbours had done. And I had a hard think about how much money I could spend.
An Englishman’s home is his castle – I am the lord of my house and I can decide what I want, except…
My wife and children have a say in what happens to the house. Actually my wife has a pretty big say, while the children have less say there needs are pretty high on my list of priorities.
My local council and even the government have a say because they place certain constraints on what I can do – planning permission, rules and building codes. The insurance company and mortgage bank set some constraints and expectations too.
My neighbours might not own my house but they are stakeholders: I can’t upset them (too much) and they impose some constraints. (In my first flat/apartment the neighbours were a bigger issue because we shared a roof, a garden and the walls.)
So while I may be lord of my own house I am not a completely a free agent. And the same is true with Product Owners.
The secret with Product Owners is: they are Owners. They are more than managers – managers are just hired help. But neither do POs have a free hand, they don’t have unlimited power, the are not dictators, they are not completely free to do what they want and order people around.
Like me, Product Owners have limited resources available: how much money, how many helpers, access to customers and more. I have to balance my desire for a large loft conversion (with shower, balcony and everything else) with the money I can afford to spend on it. That involves trade-off and compromises. I could go into debt – increase my mortgage – but that comes with costs.
Product owners have responsibilities: to customers and users, to the those who fund the work (like the mortgage bank), to team members and peers to name a few. Some decisions they can make on their own, but on other decisions they can only lead a conversation and guide it towards a conclusion.
What the homeowner metaphor misses entirely is the commercial aspect: my house exists for me to live in. I don’t expect to make money out of it. The house next door to mine is owned by a commercial landlord who rents it out: the landlord is actively trying to make money out of that house.
Most Product Owners are trying to further some other agenda: commercial (generating money), or public sector (furthering Government policies), or third sector (e.g. a charity). In other words: Product Owners are seeking to add value for their organization. This adds an additional dimension because the PO has to justify their decisions to a higher authority.
As with my previous online workshops this is a series of four 90 minute online (zoom) sessions delivered on consecutive weeks. And as before a few tickets are available for free to those who are furloughed or unemployed.
This workshop is for Product Owners (including business analysts and product managers), Scrum Masters, Project and Development Managers.
Appreciate the influence of value on effort estimates and technical architecture at the story and project level
Know how to estimate value for user stories and epics
Recognise how cost-of-delay changes value over time, why deadlines are elastic by value and how to use Best Before and Use By dates when prioritising work
Appreciate how values define value, and how this differs between organizations
My main PC workstation (as opposed to my Mac Pro) is a dual-boot Windows and Linux machine. While backing up the Windows portion is relatively easy via some cheap-ish commercial backup software, I ended up backing up my Linux home directories only very occasionally. Clearly, Something Had To Be Done ™. I had a look […]
My main PC workstation (as opposed to my Mac Pro) is a dual-boot Windows and Linux machine. While backing up the Windows portion is relatively easy via some cheap-ish commercial backup software, I ended up backing up my Linux home directories only very occasionally. Clearly, Something Had To Be Done (tm).
I had a look around for Linux backup software. I was familiar with was Timeshift, but at least the Manjaro port can’t back up to a remote machine and was useless as a result. I eventually settled on rdiff-backup as it seemed to be simple, has been around for a while and also looks very cron-friendly. So far, so good.
If there is any single consolation amidst the circumstances we are all having to cope with at the moment it is that many of us have lots of time to fill - not only with unproductive things like binging Netflix (I really should get around to watching Dirk Gently's Holistic Detective Agency...) but also with tasks which might have a more lasting long-term benefit.
A Clang-Tidy analysis of a skeleton Visual Studio 2019 project within VisualLintGui
Those could be tasks like learning a language (programming or human), joining online yoga classes, writing a book, designing a website, blogging, reading (I can recommend Francis Buontempo's Genetic Algorithms and Machine Learning for Programmers if you fancy learning a little about ML) or generally just getting on with stuff with perhaps fewer distractions than usual.
On the latter note, for the past few months we've been working on the codebase of Visual Lint 7.5 (the next version) in our development branch, and it is coming along quite nicely.
One of the things we have planned to do for some time is to add direct support for the Clang-Tidy analysis tool to Visual Lint. When the UK lockdown started, focusing on this task in particular proved to be a very useful distraction from all the fear and uncertainty we found around us.
Sometimes being in the zone helps in more ways than usual.
The screenshot above should give you an idea of where we are at the moment. Whilst there is still a great deal to do before we can consider this is production-ready the foundation is in place and it is definitely usable. For example, selected issues can already be suppressed from the Analysis Results Display by inserting inline suppression directives ("// NOLINT") using the same context menu command used to suppress (for example) PC-lint, PC-lint Plus and CppCheck analysis issues.
With Microsoft Visual Studio being one of the major development environments we support one of the most important things to address is configuring Clang-Tidy to be tolerant of non-standard Visual C++ projects. The errors shown for some files in the Analysis Status Display in the screenshot above are exactly because of this - a standards compliant C++ compiler is likely to generate at least some errors while compiling most Visual C++ projects.
The only error we saw in the Visual C++ project mentioned above was: clang-diagnostic-error: -- call to non-static member function without an object argument.
During the 1960s there were three well known, widely used, programming languages: Algol 60, Cobol, and Fortran.
When somebody created a new programming languages Algol 60 tended to be their role-model. A few of the authors of the Algol 60 report cited beauty as one of their aims, a romantic notion that captured some users imaginations. Also, the language was full of quirky, out-there, features; plenty of scope for pin-head discussions.
Cobol appears visually clunky, is used by business people and focuses on data formatting (a deadly dull, but very important issue).
Fortran spent 20 years catching up with features supported by Algol 60.
Cobol and Fortran are still with us because they never had any serious competition within their target markets.
Algol 60 had lots of competition and its successor language, Algol 68, was groundbreaking within its academic niche, i.e., not in a developer useful way.
Language family trees ought to have Algol 60 at, or close to their root. But the Algol 60 descendants have been so successful, that the creators of these family trees have rarely heard of it.
In the US the ‘military’ language was Jovial, and in the UK it was Coral 66, both derived from Algol 60 (Coral 66 was the first language I used in industry after graduating). I used to hear people saying that Jovial was derived from Fortran; another example of people citing the language the popular language know.
Algol compiler implementers documented their techniques (probably because they were often academics); ALGOL 60 Implementation is a real gem of a book, and still worth a read today (as an introduction to compiling).
Algol 60 was ahead of its time in supporting undefined behaviors Such as: “The effect, of a go to statement, outside a forstatement, which refers to a label within the for statement, is undefined.”
One feature of Algol 60 rarely adopted by other languages is its parameter passing mechanism, call-by-name (now that lambda expressions are starting to appear in widely used languages, call-by-name has a kind-of comeback). Call-by-name essentially has the same effect as textual substitution. Given the following procedure (it’s not a function because it does not return a value):
procedure swap (a, b);
integer a, b, temp;
temp := a;
a := b;
the effect of the call: swap(i, x[i]) is:
temp := i;
i := x[i];
x[i] := temp
which might come as a surprise to some.
Needless to say, programmers came up with ‘clever’ ways of exploiting this behavior; the most famous being Jensen’s device.
Sir R-----! Come join me for a glass of chilled wine! I have a notion that you're in the mood for a wager. What say you?
I knew it!
I have in mind a game of dice that reminds me of my time as the Russian military attaché to the city state of Coruscant and its territories during the traitorous popular uprising fomented by the blasphemous teachings of a fundamentalist religious sect known as the Jedi.