Unneeded requirements implemented in Waterfall & Agile

Derek Jones from The Shape of Code

Software does not wear out, but the world in which it runs evolves. Time and money are lost when, after a feature has been implemented in software, customer feedback is that the feature is not needed.

How do Waterfall and Agile implementation processes compare in the number of unneeded features/requirements that they implement?

In a Waterfall process, a list of requirements is created and then implemented. The identity of ‘dead’ requirements is not known until customers start using the software, which is not until it is released at the end of development.

In an Agile process, a list of requirements is used to create a Minimal Viable Product, which is released to customers. An iterative development process, driven by customer feedback, then implements requirements and makes frequent releases to customers, which reduces the likelihood of implementing requirements that are known to be ‘dead’. Previously implemented requirements may be discovered to have become ‘dead’.

An analysis of the number of ‘dead’ requirements implemented by the two approaches appears at the end of this post.

The plot below shows the number of ‘dead’ requirements implemented in a project lasting a given number of working days (blue/red) and the difference between them (green), assuming that one requirement is implemented per working day, with the discovery after 100 working days that a given fraction of implemented requirements are not needed, and the number of requirements in the MVP is assumed to be small (fractions 0.5, 0.1, and 0.05 shown; code):

Dead requirements for Waterfall and Agile projects running for a given number of days, along with difference between them.

The values calculated using one requirement implemented per day scale linearly with the number of requirements implemented per day.

By implementing fewer ‘dead’ requirements, an Agile project will finish earlier (assuming it only implements all the needed requirements of a Waterfall approach, and some subset of the ‘dead’ requirements). However, unless a project is long-running, or has a high requirements’ ‘death’ rate, the difference may not be compelling.

I’m not aware of any data on rate of discovery of ‘dead’ implemented requirements (there is some on rate of discovery of new requirements); as always, pointers to data most welcome.

The Waterfall projects I am familiar with, plus those where data is available, include some amount of requirement discovery during implementation. This has the potential to reduce the number of ‘dead’ implemented requirements, but who knows by how much.

As the size of the Minimal Viable Product increases to become a significant fraction of the final software system, the number or fraction of ‘dead’ requirements will approach that of the Waterfall approach.

There are other factors that favor either Waterfall or Agile, which are left to be discussed in future posts.

The following is an analysis of Waterfall/Agile requirements’ implementation.

Define:

F_{live} is the fraction of requirements per day that remain relevant to customers. This value is likely to be very close to one, e.g., 0.999.
R_{done} is the number of requirements implemented per working day.

Waterfall

The implementation of R_{total} requirements takes I_{days} = R_{total}/R_{done} days, and the number of implemented ‘dead’ requirements is (assuming that no ‘dead’ requirements were present at the end of the requirements gathering phase):

R_{Wdead} = R_{total} (1 - F_{live}^{I_{days}})

As I_{days} \to \infty, effectively all implemented requirements are ‘dead’.
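
As a rough illustration, the Waterfall formula can be evaluated directly. The following is a minimal Python sketch, not the code behind the plot; the function name and the example values (one requirement per day, F_{live} = 0.999) are assumptions chosen for illustration.

```python
def waterfall_dead(r_total: float, r_done: float, f_live: float) -> float:
    """Number of 'dead' requirements implemented by a Waterfall project.

    r_total: requirements on the list at the end of requirements gathering
    r_done:  requirements implemented per working day
    f_live:  fraction of requirements that remain relevant from one day to the next
    """
    i_days = r_total / r_done                  # I_days = R_total / R_done
    return r_total * (1.0 - f_live ** i_days)  # R_Wdead


if __name__ == "__main__":
    # Illustrative values only: one requirement per day, f_live = 0.999.
    for days in (100, 250, 500):
        dead = waterfall_dead(r_total=days, r_done=1.0, f_live=0.999)
        print(f"{days:4d} days: {dead:6.1f} dead requirements out of {days}")
```

Because every requirement is exposed to 'death' for the whole implementation period, the 'dead' fraction depends only on F_{live} and the project duration.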

Agile

The number of implemented ‘live’ requirements on day n is given by:

R_n = F_{live} R_{n-1} + R_{done}

with the initial condition that the number of implemented requirements at the start of the first day of iterative development is the number of requirements implemented in the Minimum Viable Product, i.e., R_0=R_{mvp}.

Solving this difference equation (the sum of a geometric series) gives the number of ‘live’ requirements on day n:

R_n = R_{mvp} F_{live}^n + R_{done} \frac{1 - F_{live}^n}{1 - F_{live}}

As n \to \infty, R_n approaches its maximum value of \frac{R_{done}}{1 - F_{live}}.

Subtracting the number of ‘live’ requirements from the total number of requirements implemented gives:

R_{Adead} = R_{mvp} + n R_{done} - R_n

or

R_{Adead} = R_{mvp} (1 - F_{live}^n) + R_{done} \left( n - \frac{1 - F_{live}^n}{1 - F_{live}} \right)

As n \to \infty, effectively all implemented requirements are ‘dead’, because the number of ‘live’ requirements cannot exceed a known maximum.
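
The Agile case can also be checked numerically by iterating the recurrence for R_n day by day, which avoids relying on any closed form. The sketch below is a hedged illustration in Python, not the code used for the plot; the function names, the MVP size of five requirements and F_{live} = 0.999 are assumptions, and the Waterfall project is assumed to implement one requirement per day for the same number of days.

```python
def agile_live(n_days: int, r_mvp: float, r_done: float, f_live: float) -> float:
    """Iterate R_n = f_live * R_{n-1} + r_done, starting from R_0 = r_mvp."""
    live = r_mvp
    for _ in range(n_days):
        live = f_live * live + r_done
    return live


def agile_dead(n_days: int, r_mvp: float, r_done: float, f_live: float) -> float:
    """Total requirements implemented minus those still 'live' on day n."""
    total_implemented = r_mvp + n_days * r_done
    return total_implemented - agile_live(n_days, r_mvp, r_done, f_live)


def waterfall_dead(n_days: int, r_done: float, f_live: float) -> float:
    """Waterfall project of the same duration (assumed comparison basis)."""
    r_total = n_days * r_done
    return r_total * (1.0 - f_live ** n_days)


if __name__ == "__main__":
    # Illustrative values only.
    for days in (100, 250, 500, 1000):
        w = waterfall_dead(days, r_done=1.0, f_live=0.999)
        a = agile_dead(days, r_mvp=5.0, r_done=1.0, f_live=0.999)
        print(f"{days:5d} days: waterfall {w:7.1f}  agile {a:7.1f}  difference {w - a:7.1f}")
```

Iterating the recurrence keeps the sketch tied to the definition of R_n, at the cost of a loop over the project's working days.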

Finding patterns in construction project drawing creation dates

Derek Jones from The Shape of Code

I took part in Projecting Success’s 13th hackathon last Thursday and Friday, at CodeNode (host to many weekend hackathons and meetups); around 200 people turned up for the first day. Team Designing-Success included Imogen, Ryan, Dillan, Mo, Zeshan (all building construction domain experts) and yours truly (a data analysis monkey who knows nothing about construction).

One of the challenges came with lots of real multi-million pound building construction project data (two csv files containing 60K+ rows and one containing 15K+ rows), provided by SISK. The data contained information on project construction drawings and RFIs (requests for information) from 97 projects.

The construction industry is years ahead of the software industry in terms of collecting data, in that lots of companies actually collect data (for some, accumulate might be a better description) rather than not collecting/accumulating data. While they have data, they don’t seem to be making good use of it (so I am told).

Nearly all the discussions I have had with domain experts about the patterns found in their data have been iterative, brief email exchanges, sometimes running over many months. In this hack, everybody involved was sitting around the same table for two days, i.e., the conversation happened in real-time and there was a cut-off time for delivery of results.

I got the impression that my fellow team-mates were new to this kind of data analysis, which is my usual experience when discussing patterns recently found in data. My standard approach is to start highlighting visual patterns present in the data (e.g., plot foo against bar), and hope that somebody says “That’s interesting” or suggests potentially more interesting items to plot.

After several dead-end iterations (i.e., plots that failed to evoke a “that’s interesting” response), drawings created per day against project duration (as a percentage of known duration) turned out to be of great interest to the domain experts.

Building construction uses a waterfall process; all the drawings (i.e., a kind of detailed requirements) are supposed to be created at the beginning of the project.

Hmm, many individual project drawing plots were showing quite a few drawings being created close to the end of the project. How could this be? It turns out that there are lots of different reasons for creating a drawing (74 reasons in the data), and that it is to be expected that some kinds of drawings are likely to be created late in the day, e.g., specific landscaping details. The 74 reasons were mapped to three drawing categories (As built, Construction, and Design Development), then project drawings were recounted and plotted in three colors (see below).
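
The counting step described above might be sketched as follows. This is a hedged Python illustration and not the code+data from the hack: the function names, the example reason labels, the dates, and the 10% bucket width are all hypothetical; only the three category names come from the text.

```python
from collections import Counter
from datetime import date


def percent_complete(created: date, start: date, end: date) -> float:
    """Position of a drawing's creation date as a percentage of project duration."""
    total_days = (end - start).days
    if total_days <= 0:
        raise ValueError("project end must be after project start")
    pct = 100.0 * (created - start).days / total_days
    return max(0.0, min(100.0, pct))  # clamp drawings dated outside the project


def drawings_per_bucket(drawings, start, end, reason_to_category, bucket_width=10):
    """Count drawings per (category, percentage bucket).

    drawings: iterable of (creation_date, reason) pairs
    reason_to_category: mapping from a reason to one of the drawing categories
    """
    counts = Counter()
    for created, reason in drawings:
        category = reason_to_category.get(reason, "Unmapped")
        pct = percent_complete(created, start, end)
        # Put 100% into the last bucket rather than a bucket of its own.
        bucket = min(int(pct // bucket_width) * bucket_width, 100 - bucket_width)
        counts[(category, bucket)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical reasons and dates, for illustration only.
    mapping = {"For construction": "Construction", "Record drawing": "As built"}
    project = [(date(2020, 1, 15), "For construction"),
               (date(2020, 12, 1), "Record drawing")]
    result = drawings_per_bucket(project, date(2020, 1, 1), date(2020, 12, 31), mapping)
    for (category, bucket), n in sorted(result.items()):
        print(f"{category:12s} {bucket:3d}-{bucket + 10:3d}%: {n}")
```

Expressing dates as a percentage of duration is what allows projects of different lengths to be overlaid or aggregated on the same axis.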

The domain experts (i.e., everybody except me) enjoyed themselves interpreting these plots. I nodded sagely, and occasionally blew my cover by asking about an acronym that everybody in the construction industry obviously knew.

The project meta-data includes a measure of project performance (a value between one and five, derived from profitability and other confidential values) and type of business contract (a value between one and four). The data from the 97 projects was combined by performance and contract to give 20 aggregated plots. The evolution of the number of drawings created per day might vary by contract, and the hypothesis was that projects at different performance levels would exhibit undesirable patterns in the evolution of the number of drawings created.

The plots below contain patterns in the quantity of drawings created by percentage of project completion that are: (left) considered a good project for contract type 1 (level 5 is the best performing level), and (right) considered a bad project for contract type 1 (level 1 is the worst performing level). Contact the domain experts for details (code+data):

Number of drawings created at percentage project completion times.

The path to the above plot is a common one: discover an interesting pattern in data, notice that something does not look right, use domain knowledge to refine the data analysis (e.g., kinds of drawing or contract), rinse and repeat.

My particular interest is using data to understand software engineering processes. How do these patterns in construction drawings compare with patterns in the software project equivalents, e.g., detailed requirements?

I am not aware of any detailed public data on requirements produced using a waterfall process. So the answer is, I don’t know; but the rationales I heard for the various kinds of drawings sound as if they would have equivalents in the software requirements world.

What about the other data provided by the challenge sponsor?

I plotted various quantities for the RFI data, but there wasn’t any “that’s interesting” response from the domain experts. Perhaps the genius behind the plot ideas will be recognized later, or perhaps one of the domain experts will suddenly realize what patterns should be present in RFI data on high performance projects (nobody is allowed to consider the possibility that the data has no practical use). It can take time for the consequences of data analysis to sink in, or for new ideas to surface, which is why I am happy for analysis conversations to stretch out over time. Our presentation deck included some RFI plots because there was RFI data in the challenge.

What is the software equivalent of construction RFIs? Perhaps issues in a tracking system, or Jira tickets? I did not think to talk more about RFIs with the domain experts.

How did team Designing-Success do?

In most hackathons, the teams that stay the course present at the end of the hack. For these ProjectHacks, submission deadline is the following day; the judging is all done later, electronically, based on the submitted slide deck and video presentation. The end of this hack was something of an anti-climax.

Did team Designing-Success discover anything of practical use?

I think that finding patterns in the drawing data converted the domain experts from a theoretical to a practical understanding that it was possible to extract interesting patterns from construction data. They each said that they planned to attend the next hack (in about four months), and I suggested that they try to bring some of their own data.

Can these drawing creation patterns be used to help monitor project performance as a project progresses? The domain experts thought so. I suspect that the users of these patterns will be those not closely associated with a project (those close to a project are usually well aware of the fact that things are not going well).

User Story or Epic?

Allan Kelly from Allan Kelly Associates

I have two golden rules for user stories:

  1. The story should deliver business value: it should be meaningful to some customer, user, stakeholder. In some way the story should make their lives better.
  2. The story should be small enough to be delivered soon: some people say “within 2 days” but I’m generous (after all, I used to be a C++ programmer); I’m happy as long as the story can be delivered within 2 weeks, i.e. the standard size of a sprint.

Now these two rules are in conflict, the need for value – and preferably more value! – pushes stories to be bigger while the second rule demands they are small. That is just the way things are, there is no magic solution, that is the tension we must manage.

Those two rules also help us differentiate between stories and epics – and tasks if you are using them:

  • Epics honour rule #1: epics are very valuable but they are not small. By definition they are large, thus epics are unlikely to be delivered soon
  • Tasks honour rule #2, they are small, very small, say a day of work. But they do not deliver value to stakeholders – or if they do it is not a big deal

Tasks are the things you do to build stories. And stories are the things you do to deliver epics. If you find you can complete a story without doing one of the planned tasks then great, and similarly not all stories need to be completed for an epic to be considered done.

In an ideal world you would not need tasks, every story would be small enough to stand alone. Nor would you need epics because stories would justify themselves. We can work towards that world but until then most teams in my experience use two of these three levels – stories and tasks or epics and stories. A few even use all three levels.

Using more than three is an administration problem. There is always a fourth level above these, the project or product that is the reason they exist in the first place. But really, three levels is more than enough to model just about anything: really small, small, and damn big.

And every story is a potential epic until proven guilty.

More about epics, stories and tasks in Little Book of Requirements and User Stories and in my User Stories Masterclass next month (use Blog15 for 15% discount).


The post User Story or Epic? appeared first on Allan Kelly Associates.

Product owner as a homeowner

Allan Kelly from Allan Kelly Associates

For years people have been comparing software construction with building construction. Think about how we talk about “architecture” or foundations, or the cost of change and so on. As I’ve said before, building software is not like building a house. Now it occurs to me that a better metaphor is the ongoing ownership of the building.

Every building requires “maintenance” and over time buildings change – indeed buildings learn. While an Englishman’s home is his castle, those of us, even the English, who are lucky enough to own a house don’t have a free hand in the changes we make to our houses.

Specifically I’m thinking about the Product Owner. Being a Product Owner is less about deciding what you want your new house to look like, or how the building should be constructed; it’s not even about deciding how many rooms the house should have. The role of the Product Owner is to ensure the house continues to be liveable, preferably that the house is getting nicer to live in, and that the house is coping with the requests made on it.

I own a house – a nice one in West London. As the owner I am responsible for the house. I do little jobs myself – like painting the fences. More significantly I have to think about what I want to do with the house: do we want to do a loft conversion? What would that entail and when might I be able to afford that?

I am the Product Owner of my own house. I have to decide on what is to be done, what can wait and what trade-offs I can accept.

When I bought the house the big thing to change was the kitchen and backroom. There was little point in any other works until those rooms were smashed to bits and rebuilt. I had to think through what was needed by my family, what was possible and what the result might be like. I received quotes from several builders – each of whom had their own ideas about what I wanted. I hired an architect for advice. I looked at what neighbours had done. And I had a hard think about how much money I could spend.

An Englishman’s home is his castle – I am the lord of my house and I can decide what I want, except…

My wife and children have a say in what happens to the house. Actually my wife has a pretty big say; while the children have less say, their needs are pretty high on my list of priorities.

My local council and even the government have a say because they place certain constraints on what I can do – planning permission, rules and building codes. The insurance company and mortgage bank set some constraints and expectations too.

My neighbours might not own my house but they are stakeholders: I can’t upset them (too much) and they impose some constraints. (In my first flat/apartment the neighbours were a bigger issue because we shared a roof, a garden and the walls.)

So while I may be lord of my own house I am not completely a free agent. And the same is true with Product Owners.

The secret with Product Owners is: they are Owners. They are more than managers – managers are just hired help. But neither do POs have a free hand: they don’t have unlimited power, they are not dictators, they are not completely free to do what they want and order people around.

Like me, Product Owners have limited resources available: how much money, how many helpers, access to customers and more. I have to balance my desire for a large loft conversion (with shower, balcony and everything else) with the money I can afford to spend on it. That involves trade-offs and compromises. I could go into debt – increase my mortgage – but that comes with costs.

Product owners have responsibilities: to customers and users, to those who fund the work (like the mortgage bank), to team members and peers, to name a few. Some decisions they can make on their own, but on other decisions they can only lead a conversation and guide it towards a conclusion.

What the homeowner metaphor misses entirely is the commercial aspect: my house exists for me to live in. I don’t expect to make money out of it. The house next door to mine is owned by a commercial landlord who rents it out: the landlord is actively trying to make money out of that house.

Most Product Owners are trying to further some other agenda: commercial (generating money), or public sector (furthering Government policies), or third sector (e.g. a charity). In other words: Product Owners are seeking to add value for their organization. This adds an additional dimension because the PO has to justify their decisions to a higher authority.


The post Product owner as a homeowner appeared first on Allan Kelly Associates.

The problem with Product Owners

Allan Kelly from Allan Kelly Associates

Advertisement: at the time of writing there are still a few tickets available for my online User Stories Masterclass beginning this Wednesday, 90 minutes each week for 4 weeks.

After submitting his review of The Art of Agile Product Ownership one of the reviewers sent me a note about the review and said:

“Gee, I really wish I could be that type of Product Owner.”

His comment made me smile. He nicely summarised much of the argument in Art of PO. The book makes a case for an expansive product owner – one with product management skills and business analysis skills; a product owner who looks to improve value over the short and long run, and for product owners with more customer empathy and marketing skills than code empathy and technical skills.

Many of the Product Owners I meet aren’t really owners of the product. Rather they are “Backlog Administrators” and as such the industry is creating another problem for itself.

Over the years the product owner role has been diluted, so many product owners are not really owners of their products. Instead their role is limited and constricted. They are judged on how many features they get the team to deliver, whether those features are delivered by some date or whether they have met near term goals of doing the things customers – or internal users – are complaining about.

In other words the whole team is a feature factory: requests go in and success is measured by how many of those requests reach production.

Sure that is one way to run a team, and in some places that might be the “right” way to do it (a team dedicated to addressing production/customer issues perhaps.)

Unfortunately agile is prone to this failing because of the sprint-sprint-sprint nature of work. The things in front of you are obviously more valuable than the things that are not. The people shouting at you today obviously represent greater value than those who are sitting quietly asking nicely. And both groups can mask bigger insights and opportunities.

Hang on you say: Is this the same Allan who has argued against long term planning? And against analysis phases? The Allan who always argues for action this day?

Well, yes I am that Allan. And I agree that I regularly argue that teams should get started on coding and limit planning and analysis.

But that doesn’t mean I’m against these things, it only means I’m conscious of the diminishing returns of planning; and I know that what is technically possible frames not only the solution but the problem – because often we can’t conceive of the problem until we see how a solution might change things.

Teams need to watch out for the “bigger” questions. Teams need to take some time to think long term. Time needs to be spent away from the hurly-burly of sprint-sprint-sprint to imagine a different world. Dis-economies of scale may rule but there still needs to be consideration of larger things, e.g. jobs to be done over user stories.

The responsibility rests with the Product Owner.

They own the product the way I own my house: I have to pay the mortgage and I have to change blown light bulbs but I also need to think: how long will the roof last? Will we build an extension? When will we rebuild the patio? And where am I going to put a car charging point when that day comes?

I don’t take those decisions in isolation, I don’t spend lots of time on them and I don’t let them get in the way of work today. But I do spend a little time thinking about them, and I may well lead the discussion. Taking a little time to think through how things might fit together (don’t do the roof until after the extension is built) has benefits.

And so many Product Owners aren’t doing that. Worse still their organizations don’t expect them to. Maybe they see an Architect doing that, or a Product Manager – or maybe nobody does.

The thing is: the Product Owner is the OWNER.

Managers and architects are hired and fired as needed. The buck stops with owners.

Many organizations have got this the wrong way round. The Product Owner role is diluted and individual Product Owners emasculated.

The post The problem with Product Owners appeared first on Allan Kelly Associates.

User Stories: online workshops – booking now (free for furloughed)

Allan Kelly from Allan Kelly Associates

Last month I ran an experiment: I delivered my one day workshop on User Stories, Requirements and Backlogs as a series of four 90 minute workshops and offered them for free. I had a great response with over 20 people signing up immediately and during the last month we all learned something – I had great feedback.

I’m now reworking that workshop further into two series: User Stories Masterclass and Value Driven Planning. Both of these will run as four 90 minute workshops online over successive weeks.

Booking is now open for User Stories Masterclass. This will start next week, Wednesday 13 May (3pm, London BST time.) There will be one class at the same time each week.

Places are limited so don’t hold off booking. (If you miss out let me know and I’ll try to schedule another soon.)

Workshops 1 and 2 will focus on User Story format and what makes a good story. Workshop 3 will consider some of the processes that fit around User Stories (definitions of done and ready, acceptance criteria, etc.) and, importantly, non-functional requirements. The final workshop will look at splitting stories.

I’m making some tickets free for those who have been laid-off or furloughed because of you-know-what.

For everyone else there are two price points. “Self pay” for those of you who are paying out of your own pocket. And a more expensive “Company paying” ticket for those with a generous employer – who can reclaim or avoid UK VAT.

To sweeten the extra price I’m including a one-hour one-on-one consultation for those who pay the higher price.

More about the value workshop soon – and why the split? Well, a) it turns out I have more than enough material, and b) people had some great questions and I want to allow time for conversation.

More details on my website.


The post User Stories: online workshops – booking now (free for furloughed) appeared first on Allan Kelly Associates.

Want to join a (free) online workshop?

Allan Kelly from Allan Kelly Associates

Consider this a gift; it’s also an experiment. Numbers are limited so if you would like to join please e-mail me today – if it goes well I’ll repeat, although I might ask for money next time.

I’m going to run an online workshop entitled: Stories and Value.

Participation is limited to 16 and it’s going to be first come first served – blog/newsletter readers are getting the first chance to sign-up.

This is based on my existing “Requirements, Backlogs and User Stories” workshop which has itself mutated into a discussion of stories and value. The workshop will run as a series of 90 minute sessions, one a week for four weeks online.

I want the workshop sessions to remain interactive, I’m sure I will use some slides at some point but I want to keep it interactive. So I’m going to limit participation to 12.

The draft schedule is:

  • Workshop 1: How value influences our thinking
  • Workshop 2: Good and Bad User Stories
  • Workshop 3: Estimating story value
  • Workshop 4: Time value profiles and closing discussion

I plan on using exercises throughout and I think I know how to run them online. And I want discussion! – I may even set a little homework between sessions.

But in all honesty, it’s an experiment. So, I’m not planning on charging for this – it is Free!

If you find it valuable you can make a payment – like those “pay what you like” restaurants. That will itself be feedback.

I’m thinking Wednesday, 3pm UK time so those in mainline Europe could join too (sorry US and Asia, maybe next time); on a Zoom conference. Start next week, April 1 ? – once I know who’s in we might debate this between ourselves.

My thinking is still developing on this so let me know if you have any ideas to contribute. (And if you can’t join but want to let me know, feedback is valuable too! Likewise, if you are tempted but want to see something different please tell me and I’ll see what I can do.)

So, if you want to join these sessions please e-mail me, allan@allankelly.net.

This is a minimally viable experiment so it’s all a bit crude.

The post Want to join a (free) online workshop? appeared first on Allan Kelly Associates.

Patterns of regular expression usage: duplicate regexs

Derek Jones from The Shape of Code

Regular expressions are widely used, but until recently they were rarely studied empirically (i.e., research was mostly theoretical).

This week I discovered two groups studying regular expression usage in source code. The VTLeeLab has various papers analysing 500K distinct regular expressions, from programs written in eight languages and StackOverflow; Carl Chapman and Peipei Wang have been looking at testing of regular expressions, and also ran an interesting experiment (I will write about this when I have decoded the data).

Regular expressions are interesting, in that their use is likely to be purely driven by an application requirement; the use of integer literals may be driven by internal housekeeping requirements. The number of times the same regular expression appears in source code provides an insight (I claim) into the number of times different programs are having to solve the same application problem.

The data made available by the VTLeeLab group provides lots of information about each distinct regular expression, but not a count of occurrences in source. My email request for count data received a reply from James Davis within the hour :-)

The plot below (code+data; crates.io has not been included because the number of regexs extracted is much smaller than the other repos) shows the number of unique patterns (y-axis) against the number of identical occurrences of each unique pattern (x-axis), e.g., far left shows number of distinct patterns that occurred once, then the number of distinct patterns that each occur twice, etc; colors show the repositories (language) from which the source was obtained (to extract the regexs), and lines are fitted regression models of the form: NumPatterns = a*MultOccur^b, where: a is driven by the total amount of source processed and the frequency of occurrence of regexs in source, and b is the rate at which duplicates occur.
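
A model of the form NumPatterns = a*MultOccur^b is a straight line on log-log axes, so one common way to estimate a and b is ordinary least squares on the logged values. The Python sketch below illustrates that approach only; it is an assumption that this resembles the fitting procedure behind the plot (which may differ, e.g., a different regression technique), and the data in the example is synthetic.

```python
import math


def fit_power_law(mult_occur, num_patterns):
    """Fit num_patterns = a * mult_occur**b by least squares on log-log values.

    mult_occur:   number of identical occurrences (x-axis), all > 0
    num_patterns: number of distinct patterns with that many occurrences, all > 0
    Returns the pair (a, b).
    """
    xs = [math.log(x) for x in mult_occur]
    ys = [math.log(y) for y in num_patterns]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                          # slope on log-log axes
    a = math.exp(mean_y - b * mean_x)      # intercept, mapped back from logs
    return a, b


if __name__ == "__main__":
    # Synthetic data generated from a known power law, for illustration only.
    occurrences = list(range(1, 21))
    patterns = [10000 * x ** -2.5 for x in occurrences]
    a, b = fit_power_law(occurrences, patterns)
    print(f"a = {a:.1f}, b = {b:.2f}")
```

Fitting on logged values weights relative rather than absolute errors, which matters when counts span several orders of magnitude.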

Number of distinct patterns occurring a given number of times in the source stored in various repositories

So most patterns occur once, and a few patterns occur lots of times (there is a long tail off to the unplotted right).
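As a minimal sketch of how a model of the form NumPatterns = a*MultOccur^b can be fitted (this is not the code behind the plot, whose fitting method is not described here), the following uses ordinary least squares on the logged values. The function name fit_power_law and the counts are hypothetical; the counts are generated from a = 10000 and b = -2.5 rather than taken from the VTLeeLab data.

```python
import math


def fit_power_law(occurrences, num_patterns):
    """Fit num_patterns = a * occurrences**b by least squares on log-log values."""
    xs = [math.log(x) for x in occurrences]
    ys = [math.log(y) for y in num_patterns]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)     # intercept in log-log space is log(a)
    return a, b


# Hypothetical data: an exact power law with a = 10000 and b = -2.5.
occurrences = list(range(1, 21))
num_patterns = [10000 * k ** -2.5 for k in occurrences]

a, b = fit_power_law(occurrences, num_patterns)
print(f"a = {a:.1f}, b = {b:.2f}")
```

Real counts are noisy and include many zero or one-off values in the tail, so a count-based regression may be a better choice than a straight line through the logged values; the sketch only shows what a and b mean.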

The following table shows values of b for the various repositories (languages):

StackOverflow  cpan  godoc  maven  npm   packagist  pypi  rubygems
-1.8           -2.5  -2.5   -2.4   -1.9  -2.6       -2.7  -2.4

The closer the value of b is to zero (i.e., the smaller its magnitude), the more often the same regex will appear.
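To put numbers on this: under the fitted model, the number of patterns occurring k times, relative to the number occurring once, is k^b. The sketch below evaluates this ratio for the exponents in the table; the helper name relative_count is hypothetical, and the ratios are only what the fitted model implies, not measured values.

```python
# Exponents b taken from the table above.
exponents = {
    "StackOverflow": -1.8, "cpan": -2.5, "godoc": -2.5, "maven": -2.4,
    "npm": -1.9, "packagist": -2.6, "pypi": -2.7, "rubygems": -2.4,
}


def relative_count(k, b):
    """Patterns occurring k times, relative to patterns occurring once: k**b."""
    return k ** b


for repo, b in exponents.items():
    twice = relative_count(2, b)
    ten = relative_count(10, b)
    print(f"{repo:14s} b={b:5.1f}  twice/once={twice:.3f}  ten/once={ten:.4f}")
```

With b = -1.8, a pattern occurring twice is noticeably more common, relative to singletons, than with b = -2.7.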

The values are in the region of -2.5, with two exceptions; why might StackOverflow and npm be different? I can imagine lots of duplicates on StackOverflow, but npm? (I’m not really familiar with this package ecosystem.)

I am pleased to see such good regression fits, and close power law exponents (I would have been happy with an exponential fit, or any other equation; I am interested in a consistent pattern across languages, not the pattern itself).

Some of the code is likely to be cloned, i.e., cut-and-pasted from a function in another package/program. Copy rates as high as 70% have been found. In this case, I don’t think cloned code matters. If a particular regex is needed, what difference does it make whether the code was cloned or written from scratch?

If the same regex appears in source because of the same application requirement, the number of reuses should be correlated across languages (unless different languages are being used to solve different kinds of problems). The plot below shows the correlation between number of occurrences of distinct regexs, for each pair of languages (or rather repos for particular languages; top left is StackOverflow).

Correlation of number of identical pattern occurrences, between pairs of repositories.
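A pairwise comparison of this kind can be sketched as follows. Two assumptions are made that the post does not state: Pearson correlation is used, and only patterns present in both repositories are compared. The regexs and counts are made up for illustration, and the names pearson and pairwise_correlation are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlation(counts):
    """Correlate occurrence counts of shared patterns for every pair of repos."""
    result = {}
    for repo_a, repo_b in combinations(sorted(counts), 2):
        # Assumption: compare only patterns that appear in both repos.
        shared = sorted(set(counts[repo_a]) & set(counts[repo_b]))
        xs = [counts[repo_a][p] for p in shared]
        ys = [counts[repo_b][p] for p in shared]
        result[(repo_a, repo_b)] = pearson(xs, ys)
    return result


# Hypothetical occurrence counts per repo (pattern -> number of occurrences).
counts = {
    "pypi": {r"\d+": 40, r"\s+": 25, r"^\s*$": 6, r"[a-z]+": 3},
    "npm": {r"\d+": 80, r"\s+": 50, r"^\s*$": 12, r"[a-z]+": 6},
    "cpan": {r"\d+": 5, r"\s+": 30, r"^\s*$": 2, r"[a-z]+": 9},
}

result = pairwise_correlation(counts)
for pair, r in result.items():
    print(pair, round(r, 2))
```

Because occurrence counts are heavy-tailed, a few very common patterns can dominate a Pearson coefficient; correlating logged counts or ranks is a reasonable alternative.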

Why is there a mix of strong and weakly correlated pairs? Is it because similar application problems tend to be solved using different languages? Or perhaps there are different habits for cut-and-pasted source for developers using different repositories (which will cause some patterns to occur more often, but not others, and have an impact on correlation but not the regression fit).

There are lots of other interesting things that can be done with this data, when connected to the results of the analysis of distinct regexs, but these look like hard work, and I have a book to finish.

Framing the question frames the answers – my crown jewels

Allan Kelly from Allan Kelly Associates

Today I’m giving away my crown jewels. Once you have read this post I can’t run my favourite exercise with you. There is a bit of experiential learning I can’t give you. So if you’d rather have the experience, stop reading and go and book yourself on my May workshop.

I’m describing an exercise that models what happens in “the real world(tm).” Plenty of the people who have done the exercise recognise it as a real-life problem. The insights are many, and some are a little disturbing.

Dozens of teams and the answers are always the same. I live in dread that someone will guess and ruin the exercise but it never happens. Now that I’m telling the world, that might change.

On screen I put a story something like:

As a widget maker I want an online store to sell my widgets so that I can make money

I separate the room into teams. Each team represents a technology supplier – an agency, an outsourcer, whatever. I want each team to competitively bid on a piece of work. Each team gets to think about the problem and estimate the work. At the end I want each team to be ready to name their price, how long it will take and how many people they need. They may add any more details they like, e.g. staging, design, technology, etc. (and most do).

The teams on the right get a story which says:

As an international widget maker I want to sell direct to customers so that I can cut out distributors. I anticipate $10million turnover within 3 years and have budgeted $1.2m for this project.

15 minutes later the teams on the right read out their bids. They always want a million give or take. They want months, if not years. They want teams of half a dozen or more engineers, testers, UXD, analysts and project managers. They may propose an ongoing maintenance contract too.

Very disconcerting for these teams is that while they are answering and taking questions the other teams, those on the left, often burst out laughing – literally – when they hear these proposals.

What neither side knows is that they have different stories. The teams on the left get a story saying:

As an artisan widget maker I want to sell my widgets to customers so that I can give up my day job. When I make $100,000 a year in sales I can live my dream. My accountant tells me a WordPress website will cost $5,000.

These teams want a week or two, an engineer or two and perhaps $10,000 tops. In some cases they say “We can do it this afternoon, we’ll set up Etsy.” Even if they don’t want to use Etsy or eBay they probably propose an OpenSource solution.

So what do you think?

True, it is a semi-artificial set-up but I would argue that these situations happen all the time in “the real world” work environment. However they are usually well disguised and hard to see. Maybe now you will spot them.

That aside, there are many potential lessons this exercise illustrates, and almost every one is worth a discussion in its own right. To keep things brief I’ll just highlight some of them:

  • Anchoring (cognitive bias): individuals are anchored to the numbers in the story; bigger numbers lead them to frame their answers as bigger numbers.
  • Assumptions: people jump to assumptions, people automatically fill in the blanks when they lack information and the information they fill in flows from the numbers mentioned. Few questions get asked.
  • Different solutions: the key lesson for me, confronted with similar problems, people (especially engineers) are capable of formulating very different solutions. Those solutions have different time and cost implications.
  • Problem bounding: presenting the same problem with different bounding constraints results in massively different solutions.
  • Effort estimates, and therefore cost estimates, flow from value: whether through anchoring, assumptions or solution designs, the estimates (time and money) flow from the value available, NOT the other way around.
  • Prior experience often goes out the window. In one run a low-end team told me: “We did this last week. A digital consultant showed us how to install WordPress and Magento for online retail in the morning and in the afternoon we did it ourselves.” While this team came up with a low cost proposal, their colleagues who were given the $1m story forgot everything they learned last week.
  • People don’t ask questions: I answer questions while teams are creating their answers but people rarely challenge what is asked for. Maybe it’s because I’m usually in some position of authority as a consultant or workshop trainer and my word should be followed.

Occasionally a team with the million dollar story say “We could do this with eBay/WordPress/Shopify.” I keep a poker face and let them be. Inevitably left alone for long enough they talk themselves into a much more complex and expensive bid.

In fact, the longer I give teams the higher the estimates go. I heard a team in Australia say three times “Those estimates look low, let’s double them.” And they did. (Again, planning has diminishing returns.)

So far nobody has offered two solutions: you could offer up a Shopify solution and a custom build solution but nobody has.

While we are going through the exercise the minimal viable product idea often gets mentioned – usually by the teams on the right. So recently I introduced a third story. This reads the same as the international widget maker story but has an extra paragraph underneath:

MacAllan consulting has advised the company to take an iterative and agile approach to this work using a minimally viable product model.

How do you think teams respond?

Think for a minute… your answer is?

It makes no difference.

Or rather, so far I’ve not had any of the million dollar teams propose anything close to the $5,000 solution. In one case a team with the MVP story actually came in more expensive – and longer – than the million dollar team without the MVP clause.

My learning here: talking MVP makes no difference. If you want an MVP you have to impose constraints. (Hence try an MVT.)

People continue to fill in the blanks after the charade is exposed. I’ve heard software architects argue forcefully that these are different problems because of the money involved and therefore require different architecture. They clearly feel cheated and want to justify the proposal they have made. I suspect they feel I’ve made them look silly and want to undo that impression. I’m sorry if I’ve made anyone feel silly.

I wonder how often that happens in the work place? How many of us would back down in real life? I’d like to think I would but I would probably first try and justify my position.

The architects have a point: in many ways the stories are functionally the same, and the differences lie in the non-functional requirements: load, throughput, security and so on. But all of that is inferred by people from the price tag without question.

It makes me sad that teams ask so few questions. People easily see themselves as a tailor, not as a consultant (see my Tailor or Image consultant post).

Then there are the questions about the bidding process and companies bidding on the work.

Imagine you are the buyer: one supplier bids really low, the others are much higher but in line with your expectations. Would you trust the low bid? Have they blown their credibility?

And as a bidder: if you know the client plans to spend $1,000,000 why bid lower? The engineers’ cost estimates and designs aren’t relevant. Ideally you bid just below the competition. You are the lowest price with all the credibility and maximum revenue.

For that matter, should you be bidding on this at all?

If you work for a small e-commerce provider in rural Cornwall you may never know about, let alone bid on, a million dollar piece of work from an American multi-national. And if you did, would anyone take you seriously?

Suppose you got your big break: you walk in and offer a quick, low cost solution based on something like Shopify. Would they take you seriously? Would they want to listen?

Do corporations increase their own costs simply by being?

Conversely, if you work for a big consultancy and bid on million dollar deals every week will you be interested in bidding on a $5,000 piece of work? Of course not!

But that also means that if a corporation approaches you for a million dollar online shop, even if you could deliver it in a week’s time running on a third party platform, do you have any incentive to do so?

I don’t have answers to these questions. Indeed, there aren’t any right answers. All answers are valid, just some answers are “better” in some places than others.

Ultimately the lesson I take away from this is: we craft solutions within constraints.

More specifically: Engineers engineer within constraints, that is what engineers do.

That reinforces my belief that one needs to really understand benefit (value) and how that changes with time. From there we can work back to a solution.

If you would like to see this exercise for real then book yourself on my Requirements, Backlogs and User Stories workshop. If you are in London, Learning Connexions are running this again in May.


The post Framing the question frames the answers – my crown jewels appeared first on Allan Kelly Associates.