Archimedean View – a.k.

Last time we took a look at how we could define copulas to represent the dependency between random variables by summing the results of a generator function φ applied to the results of their cumulative distribution functions, or CDFs, and then applying the inverse of that function φ-1 to that sum.
These are known as Archimedean copulas and are valid whenever φ is strictly decreasing over the interval [0,1], equal to zero when its argument equals one and have nth derivatives that are non-negative over that interval when n is even and non-positive when it is odd, for n up to the number of random variables.
Whilst such copulas are relatively easy to implement we saw that their densities are a rather trickier job, in contrast to Gaussian copulas where the reverse is true. In this post we shall see how to draw random vectors from Archimedean copulas which is also much more difficult than doing so from Gaussian copulas.

Overthinking is not Overengineering

As the pendulum swings ever closer towards being leaner and focusing on simplicity I grow more concerned about how this is beginning to affect software architecture. By breaking our work down into ever smaller chunks and then focusing on delivering the next most valuable thing, how much of what is further down the pipeline is being factored into the design decisions we make today?

Wasteful Thinking

Part of the ideas around being leaner is an attempt to reduce waste caused by speculative requirements which has led many a project in the past into a state of â€œanalysis paralysisâ€ where they canâ€™t decide what to build because the goalposts keep moving. By focusing on delivering something simpler much sooner we begin to receive some return on our investment earlier and also shape the future based on practical feedback from today, rather than trying to guess what we need.

When weâ€™re building those simpler features that sit nicely upon our existing foundations we have much less need to worry about the cost of rework from getting it wrong as itâ€™s unlikely to be expensive. But as we move from independent features to those which are based around, say, a new â€œconceptâ€ or â€œpillarâ€ we should spend a little more time looking further down the backlog to see how any design choices we make might play out later.

Thinking to Excess

The term â€œoverthinkingâ€ implies that we are doing more thinking than is actually necessary; trying to fit everyoneâ€™s requirements in and getting bogged down in analysis is definitely an undesirable outcome of spending too much time thinking about a problem. As a consequence we are starting to think less and less up-front about the problems we solve to try and ensure that we only solve the problem we actually have and not the problems we think weâ€™ll have in the future. Solving those problems that we are only speculating about can lead to overengineering if they never manage to materialise or could have been solved more simply when the facts where eventually known.

But how much thinking is â€œoverthinkingâ€? If I have a feature to develop and only spend as much effort thinking as I need to solve that problem then, by definition, any more thinking than that is â€œoverthinking itâ€. But not thinking about the wider picture is exactly what leads to the kinds of architecture & design problems that begin to hamper us later in the productâ€™s lifetime, and later on might not be measured in years but even in days or weeks if we are looking to build a set of related features that all sit on top of a new concept or pillar.

The Horizon

Hence, it feels to me that some amount of overthinking is necessary to ensure that we donâ€™t prematurely pessimise our solution and paint ourselves into a corner. We should factor work further down the backlog into our thoughts to help us see the bigger picture and work out how we can shape our decisions today to ensure it biases our thinking towards our anticipated future rather than an arbitrary one.

Acting on our impulses prematurely can lead to overengineering if we implement whatâ€™s in our thoughts without having a fairly solid backlog to draw on, and overengineering is wasteful. In contrast a small amount of overthinking â€“ thought experiments â€“ are relatively cheap and can go towards helping to maintain the integrity of the systemâ€™s architecture.

One has to be careful quoting old adages like â€œa stich in time saves nineâ€ or â€œan ounce of prevention is worth a pound of cureâ€ because they can send the wrong message and lead us back to where we were before â€“ stuck in The Analysis Phase [1]. That said I want us to avoid â€œthrowing the baby out with the bathwaterâ€ and forget exactly how much thinking is required to achieve sustained delivery in the longer term.

[1] The one phrase I always want to mean this is â€œthink globally, act locallyâ€ because it sounds like it promotes big picture thinking while only implementing what we need today, but thatâ€™s probably stretching it too far.

Coding guidelines should specify what constructs can be used

There is a widespread belief that an important component of creating reliable software includes specifying coding constructs that should not be used, i.e., coding guidelines. Given that the number of possible coding constructs is greater than the number of atoms in the universe, this approach is hopelessly impractical.

A more practical approach is to specify the small set of constructs that developers that can only be used. Want a for-loop, then pick one from the top-10 most frequently occurring looping constructs (found by measuring existing usage); the top-10 covers 70% of existing C usage, the top-5 55%.

Specifying the set of coding constructs that can be used, removes the need for developers to learn lots of stuff that hardly ever gets used, allowing them to focus on learning a basic set of techniques. A small set of constructs significantly simplifies the task of automatically checking code for problems; many of the problems currently encountered will not occur; many edge cases disappear.

Developer coding mistakes have two root causes:

• what was written is not what was intended. A common example is the conditional in the if-statement: `if (x = y)`, where the developer intended to write `if (x == y)`. This kind of coding typo is the kind of construct flagged by static analysis tools as suspicious.

People make mistakes, and developers will continue to make this kind of typographical mistake in whatever language is used,

• what was written does not have the behavior that the developer believes it has, i.e., there is a fault in the developers understanding of the language semantics.

Incorrect beliefs, about a language, can be reduced by reducing the amount of language knowledge developers need to remember.

Developer mistakes are also caused by misunderstandings of the requirements, but this is not language specific.

Why do people invest so much effort on guidelines specifying what constructs not to use (these discussions essentially have the form of literary criticism)? Reasons include:

• providing a way for developers to be part of the conversation, through telling others about their personal experiences,
• tool vendors want a regular revenue stream, and product updates flagging uses of even more constructs (that developers could misunderstand or might find confusing; something that could be claimed for any language construct) is a way of extracting more money from existing customers,
• it avoids discussing the elephant in the room. Many developers see themselves as creative artists, and as such are entitled to write whatever they think necessary. Developers don’t seem to be affronted by the suggestion that their artistic pretensions and entitlements be curtailed, probably because they don’t take the idea seriously.

Emerging talent at the DevelopHER Awards 2018!

A couple of weeks ago I was honoured to be asked to judge and present the overall winner of the DevelopHER Awards 2018. There are a number of categories in the awards, including TechStar which I also judged, and the overall winner is chosen from the winners of the other categories.

I believe that the best developers start writing code at an early age and continue throughout their lives and on through their careers. As well as learning all they can, all the time, they give back to community around them and help other people develop as well.

Federica Freddi, who also won the Emerging Talent award, is clearly passionate about software development and is fully deserving of the DevelopHER award and I couldnâ€™t have been more delighted to be able to present her with it on the night.

Federica told me "It is fantastic to see so many women recognised for their contribution to our industry. It is a huge honour for me to be able to represent so many talented people that are making the difference in tech. As an Emerging Talent, I still have a long way ahead and I donâ€™t know what awaits for me in the future, however I am sure I will never forget to stop along the way to give back to people and help the next generations of tech stars to grow too."

I am hoping weâ€™ll see Federica back in Norwich very soon.

I wrote a book

I've written a book pulling together some of my previous talks showing how to code your way out of a paper bag using a variety of machine learning techniques and models, including genetic algorithms.
It's on pre-order at Amazon and you can download free excerpts from the publishers website.

The sales figures show I've sold over 1,000 copies already. I'm going through the copy edits at the moment. I can't wait to see the actual paper book.

Thank you to everyone at ACCU who helped and encouraged me while I wrote this.

I will be giving some talks at conferences and hopefully some meetups based on ideas in some of the chapters in 2019.

Watch this space.