On A Day At The Races – student

student from thus spake a.k.

Most recently the Baron challenged Sir R----- to a race of knights around the perimeter of a chessboard, with the Baron starting upon the lower right hand square and Sir R----- upon the lower left. The chase proceeded anticlockwise with the Baron moving four squares at each turn and Sir R----- by the roll of a die. Costing Sir R----- one cent to play, his goal was to catch or overtake the Baron before he reached the first rank for which he would receive a prize of forty one cents for each square that the Baron still had to traverse before reaching it.

Tracking software evolution via its Changelog

Derek Jones from The Shape of Code

Software that is used evolves. How fast does software evolve, e.g., how much new functionality is added and how much existing functionality is updated?

A new software release is often accompanied by a changelog which lists new, changed and deleted functionality. When software is developed using a continuous release process, the changelog can be very fine-grained.

The changelog for the Beeminder app contains 3,829 entries, almost one per day since February 2011 (around 180 entries are not present in the log I downloaded, whose last entry is numbered 4012).

Is it possible to use the information contained in the Beeminder changelog to estimate the rate of growth of functionality of Beeminder over time?

My thinking is driven by patterns in a plot of the Renzo Pomodoro dataset. Renzo assigned a tag-name (sometimes two) to each task, which classified the work involved, e.g., @planning. The following plot shows the date of use of each tag-name, over time (ordered vertically by first use). The first and third black lines are fitted regression models of the form 1-e^{-K*days}, where: K is a constant and days is the number of days since the start of the interval fitted; the second (middle) black line is a fitted straight line.

at-words usage, by date.

How might a changelog line describing a day’s change be distilled to a much shorter description (effectively a tag-name), with very similar changes mapping to the same description?

Named-entity recognition seemed like a good place to start my search, and my natural language text processing tool of choice is still spaCy (which continues to get better and better).

spaCy is Python based and the processing pipeline could have all been written in Python. However, I’m much more fluent in awk for data processing, and R for plotting, so Python was just used for the language processing.

The following shows some Beeminder changelog lines after stripping out urls and formatting characters:

Cheapo bug fix for erroneous quoting of number of safety buffer days for weight loss graphs.
Bugfix: Response emails were accidentally off the past couple days; fixed now. Thanks to user bmndr.com/laur  for alerting us!  
More useful subject lines in the response emails, like "wrong lane!" or whatnot.
Clearer/conciser stats at bottom of graph pages. (Will take effect when you enter your next datapoint.) Progress, rate, lane, delta.  
Better handling of significant digits when displaying numbers. Cf stackoverflow.com/q/5208663

The code to extract and print the named-entities in each changelog line could not be simpler.

import sys

import spacy

nlp = spacy.load("en_core_web_sm")  # load trained English pipeline

# Read changelog lines from stdin, numbering each one from 1
for count, line in enumerate(sys.stdin, start=1):
    print(f'> {count}: {line}')

    doc = nlp(line)  # do the heavy lifting

    for ent in doc.ents:  # iterate over detected named-entities
        # print the base form of the entity text, and the kind of entity
        print(ent.lemma_, ent.label_)

To maximize the similarity between named-entities appearing on different lines the lemmas are printed, rather than original text (i.e., words appear in their base form).

The label_ specifies the kind of named-entity, e.g., person, organization, location, etc.

This code produced 2,225 unique named-entities (5,302 in total) from the Beeminder changelog (around 0.6 per day), and failed to return a named-entity for 33% of lines. I was somewhat optimistically hoping for a few hundred unique named-entities.

There are several problems with this simple implementation:

  • each line is considered in isolation,
  • the change log sometimes contains different names for the same entity, e.g., a person’s full name, Christian name, or twitter name,
  • what appear to be uninteresting named-entities, e.g., numbers and dates,
  • the language model does not know much about software, having been trained on a corpus of general English.

Handling multiple names for the same entity would be a lot of work (i.e., I did nothing); ‘uninteresting’ named-entities can be handled by post-processing the output.
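The post-processing of ‘uninteresting’ named-entities might look something like the following sketch. It assumes that the (lemma, label) pairs printed by the extraction script have been collected into a list; the function name drop_uninteresting, the sample data, and the choice of labels to discard are all assumptions made for illustration, and this is not the filter actually used in the analysis.

```python
# Labels that spaCy's English pipelines use for numeric and date-like entities.
# Which labels count as 'uninteresting' is a judgment call (an assumption here).
UNINTERESTING = {"CARDINAL", "ORDINAL", "DATE", "TIME",
                 "PERCENT", "MONEY", "QUANTITY"}


def drop_uninteresting(entities):
    """Return the (lemma, label) pairs whose label is not in UNINTERESTING."""
    return [(lemma, label) for lemma, label in entities
            if label not in UNINTERESTING]


# Hypothetical entities, in the form printed by the extraction script.
sample = [("stackoverflow.com", "ORG"),
          ("the past couple day", "DATE"),
          ("5208663", "CARDINAL")]
kept = drop_uninteresting(sample)
print(kept)
```

Filtering on the label, rather than on the text of the entity, avoids having to recognise every way a number or date can be written.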

A language processing pipeline that is not software-concept aware is of limited value. spaCy supports adding new trained models; all I need is a named-entity model trained on manually annotated software engineering text.

The only decent NER training data I could find (trained on StackOverflow) was for BERT (another language processing tool), and the data format is very different. Existing add-on spaCy models included fashion, food and drugs, but no software engineering.

Time to roll up my sleeves and create a software engineering model. Luckily, I found a webpage that provided a good user interface for tagging sentences and generated the json file used for training. I was patient enough to tag 200 lines with what I considered to be software specific named-entities. … and now I have broken the NER model I built…

The following plot shows the growth in the total number of named-entities appearing in the changelog, and the number of unique named-entities (with the 1,996 numbers and dates removed; code+data):

Growth of total and unique named-entities in the Beeminder changelog.

The regression fits (red lines) are quadratics, slightly curving up (total) and down (unique); the linear growth components are: 0.6 per release for total, and 0.46 for unique.
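The two quantities being fitted, the running total of named-entities and the running count of unique ones, can be derived from the per-line output of the extraction script. A minimal sketch, assuming the named-entities for each changelog line are already available as a list of strings (the function name cumulative_counts and the sample data are hypothetical):

```python
def cumulative_counts(entities_per_line):
    """For each changelog line, return (total so far, unique so far)."""
    seen = set()   # every distinct named-entity encountered so far
    total = 0      # every named-entity occurrence encountered so far
    growth = []
    for entities in entities_per_line:
        total += len(entities)
        seen.update(entities)
        growth.append((total, len(seen)))
    return growth


# Hypothetical named-entities from four consecutive changelog lines;
# the second line produced no named-entity.
lines = [["graph"], [], ["email", "graph"], ["email", "twitter"]]
growth = cumulative_counts(lines)
print(growth)
```

Each pair in the result gives one point on the total curve and one on the unique curve, indexed by release number.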

Including software named-entities is likely to increase the total by at least 15%, but would have little impact on the number of unique entries.

This extraction pipeline processes one release line at a time. Building a set of Beeminder tag-names requires analysing the changelog as a whole, which would take a lot longer than the day spent on this analysis.

The Beeminder developers have consistently added new named-entities to the changelog over more than eleven years, but does this mean that more features have been consistently added to the software (or are they just inventing different names for similar functionality)?

It is not possible to answer this question without access to the code, or experience of using the product over these eleven years.

However, staying in business for eleven years is a good indicator that the developers are doing something right.

Practical tips or mindset change?

Allan Kelly from Allan Kelly

How many books on your bookshelves have a number in the title? Specifically a list of X things. Such books sell, blog posts of a similar ilk get read.

“50 specific ways to improve your programs”

“97 things every dog walker should know”

“10 practical things every Scrum Master should know”

“51 tips to improve your requirements”

Small, specific nuggets of information, best presented as a list and advertised as such. No grand unifying thesis, just “75 things”. The closest I have ever come to this was “Little Book of Requirements and User Stories” which was my best seller and would have sold more if I had called it “16 tips to improve your User Stories.”

However, most of my books aren’t like that. Most of my books contain a big idea – at least one big idea. The whole book sets out to explain that. Business Patterns does say “38 Business strategy patterns” but really the book’s big idea was “Apply pattern thinking to business strategy”. In retrospect it would have sold better if I had called the book “38 Business strategy patterns” and put the pattern thinking stuff as an appendix.

Regular readers might notice that my blogs follow a similar pattern: mostly long thoughtful pieces which try to build an argument, with a few practical posts thrown in once in a while. Despite knowing I should write more short practical pieces (to boost readership) I keep failing.

Why?

Two reasons.

Sometimes those “short practical tips” seem so trivial, or so obvious, that I just assume everyone does it that way and everyone sees what I see. They are so small and so “obvious” I don’t see them.

But more because I see value in those long pieces. I see them as “philosophy” pieces, they are about how to see the world, how to comprehend what is going on, sense-making. Quite often I will wrestle with balancing forces, how one force pushes you one way while another pushes you another. The right course of action is about balancing those forces and what is “right” may be different at different times. (That’s a pattern thing.)

It might be better if I called those “Mindset” pieces. They are about preparing the mind to see the world in a particular way. Conditioning you for agile, perhaps.

To me those Mindset pieces are more important because they shape the way you respond. In the complex world in which we live few decisions and few courses of action can actually be boiled down to a simple “If this Then do That”. Instead, the thousands of small decisions you make each day are informed by your mindset (philosophy) of how the world works and what will happen if you make decision X instead of decision Y.

Especially for those working in management, it is your mental view of the world that shapes your decisions and relationships. I’m sure somewhere out there is a “50 practical tips for better management decisions” book but in truth there are so many variables, unknowns and ambiguities that you can’t boil the world down like that.

That’s why, while everyone is short of time and wants “10 practical tips” to fix a problem right now, it is more important to spend time really challenging your own thinking. Change can only really become permanent when people change their actions and decisions without thinking each time, when people can make decision #563 today congruently to everything else not because they read it in a book but because that is the way their mind works.

Our constant search for “quick answers” can mislead us; we might get a quick answer but we aren’t necessarily building our long term capability.

In Succeeding with OKRs in Agile, I tried hard to write a hands-on-practical tips book. I failed but in failing I did better than I would have done without trying. I very deliberately kept the opening chapters short and quickly moved into “practical tips” (mainly about writing OKRs). Almost all the mindset philosophy was pushed later in the book. So far sales suggest I got it right.

So, even as I strive this year to write more “10 practical tips” blog posts I expect I’ll have more philosophy as I put the world to rights!


Subscribe and download Continuous Digital for free

The post Practical tips or mindset change? appeared first on Allan Kelly.

Team Retrospective cards are back, and better than before

Allan Kelly from Allan Kelly

Agile Stationary have given retrospective cards a new home and are handling all the sales and logistics. That means everything should be slicker and export to anywhere in the world should be hassle free.

Agile Stationary gave the cards another print run and in the process enlarged the cards slightly. So while they can still fit in your pocket they are a bit easier to handle.

To mark the occasion Agile Stationary are offering a 20% discount to blog readers, use the code TEAMRETRO20.


The post Team Retrospective cards are back, and better than before appeared first on Allan Kelly.

Join a crowdsourced search for software engineering data

Derek Jones from The Shape of Code

Software engineering data, that can be made publicly available, is very rare; most people don’t attempt to collect data, and when data is collected, people rarely make any attempt to hang onto the data they do collect.

Having just one person actively searching for software engineering data (i.e., me) restricts potential sources of data to those that are English speaking, and to a subset of development ecosystems.

This post is my attempt to start a crowdsourced campaign to search for software engineering data.

Finding data is about finding the people who have the data and have the authority to make it available (no hacking into websites).

Who might have software engineering data?

In the past, I have emailed chief technology officers at companies with fewer than 100 employees (larger companies have lawyers who introduce serious amounts of friction into releasing company data), and this last week I have been targeting Agile coaches. For my evidence-based software engineering book I mostly emailed the authors of data driven papers.

A lot of software is developed in India, China, South America, Russia, and Europe; unless these developers are active in the English-speaking world, I don’t see them.

If you work in one of these regions, you can help locate data by finding people who might have software engineering data.

If you want to be actively involved, you can email possible sources directly; alternatively, I can email them.

If you want to be actively involved in the data analysis, you can work on the analysis yourself, or we can do it together, or I am happy to do it.

In the English-speaking development ecosystems, my connection to the various embedded ecosystems is limited. The embedded ecosystems are huge; there must be software data waiting to be found. If you are active within an embedded ecosystem, you can help locate data by finding people who might have software engineering data.

The email template I use for emailing people is below. The introduction is intended to create a connection with their interests, followed by a brief summary of my interest, examples of previous analysis, and the link to my book to show the depth of my interest.

By all means cut and paste this template, or create one that you feel is likely to work better in your environment. If you have a blog or Twitter feed, then tell them about it and why you think that evidence-based software engineering is important.

Be responsible and only email people who appear to have an interest in applying data analysis to software engineering. Don’t spam entire development groups, but pick the person most likely to be in a position to give a positive response.

This is a search for gold nuggets, and the response rate will be very low; a 10% rate of reply, saying sorry, no data, would be better than what I get. I don’t have enough data to be able to calculate a percentage, but a ballpark figure is that 1% of emails might result in data.

I treat the search as a background task, taking months to locate and email, say, 100 people considered worth sending a targeted email. My experience is that I come up with a search idea or encounter a blog post that suggests a line of enquiry, that may result in sending half-a-dozen emails. The following week, if I’m lucky, the same thing might happen again (perhaps with fewer emails). It’s a slow process.

If people want to keep a record of ideas tried, the evidence-based software engineering Slack channel could do with some activity.

Hello,

A personalized introduction, such as: I have been reading
your blog posts on XXX, your tweets about YYY,
your youtube video on ZZZ.

My interest is in trying to figure out the human issues
driving the software process.

Here are two detailed analyses of Agile estimation data:
https://arxiv.org/abs/1901.01621
and
https://arxiv.org/abs/2106.03679

My book Evidence-based Software Engineering discusses what is
currently known about software engineering, based on an
analysis of all the publicly available data.
pdf+code+all data freely available here:
http://knosof.co.uk/ESEUR/

and I'm always on the lookout for more software data.
This email is a fishing request for software engineering data.

I offer a free analysis of software data, provided an
anonymised version of the data can be made public.

Air-Source Heat Pump – our experience so far, 2 months in

Andy Balaam from Andy Balaam's Blog

Summary: less energy, more money

2 months ago, we replaced our gas boiler with an air-source heat pump, which uses electricity to heat our home and hot water. This is a report of our experience so far.

We expected it to reduce our environmental impact, and cost us more money, and we were right.

It works: our house is comfortable. We use a lot less energy, and it costs us significantly more money (because electricity costs way more than gas).

The house

Our house is a beautiful, leaky old house, with a modern extension. Half of it is well-insulated. The other half was built around 1890, and while we do have double-glazing and decent loft insulation, the walls have no cavities and feel cold to the touch, and there are drafts everywhere.

The new half has underfloor heating. The old half and the upstairs are heated by radiators. We have a hot water cylinder.

The air-source heat pump

Our air-source heat pump uses electricity to extract heat from the outside air and heats water for radiators and hot water, directly replacing our gas boiler.

Our heat pump was installed by Your Energy Your Way and I must declare an interest: my wife is a director of the company.

The heat pump is an LG 16kW “THERMA V” model. It looks like a very large air conditioning unit, which sits outside our house in the yard to the side. It is about as tall as my shoulder height, with two big fans on it.

A large air-source heat pump

It stands on a soak-away area with some stones on it that the installers made by removing some patio tiles. This is needed because it drips a small amount of liquid as part of its normal operation. The outdoor unit makes noise, but our house is next to the main road, so we don’t hear it. It is not audible indoors.

Standing next to the outdoor unit you can feel a cold breeze, like opening the fridge door. This is unpleasant on cold days.

That outdoor unit connects through the wall to an indoor part that is a bit smaller than our old boiler.

The controller box has a terrible user interface and is very hard to decipher, but we did eventually manage to programme it to turn the target temperature up in the daytime and down at night. Your Energy Your Way advised us that it is more efficient to keep the house at a cool-ish 17 degrees at night, rather than letting it get cold and having to work hard to heat it up again in the morning, so that is how we have set it up.

The controller box’s built-in thermostat does not work properly (it reports the wrong temperature), so we had to add an external thermostat, which works well.

We didn’t need to change anything about our hot water cylinder, or our underfloor heating.

When planning the installation, Your Energy Your Way estimated the heat loss of our rooms, and recommended upgrading our radiators. In an old house like ours this is sometimes needed, because it is way more efficient to heat a house with cooler water running through the radiators, but if the water is cooler, you need more radiator surface area to heat the house effectively. In a newer house with existing radiators, they are probably fine as-is.

We kept most of the existing radiators, and added some more in the coldest rooms.

How comfortable is the house?

The house is more comfortable than it was before, for two reasons: firstly the radiators we had were not really adequate, and secondly the cooler water in the radiators makes a less irritating heat, meaning the house is nicely comfortable most of the time, instead of bouncing between feeling cold and feeling oppressively over-heated.

On cold days, the old part of the house is a bit cold, but I think on average it’s a little better than it was before.

We do find mornings can be chilly, particularly because the system stops heating the radiators if the hot water cylinder needs heating up after people have had showers. We could improve this situation by getting a larger cylinder, which we are considering.

However, it’s worth pointing out that we needed engineers to visit four or five times to make adjustments before we felt the system was working well enough. There are a lot of things that can be tweaked, and it took some time for it to work well.

My advice: don’t pick the cheapest quote – pick the people you think you can trust to do the work well: especially the heat loss calculations before installation and the adjustments afterwards.

How much energy are we using? (The good news)

So far, it looks like we are using about two-thirds less energy in our household than we were before:

The above chart is stacked, so the top line represents the total energy usage. We switched to the air-source heat pump exactly when our gas usage was about to skyrocket (because it’s cold in winter), and it remained relatively low.

This is absolutely fantastic: our house is more comfortable than before, and we have reduced the amount of energy we are using by 66%. This is the total energy usage of our house, not just for heating, so the reduction of energy used for heating is even more dramatic than it looks.

Even better, the energy we use is at least partly produced from renewable sources, so our carbon footprint is much lower. Previously we were directly releasing carbon by burning imported gas – now we use mostly UK-produced electricity, and as the grid decarbonises, our carbon footprint reduces even if we make no further changes.

How much money are we spending? (The bad news)

Excluding standing charges*, we are spending about one third more on energy than we were before. This is because electricity is so much more expensive than gas: our electricity costs 19p per kWh and our gas costs 4p per kWh.

* Note: our energy provider wanted to charge us £350 to remove our gas meter, so we refused, and are still paying the gas standing charge. I’m not sure how we’re going to resolve this, especially since our energy provider is now in administration.
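The effect of the price difference can be made concrete with a little arithmetic. The following is a rough sketch using only the two unit prices quoted above; it ignores standing charges and the electricity the household was already using for other purposes, and the monthly figures in the example are hypothetical, not our actual usage.

```python
ELECTRICITY_PENCE_PER_KWH = 19  # unit price quoted in the post
GAS_PENCE_PER_KWH = 4           # unit price quoted in the post

# How many kWh of gas cost the same as one kWh of electricity.
price_ratio = ELECTRICITY_PENCE_PER_KWH / GAS_PENCE_PER_KWH


def cost_change(gas_kwh_replaced, electricity_kwh_used):
    """Change in cost, in pence, when gas is replaced by electricity.

    A positive result means the electricity costs more than the gas did.
    """
    return (electricity_kwh_used * ELECTRICITY_PENCE_PER_KWH
            - gas_kwh_replaced * GAS_PENCE_PER_KWH)


print(price_ratio)

# Hypothetical month: 1000 kWh of gas replaced by 250 kWh of electricity,
# i.e., a 75% reduction in energy used for heating.
print(cost_change(1000, 250))
```

At these prices the ratio is 4.75, so the heat pump would have to use less than about 21% of the kWh that the gas boiler used just to break even on cost. In the hypothetical month above, a 75% reduction in energy still costs 750 pence more.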

The above chart is stacked, so the top line represents the total cost (excluding standing charges). When we switched to the air-source heat pump, our energy costs increased faster than they did the same time last year, and were consistently higher. We think the peak in November might be misleading as it may have been when the system was not set up correctly, but we are not sure.

Because air-source heat pumps are more efficient when the weather is warmer, we do expect to fare better in the summer than we are right now.

I would not suggest getting a heat pump if you want to save money. Maybe this will change as gas prices are expected to rise significantly this year.

An installation like ours, including new radiators, costs £10-15K. A decent chunk of that will be paid back to us by the government, spread out over the next 7 years, under the soon-to-be-gone Renewable Heat Incentive (RHI). RHI will be replaced by the Boiler Upgrade Scheme (BUS), which will be limited to a £5K grant for air-source heat pumps, although it is paid up-front. We would have received much less money under BUS than RHI. It is almost certainly too late for you to get a heat pump under RHI, by the way – all the installers are booked up until end of March 2022, when it ends.

Thoughts

If you think it’s surprising (and deeply concerning) that taking the step of significantly reducing our carbon footprint cost us a one-third increase in our energy bills, I would agree with you.

I am told that the tax taken on electricity is much higher than on gas, even though these taxes are apparently intended to help decarbonise our energy.

Meanwhile, the government is replacing (with great fanfare) RHI with the much less generous (although more timely) BUS, making it even more economically punishing to reduce your carbon footprint.

I think this should be addressed urgently: money should be provided to help people install heat pumps, and the tax regime should be changed to make it cheap to use low-carbon fuels.

The technology is available, but the financial situation makes this a vanity project for people like me who can afford it, instead of what it could be: a feasible plan to get our national carbon usage down, fast.

On a positive note, our house is nice and warm, and I feel a bit less guilty about how much carbon we’re using to keep it cosy.

Using Elliptic curve cryptography for TLS with Postfix, Dovecot and nginx

Timo Geusch from The Lone C++ Coder's Blog

I may have mentioned this before - I do run my own virtual servers for important services (basically email and my web presence). I do this mostly for historic reasons and also because I’m not a huge fan of using centralised services for all of the above. The downside is that you pretty much have to learn at least about basic security. Over the 20+ years I’ve been doing this, the Internet hasn’t exactly become a less hostile place.

Chi Chi Again – a.k.

a.k. from thus spake a.k.

Several years ago we saw that, under some relatively easily met assumptions, the averages of independent observations of a random variable tend toward the normal distribution. Derived from that is the chi-squared distribution which describes the behaviour of sums of squares of independent standard normal random variables, having means of zero and standard deviations of one.
In this post we shall see how it is related to the gamma distribution and implement its various functions in terms of those of the latter.
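The relationship is that a chi-squared distribution with k degrees of freedom is a gamma distribution with a shape of k/2 and a scale of 2. A minimal sketch of what implementing one in terms of the other can look like for the density function, using only the Python standard library; the function names are hypothetical and this is not the implementation from the post itself.

```python
import math


def gamma_pdf(x, shape, scale):
    """Density of the gamma distribution with the given shape and scale.

    For simplicity this sketch returns 0 for all x <= 0, which glosses
    over the boundary value at x == 0.
    """
    if x <= 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)
            / (math.gamma(shape) * scale ** shape))


def chi_squared_pdf(x, k):
    """Density of the chi-squared distribution with k degrees of freedom,
    expressed as a gamma density with shape k/2 and scale 2."""
    return gamma_pdf(x, shape=k / 2, scale=2)


print(chi_squared_pdf(2.0, 2))
```

With two degrees of freedom the density reduces to exp(-x/2)/2, which gives a simple check on the result.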

Can you keep Agile and OKRs separate?

Allan Kelly from Allan Kelly

“I’ve been told to keep agile and OKRs separate”

The first time I heard this I was surprised, “missed opportunity” I thought but then, as I thought about it more, the more I realised that it was impossible.

Start with the OKRs: OKRs are about deciding what to put your time and energy into. OKRs are about the big priorities for your organization and team. The more I’ve spent time with OKRs, the more I’ve come to see them as the management method rather than a management method among many. Let me caveat that lest it sound arrogant: management within an organization.

The management approach

There are many management approaches out there: strict time-and-motion where workers’ time is scheduled to the minute by experts; complete devolution giving employees free rein and managers (if they exist at all) only exist to coach. And there is everything in between, including project management which attempts to define the start and stop dates in advance. At this level OKRs are one management approach among many and organizations are free to choose which they follow.

Even combining traditional HR performance review processes with OKRs can lead to ruin. Once compensation is connected to OKRs people become incentivised to stay safe by setting OKRs which bring rewards, i.e. not ambitious ones that might be missed.

Running any other management method in tandem with OKRs risks jeopardising both. So if you choose OKR then follow it all the way, call it “Extreme OKRs” if you like.

Just try imagining agile as something separate to your OKRs: you set OKRs and then you run iterations. What are you delivering in the iteration? Surely iterations are delivering progress against OKRs?

I suppose you could have a backlog of work to do (Track A) and some OKRs to work on as well (Track B). Track A and B might lead to different places or represent different work to do. Leave aside potential conflict for a minute, think about how you divide your time.

More WIP, fewer results

Agile teaches that work in progress should be minimised, but now in this example there are two sanctioned work streams. Maybe we could ring-fence work: Agile in the morning, OKRs in the afternoon. I find it hard to see that working well.

Maybe A could be the main stream and B a “best efforts” / “spare time” stream. But, if both A and B are important then why should prioritisation be left to the worker? It smells a bit of leadership abdicating responsibility for prioritisation.

It is a fantasy to think that workers can focus on delivering the backlog and in their “spare time” deliver the OKRs. If your workers have copious amounts of spare time then something else is seriously wrong. It is easy to overload workers, and thereby create problems further down the line. People will burn out, goals will be missed, or goals will be met but with such poor quality that problems emerge later.

I can see how you can run OKRs without agile.

And I’ve long seen Agile working without OKRs.

But if you have both Agile and OKRs in the same company I just don’t see how Agile and OKRs can be separated. Conversely I can see how they can work well together – yes, I wrote a book on that.

If you are going to have OKRs and Agile in the same company then you need to consider them as one thing, not as two separate endeavours.

Photo by Jackson Simmer on Unsplash


The post Can you keep Agile and OKRs separate? appeared first on Allan Kelly.

Find My Tea: A technical journey through new product development (online 1st February 2022)

Paul Grenyer from Paul Grenyer

What: Find My Tea: A technical journey through new product development

When: Tuesday, February 1, 2022 7:00pm to 8:30pm (GMT)

Where:  SyncIpswich (online)

RSVP: https://www.meetup.com/SyncIpswich-Ipswichs-Tech-Startup-Community/events/281991960/

After what feels like an age I’m getting back into speaking and of course I’m speaking about Find My Tea! This time it’s technical!

As well as online with SyncIpswich I’m also doing the ACCU Conference, nor(DEV):con and one other:

    ACCU Conference - 4pm 8th April (Bristol) - 90min version
    nor(DEV):con - 24th & 25th June (Norwich)
    TBC - July

Find My Tea: A technical journey through new product development

There is more to having a great idea for an app than just building the app. You’re not only required to be a full stack developer (whatever that means), which doesn’t usually include the skills for building an app, you need to understand and be competent in ‘Ops’ (there’s really no such thing as DevOps) and the automated pipelines used for testing and deploying the app, its backend services and supporting applications. And there is so much to choose from!

In this session I will take you on the journey of discovery from having an idea, to choosing, rechoosing and choosing again the different technologies and platforms I used to build and release a new product from scratch.

This session will be focussed on the technology choices made and the reasoning and not on the product itself - although of course this will feature. This will include the mobile technology, the technology used for the web applications, backend services, hosting and development pipelines.

You can download the Find My Tea app here: https://findmytea.co.uk

 

Paul Grenyer

Husband, father, software engineer, metaller, Paul has been writing software for over 35 years and professionally for more than 20. In that time he has worked for and in all sorts of companies from two man startups to world famous investment banks and insurance companies. He has built and run three limited companies, none of which made him a millionaire and two of which threatened his sanity on more than one occasion.

Paul was a founding member of both SyncNorwich and Norfolk Developers, two of the most successful tech and startup based community groups in the East. He created and chaired the hugely successful Norfolk Developers Conference (nor(DEV):con) for seven years bringing in speakers and delegates in the sphere of software engineering from around the globe.

Paul is currently a Senior Software Engineer at Bourne Leisure, the owners of Haven caravan parks, and the founder of the tea finding app, Find My Tea. He loathes the word Entrepreneur, not least because he struggles to spell it and it reminds him of Del Boy from the 80s sitcom Only Fools and Horses. He sees Entrepreneurship as a side effect of the creative process of problem solving, rather than a career path in its own right.

Despite having dealt with the world of business from directors of the board down, Paul has kept both feet firmly on the ground even when his head has been in the clouds with healthy doses of Heavy Metal, Science Fiction and Formula One and long hair until it started falling out in 2013.

Oh, and he loves good tea too!