Printing press+widespread religious behavior: A theory

Derek Jones from The Shape of Code

The book The Weirdest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous provides an explanation of the processes which weakened the existing social ties of family and tribe; however, the emergence of WEIRD people (Western, Educated, Industrialized, Rich and Democratic) required new social norms to spread and be accepted throughout society. A major technical innovation, in the form of the printing press, provided the means for mass communication of ideas and practices.

David High-Jones’ book Wyclif’s Dust: Western Cultures from the Printing Press to the Present describes the social consequences of what he calls book religion; a combination of deeply religious western societies and the ability of individuals to write and sell affordable books (made possible by the printing press). Religion+printing press created the conditions for what High-Jones calls a hothouse culture, a period from the 1600s to the end of the 1800s.

Around 1440 the printing press is invented and quickly spreads; around 5 million books were handwritten in the 1400s, about 80 million books were produced in the first 50 years of printing, and around a billion in the 1700s. During the 1500s the Protestant reformation happens; Protestant encouraged its followers to read the Bible, which creates a demand for printed Bibles and the need to be able to read (which increases literacy rates). In England, between 1480-1640, 40% of published books were religious.

The changes to society’s existing norms are wrought by cultural transmission, initially via middle class parents making use of edifying books to teach their children moral values and social skills, later Sunday schools took on this role, but also had to offer reading lessons to attract members. In the adult world, accepted norms were maintained by social enforcement. The impact on western societies was widespread because observant religious behavior was widespread.

The original intent, of those writing the religious books, was the creation of a god fearing society. In practice, a trust based society was created, where workers might be relied upon not to shirk their duties and businessmen to not renege on agreements.

In the beginning science, in the form of printed technical books, rarely made an appearance. In the 1700s the Enlightenment happens, and scientific books are discussed by small collections of disparate individuals. The industrial revolution happens, but the bulk of the demand is for trustworthy workers; technical and scientific know how remains a minority interest.

In Part I of the book, High-Jones weaves a reading and convincing narrative. Part II, 1900 to today, is a tale of the crumbling and breakdown of the social forces and incentives that creates the trust based society; while example are enumerated, no overarching theory is proposed (I skimmed this part).

The Tims by Eleanor Farjeon

Tim Pizey from Tim Pizey

From The Little Bookroom by Eleanor Farjeon via Eldrbarry

There were five Tims all told, Old Tim, Big Tim, Little Tim, Young Tim and Baby Tim, and they were all born wise. So whenever something was the matter in the village, or anything went wrong, or people for some cause were vexed or sorry for themselves, it became their habit to say "Let's go to the Tims about it, they'll know, they were born wise."

For instance, when Farmer John found out that the gypsies had slept without leave in his barn one night, his first thought was not, as you might suppose, "I'll fetch the constable and have the law on them!" but, "I'll see Old Tim about it, so I will!"

Then off he went to Old Tim, who was eighty years old, and he found him sitting on a gate, smoking his clay pipe.

"Morning, Old Tim," said Farmer John. "Morning, Farmer John," said Old Tim taking his clay pipe from his mouth. "I've had gypsies in my barn again, Old Tim," said Farmer John "Ah, have you now!"said Old Tim. "Ay, that I have," said Farmer John. "Ah, to be sure!," said Old Tim. "You were born wise, Old Tim," said Farmer John. "What would you do if you was me?"

Old Tim put his clay pipe in his mouth again and said, "If I was you, I'd ask Big Tim about it, for he was born wise too and is but sixty years old. So I be twenty years further off wisdom than he be."

Then off he went to Big Tim, who was Old Tim's son, and he found him sitting on a gate, smoking his briar.

"Morning, Big Tim," said Farmer John. "Morning, Farmer John," said Big Tim taking his briar from his mouth. "I've had gypsies in my barn again, Big Tim, and Old Tim sent me to ask what you would do if you was me, for you were born wise" said Farmer John. Big Tim put his briar in his mouth again and said, "If I was you, I'd ask Little Tim about it, for he was born wise too and is but forty years old which is twenty year nigher to wisdom than me."

Then off he went to Little Tim, who was Big Tim's son, and he found him lying in a haystack chewing a straw.

"Morning, Little Tim," said Farmer John. "Morning, Farmer John," said Little Tim taking the straw from his mouth. Then Farmer John put his case again, saying, "Big Tim told me to come to you about it, for you were born wise".Big Tim put the straw back in his mouth again and said, "Young Tim was born wise too and he's but twenty years old. You'll get wisdom fresher from him than from me."

Then off went Farmer John to find Young Tim, who was Little Tim's son, and he found him staring into the mill pond, munching an apple.

"Morning, Young Tim," said Farmer John. "Morning, Farmer John," said Young Tim taking the apple from his mouth. Then Farmer John told his tale for the fourth time, and ended by saying, "Little Tim thinks you'll know what I'd best do, for you were born wise".Young Tim took a new bite of his apple and said, "My son who was born last month was born wise too, and from him you'll get wisdom at the fountain-head, so to say."

Off went Farmer John to find Baby Tim, who was Young Tim's son, and he found him in his cradle with his thumb in his mouth.

"Morning, Baby Tim," said Farmer John. Baby Tim took his thumb out of his mouth and said nothing. "I've had gypsies in my barn again, Baby Tim," said Farmer John, "and Young Tim advised me to ask your advice upon it, for you was born wise. What would you do if you was me? I'll do whatever you say."Baby Tim put his thumb back in his mouth and said nothing. So Farmer John went home and did it.

And the gypsies went on to the next village and slept in the barn of Farmer George, and Farmer George called in the constable and had the law on them; and a week later his barn and his ricks were burned down, and his speckled hen was stolen away.

But the happy village went on being happy and doing nothing, neither when the Miller's wife forgot herself one day and boxed the Miller's ears, nor when Molly Garden got a bad sixpence from the pedlar, nor when the parson once came home singing by moonlight. After consulting the Tims, the village did no more than trees do in a wood or crops in a field and so all these accidents got better before they got worse.

Until the day when Baby Tim died an unmarried man at one hundred years of age. After that the happy village became as other villages and did something.

A Review: Eversion by Alastair Reynolds

Paul Grenyer from Paul Grenyer

Eversion

by Alastair Reynolds
ISBN-13 ‏ : ‎ 978-0575090781

There’s little to no sci-fi in the first 30% of this book and you’d be forgiven for thinking Reynolds was just indulging himself in a seafaring romp. There’s only a hint of sci-fi up to about half way through, when it develops into groundhog day. I didn’t enjoy this. I put the book down for a week or two until I forced myself to pick it up again and finish it. I couldn’t put the second half down!

There isn’t the usual Alastair Reynolds scope, but that doesn’t matter as there is plenty of his trademark exploration and discovery and a fantastic plot twist I didn’t see coming. By far my favourite character is Ada for her sharp sense of humour and general attitude to life. The other characters are convincing and, for once, there’s a good ending, even if it is a little corny.

Shopper estimates of the total value of items in their basket

Derek Jones from The Shape of Code

Agile development processes break down the work that needs to be done into a collection of tasks (which may be called stories or some other name). A task, whose implementation time may be measured in hours or a few days, is itself composed of a collection of subtasks (which may in turn be composed of subsubtasks, and so on down).

When asked to estimate the time needed to implement a task, a developer may settle on a value by adding up estimates of the effort needed to implement the subtasks thought to be involved. If this process is performed in the mind of the developer (i.e., not by writing down a list of subtask estimates), the accuracy of the result may be affected by the characteristics of cognitive arithmetic.

Humans have two cognitive systems for processing quantities, the approximate number system (which has been found to be present in the brain of many creatures), and language. Researchers studying the approximate number system often ask subjects to estimate the number of dots in an image; I recently discovered studies of number processing that used language.

In a study by Benjamin Scheibehenne, 966 shoppers at the checkout counter in a grocery shop were asked to estimate the total value of the items in their shopping basket; a subset of 421 subjects were also asked to estimate the number of items in their basket (this subset were also asked if they used a shopping list). The actual price and number of items was obtained after checkout.

There are broad similarities between shopping basket estimation and estimating task implementation time, e.g., approximate idea of number of items and their cost. Does an analysis of the shopping data suggest ideas for patterns that might be present in software task estimate data?

The left plot below shows shopper estimated total item value against actual, with fitted regression line (red) and estimate==actual (grey); the right plot shows shopper estimated number of items in their basket against actual, with fitted regression line (red) and estimate==actual (grey) (code+data):

Left: Shopper estimated total value against actual, with fitted regression line; right: shopper estimated number of items against actual, with fitted regression line.

The model fitted to estimated total item value is: totalActual=1.4totalEstimate^{0.93}, which differs from software task estimates/actuals in always underestimating over the range measured; the exponent value, 0.93, is at the upper range of those seen for software task estimates.

The model fitted to estimated number of items in the basket is: itemsActual=1.8itemsEstimate^{0.75}. This pattern, of underestimating small values and overestimating large values is seen in software task estimation, but the exponent of 0.75 is much smaller.

Including the estimated number of items in the shopping basket, Nguess, in a model for total value produces a slightly better fitting model: totalActual=1.4totalEstimate^{0.92}e^{0.003itemsEstimate}, which explains 83% of the variance in the data (use of a shopping list had a relatively small impact).

The accuracy of a software task implementation estimate based on estimating its subtasks dependent on identifying all the subtasks, or having a good enough idea of the number of subtasks. The shopping basket study found a pattern of inaccuracies in estimates of the number of recently collected items, which has been seen before. However, adding Nguess to the Shopping model only reduced the unexplained variance by a few percent.

Would the impact of adding an estimate of the number of subtasks to models of software task estimates also only be a few percent? A question to add to the already long list of unknowns.

Like task estimates, round numbers were often given as estimate values; see code+data.

The same study also included a laboratory experiment, where subjects saw a sequence of 24 numbers, presented one at a time for 0.5 seconds each. At the end of the sequence, subjects were asked to type in their best estimate of the sum of the numbers seen (other studies asked subjects to type in the mean). Each subject saw 75 sequences, with feedback on the mean accuracy of their responses given after every 10 sequences. The numbers were described as the prices of items in a shopping basket. The values were drawn from a distribution that was either uniform, positively skewed, negatively skewed, unimodal, or bimodal. The sequential order of values was either increasing, decreasing, U-shaped, or inversely U-shaped.

Fitting a regression model to the lab data finds that the distribution used had very little impact on performance, and the sequence order had a small impact; see code+data.

Is this shampoo?

Jon Jagger from less code, more software

Mini-bottles in hotel bathrooms...
Am I the only one who can't read the writing on them?
I mean the font size is tiny. You can't read them.
There's just no way to tell which bottles are which.
Shampoo? Conditioner? Body lotion? Moisturiser? Shower gel?
It's unbelievable how many different bottles there are!
(But no toothpaste).
Even if you have your reading glasses on you still can't read the labels.
Not that you have your reading glasses on when you're about to take a shower!
Not at first anyway.
You get in the bathroom and you see the 18 mini-bottles!
And you think - "figgins - which one is shampoo?"
You pick one hoping the bottles in this hotel will be different.
That you'll be able to read the label.
But no. You can't.
So you walk back into your room to try and find your glasses.
Now you're buck naked, wearing nothing but your glasses, about to take a shower!
You start reading the mini-bottle labels.
The colour is #3422DE and the background colour is #3422DD.
They're so similar you cannot read anything!
It's just not physically possible.
A small defeat in todays modern world!


Setting the text selection in a browser: just use setBaseAndExtent

Andy Balaam from Andy Balaam's Blog

The Selection API is confusing and weird.

But, here’s what I’ve discovered: just use setBaseAndExtend, and when (rarely) needed, extend.

Summary

Every selection in a browser consists of:

  • an “anchor” – the beginning, where you started dragging, and
  • a “focus” – the end, where your stopped dragging, and where your cursor is

To select some text in a browser, find the from and to DOM nodes you want, and how far through them you want to be, and set the selection like this:

const sel = document.getSelection();
sel.setBaseAndExtent(from_node, from_offset, to_node, to_offset);

If you want just a cursor with nothing selected, use the same node and offset for both from and to.

The Internet has a lot of recipes that involve creating Ranges and adding them to the Selection, but there is no need to do that.

Finding the from and to nodes

When the locations are inside a text node in the DOM, it’s easy to find the from and to nodes – they are just the text nodes, and the offsets are how many UTF-16 code units through the text you want to be. [Not sure what a code unit is? See my video Interesting Characters].

“But isn’t your cursor always in a text node?”

No – for example, when I have two <br>s next to each other, and I want the cursor to be in between them (so it’s on the blank line). In this case there is no text node, and browsers handle it weirdly.

In Firefox, when my cursor is between two br nodes, the node is set to the tag (e.g. a div) that contains the brs, and the offset is the index of the second br node in the list.

Other browsers may well do it differently.

So do expect that the nodes may not be text nodes, and where that is the case, the offset will be the index of one of their children.

To select the blank line (between two brs) you can actually specify the br node directly, and give an index number of 0, and it works too. But, it does not match exactly what the browser sets when you click on the blank line, so I can’t guarantee it works exactly the same.

Backwards selections

The fly in the ointment is backwards selections: when you click and drag the mouse from right to left, the selection that is created in the browser is backwards: the anchor is on the right and the focus is on the left.

However, if you call setBaseAndExtent attempting to set up a selection like that, the focus and anchor will be swapped so that the anchor is on the left, and the focus is on the right.

But never fear! We can use extend to force the selection to be what we wanted:

const sel = document.getSelection();
sel.setBaseAndExtent(from_node, from_offset, from_node, from_offset);
sel.extend(to_node, to_offset);

Job done.

Go deeper

For more info, check out my demo page: Browser Selections Inventory.

Overview of broad US data on IT job hiring/firing and quitting

Derek Jones from The Shape of Code

Software developers are employed by organizations and people change jobs, either voluntarily or not; every year a new batch of people join the workforce, e.g., new graduates. Governments track employment activities for a variety of reasons, e.g., tax collection, and monitoring labour supply and demand (for the purposes of planning).

The US Bureau of Labor Statistics’ publishes a monthly summary of their Job Openings and Labor Turnover Survey. What can be learned about software development employment from this data (description)?

The data starts in December 2000, with each row contains a monthly count of Job Openings, Hires, Quits, Layoffs and Discharges, and totals, along with one of 21 major non-farm industry codes or one of the 5 government codes (the counts are broken out by State). I’m guessing that software developers are assigned the Information code (i.e., 510000), but who is to say that some have not been classified with the code for, say, Construction or Education and health services. The Information code will cover a lot more than just software developers; I’m trading off broad IT coverage for monthly details on employment turnover (software developer specific information is available, but it comes without the turnover information). The Bureau of Labor Statistics make available a huge quantity of information, and understanding how it all fits together would probably require me to spend several months learning my way around (I have already spent a week or two over the years), so I’m sticking with a prebuilt dataset.

The plot below shows the aggregated monthly counts (i.e., all states) of Job Openings, Hires, Quits, Layoffs and Discharges for the Information industry code (code+data):

Aggregated monthly counts of Job Openings, Hires, Quits, Layoffs and Discharges for the Information industry code, from 2000 to 2022.

The general trend follows the ups and downs of the economy, there is a huge spike in layoffs in early 2020 (the start of COVID), and Job Openings often exceeding Hires (which I did not expect).

These counts have the form of a time-series, which leads to the questions about repeating patterns in the sequence of values? The plot below shows the autocorrelation of the four employment counts (code+data):

Autocorrelation of Job Openings, Hires, Quits, Layoffs time series for the Information code.

The spike in Hires at 12-months is too large to be just be new graduates entering the workforce; perhaps large IT employers have annual reviews for all employees at the same time every year, causing some people to quit and obtain new jobs (Quits has a slightly larger spike at 12-months). Why is there a regular 3-month cycle for Job Openings? The negative correlation in Layoffs at one & two months is explained by companies laying off a batch of workers one month, followed by layoffs in the following two months being lower than usual.

I don’t know much about employment practices, so I won’t speculate any more. Comments welcome.

Are there any interest cross-correlations between the pairs of time-series?

The plot below shows four pairs of cross correlations (code+data):

Cross correlation between the pairs of time series Hires/Layoffs, Quits/Layoffs, Job Openings/Hires, and Hires/Quits.

Hires & Layoffs shows a scattered pattern of Hires preceding Layoffs (to be expected), and the bottom left shows there is a pattern of Quits preceding Layoffs (are people searching for steadier employment when layoffs loom?). Top right shows a pattern of Job Openings following Hires (I’m clutching at straws for this; is Hires a proxy for Quits, the cross correlation of Job Openings & Quits does have Job Openings leading), the bottom right shows the pattern of Hires leading Quits.

Nothing in this analysis surprised me, but then it is rather basic and broad brush. These results are the start of an analysis of the IT employment ecosystem; one that probably won’t progress far because of a lack of data and interest on my part.

ARD & Winterfyleth at the Bread Shed

Paul Grenyer from Paul Grenyer

ARD

Following a nightmare which is parking in Manchester and getting a meal on a Saturday night without booking, we walked into the Bread Shed just as ARD were getting going, minus my vinyl and CDs for signing!. The masterpiece which is Take Up My Bones was instantly recognisable, as were composer and multi instrumentalist Mark Deeks and fellow Winterfyleth band mate Chris Naughton, both on guitar. The latter was centre stage, where surely Deeks should have been?

From the off the band, who were put together to perform an album which was never intended to be performed live, were a little loose with the drums too prominent and the guitars not clear enough. There appeared to be a lot retuning necessary, especially from Chris and the lead guitarist who appeared hidden away a lot of the time. This didn’t really detract from enjoyment of the incredible compositions from the album.  By the time the final 10 minutes, consisting of Only Three Shall Know, came along something had changed, the band was as tight as anything and I wished they could have started again from the beginning. 45 minutes had flown by and I’ll certainly go and see them again.

Winterfyleth


I think I’ve seen Winterfyleth four times now, including the set which became their live album recorded at Bloodstock and earlier this year supporting Emperor at Incineration Fest. They never disappoint.

Winterfyleth are one of those bands that are so consistent with their music, without being boring or repetitive, that it doesn’t matter what they play or how familiar I am with the songs, it’s just incredible to listen to. Having said that, disappointingly, they didn’t play A Valley Thick With Oaks, which is my favourite. Who can resist singing along “In the heart of every Englishman…”? However, I did come away with a new favourite in Green Cathedral!

We only got an hour, but at least they didn’t bugger about going off and coming back for an encore. There were old songs, new songs and never before played live songs. Loved every second of it and, for the first time for me, the final song wasn’t preceded with “Sadly time is short and our songs are long, so this is our last one.” Until next time!

 

Optimal sizing of a product backlog

Derek Jones from The Shape of Code

Developers working on the implementation of a software system will have a list of work that needs to be done, a to-do list, known as the product backlog in Agile.

The Agile development process differs from the Waterfall process in that the list of work items is intentionally incomplete when coding starts (discovery of new work items is an integral part of the Agile process). In a Waterfall process, it is intended that all work items are known before coding starts (as work progresses, new items are invariably discovered).

Complaints are sometimes expressed about the size of a team’s backlog, measured in number of items waiting to be implemented. Are these complaints just grumblings about the amount of work outstanding, or is there an economic cost that increases with the size of the backlog?

If the number of items in the backlog is too low, developers may be left twiddling their expensive thumbs because they have run out of work items to implement.

A parallel is sometimes drawn between items waiting to be implemented in a product backlog and hardware items in a manufacturer’s store waiting to be checked-out for the production line. Hardware occupies space on a shelf, a cost in that the manufacturer has to pay for the building to hold it; another cost is the interest on the money spent to purchase the items sitting in the store.

For over 100 years, people have been analyzing the problem of the optimum number of stock items to order, and at what stock level to place an order. The economic order quantity gives the optimum number of items to reorder, Q (the derivation assumes that the average quantity in stock is Q/2), it is given by:

Q=sqrt{{2DK}/h}, where D is the quantity consumed per year, K is the fixed cost per order (e.g., cost of ordering, shipping and handling; not the actual cost of the goods), h is the annual holding cost per item.

What is the likely range of these values for software?

  • D is around 1,000 per year for a team of ten’ish people working on multiple (related) projects; based on one dataset,
  • K is the cost associated with the time taken to gather the requirements, i.e., the items to add to the backlog. If we assume that the time taken to gather an item is less than the time taken to implement it (the estimated time taken to implement varies from hours to days), then the average should be less than an hour or two,
  • h: While the cost of a post-it note on a board, or an entry in an online issue tracking system, is effectively zero, there is the time cost of deciding which backlog items should be implemented next, or added to the next Sprint.

    If the backlog starts with n items, and it takes t seconds to decide whether a given item should be implemented next, and f is the fraction of items scanned before one is selected: the average decision time per item is: avDecideTime={f*n*(f*n+1)/2}*t seconds. For example, if n=50, pulling some numbers out of the air, f=0.5, and t=10, then avDecideTime=325, or 5.4 minutes.

    The Scrum approach of selecting a subset of backlog items to completely implement in a Sprint has a much lower overhead than the one-at-a-time approach.

If we assume that K/h==1, then Q=sqrt{2*1000}=44.7.

An ‘order’ for 45 work items might make sense when dealing with clients who have formal processes in place and are not able to be as proactive as an Agile developer might like, e.g., meetings have to be scheduled in advance, with minutes circulated for agreement.

In a more informal environment, with close client contacts, work items are more likely to trickle in or appear in small batches. The SiP dataset came from such an environment. The plot below shows the number of tasks in the backlog of the SiP dataset, for each day (blue/green) and seven-day rolling average (red) (code+data):

Tasks waiting to be implemented, per day, over duration of SiP projects.