Time taken to compile a source file

Derek Jones from The Shape of Code

How long will it take to compile a source file?

When computers were a lot slower than they are today, this question was of general interest. Job scheduling is more effective when reliable runtime estimates are available, and developers want to know if there is enough time to get a coffee before the compile finishes.

An embarrassing fact about compile-time performance used to be that a large percentage of compile time was spent doing lexical analysis [“The cost of lexical analysis”, I cannot find an online copy]. Why was this embarrassing? Compiler writers like to boast about all the fancy optimizations their compiler does; but doing fancy stuff consumes lots of resources, so why were compilers spending so much of their time on something as simple as lexical analysis? The reality was that fancy compiler optimizations were not commercially viable until developer computers contained tens of megabytes of memory, i.e., very few pre-1990 compilers did any real optimization (people are still fussing over lexer performance).

An analysis of the data in Captain Dennis Miller’s Master’s thesis (late Rome period) finds that compile time is proportional to the square root of the number of tokens in the source (code+data); more complicated models are a slightly better fit. Where did square root come from? I expected a linear relationship, but would have been willing to go with log. The measurements are from Ada compilers in the mid 1980s. I know several people who worked on Ada compilers during that time, and they were implementing the latest fancy optimizations (Ada was going to be the next big thing and the venture capital was flowing; big companies, with big computers, were going to be paying lots of money to use Ada, but then microcomputers came along). I think the square root is driven by OS resource limitations: the compilers were using lots of memory, and a noticeable amount of time was spent swapping.
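
To make the comparison concrete, here is a minimal sketch (not the analysis from the thesis; the token counts and timings below are invented) of fitting a square-root model and a linear model to compile-time measurements and comparing the residual sums of squares:

    # Sketch: fit time ~ a*sqrt(tokens)+b and time ~ a*tokens+b to
    # hypothetical (token count, seconds) measurements and compare fits.
    import numpy as np
    from scipy.optimize import curve_fit

    tokens = np.array([500, 1200, 3400, 7800, 15000, 32000], dtype=float)
    seconds = np.array([1.1, 1.9, 3.2, 4.6, 6.5, 9.8])  # made-up timings

    def sqrt_model(x, a, b):
        return a * np.sqrt(x) + b

    def linear_model(x, a, b):
        return a * x + b

    for name, model in [("sqrt", sqrt_model), ("linear", linear_model)]:
        params, _ = curve_fit(model, tokens, seconds)
        rss = np.sum((seconds - model(tokens, *params)) ** 2)
        print(f"{name}: a={params[0]:.4f} b={params[1]:.4f} rss={rss:.3f}")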

So computers got a lot faster and people lost interest in estimates of how long it would take to compile individual files. I have not seen any interest in predicting how long it would take to compile whole projects (just complaints about how long it takes). There has been some work on progress indicators, updated as compilation progresses, which is a step in the right direction. Perhaps somebody has recorded compile time information and thrown machine learning at it; I usually ignore machine learning papers applied to software engineering and perhaps I have missed something. Pointers to project compile time prediction work welcome.

Then along came just-in-time compilation. Now people want to estimate how long it will take to generate machine code from some intermediate form that is being interpreted.

The plot below (thanks to Rafael Auler for kindly supplying the data from his paper) shows the time taken to generate code from functions containing a given number of LLVM instructions (an intermediate code), at optimization level O3. The red line is a regression fit to one of the ‘arms’, showing roughly constant time below 100-ish instructions and a linear relationship above that. I have no idea why the time is roughly constant for such a large number of functions.

Time taken to convert functions containing a given number of LLVM instructions to machine code

There is a lot of variation between functions containing the same number of instructions. This is to be expected when lots of different optimizations are being tried; sometimes a function will contain lots of the kind of code that a particular optimization spends a lot of time processing, and sometimes the code will not contain anything interesting (i.e., no optimizations are found).
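
For readers who want to play with the shape of that red line, the following is a sketch (using invented instruction counts and timings, not Auler’s data) of fitting a ‘flat below a knee, linear above it’ model:

    # Sketch: fit a hinge model, constant below a knee point and linear
    # above it, to hypothetical (LLVM instruction count, seconds) pairs.
    import numpy as np
    from scipy.optimize import curve_fit

    insns = np.array([10, 30, 60, 90, 150, 300, 600, 1200, 2500], dtype=float)
    secs = np.array([0.020, 0.021, 0.020, 0.022, 0.031, 0.058, 0.118, 0.242, 0.505])

    def hinge(n, base, slope, knee):
        # constant 'base' below the knee, linear growth above it
        return base + slope * np.maximum(n - knee, 0.0)

    params, _ = curve_fit(hinge, insns, secs, p0=[0.02, 0.0002, 100.0])
    print("base=%.4f seconds, slope=%.6f seconds/instruction, knee=%.0f instructions"
          % tuple(params))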

Find out how the employer voice can influence technical & vocational training.

Paul Grenyer from Paul Grenyer


When: Monday 2nd July @ 5.30pm

Where: The Library Restaurant, 4A Guildhall Hill, Norwich NR2 1JH

RSVP: https://www.meetup.com/Norfolk-Developers-NorDev/events/251982755/

How the employer voice can influence technical & vocational training and apprenticeships in the digital and IT skills sector over the next 5 years.

Jerry White - Deputy Principal at City College Norwich

The Government’s agenda for the wider field of technical and vocational education alongside apprenticeships is changing. As a large FE College and apprenticeship provider we believe that employers are central to this process and our aim this evening is to start meaningful dialogue into how we can enable employers to both support and challenge us as we launch the new T Level curriculum and apprenticeship standards. We want to hear from you about the challenges you have in recruiting employees and how we can work together to fill the gap and make a difference to the next generation.

The goal of the session is to begin to investigate where local businesses could help us trial the industrial placement model, how we can develop our working relationship over a number of years, and to ask you, as employers, what our curriculum should look like.

Undergraduates and learning to program

Derek Jones from The Shape of Code

I last looked at the research on teaching programming around 10 years ago and I have been catching up with what has been going on; in brief: same old, same old. One of the best papers on the subject is still: Language-independent conceptual “Bugs”

The research activity is still focused on making the tools and languages ‘better’. There is a deafening silence on the possibility that those doing the teaching could not teach their way out of a paper bag. Nobody is brave enough to suggest that teacher training might be a worthwhile investment, or that lectures oriented to what is useful (rather than what the lecturer finds interesting) would be appreciated by students.

I have always thought that researching the teaching of programming had no practical purpose, other than possibly helping universities increase the number of students graduating with computing degrees (some universities are solving the problem students have with programming by offering degrees that don’t involve being able to program). I still think that teaching programming to school children is at best a waste of time.

My experience with students learning to program is from a very long time ago. The process involved listening to confusing and disjointed lectures, reading books, and figuring out what worked by trial and error. Students were not taught to program, they got thrown in at the deep end and were expected to survive. Anybody who could handle this stood some chance of being able to handle developing software in the ‘real world’; universities were (accidentally) graduating people with the skills industry needed. However, these days universities are supposed to be customer focused, and what industry needs is irrelevant (my experience of sitting on departmental industry panels is that the head of department tells us what they are thinking of doing {i.e., new courses for which there will be lots of paying students} and we try to talk him/her out of the sillier ideas); too many fee-paying students find programming too hard, so let’s offer computing degrees that don’t require any programming.

Would you hire a recent graduate, for a development role, who had trouble figuring out how to fix syntax errors in their code? Surely, the minimum requirement is somebody who gets some pleasure from coding, even if they don’t want to spend lots of time doing it.

There is a shortage of software developers and flying turkeys are still with us.

Best practices considered harmfull

Allan Kelly from Allan Kelly Associates

I’ve long worried about “Best Practices”. Sure, I usually play along at the time, but lurking in the back of my mind, waiting for a suitable opportunity, are two questions:

  • Who decided this was best practice?
  • Who says this practice can’t be bettered?

I was once told by someone from the oil industry that it was common for contracts to specify that “best practice” should be used. But seldom was the actual practice specified. Instead, each party to the contract would interpret best practice as they wished, until something went wrong. At that point, after an accident, after money was lost, they would go to court and a judge would decide what was best practice.

Sure, practice X might be the best known way of doing things at the moment, but how much better could it be? By declaring something “best practice” you can be self-limiting, potentially preventing innovation.

Now a piece in MIT Sloan Management Review (Why Best Practices Often Fall Short, Jérôme Barthélemy, February 2018) adds to the debate and highlights a few more problems.

Just for openers, sometimes people misidentify the practice that is creating the benefits. Apparently some people looked at Pixar animation and decided that having rest rooms (toilets to us English speakers) in the centre of an office floor enhances creativity. They might do, but there is so much else happening at Pixar that moving all the toilets in your organization will probably make no difference at all.

But it is worse than that.

Adopting best practice from elsewhere does not mean it will be best practice in your environment, but adopting that “best practice” will be disruptive. Think of all the money you will need to spend relocating the toilets, all the people who will be upset by a desk move they don’t want, all the lost productivity while the work is going on.

The author suggests that in some cases the disruption costs are so high that the “best practice” will never cover the costs of the change. Organizations are better off shunning the best practice and carrying on as they are. (ERP anyone?)

It gets worse.

There is risk in those best practices. Risk that they will cost more, risk that they won’t be implemented correctly and risk that they will backfire. What was best practice at one organization might not be best practice in yours. (Which might imply you need even more change, even more disruption at even more cost.)

In fact, some best practices – like stock options for executives – can go horrendously wrong and induce behaviours you most definitely don’t want.

So what is a poor company to do?

Well, the author suggests something that does work: copying good practices. Not best but “just OK”. That works. Copy the mundane stuff, the proven stuff. The costs and risks of a big change are avoided. (This sounds a bit like In Search of Mediocracy.)

In my world that means you want to be getting better at doing Agile instead of trying to leapfrog Agile and move to DevOps in one bound.

The author also suggests that where your competitive advantage is concerned, keep your cards close to your chest. Do things yourself. Work out what your best practice is, work out how you can improve yourself.

I’ve long argued that I want teams to learn and learn for themselves rather than have change done to them. But I also want teams to steal. When they see other teams – at home or elsewhere – doing good things they should steal practices. The important thing from my point of view is for the teams to decide for themselves.


Learn WordPress & build a website in ONE day

Paul Grenyer from Paul Grenyer


When: 28 June 2018, 9am to 4.45pm

Where: The Kings Centre, Kings Street, Norwich, NR1 1PH

How much: £150

RSVP: https://www.meetup.com/Norfolk-Developers-NorDev/events/250241910/

WordPress is the world’s best and most popular website builder and this hands-on course takes you through from the basics, including installation and set up, to cover all the most useful features and tools WordPress offers. Whether you already have a site and want to manage it properly or are starting completely from scratch, this is the course for you.

You will learn to

  • set up and run a great website of your own
  • add content, images and videos
  • add structure and navigation menus
  • apply an attractive design using easy templates
  • make the site search engine friendly
  • add contact forms, maps and take payments
  • add social networking and track visitors
  • learn to add all the features and functionality you need to run and develop a fantastic website
  • and much, much more…


How the course works


  • Please bring your own laptop: PC, Mac or Chromebook, any is fine. Or you can hire a laptop for the day here.
  • WiFi and power sockets are provided
  • No experience is needed – WordPress is incredibly easy to pick up and you will be free to go at your own pace throughout the day.
  • All training materials will be provided after the course, so there need be no fear of “falling behind”.
  • This is an intensive course and assumes a reasonable working knowledge of using computers and the internet, even if you have little or no prior knowledge of WordPress. If you are comfortable with using email, copy/paste, saving files/folders and navigating the internet, you should be fine! (see more advice in our FAQ here)
  • After the course you are welcome to stay around for further discussion with your trainer Toby and with each other, about WordPress, about your website and about your business.
  • After the course, you will be sent all the course materials and clear instructions for setting up your site on its own domain name (old or new). You will have a year of free hosting, after which time it is from just £8/mo for unlimited space and bandwidth.


More details here (https://wpcourses.co.uk/wordpress-training-courses/?gclid=EAIaIQobChMI0bSL8tTR1wIVSjobCh2A9gVdEAAYASAAEgJLnfD_BwE).

Number of parameters vs. accessing globals

Derek Jones from The Shape of Code

I spend a lot of time looking at software engineering data, asking, what is the story here?

In a previous post I suggested that the distribution of the number of functions defined to have a given number of parameters might be a signature of developer beliefs about the relative cost of parameter passing vs. accessing globals.

Looking at the data that Iran Rodrigues Gonzaga Junior made available (good man), as part of his thesis Empirical Studies on Fine-Grained Feature Dependencies, I saw it contained information about the number of parameters in a function definition and whether functions accessed a global (Gonzaga’s research question is in another direction; I am always repurposing data).

Are functions that access globals defined with fewer parameters than those that do not contain any such access? The plot below shows a count of the number of functions defined to have a given number of parameters, for four systems written in C; the solid lines are functions that did not access globals, the dashed lines are functions that accessed globals (code+data).

Number of functions defined to have a given number of parameters; four systems, written in C

Over all 50 projects measured, functions that don’t access globals are defined, on average, to have an extra 0.7 parameters (the fitted Poisson regression models are better than a poke in the eye {i.e., the distribution is not really Poisson}, but it’s more informative to look at the plotted data).
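
For anyone wanting to repeat this kind of comparison on their own measurements, here is a sketch of the Poisson regression (it assumes pandas and statsmodels are available, and uses a small invented per-function table rather than Gonzaga’s data):

    # Sketch: Poisson regression of parameter count on a global-access flag,
    # fitted to a small hypothetical per-function table.
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    funcs = pd.DataFrame({
        "num_params":  [0, 1, 2, 3, 4, 1, 0, 2, 5, 3],  # invented values
        "uses_global": [1, 1, 0, 0, 0, 1, 1, 0, 0, 1],
    })

    fit = smf.glm("num_params ~ uses_global", data=funcs,
                  family=sm.families.Poisson()).fit()
    # A negative uses_global coefficient corresponds to the pattern above:
    # functions that access globals tend to have fewer parameters.
    print(fit.summary())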

There is a lot of variation between projects (I picked these four because they were the larger projects and showed variation in behaviors). While the shape of the distributions varies a lot, there is always a noticeable difference in the mean.

Is this difference between projects a difference in developer beliefs, a difference in application requirements, a difference in developer coding habits (and parameter usage is a side effect; are there really that many getters and setters)?

I was hoping for a simple answer, and could not find one. Since I am writing a book and not researching individual issues in detail, it’s time to move on.

Ideas welcome.

Further On Natural Analogarithms – student

student from thus spake a.k.

My fellow students and I have of late been thinking upon an equivalence between the roots of rational numbers and an infinite dimensional rational vector space, which we have named -space, that we discovered whilst defining analogues of logarithms that were expressed purely in terms of rationals.
We were particularly intrigued by the possibility of defining functions of such numbers by applying linear algebra operations to their associated vectors, which we began with a brief consideration of that given by their magnitudes. We have subsequently spent some time further exploring its properties and it is upon our findings that I shall now report.
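
As a reader’s sketch of the construction being alluded to (my reconstruction, not taken from the post itself: I am assuming the vectors are the rational exponents appearing in a number’s prime factorisation, with the magnitude being the usual Euclidean norm):

    \[
      x \;=\; \prod_i p_i^{\,n_i}, \quad n_i \in \mathbb{Q}
      \;\;\longmapsto\;\;
      \mathbf{n} = (n_1, n_2, n_3, \dots),
      \qquad
      |\mathbf{n}| \;=\; \sqrt{\textstyle\sum_i n_i^2}
    \]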