macOS Time Machine is usually set up to work in the background and not overly affect anything that’s going on in the foreground while the user is working. Under normal circumstances, this is desirable behaviour. It is not desirable when you try to take one last backup of a failing SSD before it keels over […]
I hate the “building software is like building buildings” metaphor. For a start, most software people know so little about physical building that they are comparing the messy world of software construction with an idealised image of what they think happens in construction.
The above is a picture from my back garden last week. Only a couple of weeks ago the back of this house looked much like the back of my house, except it had a small, ugly, 2m x 1m kitchen extension on it. And it had an overgrown garden, a third of which houses this big extension while the rest of it is full of building debris.
The builders moved into this house less than six weeks ago. During that time they have: cleared the garden, cleared the house of remaining furniture, dug foundations, concreted foundations, built a two-storey side and back extension, ripped the roof off both this house and the adjoining house and replaced the roof while also constructing loft conversions on both houses. The roof came off on Monday and the new one was done by Saturday.
Builders – roofers and brick layers – go fast. Three years ago I had some work done on my house. I was well impressed with the speed the builders ripped the back rooms apart, tore down a wall, bricked up a door, and built a new wall.
Then it goes slow.
Once the actual “building” is done the wiring, plumbing, plastering, decorating and other fitting-out starts and oh my, it goes so slow. Hold ups are regular as parts and people are not available when needed. There is a constantly changing cast of workers as specialists appear for a few days and then disappear.
I expect it will be four or five months before anyone is living in my neighbour’s house again.
During that time it goes slow: work people come in, but it is hard to see progress. And the progress is far less clear than when a wall disappears, or a new wall comes into being overnight.
And so it is in software… usually.
The thing about construction – both physical and software – is that so often the most visible bits are actually among the quickest and most straightforward. These are the bits where not only can progress be made rapidly but visibly too.
But lots of time is spent on less visible bits – the tricky bits. The Devil is in the Detail as the saying goes.
When people say “How can it possibly take so long?” they are speaking from the experience of seeing the visible bits happen fast. They don’t know what is happening during the less visible bits, where the time is consumed, and have few memories to guide them – but they do remember it went fast last month!
When my house was being worked on I often felt like saying “why isn’t it done yet?”. The big changes, the heavy lifting, were done. How could all this other stuff – the minor bits, the bits which didn’t require actual building – take so long?
Fortunately, software doesn’t need to be like this. There are alternatives. But working in such ways means slowing down the early stages to deal with the detail as you go. Overall you get a more consistent (and sustainable) pace, and may well finish sooner, but you lose the quick start that impresses people. Almost from day-1 they start complaining about the lack of progress.
Not being a builder I don’t know, but I do know I hired experts and I have to trust them. And I never tried to say to them “If this was software you would be done by now.”
(Now I’ve written this blog I remember I said something similar two years ago… Heavy Lifting is the Easy Bit. Worth saying again, I think.)
The post Building software is not like building houses (thankfully) appeared first on Allan Kelly Associates.
One very simple and effective technique, used by scientists and engineers to check whether an equation makes sense, is dimensional analysis. The basic idea is that when performing an operation between two variables, their measurement units must be consistent; for instance, two lengths can be added, but a length and a time cannot be added (a length can be divided by time, returning distance traveled per unit time, i.e., velocity).
Let’s run a dimensional analysis check on the Halstead equations.
The input variables to the Halstead metrics are: $\eta_1$, the number of distinct operators, $\eta_2$, the number of distinct operands, $N_1$, the total number of operators, and $N_2$, the total number of operands. These quantities can be interpreted as units of measurement in tokens.
The formulae are:
- Program length: $N = N_1 + N_2$
There is a consistent interpretation of this equation: operators and operands are both kinds of tokens, and number of tokens can be interpreted as a length.
- Calculated program length: $\hat{N} = \eta_1 \log_2 \eta_1 + \eta_2 \log_2 \eta_2$
There is a consistent interpretation of this equation: the operand of a logarithm has to be dimensionless, and the convention is to treat the operand as a ratio (if no denominator is specified, the value 1 is taken), the value returned is dimensionless, which can be multiplied by a variable having any kind of dimension; so again two (token) lengths are being added.
- Volume: $V = N \times \log_2 \eta$, where $\eta = \eta_1 + \eta_2$
A volume has units of $length^3$ (i.e., it is created by multiplying three lengths). There is only one length in this equation; the equation is misnamed, it is a length.
- Difficulty: $D = \frac{\eta_1}{2} \times \frac{N_2}{\eta_2}$
Here the dimensions of $\eta_1$ and $\eta_2$ cancel, leaving the dimensions of $N_2$ (a length); now Halstead is interpreting length as a difficulty unit (whatever that might be).
- Effort: $E = D \times V$
This equation multiplies two variables, both having a length dimension; the result should be interpreted as an area. In physics work is force times distance, and power is work per unit time; the term effort is not defined.
Halstead is claiming that a single dimension, program length, contains so much unique information that it can be used as a measure of a variety of disparate quantities.
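The dimensional bookkeeping above can be mechanised. The following is a minimal sketch, not anything taken from Halstead or the Purdue report: it assumes the usual textbook forms of the formulas, and the names `Dim`, `TOKENS`, `PURE` and `log2` are made up for illustration. Each quantity is reduced to the power of "token length" it carries, and addition is rejected when the powers differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dim:
    """A dimension, stored as a power of token length (0 = dimensionless)."""
    power: int

    def __add__(self, other):
        # Only quantities with the same dimension may be added.
        if self.power != other.power:
            raise TypeError(f"cannot add token^{self.power} to token^{other.power}")
        return Dim(self.power)

    def __mul__(self, other):
        # Multiplying quantities adds their powers.
        return Dim(self.power + other.power)

    def __truediv__(self, other):
        # Dividing quantities subtracts their powers.
        return Dim(self.power - other.power)


def log2(operand):
    """The operand is treated as a ratio, so the result is dimensionless."""
    return Dim(0)


TOKENS = Dim(1)  # a count of tokens, i.e. a length (hypothetical name)
PURE = Dim(0)    # a plain number such as the constant 2 (hypothetical name)

# All four inputs are counts of tokens.
eta1 = eta2 = n1 = n2 = TOKENS

length = n1 + n2
calculated_length = eta1 * log2(eta1) + eta2 * log2(eta2)
volume = length * log2(eta1 + eta2)
difficulty = (eta1 / PURE) * (n2 / eta2)
effort = difficulty * volume

for name, dim in [("length", length), ("calculated length", calculated_length),
                  ("volume", volume), ("difficulty", difficulty), ("effort", effort)]:
    print(f"{name}: token^{dim.power}")
```

Under these assumptions everything except effort comes out as a plain length, and effort comes out as an area, which is the mismatch between names and dimensions described above.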
Halstead’s colleagues at Purdue were rather damning in their analysis of these metrics. Their report Software Science Revisited: A Critical Analysis of the Theory and Its Empirical Support points out the lack of any theoretical foundation for some of the equations, that the analysis of the data was weak and that a more thorough analysis suggests theory and data don’t agree.
I pointed out in an earlier post that people use Halstead’s metrics because everybody else does. This post is unlikely to change existing herd behavior, but it gives me another page to point people at, when people ask why I laugh at their use of these metrics.
The ISO C Standard is currently being revised by WG14, to create C2X.
There is a rather nebulous clustering of people who want to stop compilers using undefined behaviors to generate what these people (and probably most other developers) consider to be very surprising code. For instance, always printing p is truep is false, when executing the code:
    bool p;
    if ( p ) printf("p is true");
    if ( !p ) printf("p is false");
(possible because p is uninitialized, and accessing an uninitialized value is undefined behavior).
This sounds like a good thing; nobody wants compilers generating surprising code.
All the proposals I have seen, so far, involve doing away with constructs that can produce undefined behavior. Again, this sounds like a good thing; nobody likes undefined behaviors.
The problem is, there is a reason for labeling certain constructs as producing undefined behavior; the behavior is who-knows-what.
Now the C Standard could specify the who-knows-what behavior; for instance, it could specify that the result of dividing by zero is 42. Standard-conforming compilers would then have to generate code to check whether the denominator was zero, and return 42 for this case (until Intel, ARM and other processor vendors ‘updated’ the behavior of their divide instructions). Way-back-when, a design decision was made that the behavior of divide by zero is undefined, not 42 or any other value; code efficiency and compactness were considered to be more important.
I have not seen anybody arguing that the behavior of divide by zero should be specified. But I have seen people arguing that once C’s integer representation is specified as being two’s complement (currently it can also be ones’ complement or signed-magnitude), then arithmetic overflow becomes defined. Wrong.
Two’s complement is a specification of a representation, not a specification of behavior. What is the behavior when the result of adding two integers cannot be represented? The result might be to wrap (the behavior expected by many developers), to saturate at the maximum value (frequently needed in image and signal processing), to raise a signal (overflow is not usually supposed to happen), or something else.
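To make the distinction concrete, here is a small illustrative sketch that models a 32-bit two’s complement integer and applies three different overflow policies to the same addition. It is written in Python purely as a model, since Python integers never overflow; the 32-bit width and the function names `add_wrap`, `add_saturate` and `add_trap` are assumptions made for the example, and are not proposals from WG14 or behavior of any particular compiler.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # assumed 32-bit two's complement range


def add_wrap(a, b):
    """Keep the low 32 bits, then reinterpret them as a signed value."""
    bits = (a + b) & 0xFFFFFFFF
    return bits - 2**32 if bits > INT_MAX else bits


def add_saturate(a, b):
    """Clamp the mathematical result to the representable range."""
    return max(INT_MIN, min(INT_MAX, a + b))


def add_trap(a, b):
    """Refuse to produce a result that cannot be represented."""
    total = a + b
    if not INT_MIN <= total <= INT_MAX:
        raise OverflowError("result not representable in 32 bits")
    return total


# Same operands, same representation, three different behaviors.
print(add_wrap(INT_MAX, 1))
print(add_saturate(INT_MAX, 1))
try:
    add_trap(INT_MAX, 1)
except OverflowError as err:
    print("trap:", err)
```

All three functions agree whenever the sum is representable; they only differ on overflow, which is exactly the case that fixing the representation leaves open.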
WG14 could define the behavior for when the result of an arithmetic operation is not representable in the number of bits available. Standard-conforming compilers targeting processors whose arithmetic instructions did not behave as required would have to generate code, for any operation that could overflow, to do what was necessary. The embedded market is a heavy user of C; in this market memory is limited and processor performance is never fast enough, so the overhead of supporting a defined behavior could just be too high (a more attractive solution is code review, to make sure the undefined behavior cannot occur).
Is there another way of addressing the issue of compiler writers’ use/misuse of undefined behavior? Yes, offer them money. Compiler writing is a business, at least at the level at which gcc and llvm operate. If people really are keen to influence the code generated by gcc and llvm, money is the solution. Wot, no money? Then stop complaining.
People at work suggested Kotlin was “just syntactic sugar”, so I set out to explain how Kotlin can really make better code, and here is the result:
To spin up a temporary environment with a different Java version without touching your real environment, try this Docker command:
docker run -i -t --mount "type=bind,src=$PWD,dst=/code" openjdk:11-jdk bash
(Change “11-jdk” to the version you want as listed on the README.)
Then you can build the code inside the current directory something like this:
cd code
./gradlew test
Or similar for other build tools, although you may need to install them first.
I attended ACCU 2019 a couple of weeks ago, where I was presenting my session Here's my number; call me, maybe. Callbacks in a multithreaded world.
The conference proper started on Wednesday, after a day of pre-conference workshops on the Tuesday, and continued until Saturday. I was only there Wednesday to Friday.
I didn't arrive until Wednesday lunchtime, so I missed the first keynote and morning sessions. I did, however, get to see Ivan Čukić presenting his session on Ranges for distributed and asynchronous systems. This was an interesting talk that covered similar ground to things I've thought about before. It was good to see Ivan's take, and think about how it differed from mine. It was also good to see how modern C++ techniques can produce simpler code than I had when I thought about this a few years ago. Ivan's approach is a clean design for pipelined tasks that allows implicit parallelism.
After the break I then went to Gail Ollis's presentation and workshop on Helping Developers to Help Each Other. Gail shared some of her research into how developers feel about various aspects of software development, from the behaviour of others to the code that they write. She then got us to try one of the exercises she talked about in small groups. By picking developer behaviours from the cards she provided to each group, and telling stories about how that behaviour has affected us, either positively or negatively, we can share our experiences, and learn from each other.
First up on Thursday was Herb Sutter's keynote: De-fragmenting C++: Making exceptions more affordable and usable. Herb was eloquent as always, talking about his idea for making exceptions in C++ lower cost, so that they can be used in all projects: a significant number of projects currently ban exceptions from at least some of their code. I think this is a worthwhile aim, and hope to see something like Herb's ideas get accepted for C++ in a future standard.
Next up was my session, Here's my number; call me, maybe. Callbacks in a multithreaded world. It was well attended, with interesting questions from the audience. My slides are available here, and the video is available on YouTube. Several people came up to me later in the conference to say that they had enjoyed my talk, and that they thought it would be useful for them in their work, which pleased me no end: this is what I always hope to achieve from my presentations.
Thursday lunchtime was taken up with book signings. I was one of four authors of recently-published programming books set up in the conservatory area of the hotel to sell copies of our books, and sign books for people. I sold plenty, and signed more, which was great.
Kate Gregory's talk What Do We Mean When We Say Nothing At All? was after lunch. She discussed the various places in C++ where we can choose to specify something (such as explicit), but we don't have to. Can we interpret meaning from the lack of an annotation? If your codebase uses override everywhere, except in one place, is that an accidental omission, or is it a flag to say "this isn't actually an override of the base class function"? Is it a good or bad idea to omit the names of unused parameters? There was a lot to think about with this talk, but the key takeaway for me is Consistency is Key: if you are consistent in your use of optional annotations, then deviation from your usual pattern can convey meaning to the reader, whereas if you are inconsistent then the reader cannot infer anything.
The final session I attended on Thursday was the C++ Pub Quiz, which was hosted by Felix Petriconi. The presented code was intended to confuse, and elicit exclamations of "WTF!", and succeeded on both counts. However, it was fun as ever, helped by the free drinks, and the fact that my team "Ungarian Notation" were the eventual winners.
Friday was the last day of the conference for me (though the conference had another full day on Saturday). It started with Paul Grenyer's keynote on the trials and tribulations of trying to form a "community" for developers in Norwich, with meet-ups and conferences. Paul managed to be entertaining, but having followed Paul's blog for a few years, there wasn't anything that was new to me.
Interactive C++ : Meet Jupyter / Cling - The data scientist's geeky younger sibling was the next session I attended, presented by Neil Horlock. This was an interesting session about cling, a C++ interpreter, complete with a REPL, and how this can be combined with Jupyter notebooks to create a wiki with embedded code that you can edit and run. Support for various libraries allows you to write code to plot graphs and maps and things, and have the graphs appear right there in the web page immediately. This is an incredibly powerful tool, and I had discussions with people afterwards about how this could be used both as an educational tool, and for "live" documentation and customer-facing tests: "here is sample code, try it out right now" is an incredibly powerful thing to be able to say.
After lunch I went to see Andreas Weis talk about Taming Dynamic Memory - An Introduction to Custom Allocators. This was a good introduction to various simple allocators, along with how and why you might use them in your C++ code. With John Lakos in the front row, Andreas had to field many questions. I had hoped for more depth, but I thought the material was well-paced, and so there wouldn't have been time; that would have been quite a different presentation, and less of an "introduction".
The final session I attended was Elsewhere Memory by Niall Douglas. Niall talked about the C++ object model, and how that can cause difficulties for code that wants to serialize the binary representation of objects to disk, or over the network, or wants to directly share memory with another process. Niall is working on a standardization proposal which would allow creating objects "fully formed" from a binary representation, without running a constructor, and would allow terminating the lifetime of an object without running its destructor. This is a difficult area as it interacts with compilers' alias analysis and the normal deterministic lifetime rules. However, this is an area where people currently do have "working" code that violates the strict lifetime rules of the standard, so it would be good to have a way of making such code standards-conforming.
Between the Sessions
The sessions at ACCU conferences are great, and I always enjoy attending them, and often learn things. However, you can often watch these on YouTube later. One of the best parts of physically attending a conference is the discussions had in person before and after the sessions. It is always great to chat to people in person who you primarily converse with via email, and it is exciting to meet new people.
The conference tries to encourage attendees to be open to new people joining discussions with the "Pacman rule" — don't form a closed circle when having a discussion, but leave a space for someone to join. This seemed to work well in practice.
I always have a great time at ACCU conferences, and this one was no different.