Research software code is likely to remain a tangled mess

Research software (i.e., software written to support research in engineering or the sciences) is usually a tangled mess of spaghetti code that only the author knows how to use. Very occasionally I encounter well-organized research software that can be used without having an email conversation with the author (who has invariably spent years iterating through many versions).

Spaghetti code is not unique to academia; there is plenty to be found in industry.

Structural differences between academia and industry make it likely that research software will always be a tangled mess, only usable by the person who wrote it. These structural differences include:

  • writing software is a low status academic activity; it is a low status activity in some companies, but those involved don’t commonly have other higher status tasks available to work on. Why would a researcher want to invest in becoming proficient in a low status activity? Why would the principal investigator spend lots of their grant money hiring a proficient developer to work on a low status activity?

    I think the lack of status is rooted in researchers’ lack of appreciation of the effort and skill needed to become a proficient developer of software. Software differs from that other essential tool, mathematics, in that most researchers have spent many years studying mathematics and understand that effort/skill is needed to be able to use it.

    Academic performance is often measured using citations, and there is a growing move towards citing software,

  • many of those writing software know very little about how to do it, and don’t have daily contact with people who do. Recent graduates are the pool from which many new researchers are drawn. People in industry are intimately familiar with the software development skills of recent graduates, i.e., the majority are essentially beginners; most developers in industry were once recent graduates, and the stream of new employees reminds them of the skill level of such people. Academics see a constant stream of people new to software development; this group forms the norm they have to work within, and many don’t appreciate the skill gulf that exists between a recent graduate and an experienced software developer,

  • those writing research software are paid a lot less than developers in industry. The handful of very competent software developers I know working in engineering/scientific research are doing it for their love of the engineering/scientific field in which they are active. Take this love away, and they will find that not only does industry pay better, but it also provides lots of interesting projects for them to work on (academics often have the idea that all work in industry is dull).

    I have met people who have taken jobs writing research software to learn about software development, to make themselves more employable outside academia.

Does it matter that the source code of research software is a tangled mess?

The author of a published paper is supposed to provide enough information to enable their work to be reproduced. It is very unlikely that I would be able to reproduce the results in a chemistry or genetics paper, because I don’t know enough about the subject, i.e., I am not skilled in the art. Given a tangled mess of source code, I think I could reproduce the results in the associated paper (assuming the author was shipping the code associated with the paper; I have encountered cases where this was not true). If the code failed to build correctly, I could figure out (eventually) what needed to be fixed. I think people have an unrealistic expectation that research code should just build out of the box. It takes a lot of work by a skilled person to create portable software that just builds.

Is it really cost-effective to insist on even a medium degree of buildability for research software?

I suspect that the lifetime of source code used in research is just as short and lonely as it is in other domains. One study of 214 packages associated with papers published between 2001-2015 found that 73% had not been updated since publication.

I would argue that a more useful investment would be in testing that the software behaves as expected. Many researchers I have spoken to have not appreciated the importance of testing. A common misconception is that because the mathematics is correct, the software must be correct (completely ignoring the possibility of silly coding mistakes, which everybody makes). Commercial software has the benefit of user feedback, which detects some incorrect behavior. Research software may only ever have one user.
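
As a minimal sketch of the kind of testing meant here, the following Python checks a numerical routine against integrals whose exact values are known. The function `trapezoid` and the test values are hypothetical illustrations, not taken from any particular research code. The mathematics of the trapezoid rule is not in doubt; the checks are aimed at the coding.

```python
import math


def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] using n equal-width trapezoids."""
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    # Endpoints get weight 1/2; interior points get weight 1.
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):  # a slip such as range(1, n - 1) silently drops a term
        total += f(a + i * h)
    return total * h


def test_trapezoid_known_values():
    # The integral of x^2 over [0, 1] is exactly 1/3.
    assert math.isclose(trapezoid(lambda x: x * x, 0.0, 1.0, 1000), 1.0 / 3.0, abs_tol=1e-6)
    # The integral of sin over [0, pi] is exactly 2.
    assert math.isclose(trapezoid(math.sin, 0.0, math.pi, 1000), 2.0, abs_tol=1e-5)
    # The trapezoid rule is exact for straight lines, even with a single interval.
    assert math.isclose(trapezoid(lambda x: 2 * x + 1, 0.0, 3.0, 1), 12.0)


if __name__ == "__main__":
    test_trapezoid_known_values()
    print("all checks passed")
```

Checks like these take minutes to write. The x^2 check, for example, would fail if the loop bound were mistyped as `range(1, n - 1)`, a mistake that no amount of confidence in the underlying mathematics would reveal.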

Research software engineer is the fancy title now being applied to people who write the software used in research. Originally this struck me as an example of what companies do when they cannot pay people more: they give them a fancy title. Recently the Society of Research Software Engineering was set up. This society could certainly help with training, but I don’t see it making much difference with regard to status and salary.

A Magical new World — Thoughts of a first time ACCU attendee.

I went to my first ACCU Conference last week. It was great.

ACCU Conference

I’d heard about ACCU from Russel Winder several months ago. He recommended I check out the conference (for which he’s on the programme board) since I’m a fan and user of the C and C++ languages.

I arrived in Bristol on Tuesday excited for what the week held.

This post contains a section about the talks and a section about my experience at the bottom.

Be aware that some of the photos might not look as good on here as they should; I think Medium has compressed them a bit. All my photos should be online shortly.

The Talks

Russ Miles opens ACCU 2017

We started the conference proper with a fantastically explosive keynote delivered by Russ Miles, who jumped on stage to deliver a programming parody of Highway to Hell accompanied by his own guitar playing.
His keynote was all about modern development and how most of a programmer’s tools currently just shout information at the programmer, rather than actually helping.

Later on the Wednesday I headed into a talk from Kevlin Henney that totally re-jigged how I think about concurrency. Thinking Outside the Synchronisation Quadrant was wonderfully entertaining, with Kevlin excitedly bouncing across the floor.
A fantastically engaging speaker.

Lightning talks on Wednesday

Wednesday continued with several other good talks and a number of great lightning talks too.
The day finished with the welcome reception, where delegates gathered in the hotel for drinks, food and conversation.

It was here that I really got the chance to socialise with a good few people, including Anna-Jayne and Beth, who I’d been excited about meeting since I found out they were going to be there!

Thursday began with an interesting keynote about the Chapel parallel programming language. The talk has encouraged me to try the language out and I’ll certainly be having a good play with that soon.

Peter Hilton’s Documentation for Developers workshop

Thursday’s standout talks included the Documentation for Developers workshop by Peter Hilton. I really enjoyed the workshoppy style that Peter used to deliver the talk. He got the audience working in groups, talking to each other and essentially complaining about documentation. He finished by suggesting a method of writing docs called Readme Driven Development, as well as other suggestions.

The other talk on Thursday which I really loved was “The C++ Type System is your Friend”. Hubert Matthews was a great speaker with clear experience in explaining a complex topic in an easier to understand fashion.
I can’t say I understood everything, but I certainly liked listening to Hubert speak.

Thursday evening I headed out for dinner with Anna-Jayne and Beth before heading back to my accommodation to write up a last minute talk for Friday.

My talk covered Intel Software Guard Extensions. Russel had announced that there was an open slot on Friday for a 15 minute topic, and I took the chance to speak then.

Friday began with a curious but thought-provoking talk from Fran Buontempo called AI: Actual Intelligence.
I’m not entirely sure what the take away from the talk was intended to be, but nonetheless it was interesting!

Friday morning was full of 15 minute talks. A format I think is wonderful.
I really loved that amongst the 90 minute talks throughout the rest of the week, there was time for these quick-fire shorter talks too, which were still serious technical talks (unlike the 5 minute lightning talks).

The talks I went to see were:

Odin Holmes talks about Named Parameters

At Friday lunch time I took part in a bit of an unplanned workshop on sketch noting with Michel Grootjans. It was essentially an hour of trying to make our notes prettier!
It was a lot of fun.

Sketch Noting with Michel Grootjans

Friday was the conference dinner — a rock themed night of fun and frivolities.

This was by far the high point of the conference for me.
It offered a great evening of meeting people and having a lot of fun.
I loved how everyone loosened up and spoke to anyone else there.

I met a whole bunch of people, and got on super well with a few people who I would like to consider friends now.

ACCU made it easy to get to know people too by forcing everyone who isn't a speaker to move tables between each meal course. It's a great idea!

Odin enjoys inflatable instruments

Saturday’s talks started with a really fun talk from Arjan van Leeuwen about string handling in C++1x, covering the differences between char arrays and std::strings and how best to use them. He also tantalised us with a C++17 feature called std::string_view (immutable views of a string).

Later I watched a talk from Anthony Williams and another from Odin, both of which went wildly over my head, but all the same I gained a few things from both of them.

Finishing off the conference was a brilliant keynote from renowned speaker and member of the ISO C++ standards committee Herb Sutter.

Herb Sutter on Metaclasses at ACCU 2017

Herb introduced a new feature of C++ that he may be proposing to the standards committee.
He described a feature allowing one to create meta-classes.

Essentially, one could describe a template of a class with certain interfaces, data and operators. Then, one could implement an instance of that class defining all the functionality of the class.
It's essentially a way to more cleanly describe something akin to inheritance with virtual functions.

I highly suggest you try to catch the talk, since it was so interesting that even an hour or so after the talk there was still quite a crowd of people gathered around Herb asking him questions.

Herb is surrounded by curious programmers

The Conference Environment

As a first time ACCU attendee — this wouldn't be a useful blog post without a few words about the environment at the conference.

As most of my readers know, I’m a young trans woman, so a safe and welcoming environment is something that I very much appreciate and makes a huge difference to my experience of an event.

It's something that's super hard to achieve in a world like software development, where the workforce is predominantly male.

I’m glad to say that ACCU did a great job of creating a safe and welcoming space. Despite being predominantly male as expected, everyone I encountered was not only friendly and helpful, but ever so willing to go out of their way to make me feel welcome and comfortable.
Everyone I met simply accepted me for me and didn't treat me any other way than friendly.

I would suggest that offering diversity tickets to ACCU would help make me feel even better there, since I’d feel better with a more diverse set of delegates.

I was especially comforted by Russel mentioning the code of conduct, without fail, every day of the conference. One of the lightning talks, delivered by a man, also took the form of a spoken-word-ish piece praising the welcoming nature of ACCU and calling for that welcome to be maintained for all people in the community, not just people like himself.

I’d like to especially mention Julie and the Archer-Yates team for checking up on my happiness throughout the conference; they really helped me feel safe there.

I think there still could be work to do about making the conference a good place for younger adults — I was rather overwhelmed by the fact that everyone seemed older than me and clearly had a better idea of how to conduct themselves in the conference setting.
However, I think the only real way of solving this problem would be to make the conference easier to access for younger people (i.e., cheaper tickets for students; it's still super expensive), which wouldn't always be possible. Additionally, the inclusion of some simpler, easier to understand talks would have been great. Lots of the talks were very complicated and easily got to a level that was way over my head.

Thanks to everyone who helped me feel welcome at ACCU — including but not limited to Richard, Antonello, Anna-Jayne, Beth, Jackie, Fran, Russel and Odin.

In conclusion

ACCU was a fantastic experience for me. I would highly recommend it to anyone interested in improving their C and C++ programming skills as well as general programming skills.
I’ll certainly be heading back next year if I can, and am happily a registered ACCU member now!