Academic recognition for creating and supporting software

Derek Jones from The Shape of Code

A scientific paper is supposed to contain enough information that somebody skilled in the field can perform the experiment(s) described therein (issues around the money needed to obtain access to the necessary equipment tend to be sidestepped). In addition to the skills generally taught within a field, every niche has its specific skill set, which for leading-edge research may only be available in one lab.

Bespoke software has become an essential component of many research projects, and the ability to reimplement the necessary software is rarely considered to be a necessary skill. Some researchers consider software to be “just code” whose creation is not really a skill that is worth investing in acquiring.

There is a widespread belief in academic circles that the solution to the issues created by bespoke software is for researchers to release the source code of the software they create.

Experienced developers will laugh at the idea that once the source code is available, running it is straightforward. Figuring out how to run somebody else’s code can be a very time-consuming process, particularly when the person who wrote it is relatively inexperienced.

This post is about the social issues around the bespoke research code being made available, and not the technical issues likely to be encountered in building it on another researcher’s computer.

Lots of researchers do make their code available, without being asked, and some researchers actively promote the software they have written. In a few cases, active software ecosystems have sprung up around a research topic, e.g., Astropy and SunPy.

However, a lot of code never gets released. Based on my own experience of asking for code (in the last 10 years, most of my requests have been for data), reasons given by researchers for not making the code they have written available to others, include:

  • not replying to email requests for the code,
  • not sure that they still have all the code, which is taken as a reason for not sending what they have. This may also be a cover story for another reason they don’t want to admit to,
  • they don’t want the hassle of supporting other users of the code. Having received some clueless requests for help on software I have released, I have sympathy for this position. Sometimes pointing out that I am an experienced developer who does not need support works; other times it just changes the reason given,
  • they think the code is poorly written, and that this poor quality will make them look bad. Pointing out that research code is leading edge (rarely true; it’s an attempt to stroke their ego), and not supposed to be polished, rarely works for me. Some people are just perfectionists, with a strong aversion to showing others anything that has not been polished to death,
  • a large investment was made to create the software, and they want to reap all the benefits. I have a lot of sympathy with this position. Some research fields are very competitive, or sometimes the researcher just wants to believe that they really will get another grant to work on the subject.

Researchers who create and support research software complain that they don’t get any formal recognition for this work, which raises the question: why are you working on this software when you know that you are unlikely to receive any recognition?

How might researchers receive recognition for writing, supporting and releasing code?

Citations to published papers are a commonly used technique for measuring the worth of the work done by a researcher (this metric is used when evaluating people for promotion, awarding grants, and evaluating departments), and various organizations are promoting the use of citations for software.

Some software provides enough benefits that the authors can write a conventional paper about it, e.g., a paper on Astropy (which does not cite any of the third-party packages used in its own implementation). But a lot of research software does not have sufficient general appeal to warrant a paper.

Are citations for software a good idea?

An important characteristic of any evaluation metric is how hard it is to fake a good score.

Research papers are rated by the journal in which they are published, with each journal having its own rating (a short-term metric), and the number of times the paper is cited (a longer-term metric). Papers are reviewed, with many failing to be accepted (at least by the higher quality journals; there are so-called predatory journals that will publish anything for a fee).

While there are a few journals where source code may be an integral component of a paper, most research software is published on sites having minimal acceptance criteria, e.g., Github.

Will citations to software become as commonplace as citations to other papers?

I regularly read software papers that cite software packages, but this practice is a long way from being common.

Will those awarding job promotions and grants start to include software creation as having a status comparable to published papers? We will have to wait and see.

Will the lure of recognition via citations increase the quantity of source being released?

I don’t think it will have any impact until the benefits of software citations are seen to be worthwhile (which may be many years away).

Evidence-based SE groups doing interesting work, 2021 version

Derek Jones from The Shape of Code

Who are the research groups currently doing interesting work in evidence-based software engineering (academics often use the term empirical software engineering)? Interestingness is very subjective; in my case it is based on whether I think the work looks like it might contribute something towards software engineering practices (rather than measuring something to get a paper published or fulfil a requirement for an MSc or PhD). I last addressed this question in 2013, and things have changed a lot since then.

This post focuses on groups (i.e., multiple active researchers), and by “currently doing” I’m looking for multiple papers published per year in the last few years.

As regular readers will know, I think that clueless button pushing (a.k.a. machine learning) in software engineering is mostly fake research. I tend to ignore groups that are heavily clueless button pushing oriented.

Like software development groups, research groups come and go, with a few persisting for many years. People change jobs, move into management, start companies based on their research, new productive people appear, and there is the perennial issue of funding. A year from now, any of the following groups may be disbanded or moved on to other research areas.

Some researchers leave a group to set up their own group (even moving continents), and I know that many people in the 2013 survey have done this (many in the Microsoft group listed in 2013 are now scattered across the country). Most academic research is done by students studying for a PhD, and the money needed to pay for these students comes from research grants. Some researchers are willing to spend their time applying for grants to build a group (on average, around 40% of a group’s lead researcher’s time is spent applying for grants), while others are happy to operate on a smaller scale.

Evidence-based research has become mainstream in software engineering, but this is not to say that the findings or data have any use outside of getting a paper published. A popular tactic employed by PhD students appears to be to look for what they consider to be an interesting pattern in code appearing on Github, and write a thesis that associates this pattern with an issue thought to be of general interest, e.g., predicting estimates/faults/maintainability/etc. Every now and again, a gold nugget turns up in the stream of fake research.

Data is being made available via personal Github pages, figshare, osf, Zenodo, and project or personal University pages (generally not a good idea, because the pages often go away when the researcher leaves). There is no current systematic attempt to catalogue the data.

There has been a huge increase in papers coming out of Brazil, and Brazilians working in research groups around the world, since 2013. No major Brazilian name springs to mind, but that may be because I have not noticed that they are Brazilian (every major research group seems to have one, and many of the minor ones as well). I may have failed to list a group because their group page is years out of date, which may be COVID related, bureaucracy, or they are no longer active.

The China list is incomplete. There are Chinese research groups whose group page is hosted on Github, and I have failed to remember that they are based in China. Also, Chinese pages can appear inactive for a year or two, and then suddenly be updated with lots of recent information. I have not attempted to keep track of Chinese research groups.

Organized by country, groups include (when there is no group page available, I have used the principal’s page, and when that is not available I have used a group member page; some groups make no attempt to help others find out about their work):

Belgium (I cite the researchers with links to pdfs)

Brazil (Garcia, Steinmacher)

Canada (Antoniol, Data-driven Analysis of Software Lab, Godfrey and Ptidej, Robillard, SAIL; three were listed in 2013)

China (Lin Chen, Lu Zhang)

Germany (Chair of Software Engineering, CSE working group, Software Engineering for Distributed Systems Group, Research group Zeller)

Greece (listed in 2013)

Israel

Italy (listed in 2013)

Japan (Inoue lab, Kamei Web, Kula, and Kusumoto lab)

Netherlands

Spain (the only member of the group listed in 2013 with a usable web page)

Sweden (Chalmers, KTH {Baudry and Monperrus, with no group page})

Switzerland (SCG and REVEAL; both listed in 2013)

UK

USA (Devanbu, Foster, Maletic, Microsoft, PLUM lab, SEMERU, squaresLab, Weimer; two were listed in 2013)

Sitting here typing away, I have probably missed out some obvious candidates (particularly in the US). Suggestions for omissions welcome (remember, this is about groups, not individuals).

Visual Lint 8.0.5.346 adds support for Visual Studio 2022

Products, the Universe and Everything from Products, the Universe and Everything

Visual Lint 8.0.5.346 has now been released. This is a recommended maintenance update for Visual Lint 8.0, and adds support for Visual Studio 2022 Preview:

Visual Lint 8.0.5.346 running within Visual Studio 2022 Preview 5.0

The following changes are included:

  • The Visual Studio plugin is now compatible with Visual Studio 2022 Preview.

  • The Visual Lint installer now includes dedicated VSIX extensions for Visual Studio 2017 (VisualLintPlugIn_vs2017.vsix), Visual Studio 2019 (VisualLintPlugIn_vs2019.vsix) and Visual Studio 2022 (VisualLintPlugIn_vs2022.vsix).

  • If makefile output is read while loading a C/C++ project, it is now used to determine which files will be built.

  • The Project Properties Dialog now includes a dedicated field for built-in (i.e. compiler defined) preprocessor definitions. Note that these include those inferred by other project settings, e.g. the Visual C++ "Runtime Library" property (compiler switches /MD[d], /MLd and /MT[d]) which infer _DEBUG/NDEBUG, _MT and _DLL.

  • Built-in (i.e. compiler defined) preprocessor symbols are now written separately in generated PC-lint/PC-lint Plus project indirect files from those specified directly.

  • Updated the PC-lint Plus compiler indirect file co-rb-vs2022.lnt to support Visual Studio 2022 v17.0.0 Preview 4.1.

  • Updated the PC-lint Plus compiler indirect file co-rb-vs2019.lnt to support Visual Studio 2019 v16.11.4.

  • Updated the VisualLintConsole help screen to reflect support for Visual Studio 2022 project and solution files.

  • Updated the installer to clarify that the "Remove Visual Lint commands from Visual Studio" option only applies to Visual Studio 2002-2008.

Download Visual Lint 8.0.5.346

How I prioritise planning over plans

Allan Kelly from Allan Kelly

You might think that I’m a really organized person. After all, I spend a good chunk of my life helping other people be more organized about their work – and not just organized, prioritised, effective, and all those other good things. That might be true; people who know me well say I’m really well organized. But I always feel I’m faking it. I’m really disorganized.

As I spend a lot of time working by myself, for myself, and interleaving clients I need to organize my days. Over the years I’ve tried many different ways of organizing myself. Todo lists in my notebook are the main mechanism. Notebook and todo-list works well for the medium range but for the actual work of today it’s not so effective. I have, and sometimes use, a whiteboard: write out a list of things to do today and tick them as I do them. I’ve used post-it notes: write out all the things I need to do on one post-it each, prioritise them down the side of my desk and tick them as I do them.

In general I find that a system works for a while, maybe even a few weeks, but it decays. Perhaps it’s too routine, perhaps I’m over familiar. After a while I need a change. So after a period in the doldrums I bring back an old system or invent a new one.

During the last year of house-arrest I’ve found organizing myself really hard. A few months ago I came up with a new system: good old index cards. Extra big ones. On the left are the mundane or household things I need to do. On the right the important business stuff to do.

The keys to making any one of my systems work are:

1. The power of the (big red) tick. Being able to tick things off and mark them as done. Perhaps that’s why electronic systems never work for me.

2. Prioritisation: Recognising that some things are more important and need to be done sooner or require more time. Accept some things fall off the bottom and don’t get done.

3. Limiting WIP, work in progress. It is easy to put too many things down, not do them, and write them down again tomorrow.

4. Sticking to the list and not getting distracted.

5. Rewrite the list every day.

Now the last three there: prioritisation, limiting WIP and not getting distracted require personal discipline. I have to force myself to work within my system. Sometimes that is hard but if I don’t do it the system breaks down. And actually, that is usually why the system periodically requires reinventing.

For example: many of my cards list “EM, SL and LN” – short for E-Mail, Slack and LinkedIn. Messages arrive for me on all three and there are conversations I sometimes want to join in. But, very little on any of them is so important that it needs to be looked at at 9am. Everything on Slack and LinkedIn can wait, and almost everything on e-mail can wait. So a quick e-mail triage at 9am and pushing the rest until later in the day allows focus on important stuff. Unfortunately, because EM, SL and LN all generate dopamine it is very difficult to prevent myself from being distracted by them.

Rewriting the list every day helps focus because I’m saying “this is what I will do.” For years I found that every time my notebook todo list ran out of space and I rewrote it I was much more productive that day – plus it was a cathartic experience. Arguably rewriting my list every day is a waste of time because some items carry over and some items repeat. But the actual process of doing it, the planning rather than the plan, creates focus and motivation.

As you might have guessed by now, a lot of this carries directly over to my clients and their teams: prioritise the work, limit WIP, let work fall off, stick to the system and the difficulties of discipline. Of course, what I’m describing is a system that emphasises planning over plans.

Another issue I regularly run up against is the “second priority” problem. Once all the pressing, really important and urgent stuff is done where do I put my attention? When I have three or four lower priority things to do it can be hard to choose which one to do and to stick with it. It can help to time-box the work, write “45” next to the work item and when I start work on it set a timer to stop after 45 minutes. I may not have done everything but I will have made progress and will at least have broken the second priority logjam.

Sometimes it’s hard to address an issue because it is not clear what needs doing. When todo items are transactional, “call X”, “write Y”, it is easy to close them off. But sometimes it is hard to know what “Website” actually involves. I’m the one who decided I should work on my website, I’m the one doing the work, but what exactly should I be doing? And even if I know I should be, say, “updating keywords”: what keywords? where? Even though I can see something needs attention I don’t know what.

So, am I organized? or disorganized?

Well, I think this goes back to my dyslexia. Dyslexics are frequently disorganised because many (including me) suffer with poor short term memory. Left to myself I can be very disorganized.

But, dyslexics overcompensate. A recurring pattern with dyslexics is that we have to create our own learning strategies and solutions to our own problems. Sometimes we overcompensate and something we are bad at naturally becomes something we are very good at.

That I think is why other people think I’m really well organized and I think I’m badly organized.

And it is those ways of thinking and approaching organization – and work to do – which carry over to my professional work. Call it neurodiversity if you like.


The post How I prioritise planning over plans appeared first on Allan Kelly.

Online User Stories tutorials now complete

Allan Kelly from Allan Kelly

Better User Stories
As a Product Owner I want to write better stories

I’m pleased to announce I’ve released the last of my online User Stories tutorials (part 5: Workflow and Lifecycle) and with that the whole series is complete. You can now buy the entire User Stories set of 5 tutorials as one package at a 40% discount to buying the tutorials individually.

The package includes over six hours of video commentary, exercises, quizzes, downloads and both ebook and audio book versions of Little Book of Requirements and User Stories.

Blog readers can get a further 25% off the price with the code: “blogoct21” until the end of this month, October 2021.

In addition, the first 3 people to use that code will receive a free print copy of The Art of Agile Product Ownership.

The 5 tutorials are:

These tutorials turned out to be a lot more work than I expected (where have I heard that before?). The core material is based on the Requirements, Backlogs and User Stories workshop that I have been running for a few years and last year converted to a series of online webinars. In the process the material has become a lot more focused.

Please, let me know what you think, in the comments section below or in the feedback forms at the end of each tutorial in the series.

The post Online User Stories tutorials now complete appeared first on Allan Kelly.

Looking for a measurable impact from developer social learning

Derek Jones from The Shape of Code

Almost everything you know was discovered/invented by other people. Social learning (i.e., learning from others) is the process of acquiring skills by observing others (teaching is explicit formalised sharing of skills). Social learning provides a mechanism for skills to spread through a population. An alternative to social learning is learning by personal trial and error.

When working within an ecosystem that changes slowly, it is more cost-effective to learn from others than learn through trial and error (assuming that experienced people are available to learn from, and the learner is capable of identifying them); “Social Learning” by Hoppitt and Laland analyzes the costs and benefits of using social learning.

Since its inception, much of software engineering has been constantly changing. In a rapidly changing ecosystem, the experience of established members may suggest possible solutions that do not deliver the expected results in a changed world, i.e., social learning may not be a cost-effective way of building a skill set applicable within the new ecosystem.

Opportunities for social learning occur wherever developers tend to congregate.

When I started writing software, developers would print out a copy of their code to take away and correct/improve/add-to (this was when 100+ people were time-sharing on a computer with 256K words of memory, running at 1 MHz). People would cluster around the printer, which ran sufficiently slowly that it was possible, in real-time, to read the code and figure out what was going on; it was possible to learn from others’ code (pointing out mistakes in programs that people planned to hand in was not appreciated). Then personal computers became available, along with low-cost printers (e.g., dot matrix), which were often shared, and did not print so fast that an experienced developer could not figure things out in real-time. Then laser printers came along, delivering a page at a time every 15 seconds, or so; experiencing the first print out from a laser printer, I immediately knew that real-time code reading was a thing of the past (also, around this time, full-screen editors achieved the responsiveness needed to enthral developers; paper code listings could not compete). A regular opportunity for social learning disappeared.

Mentoring and retrospectives are intended as explicit (perhaps semi-taught) learning contexts, in which social learning opportunities may be available.

The effectiveness of social learning is dependent on being able to select a good enough source of expertise to learn from. Choosing the person with the highest prestige is a common social selection technique; selecting web pages appearing on the first page of a Google search is actually a form of conformist learning (i.e., selecting what others have chosen).

It is possible to point at particular instances of social learning in software engineering, but to what extent does social learning, other than explicit teaching, contribute to developer skills?

Answering this question requires enumerating all the non-explicitly taught skills a developer uses to get the job done, excluding the non-developer specific skills. A daunting task.

Is it even possible to consistently distinguish between social learning (implicit or taught) and individual learning?

For instance, take source code indentation. Any initial social learning is likely to have been subsequently strongly influenced by peer pressure, and default IDE settings.

Pronunciation of operator names is a personal choice that may only ever exist within a developer’s head. In my head, I pronounce the ^ operator as up-arrow, because I first encountered its use in the book Algorithms + Data Structures = Programs which used the symbol ↑, which appears as the ^ character on modern keyboards. I often hear others using the word caret, which I have to mentally switch over to using. People who teach themselves to program have to invent names for unfamiliar symbols, until they hear somebody speaking code (the widespread availability of teach-yourself videos will make the need for this kind of individual learning rare; individual learning is giving way to social learning).

The problem with attempting to model social learning is that much of the activity occurs in private, and is not recorded.

One public source of prestigious experience is Stack Overflow. Code snippets included as part of an answer on Stack Overflow appear in around 1.8% of Github repositories. However, is the use of this code social learning or conformist transmission (i.e., copy and paste)?

Explaining social learning to people is all well and good, but having to hand wave when asked for a data-driven example is not good. Suggestions welcome.

On Triumvirate – student

student from thus spake a.k.

When last they met, the Baron invited Sir R----- to join him in a wager involving a sequence of coin tosses. At a cost of seven coins Sir R----- would receive one coin for every toss of the coin until a run of three heads or of three tails brought the game to its conclusion.

To evaluate its worth to Sir R----- we begin with his expected winnings after a single toss of the coin.
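The excerpt stops before the calculation, so what follows is only a minimal sketch of one way to value the wager, not the student's own derivation. It assumes a fair coin and that every toss, including the one that completes the run, pays Sir R----- one coin; the function name `expected_tosses` is hypothetical. Writing E_i for the expected number of further tosses when the current run has length i, we have E_3 = 0 and E_i = 1 + E_{i+1}/2 + E_1/2, since each toss either extends the run or starts a new run of length one.

```python
from fractions import Fraction


def expected_tosses(run_length: int) -> Fraction:
    """Expected number of fair-coin tosses until `run_length` identical
    results appear in a row (assumes a fair coin)."""
    half = Fraction(1, 2)
    # Represent E_i as a + b * E_1, starting from E_run_length = 0
    # and working back down to E_1 using
    #   E_i = 1 + half * E_{i+1} + half * E_1.
    a, b = Fraction(0), Fraction(0)
    for _ in range(run_length - 1):
        a, b = 1 + half * a, half * b + half
    # Now E_1 = a + b * E_1, so E_1 = a / (1 - b).
    e1 = a / (1 - b)
    # The first toss always starts a run of length one.
    return 1 + e1


stake = 7                      # cost of the game, from the wager
tosses = expected_tosses(3)    # one coin is paid per toss
print(tosses, tosses - stake)  # prints "7 0"
```

Under these assumptions the expected number of tosses is exactly seven, so the expected payout matches the seven-coin stake and the expected net winnings are zero.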

Letter to my MP about climate emergency

Andy Balaam from Andy Balaam's Blog

[Introduction including details about my own air source heat pump install, and mention of the ending of the RHI funding in April 2022.]

After I have installed an air source heat pump, I will pay more money to heat my home, even though I am using less energy, because electricity is more expensive than gas. So this change will hurt me financially over both the short and longer terms.

Do you agree with me that climate emergency is the most important issue the government is now facing?

Do you also agree with me that we urgently need people to switch their heating and home insulation to reduce our dependence on burning gas?

Please do all you can to persuade the Prime Minister to introduce initiatives before COP26 that make it financially viable for families without spare cash to insulate their homes and heat them with renewable energy.

Please pass my letter on to the Prime Minister and any government departments you consider relevant.

Thank you very much for your time.