Beta release of data analysis chapters: Evidence-based software engineering

Derek Jones from The Shape of Code

When I started my evidence-based software engineering book, nobody had written a data analysis book for software developers, so I had to write one (in fact, a book on this topic has still to be written). When I say “I had to write one”, what I mean is that the 200 pages in the second half of my evidence-based software engineering book contain a concentrated form of such a book.

These 200 pages are now on beta release (186 pages, if the bibliography is excluded); they are chapters 8 to 15 of the draft pdf. Originally I was going to wait until all the material was ready before making a beta release; the Coronavirus changed my plans.

Here is your chance to learn a new skill during the lockdown (yes, these are starting to end; my schedule has not changed, I’m just moving with the times).

All the code+data is available for you to try out any ideas you might have.

The software engineering material, the first half of the book, is also part of the current draft pdf, and the polished form should be available on beta release in about 6 weeks.

If you have a comment or find a problem, either email me or raise an issue on the book’s Github page.

Yes, a few figures and tables still bump into each other. I’m loath to do much fine-tuning because things will shuffle around a bit with minor changes to the words.

I’m thinking of running some online sessions around each chapter. Watch this space for information.

Tizen on Orange Pi PC

Christof Meerwald from cmeerw.org blog

Made some significant progress for running Tizen on an Orange Pi PC (and hopefully any other SBC with a similar Mali GPU). Main issue was that alignments in TBM (Tizen Buffer Manager) weren't in sync with what the actual GPU driver expected. With that fixed, rendering seems to work fine now and I was also able to enable Full HD (1920x1080) resolution.

Next step would be to get the Home Screen app auto started and find some way to get a "Home" button (to be able to switch between apps without completely closing them).

Creating a tiny Docker image of a Rust project

Andy Balaam from Andy Balaam's Blog

I am building a toy project in Rust to help me learn how to deploy things in AWS. I’m considering using Elastic Beanstalk (AWS’s platform-as-a-service) and also Kubernetes. Both of these support deploying via Docker containers, so I am learning how to package a Rust executable as a Docker image.

My program is a small web site that uses Redis as a back end database. It consists of some Rust code and a couple of static files.

Because Rust has good support for building executables with very few dependencies, we can actually build a Docker image with almost nothing in it, except my program and the static files.

Thanks to Alexander Brand’s blog post How to Package Rust Applications Into Minimal Docker Containers I was able to build a Docker image that:

  1. Is very small
  2. Does not take too long to build

The main concern for making the build faster is that we don’t download and build all the dependencies every time. To achieve that we make sure there is a layer in the Docker build process that includes all the dependencies being built, and is not re-built when we only change our source code.

Here is the Dockerfile I ended up with:

# 1: Build the exe
FROM rust:1.42 as builder
WORKDIR /usr/src
# 1a: Prepare for static linking
RUN apt-get update && \
    apt-get dist-upgrade -y && \
    apt-get install -y musl-tools && \
    rustup target add x86_64-unknown-linux-musl

# 1b: Download and compile Rust dependencies (and store as a separate Docker layer)
RUN USER=root cargo new myprogram
WORKDIR /usr/src/myprogram
COPY Cargo.toml Cargo.lock ./
RUN cargo install --target x86_64-unknown-linux-musl --path .

# 1c: Build the exe using the actual source code
COPY src ./src
RUN cargo install --target x86_64-unknown-linux-musl --path .

# 2: Copy the exe and extra files ("static") to an empty Docker image
FROM scratch
COPY --from=builder /usr/local/cargo/bin/myprogram .
COPY static .
USER 1000
CMD ["./myprogram"]

The FROM rust:1.42 as builder line uses the newish Docker feature, multi-stage builds – we create one Docker image (“builder”) just to build the code, and then copy the resulting executable into the final Docker image.

To build a stand-alone executable that does not depend on the standard libraries in the operating system, we use the “musl” target, which is designed to be statically linked.

The final Docker image produced is pretty much the same size as the release build of myprogram, and the build is fast, so long as I don’t change the dependencies in Cargo.toml.

A couple more tips to make the build faster:

1. Use a .dockerignore file. Here is mine:

/target/
/.git/

2. Use Docker BuildKit, by running the build like this:

DOCKER_BUILDKIT=1 docker build .

Visual Lint 7.0.7.318 has been released

Products, the Universe and Everything from Products, the Universe and Everything

This is a recommended maintenance update for Visual Lint 7.0. The following changes are included:

  • The Visual C++ built-in preprocessor symbol _MSC_VER is now defined on the generated CppCheck command line when analysing Visual Studio 2019 projects.

  • Appropriate defaults (e.g. the compiler indirect file co-rb-vs2019.lnt) are now offered when creating a PC-lint Plus analysis configuration for Visual Studio 2019.

  • Updated the values of _MSC_VER and _MSC_FULL_VER in the PC-lint Plus compiler indirect file co-rb-v2017.lnt to reflect those in the latest Visual Studio 2017 update (VS2017 v15.9.21).

  • Updated the values of _MSC_VER and _MSC_FULL_VER in the PC-lint Plus compiler indirect file co-rb-vs2019.lnt to reflect those in the latest Visual Studio 2019 update (VS2019 v16.5.1).

  • Fixed a bug in the Configuration Wizard "Select Analysis Tool Installation Folder" page.

  • Fixed a bug in the implementation of the "Active Analysis Tool" option.

  • Fixed a bug in the implementation of the .vlconfig file "Solution specific Analysis Tool" property [Visual Lint Enterprise and Build Server Editions].

  • Fixed a display bug in the Configuration Wizard "Information" page. Note that this page is PC-lint and PC-lint Plus specific.

  • Corrected the title of the "PC-lint" page in the Analysis Configuration Dialog to "PC-lint Plus" when the active analysis tool is PC-lint Plus.

Download Visual Lint 7.0.7.318

SIGCHLD si_pid Linux kernel bug

Christof Meerwald from cmeerw.org blog

While trying to get Tizen working on my Orange Pi PC, I noticed some strange behaviour in the Linux kernel: SIGCHLD signals sent to the parent process don't always set the "si_pid" field correctly. I tracked this down to a bug in the Linux kernel for multithreaded process termination, see SIGCHLD signal sometimes sent with si_pid==0 (Linux 5.6.5). Luckily, a patch was posted less than 24 hours later.
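For readers who have not looked at the "si_pid" field before, here is a minimal sketch in Python of how a parent can read it. This is not the code from the bug report: the function name is made up for illustration, it assumes Linux (signal.sigwaitinfo is not available everywhere), and it uses a single-threaded child, so it shows where si_pid comes from rather than reproducing the multithreaded case.

```python
import os
import signal


def wait_for_child_signal():
    """Fork a child that exits at once, then read the SIGCHLD siginfo.

    Returns (child_pid, si_pid) so the caller can compare the two.
    Hypothetical helper for illustration; Linux only.
    """
    # Block SIGCHLD so it stays pending until sigwaitinfo() collects it.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        child = os.fork()
        if child == 0:
            os._exit(0)  # child: terminate immediately
        # Parent: wait synchronously for the signal and its siginfo.
        info = signal.sigwaitinfo({signal.SIGCHLD})
        os.waitpid(child, 0)  # reap the child so no zombie is left
        return child, info.si_pid
    finally:
        # Restore whatever signal mask the caller had.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

On a kernel without the bug the two returned values should match; a si_pid of 0 is the symptom described in the report.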

Predicting the future with data+logistic regression

Derek Jones from The Shape of Code

Predicting the peak of data fitted by a logistic equation is attracting a lot of attention at the moment. Let’s see how well we can predict the final size of a software system, in lines of code, using logistic regression (code+data).

First up is the size of the GNU C library. This is not really a good test, since the peak (or rather a peak) has been reached.

Growth of glibc, in lines, with logistic regression fit

We need a system that has not yet reached an easily recognizable peak. The Linux kernel has been under development for many years, and lots of LOC counts are available. The plot below shows a logistic equation fitted to the kernel data, assuming that the only available data was up to day: 2,900, 3,650, 4,200, and 5,000+. Can you tell which fitted line corresponds to which number of days?

Number of lines in the Linux kernel, by days since release 1, and four fitted logistic regression models.

The underlying ‘problem’ is that we are telling the fitting software to fit a particular equation; the software does what it has been told to do, and fits a logistic equation (in this case).
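To make "fits a logistic equation" concrete, here is a minimal sketch in Python. It is not the R code behind the plots, and it swaps in a simpler technique: for each candidate asymptote K the logit ln(y/(K-y)) is a straight line in t, so a least-squares line gives the rate and midpoint, and the K with the smallest squared error wins. The function names and the candidate grid are assumptions made for illustration.

```python
import math


def logistic(t, K, r, t0):
    """Logistic curve with asymptote K, rate r and midpoint t0."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def fit_line(xs, ys):
    """Ordinary least-squares line ys ~ a + b*xs; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_logistic(ts, ys, k_candidates):
    """Grid search over K; returns the best (K, r, t0) found.

    Requires every y to be positive and at least one candidate K
    larger than max(ys), otherwise the logit is undefined.
    """
    best = None
    for K in k_candidates:
        if K <= max(ys):
            continue  # logit needs every y strictly below K
        zs = [math.log(y / (K - y)) for y in ys]
        a, b = fit_line(ts, zs)  # z = a + b*t = r*(t - t0)
        if b == 0:
            continue  # flat line: no midpoint can be derived
        r, t0 = b, -a / b
        sse = sum((y - logistic(t, K, r, t0)) ** 2 for t, y in zip(ts, ys))
        if best is None or sse < best[0]:
            best = (sse, K, r, t0)
    if best is None:
        raise ValueError("no candidate K exceeds max(ys)")
    return best[1:]
```

Calling fit_logistic on only the first part of a series (say ts[:n], ys[:n] for a few values of n) is a quick way to see how much the estimated K depends on where the data stops. Whatever the data looks like, the function always returns some logistic curve, which is the point made above.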

A cubic polynomial is also a great fit to the existing kernel data (red line to the left of the blue line), and this fitted equation can be extended into the future (to the right of the blue line); dotted lines are 95% confidence bounds. Do any readers believe the future size of the Linux kernel predicted by this cubic model?

Number of distinct silhouettes for a function containing four statements

Predicting the future requires lots of data on the underlying processes that drive events. Modeling events is an iterative process. Build a model, check against reality, adjust model, rinse and repeat.

If the COVID-19 experience trains people to be suspicious of future predictions made by models, it will have done something positive.

On Fruitful Opals – student

student from thus spake a.k.

Recall that the Baron’s game consisted of guessing under which of a pair of cups was to be found a token for a stake of four cents and a prize, if correct, of one. Upon success, Sir R----- could have elected to play again with three cups for the same stake and double the prize. Success at this and subsequent rounds gave him the opportunity to play another round for the same stake again with one more cup than the previous round and a prize equal to that of the previous round multiplied by its number of cups.

COVID-19 Lockdown Blues

Products, the Universe and Everything from Products, the Universe and Everything

A corner of the roof of our office building. Even though the sea is just over 350m away to the left, sadly it's currently off-limits.

2020 is not turning out to be what we expected as - like much of the world - the UK is locked down right now as a result of the COVID-19 pandemic. As such we have been working from home for the past month and only going out for essentials.

That means no ACCU Conference, no free coffee in the office, no impromptu meetings on the beach or on our astroturfed office roof (yes, it does look like that!), no aerial yoga (cue sad face from Anna) but a whole lot of Zoom, configuring VPNs, being thankful for distributed version control systems, kicking off builds remotely and so on.

As far as Riverblade goes, it's rather fortunate that we have been set up for remote working from the outset, so from one point of view the lockdown hasn't come as a big change - although like everyone else we're really missing friends, family...and just simple experiences like going to a cafe at lunchtime or buying an ice cream at the seafront.

Quite frankly it sucks. But you already know all that - and if it saves lives, it is a tiny price to pay. We can only hope that politicians will take heed of the warnings from scientists, nurses, doctors and people who actually know what they are talking about, and that whatever we each endure proves to be enough to stop this virus in its tracks.

Needless to say our thoughts are with everyone touched by this pandemic - but especially with those who have lost loved ones and with anyone working in health and social care.

Be safe, people.
