Grammar checking in 2018

Derek Jones from The Shape of Code

I am a big fan of using tools to find problems quickly. Polishing the draft material for my evidence-based software engineering book, I have been finding an annoying number of grammatical mistakes :-(.

LanguageTool is what I use to check my grammar; it is the best tool of its kind that I know, and supports lots of different languages.

I also have an awk script that looks for new instances of previous mistakes I have made. It rarely flags anything, I seem to be in a continual state of making new grammatical mistakes.

Stung by a recent series is blatant mistakes, I have been searching for a better tool. What did I find:

So, lots of interesting stuff, but nothing better that is usable.

I keep looking at the interesting things that spaCY can do (if you are looking to integrate language processing in your app, spaCY is currently the best language processing library). Does anybody know of grammar checking work being done using spaCY (LanguageTool is based around a parsing engine that is rather long in the tooth now)?

Anybody interested in organizing a grammar checking tool hack day in London?

Experimental Psychology by Robert S. Woodworth

Derek Jones from The Shape of Code

I have just discovered “Experimental Psychology” by Robert S. Woodworth; first published in 1938, I have a reprinted in Great Britain copy from 1951. The Internet Archive has a copy of the 1954 revised edition; it’s a very useful pdf, but it does not have the atmospheric musty smell of an old book.

The Archives of Psychology was edited by Woodworth and contain reports of what look like ground breaking studies done in the 1930s.

The book is surprisingly modern, in that the topics covered are all of active interest today, in fields related to cognitive psychology. There are lots of experimental results (which always biases me towards really liking a book) and the coverage is extensive.

The history of cognitive psychology, as I understood it until this week, was early researchers asking questions, doing introspection and sometimes running experiments in the late 1800s and early 1900s (e.g., Wundt and Ebbinghaus), behaviorism dominants the field, behaviorism is eviscerated by Chomsky in the 1960s and cognitive psychology as we know it today takes off.

Now I know that lots of interesting and relevant experiments were being done in the 1920s and 1930s.

What is missing from this book? The most obvious omission is equations; lots of data points plotted on graph paper, but no attempt to fit an equation to anything, e.g., an exponential curve to the rate of learning.

A more subtle omission is the world view; digital computers had not been invented yet and Shannon’s information theory was almost 20 years in the future. Researchers tend to be heavily influenced by the tools they use and the zeitgeist. Computers as calculators and information processors could not be used as the basis for models of the human mind; they had not been invented yet.

Example of a systemd service file

Andy Balaam from Andy Balaam's Blog

Here is an almost-minimal example of a systemd service file, that I use to run the Mastodon bot of my generative art playground Graft.

I made a dedicated user just to run this service, and installed Graft into /home/graft/apps/graft under that username. Now, as root, I edited a file called /etc/systemd/service/graft.service and made it look like this:

[Service]
ExecStart=/home/graft/apps/graft/bot-mastodon
User=graft
Group=graft
[Install]
WantedBy=multi-user.target

Now I can start the graft service like any other service like this:

sudo systemctl start graft

and find out its status with:

sudo systemctl status graft

If I want it to run on startup I can do:

sudo systemctl enable graft

and it will. Easy!

If I want to look at its output, it’s:

sudo journalctl -u graft

As a reward for reading this far, here’s a little animation you can make with Graft:

Spaceship Operator

Simon Brand from Simon Brand

You write a class. It has a bunch of member data. At some point, you realise that you need to be able to compare objects of this type. You sigh and resign yourself to writing six operator overloads for every type of comparison you need to make. Afterwards your fingers ache and your previously clean code is lost in a sea of functions which do essentially the same thing. If this sounds familiar, then C++20’s spaceship operator is for you. This post will look at how the spaceship operator allows you to describe the strength of relations, write your own overloads, have them be automatically generated, and how correct, efficient two-way comparisons are automatically rewritten to use them.

Relation strength

The spaceship operator looks like <=> and its official C++ name is the “three-way comparison operator”. It is so-called, because it is used by comparing two objects, then comparing that result to 0, like so:

(a <=> b) < 0  //true if a < b
(a <=> b) > 0  //true if a > b
(a <=> b) == 0 //true if a is equal/equivalent to b

One might think that <=> will simply return -1, 0, or 1, similar to strcmp. This being C++, the reality is a fair bit more complex, but it’s also substantially more powerful.

Not all equality relations were created equal. C++’s spaceship operator doesn’t just let you express orderings and equality between objects, but also the characteristics of those relations. Let’s look at two examples.

Example 1: Ordering rectangles

We have a rectangle class and want to define comparison operators for it based on its size. But what does it mean for one rectangle to be smaller than another? Clearly if one rectangle is 1cm by 1cm, it is smaller than one which is 10cm by 10cm. But what about one rectangle which is 1cm by 5cm and another which is 5cm by 1cm? These are clearly not equal, but they’re also not less than or greater than each other. But speaking practically, having all of <, <=, ==, >, and >= return false for this case is not particularly useful, and it breaks some common assumptions at these operators, such as (a == b || a < b || a > b) == true. Instead, we can say that == in this case models equivalence rather than true equality. This is known as a weak ordering.

Example 2: Ordering squares

Similar to above, we have a square type which we want to define comparison operators for with respect to size. In this case, we don’t have the issue of two objects being equivalent, but not equal: if two squares have the same area, then they are equal. This means that == models equality rather than equivalence. This is known as a strong ordering.

Describing relations

Three-way comparisons allow you express the strength of your relation, and whether it allows just equality or also ordering. This is achieved through the return type of operator<=>. Five types are provided, and stronger relations can implicitly convert to weaker ones, like so:

relation conversion

Strong and weak orderings are as described in the above examples. Strong and weak equality means that only == and != are valid. Partial ordering is weaker than weak_ordering in that it also allows unordered values, like NaN for floats.

These types have a number of uses. Firstly they indicate to users of the types what kind of relation is modelled by the comparisons, so that their behaviour is more clear. Secondly, algorithms could potentially have more efficient specializations for when the orderings are stronger, e.g. if two strongly-ordered objects are equal, calling a function on one of them will give the same result as the other, so it would only need to be carried out once. Finally, they can be used in defining language features, such as class type non-type template parameters, which require the type to have strong equality so that equal values always name the same template instantiation.

Writing your own three-way comparison

You can provide a three-way comparison operator for your type in the usual way, by writing an operator overload:

struct foo {
  int i;

  std::strong_ordering operator<=> (foo const& rhs) {
    return i <=> rhs.i;
  }
};

Note that whereas two-way comparisons should be non-member functions so that implicit conversions are done on both sides of the operator, this is not necessary for operator<=>; we can make it a member and it’ll do the right thing.

Sometimes you may find that you want to do <=> on types which don’t have an operator<=> defined. The compiler won’t try to work out a definition for <=> based on the other operators, but you can use std::compare_3way instead, which will fall back on two-way comparisons if there is no three-way version available. For example, we could write a three-way comparison operator for a pair type like so:

template<class T, class U>
struct pair {
  T t;
  U u;

  auto operator<=> (pair const& rhs) const
    -> std::common_comparison_category_t<
         decltype(std::compare3_way(t, rhs.t)),
         decltype(std::compare3_way(u, rhs.u)> {
    if (auto cmp = std::compare3_way(t, rhs.t); cmp != 0) return cmp;
    return std::compare3_way(u, rhs.u);
  }

std::common_comparison_category_t there determines the weakest relation given a number of them. E.g. std::common_comparison_category_t<std::strong_ordering, std::partial_ordering> is std::partial_ordering.

If you found the previous example a bit too complex, you might be glad to know that C++20 will support automatic generation of comparison operators. All we need to do is =default our operator<=>:

auto operator<=>(x const&) = default;

Simple! This will carry out a lexicographic comparison for each base, then member of the type, in order of declaration.

Automatically-rewritten two-way comparisons

Although <=> is very powerful, most of the time we just want to know if some object is less than another, or equal to another. To facilitate this, two-way comparisons like < can be rewritten by the compiler to use <=> if a better match is not found.

The basic idea is that for some operator @, an expression a @ b can be rewritten as a <=> b @ 0. It can even be rewritten as 0 @ b <=> a if that is a better match, which means we get symmetry for free.

In some cases, this can actually provide a performance benefit. Quite often comparison operators are implemented by writing == and <, then writing the other operators in terms of those rather than duplicating the code. This can lead to situations where we want to check <= and end up doing an expensive comparison twice. This automatic rewriting can avoid that cost, since it will only call the one operator<=> rather than both operator< and operator==.

Conclusion

The spaceship operator is a very welcome addition to C++. It gives us more expressiveness in how we define our relations, lets us write less code to define them (sometimes even just a defaulted declaration), and avoids some of the performance pitfalls of manually implementing some comparison operators in terms of others.

For more details, have a look at the cppreference articles for comparison operators and the compare header, and Herb Sutter’s standards proposal. For a good example on how to write a spaceship operator, see Barry Revzin’s post on implementing it for std::optional. For more information on the mathematics of ordering with a C++ slant, see Jonathan Müller’s blog post series on the mathematics behind comparison.

Installing Flarum on Ubuntu 18.04

Andy Balaam from Andy Balaam&#039;s Blog

I am setting up a forum for sharing levels for my game Rabbit Escape, and I have decided to try and use Flarum, because it looks really usable and responsive, has features we need like liking posts and following authors, and I think it will be reasonably OK to write the custom features we want.

So, I want a dev environment on my local Ubuntu 18.04 machine, and the first step to that is a standard install.

Warning: at the time of writing the Flarum docs say it does not work with PHP 7.2, which is what is included with Ubuntu 18.04, so this may not work. (So far it looks OK for me.)

Here’s how I got it working (as far as the web installer stage, anyway):

sudo apt install \
    apache2 \
    libapache2-mod-php \
    mariadb-server \
    php-mysql \
    php-json \
    php-gd \
    php-tokenizer \
    php-mbstring \
    php-curl

php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');"

# Get the neat line from https://getcomposer.org/download/
# Don't copy it exactly!
php -r "if (hash_file('SHA384', 'composer-setup.php') === '544e09ee996cdf60ece3804abc52599c22b1f40f4323403c44d44fdfdd586475ca9813a858088ffbc1f233e9b180f061') { echo 'Installer verified'; } else { echo 'Installer corrupt'; unlink('composer-setup.php'); } echo PHP_EOL;"

mkdir ~/bin
php composer-setup.php --install-dir=~/bin/ --filename=composer
rm composer-setup.php

cd /var/www/html
sudo mkdir flarum
sudo chown $(whoami) flarum

# Log out and in again here to get composer to be in your PATH
cd flarum
composer create-project flarum/flarum . --stability=beta

sudo chgrp -R www-data .
sudo chmod -R 775 .

sudo systemctl restart apache2

Go to http://localhost/flarum in your browser, and follow the instructions there to get set up.

If I get further, I will update this post, including on how to set up the MySQL database.

If you want to find and share levels for Rabbit Escape, check up on our progress setting up the forum at https://artificialworlds.net/rabbit-escape/levels.

nor(DEV):mag Youth in Tech out NOW!

Paul Grenyer from Paul Grenyer

Welcome to our Youth in Tech issue – just in time for the new term! It was no coincidence that our August issue is all about younger developers, but we’ve also included articles about how it is important to never stop learning, even when we might think we know everything!

The world of tech is always evolving, sometimes it feels difficult to keep up with new developments and technologies. It’s important to spend some time investing in ourselves and staff to make sure we keep abreast of this ever-changing sector. We hear from Luminous PR and netmatters about the importance of learning in their articles.

Of course it wouldn’t really be a Youth Issue without the view of an actual young developer, and we knew just the chap. Teenager James kindly gave us his opinion on the best bits of being into technology and where he thinks the future of development will be and student developers Emily and David were also kind enough to give us a developers view of their university project ‘Theia’, which aims to use machine learning to help students.

On the lighter side of things developer Adam sent us an amusing code piece (just try not to spit your tea while reading!) and we were also lucky enough to have an insider view from the recent Apple conference in San Jose, California, so if you’ve ever wondered what actually happens inside Apple, see inside!

Download now: https://www.norfolkdevelopers.com/wp-content/uploads/2018/08/norDEVmagazine-201808-e05.pdf

Impact of team size on planning, when sitting around a table

Derek Jones from The Shape of Code

A recent blog post by Allan Kelly caught my attention; on Monday Allan sent me some comments on the draft of my book and I got to ask for a copy of his data (you don’t need your own software engineering data before sending me comments).

During an Agile training course he gives, Allan runs an exercise based on an extended version of the XP game. The basic points are: people form into teams, a task is announced, teams have to estimate how long it will take them to complete the task and then to plan the task implementation. Allan recorded information on team size, time spent estimating and time spent planning (no information on the tasks, which were straightforward, e.g., fold a paper airplane).

In a recent post I gave a brief analysis of team size on productivity. What does this XP game data have to say about the impact of team size on performance?

We don’t have task information, but we do have two timing measurements for each team. With a bit of suck-it-and-see analysis, I found that the following equation explained 50% of the variance (code+data):

Planning=-4+0.8*TimeSize+5*sqrt{Estimate}

There was some flexibility in the numbers, depending on the method used to build the regression model.

The introduction of each new team member incurs a fixed overhead. Given that everybody is sitting together around a table, this is not surprising; or, perhaps the problem was so simply that nobody felt the need to give a personal response to everything said by everybody else; or, perhaps the exercise was run just before lunch and people were hungry.

I am not aware of any connection between time spent estimating and time spent planning, but then I know almost nothing about this kind of XP game exercise. That square-root looks interesting (an exponent of 0.4 or 0.6 was a slightly less good fit). Thoughts and experiences anybody?

The Rich Get Richer – baron m.

baron m. from thus spake a.k.

Sir R-----! I must say that it is a relief to have the company of a fellow nobleman in these distressing times. That I have had to sell not one, but two of my several hundred antiquities to settle the burden of tax that this oppressive democracy has put upon me, simply to enrich slugabeds I might add, is quite intolerable!

Come, let us drown our sorrows whilst we still have the means to do so and engage in a little sport to raise our spirits.

I have a fancy for a game that I used to play when I was the Russian ambassador to the Rose Tree Valley commune. Founded by the philosopher queen Zway Remington as a haven for downtrodden wealthy industrialists, it was the purest of pure meritocracies; no handouts to the idle labouring classes there!