On The Rich Get Richer – student

student from thus spake a.k.

The Baron's latest wager set Sir R----- the task of surpassing his score before he reached eight points as they each cast an eight-sided die, each adding one point to their score should the roll of their die be less than or equal to their present score. The cost to play for Sir R----- was one coin and he should have had a prize of five coins had he succeeded.

A key observation when figuring the fairness of this wager is that if both Sir R----- and the Baron cast greater than their present scores then the state of play remains unchanged. We may therefore ignore such outcomes, provided that we adjust the probabilities of those that remain to reflect the fact that we have done so.
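To make this concrete (a sketch in my own notation, not the Baron's): write r and b for Sir R-----'s and the Baron's present scores, so that upon each cast Sir R----- gains a point with probability r/8 and the Baron with probability b/8. The state of play is then unchanged with probability

$$\left(1 - \frac{r}{8}\right)\left(1 - \frac{b}{8}\right)$$

and, conditioned upon the state having changed, the probability that only Sir R----- gains a point is

$$\frac{\frac{r}{8}\left(1 - \frac{b}{8}\right)}{1 - \left(1 - \frac{r}{8}\right)\left(1 - \frac{b}{8}\right)}$$

with the outcomes in which only the Baron, or both of them, gain a point renormalised in the same fashion.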

Elm JSON decoder examples

Andy Balaam from Andy Balaam's Blog

I find JSON decoding in Elm confusing, so here are some thoughts and examples.

Setup

$ elm --version
0.19.0
$ mkdir myproj; cd myproj
$ elm init
...
$ elm install elm/json
...

To run the “Demo” parts of the examples below, type them into the interactive Elm interpreter (the REPL). To start it, run:

$ elm repl

and import the library you need:

import Json.Decode as D

Scroll to “Concepts” at the bottom for lots of waffling about what is really going on, but if you’re looking to copy and paste concrete examples, here we are:

Examples

JSON object to Record

type alias MyRecord =
    { i : Int
    , s : String
    }

recordDecoder : D.Decoder MyRecord
recordDecoder =
    D.map2
        MyRecord
        (D.field "i" D.int)
        (D.field "s" D.string)

Demo:

> type alias MyRec = {i: Int, s: String}
> myRecDec = D.map2 MyRec (D.field "i" D.int) (D.field "s" D.string)
<internals> : D.Decoder MyRec
> D.decodeString myRecDec "{\"i\": 3, \"s\": \"bar\"}"
Ok { i = 3, s = "bar" }
    : Result D.Error MyRec

JSON array of ints to List

intArrayDecoder : D.Decoder (List Int)
intArrayDecoder =
    D.list D.int

Demo:

> myArrDec = D.list D.int
<internals> : D.Decoder (List Int)
> D.decodeString myArrDec "[3, 4]"
Ok [3,4] : Result D.Error (List Int)

JSON array of strings to List

stringArrayDecoder : D.Decoder (List String)
stringArrayDecoder =
    D.list D.string

Demo:

> myArrDec2 = D.list D.string
<internals> : D.Decoder (List String)
> D.decodeString myArrDec2 "[\"a\", \"b\"]"
Ok ["a","b"] : Result D.Error (List String)

JSON object to Dict

intDictDecoder : D.Decoder (Dict String Int)
intDictDecoder =
    D.dict D.int

Demo:

> myDictDecoder = D.dict D.int
<internals> : D.Decoder (Dict.Dict String Int)
> D.decodeString myDictDecoder "{\"a\": \"b\"}"
Err (Field "a" (Failure ("Expecting an INT") <internals>))
    : Result D.Error (Dict.Dict String Int)
> D.decodeString myDictDecoder "{\"a\": 3}"
Ok (Dict.fromList [("a",3)])
    : Result D.Error (Dict.Dict String Int)

To build a Dict of String to String, replace D.int above with
D.string.

JSON array of objects to List of Records

type alias MyRecord =
    { i : Int
    , s : String
    }

recordDecoder : D.Decoder MyRecord
recordDecoder =
    D.map2
        MyRecord
        (D.field "i" D.int)
        (D.field "s" D.string)


listOfRecordsDecoder : D.Decoder (List MyRecord)
listOfRecordsDecoder =
    D.list recordDecoder

Demo:

> import Json.Decode as D
> type alias MyRec = {i: Int, s: String}
> myRecDec = D.map2 MyRec (D.field "i" D.int) (D.field "s" D.string)
<internals> : D.Decoder MyRec
> listOfMyRecDec = D.list myRecDec
<internals> : D.Decoder (List MyRec)
> D.decodeString listOfMyRecDec "[{\"i\": 4, \"s\": \"one\"}, {\"i\": 5, \"s\":\"two\"}]"
Ok [{ i = 4, s = "one" },{ i = 5, s = "two" }]
    : Result D.Error (List MyRec)

Concepts

What is a Decoder?

A Decoder is something that describes how to take in JSON and spit out something. The “something” part is written after Decoder, so e.g. Decoder Int describes how to take in JSON and spit out an Int.

The Json.Decode module contains a value that is a Decoder Int. It’s called int:

> D.int
<internals> : D.Decoder Int

In some not-at-all-true way, a Decoder is sort of like a function:

-- This is a lie, but just pretend with me for a sec
Decoder a : SomeJSON -> a
-- That was a lie

To actually run a Decoder, provide it to a function like decodeString:

> D.decodeString D.int "45"
Ok 45 : Result D.Error Int

So the actually-true way of getting an actual function is to combine decodeString and a decoder like int:

> D.decodeString D.int
<function> : String -> Result D.Error Int

When you apply decodeString to int you get a function that takes in a String and returns either an Int or an error. The error could be because the string you passed was not valid JSON:

> D.decodeString D.int "foo bar"
Err (Failure ("This is not valid JSON! Unexpected token o in JSON at position 1") )
    : Result D.Error Int

or because the parsed JSON does not match what the Decoder you supplied expects:

> D.decodeString D.int "\"45\""
Err (Failure ("Expecting an INT") )
    : Result D.Error Int

(We supplied a String containing a JSON string, but the int Decoder expects to find a JSON int.)

Side note: ints and floats are treated as different, even though the JSON Spec treats them all as just “Numbers”:

> D.decodeString D.int "45.2"
Err (Failure ("Expecting an INT") )
    : Result D.Error Int

What is a Value?

Elm has a type that represents JSON that has been parsed (actually, parsed and stored in a JavaScript object) but not interpreted into a useful Elm type. You can make one using the functions inside Json.Encode:

> import Json.Encode as E
> foo = E.string "foo"
<internals> : E.Value

You can even turn one of these into a String containing JSON using encode:

> E.encode 0 foo
"\"foo\"" : String

or interpret the Value as useful Elm types using decodeValue:

> D.decodeValue D.string foo
Ok "foo" : Result D.Error String

(When JSON values come from JavaScript, e.g. via flags, they actually come as Values, but you don’t usually need to worry about that.)

However, what you can’t do is pull Values apart in any way, other than the standard ways Elm gives you. So any custom Decoder that you write has to be built out of existing Decoders.

How do I write my own Decoder?

If you want to make a Decoder that does custom things, build it from the existing magic Decoders, give it a type that describes the type it outputs, and insert your own code using functions like andThen or the mapN functions.

For example, to decode only ints that are below 100:

> under100 i = if i < 100 then D.succeed i else (D.fail "Not under 100")
<function> : number -> D.Decoder number
> intUnder100 = D.int |> D.andThen under100
<internals> : D.Decoder Int
> D.decodeString intUnder100 "50"
Ok 50 : Result D.Error Int
> D.decodeString intUnder100 "500"
Err (Failure ("Not under 100") <internals>)
    : Result D.Error Int

Here, we use the andThen function to transform the Int value coming from the int decoder into a Decoder Int that expresses success or failure in terms of decoding. When we do actual decoding using the decodeString function, this is transformed into the more familiar Result values like Ok or Err.

If you want to understand the above, pay close attention to the types of under100 and intUnder100.

If you want to write a Decoder that returns some complex type, you should build it using the mapN functions.

For example, to decode strings into arrays of words split by spaces:

> splitIntoWords = String.split " "
<function> : String -> List String
> words = D.map splitIntoWords D.string
<internals> : D.Decoder (List String)
> D.decodeString words "\"foo bar baz\""
Ok ["foo","bar","baz"]
    : Result D.Error (List String)

Above we used map to transform a Decoder String (the provided string decoder) into a Decoder (List String), by mapping a function (splitIntoWords) that transforms a String into a List String over it.

Again, to understand this, look carefully at the types of splitIntoWords
and words.

How do I build up complex Decoders?

Complex decoders are built by combining simple ones. Many functions that make decoders take another decoder as an argument. A good example is “JSON array of objects to List of Records” above – there we make a Decoder MyRecord and use it to decode a whole list of records by passing it as an argument to list, so that it returns a Decoder (List MyRecord) which can take in a JSON array of JSON objects, and return a List of MyRecords.

Why is this so confusing?

Because Decoders are not functions, but they feel like functions. In fact they are opaque descriptions of how to interpret JSON that the Elm runtime uses to make Elm objects for you out of Values, which are opaque objects that underneath represent a piece of parsed JSON.

static_assert in templates

Anders Schau Knatten from C++ on a Friday

Quiz time! Which of the following programs can you expect to not compile? For bonus points, which are required by the C++ standard to not compile?

Program 1

struct B{};

template <typename T>
struct A {
//Assume sizeof(B) != 4
static_assert(sizeof(T) == 4);
};

A<B> a;

Program 2

struct B{};

template <typename T>
struct A {
//Assume sizeof(B) != 4
static_assert(sizeof(T) == 4);
};

A<B>* a;

Program 3

struct B{};

template <typename T>
struct A {
//Assume sizeof(int) != 4
static_assert(sizeof(int) == 4);
};

In case you’re not familiar with static_assert, it takes a constant boolean expression, and if it evaluates to false, you get a compilation error. The most basic example is just doing static_assert(false): if you have that anywhere in your program, compilation fails.
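For example (a minimal sketch, assuming a C++17 compiler, since static_assert without a message is a C++17 feature):

static_assert(sizeof(char) == 1);                // always true by definition, so this compiles
static_assert(sizeof(char) == 1, "impossible");  // the message is optional in C++17
// static_assert(false);                         // uncomment this and compilation fails

But let’s start at the beginning: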

Program 1

struct B{};

template <typename T>
struct A {
//Assume sizeof(B) != 4
static_assert(sizeof(T) == 4);
};

A<B> a;

Here we have a class template struct A, which takes a type T as its single template parameter. We then assert that the size of the provided template argument is 4.

We then define a variable a of type A<B>. In order to do that, we need the complete definition of A<B>, so that specialization of A gets implicitly instantiated. In that specialization, sizeof(T) becomes sizeof(B), which is not equal to 4, and compilation fails.

Program 2

struct B{};

template <typename T>
struct A {
//Assume sizeof(B) != 4
static_assert(sizeof(T) == 4);
};

A<B>* a;

This is the exact same problem as in Program 1, except we only define a pointer to A<B>. Does this result in an implicit instantiation? Let’s have a look at [temp.inst] (§17.7.1) ¶1 in the C++17 standard:

Unless a class template specialization has been explicitly instantiated (17.7.2) or explicitly specialized (17.7.3), the class template specialization is implicitly instantiated when the specialization is referenced in a context that requires a completely-defined object type or when the completeness of the class type affects the semantics of the program.

The class template specialization A<B> has not been explicitly instantiated nor explicitly specialized, so the question is then whether it’s implicitly instantiated. We’re only declaring a pointer to it, which doesn’t require a completely-defined object type, so it’s not instantiated. The program compiles just fine.
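To see both cases side by side (a sketch, under the same assumption that sizeof(B) != 4):

struct B{};

template <typename T>
struct A {
static_assert(sizeof(T) == 4);
};

A<B>* p;                  // OK: declaring a pointer doesn't need the complete type
// A<B> a;                // would implicitly instantiate A<B> and fire the assert
// auto n = sizeof(A<B>); // sizeof also needs the complete type, so this would fail too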

Program 3

struct B{};

template <typename T>
struct A {
//Assume sizeof(int) != 4
static_assert(sizeof(int) == 4);
};

In this variation, we’re asserting on the size of int, rather than the size of the template argument. And given the assumption that sizeof(int) != 4, that assertion will always fail. However, we’re never actually instantiating any specialization of A whatsoever. In Program 2, not instantiating the template allowed us to ignore the static_assert. Does the same apply here? In fact, it doesn’t. Let’s have a look at [temp.res] (§17.6) ¶8 in the C++17 standard:

The program is ill-formed, no diagnostic required, if:

[…]

a hypothetical instantiation of a template immediately following its definition would be ill-formed due to a construct that does not depend on a template parameter

The static_assert(sizeof(int) == 4) does not depend on a template parameter, so if we were to instantiate A immediately following its definition, A would always be ill-formed.

So our program is ill-formed, no diagnostic required.

Now what does that mean? Ill-formed is the terminology used by the standard for a program that’s not valid C++, where compilation is required to fail with an error message. Ill-formed, no diagnostic required, however, means our program is not valid C++, but that the compiler isn’t required to let us know. The standard makes no guarantees about the behaviour of the program, i.e. we have undefined behaviour.

So Program 3 has undefined behaviour. In practice however, Clang, gcc and msvc all give a compilation error in this case.

Summary

Program  Because …                           … the standard says  In practice
1        We need the class definition        Compilation error    Compilation error
2        We don’t need the class definition  No error             No error
3        The assertion doesn’t depend on T   Undefined behaviour  Compilation error

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

Students vs. professionals in software engineering experiments

Derek Jones from The Shape of Code

Experiments are an essential component of any engineering discipline. When the experiments involve people, as subjects in the experiment, it is crucial that the subjects are representative of the population of interest.

Academic researchers have easy access to students, but find it difficult to recruit professional developers, as subjects.

If the intent is to generalize the results of an experiment to the population of students, then using students as subjects sounds reasonable.

If the intent is to generalize the results of an experiment to the population of professional software developers, then using students as subjects is questionable.

What is it about students that makes them likely to be very poor subjects to use in experiments designed to learn about the behavior and performance of professional software developers?

The difference between students and professionals is practice and experience. Professionals have spent many thousands of hours writing code and attending meetings discussing the development of software; they have many more experiences of the activities that occur during software development.

The hours of practice reading and writing code give professional developers a fluency that enables them to concentrate on the problem being solved, not on technical coding details. Yes, there are students who have this level of fluency, but most have not spent the many hours of practice needed to achieve it.

Experience gives professional developers insight into what is unlikely to work and what may work. Without experience, students have no way of evaluating the first idea that pops into their head, or a situation presented to them in an experiment.

People working in industry are well aware of the difference between students and professional developers. Every year a fresh batch of graduates start work in industry. The difference between a new graduate and one with a few years’ experience is apparent for all to see. And no, Masters and PhD students are often not much better, and in some cases worse (their prolonged sojourn in academia means they have had more opportunity to pick up impractical habits).

It’s no wonder that people in industry laugh when they hear about the results from experiments based on student subjects.

Just because somebody has “software development” in their job title does not automatically make them an appropriate subject for an experiment targeting professional developers. There are plenty of managers with people skills and minimal technical skills (sub-student level in some cases).

In the software related experiments I have run, subjects were asked how many lines of code they had read/written. The low values started at 25,000 lines. The intent was for the results of the experiments to be generalized to the population of people who regularly wrote code.

Psychology journals are filled with experimental papers that used students as subjects. The intent is to generalize the results to the general population. It has been argued that students are not representative of the general population in that they have spent more time reading, writing and reasoning than most people. These subjects have been labeled as WEIRD (Western, Educated, Industrialized, Rich and Democratic).

I spend a lot of time reading software engineering papers. If a paper involves human subjects, the first thing I do is find out whether the subjects were students (usual) or professional developers (not common). Authors sometimes put effort into dressing up their student subjects as having professional experience (perhaps some of them have spent a year or two in industry, but talking to the authors often reveals that the professional experience was tutoring other students); others say almost nothing about the identity of the subjects. Papers describing experiments using professional developers trumpet this fact in the abstract and throughout the paper.

I usually delete any paper using student subjects; some of the better ones are kept in a subdirectory called students.

Software engineering researchers are currently going through another bout of hand wringing over the use of student subjects. One paper makes the point that a student-based experiment is a good way of validating an experiment that will later involve professional developers. This is a good point, but it ignores the problem that researchers rarely move on to using professional subjects; many researchers only ever intend to run student-based experiments. Also, they publish the results from the student-based experiment, which are at best misleading (but academics get credit for publishing papers, not for the content of the papers).

Researchers are complaining that reviewers are rejecting their papers on student-based experiments. I’m pleased to hear that reviewers are rejecting these papers.

The best or most compiler writers born in February?

Derek Jones from The Shape of Code

Some years ago now, I ran a poll asking about readers’ month of birth and whether they had worked on a compiler. One hypothesis was that the best compiler writers are born in February; an alternative hypothesis is that most compiler writers are born in February.

I have finally gotten around to analyzing the data, and below is the Rose diagram for the 82 compiler writers among the 132 responses (the green arrow shows the direction and magnitude of the mean; code+data):

Rose diagram of birth month of compiler writers

At 15% of responses, February is the most common month for compiler writer birthdays. The percentage increases to 16%, if weighted by the number of births in each month.

So there you have it, the hypothesis that most compiler writers are born in February is rejected, leaving the hypothesis that the best compiler writers are born in February. How could this not be true :-)

What about the birth month of readers who are not compiler writers? While the mean direction and length are more-or-less the same for the two populations, the Rose diagram shows that the shapes of the distributions are different:

Rose diagram of birth month of non-compiler writers

Event: nor(DEV), TechEast & Barclays Eagle Lab present the digital technology showcase

Paul Grenyer from Paul Grenyer




When: Monday, 29th October @ 6.30pm

Where: The King's Centre, King Street, Norwich, NR1 1PH

RSVP: https://www.meetup.com/Norfolk-Developers-NorDev/events/252875683/

Ever thought that technology could help your business but weren’t sure how?

Have you wanted to talk to experts about the advantages of business software but were afraid of being given a hard sell?

Do you have questions about software and technology but were unsure who to ask?

Our Technology Showcase is here to help! Norfolk Developers, along with TechEast and the Barclays Eagle Lab Norwich have put together an evening of talks and introductions to help companies access technology firms on an informal basis. No sales pitches, no pushy sales people, just information and introductions.

The evening will be free from complicated jargon and tech-speak and questions are actively encouraged! This is an evening designed for people who run businesses, not IT experts.

The Technology Showcase is your chance to understand the advantages that technology can offer your business, and an opportunity to pick the brains of the region’s best software companies, without obligation.

The evening will begin with introductions to Norwich’s supportive tech community, who offer support, workshops and discussions to novices and experts alike.


  • SynchNorwich help promote startups and enable local tech businesses to grow.
  • DevelopHER promote equality in technology and host the successful DevelopHER awards, recognising women in technology.
  • Norfolk Developers are a group focussing on practical skills and hardcore software development, and run the annual nor(DEV): conference.
  • Hot Source are a creative community group that aims to promote digital enterprise in our region.


The second half of the evening will give tech companies a chance to explain what they do, how it can benefit you and answer any questions you might have.


  • Candour are a creative digital agency that harnesses the power of open minds and honest feedback to create engaging experiences for users and powerful platforms for business growth.
  • Neon Tribe develop software in three stages, using a ‘Discovery, Development and Delivery’ process to ensure they get the results businesses want first time.
  • Selesti are a digital marketing agency who pride themselves on strategies that help brands achieve extraordinary things. There will be one final featured company.
  • TBC.


We hope you will join us in what promises to be an informative and informal evening, designed to help technology help your business.

Calling Companies – Showcase your Tech at Sync the City 54 Hour Startup event

Paul Grenyer from Paul Grenyer


Sync the City is a great event in which you can promote, share and test your APIs, tools and services. You could provide the vital ingredient that helps the startup teams Build & Launch a Startup in 54 Hours, and win the amazing cash prizes on offer!

Sync the City 2018 is a 54-hour event that brings together local entrepreneurs, developers, business managers, marketing gurus, graphic artists and students to pitch ideas for new startup companies, form teams around those ideas, and work to develop a working prototype, demo & final pitch (to win cash prizes).

More details at syncthecity.com

In the past, we have had companies providing free access to address lookup APIs, customer surveying services, hosting services, sentiment analysis APIs and much more. It’s a great opportunity to publicise your own SDKs / APIs / products to be used at the event, put them to the test and get direct feedback.

If you are interested in getting involved, then email syncthecity@norfolkdevelopers.com.


+!!""

Anders Schau Knatten from C++ on a Friday

In this Tweet, @willkirkby posts:

+!!""

that evaluates as 1 in C/C++, but no, JavaScript is the weird language

C++ is indeed weird, or at least it’s very weakly typed. Let’s go through all the details of what’s going on here:

+!!""

Summary:

Starting from the right, "" is a string literal, which gets converted to a pointer, which again gets converted to a bool with the value true. This then gets passed to two operator!s, which flip it to false and back to true again. Finally, operator+ converts the bool true to the int 1.
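The whole chain can be checked with the compiler (a sketch, assuming C++17 for std::is_same_v):

#include <type_traits>

static_assert(std::is_same_v<decltype(""), const char(&)[1]>); // a string literal is an lvalue array
static_assert(std::is_same_v<decltype(!!""), bool>);           // after the two negations
static_assert(std::is_same_v<decltype(+!!""), int>);           // after the promotion done by operator+
static_assert(+!!"" == 1);                                     // and the value is indeed 1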

All of this happens behind our backs, so to speak, as C++ is very eager to do conversions we didn’t explicitly ask it to do. This eagerness to do implicit conversions is by the way why you should always mark your single argument constructors and your converting operators explicit.
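As an illustration of that advice (a hypothetical Celsius type, made up for the example):

struct Celsius {
    explicit Celsius(double degrees) : degrees(degrees) {}
    double degrees;
};

void setTarget(Celsius) {}

int main() {
    // setTarget(21.0);       // error: explicit blocks the implicit double -> Celsius conversion
    setTarget(Celsius{21.0}); // OK: the conversion is spelled out
}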

Detailed explanation:

Now let’s go through this in detail, quoting the C++17 standard. Starting from the right:

"" is a string literal. [lex.string]¶8:

A narrow string literal has type “array of n const char”

Then comes operator!. We have an array of n const char, can we use that for operator!? [expr.unary.op]¶9 says:

The operand of the logical negation operator ! is contextually converted to bool (Clause 7); its value is true if the converted operand is false and false otherwise. The type of the result is bool.

So we need to contextually convert our array of n const char to bool. [conv]¶5 says:

Certain language constructs require that an expression be converted to a Boolean value. An expression e appearing in such a context is said to be contextually converted to bool and is well-formed if and only if the declaration bool t(e); is well-formed, for some invented temporary variable t.

So let’s see where bool t(e); takes us when e is an array of n const char. [conv]¶2:

expressions with a given type will be implicitly converted to other types in several contexts:

  • When used as the source expression for an initialization

In our case, the expression is of type “array of n const char”, and we need a bool. We’re going to need a standard conversion sequence. [conv]¶1:

A standard conversion sequence is a sequence of standard conversions in the following order:

  • (1.1) Zero or one conversion from the following set: lvalue-to-rvalue conversion, array-to-pointer conversion, and function-to-pointer conversion.
  • (1.2) Zero or one conversion from the following set: integral promotions, floating-point promotion, integral conversions, floating-point conversions, floating-integral conversions, pointer conversions, pointer to member conversions, and boolean conversions. […]

So we can first use an array-to-pointer conversion (1.1) to get from “array of n const char” to a pointer. We can then use a boolean conversion (1.2) to get from pointer to bool.

First, the array-to-pointer conversion, [conv.array]¶1:

An lvalue or rvalue of type “array of N T” […] can be converted to a prvalue of type “pointer to T”. The result is a pointer to the first element of the array.

So we now have a pointer to the first element (the terminating \0).

And then the boolean conversion [conv.bool]¶1:

A prvalue of arithmetic, unscoped enumeration, pointer, or pointer to member type can be converted to a prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true.

Since our pointer is not a null pointer, its value converts to true.

Negating this twice with two operator!s is trivial; we end up back at true.

Finally, true is passed to operator+. [expr.unary.op]¶7:

The operand of the unary + operator shall have arithmetic, unscoped enumeration, or pointer type and the result is the value of the argument.

That same paragraph goes on to say that “Integral promotion is performed on integral or enumeration operands”, so the bool true gets promoted before being passed to operator+. [conv.prom]¶6:

A prvalue of type bool can be converted to a prvalue of type int, with false becoming zero and true becoming one.

So the bool true becomes the int 1, and we’re done.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

Facebook’s Big Code Summit

Derek Jones from The Shape of Code

I was at Facebook’s first Big Code Summit on Monday and Tuesday (I say the first, because I hope there is another one next year).

The talks all involved machine learning (to be expected, given the Big Code in the event’s title). Normally I ignore papers on machine learning in software engineering, but understanding code is hard and we don’t know much about it. As I keep telling anyone who will listen, machine learning is the tool to use when you don’t know what you are doing (provided you have enough data).

People have been learning code patterns for some time now, suggesting applications in code completion in the IDE and finding suspicious API sequences (e.g., a missing call). This is one area where machine learning is a natural solution: nobody has the time to write down all the common patterns, for all the common languages, and APIs are constantly changing. It makes no sense to solve this problem manually.

So what was new and/or interesting?

We got new and very interesting in the first talk, when Eran Yahav presented his group’s work on code2vec; the paper was interesting, but the demo had the wow factor.

I have not made up my mind about Michael Pradel’s proposal for learning new coding rule checks. These rules are often created by people, but people with the necessary skill are thin on the ground. Machine learning requires something to learn from; how could coding rules be created this way? Michael’s group is working on a system where developers create the positive and negative cases and a machine learner figures out rules from these examples. Would the creation of these positive/negative examples prove to be just as hard as writing rules? I was not convinced that such an approach was practical, but if somebody wants to try it out, why not.

I found Xinyun Chen’s talk interesting, but then I’ve written lots of parsers, and automatically figuring out how to parse a language from examples will always get my attention. A few people in the audience thought that a better solution was typing in a grammar and parsing the ‘usual’ way. This approach assumes a grammar exists and can be strong-armed into a form that is practical to embed in a parser (requiring somebody skilled in the necessary black arts), to produce a system that will only process complete translation units (or whatever the language calls a unit of translation).