First language taught to undergraduates in the 1990s

Derek Jones from The Shape of Code

The average new graduate is likely to do more programming during the first month of a software engineering job than they did during a year as an undergraduate. Programming courses for undergraduates are really about filtering out those who cannot code.

Long, long ago, when I had some connection to undergraduate hiring, around 70-80% of those interviewed for a programming job could not write a simple 10-20 line program; I’m told that this is still true today. Fluency in any language (computer or human) takes practice, and the typical undergraduate gets very little practice (there is no reason why they should, there are lots of activities on offer to students and programming fluency is not needed to get a degree).

There is lots of academic discussion around which language students should learn first, and what languages they should be exposed to. I have always been baffled by the idea that there was much to be gained by spending time teaching students multiple languages, when most of them barely grasp the primary course language. When I was at school the idea behind the trendy new maths curriculum was to teach concepts, rather than rote learning (such as algebra; yes, rote learning of the rules of algebra); the concept of number-base was considered to be a worthwhile concept and us kids were taught this concept by having the class convert values back and forth, such as base-10 numbers to base-5 (base-2 was rarely used in examples). Those of us who were good at maths instantly figured it out, while everybody else was completely confused (including some teachers).
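That number-base exercise is easy to mechanise; here is a minimal sketch (Python, purely for illustration) of the repeated-division method us kids were applying by hand:

```python
def to_base(n, base):
    """Convert a non-negative base-10 integer into a digit string in the given base."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % base))  # each remainder is the next least-significant digit
        n //= base
    return "".join(reversed(digits))

print(to_base(1992, 5))  # 1992 = 3*625 + 0*125 + 4*25 + 3*5 + 2, i.e., "30432" in base-5
```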

My view is that there is no major teaching/learning impact on the choice of first language; it is all about academic fashion and marketing to students. Those who have the ability to program will just pick it up, and everybody else will flounder and do their best to stay away from it.

Richard Reid was interested in knowing which languages were being used to teach introductory programming to computer science and information systems majors. Starting in 1992, he contacted universities roughly twice a year, asking about the language(s) used to teach introductory programming. The Reid list (as it became known) was regularly updated until Reid retired in 1999 (the average number of universities included in the list was over 400); one of Reid’s ex-students, Frances VanScoy, took over until 2006.

The plot below is from 1992 to 2002, and shows languages in the top 3% in any year (code+data):

[Plot: languages appearing in the top 3% of the Reid list in any year, 1992-2002.]

Looking at the list again reminded me how widespread Pascal was as a teaching language. Modula-2 was the language that Niklaus Wirth designed as the successor to Pascal, and Ada was intended to be the grown-up Pascal.

While there is plenty of discussion about which language to teach first, doing this teaching is a low status activity (there is more fun to be had with the material taught to the final year students). One consequence is a lack of any real incentive to spend time changing the course (e.g., switching to a new language). The Open University continued teaching Pascal for years because material had been printed and had to be used up.

C++ took a while to take off because of its association with C (which was very out of fashion in academia), and Java was still too new to risk exposing to impressionable first-years.

The totals for each language listed between 1992 and 2002 include a few names that might not be familiar to readers.

          Ada    Ada/Pascal          Beta          Blue             C 
         1087             1            10             3           667 
       C/Java      C/Scheme           C++    C++/Pascal        Eiffel 
            1             1           910             1            29 
      Fortran       Haskell     HyperTalk         ISETL       ISETL/C 
          133            12             2            30             1 
         Java  Java/Haskell       Miranda            ML       ML/Java 
          107             1            48            16             1 
     Modula-2      Modula-3        Oberon      Oberon-2     ObjPascal 
          727            24            26             7            22 
       Orwell        Pascal      Pascal/C        Prolog        Scheme 
           12          2269             1            12           752 
    Scheme/ML Scheme/Turing        Simula     Smalltalk           SML 
            1             1            14            33            88 
       Turing  Visual-Basic 
           71             3 

I had never heard of Orwell, a vanity language foisted on Oxford Mathematics and Computation students. It used to be common for someone in computing departments to foist their vanity language on students; it enabled them to claim the language was being used and stoked their ego. Is there some law that enables students to sue for damages?

The 1990s was still in the shadow of the 1980s fashion for functional programming (which came back into fashion a few years ago). Miranda was an attempt to commercialize a functional language compiler, with Haskell being an open source reaction.

I was surprised that Turing was so widely taught. More to do with the stature of where it came from (the University of Toronto) than anything else.

Fortran was my first language, and is still widely used where high performance floating-point is required.

ISETL is a very interesting language, a descendant of the 1960s language SETL, that never really attracted much attention outside of New York. I suspect that Blue is BlueJ, a Java IDE targeting novices.

In the beginning there is architecture, well maybe

Allan Kelly from Allan Kelly Associates


I was on a panel last year when someone asked that old chestnut:

“Surely before you start coding anything you need to design an architecture?”

As regular readers might guess I took a very simple stand:

“No, you need to get something that works, you need to show you can address the problem in a way that delivers business benefit. You can retrofit architecture when you have shown that your thing is useful and valuable. Sure spend some time thinking about design and architecture but this should be measured in hours not days. As the system grows return to these discussions but keep them to hours not days. Embrace emergent design and refactoring.”

Another panellist disagreed:


In 2018 you can tell someone is about to disagree with you because they don’t start with “No” or “But”; they start with “And”

“Companies have databases, and data centres, and standards, and you need to spend some time thinking about them. You need to integrate with existing systems. Changing database schemas, let alone database application servers, is big work and you want to avoid it if you can.”

OK, I can’t remember his exact words, but that is the tone I recall. I was told I was wrong.

Despite all those very real concerns I stand by my position. OK, maybe I need to elaborate a bit so here goes…

The thing is, I have nothing against architecture. When you build a software system you will – before very long – have an architecture. It may not be a very good one but you will have one.

I encourage all developers to study software design and architecture. Coders make hundreds, even thousands, of decisions every day when they are coding, and there is an architectural angle to every decision. Coders need to be versed in good design practices so they make good decisions.

Nor do I have anything against guiding the architecture in such a way that it becomes better with time – indeed I think that is a necessity. Neither do I disagree with architectural conventions in a system or documentation, again they become more valuable with time.

What I’m not in favour of is: believing that you can set it from the start. I believe the more you try to predetermine the more time you will waste.

Architecture is nine-tenths what you have, and one-tenth what you think you have, or should have. Architecture exists in the minds and shared models of those who maintain the system as much as it does in the code.

There are some occasions when there is a completely blank sheet of paper to start with. New start-ups obviously spring to mind, but even inside a large corporation there are occasions when it happens; say, when the company wants to experiment with new technologies and approaches, or hedge options. There are two priorities here:

  • Be different, differentiate the solution from what already exists
  • Prove that this new solution delivers something of business benefit

So: pick the solutions that seem right to you now – that’s “you” plural, you and the team you are working with.

Maybe spend a little time considering your options (Oracle? MySQL? Couchbase? Mongo?) but don’t spend too long, hours not days. Accept that you will learn when you start using your chosen option, and accept that you may change it, or that you may make a bad decision and be cursing your decision for ever more.

The thing is: all that time spent designing the architecture, researching options and making decisions is wasted if you do not deliver business benefit. If nobody uses it then having the perfect product, perfect architecture, is pointless.

Anyway, even if you spend months and agree the perfect architecture, it will not survive. Once you start work you will find assumptions invalidated, you will find things don’t work as you expected, and you will find that the cheap code monkeys you hired to implement your wonderful design do things you didn’t expect.

Now consider the case where you start a new initiative in an existing environment. There are existing databases and systems to work with. Again you don’t need to spend a lot of time looking at options because you don’t have them. Instead you have constraints.

If the company standard is Oracle for databases you are going to use Oracle.
If you need to integrate with SalesForce then you need to integrate with SalesForce.

You might believe that Couchbase is a better solution than Oracle from the word go. In which case you can either:

  • Accept that right now Oracle is the standard and use it even if it is an inferior solution. In time, if you demonstrate business benefit then you will be in a stronger position to change. Or you might find that even an inferior solution is good enough.


  • Argue your case and hope to win: if you don’t win then you are back to the previous option, if you do then great.

What I would not do is: embark on a lengthy examination of all the possible options with the aim of deciding which is best. This takes time and means you will deliver later, which means cost-of-delay will reduce the value you deliver. Sure, the great design might generate slightly more business benefit, or get slightly more performance from your development team, but you are also delaying.

If you start with a good-enough solution you will learn. That learning may change your opinion, or it might confirm your initial opinion.

Remember: You are not building the ideal system. You are building the best system you can within constraints. The best system you can with the time and money available. In an existing environment many of those constraints are usually given before you start.

Unfortunately, we development folk tend to limit options thinking and exploring to the initial stages of a development effort – usually before any code is written. Once work is underway we don’t consider options often enough; we tend to jump at the first solution we think of.



Want to be the coauthor of a prestigious book? Send me your bid

Derek Jones from The Shape of Code

The corruption that pervades the academic publishing system has become more public.

There is now a website that makes use of an ingenious technique for helping people increase their paper count (as might be expected, competitive China thought of it first). Want to be listed as the first author of a paper? Fees start at $500. The beauty of the scheme is that the papers have already been accepted by a journal for publication, so the buyer knows exactly what they are getting. Paying to be included as an author before the paper is accepted incurs the risk that the paper might not be accepted.

Measurement of academic performance is based on number of papers published, weighted by the impact factor of the journal in which they are published. Individuals seeking promotion and research funding need an appropriately high publication score; the ranking of university departments is based on the publications of its members. The phrase publish or perish aptly describes the process. As expected, with individual careers and departmental funding on the line, the system has become corrupt in all kinds of ways.

There are organizations that will publish your paper for a fee, 100% guaranteed, and you can even attend a scam conference (that’s not how the organizers describe them). Problem is, word gets around, and the weighting given to publishing in such journals is very low (or it should be; not all of them get caught).

The horror being expressed at this practice is driven by the fact that money is changing hands. Adding a colleague as an author (on the basis that they will return the favor later) is accepted practice; tacking your supervisor’s name on to the end of the list of authors is standard practice, irrespective of any contribution they might have made (how else would a professor accumulate 100+ published papers?).

I regularly receive emails from academics telling me they would like to work on this or that with me. If they look like proper researchers, I am respectful; if they look like an academic paper mill, my reply points out (subtly or otherwise) that their work is not of high enough standard to be relevant to industry. Perhaps I should send them a quote for having their name appear on a paper written by me (I don’t publish in academic journals, so such a paper is unlikely to have much value in the system they operate within); it sounds worth doing just for the shocked response.

I read lots of papers, and usually ignore the list of authors. If it looks like there is some interesting data associated with the work, I email the first author, and will only include the other authors in the email if I am looking to do a bit of marketing for my book or the paper is many years old (so the first author is less likely to have the data).

I continue to be amazed at the number of people who continue to strive to do proper research in this academic environment.

I wonder how much I might get by auctioning off the coauthorship of my software engineering book?

Visual Lint has been released

Products, the Universe and Everything from Products, the Universe and Everything

This is a recommended maintenance update for Visual Lint 7.0. The following changes are included:

  • Added a +rw(noexcept) directive to the Visual Studio 2015 and 2017 PC-lint 9.0 compiler indirect files co-msc140.lnt and co-rb-vs2017.lnt respectively. This allows code which uses the noexcept keyword to be analysed more cleanly.
    As PC-lint 9.0 does not officially support modern C++, other analysis errors may still occur when you analyse modern C++ code with it. As such, we recommend that you update to PC-lint Plus to analyse code for modern C++ compilers such as GCC 4.x/5.x or Visual Studio 2013/15/17/19. Please contact us for details.

  • Fixed a bug in the installer which could prevent the Visual Studio plug-in from being installed to Visual Studio 2019.

  • Fixed a bug in the VisualLintGui Products Display "Add Existing File" context menu command.

  • Fixed a bug in the update check process which could result in incorrect text being displayed for a major update (e.g. to Visual Lint 7.x) [also in Visual Lint].


On The Hydra Of Argos – student

student from thus spake a.k.

When the Baron last met with Sir R-----, he proposed a wager which commenced with the placing of twenty black tokens and fifteen white tokens in a bag. At each turn Sir R----- was to draw a token from the bag and then put it and another of the same colour back inside until there were thirty tokens of the same colour in the bag, with the Baron winning a coin from Sir R----- if there were thirty black and Sir R----- winning ten coins from the Baron if there were thirty white.
Upon hearing these rules I recognised that they described the classic probability problem known as Pólya's Urn. I explained to the Baron that it admits a relatively simple expression that governs the likelihood that the bag contains given numbers of black and white tokens at each turn which could be used to figure the probability that he should have triumphed, but I fear that he didn't entirely grasp my point.
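For readers who want to check the Baron’s odds for themselves, the game admits an exact calculation by recursing over the bag’s possible states; a short sketch (in Python; the 20/15 starting counts and the thirty-token target come from the rules above):

```python
from functools import lru_cache

TARGET = 30  # the wager ends when either colour reaches thirty tokens

@lru_cache(maxsize=None)
def p_white_wins(black, white):
    """Probability the bag reaches TARGET white tokens before TARGET black ones,
    when each drawn token is returned along with another of the same colour."""
    if white == TARGET:
        return 1.0
    if black == TARGET:
        return 0.0
    total = black + white
    return (black / total) * p_white_wins(black + 1, white) \
         + (white / total) * p_white_wins(black, white + 1)

p = p_white_wins(20, 15)         # Sir R-----'s chance of thirty white
expected = p * 10.0 - (1.0 - p)  # his expected winnings: ten coins won against one lost
```

Starting from equal counts, symmetry gives a probability of exactly one half, which makes a handy sanity check on the recursion.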

Decision trees for feature selection

Fran from BuontempoConsulting

I asked Twitter who is using decision trees, and what for. Most were using them, unsurprisingly, to make decisions. It wasn't always clear how the trees themselves were built.

If you are armed with data, so that each row has some features and a category - either yes/no, or one of many classes - you can build a classifier from the data. There are various ways to decide how to split up the data. Nonetheless, each algorithm follows the same overall process: start with a root node containing all the data, and add child nodes each holding part of the data.


  1. Pick a feature
  2. Split the data set, some to the left branch, some to the other branch (or branches) depending on the value of the feature
  3. If all the data at a node is in the same category (or almost all in the same category) form a leaf node
  4. Continue until each node is a leaf node.
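The four steps above can be sketched in a few lines; here is a toy recursive splitter (Python) using Gini impurity as one possible way to pick a feature - the data layout (rows as feature dicts) is just an illustrative choice:

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 when every label agrees, approaching 1.0 when mixed."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, features):
    """Steps 1-4 above: rows are dicts of feature -> value, labels the categories."""
    # Steps 3/4: all labels agree (or no features left) -> leaf with the majority label
    if len(set(labels)) == 1 or not features:
        return Counter(labels).most_common(1)[0][0]

    # Step 1: pick the feature whose split leaves the purest branches
    def split_score(f):
        branches = {}
        for row, label in zip(rows, labels):
            branches.setdefault(row[f], []).append(label)
        return sum(len(ls) / len(labels) * gini(ls) for ls in branches.values())

    best = min(features, key=split_score)

    # Step 2: split the data by the chosen feature's values, one branch per value
    node = {}
    for value in {row[best] for row in rows}:
        kept = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node[value] = build_tree([r for r, _ in kept], [l for _, l in kept],
                                 [f for f in features if f != best])
    return (best, node)
```

Classifying a new row is then a walk down the tree: look up the row's value for the node's feature and recurse until a leaf (a plain category label) is reached.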

This is a bit like a sorting algorithm: in quick sort, you choose a pivot value and split the data down one branch or the other, until you have single points at nodes. Here we don't choose a pivot value but features. The way to pick a feature can be based on statistics, information theory or even at random. At each step, you want to know if all the items in one category tend to have the same value or range of values of a feature. Once you are done you have a tree (or flow chart) you can apply to new data. Each way to split has various pros and cons. You can even build several trees. A random forest will build lots of trees and they vote on the class of new, unseen data. You could build your own voting system, using a variety of tree induction techniques. This might avoid some specific problems, like over-fitting from some techniques.

You can use decision tree induction in a variety of places, even if you don't want a full decision tree or ruleset. A rule set is a tree written out as a sequence of if statements. For example,

If currency is USD then data goes missing.

If you are moving a data source from one provider to another, and some data goes missing, can you spot what the missing items have in common? You could do a bit of manual investigation, say using pivot tables and data filters in Excel. However, a decision tree might find common features far more quickly than you can. This is a form of feature selection - using an algorithm to find pertinent features. Find a decision tree implementation in a language you know, or write one yourself, and have an experiment.
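For the missing-data investigation described above, you do not even need to grow a full tree; scoring each feature's split on its own already ranks the features. A small sketch (Python; the currency/provider columns are made-up illustrations of the migration example):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 when every label agrees, higher when mixed."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def feature_scores(rows, labels):
    """Weighted impurity after splitting on each feature; lower = more predictive."""
    scores = {}
    for feature in rows[0]:
        branches = {}
        for row, label in zip(rows, labels):
            branches.setdefault(row[feature], []).append(label)
        scores[feature] = sum(len(ls) / len(labels) * gini(ls)
                              for ls in branches.values())
    return scores

# Hypothetical migration data: which rows went missing after the move?
rows = [{"currency": "USD", "provider": "A"}, {"currency": "USD", "provider": "B"},
        {"currency": "GBP", "provider": "A"}, {"currency": "GBP", "provider": "B"}]
labels = ["missing", "missing", "ok", "ok"]

scores = feature_scores(rows, labels)  # currency scores 0.0: it fully explains the losses
```

Here "If currency is USD then data goes missing" falls straight out: the currency split leaves perfectly pure branches, while the provider split explains nothing.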

My book, Genetic Algorithms and Machine Learning for Programmers, has a chapter explaining how to build one type of decision tree. Take a look. Lots of machine learning frameworks also have tools to help you build decision trees. Next time you want to know what certain things have in common, try a decision tree and see what you learn. Machine learning is often about humans learning, rather than the machines.

2019 in the programming language standards’ world

Derek Jones from The Shape of Code

Last Tuesday I was at the British Standards Institute for a meeting of IST/5, the committee responsible for programming language standards in the UK.

There has been progress on a few issues discussed last year, and one interesting point came up.

It is starting to look as if there might be another iteration of the Cobol Standard. A handful of people, in various countries, have started to nibble around the edges of various new (in the Cobol sense) features. No, the INCITS Cobol committee (the people who used to do all the heavy lifting) has not been reformed; the work now appears to be driven by people who cannot let go of their involvement in Cobol standards.

ISO/IEC 23360-1:2006, the ISO version of the Linux Base Standard, has been updated and we were asked for a UK position on the document being published. Abstain seemed to be the only sensible option.

Our WG20 representative reported that the ongoing debate over pile of poo emoji has crossed the chasm (he did not exactly phrase it like that). Vendors want to have the freedom to specify code-points for use with their own emoji, e.g., pineapple emoji. The heady days, of a few short years ago, when an encoding for all the world’s character symbols seemed possible, have become a distant memory (the number of unhandled logographs on ancient pots and clay tablets was declining rapidly). Who could have predicted that the dream of a complete encoding of the symbols used by all the world’s languages would be dashed by pile of poo emoji?

The interesting news is from WG9. The document intended to become the Ada20 standard was due to enter the voting process in June, i.e., the committee considered it done. At the end of April the main Ada compiler vendor asked for the schedule to be slipped by a year or two, to enable them to get some implementation experience with the new features; oops. I have been predicting that in the future language ‘standards’ will be decided by the main compiler vendors, and the future is finally starting to arrive. What is the incentive for the GNAT compiler people to pay any attention to proposals written by a bunch of non-customers (ok, some of them might work for customers)? One answer is that Ada users tend to be large bureaucratic organizations (e.g., the DOD), who like to follow standards, and might fund GNAT to implement the new document (perhaps this delay by GNAT is all about funding, or lack thereof).

Right on cue, C++ users have started to notice that C++20 added support for a system header named version, which conflicts with the common existing practice of using a file called version to hold versioning information; a problem if the header search path used by the compiler includes a project’s top-level directory (which is where the versioning file version often sits). So the WG21 committee decides on what it thinks is a good idea, implementors implement it, and users complain; implementors now have a good reason not to follow a requirement in the standard, to keep users happy. Will WG21 be apologetic, or get all high and mighty? We will have to wait and see.

Building an all-in-one Jar in Gradle with the Kotlin DSL

Andy Balaam from Andy Balaam's Blog

To build a “fat” Jar of your Java or Kotlin project that contains all the dependencies within a single file, you can use the shadow Gradle plugin.

I found it hard to find clear documentation on how it works using the Gradle Kotlin DSL (with a build.gradle.kts instead of build.gradle) so here is how I did it:

$ cat build.gradle.kts
import com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar

plugins {
    kotlin("jvm") version "1.3.41"
    id("com.github.johnrengelman.shadow") version "5.1.0"
}

repositories {
    mavenCentral()
}

dependencies {
    implementation(kotlin("stdlib"))
}

tasks.withType<ShadowJar>() {
    manifest {
        attributes["Main-Class"] = "HelloKt"
    }
}

$ cat src/main/kotlin/Hello.kt
fun main() {
    println("Hello!")
}

$ gradle wrapper --gradle-version 5.5
1 actionable task: 1 executed

$ ./gradlew shadowJar
2 actionable tasks: 2 executed

$ java -jar build/libs/hello-all.jar
Hello!

Complexity is a source of income in open source ecosystems

Derek Jones from The Shape of Code

I am someone who regularly uses R, and my interest in programming languages means that, on a semi-regular basis, I spend time reading blog posts about the language. Over the last year or so, I have noticed several patterns of behavior, and after reading a recent blog post things started to make sense (the blog post gets a lot of things wrong, but more of that later).

What are the patterns that have caught my attention?

Some background: Hadley Wickham is the guy behind some very useful R packages. Hadley was an academic, and is now the chief scientist at RStudio, the company behind the R language specific IDE of the same name. As Hadley’s thinking about how to manipulate data has evolved, he has created new packages, and has been very prolific. The term Hadley-verse was coined to describe an approach to data manipulation and program structuring, based around use of packages written by the man.

For the last nine months I have noticed that the term Tidyverse is being used more regularly to describe what had been the Hadley-verse. And???

Another thing that has become very noticeable, over the last six months, is the extent to which a wide range of packages now have dependencies on packages in the Tidyverse. And???

A recent post by Norman Matloff complains about the Tidyverse’s complexity (and about the consistency between its packages, which I had always thought was a good design principle), and how RStudio’s promotion of the Tidyverse could result in it becoming the dominant R world view. Matloff has an academic world view and misses what is going on.

RStudio, the company, need to sell their services (their IDE is clunky and will be wiped out if a top-of-the-range product, such as JetBrains, adds support for R). If R were simple to use, companies would have less need to hire external experts. A widely used, complicated library of packages is a god-send for a company looking to sell R services.

I don’t think Hadley Wickham intentionally made things complicated, any more than the creators of the Microsoft server protocols added interdependencies to make life difficult for competitors.

A complex package ecosystem was probably not part of RStudio’s product vision, at least for many years. But sooner or later, RStudio management will have realised that simplicity and ease of use is not in their interest.

Once a collection of complicated packages exist, it is in RStudio’s interest to get as many other packages using them, as quickly as possible. Infect the host quickly, before anybody notices; all the while telling people how much the company is investing in the community that it cares about (making lots of money from).

Having this package ecosystem known as the Hadley-verse gives too much influence to one person, and makes it difficult to fire him later. Rebranding as the Tidyverse solves these problems.

Matloff accuses RStudio of monopoly behavior, I would have said they are fighting for survival (i.e., creating an environment capable of generating the kind of income a VC funded company is expected to make). Having worked in language environments where multiple, and incompatible, package ecosystems existed, I can see advantages in there being a monopoly. Matloff is also upset about a commercial company swooping in to steal their precious, a common academic complaint (academics swooping in to steal ideas from commercially developed software is, of course, perfectly respectable). Matloff also makes claims about teachability of programming that are not derived from any experimental evidence, but then everybody makes claims about programming languages without there being any experimental evidence.

RStudio management rode in on the data science wave, raising money from VCs. The wave is subsiding and they now need to appear to have a viable business (so they can be sold to a bigger fish), which means there has to be a visible market they can sell into. One way to sell in an open source environment is for things to be so complicated, that large companies will pay somebody to handle the complexity.