May 2019 – Page 2 – ACCU World of Code

I like Lispy languages. One Iâ€™ve been playing with â€“ and occasionally been using for smaller projects â€“ is Clojure. Clojure projects usually use Leiningen for their build system. There are generally two ways to install leiningen â€“ just download the script as per the Leiningen web site, or use the OS package manager. I usually prefer using the OS package manager, but Manjaro doesnâ€™t include leiningen as a package in its repositories. Installing leiningen is pretty easy via the package manager and Iâ€™ll show you how.

May 4, 2019

C Standard meeting, April-May 2019

Derek Jones from The Shape of Code

I was at the ISO C language committee meeting, WG14, in London this week (apart from the few hours on Friday morning, which was scheduled to be only slightly longer than my commute to the meeting would have been).

It has been three years since the committee last met in London (the meeting was planned for Germany, but there was a hosting issue, and Germany are hosting next year), and around 20 people attended, plus 2-5 people dialing in. Some regular attendees were not in the room because of schedule conflicts; nine of those present were in London three years ago, and I had met three of those present (this week) at WG14 meetings prior to the last London meeting. I had thought that Fred Tydeman was the longest serving member in the room, but talking to Fred I found out that I was involved a few years earlier than him (our convenor is also a long-time member); Fred has attended more meeting than me, since I stopped being a regular attender 10 years ago. Tom Plum, who dialed in, has been a member from the beginning, and Larry Jones, who dialed in, predates me. There are still original committee members active on the WG14 mailing list.

Having so many relatively new meeting attendees is a good thing, in that they are likely to be keen and willing to do things; it’s also a bad thing for exactly the same reason (i.e., if it not really broken, don’t fix it).

The bulk of committee time was spent discussing the proposals contains in papers that have been submitted (listed in the agenda). The C Standard is currently being revised, WG14 are working to produce C2X. If a person wants the next version of the C Standard to support particular functionality, then they have to submit a paper specifying the desired functionality; for any proposal to have any chance of success, the interested parties need to turn up at multiple meetings, and argue for it.

There were three common patterns in the proposals discussed (none of these patterns are unique to the London meeting):

change existing wording, based on the idea that the change will stop compilers generating code that the person making the proposal considers to be undesirable behavior. Some proposals fitting this pattern were for niche uses, with alternative solutions available. If developers don’t have the funding needed to influence the behavior of open source compilers, submitting a proposal to WG14 offers a low cost route. Unless the proposal is a compelling use case, affecting lots of developers, WG14’s incentive is to not adopt the proposal (accepting too many proposals will only encourage trolls),
change/add wording to be compatible with C++. There are cost advantages, for vendors who have to support C and C++ products, to having the two language be as mutually consistent as possible. Embedded systems are a major market for C, but this market is not nearly as large for C++ (because of the much larger overhead required to support C++). I pointed out that WG14 needs to be careful about alienating a significant user base, by slavishly following C++; the C language needs to maintain a separate identity, for long term survival,
add a new function to the C library, based on its existence in another standard. Why add new functions to the C library? In the case of math functions, it’s to increase the likelihood that the implementation will be correct (maths functions often have dark corners that are difficult to get right), and for string functions it’s the hope that compilers will do magic to turn a function call directly into inline code. The alternative argument is not to add any new functions, because the common cases are already covered, and everything else is niche usage.

At the 2016 London meeting Peter Sewell gave a presentation on the Cerberus group’s work on a formal definition of C; this work has resulted in various papers questioning the interpretation of wording in the standard, i.e., possible ambiguities or inconsistencies. At this meeting the submitted papers focused on pointer provenance, and I was expecting to hear about the fancy optimizations this work would enable (which would be a major selling point of any proposal). No such luck, the aim of the work was stated as clearly specifying the behavior (a worthwhile aim), with no major new optimizations being claimed (formal methods researchers often oversell their claims, Peter is at the opposite end of the spectrum and could do with an injection of some positive advertising). Clarifying behavior is a worthwhile aim, but not at the cost of major changes to existing wording. I have had plenty of experience of asking WG14 for clarification of existing (what I thought to be ambiguous) wording, only to be told that the existing wording was clear and not ambiguous (to those reviewing my proposed defect). I wonder how many of the wording ambiguities that the Cerberus group claim to have found would be accepted by WG14 as a defect that required a wording change?

Winner of the best pub quiz question: Does the C Standard require an implementation to be able to exactly represent floating-point zero? No, but it is now required in C2X. Do any existing conforming implementations not support an exact representation for floating-point zero? There are processors that use a logarithmic representation for floating-point, but I don’t know if any conforming implementation exists for such systems; all implementations I know of support an exact representation for floating-point zero. Logarithmic representation could handle zero using a special bit pattern, with cpu instructions doing the right thing when operating on this bit pattern, e.g., 0.0+X == X, (I wonder how much code would break, if the compiler mapped the literal 0.0 to the representable value nearest to zero).

Winner of the best good intentions corrupted by the real world: intmax_t, an integer type capable of representing any value of any signed integer type (i.e., a largest representable integer type). The concept of a unique largest has issues in a world that embraces diversity.

Today’s C development environment is very different from 25 years ago, let alone 40 years ago. The number of compilers in active use has decreased by almost two orders of magnitude, the number of commonly encountered distinct processors has shrunk, the number of very distinct operating systems has shrunk. While it is not a monoculture, things appear to be heading in that direction.

The relevance of WG14 decreases, as the number of independent C compilers, in widespread use, decreases.

What is the purpose of a C Standard in today’s world? If it were not already a standard, I don’t think a committee would be set up to standardize the language today.

Is the role of WG14 now, the arbiter of useful common practice across widely used compilers? Documenting decisions in revisions of the C Standard.

Work on the Cobol Standard ran for almost 60-years; WG14 has to be active for another 20-years to equal this.

May 3, 2019May 3, 2019

Nearest And Dearest – a.k.

a.k. from thus spake a.k.

Last time we saw how we could use the list of the nearest neighbours of a datum in a set of clustered data to calculate its strengths of association with their clusters. Specifically, we used the k nearest neighbours algorithm to define those strengths as the proportions of its k nearest neighbours that were members of each cluster or with a generalisation of it that assigned weights to the neighbours according to their positions in the list.
This time we shall take a look at a clustering algorithm that uses nearest neighbours to identify clusters, contrasting it with the k means clustering algorithm that we covered about four years ago.

May 3, 2019May 3, 2019

Log driven development

Fran from BuontempoConsulting

Everyone knows attempting to figure out what's happening by resorting to print statements is desperation. Everyone knows you should use TDD, BDD or some xDD to write code well, don't they? I am desperate. I suggest PDD, print driven development is slow, error prone but sometimes helpful.

I recently shoe-horned a new feature into a large existing code base. It's hard to make it run a small set of work and I wanted to check a config change create my new class via the dependency injection framework. Now, I could put in a break point and wait a while for it to get hit, or not. Fortuenately, the code writes logs. By adding a log line to the constructor, I could tell much more quickly that I'd got the config correct. Write the code, leave it running, have a look later. I am tempted to name this Log Driven Development.

On one hand, adding the moral equivalents of prints is a bad thing. Teedy Deigh mused on logging in Overload 126, saying

It is generally arbitrary and unsustainable and few people are ever clear what the exact requirements are, so it spreads like a cat meme.

And yet, if I add informative logging as things happen, I can see how the whole system hangs together. Jonathan Boccara mentioned finding logs when trying to understand an unfamiliar code base, in his talk at the ACCU Conference. Existing logs can help you understand some code. Looking at this from a different direction begs the question are my logs useful? As a first step seeing a log entry when I wired in my class felt like a win. It told me what I wanted to know. I must make sure the logs I write out as my class starts doing something useful are themselves useful.

This is a step in the direction of a walking skeleton style approach - get things running end to end, outside-in, then fill in the details. I'm much more comfortable with a TDD approach, but keeping an eye on the whole system and how it hangs together is important. Building up the new feature by writing useful logs means the logs will be useful when it's released. LDD - log driven development. Why not?

May 2, 2019

Creating a self-signed certificate for Apache and connecting to it from Java

Andy Balaam from Andy Balaam's Blog

Our mission: to create a self-signed certificate for an Apache web server that allows us to connect to it over HTTPS (SSL/TLS) from a Java program.

The tricky bit for me was generating a certificate that contains Subject Alternative Names for my server, which is needed to connect to it from Java.

We will use the openssl command.

Creating a self-signed certificate for Apache HTTPD

First create a config file cert.conf:

[ req ]
distinguished_name  = subject
x509_extensions     = x509_ext
prompt = no

[ subject ]
commonName = Example Company

[ x509_ext ]
subjectAltName = @alternate_names

[ alternate_names ]
DNS.1 = example.com

In the above, replace “example.com” with the name you will use for the host when you connect from Java. This is important, because Java requires the name in the certificate to match the name it is using to connect to the server. If you’re connecting to it as localhost, just put “localhost”. Note: do not include “https://” or any port or path after the hostname, so “example.com:8080/mypath” is wrong – it should be just “example.com”.

The alternate_names section above gives the “Subject Alternative Names” for this certificate. You can add more as “DNS.2”, “DNS.3”, etc.

Next, generate the server key and self-signed certificate:

openssl genrsa 2048 > server.key
chmod 400 server.key
openssl req -new -x509 -config cert.conf -nodes -sha256 -days 365 -key server.key -out server.crt

Now you have two new files: server.key and server.crt. These are the files that will be used by Apache HTTPD, so put them somewhere useful (e.g. inside /usr/local/apache2/conf/) and refer to them in the Apache config file using keys “SSLCertificateKeyFile” and “SSLCertificateFile” respectively. For more info see the SSL/TLS How-To.

Checking the certificate is being used

Start up your Apache and ensure you can connect to it over HTTPS using curl:

curl -v --insecure https://example.com:8080

Replace “https://example.com:8080” above with the full URL (this time, include “https://” and the port and path.

To examine the certificate that is being returned, run:

openssl s_client -showcerts -connect example.com:8080

Replace “example.com:8080” above with hostname and port (no “https:// this time!).

Connecting from Java

To be able to connect from Java, we need a Trust Store. We can create one in PKCS#12 format with:

openssl pkcs12 -export -passout pass:000000 -out trust.pkcs12 -inkey server.key -in server.crt

Note: Java 8 onwards is able to use .pkcs12 (PKCS#12) files for its trust store. The old .jks (Java Key Store) format is deprecated.

Now you have a file we can use as a trust store, follow my other article to connect from Java over HTTPS with a self-signed certificate.

May 1, 2019

ACCU Talk â€œHow Kotlin makes your Java code betterâ€

Andy Balaam from Andy Balaam's Blog

Here is the live version of my talk designed to help you advocate for adopting Kotlin:

May 1, 2019

ACCU Talk â€œHow Git really worksâ€

Andy Balaam from Andy Balaam's Blog

Here is the talk CB Bailey and I did at ACCU Conference 2019. This was the first talk I have done with someone else, and I really enjoyed it:

Month: May 2019

Installing leiningen on Manjaro Linux

C Standard meeting, April-May 2019

Nearest And Dearest – a.k.

Log driven development

Creating a self-signed certificate for Apache and connecting to it from Java

Creating a self-signed certificate for Apache HTTPD

Checking the certificate is being used

Connecting from Java

ACCU Talk â€œHow Kotlin makes your Java code betterâ€

ACCU Talk â€œHow Git really worksâ€