CppCon 2019 Trip Report and Slides

Anthony Williams from Just Software Solutions Blog

Having been back from CppCon 2019 for over a week, I thought it was about time I wrote up my trip report.

The venue

This year, CppCon was at a new venue: the Gaylord Rockies Resort near Denver, Colorado, USA. This is a huge conference centre, currently surrounded by vast tracts of empty space, though people told me there were many plans for developing the surrounding area.

The venue was hosting multiple conferences and events alongside CppCon; it was quite amusing to emerge from the conference rooms and find oneself surrounded by people in ballgowns and fancy evening wear for an event in the nearby ballroom!

There was a choice of eating establishments, but they all had one thing in common: they were overpriced, taking advantage of the captive nature of the hotel clientele. The food was reasonably nice though.

The size of the venue did make for a fair amount of walking around between sessions.

Overall the venue was nice, and the staff were friendly and helpful.

Pre-conference Workshop

I ran a 2-day pre-conference class, entitled More Concurrent Thinking in C++: Beyond the Basics, which was for those looking to move beyond the basics of threads and locks to the next level: high level library and application design, as well as lock-free programming with atomics. This was well attended, and I had interesting discussions with people over lunch and in the evening.

If you would like to book this course for your company, please see my training page.

The main conference

Bjarne Stroustrup kicked off the main conference with his presentation on "C++20: C++ at 40". Bjarne again reiterated his vision for C++, and outlined some of the many nice language and library features we have to make development easier, and code clearer and less error-prone.

Matt Godbolt's presentation on "Compiler Explorer: Behind the Scenes" was good and entertaining. Matt showed how he'd evolved Compiler Explorer from a simple script to the current website, and demonstrated some nifty things about it along the way, including features you might not have known about such as the LLVM instruction cost view, or the new "run your code" facility.

In "If You Can't Open It, You Don't Own It", Matt Butler talked about security and trust, and how bad things can happen if something you trust is compromised. Mostly this was obvious if you thought about it, but not something we necessarily do think about, so it was nice to be reminded, especially with the concrete examples. His advice on what we can do to build more secure systems, and existing and proposed C++ features that help was also good.

Barbara Geller and Ansel Sermersheim made an enthusiastic duo presenting "High performance graphics and text rendering on the GPU for any C++ application". I am excited about the potential for their Copperspice wrapper for the Vulkan rendering library: rendering 3D graphics portably is hard, and text more so.

Andrew Sutton's presentation on "Reflections: Compile-time Introspection of Source Code" was an interesting end to Monday. There is a lot of scope for eliminating boilerplate if we can use reflection, so it is good to see the progress being made on it.

Tuesday morning began with a scary question posed by Michael Wong, Paul McKenney and Maged Michael: "Will Your Code Survive the Attack of the Zombie Pointers?" Currently, if you delete an object or call free, then all copies of pointers to that memory immediately become invalid across all threads. Since invalid pointers can't even be compared, this can result in zombies eating your brains. Michael, Paul and Maged looked at what we can do in our code to avoid this, and what they are proposing for the C++ Standard to fix the problem.
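
The problem is easier to see with a tiny example (my sketch, not code from the talk): once the delete has happened, both copies of the pointer are invalid, and the comparison on the last line is not something the standard lets you rely on.

    #include <iostream>

    int main() {
        int* p1 = new int(42);
        int* p2 = p1;          // a copy, perhaps held by another thread
        delete p1;             // all copies are now invalid, not just p1
        int* p3 = new int(7);  // may reuse the storage just freed
        // Code that uses this comparison to decide whether p2 still refers
        // to the old object is broken: the result is not meaningful.
        std::cout << std::boolalpha << (p2 == p3) << '\n';
        delete p3;
    }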

Andrei Alexandrescu's presentation on "Speed is found in the minds of people" was an insightful look at optimizing sort. Andrei showed how compiler and processor features mean that performance can be counter-intuitive, and code with a higher algorithmic complexity can run faster in the right conditions. Always use infinite loops (except for most cases).

I love the interactive slides in Hana Dusikova's presentation "A State of Compile Time Regular Expressions". She is pushing the boundaries of compile-time coding to make our code perform better at runtime. std::regex can be slow compared to other regular expression libraries, but ctre can be much better. I am excited to see how this can be extended to compile-time parsing of other DSLs.

In "Applied WebAssembly: Compiling and Running C++ in Your Web Browser", Ben Smith showed the use of WebAssembly as a target to allow you to write high-performance C++ code that will run in a suitable web browser on any platform, much like the "Write once, run anywhere" promise of Java. I am interested to see where this can lead.

Samy Al Bahra and Paul Khuong presented the final session I attended: "Abusing Your Memory Model for Fun and Profit". They discussed how they have written code that relies on the stronger memory ordering requirements imposed by X86 CPUs over and above the standard C++ memory model in order to write high-performance concurrent data structures. I am intrigued to see if any of their techniques can be used in a portable fashion, or used to improve Just::Thread Pro.

Whiteboard code

This year there were a few whiteboards around the conference area for people to use for impromptu discussions. One of them had a challenge written on it:

"Can you write a requires expression that ensures a class has a member function with a specified signature?"

This led to a lot of discussion, which Arthur O'Dwyer wrote up as a blog post. Though the premise of the question is wrong (we shouldn't want to constrain on such specifics), it was fun, interesting and enlightening trying to think how one might do it — it allows you to explore the corner cases of the language in ways that might turn out to be useful later.
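
For the curious, here is one possible shape of an answer, as a sketch of my own rather than the solution from the whiteboard or from Arthur's post: take the address of the member function and check that it has exactly the member-function-pointer type you are after. It reports false for overloaded or templated member functions even when a matching overload exists, which rather illustrates the point about the premise.

    #include <concepts>

    // Satisfied only if T has a single, non-template member function foo
    // whose signature is exactly int(double).
    template <typename T>
    concept has_exact_foo = requires {
        { &T::foo } -> std::same_as<int (T::*)(double)>;
    };

    struct A { int foo(double) { return 0; } };
    struct B { long foo(double) { return 0; } };  // wrong return type

    static_assert(has_exact_foo<A>);
    static_assert(!has_exact_foo<B>);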

My presentation

As well as the workshop, I presented a talk on "Concurrency in C++20 and beyond", which was on Tuesday afternoon. It was in an intermediate-sized room, and I believe was well attended, though it was hard to see the audience with the bright stage lighting. There were a number of interesting questions from the audience addressing the issues raised in my presentation, which is always good, though the acoustics did make it hard to hear some of them.

Slides are available here.

~trip_report()

So that was an overview of another awesome CppCon. I love the in-person interactions with so many people involved in using C++ for such a wide variety of things. Everyone has their own perspective, and I always learn something.

The videos are being uploaded incrementally to the CppCon YouTube channel, so hopefully the video of my presentation and the ones above that aren't already available will be uploaded soon.


C compilers on PR1ME computers

Derek Jones from The Shape of Code

At the start of this week I knew almost nothing about the C compilers running on PR1ME computers (every now and again I check bitsavers for newly scanned C compiler manuals).

On Tuesday I spotted the source of the Georgia Tech C Compiler for Prime Computers (which was believed lost until two 35-year-old mag tapes were found); the implementation language is Ratfor.

I posted the information to the comp.compilers newsgroup, and in the ensuing posts I learned about Dennis Boone’s collection of scanned PR1ME manuals, including a C compiler user guide. This C user’s guide is for the Conboy/Pacer software compiler, not the one whose sources I linked to above.

The PR1ME C compiler has the usual assortment of vendor extensions, and missing features, of the kind that might be found in any C compiler from the 1980s; however, there is a larger than usual collection of infrequently seen characteristics. I spotted the following during a quick read through the manual:

  • addresses point to 16-bit words (not 8-bit bytes). The implementers could have chosen to define a char as 16-bit (i.e., defining the standard macro CHAR_BIT as 16), but they went with 8-bit chars. Handling 8-bit chars on a word addressed processor means that pointers to char have to include a bit specifying which half of the 16-bits they are pointing to. Some Cray computers had the same issue, except their words contained 64 bits, so more offset bits had to be stored in pointers.

    The definition of an object having type char occupies 8 bits of a word; no other object is allocated in the other 8 bits. Arrays of char are packed, two chars to a word.

  • ASCII characters have the top-bit set (the manual phrases this as “The basic character set … ASCII 7-bit set (called ASCII-7), with the eighth bit turned on.”). The C standard requires that the ten digit characters have contiguous values, and that the basic character set be representable in a char.
  • a pointer type may contain more bits than any supported integer type, i.e., 48-bit pointer and 32-bit integer. PR1ME cpus supported a dizzying array of different modes and instruction sets; some later instruction sets/modes support 48-bit pointers.
  • “On some other machines, you may write code in which a function is called with more or fewer parameters than the function actually expects. Such code may work correctly on the 50 Series, but only if the missing or extra parameter is never referenced.”
  • “On some other machines, programs run correctly if function return value data types are left undeclared. … If this function is not explicitly defined as returning a pointer, the default return value is type int. Such a program may run correctly on some machines, but not on a 50 Series machine.”

These characteristics combine to make the PR1ME a very unfriendly environment for C programs.

C programmers have a culture, the C way of doing things (these cultures exist for all languages), and the characteristics of this (and perhaps other) PR1ME C compilers run counter to this culture in many ways.

I’m not saying that C culture is good or bad, just that it exists and PR1ME is a very poor fit.

What elements of C culture clash with the PR1ME implementation?

There is an expectation that when two objects having char type are defined sequentially (e.g., char a, b;), it will be possible to access b by adding 1 to a pointer to a (as if the definition had been written as: char a[2];). Yes, this practice is now frowned upon, but it was once considered ok (at least in some circles). On PR1ME the two definitions are not equivalent, from the pointer arithmetic point of view.
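
A minimal sketch of the assumption (my own illustration, not code from the post): the commented-out line is the once-common idiom, which happens to work on many byte-addressed machines but is not guaranteed by the language, and on PR1ME the pointer arithmetic does not behave this way.

    #include <cstdio>

    int main() {
        char a = 'x', b = 'y';     // two separate objects; adjacency not guaranteed
        char arr[2] = {'x', 'y'};  // one object; arr[1] is reachable from &arr[0]

        char* p = &a;
        // std::printf("%c\n", *(p + 1));    // the old idiom: undefined behaviour
        std::printf("%c\n", *(&arr[0] + 1)); // well-defined: prints 'y'
        (void)b;
    }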

Very many developers assume that C characters use ASCII. Most of the time they do, but they are not required to; EBCDIC being perhaps the most well known alternative. At least in the PR1ME encoding the alphabetic characters have contiguous values, which they don’t in EBCDIC. But setting the top bit, hmmm….

The assumption that sizeof(int) == sizeof(pointer_type) is endemic in 1980s code, and much code in later decades; many (not so young) C programmers will tell you the story of the first time they had to switch mind-sets to: sizeof(long) == sizeof(pointer_type). Not having a 48-bit integer type is a bit of a killer for C on PR1ME, as we will find out.
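
For illustration (my sketch, not from the original post), the 1980s idiom was to stash a pointer in a plain int and cast it back later; the portable spelling today is intptr_t, precisely because the assumption does not hold on machines like the PR1ME (or on most 64-bit systems, where it is a long or long long that matches the pointer size).

    #include <cassert>
    #include <cstdint>

    int main() {
        int value = 42;
        int* p = &value;

        // The old idiom used plain int here; with 48-bit pointers and 32-bit
        // ints that silently loses bits. intptr_t is guaranteed wide enough.
        std::intptr_t stashed = reinterpret_cast<std::intptr_t>(p);
        int* back = reinterpret_cast<int*>(stashed);
        assert(back == p && *back == 42);
    }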

PR1MOS, the vendor operating system, uses a function call stack layout that assumes a function definition specifies exactly how the function will be called, e.g., the number and type of parameters, and the return type.

In the early decades of C, programmers were very lax about specifying exactly what arguments a function expected, and a missing return type implicitly specified an int return type (and since everybody knew that sizeof(int) == sizeof(pointer_type), this meant that it was not necessary to specify pointer return types).

During development, it is useful to have the program raise an exception (or whatever the behavior is) when a function call does not match the defined type of the function; it improves program reliability by catching cases that might work as expected a lot of the time, but not all the time.

A lot of existing code was created on systems that were forgiving of function call/definition mismatches. Such C source is unlikely to just compile and run under PR1MOS, i.e., porting other people's programs is likely to be a time-consuming process.

Visual Lint 7.0.3.311 has been released

Products, the Universe and Everything from Products, the Universe and Everything

This is a recommended maintenance update for Visual Lint 7.0. The following changes are included:

  • Fixed a bug in the Visual Studio 2017/2019 installation process which can cause an invalid filename to be specified for VSIX logfiles written to the %TEMP% folder.

  • The timestamps of any property (.props, .targets or .vsprops) files referenced by a Visual Studio project are now taken into account when determining whether a new PC-lint/PC-lint Plus indirect (.lnt) file needs to be written.

  • Added a breadcrumb bar to HTML solution, project and file reports to make navigating to the report for the parent entity (e.g. from file to project) easier.

  • Fixed a bug which could cause an invalid -setenv() directive to be generated on the PC-lint Plus analysis command line if a platform name referenced in a project contained one or more spaces.

  • Replaced the "invalid key entered" balloon tip in the Registration Key Dialog with an inline text field.

  • Added a missing help topic.

Download Visual Lint 7.0.3.311

Plotting artifacts when the axis involves lines of code

Derek Jones from The Shape of Code

While reading a report from the very late Rome period, the plot below caught my attention (the regression line was not in the original plot). The points follow a general trend, suggesting that when implementing a module, lines of code written per man-hour increases as the size of the module increases (in LOC). There are explanations for such behavior: perhaps module implementation time is mostly think-time that is independent of LOC, or perhaps larger modules contain more lines that can be quickly implemented (code+data).

Then I realised that the pattern of points was generated by a mathematical artifact. Can you spot the artifact?

Module size against LOC-per-hour.

The x-axis shows LOC, and the y-axis shows LOC/man-hour. Just plotting LOC against LOC would produce a row of points along a straight line, and if we treat dividing by man-hours as roughly equivalent to dividing by a random number (which might have some correlation with LOC), the result is points scattered around a line going up to the right.

If LOC-per-hour were constant, the points would form a horizontal line across the plot.

In the below left plot, from a different report (whose axes are function-points, and function-points implemented per month), the author has fitted a line, and it is close to horizontal (suggesting that the mean FP-per-month is constant).

FP against FP-per-month.

In fact the points are essentially random, and the line is a terrible fit (just how terrible is shown by switching the axes and refitting the line, above right; the refitted line should be vertical, but is again horizontal). There is no connection between FP and FP-per-month, which is a good thing, because the creators of function-points intended this to be true.

What process might generate this random scattering, rather than the trend seen in the first plot? If the implementation time was proportional to both the number of FP and some uniform random component, then the FP/time ratio would have the pattern seen.
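
A quick simulation makes the difference concrete. This is my own sketch with made-up distributions, not the data from either report: in the first case effort is drawn independently of LOC, yet LOC/hour still correlates strongly with LOC (the artifact); in the second, implementation time is proportional to FP times a random factor, and FP/month shows essentially no correlation with FP.

    #include <cmath>
    #include <cstddef>
    #include <iostream>
    #include <random>
    #include <vector>

    double correlation(const std::vector<double>& x, const std::vector<double>& y) {
        const double n = static_cast<double>(x.size());
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; syy += y[i] * y[i]; sxy += x[i] * y[i];
        }
        return (n * sxy - sx * sy) /
               std::sqrt((n * sxx - sx * sx) * (n * syy - sy * sy));
    }

    int main() {
        std::mt19937 gen(42);
        std::uniform_real_distribution<double> units(100, 2000);  // LOC or FP
        std::uniform_real_distribution<double> hours(10, 200);    // effort, independent of LOC
        std::uniform_real_distribution<double> noise(0.5, 1.5);   // per-project variation

        std::vector<double> loc, loc_rate, fp, fp_rate;
        for (int i = 0; i < 1000; ++i) {
            double l = units(gen), h = hours(gen);
            loc.push_back(l);
            loc_rate.push_back(l / h);             // LOC per hour: the artifact trend

            double f = units(gen);
            double months = f * noise(gen) / 100;  // time proportional to FP
            fp.push_back(f);
            fp_rate.push_back(f / months);         // FP per month: no trend expected
        }
        std::cout << "corr(LOC, LOC/hour) = " << correlation(loc, loc_rate) << '\n'   // strongly positive
                  << "corr(FP,  FP/month) = " << correlation(fp, fp_rate) << '\n';    // near zero
    }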

The plots below show module size (in LOC) against man-hour (left) and FP against months (right):

Module size against man-hours, and FP against months.

The module-LOC points are all over the place, while the FP points look as-if they are roughly consistent. Perhaps the module-LOC measurements came from a wide variety of sources, and we should not expect a visually pleasant trend.

Plotting LOC against LOC appears in other guises. Perhaps the most common is plotting fault-density against LOC; fault-density is generally calculated as faults/LOC.

Of course the artifacts also occur when plotting other kinds of measurements. Lines of code happens to be a commonly plotted quantity (at least in software engineering).

Further Still On An Ethereal Orrery – student

student from thus spake a.k.

Recently, my fellow students and I constructed a mathematical orrery which modelled the motion of heavenly bodies employing Sir N-----'s laws of gravitation and motion, rather than clockwork, as its engine. Those laws state that bodies are attracted toward each other with a force proportional to the product of their masses divided by the square of the distance between them, that a body will remain at rest or in constant motion unless a force acts upon it, that if a force acts upon it then it will be accelerated in the direction of that force at a rate proportional to its strength divided by its mass and that, if so, it will reciprocate with an opposing force of equal strength.
Its operation was most satisfactory, which set us to wondering whether we might use its engine to investigate the motions of entirely hypothetical arrangements of heavenly bodies and I should now like to report upon our progress in doing so.
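
In modern notation (my gloss, not part of the original post), Sir N-----'s laws amount to

    F = \frac{G \, m_1 m_2}{r^2}, \qquad a = \frac{F}{m}, \qquad F_{12} = -F_{21}

with a body at rest or in uniform motion being the case in which no net force acts upon it.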

nor(DEV):biz Big Dinner with Roarr! Dinosaur Adventure

Paul Grenyer from Paul Grenyer



What: nor(DEV):biz Big Dinner with Roarr! Dinosaur Adventure

When: 7th October, 2019

Where: Norwich City Football Club

How much: £40.99

Book: https://nordevbiz-oct-2019.eventbrite.co.uk

Join the best Norfolk and Norwich tech companies for dinner, while enjoying good food and great company.

Roarr! Dinosaur Adventure

A desire to innovate, with continual reinvestment creating bigger and bolder attractions – this is what our guest speakers have in mammoth (or should I say dinosaur!) proportions.

Owners of award-winning, Roarr! Dinosaur Adventure in Lenwade, Martin and Adam Goymour will be sharing their aspirations to develop this thriving business both in Norfolk and further afield. Not ones to rest on their laurels, they’ve already rebranded and invested millions so they can appeal to a broader market.

In 2018, they won the Best Large Visitor Attraction award in the Norfolk and Suffolk Tourism Awards. With more projects ‘in the pipeline’, their hard work and enthusiasm for innovation and redevelopment are evident.

From advancing their green energy strategy by placing solar panels on their indoor play area to a fossil dig and a steampunk-inspired restaurant in the Victorian walled garden, they are delighting thousands of visitors of all ages in Norfolk’s very own Jurassic Park.

About nor(DEV):biz

The aims of nor(DEV):biz (Norfolk Developers Business) are:

  • to be the go-to group for local businesses requiring a technology solution.
  • to facilitate and increase referrals and collaboration among Norfolk’s tech businesses.
  • to help close the digital skills gap.
  • to facilitate better collaboration between technology businesses and academic institutions.
  • to have a great meal with great company.

Ticket prices include a donation to the nor(DEV): chosen charity of the year for 2019/2020.

Coding workshop example worksheets

Andy Balaam from Andy Balaam's Blog

This week we did a coding workshop exercise: 2 teams implemented the different sides of the SMPP protocol (without speaking to each other) and this morning we tried out connecting them together.

We successfully sent a message and received an acknowledgement!

It was a lot of fun and we learned a surprising amount about SMPP (and quite how … interesting … the standard is).

In case they’re useful to anyone, here are the worksheets I made up: Team 1 ODT, Team 1 PDF, Team 2 ODT, Team 2 PDF.

Idea for a team who are less interested in SMPP (!) – try a similar exercise implementing FTP, which is a nice simple text-based protocol. I did this once and found it extremely rewarding.

Swarm algorithms

Fran from BuontempoConsulting

I wrote a book about genetic algorithms and machine learning. You can buy it here.

Apart from genetic algorithms and other aspects of machine learning, it includes some swarm algorithms. Where a genetic algorithm mixes up potential solutions, merging some together and periodically mutating some values, swarm algorithms can be regarded as individual agents collaborating, each representing a solution to a problem. They can work together in various ways, giving rise to a variety of swarm algorithms.

The so-called particle swarm algorithm can be used to find optimal solutions to problems. It's commonly referred to as particle swarm optimisation, or PSO for short. PSO is often claimed to be based on the flocking behaviour of birds. Indeed, if you get the parameters right, you might see something similar to a flock of birds. PSO is similar to colony algorithms, which are also nature-inspired, and also have agents collaborating to solve a problem.

Suppose you have some particles in a paper bag, say somewhere near the bottom. If they move about at random, some might get out of the bag in the end. If they follow each other, they might escape, but more likely than not, they'll hang round together in a gang. By providing a fitness function to encourage them, we can make them learn, for some definition of learn. Each particle can assess where it is, and remember the better places. The whole swarm will have a global best too. To escape a paper bag, we want the particles to go up. By inspecting the current (x, y) position, the fitness score can be the y-value. The bigger, the better. For real world problems, there can be many more than two dimensions, and the fitness function will require some thought.

The algorithm is as follows:

    Choose n
    Initialize n particles randomly
    For a while:
        Update best global position
        Move particles
        Update each particle's best position and velocity

The particles' personal bests and the overall global best give the whole swarm a memory, of sorts. Initially, this is the starting position for each particle. In addition to the current position, each particle has a velocity, initialised with random numbers. Since we're doing this in two dimensions, the velocity has an x component, and a y component. To move a particle, update each of these by adding the velocity, v, in that direction to the current position:

    x_{t+1} = x_t + v_{x,t}
    y_{t+1} = y_t + v_{y,t}

Since the velocity starts at random, the particles move in various different directions to begin with. The trick comes in when we update the velocity. There are several ways to do this. The standard way takes a proportion of the current velocity, kinda remembering where the particle was heading, and adds a fraction of the distance from its current position to its personal best and a fraction of the distance to the global best. This gives each particle a momentum along a trajectory, making it veer towards somewhere between its best spot and the global best spot. You'll need to pick the fractions. Using w, for weight, since we're doing a weighted sum, and c1 and c2 for the other proportions, we have:

    v_{x,t+1} = w v_{x,t} + c1 (p_{x,t} - x_t) + c2 (g_{x,t} - x_t)

If you draw the particles moving around you will see them swarm, in this case out of the paper bag. 
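
Putting the loop and the update rules together, here is a minimal C++ sketch of the idea. It is mine rather than code from the book: the weights w, c1 and c2 follow the text (their values and the initial distributions are just plausible guesses), the steps inside the loop are slightly reordered for brevity, and many PSO variants also multiply the c1 and c2 terms by fresh random numbers on each update.

    #include <iostream>
    #include <random>
    #include <vector>

    struct Particle {
        double x, y;                      // current position
        double vx, vy;                    // current velocity
        double best_x, best_y, best_fit;  // personal best so far
    };

    // Fitness is the height, so "up and out of the bag" is better.
    double fitness(double /*x*/, double y) { return y; }

    int main() {
        std::mt19937 gen{42};
        std::uniform_real_distribution<double> pos{-1.0, 1.0};
        std::uniform_real_distribution<double> vel{-0.1, 0.1};

        const double w = 0.7, c1 = 0.3, c2 = 0.3;  // illustrative values
        const int n = 20, steps = 100;

        std::vector<Particle> swarm(n);
        double gx = 0, gy = 0, gbest = -1e300;     // global best position/fitness

        for (auto& p : swarm) {                    // initialise n particles randomly
            p.x = pos(gen); p.y = pos(gen);
            p.vx = vel(gen); p.vy = vel(gen);
            p.best_x = p.x; p.best_y = p.y;
            p.best_fit = fitness(p.x, p.y);
            if (p.best_fit > gbest) { gbest = p.best_fit; gx = p.x; gy = p.y; }
        }

        for (int t = 0; t < steps; ++t) {
            for (auto& p : swarm) {
                // Move: x_{t+1} = x_t + v_{x,t}, and likewise for y.
                p.x += p.vx;
                p.y += p.vy;
                // Update velocity: a proportion of the old velocity plus pulls
                // towards the personal best and the global best.
                p.vx = w * p.vx + c1 * (p.best_x - p.x) + c2 * (gx - p.x);
                p.vy = w * p.vy + c1 * (p.best_y - p.y) + c2 * (gy - p.y);
                // Update the personal and global bests.
                const double f = fitness(p.x, p.y);
                if (f > p.best_fit) { p.best_fit = f; p.best_x = p.x; p.best_y = p.y; }
                if (f > gbest)      { gbest = f; gx = p.x; gy = p.y; }
            }
        }
        std::cout << "global best: (" << gx << ", " << gy << ")\n";
    }

Printing or plotting the positions each step is all you need to watch the swarm head up and out of the bag.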

This is one of many ways to code your way out of a paper bag covered in my book. When a particle solves a problem, here being out of the bag, inspecting the x and y values gives a solution to the problem. PSO can be used for a variety of numerical problems. It's usually described as a stochastic optimisation algorithm. That means it does something random (stochastic) to find the best (optimal) solution to a problem. You can read more here.  

Team DNA-impersonators create a business plan

Derek Jones from The Shape of Code

This weekend I was at the Hack the Police hackathon, sponsored by the Metropolitan Police and other organizations. My plan was to find an interesting problem to help solve, using the data we were told would be available. My previous experience with crime data is that there is not enough of it to allow reliable models to be built; this is a good thing, in that nobody wants lots of crime. Talking to a Police intelligence officer, I learned that the publicly available data contained crimes (i.e., a court case had found somebody guilty), not reported incidents, and was not large enough to allow a good model to be built.

Looking for a team to join, I got talking to Joe and Rebecca. Joe had discovered a very interesting possible threat to the existing DNA matching technique, and they were happy for me to join them analyzing this threat model; team DNA-impersonators was go.

Some background (Joe and Rebecca are the team’s genetic experts, I’m a software guy who has read a few books on the subject; all the mistakes in this post are mine). The DNA matching technique used by the Police is based on 17 specific sequences (each around 100 bases, known as loci), within the human genome (which contains around 3 billion bases).

There are companies who synthesize sequences of DNA to order. I knew that machines for doing this existed, but I did not know it was possible to order a bespoke sequence online, and how inexpensive it was.

Some people have had their DNA sequenced, and have allowed it to be published online; Steven Pinker is the most famous person I could find, whose DNA sequence is available online (link not given; it requires work+luck to find). The Personal Genome Project aims to sequence and make available the complete genomes of 100,000 volunteers (the UK arm of this project is on hold because of lack of funding; master criminals in the UK have a window of opportunity: offer to sponsor the project on condition that their DNA is included in the public data set).

How much would it cost to manufacture bottles of spray-on Steven Pinker DNA? Is there a viable business model selling Pinker No. 5?

The screen shot below shows a quote for 2-nmol of DNA for the sequence of 100 bases that are one of the 17 loci used in DNA matching. This order is for concentrated DNA, and needs to be diluted to the level likely to be found as residue at a crime scene. Joe calculated that 2-nmol can be diluted to produce 60-liters of usable ‘product’.

Quote for synthesis of 100 bases of human DNA.

There was not enough time to obtain sequences for the other 16-loci, and get quotes for them. Information on the 17-loci used for DNA matching is available in research papers; a summer job for a PhD student to sort out the details.

The concentrate from the 17-loci dilutes to 60-liters. Say each spray-on bottle contains 100ml, then an investment of £800 (plus researcher time) generates enough liquid for 600-bottles of Pinker No. 5.

What is the pricing model? Is there a mass market (e.g., Hong Kong protesters wanting to be anonymous), or would it be more profitable to target a few select clients? Perhaps Steven Pinker always wanted to try his hand at safe-cracking in his spare time, but was worried about leaving DNA evidence behind; he might be willing to pay to have the market flooded, so Pinker No. 5 residue becomes a common occurrence at crime scenes (allowing him to plausibly claim that any crime scene DNA matches were left behind by other people).

Some of the police officers at the hack volunteered that they knew lots of potential customers; the forensics officer present was horrified.

Before the 1980s, DNA profiling was not available. Will the 2020s be the decade in which DNA profiling ceases being a viable tool for catching competent criminals?

Manufacturers of high quality photocopiers are required to implement features that make it difficult for people to create good quality copies of paper currency.

What might law enforcement do about this threat to the viability of DNA profiling?

Ideas include:

  • Requiring companies in the bespoke DNA business to report suspicious orders. What is a suspicious order? Are enough companies in business to make it possible to order each of the 17-loci from a different company (we think so)?
  • Introducing laws making it illegal to be in possession of diluted forms of other people’s DNA (with provisions for legitimate uses).
  • Attacking the economics of the Pinker No. 5 business model by having more than 17-loci available for use in DNA matching. Perhaps 1,000 loci could be selected as potential match sites, with individual DNA testing kits randomly testing 17 (or more) from this set.