Benchmarking desktop PCs circa 1990

Derek Jones from The Shape of Code

Before buying a computer, customers want to be confident that they are choosing the best they can get for the money, and performance has often been a major consideration. Computer benchmark performance results were once widely discussed.

Knight’s analysis of early mainframe performance was widely cited for many years.

Performance on the Byte benchmarks was widely cited before Intel started spending billions on advertising; clock frequency has not always had the brand recognition it has today.

The Byte benchmark was originally designed for Intel x86 processors running Microsoft DOS. It was introduced in the June 1985 issue and was written in the still relatively new C language (earlier microprocessor benchmarks were often written in BASIC, because early micros often came with a free BASIC interpreter). It was updated in the 1990s to be Windows based, and implemented for Unix.

Benchmarking computers using essentially the same cpu architecture and operating system removes many complications that have to be addressed when these differ. Before Wintel wiped them out, computers from different manufacturers (and often the same manufacturer) contained completely different cpu architectures, ran different operating systems, and compilers were usually created in-house by the manufacturer (or by a university that got a large discount on its computer purchase).

The Fall 1990 issue of Byte contains tables of benchmark results from 1988-90. What can we learn from these results?

The most important takeaway from the tables is that those performing the benchmarks appreciated the importance of measuring hardware performance using the applications that customers are likely to be running on their computer, e.g., word processors, spreadsheets, databases, scientific calculations (computers were still sufficiently niche back then that scientific users were a non-trivial percentage of the market), and compiling (hackers were a large percentage of Byte’s readership).

The C benchmarks attempted to measure CPU, FPU (built-in hardware support for floating-point arrived with the 486 in April 1989; prior to that it was an add-on chip that required spending more money), Disk, and Video (at the time support for color was becoming mainstream, but bundled hardware graphics support still tended to be minimal).

Running the application benchmarks takes a lot of time, and requires the necessary software (which takes time to install from floppies, the distribution technology of the day). Running the C benchmarks is much quicker and simpler.

Ideally the C benchmarks are a reliable stand-in for the application benchmarks (meaning that only the C benchmarks need be run).

Let’s fit some regression models to the measurements of the 61 systems benchmarked, all supporting hardware floating-point (code+data). Surprisingly there is no mention of such an exercise being done by the Byte staff, even though one of the scientific benchmarks included regression fitting.

The following fitted equations explain around 90% of the variance of the data, i.e., they are good fits.

Wordprocessing=0.66+0.56*CPU+0.24*Disk

For wordprocessing, the CPU benchmark explains around twice as much as the Disk benchmark.

Spreadsheet=-0.46+0.8*CPU+1*Disk-0.16*CPU*Disk

For spreadsheets, CPU and Disk contribute about the same.

Database=0.6+0.01*CPU*FPU+0.53*Disk

Database is nearly all Disk.

ScientificEngineering=0.27+FPU*(0.59-0.17*Disk-0.03*CPU)+0.45*CPU*Disk

Scientific/Engineering is FPU, plus interactions with other components.

Compiling=-0.33+CPU*(1.1-0.09*Disk-0.16*Video)+0.33*Disk*Video

Compiling is CPU, plus interactions with other components.
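
If only the C benchmark results are available, the fitted equations can be used to estimate the application scores. The following sketch, in C (the language of the benchmarks), simply evaluates the equations for one system; the struct layout, function names and example scores are mine, only the coefficients come from the fits above:

#include <stdio.h>

/* Low-level Byte benchmark scores for one system (hypothetical layout). */
struct byte_scores { double cpu, fpu, disk, video; };

/* Application scores predicted using the regression equations fitted above. */
static double wordprocessing(struct byte_scores s)
   { return 0.66 + 0.56*s.cpu + 0.24*s.disk; }
static double spreadsheet(struct byte_scores s)
   { return -0.46 + 0.8*s.cpu + 1.0*s.disk - 0.16*s.cpu*s.disk; }
static double database(struct byte_scores s)
   { return 0.6 + 0.01*s.cpu*s.fpu + 0.53*s.disk; }
static double sci_eng(struct byte_scores s)
   { return 0.27 + s.fpu*(0.59 - 0.17*s.disk - 0.03*s.cpu) + 0.45*s.cpu*s.disk; }
static double compiling(struct byte_scores s)
   { return -0.33 + s.cpu*(1.1 - 0.09*s.disk - 0.16*s.video) + 0.33*s.disk*s.video; }

int main(void)
{
   struct byte_scores s = {2.0, 1.5, 1.8, 1.2};  /* made-up scores, for illustration */

   printf("wordprocessing %.2f\n", wordprocessing(s));
   printf("spreadsheet    %.2f\n", spreadsheet(s));
   printf("database       %.2f\n", database(s));
   printf("sci/eng        %.2f\n", sci_eng(s));
   printf("compiling      %.2f\n", compiling(s));
   return 0;
}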

Byte’s benchmark reports were great eye candy, and readers probably took away a rough feel for the performance of various systems. Perhaps somebody at the time also fitted regression models to the data. The magazine contained plenty of adverts for software to do this.

The Ascent of Man

Jon Jagger from less code, more software

is an excellent book by Jacob Bronowski (isbn 0-7088-2035-2)
As usual I'm going to quote from a few pages.

Evolution is the climbing of a ladder from the simple to the complex by steps, each of which is stable in itself.
The turning point to the spread of agriculture in the Old World was almost certainly the occurrence of two forms of wheat with a large, full head of seeds. Before 8000 BC wheat was not the luxuriant plant it is today. It was merely one of many wild grasses that spread throughout the Middle East. By some genetic accident, the wild wheat crossed with a natural goat grass and formed a fertile hybrid. ... Now we have a beautiful ear of wheat, but one which will never spread in the wind because the ear is too tight to break up.
The Principle of Uncertainty is a bad name. In science or outside it, we are not uncertain, our knowledge is merely confined within a certain tolerance. We should call it the Principle of Tolerance.
When to copper you add an even softer metal, tin, you make an alloy which is harder and more durable than either - bronze. ... Almost any pure material is weak, and many impurities will do to make it stronger.
The making of the sword, like all ancient metallurgy, is surrounded with ritual, and that is for a clear reason. When you have no written language, when you have nothing that can be called a chemical formula, then you must have a precise ceremonial which fixes the sequence of operations so that they are exact and memorable.
To us gold is precious because it is scarce; but to the alchemists, all over the world, gold was precious because it was incorruptible. ... every medicine to fight old age contained gold, metallic gold, as an essential ingredient, and the alchemists urged their patrons to drink from gold cups to prolong life.
We still use for the female the alchemical symbol for copper, that is, what is soft: Venus. And we use for the male the alchemical sign for iron, that is, what is hard: Mars.
When the Bible says three wise men followed a star to Bethlehem, there sounds in the story the echo of an age when wise men were stargazers.
Three thousand years after they were made, the village women of Khuzistan still draw their water ration from the qanats.
The Greeks when they saw the Scythian riders believed the horse and the rider to be one; that is how they invented the legend of the centaur.
Galileo is the creator of the modern scientific method... he really did for the first time what we think of as practical science: build the apparatus, do the experiment, publish the results.
Relativity is the understanding of the world not as events but as relations.
He [Einstein] hated war, and cruelty, and hypocrisy, and above all he hated dogma.
It was because [James] Brindley could not spell the word 'navigator' that workmen who dig trenches or canals are still called 'navvies'.
Always [Leo] Szilard wanted the [atom] bomb to be tested openly before the Japanese and an international audience, so that the Japanese should know its power and should surrender before people died.


Review: Bone Silence Delivers

Paul Grenyer from Paul Grenyer

Bone Silence (Revenger, #3) by Alastair Reynolds
My rating: 5 of 5 stars

Alastair Reynolds is still by far my favorite author and he has continued his form, which rebooted with Revenger, in Bone Silence. A friend once said to me that Alastair Reynolds struggles to know how to write an ending and, disappointingly, I think this is still true with Bone Silence. This trilogy has the constant unanswered questions which drew me into and kept me hooked on the Revelation Space stories. There is a kind of answer at the end and a bit of a twist which, if I’m honest, left me underwhelmed. The answer lacks detail and explanation of the reasons and then the Ness sisters ride off into the sunset and Reynolds announces that he’s done with the pair.

That aside I loved Bone Silence and the entire trilogy. Each book is different and describes different aspects of the universe in which it is set. The characters are diverse and interesting and the story wide, far reaching and mostly unpredictable. The sisters' respect for each other is clear throughout, but like real siblings there are tensions and fights. There is a lot more to this universe and so much scope for expansion. I’d love Alastair Reynolds to return to it some day and expand this story, fill in some glossed over gaps and detail and tell some of the other stories, even if the cliched seafaring language did irritate me throughout.



Learning useful stuff from the Human cognition chapter of my book

Derek Jones from The Shape of Code

What useful, practical things might professional software developers learn from the Human cognition chapter in my evidence-based software engineering book (an updated beta was released this week)?

Last week I checked the human cognition chapter; what useful things did I learn (combined with everything I learned during all the other weeks spent working on this chapter)?

I had spent a lot of time learning about cognition when writing my C book; for this chapter I was catching up on what had happened in the last 10 years, which included: building executable models has become more popular, sample sizes have gotten larger (mostly thanks to Mechanical Turk), more researchers are making their data available on the web, and a few new theories (but mostly refinements of existing ideas).

Software is created by people, and it always seemed obvious to me that human cognition was a major topic in software engineering. But most researchers in computing departments joined the field because of their interest in maths, computers or software. The lack of interest in the human element means that it is rarely a research topic. There is a psychology of programming interest group, but most of those involved don’t appear to have read any psychology text books (I went to a couple of their annual workshops, and while writing the C book I was active on their mailing list for a few years).

What might readers learn from the chapter?

Visual processing: the rationale given for many code layout recommendations is plain daft; people need to learn something about how the brain processes images.

Models of reading. Existing readability claims are a joke (or bad marketing, take your pick). Researchers have been using eye trackers since the 1960s to figure out what actually happens when people read text, and various models have been built. Market researchers have been using eye trackers for decades to work out where best to place products on shelves, to maximise sales. In the last 10 years software researchers have started using eye trackers to study how people read code; next they need to learn about some of the existing models of how people read text. This chapter contains some handy discussion and references.

Learning and forgetting: it takes time to become proficient; going on a course is the start of the learning process, not the end.

One practical takeaway for readers of this chapter is being able to give good reasons why other people’s proposals, which are claimed to be based on how the brain operates, won’t work as claimed because that is not how the brain works. Actually, most of the time it is not possible to figure out whether something will work as advertised (this is why user interface testing is such a prolonged, and expensive, process), but the speaker with the most convincing techno-babble often wins the argument :-)

Readers might have a completely different learning experience from reading the human cognition chapter. What useful things did you learn from the human cognition chapter?

Fast Hardware Hides Many Sins

Chris Oldwood from The OldWood Thing

Way back at the beginning of my professional programming career I worked for a small software house that wrote graphics software. Although it had a desktop publisher and line-art based graphics package in its suite it didn’t have a bitmap editor and so they decided to outsource that to another local company.

A Different User Base

The company they chose to outsource to had a very high-end bitmap editing product and so the deal – to produce a cut-down version – suited both parties. In principle they would take their high-end product, strip out the features aimed at the more sophisticated market (professional photographers) and throw in a few others that the lower end of the market would find beneficial instead. For example, their current product only supported 24-bit video cards, which were pretty unusual in the early to mid ‘90s due to their high price, and so supporting 8-bit paletted images was new to them. Because of the large images their high-end product could handle using its own virtual memory system, it also demanded a large, fast hard disk.

Even though I was only a year or two into my career at that point I was asked to look after the project, and so I would get the first drop of each version as they delivered it so that I could evaluate their progress and also keep an eye on quality. The very first drop I got contained various issues that in retrospect did not bode well for the project, which ultimately fell through, although that was not until much later. (Naturally I didn’t have the experience I have now that would probably cause me to pull the alarm cord much sooner.)

Hard Disk Disco

One of the features that they partially supported, but we wanted to make a little more prominent, was the ability to see what the RGB value of the pixel under the cursor was – often referred to now as a colour dropper or eye dropper. When I first used the feature on my 486DX PC I noticed that it was somewhat laggy; this surprised me as I had implemented algorithms like Floyd-Steinberg dithering, so I knew a fair bit about image manipulation and which algorithms were expensive, and this definitely wasn’t one! As an aside, I had also noticed that the hard disk light on my PC was pretty busy too, which made no sense but was probably worth mentioning to them.

After feeding back to them about this and various other things I’d noticed, they suggested that their virtual memory system was probably being overly aggressive, as the product was designed for beefier hardware. That kind of made sense and I waited for the next drop.

On the next drop they had apparently made various changes to their virtual memory system, which helped it cope much better with smaller images so that they didn’t page unnecessarily, but I still found the feature laggy. As I played with it some more I noticed that the hard disk light was definitely flashing lots when I moved the mouse, although it didn’t stop flashing entirely when I stopped moving it. For our QA department, who only had somewhat smaller 386SX machines, it was even more noticeable.

DBWIN – Airing Dirty Laundry

At our company all the developers ran the debug version of Windows 3.1 in enhanced mode, with a second mono monitor to display messages from the Windows APIs that pointed out bugs in our software; it was also very interesting to see what errors other software generated too [1]. You probably won’t be surprised to discover that the bitmap editor generated a lot of warnings. For example, Windows complained about the amount of extra (custom) data it was storing against a window handle (hundreds of bytes), which I later discovered was caused by them constantly copying image attribute data back and forth as individual values instead of allocating a single struct with the data and copying that single pointer around.
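
One way to fix that particular warning, sketched below in Windows-flavoured C, is to keep all the per-window image attributes in a single heap-allocated struct and store only one pointer against the window handle (here via SetProp()/GetProp(); the struct fields and property name are invented for illustration):

#include <windows.h>

/* All the per-window image attributes live in one struct... */
typedef struct tagIMAGEATTRS {
    int  width, height;
    int  bitsPerPixel;
    BOOL dirty;
    /* ...and so on. */
} IMAGEATTRS;

#define ATTRS_PROP TEXT("ImageAttrs")  /* hypothetical property name */

/* ...so only a single pointer is stored against the window handle. */
void AttachImageAttrs(HWND hwnd, IMAGEATTRS *attrs)
{
    SetProp(hwnd, ATTRS_PROP, (HANDLE)attrs);
}

IMAGEATTRS *GetImageAttrs(HWND hwnd)
{
    return (IMAGEATTRS *)GetProp(hwnd, ATTRS_PROP);
}

/* On WM_DESTROY: RemoveProp(hwnd, ATTRS_PROP) and free the struct. */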

Unearthing The Truth

Anyway, back to the performance problem. Part of the deal enabled our company to gain access to the bitmap editor source code which they gave to us earlier than originally planned so that I could help them by debugging some of their gnarlier crashes [2]. Naturally the first issue I looked into was the colour dropper and I quickly discovered the root cause of the dreadful performance – they were reading the application’s .ini file every time [3] the mouse moved! They also had a timer which simulated a WM_MOUSEMOVE message for other reasons which was why it still flashed the hard disk light even when the mouse wasn’t actually moving.

When I spoke to them about it they explained that once upon a time they had run into a Targa video card where the driver returned the RGB values as BGR when calling GetPixel(). Hence what they were doing was checking the .ini file to see if there was an application setting there to tell them to swap the GetPixel() result. Naturally I asked them why they didn’t just read this setting once at application start-up and cache the value, given that the user can’t swap the video card whilst the machine (let alone the application) is running. Their response was simply a shrug, which wasn’t surprising by that time, as it was becoming ever more apparent that the quality of the code was making it hard to implement the features we wanted, and our QA team was turning up other issues which the mostly one-man team was never going to cope with in a reasonable time frame.
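
A minimal sketch of the obvious fix, in Windows-style C: read the channel-swap setting once at start-up and cache it, so the WM_MOUSEMOVE path never touches the .ini file (the section, key and file names here are invented; the real ones are not given):

#include <windows.h>

/* Cached at start-up: whether GetPixel() results need their R and B swapped
   (the workaround for the Targa driver that returned BGR). */
static BOOL g_swapRedBlue = FALSE;

/* Called once during application initialisation, not on every mouse move. */
void InitColourSettings(void)
{
    g_swapRedBlue = (BOOL)GetPrivateProfileInt(TEXT("Display"), TEXT("SwapRedBlue"),
                                               0, TEXT("bitmap.ini"));
}

/* The WM_MOUSEMOVE handler only consults the cheap cached flag. */
COLORREF PixelUnderCursor(HDC hdc, int x, int y)
{
    COLORREF c = GetPixel(hdc, x, y);
    if (g_swapRedBlue)
        c = RGB(GetBValue(c), GetGValue(c), GetRValue(c));
    return c;
}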

Epilogue

I don’t think it’s hard to see how this feature ended up this way. It wasn’t a prominent part of their high-end product, and given the kit their users ran on and the kind of images they were dealing with, it probably never even registered with all the other swapping going on. While I’d like to think it was just an oversight, and one should never optimise until they have measured and prioritised, there were too many other signs in the codebase that suggested they were relying heavily on the hardware to compensate for poor design choices. The other factor is that, with pretty much only one full-time developer [5], the pressure was surely on to focus on new features first, with quality further down the list.

The project was eventually canned, and with the company I was working for struggling too due to the huge growth of Microsoft Publisher and CorelDraw, I only just missed the chop myself. Sadly neither company is around today, despite quality playing a major part at the company I worked for and its product being significantly better than many of the competing products.

 

[1]  One of the first pieces of open source software I ever published (on CiX) was a Mono Display Adapter Library.

[2] One involved taking Windows “out at the knees” – not even CodeView or BoundsChecker would trap it – the machine would just restart. Using SoftICE I eventually found the cause – calling EndDialog() instead of DestroyWindow() to close a modeless dialog.

[3] Although Windows cached the contents of the .ini file it still needed to stat() the file on every read access to see if it had changed and disk caching wasn’t exactly stellar back then [4].

[4] See this tweet of mine about how I used to grep my hard disk under Windows 3.1 :o).

[5] I ended up moonlighting for them in my spare time by writing them a scanner driver for one of their clients while they concentrated on getting the cut-down bitmap editor done for my company.

TDD – Romanes Eunt Domus!

Jon Jagger from less code, more software