WMI Performance Anomaly: Querying the Number of CPU Cores

Chris Oldwood from The OldWood Thing

As one of the few devs who both likes and is reasonably well versed in PowerShell, I became the point of contact for a colleague who was bemused by a performance oddity when querying the number of cores on a host. He was introducing Ninja into the build and needed to throttle its expectations about how many actual cores there were, because hyperthreading was enabled and our compilation-intensive build was being slowed by Ninja’s bad guesswork [1].

The PowerShell query for the number of cores (rather than logical processors) was pulled straight from the Internet and seemed fairly simple:

(Get-WmiObject Win32_Processor |
  measure -Property NumberOfCores -Sum).Sum

However, when he ran this it took a whopping 4 seconds! Although he was using a Windows VM running on QEMU/KVM, I knew from benchmarking a while back that this setup added very little overhead, i.e. only a per cent or two, and even on my work PC I observed similarly tardy performance. Here’s how we measured it:

Measure-Command {
  (Get-WmiObject Win32_Processor |
  measure -Property NumberOfCores -Sum).Sum
} | % TotalSeconds
4.0867539

(As I write, my HP laptop running Windows 11 is still showing well over a second to run this command.)

My first instinct was that this was some weird overhead with PowerShell, what with it being .Net based, so I tried the classic native wmic tool under Git Bash to see how that behaved:

$ time WMIC CPU Get //Format:List | grep NumberOfCores  | cut -d '=' -f 2 | awk '{ sum += $1 } END{ print sum }'
4

real    0m4.138s

As you can see there was no real difference, so that discounted the .Net theory. For kicks I tried lscpu under WSL-based Ubuntu 20.04 and that returned a far saner time:

$ time lscpu > /dev/null

real    0m0.064s

I presume that lscpu does some direct spelunking, but even so the added machinery of WMI should not be adding the kind of ridiculous overhead we were seeing. I even tried my own C++ based WMICmd tool, as I knew that talked directly to WMI with no extra cleverness going on behind the scenes, but I got a similar outcome.
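
By direct spelunking I mean reading the figure straight from the kernel’s own interfaces; for example, something along these lines under Linux (a rough sketch, assuming the usual “cpu cores” field in /proc/cpuinfo and a single socket):

$ # Cores per socket, straight from the kernel (single-socket assumption).
$ grep -m 1 'cpu cores' /proc/cpuinfo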

On a whim I decided to try pushing more work onto WMI by passing a custom query instead so that it only needed to return the one value I cared about:

Measure-Command {
  (Get-WmiObject -Query 'select NumberOfCores from Win32_Processor' |
  measure -Property NumberOfCores -Sum).Sum
} | % TotalSeconds
0.0481644

Lo and behold, that gave a timing in the tens-of-milliseconds range, which was far closer to lscpu and definitely more like what we were expecting.
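
To close the loop on the original problem, here’s a minimal sketch of how that faster query might be wired back into the build; it assumes Ninja’s usual -j <jobs> flag is how the concurrency gets throttled:

# Count physical cores via the targeted WQL query (the fast form above),
# then cap Ninja's parallelism at that figure rather than letting it guess.
$cores = (Get-WmiObject -Query 'select NumberOfCores from Win32_Processor' |
  Measure-Object -Property NumberOfCores -Sum).Sum
ninja -j $cores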

While my office machine has some “industrial strength” [2] anti-virus software that could easily be to blame, my colleague’s VM didn’t; it only had the default MS Defender. So at this point I’m none the wiser about what was going on, although my personal laptop suggests that the native tools, wmic and WMICmd, both return times more in line with lscpu, so something funky is going on somewhere.

 

[1] Hyper-threaded cores meant Ninja was scheduling too much concurrent work.

[2] Read that as “massively interfering”!

 

Chaining IF and && with CMD

Chris Oldwood from The OldWood Thing

An interesting bug cropped up the other day in a dub configuration file which made me realise I wasn’t consciously aware of the precedence of && when used in an IF statement with cmd.exe.

Batch File Idioms

I’ve written a ton of batch files over the years and, with error handling being a manual affair, the usual pattern is to alternate pairs of statement + error check, e.g.

mkdir folder
if %errorlevel% neq 0 exit /b %errorlevel%

It’s not uncommon for people to explicitly leave off the error check in this particular scenario so that (hopefully) the folder will exist afterwards whether or not it already did. However, this masks a (not uncommon) failure where the folder can’t be created due to permissions, so I tend to go for the more verbose option:

if not exist "folder" (
  mkdir folder
  if !errorlevel! neq 0 exit /b !errorlevel!
)

Note the switch from %errorlevel% to !errorlevel!. Inside a parenthesised block %errorlevel% is expanded when the whole block is parsed, so you need delayed expansion to see the value set by the mkdir. I tend to use setlocal EnableDelayedExpansion at the beginning of every batch file and use !var! everywhere by convention to avoid forgetting this transformation, as it’s an easy mistake to make in batch files.
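
Put together, a skeleton batch file following that convention might look something like this (a sketch rather than a drop-in script):

@echo off
setlocal EnableDelayedExpansion

rem Create the folder only if it's missing, but still surface a
rem failure (e.g. a permissions problem) to the caller.
if not exist "folder" (
  mkdir folder
  if !errorlevel! neq 0 exit /b !errorlevel!
)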

Chaining Statements

In cmd you can chain commands with & (much like ; in bash), with && being used to run the next command only when the previous one succeeds and || only when it fails. This is useful with tools like dub which allow you to define “one-liners” that will be executed during a build by “shelling out”. For example you might write this:

mkdir bin\media && copy media\*.* bin\media

This works fine the first time but it’s not idempotent. That might be okay for automated builds where the workspace is always clean, but it’s annoying when running the build repeatedly, locally. Hence you might be inclined to fix this by changing it to:

if not exist "bin\media" mkdir bin\media && copy media\*.* bin\media

Sadly this doesn’t do what the author intended because the && is part of the IF statement’s “then” block: the copy is only executed if the folder doesn’t exist. Hence the aforementioned bug, which wasn’t spotted at first as it worked fine for the automated builds but failed locally.

Here is a canonical example:

> if exist "C:\" echo A && echo B
A
B

> if not exist "C:\" echo A && echo B

As you can see, in the second case B is not printed, so it is part of the IF statement’s happy path.

Parentheses to the Rescue

Naturally the solution to problems involving ordering or precedence is to introduce parentheses to be more explicit.

If you look at how parentheses were used in the second example right back at the beginning, you might be inclined to write this, thinking that the parentheses create a scope somewhat akin to {} in C-style languages:

> if not exist "C:\" (echo A) && echo B

But it won’t work as the parentheses are still part of the “then” statement. (They are useful to control evaluation when mixing compound conditional commands that use, say, || and & [1].)

Hence the correct solution is to use parentheses around the entire IF statement:

> (if not exist "C:\" echo A) && echo B
B

Applying this to the original problem, it’s:

(if not exist "bin\media" mkdir bin\media) && copy media\*.* bin\media

 

[1] Single line with multiple commands using Windows batch file

Simple Tables From JSON Data With JQ and Column

Chris Oldwood from The OldWood Thing

My current role is more of a DevOps role and I’m spending more time than usual monitoring and administering various services, such as the GitLab instance we use for source control, build pipelines, issue management, etc. While the GitLab UI is very useful for certain kinds of tasks, the rich RESTful API allows you to easily build your own custom tools to monitor, analyse, and investigate the things you’re particularly interested in.

For example, one of the first views I wanted was an alphabetical list of all runners with their current status so that I could quickly see if any had gone AWOL during the night. The alphabetical sorting requirement is not something the standard UI view provides, hence I needed to use the REST API, or hope that someone had already done something similar first.

GitLab Clients

I quickly found two candidates, python-gitlab and go-gitlab-client, which looked promising, but they only really wrap the API: I’d still need to do some heavy lifting myself and understand what the GitLab API does. Given how simple the examples were, even with curl, it felt like I wasn’t really saving myself anything at this point, e.g.

curl --header "PRIVATE-TOKEN: $token" "https://gitlab.example.com/api/v4/runners"

So I decided to go with a wrapper script [1] approach instead and find a way to prettify the JSON output so that the script encapsulated a shell one-liner that would request the data and format the output in a simple table. Here is the kind of JSON the GitLab API would return for the list of runners:

[
  {
   "id": 6,
   "status": "online"
   . . .
  }
,
  {
   "id": 8,
   "status": "offline"
   . . .
  }
]

JQ – The JSON Tool

I’d come across the excellent JQ tool for querying JSON payloads many years ago, so that was my first thought for at least trimming the JSON payloads down to the fields I was interested in. However, on further reading I found it could do some simple formatting too. At first I thought the compact output using the -c option was what I needed (perhaps along with some tr magic to strip the punctuation), e.g.

$ echo '[{"id":1, "status":"online"}]' |\
  jq -c
[{"id":1,"status":"online"}]

but later I discovered the -r option provided raw output, which formatted the values as simple text and removed all the JSON punctuation, e.g.

$ echo '[{"id":1, "status":"online"}]' |\
  jq -r '( .[] | "\(.id) \(.status)" )'
1 online

Naturally my first thought for the column headings was to use a couple of echo statements before the curl pipeline, but I also discovered that you can mix and match string literals with the output from the incoming JSON stream, e.g.

$ echo '[{"id":1, "status":"online"}]' |\
   jq -r '"ID Status",
          "-- ------",
          ( .[] | "\(.id) \(.status)" )'
ID Status
-- ------
1 online

This way the headings were only output if the command succeeded.

Neater Tables with Column

While these crude tables were readable and simple enough for further processing with grep and awk, they were still pretty unsightly when the values of a column varied too much in length, such as a branch name or description field. Putting them on the right-hand side kind of worked, but I wondered if I could create fixed-width fields à la printf via jq.

At this point I stumbled across the StackOverflow question How to format a JSON string as a table using jq? where one of the later answers mentioned a command line tool called “column” which takes rows of text values and arranges them as columns of similar width by adjusting the spacing between elements.

This almost worked, except for the fact that some fields had spaces in their values and column would, by default, treat them as separate elements. A simple change of field separator from a space to a tab meant that I could have my cake and eat it, e.g.

$ echo '[ {"id":1, "status":"online"},
          {"id":2, "status":"offline"} ]' |\
  jq -r '"ID\tStatus",
         "--\t-------",
         ( .[] | "\(.id)\t\(.status)" )' |\
  column -t -s $'\t'
ID  Status
--  -------
1   online
2   offline

Sorting and Limiting

While for many of the views I was happy to order by ID, which is often the default for the API and, in the case of jobs and pipelines, a proxy for “start time”, there were cases where I needed to control the sorting. For example, we used the runner description to store the hostname (or host + container name), so it made sense to order by that, e.g.

jq 'sort_by(.description|ascii_downcase)'

For a runner’s jobs the job ID ordering wasn’t that useful, as the IDs are allocated up front but the job might start much later if it’s a later part of the pipeline, so I chose to order by the job start time instead, in descending order so that the most recent jobs were listed first, e.g.

jq 'sort_by(.started_at) | reverse'

One final trick that proved useful occasionally, when there was no limiting in the API, was to do it with jq instead, e.g.

jq "sort_by(.name) | [limit($max; .[])]"

 

[1] See my 2013 article “In The Toolbox – Wrapper Scripts” for more about this common technique of simplifying tools.

PowerShell’s Call Operator (&) Arguments with Embedded Spaces and Quotes

Chris Oldwood from The OldWood Thing

I was recently upgrading a PowerShell script that used the v2 nunit-console runner to use the v3 one instead when I ran across a weird issue with PowerShell. I haven’t found a definitive bug report or release note yet to describe the change in behaviour, hence I’m documenting my observation here in the meantime.

When running the script on my desktop machine, which runs Windows 10 and PowerShell v5.x, it worked first time, but when pushing the script to our build server, which was running Windows Server 2012 and PowerShell v4.x, it failed with a weird error that suggested the command line being passed to nunit-console was borked.

Passing Arguments with Spaces

The v3 nunit-console command line takes a “/where” argument which allows you to provide a filter to describe which test cases to run. This is a form of expression and the script’s default filter was essentially this:

cat == Integration && cat != LongRunning

Formatting this as a command line argument it then becomes:

/where:"cat == Integration && cat != LongRunning"

Note that the value for the /where argument contains spaces and therefore needs to be enclosed in double quotes. An alternative of course is to enclose the whole argument in double quotes instead:

"/where:cat == Integration && cat != LongRunning"

or you can try splitting the argument name and value up into two separate arguments:

/where "cat == Integration && cat != LongRunning"

I’ve generally found these command-line argument games unnecessary unless the tool I’m invoking is using some broken or naïve command line parsing library [1]. (In this particular scenario I could have removed the spaces too but if it was a path, like “C:\Program Files\Xxx”, I would not have had that luxury.)
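
To make the scenario concrete, a call along these lines is the kind of thing being discussed; this is a hypothetical sketch, with the runner path and test assembly name as placeholders rather than the real ones:

# Build the filter expression then invoke the v3 runner via the call operator.
$where = '/where:"cat == Integration && cat != LongRunning"'
& 'C:\Tools\NUnit\nunit3-console.exe' $where 'MyTests.dll'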

PowerShell Differences

What I discovered was that on PowerShell v4, when an argument has embedded spaces, it appears to ignore the embedded quotes and therefore sticks an extra pair of quotes around the entire argument, which you can see here:

> $where='/where:"cat == Integration"'; & cmd /c echo $where
"/where:"cat == Integration""

…whereas on PowerShell v5 it “notices” that the value with spaces is already correctly quoted and therefore elides the outer pair of double quotes:

> $where='/where:"cat == Integration"'; & cmd /c echo $where
/where:"cat == Integration"

On PowerShell v4, only by removing the spaces (which, as I mentioned above, may not always be possible) can you stop it adding the outer pair of quotes:

> $where='/where:"cat==Integration"'; & cmd /c echo $where
/where:"cat==Integration"

…of course now you don’t need the quotes anymore :o). However, if for some reason you are formatting the string, such as with the -f operator, that might be useful (e.g. you control the value but not the format string).
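
As a small illustration of that last point, the value can be substituted into a quote-laden format string; the category variable here is purely hypothetical:

> $category = 'Integration'
> $where = '/where:"cat=={0}"' -f $category; & cmd /c echo $where
/where:"cat==Integration"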

I should point out that this doesn’t just affect PowerShell v4; I also tried it on my Vista machine with PowerShell v2 and that exhibited the same behaviour, so my guess is this was “fixed” in v5.

[1] I once worked with an in-house C++ based application framework that completely ignored the standard parser that fed main() and instead re-parsed the arguments, very badly, from the raw string obtained from GetCommandLine().