Republishing Bartosz Milewski’s Category Theory lectures

Andy Balaam from Andy Balaam's Blog

Category Theory is an incredibly exciting and challenging area of Maths that (among other things) can really help us understand what programming is at a fundamental level, and make us better programmers.

By far the best explanation of Category Theory that I have ever seen is a series of videos by Bartosz Milewski on YouTube.

The videos have quite a bit of background noise, and they were not available on PeerTube, so I asked for permission to edit and repost them, and Bartosz generously agreed! The conversation was in the comments section of Category Theory 1.1: Motivation and Philosophy and I reproduce it below.

So, I present these awesome videos, with background noise removed using Audacity, for your enjoyment:

Category Theory by Bartosz Milewski

Permission details:


Andy Balaam: Utterly brilliant lecture series. Is it available under a free license? I’d like to try and clean up audio and repost it to PeerTube, if that is permitted.
Bartosz Milewski: You have my permission. I consider my lectures public domain.

Visual Lint 7.0.12.336 has been released

Products, the Universe and Everything from Products, the Universe and Everything

Visual Lint 7.0.12.336 is a recommended maintenance update for Visual Lint 7.0. The following changes are included:

  • Updated the interface with IncrediBuild to support IncrediBuild 9.5.x.

  • IncrediBuild analysis tasks can now be colour coded in the IncrediBuild Build Monitor from within the Visual Studio, Atmel Studio and Eclipse plug-ins.

  • Fixed a bug which could cause duplicate IncrediBuild analysis tasks to be queued.

  • Updated the prompt displayed if an IncrediBuild installation was not found when IncrediBuild analysis was enabled.

  • Various corrections and updates to help topics.

Download Visual Lint 7.0.12.336

Visual Lint 8.0.1.337 has been released

Products, the Universe and Everything from Products, the Universe and Everything

Visual Lint 8.0.1.337 is a recommended maintenance update for Visual Lint 8.0. The following changes are included:

  • If the Visual Studio plugin is selected for installation and the Visual Studio Debug Console (VsDebugConsole.exe) is running, the installer will now ask you to close it before installation can proceed.

  • Updated the interface with IncrediBuild to support IncrediBuild 9.5.x. [also in Visual Lint 7.0.12.336]

  • IncrediBuild analysis tasks can now be colour coded in the IncrediBuild Build Monitor from within the Visual Studio, Atmel Studio and Eclipse plug-ins. [also in Visual Lint 7.0.12.336]

  • Fixed a bug which could cause duplicate IncrediBuild analysis tasks to be queued. [also in Visual Lint 7.0.12.336]

  • Updated the prompt displayed if an IncrediBuild installation was not found when IncrediBuild analysis was enabled. [also in Visual Lint 7.0.12.336]

  • Updated the values of _MSC_VER and _MSC_FULL_VER in the PC-lint Plus compiler indirect file co-rb-vs2019.lnt to support Visual Studio 2019 v16.4.4.

  • Updated the values of _MSC_VER and _MSC_FULL_VER in the PC-lint Plus compiler indirect file co-rb-vs2017.lnt to support Visual Studio 2017 v15.9.35.

  • Various corrections and updates to help topics.

Download Visual Lint 8.0.1.337

Python virtual environments with pyenv on Apple Silicon

Ekaterina Nikonova from Good With Computers

Apple's recent transition to Arm-based processors for its Mac computers has caused rather predictable problems for developers whose workflow depends on certain versions of pre-compiled libraries for the x86 architecture. While the latest releases of Python come with a universal installer that allows building universal binaries for M1 systems, those who prefer to manage Python environments with pyenv may find it difficult to choose the correct version for installation.

This problem can be solved by installing both x86 and arm64 Python executables. To do that, we need to be able to run pyenv in x86 mode and make sure that all system dependencies are met for both architectures. In other words, we'll need both x86 and arm64 Homebrew packages that we'll keep separate using two installations of Homebrew.

First of all, to be able to run x86 executables, we'll need to install Rosetta:

$ softwareupdate --install-rosetta

Now we can install the x86 Homebrew:

$ arch -x86_64 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"

It will be installed in the /usr/local/bin/ directory. For convenience, you can create an alias by adding the following line to your shell configuration file:

alias brew86="arch -x86_64 /usr/local/bin/brew"

Now we can invoke the x86 Homebrew as brew86 and install the packages required by pyenv for both architectures:

$ brew install openssl readline sqlite3 xz zlib

$ brew86 install openssl readline sqlite3 xz zlib

You can check whether the installation was successful, and that you have packages for both architectures, using the file command, for example:

$ file /opt/homebrew/Cellar/openssl@1.1/1.1.1k/bin/openssl
/opt/homebrew/Cellar/openssl@1.1/1.1.1k/bin/openssl: Mach-O 64-bit executable arm64

$ file /usr/local/Cellar/openssl@1.1/1.1.1k/bin/openssl
/usr/local/Cellar/openssl@1.1/1.1.1k/bin/openssl: Mach-O 64-bit executable x86_64

To install x86 Python, you'll need to call pyenv with the arch -x86_64 prefix. For convenience, let's create an alias for this command by adding the following line to the shell config file:

alias pyenv86="arch -x86_64 pyenv"

Now you can install x86 Python binaries by calling:

$ pyenv86 install <PYTHON_VERSION>

By default, pyenv doesn't allow you to specify custom names for the installed Python versions, but you can use the pyenv-alias plugin to give your installations appropriate names:

$ VERSION_ALIAS="3.x.x_x86" pyenv86 install 3.x.x

Note that with aliases for your pyenv and Homebrew installations, you'll have to specify them in all relevant commands and locations, for example:

$ CFLAGS="-I$(brew86 --prefix openssl)/include" \
LDFLAGS="-L$(brew86 --prefix openssl)/lib" \
VERSION_ALIAS="3.x.x_x86" \
pyenv86 install -v 3.x.x

Using atomics for thread synchronization in C++

Anthony Williams from Just Software Solutions Blog

In my previous blog post I wrote about spin locks, and how compilers must not move the locking loop above a prior unlock.

After thinking about this some more, I realised that this is not something specific to locks: the same issue arises with any two-step synchronization between threads.

Consider the following code:

std::atomic<bool> ready1{false};
std::atomic<bool> ready2{false};

void thread1(){
  ready1.store(true, std::memory_order_release);
  while(!ready2.load(std::memory_order_acquire)){}
}

void thread2() {
  while(!ready1.load(std::memory_order_acquire)) {}
  ready2.store(true, std::memory_order_release);
}

thread1 sets ready1 to true, then waits for thread2 to set ready2 to true. Meanwhile, thread2 waits for ready1 to be true, then sets ready2 to true.

This is almost identical to the unlock/lock case from the previous blog post, except the waiting thread is just using plain load rather than exchange.

If the compiler moves the wait loop in thread1 above the store then both threads will hang forever. However, it cannot do this, for the same reason the spinlocks can't deadlock in the previous post: the store has to be visible to the other thread in a finite period of time, so must be issued before the wait loop. https://eel.is/c++draft/intro.multithread#intro.progress-18

An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.

If the optimizer moved the store across the loop in thread1, then it could not guarantee that the value became visible to the other thread in a finite period of time. Therefore such an optimization is forbidden.
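As a concrete (if artificial) check, here is a minimal, self-contained harness of my own, not from the original post: given the progress guarantee quoted above, both joins must eventually return.

#include <atomic>
#include <thread>

std::atomic<bool> ready1{false};
std::atomic<bool> ready2{false};

void thread1(){
  ready1.store(true, std::memory_order_release);
  while(!ready2.load(std::memory_order_acquire)){}
}

void thread2(){
  while(!ready1.load(std::memory_order_acquire)){}
  ready2.store(true, std::memory_order_release);
}

int main(){
  std::thread t1(thread1);
  std::thread t2(thread2);
  // Neither join can hang: each store must become visible
  // to the other thread in a finite period of time.
  t1.join();
  t2.join();
}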


Another nail for the coffin of past effort estimation research

Derek Jones from The Shape of Code

Programs are built from lines of code written by programmers. Lines of code played a starring role in many early effort estimation techniques (section 5.3.1 of my book). Why would anybody think that it was even possible to accurately estimate the number of lines of code needed to implement a library/program, let alone use it for estimating effort?

Until recently, say up to the early 1990s, there were lots of different computer systems, some with multiple (incompatible'ish) operating systems, an almost non-existent selection of non-vendor supplied libraries/packages, and programs providing more-or-less the same functionality were written more-or-less from scratch by different people/teams. People knew people who had done it before, or had even done it before themselves, so information on lines of code was available.

The numeric values for the parameters appearing in models were obtained by fitting data on recorded effort and lines needed to implement various programs (63 sets of values, one for each of the 63 programs in the case of COCOMO).
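For reference, the basic COCOMO model takes the form Effort = a*KLOC^b; the commonly quoted 'organic' project coefficients (from the published model, not from this post's data) give:

Effort = 2.4*KLOC^{1.05} person-months

so any error in the estimated lines of code feeds straight through into the effort figure.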

How accurate is estimated lines of code likely to be (this estimate will be plugged into a model fitted using actual lines of code)?

I’m not asking about the accuracy of effort estimates calculated using techniques based on lines of code; studies repeatedly show very poor accuracy.

There is data showing that different people implement the same functionality with programs containing a wide range of number of lines of code, e.g., the 3n+1 problem.

I recently discovered, tucked away in a dataset I had previously analyzed, developer estimates of the number of lines of code they expected to add/modify/delete to implement some functionality, along with the actuals.

The following plot shows estimated added+modified lines of code against actual, for 2,692 tasks. The fitted regression line, in red, is: Actual = 5.9*Estimated^{0.72} (the standard error on the exponent is ±0.02); the green line shows Actual = Estimated (code+data):

Estimated and actual lines of code added+modified to implement a task.

The fitted red line, for lines of code, shows the pattern commonly seen with effort estimation, i.e., underestimating small values and overestimating large values; but there is a much wider spread of actuals, and the cross-over point is much further up (if estimates below 50 lines are excluded, the exponent increases to 0.92, the intercept decreases to 2, and the line shifts a bit). The vertical river of actuals either side of the 10-LOC estimate looks very odd (such small estimates happen when people estimate everything).
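As a quick check on what those coefficients imply (my arithmetic, not from the post), the cross-over point, where the fitted line meets Actual = Estimated, is at:

Estimated = 5.9*Estimated^{0.72}  =>  Estimated^{0.28} = 5.9  =>  Estimated = 5.9^{1/0.28} ≈ 570 lines

while at the low end a 10-line estimate maps to roughly 5.9*10^{0.72} ≈ 31 actual lines, around a factor of three underestimate.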

My article pointing out that software effort estimation is mostly fake research has been widely read (it appears in the first three results returned by a Google search on software fake research). The early researchers did some real research to build these models, but later researchers have been blindly following the early ‘prophets’ (i.e., later research is fake).

Lines of code probably does have an impact on effort, but estimating lines of code is a fool’s errand, and plugging estimates into models built from actuals is just crazy.

On Twenty-Niner – student

student from thus spake a.k.

The Baron's most recent wager set Sir R----- the task of placing tokens upon spaces numbered from zero to nine according to the outcome of a twenty sided die upon which were inscribed two of each of those numbers. At a cost of one coin per roll of the die, Sir R-----'s goal was to place a token upon every space, for which feat he should receive twenty nine coins and twenty nine cents from the Baron.
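Read as a coupon-collector problem (my own back-of-envelope reckoning, not part of the puzzle as set): each roll is effectively uniform over the ten numbers, so the expected number of rolls needed to cover every space is

E[rolls] = 10*H_10 = 10*(1 + 1/2 + ... + 1/10) ≈ 29.29

which places Sir R-----'s expected outlay at almost exactly the twenty nine coins and twenty nine cents on offer.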

Can non-overlapping spinlocks deadlock in C++?

Anthony Williams from Just Software Solutions Blog

There has been discussion on Twitter recently about whether or not the C++ memory model allows spinlocks to deadlock if they just use memory_order_acquire in lock and memory_order_release in unlock, due to compiler optimizations. The case in question is where a thread locks one mutex, unlocks it, and locks a second: can the compiler reorder the second lock above the first unlock? If it does, and another thread does the same in the reverse order, with the same optimization, then sequential locks could deadlock.

Here is the code in question, with all the lock/unlock code inlined.

std::atomic<bool> mutex1{false};
std::atomic<bool> mutex2{false};

int x=0;
int y=0;

void thread1(){
  while(mutex1.exchange(true,std::memory_order_acquire)){}  // #1
  x=1;
  mutex1.store(false,std::memory_order_release); // #2

  while(mutex2.exchange(true,std::memory_order_acquire)){} // #3
  y=1;
  mutex2.store(false,std::memory_order_release); // #4
}

void thread2(){
  while(mutex2.exchange(true,std::memory_order_acquire)){} // #5
  x=2;
  mutex2.store(false,std::memory_order_release); // #6

  while(mutex1.exchange(true,std::memory_order_acquire)){} // #7
  y=2;
  mutex1.store(false,std::memory_order_release); // #8
}

For there to even be the possibility of deadlock, thread1 must successfully execute line #1 before thread2 successfully executes line #7, and thread2 must successfully execute line #5 before thread1 successfully executes line #3. Because these are RMW operations, the threads must agree on the ordering.

The modification order of mutex1 must thus be #1(success), #2, #7(success), #8. Similarly, the modification order of mutex2 must be #5(success), #6, #3(success), #4.

All threads must agree on these modification orders. https://eel.is/c++draft/intro.multithread#intro.races-4

From the point of view of thread1, everything must run in program order: compilers can only optimize things as long as they run "as if" in program order.

The store to mutex1 at #2 is guaranteed to be visible to thread2 in "a finite period of time". https://eel.is/c++draft/intro.multithread#intro.progress-18

Consequently, thread2 must eventually see that store at line #7, even if it executes line #7 a large number of times first.

Therefore, the compiler cannot move line #3 completely above line #2, since doing so would not guarantee the visibility of #2 to other threads in a finite period of time. It can move an arbitrary number of executions of line #3 above line #2 (all of which will see that mutex2 is still true), but not all the executions of line #3.

Given that thread2 eventually sees the store from #2 at line #7, the exchange at line #7 will eventually succeed, and thread2 will eventually complete.

Likewise, the store at #6 must become visible to thread1 in a finite period of time. Therefore the exchange at line #3 will eventually see the value stored by #6, the exchange will succeed, and thread1 will complete; correspondingly, the compiler is not allowed to move all the executions of line #7 above #6.

No amount of compiler optimization is allowed to break this, so no: spinlocks cannot deadlock if they don't overlap.
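For reference, the inlined lock/unlock sequences above correspond to a spinlock along the following lines; this is my own reconstruction, not code from the post:

#include <atomic>

class spinlock {
  std::atomic<bool> locked{false};
public:
  void lock(){
    // spin until we are the thread that flips false -> true
    while(locked.exchange(true, std::memory_order_acquire)){}
  }
  void unlock(){
    locked.store(false, std::memory_order_release);
  }
};

Each numbered line in thread1 and thread2 is one of these two calls, inlined.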


Pricing by quantity of source code

Derek Jones from The Shape of Code

Software tool vendors have traditionally licensed their software on a per-seat basis, e.g., the cost increases with the number of concurrent users. Per-seat licensing works well when there is substantial user interaction, because the usage time is long enough for concurrent usage to build up. When a tool can be run non-interactively in the cloud, its use is effectively instantaneous; for instance, a tool that checks source code for suspicious constructs. Charging by lines of code processed is a pricing model used by some tool vendors.

Charging by lines of code processed creates an incentive to reduce the number of lines. This incentive was once very common, when screens supporting 24 lines of 80 characters were considered a luxury, or the BASIC interpreter limited programs to 1023 lines, or a hobby computer used a TV for its screen (a ‘tiny’ CRT screen, not a big flat one).

It’s easy enough to splice adjacent lines together, and halve the cost. Well, ease of splicing depends on programming language; various edge cases have to be handled (somebody is bound to write a tool that does a good job).

How does the tool vendor respond to a (potential) halving of their revenue?

Blindly splicing pairs of lines creates some easily detectable patterns in the generated source. In fact, some of these patterns are likely to be flagged as suspicious, e.g., if (x) a=1;b=2; (did the developer forget to bracket the two statements with { }?).
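To make the transformation concrete, here is a naive splicer sketch of my own (hypothetical code, not a tool anyone ships); it implements the even-onto-odd joining used for the plot below:

#include <iostream>
#include <string>

int main(){
  // join each even-numbered line (leading whitespace removed)
  // onto the preceding odd-numbered line
  std::string odd, even;
  while (std::getline(std::cin, odd)) {
    if (!std::getline(std::cin, even)) {
      std::cout << odd << '\n';  // odd number of lines: emit the last one as-is
      break;
    }
    auto pos = even.find_first_not_of(" \t");
    std::cout << odd
              << (pos == std::string::npos ? std::string{} : even.substr(pos))
              << '\n';
  }
}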

The plot below shows the number of lines in gcc 2.95 containing a given number of characters (left, including indentation), and the same count after even-numbered lines (with leading whitespace removed) have been appended to odd-numbered lines (code+data; this version of gcc was used in my C book):

Number of lines in gcc 2.95 containing a given number of characters, before and after splicing pairs of lines.

The obvious change is the introduction of a third straight'ish line segment (the increase in the offset of the sharp decline might be explained away as a consequence of developers using wider windows). By only splicing the 'right' pairs of lines together, the obvious patterns won't be present.

Using lines of code for pricing has the advantage of being easy to explain to management, the people who sign off the expense, who might not know much about source code. There are other metrics that are much harder for developers to game. Counting tokens is the obvious one, but it has developer perception issues: brackets, both round and curly. In the grand scheme of things, the use/non-use of brackets where they are optional has a minor impact on the token count, but brackets have an oversized presence in the developer's psyche.

Counting identifiers avoids the brackets issue, along with other developer perceptions associated with punctuation tokens, e.g., a null statement in an else arm.
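As a sketch of how little machinery an identifier count needs (hypothetical code of mine; a real tool would strip comments and string literals first):

#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main(){
  // count identifier-shaped tokens on stdin; keywords are not
  // filtered out, and comments/strings are not stripped
  std::regex ident{"[A-Za-z_][A-Za-z0-9_]*"};
  std::string line;
  std::size_t count = 0;
  while (std::getline(std::cin, line))
    count += std::distance(
        std::sregex_iterator(line.begin(), line.end(), ident),
        std::sregex_iterator());
  std::cout << count << '\n';
}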

If the amount charged is low enough, social pressure comes into play. Would you want to work for a company that penny pinches to save such a small amount of money?

As a former tool vendor, I’m strongly in favour of tool vendors making a healthy profit.

Creating an effective static analysis tool requires paying lots of attention to lots of details, which is very time-consuming. There are lots of not particularly good open source tools out there; the implementers did all the interesting stuff, and then moved on. I know of several groups who got together to build tools for Java when it started to take off in the mid-90s. When they went to market, they quickly found out that Java developers expected their tools to be free, and would not pay for claimed better versions. By making good enough Java tools freely available, Sun killed the commercial market for sales of Java tools (some companies used their own tools as a unique component of their consulting or service offerings).

Could vendors charge by the number of problems found in the code? This would create an incentive for them to report trivial issues, or be overly pessimistic about flagging issues that could occur (rather than will occur).

Why try selling a tool? Why not offer a service selling issues found in code?

Back in the day, a living could be made by offering a go-faster service, i.e., turning up at a company and reducing the usage cost of its applications, or the turn-around time (e.g., getting the daily management numbers to appear in less than 24 hours). This was back when mainframes ruled the computing world, and usage costs could be eye-watering.

Some companies offer bug-bounties to the first person reporting a serious vulnerability. These public offers are only viable when the source is publicly available.

There are companies who offer a code review service. Having people review code is very expensive; tools are good at finding certain kinds of problem, and investing in tools makes sense for companies looking to reduce review turn-around time, along with checking for more issues.

OKRs in Agile infographic

Allan Kelly from Allan Kelly Associates

I am indebted to Yoan Thirion for creating this infographic to illustrate Succeeding with OKRs in Agile. He's done a brilliant job on both the graphics and the summary – undoubtedly better graphics than I could have done, and probably a better summary than I would have written too; sometimes one can be too close to a thing.


Subscribe to my blog newsletter and download Continuous Digital for free – normal price $9.99/£9.95/€9.95

The post OKRs in Agile infographic appeared first on Allan Kelly Associates.