CentOS setup on VirtualBox

Once you have Networking working there is still a long way to go.

yum groupinstall "Development Tools"
yum install kernel-devel
yum install kde-workspace
yum group install "X Window System"
yum groupinstall "Fonts"
yum install gdm

Now we can log in without a GUI and run startx when one is needed.

Installing Guest Additions

The guest CentOS is a stock distribution, so you have to tell it that it is running inside VirtualBox.

Make the additions visible to the guest:

In the "Devices" menu in the virtual machine's menu bar, VirtualBox has a handy menu item named "Insert Guest Additions CD image", which mounts the Guest Additions ISO file inside your virtual machine.

yum install dkms
mkdir -p /media/cdrom
# Note change from /dev/scd0 in CentOS6
mount /dev/sr0 /media/cdrom
sh /media/cdrom/VBoxLinuxAdditions.run

We can now move the mouse seamlessly between guest and host, and the two window systems understand each other.

Sharing files between the host and guest

In the host (Windows) create C:\vbshared and using the VirtualBox interface share this with the guest. In the guest:

mkdir /vbshared
mount -t vboxsf vbshared /vbshared

It will be visible as /vbshared/ from inside the guest.

Random Magic

Have you ever written a unit test with magic numbers in it and felt bad? For example, given a C++ class that simulates stock prices, Simulation, you would expect a starting price of zero to stay at zero. Let’s write a test for this using Catch:

TEST_CASE("simulation starting at 0 remains at 0", "[Property]")
{
    const double start_price = 0.0;
    const double drift       = 0.3;//or whatever
    const double volatility  = 0.2;//or whatever
    const double dt          = 0.1;//or whatever
    const unsigned int seed  = 1;  //or whatever
    Simulation price(start_price, drift, volatility, dt, seed);
    REQUIRE(price.update() == 0.0);
}

Oh dear; magic numbers. That sinking feeling when you don’t know or care what values some variables take. The comments hint at the unhappiness. You could write a few more test cases with other numbers, or use a parameterised approach. Trying every possible double or int would be extreme, and make the unit tests slow. Unit tests should be fast, so we’d best not. We could try some random variables instead of the magic numbers. This might lead to cases that sometimes fail, and unit tests should provide repeatable results, so we’d best not.

Oh dear. If only we had some random magic to help. We need something that allows us to test that properties hold for a variety of cases. We don’t want to hand roll lots of ad-hoc test cases ourselves. If we generate random test cases we need the results to be clearly reported so we know what went wrong if something fails. We need property-based testing. Good news! Haskell got there long before us. 

QuickCheck “is a tool for testing Haskell programs automatically. The programmer provides a specification of the program, in the form of properties which functions should satisfy, and QuickCheck then tests that the properties hold in a large number of randomly generated cases.” [See the manual] You define a property, such as reversing a reversed list gives the original list:

prop_RevRev xs = reverse (reverse xs) == xs
          where types = xs::[Int]

Then quickly check it holds for some randomly generated examples.

        Main> quickCheck prop_RevRev
        OK, passed 100 tests.

If a property doesn’t hold, quickCheck reports the case or “counter-example” for which it does not hold. Instead of my initial “example-based” test I can now test my property holds generally. Since the cases are randomly generated rather than exhaustive I may still miss problems, but look how much shorter the code was.

Wait a moment! I was trying to test some C++ and got distracted by Haskell. The good news is ports of QuickCheck exist for various languages. For example, F# has FsCheck, Python has Hypothesis and C++, being C++, has various versions. I have tried Legiasoft’s QuickCheck and showed my initial attempts at the #ACCU2015 conference.

A recent blog post from Spotify drew my attention to RapidCheck. This claims to integrate with Boost Test and Google Test/Mock, though I haven't tried it yet. I wonder if I can make it play nicely with Catch. I will report back. Another interesting feature it supports is stateful testing, based on Erlang’s port of QuickCheck. Since this started with Haskell, many frameworks need *pure* functions. Once in a while, some of us are not quite as pure as we'd like, so I can imagine this being very useful.
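To make the idea concrete without committing to any of these libraries, here is a hand-rolled sketch of the zero-stays-zero property using only the standard library. The Simulation class below is a hypothetical stand-in (one step of geometric Brownian motion, which is only my assumption about what the real class does), and Inputs and find_counter_example are names invented for this illustration. The generator uses a fixed seed, so the random cases are the same on every run, and the first failing inputs are handed back so they can be reported.

```cpp
#include <cmath>
#include <random>

// Hypothetical stand-in for the Simulation class: one step of geometric
// Brownian motion, where every term is proportional to the current price.
class Simulation {
public:
    Simulation(double start_price, double drift, double volatility,
               double dt, unsigned int seed)
        : price_(start_price), drift_(drift), volatility_(volatility),
          dt_(dt), engine_(seed) {}

    double update() {
        std::normal_distribution<double> normal(0.0, 1.0);
        const double shock = normal(engine_) * std::sqrt(dt_);
        price_ += price_ * (drift_ * dt_ + volatility_ * shock);
        return price_;
    }

private:
    double price_, drift_, volatility_, dt_;
    std::mt19937 engine_;
};

// The generated inputs for one case, kept so a failure can be reported.
struct Inputs {
    double drift, volatility, dt;
    unsigned int seed;
};

// Tries `cases` randomly generated inputs against the property
// "a start price of 0 stays at 0". Returns true and fills `failing`
// with the first counter-example found, or returns false if none is found.
bool find_counter_example(unsigned int generator_seed, int cases, Inputs& failing) {
    std::mt19937 gen(generator_seed);  // fixed seed, so runs are repeatable
    std::uniform_real_distribution<double> drift(-1.0, 1.0);
    std::uniform_real_distribution<double> volatility(0.0, 1.0);
    std::uniform_real_distribution<double> dt(0.01, 1.0);
    std::uniform_int_distribution<unsigned int> seed;

    for (int i = 0; i < cases; ++i) {
        const Inputs in{drift(gen), volatility(gen), dt(gen), seed(gen)};
        Simulation price(0.0, in.drift, in.volatility, in.dt, in.seed);
        if (price.update() != 0.0) {
            failing = in;
            return true;
        }
    }
    return false;
}
```

Inside a Catch test this might be used as `Inputs failing{}; REQUIRE_FALSE(find_counter_example(42u, 100, failing));`. Unlike a real QuickCheck port, this sketch makes no attempt to shrink a counter-example to a simpler one.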

I hope this has sparked some excitement about new ways of testing your code. Next time someone asks “Unit tests or integration tests?” say “Yes, and also property-based tests”.

Current Software Development Pre-Requisites

When starting a new project or joining an existing one there are a number of tools and features which should be in place. I have listed them in order of importance, which is also the order in which the global community learnt the painful lesson that none of them is optional.

This is based upon Project initiation - a recipe.

Short name

Google it, ensure it is available as a URL, check Twitter.


If there is no README create it now!

Source control

The only decision is public or private. It will be a git repo.

If any other SCM system is in place convert to git before doing anything else.

Decide on git usage strategy: git flow, release branches, developer forks with feature branches and merge to master.

Development machine

Do we really want to develop in Fortran under VMS? Oh, OK.

Develop on the operating system you are deploying to. If you develop on OSX and deploy to Debian it will bite you. Developing for Red Hat using Windows should be made illegal.

Continuous Integration

Jenkins of course.

Track the code coverage; anything less than 100% is not acceptable.

Static Analysis

For legacy projects Sonar establishes a baseline; for new projects it holds the line throughout the project's life.

Continuous Deployment

The closer to Continuous Deployment the fewer platform types are needed.


Metrics enable blue-green deployment and A/B testing.

Issue tracking and work planning

Just you: GitHub, team: Jira

Continuous Availability for a Jenkins Continuous Integration Service

When your CI server is becoming too big to fail

This post was written when I was responsible for a heavily used CI server, for a company which is no longer trading, so the tenses may be mixed.

Once an organisation starts to use Jenkins, and starts to buy into the Continuous Integration methodology, very quickly the Continuous Integration server becomes indispensable.

The Problem

The success of Jenkins is based upon its plugin based architecture. This has enabled Kohsuke Kawaguchi to keep tight control over the core whilst allowing others to contribute plugins. This has led to rapid growth of the community and a very low bar to contributing (there are currently over 1000 plugins).

Each plugin has the ability to bring your CI server to a halt. Whilst there is a Long Term Support version of Jenkins, the plugins, which supply almost all of the functionality, do not have any enforced gatekeeping.

Solution Elements

A completely resilient CI service is an expensive thing to achieve. The following elements must be applied bearing in mind the proportion of the risk of failure they mitigate.

Split its jobs onto multiple CI servers

Use of personal Jenkins installations is recommended, but there is still a requirement for a single, central server.

This should be a last resort; splitting tasks out across slaves achieves many of the benefits without losing a single reporting point.

Split jobs out to SSH slaves

We had a misconfiguration of our ssh slaves such that they install the Jenkins package. The only use of the package is to ensure that the jenkins user is present, though tasks should not, ideally, be run as the jenkins user.

One disadvantage of using ssh slaves is that the ssh keys have to be manually copied from the master server to the slaves.

Because jobs are initiated from the master to the slave, the master cannot be restarted during a job's execution (this is currently also true for JNLP slaves, but is not necessarily so).

The main disadvantage of ssh slaves is that by referencing real slaves they make the task of creating a staging server more complex, as a simple copy of the master would initiate jobs on the real slaves.

Split jobs out to JNLP slaves

Existing ssh slave jobs should be left unchanged until they can be replaced. This is a blocker on creating a staging CI server.

This is the recommended setup, which we used eventually for most jobs.

Minimise Shared Resources

Most of these problems can be overcome by spinning up a virtual machine for each job, from scratch, provisioned by Puppet via Vagrant.

In addition to sharing plugins, and hence sharing faulty plugins, another way in which jobs can adversely interact is by their use of shared resources (disk space, memory, CPUs) and shared services (databases, message queues, mail servers, web application servers, caches and indexes).

Run the LTS version on production CI servers

Move to LTS at the earliest opportunity.

There are two plugin feeds, one for bleeding edge, the other for LTS.

Strategies for Plugin upgrade

Hope and trust

Up until our recent problem I would have said that the Jenkins community is pretty high quality and most plugins do not break your server. Your ability to predict which ones will break your installation is small, so brace yourself and be ready to fix and report any problems that arise. I have run three servers for five years and not previously had a problem.

Upgrade plugins one at a time, restart server between each one.

This seems reasonable, but at a release rate of 4.3 per day, seven days a week since 2011-02-21, even your subset of plugins is going to get updated quite frequently.

Use a staging CI server, if you can

If your CI server and its slaves are all set up using Puppet, then you can clone it all, including repositories and services, so that any publishing acts do not have any impact on the real world; otherwise you will send emails and publish artefacts which interfere with your live system. Whilst we are using ssh slaves the staging server would either initiate jobs on real slaves or they too would need to be staged.

Use a partial staging CI server

Jobs which publish an artefact every time they are run cannot be re-run so are not suitable for running on a staging server.

You can prune your jobs down to those which are idempotent, i.e. those which do not publish and do not use ssh slaves, but the non-idempotent jobs cannot be re-run.

Control and monitor the addition of plugins

Users intending to install a plugin should ask on IRC, giving the plugin URL.

From the above it is clear that for a production CI server the addition of plugins is not risk or cost free.

Remove unused plugins, after consulting original installer

We still have a number of redundant plugins installed.

Plugins build up over time.

Monitor the logs

Currently there is no monitoring of the Jenkins log.

A log monitor which detects java exceptions might be used.

Backup the whole machine

Whilst the machine is backed up, a fire drill is needed to prove that a state can be returned to.

Once a month restore from backup to a clean machine.

Store the configuration in Git

The configuration of Jenkins has been stored in, and restored from, Git.

This process is only one element of recreating a server. Once a month restore from git to a clean machine.

Swift’s defer statement is funkier than I thought

Swift 2.0 introduced the defer keyword. I've used this a little but only in a simple way, basically when I wanted to make sure some code would be executed regardless of where control left the function, e.g.

private func resetAfterError() throws {
  defer {
    selectedIndex = 0
    isError = false
  }

  if /* condition */ {
    // Do stuff
  }

  if /* other condition */ {
    // Do other stuff
  }

  // Do default stuff
}

In my usage to date there has always been some code that should always be executed prior to the function's exit and additionally only one piece of code. Therefore I've always put the defer statement at the top of the function so when reading it's pretty obvious.

I was aware that if there were multiple defer statements then they'd be executed in reverse order but what I'd not given any thought to before was what happens if the defer statement isn't reached. In fact I'd just assumed it was more of a declaration that this code should always be executed on function exit and as I put mine right at the start of the function this was effectively the case.

However, for some functions (probably most) you don't want this. You only want the deferred code executing if something else has happened. This is shown simply in The Swift Programming Language book example:

func processFile(filename: String) throws {
  if exists(filename) {
    let file = open(filename)
    defer {
      close(file)
    }
    while let line = try file.readline() {
      // Work with the file.
    }
    // close(file) is called here, at the end of the scope.
  }
}

In this example, if the file is not opened then the deferred code should not be executed. Another very important usage is:

extension NSLock {
  func synchronized<T>(@noescape closure: () throws -> T) rethrows -> T {
    self.lock()
    defer { self.unlock() }
    return try closure()
  }
}

If the lock is never obtained then it should never be unlocked. In this case this shouldn't happen, as self.lock() will not return until it obtains the lock, but if that line were replaced with self.

This is how defer works. If the defer statement is never reached and/or encountered then the deferred code block will never be executed. This includes branches (if-statements etc.). The following example:

enum WhenToReturn {
  case After0
  case After1
  case After2
}

func deferTest(whenToReturn: WhenToReturn, shouldBranch: Bool) {
  print("Defer Test - whenToReturn:\(whenToReturn), shouldBranch:\(shouldBranch)")
  defer {
    print("defer 0")
  }
  if whenToReturn == WhenToReturn.After0 {
    return
  }
  defer {
    print("defer 1")
  }
  if whenToReturn == WhenToReturn.After1 {
    return
  }
  if shouldBranch {
    defer {
      print("defer 2")
    }
  }
}

deferTest(WhenToReturn.After0, shouldBranch: false)
deferTest(WhenToReturn.After1, shouldBranch: true)
deferTest(WhenToReturn.After2, shouldBranch: false)
deferTest(WhenToReturn.After2, shouldBranch: true)


Defer Test - whenToReturn:After0, shouldBranch:false
defer 0

Defer Test - whenToReturn:After1, shouldBranch:true
defer 1
defer 0

Defer Test - whenToReturn:After2, shouldBranch:false
defer 1
defer 0

Defer Test - whenToReturn:After2, shouldBranch:true
defer 2
defer 1
defer 0

Program ended with exit code: 0

This shows that returning early and/or not branching results in defer statements not being encountered, hence the deferred code is not executed. This is no different to, say, a finally-block in C#. The reason for my initial confusion is that there is no additional context for a defer block as there is for a finally block, i.e. the presence of the try, e.g.

  try     { /* Try some stuff */ }
  finally { /* Always do something having tried something regardless of whether it worked or not */ }

Whereas the only and actual context of the defer block is its position.
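For readers coming from C++, a rough analogue may help. C++ has no defer statement, but a destructor-based scope guard behaves in a similar way: its action runs when the enclosing scope exits, in reverse order of construction, and only if execution actually reached the line that constructs it. The sketch below mirrors deferTest under that assumption. ScopeGuard, guard_test and run are hypothetical names invented for this illustration, and the messages are collected in a vector rather than printed so the order can be checked.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard: runs its action when the enclosing scope exits,
// but only if execution reached the line that constructs it.
class ScopeGuard {
public:
    explicit ScopeGuard(std::function<void()> action) : action_(std::move(action)) {}
    ~ScopeGuard() { action_(); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;

private:
    std::function<void()> action_;
};

enum class WhenToReturn { After0, After1, After2 };

// Mirrors the Swift deferTest, appending to `log` instead of printing.
void guard_test(WhenToReturn when, bool should_branch, std::vector<std::string>& log) {
    ScopeGuard g0([&log] { log.push_back("defer 0"); });
    if (when == WhenToReturn::After0) return;

    ScopeGuard g1([&log] { log.push_back("defer 1"); });
    if (when == WhenToReturn::After1) return;

    if (should_branch) {
        // Fires at the end of this inner block, before g1 and g0.
        ScopeGuard g2([&log] { log.push_back("defer 2"); });
    }
}

// Convenience wrapper that returns the messages in the order they fired.
std::vector<std::string> run(WhenToReturn when, bool should_branch) {
    std::vector<std::string> log;
    guard_test(when, should_branch, log);
    return log;
}
```

Calling `run(WhenToReturn::After2, false)` yields only "defer 1" and "defer 0", because the guard inside the branch is never constructed.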

The Perils of debugging with return statements in languages without semi-colon statement terminators, i.e. Swift

This is a pretty obvious post but perhaps writing it will stop me falling prey to this issue.

When I'm debugging and I know that some code executed in a function is not to blame but is noisy in terms of what it causes to happen etc. I'll often just prevent it from being executed in order to simplify the system, e.g.

func foo() {
  // f()
  // Do lots of other things...
}

Sometimes I like to be quicker so I just put in an early return statement, i.e.

func foo() {
  return
  f()
  // Do lots of other things...
}

I must also go temporarily warning blind and ignore the following:

The effect of this is that, rather than preventing everything after the return statement from executing, the return statement (as per the warning) takes f() as its argument and explicitly calls it, returning its value, though not executing the remaining functions. In this case, as foo() (and f(), though it's not shown) is void, that is nothing. In fact if foo() or f() had non-void return types this wouldn't compile.

The fix is easy. Just put a semi-colon after the return.

func foo() {
  return;
  f()
  // Do lots of other things...
}


I use this 'technique' when I'm debugging C++ where this works fine. This is slightly interesting as C++ has the same semantics. The following C++ code has the same problem as the Swift, in that this code also invokes f() as its return value.

void foo()
{
  return
  f();
}

I guess the reason it's not an issue with C++ (as much or at all) is that my muscle memory or something else is always wanting to terminate lines with semi-colons so the natural way to write the return would be 'return;' whereas in Swift without the semi-colon requirement it's natural not to hence this issue becomes slightly more prevalent.
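A small, self-contained sketch makes the C++ behaviour observable. The counters, g() and the two foo variants are hypothetical names added only so the effect can be checked; f() stands in for the noisy call being skipped. Returning a void expression from a void function is legal C++, which is why the first variant compiles.

```cpp
// Hypothetical counters, only here to observe which calls actually run.
int calls_to_f = 0;
int calls_to_g = 0;

void f() { ++calls_to_f; }
void g() { ++calls_to_g; }

// No semicolon after return, so the statement is really `return f();`.
void foo_without_semicolon() {
    return
    f();
    g();  // never reached
}

// With the semicolon the function returns before anything else runs.
void foo_with_semicolon() {
    return;
    f();  // never reached
    g();  // never reached
}
```

After calling foo_without_semicolon() once, calls_to_f is 1 and calls_to_g is 0; calling foo_with_semicolon() leaves both counters unchanged.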

OAuth authentication on tvOS

I've recently published an Apple TV (tvOS) App to view photos stored on Microsoft OneDrive.

Implementing this on tvOS rather than iOS presented one unique challenge. The OneDrive REST API requires OAuth2 authentication in order to obtain an OAuth token which is then used for all the other calls.

Normally (well, based on my limited experience) OAuth within Apps is handled by using a UIWebView along with delegate code that performs the OAuth handshake.

tvOS does not contain any form of web view, i.e. no UIWebView and no WKWebView (not that it would be that much use due to the lack of hooks). As the actual authentication is performed within the UIWebView by the authenticating 3rd party (Microsoft in this case requiring the user logs in with their Microsoft Account credentials) there's not a lot that can be done without it.

However, both iOS and tvOS devices are generally logged into using an Apple Id, which is also used to log in to iCloud, and an Apple TV and an iOS device owned by the same person generally use the same Apple Id. Therefore, what I did was to write a very simple iOS App that:
  1. Performs the OAuth Authentication handshake
  2. Stores the resulting OAuth token in the iCloud KeyValue Store

When written this is usually synchronized to iCloud very quickly. On the other side the tvOS app reads the iCloud KeyValue Store checking to see if the OAuth token exists.

If it does then it can continue as per any other App that has successfully performed the OAuth handshake.

I believe that iCloud Storage and the process of writing to and reading from iCloud is secure. This is important as following the handshake the OAuth token acts effectively as a password. Each token obtained from Microsoft is valid for one hour so after that the user needs to perform the authentication from the iOS device again.

It is possible to request an OAuth refresh token, which allows a client to update an expired token as long as access to OneDrive for the App has not been revoked. However, I prefer to err on the side of caution at the moment. I also only request read-only access to OneDrive.

For this to work the same user (Apple Id) needs to be logged into both the Apple TV and the iOS device, and additionally be signed into iCloud on both. From a programmatic perspective both Apps need the iCloud capability enabled, but only with Key-value storage.

However, you'll notice that CloudKit has also been enabled. This is so that CKContainer methods can be called. In particular (well only)


In order to establish whether the user is currently signed in to iCloud and any changes (signing in & out potentially as a different user) whilst the App is running.

Enabling this automatically creates an Entitlement file (named <AppName>.entitlements) and within it creates a key, confusingly called 'iCloud Key-Value Store', with the default value of '$(TeamIdentifierPrefix)$(CFBundleIdentifier)'. This is not the key used to access the values you store, BTW, but is just the iCloud KV configuration. This will happen for both the iOS and tvOS Apps.

NOTE: The two collapsed keys are as a result of enabling CloudKit.

For both Apps to have access to the same iCloud Key-Value storage, the results of expanding the '$(TeamIdentifierPrefix)$(CFBundleIdentifier)' macros need to be the same. For my App I've created a single App that has both an iOS and tvOS component so their CFBundleIdentifier is the same. The TeamIdentifierPrefix is taken from your Developer Apple Id.

The first part of the value has to be $(TeamIdentifierPrefix) as in order to make the KV Storage secure this value forms part of the signing process. If you replaced the whole value with say 'BOB' then it won't build properly.

As such it's possible for all your Apps (published with the same Apple Id) to share iCloud Key-Value Storage contents.

Reading & writing is very simple. I just use a single Key-Value pair to read, write (& where necessary delete) the token. This is accomplished using:

NSUbiquitousKeyValueStore.defaultStore().setDictionary(stuff.asDict(), forKey: "mykey")

to write and

let stuff = NSUbiquitousKeyValueStore.defaultStore().dictionaryRepresentation["mykey"] as? [String:AnyObject]

to read.

This example reads & writes a dictionary, as I needed to store a set of KV pairs (a dictionary) as the value of a single KV pair, but fundamental data types can be stored directly too.

When starting the App, according to the docs, it is important to call the synchronize method to initiate timely iCloud synchronization.

When the tvOS based Apple TV was first released there were various articles about how to enable users to log in to their accounts for certain apps. These often involved a similar configuration, requiring the user to use an iOS device to input a string of numbers presented by the tvOS App. However, this solution was usually for Apps that managed their own accounts. My solution is similar in that it solves the cumbersome entry problem, but it also enables the use of browser (UIWebView) based OAuth2 on a device that directly supports it.

Git remote repos with OneDrive

I have various public git repositories on GitHub but I like to keep some source (usually my active App Store apps) private. Whilst it'd be nice to use GitHub private repositories, given that my Apps are for fun, don't really make anything and don't require collaboration, the pricing is prohibitive.

However, I really like the idea of having an offsite copy of my repository. As it happens I have an Office 365 Subscription which comes with 1TB of OneDrive space. I use OneDrive on OSX to sync a bunch of folders. I could use a synchronised OneDrive for my work directory but I don't want OneDrive synchronising all the temporary build files etc every time I build.

It turns out the ideal solution is to create a remote clone in a OneDrive synchronised folder. In fact I have a dedicated OneDrive folder called 'src' that contains clones of all of my git repos. Then, each time I commit to the local git repository and push OneDrive performs the synchronisation. If it happens that the code I'm working on is public then adding a public GitHub repo is a doddle.

Having created a local Git repo (though usually Xcode does this when starting a new project) it's easy to create the OneDrive clone:

cd /Users/Pete/OneDrive/src
git clone --bare file:////Users/Pete/Projects/<ProjectName>/.git <ProjectName>.git

As the remote clone is really just a backup-cum-master that will never be a working repository, I create a bare clone and suffix the directory name with '.git'. I think this is a fairly common convention.

At this point the source repository is cloned but the source is now a remote of the new repository rather than the other way round. This is easy to fix:

cd-ing into <ProjectName>.git (my example project is Photone) and running git remote -v gives:

~/OneDrive/src/Photone.git[23]git remote -v
origin file:////Users/Pete/Projects/PhotoneViewer/.git (fetch)
origin file:////Users/Pete/Projects/PhotoneViewer/.git (push)

Then running:

git remote remove origin

removes the relationship between the new remote and the original source. To implement the desired reverse relationship:

cd /Users/Pete/Projects/<ProjectName>
git remote add OneDrive file:////Users/Pete/OneDrive/src/<ProjectName>.git

I name my OneDrive remote repos 'OneDrive'. This helps if I have multiple remotes.

git remote (for my current project) now gives:

~/Projects/PhotoneViewer[49]git remote -v
OneDrive file:////Users/Pete/OneDrive/src/Photone.git (fetch)
OneDrive file:////Users/Pete/OneDrive/src/Photone.git (push)

From this point onwards I use SourceTree. However, if you use the command line then a couple of extra steps are required otherwise git complains. 

Firstly, if when pushing you just want to do:

git push OneDrive

you need to tell git that the OneDrive remote is the upstream for master. This is done by:

git push --set-upstream OneDrive master

Secondly, unless you've already set the push.default setting or just use 'git push --all' then you'll need to decide which option you want. The help from git describes these well:

warning: push.default is unset; its implicit value has changed in
Git 2.0 from 'matching' to 'simple'. To squelch this message
and maintain the traditional behavior, use:

  git config --global push.default matching

To squelch this message and adopt the new behavior now, use:

  git config --global push.default simple

When push.default is set to 'matching', git will push local branches
to the remote branches that already exist with the same name.

Since Git 2.0, Git defaults to the more conservative 'simple'
behavior, which only pushes the current branch to the corresponding
remote branch that 'git pull' uses to update the current branch.

See 'git help config' and search for 'push.default' for further information.
(the 'simple' mode was introduced in Git 1.7.11. Use the similar mode
'current' instead of 'simple' if you sometimes use older versions of Git)

If you're using the remote clone as a backup then perhaps the original, i.e. matching, behaviour is desirable. Whilst I use OneDrive this configuration should work for any other file synchronisation service, e.g. iCloud or Dropbox, or a standard mounted file system, e.g. Samba, NFS etc.