Elastic stack – RTFM

Frances Buontempo from BuontempoConsulting

I tried to setup ELK (well, just elasticsearch and kibana initially), with a view to monitoring a network.

Having tried to read the documentation for an older version than I'd downloaded and furthermore one for *Nix when I'm using Windows, I eventually restarted at the "Learn" pages on https://www.elastic.co/

There are a lot of links in there, and it's easy to get lost, but it is very well written.

This is my executive summary of what I think I did.

First, download the zip of kibana and elasticsearch.

From the bin directory for elasticsearch, run elasticsearch.bat file, or run service install then service run. If you run the batch file it will spew logs to the console, as well as a log file (in the logs folder). You can tail the file if you choose to run it as a service. Either works.

If you then open http://localhost:9200/ in a suitable browser you should see something like this:

{
"name" : "Barbarus",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "bE-p5dLXQ_69o0FWQqsObw",
"version" : {
"number" : "2.4.1",
"build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
"build_timestamp" : "2016-09-27T18:57:55Z",
"build_snapshot" : false,
"lucene_version" : "5.5.2"
},
"tagline" : "You Know, for Search"
}
 
The name is a randomly assigned Marvel character. You can configure all of this, but don't need to just to get something up and running to explore. kibana will expect elasticsearch to be on port 9200, but again that is configurable. I am getting ahead of myself though.

Second, unzip kibana, and run the batch file kibana.bat in the bin directory. This will witter to itself. This starts a webserver, on port 5601 (again configurable, but this by default): so open http://localhost:5601 in your browser.

kibana wants an "index" (way to find data), so we need to get some into elasticsearch: the first page will say "Configure an index pattern". This blog has a good walk through of kibana (so do the official docs).

All of the official docs tell you to use curl to add (or CRUD) data in elasticsearch, for example
curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{
"name": "John Doe"
}'

NEVER try that from a Windows prompt, even if you have a curl library installed. You need to escape out the quote, and even then I had trouble. You can put the data (-d part) in a file instead and use @, but it's not worth it.
Python to the rescue. And Requests:HTTP for Humans
pip install requests
to the rescue.
Now I can run the instructions in Python instead of shouting at a cmd prompt.

import requests
r = requests.get('http://localhost:9200/_cat/health?v')

r.text


Simple. The text shows me the response. There is a status code property too. And other gooides. The the manual. For this simple get command you could just point your browser at localhost:9200/_cat/health?v


Don't worry if the status is yellow - this just means you only have omne node so it can't replicate in cause of disaster.

Notice the transport, http:// at the start. If you forget this, you'll get an error like
>>> r = requests.put('localhost:9200/customer/external/1?pretty', json={"name": "John Doe"})
...
    raise InvalidSchema("No connection adapters were found for '%s'" % url) requests.exceptions.InvalidSchema: No connection adapters were found for 'localhost:9200/customer/external/1?pretty'



Now we can put in some data.

First make an index (elastic might add this if you try to put data under a non-existent index). We will then be able to point kibana at that index - I mentioned kibana wanted an index earlier.
r = requests.put('http://localhost:9200/customer?pretty')


Right, now we want some data.
>>> payload = {'name': 'John Doe'}
>>> r = requests.post('http://localhost:9200/customer/external/1?pretty', json=payload)


If you point your browser at localhost:9200/customer/external/1?pretty you (should) then see the data you created. We gave it an id of 1, but it will be automatically assigned a unique id if we left that off.

We can use requests.delete to delete, and requests.post to update:
 >>> r = requests.post('http://localhost:9200/customer/external/1/_update', \
 json={ "doc" : {"name" : "Jane Doe"}})

Now, this small record set won't be much use to us. The docs have a link to some json data. I downloaded some ficticious account data. SO to the rescue for uploading the file:


>>> with open('accounts.json', 'rb') as payload:
...   headers = {'content-type': 'application/x-www-form-urlencoded'}
...   r = requests.post('http://localhost:9200/bank/account/_bulk?pretty', \ 

              data=payload,  verify=False, headers=headers)
...

>>> r = requests.get('http://localhost:9200/bank/_search?q=*&pretty')
>>> r.json()
This is equivalent to using

>>> r = requests.post('http://localhost:9200/bank/_search?pretty', \

      json={"query" : {"match_all": {}}})
 i.e. instead of q=* in the uri we have put it in the rest body.



Either way, you now have some data which you can point kibana at. In kibana, the discover tab allows you to view the data by clicking through fields. The visualise tab allows you to set up graphs. What wasn't immeditely apparent was once you have selected your buckets, fields and so forth, you need to press the green "play" button by the "options" to make it render your visualisation. And finally, I got a pie chart of the data.  I now need to point it at some real data.

 
 

 

Pipeline 2016

Frances Buontempo from BuontempoConsulting

A write up of my notes: they may or may not make any sense.

Keynote: Jez Humble "What I Learned From Three Years Of Sciencing The Cr*p Out Of Continuous Delivery" or "All about SCIENCE"

Suverys

Surveys are measures looking for latent constructs for feelings and similar - see psychometrics.
Surveys need a hypothesis to test and should be worded carefully.
Consider discriminant and convergent validity.
Test for false positives.

Consider the Westrum toypology.
With 6 axes (rows) scaled across three columns: pathological, bureaucratic, generative you can start spotting connections.

Pathological
Bureaucratic
Generative
Power Oriented
Rule Oriented
Performance Oriented
Low cooperation
Modest cooperation
High cooperation
Messengers shot
Messengers neglected
Messengers trained
Responsibilities shirked
Narrow responsibilities
Risks are shared
Bridging discouraged
Bridging tolerated
Bridging encouraged
Failure leads to scapegoating
Failure leads to justice
Failure leads to inquiry
Novelty crushed
Novelty leads to problems
Novelty implemented

For example "Failure leads to" has three different options: scapegoating, justice or inquiry. Where does your org come out for each question? If they say "It's all Matt's fault" and sack Matt that won't avoid mistakes happening again. Blameless postmortems are important.
IT and aviation are both high-tempo, high consequence environments. They are adaptive complex systems: there is frequently not enough information to make a decision. Therefore reduce the consequences of things going wrong.
In general for surveys, use a Likert type scale - use clearly worded statements on a scale, allowing numerical analysis. See if your questions "load together" (or bucket). Maybe spotting what's gone wrong with some software buckets into notification from outside (customers etc) and notification from inside (alerts etc).
Consider CMV, CMB - common method variance or bias. Look for early versus late respondents.
See https://puppetlabs.com/2015-devops-report for the previous devops survey.
In fact take this year's https://puppetlabs.com/blog/2016-state-devops-survey-here

IT performance

How do you measure it? How do you predict it? It seems that "I am satisfied with my job" is the biggest predictor of organisational performance.
Does your company have a culture of "autonomy, mastery, purpose"? What motivates us? [See Pink]

How do we measure IT performance? Consider lead time, release frequency, time to restore, change failure rate...
Going faster doesn't mean you break things, it actually makes you *more* stable, if you look at the data [citation needed]
"Bi-modal IT" is wrong: watch out for Jez's upcoming blog about "fast doesn't compromise safety"

Do we still want to work in the dark-ages of manual config and no test automation?

We claim we are doing continuous integration (CI) by redefining CI. Do devs merge to trunk daily? Do you have tests? Do you fix the build if it goes red?

Aside: "Surveys are a powerful source of confirmation bias"

Question: Can we work together when things go wrong?

Do you have peer reviewed changes? (Mind you, change advisory boards)

Science again (well, stats)

SEM: structured equation modelling: use this to avoid spurious correlations.

Apparently 25% of people do TDD - it's the lost XP practice. TDD forces you to write code in testable ways: it's not about the tests.

How good are your tests? Consider mutation testing e.g. Ivan Moore's Jester

Change advisory boards don't work. They obviously impact throughput but have negligible impact on stability. Jez suggested the phrase "Risk management theatre".


Ian Watson and Chris Covell "Steps closer to awesome"

They work at Call Credit (used to be part of the Skipton building soc) and talked about how to change an organisation.

Their hypothesis: "You already have the people you need."
"Metal as a service" sneaked a mention, since some people were playing buzz-word bingo.
Question: what would make this org "nirvana"?
They started broadcasting good (and bad) things to change the culture. e.g. moving away from a fear of failure. Having shared objectives helped.

We are people, not resources. "Matrix management" (queue obvious slides)  - not a good thing. Be the "A" team instead. (Or the goonies).

The environment matters. They suggested blowing up a red balloon each time you are interrupted for 15 seconds or more, giving a visual aid of the distractions.

They mentioned "Death to manual deployments" being worth reading.

They said devs should never have access to prod.
You need centres of excellence: peer pressure helps.
They have new bottlenecks: "two speed IT" .... the security team should be enablers not the police.
They mentioned the "improvement kata"
They said you need your ducks in a straight line == a backlog of good stories.

Gary Frost "Financial Institutions Carry Too Much Risk, It’s Time To Embrace Continuous Delivery"

of 51zero.com
Sarbanes-Oxley (SOx) was introduced because of risk in finance. Has it worked? No.
It brought about a segregation of duties and lots of change control review. "runbooks" This is still high risk. There have been lots of breeches from IT departments e.g. Knight Capital, NatWest (3 times).
Why are we still failing, despite these "safety measures"?
We need fully automated testing including security and performance. We need micro-services (and containers), giving us isolation.
Aside; architecture diagrams...! Are they helpful? Are they even correct? Why not automatically generate these too so they are at least correct?

What are the blockers? Silos. Move to collaborative environments.

Look out for new FinTech disruption (start-ups I presume)

Gustavo Elias "How To Deal With A Hot Potato"

He was landed with legacy code that was deeply flawed, had multiple responsibilities and high maintenance costs. In fact he calculated these costs and told management, For example, with downtime for deployment and 40 minutes to restarted calculate the cost at over £500 per day per dev.
How to change this?
  • Re-architect
  • Reach zero downtime
  • Detach from the old release cycle
How?
Re-architect with micro-services and the strangle-vine pattern.
Reach zero downtime with a canary release and blue/green deployment. You need business onside for the extra hardware.
Old release cycle: bamboo plan - but this needs new machines.
In the end, be proud.

Pete Marshall "Achieving Continuous Delivery In A Legacy Environment"

The tech architect at Planday (a shift work app)
C.D. in a legacy environment: and not "chaotic delivery".
Ask the question: "What are you business goals?"
They had DNS load balancing, "interesting stand-ups" (nobody cared), no monitoring.
He started a tech radar: goals to get people on board.
He used a corp screensaver to communicate the pipeline vision.
How easy is your code to build? Do you know what's actually in prod? Can you find the delta?
He changed nant to msbuild.
He became a test mentor, having half hour sessions to increase test coverage.
They had estimation sessions and planning sessions.
Teams started to release on their own schedule with minimal disruption to others. 
Logging, monitoring and alerting helped: look for patterns in the logs. n.b. loggly (though cloud based with no instance in Europe so might be slow)
He mentioned feature toggles (I wondered how he implemented these: please not boolean flags in a database, but enough of my pain), though watch out - you can still get surprises.
He used the strangle pattern.
Don't do loads of things: do a couple of things you can actually measure.
Ask yourself "What's the risk of failure?"

Sally Goble "What do you do if you don't do testing?"

From QA at The Guardian
They previously has a two-week release cycle, with a staging environment and lots of manual testing.
They deployed at 8am on a Wednesday. A big news day delayed the release cycle by a week. 
They couldn't roll back.
They moved to automated tests - perhaps selenium. They were mainly comparing pixels.
Then they threw them out.
So, what does QA do if it doesn't do testing? They now make sure they are "not wrong long." i.e. they can fix things quickly.
They have feature switching, canary releases and monitoring (but avoid noise).
They are not a testing department but a quality department. They can concentrate on other things - like less data so apps don't blow out users' data plans or similar.

Steve Elliott "Measure everything, not just production"

Laterooms: something about badgers.
Tools: log aggregation: elastic stack. Metrics: kibana, grafana. Alerting: icinga(2) [like nagios only prettier]
Previously dev/test was slow, had no investment. They had flaky tests and it was difficult to spot trends.
They moved to instrumentation and tooling in dev.
"Measure ALL the things"
Be aware that dashboard fatigue is a thing.
He pointed us at github
Have lots of metrics but don't used them to be Orwellian. Have data-driven retrospectives. (I once made a graph of who was asking who for code review to reveal cliques in our team - data makes a difference! And pictures more so.) He mentioned that you need to make space for feelings in the retrospectives too.
He suggested mixing up the format to keep retrospectives fresh: consider using http://plans-for-retrospectives.com/index.html

He said he was running sentiment analysis on the tweets he got during his talk. 

He mentioned that Devops Manchester is always looking for speakers.

Summary

I'm so glad I went. It's useful to see people talking about their successes (and failures) and to reflect on common themes. "People not resources" struck a deep note for me. I am always inspired when I see people trying to make things better, no matter how hard.
I loved the brief mention of stats in the keynote. The main themes were, of course, about measuring and automating. I will spend time thinking about what else I can measure and how to do stats and present them to non-statisticians in a clear way.
Never under-estimate the power of saying "Prove it" when someone makes a claim.




Random Magic

Frances Buontempo from BuontempoConsulting

Have you ever written a unit test with magic numbers in and felt bad? For example, given a C++ class that simulates stock prices, Simulation, you would expect a starting price of zero to stay at zero. Let’s write a test for this using Catch 

TEST_CASE("simulation starting at 0 remains at 0", "[Property]")
{
    const double start_price = 0.0;
    const double drift       = 0.3;//or whatever
    const double volatility  = 0.2;//or whatever
    const double dt          = 0.1;//or whatever
    const unsigned int seed  = 1;  //or whatever
    Simulation price(start_price, drift, volatility, dt, seed);
    REQUIRE(price.update() == 0.0);
}

Oh dear; magic numbers. That sinking feeling when you don’t know or care what values some variables take. The comments hint at the unhappiness. You could write a few more tests cases with other numbers, or use a parameterised approach. Trying every possible double or int would be extreme, and make the unit tests slow. Unit tests should be fast, so we’d best not. We could try some random variables instead of the magic numbers. This might lead to cases that sometimes fail, and unit tests should provide repeatable results, so we’d best not.

Oh dear. If only we had some random magic to help. We need something that allows us to test that properties hold for a variety of cases. We don’t want to hand roll lots of ad-hoc test cases ourselves. If we generate random test cases we need the results to be clearly reported so we know what went wrong if something fails. We need property-based testing. Good news! Haskell got there long before us. 

QuickCheck “is a tool for testing Haskell programs automatically. The programmer provides a specification of the program, in the form of properties which functions should satisfy, and QuickCheck then tests that the properties hold in a large number of randomly generated cases.” [See the manual] You define a property, such as reversing a reversed list gives the original list

prop_RevRev xs = reverse (reverse xs) == xs
          where types = xs::[Int]

Then quickly check it holds for some randomly generated examples.


        Main> quickCheck prop_RevRev
        OK, passed 100 tests.

If a property doesn’t hold, quickCheck reports the case or “counter-example” for which it does not hold. Instead of my initial “example-based” test I can now test my property holds generally. Since the cases are randomly generated rather than exhaustive I may still miss problems, but look how much shorter the code was.

Wait a moment! I was trying to test some C++ and got distracted by Haskell. The good news is ports of QuickCheck exist for various languages. For example, F# has FSCheck  Python has Hypothesis  and, C++ being C++, has various versions. I have tried Legiasoft’s QuickCheck and showed my initial attempts at the #ACCU2015 conference.

A recent blog from Spotify drew my attention to RapidCheck. This claims to integrate with Boost test and Google Test/Mock though I haven't tried it yet. I wonder if I can make it play nicely with Catch. I will report back. Another interesting feature it supports is stateful based testing, based on Erlang’s port of QuickCheck. Since this started with Haskell, many frameworks need *pure* functions. Once in a while, some of us are not quite as pure as we'd like, so I can imagine this being very useful.

I hope this has sparked some excitement about new ways of testing your code. Next time someone asks “Unit tests or integration tests?” say “Yes, and also property-based tests”.

In vivo, in vitro, in silico

Frances Buontempo from BuontempoConsulting


In vivo, in vitro, in silico


Some people get unit testing and some people don't. The reasons vary, usually based on a mixture of previous experience, lack of experience, fear of the unknown or joy at a safer quicker way of developing. One specific doubt crops up from time to time. It comes in the form of "If I test small bits, i.e. units, whatever *that* means, it proves nothing. I need to test the whole thing or small parts of the whole thing live."

My PhD was in toxicity prediction, which involves testing if something will be toxic or not. You can test a chemical "in vivo" - administer it to a several animals in varying doses. You sit back and wait til half of them die, or show toxicity symptoms and record the doses. This gives you the Lx50 - for example the LD50 is the lethal dose that kills 50% of the animals.  Notice I said you can do this. You can also test the chemical on a set of cells in a test tube or petri dish - "in vitro" (in glass). Again you can find the dose which affects 50% of the specimens. I personally find this less upsetting, but I want to focus on parallels with testing code here. Finally, given all this data the previous tests have generated, you can analyse the data, probably on a computer, perhaps finding chemical structure to activity relationships - SAR, or quantitative SARS i.e. QSARs. These are referred to as "in silico" - for obvious reasons. Some in silico experiments will just find clusters of similar chemical, which can either alert you to groups that might need more detailed toxicity testing, or even guide drug discovery by steering clear of molecules, say containing benzene rings which can be carcinogenic, saving time and money if you are trying to invent a drug that cures cancer. The value of testing on a computer outside a live organism should be clear. It can save time, money and even lives.


If we keep this in mind while considering testing a software system, rather than a biological system, we should be able to see some parallels. It is possible to test a live system - maybe on beta rather than "TIP" (Test in production). This can be a good thing. However, it might save time and money, and though maybe not lives, certainly headaches, to test parts of the live system in a sandboxed environment, analogous to in vitro. Running an end to end test against a test database instance with data in a specific state might count. Pushing the analogy further, you could even test small parts of the system, say units, whatever they are, in silico. Just try this small part away from the live system in a computer. This is worthwhile. It will be quicker, as toxicity in silico experiments are quicker - they tend to take hours rather than days. This is a good thing. Of course, you won't know exactly what will happen in a full live system, but you can catch problems earlier, before killing something. This is a Good Thing.

Other industries also test things in units - I could put together a car or a computer hit the on switch and see if it works.  However, I am given to believe that the components are tested thoroughly *before* the full system is built. If I build a PC and it doesn't work I will then have to go through one part at a time and check. If someone tests the parts first, this will ensure I haven't put a dodgy power block in the whole thing. Testing small parts, preferably before testing the whole system, is a Good Thing.

I don't believe this short observation will change anyone's minds. But I hope it will give pause for thought to those who think only testing from end to end matters, and testing "in silico" is a waste of time.










Language lawyers – or why words can have precise meaning

Frances Buontempo from BuontempoConsulting

I was called a language lawyer the other day, because I attempted to be precise about the state of play with some code. Initially I was taken aback, but eventually concluded that the phrase "language lawyer" was not being used precisely. It was used in the sense of, "Saying exactly what you mean." If I had clarified this the self-reference may have meant I got lost down a rabbit hole, so I left it.

The situation came about because a co-worker is changing some code in a repo which has a few unit tests, but due to circumstance I won't bore you with the code is in two repos - one has the tests and the other doesn't. I have been tasks with getting tests round any code changes he makes. I am therefore working in the repo with the tests. He, of course, has decided to work in the repo without tests so doesn't know if his code changes break any existing tests.

/head-desk

It's like pair-programming but we have to talk in words rather than code.

I cannot manage to guess what his code changes might do to the tests. This would be so much easier if he ran the tests as he changed the code. In fact, by definiton, refactoring should involve running the tests as you go. Trying to ask questions like "Have you deleted the isValid function or changed its behaviour?" in order to try to get the tests to match his changes have resulted in answers like "No, well a bit, but I haven't decided yet."

My attempted to print off the test names so we could discuss how the code actually behaved before the changes have been met with ,"I haven't looked at the tests yet - I'd need to look at the code to see what they test." I think the tests have really clear names - like FooWithDefaultDateIsNotValid, He could look at the test code but I was rather hoping this was clear enough. I tried asking what new test *names* we might need, but got no-where. He did suggest I check the private container didn't contain any default dates - and offered to add a getter so I could verify this from outside the object in test code. I muttered something about encapsulation and seppuku and encapsulation.

I'm not sure if this is happening because people are used to function names making no sense and figuring out  one line at a time in a debugger, or if some people genuinely don't think in words. It's very difficult to communicate if people assume you aren't saying what you mean, realise you are and then call you out for trying to be clear.

Eulogy for my Dad

Frances Buontempo from BuontempoConsulting

My Dad loved many things and it therefore falls to me to mention mathematics. Non-geeks tend to say things like “He had a gift for that,” as though geeks know a magic incantation or are “naturally clever”. My Dad was clever. However, one of my lecturers at University frequently reminded me that “Genius is 1% inspiration and 99% perspiration.” David loved maths and was willing to spend many hours learning more, usually in order to teach his students or to share the latest puzzle he was thinking about with anyone willing to listen. Recently the puzzles had tended to be the Sunday Times Puzzler. After showing him how to program in Python he submitted a few that were accepted. You can still see them on the internet if you search. He has left ripples in the ether.
He cared about sharing results and ideas, and instilled in me the joy of someone moving from disbelief to confusion to understanding and conviction. The world is often a bigger and more amazing place than we first assume. I recall him getting a paper published he had written with some students. Not only did he credit the students, he also persuaded the publishers to include the negative results – things they tried that went wrong. Many academic papers avoid doing this, but he felt it is important to stop others from going down the same blind alleys and to learn from your mistakes.
I know he inspired many people. In my brief career as a teacher I met many who had been his students at Christchurch and they always spoke highly of him. He had a knack for explaining things and making sure you had understood. He was also willing to listen to me trying to explain things to him – including how to code in python and what I was trying to do with my “new work” chapter in my PhD thesis. He was willing to ask “Why?” and allowed me to ask as well. I have never grown out of this and that leaves me with an unsatisfiable curiosity. That makes it OK to ask “Why?” about his unexpected death. Not being a mathematical question, we are unlikely to get a clear and compelling answer, but it’s ok to ask.
The day after he died, I saw a nine digit number in large neon on the top of a building-front. It had all the digits except the number “1” – I forget which was repeated. I have no idea what the number meant, but I know if he’d been there he would have noticed it as well. Whenever I notice symmetries in tiles or paving slabs, or broken symmetries, curious numbers or patterns I will think of him. And have been doing for years. His excitement and curiosity about mathematics could be infectious if you were prone to it. Some people might say “Stop being a geek,” others just raise an eyebrow. Once in a while you’ll find someone else who’s noticed it too or looks when you point and wonders with you at the patterns and meaning that point to something greater in an otherwise chaotic seeming world.
I have no idea why the digit 1 was missing from the number on the building, let alone what the number was trying to convey, but having spotted a surprising number of physics books on the bookshelves of a man who claimed physics is just watered-down maths, I am reminded of a quote attributed to Feynman:
"I would rather have questions that can’t be answered than answers that can’t be questioned."
It’s always ok to ask “Why?” We may never really know but we may discover beautiful and interesting things on the way. Or perhaps I should end with another actual Feynman quote

“The most important thing I found out from [my father] is that if you asked any question and pursued it deeply enough, then at the end there was a glorious discovery of a general and beautiful kind.”


Testing legacy code by adding singletons

Frances Buontempo from BuontempoConsulting

This is not a good idea: Michael Feathers says "STOP IT NOW"

Testing legacy code

Many people have read Mike Feather's excellent book, "Working effectively with legacy code." including people on my team. Some people like Mocks. Watch this space - Overload 127 will contain an article asking if mocks are always the right thing to use.

My team has lots of legacy code, that is code without tests. We want to get it under test and I want these tests to run on our Jenkins box. I want any quick running tests to run on each checkin and email us if the build got broken and whoever broke it fix it. A girl can dream.

Stop it - you're doing it wrong


We seem to be developing a "pattern" whereby we introduce singletons in order to make our code testable. Yes, I just said introduce singletons in order to make the code testable.
I think this is happening because "we" (well, they) want to use gmock because it's brilliant. I could be wrong. Perhaps it doesn't matter why it is happening we just need to stop this and do something different.

Why does gmock make you write singletons?

Let's look at an example, with the names changed to protect the guilty.
Suppose you have some code like this (C++).

class Asset
{

    //miles and miles of public functions and comments
    double Value(std::string logMessage, double someIrrelevantNumberToLog);

  

};

double Asset::Value(std::string logMessage, double someIrrelevantNumberToLog)
{
    ENTERPRISE_INHOUSE_LOG_FRAMEWORK_THAT_PULLS_IN_THE_WORLD(info, logMessage, someIrrelevantNumberToLog);

    double value = 0.0;
    if (isSpot)
        value = spotValue(m_notional, m_exchangeRate);
    else
        value = futureValue(m_notional, m_exchangeRate);
    return value;

}

spotValue and futureValue are C functions that may or may not call COBOL or FORTRAN or similar.

We have ended up with some tests. Yay! Which use singletons. Boo!!
(Hope you like the comment being in red - as a warning rather than the odd convention of making them green in many IDEs).


No, but, HOW?

In order to test this, and armed with gmock we have something like mockSpotValue.h (namespaces and include guards left as an exercise for the reader for brevity)


#include <gmock>

class MockSpotValue 

{
    public:
    MOCK_CONST_METHOD(spotValue, double(double, double));


  void rest() 
  {
      Mock::VerifyAndClear(this);

  }
};

/**
 * Singleton
 * /
MockSpotValue & mockSpotValue();


Let's not point out this isn't a singleton. I'll leave the "mockSpotValue" instance create factory builder method as an exercise for the reader too. Making comments in red reminds me of being a teacher. It's the future. Or spot on. Depending on a boolean.

Now we use a linker seam to make our very own spotValue we can call in a test on a dev box.

double spotValue(double x, double y)
{
    mockSpotValue().(x,y);
}



And where's the test(s)?

Ah. Tests. Yes, having done this we should write some. Or maybe just one for brevity.

TEST_F(ValueTest, testGetSpotValueWithZeroNotional)
{
  MockSpotValue & valueApi = mockSpotValue();
  valueApi,reset();
  Asset asset;
  asset.makeSpot();//or something mad like that

  EXPECT_CALL(valueApi, spotValue(_, _)).WillOnce(Return(42));
  EXPECT_THAT(asset.Value(), DoubleEq(42.0));
}
I have simplified this. In order to get something like this Asset into a test we did some things with a sprout. This may require another blog post.

BUT that's a B(ad) U(nit) T(est)

How do we know this is a bad test? Because we have seen the singleton? Even without that the name smells. "testGetSpotValueWithZeroNotional" 
I have seen worse. I saw one call "testDefaultIsValid" which asserted that a thing constructed with defaults IS NOT valid. I digress.
So, testGetSpotValueWithZeroNotional. What are we testing? Can we make this test name clearly express what is tests?

The best I can come up with is testThatSpotAssetValueReturnsTheValueIToldTheMockToReturn or more simply
testThatGMockDoesWhatItIsSupposedToCosYouCannotTrustThesePeople

Help

I like that we are trying to get tests round legacy code. I just have a few qualms about how we are doing this. Please comment with suggestions on how to test this better.
I feel like banning mocks until we have written a few characterisation tests. At least there will be fewer singletons that way. Who ever heard of adding singletons in order to test code?




Eigenfaces FTW or the "Zebra/non-zebra decision boundary."

Frances Buontempo from BuontempoConsulting

Yesterday I attended the Karen Spärck Jones lecture at the BCS in London Dr Cordelia Schmid talked about computer vision, giving an overview of it's history through to the current state of the art. This is a tall order to get into an hour or so.

Let's see if I can summarise what she covered.

Still pictures and moving pictures need different techniques. For still pictures, we start with attempting to recognise objects or classes of objects. For moving pictures we might be spotting actions, as well as objects; maybe tuning a stringed instrument or celebrating a birthday. For still pictures, spotting a known chair in pictures is slightly easier than getting a program to spot any chair in pictures. How do you generalise the definition of chair anyway?

For the simpler case of a specific chair, or other object, you still need to deal with problems such as the different viewpoints, or different scales. The pixels of a bridge/chair/object close up will be completely different to the same bridge further away, or at a slightly different angle. Techniques started with edge detection, then moved on to projective invariants (and geometric and photometric invariants - light levels affect the pixel). I regarded this as akin to the difference between bitmaps and scalable vector graphics.

A milestone in the move away from edge detection to feature selection came with "Eigenfaces" - see Turk and Pentland. This uses principal component analysis.. In essence you find the line of best fit through the points, plotted in n-dimensional space, if you have n features. This is the first eigenvector. It's a vector, as it has direction. It's "eigen" as it is a peculiar, singular or *characteristic* - etymology slightly uncertain. If you project the data onto this, you will have lost lots of information. You then find a perpendicular line - the 2nd best fit line. And continue until you've captured enough information. This allows you to summarise datasets and is sometimes known as a feature reduction technique. Have you ever wonder how facebook recognises faces? Or how football programmes track how far a footballer has run? Actually the latter is more moving pictures, so I am ahead of myself.

These approaches look at the global scale - the whole picture. Next came local greyscale invariants, using a voting system to spot things. This can deal with photometric problems - varying light levels. Next we have SIFT - scale-invariant feature transform.

Mention was then made of wavelet filters and boosting feature selection, trained on positive and negative examples, such as pictures with a given object, say a car, and pictures without the object. The code is in OpenCV. I wonder if this is similar to AdaBoost.

Mention was then made of histograms of orientation - see Datal and Triggs. This is related to support vector machines, SVM, which finds a hyperplane between positive and negative examples. Some example still pictures were shown wherein this technique could be used to detect a, and I quote, "Zebra/non-zebra decision boundary." This may not seem like a day-to-day problem many of us face, but made the important point that you need training data near the boundary - for example other animals with stripes, and other things with a similar profile, like a motorbike. In a more general setting I was thinking about flushing out edge-cases in unit tests. The importance of a good set of representative training data was made - you need more than just edge cases, you want many cases away from the edges too. This also applies to automated testing. But I digress.

Finally we move on to the current state of the art - convolution neural networks (CNN), which I have not met before. The find "high-dimensional aggregated descriptors" - they have a huge number of nodes and several layers and require some serious computing power - GPU etc etc. As always there is a trade-off between speed and accuracy. I presume the hand-tuned network may be incomprehensible afterwards. I have worked on "feature extraction" from feed-forward neural networks before which represent a trained network as a decision tree so a human can understand what the program has discovered. I presume for CNNs this is neither possible nor desirable. It just needs to get the job done and find Wally^H^H^H^H^H zebras. I previous mentioned python to find Wally on El Reg. Aren't computers amazing?

Could a machine automatically tag things in a still picture? "Dog 1: Terror", "Man: John Smith". I wonder if we end up with CCTV automatically sending out Robocop to arrest people. Big brother is watching you and figuring out what you're doing.

This leads to the action recognition, mentioned at the start. Having got to a point where we tag things in a still picture, can we set the machines lose to do "weak supervised learning" - find an interesting thing in this video. We were shown examples of a programming picking out a bird or person etc moving in a video, Sometimes it worked, sometimes it didn't. Supervised learning involves giving training examples as input and getting the trained algo to find the same things in other inputs. For moving pictures describing the data - giving positive and negative examples would take hours. Would you go through frame by frame and label features? It would take far too long. Instead let it learn as it goes, setting it off with a few clues - here's a robin. Is there one in this movie? Or spot and label a moving thing - which happened to be a car moving very quickly so seeming to get much smaller - it didn't find that. It seems slow movements are easier to track than fast jerky ones. Though an algo did manage to draw a rectangle around a cat rolling about in another video. The two main techniques involved were dense trajectory features (Wang) and CNN features for optical flow (Simonyan). These made the front page in the last year or so.

A compelling throw away comment at the end was that hand-crafted models are NOT machine learning (ML).Most ML I have attempted before has left me to chose some parameters - how many iterations, how fast to move towards a solution, how many layers in my neural network. The machine has learnt nothing - it just did what it was told. True ML would let the machine find its own parameters. Of course, I have seen a few people trying to do this. It's all very exciting.

Somebody asked, "How come I don't see any of this in my day to day life?" I presume the usual - this is all so academic. Pay attention at the back, I say...

  • Have you ever been issued with an automatic speeding ticket? How did it find you?
  • Have you ever uploaded a picture to facebook and found little boxes around faces (and the odd random tree, but what do you expect?) 


My Dad once asked me how on earth the sports program he was watching could tell him how far a specific footballer had run in the course of a football match. This involves image recognition, including the optical flow - tracking an individual player over the course of a game, from various different angles, so captures many of the specific problems we mentioned above.  Unless they just use a pedometer.

Fascinating stuff. I wonder if the machines could spot things we haven't spotted. For example, speckles or shadows in medical scans or even x-ray machines at passport control/baggage checks, that people might miss. Or imagine facebook looked at your holiday snaps and sent you an advert for a clinic dealing with skin cancer, having spotted the stirrings of a carcinoma in your holiday tan. Would you want this?

Further extensions including pairing up audio information, so we can find youtube videos of tuning a guitar - made much easier if the spoken commentary says "tuning" and "guitar" as well as just having the pictures to go on. Combine this with smell and haptics and the machines will soon be writing their own drivel all over the internet. Welcome Skynet.


Prove it – factorial is bigger than 2^n

Frances Buontempo from BuontempoConsulting

I've been doing the scala coursera and wanted to write down the proof that

factorial(n) ≥ 2n when n ≥ 4

since it uses induction and I am out of practise.

Base case

For n = 4

factorial(4) = 4*3*2*1 = 24
and 24 = 2*2*2*2 = 16
and 24 ≥ 16

so factorial(n) ≥ 2n when n = 4

Induction step

For n >= 4 assume we have

factorial(n) >= 2n

and consider

factorial(n+1) = factorial(n) × (n+1)
 ≥ 2n × (n+1)
 ≥ 2n × 2 since (n+1) ≥ 2 when n ≥ 4
 = 2n+1

QED
(I also wanted to learn about writing maths in html)

Remote debugging python in Visual Studio

Frances Buontempo from BuontempoConsulting

Suppose you have a script you want to run on linux and you only know how to drive the Visual Studio debugger. By installing an add-in for Visual Studio locally, installing the python tools for Visual Studio debugging on the remote machine, e.g. with pip install ptvsd==2.0.0pr1 and adding a (minimum of) a couple of lines to your script you can debug in Visual Studio even if the remote machine is running linux.
The additional lines are highlighted in the following script:

#!/usr/bin/python

"""
You will need to insert both these in your script
The remote box requires the ptvsd package (otherwise the import fails)
"""
import ptvsd
ptvsd.enable_attach(secret = 'joshua')
#use None instead of joshua but that is not secure
#The secret can be any string - but this is not properly secure


def say_it(it):
  """
  This inserts a breakpoint
  but you can add new breakpoints in Visual Studio
  if required too/instead
  """
  ptvsd.break_into_debugger()
  print(it)

if __name__ == "__main__":
  #pause this script til we attach to it
  ptvsd.wait_for_attach()
  say_it("Hello world")


See https://pytools.codeplex.com/wikipage?title=Remote%20Debugging%20for%20Windows%2C%20Linux%20and%20OS%20X for more details and be wary of line ending in VS which may be inappropriate for linux.

Install the ptvs from the relevant msi for your version of Visual Studio.

Start the script on the linux box:
$python VSPyNoodle.py

It will hang, since it has a wait_for_attach call in main. 
ctrl-Z will stop it on the remote box if something goes wrong.

Select "Attach to process" in the Debug menu on Visual Studio
Change the "Transport" to "Python remote debugging (unsecured)"

Add the secret (joshua in this script) @ hostname to Qualifier
e.g. joshua@hostname

Hit "Refresh"

It should find the process running on the linux box and add the port it uses to the Qualifier
Select your process in the list box and hit "Attach"
  then debug as you are used to in VS.

If it complains about stack frames and not being able to see the code you may need to make a VS project from a local version of the code. having made sure it exactly matches the remote code.