Unmatched REST Resources – 400, 404 or 405?

Chris Oldwood from The OldWood Thing

There is always a tension in programming between creating something that is hard to misuse but at the same time adheres to standards to try and leverage the Principle of Least Surprise. One area I personally struggle with this conflict is how to communicate to a client (of the software kind) that they have made a request for something which doesn’t currently exist, and almost certainly will never exist.

As a general rule when someone requests a resource that doesn’t exist then you should return a 404 (Not Found). And this makes perfect sense when we’re in production and all the bugs have been ironed but during development when we’re still exploring the API it’s all too easy to make a silly mistake and not realise that it’s due to a bug in our code.

An Easy Mistake

Imagine you’re looking up all orders for a customer, you might design your API something like this:

GET /orders/customer/12345

For a starter you have the whole singular noun vs plural debate which means you’ll almost definitely try this by accident:

GET /order/customer/12345

or make the inverse mistake

GET /orders/customers/12345

By the standard HTTP rules you should return a 404 as the resource does not exist at that address. But does it actually help your fellow developers to stick to the letter of the law?

Frameworks

What makes this whole issue much thornier is that if you decide you want to do the right thing by your fellow programmers you will likely have to fight any web framework you’re using because they usually take the moral high ground and do what the standard says.

What then ensues is a fight between the developer and framework as they try their hardest to coerce the framework to send all unmatched routes through to a handler that can return their preferred non-404 choice.

A colleague who is also up for the good fight recently tried to convince the Nancy .Net framework to match the equivalent of “/.*” (the lowest weighted expression) only to find they had to define one route for each possible list of segments, i.e. “/.*”, “/.*/.*”, “/.*/.*/.*”, etc. [1].

Even then he still got some inconsistent behaviour. Frameworks also make it really easy to route based on value types which gives you a form of validation. For example if I know my customer ID is always an integer I could express my route like this:

/orders/customer/{integer}

That’s great for me but when someone using my API accidentally formats a URL wrong and puts the wrong type of value for the ID, say the customer’s name, they get a 404 because no route matches a non-integer ID. I think this is a validation error and should probably be a 400 (Bad Request) as it’s a client programmer bug, but the framework has caused it to surface in a way that’s no different to a completely invalid route.

Choice of Status Code

So, assuming we want to return something other than Not Found for what is clearly a mistake on the client’s part, what are our choices?

In the debates I’ve seen on this 400 (Bad Request) seems like a popular choice as the request, while perhaps not technically malformed, is often synonymous with “client screwed up”. I also like Phil Parker’s suggestion of using 405 (Method Not Allowed) because it feels like less abuse of the 4XX status codes and is also perhaps not as common as a 400 so shows up a bit more.

 

[1] According to this StackOverflow post it used to be possible, maybe our Google fu was letting us down.

Elastic stack – RTFM

Frances Buontempo from BuontempoConsulting

I tried to setup ELK (well, just elasticsearch and kibana initially), with a view to monitoring a network.

Having tried to read the documentation for an older version than I'd downloaded and furthermore one for *Nix when I'm using Windows, I eventually restarted at the "Learn" pages on https://www.elastic.co/

There are a lot of links in there, and it's easy to get lost, but it is very well written.

This is my executive summary of what I think I did.

First, download the zip of kibana and elasticsearch.

From the bin directory for elasticsearch, run elasticsearch.bat file, or run service install then service run. If you run the batch file it will spew logs to the console, as well as a log file (in the logs folder). You can tail the file if you choose to run it as a service. Either works.

If you then open http://localhost:9200/ in a suitable browser you should see something like this:

{
"name" : "Barbarus",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "bE-p5dLXQ_69o0FWQqsObw",
"version" : {
"number" : "2.4.1",
"build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
"build_timestamp" : "2016-09-27T18:57:55Z",
"build_snapshot" : false,
"lucene_version" : "5.5.2"
},
"tagline" : "You Know, for Search"
}
 
The name is a randomly assigned Marvel character. You can configure all of this, but don't need to just to get something up and running to explore. kibana will expect elasticsearch to be on port 9200, but again that is configurable. I am getting ahead of myself though.

Second, unzip kibana, and run the batch file kibana.bat in the bin directory. This will witter to itself. This starts a webserver, on port 5601 (again configurable, but this by default): so open http://localhost:5601 in your browser.

kibana wants an "index" (way to find data), so we need to get some into elasticsearch: the first page will say "Configure an index pattern". This blog has a good walk through of kibana (so do the official docs).

All of the official docs tell you to use curl to add (or CRUD) data in elasticsearch, for example
curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '
{
"name": "John Doe"
}'

NEVER try that from a Windows prompt, even if you have a curl library installed. You need to escape out the quote, and even then I had trouble. You can put the data (-d part) in a file instead and use @, but it's not worth it.
Python to the rescue. And Requests:HTTP for Humans
pip install requests
to the rescue.
Now I can run the instructions in Python instead of shouting at a cmd prompt.

import requests
r = requests.get('http://localhost:9200/_cat/health?v')

r.text


Simple. The text shows me the response. There is a status code property too. And other gooides. The the manual. For this simple get command you could just point your browser at localhost:9200/_cat/health?v


Don't worry if the status is yellow - this just means you only have omne node so it can't replicate in cause of disaster.

Notice the transport, http:// at the start. If you forget this, you'll get an error like
>>> r = requests.put('localhost:9200/customer/external/1?pretty', json={"name": "John Doe"})
...
    raise InvalidSchema("No connection adapters were found for '%s'" % url) requests.exceptions.InvalidSchema: No connection adapters were found for 'localhost:9200/customer/external/1?pretty'



Now we can put in some data.

First make an index (elastic might add this if you try to put data under a non-existent index). We will then be able to point kibana at that index - I mentioned kibana wanted an index earlier.
r = requests.put('http://localhost:9200/customer?pretty')


Right, now we want some data.
>>> payload = {'name': 'John Doe'}
>>> r = requests.post('http://localhost:9200/customer/external/1?pretty', json=payload)


If you point your browser at localhost:9200/customer/external/1?pretty you (should) then see the data you created. We gave it an id of 1, but it will be automatically assigned a unique id if we left that off.

We can use requests.delete to delete, and requests.post to update:
 >>> r = requests.post('http://localhost:9200/customer/external/1/_update', \
 json={ "doc" : {"name" : "Jane Doe"}})

Now, this small record set won't be much use to us. The docs have a link to some json data. I downloaded some ficticious account data. SO to the rescue for uploading the file:


>>> with open('accounts.json', 'rb') as payload:
...   headers = {'content-type': 'application/x-www-form-urlencoded'}
...   r = requests.post('http://localhost:9200/bank/account/_bulk?pretty', \ 

              data=payload,  verify=False, headers=headers)
...

>>> r = requests.get('http://localhost:9200/bank/_search?q=*&pretty')
>>> r.json()
This is equivalent to using

>>> r = requests.post('http://localhost:9200/bank/_search?pretty', \

      json={"query" : {"match_all": {}}})
 i.e. instead of q=* in the uri we have put it in the rest body.



Either way, you now have some data which you can point kibana at. In kibana, the discover tab allows you to view the data by clicking through fields. The visualise tab allows you to set up graphs. What wasn't immeditely apparent was once you have selected your buckets, fields and so forth, you need to press the green "play" button by the "options" to make it render your visualisation. And finally, I got a pie chart of the data.  I now need to point it at some real data.

 
 

 

Continuous Delivery

Jon Jagger from less code, more software

Is an excellent book by Jez Humble and Dave Farley. As usual I'm going to quote from a few pages...
Software delivers no value until it is in the hands of its users.
The pattern that is central to this book is the deployment pipeline.
It should not be possible to make manual changes to testing, staging, and production environments.
If releases are frequent, the delta between releases will be small. This significantly reduces the risk associated with releasing and makes it much easier to to roll back.
Branching should, in most circumstances, be avoided.
Dashboards should be ubiquitous, and certainly at least one should be present in each team room.
One of the key principles of the deployment pipeline is that it is a pull system.
A corollary of having every version of every file in version control is that it allows you to be aggressive about deleting things that you don't think you need... The ability to weed out old ideas and implementations frees the team to try new things and to improve the code.
It should always be cheaper to create a new environment than to repair an old one.
The goal of continuous integration is that the software is in a working state all the time... Continuous is a practice not a tool... Continuously is more often than you think.
The most important practice for continuous integration to work properly is frequent check-ins to trunk or mainline.
Ideally, the compile and test process that you run prior to check-in and on your CI server should take no more than a few minutes. We think that ten minutes is about the limit, five minutes is better, and about 90 seconds is ideal.
Enabling developers to run smoke tests against a working system on a developer machine prior to each check-in can make a huge difference to the quality of your application.
Build breakages are a normal and expected part of the process. Our aim is to find errors and eliminate them as quickly as possible, without expecting perfection and zero errors.
Having a comprehensive test suite is essential to continuous integration.
You should also consider refactoring as a cornerstone of effective software development.


Building Microservices

Jon Jagger from less code, more software

Is an excellent book by Sam Newman. As usual I'm going to quote from a few pages...
Because microservices are primarily modeled around business domains, they avoid the problems of traditional tiered architectures.
Microservices should cleanly align to bounded contexts.
Another reason to prefer the nested approach could be to chunk up your architecture to simplify testing.
With an event-based collaboration, we invert things. Instead of a client initiating requests asking for things to be done, it instead says this thing happened and expects other parties to know what to do. We never tell anyone else what to do.
We always want to maintain the ability to release microservices independenty of each other.
A red build means the last change possibly did not intergrate. You need to stop all further check-ins that aren't involved in fixing the build to get it passing again.
The approach I prefer is to have a single CI build per microservice, to allow us to quickly make and validate a change prior to deployment into production.
No changes are ever made to a running server.
Rather than using a package manager like debs or RPMs, all software is installed as independent Docker apps, each running in its own container.
Flaky tests are the enemy. When they fail, they don't tell us much... A test suite with flaky tests can become a victim of what Diane Vaughan calls the normalization of deviance - the idea that over time we can become so accustomed to things being wrong that we start to accept them as being normal and not a problem.
All too often, the approach of accepting multiple services being deployed together drifts into a situation where services become coupled.
Most organizations that I see spending time creating functional test suites often expend little or no effort at all on better monitoring or recovering from failure.