Occasionally there are things that I read about on the web which happen to fit perfectly with some need I have at the time: and “Open Data Protocol”, or just oData, is one of them. I think I got hip to this by reading Miguel’s post on oData, but looking around it has been mentioned in a few other blogs I follow.

What is oData? Put simply, it’s a bit like being able to do SQL queries over the web – for non-technical people it’s deeply disinteresting, but what it effectively promotes is an ability for web-based services to open access to their databases in a pretty straightforward and standards-compliant method.

Now, there is some commentary that this is effectively trying to subvert another set of standards, RDF, OWL & SPARQL. I have to say up-front that I don’t see the comparison particularly: they’re similar in many ways, but also quite different, and I personally think they are more complimentary than competitive. However, the people who specified oData are Microsoft – so the “anti-competitive” label is one which sticks easily. It’s a lazy criticism in my opinion, but that’s up to the commentator.

More serious is the problem Miguel raises: while there are a number of free software oData consumer libraries available already, there are limited options for producing oData services. This is a major issue. As relatively light-weight as oData is, it’s still a pretty broad specification: your service needs to be able to produce both XML and JSON, and there are particular schemas and URL structures you have to follow. I’m tempted to start to write an oData producer for PHP, but it’s likely to be a lot of effort for not much immediate gain.

Another problem I’ve seen is that authentication and authorisation is basically not mentioned at all; the nearest we get is in section 8 of the overview:

This is a particular problem for me, because the immediate itch I have that this could scratch would need authentication. If one oData service uses HTTP basic auth, and another uses a cookie-based system, that’s an issue – it hinders interoperability. In a way, I understand why they did it – it’s a somewhat orthogonal issue, and once you start prescribing features like that there’s no logical reason to prescribe other HTTP features like if-modified-since, but it does seem to me to be a pretty key issue. Not all data wants to be public.

All that said, I’m planning on digging deeper into oData. It’s extremely interesting, and I think of large potential value in the future – and being honest, there is nothing else like it immediately available. The JSON format alone is of huge value, since it means that browsers can access all this data immediately. There’s just a fair amount of work for it to become useable…