Last week I gave a lightning talk, ostensibly on Kubernetes / docker / prometheus, to a motley crew of London CTOs. I rarely give talks these days, so of course I ran over massively. My teams and I have learned so much that, in retrospect, attempting to force a discussion like this into a lightning format feels trite.

There’s a key reason for this. In my talk, I called it the “Cambrian explosion period of Ops”: there are a multitude of tools available, more are being developed all the time, and there’s a huge amount of overlap.

I don’t want to try to predict when we’re going to hit “peak Ops tools”. Serverless is a way off from hitting most mainstream teams yet; it’s still a novelty right now, will grow very quickly around 2020 and will probably be dominant by 2025. So, we can’t be that far off the peak – and another sign of this is that we’re beginning to see real consolidation.

Kubernetes is a great example of this consolidation. As a project to recreate some Google internal infra, it wasn’t clear to me from the beginning that it would necessarily gain a lot of traction. OpenStack is an example of a project with similar intentions, but which is unlikely to be genuinely successful. K8s is much better delineated than OpenStack, simpler to set up and has a clearer proposition for cloud-based deployment.

What really demonstrates its success, though, is the number of people and projects now lining up behind it. Many PaaS projects – the likes of Deis, Rancher, et al. – are re-platforming on top of k8s. This requires significant work, and people expect to see significant benefit from it. Many vendors are participating in the k8s project, and we’re seeing good support for it outside of Google Cloud. Notably, Amazon Web Services isn’t really doing much with it – which I think is a clear signal to the market that they’re slightly scared by this.

Enter Istio

Today, a group of vendors announced the release of Istio – a system for linking microservices together. This is another area which has seen rampant innovation, from simple-but-clever systems like Consul to much grander affairs like Kong. Istio, which is based on Lyft’s Envoy, feels to me like one of the grander affairs.

Already, though, people are lining up behind Istio. Free software as a model works excellently for infrastructure projects, and strong communities can be set up around them – but only if created in the right way. Google are involved here again, along with IBM and Lyft, which is a good start, and others like Red Hat have also taken part – making it much more likely to receive widespread support.

Istio is made up of three pieces: Envoy, Manager and Mixer – a layer 7 proxy, a configuration manager for the proxies, and a policy and telemetry component, respectively (kinda).

Istio doesn’t yet support Auth, but that will be coming soon. Envoy is a strong L7 proxy, but right now Istio as a whole only supports Kubernetes (although it is intended to be platform-independent). Personally, I would play with this, but I wouldn’t deploy it yet – even though Envoy is rock-solid production infrastructure. What we have here is a really early sneak peek of what the project is doing – again a great sign of a project serious about creating a real community.

I’m particularly excited about the telemetry side of Istio. As well as controlling the flow of traffic, it can highlight the dependencies between services (which is crucial – services evolve over time, and you never end up with the architecture you expected). It also enables tracing through the architecture.
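To make the dependency point concrete, here is a minimal sketch of how a dependency map could be derived from observed traffic. It assumes each proxied request can be reduced to a (caller, callee) pair; the service names and the record format are made up for illustration and are not Istio's actual telemetry format.

```python
from collections import Counter, defaultdict

# Hypothetical observed calls, as (caller, callee) pairs a proxy might report.
observed_calls = [
    ("frontend", "cart"),
    ("frontend", "catalog"),
    ("cart", "catalog"),
    ("frontend", "cart"),
]


def build_dependency_map(calls):
    """Count calls per (caller, callee) edge and group callees by caller."""
    edge_counts = Counter(calls)
    deps = defaultdict(dict)
    for (caller, callee), count in edge_counts.items():
        deps[caller][callee] = count
    return {caller: dict(callees) for caller, callees in deps.items()}


dependency_map = build_dependency_map(observed_calls)
```

The useful property is that the map comes from what the services actually did, not from an architecture diagram, so a dependency nobody planned for still shows up as an edge.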

How well the tracing works remains to be seen. If you have any scatter-gather type microservice operations, or a significant amount of caching within the architecture, results may be incomplete or (worse) misleading. How tracing gets triggered in the first place is also an interesting problem to solve – obviously you can take the decision to trace everything and record lots of data; that has never worked well for me in the past though.
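One common alternative to tracing everything is to sample a fixed fraction of requests, deciding from the trace ID so that every service reaches the same verdict for the same request. The sketch below illustrates that general idea only; it is not how Istio or Envoy decide, and the function name and `sample_rate` parameter are my own assumptions.

```python
import hashlib


def should_trace(trace_id: str, sample_rate: float) -> bool:
    """Deterministically decide whether to record a trace.

    The trace ID is hashed into a 64-bit bucket and compared against a
    threshold scaled from sample_rate (0.0 traces nothing, 1.0 everything).
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # 0 <= bucket < 2**64
    threshold = int(sample_rate * 2**64)
    return bucket < threshold
```

Because the decision depends only on the ID, a request that fans out to several services is either traced everywhere or nowhere, which avoids half-recorded traces. It does nothing for the caching problem above: a response served from cache still produces a shorter trace than the work it stands in for.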

Future of Ops

Serverless is clearly the future, and eventually we will be able to disregard a lot of the deployment environment – just like how virtualisation allows us to disregard a lot of the detail of the hardware.

Until then, though, we do need something better in a microservices architecture. k8s is definitely something I would put money on as a good choice that will withstand what the future brings. Some form of flat, policy-driven software-defined networking is definitely needed too – and Istio looks like a very good start.

I also wonder if this is the beginning of the end of Docker. For various reasons, ever since its inception, I’ve had concerns about how it works. Since then, the scope of “docker” has grown immensely – it has grown swarm, and attempted to spin out Moby, as two examples. There is no clear delineation around it in the same way as k8s, and indeed it’s difficult to see how on its current path it could be successful at the same time as k8s. The container runtime is much more of a commodity now, and docker cannot occupy an infrastructure position as well as a higher-order orchestration role.

I believe we’re not done with the creation of new tools. The build process for containers is still too hard, and there’s a lot of best practice still waiting to be discovered. The standard docker registry is pretty awful, in all honesty. The main pain, though, is the crazy situation where teams are constantly having to build their whole environment from scratch. I’m very much looking forward to seeing solutions to that.