The CTO’s guide to containers/serverless changes – re:Invent 2017

Gosh, I wish I was at re:Invent. Personally, I don’t like the States much (the place is great; getting through the airports is an exercise in frustration) and while I’ve never been to Las Vegas there isn’t much that ordinarily attracts me to the place. But, to have so many incredible people in one place – amazing.

For those who, like me, are not there, what do the announcements today mean? I’m not going to focus on the tech so much, but more on the additional options and architectures that are becoming available. Let’s look at a strategic level.

The main announcements I want to talk about are:

Amazon Kubernetes (EKS) and Containers (Fargate) – new ways to orchestrate and run containers that you’ve built
Serverless SQL (Aurora serverless) – that nice MySQL flavour, but in a “not always on” version
DynamoDB “Global Tables” – the same old DynamoDB, but distributed automatically around the world
Lambda@Edge – Lambda, but nearer the customer. This was actually a pre-announcement, but it’s interesting for this discussion.

The overall theme here from a strategy point of view is a further unpacking of resources. We know what applications need: compute (the ability to run code), memory (the ability to keep data alongside the running code), storage (the ability to save data), and transit (the ability to communicate). In other words, CPU, RAM, disk and network.

We’re constrained by our resources

For a long time, network has been easily unpacked. We’ve paid for traffic in and out of our networks, and it’s easily metered and planned for. Disk has been somewhat unpacked: we can flex storage somewhat easily, it’s possible to add storage in retrospect, and often we can access storage over the network.

CPU and RAM have been a bit more difficult. They need to be co-located, it’s not straightforward to scale them in the same way storage does, and because of the way software works the two are linked. I can use RAM without using CPU in principle; but it’s difficult to do the inverse.

Also, all of these resources have a concept of location. Depending where our resources are, we may be able to scale easily or not. Having RAM free in another DC may not help me scale an app.

AWS continues to remove constraints

Looking at the products I just mentioned through this lens:

EKS & Fargate – making CPU/RAM more flexible/scalable
Neptune & Aurora – decoupling CPU/RAM from storage
DynamoDB & Lambda@Edge – decoupling apps from locality

This has a direct impact on how we design and plan applications. With fewer constraints we can potentially open up new ways of designing / developing applications, and there are some obvious themes emerging.

The Lambda pricing half-lie

I did note during the conference that they say of Lambda, “You do not pay for idle”. This really isn’t true: you don’t pay for idle time while there is no active request, but you absolutely do pay for any idle time while a request is being handled.

This matters, because it is the key to how Lambda pricing works: you are billed for the wall-clock duration of each invocation. If you have two simultaneous requests, you pay for both. If your code is blocking on I/O, you’re paying for that. Lambda is great if your application is rarely accessed or is compute-heavy: you pay a small premium on per-second compute but save hugely on scaling and idle time. If you have a generally busy app that is I/O heavy, though, Lambda stops making so much sense.
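To make the point concrete, here is a rough sketch of the arithmetic. The prices and the 100 ms rounding are my assumptions based on the published list prices at the time of writing, and the function names are purely illustrative; check the current pricing page before relying on any of it.

```python
import math

# Assumed list prices (late 2017); treat these as placeholders.
PRICE_PER_GB_SECOND = 0.00001667
PRICE_PER_REQUEST = 0.20 / 1_000_000
BILLING_INCREMENT_MS = 100  # assumed: durations round up to the next 100 ms


def billed_ms(wall_ms: float) -> int:
    """Round a wall-clock duration up to the next billing increment."""
    return math.ceil(wall_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS


def invocation_cost(wall_ms: float, memory_mb: int) -> float:
    """Cost of one invocation: charged on wall-clock time, not CPU time."""
    gb_seconds = (memory_mb / 1024) * (billed_ms(wall_ms) / 1000)
    return gb_seconds * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST


def idle_fraction(cpu_ms: float, io_wait_ms: float) -> float:
    """Share of the billed duration spent waiting rather than computing."""
    return 1 - cpu_ms / billed_ms(cpu_ms + io_wait_ms)
```

A request that does 20 ms of real work and then waits 380 ms on a database is billed for 400 ms, so `idle_fraction(20, 380)` comes out at 0.95: 95% of what you pay for is idle.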

People will often say “ah, but serverless is the future!” – this isn’t quite true. What is important is the unpacking of the consumption of resources, and the ability to switch between them as is best for the organization.

I often compare this to catering. If I’m dining out with a small number of people on a special occasion, I may well want to spend on demand for the a-la-carte options. If I’m feeding a large number, I want predictability about my spend, maybe some kind of all-you-can-eat deal or a flat rate.

Are you really ready to scale?

Lambda already made it possible to deploy apps that start up from cold very quickly. They need to be designed to take advantage of this – it’s 12-factor on steroids – but there is a clear benefit to doing this. Further offering CPU/RAM options for both managed containers and more flexible Lambda means that we can worry much less about pricing: we can start small with Lambda and pay on demand, switching to a more predictable rate when the need becomes clear.
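The "switch when the need becomes clear" decision is really a break-even calculation. A minimal sketch follows, assuming you already know your average on-demand cost per request and the flat monthly price of the alternative; both figures in the usage note are hypothetical, not real quotes.

```python
def break_even_requests(flat_monthly_cost: float, cost_per_request: float) -> float:
    """Monthly request volume at which on-demand and flat-rate cost the same."""
    if cost_per_request <= 0:
        raise ValueError("cost_per_request must be positive")
    return flat_monthly_cost / cost_per_request


def cheaper_option(requests_per_month: int,
                   flat_monthly_cost: float,
                   cost_per_request: float) -> str:
    """Return 'on-demand' or 'flat-rate', whichever is cheaper at this volume."""
    on_demand = requests_per_month * cost_per_request
    return "on-demand" if on_demand < flat_monthly_cost else "flat-rate"
```

With a hypothetical $40 per month container and $4 per million requests on demand, the break-even sits at roughly ten million requests a month. Below that, pay a-la-carte; above it, take the all-you-can-eat deal.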

I think what’s interesting here is that most people over-estimate the degree of scaling their application will require. Of course, there is no compensating for badly-designed applications, but a web application that mainly hosts static content (for example) will scale to many millions of monthly users from a trivial container.

I like to think this is a form of infrastructural Pokemon architecture. You can start very small, and very simply, and know that you can scale an application to significant traffic doing very little.

In 2018, then, it makes sense to start out new applications with Lambda. Note that this doesn’t necessarily mean microservices – in fact, you should actively avoid this until much later. You can fit complete applications into small numbers of Lambda functions.
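As an illustration of a whole (small) application living in one function, here is a minimal sketch. It assumes the function sits behind API Gateway using the Lambda proxy integration, which supplies `httpMethod` and `path` in the event; the routes and handler names are made up for the example.

```python
import json


def health(event):
    # Hypothetical endpoint: a trivial liveness check.
    return 200, {"status": "ok"}


def list_items(event):
    # Hypothetical endpoint: a real app would query its storage here.
    return 200, {"items": ["example"]}


# One routing table, one deployable function.
ROUTES = {
    ("GET", "/health"): health,
    ("GET", "/items"): list_items,
}


def handler(event, context):
    """Entry point: dispatch on method and path, return a proxy-style response."""
    route = ROUTES.get((event.get("httpMethod"), event.get("path")))
    if route is None:
        status, payload = 404, {"error": "not found"}
    else:
        status, payload = route(event)
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }
```

Splitting this into separate functions later is straightforward, because each route is already an independent callable. There is no need to pay the microservices tax up front.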

For re-using existing applications, it also makes sense to see if you can redeploy them into Lambda. This won’t often be the case, but it should be investigated, such is the power of this platform.

By 2020, I think it is unlikely that any business running in-house software of some kind will not be running a significant proportion of it on some FaaS-type platform.

Scaling horizontally

Traditionally, most people have scaled out for reasons other than traffic / load. The usual one is resilience: by scaling out hardware, for example, we make individual failures more likely (there are more parts to break) but give the system the ability to survive them. Having a cluster of three database servers, for example, is not usually done for load, but because the loss of the database service would be too painful.

On some occasions, people have also had to scale out a system in order to be closer to their customers. This is a much less frequent example, but for SaaS-type enterprises it’s not unusual to have a highly global customer base. If your contracts say “we may have downtime out of hours”, then it becomes an obvious play to put customers on hardware that is local to them so that downtime in one customer’s out-of-hours is not downtime in another customer’s business hours.

Lambda@Edge and DynamoDB Global Tables are real game-changers here, and I think they go hand-in-glove. Both software code and intelligent storage can now sit close to the customer, together, and give excellent performance.

What does this mean for architecture? It’s sadly not a free pass on scaling: this system only works well if your app is happy with eventual consistency. Most apps will work happily in this mode; however, very few developers (in my experience) genuinely understand the race conditions here. Strong consistency is one strength that Aurora gives, and one that I think serverless developers don’t often appreciate, but providing it at global scale is technically very difficult.
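The classic race is the lost update. The toy simulation below is entirely in-memory and does not call any AWS API; it assumes replicas reconcile conflicting writes with a last-writer-wins rule, and the class and region names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: int
    written_at: int  # logical timestamp of the write


class Replica:
    """One regional copy of a single counter item."""

    def __init__(self, name: str):
        self.name = name
        self.item = Versioned(value=0, written_at=0)

    def increment(self, now: int) -> None:
        # Read-modify-write against the local copy only.
        self.item = Versioned(self.item.value + 1, now)

    def receive(self, remote: Versioned) -> None:
        # Assumed reconciliation rule: the write with the later timestamp wins.
        if remote.written_at > self.item.written_at:
            self.item = remote


def simulate():
    eu, us = Replica("eu"), Replica("us")
    eu.increment(now=1)
    us.increment(now=2)  # happens before the EU write has replicated
    eu_write, us_write = eu.item, us.item
    us.receive(eu_write)  # older write, discarded
    eu.receive(us_write)  # newer write, replaces the EU value
    return eu.item.value, us.item.value
```

Both regions converge, as promised, but on a counter value of 1 after two increments. Nothing failed and nothing was reported; one update simply vanished. Designs that survive this tend to write facts that cannot conflict (an event per increment, say) rather than overwriting shared state.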

Users tend to conveniently cluster themselves in specific geographic locations, however, so for many developers the approach needed to make this work is likely to be accessible enough.

Again, for 2018, we should be architecting applications with this mode in mind. As I’ve said before, I don’t believe people should scale things up before they need to – but as a design pattern, consuming eventually-consistent storage is clearly a positive for building applications.

Moving further into the future, the death of the three-tier architecture is pretty clear. At re:Invent they said that an RDBMS is no longer the go-to choice: I strongly disagree with this for a variety of reasons, but it’s absolutely true that having a single DB as your source of truth for an entire application is an anti-pattern at this point.

What’s coming

It’s really clear to me that with these new facilities available, it’s increasingly possible to replace significant applications with a new breed of decoupled software. Microservices will definitely be important, but I think we will see more patterns of co-operative application design that take into account on-demand capability.

It’s not totally clear to me that business as a whole is ready or willing to move to an entirely demand-oriented payment model – for one thing, the ability to account for things on a capital rather than operational basis has afforded them significant flexibility in the past.

In a sense, though, the money side of it is becoming increasingly irrelevant, as the cost of actually running the software and storing the data becomes less and less significant.

The majority of line-of-business applications now, I think, could run within an AWS free tier architecture quite happily. Now is absolutely the right time to begin to plan and develop for this future.
