Most teams already let production shape what they do next, but we still think in these relatively linear cycles from idea to production. An incident changes a feature requirement. A cost spike forces a redesign. A support pattern triggers a new feature flag or rate limit. These issues get ticketed and added to the backlog, and the development team attempts to work through the growing list in priority order.

The software industry is still grappling with novel AI technology. There is a range of opinions on how best to incorporate these systems - from giving the AI narrow and specific roles within existing org structures, through to tearing the whole thing up and running a software factory with swarms of “agents” doing relatively unreviewed work. But even in the most radical approaches, this linear “we design something, make it, deploy it and then run it” thinking still dominates.

I have been thinking about a more deliberate model for this, which I call Aigre: an AI-governed runtime engineering approach. It is an aspirational framework that combines decades of learning about feedback-driven systems with current AI and cloud automation. The ambition is not to pretend that software can now redesign itself end to end without oversight. It is narrower than that. The useful question is whether we can organise software delivery so that runtime evidence, policy, and AI-assisted decision-making combine into a more adaptive engineering system.

Reimagining software as a living system

Software is still often treated as a static artefact: requirements go in, code comes out, a release happens, and then operations inherits the consequences. That framing has never really matched production reality, but it looks even less convincing in AI-heavy environments.

We software developers have also tended to think in terms of zero-to-one: how do we go from nothing to something? I think this is backwards. We spend the majority of the life of a software application running and maintaining it. So Aigre works from this scenario backwards: let’s get operations right first, and then work out the bootstrapping.

Aigre asks us to treat software more like a living system embedded in an environment. The running service senses demand, failures, latency changes, security events, dependency drift, and cost pressure. It then adapts through changes in configuration, routing, scaling, policy, and sometimes code. In that sense, runtime and environment are not an afterthought to development; they are part of development.

This is close to a cybernetic view of software delivery. Rather than a single lifecycle, we are dealing with multiple feedback loops, some fast and some slow. A deployment rollback can happen in seconds. A product correction may take weeks. Cost optimisation may emerge over quarters. The point is not (necessarily) that every loop should be fully automated. The point is that software should be engineered with those loops in mind from the start.

There is a useful analogy here. Aigre implies that software needs something like a nervous system and something like an immune system. The nervous system is continuous sensing: telemetry, user behaviour, error patterns, cost signals, and infrastructure state. The immune system is controlled response: rollback, failover, throttling, scaling, or containment when something abnormal happens. AI sits in that picture as a contributor to the wider system, not as a magical planner reasoning from first principles about everything.

The case for this model is fairly strong. The limit is equally strong: many organisations do not yet have the instrumentation, policy discipline, or rollback safety to do it well. So the opportunity is real, but the prerequisite work is not optional.

Key principles of Aigre

I think the approach rests on five practical principles.

Work from the runtime backwards. Priorities should be shaped by real usage, operational outcomes, and observed constraints, not just the original specification.
Build a nervous system. Teams need continuous sensing that captures what the software is experiencing in the real world.
Build an immune system. Systems need bounded ways to detect and respond to failures, threats, and anomalies without waiting for a full manual escalation.
Treat the lifecycle as continuous. There is no final “finished” state, only successive generations adapting to new conditions.
Work within the envelope. Development cost, runtime cost, reliability limits, and policy constraints are all resources the system has to respect.

What AI should actually do in this model

There is a temptation to jump from “AI can suggest a fix” to “AI should autonomously run the estate”. I don’t think that is the right conclusion today, although in time it may well become possible.

In a practical Aigre model, AI is most valuable in four places:

Signal interpretation

Modern systems emit too much data for people to parse quickly under pressure. AI can cluster anomalies, summarise behaviour shifts, correlate incidents with deployments, and separate weak signals from noise. But teams should not send all logs straight to an LLM: that is expensive, and cost rises as instrumentation grows. A layered model works better. Start with fast, cheap, deterministic checks for immediate reactions, then escalate only the harder cases to AI.
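To make the layering concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Signal` fields, the threshold values, the action names, and the `ask_model` callable (which stands in for whatever AI service a team actually uses). The point is only the shape: cheap rules run first, and the model is called only for signals that are elevated but unexplained.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Signal:
    service: str
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float
    recent_deploy: bool


# Hypothetical thresholds; real values would come from a team's own SLOs.
ERROR_RATE_LIMIT = 0.05
ERROR_RATE_WATCH = 0.01
LATENCY_LIMIT_MS = 2000.0


def deterministic_check(signal: Signal) -> Optional[str]:
    """Tier 1: cheap rules. Returns an action name, or None if no rule fires."""
    if signal.error_rate >= ERROR_RATE_LIMIT and signal.recent_deploy:
        return "rollback"
    if signal.p99_latency_ms >= LATENCY_LIMIT_MS:
        return "shed_load"
    return None


def looks_ambiguous(signal: Signal) -> bool:
    """Tier 2 gate: only elevated-but-unexplained signals justify model spend."""
    return signal.error_rate >= ERROR_RATE_WATCH


def triage(signal: Signal, ask_model: Callable[[Signal], str]) -> str:
    """Run the cheap checks first and escalate to the model only when needed."""
    action = deterministic_check(signal)
    if action is not None:
        return action
    if looks_ambiguous(signal):
        return ask_model(signal)
    return "no_action"
```

With this shape, model cost scales with the number of ambiguous cases rather than with the volume of telemetry, which is the property the layered approach is after.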

Option generation

Once a problem is recognised, teams still need candidate responses. AI can propose rollback plans, scaling changes, policy adjustments, circuit breaker thresholds, or code changes that fit the observed condition. Even here there are likely to be multiple layers: AI is currently a good “System 1” thinker (fast and intuitive), but a poor overall planner and reasoner.

Although I don’t specify whether Aigre keeps people on or outside the loop, it makes sense that at the highest levels it is people who consider the options and prioritise the work.

Constraint checking

This is underrated by engineers today. A good operating model needs more than ideas; it needs discipline. AI should be tested on whether it can check proposed actions against declared limits for risk, cost, compliance, and reliability. We already know that AI in closed finite-information systems can perform very well, and that while LLM performance can vary, we can often detect this with evals or other methods. Plain English specifications of software are not the way forward here: clear bright-line tests, inspired by test-driven development, are what intelligent systems need in order to make progress on work.
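As a rough illustration of what a bright-line test might look like, the sketch below expresses an envelope as data and checks a proposed action against it. The `Envelope` and `ProposedAction` types, their fields, and the example limits are all hypothetical; a real envelope would carry whatever risk, cost, compliance, and reliability limits a team has actually declared.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Envelope:
    max_extra_cost_per_hour: float
    max_blast_radius_pct: float   # share of traffic an action may touch
    require_reversible: bool


@dataclass(frozen=True)
class ProposedAction:
    name: str
    extra_cost_per_hour: float
    blast_radius_pct: float
    reversible: bool


def violations(action: ProposedAction, envelope: Envelope) -> List[str]:
    """Return every limit the action breaks; an empty list means it passes."""
    found = []
    if action.extra_cost_per_hour > envelope.max_extra_cost_per_hour:
        found.append("cost")
    if action.blast_radius_pct > envelope.max_blast_radius_pct:
        found.append("blast_radius")
    if envelope.require_reversible and not action.reversible:
        found.append("reversibility")
    return found
```

Because the check is a plain function with a yes-or-no outcome, the same limits can be used both to gate real actions and to evaluate whether an AI's own constraint checking agrees with them.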

Bounded execution

Some actions are safe enough to automate when they are reversible and well understood. Routing traffic away from an unhealthy dependency, tightening an autoscaling guardrail, or rolling back a clearly regressive release are plausible examples. Large architectural changes or speculative feature redesigns are not.

That distinction is important. Aigre should begin with containment and stabilisation, not creativity. If the system cannot reliably protect itself from obvious bad states, it is nowhere near ready for autonomous feature evolution.
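One way to picture bounded execution is an allowlist of actions that each carry their own reversal. The sketch below is an assumption-laden illustration, not a prescribed design: `BoundedAction`, `Executor`, and the `healthy_after` callback are names I have made up, and a real health check would look at live telemetry rather than a simple callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BoundedAction:
    apply: Callable[[], None]
    undo: Callable[[], None]   # every automated action must carry its own reversal


class Executor:
    def __init__(self, allowlist: Dict[str, BoundedAction]):
        self.allowlist = allowlist
        self.escalated: List[str] = []   # anything not allowlisted goes to people

    def run(self, name: str, healthy_after: Callable[[], bool]) -> str:
        action = self.allowlist.get(name)
        if action is None:
            # Not a known reversible action: never execute, hand it to a human.
            self.escalated.append(name)
            return "escalated"
        action.apply()
        if healthy_after():
            return "applied"
        # The action did not help, so restore the previous state.
        action.undo()
        return "reverted"
```

A routing change would sit in the allowlist with its inverse; a speculative feature redesign would not appear there at all, so it can only ever be escalated.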

The trade-off most teams will face

The attraction of Aigre is obvious: shorter feedback loops, faster response, lower operational drag, and better alignment between what software does and what software should do.

One risk is also obvious: a fast system can be wrong quickly.

This is why I think the central design problem is not raw model capability. It is governance in the engineering sense: who sets policy, how actions are bounded, what evidence is required before autonomy increases, and how reversibility is guaranteed. Without those controls, Aigre is nothing more than automated headless chickens.

There is also a cultural trade-off. Teams that adopt this approach will need to think less in terms of one-off delivery and more in terms of operating loops. Product, platform, security, and finance all become part of the same control system. That is powerful, but it can be uncomfortable because it exposes conflicting incentives much earlier.

There is also a budgetary implication that people often skip past. If runtime spend is a live engineering signal, then cost cannot sit in a separate reporting area. It becomes one of the conditions that shapes what the system is allowed to do. In other words, the envelope is not only technical; it is economic too.

We’ve seen this before. The move into the cloud was largely about shifting capex into opex, giving up some technical responsibility, and thinking harder about the economic implications of your IT environment. Aigre and cloud are natural bedfellows: if an Aigre-designed software application is similar to an organism, then the environment it inhabits is the cloud. The resources in these environments are effectively limitless, so we need engineering discipline to work out what resources and capacity should actually be used.

What success looks like

Traditional delivery metrics still matter, but they are not enough for an Aigre model. If the framework is working, I would expect teams to measure at least the following - inspired heavily by DORA metrics:

Adaptation lead time

How quickly does the system recognise a meaningful change and produce a safe response? That might be a rollback, a routing change, a scaling action, or a patch.

Automated resolution rate

What proportion of incidents or degradations can be resolved autonomously within defined guardrails? A high number is only useful if it does not come with higher collateral damage.
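Both adaptation lead time and automated resolution rate can be derived from ordinary incident records. The sketch below assumes a hypothetical `Incident` record with three fields; the field names and the idea of flagging collateral damage per incident are my own illustration, and the functions assume a non-empty list.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class Incident:
    seconds_to_safe_response: float   # from first observable change to safe response
    resolved_automatically: bool
    caused_collateral_damage: bool    # e.g. the response itself triggered a new incident


def median_adaptation_lead_time(incidents: List[Incident]) -> float:
    """Median seconds from a meaningful change to a safe response."""
    return median(i.seconds_to_safe_response for i in incidents)


def automated_resolution_rate(incidents: List[Incident]) -> float:
    """Share of incidents resolved without human intervention."""
    auto = [i for i in incidents if i.resolved_automatically]
    return len(auto) / len(incidents)


def clean_automated_resolution_rate(incidents: List[Incident]) -> float:
    """Share resolved automatically without causing collateral damage."""
    clean = [
        i for i in incidents
        if i.resolved_automatically and not i.caused_collateral_damage
    ]
    return len(clean) / len(incidents)
```

Reporting the plain rate and the clean rate side by side makes the caveat visible: a widening gap between the two means automation is resolving more incidents but doing more damage along the way.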

Resilience and recovery

MTTR, error budget stability, and recovery consistency all matter here. A system that recovers quickly and predictably is learning to regulate itself more effectively.

Cost per function or transaction

Monthly cloud cost is too blunt. The better question is whether the cost of serving a user journey, workflow, or transaction is stable or improving as the system adapts.
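A small worked example shows why the unit cost tells a different story from the bill. The figures and function names here are invented purely for illustration.

```python
from typing import List, Tuple


def cost_per_transaction(spend: float, transactions: int) -> float:
    """Unit cost for one period; spend and transactions must cover the same period."""
    if transactions <= 0:
        raise ValueError("transactions must be positive")
    return spend / transactions


def unit_cost_trend(periods: List[Tuple[float, int]]) -> List[float]:
    """Unit cost for each (spend, transactions) period, in order."""
    return [cost_per_transaction(spend, count) for spend, count in periods]


# Made-up figures: the monthly bill rises by half while traffic doubles.
trend = unit_cost_trend([(1000.0, 100_000), (1500.0, 200_000)])
```

In this made-up case the monthly bill goes up from 1,000 to 1,500, yet the cost of serving each transaction falls from 0.01 to 0.0075. A dashboard showing only the total would read as a regression; the unit cost shows the system getting cheaper to run per unit of work.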

Feature responsiveness

How much of the roadmap is actually being shaped by sensed user need, operational evidence, or repeated failure patterns? In Aigre, more change should originate from observed conditions rather than only from prior planning.

What changes?

Compared with DevOps, Aigre keeps the same emphasis on telemetry, deployment discipline, and fast feedback. The difference is that Aigre treats production signals as direct input to bounded automated decisions, not only as input to human operations work.

Compared with XP, Aigre shares the focus on short feedback loops, testing, refactoring, and continuous improvement. The difference is scope: XP mainly optimises how teams build code, while Aigre extends those loops into runtime control, resilience, and cost behaviour.

Compared with Scrum and broader Agile practice, Aigre still relies on iterative delivery and reprioritisation. The difference is cadence: Aigre is less centred on sprint ceremony and more centred on continuous control loops that react to observed system behaviour.

The strength of Aigre is that it can make a well-instrumented system more adaptive, safer to run, and more cost-aware. The weakness is that it raises the bar on observability, policy design, and governance; in weakly managed organisations, it can automate the same process flaws faster.

Unlike some of the methodologies I’ve just mentioned, Aigre doesn’t demand a specific practice or approach. While the functioning cybernetics are essential - you need a system that senses and can respond to its environment - how you interpret the signals/feedback could vary wildly. Find a security issue and decide not to fix it? That’s a choice! Set an overall budget of $100/month and turn off expensive features if that’s likely to be hit? You can do that too! Run the user feedback loops once a year? No problem.
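The budget example is simple enough to sketch. This assumes a naive linear projection of spend and a hypothetical table of per-feature daily costs; both the projection method and the feature names are illustrative choices, not part of Aigre itself.

```python
from typing import Dict, List


def projected_month_spend(spend_to_date: float, day_of_month: int, days_in_month: int) -> float:
    """Naive linear projection: assumes the daily run rate so far continues."""
    return spend_to_date / day_of_month * days_in_month


def features_to_disable(
    spend_to_date: float,
    day_of_month: int,
    days_in_month: int,
    budget: float,
    feature_daily_cost: Dict[str, float],
) -> List[str]:
    """Switch off the most expensive features first until the projection fits."""
    remaining_days = days_in_month - day_of_month
    projected = projected_month_spend(spend_to_date, day_of_month, days_in_month)
    disabled: List[str] = []
    by_cost = sorted(feature_daily_cost.items(), key=lambda kv: kv[1], reverse=True)
    for name, daily_cost in by_cost:
        if projected <= budget:
            break
        # Disabling a feature saves its daily cost for the rest of the month.
        projected -= daily_cost * remaining_days
        disabled.append(name)
    return disabled
```

For instance, with 60 spent by day 15 of a 30-day month, the projection is 120 against a budget of 100. Turning off a feature costing 1.5 a day brings the projection to 97.5, so nothing else needs to be disabled. A different team could feed the same signal into a completely different policy, which is the point.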

I think it’s likely that, as with all things, people will just pick and choose the elements of this that resonate and align them to their existing practices. And, as I said earlier, most of this is not realistically possible today. However, I think this is an interesting direction that I’d like to try…

Aigre: the AI-Governed Runtime Engineering Approach