The Agent Runtime Wars began this week

Agent runtime is the new browser tier, and your site will be evaluated based on runtime, not a single model.

This is a shift that web professionals have not yet made. The conversation still revolves around models. Which model writes better? Which one quotes more accurately? Which API is cheaper this month? The model discussion is loud because new models come onto the market every few weeks and every release appears in the cinema.

The interesting story lies behind it. The foundation is being rebuilt. This week made it impossible to ignore.

The runtime stack was delivered in April

On April 15, Cloudflare released Project Think, a new agent SDK based on persistent execution with crash recovery and checkpointing, subagents running as isolated child agents, persistent sessions with tree-structured messages, and sandboxed code execution on dynamic workers. Within hours of the same day, OpenAI shipped the next evolution of its Agent SDK with native sandbox execution and model-native usage. Two of the largest infrastructure operators on the Internet provided competing answers to the same question, and the question was: How does a long-running AI agent actually run in production?

Then on April 16, Cloudflare added five more pieces. AI Platform: a vendor-agnostic inference layer that passes models to agents. AI Search: a vector index plus chunking pipeline that ships as a managed product specifically for agent retrieval and competes with Pinecone and Algolia in the agent-side RAG tier, rather than Google AI mode. Email service in public beta designed to allow agents to use the world’s most universal interface as a channel. PlanetScale Postgres and MySQL in Workers. And the technical basis for hosting very large open source LLMs like Kimi K2.5 directly on the Cloudflare network.

Sundar Pichai described the same shift a week earlier. On the April 7 Cheeky Pint podcast with Stripe co-founder John Collison, he referred to search itself as an “agent manager”: “Many of the purely information-seeking queries are going to be agentic in search. They’re going to be doing tasks. There’s going to be a lot of threads running.” “Many threads per query” is a runtime description of the search. Google’s CEO points to the same substrate that Cloudflare and OpenAI shipped this week.

If OpenClaw was the agent web for consumers (a playable demo, an interesting prototype, something hands-on), then this is the agent web for adults. Permanently. Sandboxed. Verifiable. The kind of infrastructure you would actually run a business on.

The pattern in all of this is one thing: the term. Not the model. Not the consumer chat app. Not the keynote slide. Runtime is the level at which agents spin up, persist for hours and days, and gain file system access, network access, and storage. Runtime is the layer that decides whether an agent’s session survives a crash, whether its subagents can be reasoned about, and whether its code execution is contained.

The wrong question and the new one

Web professionals have spent the last 18 months asking the wrong questions. The question was: Which AI model should we optimize for? ChatGPT or Claude or Gemini or Perplexity. Whose quotes are more important? Whose crawlers should we let through? This conversation made sense when the models read your website directly.

They don’t do that anymore. The model reads what the runtime passes to it. The runtime retrieved your page. The runtime analyzed it. The runtime executed (or didn’t execute) your JavaScript. The runtime has resolved your structured data. The authentication negotiated at runtime. When the model sees something from your website, it sees the runtime’s interpretation of it.

The new question, if you’re serious about this week, is what agent term will your website be readable for? Three things to test by next week:

Do your key endpoints return machine-readable structured responses or only render correctly within a full browser session?
Is your authentication comprehensive enough to allow an agent acting on behalf of a user to hold a session across multiple calls, or does it only support one-time human logins?
Does your structured data still mean the same thing when a runtime that didn’t execute your JavaScript tried to read it?

These are questions about runtime readability. The model has nothing to do with it. The runtime decides whether your response is even in the model’s context window, and the model chooses from whatever the runtime passes.

The Internet’s sanitary facilities are being rebuilt. Every model in the next two years will see your site through one of these terms, not directly. Your website’s job from now on is to be readable for runtime.

The model conversation will continue to take place on conference stages and in keynote slides. The runtime conversation takes place in infrastructure company product change logs. The companies that provide the runtime decide which websites are reached through AI search and AI commerce. Stop asking about the model. Ask about the duration.

Additional resources:

This post was originally published on No Hacks.

Featured image: Viktoriia_M/Shutterstock

WPAP (907)