latest / economics
← all posts
// economics · economics

Is a subscription the wrong business model for AI coding tools?

A flat monthly subscription is a bet: that the average user costs the provider less than they pay, and the rare heavy user is subsidized by the many light ones. That bet worked when the product was a human typing — a human has a natural ceiling on how much they can consume in a day. Agents broke the bet, because an agent's appetite for compute has no human-sized ceiling.

Agents changed what a "unit of work" is

A developer using autocomplete generates a bounded number of requests an hour. A developer running an agent kicks off a task and the agent loops — reading files, calling tools, retrying — burning thousands of tokens unattended, then they kick off three more in parallel. The "average user" the subscription was priced for no longer exists; the power user is now everyone, all day, and the tail that used to be rare is the whole curve.

When one user on a flat plan can burn hundreds of dollars of inference in an afternoon, the subscription isn't a price — it's an unhedged short position on that user's ambition.

You can watch the seams form

The market is already adjusting in plain sight:

  • Credits and "premium requests." GitHub Copilot bills agent and chat work in premium requests on top of the subscription — a metered layer grafted onto a flat plan. That's a subscription admitting it can't cover agentic usage.
  • Rate limits as soft pricing. "Unlimited" plans quietly cap with usage windows and slowdowns — metering wearing a flat-rate costume.
  • Tiered model access. The cheap plan gets the cheap model; the frontier model is where the real cost is, so it's gated. Pricing by which model is pricing by cost.

What survives

Flat-rate isn't dead, but pure flat-rate for agentic products is. The shapes that hold up:

  • Hybrid: base + metered overage. A subscription for the predictable floor, usage billing above it. Honest, and it aligns price with cost.
  • Usage-based / token-metered. What the underlying APIs already are. Transparent, scales with value, punishes waste — which is good, because waste is the cost levers you should be pulling anyway.
  • Bring-your-own-key. The tool is a harness; you pay the model provider directly. Decouples the tool's margin from your consumption entirely.
  • Local for the high-volume floor. The most radical hedge: run a local model for the private, high-volume 80% and pay the cloud only for the hard 20%. Zero marginal cost is the ultimate answer to metered pricing.

What it means for you

Stop treating tokens as free and start treating them as a cost center you manage:

  • Instrument cost per task, not just per month. You can't optimize what you don't measure — see agent observability.
  • Route by difficulty. Cheap model for the easy 80%, frontier for the hard 20%. The single biggest lever (benchmark is sorted by price for this reason).
  • Cache, compress, batch. The 99%-cost architecture is, read another way, your hedge against whatever pricing model your vendor lands on.

The provider's pricing problem is also your sourcing strategy. The teams that win the next two years aren't the ones on the cheapest plan — they're the ones who made their token consumption an engineered number instead of a surprise on the invoice.

Inspired by the AI-economics writing at tomaskubica.cz.

#economics#cost#pricing