The AI Subsidy Is Ending. Do Your Tokens Still Generate Value?

For two years, enterprise AI adoption has rested on a convenient fiction: that generative AI could be bought like a SaaS subscription. Pay a flat monthly fee, hand out licenses, let adoption scale. This worked when AI tools were mostly assistants — drafting text, summarizing documents, suggesting code.

That phase is ending, but not cleanly, and the detail matters.

A partial shift in economics

GitHub’s announcement that Copilot will move to usage-based billing from June 1, 2026 is being read as a watershed moment for enterprise AI pricing. The headline hides an important distinction. Code completions and Next Edit suggestions remain flat-rate under the new model; it is Copilot Chat, agentic coding sessions, and code review that shift to token-based consumption. The flat-rate model is not dead; it is retreating from the heaviest, most compute-intensive interactions. Agentic workflows broke the economics, and Copilot’s pricing change is the first visible response.

There is a second layer worth understanding. Under GitHub’s previous Premium Request Unit model, heavy users were consuming between three and eight times the token value of their subscription cost, with GitHub absorbing the difference. The shift to usage-based billing ends that subsidy. The change was not simply about AI complexity outgrowing flat-rate packaging; it was about ending an unsustainable cross-subsidy.

Why AI is harder to forecast than cloud

Enterprise AI may have been packaged like SaaS, but economically it behaves more like cloud infrastructure. That comparison holds, with a qualification that is easy to understate.

Cloud costs are also notoriously hard to forecast. Auto-scaling, cross-region replication, and data egress charges create real volatility. What makes AI materially harder is not that costs vary with usage, but that AI usage is shaped by human intent and task complexity in ways that infrastructure demand is not. Two users with identical licenses can generate very different costs depending on what they ask the system to do and how the model executes it.

The deeper parallel is the agentic loop. Cloud architects learned to fear data egress: predictable in isolation, highly variable at scale. AI architects will learn to fear runaway context and recursive tool calls. A single tool call is cheap. An agent that reads a repository, generates a plan, writes code, runs tests, fails, revises, and retries is not. Forecasting tools built for infrastructure workloads are not designed to handle that cost structure.
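
To make the runaway-context point concrete, here is a minimal sketch under stated assumptions: each agent step re-sends the full accumulated context as input, every step adds a fixed number of tokens, and the per-token prices are placeholders rather than any vendor's actual rates. The function name and all default values are hypothetical.

```python
def agent_run_cost(steps,
                   base_context_tokens=2_000,      # assumed starting prompt size
                   tokens_added_per_step=1_500,    # assumed tool output appended each step
                   output_tokens_per_step=500,     # assumed model output each step
                   input_price_per_mtok=3.0,       # placeholder USD per million input tokens
                   output_price_per_mtok=15.0):    # placeholder USD per million output tokens
    """Estimate the dollar cost of an agent loop that re-sends its growing context."""
    total_input = 0
    total_output = 0
    context = base_context_tokens
    for _ in range(steps):
        total_input += context                   # the whole context is billed as input again
        total_output += output_tokens_per_step
        context += tokens_added_per_step + output_tokens_per_step
    billed = total_input * input_price_per_mtok + total_output * output_price_per_mtok
    return billed / 1_000_000


single_call = agent_run_cost(steps=1)
long_session = agent_run_cost(steps=20)
print(f"1 step: ${single_call:.4f}, 20 steps: ${long_session:.2f}")
```

Under these assumptions, twenty steps cost roughly a hundred times as much as one, not twenty times, because billed input grows with the square of the number of steps. Real agents differ (caching, context trimming, variable step sizes), so the shape of the curve matters more than the figures.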

The ROI hurdle is moving

This is where the standard productivity narrative breaks down — and where the economics become genuinely interesting.

A $19/user/month AI subscription is easy to justify. Against a developer with a fully loaded annual cost of $50,000 (approximately $4,167/month, a reasonable figure for mid-to-senior talent in lower-cost delivery markets such as India, Eastern Europe, or Southeast Asia), the AI tool represents 0.46% of monthly labour cost.

Agentic AI changes the equation. If usage-based consumption drives AI spend to $250, $600, or $1,000 per developer per month, the enterprise has to prove that consumption is creating proportionate value. This requires distinguishing between two things that are routinely conflated: productivity created and productivity captured.

A 20% productivity improvement on a $4,167/month developer generates $833 in theoretical value. But enterprises rarely capture productivity gains in full. Saved time is absorbed by coordination overhead, release bottlenecks, approval queues, and rework. If only 25% of that gain translates into measurable output (faster releases, reduced contractor dependency, higher throughput), realized value is $208/month. The table below maps this across plausible scenarios:

AI productivity uplift	Capture rate	Realized value / dev / month	Max AI cost for break-even
10%	25%	$104	$104
10%	50%	$208	$208
20%	25%	$208	$208
20%	50%	$417	$417
20%	75%	$625	$625
30%	50%	$625	$625
30%	75%	$938	$938

Break-even AI cost per developer, by productivity uplift and capture rate (assumes $4,167/month fully loaded developer cost).
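
The arithmetic behind the table above can be sketched in a few lines. The function and variable names are illustrative; the only inputs are the article's $4,167/month developer cost and the uplift and capture rates from each row.

```python
MONTHLY_DEV_COST = 4_167  # fully loaded, from the $50,000/year example

# (productivity uplift, capture rate) pairs, in the same order as the table rows
SCENARIOS = [(0.10, 0.25), (0.10, 0.50), (0.20, 0.25), (0.20, 0.50),
             (0.20, 0.75), (0.30, 0.50), (0.30, 0.75)]


def realized_value(monthly_cost, uplift, capture_rate):
    """Value actually captured per developer per month."""
    return monthly_cost * uplift * capture_rate


def break_even_table(monthly_cost=MONTHLY_DEV_COST):
    """Return (uplift, capture, rounded realized value) for every scenario."""
    return [(u, c, round(realized_value(monthly_cost, u, c))) for u, c in SCENARIOS]


for uplift, capture, value in break_even_table():
    # The break-even AI cost equals the realized value by definition.
    print(f"{uplift:.0%} uplift, {capture:.0%} capture: ${value}/dev/month")
```

Passing a different `monthly_cost` shows how quickly the break-even ceiling moves with the labour cost base.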

The implication is pointed. A $250/month AI cost may be defensible at reasonable uplift and capture rates: it breaks even once a 20% uplift is captured at 50%. At $600, only the strongest scenarios in the table clear the bar, and at $1,000 none of them do. The same tool that delivers clear ROI for a high-cost product engineer may struggle to justify itself for a $50,000/year delivery-centre developer, not because the productivity gain is smaller, but because the labour cost base is lower and the capture rate may be no higher.

This is the insight the generic AI productivity narrative obscures: AI ROI is not uniform. It depends on labour-cost context, workflow value, model selection, and the enterprise’s ability to convert saved time into saved cost or additional output. A 20% productivity gain does not automatically create 20% ROI.
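
A rough way to see why ROI is not uniform is to hold uplift, capture rate, and AI cost constant and vary only the labour cost. In the sketch below, the $15,000/month engineer is a hypothetical figure chosen for contrast and is not taken from the analysis above; the function name is likewise illustrative.

```python
def monthly_roi(monthly_labour_cost, uplift, capture_rate, ai_cost):
    """ROI as a fraction: (realized value - AI cost) / AI cost."""
    realized = monthly_labour_cost * uplift * capture_rate
    return (realized - ai_cost) / ai_cost


# Same 20% uplift, same 25% capture rate, same $250/month AI spend.
delivery_centre = monthly_roi(4_167, 0.20, 0.25, 250)   # the $50,000/year developer
high_cost = monthly_roi(15_000, 0.20, 0.25, 250)        # hypothetical high-cost engineer

print(f"Delivery-centre developer: {delivery_centre:.0%}")
print(f"High-cost engineer: {high_cost:.0%}")
```

With identical productivity gains, the first case loses money and the second returns a multiple of the spend, which is the sense in which a 20% gain does not translate into a fixed ROI.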

The discipline usage-based pricing demands — and the risk it creates

For enterprises, the shift toward consumption-based pricing creates a genuine tension that is rarely acknowledged.

On one hand, it drives necessary discipline: which workflows deserve premium AI spend, which models are sufficient, where consumption is producing value versus merely generating activity. A mature AI FinOps capability answers these questions in practice through team-level spend dashboards, model routing policies that direct frontier models to high-value workflows and cheaper models to routine tasks, and prompt-cost thresholds that flag runaway consumption before it compounds.
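
As an illustration only, a routing policy and a prompt-cost threshold might look like the sketch below. The tier names, prices, workflow labels, and the $2 threshold are all assumptions made for the example, not real vendor rates or a recommended policy.

```python
from dataclasses import dataclass

# Hypothetical model tiers and prices (USD per million tokens).
PRICE_PER_MTOK = {"frontier": 15.0, "standard": 1.0}
# Hypothetical list of workflows judged valuable enough for the frontier tier.
HIGH_VALUE_WORKFLOWS = {"agentic_refactor", "security_review"}


@dataclass
class Request:
    workflow: str
    estimated_tokens: int


def route(request):
    """Send high-value workflows to the frontier tier, everything else to the cheaper one."""
    return "frontier" if request.workflow in HIGH_VALUE_WORKFLOWS else "standard"


def estimated_cost(request):
    """Estimated dollar cost of the request on the tier it is routed to."""
    return request.estimated_tokens * PRICE_PER_MTOK[route(request)] / 1_000_000


def exceeds_threshold(request, max_cost_usd=2.0):
    """Flag a request whose estimated cost crosses the prompt-cost threshold."""
    return estimated_cost(request) > max_cost_usd


big_refactor = Request("agentic_refactor", 400_000)
print(route(big_refactor), estimated_cost(big_refactor), exceeds_threshold(big_refactor))
```

In practice the flag would more likely feed a dashboard or an approval step than block the request outright; that choice is a governance decision rather than a technical one.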

On the other hand, usage-based pricing can suppress experimentation. When developers know every heavy interaction has a visible cost, exploratory use declines. Teams avoid long context windows, skip edge-case testing, and pull back from the agentic workflows that generate the highest-value use cases. The pricing model improves efficiency, but at the risk of choking off exactly the exploration that discovers where AI creates real leverage. Enterprises should design their governance accordingly, separating managed production budgets from explicitly ring-fenced experimental allocations where consumption is expected and permitted.
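
One possible shape for that separation, sketched under assumptions: the experimental pool has a hard ceiling so exploration cannot spill into production spend, while the production pool is soft-capped and only flagged for review. The class, the limits, and the hard-versus-soft design are hypothetical choices made for illustration.

```python
class BudgetPool:
    """A monthly AI spend pool with either a hard or a soft limit."""

    def __init__(self, name, monthly_limit_usd, hard_cap):
        self.name = name
        self.monthly_limit_usd = monthly_limit_usd
        self.hard_cap = hard_cap
        self.spent_usd = 0.0

    def charge(self, amount_usd):
        """Record spend; refuse it (and record nothing) if a hard cap would be exceeded."""
        if self.hard_cap and self.spent_usd + amount_usd > self.monthly_limit_usd:
            return False
        self.spent_usd += amount_usd
        return True

    @property
    def over_budget(self):
        """True once a soft-capped pool has spent past its limit."""
        return self.spent_usd > self.monthly_limit_usd


experimental = BudgetPool("experimental", monthly_limit_usd=500, hard_cap=True)
production = BudgetPool("production", monthly_limit_usd=1_000, hard_cap=False)

experimental.charge(400)          # accepted: exploration inside the ring fence
print(experimental.charge(200))   # refused: would breach the ring-fenced ceiling
production.charge(1_200)          # accepted, but flagged for review
print(production.over_budget)
```

The point of the ring fence is that spend inside it is not scrutinised request by request, which keeps the visible cost of each interaction from discouraging exploratory use.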

What this means for providers and vendors

Pricing strategy is also a competitive choice, not just an economic one. A vendor that moves aggressively to usage-based billing opens the door for flat-rate holdouts to win buyers who value predictability. Pure usage-based pricing is unlikely to dominate; hybrid models — base subscriptions with included credits, tiered model access, and overage controls — are the more likely outcome. Vendors that make consumption visible and manageable without penalising adoption will be better positioned than those that simply pass costs through.

Open-source models add pressure from a different direction. The capability gap between leading open-weight models and proprietary ones has narrowed to near-parity, which structurally caps how high API pricing can go, regardless of what subsidies currently hold it down. That competitive pressure is real and growing. Self-hosted inference can reach cost parity for high-volume, steady workloads, but it is not a free alternative. Infrastructure, security, model governance, and the talent to run it are genuine costs. The better framing is that open source gives enterprises a credible outside option; it does not automatically reduce total spend.

Reliability becomes an economic feature. Under flat-rate pricing, users tolerate imperfect outputs. Under usage-based pricing, a hallucination is not just a quality issue: it triggers a correction loop that compounds cost. Vendors that improve output reliability reduce their customers’ effective cost of use. That is a sharper competitive differentiator than benchmark scores.
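
A simple way to quantify the correction loop, assuming each attempt costs the same and succeeds independently with a fixed probability: the expected number of attempts is one divided by the acceptance rate, so the effective cost per accepted output is the per-attempt cost divided by that rate. The function name and the $0.50 figure are hypothetical.

```python
def effective_cost_per_accepted_output(cost_per_attempt, acceptance_rate):
    """Expected spend to get one acceptable output when failed attempts are retried.

    Assumes independent attempts with a constant acceptance rate, so the
    expected number of attempts is 1 / acceptance_rate.
    """
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return cost_per_attempt / acceptance_rate


for rate in (0.9, 0.6):
    cost = effective_cost_per_accepted_output(0.50, rate)
    print(f"{rate:.0%} acceptance: ${cost:.2f} per accepted output")
```

This understates the real effect, since retries usually carry a longer context and a human has to notice the error first, but it shows why reliability acts as a price cut under usage-based billing.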

For service providers, the opportunity is nuanced. Higher AI product costs may make clients more cautious about paying higher service fees on top of more expensive tools. At the same time, the complexity of managing AI consumption creates new advisory and managed-services opportunities. Enterprises will need help with AI cost governance, model selection, usage policies, prompt and context optimisation, workflow redesign, value tracking, and AI FinOps.

What this means for enterprises

The next discipline for enterprise AI is not access management — it is value management.

Start with segmentation. A senior product engineer, a junior developer in a low-cost delivery centre, a support analyst, and a document-drafting business user all have different needs. Giving everyone the same model and the same consumption allowance is wasteful. Premium agentic capabilities should go to workflows where the value is large enough to justify them. Lighter models work for broader populations.

Build measurement that goes beyond activity. Tokens consumed and prompts sent show that people are using AI, not that it is working. ROI requires connecting AI spend to outcomes: shorter cycle times, less rework, higher throughput, faster releases, avoided cost.

Conclusion

The Copilot pricing change is a signal, not a cliff edge. But the direction is clear: as AI becomes more agentic, flat-rate packaging will keep retreating from the heaviest workloads. Usage-based pricing is the next stage, not the final one.

The standard aspiration is for pricing to eventually follow outcomes — enterprises pay for resolved tickets, reduced cycle time, faster releases, not just tokens consumed. That is a reasonable direction, but history should temper the confidence. Outcome-based pricing has been an aspiration in automation markets for two decades. It works where outcomes are clearly measurable and attributable to the provider; it struggles where results depend on how well the enterprise itself is run.

That points to a sharper irony. The enterprises best placed to capture AI productivity gains — those with disciplined engineering, clear bottlenecks, and strong governance — are also the ones best placed to measure and prove that value. The enterprises that most need AI to fix chaotic processes may find it hardest to justify the spend. The gap between potential value and captured value is where the real work of this next phase lies.

AI arrived in the enterprise with SaaS-like simplicity. It is now becoming a variable-cost operating layer. That does not weaken the case for AI; it strengthens the case for running it with the kind of financial discipline that cloud computing received only after the bills arrived.
