Tokens or humans? The new corporate trade-off
By Maksym Misichenko · CNBC
What AI agents think about this news
The panel agrees that enterprise AI budgets are burning out quickly due to high costs of frontier models, which could lead to a reassessment of AI's value and potential margin compression for tech companies. However, they disagree on whether this is an existential threat or a transitional hurdle.
Risk: Faster multiple compression for tech companies due to constrained compute demand and budget exhaustion (Grok, Gemini)
Opportunity: Improved routing and competition driving down costs and unlocking new business models (ChatGPT)
This analysis is generated by the StockScreener pipeline: four leading LLMs (Claude, GPT, Gemini, Grok) receive identical prompts with built-in anti-hallucination guards.
Artificial intelligence is turning out to be far more expensive than anyone expected, and CFOs at major U.S. companies are now facing a brutal new trade-off: tokens or humans.
That was the picture two enterprise AI CEOs at the center of the buildout described to CNBC this week. Their accounts of what's happening inside the Fortune 500 show how sharply rising costs threaten the AI trade. It's a risk the market hasn't yet recognized as it hits record highs and mints new trillion-dollar companies like Micron.
The number one topic for every enterprise right now is overblown AI budgets, Arvind Jain, CEO of enterprise AI company Glean, told CNBC.
"Companies are telling us that their AI budgets are getting exhausted in one month or two months, and these are annual budgets," he said.
That's because the cost of AI hasn't come down the way buyers expected. Rather, it's gone up. Each new model release from the frontier labs is roughly twice as expensive per token as the one it replaced, putting enterprise AI on what Jain called "an unsustainable path right now."
"This is the first time ever that I can remember that technology costs the same as people, and you're making that comparison: choose tech or people," he said. "We've never had that conversation historically, because tech is a fraction of the overall cost of any operating business."
That growing AI budget, he said, is increasingly coming at the expense of future headcount growth.
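To put the budget-exhaustion figure in perspective, here is a minimal arithmetic sketch. It assumes, purely for illustration, that spending continues at a constant monthly rate after the budget runs out; the function name is hypothetical and not taken from either company.

```python
def implied_annual_overrun(months_to_exhaust: float) -> float:
    """Multiple of the annual budget spent over 12 months at a constant burn rate."""
    if months_to_exhaust <= 0:
        raise ValueError("months_to_exhaust must be positive")
    return 12.0 / months_to_exhaust


# The one- and two-month figures come from Jain's quote; the constant
# burn rate is an assumption made only for this illustration.
for months in (1, 2):
    multiple = implied_annual_overrun(months)
    print(f"Budget gone in {months} month(s): about {multiple:.0f}x the annual budget")
```

Under that assumption, a budget that lasts one to two months implies yearly spending of roughly six to twelve times the planned amount.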
Matan Grinberg, CEO of Factory AI, which routes engineering work across every frontier AI model, described the shift as a defined resource allocation problem now playing out inside leadership teams.
"Companies say, hey, if we could optimize one thing, is it the number of employees that we have, or is it the AI spend per employee?" Grinberg said.
Grinberg said companies have moved through three distinct phases in roughly a year. The first involved boards demanding their CEOs do something about AI. Then came so-called tokenmaxxing, or using AI by any means necessary regardless of cost. In the third phase, leadership teams are reassessing their needs when it comes to premium models.
"Do we need to be using Opus-level intelligence for every single task?" Grinberg said. "You just don't need to."
## Paying more than it pays back
The root of the squeeze is that the technology works but doesn't yet pay for itself.
"The way AI works today, it's very powerful, but it's very inefficient," Jain said. "The value that AI drives at this point is trailing the cost that businesses are incurring."
A big part of the problem is inefficiency in picking models. Roughly 95% of enterprise AI usage is still running on the most expensive frontier models, even for tasks that could be handled by cheaper alternatives, Jain said.
There's a simple fix: routing the easy work to the cheaper tier. Jain said that's the lowest-hanging fruit.
"You have a 10x savings that you can actually achieve with the right model routing at the front," he said.
That's also the pitch behind Factory AI, which automatically sends each task to the model best suited to it. The trick, Grinberg said, is recognizing how rarely a job actually needs the top of the line. He likened the gap between the newest frontier models to the gap between two veteran academics.
"Opus 4.7 versus Opus 4.8 is like the difference between a professor who's been a professor for 13 years versus 15 years," Grinberg said. "To a lay person, it's really, really hard to tell the difference."
The entire AI trade rests on the bet that historic demand will remain, with buyers largely indifferent to cost. But the view from inside the Fortune 500 suggests demand may be far more price-sensitive than the trade assumes.
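The routing savings Jain describes can be sanity-checked with simple blended-cost arithmetic. The sketch below is a rough illustration, not Glean's or Factory AI's method. Only the 95% starting share comes from Jain's estimate; the function names, the 5% post-routing frontier share, and the assumption that a cheaper model costs one-twentieth as much per unit of work are hypothetical.

```python
def blended_cost(frontier_share: float, cheap_price_ratio: float) -> float:
    """Relative cost per unit of work; 1.0 means everything runs on the frontier model."""
    if not 0.0 <= frontier_share <= 1.0:
        raise ValueError("frontier_share must be between 0 and 1")
    if not 0.0 < cheap_price_ratio <= 1.0:
        raise ValueError("cheap_price_ratio must be in (0, 1]")
    return frontier_share + (1.0 - frontier_share) * cheap_price_ratio


def savings_multiple(share_before: float, share_after: float,
                     cheap_price_ratio: float) -> float:
    """How many times cheaper the bill gets when the frontier share drops."""
    before = blended_cost(share_before, cheap_price_ratio)
    after = blended_cost(share_after, cheap_price_ratio)
    return before / after


SHARE_BEFORE = 0.95       # Jain's estimate of usage on frontier models
SHARE_AFTER = 0.05        # hypothetical: only 5% of work truly needs the top tier
CHEAP_PRICE_RATIO = 0.05  # hypothetical: cheap tier costs 1/20th of the frontier tier

multiple = savings_multiple(SHARE_BEFORE, SHARE_AFTER, CHEAP_PRICE_RATIO)
print(f"Savings multiple: {multiple:.1f}x")
```

With these illustrative inputs the saving comes out just under 10x. The result is sensitive to both assumptions: a smaller price gap between tiers, or a larger share of work that genuinely needs a frontier model, shrinks it quickly.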
Four leading AI models discuss this article
"Budget exhaustion and explicit token-versus-human trade-offs signal that AI demand is more price-elastic than current market pricing assumes."
The article surfaces a genuine friction point: enterprise AI budgets are burning out in 1-2 months because each frontier model roughly doubles token costs, pushing CFOs into explicit headcount versus spend trade-offs. This price sensitivity is new and under-appreciated by markets pricing in perpetual demand growth. Yet the piece glosses over the speed at which model routing can deliver the cited 10x savings, potentially resetting budgets without cutting usage. If optimization layers scale faster than model price hikes, the revenue trajectory for premium providers could flatten sooner than the current trillion-dollar valuations embed.
Even aggressive routing may fail to offset repeated 2x cost jumps per model generation, leaving net spend still rising and forcing outright cuts to AI initiatives rather than reallocation.
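That counterpoint can be made concrete with a small sketch. It assumes, hypothetically, that routing delivers a one-time 10x saving and that the whole bill then scales with the roughly 2x per-generation price increase the article cites. The function name and the compounding model are illustrative and are not drawn from the panel.

```python
def generations_until_offset(savings_multiple: float,
                             price_growth: float = 2.0) -> int:
    """Smallest number of model generations after which compounding price
    increases match or exceed a one-time savings multiple."""
    if price_growth <= 1.0:
        raise ValueError("price_growth must exceed 1 for costs to compound")
    generations = 0
    cumulative_increase = 1.0
    while cumulative_increase < savings_multiple:
        cumulative_increase *= price_growth
        generations += 1
    return generations


# 10x is the routing saving quoted in the article; 2x per generation is the
# article's rough figure for frontier price increases.
print(generations_until_offset(10.0))
```

Under these assumptions the saving is erased by the fourth generation, since three doublings give 8x and four give 16x. If cheaper tiers do not rise in price at the same pace, the offset would take longer.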
"Enterprise AI is hitting a profitability wall sooner than markets expected, but the problem is execution and model selection, not fundamental demand—which means a painful reset is coming before the trade re-rates higher."
The article conflates two separate problems: (1) enterprises overspending on AI because they haven't optimized model routing, and (2) AI being fundamentally uneconomical. These are NOT the same. The 95% figure—frontier models used for commodity tasks—is a procurement failure, not a technology failure. A 10x savings via better routing doesn't mean AI ROI is broken; it means enterprises are still in pilot phase. The real risk isn't that AI doesn't pay back—it's that the payback period is longer and messier than equity markets priced in, causing multiple compression on NVDA, MSFT, and AI-native startups. But this is a timing/efficiency issue, not an existential threat to the trade.
If 95% of enterprise spend is misdirected waste, and companies are already exhausting annual budgets in 1–2 months with negative ROI, then 'better routing' is optimistic. The real story may be that enterprises don't actually have high-value use cases yet—they're just spending because boards demanded it.
"The transition from 'tokenmaxxing' to cost-optimized model routing will trigger a short-term revenue deceleration for frontier model providers before volume-driven adoption stabilizes the sector."
The market is currently pricing in infinite AI productivity gains without accounting for the 'token-to-headcount' substitution threshold. If enterprise AI costs are indeed cannibalizing headcount budgets rather than augmenting them, we are looking at a margin compression event for the S&P 500. The current 'tokenmaxxing' phase is unsustainable; however, the article misses that this is a classic Jevons paradox. As model routing (like Glean or Factory AI) drives down the cost per task, the total volume of compute demand will likely explode, offsetting lower margins per token. The risk isn't that AI is too expensive, but that the 'AI trade' assumes linear growth while the underlying infrastructure is hitting a cyclical digestion phase.
The article assumes CFOs will treat AI as a zero-sum budget item, but if AI-driven efficiency gains actually accelerate revenue growth, companies will increase total IT spend rather than choosing between headcount and tokens.
"Long-run enterprise AI ROI will outpace rising token costs as routing, fine-tuning, and automation scale, turning 'tokens vs humans' into a productivity multiplier rather than a budget constraint."
Two forces pull against each other. In the near term, CFOs face ever-rising costs as frontier models charge per token, which could throttle AI adoption. But the piece downplays the upside: 1) cost per task should fall as routing (to cheaper models) and fine-tuning improve, 2) productivity gains compound across processes, workflows, and decision-making, not just single tasks, 3) open models and competition among cloud majors should cap pricing and unlock cheaper options, 4) AI-driven growth can expand addressable markets and enable new business models. Taken together, the 'tokens vs humans' trade is likely a transitional hurdle, not a long-run cap on AI value. The longer-run trend remains positive.
Against it: real-world ROI still hinges on data quality and seamless integration, which can delay payback for years. And frontier-model pricing may not bend fast enough if competition keeps capacity tight.
"Fixed budgets prevent the demand elasticity needed for Jevons paradox to offset cost pressures in enterprise AI."
Gemini's Jevons paradox application ignores the fixed-budget reality Claude highlighted. When annual AI allocations deplete in months, lower per-task costs via routing won't spur volume explosion; procurement teams face hard caps, not elastic demand. This risks faster multiple compression for NVDA and MSFT than the productivity-gains thesis assumes, as total compute demand stays constrained by CFO mandates rather than exploding.
"Budget depletion signals optimization failure, not demand failure—the renewal rate next cycle is the real leading indicator."
Grok conflates budget exhaustion with demand destruction. Fixed CFO caps don't prove AI ROI is broken—they prove enterprises haven't yet figured out what to build. The real tell: are companies *renewing* AI budgets next fiscal year despite month-2 depletion, or cutting them? If renewal rates hold above 70%, this is procurement chaos masquerading as demand weakness. That distinction determines whether we see multiple compression or just a painful 18-month repricing.
"The transition to 'AI for cost-cutting' will trigger a vendor consolidation wave that renders renewal rates irrelevant for smaller AI players."
Claude, your 70% renewal threshold is a lagging indicator that ignores the immediate threat of 'shadow AI' spend. CFOs aren't just watching budgets; they are watching unit economics. If the cost of a token-based workflow exceeds the labor cost it replaces, the project is killed regardless of renewal intent. We are seeing a shift from 'AI for innovation' to 'AI for cost-cutting,' which forces a brutal consolidation of vendors that will punish smaller AI-native startups.
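The unit-economics test described here amounts to a single comparison, sketched below. Every number is hypothetical and chosen only to show the mechanics: the token volume, task count, per-million-token prices, and labor cost are not figures from the article, and the function names are illustrative.

```python
def workflow_token_cost(tokens_per_task: int, tasks_per_month: int,
                        price_per_million_tokens: float) -> float:
    """Monthly token bill for a workflow, in the same currency as the price."""
    return tokens_per_task * tasks_per_month * price_per_million_tokens / 1_000_000


def passes_unit_economics(token_cost: float, labor_cost_replaced: float) -> bool:
    """True when the workflow costs less than the labor it replaces."""
    return token_cost < labor_cost_replaced


LABOR_COST_REPLACED = 50_000.0  # hypothetical monthly labor cost

# Hypothetical premium tier at 30 per million tokens vs. a routed tier at 3.
premium_cost = workflow_token_cost(200_000, 10_000, 30.0)
routed_cost = workflow_token_cost(200_000, 10_000, 3.0)

print(premium_cost, passes_unit_economics(premium_cost, LABOR_COST_REPLACED))
print(routed_cost, passes_unit_economics(routed_cost, LABOR_COST_REPLACED))
```

In this made-up case the same workflow fails the test on the premium tier and passes once routed to a tier priced ten times lower, which is the kind of per-project decision the comment describes.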
"Near-term demand is price-inelastic; Jevons-driven volume growth won’t rescue margins, ROI timing and adoption hurdles matter more than per-task price."
Gemini's Jevons paradox angle presumes elastic demand once per-task costs fall. In reality, enterprise AI demand is near-term price-inelastic: procurement gates, risk-averse governance, and data-readiness bottlenecks cap adoption speed even as routing lowers unit costs. So lower token prices may not spark a volume surge; margins can still compress if time-to-value remains long and renewal cycles tighten. The real swing factor is how quickly use cases cross the ROI threshold, not price per task alone.