AI memory startup focused on cutting token costs raises $98 million

By Maksym Misichenko · CNBC · 23 Jun 2026, 13:02



AI Panel

What AI agents think about this news

The panel is divided on Engram's potential. Some see its 100x token-efficiency claim as enticing and as a sign of a critical pivot in enterprise AI; others question whether the advantage is sustainable. The key debate is whether Engram can build a lasting moat through seamless enterprise data governance, privacy, and deployment at scale, or whether larger incumbents will outpace it.

Risk: Being obsoleted before achieving critical mass due to long enterprise procurement cycles and potential API updates from competitors like OpenAI.

Opportunity: Creating a persistent memory layer that maps proprietary workflows, becoming the system of record for enterprise intelligence and creating data gravity for switching costs.


This analysis is generated by the StockScreener pipeline: four leading LLMs (Claude, GPT, Gemini, Grok) receive identical prompts with built-in anti-hallucination guards.

Full Article CNBC

With corporate America finally starting to crack down on untamed AI usage by developers, an 8-month-old startup called Engram sees a big business opportunity in helping companies save money.

Engram on Tuesday announced that it raised $98 million from investors including General Catalyst, Kleiner Perkins and Sequoia, as well as OpenAI co-founder Andrej Karpathy, who recently joined Anthropic.

The startup, which dubs itself the "learned memory" of AI, says its models can recall organization-specific workflows and context to anticipate questions and give smarter responses with cheaper output. The company claims its models can match or outperform frontier labs using up to 100 times fewer tokens, which are the currency for running AI queries.
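To make the scale of that claim concrete, the arithmetic can be sketched as below. This is only an illustration: the per-token price and the baseline token count are hypothetical placeholders, not figures from Engram or the article, and the sketch ignores integration, latency, and data-movement costs.

```python
def query_cost(tokens: int, price_per_million: float) -> float:
    """Return the dollar cost of processing `tokens` tokens."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical figures chosen only for illustration; none come from the article.
PRICE_PER_MILLION = 10.0   # dollars per million tokens (assumed)
BASELINE_TOKENS = 200_000  # tokens one query consumes without learned memory (assumed)
REDUCTION = 100            # upper bound of the claimed "up to 100 times fewer tokens"

baseline = query_cost(BASELINE_TOKENS, PRICE_PER_MILLION)
reduced = query_cost(BASELINE_TOKENS // REDUCTION, PRICE_PER_MILLION)

print(f"baseline: ${baseline:.2f} per query, reduced: ${reduced:.2f} per query")
```

Because cost scales linearly with token count at a fixed price, a 100x reduction in tokens is a 100x reduction in per-query spend under these assumptions; any fixed overhead per query would shrink that ratio.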

New and more sophisticated AI models are proving pricier than previous iterations, challenging the conventional view that greater scale would lead to lower costs.

"You've got this explosion of data, explosion of cost," said Leigh Marie Braswell, a partner at Kleiner. "Engram comes in and basically maps out your organization and offers orders of magnitude cheaper output."

Less than a year after its founding, the 13-person company has accrued a client roster that includes Microsoft, Notion and legal AI startup Harvey. Engram, whose name comes from the neuroscience term for a trace of memory in the brain, plans to use the funding to pay for compute and talent.

Dan Biderman, Engram's co-founder and CEO, has a lifelong obsession with memory. It started when he was a kid, he said, and tried to trick his grandmother, who had lost her memory, into remembering little facts about him and his siblings.

That led Biderman to eventually pursue a PhD in computational neuroscience at Columbia University and later to join Stanford University's AI lab. Working at Stanford, Biderman began to recognize what he calls the "genius stranger model" — the idea that AI is smart, but its memory is much more limited than it seems. At the same time, more context can overwhelm models, requiring more research and reading coupled with higher costs.

Biderman admits that Engram's models aren't "absolutely better" than those from the likes of OpenAI and Anthropic, but he says they excel at specializing — sometimes at the expense of other capabilities.

"We're trying to go beyond this existing notetaking and build this layer of intuition that humans have, and current models don't," Biderman said.


AI Talk Show

Four leading AI models discuss this article

Opening Takes

ChatGPT by OpenAI

▬ Neutral

"Engram’s success rests on delivering a durable, scalable memory moat and real enterprise deployments, not just theoretical token savings."

This reads as a high-visibility bet on a 'memory layer' for LLMs, backed by marquee funds and a credible founder. The 100x token-cost claim is enticing but highly workload-specific and may not survive real-world integration, latency, and data-movement costs. The real moat is not just better memory, but seamless enterprise data governance, privacy, and deployment at scale—areas where large incumbents can copy or outpace a tiny startup. Execution risk is notable for an 8-month-old, 13-person outfit chasing marquee clients (Microsoft, Notion, Harvey). If incumbents replicate quickly or if enterprise IT pushes back on data sharing, the economics could compress meaningfully.

Devil's Advocate

The strongest counter: token-savings promises are likely to shrink once data-transfer, latency, and governance frictions are included; incumbents can mirror the memory layer, eroding Engram's edge and payback profile.

AI software / enterprise memory and retrieval optimization

Gemini by Google

▲ Bullish

"Engram’s ability to reduce token dependency provides a direct path to sustainable AI unit economics, which is currently the biggest bottleneck for enterprise-wide adoption."

Engram’s $98 million raise at 8 months old highlights a critical pivot in enterprise AI: moving from 'generalist intelligence' to 'contextual efficiency.' By optimizing token consumption—the primary variable cost for LLM integration—they are addressing the 'AI ROI gap' that currently plagues CFOs. However, the 100x efficiency claim is aggressive. If they can truly map organizational workflows into a lightweight, persistent memory layer, they effectively commoditize the frontier models they sit atop. The real value isn't just cost savings; it's the moat created by proprietary, structured organizational data that doesn't require constant, expensive re-prompting of massive foundation models.

Devil's Advocate

The 'memory layer' may simply be a feature that OpenAI or Anthropic eventually integrates natively, rendering Engram a 'feature, not a company' that gets crushed by platform updates.

Enterprise AI Infrastructure

Claude by Anthropic

▬ Neutral

"Engram's thesis is sound but its defensibility depends entirely on whether memory-augmented efficiency is a durable architectural advantage or a temporary optimization that incumbents will neutralize."

Engram's $98M raise for an 8-month-old, 13-person team is a classic venture bet on a real problem: token costs are exploding as models scale. The 100x efficiency claim is extraordinary and unverified; if true, it threatens OpenAI and Anthropic's unit economics. But the article conflates two separate issues: (1) frontier model costs rising faster than expected, and (2) whether specialized, memory-augmented models can solve it. Engram claims to match frontier performance with 1/100th of the tokens. That's either revolutionary or marketing. The client list (Microsoft, Notion, Harvey) suggests traction, but there are no revenue figures, no independent benchmarks, and no clarity on whether this is a moat or a temporary arbitrage that frontier labs will copy in 6 months.

Devil's Advocate

If Engram's efficiency gains are real, OpenAI and Anthropic will integrate similar memory layers into their own APIs within quarters, collapsing Engram's defensibility. The 100x claim may reflect narrow benchmarks (legal docs, internal workflows) that don't generalize to broader use cases—a classic startup optics trap.

Engram (private); indirect pressure on OPENAI (if public), ANTHROPIC (if public)

Grok by xAI

▼ Bearish

"Engram's token-reduction claims face fast commoditization risk from frontier labs improving native memory handling."

Engram's $98M raise highlights investor appetite for AI cost optimization amid rising token expenses from newer models. Its claimed 100x efficiency via organization-specific memory could pressure pure-play inference providers if real. Yet the 13-person, 8-month-old firm has limited proof at enterprise scale, and clients like Microsoft may simply be testing rather than committing. Big labs are already iterating on context windows and retrieval, which could render specialized memory layers redundant within 18 months. Funding will mainly cover compute, not a durable moat.

Devil's Advocate

If Engram's specialization truly delivers consistent outperformance on workflows without degrading general capabilities, the valuation multiple on future efficiency plays could expand rapidly before incumbents catch up.

AI efficiency sector

The Debate

ChatGPT ▼ Bearish

Responding to Claude

Disagrees with: Claude

"Even if Engram achieves high token efficiency on narrow tasks, durable enterprise value depends on governance, privacy, and integration; incumbents will copy memory layers into APIs within 12–24 months, so the 100x claim may be a short-lived edge."

Claude’s caution about the 100x claim feels necessary, but I’d push further: even if it holds on narrow doc-like tasks, enterprise value hinges on governance, data privacy, and seamless deployment across orgs. Token efficiency alone won’t translate to a durable moat if OpenAI/Anthropic or MSPs embed memory layers directly into APIs within 12–24 months. The risk is a mispriced 'efficiency' that collapses under latency, transfer costs, and multi-tenant controls.

Gemini ▲ Bullish

Responding to Claude

Disagrees with: Claude, ChatGPT

"Engram's true moat is high switching costs via deep workflow integration, not just token efficiency."

Claude and ChatGPT are missing the 'vendor lock-in' angle. If Engram successfully maps proprietary workflows into a persistent memory layer, they aren't just saving tokens; they are becoming the system of record for enterprise intelligence. Once an organization's logic is embedded in Engram's architecture, switching costs become prohibitive, regardless of whether OpenAI adds a similar feature. The moat isn't the efficiency—it's the data gravity created by deep integration into the enterprise stack.

Claude ▼ Bearish

Responding to Gemini

Disagrees with: Gemini

"Vendor lock-in requires time to build; Engram may run out of runway before enterprise adoption reaches the threshold where switching costs become real friction."

Gemini's vendor lock-in thesis assumes Engram survives long enough to embed deeply—a big if. But there's a timing gap nobody flagged: enterprise procurement cycles are 6–18 months. If OpenAI ships memory-layer features in Q3 2025, Engram's sales pipeline freezes mid-cycle. Switching costs only matter if you've already switched. The real risk is being obsoleted before achieving critical mass, not after.

Grok ▼ Bearish

Responding to Gemini

Disagrees with: Gemini

"Procurement timelines make deep integration unlikely before incumbents close the gap."

Gemini's data-gravity lock-in thesis collides directly with Claude's procurement-cycle point. Sales to Microsoft-scale clients stretch 6–18 months; if OpenAI ships comparable memory features in Q3 2025, Engram's pipeline stalls before any workflow mapping creates switching costs. The $98M therefore funds a race against API updates rather than a durable enterprise system of record.

Panel Verdict

No Consensus

