The Shelf Life of Intelligence
Martin Casado, GP at a16z, put it plainly last week:
> It's only a matter of time before only the model creators have access to the most powerful models. The rest get access to smaller, distilled versions. Or access the models through first party apps and services that don't provide direct access to the token path.
Raghu Raghuram followed with the economic logic:
> If the investment needs of training are too high and the inference product is too GPU intensive to make broadly accessible, then the logical answer to recoup the training costs is to climb up the value curve, all the way to supplying AI labor and all the other AI use cases.
I replied: "Hence the push for the highest risk-reward work at companies. AI Scientists."
Why scientists? Because frontier intelligence is an ephemeral product. The model you trained for hundreds of millions becomes a commodity in quarters. Think of it as winemaking. Tokens are grape juice: fresh, perishable, cheap by next season. Discoveries are wine: the same ingredient, transformed into something that ages and appreciates. Labs are converging on science because it's the only work where the output outlasts the intelligence that produced it.
The Decay
At every other level of the value chain, the margin evaporates.
| Model | Training Cost | Inference Cost | Total Cost | Frontier Window | Revenue (90%) | Net | Recovery |
|---|---|---|---|---|---|---|---|
| GPT-3 | $5M | $50M | $55M | 33 months | $200M | +$145M | 3.6x |
| GPT-4 | $80M | $2.1B | $2.2B | 14 months | $2.9B | +$700M | 1.3x |
| GPT-4.5 | $300M | $5B | $5.3B | 5 months | $3.9B | -$1.4B | 0.7x |
| GPT-5 | $500M | $12B | $12.5B | 8 months | $13.3B | +$800M | 1.1x |
Each generation costs more to build, far more to serve, holds frontier status for less time, and recovers less. Inference is the real cost: GPT-4's was 26x its training bill, and by 2024 OpenAI spent more on inference than it earned in revenue. Gemini Ultra lasted a month before 1.5 Pro replaced it. Claude 3 Opus lasted three before 3.5 Sonnet killed it at a fifth the price. Every DeepSeek, Qwen, and Llama release compresses the window further.
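The decay in the table above reduces to simple arithmetic. A minimal sketch, using the dollar estimates quoted here (not audited numbers):

```python
# Net return and recovery multiple per model generation.
# All figures in $M, taken from the estimates in the table above.
GENERATIONS = {
    # name: (training_cost, inference_cost, revenue)
    "GPT-3":   (5,   50,     200),
    "GPT-4":   (80,  2_100,  2_900),
    "GPT-4.5": (300, 5_000,  3_900),
    "GPT-5":   (500, 12_000, 13_300),
}

def recovery(training: float, inference: float, revenue: float) -> tuple[float, float]:
    """Return (net profit, recovery multiple = revenue / total cost)."""
    total = training + inference
    return revenue - total, revenue / total

for name, (t, i, r) in GENERATIONS.items():
    net, mult = recovery(t, i, r)
    print(f"{name:8s} net ${net:+,.0f}M  recovery {mult:.1f}x")
```

Run it and the pattern is obvious: inference, not training, dominates total cost in every generation after GPT-3, which is why the recovery multiple collapses even as revenue grows.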
Agents inherit the same decay. Cursor hit $1 billion ARR by November 2025 and spends roughly all of it on API costs. The labs noticed: OpenAI acquired Windsurf for ~$3 billion, and Claude Code already runs at a $500 million run-rate. The agent layer is a customer acquisition channel, not a business.
Two Frontiers
So how do you make money from intelligence that spoils? You don't sell the grape juice. You make wine. Capturing durable value requires owning two things.
Frontier 1: the discovery frontier. Labs will close their strongest models and use them internally to produce discoveries they sell directly. The economics show why:
| | Traditional | With AI | Compression |
|---|---|---|---|
| Cost per drug | $2B to $2.6B | 25 to 40% lower | Saves $500M to $1B per drug |
| Timeline | 10 to 15 years | 30 to 40% shorter | 3 to 6 years faster |
| Phase I success | 40 to 65% | 80 to 90% | Failure rate cut in half |
| Single hit value | $6.7B median lifetime | Same, but found faster and cheaper | The margin that matters |
A drug worth $6.7 billion, discovered using compute that cost millions. Keytruda alone generated $31.7 billion in 2025. Global pharma R&D runs $190 billion a year, and that's one domain. Materials science, chip design, climate modeling, mathematics: each is a multi-billion-dollar knowledge production market.
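The compression ranges above can be sanity-checked with back-of-envelope arithmetic. The input ranges are the cited estimates; nothing here is proprietary data:

```python
# Absolute savings when a value range is reduced by a percentage range.
def savings_range(value_lo: float, value_hi: float,
                  cut_lo: float, cut_hi: float) -> tuple[float, float]:
    return value_lo * cut_lo, value_hi * cut_hi

# Cost per drug: $2.0B-$2.6B cut by 25-40% -> roughly $0.5B to $1B saved
print(savings_range(2.0, 2.6, 0.25, 0.40))

# Timeline: 10-15 years cut by 30-40% -> 3 to 6 years faster
print(savings_range(10, 15, 0.30, 0.40))
```

The "Saves $500M to $1B per drug" and "3 to 6 years faster" cells in the table are exactly these products.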
This is already happening. Isomorphic Labs runs drug design campaigns on AlphaFold. Gemini Deep Think produced a research paper (Feng26) with zero human co-authors. The pattern will look like what it already looks like: labs and their strategic partners get first access to the strongest models, extract the highest-value discoveries, and then open the model to everyone else once the frontier has moved on. A startup using GPT-5 via API pays $5 per million input tokens and $20 per million output. OpenAI's internal team pays inference at cost: same model, 10x the compute budget.
Frontier 2: the instrumentation frontier. The less obvious but potentially more important play. Not software connectors or database integrations. The instruments themselves: physical sources where new data comes into existence.
When OpenAI partnered with Ginkgo Bioworks, they ran 36,000 cell-free protein synthesis reactions. Each round of wet-lab outcomes fed back into the model, improving the next round's predictions. After three rounds, protein production cost dropped 40%. Model to experiment to data to better model. That's the instrument at work.
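The loop is compounding, not additive: each round's data cuts the next round's cost. A toy illustration of that compounding, where the 40% three-round drop is the cited figure but the constant per-round improvement rate is a hypothetical simplification, not Ginkgo's actual pipeline:

```python
# Model-to-experiment-to-data loop: each wet-lab round feeds results back
# and cuts unit cost by a fixed fraction. Hypothetical decay model.
def run_campaign(rounds: int, cost_per_unit: float, improvement: float) -> float:
    for _ in range(rounds):
        cost_per_unit *= 1 - improvement  # data from this round improves the next
    return cost_per_unit

# A ~15.6% per-round gain compounds to the cited ~40% drop over three rounds
final = run_campaign(rounds=3, cost_per_unit=1.0, improvement=0.156)
print(f"cost after 3 rounds: {final:.2f} of baseline")
```

The point of the sketch is the shape, not the constants: a modest per-round gain, fed back through the instrument, compounds into the headline number.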
DeepMind opened a materials lab in the UK that synthesizes hundreds of novel materials per day. Not simulated. Synthesized. Each one a data point that didn't exist before the experiment ran.
Owning the instrumentation frontier means owning the instruments where the bits are born. The wet lab. The marketplace. The clinical trial. The synthesis reactor. The patient cohort. Primary, proprietary, and compounding. A competitor can build a better model. They cannot replicate data that was physically created in your lab or measured in your trial.
Two flywheels. Best model to best research to proprietary discoveries. Best model to widest instrumentation to proprietary data to better model. The first captures value. The second compounds it.
Why Science
Few companies control frontier models: OpenAI at $852 billion, Anthropic at $380 billion, DeepMind inside a $2 trillion Alphabet. The industry is not pouring $650 to $700 billion a year of capex into compute to sell API tokens at collapsing prices. It is building knowledge production machines.
The gap Casado describes is not a capability gap. It is a knowledge production gap. The distilled model you get via API can write code and summarize documents. The full model running inside the lab, connected to instruments generating proprietary data at industrial scale, can produce discoveries that outlast the intelligence that made them. Same architecture. Different access. Different power.
Sources
- Martin Casado on model access concentration, X post, 2026
- Raghu Raghuram on climbing the value curve, X reply, 2026
- Jigar Doshi on AI scientists push, X post, 2026
- GPT-3 training cost ($4.6M, 2020), Lambda Labs estimate
- GPT-4 training cost ($79M, 2023), Semianalysis
- GPT-5-class training costs ($500M+), Industry estimates, 2026
- OpenAI API pricing, OpenAI pricing page
- OpenAI revenue ($3.5M 2020, $34M 2021, $200M 2022, $2B 2023, $3.7B 2024, $13.1B 2025, $2B/month early 2026), GrowthNavigate; Statista; Reuters, Jan 2026; Wikipedia; CoinDesk, Apr 2026
- OpenAI inference costs ($3.77B in 2024, $5.02B H1 2025, $8.67B through Q3 2025), leaked Azure documents, Where's Your Ed At
- OpenAI gross margin fell from 40% to 33% in 2025 (below 46% internal target), MBI Deep Dives / The Information
- Model frontier windows (GPT-3 33mo, GPT-4 14mo, Gemini Ultra 1mo, Claude 3 Opus 3mo, GPT-4.5 5mo), Wikipedia model pages; TokenMix.ai, Apr 2026
- Cursor $1B ARR (Nov 2025), Cursor Series D
- Cursor 100% API cost ratio, The Information, 2026
- Claude Code $500M run-rate, Anthropic, 2025
- OpenAI acquired Windsurf ($3B), Bloomberg, 2026
- Traditional drug discovery cost ($2-2.6B), timeline, failure rate, Tufts CSDD / DiMasi et al.
- AI drug discovery improvements (Phase I/II success rates), McKinsey, BCG
- Keytruda revenue ($31.7B, 2025), Merck annual report
- Median lifetime sales per FDA-approved drug ($6.7B), Deloitte Centre for Health Solutions
- Global pharma R&D spend ($190B, 2024), EvaluatePharma / IQVIA
- OpenAI + Ginkgo Bioworks (36,000 CFPS reactions, 40% cost reduction), Ginkgo Bioworks, 2026
- DeepMind automated materials lab (UK, 2026), Google DeepMind
- Gemini Deep Think autonomous paper (Feng26), Google DeepMind, 2026
- OpenAI valuation ($852B), Reuters, 2026
- Anthropic valuation ($380B), Reuters, 2026
- Alphabet market cap ($2T+), Public markets, 2026
- Hyperscaler capex ($650-700B, 2026), Company earnings / SemiAnalysis