The Intelligence Stack: Who Wins When Models Get Cheap
The artificial intelligence industry in February 2026 is defined by a paradox. On the surface, frontier models appear to be turning into cheap commodities. Chinese research labs are closing the gap, releasing models at a fraction of the cost. But the reality is more nuanced than the headline narrative suggests. Chinese models remain roughly three to four months behind the leading American systems. That gap has narrowed from six to eight months a year ago. The cost advantage is real and significant. The capability parity is not.
The entire market is shifting. Winning no longer means having the smartest standalone model. It means having the best overall system. Orchestration, local memory, autonomous agents that execute multi-step workflows without human supervision. These are the new boundaries of competition.
The cost to run these models has fallen roughly a thousandfold since GPT-4 launched. That cost collapse makes previously impossible use cases suddenly viable and reshapes who captures value in the stack. This month saw Anthropic draw political fire for restricting US military access to Claude. OpenAI and Google accelerated their race to dominate agentic coding. Apple quietly emerged as an asymmetric threat through its disciplined capital expenditure strategy. Enterprise adoption is finally moving past the pilot phase, but the pattern is specific. Companies are buying automated workflows, not raw intelligence. The vendors who understand this distinction are pulling away from the pack.
The Great Commoditisation of Intelligence
The most pressing dynamic reshaping the market is the speed at which frontier capabilities get replicated. The narrative that Chinese labs instantly match American models and sell access for pennies has become the dominant framing. It’s not wrong, but it’s incomplete in ways that matter.
Chinese models are still roughly three to four months behind the leading American systems. A year ago that gap was somewhere in the range of six to eight months. The gap is narrowing. But here’s the critical nuance that most analysis misses: as capability curves steepen, those months become exponentially more significant. Think of it like two objects falling toward a black hole. The one that crossed the event horizon three months earlier isn’t just slightly ahead. It’s experiencing a fundamentally different reality. Time dilates. The distance between them, measured in months on a calendar, corresponds to what feels like years of capability divergence. At the frontier, where each new model generation compounds on the previous one, a three-month lead translates into capabilities that the trailing model simply cannot access yet.
The cost advantage, however, is unambiguous. The Lunar New Year releases from Chinese labs proved this aggressively. Zhipu AI released GLM-5, a 744-billion-parameter model that activates just 40 billion parameters per token. Trained on 28.5 trillion tokens, it performs strongly in early testing and dominates in certain agentic coding tasks. Alibaba open-sourced Qwen 3.5, a massive 397-billion-parameter multimodal system activating only 17 billion parameters during inference. ByteDance launched Seed 2.0, which tied for top spots on vision and reasoning benchmarks while costing just 47 cents per million input tokens.
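To make the sparsity concrete, the share of each model that is active per token can be worked out from the figures quoted above. This is a minimal Python sketch: the dictionary layout and the function name are illustrative, and the parameter counts are simply the ones cited in this section.

```python
# Parameter counts in billions, as quoted in the text above.
MODELS = {
    "GLM-5": {"total_b": 744, "active_b": 40},
    "Qwen 3.5": {"total_b": 397, "active_b": 17},
}

def active_fraction(total_b: float, active_b: float) -> float:
    """Share of a sparsely activated model's parameters used for each token."""
    return active_b / total_b

for name, params in MODELS.items():
    share = active_fraction(**params)
    print(f"{name}: {share:.1%} of parameters active per token")
```

Both models touch only around four to five per cent of their weights per token. Per-token compute tends to track active rather than total parameters, which is one plausible reason such large models can be served cheaply, although memory requirements still depend on the full parameter count.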
MiniMax delivered perhaps the most severe shock to the pricing structure. Their M2.5 release scored 80.2% on SWE-Bench Verified. That places it within a fraction of a percentage point of Anthropic’s flagship Claude Opus 4.6. MiniMax charges roughly one dollar per hour of compute at 100 tokens per second. The maths looks punishing for Western incumbents on paper.
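The hourly figure is easier to compare with per-token pricing once it is converted. The sketch below does that arithmetic in Python; it assumes the model generates continuously at the quoted speed for the full hour, which is a simplification, and the function name is illustrative.

```python
def cost_per_million_tokens(dollars_per_hour: float, tokens_per_second: float) -> float:
    """Convert an hourly price at a sustained generation speed into dollars per million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return dollars_per_hour / tokens_per_hour * 1_000_000

# Figures quoted above: roughly one dollar per hour at 100 tokens per second.
minimax_rate = cost_per_million_tokens(dollars_per_hour=1.0, tokens_per_second=100)
print(f"${minimax_rate:.2f} per million tokens")  # $2.78 per million tokens
```

At 100 tokens per second an hour yields 360,000 tokens, so one dollar per hour works out to a little under three dollars per million generated tokens.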
But benchmarks don’t capture the whole story. They never have. SWE-Bench scores can be nearly identical between two models while the actual experience of using them for production work differs enormously. GPT-5.3-Codex xhigh is substantially better than Claude Opus 4.6 Extended Thinking in practice, despite the two scoring within the margin of error on the same benchmark. The difference shows up in how the model handles ambiguity, how it navigates complex multi-file dependencies, how it recovers from dead ends. These are qualities that a pass/fail benchmark on isolated coding tasks simply cannot measure. Anyone who has spent serious time with both models knows this intuitively. The benchmark says they’re equal. They are not.
This doesn’t diminish the genuine advances in open-source and Chinese models. The progress is real and structurally important. Open-weight models are enabling an enormous ecosystem of fine-tuned, domain-specific applications that wouldn’t exist otherwise. The cost reductions are making AI accessible to markets and use cases that couldn’t afford frontier pricing. But we need to be precise about what’s actually happening. The Chinese ecosystem is producing excellent models at dramatically lower cost. It is not yet producing models that match the best American systems at the frontier of complex reasoning and production-grade coding.
The implication is that raw model capability is a depreciating asset, but the depreciation curve isn’t as steep as the commodity narrative suggests. There remains a meaningful premium for the best models, and that premium is being captured by companies building the most capable systems, not just the cheapest ones. The two trillion dollars in collective capital expenditure deployed by American hyperscalers still faces a return-on-investment challenge, but the moat is wider than a pure benchmark comparison implies.
Value is migrating into infrastructure and orchestration regardless. Companies building agent frameworks, retrieval pipelines, and domain-specific workflows are creating sticky products. The language model is becoming the equivalent of a computer processor. Necessary, but not the actual product. The difference is that some processors are still meaningfully better than others, and the customers building mission-critical systems know it.
OpenAI clearly feels competitive pressure. They sent a memo to US lawmakers accusing DeepSeek of using distillation to free-ride on American research, framing it as intellectual property theft. But even if distillation stopped entirely tomorrow, the Chinese ecosystem is already operating independently. Labs in Beijing and Shenzhen train on domestic Huawei Ascend clusters and write highly efficient code to squeeze every drop of performance from their hardware. They’re proving that you don’t need the biggest supercomputer to produce strong results if your training pipeline is smart enough. The gap is narrowing. It hasn’t closed.
The Agentic Transition
We officially crossed the threshold where AI agents moved from terminal demos to production enterprise workers. Vibe coding is no longer a joke on social media. It’s becoming the default method for software creation.
The clearest example of this shift is the story of OpenClaw. Peter Steinberger, an Austrian developer who previously built and sold a PDF software company for over 100 million euros, created OpenClaw as a weekend project in late 2025. He wanted a personal AI agent that could manage his files and interact with messaging apps. He open-sourced the code, and within weeks it became the fastest-growing project in GitHub history, gathering over 150,000 stars and outpacing the early growth of React and Linux. People bought dedicated computers just to leave OpenClaw running in their closets.
The project relied heavily on Anthropic’s Claude models. Instead of embracing this massive free distribution channel, Anthropic’s legal team sent Steinberger a trademark complaint over his original name for the project. They forced a rebrand at five in the morning. During the transition, crypto scammers hijacked the abandoned social handles. Anthropic also unexpectedly revoked the API access third-party tools used to connect to Claude subscriptions.
OpenAI recognised the opportunity immediately. Sam Altman hired Steinberger to lead the next generation of personal agents at OpenAI. The hire highlights a critical vulnerability for the major labs. A solo developer with a messaging app integration built a more compelling agent experience than companies spending billions on compute. OpenAI didn’t buy a model. They bought the person who proved that the orchestration layer is where users actually want to spend their time.
Meta recognised the exact same threat. They paid two billion dollars to acquire Manus, an agentic AI startup that hit 100 million dollars in recurring revenue in eight months without ever training a proprietary model. Meta immediately launched Manus on Telegram to gather behavioural data on how humans interact with autonomous agents before rolling it out to their core platforms.
The wrapper layer isn’t the product. The reliability engineering is the product. OpenClaw spawned a massive ecosystem of agents, but it also triggered cybersecurity warnings from Gartner because users struggled to manage permissions and state across long sessions. An agent is useless if it hallucinates at step fourteen of a complex task and emails your entire contact list. The labs know the models are becoming commodities, so they’re buying the teams who can make an unreliable probabilistic system work reliably.
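The step-fourteen problem is a matter of compounding. If each step of a workflow succeeds with some probability and failures are independent, the chance of finishing the whole task cleanly is that probability raised to the number of steps. The Python sketch below illustrates this; the per-step reliability values are hypothetical and chosen only for illustration, and the independence assumption is a simplification.

```python
def task_success_probability(per_step_reliability: float, steps: int) -> float:
    """Probability that every step succeeds, assuming steps fail independently."""
    return per_step_reliability ** steps

# Hypothetical per-step reliabilities for a 14-step agent workflow.
for p in (0.99, 0.95, 0.90):
    overall = task_success_probability(p, steps=14)
    print(f"per-step {p:.0%} -> 14-step task {overall:.0%}")
```

Under these assumptions a step that works 95 per cent of the time completes a fourteen-step task less than half the time, which is why small gains in per-step reliability matter more than they first appear.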
The Coding Renaissance
The way software gets built changed permanently this month.
Anthropic quietly revealed that Claude Code reached a 2.5-billion-dollar annualised revenue run rate within twelve months of launch. That makes it one of the fastest-growing software products in history. A single coding feature inside Anthropic now generates more than double the revenue the entire company made twelve months ago. Analysts estimate that four per cent of all public GitHub commits are now authored directly by Claude. The company also released Claude Sonnet 4.6, bringing a one-million-token context window and significantly reduced hallucination rates to their mid-tier pricing level.
A million tokens is roughly 750,000 words. That’s an entire large codebase held in memory simultaneously. No more losing track of distant files. No more forgetting a utility function defined three directories away. The whole project sits inside the conversation, and the model reasons about it as a unified whole.
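The conversion above rests on the common rule of thumb of about 0.75 words per token. A rough Python sketch of that estimate follows; the constant and function names are illustrative, the ratio is a heuristic for English prose, and source code may need more tokens per word, so treat the result as an upper bound on what fits.

```python
WORDS_PER_TOKEN = 0.75  # rule of thumb for English prose; an assumption, not a measured value

def tokens_to_words(tokens: int) -> int:
    """Approximate number of words that a given token budget can hold."""
    return round(tokens * WORDS_PER_TOKEN)

def estimate_tokens(word_count: int) -> int:
    """Approximate number of tokens needed for a given word count."""
    return round(word_count / WORDS_PER_TOKEN)

def fits_in_context(word_count: int, context_tokens: int = 1_000_000) -> bool:
    """Check whether a body of text is likely to fit in the context window."""
    return estimate_tokens(word_count) <= context_tokens

print(tokens_to_words(1_000_000))  # 750000
print(fits_in_context(600_000))    # True: roughly 800,000 tokens
print(fits_in_context(900_000))    # False: roughly 1,200,000 tokens
```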
OpenAI responded aggressively. They retired GPT-4o entirely, forcing hundreds of millions of users onto the newer GPT-5.2 architecture. Then they released GPT-5.3-Codex and a specialised variant called Codex-Spark. Spark runs on Cerebras Wafer Scale Engine hardware rather than traditional Nvidia GPUs. This custom silicon features four trillion transistors on a single die, eliminating the memory bandwidth bottlenecks that plague traditional GPU clusters. It generates over a thousand tokens per second. Developers no longer wait for code to generate. The output appears almost instantly, allowing rapid iterations, real-time debugging, and immediate interface tweaks. By routing production workloads through Cerebras, OpenAI is also building bargaining power against Nvidia.
The two models occupy different positions. Opus dives into tasks quickly, tries things, iterates, and occasionally surprises you with elegant solutions. It’s pleasant to work with. But it can be hasty, committing to a solution path before fully understanding the problem. Codex reads more of the codebase before acting. It disappears for twenty to thirty minutes, then returns with a comprehensive solution. Less fun, more reliable. Its willingness to pause and gather context before acting often produces code that fits more naturally into existing project structures.
This level of automation is forcing a redefinition of the software engineer’s role. Anthropic’s head of Claude Code recently stated that the title “software engineer” will soon be replaced by “builder” or “product manager”. Developers note that their daily work now consists of running ten or more parallel agent sessions. They act as reviewers and directors rather than typists. The skill of writing syntax is losing its premium. The new premium skills are system design, context engineering, and taste. A developer must know how to evaluate a dozen working implementations and identify the one that fits the broader architecture without creating technical debt.
Non-technical users are building functional applications in hours by describing their intent to an agent and iterating on the results. Platforms like Replit and Emergent are capturing this massive new market. Emergent recently doubled its revenue to 100 million dollars in a single month, with seventy per cent of its users having zero formal coding background. They’re replacing the spreadsheets that run small businesses with bespoke mobile apps built entirely through natural language.
Programming as a manual craft is ending. As one developer put it: “It’s okay to mourn our craft.” The sadness is real. So is the excitement. The same tools that make manual coding obsolete make building things faster, more accessible, and more fun than ever. The joy hasn’t disappeared. It’s migrated. From the act of writing code to the act of building things, with code as one tool among many.
Video Generation Crosses the Threshold
While text and coding models fight over enterprise budgets, video generation models are threatening the entertainment industry.
ByteDance released Seedance 2.0, and early usage suggests it’s a definitive leap past OpenAI’s Sora and Google’s Veo. Previous tools felt like playing the lottery. You prompted the model, waited, and hoped one out of five clips was usable. They were limited to silent, short bursts of motion. Seedance 2.0 supports continuous generation. It pushes past the 15-second limit to create coherent, multi-minute scenes. It processes text, images, video references, and audio inputs simultaneously. It generates the video and the corresponding sound effects in a single pass, ensuring perfect lip-sync and environmental audio coordination without requiring post-production dubbing.
The cost reduction is brutal. Creators generate broadcast-quality anime and cinematic shorts for pennies. A professional video ad that used to require a massive budget and a production crew can now be generated for under three dollars. ByteDance built this model inside the largest video consumption feedback loop on earth. They know exactly which visual pacing holds human attention on TikTok. They used that data to shape the model’s output. Hollywood and the advertising industry are facing a structural reset.
Hollywood’s response highlights the futility of local copyright enforcement against global technology. When OpenAI released Sora, the motion picture industry threatened litigation, and OpenAI quickly implemented strict guardrails protecting existing intellectual property. ByteDance, operating primarily out of Beijing, faces different incentives. Seedance 2.0 freely generates high-quality videos featuring copyrighted characters and franchise universes. Every time an American company degrades its product to satisfy domestic copyright demands, a Chinese competitor fills the capability gap. Users migrate to the platform that lets them build what they want.
Scientific Breakthroughs and Deep Reasoning
February proved that language models are no longer statistical parrots repeating their training data. They’re actively contributing to net-new scientific discovery.
Google released a major update to Gemini 3 Deep Think. The model achieved 84.6% on the ARC-AGI-2 benchmark, a test designed to resist brute-force memorisation. A year ago, the best models scored in the single digits. Deep Think also reached a 3455 Elo rating on Codeforces, putting it in the top 0.01 per cent of competitive programmers globally. Google achieved this while cutting inference cost by 82% compared to previous versions. Efficient reasoning pathways matter just as much as raw parameter scale.
OpenAI published a preprint showing GPT-5.2 derived a novel result in theoretical physics. Physicists assumed for decades that a specific type of particle interaction involving single-minus gluon scattering could not occur. The mathematical expressions required to prove it were growing superexponentially. GPT-5.2 took the messy maths, simplified it, spotted a hidden pattern, and conjectured a general formula. An internal reasoning model spent twelve hours formally proving it. Human researchers simply verified the work. The AI didn’t replace the physicists. It handled a combinatorial explosion that would have taken humans months or years to map out.
The maths community saw similar shifts. Eleven top mathematicians, including a Fields Medal winner, created the First Proof challenge. They wrote ten unpublished, post-doctoral problems and gave the models one week. OpenAI claimed their internal models solved six of the ten. The mathematicians verified that only two solutions were actually correct. The other four claimed solutions were incredibly articulate garbage, leaving eight of the ten problems unsolved. Still, the fact that an AI could solve even two novel maths problems that didn’t exist anywhere in its training data proves that actual reasoning is beginning to emerge.
In biology, Isomorphic Labs released IsoDDE, an AI engine that designs drugs natively on a computer. It reportedly doubles the accuracy of AlphaFold 3 on hard targets and finds drug pockets from genetic sequences alone. The pharmaceutical industry is watching the cost of discovering a viable drug molecule drop toward zero. The traditional AI drug discovery industry just ran a 15-billion-dollar experiment proving a 2011 Turing Award winner right. Judea Pearl argued for decades that statistical models trained on text learn how we describe the world, not how the world actually works. AI companies trained models on published papers and genomic databases, found correlations, and failed completely in clinical trials. The companies integrating causal inference into their pipelines are telling a different story. The FDA is now moving toward Bayesian methods for clinical trial design, forcing the industry to adapt.
The Physical Limits of Intelligence
The software is ready to scale, but the physical world is struggling to accommodate it. The primary constraints on AI are no longer algorithms or data. They’re electricity, water, and cooling.
The chip bottleneck is rapidly being replaced by a power grid bottleneck. Amazon Web Services admitted they’ll be supply-constrained for years, despite spending 200 billion dollars on capital expenditures in 2026. You can buy all the Nvidia Blackwell chips you want. They’re useless if you wait three years for a utility company to install a substation. Data centres currently consume roughly four per cent of United States electricity. That number is projected to hit twelve per cent by 2028.
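The jump from four to twelve per cent implies a steep compound growth rate. The Python sketch below works it out, assuming that "currently" means 2026, so that the projection spans two years; that span is an assumption made here, not a figure from the projection itself.

```python
def implied_annual_growth(start_share: float, end_share: float, years: float) -> float:
    """Compound annual growth rate needed to move from start_share to end_share."""
    return (end_share / start_share) ** (1 / years) - 1

# Assumption: the 4% figure is for 2026, so reaching 12% by 2028 takes two years.
growth = implied_annual_growth(start_share=0.04, end_share=0.12, years=2)
print(f"{growth:.0%} per year")  # 73% per year
```

Tripling in two years means the data centre share of electricity would need to grow by roughly 73 per cent a year. This is growth in share, so the growth in absolute consumption would be higher still if total demand also rises.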
A single large-scale AI data centre can consume as much electricity as 100,000 homes. That’s not a theoretical projection. It’s the current reality for facilities operated by Microsoft, Google, and xAI. Build-out announcements appear weekly, each measured in hundreds of megawatts or even gigawatts of planned capacity. Most of this capacity will be powered by natural gas, at least in the near term. Nuclear is too slow to build. Solar and wind are intermittent and require battery storage that doesn’t yet exist at sufficient scale. Gas-fired power plants can be constructed relatively quickly, run continuously, and ramp up to match load. For hyperscalers racing to bring compute online as fast as possible, gas is the pragmatic answer.
Water consumption is equally alarming. Google’s single data centre in Council Bluffs, Iowa, consumed one billion gallons of fresh water last year just for evaporative cooling. As the industry scales, these facilities consume water at the rate of small cities.
Massive physical supply chains are pivoting to feed the AI boom. Ford built a 5.8-billion-dollar EV battery plant in Kentucky. Four months after it opened, they shut it down, laid off 1,600 workers, and are spending another two billion dollars to retool the factory for data centre energy storage. The EV market slowed exactly when AI compute demand went vertical. The workers caught in the middle didn’t get a vote on which demand curve won.
In the middle of this infrastructure frenzy, Apple is playing a completely different game. Microsoft, Amazon, Google, and Meta are taking on massive debt to buy server racks that’ll be obsolete in three years. Apple spent a fraction of that amount on capital expenditures. They integrated existing models into their hardware, prioritised local processing with their M-series chips, and returned over a hundred billion dollars to shareholders in cash. Apple’s unified memory architecture and custom Neural Engine provide a hardware moat that Intel, Qualcomm, and even Nvidia’s consumer GPUs can’t easily replicate. The company trades outside the “AI factor” entirely. If the massive data centre investments fail to produce adequate returns, Apple will look brilliant for sitting out the capital expenditure war.
The SaaS Crisis and the Skill Era
Software-as-a-service companies are facing their most severe valuation compression since 2022, and this time the driver is structural rather than monetary. The market is waking up to the fact that standardised SaaS is highly vulnerable to agentic automation.
For the last fifteen years, software companies built products and exposed APIs. The goal was to become a pipe that other developers plugged into. Language models shifted the centre of gravity. Execution is no longer scarce. You don’t need a dedicated SaaS application to manage files, compile research, or analyse churn.
Anthropic’s Claude Cowork operates as a persistent agent on your desktop. It reads local files, opens applications, cross-references design documents, and builds structured spreadsheets. When an AI agent can read your raw data, process the logic, and generate a customised dashboard on the fly, the need for a massive annual SaaS contract disappears.
The semiconductor sector and the software sector have moved in opposite directions. Semiconductors have 89% of their members trading above their 200-day moving average. Software has the inverse pattern. Short interest on XLK jumped to extreme levels. Only three insider purchases in large-cap software occurred over recent months. A near-total absence of insider conviction.
ServiceNow at 24 times forward earnings, with 20% growth and a gross margin in the 80s, isn’t a distressed valuation. It’s a quality company priced for deceleration that may or may not materialise. One insider purchase appeared at NOW in February. That’s worth watching for follow-through. When insiders start buying, it’ll signal that the people with the best information believe the disruption narrative has overshot.
This transition is moving the economy from the API era to the skill era. The new moat is encoded judgement. Developers encode high-level playbooks into files that agents can read. You write a markdown file explaining exactly how to audit a landing page or structure a legal intake. You package it as a skill, and you let agents call it thousands of times a day. Distribution no longer means getting humans to log into your dashboard. It means getting autonomous agents to call your skill.
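One way to picture a skill is as a markdown playbook plus a thin wrapper that agents can call by name. The Python sketch below is purely illustrative: the Skill and SkillRegistry classes, the playbook text, and the prompt layout are all hypothetical and do not correspond to any vendor's actual skill format.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Skill:
    """A markdown playbook that an agent can load on demand (hypothetical format)."""
    name: str
    playbook_md: str
    calls: int = 0

@dataclass
class SkillRegistry:
    """Keeps skills by name and counts how often agents call each one."""
    skills: Dict[str, Skill] = field(default_factory=dict)

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def call(self, name: str, task: str) -> str:
        """Return the prompt an agent would run: the playbook followed by the task."""
        skill = self.skills[name]
        skill.calls += 1
        return f"{skill.playbook_md}\n\n## Task\n{task}"

registry = SkillRegistry()
registry.register(Skill(
    name="landing-page-audit",
    playbook_md=(
        "# Landing page audit\n"
        "1. Check that the headline states the offer.\n"
        "2. Check that there is one primary call to action."
    ),
))

prompt = registry.call("landing-page-audit", "Audit https://example.com")
print(prompt)
print(registry.skills["landing-page-audit"].calls)  # 1
```

The call counter stands in for the distribution metric described above: in this framing, usage is measured in agent invocations rather than human logins.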
The traditional open-source library economy is changing too. Andrej Karpathy demonstrated how an AI agent could read a massive codebase, extract only the specific 150 lines of code needed for a task, and rewrite it without any external dependencies. If agents can dynamically pull and rewrite logic on the fly, the value of maintaining sprawling open-source packages drops. Code becomes completely fluid.
Geopolitics, Defence, and the Safety Fracture
The tension between AI safety and national security reached a boiling point this month.
Anthropic reportedly asked the Department of Defense how their Claude model was being used following a military extraction operation in Venezuela. The Pentagon’s response was hostile. Defence officials threatened to classify Anthropic as a supply chain risk. The message from the military establishment is blunt. If you build frontier AI using American energy, American capital, and American legal protections, you hand over the keys and don’t ask questions about the targets.
This creates a dilemma that has no comfortable resolution. Anthropic’s head of safeguards research resigned this month, publicly stating that commercial and geopolitical pressures are forcing AI labs to abandon their safety commitments. The people who built the defences against AI-assisted bioterrorism are walking out the door because the incentives prioritise deployment over caution.
OpenAI quietly changed the wording of its core mission statement in recent tax filings. They removed phrases about building AI safely and being unconstrained by the need to generate financial return. The new mission simply focuses on ensuring artificial general intelligence benefits humanity. This aligns with their increasingly aggressive commercial posture and their active lobbying efforts to ban Chinese models from the US market.
Elon Musk’s xAI experienced a different kind of turbulence. Just as SpaceX acquired xAI to create a massive combined entity heading toward an IPO, half of xAI’s founding team resigned. Jimmy Ba, Tony Wu, and other prominent researchers departed. Musk immediately restructured and launched the Grok 4.20 beta, a multi-agent architecture where four specialised agents run in parallel to debate and verify answers before presenting them. Grok has a unique structural advantage: it ingests the entire real-time data firehose from the X platform. It can process global market sentiment minutes before that data appears in traditional financial terminals. xAI also entered a massive 100-million-dollar Pentagon contract competition to build voice-controlled AI systems for autonomous drone swarms.
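The general pattern of several agents answering in parallel and cross-checking one another can be sketched in a few lines of Python. This is not xAI's implementation, whose details are not described here: the four agents are stub functions standing in for model calls, and the "debate" is reduced to a simple majority vote with a quorum.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional

Agent = Callable[[str], str]

# Stub agents standing in for model calls; a real system would query a model here.
def researcher(question: str) -> str:
    return "42"

def coder(question: str) -> str:
    return "42"

def sceptic(question: str) -> str:
    return "41"

def verifier(question: str) -> str:
    return "42"

def answer_with_panel(question: str, agents: List[Agent], quorum: float = 0.5) -> Optional[str]:
    """Run agents in parallel and return the majority answer, or None without a quorum."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        answers = list(pool.map(lambda agent: agent(question), agents))
    top_answer, votes = Counter(answers).most_common(1)[0]
    return top_answer if votes / len(agents) > quorum else None

result = answer_with_panel("What is 6 x 7?", [researcher, coder, sceptic, verifier])
print(result)  # 42
```

Returning None when no answer clears the quorum is a deliberate choice in this sketch: declining to answer is usually cheaper than presenting a contested answer as settled.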
The US is assembling its largest Middle East air presence since 2003. SpaceX’s Starshield programme and integration with xAI suggest that Musk’s companies are embedding deeply within military infrastructure. AI companies may face a binary choice: align with government access requirements or face regulatory and political consequences, including restricted access to compute, energy, and talent pipelines.
Capital Markets and Financial Engineering
The financial engineering behind these AI labs is unprecedented.
Anthropic just raised 30 billion dollars at a 380-billion-dollar valuation. They grew their revenue to a 14-billion-dollar run rate in less than three years. OpenAI is attempting to raise 100 billion dollars at an 850-billion-dollar valuation. The structure is circular. Nvidia, Microsoft, and Amazon are expected to provide the capital. OpenAI will take that capital and hand it right back to those same three companies to buy chips and rent cloud servers. OpenAI projects 14 billion dollars in losses for 2026. They’re spending aggressively to capture market share, betting that inference costs will drop enough by 2029 to make the business profitable.
Venture capital is adapting through extreme concentration. Thrive X raised 10 billion dollars. Josh Kushner started with 5 million dollars in 2010 and grew it to 25 billion by making incredibly concentrated bets. He put over half a previous fund into Stripe. He led multiple rounds for OpenAI, committing billions at a time. The traditional venture model of spraying capital across fifty companies and hoping for one winner is dead at the frontier level.
Collective hyperscaler capex is approaching 700 billion dollars annually, with backlog growth exceeding capex growth. Demand is accumulating faster than capacity is deployed. Amazon’s 200-billion-dollar commitment to Trainium, data centres, and AI drew comparisons to “financial terrorism” from critics questioning the ROI. The bull case is that this capex builds infrastructure for the next decade of AI applications. The bear case is that model commoditisation means the returns accrue to users, not infrastructure investors.
The Labour Shift
The economic data is finally reflecting the AI transition. The Bureau of Labor Statistics significantly revised its 2025 job numbers, eliminating over one million jobs from initial estimates. The software and information sectors saw the steepest downward revisions.
Entry-level software engineering roles have collapsed. Companies aren’t hiring junior developers to write boilerplate code. Yet demand for senior AI engineers and orchestration specialists has spiked. The labour market is experiencing a severe K-shaped recovery. Workers who understand how to deploy AI and orchestrate agents see massive wage premiums. The premium for AI-skilled workers more than doubled from 25 per cent to 56 per cent in a single year, even as 88 per cent of workers still haven’t taken a single hour of AI training.
UPS announced closure of 22 package centres and elimination of 30,000 jobs, following 50,000 cuts the previous year. Major corporations are actively rewriting their job descriptions. IBM froze hiring for thousands of back-office roles in 2023. Now they’re hiring entry-level workers again, but the jobs are entirely different. Instead of spending thirty hours a week writing code, these workers manage AI outputs, evaluate system reliability, and interface directly with clients.
When you compress forty hours of manual research synthesis into four hours of AI orchestration, you don’t just do the same research faster. You start researching things you previously ignored. The companies spending billions on AI are now paying 775,000-dollar salaries for human communications directors. The supply of words went to infinity, which means the value of any individual word went to zero. The scarce resource is human judgement. Knowing what to say, when, to whom, and why it matters is the actual skill.
Gen Z appears to be the first cohort rationally assessing this reality. Sixty per cent are pursuing skilled trades, which may make them the first generation to recognise that white-collar “knowledge work” is more vulnerable than blue-collar “physical work” to AI displacement.
Opportunities
The 1000x cost reduction in inference makes non-tech industries viable for AI for the first time. Healthcare, agriculture, logistics, and government remain dramatically underserved by AI tooling. Companies building domain-specific copilots for regulated industries sit at the intersection of high defensibility and expanding addressable market. The current healthcare system is built on scarcity. There aren’t enough doctors, especially specialists, especially in poor countries. An AI system with genuine medical expertise, available to anyone with a smartphone, wouldn’t merely improve this system. It would bypass it entirely.
Energy infrastructure for AI represents a multi-decade investment opportunity. Data centre demand growing from four per cent to twelve per cent of US electricity drives sustained demand for generation, transmission, and cooling infrastructure. Nuclear restarts, natural gas peaking plants, and advanced cooling systems are all adjacent beneficiaries. The smarter play, according to analysts close to the space, may be in the infrastructure rather than the commodity itself. Pipeline operators, LNG terminal owners, and power-generation companies with gas-fired portfolios offer exposure to demand growth with less direct commodity price risk.
AI evaluation and safety tooling is experiencing explosive demand. As agents move to production, companies are realising that deploying non-deterministic models without strict measurement frameworks is disastrous. Evaluations are no longer a quality assurance function. They’re the core requirement for building any modern software product.
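Because the same agent can succeed on one run and fail on the next, a single passing test says little. A minimal evaluation loop instead repeats a task many times and gates release on the measured pass rate. The Python sketch below illustrates the idea; the flaky_agent stub, its 80 per cent success rate, and the 95 per cent threshold are all hypothetical values chosen for illustration.

```python
import random

def flaky_agent(task: str, rng: random.Random) -> bool:
    """Stand-in for a non-deterministic agent run; succeeds about 80% of the time."""
    return rng.random() < 0.8

def pass_rate(task: str, trials: int, seed: int = 0) -> float:
    """Run the task repeatedly and return the fraction of successful runs."""
    rng = random.Random(seed)  # seeded so the evaluation itself is repeatable
    passes = sum(flaky_agent(task, rng) for _ in range(trials))
    return passes / trials

def release_gate(rate: float, threshold: float = 0.95) -> bool:
    """Allow deployment only when the measured pass rate clears the threshold."""
    return rate >= threshold

rate = pass_rate("summarise the contract", trials=1000)
print(f"pass rate {rate:.1%}, ship: {release_gate(rate)}")
```

Seeding the random generator keeps the evaluation itself reproducible even though the system under test is not, which makes regressions between versions easier to spot.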
Apple represents an asymmetric hedge. It trades outside the AI factor, has custom silicon advantages, a two-billion-device installed base, and benefits from both the cloud AI thesis and its failure. Multiple sophisticated investors are re-entering Apple positions specifically on this reasoning. If the cloud AI thesis works, Apple benefits from on-device integration. If it fails, Apple’s core business is unaffected.
Contrarian Views
The hyperscaler capex narrative may be wrong in both directions. The dominant bearish case frames 700 billion dollars in annual capex as reckless speculation. But the backlog is growing faster than capex, meaning demand for AI infrastructure exceeds investment. If AI agents replace even ten per cent of white-collar labour within five years, the total addressable market for AI infrastructure dwarfs the investment. The market may be correctly pricing long-term value while incorrectly pricing near-term uncertainty.
Software may not be as vulnerable as the market thinks. Enterprise buying decisions are made by procurement committees, not developers. Switching costs, compliance requirements, audit trails, and integration complexity create inertia that AI tools can’t easily overcome. The oft-cited newspaper analogy is vivid but potentially misleading. Newspapers had no switching costs. Readers could simply go to Google. Enterprise SaaS has switching costs measured in years.
China’s distillation advantage may be temporary. It depends on continued access to frontier model outputs. If American labs shift to API-only access with rate limiting, watermarking, and usage restrictions, the distillation pipeline breaks. The current openness may reflect strategic naivety rather than a permanent structural feature.
The most underappreciated dynamic is what happens when the cost to produce marginal code drops to zero. The standard analysis focuses on job displacement. The second-order effect is more interesting. Every displaced engineer becomes a potential competitor, armed with the same AI tools, competing to build in domains they know well. This creates a deflationary spiral in software pricing power that extends far beyond the companies directly threatened by AI substitution.
What to Watch
The next six months will determine which of these trends become permanent fixtures of the economy. Watch the enterprise software sector for insider buying. If executives at companies like ServiceNow or Salesforce begin buying their own stock, it signals they believe the market overreacted and their proprietary data positions will hold.
Monitor the rollout of WebMCP standards. Google and Microsoft are pushing a protocol that allows websites to expose their interfaces directly to AI agents. Once websites are built for agent traffic rather than human traffic, the entire nature of web browsing changes.
Keep an eye on power grid legislation. The AI industry can’t hit its growth targets without massive electrical upgrades. State governments will soon be forced to handle the tension between residential utility ratepayers and data centres demanding gigawatts of dedicated power.
Watch hyperscaler earnings in April. Capex guidance versus revenue trajectory will determine whether the ROI thesis holds. Watch the capability curve of the models themselves. The gap between open-weight models and proprietary models is currently measured in single-digit percentages on major benchmarks. The transition from building intelligence to applying intelligence is already underway. The companies that survive will be the ones that recognise the difference.
