Apple's AI Strategy Isn't Apple Intelligence

The consensus take on Apple and AI is that they missed the boat. Siri is a punchline. Apple Intelligence shipped late and underwhelming. There's no Apple frontier model. Every quarter another headline asks whether Tim Cook fell asleep at the wheel while OpenAI, Google, and Anthropic ate his lunch.

I think the consensus is reading one product (a chatbot) and calling it a strategy. The actual strategy is visible if you look at the hardware.

The hardware tells a different story

The M5 chip ships with Neural Accelerators — dedicated matrix-multiplication units inside every GPU core. That is not the silicon you tape out reactively. That is a 3+ year design decision that says "we believe transformer inference will run on billions of personal devices, and we want to own that layer."

The performance numbers back this up. For LLM inference, M5 over M4 looks roughly like this:

  • Token generation: ~20–25% faster, tracking the memory bandwidth bump.
  • Time-to-first-token: up to 4x faster on a Qwen3-14B-4bit prompt, because TTFT is compute-bound and the Neural Accelerators light up exactly when you hit a big matmul.
  • Image generation (FLUX-dev-4bit): 3.8x faster for the same reason.
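The asymmetry in that list has a simple mechanical explanation: decode streams every weight through the memory bus once per token, so it's bounded by bandwidth over model size, while prefill is one large matmul over the whole prompt, so it's bounded by compute — exactly where the Neural Accelerators help. A back-of-envelope sketch in Python (the ~120 vs ~153 GB/s bandwidth figures and the 7GB weight size are assumptions for illustration, not measured numbers):

```python
# Back-of-envelope: why decode speed tracks memory bandwidth.
# Every generated token must stream all model weights through the
# memory bus once, so the bandwidth ceiling on decode is:
#     tokens/sec <= bandwidth / bytes_of_weights

def decode_ceiling(bandwidth_gb_s: float, params_b: float, bits: int) -> float:
    """Upper bound on tokens/sec for a memory-bandwidth-bound decode."""
    weight_gb = params_b * bits / 8  # billions of params -> GB of weights
    return bandwidth_gb_s / weight_gb

# Assumed figures: ~120 GB/s (base M4) vs ~153 GB/s (base M5),
# and a 14B model quantized to 4 bits (~7 GB of weights).
m4 = decode_ceiling(120, 14, 4)
m5 = decode_ceiling(153, 14, 4)
speedup = m5 / m4 - 1  # reduces to the bandwidth ratio

print(f"M4 ceiling: {m4:.1f} tok/s, M5 ceiling: {m5:.1f} tok/s, +{speedup:.0%}")
```

Under these assumed numbers the ceiling moves by the bandwidth ratio alone, roughly the 20–25% range in the decode numbers — no amount of extra matmul hardware changes it, which is why the 4x shows up in TTFT and not in generation.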

This is Apple's own published research, and it matters because it means the M5 isn't just a speed bump — it's the first chip where the architecture is specifically tuned for transformer math. Pair that with unified memory scaling to 128GB+ and you get a laptop that can run models that, on the NVIDIA side, would require a workstation costing many times more.
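The unified-memory point can be made concrete: a model's resident footprint is roughly quantized weights plus the KV cache, and 128GB clears bars that a 24GB discrete GPU cannot. A rough sketch (every figure here — the 70B model, the GQA shape, the 32k context — is an illustrative assumption, not a spec):

```python
# Rough resident-memory estimate for serving an LLM locally:
# quantized weights plus the KV cache, which grows with context length.
# All figures below are illustrative assumptions, not measured values.

def footprint_gb(params_b: float, bits: int,
                 layers: int, kv_heads: int, head_dim: int,
                 context_tokens: int, kv_bytes: int = 2) -> float:
    weights = params_b * bits / 8  # billions of params -> GB of weights
    # KV cache: K and V entries per layer, per token, fp16 by default.
    kv = 2 * layers * kv_heads * head_dim * context_tokens * kv_bytes / 1e9
    return weights + kv

# A hypothetical 70B-class model at 4-bit with a 32k-token context
# (80 layers with 8 KV heads of dim 128 is a common GQA shape).
need = footprint_gb(70, 4, layers=80, kv_heads=8, head_dim=128,
                    context_tokens=32_768)

fits_128gb_unified = need < 128  # Mac with 128GB unified memory
fits_24gb_gpu = need < 24        # typical high-end consumer GPU VRAM
print(f"~{need:.0f} GB needed; 128GB unified: {fits_128gb_unified}, "
      f"24GB VRAM: {fits_24gb_gpu}")
```

Under these assumptions the model needs on the order of 45–50GB resident — comfortable on a 128GB Mac, impossible on any single consumer GPU, and that's the gap the unified-memory pitch is pointing at.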

Then there's MLX, Apple's open-source array framework, purpose-built for Apple Silicon and designed from the ground up around unified memory and Metal. It's been adopted by the community fast enough that Ollama added MLX as its inference backend in preview on March 30, 2026 — with prefill throughput up 57% and decode up 93% on a Qwen3.5 benchmark vs. its previous Metal path. The framework, the chip features, the Foundation Models Swift API for app developers, and Private Cloud Compute with custom silicon — these are not the moves of a company that gave up. They're the moves of a company that decided not to fight the race everyone is watching.

The Gemini deal proves it

In January, Apple confirmed a multi-year deal with Google — reportedly around $1B per year — to license Gemini for the next generation of Apple Foundation Models and a revamped Siri. The deal grants Apple access to a custom 1.2-trillion-parameter Gemini model built specifically for Siri and Apple Intelligence, roughly eight times larger than Apple's existing 150B-parameter cloud models. The press read this as capitulation. It's not. It's the thesis stated plainly.

Apple is not going to outspend Google or OpenAI on training. Why would they? They have 2 billion devices, a chip team, the OS, and the App Store. The right move for them is to let everyone else burn the capex training frontier models, then rent distribution to whoever wins. Apple's own Foundation Models stay in the stack — but they get scoped to searching personal data, where on-device + privacy is the actual differentiator. Frontier reasoning gets outsourced. Private context stays in the house. Hardware sells either way.

Craig Federighi has reportedly told employees that the goal is AI "integrated into everything you do, not a bolt-on chatbot on the side." That is a coherent strategy. It just isn't the strategy the press has been grading them on.

Shovels, but they own the goldfield

The usual frame is "be the guy who sells the shovels in a gold rush." That's the right shape, but Apple's position is actually stronger than the analogy implies. Levi Strauss didn't also own California.

Apple sells the shovels (the M-series chips, MLX, the inference platform) and controls the goldfield where the miners are allowed to dig (the App Store, default integrations, the system-level surfaces where AI assistants meet real users). OpenAI and Anthropic have to pay rent to reach those users either way — and now, with the Gemini deal, the rent is flowing in the literal opposite direction from where everyone assumed it would.

The economics also favor this position long-term. NVIDIA's data center revenue cleared ~$195B in fiscal 2026 — concentrated in roughly a dozen hyperscaler-and-lab buyers, any one of whom can walk away from the bill by bringing its own accelerators in-house. Apple's M-series ships into ~2 billion active devices on staggered refresh cycles, with no hyperscaler price wars and no customer-concentration risk. Inference at the edge is the part of the AI economy that hasn't fully arrived yet, and Apple has been quietly tooling up for it while everyone benchmarks them against the part that already has.

The part that genuinely isn't working

This isn't all 4D chess. The internal Siri/Ajax LLM effort genuinely struggled. The first-generation architecture had to be scrapped because Apple tried to merge a legacy command system with an LLM-based one and it didn't work. John Giannandrea was reassigned in March 2025 and retired this April. Mike Rockwell (the Vision Pro guy) took over Siri — and per recent Bloomberg reporting, he's already considered stepping back or moving to an advisory role.

The Meta exodus made it worse. Ruoming Pang, who led the 100-person team building Apple's Foundation Models, left for Meta in mid-2025 on a package reportedly over $200M. Other senior researchers followed, lured by similar superintelligence-lab money while Apple was paying half of market or less. The Gemini deal is partly a strategic embrace of platform economics and partly an admission that they couldn't ship in time.

And here's where the shovel-seller analogy strains: shovel-sellers don't need a great chatbot. Apple does. Siri is the surface where regular users will decide whether "Apple AI" is good or not. Apple can win the platform war on fundamentals — the chips are real, MLX is real, the distribution is real — and still lose the narrative war if the Siri preview at WWDC 2026 (June 8) lands flat, or if the genuinely conversational version slips into late 2026 or 2027 as recent reporting suggests it might.

So: it's about shovels. The shovels are excellent. The shovels are getting better faster than most people noticed. But Apple still has to ship one specific software product that justifies the entire strategy to a public that grades them on chatbots. That's the genuinely unresolved bet — and WWDC is in a month.

The pattern

There's a misattribution here worth naming, and it's the same one I wrote about recently at the individual-engineer scale. When an engineer fails at a task with Claude Code and concludes "AI isn't ready yet," the failure is almost always in the context or the workflow, not the model. When an analyst looks at Siri and concludes "Apple missed the boat," the failure is in their model of where the value will accrue, not in Apple's strategy. The pattern is identical: name the most visible variable, declare it the constraint, miss the layer that actually matters. Engineers do it on individual tasks. Markets do it on $3T companies. The corrective is the same — ask which layer the action is happening at before deciding who's losing.
