Apple Built It for Five Years. It Lasted Five Days

May 18

What a macOS exploit tells us about the new economics of cybersecurity.

Apple spent roughly five years and what the security community estimates to be billions of dollars building a hardware-level protection called Memory Integrity Enforcement.
Introduced with the M5 chip, it works by tagging every piece of memory with a secret label. Any process that tries to access memory without the correct label is blocked and logged instantly.
Apple designed it to make entire classes of attacks, the kind that have plagued operating systems for decades, structurally impossible to execute.
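
To make the tagging idea concrete, here is a toy sketch in Python. It is not Apple's implementation, which lives in hardware and whose details are not described here. The class name TaggedMemory, the 16-byte granule, and the 16 possible tag values are all assumptions chosen for illustration. The sketch shows only the general principle: every pointer carries a tag, every region of memory carries a tag, and an access goes through only when the two match.

```python
import secrets

GRANULE = 16    # bytes covered by one tag (illustrative value, an assumption)
TAG_COUNT = 16  # number of distinct tag values (illustrative value, an assumption)


class TagMismatch(Exception):
    """Raised when a pointer's tag does not match the tag on the memory."""


class TaggedMemory:
    """Toy model of tagged memory; a pointer is an (address, tag) pair."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)  # 0 means "never allocated"
        self.log = []                        # record of blocked accesses

    def _retag(self, address, length):
        first = address // GRANULE
        last = (address + length - 1) // GRANULE
        # Avoid the current tag and the neighbours' tags so that stale
        # pointers and overflows into adjacent regions always mismatch.
        nearby = {self.tags[g] for g in (first - 1, first, last, last + 1)
                  if 0 <= g < len(self.tags)}
        candidates = [t for t in range(1, TAG_COUNT) if t not in nearby]
        tag = secrets.choice(candidates)
        for g in range(first, last + 1):
            self.tags[g] = tag
        return tag

    def allocate(self, address, length):
        """Tag a region and hand back a pointer carrying the same tag."""
        return (address, self._retag(address, length))

    def free(self, pointer, length):
        """Retag the region so any pointer kept from before stops working."""
        self._retag(pointer[0], length)

    def _check(self, pointer, offset):
        address, tag = pointer
        target = address + offset
        if self.tags[target // GRANULE] != tag:
            self.log.append((target, tag))  # blocked and logged
            raise TagMismatch(f"tag mismatch at address {target}")
        return target

    def read(self, pointer, offset=0):
        return self.data[self._check(pointer, offset)]

    def write(self, pointer, offset, value):
        self.data[self._check(pointer, offset)] = value
```

In this model, reading past the end of one allocation into its neighbour raises TagMismatch, and so does reading through a pointer after its region has been freed. Those two cases correspond to the buffer overflow and use-after-free bug classes that tagging is meant to shut down.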

A team of three researchers bypassed it in five days.

The team works at Calif, a small Palo Alto-based security research firm. They did not do it alone.
Working alongside them was an early preview version of Anthropic’s Claude Mythos, a model so capable at finding software vulnerabilities that Anthropic has not released it publicly.
The story of how this happened, and what it actually demonstrates about AI in security research, is more nuanced and more important than most of the coverage has made it sound.

What Happened

In April 2026, the Calif team began testing Mythos Preview through Anthropic’s Project Glasswing, a controlled initiative that gives select security organizations access to the model for defensive research.
During that testing, Mythos identified two previously unknown vulnerabilities in macOS.
The vulnerabilities on their own were not enough. What the team built was an exploit chain: a sequence that linked the two bugs with additional techniques to corrupt the Mac’s kernel memory and ultimately deliver a root shell (full administrative access to the device) from a standard unprivileged user account. Critically, the chain survived Memory Integrity Enforcement, which had previously disrupted every public exploit against modern iOS.

Calif disclosed the findings to Apple in person at Cupertino rather than through the standard submission queue, and will publish a full 55-page technical writeup only after a patch ships.
According to Calif’s research, MIE disrupts every public exploit chain against modern iOS, including recently leaked exploit kits.

What the AI Actually Did and Did Not Do

This is where the story gets more interesting and calls for more care.

Mythos Preview is powerful: once it has learned how to attack a class of problems, it generalizes to nearly any problem in that class.
Mythos discovered the bugs quickly because they belong to known bug classes.
That is genuine capability. The model scanned for vulnerability patterns it had learned during training and found matches fast, reducing what might otherwise have taken weeks of manual review to a matter of days.

But Mythos did not design the exploit chain. It did not figure out how to bypass MIE.
That part required human expertise, specifically the kind of deep, intuitive understanding of novel security mitigations that comes from years of hands-on offensive research.
Calif CEO Thai Duong was explicit that the attack could not have been pulled off by Mythos alone and required the very human cybersecurity expertise of the firm’s researchers.

The distinction matters. What Mythos contributed was speed and pattern recognition within known territory. What the humans contributed was judgment in unknown territory.
MIE was new enough that no training data could have taught the model how to defeat it. That gap still required a person.

Why Five Days Against Five Years Is the Real Story

The more significant point here is not the five days themselves. It is the asymmetry.
Defenders build mitigations over years, with large teams, enormous budgets, and the constraint of shipping software that millions of people use reliably every day.
Attackers have always operated under fewer constraints, but historically they also needed significant time and expertise to find and chain vulnerabilities in hardened systems.

Anthropic began offering the preview version of Mythos in April, after internal testing suggested the model could autonomously identify and exploit software vulnerabilities at a level beyond previous public AI models.
Rather than release it publicly, Anthropic restricted access to select technology companies, banks, and researchers under Project Glasswing.
The reasoning is straightforward: if a small team with access to the model can reduce five years of defensive engineering to a five-day research sprint, the same capability in the wrong hands would be genuinely dangerous.

The Calif team put it plainly in their own writeup. Apple’s MIE was built in a world before Mythos Preview existed.
The security landscape is changing fast enough that mitigations designed under the old assumptions are going to be tested hard under the new ones.

What This Means for Everyone Else

For people who use Macs, the immediate practical answer is: wait for the patch and install it. Apple is reviewing the report and a fix is expected soon.
The exploit requires local access to the machine, which means a remote attacker cannot use it without first getting onto your device through other means. It is serious but not an emergency for most users.

For the broader security industry, the implications run deeper.
Anthropic’s AI previously identified more than 100 high-severity vulnerabilities in Mozilla Firefox over a two-week period, a pace significantly faster than traditional vulnerability discovery methods.
These are not isolated demonstrations. They suggest that AI-assisted security research is compressing timelines across the board, for defenders and attackers alike.

The good news is that the same capability that helps find vulnerabilities faster also helps patch them faster.
The Calif disclosure and Apple’s response are the system working as intended. The harder question is what happens when similar tools reach actors who are not interested in responsible disclosure.

The Honest Takeaway

What Calif demonstrated is not that AI can hack systems autonomously.
It demonstrated something more specific and arguably more important: that a small team of skilled humans, paired with a capable AI model, can now tackle security research that previously required either much more time or much larger teams.
That changes the economics of offensive security research in ways the industry is still working out. It also changes what defenders need to build for: not just the attacks that exist today, but the speed at which new ones will arrive tomorrow.
Small teams can now make discoveries like this, and we are about to learn how the best mitigation technology on Earth holds up during the first AI bugmageddon. That framing is dramatic, but the underlying point is sound.
The pace is accelerating. The five-day sprint is not the ceiling. It is closer to a starting point.

Suzanne M