The Day the Cost Curve Broke
How an unreleased Anthropic model just changed the economics of finding software bugs, and why eleven of the world’s largest companies signed up to keep it locked away.
On April 8, 2026, Anthropic announced something that doesn’t fit cleanly into any of the buckets the tech press is used to. It wasn’t a product launch. It wasn’t a research paper. It wasn’t quite a policy announcement, though it borrowed pieces from all three.
What they announced was Project Glasswing, a coalition with Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. The point of the coalition is to use one unreleased model, Claude Mythos Preview, to find and patch vulnerabilities in critical software before attackers find them first.
Anthropic is putting in $100 million in usage credits and another $4 million in donations to open-source security organizations. And they are explicitly not releasing the model to the public. No frontier AI lab has done that before.
I want to take this seriously, separate the marketing from the substance, and explain why people who have spent decades inside the security community are reacting with a mix of awe and discomfort.
The headline you can’t ignore
Mythos has already discovered thousands of zero-day vulnerabilities across every major operating system and every major web browser. Three of those finds are worth sitting with.
The first is a 27-year-old bug in OpenBSD. OpenBSD is the operating system you pick because you trust it. The vulnerability Mythos found let an attacker remotely crash any machine running it just by connecting. It survived nearly three decades of code review by some of the most paranoid programmers alive.
The second is a 16-year-old bug in FFmpeg, the video library most of the internet runs on top of. The bug lived on a line of code that automated fuzzing tools had hit more than five million times without ever flagging it.
The third is a chain of vulnerabilities in the Linux kernel that Mythos found and weaponized on its own. It went from a normal user account to full control of the machine without any human guidance.
All three are now patched. Anthropic is publishing cryptographic hashes of additional bugs today and will reveal the specifics once fixes ship.
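The hash-then-reveal approach Anthropic describes is a standard commitment scheme: publish a digest of each finding now, disclose the full report once a fix ships, and anyone can verify the report matches what was committed. A minimal sketch in Python; the report string is invented for illustration, and a random salt is included because short, guessable reports could otherwise be brute-forced from the digest alone:

```python
import hashlib
import secrets

def commit(report: str) -> tuple[str, str]:
    """Commit to a vulnerability report without revealing it.
    The random salt prevents brute-forcing short or guessable reports."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + report).encode()).hexdigest()
    return digest, salt  # publish digest now; keep salt + report private

def verify(digest: str, salt: str, report: str) -> bool:
    """After the fix ships, reveal salt + report; anyone can check the match."""
    return hashlib.sha256((salt + report).encode()).hexdigest() == digest

digest, salt = commit("hypothetical: heap overflow in example_parser()")
assert verify(digest, salt, "hypothetical: heap overflow in example_parser()")
assert not verify(digest, salt, "a different report")
```

The point of committing early is timestamping: it proves the finding existed on announcement day without handing attackers a roadmap before patches land.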
The number that should stop you cold
There’s one statistic in the Frontier Red Team’s technical report that, once you absorb it, reframes everything else.
The 27-year-old OpenBSD vulnerability was discovered for about $50 in inference costs. Less than $20,000 in total spend produced “several dozen” critical findings across major systems.
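To make that arithmetic concrete, here is the reported spend expressed as cost per finding. The report gives only "several dozen," so reading that as roughly 36 is an assumption for illustration, not a figure from the announcement:

```python
# Figures from the announcement; "several dozen" read as ~36 findings,
# which is an illustrative assumption, not a number from the report.
total_spend = 20_000      # dollars, stated upper bound on inference spend
critical_findings = 36    # hypothetical midpoint for "several dozen"

cost_per_finding = total_spend / critical_findings
print(f"~${cost_per_finding:,.0f} per critical finding")  # ~$556

# For comparison: market prices for single zero-days in hardened targets
# have historically run from tens of thousands to millions of dollars.
```

Even if "several dozen" means 24 rather than 36, the per-finding cost stays under $1,000, which is the collapse the next two paragraphs describe.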
For decades, the moat protecting legacy software from sophisticated attackers was the human cost of finding bugs in it. Elite vulnerability researchers were rare. They were expensive. The systems most worth attacking were also the ones hardened by years of accumulated review. Defense held because offense was costly.
That equilibrium has dissolved. The moat hasn’t been eroded by a new generation of attackers. It’s been deleted by a price collapse.
How big is the jump, really?
Skepticism is healthy. Vendors hype models. So the right question is: how much better is Mythos than what we already had six months ago?
The Frontier Red Team report compares Mythos to Claude Opus 4.6 on a Firefox JavaScript engine exploitation benchmark. Opus 4.6 produced two working exploits in hundreds of attempts. Mythos produced 181 working exploits, plus 29 more that achieved register control.
On the OSS-Fuzz corpus, Mythos achieved full control-flow hijack on ten separate, fully patched targets. Opus 4.6 managed nothing beyond isolated crashes at the equivalent tier.
On the standard CyberGym vulnerability reproduction benchmark, Mythos scored 83.1% versus Opus 4.6 at 66.6%. On SWE-bench Verified, Mythos hit 93.9% versus 80.8%. On Terminal-Bench 2.0, the gap was 82% to 65.4%, climbing to 92.1% with extended timeouts.
These aren’t iterative improvements. They’re the difference between an interesting research direction and an industrial vulnerability factory.
The voices you should actually listen to
You don’t have to take Anthropic’s word for any of this. The most important confirmation is coming from working security researchers, and it’s louder than the vendor announcement.
Nicholas Carlini, one of the most respected security researchers in the world, summed up his recent work with the model in a quote that’s been circulating since the announcement: “More bugs in the last couple of weeks than I found in the rest of my life combined.”
Greg Kroah-Hartman, the Linux kernel maintainer who has been receiving low-quality AI bug reports for years and pushing back on them publicly, said: “A month ago, the world switched.” His inbox went from noise to real reports.
Simon Willison, who built his reputation by being unimpressed with most AI vendor claims, called the restricted release “a reasonable trade-off.”
When Carlini, Kroah-Hartman, and Willison are independently telling you the same thing, the inflection point is real.
Why “don’t release it” is the most interesting choice
For two years, the consensus among frontier AI labs has been that public release is the responsible path. The reasoning: attackers will get the capability eventually, so defenders need it now, and openness produces the most resilient ecosystem. It’s a coherent position. It’s the position Google took with Big Sleep and CodeMender, and OpenAI has taken with most of its work.
Anthropic just broke ranks.
The bet is unusual. They're arguing that capability can be confined to trusted hands just long enough to harden the systems attackers will eventually target. The 90-day reporting window is the giveaway. Anthropic isn't claiming Mythos will stay restricted forever. They're claiming they need about three months to patch enough critical infrastructure that the eventual proliferation matters less.
Whether this works depends on one variable. If a competing lab (OpenAI, Google DeepMind, xAI, Meta, or one of the Chinese frontier labs) announces a comparable model inside that 90-day window, the strategy collapses immediately. If they don’t, this becomes the template for governing frontier capability for the next several years.
It’s the first real-world test of “responsible non-release,” and the answer matters more than any individual benchmark result.
The coalition is the policy
The most interesting thing to do with the Glasswing partner list is read it twice. Once for who’s in the room. Once for who isn’t.
In the room: every layer of the modern computing stack. Silicon (NVIDIA, Broadcom). Cloud (AWS, Google, Microsoft). Endpoints and security platforms (CrowdStrike, Palo Alto Networks, Cisco). Operating systems and core OSS (Apple, the Linux Foundation). Finance (JPMorganChase). The additional 40+ organizations Anthropic is extending access to are largely critical infrastructure operators.
Not in the room: OpenAI. Meta. Bitcoin Core. Any Chinese firm. Any government. The Ethereum Foundation, whose representative immediately tweeted asking for access.
This isn’t a customer list. It’s a defensive alliance, closer in shape to NATO than to a vendor’s customer roster, with coordinated disclosure, shared tooling, and joint reporting. The post-preview pricing of $25 per million input tokens and $125 per million output tokens is roughly four times standard frontier pricing. That’s a capability tax, not a revenue play.
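At those rates a long agentic session is cheap in absolute terms but expensive relative to standard pricing. A rough sketch of the math; the session sizes and the "standard frontier" baseline of $6.25/$31.25 per million tokens are illustrative assumptions derived from the "roughly four times" claim:

```python
def session_cost(in_tokens: int, out_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Cost in dollars for one session, rates in $ per million tokens."""
    return (in_tokens * in_rate + out_tokens * out_rate) / 1_000_000

# Glasswing post-preview pricing from the announcement
mythos = session_cost(5_000_000, 1_000_000, 25, 125)
# Hypothetical "standard frontier" baseline at one quarter of those rates
baseline = session_cost(5_000_000, 1_000_000, 6.25, 31.25)

print(f"Mythos: ${mythos:.2f}, baseline: ${baseline:.2f}")
# Mythos: $250.00, baseline: $62.50
```

A $250 session that yields a critical finding is still trivially cheap against what the finding is worth, which is why the markup reads as friction for casual use rather than a profit center.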
The composition is the political shape of the project. The shape, more than the model, is what we’ll be debating six months from now.
Open source finally gets funded, barely
Jim Zemlin of the Linux Foundation said the most strategically honest thing in the entire announcement: “Security expertise has been a luxury reserved for organizations with large security teams. Open source maintainers, whose software underpins much of the world’s critical infrastructure, have historically been left to figure out security on their own.”
For two decades the technology industry has treated open source as free infrastructure. Every Fortune 500 company has a dependency tree dominated by the unpaid work of a few hundred maintainers, and the relationship between value extracted and value returned has been embarrassing. Mythos exposes how dangerous that asymmetry has become. The same code Anthropic’s own agents are written on top of can be torn apart for $50 a target by anyone with the right model.
The $4 million Anthropic is putting into OpenSSF, Alpha-Omega, and the Apache Software Foundation is real money for organizations that have been operating on a shoestring. It’s also overdue, and small relative to the scale of the problem. If Glasswing produces nothing else, the precedent of frontier labs materially funding the maintainers their models depend on would itself be valuable.
The honest caveats
I’d be doing you a disservice if I didn’t name what this announcement isn’t.
“Defensive only” is a framing, not a property of the model. The same Mythos that finds the bug also writes the exploit. The Frontier Red Team report describes the model autonomously producing working ROP chains, KASLR bypasses, cross-cache reclamation attacks, and browser sandbox escapes. The only thing separating defensive use from offensive use is whose hands hold the API key.
Insider risk is now a frontier-AI policy problem. Eleven coalition partners plus 40 additional organizations plus thousands of OSS maintainers equals a very large set of credentialed humans with access to the most capable offensive AI ever publicly described. The threat model has expanded from “external attackers” to “any compromised employee at any partner organization.”
Anthropic’s own operational security is part of the story. Crypto Briefing pointed out, fairly, that Mythos arrives weeks after a Claude Code leak that exposed an internal security lapse. A lab claiming to harden the world’s software is implicitly asking for trust on its own ability to contain the most dangerous model it has ever built.
The 90-day report is the only real accountability mechanism. Everything else can be quietly walked back if the political winds shift or competitors force their hand. The report can’t be. It’s the piece that determines whether this becomes a credible governance experiment or a marketing moment.
What you should actually do
Different audiences should take different things away from this.
If you’re an open source maintainer: apply for access through the Claude for Open Source program. A $100 million security audit from one of the best models on earth, paid for, is not something you turn down because you have philosophical reservations about the vendor. Use it, fix what it finds, and document the process for everyone else.
If you run an enterprise security team: your bug bounty pipeline is about to be flooded, from Glasswing partners and from independent researchers using whatever model comes next. Triage capacity is the bottleneck. Anthropic explicitly listing “triage scaling and automation” in its planned recommendations is the strongest tell that this is the choke point everyone is about to hit.
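One concrete piece of "triage scaling" is deduplication: most of a flooded pipeline is the same root cause reported many times. A minimal sketch of signature-based clustering; the report format, field names, and stack traces are invented for illustration:

```python
import hashlib
from collections import defaultdict

def crash_signature(stack_frames: list[str], depth: int = 3) -> str:
    """Group reports by the top N frames of their crash stack --
    a common heuristic for clustering duplicates of one root cause."""
    key = "|".join(stack_frames[:depth])
    return hashlib.sha256(key.encode()).hexdigest()[:12]

def triage(reports: list[dict]) -> dict[str, list[dict]]:
    """Bucket reports so humans review one representative per cluster."""
    buckets: dict[str, list[dict]] = defaultdict(list)
    for r in reports:
        buckets[crash_signature(r["stack"])].append(r)
    return buckets

# Hypothetical reports: two share a root cause, one is distinct.
reports = [
    {"id": 1, "stack": ["parse_header", "read_chunk", "main"]},
    {"id": 2, "stack": ["parse_header", "read_chunk", "main", "start"]},
    {"id": 3, "stack": ["free_buffer", "cleanup", "main"]},
]
clusters = triage(reports)
print(len(clusters))  # 2 clusters instead of 3 raw reports
```

Signature-based dedup is decades-old fuzzing hygiene; the new problem is that the inbound volume it has to absorb is about to jump by orders of magnitude.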
If you build security products: the platform layer just shifted under you. If your moat is traditional vulnerability scanning, expect compression. If your moat is coordinating remediation across complex systems with humans in the loop, expect expansion. The next 18 months will sort the categories.
If you build on top of OSS without contributing back: this is the last warning the industry will get. The asymmetry between value extracted and value returned has just become a national security problem.
If you care about AI policy: watch the next 90 days more closely than you’ve ever watched any AI announcement. Anthropic is running a real experiment with real downside, and the result will shape the governance debate for years.
The artifact
The era when offense and defense had similar economics is over. That’s the headline.
But the story that will be told about April 8, 2026 isn’t really about a model or a coalition or a policy. It’s about a single artifact: a 27-year-old bug in OpenBSD, found for $50 in inference costs, invisible to a generation of human reviewers and millions of automated test cycles, sitting in code that powers some of the most carefully maintained infrastructure on earth.
That bug is the moment the cost curve broke. The rest of the industry is still catching up to what that means.
Takeaways
- The economics of finding software vulnerabilities just collapsed by orders of magnitude. The defensive moat protecting legacy code has effectively dissolved.
- Anthropic is the first frontier lab to refuse public release of a model on cybersecurity grounds. It’s a real test of “responsible non-release” governance.
- The Glasswing coalition is closer to a defensive alliance than a customer list, and its composition is the political shape of the project.
- Open source maintainers are finally being treated as critical infrastructure. It’s overdue and barely sufficient.
- The 90-day public report is the only accountability mechanism in the announcement. Watch that, not the press releases.
- “Defensive only” is a framing, not a property. The same model finds bugs and writes exploits. The only thing separating the two is who holds the API key.
