The baseline just moved. Permanently.
Vulnerability research has always been asymmetric. Attackers invest deeply in a single target; defenders spread thin across thousands of CVEs, patches, and alerts. That imbalance was at least bounded by human capacity — by the hours a researcher could put into reverse engineering an obscure subsystem, tracing a data flow through a million lines of C, or manually constructing a fuzzing harness for a codec library nobody has touched in a decade.
That constraint is dissolving. This week Anthropic announced Project Glasswing and the preview of Claude Mythos, a frontier model with security capabilities that don't fit comfortably inside the existing mental model of "AI-assisted research." Mythos isn't assisting. In the cases Anthropic documented, it's working autonomously — finding vulnerabilities that survived decades of human review and standard automated tooling, then building working exploits against them.
The shift isn't from slow to fast. It's from static scanning to agentic discovery. That's a different category.
What Mythos actually found
Anthropic has been direct about the results. Over the past few weeks of internal testing, Claude Mythos Preview autonomously identified thousands of high-severity zero-day vulnerabilities spanning every major operating system and browser. Three findings in particular illustrate what makes this model different from anything that came before it.
01. A 27-year-old crash vulnerability in OpenBSD
OpenBSD is the security-first BSD variant: its developers have spent three decades treating conservative code review as a core product property. Mythos found an integer overflow in the TCP SACK implementation that has been present since the project's early days. The flaw allows a remote attacker to crash any OpenBSD host responding over TCP. No authentication. No prior access. A 27-year-old bug in code that has been reviewed by some of the most security-focused developers in the open-source ecosystem.
Remote DoS · 27 years undetected
02. A 16-year-old critical flaw in FFmpeg, invisible to fuzzers
FFmpeg is among the most heavily fuzz-tested codebases in existence. It processes untrusted media from every corner of the internet and has been a high-value fuzzing target for years. Mythos discovered a 16-year-old vulnerability in FFmpeg on a code path that fuzzers had exercised 5 million times without triggering the flaw. The flaw wasn't reachable through random input mutation; it required reasoning about the semantic conditions under which the code path becomes exploitable. That's not something coverage-guided fuzzing finds. It's something a model that understands code semantics finds.
Critical · Survived 5M fuzzing passes
03. Autonomous Linux kernel privilege escalation chains
Mythos autonomously developed local privilege escalation exploits on Linux by chaining subtle race conditions with KASLR bypasses, constructing a coherent exploit path from bugs that, in isolation, would each be difficult to weaponize. This is the class of work that in previous generations required a senior kernel exploitation researcher with weeks of availability. Mythos assembled the chain autonomously, from identification to working exploit.
Local Privesc · Autonomous chain construction
To be precise: these aren't benchmark results on synthetic tasks. This is what the model did against real production software during Anthropic's own internal testing. Also disclosed alongside the announcement was a 17-year-old FreeBSD NFS remote code execution flaw (CVE-2026-4747) that Mythos found and exploited, giving unauthenticated root from the internet. The scope of what Anthropic is sitting on from those few weeks of testing is significant enough that they've decided not to release the model publicly at all.
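The FFmpeg finding turns on the gap between executing a line and reaching the state that makes it dangerous. The toy sketch below illustrates that gap only; it does not model the actual FFmpeg bug, and every name in it (`decode`, `TABLE_SIZE`, the header layout) is hypothetical. The decoder is branch-free, so any single input saturates line and branch coverage, yet the fault needs a relation between two fields that random 8-byte inputs satisfy with probability of roughly 1 in 2^32.

```python
import random

TABLE_SIZE = 16  # hypothetical table size for the toy decoder


def decode(header: bytes) -> int:
    """Toy decoder: writes one table slot and returns the index it used."""
    declared = int.from_bytes(header[0:4], "big")
    actual = int.from_bytes(header[4:8], "big")
    table = [0] * TABLE_SIZE
    # Branch-free on purpose: every input executes these same lines, so
    # coverage feedback has nothing new to report after the first input.
    index = (declared % TABLE_SIZE) + TABLE_SIZE * (declared == actual)
    table[index] = 1  # raises IndexError only when declared == actual
    return index


def random_campaign(runs: int, seed: int = 0) -> int:
    """Feed random 8-byte headers and count how many trigger the fault."""
    rng = random.Random(seed)
    faults = 0
    for _ in range(runs):
        header = bytes(rng.getrandbits(8) for _ in range(8))
        try:
            decode(header)
        except IndexError:
            faults += 1
    return faults


def crafted_input(value: int = 7) -> bytes:
    """Build an input from the condition read off the code: both fields equal."""
    field = value.to_bytes(4, "big")
    return field + field


if __name__ == "__main__":
    print("faults from random inputs:", random_campaign(100_000))
    try:
        decode(crafted_input())
    except IndexError:
        print("crafted input triggers the fault on the first try")
```

Real fuzzers are more capable than this random loop (comparison-aware mutation can solve simple equalities like this one), so treat the sketch as an illustration of covered-but-untriggered code, not as a claim about any specific fuzzer.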
Project Glasswing: moving remediation upstream
The decision not to release Mythos publicly is itself a statement. Anthropic's reasoning is explicit: the model's offensive capabilities are potent enough that broad access would accelerate attackers faster than it would help defenders. Instead, they've launched Project Glasswing — a controlled initiative giving a vetted coalition of partners early access to Mythos specifically for pre-emptive defensive work.
The founding partner list reads like a who's who of critical infrastructure ownership.
Roughly 40 additional organizations responsible for building or maintaining critical software infrastructure have also been granted access. The mandate is to use Mythos to find and fix vulnerabilities in their own systems before threat actors — with or without equivalent models — find them first.
The structural logic is sound: if an AI model can autonomously discover and exploit vulnerabilities at this scale, the organizations best positioned to respond are the ones who own the code and can deploy patches. Glasswing is an attempt to put the discovery capability in the hands of maintainers before it lands in the hands of adversaries. The window between those two events is narrowing. The goal is to do as much remediation as possible inside that window.
Why attack chains are the right unit of analysis now
The Linux kernel finding is the one that demands the most attention from practitioners. Not because privilege escalation is novel — it isn't — but because of how Mythos got there: by linking individually minor vulnerabilities into a coherent chain. A race condition that's hard to win reliably. A KASLR bypass that requires a prior information leak. Two bugs that together constitute a full local root.
This is exactly the pattern that breaks vulnerability management programs built around individual CVE triage. If your process is to look at each CVE in isolation, score it on CVSS, and prioritize by severity, you will correctly identify both of those bugs as medium-severity findings and schedule them for the next patch cycle. You will not see the path they form together.
An AI agent reasoning about your attack surface doesn't triage CVEs one by one. It builds a model of reachability — what components are exposed, what trust transitions are available, how data flows between subsystems — and finds the paths. This is how Mythos assembled that Linux chain. It's also how a well-resourced adversary with access to comparable tooling will approach your infrastructure.
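A minimal sketch of that difference, assuming findings can be modeled as edges between attacker states. The states, finding descriptions, and CVSS numbers here are invented for illustration and do not describe the real Linux chain. Per-finding triage looks only at the highest individual score; the path search asks whether the findings connect a starting state to a goal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    src: str       # attacker state required before the step
    dst: str       # attacker state gained after the step
    finding: str   # hypothetical finding description
    cvss: float    # hypothetical individual score


# Illustrative data only: three medium-severity findings (CVSS 4.0 to 6.9).
STEPS = [
    Step("unprivileged-shell", "kernel-layout-known", "info leak defeating KASLR", 5.5),
    Step("kernel-layout-known", "root", "race condition in a kernel subsystem", 6.4),
    Step("unprivileged-shell", "debug-log-read", "verbose debug log exposure", 4.3),
]


def find_chains(steps, start, goal):
    """Enumerate every simple path of findings leading from start to goal."""
    by_src = {}
    for step in steps:
        by_src.setdefault(step.src, []).append(step)

    chains = []

    def walk(node, path, seen):
        if node == goal:
            chains.append(list(path))
            return
        for step in by_src.get(node, []):
            if step.dst not in seen:  # avoid revisiting a state (no cycles)
                path.append(step)
                walk(step.dst, path, seen | {step.dst})
                path.pop()

    walk(start, [], {start})
    return chains


if __name__ == "__main__":
    print("highest individual score:", max(s.cvss for s in STEPS))
    for chain in find_chains(STEPS, "unprivileged-shell", "root"):
        print("chain to root:", " -> ".join(s.finding for s in chain))
```

In this toy data no single finding scores above medium, yet two of the three connect into a path that ends at root, while the third leads nowhere.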
The defense that holds against this is not faster individual CVE patching, though patching speed still matters at the margins. It's architectural resilience: understanding the actual paths between your exposed surface and your critical assets, identifying the chokepoints where a chain can be broken, and hardening those chokepoints structurally — not just patching the individual steps.
- CVSS scores describe individual steps. A race condition is medium severity. A KASLR bypass is medium severity. The chain they form is root. CVSS was never designed to score paths, only points.
- MITRE ATT&CK is the shared vocabulary. When Mythos constructs an exploit chain, it's executing a sequence of techniques that map to ATT&CK. Understanding which techniques apply to your CVEs tells you whether your compensating controls actually break the chain or just address one step.
- Chokepoints require the full graph. You can't identify where to invest in architectural hardening without seeing the complete path from initial access to impact. Segmentation, privilege boundaries, and network controls only matter if you know which paths they intersect.
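One way to make the chokepoint idea concrete, as a sketch under stated assumptions: treat the environment as a directed graph and call a node a chokepoint if blocking it alone cuts every path from the exposed surface to the critical asset. The topology and node names below are hypothetical, and a real environment would need a far richer model of trust transitions.

```python
from collections import deque

# Hypothetical topology: two exposed entry points converge before the asset.
GRAPH = {
    "internet": ["web-app", "vpn-gateway"],
    "web-app": ["app-server"],
    "vpn-gateway": ["app-server"],
    "app-server": ["db-proxy"],
    "db-proxy": ["customer-db"],
    "customer-db": [],
}


def reachable(graph, start, goal, blocked=frozenset()):
    """Breadth-first search that treats blocked nodes as removed."""
    if start in blocked or goal in blocked:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return False


def chokepoints(graph, start, goal):
    """Nodes whose removal alone disconnects start from goal.

    Only nodes that appear as keys in the graph are considered.
    """
    candidates = [n for n in graph if n not in (start, goal)]
    return [n for n in candidates
            if not reachable(graph, start, goal, blocked={n})]


if __name__ == "__main__":
    print(chokepoints(GRAPH, "internet", "customer-db"))
```

In this toy graph, hardening `web-app` alone leaves the `vpn-gateway` route open, while `app-server` and `db-proxy` each sit on every path. That is the sense in which a control only matters if you know which paths it intersects.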
VulnPath exists at exactly this layer. The shift Glasswing represents — from AI-assisted scanning to AI-driven autonomous chain discovery — doesn't make CVE-by-CVE analysis obsolete. It makes attack chain analysis urgent. Because your adversary is no longer limited to the chains a skilled human researcher can construct in the time they have available. The chains are being found at scale, autonomously, and patched by defenders inside Glasswing while the rest of the industry waits for the CVE to drop.
The commoditization timeline just accelerated
The security industry has anticipated AI-driven exploit development for years. The conversation has generally been forward-looking: what happens when models get good enough, when the compute is cheap enough, when the scaffolding is reliable enough. Glasswing and Mythos are a signal that we've crossed at least one version of that threshold. The model is good enough that Anthropic considers it too dangerous to release. The capabilities are real enough that they're pre-patching critical infrastructure rather than waiting to see what happens.
That doesn't mean human exploit researchers are obsolete. Novel zero-day discovery against hardened proprietary targets (finding the first bug in a class, building primitives where no public primitives exist) remains genuinely hard in ways current models handle poorly. But the long tail of unpatched, decades-old vulnerabilities in widely deployed open-source software is apparently enormous, and it is now accessible to a model.
The cat-and-mouse game between offense and defense has always been accelerating. What changes when both sides have access to AI is the cycle time. Glasswing is Anthropic's attempt to give the defensive side a head start by controlling who holds the model during the window when it's most asymmetrically powerful. That window won't last. Equivalent capability will become more widely available — it always does.
The security programs that adapt are the ones already building dynamic, architectural resilience rather than waiting to react to the next CVE. Understanding attack chains isn't a competitive advantage anymore. It's the baseline.