AI Found Thousands of Software Bugs This Month. Then It Wrote Exploits.

John Z Black Mar 10, 2026 AI & Security #ai-security #vulnerability-research #firefox #openssh #chromium #anthropic #openai #exploit

An AI found 22 unknown Firefox vulnerabilities in two weeks. Then it wrote working exploits for two of them.

That’s not a hypothetical. Anthropic disclosed it last week.

They pointed Claude Opus 4.6 at nearly 6,000 C++ source files in Firefox. Out of 112 reports, 22 were confirmed real. Fourteen rated high severity. One use-after-free bug? Detected in 20 minutes.

For context, those 14 high-severity finds represent almost a fifth of all high-severity Firefox bugs patched in all of 2025. One model. Two weeks.

And then Anthropic made it weirder. They fed the bugs back to Claude and asked it to write exploits. Several hundred attempts at about $4,000 in API credits. Two worked, including one for a CVSS 9.8 JIT flaw in WebAssembly. The exploits were crude and only worked without Firefox’s sandboxing. But crude isn’t the same as impossible. And $4,000 is pocket change.

Meanwhile, OpenAI’s Codex Security tool scanned 1.2 million open-source commits during its beta. The haul: 792 critical findings, 10,561 high-severity ones. Affected projects include OpenSSH, Chromium, PHP, GnuTLS, and GnuPG. Real CVEs came out of this. Real patches followed.

Codex doesn’t just flag bugs. It builds a threat model of the repo, identifies vulnerabilities using that context, then proposes patches in a sandbox. It’s closer to an automated security researcher than a scanner with a chatbot bolted on.

So here’s the uncomfortable part. If AI can find 22 Firefox bugs in two weeks, it can find bugs in anything. The capability works in both directions. Right now, discovery is way ahead of exploitation. Models are better at finding bugs than writing reliable exploits against modern defenses like ASLR and sandboxing.

But discovery is the expensive part. That’s what takes human researchers months. Writing the exploit once you know exactly where the bug lives? That’s a more bounded problem. And bounded problems are exactly what AI gets better at over time.

The economics shifted. If a nation-state can run this at $4,000 per attempt, the old barriers of time and labor cost don’t hold the way they used to.

If you use Firefox: Update to 148. Now. Not next maintenance window. The bugs Claude found are real and patched.

If you run anything built on OpenSSH, PHP, GnuTLS, or Chromium: Watch for the CVEs from Codex Security’s beta. Patches are rolling out.

If you’re a security leader: AI-powered vulnerability research isn’t a future threat. It has CVE numbers attached to it today. Model it accordingly.

This round went well. Anthropic disclosed responsibly. Mozilla patched. OpenAI coordinated with maintainers. But the machinery is visible now, and it’s only going to get faster.

Read the full story on gNerdSEC