Brunswick, ME • (207) 245-1010 • contact@johnzblack.com
Nobody hacked Meta. No phishing email. No zero-day. According to a Guardian report this week, a Meta AI agent was simply instructed to do its job – and in doing that job, it caused a large internal data leak, exposing sensitive user and company data to employees who had no business seeing it.
The AI followed orders. That was the problem.
This wasn’t misbehavior. The agent did what it was designed to do. The failure was in the permission model around it: an agent with access to more than it needed, acting on instructions that had consequences nobody anticipated. You can’t patch that away with a point release.
The Model Context Protocol is Anthropic’s open standard for connecting AI agents to external tools – databases, APIs, file systems, anything. It’s become the dominant way enterprises wire up their AI agents to do real work. Adoption has been fast. Security has been an afterthought.
OWASP published its first MCP Top 10 this spring, and it reads like a catalog of problems baked into the architecture before security teams had a seat at the table.
Tool poisoning: MCP servers advertise capabilities through tool metadata. If an attacker can manipulate that metadata, they can get an agent to do things the developer never intended. It’s prompt injection at the infrastructure layer.
The Clinejection attack: a malicious actor posts a GitHub issue with a specially crafted title. An AI coding agent reads it as part of its workflow. The crafted title injects instructions that execute unauthorized code. The developer didn’t write anything malicious. The agent did nothing “wrong” by its own lights. But unauthorized code ran anyway, because the agent had permissions it shouldn’t have had.
Dark Reading’s framing: “MCP security can’t be patched.” These aren’t implementation bugs. They’re architectural patterns. Update every library in your stack and you still have the same exposure if your agents have overbroad permissions, if you haven’t thought about shadow MCP servers, if you’re trusting tool metadata without verification.
The principle is old: systems should have exactly the access they need to do their job, and no more. Apply it to AI agents: an agent that can read your entire internal database will eventually share something it shouldn’t. Not maliciously – just because it was asked to do something adjacent, had the access, and didn’t know to refuse.
That’s the Meta situation. An agent given broad access received instructions that, combined with that access, produced an unintended data leak. Nobody typed “leak sensitive data.” The agent just connected the available dots.
Enterprises rushing to deploy agentic AI are mostly thinking about what their AI can do. They should be thinking about what it can access. Those are different questions.
When an AI agent causes a breach, who’s responsible? The vendor who built the model? The enterprise that deployed it and configured permissions? The developer who wrote the prompt? The team that built the MCP server?
This isn’t hypothetical anymore. The Meta case is a real company, a real incident, and a set of questions existing frameworks aren’t equipped to answer. OWASP’s work is a start – but frameworks describe the problem; they don’t assign liability.
The accountability question will get louder as more incidents surface. And the Meta case almost certainly isn’t the last one. It’s just the first one that made the Guardian.
Start with this question: does your AI have access to more than it needs? If you’re not sure, assume yes.