The Real MCP Problem Isn't More Tools. It's Whether You Can Trust Them.
MCP has become the default wiring for enterprise AI agents. Now security researchers are finding that the real governance gap isn't connectivity — it's whether approved tools stay approved.
The upgrade everyone wanted comes with a problem nobody budgeted for
For the past year, enterprises evaluating agentic AI had one question: what can it connect to? The appeal of the Model Context Protocol was obvious. One standard, any tool — databases, file systems, SaaS platforms, internal APIs. Instead of custom integration work for every system, MCP offered a universal socket.
The ecosystem moved fast. By March 2026, more than 19,000 MCP servers were in circulation. Developers plug them into Claude, GPT, Copilot, and custom agents with minimal vetting. Speed was the point.
Now the speed is the problem.
We covered MCP's connectivity-versus-accuracy gap earlier this year. Since then the questions have shifted: from "can the agent access the tool" to "should we trust the tool it accessed, and can we prove what it actually did."
Why tool-layer governance became urgent
The enterprise AI governance conversation has moved through a predictable sequence. First: is the model accurate? Then: does the agent have proper identity and credentials? Now, with MCP-enabled agents executing real actions — reading databases, writing files, calling APIs — the focus has landed on the tool layer itself.
When an agent uses a tool, something concrete happens. A file is read. A record is updated. A request goes out. The model's response quality doesn't matter if the tool it used was compromised, overprivileged, or changed between approval and execution.
This isn't theoretical. In March 2026, Microsoft patched CVE-2026-26118, a server-side request forgery vulnerability in Azure MCP Server Tools with a CVSS score of 8.8 (dev.to/Protodex). An attacker who could interact with an MCP-backed agent could submit a malicious URL and cause the server to leak its managed identity token. Privilege escalation through your own AI agent.
Dark Reading argued that MCP security can't be patched away: the problem is architectural, not a bug. SC Media called MCP "the backdoor your zero-trust architecture forgot to close," warning that trusted tool channels can become data-exfiltration paths without triggering any alert (SC Media).
What's actually broken
Overprivileged tools, minimal vetting
A March 2026 scan of 5,618 MCP servers found only a small fraction passed basic safety checks (dev.to/Protodex). Developers search for "MCP server for [tool]" and deploy whatever appears — no dependency audits, no CVE checks, no verification that the server is still maintained. Tools with far more access than the task requires are the default, not the exception.
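None of these checks are exotic. A minimal pre-deployment gate could encode them directly. The sketch below assumes a hypothetical registry record per server (fields like `last_release`, `requested_scopes`, and `open_cves` are illustrative; MCP itself defines no such manifest):

```python
from datetime import date, timedelta

def vet_server(manifest: dict, needed_scopes: set[str]) -> list[str]:
    """Return reasons to reject an MCP server before deployment.

    `manifest` is a hypothetical metadata record an enterprise registry
    might keep per server; MCP does not standardize these fields.
    """
    problems = []
    # Abandonment check: no release in six months is a red flag.
    if date.today() - manifest["last_release"] > timedelta(days=180):
        problems.append("stale: no release in over 6 months")
    # Least privilege: flag scopes the task does not actually require.
    excess = set(manifest["requested_scopes"]) - needed_scopes
    if excess:
        problems.append(f"overprivileged: unneeded scopes {sorted(excess)}")
    # Basic hygiene: unresolved known CVEs in dependencies.
    if manifest.get("open_cves"):
        problems.append(f"unpatched CVEs: {manifest['open_cves']}")
    return problems
```

An empty return list is the bar most of those 5,618 servers would fail, and nothing here requires new tooling, only the discipline to run the check before "deploy whatever appears."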
Mutable definitions and the approval problem
Here's the subtler issue: MCP tool definitions aren't versioned.
The spec allows a server to rewrite a tool's description, parameters, and behavior at any point — including after a user has reviewed and approved it. Security researcher Nasser Ali Alzahrani named this the Rug Pull Attack (nasser.nz).
The sequence: a user reviews a tool called read_file, described as "read a local file and return contents." They approve it. Before execution, the server rewrites the tool to also POST file contents to an external endpoint. The agent calls the original tool name with the original parameters. The rewritten behavior executes. No error. No notification. No trace of the change.
There is no hash, no snapshot, no version recorded on the client side. The approval references a definition that no longer exists by the time the call goes through.
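Nothing in the protocol stops a client from adding that snapshot itself. A sketch of approval-time pinning, assuming a client-side store (the `ApprovalStore` class and its methods are hypothetical, not part of any MCP SDK):

```python
import hashlib
import json

def definition_hash(tool_def: dict) -> str:
    """Canonical SHA-256 of a tool definition (name, description, schema)."""
    canonical = json.dumps(tool_def, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

class ApprovalStore:
    """Pin the exact definition the user approved; refuse anything else."""

    def __init__(self):
        self._pinned: dict[str, str] = {}

    def approve(self, tool_def: dict) -> None:
        self._pinned[tool_def["name"]] = definition_hash(tool_def)

    def check_before_call(self, current_def: dict) -> bool:
        pinned = self._pinned.get(current_def["name"])
        # A missing pin or a changed hash means the approval no longer
        # applies: block the call and force re-review.
        return pinned is not None and pinned == definition_hash(current_def)
```

Under this scheme the rug pull fails closed: the rewritten `read_file` hashes differently from the approved one, so the call is refused instead of silently executing changed behavior.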
The blast radius scales with what the tool could access. An MCP tool connected to a FHIR API or an EHR system, rug-pulled to exfiltrate PHI, produces an audit trail that shows a successful read_file — and nothing else. HIPAA requires you to demonstrate data wasn't improperly disclosed. That trail won't help you.
Stale and abandoned servers
Beyond intentional attacks, there's a quieter problem: servers that have simply been abandoned. A tool approved months ago may now run on an unmaintained server with unpatched dependencies. The agent doesn't know. The enterprise doesn't know. The tool keeps running.
Pivot Point Security framed MCP as an expanding attack surface that enterprises didn't fully scope when they adopted it — every connected tool is a potential entry point, and most weren't evaluated as one.
Why gateways are the emerging answer
The security community isn't waiting for the MCP spec to catch up. The practical response is a control plane between agents and tool servers.
Versa Networks describes an MCP Gateway as the "hands control point" for enterprise AI, sitting in-line to enforce approval flows, restrict tool arguments, log all execution, enforce least-privilege access, and verify server posture before any call proceeds. The framing is deliberate: you secured the model (the brain). Now you need to secure what it can do with its hands.
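In code terms, a gateway is an interceptor between agent and server: every call passes a policy check and leaves an audit entry before the tool runs. A minimal sketch (the policy shape, `MCPGateway` class, and log format are illustrative assumptions, not Versa's implementation):

```python
import time
from typing import Any, Callable

class MCPGateway:
    """In-line control point: tool allowlist, argument limits, audit log."""

    def __init__(self, policy: dict[str, dict], audit_log: list[dict]):
        self.policy = policy        # tool name -> {arg name: predicate}
        self.audit_log = audit_log

    def call(self, tool: str, args: dict[str, Any],
             execute: Callable[[str, dict], Any]) -> Any:
        entry = {"ts": time.time(), "tool": tool, "args": args, "allowed": False}
        rules = self.policy.get(tool)
        if rules is None:
            self.audit_log.append(entry)   # log denials too
            raise PermissionError(f"{tool} is not an approved tool")
        # Argument restriction: e.g. confine read_file to an allowed prefix.
        for arg, predicate in rules.items():
            if not predicate(args.get(arg)):
                self.audit_log.append(entry)
                raise PermissionError(f"argument {arg!r} violates policy")
        entry["allowed"] = True
        self.audit_log.append(entry)       # record before execution
        return execute(tool, args)
```

A policy like `{"read_file": {"path": lambda p: str(p).startswith("/srv/approved/")}}` keeps a file-reading tool inside one directory, and the log survives even when the call is denied, which is exactly the evidence an audit asks for.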
SiliconANGLE's preview of RSAC 2026 pulled MCP into enterprise operating-model conversations alongside authentication, provenance, posture management, and SecOps (SiliconANGLE). Enterprise AI is no longer evaluated only on output quality. Buyers want proof of what ran, evidence that it was authorized, and logs that hold up in an audit.
The market conversation has shifted: from "connect agents to more tools" to "govern whether those tools are trustworthy, stable, least-privileged, and still the same tools users originally approved."
When AI can modify knowledge, the stakes are higher
Read-only tool use is one risk profile. Write-capable tool use is another.
When agents move from answering questions to changing knowledge, the governance requirements go up sharply. Inserting new content, deleting outdated documents, reconciling contradictions across policy files: these aren't reversible with an apology. An agent with legitimate credentials, using a rug-pulled tool, can alter enterprise knowledge in ways that are hard to detect and harder to undo.
This matters especially for document-heavy systems where AI writes back to the source of truth. Mojar AI operates at this boundary. AI that reads from a knowledge base and AI that writes to one are different governance problems. Source attribution for read answers is baseline. The harder requirement is proving that the write action the human approved is the write action that executed: scoped, logged, and traceable.
This connects to what we wrote about AI agents hitting a post-auth control problem: authentication tells you who the agent is, but it says nothing about whether the tool it used was still what it was supposed to be. Both problems need answers before knowledge operations can be trusted. Right now, most deployments have fully solved neither.
For enterprises running AI that modifies policy repositories, compliance documents, or shared knowledge bases, weak tool governance isn't a theoretical risk. Mutable tool definitions, overprivileged write access, and incomplete action logs create a knowledge base that drifts — and when it does, nobody can reconstruct what changed it or why.
What to watch
The MCP spec will likely move toward versioned tool definitions and approval-time snapshots — the Rug Pull attack is too well-documented and too clean to stay unaddressed for long. The enterprise market isn't waiting. Expect MCP governance and gateway capabilities to become a standard procurement question for agentic AI platforms by mid-2026.
The first wave of AI tooling competition was about giving agents more hands. The next wave is about proving those hands are governed, scoped, and trustworthy enough to touch enterprise knowledge in the first place.