Your Enterprise Systems Weren't Built to Be Spoken To — And That's Becoming a Governance Crisis
Chat interfaces are moving in front of enterprise systems faster than governance can catch up. The problem isn't the UX. It's what's underneath.
What changed this month
Two product launches in the same week said more about where enterprise AI is headed than any analyst report.
Amazon announced its healthcare AI assistant is now live on Amazon's website and app — not just in clinical settings, but available to consumers, explaining health records, managing prescription renewals, helping with appointments, and offering personalized health guidance when users share access to their information (TechCrunch). The same week, Adobe began rolling out an AI assistant for Photoshop (TechCrunch), extending conversational interfaces into professional creative software.
Neither of these is a chatbot bolted onto the side. Both sit in front of systems that carry real operational weight. That's the shift: conversational AI is no longer a productivity sidecar. It's becoming the primary interface through which people interact with core enterprise systems.
This isn't a UX story
Every few months, a demo impresses someone with a conversational interface. Chat with your files. Ask your CRM. Query your knowledge base in plain English. The demos always work. The question nobody asks in the demo is: what happens when the thing it's reading is wrong, outdated, or visible to the wrong person?
A conversational layer doesn't create better underlying systems. It exposes what's already there — permissions, provenance, document versions, access controls — through a medium that wasn't designed to surface any of those things. Forms and dashboards were built with explicit constraints. You clicked the fields your role could edit. You saw the screens your permissions allowed. Natural language queries don't carry those guardrails by default. The system has to provide them, and most systems weren't built to.
As Unite.AI framed it this week: systems were built for explicit navigation and structured UI flows. Users now expect to ask in plain language and let the system figure it out (Unite.AI). That expectation is reasonable. The infrastructure to meet it reliably is not yet in place at most enterprises.
This is a governance and architecture problem. The UX is just making it visible.
The attack surface opened before most people noticed
In early March, CodeWall published a report claiming it had compromised McKinsey's internal AI platform, Lilli, through conventional application security issues — not some exotic AI-specific exploit (The Register). According to CodeWall's account, the access reached infrastructure containing chat histories, files, user accounts, system prompts, and RAG document chunks (CodeWall).
The temptation is to read this as proof that AI platforms are uniquely fragile. Security analyst Edward Kiledjian pushed back on that interpretation: the vulnerability was conventional (Kiledjian). What's new is the blast radius. When a retrieval layer sits on top of legacy infrastructure, old app-sec failures can now alter what the AI says, what it cites, and what it surfaces. A weakness that was previously a data exposure problem becomes a knowledge integrity problem too. That distinction matters a great deal for how security and governance teams scope their response.
Meanwhile, MIT Technology Review reported that the Pentagon is actively planning for AI companies to train on classified data (MIT Technology Review). The same interface-access-governance tension is arriving in environments where getting it wrong carries consequences that dwarf a corporate audit.
Five questions enterprises can't currently answer
When an employee asks an internal AI assistant a question and gets an answer, the organization should be able to answer five questions about how that answer was produced. Most organizations can't.
What documents did the AI actually access? Most RAG deployments don't expose this at query time. The scoping decisions were made at setup, if they were made deliberately at all. After the fact, there's often no way to reconstruct which documents contributed to an answer.
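One way to close that gap is to record the evidence set at query time, before it ever reaches the model. A minimal sketch, assuming a plain vector-search callable; the names `AuditedRetriever` and `fake_search` are illustrative, not any vendor's API:

```python
import time
from dataclasses import dataclass, field

@dataclass
class RetrievalEvent:
    query: str
    doc_ids: list      # which chunks the model actually saw
    timestamp: float

@dataclass
class AuditedRetriever:
    search: callable   # underlying search: query -> list of (doc_id, text)
    log: list = field(default_factory=list)

    def retrieve(self, query: str):
        hits = self.search(query)
        # Record the evidence set *before* it reaches the model,
        # so the answer can be reconstructed after the fact.
        self.log.append(RetrievalEvent(query, [doc_id for doc_id, _ in hits], time.time()))
        return hits

# Toy stand-in for a vector index.
def fake_search(query):
    return [("policy-2024-v3", "Remote work requires manager approval."),
            ("handbook-2021", "Remote work is unrestricted.")]

retriever = AuditedRetriever(search=fake_search)
retriever.retrieve("What is the remote work policy?")
print(retriever.log[0].doc_ids)  # → ['policy-2024-v3', 'handbook-2021']
```

The point of the wrapper is placement: the log entry is written at retrieval time, not reconstructed later from the model's citations, which may be incomplete.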
Which version of a document informed the answer? If a policy was updated last Tuesday and the system was indexed last month, both versions can be technically present. The conflict is invisible to the person asking. The answer they receive reflects a choice the system made without flagging it.
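Surfacing that conflict is mechanical once version metadata exists on indexed chunks. A hypothetical sketch, assuming each chunk carries `source_id` and `version` fields (an assumption about how the index is annotated, not a feature of any existing product):

```python
def find_version_conflicts(chunks):
    """chunks: list of dicts with 'source_id' and 'version' keys."""
    versions_seen = {}
    for chunk in chunks:
        versions_seen.setdefault(chunk["source_id"], set()).add(chunk["version"])
    # Any source represented by two or more versions is a conflict the
    # assistant should surface, not resolve silently.
    return {src: sorted(v) for src, v in versions_seen.items() if len(v) > 1}

retrieved = [
    {"source_id": "expense-policy", "version": "2024-01", "text": "Limit is $50."},
    {"source_id": "expense-policy", "version": "2025-03", "text": "Limit is $75."},
    {"source_id": "onboarding",     "version": "2023-06", "text": "Day-one checklist."},
]
print(find_version_conflicts(retrieved))  # → {'expense-policy': ['2024-01', '2025-03']}
```

The hard part isn't the check; it's that most pipelines never attach version metadata at ingestion, so there is nothing to check against.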
Were the sources approved and current? Approved by whom, as of when? Most enterprise knowledge bases grow by accumulation: someone uploads a file, it stays indefinitely, nobody audits it. The assistant treats a four-year-old onboarding doc and last week's policy update with equal authority.
Can permissions survive natural-language mediation? Role-based access control was designed for structured interfaces. Asking the AI a question that touches restricted information doesn't automatically trigger the same access restrictions as clicking a restricted menu item — it depends entirely on how carefully the retrieval layer was scoped during deployment.
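One common mitigation is to enforce the same access-control list on retrieval results before they reach the model, so a natural-language query can't surface a document the asker's role couldn't open directly. A toy sketch, with the ACL shape assumed purely for illustration:

```python
# Hypothetical ACL: document -> set of roles permitted to read it.
ACL = {
    "salary-bands.pdf": {"hr", "finance"},
    "handbook.pdf":     {"hr", "finance", "engineering", "sales"},
}

def filter_by_role(hits, role):
    """Drop any retrieved document the role is not permitted to read.
    Unknown documents default to no access (deny by default)."""
    return [doc for doc in hits if role in ACL.get(doc, set())]

hits = ["salary-bands.pdf", "handbook.pdf"]
print(filter_by_role(hits, "engineering"))  # → ['handbook.pdf']
print(filter_by_role(hits, "hr"))           # → ['salary-bands.pdf', 'handbook.pdf']
```

Filtering after retrieval is the simplest version; mature deployments push the role constraint into the index query itself so restricted content never leaves the store.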
Can the organization reconstruct why a given answer happened? For regulated industries, this isn't optional. It's the difference between passing an audit and failing one. If an AI assistant gave a clinician, a banker, or a compliance officer the wrong answer, the organization needs to explain exactly why — which documents, which version, which retrieval decision.
What regulated industries are learning first
Healthcare has been dealing with this for years. Clinical AI systems reading patient records, surfacing drug interactions, informing care decisions — all of it depends on the accuracy and currency of the underlying documents. When those documents contradict each other, the AI doesn't flag ambiguity. It picks one. The clinician has no way to know.
The same dynamic is now arriving in legal, financial services, and any environment where answers to questions carry downstream consequences that can be audited or challenged. The assistant adds speed. It doesn't add accuracy. That gap belongs to the organization deploying it.
We've written previously about how AI agents passing authentication doesn't solve the control problem that comes after access is granted, and about the knowledge governance problem that agent deployments inherit directly from their underlying document systems. Conversational interfaces are now accelerating both problems at consumer scale.
The knowledge layer is where trust actually lives
The assistant boom across healthcare, creative tools, and enterprise platforms creates useful pressure. It's forcing organizations to confront something they've been deferring: the quality of what their AI reads.
A conversational interface in front of a well-governed knowledge base is genuinely useful. A conversational interface in front of stale, contradictory, un-permissioned documents is a faster way to be wrong at scale.
The problem isn't the model. Models are generally good. The problem is that most enterprise knowledge bases were never built to support the governance requirements that conversational AI creates. No source attribution. No contradiction detection. No version tracking. No audit trail for retrieval decisions. None of that was necessary when people navigated systems themselves. It's necessary now.
Organizations getting this right treat the knowledge layer as operational infrastructure — maintaining source attribution on every retrieval, scoping document access to approved repositories, running regular audits for contradictions and outdated content, and building systems that can reconstruct provenance after the fact. Mojar AI was built around that premise: that RAG only works reliably when the documents underneath it are actively governed, not just indexed.
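That reconstruction requirement implies a provenance record written alongside every answer. A hypothetical shape, with field names chosen for illustration rather than taken from any standard schema:

```python
import json
import time

def provenance_record(query, answer, chunks):
    """chunks: list of dicts with 'source_id', 'version', 'approved_by' keys."""
    return {
        "timestamp": time.time(),
        "query": query,
        "answer": answer,
        "sources": [
            {"source_id": c["source_id"],
             "version": c["version"],
             # A missing approver flags ungoverned content in the audit trail.
             "approved_by": c.get("approved_by")}
            for c in chunks
        ],
    }

record = provenance_record(
    "What is the retention period for client records?",
    "Seven years, per the 2024 records policy.",
    [{"source_id": "records-policy", "version": "2024-v2", "approved_by": "compliance"}],
)
print(json.dumps(record["sources"], indent=2))
```

Written at answer time and stored immutably, a record like this is what turns "which documents, which version, which retrieval decision" from a forensic investigation into a lookup.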
Most enterprise deployments aren't there yet. The assistant interfaces will keep arriving. The question is whether the knowledge those assistants read is ready to be trusted.
What to watch
The McKinsey incident will not be the last. As conversational AI moves deeper into operational systems, the combination of weak document governance and permeable retrieval layers creates compounding risk. Watch for regulatory attention in healthcare and financial services to start naming the knowledge layer specifically — not just "AI accuracy" in the abstract — as the compliance requirement enterprises are failing to meet. That shift in language, from "model performance" to "source governance," is where the accountability is heading.
Frequently Asked Questions
Why is conversational AI a governance problem rather than a UX problem?
Conversational interfaces don't replace the underlying document systems — they expose them. Forms and dashboards had explicit access controls built into their structure. Natural language queries bypass those guardrails unless the retrieval layer is deliberately scoped and permissioned. The UX changes; the underlying governance requirements multiply.
What does the McKinsey Lilli incident actually demonstrate?
CodeWall reported compromising McKinsey's internal AI platform through conventional application security vulnerabilities, not AI-specific exploits. Analysts noted the real lesson isn't that AI platforms are uniquely fragile — it's that old app-sec weaknesses now carry knowledge integrity consequences when retrieval layers sit on top of legacy infrastructure.
What questions should an organization be able to answer about an AI assistant's response?
Which documents did the AI access for a given answer? Which version of those documents was used? Were the sources approved and current? Can role-based permissions survive natural-language mediation? Can the organization reconstruct why a specific answer happened after the fact?
What does a well-governed knowledge layer include?
Source attribution on every retrieval, document access scoped to approved repositories, regular audits for contradictions and outdated content, version tracking so answers can be traced to specific document states, and audit-ready provenance logs for regulated industries.