Kubernetes May Be Ready for Production AI. Most Knowledge Layers Still Aren't.
66% of organizations running generative AI now serve inference on Kubernetes. The infrastructure is ready. The knowledge feeding those systems usually isn't.
What changed at KubeCon
Not long ago, "running AI on Kubernetes" was an engineering experiment. Something you showed at conference demos, not something you shipped to production.
That framing is over.
According to the CNCF's 2025 Annual Cloud-Native Survey, 82% of container users now run Kubernetes in production. More relevant for the AI conversation: 66% of organizations hosting generative AI models use Kubernetes for some or all inference workloads. Cloud-native has become the default substrate for production AI.
KubeCon EU 2026 in London made this shift hard to ignore. The conversation has moved from "can Kubernetes handle AI?" to something more demanding: can enterprises govern AI at scale across teams, environments, and cost centers?
That is a different question. And the answer is more complicated.
The question has changed
Infrastructure adoption stories follow a familiar arc: skepticism, early adopters, enterprise consensus. Kubernetes has cleared that arc. 98% of surveyed organizations use cloud-native techniques in some form (CNCF). Platform engineering is becoming a recognized discipline. Observability is table stakes.
The interesting signal now is what comes after infrastructure consensus: the gap between deployment capability and operational maturity.
One number from the CNCF survey captures this well: only 7% of organizations deploy AI models daily. Nearly half — 47% — deploy only occasionally. The infrastructure is ready well before most teams are.
That gap doesn't live in the platform layer.
Where the maturity gap actually is
Standardization and platform engineering
Platform engineering teams have made real progress. Internal developer platforms, policy-as-code, golden paths for deployment — these patterns are landing in serious organizations. The hard infrastructure problems around scheduling, networking, and autoscaling are largely solved.
But platform engineering for AI adds a layer most DevOps playbooks haven't addressed: what does the AI actually read? Model deployment is one concern. Model inputs are another. Most platform teams have detailed opinions on the former and no standard at all for the latter.
Observability under new pressure
The observability stack has matured throughout the Kubernetes era. Teams instrument their services, collect telemetry, and set alerts. When a microservice misbehaves, logs and traces and dashboards diagnose it.
AI systems create a different observability challenge. A retrieval-augmented system doesn't fail in ways that trip a latency alert. It returns a confident, coherent answer based on a policy document from three years ago, or a procedure revised last quarter that never got updated in the knowledge base. The infrastructure is healthy. The output is wrong. Standard telemetry won't catch that.
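Catching that failure mode means checking the retrieved content itself, not the serving infrastructure. As a minimal sketch, a pipeline could gate retrieved chunks on the last-updated timestamp of their source document before anything reaches the model. The metadata field name (`source_updated_at`) and the one-year threshold are illustrative assumptions, not any particular platform's schema:

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold: treat sources older than a year as stale. Tune per corpus.
MAX_AGE = timedelta(days=365)

def flag_stale(chunks, now=None):
    """Split retrieved chunks into fresh and stale by source timestamp."""
    now = now or datetime.now(timezone.utc)
    fresh, stale = [], []
    for chunk in chunks:
        updated = datetime.fromisoformat(chunk["source_updated_at"])
        (fresh if now - updated <= MAX_AGE else stale).append(chunk)
    return fresh, stale

chunks = [
    {"id": "policy-v3", "source_updated_at": "2026-01-10T00:00:00+00:00"},
    {"id": "policy-v1", "source_updated_at": "2023-02-01T00:00:00+00:00"},
]
fresh, stale = flag_stale(chunks, now=datetime(2026, 3, 1, tzinfo=timezone.utc))
```

A gate like this can surface a stale-retrieval rate as a metric, which is the kind of signal standard infrastructure telemetry never emits.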
Hybrid and sovereign deployment
Coverage from KubeCon EU, together with Red Hat's hybrid-cloud positioning, points to a maturing demand: enterprises want AI running on-premises, at the edge, and across clouds for compliance, data residency, and cost control. Kubernetes handles the distribution. Knowledge governance gets harder as you spread.
When your AI stack spans regions and environments, keeping the source-of-truth layer consistent becomes an active engineering problem, not a documentation task someone handles quarterly.
Deployment cadence lagging infrastructure
The 7% daily deployment figure deserves a longer look. If nearly half of organizations deploy AI models only "occasionally," they aren't running the kind of continuous delivery pipelines they've built for application code. The infrastructure supports it. The operating model hasn't caught up.
Part of this is model evaluation — you need confidence a new version is better before shipping. But part of it is upstream: the content those models retrieve is changing continuously, even when the models themselves don't.
The bottleneck nobody is writing about
Here is what most KubeCon coverage skips.
Enterprise AI systems built on retrieval-augmented generation are not primarily a model-serving problem. They are a document problem. The model is ready. The infrastructure is ready. The question is whether the content being retrieved is current, internally consistent, and actually attributable to a real source.
Most of it isn't.
The average enterprise knowledge base has accumulated years of documents across SharePoint, Google Drive, internal wikis, compliance repositories, and shared drives. Some documents are authoritative. Others are drafts. Many contradict each other. A significant portion are scanned PDFs that no standard parser handles reliably.
When a production AI system retrieves from that corpus, infrastructure maturity doesn't protect you. A 99.9% uptime SLA means nothing when the retrieved content is stale, conflicting, or ingested incorrectly because the original document was a scanned form from 2019.
This is the operational gap missing from CNCF survey data: not whether enterprises can run AI, but whether the content those systems read is production-grade.
Enterprise agent platforms are converging toward the knowledge layer as the primary bottleneck — and for good reason. Once the infrastructure problem is solved, organizations find the next question waiting: what are we actually serving the model?
What production-ready AI actually requires
If you take the full stack seriously, the picture is roughly this:
Kubernetes and platform engineering handle infrastructure — deployment pipelines, scheduling, autoscaling. That part is largely solved. What isn't solved is what lives above it: a knowledge base with defined ownership, update processes, and sourced content. Most enterprises don't have that.
Clean ingestion matters more than most teams expect. Enterprise document collections include scanned PDFs from a decade ago, spreadsheets exported in half a dozen formats, and Word documents that have been passed around long enough that nobody remembers which version is current. Standard RAG pipelines choke on these. The model is fine. The input isn't.
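One practical piece of that ingestion problem is simply noticing when a document is a scan. A crude but workable triage rule: if text extraction yields almost nothing per page, route the file to OCR instead of the standard pipeline. This is a sketch, not a parser; the 50-characters-per-page threshold is an assumption to tune against your own corpus:

```python
# Assumed cutoff: real text-layer PDFs rarely yield this little per page.
MIN_CHARS_PER_PAGE = 50

def needs_ocr(extracted_text: str, page_count: int) -> bool:
    """True if the extraction result looks like a scanned image, not text."""
    if page_count <= 0:
        return True
    return len(extracted_text.strip()) / page_count < MIN_CHARS_PER_PAGE

# A 12-page scan yielding only stray control characters gets routed to OCR:
# needs_ocr("  \x0c ", 12) -> True
# A normal one-page document passes through the standard pipeline:
# needs_ocr("Quarterly report... " * 20, 1) -> False
```

Routing on a rule like this is what keeps a 2019 scanned form from silently entering the index as an empty or garbled chunk.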
Source attribution is where things get serious. Every AI answer should trace to a specific document and version. That's a quality requirement today and a compliance question in more jurisdictions every quarter.
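Mechanically, attribution means every chunk carries its document identity and version from ingestion through to the answer. A minimal sketch of that plumbing, with field names that are illustrative assumptions rather than any specific platform's schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    """A retrieved passage that never loses track of where it came from."""
    text: str
    doc_id: str
    doc_version: str

def cite(chunks):
    """Deduplicated (doc_id, version) pairs behind a generated answer."""
    seen, cites = set(), []
    for c in chunks:
        key = (c.doc_id, c.doc_version)
        if key not in seen:
            seen.add(key)
            cites.append(key)
    return cites

answer_chunks = [
    Chunk("Refunds are honored within 30 days.", "refund-policy", "v4"),
    Chunk("Customers contact support first.", "refund-policy", "v4"),
    Chunk("Escalate unresolved cases to tier 2.", "support-sop", "v2"),
]
# cite(answer_chunks) -> [("refund-policy", "v4"), ("support-sop", "v2")]
```

The design point is that version is part of the citation, not just the document name; "refund-policy" alone tells an auditor nothing about which revision the answer relied on.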
Contradiction detection is the one most teams skip entirely. If two SOP documents disagree on a procedure and your AI doesn't know it, the system will confidently serve whichever one got indexed first. Most knowledge bases have no mechanism for catching this before the answer ships.
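Even a deliberately crude check beats no check. As a sketch, suppose documents are tagged with a procedure key and a stated value (both assumptions for illustration; real detection requires semantic comparison, not exact matching): any procedure where two sources disagree gets flagged before it can ship in an answer.

```python
from collections import defaultdict

def find_conflicts(docs):
    """Map each procedure key to its set of stated values when sources disagree."""
    by_procedure = defaultdict(set)
    for doc in docs:
        by_procedure[doc["procedure"]].add(doc["stated_value"])
    return {proc: vals for proc, vals in by_procedure.items() if len(vals) > 1}

docs = [
    {"id": "sop-a", "procedure": "backup-retention", "stated_value": "30 days"},
    {"id": "sop-b", "procedure": "backup-retention", "stated_value": "90 days"},
    {"id": "sop-c", "procedure": "incident-triage", "stated_value": "page on-call"},
]
# find_conflicts(docs) -> {"backup-retention": {"30 days", "90 days"}}
```

A report like this turns "whichever document got indexed first wins" into a queue a human can actually resolve.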
AI readiness is really knowledge base readiness. The infrastructure stack is necessary but not sufficient.
Platforms like Mojar AI are built around this specific problem: active management of the knowledge layer itself — detecting outdated content, scanning for contradictions across documents, handling scanned PDF ingestion, attributing every answer to its source. It isn't glamorous infrastructure work. But it's what separates a demo-ready AI from one that holds up in production.
What to watch
The KubeCon conversation reflects a maturation that's real. Cloud-native AI infrastructure is no longer experimental. For platform engineering teams, the playbook is coming together.
The open question heading into the rest of 2026 is which organizations treat the knowledge layer with the same rigor they've applied to infrastructure — and which assume the hard part is already done.
Teams that get this right will have AI systems that behave reliably outside demo environments. The ones that don't will keep shipping infrastructure improvements on top of a fragmented source-of-truth layer and keep getting disappointed by the outputs.
Platform engineering solved the substrate problem. The next problem is older, messier, and harder to automate: making sure the information flowing through that substrate is actually worth trusting.