©2026. Mojar. All rights reserved.

Built by Overseek.net

Industry News

AI Agents Don't Read Docs Like Humans. That's Becoming an Infrastructure Problem.

The Agent-Friendly Documentation Spec formalizes what many teams already know: agents fail on docs built for human readers. Here's what that means for enterprise knowledge.

6 min read • April 3, 2026

AI Agents · Documentation · Knowledge Management · Enterprise AI · RAG

Most documentation sites were built for humans who scroll, skim, click through navigation, and tolerate a bit of JavaScript overhead. That was fine. For a long time, the only reader was a person.

That's no longer true. Claude Code, Cursor, GitHub Copilot, and dozens of other coding agents now fetch documentation directly during execution. They don't scroll. They don't tolerate walls of CSS. They hit truncation limits, fail on cross-host redirects, and silently work with partial content when they can't find what they actually need.

The result, per field research behind the new Agent-Friendly Documentation Spec: agents frequently fall back on training data when documentation retrieval fails. Meaning they guess. In production.

The next docs war isn't about design

For the last decade, the big fights in documentation were about information architecture, search UX, versioning, and developer experience. Design-level problems. Important, but fixable with good writers and decent tooling.

The emerging fight is about machine readability: whether the knowledge in your docs can be retrieved reliably by a system that doesn't read the way you do.

The Agent-Friendly Documentation Spec formalizes what practitioners already knew was happening. It defines 22 checks across 7 categories: content discoverability, markdown availability, page size, content structure, URL stability, observability, and authentication. Each check has pass/warn/fail criteria. This is not a style guide. It's an operational checklist.

The categories tell you exactly where things break:

  • Discoverability: Does your llms.txt exist, is it valid, does it fit in a single fetch, and do the links actually resolve to markdown? Agents that can't find the index file fall back to guessing URLs from training data.
  • Markdown availability: Does your site serve clean .md versions? HTML-heavy docs waste context window and introduce lossy conversion artifacts.
  • Page size: Pages over 50K characters get truncated. Truncated docs produce partial retrieval. Partial retrieval produces wrong answers, delivered confidently.
  • URL stability: Agents fetch specific URLs from model memory. Move content without same-host redirects and the agent silently fails.
  • Auth access: If docs are behind a login, is there an alternative path? Agents don't click "sign in."
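The pass/warn/fail idea behind these checks is simple to sketch. Below is a minimal, illustrative page-size check: the 50K-character truncation limit comes from the article, but the warn threshold and the function name are assumptions, not part of the published spec.

```python
# Sketch of a page-size check in the spirit of the spec's pass/warn/fail
# criteria. FAIL_LIMIT reflects the 50K-character truncation point named
# in the article; WARN_LIMIT is an assumed soft threshold.

FAIL_LIMIT = 50_000   # agents truncate pages beyond this (per the article)
WARN_LIMIT = 40_000   # assumed "approaching the limit" threshold

def check_page_size(content: str) -> str:
    """Return 'pass', 'warn', or 'fail' for a single page body."""
    n = len(content)
    if n > FAIL_LIMIT:
        return "fail"
    if n > WARN_LIMIT:
        return "warn"
    return "pass"
```

A real checker would run this against every rendered page, not the source files, since templates and embedded assets inflate what the agent actually fetches.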

llms.txt has emerged as the most practical entry point. The proposal is simple: place a markdown file at /llms.txt with background content and links to detailed markdown pages. Anthropic, Cloudflare, Stripe, and Mintlify already implement versions of this pattern. Expo now ships dedicated documentation endpoints for AI agents and LLMs. Fern publishes guidance specifically on how API providers can optimize docs for agent consumption.
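Following the proposal's outline (an H1 title, a blockquote summary, then sections of markdown links), a minimal /llms.txt might look like this. The project name and URLs below are hypothetical:

```markdown
# ExampleDocs

> Documentation for the hypothetical ExampleDocs API, served as plain
> markdown for agent consumption.

## Guides

- [Quickstart](https://docs.example.com/quickstart.md): install and first request
- [Authentication](https://docs.example.com/auth.md): API keys and scopes

## Reference

- [REST API](https://docs.example.com/api.md): endpoints and error codes
```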

The spec emerged because llms.txt alone wasn't enough. You can have the index file and still fail 15 other checks.
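The discoverability checks above are mechanical enough to automate. Here is a hedged sketch of one of them: pulling the links out of an llms.txt body and flagging any that do not resolve to plain-markdown targets. The line format assumed is the proposal's `- [title](url): note` convention; the function name and messages are illustrative.

```python
import re
from urllib.parse import urlparse

# Assumed link pattern for llms.txt entries: "- [title](url): note".
LINK_RE = re.compile(r"\[([^\]]+)\]\((\S+?)\)")

def audit_llms_txt(body: str) -> list[str]:
    """Return a list of problems found in an llms.txt body."""
    problems = []
    links = LINK_RE.findall(body)
    if not links:
        problems.append("no links found")
    for title, url in links:
        # Flag targets that aren't plain markdown pages.
        if not urlparse(url).path.endswith(".md"):
            problems.append(f"non-markdown target: {url}")
    return problems
```

An empty result means the index at least points where it claims to; it says nothing about whether those pages fetch cleanly, which is exactly why the spec layers on the other categories.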

This isn't a developer docs problem. It's a knowledge infrastructure problem.

Most of the early conversation around agent-friendly docs centers on public API documentation. That's where the pattern first became visible, because coding agents hammer those docs constantly.

But the retrieval problem generalizes. Anywhere an agent reads structured knowledge to inform an action, the same failure modes apply:

  • Hospital SOPs distributed as HTML-heavy intranet pages
  • Insurance policy libraries sitting behind authentication walls
  • Compliance documentation that moved URLs six months ago, no redirects set
  • Internal support knowledge bases where content size outgrew any context window
  • Field service manuals where some sections were never converted from PDF to anything machine-readable

Internal enterprise knowledge typically has no llms.txt, no markdown parity, no page size discipline, and no URL stability policy. It was never built for machines at all. Agents are reading it now anyway, or trying to.

We've written before about what happens when agents act on documents they can't fully retrieve. Wrong retrieval produces wrong actions, not just wrong answers. In regulated environments, that distinction matters a lot.

Formatting helps agents find content. It doesn't make that content trustworthy.

Here's where the spec stops being useful.

Twenty-two checks tell you whether your documentation is machine-retrievable. None of them tell you whether what the agent retrieves is current. Whether the SOP it pulled was updated six months ago and the old version is still indexed. Whether two documents in the same knowledge base contradict each other on the same policy point. Whether the answer the agent constructs from retrieved fragments is internally consistent.
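A governance-side check for currency sits outside the spec entirely, but it is not hard to imagine what one looks like. This is a sketch only: the six-month window echoes the article's example, and the index structure and field names are assumptions.

```python
from datetime import date, timedelta

# Assumed freshness window, echoing the article's "updated six months ago"
# example. Not part of the spec's 22 checks.
FRESHNESS_WINDOW = timedelta(days=180)

def stale_docs(index: list[dict], today: date) -> list[str]:
    """Return IDs of indexed documents not updated within the window."""
    return [
        doc["id"]
        for doc in index
        if today - doc["last_updated"] > FRESHNESS_WINDOW
    ]
```

Contradiction detection between documents is a much harder problem than date arithmetic, but even this trivial pass surfaces content the retrieval layer would happily serve as if it were current.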

Agent-friendly formatting is the new baseline. It's not the moat.

An llms.txt index file helps an agent find content. It cannot tell the agent whether that content is current, complete, or in conflict with another source. Markdown delivery reduces conversion noise. It doesn't solve content decay. Stable URLs mean agents don't silently fail on moved pages. They don't mean the page that loads is the right version.
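The same-host redirect requirement is one of the easier properties to test for. A hedged sketch, using only hostname comparison (function name and labels are assumptions, not spec terminology):

```python
from urllib.parse import urlparse

def classify_redirect(original_url: str, redirect_target: str) -> str:
    """Label a redirect 'same-host' (safe for agents, per the article)
    or 'cross-host' (the case agents silently fail on)."""
    src = urlparse(original_url).hostname
    dst = urlparse(redirect_target).hostname
    return "same-host" if src == dst else "cross-host"
```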

The deeper problem, the one that's been compounding as agent deployments scale, is knowledge governance. Retrievability is a prerequisite. Trust is a different problem entirely.

As agents consume more knowledge directly, documentation quality stops being a design issue and becomes an execution risk. Teams that figured out agent-friendly formatting early have a head start on retrieval reliability. The next question is whether the knowledge those agents retrieve is governed, versioned, contradiction-checked, and current. That's where the real work is.

It's a pattern we've tracked across the agentic enterprise transition: organizations sprint to make agents work and discover too late that their knowledge wasn't ready to be consumed at this pace or this scale.

Mojar AI is built for this layer: not just making knowledge findable, but keeping it accurate. Contradiction detection across documents, feedback-driven remediation, audit trails on what changed and when. The layer that sits behind retrieval and asks whether what was found should actually be trusted.

The floor is rising

The Agent-Friendly Documentation Spec matters because it gives documentation teams a concrete operational target. Twenty-two checks, clear pass/fail criteria, run against a real docs site, produce an actionable gap list. That's genuinely useful.

Treating it as the finish line is the mistake. The spec solves the retrieval problem. Enterprise AI deployments in healthcare, compliance, field operations, or any other context where an agent's output triggers a real-world consequence have a second problem: whether retrieved knowledge is worth retrieving.

Agent-ready docs are the floor. Governed retrieval is the ceiling. Most organizations are still pouring the foundation.

Frequently Asked Questions

What is agent-friendly documentation?

Agent-friendly documentation is designed to be consumed reliably by AI coding agents and other automated systems, not just human readers. It typically includes markdown delivery, a discoverable llms.txt index file, stable URLs, and page sizes small enough to fit within context windows without truncation.

What does the Agent-Friendly Documentation Spec check?

The Agent-Friendly Documentation Spec defines 22 checks across 7 categories — content discoverability, markdown availability, page size, content structure, URL stability, observability, and authentication — to evaluate how well a documentation site serves AI agent consumers.

What is llms.txt?

llms.txt is a proposed standard where websites place a markdown file at /llms.txt that provides LLM-friendly background content and links to detailed markdown pages. It acts as a discovery index for AI systems accessing a site's documentation.

What happens when agents can't retrieve documentation reliably?

Agents that can't reliably retrieve documentation fall back on training data, work with partial information, or fail silently. Better formatting helps agents find content, but it doesn't guarantee the content is accurate, current, or free of internal contradictions.

Related Resources

  • The Agentic Enterprise Era Is Here. Nobody Asked What the Agents Will Read.
  • When AI Agents Act on Your Documents, Knowledge Quality Becomes Execution Risk