From folder hierarchies to natural language search in healthcare

Shared drives and keyword search fail clinical staff at scale. How semantic search changes the way hospitals provide policy access, and what to evaluate when modernizing.

14 min read • January 21, 2026 • Updated April 21, 2026

Timotei Mierlut

Founding Engineer, Mojar AI

The folder was called Clinical_Protocols_FINAL_v3_USE_THIS_ONE.

Somewhere in that same shared drive existed Clinical_Protocols_FINAL_v2, Clinical_Protocols_UPDATED_2024, and the original Clinical_Protocols folder that nobody had touched since 2019 but nobody had archived either. Each contained slightly different versions of documents with nearly identical names.

This is the reality of clinical knowledge management at most healthcare organizations. Not a failure of technology investment or staff training, but the predictable result of decades of organic growth without intentional architecture. Policies written in Word, saved to shared drives, organized by whoever happened to create the folder, and searched by people who learned to guess at naming conventions because the alternative was giving up entirely.

The shift from folder hierarchies to natural language search fundamentally changes how clinical staff interact with institutional knowledge: from "know where to look" to "know what to ask."

Comparison of cluttered shared drive folder navigation versus clean natural language search interface for hospital policy access

The three eras of clinical policy access

Era 1: physical binders and unit-based knowledge

Before digitization, policies lived in physical binders on nursing units. If you needed to check a procedure, you walked to the binder. The system was simple and had a certain reliability: the binder was always in the same place, and the policy inside was the one your unit used.

The limitations were obvious. Updating meant printing new pages and replacing old ones, a process that happened inconsistently. Different units might have different versions. And if you needed a policy from another department, you made a phone call or took a walk.

But for local knowledge access, physical binders had a clarity that their digital successors would accidentally destroy.

Era 2: shared drives and folder hierarchies

The migration to digital storage promised universal access and easier updates. In practice, it created new problems while solving old ones.

Shared drives evolved without planning. Each department created its own folder structure. Naming conventions were invented by people who left years ago. The cardiac care unit organized by document type; pharmacy organized by year; quality improvement organized by regulatory body. There was no master architecture, only accumulated decisions.

To find a policy in this environment, you needed tribal knowledge: which drive held clinical documents, which subfolder matched your department's logic, which filename variant was current. Research published in BMJ Quality & Safety found that 16-34% of clinical shift time is preventable waste, including time spent searching for information that should be readily available.

The nursing forums tell the story in real terms. As one nurse wrote on AllNurses:

Nobody seems to know how to find most policies.

Another:

When you really want a policy to guide practice, there is none.

The policies existed. Finding them was the problem.

Era 3: keyword search (the false promise)

Keyword search seemed like the obvious solution. Instead of navigating folder trees, staff could simply search for what they needed. Type "restraint policy," find the restraint policy.

Except that's not what happened.

A keyword search for "restraint policy" returns every document containing those words: the actual policy (in three versions), training slides from 2021, a committee meeting agenda that discussed policy updates, a regulatory summary that references restraints, and a dozen other documents that happen to mention restraints somewhere in their text.

Staff don't have time to open and scan a dozen documents during a busy shift. They need the answer, not a list of possibilities. When keyword search returns 47 results, none of which are obviously correct, most people do what any reasonable person would do: give up and ask a colleague.

The colleague tells them what they think the policy is. And the knowledge management system becomes irrelevant.

Why keyword search fails clinical workflows

Understanding why keyword search doesn't work for clinical knowledge access helps explain what's different about semantic approaches.

The vocabulary mismatch problem

Clinical staff think in questions: "Can an RN remove a chest tube?" The policy is titled "Scope of Practice: Invasive Procedures." A keyword search for "RN chest tube" might not surface this document because those exact words don't appear in the title or prominently in the text.

Staff are forced to guess at the vocabulary the document authors used, which may not match how clinicians describe the same concepts in practice. This is the vocabulary mismatch problem: the words searchers use often differ from the words indexers used.
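
A minimal sketch of the mismatch, assuming a naive matcher that compares only the words in the query against the words in the document title. The `tokens` helper is hypothetical and far simpler than a real search engine, but the failure has the same shape:

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

query = "RN chest tube"
title = "Scope of Practice: Invasive Procedures"

# Exact-term matching only sees words the query and the document share.
shared_terms = tokens(query) & tokens(title)
print(shared_terms)  # set(): no shared words, so the title never matches
```

The policy answers the question, but nothing in its title tells a keyword matcher so.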

The context collapse problem

Keywords have no understanding of context. A search for "medication administration" returns documents about:

Medication administration policies
Medication administration training
Medication administration error reporting
Medication administration during codes
Medication administration for pediatric patients

All relevant to medication administration. None necessarily relevant to the specific question the searcher has in mind. The searcher must mentally filter the results, determine which category applies to their situation, and hope the right document is in the list.

The version control invisibility problem

Keyword search treats all documents equally. It doesn't know that Medication_Policy_v2.docx supersedes Medication_Policy.docx. It doesn't flag that the document you're about to rely on was last updated in 2020. It surfaces what matches, not what's current or authoritative.

Staff learn they can't trust search results without independent verification. And verification takes time nobody has.
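
What version awareness could look like, as a rough sketch: the `superseded_by` field, the three-year review window, and the dates below are all assumptions made for illustration, not a description of any particular product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class PolicyDoc:
    filename: str
    last_updated: date
    superseded_by: Optional[str] = None  # hypothetical metadata field

def current_versions(docs: list[PolicyDoc]) -> list[PolicyDoc]:
    """Keep only documents that no other document supersedes."""
    return [d for d in docs if d.superseded_by is None]

def is_stale(doc: PolicyDoc, today: date, max_age_days: int = 365 * 3) -> bool:
    """Flag documents older than a review window (the window is an assumption)."""
    return (today - doc.last_updated).days > max_age_days

# Illustrative dates only.
docs = [
    PolicyDoc("Medication_Policy.docx", date(2020, 3, 1),
              superseded_by="Medication_Policy_v2.docx"),
    PolicyDoc("Medication_Policy_v2.docx", date(2024, 6, 15)),
]
current = current_versions(docs)
print([d.filename for d in current])  # ['Medication_Policy_v2.docx']
```

Plain keyword search has neither signal: both files match the query, and nothing tells the reader which one to trust.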

The multi-document problem

Clinical questions often span multiple policies. "What do I need to document for a patient refusing treatment?" might require information from the informed consent policy, the documentation policy, the patient rights policy, and possibly department-specific guidelines.

Keyword search returns individual documents. It can't synthesize information across sources. Staff must search multiple times, read multiple documents, and piece together the complete answer themselves, assuming they know all the relevant policies exist.

What semantic search actually means

Semantic search represents a fundamental architectural shift. Instead of matching text strings, semantic systems understand meaning.

From keywords to intent

When a nurse types "Can I confirm a patient's appointment to their spouse?", a semantic system recognizes:

This is a question about information disclosure
The caller relationship (spouse) is relevant
HIPAA/privacy policies are likely involved
The answer requires understanding consent and authorization rules

The system doesn't search for documents containing "spouse" and "appointment." It searches for documents about patient information disclosure, caller verification, and privacy requirements—the concepts behind the question.

How it works: embeddings and vector search

Without getting too technical, semantic search works by converting text into mathematical representations called embeddings. These embeddings capture meaning, not just words. Texts with similar meanings have similar embeddings, even if they use different vocabulary.

When you ask a question, the system converts your question into an embedding and finds documents with the most similar meaning, not the most similar words. "IV infusion rate for potassium chloride" matches documents about potassium administration protocols even if they use terms like "parenteral potassium supplementation" or "KCl infusion guidelines."
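
To make "similar embeddings" concrete, here is a small sketch using cosine similarity, a common way to compare vectors. The three-dimensional vectors are invented for illustration; real embeddings come from a trained model and typically have hundreds of dimensions.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: close to 1.0 means similar direction, near 0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, made up for this example (not produced by any model).
query_vec = [0.9, 0.8, 0.0]  # "IV infusion rate for potassium chloride"
doc_vecs = {
    "KCl infusion guidelines": [0.8, 0.9, 0.1],
    "Parenteral potassium supplementation": [0.9, 0.6, 0.0],
    "Restraint policy": [0.0, 0.1, 0.9],
}

# Rank documents by how closely their vectors point in the query's direction.
ranked = sorted(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]), reverse=True)
for name in ranked:
    print(f"{cosine(query_vec, doc_vecs[name]):.2f}  {name}")
```

Both potassium documents score close to 1.0 and the restraint policy scores near 0, even though neither potassium title shares a word with the query.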

From retrieval to generation

Many modern systems pair semantic retrieval with a language model, a pattern called RAG (Retrieval-Augmented Generation). They don't just retrieve relevant documents; they also generate natural language answers based on those documents.

Instead of returning a list of documents for the nurse to read, the system provides a direct answer: "According to the Medication Administration Policy (Section 4.2), IV potassium chloride infusions should not exceed 10 mEq/hour unless under continuous cardiac monitoring..."

The answer includes a citation. The nurse can click through to verify. But for routine questions, they get the answer immediately, in language that addresses their specific question.
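
The retrieval-then-generation handoff can be sketched as prompt assembly. This is an illustration under assumptions: the `Passage` type, the `build_prompt` function, and the prompt wording are hypothetical, and the call to a language model is deliberately left out.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    policy: str   # document title
    section: str  # section identifier used in the citation
    text: str

def build_prompt(question: str, passages: list[Passage]) -> str:
    """Assemble a grounded prompt; the model call itself is out of scope here."""
    if not passages:
        # Nothing retrieved: return no prompt rather than let a model guess.
        return ""
    sources = "\n".join(
        f"[{i}] {p.policy} (Section {p.section}): {p.text}"
        for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

passages = [
    Passage(
        policy="Medication Administration Policy",
        section="4.2",
        text="(retrieved policy text goes here)",
    )
]
prompt = build_prompt("What is the maximum IV infusion rate for potassium chloride?", passages)
print(prompt)
```

Numbering the sources is what lets the generated answer point back to a specific policy section the nurse can open and check.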

The clinical workflow difference

The practical difference between keyword and semantic search becomes clear in how clinical staff actually use these systems.

Traditional workflow: the search spiral

Open the intranet
Navigate to policies section
Wonder whether this is "Clinical" or "Administrative"
Try search with keywords
Get 23 results
Open one that looks promising
Scroll through 15 pages looking for the relevant section
Find it's the wrong policy, or the right policy but wrong version
Try different keywords
Give up, text a colleague

Time: 4-7 minutes (or abandonment)

Semantic workflow: ask and verify

Type: "What's the policy on family presence during resuscitation?"
Receive answer with citation to specific policy section
Click source link if verification needed

Time: 30-60 seconds

The efficiency gain is obvious. What's less obvious is the behavior change this enables. Our customers consistently report that staff adoption of semantic search correlates directly with query speed relative to the old approach: when looking something up takes less time than texting a colleague, staff actually use the system. When it takes longer, they don't.

When finding information is hard, staff develop workarounds. They rely on memory. They ask colleagues. They develop local practices that may or may not match documented policy. Over time, tribal knowledge replaces documented knowledge, creating variation, inconsistency, and risk.

When finding information is easy, easier than the workaround, staff actually use the system. Documented policy becomes the reference point. Consistency improves. New staff can access institutional knowledge without relying entirely on their preceptors' recollections.

What modern systems can (and can't) do

Understanding the current capabilities helps set realistic expectations when evaluating solutions.

What works well today

Natural language policy queries: Asking questions in plain English and receiving direct answers is mature technology. Systems handle synonyms, related concepts, and implicit context reasonably well.

Source attribution: Every answer should cite its source, the specific document and section where the information originates. This is essential for trust and verification.

Universal document ingestion: Modern systems handle PDFs (including scanned documents), Word files, Excel spreadsheets, and other common formats. Quality varies by vendor, especially for low-quality scans or complex layouts.

Role-based access: Users can be granted access to different document sets based on their role. Not every staff member needs access to every policy.
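
One way role-based access can work is to filter the searchable set before retrieval runs, so restricted documents never reach the answer. The access map, document titles, and role names below are hypothetical:

```python
# Hypothetical access map: document title -> roles allowed to read it.
ACCESS = {
    "Medication Administration Policy": {"nurse", "pharmacist", "physician"},
    "Pharmacy Compounding Procedures": {"pharmacist"},
    "HR Disciplinary Process": {"manager"},
}

def visible_documents(role: str, access: dict[str, set[str]] = ACCESS) -> list[str]:
    """Return the documents a role may search, before any retrieval runs."""
    return sorted(title for title, roles in access.items() if role in roles)

print(visible_documents("nurse"))
# ['Medication Administration Policy']
print(visible_documents("pharmacist"))
# ['Medication Administration Policy', 'Pharmacy Compounding Procedures']
```

Filtering before retrieval, rather than hiding results afterwards, also keeps restricted text out of any generated answer.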

Emerging capabilities

Contradiction detection: Some systems can analyze your entire policy library and identify conflicts: Policy A says 24 hours, Policy B says 48 hours for the same process. This proactive quality management is new and not universally available.

Outdated content flagging: Some systems can surface documents that reference superseded regulations, former employees, or discontinued processes. This goes beyond checking "last modified" dates.

Feedback-driven improvement: When users mark an answer as unhelpful, the system flags the source document for review. Bad user experiences become signals that drive documentation quality improvement.
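
To illustrate the contradiction detection idea above, here is a deliberately naive sketch that compares time limits stated for the same process. The statements, the process labels, and the regex approach are assumptions; a production system would need far more robust extraction than pattern matching.

```python
import re
from collections import defaultdict

# Hypothetical extracted statements: (policy name, process, sentence).
statements = [
    ("Policy A", "incident reporting", "Incidents must be reported within 24 hours."),
    ("Policy B", "incident reporting", "Report all incidents within 48 hours."),
    ("Policy C", "chart review", "Charts are reviewed within 72 hours."),
]

def find_conflicts(rows):
    """Group time limits by process and return processes with more than one value."""
    limits = defaultdict(dict)
    for policy, process, sentence in rows:
        match = re.search(r"(\d+)\s*hours?", sentence)
        if match:
            limits[process][policy] = int(match.group(1))
    return {
        process: by_policy
        for process, by_policy in limits.items()
        if len(set(by_policy.values())) > 1
    }

conflicts = find_conflicts(statements)
print(conflicts)  # {'incident reporting': {'Policy A': 24, 'Policy B': 48}}
```

The output is a review queue for policy owners, not a verdict: two different limits may be legitimate if the policies cover different situations.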

Current limitations

Clinical judgment: These systems retrieve and summarize documented knowledge. They don't replace clinical judgment for patient care decisions. The value is access to documented guidance, not autonomous decision-making.

Unstandardized documentation: If your policies are poorly written, vague, or contradictory, semantic search will faithfully surface that poor content. Garbage in, garbage out. As we detailed in our analysis of managing 1,000+ SOPs at enterprise scale, better access doesn't fix bad documentation, though contradiction detection can help identify it.

Integration complexity: While standalone knowledge systems work well, deep integration with EHRs remains challenging. Most implementations exist alongside clinical systems rather than embedded within them.

Evaluation criteria for healthcare organizations

When assessing whether to modernize policy access systems, consider these questions. We've seen organizations undermine vendor evaluations by testing with curated demo content rather than their actual documentation; the gap in results is significant.

Evaluation area	What to test	Why it matters
Document handling	Upload your messiest scanned PDFs from 2015	Basic parsers fail on legacy healthcare documents
Answer quality	Ask the same clinical question five different ways	Semantic systems should handle synonyms; keyword systems won't
Source attribution	Check whether answers cite specific document sections	Without citation, staff can't verify or trust answers
Security and access	Verify BAA availability and role-based access controls	Non-negotiable for PHI-adjacent healthcare content
Maintenance burden	Ask how policy updates are reflected in the system	Manual re-indexing requirements often kill adoption
Failure mode	Ask what happens when the system can't find an answer	Good systems acknowledge gaps; bad ones hallucinate

Document handling

Can the system handle your actual documents? Request a demo using your messiest PDFs: scanned policies from 2015, multi-column layouts, documents with tables and forms. Basic systems struggle with these. Enterprise-grade systems have sophisticated parsing that handles edge cases.

Answer quality

Ask the same question five different ways. Does the system consistently find the right source and provide accurate answers? Does it cite sources clearly? Does it acknowledge when it doesn't have relevant information rather than guessing?
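
The "five different ways" test can be turned into a simple score. The sketch below assumes you can call the system under test as a function that returns the title of its top result; `keyword_search` is a stand-in for that system, and the paraphrases are examples.

```python
from typing import Callable

def paraphrase_consistency(
    search: Callable[[str], str],
    paraphrases: list[str],
    expected_doc: str,
) -> float:
    """Fraction of paraphrases whose top result is the expected document."""
    hits = sum(1 for q in paraphrases if search(q) == expected_doc)
    return hits / len(paraphrases)

# Stand-in for the system under test: a naive keyword matcher.
def keyword_search(query: str) -> str:
    return "Restraint Policy" if "restraint" in query.lower() else "No match"

paraphrases = [
    "What is the restraint policy?",
    "When can I use restraints on a patient?",
    "Rules for physically restraining someone",
    "Can we tie down an agitated patient?",
    "Policy on limiting patient movement for safety",
]
score = paraphrase_consistency(keyword_search, paraphrases, "Restraint Policy")
print(score)  # 0.4
```

The keyword stand-in finds the policy for only two of the five phrasings. Running the same harness against each candidate system, with questions your own staff actually ask, gives a comparable number per vendor.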

Security and compliance

Where does your data reside? Who can access it? Is there an audit trail? Can the system enforce role-based access to sensitive documents? For healthcare, these aren't optional features.

Maintenance requirements

Who maintains the knowledge base when policies change? Is updating as simple as uploading a new document, or does it require manual indexing? Systems with automatic ingestion and processing require significantly less ongoing work.
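
As a rough illustration of what automatic ingestion involves, the sketch below detects which uploaded files need re-indexing by comparing content hashes. The function names and file names are hypothetical, and a real pipeline would also handle deletions, parsing, and re-embedding.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """Content hash used to detect whether a file has changed."""
    return hashlib.sha256(content).hexdigest()

def needs_reindex(files: dict[str, bytes], indexed: dict[str, str]) -> list[str]:
    """Return filenames that are new or whose content changed since the last index."""
    return sorted(
        name for name, content in files.items()
        if indexed.get(name) != fingerprint(content)
    )

# State recorded at the previous indexing run.
indexed = {
    "restraint_policy.pdf": fingerprint(b"version 1"),
    "hand_hygiene.pdf": fingerprint(b"unchanged"),
}
files = {
    "restraint_policy.pdf": b"version 2",     # edited since the last index
    "hand_hygiene.pdf": b"unchanged",         # identical, can be skipped
    "falls_prevention.pdf": b"first upload",  # never indexed
}
print(needs_reindex(files, indexed))
# ['falls_prevention.pdf', 'restraint_policy.pdf']
```

When a vendor describes updates as automatic, this is the kind of change detection worth asking about: what triggers it, and how quickly a replaced policy stops appearing in answers.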

User experience

Will staff actually use it? The system must be faster and easier than existing workarounds. If it takes more clicks to query the system than to text a colleague, adoption will fail regardless of how sophisticated the technology is.

The transition challenge

Moving from folder-based access to semantic search is more than a technology deployment. It's a change in how staff interact with institutional knowledge. In our experience, the organizations that succeed approach this as a behavior change initiative with technology support, not a technology rollout with some training attached.

These organizations typically:

Start with high-pain use cases. Don't try to transform all knowledge access at once. Pick the problems staff are already frustrated about (policy lookup, onboarding questions, compliance queries) and demonstrate value there first.

Train for behavior change, not just features. Staff need to learn that asking the system is now faster than asking a colleague. This is a habit change that takes time and visible reinforcement.

Close the feedback loop. When staff report that an answer was wrong or unhelpful, investigate and fix the underlying documentation. Visible improvement builds trust. Ignored feedback destroys it. We recommend making feedback responses visible to the unit that reported them; it's one of the highest-leverage trust-building actions in the first three months of deployment.

Measure what matters. Time to find policies. Staff satisfaction with knowledge access. Audit preparation hours. New hire time-to-competency. These metrics demonstrate value and justify continued investment.

The infrastructure perspective

Healthcare has systematically invested in transactional systems. EHRs capture what happened to patients, revenue cycle systems track what was billed, supply chain systems monitor what was consumed. The systems that guide how staff should do their work have received comparatively little attention.

The result is predictable. According to the American Association of Critical-Care Nurses, nurses spend 40% of their 12-hour shifts on documentation. A significant portion of that time goes not to charting patient care but to searching for the right form, the current protocol, or documentation requirements nobody can seem to find.

The shift from folder hierarchies to semantic search addresses this infrastructure gap. It treats institutional knowledge as a first-class system worthy of intentional architecture, not an afterthought stored wherever someone happened to create a folder in 2017.

For organizations still navigating shared drives and keyword searches that return dozens of irrelevant results, the technology to do better exists today. The question is whether knowledge access will remain a frustration staff work around, or become infrastructure that actually supports clinical work.

Workflow comparison showing traditional policy search taking 4-7 minutes versus semantic search taking 30-60 seconds for healthcare staff

Ready to see the difference with your own documentation? Request a demo to test Mojar's natural language search against your actual policy library, including scanned PDFs and legacy documents.

Start small and prove value. Start a free trial to ingest one unit or department's documentation and measure how query accuracy and speed compare to your current system.

For a deeper analysis of how modern knowledge management systems work in healthcare settings, explore our complete guide to RAG-based healthcare knowledge management.

Timotei Mierlut is a founding engineer at Mojar. He has worked with clinical informatics teams deploying semantic search systems in healthcare settings and leads Mojar's work on document parsing and retrieval quality.

Frequently Asked Questions

Semantic search uses AI to understand the meaning behind a query, not just the keywords. When a nurse asks 'Can I give potassium by IV infusion?' a semantic search system understands they're looking for medication administration policies, IV protocols, and potassium-specific guidelines, even if those exact words don't appear in the query.

Keyword search returns every document containing the searched words, regardless of context. A search for 'restraint policy' might return training materials, committee meeting notes, and five versions of the actual policy. Staff must manually filter results, often giving up before finding what they need.

Natural language search allows staff to ask questions the way they'd ask a colleague: 'What's the documentation requirement for patient refusal of care?' instead of guessing keywords like 'refusal' or 'consent' or 'AMA.' The system interprets intent rather than matching text strings.

Research published in BMJ Quality & Safety found that 16-34% of clinical shift time is preventable waste: time spent searching for information, equipment, or people. Policy lookup is a significant contributor, with staff often abandoning searches after navigating 3-4 folder levels.

Effective systems should offer natural language queries, source attribution (showing exactly where answers come from), support for scanned PDFs and legacy documents, role-based access controls, and ideally the ability to detect contradictions between policies before they cause patient safety incidents.
