Why AI Search Needs Intent (and Why DITA XML Makes It Possible)
If intent isn’t explicit in your docs, AI will invent it (and that’s a BIG problem)
Search used to be about matching keywords. Today, AI-powered search engines promise to “understand what users mean.” In practice, they struggle with something far more basic: intent.
Intent is the goal behind a search query—the job the user is trying to get done at that moment. Are they trying to learn, do, fix, compare, or verify? Intent answers why the question was asked, not just what words were typed.
For technical content, intent matters more than phrasing. A query like “configure OAuth token refresh” can signal at least four different needs:
a conceptual explanation,
a step-by-step task,
a troubleshooting scenario,
or a confirmation that a system even supports the feature.
The words don’t change. The intent does.
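To make this concrete, here's a sketch of how DITA's information typing encodes two of those intents in markup. The element names (`<concept>`, `<task>`, `<steps>`) are standard DITA; the IDs, titles, and body text are invented for illustration:

```xml
<!-- Intent: learn. A concept topic explains how something works. -->
<concept id="oauth-refresh-overview">  <!-- hypothetical id -->
  <title>How OAuth token refresh works</title>
  <shortdesc>Refresh tokens let a client obtain new access tokens
    without forcing the user to sign in again.</shortdesc>
  <conbody>
    <p>When an access token expires, the client presents its refresh
      token to the authorization server...</p>
  </conbody>
</concept>

<!-- Intent: do. A task topic gives step-by-step instructions. -->
<task id="configure-oauth-refresh">  <!-- hypothetical id -->
  <title>Configuring OAuth token refresh</title>
  <taskbody>
    <steps>
      <step><cmd>Enable refresh tokens in the client settings.</cmd></step>
      <step><cmd>Set the refresh token lifetime.</cmd></step>
    </steps>
  </taskbody>
</task>
```

Same subject, same vocabulary, but the topic type itself tells a retrieval system which intent each answer serves.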
Why AI Search Engines Struggle With Intent
AI search systems are trained on language patterns, not user goals. That creates several hard problems:
Queries are underspecified
Users compress complex situations into a few words (often not even in a sentence). AI must infer the context they left out.
Intent evolves across the customer journey
The same query appears during the learning, implementation, and troubleshooting phases of the journey.
Unstructured content hides purpose
When documentation mixes concepts, tasks, and reference material together, AI has no reliable signal to determine which kind of answer fits the situation.
When intent is unclear, AI systems fall back on probability. That’s when you get answers that sound right, rank well, and still fail the user.
Put another way: intent is what the person is actually trying to accomplish right now, whether that's learning a concept, completing a task, fixing a problem, comparing options, or confirming a decision they've already made. The challenge is that intent rarely appears explicitly in the query.
A search like “reset authentication token” could mean:
“I need step-by-step instructions right now.”
“I want to understand how token-based authentication works.”
“I’m evaluating whether this system supports token rotation.”
“I already tried something and it failed—what did I miss?”
The words stay the same. The intent changes completely.
Why AI Has Trouble Inferring Intent
Large language models are good at generating plausible answers, but intent inference is not a text-generation problem. It’s a context problem.
AI search systems struggle because:
Queries are compressed signals
Users type the shortest thing that might work, not a full explanation of their situation. AI must guess what’s missing.
Intent shifts mid-journey
A user may start by learning, then move to doing, then to troubleshooting (all with nearly identical queries).
Training data reflects language, not goals
LLMs learn patterns of words, not the underlying job the user is trying to get done.
👉🏾 Related: MIT researchers find a shortcoming that makes LLMs less reliable
Documentation rarely exposes intent explicitly
Most docs are organized by product structure, not by user intent, task maturity, or decision state.
When intent is unclear, AI fills the gap with probability. That’s where hallucinations, irrelevant answers, and dangerously confident responses come from.
What Breaks When You Don’t Use DITA (or Structured Content)
When documentation is unstructured or loosely structured, humans compensate. AI systems cannot. The gaps show up quickly (and painfully) once your content feeds an AI-powered search or answer engine.
Intent Collapses
Without information typing, AI cannot reliably tell whether a paragraph exists to explain, instruct, or troubleshoot. Concepts bleed into procedures. Reference data sneaks into narrative text. The system returns answers that mix what something is with how to do it, often in the wrong order and at the wrong time.
Result: users get answers that feel close but don’t help them act.
Context Gets Lost at the Fragment Level
AI systems rarely deliver full pages. They extract fragments.
In unstructured docs, those fragments depend on surrounding prose to make sense. Once separated, steps lose prerequisites, warnings lose scope, and examples lose relevance. AI fills in the gaps with probability, not certainty.
Result: confident answers with missing or incorrect constraints.
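Here's a sketch of how DITA keeps those constraints attached to the fragment itself. The elements (`<prereq>`, `<info>`, `<note>`) are standard DITA task markup; the key-rotation scenario is invented:

```xml
<task id="rotate-signing-key">  <!-- hypothetical id and scenario -->
  <title>Rotating the token signing key</title>
  <taskbody>
    <!-- The prerequisite travels with the task, even when extracted. -->
    <prereq>Back up the current keystore before rotating keys.</prereq>
    <steps>
      <step>
        <cmd>Generate a new signing key.</cmd>
        <!-- The warning is scoped to this step, not to loose prose nearby. -->
        <info><note type="warning">Tokens signed with the old key are
          invalidated when rotation completes.</note></info>
      </step>
    </steps>
  </taskbody>
</task>
```

However a system chunks this topic, the prerequisite and the warning stay bound to the step they govern.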
Troubleshooting Becomes Guesswork
When troubleshooting guidance is embedded inside tasks or buried in narrative text, AI cannot reliably identify it as recovery-oriented content. It may return setup instructions when the user is already in failure mode—or worse, repeat the action that caused the problem.
Result: frustration, repeated errors, and support escalations.
👉🏾 See also: Why Technical Writers Need to Understand Context Engineering
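DITA 1.3 added a dedicated troubleshooting topic type for exactly this case. A sketch (the error and remedy are invented; the elements are real DITA 1.3):

```xml
<troubleshooting id="refresh-invalid-grant">  <!-- hypothetical scenario -->
  <title>Token refresh fails with invalid_grant</title>
  <troublebody>
    <condition>
      <p>Calls to the token endpoint return an invalid_grant error.</p>
    </condition>
    <troubleSolution>
      <cause><p>The refresh token has expired or been revoked.</p></cause>
      <remedy>
        <steps>
          <step><cmd>Prompt the user to re-authenticate and issue a
            new refresh token.</cmd></step>
        </steps>
      </remedy>
    </troubleSolution>
  </troublebody>
</troubleshooting>
```

The condition/cause/remedy structure signals that this content is recovery-oriented, not setup guidance.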
Audience and Experience Level Are Invisible
Unstructured content rarely signals who something is for. AI cannot distinguish beginner guidance from expert shortcuts or administrative tasks from end-user actions.
Result: novice users get overwhelmed; experts get slowed down.
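DITA makes audience explicit in two places: prolog metadata on the topic, and filtering attributes on individual elements. A sketch (the `administrator`, `expert`, and `novice` values come from DITA's standard audience vocabulary; the topic itself is invented):

```xml
<task id="provision-api-client">  <!-- hypothetical topic -->
  <title>Provisioning an API client</title>
  <prolog>
    <metadata>
      <!-- Topic-level signal: who this is for. -->
      <audience type="administrator" experiencelevel="expert"/>
    </metadata>
  </prolog>
  <taskbody>
    <context>
      <!-- Element-level signal: show this paragraph only to novices. -->
      <p audience="novice">An API client is a registered application
        that calls the API on a user's behalf.</p>
    </context>
    <steps>
      <step><cmd>Register the client in the admin console.</cmd></step>
    </steps>
  </taskbody>
</task>
```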
AI Compensates by Hallucinating
When intent, purpose, and constraints are not explicit, AI does what it was trained to do: produce something that sounds plausible.
This is not an AI flaw. It’s a content design failure!
Why DITA Prevents These Failures
DITA doesn’t eliminate ambiguity, but it sharply reduces it by design.
Information typing makes purpose explicit
Structured components preserve meaning when content is reused or extracted
Metadata exposes audience, environment, and conditions
Modular topics give AI clean, intent-aligned units to reason over
When intent is encoded in structure, AI doesn’t have to guess. It can select.
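At the collection level, a DITA map makes those intent-aligned units navigable and selectable. A sketch with invented file names:

```xml
<map>
  <title>OAuth token refresh</title>
  <!-- One subject, four intents, four typed topics. -->
  <topicref href="oauth-refresh-overview.dita" type="concept"/>
  <topicref href="configure-oauth-refresh.dita" type="task"/>
  <topicref href="refresh-invalid-grant.dita" type="troubleshooting"/>
  <topicref href="token-endpoint.dita" type="reference"/>
</map>
```

Given a query plus an inferred intent, a retrieval layer can select the matching topic type instead of slicing an arbitrary page.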
What This Means for Tech Writers
AI search systems can only infer intent if the content makes intent visible.
That’s why structured, semantically rich documentation matters. When content clearly distinguishes:
conceptual explanations from procedures,
reference material from troubleshooting,
beginner guidance from expert shortcuts,
you give AI something concrete to reason over instead of something vague to guess at.
Well-structured documentation does not just help humans scan faster. It helps machines choose the right answer for the right situation.
And in an AI-driven search world, choosing the right answer matters more than ranking first. 🤠