The Problem with Traditional RAG Models

How DOM Graph RAG transforms AI content retrieval for greater accuracy and context

Sep 23, 2024

The DOM GraphRAG Project introduces a novel approach to improving retrieval-augmented generation (RAG). It addresses the weaknesses of traditional vector-based RAG models by combining Document Object Model (DOM) structures with knowledge graphs, with the aim of making AI-generated content more accurate, more reliable, and better grounded in context.

Related: What is retrieval-augmented generation (RAG)?

Michael Iantosca, Senior Director of Knowledge Platforms and Engineering at Avalara, along with Helmut Nagy, Chief Product Officer, and William Sandri, Data & Knowledge Engineer, both at Semantic Web Company, wrote a paper that explains the issues with traditional RAG, particularly when it is applied to technical documentation.

Problems with traditional vector-based RAG models

Traditional vector-based RAG models have issues that impact their reliability. They can produce inaccurate or misleading information (hallucinations) and lose context.

Vector-based RAG models also struggle with multi-step reasoning and fact-checking. They typically depend on manual fact-checking, which is both expensive and inefficient, and that dependence makes them hard to scale. These limitations stem from retrieving information based on similarity rather than understanding the content or verifying its accuracy.

While vector-based models are good at retrieving information based on similarity, they are weak on relevance and context. The data they retrieve often lacks the context required for accurate reasoning and fact-checking, which leads to misleading conclusions.
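The gap between similarity and relevance can be illustrated with a toy example. The sketch below is not taken from the paper: it uses a simple bag-of-words vector as a stand-in for a real embedding model, and the chunk names and texts are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9.]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical chunks: near-identical wording, different product versions.
chunks = {
    "v1": "to reset the password open settings and select reset",
    "v2": "to reset the password open the admin console and select reset",
    "billing": "invoices are sent monthly",
}


def rank(query: str) -> list[tuple[str, float]]:
    """Rank every chunk by similarity to the query, best match first."""
    q = embed(query)
    scored = [(name, cosine(q, embed(text))) for name, text in chunks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank("how do I reset the password")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Both password chunks score well because they share words with the query, but nothing in the score says which product version applies to the person asking. That missing piece is context, and similarity alone does not supply it.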

They also cannot easily connect related information across multiple sources, something knowledge graphs are good at.
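As a rough picture of what connecting information across sources means, here is a minimal sketch of a multi-hop lookup over a tiny graph. The triples, entity names, and predicates are all hypothetical, and a production knowledge graph would use a dedicated graph store and query language rather than a Python dictionary.

```python
from collections import deque

# Hypothetical triples (subject, predicate, object), each from a different source.
triples = [
    ("ErrorE42", "raised_by", "SyncService"),       # from a troubleshooting topic
    ("SyncService", "configured_in", "sync.yaml"),  # from a reference topic
    ("sync.yaml", "documented_in", "ConfigGuide"),  # from a content inventory
]


def build_graph(edges):
    """Index triples by subject so outgoing edges are easy to follow."""
    graph = {}
    for subj, pred, obj in edges:
        graph.setdefault(subj, []).append((pred, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search returning the chain of triples from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for pred, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, pred, nxt)]))
    return None  # no chain of edges links the two entities


graph = build_graph(triples)
path = find_path(graph, "ErrorE42", "ConfigGuide")
for subj, pred, obj in path:
    print(f"{subj} --{pred}--> {obj}")
```

No single source mentions both the error and the guide, yet following the edges links them in three hops. The returned path also doubles as an explanation of how the answer was reached.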

Vector-based models also require a lot of computational power, which makes them costly to run. The International Energy Agency predicts that AI will use ten times more energy by 2026 than it does today. This increase is due to the rising complexity of AI models and the infrastructure required to support them.

Meet DOM Graph RAG: The smarter way to handle content retrieval

Imagine trying to build a house with a hammer that randomly selects nails based on their color instead of where they need to go. That's pretty much how traditional RAG models handle content retrieval.

Enter DOM Graph RAG — it's like giving that hammer a brain, a blueprint, and maybe even a cup of coffee to ensure everything is done accurately, in context, and with much less chaos!

The DOM Graph RAG model leverages a semantic, content-first approach, integrating neuro-symbolic reasoning and knowledge graphs. By utilizing the DOM to structure content, it preserves the integrity of the original material. This method offers several key advantages over traditional RAG models:

Preservation of Context: DOM Graph RAG keeps the original content structure intact, which is essential for accurate content retrieval.
Neuro-Symbolic Reasoning: This technique combines the pattern recognition of neural networks with the logical reasoning of symbolic models, enhancing the system's ability to provide factually accurate and logically consistent responses.
Dynamic Content Management: Unlike traditional vector models, which struggle to keep content current and relevant, DOM Graph RAG efficiently handles dynamic and frequently updated content.
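To give a feel for what preserving structure can look like in practice, here is a minimal sketch that walks a small XML topic and emits one chunk per paragraph, each tagged with the titles of its ancestors. It is an illustration under stated assumptions, not the method described in the paper: the markup is a simplified, DITA-like example and the `dom_chunks` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA-like topic; element names are illustrative only.
SOURCE = """
<topic id="install">
  <title>Install the agent</title>
  <section>
    <title>Prerequisites</title>
    <p>Python 3.10 or later is required.</p>
  </section>
  <section>
    <title>Steps</title>
    <p>Run the installer.</p>
    <p>Restart the service.</p>
  </section>
</topic>
"""


def dom_chunks(element, trail=()):
    """Yield one chunk per <p>, tagged with the titles of its ancestors."""
    title = element.findtext("title")  # direct child <title>, if any
    if title:
        trail = trail + (title,)
    for child in element:
        if child.tag == "p":
            yield {"path": " > ".join(trail), "text": (child.text or "").strip()}
        elif child.tag != "title":
            yield from dom_chunks(child, trail)


chunks = list(dom_chunks(ET.fromstring(SOURCE)))
for chunk in chunks:
    print(f"[{chunk['path']}] {chunk['text']}")
```

A fixed-size splitter could cut "Restart the service." away from its heading. Here each chunk carries the path back to its topic and section, so a retriever can still tell which procedure the sentence belongs to.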

What's In It For Technical Writers?

Technical writers and information developers who create self-service support content can significantly benefit from adopting this model. Traditional RAG models often chunk content arbitrarily, making it difficult for users to find precise, contextually rich information. By contrast, DOM Graph RAG maintains the relationships between topics, elements, and metadata, ensuring users receive accurate, explainable answers to their queries.

For technical writers, the DOM Graph RAG model opens up new possibilities for creating content that is easy to retrieve and retains its context and structure, which is crucial for technical documentation. Furthermore, by automating content updates and managing dynamic metadata, such as entitlement, localization, and version control, DOM Graph RAG can significantly improve the efficiency of content management systems.
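One way to picture metadata such as entitlement, localization, and version at retrieval time is as a filter applied before any answer is generated. The sketch below is an assumption about how such filtering might look, not the paper's implementation. The `Chunk` fields, the sample content, and the `eligible` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    locale: str
    version: str
    entitlement: str  # e.g. "public" or "premium"


# Hypothetical content; field names and values are illustrative only.
CHUNKS = [
    Chunk("Export reports from the Reports tab.", "en-US", "2.0", "public"),
    Chunk("Export reports from the File menu.", "en-US", "1.0", "public"),
    Chunk("Schedule exports with the automation API.", "en-US", "2.0", "premium"),
    Chunk("Exportieren Sie Berichte über den Tab Berichte.", "de-DE", "2.0", "public"),
]


def eligible(chunks, locale, version, entitlements):
    """Keep only chunks the user may see, in their language and product version."""
    return [
        c for c in chunks
        if c.locale == locale
        and c.version == version
        and c.entitlement in entitlements
    ]


public_view = eligible(CHUNKS, "en-US", "2.0", {"public"})
premium_view = eligible(CHUNKS, "en-US", "2.0", {"public", "premium"})

for chunk in premium_view:
    print(chunk.text)
```

Because the filter runs on metadata attached to the content, an outdated 1.0 instruction or a premium-only feature never reaches a user it does not apply to, however similar its wording is to the question.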

[Webinar] October 16, 2024 — DITA-Driven Graph RAG For Customer Self-Service Support

Join Michael Iantosca, Senior Director of Knowledge Platforms and Engineering at Avalara, and Helmut Nagy, Chief Product Officer at Semantic Web Company, as they unravel the complexities of DITA-powered Graph RAG and demonstrate this innovative solution.

Practical Applications

The DOM Graph RAG model isn't theoretical. It's already in use in chatbot systems, providing more accurate and context-aware responses. By incorporating knowledge graphs and reducing dependence on large language models (LLMs), this approach also reduces computational costs, making it a scalable and cost-effective solution for industries requiring high precision.

The DOM Graph RAG model offers a robust, scalable, and efficient solution that addresses many of the challenges traditional RAG models face. For information developers, adopting this approach can lead to improved self-service support experiences, greater accuracy in content retrieval, and more effective content management overall.

This innovative model is an exciting addition to the literature on AI-driven content management. Take a moment and check out the additional resources available on the Thinking Documentation site.

Download the paper

The Content Wrangler
