Breaking the Static Graph: Context-Aware Traversal for Robust Retrieval-Augmented Generation
Recent advances in Retrieval-Augmented Generation (RAG) have shifted from simple vector similarity to structure-aware approaches like HippoRAG, which leverage Knowledge Graphs (KGs) and Personalized PageRank (PPR) to capture multi-hop dependencies. However, these methods suffer from a “Static Graph Fallacy”: they rely on fixed transition probabilities determined during indexing. This rigidity ignores the query-dependent nature of edge relevance, causing semantic drift where random walks are diverted into high-degree “hub” nodes before reaching critical downstream evidence. Consequently, models often achieve high partial recall but fail to retrieve the complete evidence chain required for multi-hop queries. To address this, we propose CatRAG, Context-Aware Traversal for robust RAG, a framework that builds on the HippoRAG 2 architecture and transforms the static KG into a query-adaptive navigation structure. We introduce three complementary mechanisms to steer the random walk: (1) Symbolic Anchoring, which injects weak entity constraints to regularize the random walk; (2) Query-Aware Dynamic Edge Weighting, which dynamically modulates the graph structure, pruning irrelevant paths while amplifying those aligned with the query’s intent; and (3) Key-Fact Passage Weight Enhancement, a cost-efficient bias that structurally anchors the random walk to likely evidence. Experiments across four multi-hop benchmarks demonstrate that CatRAG consistently outperforms state-of-the-art baselines. Our analysis shows that while standard Recall metrics exhibit modest gains, CatRAG achieves substantial improvements in reasoning completeness: the capacity to recover the entire evidence path without gaps. These results indicate that our approach effectively bridges the gap between retrieving partial context and enabling fully grounded reasoning. Resources are available at https://github.com/kwunhang/CatRAG.
💡 Research Summary
Retrieval‑Augmented Generation (RAG) has become a popular strategy for mitigating hallucinations in large language models by grounding generation in external documents. While dense vector retrievers excel at finding individually relevant passages, they struggle with multi‑hop reasoning where the answer depends on a chain of facts. Recent structure‑aware RAG systems such as HippoRAG and its successor HippoRAG2 address this limitation by building a Knowledge Graph (KG) and using Personalized PageRank (PPR) to simulate a neuro‑symbolic memory that can traverse multi‑hop relationships. However, these approaches suffer from a “Static Graph Fallacy”: the transition matrix governing the random walk is fixed at indexing time and does not adapt to the specific query. Consequently, high‑degree hub nodes and generic edges can dominate the walk, causing semantic drift. The walk often retrieves the first entity correctly but then diffuses into irrelevant clusters, yielding high Recall but low “Full Chain Retrieval”: the ability to recover the entire evidence chain required for correct reasoning.
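The hub-drift problem can be seen in a toy example. The graph below is a hypothetical five-node illustration (not from the paper): the seed links equally to a hub and to an evidence node, yet under standard PPR with a static transition matrix, distractors funnel probability mass back into the hub, so the hub outranks the evidence.

```python
# Toy demonstration of the "Static Graph Fallacy": with fixed transition
# probabilities, a high-degree hub outranks the actual evidence node under
# standard Personalized PageRank. Graph and values are illustrative.
import numpy as np

# Nodes: 0 = query seed, 1 = hub, 2 = evidence, 3-4 = distractors.
T = np.array([
    [0.00, 0.50, 0.50, 0.00, 0.00],  # seed links equally to hub and evidence
    [0.25, 0.00, 0.25, 0.25, 0.25],  # hub spreads mass over many neighbors
    [1.00, 0.00, 0.00, 0.00, 0.00],  # evidence links back to the seed
    [0.00, 1.00, 0.00, 0.00, 0.00],  # distractors funnel mass into the hub
    [0.00, 1.00, 0.00, 0.00, 0.00],
])

def ppr(T, seed, damping=0.85, iters=200):
    """Standard PPR via power iteration with all reset mass on one seed."""
    p = np.zeros(T.shape[0])
    p[seed] = 1.0
    pi = p.copy()
    for _ in range(iters):
        pi = (1 - damping) * p + damping * (pi @ T)
    return pi

pi = ppr(T, seed=0)
print(pi[1] > pi[2])  # True: the hub accumulates more mass than the evidence
```

Even though the seed gives the hub and the evidence node equal direct weight, the hub's extra in-links let it absorb the walk, which is precisely the drift CatRAG's query-conditioned mechanisms counteract.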
CatRAG (Context‑Aware Traversal for robust RAG) proposes a three‑pronged solution that transforms the static KG into a query‑adaptive navigation structure while preserving the efficiency of a single‑pass retrieval.

The first component, Symbolic Anchoring, extracts named entities from the query (via NER or similar tools) and injects them as weak seeds with a small reset probability ε. These anchors act as gravitational pulls during PPR, repeatedly nudging the walk back toward the exact entities mentioned in the query and preventing it from being swallowed by hub nodes.

The second component, Query‑Aware Dynamic Edge Weighting, re‑weights outgoing relation edges in two stages. A coarse‑grained filter first prunes edges based on vector similarity between the query embedding and fact embeddings, limiting the number of candidates per seed to a top‑K set. In the fine‑grained stage, a large language model (LLM) evaluates each remaining edge by presenting the query, the source entity, the target entity, and a concise summary of the target’s neighborhood. The LLM classifies the transition into four relevance tiers (Irrelevant, Weak, High, Direct); a mapping function converts these tiers into scalar multipliers that are applied to the original static weight, yielding a query‑specific transition matrix T̂_q. This asymmetric re‑weighting amplifies edges that are truly useful for the current question while suppressing distracting paths.

The third component, Key‑Fact Passage Weight Enhancement, identifies passage nodes whose context edges are supported by verified seed triples (the “key facts” extracted during the HippoRAG2 filtering stage). Such edges receive a multiplicative boost of (1 + β) without any additional LLM calls, ensuring that passages containing explicit evidence are preferentially visited.
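The tier-to-multiplier mapping and the key-fact boost can be sketched as follows. The tier names come from the description above; the multiplier values, the β default, and the edge-dictionary layout are illustrative assumptions, not the paper's exact settings.

```python
# Sketch of CatRAG-style query-aware edge re-weighting. The tiers are
# assigned by an LLM (omitted here); multiplier values are assumptions.
TIER_MULTIPLIERS = {
    "Irrelevant": 0.1,  # suppress distracting paths
    "Weak": 0.5,
    "High": 1.5,
    "Direct": 2.0,      # amplify edges aligned with the query intent
}

def reweight_edges(static_weights, llm_tiers, key_fact_passages, beta=0.5):
    """Build query-specific edge weights (the entries of T_hat_q).

    static_weights:    {(src, dst): w} static KG edge weights
    llm_tiers:         {(src, dst): tier} tiers for edges that survived the
                       coarse top-K similarity filter (others keep w)
    key_fact_passages: passage nodes supported by verified key-fact triples
    beta:              multiplicative boost for key-fact context edges
    """
    adjusted = {}
    for (src, dst), w in static_weights.items():
        tier = llm_tiers.get((src, dst))
        if tier is not None:
            w = w * TIER_MULTIPLIERS[tier]
        # Key-Fact Passage Weight Enhancement: boost context edges into
        # passages containing verified evidence, at zero extra LLM cost.
        if dst in key_fact_passages:
            w = w * (1.0 + beta)
        adjusted[(src, dst)] = w
    return adjusted
```

For example, an edge tiered "Direct" would have its static weight doubled, while an edge into a key-fact passage additionally gains the (1 + β) factor; the adjusted weights are then row-normalized into the transition matrix used by PPR.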
The unified retrieval process runs standard PPR on the dynamically adjusted graph, producing a stationary distribution that ranks passages. Experiments on four multi‑hop benchmarks—MuSiQue, 2WikiMultiHopQA, HotpotQA, and HoVer—show that CatRAG consistently improves standard recall by a modest 1‑3 percentage points but dramatically lifts Full Chain Retrieval by 10‑15 percentage points over the strongest baselines, including HippoRAG2, LightRAG, and dense retrievers. Ablation studies confirm that each module contributes: Symbolic Anchoring alone reduces drift into hub nodes by ~30%; removing Dynamic Edge Weighting collapses full‑chain recovery by ~40%; and Key‑Fact Enhancement adds a further ~5% gain at zero extra cost. Importantly, the two‑stage weighting scheme keeps the number of LLM inferences low, preserving latency comparable to static graph methods.
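A minimal power-iteration sketch of this unified step, with Symbolic Anchoring folded into the personalization vector, might look like the following. The small per-anchor reset mass ε mirrors the description above; the damping factor, iteration count, and the split of reset mass between seeds and anchors are illustrative assumptions.

```python
# PPR over the query-adjusted transition matrix, with weak symbolic anchors
# added to the reset (personalization) vector. Parameter values are
# assumptions for illustration, not the paper's configuration.
import numpy as np

def anchored_ppr(T_hat_q, seeds, anchors, eps=0.05, damping=0.85, iters=100):
    """T_hat_q: row-stochastic query-adjusted transition matrix (n x n).
    seeds:   indices of the usual retrieval seeds (e.g. matched facts)
    anchors: indices of query entities injected as weak anchors
    eps:     small reset probability assigned to each symbolic anchor
    """
    n = T_hat_q.shape[0]
    # Personalization: bulk of reset mass on seeds, a small eps per anchor,
    # so the walk is repeatedly nudged back toward the query's entities.
    p = np.zeros(n)
    p[seeds] = (1.0 - eps * len(anchors)) / len(seeds)
    p[anchors] += eps
    pi = p.copy()
    for _ in range(iters):
        pi = (1 - damping) * p + damping * (pi @ T_hat_q)
    return pi  # stationary distribution used to rank passage nodes
```

The final ranking simply sorts passage nodes by their mass in the returned distribution, so the entire query-adaptive retrieval remains a single PPR pass.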
In summary, CatRAG demonstrates that the rigidity of a static transition matrix is a fundamental bottleneck for graph‑based RAG. By introducing query‑conditioned edge re‑weighting, weak symbolic anchors, and cost‑free passage boosting, the framework achieves robust multi‑hop evidence retrieval while maintaining efficiency. The authors release code and data at https://github.com/kwunhang/CatRAG, and suggest future directions such as integrating richer attribute‑value triples, employing lightweight LLMs for real‑time edge scoring, and incorporating user feedback for online adaptation of the transition matrix.