Both Topology and Text Matter: Revisiting LLM-guided Out-of-Distribution Detection on Text-attributed Graphs

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Text-attributed graphs (TAGs) associate nodes with textual attributes and graph structure, enabling GNNs to jointly model semantic and structural information. While effective on in-distribution (ID) data, GNNs often encounter out-of-distribution (OOD) nodes with unseen textual or structural patterns in real-world settings, leading to overconfident and erroneous predictions in the absence of reliable OOD detection. Early approaches address this issue from a topology-driven perspective, leveraging neighboring structures to mitigate node-level detection bias. However, these methods typically encode node texts as shallow vector features, failing to fully exploit rich semantic information. In contrast, recent LLM-based approaches generate pseudo OOD priors by leveraging textual knowledge, but they suffer from several limitations: (1) a reliability-informativeness imbalance in the synthesized OOD priors, as the generated OOD exposures either deviate from the true OOD semantics or introduce non-negligible ID noise, both of which offer limited improvement in detection performance; (2) reliance on specialized architectures, which prevents them from incorporating the effective topology-level insights empirically validated in prior work. To this end, we propose LG-Plug, an LLM-Guided Plug-and-play strategy for TAG OOD detection. LG-Plug aligns topology and text representations to produce fine-grained node embeddings, then generates consensus-driven OOD exposure via clustered iterative LLM prompting. Moreover, it leverages a lightweight in-cluster codebook and heuristic sampling to reduce the time cost of LLM querying. The resulting OOD exposure serves as a regularization term that separates ID and OOD nodes, enabling seamless integration with existing detectors.


💡 Research Summary

The paper addresses the critical problem of out‑of‑distribution (OOD) detection on text‑attributed graphs (TAGs), where each node carries a textual description and edges encode relational structure. While graph neural networks (GNNs) excel on in‑distribution (ID) tasks, they often produce over‑confident predictions on OOD nodes that exhibit unseen textual or structural patterns, posing safety risks in real‑world applications such as cyber‑threat intelligence or biomedical citation networks.

Existing solutions fall into two families. Topology‑driven methods (e.g., NodeSAFE, GRASP, GKDE) exploit graph structure to propagate OOD scores but treat node texts as shallow bag‑of‑words vectors, thus ignoring rich semantic cues. Recent large‑language‑model (LLM) based approaches (LLMGuard, GLIP‑OOD, GOE‑LLM) synthesize pseudo‑OOD priors by prompting LLMs, yet they suffer from a reliability‑informativeness trade‑off: reliability‑oriented prompts generate semantically unrealistic OOD samples, while informativeness‑oriented prompts risk mislabeling true ID nodes as OOD. Moreover, these LLM‑centric methods are tightly coupled with custom architectures, making them incompatible with the well‑established topology‑driven detectors.

To bridge this gap, the authors propose LG‑Plug, a plug‑and‑play framework that harmoniously combines topology and text while remaining compatible with any existing graph OOD detector. LG‑Plug consists of three core components:

  1. Topology‑Text Representation Alignment – A 2‑layer Graph Convolutional Network (GCN) encodes structural information into node embeddings (z_i), while a Transformer encoder processes raw texts into embeddings (h_i). A contrastive loss aligns the two modalities at the node level, and an additional edge‑level alignment term preserves neighborhood consistency. This yields fine‑grained, modality‑aligned embeddings that already separate many ID and OOD nodes.

  2. Consensus‑Driven OOD Exposure – The aligned embeddings are clustered (e.g., K‑means) to group nodes with similar semantic‑structural patterns. For each cluster, the most central node is selected and sent to an LLM with a prompt asking for a “generic OOD label and description for this cluster.” The LLM’s response is stored in a lightweight category codebook (cluster → OOD label/description). Subsequent nodes in the same cluster reuse the codebook entry; only ambiguous or low‑confidence nodes trigger additional LLM queries. This iterative “cluster → LLM → codebook” loop produces OOD priors that are both reliable (they do not overlap with ID classes) and informative (they reflect realistic, domain‑relevant semantics). Heuristic sampling (based on centrality and entropy) dramatically reduces the number of LLM calls, achieving >70 % cost savings.

  3. Plug‑and‑Play Regularization – The generated OOD priors are incorporated as an auxiliary loss term that pushes ID node embeddings toward an ID centroid and OOD node embeddings toward an OOD centroid, often realized as a margin‑based contrastive loss. This regularizer is added to the original loss of any topology‑driven OOD detector, requiring no architectural changes.
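The margin-based regularizer described in component 3 might look like the following minimal sketch. The function name `ood_regularizer`, the centroid formulation, and the hinge term are illustrative assumptions; the paper's exact loss may differ:

```python
import math

def centroid(vecs):
    """Element-wise mean of a list of equal-length vectors."""
    d = len(vecs[0])
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(d)]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def ood_regularizer(id_emb, ood_emb, margin=1.0):
    """Pull ID embeddings toward the ID centroid and OOD embeddings toward
    the OOD centroid, while a hinge term pushes the two centroids at least
    `margin` apart (a common margin-based contrastive formulation)."""
    mu_id, mu_ood = centroid(id_emb), centroid(ood_emb)
    pull = (sum(sq_dist(x, mu_id) for x in id_emb) / len(id_emb)
            + sum(sq_dist(x, mu_ood) for x in ood_emb) / len(ood_emb))
    push = max(0.0, margin - math.sqrt(sq_dist(mu_id, mu_ood)))
    return pull + push
```

In a full pipeline this term would simply be weighted and added to the host detector's own loss (e.g. `total = detector_loss + lam * ood_regularizer(...)`), which is what makes the design architecture-agnostic.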

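The node-level alignment in component 1 can be sketched as an InfoNCE-style contrastive objective, where each structural embedding should be most similar to its own text embedding among all texts in the batch. This is a hedged illustration under that assumption, not the paper's exact formulation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def alignment_loss(Z, H, tau=0.5):
    """InfoNCE-style node-level alignment: Z are GCN (structural)
    embeddings, H are Transformer (text) embeddings; the positive pair
    for z_i is its own h_i, all other texts serve as negatives."""
    loss = 0.0
    for i, z in enumerate(Z):
        logits = [cosine(z, h) / tau for h in H]
        m = max(logits)  # log-sum-exp trick for numerical stability
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += -(logits[i] - log_denom)
    return loss / len(Z)
```

Correctly paired embeddings should yield a lower loss than mismatched ones; an edge-level variant, as mentioned above, would additionally apply the same idea to neighboring pairs.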
The authors evaluate LG‑Plug on six benchmark TAG datasets (Cora, Citeseer, PubMed, ogbn‑arxiv, DBLP‑Citation, and a cyber‑threat graph). They integrate LG‑Plug with several state‑of‑the‑art topology‑driven detectors (NodeSAFE, GRASP, GKDE) and with LLM‑based baselines (LLMGuard, GLIP‑OOD, GOE‑LLM). Results are reported using standard OOD metrics: FPR@95% TPR (FPR95), AUROC, and AUPR. LG‑Plug consistently reduces FPR95 by at least 7 % when combined with topology‑driven methods and by at least 5 % compared to LLM baselines. AUROC improvements average 3–4 %. Ablation studies confirm that (i) removing the alignment stage degrades performance by ~4 %, (ii) varying the number of clusters shows an optimal range around 30–40 clusters, and (iii) eliminating the codebook forces direct LLM prompting for every node, inflating query cost by 2.5× with only marginal performance loss.
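The query-cost savings from the "cluster → LLM → codebook" loop of component 2 follow from a simple invariant: one LLM call per cluster rather than per node. The sketch below assumes hypothetical helpers (`mock_llm_label` stands in for a real LLM prompt, and the ambiguity-triggered re-querying is omitted):

```python
def mock_llm_label(text):
    """Stand-in for an actual LLM prompt asking for a generic OOD
    label/description for a cluster (hypothetical helper)."""
    return f"OOD-category-for:{text[:20]}"

def build_codebook(nodes, clusters, centrality):
    """One LLM query per cluster: the most central node represents its
    cluster, and the LLM's answer is cached in a codebook keyed by
    cluster id. All other cluster members reuse the cached entry."""
    codebook, queries = {}, 0
    for cid, members in clusters.items():
        rep = max(members, key=lambda n: centrality[n])  # heuristic pick
        codebook[cid] = mock_llm_label(nodes[rep])
        queries += 1
    return codebook, queries
```

With, say, 100 nodes partitioned into 5 clusters, this issues 5 queries instead of 100, which is the mechanism behind the reported cost reduction (the exact savings depend on cluster granularity and how often low-confidence nodes trigger extra queries).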

Key strengths of LG‑Plug include: (a) a principled way to balance reliability and informativeness of synthetic OOD data, (b) a lightweight, modular design that can be dropped into any existing graph OOD pipeline, and (c) substantial reduction in LLM query overhead through clustering and codebook reuse. Limitations are also acknowledged: dependence on LLM quality (hallucinations can still occur), sensitivity to clustering hyper‑parameters, and evaluation limited to static benchmark graphs rather than truly streaming or dynamic settings.

In conclusion, the paper demonstrates that jointly aligning topology and text, then leveraging LLMs in a consensus‑driven, cost‑effective manner, yields superior OOD detection on TAGs. Future work is suggested in three directions: (1) exploring LLM‑free or lightweight LLM alternatives to further cut inference cost, (2) extending the framework to multimodal graphs that include images or audio alongside text, and (3) developing online mechanisms for dynamic cluster updating and codebook maintenance to handle real‑time graph streams.

