Spatio-Temporal Graphs Beyond Grids:
Benchmark for Maritime Anomaly Detection
Jeehong Kim∗
Graduate School of Data Science
Seoul National University
williamkim10@snu.ac.kr
Youngseok Hwang∗
Graduate School of Data Science
Seoul National University
yshwang35@snu.ac.kr
Minchan Kim
Graduate School of Data Science
Seoul National University
mmm5373@snu.ac.kr
Sungho Bae
Graduate School of Data Science
Seoul National University
sunghobae@snu.ac.kr
Hyunwoo Park
Graduate School of Data Science
Seoul National University
hyunwoopark@snu.ac.kr
Abstract
Spatio-temporal graph neural networks (ST-GNNs) have achieved notable success
in structured domains such as road traffic and public transportation, where spatial
entities can be naturally represented as fixed nodes. In contrast, many real-world
systems, including maritime traffic, lack such fixed anchors, making the construction
of spatio-temporal graphs a fundamental challenge. Anomaly detection in these
non-grid environments is particularly difficult due to the absence of canonical
reference points, the sparsity and irregularity of trajectories, and the fact that
anomalies may manifest at multiple granularities. In this work, we introduce a novel
benchmark dataset for anomaly detection in the maritime domain, extending the
Open Maritime Traffic Analysis Dataset (OMTAD) into a benchmark tailored for
graph-based anomaly detection. Our dataset enables systematic evaluation across
three different granularities: node-level, edge-level, and graph-level anomalies. We
plan to employ two specialized LLM-based agents, Trajectory Synthesizer and
Anomaly Injector, to construct richer interaction contexts and generate semantically
meaningful anomalies. We expect this benchmark to promote reproducibility and to
foster methodological advances in anomaly detection for non-grid spatio-temporal
systems.
1 Introduction
Spatio-temporal graph neural networks (ST-GNNs) have been extensively studied in domains such as
road traffic forecasting and public transportation systems [19, 3, 2]. A common characteristic of these
applications is that the underlying spatial entities, such as road intersections, bus stops, or subway stations,
can be naturally defined as fixed nodes. This inherent grid-like structure makes the construction of
spatio-temporal graphs straightforward and facilitates the modeling of both spatial dependencies and
temporal dynamics. Consequently, anomaly detection in such structured environments has received
significant attention and demonstrated promising results [1].
∗Equal contribution (co-first authors).
39th Conference on Neural Information Processing Systems (NeurIPS 2025) Workshop: AI for Science: The
Reach and Limits of AI for Scientific Discovery.
arXiv:2512.20086v1 [cs.LG] 23 Dec 2025
However, many real-world and scientific settings do not
conform to these assumptions. In particular, there exist domains where fixed spatial anchors are
absent or physically ambiguous. The maritime environment represents one of the most prominent
examples: unlike road traffic systems, the open sea does not provide natural fixed nodes such as
intersections or road segments. Although artificial proxies such as waypoints, port coordinates, or
grid discretizations can be imposed, these methods are often ad hoc and fail to capture the continuous
and dynamic nature of vessel trajectories. This fundamental challenge renders the construction of a
meaningful spatio-temporal graph a non-trivial task. We expect that such non-grid spatio-temporal
systems will become increasingly common, not only in maritime monitoring but also in emerging
domains such as drone swarms and aerial traffic management.
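The contrast between an imposed grid and the continuous positions of vessels can be illustrated with a minimal sketch. It takes one time snapshot and compares a fixed-grid proxy with a distance-based graph in which the vessels themselves are the nodes. The function names, the 0.1-degree cell size, the 5 km radius, and the coordinates are illustrative assumptions, not part of the benchmark or of OMTAD.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def grid_cell(pos, cell_deg=0.1):
    """Fixed-grid proxy: snap a (lat, lon) position to an integer cell index."""
    return (math.floor(pos[0] / cell_deg), math.floor(pos[1] / cell_deg))

def proximity_edges(snapshot, radius_km=5.0):
    """Vessels are nodes; an edge links two vessels closer than radius_km."""
    return [(u, v) for u, v in combinations(sorted(snapshot), 2)
            if haversine_km(snapshot[u], snapshot[v]) <= radius_km]

# Hypothetical snapshot: vessel id -> (lat, lon) at a single timestamp.
snapshot = {
    "A": (35.099, 129.05),
    "B": (35.101, 129.05),  # a few hundred metres from A, across a cell boundary
    "C": (35.500, 129.55),  # tens of kilometres away
}

cells = {vessel: grid_cell(pos) for vessel, pos in snapshot.items()}
edges = proximity_edges(snapshot)
```

In this toy snapshot, vessels A and B fall into different grid cells although they are close together, while the proximity graph links them directly. The radius is itself a modeling choice, so this sketch only shows one possible alternative to a fixed grid.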
Performing anomaly detection in these settings is even more challenging. First, the lack of fixed
spatial anchors complicates the definition of normal versus abnormal interactions among moving
entities. Second, the inherent sparsity and irregularity of the trajectories make it difficult to design
robust models. Third, anomalous patterns may manifest at multiple levels: individual entities (node-
level anomalies), unusual pairwise interactions (edge-level anomalies), or entire subgroups behaving
abnormally (graph-level anomalies). These challenges highlight the need for systematic benchmarks
that enable rigorous evaluation and foster methodological innovations [7]. There are several Marine
datasets
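The three anomaly levels described above can be made concrete with a small data-structure sketch. It assumes a hypothetical schema in which each graph snapshot carries per-node, per-edge, and whole-graph labels; the class and field names are illustrative and do not reflect the actual format of the benchmark.

```python
from dataclasses import dataclass, field

@dataclass
class LabeledSnapshot:
    """One graph snapshot with anomaly labels at three granularities (hypothetical schema)."""
    nodes: list                                        # vessel identifiers
    edges: list                                        # (u, v) pairs of interacting vessels
    node_anomaly: dict = field(default_factory=dict)   # vessel id -> bool
    edge_anomaly: dict = field(default_factory=dict)   # (u, v) -> bool
    graph_anomaly: bool = False                        # label for the whole snapshot

    def anomalous_nodes(self):
        """Vessels flagged for abnormal individual behavior."""
        return [n for n in self.nodes if self.node_anomaly.get(n, False)]

    def anomalous_edges(self):
        """Vessel pairs flagged for unusual pairwise interaction."""
        return [e for e in self.edges if self.edge_anomaly.get(e, False)]

# Toy example: one abnormal vessel and one unusual pairwise interaction,
# while the snapshot as a whole is not labeled anomalous.
snap = LabeledSnapshot(
    nodes=["A", "B", "C"],
    edges=[("A", "B"), ("B", "C")],
    node_anomaly={"C": True},
    edge_anomaly={("A", "B"): True},
    graph_anomaly=False,
)
```

Keeping the three label sets separate lets a detector be scored at each granularity independently, which is one way the levels could be evaluated side by side.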
To address this gap, in this paper we introduce a novel benchmark dataset for anomaly detection in the
maritime domain. Our dataset is designed to support anomaly detection tasks at three granularities:
(i) node-level anomalies, capturing abnormal single-entity behaviors, (ii) edge-level anomalies,
reflecting irregular inter-entity interactions, and (iii) graph-level anomalies, identifying collective
abnormal events. Inspired by recent advances in graph anomaly detection across node-, edge-, and
graph-level settings [5], we aim to provide a unified testbed that allows the communi