Accurate and Fast Estimation of Temporal Motifs using Path Sampling

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv paper.

Counting the number of small subgraphs, called motifs, is a fundamental problem in social network analysis and graph mining. Many real-world networks are directed and temporal, where edges have timestamps. Motif counting in directed, temporal graphs is especially challenging because there are a plethora of different kinds of patterns. Temporal motif counts reveal much richer information and there is a need for scalable algorithms for motif counting. A major challenge in counting is that there can be trillions of temporal motif matches even with a graph with only millions of vertices. Both the motifs and the input graphs can have multiple edges between two vertices, leading to a combinatorial explosion problem. Counting temporal motifs involving just four vertices is not feasible with current state-of-the-art algorithms. We design an algorithm, TEACUPS, that addresses this problem using a novel technique of temporal path sampling. We combine a path sampling method with carefully designed temporal data structures, to propose an efficient approximate algorithm for temporal motif counting. TEACUPS is an unbiased estimator with provable concentration behavior, which can be used to bound the estimation error. For a Bitcoin graph with hundreds of millions of edges, TEACUPS runs in less than 1 minute, while the exact counting algorithm takes more than a day. We empirically demonstrate the accuracy of TEACUPS on large datasets, showing an average of 30$\times$ speedup (up to 2000$\times$ speedup) compared to existing GPU-based exact counting methods while preserving high count estimation accuracy.

💡 Research Summary

The paper introduces TEACUPS (Temporal Explorations Accurately Counted Using Path Sampling), a novel algorithm for estimating the counts of temporal motifs in directed, timestamped graphs. Temporal motifs are small subgraph patterns that must respect both edge direction and the chronological order of events within a bounded time window δ. Existing exact counting methods become infeasible even for modestly sized graphs because the number of possible motif matches can reach trillions, especially when the input graph contains multiple edges between the same pair of vertices (multigraphs). According to the paper, counting temporal motifs on just four vertices is already out of reach for current state‑of‑the‑art algorithms, and motifs that are themselves multigraphs make the combinatorial explosion worse.
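To make the notion of a temporal motif match concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm, only a reference counter for tiny inputs. It assumes a common convention that the summary does not spell out: the edges of a match must have strictly increasing timestamps, and the last edge must occur at most δ after the first. The function name `count_temporal_3cycles` and the `(src, dst, t)` tuple layout are illustrative choices.

```python
from itertools import permutations


def count_temporal_3cycles(edges, delta):
    """Count matches of the directed cycle u -> v -> w -> u whose three
    edges appear in increasing time order within a window of length delta.

    edges: list of (src, dst, t) tuples; parallel edges are allowed and
    are treated as distinct because permutations() works by position.
    """
    count = 0
    for (u1, v1, t1), (u2, v2, t2), (u3, v3, t3) in permutations(edges, 3):
        in_order = t1 < t2 < t3 and t3 - t1 <= delta      # temporal constraint
        closes = v1 == u2 and v2 == u3 and v3 == u1       # direction constraint
        distinct = len({u1, u2, u3}) == 3                 # three different vertices
        if in_order and closes and distinct:
            count += 1
    return count


if __name__ == "__main__":
    toy = [(0, 1, 1), (1, 2, 2), (2, 0, 3), (2, 0, 4), (2, 0, 50)]
    print(count_temporal_3cycles(toy, delta=10))
```

The cubic loop over edge triples shows why exact enumeration stops scaling: every extra parallel edge between two vertices multiplies the number of matches that have to be visited.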

The core idea of TEACUPS is to sample “δ‑centered 3‑paths” rather than enumerating all possible subgraphs. A δ‑centered 3‑path consists of three edges (e1, e2, e3) where e2 is the central edge, e1 is incident to the source of e2, e3 is incident to the target of e2, and both e1 and e3 occur within δ time units of e2. No direct time constraint is imposed between e1 and e3, allowing them to overlap or even share vertices, which is essential for handling multigraphs. By focusing on these structures, the algorithm can quickly generate a representative sample of the underlying temporal topology while preserving the ordering constraints required by motifs.
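The definition can be sketched as a small predicate plus a brute-force enumerator. This is an illustration under stated assumptions, not the paper's code: "incident" is read as "shares that endpoint, in either direction", "within δ" as an absolute timestamp difference of at most δ, and the center edge is not allowed to play the role of e1 or e3. The `Edge` type and both function names are hypothetical.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    src: int
    dst: int
    t: int


def touches(edge, vertex):
    """True if the edge has `vertex` as either endpoint."""
    return vertex in (edge.src, edge.dst)


def is_delta_centered_3path(e1, e2, e3, delta):
    """e2 is the center; e1 hangs off its source, e3 off its target.
    Only the distance to the center's timestamp is constrained, so
    e1 and e3 may be arbitrarily ordered relative to each other."""
    near_source = touches(e1, e2.src) and abs(e1.t - e2.t) <= delta
    near_target = touches(e3, e2.dst) and abs(e3.t - e2.t) <= delta
    return near_source and near_target


def enumerate_3paths(edges, delta):
    """Return index triples (i, j, k) with edges[j] as the center edge.
    i == k is allowed (an edge parallel to the center touches both ends)."""
    paths = []
    for j, e2 in enumerate(edges):
        for i, e1 in enumerate(edges):
            for k, e3 in enumerate(edges):
                if i != j and k != j and is_delta_centered_3path(e1, e2, e3, delta):
                    paths.append((i, j, k))
    return paths


if __name__ == "__main__":
    toy = [Edge(0, 1, 10), Edge(1, 2, 12), Edge(2, 3, 15),
           Edge(0, 2, 30), Edge(3, 0, 11)]
    print(enumerate_3paths(toy, delta=5))
```

In the toy graph the edge at time 30 takes part in no path, because nothing else touching its endpoints lies within 5 time units of it.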

TEACUPS first computes a sampling weight $w_{e,\delta}$ for each edge $e$, reflecting the number of δ‑centered 3‑paths that have $e$ as their central edge. The sum of these weights, $W_\delta$, is the total number of such paths and is used to normalize the weights, ensuring that each sampled 3‑path yields an unbiased estimate of the total motif count. After randomly selecting a 3‑path according to these weights, the algorithm determines how many instances of the target motif can be formed by extending the sampled path. This extension step runs in time linear in the multiplicity of the involved edges, so even when thousands of parallel edges exist between two vertices, the cost remains manageable.
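A simplified sketch of the weighting and sampling step follows. It makes several assumptions that go beyond the summary: the weight of an edge counts the 3‑paths centered at that edge, direction and motif-specific ordering constraints are ignored, the graph has no self-loops, and each motif instance is assumed to contain exactly one sampled-shape 3‑path so that scaling by the total weight is unbiased. All function names are hypothetical, and the motif-specific extension step is left to the caller.

```python
import bisect
import random
from collections import defaultdict


def incident_times(edges):
    """Map each vertex to the sorted timestamps of all edges touching it."""
    times = defaultdict(list)
    for u, v, t in edges:
        times[u].append(t)
        times[v].append(t)
    for ts in times.values():
        ts.sort()
    return times


def count_in_window(sorted_ts, lo, hi):
    """Number of timestamps t with lo <= t <= hi, found by binary search."""
    return bisect.bisect_right(sorted_ts, hi) - bisect.bisect_left(sorted_ts, lo)


def center_weights(edges, delta):
    """Weight of edge (u, v, t) = (#other edges at u within delta of t)
    * (#other edges at v within delta of t)."""
    times = incident_times(edges)
    weights = []
    for u, v, t in edges:
        # Subtract 1 per side so the center edge is not paired with itself.
        left = count_in_window(times[u], t - delta, t + delta) - 1
        right = count_in_window(times[v], t - delta, t + delta) - 1
        weights.append(left * right)
    return weights


def sample_centers(weights, k, seed=0):
    """Draw k center-edge indices with probability weight / total weight."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=k)


def estimate_total(total_weight, per_sample_counts):
    """Scale the mean per-sample motif count by the number of 3-paths."""
    return total_weight * sum(per_sample_counts) / len(per_sample_counts)


if __name__ == "__main__":
    toy = [(0, 1, 10), (1, 2, 12), (2, 3, 15), (0, 2, 30), (3, 0, 11)]
    w = center_weights(toy, delta=5)
    print(w, sum(w), sample_centers(w, k=5))
```

Sorting timestamps per vertex once lets every weight be computed with two binary searches per endpoint, which is the kind of temporal indexing that keeps preprocessing close to linear in the number of edges.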

The authors provide rigorous theoretical guarantees. They prove that the estimator is unbiased and use concentration inequalities (Chebyshev, Markov, and Chernoff bounds) to bound the estimation error and derive the number of samples needed for a desired relative error ε at confidence level 1‑η, where η is the allowed failure probability (written η here to avoid confusion with the time window δ). Consequently, practitioners can control the trade‑off between runtime and accuracy by adjusting the number of sampled paths.
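As a rough illustration of how a sample-size requirement of this kind arises, the textbook Chebyshev argument is written out below. It is not the paper's exact bound. It assumes $X_1,\dots,X_k$ are independent, unbiased single-sample estimates with mean $C$ (the true motif count) and variance $\sigma^2$; the symbol $\eta$ denotes the failure probability and is kept distinct from the time window $\delta$.

```latex
\[
\bar{X} = \frac{1}{k}\sum_{i=1}^{k} X_i ,
\qquad
\Pr\!\left[\,\bigl|\bar{X} - C\bigr| \ge \varepsilon C\,\right]
\;\le\; \frac{\operatorname{Var}[\bar{X}]}{\varepsilon^{2} C^{2}}
\;=\; \frac{\sigma^{2}}{k\,\varepsilon^{2} C^{2}} .
\]
\[
\text{Hence } k \;\ge\; \frac{\sigma^{2}}{\eta\,\varepsilon^{2} C^{2}}
\text{ samples suffice for relative error } \varepsilon
\text{ with probability at least } 1-\eta .
\]
```

The bound depends on the ratio $\sigma^2 / C^2$, so the fewer motif instances each sampled path can be extended to in the worst case relative to the average, the fewer samples are needed.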

Empirical evaluation spans several large real‑world datasets, including a Bitcoin transaction graph with over 110 million temporal edges, Reddit comment threads, and StackOverflow activity logs. For four‑vertex motifs such as directed 4‑cycles, 4‑cliques, and multigraph variants (e.g., motifs with two parallel edges per vertex pair), TEACUPS achieves an average speedup of 30× (up to 2000×) compared to the state‑of‑the‑art exact GPU implementation (Everest) and outperforms prior approximate methods (PRESTO, Edge‑Sampling) both in runtime and accuracy. On the Bitcoin graph, TEACUPS finishes in under one minute while the exact algorithm requires more than a day; the relative error across all tested motifs stays below 10 % in most cases, often under 5 %. Moreover, the algorithm scales near‑linearly with the number of CPU threads, demonstrating that it can be deployed on commodity multi‑core machines without specialized hardware.

The paper’s contributions are fourfold: (1) introduction of temporal path sampling as a general technique for temporal motif estimation, (2) the first algorithm capable of efficiently estimating counts of motifs that are multigraphs, (3) provable unbiasedness with explicit error bounds, and (4) extensive experimental validation showing both speed and accuracy advantages. The authors also discuss extending the approach beyond three‑edge paths to larger spanning‑tree structures, which would enable efficient estimation of motifs on five or more vertices; this direction is left for future work. Overall, TEACUPS represents a significant step forward in making temporal motif analysis practical for massive, real‑world dynamic networks.
