BRAVA-GNN: Betweenness Ranking Approximation Via Degree Mass Inspired Graph Neural Network
Computing node importance in networks is a long-standing fundamental problem that has driven extensive study of various centrality measures. A particularly well-known centrality measure is betweenness centrality, which becomes computationally prohibitive on large-scale networks. Graph Neural Network (GNN) models have thus been proposed to predict node rankings according to their relative betweenness centrality. However, state-of-the-art methods fail to generalize to high-diameter graphs such as road networks. We propose BRAVA-GNN, a lightweight GNN architecture that leverages the empirically observed correlation linking betweenness centrality to degree-based quantities, in particular multi-hop degree mass. This correlation motivates the use of degree masses as size-invariant node features and synthetic training graphs that closely match the degree distributions of real networks. Furthermore, while previous work relies on scale-free synthetic graphs, we leverage the hyperbolic random graph model, which reproduces power-law exponents outside the scale-free regime, better capturing the structure of real-world graphs like road networks. This design enables BRAVA-GNN to generalize across diverse graph families while using 54x fewer parameters than the most lightweight existing GNN baseline. Extensive experiments on 19 real-world networks, spanning social, web, email, and road graphs, show that BRAVA-GNN achieves up to 214% improvement in Kendall-Tau correlation and up to 70x speedup in inference time over state-of-the-art GNN-based approaches, particularly on challenging road networks.
💡 Research Summary
The paper addresses the long‑standing challenge of estimating betweenness centrality (BC) on large‑scale graphs. Exact computation via Brandes’ algorithm requires O(|V||E|) time, which is infeasible for networks with millions of nodes. Recent attempts to use graph neural networks (GNNs) for BC approximation have shown promise, yet they fail to generalize to high‑diameter graphs such as road networks.
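For reference, Brandes' algorithm mentioned above can be sketched in pure Python for unweighted, undirected graphs. This is a textbook rendering of the algorithm (not the authors' code); the O(|V||E|) cost comes from one BFS plus one dependency back-propagation per source node.

```python
from collections import deque

def brandes_bc(adj):
    """Exact betweenness centrality via Brandes' algorithm.

    adj: adjacency list of an unweighted, undirected graph.
    Runs one BFS + dependency accumulation per source: O(|V||E|) total.
    """
    n = len(adj)
    bc = [0.0] * n
    for s in range(n):
        # Phase 1: BFS from s, counting shortest paths (sigma) and predecessors.
        sigma = [0] * n; sigma[s] = 1
        dist = [-1] * n; dist[s] = 0
        preds = [[] for _ in range(n)]
        stack, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Phase 2: accumulate dependencies in reverse BFS order.
        delta = [0.0] * n
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair is counted from both endpoints; halve for undirected graphs.
    return [x / 2 for x in bc]
```

On a 3-node path `0-1-2`, only the middle node lies on a shortest path between distinct endpoints, so `brandes_bc([[1], [0, 2], [1]])` returns `[0.0, 1.0, 0.0]`.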
The authors make two key observations. First, multi‑hop degree mass—a cumulative count of a node’s neighbors within m hops—correlates strongly with BC across a wide range of real‑world graphs. Formally, the m‑th order degree mass d^(m) = Σ_{k=0}^{m} A^k d (where A is the adjacency matrix and d the degree vector) captures the “local reach” of a node, and empirical studies have shown that nodes with larger degree mass tend to lie on many shortest paths. Second, existing synthetic training data are usually generated from scale‑free models (e.g., Barabási‑Albert), which only produce power‑law exponents γ≈2–3. Such models cannot mimic the degree distributions of road networks, where γ often exceeds 3 and clustering is low. To overcome this, the authors adopt the hyperbolic random graph (HRG) model, which places nodes in a hyperbolic disk and connects them based on hyperbolic distance. HRG allows independent control of the power‑law exponent, average degree, and temperature (clustering), thereby reproducing the structural signatures of diverse real graphs, including high‑diameter road maps.
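The degree-mass formula above translates directly into a few lines of numpy. The sketch below follows the stated definition d^(m) = Σ_{k=0}^{m} A^k d, where the k=0 term is the degree vector itself; function and variable names are ours, not the paper's.

```python
import numpy as np

def degree_mass(A, m):
    """m-th order degree mass: d^(m) = sum_{k=0}^{m} A^k d.

    A: dense adjacency matrix (numpy array), d = row sums = degree vector.
    Computed iteratively so no matrix power is ever materialized.
    """
    d = A.sum(axis=1).astype(float)   # k = 0 term: the degree vector
    mass, term = d.copy(), d.copy()
    for _ in range(m):
        term = A @ term               # next term A^k d via repeated mat-vec products
        mass += term
    return mass
```

On the 3-node path graph, degrees are [1, 2, 1] and the 1-hop mass adds each node's neighbor degrees, giving [3, 4, 3]; the endpoints tie while the middle node (the one with highest betweenness) scores highest, illustrating the correlation the paper exploits.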
BRAVA‑GNN (Betweenness Ranking Approximation Via Degree Mass GNN) is built around three design pillars: (1) a preprocessing step that removes nodes guaranteed not to belong to any shortest path (isolated/leaf nodes and nodes whose neighbors form a clique); (2) degree‑mass‑based node features, specifically the concatenation of 1‑ to 6‑hop degree masses, passed through a small dense layer to obtain an initial embedding; and (3) a dual‑direction message‑passing architecture. Two parallel GNN encoders share the same weight matrices but operate on the original adjacency matrix A and its transpose Aᵀ, thus processing incoming and outgoing information separately. Each layer updates the hidden states as H^{l+1}_in = ReLU(A H^{l}_in W^{l}) and H^{l+1}_out = ReLU(Aᵀ H^{l}_out W^{l}). After each layer, a multilayer perceptron (MLP) produces a scalar score; the final betweenness score for a node is the product of the accumulated inbound and outbound scores. This design respects the directed nature of shortest‑path counts while keeping the parameter count extremely low.
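The dual-direction update quoted above is simple to state in numpy. The sketch below implements exactly the two shared-weight equations (one layer only, without the per-layer MLP scoring head); it is an illustrative rendering under our naming, not the authors' implementation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def dual_direction_layer(A, H_in, H_out, W):
    """One shared-weight dual-direction message-passing step:

        H_in^{l+1}  = ReLU(A   H_in^{l}  W)   # aggregates over incoming edges
        H_out^{l+1} = ReLU(A^T H_out^{l} W)   # aggregates over outgoing edges

    Both directions reuse the same weight matrix W, which is what keeps
    the parameter count low.
    """
    return relu(A @ H_in @ W), relu(A.T @ H_out @ W)
```

On a single directed edge 0 → 1, the inbound branch propagates node 1's feature to node 0, while the transposed branch does the reverse, showing how the two encoders see complementary directions of the same graph.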
Training is performed on a large corpus of HRG‑generated graphs whose degree distributions are matched to those observed in the target domains. The loss function optimizes a pairwise ranking objective (e.g., a differentiable approximation of Kendall‑Tau) rather than absolute BC values, reflecting the practical use‑case where relative ordering matters more than exact scores.
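A pairwise ranking objective of the kind described can be sketched with a logistic surrogate over all node pairs. Note this particular surrogate is our illustrative assumption; the paper only states that a differentiable approximation of Kendall-Tau is used, and its exact form may differ.

```python
import numpy as np

def pairwise_ranking_loss(pred, target):
    """Logistic pairwise surrogate for a Kendall-Tau-style ranking objective.

    For every pair (i, j) with target[i] != target[j], penalize predictions
    whose ordering disagrees with the target ordering. Only relative order
    matters, not absolute BC values.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    diff_p = pred[:, None] - pred[None, :]           # predicted score gaps
    sign_t = np.sign(target[:, None] - target[None, :])  # target ordering (+1/-1/0)
    mask = sign_t != 0                               # skip ties and the diagonal
    # Concordant pairs (sign_t * diff_p > 0) incur a small loss; discordant
    # pairs a large one. log1p(exp(.)) is the smooth hinge making it differentiable.
    return np.mean(np.log1p(np.exp(-sign_t[mask] * diff_p[mask])))
```

A quick sanity check: predictions that preserve the target order yield a strictly lower loss than the same scores in reversed order.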
The experimental evaluation spans 19 real‑world datasets covering social, web, email, and road networks. Baselines include prior GNN‑based BC approximators, sampling‑based algorithms, parallel exact Brandes implementations, and a hyperbolic‑embedding baseline. Results show that BRAVA‑GNN achieves up to a 214% improvement in Kendall‑Tau correlation on road networks, where previous methods often collapse to near‑random performance. Across all datasets, the average Kendall‑Tau gain is 1.92×, and inference speed is accelerated by up to 70× compared with the strongest GNN baseline. Notably, the model uses 54× fewer parameters than the most lightweight existing GNN, demonstrating that the degree‑mass features and HRG training data provide sufficient expressive power without deep, heavyweight architectures.
Ablation studies confirm that (i) removing the dual‑direction encoders degrades performance on directed graphs, (ii) limiting degree‑mass to fewer hops reduces correlation, and (iii) training on conventional scale‑free graphs leads to a marked drop in accuracy on high‑diameter road networks, underscoring the importance of the HRG synthetic regime.
The authors acknowledge limitations: the degree‑mass feature is capped at six hops, which may miss long‑range structural cues in extremely sparse graphs; HRG parameter tuning can be dataset‑specific; and the preprocessing heuristic, while effective, may inadvertently affect BC values in pathological cases. Future work is suggested in three directions: extending the framework to dynamic graphs with temporal updates, enriching node features with additional structural descriptors (e.g., community centrality, eigenvector centrality), and automating HRG hyperparameter selection via meta‑learning or Bayesian optimization.
In summary, BRAVA‑GNN demonstrates that a lightweight GNN, equipped with theoretically motivated degree‑mass features and trained on hyperbolically generated synthetic graphs, can dramatically improve both accuracy and efficiency of betweenness centrality ranking, especially on challenging high‑diameter networks where prior learning‑based methods falter.