📝 Original Info
- Title:
- ArXiv ID: 2512.18826
- Date:
- Authors: Unknown
📝 Abstract
This survey reviews hyperbolic graph embedding models and evaluates them on anomaly detection, highlighting their advantages over Euclidean methods in capturing complex structures. Evaluating models such as HGCAE, P-VAE, and HGCN demonstrates high performance, with P-VAE achieving an F1-score of 94% on the Elliptic dataset and HGCAE scoring 80% on Cora. In contrast, Euclidean methods such as DOMINANT and GraphSage struggle with complex data. The study emphasizes the potential of hyperbolic spaces for improving anomaly detection and provides an open-source library to foster further research in this field.
📄 Full Content
• Detailed Review and Analysis: We present a thorough review of existing hyperbolic graph embedding techniques.
• Exploration of anomaly detection: We investigate the application of hyperbolic graph embeddings to anomaly detection, highlighting their advantages over traditional methods by evaluating them on established datasets.
• Development of an Open-Source Library: We present an open-source library that includes the main hyperbolic graph embedding methods and datasets: https://gitlab.liris.cnrs.fr/gladis/ghypeddings.
The remainder of this paper is structured as follows: Section 2 provides the necessary background on graph embedding and highlights the main methods in Euclidean space. Section 3 introduces hyperbolic space, covering key concepts in differential geometry and hyperbolic geometry. It also proposes a comprehensive taxonomy of hyperbolic graph embedding models, categorizing existing techniques into traditional methods and deep learning-based methods. Section 4 outlines the methodology we followed for using hyperbolic embeddings for the anomaly detection task and introduces the Ghypeddings library we built for this purpose. Section 5 describes the experimental setup, including datasets and evaluation metrics, and presents the experimental results, discussing the key findings. Finally, Section 6 concludes the paper and offers insights into potential future research directions.
In this section, we define the concepts of graphs and graph embedding. Then we present the commonly used techniques of graph embedding in the Euclidean space.
Graphs are powerful data structures used to model relationships among entities in various domains. However, their irregular structure presents challenges for conventional machine learning models designed to process tabular data. Graph embeddings address this issue by mapping graph data into a low-dimensional, continuous vector space that captures structural and attribute-based information. This transformation enables machine learning models to effectively utilize graph data for various tasks. Formally, let G = (V, E) represent a graph, where V is the set of nodes and E is the set of edges. A graph embedding is a mapping f : V → R^d, where each node v ∈ V is represented as a d-dimensional vector in Euclidean space. The objective is to ensure that geometric relations in the embedding space reflect structural or attribute-based similarities in the original graph. These embeddings are vital for tasks such as node classification, link prediction, graph clustering, and visualization. Graph embedding methods can be categorized based on their tasks and objectives. Supervised tasks involve predicting outputs using graph structure and node labels, such as node and graph classification. In contrast, unsupervised tasks leverage the graph’s inherent structure for self-supervision, enabling applications like link prediction, graph reconstruction, and clustering [10].
Graph embedding methods often operate at the node level, with edge and graph embeddings derived from node embeddings. Most techniques employ the Euclidean space R^n as the embedding space. The study of graph embeddings dates back to the early 2000s, when they were explored for dimensionality reduction tasks [8]. In this context, a graph is typically constructed from high-dimensional data points by connecting each point to its top k nearest neighbors. Pairwise node similarity matrices are then calculated, and nodes are mapped into a lower-dimensional space to preserve these pairwise relationships. For instance, Isomap [62] computes shortest path distances on the graph and applies multidimensional scaling (MDS) [34], which reconstructs a set of points from a matrix of pairwise distances. Numerous Euclidean graph embedding methods have since been proposed [26]. These methods are generally classified into three primary algorithmic categories:
1-Matrix Factorization-Based Methods: These methods decompose a similarity matrix into lower-dimensional matrices to learn node embeddings that preserve node relationships. Graph Factorization factorizes the adjacency matrix A to learn node embeddings such that the inner product of embeddings approximates the graph’s connectivity [1].
2-Random Walk-Based Methods: These methods generate stochastic sequences (random walks) over the graph and use these sequences to learn node embeddings. These methods are flexible, as they can model various types of node proximities by adjusting the random walk behavior.
3-Deep Learning-Based Methods: Graph Neural Networks (GNNs) have become central to graph embedding. GNNs learn node embeddings by iteratively aggregating information from a node’s neighborhood, allowing the model to capture complex structural dependencies within the graph. Graph Convolutional Networks (GCNs) [32] use a convolutional operation over the graph to aggregate information from a node’s neighbors and update its embedding. GraphSAGE (Graph Sample and Aggregation) [25], unlike GCNs, which require the full graph to be stored in memory, performs inductive learning by sampling a fixed-size neighborhood for each node. This allows the model to scale to large graphs. Graph Attention Networks (GATs) [66] introduce a self-attention mechanism to GNNs. In GATs, each node learns to assign attention coefficients to its neighbors, which indicate the importance of each neighbor’s information. Graph Autoencoder (GAE) models [31] adapt the autoencoder framework to graph-structured data. The encoder learns node embeddings, while the decoder reconstructs the graph (e.g., adjacency matrix) from the embeddings. Variational Graph Autoencoders (VGAEs) introduce probabilistic elements to the framework, allowing for uncertainty modeling in the embeddings.
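To make the neighborhood aggregation idea concrete, the following is a minimal PyTorch sketch of a single GCN-style layer with symmetric normalization; the class and variable names are illustrative only and not taken from any particular library.

```python
import torch
import torch.nn as nn

class SimpleGCNLayer(nn.Module):
    """Minimal GCN-style layer: H' = relu(D^{-1/2} (A + I) D^{-1/2} H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x, adj):
        # Add self-loops and symmetrically normalize the adjacency matrix.
        a_hat = adj + torch.eye(adj.size(0))
        deg = a_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt
        # Aggregate neighbor features, then apply the learnable transformation.
        return torch.relu(a_norm @ self.linear(x))

# Toy usage: 4 nodes, 3 input features, 2-dimensional embeddings.
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])
x = torch.randn(4, 3)
layer = SimpleGCNLayer(3, 2)
print(layer(x, adj).shape)  # torch.Size([4, 2])
```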
In this section, we move to the hyperbolic space. We first introduce the fundamental concepts of differential geometry. Then, we provide a detailed overview of the hyperbolic space and its commonly used models. Finally, we thoroughly review existing hyperbolic graph embedding techniques.
Definition 1 (Topological Space). Let X be a set, and τ be a collection of subsets of X. The pair (X, τ ) is said to be a topological space, and τ is called a topology on X if it satisfies three axioms: (1) The empty set ∅ and X belong to τ . (2) Any finite or infinite union of members of τ belongs to τ . (3) Any finite intersection of members of τ belongs to τ .
Definition 2 (Manifold). A manifold M of dimension n is a topological space in which the neighborhood of any point can be locally approximated by Euclidean space. For every point p ∈ M, there exists a pair (U, φ) called a coordinate chart, where U is a neighborhood of p, and φ is a homeomorphism (a continuous bijection with a continuous inverse) between U and an open subset of R^n. Manifolds are a higher-dimensional generalization of surfaces.
Definition 3 (Smooth Manifold). Given a manifold M equipped with a collection of coordinate charts that cover M, we say that M is smooth if, for every two coordinate charts (U_α, φ_α) and (U_β, φ_β) whose domains intersect, the transition map φ_β ∘ φ_α^{-1} : φ_α(U_α ∩ U_β) → φ_β(U_α ∩ U_β) and its inverse are C^∞ (infinitely differentiable). This property allows for the application of differential calculus on M.
Definition 4 (Tangent Space). Let M be an n-dimensional smooth manifold and p ∈ M. The tangent space of M at p, denoted by T_pM, is the collection of all possible tangent vectors at p. Intuitively, T_pM is the vector space that best approximates the manifold in a neighborhood of p.
Definition 5 (Riemannian Manifold). A Riemannian manifold (M, g) is a smooth manifold M equipped with a Riemannian metric g, which is a smoothly varying inner product g_p : T_pM × T_pM → R defined on the tangent space at each point p ∈ M. The n × n symmetric positive-definite matrix G(p) that satisfies g_p(u, v) = u^⊤ G(p) v is called the metric tensor of M at p.
Definition 6 (Geodesics). Geodesics generalize straight lines in Euclidean space to curved spaces. Formally, a geodesic is a constant-speed curve that locally minimizes the distance between two points. The distance between two points on the manifold is the length of the geodesic connecting them.
Definition 7 (Exponential Map). On a Riemannian manifold (M, g), the exponential map exp_p : T_pM → M projects a tangent vector v ∈ T_pM to the point on M reached by following the geodesic starting at p in the direction of v.
Definition 8 (Logarithmic Map). The logarithmic map log_p : M → T_pM is the inverse of the exponential map. It projects points from the manifold M back to the tangent space T_pM.
Definition 9 (Gaussian Curvature). The Gaussian curvature of a surface S at a point p ∈ S measures the deviation of S from being flat at p. It is a signed value where positive curvature corresponds to a spherical surface, negative curvature to a saddle-shaped surface, and zero curvature to a flat plane.
Definition 10 (Sectional Curvature). Let M be a Riemannian manifold and T_pM the tangent space of M at p ∈ M. The sectional curvature of M at p associated with a 2-dimensional plane Π_p ⊂ T_pM is defined as the Gaussian curvature of the surface that has the plane Π_p as a tangent plane at p. M is said to have constant sectional curvature if its sectional curvature is the same at every point and for all 2-dimensional planes Π_p ⊂ T_pM. The notion of sectional curvature generalizes Gaussian curvature to higher-dimensional Riemannian manifolds.
In this section, we briefly review the fundamental concepts of hyperbolic geometry. For a more detailed examination, refer to [37]. Hyperbolic geometry is a non-Euclidean geometry characterized by constant negative curvature, which contrasts with Euclidean geometry where the parallel postulate holds. This results in unique properties for lines, angles, and distances. The hyperbolic space of n dimensions is defined as a complete¹ and simply connected² n-dimensional Riemannian manifold with constant negative sectional curvature. This space can be represented using five well-known models. Each model is defined on a different subset of the real vector space, called its domain, and has its own metric tensor, geodesics, distance function, etc. [9]. We focus on three widely studied models of hyperbolic space, the Lorentz model, the Poincaré ball model, and the Klein model, each offering a different perspective but being mathematically equivalent. These models are defined as follows:
Lorentz Model. Also known as the hyperboloid model or the Minkowski model, the Lorentz model of n-dimensional hyperbolic space corresponds to the Riemannian manifold (L^n, g^L), where L^n is the upper sheet of a two-sheeted n-dimensional hyperboloid, that is,
L^n = {x ∈ R^{n+1} : ⟨x, x⟩_L = -1, x_1 > 0},
in which ⟨·, ·⟩_L denotes the Lorentzian inner product, defined for every x, y ∈ R^{n+1} as ⟨x, y⟩_L = -x_1 y_1 + Σ_{i=2}^{n+1} x_i y_i.
The metric tensor g^L_x is a diagonal matrix of size n + 1 whose main diagonal elements are all equal to 1, except for the first element, which is -1, i.e., g^L_x = diag(-1, 1, . . . , 1). The induced distance on the Lorentz manifold between two points x, y ∈ L^n is given by d_L(x, y) = arcosh(-⟨x, y⟩_L).
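As a small numerical illustration of the Lorentz model quantities above, here is a hedged sketch; the helper that lifts a Euclidean point onto L^n is ours, added only to build valid example points.

```python
import torch

def lorentz_inner(x, y):
    # <x, y>_L = -x_1 y_1 + sum_{i >= 2} x_i y_i
    return -x[..., 0] * y[..., 0] + (x[..., 1:] * y[..., 1:]).sum(-1)

def lorentz_distance(x, y):
    # d_L(x, y) = arcosh(-<x, y>_L); clamp for numerical safety.
    return torch.acosh(torch.clamp(-lorentz_inner(x, y), min=1.0))

def lift_to_hyperboloid(z):
    # Given spatial coordinates z in R^n, set the first coordinate so that
    # <x, x>_L = -1 and x_1 > 0, i.e. x_1 = sqrt(1 + ||z||^2).
    x0 = torch.sqrt(1.0 + (z * z).sum(-1, keepdim=True))
    return torch.cat([x0, z], dim=-1)

x = lift_to_hyperboloid(torch.tensor([0.3, -0.1]))
y = lift_to_hyperboloid(torch.tensor([-0.2, 0.4]))
print(float(lorentz_distance(x, y)))
```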
Poincaré ball model. The n-dimensional Poincaré ball model of hyperbolic space is defined as the Riemannian manifold (B^n, g^B), where B^n is the open unit ball B^n = {x ∈ R^n : ∥x∥ < 1} and ∥·∥ denotes the Euclidean norm. The metric tensor g^B_x is given for any x ∈ B^n by
g^B_x = λ_x^2 g^E, with λ_x = 2 / (1 - ∥x∥^2) and g^E = I_n the Euclidean metric tensor.
The 2-dimensional version of the Poincaré ball model is known as the Poincaré disk. The associated distance between two points x, y ∈ B^n is computed as
d_B(x, y) = arcosh( 1 + 2 ∥x - y∥^2 / ((1 - ∥x∥^2)(1 - ∥y∥^2)) ).
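The Poincaré ball distance, together with the exponential and logarithmic maps at the origin that many of the deep models reviewed below rely on, can be sketched as follows (curvature fixed to -1; the function names are ours).

```python
import torch

def poincare_distance(x, y, eps=1e-9):
    # d_B(x, y) = arcosh(1 + 2 ||x - y||^2 / ((1 - ||x||^2)(1 - ||y||^2)))
    sq = ((x - y) ** 2).sum(-1)
    denom = (1 - (x ** 2).sum(-1)) * (1 - (y ** 2).sum(-1))
    return torch.acosh(1 + 2 * sq / denom.clamp_min(eps))

def exp0(v, eps=1e-9):
    # Exponential map at the origin of the Poincaré ball (curvature -1):
    # exp_0(v) = tanh(||v||) * v / ||v||
    n = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(n) * v / n

def log0(x, eps=1e-9):
    # Logarithmic map at the origin: log_0(x) = artanh(||x||) * x / ||x||
    n = x.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.atanh(n.clamp(max=1 - 1e-5)) * x / n

x = exp0(torch.tensor([0.5, -0.2]))
y = exp0(torch.tensor([-0.3, 0.1]))
print(float(poincare_distance(x, y)))
print(log0(x))  # approximately recovers the original tangent vector
```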
Klein model. Also known as the Beltrami-Klein model or the projective model, the Klein model of n-dimensional hyperbolic space is the Riemannian manifold (K^n, g^K), where K^n is the open unit ball K^n = {x ∈ R^n : ∥x∥ < 1}, and g^K_x is the metric tensor defined for any x ∈ K^n and tangent vectors u, v by
g^K_x(u, v) = ⟨u, v⟩ / (1 - ∥x∥^2) + ⟨x, u⟩⟨x, v⟩ / (1 - ∥x∥^2)^2.
Several approaches have been proposed in the literature for embedding graphs into hyperbolic spaces, and these methods can be categorized based on various criteria. Key distinctions include whether the methods incorporate node features or focus solely on preserving graph structure, as well as the level of supervision. Unsupervised methods typically learn embeddings in two stages, while supervised methods optimize directly for specific tasks like node classification. Additionally, embeddings can be transductive, mapping nodes directly, or inductive, learning a function that can embed new nodes. Based on these distinctions, we propose a comprehensive taxonomy that categorizes state-of-the-art methods (see Figure 1). In what follows, we describe the categories within our proposed taxonomy and present in detail the most representative methods for each category (those we evaluate in our experimental section), while briefly describing the other methods.
1-Traditional methods: Traditional methods for embedding graphs into hyperbolic spaces can be categorized based on their approach and the type of information they utilize. These methods include direct optimization-based approaches, tree approximation-based techniques, multidimensional scaling (MDS)-based methods, and network model-dependent methods.
[Figure 1: Taxonomy of hyperbolic graph embedding methods. Traditional methods: Direct Optimization (Poincaré Embedding [44], Lorentz embedding [45], Entailment Cones [20], Lorentzian distance [36], Disk embedding [59], Tiling [75], Hyperbolic random walk [68]); Tree Approximation (shortest path [74], Combinatorial constructions [54]); Multidimensional Scaling (Lorentzian MDS [13], h-MDS [55], hydra [29], HDM [61], PGA [28], Symmetric spaces [40], Manifold product [24]); Network Model (Community structure [69], Common neighbors [6], Laplacian [2], HyperMap [49], HyperMap-CN [47]). Deep learning methods: Convolutional GNNs (HGNN [39], HGCN [11], k-GCN [4], HAT [76], GIL [79], Q-GCN [71], ACE-HGNN [19], NHGCN [18], kHGCN [73], FMGNN [16], H2H-GCN [15], LGCN [77]); Graph Autoencoders (P-VAE [41], HGCAE [50], CCM-AAE [23], PWA [46], HyperbolicNF [7]); Spatio-temporal GNNs (HTGN [72], HGWaveNet [5], ST-GCN [51], HVGNN [58], DHGAT [38]).]
Direct optimization-based methods: These methods focus on learning node embeddings by minimizing a loss function. The main approach is Poincaré embedding [44], which projects graphs and symbolic data into hyperbolic space using the Poincaré ball model. The method learns node embeddings by minimizing a task-specific loss function, with an update rule that ensures embeddings remain inside the Poincaré ball. For link prediction tasks, the method uses a cross-entropy loss over edge probabilities of the form
p((u, v) ∈ E) = 1 / (e^{(d(u,v) - r)/t} + 1),
where d(u, v) is the hyperbolic distance between node embeddings u and v, and r and t are hyperparameters that adjust the probability distribution. Evaluated on tasks like link prediction and node classification [44],
Poincaré embedding shows superior performance over Euclidean and translational embeddings, particularly in low dimensions. It excels at capturing hierarchical relationships, as seen in datasets like WordNet, CondMat, and HepPh, and performs better in terms of mean average precision (MAP) and mean rank scores, demonstrating the advantage of hyperbolic geometry in modeling complex graph structures. Further work on Poincaré embedding was conducted by [45], who focus on embedding weighted graphs in the Lorentz model. We can also cite order-based methods such as [20], which view hierarchical relations as partial orders defined using a family of nested geodesically convex cones and prove that these entailment cones admit an optimal shape with a closed-form expression in both the Euclidean and hyperbolic spaces. To obtain efficient and interpretable algorithms, [36] exploit the fact that the squared Lorentzian distance makes it appropriate to represent hierarchies where parent nodes minimize the distances to their descendants. [59] develop Disk Embeddings, a framework for embedding DAGs into quasi-metric spaces that outperforms existing methods, especially on complex DAGs. [75] learn high-precision embeddings using an integer-based tiling to represent points in hyperbolic space with provably bounded numerical error (e.g., for a Poincaré embedding on WordNet Nouns), obtaining more accurate embeddings on real-world datasets. We can also cite random walk techniques [68], which adopt architectures that use hyperbolic inner products and hyperbolic distance as proximity measures.
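To illustrate how the distance-based probability above is turned into a link prediction loss, here is a hedged sketch; the exact negative-sampling scheme of [44] is omitted, and r and t are the hyperparameters mentioned above.

```python
import torch

def edge_probability(dist, r=2.0, t=1.0):
    # p((u, v) in E) = 1 / (exp((d(u, v) - r) / t) + 1), i.e. sigmoid(-(d - r) / t)
    return torch.sigmoid(-(dist - r) / t)

def link_prediction_loss(dist_pos, dist_neg, r=2.0, t=1.0):
    # Cross-entropy over observed edges (label 1) and sampled non-edges (label 0).
    p_pos = edge_probability(dist_pos, r, t)
    p_neg = edge_probability(dist_neg, r, t)
    return -(torch.log(p_pos + 1e-9).mean() + torch.log(1 - p_neg + 1e-9).mean())

# Toy distances for positive (connected) and negative (unconnected) pairs.
print(float(link_prediction_loss(torch.tensor([0.5, 1.0]), torch.tensor([3.0, 4.0]))))
```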
Tree Approximation-Based Methods: These methods embed graphs by transforming them into tree structures before mapping them into hyperbolic space. The main approach is Combinatorial construction [54], which involves embedding a graph into a weighted tree and then mapping this tree into the Poincaré ball. This method was evaluated on several datasets revealing that the embedding dimension scales linearly with the longest path length and logarithmically with the maximum degree in trees. This shows a trade-off between precision and dimensionality, highlighting the method’s suitability for hierarchical structures [54]. Further advancements in tree approximation-based methods include learning tree structures that approximate the shortest path metric on a given graph, thereby enhancing embedding accuracy [74].
Multidimensional Scaling-Based Methods: In this category, we can cite mainly H-MDS [67], which extends multidimensional scaling (MDS) [35] to hyperbolic space, particularly within the Lorentz manifold. Euclidean MDS methods only recover centered points, so H-MDS introduces a pseudo-Euclidean mean, defined as a local minimum of the corresponding centering objective, to center points in hyperbolic space. Once points are centered using this mean, the H-MDS problem reduces to a standard PCA problem. H-MDS was evaluated [67] using similar datasets and metrics as the combinatorial construction. It demonstrated low distortion, particularly for tree-like datasets, outperforming traditional dimensionality reduction and optimization methods. However, its performance on MAP scores was less optimal due to precision limitations. Several extensions and enhancements that try to further minimize distortion, i.e., the difference between distances in the original graph and distances in the embedding space, have also been proposed, such as Lorentzian MDS [13], h-MDS [55], hydra [29], and HDM [61].
For further dimensionality reduction or visualization in one or two dimensions, the authors of [28] suggest using Principal Geodesic Analysis (PGA), a generalization of PCA for non-Euclidean manifolds. To address the quadratic complexity of computing all pairwise distances, the authors of [57] propose methods to select a subset of distances, significantly reducing computational overhead while maintaining accuracy. The authors of [40] propose using symmetric spaces to better adapt to dissimilar structures in the graph. In [24], the authors propose learning embeddings in a product manifold combining multiple copies of model spaces (spherical, hyperbolic, Euclidean), providing a space of heterogeneous curvature suitable for a wide variety of structures.
Network model dependent methods: Inspired by the study of [33] on models generating random graphs in hyperbolic spaces, such as the Popularity × Similarity Optimization (PSO) model [48], several works proposed to infer node coordinates in the Poincaré disk using Maximum Likelihood Estimation (MLE), maximizing the likelihood that the network was produced by such a model. Several network properties are used for optimizing the obtained embedding. In [6], the authors rely on common neighbor information. In [69], the authors use community structure to compute the angular coordinates.
In [42], the authors utilize eigenvalue decomposition of the Laplacian matrix to find the angular coordinates of nodes in the Poincaré disk, while radial coordinates are inferred by employing a network model. Hybrid approaches combining manifold learning and maximum likelihood estimation were explored by [22]. In [49], the authors present HyperMap, a simple method for mapping a real network into hyperbolic space based on a recent geometric theory of complex networks. It reconstructs the network’s geometric growth by estimating the hyperbolic coordinates of new nodes at each step, maximizing the likelihood of the observed network under the model. HyperMap-CN [47] is an extension of HyperMap that also uses the number of common neighbors between nodes.
2-Deep learning methods: Graph embedding methods based on deep learning have demonstrated remarkable progress in capturing complex graph structures. The integration of hyperbolic geometry into deep learning for graph data has significantly advanced, particularly since the work of [21]. Hyperbolic geometry offers an effective framework for encoding hierarchical and tree-like structures, enabling deep learning models to represent graph data with greater fidelity. Below, we discuss methods grouped into GNN-based approaches, graph autoencoders, and spatio-temporal models.
In this category, we have Hyperbolic Graph Neural Networks (HGNNs) [39], which extend traditional graph neural networks (GNNs) to Riemannian manifolds, specifically the Lorentz and Poincaré ball models. Building on the framework of the Euclidean vanilla GCN [32], HGNNs utilize differentiable exponential and logarithmic maps to perform message passing. At each layer k, the node representation h^k_v is mapped to the tangent space of a reference point x′ using the logarithmic map log_{x′}, aggregated, and then mapped back to the manifold via the exponential map exp_{x′}. The updated representation h^{k+1}_u is given by
h^{k+1}_u = exp_{x′}( Σ_{v ∈ I(u)} Ã_{uv} W^k log_{x′}(h^k_v) ),
where W^k are trainable parameters, Ã is the normalized adjacency matrix, and I(u) denotes the in-neighbors of node u. This framework allows HGNNs to perform message passing in the tangent space and map the results back to hyperbolic space effectively.
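Taking the origin of the Poincaré ball as the reference point x′, the update above reduces to a log-aggregate-exp pattern. The following is a minimal sketch under that assumption (illustrative names, not the implementation of [39], which also supports the Lorentz model and handles activations separately).

```python
import torch
import torch.nn as nn

def _log0(x):  # log map at the Poincaré-ball origin
    n = x.norm(dim=-1, keepdim=True).clamp_min(1e-9)
    return torch.atanh(n.clamp(max=1 - 1e-5)) * x / n

def _exp0(v):  # exp map at the Poincaré-ball origin
    n = v.norm(dim=-1, keepdim=True).clamp_min(1e-9)
    return torch.tanh(n) * v / n

class TangentSpaceGNNLayer(nn.Module):
    """One HGNN-style layer: log map node states to the tangent space at the
    reference point, aggregate neighbors with the normalized adjacency matrix,
    then map the result back to the manifold with the exp map."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h_hyp, a_norm):
        h_tan = _log0(h_hyp)                     # manifold -> tangent space
        h_tan = a_norm @ self.weight(h_tan)      # Euclidean aggregation
        return _exp0(h_tan)                      # tangent space -> manifold
```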
HGNNs operate in both the Lorentz and Poincaré models, with the curvature fixed to -1. In the Poincaré model, pointwise non-linearities are used for activation, whereas in the Lorentz model, points are projected to the Poincaré space for activation and then mapped back. The output consists of hyperbolic node embeddings used for node-level and graph-level predictions, leveraging cross-entropy and mean squared error loss functions, respectively. This approach demonstrates superior performance in preserving graph structures compared to Euclidean methods, excelling in graph classification tasks. Notably, the Lorentz model outperformed the Poincaré model, especially in higher-dimensional settings and on datasets such as Zinc for molecular property prediction and Ether/USDT for time series. HGNNs showcase robust performance in complex scenarios, advancing the application of hyperbolic geometry in GNNs [39].
Hyperbolic Graph Convolutional Neural Networks (HGCNs) [11] extend Graph Convolutional Networks (GCNs) to hyperbolic geometry, achieving significant advancements in preserving hierarchical structures and tree-like data. HGCNs operate in the hyperboloid model with trainable curvature, where each layer ℓ functions at a different curvature -1/K_ℓ (K_ℓ > 0) centered at o = (√K_ℓ, 0, . . . , 0). Euclidean input features x^{0,E}_i are mapped to hyperbolic space using the exponential map exp^{K_ℓ}_o. Each layer performs a transformation of node features in the tangent space at the origin,
h^ℓ_i = exp^{K_ℓ}_o( W^ℓ log^{K_ℓ}_o(x^{ℓ-1}_i) + b^ℓ ),
where W^ℓ and b^ℓ are the weight and bias parameters of layer ℓ. HGCNs then employ a hyperbolic attention-based mechanism to aggregate the features of neighboring nodes,
y^ℓ_i = exp^{K_ℓ}_{x_i}( Σ_{j ∈ N(i)} w_{ij} log^{K_ℓ}_{x_i}(h^ℓ_j) ),
where N(i) represents the neighbors of node i and w_{ij} are attention weights computed using a Euclidean multi-layer perceptron (MLP), similar to the Graph Attention Network (GAT) [65]. Unlike fixed-curvature models, HGCNs assign distinct curvature values -1/K_ℓ to each layer, enhancing flexibility and alignment with the dataset’s geometry. For link prediction tasks, HGCNs employ the Fermi-Dirac decoder to compute edge probability scores, while for node classification, the output embeddings are projected to the tangent space for Euclidean multinomial logistic regression. HGCNs were evaluated on datasets such as Cora [56], PubMed [43], SIR [3], PPI [60], and Airport (a flight network), where they outperform baseline methods such as GCN, GAT, and Poincaré embeddings, especially on datasets with low δ-hyperbolicity [11]. The attention mechanism and trainable curvature demonstrated improved preservation of hierarchies, class separability, and tree structures. In particular, HGCNs achieve state-of-the-art results in link prediction and node classification tasks, showcasing robust performance across varying data geometries [11].
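As a concrete illustration of the node classification decoding step described above, here is a minimal sketch: embeddings are mapped to the tangent space at the origin of the Poincaré ball (a simplification, since HGCN itself operates in the hyperboloid model) and fed to a Euclidean logistic-regression head. The names are ours, not from [11].

```python
import torch
import torch.nn as nn

class TangentClassificationHead(nn.Module):
    """Project hyperbolic embeddings to the tangent space at the origin and apply
    Euclidean multinomial logistic regression (illustrative simplification)."""
    def __init__(self, dim, num_classes):
        super().__init__()
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, h_hyp):
        n = h_hyp.norm(dim=-1, keepdim=True).clamp_min(1e-9)
        h_tan = torch.atanh(n.clamp(max=1 - 1e-5)) * h_hyp / n  # log map at origin
        return self.classifier(h_tan)                            # class logits

# Toy usage: 5 nodes inside the unit ball, 10-dimensional embeddings, 7 classes.
logits = TangentClassificationHead(dim=10, num_classes=7)(torch.randn(5, 10) * 0.1)
print(logits.shape)  # torch.Size([5, 7])
```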
Hyperbolic-to-Hyperbolic Graph Convolutional Networks (H2H-GCNs) [15] address the limitations of hyperbolic GCNs that rely on tangent-space approximations for feature transformation and neighborhood aggregation by performing all operations directly within hyperbolic manifolds. The model projects Euclidean input features onto the Lorentz manifold using the exponential map and applies, at each layer ℓ, a Lorentz linear transformation that maps points of the Lorentz manifold back onto the Lorentz manifold, so that feature transformation never leaves hyperbolic space. Neighborhood aggregation is performed using the Einstein midpoint in the Klein model, followed by activation in the Poincaré model. By avoiding tangent-space approximations, H2H-GCNs reduce distortion and improve performance, particularly in low-dimensional hyperbolic spaces. H2H-GCNs [15] demonstrate superior results across datasets such as Disease, Cora, PubMed, and Airport for tasks such as link prediction and node classification. They achieve top AUC scores for link prediction and high F1 scores for node classification, outperforming models like HGCNs and other baselines. For graph classification on synthetic and molecular graphs, H2H-GCNs excel at preserving hyperbolic structures, showcasing advantages over tangent-space-based models like HGNNs, which suffer from distortion in higher dimensions. Their robust performance in both low- and high-dimensional settings establishes H2H-GCN as a significant advancement in hyperbolic graph learning.
Several GNN variants operate directly on the hyperbolic manifold, leveraging non-Euclidean geometry to better capture hierarchical and complex graph structures. For instance, k-GCN [4] generalizes GCNs to constant curvature spaces using gyro-barycentric coordinates, enabling smooth interpolation between Euclidean and non-Euclidean geometries. Similarly, HAT [76] introduces hyperbolic attention-based aggregation, computing attention coefficients based on hyperbolic distances. GIL [79] embeds graphs in both Euclidean and hyperbolic spaces for flexible geometry-aware representations, while Q-GCN [71] extends GCNs to pseudo-Riemannian manifolds to handle mixed topologies. Adaptive approaches like ACE-HGNN [19] use reinforcement learning to optimize curvature, and NHGCN [18] employs nested hyperbolic spaces for efficient representation learning. Additionally, κHGCN [73] models tree-likeness by learning discrete and continuous curvature.
Beyond hyperbolic geometry, FMGNN [16] fuses embeddings from multiple Riemannian manifolds, aligning different geometric spaces for enhanced learning. Finally, LGCN [77] redefines graph operations using Lorentzian distance, reducing distortion and preserving hierarchical structures.
Graph autoencoder-based methods: Graph autoencoder-based methods have been successfully extended to hyperbolic spaces, providing a novel approach for learning unsupervised representations of graph data. In this category, we can cite Poincaré Variational Auto-Encoders (P-VAE) [41], which extend Variational Autoencoders (VAEs) to hierarchical or tree-structured data by mapping it into a hyperbolic latent space, specifically the Poincaré ball B^d_c with constant negative curvature -c. VAEs, introduced by [30], are generative models that encode input data into a latent space, typically modeled as a Gaussian distribution in Euclidean space, and decode it back to the original space. The key idea is to optimize the Evidence Lower Bound (ELBO), balancing reconstruction accuracy with latent space regularization, allowing for the generation of new, similar data points.
In P-VAE, the latent space is modeled using Riemannian normal and wrapped normal distributions. The encoder outputs the Fréchet mean µ in the Poincaré ball and standard deviation σ. To maintain hyperbolicity, the encoder employs an exponential map layer, while the decoder utilizes a gyroplane layer, which involves hyperbolic affine transformations that map the latent representation back to Euclidean space. P-VAE also adapts the reparameterization trick and ELBO for hyperbolic geometry, allowing for better preservation of hierarchical structures inherent in the data. It showed improved performance on graph datasets, particularly in link prediction tasks, effectively capturing the hierarchical relationships within the data compared to Euclidean methods [41].
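To make the wrapped normal construction concrete, here is a hedged sketch on the Poincaré ball with curvature -1, following the general recipe of sampling in the tangent space at the origin, transporting the sample to the Fréchet mean μ, and applying the exponential map at μ. The function names are illustrative and this is not the P-VAE implementation.

```python
import torch

def mobius_add(x, y):
    # Möbius addition on the Poincaré ball with curvature -1.
    xy = (x * y).sum(-1, keepdim=True)
    x2 = (x * x).sum(-1, keepdim=True)
    y2 = (y * y).sum(-1, keepdim=True)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    return num / (1 + 2 * xy + x2 * y2).clamp_min(1e-9)

def exp_map(mu, v, eps=1e-9):
    # Exponential map at mu: exp_mu(v) = mu (+) tanh(lambda_mu ||v|| / 2) v / ||v||
    lam = 2.0 / (1 - (mu * mu).sum(-1, keepdim=True)).clamp_min(eps)
    n = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return mobius_add(mu, torch.tanh(lam * n / 2) * v / n)

def sample_wrapped_normal(mu, sigma):
    # Sample v ~ N(0, sigma^2 I) in the tangent space at the origin, parallel
    # transport it to mu (on the ball this reduces to scaling by 1 - ||mu||^2),
    # then apply the exponential map at mu.
    v = torch.randn_like(mu) * sigma
    v_at_mu = v * (1 - (mu * mu).sum(-1, keepdim=True))
    return exp_map(mu, v_at_mu)

mu = torch.tensor([0.3, -0.2])
print(sample_wrapped_normal(mu, sigma=0.1))
```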
Hyperbolic Graph Convolutional Autoencoder (HGCAE) [50] is a hyperbolic version of graph autoencoders that operates on both the Poincaré ball and Lorentz models, with trainable curvature across layers. The architecture is similar to HGCN with attention-based aggregation. For node i at layer ℓ, the message-passing operation is
m^{ℓ+1}_i = exp^{K_ℓ}_o( Σ_{j ∈ N(i) ∪ {i}} α^ℓ_{ij} ( W^ℓ log^{K_ℓ}_o(h^ℓ_j) + b^ℓ ) ),
where h^ℓ_j is the embedding of node j from the previous layer, K_ℓ is the curvature at layer ℓ, W^ℓ and b^ℓ are the weight matrix and bias, and α^ℓ_{ij} represents the attention score. The node representation h^{ℓ+1}_i is obtained by applying an activation function, such as ReLU, in the Poincaré ball.
The adjacency matrix A is reconstructed from the latent embeddings produced by the encoder, while the decoder reconstructs the Euclidean node attributes X^{Euc}. The total loss function combines cross-entropy loss for the adjacency matrix and mean squared error for node attributes, L = L_REC-A + λ L_REC-X, where λ controls the balance between structural and attribute reconstruction. HGCAE was evaluated [50] on link prediction and node clustering tasks using datasets like Cora, Citeseer, Wiki, PubMed, and BlogCatalog, outperforming state-of-the-art models like GAE, VGAE, ARGA, and DBGAN. The Poincaré (HGCAE-P) and hyperboloid (HGCAE-H) variants achieved superior AUC and average precision for link prediction and higher accuracy and NMI for node clustering. HGCAE-H generally performed best, with some exceptions on Wiki and PubMed. Visualizations on Cora demonstrated effective node clustering near the hyperbolic space boundary, showcasing the model’s ability to capture complex graph structures.
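A minimal sketch of the combined objective L = L_REC-A + λ L_REC-X, assuming the adjacency term is a binary cross-entropy over predicted edge probabilities and the attribute term a mean squared error; the exact formulation in [50] may differ in details such as the weighting of positive edges.

```python
import torch
import torch.nn.functional as F

def hgcae_style_loss(adj_true, adj_pred, x_true, x_pred, lam=1.0):
    # L = L_REC-A (cross-entropy on the reconstructed adjacency matrix)
    #   + lambda * L_REC-X (mean squared error on the reconstructed attributes)
    loss_adj = F.binary_cross_entropy(adj_pred, adj_true)
    loss_attr = F.mse_loss(x_pred, x_true)
    return loss_adj + lam * loss_attr

adj_true = torch.tensor([[0., 1.], [1., 0.]])
adj_pred = torch.tensor([[0.1, 0.8], [0.7, 0.2]])  # predicted edge probabilities
x_true, x_pred = torch.randn(2, 4), torch.randn(2, 4)
print(float(hgcae_style_loss(adj_true, adj_pred, x_true, x_pred, lam=0.5)))
```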
Extending beyond these architectures, additional methods explore diverse autoencoder-based approaches in hyperbolic spaces. CCM-AAE [23] introduces an adversarial autoencoder designed for constant-curvature Riemannian manifolds (CCMs), effectively embedding hierarchical and circular structures. By aligning the aggregated posterior with a probability distribution on a CCM, the model enhances representation learning across varying curvature spaces.
In another direction, PWA [46] reformulates Wasserstein autoencoders within the Poincaré ball, leveraging hyperbolic latent spaces to impose a hierarchical structure on learned representations. The approach demonstrates strong results in graph link prediction by effectively structuring latent spaces according to data hierarchy.
Finally, Hyperbolic NF [7] introduces normalizing flows into hyperbolic spaces, overcoming limitations of Euclidean-based flows in stochastic variational inference. By employing coupling transforms on the tangent bundle and Wrapped Hyperboloid Coupling (WHC), Hyperbolic NF achieves expressive posteriors, outperforming Euclidean and hyperbolic VAEs on hierarchical data density estimation and graph reconstruction.
In this category, we have the Hyperbolic Temporal Graph Network (HTGN) [72], which is designed for discrete-time graphs represented as snapshots from a temporal graph G. Each snapshot G_t = (V_t, A_t) includes the current node set V_t and its adjacency matrix A_t. Operating in the Poincaré ball model with learnable curvature c, HTGN consists of three components: (1) a Hyperbolic Graph Neural Network (HGNN) for topological dependency extraction, (2) Hyperbolic Temporal Attention (HTA) to aggregate historical information, and (3) a Hyperbolic Gated Recurrent Unit (HGRU) for capturing sequential patterns. HTGN introduces a Hyperbolic Temporal Consistency (HTC) constraint to ensure smooth embedding transitions over time. HTGN minimizes an overall loss function combining hyperbolic distance for temporal smoothness and a homophily loss to preserve graph structure. Evaluated on datasets like HepPh, COLAB, and FB, HTGN outperformed several Euclidean baselines in link prediction and particularly excelled in new link prediction tasks, demonstrating robust inductive learning even on sparse graphs.
HGWaveNet [5] is a hyperbolic graph neural network framework designed for discrete dynamic graphs, inspired by WaveNet [64] and Graph WaveNet [70]. It operates in the Poincaré ball or Lorentz manifolds with learnable curvature and consists of several key modules: (1) a Hyperbolic Diffusion Graph Convolution (HDGC) module that learns node representations from both direct and indirect neighbors in a snapshot through a diffusion process, performing K diffusion steps to consider longer paths in message propagation; (2) a Hyperbolic Dilated Causal Convolution (HDCC) module that aggregates historical information while respecting causality, allowing each state to depend only on past and present states; this module uses a fixed kernel size to update node representations across snapshots, effectively capturing long-range dependencies;
(3) a Hyperbolic Gated Recurrent Unit (HGRU), similar to the one in HTGN, that processes the current node representations and historical hidden states; and (4) a Hyperbolic Temporal Consistency (HTC) module that ensures stability of the embeddings over time. HGWaveNet was evaluated on link prediction tasks using datasets such as Enron and DBLP, outperforming multiple static and dynamic models. Building on the previously discussed spatio-temporal GNN-based methods, several other approaches leverage hyperbolic geometry for improved representation learning over dynamic graphs. ST-GCN [51] extends traditional spatio-temporal GCNs by integrating Poincaré geometry, effectively modeling latent anatomical structures in human action recognition while optimizing projection dimensions in the Riemannian space. By leveraging hyperbolic embeddings, ST-GCN enhances structural representation efficiency while reducing model complexity. In another direction, HVGNN [58] introduces a hyperbolic variational framework to model both dynamics and uncertainty in evolving graphs. The method combines a Temporal GNN (TGNN) with a hyperbolic graph variational autoencoder to generate stochastic node representations, ensuring robust performance across dynamic settings. By leveraging hyperbolic graph attention networks, it effectively captures spatial and temporal dependencies among stocks, significantly enhancing profitability and risk-adjusted returns through structured hyperbolic representations. Finally, DHGAT [38] proposes a novel spatio-temporal self-attention mechanism based on hyperbolic distance that achieves superior performance in multi-step link prediction tasks, particularly excelling at identifying new and evolving links within dynamic graphs.
Embedding graphs into hyperbolic spaces is gaining attention because it helps capture the hierarchical and scale-free nature of many real-world networks. These networks often have a few highly connected nodes and many less connected ones, and they tend to form tightly-knit groups. Hyperbolic geometry is well-suited for representing these properties because it can model hierarchical structures with low distortion [33]. Despite these advantages, embedding graphs in hyperbolic spaces introduces several challenges. Unlike Euclidean spaces, hyperbolic spaces do not possess vector space properties, making basic operations such as vector addition, matrix-vector multiplication, and gradient-based optimization problematic [77]. These operations are crucial for machine learning algorithms and may result in outputs that lie outside the manifold, complicating the embedding process [52]. Addressing these challenges requires innovative methods tailored to the unique characteristics of hyperbolic geometry. In this section, we propose a unified framework for node-level anomaly detection in homogeneous static graphs, enabling the testing and comparison of various embedding methods alongside different anomaly detection techniques. The overall architecture of our framework is illustrated in Figure 2. This architecture is divided into two primary components:
• Graph Embedding Model: This model generates node embeddings in a hyperbolic space (e.g., Poincaré ball, Lorentz model), which are then mapped to the Euclidean space. This setup allows flexibility in evaluating various state-of-the-art hyperbolic embedding techniques.
• Anomaly Detection Mechanism: The Euclidean embeddings are used by an anomaly detection algorithm (supervised or unsupervised) to classify nodes as benign (normal) or malicious (abnormal).
In this framework, we evaluated six models: H2H-GCN [15], HGCAE [50], HGCN [11], HGNN [39], Poincaré [44], and P-VAE [41]. In order to ensure reproducibility and facilitate comparative analysis, we summarize the hyperparameters used for each model in Table 3. Key hyperparameters such as learning rate, weight decay, and dropout were fixed, with adjustments based on the dataset and embedding dimensionality. For anomaly detection, we applied a two-phase approach with models like P-VAE and HGCAE. The first phase involves training the models, while the second phase applies various anomaly detection algorithms (e.g., Random Forest and k-Nearest Neighbors) to classify nodes as benign or anomalous. To support these efforts, we developed Ghypeddings, a Python library designed to embed static graph nodes into hyperbolic space. Short for Graph Hyperbolic Embeddings, Ghypeddings consolidates all six hyperbolic graph models and integrates some anomaly detection algorithms. Built on the PyTorch framework, this library provides a comprehensive solution for hyperbolic graph embeddings and is publicly available. Ghypeddings streamlines the process of building and loading graphs, training models, and
saving/loading them for future use, thus improving the usability and reproducibility of the models.
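The two-phase approach can be sketched as follows. This is not the Ghypeddings API (which we do not reproduce here) but an illustrative pipeline: phase one would produce hyperbolic node embeddings (replaced below by random points in the Poincaré ball), which are mapped to the tangent space at the origin, and phase two fits a standard detector such as Random Forest on the resulting Euclidean vectors.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def log0_numpy(z, eps=1e-9):
    # Map Poincaré-ball embeddings to the tangent space at the origin,
    # i.e., to plain Euclidean vectors, before running a standard detector.
    n = np.linalg.norm(z, axis=-1, keepdims=True)
    scale = np.arctanh(np.clip(n, None, 1 - 1e-5)) / np.maximum(n, eps)
    return scale * z

# Placeholder for phase 1: any hyperbolic embedding model from the taxonomy
# would produce z_hyp (here random points inside the ball, for illustration only).
rng = np.random.default_rng(0)
z_hyp = rng.normal(size=(200, 10)) * 0.1
labels = rng.integers(0, 2, size=200)          # 0 = benign, 1 = anomalous

# Phase 2: classify the Euclidean (tangent-space) embeddings.
z_euc = log0_numpy(z_hyp)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(z_euc[:140], labels[:140])
print("F1:", f1_score(labels[140:], clf.predict(z_euc[140:])))
```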
This section presents a comprehensive evaluation of anomaly detection models using multiple datasets. It begins by describing the datasets used, followed by a discussion of the Euclidean embedding models included for comparison to benchmark the performance of hyperbolic embeddings. The preprocessing steps applied to the datasets are detailed to ensure reproducibility. Key evaluation metrics, such as accuracy and F1-score, are introduced to provide a consistent basis for assessing model performance. Finally, the results analyze and compare the performance of hyperbolic and Euclidean embedding models, highlighting their strengths, trade-offs, and computational efficiency.
The evaluation metrics (Table 4) are: Accuracy = (TP + TN) / (TP + TN + FP + FN), which measures the overall correctness of the model's predictions; Precision = TP / (TP + FP), which indicates the accuracy when the model predicts an anomaly; Recall = TP / (TP + FN), which measures the model's ability to detect all actual anomalies; and the F1-score, the harmonic mean of precision and recall, balancing the two.
Datasets: For evaluating anomaly detection models across various domains, we selected a diverse set of datasets to ensure robustness and consistency. The datasets used are listed in Table 5. For each dataset, data is partitioned into training (70%), validation (20%), and testing (10%) sets.
The effectiveness of the models on anomaly detection is assessed using: (1) the classification outcomes, based on the commonly used metrics summarized in Table 4, and (2) the computational efficiency, particularly regarding training time (TT), which represents the duration in seconds that the model requires to learn from the training dataset, including the optimization of parameters for both graph embedding and anomaly detection.
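For completeness, the classification outcomes above can be computed directly with scikit-learn; a small illustrative snippet with toy labels follows.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1, 1, 0, 1, 0]   # 1 = anomalous, 0 = benign
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
```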
Euclidean Baselines: To evaluate the effectiveness of hyperbolic embeddings, we compare them with two prominent Euclidean embedding models: GraphSage [25], which has demonstrated strong performance in graph representation tasks, and DOMINANT [17], which integrates Graph Convolutional Networks (GCNs) with autoencoders to detect outlier nodes in graphs. DOMINANT employs a GCN-based encoder and two GCN-based decoders: one reconstructs node features, and the other reconstructs the adjacency matrix. The model outputs an outlier score that combines reconstruction errors from both decoders. We selected these models due to their strong performance on node anomaly detection tasks and their compatibility with diverse datasets.
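To illustrate how such a combined outlier score can be formed, here is a hedged sketch; the α-weighting convention is illustrative and not necessarily the exact formulation of [17].

```python
import numpy as np

def dominant_style_scores(adj, adj_rec, x, x_rec, alpha=0.5):
    # Per-node outlier score combining structure and attribute reconstruction
    # errors (the weighting convention here is illustrative).
    struct_err = np.linalg.norm(adj - adj_rec, axis=1)
    attr_err = np.linalg.norm(x - x_rec, axis=1)
    return alpha * struct_err + (1 - alpha) * attr_err

rng = np.random.default_rng(0)
adj = rng.integers(0, 2, size=(5, 5)).astype(float)
x = rng.normal(size=(5, 8))
scores = dominant_style_scores(adj, adj + rng.normal(scale=0.1, size=adj.shape),
                               x, x + rng.normal(scale=0.1, size=x.shape))
print(scores.round(3))  # higher score = more anomalous
```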
Results and Discussion: The results of all the algorithms are reported in Table 6 for the Darknet dataset, Table 7 for the CICDDoS2019 dataset, Table 8 for the Elliptic dataset, Table 9 for the YelpNYC dataset, Table 10 for the Cora dataset, and Table 11 for the DGraphFin dataset.
In addition to this comparison, we select the optimal classification algorithms for the autoencoder-based embedding models, specifically HGCAE and P-VAE, to maximize their performance. In all tables, the best results for each metric are marked in bold and underlined, while the second-best results are underlined. Across all datasets, we kept the embedding dimension fixed at 10, as this configuration provided consistently better results across most models.
Performance of autoencoder-based models with different classifiers: We used these models with Random Forest (RF), k-Nearest Neighbors (kNN), Agglomerative Clustering (AC), and Gaussian Mixture (GM) classifiers. According to Tables 6-11, HGCAE, especially when integrated with kNN and GM, generally demonstrates solid performance, achieving relatively high accuracy and F1-scores across most datasets, but it faces challenges in training time (TT) efficiency. For instance, on the Cora dataset, HGCAE+kNN achieves an accuracy of 78% and an F1-score of nearly 80%, but requires a training time of 8 seconds, highlighting its computational demands. This trend persists across other datasets, where HGCAE variants show comparable performance but necessitate significantly longer training periods, notably on more complex datasets like Darknet, where TT reaches 23 seconds. In contrast, P-VAE, particularly when paired with Random Forest (RF) or Agglomerative Clustering (AC) classifiers, exhibits excellent performance metrics, especially in terms of accuracy and AUC scores, indicating strong anomaly detection capabilities. For instance, P-VAE+RF achieves the highest accuracy on datasets such as YelpNYC (88%) and Cora (86%). However, this performance comes at the cost of even longer training times, with values such as 32 seconds on Cora and nearly 34 seconds on YelpNYC, due to the model's complexity and the additional computation required for variational inference. This underscores the computational intensity of P-VAE approaches in anomaly detection tasks, where the benefits of accuracy and robustness must be balanced against their high computational overhead. P-VAE's high F1-scores on datasets like Darknet and Elliptic are worth highlighting, as they underscore its robustness in complex anomaly detection tasks. For instance, P-VAE combined with AC achieves an impressive F1-score of 92% on the Darknet dataset, indicating that it is particularly effective at identifying anomalies in high-dimensional, complex network data. Similarly, on the Elliptic dataset, P-VAE+AC achieves an F1-score of 94%, one of the highest across all tested models, reflecting its robustness in detecting fraudulent transactions in financial networks.
Comparison of the Hyperbolic Embedding Models: Tables 6-11 show that the Poincaré model consistently achieves low training times (TT), especially notable on datasets such as Darknet and Elliptic, where it maintains a training time below 2.5 seconds. While its recall scores are generally high, indicating a strong ability to identify anomalies, the model shows relatively lower precision on most datasets. For instance, on the Elliptic dataset, Poincaré achieves a recall of 95% but a lower precision of 59%, which could imply a higher false-positive rate. HGNN and HGCN demonstrate impressive performance metrics across both the Darknet and CICDDoS2019 datasets. HGNN, for instance, attains high F1-scores and recall rates, reaching an F1-score of 88.92% on Darknet and 93% on CICDDoS2019. These results highlight HGNN's effectiveness in identifying and classifying anomalies accurately, although at a higher computational cost (e.g., 19.96 seconds on CICDDoS2019). HGCN generally achieves high precision scores, especially on datasets like CICDDoS2019 (94.95%), indicating its robustness in reducing false positives. H2H-GCN stands out on the Elliptic dataset with an accuracy of 89.71% and a precision of 96.55%, reflecting strong performance in detecting financial anomalies, a critical feature in applications such as fraud detection. This high performance on Elliptic, combined with moderate training times, suggests H2H-GCN's applicability to complex network structures with limited computational trade-offs.
Performance of the Euclidean Embedding Models: Euclidean embedding methods typically underperformed compared to hyperbolic embeddings, especially on datasets with hierarchical or highly structured data, where the limitations of Euclidean space in capturing latent hierarchies become apparent. However, some models, particularly GraphSage, occasionally achieved competitive results, demonstrating relatively high precision and low training time on less structured datasets. Overall, the performance of Euclidean embedding models varied significantly by dataset. DOMINANT, both in 10% and 50% variants, exhibited high recall across several datasets but suffered in accuracy and precision, leading to an inflated number of false positives. For example, on the Darknet dataset, DOMINANT (10%) achieved a recall of 93%, indicating strong anomaly detection capability. However, its accuracy was only 53.68%, underscoring the model’s tendency to misclassify normal points as anomalies. A similar trend was observed across other datasets, where DOMINANT consistently produced high recall scores at the expense of precision and accuracy, highlighting its capacity to capture anomalies broadly but with limited specificity. GraphSage, in contrast, demonstrated a better balance between precision, recall, and overall accuracy. On simpler datasets such as Cora and YelpNYC, GraphSage achieved competitive scores across evaluation metrics. For instance, in the Cora dataset, GraphSage obtained an accuracy of 88% and an F1-score of 87.6%, indicating that it effectively captured anomalies while maintaining a lower rate of false positives. On the YelpNYC dataset, where it achieved a precision of 93%, the model demonstrated an ability to correctly identify true anomalies with minimal misclassifications. The balanced performance across multiple metrics suggests that GraphSage may offer robust anomaly detection capabilities within Euclidean spaces for datasets without strong hierarchical patterns. In terms of computational efficiency, Euclidean embeddings offer significant advantages. Both DOMINANT and GraphSage exhibit notably low training times, rendering them suitable for large-scale applications or situations where computational resources are limited. For example, DOMINANT required less than 1.5 seconds of training on several datasets, whereas GraphSage consistently showed shorter training times compared to hyperbolic models, such as P-VAE and H2H-GCN, which can exceed 30 seconds in training time. This efficiency makes Euclidean models particularly useful in scenarios where quick model deployment and lower computational overhead are priorities.
This study reviewed, classified and compared hyperbolic graph embeddings. Through the development of a unified framework, we provided a flexible and systematic methodology for combining hyperbolic graph embedding models with various anomaly detection algorithms, enabling robust detection of anomalous patterns in graph-structured data. Beyond the framework, we addressed a significant gap in the field by introducing a comprehensive taxonomy of hyperbolic graph embedding techniques from an experimental and practical perspective. This taxonomy provides a structured foundation for comparing methods, understanding their design choices, and identifying their strengths and limitations. Such a contribution not only facilitates deeper exploration of hyperbolic embeddings but also empowers researchers and practitioners to select or develop methods suited to specific challenges. The evaluation of our framework across multiple datasets showed that hyperbolic models consistently outperform their Euclidean counterparts, particularly when paired with unsupervised approaches like autoencoders. Additionally, we contributed to the field by developing Ghypeddings, an open-source library that simplifies the application and experimentation with hyperbolic embeddings, ensuring both ease of use and reproducibility for future research.
As for future work, this study opens several promising directions. For instance, extending the methodology to dynamic graphs could provide deeper insights by capturing the evolution of graph structures over time, enabling real-time anomaly detection in changing environments. Also, investigating anomaly detection at multiple levels, such as edges, subgraphs, and entire graphs, could help uncover more complex and interconnected patterns, enhancing the ability to identify diverse forms of irregularities in graph-structured data.
Table 3 (excerpt). Columns: Model; Layers and Attention Mechanism; Initialization; Optimizer.
H2H-GCN: two layers with SELU activation; uniform bias, orthogonal weights; Riemannian SGD.
HGCAE: two-layer encoder and decoder with ReLU activation, sparse attention.
¹ A Riemannian manifold is complete if every geodesic can be extended indefinitely.
² A manifold is simply connected if it has no holes, meaning every loop can be contracted to a point.