Gene regulatory network inference algorithm based on spectral signed directed graph convolution

Reading time: 5 minutes

📝 Original Info

  • Title: Gene regulatory network inference algorithm based on spectral signed directed graph convolution
  • ArXiv ID: 2512.11927
  • Date: 2025-12-12
  • Authors: Rijie Xi, Weikang Xu, Wei Xiong, Yuannong Ye, Bin Zhao

📝 Abstract

Accurately reconstructing Gene Regulatory Networks (GRNs) is crucial for understanding gene functions and disease mechanisms. Single-cell RNA sequencing (scRNA-seq) provides vast data for computational GRN reconstruction. Since GRNs are ideally modeled as signed directed graphs that capture activation/inhibition relationships, a natural approach is to design feature extractors that operate on the network topology and combine the resulting structural features with biological characteristics. However, traditional spectral graph convolution struggles with this representation. We therefore propose MSGRNLink, a novel framework that explicitly models GRNs as signed directed graphs and employs magnetic signed Laplacian convolution. Experiments on simulated and real datasets demonstrate that MSGRNLink outperforms all baseline models in AUROC. Parameter sensitivity analysis and ablation studies confirm its robustness and the importance of each module. In a bladder cancer case study, MSGRNLink predicted more known edges and edge signs than benchmark models, further validating its biological relevance.
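The abstract's key ingredient is the magnetic signed Laplacian, a Hermitian matrix that encodes edge sign in real weights and edge direction in a complex phase. The sketch below follows one common construction from the spectral GNN literature (the magnetic signed Laplacian of He et al., 2022); the paper's exact normalization and phase parameter may differ, and the function name and the charge parameter `q` here are illustrative choices.

```python
import numpy as np

def magnetic_signed_laplacian(A, q=0.25):
    """Hermitian Laplacian for a signed directed adjacency matrix A.

    One common construction (cf. He et al.'s magnetic signed
    Laplacian); MSGRNLink's exact definition may differ.
    """
    A = np.asarray(A, dtype=float)
    A_s = (A + A.T) / 2.0                     # symmetrized signed weights
    D = np.diag(np.abs(A_s).sum(axis=1))      # degrees from weight magnitudes
    # Antisymmetric phase encodes direction: theta_ij = -theta_ji.
    theta = 2.0 * np.pi * q * (np.abs(A) - np.abs(A).T)
    H = A_s * np.exp(1j * theta)              # Hermitian "magnetic" matrix
    return D - H

# Tiny 3-gene example: gene 0 activates gene 1, gene 1 inhibits gene 2.
A = np.array([[0.0, 1.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 0.0,  0.0]])
L = magnetic_signed_laplacian(A)
```

Because `L` is Hermitian, its eigenvalues are real, so the usual spectral convolution machinery (Chebyshev filters, normalized propagation) applies even though the graph is signed and directed.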

💡 Deep Analysis

Figure 1

📄 Full Content

Gene regulatory networks (GRNs) consist of intricate regulatory interactions between transcription factors (TFs) and target genes, a mechanism crucial for maintaining life processes, controlling biochemical reactions, and regulating the levels of chemical compounds. GRNs play a key role in many domains, such as gene function prediction, cancer biomarker identification, and the discovery of potential drug targets [1]. The development of gene sequencing technology has facilitated data acquisition and produced a wealth of gene microarray data. In particular, single-cell RNA sequencing (scRNA-seq) has deepened insights into cellular heterogeneity, providing opportunities to identify high-resolution transcriptional states and transitions [2]. In recent years, researchers have devised many supervised and unsupervised computational methods to infer GRNs from scRNA-seq data, which are primarily classified into three categories: information theory-based methods, machine learning-based methods, and deep learning-based methods.

Methods based on information theory assume that genes within the same group have analogous expression patterns during physiological processes and forecast regulatory linkages by assessing the correlations among genes [3]. For instance, Chan et al. developed the undirected unsigned model PIDC [4] in 2017, which uses partial information decomposition (PID) to uncover gene regulatory relationships. In the same year, Specht et al. presented the undirected unsigned model LEAP [5], which infers GRNs by calculating Pearson correlations over fixed-size time windows with different lags. Later, in 2020, Aibar et al. proposed the undirected unsigned model SCRIBE [6], which constructs GRNs from the mutual information between the past state of a regulator and the current state of a target gene. These methods require only a small sample size and have minimal computing cost, which allows massive networks to be constructed from small amounts of data. However, the GRNs inferred by this kind of approach are undirected and cannot distinguish the upstream from the downstream end of a regulatory link, because the correlations used in the aforementioned literature are bidirectional [7]. Furthermore, these approaches ignore the known network topology and employ only gene expression profiles.
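The lag-based correlation idea behind LEAP can be sketched in a few lines: score a candidate regulator-target pair by the strongest Pearson correlation over a set of time lags. This is a simplification of the published method, which additionally slides fixed-size windows over pseudotime-ordered cells; the function name and `max_lag` parameter are illustrative.

```python
import numpy as np

def max_lagged_corr(x, y, max_lag=3):
    """LEAP-style score (sketch): max |Pearson correlation| between
    regulator x[t] and target y[t + lag] over non-negative lags."""
    best = 0.0
    n = len(x)
    for lag in range(max_lag + 1):
        xs, ys = x[:n - lag], y[lag:]        # align x with y shifted by lag
        r = np.corrcoef(xs, ys)[0, 1]
        best = max(best, abs(r))
    return best

# Toy series: y follows x with a lag of 2 time steps plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = np.roll(x, 2) + 0.1 * rng.normal(size=100)
score = max_lagged_corr(x, y, max_lag=3)
```

Note the symmetry problem the text describes: `max_lagged_corr(x, y)` and `max_lagged_corr(y, x)` can both be high, so correlation alone cannot orient the edge.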

Machine learning-based algorithms transform the GRN inference problem into a classification or regression problem. For example, Huynh-Thu et al. introduced the undirected unsigned model GENIE3 [8] in 2010, which decomposes the prediction of a GRN over p genes into p distinct regression problems. In 2017, Matsumoto et al. proposed the directed unsigned model SCODE [9], which integrates linear ordinary differential equations and linear regression to infer GRNs. Later, in 2019, Moerman et al. proposed the undirected unsigned model GRNBoost2 [10], an efficient gradient-boosting algorithm built on the GENIE3 [8] framework. In 2020, Ghosh et al. used Lasso to build the ensemble regression algorithm PoLoBag [11], the first model in GRN inference to consider both sign and direction simultaneously. Most recently, in 2022, Abdullah et al. designed a non-convex optimization model within the ADMM framework and proposed scSGL [12], which further incorporates kernel functions to model undirected signed gene regulatory networks.
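GENIE3's per-gene decomposition is easy to illustrate: for each target gene, regress its expression on all other genes and read edge scores off the fitted model. GENIE3 itself uses random-forest feature importances; a plain least-squares fit stands in here to keep the sketch dependency-free, so the coefficient magnitudes below are only a stand-in for the real importance scores.

```python
import numpy as np

def per_gene_regression_scores(expr):
    """GENIE3-style decomposition (sketch): for each target gene j,
    regress it on all other genes; scores[i, j] ranks edge i -> j.
    A linear model replaces GENIE3's random forests for simplicity."""
    n_cells, n_genes = expr.shape
    scores = np.zeros((n_genes, n_genes))
    for j in range(n_genes):
        regulators = [i for i in range(n_genes) if i != j]
        X = expr[:, regulators]
        coef, *_ = np.linalg.lstsq(X, expr[:, j], rcond=None)
        scores[regulators, j] = np.abs(coef)
    return scores

# Toy data: gene 1 is driven by gene 0; gene 2 is independent noise.
rng = np.random.default_rng(1)
g0 = rng.normal(size=200)
expr = np.column_stack([g0,
                        2.0 * g0 + 0.1 * rng.normal(size=200),
                        rng.normal(size=200)])
S = per_gene_regression_scores(expr)
```

On this toy data the true driver gene 0 should receive a much larger score for target 1 than the noise gene 2 does, which is exactly how such score matrices are thresholded into edge lists.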

Deep learning-based methods aim to process raw biological data to infer GRNs using classical deep learning algorithms. For example, Kishan et al. proposed the undirected unsigned model GNE [13] in 2019, a supervised model that uses an MLP to encode gene expression for predicting interactions between genes. In the same year, Yuan et al. developed CNNC [14], a directed unsigned model that transforms gene pair co-expression into image-like histograms and applies CNNs for classification. Although both GNE [13] and CNNC [14] incorporate gene expression profiles and network topology, they cannot handle time-series data. To address this limitation, DGRNS [15] was created in 2022, combining an RNN and a CNN to capture temporal and spatial features respectively, thereby enabling accurate inference of directed unsigned regulatory relationships among genes.
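CNNC's "image-like histogram" input is worth making concrete: a gene pair's expression across cells becomes a 2-D joint histogram, which a CNN then classifies as interacting or not. The sketch below shows only that preprocessing step (CNNC additionally log-transforms counts and fixes the bin grid; the function name and bin count here are illustrative).

```python
import numpy as np

def coexpression_image(x, y, bins=8):
    """CNNC-style input (sketch): turn one gene pair's expression
    across cells into a 2-D log-count histogram for a CNN to classify."""
    H, _, _ = np.histogram2d(x, y, bins=bins)
    return np.log1p(H)        # log counts tame heavy-tailed bins

# Toy pair: y is a noisy linear function of x, so mass concentrates
# along the diagonal of the histogram "image".
rng = np.random.default_rng(2)
x = rng.normal(size=500)
img = coexpression_image(x, x + 0.3 * rng.normal(size=500))
```

The appeal of this trick is that it converts a variable-length set of cells into a fixed-size image, letting standard convolutional architectures be reused unchanged.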

Traditional deep learning methods are not well suited for non-Euclidean data such as GRNs, as they fail to exploit the topology of the network. To address this limitation, an increasing number of studies have adopted Graph Neural Networks (GNNs) for GRN inference. In 2020, Wang et al. introduced GRGNN [16], the first model to apply GNNs to GRN inference; it is an undirected unsigned model that trains GNN classifiers on positive and negative subgraphs. In 2022, Chen et al. proposed the directed unsigned model GENELink [17], which first reconstructs the GRN using a graph attention network (GAT) and then predicts potential interactions through a self-attention mechanism. GENELink [17] addresses the issue raised in GNE [13], where captur


Reference

This content is AI-processed based on open access ArXiv data.
