Biological Regulatory Network Inference through Circular Causal Structure Learning
Biological networks are pivotal in deciphering the complexity and functionality of biological systems. Causal inference, which focuses on determining the directionality and strength of interactions between variables rather than merely relying on correlations, is considered a logical approach for inferring biological networks. Existing methods for causal structure inference typically assume that causal relationships between variables can be represented by directed acyclic graphs (DAGs). However, this assumption is at odds with the reality of widespread feedback loops in biological systems, making these methods unsuitable for direct use in biological network inference. In this study, we propose a new framework named SCALD (Structural CAusal model for Loop Diagram), which employs a nonlinear structural equation model and a stable feedback loop conditional constraint through continuous optimization to infer causal regulatory relationships in the presence of feedback loops. We observe that SCALD outperforms state-of-the-art methods in inferring both transcriptional regulatory networks and signal transduction networks. SCALD is particularly effective at identifying feedback regulation. Through transcription factor (TF) perturbation data analysis, we further validate the accuracy and sensitivity of SCALD. Additionally, SCALD facilitates the discovery of previously unknown regulatory relationships, which we have subsequently confirmed through ChIP-seq data analysis. Furthermore, by utilizing SCALD, we infer the key driver genes that facilitate the transformation from colon inflammation to cancer by examining the dynamic changes within regulatory networks during the process.
💡 Research Summary
The paper introduces SCALD (Structural Causal model for Loop Diagram), a framework for inferring biological regulatory networks that explicitly accommodates feedback loops. Traditional causal structure learning methods largely ignore such loops because they assume directed acyclic graphs (DAGs). SCALD combines three technical components: (1) a nonlinear structural equation model (SEM) that captures the nonlinear relationships among genes and proteins, (2) a “stable feedback loop constraint”, compatible with continuous optimization, that keeps the eigenvalues of any feedback sub‑system inside the unit circle so that the loop settles to a stable fixed point, and (3) an end‑to‑end differentiable optimization pipeline (using Adam or similar optimizers) with L1 sparsity regularization and spectral radius penalties.
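The way these ingredients combine into one objective can be illustrated with a small sketch. The Python below is not the authors' implementation: it assumes a tanh link function, no self-loops, a convention in which `w[i][j]` is the effect of variable i on variable j, and hypothetical weights `lam_l1` and `lam_rho`. Rather than computing eigenvalues directly, it uses the quantity ||W^k||_F^(1/k), which is never smaller than the spectral radius, so keeping it below 1 is sufficient to place every eigenvalue inside the unit circle.

```python
import math


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_radius_bound(w, k=16):
    """Return ||W^k||_F ** (1/k), an upper bound on the spectral radius of W."""
    power = w
    for _ in range(k - 1):
        power = matmul(power, w)
    norm = math.sqrt(sum(v * v for row in power for v in row))
    return norm ** (1.0 / k) if norm > 0.0 else 0.0


def objective(samples, w, lam_l1=0.1, lam_rho=10.0):
    """Fit term + L1 sparsity + penalty when the spectral radius bound exceeds 1.

    Assumptions (not from the paper): tanh link, no self-loops,
    w[i][j] = effect of variable i on variable j.
    """
    d = len(w)
    fit = 0.0
    for x in samples:
        for j in range(d):
            drive = sum(x[i] * w[i][j] for i in range(d) if i != j)
            fit += (x[j] - math.tanh(drive)) ** 2
    fit /= 2 * len(samples)
    l1 = sum(abs(w[i][j]) for i in range(d) for j in range(d) if i != j)
    rho_penalty = max(0.0, spectral_radius_bound(w) - 1.0) ** 2
    return fit + lam_l1 * l1 + lam_rho * rho_penalty


# Two-variable feedback loop: variable 0 activates 1, variable 1 represses 0.
stable_loop = [[0.0, 0.8], [-0.5, 0.0]]
# Same topology with a loop gain above 1 (toy values).
runaway_loop = [[0.0, 2.0], [1.5, 0.0]]
samples = [[0.2, 0.1], [-0.3, -0.2], [0.5, 0.4]]

print(spectral_radius_bound(stable_loop) < 1.0)
print(spectral_radius_bound(runaway_loop) > 1.0)
print(objective(samples, stable_loop) < objective(samples, runaway_loop))
```

This version only evaluates the objective. In a gradient-based setting the same terms would be written in an automatic-differentiation library so that an optimizer such as Adam can minimize them with respect to `w`.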
By formulating the entire network as a single continuous objective, SCALD avoids the combinatorial explosion of discrete graph searches and can directly learn both feed‑forward and cyclic edges. The authors benchmarked SCALD on two major biological contexts. First, using large‑scale transcriptomic data together with transcription‑factor (TF) perturbation experiments, SCALD reconstructed transcriptional regulatory networks with AUROC and AUPRC improvements of 5–12 % over state‑of‑the‑art methods such as GENIE3, PIDC, and NOTEARS. Crucially, the proportion of correctly identified feedback edges was markedly higher, demonstrating the method’s ability to recover cyclic regulation. Second, SCALD was applied to phospho‑proteomic signaling data, where it accurately recovered known feedback loops in MAPK and PI3K/AKT pathways and uncovered novel feedback interactions missed by DAG‑based approaches.
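For readers unfamiliar with the two metrics, the sketch below shows one common way to score a ranked list of candidate edges against a reference network. It is an illustration only: the edge scores and labels are toy values, and the summary does not state which estimator of AUPRC the authors used, so average precision is assumed here.

```python
def auroc(scores, labels):
    """Probability that a random true edge outscores a random false edge.

    Ties between a true and a false edge count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true and one false edge")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values at the rank of each true edge."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one true edge")
    return total / hits


# Toy example: four candidate edges, two of which are in the reference network.
edge_scores = [0.9, 0.8, 0.3, 0.1]
edge_labels = [1, 0, 1, 0]
print(auroc(edge_scores, edge_labels))              # 3 of 4 (true, false) pairs ordered correctly
print(average_precision(edge_scores, edge_labels))  # (1/1 + 2/3) / 2
```

Because true regulatory edges are usually a small fraction of all candidate pairs, AUPRC tends to be the more demanding of the two metrics.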
To validate the biological relevance of its predictions, the authors cross‑referenced SCALD‑inferred edges with independent ChIP‑seq datasets, achieving a concordance rate exceeding 78 %. This high overlap suggests that the nonlinear SEM together with the stability constraint captures genuine DNA‑protein binding events. Finally, SCALD was employed to analyze a time‑course dataset tracking the transition from colitis to colorectal cancer. The method identified key driver genes, including STAT3, MYC, and NF‑κB, that become increasingly central as the disease progresses, in line with existing literature on inflammation‑driven oncogenesis.
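The summary does not define how the concordance rate was computed. One plausible reading, assumed here, is the fraction of predicted regulator-target edges for which ChIP-seq shows the regulator binding near the target gene. The identifiers below are placeholders, not real genes or results.

```python
def concordance_rate(predicted_edges, chip_supported_edges):
    """Fraction of predicted (regulator, target) edges with ChIP-seq support."""
    predicted = set(predicted_edges)
    if not predicted:
        raise ValueError("no predicted edges to evaluate")
    return len(predicted & set(chip_supported_edges)) / len(predicted)


# Placeholder identifiers for illustration only.
predicted = [("TF_A", "gene_1"), ("TF_A", "gene_2"),
             ("TF_B", "gene_1"), ("TF_B", "gene_3")]
chip = {("TF_A", "gene_1"), ("TF_B", "gene_1"),
        ("TF_B", "gene_3"), ("TF_C", "gene_9")}

print(concordance_rate(predicted, chip))  # 3 of the 4 predicted edges are supported
```

Under this definition the rate measures precision with respect to binding evidence. It says nothing about edges that ChIP-seq supports but the model missed.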
Overall, SCALD represents a significant methodological advance for causal inference in biology. By allowing cycles, modeling nonlinearity, and leveraging continuous optimization, it delivers more accurate and biologically plausible network reconstructions than traditional DAG‑centric tools. The framework is readily extensible to multi‑omics integration, longitudinal studies, and clinical predictive modeling, positioning it as a versatile platform for future systems‑biology investigations.