A Guide to Bayesian Networks Software Packages for Structure and Parameter Learning -- 2025 Edition
Artificial intelligence needs a representation of cause-effect mechanisms to capture how the world works. Bayesian Networks (BNs) have proven to be an effective and versatile tool for this task. Building a BN requires constructing a structure of dependencies among variables and learning the parameters that govern these relationships. These tasks, referred to as structural learning and parameter learning, are actively investigated by the research community, with several algorithms proposed and no single method having established itself as a standard. A wide range of software, tools, and packages have been developed for BN analysis and made available to academic researchers and industry practitioners. With no one-size-fits-all solution, taking the first practical steps and getting oriented in the field can be challenging for outsiders and beginners. In this paper, we review the most relevant tools and software for BN structural and parameter learning to date, providing subjective recommendations aimed at an audience of beginners. In addition, we provide an extensive, easy-to-consult overview table summarizing all software packages and their main features. By improving readers' understanding of which available software might best suit their needs, we improve accessibility to the field and make it easier for beginners to take their first steps into it.
💡 Research Summary
The paper “A Guide to Bayesian Networks Software Packages for Structure and Parameter Learning – 2025 Edition” offers a comprehensive, up‑to‑date survey of tools that support the two fundamental tasks in Bayesian network (BN) modeling: learning the graph structure from data and estimating the conditional probability parameters. Recognizing that the BN community has produced a plethora of algorithms—constraint‑based, score‑based, hybrid, functional, and gradient‑based—but no single method has become a de facto standard, the authors set out to map the software landscape to help newcomers navigate the field.
The authors first outline the theoretical background, reminding readers that a BN is a directed acyclic graph (DAG) whose nodes encode random variables and whose edges encode probabilistic dependencies. They briefly categorize structure‑learning algorithms into four groups and point to recent surveys for deeper methodological details. Parameter learning is presented as the complementary problem of estimating conditional probability tables (CPTs) once the DAG is known, with references to classic works on maximum‑likelihood and Bayesian estimation.
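To make the parameter-learning problem concrete, here is a minimal sketch of maximum-likelihood CPT estimation for a discrete BN with a known DAG, using only pandas. The variables (`Rain`, `WetGrass`) and data are illustrative, not drawn from the paper:

```python
import pandas as pd

# Toy discrete data; the DAG is assumed known: Rain -> WetGrass.
data = pd.DataFrame({
    "Rain":     ["yes", "yes", "no", "no", "no", "yes", "no", "no"],
    "WetGrass": ["yes", "yes", "no", "yes", "no", "yes", "no", "no"],
})

def mle_cpt(df, child, parents):
    """Maximum-likelihood CPT: P(child | parents) as normalized counts."""
    if not parents:
        return df[child].value_counts(normalize=True)
    return df.groupby(parents)[child].value_counts(normalize=True)

cpt = mle_cpt(data, "WetGrass", ["Rain"])
print(cpt)
# e.g. P(WetGrass=yes | Rain=yes) = 3/3 = 1.0, P(WetGrass=yes | Rain=no) = 1/5 = 0.2
```

Bayesian estimation differs only in adding pseudo-counts (a Dirichlet prior) to the raw counts before normalizing, which avoids zero probabilities for unseen parent-child combinations.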
Section 2 constitutes the core of the paper. Sixteen software packages are described, spanning open‑source libraries in Python, R, and Java, as well as commercial platforms. For each tool, the authors list: (i) the programming language(s) and primary interface (API, GUI, command line), (ii) the families of algorithms implemented, (iii) support for discrete, Gaussian, conditional‑Gaussian, and dynamic (time‑varying) models, (iv) parameter‑learning capabilities (MLE, Bayesian, EM, handling of missing data), (v) inference engines, (vi) documentation quality, and (vii) licensing and price. Highlights include:
- gCastle (Python) – a Huawei‑backed end‑to‑end toolbox that bundles score‑based, gradient‑based and hybrid causal discovery methods, provides synthetic data generators, evaluation metrics, and a web‑based GUI.
- bnlearn (R) – the most mature open‑source package for static BNs, offering a full suite of constraint‑based (PC, GS, IAMB, MMPC, etc.), score‑based (Hill‑Climbing, Tabu), and hybrid (MMHC) algorithms, plus extensive parameter‑learning (MLE, Bayesian) and inference utilities. Its documentation is tightly linked to two standard textbooks.
- pgmpy (Python) – a modular library that supports static and dynamic BNs, includes a variety of inference algorithms, and provides a clear API for custom model building.
- pyAgrum (Python wrapper for C++ aGrUM) – delivers high‑performance inference, supports both static and dynamic models, and supplies a rich set of tutorials, interactive widgets, and a “Book of Why” solution list.
- Tetrad (Java) – a GUI‑driven suite from CMU‑CLeaR offering classic constraint‑based algorithms (PC, FCI, CPC) and basic MLE parameter learning, with facilities for data preprocessing, missing‑value imputation, and simulation.
- Causal‑Command (Java CLI) – a command‑line collection of more than 30 causal discovery algorithms, suited for batch processing or integration into Java pipelines.
- LiNGAM (Python) – specialized in linear non‑Gaussian models, providing Direct‑LiNGAM, VAR‑LiNGAM, and related algorithms for time‑series data.
- CDT (Python) – integrates many R‑based algorithms (bnlearn, pcalg) into a Python environment, leverages PyTorch for deep‑learning‑based causal discovery, and offers a large algorithmic repertoire.
- pomegranate (Python) – a general probabilistic modeling library that includes BN and hidden Markov model components, with both constraint‑based and score‑based structure learning.
- OpenMarkov (Java GUI) – a lightweight desktop tool supporting PC and Hill‑Climbing search.
The commercial offerings—BayesFusion (GeNIe/SMILE), BayesiaLab, and Bayes Server—are presented as mature, user‑friendly platforms that bundle extensive GUIs, cross‑language APIs (C++, Python, Java, .NET, R, Matlab), cloud deployment options, and professional support. They target enterprise users who prioritize stability, documentation, and integration over algorithmic breadth.
Section 3 translates the technical inventory into practical advice for beginners. The authors propose three decision pathways:
- Structure‑only discovery – recommend gCastle for its documentation and optional GUI, CDT for the widest algorithmic coverage (especially when PyTorch is desired), and LiNGAM when the user’s domain assumes linear non‑Gaussian relationships. Causal‑Command is suggested only for users comfortable with command‑line workflows and larger data volumes.
- Structure + parameter learning – position bnlearn as the first choice because of its comprehensive algorithm set, robust parameter‑learning routines, and strong pedagogical resources. pgmpy and pyAgrum are offered as Python alternatives that also support dynamic BNs, with pyAgrum distinguished by its high‑performance inference engine and interactive tutorials.
- Commercial/enterprise solutions – outline the trade‑offs among GeNIe/SMILE (flexible language bindings, mobile/web deployment), BayesiaLab (intuitive UI, extensive tutorials), and Bayes Server (cloud‑native, Spark integration).
The paper concludes by summarizing its contributions: (i) a curated, up‑to‑date catalogue of 16 BN tools, (ii) a side‑by‑side feature matrix that enables rapid comparison, (iii) beginner‑focused recommendations based on learning goals, programming language preference, need for GUI, dynamic modeling, and commercial support. The authors acknowledge the subjective nature of their rankings and call for future work on systematic benchmarking, domain‑specific evaluations (e.g., systems biology, finance), and tracking of emerging deep‑learning‑based causal discovery frameworks.
Overall, the article serves as a practical roadmap for researchers, data scientists, and engineers who are stepping into Bayesian network modeling for the first time, helping them avoid analysis paralysis and select the software that best aligns with their immediate objectives and long‑term project requirements.