Learning Bayesian Networks with the bnlearn R Package


📝 Abstract

bnlearn is an R package which includes several algorithms for learning the structure of Bayesian networks with either discrete or continuous variables. Both constraint-based and score-based algorithms are implemented, and can use the functionality provided by the snow package to improve their performance via parallel computing. Several network scores and conditional independence tests are available both for the learning algorithms and for independent use. Advanced plotting options are provided by the Rgraphviz package.

📄 Content

Journal of Statistical Software (JSS), MMMMMM YYYY, Volume VV, Issue II. http://www.jstatsoft.org/
arXiv:0908.3817v2 [stat.ML] 10 Jul 2010

Learning Bayesian Networks with the bnlearn R Package

Marco Scutari
University of Padova

Abstract: bnlearn is an R package (R Development Core Team 2009) which includes several algorithms for learning the structure of Bayesian networks with either discrete or continuous variables. Both constraint-based and score-based algorithms are implemented, and can use the functionality provided by the snow package (Tierney et al. 2008) to improve their performance via parallel computing. Several network scores and conditional independence tests are available both for the learning algorithms and for independent use. Advanced plotting options are provided by the Rgraphviz package (Gentry et al. 2010).

Keywords: Bayesian networks, R, structure learning algorithms, constraint-based algorithms, score-based algorithms, conditional independence tests.

1. Introduction

In recent years Bayesian networks have been used in many fields, from On-line Analytical Processing (OLAP) performance enhancement (Margaritis 2003) to medical service performance analysis (Acid et al. 2004), gene expression analysis (Friedman et al. 2000), and breast cancer prognosis and epidemiology (Holmes and Jain 2008).

The high dimensionality of the data sets common in these domains has led to the development of several learning algorithms focused on reducing computational complexity while still learning the correct network. Some examples are the Grow-Shrink algorithm in Margaritis (2003), the Incremental Association algorithm and its derivatives in Tsamardinos et al. (2003) and in Yaramakala and Margaritis (2005), the Sparse Candidate algorithm in Friedman et al. (1999), the Optimal Reinsertion algorithm in Moore and Wong (2003), and the Greedy Equivalent Search in Chickering (2002).

The aim of the bnlearn package is to provide a free implementation of some of these structure learning algorithms, along with the conditional independence tests and network scores used to construct the Bayesian network. Both discrete and continuous data are supported. Furthermore, the learning algorithms can be chosen separately from the statistical criterion they are based on (which is usually not possible in the reference implementations provided by the algorithms' authors), so that the best combination for the data at hand can be used.
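The decoupling of learning algorithm and statistical criterion described above can be sketched as follows; this is an illustrative snippet, not taken from the paper, using the learning.test data set shipped with bnlearn:

```r
# Sketch: the same learning algorithm can be paired with different
# statistical criteria. learning.test is a small discrete data set
# included in the bnlearn package.
library(bnlearn)
data(learning.test)

# Constraint-based learning: Grow-Shrink with the default test,
# then with the shrinkage mutual information test ("mi-sh").
bn.gs  <- gs(learning.test)
bn.gs2 <- gs(learning.test, test = "mi-sh")

# Score-based learning: hill climbing with BIC, then with the BDe score.
bn.hc  <- hc(learning.test, score = "bic")
bn.hc2 <- hc(learning.test, score = "bde")

# Compare the two score-based structures arc by arc.
compare(bn.hc, bn.hc2)
```

Swapping `test` or `score` arguments leaves the search strategy untouched, so different combinations can be tried on the same data with minimal code changes.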
2. Bayesian networks

Bayesian networks are graphical models where nodes represent random variables (the two terms are used interchangeably in this article) and arrows represent probabilistic dependencies between them (Korb and Nicholson 2004). The graphical structure G = (V, A) of a Bayesian network is a directed acyclic graph (DAG), where V is the node (or vertex) set and A is the arc (or edge) set.

The DAG defines a factorization of the joint probability distribution of V = {X1, X2, . . . , Xv}, often called the global probability distribution, into a set of local probability distributions, one for each variable. The form of the factorization is given by the Markov property of Bayesian networks (Korb and Nicholson 2004, section 2.2.4), which states that every random variable Xi directly depends only on its parents ΠXi:

P(X_1, \ldots, X_v) = \prod_{i=1}^{v} P(X_i \mid \Pi_{X_i})   (for discrete variables)   (1)

f(X_1, \ldots, X_v) = \prod_{i=1}^{v} f(X_i \mid \Pi_{X_i})   (for continuous variables).   (2)

The correspondence between conditional independence (of the random variables) and graphical separation (of the corresponding nodes of the graph) has been extended to an arbitrary triplet of disjoint subsets of V by Pearl (1988) with d-separation (from direction-dependent separation). Therefore model selection algorithms first try to learn the graphical structure of the Bayesian network (hence the name structure learning algorithms) and then estimate the parameters of the local distribution functions conditional on the learned structure. This two-step approach has the advantage that it considers one local distribution function at a time, and it does not require modeling the global distribution function explicitly. Another advantage is that learning algorithms are able to scale to high-dimensional models without incurring the so-called curse of dimensionality.
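The two-step approach (structure first, then local parameters) can be illustrated with a short, hypothetical bnlearn session; the node name `A` below is just the first variable of the bundled learning.test data:

```r
# Sketch of the two-step approach: learn the DAG, then estimate one
# local distribution per node conditional on its parents (Equation 1).
library(bnlearn)
data(learning.test)

dag    <- hc(learning.test)          # step 1: structure learning
fitted <- bn.fit(dag, learning.test) # step 2: parameter learning

# For discrete data each local distribution is a conditional
# probability table of the node given its parents in the DAG.
fitted$A
```

Because each local distribution involves only a node and its parents, the tables stay small even when the network as a whole has many variables, which is what keeps the approach tractable in high dimensions.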
Although there are many possible choices for both the global and the local distribution functions, the literature has focused mostly on two cases:

• multinomial data (the discrete case): both the global and the local distributions are multinomial, and are represented as probability or contingency tables. This is by far the most common assumption, and the corresponding Bayesian networks are usually referred to as discrete Bayesian networks (or simply as Bayesian networks).

• multivariate normal data (the continuous case): the global distribution is multivariate normal, and the local distributions are normal random variables linked by linear
