Reconstruction of Markov Random Fields from Samples: Some Easy Observations and Algorithms
Markov random fields are used to model high dimensional distributions in a number of applied areas. Much recent interest has been devoted to the reconstruction of the dependency structure from independent samples from the Markov random fields. We analyze a simple algorithm for reconstructing the underlying graph defining a Markov random field on $n$ nodes and maximum degree $d$ given observations. We show that under mild non-degeneracy conditions it reconstructs the generating graph with high probability using $\Theta(d \epsilon^{-2}\delta^{-4} \log n)$ samples where $\epsilon,\delta$ depend on the local interactions. For most local interactions, $\epsilon,\delta$ are of order $\exp(-O(d))$. Our results are optimal as a function of $n$ up to a multiplicative constant depending on $d$ and the strength of the local interactions. Our results seem to be the first results for general models that guarantee that {\em the} generating model is reconstructed. Furthermore, we provide an explicit $O(n^{d+2} \epsilon^{-2}\delta^{-4} \log n)$ running time bound. In cases where the measure on the graph has correlation decay, the running time is $O(n^2 \log n)$ for all fixed $d$. We also discuss the effect of observing noisy samples and show that as long as the noise level is low, our algorithm is effective. On the other hand, we construct an example where large noise implies non-identifiability even for generic noise and interactions. Finally, we briefly show that in some simple cases, models with hidden nodes can also be recovered.
💡 Research Summary
The paper addresses the fundamental problem of learning the underlying graph structure of a Markov Random Field (MRF) from independent samples. The authors propose a remarkably simple algorithm that relies only on pairwise conditional probability estimates and a thresholding test, yet they prove that it recovers the exact generating graph with high probability under mild non‑degeneracy conditions.
Model and assumptions.
Consider an MRF on n discrete variables with maximum degree d. For every true edge (i, j) the conditional distribution of Xi given Xj differs from the marginal distribution of Xi by at least a constant ε > 0, and this difference occurs with probability at least δ > 0 over the sampling distribution. These two parameters capture the strength and the frequency of local interactions; they are assumed to be strictly positive for all edges, while non‑edges exhibit no such systematic deviation.
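To make the ε/δ condition concrete, here is a small worked check (our own illustration, not from the paper): for a two-node Ising pair P(x₁, x₂) ∝ exp(β·x₁·x₂), the conditional P(X₁ = +1 | X₂ = +1) differs from the marginal P(X₁ = +1) by tanh(β)/2, and the conditioning event X₂ = +1 has probability 1/2.

```python
import math

# A two-node Ising pair P(x1, x2) proportional to exp(beta * x1 * x2),
# with states in {-1, +1}.  This example is ours, not the authors'.
beta = 1.0
states = [-1, +1]
w = {(a, b): math.exp(beta * a * b) for a in states for b in states}
Z = sum(w.values())
p = {k: v / Z for k, v in w.items()}

# Marginal P(X1 = +1), P(X2 = +1), and conditional P(X1 = +1 | X2 = +1).
p_x1 = sum(p[(+1, b)] for b in states)
p_x2 = sum(p[(a, +1)] for a in states)
p_cond = p[(+1, +1)] / p_x2

eps = abs(p_cond - p_x1)   # strength of the edge interaction: tanh(beta)/2
delta = p_x2               # probability of the conditioning event: 1/2
```

With β = 1 this gives ε ≈ 0.38 and δ = 0.5; weaker couplings shrink ε, which is why the sample complexity degrades as interactions weaken.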
Algorithm.
For each node i the algorithm scans all other nodes j ∈ V \ {i}. Using m samples it computes empirical conditional probabilities $\hat P(X_i=a\mid X_j=b)$ and the marginal $\hat P(X_i=a)$. If for some state pair (a, b) the absolute difference exceeds ε/2 and the event (Xi = a, Xj = b) appears in at least a δ/2 fraction of the samples, then j is declared a neighbor of i. The procedure is repeated for all i, yielding a symmetric edge set.
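The pairwise thresholding test above can be sketched in a few lines of Python (a minimal illustration of the described procedure; the function name and array layout are our own):

```python
import numpy as np

def reconstruct_edges(samples, eps, delta):
    """Sketch of the pairwise conditional-probability thresholding test.

    samples: (m, n) array of discrete observations, one row per sample.
    eps, delta: the non-degeneracy parameters from the model assumptions.
    Returns a set of undirected edges as frozensets {i, j}.
    """
    m, n = samples.shape
    edges = set()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for a in np.unique(samples[:, i]):
                for b in np.unique(samples[:, j]):
                    joint = np.mean((samples[:, i] == a) & (samples[:, j] == b))
                    p_j = np.mean(samples[:, j] == b)
                    p_i = np.mean(samples[:, i] == a)
                    # Skip rare events: require joint frequency >= delta/2.
                    if joint < delta / 2:
                        continue
                    cond = joint / p_j  # empirical P(Xi = a | Xj = b)
                    # Declare an edge if the conditional deviates from the
                    # marginal by more than eps/2 for some state pair.
                    if abs(cond - p_i) > eps / 2:
                        edges.add(frozenset((i, j)))
    return edges
```

Since the test is run for every ordered pair and every state pair, the edge set it produces is symmetric by construction, matching the description above.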
Sample complexity.
Applying Chernoff–Hoeffding bounds to the empirical estimates, the authors show that with $O(d\,\epsilon^{-2}\delta^{-4}\log n)$ samples, all empirical conditional and marginal probabilities concentrate tightly enough around their true values that, with high probability, every true edge passes the threshold test and no non-edge does.
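The bound can be turned into a rough sample-size calculator (an illustration of the asymptotic scaling only; the constant C is a placeholder we introduce, since the paper's analysis fixes it via the Chernoff–Hoeffding argument):

```python
import math

def sample_bound(n, d, eps, delta, C=1.0):
    """Illustrative O(d * eps^-2 * delta^-4 * log n) sample bound.

    C is a hypothetical leading constant, not taken from the paper.
    """
    return math.ceil(C * d * eps**-2 * delta**-4 * math.log(n))
```

For example, n = 1000 nodes with d = 3, ε = 0.1, δ = 0.25 already requires on the order of half a million samples at C = 1, which shows how sharply the δ⁻⁴ factor dominates when interactions are weak.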