Constructing and sampling directed graphs with given degree sequences

The interactions between the components of complex networks are often directed. Proper modeling of such systems frequently requires the construction of ensembles of digraphs with a given sequence of in- and out-degrees. As the number of simple labeled graphs with a given degree sequence is typically very large even for short sequences, sampling methods are needed for statistical studies. Currently, there are two main classes of methods that generate samples. One of the existing methods first generates a restricted class of graphs, then uses a Markov Chain Monte-Carlo algorithm based on edge swaps to generate other realizations. As the mixing time of this process is still unknown, the independence of the samples is not well controlled. The other class of methods is based on the Configuration Model that may lead to unacceptably many sample rejections due to self-loops and multiple edges. Here we present an algorithm that can directly construct all possible realizations of a given bi-degree sequence by simple digraphs. Our method is rejection free, guarantees the independence of the constructed samples, and provides their weight. The weights can then be used to compute statistical averages of network observables as if they were obtained from uniformly distributed sampling, or from any other chosen distribution.

💡 Research Summary

The paper addresses the fundamental problem of generating ensembles of simple directed graphs (digraphs) that exactly realize a prescribed bi‑degree sequence (the list of in‑ and out‑degrees for each vertex). This problem is central to the statistical analysis of many real‑world networks—social, biological, ecological—where only degree information is known but the direction of interactions matters. Existing approaches fall into two broad categories. The first uses a restricted initial construction followed by edge‑swap moves within a Markov‑Chain Monte‑Carlo (MCMC) framework. While this can in principle explore the full space of realizations, the mixing time of the chain is unknown for arbitrary degree sequences, making it impossible to guarantee that sampled graphs are statistically independent. The second class is based on the Configuration Model, which randomly matches out‑stubs to in‑stubs. This method suffers from a high rejection rate because self‑loops and multiple edges are frequently created, especially for dense or heterogeneous degree sequences.
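The rejection problem of the Configuration Model can be made concrete with a minimal sketch. The function names `configuration_model_attempt` and `sample_with_rejection` and the `max_tries` cap are illustrative assumptions, not taken from the paper; the code only shows plain stub matching with whole-sample rejection.

```python
import random

def configuration_model_attempt(out_deg, in_deg, rng):
    """One stub-matching attempt (hypothetical helper).

    Returns a set of directed edges (u, v), or None if the attempt
    produced a self-loop or a repeated edge and must be rejected.
    """
    out_stubs = [v for v, d in enumerate(out_deg) for _ in range(d)]
    in_stubs = [v for v, d in enumerate(in_deg) for _ in range(d)]
    if len(out_stubs) != len(in_stubs):
        raise ValueError("total out-degree must equal total in-degree")
    rng.shuffle(in_stubs)  # random pairing of out-stubs with in-stubs
    edges = set()
    for u, v in zip(out_stubs, in_stubs):
        if u == v or (u, v) in edges:
            return None  # self-loop or multi-edge: discard the whole sample
        edges.add((u, v))
    return edges

def sample_with_rejection(out_deg, in_deg, rng, max_tries=100_000):
    """Repeat attempts until one is simple; also report how many were needed."""
    for tries in range(1, max_tries + 1):
        edges = configuration_model_attempt(out_deg, in_deg, rng)
        if edges is not None:
            return edges, tries
    raise RuntimeError("no simple digraph found within max_tries")

if __name__ == "__main__":
    edges, tries = sample_with_rejection([1, 1, 1], [1, 1, 1], random.Random(0))
    print(sorted(edges), "accepted after", tries, "attempt(s)")
```

The returned `tries` count is the quantity that grows quickly for dense or heterogeneous sequences, which is the cost the paper's method is designed to avoid.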

To overcome these limitations, the authors develop a rejection‑free, exact construction algorithm that can generate any simple digraph realizing a given bi‑degree sequence and simultaneously assign a weight to each generated realization. The algorithm builds on two theoretical pillars: the Fulkerson‑Ryser (FR) theorem, which provides necessary and sufficient conditions for a bi‑degree sequence to be graphical, and the concept of a “star constraint,” which captures the restriction that once a vertex has begun to use its out‑stubs, all of its out‑stubs must be placed before any other vertex’s stubs are touched.
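A graphicality test of this kind can be sketched as follows. The code uses the inequality commonly stated as the Fulkerson-Chen-Anstee condition; whether this is exactly the form used in the paper is an assumption, and the function name `is_digraphic` is illustrative.

```python
def is_digraphic(out_deg, in_deg):
    """Check whether (out_deg, in_deg) can be realized by a simple digraph.

    Sketch of the Fulkerson-Chen-Anstee condition: with pairs (a_i, b_i)
    sorted in non-increasing lexicographic order, require for every k
        sum_{i<=k} a_i <= sum_{i<=k} min(b_i, k-1) + sum_{i>k} min(b_i, k),
    together with equal degree sums.
    """
    if len(out_deg) != len(in_deg):
        raise ValueError("sequences must have the same length")
    n = len(out_deg)
    if sum(out_deg) != sum(in_deg):
        return False
    # In a simple digraph no degree can be negative or exceed n - 1.
    if any(d < 0 or d > n - 1 for d in list(out_deg) + list(in_deg)):
        return False
    pairs = sorted(zip(out_deg, in_deg), reverse=True)
    a = [p[0] for p in pairs]
    b = [p[1] for p in pairs]
    lhs = 0
    for k in range(1, n + 1):
        lhs += a[k - 1]
        rhs = sum(min(b[i], k - 1) for i in range(k))
        rhs += sum(min(b[i], k) for i in range(k, n))
        if lhs > rhs:
            return False
    return True

if __name__ == "__main__":
    print(is_digraphic([1, 1, 1], [1, 1, 1]))  # a directed 3-cycle exists
    print(is_digraphic([2, 0, 0], [1, 1, 0]))  # vertex 0 lacks two valid targets
```

This direct version recomputes the right-hand side for every k and so costs O(N²) per call; it is meant to show the condition, not an optimized test.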

The procedure works as follows. First, the input sequence is sorted into normal order (non‑increasing in‑degree, breaking ties by non‑increasing out‑degree). Then, repeatedly: (1) select a vertex with remaining out‑stubs; (2) connect all its out‑stubs to the vertices with the largest remaining in‑degrees, avoiding self‑loops and duplicate directed edges; (3) after each batch of connections, recompute the residual bi‑degree sequence and verify that it still satisfies the FR inequalities. Because the FR test is applied after every batch, the algorithm guarantees that the remaining sub‑problem is still realizable; thus no step ever leads to a dead‑end that would require backtracking or rejection. Once a vertex’s out‑stubs are exhausted, the vertex is removed from further consideration, and the process continues on the reduced set of vertices.
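The greedy core of such a procedure can be sketched with a Kleitman-Wang-style construction, which always connects a vertex to the targets that come first in normal order. This is a deterministic simplification that yields a single realization; it is not the paper's randomized algorithm, and the name `greedy_realization` is an assumption.

```python
def greedy_realization(out_deg, in_deg):
    """Build one simple digraph with the given degrees, or return None.

    Kleitman-Wang-style sketch: each vertex places all of its out-stubs at
    once on the other vertices that rank highest by (remaining in-degree,
    remaining out-degree). This yields only one realization.
    """
    if len(out_deg) != len(in_deg):
        raise ValueError("sequences must have the same length")
    n = len(out_deg)
    out_left = list(out_deg)
    in_left = list(in_deg)
    edges = []
    for u in range(n):
        d = out_left[u]
        if d == 0:
            continue
        # Valid targets: not u itself (no self-loops), with a free in-stub.
        candidates = [v for v in range(n) if v != u and in_left[v] > 0]
        # Normal order: largest remaining in-degree first, ties broken by
        # largest remaining out-degree.
        candidates.sort(key=lambda v: (in_left[v], out_left[v]), reverse=True)
        if len(candidates) < d:
            return None  # not enough distinct targets: sequence not realizable
        for v in candidates[:d]:
            edges.append((u, v))
            in_left[v] -= 1
        out_left[u] = 0  # all out-stubs of u are placed in one batch
    if any(in_left):
        return None  # leftover in-stubs: degree sums did not match
    return edges

if __name__ == "__main__":
    print(greedy_realization([1, 1, 1], [1, 1, 1]))
    print(greedy_realization([2, 0, 0], [1, 1, 0]))
```

Because every target within one batch is distinct and each vertex is processed once, no duplicate edge can arise. The tie-break on remaining out-degree matters: in the 3-cycle example, preferring a target that has already spent its out-stubs would strand the last vertex with no valid target.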

A crucial contribution is the derivation of sample weights. Each time a vertex’s out‑stubs are matched, there may be several equally valid choices of target vertices (subject to the star constraint). The algorithm records the number of admissible choices at each step; the product of the reciprocals of these numbers yields the probability of the particular realization under the algorithm’s intrinsic bias. By weighting observables with the inverse of this probability, one can obtain unbiased estimates as if the graphs were drawn uniformly from the set of all realizations, or from any other desired distribution by re‑weighting accordingly.
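The re-weighting step can be illustrated with a short sketch that takes the description above at face value: the probability of a sample is the product of the reciprocals of the recorded choice counts, so its weight is the product of the counts themselves. The paper's exact weight may contain further combinatorial factors. The names `log_weight` and `weighted_mean` are assumptions.

```python
import math

def log_weight(choice_counts):
    """Log of the sample weight w = 1/p, where p = prod(1/c) over all steps.

    Working in logarithms avoids overflow when many steps are multiplied.
    """
    return sum(math.log(c) for c in choice_counts)

def weighted_mean(values, log_weights):
    """Estimate a uniform-ensemble average: sum(w_i * Q_i) / sum(w_i)."""
    if len(values) != len(log_weights) or not values:
        raise ValueError("need one log-weight per value, and at least one sample")
    shift = max(log_weights)  # rescale so the largest weight becomes 1
    scaled = [math.exp(lw - shift) for lw in log_weights]
    return sum(s * q for s, q in zip(scaled, values)) / sum(scaled)

if __name__ == "__main__":
    # Two hypothetical samples: one built with 2 then 3 admissible choices,
    # one built with a single admissible choice at each step.
    lws = [log_weight([2, 3]), log_weight([1, 1])]
    observable = [1.0, 7.0]  # made-up values of some network observable
    print(weighted_mean(observable, lws))
```

Normalizing by the sum of the weights means only relative weights matter, so a constant factor shared by all samples cancels.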

The authors analyze the computational complexity, showing a worst‑case bound of O(N · M) where N is the number of vertices and M the number of edges. This is comparable to the best known algorithms for undirected graphs and dramatically better than the Configuration Model’s expected O(M²) cost in dense cases due to repeated rejections. Memory usage scales linearly with N, making the method suitable for large‑scale networks.

Empirical tests on synthetic degree sequences and on real networks (e.g., food webs, gene regulatory networks) demonstrate that the algorithm successfully generates all possible realizations, whereas the classic Havel‑Hakimi extension for digraphs fails to produce certain valid configurations. Compared with MCMC sampling, the new method achieves orders‑of‑magnitude speed‑ups while delivering statistically independent samples, as confirmed by autocorrelation analyses of network metrics (clustering, assortativity, motif frequencies).

In summary, the paper presents a mathematically rigorous, efficient, and fully rejection‑free algorithm for constructing and sampling directed graphs with prescribed degree sequences. By providing exact sample weights, it enables unbiased estimation of network observables under any chosen sampling distribution. This advances the toolkit for network scientists who need to perform null‑model analyses, hypothesis testing, or generate synthetic directed networks that faithfully preserve degree‑based structure.
