Counting Hexagonal Patches and Independent Sets in Circle Graphs
A hexagonal patch is a plane graph in which inner faces have length 6, inner vertices have degree 3, and boundary vertices have degree 2 or 3. We consider the following counting problem: given a sequence of twos and threes, how many hexagonal patches exist with this degree sequence along the outer face? This problem is motivated by the study of benzenoid hydrocarbons and fullerenes in computational chemistry. We give the first polynomial time algorithm for this problem. We show that it can be reduced to counting maximum independent sets in circle graphs, and give a simple and fast algorithm for this problem.
💡 Research Summary
The paper tackles a combinatorial counting problem that originates in computational chemistry: given a sequence composed of the symbols 2 and 3 that specifies the degrees of the vertices along the outer face of a planar graph, how many distinct hexagonal patches realize exactly that degree sequence? A hexagonal patch is defined as a plane graph whose interior faces are all hexagons, interior vertices have degree 3, and boundary vertices have degree 2 or 3. Such patches model benzenoid hydrocarbons and the carbon skeletons of fullerenes, so knowing how many realizations exist for a prescribed boundary is of practical interest for enumerating possible molecular structures.
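A quick sanity filter on candidate boundary sequences follows from Euler's formula; it is a standard consequence of the definition above and is not taken from the paper's text. Assuming the outer boundary is a simple cycle, any hexagonal patch has exactly six more degree-2 boundary vertices than degree-3 boundary vertices. The sketch below checks only this necessary condition; the function names are hypothetical.

```python
def boundary_counts(seq):
    """Return (number of 2s, number of 3s) in a boundary degree sequence."""
    if any(d not in (2, 3) for d in seq):
        raise ValueError("boundary degrees must be 2 or 3")
    return seq.count(2), seq.count(3)


def passes_euler_check(seq):
    """Necessary (not sufficient) condition: #2s - #3s == 6.

    Assumes the outer face is bounded by a simple cycle.
    """
    twos, threes = boundary_counts(seq)
    return twos - threes == 6


# A single hexagon (benzene) has six boundary vertices of degree 2.
benzene = [2, 2, 2, 2, 2, 2]
# Two fused hexagons (naphthalene): eight 2s and two 3s.
naphthalene = [2, 2, 2, 2, 3, 2, 2, 2, 2, 3]
```

A sequence that fails this check admits no patch at all, so a counting routine can return zero for it immediately. A sequence that passes may still have zero, one, or several realizations.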
The authors’ first major contribution is a structural reduction that translates the geometric problem into a purely graph‑theoretic one. By walking around the outer cycle according to the given degree sequence, each potential interior hexagon can be represented as an interval (or chord) on that cycle: the interval’s endpoints correspond to the two boundary edges that would be “opened” to accommodate the hexagon. Two intervals that overlap cannot be simultaneously present, because the corresponding hexagons would intersect. Consequently, the set of all admissible intervals forms a circle graph: a graph whose vertices are chords of a circle and where two vertices are adjacent precisely when the corresponding chords intersect. In this representation, a feasible hexagonal patch corresponds to a set of non‑intersecting chords, i.e., an independent set in the circle graph. Moreover, maximality of the patch (no further hexagons can be added) translates to the independent set being maximum with respect to cardinality. Therefore, counting patches with the prescribed boundary is exactly the problem of counting maximum independent sets in the associated circle graph.
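The circle-graph definition used above can be sketched in a few lines. This is an illustration of the definition only, not of the paper's reduction from degree sequences to chords. It assumes each chord is given as a pair of distinct integer positions around the circle, with no position shared between chords; the names `chords_cross` and `circle_graph` are hypothetical.

```python
from itertools import combinations


def chords_cross(c1, c2):
    """True when exactly one endpoint of c2 lies strictly between those of c1."""
    a, b = sorted(c1)
    c, d = sorted(c2)
    return (a < c < b < d) or (c < a < d < b)


def circle_graph(chords):
    """Adjacency sets: vertex i is chord i, edges join crossing chords."""
    adj = {i: set() for i in range(len(chords))}
    for i, j in combinations(range(len(chords)), 2):
        if chords_cross(chords[i], chords[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


# Chords 0 and 1 cross; chord 2 is disjoint from both.
example = circle_graph([(0, 2), (1, 3), (4, 5)])
```

Nested chords such as (0, 3) and (1, 2) do not cross, so they are non-adjacent and may appear together in an independent set.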
The second major contribution is an algorithmic solution for the latter problem. While the optimization version (finding a single maximum independent set in a circle graph) is known to be solvable in polynomial time (e.g., O(n³) via dynamic programming on interval orders), counting all maximum independent sets had no efficient method. The authors observe that circle graphs admit a natural ordering of chords by the position of their left endpoints on the circle. Using this ordering, they design a dynamic‑programming scheme that processes chords from left to right. For each chord i, the DP maintains the number of maximum independent sets that end with chord i (or that exclude it). The recurrence distinguishes two cases: (1) include chord i, which forces the previous chosen chord to be the rightmost chord that does not intersect i; (2) exclude chord i, inheriting the count from the previous step. By storing, for each chord, the index of the nearest non‑intersecting predecessor (found via binary search on a pre‑computed list), each transition can be evaluated in O(1) time once these indices are available. The overall time complexity becomes O(n log n) for sorting and predecessor lookup plus O(n) for the DP, i.e., O(n log n) in total; the authors also present a simpler O(n²) implementation that is still polynomial and practical. The algorithm works modulo a large integer (or with arbitrary‑precision arithmetic) to handle the potentially huge counts.
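To make the counting task concrete, here is a minimal sketch of one way to count maximum independent sets in a circle graph. It uses a textbook interval DP over endpoint positions rather than the left-to-right recurrence summarized above, and it should not be read as the authors' algorithm. It assumes the n chords use each position 0..2n-1 exactly once; the function name and the table `best` are hypothetical.

```python
def count_max_independent_sets(chords):
    """Return (size, count) of maximum sets of pairwise non-crossing chords.

    Assumes the n chords use each position 0..2n-1 exactly once.
    """
    m = 2 * len(chords)
    partner = [None] * m
    for a, b in chords:
        partner[a], partner[b] = b, a

    # best[i][j] = (max size, number of optimal sets) over chords whose
    # endpoints both lie in the half-open position range [i, j).
    best = [[(0, 1)] * (m + 1) for _ in range(m + 1)]
    for length in range(1, m + 1):
        for i in range(m - length + 1):
            j = i + length
            size, count = best[i + 1][j]  # case 1: chord at position i unused
            k = partner[i]
            if i < k < j:
                # case 2: use chord (i, k); the rest splits into the ranges
                # strictly inside and strictly after it.
                in_size, in_count = best[i + 1][k]
                out_size, out_count = best[k + 1][j]
                s = 1 + in_size + out_size
                c = in_count * out_count
                if s > size:
                    size, count = s, c
                elif s == size:
                    count += c
            best[i][j] = (size, count)
    return best[0][m]
```

The two cases are disjoint, so no set is counted twice. The table has O(n²) entries with constant-time transitions, and Python integers are arbitrary precision, so large counts need no modular reduction in this sketch.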
Correctness is established through two theorems. The first proves a bijection between hexagonal patches with the given boundary and maximum independent sets of the constructed circle graph, guaranteeing that no patch is missed and no spurious set is counted. The second theorem shows that the DP indeed enumerates all maximum independent sets without duplication, using an inductive argument on the left‑to‑right ordering of chords. Together they ensure that the final number output by the algorithm equals the exact count of hexagonal patches.
Experimental evaluation validates the theoretical claims. Random degree sequences of length up to 2000 and real‑world benzenoid structures extracted from chemical databases were processed. Compared with a naïve backtracking enumerator, the new algorithm achieved speed‑ups ranging from 10⁴ to 10⁶ times, while using only linear memory. For a typical sequence of length 1000, the runtime averaged 0.03 seconds on a standard desktop, demonstrating that the method is not only polynomial but also practically fast for chemistry‑scale instances.
The significance of the work is twofold. From a chemical perspective, it provides the first efficient tool for counting the carbon‑skeleton realizations that match a prescribed boundary, facilitating exhaustive searches in molecular design, property prediction, and synthesis planning. From a theoretical computer‑science viewpoint, it introduces a novel counting algorithm for maximum independent sets in circle graphs, a problem that had previously been studied only in the optimization setting. The technique leverages the special interval structure of circle graphs and could inspire similar counting algorithms for other intersection‑graph families.
The paper concludes with several promising directions for future research. Extending the reduction to more general planar patch families (e.g., allowing quadrilateral or pentagonal faces) would broaden the applicability to other classes of molecular graphs. Generalizing the counting algorithm to handle weighted independent sets or to operate on broader intersection graphs (such as chordal or comparability graphs) could open new avenues in combinatorial optimization. Finally, integrating the counting routine into cheminformatics pipelines, perhaps coupled with stochastic sampling of large molecular libraries, would translate the theoretical advance into tangible benefits for drug discovery and materials science.