Clique Matrices for Statistical Graph Decomposition and Parameterising Restricted Positive Definite Matrices
We introduce Clique Matrices as an alternative representation of undirected graphs, being a generalisation of the incidence matrix representation. Here we use clique matrices to decompose a graph into a set of possibly overlapping clusters, defined as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. Inference is carried out using a variational approximation. Clique matrices also play a natural role in parameterising positive definite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive definite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.
💡 Research Summary
The paper introduces “clique matrices” as a novel representation for undirected graphs, extending the classic incidence matrix by using rows that correspond to all maximal cliques rather than individual edges. This representation captures higher‑order connectivity directly and enables a decomposition of a graph into a collection of possibly overlapping, well‑connected vertex subsets (clusters).
The authors formulate a probabilistic model in which each edge indicator Aij follows a Bernoulli distribution whose success probability is derived from the latent binary variables zk that indicate whether clique k is active. The probability that two vertices i and j are connected is 1 minus the product of (1‑zk) over all cliques that contain both i and j. This construction naturally encourages few active cliques while maximizing internal connectivity.
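The edge-probability rule in the previous paragraph can be illustrated with a minimal NumPy sketch. The function name `edge_probability`, the array layout (a K x n binary array `C` whose row k marks the members of clique k) and the relaxation of `z` to values in [0, 1] are assumptions made for illustration, not the paper's notation or code.

```python
import numpy as np

def edge_probability(C, z):
    """Return the n x n matrix with entries 1 - prod_k (1 - z_k * C[k, i] * C[k, j])."""
    C = np.asarray(C, dtype=float)  # shape (K, n): row k marks the members of clique k
    z = np.asarray(z, dtype=float)  # shape (K,): activation of each clique, in [0, 1]
    # shared[k, i, j] is 1 exactly when clique k contains both i and j
    shared = C[:, :, None] * C[:, None, :]
    # cliques that do not contain both i and j contribute a factor of 1 to the product
    return 1.0 - np.prod(1.0 - z[:, None, None] * shared, axis=0)

# Hypothetical toy graph on 3 vertices with cliques {0, 1} and {1, 2}
C = np.array([[1, 1, 0],
              [0, 1, 1]])
P_hard = edge_probability(C, z=[1.0, 0.0])   # only the first clique is active
P_soft = edge_probability(C, z=[0.5, 0.5])   # both cliques half active
```

With only the first clique active, the pair (0, 1) is connected with probability 1 and every other off-diagonal pair with probability 0. The diagonal entries are a by-product of the vectorised formula and carry no meaning for a simple graph.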
Inference is performed via a mean‑field variational approximation. The posterior over the binary clique activations is factorised as q(z)=∏k Bernoulli(πk). Maximising the Evidence Lower Bound yields closed‑form updates for the activation probabilities πk that balance the data‑fit term (expected log‑likelihood) against a sparsity‑inducing regulariser λ that penalises the number of active cliques. The algorithm proceeds in an EM‑like fashion: the E‑step updates πk, while the M‑step refines the clique matrix C. Because finding the exact set of maximal cliques is NP‑hard, the authors employ greedy and local‑search heuristics to obtain a tractable approximation of C.
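The summary does not say which greedy heuristic is used to build C, so the following is only one plausible sketch of the idea: repeatedly pick an edge that is not yet covered, grow it greedily into a clique, and record that clique as a row of C. The function name `greedy_clique_cover` and the K x n row layout are assumptions made for illustration, and this is a generic edge-clique-cover heuristic rather than the authors' procedure.

```python
import numpy as np

def greedy_clique_cover(A):
    """Cover every edge of a symmetric adjacency matrix A with greedily grown cliques."""
    A = np.asarray(A, dtype=bool)
    n = A.shape[0]
    uncovered = np.triu(A, k=1).copy()  # each undirected edge once, as (i, j) with i < j
    cliques = []
    while uncovered.any():
        i, j = np.argwhere(uncovered)[0]  # first edge not yet covered
        members = [int(i), int(j)]
        # greedily add any vertex adjacent to every current member
        for v in range(n):
            if v not in members and all(A[v, u] for u in members):
                members.append(v)
        members.sort()
        # mark all edges inside the new clique as covered
        for a in members:
            for b in members:
                if a < b:
                    uncovered[a, b] = False
        cliques.append(members)
    C = np.zeros((len(cliques), n), dtype=int)
    for k, members in enumerate(cliques):
        C[k, members] = 1
    return C

# Hypothetical toy graph: a triangle {0, 1, 2} with a pendant edge (2, 3)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
C = greedy_clique_cover(A)
```

The result depends on the vertex ordering and is not guaranteed to use the fewest cliques, which is why a local-search refinement of the kind mentioned above would normally follow.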
Beyond clustering, the paper shows that clique matrices provide a complete parametrisation of positive‑definite (PD) matrices subject to zero‑constraints dictated by a graph. For decomposable (chordal) graphs, any PD matrix Σ respecting the graph’s sparsity can be written as Σ = CᵀΛC, where Λ is a diagonal PD matrix and C encodes the perfect elimination ordering of the graph. Consequently, the zero pattern of Σ is automatically satisfied, and the parametrisation is both minimal and interpretable.
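A small numerical check makes the zero-pattern argument concrete. The sketch below assumes a binary K x n array `C` with one row per clique, plus one singleton row per vertex so that C has full column rank; the singleton rows, the name `constrained_covariance` and the example weights are illustrative assumptions. It demonstrates only one direction of the claim (any such C and positive Λ yield a positive-definite Σ with the required zeros), not the completeness result for decomposable graphs.

```python
import numpy as np

def constrained_covariance(C, lam):
    """Return Sigma = C^T diag(lam) C for a K x n clique matrix C and K positive weights."""
    C = np.asarray(C, dtype=float)
    lam = np.asarray(lam, dtype=float)
    return C.T @ np.diag(lam) @ C

# Path graph 0 - 1 - 2 (chordal): cliques {0, 1} and {1, 2}, plus singleton rows
C = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])
lam = np.array([1.0, 2.0, 0.5, 0.5, 0.5])

Sigma = constrained_covariance(C, lam)
min_eig = np.linalg.eigvalsh(Sigma).min()  # positive when Sigma is positive definite
```

Because vertices 0 and 2 share no clique, every term of the sum contributing to Σ[0, 2] vanishes, so the zero is structural and does not depend on the values in `lam`.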
When the underlying graph is non‑decomposable, the authors reinterpret the same factorisation as a structured factor analysis model: observed variables x∈ℝⁿ are generated by x = Cᵀf + ε, with latent factors f∼N(0,Λ) and independent noise ε∼N(0,Ψ). Estimating Λ and Ψ via EM yields an approximation to the constrained PD covariance that often outperforms generic sparse‑covariance estimators such as Graphical Lasso in terms of log‑likelihood and information‑criterion scores.
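The generative model x = Cᵀf + ε implies the covariance CᵀΛC + Ψ, which the following sketch checks by simulation. The toy clique matrix, the variances in `lam` and `psi`, the sample size and the helper names are all assumptions chosen for illustration; the EM estimation step described above is not implemented here.

```python
import numpy as np

def implied_covariance(C, lam, psi):
    """Covariance of x = C^T f + eps with f ~ N(0, diag(lam)) and eps ~ N(0, diag(psi))."""
    C = np.asarray(C, dtype=float)
    return C.T @ np.diag(lam) @ C + np.diag(psi)

def sample_factor_model(C, lam, psi, num_samples, rng):
    """Draw samples of x, one per row, from the structured factor analysis model."""
    C = np.asarray(C, dtype=float)
    K, n = C.shape
    f = rng.normal(size=(num_samples, K)) * np.sqrt(lam)    # latent factors, one per clique
    eps = rng.normal(size=(num_samples, n)) * np.sqrt(psi)  # independent observation noise
    return f @ C + eps  # each row equals C^T f + eps for that sample

# Hypothetical toy model: 3 observed variables, 2 cliques {0, 1} and {1, 2}
C = np.array([[1, 1, 0],
              [0, 1, 1]])
lam = np.array([1.0, 2.0])
psi = np.array([0.5, 0.5, 0.5])

rng = np.random.default_rng(0)
X = sample_factor_model(C, lam, psi, num_samples=200_000, rng=rng)
Sigma_model = implied_covariance(C, lam, psi)
Sigma_empirical = np.cov(X, rowvar=False)
```

The empirical covariance should approach `Sigma_model` as the number of samples grows, including a near-zero entry for the pair (0, 2), which shares no factor.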
Empirical evaluation includes synthetic graphs, where the clique‑matrix decomposition recovers the ground‑truth community structure using far fewer clusters than Louvain or Infomap while achieving comparable or higher modularity. Real‑world networks (social interaction, gene‑co‑expression) demonstrate that the extracted overlapping clusters align with known functional modules. In the covariance‑parameterisation experiments, the method exactly recovers all admissible Σ for chordal graphs and provides a tighter approximation for non‑chordal graphs than competing low‑rank or sparse approaches.
In summary, the work unifies graph clustering and constrained covariance modelling under a single algebraic object, the clique matrix. By leveraging a statistical description and variational inference, it delivers an interpretable, parsimonious decomposition of complex networks and a principled way to embed sparsity constraints into positive‑definite matrices. The framework opens avenues for dynamic graph analysis, scalable clique discovery, and extensions to non‑Gaussian graphical models.