Consistent model selection in a collection of stochastic block models
We introduce the penalized Krichevsky-Trofimov (KT) estimator as a convergent method for estimating the number of node clusters when observing multiple networks within both multi-layer and dynamic Stochastic Block Models. We establish the consistency of the KT estimator, showing that it converges to the correct number of clusters in both types of models as the number of nodes in the networks increases. Our estimator does not require a known upper bound on this number to be consistent. Furthermore, we show that these consistency results hold in both dense and sparse regimes, making the penalized KT estimator robust across various network configurations. We illustrate its performance on synthetic datasets.
💡 Research Summary
This paper addresses the problem of determining the number of communities (the model order) in collections of stochastic block models (SBMs), specifically in multi‑layer SBMs (MLSBM) and dynamic SBMs (DynSBM). The authors propose a penalized version of the Krichevsky‑Trofimov (KT) estimator, a Bayesian‑inspired estimator originally designed for categorical distributions, and adapt it to the network setting.
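As background for the KT estimator mentioned above, the following is a minimal sketch of the classical KT probability of a binary sequence, the simplest categorical case. It is not the network likelihood used in the paper. The function names `log_kt_binary` and `log_kt_sequential` are hypothetical and chosen for illustration. The closed form is the Beta(1/2, 1/2) mixture of Bernoulli likelihoods, and the sequential version uses the equivalent add-1/2 predictive rule.

```python
from math import lgamma, log, pi


def log_kt_binary(n0: int, n1: int) -> float:
    """Log KT probability of a binary sequence with n0 zeros and n1 ones.

    Closed form: Gamma(n0 + 1/2) * Gamma(n1 + 1/2) / (pi * Gamma(n0 + n1 + 1)).
    """
    return lgamma(n0 + 0.5) + lgamma(n1 + 0.5) - log(pi) - lgamma(n0 + n1 + 1)


def log_kt_sequential(bits) -> float:
    """Same quantity, accumulated symbol by symbol with the add-1/2 rule."""
    counts = [0, 0]
    total = 0.0
    for b in bits:
        # Predictive probability of the next symbol given the counts so far.
        total += log((counts[b] + 0.5) / (counts[0] + counts[1] + 1.0))
        counts[b] += 1
    return total


# The two computations agree for any ordering with the same counts.
seq = [0, 1, 0, 0, 1]
closed_form = log_kt_binary(seq.count(0), seq.count(1))
sequential = log_kt_sequential(seq)
```

In the network setting the same mixture idea is applied to the block-wise edge counts and to the community labels, which is why the estimator depends on the data only through counts.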
Key contributions are:
- Upper-bound-free consistency: The estimator converges to the true number of clusters \(k_{0}\) without requiring a pre-specified upper bound on \(k\). This removes a restrictive assumption common in previous work.
- Unified treatment of dense and sparse regimes: The analysis covers both dense graphs, whose edge probabilities stay constant, and sparse graphs, whose edge probabilities are scaled by a factor \(\rho_{n}\to 0\). In the sparse case the authors assume \(n\rho_{n}\to\infty\) (the average degree diverges), so \(\rho_{n}\) must decay more slowly than \(1/n\); this is a regime in which weak recovery of the communities is known to be possible.
- Explicit penalty terms: For MLSBM the penalty is …
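The dense and sparse regimes in the contributions above can be made concrete with a small simulation. The sketch below samples a multi-layer SBM under assumptions that are not taken from the paper: all layers share one node labelling, labels are drawn uniformly, and each layer has its own baseline connectivity matrix scaled by a common factor `rho`. The function names, the matrix `base`, and the choice \(\rho_{n}=\log(n)/n\) are illustrative only; that choice satisfies \(n\rho_{n}\to\infty\) while \(\rho_{n}\to 0\).

```python
import random
from math import log


def sample_multilayer_sbm(n, k, base_probs, rho, seed=0):
    """Sample one undirected graph per layer, all sharing the same labels.

    base_probs[l][a][b] is a hypothetical baseline connection probability
    between communities a and b in layer l; rho scales it (rho = 1 is dense).
    """
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in range(n)]  # uniform labels (assumption)
    graphs = []
    for layer_probs in base_probs:
        edges = set()
        for i in range(n):
            for j in range(i + 1, n):
                p = rho * layer_probs[labels[i]][labels[j]]
                if rng.random() < p:
                    edges.add((i, j))
        graphs.append(edges)
    return labels, graphs


def average_degree(n, edges):
    """Average degree of an undirected graph given as a set of edges."""
    return 2 * len(edges) / n


n = 200
base = [
    [[0.8, 0.2], [0.2, 0.8]],  # layer 0
    [[0.6, 0.1], [0.1, 0.6]],  # layer 1
]
rho_n = log(n) / n  # sparse: rho_n -> 0 while n * rho_n = log(n) diverges
labels, graphs = sample_multilayer_sbm(n, 2, base, rho_n)
degrees = [average_degree(n, g) for g in graphs]
```

Setting `rho = 1.0` gives the dense regime with the same labels for a fixed seed, which makes it easy to compare how an order-selection procedure behaves as the graphs thin out.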