Infinite Hierarchical MMSB Model for Nested Communities/Groups in Social Networks
Actors in realistic social networks play not one but a number of diverse roles depending on whom they interact with, and a large number of such role-specific interactions collectively determine social communities and their organizations. Methods for analyzing social networks should capture these multi-faceted role-specific interactions, and, more interestingly, discover the latent organization or hierarchy of social communities. We propose a hierarchical Mixed Membership Stochastic Blockmodel to model the generation of hierarchies in social communities, selective membership of actors to subsets of these communities, and the resultant networks due to within- and cross-community interactions. Furthermore, to automatically discover these latent structures from social networks, we develop a Gibbs sampling algorithm for our model. We conduct extensive validation of our model using synthetic networks, and demonstrate the utility of our model in real-world datasets such as predator-prey networks and citation networks.
💡 Research Summary
The paper tackles two fundamental characteristics of real‑world social networks: (1) individuals often play multiple, context‑dependent roles, and (2) these role‑specific interactions give rise to communities that are themselves organized in a hierarchical fashion (e.g., teams within departments within firms). Existing mixed‑membership stochastic blockmodels (MMSB) capture the first aspect but assume a flat community structure, thereby failing to represent nested or parent‑child relationships among groups. To fill this gap, the authors propose the Hierarchical Infinite Mixed‑Membership Stochastic Blockmodel (H‑MMSB), which jointly models role‑specific interactions and the latent hierarchy of communities.
Model Construction
The hierarchy is generated by a Nested Chinese Restaurant Process (NCRP), a non‑parametric Bayesian prior that yields a potentially infinite tree of communities. Each actor (a node of the network, as opposed to a node of the community tree) possesses a membership vector at every level of the tree, giving the proportions of its affiliation with the communities at that level. An interaction between two actors is generated as follows: for a given dyad, each actor independently selects a level and a community at that level according to its membership vectors, and the pair of selected communities determines the edge probability via a block matrix B. The entries of B are drawn from Beta priors, set up to favor higher probabilities for intra‑community links than for inter‑community links. This construction extends the original MMSB: when the tree collapses to a single level, H‑MMSB reduces to MMSB.
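The generative steps described above can be illustrated with a minimal sketch. It rests on several simplifying assumptions that are not taken from the paper: a fixed tree depth, a single concentration parameter `gamma`, one tree path per actor with per-level weights, and a two-valued stand-in (`p_same`, `p_diff`) for the block matrix B. All function and parameter names are hypothetical.

```python
import random
from collections import defaultdict


def crp_choice(counts, gamma, rng):
    """Chinese Restaurant Process step: return an existing child index with
    probability proportional to its count, or len(counts) (a new child)
    with probability proportional to gamma."""
    total = sum(counts) + gamma
    r = rng.random() * total
    acc = 0.0
    for k, c in enumerate(counts):
        acc += c
        if r < acc:
            return k
    return len(counts)


def sample_ncrp_path(tree, depth, gamma, rng):
    """Nested CRP: walk down from the root, running one CRP at each visited
    tree node. `tree` maps a path prefix (tuple) to its list of child counts."""
    path = ()
    for _ in range(depth):
        counts = tree[path]
        k = crp_choice(counts, gamma, rng)
        if k == len(counts):  # open a new community under this tree node
            counts.append(0)
        counts[k] += 1
        path += (k,)
    return path


def sample_edge(path_i, path_j, w_i, w_j, rng, p_same=0.8, p_diff=0.1):
    """Each actor picks a level from its own level weights; the community is
    the path prefix at that level. p_same/p_diff are an assumed stand-in
    for the block matrix B."""
    li = rng.choices(range(1, len(path_i) + 1), weights=w_i)[0]
    lj = rng.choices(range(1, len(path_j) + 1), weights=w_j)[0]
    ci, cj = path_i[:li], path_j[:lj]
    p = p_same if ci == cj else p_diff
    return int(rng.random() < p)


rng = random.Random(0)
tree = defaultdict(list)
paths = [sample_ncrp_path(tree, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
edge = sample_edge(paths[0], paths[1], [1, 1, 1], [1, 1, 1], rng)
```

Representing a community as a path prefix makes the nesting explicit: two actors that share a prefix of length two also share the prefix of length one, so membership at a deeper level implies membership in every ancestor community.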
Inference
A Gibbs sampling scheme is derived for posterior inference. The sampler iteratively updates (i) the level‑specific community assignments for each actor, (ii) the tree structure (i.e., whether a new community node is created under the NCRP), and (iii) the block‑matrix parameters. Because the conditional posteriors retain conjugate forms (Dirichlet‑Multinomial for memberships, Beta‑Bernoulli for edges), each update can be performed in closed form, which keeps individual sampling steps inexpensive despite the model’s complexity. Convergence is monitored through log‑likelihood trajectories and perplexity.
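The two conjugate pairs named above can be sketched in a few lines. This is an illustration of the generic Beta‑Bernoulli and Dirichlet‑Multinomial updates, not the paper's sampler: the function names, the symmetric hyperparameters `a`, `b`, and `alpha`, and the flat (non-hierarchical) setting are all assumptions made for brevity.

```python
import random


def block_posterior(edges, a=1.0, b=1.0):
    """Beta-Bernoulli update: with a Beta(a, b) prior on one block entry and
    observed 0/1 edges between that pair of communities, the posterior is
    Beta(a + #ones, b + #zeros)."""
    ones = sum(edges)
    zeros = len(edges) - ones
    return a + ones, b + zeros


def draw_block_entry(edges, rng, a=1.0, b=1.0):
    """One Gibbs draw of a block entry from its conditional posterior."""
    a_post, b_post = block_posterior(edges, a, b)
    return rng.betavariate(a_post, b_post)


def membership_weights(counts, alpha=1.0):
    """Dirichlet-Multinomial predictive term: with the membership vector
    integrated out, the prior part of the conditional for assigning one more
    interaction to community k is proportional to counts[k] + alpha.
    (The edge-likelihood factor of the full conditional is omitted here.)"""
    raw = [c + alpha for c in counts]
    total = sum(raw)
    return [x / total for x in raw]


rng = random.Random(0)
posterior = block_posterior([1, 1, 0, 1])        # three edges, one non-edge
sample = draw_block_entry([1, 1, 0, 1], rng)
weights = membership_weights([3, 1], alpha=1.0)
```

In a full sampler these pieces would be combined: the membership term is multiplied by the likelihood of the observed edge under each candidate community, normalized, and sampled from.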
Empirical Evaluation
Three experimental settings are presented.
- Synthetic Networks – The authors generate graphs from known hierarchical structures with varying depths and branching factors. H‑MMSB accurately recovers both the depth and the community partitions, outperforming flat MMSB and a non‑hierarchical HDP‑MMSB in terms of held‑out log‑likelihood.
- Predator‑Prey Networks – Real ecological data (e.g., food webs) are analyzed. The top‑level communities correspond to trophic roles (predator vs. prey), while lower levels capture finer ecological niches such as habitat or hunting strategy. The hierarchical model yields a clearer separation of functional groups than flat models.
- Citation Networks – In a collection of scientific papers, the highest level distinguishes broad disciplines (physics, biology, etc.), and subsequent levels reveal sub‑fields and specialized topics. Quantitatively, H‑MMSB achieves higher precision, recall, and F1 scores for link prediction, especially for cross‑disciplinary citations that flat models tend to misclassify.
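The evaluation metrics mentioned in these experiments (held‑out log‑likelihood, and precision, recall, and F1 for link prediction) can be computed as in the sketch below. The function names and the toy inputs are hypothetical and serve only to show the calculations; they are not data or code from the paper.

```python
import math


def held_out_loglik(probs, labels):
    """Bernoulli log-likelihood of held-out dyads: `probs` are predicted edge
    probabilities in (0, 1), `labels` are the observed 0/1 edges."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for p, y in zip(probs, labels)
    )


def link_prediction_scores(true_edges, predicted_edges):
    """Precision, recall, and F1 over sets of (source, target) pairs."""
    true_edges, predicted_edges = set(true_edges), set(predicted_edges)
    tp = len(true_edges & predicted_edges)
    precision = tp / len(predicted_edges) if predicted_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: two of three predicted links are correct.
truth = {(0, 1), (1, 2), (2, 3)}
predicted = {(0, 1), (1, 2), (3, 4)}
precision, recall, f1 = link_prediction_scores(truth, predicted)
loglik = held_out_loglik([0.5, 0.5], [1, 0])
```

Held‑out log‑likelihood rewards well-calibrated probabilities, whereas precision, recall, and F1 depend on how predicted probabilities are thresholded into a set of links, so the two kinds of metric can rank models differently.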
Strengths and Limitations
The main strength lies in the unified treatment of multi‑role membership and nested community organization, enabling richer interpretations of complex networks. The non‑parametric nature eliminates the need to pre‑specify the number of communities or hierarchy depth. However, the Gibbs sampler’s computational cost grows with tree depth, and hyper‑parameter sensitivity (e.g., concentration parameters of the NCRP) can affect performance. The authors discuss possible extensions such as variational inference for scalability and Bayesian model selection criteria to mitigate over‑fitting.
Conclusion
Overall, the paper introduces a powerful Bayesian framework that extends mixed‑membership blockmodels to infinite hierarchical settings. By demonstrating both synthetic recovery and meaningful real‑world discoveries in ecological and scholarly networks, the work provides a compelling tool for researchers seeking to uncover the multi‑level, role‑driven organization inherent in many social and biological systems.