Community detection, discovering the underlying communities within a network from observed connections, is a fundamental problem in network analysis, yet it remains underexplored for signed networks. In signed networks, both edge connection patterns and edge signs are informative, and structural balance theory (e.g., triangles aligned with “the enemy of my enemy is my friend” and “the friend of my friend is my friend” are more prevalent) provides a global higher-order principle that guides community formation. We propose a Balanced Stochastic Block Model (BSBM), which incorporates balance theory into the network generating process such that balanced triangles are more likely to occur. We develop a fast profile pseudo-likelihood estimation algorithm with provable convergence and establish that our estimator achieves strong consistency under weaker signal conditions than methods for the binary SBM that rely solely on edge connectivity. Extensive simulation studies and two real-world signed networks demonstrate strong empirical performance.
In network analysis, communities are defined as clusters of nodes whose members share similar connection patterns with others. Community detection, discovering such latent clusters from an observed network, is a fundamental problem that has received extensive attention. Many methods for community detection are based on probabilistic network models, including the stochastic block model (SBM) (Holland et al. 1983; Nowicki et al. 2001), degree-corrected SBM (Karrer et al. 2011), latent factor model (Handcock et al. 2007; Hoff 2007), and mixed-membership SBM for overlapping community detection (Airoldi et al. 2008). Other methods formulate community detection as an optimization problem, which maximizes criteria that quantify the strength of community structure or their spectral approximations, including normalized cuts (Shi et al. 2000), modularity (Newman et al. 2004; Newman 2006), and variants of spectral clustering (Ng et al. 2001). These methods rely solely on the edge-connectivity information in binary networks for community detection.
In many applications, however, networks contain not only information about whether a connection exists but also the type of the connection. In signed networks, each edge takes either a positive (e.g., friendship, trust, agreement, positive correlation) or negative (e.g., hostility, distrust, disagreement, negative correlation) sign. Such signed networks are common in diverse fields; examples include social networks (Heider 1946; Leskovec et al. 2010), international relations (Doreian et al. 1996; Doreian et al. 2015; Tang et al. 2025), and biological networks (Vinayagam et al. 2014; Morabito et al. 2023). Incorporating edge-sign information in signed networks allows for the identification of community structures that are not captured by edge-connectivity patterns alone.
To this end, numerous algorithms have been proposed for community detection in signed networks (Doreian et al. 1996; Bansal et al. 2004; Yang et al. 2007; Chiang et al. 2012; Li et al. 2014; Kunegis et al. 2010), many of which extend classical criteria such as normalized cuts and modularity, originally designed for binary (unsigned) networks, to incorporate edge signs. These extensions aggregate local pairwise sign signals into a partition objective, where positive edges encourage placing the connected nodes in the same community while negative edges encourage assigning them to different communities. However, methods based on local pairwise information alone overlook a unique feature of signed networks: positive and negative edges interact through higher-order patterns. An important theory in social psychology for understanding such interactions is structural balance theory (Harary 1953). The theory characterizes signed triangles (i.e., three nodes that are all connected to each other) as balanced if the product of their three edge signs is positive, and unbalanced otherwise. In particular, balanced triangles are consistent with the proverbs “the enemy of my enemy is my friend” and “the friend of my friend is my friend”. Balance theory suggests that balanced triangles are more prevalent than unbalanced ones in signed networks. This pattern has been empirically observed in numerous real-world signed networks, including social and biological networks (Facchetti et al. 2011; Allahyari et al. 2022; Aref et al. 2018).
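The balance criterion above can be made concrete with a small sketch. The code below is only an illustration of the definition (product of the three edge signs), not part of the proposed method; the function names `make_signs` and `count_triangles`, the edge-list input format, and the toy network are all assumptions introduced here.

```python
from itertools import combinations


def make_signs(edges):
    """Build a lookup from an unordered node pair to its sign (+1 or -1).

    `edges` is a hypothetical input format: an iterable of (i, j, sign) tuples.
    """
    return {frozenset((i, j)): sign for i, j, sign in edges}


def count_triangles(signs, n):
    """Count balanced and unbalanced triangles among nodes 0..n-1.

    A triangle is balanced when the product of its three edge signs is
    positive, and unbalanced otherwise. Triples with a missing edge are
    not triangles and are skipped.
    """
    balanced = 0
    unbalanced = 0
    for i, j, k in combinations(range(n), 3):
        pairs = (frozenset((i, j)), frozenset((j, k)), frozenset((i, k)))
        if not all(p in signs for p in pairs):
            continue  # at least one of the three edges is absent
        product = signs[pairs[0]] * signs[pairs[1]] * signs[pairs[2]]
        if product > 0:
            balanced += 1
        else:
            unbalanced += 1
    return balanced, unbalanced


# Toy signed network on 4 nodes (illustrative data, not from the paper).
example_edges = [
    (0, 1, +1),  # 0 and 1 are friends
    (1, 2, -1),  # 1 and 2 are enemies
    (0, 2, -1),  # 0 and 2 are enemies
    (2, 3, +1),
    (1, 3, +1),
]
example_signs = make_signs(example_edges)
n_balanced, n_unbalanced = count_triangles(example_signs, 4)
```

In this toy network, triangle {0, 1, 2} has signs (+, -, -) and is balanced, while triangle {1, 2, 3} has signs (-, +, +) and is unbalanced. The brute-force enumeration over all triples costs O(n^3) and is meant only for small examples.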
Balance theory provides a global higher-order principle for community detection. Beyond local pairwise information, it suggests that communities should be formed to minimize the occurrence of unbalanced triangles across the network. To incorporate structural balance, one line of work uses low-rank matrix completion algorithms for community detection and sign prediction (Hsieh et al. 2012; Chiang et al. 2014). However, these methods treat non-edges as missing entries and thereby rely solely on edge-sign information.
Probabilistic model-based approaches for signed networks, in contrast, have been less explored (Vu et al. 2013; Chen et al. 2014; Jiang 2015; Zhang et al. 2022; Li et al. 2023; Tang et al. 2025; Pensky 2025). Among them, Jiang (2015) transformed a signed network into a two-layer network, where one layer represents the presence of positive edges and the other represents the presence of negative edges. This approach, however, ignores the mutual exclusivity between positive and negative edges on the same node pair. Vu et al. (2013) developed an exponential random network model for discrete-valued networks, which can be applied to signed networks, and Li et al. (2023) proposed a signed SBM, where each edge follows a multinomial distribution. More recently, Pensky (2025) proposed a variant of the generalized random dot product graph model for signed networks. However, none of these models incorporates balance theory for community detection, which is the focus of our work. Zhang et al. (2022) introduced a latent space model for joint community and anomaly detection. Their method models signed edges as ordinal var