Community detection is one of the most important problems in social network analysis. In recent years, jointly considering node attributes and the topological structure of social networks during community detection has attracted the attention of many scholars, and some recent community detection methods use this combination to improve their efficiency and their ability to find meaningful and relevant communities. However, most of these methods find non-overlapping communities, while many real-world networks include communities that overlap to some extent. To address this problem, this paper proposes an evolutionary algorithm called MOBBO-OCD, based on multi-objective biogeography-based optimization (BBO), that automatically finds overlapping communities in a social network with node attributes by simultaneously considering the density of connections and the similarity of nodes' attributes. In MOBBO-OCD, an extended locus-based adjacency representation called OLAR is introduced to encode and decode overlapping communities. Based on OLAR, a rank-based migration operator, a novel two-phase mutation strategy, and a new double-point crossover are used in the evolution process of MOBBO-OCD to guide the population effectively along the evolution path. To assess the performance of MOBBO-OCD, a new metric called alpha_SAEM is proposed, which can evaluate the quality of both overlapping and non-overlapping partitions with respect to both node attributes and linkage structure. Quantitative evaluations show that MOBBO-OCD achieves results superior to those of 15 relevant community detection algorithms from the literature.
Almost any natural phenomenon can be modeled as a network by defining a set of entities and establishing a criterion for the relations between them [32]. A social network is a well-known example: it is a social structure made up of a set of nodes, the social actors, who are connected by one or more specific types of interdependency [3]. Community is a significant substructure in many complex networks [27]. Since social networks are a kind of complex network, their community structure is one of their distinctive properties, and it can reveal their organization and the hidden relations among their components [16]. Identifying meaningful communities in social networks is a field of study that has attracted many researchers in recent years [31]. A community can be defined as a subset of nodes that are densely connected to each other and loosely connected to the nodes in the other communities of the same network [19], like a group of individuals in a social network who are friends with each other. Since the members of a community are more likely to have common hobbies, social functions, etc., the identified communities can be used in collaborative recommendation, information spreading, knowledge sharing, and other beneficial applications [47].
Community detection, also known as graph clustering, is a long-standing and popular research topic [13]. Most research in this field has focused on designing methods for non-overlapping (disjoint or separated) community detection, in which every node belongs to exactly one community [46]. However, many real-world networks include communities that overlap to some extent. This means that some nodes of these networks may belong to more than one community because they may have different roles in the network [46]. For example, an individual in a social network might be a member of a karate community and a cinema community simultaneously.
On the other hand, most studies in the field of community detection have focused on the graph structure of social networks to detect communities, and perform no content analysis in the detection process [31]. In many real-world social networks, one or more attributes are assigned to each node; these describe its properties and are often homogeneous within a community [18]. In other words, nodes with the same attributes are more likely to belong to the same communities. Nowadays, real-world networks contain a vast range of information that can be classified as node (user) attributes, such as shared objects, comments, following information, age, education, gender, profession, etc. [29]. Thus, community detection can be improved by considering the contents of a social network (if available) to find communities whose members are not just densely connected but also share similar attributes [18].
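The two qualities discussed above, dense connections and similar attributes, can be made concrete for a single candidate community. The following is a minimal sketch and not the objective functions of MOBBO-OCD, which this section does not define: the function names, the choice of Jaccard similarity over attribute sets, and the toy data are all assumptions made for illustration. Edges are assumed to be undirected and listed once each.

```python
from itertools import combinations


def internal_density(edges, community):
    """Fraction of possible member pairs inside `community` that are linked.

    Assumes an undirected simple graph with each edge listed once.
    """
    members = set(community)
    possible = len(members) * (len(members) - 1) // 2
    if possible == 0:
        return 0.0
    inside = sum(1 for u, v in edges if u in members and v in members)
    return inside / possible


def attribute_similarity(attrs, community):
    """Mean Jaccard similarity of attribute sets over all member pairs.

    Jaccard similarity is an illustrative choice, not taken from the paper.
    """
    pairs = list(combinations(sorted(community), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for u, v in pairs:
        union = attrs[u] | attrs[v]
        total += len(attrs[u] & attrs[v]) / len(union) if union else 0.0
    return total / len(pairs)


# Hypothetical toy network: four people, with hobbies as node attributes.
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
attrs = {1: {"karate"}, 2: {"karate"}, 3: {"karate", "cinema"}, 4: {"cinema"}}

community = {1, 2, 3}
density = internal_density(edges, community)
similarity = attribute_similarity(attrs, community)
print(density, similarity)
```

In this toy example the community {1, 2, 3} is fully connected, yet its attribute similarity is lower than its density because node 3 also carries a second hobby, which illustrates how the two measures can diverge.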
The problem of overlapping community detection has been considered in some studies, and several efficient overlapping community detection methods have been proposed in the literature, but they perform no content analysis. Meanwhile, in recent years, scholars have become increasingly interested in finding the community structure of social networks by considering both node attributes and link structure, which has led them to propose some non-overlapping community detection methods. However, to the best of our knowledge, detecting overlapping communities in social networks with node attributes while simultaneously considering structure and attributes remains an open problem.
To solve this problem, in this paper we propose a multi-objective evolutionary algorithm called MOBBO-OCD that automatically finds overlapping communities in a social network in which node attributes are available, by simultaneously considering the density of connections and the similarity of nodes’ attributes. Our proposed algorithm is based on biogeography-based optimization (BBO) [35], a promising evolutionary algorithm inspired by the science of biogeography and designed to solve global optimization problems. Since attribute similarity and connection density can be considered two independent and sometimes conflicting objectives [18], we use a multi-objective BBO to balance them. The final result of the proposed method is a set of nondominated solutions (partitions of a network). This set contains partitions that perform best with respect to the topological structure (density of connections) of the network, partitions that perform best with respect to the similarity of nodes’ attributes, and partitions that reach a trade-off between the two. Thus, our proposed method can pr
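The notion of a nondominated set mentioned above can be illustrated with a short sketch. This is a generic Pareto filter under the assumption that both objectives are maximized; it is not the selection procedure of MOBBO-OCD, and the function names and score values are hypothetical.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` on every objective and
    strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def nondominated(scores):
    """Keep only the score vectors that no other vector dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


# Hypothetical (connection density, attribute similarity) scores of five partitions.
scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
front = nondominated(scores)
print(front)
```

Here the filter keeps one partition that is best on density, one that is best on attribute similarity, and one trade-off between them, while the two remaining partitions are discarded because another partition is at least as good on both objectives.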