Classification of Complex Networks Based on Topological Properties
Complex networks are a powerful modeling tool, allowing the study of countless real-world systems. They have been used in very different domains such as computer science, biology, sociology, and management. Authors have tried to characterize them using various measures such as degree distribution, transitivity, or average distance, with the goal of detecting properties such as the small-world or scale-free properties. Previous works have shown that some of these properties are present in many different systems, while others are characteristic of certain types of systems only. However, each of these studies generally focuses on a very small number of topological measures and networks. In this work, we aim at a more systematic approach. We first constitute a dataset of 152 publicly available networks spanning 7 different domains. We then compute 14 different topological measures to characterize them as completely as possible. Finally, we apply standard data mining tools to analyze these data. A cluster analysis reveals that it is possible to obtain two significantly distinct clusters of networks, corresponding roughly to a bisection of the domains modeled by the networks. On these data, the most discriminant measures are density, modularity, average degree, and transitivity, and, to a lesser extent, closeness and edge-betweenness centralities.
💡 Research Summary
The paper addresses a common shortcoming in complex‑network research: most studies examine only a handful of topological measures on a limited set of networks, which makes it hard to draw general conclusions across domains. To overcome this, the authors assembled a heterogeneous dataset of 152 publicly available networks drawn from seven distinct domains; the abstract names computer science, biology, sociology, and management among the fields in which such networks are used. The dataset is intended to be broader than the small samples typical of earlier studies, providing a basis for cross‑domain analysis.
Each network was characterized using fourteen topological descriptors that together capture a wide spectrum of structural properties. The chosen metrics include basic density‑related measures (density, average degree, degree variance), local cohesion (clustering coefficient or transitivity), global distance characteristics (average shortest‑path length, diameter, global efficiency), community structure (modularity), and several centrality indices (closeness, betweenness, eigenvector, edge‑betweenness). By incorporating less‑frequently used descriptors such as edge‑betweenness, the authors aim to expose subtle structural nuances that might be missed by more conventional analyses.
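To make three of these descriptors concrete, the following is a minimal sketch in plain Python of density, average degree, and transitivity for an undirected simple graph. The function names and the toy edge list are illustrative assumptions, not taken from the paper, and a real study would more likely rely on a graph library.

```python
from itertools import combinations

def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def density(adj):
    """Fraction of possible edges that exist: 2m / (n (n - 1))."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) / 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0

def average_degree(adj):
    """Mean number of neighbors per node."""
    return sum(len(nb) for nb in adj.values()) / len(adj) if adj else 0.0

def transitivity(adj):
    """Closed triples divided by all connected triples (3 * triangles / triples)."""
    closed = 0   # each triangle is counted once per corner, i.e. 3 times
    triples = 0  # paths of length 2, counted at their center node
    for nb in adj.values():
        k = len(nb)
        triples += k * (k - 1) // 2
        closed += sum(1 for a, b in combinations(nb, 2) if b in adj[a])
    return closed / triples if triples else 0.0

# Hypothetical toy graph: a triangle (0, 1, 2) with a pendant node 3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
adj = build_adjacency(edges)
print(density(adj), average_degree(adj), transitivity(adj))
```

For this toy graph the density is 2/3, the average degree is 2, and the transitivity is 0.6 (three closed triples out of five).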
After normalizing the metric values and handling missing entries by mean imputation, the authors applied a suite of standard data‑mining techniques. They experimented with k‑means, hierarchical agglomerative clustering, and DBSCAN, using silhouette scores and the Dunn index to determine the optimal number of clusters. The evaluation consistently pointed to a two‑cluster solution as the most coherent and well‑separated partition of the data.
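As an illustration of two of the steps named above, the sketch below standardizes each measure to zero mean and unit variance and then scores a candidate partition with the mean silhouette width. It is a plain-Python sketch under stated assumptions: the choice of z-score normalization and Euclidean distance, the function names, and the four toy rows (density, modularity) are all hypothetical, and the clustering step itself is omitted.

```python
import math

def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
        for c, m in zip(cols, means)
    ]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]

def mean_silhouette(points, labels):
    """Average silhouette width over all points, using Euclidean distance."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Invented toy values: each row is (density, modularity) for one network.
rows = [[0.30, 0.10], [0.28, 0.12], [0.02, 0.70], [0.03, 0.65]]
scaled = zscore_columns(rows)
good = mean_silhouette(scaled, [0, 0, 1, 1])  # dense pair vs. modular pair
bad = mean_silhouette(scaled, [0, 1, 0, 1])   # mixes the two profiles
print(good, bad)
```

A partition that keeps the two dense rows together and the two modular rows together scores close to 1, while the mixed partition scores below 0, which is the kind of comparison used to pick a number of clusters.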
A detailed examination of the two resulting clusters revealed clear domain‑related patterns. Cluster 1 is characterized by relatively high density and transitivity but low modularity. Networks in this group tend to be small and densely connected, and to exhibit a more random wiring pattern. Typical examples include many computer‑network topologies (e.g., Internet backbone, peer‑to‑peer overlays) and some social interaction graphs in which the community structure is weak. Cluster 2, in contrast, displays low density, high modularity, and a moderate average degree. This profile matches networks that are sparsely connected yet strongly compartmentalized into well‑defined modules, such as protein‑protein interaction maps, metabolic pathways, corporate collaboration networks, and transportation systems.
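The contrast between the two profiles can be illustrated with Newman's modularity Q, computed here for a partition that is given in advance. This is a sketch only: the summary does not say how communities were detected, and the function name and both toy graphs are assumptions made for the example.

```python
from itertools import combinations

def modularity(edges, community_of):
    """Newman's Q = sum over communities of (L_c / m) - (d_c / 2m)^2.

    L_c is the number of edges inside community c, d_c the total degree
    of its nodes, and m the number of edges in the graph.
    """
    m = len(edges)
    intra = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(
        intra.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Same hypothetical two-group split applied to both toy graphs.
halves = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

# Sparse and modular: two triangles joined by a single bridge edge.
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Dense and unstructured: the complete graph on six nodes.
complete_six = list(combinations(range(6), 2))

q_modular = modularity(two_triangles, halves)
q_dense = modularity(complete_six, halves)
print(q_modular, q_dense)
```

The bridged triangles give Q = 5/14 (about 0.36), whereas the complete graph gives Q = -0.1 for the same split: adding edges everywhere raises density and erases the community signal.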
Feature‑importance analysis (derived from the clustering models) identified density, modularity, average degree, and transitivity as the most discriminative attributes. Closeness centrality and edge‑betweenness contributed to a lesser extent, confirming that the overall connectivity level and the strength of community partitions are the primary drivers of the observed domain split.
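The summary does not specify how feature importance was derived. One simple possibility, shown purely as an assumption, is to rank each measure by how far apart the two cluster means lie relative to the within-cluster spread (a standardized mean difference). The function name and the toy table below are invented for illustration.

```python
import math

def separation_scores(rows, labels, names):
    """Score each feature by |mean gap| / pooled std between clusters 0 and 1."""
    scores = {}
    for j, name in enumerate(names):
        groups = {0: [], 1: []}
        for row, lab in zip(rows, labels):
            groups[lab].append(row[j])
        means = {g: sum(v) / len(v) for g, v in groups.items()}
        variances = {
            g: sum((x - means[g]) ** 2 for x in v) / len(v)
            for g, v in groups.items()
        }
        pooled_std = math.sqrt((variances[0] + variances[1]) / 2)
        gap = abs(means[0] - means[1])
        if pooled_std > 0:
            scores[name] = gap / pooled_std
        else:
            # no spread inside either cluster: perfectly separated or identical
            scores[name] = float("inf") if gap > 0 else 0.0
    return scores

# Invented toy table: one row per network, columns follow `names`.
names = ["density", "modularity", "diameter"]
rows = [
    [0.30, 0.10, 5],
    [0.28, 0.12, 9],
    [0.02, 0.70, 6],
    [0.03, 0.65, 8],
]
labels = [0, 0, 1, 1]  # hypothetical cluster assignment

scores = separation_scores(rows, labels, names)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

On these toy values, density and modularity separate the clusters strongly while diameter does not, mirroring the kind of ranking the summary reports; the actual ranking in the paper depends on its data and method.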
The authors acknowledge several limitations. First, despite being larger than previous samples, 152 networks cannot fully represent the immense variety of real‑world complex systems. Second, the study relies exclusively on static snapshots; dynamic evolution (growth, rewiring, temporal bursts) is not captured. Third, the selected set of fourteen measures, while comprehensive, does not encompass all possible structural descriptors (e.g., core‑periphery structure, multilayer modularity). They suggest that future work should incorporate temporal data, expand the metric repertoire, and test more sophisticated clustering algorithms (e.g., spectral clustering, community‑aware methods) to refine the classification.
In summary, the paper provides a systematic, data‑driven approach to classifying complex networks across disciplines. By combining a broad, cross‑domain dataset with an extensive suite of topological measures and standard clustering techniques, the authors demonstrate that networks naturally separate into two major groups that roughly correspond to a “dense‑random” versus a “sparse‑modular” typology. The findings reinforce the view that density and modularity are fundamental axes of variation in real‑world networks and offer a practical methodological template for researchers seeking to compare, categorize, or predict the behavior of complex systems in diverse fields.