Collaboration in computer science: a network science approach. Part I
Co-authorship in publications within a discipline uncovers interesting properties of the analysed field. We represent collaboration in academic papers of computer science in terms of differently grained networks, including those sub-networks that emerge from conference and journal co-authorship only. We take advantage of the network science paraphernalia to take a picture of computer science collaboration including all papers published in the field since 1936. We investigate typical bibliometric properties like scientific productivity of authors and collaboration level in papers, as well as large-scale network properties like reachability and average separation distance among scholars, distribution of the number of scholar collaborators, network resilience and dependence on star collaborators, network clustering, and network assortativity by number of collaborators.
💡 Research Summary
The paper presents a comprehensive network‑science investigation of collaboration patterns in the computer‑science research community, covering every paper published from 1936 to the present. Using a large bibliographic dataset, the authors construct an undirected co‑authorship graph in which nodes represent scholars and edges denote joint authorship of at least one paper. In addition to the full “all‑papers” network, two sub‑networks are extracted: one comprising only journal articles and another comprising only conference papers, allowing a direct comparison of collaboration dynamics across publication venues.
The analysis begins with classic bibliometric descriptors. Author productivity follows a power‑law (Pareto) distribution: a small elite of roughly 10 % of scholars accounts for about 30 % of all publications. The average number of co‑authors per paper has risen steadily over the decades, from roughly 2.5 in the early 1990s to more than four in recent years, reflecting increasing methodological complexity and the growing prevalence of interdisciplinary work.
Turning to large‑scale network properties, the full co‑authorship graph contains over 1.2 million nodes and approximately 3.8 million edges. Remarkably, more than 93 % of the scholars belong to a single giant connected component, indicating that the field is highly integrated. The average shortest‑path length (average separation) is 6.2, confirming the “small‑world” phenomenon often cited in social networks. The clustering coefficient is 0.62, far above the value expected for a random graph of comparable size, which signals strong local cohesion—research groups tend to form tightly knit clusters.
The degree distribution (i.e., the number of collaborators per author) exhibits a heavy‑tailed shape but deviates from a pure scale‑free pattern; the tail is less extreme, suggesting that while a few scholars have exceptionally many collaborators, the majority have modest collaboration counts. Some outliers have collaborated with more than 200 distinct co‑authors, underscoring the presence of “star” researchers who act as bridges across sub‑communities.
Network robustness is examined by simulating node removal. Targeted removal of high‑degree (star) authors quickly fragments the giant component, demonstrating a strong dependence on these hubs. In contrast, random removal of the same number of nodes leads to a much slower degradation, indicating that the overall structure retains considerable resilience thanks to the numerous medium‑degree scholars that provide alternative pathways.
Assortativity analysis reveals a positive correlation between node degrees: high‑degree authors preferentially co‑author with other high‑degree authors, forming elite clusters, while low‑degree scholars tend to attach to the elite rather than to each other. This hierarchical mixing pattern contributes both to the observed robustness (through redundancy among medium‑degree nodes) and to the vulnerability to hub removal.
Comparative results for the venue‑specific sub‑networks highlight nuanced differences. The journal‑only network shows a slightly higher average degree and clustering coefficient, suggesting more stable, long‑term collaborations. The conference‑only network, by contrast, exhibits shorter average path lengths and a higher rate of new edge formation, reflecting the role of conferences as rapid exchange venues where emerging ideas and new partnerships are forged.
In sum, the study paints computer science as a densely connected, small‑world discipline with a modestly hierarchical collaboration structure. While a handful of star scholars dominate centrality measures, the bulk of the network is sustained by a broad base of medium‑degree researchers, granting the system a degree of robustness. The venue‑specific analyses further reveal that journals foster deeper, more clustered ties, whereas conferences promote swift, wide‑reaching connections. These quantitative insights have practical implications for research policy, funding allocation, and the design of interventions aimed at fostering inclusive and resilient scientific collaboration.
Comments & Academic Discussion
Loading comments...
Leave a Comment