A small world of citations? The influence of collaboration networks on citation practices
This paper examines the proximity of authors to those they cite using degrees of separation in a co-author network, essentially using collaboration networks to expand on the notion of self-citations. While the proportion of direct self-citations (including co-authors of both citing and cited papers) is relatively constant in time and across specialties in the natural sciences (10% of citations) and the social sciences (20%), the same cannot be said for citations to authors who are members of the co-author network. Differences between fields and trends over time lie not only in the degree of co-authorship which defines the large-scale topology of the collaboration network, but also in the referencing practices within a given discipline, computed by defining a propensity to cite at a given distance within the collaboration network. Overall, there is little tendency to cite those nearby in the collaboration network, excluding direct self-citations. By analyzing these social references, we characterize the social capital of local collaboration networks in terms of the knowledge production within scientific fields. These results have implications for the long-standing debate over biases common to most types of citation analysis, and for understanding citation practices across scientific disciplines over the past 50 years. In addition, our findings have important practical implications for the availability of ‘arm’s length’ expert reviewers of grant applications and manuscripts.
💡 Research Summary
This paper investigates how the structure of scientific collaboration networks influences citation behavior, using a massive bibliometric dataset covering the period 1945‑2008. The authors draw on Thomson Reuters’ Web of Science (Science Citation Index Expanded, Social Science Citation Index, Arts & Humanities Citation Index) and focus on eight specialties that span the natural and medical sciences (astronomy & astrophysics, atmospheric science & meteorology, biochemistry & molecular biology, organic chemistry, neuroscience & neurosurgery) and the social sciences and humanities (economics, sociology, history). For each specialty and each publication year, they extract all research articles (excluding notes and reviews) and the reference lists of those articles.
Construction of the co‑authorship network
Authors are identified by name and initials. To mitigate homonym problems, the study restricts the network to authors within the same specialty and to collaborations occurring within a two‑year window of the citing article’s publication year. This yields an undirected, unweighted co‑authorship graph for each field and year. From this graph, four levels of social proximity are defined relative to a citing author:
- Level 0 (self‑citation): the cited paper shares at least one author with the citing paper.
- Level 1 (co‑author citation): the cited paper includes a direct co‑author of the citing author.
- Level 2 (co‑author‑of‑co‑author citation): the cited paper includes an author who has co‑authored with a direct co‑author of the citing author.
- Level 3 (distant citation): none of the above relationships hold.
Each reference is assigned to the closest applicable level, ensuring mutually exclusive categories.
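The level assignment described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `build_coauthor_graph` and `citation_level` are hypothetical, papers are assumed to be plain lists of author-name strings, and a reference is assumed to take the closest level reached from any author of the citing paper.

```python
from collections import defaultdict


def build_coauthor_graph(papers):
    """Build an undirected, unweighted co-authorship graph.

    `papers` is an iterable of author-name lists (assumed input format).
    Returns a dict mapping each author to the set of their co-authors.
    """
    graph = defaultdict(set)
    for authors in papers:
        for a in authors:
            graph[a].update(b for b in authors if b != a)
    return graph


def citation_level(citing_authors, cited_authors, graph):
    """Return the closest applicable level (0, 1, 2, or 3) for one reference."""
    citing = set(citing_authors)
    cited = set(cited_authors)

    # Level 0: the two papers share at least one author.
    if citing & cited:
        return 0

    # Level 1: the cited paper includes a direct co-author of a citing author.
    direct = set()
    for a in citing:
        direct |= graph.get(a, set())
    if direct & cited:
        return 1

    # Level 2: the cited paper includes a co-author of a direct co-author.
    second = set()
    for c in direct:
        second |= graph.get(c, set())
    if second & cited:
        return 2

    # Level 3: none of the above relationships hold.
    return 3
```

Checking the levels in ascending order and returning at the first match is what keeps the categories mutually exclusive.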
Descriptive and macro‑level variables
The authors compute, for every year and specialty, several aggregate indicators: total number of papers, average number of authors per paper, average number of references per paper, the proportion of references that belong to the same specialty, and the proportion of references that are “recent” (published within ten years of the citing article). These variables serve to characterize the growth, collaboration intensity, and citation culture of each field.
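As a rough illustration of these aggregate indicators, the sketch below computes them for the papers of one specialty and one year. The `Paper` and `Reference` record types, the field names, and the inclusive ten-year cutoff are assumptions made for the example. They are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    year: int             # publication year of the cited item
    same_specialty: bool  # True if the cited item is in the citing specialty


@dataclass
class Paper:
    year: int
    n_authors: int
    references: list = field(default_factory=list)


def yearly_indicators(papers, recent_window=10):
    """Aggregate indicators for one specialty-year (hypothetical helper).

    A reference counts as "recent" if it is at most `recent_window`
    years older than the citing paper (assumed inclusive boundary).
    """
    n_papers = len(papers)
    refs = [(p.year, r) for p in papers for r in p.references]
    n_refs = len(refs)
    return {
        "n_papers": n_papers,
        "mean_authors": sum(p.n_authors for p in papers) / n_papers if n_papers else 0.0,
        "mean_references": n_refs / n_papers if n_papers else 0.0,
        "share_same_specialty": sum(r.same_specialty for _, r in refs) / n_refs if n_refs else 0.0,
        "share_recent": sum(y - r.year <= recent_window for y, r in refs) / n_refs if n_refs else 0.0,
    }
```

Calling `yearly_indicators` once per specialty and year gives the time series used to characterize growth, collaboration intensity, and citation culture.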
Key empirical findings
- Stability of self‑citations – Across all specialties, self‑citations constitute a remarkably stable share of total references: roughly 10 % in the natural and medical sciences and about 20 % in the social sciences and humanities. This proportion shows little temporal variation over the five‑decade span, confirming that self‑citation is a persistent, field‑wide practice rather than a trend‑driven phenomenon.
- Variation in co‑author citations – The share of Level 1 and Level 2 citations varies dramatically across fields. In disciplines with high average numbers of co‑authors per paper (e.g., astronomy/astrophysics and atmospheric science, where papers often list five or more authors), Level 1 citations can exceed 10 % of all references. By contrast, fields with modest collaboration rates (organic chemistry, sociology, history) exhibit Level 1 shares below 1 % and virtually no Level 2 citations. This pattern reflects the density of the underlying co‑authorship network: a richer network provides more “socially close” papers to cite.
- Dominance of distant citations – In most specialties, distant (Level 3) citations account for more than half of all references, and this share has grown in recent decades, especially in the natural sciences. The increase coincides with a rise in the proportion of recent literature cited (references less than 10 years old), indicating that researchers are turning to the latest work even when it lies outside their immediate collaboration circle.
- Correlation with macro‑variables – Higher average numbers of authors per paper, larger overall publication output, and a higher proportion of intra‑specialty references are positively associated with the prevalence of Level 1 and Level 2 citations. Conversely, a higher share of recent references correlates negatively with close‑network citations, suggesting that the pursuit of cutting‑edge literature pushes scholars toward more distant sources.
- Limited impact of collaboration on citation bias – While self‑citation is a stable source of bias, the influence of the broader collaboration network on citation choices is modest and highly field‑dependent. In the social sciences and humanities, where co‑authorship is rare, the network effect is essentially absent. In the natural sciences, the effect is measurable but does not overturn the overall tendency to cite distant work.
Implications
- Citation‑based evaluation – Standard bibliometric indicators that adjust for self‑citations may need to incorporate field‑specific corrections for close‑network citations, especially in highly collaborative domains. A one‑size‑fits‑all approach could misrepresent scholarly impact in disciplines with dense co‑authorship structures.
- Peer review and grant evaluation – The findings provide empirical support for policies that avoid assigning reviewers who are within two degrees of collaboration with the applicant. Such “distance‑based” reviewer selection can reduce potential conflicts of interest and promote more impartial assessment, particularly in fields with extensive co‑authorship.
- Science policy – Encouraging cross‑disciplinary collaborations and facilitating exposure to literature beyond one’s immediate network could enhance knowledge diffusion. Funding agencies might consider mechanisms (e.g., interdisciplinary workshops, collaborative grants) that deliberately broaden scholars’ citation horizons, thereby increasing the diversity of cited work.
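The “distance‑based” reviewer selection mentioned under peer review could be approximated with a bounded breadth‑first search over the co‑authorship graph. The sketch below is illustrative only: `coauthor_distance` and `arms_length_reviewers` are hypothetical helpers, the graph is assumed to be a dict mapping each author to a set of co‑authors, and the threshold of three degrees is an assumption that follows from the "within two degrees" wording above.

```python
from collections import deque


def coauthor_distance(graph, source, target, max_depth=None):
    """Shortest co-authorship path length from source to target.

    Returns None if the target is unreachable or lies beyond max_depth.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, depth = queue.popleft()
        # Do not expand nodes that already sit at the depth limit.
        if max_depth is not None and depth >= max_depth:
            continue
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return depth + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return None


def arms_length_reviewers(graph, applicant, candidates, min_distance=3):
    """Keep candidates at least `min_distance` degrees from the applicant."""
    eligible = []
    for candidate in candidates:
        # Search only up to min_distance - 1; a miss means "far enough".
        d = coauthor_distance(graph, applicant, candidate, max_depth=min_distance - 1)
        if d is None:
            eligible.append(candidate)
    return eligible
```

Bounding the search depth keeps the check cheap in dense networks, since only the applicant's two‑step neighbourhood is ever explored.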
Conclusion
By integrating a large‑scale co‑authorship network analysis with detailed citation categorization, the authors demonstrate that while self‑citation remains a constant element of scholarly practice, the propensity to cite recent collaborators is highly variable and largely subordinate to the overall tendency to cite distant literature. The study enriches our understanding of the social dimensions of citation behavior, offers concrete guidance for improving the fairness of research evaluation, and highlights avenues for policy interventions aimed at fostering a more interconnected and less insular scientific ecosystem.