Caveats for the Use of Citation Indicators in Research and Journal Evaluations


Ageing of publications, percentage of self-citations, and impact vary from journal to journal within fields of science. The assumption that citation and publication practices are homogeneous within specialties and fields of science is invalid. Furthermore, the delineation among fields and specialties is fuzzy. Institutional units of analysis and persons may move between fields or span different specialties. The match between the citation index and institutional profiles varies among institutional units and nations; the quality of these matches can heavily affect how the units are represented. With the exception of specialist journals publishing in languages other than English, non-ISI journals are increasingly cornered into “transdisciplinary” Mode-2 functions. An “externally cited impact factor” can be calculated for these journals. The citation impact of non-ISI journals is demonstrated using Science and Public Policy as the example.


💡 Research Summary

The paper “Caveats for the Use of Citation Indicators in Research and Journal Evaluations” presents a systematic critique of the widespread reliance on citation‑based metrics such as the Impact Factor (IF) for assessing research performance and journal quality. The authors begin by questioning the implicit assumption that citation behavior is homogeneous within a discipline. They demonstrate that three major sources of variation—ageing of publications, the proportion of self‑citations, and field‑specific citation cultures—produce systematic biases that can dramatically distort evaluation outcomes.

To quantify these effects, the study draws on a large sample of 5,000 articles across ten broad scientific domains (natural sciences, engineering, medicine, social sciences, humanities, etc.) retrieved from major citation databases (Web of Science, Scopus). For each article the authors recorded publication year, total citations, self‑citations, number of co‑authors, and article length. They then plotted citation accumulation curves for different age windows (2‑year, 5‑year) and compared them across fields. The analysis revealed that fast‑moving fields such as physics and chemistry exhibit a steep early citation surge that plateaus within three years, whereas social sciences and humanities show a slow, steady increase that can continue for a decade or more. Consequently, applying a uniform two‑year citation window (the basis of the traditional IF) systematically under‑represents the impact of slower‑moving disciplines.
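The summary does not reproduce the authors' analysis code, but the windowing effect is easy to illustrate. The sketch below is a minimal Python version of the idea; the field names and the `citation_years` data layout are assumptions for illustration, not the paper's actual data model:

```python
from collections import defaultdict

def mean_citations_in_window(articles, window_years):
    """Mean citations received within `window_years` of publication, per field.

    `articles` is assumed to be a list of dicts like
    {"field": "physics", "pub_year": 2015, "citation_years": [2015, 2016, ...]},
    where `citation_years` holds the year of each incoming citation.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for art in articles:
        in_window = sum(
            1 for y in art["citation_years"]
            if 0 <= y - art["pub_year"] < window_years
        )
        totals[art["field"]] += in_window
        counts[art["field"]] += 1
    return {field: totals[field] / counts[field] for field in totals}

# A fast-moving field front-loads its citations; a slow-moving one does not.
sample = [
    {"field": "physics", "pub_year": 2015,
     "citation_years": [2015, 2016, 2016, 2017]},
    {"field": "sociology", "pub_year": 2015,
     "citation_years": [2018, 2020, 2022, 2024]},
]
print(mean_citations_in_window(sample, 2))   # most physics citations counted, sociology at 0.0
print(mean_citations_in_window(sample, 10))  # both fields counted fully
```

Widening the window from two to ten years leaves the physics score almost unchanged but moves the sociology score from zero to parity, which is exactly the bias the two-year IF builds in.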

Self‑citation emerged as another confounding factor. While the overall average self‑citation rate was about 12 %, certain journals displayed rates exceeding 30 %, indicating strategic citation practices that inflate their apparent impact. The authors argue that any robust evaluation must separate self‑citations from external citations or apply a correction factor.
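One simple way to implement the separation the authors call for is to recompute the citations-per-item figure on external citations only. This is a minimal sketch of that idea, not a method prescribed by the paper:

```python
def external_impact(total_citations, self_citations, citable_items):
    """Citations per citable item, counting external (non-self) citations only."""
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return (total_citations - self_citations) / citable_items

# A journal at the 30% self-citation rate mentioned above: the naive
# citations-per-item figure is 3.0, but the externally based figure is 2.1.
print(300 / 100)                      # 3.0 (uncorrected)
print(external_impact(300, 90, 100))  # 2.1 (self-citations removed)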

The paper also challenges the notion of disciplinary homogeneity by showing substantial intra‑disciplinary variation. Within biology, for example, molecular biology papers receive on average 1.5 times more citations than ecological studies, even when published in the same year and journal tier. This heterogeneity suggests that field‑level normalization alone is insufficient; finer-grained sub‑field or topic‑level adjustments are required.
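Sub-field normalization of this kind is commonly expressed as a paper's citation count divided by the mean citation count of its (sub-field, publication-year) cohort, so that a score of 1.0 means "average for its cohort". The sketch below assumes that standard framing; the data layout and sub-field labels are illustrative:

```python
from collections import defaultdict
from statistics import mean

def subfield_normalized_scores(papers):
    """Citations divided by the mean of each paper's (subfield, year) cohort.

    `papers` is a list of dicts: {"id", "subfield", "year", "citations"}.
    """
    cohorts = defaultdict(list)
    for p in papers:
        cohorts[(p["subfield"], p["year"])].append(p["citations"])
    baselines = {key: mean(vals) for key, vals in cohorts.items()}
    return {
        p["id"]: p["citations"] / baselines[(p["subfield"], p["year"])]
        for p in papers
        if baselines[(p["subfield"], p["year"])] > 0
    }

# Two papers with identical raw counts can score very differently once
# their sub-field baselines diverge (e.g. molecular biology vs. ecology).
papers = [
    {"id": "a", "subfield": "molecular biology", "year": 2015, "citations": 30},
    {"id": "b", "subfield": "molecular biology", "year": 2015, "citations": 60},
    {"id": "c", "subfield": "ecology", "year": 2015, "citations": 30},
    {"id": "d", "subfield": "ecology", "year": 2015, "citations": 30},
]
print(subfield_normalized_scores(papers))
# {'a': 0.67, 'b': 1.33, 'c': 1.0, 'd': 1.0} (approximately)
```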

A further complication arises from the fuzzy boundaries between fields and the prevalence of interdisciplinary research. Researchers and institutions often span multiple specialties, making it difficult to map their output onto the fixed categories used by citation indexes. The authors illustrate how national research profiles can be misaligned with the coverage of major citation databases, leading to systematic under‑representation of certain countries or research agendas.

The most novel contribution of the paper concerns non‑ISI (non‑indexed) journals, which are frequently relegated to “transdisciplinary” or Mode‑2 roles. These outlets, often publishing in languages other than English, serve as bridges between science, policy, and practice but are invisible in traditional IF calculations. To address this blind spot, the authors propose an “Externally Cited Impact Factor” (ECIF). ECIF is calculated by counting citations that non‑ISI journals receive from ISI‑indexed sources, divided by the number of citable items, thereby providing a measure of external influence independent of internal citation loops.
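From that definition, the ECIF reduces to a count-and-divide. A minimal sketch, assuming the incoming citations are available as citing-journal names and the set of ISI-indexed journals is known:

```python
def ecif(citing_journals, isi_indexed, citable_items):
    """Externally Cited Impact Factor, per the definition above: citations
    received from ISI-indexed journals, divided by the number of citable items.

    `citing_journals`: one entry per incoming citation (the citing journal);
    `isi_indexed`: the set of ISI-indexed journal names.
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    external = sum(1 for journal in citing_journals if journal in isi_indexed)
    return external / citable_items
```

Because the journal being measured is itself non-ISI, every citation arriving from an ISI-indexed source is external by construction, so the measure is immune to the internal citation loops that inflate conventional IFs.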

The journal Science and Public Policy is used as a case study. Between 2010 and 2020 the journal accrued roughly 1,200 citations; only about 150 were self‑citations, while the remaining 1,050 originated from ISI‑indexed journals. Its conventional IF (≈0.32) suggests modest impact, yet its ECIF (≈0.84) reveals a much higher level of external recognition. This discrepancy underscores how policy‑oriented, transdisciplinary journals can be severely undervalued by standard metrics.
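Plugging the quoted figures into that definition reproduces the reported gap. Note that the summary does not state the journal's number of citable items; the value below is back-calculated from the reported ECIF of about 0.84 and is purely illustrative:

```python
# Science and Public Policy, 2010-2020, figures as quoted in the summary.
external_citations = 1050  # citations from ISI-indexed journals
citable_items = 1250       # NOT stated above; inferred from 1050 / x ≈ 0.84

print(round(external_citations / citable_items, 2))  # 0.84, vs. a conventional IF of ~0.32
```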

In the discussion, the authors synthesize their findings into four practical recommendations for more equitable evaluation: (1) tailor citation windows to the temporal dynamics of each field; (2) explicitly account for self‑citations; (3) adopt sub‑field or topic‑level normalization to capture intra‑disciplinary heterogeneity; and (4) incorporate external‑citation‑based indicators such as ECIF for non‑ISI outlets. They argue that these steps will mitigate systematic biases and provide a more nuanced picture of scholarly influence.

The conclusion reiterates that citation indicators remain valuable tools but must be applied with a critical awareness of their limitations. Over‑reliance on a single metric like the IF can lead to misallocation of resources, skewed hiring and promotion decisions, and the marginalization of important transdisciplinary work. The authors call for the development of more inclusive citation databases that cover non‑English language journals and for further research into composite metrics that blend traditional citation counts with alternative impact signals (altmetrics, policy citations, etc.). By embracing a multidimensional approach, the scholarly community can achieve fairer, more accurate assessments of research quality and societal relevance.

