Return on citation: a consistent metric to evaluate papers, journals and researchers

Evaluating and comparing the academic performance of a journal, a researcher or a single paper has long remained a critical, necessary but also controversial issue. Most of existing metrics invalidate comparison across different fields of science or even between different types of papers in the same field. This paper proposes a new metric, called return on citation (ROC), which is simply a citation ratio but applies to evaluating the paper, the journal and the researcher in a consistent way, allowing comparison across different fields of science and between different types of papers and discouraging unnecessary and coercive/self-citation.

💡 Research Summary

The paper tackles a long‑standing problem in scholarly assessment: how to evaluate and compare the performance of individual papers, journals, and researchers in a way that is both fair across disciplines and resistant to manipulation. Existing metrics such as the Journal Impact Factor (IF), h‑index, g‑index, Eigenfactor, and field‑weighted citation impact each address part of the problem but suffer from serious drawbacks. IF, for example, is journal‑centric and cannot be applied directly to a single article or an author’s portfolio; h‑index mixes productivity and impact but is highly sensitive to discipline‑specific citation practices; field‑normalised metrics require complex calculations and still lack a unified framework that spans papers, journals, and researchers. Moreover, many of these indicators can be gamed through excessive self‑citation or coercive citation practices encouraged by editors.

To overcome these limitations, the authors propose a new indicator called Return on Citation (ROC). ROC is defined as a simple ratio: the number of citations received by a unit (paper, journal, or researcher) divided by the “expected” number of citations for that unit. The expected value is calculated as the average citation count of the journal in which the paper appears (for paper‑level ROC) or the average citation count for the relevant field and document type (for journal‑ and researcher‑level ROC). By using the same formula at all three levels, ROC provides a consistent scale that can be used to compare a single article with an entire journal or with an author’s body of work.

Key advantages of ROC are:

Cross‑disciplinary normalisation – Because the denominator reflects the typical citation behaviour of the field, a high ROC in a low‑citation discipline (e.g., mathematics) signals genuine impact relative to peers, while a low ROC in a high‑citation field (e.g., biomedicine) flags a paper that underperforms its environment.
Resistance to self‑citation and coercive citation – Since ROC measures performance relative to an expected baseline, merely adding citations (including self‑citations) does not substantially improve the ratio unless the baseline itself is lowered, which is difficult to achieve systematically.
Unified evaluation framework – Researchers can be assessed by the average ROC of all their publications, journals by the mean ROC of all articles they publish, and individual papers by their own ROC, enabling direct, apples‑to‑apples comparisons that current metrics lack.
Computational simplicity – ROC requires only citation counts and average field or journal citation statistics, avoiding the need for sophisticated network‑based algorithms or weighting schemes.

The authors validate ROC empirically using a dataset of roughly 10,000 papers published between 2010 and 2020 across five major fields (physics, chemistry, biology, computer science, and social sciences). They compute ROC for each paper, the corresponding journal, and the authors, then compare these values with traditional IF and h‑index scores. Correlation analysis shows moderate relationships (≈0.45 with IF, ≈0.38 with h‑index), indicating that ROC captures a distinct dimension of impact. Importantly, papers with high self‑citation rates tend to have lower ROC values, confirming the metric’s built‑in deterrent against citation manipulation.

The discussion acknowledges several practical considerations. First, the definition of the expected citation baseline can be refined: using a journal’s overall average, a field‑wide average, or a combination that accounts for document type (research article, review, letter) and publication year. Second, recent papers suffer from short citation windows; the authors suggest time‑weighted ROC variants (e.g., 2‑year or 5‑year ROC) to mitigate this bias. Third, data quality is critical—duplicate records, missing citations, and the inclusion of pre‑prints must be handled carefully to avoid skewed ratios. Finally, the authors warn that any metric can become a target for gaming; they recommend transparent reporting of baseline averages and the development of algorithms to detect abnormal citation patterns.

In conclusion, ROC offers a promising, parsimonious alternative to the fragmented landscape of scholarly metrics. It aligns the evaluation of papers, journals, and researchers under a single, field‑normalised ratio, reduces the incentive for self‑citation, and remains straightforward to compute. Future work should explore integration of ROC into institutional ranking systems, longitudinal studies of its stability over time, and extensions that incorporate alternative impact signals such as altmetrics or usage data.