Best of both worlds? Simultaneous evaluation of researchers and their works


This paper explores a dual-score system that simultaneously evaluates the relative importance of researchers and their works. It is a modification of the CITEX algorithm recently described in Pal and Ruj (2015). Using available publication data for $m$ author keywords (as a proxy for researchers) and $n$ papers, it is possible to construct an $m \times n$ author-paper feature matrix. This is further combined with citation data to construct a HITS-like algorithm that iteratively satisfies two criteria: first, \emph{a good author is cited by good authors}, and second, \emph{a good paper is cited by good authors}. Following Pal and Ruj, the resulting algorithm produces an author eigenscore and a paper eigenscore. The algorithm is tested on 213,530 citable publications listed under Thomson ISI's "\emph{Information Science & Library Science}" JCR category from 1980–2012.


💡 Research Summary

The paper addresses the problem of jointly ranking scholars and their publications by improving upon the CITEX algorithm originally proposed by Pal and Ruj (2015). CITEX builds a bipartite author‑paper matrix M and a paper‑paper citation matrix C, then iteratively updates author scores x and paper scores y using HITS‑like equations that incorporate a self‑citation term (I + Cᵀ). While conceptually appealing—"good authors are cited by good authors" and "good papers are cited by good authors"—the algorithm suffers from two systematic biases. First, authors who are highly prolific, especially solo authors, can achieve high scores even with few or no citations, because every paper an author writes contributes directly to the author score through the raw author‑paper matrix M in the update rule. Second, papers that share the same author list repeatedly receive inflated scores regardless of actual citation impact. These biases lead to rankings that do not faithfully reflect scholarly influence.

To remedy these issues, the authors propose the Coupled Author‑Paper Scoring (CAPS) algorithm. CAPS removes the self‑citation term entirely and replaces the raw author‑paper matrix M with its column‑normalized version W, ensuring that each author’s contribution to a paper is proportionally accounted for. The core iterative equations become:

$x^{(k)} = W\,C^{T}W^{T}\,x^{(k-1)}$
$y^{(k)} = C^{T}W^{T}\,x^{(k)}$

Here, W Cᵀ Wᵀ captures fractional citations flowing from one author to another through the papers they authored, while Cᵀ Wᵀ aggregates the influence of authors who cite a given paper. The authors initialize both x and y as all‑ones vectors and iterate until the change falls below a small tolerance ε. Because the update matrices are non‑negative and irreducible, the Perron‑Frobenius theorem guarantees convergence to a unique dominant eigenvector for both author and paper scores.
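The iteration above can be sketched as a short power-iteration routine. The matrices below are a hypothetical toy example (not from the paper's dataset), and the per-step L1 normalization is a standard stabilization for power iteration that the paper's equations leave implicit:

```python
import numpy as np

# Hypothetical toy data: 3 authors (rows) x 4 papers (columns).
# M[i, j] = 1 if author i wrote paper j.
M = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]], dtype=float)

# Paper-paper citations: C[i, j] = 1 if paper i cites paper j.
C = np.array([[0, 0, 1, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]], dtype=float)

# W: column-normalized M, so each paper's authorship credit sums to 1
# and co-authors share fractional credit.
W = M / M.sum(axis=0, keepdims=True)

def caps_scores(W, C, eps=1e-9, max_iter=1000):
    """Power iteration on the CAPS author update x <- W C^T W^T x."""
    A = W @ C.T @ W.T          # author-to-author fractional citation flow
    x = np.ones(W.shape[0])    # all-ones initialization, as in the paper
    for _ in range(max_iter):
        x_new = A @ x
        x_new /= x_new.sum()   # L1 normalization keeps the iterate bounded
        if np.abs(x_new - x).sum() < eps:
            x = x_new
            break
        x = x_new
    y = C.T @ W.T @ x          # paper scores from the converged author scores
    return x, y

x, y = caps_scores(W, C)
```

In this toy network, author 0 receives no citations through any paper, so their score is driven to zero regardless of output volume, which is exactly the corrective behavior CAPS aims for.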

The method is evaluated on a large real‑world dataset: 213,530 citable items from the ISI “Information Science & Library Science” JCR category spanning 1980–2012. The authors construct W and C from author‑keyword (proxy for researchers) and citation data, then run CAPS alongside the original CITEX and a PageRank‑based co‑ranking baseline. Results show that CAPS dramatically reduces the over‑emphasis on sheer publication volume; prolific solo authors no longer dominate the ranking. Paper scores under CAPS correlate more strongly with raw citation counts, and the algorithm better distinguishes papers that are truly influential from those that merely share author lists with high‑output researchers. Network‑level analysis confirms that CAPS faithfully captures the structure of inter‑author citations, rewarding authors who are cited by other high‑impact scholars and whose own papers receive citations from influential works.

In summary, the paper makes three key contributions: (1) a clear identification of systematic biases in the CITEX formulation, (2) the design of the CAPS algorithm that eliminates self‑citation bias and enforces proper normalization, and (3) an extensive empirical validation demonstrating that CAPS provides more balanced and meaningful rankings of both authors and papers. The authors suggest future work on applying CAPS to other disciplines, exploring temporal dynamics, and integrating additional metadata (e.g., venue prestige) to further enrich the dual‑ranking framework.

