CITEX: A new citation index to measure the relative importance of authors and papers in scientific publications

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Evaluating the performance of researchers and measuring the impact of papers written by scientists is the main objective of citation analysis. Various indices and metrics have been proposed for this. In this paper, we propose a new citation index CITEX, which gives normalized scores to authors and papers to determine their rankings. To the best of our knowledge, this is the first citation index which simultaneously assigns scores to both authors and papers. Using these scores, we can get an objective measure of the reputation of an author and the impact of a paper. We model this problem as an iterative computation on a publication graph, whose vertices are authors and papers, and whose edges indicate which author has written which paper. We prove that this iterative computation converges in the limit, by using a powerful theorem from linear algebra. We run this algorithm on several examples, and find that the author and paper scores match closely with what is suggested by our intuition. The algorithm is theoretically sound and runs very fast in practice. We compare this index with several existing metrics and find that CITEX gives far more accurate scores compared to the traditional metrics.


💡 Research Summary

The paper introduces CITEX, a novel citation index that simultaneously assigns normalized scores to both authors and papers. Unlike traditional metrics such as citation counts, h‑index, or journal‑level impact factors, which treat authors and papers separately, CITEX models the scholarly ecosystem as a bipartite “publication graph” linking authors to the papers they have written, together with a directed “citation graph” among papers.

Formally, let A = {a₁,…,a_m} be the set of authors and P = {p₁,…,p_n} the set of papers. The publication relationship is captured by an m×n binary matrix M where M_{ij}=1 iff a_i is an author of paper p_j. The citation relationship is captured by an n×n binary matrix C where C_{jk}=1 iff paper p_j cites paper p_k. The authors assume the citation graph is acyclic (papers can only cite earlier works), so the papers can be ordered to make C strictly triangular: strictly lower‑triangular if papers are indexed oldest first, strictly upper‑triangular if newest first.
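To make the notation concrete, the following minimal sketch builds M, C, and the diagonal of D for a three-author, three-paper toy graph. The author and paper names, the dictionaries `written_by` and `cites`, and the data itself are hypothetical illustrations, not examples taken from the paper.

```python
# Hypothetical toy data (not from the paper); papers are listed oldest first.
authors = ["a1", "a2", "a3"]
papers = ["p1", "p2", "p3"]

# paper -> its co-authors
written_by = {"p1": ["a1"], "p2": ["a1", "a2"], "p3": ["a2", "a3"]}
# paper -> papers it cites (only earlier ones, so the graph is acyclic)
cites = {"p1": [], "p2": ["p1"], "p3": ["p1", "p2"]}

a_idx = {a: i for i, a in enumerate(authors)}
p_idx = {p: j for j, p in enumerate(papers)}
m, n = len(authors), len(papers)

# M[i][j] = 1 iff author i wrote paper j  (m x n)
M = [[0] * n for _ in range(m)]
for p, names in written_by.items():
    for a in names:
        M[a_idx[a]][p_idx[p]] = 1

# C[j][k] = 1 iff paper j cites paper k  (n x n)
C = [[0] * n for _ in range(n)]
for p, refs in cites.items():
    for q in refs:
        C[p_idx[p]][p_idx[q]] = 1

# Diagonal of D: number of co-authors of each paper
D = [sum(M[i][j] for i in range(m)) for j in range(n)]
```

Because the toy papers are indexed oldest first, every nonzero entry of this C falls below the diagonal; reversing the order would place them above it.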

CITEX proceeds by initializing all author scores α_i and paper scores β_j to 1 (any positive value would work). In each iteration two update steps are performed:

  1. Author update – each paper’s current score β_j is divided equally among its |AUTHORS(p_j)| co‑authors. The sum of these shares gives the new author score α_i^{new}. Mathematically, α^{new}=M·D^{-1}·β where D is a diagonal matrix with D_{jj}=|AUTHORS(p_j)|.

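  2. Paper update – a paper’s new score β^{new}_j is the sum of (a) the contribution from its authors (their updated scores α_i^{new}, summed and divided by the number of co‑authors) and (b) the current scores of the papers that cite it, i.e., β^{new}=D^{-1}·M^{T}·α^{new}+C^{T}·β. Here D^{-1} multiplies M^{T} from the left so that the dimensions agree, since D is n×n and M^{T} is n×m.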

After each full iteration the vectors α and β are normalized so that their entries sum to 1, ensuring they lie in the interval (0,1). This “principle of repeated improvement” creates a feedback loop: influential authors boost the scores of their papers, and highly‑cited papers in turn raise the standing of their authors.
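The two update steps and the normalization can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' implementation: the function name `citex_sketch`, the fixed iteration count, and the toy matrices are assumptions. It also assumes that the author contribution to a paper is the sum of its authors' updated scores divided by the number of co-authors, which is one dimensionally consistent reading of the paper-update formula, and that every paper has at least one author.

```python
def citex_sketch(M, C, iterations=50):
    """Iterate the author/paper updates; M is m x n, C is n x n (lists of lists)."""
    m, n = len(M), len(C)
    # Number of co-authors per paper (diagonal of D); assumed to be >= 1.
    D = [sum(M[i][j] for i in range(m)) for j in range(n)]
    alpha = [1.0] * m
    beta = [1.0] * n
    for _ in range(iterations):
        # Author update: each paper's score is split equally among its co-authors.
        new_alpha = [
            sum(M[i][j] * beta[j] / D[j] for j in range(n)) for i in range(m)
        ]
        # Paper update: averaged author scores plus scores of the citing papers.
        new_beta = [
            sum(M[i][j] * new_alpha[i] for i in range(m)) / D[j]
            + sum(C[k][j] * beta[k] for k in range(n))
            for j in range(n)
        ]
        # Normalize both vectors so that their entries sum to 1.
        total_a, total_b = sum(new_alpha), sum(new_beta)
        alpha = [x / total_a for x in new_alpha]
        beta = [x / total_b for x in new_beta]
    return alpha, beta


# Hypothetical toy graph: a1 wrote p1 and p2, a2 wrote p2 and p3, a3 wrote p3;
# p2 cites p1, and p3 cites p1 and p2.
M = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
C = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
alpha, beta = citex_sketch(M, C)
```

On this toy graph the most-cited paper p1 and its sole author a1 end up with the highest scores, which illustrates the feedback loop between author and paper scores.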

The authors prove convergence by rewriting the two update equations as linear transformations using stochastic (row‑ or column‑stochastic) matrices. These matrices are non‑negative and have spectral radius 1, and the Perron‑Frobenius theorem is invoked to obtain a unique positive eigenvector associated with eigenvalue 1. In its standard form the theorem guarantees this uniqueness only when the matrix is also irreducible, and convergence of the iteration additionally requires primitivity; this summary does not record how the paper establishes those conditions. Under them, the iterative process converges to the eigenvector pair (α*, β*), independent of the initial values. The proof is analogous to that for PageRank but extended to the bipartite setting.
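One way to see why an eigenvector argument applies is to substitute the author update into the paper update. The derivation below is a sketch under the assumption that the paper update reads β^{new} = D^{-1}·M^{T}·α^{new} + C^{T}·β (the dimensionally consistent ordering) and that α^{new} is used before normalization; the symbol T for the combined matrix is introduced here for illustration and may not match the paper's notation.

```latex
\begin{align*}
\alpha^{\text{new}} &= M D^{-1} \beta, \\
\beta^{\text{new}} &= D^{-1} M^{T} \alpha^{\text{new}} + C^{T} \beta
  = \underbrace{\left( D^{-1} M^{T} M D^{-1} + C^{T} \right)}_{T}\, \beta, \\
\beta^{(t+1)} &= \frac{T \beta^{(t)}}{\lVert T \beta^{(t)} \rVert_{1}} .
\end{align*}
```

Read this way, the paper scores follow a normalized power iteration on the non-negative matrix T, and the author scores are recovered from the paper scores through α ∝ M·D^{-1}·β.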

Empirical evaluation is limited to a handful of synthetic examples. In each case the resulting author and paper rankings align with intuitive expectations (e.g., authors who write many well‑cited papers receive higher scores). The authors also compare CITEX qualitatively against traditional metrics, arguing that CITEX offers finer discrimination because scores are real numbers in (0,1) rather than integer counts, and because it accounts for co‑author contributions and the quality of citing papers.
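The point about integer granularity can be illustrated with the h‑index, using its standard definition (the largest h such that h papers have at least h citations each). The citation counts below are hypothetical and chosen only to show two very different profiles collapsing to the same integer.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for two authors with five papers each.
steady = [10, 8, 5, 4, 3]
top_heavy = [100, 50, 4, 4, 0]

h_steady = h_index(steady)        # 4
h_top_heavy = h_index(top_heavy)  # 4
```

Both profiles yield an h‑index of 4, whereas a real‑valued score that also weighs who cites each paper has room to separate them.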

The paper surveys related work, including h‑index, g‑index, Eigenfactor, and more recent co‑ranking approaches such as SIMRANK, CITERANK, and PageRank‑based citation analyses. It emphasizes that while many prior methods either rank only authors or only papers, or rely on separate random walks, CITEX is a unified, simpler algorithm that does not require external parameters beyond the graph structure.

Strengths

  • Provides a unified framework for author and paper ranking.
  • The iterative scheme is mathematically grounded; convergence is rigorously proved via Perron‑Frobenius.
  • Normalized scores avoid the integer granularity of traditional indices, offering higher discriminatory power.
  • The model is extensible: weighted author contributions, citation weighting, or temporal decay can be incorporated.

Weaknesses / Limitations

  • The assumption of an acyclic citation graph is unrealistic; real citation networks contain cycles (e.g., mutual citations, re‑citations).
  • Uniform division of a paper’s score among co‑authors ignores author order or contribution differences, which are important in many fields.
  • Experiments are confined to artificial datasets; no large‑scale validation on real bibliographic corpora (e.g., DBLP, Microsoft Academic Graph) is presented.
  • Self‑citations and citation manipulation are not explicitly mitigated; the model could still be gamed.
  • Temporal dynamics are omitted; newer papers may be undervalued because the model does not decay older citations.

Future Directions Suggested

  • Introduce author‑specific weights (first author, corresponding author) to reflect contribution hierarchy.
  • Apply citation weighting based on the citing paper’s own CITEX score (similar to Eigenfactor) to capture citation quality.
  • Incorporate a time‑decay factor so that recent citations have higher impact, addressing the “cold‑start” problem for new papers.
  • Conduct extensive experiments on real-world citation databases, comparing CITEX quantitatively against PageRank, HITS, and other co‑ranking methods using ground‑truth benchmarks (e.g., award winners, expert surveys).
  • Explore robustness against self‑citation and citation rings, possibly by penalizing reciprocal citation patterns.
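As a rough illustration of the first suggestion, the uniform split of a paper's score could be replaced by a weighted split. The helper `split_paper_score` and the example weights are hypothetical; neither the paper nor this summary prescribes a particular weighting scheme.

```python
def split_paper_score(score, weights):
    """Split one paper's score among its authors in proportion to weights."""
    total = sum(weights)
    return [score * w / total for w in weights]


# Uniform split among three co-authors (the scheme described in this summary).
uniform = split_paper_score(1.0, [1, 1, 1])

# Hypothetical alternative: the first author counts twice as much as the others.
first_author_heavy = split_paper_score(1.0, [2, 1, 1])  # [0.5, 0.25, 0.25]
```

Because the shares still sum to the paper's score, such a change would alter only how credit is distributed among co-authors, not the total credit a paper passes on.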

In summary, CITEX offers an elegant, theoretically sound approach to jointly rank authors and papers using a bipartite graph and iterative linear updates. While the conceptual contribution is valuable, the practical utility of the index will depend on addressing the noted limitations and demonstrating its performance on large, real-world scholarly datasets.

