A critical cluster analysis of 44 indicators of author-level performance
This paper explores the relationship between author-level bibliometric indicators and the researchers they “measure”, exemplified across five academic seniorities and four disciplines. Using cluster methodology, the disciplinary and seniority appropriateness of author-level indicators is examined. Publication and citation data for 741 researchers across Astronomy, Environmental Science, Philosophy and Public Health were collected in Web of Science (WoS). Forty-four indicators of individual performance were computed using the data. A two-step cluster analysis using IBM SPSS version 22 was performed, followed by a risk analysis and ordinal logistic regression to explore cluster membership. Indicator scores were contextualized using the individual researcher’s curriculum vitae. Four different clusters based on indicator scores ranked researchers as low, middle, high and extremely high performers. The results show that different indicators were appropriate in demarcating ranked performance in different disciplines: the h2 indicator in Astronomy, sum pp top prop in Environmental Science, Q2 in Philosophy and the e-index in Public Health. The regression and odds analyses showed that individual-level indicator scores were primarily dependent on the number of years since the researcher’s first publication registered in WoS, the number of publications and the number of citations. Seniority classification was secondary; therefore, no seniority-appropriate indicators were confidently identified. Cluster methodology proved useful in identifying discipline-appropriate indicators, provided the preliminary data preparation was thorough, but it needed to be supplemented by other analyses to validate the results. A general disconnect between researchers’ performance as represented on their curricula vitae and their performance as measured by bibliometric indicators was observed.
💡 Research Summary
The paper investigates how well author‑level bibliometric indicators reflect the actual performance of researchers across different seniorities and disciplines. Data were gathered from the Web of Science for 741 researchers representing four fields: Astronomy (186), Environmental Science (184), Philosophy (184), and Public Health (187). For each researcher, 44 bibliometric indicators were calculated, ranging from classic metrics such as the h‑index, g‑index, and e‑index to less common variants such as h2, Q2, and sum pp top prop.
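As a concrete illustration, the minimal sketch below shows how three of the classic indicators named above could be computed from a per‑paper citation vector, using the standard published definitions (Hirsch's h‑index, Egghe's g‑index, Zhang's e‑index). The citation counts are hypothetical stand‑ins, not data from the study.

```python
import numpy as np

def h_index(citations):
    """Largest h such that h papers each have at least h citations (Hirsch, 2005)."""
    c = np.sort(np.asarray(citations))[::-1]
    return int(np.sum(c >= np.arange(1, len(c) + 1)))

def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations (Egghe, 2006)."""
    c = np.sort(np.asarray(citations))[::-1]
    cum = np.cumsum(c)
    g_vals = np.arange(1, len(c) + 1)
    ok = cum >= g_vals ** 2
    return int(g_vals[ok].max()) if ok.any() else 0

def e_index(citations):
    """Square root of the excess citations in the h-core beyond h^2 (Zhang, 2009)."""
    c = np.sort(np.asarray(citations))[::-1]
    h = h_index(c)
    return float(np.sqrt(c[:h].sum() - h ** 2))

# Hypothetical citation record for one author (descending order not required).
cites = [48, 33, 30, 12, 9, 7, 4, 2, 1, 0]
print(h_index(cites), g_index(cites), e_index(cites))  # 6, 10, ~10.15
```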
A two‑step cluster analysis was performed using IBM SPSS Statistics 22. The procedure first pre‑clusters cases into many small sub‑clusters and then merges those sub‑clusters hierarchically, selecting the number of clusters automatically via an information criterion. Four distinct clusters emerged, which the authors labelled Low, Middle, High, and Extremely High performers based on the distribution of indicator scores.
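SPSS's TwoStep procedure is proprietary, but its two stages can be approximated with open tooling. The sketch below mimics it in scikit‑learn: BIRCH pre‑clustering followed by hierarchical merging of the sub‑cluster centroids. The indicator matrix is random stand‑in data, and the cluster count is fixed at four for simplicity rather than chosen by BIC/AIC as SPSS does.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import Birch, AgglomerativeClustering

# X: researchers x indicators matrix (random stand-in for the 741 x 44 data).
rng = np.random.default_rng(0)
X = StandardScaler().fit_transform(rng.lognormal(size=(741, 44)))

# Step 1: pre-cluster cases into small dense sub-clusters via a CF-tree (BIRCH).
pre = Birch(threshold=0.5, n_clusters=None).fit(X)
centroids = pre.subcluster_centers_

# Step 2: merge the sub-cluster centroids hierarchically into the final k groups.
k = 4  # fixed here; SPSS selects k automatically via an information criterion
merge = AgglomerativeClustering(n_clusters=k).fit(centroids)

# Map each researcher to the final cluster of its nearest sub-cluster.
labels = merge.labels_[pre.predict(X)]
print(np.bincount(labels))  # cluster sizes
```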
To explore what drives cluster membership, the authors conducted a risk analysis and an ordinal logistic regression. The regression model identified three primary predictors: (1) the number of years since the researcher’s first WoS‑indexed publication, (2) total number of publications, and (3) total citation count. All three were highly significant (p < 0.001). Academic seniority (e.g., post‑doc, assistant professor, associate professor, professor) entered the model as a secondary variable but did not exert a statistically meaningful effect. This suggests that author‑level indicators are more sensitive to actual research activity and impact than to formal career stage.
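For readers who want to reproduce the modelling step outside SPSS, a minimal sketch of a proportional‑odds (ordinal logistic) regression using statsmodels' OrderedModel follows. The column names and the synthetic data are illustrative assumptions, not the paper's variables or results.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Hypothetical frame: one row per researcher, ordered cluster as the outcome.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "years_since_first_pub": rng.integers(1, 40, 741),
    "n_publications": rng.integers(1, 300, 741),
    "n_citations": rng.integers(0, 5000, 741),
})
# Ordered outcome: Low < Middle < High < Extremely High (random here).
df["cluster"] = pd.Categorical.from_codes(
    rng.integers(0, 4, 741),
    categories=["Low", "Middle", "High", "Extremely High"],
    ordered=True,
)

model = OrderedModel(
    df["cluster"],
    df[["years_since_first_pub", "n_publications", "n_citations"]],
    distr="logit",  # proportional-odds (ordinal logistic) specification
)
res = model.fit(method="bfgs", disp=False)
print(res.summary())
print(np.exp(res.params.iloc[:3]))  # odds ratios for the three predictors
```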
Discipline‑specific findings revealed that different metrics best discriminate performance within each field. In Astronomy, the h2 indicator (Kosmulski's h(2): the largest h such that each of the researcher's h most‑cited papers has at least h² citations) provided the clearest separation. In Environmental Science, sum pp top prop (the proportion of an author's papers among the top 10% most cited) was most effective, reflecting the field's emphasis on high‑impact, interdisciplinary work. Philosophy's best discriminator was Q2 (the q² index: the geometric mean of the h‑index and the median citation count of the h‑core papers), aligning with the humanities' slower citation dynamics and longer citation half‑life. In Public Health, the e‑index, designed to complement the h‑index by accounting for excess citations beyond the h‑core, proved most suitable. These results confirm that a single universal indicator cannot adequately capture performance across diverse scholarly cultures.
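Under the definitions given above, the remaining discipline‑appropriate indicators could be sketched as below. The paper's exact operationalizations may differ; in particular, the field‑wide citation baseline used for sum pp top prop is an assumption here, supplied as a synthetic reference distribution.

```python
import numpy as np

def h2_index(citations):
    """Kosmulski's h(2): largest h whose top-h papers each have >= h^2 citations."""
    c = np.sort(np.asarray(citations))[::-1]
    return int(np.sum(c >= np.arange(1, len(c) + 1) ** 2))

def q2_index(citations):
    """q2 index: geometric mean of h and the median citations of the h-core."""
    c = np.sort(np.asarray(citations))[::-1]
    h = int(np.sum(c >= np.arange(1, len(c) + 1)))
    return float(np.sqrt(h * np.median(c[:h]))) if h else 0.0

def pp_top_prop(citations, field_citations, top=0.10):
    """Share of an author's papers at or above the field's top-`top` citation threshold."""
    threshold = np.quantile(field_citations, 1 - top)
    return float(np.mean(np.asarray(citations) >= threshold))

cites = [120, 45, 30, 9, 5, 2, 1]                          # hypothetical author
field = np.random.default_rng(2).lognormal(2, 1, 10_000)   # hypothetical field baseline
print(h2_index(cites), q2_index(cites), pp_top_prop(cites, field))
```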
When the authors compared the bibliometric cluster assignments with information extracted from each researcher’s curriculum vitae, a notable mismatch emerged. Some scholars with extensive grant leadership, teaching, and collaborative activities received relatively low bibliometric scores, whereas others with high citation counts but limited broader academic contributions were placed in top clusters. This discrepancy underscores the limitation of relying solely on quantitative citation‑based metrics to assess overall scholarly merit.
Methodological limitations were acknowledged. The dataset excludes publications not indexed in WoS (e.g., regional journals, conference proceedings, books), which may bias results, especially in fields like Philosophy where monographs are common. The preprocessing steps required for author name disambiguation and self‑citation removal introduce a degree of subjectivity. Additionally, the two‑step procedure's sensitivity to the order in which cases are processed could affect cluster stability. The authors recommend supplementing the current approach with data from other citation databases (Scopus, Google Scholar), peer‑review assessments, and qualitative case studies to validate and refine the indicator selection.
In conclusion, the study demonstrates that two‑step cluster analysis is a valuable tool for identifying discipline‑specific bibliometric indicators that effectively rank researchers. However, the utility of any indicator depends on rigorous data preparation and must be complemented by other evaluative methods. The finding that years since first publication, total output, and total citations dominate indicator scores, while seniority plays a secondary role, has implications for research assessment policies: evaluation frameworks should prioritize actual research productivity and impact rather than relying on seniority‑based heuristics. The observed disconnect between CV‑based narratives and metric‑based rankings further argues for a mixed‑methods approach that integrates quantitative indicators with qualitative evidence to achieve a more holistic appraisal of scholarly performance.