Measuring impact in research evaluations: A thorough discussion of methods for, effects of, and problems with impact measurements
The impact of science is one of the most important topics in scientometrics. Recent developments show a fundamental change in impact measurement, from impact on science to impact on society. Since impact measurement is currently undergoing far-reaching changes, this paper describes recent developments and the problems facing this area. To this end, the results of key publications dealing with impact measurement are discussed. The paper discusses how impact is generally measured within science and beyond (section 2), which effects impact measurements have on the science system (section 3), and which problems are associated with impact measurement (section 4). The problems associated with impact measurement constitute the focus of this paper: science is marked by inequality, random chance, anomalies, the right to make mistakes, unpredictability, and a high significance of extreme events, all of which might distort impact measurements. Scientometricians, as the producers of impact scores, and decision makers, as their consumers, should be aware of these problems and consider them in the generation and interpretation of bibliometric results, respectively.
💡 Research Summary
The paper provides a comprehensive discussion of the current state of research impact measurement, tracing the shift from traditional scientific impact—primarily captured through citation‑based indicators such as citation counts, journal impact factors, and h‑indices—to a broader conception of societal impact that includes patents, policy citations, media coverage, and altmetric mentions. The authors first outline the methodological landscape: classic bibliometric approaches rely on large citation databases (Web of Science, Scopus) and employ field‑normalisation techniques to enable cross‑disciplinary comparison, yet they are confounded by differing disciplinary citation cultures, long citation windows, and incentives that encourage “citation‑friendly” research topics. In contrast, societal impact metrics are heterogeneous, encompassing quantitative altmetric scores (tweets, blog posts, news articles), patent citations, policy document references, technology transfer cases, and qualitative expert panels. While these measures capture rapid, outward‑facing signals of influence, they are hampered by a lack of standardisation, temporal volatility, and the subjectivity inherent in qualitative judgments.
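To make the field‑normalisation idea concrete, here is a minimal sketch (an illustration, not code from the paper): each paper's citation count is divided by the mean citation count of papers from the same field and publication year, so a score above 1.0 indicates above‑average impact for that field. The paper records and field baselines below are entirely hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (paper_id, field, publication_year, citation_count)
papers = [
    ("p1", "chemistry", 2015, 42),
    ("p2", "chemistry", 2015, 3),
    ("p3", "history",   2015, 7),
    ("p4", "history",   2015, 1),
]

# Step 1: compute the expected (mean) citation rate per field and year.
totals = defaultdict(lambda: [0, 0])  # (field, year) -> [citation sum, paper count]
for _, field, year, cites in papers:
    totals[(field, year)][0] += cites
    totals[(field, year)][1] += 1
baselines = {key: s / n for key, (s, n) in totals.items()}

# Step 2: normalise each paper against its own field-year baseline.
# A score of 1.0 means "cited exactly as often as the field average".
for pid, field, year, cites in papers:
    score = cites / baselines[(field, year)]
    print(f"{pid}: field-normalised citation score = {score:.2f}")
```

The history paper with 7 citations can thereby outscore a chemistry paper with far more raw citations, which is exactly the cross‑disciplinary comparability the technique aims for.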
The second part of the paper examines how impact metrics shape the scientific system. When impact scores are directly linked to funding allocations, hiring, promotion, and institutional rankings, researchers adapt their behaviour: they increase co‑authorship, target high‑impact journals, and gravitate toward “hot” topics that promise rapid citation accrual. This incentive structure can boost short‑term citation performance but may undermine long‑term innovation and basic research. The emerging emphasis on societal impact, driven by policy agendas, encourages industry‑university collaborations, technology transfer, and policy advisory work, yet it also risks marginalising curiosity‑driven inquiry if reward mechanisms become overly utilitarian.
A central contribution of the paper is its articulation of five fundamental problems that can distort impact measurement: (1) inequality—a small elite of highly cited papers or researchers can dominate average scores, masking the contributions of the broader community; (2) randomness—some papers achieve high citations by chance rather than intrinsic merit, leading to a mismatch between citation‑based impact and true scientific value; (3) extreme events and outliers—rare papers that experience sudden citation spikes or massive media attention can inflate aggregate metrics; (4) the right to err—science progresses through trial, error, and revision, yet most metrics reward only successful, visible outcomes, ignoring the essential role of failed experiments; and (5) unpredictability—future societal relevance of a research output is often unknowable at the time of publication, so current impact scores cannot reliably forecast long‑term benefits.
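Problem (1) is easy to reproduce numerically. The sketch below (an illustration, not taken from the paper) draws citation counts from a heavily skewed distribution of the kind commonly observed in bibliometrics and shows how a few highly cited papers pull the mean far above the median, so that the "average impact" describes almost no individual paper. The distribution parameters are arbitrary.

```python
import random

random.seed(42)

# Simulate a skewed citation distribution (a lognormal is a common model
# for citation counts; the parameters here are illustrative only).
citations = [int(random.lognormvariate(mu=1.0, sigma=1.5)) for _ in range(1000)]

citations.sort()
n = len(citations)
mean = sum(citations) / n
median = citations[n // 2]

# Share of all citations captured by the top 10% of papers.
top_decile = citations[int(0.9 * n):]
top_share = sum(top_decile) / sum(citations)

print(f"mean citations:   {mean:.1f}")
print(f"median citations: {median}")
print(f"top 10% of papers hold {top_share:.0%} of all citations")
```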
In the concluding section, the authors propose a set of safeguards for both scientometricians and policy makers. They advocate for multi‑dimensional evaluation frameworks that combine several quantitative indicators with qualitative peer review, thereby reducing reliance on any single metric. Field‑specific normalisation and weighted scoring are recommended to mitigate inequality and random effects. The paper also suggests systematic monitoring of outliers and the incorporation of longitudinal follow‑up studies to capture delayed societal benefits. Ultimately, the authors stress that impact scores should be treated as decision‑support tools, not definitive judgments, and that their limitations must be explicitly considered when designing funding policies, institutional assessments, and research strategies.
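As a toy illustration of such a multi‑dimensional framework (the indicator names and weights below are hypothetical, not prescribed by the paper), several normalised indicators can be combined into a composite score while the individual components stay visible, so the score supports rather than replaces expert judgment:

```python
from dataclasses import dataclass

@dataclass
class Indicators:
    """Normalised scores on a 0-1 scale; all indicator names are illustrative."""
    field_normalised_citations: float
    altmetric_attention: float
    policy_mentions: float
    peer_review_panel: float  # qualitative judgment mapped onto 0-1

# Hypothetical weights; a real framework would set these per evaluation goal.
WEIGHTS = {
    "field_normalised_citations": 0.4,
    "altmetric_attention": 0.1,
    "policy_mentions": 0.2,
    "peer_review_panel": 0.3,
}

def composite_score(ind: Indicators) -> float:
    """Weighted sum of indicators, reported alongside (not instead of)
    the individual component scores."""
    return sum(weight * getattr(ind, name) for name, weight in WEIGHTS.items())

paper = Indicators(0.8, 0.3, 0.5, 0.7)
print(f"composite: {composite_score(paper):.2f}")  # decision support only
```

Keeping the weights explicit and the components inspectable reflects the authors' point that impact scores are decision‑support tools whose construction must remain open to scrutiny.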