In this work we investigate the sensitivity of individual researchers' productivity rankings to the time of citation observation. The analysis is based on observation of the research products of all research staff of Italian universities in the hard sciences for the 2001-2003 triennium, with the year of citation observation varying from 2004 to 2008. The 2008 rankings list is assumed to be the most accurate, as citations have had the longest time to accumulate and thus represent the best available proxy of impact. By comparing the rankings list from each year against the 2008 benchmark, we provide policy-makers and research organization managers with a measure of the trade-off between timeliness of evaluation and accuracy of performance rankings. The results show that varying the citation window produces rates of inaccuracy that differ across the researchers' disciplines; the inaccuracy is negligible for Physics, Biology and Medicine.
Continuous development in bibliometric indicators and techniques has made it possible to use bibliometrics to complement or even fully replace peer-review methods in national research evaluation exercises, at least for the hard sciences. In the United Kingdom, the previous peer-review Research Assessment Exercise series will be replaced in 2014 by the Research Excellence Framework (REF). The latter is an informed peer-review exercise, in which the assessment outcomes are the product of expert review informed by citation information and other quantitative indicators. In Italy there is a plan to replace the peer-review Triennial Evaluation Exercise (VTR), first held in 2006, with a new Quality in Research Assessment (VQR). The new exercise can be considered a hybrid, as the panels of experts can choose between or combine two methodologies for evaluating any particular output: i) citation analysis; and/or ii) peer-review by external experts. The Excellence in Research for Australia initiative (ERA), launched in 2010, follows a purely bibliometric approach for the hard sciences: individual research outputs are evaluated by a citation index referring to world and Australian benchmarks.
The pros and cons of peer-review and bibliometric methods have been amply debated in the literature (Horrobin, 1990; Moxham and Anderson, 1992; MacRoberts and MacRoberts, 1996; Moed, 2002; van Raan, 2005; Pendlebury, 2009; Abramo and D’Angelo, 2011a). For the evaluation of individual scientific products, the literature fails to decisively indicate whether one method is better than the other, but it does demonstrate a clear correlation between the results of peer-review evaluations and those of purely bibliometric exercises (Franceschet and Costantini, 2011; Abramo et al., 2009; Aksnes and Taxt, 2004; Oppenheim and Norris, 2003; Rinia et al., 1998; Oppenheim, 1997). The situation changes when evaluation turns from individual research products to large-scale ratings of individuals, research groups or entire institutions. The huge costs and long execution times of peer-review force this type of evaluation to focus on a limited share of the total output of each research institution. A number of negative consequences arise, among others: i) the final rankings are strongly dependent on the share of output evaluated; ii) the selection of products to submit for evaluation can be inefficient, due to both technical and social factors; and, most important, iii) it is impossible to measure research productivity, which is the quintessential indicator of any production system. Abramo and D’Angelo (2011a) have contrasted the peer-review and bibliometric approaches in the Italian VTR and conclude that, for the hard sciences, the bibliometric methodology is by far preferable to peer-review in terms of robustness, validity, functionality, time and costs.
While peer-review can be applied to any type of research product at any moment after its codification, bibliometric methods, being based on citation analysis, are applicable only to research products for which citations are available. Furthermore, citation counts must be observed at a sufficient distance in time from the date of publication to be considered a reliable proxy of the real impact of a publication. The first condition limits the field of application of bibliometrics to the hard sciences. The second gives rise to a potential conflict between the need for evaluations to be conducted as quickly as possible after the period of interest and the need for time to develop accuracy and robustness in the ranking lists of individuals, research groups and institutions.
In order to provide policy makers and research institution managers with a measure of the trade-off between timeliness of execution and accuracy of performance rankings, the authors have undertaken two studies: a first preparatory study on the sensitivity of a publication’s impact measurement to the length of the citation window (Abramo et al., 2011a), and a second on the sensitivity of institutions’ performance rankings (Abramo et al., 2011b). The conclusions were: i) with the sole exception of Mathematics, a time lapse of two or three years between the date of publication and citation observation appears a sufficient guarantee of robustness in impact indicators for single research products (a greater time lag would offer greater accuracy, but with ever decreasing incremental effect); ii) for rankings of institutional productivity, it seems sufficient to count citations one year after the upper limit of a three-year production period to ensure acceptable accuracy. In this work we complete the picture, investigating the sensitivity of individual researchers’ productivity rankings to the time of citation observation. For this purpose we calculate the productivity of individual research staff in the hard sciences in Italian universities for the triennium 2001-2003, with the year of citation observation varying from 2004 to 2008.
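To make the comparison concrete, the following minimal sketch (in Python; not the authors’ code) illustrates how productivity ranking lists obtained with shorter citation windows might be compared against the 2008 benchmark through a rank correlation. The data frame, column names and scores are purely illustrative assumptions.

```python
# Minimal sketch: compare researcher productivity rankings obtained with
# shorter citation windows against the 2008 benchmark ranking.
# All data and column names below are hypothetical.
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical table: one row per researcher, one productivity score per
# citation-observation year (publications 2001-2003, citations counted
# up to the given year).
scores = pd.DataFrame({
    "researcher_id": ["r1", "r2", "r3", "r4"],
    "productivity_2004": [1.2, 0.8, 2.5, 0.4],
    "productivity_2006": [1.5, 1.1, 2.9, 0.5],
    "productivity_2008": [1.6, 1.3, 3.0, 0.6],  # benchmark: longest window
})

# Benchmark ranking from the longest citation window (2008).
benchmark_rank = scores["productivity_2008"].rank(ascending=False)

# Rank correlation of each earlier-year ranking with the benchmark:
# values close to 1 indicate that the earlier evaluation already
# reproduces the benchmark ordering.
for year in (2004, 2005, 2006, 2007):
    col = f"productivity_{year}"
    if col not in scores:
        continue  # year not observed in this toy dataset
    early_rank = scores[col].rank(ascending=False)
    rho, _ = spearmanr(early_rank, benchmark_rank)
    print(f"{year} vs 2008 benchmark: Spearman rho = {rho:.3f}")
```

In this kind of setup, the closer the rank correlation for an early observation year is to that of the 2008 benchmark, the smaller the accuracy loss incurred by anticipating the evaluation.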