The Accuracy of Confidence Intervals for Field Normalised Indicators
📝 Abstract
When comparing the average citation impact of research groups, universities and countries, field normalisation reduces the influence of discipline and time. Confidence intervals for these indicators can help with attempts to infer whether differences between sets of publications are due to chance factors. Although both bootstrapping and formulae have been proposed for these, their accuracy is unknown. In response, this article uses simulated data to systematically compare the accuracy of confidence limits in the simplest possible case, a single field and year. The results suggest that the MNLCS (Mean Normalised Log-transformed Citation Score) confidence interval formula is conservative for large groups but almost always safe, whereas bootstrap MNLCS confidence intervals tend to be accurate but can be unsafe for smaller world or group sample sizes. In contrast, bootstrap MNCS (Mean Normalised Citation Score) confidence intervals can be very unsafe, although their accuracy increases with sample sizes.
📄 Content
The Accuracy of Confidence Intervals for Field Normalised Indicators1
Mike Thelwall, Ruth Fairclough
Statistical Cybermetrics Research Group, University of Wolverhampton, UK.
Keywords: Citation analysis; field normalised citation indicators; confidence intervals
1 Introduction
Citation indicators that estimate the average citation rate of articles produced by a group
are widely used in research assessment and for ranking universities, countries and
departments (Aksnes, Schneider, & Gunnarsson, 2012; Albarrán, Perianes‐Rodríguez, &
Ruiz‐Castillo, 2015; Braun, Glänzel, & Grupp, 1995; Elsevier, 2013; Fairclough & Thelwall,
2015). For example, in the U.K., they have been proposed for the national Research
Excellence Framework (REF) to cross-check peer review judgements (Stern, 2016). If average
citation indicators are to be used in such a role, then they must be calculated in a fair way
and accompanied by an estimate of statistical variability so that strong conclusions are not
drawn from small or biased differences.
Field normalised citation impact indicators adjust average citation counts for the
field and year of publication to allow fair comparisons of citation impact between sets of
articles that were published in different combinations of fields and years. For example, if
group A published 100 medical humanities articles in 2014 with an average of 4 citations
each but group B published 100 oncology articles in 2013 with an average of 30 citations
each, then it is not clear which group generated the more impactful research. Group B has two
advantages: its articles are older, with longer to attract citations, and it publishes in an area
where citations accrue rapidly. A field normalised indicator may divide by the average
number of citations for the field and year so that the normalised counts are 1 if the average
citation impact is equal to the world average. After this, it would be reasonable to compare
the field normalised values of A and B. Nevertheless, confidence intervals or statistical
hypothesis tests are needed to judge whether the difference between A and B is
likely to reflect an underlying trend rather than a random fluctuation of the data.
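The normalisation step and the percentile bootstrap confidence interval discussed above can be sketched as follows. This is a minimal illustration under the paper's simplest case (a single field and year), not the authors' implementation: the function names, the example citation counts, and the world mean of 5 are hypothetical, and the MNLCS variant would additionally apply a ln(1 + c) transformation before normalising.

```python
import random
import statistics

def mncs(citations, world_mean):
    """Mean Normalised Citation Score for one field and year:
    each citation count is divided by the world mean for that
    field and year, then the normalised values are averaged."""
    return statistics.mean(c / world_mean for c in citations)

def bootstrap_mncs_ci(citations, world_mean, level=0.95, reps=2000, seed=0):
    """Percentile bootstrap confidence interval for the MNCS:
    resample the group's citation counts with replacement,
    recompute the indicator each time, and take the empirical
    quantiles of the resampled indicator values."""
    rng = random.Random(seed)
    n = len(citations)
    resampled = sorted(
        mncs([rng.choice(citations) for _ in range(n)], world_mean)
        for _ in range(reps)
    )
    lo_idx = int((1 - level) / 2 * reps)
    hi_idx = int((1 + level) / 2 * reps) - 1
    return resampled[lo_idx], resampled[hi_idx]

# Hypothetical group of 8 articles in one field and year, with a
# hypothetical world mean of 5 citations for that field and year.
group = [2, 4, 6, 8, 10, 3, 5, 7]
point = mncs(group, 5.0)  # 1.125: above the world average of 1
low, high = bootstrap_mncs_ci(group, 5.0)
```

If the interval (low, high) for group A overlaps substantially with that for group B, a difference in their point estimates should not be treated as evidence of a real underlying difference. Note that with only 8 articles the interval will be wide, which reflects the paper's concern that bootstrap intervals can be unsafe for small group or world sample sizes.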
1 Thelwall, M. & Fairclough, R. (in press). The accuracy of confidence intervals for field normalised indicators. Journal of Informetrics. doi:10.1016/j.joi.2017.03.004. This manuscript version is made available under the CC BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
The use of statistical inference or confidence intervals to compare the average citation impact is uncommon within scientometrics and there are arguments against it, such as a lack of clarity about what exactly is being sampled (Waltman, 2016). Statistical inference is typically used when data is available about a sample whereas in scientometrics, relatively complete sets of publications are normally analysed and so there is no necessity to infer population properties from a sample, at least in the obvious sense. Nevertheless, research is a social process and therefore each citation is the product of activities that are affected by processes that can be thought of as random in the sense of not predictable in advance (Williams & Bornmann, 2016). The exact citation count of an article is therefore partly a result of chance factors rather than just the quality or value of an article. For example, if two essentially identical papers are published at the same time then one may become more highly cited than the other for spurious reasons, such as the prestige of the publishing journal (Larivière & Gingras, 2010), or the extent to which the citing literature is covered by the database used for the counts (Harzing & Alakangas, 2016; Table 3 in: Kousha & Thelwall, 2008). Thus, it seems impossible to regard citation counting as precisely measuring the im