Citation analysis cannot legitimate the strategic selection of excellence


In reaction to a previous critique (Opthof & Leydesdorff, 2010), the Center for Science and Technology Studies (CWTS) in Leiden proposed to change their old “crown” indicator in citation analysis into a new one. Waltman et al. (2011) argue that this change does not affect rankings at various aggregated levels. However, CWTS data is not publicly available for testing and criticism. In this correspondence, we use previously published data of Van Raan (2006) to address the pivotal issue of how the results of citation analysis correlate with the results of peer review. A quality parameter based on peer review was neither significantly correlated with the two parameters developed by the CWTS in the past (CPP/JCSm or CPP/FCSm) nor with the more recently proposed h-index (Hirsch, 2005). Given the high correlations between the old and new “crown” indicators, one can expect that the lack of correlation with the peer-review based quality indicator applies equally to the newly developed ones.


💡 Research Summary

The paper critically examines the claim by the Centre for Science and Technology Studies (CWTS) in Leiden that replacing its traditional “crown” indicators (CPP/JCSm and CPP/FCSm) with a newly normalised version does not materially affect research rankings. Because the CWTS data underlying the new indicator are not publicly available, the authors turn to a previously published dataset from Van Raan (2006), which contains detailed bibliometric and peer‑review information for 147 chemistry and chemical‑engineering research groups in the Netherlands (1991‑1998).
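The contrast between the old and new normalisations can be made explicit. Using the standard definitions from the bibliometrics literature (a sketch; the symbols c_i and e_i are introduced here for illustration and are not spelled out in the paper), let c_i be the citation count of a group's publication i and e_i the expected citation rate for its field or journal. The old crown indicator divides aggregate sums, whereas the new indicator (the mean normalised citation score, MNCS) averages per-publication ratios:

\[
\mathrm{CPP/FCSm} \;=\; \frac{\frac{1}{n}\sum_{i=1}^{n} c_i}{\frac{1}{n}\sum_{i=1}^{n} e_i},
\qquad
\mathrm{MNCS} \;=\; \frac{1}{n}\sum_{i=1}^{n} \frac{c_i}{e_i}.
\]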

Using this dataset, the authors test two central questions: (1) How well do the traditional crown indicators, the new crown indicator, and the h‑index correlate with a peer‑review based quality score (Q) that ranges from 3 = satisfactory to 5 = excellent? (2) Can any of these citation‑based metrics discriminate between groups rated “good” (Q = 4) and “excellent” (Q = 5)?

Statistical analysis shows that CPP/JCSm and CPP/FCSm are highly correlated with each other (Pearson r ≈ 0.78, Spearman ρ ≈ 0.78), confirming that the new indicator essentially measures the same construct as the old one. However, both crown indicators exhibit virtually no correlation with the peer‑review quality score (e.g., Pearson r ≈ ‑0.13, p > 0.05). The h‑index, while marginally associated with Q at the aggregate level (χ² = 5.56, p ≈ 0.06), is also largely independent of the crown indicators and is strongly dependent on the number of publications, a property not shared by the CWTS metrics.
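To make the reported statistics concrete, the following minimal Python sketch shows how such correlations and an h‑index would be computed. The arrays here are synthetic stand‑ins, not the Van Raan (2006) data, and the variable names (cpp_jcsm, cpp_fcsm, q_scores) are illustrative:

```python
# Minimal sketch of the statistics discussed above (synthetic data,
# not the Van Raan 2006 dataset).
import numpy as np
from scipy.stats import pearsonr, spearmanr

def h_index(citation_counts):
    """h = the largest h such that h publications have >= h citations each."""
    sorted_counts = sorted(citation_counts, reverse=True)
    return sum(1 for rank, c in enumerate(sorted_counts, start=1) if c >= rank)

rng = np.random.default_rng(0)
cpp_jcsm = rng.normal(1.2, 0.4, size=147)            # hypothetical indicator values
cpp_fcsm = cpp_jcsm + rng.normal(0, 0.25, size=147)  # correlated by construction
q_scores = rng.integers(3, 6, size=147)              # peer-review quality, 3..5

r, p_r = pearsonr(cpp_jcsm, cpp_fcsm)       # indicator vs. indicator
rho, p_rho = spearmanr(cpp_fcsm, q_scores)  # indicator vs. peer review
print(f"Pearson r = {r:.2f} (p = {p_r:.3f}), Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
print("h-index example:", h_index([10, 8, 5, 4, 3, 1]))  # -> 4
```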

A concrete illustration using 12 groups from a single university (Table 1) demonstrates that the mean values of CPP/JCSm, CPP/FCSm, and the h‑index for “good” versus “excellent” groups overlap within their standard errors. Figure 1 (as reproduced by the authors) shows that none of the three citation‑based measures can statistically separate the two peer‑review categories. At the full sample level (N = 147), the association between Q and the h‑index reaches statistical significance, but the association between Q and CPP/FCSm does not (χ² = 4.211, df = 2, p = 0.112). Thus, even when aggregating across many groups, the crown indicators fail to distinguish “good” from “excellent” research.
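The χ² figures quoted above arise from contingency tables crossing the peer‑review categories with binned indicator values. A sketch of that kind of test is shown below; the counts in the table are purely hypothetical and are not taken from the paper:

```python
# Illustrative chi-square test of association between peer-review category
# and binned indicator values (hypothetical counts, not the paper's data).
from scipy.stats import chi2_contingency

# rows: Q = "good" (4) vs. "excellent" (5); columns: low / mid / high CPP/FCSm
table = [[20, 35, 25],
         [15, 30, 22]]
chi2, p, df, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p:.3f}")
```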

The authors argue that these findings expose a fundamental weakness in the current practice of research evaluation: citation analysis, now a quasi‑industrial service built on proprietary data (e.g., the Science Citation Index), lacks transparency and is therefore resistant to external scrutiny. Despite their widespread use in university rankings, funding decisions, and policy advice, the crown indicators fail to validate against an independent peer‑review benchmark. Moreover, the h‑index, though different in construction, also does not reliably reflect peer‑assessed quality.

In conclusion, the paper asserts that citation‑based metrics, whether the traditional crown indicators, their newly normalised successors, or the h‑index, cannot legitimately support strategic selection of research excellence. For citation analysis to become a trustworthy evaluative tool, the underlying data must be openly accessible, methodological choices must be transparent, and systematic validation against peer review must be performed. Until such reforms occur, reliance on these metrics for high‑stakes decisions remains scientifically unjustified.

