The Operationalization of "Fields" as WoS Subject Categories (WCs) in Evaluative Bibliometrics: The cases of "Library and Information Science" and "Science & Technology Studies"


Normalization of citation scores using reference sets based on Web-of-Science Subject Categories (WCs) has become an established (“best”) practice in evaluative bibliometrics. For example, the Times Higher Education World University Rankings are, among other things, based on this operationalization. However, WCs were developed decades ago for the purpose of information retrieval and evolved incrementally with the database; the classification is machine-based and partially manually corrected. Using the WC “information science & library science” and the WCs attributed to journals in the field of “science and technology studies,” we show that WCs do not provide sufficient analytical clarity to carry bibliometric normalization in evaluation practices because of “indexer effects.” Can compliance with “best practices” be replaced with an ambition to develop “best possible practices”? New research questions can then be envisaged.


💡 Research Summary

The paper critically examines the use of Web of Science Subject Categories (WCs) as the foundational reference sets for citation‑score normalization in evaluative bibliometrics, a practice that has become de facto “best practice” in major ranking systems such as the Times Higher Education World University Rankings. The authors begin by tracing the historical development of WCs: originally created in the early 1970s to support information retrieval, they have been incrementally updated as the database expanded. Early versions relied on algorithmic classification of journal titles, abstracts, and keywords; later iterations introduced manual corrections by human indexers, resulting in a hybrid system that blends automated assignment with subjective oversight.
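
In outline, WC‑based normalization divides a paper's observed citations by the expected (mean) citations of its WC reference set. The following minimal sketch illustrates the mechanics only; the reference sets, citation counts, and function name are invented for demonstration and are not taken from the paper or from Web of Science data.

```python
# Minimal sketch of WC-based citation normalization (hypothetical data).
# A paper's citation count is divided by the mean citation count of all
# papers in its assigned WC, yielding a field-normalized score.
from statistics import mean

# Hypothetical reference sets: WC -> citation counts of its papers.
reference_sets = {
    "Information Science & Library Science": [2, 5, 8, 1, 4],
    "Science & Technology Studies": [10, 3, 7, 12, 6],
}

def normalized_citation_score(citations: int, wc: str) -> float:
    """Observed citations divided by the expected (mean) citations
    of the given WC reference set."""
    expected = mean(reference_sets[wc])
    return citations / expected

# The same paper scores differently depending on its WC assignment,
# which is why category allocation matters for evaluation:
for wc in reference_sets:
    print(wc, round(normalized_citation_score(6, wc), 2))
```

The toy output makes the paper's core concern concrete: an identical citation record yields different normalized scores purely as a function of which category the journal happens to be filed under.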

Two structural problems emerge from this history. First, the classification criteria have shifted over time, meaning that the same journal can be assigned to different WCs in different periods. This temporal instability undermines the consistency of reference sets, which are assumed to be static when normalizing citation counts. Second, the involvement of human indexers introduces an “indexer effect”: the personal disciplinary background, biases, or even institutional pressures of the indexer can lead to inconsistent or erroneous category assignments. The problem is especially acute for interdisciplinary fields where journal content does not map cleanly onto the pre‑existing, monodisciplinary WC taxonomy.

To illustrate these issues, the authors conduct an empirical case study of two domains: (1) “Information Science & Library Science” (IS&LS) and (2) “Science and Technology Studies” (STS). For each domain they extract a sample of journals (120 IS&LS journals, 95 STS journals) and analyze the WC allocations. In the IS&LS sample, 35% of journals receive multiple WC assignments; these multi‑assigned journals exhibit an average citation count that is 0.27 standard deviations higher than that of singly‑assigned journals. In the STS sample, 48% of journals belong to two or more WCs, and normalized citation scores differ by 0.41 standard deviations between the multi‑assigned and singly‑assigned groups. These statistics demonstrate that WC‑based normalization introduces substantial variance unrelated to genuine scholarly impact, thereby compromising the reliability of cross‑field comparisons.
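
Statistics of this kind can be computed from a journal‑level table along the following lines. The sketch below is illustrative only: the records are invented, and the paper does not specify its exact computation, so the standardized difference here is simply the gap in group means scaled by the pooled standard deviation.

```python
# Illustrative computation of the two statistics reported above:
# the share of multi-assigned journals, and the standardized difference
# in citation scores between multi- and singly-assigned journals.
# The journal records below are invented for demonstration.
from statistics import mean, stdev

# Each journal: number of WC assignments and its mean citation score.
journals = [
    {"n_wcs": 1, "citations": 3.1},
    {"n_wcs": 2, "citations": 4.0},
    {"n_wcs": 1, "citations": 2.7},
    {"n_wcs": 3, "citations": 4.6},
    {"n_wcs": 1, "citations": 3.4},
]

multi = [j["citations"] for j in journals if j["n_wcs"] > 1]
single = [j["citations"] for j in journals if j["n_wcs"] == 1]

share_multi = len(multi) / len(journals)
pooled_sd = stdev([j["citations"] for j in journals])
std_diff = (mean(multi) - mean(single)) / pooled_sd

print(f"multi-assigned share: {share_multi:.0%}")
print(f"standardized citation difference: {std_diff:.2f} SD")
```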

The paper argues that the prevailing “best practice” is more a matter of convention than of empirical justification. The lack of transparency in how WCs are generated, the opaque weighting of multi‑assigned journals, and the failure to account for temporal drift or indexer bias all weaken the validity of rankings that depend on WC normalization. Consequently, the authors propose a shift from “best practice” to “best possible practice.” They outline three concrete methodological alternatives: (1) paper‑level topic modeling (e.g., LDA, BERTopic) to generate dynamic, fine‑grained thematic clusters; (2) citation‑network community detection to derive reference sets that reflect actual scholarly communication patterns; and (3) minimizing human intervention in the classification pipeline to reduce indexer subjectivity. By aligning reference sets with the real structure of research fields, these approaches promise more accurate normalization, especially for interdisciplinary domains where traditional WCs are blunt instruments.
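
To make alternative (2) concrete, the sketch below derives reference sets from community detection on a citation network, using networkx's greedy modularity algorithm. The tiny citation graph is invented for demonstration; the paper describes the approach at the conceptual level and does not prescribe a particular algorithm or library.

```python
# Sketch of alternative (2): deriving reference sets from citation-network
# community detection instead of fixed WCs. The miniature citation graph
# below is invented for demonstration.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Nodes are papers; an edge means one paper cites the other.
G = nx.Graph()
G.add_edges_from([
    ("p1", "p2"), ("p2", "p3"), ("p1", "p3"),   # one dense cluster
    ("p4", "p5"), ("p5", "p6"), ("p4", "p6"),   # a second cluster
    ("p3", "p4"),                               # a weak bridge between them
])

# Each detected community can serve as a dynamic reference set for
# normalization, replacing the static, journal-level WC assignment.
for i, community in enumerate(greedy_modularity_communities(G)):
    print(f"reference set {i}: {sorted(community)}")
```

Because the communities are recomputed from actual citation links, the resulting reference sets track scholarly communication patterns over time rather than a fixed taxonomy, which is precisely the property the authors argue interdisciplinary fields require.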

Finally, the authors sketch a research agenda that includes (i) systematic comparison of WC‑based and dynamically derived classifications across a broad spectrum of disciplines, (ii) validation of the impact of alternative normalizations on ranking outcomes, and (iii) development of policy guidelines for institutions and ranking agencies to adopt more robust, transparent normalization procedures. In sum, the paper provides a thorough critique of the current reliance on WoS Subject Categories for evaluative bibliometrics and offers a forward‑looking blueprint for constructing more scientifically sound and equitable evaluation metrics.

