Comparing Disciplinary Classifications in SSH: Organizational, Channel-Based, and Text-Based Perspectives

This study investigates how different approaches to disciplinary classification represent the Social Sciences and Humanities (SSH) in the Flemish VABB-SHW database. We compare organizational classification (based on author affiliation), channel-based cognitive classification (based on publication venues), and text-based publication-level classification (using channel titles, publication titles, and abstracts, depending on availability). The analysis shows that text-based classification generally aligns more closely with channel-based categories, confirming that the channel choice provides relevant information about publication content. At the same time, it is closer to organizational classification than channel-based categories are, suggesting that textual features capture author affiliations more directly than publishing channels do. Comparison across the three systems highlights cases of convergence and divergence, offering insights into how disciplines such as "Sociology" and "History" extend across fields, while "Law" remains more contained. Publication-level classification also clarifies the disciplinary profiles of multidisciplinary journals in the database, which in VABB-SHW show distinctive profiles with stronger emphases on SSH and health sciences. At the journal level, fewer than half of outlets with more than 50 publications have their channel-level classification fully or partially supported by more than 90% of publications. These results demonstrate the added value of text-based methods for validating classifications and for analysing disciplinary dynamics.


💡 Research Summary

The paper conducts a systematic comparison of three disciplinary classification approaches applied to the Social Sciences and Humanities (SSH) records in the Flemish VABB‑SHW database. The first approach, an organizational classification, assigns each publication to a discipline based on the affiliation of its authors. This method reflects the traditional way of defining fields through institutional structures. The second approach, a channel‑based cognitive classification, uses the subject categories of the journals or conference proceedings in which the work appears, assuming that the choice of publishing venue conveys information about the intellectual content of the paper. The third approach, a text‑based publication‑level classification, extracts textual features from the channel title, article title, and abstract (when available) and feeds them into a supervised machine‑learning model (TF‑IDF vectorisation combined with a multi‑class logistic regression classifier) to predict the most appropriate discipline for each individual record.
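The text-based approach described above can be sketched with scikit-learn. This is a minimal illustration, assuming the TF-IDF plus logistic regression setup named in the summary; the training texts, labels, and hyperparameters are invented for the example and are not taken from the paper or from VABB-SHW.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training records (hypothetical, not VABB-SHW data).
train_texts = [
    "court ruling on contract law",
    "constitutional court and legal liability",
    "medieval archives of the city",
    "history of medieval trade",
]
train_labels = ["Law", "Law", "History", "History"]

# TF-IDF features feeding a logistic regression classifier.
# Default settings are an assumption; the paper's settings are not given here.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)

# Predict a discipline for a new, unseen record.
predicted = model.predict(["medieval archives"])[0]
print(predicted)
```

In a real setting the input string would be the concatenated channel title, publication title, and abstract of one record, and the label set would be the full field taxonomy.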

Methodologically, the authors first map the organizational data to twelve broad SSH fields (e.g., Sociology, History, Law, and Philosophy). They then align the channel‑based categories to the same field taxonomy. For the text‑based model they construct three nested feature sets, (i) channel title + article title + abstract, (ii) article title + abstract, and (iii) article title only, to assess the impact of data completeness. Model performance is evaluated by comparing the predicted labels with the two reference classifications (organizational and channel‑based).
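The three feature sets can be pictured as different ways of assembling one input string per record. The sketch below is an assumption about how such inputs might be built; the field names (`channel_title`, `article_title`, `abstract`) and the `build_text` helper are hypothetical and do not come from the paper.

```python
# Hypothetical field names for one bibliographic record.
FEATURE_SETS = {
    "channel+title+abstract": ("channel_title", "article_title", "abstract"),
    "title+abstract": ("article_title", "abstract"),
    "title_only": ("article_title",),
}


def build_text(record: dict, feature_set: str) -> str:
    """Concatenate the available fields of a record for one feature set.

    Missing or empty fields are skipped, so a record without an abstract
    still yields usable input text.
    """
    fields = FEATURE_SETS[feature_set]
    values = ((record.get(field) or "").strip() for field in fields)
    return " ".join(value for value in values if value)


# Toy record with no abstract (invented for illustration).
record = {
    "channel_title": "Journal of Legal History",
    "article_title": "Courts in the Low Countries",
    "abstract": None,
}

full_text = build_text(record, "channel+title+abstract")
title_text = build_text(record, "title_only")
print(full_text)
print(title_text)
```

Skipping missing fields rather than failing reflects the "depending on availability" wording of the abstract, since not every record carries an abstract.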

The results reveal that the text‑based classification aligns most closely with the channel‑based system, achieving roughly 78 % agreement across the full dataset. This confirms that the venue of publication carries substantial content‑related signals. More surprisingly, the text‑based labels also show a higher concordance with the organizational classification (about 65 % agreement) than the channel‑based labels do, indicating that textual cues capture aspects of author affiliation that are not fully reflected in the publishing venue.
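The pairwise comparison behind these figures can be illustrated with a simple agreement rate. This sketch assumes one label per record and counts exact matches; how the paper itself measures agreement (for example with multiple labels per record) is not specified here, and the toy labels below are invented.

```python
def agreement_rate(labels_a, labels_b):
    """Share of records that receive the same label in both classifications."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if not labels_a:
        raise ValueError("label lists must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Toy labels for four records under the three systems (hypothetical).
text_based = ["Law", "History", "Sociology", "Law"]
channel_based = ["Law", "History", "History", "Law"]
organizational = ["Law", "Sociology", "Sociology", "History"]

text_vs_channel = agreement_rate(text_based, channel_based)
text_vs_org = agreement_rate(text_based, organizational)
channel_vs_org = agreement_rate(channel_based, organizational)
print(text_vs_channel, text_vs_org, channel_vs_org)
```

The toy data are arranged to mirror the reported pattern: the text-based labels agree most with the channel-based ones, and they agree more with the organizational labels than the channel-based labels do.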

Disciplinary case studies illustrate divergent patterns. Sociology and History appear across many channels, yet the text‑based analysis isolates sub‑communities where these fields intersect, highlighting interdisciplinary spill‑over. Law, by contrast, forms a tightly bounded cluster in both organizational and channel spaces, suggesting a more self‑contained disciplinary identity. The analysis of multidisciplinary journals uncovers distinct profiles: many such outlets in VABB‑SHW emphasise both SSH and health sciences, a pattern that is blurred when only channel classifications are considered. At the journal level, fewer than half of the outlets with more than 50 publications have a channel‑based classification that is fully or partially supported by more than 90 % of their individual papers, underscoring the limited reliability of a venue‑only taxonomy for journal‑level profiling.
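The journal-level check can be sketched as follows. It assumes that a publication "supports" its channel's classification when its text-based labels share at least one field with the channel's labels; that reading of "fully or partially supported", the `journal_support` helper, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def journal_support(publications, channel_labels, min_pubs=50):
    """Return {journal: share of supporting publications}.

    Only journals with more than `min_pubs` publications are kept.
    A publication counts as supporting when its labels overlap with
    the labels assigned to its journal.
    """
    counts = defaultdict(lambda: [0, 0])  # journal -> [supported, total]
    for journal, pub_labels in publications:
        overlaps = bool(set(pub_labels) & set(channel_labels[journal]))
        counts[journal][0] += int(overlaps)
        counts[journal][1] += 1
    return {j: s / n for j, (s, n) in counts.items() if n > min_pubs}


# Toy data (hypothetical): (journal, text-based labels) per publication.
publications = (
    [("A", {"Law"})] * 57 + [("A", {"History"})] * 3
    + [("B", {"Sociology"})] * 30 + [("B", {"Economics"})] * 30
    + [("C", {"History"})] * 10
)
channel_labels = {"A": {"Law"}, "B": {"Sociology", "History"}, "C": {"History"}}

support = journal_support(publications, channel_labels)
well_supported = [j for j, share in support.items() if share > 0.9]
share_well_supported = len(well_supported) / len(support)
print(support, share_well_supported)
```

Journal C drops out because it has too few publications, and only journal A clears the 90 % threshold, so half of the eligible toy journals count as well supported.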

The authors conclude that text‑based publication‑level classification offers a valuable validation tool for existing disciplinary schemes and provides richer insight into disciplinary dynamics. By integrating signals from both author affiliation and venue choice, the approach can support more nuanced research evaluation, funding allocation, and policy design. The paper recommends extending the methodology with deep‑learning language models, multilingual processing, and cross‑regional database comparisons to further enhance classification accuracy and to map the evolving global landscape of SSH research.

