Can Scientific Journals be Classified in terms of Aggregated Journal-Journal Citation Relations using the Journal Citation Reports?
The aggregated citation relations among journals included in the Science Citation Index provide us with a huge matrix which can be analyzed in various ways. Using principal component analysis or factor analysis, the factor scores can be used as indicators of the position of the cited journals in the citing dimensions of the database. Unrotated factor scores are exact, and the extraction of principal components can be made stepwise since the principal components are independent. Rotation may be needed for the designation, but in the rotated solution a model is assumed. This assumption can be legitimated on pragmatic or theoretical grounds. Since the resulting outcomes remain sensitive to the assumptions in the model, an unambiguous classification is no longer possible in this case. However, the factor-analytic solutions allow us to test classifications against the structures contained in the database. This will be demonstrated for the delineation of a set of biochemistry journals.
💡 Research Summary
The paper investigates whether scientific journals can be classified by exploiting the aggregated citation relations recorded in the Journal Citation Reports (JCR) for the Science Citation Index. The authors begin by describing the citation matrix that results from counting how often each journal cites every other journal. In this matrix, rows represent cited journals and columns represent citing journals; each cell contains the number of citations from the column journal to the row journal. Because the matrix is extremely high‑dimensional, the authors apply dimensionality‑reduction techniques—principal component analysis (PCA) and factor analysis—to uncover latent “citing dimensions” that structure the data.
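The cited-by-citing convention described above can be sketched in a few lines. The journal names and citation counts below are invented for illustration; the paper works with the full JCR matrix.

```python
import numpy as np

# Toy citation matrix: C[i, j] = citations from citing journal j (column)
# to cited journal i (row). Names and counts are hypothetical.
journals = ["J. Biochem A", "J. Biochem B", "Phys. Rev. X", "Phys. Lett. Y"]
C = np.array([
    [120,  80,   5,   2],   # citations received by J. Biochem A
    [ 90, 150,   3,   1],   # citations received by J. Biochem B
    [  4,   2, 200, 110],   # citations received by Phys. Rev. X
    [  1,   3, 130, 180],   # citations received by Phys. Lett. Y
])

# Each row is a cited journal's profile over the citing dimensions;
# row-normalizing makes profiles comparable across journals of different sizes.
profiles = C / C.sum(axis=1, keepdims=True)
print(profiles.round(2))
```

Even in this toy case the block structure is visible: the two biochemistry journals cite each other far more than they cite the physics journals, and vice versa.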
First, they extract unrotated principal components. These components are orthogonal (uncorrelated) and can be extracted stepwise because each component accounts for a distinct, non-overlapping portion of the total variance. The resulting unrotated factor scores are exact linear transformations of the original citation data, so no information is lost as long as all components are retained. Consequently, each journal receives a set of scores that locate it precisely in the space spanned by the principal components. However, the authors note that these scores are difficult to interpret substantively: a journal’s position may be spread across several components without a clear disciplinary meaning.
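The two properties claimed here, orthogonality of the components and loss-free reconstruction when all components are kept, can be checked directly. This is a minimal sketch on a random standardized matrix, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical standardized citation profiles: 20 journals x 6 citing dimensions.
X = rng.standard_normal((20, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via eigendecomposition of the correlation matrix.
R = X.T @ X / X.shape[0]
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]        # reorder by decreasing explained variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = X @ eigvecs                     # unrotated component scores

# Orthogonality: the score covariance matrix is diagonal.
cov = scores.T @ scores / X.shape[0]

# Exactness: retaining all components reconstructs X without loss.
X_back = scores @ eigvecs.T
```

Because the components are mutually orthogonal, extracting the first k of them stepwise gives the same result as truncating the full solution, which is what licenses the stepwise extraction described above.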
To obtain a more interpretable solution, the authors then rotate the factor solution. They discuss both orthogonal rotations (e.g., Varimax) and oblique rotations (e.g., Oblimin), explaining that rotation redistributes variance among the factors to approximate “simple structure.” In a rotated solution, each factor tends to receive high loadings from a subset of journals that share a common research focus, making it possible to label factors as disciplinary clusters (e.g., biochemistry, physics). The authors stress, however, that rotation introduces model assumptions: the analyst must decide how many factors to retain and which rotation method to use. These decisions shape the final solution, so the rotated factor scores are not purely data‑driven but contingent on the chosen model.
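The Varimax step can be sketched with Kaiser's classic iterative algorithm. The loading matrix below is invented for illustration; this is a minimal implementation, not the paper's software.

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a loading matrix L (p x k).

    Returns the rotated loadings and the rotation matrix T, with L @ T
    maximizing the variance of squared loadings within each factor.
    """
    p, k = L.shape
    T = np.eye(k)
    crit_old = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        # Gradient of the Varimax criterion (Kaiser's formulation).
        G = L.T @ (Lr**3 - (gamma / p) * Lr @ np.diag((Lr**2).sum(axis=0)))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                       # nearest orthogonal matrix to G
        crit_new = s.sum()
        if crit_new - crit_old < tol:
            break
        crit_old = crit_new
    return L @ T, T

# Hypothetical loadings: two latent fields mixed across four journals.
L = np.array([
    [0.8, 0.3],
    [0.7, 0.2],
    [0.2, 0.9],
    [0.3, 0.8],
])
L_rot, T = varimax(L)
```

Note that `T` is orthogonal, so the rotation changes the orientation of the axes but not the subspace: the rotated and unrotated solutions explain exactly the same total variance, which is why the choice among rotations is a modeling decision rather than a data-driven one.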
The paper’s central methodological argument is that classification based on rotated factors is inherently model‑dependent, and therefore cannot be regarded as unambiguous. Nonetheless, the authors propose that rotated factor solutions can serve as hypothesis‑testing tools. By comparing a rotated factor structure to an existing classification (for example, a list of journals traditionally considered “biochemistry” journals), researchers can assess how well the data‑driven structure aligns with the conventional taxonomy.
To illustrate the approach, the authors conduct an empirical case study on a set of biochemistry journals. They first compute unrotated factor scores for the entire citation matrix and observe that biochemistry journals are scattered across multiple components, offering little discriminative power. After applying a Varimax rotation, a single factor emerges that captures the majority of variance associated with the biochemistry set; most biochemistry journals load strongly on this factor while journals from other fields load weakly. The authors then cross‑validate this rotated factor against an independently compiled list of biochemistry journals. The overlap is high, confirming that the rotated factor successfully isolates the biochemistry discipline within the citation space.
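The cross-validation step described above reduces to a set comparison between the journals loading highly on the rotated factor and an independent list. A hedged sketch with made-up journal titles:

```python
# Hypothetical comparison of a rotated-factor cluster with an independently
# compiled journal list (the titles and the threshold step are invented).
factor_journals = {"Biochem J", "J Biol Chem", "FEBS Lett", "Anal Biochem"}
reference_list  = {"Biochem J", "J Biol Chem", "FEBS Lett", "Biochemistry"}

overlap = factor_journals & reference_list
jaccard = len(overlap) / len(factor_journals | reference_list)
print(sorted(overlap), round(jaccard, 2))
```

A high Jaccard overlap would support the claim that the rotated factor isolates the discipline, while the symmetric differences flag journals worth inspecting individually.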
The findings lead to several important conclusions. Unrotated components faithfully represent the raw citation structure but lack interpretability for practical classification. Rotated factors, while dependent on analyst choices, provide a clearer mapping between statistical dimensions and disciplinary boundaries, making them useful for constructing or testing journal classifications. However, because rotation imposes a theoretical model, the robustness of any classification must be examined by varying the number of factors, rotation method, and by comparing results across different subject areas.
In the discussion, the authors argue that journal classification should be viewed as a hypothesis rather than a final, immutable categorization. They recommend a mixed‑methods approach that combines quantitative factor‑analytic results with qualitative expert judgment to achieve more reliable and transparent classifications. Future research directions include testing alternative rotation schemes, extending the analysis to other scientific domains, and integrating factor‑analytic classifications with network‑based clustering techniques. By doing so, the scholarly community can develop a more nuanced, data‑grounded understanding of how journals relate to one another within the complex ecosystem of scientific communication.