Social network analysis of bibliometric data requires the underlying matrices to be recast in a network framework. In this paper we argue that a simple conservation rule requires this to be done using fractional counting only, so that conservation at the paper level is faithfully reproduced at higher levels of aggregation (author, institute, country, journal, etc.) of the complex network.
The network properties of bibliometric data were anticipated quite early in the development of bibliometrics as a field of enquiry (De Solla Price, 1965; Kessler, 1963; Small, 1973). There is now a pronounced trend towards developing more sophisticated indicators for scholarly performance evaluation (Bollen et al. 2006; Leydesdorff 2009) using social network analysis (Pinski and Narin 1976; Brin and Page 1998; Bergstrom 2007). Social network analysis of bibliometric data requires the underlying matrices to be recast in a network framework (e.g., Börner, Chen, & Boyack, 2003; Milojević, 2014; Van Eck & Waltman, 2014; Zhao & Strotmann, 2015).
A key issue in constructing a bibliometric network is whether a full counting or a fractional counting approach is to be used (Batagelj & Cerinšek, 2013; Park, Yoon, & Leydesdorff, 2016). Perianes-Rodriguez, Waltman, & van Eck (2016) argue that the fractional counting method is preferable to the full counting method. In this paper we further argue that, under a simple conservation rule, only fractional counting faithfully carries the conservation imposed at the paper level through to higher levels of aggregation (author, institute, country, journal, etc.) of the complex network.
Table 1 shows an instance where four authors (a1, a2, a3 and a4) publish three papers (p1, p2 and p3), an example taken from Leydesdorff & Park (2016). The top half of the table shows how the credit for the papers is assigned to the individual authors under the full counting scheme, and the bottom half shows this for the fractional counting scheme. Under full counting the conservation rule is violated: the paper count is inflated from 3 to 7. Under fractional counting the total is conserved, and it remains conserved at the network level, as we demonstrate below. Note that in fractional counting any credit-allocation rule can be used (an equal-credit-to-all-authors rule is shown here) as long as the conservation rule is followed for each paper, that is, each column must add up to 1. Let the authorship matrix at the paper level be designated by A, with elements $a_{ij}$ giving the contribution of author i to paper j, there being a total of I authors and J papers.
Let us represent the authorship matrix at the network level by B, where $B = AA^T$. Table 2 shows how the authorship matrix at the network level is constructed when full counting or fractional counting is used at the paper level. We see immediately that only fractional counting is able to conserve the total number of papers. Full counting has now inflated the count at the network level to 17.
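The two counting schemes can be compared with a short sketch. The author-paper assignment below is hypothetical: it is chosen only so that the full-counting totals match the values quoted in the text (7 at the paper level and 17 at the network level) and is not necessarily the assignment in Table 1. The helper names `fractionalize`, `gram` and `total` are likewise illustrative. Exact rational arithmetic is used so that the conservation check is not blurred by rounding.

```python
from fractions import Fraction


def fractionalize(full):
    """Divide each column by its sum so that every paper carries one unit of credit."""
    n_rows, n_cols = len(full), len(full[0])
    col_sums = [sum(full[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[Fraction(full[i][j], col_sums[j]) for j in range(n_cols)]
            for i in range(n_rows)]


def gram(a):
    """Return B = A A^T, i.e. b[i][k] = sum over papers j of a[i][j] * a[k][j]."""
    return [[sum(x * y for x, y in zip(row_i, row_k)) for row_k in a]
            for row_i in a]


def total(m):
    """Sum of all elements of a matrix stored as a list of rows."""
    return sum(sum(row) for row in m)


# Hypothetical full-counting matrix: rows are authors a1..a4, columns are papers p1..p3.
FULL = [
    [1, 1, 0],  # a1
    [1, 0, 1],  # a2
    [0, 1, 1],  # a3
    [0, 0, 1],  # a4
]
FRAC = fractionalize(FULL)  # equal credit to all authors of a paper

print(total(FULL), total(gram(FULL)))  # 7 17
print(total(FRAC), total(gram(FRAC)))  # 3 3
```

Under these assumptions the full-counting totals drift from 3 to 7 to 17, while the fractional totals stay at 3 at both levels.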
Another lesson that emerges from Table 2 is that the diagonal terms must be retained, and are in general non-zero, for conservation to hold. A simple thought experiment establishes this. Consider the case of I authors, each of whom is the single author of one paper, i.e. author i is the single author of paper i. The A matrix is then a diagonal matrix, and so is the B matrix; the entire paper count resides in the diagonal terms, which must therefore be kept.
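The thought experiment can be written out as a minimal sketch. The number of authors (`n_authors = 4`) and the helper name `gram` are assumptions made for illustration only.

```python
def gram(a):
    """Return B = A A^T for a matrix stored as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_k)) for row_k in a]
            for row_i in a]


n_authors = 4  # hypothetical; author i is the single author of paper i

# Single-author papers: A is diagonal with ones on the diagonal.
A = [[1 if i == j else 0 for j in range(n_authors)] for i in range(n_authors)]
B = gram(A)

diagonal_sum = sum(B[i][i] for i in range(n_authors))
off_diagonal_sum = sum(B[i][k]
                       for i in range(n_authors)
                       for k in range(n_authors) if i != k)

print(diagonal_sum, off_diagonal_sum)  # 4 0
```

If the diagonal of B were discarded, as is sometimes done when self-links are removed from a network, the four papers in this sketch would vanish from the total.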
In the next section a simple algebraic procedure shows that, whatever the number of authors I, the total number of papers remains at J after the $AA^T$ operation at the network level.
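A is an $I \times J$ matrix,

$$A = [a_{ij}], \qquad i = 1, \dots, I, \quad j = 1, \dots, J,$$

wherein the I elements of each of its J columns add up to unity, i.e.

$$\sum_{i=1}^{I} a_{ij} = 1 \quad \text{for every } j. \tag{1}$$

A new $I \times I$ matrix B is given as

$$B = AA^T, \qquad b_{ik} = \sum_{j=1}^{J} a_{ij}\, a_{kj}.$$

Thus

$$\sum_{i=1}^{I} \sum_{k=1}^{I} b_{ik} = \sum_{j=1}^{J} \Big( \sum_{i=1}^{I} a_{ij} \Big) \Big( \sum_{k=1}^{I} a_{kj} \Big) = \sum_{j=1}^{J} 1 = J. \tag{2}$$

Note that in statement (2) both orders of the indices, ik and ki (the two possible permutations for $k \neq i$), are accommodated. Since the matrix B is symmetric,

$$b_{ik} = b_{ki},$$

only one combination of the indices i and k (without change of order) need be retained (i.e. either $k > i$ or $k < i$), with a factor 2 incorporated in the appropriate expressions:

$$\sum_{i=1}^{I} b_{ii} + 2 \sum_{i=1}^{I} \sum_{k > i} b_{ik} = J. \tag{3}$$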
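The algebraic result can also be checked numerically for arbitrary credit rules, not only equal credit. The sketch below is an illustration under stated assumptions: the function names, the random integer weights and the seeds are all hypothetical, and the only constraint imposed is the one stated in the text, namely that each column of A sums to 1. It verifies both that all elements of $AA^T$ sum to J and that the diagonal plus twice the upper triangle sums to J.

```python
import random
from fractions import Fraction


def random_credit_matrix(n_authors, n_papers, rng):
    """Random non-negative weights, normalised so that each paper's column sums to 1."""
    raw = [[rng.randint(0, 5) for _ in range(n_papers)] for _ in range(n_authors)]
    for j in range(n_papers):
        if all(raw[i][j] == 0 for i in range(n_authors)):
            raw[rng.randrange(n_authors)][j] = 1  # every paper needs at least one author
    col_sums = [sum(raw[i][j] for i in range(n_authors)) for j in range(n_papers)]
    return [[Fraction(raw[i][j], col_sums[j]) for j in range(n_papers)]
            for i in range(n_authors)]


def gram(a):
    """Return B = A A^T for a matrix stored as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_k)) for row_k in a]
            for row_i in a]


def check(n_authors, n_papers, seed=0):
    """True if both forms of the conservation identity hold exactly."""
    rng = random.Random(seed)
    b = gram(random_credit_matrix(n_authors, n_papers, rng))
    full_sum = sum(sum(row) for row in b)
    diagonal = sum(b[i][i] for i in range(n_authors))
    upper = sum(b[i][k] for i in range(n_authors) for k in range(i + 1, n_authors))
    return full_sum == n_papers and diagonal + 2 * upper == n_papers


print(all(check(i, j, seed=s)
          for i in (1, 4, 9) for j in (1, 3, 12) for s in range(5)))  # True
```

Because the weights are exact fractions, the check is an equality test rather than a tolerance comparison; it illustrates the identity but does not replace the proof.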
When graph-theoretic procedures from social network analysis are applied to bibliometric data, one must take care to maintain the conservation principle. In this paper we have shown, with a worked example as well as a formal proof, that the conservation rule requires fractional counting to be used, so that conservation at the paper level is faithfully reproduced at higher levels of aggregation (author, institute, country, journal, etc.) of the complex network.