Empirical Evidence for the Relevance of Fractional Scoring in the Calculation of Percentile Rank Scores

Michael Schreiber
Institute of Physics, Chemnitz University of Technology, 09107 Chemnitz, Germany
E-mail: schreiber@physik.tu-chemnitz.de

Abstract

Fractional scoring has been proposed to avoid inconsistencies in the attribution of publications to percentile rank classes. Uncertainties and ambiguities in the evaluation of percentile ranks can be demonstrated most easily with small datasets. But for larger datasets an often large number of papers with the same citation count leads to the same uncertainties and ambiguities, which can be avoided by fractional scoring. This is demonstrated for four different empirical datasets with several thousand publications each, which are assigned to 6 percentile rank classes. Only by utilizing fractional scoring does the total score of all papers exactly reproduce the theoretical value in each case.

Introduction

Leydesdorff, Bornmann, Mutz, and Opthof (2011) proposed percentile-based indicators for the evaluation of publications based on their position within a given citation distribution. The basic idea is to assign publications to a percentile rank (PR) class and then attribute a weight according to the PR class to determine the score of the publication. There is no unique way of assigning papers to PR classes, and different suggestions have been presented (Hyndman & Fan, 1996; Sheskin, 2007; Leydesdorff et al., 2011; Rousseau, 2012; Pudovkin & Garfield, 2009). All these proposals can lead to inconsistencies in the behavior of the calculated citation impact indicators (Schreiber, 2012b). Leydesdorff and Bornmann (2012) are afraid that the discussion "may have opened a box of Pandora allowing for generating a parameter space of other possibilities". Most of the problems are created by assigning the same PR class and thus the same (usually integer) weight to a large number of papers with the same citation count.
A small change in the citation distribution can then shift all these tied papers from one PR class to the other and thus produce large changes in the scoring (Schreiber, 2012b; Waltman & Schreiber, 2013). I have recently proposed to average the weights of the tied papers and to assign the average weight to all tied papers (Schreiber, 2012b). For several example sets with a small number of publications it was demonstrated that in this way the mentioned inconsistencies can be avoided nearly completely. A small remaining imperfection could be traced to the discretization of the PR classes. Therefore I suggested fractional scoring (Schreiber, 2012b) as a new scoring rule and as the final solution of the problem. In this fractional scoring scheme the publications at the border between two different PR classes are shared between the respective PR classes and attributed fractional weights corresponding to their shares. This approach has been elaborated and applied to the calculation of the indicator counting the top 10% most frequently cited papers (Waltman & Schreiber, 2013). There it was shown in a formal mathematical framework that this fractional scoring exactly reproduces the theoretical value for the total score of all papers. Some simple examples, again for a small number of papers, have been analyzed (Schreiber, 2012c), showing that the previously criticized uncertainties and ambiguities in the evaluation of PR classes do not occur in the fractional scoring scheme. In a recent letter Bornmann (2012) has discounted the fractional scoring approach and other procedures for the calculation of PR scores, claiming that "the differences between the various methods of the PR score calculation proposed might be of little and no practical consequence" when using large datasets.
Indeed the previously presented examples (Schreiber, 2012b, 2012c; Waltman & Schreiber, 2013) comprised small numbers of publications; this, however, was due to the fact that the problems and their solution could be demonstrated most clearly with such small datasets. Waltman and Schreiber (2013) already mentioned that these problems and their solution are of empirical relevance also for large datasets of several ten thousand publications, because a sizable number of tied papers occurred at the threshold of the top 10% most frequently cited publications in various fields. It is the purpose of the present investigation to demonstrate these problems quantitatively for different empirical datasets, thus showing the relevance of fractional scoring.

Different scoring rules and their application to the first example

PR classes and thresholds

All evaluations below are performed for the case of 6 PR classes, yielding the I3(6) indicator for the total score or, respectively, the R(6) = I3(6)/N indicator for the relative score, normalizing I3 with the total number N of publications in the dataset, as proposed by Bornmann and Mutz (2011). The 6 PR classes distinguish the bottom 50%, 50%-75%, 75%-90%, 90%-95%, 95%-99%, and top 1% publications, determined according to the number of citations which the papers received in a given time interval. As a first example I have evaluated the publication data of 26 physicists from my home Institute of Physics at Chemnitz University of Technology, which I had previously investigated (Schreiber, 2008a, 2009, 2010a). The data were collected from the Web of Science in January and February 2007 and comprise 2373 publications with a total number of 25554 citations. This is considered here as the reference set for an evaluation of the individual scientists.
For the purpose of the present study it is sufficient to analyze the complete set without caring about the attribution of the papers to the different researchers. But the idea is that these researchers can be evaluated in comparison with this reference set. Usually a larger reference set is used. Nevertheless the present approach refers only to papers of the same field (physics) and the same document type (article) and thus provides a reasonably homogeneous sample. In principle one should further distinguish different publication years in the reference set. This is not realized in the present investigation, because then the reference sets per year would become rather small. However, as the purpose of the present study is not the evaluation and comparison of the individual scientists, but rather to show the significance of tied publications in the application of different scoring schemes, the distinction of the publication years is not urgent.

The boundaries p_k of the above mentioned 6 percentage intervals are listed in Table 1; after sorting the publications according to their numbers of citations one can easily determine the numbers of citations which the publications at these thresholds have received. These numbers are also given in Table 1, as well as the numbers of publications below, at, and above the thresholds. It is well known that citation distributions are usually strongly skewed. Therefore it is not surprising that the 50% threshold is already reached with only 4 citations. 126 publications with 4 citations occur in the dataset, which means that we have 5.31% of all publications at this threshold, see also Table 1. At the 75% boundary we find 50 publications with 12 citations each, i.e., 2.10% of all publications exactly at the threshold. But even at the 90% boundary, where 25 citations are needed, we still have 9 tied papers at the threshold, i.e., 0.37%.
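The determination of the threshold citation counts and of the tied papers can be sketched in a few lines of Python. The toy citation list below is invented for illustration; it is not the data of Table 1, but applying the same procedure to the 2373 sorted publications yields the counts listed there.

```python
import math

# Toy citation counts, sorted ascending; invented for illustration only.
citations = [0, 0, 1, 2, 2, 2, 3, 4, 5, 8]
N = len(citations)

for p in (0.50, 0.75, 0.90):
    # The paper at rank ceil(p*N) is the first one at or beyond the threshold.
    c = citations[math.ceil(p * N) - 1]         # citation count at the threshold
    below = sum(1 for x in citations if x < c)  # papers below the threshold
    at = sum(1 for x in citations if x == c)    # tied papers at the threshold
    above = N - below - at                      # papers above the threshold
    print(f"{p:.0%} threshold: {c} citations ({below} below, {at} at, {above} above)")
```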
These numbers already indicate the problem, namely how to assign these publications at the threshold to the PR classes.

PLEASE INSERT TABLE 1 HERE

Counting items with lower citation rates

In agreement with the proposal of Leydesdorff et al. (2011), Leydesdorff and Bornmann (2011) have applied the "counting rule that the number of items with lower citation rates than the item under study determines the percentile" (p. 2137). This means that all the tied papers at a border are included in the lower PR class.[1] Accordingly the factual threshold is always above the theoretical value, as denoted in Table 1. In the present example this leads to factual interval boundaries between the PR classes at 50.36%, 76.44%, 90.05%, 95.15%, and 99.03%, see Table 1. Although these values are not too different from the theoretical thresholds, their distance varies enough to yield strong fluctuations in the percentages of publications which fall into the thus determined PR classes. In the present case notably the third PR class is somewhat underoccupied, with 13.61% instead of 15% of the publications. Weighting the percentage of publications with the interval number k yields the corresponding contribution to R(6) as given in Table 1, so that the total score of all publications is 188.97% = 1.8897. This is somewhat below the theoretical expectation value of

R(6) = 50% * 1 + 25% * 2 + 15% * 3 + 5% * 4 + 4% * 5 + 1% * 6 = 191.00% = 1.9100.   (1)

Counting items with lower or equal citation rates

Rousseau (2012) has suggested to include the item under study into the number of items to compare with. This effectively means that in the counting rule not only items with lower citation rates but items with lower or equal citation rates are taken into account. As a consequence, all the tied publications at the threshold are now always included in the higher PR class, so that the factual threshold is always below the theoretical value, as denoted in Table 1.
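The two integer counting rules can be made concrete in a short sketch; the toy dataset is invented, and the point is only that all tied papers switch PR classes together depending on which rule is chosen.

```python
def quantile_lower(citations, c):
    # Leydesdorff-Bornmann rule: count only items with strictly lower citation rates.
    return sum(1 for x in citations if x < c) / len(citations)

def quantile_lower_or_equal(citations, c):
    # Rousseau rule: include the item under study, i.e. count lower-or-equal rates.
    return sum(1 for x in citations if x <= c) / len(citations)

# N = 10 toy papers, four of them tied at 2 citations (invented data).
citations = [0, 1, 2, 2, 2, 2, 3, 5, 7, 9]

# Under the first rule the four tied papers sit at quantile 0.2 (bottom 50% class);
# under the second rule they all jump to quantile 0.6 (the 50%-75% class).
print(quantile_lower(citations, 2))           # 0.2
print(quantile_lower_or_equal(citations, 2))  # 0.6
```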
In the present example this leads to a strong deviation in particular for the first PR class, which now comprises only 45.05% instead of 50% of all publications. This is counterbalanced by a high occupation of 29.29% instead of 25% for the second PR class. In summary, the contributions to R(6) yield a total value of 1.9671, much higher than the theoretical expectation value (1).

Middle of the uncertainty interval or average percentiles

The difference between the factual thresholds in the two discussed approaches is the uncertainty interval defined by Leydesdorff (2012). Following Leydesdorff (2012) one could utilize the middle of that interval to categorize the publications at the threshold:[2] if the middle is below the threshold, then all tied papers are attributed to the lower PR class; if it is above the threshold, all papers fall into the higher PR class. In principle there could be an ambiguity, because the middle of the interval might be exactly equal to the border, but this case is extremely unlikely, because it would require not only that the theoretical boundary corresponds to an integer value of publications, but also that the tied publications at the threshold are symmetrically distributed on both sides of the border. Effectively, using the middle of the uncertainty interval is very similar to the approach of Pudovkin and Garfield (2009), who average the percentiles of the tied publications at the threshold. In order to avoid a distracting discussion about different ways of rounding percentages to integer percentile values, I have used the rational numbers n/N for the n-th paper to determine the average quantile of the publications at the threshold as given in Table 1. The deviation from the middle of the uncertainty interval always amounts to exactly 1/2N and does not have an influence on the subsequent evaluation in the present example. The averaged quantile is now utilized according to Pudovkin and Garfield
(2009) in the same way as Leydesdorff (2012) used the middle of the uncertainty interval described in the previous paragraph: if the average quantile is below/above the threshold, then all tied papers are attributed to the lower/upper PR class. The resulting factual threshold can therefore now lie below or above the PR class boundary, see Table 1. The deviations of the resulting percentages of publications in the various PR classes are somewhat smaller than in the previously discussed approaches and accordingly, after weighting the percentages with the interval number k, i.e., with the PR class number, the total score of R(6) = 1.9128 is rather close to the theoretical expectation value (1).

[1] The same method is utilized in a recent study comparing different universities (Bornmann, 2012a).
[2] For a single manuscript this middle of the uncertainty interval corresponds to the rule of Hazen (1914). It can be interpreted as the simplest linear interpolation between the boundaries of the uncertainty interval. Other proposals for the determination of percentiles often use slightly modified linear interpolation schemes (Gringorten, 1963; Cunnane, 1978; Harter, 1984). The quantile difference between these schemes for a single publication is smaller than 1/N, but the main question of the present investigation remains open, namely how to treat tied papers.

Average weights

I have previously proposed (Schreiber, 2012b) to average not the percentiles of the tied papers, but rather to determine the individual weights and to average these weights. The average (non-integer) weight should then be given to each of the tied papers.
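A minimal sketch of this averaged-weights rule, under the assumption (one plausible reading of the text) that the quantile of the n-th ranked paper is taken as n/N; the class boundaries and weights are those of the 6-class scheme above, and the toy citation data are invented.

```python
BOUNDS = [0.50, 0.75, 0.90, 0.95, 0.99, 1.00]   # upper boundaries of PR classes 1..6

def pr_weight(q):
    # Integer weight k of the PR class containing quantile q (0 < q <= 1).
    for k, b in enumerate(BOUNDS, start=1):
        if q <= b:
            return k

citations = [0, 1, 2, 2, 2, 2, 3, 5, 7, 9]      # sorted ascending, N = 10 (toy data)
N = len(citations)

avg_weight = {}
for c in set(citations):
    ranks = [n + 1 for n, x in enumerate(citations) if x == c]   # 1-based ranks of ties
    ws = [pr_weight(r / N) for r in ranks]
    avg_weight[c] = sum(ws) / len(ws)           # shared (non-integer) average weight

# The four papers tied at 2 citations have quantiles 0.3, 0.4, 0.5, 0.6, i.e.
# integer weights 1, 1, 1, 2, so each receives the average weight 1.25.
print(avg_weight[2])    # 1.25
```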
Effectively this means that the tied papers are shared by the two PR classes according to the numbers of tied papers below and above the border. Therefore the factual thresholds in this scheme are very close to the theoretical values, see Table 1. These factual thresholds can be obtained by rounding p_k N to the next higher integer number, i.e., by the ceiling function, and then dividing by N. It is only due to the discretization that these values still deviate from the theoretical boundaries, and the deviation must always be smaller than 1/N. Consequently the percentage of publications in each of the 6 PR classes is close to the theoretical distribution, and the total score of R(6) = 1.9090 is also very close to 1.91. I note that for this evaluation I have sorted the papers by number of citations and determined the quantile of each publication by the number of papers with a lower rank. Thus the factual threshold will always be (slightly) above the theoretical boundary value. If I included the ranked paper in the paper count, then the factual threshold would be given by rounding p_k N to the nearest lower integer number, i.e., by the floor function, divided by N. Therefore it would always be (slightly) smaller than the theoretical boundary value, namely exactly 1/N below the values given in Table 1. In this case one obtains a total score of 1.9111.

Fractional scoring

Although these deviations are agreeably small, I find them still irritating and thus the approach not really satisfactory. As mentioned in the introduction, the final solution is given by the fractional scoring scheme. In that method each publication is attributed a percentage interval, which for an individual publication corresponds to the uncertainty interval mentioned above. It is determined by ranking the publications according to their citations without regarding tied publications, i.e., giving the tied publications a random order.
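Before elaborating the interval picture, the ceiling- and floor-function thresholds of the averaged-weights scheme described above can be checked directly; with N = 2373, the size of the first reference set, the ceiling variant reproduces the factual thresholds 50.02%, 75.01%, 90.01%, 95.03%, and 99.03% quoted in Table 1 (a sketch using only the standard library).

```python
import math

N = 2373                                   # size of the first reference set
for p in (0.50, 0.75, 0.90, 0.95, 0.99):
    ceil_thr = math.ceil(p * N) / N        # quantiles counted by papers of lower rank
    floor_thr = math.floor(p * N) / N      # ranked paper included in the paper count
    print(f"p_k = {p:.0%}: {100 * ceil_thr:.2f}% (ceiling) vs {100 * floor_thr:.2f}% (floor)")
```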
The i-th paper covers the interval from (i-1)/N to i/N. Each publication will thus be attributed an interval of length 1/N. The interval of a paper exactly at the threshold, in the present case for example the 1187th paper at the 50% boundary, is fractionalized into the part below the boundary (here one half of the 1187th paper) and the part above it (its other half). These fractions need not be equal, and in general at the other boundaries they will not be equal. They are then utilized to determine the average weight for this paper, and this is then used in the average over the weights of all the tied papers. Equivalently, one can also first aggregate the intervals of all tied papers and thus start with one interval for all the tied papers at a boundary. The thus accumulated interval agrees with the uncertainty interval for tied papers discussed above. Now one has to fractionalize this interval into a part below and a part above the boundary (Waltman & Schreiber, 2013). The score is then determined by the weighted average of these two fractions. Because the summations involved may be exchanged, this weighted average is exactly equal to the average weight of the tied papers determined above, where only the paper at the border was fractionalized. Conceptually the latter approach of considering one comprehensive interval for all tied papers is more attractive, because it allows us to separate the attribution of the papers to different PR classes from the scoring. Now all tied papers are treated in the same manner, namely they are all counted fractionally in both PR classes below and above the border. One can visualize this fractionalization by considering the overlap between the uncertainty intervals for the publications and the percentile intervals for the PR classes (Waltman & Schreiber, 2013; Schreiber, 2012c).
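The overlap picture translates directly into code. The following sketch uses exact rational arithmetic via the standard fractions module; the class boundaries and weights are those of the 6-class scheme above. Each paper is scored by the overlap of its interval from (i-1)/N to i/N with the six percentile intervals, and the total score is exactly 1.91 for any N, independently of ties, since the order of tied papers only redistributes the same overlaps.

```python
from fractions import Fraction as F

BOUNDS = [F(0), F(1, 2), F(3, 4), F(9, 10), F(19, 20), F(99, 100), F(1)]
WEIGHTS = [1, 2, 3, 4, 5, 6]               # weight of PR class k = 1..6

def fractional_score(i, N):
    # Score of the i-th ranked paper: its interval of length 1/N is split
    # among the PR classes in proportion to the overlaps.
    lo, hi = F(i - 1, N), F(i, N)
    s = F(0)
    for k in range(6):
        overlap = max(F(0), min(hi, BOUNDS[k + 1]) - max(lo, BOUNDS[k]))
        s += WEIGHTS[k] * overlap * N      # overlap is a fraction of the length 1/N
    return s

N = 10                                     # any N gives the same total
R6 = sum(fractional_score(i, N) for i in range(1, N + 1)) / N
print(R6)                                  # 191/100, i.e. exactly 1.9100
```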
This fractional counting means that the shares of publications below and above the boundary correspond exactly to the theoretical values, so that the perfect result of R(6) = 1.9100 given in (1) is achieved for the total score, as denoted in Table 1.

PLEASE INSERT TABLE 2 HERE

Another example: Highly cited researchers

As a second example I use the citation data which I harvested in July 2007 from the Web of Science for 8 highly cited physicists and which I have investigated in a different context previously (Schreiber, 2008b, 2010b). As above, the search comprised only articles in the field of physics. Again the complete dataset is considered as the reference set, and it is beyond the purpose of the present analysis to evaluate the individual scientists with respect to this reference set. The full dataset comprises 3354 publications with a total number of 279027 citations. Thus with 83.2 citations per publication the citation count is much higher than in my first example. Likewise the citation thresholds for the different PR classes are much higher; for example, 22 citations are necessary to reach the 50% boundary, see Table 2. Nevertheless, there are 39 papers tied at this threshold, which means 1.16%. The factual thresholds for the 4 different counting rules discussed above are presented in Table 2, and in this case the percentages of publications in the 6 PR classes do not deviate so strongly from the ideal distribution. Nevertheless, deviations of about 1% do occur, and the total score ranges from R(6) = 1.8962 to 1.9129. These deviations are much smaller than in the previous example, but again I find them irritating. In any case, they are unnecessary, because the fractional scoring scheme again avoids any deviation, as shown in the last line of Table 2.

PLEASE INSERT TABLE 3 HERE

Two further examples: Publications in a journal

Journal sets are recommended as reference sets (Bornmann, 2012).
For this reason I present in Table 3 my evaluation of all articles that have appeared in the physics journal EPL in the years 2007-2010. I have determined the total number of 20997 citations for these 3203 papers from the Web of Science in January 2012 and utilized the data in a different context (Schreiber, 2012a). On average these papers have acquired 6.6 citations each, and it is therefore not surprising that the 50% boundary is already reached with 3 citations, 7 citations are sufficient for the 75% boundary, and 90% of the papers have no more than 14 citations. Consequently, 10.02%, 3.47%, and 1.40% of all the publications can be found at these thresholds, many more than in the previous examples. Thus the factual thresholds deviate much more strongly from the boundary values than in the previous examples, and therefore also the numbers of publications in the different PR classes are often far from the theoretical distribution. As a result the total score ranges from R(6) = 1.8517 to R(6) = 2.0066. In this case the total score for the average-percentile approach is below the theoretical value (1), in contrast to the first two cases where it was above 1.9100. This shows that the average-percentiles method of Pudovkin and Garfield (2009) as well as the middle-of-the-uncertainty-interval procedure of Leydesdorff (2012) can deviate in either direction from the ideal value, while the other approaches always lead to results either below (Leydesdorff and Bornmann, 2011) or above (Rousseau, 2012) the total score (1) of R(6) = 1.9100. Fractionalizing the paper counts of the tied papers (Schreiber, 2012b) again yields a value very close to the ideal total score (1.9095, see Table 3, or 1.9110 if the ranked paper is included in the paper count), but only the fractional scoring scheme reproduces the theoretical expectation value (1) exactly.
PLEASE INSERT TABLE 4 HERE

For the mentioned study (Schreiber, 2012a) I had also investigated all articles published in Europhysics Letters from 1999 until 2006. Europhysics Letters was rebranded EPL in 2007, so this is the same journal as discussed in the previous paragraph. Including the 4-year period of EPL analyzed above, altogether within these 12 years the journal has published 7553 papers, which have been cited 87418 times as determined from the Web of Science in January 2012. Of course the older publications had much more time to be cited and therefore acquired more citations than the papers from the last 4 years evaluated in Table 3. On average there are now 11.6 citations per publication, and this is also reflected in Table 4, where the numbers of citations at the PR class borders are about twice as large as in Table 3. Nevertheless the numbers of publications exactly at all these thresholds are somewhat larger. Due to the increased total number of papers, however, the percentages of publications at the thresholds are only about half as large as in Table 3. But the respective values are comparable with the percentages given in Table 1 for the first example, and so are the deviations of the factual thresholds from the ideal boundaries and the deviations of the percentages of publications in the various PR classes from the theoretical distribution. In this case the average quantiles of the publications at all thresholds are always above the boundary, so that the tied papers are always attributed to the higher PR class. Thus for this dataset the average-quantile approach yields exactly the same results as the method in which the investigated items are included in the paper count for the determination of the percentile. Due to the larger number of publications the deviation of the averaged-weights approach is smaller than in the previous examples, namely R(6) = 1.9098 (or 1.9192 if the ranked paper is included in the paper count).
For completeness, the last line in Table 4 again shows that fractional scoring leads to the perfect outcome (1).

Concluding remarks

With 4 examples empirical evidence has been given that the determination of PR scores can be problematic not only for small publication sets. Due to the large numbers of tied publications which usually occur at the boundaries between the PR classes, the differences between the various methods for attributing the tied papers to different PR classes, and thus for the calculation of the PR scores, are indeed of more than "little or no practical consequence", in contrast to the claim made by Bornmann (2012). I found as many as 10% of the total number of publications at the 50% threshold in Table 3. Even at the 90% boundary there were as many as 1.40% of all publications. A similar observation was made by Waltman and Schreiber (2013), where between 0.4% and 3.6% of all publications were found at the 90% boundary in 7 even larger datasets with up to 42749 publications. Thus the values presented here can be expected to be representative. Using even larger reference sets only means that more papers are tied at the thresholds. There is no reason to believe that the share of tied papers decreases. Thus the problems discussed here occur even in very large reference sets. Likewise, on average the individual publication sets, which are to be evaluated in comparison with the reference set, can also be expected to have a similar share of tied papers with the same citation numbers at the threshold, unless these publication sets are very small. Thus it follows from the present investigation that the results of an evaluation will certainly be different if the various scoring schemes are applied. However, it remains an open question whether this leads to significant changes in the ranking as long as the same reference set is used. Differences can be expected when different reference sets have to be utilized, e.g.
in the comparison of publication sets for different fields (Waltman & Schreiber, 2013). In conclusion, the differences between the various scoring methods are indeed relevant and do have practical consequences. Therefore fractional scoring is strongly recommended.

References

Bornmann, L. (2012). The problem of percentile rank scores used with small reference sets. Journal of the American Society for Information Science and Technology, (in press).

Bornmann, L. (2012a). How to analyse percentile impact data meaningfully in bibliometrics: The statistical analysis of distributions, percentile rank classes and top-cited papers. (in preparation).

Bornmann, L., & Mutz, R. (2011). Further steps towards an ideal method of measuring citation performance: The avoidance of citation (ratio) averages in field-normalization. Journal of Informetrics, 5(1), 228-230.

Cunnane, C. (1978). Unbiased plotting positions – A review. Journal of Hydrology, 37(3-4), 205-222.

Gringorten, I.I. (1963). A plotting rule for extreme probability paper. Journal of Geophysical Research, 68(3), 813-814.

Harter, H.L. (1984). Another look at plotting positions. Communications in Statistics – Theory and Methods, 13(13), 1613-1633.

Hazen, A. (1914). Storage to be provided in impounding reservoirs for municipal water supply. Transactions of the American Society of Civil Engineers, 77, 1539-1640.

Hyndman, R.J., & Fan, Y.N. (1996). Sample quantiles in statistical packages. American Statistician, 50(4), 361-365.

Leydesdorff, L. (2012). Accounting for the uncertainty in the evaluation of percentile ranks. Journal of the American Society for Information Science and Technology, (in press).

Leydesdorff, L., & Bornmann, L. (2011). Integrated impact indicators compared with impact factors: An alternative research design with policy implications. Journal of the American Society for Information Science and Technology, 62(11), 2133-2146.

Leydesdorff, L., Bornmann, L., Mutz, R., & Opthof, T. (2011).
Turning the tables on citation analysis one more time: Principles for comparing sets of documents. Journal of the American Society for Information Science and Technology, 62(7), 1370-1381.

Leydesdorff, L., & Bornmann, L. (2012). Percentile ranks and the integrated impact indicator (I3). Journal of the American Society for Information Science and Technology, (in press).

Pudovkin, A.I., & Garfield, E. (2009). Percentile rank and author superiority indexes for evaluating individual journal articles and the author's overall citation performance. Collnet Journal of Scientometrics and Information Management, 3(2), 3-10.

Rousseau, R. (2012). Basic properties of both percentile rank scores and the I3 indicator. Journal of the American Society for Information Science and Technology, 63(2), 416-420.

Schreiber, M. (2008a). An empirical investigation of the g-index for 26 physicists in comparison with the h-index, the A-index, and the R-index. Journal of the American Society for Information Science and Technology, 59(9), 1513-1522.

Schreiber, M. (2008b). To share the fame in a fair way, h_m modifies h for multi-authored manuscripts. New Journal of Physics, 10, 040201.

Schreiber, M. (2009). The influence of self-citation corrections and the fractionalised counting of multi-authored manuscripts on the Hirsch index. Annalen der Physik (Berlin), 18(9), 607-621.

Schreiber, M. (2010a). Twenty Hirsch index variants and other indicators giving more or less preference to highly cited papers. Annalen der Physik (Berlin), 522(8), 536-554.

Schreiber, M. (2010b). Revisiting the g-index: the average number of citations in the g-core. Journal of the American Society for Information Science and Technology, 61(1), 169-174.

Schreiber, M. (2012a). Seasonal bias in editorial decisions for a physics journal: you should write when you like, but submit in July. Learned Publishing, 25(2), 145-151.

Schreiber, M. (2012b).
Inconsistencies of recently proposed citation impact indicators and how to avoid them. Journal of the American Society for Information Science and Technology, 63(10), 2062-2073.
Schreiber, M. (2012c). Uncertainties and ambiguities in percentiles and how to avoid them. Journal of the American Society for Information Science and Technology.
Sheskin, D. (2007). Handbook of parametric and nonparametric statistical procedures (4th ed.). Boca Raton, FL, USA: Chapman & Hall/CRC.
Waltman, L., & Schreiber, M. (2013). On the calculation of percentile-based bibliometric indicators. Journal of the American Society for Information Science and Technology, 64(2), 372-379.

Table 1. Evaluation of the citation records of 26 researchers from the Institute of Physics at Chemnitz University of Technology, determining the contributions to the 6 PR classes in the R(6) indicator.

Percentile interval k                   0       1       2       3       4       5       6   total
Threshold p_k                          0%     50%     75%     90%     95%     99%    100%
No. citations at threshold              0       4      12      25      43     104     457
No. pubs. below threshold               0    1069    1764    2128    2254    2349    2372
No. pubs. at threshold                477     126      50       9       4       1       1
No. pubs. above threshold            1896    1178     559     236     115      23       0
% pubs. below threshold              0.00   45.05   74.34   89.68   94.99   98.99   99.96
% pubs. at threshold                20.10    5.31    2.10    0.37    0.16    0.04    0.04
% pubs. above threshold             79.90   49.64   23.56    9.95    4.85    0.97    0.00
Factual threshold (LB)               0.00   50.36   76.44   90.05   95.15   99.03  100.00
Factual threshold (R)                0.00   45.05   74.34   89.68   94.99   98.99  100.00
Av. quantile of pubs. at threshold          47.72   75.41   89.89   95.09   99.03  100.00
Factual threshold (PG)               0.00   50.36   74.34   90.05   94.99   98.99  100.00
Factual threshold (S)                0.00   50.02   75.01   90.01   95.03   99.03  100.00
% pubs. in k-th PR class (LB)               50.36   26.08   13.61    5.10    3.88    0.97  100.00
% pubs. in k-th PR class (R)                45.05   29.29   15.34    5.31    4.00    1.01  100.00
% pubs. in k-th PR class (PG)               50.36   23.98   15.72    4.93    4.00    1.01  100.00
% pubs. in k-th PR class (S)                50.02   24.99   15.00    5.01    4.00    0.97  100.00
% pubs. in k-th PR class (WS)               50.00   25.00   15.00    5.00    4.00    1.00  100.00
Contribution to R(6) (LB)                   50.36   52.16   40.83   20.40   19.40    5.82  188.97
Contribution to R(6) (R)                    45.05   58.58   46.02   21.24   20.00    5.82  196.71
Contribution to R(6) (PG)                   50.36   47.96   47.16   19.72   20.02    6.07  191.28
Contribution to R(6) (S)                    50.02   49.98   45.01   20.06   20.02    5.82  190.90
Contribution to R(6) (WS)                   50.00   50.00   45.00   20.00   20.00    6.00  191.00

Note. The abbreviations LB, R, PG, S, WS refer to the different scoring schemes by Leydesdorff and Bornmann (2011), Rousseau (2012), Pudovkin and Garfield (2009), Schreiber (2012b), and Waltman and Schreiber (2013), respectively.

Table 2. Same as Table 1, but for 8 highly cited physicists; note that some rows from Table 1 are left out, because they are less important for the discussion.

Percentile interval k              0       1       2       3       4       5       6   total
Threshold p_k                     0%     50%     75%     90%     95%     99%    100%
No. citations at threshold         0      22      63     171     354    1102    4192
No. pubs. below threshold          0    1674    2512    3016    3186    3320    3353
No. pubs. at threshold           384      39      12       3       1       1       1
No. pubs. above threshold       2970    1641     830     335     167      33       0
% pubs. at threshold           11.45    1.16    0.36    0.09    0.03    0.03    0.03
Factual threshold (LB)          0.00   51.07   75.25   90.01   95.02   99.02  100.00
Factual threshold (R)           0.00   49.91   74.90   89.92   94.99   98.99  100.00
Factual threshold (PG)          0.00   49.91   74.90   90.01   94.99   98.99  100.00
Factual threshold (S)           0.00   50.00   75.01   90.01   95.02   99.02  100.00
% pubs. in k-th PR class (LB)          51.07   24.18   14.76    5.01    4.00    0.98  100.00
% pubs. in k-th PR class (R)           49.91   24.99   15.03    5.07    4.00    1.01  100.00
% pubs. in k-th PR class (PG)          49.91   24.99   15.12    4.98    4.00    1.01  100.00
% pubs. in k-th PR class (S)           50.00   25.01   15.00    5.01    4.00    0.98  100.00
Contribution to R(6) (LB)              51.07   48.36   44.28   20.04   19.98    5.90  189.62
Contribution to R(6) (R)               49.91   49.97   45.08   20.27   19.98    6.08  191.29
Contribution to R(6) (PG)              49.91   49.97   45.35   19.92   19.98    6.08  191.20
Contribution to R(6) (S)               50.00   50.03   44.99   20.04   19.98    5.90  190.94
Contribution to R(6) (WS)              50.00   50.00   45.00   20.00   20.00    6.00  191.00

Table 3. Same as Table 2, but for all publications in the physics journal EPL from 2007 until 2010.

Percentile interval k              0       1       2       3       4       5       6   total
Threshold p_k                     0%     50%     75%     90%     95%     99%    100%
No. citations at threshold         0       3       7      14      20      45     444
No. pubs. below threshold          0    1395    2320    2876    3029    3171    3202
No. pubs. at threshold           508     321     111      45      17       2       1
No. pubs. above threshold       2695    1487     772     282     157      30       0
% pubs. at threshold           15.86   10.02    3.47    1.40    0.53    0.06    0.03
Factual threshold (LB)          0.00   53.57   75.90   91.20   95.10   99.06  100.00
Factual threshold (R)           0.00   43.55   72.43   89.79   94.57   99.00  100.00
Factual threshold (PG)          0.00   53.57   75.90   89.79   95.10   99.00  100.00
Factual threshold (S)           0.00   50.02   75.02   90.01   95.00   99.00  100.00
% pubs. in k-th PR class (LB)          53.57   22.32   15.30    3.90    3.97    0.94  100.00
% pubs. in k-th PR class (R)           43.55   28.88   17.36    4.78    4.43    1.00  100.00
% pubs. in k-th PR class (PG)          53.57   22.32   13.98    5.31    3.90    1.00  100.00
% pubs. in k-th PR class (S)           50.02   25.01   14.99    5.00    4.00    1.00  100.00
Contribution to R(6) (LB)              53.57   44.65   45.89   15.61   19.83    5.62  185.17
Contribution to R(6) (R)               43.55   57.76   52.08   19.11   22.17    5.99  200.66
Contribution to R(6) (PG)              53.57   44.65   41.68   21.23   19.51    5.99  186.64
Contribution to R(6) (S)               50.02   50.02   44.96   19.98   19.98    5.99  190.95
Contribution to R(6) (WS)              50.00   50.00   45.00   20.00   20.00    6.00  191.00

Table 4. Same as Table 2, but for all publications in the physics journal Europhysics Letters / EPL from 1999 until 2010.

Percentile interval k              0       1       2       3       4       5       6   total
Threshold p_k                     0%     50%     75%     90%     95%     99%    100%
No. citations at threshold         0       6      14      27      40      87     536
No. pubs. below threshold          0    3673    5654    6795    7173    7473    7551
No. pubs. at threshold           732     380     169      48      18       6       1
No. pubs. above threshold       6820    3499    1729     709     361      73       0
% pubs. at threshold            9.69    5.03    2.24    0.64    0.24    0.08    0.01
Factual threshold (LB)          0.00   53.67   77.11   90.61   95.22   99.03  100.00
Factual threshold (R)           0.00   48.64   74.87   89.98   94.98   98.95  100.00
Factual threshold (PG)          0.00   48.64   74.87   89.98   94.98   98.95  100.00
Factual threshold (S)           0.00   50.00   75.00   90.00   95.01   99.01  100.00
% pubs. in k-th PR class (LB)          53.67   23.44   13.51    4.61    3.81    0.97  100.00
% pubs. in k-th PR class (R)           48.64   26.23   15.11    5.01    3.97    1.05  100.00
% pubs. in k-th PR class (PG)          48.64   26.23   15.11    5.01    3.97    1.05  100.00
% pubs. in k-th PR class (S)           50.00   25.00   15.00    5.01    4.00    0.99  100.00
Contribution to R(6) (LB)              53.67   46.88   40.52   18.43   19.07    5.80  184.36
Contribution to R(6) (R)               48.64   52.46   45.33   20.02   19.86    6.28  192.58
Contribution to R(6) (PG)              48.64   52.46   45.33   20.02   19.86    6.28  192.58
Contribution to R(6) (S)               50.00   50.00   45.01   20.02   19.99    5.96  190.98
Contribution to R(6) (WS)              50.00   50.00   45.00   20.00   20.00    6.00  191.00
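As the WS rows of Tables 1-4 illustrate, fractional scoring splits the papers at each citation threshold over the adjacent PR classes so that every class receives exactly its nominal share, and the total score per 100 papers always equals the theoretical value 1x50 + 2x25 + 3x15 + 4x5 + 5x4 + 6x1 = 191, i.e. a mean score of 1.91 per paper. The following minimal sketch illustrates this calculation; the function name and the toy citation counts are illustrative and not taken from the empirical datasets.

```python
from bisect import bisect_left

# PR class boundaries (in %) and the integer weights 1..6 used in R(6).
THRESHOLDS = [0.0, 50.0, 75.0, 90.0, 95.0, 99.0, 100.0]
WEIGHTS = [1, 2, 3, 4, 5, 6]

def fractional_scores(citations):
    """Fractional (WS-style) scoring: each paper's unit weight is split
    over the 6 PR classes in proportion to the overlap of its tie group's
    percentile interval with the class boundaries, then weighted by 1..6."""
    n = len(citations)
    ranked = sorted(citations)
    scores = []
    for c in citations:
        below = bisect_left(ranked, c)      # papers with fewer citations
        ties = ranked.count(c)              # papers with exactly c citations
        lo = 100.0 * below / n              # lower edge of the tie group (%)
        hi = 100.0 * (below + ties) / n     # upper edge of the tie group (%)
        score = 0.0
        for k in range(6):
            # overlap of [lo, hi] with the k-th class interval
            overlap = max(0.0, min(hi, THRESHOLDS[k + 1]) - max(lo, THRESHOLDS[k]))
            score += WEIGHTS[k] * overlap / (hi - lo)
        scores.append(score)
    return scores

# A tiny citation record with many ties: the total is 10 x 1.91 = 19.1
# regardless of how the ties fall across the class boundaries.
scores = fractional_scores([0, 0, 0, 1, 2, 2, 3, 5, 8, 13])
print(round(sum(scores), 6))   # 19.1
```

Because every tie group's weight is distributed in proportion to the overlap of its percentile interval with the class boundaries, the mean score per paper is 1.91 by construction, which is exactly the behavior the WS rows of the tables demonstrate; integer-weight schemes that assign the whole tie group to a single class cannot reproduce this value when the group straddles a boundary.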
