Deep Impact: Unintended consequences of journal rank
Running Head: Consequences of Journal Rank

Björn Brembs (1), Katherine Button (2) and Marcus Munafò (3)

1. Institute of Zoology – Neurogenetics, University of Regensburg, Universitätsstr. 31, 93040 Regensburg, Germany, bjoern@brembs.net
2. School of Social and Community Medicine, University of Bristol, 12a Priory Road, Bristol BS8 1TU, United Kingdom.
3. UK Centre for Tobacco Control Studies and School of Experimental Psychology, University of Bristol, 12a Priory Road, Bristol BS8 1TU, United Kingdom.

Corresponding Author: Björn Brembs

Abstract

Most researchers acknowledge an intrinsic hierarchy in the scholarly journals ('journal rank') that they submit their work to, and adjust not only their submission but also their reading strategies accordingly. On the other hand, much has been written about the negative effects of institutionalizing journal rank as an impact measure. So far, contributions to the debate concerning the limitations of journal rank as a scientific impact assessment tool have either lacked data, or relied on only a few studies. In this review, we present the most recent and pertinent data on the consequences of our current scholarly communication system with respect to various measures of scientific quality (such as utility/citations, methodological soundness, expert ratings or retractions). These data corroborate previous hypotheses: using journal rank as an assessment tool is bad scientific practice. Moreover, the data lead us to argue that any journal rank (not only the currently favored Impact Factor) would have this negative impact. Therefore, we suggest that abandoning journals altogether, in favor of a library-based scholarly communication system, will ultimately be necessary. This new system will use modern information technology to vastly improve the filter, sort and discovery functions of the current journal system.

Introduction

Science is the bedrock of modern society, improving our lives through advances in medicine, communication, transportation, forensics, entertainment and countless other areas. Moreover, today's global problems cannot be solved without scientific input and understanding. The more our society relies on science, and the more our population becomes scientifically literate, the more important the reliability (i.e., veracity and integrity, or 'credibility' (Ioannidis, 2012)) of scientific research becomes. Scientific research is largely a public endeavor, requiring public trust. Therefore, it is critical that public trust in science remains high. In other words, the reliability of science is not only a societal imperative, it is also vital to the scientific community itself. However, every scientific publication may in principle report results which prove to be unreliable, either unintentionally, in the case of honest error or statistical variability, or intentionally, in the case of misconduct or fraud. Even under ideal circumstances, science can never provide us with absolute truth. In Karl Popper's words: "Science is not a system of certain, or established, statements" (Popper, 1995). Peer review is one of the mechanisms which have evolved to increase the reliability of the scientific literature.
At the same time, the current publication system is being used to structure the careers of the members of the scientific community by evaluating their success in obtaining publications in high-ranking journals. The hierarchical publication system ('journal rank') used to communicate scientific results is thus central, not only to the composition of the scientific community at large (by selecting its members), but also to science's position in society. In recent years, the scientific study of the effectiveness of such measures of quality control has grown.

Retractions and the Decline Effect

A disturbing trend has recently gained wide public attention: the retraction rate of articles published in scientific journals, which had remained stable since the 1970s, began to increase rapidly in the early 2000s from 0.001% of the total to about 0.02% (Figure 1a). In 2010 we have seen the creation and popularization of a website dedicated to monitoring retractions (http://retractionwatch.com), while 2011 has been described as "the year of the retraction" (Hamilton, 2011). The reasons suggested for retractions vary widely, with the recent sharp rise potentially facilitated by an increased willingness of journals to issue retractions, or increased scrutiny and error-detection from online media. Although cases of clear scientific misconduct initially constituted a minority of cases (Fanelli, 2009; Van Noorden, 2011; Wager and Williams, 2011; Nath et al., 2006; Cokol et al., 2007; Steen, 2011a), the fraction of retractions due to misconduct has risen more sharply than the overall retraction rate, and the majority of all retractions is now due to misconduct (Fang et al., 2012; Steen, 2011b).

Retraction notices, a metric which is relatively easy to collect, only constitute the extreme end of a spectrum of unreliability that is inherent to the scientific method: we can hardly ever be entirely certain of our results (Popper, 1995). Much of the training scientists receive aims to reduce this uncertainty long before the work is submitted for publication. However, a less readily quantified but more frequent phenomenon (compared to rare retractions) has recently garnered attention, which calls into question the effectiveness of this training. The 'decline effect', which is now well described, relates to the observation that the strength of evidence for a particular finding often declines over time (Schooler, 2011; Lehrer, 2010; Bertamini and Munafò, 2012; Palmer, 2000; Fanelli, 2010; Ioannidis, 2005b; Simmons et al., 1999, 2011; Møller and Jennions, 2001; Møller et al., 2005; Van Dongen, 2011; Gonon et al., 2012). This effect provides wider scope for assessing the unreliability of scientific research than retractions alone, and allows for more general conclusions to be drawn. Researchers make choices about data collection and analysis which increase the chance of false positives (i.e., researcher bias) (Simmons et al., 1999, 2011), and surprising and novel effects are more likely to be published than studies showing no effect. This is the well-known phenomenon of publication bias (Song et al., 1999; Van Dongen, 2011; Munafò et al., 2007; Young et al., 2008; Callaham, 2002; Møller and Jennions, 2001; Møller et al., 2005; Schooler, 2011; Dwan et al., 2008).
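The mechanism is easy to demonstrate in a minimal simulation. The sketch below (with arbitrary, hypothetical parameters, not values from any of the studies cited) generates many small, underpowered studies of the same weak true effect and 'publishes' only the statistically significant positive ones; the published record then overestimates the true effect considerably.

```python
# Minimal simulation of publication bias; all parameters are illustrative
# assumptions, not estimates from the literature reviewed here.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_effect = 0.2    # small true effect (in units of the pooled SD)
n_per_group = 20     # small, underpowered samples
n_studies = 5000

published = []
for _ in range(n_studies):
    treatment = rng.normal(true_effect, 1.0, n_per_group)
    control = rng.normal(0.0, 1.0, n_per_group)
    t, p = stats.ttest_ind(treatment, control)
    effect = treatment.mean() - control.mean()
    if p < 0.05 and effect > 0:  # only 'significant' positive results appear
        published.append(effect)

print(f"true effect:           {true_effect:.2f}")
print(f"mean published effect: {np.mean(published):.2f}")  # markedly inflated
```

No misconduct is needed for this inflation: selective publication of significant results is sufficient.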
In other words, the probability of getting a paper published might be biased towards larger initial effect sizes, which are revealed by later studies to be not so large (or even absent entirely), leading to the decline effect. While sound methodology can help reduce researcher bias (Simmons et al., 1999), publication bias is more difficult to address. Some journals are devoted to publishing null results, or have sections devoted to these, but coverage is uneven across disciplines and often these are not particularly high-ranking or well-read (Schooler, 2011; Nosek et al., 2012). Publication therein is typically not a cause for excitement (Giner-Sorolla, 2012; Nosek et al., 2012), leading to an overall low frequency of replication studies in many fields (Hartshorne and Schachner, 2012; Kelly, 2006; Carpenter, 2012; Yong, 2012; Makel et al., 2012). Publication bias is also exacerbated by a tendency for journals to be less likely to publish replication studies (or, worse still, failures to replicate) (Editorial, 2012; Goldacre, 2011; Sutton, 2011; Hartshorne and Schachner, 2012; Curry, 2009; Yong, 2012).

Here we argue that the counter-measures proposed to improve the reliability and veracity of science, such as peer review in a hierarchy of journals or methodological training of scientists, may not be sufficient. While there is growing concern regarding the increasing rate of retractions in particular, and the unreliability of scientific findings in general, little consideration has been given to the infrastructure by which scientists not only communicate their findings but also evaluate each other as a potential contributing factor. That is, to what extent does the environment in which science takes place contribute to the problems described above? By far the most common metric by which publications are evaluated, at least initially, is the perceived prestige or rank of the journal in which they appear. Does the pressure to publish in prestigious, high-ranking journals contribute to the unreliability of science?

The Decline Effect and Journal Rank

The common pattern seen where the decline effect has been documented is one of an initial publication in a high-ranking journal, followed by attempts at replication in lower-ranked journals which either failed to replicate the original findings, or suggested a much weaker effect (Lehrer, 2010). Journal rank is most commonly assessed using Thomson Reuters' Impact Factor (IF), which has been shown to correspond well with subjective ratings of journal quality and rank (Gordon, 1982; Saha et al., 2003; Yue et al., 2007; Sønderstrup-Andersen and Sønderstrup-Andersen, 2008).

Fig. 1: Current trends in the reliability of science. a – Exponential fit for PubMed retraction notices (data from pmretract.heroku.com). b – Relationship between year of publication and individual study effect size. Data are taken from Munafò et al., 2007, and represent candidate gene studies of the association between DRD2 genotype and alcoholism. The effect size (y-axis) represents the individual study effect size (odds ratio; OR), on a log-scale. This is plotted against the year of publication of the study (x-axis). The size of the circle is proportional to the IF of the journal the individual study was published in. Effect size is significantly negatively correlated with year of publication. c – Relationship between IF and the extent to which an individual study overestimates the likely true effect.
Data are taken from Munafò et al., 2009, and represent candidate gene studies of a number of gene-phenotype associations of psychiatric phenotypes. The bias score (y-axis) represents the effect size of the individual study divided by the pooled effect size estimated by meta-analysis, on a log-scale. Therefore, a value greater than zero indicates that the study provided an over-estimate of the likely true effect size. This is plotted against the IF of the journal the study was published in (x-axis), on a log-scale. The size of the circle is proportional to the sample size of the individual study. Bias score is significantly positively correlated with IF, sample size significantly negatively. d – Linear regression with confidence intervals between IF and Fang and Casadevall's Retraction Index (data provided by Fang and Casadevall, 2011).

One particular case (Munafò et al., 2007) illustrates the decline effect (Figure 1b), and shows that early publications both report a larger effect than subsequent studies, and are also published in journals with a higher IF. These observations raise the more general question of whether research published in high-ranking journals is inherently less reliable than research in lower-ranking journals. As journal rank is also predictive of the incidence of fraud and misconduct in retracted publications, as opposed to other reasons for retraction (Steen, 2011a), it is not surprising that higher-ranking journals are also more likely to publish fraudulent work than lower-ranking journals (Fang et al., 2012). These data, however, cover only the small fraction of publications that have been retracted. More important is the large body of the literature that is not retracted and thus actively being used by the scientific community. There is evidence that unreliability is higher in high-ranking journals for non-retracted publications as well: a meta-analysis of genetic association studies provides evidence that the extent to which a study over-estimates the likely true effect size is positively correlated with the IF of the journal in which it is published (Figure 1c) (Munafò et al., 2009). Similar effects have been reported in the context of other research fields (Siontis et al., 2011; Ioannidis, 2005a; Ioannidis and Panagiotou, 2011).

There are additional measures of scientific quality, and in none does journal rank fare much better. A study in crystallography reports that the quality of the protein structures described is significantly lower in publications in high-ranking journals (Brown and Ramaswamy, 2007). Adherence to basic principles of sound scientific (e.g., the CONSORT guidelines: http://www.consort-statement.org) or statistical methodology has also been tested. Four different studies on levels of evidence in medical and/or psychological research have found varying results.
While two studies on surgery journals found a correlation between IF and the levels of evidence defined in the respective studies (Obremskey et al., 2005; Lau and Samman, 2007), a study of anesthesia journals failed to find any statistically significant correlation between journal rank and evidence-based medicine principles (Bain and Myles, 2005), and a study of seven medical/psychological journals found highly varying adherence to statistical guidelines, irrespective of journal rank (Tressoldi et al., 2013). The two surgery studies covered an IF range between 0.5 and 2.0, and 0.7 and 1.2, respectively, while the anesthesia study covered the range 0.8 to 3.5. It is possible that any correlation at the lower end of the scale is abolished when higher-ranking journals are included. The study by Tressoldi and colleagues, which included very high-ranking journals, supports this interpretation. Importantly, if publications in higher-ranking journals were methodologically sounder, then one would expect the opposite result: inclusion of high-ranking journals should result in a stronger, not a weaker, correlation.

Further supporting the notion that journal rank is a poor predictor of statistical soundness is our own analysis of data on statistical power in neuroscience studies (Button et al., 2013). There was no significant correlation between statistical power and journal rank (N=650; rs=-0.01; p=0.8; Figure 2). Thus, the currently available data seem to indicate that journal rank is a poor indicator of methodological soundness.

Fig. 2: No association between statistical power and journal IF. The statistical power of 650 neuroscience studies (data from Button et al. 2013; 19 missing ref; 3 unclear reporting; 57 published in journal without 2011 IF; 1 book) plotted as a function of the 2011 IF of the publishing journal. The studies were selected from the 730 contributing to the meta-analyses included in Button et al. 2013, Table 1, and included where journal title and IF (2011 © Thomson Reuters Journal Citation Reports) were available.

Beyond explicit quality metrics and sound methodology, reproducibility is at the core of the scientific method and thus a hallmark of scientific quality. Three recent studies reported attempts to replicate published findings in preclinical medicine (Scott et al., 2008; Begley and Ellis, 2012; Prinz et al., 2011). All three found a very low frequency of replication, suggesting that maybe only one out of five preclinical findings is reproducible. In fact, the level of reproducibility was so low that no relationship between journal rank and reproducibility could be detected. Hence, these data support the necessity of recent efforts such as the 'Reproducibility Initiative' (Baker, 2012) or the 'Reproducibility Project' (Collaboration, 2012). In fact, the data also indicate that these projects may consider starting with replicating findings published in high-ranking journals.

Given all of the above evidence, it is therefore not surprising that journal rank is also a strong predictor of the rate of retractions (Figure 1d) (Fang and Casadevall, 2011; Liu, 2006; Cokol et al., 2007).
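As an aside, the rank correlation reported above (Spearman's rs) is straightforward for readers to compute for their own samples; the sketch below uses randomly generated placeholder values, not the Button et al. (2013) data.

```python
# Sketch of a rank-correlation (Spearman's r_s) check between statistical
# power and journal IF; both arrays are random placeholders, NOT real data.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
power = rng.uniform(0.05, 0.95, 650)          # per-study statistical power
impact_factor = rng.lognormal(1.0, 1.0, 650)  # 2011 IF of publishing journal

r_s, p = spearmanr(power, impact_factor)
print(f"r_s = {r_s:.2f}, p = {p:.2f}")  # unrelated data yield r_s near zero
```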
Social pressure and journal rank

There are thus several converging lines of evidence which indicate that publications in high-ranking journals are not only more likely to be fraudulent than articles in lower-ranking journals, but also more likely to present discoveries which are less reliable (i.e., are inflated, or cannot subsequently be replicated). Some of the sociological mechanisms behind these correlations have been documented, such as pressure to publish (preferably positive results in high-ranking journals), leading to the potential for decreased ethical standards (Anderson et al., 2007) and increased publication bias in highly competitive fields (Fanelli, 2010). The general increase in competitiveness, and the precariousness of scientific careers (Shapin, 2008), may also lead to an increased publication bias across the sciences (Fanelli, 2011). This evidence supports earlier propositions about social pressure being a major factor driving misconduct and publication bias (Giles, 2007), eventually culminating in retractions in the most extreme cases.

That being said, it is clear that the correlation between journal rank and retraction rate is likely too strong (coefficient of determination of 0.77; data from Fang and Casadevall, 2011) to be explained exclusively by the decreased reliability of the research published in high-ranking journals. Probably, additional factors contribute to this effect. For instance, one such factor may be the greater visibility of publications in these journals, which is both one of the incentives driving publication bias, and a likely underlying cause for the detection of error or misconduct, with the eventual retraction of the publications as a result (Cokol et al., 2007). Conversely, the scientific community may also be less concerned about incorrect findings published in more obscure journals. With respect to the latter, the finding that the large majority of retractions come from the numerous lower-ranking journals (Fang et al., 2012) reveals that publications in lower-ranking journals are also scrutinized and, if warranted, retracted. Thus, differences in scrutiny are likely to be only a contributing factor and not an exclusive explanation, either. With respect to the former, visibility effects in general can be quantified by measuring citation rates between journals, testing the assumption that if higher visibility were a contributing factor to retractions, it must also contribute to citations.

Journal Rank and Study Impact

Thus far we have presented evidence that research published in high-ranking journals may be less reliable compared with publications in lower-ranking journals. Nevertheless, there is a strong common perception that high-ranking journals publish 'better' or 'more important' science, and that the IF captures this well (Gordon, 1982; Saha et al., 2003). The assumption is that high-ranking journals are able to be highly selective and publish only the most important, novel and best-supported scientific discoveries, which will then, as a consequence of their quality, go on to be highly cited (Young et al., 2008). One way to reconcile this common perception with the data would be that, while journal rank may be indicative of a minority of unreliable publications, it may also (or more strongly) be indicative of the importance of the majority of remaining, reliable publications.
Indeed, a recent study on clinical trial meta-analyses found that a measure for the novelty of a clinical trial's main outcome did correlate significantly with journal rank (Evangelou et al., 2012). Compared to this relatively weak correlation (with all coefficients of determination lower than 0.1), a stronger correlation was reported for journal rank and expert ratings of importance (Allen et al., 2009). In this study, the journal in which the study had appeared was not masked, thus not excluding the strong correlation between subjective journal rank and journal quality as a confounding factor. Nevertheless, there is converging evidence from two studies that journal rank is indeed indicative of a publication's perceived importance.

Beyond the importance or novelty of the research, there are three additional reasons why publications in high-ranking journals might receive a high number of citations. First, publications in high-ranking journals achieve greater exposure by virtue not only of the larger circulation of the journal in which they appear, but also of the more prominent media attention (Gonon et al., 2012). Second, citing high-ranking publications in one's own publication may increase its perceived value. Third, the novel, surprising, counter-intuitive or controversial findings often published in high-ranking journals draw citations not only from follow-up studies but also from news-type articles in scholarly journals reporting and discussing the discovery.

Fig. 3: Trends in predicting citations from journal rank. The coefficient of determination (R²) between journal rank (as measured by IF) and the citations accruing over the two years after publication is plotted as a function of publication year in a sample of almost 30 million publications. Lozano et al. (2012) make the case that one can explain the trends in the predictive value of journal rank by the publication of the IF in the 1960s (R² increase is accelerating) and the widespread adoption of internet searches in the 1990s (R² is dropping). The data support the interpretation that reading habits drive the correlation between journal rank and citations more than any inherent quality of the articles. IFs for the years before the introduction of the IF in the 1960s have been computed retroactively.

Despite these four factors, which would suggest considerable effects of journal rank on future citations, it has been established for some time that the actual effect of journal rank is measurable, but nowhere near as substantial as indicated (Hegarty and Walton, 2012; Seglen, 1997; Callaham, 2002; Kravitz and Baker, 2011; Chow et al., 2007; Seglen, 1994; Finardi, 2013), and as one would expect if visibility were the exclusive factor driving retractions. In fact, the average effect sizes roughly approach those for journal rank and unreliability, cited above. The data presented in a recent analysis of the development of these correlations between IF-based journal rank and future citations over the period from 1902 to 2009 (with IFs before the 1960s computed retroactively) reveal two very informative trends (Figure 3; data from Lozano et al., 2012). First, while the predictive power of journal rank remained very low for the entire first two thirds of the 20th century, it started to slowly increase shortly after the publication of the first IF data in the 1960s.
This correlation kept increasing until the second interesting trend emerged with the advent of the internet and keyword-search engines in the 1990s, from which time on it fell back to pre-1960s levels until the end of the study period in 2009. Overall, consistent with the citation data already available, the coefficient of determination between journal rank and citations was always in the range of ~0.1 to 0.3 (i.e., quite low). It thus appears that indeed a small but significant correlation between journal rank and future citations can be observed. Moreover, the data suggest that most of this small effect stems from visibility effects due to the influence of the IF on reading habits (Lozano et al., 2012), rather than from factors intrinsic to the published articles (see data cited above). However, the correlation is so weak that it cannot alone account for the strong correlation between retractions and journal rank, but instead requires additional factors, such as the increased unreliability of publications in high-ranking journals cited above.

Supporting these weak correlations between journal rank and future citations are data reporting classification errors (i.e., whether a publication received too many or too few citations with regard to the rank of the journal it was published in) at or exceeding 30% (Chow et al., 2007; Kravitz and Baker, 2011; Singh et al., 2007; Starbuck, 2005). In fact, these classification errors, in conjunction with the weak citation advantage, render journal rank practically useless as an evaluation signal, even if there were no indication of less reliable science being published in high-ranking journals. The only measure of citation count that does correlate strongly with journal rank (negatively) is the number of articles without any citations at all (Weale et al., 2004), supporting the argument that fewer articles in high-ranking journals go unread.

Thus, there is quite extensive evidence arguing for the strong correlation between journal rank and retraction rate to be mainly due to two factors: there is direct evidence that the social pressure to publish in high-ranking journals increases the unreliability, intentional or not, of the research published there, and there is more indirect evidence, derived mainly from citation data, indicating that increased visibility of publications in high-ranking journals may potentially contribute to increased error-detection in these journals. With several independent measures failing to provide compelling evidence that journal rank is a reliable predictor of scientific impact or quality, and other measures indicating that journal rank is at least equally if not more predictive of low reliability, the central role of journal rank in modern science deserves close scrutiny.

Practical consequences of Journal Rank

Even if a particular study has been performed to the highest standards, the quest for publication in high-ranking journals slows down the dissemination of science and increases the burden on reviewers, through iterations of submissions and rejections cascading down the hierarchy of journal rank (Statzner and Resh, 2010; Kravitz and Baker, 2011; Nosek and Bar-Anan, 2012). A recent study seems to suggest that such rejections eventually improve manuscripts enough to yield measurable citation benefits (Calcagno et al., 2012).
However, the effect size of such resubmissions appears to be of the order of 0.1 citations per article, a statistically significant but, in practical terms, negligible effect. This conclusion is corroborated by an earlier study which failed to find any such effect (Nosek and Bar-Anan, 2012). Moreover, with peer-review costs estimated in excess of €2.2 billion (~US$2.8 billion) annually (Research Information Network, 2008), the resubmission cascade contributes to the already rising costs of journal rank: the focus on journal rank has allowed corporate publishers to keep their most prestigious journals closed-access and to increase subscription prices (Kyrillidou et al., 2012), creating additional barriers to the dissemination of science. The argument from highly selective journals is that their per-article cost would be too high for author processing fees, which may be up to €37,000 (US$48,000) for the journal Nature (House of Commons, 2004). There is also evidence from one study in economics suggesting that journal rank can contribute to suppression of interdisciplinary research (Rafols et al., 2012), keeping disciplines separate and isolated.

Finally, the attention given to publication in high-ranking journals may distort the communication of scientific progress, both inside and outside of the scientific community. For instance, the recent discovery of a 'Default-Mode Network' in rodent brains was, presumably, made independently by two different sets of neuroscientists and published only a few months apart (Lu et al., 2012; Upadhyay et al., 2011). The later, but not the earlier, publication (Lu et al., 2012) was cited in a subsequent high-ranking publication (Welberg, 2012). Despite both studies largely reporting identical findings (albeit, perhaps, with different quality), the later report has garnered 19 citations, while the earlier one only 5, at the time of this writing. We do not know of any empirical studies quantitatively addressing this particular effect of journal rank. However, a similar distortion due to selective attention to publications in high-ranking journals has been reported in a study on medical research. This study found media reporting to be distorted, such that once initial findings in higher-ranking journals have been refuted by publications in lower-ranking journals (a case of the decline effect), they do not receive adequate media coverage (Gonon et al., 2012).

Impact Factor – Negotiated, irreproducible and unsound

The IF is a metric for the number of citations to articles in a journal (the numerator), normalized by the number of articles in that journal (the denominator). However, there is evidence that the IF is, at least in some cases, not calculated but negotiated, that it is not reproducible, and that, even if it were reproducibly computed, the way it is derived is not mathematically sound. The fact that publishers have the option to negotiate how their IF is calculated is well established – in the case of PLoS Medicine, the negotiation range was between 2 and about 11 (The PLoS Medicine Editors, 2006). What is negotiated is the denominator in the IF equation (i.e., which published articles are counted), given that all citations count towards the numerator whether they result from publications included in the denominator or not.
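This asymmetry between numerator and denominator makes the arithmetic worth spelling out. The sketch below simply reproduces the two-year IF calculation using the Current Biology figures from Table 1 below: because the citation count hardly changes while the number of counted items shrinks, the IF jumps by roughly 70%.

```python
# The two-year IF: citations received in year Y by items published in
# Y-1 and Y-2, divided by the number of 'citable items' from Y-1 and Y-2.
# The figures below are those reported for Current Biology in Table 1.
def impact_factor(citations: int, citable_items: int) -> float:
    return citations / citable_items

print(impact_factor(7231, 1032))  # JCR 2002 edition: ~7.007
print(impact_factor(7551, 634))   # JCR 2003 edition: ~11.910, after the
                                  # denominator was retrospectively reduced
```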
It has thus been public knowledge for quite some time now that removing editorials and News-and-Views articles (so-called 'front matter') from the denominator can dramatically alter the resulting IF (Editorial, 2005; Garfield, 1999; Adam, 2002; Moed and Van Leeuwen, 1995; Moed and Van Leeuwen, 1996; Hernán, 2009; Baylis et al., 1999). While these IF negotiations are rarely made public, the number of citations (numerator) and published articles (denominator) used to calculate the IF are accessible via Journal Citation Reports. This database can be searched for evidence that the IF has been negotiated. For instance, the numerator and denominator values for Current Biology in 2002 and 2003 indicate that while the number of citations remained relatively constant, the number of published articles dropped. This decrease occurred after the journal was purchased by Cell Press (an imprint of Elsevier), despite there being no change in the layout of the journal. Critically, the arrival of a new publisher corresponded with a retrospective change in the denominator used to calculate the IF (Table 1). Similar procedures raised the IF of FASEB Journal from 0.24 in 1988 to 18.3 in 1989, when conference abstracts ceased to count towards the denominator (Baylis et al., 1999).

Table 1: Thomson Reuters' IF calculations for the journal Current Biology in the years 2002/2003. Most of the rise in IF is due to the reduction in published items. Note the discrepancy in the number of items published in 2001 between the two consecutive JCR Science Editions. n.c.: year not covered by this edition. For raw data see Suppl. Fig. S1.

                          Items 2000 | Items 2001 | Items 2002 | Sum items | Citations in preceding two years | IF
JCR Science Edition 2002: 504        | 528        | n.c.       | 1032      | 7231                             | 7.007
JCR Science Edition 2003: n.c.       | 300        | 334        | 634       | 7551                             | 11.910

In an attempt to test the accuracy of the ranking of some of their journals by IF, Rockefeller University Press purchased access to the citation data of their journals and some competitors. They found numerous discrepancies between the data they received and the published rankings, sometimes leading to differences of up to 19% (Rossner et al., 2007). When asked to explain this discrepancy, Thomson Reuters replied that they routinely use several different databases and had accidentally sent Rockefeller University Press the wrong one. Despite this, a second database they sent also did not match the published records. This is only one of a number of reported errors and inconsistencies (Reedijk, 1998; Moed et al., 1996).

It is well known that citation data are strongly skewed, meaning that a small number of publications receive a large number of citations, while most publications receive very few (Rossner et al., 2007; Seglen, 1992, 1997; Kravitz and Baker, 2011; Editorial, 2005; Chow et al., 2007; Weale et al., 2004; Taylor et al., 2008). The use of an arithmetic mean as a measure of central tendency on such data (rather than, say, the median) is clearly inappropriate, but this is exactly what is used in the IF calculation. The International Mathematical Union reached the same conclusion in an analysis of the IF (Adler et al., 2008).
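How much the choice of the mean matters for such skewed data can be seen in a small sketch with synthetic, hypothetical citation counts (not data from any real journal):

```python
# Mean vs. median on a skewed citation distribution (synthetic data only).
import numpy as np

rng = np.random.default_rng(1)
# Lognormal counts: most articles are cited rarely, a few very heavily.
citations = np.round(rng.lognormal(mean=1.0, sigma=1.5, size=1000))

print(f"mean   (used by the IF):  {citations.mean():.1f}")
print(f"median (robust to skew):  {np.median(citations):.1f}")
# A handful of blockbuster papers can multiply the mean while leaving
# the citation count of the typical article unchanged.
```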
A recent study correlated the median citation frequency in a sample of 100 journals with their two-year IF and found a very strong correlation, which is expected given the similarly skewed distributions in most journals (Editorial, 2013). However, at the time of this writing, it is not known if using the median (instead of the mean) improves any of the predominantly weak predictive properties of journal rank. Complementing the specific flaws just mentioned, a recent, comprehensive review of the bibliometric literature lists various additional shortcomings of the IF more generally (Vanclay, 2011).

Conclusions

While at this point it seems impossible to quantify the relative contributions of the different factors influencing the reliability of scientific publications, the current empirical literature on the effects of journal rank provides evidence supporting the following four conclusions: 1) journal rank is a weak to moderate predictor of utility and perceived importance; 2) journal rank is a moderate to strong predictor of both intentional and unintentional scientific unreliability; 3) journal rank is expensive, delays science and frustrates researchers; and 4) journal rank as established by IF violates even the most basic scientific standards, but predicts subjective judgments of journal quality.

Caveats

While our latter two conclusions appear uncontroversial, the former two are counter-intuitive and require explanation. Weak correlations between future citations and journal rank based on IF may be caused by the poor statistical properties of the IF. This explanation could (and should) be tested by using any of the existing alternative ranking tools available (such as Thomson Reuters' Eigenfactor, Scopus' SCImago Journal Rank, or Google's Scholar Metrics) and computing correlations with the metrics discussed above. However, a recent analysis shows a high correlation between these ranks, so no large differences would be expected (Lopez-Cozar and Cabezas-Clavijo, 2013). Alternatively, one can choose other important metrics and compute which journals score particularly high on these. Either way, since the IF reflects the common perception of journal hierarchies rather well (Gordon, 1982; Saha et al., 2003; Yue et al., 2007; Sønderstrup-Andersen and Sønderstrup-Andersen, 2008), any alternative hierarchy that would better reflect article citation frequencies might violate this intuitive sense of journal rank, as different ways to compute journal rank lead to different hierarchies (Wagner, 2011). Both alternatives thus challenge our subjective journal ranking. To put it more bluntly: if perceived importance and utility were to be discounted as indirect proxies of quality, while retraction rate, replicability, effect-size overestimation, correct sample sizes, crystallographic quality, sound methodology and so on counted as more direct measures of quality, then inverting the current IF-based journal hierarchy would improve the alignment of journal rank with most of these more direct measures of quality, and have no effect on the rest.

The subjective journal hierarchy also leads to a circularity that confounds many empirical studies.
That is, authors use journal rank, in part, to make decisions about where to submit their manuscripts, such that well-performed studies yielding ground-breaking discoveries with general implications are preferentially submitted to high-ranking journals. Readers, in turn, expect to read about such articles only in high-ranking journals, leading to the exposure and visibility confounds discussed above and at length in the cited literature. Moreover, citation practices and methodological standards vary between scientific fields, potentially distorting both the citation and the reliability data. Given these confounds, one might expect highly varying and often inconclusive results. Despite this, the literature contains evidence for associations between journal rank and measures of scientific impact (e.g., citations, importance, unread articles), but also contains at least equally strong, consistent effects of journal rank predicting scientific unreliability (e.g., retractions, effect size, sample size, replicability, fraud/misconduct, methodology). Neither group of studies can thus be easily dismissed, suggesting that the incentives journal rank creates for the scientific community (to submit either their best or their most unreliable work to the most high-ranking journals) at best cancel each other out. Such unintended consequences are well known from other fields where metrics are applied (Hauser and Katz, 1998). Therefore, while there are concerns not only about the validity of the IF as the metric of choice for establishing journal rank but also about confounding factors complicating the interpretation of some of the data, we find, in the absence of additional data, that these concerns do not suffice to substantially question our conclusions, but do emphasize the need for future research.

Potential long-term consequences of journal rank

Taken together, the reviewed literature suggests that using journal rank is unhelpful at best and unscientific at worst. In our view, the IF generates an illusion of exclusivity and prestige based on an assumption that it will predict subsequent impact, which is not supported by empirical data. As the IF aligns well with intuitive notions of journal hierarchies (Gordon, 1982; Saha et al., 2003; Yue et al., 2007), it receives insufficient scrutiny (Frank, 2003), perhaps a case of confirmation bias. The one field in which journal rank is scrutinized is bibliometrics. We have reviewed the pertinent empirical literature to supplement, with empirical data, the largely argumentative discussion on the opinion pages of many learned journals (Adler and Harzing, 2009; Bauer, 2004; Lawrence, 2002; Brumback, 2012; Lawrence, 2007, 2008; Garwood, 2011; Taylor et al., 2008; Tsikliras, 2008; Todd and Ladle, 2008; Giles, 2007; Moed and Van Leeuwen, 1996; Editorial, 2005; Sarewitz, 2012; Schooler, 2011). Much like dowsing, homeopathy or astrology, journal rank seems to appeal to subjective impressions of certain effects, but these effects disappear as soon as they are subjected to scientific scrutiny. In our understanding of the data, the social and psychological influences described above are, at least to some extent, generated by journal rank itself, which in turn may contribute to the observed decline effect and rise in retraction rate.
That is, systemic pressures on the author, rather than increased scrutiny on the part of the reader, inflate the unreliability of much scientific research. Without reform of our publication system, the incentives associated with increased pressure to publish in high-ranking journals will continue to encourage scientists to be less cautious in their conclusions (or worse), in an attempt to market their research to the top journals (Anderson et al., 2007; Fanelli, 2010; Shapin, 2008; Giles, 2007; Munafò et al., 2009). This is reflected in the decline in null results reported across disciplines and countries (Fanelli, 2011), and corroborated by the findings that much of the increase in retractions may be due to misconduct (Steen, 2011b; Fang et al., 2012), and that much of this misconduct occurs in studies published in high-ranking journals (Steen, 2011a; Fang et al., 2012). Inasmuch as journal rank guides the appointment and promotion policies of research institutions, the increasing rate of misconduct that has recently been observed may prove to be but the beginning of a pandemic: it is conceivable that, for the last few decades, research institutions world-wide may have been hiring and promoting scientists who excel at marketing their work to top journals, but who are not necessarily equally good at conducting their research. Conversely, these institutions may have purged excellent scientists from their ranks whose marketing skills did not meet institutional requirements. If this interpretation of the data is correct, a generation of excellent marketers (possibly, but not necessarily, also excellent scientists) now serve as the leading figures and role models of the scientific enterprise, constituting another potentially major contributing factor to the rise in retractions.

The implications of the data presented here go beyond the reliability of scientific publications – public trust in science and scientists has been in decline for some time in many countries (Gauchat, 2010; European Commission, 2010; Nowotny, 2005), dramatically so in some sections of society (Gauchat, 2012), culminating in the sentiment that scientists are nothing more than yet another special interest group (Miller, 2012; Sarewitz, 2013). In the words of Daniel Sarewitz: "Nothing will corrode public trust more than a creeping awareness that scientists are unable to live up to the standards that they have set for themselves" (Sarewitz, 2012). The data presented here prompt the suspicion that the corrosion has already begun and that journal rank may have played a part in this decline as well.

Alternatives

Alternatives to journal rank exist – we now have technology at our disposal which allows us to perform all of the functions journal rank is currently supposed to perform in an unbiased, dynamic way on a per-article basis, allowing the research community greater control over the selection, filtering and ranking of scientific information (Lin, 2012; Kravitz and Baker, 2011; Priem et al., 2012; Hönekopp and Khan, 2011; Roemer and Borchardt, 2012; Priem, 2013). Since there is no technological reason to continue using journal rank, one implication of the data reviewed here is that we can instead use current technology and remove the need for a journal hierarchy completely. As we have argued, it is not only technically obsolete, but also counter-productive and a potential threat to the scientific endeavor.
We would therefore favor bringing scholarly communication back to the research institutions, in an archival publication system in which software, raw data and their textual descriptions are archived and made accessible after peer review, with scientifically tested metrics accruing reputation in a constantly improving reputation system (Eve, 2012). This reputation system would be subjected to the same standards of scientific scrutiny as are commonly applied to all scientific matters, and would evolve to minimize gaming and maximize the alignment of researchers' interests with those of science (which are currently misaligned (Nosek et al., 2012)). Only an elaborate ecosystem of a multitude of metrics can provide the flexibility to capitalize on the small fraction of the multi-faceted scientific output that is actually quantifiable. Such an ecosystem would evolve such that the only evolutionarily stable strategy is to try to do the best science one can.

The currently balkanized literature, with a lack of interoperability and standards as one of its many detrimental, unintended consequences, prevents the kind of innovation that gave rise to the discovery functions of Amazon or eBay, the social networking functions of Facebook or Reddit, and of course the sort and search functions of Google – all technologies virtually every scientist uses regularly for all activities but science. Thus, fragmentation and the resulting lack of access and interoperability are among the main underlying reasons why journal rank has not yet been replaced by more scientific evaluation options, despite widespread access to article-level metrics today.

With an openly accessible scholarly literature standardized for interoperability, it would of course still be possible to pay professional editors to select publications, as is the case now, but after publication. These editors would then actually compete with each other for paying customers, accumulating track records for selecting (or missing) the most important discoveries. Likewise, virtually any functionality the current system offers would easily be replicable in the system we envisage. However, above and beyond replicating current functionality, an open, standardized scholarly literature would place any and all thinkable scientific metrics only a few lines of code away, offering the possibility of a truly open evaluation system where any hypothesis can be tested. Metrics, social networks and intelligent software can then provide each individual user with regular, customized updates on the most relevant research. These updates respond to the behavior of the user and learn from and evolve with their preferences. With openly accessible, interoperable literature, data and software, agents can be developed that independently search for hypotheses in the vast knowledge accumulating there. But perhaps most importantly, with an openly accessible database of science, innovation can thrive, bringing us features and ideas nobody can think of today and nobody will ever be capable of imagining if we do not bring the products of our labor back under our own control. It was the hypertext transfer protocol (HTTP) standard that spurred innovation and made the internet what it is today. What is required is the equivalent of HTTP for scholarly literature, data and software.
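Article-level data of the kind such a system could build on are, in fact, already openly retrievable. As one illustration, the sketch below queries the public CrossRef REST API for a per-article citation count; the DOI is merely an example, and CrossRef's 'is-referenced-by-count' field covers only citations indexed by CrossRef.

```python
# Fetch an article-level citation count from the public CrossRef REST API.
# Illustrative sketch only: the DOI is an arbitrary example, and the count
# reflects just the citations that CrossRef itself has indexed.
import requests

def crossref_citation_count(doi: str) -> int:
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    return resp.json()["message"]["is-referenced-by-count"]

print(crossref_citation_count("10.1371/journal.pmed.0020124"))  # Ioannidis (2005b)
```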
Consequences of Journal Rank 22 Funds currently spe nt on journal subscript ion s could ea sily suffice to finance the i nitial conversion of scholar ly communication , even if only as long - term saving s . One ave nue to move in this direction may be the r ecently announced Episcience Project (Van Noorden, 2013) . Other solut ions certainly exist (Beverungen et al., 2012; Nosek and Bar - Anan, 2012; Kriegeskorte et al., 2012; Bachmann, 2011; Birukou et al., 2011; Florian, 2012; Ghosh et al., 2012; Hunter, 2012; Ietto - Gillies, 201 2; Kreiman and Ma unsell, 2011; Kriegeskorte, 2012; Lee, 2012; Pösch l, 2012; Priem and Hemminger, 2012; Sandewall, 2012; Walther and van d en Bosch, 2012; Wicherts et al., 2012; Yarkoni, 2012; Zimmermann et al., 2011; Hartshorne and Schachner, 2012; Kra vitz and Baker, 2011) , but the need for an alternativ e system is clea rly pressing (Casadevall and Fang, 2012) . Given the data we surv eyed above, almost anything app ears superior to the status qu o. Consequences of Journal Rank 23 Acknowledgements Neil Saunders was of tremendous va lue in helping us obtain and understand the PubMed retraction dat a for Fig ure 1a. Ferric Fang and Artur o Casadeval were so kind as to let us use their retraction data to r e - plot their fig ure on a logarithmic scale (Figure 1d). We are grateful to George A. Lozano , Vincent Larivière and Yves Gingras for sharing the ir citation data w ith us (Figure 3 ). We are indebted to John Ioannidis, D aniele Fanelli , Christoph er Baker, Dwigh t Kravitz , Tom Hartley , Jason Priem, Stephen Curry , Nikolaus Kriegeskor te and four anonymous reviewers for their comments on an e arlier version of this manuscript. MRM is a m ember of the UK Cen tre for Tobacco Control Studies, a UKC RC Public Health Research: Centre of Excellence. Funding from Briti sh Heart Foundation, Cancer Research UK, Economic and Social Research Council, Medic al Research Council, and the National Institute for Hea lth Research, under the ausp ices of the UK Clinical Res earch Collaboration , is gratefully a cknowledged. BB w a s a Heisenberg - Fellow of the DFG during the time most of this manuscript was wr itten and their support is gratefully acknowl edged as well . Consequences of Journal Rank 24 References Adam, D. (2002). The counting house. Nature 415, 726–9. Adler, N. J., and Harzing, a.-W. (2009). Wh en Knowledge Wins: Transcending the Sense and Nonsense of Academic Rankings. Academy of Management Learning & Education 8, 72–95. Adler, R., Ewing, J., and Taylor, P. (2008). Joint Committee on Quantitative Assessment of Research: Citation Statisti cs (A report from the International Mathematical Union (IMU) in cooperation with the International Council of Industrial and Applied Mathematic s (ICIAM) and the Institute of Mathemat. Available at: http://www.mathunion.org/fileadmin/IM U/Report/CitationStatis tics.pdf. Allen, L., Jones, C., Dolby, K., Lynn, D., and Walport, M. (2009). Looking for Landmarks: The Role of Expert Review a nd Bibliometric Analys is in Evaluating Scientific Publication Outputs. PLoS ONE 4, 8. Anderson, M. S., Martinson, B. C., and De Vries, R. (2007). Normative dissonance in science: results from a national survey of u.s. Scientists. Journal of empirical research on human research ethics : JERHRE 2, 3–14. Bachmann, T. (2011). Fair and Open Evaluation May Call for Temporarily Hidden Authorship, Caution When Counting the Votes, and Transparency of the Fu ll Pre- publication Procedure. Frontiers in computational neuroscience 5, 61. Bain, C. 
Bain, C. R., and Myles, P. S. (2005). Relationship between journal impact factor and levels of evidence in anaesthesia. Anaesthesia and Intensive Care 33, 567–70.
Baker, M. (2012). Independent labs to verify high-profile papers. Nature. Available at: http://www.nature.com/doifinder/10.1038/nature.2012.11176 [Accessed January 8, 2013].
Bauer, H. H. (2004). Science in the 21st Century: Knowledge Monopolies and Research Cartels. Journal of Scientific Exploration 18, 643–660.
Baylis, M., Gravenor, M., and Kao, R. (1999). Sprucing up one's impact factor. Nature 401, 322.
Begley, C. G., and Ellis, L. M. (2012). Drug development: Raise standards for preclinical cancer research. Nature 483, 531–533.
Bertamini, M., and Munafò, M. R. (2012). Bite-Size Science and Its Undesired Side Effects. Perspectives on Psychological Science 7, 67–71.
Beverungen, A., Bohm, S., and Land, C. (2012). The poverty of journal publishing. Organization 19, 929–938.
Birukou, A., Wakeling, J. R., Bartolini, C., Casati, F., Marchese, M., Mirylenka, K., Osman, N., Ragone, A., Sierra, C., and Wassef, A. (2011). Alternatives to peer review: novel approaches for research evaluation. Frontiers in Computational Neuroscience 5, 56.
Brown, E. N., and Ramaswamy, S. (2007). Quality of protein crystal structures. Acta Crystallographica Section D, Biological Crystallography 63, 941–50.
Brumback, R. A. (2012). "3..2..1.. Impact [factor]: target [academic career] destroyed!": just another statistical casualty. Journal of Child Neurology 27, 1565–76.
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., and Munafò, M. R. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience 14, 365–376.
Calcagno, V., Demoinet, E., Gollner, K., Guidi, L., Ruths, D., and De Mazancourt, C. (2012). Flows of Research Manuscripts Among Scientific Journals Reveal Hidden Submission Patterns. Science 338, 1065–1069.
Callaham, M. (2002). Journal Prestige, Publication Bias, and Other Characteristics Associated With Citation of Published Studies in Peer-Reviewed Journals. JAMA: The Journal of the American Medical Association 287, 2847–2850.
Carpenter, S. (2012). Psychology's Bold Initiative. Science 335, 1558–1561.
Casadevall, A., and Fang, F. C. (2012). Reforming science: methodological and cultural reforms. Infection and Immunity 80, 891–6.
Chow, C. W., Haddad, K., Singh, G., and Wu, A. (2007). On Using Journal Rank to Proxy for an Article's Contribution or Value. Issues in Accounting Education 22, 411–427.
Cokol, M., Iossifov, I., Rodriguez-Esteban, R., and Rzhetsky, A. (2007). How many scientific papers should be retracted? EMBO Reports 8, 422–3.
Collaboration, O. S. (2012). An Open, Large-Scale, Collaborative Effort to Estimate the Reproducibility of Psychological Science. Perspectives on Psychological Science 7, 657–660.
Curry, S. (2009). Eye-opening Access. Occam's Typewriter: Reciprocal Space. Available at: http://occamstypewriter.org/scurry/2009/03/27/eye_opening_access/.
Van Dongen, S. (2011). Associations between asymmetry and human attractiveness: Possible direct effects of asymmetry and signatures of publication bias. Annals of Human Biology 38, 317–23.
Dwan, K., Altman, D. G., Arnaiz, J. A., Bloom, J., Chan, A.-W., Cronin, E., Decullier, E., Easterbrook, P. J., Von Elm, E., Gamble, C., et al. (2008). Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLoS ONE 3, e3081.
Editorial (2013). Beware the impact factor. Nature Materials 12, 89.
Editorial (2005). Not-so-deep impact. Nature 435, 1003–1004.
Editorial (2012). The Well-Behaved Scientist. Science 335, 285.
European Commission (2010). Science and Technology Report.
Evangelou, E., Siontis, K. C., Pfeiffer, T., and Ioannidis, J. P. A. (2012). Perceived information gain from randomized trials correlates with publication in high-impact factor journals. Journal of Clinical Epidemiology 65, 1274–81.
Eve, M. P. (2012). Tear it down, build it up: the Research Output Team, or the library-as-publisher. Insights: the UKSG Journal 25, 158–162.
Fanelli, D. (2010). Do pressures to publish increase scientists' bias? An empirical support from US States Data. PLoS ONE 5, e10271.
Fanelli, D. (2009). How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS ONE 4, e5738.
Fanelli, D. (2011). Negative results are disappearing from most disciplines and countries. Scientometrics 90, 891–904.
Fang, F. C., and Casadevall, A. (2011). Retracted science and the retraction index. Infection and Immunity 79, 3855–9.
Fang, F. C., Steen, R. G., and Casadevall, A. (2012). Misconduct accounts for the majority of retracted scientific publications. Proceedings of the National Academy of Sciences of the United States of America 109, 17028–33.
Finardi, U. (2013). Correlation between Journal Impact Factor and Citation Performance: An experimental study. Journal of Informetrics 7, 357–370.
Florian, R. V. (2012). Aggregating post-publication peer reviews and ratings. Frontiers in Computational Neuroscience 6, 31.
Frank, M. (2003). Impact factors: arbiter of excellence? Journal of the Medical Library Association: JMLA 91, 4–6.
Garfield, E. (1999). Journal impact factor: a brief review. CMAJ: Canadian Medical Association Journal 161, 979–80.
Garwood, J. (2011). A conversation with Peter Lawrence, Cambridge. "The Heart of Research is Sick". LabTimes 2-2011, 24–31.
Gauchat, G. (2012). Politicization of Science in the Public Sphere: A Study of Public Trust in the United States, 1974 to 2010. American Sociological Review 77, 167–187.
Gauchat, G. (2010). The cultural authority of science: Public trust and acceptance of organized science. Public Understanding of Science 20, 751–770.
Ghosh, S. S., Klein, A., Avants, B., and Millman, K. J. (2012). Learning from open source software projects to improve scientific review. Frontiers in Computational Neuroscience 6, 18.
Giles, J. (2007). Breeding cheats. Nature 445, 242–3.
Giner-Sorolla, R. (2012). Science or Art? How Aesthetic Standards Grease the Way Through the Publication Bottleneck but Undermine Science. Perspectives on Psychological Science 7, 562–571.
Goldacre, B. (2011). I foresee that nobody will do anything about this problem. Bad Science. Available at: http://www.badscience.net/2011/04/i-foresee-that-nobody-will-do-anything-about-this-problem/ [Accessed March 8, 2012].
Gonon, F., Konsman, J.-P., Cohen, D., and Boraud, T. (2012). Why most biomedical findings echoed by newspapers turn out to be false: the case of attention deficit hyperactivity disorder. PLoS ONE 7, e44275.
Journal of the American Society for Information Science 33, 55–57.
Hamilton, J. (2011). Debunked Science: Studies Take Heat In 2011. NPR. Available at: http://www.npr.org/2011/12/29/144431640/debunked-science-studies-take-heat-in-2011 [Accessed March 8, 2012].
Hartshorne, J. K., and Schachner, A. (2012). Tracking Replicability as a Method of Post-Publication Open Evaluation. Frontiers in Computational Neuroscience 6, 8.
Hauser, J. R., and Katz, G. M. (1998). Metrics: you are what you measure! European Management Journal 16, 517–528.
Hegarty, P., and Walton, Z. (2012). The Consequences of Predicting Scientific Impact in Psychology Using Journal Impact Factors. Perspectives on Psychological Science 7, 72–78.
Hernán, M. A. (2009). Impact factor: a call to reason. Epidemiology 20, 317–318; discussion 319–320.
Hönekopp, J., and Khan, J. (2011). Future publication success in science is better predicted by traditional measures than by the h index. Scientometrics 90, 843–853.
House of Commons (2004). Scientific Publications: Free for all? Tenth Report of Session 2003–2004, vol. II: Written evidence, Appendix 138. Available at: http://www.publications.parliament.uk/pa/cm200304/cmselect/cmsctech/399/399we163.htm [Accessed December 17, 2012].
Hunter, J. (2012). Post-publication peer review: opening up scientific conversation. Frontiers in Computational Neuroscience 6, 63.
Ietto-Gillies, G. (2012). The evaluation of research papers in the XXI century. The Open Peer Discussion system of the World Economics Association. Frontiers in Computational Neuroscience 6, 54.
Ioannidis, J. P. A. (2005a). Contradicted and initially stronger effects in highly cited clinical research. JAMA: The Journal of the American Medical Association 294, 218–228.
Ioannidis, J. P. A. (2005b). Why most published research findings are false. PLoS Medicine 2, e124.
Ioannidis, J. P. A. (2012). Why Science Is Not Necessarily Self-Correcting. Perspectives on Psychological Science 7, 645–654.
Ioannidis, J. P. A., and Panagiotou, O. A. (2011). Comparison of effect sizes associated with biomarkers reported in highly cited individual articles and in subsequent meta-analyses. JAMA: The Journal of the American Medical Association 305, 2200–2210.
Kelly, C. D. (2006). Replicating Empirical Research in Behavioral Ecology: How and Why It Should Be Done but Rarely Ever Is. The Quarterly Review of Biology 81, 221–236.
Kravitz, D. J., and Baker, C. I. (2011). Toward a new model of scientific publishing: discussion and a proposal. Frontiers in Computational Neuroscience 5, 55.
Kreiman, G., and Maunsell, J. H. R. (2011). Nine criteria for a measure of scientific output. Frontiers in Computational Neuroscience 5, 48.
Kriegeskorte, N. (2012). Open evaluation: a vision for entirely transparent post-publication peer review and rating for science. Frontiers in Computational Neuroscience 6, 79.
Kriegeskorte, N., Walther, A., and Deca, D. (2012). An emerging consensus for open evaluation: 18 visions for the future of scientific publishing. Frontiers in Computational Neuroscience 6, 94.
Kyrillidou, M., Morris, S., and Roebuck, G. (2012). ARL Statistics. American Research Libraries Digital Publications. Available at: http://publications.arl.org/ARL_Statistics [Accessed March 18, 2012].
Lau, S. L., and Samman, N. (2007). Levels of evidence and journal impact factor in oral and maxillofacial surgery. International Journal of Oral and Maxillofacial Surgery 36, 1–5.
Lawrence, P. (2008). Lost in publication: how measurement harms science. Ethics in Science and Environmental Politics 8, 9–11.
Lawrence, P. A. (2002). Rank injustice. Nature 415, 835–836.
Lawrence, P. A. (2007). The mismeasurement of science. Current Biology 17, R583–R585.
Lee, C. (2012). Open peer review by a selected-papers network. Frontiers in Computational Neuroscience 6, 1.
Lehrer, J. (2010). The decline effect and the scientific method. New Yorker. Available at: http://www.newyorker.com/reporting/2010/12/13/101213fa_fact_lehrer [Accessed March 8, 2012].
Lin, J. (2012). Cracking Open the Scientific Process: "Open Science" Challenges Journal Tradition With Web Collaboration. New York Times. Available at: http://www.nytimes.com/2012/01/17/science/open-science-challenges-journal-tradition-with-web-collaboration.html?pagewanted=all [Accessed March 8, 2012].
Liu, S. (2006). Top Journals' Top Retraction Rates. Scientific Ethics 1, 92–93.
Lopez-Cozar, E. D., and Cabezas-Clavijo, A. (2013). Ranking journals: Could Google Scholar Metrics be an alternative to Journal Citation Reports and Scimago Journal Rank? ArXiv 1303.5870, 26.
Lozano, G. A., Larivière, V., and Gingras, Y. (2012). The weakening relationship between the impact factor and papers' citations in the digital age. Journal of the American Society for Information Science and Technology 63, 2140–2145.
Lu, H., Zou, Q., Gu, H., Raichle, M. E., Stein, E. A., and Yang, Y. (2012). Rat brains also have a default mode network. Proceedings of the National Academy of Sciences of the United States of America 109, 3979–3984.
Makel, M. C., Plucker, J. A., and Hegarty, B. (2012). Replications in Psychology Research: How Often Do They Really Occur? Perspectives on Psychological Science 7, 537–542.
Miller, K. R. (2012). America's Darwin Problem. Huffington Post. Available at: http://www.huffingtonpost.com/kenneth-r-miller/darwin-day-evolution_b_1269191.html [Accessed March 14, 2012].
Moed, H. F., and Van Leeuwen, T. N. (1996). Impact factors can mislead. Nature 381, 186.
Moed, H. F., and Van Leeuwen, T. N. (1995). Improving the accuracy of Institute for Scientific Information's journal impact factors. Journal of the American Society for Information Science 46, 461–467.
Moed, H. F., Van Leeuwen, T. N., and Reedijk, J. (1996). A critical analysis of the journal impact factors of Angewandte Chemie and the Journal of the American Chemical Society: inaccuracies in published impact factors based on overall citations only. Scientometrics 37, 105–116.
Møller, A. P., and Jennions, M. D. (2001). Testing and adjusting for publication bias. Trends in Ecology & Evolution 16, 580–586.
Møller, A. P., Thornhill, R., and Gangestad, S. W. (2005). Direct and indirect tests for publication bias: asymmetry and sexual selection. Animal Behaviour 70, 497–506.
Munafò, M. R., Freimer, N. B., Ng, W., Ophoff, R., Veijola, J., Miettunen, J., Järvelin, M.-R., Taanila, A., and Flint, J. (2009). 5-HTTLPR genotype and anxiety-related personality traits: a meta-analysis and new data. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics 150B, 271–281.
Munafò, M. R., Matheson, I. J., and Flint, J. (2007). Association of the DRD2 gene Taq1A polymorphism and alcoholism: a meta-analysis of case-control studies and evidence of publication bias. Molecular Psychiatry 12, 454–461.
Nath, S. B., Marcus, S. C., and Druss, B. G. (2006). Retractions in the research literature: misconduct or mistakes? The Medical Journal of Australia 185, 152–154.
Van Noorden, R. (2013). Mathematicians aim to take publishers out of publishing. Nature.
Van Noorden, R. (2011). Science publishing: The trouble with retractions. Nature 478, 26–28.
Nosek, B. A., and Bar-Anan, Y. (2012). Scientific Utopia: I. Opening Scientific Communication. Psychological Inquiry 23, 217–243.
Nosek, B. A., Spies, J. R., and Motyl, M. (2012). Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability. Perspectives on Psychological Science 7, 615–631.
Nowotny, H. (2005). Science and society. High- and low-cost realities for science and society. Science 308, 1117–1118.
Obremskey, W. T., Pappas, N., Attallah-Wasif, E., Tornetta, P., and Bhandari, M. (2005). Level of evidence in orthopaedic journals. The Journal of Bone and Joint Surgery, American Volume 87, 2632–2638.
Palmer, A. R. (2000). Quasi-Replication and the Contract of Error: Lessons from Sex Ratios, Heritabilities and Fluctuating Asymmetry. Annual Review of Ecology and Systematics 31, 441–480.
Popper, K. (1995). In Search of a Better World: Lectures and Essays from Thirty Years. Routledge, new edition (December 20, 1995).
Pöschl, U. (2012). Multi-stage open peer review: scientific evaluation integrating the strengths of traditional peer review with the virtues of transparency and self-regulation. Frontiers in Computational Neuroscience 6, 33.
Priem, J. (2013). Scholarship: Beyond the paper. Nature 495, 437–440.
Priem, J., and Hemminger, B. M. (2012). Decoupling the scholarly journal. Frontiers in Computational Neuroscience 6, 19.
Priem, J., Piwowar, H. A., and Hemminger, B. M. (2012). Altmetrics in the wild: Using social media to explore scholarly impact. ArXiv 1203.4745.
Prinz, F., Schlange, T., and Asadullah, K. (2011). Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery 10, 712.
Rafols, I., Leydesdorff, L., O'Hare, A., Nightingale, P., and Stirling, A. (2012). How journal rankings can suppress interdisciplinary research: A comparison between Innovation Studies and Business & Management. Research Policy 41, 1262–1282.
Reedijk, J. (1998). Sense and nonsense of science citation analyses: comments on the monopoly position of ISI and citation inaccuracies. Risks of possible misuse and biased citation and impact data. New Journal of Chemistry 22, 767–770.
Research Information Network (2008). Activities, costs and funding flows in the scholarly communications system. Report commissioned by the Research Information Network (RIN). Available at: http://www.rin.ac.uk/our-work/communicating-and-disseminating-research/activities-costs-and-funding-flows-scholarly-commu [Accessed March 18, 2013].
Roemer, R. C., and Borchardt, R. (2012). From bibliometrics to altmetrics: A changing scholarly landscape. College & Research Libraries News 73, 596–600.
Rossner, M., Van Epps, H., and Hill, E. (2007). Show me the data. The Journal of Cell Biology 179, 1091–1092.
Saha, S., Saint, S., and Christakis, D. A. (2003). Impact factor: a valid measure of journal quality? Journal of the Medical Library Association: JMLA 91, 42–46.
Sandewall, E. (2012). Maintaining live discussion in two-stage open peer review. Frontiers in Computational Neuroscience 6, 9.
Sarewitz, D. (2012). Beware the creeping cracks of bias. Nature 485, 149.
Sarewitz, D. (2013). Science must be seen to bridge the political divide. Nature 493, 7.
Schooler, J. (2011). Unpublished results hide the decline effect. Nature 470, 437.
Scott, S., Kranz, J. E., Cole, J., Lincecum, J. M., Thompson, K., Kelly, N., Bostrom, A., Theodoss, J., Al-Nakhala, B. M., Vieira, F. G., et al. (2008). Design, power, and interpretation of studies in the standard murine model of ALS. Amyotrophic Lateral Sclerosis 9, 4–15.
Seglen, P. O. (1994). Causal relationship between article citedness and journal impact. Journal of the American Society for Information Science 45, 1–11.
Seglen, P. O. (1992). The skewness of science. Journal of the American Society for Information Science 43, 628–638.
Seglen, P. O. (1997). Why the impact factor of journals should not be used for evaluating research. BMJ 314.
Shapin, S. (2008). The Scientific Life: A Moral History of a Late Modern Vocation. Chicago: University of Chicago Press.
Simmons, J. P., Nelson, L. D., and Simonsohn, U. (2011). False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22, 1359–1366.
Simmons, L. W., Tomkins, J. L., Kotiaho, J. S., and Hunt, J. (1999). Fluctuating paradigm. Proceedings of the Royal Society B: Biological Sciences 266, 593–595.
Singh, G., Haddad, K. M., and Chow, C. W. (2007). Are Articles in "Top" Management Journals Necessarily of Higher Quality? Journal of Management Inquiry 16, 319–331.
Siontis, K. C. M., Evangelou, E., and Ioannidis, J. P. A. (2011). Magnitude of effects in clinical trials published in high-impact general medical journals. International Journal of Epidemiology 40, 1280–1291.
Sønderstrup-Andersen, E. M., and Sønderstrup-Andersen, H. H. K. (2008). An investigation into diabetes researcher's perceptions of the Journal Impact Factor: reconsidering evaluating research. Scientometrics 76, 391–406.
Song, F., Eastwood, A., Gilbody, S., and Duley, L. (1999). The role of electronic journals in reducing publication bias. Informatics for Health and Social Care 24, 223–229.
Starbuck, W. H. (2005). How Much Better Are the Most-Prestigious Journals? The Statistics of Academic Publication. Organization Science 16, 180–200.
Statzner, B., and Resh, V. H. (2010). Negative changes in the scientific publication process in ecology: potential causes and consequences. Freshwater Biology 55, 2639–2653.
Steen, R. G. (2011a). Retractions in the scientific literature: do authors deliberately commit research fraud? Journal of Medical Ethics 37, 113–117.
Steen, R. G. (2011b). Retractions in the scientific literature: is the incidence of research fraud increasing? Journal of Medical Ethics 37, 249–253.
Sutton, J. (2011). Psi study highlights replication problems. The Psychologist News. Available at: http://www.thepsychologist.org.uk/blog/blogpost.cfm?threadid=1984&catid=48 [Accessed March 8, 2012].
Taylor, M., Perakakis, P., and Trachana, V. (2008). The siege of science. Ethics in Science and Environmental Politics 8, 17–40.
The PLoS Medicine Editors (2006). The impact factor game. It is time to find a better way to assess the scientific literature. PLoS Medicine 3, e291.
Todd, P., and Ladle, R. (2008). Hidden dangers of a 'citation culture'. Ethics in Science and Environmental Politics 8, 13–16.
Tressoldi, P. E., Giofré, D., Sella, F., and Cumming, G. (2013). High Impact = High Statistical Standards? Not Necessarily So. PLoS ONE 8, e56180.
Tsikliras, A. (2008). Chasing after the high impact. Ethics in Science and Environmental Politics 8, 45–47.
Upadhyay, J., Baker, S. J., Chandran, P., Miller, L., Lee, Y., Marek, G. J., Sakoglu, U., Chin, C.-L., Luo, F., Fox, G. B., et al. (2011). Default-mode-like network activation in awake rodents. PLoS ONE 6, e27839.
Vanclay, J. K. (2011). Impact factor: outdated artefact or stepping-stone to journal certification? Scientometrics 92, 211–238.
Wager, E., and Williams, P. (2011). Why and how do journals retract articles? An analysis of Medline retractions 1988–2008. Journal of Medical Ethics 37, 567–570.
Wagner, P. D. (2011). What's in a number? Journal of Applied Physiology 111, 951–953.
Walther, A., and Van den Bosch, J. J. F. (2012). FOSE: a framework for open science evaluation. Frontiers in Computational Neuroscience 6, 32.
Weale, A. R., Bailey, M., and Lear, P. A. (2004). The level of non-citation of articles within a journal as a measure of quality: a comparison to the impact factor. BMC Medical Research Methodology 4, 14.
Welberg, L. (2012). Neuroimaging: Rats join the "default mode" club. Nature Reviews Neuroscience 11, 223.
Wicherts, J. M., Kievit, R. A., Bakker, M., and Borsboom, D. (2012). Letting the daylight in: Reviewing the reviewers and other ways to maximize transparency in science. Frontiers in Computational Neuroscience 6, 20.
Yarkoni, T. (2012). Designing next-generation platforms for evaluating scientific output: what scientists can learn from the social web. Frontiers in Computational Neuroscience 6, 72.
Yong, E. (2012). Replication studies: Bad copy. Nature 485, 298–300.
Young, N. S., Ioannidis, J. P. A., and Al-Ubaydli, O. (2008). Why current publication practices may distort science. PLoS Medicine 5, e201.
Yue, W., Wilson, C. S., and Boller, F. (2007). Peer assessment of journal quality in clinical neurology. Journal of the Medical Library Association: JMLA 95, 70–76.
Zimmermann, J., Roebroeck, A., Uludag, K., Sack, A., Formisano, E., Jansma, B., De Weerd, P., and Goebel, R. (2011). Network-based statistics for a community driven transparent publication process. Frontiers in Computational Neuroscience 6, 11.

Suppl. Fig. S1: Impact Factor of the journal "Current Biology" in the years 2002 (above) and 2003 (below), showing a 40% increase in impact. The increase in the IF of "Current Biology" from approx. 7 to almost 12 from one edition of Thomson Reuters' "Journal Citation Reports" to the next is due to a retrospective adjustment of the number of items published (marked), while the actual citations remained relatively constant.
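For readers unfamiliar with the underlying arithmetic, the two-year Impact Factor is a simple ratio, so a retrospective change to its denominator alone can produce a jump of this kind. Below is a minimal worked sketch in the standard Garfield formulation; the citation and item counts are purely illustrative assumptions chosen to reproduce a jump of roughly this magnitude, not the actual JCR figures for Current Biology:

\mathrm{IF}_{2003} = \frac{C_{2003 \to 2001} + C_{2003 \to 2002}}{N_{2001} + N_{2002}}
% Hypothetical numbers for illustration only: with total citations held
% constant at C = 2800, reducing the counted "citable items" in the
% denominator from N = 400 to N = 235 moves the ratio from
% 2800 / 400 = 7.0 to 2800 / 235 = 11.9, i.e. the IF rises by roughly 70%
% although not a single additional citation has accrued.

Here C_{2003→y} denotes the citations received in 2003 by items published in year y, and N_y the number of citable items published in year y; only N enters the denominator, which is why reclassifying items as non-citable inflates the IF directly.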