The Effects of Research Level and Article Type on the Differences between Citation Metrics and F1000 Recommendations
F1000 recommendations have been validated as a potential data source for research evaluation, but the reasons for differences between F1000 Article Factor (FFa) scores and citations remain to be explored. By linking 28,254 publications in F1000 to their citations in Scopus, we investigated the effects of research level and article type on the internal consistency of assessments based on citations and FFa scores. It turns out that research level has little impact, while article type has a strong effect on the differences. The two measures diverge significantly for two groups: non-primary research and evidence-based research publications are more highly cited than highly recommended, whereas translational research and transformative research publications are more highly recommended by faculty members but gather relatively fewer citations. This is to be expected, because citation is practiced chiefly by academic authors, while an article's potential for scientific revolutions and its suitability for clinical practice must be judged from practitioners' points of view. We conclude with a policy-relevant recommendation: the application of bibliometric approaches in research evaluation procedures should take into account the proportions of three publication types: evidence-based research, transformative research, and translational research. The latter two are better assessed through peer review.
💡 Research Summary
The paper investigates why the F1000 Article Factor (FFa) scores, derived from post‑publication peer recommendations, often diverge from traditional citation counts. By linking 28,254 publications indexed in F1000 to their citation records in Scopus, the authors examine whether the “research level” (basic, clinical, applied) and the “article type” (non‑primary research, evidence‑based research, transformative research, translational research) explain these discrepancies.
Methodologically, the study first classifies each paper along two dimensions. Research level is determined by the primary focus of the work (e.g., fundamental science versus clinical application). Article type is assigned based on content and purpose: non‑primary research includes niche or methodological studies; evidence‑based research comprises systematic reviews or meta‑analyses; transformative research introduces new theories, paradigms, or breakthrough findings; translational research directly addresses clinical practice or policy implementation. The authors then compute average FFa scores and average citation counts for each subgroup and apply one‑way ANOVA with Tukey post‑hoc tests to assess statistical significance. Effect sizes and Pearson correlations are reported to gauge the strength of association between the two metrics.
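The grouping-and-comparison step can be sketched in a few lines of Python. This is a minimal illustration on invented toy data, not the authors' actual pipeline: the article-type labels come from the paper, but the records, scores, and helper functions are assumptions for demonstration (a real replication would pull FFa scores from F1000 and citation counts from Scopus, and use a statistics package for the ANOVA and post-hoc tests).

```python
from statistics import mean
from math import sqrt

# Toy records: (article_type, ffa_score, citation_count).
# All numbers are invented for illustration only.
papers = [
    ("evidence-based", 2.1, 95), ("evidence-based", 1.8, 120),
    ("non-primary", 1.5, 80), ("non-primary", 1.9, 110),
    ("transformative", 6.2, 25), ("transformative", 5.4, 30),
    ("translational", 4.8, 18), ("translational", 5.1, 22),
]

def group_means(records):
    """Average FFa score and citation count per article type."""
    groups = {}
    for atype, ffa, cites in records:
        groups.setdefault(atype, []).append((ffa, cites))
    return {
        atype: (mean(f for f, _ in vals), mean(c for _, c in vals))
        for atype, vals in groups.items()
    }

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

means = group_means(papers)
for atype, (ffa, cites) in sorted(means.items()):
    print(f"{atype:15s} mean FFa={ffa:.2f}  mean citations={cites:.1f}")

r = pearson([p[1] for p in papers], [p[2] for p in papers])
print(f"Pearson r(FFa, citations) = {r:.2f}")
```

With toy data shaped like the paper's findings, the per-group means show the same split (recommendation-rich types have low citation means and vice versa) and the overall Pearson correlation between the two metrics comes out negative.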
The key findings are twofold. First, research level has little impact on the gap between FFa scores and citations. Papers across basic, clinical, and applied domains show similar patterns, suggesting that F1000 reviewers prioritize aspects other than the abstract scientific depth of a study. Second, article type exerts a strong influence. Non‑primary and evidence‑based papers receive higher citation counts but relatively modest FFa scores, reflecting the academic community’s tendency to cite works that extend or validate existing knowledge. In contrast, transformative and translational papers achieve higher FFa scores while accruing fewer citations. Transformative research often proposes radical ideas that require time for the scholarly community to adopt and cite, whereas translational research offers immediate practical value to clinicians and policymakers, which is recognized by F1000 faculty but not necessarily captured by citation metrics.
These results underscore that citation metrics primarily measure scholarly impact within the academic literature, whereas FFa scores capture expert appraisal of relevance, novelty, and practical significance. Consequently, relying on a single metric can misrepresent a work’s overall contribution. The authors recommend that research evaluation frameworks incorporate both dimensions, especially when assessing portfolios that include evidence‑based, transformative, and translational outputs. They further advise that policy makers explicitly track the proportion of each article type in funding and assessment exercises, using peer review for transformative and translational research while allowing citation‑based indicators to dominate for evidence‑based work.
The study acknowledges several limitations. The classification of article types was performed manually, introducing potential subjectivity. The F1000 reviewer pool is skewed toward biomedical fields, which may limit generalizability to other disciplines. Moreover, citation counts are time‑dependent; transformative and translational papers might accumulate citations later, so a snapshot analysis could underestimate their long‑term scholarly impact. Future research should employ automated text‑mining techniques for more objective categorization, expand the dataset to include additional disciplines, and conduct longitudinal analyses to observe how the relationship between FFa scores and citations evolves over time.
In conclusion, the paper demonstrates that while the scientific “level” of a study does not explain differences between citation counts and F1000 recommendations, the nature of the article does. Evidence‑based research is citation‑rich but recommendation‑poor; transformative and translational research is recommendation‑rich but citation‑poor. A balanced evaluation system that combines bibliometrics with expert peer review—tailored to the specific article type—will yield a more nuanced and policy‑relevant assessment of research performance.