Disentangling Visibility and Self-Promotion Bias in the arXiv:astro-ph Positional Citation Effect
We established in an earlier study that articles listed at or near the top of the daily arXiv:astro-ph mailings receive on average significantly more citations than articles further down the list. In that work we could not decide whether this positional citation effect was due to author self-promotion of intrinsically more citable papers or whether papers are cited more often simply because they are at the top of the astro-ph listing. Using new data we can now disentangle the two effects. Based on their submission times, we separate articles into a self-promoted sample and a sample of articles that achieved a high rank on astro-ph by chance, and compare their citation distributions with those of articles at lower astro-ph positions. We find that the positional citation effect is a superposition of self-promotion and visibility bias.
💡 Research Summary
The paper investigates why articles that appear near the top of the daily arXiv:astro‑ph mailing receive more citations than those further down the list. Building on a previous study (Dietrich 2008) that identified a “Positional Citation Effect” (PCE), the author seeks to separate two hypothesized mechanisms: Self‑Promotion (SP) – authors deliberately timing submissions to secure top positions for their most citable work – and Visibility Bias (VB) – readers simply noticing and citing papers that are more prominently displayed. A third hypothesis, Geography Bias (GB), is dismissed because the effect persists across both European and US author samples.
To disentangle SP from VB, the author exploits newly recovered submission‑time stamps from the arXiv server logs. The dataset comprises all astro‑ph e‑prints posted between July 2002 and December 2005 that later appeared in the core astronomy journals (ApJ, A&A, MNRAS, AJ, PASP). Only papers whose first author’s affiliation is in North or South America are retained, thereby minimizing residual GB while preserving a large enough sample.
The key methodological step is to split the papers in the top three positions of each daily astro‑ph list into two groups by submission time: (1) “early” submissions, posted within 300 seconds after the 16:00 EST/EDT deadline, presumed to be self‑promoted; and (2) “late” submissions, posted more than 5400 seconds (1.5 hours) after the deadline, presumed to have reached a high rank by chance. A third, control group consists of papers occupying positions 26–30, regardless of submission time. Citation counts are drawn from NASA’s ADS and analyzed using Zipf plots (citation count against citation rank), which are appropriate for power‑law citation distributions (Redner 1998).
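The sample definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's code: the `Paper` record, its field names, and the `classify` function are hypothetical, and only the thresholds (300 s, 5400 s, positions 1–3 and 26–30) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

EARLY_MAX_S = 300    # "early": within 300 s after the deadline (from the text)
LATE_MIN_S = 5400    # "late": more than 5400 s after the deadline (from the text)


@dataclass
class Paper:
    """Hypothetical record for one e-print; field names are assumptions."""
    position: int     # rank in the daily astro-ph listing (1 = top)
    delay_s: float    # seconds between the deadline and the submission
    citations: int


def classify(paper: Paper) -> Optional[str]:
    """Return 'early', 'late', 'control', or None if the paper is in no sample."""
    if 1 <= paper.position <= 3:
        if 0 <= paper.delay_s <= EARLY_MAX_S:
            return "early"      # presumed self-promoted
        if paper.delay_s > LATE_MIN_S:
            return "late"       # presumed to be at the top by chance
        return None             # top three, but in neither time window
    if 26 <= paper.position <= 30:
        return "control"        # submission time is ignored for the control
    return None
```

Papers in the top three positions that fall between the two time windows are deliberately left unclassified here, which mirrors the gap between the 300-second and 5400-second cuts.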
All three groups display similar Zipf slopes, indicating that the underlying citation distribution follows the same power‑law exponent. However, the vertical offsets differ markedly: early‑submitted top‑position papers have the highest normalization, late‑submitted top‑position papers an intermediate normalization, and the 26–30 control the lowest. Quantitatively, the mean citations per paper are 34.4 ± 1.1 for the early group, 26.2 ± 1.3 for the late group, and 22.0 ± 0.7 for the control. Bootstrap‑derived 68 % confidence intervals confirm that each difference is statistically significant.
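For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a Zipf ranking and a percentile bootstrap for the mean citation count. It assumes a plain list of citation counts per group; the function names, the number of resamples, and the percentile method are illustrative choices, and the paper's exact bootstrap procedure may differ.

```python
import random
from statistics import mean


def zipf_points(citations):
    """Return (rank, citations) pairs; rank 1 is the most cited paper."""
    ordered = sorted(citations, reverse=True)
    return [(rank, c) for rank, c in enumerate(ordered, start=1)]


def bootstrap_mean_interval(citations, n_resamples=2000, level=0.68, seed=0):
    """Return (sample mean, lower bound, upper bound) from a percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(citations)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(citations, k=n)) for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return mean(citations), means[lo_idx], means[hi_idx]
```

Plotting the output of `zipf_points` on log-log axes for each group would show the behaviour described above: parallel curves for equal slopes, shifted vertically when the normalizations differ.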
Further analysis subdivides the post‑deadline interval into several bins. The citation advantage declines gradually as the submission time moves farther from the deadline, confirming a temporal decay of the SP effect. Yet even for papers submitted more than 1.5 hours after the deadline (that is, those whose authors presumably did not chase the “pole position”), the citation rate remains about three standard deviations above the control, demonstrating a persistent VB component.
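As a rough consistency check on the "about three standard deviations" figure, one can combine the mean citation counts quoted above for the late group (26.2 ± 1.3) and the control (22.0 ± 0.7). The calculation assumes the two uncertainties are independent 1σ errors that add in quadrature, which is an assumption made here and not a statement of the paper's method.

```latex
\[
\frac{\bar{c}_{\mathrm{late}} - \bar{c}_{\mathrm{control}}}
     {\sqrt{\sigma_{\mathrm{late}}^{2} + \sigma_{\mathrm{control}}^{2}}}
= \frac{26.2 - 22.0}{\sqrt{1.3^{2} + 0.7^{2}}}
= \frac{4.2}{\sqrt{2.18}}
\approx \frac{4.2}{1.48}
\approx 2.8
\]
```

Under the same assumption, the early group (34.4 ± 1.1) exceeds the late group by 8.2 / √(1.1² + 1.3²) ≈ 8.2 / 1.70 ≈ 4.8 standard deviations.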
The paper also notes a temporal trend: the proportion of papers submitted within the first minute after the deadline rose from 0.5 % in the first half of 2002 to 2.3 % in the second half of 2005, with a similar rise for submissions within the first 5 minutes. The probability of this increase being a random fluctuation is <0.01 %, suggesting a growing community awareness of the advantage conferred by early submission.
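One common way to judge whether such a rise could be a random fluctuation is a pooled two-proportion z-test, sketched below. The summary does not say which test the paper used, and it gives no submission counts, so both the choice of test and the function name are assumptions; any counts passed in are the reader's own.

```python
from math import erfc, sqrt


def two_proportion_p_value(k1, n1, k2, n2):
    """One-sided p-value that proportion k2/n2 exceeds k1/n1 by chance.

    Uses the pooled normal approximation, which is reasonable only when
    the counts are not too small. Assumes the pooled proportion is
    strictly between 0 and 1.
    """
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    return 0.5 * erfc(z / sqrt(2))  # upper-tail probability of a standard normal
```

For example, with purely hypothetical period sizes of 4000 papers each, `two_proportion_p_value(20, 4000, 92, 4000)` compares 0.5 % with 2.3 % and returns a value far below the 0.01 % threshold quoted above.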
In the discussion, the author emphasizes that both mechanisms are real and additive. Self‑Promotion preferentially places intrinsically more citable work at the top, while Visibility Bias inflates citations simply because readers are more likely to notice and cite papers that appear first in a long daily list. The coexistence of these biases has practical implications: citation counts, widely used for evaluating researchers, institutions, and funding decisions, are partially driven by non‑scientific factors.
Potential remedies are explored. Randomizing the order of astro‑ph listings would eliminate VB but would not address the underlying problem of information overload that forces astronomers to skim only the top of the list. A more promising approach is personalization, exemplified by the “arXivsorter” system (Magué & Ménard 2007), which ranks new e‑prints according to their proximity in a co‑authorship network to a user’s selected set of relevant authors. Such a recommender could surface relevant papers regardless of their arbitrary position in the daily list, thereby mitigating both SP and VB effects.
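The idea of ranking by proximity in a co-authorship network can be illustrated with a breadth-first search. This is a toy sketch under stated assumptions and not the arXivsorter algorithm: the graph representation (a dict mapping each author to their co-authors), the use of hop count as the distance, and the function names are all hypothetical.

```python
from collections import deque


def author_distances(coauthors, seeds):
    """Breadth-first search: hops from the nearest seed author to each reachable author."""
    dist = {a: 0 for a in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for neighbor in coauthors.get(current, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist


def rank_papers(papers, coauthors, seeds):
    """Sort (title, authors) pairs by the smallest distance of any author to the seeds."""
    dist = author_distances(coauthors, seeds)

    def score(paper):
        _title, authors = paper
        # Authors not connected to any seed get an infinite distance.
        return min((dist.get(a, float("inf")) for a in authors),
                   default=float("inf"))

    # sorted() is stable, so ties keep their original listing order.
    return sorted(papers, key=score)
```

Because the sort is stable, papers with equal scores keep their order in the daily listing, so in this sketch list position still acts as a tie-breaker.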
In conclusion, the study provides robust empirical evidence that the Positional Citation Effect in astro‑ph is a superposition of self‑promotion and visibility bias. It calls for caution when using raw citation metrics for assessment and suggests that community‑wide tools for personalized paper ranking may be necessary to counteract the inequities introduced by list ordering.