Publication Bias in Meta-Analysis: Confidence Intervals for Rosenthals Fail-Safe Number

The purpose of the present paper is to assess the efficacy of confidence intervals for Rosenthal’s fail-safe number. Although Rosenthal’s estimator is highly used by researchers, its statistical properties are largely unexplored. First of all, we developed statistical theory which allowed us to produce confidence intervals for Rosenthal’s fail-safe number.This was produced by discerning whether the number of studies analysed in a meta-analysis is fixed or random. Each case produces different variance estimators. For a given number of studies and a given distribution, we provided five variance estimators. Confidence intervals are examined with a normal approximation and a nonparametric bootstrap. The accuracy of the different confidence interval estimates was then tested by methods of simulation under different distributional assumptions. The half normal distribution variance estimator has the best probability coverage. Finally, we provide a table of lower confidence intervals for Rosenthal’s estimator.

💡 Research Summary

The paper addresses a long‑standing gap in meta‑analysis methodology: the lack of formal confidence intervals for Rosenthal’s fail‑safe number (FSN). While the FSN is widely reported as a simple indicator of how many “null” studies would be needed to overturn a statistically significant meta‑analytic result, its sampling variability has rarely been quantified. The authors first develop a statistical framework that distinguishes two fundamentally different sampling schemes. In the “fixed‑k” scenario the number of primary studies included in the meta‑analysis is treated as a known constant; in the “random‑k” scenario the number of studies is itself a random variable, reflecting the reality that study selection is a stochastic process. For each scenario they derive the expected value and variance of the FSN, showing that the variance depends critically on the distributional assumptions made about the individual study Z‑statistics that enter the FSN formula.

Five variance estimators are proposed: (1) a normal‑distribution based estimator, (2) a t‑distribution estimator that accommodates small‑sample degrees of freedom, (3) a half‑normal estimator that assumes the absolute values of Z‑statistics follow a folded normal distribution, (4) a log‑normal estimator for positively‑skewed effect sizes, and (5) a non‑parametric bootstrap estimator that resamples the observed Z‑values without imposing a parametric form. For each estimator the authors construct confidence intervals using two approaches. The first is a normal approximation, (\hat{FSN} \pm z_{0.975}\sqrt{\widehat{Var}}), which is computationally cheap but relies on asymptotic normality. The second is a percentile bootstrap interval obtained from the empirical distribution of the bootstrapped FSN values.

To evaluate the performance of these intervals, an extensive Monte‑Carlo simulation study is carried out. The simulations vary four key factors: (a) the true mean effect size (0, 0.2, 0.5), (b) between‑study heterogeneity (τ² = 0, 0.1, 0.3), (c) the number of primary studies (k = 5, 20, 50), and (d) the underlying distribution of the Z‑statistics (normal, t, half‑normal, log‑normal). For each combination 10,000 meta‑analyses are generated, the FSN is computed, and the coverage probability of each confidence‑interval method is recorded.

Results reveal that intervals based on the plain normal approximation systematically under‑cover, especially when heterogeneity is high and the number of studies is small; coverage can fall below 80 % in the worst cases. The t‑based estimator improves coverage modestly but remains unreliable under substantial heterogeneity. The half‑normal variance estimator consistently yields coverage rates between 93 % and 96 % across all simulation conditions, making it the most robust choice. The log‑normal estimator performs well only when the true effect size is large and the data are indeed positively skewed. Bootstrap intervals achieve acceptable coverage when a large number of resamples (≥5,000) is used, but they are computationally intensive and can be unstable with limited resampling.

Based on these findings the authors recommend the half‑normal variance estimator as the default for constructing FSN confidence intervals, supplemented by a normal‑approximation correction for ease of reporting. They also provide a practical lookup table that gives lower‑bound confidence limits for a range of k values and assumed effect‑size distributions, enabling researchers to report not only the point estimate of the FSN but also a statistically justified lower bound.

The paper’s contribution is twofold. First, it supplies a rigorous statistical foundation for quantifying the uncertainty of the FSN, moving the metric from a purely descriptive “rule of thumb” to a formal inferential statistic. Second, by explicitly modeling the randomness of the number of studies, the work aligns the FSN with contemporary meta‑analytic practice, where study inclusion is often driven by search strategies, eligibility criteria, and publication bias. The authors argue that reporting FSN confidence intervals will improve transparency and allow readers to assess the robustness of meta‑analytic conclusions against potential unpublished null studies. They suggest future extensions to multivariate effect sizes, Bayesian formulations, and scenarios involving multiple outcomes or subgroup analyses.

💡 Research Summary

📜 Original Paper Content