ChauBoxplot and AdaptiveBoxplot: Two R packages for boxplot-based outlier detection
Tukey’s boxplot is widely used for outlier detection; however, its classic fixed-fence rule tends to flag an excessive number of outliers as the sample size grows. To address this, we introduce two new R packages, ChauBoxplot and AdaptiveBoxplot, which implement more robust and statistically principled outlier detection methods. We illustrate their advantages and practical implications through comprehensive simulation studies and a real-world analysis of provincial university admission rates from China’s National College Entrance Examination. Based on these findings, we provide practical guidance to help practitioners select appropriate boxplot methods, achieving a balance between interpretability and statistical reliability.
💡 Research Summary
This paper addresses a well-known limitation of the classic Tukey boxplot: its fixed-fence rule, which places the fences 1.5 × IQR beyond the quartiles (k = 1.5), flags a growing number of non-outlying observations as the sample size increases, leading to an inflated count of false positives. To remedy this, the authors introduce two new R packages, ChauBoxplot and AdaptiveBoxplot, that embed statistically principled, sample-size-aware outlier detection mechanisms while preserving the familiar visual language of the box-and-whisker plot.
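The fixed-fence behaviour can be made concrete with a short calculation. The Python sketch below is illustrative only and is not part of either R package. It assumes normally distributed data and uses population quartiles, so it ignores sampling variability in Q1 and Q3. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def tukey_flag_probability(k: float = 1.5) -> float:
    """Probability that one N(0, 1) draw lies outside the population fences."""
    q3 = STD_NORMAL.inv_cdf(0.75)   # upper quartile, about 0.6745
    iqr = 2.0 * q3                  # IQR of the standard normal, about 1.349
    upper_fence = q3 + k * iqr      # the lower fence is symmetric about zero
    return 2.0 * (1.0 - STD_NORMAL.cdf(upper_fence))


def expected_false_flags(n: int, k: float = 1.5) -> float:
    """Expected number of clean points flagged in a sample of size n."""
    return n * tukey_flag_probability(k)


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000):
        print(n, round(expected_false_flags(n), 2))
```

Under these assumptions each clean observation is flagged with probability of roughly 0.7%, so the expected number of spurious outliers scales linearly with n even when the data contain no outliers at all.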
ChauBoxplot implements the “Chauvenet-type” boxplot originally proposed by Lin et al. (2026). Instead of a constant k, the fence coefficient is computed as a function of the sample size n.
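A minimal sketch of how a sample-size-aware coefficient can arise, assuming normally distributed data and Chauvenet's criterion (flag a point when fewer than half an observation that extreme is expected among n draws, i.e. a two-sided tail probability of 1/(2n)). Matching the upper fence Q3 + k · IQR to the corresponding normal quantile gives k_n = Φ⁻¹(1 − 1/(4n)) / (2 Φ⁻¹(0.75)) − 0.5. This derivation and the function names are assumptions made for illustration in Python. They are not taken from the ChauBoxplot package and may differ from the expression in Lin et al. (2026).

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def chauvenet_fence_coefficient(n: int) -> float:
    """Hypothetical k_n implied by Chauvenet's criterion for normal data."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # One-sided tail probability of 1/(4n), i.e. 1/(2n) across both tails.
    z_cut = STD_NORMAL.inv_cdf(1.0 - 1.0 / (4.0 * n))
    z_q3 = STD_NORMAL.inv_cdf(0.75)  # upper quartile of N(0, 1)
    # Solve z_q3 + k * (2 * z_q3) = z_cut for k.
    return z_cut / (2.0 * z_q3) - 0.5


def expected_flags_under_normality(n: int) -> float:
    """Expected number of clean N(0, 1) points beyond the fences for size n."""
    k = chauvenet_fence_coefficient(n)
    z_q3 = STD_NORMAL.inv_cdf(0.75)
    upper_fence = z_q3 + k * 2.0 * z_q3
    return n * 2.0 * (1.0 - STD_NORMAL.cdf(upper_fence))


if __name__ == "__main__":
    for n in (10, 50, 100, 1_000, 10_000):
        print(n, round(chauvenet_fence_coefficient(n), 3))
```

Under these assumptions the coefficient increases with n and crosses Tukey's 1.5 between n = 50 and n = 100, while the expected number of flagged clean points stays at 0.5 regardless of sample size.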