Semiparametric Joint Inference for Sensitivity and Specificity at the Youden-Optimal Cut-Off

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Sensitivity and specificity evaluated at an optimal diagnostic cut-off are fundamental measures of classification accuracy when continuous biomarkers are used for disease diagnosis. Joint inference for these quantities is challenging because their estimators are evaluated at a common, data-driven threshold estimated from both diseased and healthy samples, inducing statistical dependence. Existing approaches are largely based on parametric assumptions or fully nonparametric procedures, which may be sensitive to model misspecification or lack efficiency in moderate samples. We propose a semiparametric framework for joint inference on sensitivity and specificity at the Youden-optimal cut-off under the density ratio model. Using maximum empirical likelihood, we derive estimators of the optimal threshold and the corresponding sensitivity and specificity, and establish their joint asymptotic normality. This leads to Wald-type and range-preserving logit-transformed confidence regions. Simulation studies show that the proposed method achieves accurate coverage with improved efficiency relative to existing parametric and nonparametric alternatives across a variety of distributional settings. An analysis of COVID-19 antibody data demonstrates the practical advantages of the proposed approach for diagnostic decision-making.


💡 Research Summary

This paper addresses joint inference for sensitivity and specificity at the Youden‑optimal cut‑off for continuous diagnostic biomarkers. While the Youden index provides a convenient scalar summary of diagnostic accuracy, clinical decisions often require the pair (sensitivity, specificity) evaluated at the cut‑off that maximizes the index. A major difficulty is that the optimal threshold is itself estimated from the same diseased and healthy samples used to compute sensitivity and specificity, which induces dependence between the two estimators. Existing solutions are either fully parametric, relying on strong distributional assumptions that are sensitive to misspecification, or fully nonparametric, which are robust but can lack efficiency in moderate sample sizes.
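To make the target quantities concrete, here is a minimal sketch of the plain empirical (nonparametric) plug-in estimate of the Youden-optimal cut-off and the sensitivity and specificity evaluated at it. It is not the semiparametric estimator the paper proposes. The function name `youden_cutoff`, the convention that larger biomarker values indicate disease, and the simulated normal data are all assumptions made for illustration.

```python
import numpy as np


def youden_cutoff(healthy, diseased):
    """Empirical Youden-optimal cut-off (illustrative, not the paper's method).

    Assumes a subject is classified as diseased when the biomarker exceeds c.
    Returns (cut-off, sensitivity, specificity, Youden index).
    """
    healthy = np.asarray(healthy, dtype=float)
    diseased = np.asarray(diseased, dtype=float)

    # Candidate thresholds: every distinct observed biomarker value.
    candidates = np.unique(np.concatenate([healthy, diseased]))

    # Sensitivity(c) = P(X > c | diseased); specificity(c) = P(X <= c | healthy).
    sens = (diseased[None, :] > candidates[:, None]).mean(axis=1)
    spec = (healthy[None, :] <= candidates[:, None]).mean(axis=1)

    # Youden index J(c) = sensitivity(c) + specificity(c) - 1.
    j = sens + spec - 1.0
    best = int(np.argmax(j))  # first maximizer if there are ties
    return candidates[best], sens[best], spec[best], j[best]


# Hypothetical simulated data: both groups normal, diseased shifted upward.
rng = np.random.default_rng(0)
healthy_sample = rng.normal(loc=0.0, scale=1.0, size=200)
diseased_sample = rng.normal(loc=1.5, scale=1.0, size=200)
c_hat, se_hat, sp_hat, j_hat = youden_cutoff(healthy_sample, diseased_sample)
print(f"cut-off={c_hat:.3f}, sensitivity={se_hat:.3f}, specificity={sp_hat:.3f}")
```

Because the same estimated cut-off enters both the sensitivity and the specificity estimate, the two estimates are statistically dependent, which is the dependence that joint inference has to account for.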

The authors propose a semiparametric framework based on the density‑ratio model (DRM). They start by linking disease status D and biomarker X through a logistic regression:
\(P(D=1\mid X=x)=\dfrac{\exp\{\alpha^{*}+\beta^{\top}q(x)\}}{1+\exp\{\alpha^{*}+\beta^{\top}q(x)\}}\)
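
As a hedged aside on why a logistic link of this kind leads to a density ratio model: assume the link takes the standard logistic form, so that the posterior odds of disease equal \(\exp\{\alpha^{*}+\beta^{\top}q(x)\}\). Write \(\pi=P(D=1)\) for the disease prevalence and \(f_1\), \(f_0\) for the biomarker densities in the diseased and healthy groups. These three symbols, and the shifted intercept \(\alpha\), are notation introduced here for illustration and may differ from the paper's. Bayes' rule then gives

\[
\frac{P(D=1\mid X=x)}{P(D=0\mid X=x)}
=\frac{\pi\,f_1(x)}{(1-\pi)\,f_0(x)}
=\exp\{\alpha^{*}+\beta^{\top}q(x)\}
\quad\Longrightarrow\quad
\frac{f_1(x)}{f_0(x)}=\exp\{\alpha+\beta^{\top}q(x)\},
\qquad
\alpha=\alpha^{*}+\log\frac{1-\pi}{\pi}.
\]

Under these assumptions the log density ratio is linear in the basis function \(q(x)\), while the baseline density \(f_0\) is left unspecified.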

