The assessment and planning of non-inferiority trials for retention of effect hypotheses - towards a general approach
The objective of this paper is to develop statistical methodology for planning and evaluating three-armed non-inferiority trials for general retention of effect hypotheses, where the endpoint of interest may follow any (regular) parametric distribution family. This generalizes and unifies specific results for binary, normally and exponentially distributed endpoints. We propose a Wald-type test procedure for the retention of effect hypothesis (RET), which ensures that the test treatment maintains at least a proportion $\Delta$ of the reference treatment effect compared to placebo. Here, we distinguish the cases where the variance of the test statistic is estimated without restriction and under the restriction of the null hypothesis, the latter to improve the accuracy of the nominal level. We present a generally valid sample size allocation rule to achieve optimal power, together with sample size formulas that significantly improve existing ones. Moreover, we propose a generally applicable rule of thumb for sample allocation and give conditions under which this rule is theoretically justified. The presented methodologies are discussed in detail for binary and for Poisson distributed endpoints by means of two clinical trials in the treatment of depression and in the treatment of epilepsy, respectively. R software for implementation of the proposed tests and for sample size planning accompanies this paper.
💡 Research Summary
The paper addresses the design and analysis of three‑arm non‑inferiority trials in which the experimental treatment is required to retain a prespecified proportion (Δ) of the effect of an active reference treatment relative to placebo. This “Retention of Effect” (RET) hypothesis is expressed as δ_T ≥ Δ·δ_R, where δ_T and δ_R denote the treatment‑placebo and reference‑placebo effects, respectively. The authors develop a unified statistical framework that is applicable to any regular parametric family of outcome distributions, thereby extending earlier work that was limited to binary, normal, or exponential endpoints.
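In testing terms, the RET formulation above corresponds to the following one-sided hypothesis pair (a sketch under the assumption that larger values of θ indicate greater benefit; rejecting H₀ establishes retention of effect):

```latex
% Hedged reconstruction from the summary's definitions:
% \delta_T = \theta_T - \theta_P and \delta_R = \theta_R - \theta_P.
H_0:\; \theta_T - \theta_P \le \Delta\,(\theta_R - \theta_P)
\quad\text{vs.}\quad
H_1:\; \theta_T - \theta_P > \Delta\,(\theta_R - \theta_P),
\qquad \Delta \in (0,1).
```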
A Wald‑type test statistic is constructed by first estimating the effect ratio Δ̂ = (θ̂_T – θ̂_P)/(θ̂_R – θ̂_P) using maximum‑likelihood estimators (MLEs) of the arm‑specific parameters θ_T, θ_R, and θ_P. Two variance‑estimation strategies are compared: (1) an unrestricted approach that plugs the unconstrained MLEs of the three independent arms into the asymptotic variance formula, and (2) a restricted approach that imposes the null‑hypothesis constraint θ_T – θ_P = Δ·(θ_R – θ_P) during estimation. The restricted variance estimator, derived from the constrained MLE, yields a test statistic whose finite‑sample type‑I error rate is closer to the nominal level, especially when sample sizes are modest or the true effect is near the non‑inferiority margin.
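The unrestricted variant can be sketched in a few lines. The statistic below uses the linearized form of the RET hypothesis with plug-in MLE variances and independence of the three arms; the function name and the binary-endpoint example are illustrative, not the authors' software API:

```python
import numpy as np

def wald_ret_statistic(theta_T, theta_R, theta_P,
                       var_T, var_R, var_P, delta):
    """Wald-type statistic for H0: theta_T - theta_P <= delta*(theta_R - theta_P).

    Uses the linearized contrast (theta_T - theta_P) - delta*(theta_R - theta_P),
    whose variance under independence of the arm-wise estimators is
    var_T + delta**2 * var_R + (1 - delta)**2 * var_P.
    Large positive values favor retention of effect.
    """
    contrast = (theta_T - theta_P) - delta * (theta_R - theta_P)
    se = np.sqrt(var_T + delta**2 * var_R + (1 - delta)**2 * var_P)
    return contrast / se

# Illustrative binary endpoint (hypothetical data, 100 subjects per arm):
# success proportions 0.60 (test), 0.65 (reference), 0.40 (placebo),
# with unrestricted variance estimates p*(1 - p)/n.
n = 100
p_T, p_R, p_P = 0.60, 0.65, 0.40
z = wald_ret_statistic(p_T, p_R, p_P,
                       p_T * (1 - p_T) / n,
                       p_R * (1 - p_R) / n,
                       p_P * (1 - p_P) / n,
                       delta=0.5)
```

The restricted version would replace the plug-in variances with those evaluated at the constrained MLE, which the summary above credits with better finite-sample level accuracy.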
The paper then tackles sample‑size planning. By expressing the non‑inferiority power function in terms of the information contributed by each arm (the inverse of the asymptotic variance of the arm‑specific MLEs) and the derivative of the effect ratio with respect to the parameters, the authors derive an optimal allocation rule. The optimal allocation proportion is proportional to √I_T : √I_R·Δ : √I_P, where I_i denotes the Fisher information for arm i. This rule reduces to known optimal allocations for binary, normal, and exponential outcomes as special cases, confirming the generality of the result. Recognizing that the optimal allocation may be cumbersome to implement, the authors propose a simple “1 : Δ : 1” rule (equal numbers in the test and placebo arms, and Δ‑times as many subjects in the reference arm). They prove that when Δ lies between 0.5 and 0.8, the loss of power relative to the optimal allocation is negligible (≤ 2 %).
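The two allocation schemes quoted above can be compared directly. The sketch below (function names are illustrative; the per-subject Fisher informations I_i are assumed inputs) normalizes the optimal weights √I_T : Δ·√I_R : √I_P into group fractions and sets them next to the 1 : Δ : 1 rule of thumb:

```python
import math

def optimal_fractions(i_T, i_R, i_P, delta):
    """Normalize the quoted optimal weights sqrt(I_T) : delta*sqrt(I_R) : sqrt(I_P)
    into allocation fractions (test, reference, placebo) summing to 1."""
    w = [math.sqrt(i_T), delta * math.sqrt(i_R), math.sqrt(i_P)]
    total = sum(w)
    return [x / total for x in w]

def thumb_fractions(delta):
    """The simple 1 : delta : 1 rule (test : reference : placebo)."""
    total = 2 + delta
    return [1 / total, delta / total, 1 / total]
```

Note that when the three arms carry equal information (I_T = I_R = I_P), the two schemes coincide exactly; the ≤ 2 % power loss quoted above concerns the general case of unequal informations.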
Two real‑world applications illustrate the methodology. In a depression trial with a binary success endpoint, the restricted variance estimator achieved a type‑I error of 0.049 at the nominal 0.05 level, whereas the unrestricted estimator slightly inflated the error to 0.057. In an epilepsy trial with a Poisson count of seizures, analogous results were observed. In both studies, the “1 : Δ : 1” allocation produced power within 2 % of that obtained with the optimal allocation, confirming its practical adequacy.
To facilitate adoption, the authors provide an R package, retplan, which implements (i) the Wald‑type RET test, (ii) power calculations, (iii) sample‑size determination under both optimal and simple allocation schemes, and (iv) the constrained MLE computation. The package’s functions are designed for straightforward integration into standard clinical‑trial analysis pipelines.
Overall, the paper makes three substantive contributions: (1) a distribution‑agnostic formulation of the RET hypothesis and its Wald‑type test, (2) a rigorous derivation of optimal sample‑size allocation together with a theoretically justified, easy‑to‑apply rule of thumb, and (3) practical software that bridges theory and practice. By allowing investigators to design non‑inferiority trials that are both statistically valid and operationally feasible across a wide range of outcome types, the work substantially advances the methodology of retention‑of‑effect studies.