Bayesian generalized method of moments applied to pseudo-observations in survival analysis
Bayesian inference for survival regression modeling offers numerous advantages, especially for decision-making and external data borrowing, but demands the specification of the baseline hazard function, which may be a challenging task. We propose an alternative approach that does not need the specification of this function. Our approach combines pseudo-observations to convert censored data into longitudinal data with the Generalized Methods of Moments (GMM) to estimate the parameters of interest from the survival function directly. GMM may be viewed as an extension of the Generalized Estimating Equation (GEE) currently used for frequentist pseudo-observations analysis and can be extended to the Bayesian framework using a pseudo-likelihood function. We assessed the behavior of the frequentist and Bayesian GMM in the new context of analyzing pseudo-observations. We compared their performances to the Cox, GEE, and Bayesian piecewise exponential models through a simulation study of two-arm randomized clinical trials. Frequentist and Bayesian GMM gave valid inferences with similar performances compared to the three benchmark methods, except for small sample sizes and high censoring rates. For illustration, three post-hoc efficacy analyses were performed on randomized clinical trials involving patients with Ewing Sarcoma, producing results similar to those of the benchmark methods. Through a simple application of estimating hazard ratios, these findings confirm the effectiveness of this new Bayesian approach based on pseudo-observations and the generalized method of moments. This offers new insights on using pseudo-observations for Bayesian survival analysis.
💡 Research Summary
This paper introduces a novel Bayesian framework for survival regression that avoids the need to specify a baseline hazard function, a common obstacle in traditional Bayesian survival models. The authors first transform right‑censored survival data into pseudo‑observations (POs), which are individual‑level contributions derived from the Kaplan‑Meier estimator at a set of pre‑chosen time points. For each subject i and time tₖ, the PO is defined as y_{ik}=n·\hat S(tₖ)−(n−1)·\hat S^{−i}(tₖ), where \hat S denotes the KM estimate and the superscript −i indicates the leave‑one‑out estimate. These POs lie between 0 and 1 (or become negative after an event) and can be modeled directly as outcomes in a generalized linear model with a complementary log‑log (cloglog) link. Under the Cox proportional hazards formulation S(t|X)=S₀(t)exp(βX), the cloglog transformation yields log(−log S(t|X))=log H₀(t)+βX, allowing the regression coefficient β to be interpreted as a hazard ratio (HR).
While the frequentist literature typically analyses POs using Generalized Estimating Equations (GEE), GEE relies on quasi‑likelihoods and lacks a full likelihood, making a Bayesian extension non‑trivial. The authors therefore adopt the Generalized Method of Moments (GMM), originally proposed by Hansen (1982) for econometrics. In the GMM setting, multiple moment conditions are constructed: the mean model (identical to the GEE mean) and a set of J working correlation structures expressed as linear combinations of basis matrices M₁,…,M_J. For each subject i, a J·P dimensional score vector u_i(β) is built from D_iᵀM_j(y_i−μ_i), where D_i is the derivative of the mean with respect to β and μ_i is the model‑based mean. The sample average U_n(β)=n⁻¹∑_i u_i(β) and its empirical covariance C_n(β)=n⁻¹∑_i u_i(β)u_i(β)ᵀ define the quadratic inference function Q_n(β)=U_n(β)ᵀC_n(β)⁻¹U_n(β). Minimising Q_n(β) yields the GMM estimator β̂, which coincides with the GEE estimator when the same working correlation is used, but can achieve greater efficiency when the weighting matrix captures the true correlation structure.
To embed GMM within a Bayesian paradigm, the authors employ the pseudo‑likelihood approach of Yoon (2009). By the Central Limit Theorem, U_n(β)≈N(0,Σ(β)), so Q_n(β) behaves like a χ² statistic. The pseudo‑likelihood is defined as
\tilde L(y|β) ∝ exp{−½ U_n(β)ᵀ Σ_n(β)⁻¹ U_n(β)} ,
with Σ_n(β) = n⁻¹∑_i u_i(β)u_i(β)ᵀ − n U_n(β)U_n(β)ᵀ. Combining this with a prior π(β) yields a proper posterior p(β|y) ∝ \tilde L(y|β)π(β). Because the cloglog link restricts the support of β (Σ_n must be invertible), the implementation includes safeguards such as small regularisation terms and constrained sampling. Posterior inference is obtained via MCMC (e.g., Hamiltonian Monte Carlo), providing point estimates, credible intervals, and the ability to incorporate external information through informative priors.
The authors evaluate the method through extensive simulations mimicking two‑arm randomized trials. Sample sizes n∈{50,100,200} and censoring rates c∈{20 %,40 %,60 %} are examined. Competing methods are: (1) Cox proportional hazards model with a piecewise‑exponential baseline (frequentist and Bayesian versions), (2) GEE applied to POs, and (3) a fully Bayesian piecewise‑exponential model with a Gamma process prior. Results show that Bayesian GMM delivers unbiased HR estimates and mean‑squared errors comparable to the Cox piecewise‑exponential approach, while maintaining nominal 95 % coverage for moderate to large samples and censoring ≤40 %. In small‑sample/high‑censoring scenarios, Bayesian GMM exhibits slight under‑coverage, yet still outperforms GEE in terms of bias.
Real‑world applicability is demonstrated on three Ewing Sarcoma trials. For each trial, the treatment HR estimated via Bayesian GMM aligns closely with estimates from Cox and Bayesian piecewise‑exponential models (HR≈0.68–0.74, 95 % credible/confidence intervals overlapping). The Bayesian GMM framework also facilitates borrowing strength from historical controls by specifying informative priors on β, a feature not readily available in the frequentist GEE or Cox implementations.
In summary, the paper contributes: (i) a practical Bayesian survival analysis technique that bypasses baseline hazard specification by leveraging pseudo‑observations, (ii) a rigorous translation of GMM into a pseudo‑likelihood suitable for Bayesian computation, and (iii) comprehensive empirical evidence that the method matches or exceeds the performance of established frequentist and Bayesian benchmarks. The authors suggest future extensions to multi‑state models, restricted mean survival time, high‑dimensional covariates with variable selection, and adaptive trial designs that could benefit from the Bayesian GMM’s flexibility and ability to incorporate prior information.
Comments & Academic Discussion
Loading comments...
Leave a Comment