Mixed models for longitudinal left-censored repeated measures
Longitudinal studies can be complicated by left-censored repeated measures. For example, in Human Immunodeficiency Virus (HIV) infection, the assay used to quantify plasma viral load has a lower detection limit. Simple imputation of the detection limit, or of half this limit, for left-censored measures biases the estimates and their standard errors. In this paper, we review two likelihood-based methods proposed to handle left-censoring of the outcome in linear mixed models. We show how to fit these models using SAS Proc NLMIXED and we compare this tool with other programs. Indications and limitations of the programs are discussed and an example in the field of HIV infection is shown.
💡 Research Summary
Longitudinal studies in biomedical research often encounter left‑censored outcomes, that is, measurements that fall below a detection limit and are recorded only as "< L". A naïve approach, replacing censored values with the limit of detection (LOD) or with LOD/2, produces biased estimates of means, variances, and especially of time‑dependent trajectories, because the true distribution of the unobserved values is ignored. The paper tackles this problem within the framework of linear mixed‑effects models (LMMs), the workhorse for analyzing repeated measures with both fixed and random effects.
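The direction of the substitution bias is easy to see in a quick simulation. The sketch below is illustrative only (not from the paper): true log10 viral loads are drawn from a hypothetical normal distribution, and values below log10(50) ≈ 1.70 are replaced by the LOD or by LOD/2, as a naïve analysis would do.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical true log10 viral loads; distribution parameters are assumptions.
true_vals = rng.normal(loc=2.0, scale=1.0, size=10_000)
lod = np.log10(50)          # detection limit of 50 copies/mL on the log10 scale

censored = true_vals < lod

# Naive analysis 1: replace every censored value with the LOD itself.
naive_lod = np.where(censored, lod, true_vals)
# Naive analysis 2: replace with LOD/2 (on the log10 scale: lod - log10(2)).
naive_half = np.where(censored, lod - np.log10(2), true_vals)

print(f"{censored.mean():.0%} of values censored")
print("true mean       :", round(true_vals.mean(), 3))
print("LOD substituted :", round(naive_lod.mean(), 3))   # biased upward
print("LOD/2 subst.    :", round(naive_half.mean(), 3))  # less biased, still off
```

Because every censored value lies below the LOD but the average censored value lies well below it, substituting the LOD (or even LOD/2, here) pulls the mean upward and shrinks the apparent variability, exactly the pattern the summary describes.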
Two likelihood‑based strategies are reviewed. The first constructs a full likelihood that explicitly incorporates the censoring mechanism. For each observed value above the detection limit the contribution is the usual normal density; for each censored observation the contribution is the cumulative distribution function (CDF) of the normal distribution evaluated at the limit. The overall log‑likelihood is therefore a sum of log‑densities for uncensored data and log‑CDF terms for censored data, while the random‑effects structure is retained through the multivariate normal distribution of the subject‑specific intercepts and slopes. Parameter estimation proceeds by direct maximization of this mixed‑effects likelihood.
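The per-observation contributions can be sketched as follows (a Python stand-in for the paper's SAS implementation; for clarity the means are treated as given, whereas in the full mixed model these terms sit inside the integral over the random-effects distribution):

```python
import numpy as np
from scipy.stats import norm

def censored_loglik(y, mu, sigma, censored):
    """Log-likelihood of left-censored normal data.

    y        : observed values; for censored entries, y holds the detection limit
    mu       : model means (fixed effects, plus any conditioning on random effects)
    sigma    : residual standard deviation
    censored : boolean mask, True where the value is only known to be below y
    """
    y, mu, censored = map(np.asarray, (y, mu, censored))
    # Uncensored points contribute the normal log-density;
    # censored points contribute the normal log-CDF evaluated at the limit.
    return np.sum(np.where(censored,
                           norm.logcdf(y, loc=mu, scale=sigma),
                           norm.logpdf(y, loc=mu, scale=sigma)))
```

For example, `censored_loglik([1.7, 2.3], [2.0, 2.1], 0.5, [True, False])` adds a log-CDF term at the limit 1.7 for the censored point and a log-density term for the observed one.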
The second strategy uses the Expectation–Maximization (EM) algorithm. Censored observations are treated as latent continuous variables. In the E‑step the conditional expectation of each censored value, given the current parameter estimates and the censoring interval, is computed using the truncated normal distribution. The M‑step then updates the fixed‑effects coefficients, random‑effects variance components, and residual variance as if the expected values were observed. This iterative scheme converges to the same maximum‑likelihood estimates under regularity conditions, but it can be more stable when the likelihood surface is flat or when the number of random‑effects dimensions is moderate.
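The key E-step quantity is the conditional mean of a censored value given that it lies below the limit. For a normal variable this has a closed form via the inverse Mills ratio, sketched below (an illustration of the standard formula, not the paper's code):

```python
import numpy as np
from scipy.stats import norm

def estep_mean(limit, mu, sigma):
    """E[Y | Y < limit] for Y ~ N(mu, sigma^2): the value a censored
    observation is imputed with in the E-step.

    Uses the truncated-normal mean formula
        E[Y | Y < L] = mu - sigma * phi(a) / Phi(a),  a = (L - mu) / sigma,
    where phi and Phi are the standard normal pdf and cdf.
    """
    a = (limit - mu) / sigma
    return mu - sigma * norm.pdf(a) / norm.cdf(a)
```

At convergence this expectation is evaluated at the current parameter estimates for every censored observation; the M-step then refits the mixed model treating those expectations as data.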
Implementation is demonstrated with SAS PROC NLMIXED, a flexible procedure that accepts user‑defined log‑likelihood functions and integrates over random effects using adaptive Gaussian quadrature or importance sampling. The authors provide explicit SAS code: censored contributions are coded with the LOGCDF function, uncensored contributions with LOGPDF, and random effects are declared via the RANDOM statement. They discuss practical issues such as the choice of starting values, the selection of the quadrature order, convergence criteria, and computational time.
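What the quadrature step amounts to can be sketched for a single random intercept (a Python stand-in for what NLMIXED does internally, simplified to non-adaptive Gauss-Hermite quadrature; the model and all parameter names are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

def subject_loglik(y, x, censored, beta0, beta1, sigma, tau, n_nodes=40):
    """Marginal log-likelihood of one subject's repeated measures under
        y_ij = beta0 + b_i + beta1 * x_ij + e_ij,   b_i ~ N(0, tau^2),
    with left-censoring, integrating b_i out by Gauss-Hermite quadrature."""
    # Probabilists' Hermite rule: sum(w * f(z)) ~ integral of f(z) exp(-z^2/2) dz
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    y, x, censored = map(np.asarray, (y, x, censored))
    lik = 0.0
    for z, w in zip(nodes, weights):
        mu = beta0 + tau * z + beta1 * x          # mean given b_i = tau * z
        ll = np.where(censored,
                      norm.logcdf(y, mu, sigma),  # censored: CDF at the limit
                      norm.logpdf(y, mu, sigma)   # observed: normal density
                      ).sum()
        lik += w / np.sqrt(2 * np.pi) * np.exp(ll)
    return np.log(lik)
```

Each additional random effect adds a dimension to this integral, which is why, as noted below, the computational burden grows quickly with the random-effects dimension.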
A comparative discussion follows. In R, the lme4 package does not support censoring directly, so analysts must resort to ad‑hoc Tobit models or Bayesian MCMC tools such as rstanarm or brms. Bayesian approaches offer great flexibility (e.g., non‑normal error distributions, hierarchical censoring) but require specification of priors and careful convergence diagnostics, and they can be computationally intensive for large cohorts. By contrast, PROC NLMIXED integrates naturally with SAS data‑management pipelines and typically achieves faster convergence for moderate‑size mixed models, though the computational burden grows quickly with the number of random‑effects dimensions because the required numerical integration becomes high‑dimensional.
The methodological exposition is anchored by an applied example: longitudinal plasma HIV‑1 viral load measurements from a cohort of patients on antiretroviral therapy. The assay’s lower limit of detection is 50 copies/mL. The authors fit three models: (1) naïve substitution of LOD, (2) the full likelihood approach via NLMIXED, and (3) the EM‑based approach. Results show that models ignoring censoring underestimate the early decline in viral load and overestimate residual variability. Both likelihood‑based methods produce similar fixed‑effect estimates—a steeper average log‑viral‑load decline per week—and tighter confidence intervals, reflecting more efficient use of the information contained in censored observations. Random‑effects variance components also differ, indicating that accounting for censoring changes the inferred heterogeneity among patients. These differences have direct clinical implications, for instance in assessing the efficacy of a new drug regimen or in determining the optimal timing of treatment switches.
In conclusion, the paper argues that left‑censoring should not be treated as a nuisance to be “fixed” by simple imputation; instead, it must be incorporated into the statistical model. The two maximum‑likelihood strategies reviewed—direct likelihood with CDF terms and EM with truncated‑normal expectations—provide rigorous solutions within the familiar linear mixed‑effects framework. SAS PROC NLMIXED offers a practical, scriptable environment for implementing both approaches, while acknowledging limitations such as computational scalability and the need for careful specification of integration settings. The work supplies a clear roadmap for biostatisticians and clinical researchers who need to extract unbiased, efficient estimates from longitudinal data plagued by detection‑limit censoring.