Bivariate linear mixed models using SAS proc MIXED

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, refer to the original paper on arXiv.

Bivariate linear mixed models are useful when analyzing longitudinal data of two associated markers. In this paper, we present a bivariate linear mixed model including random effects or first-order auto-regressive process and independent measurement error for both markers. Codes and tricks to fit these models using SAS Proc MIXED are provided. Limitations of this program are discussed and an example in the field of HIV infection is shown. Despite some limitations, SAS Proc MIXED is a useful tool that may be easily extendable to multivariate response in longitudinal studies.


💡 Research Summary

The paper addresses a practical need in longitudinal research: the simultaneous analysis of two correlated biomarkers measured repeatedly over time. Univariate linear mixed models (LMMs) are standard for single outcomes, but fitting one per marker ignores the inter‑relationship between the markers. To fill this gap, the authors develop a bivariate linear mixed model that captures within‑subject correlation through traditional random effects (subject‑specific intercepts and slopes), a first‑order autoregressive (AR(1)) process, or both, and that assumes independent measurement error for each marker.
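As a sketch, the model described above can be written as follows. The notation (Y, x, z, β, γ, W, ε, G, τ, ρ, σ) is assumed here for illustration and is not taken from the paper; dropping either the γ term or the W term gives the two simpler variants.

```latex
% Marker k = 1, 2; subject i; j-th measurement taken at time t_{ij}.
\begin{align*}
Y_{ij}^{k} &= x_{ij}^{k\top}\beta^{k} + z_{ij}^{k\top}\gamma_{i}^{k}
              + W_{i}^{k}(t_{ij}) + \varepsilon_{ij}^{k}, \qquad k = 1, 2,\\
\gamma_{i} &= \bigl(\gamma_{i}^{1\top}, \gamma_{i}^{2\top}\bigr)^{\top} \sim \mathcal{N}(0, G),\\
\operatorname{Cov}\bigl\{W_{i}^{k}(s), W_{i}^{k}(t)\bigr\} &= \tau_{k}^{2}\,\rho_{k}^{|t-s|},\\
\varepsilon_{ij}^{k} &\sim \mathcal{N}(0, \sigma_{k}^{2}) \quad \text{independently.}
\end{align*}
```

The association between the two markers enters through the off-diagonal blocks of G, which hold the covariances between the random effects of marker 1 and those of marker 2.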

A central contribution of the work is a step‑by‑step guide for fitting this model using SAS PROC MIXED, a procedure that is not explicitly designed for multivariate responses. The authors begin by reshaping the data from a wide to a long format, creating an indicator variable for the marker (e.g., “Marker”) and a time variable. In the PROC MIXED call, the CLASS statement includes both the subject identifier and the marker variable. The MODEL statement specifies the dependent variable and the fixed effects (overall intercept, time, treatment group, and their interactions). The RANDOM statement defines a bivariate random‑effects structure, typically using TYPE=UN (unstructured) to allow free estimation of the covariance between the two markers’ random intercepts and slopes. The REPEATED statement, with SUBJECT=Subject*Marker and TYPE=AR(1), imposes an AR(1) correlation structure separately for each marker, thereby capturing the serial correlation of repeated measurements.
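The wide-to-long reshaping step can be illustrated with a minimal Python sketch. This is not the authors' SAS code: the field names (`subject`, `time`, `cd4`, `vl`, `marker`, `y`) and the sample values are hypothetical, chosen only to show how each visit becomes one row per measured marker.

```python
def wide_to_long(rows, markers=("cd4", "vl")):
    """Turn one record per (subject, visit) into one record per
    (subject, visit, marker), skipping markers that were not measured."""
    long_rows = []
    for row in rows:
        for name in markers:
            value = row.get(name)
            if value is None:  # marker missing at this visit: emit no row
                continue
            long_rows.append({
                "subject": row["subject"],
                "time": row["time"],
                "marker": name,  # indicator identifying the response
                "y": value,      # single stacked dependent variable
            })
    return long_rows


# Hypothetical wide-format data (made-up values, not from the paper).
wide = [
    {"subject": 1, "time": 0, "cd4": 350.0, "vl": 4.8},
    {"subject": 1, "time": 3, "cd4": 410.0, "vl": None},
    {"subject": 2, "time": 0, "cd4": 290.0, "vl": 5.1},
]
long_data = wide_to_long(wide)
```

In the stacked layout, a single response column `y` can serve as the dependent variable, while `marker` plays the role of the indicator variable that lets fixed effects and covariance parameters differ between the two outcomes.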

The paper supplies complete SAS code, illustrates how to obtain estimates of the covariance between markers, and shows how to test hypotheses about differences in trajectories using ESTIMATE and CONTRAST statements. It also discusses common pitfalls: non‑convergence due to an overly complex covariance matrix, mismatched dimensions when the missing‑data pattern differs between markers, and violations of the multivariate normality assumption. Suggested remedies include simplifying the covariance structure (e.g., using a compound symmetry or Toeplitz form), providing better starting values, or resorting to PROC NLMIXED or PROC MCMC for more flexible modeling.
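To see why simplifying the covariance structure can help convergence, the following Python sketch builds two of the structures mentioned and counts the free parameters each implies for n repeated measurements. The function names are hypothetical and the parameter counts refer to the standard definitions of these structures, not to output reported in the paper.

```python
def ar1(n, variance, rho):
    """AR(1) covariance: variance * rho**|i - j| for measurements i and j."""
    return [[variance * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def compound_symmetry(n, variance, covariance):
    """Compound symmetry: a common variance and a common covariance."""
    return [[variance if i == j else covariance for j in range(n)]
            for i in range(n)]


def n_params(structure, n):
    """Number of free covariance parameters for an n x n matrix."""
    counts = {
        "un": n * (n + 1) // 2,  # unstructured: every variance and covariance
        "toep": n,               # Toeplitz: one parameter per lag 0..n-1
        "ar1": 2,                # variance and autocorrelation
        "cs": 2,                 # variance and common covariance
    }
    return counts[structure]


example = ar1(3, 2.0, 0.5)
# example == [[2.0, 1.0, 0.5], [1.0, 2.0, 1.0], [0.5, 1.0, 2.0]]
```

With eight measurement occasions, for instance, an unstructured matrix has 36 parameters, a Toeplitz matrix has 8, and AR(1) or compound symmetry has 2, which is why the more parsimonious forms are often easier to estimate.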

To demonstrate the method, the authors analyze data from an HIV cohort where CD4+ T‑cell counts and plasma viral load were measured every three months for up to two years. The bivariate model reveals a significant treatment‑by‑time interaction for both outcomes and a covariance between the two markers' random effects whose sign indicates that subjects with higher CD4 trajectories also tend to have more rapid viral load declines. Model fit statistics (AIC, BIC) improve markedly relative to separate univariate models, and predictive performance for future observations is enhanced.
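The comparison between the bivariate model and two separate univariate models can be sketched as below. The numbers are made up for illustration and are not results from the paper. The sketch assumes all models are fitted by maximum likelihood to the same observations, so that the two univariate fits together correspond to a bivariate model with all cross-marker covariances fixed at zero, and their -2 log-likelihoods and parameter counts can be added. Which sample size `n` enters the BIC (observations or subjects) depends on the software convention and is left as an argument.

```python
import math


def aic(minus2loglik, n_params):
    """Akaike information criterion (smaller is better)."""
    return minus2loglik + 2 * n_params


def bic(minus2loglik, n_params, n):
    """Bayesian information criterion with sample size n (smaller is better)."""
    return minus2loglik + n_params * math.log(n)


# Hypothetical fits: (-2 log-likelihood, number of parameters).
univariate_cd4 = (5200.0, 7)
univariate_vl = (4100.0, 7)
bivariate = (9150.0, 18)  # 14 shared parameters plus 4 cross-marker covariances

# Separate models combined: add the -2 log-likelihoods and parameter counts.
separate = (univariate_cd4[0] + univariate_vl[0],
            univariate_cd4[1] + univariate_vl[1])

aic_separate = aic(*separate)  # 9300 + 2 * 14
aic_joint = aic(*bivariate)    # 9150 + 2 * 18
joint_preferred = aic_joint < aic_separate
```

A lower AIC or BIC for the joint fit suggests that the cross-marker covariances improve the fit enough to justify the extra parameters.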

In the discussion, the authors acknowledge limitations of PROC MIXED for multivariate work: computational burden grows quickly with the number of outcomes, and the procedure assumes a multivariate normal distribution of random effects and residuals. They note that irregular missingness or non‑Gaussian markers may require transformations or alternative Bayesian approaches. Nonetheless, they argue that for many applied settings, PROC MIXED offers a relatively accessible and extensible platform for bivariate (and, by extension, multivariate) longitudinal analysis. Future research directions include extending the framework to three or more markers, incorporating nonlinear trajectories, and exploring more sophisticated covariance structures such as time‑varying random effects.

