GLMMLasso: An Algorithm for High-Dimensional Generalized Linear Mixed Models Using L1-Penalization
We propose an L1-penalized algorithm for fitting high-dimensional generalized linear mixed models. Generalized linear mixed models (GLMMs) can be viewed as an extension of generalized linear models for clustered observations. This Lasso-type approach for GLMMs should be used mainly as a variable screening method to reduce the number of variables to below the sample size. We then suggest refitting by maximum likelihood based on the selected variables only. This is an effective correction for problems stemming from the variable screening procedure, which are more severe with GLMMs. We illustrate the performance of our algorithm on simulated as well as real data examples. Supplemental materials are available online, and the algorithm is implemented in the R package glmmixedlasso.
💡 Research Summary
The paper introduces GLMMLasso, a novel algorithm for fitting high‑dimensional generalized linear mixed models (GLMMs) with an L1‑penalty on the fixed‑effects coefficients. GLMMs extend generalized linear models by incorporating random effects to account for clustered or over‑dispersed data, but traditional maximum‑likelihood estimation becomes infeasible when the number of covariates p far exceeds the number of observations n. To address this, the authors augment the negative log‑likelihood with a λ‖β‖₁ penalty, defining the objective
Qλ(β,θ,φ)=−2 log L(β,θ,φ)+λ‖β‖₁,
where L is the GLMM likelihood. Because the likelihood involves an intractable high‑dimensional integral, they employ a Laplace approximation, which replaces the integral with a Gaussian approximation around the mode ũ of the random‑effects distribution. This yields a tractable surrogate objective QLA,λ that is still non‑convex in (β,θ,φ) but amenable to efficient optimization.
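The Laplace step can be made concrete with a minimal numerical sketch. The code below assumes a random-intercept logistic GLMM with independent clusters, so the marginal likelihood factorizes and each cluster needs only a one-dimensional Laplace approximation: a few Newton steps locate the mode ũ of the joint log-density, and the Gaussian approximation around ũ replaces the integral. The function names (`laplace_loglik_cluster`, `Q_lambda`) are illustrative, not the glmmixedlasso API.

```python
import numpy as np

def laplace_loglik_cluster(y, X, beta, sigma2, n_newton=25):
    """Laplace-approximate marginal log-likelihood of one cluster with a
    scalar random intercept u ~ N(0, sigma2) and logit link (illustrative)."""
    u = 0.0
    for _ in range(n_newton):                    # Newton steps toward the mode u~
        eta = X @ beta + u
        mu = 1.0 / (1.0 + np.exp(-eta))
        grad = np.sum(y - mu) - u / sigma2       # d/du of the joint log-density
        hess = -np.sum(mu * (1.0 - mu)) - 1.0 / sigma2
        u -= grad / hess
    eta = X @ beta + u
    joint = (np.sum(y * eta - np.log1p(np.exp(eta)))
             - 0.5 * u**2 / sigma2 - 0.5 * np.log(2 * np.pi * sigma2))
    # Laplace: integral over u ~ exp(joint) * sqrt(2*pi / -hess)
    return joint + 0.5 * np.log(2 * np.pi) - 0.5 * np.log(-hess)

def Q_lambda(y_list, X_list, beta, sigma2, lam):
    """Penalized surrogate objective: -2 * Laplace log-lik + lam * ||beta||_1."""
    ll = sum(laplace_loglik_cluster(y, X, beta, sigma2)
             for y, X in zip(y_list, X_list))
    return -2.0 * ll + lam * np.abs(beta).sum()
```

Note that `Q_lambda` is smooth in β except at zero coordinates, which is exactly why the coordinate-wise scheme described next handles the L1 term via soft-thresholding rather than plain gradient steps.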
The optimization strategy combines the Laplace‑approximated objective with a block coordinate gradient descent (CGD) scheme. For each fixed‑effect coefficient βk the algorithm constructs a quadratic approximation using a positive curvature estimate hk (taken from the diagonal of the Fisher information). The descent direction dk is obtained in closed form via a soft‑thresholding rule that incorporates the sub‑gradient of the L1 term. An Armijo line‑search determines a step size αk guaranteeing monotonic decrease of QLA,λ. Two algorithmic variants are presented: (i) an “exact” version that recomputes the mode ũ after each coordinate update, and (ii) an “approximate” version that treats ũ as fixed within a full sweep over all β’s, dramatically reducing computational cost while preserving accuracy.
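A single coordinate update of this kind can be sketched as follows. The helper takes the smooth part of the objective as a callable `f` (standing in for the Laplace-approximated deviance), a partial derivative `grad_k`, and a positive curvature estimate `h_k`; the direction comes from soft-thresholding the quadratic model, and the Armijo rule backtracks on the penalized objective. The function name and signature are illustrative assumptions, not the paper's code.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * max(abs(z) - lam, 0.0)

def cgd_coordinate_step(f, grad_k, h_k, beta, k, lam,
                        sigma=0.1, delta=0.5, max_halvings=20):
    """One coordinate-gradient-descent update of beta[k] (illustrative).

    f       smooth part of the objective (e.g. -2 * Laplace log-likelihood)
    grad_k  partial derivative of f at beta with respect to beta[k]
    h_k     positive curvature estimate (e.g. Fisher-information diagonal)
    """
    # Minimize the quadratic model grad_k*d + 0.5*h_k*d^2 + lam*|beta_k + d|
    d = soft_threshold(h_k * beta[k] - grad_k, lam) / h_k - beta[k]
    if d == 0.0:
        return beta
    # Predicted decrease of the penalized model, used in the Armijo condition
    Delta = grad_k * d + lam * (abs(beta[k] + d) - abs(beta[k]))
    F0 = f(beta) + lam * abs(beta[k])
    alpha = 1.0
    for _ in range(max_halvings):               # backtracking line search
        trial = beta.copy()
        trial[k] += alpha * d
        if f(trial) + lam * abs(trial[k]) <= F0 + alpha * sigma * Delta:
            return trial
        alpha *= delta
    return beta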
Because L1 shrinkage introduces bias, especially in the presence of random effects, the authors advocate a two‑stage procedure. Stage 1 runs GLMMLasso to screen variables and obtain a reduced active set. Stage 2 refits the model on this active set using ordinary (unpenalized) maximum likelihood, thereby correcting the bias and improving variance component estimates. The paper discusses practical choices for the penalty parameter λ (cross‑validation, information criteria) and provides guidelines for initializing the algorithm (e.g., fitting a penalized GLM to obtain a starting β).
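The two-stage workflow can be illustrated with a deliberately simplified stand-in: Stage 1 below screens variables with a plain L1-penalized logistic GLM solved by proximal gradient descent (in the paper, Stage 1 is GLMMLasso itself, which also handles the random effects), and Stage 2 refits the screened variables by unpenalized Newton-Raphson maximum likelihood. All function names here are hypothetical.

```python
import numpy as np

def lasso_logistic_screen(X, y, lam, n_iter=500):
    """Stage 1 stand-in: L1-penalized logistic regression via proximal
    gradient (ISTA); returns the active set of selected variables."""
    n, p = X.shape
    step = 4.0 / (np.linalg.norm(X, 2) ** 2)     # 1/L for the logistic loss
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        g = X.T @ (mu - y)                       # gradient of the neg. log-lik
        z = beta - step * g
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return np.flatnonzero(beta)

def refit_ml(X, y, active, n_newton=50):
    """Stage 2: unpenalized maximum-likelihood refit on the screened
    variables only, correcting the shrinkage bias of Stage 1."""
    Xa = X[:, active]
    b = np.zeros(len(active))
    for _ in range(n_newton):                    # Newton-Raphson (IRLS) steps
        mu = 1.0 / (1.0 + np.exp(-Xa @ b))
        W = mu * (1.0 - mu)
        H = Xa.T @ (Xa * W[:, None]) + 1e-8 * np.eye(len(active))
        b += np.linalg.solve(H, Xa.T @ (y - mu))
    beta = np.zeros(X.shape[1])
    beta[active] = b
    return beta
```

In the mixed-model setting the same pattern applies, with GLMMLasso producing the active set and an unpenalized GLMM fit (including the variance components) replacing `refit_ml`.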
Extensive simulation studies explore a range of scenarios: varying p/n ratios, different link functions (logit, probit, log), and diverse random‑effects covariance structures (diagonal, block). Results show that GLMMLasso achieves higher true‑positive rates and lower false‑discovery rates than existing methods (e.g., penalized GLMMs based on low‑dimensional approximations, Groll & Tutz 2012). Prediction error (RMSE) and estimation error for variance components are also reduced. Real‑data applications include a genetics data set with thousands of SNP predictors and a medical longitudinal study; in both cases the method identifies a parsimonious set of fixed effects while correctly estimating random‑effect variances.
Implementation is provided in the R package glmmixedlasso. The package incorporates an active‑set strategy to exploit sparsity, supports user‑defined random‑effects structures, and offers functions for λ‑selection, model diagnostics, and post‑selection refitting. The authors make supplemental material and additional simulation results publicly available.
Key contributions are: (1) a scalable algorithm that merges Laplace approximation with coordinate gradient descent for L1‑penalized GLMMs, (2) a principled two‑stage screening‑refitting workflow that mitigates Lasso‑induced bias in mixed‑model contexts, and (3) an open‑source software implementation enabling practitioners to apply high‑dimensional GLMMs to real‑world clustered data. The work fills a gap in the literature where high‑dimensional mixed models have previously been limited to low‑dimensional settings, and it opens avenues for further research on Bayesian sparse mixed models, non‑Gaussian random‑effects distributions, and extensions to multi‑level hierarchical structures.