Bayesian Adaptive Lasso
We propose the Bayesian adaptive Lasso (BaLasso) for variable selection and coefficient estimation in linear regression. The BaLasso is adaptive to the signal level by adopting different shrinkage for different coefficients. Furthermore, we provide a model selection machinery for the BaLasso by assessing the posterior conditional mode estimates, motivated by the hierarchical Bayesian interpretation of the Lasso. Our formulation also permits prediction using a model averaging strategy. We discuss other variants of this new approach and provide a unified framework for variable selection using flexible penalties. Empirical evidence of the attractiveness of the method is demonstrated via extensive simulation studies and data analysis.
💡 Research Summary
The paper introduces the Bayesian Adaptive Lasso (BaLasso), a hierarchical Bayesian extension of the classic Lasso that allows coefficient‑specific shrinkage. In the standard Lasso a single L1 penalty is applied uniformly to all regression coefficients, which can be suboptimal when the true signals vary in magnitude across predictors. BaLasso addresses this by giving each coefficient βj its own shrinkage parameter λj: conditionally on σ², βj follows a Laplace prior with scale σ/λj, written as the usual scale mixture of normals, βj | σ², τj² ∼ N(0, σ²τj²) with τj² | λj ∼ Exp(λj²/2), and a Gamma(a, b) hyper‑prior placed on λj². The resulting posterior is amenable to a Gibbs sampler because every full conditional is in closed form (β is Gaussian, 1/τj² is inverse‑Gaussian, λj² is Gamma, and σ² is inverse‑Gamma).
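The hierarchy above can be sketched as a short Gibbs sampler. This is a minimal illustration, not the paper's reference implementation: the function name `balasso_gibbs`, the default hyperparameters a and b, and the improper 1/σ² prior on σ² are assumptions made for the sketch.

```python
import numpy as np

def balasso_gibbs(X, y, a=1.0, b=0.1, n_iter=2000, seed=0):
    """Sketch of a Gibbs sampler for the BaLasso hierarchy.

    beta_j | sigma^2, tau_j^2 ~ N(0, sigma^2 * tau_j^2)
    tau_j^2 | lam_j         ~ Exp(lam_j^2 / 2)
    lam_j^2                 ~ Gamma(a, rate=b)
    sigma^2                 ~ 1/sigma^2  (improper prior; an assumption here)
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    sigma2, tau2, lam2 = 1.0, np.ones(p), np.ones(p)
    draws = np.zeros((n_iter, p))
    for t in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma^2 A^{-1}), A = X'X + diag(1/tau2)
        A = XtX + np.diag(1.0 / tau2)
        L = np.linalg.cholesky(A)
        mu = np.linalg.solve(A, Xty)
        beta = mu + np.sqrt(sigma2) * np.linalg.solve(L.T, rng.standard_normal(p))
        # 1/tau_j^2 | rest ~ Inverse-Gaussian(sqrt(lam_j^2 sigma^2 / beta_j^2), lam_j^2)
        mean_ig = np.sqrt(lam2 * sigma2 / np.maximum(beta**2, 1e-12))
        tau2 = 1.0 / rng.wald(mean_ig, lam2)
        # lam_j^2 | rest ~ Gamma(a + 1, rate = b + tau_j^2 / 2)
        lam2 = rng.gamma(a + 1.0, 1.0 / (b + tau2 / 2.0))
        # sigma^2 | rest ~ Inv-Gamma((n - 1 + p)/2, ||y - X beta||^2/2 + beta' D^{-1} beta / 2)
        resid = y - X @ beta
        rate = 0.5 * (resid @ resid + np.sum(beta**2 / tau2))
        sigma2 = 1.0 / rng.gamma(0.5 * (n - 1 + p), 1.0 / rate)
        draws[t] = beta
    return draws
```

Because each full conditional is a standard distribution, every step is a direct draw; no Metropolis correction is needed, which is what makes the scale-mixture representation attractive.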
Variable selection is performed via the posterior conditional mode (MAP) of β. Large posterior values of λj force the corresponding βj toward zero, effectively removing the predictor, while small λj allow the coefficient to remain sizable. Thus the model automatically adapts the amount of shrinkage to the signal strength of each variable. In addition to a MAP‑based selection rule, the authors propose a model‑averaging prediction strategy: posterior draws of β are averaged to produce predictive means that incorporate model uncertainty, reducing over‑fitting relative to a single selected model.
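The conditional-mode step amounts to a weighted L1 problem, min 0.5‖y − Xβ‖² + Σj λj|βj|, which a short coordinate-descent routine can solve; coefficients whose penalty exceeds their correlation with the residual are thresholded exactly to zero. In this sketch the function name and the choice to plug in fixed per-coefficient penalties `lam` (e.g. posterior summaries of the λj) are illustrative assumptions, not the paper's exact rule.

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding operator: the scalar solution of an L1-penalized step."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def adaptive_lasso_cd(X, y, lam, n_sweeps=200):
    """Coordinate descent for min 0.5 ||y - X beta||^2 + sum_j lam_j |beta_j|."""
    n, p = X.shape
    beta = np.zeros(p)
    col_ss = (X**2).sum(axis=0)   # per-column sum of squares
    r = y.copy()                  # running residual y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            r += X[:, j] * beta[j]            # remove j's contribution
            z = X[:, j] @ r                   # partial correlation with residual
            beta[j] = soft(z, lam[j]) / col_ss[j]
            r -= X[:, j] * beta[j]            # restore with updated coefficient
    return beta
```

A heavily penalized coefficient (large λj) is driven exactly to zero, while a lightly penalized one stays close to its least-squares value, mirroring the adaptive-shrinkage behavior described above.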
The authors evaluate BaLasso through extensive simulations that vary signal‑to‑noise ratio, dimensionality (p > n), predictor correlation, and sparsity patterns. Compared with the ordinary Lasso, Adaptive Lasso, and a Bayesian Lasso with a common λ, BaLasso consistently achieves higher true‑positive rates, lower false‑positive rates, and smaller root‑mean‑square prediction errors, especially in highly correlated or ultra‑high‑dimensional settings. Real‑world applications—including a benchmark dataset and a genomics expression study—demonstrate that BaLasso selects a parsimonious set of interpretable predictors while delivering competitive or superior predictive performance.
Beyond the core method, the paper discusses extensions. The Gamma‑Gaussian mixture formulation can be generalized to other non‑convex penalties such as SCAD or MCP by choosing alternative hyper‑priors. Hierarchical extensions enable group‑wise or structured sparsity (e.g., group Lasso). For very large p, variational Bayes or sparse Gibbs sampling can be employed to improve scalability. The authors also note that the framework readily accommodates non‑Gaussian likelihoods (binary, count data) by replacing the normal error model with appropriate exponential‑family links.
In summary, BaLasso provides a unified Bayesian framework that (1) delivers coefficient‑specific adaptive shrinkage, (2) offers a principled MAP‑based variable‑selection rule, and (3) incorporates model‑averaging for robust prediction. Empirical results confirm its advantages over existing Lasso variants, and the methodological flexibility opens avenues for further research in high‑dimensional, structured, and non‑standard regression problems.