PAC-Bayesian bounds for sparse regression estimation with exponential weights

We consider the sparse regression model where the number of parameters $p$ is larger than the sample size $n$. The difficulty in high-dimensional problems is to propose estimators achieving a good compromise between statistical and computational performance. The BIC estimator, for instance, performs well from the statistical point of view \cite{BTW07} but can only be computed for values of $p$ of at most a few tens. The Lasso estimator is the solution of a convex minimization problem, hence computable for large values of $p$. However, stringent conditions on the design are required to establish fast rates of convergence for this estimator. Dalalyan and Tsybakov \cite{arnak} propose a method achieving a good compromise between the statistical and computational aspects of the problem. Their estimator can be computed for reasonably large $p$ and enjoys good statistical properties under weak assumptions on the design. However, \cite{arnak} proves sparsity oracle inequalities in expectation for the empirical excess risk only. In this paper, we propose an aggregation procedure similar to that of \cite{arnak} but with improved statistical performance. Our main theoretical result is a sparsity oracle inequality in probability, for the true excess risk, for a version of the exponential weights estimator. We also propose an MCMC method to compute our estimator for reasonably large values of $p$.


💡 Research Summary

The paper addresses the high‑dimensional sparse linear regression setting where the number of covariates p far exceeds the sample size n. Classical model‑selection criteria such as BIC enjoy optimal statistical properties but are computationally infeasible beyond a few dozen variables because they require exhaustive search over 2^p models. Convex‑relaxation methods like the Lasso are scalable, yet they demand stringent conditions on the design matrix (e.g., restricted eigenvalue or compatibility conditions) to achieve fast convergence rates. Dalalyan and Tsybakov introduced an exponential‑weight aggregation scheme that balances statistical efficiency with computational tractability; however, their theoretical guarantees are limited to oracle inequalities in expectation for the empirical excess risk, offering no high‑probability control of the true risk.

In this work the authors propose a refined exponential‑weight estimator within the PAC‑Bayesian framework and prove a sparsity‑oracle inequality that holds with high probability for the true excess risk. The construction proceeds as follows. Let M denote the collection of all subsets of {1,…,p}. For each subset m∈M, the ordinary least‑squares estimator β_m is computed on the selected variables. A prior π(m) is placed on the model space, typically decreasing exponentially with the cardinality |m| to favor sparse models. Given data (X_i, Y_i)_{i=1}^n, the empirical loss L_n(β) = (1/n) ∑_{i=1}^n (Y_i − X_i^T β)^2 is evaluated, and the posterior weight for model m is defined as

  w(m) ∝ π(m)·exp(−λ n L_n(β_m)),

where λ>0 is a temperature parameter. The final estimator is the weighted average

  β̂ = Σ_{m∈M} w(m) β_m.
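The construction above can be computed exactly when p is small enough to enumerate all 2^p subsets. The sketch below is a minimal illustration of that recipe, not the paper's implementation; the parameter names `lam` (the temperature λ) and `alpha` (governing the sparsity prior π(m) ∝ exp(−α·|m|)) are illustrative choices.

```python
import itertools
import numpy as np

def exponential_weights_estimator(X, y, lam=1.0, alpha=1.0):
    """Exact exponential-weights aggregation over all subsets (small p only).

    A sketch of the construction described above: per-model OLS fits,
    a prior decreasing exponentially in |m|, and weights
    w(m) ∝ π(m)·exp(−λ n L_n(β_m)).  Parameter names are illustrative.
    """
    n, p = X.shape
    log_weights, betas = [], []
    for size in range(p + 1):
        for m in itertools.combinations(range(p), size):
            cols = list(m)
            beta = np.zeros(p)
            if cols:
                # OLS restricted to the variables in m
                beta[cols] = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
            empirical_loss = np.mean((y - X @ beta) ** 2)   # L_n(beta_m)
            log_weights.append(-alpha * len(m) - lam * n * empirical_loss)
            betas.append(beta)
    log_weights = np.array(log_weights)
    log_weights -= log_weights.max()        # subtract max for numerical stability
    w = np.exp(log_weights)
    w /= w.sum()                            # normalize the weights
    return np.array(betas).T @ w            # weighted average over all models
```

Working in log space and subtracting the maximum before exponentiating avoids underflow, since n·L_n(β_m) can differ by hundreds between good and bad models.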

The main theorem states that for any confidence level δ∈(0,1), with probability at least 1−δ,

  L(β̂) ≤ L(β_{m*}) + C·
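Since the sum over all 2^p models is intractable for large p, the abstract notes that the estimator can be computed by MCMC. A hedged sketch of one such approach, using a Metropolis-Hastings walk on the model space with a single-coordinate-flip proposal (an illustrative choice, not necessarily the paper's exact chain):

```python
import numpy as np

def mcmc_exponential_weights(X, y, lam=1.0, alpha=1.0, n_iter=2000, seed=0):
    """Approximate the exponential-weights average β̂ by Metropolis-Hastings
    over subsets of {1,…,p}.  The single-flip proposal and the prior
    π(m) ∝ exp(−alpha·|m|) are illustrative, not the paper's exact chain."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    def fit_and_score(mask):
        beta = np.zeros(p)
        cols = np.flatnonzero(mask)
        if cols.size:
            beta[cols] = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
        loss = np.mean((y - X @ beta) ** 2)
        # log of the unnormalized weight: log π(m) − λ n L_n(β_m)
        return beta, -alpha * mask.sum() - lam * n * loss

    mask = np.zeros(p, dtype=bool)             # start from the empty model
    beta, logw = fit_and_score(mask)
    running_sum = np.zeros(p)
    for _ in range(n_iter):
        j = rng.integers(p)                    # propose flipping one variable
        proposal = mask.copy()
        proposal[j] = not proposal[j]
        beta_new, logw_new = fit_and_score(proposal)
        # symmetric proposal: accept with probability min(1, w_new / w_old)
        if np.log(rng.random()) < logw_new - logw:
            mask, beta, logw = proposal, beta_new, logw_new
        running_sum += beta
    return running_sum / n_iter                # ergodic average approximates β̂
```

The ergodic average of the per-model OLS fits along the chain approximates the weighted average Σ_m w(m) β_m without ever normalizing the weights; in practice one would also discard a burn-in period, omitted here for brevity.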

