Structured sparsity-inducing norms through submodular functions

Sparse methods for supervised learning aim at finding good linear predictors from as few variables as possible, i.e., with small cardinality of their supports. This combinatorial selection problem is often turned into a convex optimization problem by replacing the cardinality function by its convex envelope (tightest convex lower bound), in this case the L1-norm. In this paper, we investigate more general set-functions than the cardinality, that may incorporate prior knowledge or structural constraints which are common in many applications: namely, we show that for nondecreasing submodular set-functions, the corresponding convex envelope can be obtained from its Lovász extension, a common tool in submodular analysis. This defines a family of polyhedral norms, for which we provide generic algorithmic tools (subgradients and proximal operators) and theoretical results (conditions for support recovery or high-dimensional inference). By selecting specific submodular functions, we can give a new interpretation to known norms, such as those based on rank-statistics or grouped norms with potentially overlapping groups; we also define new norms, in particular ones that can be used as non-factorial priors for supervised learning.


💡 Research Summary

The paper tackles the fundamental problem of inducing sparsity in high‑dimensional supervised learning, where the goal is to select a small subset of variables that yields a good linear predictor. Traditionally this combinatorial selection problem is relaxed by replacing the cardinality function with its convex envelope, namely the ℓ₁‑norm. While this relaxation is computationally attractive, it completely ignores any prior structural information that may be available about the variables (e.g., groups, hierarchies, overlapping clusters, rank‑based penalties).

The authors propose a unifying framework that replaces the cardinality function with any non‑decreasing submodular set function F. Submodular functions are set functions that exhibit diminishing returns, a property that naturally captures many structural constraints. The key theoretical contribution is the observation that the convex envelope of w ↦ F(Supp(w)) (taken over the unit ℓ∞‑ball) is precisely the Lovász extension ϕ_F of F, evaluated at the vector of absolute values of the coefficients. Formally, the induced norm is

Ω(w) = ϕ_F(|w|),

where |w| denotes the vector of absolute values of the entries of w.
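The Lovász extension can be evaluated by the classical greedy algorithm: sort the entries of |w| in decreasing order and accumulate the marginal gains of F along the resulting chain of sets. The sketch below (an illustration, not the authors' code; the function name `lovasz_extension` is our own) shows this, and checks the special case where F is the cardinality function, which recovers the ℓ₁‑norm.

```python
def lovasz_extension(F, w):
    """Evaluate phi_F(|w|) by the greedy algorithm: sort |w| in
    decreasing order and weight each entry by the marginal gain of F."""
    abs_w = [abs(x) for x in w]
    order = sorted(range(len(w)), key=lambda i: -abs_w[i])
    total = 0.0
    prev = 0.0          # F(S_{k-1}), with F(empty set) = 0
    S = set()
    for i in order:
        S.add(i)
        gain = F(S) - prev   # marginal gain F(S_k) - F(S_{k-1})
        prev += gain
        total += abs_w[i] * gain
    return total

# With F = cardinality, every marginal gain is 1, so the norm
# reduces to the usual l1-norm:
w = [0.5, -2.0, 1.0]
print(lovasz_extension(len, w))  # 3.5 = |0.5| + |-2.0| + |1.0|
```

Choosing other submodular F changes the geometry of the norm; for instance F(S) = min(|S|, 1) weights only the largest entry, yielding the ℓ∞‑norm on this example.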

