Structured variable selection in support vector machines
When applying the support vector machine (SVM) to high-dimensional classification problems, we often impose a sparse structure on the SVM to eliminate the influence of irrelevant predictors. The lasso and other variable selection techniques have been successfully used in the SVM to perform automatic variable selection. In some problems there is a natural hierarchical structure among the variables. Thus, in order to have an interpretable SVM classifier, it is important to respect the heredity principle when enforcing sparsity in the SVM. Many variable selection methods, however, do not respect the heredity principle. In this paper we enforce both sparsity and the heredity principle in the SVM by using the structured variable selection (SVS) framework originally proposed in Yuan, Joseph and Zou (2007). We minimize the empirical hinge loss under a set of linear inequality constraints and a lasso-type penalty. The solution always obeys the desired heredity principle and enjoys sparsity. The new SVM classifier can be fitted efficiently because the optimization problem is a linear program. As a further contribution, we present a nonparametric extension of the SVS framework and propose nonparametric heredity SVMs. Simulated and real data are used to illustrate the merits of the proposed method.
💡 Research Summary
The paper addresses a fundamental limitation of sparsity‑inducing support vector machines (SVMs) when applied to high‑dimensional classification problems that possess a natural hierarchical organization among predictors. Traditional approaches such as the Lasso‑penalized SVM achieve variable selection by adding an ℓ₁ penalty to the hinge‑loss objective, but they ignore the heredity principle: an interaction or higher‑order term should be retained only if its constituent main effects are also present. Ignoring this principle can lead to models that are difficult to interpret and that may include spurious interaction terms.
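As a quick illustration, the ℓ₁-penalized hinge objective described above can be evaluated in a few lines of NumPy. This is a sketch, not the paper's code; the function name, penalty weight `lam`, and the toy data are illustrative assumptions:

```python
import numpy as np

def lasso_svm_objective(beta0, beta, X, y, lam):
    """Empirical hinge loss plus an l1 (lasso) penalty on the coefficients.

    X : (n, p) design matrix, y : labels in {-1, +1}, lam : penalty weight.
    """
    margins = y * (beta0 + X @ beta)          # y_i * f(x_i)
    hinge = np.maximum(0.0, 1.0 - margins).mean()
    return hinge + lam * np.abs(beta).sum()

# Toy check: a perfectly separating direction gives zero hinge loss,
# so only the lasso penalty (0.1 * |2.0|) remains.
X = np.array([[1.0, 0.0], [-1.0, 0.0]])
y = np.array([1.0, -1.0])
print(lasso_svm_objective(0.0, np.array([2.0, 0.0]), X, y, lam=0.1))  # → 0.2
```

An ℓ₁ penalty drives some coefficients exactly to zero, but nothing in this objective links an interaction coefficient to its parent main effects, which is precisely the gap the heredity constraints fill.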
To overcome this, the authors adopt the Structured Variable Selection (SVS) framework originally proposed by Yuan, Joseph, and Zou (2007). The key idea is to embed linear inequality constraints that encode the desired hierarchical relationships directly into the SVM optimization problem. Concretely, each predictor (or group of predictors) is assigned a non‑negative coefficient βj. For every interaction term βjk the constraints βjk ≤ βj and βjk ≤ βk are imposed. These constraints guarantee that if an interaction term receives a non‑zero weight, the corresponding main effects must also be non‑zero, thereby enforcing either strong or weak heredity as required.
The resulting optimization problem (written here for the non-negative-coefficient formulation of the summary, with two-way interactions) is:

\[
\min_{\beta_0,\ \beta \ge 0}\ \frac{1}{n}\sum_{i=1}^{n}\Bigl[1 - y_i\Bigl(\beta_0 + \sum_{j}\beta_j x_{ij} + \sum_{j<k}\beta_{jk}\, x_{ij}x_{ik}\Bigr)\Bigr]_{+} \;+\; \lambda\Bigl(\sum_{j}\beta_j + \sum_{j<k}\beta_{jk}\Bigr)
\qquad \text{subject to } \beta_{jk} \le \beta_j,\ \ \beta_{jk} \le \beta_k .
\]

Since the hinge loss is piecewise linear and the heredity constraints are linear inequalities, the whole problem is a linear program.
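The linear-programming structure of the heredity-constrained SVM can be made concrete with an off-the-shelf LP solver. The sketch below, using `scipy.optimize.linprog`, follows the simplified setup in the summary (non-negative coefficients, two main effects plus their interaction); the function name, `lam`, and the synthetic data are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def heredity_svm_lp(X_main, y, lam=0.1):
    """Sketch of a heredity-constrained linear SVM as an LP.

    Variables: [beta0, b1, b2, b12, xi_1..xi_n], with b's non-negative,
    slack variables xi_i >= 1 - y_i * f(x_i), and strong-heredity
    constraints b12 <= b1 and b12 <= b2.
    """
    n = X_main.shape[0]
    # Design: two main effects and their product interaction.
    Z = np.column_stack([X_main[:, 0], X_main[:, 1],
                         X_main[:, 0] * X_main[:, 1]])
    p = 3                       # b1, b2, b12
    nvar = 1 + p + n            # beta0, betas, slacks
    # Objective: (1/n) * sum(xi) + lam * sum(betas); betas are >= 0,
    # so the lasso penalty is just their sum.
    c = np.concatenate([[0.0], lam * np.ones(p), np.ones(n) / n])
    # Margin constraints: -y_i*(beta0 + z_i . b) - xi_i <= -1.
    A_margin = np.zeros((n, nvar))
    A_margin[:, 0] = -y
    A_margin[:, 1:1 + p] = -y[:, None] * Z
    A_margin[:, 1 + p:] = -np.eye(n)
    b_margin = -np.ones(n)
    # Heredity constraints: b12 - b1 <= 0 and b12 - b2 <= 0.
    A_her = np.zeros((2, nvar))
    A_her[0, 3], A_her[0, 1] = 1.0, -1.0
    A_her[1, 3], A_her[1, 2] = 1.0, -1.0
    A_ub = np.vstack([A_margin, A_her])
    b_ub = np.concatenate([b_margin, np.zeros(2)])
    bounds = [(None, None)] + [(0, None)] * (p + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    beta0, betas = res.x[0], res.x[1:1 + p]
    return beta0, betas

# Synthetic data with a genuine interaction effect.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = np.sign(0.8 * X[:, 0] + 0.6 * X[:, 1]
            + 0.5 * X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=60))
beta0, betas = heredity_svm_lp(X, y, lam=0.05)
print(betas)  # the fitted b12 never exceeds b1 or b2
```

Because every constraint is linear, the solver never needs a special-purpose algorithm: the heredity structure is enforced exactly, not approximately, at any solution the LP returns.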