Kernel Alignment-based Multi-view Unsupervised Feature Selection with Sample-level Adaptive Graph Learning
Although multi-view unsupervised feature selection (MUFS) has demonstrated success in dimensionality reduction for unlabeled multi-view data, most existing methods reduce feature redundancy by focusing on linear correlations among features but often overlook complex nonlinear dependencies. This limits the effectiveness of feature selection. In addition, existing methods fuse similarity graphs from multiple views by employing sample-invariant weights to preserve local structure. However, this process fails to account for differences in local neighborhood clarity among samples within each view, thereby hindering accurate characterization of the intrinsic local structure of the data. In this paper, we propose a Kernel Alignment-based multi-view unsupervised FeatUre selection with Sample-level adaptive graph lEarning method (KAFUSE) to address these issues. Specifically, we first employ kernel alignment with an orthogonal constraint to reduce feature redundancy in both linear and nonlinear relationships. Then, a cross-view consistent similarity graph is learned by applying sample-level fusion to each slice of a tensor formed by stacking similarity graphs from different views, which automatically adjusts the view weights for each sample during fusion. These two steps are integrated into a unified model for feature selection, enabling mutual enhancement between them. Extensive experiments on real multi-view datasets demonstrate the superiority of KAFUSE over state-of-the-art methods.
💡 Research Summary
The paper addresses two critical shortcomings in existing multi‑view unsupervised feature selection (MUFS) approaches. First, most methods only consider linear correlations among features, thereby ignoring complex nonlinear dependencies that cause feature redundancy. Second, when fusing similarity graphs from multiple views, they typically assign a single weight per view for all samples, which fails to capture the fact that the clarity of local neighborhoods can differ dramatically from sample to sample within each view. To overcome these issues, the authors propose KAFUSE (Kernel Alignment‑based multi‑view unsupervised FeatUre selection with Sample‑level adaptive graph lEarning), a unified framework that simultaneously reduces feature redundancy (both linear and nonlinear) and learns a cross‑view consistent similarity graph with sample‑level adaptive weighting.
Kernel‑alignment based redundancy reduction
For each view $v$, the method separates the selected feature subset $X^{c}_{(v)}$ from the unselected subset $X^{u}_{(v)}$. Gaussian kernels are computed on both subsets and centered by the matrix $H = I_n - \frac{1}{n}\mathbf{1}\mathbf{1}^\top$, yielding centered kernel matrices $H K^{c}_{(v)} H$ and $H K^{u}_{(v)} H$. Kernel alignment maximizes the trace $\operatorname{Tr}(H K^{c}_{(v)} H K^{u}_{(v)})$, encouraging the selected features to capture the same similarity structure as the unselected ones. In parallel, an orthogonal constraint $W_{(v)}^\top W_{(v)} = I_c$ is imposed on the view-specific projection matrix, guaranteeing linear independence among the selected features. By combining kernel alignment (which captures nonlinear relationships) with orthogonality (which eliminates linear redundancy), the model effectively partitions redundant information into the unselected set, leaving a compact, informative feature subspace.
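The centered kernel-alignment score described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation; the Gaussian bandwidth `sigma` and the function names are hypothetical choices for the example.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    # Pairwise squared distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a·b
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def kernel_alignment(X_sel, X_unsel, sigma=1.0):
    """Centered kernel alignment Tr(H Kc H Ku) between a selected and an
    unselected feature subset over the same n samples."""
    n = X_sel.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    Kc = gaussian_kernel(X_sel, sigma)        # kernel on selected features
    Ku = gaussian_kernel(X_unsel, sigma)      # kernel on unselected features
    return np.trace(H @ Kc @ H @ Ku)
```

Because $H$ is idempotent and the kernels are symmetric, the score is symmetric in its two arguments and non-negative when the two subsets coincide, which matches its role as a similarity-structure agreement measure.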
Sample‑level adaptive graph learning
All view-specific similarity matrices $S^{(v)}$ are stacked into a third-order tensor $\mathcal{S}\in\mathbb{R}^{n\times n\times V}$. For each sample $j$, a weight vector $\alpha_j\in\mathbb{R}^V$ (non-negative and summing to one) is learned, assigning view-specific importance to that sample. The fused graph $G$ is then constructed as a weighted combination of the slices: $G_{ij}= \sum_{v=1}^{V}\alpha_{j}^{(v)} S^{(v)}_{ij}$. This formulation allows the algorithm to give higher influence to the view that provides a clearer local neighborhood for a particular sample, thereby preserving the intrinsic geometry more faithfully than uniform weighting schemes.
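The fusion rule $G_{ij}= \sum_{v}\alpha_{j}^{(v)} S^{(v)}_{ij}$ amounts to scaling each column $j$ of each view's similarity matrix by that sample's view weight. A minimal sketch, with hypothetical array shapes chosen for the example:

```python
import numpy as np

def fuse_graphs(S, alpha):
    """Fuse view-specific similarity graphs with per-sample view weights.

    S:     (V, n, n) stack of similarity matrices, one per view.
    alpha: (n, V) non-negative weights, each row summing to 1;
           alpha[j, v] is the weight of view v for sample j.
    """
    V, n, _ = S.shape
    G = np.zeros((n, n))
    for v in range(V):
        # Scale column j of view v by sample j's weight for that view,
        # so G[i, j] accumulates alpha[j, v] * S[v][i, j].
        G += S[v] * alpha[:, v][None, :]
    return G
```

Note that, unlike a single scalar weight per view, each column of $G$ can draw on a different mixture of views, which is exactly the sample-level adaptivity the method argues for.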
Unified objective and optimization
The overall objective integrates four components: (1) the kernel-alignment term for redundancy reduction, (2) a Laplacian regularizer $\operatorname{Tr}(F L F^\top)$, where $L$ is derived from the adaptive graph and $F$ is a clustering indicator matrix, (3) sparsity/regularization terms ($\ell_1$/$\ell_2$) to control the number of selected features, and (4) the orthogonal constraint on $W$. The problem is non-convex, and the authors solve it via an alternating optimization scheme: (i) update the binary feature-selection indicator $\Lambda$ given the current graph, (ii) update the projection matrices $W$ under orthogonality, (iii) update the sample-level weights $\alpha$ by solving a constrained quadratic program for each sample, and (iv) update the Laplacian and the clustering indicator $F$. Each sub-problem admits a closed-form solution or an efficient proximal update, guaranteeing a monotonic decrease of the objective and theoretical convergence.
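In step (iii), each $\alpha_j$ is constrained to the probability simplex (non-negative, summing to one), so the per-sample quadratic program is typically handled with a Euclidean projection onto the simplex. The paper's exact update is not reproduced here; the following is a standard sketch of that projection building block (function name hypothetical):

```python
import numpy as np

def project_simplex(y):
    """Euclidean projection of y onto {x : x >= 0, sum(x) = 1}.

    Standard sort-based algorithm: find the largest rho such that the
    shifted sorted entries stay positive, then clip at that threshold.
    """
    u = np.sort(y)[::-1]                      # sort descending
    css = np.cumsum(u)
    k = np.arange(1, len(y) + 1)
    rho = np.nonzero(u + (1.0 - css) / k > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)      # uniform shift
    return np.maximum(y + theta, 0.0)
```

Inside the alternating scheme, the unconstrained minimizer of each sample's quadratic sub-problem would be passed through this projection to obtain a valid weight vector $\alpha_j$.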
Experimental validation
The method is evaluated on six publicly available multi-view datasets (e.g., Caltech-101, NUS-WIDE, Reuters, MSRC-V1) covering image, text, and sensor modalities. Baselines include representative MUFS methods such as GAWFS, RNE, CDMvFS, MAMFS, UKMFS, WLTL, and CE-UMFS. Performance is measured by clustering accuracy (ACC), normalized mutual information (NMI), and purity. KAFUSE consistently outperforms all baselines, achieving improvements of 3–7 percentage points on average. The gains are especially pronounced on datasets where certain views provide ambiguous neighborhoods for many samples (e.g., the texture view for zebras), confirming the benefit of sample-level weighting. Moreover, when the number of selected features is reduced to 10% of the original dimensionality, KAFUSE still retains higher information content than competing methods, demonstrating its effectiveness in aggressive dimensionality reduction.
Contributions and impact
- Introduces a novel sample‑level view weighting mechanism that adapts to the local clarity of each sample, leading to a more faithful cross‑view similarity graph.
- Combines kernel alignment with orthogonal projection to simultaneously mitigate linear and nonlinear feature redundancy, a capability lacking in prior MUFS approaches.
- Provides a unified optimization framework where feature selection and graph learning mutually reinforce each other, together with a provably convergent alternating algorithm.
- Empirically validates the superiority of the proposed method across diverse real‑world multi‑view scenarios.
In summary, KAFUSE advances the state of the art in multi‑view unsupervised feature selection by jointly addressing nonlinear redundancy and heterogeneous local structures, offering a robust and scalable solution for high‑dimensional, multi‑modal data analysis.