Enhancing Partially Aligned Clustering with Semantic-Matching-Based Contrastive Learning

📝 Abstract

Multi-view clustering has been empirically shown to improve learning performance by leveraging the inherent complementary information across multiple views of data. However, in real-world scenarios, collecting strictly aligned views is challenging, and learning from both aligned and unaligned data becomes a more practical solution. Partially View-aligned Clustering (PVC) aims to learn correspondences between misaligned view samples to better exploit the potential consistency and complementarity across views, including both aligned and unaligned data. However, most existing PVC methods fail to leverage unaligned data to capture the shared semantics among samples from the same cluster. Moreover, the inherent heterogeneity of multi-view data induces distributional shifts in representations, leading to inaccuracies in establishing meaningful correspondences between cross-view latent features and, consequently, impairing learning effectiveness. To address these challenges, we propose a Semantic MAtching contRasTive learning model (SMART) for PVC. The main idea of our approach is to alleviate the influence of cross-view distributional shifts, thereby facilitating semantic matching contrastive learning to fully exploit semantic relationships in both aligned and unaligned data. Specifically, we mitigate view distribution shifts by aligning cross-view covariance matrices, which enables the inference of a semantic graph for all data. Guided by the learned semantic graph, we further exploit semantic consistency across views through semantic matching contrastive learning. After the optimization of the above mechanisms, our model smoothly performs semantic matching for different view embeddings instead of the cumbersome view realignment, which enables the learned representations to enjoy richer category-level semantics and stronger robustness. 
Extensive experiments on eight benchmark datasets demonstrate that our method consistently outperforms existing approaches on the PVC problem. The code is available at https://github.com/THPengL/SMART
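
The abstract states that view distribution shifts are mitigated by aligning cross-view covariance matrices, but this excerpt does not give the loss itself. As a rough sketch only, the snippet below assumes a CORAL-style objective: the squared Frobenius distance between the covariance matrices of two views' embeddings. The function names `covariance` and `covariance_alignment_loss` and the toy data are hypothetical and are not taken from the SMART code.

```python
import numpy as np


def covariance(z):
    """Sample covariance (d x d) of embeddings z with shape (n, d)."""
    centered = z - z.mean(axis=0, keepdims=True)
    return centered.T @ centered / (z.shape[0] - 1)


def covariance_alignment_loss(z1, z2):
    """Squared Frobenius distance between the two views' covariance matrices.

    Assumption: a CORAL-style objective; the paper's exact loss may differ.
    """
    diff = covariance(z1) - covariance(z2)
    return float(np.sum(diff ** 2))


rng = np.random.default_rng(0)
z_view1 = rng.normal(size=(200, 4))
# View 2 rescales each feature to mimic a cross-view distribution shift.
z_view2 = rng.normal(size=(200, 4)) * np.array([1.0, 2.0, 0.5, 3.0])

loss_shifted = covariance_alignment_loss(z_view1, z_view2)
loss_same = covariance_alignment_loss(z_view1, z_view1)
print(f"shifted views: {loss_shifted:.3f}, identical views: {loss_same:.3f}")
```

In this sketch the loss depends only on second-order statistics of each view, so minimizing it pulls the two embedding distributions toward a shared covariance structure rather than matching individual samples.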

📄 Content

SMART: Semantic Matching Contrastive Learning for Partially View-Aligned Clustering

Liang Peng†, Yixuan Ye†, Cheng Liu∗, Senior Member, IEEE, Hangjun Che, Senior Member, IEEE, Fei Wang, Zhiwen Yu, Senior Member, IEEE, Si Wu, and Hau-San Wong

Abstract—Multi-view clustering has been empirically shown to improve learning performance by leveraging the inherent complementary information across multiple views of data. However, in real-world scenarios, collecting strictly aligned views is challenging, and learning from both aligned and unaligned data becomes a more practical solution. Partially View-aligned Clustering (PVC) aims to learn correspondences between misaligned view samples to better exploit the potential consistency and complementarity across views, including both aligned and unaligned data. However, most existing PVC methods fail to leverage unaligned data to capture the shared semantics among samples from the same cluster. Moreover, the inherent heterogeneity of multi-view data induces distributional shifts in representations, leading to inaccuracies in establishing meaningful correspondences between cross-view latent features and, consequently, impairing learning effectiveness. To address these challenges, we propose a Semantic MAtching contRasTive learning model (SMART) for PVC. The main idea of our approach is to alleviate the influence of cross-view distributional shifts, thereby facilitating semantic matching contrastive learning to fully exploit semantic relationships in both aligned and unaligned data. Specifically, we mitigate view distribution shifts by aligning cross-view covariance matrices, which enables the inference of a semantic graph for all data. Guided by the learned semantic graph, we further exploit semantic consistency across views through semantic matching contrastive learning. After the optimization of the above mechanisms, our model smoothly performs semantic matching for different view embeddings instead of the cumbersome view realignment, which enables the learned representations to enjoy richer category-level semantics and stronger robustness. Extensive experiments on eight benchmark datasets demonstrate that our method consistently outperforms existing approaches on the PVC problem. The code is available at https://github.com/THPengL/SMART

Index Terms—Multi-View Clustering, Partially View-Aligned Clustering, Contrastive Learning.

Cheng Liu is with the College of Computer Science and Technology, Huaqiao University and the Department of Computer Science, Shantou University (email: chengliu10@gmail.com).
Liang Peng, Yixuan Ye, and Fei Wang are with the Department of Computer Science, Shantou University (email: 23lpeng@stu.edu.cn; 22yxye2@stu.edu.cn; wangfei@stu.edu.cn).
Hangjun Che is with the Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, College of Electronic and Information Engineering, Southwest University, Chongqing, China (email: hjche123@swu.edu.cn).
Zhiwen Yu and Si Wu are with the School of Computer Science and Engineering, South China University of Technology (email: zhwyu@scut.edu.cn, cswusi@scut.edu.cn).
Hau-San Wong is with the Department of Computer Science, City University of Hong Kong (email: cshswong@cityu.edu.hk).
† These authors contributed equally and are co-first authors.
∗ Corresponding author: Cheng Liu

[Figure 1 panels: Partially View-Aligned Data (View 1, View 2); View Distribution Alignment; Cross-View Semantic Guidance Graph (Guide); Semantic Matching Contrastive Learning (Match); Semantically Matched Data]

Fig. 1. An example illustrating the PVC setting and our model: different categories are represented by distinct shapes, and individual instances are characterized by different colors. Solid lines indicate aligned instances, while dashed lines represent unaligned ones. Black arrows indicate attraction between semantic pairs (weighted positive pairs), while red arrows indicate repulsion between negative pairs. Our method first performs view distribution alignment through aligned samples, then constructs a reliable semantic graph with all instances to guide the semantic matching contrastive learning. This enables the model to capture common and complementary semantics from both aligned and unaligned data, thereby generating semantically matched, meaningful, and robust representations.

I. INTRODUCTION

Diverse real-world data can be represented through multiple heterogeneous feature representations derived from distinct modalities, such as images, text, and videos [1]–[6]. These heterogeneous representations are collectively referred to as multi-view data [7]–[10]. Multi-view clustering (MVC) has demonstrated empirical success in enhancing clustering performance by leveraging the inherent complementary characteristics and shared information across different views [11]–[19]. However, existing MVC methods often rely on the idealized assumpt
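
The Fig. 1 caption describes attraction between graph-weighted positive pairs and repulsion between negative pairs, guided by a semantic graph built over all instances. The excerpt does not specify how the graph or the loss is computed, so the following is only an illustrative sketch under stated assumptions: the graph keeps each anchor's top-k cross-view cosine similarities (row-normalized), and the loss is a graph-weighted cross-entropy over a softmax of cross-view similarities. The names `semantic_graph` and `weighted_contrastive_loss`, the parameters `k` and `temperature`, and the toy data are all hypothetical.

```python
import numpy as np


def l2_normalize(z):
    """Scale each row of z to unit Euclidean norm."""
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def semantic_graph(z1, z2, k=2):
    """Hypothetical semantic graph: keep each row's top-k cross-view cosine
    similarities and row-normalize so each anchor's weights sum to 1."""
    sim = l2_normalize(z1) @ l2_normalize(z2).T
    top_k = np.argsort(-sim, axis=1)[:, :k]  # k most similar view-2 samples
    rows = np.arange(sim.shape[0])[:, None]
    graph = np.zeros_like(sim)
    graph[rows, top_k] = np.clip(sim[rows, top_k], 1e-8, None)
    return graph / graph.sum(axis=1, keepdims=True)


def weighted_contrastive_loss(z1, z2, graph, temperature=0.5):
    """Graph-weighted cross-entropy: pairs with graph weight are attracted,
    all other cross-view pairs act as negatives in the softmax denominator."""
    logits = l2_normalize(z1) @ l2_normalize(z2).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-(graph * log_prob).sum(axis=1).mean())


# Toy data: two clusters; view-2 rows are permuted, so row i is not aligned
# with row i of view 1.
rng = np.random.default_rng(0)
centers = np.array([[3.0, 0.0], [0.0, 3.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
perm = np.array([3, 0, 4, 1, 5, 2])
labels_view2 = labels[perm]
z_view1 = centers[labels] + 0.1 * rng.normal(size=(6, 2))
z_view2 = centers[labels_view2] + 0.1 * rng.normal(size=(6, 2))

graph = semantic_graph(z_view1, z_view2, k=2)
loss = weighted_contrastive_loss(z_view1, z_view2, graph)
print(f"graph-weighted contrastive loss: {loss:.3f}")
```

In this toy setup the graph links each view-1 sample to view-2 samples from the same cluster even though the rows are unaligned, which is the kind of category-level matching the caption describes as an alternative to explicit view realignment.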

This content is AI-processed based on ArXiv data.
