Conditional Distribution Learning for Graph Classification
Leveraging the diversity and quantity of data provided by various graph-structured data augmentations while preserving intrinsic semantic information is challenging. Additionally, successive layers in a graph neural network (GNN) tend to produce increasingly similar node embeddings, whereas graph contrastive learning aims to increase the dissimilarity between negative pairs of node embeddings. This inevitably results in a conflict between the message-passing mechanism (MPM) of GNNs and the contrastive learning (CL) of negative pairs within intra-views. In this paper, we propose a conditional distribution learning (CDL) method that learns graph representations from graph-structured data for semi-supervised graph classification. Specifically, we present an end-to-end graph representation learning model that aligns the conditional distributions of weakly and strongly augmented features over the original features. This alignment enables the CDL model to effectively preserve intrinsic semantic information when both weak and strong augmentations are applied to graph-structured data. To avoid the conflict between the MPM and the CL of negative pairs, only positive pairs of node representations are retained, measuring the similarity between the original features and the corresponding weakly augmented features. Extensive experiments on several benchmark graph datasets demonstrate the effectiveness of the proposed CDL method.
💡 Research Summary
The paper addresses two fundamental challenges that have limited the effectiveness of graph contrastive learning (GCL) for semi‑supervised graph classification. First, the message‑passing mechanism (MPM) of graph neural networks (GNNs) tends to make node embeddings increasingly similar as layers deepen, while GCL simultaneously tries to push negative pairs apart. When multiple augmented views (weak and strong) of the same graph are used, a node can contribute to both positive and negative contrastive losses, creating a conflict that destabilizes training. Second, strong graph augmentations (e.g., heavy edge perturbation, attribute masking) often destroy the intrinsic semantic information of the original graph, reducing the benefit of data‑driven diversity.
To resolve these issues, the authors propose Conditional Distribution Learning (CDL), a two‑stage semi‑supervised framework that aligns the conditional distributions of weakly and strongly augmented node embeddings with respect to the original embeddings. Specifically, for each node, they define the conditional distribution of an augmented representation given the original features and minimize the discrepancy between the weak-view and strong-view distributions, so that strong augmentations contribute diversity without discarding the semantic information of the original graph. To sidestep the MPM–CL conflict, the contrastive similarity term retains only positive pairs, comparing each node's original features with their weakly augmented counterparts.
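The distribution-alignment idea can be illustrated with a small sketch. This is not the paper's implementation: the cosine-similarity softmax form of the conditional distribution, the temperature `tau`, and the KL-divergence objective are common graph contrastive learning choices assumed here for concreteness.

```python
import numpy as np

def conditional_distribution(aug, orig, tau=0.5):
    # p(j | i): softmax over cosine similarities between the i-th augmented
    # node embedding and all original embeddings. (Assumed form; the paper's
    # exact definition may differ.)
    a = aug / np.linalg.norm(aug, axis=1, keepdims=True)
    o = orig / np.linalg.norm(orig, axis=1, keepdims=True)
    logits = a @ o.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def cdl_alignment_loss(weak, strong, orig, tau=0.5, eps=1e-12):
    # Align the strong view's conditional distribution over the original
    # features with the weak view's, via KL divergence averaged over nodes.
    p_weak = conditional_distribution(weak, orig, tau)
    p_strong = conditional_distribution(strong, orig, tau)
    kl = np.sum(p_strong * (np.log(p_strong + eps) - np.log(p_weak + eps)),
                axis=1)
    return kl.mean()
```

The loss is zero when the two augmented views induce identical distributions over the original embeddings, and grows as strong augmentation drifts away from the semantics preserved by the weak view.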