RaDAR: Relation-aware Diffusion-Asymmetric Graph Contrastive Learning for Recommendation

Yixuan Huang∗ (University of Electronic Science and Technology of China, Chengdu, China; wangdaijiayou@163.com), Jiawei Chen∗ (National University of Defense Technology, Changsha, China; cjw@nudt.edu.cn), Shengfan Zhang (Ant Group, Chongqing, China; 15060126965@163.com), Zongsheng Cao† (Tsinghua University, Beijing, China; agiczsr@gmail.com)

∗ These authors contributed equally to this work. † Corresponding author.

Abstract

Collaborative filtering (CF) recommendation has been significantly advanced by integrating Graph Neural Networks (GNNs) and Graph Contrastive Learning (GCL). However, (i) random edge perturbations often distort critical structural signals and degrade semantic consistency across augmented views, and (ii) data sparsity hampers the propagation of collaborative signals, limiting generalization. To tackle these challenges, we propose RaDAR (Relation-aware Diffusion-Asymmetric Graph Contrastive Learning Framework for Recommendation Systems), a novel framework that combines two complementary view generation mechanisms: a graph generative model to capture global structure and a relation-aware denoising model to refine noisy edges. RaDAR introduces three key innovations: (1) asymmetric contrastive learning with global negative sampling to maintain semantic alignment while suppressing noise; (2) diffusion-guided augmentation, which employs progressive noise injection and denoising for enhanced robustness; and (3) relation-aware edge refinement, dynamically adjusting edge weights based on latent node semantics. Extensive experiments on three public benchmarks demonstrate that RaDAR consistently outperforms state-of-the-art methods, particularly under noisy and sparse conditions. Our code is available at our repository¹.

CCS Concepts

• Information systems → Recommender systems; • Mathematics of computing → Graph algorithms.
¹ https://github.com/ohowandanliao/RaDAR

WWW '26, April 13–17, 2026, Dubai, United Arab Emirates. © 2026 Copyright held by the owner/author(s). Publication rights licensed to ACM. ACM ISBN 979-8-4007-2307-0/2026/04. https://doi.org/10.1145/3774904.3792463

Keywords

Recommendation, Diffusion Model, Contrastive Learning, Data Augmentation

ACM Reference Format: Yixuan Huang, Jiawei Chen, Shengfan Zhang, and Zongsheng Cao. 2026. RaDAR: Relation-aware Diffusion-Asymmetric Graph Contrastive Learning for Recommendation. In Proceedings of the ACM Web Conference 2026 (WWW '26), April 13–17, 2026, Dubai, United Arab Emirates. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3774904.3792463

1 Introduction

With the development of Artificial Intelligence [6–8], recommender systems [43] play a vital role in alleviating information overload by learning personalized preferences from sparse user-item interactions. A prevailing approach to recommendation is collaborative filtering (CF) [32], which infers user interests based on historical behavioral patterns. To capture high-order connectivity and structural semantics, recent methods have leveraged Graph Neural Networks (GNNs) [31], which model user-item interactions through message passing on bipartite graphs. These advances have significantly improved recommendation accuracy, particularly in sparse settings.
To further enhance representation learning, Graph Contrastive Learning (GCL) [51] has emerged as a self-supervised paradigm that encourages consistency across multiple augmented views of the interaction graph. By integrating GCL with GNNs, recent models aim to improve robustness against data sparsity and noise. Typical implementations, such as SGL [42], generate graph augmentations through node or edge dropout, while methods like GraphACL [47] introduce asymmetric contrastive objectives to capture multi-hop patterns. In parallel, diffusion-based models [9, 16, 24] have shown promise in improving denoising capacity through iterative noise injection and reconstruction.

Despite these advancements, two fundamental challenges limit current GCL-based recommendation models. Challenge 1 (C1): Structural Semantics Degradation. Standard graph augmentations (e.g., random node/edge dropout) often corrupt essential topological structures, degrading collaborative signals and destabilizing contrastive learning. This structural perturbation compromises semantic consistency between augmented views, hindering effective representation learning. Challenge 2 (C2): Limited Relational Expressiveness. Existing methods predominantly assume homophily, emphasizing one-hop neighborhood alignment. However, real-world user interactions frequently exhibit heterophily or distant homophily, where similar users connect through multi-hop paths with weak direct links.

Figure 1: ACL and diffusion model on a user–item graph, illustrating how standard diffusion misses two-hop monophily where indirectly connected users share similar preferences.
Current models inadequately capture these higher-order relational patterns. While diffusion models enhance noise robustness, they sacrifice fine-grained relational semantics beyond immediate neighborhoods. As illustrated in Fig. 1, two-hop neighbors often share implicit preferences despite weak direct connections, which are not captured by conventional approaches. This raises a key question: How can we design a unified model that preserves structural semantics during augmentation while learning relation-aware representations across multi-hop and heterophilous neighborhoods?

To address recommendation challenges in sparse and noisy scenarios, we propose RaDAR (Relation-aware Diffusion-Asymmetric Graph Contrastive Learning for Recommendation), a contrastive learning framework with two core objectives: preserving structural semantics and enhancing relational expressiveness.

For C1 (structural semantics degradation), RaDAR introduces a diffusion-guided augmentation strategy applying Gaussian noise to node representations with learned denoising. This maintains semantic integrity while generating robust graph views for contrastive learning, reducing overfitting to spurious patterns.

For C2 (limited relational expressiveness), RaDAR employs a dual-view generation architecture combining: (i) a graph generative module based on variational autoencoders, capturing global structural semantics beyond one-hop connections; and (ii) a relation-aware graph denoising module that adaptively reweights edge contributions, preserving fine-grained relational signals. Additionally, RaDAR's asymmetric contrastive objective decouples node identity from structural context, enabling alignment of semantically similar nodes even in heterogeneous neighborhoods.
Comprehensive experiments on both binary-edge and weighted-edge benchmarks, covering public datasets (Last.FM, Yelp, BeerAdvocate) and multi-behavior datasets (Tmall, RetailRocket, IJCAI15), demonstrate that RaDAR consistently outperforms state-of-the-art baselines. The model achieves particularly strong gains under high sparsity and noise conditions, demonstrating its robustness and adaptability across interaction regimes. In summary, our main contributions are threefold:

• We present RaDAR, a dual-view contrastive framework that integrates diffusion-guided augmentation with relation-aware denoising for robust representation learning;
• We design an asymmetric contrastive objective that enhances structural discrimination while mitigating noise via DDR-style diffusion regularization;
• RaDAR achieves consistent state-of-the-art performance across both binary-edge and weighted-edge recommendation settings, demonstrating strong generalization and noise resilience.

2 Preliminaries and Related Work

2.1 Collaborative Filtering Paradigm

Let 𝑈 and 𝑉 denote user and item sets, with interactions encoded in a binary matrix. Graph-based collaborative filtering extracts representations by propagating information across the interaction graph under the homophily principle: users with similar interaction patterns share preferences. Implementations typically employ dual-tower architectures to map users and items into a shared latent space, enabling relevance estimation through similarity matching. This approach captures transitive dependencies in interaction graphs to infer unobserved user-item affinities.

2.2 Self-Supervised Graph Learning

Recent advances in graph neural networks (GNNs) have transformed recommendation systems by explicitly modeling user–item relations.
Core architectures such as PinSage [50], NGCF [38], and LightGCN [14] encode multi-hop relational patterns through graph convolutions, with LightGCN simplifying propagation for higher efficiency. Subsequent works enhance representation learning via multi-intent disentanglement (DGCF [39], DCCF [29]) and adaptive relation discovery (DRAN [41]). Temporal dynamics are further captured by sequence-aware GNNs (DGSR [53], GCE-GNN [40]) that connect historical interactions with evolving preferences.

The fusion of self-supervised learning (SSL) and graph modeling has become a leading paradigm for data-efficient representation learning. Contrastive methods (e.g., SGL [42], GFormer [23]) improve user–item embeddings through view-invariance, while reconstruction-based models (S3-Rec [54]) employ masked interaction prediction. Recent studies extend SSL to cross-domain and multi-modal settings (C2DSR [5], SLMRec [35]), highlighting its versatility in leveraging auxiliary self-supervision.

Positioning and Scope. RaDAR lies within the collaborative filtering (CF) paradigm on bipartite user–item graphs, and is designed to operate consistently under both binary- and weighted-edge regimes. Unlike models that rely on large-scale pretraining or side information, RaDAR unifies two complementary view-generation strategies, generative reconstruction and relation-aware denoising, within a single training pipeline. This unified design generalizes across edge settings via a simple weight toggle, minimizing implementation variance and ensuring controlled, fair evaluation in subsequent experiments.

3 Methodology

In this section, we present the comprehensive architecture of RaDAR, comprising four interconnected components.
The rst comp onent RaDAR: Relation-awar e Diusion-Asymmetric Graph Contrastive Learning for Recommendation WW W ’26, April 13–17, 2026, Dubai, United Arab Emirates ♻ Multi-V iew Generation l = 1 GCNs std mean sigmoid Gaussian noise InfoNCE loss Decoder User-Item Interaction Graph +Noise Forward -Noise Reverse U-I Graph Forward Diffusion process V AGE-base Graph Generative Model Relation-aware Graph Denoising Model Gate Layer Gate Layer Users Items V iew1 Embed. IB loss IB loss Joint Optimization BPR V iew1 Updated Embeddings V iew2 Updated Embeddings Generative Graph view1 Generative Graph view2 l = 1,2,3 V iew1 Denoised Embed V iew2 Denoised Embed Contrastive Learning V iew2 Embed. Reverse-denoising process 📊 Loss Functions 🔵 InfoNCE loss 🔵 IB(Informaion Bottleneck) loss by ACL 🔵 BPR(Bayesian Personalized Rankding) loss ⚡ Diffusion Model based Graph Contrastive Learning 🗒 Multi-V iew Generation Figure 2: RaDAR framework architecture: The left se ction shows two view generators extracting complementar y graph representations. The right section demonstrates the contrastive learning process with diusion model-based graph generation and joint optimization through InfoNCE, IB, and BPR losses. employs a graph message passing enco der to eectively capture lo- cal collab orative relationships b etween users and items. The second component implements a sophisticated user-item graph diusion model. The third component integrates an adaptive framework fea- turing two distinct trainable view generators: one that leverages a graph variational model and another that utilizes relation-aware denoising graph models. The fourth component fo cuses on model optimization through a multi-faceted loss function that incorp orates A CL to bo ost performance, complemente d by diusion mo del-based Graph Contrastive Learning. The o verall architecture of the RaD AR model is illustrated in Figure 2. 
3.1 User-Item Embedding Propagation

We project users and items into a $d$-dimensional latent space through learnable embeddings, denoted as $\mathbf{E}^{(u)} \in \mathbb{R}^{N \times d}$ and $\mathbf{E}^{(v)} \in \mathbb{R}^{M \times d}$ for $N$ users and $M$ items. To capture collaborative signals, we employ a normalized adjacency matrix derived from the interaction matrix:

$$\tilde{\mathbf{A}} = \mathbf{D}_u^{-\frac{1}{2}} \mathbf{A} \mathbf{D}_v^{-\frac{1}{2}} \tag{1}$$

where $\mathbf{D}_u$ and $\mathbf{D}_v$ are diagonal degree matrices for users and items. The embedding propagation process utilizes a multi-layer graph neural network where user and item representations are iteratively refined through message passing:

$$\mathbf{E}_l^{(u)} = \tilde{\mathbf{A}}\, \mathbf{E}_{l-1}^{(v)} + \mathbf{E}_{l-1}^{(u)}, \qquad \mathbf{E}_l^{(v)} = \tilde{\mathbf{A}}^{\top} \mathbf{E}_{l-1}^{(u)} + \mathbf{E}_{l-1}^{(v)} \tag{2}$$

The final embeddings integrate information across all $L$ layers through summation:

$$\mathbf{E}^{(u)} = \sum_{l=0}^{L} \mathbf{E}_l^{(u)}, \qquad \mathbf{E}^{(v)} = \sum_{l=0}^{L} \mathbf{E}_l^{(v)} \tag{3}$$

We compute the preference score between user $u_i$ and item $v_j$ via the inner product of their respective embeddings:

$$\hat{y}_{i,j} = \big(e_i^{(u)}\big)^{\top} e_j^{(v)} \tag{4}$$

3.2 GCL Paradigm

3.2.1 Graph Generative Model as View Generator. We adopt Variational Graph Auto-Encoder (VGAE) [21] for view generation, integrating variational inference with graph reconstruction. The encoder employs a multi-layer GCN for node embeddings, while the decoder reconstructs graph structures using Gaussian-sampled embeddings. The VGAE framework optimizes a multi-component loss function comprising KL-divergence regularization (Eq. 18), a discriminative loss for reconstructing graph structure (Eq. 19), and the Bayesian Personalized Ranking loss (Eq. 20). The complete formulation of the VGAE objective is provided in Appendix B.2 (Eq. 21).

3.2.2 Relation-Aware Graph Denoising for View Generation. Our denoising framework employs a layer-wise edge masking strategy with sparsity constraints to generate clean graph views.
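As a concrete reference for Eqs. (1)–(4), the propagation and scoring steps can be sketched in a few lines of NumPy. This is a minimal sketch with toy shapes, not the authors' implementation:

```python
# Minimal sketch of Eqs. (1)-(4): symmetric normalization, residual
# message passing, layer-sum readout, and inner-product scoring.
import numpy as np

def normalize_adj(A):
    """D_u^{-1/2} A D_v^{-1/2} (Eq. 1); degrees clipped to avoid 0-division."""
    du = np.clip(A.sum(axis=1), 1e-12, None)
    dv = np.clip(A.sum(axis=0), 1e-12, None)
    return A / np.sqrt(du)[:, None] / np.sqrt(dv)[None, :]

def propagate(A, Eu, Ev, L=3):
    """Simultaneous updates of Eq. (2) with the layer sum of Eq. (3)."""
    A_hat = normalize_adj(A)
    sum_u, sum_v = Eu.copy(), Ev.copy()          # layer-0 contribution
    for _ in range(L):
        Eu, Ev = A_hat @ Ev + Eu, A_hat.T @ Eu + Ev   # tuple RHS: both use l-1
        sum_u += Eu
        sum_v += Ev
    return sum_u, sum_v

rng = np.random.default_rng(0)
A = (rng.random((4, 5)) < 0.4).astype(float)     # toy 4-user x 5-item graph
Eu, Ev = rng.normal(size=(4, 8)), rng.normal(size=(5, 8))
U, V = propagate(A, Eu, Ev)
score = U[0] @ V[2]                              # preference score (Eq. 4)
```

Note the tuple assignment, which ensures both updates in Eq. (2) read the layer-$(l-1)$ embeddings, matching the simultaneous form of the equation.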
The core idea is to model edge retention probabilities through reparameterized Bernoulli distributions, where the parameters are learned via relation-aware denoising layers. The layer-wise edge masking is formulated as:

$$\mathbf{A}_l = \mathbf{A} \odot \mathbf{M}_l, \qquad \sum_{l=1}^{L} \|\mathbf{M}_l\|_0 = \sum_{l=1}^{L} \sum_{(u,v) \in \mathcal{E}} \mathbb{I}\big(m_{u,v}^{l} \neq 0\big) \le \tau_{\text{sparse}} \tag{5}$$

where $\mathbf{A}_l$ denotes the masked adjacency matrix at layer $l$, $\mathbf{M}_l$ is the binary mask matrix, and $\tau_{\text{sparse}}$ controls the overall sparsity budget across all layers.

To preserve essential user-item relationships while filtering noise, our denoising layer employs adaptive gating mechanisms:

$$\mathbf{g} = \sigma\big(\mathbf{W}_g[\mathbf{e}_i; \mathbf{e}_j] + \mathbf{b}\big), \qquad \alpha_{i,j}^{l} = f_{\text{att}}\big(\mathcal{G}(\mathbf{e}_i, \mathbf{e}_j) \oplus \mathcal{G}(\mathbf{e}_j, \mathbf{e}_i) \oplus [\mathbf{e}_i; \mathbf{e}_j]\big) \tag{6}$$

where $\mathbf{g}$ represents the adaptive gate vector and $\alpha_{i,j}^{l}$ denotes the attention weight for edge $(i, j)$ at layer $l$. The adaptive feature composition $\mathcal{G}(\cdot, \cdot)$ combines relational context with node embeddings:

$$\mathcal{G}(\mathbf{e}_i, \mathbf{e}_j) = \mathbf{g} \odot \tau\big(\mathbf{W}_{\text{embed}}[\mathbf{e}_i; \mathbf{a}_{r,i}]\big) + (1 - \mathbf{g}) \odot \mathbf{e}_i \tag{7}$$

where $\mathbf{a}_{r,i}$ represents the relational feature vector for node $i$. The framework utilizes a GRU-inspired mechanism [12] for relational filtering and employs a concrete distribution for differentiable edge sampling. The edge sampling process uses a concrete distribution with a hard sigmoid rectification to enable end-to-end training:

$$\mathcal{L}_c = \sum_{l=1}^{L} \sum_{(u_i, v_j) \in \mathcal{E}} \Big(1 - \mathbb{P}_{\sigma}\big(s_{i,j}^{l} \mid \theta_l\big)\Big) \tag{8}$$

where $s_{i,j}^{l}$ represents the edge score and $\theta_l$ are the learnable parameters for layer $l$. The training objective combines concrete distribution regularization with recommendation loss:

$$\mathcal{L}_{\text{den}} = \mathcal{L}_c + \mathcal{L}_{\text{bpr}}^{\text{gen}} + \lambda_2 \|\Theta\|_F^2 \tag{9}$$

where $\mathcal{L}_{\text{bpr}}^{\text{gen}}$ is the BPR loss computed on the denoised graph views, and the last term provides L2 regularization on model parameters $\Theta$.
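The gate and composition of Eqs. (6)–(7) can be sketched for a single edge as follows. This is a hedged illustration: $f_{\text{att}}$ is stood in by a mean-then-sigmoid scorer, the activation $\tau$ by `tanh`, and all weights are random, untrained values:

```python
# Sketch of the adaptive gate (Eq. 6) and feature composition (Eq. 7)
# for one edge (i, j). f_att and tau are illustrative stand-ins.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gate(e_i, e_j, Wg, b):
    """g = sigma(W_g [e_i; e_j] + b) -- first line of Eq. (6)."""
    return sigmoid(Wg @ np.concatenate([e_i, e_j]) + b)

def compose(e_i, a_ri, g, W_embed):
    """G(e_i, e_j) = g * tau(W_embed [e_i; a_{r,i}]) + (1 - g) * e_i (Eq. 7)."""
    return g * np.tanh(W_embed @ np.concatenate([e_i, a_ri])) + (1.0 - g) * e_i

rng = np.random.default_rng(1)
d = 8
e_i, e_j = rng.normal(size=d), rng.normal(size=d)
a_ri, a_rj = rng.normal(size=d), rng.normal(size=d)   # relational features
Wg, b = 0.1 * rng.normal(size=(d, 2 * d)), np.zeros(d)
W_embed = 0.1 * rng.normal(size=(d, 2 * d))

g = gate(e_i, e_j, Wg, b)
h_ij = compose(e_i, a_ri, g, W_embed)                 # G(e_i, e_j)
h_ji = compose(e_j, a_rj, g, W_embed)                 # G(e_j, e_i)
# Stand-in for f_att over the concatenation in Eq. (6), second line:
alpha = sigmoid(np.concatenate([h_ij, h_ji, e_i, e_j]).mean())
```

The convex mix `g * tanh(...) + (1 - g) * e_i` is what lets the layer interpolate between relational context and the raw embedding, which is the gating behavior Eq. (7) describes.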
3.3 Diusion with User-Item Graph Building on diusion models’ noise-to-data generation capabilities[ 16 , 34 , 37 ], we propose a graph diusion framework that transforms the original user-item graph G 𝑢𝑖 into recommendation-optimized subgraphs G ′ 𝑢𝑖 . W e design a for ward-inverse diusion mechanism: forward noise injection gradually degrades node embeddings via Gaussian perturbations, while inverse denoising recovers semantic patterns through learne d transitions. This process enhances robust- ness against interaction noise while learning complex embedding distributions. The restored embeddings produce probability dis- tributions for subgraph reconstruction, establishing an eective diusion paradigm for high-delity recommendation graph gener- ation. 3.3.1 Noise Diusion Process . Our framework introduces a la- tent diusion paradigm for graph representation learning, operating on GCN-derived embeddings rather than graph structures. Let h 𝐿 denote the item embedding from the nal GCN lay er . W e construct a 𝑇 -step Markov chain 𝝌 0: 𝑇 with initial state 𝝌 0 = h ( 𝐿 ) 0 . The for ward process progressively adds Gaussian noise to embed- dings, transforming them towards a standard normal distribution. Through reparameterization techniques (detailed in Appendix A.1), we can directly compute any intermediate state from the initial embedding: 𝝌 𝑡 = √ ¯ 𝛼 𝑡 𝝌 0 + √ 1 − ¯ 𝛼 𝑡 𝝐 , 𝝐 ∼ N ( 0 , I ) (10) T o precisely control noise injection, we implement a linear noise scheduler with hyperparameters 𝑠 , 𝛼 𝑙 𝑜 𝑤 , and 𝛼 𝑢 𝑝 Appendix B.3. The reverse process employs neural networks parameterized by 𝜃 to progressively denoise repr esentations, recovering the original embeddings through learned Gaussian transitions. This denois- ing procedure enables our mo del to capture complex patterns in the graph-derived embeddings while maintaining their structural properties. 3.3.2 Diusion Process Optimization for User-Item Inter- action. 
The optimization objective is formulated to maximize the Evidence Lower Bound (ELBO) of the item embedding likelihood $\boldsymbol{\chi}_0$. Following the diffusion framework in [19], we derive the training objective as:

$$\mathcal{L}_{elbo} = \mathbb{E}_{t \sim \mathcal{U}(1, T)}\, \mathcal{L}_t \tag{11}$$

where $\mathcal{L}_t$ denotes the loss at diffusion step $t$, computed by uniformly sampling timesteps during training. The ELBO comprises two components: (1) a reconstruction term $\mathbb{E}_{q(\boldsymbol{\chi}_1 \mid \boldsymbol{\chi}_0)}\big[\|\hat{\boldsymbol{\chi}}_{\theta}(\boldsymbol{\chi}_1, 1) - \boldsymbol{\chi}_0\|_2^2\big]$ that evaluates the model's denoising capability at $t = 1$, and (2) KL regularization terms governing the reverse process transitions. Following [19], we minimize the KL divergence between the learned reverse distribution $p_{\theta}(\boldsymbol{\chi}_{t-1} \mid \boldsymbol{\chi}_t)$ and the tractable posterior $q(\boldsymbol{\chi}_{t-1} \mid \boldsymbol{\chi}_t, \boldsymbol{\chi}_0)$. The neural network $\hat{\boldsymbol{\chi}}_{\theta}(\cdot)$, implemented as a Multi-Layer Perceptron (MLP), predicts the original embedding $\boldsymbol{\chi}_0$ from noisy embeddings $\boldsymbol{\chi}_t$ and timestep encodings. This formulation preserves the theoretical guarantees of ELBO maximization.

3.4 Contrastive Learning Paradigms

3.4.1 Diffusion-Enhanced Graph Contrastive Learning. We propose a diffusion-augmented contrastive framework leveraging intra-node self-discrimination for self-supervised learning. Given node embeddings $E'$ and $E''$ from two augmented views, we consider augmented views of the same node as positive pairs $(e'_i, e''_i)$, and views of different nodes as negative pairs $(e'_i, e''_{i'})$ where $u_i \neq u_{i'}$. The formulation of the loss function is:

$$\mathcal{L}_{ssl}^{user} = \sum_{u_i \in \mathcal{U}} -\log \frac{\exp\big(s(e'_i, e''_i)/\tau\big)}{\sum_{u_j \in \mathcal{U}} \exp\big(s(e'_i, e''_j)/\tau\big)} \tag{12}$$

where $s(\cdot)$ denotes cosine similarity and $\tau$ represents the temperature parameter. The item-side contrastive loss $\mathcal{L}_{ssl}^{item}$ follows an analogous formulation.
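The per-node InfoNCE objective of Eq. (12) can be sketched as below; this is a generic, batched formulation (mean over nodes rather than a sum, for scale-invariance), not the authors' training code:

```python
# Sketch of Eq. (12): InfoNCE over two views, positives on the diagonal
# of the cosine-similarity matrix, all other rows acting as negatives.
import numpy as np

def info_nce(E1, E2, tau=0.2):
    """Mean of -log softmax(s(e'_i, e''_j)/tau) with positives at j = i."""
    E1 = E1 / np.linalg.norm(E1, axis=1, keepdims=True)
    E2 = E2 / np.linalg.norm(E2, axis=1, keepdims=True)
    sim = E1 @ E2.T / tau                       # s(e'_i, e''_j) / tau
    sim -= sim.max(axis=1, keepdims=True)       # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.diag(log_prob).mean()            # positives: same node index

rng = np.random.default_rng(2)
E1, E2 = rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
loss = info_nce(E1, E2)       # independent random views
aligned = info_nce(E1, E1)    # identical views: diagonal dominates
```

Because the positive pair of a perfectly aligned node has cosine similarity 1, the identical-view loss is much smaller than the random-view loss, which is exactly the consistency pressure the objective exerts.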
The complete self-supervised objective combines both components:

$$\mathcal{L}_{ssl} = \mathcal{L}_{ssl}^{user} + \mathcal{L}_{ssl}^{item} \tag{13}$$

Our diffusion-enhanced augmentation generates denoised views $(\mathcal{V}_1^{den}, \mathcal{V}_2^{den})$ via Markov chains that preserve interaction patterns while suppressing high-frequency noise. The framework implements: (i) Intra-View Alignment ($\mathcal{L}_{\text{intra}}$), which measures the contrastive loss between the original view $\mathcal{V}_i$ and its denoised counterpart $\mathcal{V}_i^{den}$; and (ii) Inter-View Regularization ($\mathcal{L}_{\text{inter}}$), which computes the contrastive loss between different denoised views $\mathcal{V}_1^{den}$ and $\mathcal{V}_2^{den}$. The composite loss integrates these mechanisms:

$$\mathcal{L}_{\text{di-ssl}} = \mathcal{L}_{ssl} + \lambda_1 \mathcal{L}_{\text{intra}} + \lambda_2 \mathcal{L}_{\text{inter}} \tag{14}$$

where $\lambda_1$ and $\lambda_2$ balance view consistency and information diversity. This design enables simultaneous noise suppression and multi-perspective representation learning.

Diffusion–ACL Synergy. Diffusion and asymmetric contrastive learning (ACL) play complementary roles in enhancing representation quality. Diffusion smooths high-frequency noise and refines local neighborhoods, stabilizing latent embeddings. ACL aligns structurally similar nodes across multi-hop relations by decoupling identity from context. Their combination enables RaDAR to suppress noise while preserving monophily-style semantics beyond one hop. Empirical results (Secs. 4.3–4.4, RQ1/RQ2) confirm that diffusion-augmented ACL consistently improves NDCG with comparable or better Recall.

3.4.2 Asymmetric Graph Contrastive Learning. Conventional contrastive frameworks are limited by homophily assumptions [11, 26]. We adopt an asymmetric paradigm [47] for monophily-structural contexts using view-specific dual encoders $f_{\theta}$ and $f_{\xi}$, which generate identity representations $\mathbf{v} = f_{\theta}(G)[v]$ and context representations $\mathbf{u} = f_{\xi}(G)[u]$. An asymmetric predictor $g_{\phi}$ reconstructs neighborhood contexts from node identities, optimizing a contrastive objective (Eq. 27 in Appendix B.4). This formulation achieves two key properties: (1) identity representations preserve node-specific semantics, and (2) context representations encode structural neighborhood patterns. The asymmetric objective naturally accommodates monophily by enabling two-hop neighbors to reconstruct similar contexts through their shared central nodes.

3.5 Model Training

Our framework adopts a hierarchical optimization approach with three coupled stages, as summarized in Table 1.

Phase 1: Unified Multi-Task Learning. We initiate joint optimization:

$$\mathcal{L}_1 = \mathcal{L}_{\text{bpr}} + \lambda_3 \mathcal{L}_{\text{di-ssl}} + \lambda_4 \|\Theta\|_F^2 \tag{15}$$

where $\mathcal{L}_{\text{di-ssl}}$ is the diffusion-based self-supervised loss from Eq. 14, and $\|\Theta\|_F^2$ is L2 regularization.

Phase 2: Representation Distillation. We impose an information bottleneck constraint:

$$\mathcal{L}_{IB} = \mathcal{L}_A\big(G, g_{\phi}(\mathbf{v}), \mathbf{v}, \mathbf{u}\big) = \mathcal{L}_A\big(G, g_{\phi}(\mathbf{y}^*), \mathbf{y}^*, \hat{\mathbf{y}}\big) \tag{16}$$

where $\mathbf{y}^*$ represents historical representations and $\mathcal{L}_A$ is the ACL loss.

Phase 3: View Generator Optimization. We finalize training by optimizing the view generators:

$$\mathcal{L}_{generators} = \mathcal{L}_{gen} + \mathcal{L}_{den} \tag{17}$$

where $\mathcal{L}_{gen}$ is the VGAE graph generation loss (see Eq. 21) and $\mathcal{L}_{den}$ (see Eq. 9) is the relation-aware denoising loss.

3.5.1 DDR-Style Denoising Warmup. Following DDRM [37], a lightweight diffusion regularizer is optionally applied to item embeddings after a brief warmup to enhance ranking robustness and avoid early-stage drift, consistent with DDRM's delayed diffusion supervision.

Table 1: Training phases in our framework.

| Phase | Objective | Params |
|-------|-----------|--------|
| 1 | $\mathcal{L}_{\text{bpr}}$, $\mathcal{L}_{\text{di-ssl}}$, $\|\Theta\|_F^2$ | User-item embeds |
| 2 | $\mathcal{L}_{IB}$ (info. bottleneck) | User-item embeds |
| 3 | $\mathcal{L}_{gen} + \mathcal{L}_{den}$ | View generators |

Table 2: Statistics of weighted-edge datasets.

| Dataset | Users | Items | Links | Interaction Types |
|--------------|--------|--------|-----------|--------------------------------|
| Tmall | 31,882 | 31,232 | 1,451,29 | View, Favorite, Cart, Purchase |
| RetailRocket | 2,174 | 30,113 | 97,381 | View, Cart, Transaction |
| IJCAI15 | 17,435 | 35,920 | 799,368 | View, Favorite, Cart, Purchase |

Table 3: Statistics of the experimental datasets.

| Dataset | Users | Items | Interactions | Density |
|--------------|--------|--------|--------------|----------------------|
| Last.FM | 1,892 | 17,632 | 92,834 | $2.8 \times 10^{-3}$ |
| Yelp | 42,712 | 26,822 | 182,357 | $1.6 \times 10^{-4}$ |
| BeerAdvocate | 10,456 | 13,845 | 1,381,094 | $9.5 \times 10^{-3}$ |

4 Experimental Evaluation

To rigorously evaluate the proposed model, we design experiments to investigate four critical aspects:

• RQ1: How does RaDAR perform against state-of-the-art recommendation baselines in benchmark comparisons?
• RQ2: What is the individual contribution of key components to the model's effectiveness across diverse datasets? (Ablation Analysis)
• RQ3: How robust is RaDAR in handling data sparsity and noise compared to conventional approaches?
• RQ4: How do critical hyperparameters influence the model's performance characteristics?

4.1 Model Complexity Analysis

We summarize the dominant costs on a bipartite graph with $|\mathcal{U}|$ users, $|\mathcal{V}|$ items, $|\mathcal{E}|$ edges, embedding size $d$, and $L$ propagation layers.

Time Complexity. Per epoch, RaDAR has three components:

• Sparse propagation: $L$ GCN-style message-passing layers cost $O(|\mathcal{E}| d L)$.
• Contrastive objective: the implementation computes all-pairs similarities within a mini-batch (users + items, size $B$), yielding $O(B^2 d)$ per step.
• Diffusion regularization (optional): one denoising pass on item embeddings costs $O(|\mathcal{V}| d d')$ (or $O(S B d')$ if step-based), typically smaller than propagation when $|\mathcal{E}|$ is large.
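To make the optional diffusion pass concrete, the forward noising of Eq. (10) and a single MLP denoising call can be sketched as follows. The linear $\bar{\alpha}$ schedule endpoints and the untrained one-hidden-layer MLP are illustrative assumptions, and timestep encodings are omitted for brevity:

```python
# Sketch of one diffusion pass over item embeddings: forward noising per
# Eq. (10) plus an (untrained) MLP standing in for the denoiser chi_hat_theta.
import numpy as np

def alpha_bar(t, T, lo=0.1, hi=0.9):
    """Assumed linear cumulative-alpha schedule decaying from hi to lo."""
    return hi - (hi - lo) * t / T

def forward_noise(x0, t, T, rng):
    """chi_t = sqrt(abar_t) chi_0 + sqrt(1 - abar_t) eps  (Eq. 10)."""
    ab = alpha_bar(t, T)
    eps = rng.normal(size=x0.shape)
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps

def denoise(xt, W1, W2):
    """One-hidden-layer MLP predicting chi_0 from chi_t (ELBO recon. term)."""
    return np.maximum(W1 @ xt.T, 0.0).T @ W2     # ReLU hidden layer

rng = np.random.default_rng(3)
d, T = 8, 50
x0 = rng.normal(size=(10, d))                    # item embeddings h^(L)
xt = forward_noise(x0, t=25, T=T, rng=rng)       # mid-chain noisy state
W1, W2 = 0.1 * rng.normal(size=(16, d)), 0.1 * rng.normal(size=(16, d))
x0_hat = denoise(xt, W1, W2)
recon = np.mean((x0_hat - x0) ** 2)              # ||chi_hat - chi_0||^2 term
```

The pass touches each item embedding once, which is where the $O(|\mathcal{V}| d d')$ term in the bullet above comes from ($d' = 16$ here).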
Overall, training is dominated by sparse propagation; the contrastive and diffusion terms add a modest, data- and mini-batch-dependent overhead.

Table 4: Performance metrics for various models. For each dataset the columns are Recall@20, NDCG@20, Recall@40, NDCG@40.

| Model | Last.FM R@20 | N@20 | R@40 | N@40 | Yelp R@20 | N@20 | R@40 | N@40 | BeerAdvocate R@20 | N@20 | R@40 | N@40 |
|----------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| BiasMF | 0.1879 | 0.1362 | 0.2660 | 0.1653 | 0.0532 | 0.0264 | 0.0802 | 0.0321 | 0.0996 | 0.0856 | 0.1602 | 0.1016 |
| NCF | 0.1130 | 0.0795 | 0.1693 | 0.0952 | 0.0304 | 0.0143 | 0.0487 | 0.0187 | 0.0729 | 0.0654 | 0.1203 | 0.0754 |
| AutoR | 0.1518 | 0.1114 | 0.2174 | 0.1336 | 0.0491 | 0.0222 | 0.0692 | 0.0268 | 0.0816 | 0.0650 | 0.1325 | 0.0794 |
| PinSage | 0.1690 | 0.1228 | 0.2402 | 0.1472 | 0.0510 | 0.0245 | 0.0743 | 0.0315 | 0.0930 | 0.0816 | 0.1553 | 0.0980 |
| STGCN | 0.2067 | 0.1558 | 0.2940 | 0.1821 | 0.0562 | 0.0282 | 0.0856 | 0.0355 | 0.1003 | 0.0852 | 0.1650 | 0.1031 |
| GCMC | 0.2218 | 0.1714 | 0.3149 | 0.1897 | 0.0584 | 0.0280 | 0.0891 | 0.0360 | 0.1082 | 0.0901 | 0.1766 | 0.1085 |
| NGCF | 0.2081 | 0.1474 | 0.2944 | 0.1829 | 0.0681 | 0.0336 | 0.1019 | 0.0419 | 0.1033 | 0.0873 | 0.1653 | 0.1032 |
| GCCF | 0.2222 | 0.1642 | 0.3083 | 0.1931 | 0.0724 | 0.0365 | 0.1151 | 0.0466 | 0.1035 | 0.0901 | 0.1662 | 0.1062 |
| LightGCN | 0.2349 | 0.1704 | 0.3220 | 0.2022 | 0.0761 | 0.0373 | 0.1175 | 0.0474 | 0.1102 | 0.0943 | 0.1757 | 0.1113 |
| SLRec | 0.1957 | 0.1442 | 0.2792 | 0.1737 | 0.0665 | 0.0327 | 0.1032 | 0.0418 | 0.1048 | 0.0881 | 0.1723 | 0.1068 |
| NCL | 0.2353 | 0.1715 | 0.3252 | 0.2033 | 0.0806 | 0.0402 | 0.1230 | 0.0505 | 0.1131 | 0.0971 | 0.1819 | 0.1150 |
| SGL | 0.2427 | 0.1761 | 0.3405 | 0.2104 | 0.0803 | 0.0398 | 0.1226 | 0.0502 | 0.1138 | 0.0959 | 0.1776 | 0.1122 |
| HCCF | 0.2410 | 0.1773 | 0.3232 | 0.2051 | 0.0789 | 0.0391 | 0.1210 | 0.0492 | 0.1156 | 0.0990 | 0.1847 | 0.1176 |
| SHT | 0.2420 | 0.1770 | 0.3235 | 0.2055 | 0.0794 | 0.0395 | 0.1217 | 0.0497 | 0.1150 | 0.0977 | 0.1799 | 0.1156 |
| DirectAU | 0.2422 | 0.1727 | 0.3356 | 0.2042 | 0.0818 | 0.0424 | 0.1226 | 0.0524 | 0.1182 | 0.0981 | 0.1797 | 0.1139 |
| AdaGCL | 0.2603 | 0.1911 | 0.3531 | 0.2204 | 0.0873 | 0.0439 | 0.1315 | 0.0548 | 0.1216 | 0.1015 | 0.1867 | 0.1182 |
| Ours | 0.2724 | 0.1992 | 0.3664 | 0.2309 | 0.0914 | 0.0464 | 0.1355 | 0.0571 | 0.1262 | 0.1056 | 0.1946 | 0.1375 |
| Improv | 4.65% | 4.24% | 3.77% | 4.76% | 4.70% | 5.69% | 3.04% | 4.20% | 3.78% | 4.04% | 4.23% | 16.33% |
| p-val | 2.4e-6 | 5.8e-5 | 4.9e-9 | 6.4e-5 | 1.3e-4 | 8.8e-4 | 7.6e-3 | 2.2e-3 | 1.2e-4 | 7.9e-4 | 1.4e-4 | 2.9e-6 |

Memory Complexity. We store user/item embeddings and $L$ layer outputs: $O((|\mathcal{U}| + |\mathcal{V}|) d)$. The diffusion MLP contributes $O(d^2 + d d')$ parameters, independent of graph size. Hence the footprint remains comparable to lightweight CF GCNs.

4.2 Experimental Settings

4.2.1 Evaluation Datasets. We evaluate our method on three publicly available datasets:

• Last.FM [4]: Music listening behaviors and social interactions from Last.fm users.
• Yelp [49]: A benchmark dataset of user-business ratings from Yelp, widely utilized in location-based recommendation studies.
• BeerAdvocate [28]: Beer reviews from BeerAdvocate, preprocessed with 10-core filtering to ensure data density.

Additionally, we adopt three widely used weighted-edge (multi-behavior) e-commerce datasets: Tmall [2], RetailRocket [30], and IJCAI15 [1]. Dataset statistics are summarized in Table 3 (binary) and Table 2 (weighted). Following public protocols (e.g., DiGraph/HEC-GCN), we merge multiple behaviors into a single interaction graph and treat edges as binary presence; graph propagation uses the standard symmetric normalized adjacency.

Binary vs. Weighted-Edge Regimes. RaDAR is evaluated under two complementary settings: binary edges capture implicit feedback presence, whereas the weighted-edge setting follows a multi-behavior protocol that aggregates behaviors into a single binary interaction graph. This dual setup enables fair assessment of edge denoising on sparse graphs and robustness in heterogeneous e-commerce data, aligning our evaluation with recent multi-behavior studies such as DiGraph [25].

4.2.2 Evaluation Protocols.
Following standard evaluation protocols for recommendation systems, we partition datasets into training/validation/test sets (7:2:1). Adopting the all-ranking strategy, we evaluate each user by ranking all non-interacted items alongside the test positives. Performance is measured using Recall@20 and NDCG@20, with $K = 20$ as the default ranking cutoff. This setup ensures a comprehensive assessment of model capabilities in real-world sparse interaction scenarios. For the weighted-edge (multi-behavior) regime, we follow the public construction and evaluation protocol: behaviors are aggregated into a binary interaction graph and evaluated with the same all-ranking pipeline.

4.2.3 Compared Baseline Methods. We evaluate RaDAR against a comprehensive suite of representative recommendation models spanning both single- and multi-behavior paradigms. Specifically, we categorize the baselines into four major research streams: (1) Traditional collaborative filtering: BiasMF [22] and NCF [15]; (2) GNN-based methods: LightGCN [14] and NGCF [38]; (3) Self-supervised frameworks: SGL [42] and SLRec [48]; and (4) Contrastive and diffusion learning: DirectAU [36] and AdaGCL [18]. For the weighted-edge regime, following DiGraph's protocol [25], we further include multi-behavior and heterogeneous baselines widely adopted in e-commerce scenarios, including BPR [22], PinSage [50], NGCF [38], NMTR [13], MBGCN [20], HGT [17], and MATN [44]. We also incorporate DDRM [37] for completeness as a recent diffusion-based generative recommender.

Binary vs. Weighted Evaluation Narrative. We first report results under the binary-edge setting to benchmark each model's collaborative filtering and contrastive learning capability without behavioral intensities. We then extend the evaluation to the weighted-edge setting, which introduces interaction-level heterogeneity to assess robustness under multi-behavior conditions.
This order highlights that RaDAR's improvements are not tied to a specific edge construction but arise from its unified design, which integrates view generation with diffusion-based asymmetric contrastive learning (ACL) and transfers seamlessly across regimes. To avoid redundancy, shared baselines (e.g., BPR, PinSage, NGCF) are evaluated under both settings but described only once. Full baseline descriptions and implementation details are provided in Appendix A. This taxonomy provides comprehensive coverage from foundational to state-of-the-art paradigms, enabling a rigorous and balanced evaluation across methodological dimensions.

RaDAR: Relation-aware Diffusion-Asymmetric Graph Contrastive Learning for Recommendation. WWW '26, April 13–17, 2026, Dubai, United Arab Emirates.

Table 5: Weighted-edge results under DiffGraph settings (Recall@20 / NDCG@20).

Model          Tmall             RetailRocket      IJCAI15
               R@20    N@20      R@20    N@20      R@20    N@20
BPR            0.0248  0.0131    0.0308  0.0237    0.0051  0.0037
PinSage        0.0368  0.0156    0.0423  0.0248    0.0101  0.0041
NGCF           0.0399  0.0169    0.0405  0.0257    0.0091  0.0035
NMTR           0.0441  0.0192    0.0460  0.0265    0.0108  0.0048
MBGCN          0.0419  0.0179    0.0492  0.0258    0.0112  0.0045
HGT            0.0431  0.0192    0.0413  0.0250    0.0126  0.0051
MATN           0.0463  0.0197    0.0524  0.0302    0.0136  0.0054
DiffGraph      0.0553  0.0254    0.0626  0.0353    0.0178  0.0067
RaDAR (w)      0.0626  0.0268    0.1380  0.0746    0.0582  0.0323
RaDAR (w+DDR)  0.0620  0.0260    0.1375  0.0748    0.0603  0.0325

Table 6: Ablation study on key components of RaDAR (Recall@20 / NDCG@20).

Model Variant      Last.FM           Yelp              Beer
                   Recall  NDCG      Recall  NDCG      Recall  NDCG
Baseline SOTA SSL  0.2603  0.1911    0.0873  0.0439    0.1216  0.1015
Gen+Gen            0.2665  0.1936    0.0900  0.0456    0.1226  0.1027
Gen+Linear         0.2698  0.1986    0.0910  0.0461    0.1247  0.1050
w/o D-ACL          0.2652  0.1934    0.0904  0.0458    0.1250  0.1036
w/ ACL only        0.2720  0.1962    0.0911  0.0461    0.1264  0.1057
RaDAR (full)       0.2724  0.1992    0.0914  0.0464    0.1262  0.1056

4.3 Overall Performance (RQ1)

Table 4 summarizes the results on the three binary-edge benchmarks.
RaDAR achieves the best performance across all datasets and cutoffs (top-20/40), surpassing strong CF, GNN, and SSL baselines with statistically significant improvements (see Improv and p-val rows). These gains verify that relation-aware denoising and diffusion-guided augmentation jointly enhance both coverage and ranking quality while preserving structural semantics.

For the weighted-edge regime, following DiffGraph's protocol, Table 5 reports results on Tmall, RetailRocket, and IJCAI15. Under the same training pipeline (enabling edge weights), RaDAR consistently outperforms public multi-behavior baselines and the reproduced DiffGraph across all datasets. The DDR variant further improves NDCG@20, reflecting its emphasis on ranking robustness in heterogeneous interactions.

4.4 Model Ablation Test (RQ2)

To evaluate RaDAR's architectural components, we conducted systematic ablation studies against the state-of-the-art baseline. We examined five configurations across three datasets (Last.FM, Yelp, and Beer):
• RaDAR (Gen+Gen): Dual VGAE-based generators without the denoising model
• RaDAR (Gen+Linear): Linear attention replacing the relation-aware denoising model
• RaDAR (w/o D-ACL): Conventional graph contrastive loss without diffusion-asymmetric contrastive learning optimization
• RaDAR (w/ ACL only): Asymmetric contrastive learning without diffusion-based augmentation
• RaDAR (full): Complete proposed framework

Table 6 reveals five key insights regarding component interactions and dataset-specific behaviors. Relation-Aware Denoising Effectiveness: The relation-aware denoising module consistently outperforms alternatives. Replacing it with linear attention yields modest degradation (Recall@20: -0.95% on Last.FM), while removing explicit denoising causes more significant drops (-2.17%), confirming its superior noise handling capabilities.
Contrastive Learning Impact: Asymmetric contrastive learning provides consistent improvements over the conventional loss, with Beer showing the largest gains (Recall@20: +1.12%). The effectiveness varies by dataset density, with sparse datasets like Yelp showing minimal sensitivity to contrastive learning variations. Diffusion Augmentation Benefits: Diffusion-based augmentation primarily enhances ranking quality rather than coverage. It achieves notable NDCG improvements on Last.FM (+1.53%) with marginal recall gains, suggesting it optimizes embedding discriminability for ranking tasks. Component Complementarity: Results establish a clear hierarchy: relation-aware denoising dominates recall performance, while diffusion augmentation excels in ranking quality. This demonstrates the complementary nature of the components, with denoising enhancing coverage and D-ACL optimizing ranking precision. Dataset-Specific Variance: On BeerAdvocate, margins among top variants fall within run-to-run variance (< 0.2%), while substantial gains on sparse datasets (Last.FM, Yelp) validate the framework's effectiveness in data-scarce scenarios.

4.5 Model Robustness Test (RQ3)

In this section, our extensive experimental evaluation demonstrates the efficacy of the proposed RaDAR framework. The results indicate that RaDAR exhibits remarkable resilience against data noise and significantly outperforms existing methods in handling sparse user-item interaction data. Specifically, our approach maintains high performance even in the presence of substantial noise.

4.5.1 Performance w.r.t. Data Noise Degrees. We systematically evaluate RaDAR's resilience to data corruption through controlled noise injection experiments, where spurious edges replace genuine interactions at incremental ratios (5%-25%). A comparative analysis with AdaGCL and SGL across datasets of varying density (Fig. 3) reveals two key patterns:

On moderate-density datasets (Last.FM: 2.8e-3, Beer: 9.5e-3), RaDAR demonstrates a modest improvement over AdaGCL on the Beer dataset, while the relative Recall/NDCG robustness among RaDAR, AdaGCL, and SGL varies less significantly on the Last.FM dataset. This suggests that the benefits of our approach may be less pronounced when data sparsity is moderate, as existing methods already capture sufficient structural information under these conditions.

In extreme sparsity conditions (Yelp: 1.6e-4), RaDAR demonstrates a pronounced advantage with higher relative improvement margins, confirming superior noise resilience in data-scarce scenarios.

Yixuan Huang, Jiawei Chen, Shengfan Zhang, & Zongsheng Cao

Figure 3: Impact of Noise Ratio (5%–25%) on Performance Degradation. Panels (a) Last.FM, (b) Yelp, and (c) BeerAdvocate each plot Relative Recall and Relative NDCG versus noise ratio for Ours, AdaGCL, and SGL.

Our empirical analysis demonstrates RaDAR's effectiveness in cold-start scenarios through its density-aware denoising framework. The widening performance gap under increasing sparsity highlights the model's ability to extract critical signals from sparse interactions, a pivotal requirement for practical recommendation systems.

4.5.2 Performance w.r.t. Data Sparsity.
We further examine RaDAR's performance under different levels of user and item sparsity on Yelp. From the user perspective (Fig. 5a), RaDAR consistently surpasses AdaGCL across all interaction groups, with the largest improvements observed for cold-start users (0-10 interactions). This confirms RaDAR's robustness in capturing informative user representations even under sparse feedback through adaptive graph augmentation. In contrast, the item-side analysis (Fig. 5b) shows that the performance gap widens as item interaction density increases. While user metrics tend to degrade with sparsity (except for a mild rebound at 20-25 interactions), item metrics improve steadily with interaction frequency. These results demonstrate that RaDAR balances user-side generalization under sparsity and item-side learning under dense collaborative signals through its dual adaptive and density-aware mechanisms.

Why Recall@20 decreases while NDCG@20 increases under sparsity. Interestingly, we observe diverging trends between Recall@20 and NDCG@20 as sparsity increases. Although both metrics are evaluated under the same protocol, Recall@20 tends to drop for highly sparse users, whereas NDCG@20 may increase slightly. This arises mainly from the test set characteristics: (i) many sparse users have only 1-2 held-out positives, where a single miss drastically reduces Recall, but ranking these few positives higher still boosts NDCG; and (ii) on the item side, positives often concentrate on popular items, where improved ranking contributes more to NDCG than to Recall. Hence, under extreme sparsity, the two metrics capture complementary perspectives: Recall reflects coverage, while NDCG emphasizes ranking quality.

Figure 4: Performance variation with ACL ratio $\lambda$ on (a) Last.FM and (b) Yelp. Last.FM peaks Recall@20 at $\lambda = 5.5$ and NDCG@20 at $\lambda = 3.5$; Yelp peaks Recall@20 at $\lambda = 1.5$ and NDCG@20 at $\lambda = 1.0$. Higher $\lambda$ values enhance relation-aware denoising for Last.FM, while Yelp requires balanced contributions due to interaction sparsity.

4.6 Hyperparameter Analysis (RQ4)

We investigate the impact of the adjustable contrastive learning (ACL) ratio $\lambda$, which balances the Information Bottleneck (IB) losses between the VGAE-based and relation-aware graph denoising view generators. The total IB loss is formulated as
\[ \mathcal{L}_{IB} = \mathcal{L}_{IB}^{G} + \lambda\,\mathcal{L}_{IB}^{D}, \]
where $\mathcal{L}_{IB}^{G}$ and $\mathcal{L}_{IB}^{D}$ denote the IB losses from the VGAE-based view generator and the relation-aware graph denoising view generator, respectively; $\lambda > 1$ prioritizes relation-aware structural preservation, while $\lambda < 1$ emphasizes generated graph views.

Fig. 4 reveals distinct $\lambda$ preferences across datasets. Last.FM achieves optimal performance with $\lambda > 1$ (Fig. 4a), indicating that its structural complexity benefits from enhanced relation-aware denoising. Conversely, Yelp attains peak metrics at lower $\lambda$ values (Fig. 4b), suggesting that its sparse interaction patterns require balanced information preservation from both view generators to prevent overfitting. This empirical evidence confirms RaDAR's adaptability through our asymmetric contrastive learning design, showing robust performance across diverse graph recommendation scenarios.

5 Conclusion

We present RaDAR, a contrastive recommendation framework that unifies (1) generative-denoising dual-view learning, (2) asymmetric contrastive objectives, and (3) diffusion-based stabilization for noise-robust representation learning.
RaDAR achieves consistent improvements over state-of-the-art baselines under both binary and weighted-edge regimes, demonstrating strong generalization and resilience to sparse or noisy interactions. Ablation analyses confirm that diffusion-enhanced denoising and asymmetric contrastive learning jointly contribute to its robustness and stability across datasets.

6 Limitations

RaDAR assumes static graphs without temporal dynamics; extending to sequence-aware diffusion is future work. Training efficiency could be improved via adaptive sampling for large-scale deployment.

References
[1] Alibaba Group. 2025. IJCAI-15 Repeat Buyer Prediction Dataset (Tianchi). https://tianchi.aliyun.com/competition/entrance/231576/information. Accessed: 2025-01-09. Competition dataset from the IJCAI-15 Repeat Buyer Prediction Challenge.
[2] Alibaba Group. 2025. Tmall User Behavior Dataset (Tianchi). https://tianchi.aliyun.com/dataset/649. Accessed: 2025-01-09. E-commerce user behavior dataset including view, favorite, cart, and purchase actions.
[3] Rianne van den Berg, Thomas N. Kipf, and Max Welling. 2017. Graph convolutional matrix completion. arXiv preprint arXiv:1706.02263 (2017).
[4] Iván Cantador, Peter Brusilovsky, and Tsvi Kuflik. 2011. Second workshop on information heterogeneity and fusion in recommender systems (HetRec2011). In Proceedings of the fifth ACM conference on Recommender systems. 387-388.
[5] Jiangxia Cao, Xin Cong, Jiawei Sheng, Tingwen Liu, and Bin Wang. 2022. Contrastive cross-domain sequential recommendation. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management. 138-147.
[6] Zongsheng Cao, Yangfan He, Anran Liu, Jun Xie, Feng Chen, and Zhepeng Wang. 2025. Tv-rag: A temporal-aware and semantic entropy-weighted framework for long video retrieval and understanding. In Proceedings of the 33rd ACM International Conference on Multimedia. 9071-9079.
[7] Zongsheng Cao, Yangfan He, Anran Liu, Jun Xie, Zhepeng Wang, and Feng Chen. 2025. Co-dec: Hallucination-resistant decoding via coarse-to-fine generative feedback in large vision-language models. In ACM International Conference on Multimedia. 10709-10718.
[8] Zongsheng Cao, Yangfan He, Anran Liu, Jun Xie, Zhepeng Wang, and Feng Chen. 2025. Purifygen: A risk-discrimination and semantic-purification model for safe text-to-image generation. In Proceedings of the 33rd ACM International Conference on Multimedia. 816-825.
[9] Zongsheng Cao, Jing Li, Zigan Wang, and Jinliang Li. 2024. DiffusionE: Reasoning on knowledge graphs via diffusion-based graph neural networks. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 222-230.
[10] Lei Chen, Le Wu, Richang Hong, Kun Zhang, and Meng Wang. 2020. Revisiting graph based collaborative filtering: A linear residual graph convolutional network approach. In Proceedings of the AAAI conference on artificial intelligence, Vol. 34. 27-34.
[11] Alex Chin, Yatong Chen, Kristen M. Altenburger, and Johan Ugander. 2019. Decoupled smoothing on graphs. In The World Wide Web Conference. 263-272.
[12] Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014).
[13] Chen Gao, Xiangnan He, Dahua Gan, Xiangning Chen, Fuli Feng, Yong Li, Tat-Seng Chua, and Depeng Jin. 2019. Neural multi-task recommendation from multi-behavior data. In 2019 IEEE 35th international conference on data engineering (ICDE).
IEEE, 1554-1557.
[14] Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, and Meng Wang. 2020. LightGCN: Simplifying and powering graph convolution network for recommendation. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval. 639-648.
[15] Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In Proceedings of the 26th international conference on world wide web. 173-182.
[16] Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models. Advances in neural information processing systems 33 (2020), 6840-6851.
[17] Ziniu Hu, Yuxiao Dong, Kuansan Wang, and Yizhou Sun. 2020. Heterogeneous graph transformer. In Proceedings of the web conference 2020. 2704-2710.
[18] Yangqin Jiang, Chao Huang, and Lianghao Huang. 2023. Adaptive graph contrastive learning for recommendation. In Proceedings of the 29th ACM SIGKDD conference on knowledge discovery and data mining. 4252-4261.
[19] Yangqin Jiang, Yuhao Yang, Lianghao Xia, and Chao Huang. 2024. DiffKG: Knowledge graph diffusion model for recommendation. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining. 313-321.
[20] Bowen Jin, Chen Gao, Xiangnan He, Depeng Jin, and Yong Li. 2020. Multi-behavior recommendation with graph convolutional networks. In Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval. 659-668.
[21] Thomas N. Kipf and Max Welling. 2016. Variational graph auto-encoders. arXiv preprint arXiv:1611.07308 (2016).
[22] Yehuda Koren, Robert Bell, and Chris Volinsky. 2009. Matrix factorization techniques for recommender systems. Computer 42, 8 (2009), 30-37.
[23] Chaoliu Li, Lianghao Xia, Xubin Ren, Yaowen Ye, Yong Xu, and Chao Huang. 2023. Graph transformer for recommendation.
In Proceedings of the 46th international ACM SIGIR conference on research and development in information retrieval. 1680-1689.
[24] Xiang Li, John Thickstun, Ishaan Gulrajani, Percy S. Liang, and Tatsunori B. Hashimoto. 2022. Diffusion-LM improves controllable text generation. Advances in Neural Information Processing Systems 35 (2022), 4328-4343.
[25] Zongwei Li, Lianghao Xia, Hua Hua, Shijie Zhang, Shuangyang Wang, and Chao Huang. 2025. DiffGraph: Heterogeneous Graph Diffusion Model. In Proceedings of the Eighteenth ACM International Conference on Web Search and Data Mining. 40-49.
[26] Derek Lim, Felix Hohne, Xiuyu Li, Sijia Linda Huang, Vaishnavi Gupta, Omkar Bhalerao, and Ser Nam Lim. 2021. Large scale learning on non-homophilous graphs: New benchmarks and strong simple methods. Advances in neural information processing systems 34 (2021), 20887-20902.
[27] Zihan Lin, Changxin Tian, Yupeng Hou, and Wayne Xin Zhao. 2022. Improving graph collaborative filtering with neighborhood-enriched contrastive learning. In Proceedings of the ACM web conference 2022. 2320-2329.
[28] Julian John McAuley and Jure Leskovec. 2013. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. In Proceedings of the 22nd international conference on World Wide Web. 897-908.
[29] Xubin Ren, Lianghao Xia, Jiashu Zhao, Dawei Yin, and Chao Huang. 2023. Disentangled contrastive collaborative filtering. In Proceedings of the 46th international ACM SIGIR conference on research and development in information retrieval. 1137-1146.
[30] Retail Rocket. 2025. Retail Rocket Recommender System Dataset. https://www.kaggle.com/datasets/retailrocket/ecommerce-dataset. Accessed: 2025-01-09. E-commerce interaction logs including page views, add-to-cart, and purchase events.
[31] Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. 2008. The graph neural network model.
IEEE transactions on neural networks 20, 1 (2008), 61-80.
[32] J. Ben Schafer, Dan Frankowski, Jon Herlocker, and Shilad Sen. 2007. Collaborative filtering recommender systems. In The adaptive web: methods and strategies of web personalization. Springer, 291-324.
[33] Suvash Sedhain, Aditya Krishna Menon, Scott Sanner, and Lexing Xie. 2015. AutoRec: Autoencoders meet collaborative filtering. In Proceedings of the 24th international conference on World Wide Web. 111-112.
[34] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015. Deep unsupervised learning using nonequilibrium thermodynamics. In ICML. PMLR, 2256-2265.
[35] Zhulin Tao, Xiaohao Liu, Yewei Xia, Xiang Wang, Lifang Yang, Xianglin Huang, and Tat-Seng Chua. 2022. Self-supervised learning for multimedia recommendation. IEEE Transactions on Multimedia 25 (2022), 5107-5116.
[36] Chenyang Wang, Yuanqing Yu, Weizhi Ma, Min Zhang, Chong Chen, Yiqun Liu, and Shaoping Ma. 2022. Towards representation alignment and uniformity in collaborative filtering. In Proceedings of the 28th ACM SIGKDD conference on knowledge discovery and data mining. 1816-1825.
[37] Wenjie Wang, Yiyan Xu, Fuli Feng, Xinyu Lin, Xiangnan He, and Tat-Seng Chua. 2023. Diffusion recommender model. In Proceedings of the 46th international ACM SIGIR conference on research and development in information retrieval. 832-841.
[38] Xiang Wang, Xiangnan He, Meng Wang, Fuli Feng, and Tat-Seng Chua. 2019. Neural graph collaborative filtering. In Proceedings of the 42nd international ACM SIGIR conference on Research and development in Information Retrieval. 165-174.
[39] Xiang Wang, Hongye Jin, An Zhang, Xiangnan He, Tong Xu, and Tat-Seng Chua. 2020. Disentangled graph collaborative filtering. In Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval. 1001-1010.
[40] Ziyang Wang, Wei Wei, Gao Cong, Xiao-Li Li, Xian-Ling Mao, and Minghui Qiu. 2020. Global context enhanced graph neural networks for session-based recommendation. In Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval. 169-178.
[41] Zhaobo Wang, Yanmin Zhu, Haobing Liu, and Chunyang Wang. 2022. Learning graph-based disentangled representations for next POI recommendation. In Proceedings of the 45th international ACM SIGIR conference on research and development in information retrieval. 1154-1163.
[42] Jiancan Wu, Xiang Wang, Fuli Feng, Xiangnan He, Liang Chen, Jianxun Lian, and Xing Xie. 2021. Self-supervised graph learning for recommendation. In Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval. 726-735.
[43] Shiwen Wu, Fei Sun, Wentao Zhang, Xu Xie, and Bin Cui. 2022. Graph neural networks in recommender systems: a survey. Comput. Surveys 55, 5 (2022), 1-37.
[44] Lianghao Xia, Chao Huang, Yong Xu, Peng Dai, Bo Zhang, and Liefeng Bo. 2020. Multiplex behavioral relation learning for recommendation via memory augmented transformer network. In Proceedings of the 43rd international ACM SIGIR conference on research and development in information retrieval. 2397-2406.
[45] Lianghao Xia, Chao Huang, Yong Xu, Jiashu Zhao, Dawei Yin, and Jimmy Huang. 2022. Hypergraph contrastive collaborative filtering. In Proceedings of the 45th International ACM SIGIR conference on research and development in information retrieval. 70-79.
[46] Lianghao Xia, Chao Huang, and Chuxu Zhang. 2022. Self-supervised hypergraph transformer for recommender systems. In Proceedings of the 28th ACM SIGKDD conference on knowledge discovery and data mining. 2100-2109.
[47] Teng Xiao, Huaisheng Zhu, Zhengyu Chen, and Suhang Wang. 2023. Simple and asymmetric graph contrastive learning without augmentations.
Advances in neural information processing systems 36 (2023), 16129-16152.
[48] Tiansheng Yao, Xinyang Yi, Derek Zhiyuan Cheng, Felix Yu, Ting Chen, Aditya Menon, Lichan Hong, Ed H. Chi, Steve Tjoa, Jieqi Kang, et al. 2021. Self-supervised learning for large-scale item recommendations. In Proceedings of the 30th ACM international conference on information & knowledge management. 4321-4330.
[49] Yelp. 2018. Yelp Open Dataset. https://business.yelp.com/data/resources/open-dataset/.
[50] Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, and Jure Leskovec. 2018. Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining. 974-983.
[51] Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen. 2020. Graph contrastive learning with augmentations. Advances in neural information processing systems 33 (2020), 5812-5823.
[52] Jiani Zhang, Xingjian Shi, Shenglin Zhao, and Irwin King. 2019. STAR-GCN: Stacked and reconstructed graph convolutional networks for recommender systems. arXiv preprint arXiv:1905.13129 (2019).
[53] Mengqi Zhang, Shu Wu, Xueli Yu, Qiang Liu, and Liang Wang. 2022. Dynamic graph neural networks for sequential recommendation. IEEE Transactions on Knowledge and Data Engineering 35, 5 (2022), 4741-4753.
[54] Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, and Ji-Rong Wen. 2020. S3-Rec: Self-supervised learning for sequential recommendation with mutual information maximization. In Proceedings of the 29th ACM international conference on information & knowledge management. 1893-1902.
A Baseline Methods Details

A.0.1 Single-behavior Baselines (Binary-edge Setting). We evaluate RaDAR against representative baselines across four research streams covering traditional, neural, graph-based, and contrastive paradigms:
• BiasMF [22]: A classical matrix factorization model integrating user and item bias terms to enhance personalized preference modeling.
• NCF [15]: A neural collaborative filtering framework that replaces dot-product interactions with multilayer perceptrons for higher-order user-item relations.
• AutoRec [33]: An autoencoder-based collaborative filtering method reconstructing user-item interaction matrices for latent representation learning.
• GCMC [3]: A graph convolutional autoencoder that models user-item interactions via message passing on bipartite graphs.
• PinSage [50]: A graph convolutional framework leveraging random walk sampling and neighborhood aggregation for large-scale recommendation.
• NGCF [38]: A neural graph collaborative filtering model capturing high-order connectivity through multi-layer message propagation.
• LightGCN [14]: A simplified graph convolutional network that removes feature transformations and nonlinearities to enhance efficiency and stability.
• GCCF [10], STGCN [52]: Graph-based collaborative filtering variants that refine neighborhood aggregation and address over-smoothing effects.
• SGL [42], SLRec [48]: Self-supervised graph learning frameworks introducing augmentation and contrastive objectives for robust representation learning.
• HCCF [45], SHT [46]: Hypergraph-based self-supervised recommenders capturing both local and global collaborative relations.
• NCL [27]: A neighborhood-enriched contrastive learning method that constructs both structural and semantic contrastive pairs.
• DirectAU [36]: A representation learning framework optimizing alignment and uniformity on the hypersphere for improved embedding quality.
• AdaGCL [18]: An adaptive graph contrastive learning paradigm utilizing trainable view generators for personalized augmentation.

A.0.2 Multi-behavior Baselines (Weighted-edge Setting). Following DiffGraph's weighted-edge protocol [25], RaDAR is further evaluated against baselines specifically designed for multi-behavior or heterogeneous recommendation. To ensure consistency, overlapping models such as BPR [22], PinSage [50], and NGCF [38] are reused but not redundantly described. The remaining representative methods are summarized below:
• NMTR [13]: A neural multi-task model that jointly learns cascading dependencies among multiple user behaviors.
• MBGCN [20]: A multi-behavior graph convolutional network disentangling heterogeneous interactions through shared and behavior-specific embeddings.
• HGT [17]: A heterogeneous graph transformer employing type-specific attention and meta-relation projection to model complex interaction semantics.
• MATN [44]: A memory-augmented transformer network that captures multiplex behavioral relations and long-term dependencies in user behavior.
• DiffGraph [25]: A diffusion-based heterogeneous graph framework performing noise-aware semantic propagation across relation types for robust link prediction.
• DDRM [37]: A generative recommendation model built upon denoising diffusion processes, bridging collaborative filtering and generative modeling.

B Mathematical Details

B.1 Embedding Propagation Details

For completeness, we omit duplicated formulas and refer readers to the main text section "User-item Embedding Propagation" for the normalized adjacency, layer-wise propagation, final embedding aggregation, and preference scoring formulations.
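For readers who want a concrete picture of the propagation that B.1 refers to, the standard LightGCN-style scheme (symmetric normalized adjacency, layer-wise propagation, mean aggregation over layers, inner-product scoring) can be sketched in dense NumPy as follows. This is an illustrative sketch of the cited formulation under the stated conventions, not our exact implementation; the function name and arguments are hypothetical.

```python
import numpy as np

def propagate(adj, user_emb, item_emb, num_layers=3):
    """LightGCN-style propagation on the symmetric normalized bipartite
    adjacency: all layer outputs are averaged, and user-item preferences
    are scored by inner product."""
    n_u = user_emb.shape[0]
    # Full (users+items) x (users+items) adjacency of the bipartite graph.
    top = np.hstack([np.zeros((n_u, n_u)), adj])
    bot = np.hstack([adj.T, np.zeros((adj.shape[1], adj.shape[1]))])
    a = np.vstack([top, bot])
    deg = np.maximum(a.sum(axis=1), 1e-12)   # guard against isolated nodes
    d = np.diag(deg ** -0.5)
    a_norm = d @ a @ d                       # D^{-1/2} A D^{-1/2}
    e = np.vstack([user_emb, item_emb])
    layers = [e]
    for _ in range(num_layers):
        e = a_norm @ e                       # one message-passing step
        layers.append(e)
    final = np.mean(layers, axis=0)          # aggregate across layers
    users, items = final[:n_u], final[n_u:]
    return users, items, users @ items.T     # preference score matrix
```

In practice the adjacency would be stored sparsely; the dense form here just keeps the normalization and aggregation steps explicit.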
B.2 Variational Graph Auto-Encoder Details

In this section, we provide the detailed mathematical formulations of the VGAE framework used in our view generation approach. The KL-divergence regularization term for the latent distributions is defined as:
\[ \mathcal{L}_{\mathrm{kl}} = -\frac{1}{2} \sum_{d=1}^{D} \left( 1 + 2\log(\mathbf{x}_{\mathrm{std}}) - \mathbf{x}_{\mathrm{mean}}^{2} - \mathbf{x}_{\mathrm{std}}^{2} \right) \quad (18) \]

For graph structure reconstruction, we employ a discriminative loss $\mathcal{L}_{\mathrm{dis}}$ that evaluates both positive and negative interactions:
\[ \mathcal{L}_{\mathrm{pos}} = \mathrm{BCE}\big(\sigma(f(\mathbf{x}_{\mathrm{user}}[u] \odot \mathbf{x}_{\mathrm{item}}[i])), 1\big) = -\log\big(\sigma(f(\mathbf{x}_{\mathrm{user}}[u] \odot \mathbf{x}_{\mathrm{item}}[i]))\big) \]
\[ \mathcal{L}_{\mathrm{neg}} = \mathrm{BCE}\big(\sigma(f(\mathbf{x}_{\mathrm{user}}[u] \odot \mathbf{x}_{\mathrm{item}}[j])), 0\big) = -\log\big(1 - \sigma(f(\mathbf{x}_{\mathrm{user}}[u] \odot \mathbf{x}_{\mathrm{item}}[j]))\big) \]
\[ \mathcal{L}_{\mathrm{dis}} = \mathcal{L}_{\mathrm{pos}} + \mathcal{L}_{\mathrm{neg}} \quad (19) \]

The Bayesian Personalized Ranking (BPR) loss is incorporated to enhance recommendation performance:
\[ \mathcal{L}_{\mathrm{bpr}} = \sum_{(u,i,j) \in O} -\log \sigma(\hat{y}_{ui} - \hat{y}_{uj}) \quad (20) \]

The total VGAE optimization objective combines these components with weight regularization:
\[ \mathcal{L}_{\mathrm{gen}} = \mathcal{L}_{\mathrm{kl}} + \mathcal{L}_{\mathrm{dis}} + \mathcal{L}_{\mathrm{bpr}}^{\mathrm{gen}} + \lambda_{2} \|\Theta\|_{F}^{2} \quad (21) \]

B.3 Detailed Diffusion Process Formulation

B.3.1 Forward Diffusion Process. Our diffusion process begins with the forward phase, where Gaussian noise is progressively added according to:
\[ q(\boldsymbol{\chi}_t \mid \boldsymbol{\chi}_{t-1}) = \mathcal{N}\big(\boldsymbol{\chi}_t; \sqrt{1-\beta_t}\,\boldsymbol{\chi}_{t-1}, \beta_t \mathbf{I}\big) \quad (22) \]

Figure 5: Performance analysis across five user and item interaction sparsity levels on the Yelp dataset: (a) performance w.r.t. user interaction numbers, (b) performance w.r.t. item interaction numbers (Recall@20 and NDCG@20, Ours vs. AdaGCL).

with $\beta_t$ controlling the noise scale at step $t$. The intermediate state $\boldsymbol{\chi}_t$ can be efficiently computed directly from the initial state $\boldsymbol{\chi}_0$ through:
\[ q(\boldsymbol{\chi}_t \mid \boldsymbol{\chi}_0) = \mathcal{N}\big(\boldsymbol{\chi}_t; \sqrt{\bar{\alpha}_t}\,\boldsymbol{\chi}_0, (1-\bar{\alpha}_t)\mathbf{I}\big), \quad \bar{\alpha}_t = \prod_{t'=1}^{t} (1 - \beta_{t'}) \quad (23) \]

This allows for the reparameterization:
\[ \boldsymbol{\chi}_t = \sqrt{\bar{\alpha}_t}\,\boldsymbol{\chi}_0 + \sqrt{1-\bar{\alpha}_t}\,\boldsymbol{\epsilon}, \quad \boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}) \quad (24) \]

B.3.2 Linear Noise Scheduler. To control the injection of noise in $\boldsymbol{\chi}_{1:T}$, we employ a linear noise scheduler that parameterizes $1-\bar{\alpha}_t$ using three hyperparameters:
\[ 1 - \bar{\alpha}_t = s \cdot \left[ \alpha_{low} + \frac{t-1}{T-1} (\alpha_{up} - \alpha_{low}) \right], \quad t \in \{1, \dots, T\} \quad (25) \]
Here, $s \in [0,1]$ regulates the overall noise scale, while $\alpha_{low} < \alpha_{up} \in (0,1)$ determine the lower and upper bounds for the injected noise.

B.3.3 Reverse Denoising Process. The reverse process aims to recover the original representations by progressively denoising $\boldsymbol{\chi}_t$ to reconstruct $\boldsymbol{\chi}_{t-1}$ through a neural network:
\[ p_\theta(\boldsymbol{\chi}_{t-1} \mid \boldsymbol{\chi}_t) = \mathcal{N}\big(\boldsymbol{\chi}_{t-1}; \boldsymbol{\mu}_\theta(\boldsymbol{\chi}_t, t), \boldsymbol{\Sigma}_\theta(\boldsymbol{\chi}_t, t)\big) \quad (26) \]
where neural networks parameterized by $\theta$ generate the mean and covariance of the denoising distribution.

B.4 Asymmetric Contrastive Loss

The asymmetric contrastive learning loss function is defined as:
\[ \mathcal{L}_A = -\frac{1}{|\mathcal{V}|} \sum_{v \in \mathcal{V}} \frac{1}{|\mathcal{N}(v)|} \sum_{u \in \mathcal{N}(v)} \log \frac{\exp(p^{\top} u / \tau)}{\exp(p^{\top} u / \tau) + \sum_{v^{-} \in \mathcal{V}} \exp(v^{\top} v^{-} / \tau)} \quad (27) \]
where $\mathcal{N}(v)$ represents the one-hop neighbors of node $v$, and $\tau$ controls the softmax temperature. The predictor output $p = g_\phi(v)$ transforms the identity representation into a prediction of its neighborhood context.
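The forward diffusion of Eqs. (23)-(24) together with the linear scheduler of Eq. (25) can be sketched numerically as below. This is a minimal NumPy sketch; the default hyperparameter values are illustrative placeholders, not the tuned values used in our experiments.

```python
import numpy as np

def linear_alpha_bar(T, s=0.5, a_low=1e-4, a_up=0.02):
    """Linear scheduler of Eq. (25): 1 - alpha_bar_t interpolates linearly
    between s*a_low and s*a_up as t runs from 1 to T."""
    t = np.arange(1, T + 1)
    one_minus = s * (a_low + (t - 1) / (T - 1) * (a_up - a_low))
    return 1.0 - one_minus                      # alpha_bar_1 ... alpha_bar_T

def q_sample(x0, t, alpha_bar, rng):
    """Forward diffusion of Eq. (24):
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    ab = alpha_bar[t - 1]                       # scheduler is 1-indexed in t
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps
```

With `s = 0` the scheduler injects no noise and `q_sample` returns the input unchanged, which makes the reparameterization easy to sanity-check before training the reverse network of Eq. (26).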
