EpicCBR: Item-Relation-Enhanced Dual-Scenario Contrastive Learning for Cold-Start Bundle Recommendation
Bundle recommendation aims to recommend a set of items to users for overall consumption. Existing bundle recommendation models primarily depend on observed user-bundle interactions, limiting exploration of newly-emerged bundles that are constantly created. This poses a critical representation challenge for current bundle methods, as they usually treat each bundle as an independent instance while neglecting to fully leverage the user-item (UI) and bundle-item (BI) relations over popular items. To alleviate this, in this paper we propose a multi-view contrastive learning framework for cold-start bundle recommendation, named EpicCBR. Specifically, it precisely mines and utilizes the item relations to construct user profiles, identifying users likely to engage with bundles. Additionally, a popularity-based method that characterizes the features of new bundles through historical bundle information and user preferences is proposed. To build a framework that is robust in both cold-start and warm-start scenarios, a multi-view graph contrastive learning framework capable of integrating these diverse scenarios is introduced to ensure the model’s generalization capability. Extensive experiments conducted on three popular benchmarks show that EpicCBR outperforms the state of the art by a large margin (up to 387%), demonstrating the superiority of the proposed method in the cold-start scenario. The code and dataset can be found in the GitHub repository: https://github.com/alexlovecoding/EpicCBR.
💡 Research Summary
The paper tackles the challenging problem of recommending newly created bundles—sets of items that have no historical user‑bundle interactions—to users in e‑commerce platforms. Existing bundle recommendation methods rely heavily on observed user‑bundle (UB) interactions and treat each bundle as an isolated entity, which makes them ill‑suited for cold‑start scenarios where bundles are brand‑new. To address this, the authors propose EpicCBR, a multi‑view contrastive learning framework that jointly handles cold‑start and warm‑start recommendation through a dual‑scenario architecture.
Problem formulation
Given users U, items I, and bundles B, with implicit interaction sets D_UI (user‑item), D_BI (bundle‑item), and D_UB (user‑bundle), the goal is to learn a scoring function y*(u,b)=⟨e_u, e_b⟩ that can rank candidate bundles for any user, especially for bundles in B_cold that never appear in D_UB during training.
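As a minimal sketch of the scoring step, the snippet below ranks bundles for one user by the inner product ⟨e_u, e_b⟩. The embeddings, bundle ids, and helper names (`score`, `rank_bundles`) are made up for illustration and are not taken from the paper's code.

```python
def score(e_u, e_b):
    """Inner product <e_u, e_b> used as the user-bundle relevance score."""
    return sum(x * y for x, y in zip(e_u, e_b))


def rank_bundles(e_u, bundle_embeddings, k=10):
    """Return the ids of the top-k bundles for one user, highest score first."""
    scored = [(score(e_u, e_b), b) for b, e_b in bundle_embeddings.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by bundle id
    return [b for _, b in scored[:k]]


# Toy embeddings (hypothetical values).
user = [1.0, 0.0, 2.0]
bundles = {
    "b_warm": [0.5, 1.0, 0.0],   # score 0.5
    "b_cold": [0.0, 0.0, 1.0],   # score 2.0; absent from D_UB, still rankable
    "b_other": [1.0, 1.0, 0.0],  # score 1.0
}
top = rank_bundles(user, bundles, k=2)
print(top)
```

The point of the formulation is that a cold bundle only needs an embedding e_b to be ranked; it does not need any entry in D_UB.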
Key technical components
- Item‑pair relation mining – The authors compute Jaccard similarity on both the UI and BI graphs for every item pair (i,j). Using three thresholds (high, low, anti) and a minimum occurrence filter, they categorize pairs into four semantic relations:
- R1 (Same‑Class): high similarity in both UI and BI – items frequently co‑consumed and co‑bundled.
- R2 (Cross‑Domain Complementary): high UI similarity but low BI similarity – items often bought together by users but rarely bundled by the platform.
- R3 (Design‑Driven): high BI similarity but low UI similarity – platform‑driven bundles that users seldom purchase together.
- R4 (Anti‑Class): low similarity in both – mutually exclusive items.
The most useful relation for cold‑start is R2. The authors inject R2 edges into the UI graph as auxiliary item‑item connections, forming an enhanced adjacency matrix A′_UI = A_UI + α·A_UI·A_enh_II. LightGCN propagates over this enriched graph, yielding richer user embeddings even when UB data are missing.
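The relation mining and the R2 injection described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the threshold values, the rule boundaries, and all function names are hypothetical, and the separate "anti" threshold and the minimum occurrence filter are not modeled.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_pair(s_ui, s_bi, high=0.5, low=0.2):
    """Map (UI similarity, BI similarity) to R1..R4, or None if ambiguous.
    Assumption: thresholds and rules are illustrative, not the paper's values."""
    if s_ui >= high and s_bi >= high:
        return "R1"  # same-class
    if s_ui >= high and s_bi <= low:
        return "R2"  # cross-domain complementary
    if s_bi >= high and s_ui <= low:
        return "R3"  # design-driven
    if s_ui <= low and s_bi <= low:
        return "R4"  # anti-class
    return None


def mine_relations(item_users, item_bundles):
    """Classify every item pair from its user sets and bundle sets."""
    relations = {}
    for i, j in combinations(sorted(item_users), 2):
        s_ui = jaccard(item_users[i], item_users[j])
        s_bi = jaccard(item_bundles.get(i, set()), item_bundles.get(j, set()))
        rel = classify_pair(s_ui, s_bi)
        if rel is not None:
            relations[(i, j)] = rel
    return relations


def enhance_ui(user_items, r2_pairs, alpha=0.5):
    """A'_UI = A_UI + alpha * A_UI * A_II as sparse dict rows, where A_II is
    the symmetric 0/1 item-item matrix built from the R2 pairs."""
    neighbors = {}
    for i, j in r2_pairs:
        neighbors.setdefault(i, set()).add(j)
        neighbors.setdefault(j, set()).add(i)
    enhanced = {}
    for u, items in user_items.items():
        row = {i: 1.0 for i in items}
        for i in items:
            for j in neighbors.get(i, ()):
                row[j] = row.get(j, 0.0) + alpha
        enhanced[u] = row
    return enhanced


# Toy data (hypothetical).
item_users = {"phone": {"u1", "u2", "u3"}, "case": {"u1", "u2", "u3"},
              "charger": {"u1", "u2"}, "novel": {"u4"}}
item_bundles = {"phone": {"b1"}, "case": {"b1"}, "charger": {"b2"}, "novel": {"b3"}}
relations = mine_relations(item_users, item_bundles)
r2_pairs = [pair for pair, rel in relations.items() if rel == "R2"]
enhanced = enhance_ui({"u3": {"phone", "case"}, "u4": {"novel"}}, r2_pairs)
print(relations)
print(enhanced["u3"])
```

In the toy data, user u3 never interacted with the charger, yet the enhanced row gives it a non-zero weight because both of u3's items are R2-related to it. That is the kind of extra signal the enriched UI graph passes to LightGCN.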
- Popularity‑aware bundle representation – To overcome the lack of interaction data for new bundles, the paper introduces a dual‑source popularity measure. UI popularity is simply the global count of user‑item interactions. BI popularity is derived by (i) counting user‑bundle interactions per bundle, (ii) normalizing these counts using the 90th percentile to obtain a weight w_b, and (iii) aggregating w_b over all bundles containing item i. The total popularity p_pop_i = p_pop_UI_i + p_pop_BI_i is then adjusted with a piecewise linear scaling based on median and percentile thresholds (P_50, …, P_90) to mitigate the long‑tail distribution. Edge weights in the BI graph are replaced with the adjusted popularity of the item, producing a popularity‑aware BI graph Ĝ_BI that encodes how “hot” each item is within bundles.
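A rough sketch of steps (i) to (iii) and the sum p_pop_i = p_pop_UI_i + p_pop_BI_i is given below. Several details are assumptions rather than facts from the paper: the interpolated percentile, clipping w_b at 1, and all names are hypothetical. The piecewise linear scaling is left out because its breakpoints are not specified here.

```python
def percentile(values, q):
    """q-th percentile (0..100) with linear interpolation between sorted values."""
    xs = sorted(values)
    if len(xs) == 1:
        return float(xs[0])
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def bundle_weights(ub_counts, q=90):
    """w_b = min(count_b / P_q, 1). Assumption: clipping at 1 is our choice;
    the text only says the counts are normalized by the 90th percentile."""
    p = percentile(ub_counts.values(), q)
    return {b: (min(c / p, 1.0) if p > 0 else 0.0) for b, c in ub_counts.items()}


def item_popularity(ui_counts, bundle_items, weights):
    """p_i = UI interaction count of i + sum of w_b over bundles containing i."""
    pop = {i: float(c) for i, c in ui_counts.items()}
    for b, items in bundle_items.items():
        for i in items:
            pop[i] = pop.get(i, 0.0) + weights.get(b, 0.0)
    return pop


# Toy data (hypothetical).
ub_counts = {"b1": 10, "b2": 5, "b3": 0}          # user-bundle interactions
bundle_items = {"b1": {"phone", "case"}, "b2": {"case", "charger"}, "b3": {"novel"}}
ui_counts = {"phone": 3, "case": 3, "charger": 2, "novel": 1}

weights = bundle_weights(ub_counts)
pop = item_popularity(ui_counts, bundle_items, weights)
print(weights)
print(pop)
```

In this sketch, the values in `pop` (after whatever scaling is applied) would become the edge weights of the BI graph, so a new bundle inherits a representation from how popular its items are elsewhere.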
- Dual‑scenario contrastive learning – EpicCBR contains two parallel modules:
- Cold‑Path: uses the enhanced UI view and the popularity‑aware BI view to compute user embeddings e_u^c and bundle embeddings e_b^c via LightGCN.
- Warm‑Path: processes the original UB, UI, and BI interactions with the same LightGCN architecture, yielding e_u^w and e_b^w.
A multi‑view contrastive loss aligns the two representations: for each (u,b) pair, the similarity between (e_u^c, e_b^c) and (e_u^w, e_b^w) is maximized, while similarities with negative pairs are minimized. This alignment preserves cold‑start signals without sacrificing warm‑start accuracy. The final recommendation score is the inner product ⟨e_u, e_b⟩, where e_u and e_b can be the cold, the warm, or a fused representation.
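One common way to implement this kind of cross-view alignment is an InfoNCE loss with in-batch negatives. The sketch below assumes that form, with cosine similarity and a temperature `tau`; the paper's exact loss, similarity function, and negative sampling may differ, and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def info_nce(cold, warm, tau=0.2):
    """Mean InfoNCE loss over a batch: cold[k] is pulled toward warm[k]
    (same node, other view) and pushed away from warm[j] for j != k."""
    total = 0.0
    for k, c in enumerate(cold):
        logits = [cosine(c, w) / tau for w in warm]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[k]
    return total / len(cold)


# Toy batch of two nodes (hypothetical embeddings).
cold_view = [[1.0, 0.0], [0.0, 1.0]]
warm_aligned = [[1.0, 0.0], [0.0, 1.0]]    # views agree
warm_shuffled = [[0.0, 1.0], [1.0, 0.0]]   # views disagree

loss_aligned = info_nce(cold_view, warm_aligned)
loss_shuffled = info_nce(cold_view, warm_shuffled)
print(loss_aligned, loss_shuffled)
```

The loss is small when the cold and warm views of the same node agree and large when they do not, which is the behaviour the alignment objective relies on. The same function could be applied to user embeddings and bundle embeddings separately.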
Experimental evaluation
The authors conduct extensive experiments on three Amazon datasets (Food, Electronics, Clothing), each containing hundreds of thousands of users, millions of items, and tens of thousands of bundles. Baselines include MultiCBR, Coheat, LightGCN, and several GNN‑based collaborative filtering models. Metrics such as Recall@10, NDCG@10, and Hit‑Rate@20 are reported. EpicCBR achieves up to a 387 % improvement in Recall@10 for cold‑start bundles compared with the strongest baseline, and also yields modest gains (5–12 %) in warm‑start performance. Ablation studies demonstrate that (a) the R2 relation injection contributes the most among the four relation types, (b) the popularity‑aware BI graph significantly boosts cold‑start bundle embeddings, and (c) the contrastive alignment between cold and warm paths is essential for overall robustness.
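For readers unfamiliar with the metrics, here is a minimal sketch of Recall@K and binary-relevance NDCG@K in their common textbook form, plus the relative-improvement arithmetic behind figures such as "387 %". The evaluation protocol actually used in the paper is not described here, so treat the definitions and the toy numbers as assumptions.

```python
import math


def recall_at_k(ranked, relevant, k=10):
    """Fraction of the relevant bundles that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for b in ranked[:k] if b in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k=10):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, b in enumerate(ranked[:k]) if b in relevant)
    ideal = sum(1.0 / math.log2(pos + 2) for pos in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


def relative_improvement(new, old):
    """Relative gain in percent of `new` over `old`."""
    return 100.0 * (new - old) / old


# Toy ranking for one user (hypothetical ids).
ranked = ["b3", "b1", "b7", "b2"]
relevant = {"b1", "b2"}

r = recall_at_k(ranked, relevant, k=3)
n = ndcg_at_k(ranked, relevant, k=3)
gain = relative_improvement(0.3, 0.1)
print(r, n, gain)
```

Note that a relative improvement of several hundred percent is easiest to reach when the baseline value is very small, which is typical for cold-start bundles.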
Contributions and limitations
The paper’s main contributions are: (1) a systematic method to mine and exploit item‑pair relations for user profile enrichment, (2) a popularity‑aware bundle embedding technique that alleviates cold‑start bias, and (3) a unified dual‑scenario contrastive learning framework that simultaneously optimizes for both cold and warm recommendation contexts. Limitations include the need to manually set Jaccard thresholds for each dataset, reliance on LightGCN which may struggle with deep or highly heterogeneous graphs, and the empirical nature of the popularity scaling parameters.
Future directions suggested by the authors involve learning the thresholds automatically (e.g., via meta‑learning), exploring deeper or transformer‑based graph encoders, and extending the framework to multi‑task learning where item‑item, user‑item, and bundle‑item relations are jointly optimized.
In summary, EpicCBR introduces a novel combination of item‑relation enhancement, popularity‑driven bundle encoding, and dual‑scenario contrastive alignment, delivering a substantial leap in cold‑start bundle recommendation performance while maintaining competitive results in traditional warm‑start settings.