Dual-Tree LLM-Enhanced Negative Sampling for Implicit Collaborative Filtering
Negative sampling is a pivotal technique in implicit collaborative filtering (CF) recommendation, enabling efficient and effective training by contrasting observed interactions with sampled unobserved ones. Recently, large language models (LLMs) have shown promise in recommender systems; however, LLM-empowered negative sampling remains underexplored. Existing methods rely heavily on textual information and task-specific fine-tuning, limiting their practical applicability. To address this limitation, we propose a text-free and fine-tuning-free Dual-Tree LLM-enhanced Negative Sampling method (DTL-NS). It consists of two modules: (i) an offline false negative identification module that leverages hierarchical index trees to transform collaborative structural and latent semantic information into structured item-ID encodings for LLM inference, enabling accurate identification of false negatives; and (ii) a multi-view hard negative sampling module that combines user-item preference scores with item-item hierarchical similarities from these encodings to mine high-quality hard negatives, thus improving the model's discriminative ability. Extensive experiments demonstrate the effectiveness of DTL-NS. For example, on the Amazon-sports dataset, DTL-NS outperforms the strongest baseline by 10.64% and 19.12% in Recall@20 and NDCG@20, respectively. Moreover, DTL-NS can be integrated into various implicit CF models and negative sampling methods, consistently enhancing their performance.
💡 Research Summary
The paper introduces DTL‑NS, a novel negative‑sampling framework for implicit collaborative filtering that eliminates the need for textual side information and for fine‑tuning large language models (LLMs). Traditional negative sampling either focuses on mining hard negatives based solely on model‑predicted user‑item scores or attempts to avoid false negatives through heuristic training‑time statistics. Both approaches ignore the rich relational structure among items and treat false negatives merely as noise rather than potential latent positives.
DTL‑NS addresses these gaps through two tightly coupled modules. First, the Dual‑Tree LLM‑based False Negative Identification (DTL‑FNI) builds two hierarchical index trees: a collaborative‑structure tree derived from user‑item interaction graphs (using Jaccard similarity, graph Laplacian spectral embedding, and hierarchical clustering) and a latent‑semantic tree built from the embedding space of a base recommender (e.g., MF or LightGCN). Each item is encoded as a compact path of integer identifiers (root‑level‑…‑leaf) that captures both structural co‑occurrence and semantic proximity. These path encodings are fed to an off‑the‑shelf LLM (e.g., Llama‑3‑3B/8B) as prompts, asking the model to decide whether an unobserved item is likely a false negative. Because the LLM operates on discrete, interpretable tokens rather than raw IDs or continuous embeddings, it can reason effectively without any task‑specific fine‑tuning. Experiments that artificially hide 0.1% to 20% of positive interactions show that the LLM identifies false negatives with >80% accuracy for the 3B model and >90% for the 8B model. Identified false negatives are then promoted to positive samples, augmenting the training set and turning a source of noise into additional supervision. Crucially, this LLM inference is performed once offline, incurring negligible overhead compared to full‑scale model training.
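The tree‑to‑path encoding step can be sketched in a few lines. The snippet below is a minimal illustration of the latent‑semantic tree side only, assuming agglomerative (Ward) clustering over toy item embeddings and illustrative level sizes of 4/16/64; the variable names and granularities are not from the paper, and the paper's collaborative‑structure tree additionally uses Jaccard similarity and spectral embeddings of the interaction graph.

```python
# Sketch (assumed details): derive per-item hierarchical path codes by
# cutting an agglomerative dendrogram at coarse -> fine granularities.
import numpy as np
from scipy.cluster.hierarchy import linkage, cut_tree

rng = np.random.default_rng(0)
item_emb = rng.normal(size=(100, 16))   # toy latent item embeddings

# Build one agglomerative tree over all items (Ward linkage, Euclidean).
Z = linkage(item_emb, method="ward")

# Cut the same tree at three cluster counts; each row of `codes` is one
# item's root-to-leaf integer path, e.g. (2, 9, 41). Level sizes are
# illustrative, not the paper's configuration.
level_sizes = [4, 16, 64]
codes = cut_tree(Z, n_clusters=level_sizes)   # shape (100, 3)

path = tuple(int(c) for c in codes[0])
print(path)   # structured item-ID encoding that would go into the LLM prompt
```

Each item's path is a short tuple of small integers, so it serializes naturally into a discrete, interpretable prompt token sequence for the off‑the‑shelf LLM.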
The second module, Dual‑Tree Guided Multi‑View Hard Negative Sampling (DT‑MHNS), redefines the negative‑sampling distribution by combining two views: (i) the conventional user‑item preference score (e.g., BPR inner product) and (ii) an item‑item similarity derived from the hierarchical path encodings (e.g., common‑ancestor depth or tree‑distance based Jaccard). A weighted sum of these scores yields a multi‑view hardness metric that favors items that are both high‑scoring and structurally/semantically close to the positive item. Sampling according to this metric produces harder, more informative negatives than score‑only methods, while still respecting the false‑negative mitigation performed earlier.
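The multi‑view hardness idea above can be sketched as follows. This is a hedged illustration, not the paper's exact formulation: the helper names (`path_similarity`, `sample_hard_negative`), the shared‑prefix similarity, the equal weighting `alpha=0.5`, and the softmax sampling rule are all assumptions made for the example.

```python
# Sketch: blend the model's user-item preference score with a tree-path
# similarity to the positive item, then sample proportionally to hardness.
import numpy as np

def path_similarity(code_a, code_b):
    """Common-ancestor depth, normalised: fraction of the root-to-leaf
    path shared by two hierarchical item codes (assumed similarity)."""
    shared = 0
    for a, b in zip(code_a, code_b):
        if a != b:
            break
        shared += 1
    return shared / len(code_a)

def sample_hard_negative(user_vec, pos_code, cand_ids, item_emb, codes,
                         alpha=0.5, rng=None):
    """Draw one candidate with probability increasing in the weighted
    multi-view hardness score (alpha is an illustrative weight)."""
    if rng is None:
        rng = np.random.default_rng(0)
    pref = item_emb[cand_ids] @ user_vec                  # view 1: preference
    sim = np.array([path_similarity(pos_code, codes[i])   # view 2: hierarchy
                    for i in cand_ids])
    hardness = alpha * pref + (1 - alpha) * sim
    probs = np.exp(hardness - hardness.max())             # softmax sampling
    probs /= probs.sum()
    return int(cand_ids[rng.choice(len(cand_ids), p=probs)])

# Toy usage: 50 items, 8-dim embeddings, random 3-level path codes.
rng = np.random.default_rng(1)
item_emb = rng.normal(size=(50, 8))
codes = [tuple(rng.integers(0, 4, size=3)) for _ in range(50)]
neg = sample_hard_negative(rng.normal(size=8), codes[0],
                           np.arange(1, 50), item_emb, codes)
print(neg)   # index of the sampled hard negative
```

Items that both score highly for the user and sit close to the positive item in the tree receive the largest sampling probability, matching the paper's intuition that such items are the most informative negatives once false negatives have been filtered out.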
Extensive experiments on three real‑world datasets (Amazon‑Sports, Yelp, MovieLens) and two synthetic benchmarks demonstrate consistent gains. When integrated with standard implicit CF models such as BPR, LightGCN, and NGCF, DTL‑NS improves Recall@20 by up to 10.64% and NDCG@20 by up to 19.12% on Amazon‑Sports, with average improvements of 7–12% across other datasets. Moreover, DTL‑NS can be plugged into existing negative‑sampling strategies (DNS, MixGCF, SRNS, etc.), yielding further performance boosts without any additional training of the LLM. The offline LLM cost accounts for less than 5% of total training time, and no extra GPU memory is required because no fine‑tuning adapters are used.
In summary, DTL‑NS makes three key contributions: (1) a text‑free, fine‑tuning‑free item encoding scheme based on dual hierarchical trees; (2) an offline LLM reasoning step that accurately flags false negatives and converts them into positive supervision; (3) a multi‑view hard‑negative sampler that fuses user‑item preference with item‑item hierarchical similarity. By jointly addressing false‑negative risk at its source and enriching the hardness signal, the framework advances the state of the art in implicit collaborative‑filtering training. Future work may explore dynamic tree updates, incorporation of multimodal LLMs, or adaptive weighting of the two views to further enhance recommendation quality.