SpecTran: Spectral-Aware Transformer-based Adapter for LLM-Enhanced Sequential Recommendation
Traditional sequential recommendation (SR) models learn low-dimensional item ID embeddings from user-item interactions, often overlooking textual information such as item titles or descriptions. Recent advances in Large Language Models (LLMs) have inspired a surge of research that encodes item textual information with high-dimensional semantic embeddings and designs transformation methods to inject such embeddings into SR models. These embedding transformation strategies can be categorized into two types, both of which exhibit notable drawbacks: 1) adapter-based methods suffer from pronounced dimension collapse, concentrating information into a few dominant dimensions; 2) SVD-based methods are rigid and manual, considering only a few principal spectral components while discarding rich information in the remaining spectrum. To address these limitations, we propose SpecTran, a spectral-aware transformer-based adapter that operates in the spectral domain, attending to the full spectrum to select and aggregate informative components. A learnable spectral-position encoding injects singular-value cues as an inductive bias, guiding attention toward salient spectral components and promoting diversity across embedding dimensions. Across four real-world datasets and three SR backbones, SpecTran consistently outperforms strong baselines, achieving an average improvement of 9.17%.
💡 Research Summary
Sequential recommendation (SR) aims to predict a user’s next item based on their interaction history. Traditional SR models rely solely on low‑dimensional item ID embeddings (e.g., 64‑128 dimensions) and ignore rich textual side‑information such as titles or descriptions. Recent work leverages large language models (LLMs) to encode item text into high‑dimensional semantic vectors (e.g., 4096‑dimensional), but a fundamental mismatch exists between the high‑dimensional language space and the low‑dimensional collaborative space used by SR models.
Two families of transformation strategies have emerged to bridge this gap. Adapter‑based methods train a learnable multilayer perceptron (MLP) to project LLM embeddings into the item space. Although flexible, empirical studies reveal a severe “spectral dimension collapse”: after projection, most singular values of the resulting embeddings are near zero, meaning that useful information concentrates in only a few dimensions. SVD‑based methods, on the other hand, decompose the LLM embeddings via singular value decomposition (SVD) and retain only the top‑d singular vectors. While this static approach often outperforms adapters, it discards the lower‑rank spectral components that may still carry useful signals for recommendation, and it uses hand‑crafted weighting schemes that cannot adapt to the downstream task.
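The contrast between the two families can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: the adapter is stood in for by a single untrained linear map (a trained MLP in practice), and the sizes `N`, `l`, `d` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, l, d = 200, 512, 16            # items, LLM embedding dim, SR item dim (illustrative)
E_llm = rng.standard_normal((N, l))

# Adapter-based: a learnable projection maps LLM embeddings into the
# d-dimensional item space (a random linear map stands in for a trained MLP).
W = rng.standard_normal((l, d)) / np.sqrt(l)
E_adapter = E_llm @ W             # (N, d)

# SVD-based: keep only the top-d singular directions, discarding the tail
# of the spectrum entirely.
U, S, Vt = np.linalg.svd(E_llm, full_matrices=False)
E_svd = U[:, :d] * S[:d]          # (N, d); equal to projecting onto the top-d
assert np.allclose(E_svd, E_llm @ Vt[:d].T)   # right singular vectors
```

The adapter output depends on learned weights (and can collapse spectrally during training), while the SVD output is fixed once the decomposition is computed, regardless of the downstream task.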
SpecTran (Spectral‑aware Transformer‑based Adapter) is proposed to combine the strengths of both families while eliminating their weaknesses. The pipeline first applies SVD to the LLM embedding matrix (E_{LLM}\in\mathbb{R}^{N\times l}), obtaining (E_{LLM}=U\Sigma V^{\top}). Instead of truncating (V) to the top‑d columns, SpecTran treats the entire right singular matrix (V) as a sequence of spectral tokens. A learnable spectral‑positional encoding injects normalized singular‑value information into each token, providing an inductive bias that larger singular values are likely more informative while still allowing the model to learn task‑specific importance.
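The tokenization step above can be sketched as follows. The exact form of the learnable spectral-positional encoding is not specified here, so a simple scaled embedding of the normalized singular values stands in for it; `W_pe` is a hypothetical (untrained) parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
N, l = 200, 512
E_llm = rng.standard_normal((N, l))

# Full SVD of the LLM embedding matrix: E_llm = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(E_llm, full_matrices=False)

# Treat every right singular vector (row of Vt) as one spectral token,
# keeping the entire spectrum rather than a top-d truncation.
tokens = Vt                                     # (r, l): r spectral tokens

# Spectral-positional encoding (illustrative form): embed the normalized
# singular values and add them to the corresponding tokens.
sigma_norm = S / S.max()                        # (r,), in (0, 1]
W_pe = rng.standard_normal((1, l)) * 0.02       # hypothetical learnable PE weights
tokens_pe = tokens + sigma_norm[:, None] * W_pe # (r, l) enriched spectral tokens
```

The key point is that singular-value magnitude enters only as an additive bias on the tokens, so the downstream attention can still upweight tail components if they help the task.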
These enriched tokens are fed into a standard multi‑head Transformer encoder. Queries, keys, and values are generated by linear projections of the token sequence; the attention scores (\phi(QK^{\top})), with (\phi) the softmax, weight each spectral component dynamically. The attention‑weighted output is then multiplied by the left singular matrix (U) to reconstruct a low‑dimensional semantic embedding, (E_{sem}=U\,\mathrm{Attn}(\tilde{V})\in\mathbb{R}^{N\times d}), where (\tilde{V}) denotes the spectral tokens enriched with the singular‑value positional encoding.
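An end-to-end sketch of this reconstruction, assuming a single attention head with random untrained weights in place of the multi-head Transformer encoder (Wq, Wk, Wv, Wo are hypothetical parameters, not the paper's):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
N, l, d, dk = 200, 512, 16, 64
E_llm = rng.standard_normal((N, l))

U, S, Vt = np.linalg.svd(E_llm, full_matrices=False)
# Spectral tokens with a simple singular-value positional encoding (illustrative).
tokens = Vt + (S / S.max())[:, None] * (rng.standard_normal((1, l)) * 0.02)

# Single-head self-attention over the spectral tokens (stand-in for the
# multi-head Transformer encoder; all weights here are random and untrained).
Wq, Wk, Wv = (rng.standard_normal((l, dk)) / np.sqrt(l) for _ in range(3))
Q, K, V_attn = tokens @ Wq, tokens @ Wk, tokens @ Wv
A = softmax(Q @ K.T / np.sqrt(dk))   # (r, r): attention across the full spectrum
H = A @ V_attn                        # (r, dk): attended spectral features

# Project to the SR item dimension and map back through U.
Wo = rng.standard_normal((dk, d)) / np.sqrt(dk)
E_sem = U @ (H @ Wo)                  # (N, d) low-dimensional semantic embedding
```

Because `A` spans all r spectral components, tail directions that truncated SVD would discard can still contribute to `E_sem` when attention assigns them weight.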