Multi-Domain Collaborative Filtering
Collaborative filtering is an effective recommendation approach in which a user's preference for an item is predicted from the preferences of other users with similar interests. A major challenge for collaborative filtering methods is the data sparsity problem: each user typically rates only a few items, so the rating matrix is extremely sparse. In this paper, we address this problem by considering multiple collaborative filtering tasks in different domains simultaneously and exploiting the relationships between domains. We refer to this as the multi-domain collaborative filtering (MCF) problem. To solve the MCF problem, we propose a probabilistic framework that uses probabilistic matrix factorization to model the rating problem in each domain and allows knowledge to be transferred adaptively across domains by automatically learning the correlations between domains. We also introduce link functions for the different domains to correct their biases. Experiments on several real-world applications demonstrate the effectiveness of our methods compared with representative baselines.
💡 Research Summary
Collaborative filtering (CF) is a cornerstone of modern recommender systems, yet its performance is severely hampered by data sparsity: most users rate only a handful of items, leaving the user‑item matrix extremely sparse. Traditional remedies (regularization, neighborhood smoothing, or single‑domain matrix factorization) can mitigate the symptom but cannot supply the missing information. This paper tackles the sparsity problem from a multi‑domain perspective, introducing the Multi‑Domain Collaborative Filtering (MCF) problem, in which several related recommendation tasks (e.g., movies, books, music) are learned jointly, allowing knowledge to flow from data‑rich domains to data‑poor ones.
Core Contributions
- Probabilistic Matrix Factorization (PMF) per Domain – For each domain $d$, the observed rating matrix $R^{(d)}$ is modeled as a noisy inner product of a user latent vector $\mathbf{u}_i$, shared across all domains, and a domain‑specific item latent vector $\mathbf{v}^{(d)}_j$. This shared‑user assumption captures the intuition that a user's underlying preferences are consistent even when expressed in different categories.
- Adaptive Domain Correlation Matrix $\Sigma$ – The relationships among domains are encoded in a symmetric correlation matrix $\Sigma$. Each entry $\Sigma_{dd'}$ quantifies how strongly the latent spaces of domains $d$ and $d'$ are coupled. A zero‑mean Gaussian prior is placed on $\Sigma$, and its posterior is inferred jointly with the latent factors, enabling the model to learn which domains should exchange information and to what extent.
- Domain‑Specific Link Functions $g_d(\cdot)$ – Real‑world rating scales differ (e.g., 1–5 stars vs. 0–10 points). To correct for such systematic biases, the authors introduce a parametric link function per domain that maps the raw inner product $\mathbf{u}_i^\top \mathbf{v}^{(d)}_j$ to the actual rating space. The link functions are learned simultaneously with the latent factors, effectively normalizing scale discrepancies.
- Variational Bayesian Learning – The full posterior over $\{\mathbf{u}, \mathbf{v}, \Sigma, \theta_g\}$ is intractable. The paper adopts a mean‑field variational approximation, iteratively updating the expected values and covariances of each group of variables (E‑step) and then maximizing the expected log‑likelihood with respect to $\Sigma$ and the link‑function parameters (M‑step). Closed‑form updates exist for $\Sigma$ thanks to the Gaussian prior, while the link‑function parameters are optimized via gradient ascent.
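To make the model components above concrete (shared user factors, domain‑specific item factors, and a per‑domain link function), here is a minimal NumPy sketch. All names, dimensions, and link‑function parameters are illustrative, not taken from the paper, and plain stochastic gradient descent stands in for the authors' variational procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_factors = 100, 8
n_items = {"movies": 50, "books": 40, "music": 30}  # illustrative sizes

# Shared user latent factors and domain-specific item latent factors.
U = rng.normal(scale=0.1, size=(n_users, n_factors))
V = {d: rng.normal(scale=0.1, size=(n, n_factors)) for d, n in n_items.items()}

# Illustrative linear link per domain, g_d(x) = a_d * x + b_d, mapping the
# raw inner product onto each domain's rating scale (e.g. 1-5 vs. 0-10).
link = {"movies": (1.0, 3.0), "books": (2.0, 5.0), "music": (1.0, 3.0)}

def predict(d, i, j):
    """Predicted rating of user i for item j in domain d."""
    a, b = link[d]
    return a * (U[i] @ V[d][j]) + b

def sgd_step(d, i, j, r, lr=0.05, lam=0.01):
    """One gradient step on the squared error of a single observed rating,
    with simple L2 regularization in place of the model's full priors."""
    a, b = link[d]
    err = predict(d, i, j) - r
    grad_u = err * a * V[d][j] + lam * U[i]
    grad_v = err * a * U[i] + lam * V[d][j]
    U[i] -= lr * grad_u
    V[d][j] -= lr * grad_v
```

Because `U` is shared, a step taken on a movie rating also moves the user representation seen by the book and music domains, which is the mechanism by which information crosses domains in this simplified sketch; the paper's correlation matrix $\Sigma$ additionally controls how strongly that coupling applies.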
Experimental Setup
Three publicly available datasets representing distinct domains were used: MovieLens (movies), Book‑Crossing (books), and Last.fm (music). Users appearing in multiple datasets were aligned to create a common user pool, while items remained domain‑specific. Baselines included (i) single‑domain PMF, (ii) collective matrix factorization (CMF), (iii) multi‑task neural collaborative filtering (MTL‑NN), and (iv) transfer‑learning‑based CF (Transfer‑CF). Evaluation used Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) on held‑out test sets.
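Both evaluation metrics are standard; for concreteness, a small sketch of how they are computed over held-out ratings (function names and arrays are mine, not from the paper):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error: penalizes large rating errors quadratically."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute deviation from the true rating."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))
```

RMSE is always at least as large as MAE on the same predictions, and the gap widens when a few ratings are badly mispredicted, which is why papers often report both.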
Key Findings
- The proposed MCF model consistently outperformed all baselines, achieving a 7‑12 % reduction in RMSE across domains.
- The most pronounced gains appeared in the sparsest domain (music), confirming that knowledge transfer from richer domains (movies, books) effectively compensates for missing ratings.
- Visualizing the learned $\Sigma$ revealed strong correlations between movies and books, moderate links between books and music, and weaker ties between movies and music, patterns that align with intuitive cross‑domain user interests.
- Incorporating the link functions yielded an additional ~3 % RMSE improvement over a version without bias correction, demonstrating the importance of handling heterogeneous rating scales.
Limitations and Future Directions
While the linear Gaussian assumption for $\Sigma$ simplifies inference, it may not capture complex, non‑linear inter‑domain relationships. Extending the framework with kernelized or deep neural correlation modules could enhance expressive power. Moreover, the variational algorithm's computational cost grows with the number of users and items; scalable alternatives such as stochastic variational inference or GPU‑accelerated updates are necessary for industrial‑scale deployments. Finally, the effectiveness of MCF depends on the existence of meaningful domain overlap; automated pre‑analysis to select compatible domains would make the approach more robust in practice.
Conclusion
The paper presents a principled probabilistic framework for multi‑domain collaborative filtering that jointly learns user preferences, domain‑specific item representations, inter‑domain correlations, and bias‑correcting link functions. Empirical results on real‑world datasets validate that adaptive knowledge transfer dramatically alleviates data sparsity and improves recommendation accuracy. The work opens avenues for richer cross‑domain recommendation systems, especially when combined with scalable inference techniques and more expressive correlation models.