Dynamic Matrix Factorization: A State Space Approach
Matrix factorization from a small number of observed entries has recently garnered much attention as the key ingredient of successful recommendation systems. One unresolved problem in this area is how to adapt current methods to handle changing user preferences over time. Recent proposals to address this issue are heuristic in nature and do not fully exploit the time-dependent structure of the problem. As a principled and general temporal formulation, we propose a dynamical state space model of matrix factorization. Our proposal builds upon probabilistic matrix factorization, a Bayesian model with Gaussian priors. We utilize results in state tracking, such as the Kalman filter, to provide accurate recommendations in the presence of both process and measurement noise. We show how system parameters can be learned via expectation-maximization and provide comparisons to current published techniques.
💡 Research Summary
The paper addresses a fundamental limitation of conventional matrix factorization (MF) techniques used in collaborative filtering: the inability to capture evolving user preferences over time. Building on probabilistic matrix factorization (PMF), which interprets the Frobenius‑norm regularization as Gaussian priors on user and item latent factors, the authors propose a fully probabilistic, linear‑Gaussian state‑space model. In this formulation each user’s latent vector u_i(t) becomes a hidden state x_{i,t} that evolves according to a possibly time‑varying linear transition matrix A_{i,t} with additive Gaussian process noise w_{i,t} ~ N(0, Q_{i,t}). Item factors V are assumed to change slowly and are treated as fixed during the observation window. Observations are modeled as y_{i,t} = H_{i,t} x_{i,t} + z_{i,t}, where H_{i,t} selects the rows of V corresponding to the items rated by user i at time t, and z_{i,t} ~ N(0, R_{i,t}) captures measurement noise and quantization effects.
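The generative process described above can be sketched in a few lines. This is an illustrative simulation, not the authors' code: the problem sizes, the identity transition, and the noise levels are assumptions chosen for readability (the paper's synthetic experiment uses variances σ_Q² = 0.05, σ_R² = 0.1, so the standard deviations below are their square roots).

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_items, T = 5, 50, 20           # latent dim, items, time steps (illustrative sizes)
V = rng.normal(0, 1, (n_items, d))  # item factors, treated as fixed over the window
A = np.eye(d)                       # transition matrix (identity = random-walk dynamics)
q_std = np.sqrt(0.05)               # process-noise std, from sigma_Q^2 = 0.05
r_std = np.sqrt(0.1)                # measurement-noise std, from sigma_R^2 = 0.1

x = rng.normal(0, 1, d)             # initial latent state x_{i,0} for one user
trajectory, observations = [], []
for t in range(T):
    x = A @ x + rng.normal(0, q_std, d)                  # x_{i,t} = A x_{i,t-1} + w_{i,t}
    rated = rng.choice(n_items, size=3, replace=False)   # items user i rates at time t
    H = V[rated]                                         # H_{i,t}: rows of V for rated items
    y = H @ x + rng.normal(0, r_std, len(rated))         # y_{i,t} = H_{i,t} x_{i,t} + z_{i,t}
    trajectory.append(x.copy())
    observations.append((rated, y))
```

Each `observations[t]` pairs the rated item indices with their noisy ratings, exactly the per-time-step data a Kalman filter would consume.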
Because the model is linear‑Gaussian, the optimal MAP estimates of the hidden states can be obtained with a Kalman filter for forward inference and a Rauch‑Tung‑Striebel (RTS) smoother for backward refinement. The authors call the resulting architecture “Collaborative Kalman Filtering” (CKF): a set of N independent Kalman filters (one per user) that share a common item matrix V. The CKF naturally fits into the Expectation‑Maximization (EM) framework. In the E‑step, the forward‑backward passes compute the expected sufficient statistics of the hidden states given current parameter values. In the M‑step, closed‑form updates are derived for the initial state mean and covariance, the transition matrix A, the process‑noise covariance Q, the measurement‑noise covariance R, and the item matrix V. To keep learning tractable, the authors impose several simplifications: (i) time‑invariant parameters (A, Q, R), (ii) homogeneous priors for all users (zero mean, isotropic covariance σ_U^2 I), and (iii) scalar variances for Q and R. Under these assumptions the EM updates reduce to simple trace formulas.
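The forward-backward inference pass (the E-step machinery) can be sketched as a standard Kalman filter followed by an RTS smoother for a single user. This is a generic textbook implementation under the paper's simplifying assumptions (scalar measurement variance `r`, shared transition matrix `A`); the convention that the prior `(mu0, P0)` sits one transition before the first observation is an assumption of this sketch.

```python
import numpy as np

def kalman_rts(ys, Hs, A, Q, r, mu0, P0):
    """Kalman filter + Rauch-Tung-Striebel smoother for one user's latent states.

    ys: list of rating vectors y_t; Hs: list of matrices H_t (rows of the item
    matrix V for the items rated at time t). Q: process-noise covariance,
    r: scalar measurement-noise variance, (mu0, P0): Gaussian prior on the state.
    Returns smoothed means and covariances for t = 0..T-1.
    """
    d, T = len(mu0), len(ys)
    xf, Pf = np.zeros((T, d)), np.zeros((T, d, d))   # filtered moments
    xp, Pp = np.zeros((T, d)), np.zeros((T, d, d))   # one-step predictions
    x, P = mu0, P0
    for t in range(T):
        xp[t], Pp[t] = A @ x, A @ P @ A.T + Q        # predict step
        H, y = Hs[t], ys[t]
        S = H @ Pp[t] @ H.T + r * np.eye(len(y))     # innovation covariance
        K = Pp[t] @ H.T @ np.linalg.inv(S)           # Kalman gain
        x = xp[t] + K @ (y - H @ xp[t])              # measurement update
        P = (np.eye(d) - K @ H) @ Pp[t]
        xf[t], Pf[t] = x, P
    xs, Ps = xf.copy(), Pf.copy()                    # smoothed moments
    for t in range(T - 2, -1, -1):                   # backward RTS pass
        J = Pf[t] @ A.T @ np.linalg.inv(Pp[t + 1])   # smoother gain
        xs[t] = xf[t] + J @ (xs[t + 1] - xp[t + 1])
        Ps[t] = Pf[t] + J @ (Ps[t + 1] - Pp[t + 1]) @ J.T
    return xs, Ps
```

Running N such filters that all index into a shared `V` is the "collaborative" part of CKF; the smoothed moments `(xs, Ps)` are the expected sufficient statistics the M-step consumes.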
Experimental validation is performed on synthetic data generated exactly according to the proposed state‑space model. The authors generate 500 users, 500 items, 20 time steps, and 5 latent dimensions, with only 0.5 % of the possible user‑item‑time entries observed. The transition matrix A is a weighted mixture of the identity and a random dense matrix, normalized to keep state power constant. Process and measurement noise variances are set to σ_Q^2 = 0.05 and σ_R^2 = 0.1. Results show that CKF, after 10–20 EM iterations, quickly converges to accurate estimates of the latent trajectories and the full preference tensor, achieving RMSE close to the theoretical lower bound given by the RTS smoother with known parameters. In contrast, static SVD and time‑aware SVD (timeSVD) perform poorly: SVD cannot model any dynamics, while timeSVD only captures linear drift and often overfits, leading to higher RMSE than even the static baseline. Visualizations of user trajectories illustrate that CKF can track curved (arc‑shaped) paths in latent space, whereas the competing methods collapse to a single point or a straight line.
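The synthetic setup above can be reproduced schematically. The mixture weight `alpha` and the use of spectral normalization to keep the state power from growing are assumptions of this sketch (the paper says only that `A` is a normalized identity/random mixture); the observation count follows the stated 0.5% sampling rate.

```python
import numpy as np

rng = np.random.default_rng(2)
d, alpha = 5, 0.3                            # latent dim; mixture weight (assumed)

M = rng.normal(0, 1, (d, d))                 # random dense component
A = (1 - alpha) * np.eye(d) + alpha * M      # weighted identity/random mixture
A /= np.linalg.norm(A, 2)                    # spectral normalization: one plausible
                                             # way to keep state power bounded

# sample 0.5% of the user-item-time triples, as in the synthetic experiment
n_users, n_items, T = 500, 500, 20
n_obs = int(0.005 * n_users * n_items * T)   # 25,000 observed entries
flat = rng.choice(n_users * n_items * T, size=n_obs, replace=False)
users, rest = np.divmod(flat, n_items * T)   # decode flat index -> (user, item, time)
items, times = np.divmod(rest, T)
```

With so few observations per user per time step, the temporal coupling through `A` is what makes recovery possible at all, which is why the static SVD baseline has no way to exploit it.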
The paper concludes that embedding MF within a dynamical state‑space framework yields a principled, statistically sound method for temporal recommendation. The CKF/EM combination offers both interpretability (explicit process and measurement noise) and empirical superiority on data that follow the assumed dynamics. Future work includes applying the model to real‑world datasets, extending to non‑linear dynamics (e.g., extended Kalman filters or particle filters), and exploring richer transition structures learned from data. Overall, the work bridges collaborative filtering and control‑theoretic state estimation, opening a promising direction for time‑aware recommender systems.