Learning Preference from Observed Rankings

Notice: This research summary and analysis were generated automatically using AI. For full accuracy, please refer to the original arXiv source.

Estimating consumer preferences is central to many problems in economics and marketing. This paper develops a flexible framework for learning individual preferences from partial ranking information by interpreting observed rankings as collections of pairwise comparisons with logistic choice probabilities. We model latent utility as the sum of interpretable product attributes, item fixed effects, and a low-rank user-item factor structure, enabling both interpretability and information sharing across consumers and items. We further correct for selection in which comparisons are observed: a comparison is recorded only if both items enter the consumer’s consideration set, inducing exposure bias toward frequently encountered items. We model pair observability as the product of item-level observability propensities and estimate these propensities with a logistic model for the marginal probability that an item is observable. Preference parameters are then estimated by maximizing an inverse-probability-weighted (IPW), ridge-regularized log-likelihood that reweights observed comparisons toward a target comparison population. To scale computation, we propose a stochastic gradient descent (SGD) algorithm based on inverse-probability resampling, which draws comparisons in proportion to their IPW weights. In an application to transaction data from an online wine retailer, the method improves out-of-sample recommendation performance relative to a popularity-based benchmark, with particularly strong gains in predicting purchases of previously unconsumed products.


💡 Research Summary

The paper tackles the problem of learning individual consumer preferences from partial ranking data, a setting common in e‑commerce and marketing where only a subset of items is ever compared. The authors reinterpret each observed ranking as a collection of pairwise comparisons and model the probability that a consumer prefers item i over item j with a logistic (Bradley‑Terry‑Luce‑type) choice model.
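The logistic pairwise choice probability described above can be sketched in a few lines. This is an illustrative implementation of the Bradley-Terry-Luce comparison rule, not the authors' code; the function name is hypothetical.

```python
import math

def pref_prob(u_i: float, u_j: float) -> float:
    """Logistic (Bradley-Terry-Luce) probability that the item with
    latent utility u_i is preferred to the item with utility u_j:
    P(i > j) = sigma(u_i - u_j)."""
    return 1.0 / (1.0 + math.exp(-(u_i - u_j)))

# Equal utilities imply indifference (probability 0.5); a higher
# utility gap pushes the preference probability toward 1.
```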

Utility specification
The latent utility for user u and item i is decomposed as
(u_{ui}=x_i^{\top}\beta_u + \alpha_i + \gamma_{ui}).
The first term captures interpretable product attributes (e.g., sweetness, price) weighted by user‑specific coefficients, allowing marketers to understand attribute importance. The second term, (\alpha_i), is an item fixed effect that absorbs overall popularity. The third term, (\gamma_{ui}=p_u^{\top}q_i), is a low‑rank user‑item factor that shares information across users and items, similar to matrix factorization in collaborative filtering. This hybrid structure provides both interpretability and predictive power.
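The three-part utility decomposition can be written directly as code. The sketch below is a minimal illustration of the formula (u_{ui}=x_i^{\top}\beta_u + \alpha_i + p_u^{\top}q_i); the numeric values are hypothetical and chosen only to show the shapes involved (two attributes, rank-3 factors).

```python
import numpy as np

def latent_utility(x_i, beta_u, alpha_i, p_u, q_i):
    """u_{ui} = x_i' beta_u + alpha_i + p_u' q_i:
    attribute part + item fixed effect + low-rank user-item interaction."""
    return float(x_i @ beta_u + alpha_i + p_u @ q_i)

# Hypothetical example values:
x_i = np.array([0.4, -1.2])     # item attributes, e.g. sweetness, standardized price
beta_u = np.array([0.8, -0.5])  # user-specific attribute weights
alpha_i = 0.3                   # item fixed effect (overall popularity)
p_u = np.array([0.1, 0.0, 0.2])  # user factor
q_i = np.array([0.5, 1.0, -0.4]) # item factor
u_ui = latent_utility(x_i, beta_u, alpha_i, p_u, q_i)
```

The attribute term stays interpretable (each coefficient in `beta_u` maps to a named attribute), while the factor term `p_u @ q_i` absorbs residual taste heterogeneity, as in matrix factorization.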

Selection bias and propensity modeling
A key contribution is the explicit treatment of exposure bias: a pairwise comparison is recorded only if both items appear in the consumer’s consideration set. The authors model the probability that item i is observable (i.e., enters the consideration set) as (\pi_i = \text{logit}^{-1}(z_i^{\top}\eta)), where (z_i) may include exposure variables such as search frequency, advertising spend, or shelf placement. Assuming independence across items, the probability that a comparison (i, j) is observed is approximated by (\pi_i\pi_j).
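The propensity model and the independence approximation translate to a short sketch. This is an assumed implementation consistent with the formulas above, with hypothetical function names; (z_i) would hold exposure covariates and (\eta) their fitted logistic coefficients.

```python
import numpy as np

def item_propensity(z_i, eta):
    """pi_i = logit^{-1}(z_i' eta): marginal probability that item i
    enters the consideration set (is observable)."""
    return 1.0 / (1.0 + np.exp(-(z_i @ eta)))

def pair_obs_prob(z_i, z_j, eta):
    """Under independence across items, the probability that the
    comparison (i, j) is observed is approximated by pi_i * pi_j."""
    return item_propensity(z_i, eta) * item_propensity(z_j, eta)
```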

Inverse‑probability weighting (IPW)
To correct for this selection mechanism, each observed comparison receives a weight (w_{ij}=1/(\pi_i\pi_j)). The ridge-regularized weighted log-likelihood to be maximized is
(\ell(\theta)=\sum_{(u,i,j)\in\mathcal{O}} w_{ij}\,\log\sigma(u_{ui}-u_{uj})-\lambda\lVert\theta\rVert^{2}),
where (\mathcal{O}) collects the observed comparisons in which user u prefers item i to item j, (\sigma) is the logistic function, and (\lambda) is the ridge penalty. Upweighting rarely exposed pairs reweights the observed comparisons toward the target comparison population.
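A numerically stable evaluation of the IPW-weighted objective can be sketched as follows. This is an illustrative implementation, not the authors' code; `delta_u` stacks the utility differences of the observed comparisons, `w` their inverse-probability weights, and `theta` the parameter vector entering the ridge penalty.

```python
import numpy as np

def ipw_log_likelihood(delta_u, w, theta, lam):
    """IPW, ridge-regularized log-likelihood over observed comparisons.
    delta_u[k] = u_{ui} - u_{uj} for the k-th comparison (i preferred
    to j); w[k] = 1 / (pi_i * pi_j); lam is the ridge penalty."""
    # log sigma(d) = -log(1 + exp(-d)), computed stably via log1p.
    log_sigma = -np.log1p(np.exp(-delta_u))
    return float(np.sum(w * log_sigma) - lam * np.sum(theta ** 2))
```

In an SGD variant based on inverse-probability resampling (as described in the abstract), one would instead draw comparisons with probability proportional to `w` and apply unweighted gradient updates, which is equivalent in expectation.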

