Sparse Choice Models
Choice models, which capture popular preferences over objects of interest, play a key role in making decisions whose eventual outcome is impacted by human choice behavior. In most scenarios, the choice model, which can effectively be viewed as a distribution over permutations, must be learned from observed data. The observed data, in turn, may frequently be viewed as (partial, noisy) information about marginals of this distribution over permutations. As such, the search for an appropriate choice model boils down to learning a distribution over permutations that is (near-)consistent with observed information about this distribution. In this work, we pursue a non-parametric approach which seeks to learn a choice model (i.e., a distribution over permutations) with the sparsest possible support that is consistent with observed data. We assume that the observed data consists of noisy information pertaining to the marginals of the choice model we seek to learn. We establish that any choice model admits a very sparse approximation, in the sense that there exists a choice model whose support is small relative to the dimension of the observed data and whose marginals approximately agree with the observed marginal information. We further show that under what we dub 'signature' conditions, such a sparse approximation can be found in a computationally efficient fashion relative to a brute-force approach. An empirical study using the American Psychological Association election dataset suggests that our approach manages to unearth useful structural properties of the underlying choice model via the sparse approximation found. Our results further suggest that the signature condition is a potential alternative to the recently popularized Restricted Null Space condition for efficient recovery of sparse models.
💡 Research Summary
The paper tackles the fundamental problem of learning a choice model—a probability distribution over permutations—when only noisy marginal information is available. Rather than imposing a parametric form such as the Multinomial Logit, the authors adopt a non‑parametric perspective and ask for the sparsest possible distribution that is (approximately) consistent with the observed marginals.
The first theoretical contribution is an existence result: for any choice model, there exists a “very sparse” approximation whose support size grows only with the dimension d of the observed marginal data (roughly O(d log n), where n is the number of items). The proof builds on a generalized Carathéodory theorem for the permutation polytope, showing that any marginal vector can be expressed as a convex combination of a limited number of extreme points (i.e., permutations). This guarantees that, even though the full permutation space is factorial in size, the information contained in the marginals can be captured by a tiny subset of permutations.
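The existence argument can be illustrated numerically. The sketch below (my own illustration, not code from the paper) implements a standard Carathéodory-style reduction over first-order marginals: starting from a dense mixture over all permutations of n = 4 items, it repeatedly finds a null-space direction among the active permutations and moves along it until one weight vanishes, stopping once at most d + 1 atoms remain, where d = n² is the marginal dimension.

```python
import numpy as np
from itertools import permutations

def caratheodory_reduce(P, w, tol=1e-12):
    """Reduce the convex combination b = P @ w to at most d + 1 atoms,
    where d = P.shape[0], without changing b (Caratheodory's theorem)."""
    d = P.shape[0]
    w = w.copy()
    while True:
        active = np.flatnonzero(w > tol)
        if active.size <= d + 1:
            return w
        # Lift each active column by a constant 1 so that any null-space
        # direction preserves both the marginals and the total weight.
        M = np.vstack([P[:, active], np.ones(active.size)])
        # M has more columns than rows, so its null space is nontrivial;
        # the last right-singular vector lies in it.
        v = np.linalg.svd(M)[2][-1]
        # Move along v until the first active weight hits zero.
        neg = v < -tol
        t = np.min(w[active][neg] / -v[neg])
        w[active] += t * v
        w[w < tol] = 0.0

# Toy setup: first-order marginals ("item i in position j") for n = 4.
n = 4
perms = list(permutations(range(n)))
# Column k is the flattened n x n permutation matrix of perms[k]; d = 16.
P = np.array([np.eye(n)[list(p)].flatten() for p in perms]).T
w = np.full(len(perms), 1.0 / len(perms))  # dense uniform mixture, 24 atoms
b = P @ w                                   # observed marginal vector
w_sparse = caratheodory_reduce(P, w)
assert np.allclose(P @ w_sparse, b)                   # marginals preserved
assert np.count_nonzero(w_sparse) <= P.shape[0] + 1   # at most d + 1 atoms
```

Since the permutation-matrix columns actually lie in an affine subspace of dimension (n−1)², an even smaller support is possible; the loop above stops at the generic d + 1 bound for simplicity.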
The second contribution introduces the “signature condition,” a structural property of the marginal matrix that ensures each candidate permutation has a unique pattern of marginal entries. Under this condition the sparse approximation can be recovered efficiently, avoiding the combinatorial explosion of a brute‑force search. The authors propose a greedy‑plus‑ℓ1‑minimization algorithm: iteratively select the permutation that explains the largest residual of the marginal vector, solve a small linear program to re‑weight the selected permutations, and update the residual. The algorithm runs in polynomial time in d and n and provably converges to a solution that satisfies the marginal constraints up to the prescribed noise level.
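To make the recovery loop concrete, here is a hedged sketch in the spirit of the greedy-plus-re-weighting procedure described above. It is not the paper's algorithm: the selection step below is a simple matching-pursuit correlation score, and the re-weighting step uses nonnegative least squares (`scipy.optimize.nnls`) in place of the paper's linear program. The toy instance plants a three-permutation mixture over n = 4 items whose permutation matrices have disjoint supports, so greedy selection provably succeeds here.

```python
import numpy as np
from itertools import permutations
from scipy.optimize import nnls

def greedy_sparse_choice(P, b, k_max, eps=1e-6):
    """Illustrative greedy recovery: repeatedly select the permutation
    column most correlated with the residual, then re-fit nonnegative
    weights on all selected columns and update the residual."""
    selected = []
    w = np.zeros(P.shape[1])
    r = b.copy()
    for _ in range(k_max):
        scores = P.T @ r
        j = int(np.argmax(scores))
        if j in selected or scores[j] <= eps:
            break
        selected.append(j)
        w_sel, _ = nnls(P[:, selected], b)  # re-weight selected atoms
        w = np.zeros(P.shape[1])
        w[selected] = w_sel
        r = b - P @ w
        if np.linalg.norm(r) < eps:
            break
    return w

# Toy instance over n = 4: atoms 0, 7, 16 are the identity and two
# derangements whose permutation matrices are pairwise disjoint.
n = 4
perms = list(permutations(range(n)))
P = np.array([np.eye(n)[list(p)].flatten() for p in perms]).T
true_w = np.zeros(len(perms))
true_w[[0, 7, 16]] = [0.5, 0.3, 0.2]
b = P @ true_w                      # noiseless marginal observations
w_hat = greedy_sparse_choice(P, b, k_max=5)
assert np.linalg.norm(P @ w_hat - b) < 1e-6   # marginals reproduced
assert np.count_nonzero(w_hat) == 3           # true sparse support found
```

With overlapping supports or noise the greedy step can mis-select, which is exactly the gap the signature condition is meant to close.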
To validate the theory, the authors apply their method to the American Psychological Association (APA) election dataset, which contains ranked votes over five candidates. They extract first‑place frequencies and pairwise comparison probabilities as marginal observations. Using only 5–10 permutations, the sparse model reproduces the observed marginals with an average relative error below 5%, outperforming standard parametric baselines. Moreover, the recovered permutations reveal coherent coalition structures among candidates, offering interpretable insights into voter preferences that are difficult to obtain from dense parametric models.
The paper also positions the signature condition as a practical alternative to the Restricted Null Space (RNS) condition that has been popular in compressed sensing literature. While RNS is often hard to verify in real data, the signature condition can be checked directly from the marginal matrix and tends to hold when the number of items is large and the voting behavior is diverse.
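Since the summary emphasizes that the signature condition can be checked directly from the marginal matrix, a minimal sanity-check sketch might look as follows. This encodes only a simplified reading of the condition, namely that each permutation's column in the marginal matrix has a distinct support pattern; the paper's formal definition is not reproduced here.

```python
import numpy as np
from itertools import permutations

def has_distinct_signatures(A, tol=1e-9):
    """Return True if every column of the marginal matrix A has a distinct
    support pattern ('signature'). A simplified, illustrative reading of
    the signature condition, not the paper's exact definition."""
    patterns = {tuple(np.abs(col) > tol) for col in A.T}
    return len(patterns) == A.shape[1]

# First-order marginal matrix over permutations of n = 4 items: distinct
# permutations give distinct 0/1 columns, so the check passes directly.
n = 4
perms = list(permutations(range(n)))
A = np.array([np.eye(n)[list(p)].flatten() for p in perms]).T
assert has_distinct_signatures(A)
```

The check runs in time linear in the number of matrix entries, in contrast to Restricted Null Space-style conditions, which are generally hard to verify.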
In summary, the work makes three key advances: (1) it proves that any choice model admits a highly sparse representation relative to the observed marginal dimension; (2) it identifies a verifiable structural condition (signature) that enables polynomial‑time recovery of such a sparse model; and (3) it demonstrates on real election data that the approach not only achieves accurate marginal reconstruction but also yields interpretable structural information about the underlying preference distribution. The authors suggest future directions including relaxing the signature condition, handling richer noise models, and extending the framework to online or streaming settings. This research opens a promising avenue for scalable, interpretable modeling of complex human choice behavior.