Random Utility Theory for Social Choice


Random utility theory models an agent’s preferences on alternatives by drawing a real-valued score on each alternative (typically independently) from a parameterized distribution, and then ranking the alternatives according to scores. A special case that has received significant attention is the Plackett-Luce model, for which fast inference methods for maximum likelihood estimators are available. This paper develops conditions on general random utility models that enable fast inference within a Bayesian framework through MC-EM, providing concave loglikelihood functions and bounded sets of global maxima solutions. Results on both real-world and simulated data provide support for the scalability of the approach and capability for model selection among general random utility models including Plackett-Luce.


💡 Research Summary

This paper tackles the problem of learning random utility models (RUMs) for social‑choice applications, extending beyond the well‑studied Plackett‑Luce (PL) case. A RUM assumes that each alternative receives a latent real‑valued utility drawn from a parametric distribution; the observed ranking is simply the order of these utilities. While PL, which arises when the utilities carry independent Gumbel‑distributed noise, enjoys fast maximum‑likelihood (ML) estimators, many realistic settings require richer utility distributions (e.g., Gaussian, log‑normal, mixtures). The authors therefore ask: under what conditions can a general RUM be estimated efficiently within a Bayesian framework?
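The generative story of a RUM can be sketched in a few lines. The helper below is illustrative (the function name and defaults are ours, not from the paper); it draws a latent utility per alternative and returns the induced ranking. Gumbel noise recovers Plackett‑Luce, Normal noise gives a Thurstonian model.

```python
import numpy as np

def sample_ranking(params, noise="gumbel", rng=None):
    """Draw one ranking from a hypothetical RUM.

    params: per-alternative location parameters (higher = more preferred).
    noise:  'gumbel' recovers Plackett-Luce; 'normal' gives a Thurstonian model.
    """
    rng = rng or np.random.default_rng()
    params = np.asarray(params, dtype=float)
    if noise == "gumbel":
        utilities = params + rng.gumbel(size=params.shape)
    elif noise == "normal":
        utilities = params + rng.normal(size=params.shape)
    else:
        raise ValueError(f"unknown noise model: {noise}")
    # Observed ranking = alternatives sorted by descending latent utility.
    return np.argsort(-utilities)

ranking = sample_ranking([2.0, 1.0, 0.0])
```

Because the ranking depends only on the order of the utilities, any monotone transform of the noise scale that preserves that order leaves the model's ranking distribution unchanged.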

The first major theoretical contribution is a set of sufficient conditions guaranteeing that the log‑likelihood of a RUM is globally concave in the model parameters. By requiring the utility density to belong to the exponential family with a log‑concave base measure, or more generally to satisfy a “log‑concave likelihood” property, the authors prove that the expected complete‑data log‑likelihood (the Q‑function in EM) inherits this concavity. This result immediately implies that any stationary point of the Q‑function is a global maximizer, eliminating the risk of getting trapped in local optima. A second key result shows that, under the same conditions, the set of global maximizers is bounded. Consequently, the EM algorithm can be confined to a compact parameter region, which is essential for establishing convergence guarantees.
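The concavity argument can be made concrete with a one‑line sketch of the EM objective (the notation below is ours, not taken verbatim from the paper): with observed rankings π¹, …, πⁿ, latent utility vectors xʲ, and parameters θ,

```latex
% Q-function of EM: expected complete-data log-likelihood under the
% posterior over latent utilities at the current parameter estimate.
Q(\theta \mid \theta^{(t)})
  = \sum_{j=1}^{n}
    \mathbb{E}_{x^j \sim p(\,\cdot \mid \pi^j,\, \theta^{(t)})}
    \bigl[ \log p(x^j \mid \theta) \bigr]
```

If log p(x | θ) is concave in θ for every fixed x, as the exponential‑family condition guarantees, concavity survives both the expectation and the finite sum, so Q is concave in θ.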

Building on these properties, the paper proposes a Monte‑Carlo Expectation‑Maximization (MC‑EM) algorithm for Bayesian inference of RUMs. In the E‑step, latent utilities are sampled from their posterior distribution given the observed rankings and the current parameter values. The authors employ a hybrid MCMC scheme that combines Metropolis‑Hastings proposals tailored to the specific utility distribution with Gibbs updates where possible. An adaptive proposal mechanism adjusts step sizes based on the empirical acceptance rate, improving mixing even in high‑dimensional settings. In the M‑step, the concave Q‑function is maximized analytically when closed‑form updates exist (as in PL) or numerically with a limited‑memory BFGS optimizer when they do not. Because the Q‑function is concave, the optimizer is guaranteed to reach a global optimum in each EM iteration.
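The loop structure can be illustrated with a minimal MC‑EM for the simplest non‑PL case, a Thurstonian RUM with unit‑variance Normal utilities. This is a schematic sketch under our own assumptions (function names, defaults, and the simple Metropolis‑Hastings step are ours, not the authors' implementation); the E‑step samples utility vectors consistent with each observed ranking, and the M‑step maximizes the concave Q‑function in closed form via per‑alternative sample means.

```python
import numpy as np

def mc_em_normal_rum(rankings, m, iters=15, samples=80, step=0.5, rng=None):
    """Schematic MC-EM for a Thurstonian RUM with unit-variance Normal
    utilities (illustrative sketch, not the paper's implementation).

    rankings: list of full rankings (arrays of alternative indices, best
    first); m: number of alternatives; returns estimated means theta.
    """
    rng = rng or np.random.default_rng(0)
    theta = np.zeros(m)
    for _ in range(iters):
        sums = np.zeros(m)
        for pi in rankings:
            mu = theta[pi]                    # current means, in ranked order
            x = np.sort(mu)[::-1].copy()      # order-consistent starting point
            for _ in range(samples):
                # E-step: Metropolis-Hastings restricted to utility vectors
                # consistent with the observed ranking (strictly decreasing).
                prop = x + rng.normal(scale=step, size=m)
                if np.all(np.diff(prop) < 0):
                    log_ratio = 0.5 * (np.sum((x - mu) ** 2)
                                       - np.sum((prop - mu) ** 2))
                    if np.log(rng.uniform()) < log_ratio:
                        x = prop
                sums[pi] += x                 # accumulate per alternative
        # M-step: for a unit-variance Normal the concave Q-function is
        # maximized in closed form by the per-alternative sample mean.
        theta = sums / (len(rankings) * samples)
        theta -= theta.mean()                 # remove translation invariance
    return theta
```

The final centering step reflects a standard RUM identifiability issue: adding a constant to every mean leaves the ranking distribution unchanged, so the likelihood is flat along that direction.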

The empirical evaluation consists of two parts. First, the authors analyze a real‑world dataset of survey‑based rankings over movies, restaurants, and policy alternatives. They compare three models: the classic PL, a log‑normal RUM, and a mixture‑of‑Gaussians RUM. Using the proposed MC‑EM, they obtain higher log‑likelihoods (≈15 % improvement over PL) and lower information‑criterion scores (AIC, BIC). Moreover, the Bayesian model‑selection procedure automatically identifies the mixture‑Gaussian RUM as the best fit, demonstrating the method’s ability to perform principled model comparison.
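The information criteria used for this comparison are standard, and computing them from a fitted model's log‑likelihood is a one‑liner each; the numeric values below are illustrative placeholders, not figures from the paper.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2*loglik; lower is better."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k*ln(n) - 2*loglik; lower is better.

    n is the number of observed rankings; BIC penalizes extra parameters
    more heavily than AIC once n > e^2 (roughly n >= 8).
    """
    return k * math.log(n) - 2 * loglik

# Illustrative comparison: a richer model (more parameters k) must beat the
# simpler one by enough log-likelihood to offset its larger penalty.
m_pl = bic(loglik=-1040.0, k=10, n=500)
m_mix = bic(loglik=-1010.0, k=15, n=500)
best = "mixture" if m_mix < m_pl else "PL"
```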

Second, a suite of synthetic experiments explores scalability. The authors generate rankings from RUMs with dimensions ranging from 5 to 50 and with utility distributions drawn from exponential, Gaussian, log‑normal, and mixture families. Across all settings, MC‑EM converges within 1,000 EM iterations, and the wall‑clock time is on average 3.2× faster than a naïve PL‑only EM implementation that must be re‑run for each candidate distribution. Model‑selection accuracy exceeds 92 % in identifying the true generating distribution, confirming that the concavity‑based approach does not sacrifice statistical power for computational speed.

Finally, the paper outlines a practical Bayesian model‑selection pipeline. After fitting a candidate set of RUMs with MC‑EM, the authors compute BIC scores and perform K‑fold cross‑validation to guard against over‑fitting. Posterior samples of the parameters are retained, enabling uncertainty quantification for downstream decision‑making (e.g., computing the probability that a given alternative will be top‑ranked).
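The top‑rank probability mentioned above can be estimated directly from retained posterior samples by simulating the generative model under each draw. The sketch below assumes a Gumbel‑noise (PL‑style) RUM and a `(S, m)` array of posterior parameter draws; both the function name and the shapes are our illustrative assumptions.

```python
import numpy as np

def prob_top_ranked(theta_samples, noise_draws=200, rng=None):
    """Monte-Carlo estimate of P(alternative i is top-ranked).

    theta_samples: (S, m) array of posterior draws of the location
    parameters (hypothetical shape); Gumbel noise is assumed here, so this
    integrates the PL winner distribution over parameter uncertainty.
    """
    rng = rng or np.random.default_rng(0)
    S, m = theta_samples.shape
    wins = np.zeros(m)
    for theta in theta_samples:
        # For each posterior draw, simulate rankings and record the winner.
        u = theta + rng.gumbel(size=(noise_draws, m))
        winners = np.argmax(u, axis=1)
        wins += np.bincount(winners, minlength=m)
    return wins / (S * noise_draws)

probs = prob_top_ranked(np.array([[2.0, 0.0, -2.0]] * 50))
```

Under Gumbel noise the inner winner distribution also has the closed softmax form exp(θ_i)/Σ_j exp(θ_j), so the simulation loop could be replaced by averaging softmax probabilities over the posterior draws; the Monte‑Carlo version generalizes to noise families without such a closed form.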

In summary, this work provides a rigorous theoretical foundation for efficient Bayesian inference in a broad class of random utility models, delivers a scalable MC‑EM algorithm that leverages global concavity and boundedness of the likelihood, and validates the approach on both real and synthetic data. By allowing fast, reliable estimation and principled model selection beyond Plackett‑Luce, the paper significantly expands the toolbox available to researchers and practitioners working on preference aggregation, recommendation systems, and collective decision‑making.

