Bayesian Deep Learning for Discrete Choice
Discrete choice models (DCMs) are used to analyze individual decision-making in contexts such as transportation choices, political elections, and consumer preferences. DCMs play a central role in applied econometrics by enabling inference on key economic variables, such as marginal rates of substitution, rather than focusing solely on predicting choices on new unlabeled data. However, while traditional DCMs offer high interpretability and support point and interval estimation of economic quantities, they often underperform in predictive tasks compared to deep learning (DL) models. Despite their predictive advantages, DL models remain largely underutilized in discrete choice due to concerns about their lack of interpretability, unstable parameter estimates, and the absence of established methods for uncertainty quantification. Here, we introduce a deep learning architecture specifically designed to integrate with approximate Bayesian inference methods such as Stochastic Gradient Langevin Dynamics (SGLD). Our proposed model collapses to behaviorally informed hypotheses when data is limited, mitigating overfitting and instability in underspecified settings, while retaining the flexibility to capture complex nonlinear relationships when sufficient data is available. We demonstrate our approach with SGLD through a Monte Carlo simulation study, evaluating both predictive metrics, such as out-of-sample balanced accuracy, and inferential metrics, such as empirical coverage of interval estimates for marginal rates of substitution. Additionally, we present results from two empirical case studies: one using revealed mode choice data from New York City, and the other based on the widely used Swiss train choice stated preference data.
💡 Research Summary
This paper presents a methodological contribution to integrating deep learning with Discrete Choice Models (DCMs), which are fundamental tools in economics and transportation engineering for analyzing individual decision-making. While traditional DCMs excel at interpretability and support statistical inference for economic quantities such as the Marginal Rate of Substitution (MRS), they often lack the predictive power of deep learning (DL) models. Conversely, DL models, despite their superior predictive capabilities, suffer from a lack of interpretability, parameter instability, and the absence of robust uncertainty quantification, making them difficult to use for rigorous economic inference.
To bridge this gap, the authors propose a Bayesian Deep Learning framework specifically designed for discrete choice tasks. The core technical innovation lies in the implementation of approximate Bayesian inference using Stochastic Gradient Langevin Dynamics (SGLD). By injecting controlled noise into the SGD training process, SGLD approximates the full posterior distribution of the model parameters. This allows the model to quantify parameter uncertainty, enabling reliable interval estimates for critical economic indicators.
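To make the SGLD mechanics concrete, the sketch below runs SGLD on a synthetic binary logit: each update follows the minibatch gradient of the log posterior, scaled by half the step size, plus Gaussian noise with variance equal to the step size. This is a generic illustration under assumed settings (Gaussian prior, fixed step size), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary-logit data: y ~ Bernoulli(sigmoid(X @ beta_true))
N, d = 2000, 2
X = rng.normal(size=(N, d))
beta_true = np.array([1.0, -2.0])
y = (rng.random(N) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

def grad_log_post(beta, Xb, yb):
    """Stochastic gradient of the log posterior: N(0, 10 I) prior
    plus the minibatch log-likelihood rescaled by N / batch size."""
    p = 1.0 / (1.0 + np.exp(-Xb @ beta))
    grad_lik = Xb.T @ (yb - p) * (N / len(yb))
    grad_prior = -beta / 10.0
    return grad_prior + grad_lik

beta = np.zeros(d)
eps, batch = 1e-4, 100
samples = []
for t in range(6000):
    idx = rng.choice(N, batch, replace=False)
    # SGLD update: half-step drift along the gradient + injected noise
    noise = rng.normal(scale=np.sqrt(eps), size=d)
    beta = beta + 0.5 * eps * grad_log_post(beta, X[idx], y[idx]) + noise
    if t >= 3000:            # discard burn-in
        samples.append(beta.copy())

samples = np.asarray(samples)
print(samples.mean(axis=0))  # posterior mean, roughly near beta_true
```

The retained iterates behave as approximate posterior draws, so posterior means, standard deviations, and interval estimates all come from simple summaries of `samples`.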
The proposed architecture is engineered to preserve the interpretability of traditional models. By utilizing a combination of 1D convolutional layers and fully connected layers, the model explicitly separates alternative-specific attributes from individual-specific characteristics. A key advantage of this Bayesian approach is its inherent ability to balance underfitting and overfitting. In data-scarce environments, the introduction of Bayesian priors guides the model toward behaviorally informed hypotheses, preventing instability. In data-rich environments, the model leverages the flexibility of deep learning to capture complex, non-linear relationships.
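The separation described above can be sketched in numpy: a 1D convolution with kernel size 1 over the alternative axis amounts to a linear map whose weights are shared across alternatives, while individual characteristics enter through a fully connected branch. All layer sizes and the single-hidden-layer branch here are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n, J, k_alt, k_ind = 4, 3, 2, 2  # individuals, alternatives, attrs/alt, person attrs

X_alt = rng.normal(size=(n, J, k_alt))  # alternative-specific attributes (e.g. time, cost)
X_ind = rng.normal(size=(n, k_ind))     # individual-specific characteristics (e.g. income)

# Kernel-size-1 convolution over alternatives == one weight vector shared
# by every alternative, preserving the interpretation of its coefficients.
w_alt = rng.normal(size=(k_alt,))
v_alt = X_alt @ w_alt                   # (n, J): one partial utility per alternative

# Individual characteristics pass through a small fully connected branch
# (hypothetical sizes) that outputs one offset per alternative.
W1 = rng.normal(size=(k_ind, 8))
W2 = rng.normal(size=(8, J))
v_ind = np.tanh(X_ind @ W1) @ W2        # (n, J)

# Softmax over alternatives turns total utilities into choice probabilities.
u = v_alt + v_ind
p = np.exp(u - u.max(axis=1, keepdims=True))
p /= p.sum(axis=1, keepdims=True)
print(p.shape)                          # (4, 3); each row sums to 1
```

Because `w_alt` multiplies the same attributes (time, cost, etc.) for every alternative, ratios of its entries retain the usual marginal-rate-of-substitution reading even though the individual-specific branch is a flexible network.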
The effectiveness of the proposed method was validated through a rigorous experimental design. Using Monte Carlo simulations, the authors evaluated the model not only on predictive metrics, such as out-of-sample balanced accuracy, but also on inferential metrics, specifically the empirical coverage of MRS interval estimates. The results demonstrated that the coverage closely aligns with the target confidence levels. Furthermore, the model was applied to two empirical case studies: revealed mode choice data from New York City and stated preference data regarding Swiss train choices. The findings confirmed that the proposed model provides economic insights (e.g., estimating the value of time) comparable to traditional DCMs while achieving significantly better predictive performance.
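The coverage evaluation described above can be mimicked with a small stand-in experiment: in each Monte Carlo replication, posterior draws of the time and cost coefficients yield draws of the MRS (their ratio), and a 95% percentile interval is checked against the true value. The normal posteriors below are assumptions standing in for a fitted model, chosen so the posterior spread matches the sampling variability (a well-calibrated setting).

```python
import numpy as np

rng = np.random.default_rng(2)
true_mrs = 0.5 / 2.0   # hypothetical beta_time / beta_cost (a value-of-time ratio)

R = 500                # Monte Carlo replications
hits = 0
for _ in range(R):
    # Each replication mimics refitting to a fresh dataset: the posterior
    # centre varies around the truth with the same spread as the posterior.
    c_time = rng.normal(0.5, 0.05)
    c_cost = rng.normal(2.0, 0.1)
    b_time = rng.normal(c_time, 0.05, size=1000)   # posterior draws
    b_cost = rng.normal(c_cost, 0.1, size=1000)
    mrs = b_time / b_cost                          # posterior draws of the MRS
    lo, hi = np.percentile(mrs, [2.5, 97.5])
    hits += lo <= true_mrs <= hi

coverage = hits / R
print(coverage)        # empirical coverage, near the nominal 0.95 level
```

Empirical coverage close to the nominal level is exactly the inferential criterion the simulation study checks: intervals that are too narrow (overconfident) would undercover, and intervals that are too wide would overcover.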
In summary, this research successfully elevates deep learning from a “black-box” predictive tool to an “interpretable economic inference tool.” By providing a framework for uncertainty quantification and statistical rigor, this approach paves the way for the reliable application of deep learning in high-stakes policy-making and economic analysis.