A market resilient data-driven approach to option pricing

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

In this paper, we present a data-driven ensemble approach for option price prediction whose derivation is based on the no-arbitrage theory of option pricing. Using the theoretical treatment, we derive a common representation space for achieving domain adaptation. The success of an implementation of this idea is shown using some real data. Then we report several experimental results for critically examining the performance of the derived pricing models.


💡 Research Summary

The paper proposes a novel data‑driven ensemble framework for option price prediction that is grounded in the no‑arbitrage theory of option pricing. It begins by revisiting the classic Black‑Scholes‑Merton (BSM) setting and formalising two distinct modelling approaches. The first, called the Homogeneity Hint (HH) approach, exploits a scale‑free property: for two risky assets whose risk‑neutral dynamics are identical, the ratio of the option price to the underlying price is the same when the contracts share the same maturity and moneyness. This property is proved in Theorem 2.2 and provides a simple way to train a model on one asset class and apply it to another, provided the log‑return distributions under the risk‑neutral measure are essentially the same. The authors acknowledge that this assumption is often too restrictive in practice.
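The scale-free property behind the HH approach can be checked numerically in the plain BSM setting. The sketch below is only an illustration under stated assumptions: it uses the textbook Black-Scholes call formula with constant volatility, and it takes moneyness to mean strike divided by spot, which may differ from the paper's convention. The function names (`bs_call`, `price_ratio`) and all numeric inputs are hypothetical.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot: float, strike: float, tau: float, sigma: float, rate: float = 0.0) -> float:
    """Textbook Black-Scholes price of a European call (constant volatility)."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * tau) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    return spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d2)


def price_ratio(spot: float, moneyness: float, tau: float, sigma: float, rate: float = 0.0) -> float:
    """Option price divided by spot, with the strike set from moneyness = strike / spot (assumed convention)."""
    strike = moneyness * spot
    return bs_call(spot, strike, tau, sigma, rate) / spot


# Two hypothetical assets with very different price levels but identical
# risk-neutral dynamics (same sigma and rate), same maturity and moneyness.
ratio_a = price_ratio(spot=100.0, moneyness=1.05, tau=0.25, sigma=0.2, rate=0.05)
ratio_b = price_ratio(spot=40000.0, moneyness=1.05, tau=0.25, sigma=0.2, rate=0.05)

print(ratio_a, ratio_b)  # the two ratios agree up to floating-point error
```

Because the ratio depends only on moneyness, maturity, and the shared dynamics, a model trained on price-to-spot ratios for one asset can in principle be reused on the other. Changing `sigma` for one of the two assets breaks the equality, which is the restriction the summary notes.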

To overcome this limitation, the second approach introduces a common representation space that bridges disparate risk‑neutral return distributions. By parametrising the asset dynamics (e.g., geometric Brownian motion with possibly stochastic volatility), the authors define a “volatility scalar”: the root‑mean‑square of the underlying’s volatility over the remaining life of the option. This scalar is used to rescale the features, creating a latent space in which data from different assets become comparable. In this space, supervised learning models can be trained on one market and transferred to another despite substantial distributional shifts.
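As a rough illustration of the volatility scalar described above, the sketch below computes the root-mean-square of a volatility path sampled on an evenly spaced grid over the option's remaining life. The summary does not say exactly how the scalar rescales the features, so `rescaled_features` is a hypothetical choice (log-moneyness divided by total volatility), not the paper's construction; all names and inputs are assumptions.

```python
import math
from typing import Dict, Sequence


def volatility_scalar(vol_path: Sequence[float]) -> float:
    """Root-mean-square of volatility values sampled on an evenly spaced
    time grid over the remaining life of the option."""
    if len(vol_path) == 0:
        raise ValueError("vol_path must contain at least one value")
    mean_square = sum(v * v for v in vol_path) / len(vol_path)
    return math.sqrt(mean_square)


def rescaled_features(spot: float, strike: float, tau: float, vol_scalar: float) -> Dict[str, float]:
    """Hypothetical rescaling (an assumption, not taken from the paper):
    express log-moneyness in units of total volatility over the option's life."""
    total_vol = vol_scalar * math.sqrt(tau)
    return {
        "total_vol": total_vol,
        "scaled_log_moneyness": math.log(strike / spot) / total_vol,
    }


# Hypothetical annualised volatility path for the remaining life of an option.
path = [0.18, 0.22, 0.25, 0.21]
scalar = volatility_scalar(path)
features = rescaled_features(spot=100.0, strike=105.0, tau=0.25, vol_scalar=scalar)
print(scalar, features)
```

With a constant path the scalar reduces to the constant volatility itself, so the rescaling coincides with the usual BSM normalisation in that special case.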

A key contribution is the Domain Shift Quotient (DSQ), a quantitative measure of the discrepancy between the risk‑neutral return laws of the training and test assets. DSQ is used to weight the two constituent models (HH and the domain‑shift‑aware model) in an ensemble. When DSQ is large (a significant shift), the ensemble relies more on the domain‑shift model; when DSQ is small, the HH component dominates.
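The weighting idea can be sketched as a convex combination of the two model outputs. The summary gives neither the DSQ formula nor the weighting function, so the mapping `dsq / (dsq + scale)` below is a hypothetical stand-in chosen only because it is monotone and stays in [0, 1); `shift_weight`, `ensemble_price`, and the `scale` parameter are all assumptions.

```python
def shift_weight(dsq: float, scale: float = 1.0) -> float:
    """Hypothetical monotone map from a non-negative DSQ to a weight in [0, 1).
    This is a placeholder, not the weighting rule used in the paper."""
    if dsq < 0.0:
        raise ValueError("dsq is assumed to be non-negative")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    return dsq / (dsq + scale)


def ensemble_price(hh_price: float, shift_price: float, dsq: float, scale: float = 1.0) -> float:
    """Blend the HH prediction and the domain-shift-aware prediction.
    A large DSQ favours the domain-shift model; a small DSQ favours HH."""
    w = shift_weight(dsq, scale)
    return w * shift_price + (1.0 - w) * hh_price


# Hypothetical predictions from the two constituent models.
print(ensemble_price(hh_price=10.0, shift_price=12.0, dsq=0.0))   # equals the HH price
print(ensemble_price(hh_price=10.0, shift_price=12.0, dsq=50.0))  # close to the shift-model price
```

Any other monotone map into [0, 1] would serve the same illustrative purpose; the only property relied on here is that the weight on the domain-shift model grows with DSQ.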

Empirical validation uses option data from two major Indian indices – NIFTY 50 and NIFTY Bank – traded on the National Stock Exchange (NSE). The authors focus on the COVID‑19 lockdown period, during which the statistical properties of the two indices changed markedly, providing a natural testbed for domain adaptation. Features include strike price, time‑to‑maturity, and the volatility scalar; macro‑economic variables are deliberately omitted to preserve interpretability. Results show that the HH‑only model suffers a large increase in mean absolute error (MAE) during the lockdown, whereas the domain‑shift model maintains stable performance. The DSQ‑weighted ensemble improves overall MAE by roughly 15 % and yields the most pronounced gains in the high‑shift regime.

Further experiments explore multi‑source training, where data from both indices are combined. The resulting model demonstrates reliable predictions even for a third asset with scant historical data, illustrating the robustness of the common representation space.

The paper also discusses limitations: the volatility scalar is estimated under a constant‑volatility assumption, which may not capture all stochastic volatility effects; hedging and risk‑management aspects are not addressed; and large‑scale cross‑market validation is left for future work.

In summary, the authors contribute (1) a rigorous linkage between no‑arbitrage theory and data‑driven pricing, (2) a parametric common representation that enables domain adaptation across assets, (3) a novel DSQ metric for ensemble weighting, and (4) empirical evidence of superior performance on real‑world option data with pronounced market regime shifts. The methodology offers a promising template for applying machine‑learning techniques to financial pricing problems where data scarcity and distributional drift are major challenges.

