The realized empirical distribution function of stochastic variance with application to goodness-of-fit testing
We propose a nonparametric estimator of the empirical distribution function (EDF) of the latent spot variance of the log-price of a financial asset. We show that over a fixed time span our realized EDF (or REDF) – inferred from noisy high-frequency data – is consistent as the mesh of the observation grid goes to zero. In a double-asymptotic framework, with time also increasing to infinity, the REDF converges to the cumulative distribution function of volatility, if it exists. We exploit these results to construct some new goodness-of-fit tests for stochastic volatility models. In a Monte Carlo study, the REDF is found to be accurate over the entire support of volatility. This leads to goodness-of-fit tests that are both correctly sized and relatively powerful against common alternatives. In an empirical application, we recover the REDF from stock market high-frequency data. We inspect the goodness-of-fit of several two-parameter marginal distributions that are inherent in standard stochastic volatility models. The inverse Gaussian offers the best overall description of random equity variation, but the fit is less than perfect. This suggests an extra parameter (as available in, e.g., the generalized inverse Gaussian) is required to model stochastic variance.
💡 Research Summary
The paper introduces a novel non‑parametric methodology for estimating the empirical distribution function (EDF) of the latent spot variance (the instantaneous variance) of a financial asset’s log‑price using noisy high‑frequency data. The authors call the resulting estimator the Realized Empirical Distribution Function (REDF). The main contributions are threefold.
First, they establish consistency of the REDF on any fixed time interval as the observation mesh Δₙ shrinks to zero, even in the presence of micro‑structure noise and price jumps. Unlike earlier work that relied on non‑overlapping blocks, the authors employ overlapping blocks together with a pre‑averaging scheme to improve efficiency. They prove a uniform bound on the estimation error of the spot variance, showing that the REDF converges to the true occupation‑time based EDF at the usual √Δₙ rate.
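The construction described above can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes noise-free, jump-free prices on an equidistant grid, so the pre-averaging step is omitted and a plain local realized variance over overlapping blocks stands in for the spot variance estimate. The function names, the block length, and the toy constant-variance path are all assumptions made for illustration.

```python
import math
import random

def spot_variance_overlapping(log_prices, dt, block):
    """Local realized variance on overlapping blocks of `block` increments."""
    squared = [(b - a) ** 2 for a, b in zip(log_prices, log_prices[1:])]
    # Rolling sum: each window shares block - 1 increments with its neighbour.
    window_sum = sum(squared[:block])
    estimates = [window_sum / (block * dt)]
    for i in range(block, len(squared)):
        window_sum += squared[i] - squared[i - block]
        estimates.append(window_sum / (block * dt))
    return estimates

def redf(spot_variances, x):
    """Fraction of block estimates (a proxy for occupation time) that are <= x."""
    return sum(1 for v in spot_variances if v <= x) / len(spot_variances)

# Toy path with constant spot variance (hypothetical values).
random.seed(0)
n = 20000
dt = 1.0 / n
sigma2 = 0.04
prices = [0.0]
for _ in range(n):
    prices.append(prices[-1] + math.sqrt(sigma2 * dt) * random.gauss(0.0, 1.0))

est = spot_variance_overlapping(prices, dt, block=400)
print(redf(est, 0.02), redf(est, 0.06))
```

With a constant variance the true EDF is a step function at 0.04, so the sketch should put almost no mass below 0.02 and almost all mass below 0.06. With noisy data, the squared increments would have to be replaced by pre-averaged returns with a bias correction, which this sketch does not attempt.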
Second, they develop a double‑asymptotic theory in which the sampling horizon T also tends to infinity. Under weak regularity conditions on the drift, volatility and jump components, they prove a functional central limit theorem for the REDF: √T (REDF – F) converges in distribution to a Gaussian process, where F denotes the limiting cumulative distribution function of volatility (the long‑run distribution, assuming it exists). This extends the empirical‑process results of van der Vaart and Wellner to a setting with continuous‑time indexing and dependent, non‑i.i.d. observations.
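In symbols, the two asymptotic regimes can be summarized as follows. The notation is assumed here for illustration and may differ from the paper's: $\sigma_s^2$ is the spot variance, $F_T$ the occupation-time EDF over $[0, T]$, $\hat{F}_{n,T}$ the REDF, and $\mathbb{G}$ the Gaussian limit process.

```latex
% Occupation-time EDF of spot variance over [0, T]
F_T(x) = \frac{1}{T} \int_0^T \mathbf{1}\{\sigma_s^2 \le x\} \, ds

% Fixed T, mesh shrinking: the REDF is consistent for F_T
\hat{F}_{n,T}(x) \xrightarrow{\;p\;} F_T(x) \quad \text{as } \Delta_n \to 0

% T increasing: F_T approaches the CDF F of volatility, when it exists
F_T(x) \to F(x) \quad \text{as } T \to \infty

% Functional CLT as stated in the summary above
\sqrt{T}\,\bigl(\hat{F}_{n,T} - F\bigr) \Rightarrow \mathbb{G}
```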
Third, leveraging the asymptotic results, the authors construct two goodness‑of‑fit (GoF) tests for stochastic volatility models. The first test is a Kolmogorov–Smirnov‑type statistic that measures the supremum absolute deviation between the REDF and the model‑implied CDF. The second test is a weighted L² statistic that integrates the squared deviation with a weight function emphasizing the tails. Both statistics involve estimated model parameters, which makes their limiting distributions non‑standard. To obtain critical values, the authors propose a parametric bootstrap: simulated price paths are generated under the fitted model, the REDF is recomputed for each bootstrap sample, and the empirical distribution of the test statistics is used to compute p‑values.
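The two statistics and the bootstrap p-value can be sketched as below. This is an illustrative sketch under stated assumptions, not the paper's implementation: the spot variance estimates are treated as a plain sample, the scaling of the statistics is omitted, the tail-emphasizing weight is taken to be the Anderson-Darling weight 1 / (F (1 - F)), and the add-one p-value convention is a choice made here. The inverse Gaussian CDF is the standard closed form; all function names and the toy inputs are hypothetical.

```python
import bisect
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_gaussian_cdf(x, mu, lam):
    """CDF of the inverse Gaussian with mean mu and shape lam."""
    if x <= 0.0:
        return 0.0
    root = math.sqrt(lam / x)
    return (normal_cdf(root * (x / mu - 1.0))
            + math.exp(2.0 * lam / mu) * normal_cdf(-root * (x / mu + 1.0)))

def ks_statistic(sample, model_cdf):
    """Supremum distance between the sample EDF and the model CDF."""
    xs = sorted(sample)
    n = len(xs)
    # The supremum of a step function minus a CDF is attained at a jump point.
    return max(
        max((i + 1) / n - model_cdf(x), model_cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

def weighted_l2_statistic(sample, model_cdf, grid):
    """Midpoint-rule approximation of the weighted squared distance, integrated dF."""
    xs = sorted(sample)
    n = len(xs)
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        mid = 0.5 * (left + right)
        f_mid = model_cdf(mid)
        if not 0.0 < f_mid < 1.0:
            continue  # weight is undefined where the model CDF is 0 or 1
        edf_mid = bisect.bisect_right(xs, mid) / n
        weight = 1.0 / (f_mid * (1.0 - f_mid))  # assumed tail-emphasizing weight
        total += (edf_mid - f_mid) ** 2 * weight * (model_cdf(right) - model_cdf(left))
    return total

def bootstrap_pvalue(observed, bootstrap_statistics):
    """Share of bootstrap statistics at least as large as the observed one."""
    exceed = sum(1 for b in bootstrap_statistics if b >= observed)
    return (1 + exceed) / (1 + len(bootstrap_statistics))

# Toy usage with hypothetical spot variance estimates and fitted parameters.
variances = [0.5, 0.8, 1.0, 1.3, 2.1]
cdf = lambda x: inverse_gaussian_cdf(x, mu=1.0, lam=2.0)
grid = [0.01 * i for i in range(1, 1001)]
print(ks_statistic(variances, cdf), weighted_l2_statistic(variances, cdf, grid))
```

In a full parametric bootstrap, each entry of `bootstrap_statistics` would come from simulating a price path under the fitted model, recomputing the REDF, re-estimating the parameters, and re-evaluating the statistic; that simulation loop is left out here because it depends on the specific stochastic volatility model.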
Monte‑Carlo experiments compare the proposed REDF‑based tests with traditional tests based on integrated variance. The REDF tests display correct size and superior power, especially against alternatives that differ in the tail behavior of the volatility distribution.
In an empirical application, the authors take tick‑by‑tick log‑price data for several large U.S. equities, apply the pre‑averaging and overlapping‑block estimator to obtain the REDF, and then test a range of two‑parameter marginal distributions that arise as stationary laws of common stochastic volatility models (e.g., the inverse Gaussian and the gamma). The inverse Gaussian provides the best overall fit, but the goodness‑of‑fit statistics still indicate systematic deviations. This suggests that an extra parameter, as available in the three‑parameter generalized inverse Gaussian, may be necessary to capture the full distribution of spot variance.
Overall, the paper delivers a rigorous theoretical foundation for estimating the full distribution of latent volatility from high‑frequency data, proposes practical GoF tests that are easy to compute once the REDF is available, and demonstrates, through both simulation and real‑world data, that the approach yields reliable inference for stochastic volatility modeling. The work bridges a gap between high‑frequency econometrics and model validation, offering a useful tool for researchers and practitioners interested in the dynamics of financial volatility.