High Dimensional Mean Test for Shrinking Random Variables with Applications to Backtesting
We propose a high dimensional mean test framework for shrinking random variables, where the underlying random variables shrink to zero as the sample size increases. By pooling observations across overlapping subsets of dimensions, we estimate subset means and test whether the maximum absolute mean deviates from zero. This approach overcomes cancellations that occur in simple averaging and remains valid even when marginal asymptotic normality fails. We establish theoretical properties of the test statistic and develop a multiplier bootstrap procedure to approximate its distribution. The method provides a flexible and powerful tool for the validation and comparative backtesting of value-at-risk. Simulations show superior performance in high-dimensional settings, and a real-data application demonstrates its practical effectiveness in backtesting.
💡 Research Summary
The paper addresses the problem of testing whether the mean vector of a high‑dimensional random vector is zero when each component shrinks toward zero as the sample size grows. In such “shrinking” settings the classical high‑dimensional mean tests, which rely on marginal central limit theorems for each coordinate, break down because the normalized sums of the individual components no longer converge to a normal distribution.
To overcome this difficulty the authors propose a two‑step testing framework. The first, naïve approach pools all p coordinates into a single scalar Y_i = Σ_{j=1}^p X_{i,j} and forms the standardized statistic T = √n \bar Y / \hat σ. Under mild ρ‑mixing, variance lower‑bound, and uniform moment conditions (A)–(C) they prove that T converges to N(0,1). However, this test only checks the weaker hypothesis that the pooled mean μ_Y = Σ μ_j equals zero, which is not equivalent to the original hypothesis H₀: μ_j = 0 for all j. Consequently the power of the naïve test can be very low.
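The naïve pooled statistic described above can be sketched in a few lines of numpy. This is an illustrative implementation of the formula T = √n \bar Y / \hat σ only; the function name and interface are ours, not the paper's.

```python
import numpy as np

def naive_pooled_test(X):
    """Naive pooled test sketch: collapse each observation X[i, :]
    to the scalar Y_i = sum_j X[i, j], then standardize,
    T = sqrt(n) * mean(Y) / sd(Y).
    Detects only deviations of the pooled mean sum_j mu_j."""
    n = X.shape[0]
    Y = X.sum(axis=1)              # pooled scalars Y_1, ..., Y_n
    sigma_hat = Y.std(ddof=1)      # sample standard deviation
    return np.sqrt(n) * Y.mean() / sigma_hat

# Hypothetical data with shrinking components: n = 250, p = 100,
# each coordinate scaled by 1/sqrt(p) so it vanishes as p grows.
rng = np.random.default_rng(0)
X = rng.normal(size=(250, 100)) / np.sqrt(100)
T = naive_pooled_test(X)           # approximately N(0, 1) under H0
```

As the summary notes, T converging to N(0,1) only lets us reject when Σ μ_j ≠ 0: means of opposite signs can cancel in the pooling, which is exactly why the power of this test can be very low.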
The core contribution is a subsets‑based test. The p dimensions are partitioned into d (possibly overlapping) subsets S₁,…,S_d, each of size q. For each subset ℓ a pooled variable Y_i^{(ℓ)} = Σ_{j∈S_ℓ} X_{i,j} is constructed, and a statistic T^{(ℓ)} = √n \bar Y^{(ℓ)} / \hat σ_ℓ is computed. The final test statistic is M = max_{ℓ=1,…,d} |T^{(ℓ)}|. Lemma 1 shows that if q and p are coprime, the collection of p cyclic subsets of length q makes the null hypothesis “all subset means are zero” equivalent to the original H₀. In practice the authors recommend using d > p subsets, adding user‑defined groups that reflect domain knowledge, to increase power.
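The cyclic-subsets construction of Lemma 1 and the max statistic M can be sketched as follows; this is a minimal illustration under our own naming, not the authors' code.

```python
import math
import numpy as np

def cyclic_subsets(p, q):
    """The p cyclic subsets of length q: S_l = {l, l+1, ..., l+q-1} mod p.
    Lemma 1 requires gcd(p, q) = 1 for 'all subset means zero' to be
    equivalent to 'all p component means zero'."""
    assert math.gcd(p, q) == 1, "q and p must be coprime"
    return [[(l + k) % p for k in range(q)] for l in range(p)]

def max_subset_stat(X, subsets):
    """M = max_l |T^(l)|, where T^(l) = sqrt(n) * Ybar^(l) / sigma_hat_l
    and Y_i^(l) pools the coordinates in subset S_l."""
    n = X.shape[0]
    stats = []
    for S in subsets:
        Y = X[:, S].sum(axis=1)        # pooled variable for subset S_l
        stats.append(np.sqrt(n) * Y.mean() / Y.std(ddof=1))
    return float(np.max(np.abs(stats)))

# Hypothetical example: p = 101 and q = 10 are coprime.
subsets = cyclic_subsets(101, 10)
rng = np.random.default_rng(0)
X = rng.normal(size=(250, 101)) / np.sqrt(101)   # shrinking components
M = max_subset_stat(X, subsets)
```

Because any single coordinate j appears in some subset with different neighbours across the p cyclic windows, a nonzero μ_j cannot cancel out of every subset mean, which is the intuition behind the equivalence in Lemma 1.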
Because the T^{(ℓ)} are strongly dependent, the asymptotic distribution of M is not analytically tractable. The authors therefore employ a multiplier bootstrap: i.i.d. N(0,1) multipliers ξ_i are drawn, and bootstrap versions T_B^{(ℓ)} = √n (∑ ξ_i Y_i^{(ℓ)}) / \hat σ_ℓ are formed, leading to M_B = max_{ℓ}|T_B^{(ℓ)}|. Under strengthened moment and dimensionality conditions (A), (B′), (C′), (D) and a boundedness assumption on the observations, Theorem 2 establishes that the bootstrap critical value yields a test with exact asymptotic size α. Theorem 3 proves consistency: under a local alternative where at least one subset mean exceeds √(λ log d)/√n (with λ → ∞), the power tends to one.
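A multiplier-bootstrap critical value for M can be sketched as below. This is our own illustrative version (in particular, we center the pooled variables before multiplying, a common convention), not necessarily the paper's exact implementation.

```python
import numpy as np

def bootstrap_critical_value(X, subsets, alpha=0.05, B=500, rng=None):
    """Multiplier bootstrap for M = max_l |T^(l)|.
    Each replicate draws i.i.d. N(0,1) multipliers xi_i and forms
    T_B^(l) = sqrt(n) * mean(xi_i * (Y_i^(l) - Ybar^(l))) / sigma_hat_l;
    the (1 - alpha) quantile of M_B = max_l |T_B^(l)| is returned."""
    rng = rng or np.random.default_rng()
    n = X.shape[0]
    # n x d matrix of pooled variables, one column per subset
    Y = np.stack([X[:, S].sum(axis=1) for S in subsets], axis=1)
    Yc = Y - Y.mean(axis=0)                   # center each column
    sig = Y.std(axis=0, ddof=1)
    M_B = np.empty(B)
    for b in range(B):
        xi = rng.standard_normal(n)           # i.i.d. N(0,1) multipliers
        T_B = np.sqrt(n) * (xi @ Yc) / n / sig
        M_B[b] = np.max(np.abs(T_B))
    return float(np.quantile(M_B, 1 - alpha))

# Hypothetical usage with cyclic subsets (p = 20, q = 3 coprime).
rng = np.random.default_rng(1)
p, q = 20, 3
subsets = [[(l + k) % p for k in range(q)] for l in range(p)]
X = rng.normal(size=(250, p)) / np.sqrt(p)
cv = bootstrap_critical_value(X, subsets, alpha=0.05, B=200, rng=rng)
```

The test rejects H₀ at level α when the observed M exceeds this bootstrap critical value; per Theorem 2, the resulting test has exact asymptotic size α under the stated conditions.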
The methodology is then applied to backtesting Value‑at‑Risk (VaR) models. VaR exceedance indicators are binary variables with a very small success probability; as the probability shrinks with the confidence level, they fit the shrinking‑variable framework. By collecting exceedances across many assets (or many climate stations) one obtains a high‑dimensional binary matrix. The subsets can be defined by asset classes, geographic regions, or any grouping that captures dependence. The proposed test can therefore assess (i) whether a single VaR model is correctly calibrated (validation backtest) and (ii) whether two competing models differ significantly (comparative backtest).
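Constructing the high-dimensional binary input for the backtest is straightforward; the following helper is hypothetical and only illustrates the exceedance-indicator idea described above.

```python
import numpy as np

def exceedance_matrix(losses, var_forecasts, level=0.01):
    """Centered VaR exceedance indicators (hypothetical helper).
    losses, var_forecasts: n x p arrays of realized losses and
    VaR forecasts for n days and p assets.
    Returns X[i, j] = 1{loss > VaR} - level, so that a correctly
    calibrated model has mean zero in every column; the shrinking
    success probability 'level' puts this in the paper's framework."""
    return (losses > var_forecasts).astype(float) - level

# Toy example: 2 days, 2 assets, nominal exceedance level 1%.
losses = np.array([[1.0, 0.5],
                   [2.0, 0.1]])
var_fc = np.array([[0.9, 0.6],
                   [1.5, 0.2]])
X = exceedance_matrix(losses, var_fc, level=0.01)
```

The resulting matrix X is exactly the kind of input the subsets-based test expects: columns can be grouped into subsets by asset class or region, and rejecting H₀ flags miscalibration of the VaR model in at least one group.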
Simulation studies compare the new test with classical VaR backtests (Kupiec, Christoffersen) and recent multivariate approaches. When p ranges from 100 to 500, n≈250, and the true exceedance probability is 1 %, the naïve test and traditional methods have negligible power, whereas the subsets‑based test achieves power above 80 % while maintaining the nominal size.
A real‑data example uses daily negative log‑returns of S&P 500 constituents and evaluates several VaR forecasting models (GARCH, ES‑based, machine‑learning). Subsets are formed by industry sectors. The bootstrap‑based critical values are computed, and the test identifies that certain machine‑learning models produce significantly fewer exceedances than the GARCH benchmark, a result that standard backtests would miss due to low event counts.
In summary, the paper makes three major contributions: (1) it introduces a novel high‑dimensional mean‑testing framework that remains valid when component variables shrink to zero and marginal CLTs fail; (2) it provides a practical implementation via overlapping subsets and multiplier bootstrap, together with rigorous asymptotic theory for size and power; (3) it demonstrates the relevance of the method for VaR backtesting, offering a more powerful tool for risk‑model validation and comparison in finance and other fields where extreme‑event data are scarce. The work bridges a gap between high‑dimensional statistical theory and practical risk‑management applications.