The S-Estimator in Change-Point Random Model with Long Memory
The paper considers two-phase random design linear regression models. The errors and the regressors are stationary long-range dependent Gaussian. The regression parameters, the scale parameters and the change-point are estimated using a method introduced by Rousseeuw and Yohai(1984). This is called S-estimator and it has the property that is more robust than the classical estimators; the outliers don’t spoil the estimation results. Some asymptotic results, including the strong consistency and the convergence rate of the S-estimators, are proved.
💡 Research Summary
The paper addresses the problem of estimating parameters in a two‑phase linear regression model where both the error terms and the regressors are stationary Gaussian processes exhibiting long‑range dependence (LRD). Formally, the model is
yₜ = xₜ′β₁ + εₜ for t ≤ τ,
yₜ = xₜ′β₂ + εₜ for t > τ,
with τ denoting an unknown change‑point, β₁ and β₂ the regression coefficient vectors before and after the change, and εₜ a zero‑mean Gaussian LRD process. The design vectors xₜ are also Gaussian LRD with the same memory parameter d∈(0,½). Such a setting captures many real‑world time‑series—financial returns, network traffic, climatological records—where observations are strongly correlated over long horizons and may contain outliers.
Classical estimators (ordinary least squares, M‑estimators) are known to be highly sensitive to both LRD and contamination. To overcome this, the authors adopt the S‑estimator introduced by Rousseeuw and Yohai (1984). The S‑estimation proceeds in two stages. First, a robust scale σ̂ is obtained by solving an M‑scale equation that uses a bounded ρ‑function (e.g., Tukey’s biweight). This step down‑weights large residuals, thereby protecting the scale estimate from outliers. Second, with σ̂ fixed, the regression coefficients and the change‑point are estimated by minimizing the sum of ρ‑transformed standardized residuals:
(β̂, τ̂) = arg min_{β,τ} ∑_{i=1}^{n} ρ!\big((y_i − x_i′β)/σ̂\big).
The resulting estimator enjoys a breakdown point of up to 50 %, far exceeding that of ordinary M‑estimators.
The theoretical contribution of the paper is twofold. First, the authors prove strong consistency of the S‑estimates: as the sample size n → ∞, β̂ → β₀, τ̂ → τ₀, and σ̂ → σ₀ almost surely, despite the presence of LRD. The proof relies on uniform convergence arguments adapted to the slowly decaying autocovariance structure. Second, they derive convergence rates that explicitly involve the memory parameter d. For LRD processes, the usual √n rate is no longer attainable; instead, after normalizing by n^{1‑2d}, the estimators converge to non‑Gaussian limiting distributions. The scale estimator retains a √n‑type rate after the same normalization, while the regression and change‑point estimators exhibit slower, d‑dependent rates. These results extend the classical asymptotic theory of S‑estimators to the long‑memory context.
A comprehensive Monte‑Carlo study validates the theoretical findings. Simulations are conducted for d = 0.1, 0.2, 0.3 and for contamination levels of 0 %, 5 %, and 10 % (gross outliers injected into the response). The S‑estimator consistently recovers β₁, β₂, and τ with low bias and variance, even when 10 % of the observations are outliers. By contrast, ordinary least squares exhibits substantial bias and inflated mean‑squared error under the same contamination. The change‑point estimate’s mean absolute error grows with d, reflecting the intrinsic difficulty of locating a break in highly persistent series, yet the S‑estimator still outperforms the least‑squares benchmark by roughly 30 % in all scenarios.
The authors also discuss practical implementation issues. They recommend a Tukey biweight ρ‑function with tuning constant c ≈ 4.685 to achieve 95 % efficiency under Gaussian errors, and they outline an iterative re‑weighting algorithm that alternates between updating σ̂, β̂, and τ̂ until convergence. Computational complexity is comparable to standard robust regression, making the method feasible for moderately large datasets.
In conclusion, the paper provides a rigorous and robust methodology for change‑point detection and parameter estimation in linear models with long‑range dependent covariates and errors. By extending the S‑estimation framework to this challenging setting, it offers both strong theoretical guarantees (almost sure consistency, explicit convergence rates) and empirical evidence of superior performance over classical estimators, especially in the presence of outliers. The results are directly applicable to fields such as econometrics, environmental statistics, and network traffic analysis, where long memory and data contamination are common.
Comments & Academic Discussion
Loading comments...
Leave a Comment