Data-driven efficient score tests for deconvolution problems
We consider testing statistical hypotheses about densities of signals in deconvolution models. A new approach to this problem is proposed. We construct score tests for deconvolution with a known noise density and efficient score tests for the case of an unknown noise density. The tests are combined with model selection rules that choose a reasonable model dimension automatically from the data. Consistency of the tests is proved.
💡 Research Summary
The paper addresses hypothesis testing for the density of a signal in deconvolution models, where the observed data are the sum of an unknown signal X and additive noise ε. Traditional approaches either assume the noise density is known and rely on inefficient estimators, or cannot handle an unknown noise density at all. The authors propose a unified, data‑driven framework that constructs score‑type test statistics for both scenarios and incorporates an automatic model‑dimension selection rule.
In the first setting, where the noise density f_ε is known, the authors derive the efficient score function for the signal‑density parameter θ based on the log‑likelihood of the observed data. The test statistic S_n = n^{-1/2} Σ_{i=1}^n s_θ(Y_i) is shown to be asymptotically normal with variance equal to the Fisher information I(θ_0). To avoid the need for a pre‑specified dimensionality of the parametric family, they introduce a penalized criterion C_n(d) = |S_n(d)| – λ_n d, where d denotes a candidate model dimension and λ_n is a decreasing penalty term. The dimension d̂ that maximizes C_n(d) is selected automatically; the authors prove that this rule controls both under‑ and over‑fitting, leading to a consistent test.
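The selection mechanism can be illustrated with a small sketch. Here the score functions are stood in by normalized Legendre polynomials on [0, 1] (a common Neyman‑type choice; the paper's efficient scores instead depend on the noise density), and the statistic for dimension d is taken as the squared norm of the first d normalized score sums. The penalty form C_n(d) = T_n(d) − λ_n·d follows the criterion above; all function names and the basis are illustrative assumptions, not the authors' code.

```python
import numpy as np

def legendre_scores(y, d):
    """First d normalized Legendre polynomials on [0, 1] evaluated at y.
    (Stand-in score functions; the paper's efficient scores depend on f_eps.)"""
    vals = np.polynomial.legendre.legvander(2.0 * y - 1.0, d)[:, 1:]  # P_1..P_d
    norms = np.sqrt(2.0 * np.arange(1, d + 1) + 1.0)  # L2([0,1]) normalization
    return vals * norms

def neyman_statistic(y, d):
    """Neyman-type statistic: squared norm of n^{-1/2} sum_i s_j(Y_i), j = 1..d."""
    n = len(y)
    s = legendre_scores(y, d).sum(axis=0) / np.sqrt(n)
    return float(s @ s)

def select_dimension(y, d_max, lam):
    """Data-driven dimension: maximize C_n(d) = T_n(d) - lam * d over d = 1..d_max."""
    crit = [neyman_statistic(y, d) - lam * d for d in range(1, d_max + 1)]
    return int(np.argmax(crit)) + 1
```

Under the null (here, uniform data), the penalty keeps the selected dimension small; under alternatives the statistic grows fast enough in some coordinate to overcome the penalty, which is the intuition behind the consistency proof.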
The second, more challenging, setting assumes that the noise density is unknown. Here the authors treat the joint parameter (θ, η), where η characterizes the noise distribution. They construct an “efficient” score by embedding the problem in an EM‑type iterative scheme: at each iteration t, given current estimates (θ^{(t)}, η^{(t)}), the conditional expectation of the complete‑data score is computed and used to update the test statistic. By invoking Le Cam’s local asymptotic normality framework, they demonstrate that the resulting statistic attains the semiparametric efficiency bound, i.e., it uses the maximal possible Fisher information despite the nuisance component η being infinite‑dimensional.
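The conditional expectation at the heart of the iterative scheme can be sketched numerically. For the convolution model Y = X + ε, the posterior density of X given Y = y is proportional to f_X(x)·f_ε(y − x), and the complete‑data score is averaged against it. The grid integration and the Gaussian example below are illustrative assumptions; the paper's scheme additionally updates the nuisance component η at each iteration, which is omitted here.

```python
import numpy as np

def gauss_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, used here as an example signal and noise density."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def e_step_score(y, s_theta, f_x, f_eps, grid):
    """E[s_theta(X) | Y = y] for Y = X + eps, by Riemann integration on a grid.
    Posterior of X given Y = y is proportional to f_X(x) * f_eps(y - x)."""
    dx = grid[1] - grid[0]
    w = f_x(grid) * f_eps(y - grid)  # unnormalized posterior weights
    w /= w.sum() * dx                # normalize to a density on the grid
    return (s_theta(grid) * w).sum() * dx

def score_statistic(ys, s_theta, f_x, f_eps, grid):
    """S_n = n^{-1/2} sum_i E[s_theta(X) | Y_i], the conditional-score statistic."""
    vals = np.array([e_step_score(y, s_theta, f_x, f_eps, grid) for y in ys])
    return vals.sum() / np.sqrt(len(ys))
```

As a sanity check, with standard normal signal and noise and the location score s_θ(x) = x, the conditional expectation E[X | Y = y] equals y/2, which the grid computation reproduces.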
Theoretical contributions include: (1) derivation of the asymptotic distribution of the proposed score tests under both null and local alternatives; (2) proof of asymptotic optimality (efficiency) in the sense of achieving the Cramér‑Rao lower bound for the signal parameter; (3) establishment of consistency of the model‑selection rule, guaranteeing that the selected dimension converges to the true one as the sample size grows.
Extensive Monte‑Carlo simulations validate the methodology. The authors consider several noise families (Gaussian, Laplace, and mixtures) and signal shapes (single‑peak and multi‑peak densities). In all configurations, the proposed tests maintain the nominal significance level and exhibit higher power than existing non‑efficient alternatives. Notably, when the noise density is unknown, the efficient score test outperforms naive plug‑in approaches by a substantial margin, confirming the advantage of jointly estimating the nuisance component within the testing procedure.
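A minimal data generator for designs of this kind might look as follows. The specific shapes and scale parameters are illustrative placeholders, not the paper's exact simulation settings.

```python
import numpy as np

def sample_deconvolution(n, signal="single", noise="gaussian", seed=0):
    """Draw Y_i = X_i + eps_i for single- or multi-peak signals with
    Gaussian or Laplace noise (parameters chosen for illustration only)."""
    rng = np.random.default_rng(seed)
    if signal == "single":
        x = rng.normal(0.0, 1.0, n)              # single-peak signal density
    else:
        x = np.where(rng.random(n) < 0.5,        # two-component mixture (multi-peak)
                     rng.normal(-2.0, 0.5, n),
                     rng.normal(2.0, 0.5, n))
    if noise == "gaussian":
        eps = rng.normal(0.0, 0.5, n)
    else:
        eps = rng.laplace(0.0, 0.5, n)
    return x + eps
```

Running the tests on many such replicated samples, under both the null signal density and local alternatives, is how empirical level and power figures like those reported would be obtained.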
In summary, the paper delivers a comprehensive, theoretically sound, and practically implementable solution for hypothesis testing in deconvolution problems. By marrying efficient score construction with data‑driven model selection, it overcomes the limitations of prior work and opens the door to robust inference in a wide range of applied fields where measurement error or convolutional noise is unavoidable.