Testing Sign Congruence Between Two Parameters

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We test the null hypothesis that two parameters $(\mu_1,\mu_2)$ have the same sign, assuming that (asymptotically) normal estimators $(\hat{\mu}_1,\hat{\mu}_2)$ are available. Examples of this problem include the analysis of heterogeneous treatment effects, causal interpretation of reduced-form estimands, meta-studies, and mediation analysis. A number of tests were recently proposed. We recommend a test that is simple and rejects more often than many of these recent proposals. Like all other tests in the literature, it is conservative if the truth is near $(0,0)$ and therefore also biased. To clarify whether these features are avoidable, we also provide a test that is unbiased and has exact size control on the boundary of the null hypothesis, but which has counterintuitive properties and hence we do not recommend. We use the test to improve p-values in Kowalski (2022) from information contained in that paper’s main text and to establish statistical significance of some key estimates in Dippel et al. (2021).

💡 Research Summary

The paper addresses the problem of testing whether two parameters, μ₁ and μ₂, share the same sign, i.e., testing the null hypothesis H₀: μ₁·μ₂ ≥ 0. This question arises in many empirical contexts, such as heterogeneous treatment‑effect analysis, causal interpretation of reduced‑form estimands, meta‑analysis, and mediation analysis. The authors assume that asymptotically normal estimators (μ̂₁, μ̂₂) are available, with known or consistently estimable variances σ₁², σ₂² and correlation ρ.

Existing proposals, including the Bonferroni‑adjusted union‑intersection test of Brinch, Mogstad, and Wiswall (BMW), a heuristic non‑parametric bootstrap test used by Kowalski (2022), and more elaborate pre‑testing procedures, are either conservative near the origin (μ₁, μ₂) = (0, 0) or lack uniform size control. The authors show that these methods can have arbitrarily low power when the true parameters are close to the boundary of the null.

The central contribution is a rediscovery of a simple test originally presented by Russek‑Cohen and Simon (1993). The test rejects H₀ if two conditions hold:

1. The estimated signs disagree, i.e., μ̂₁·μ̂₂ < 0.
2. Both standardized absolute estimates reach a critical value cα: min{|μ̂₁|/σ₁, |μ̂₂|/σ₂} ≥ cα.

When the correlation ρ is non‑negative, cα equals the (1 − α) quantile of the standard normal distribution, Φ⁻¹(1 − α) (≈ 1.645 for α = 0.05, ≈ 2.326 for α = 0.01). For negative ρ, cα is defined implicitly by a supremum condition involving a bivariate normal with correlation ρ; it can be computed numerically and approaches the Bonferroni‑adjusted threshold Φ⁻¹(1 − α/2) as ρ → −1.
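The rejection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the handling of negative ρ is an assumption of the sketch. Instead of the exact numerically computed cα, it falls back to the Bonferroni‑type quantile Φ⁻¹(1 − α/2), which by a union‑bound argument keeps size at most α but is conservative for ρ > −1.

```python
from statistics import NormalDist


def critical_value(alpha: float, rho: float) -> float:
    """Critical value for the standardized estimates.

    rho >= 0: the one-sided normal quantile, as stated in the text.
    rho <  0: ASSUMPTION of this sketch. The exact value requires a
              numerical routine not reproduced here, so we use the
              conservative Bonferroni-type quantile instead.
    """
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie strictly between 0 and 0.5")
    level = alpha if rho >= 0 else alpha / 2
    return NormalDist().inv_cdf(1 - level)


def rejects_same_sign(mu1_hat, mu2_hat, sigma1, sigma2, rho=0.0, alpha=0.05):
    """Return True if H0 (mu1 and mu2 share a sign) is rejected."""
    t1 = mu1_hat / sigma1  # standardized estimate of mu1
    t2 = mu2_hat / sigma2  # standardized estimate of mu2
    signs_disagree = t1 * t2 < 0
    return signs_disagree and min(abs(t1), abs(t2)) >= critical_value(alpha, rho)


# Illustrative inputs (made up): opposite signs, both clearly away from zero.
print(rejects_same_sign(2.0, -3.0, 1.0, 1.0))  # rejects
print(rejects_same_sign(1.0, -3.0, 1.0, 1.0))  # does not reject: min |t| is 1.0
```

Note that the smaller of the two standardized estimates drives the decision: a very large |μ̂₂|/σ₂ cannot compensate for a weak |μ̂₁|/σ₁.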

The authors prove (Theorem 2.1) that this test is “non‑conservatively valid”: it controls size uniformly over the entire null region and attains exact size on the boundary where μ₁·μ₂ = 0. The proof uses Anderson’s Lemma to show that the rejection probability is maximized on the boundary, and for ρ ≥ 0 the argument simplifies to a one‑sided Wald test on the parameter with the smaller signal.
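A small Monte Carlo check can make the size behavior concrete. The sketch below assumes unit variances and ρ = 0 (a simplification chosen for illustration, not the paper's setup) and compares a boundary point far from the origin with the origin itself. With independent estimators the rejection probability at (0, 0) is 2α² analytically, i.e. 0.005 for α = 0.05, while far along the boundary it approaches α.

```python
import random
from statistics import NormalDist


def rejection_rate(mu1, mu2, alpha=0.05, draws=200_000, seed=0):
    """Simulated rejection frequency with unit variances and rho = 0."""
    rng = random.Random(seed)
    c = NormalDist().inv_cdf(1 - alpha)  # critical value for rho >= 0
    rejections = 0
    for _ in range(draws):
        t1 = rng.gauss(mu1, 1.0)  # standardized estimate of mu1
        t2 = rng.gauss(mu2, 1.0)  # standardized estimate of mu2
        if t1 * t2 < 0 and min(abs(t1), abs(t2)) >= c:
            rejections += 1
    return rejections / draws


boundary = rejection_rate(0.0, 8.0)  # mu1 = 0, mu2 far from zero
origin = rejection_rate(0.0, 0.0)    # both parameters zero

print(f"boundary point (0, 8): {boundary:.4f}")  # close to alpha
print(f"origin (0, 0):         {origin:.4f}")    # close to 2 * alpha**2
```

The gap between the two frequencies is the conservativeness near the origin that the abstract mentions.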

To make the procedure feasible in practice, the authors replace the known σ’s and ρ with consistent estimators (σ̂₁, σ̂₂, ρ̂) and scale the test statistic by √n, yielding the feasible test (Definition 2.2). Under a uniform central limit theorem and uniform consistency of the variance‑covariance estimator (Assumption 2), the feasible test inherits the same non‑conservative size control asymptotically.
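As an illustration of the plug‑in idea, the sketch below treats the simplest case where μ₁ and μ₂ are means of paired observations, so that σ̂₁, σ̂₂ and ρ̂ are the sample standard deviations and sample correlation. The function name and the data are hypothetical, and the use of Φ⁻¹(1 − α/2) when ρ̂ < 0 is a conservative stand‑in assumed here in place of the exact critical value.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def feasible_test(x, y, alpha=0.05):
    """Plug-in test of H0: E[x] and E[y] share a sign (paired samples)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired samples with at least 3 observations")
    m1, m2 = mean(x), mean(y)
    s1, s2 = stdev(x), stdev(y)  # estimates of sigma_1 and sigma_2
    cov = sum((a - m1) * (b - m2) for a, b in zip(x, y)) / (n - 1)
    rho_hat = cov / (s1 * s2)
    # Scale by sqrt(n) so the statistics are approximately standard normal
    # when the corresponding mean is zero.
    t1 = sqrt(n) * m1 / s1
    t2 = sqrt(n) * m2 / s2
    # ASSUMPTION: conservative Bonferroni-type fallback when rho_hat < 0.
    level = alpha if rho_hat >= 0 else alpha / 2
    c = NormalDist().inv_cdf(1 - level)
    reject = t1 * t2 < 0 and min(abs(t1), abs(t2)) >= c
    return {"t1": t1, "t2": t2, "rho_hat": rho_hat, "critical": c, "reject": reject}


# Made-up paired data: x centers on +1, y centers on -2.
x = [1.0, 1.2, 0.8, 1.1, 0.9, 1.3, 0.7, 1.0]
y = [-2.0, -1.5, -2.5, -1.8, -2.2, -1.4, -2.6, -2.0]
result = feasible_test(x, y)
print(result["reject"], round(result["rho_hat"], 3))
```

For estimators other than sample means, the same structure applies with μ̂ⱼ/se(μ̂ⱼ) in place of √n·m/s, since the standard error already absorbs the √n scaling.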

The paper also presents an alternative unbiased test that achieves exact size on the null boundary but exhibits counter‑intuitive behavior (essentially rejecting for sample realizations arbitrarily close to the null), so the authors advise against its use.

Two empirical applications illustrate the practical relevance. First, the authors re‑examine Kowalski (2022), showing that the heuristic bootstrap test is valid only when the estimators are asymptotically uncorrelated; otherwise it can have arbitrary size. Their simple sign‑congruence test is easier to implement and uniformly more powerful. Second, they apply the test to Dippel et al. (2021), improving the reported p‑values and establishing statistical significance for several key estimates that were previously marginal.
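For non‑negative correlation, one way to obtain a p‑value consistent with the rejection rule is to invert it: the test rejects at level α exactly when the signs disagree and min |t| ≥ Φ⁻¹(1 − α), so the smallest such α is 1 − Φ(min |t|). The sketch below implements that inversion under the assumption ρ ≥ 0. It is not a reproduction of the calculations for Kowalski (2022) or Dippel et al. (2021), and the t‑statistics used are made up.

```python
from statistics import NormalDist


def sign_congruence_pvalue(t1: float, t2: float) -> float:
    """Smallest alpha at which the test rejects, assuming rho >= 0.

    t1 and t2 are the standardized estimates (estimate / standard error).
    """
    if t1 * t2 >= 0:
        return 1.0  # signs agree: the test never rejects
    return 1.0 - NormalDist().cdf(min(abs(t1), abs(t2)))


# Hypothetical t-statistics with opposite signs.
p = sign_congruence_pvalue(2.5, -1.8)
print(round(p, 4))
```

The p‑value equals the one‑sided p‑value of the weaker of the two estimates, which is why it can be smaller than a Bonferroni‑adjusted alternative.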

In summary, the paper offers a theoretically sound, computationally trivial test for sign‑congruence of two parameters that rejects more often than many recent proposals. It controls size uniformly, attains its nominal size on the boundary of the null (while remaining conservative near (0, 0), like the other tests in the literature), and requires only standard normal critical values (or a simple numerical routine for negative correlation). This makes it a practical tool for applied researchers dealing with heterogeneous effects, mediation pathways, or meta‑analytic sign consistency.
