A robust regression approach to synthetic control with interference
Synthetic control methods are widely used for policy evaluation, but most existing approaches rule out interference among units, compromising validity when such effects are present. We develop a framework that accommodates contaminated donor pools and unknown interference patterns through two stages: factor-model adjustment for unobserved confounding, followed by robust regression in which direct and interference effects appear as a sparse outlier component. We study two asymptotic regimes. When the number of units is fixed and at least half are unaffected by interference, high-breakdown robust regression yields consistent identification of valid controls and asymptotically normal inference. When the number of units diverges, we allow for sparse large and dense weak interference, with robust M-estimation remaining valid even when the post-intervention period is short. Unlike existing approaches requiring prespecification of valid controls or parametric modeling of interference, our framework relies only on coarse sparsity information and enables formal inference on both direct and interference effects. We assess the proposed methods through simulations and two empirical applications. An analysis of the US embassy relocation to Jerusalem reveals significant interference effects on conflict outcomes in Jordan, and an analysis of Beijing’s air pollution policy uncovers spatial interference patterns consistent with prevailing wind directions.
💡 Research Summary
The paper tackles a fundamental limitation of the synthetic‑control (SC) methodology: the assumption that the treatment applied to the focal unit does not affect any of the donor‑pool units (the “no‑interference” or SUTVA assumption). In many comparative‑case studies this assumption is implausible because policies often spill over across geographic or economic networks. Existing work either discards potentially contaminated donor units—sacrificing efficiency and requiring strong domain knowledge—or imposes parametric structures on the spill‑over (e.g., linear spatial autoregressions). Both approaches are fragile when the interference pattern is unknown.
The authors propose a two‑stage framework that simultaneously handles contaminated donor pools and unknown interference patterns without requiring explicit pre‑selection of valid controls.
Stage 1 – Factor‑model adjustment.
They adopt the standard latent‑factor representation of untreated potential outcomes:
(Y_{it}(0)=\lambda_i^{\top}f_t+\varepsilon_{it}),
where (f_t) are common time‑varying factors, (\lambda_i) are unit‑specific loadings, and (\varepsilon_{it}) are mean‑zero errors. Using the long pre‑intervention panel, they estimate (\Lambda=(\lambda_1,\dots,\lambda_N)^{\top}) and the factor series ({f_t}) via classical factor analysis (e.g., PCA or maximum‑likelihood). Crucially, they allow the factor mean to shift between pre‑ and post‑intervention periods ((\alpha_0) vs. (\alpha_1)), thereby capturing any systematic change in the latent environment induced by the policy.
Stage 2 – Robust regression for direct and spill‑over effects.
Define the post‑pre mean difference for each unit as
(D_i=\bar Y_{i}^{\text{post}}-\bar Y_{i}^{\text{pre}}).
Under the factor model and the mean‑shift, this can be written as
(D = \Lambda\alpha + \bar\beta + \tilde\varepsilon),
where (\alpha) is the vector of factor‑mean shifts, (\bar\beta) collects the average direct effect (for the treated unit) and average interference effects (for each control), and (\tilde\varepsilon) vanishes as the pre‑ and post‑period lengths grow. The key insight is to view (\bar\beta) as a sparse outlier component: most control units are assumed to experience no interference, i.e., (\bar\beta_i=0) for a strict majority. This “majority‑valid‑controls” assumption (|C| ≥ ⌊N/2⌋ + r) is analogous to the majority rule in invalid‑instrument literature.
With this formulation, the authors apply a high‑breakdown robust regression (e.g., Least Absolute Deviations or smoothed‑Huber loss) of (D) on (\Lambda). Because the breakdown point of such estimators is 0.5, the regression automatically locks onto the hyperplane defined by the uncontaminated units (the majority) while treating the interfered units as outliers. The resulting estimates (\hat\alpha) and the residual outlier vector (\hat{\bar\beta}) provide, respectively, a bias correction for the latent factor shift and consistent estimates of the average direct and spill‑over effects.
Asymptotic regimes.
-
Fixed N, long panels ((T_0,T-T_0\to\infty)). The averaging eliminates (\tilde\varepsilon); robust regression yields (\sqrt{T-T_0})‑consistent, asymptotically normal estimators for (\alpha) and (\bar\beta). The majority‑valid‑controls condition guarantees identification even when up to 50 % of donor units are contaminated.
-
Growing N, possibly short post‑period ((N\to\infty,,T_0\to\infty)). The authors allow two interference structures: (a) a sparse set of units with large (\bar\beta_i) (“large sparse interference”) and (b) a dense set with small (\bar\beta_i) (“weak dense interference”). By employing M‑estimation with a smooth Huber loss, the cross‑sectional averaging across many units drives the influence of both types of interference to zero at rate (1/\sqrt{N}). Consequently, consistent and asymptotically normal estimates are obtained even when the post‑intervention window is fixed. For very short post‑periods, they propose a conformal permutation test that retains finite‑sample validity.
Simulation evidence.
Monte‑Carlo experiments vary N, T, the proportion of interfered units, and the magnitude of interference. The robust‑regression‑based estimator dramatically outperforms standard SC (which ignores interference) in terms of mean‑squared error and coverage probability. Performance is robust to the choice of the Huber tuning parameter, confirming practical stability.
Empirical applications.
1. US Embassy relocation to Jerusalem.
The treated unit is Israel‑Palestine; the donor pool includes neighboring Middle‑East countries. Traditional SC would treat Jordan as a valid control, but the policy likely altered regional conflict dynamics. The proposed method flags Jordan as an outlier and estimates a statistically significant negative average interference effect (≈ ‑0.42 on Jordan’s conflict count), indicating that the relocation reduced conflict spill‑overs into Jordan.
2. Beijing air‑pollution control.
The policy targeted core urban districts; the authors examine surrounding suburban districts. Without pre‑specifying a spatial weight matrix, the robust regression uncovers interference patterns aligned with prevailing wind directions: districts downwind experience larger reductions, upwind districts show smaller or even adverse effects. This validates the method’s ability to recover meaningful spatial spill‑overs purely from the data.
Limitations and future work.
- The number of latent factors r must be chosen a priori; misspecifying r could affect the first‑stage factor estimates and propagate error to the second stage.
- The framework assumes linear additive factor structure; non‑linear or time‑varying interference mechanisms are not addressed.
- Extension to multiple treated units (J > 1) and to settings with dynamic treatment timing remains an open avenue.
Overall contribution.
The paper introduces a novel “interference‑robust synthetic control” methodology that blends latent‑factor de‑confounding with high‑breakdown robust regression. By treating interference as a sparse outlier problem, it eliminates the need for ad‑hoc donor‑pool cleaning or parametric spill‑over modeling, while providing rigorous asymptotic guarantees under both fixed‑N and large‑N regimes. The approach broadens the applicability of synthetic‑control techniques to real‑world policy evaluations where spill‑overs are the rule rather than the exception.
Comments & Academic Discussion
Loading comments...
Leave a Comment