Nonparametric Inference with an Instrumental Variable under a Separable Binary Treatment Choice Model
Instrumental variable (IV) methods are widely used to infer treatment effects in the presence of unmeasured confounding. In this paper, we study nonparametric inference with an IV under a separable binary treatment choice model, which posits that the odds of the probability of taking the treatment, conditional on the instrument and the treatment-free potential outcome, factor into separable components for each variable. While nonparametric identification of smooth functionals of the treatment-free potential outcome among the treated, such as the average treatment effect on the treated, has been established under this model, corresponding nonparametric efficient estimation has proven elusive due to variationally dependent nuisance parameters defined in terms of counterfactual quantities. To address this challenge, we introduce a new variationally independent parameterization based on nuisance functions defined directly from the observed data. This parameterization, coupled with a novel fixed-point argument, enables the use of modern machine learning methods for nuisance function estimation. We characterize the semiparametric efficiency bound for any smooth functional of the treatment-free potential outcome among the treated and construct a corresponding semiparametric efficient estimator without imposing any unnecessary restriction on nuisance functions. Furthermore, we describe a straightforward generative model justifying our identifying assumptions and characterize empirically falsifiable implications of the framework to evaluate our assumptions in practical settings. Our approach seamlessly extends to nonlinear treatment effects, population-level effects, and nonignorable missing data settings. We illustrate our methods through simulation studies and an application to the Job Corps study.
💡 Research Summary
This paper tackles non-parametric inference with an instrumental variable (IV) when the treatment assignment follows a logit-separable binary choice model. Under four IV assumptions (exclusion restriction, independence, relevance, and a logit-separable treatment mechanism, labeled IV4), the distribution of the untreated potential outcome Y(0) among the treated can be identified, as shown in earlier work by Liu (2020) and Sun (2018). However, existing estimators suffer from a crucial limitation: the nuisance functions (e.g., odds of treatment given Y(0), instrument, and covariates) are variationally dependent, meaning that the values one function can take are constrained by the others, and they are defined in terms of counterfactual quantities that are not directly observable. This dependence makes it difficult to apply modern machine-learning tools and to obtain semiparametric efficiency.
The authors introduce a completely new parameterization that is variationally independent and relies solely on observable quantities. They define four nuisance functions—π, α, β, and μ—based on conditional densities of the observed data (Y, A, Z, X). π is the odds of treatment given Y(0), Z, and X; α is the odds‑ratio function that captures the degree of hidden confounding; β is a baseline odds term; and μ is the conditional mean of Y(0) among the treated. Crucially, these functions satisfy the simple product relationship π = α·β, and under IV4 the odds‑ratio α does not depend on the instrument Z. By exploiting this structure, the authors derive a fixed‑point equation for α that can be solved iteratively; they prove that the equation has a unique solution and that the iterative algorithm converges from any starting point.
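To see why the product structure π = α·β is useful, note that by Bayes' rule it implies the distribution of Y(0) among the treated is an α-weighted ("tilted") version of the outcome distribution among the untreated, within each instrument level. The sketch below checks this identity exactly on a small discrete example. All numeric values, the three-level outcome, and the variable names are hypothetical and chosen only for illustration; covariates X are omitted, and this is not the authors' estimation procedure.

```python
import numpy as np

# Hypothetical toy setup (not from the paper): Y(0) has 3 levels, Z is binary.
y_vals = np.array([0.0, 1.0, 2.0])
f_y0 = np.array([0.2, 0.5, 0.3])     # P(Y(0)=y); same for both z (IV independence)
alpha = np.array([1.0, 2.0, 0.5])    # odds-ratio component, depends on y only
beta = np.array([0.4, 1.5])          # baseline-odds component, depends on z only

# Separable odds of treatment: pi[y, z] = alpha[y] * beta[z]
pi = np.outer(alpha, beta)
p_treat = pi / (1.0 + pi)            # P(A=1 | Y(0)=y, Z=z)

# Joint probabilities of (Y(0)=y, A=a) given Z=z
joint_a1 = f_y0[:, None] * p_treat
joint_a0 = f_y0[:, None] * (1.0 - p_treat)

# Conditional distributions of Y(0) given (A, Z)
f_a1 = joint_a1 / joint_a1.sum(axis=0)   # counterfactual: Y(0) among the treated
f_a0 = joint_a0 / joint_a0.sum(axis=0)   # observable: Y = Y(0) among the untreated

# Tilting identity: f(y | A=1, z) = alpha(y) f(y | A=0, z) / E[alpha(Y) | A=0, z]
tilted = alpha[:, None] * f_a0
tilted = tilted / tilted.sum(axis=0)

# Mean of Y(0) among the treated, computed directly and via the tilt
mean_true = (y_vals[:, None] * f_a1).sum(axis=0)
mean_tilted = (y_vals[:, None] * tilted).sum(axis=0)

print(np.allclose(tilted, f_a1), np.allclose(mean_true, mean_tilted))
```

In this toy example α is known by construction. In practice α must itself be recovered from the observed data, which is where the fixed-point equation described above comes in.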
With this variationally independent representation, the paper derives the full set of influence functions for the average treatment effect on the treated (ATT). Solving a complex integral equation, the authors obtain a closed‑form expression for the efficient influence function (EIF). They show that this EIF is unique and therefore attains the semiparametric efficiency bound for the non‑parametric model that imposes no further restrictions on the nuisance functions. This result contrasts sharply with prior work that required parametric specifications to obtain an EIF.
Building on the EIF, the authors construct a one-step (or targeted maximum likelihood) estimator for the ATT. All nuisance functions are estimated non-parametrically using flexible machine-learning methods (e.g., random forests, neural networks). The estimator is shown to be √n-consistent, asymptotically normal, and semiparametrically efficient provided that at least some of the nuisance estimators converge faster than n^(-1/4), a rate that many modern learners can attain under suitable smoothness or sparsity conditions. The paper also compares these rate conditions with those required by earlier methods, highlighting the relaxed assumptions of the new approach.
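The role of the n^(-1/4) threshold can be seen from the generic structure of a one-step estimator. The display below is a standard sketch, not the paper's exact result: the symbols φ (the EIF), η = (η₁, η₂) (two groups of nuisance functions), and R_n (the remainder) are notation introduced here, and the product form of the remainder is an assumption typical of EIF-based estimators rather than something stated in the summary.

```latex
% One-step estimator: plug-in value plus the empirical mean of the estimated EIF
\hat\psi_{\mathrm{os}} \;=\; \psi(\hat\eta) \;+\; \frac{1}{n}\sum_{i=1}^{n} \varphi(O_i;\hat\eta)

% Generic expansion (empirical-process term controlled, e.g., by cross-fitting)
\sqrt{n}\,\bigl(\hat\psi_{\mathrm{os}} - \psi\bigr)
  \;=\; \frac{1}{\sqrt{n}}\sum_{i=1}^{n} \varphi(O_i;\eta) \;+\; \sqrt{n}\,R_n \;+\; o_P(1)

% Assumed second-order (product) remainder
|R_n| \;\lesssim\; \|\hat\eta_1-\eta_1\|\,\|\hat\eta_2-\eta_2\|
  \;=\; o_P\!\bigl(n^{-1/4}\bigr)\cdot o_P\!\bigl(n^{-1/4}\bigr)
  \;=\; o_P\!\bigl(n^{-1/2}\bigr)

% Consequence: asymptotic normality with variance equal to the efficiency bound
\sqrt{n}\,\bigl(\hat\psi_{\mathrm{os}} - \psi\bigr)
  \;\xrightarrow{d}\; N\!\bigl(0,\ \mathbb{E}[\varphi(O;\eta)^2]\bigr)
```

Under this sketch, each nuisance estimator only needs to converge faster than n^(-1/4) for the remainder to vanish at the √n scale, which is why slower-than-parametric machine-learning rates can suffice.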
Beyond the core ATT estimation, the authors present a generative model for the logit‑separable mechanism grounded in discrete‑choice theory. In this model, an unobserved shock to the treatment decision and an independent shock to the instrument combine additively on the log‑odds scale, giving a clear economic interpretation of the separability assumption. They further derive falsifiable implications of the IV assumptions, providing testable restrictions that can be checked with observed data, thereby offering a practical diagnostic for model misspecification.
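One standard way to generate a logit-separable mechanism is a random-utility (threshold-crossing) model with a logistic shock. The simulation below is a minimal sketch under that assumption and is not necessarily the authors' generative model. The names `log_alpha` and `log_beta`, all numeric values, and the three-level outcome are hypothetical. Y(0) is visible for everyone here only because the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 600_000

# Hypothetical components on the log-odds scale (not taken from the paper)
log_alpha = np.array([0.0, 0.7, -0.5])   # depends on Y(0) only
log_beta = np.array([-1.0, 0.5])         # depends on Z only

z = rng.integers(0, 2, size=n)                    # binary instrument
y0 = rng.choice(3, size=n, p=[0.2, 0.5, 0.3])     # Y(0), drawn independently of Z
eps = rng.logistic(size=n)                        # standard logistic utility shock

# Take treatment when latent utility is positive.
# With a logistic shock, P(A=1 | y0, z) = expit(log_alpha[y0] + log_beta[z]),
# so the odds factor as alpha(y0) * beta(z).
a = (log_alpha[y0] + log_beta[z] + eps > 0).astype(int)

def log_odds(y, zval):
    """Empirical log-odds of treatment in the cell Y(0)=y, Z=zval."""
    cell = (y0 == y) & (z == zval)
    p = a[cell].mean()
    return np.log(p / (1.0 - p))

# Separability: the log odds ratio across Y(0) levels does not depend on Z
contrast_z0 = log_odds(1, 0) - log_odds(0, 0)
contrast_z1 = log_odds(1, 1) - log_odds(0, 1)
print(contrast_z0, contrast_z1)   # both should be near log_alpha[1] - log_alpha[0]
```

The two contrasts agree up to simulation noise because the instrument enters only through the additive `log_beta` term, which cancels when comparing outcome levels within a fixed instrument value.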
The methodological contributions are illustrated through extensive simulations and an empirical application to the Job Corps program, a U.S. job‑training initiative. In simulations, the proposed estimator exhibits lower bias and more accurate confidence intervals than existing methods across a range of data‑generating scenarios (linear, nonlinear, varying noise levels). In the Job Corps analysis, the IV is a randomly assigned training offer (Z), treatment is actual participation (A), and outcome is later earnings (Y). The new estimator yields a more precise ATT estimate, suggesting a modest but statistically significant earnings boost for participants, and the falsification tests support the plausibility of the separable treatment mechanism.
Finally, the paper discusses several extensions. The same framework can handle nonlinear treatment effects such as the quantile treatment effect on the treated (QTT), population‑average effects (ATE), and settings with nonignorable missing data. The authors also outline how the approach adapts to categorical or continuous instruments, where the model is no longer saturated and identification becomes partial; they provide the influence‑function set for a categorical IV in the supplementary material.
In summary, the article makes five major contributions: (1) a variationally independent, observable‑based parameterization; (2) a closed‑form efficient influence function and semiparametric efficiency bound for ATT; (3) a practical, machine‑learning‑compatible, semiparametrically efficient estimator; (4) a generative economic model with falsifiable implications; and (5) extensions to broader causal parameters and missing‑data problems. By resolving the long‑standing nuisance‑dependence issue, the work substantially advances the feasibility and reliability of IV analysis under logit‑separable binary treatment choice models.