Identification of Long-Term Treatment Effects via Temporal Links, Observational, and Experimental Data

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Recent literature proposes combining short-term experimental and long-term observational data to provide alternatives to conventional observational studies for the identification of long-term average treatment effects (LTEs). This paper re-examines the identification problem and uncovers that assumptions restricting temporal link functions – relationships between short-term and mean long-term potential outcomes – are central in this context. The experimental data serve to amplify the identifying power of such assumptions; absent them, the combined data are no more informative than the observational data alone. Plausible inference thus hinges on justifiable restrictions in this class. Motivated by this, I introduce two treatment response assumptions that may be defensible based on economic theory or intuition. To utilize them and facilitate future developments, I develop a novel unifying identification framework that computationally produces sharp bounds on the LTE for a general class of temporal link function restrictions and accommodates imperfect experimental compliance – thereby also extending existing approaches. I illustrate the method by estimating the long-term effects of Head Start participation. The findings indicate that the effects on educational attainment, employment, and criminal involvement are lasting but smaller in magnitude than those established by sibling comparisons.


Research Summary

The paper revisits the problem of identifying long‑term average treatment effects (LTEs) when researchers have access to short‑run experimental data and long‑run observational data. The author shows that the experimental data can only add identifying power if the analyst imposes substantive restrictions on the “temporal link functions” – the conditional means of long‑term potential outcomes given short‑term potential outcomes. Without such restrictions, the identified set for the LTE using both data sources is identical to that obtained from the observational data alone, rendering the experiment informationally irrelevant.
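The objects in this paragraph can be made concrete with a short sketch in potential-outcomes notation. The symbols Y(d), S(d), m_d and F_{S(d)} are illustrative assumptions chosen here and need not match the paper's own notation.

```latex
% Y(d): long-term potential outcome under treatment status d \in \{0,1\}
% S(d): short-term potential outcome under treatment status d
\begin{align*}
\text{LTE} &= E[Y(1)] - E[Y(0)], \\
m_d(s) &= E[\,Y(d) \mid S(d) = s\,] \quad \text{(temporal link function)}, \\
E[Y(d)] &= \int m_d(s)\, dF_{S(d)}(s).
\end{align*}
```

The last line is the law of iterated expectations. A randomized experiment with full compliance pins down F_{S(d)}, but without further assumptions neither data source reveals m_d, which is why restrictions on m_d carry the identifying power.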

Two economically plausible restrictions are introduced. The first is a monotonicity assumption: for each treatment status, the expected long‑term outcome is a non‑decreasing function of the short‑term outcome. This mirrors standard monotonicity arguments in production, human capital, and welfare economics. The second is a treatment‑invariance (or surrogacy) assumption: the temporal link function does not depend on treatment, i.e., the same short‑term outcome predicts the same long‑term mean regardless of whether the unit was treated. Both assumptions constrain the link functions while leaving the treatment‑selection mechanism unrestricted, thereby qualifying as “treatment‑response” assumptions.
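Stated as formulas, the two restrictions might look as follows. This is a sketch under assumed notation: m_d(s) = E[Y(d) | S(d) = s] denotes the temporal link function for treatment status d, and p_d(s) = P(S(d) = s) is the distribution of a discrete short-term potential outcome. Neither symbol is taken from the paper.

```latex
\begin{align*}
\text{Monotonicity:} \quad
  & s \le s' \;\Longrightarrow\; m_d(s) \le m_d(s'), \qquad d \in \{0,1\}, \\
\text{Treatment invariance:} \quad
  & m_1(s) = m_0(s) \equiv m(s) \quad \text{for all } s, \\
\text{which implies} \quad
  & E[Y(1)] - E[Y(0)] = \sum_s m(s)\,\bigl[p_1(s) - p_0(s)\bigr].
\end{align*}
```

Under treatment invariance the treatment changes the long-term mean only by shifting the distribution of the short-term outcome, so any remaining uncertainty about the LTE comes from uncertainty about the common link m.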

To operationalize these ideas, the author develops a unifying identification framework that casts the problem as a constrained optimization. The constraints encode the chosen class of temporal‑link restrictions together with the observed short‑run outcome distributions from both datasets. By extending the partial‑identification literature of Beresteanu‑Molchanov‑Molinari (2012) and Chesher‑Rosen (2017), the framework simultaneously bounds latent distributions and conditional means, delivering sharp lower and upper bounds on the LTE. Importantly, the method accommodates imperfect compliance in the experiment (i.e., treatment receipt D may differ from assignment Z), a feature often omitted in prior work.
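The constrained-optimization idea can be illustrated with a deliberately small linear program. The sketch below is not the paper's algorithm: it assumes a binary long-term outcome, a short-term outcome with three support points, perfect experimental compliance, and that both samples come from the same population. All numbers and names (`joint`, `ymean`, `p_exp`, `lte_bounds`) are hypothetical, and `scipy.optimize.linprog` is used only as a convenient solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy inputs (made up for illustration); S has support {0, 1, 2}.
# joint[d][s] = P(S = s, D = d) in the observational data
# ymean[d][s] = E[Y | S = s, D = d] in the observational data (Y binary)
# p_exp[d][s] = P(S(d) = s), taken from the experiment under full compliance
joint = {1: np.array([0.05, 0.15, 0.30]), 0: np.array([0.30, 0.15, 0.05])}
ymean = {1: np.array([0.30, 0.50, 0.80]), 0: np.array([0.20, 0.40, 0.70])}
p_exp = {1: np.array([0.30, 0.30, 0.40]), 0: np.array([0.40, 0.30, 0.30])}


def lte_bounds(joint, ymean, p_exp, monotone=False, invariant=False):
    """Bounds on E[Y(1)] - E[Y(0)] in the toy model via two linear programs."""
    K = len(p_exp[1])
    order = (1, 0)
    offset = {1: 0, 0: K}  # variable layout: [e_1(0..K-1), e_0(0..K-1)]
    # Unknowns e_d(s) = E[Y(d) | S(d) = s, D = 1 - d], each in [0, 1].
    # Link function: m_d(s) = base[d][s] + slope[d][s] * e_d(s).
    q = {d: p_exp[d] - joint[d] for d in order}  # P(S(d) = s, D = 1 - d)
    base = {d: joint[d] * ymean[d] / p_exp[d] for d in order}
    slope = {d: q[d] / p_exp[d] for d in order}

    A_ub, b_ub, A_eq, b_eq = [], [], [], []
    if monotone:  # m_d(s_k) <= m_d(s_{k+1}) for each d
        for d in order:
            for k in range(K - 1):
                row = np.zeros(2 * K)
                row[offset[d] + k] = slope[d][k]
                row[offset[d] + k + 1] = -slope[d][k + 1]
                A_ub.append(row)
                b_ub.append(base[d][k + 1] - base[d][k])
    if invariant:  # m_1(s) = m_0(s) for each s
        for k in range(K):
            row = np.zeros(2 * K)
            row[offset[1] + k] = slope[1][k]
            row[offset[0] + k] = -slope[0][k]
            A_eq.append(row)
            b_eq.append(base[0][k] - base[1][k])

    # LTE = const + sum_s q_1(s) e_1(s) - sum_s q_0(s) e_0(s)
    c = np.concatenate([q[1], -q[0]])
    const = (joint[1] * ymean[1]).sum() - (joint[0] * ymean[0]).sum()
    kw = dict(
        A_ub=np.array(A_ub) if A_ub else None,
        b_ub=np.array(b_ub) if b_ub else None,
        A_eq=np.array(A_eq) if A_eq else None,
        b_eq=np.array(b_eq) if b_eq else None,
        bounds=(0.0, 1.0),
        method="highs",
    )
    lo, hi = linprog(c, **kw), linprog(-c, **kw)  # minimize, then maximize
    if not (lo.success and hi.success):
        raise ValueError("restrictions are infeasible for these inputs")
    return const + lo.fun, const - hi.fun


if __name__ == "__main__":
    for label, opts in [
        ("no restriction", {}),
        ("monotonicity", {"monotone": True}),
        ("treatment invariance", {"invariant": True}),
    ]:
        low, high = lte_bounds(joint, ymean, p_exp, **opts)
        print(f"{label:>22}: [{low:.3f}, {high:.3f}]")
```

With these made-up inputs the unrestricted interval has width one and coincides with the worst-case bounds from the observational data alone, while each restriction narrows it. Handling imperfect compliance or continuous outcomes would require a richer program than this sketch.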

The empirical illustration combines the Head Start Impact Study (short‑run randomized trial) with the NLSY79 Child and Young Adult Supplement (long‑run panel). Applying the monotonicity and treatment‑invariance assumptions, the estimated bounds suggest that Head Start raises the probability of high‑school graduation by 1.9–3.2 percentage points, reduces the probability of grade repetition by 1.1–5.3 percentage points, lowers the probability of being idle (neither working nor in school) by 1.5–4.6 percentage points, and cuts criminal involvement by 1.2–4.0 percentage points. These effects are statistically and substantively meaningful but smaller than those reported in sibling‑comparison studies, illustrating that the proposed assumptions can yield informative bounds in practice.

Overall, the paper makes three key contributions: (1) it formally demonstrates that restrictions on temporal link functions are necessary for experimental data to improve identification of LTEs; (2) it proposes two intuitive, theory‑driven restrictions that are broadly applicable; and (3) it provides a flexible, optimization‑based identification toolkit that handles imperfect compliance and can be extended with additional assumptions. The framework thus offers a route to policy evaluation when long‑run outcomes are not observed in the randomized experiment.

