Double Machine Learning for Static Panel Data with Instrumental Variables: New Method and Applications

Panel data methods are widely used in empirical analysis to address unobserved heterogeneity, but causal inference remains challenging when treatments are endogenous and confounding variables high-dimensional and potentially nonlinear. Standard instr…

Authors: Anna Baiardi, Paul S. Clarke, Andrea A. Naghi

Anna Baiardi†, Paul Clarke‡, Andrea A. Naghi§, Annalivia Polselli‡*

† Erasmus School of Economics, Erasmus University and Tinbergen Institute, Rotterdam, Netherlands.
§ Department of Business Analytics and Applied Economics, Queen Mary University of London, UK.
‡ Institute for Social and Economic Research, University of Essex, Colchester, UK.

Latest update: March 24, 2026

Abstract

Panel data methods are widely used in empirical analysis to address unobserved heterogeneity, but causal inference remains challenging when treatments are endogenous and confounding variables high-dimensional and potentially nonlinear. Standard instrumental variables (IV) estimators, such as two-stage least squares (2SLS), become unreliable when instrument validity requires flexibly conditioning on many covariates with potentially non-linear effects. This paper develops a Double Machine Learning estimator for static panel models with endogenous treatments (panel IV DML), and introduces weak-identification diagnostics for it. We revisit three influential migration studies that use shift-share instruments. In these settings, instrument validity depends on a rich covariate adjustment. In one application, panel IV DML strengthens the predictive power of the instrument and broadly confirms 2SLS results. In the other cases, flexible adjustment makes the instruments weak, leading to substantially more cautious causal inference than conventional 2SLS. Monte Carlo evidence supports these findings, showing that panel IV DML improves estimation accuracy under strong instruments and delivers more reliable inference under weak identification.

Keywords: Anderson-Rubin test, causal effects, double machine learning, Neyman orthogonality, panel data, shift-share instrument, weak identification.
JEL codes: C14, C18, C33, C36, C45, C52.

* The authors thank Frank Windmeijer, Federica Liberini, Luca Favero, Angelina Nazarova, and Daniela Sonedda for their helpful comments, and the participants at the QMUL Causal Machine Learning Workshop, 11th ICEEE, 36th EC² Conference. Annalivia Polselli acknowledges support of the British Academy through the Postdoctoral Fellowship (grant number PFSS24/240003). The authors acknowledge the use of the High Performance Computing Facility (Ceres) and its associated support services at the University of Essex in the completion of this work.

1 Introduction

Panel data are the backbone of empirical research in modern economics. The structure of panel data enables researchers to account for time-invariant unobserved heterogeneity and exploit within-unit variation for causal inference. However, causal estimation remains challenging when treatment variables are endogenous and identification hinges on the validity of instrumental variables (IVs). In many applications, instruments are only valid conditional on rich sets of covariates (Angrist and Imbens, 1995; Abadie, 2003). Traditional econometric methods, such as two-stage least squares (2SLS), cannot handle the complexities introduced by high-dimensional confounders or nonlinear relationships. For instance, least squares estimation may be infeasible in high-dimensional settings due to rank deficiency induced by many irrelevant regressors, and it cannot capture nonlinear relationships unless these are explicitly parameterized. This creates a tension between the need to control for a rich set of confounders to justify instrument validity and the limitations of conventional methods under sparsity and model-selection constraints. This paper presents a new method to address these challenges.
We propose a Double Machine Learning (DML) procedure for static panels with endogenous treatments and instrumental variables, which we call 'Panel IV DML'. Our approach extends the framework of Chernozhukov et al. (2018) to accommodate serial dependence and unobserved individual heterogeneity, features typical of panel datasets, and that of Clarke and Polselli (2025) to allow for treatment endogeneity. First, we derive a novel DML estimator for static panel data models with endogenous treatments based on Neyman orthogonal score functions that account for unobserved individual heterogeneity. The proposed method allows for flexible predictions of the nuisance functions with various machine learning algorithms. It further delivers first-order normal statistical inference for treatment effects, adjusted for regularization and over-fitting bias through the use of Neyman orthogonal score functions and block-k-fold cross-fitting, where each unit's entire time series is assigned to the same fold. Second, to ensure credible inference in the possible presence of weak instruments, we propose first-stage F-statistics and Anderson-Rubin (AR) test statistics and confidence sets, which allow us to detect weak identification issues and, thus, ensure valid inference even when the instrument strength is limited. To our knowledge, this is among the first works to develop weak IV tests for DML procedures, providing empirical researchers with new tools for robust instrumental variables analysis.

This framework is particularly relevant in applied economics, as panel data models estimated with an instrumental variable method (e.g., 2SLS, generalized method of moments) account for 40% of the American Economic Review (AER) articles using panel data published between 2011 and 2018.
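The block-k-fold cross-fitting mentioned above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names `block_k_folds` and `units_in`, the long-format list `unit_ids`, and the toy dimensions (6 units, 3 waves, 3 folds) are all assumptions made for the example. The only point it demonstrates is that folds are formed over units, so every wave of a unit lands in the same fold.

```python
import random
from collections import defaultdict


def block_k_folds(unit_ids, k, seed=0):
    """Partition row indices into k folds, keeping each unit's rows together."""
    units = sorted(set(unit_ids))
    random.Random(seed).shuffle(units)
    # Deal the shuffled units into folds round-robin.
    fold_of_unit = {u: j % k for j, u in enumerate(units)}
    folds = defaultdict(list)
    for row, u in enumerate(unit_ids):
        folds[fold_of_unit[u]].append(row)
    return [folds[j] for j in range(k)]


def units_in(rows, unit_ids):
    """Return the set of units that the given row indices belong to."""
    found = set()
    for r in rows:
        found.add(unit_ids[r])
    return found


# Toy long-format panel: N = 6 units, T = 3 waves (hypothetical sizes).
unit_ids = [i for i in range(6) for _ in range(3)]
folds = block_k_folds(unit_ids, k=3)

for j, held_out in enumerate(folds):
    n_train = len(unit_ids) - len(held_out)
    # Nuisance learners would be fit on the other folds and
    # used to predict on the held-out rows of fold j.
    print(j, sorted(units_in(held_out, unit_ids)), n_train)
```

Splitting rows at random instead would let waves of the same unit appear in both the training and held-out samples, which is what the block structure is meant to avoid.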
We hence illustrate the empirical relevance of our method by revisiting three highly influential studies in migration and political economy that rely on shift-share instruments. Such instruments typically require conditioning on rich sets of covariates to be plausibly exogenous, or as good as randomly assigned (Borusyak et al., 2025); hence these applications provide a natural setting for panel IV DML, which flexibly controls for high-dimensional and nonlinear confounding.

The first empirical study we revisit, Tabellini (2020), studies the political and economic effects of immigration in early 20th-century U.S. cities using a shift-share IV. Our panel IV DML increases the predictive power of the instrument (relevance) since we allow for flexible adjustment in the instrument equation. The panel IV DML estimates broadly align with conventional 2SLS in the baseline specifications for the political outcome, but find no effect for the economic outcome, unlike 2SLS. When additional controls are included in the specification as a robustness check, the main effects found in the original study for both economic and political outcomes disappear under both 2SLS and panel IV DML, suggesting that the results are sensitive to the inclusion of confounders rather than to the estimation method per se.

In contrast, the re-analyses of Moriconi et al. (2019, 2022), which examine the impact of immigration on political and economic attitudes of individual voters and parties, reveal weak-instrument concerns and, therefore, second-stage results should be interpreted with caution. Weak-identification diagnostics indicate that, while some reduced-form relationships may be present, the available variation is insufficient to support identification of the treatment effect. In these cases, panel IV DML helps assess instrument validity through flexible adjustment, while potentially reducing effective instrument relevance.
In these exercises, the panel IV DML estimator leads to more cautious inference, emphasising the value of robust weak-identification diagnostics, especially when instruments are only moderately strong with conventional 2SLS.

Footnote 1 (on the 40% share of AER panel-data articles reported above): This figure is based on a review of 477 empirical articles published in the American Economic Review between 2011 and 2018. In particular, 68% (= 326/477) of these inspected articles employ panel data methods. Among the set of these panel data articles, 40% (= 131/326) implement an instrumental variables estimation strategy. Detailed information on data collection is provided in Appendix G.

Our Monte Carlo simulation results support the findings of the empirical applications. Through simulation exercises calibrated to the studied empirical designs, we show that panel IV DML outperforms conventional 2SLS in both bias and rate of convergence when confounding is nonlinear and the instrument is strong. When instruments are weak, our Panel IV DML method provides more reliable inference under weak identification than conventional 2SLS, and AR-based confidence sets more reliably reflect true uncertainty than standard inference.

Our work intersects with several strands of the econometric literature. First, the paper is situated within the broad field of causal inference methods that employ double/debiased (orthogonal) estimation to adjust for high-dimensional confounders (Belloni et al., 2014a,b, 2016; Chernozhukov et al., 2018, 2022; Knaus, 2022; Bia et al., 2023; Chen et al., 2025; Chernozhukov et al., 2025; Chen et al., 2025). The canonical article is Chernozhukov et al. (2018), who introduced the DML method for cross-sectional models by combining the predictive power of machine learning (ML) algorithms to flexibly predict the functional form of the covariates with statistical methods to obtain valid statistical inference of the structural parameter.
From this seminal paper, we extend the DML procedure for partially linear regression (PLR) models with instrumental variables to panel data. With repeated observations for each subject in the sample, the presence of the unobserved individual heterogeneity, usually assumed to be correlated with the confounding variables and treatment (i.e. fixed effects), poses serious challenges for machine learning and DML, which were not developed for serially correlated data. [2] We thus use conventional panel data transformations (such as first-differences) for static panels to remove the linear effect of the unobserved heterogeneity. Then, provided that we can learn new nuisance functions (that is, transformations of the nuisance functions of the original partially linear panel model), we show how DML can be used to obtain a consistent estimator of the structural parameter with desirable asymptotic properties.

Footnote 2: For instance, naively applying to panel data the DML for partially linear regression models with instrumental variables set out by Chernozhukov et al. (2018, Sec. 4.2) (i.e. using 'pooled' estimation) would ignore this heterogeneity and, thus, be inconsistent.

Second, within the literature on causal machine learning, our contribution relates to a growing strand that extends the DML framework to both static and dynamic panel data models targeting different causal estimands. In static panel settings, Klosin and Vilgalys (2022) propose an estimator for the average partial effect in models with continuous exogenous treatments, using a first-difference transformation to eliminate unobserved time-invariant heterogeneity. Extensions of DML to dynamic panel data models address additional sources of endogeneity arising from predetermined regressors and lagged outcomes. Semenova et al.
(2023) develop a procedure for dynamic panel models with predetermined variables, binary treatments and fully heterogeneous treatment effects, relying on weak sparsity and linearity assumptions suitable for Lasso estimation. Chernozhukov et al. (2024) further extend the framework by combining the Arellano-Bond estimation strategy with Lasso to estimate structural parameters in dynamic linear panel models with lagged dependent variables, predetermined covariates, and unobserved individual and time fixed effects. In contrast to the rest of the DML literature, Argañaraz and Escanciano (2025) allow for multivariate and general functionals of unobserved heterogeneity in high-dimensional panel settings, and provide a full characterization of Neyman-orthogonal moments in models with nonparametric unobserved heterogeneity, even when nuisance components or the distribution of the heterogeneity are not fully identified.

The work most closely related to ours in this strand of the literature is Clarke and Polselli (2025). The authors develop DML methods for partially linear panel regression models while accounting for low-dimensional unobserved individual heterogeneity, without imposing sparsity on the nuisance functions, thereby allowing the use of general machine learning algorithms within the DML framework. We extend their framework by relaxing the exogeneity assumption imposed on the treatment variable in their setting. Allowing for potentially non-random instruments requires explicitly modelling the structural equation of the instrument (uncommon in conventional IV procedures) and, consequently, learning an additional nuisance function for that equation.
This IV setting also necessitates the construction of new Neyman-orthogonal score functions tailored to our panel data structure to remove regularization bias and deliver valid asymptotic inference under flexible machine learning estimation of the nuisance functions.

A complementary article to ours is Avila Marquez (2025), who proposes a DML procedure for panel data to capture nonlinearity in the effects of otherwise valid instrumental variables which would be speciously weak using 2SLS. She uses a control function approach in which ML is used to learn the functional form of the instrument(s) and exogenous regressors in the first-stage. In contrast, we adopt a first-stage specification for the treatment which is linear in the instrument and nonlinear in the exogenous regressors. This setting gives us a direct connection to conventional 2SLS for scenarios where established weak IV procedures can be used. Our approach hence prioritizes improving instrument validity by flexibly controlling for a rich set of confounding variables using ML algorithms, making the instrument as good as randomly assigned, and by developing inference procedures that remain valid under weak identification.

We also interact with the literature on valid inference in IV settings under weak identification. Some studies focus primarily on developing inference tools (including F and t-tests, adjusted critical values) for detecting weak instruments in low-dimensional structural models (Cragg and Donald, 1993; Stock and Yogo, 2005; Kleibergen and Paap, 2006; Montiel Olea and Pflueger, 2013; Lee et al., 2022; Windmeijer, 2025).
Others build on Anderson-Rubin (Anderson and Rubin, 1949, henceforth AR) test statistics to provide robust inference regardless of instrumental relevance in settings with few instruments (Moreira and Moreira, 2019) or in settings where the number of instruments grows with the sample size (e.g., Anatolyev and Gospodinov, 2011; Carrasco and Tchuente, 2016; Crudu et al., 2021; Mikusheva and Sun, 2022; Dovì et al., 2024). Our paper contributes to this conversation by proposing AR-based inference for the panel IV DML estimator with few instruments and under heteroskedasticity, aligning with best practices that have emerged in the econometrics literature (Davidson and MacKinnon, 2014; Andrews et al., 2019; Keane and Neal, 2023, 2024). [3] Building on this established discussion, we leverage the desirable asymptotic properties of the DML estimator and the linearity of the instrumental variable in the first-stage equation to derive statistical tests for the panel IV DML estimator under weak identification (first-stage F-statistic, Anderson-Rubin test and confidence sets), not originally proposed for the cross-sectional case. The proposed statistics incorporate flexible information on confounding variables, cross-fitting strategy, and orthogonal residuals that mitigate the bias arising from misspecified confounding structures. The linearity of the instrumental variable in the first-stage relationship between the instrument and the treatment is fundamental for invoking conventional weak IV asymptotic theory and associated inference procedures. [4]

Finally, we contribute to the broader discussion on the value added of machine learning methods to causal analysis and policy evaluation in economics. A growing body of

Footnote 3: The AR test statistic and confidence set are not as commonly used in applied work as the first-stage F-statistic, and are rarely reported in standard 2SLS regression tables.
Some examples of empirical studies that report AR test statistics and/or AR confidence sets alongside the conventional weak IV tests are: Clark and Zhu (2024) in labour economics; Ronconi et al. (2012) and Abayasekara et al. (2025) in health economics; Cruz and Moreira (2005) in the economics of education, and Khammo et al. (2024) in public economics; and Fumagalli et al. (2021) in household behaviour and family economics.

Footnote 4: A methodological challenge for future research is extending weak IV asymptotics to more general scenarios where the first-stage relationship is nonlinear, as in Chernozhukov et al. (2018) and Avila Marquez (2025).

empirical work (e.g., Deryugina et al., 2019; Langen and Huber, 2023; Strittmatter, 2023; Baiardi and Naghi, 2024a,b) documents promising results from the use of causal machine learning techniques (e.g., causal forest, generic machine learning, and double machine learning) to address empirical research questions by uncovering nonlinearities and heterogeneous effects that would otherwise remain undetected with conventional estimation methods. These applications use cross-sectional data or panel data reshaped into a cross-sectional form. Our work complements this literature by extending these insights to panel data settings, where unobserved heterogeneity, nonlinearities, and high-dimensional confounding pose additional challenges for causal inference. Although panel data with instrumental variables is a common estimation strategy in empirical economics, the application of causal machine learning methods to this setting remains largely underdeveloped, a gap this paper directly addresses.

The remainder of the paper is structured as follows. Section 2 introduces the notation and presents the econometric framework for panel IV DML.
Section 3 describes the estimation and inference with panel IV DML by defining the Neyman orthogonal score functions, the estimators, and the (robust) tests under weak identification. Section 4 applies the method to three empirical applications with shift-share instruments in the context of the economics of migration. Section 5 illustrates the finite sample properties of the panel IV DML estimator with the Monte Carlo simulation exercises. Section 6 concludes.

2 Econometric Framework for Panel IV

2.1 Notation

Suppose that individual $i$ is randomly drawn from a population and that measures are collected for everyone in the sample over multiple time periods $t$ (or waves in survey studies). Let $\{(Y_{it}, D_{it}, Z_{it}, X_{it}) : t = 1, \dots, T\}_{i=1}^{N}$ be independent and identically distributed (iid) random vectors for each of the $N$ individuals across all $T$ time points, where $Y_{it} \in \mathcal{Y}$ is the outcome variable, $D_{it} \in \mathcal{D}$ a continuous or binary endogenous treatment variable (or intervention), $Z_{it} = (Z_{it,1}, \dots, Z_{it,r})' \in \mathcal{Z}$ is an $r \times 1$ vector of valid instruments (continuous or binary), and $X_{it} = (X_{it,1}, \dots, X_{it,p})' \in \mathcal{X}$ a $p \times 1$ vector of control (pre-determined) variables, usually including a constant term, able to capture time-varying confounding induced by non-random treatment selection. [5] We denote the realizations of these random variables by $\{(y_{it}, d_{it}, z_{it}, x_{it})\}$, respectively. For continuous $D_{it} \in \mathcal{D} \subset \mathbb{R}$, if $d_{it} \ge 0$ then it is assumed that a dose-response relationship is maintained with $D_{it} = 0$ indicating null treatment; otherwise, $D_{it}$ is taken to be centered around its mean $\mu_D$ such that $D_{it} \equiv D_{it} - \mu_D$. For binary $D_{it} \in \{0, 1\}$, $D_{it} = 0$ is taken to indicate the absence and $D_{it} = 1$ the presence of treatment.
In the interval between times $t-1$ and $t$, it is assumed that the realizations of time-varying predictor $X_{it}$ and instrument $Z_{it}$ precede that of $(Y_{it}, D_{it})$.

2.2 Model and Assumptions

Consider the following partially linear panel regression (PLPR) model with instrumental variables, based on extending the cross-sectional model by Robinson (1988) (see also Chernozhukov et al. (2018, sec. 4.2)):

$$Y_{it} = \{D_{it} - r_0(X_{it})\}\theta_0 + l_0(X_{it}) + \alpha_i + U_{it}, \tag{2.1}$$
$$D_{it} = V_{it}'\pi_0 + r_0(X_{it}) + \zeta_i + R_{it}, \tag{2.2}$$
$$V_{it} = Z_{it} - m_0(X_{it}) - \gamma_i, \tag{2.3}$$

where $Y_{it}$ is the outcome of interest, $D_{it}$ an endogenous treatment variable (or policy intervention), $Z_{it}$ is an $r \times 1$ vector of exogenous excluded instruments which is not fixed, $X_{it}$ is a $p \times 1$ vector of exogenous included regressors; $\eta_0 = (l_0, r_0, m_0)$ is a vector of possibly nonlinear nuisance functions of the covariates to be estimated using ML algorithms; $\theta_0$ is the structural parameter of interest to estimate consistently to conduct causal inference, and $\pi_0$ an $r \times 1$ vector of parameters for the excluded exogenous variables (or instruments). The $\alpha_i$, $\zeta_i$ and $\gamma_i$ are unobserved individual-level, or fixed, effects; all three are functions of the omitted time-invariant individual-level variables $\xi_i$. Endogenous treatment selection induces a correlation between errors $U_{it}$ and $R_{it}$ such that $E(U_{it} \mid D_{it}, X_i, \xi_i) \neq 0$ but $E(U_{it} \mid Z_{it}, X_{it}, \xi_i) = 0$. Unlike conventional 2SLS settings, the instruments $Z_{it}$ are not treated as fixed but as wave $t$ realizations conditional on $X_{it}$ but preceding $(Y_{it}, D_{it})$. Based on (2.3), the role of instrumental variable residuals $V_{it}$ is crucial to the construction of our DML procedure.
These are required to ensure a) the score function is Neyman orthogonal and, hence, that DML controls the size of the errors introduced by using ML-based estimation of the nuisance functions, and b) the nuisance functions for the first-difference estimator can be consistently learnt (see Remark 2.4 below). Moreover, using $V_{it}$ as the instrumental variable rather than $Z_{it}$ enables (2.1) and (2.2) to be parametrized in terms of the same nuisance parameter $r_0(X_{it})$, which simplifies the resulting DML algorithm.

Footnote 5: Throughout, we use $v'$ to indicate the matrix transpose of arbitrary vector $v$ and take vectors to be column vectors unless the opposite is explicitly stated.

The assumptions which must be satisfied by the data generating process in order that (2.1)-(2.2) holds, and the special role played by (2.3), are now set out:

Assumption 2.1. (a) (No feedback from outcome and treatment to predictors and instruments given omitted variables) For all $t = 1, \dots, T$ with $T \ge 2$,
$$\Pr(X_{it}, Z_{it} \mid Y_{it-1}, D_{it-1}, X_{it-1}, Z_{it-1}, \dots, Y_{i1}, D_{i1}, X_{i1}, Z_{i1}, \xi_i) = \Pr(X_{it}, Z_{it} \mid X_{it-1}, Z_{it-1}, \dots, X_{i1}, Z_{i1}, \xi_i).$$
(b) (Local no-feedback assumption) $\Pr(X_{it} \mid Z_{it-1}, X_{it-1}, \xi_i) = \Pr(X_{it} \mid X_{it-1}, \xi_i)$.

Assumption 2.2. (Static structural model for outcome) $Y_{it} = g_0^*(X_{it}, \xi_i) + D_{it}\theta_{it} + \epsilon_{it}$, where $E(\epsilon_{it} \mid D_{it}, X_i, \xi_i) \neq 0$ but $E(\epsilon_{it} \mid Z_{it}, X_{it}, \xi_i) = 0$. $\theta_{it}$ is the causal effect of $D_{it}$ on $Y_{it}$ specific to individual $i$ at wave $t$.

Assumption 2.3. (Homogeneity of the treatment effect) The individual-time treatment effects are mean-independent of the predictors, instruments and omitted time-invariant variables such that $E(\theta_{it} \mid D_{it} = d, X_{it}, Z_{it}, \xi_i) = d\theta_0$.

Remark 2.1.
$E(\theta_{it} \mid D_{it} = d) = d\theta_0$ without further assumptions, where $\theta_0$ is the average treatment effect on the treated (ATET) if $D_{it}$ is binary; and if it can be assumed that $E(\theta_{it} \mid D_{it} = 0) = E(\theta_{it} \mid D_{it} = 1)$ then $\theta_0$ is the average treatment effect (ATE).

Remark 2.2. Under Assumption 2.1(a) and Assumptions 2.2-2.3, it follows that $g_0^*(X_{it}, \xi_i) = E(Y_{it} \mid X_{it}, \xi_i) - E(D_{it} \mid X_{it}, \xi_i)\theta_0$. Hence, if $l_0^*(X_{it}, \xi_i) \equiv E(Y_{it} \mid X_{it}, \xi_i)$, we have that
$$Y_{it} = l_0^*(X_{it}, \xi_i) + \{D_{it} - E(D_{it} \mid X_{it}, \xi_i)\}\theta_0 + U_{it},$$
where $U_{it} = \epsilon_{it} + (\theta_{it} - \theta_0)D_{it}$ satisfies $E(U_{it} \mid Z_{it}, X_{it}, \xi_i) = 0$.

We now require three further assumptions about the relationship between the instruments and predictors, the treatment and instruments:

Assumption 2.4. (Nonlinear instrumental variable model) $Z_{it} = m_0^*(X_{it}, \xi_i) + V_{it}$, where $E(V_{it} \mid X_{it}, \xi_i) = 0$.

Assumption 2.5. (Linear effects of instruments on treatment) $D_{it} = V_{it}'\pi_0 + r_0^*(X_{it}, \xi_i) + R_{it}$, where $r_0^*(X_{it}, \xi_i) = E(D_{it} \mid X_{it}, \xi_i)$ and $E(R_{it} \mid X_{it}, \xi_i) = 0$. [6]

Remark 2.3. The treatment model assumes there are no interactions between the instruments and exogenous regressors, and that the effects of the instruments are linear. Avila Marquez (2025) allows a more general specification in which $E[D_{it} \mid Z_{it}, X_{it}, \xi_i] = r_1(Z_{it}, X_{it}) + \gamma_i$, where $r_1$ is a second treatment-related nuisance parameter in addition to $r_0(\cdot)$ but Assumption 2.4 is not needed; this specification allows the analyst to address the problem of weak instruments induced by 2SLS applied to instrumental variables with nonlinear effects on the treatment.
In contrast, the setup here is based on a common specification for multiple instruments and permits the use of established techniques for weak instruments, namely, the Stock and Yogo (2005) F-statistic and Anderson-Rubin confidence intervals, as set out in Section 3.

Footnote 6: Generally, $E(R_{it} \mid Z_{it}, X_{it}, \xi_i) = 0$ but because Assumptions 2.4 and 2.5 are taken to hold together it requires only $E(R_{it} \mid X_{it}, \xi_i) = 0$.

Assumption 2.6. (a) (Additive separability) Each of the nonlinear nuisance functions is additively separable such that $l_0^*(X_{it}, \xi_i) = l_0(X_{it}) + \alpha_i$, $m_0^*(X_{it}, \xi_i) = m_0(X_{it}) + \gamma_i$ and $r_0^*(X_{it}, \xi_i) = r_0(X_{it}) + \zeta_i$, where $\alpha_i = \alpha(\xi_i)$, $\gamma_i = \gamma(\xi_i)$ and $\zeta_i = \zeta(\xi_i)$ are measurable functions of $\xi_i$. (b) (Fixed effects) The unobserved individual heterogeneity is correlated with the included variables such that $E(\alpha_i \mid D_{it}, X_{it})$, $E(\gamma_i \mid X_{it})$ and $E(\zeta_i \mid Z_{it}, X_{it})$ are nonzero.

Remark 2.4. Fixed effects panel model (2.1)-(2.2) follows straightforwardly from Assumption 2.1(a), Assumptions 2.2-2.3 and Assumptions 2.5-2.6 alone. Assumptions 2.1(b) and 2.4 are needed to permit consistent learning of the nuisance parameters. Assumption 2.4 is not simply a model of the association between instrument and regressor, but an assumption about the data-generating process for the instrument. Together with Assumption 2.1(b), it ensures that the first-difference estimator (introduced in Section 2.3) is consistent.
Specifically, the two assumptions together block pathways between $Y_{it-1}$ and $X_{it}$ induced by omitting $Z_{it-1}$ from the conditioning set so that $E[Y_{it-1} \mid X_{it}, X_{it-1}, \xi_i] = l_0(X_{it-1}) + \alpha_i$ and, hence, $E[Y_{it} - Y_{it-1} \mid X_{it}, X_{it-1}] = l_0(X_{it}) - l_0(X_{it-1})$; it also follows that $E[D_{it-1} \mid X_{it}, X_{it-1}, \xi_i] = r_0(X_{it-1}) + \zeta_i$ and $E[Z_{it-1} \mid X_{it}, X_{it-1}, \xi_i] = m_0(X_{it-1}) + \gamma_i$ so that the nuisance function contrasts involving $r_0$ and $m_0$ can be learnt consistently.

2.3 First-differencing for Panel Data with IV

We employ the first-difference (FD) transformation to remove the unobserved individual heterogeneity (or fixed effects) in the PLPR Model (2.1)-(2.3). The FD transformation is preferable over other traditional panel data techniques, such as the within-group (or fixed effects) transformation and correlated random effects (Mundlak, 1978; Chamberlain, 1984), because it imposes the fewest constraints on the data generating process and permits feasible estimation of the nuisance functions.

Let $\tilde{Y}_{it} = Y_{it} - Y_{it-1}$ be the first-difference of the random variable $Y_{it}$, and $\tilde{l}_0(X_{it}) = l_0(X_{it}) - l_0(X_{it-1})$ be the first-difference of the nuisance function $l_0$ for all $i = 1, \dots, N$ and $t = 2, \dots, T$. The first differences of the other random variables and nuisance functions are similarly defined using tilde notation. Then, the first-differenced PLPR model with IV based on model equations (2.1)-(2.3) is

$$\tilde{Y}_{it} = \{\tilde{D}_{it} - \tilde{r}_0(X_{it})\}\theta_0 + \tilde{l}_0(X_{it}) + \tilde{U}_{it}, \tag{2.4}$$
$$\tilde{D}_{it} = \tilde{V}_{it}'\pi_0 + \tilde{r}_0(X_{it}) + \tilde{R}_{it}, \tag{2.5}$$
$$\tilde{V}_{it} = \tilde{Z}_{it} - \tilde{m}_0(X_{it}), \tag{2.6}$$

where $\alpha_i$, $\zeta_i$ and $\gamma_i$ are now absent from the system, and, substituting (2.5) into (2.4),

$$\tilde{Y}_{it} = \tilde{V}_{it}'\delta_0 + \tilde{l}_0(X_{it}) + \tilde{U}^*_{it}, \tag{2.7}$$

where $\delta_0 = \pi_0\theta_0$ and $\tilde{U}^*_{it} = \tilde{U}_{it} + \tilde{R}_{it}\theta_0$. Following Andrews et al.
(2019), we refer to (2.4) as the structural model, (2.5) as the first-stage model, and (2.7) as the reduced-form model. The estimator of reduced-form model parameter $\delta_0$ plays a crucial role in deriving asymptotic results for weak instruments.

Under the model above, naively using two-stage least squares (2SLS) to estimate the effect of endogenous treatment $\tilde{D}_{it}$ on $\tilde{Y}_{it}$ using $\tilde{Z}_{it}$ as an instrumental variable leads to bias. For example, in the simple $T = 2$ case and letting $\tilde{d} = (\tilde{D}_{12}, \dots, \tilde{D}_{n2})'$, $\tilde{Z} = (\tilde{Z}_{12}, \dots, \tilde{Z}_{n2})'$ and $\tilde{X} = (\tilde{X}_{12}, \dots, \tilde{X}_{n2})'$, it can be shown that

$$\hat{\theta}_{2SLS} - \theta_0 = \frac{(M_X\tilde{d})' M_{\hat{Z}}}{(M_X\tilde{d})' M_{\hat{Z}} (M_X\tilde{d})}\, M_X(\tilde{e} + \tilde{g}_2) \equiv b(\tilde{e} + \tilde{g}_2),$$

where $\tilde{e}$ is the residual of the population-level linear projection of $\tilde{y} = (\tilde{Y}_{12}, \dots, \tilde{Y}_{n2})'$ on $\tilde{X}$ and $\tilde{Z}$, and $\tilde{g}_2 = (\tilde{g}_0(X_{12}), \dots, \tilde{g}_0(X_{n2}))'$; $M_X\tilde{d}$ is the residual of the linear projection of $\tilde{d}$ onto the space spanned by the columns of $\tilde{X}$, or $\mathrm{span}(\tilde{X})$, $M_X(\tilde{e} + \tilde{g}_2)$ is that for $\tilde{e} + \tilde{g}_2$ onto $\mathrm{span}(\tilde{X})$, and $M_{\hat{Z}}a$ is the residual of the linear projection of any $a$ onto $\mathrm{span}(M_X\tilde{Z})$. The usual bias term for 2SLS is $b\tilde{e}$, induced by the sample correlation between $\tilde{Z}_{it}$ and the structural error being non-zero, albeit negligible for strong $\tilde{Z}_{i2}$; but $b\tilde{g}_2$, the (scaled) linear projection of $\tilde{g}_2$ onto $\mathrm{span}(\tilde{X})$, is potentially non-zero whether $\tilde{Z}_{i2}$ is strong or weak.

The main challenge for implementing DML is to learn the transformed ex-ante unknown nuisance functions $\tilde{l}_0(X_{it})$, $\tilde{r}_0(X_{it})$, and $\tilde{m}_0(X_{it})$. For this, we adopt the FD (exact) approach proposed by Clarke and Polselli (2025) to learn these functions from the observed transformed data on $\tilde{Y}_{it}$, $\tilde{D}_{it}$ and $\tilde{Z}_{it}$, respectively, given $X_{it-1}$ and $X_{it}$.
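The first-difference step, and the inputs from which the transformed nuisance functions would be learnt, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the dictionary layout, the function names `first_difference` and `make_toy_panel`, the scalar covariate and instrument, and the noise-free toy values are all hypothetical. Each differenced row keeps both the current and the lagged covariate, since a learner for a differenced nuisance function takes the pair as its features.

```python
def first_difference(panel):
    """Return one row per unit and wave t >= 2 with differenced Y, D and Z.

    `panel` maps a unit id to its wave-ordered list of records with
    keys "y", "d", "z", "x" (an assumed layout for this example).
    """
    rows = []
    for unit, waves in panel.items():
        for prev, curr in zip(waves, waves[1:]):
            rows.append({
                "unit": unit,
                "dy": curr["y"] - prev["y"],
                "dd": curr["d"] - prev["d"],
                "dz": curr["z"] - prev["z"],
                # Features for learning the differenced nuisance functions.
                "x_t": curr["x"],
                "x_lag": prev["x"],
            })
    return rows


def make_toy_panel():
    """Build a noise-free toy panel with unit-specific intercepts.

    Hypothetical design: y = alpha_i + 2x, d = zeta_i + x, z = gamma_i + 3x.
    """
    x_paths = {"a": [1, 4, 2], "b": [0, 5, 5]}
    effects = {"a": (10, -3, 7), "b": (-20, 8, 1)}  # (alpha, zeta, gamma)
    panel = {}
    for unit, xs in x_paths.items():
        alpha, zeta, gamma = effects[unit]
        panel[unit] = [
            {"y": alpha + 2 * x, "d": zeta + x, "z": gamma + 3 * x, "x": x}
            for x in xs
        ]
    return panel


panel = make_toy_panel()
fd_rows = first_difference(panel)
for row in fd_rows:
    # The fixed effects cancel: each difference depends only on (x_t, x_lag).
    print(row["unit"], row["dy"], row["dd"], row["dz"], row["x_t"], row["x_lag"])
```

In this toy design the unit-specific intercepts drop out of every differenced variable, which is the property the FD transformation relies on; with nonlinear nuisance functions the differenced target would still depend on both `x_t` and `x_lag`, not merely on their difference.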
In the next sections, we handle the first source of bias and derive a consistent estimator of the target (or causal) parameter of interest when the treatment variable is endogenous.

3 Estimation and Inference

This section presents the estimation and inference of the panel IV DML estimator and the weak identification diagnostics. A detailed discussion of the algorithm is in the Online Appendix B.

3.1 Panel IV DML Estimator

In this section, we set out DML estimation of the panel IV model parameters first by deriving a Neyman-orthogonal score function for estimating structural model parameters $b_0 = (\theta_0, \pi_0')'$ needed for DML and second by establishing its asymptotic properties. [7]

Denote the $N(T-1) \times 1$ random vectors of first differences $\tilde{Y} = (\tilde{Y}_1, \dots, \tilde{Y}_N)'$, with $\tilde{Y}_i = (\tilde{Y}_{i2}, \dots, \tilde{Y}_{iT})'$ for individual $i$, and the first differences of the nuisance function $\tilde{l}_0 = (\tilde{l}_{01}, \dots, \tilde{l}_{0N})'$, with $\tilde{l}_{0i} = (\tilde{l}_0(X_{i2}), \dots, \tilde{l}_0(X_{iT}))'$ for $i = 1, \dots, N$. The other $N(T-1)$-vectors of first differences $\tilde{D}$, $\tilde{U}$, $\tilde{R}$, $\tilde{U}^*$ and $\tilde{r}_0$ are similarly defined. The $N(T-1) \times r$ matrices relating to instrumental variable model (2.6), where $r$ is the number of valid instrumental variables, are $\tilde{Z} = (\tilde{Z}_1, \dots, \tilde{Z}_N)'$ with $\tilde{Z}_i = (\tilde{Z}_{i2}, \dots, \tilde{Z}_{iT})'$, and $\tilde{V} = \tilde{Z} - \tilde{M}_0$, where $\tilde{M}_0 = (\tilde{M}_{01}, \dots, \tilde{M}_{0N})'$ with $\tilde{M}_{0i} = (\tilde{m}_0(X_{i2}), \dots, \tilde{m}_0(X_{iT}))'$. [8] Finally, let $X$ be an $NT \times p$ matrix of untransformed confounding variables. The complete set of random vectors and matrices with observed realizations is $W = \{\tilde{Y}, \tilde{D}, \tilde{Z}, X\}$.

Proposition 3.1.
(Neyman-orthogonal Score for Panel IV DML) Suppressing subscript $i$, the contribution of a single individual to a Neyman orthogonal score $\psi$ for the (finite-dimensional) parameters $\theta_0$ and $\pi_0$ of structural model (2.4)-(2.6) is
\[
\psi_0 = \psi(W; b_0; \eta_0) = - \begin{pmatrix} V^{\perp}_0\, \Omega_{\theta\theta}^{-1}\, \widetilde{U} \\ \widetilde{V}'\, \Omega_{\pi\pi}^{-1}\, \widetilde{R} \end{pmatrix}, \tag{3.1}
\]
where $\eta_0 = (\widetilde{l}_0, \widetilde{r}_0, \widetilde{M}_0)$ is the true (infinite-dimensional) nuisance parameter, and the row vector $V^{\perp}_0 = \pi_0' \widetilde{V}'$ is the combined effect of the $r$ instrumental variables on $\widetilde{D}_{it}$ in model (2.5). The residual variance-covariance matrices in the above expressions are $\Omega_{\theta\theta} = E[\widetilde{U}\widetilde{U}' \mid X]$ and $\Omega_{\pi\pi} = E[\widetilde{R}\widetilde{R}' \mid X]$. These scores lead to locally efficient estimators, in the sense of being semi-parametrically efficient, if $E[\widetilde{U}\widetilde{U}' \mid \widetilde{Z}, X] = \Omega_{\theta\theta}(X)$, $E[\widetilde{R}\widetilde{R}' \mid \widetilde{Z}, X] = \Omega_{\pi\pi}(X)$, and $E[\widetilde{U}\widetilde{R}' \mid \widetilde{Z}, X] = 0$.

The Regularity Conditions are that the nuisance parameter estimates $\eta = (\widetilde{l}, \widetilde{r}, \widetilde{M}) \in \mathcal{T}$, where $\mathcal{T}$ is a convex subset of some normed vector space of square-integrable functions; $\|\widetilde{Z}\|^4$ and the singular values of $\|\Omega_{jj}(X_i)\|$, for $j = \theta, \delta, \pi$, have finite expectation under the data generating process $P$; and there exists a finite $C_\eta$ such that for any $\eta \in \mathcal{T}$, $\Pr(\|E[\psi \mid \widetilde{Z}, X]\| < C_\eta) = 1$, where $\Pr$ and the norm $\|\cdot\|$ are taken with respect to $P$. Then (3.1) is Neyman orthogonal at the true $b_0$ for all $\eta \in \mathcal{T}_N$, where $\mathcal{T}_N \subset \mathcal{T}$ is a proper shrinking neighbourhood of $\eta_0$.⁹

A proof is given in printed Appendix A.1. A formal definition of $\mathcal{T}_N$ is later provided in the proof of Proposition 3.2.

⁷ Equivalent expressions for the reduced-form model parameters $b^{rf}_0 = (\delta_0', \pi_0')'$, which facilitate the use of weak instrument techniques, are derived but omitted from the main text.

⁸ $\widetilde{M}_0$ should not be confused with the projection matrix residual $M_X$ defined in the previous section.

Remark 3.5.
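(Linear score function). The Neyman orthogonal score is linear in $b$:
\[
\psi = \begin{pmatrix} V^{\perp}_0 \widetilde{U} \\ \widetilde{V}' \widetilde{R} \end{pmatrix}
= \begin{pmatrix} V^{\perp}_0 (\widetilde{Y} - \widetilde{l}_0) \\ \widetilde{V}' (\widetilde{D} - \widetilde{r}_0) \end{pmatrix}
- \begin{pmatrix} V^{\perp}_0 (\widetilde{D} - \widetilde{r}_0) & 0_r' \\ 0_r & \widetilde{V}'\widetilde{V} \end{pmatrix}
\begin{pmatrix} \theta_0 \\ \pi_0 \end{pmatrix}
\equiv \psi^{b} - \Psi^{a} b_0,
\]
where the true $\psi$, $\psi^{b}$ and $\Psi^{a}$ at $\eta = \eta_0$ are $\psi_0$, $\psi^{b}_0$ and $\Psi^{a}_0$, respectively,¹⁰ and $0_r$ is a conformable vector of zeros.

Remark 3.6. (Properties of the score function). The score functions satisfy: (a) $\partial_\eta E\big[\psi^{\perp}(W; b_0; \eta_0)\big][\eta - \eta_0] = 0$, where expectations are with respect to the data generating process $P$; and (b) the moment condition $E[\psi^{\perp}(W; b_0; \eta_0)] = 0$ for unique $b_0$ at $\eta_0$. Neyman orthogonality (a) reduces the first-order bias introduced by regularising machine learning algorithms (regularization bias), and guarantees that the estimated effect is insensitive to small errors in predicting the nuisance parameters.

⁹ For the reduced-form model parameter $b^{rf}_0$, the score function is
\[
\psi^{rf}_0 = \psi^{rf}(W; b^{rf}_0, \pi_0; \eta_0) = - \begin{pmatrix} \widetilde{V}'\, \Omega_{\delta\delta}^{-1}\, \widetilde{U}^{*} \\ \widetilde{V}'\, \Omega_{\pi\pi}^{-1}\, \widetilde{R} \end{pmatrix},
\]
where $\Omega_{\pi\pi} = E[\widetilde{R}\widetilde{R}' \mid X]$ and $\Omega_{\delta\delta} = E[\widetilde{U}^{*}\widetilde{U}^{*\prime} \mid X]$. The score leads to locally efficient estimators, in the sense of being semi-parametrically efficient, if $E[\widetilde{R}\widetilde{R}' \mid \widetilde{Z}, X] = \Omega_{\pi\pi}(X)$, $E[\widetilde{U}^{*}\widetilde{U}^{*\prime} \mid \widetilde{Z}, X] = \Omega_{\delta\delta}(X)$ and $E[\widetilde{U}^{*}\widetilde{R}' \mid \widetilde{Z}, X] = 0$.

¹⁰ For the reduced-form model, the Neyman orthogonal function is linear in the parameters $b^{rf}_0$:
\[
\psi^{rf} = \begin{pmatrix} \widetilde{V}' \widetilde{U}^{*} \\ \widetilde{V}' \widetilde{R} \end{pmatrix}
= \begin{pmatrix} \widetilde{V}' (\widetilde{Y} - \widetilde{l}_0) \\ \widetilde{V}' (\widetilde{D} - \widetilde{r}_0) \end{pmatrix}
- \begin{pmatrix} \widetilde{V}'\widetilde{V} & 0_r \\ 0_r & \widetilde{V}'\widetilde{V} \end{pmatrix}
\begin{pmatrix} \delta_0 \\ \pi_0 \end{pmatrix}
\equiv \psi^{b}_{rf} - \Psi^{a}_{rf}\, b^{rf}_0,
\]
where the true $\psi^{b}_{rf}$ and $\Psi^{a}_{rf}$ at $\eta = \eta_0$ are $\psi^{b}_{rf0}$ and $\Psi^{a}_{rf0}$, respectively.

Remark 3.7. (The Panel IV DML Estimator).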
Moment condition (b) guarantees a closed-form solution of the structural parameters such that
\[
b_0 = \begin{pmatrix} \theta_0 \\ \pi_0 \end{pmatrix}
= E\left[ \begin{pmatrix} V^{\perp}_0 (\widetilde{D} - \widetilde{r}_0) & 0_r' \\ 0_r & \widetilde{V}'\widetilde{V} \end{pmatrix} \right]^{-1}
E\left[ \begin{pmatrix} V^{\perp}_0 (\widetilde{Y} - \widetilde{l}_0) \\ \widetilde{V}' (\widetilde{D} - \widetilde{r}_0) \end{pmatrix} \right]. \tag{3.2}
\]
Solving the moment condition (3.2), based on the Neyman-orthogonal score from Proposition 3.1, with respect to $b_0$ yields the panel IV DML estimator
\[
\hat{b} = \begin{pmatrix} \hat{\theta} \\ \hat{\pi} \end{pmatrix}
= \begin{pmatrix} V^{\perp} (\widetilde{D} - \hat{r}) & 0_r' \\ 0_r & \widetilde{V}'\widetilde{V} \end{pmatrix}^{-1}
\begin{pmatrix} V^{\perp} (\widetilde{Y} - \hat{l}) \\ \widetilde{V}' (\widetilde{D} - \hat{r}) \end{pmatrix}, \tag{3.3}
\]
with asymptotic variance-covariance matrix
\[
\Sigma = \begin{pmatrix} \Sigma_{\theta\theta} & 0_r' \\ 0_r & \Sigma_{\pi\pi} \end{pmatrix}
= Q^{-1} \lim_{N \to \infty} \mathrm{Var}\left\{ \begin{pmatrix} V^{\perp}_0 (\widetilde{Y} - \widetilde{l}_0) \\ \widetilde{V}' (\widetilde{D} - \widetilde{r}_0) \end{pmatrix} \right\} Q^{-1}, \tag{3.4}
\]
where
\[
Q^{-1} = \begin{pmatrix} Q_{v^{\perp} r}^{-1} & 0_r' \\ 0_r & Q_{vv}^{-1} \end{pmatrix},
\]
$Q_{v^{\perp} r} = E[\widetilde{V}^{\perp} (\widetilde{D} - \widetilde{r}_0)]$ and $Q_{vv} = E(\widetilde{V}'\widetilde{V})$, $Q^{-1}$ is the inverse of $Q$, and $N$ is the cross-sectional sample size of the fold used to estimate the parameters of interest. A consistent estimator of the scalar $Q_{v^{\perp} r}$ is $\hat{Q}_{v^{\perp} r} = \hat{V}^{\perp} (\widetilde{D} - \hat{r})$, and of the $r \times r$ matrix $Q_{vv}$ is $\hat{Q}_{vv} = \hat{V}'\hat{V}$.¹¹

To present our main result for the estimation and inference of $b_0$ using the panel IV DML estimator, let $\{\delta_N\}_{N=1}^{\infty}$ and $\{\Delta_N\}_{N=1}^{\infty}$ be two sequences of positive constants tending to zero as $N$ increases such that $\delta_N \ge N^{-1/2}$. Let $c, C > 0$ be arbitrary fixed constants denoting lower and upper bounds, respectively. Further fixed constants: recall that $r \ge 1$ is the number of instrumental variables, and let $K \ge 2$ be the number of cross-validation folds chosen by the analyst.

Proposition 3.2.
(Asymptotic Distribution of the Panel IV DML Estimator) Suppose that the Regularity Conditions (a)-(e) below are all satisfied for all probability laws $P \in \mathcal{P}$ for random vectors and matrices $W = \{\widetilde{Y}, \widetilde{D}, \widetilde{Z}, X\}$:

(a) The model defined in Equations (2.4)-(2.7) holds for all $P \in \mathcal{P}$;

(b) $\|\widetilde{Y}\|_{P,q} + \|\widetilde{D}\|_{P,q} + \|\widetilde{Z}\|_{P,q} \le C$ for all $q \ge 1$;

(c) The singular values of $E[\widetilde{V}_0'\widetilde{V}_0]$ all exceed some arbitrary $c > 0$, as does $\big| E\big[\pi_0' \widetilde{V}_0' (\widetilde{D} - \widetilde{r}_0)\big] \big|$;

(d) All of $\big\| E[\widetilde{U}_0^2 \mid X] \big\|_{P,\infty}$, $\big\| E[\widetilde{R}_0^2 \mid X] \big\|_{P,\infty}$ and $\big\| E[\widetilde{V}_0^2 \mid X] \big\|_{P,\infty}$ are $< C$;

(e) Given the first-stage base learner prediction $\hat{\eta}$ based on block-k-fold cross-fitting, the following hold with probability no smaller than $1 - \Delta_N$: $\|\hat{\eta} - \eta_0\|_{P,q} \le C$ for $q > 2$ and $\|\hat{\eta} - \eta_0\|_{P,q} \le \delta_N$, with
\[
\|\widetilde{m} - \widetilde{m}_0\|_{P,2} \times \Big\{ \|\widetilde{l} - \widetilde{l}_0\|_{P,2} + \|\widetilde{r} - \widetilde{r}_0\|_{P,2} + \|\widetilde{m} - \widetilde{m}_0\|_{P,2} \Big\} \le \delta_N N^{-1/2}.
\]

Then, the panel IV DML estimator $\hat{b}$ based on the Neyman-orthogonal score from Proposition 3.1 obeys
\[
\Sigma^{-1/2} \sqrt{N}\, \big(\hat{b} - b_0\big) \sim N(0, I)
\]
uniformly over $P \in \mathcal{P}$, where the variance-covariance matrix is $\Sigma = \big(E[\Psi^{a}_0]\big)^{-1} E[\psi^{b}_0 \psi^{b\prime}_0] \big(E[\Psi^{a}_0]\big)^{-1}$. Moreover, this result holds if $\Sigma$ in Equation (3.4) is replaced by a $\sqrt{N}$-consistent estimator $\hat{\Sigma}$.

Remark 3.8. Norms $\|\cdot\|_{P,q}$ in $L^{q}(P)$ are defined over the space $\mathcal{T}$ of $P$-square-integrable nuisance functions $\eta = \eta(W)$ for $P \in \mathcal{P}$, where $\mathcal{T}_N \subset \mathcal{T}$ is a properly vanishing neighbourhood of the true $\eta_0$ as defined by bounds on $\hat{\eta}$, and $\mathcal{P}_N \subset \mathcal{P}$ is the corresponding vanishing neighbourhood of the data generating process $P$. Norms over the counting measure are denoted by $\|\cdot\|_q$ in the usual way, and those with matrix arguments are Schatten norms.

¹¹ The variance-covariance matrix of the panel IV DML estimators is robust to heteroskedasticity, and has the conventional Eicker-Huber-White sandwich formula.
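For a single instrument ($r = 1$), the closed form in (3.3) reduces to ratios of sums of products of residuals, which the following sketch illustrates. It is a minimal illustration under stated assumptions, not the authors' implementation: the residuals $\widetilde{Y} - \hat{l}$, $\widetilde{D} - \hat{r}$ and $\widetilde{Z} - \hat{m}$ are taken as given (in practice they would come from cross-fitted learners), the function name `panel_iv_dml_r1` and the toy numbers are hypothetical, and clustering the sandwich standard errors by unit is an assumption of this sketch.

```python
from math import sqrt


def panel_iv_dml_r1(res_y, res_d, res_z, unit):
    """Closed-form estimates in the spirit of (3.3) for one instrument.

    res_y, res_d, res_z: residuals Y~ - l_hat, D~ - r_hat, Z~ - m_hat,
    one entry per (unit, period) row; unit: the unit label of each row.
    Returns (theta_hat, se_theta, pi_hat, se_pi) with standard errors
    from a sandwich formula clustered by unit (an assumption here).
    """
    s_vv = sum(v * v for v in res_z)
    s_vd = sum(v * dd for v, dd in zip(res_z, res_d))
    s_vy = sum(v * yy for v, yy in zip(res_z, res_y))
    pi_hat = s_vd / s_vv      # first-stage coefficient
    theta_hat = s_vy / s_vd   # pi_hat cancels in the ratio when r = 1
    score_theta, score_pi = {}, {}
    for i, v, yy, dd in zip(unit, res_z, res_y, res_d):
        # Per-unit sums of instrument residual times equation residual.
        score_theta[i] = score_theta.get(i, 0.0) + v * (yy - theta_hat * dd)
        score_pi[i] = score_pi.get(i, 0.0) + v * (dd - pi_hat * v)
    se_theta = sqrt(sum(s * s for s in score_theta.values())) / abs(s_vd)
    se_pi = sqrt(sum(s * s for s in score_pi.values())) / s_vv
    return theta_hat, se_theta, pi_hat, se_pi


# Hypothetical residuals for 3 units with 2 first-differenced periods each.
unit = [0, 0, 1, 1, 2, 2]
res_z = [1.0, -1.0, 2.0, -2.0, 1.0, 1.0]
res_d = [1.5, 0.0, 1.0, -1.0, 0.0, 0.5]
res_y = [2.0, 1.0, 1.0, -3.0, 1.0, 2.0]
theta_hat, se_theta, pi_hat, se_pi = panel_iv_dml_r1(res_y, res_d, res_z, unit)
```

With several instruments the same logic applies with matrix products in place of the scalar sums.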
Remark 3.9. A proof extending that of Chernozhukov et al. (2018, Theorem 4.2), in which Regularity Conditions (a)-(e) are shown to satisfy their Assumptions 3.1 and 3.2 required for Theorem 3.1 to hold, can be found in the Online Appendix A.2.

3.2 Inference under Weak Identification

As discussed above, the partially linear first-difference model (2.4)-(2.6) (and (2.6)-(2.7)) holds under Assumptions 2.1-2.6, where Assumptions 2.1(b) and 2.4 permit $\hat{l}_{0i}$, $\hat{r}_{0i}$ and $\widehat{M}_{0i}$ to be consistently learnt. Furthermore, Propositions 3.1 and 3.2 present the Neyman orthogonal score and Regularity Conditions under which DML yields consistent and asymptotically normal estimators of $\theta_0$ and $\pi_0$ (or $\delta_0$ and $\pi_0$). If we have $\sqrt{N}$-consistent estimators of $\Omega_{\theta\theta}$ and $\Omega_{\pi\pi}$ then, without loss of generality, we can set both equal to identity matrices in the following explanation. Then, following Chernozhukov et al. (2018, Sect. 1), a second-order Taylor series expansion of score (3.1) around $\psi = \psi_0$, where $\psi_0 = (\theta_0, \pi_0, \widetilde{l}_0, \widetilde{r}_0, \widetilde{m}_0)$, gives
\[
\sqrt{N}\,(\hat{\theta}_{DML} - \theta_0) = \left[ \frac{1}{N}\, \pi_0' \widetilde{V}_0' \big(\widetilde{D} - \widetilde{r}_0\big) \right]^{-1} \frac{1}{\sqrt{N}}\, \pi_0' \widetilde{V}_0' \widetilde{U}_0 + o_p(1),
\]
and
\[
\sqrt{N}\,(\hat{\pi}_{DML} - \pi_0) = \left[ \frac{1}{N}\, \widetilde{V}_0' \widetilde{V}_0 \right]^{-1} \frac{1}{\sqrt{N}}\, \widetilde{V}_0' \widetilde{R}_0 + o_p(1),
\]
where DML allows us to lump into $o_p(1)$ all cross-products involving at least one residual from $\widetilde{D} - \widetilde{r}_0$, $\widetilde{R}_0$, $\widetilde{V}_0$, and $\widetilde{U}_0$, and at least one of $\hat{\pi}_{DML} - \pi_0$, $\hat{l} - \widetilde{l}_0$, $\hat{r} - \widetilde{r}_0$, and $\widehat{M} - \widetilde{M}_0$ in the Taylor series ($\hat{l}$, $\hat{r}$, and $\widehat{M}$ are, respectively, predictions of $\widetilde{l}$, $\widetilde{r}$, and $\widetilde{M}$ from DML stage one); and the requirement that the base learners converge at rate $o_p(N^{-1/4})$ ensures the same for cross-products involving pairs from $\hat{\pi}_{DML} - \pi_0$, $\hat{l} - \widetilde{l}_0$, $\hat{r} - \widetilde{r}_0$, and $\widehat{M} - \widetilde{M}_0$.
This allows weak-instrument asymptotics to be applied in the usual manner for both the first-stage F and Anderson-Rubin statistics.

Assessing instrument relevance is routine in applied work, as weak instruments bias the 2SLS estimator and invalidate its normal asymptotic properties (see, e.g., Bound et al., 1995). In practice, applied researchers typically rely on the first-stage F-statistic, which is routinely reported in empirical tables. We therefore begin by showing how the conventional first-stage F-statistic can be used within our DML framework, providing a diagnostic that is familiar and immediately interpretable for applied researchers. We then turn to the Anderson-Rubin test statistic and confidence set which, although more robust to weak identification, remain underutilized in empirical practice.

3.2.1 First-stage F-statistic for Panel IV DML

The F-statistic from the linear regression of the endogenous variable on the instrument(s) (first-stage regression) is routinely reported in empirical IV applications and often used as a diagnostic for instrument relevance. In practice, instruments are typically considered sufficiently strong when this statistic exceeds conventional critical values.¹² Given the importance of assessing weak identification for valid inference, we derive expressions for a first-stage F-statistic adapted to our panel IV DML framework.¹³

¹² Conventional Stock and Yogo (2005) critical values can be used for inference in just-identified cases. In practice, a widely used rule of thumb is to consider an instrument strong if the first-stage F-statistic exceeds 10 (Stock and Watson, 2019, pp. 470-471). A more rigorous approach is based on thresholds for the first-stage F derived by Stock and Yogo (2005), who consider the maximal tolerable t-test size distortion, or alternatively how often a 5% t-test rejects a true hypothesis. Recently, Lee et al. (2022) claim these thresholds should be reconsidered. They show that only when the F-statistic exceeds 104.7 are the t-tests ensured not to reject the null hypothesis at a rate higher than the desired one (e.g., the 5% rate).

¹³ Our tests for weak IV can be easily adapted to the partially linear regression model with linear instrumental variables within the cross-sectional DML framework by Chernozhukov et al. (2018). These tests have not yet been developed in that setting at the time of writing.

Definition 3.1. (First-stage F-statistic for Panel IV DML) Let $r \ge 1$ be the number of excluded exogenous instruments, and $N$ the number of subjects in the estimation sample. Assume the conditions in Proposition 3.2 hold such that the DML first-stage estimator, $\hat{\pi}$, is well-behaved. Under the null hypothesis $H_0: \pi = 0$, the F-statistic is
\[
F_{DML} = \frac{1}{r}\, \hat{\pi}'\, \hat{\Sigma}_{\pi\pi}^{-1}\, \hat{\pi}, \tag{3.5}
\]
where $\hat{\Sigma}_{\pi\pi} = \hat{Q}_{vv}^{-1} \big\{ N^{-1} \hat{V}' (\widetilde{D} - \hat{r})(\widetilde{D} - \hat{r})' \hat{V} \big\} \hat{Q}_{vv}^{-1}$, with $\hat{Q}_{vv} = N^{-1} \hat{V}'\hat{V}$, is a consistent estimator of the asymptotic variance-covariance matrix $\Sigma_{\pi\pi}$.

Remark 3.10. The first-stage F-statistic (3.5) relies on the asymptotic normality guaranteed under the DML conditions stated in Proposition 3.2.

Remark 3.11. The first-stage F-statistic (3.5) is robust to non-homoskedastic errors (i.e., serially correlated errors, heteroskedasticity, and clusters), and corresponds to the robust version of the Kleibergen and Paap (2006) F-statistic in a conventional 2SLS setting.¹⁴

¹⁴ In conventional IV settings, the Kleibergen-Paap F-statistic (robust F) is usually reported in tables to argue the strength of the instrument. Alternative statistics suggested by the literature are the effective F-statistic (Montiel Olea and Pflueger, 2013), which is centred around the correct population parameter, unlike the conventional robust F-statistic, and the efficient GMM F-statistic (Windmeijer, 2025). In the just-identified case with one instrument, as in our setting, the effective F and the robust F coincide and conventional Stock and Yogo (2005) critical values can be used (Andrews et al., 2019). We do not implement the effective F-statistic for the panel IV DML estimator because the calculation of the critical values is computationally demanding. In addition, the robust Anderson-Rubin test statistic should be preferred.

3.2.2 AR Test Statistic and Confidence Sets for Panel IV DML

It is well established that statistical inference based on the conventional 2SLS estimator is invalid when instruments are weak, especially when the OLS bias is large (see Keane and Neal, 2024). Therefore, conventional F- and t-based inference becomes unreliable, even when instruments appear strong, whereas tests that are robust to weak identification and do not rely on assumptions about instrument relevance, such as the Anderson-Rubin (AR) test (Anderson and Rubin, 1949), remain valid (Andrews et al., 2019; Moreira and Moreira, 2019; Keane and Neal, 2023).

The AR test evaluates the null hypothesis on the structural parameter by testing the corresponding reduced-form restriction. Specifically, it assesses whether the instrument(s) have a statistically significant reduced-form effect on the outcome under the assumption of instrument exogeneity (i.e., that instruments are as good as randomly assigned). This approach provides valid inference on the structural parameter without requiring strong identification (i.e., relevance).

Definition 3.2.
(Anderson-Rubin Test for Panel IV DML) Let $r \ge 1$ be the number of excluded exogenous instruments, $N$ the number of subjects in the estimation sample, and $\hat{\delta}$ the panel IV DML estimator of the reduced-form equation satisfying the moment condition $E(V'U^{*}) = 0$ such that the equality $\delta = \pi\theta$ holds. When $\theta = \theta_0$, the null hypothesis is $\delta(\theta_0) = 0$ and the Anderson-Rubin Wald test statistic for panel IV DML at the $1 - \alpha$ level is
\[
AR(\theta_0)_{DML} = \hat{\delta}'(\theta_0)\, \hat{\Sigma}_{\delta\delta}^{-1}\, \hat{\delta}(\theta_0) \overset{a}{\sim} \chi^2(r)_{1-\alpha}, \tag{3.6}
\]
where $\hat{\Sigma}_{\delta\delta} = \hat{Q}_{vv}^{-1} \big\{ N^{-1} \hat{V}' (\widetilde{Y} - \hat{l})(\widetilde{Y} - \hat{l})' \hat{V} \big\} \hat{Q}_{vv}^{-1}$, with $\hat{Q}_{vv} = N^{-1} \hat{V}'\hat{V}$, is a consistent estimator of the asymptotic variance-covariance matrix $\Sigma_{\delta\delta}$.

Remark 3.12. The AR test statistic constructed from the DML estimator relies on the asymptotic normality guaranteed under the DML conditions stated in Proposition 3.2.

Definition 3.3. (AR Confidence Sets). Using the Anderson-Rubin test statistic in (3.6), it is possible to construct the set of parameter values $\theta_0$ for which the null hypothesis $H_0: \theta = \theta_0$ is not rejected. The AR confidence set (CS) for the panel IV DML estimator is obtained by test inversion (following Davidson and MacKinnon, 2014) and defined as
\[
CS_{AR}(1 - \alpha) = \big\{ \theta_0 \in \mathbb{R} : AR(\theta_0)_{DML} \le \chi^2(r)_{1-\alpha} \big\}. \tag{3.7}
\]
As in the conventional 2SLS estimation method, the solutions of the quadratic form associated with the AR test statistic are: (a) a bounded interval $[x, y]$, with two real roots $x \le y$; (b) a disjoint set $(-\infty, x] \cup [y, +\infty)$, with two real roots $x < y$; (c) the real line $(-\infty, +\infty)$; and (d) the empty set, which is only possible in overidentified settings. AR CS (c) and (d) have no real roots.

The type of AR CS is informative about the relevance of the instrument. When instruments are strong, the AR CS is bounded, as in case (a), and the model is well-identified.
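The first-stage statistic in (3.5) and the test inversion in (3.7) can be sketched for a single instrument ($r = 1$). This is a minimal illustration under stated assumptions rather than the authors' implementation: residuals from the learners are taken as given, all function names and toy numbers are hypothetical, the sandwich "meat" is clustered by unit, the estimated asymptotic variance is scaled by $1/N$ so that the statistics are ordinary Wald statistics, and for a general $\theta_0$ the outcome residual is replaced by $(\widetilde{Y} - \hat{l}) - \theta_0(\widetilde{D} - \hat{r})$, which is the usual way of imposing the null. Under these choices the inequality $AR(\theta_0) \le \chi^2(1)_{1-\alpha}$ is quadratic in $\theta_0$, so the confidence set follows from the roots and the sign of the leading coefficient.

```python
CHI2_1_95 = 3.841459  # 95% critical value of a chi-square with 1 degree of freedom


def cluster_sums(res_z, res_other, unit):
    """Sum res_z * res_other within each unit (cluster)."""
    sums = {}
    for i, v, e in zip(unit, res_z, res_other):
        sums[i] = sums.get(i, 0.0) + v * e
    return list(sums.values())


def first_stage_f(res_d, res_z, unit):
    """Robust first-stage statistic for r = 1, with the null pi = 0 imposed."""
    b = cluster_sums(res_z, res_d, unit)
    return sum(b) ** 2 / sum(s * s for s in b)


def ar_stat(theta0, res_y, res_d, res_z, unit):
    """AR statistic for H0: theta = theta0, using null-imposed residuals."""
    e0 = [yy - theta0 * dd for yy, dd in zip(res_y, res_d)]
    s = cluster_sums(res_z, e0, unit)
    return sum(s) ** 2 / sum(x * x for x in s)


def ar_confidence_set(res_y, res_d, res_z, unit, crit=CHI2_1_95):
    """Invert the AR test: solve qa*t^2 + qb*t + qc <= 0 for t = theta0."""
    a = cluster_sums(res_z, res_y, unit)
    b = cluster_sums(res_z, res_d, unit)
    big_a, big_b = sum(a), sum(b)
    qa = big_b ** 2 - crit * sum(x * x for x in b)
    qb = -2.0 * (big_a * big_b - crit * sum(x * y for x, y in zip(a, b)))
    qc = big_a ** 2 - crit * sum(x * x for x in a)
    if qa == 0.0:
        raise ValueError("knife-edge linear case is not handled in this sketch")
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return ("empty", None, None) if qa > 0.0 else ("real line", None, None)
    root = disc ** 0.5
    lo, hi = sorted(((-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)))
    # qa > 0: set lies between the roots; qa < 0: set lies outside the roots.
    return ("bounded", lo, hi) if qa > 0.0 else ("disjoint", lo, hi)


# Hypothetical residuals: 6 units, one first-differenced period each.
unit = [0, 1, 2, 3, 4, 5]
res_z = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]
res_d = [1.0, 1.0, 1.1, 0.9, 0.9, 1.1]
res_y = [2.1, 1.9, 2.2, 1.9, 1.7, 2.2]
f_stat = first_stage_f(res_d, res_z, unit)
kind, lo, hi = ar_confidence_set(res_y, res_d, res_z, unit)
```

In this sketch the leading coefficient `qa` is positive exactly when `first_stage_f` exceeds the critical value, so the set is a bounded interval only when the first stage is strong enough by that measure; with one instrument the empty set cannot arise, because the statistic is zero at the point estimate.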
When instruments are sufficiently weak, the AR CS is usually unbounded (cases b-c) and uninformative, since a bounded set generally does not exist locally when the parameter is not identified (Davidson and MacKinnon, 2014). Empty sets (case d) occur when the equality $\delta = \pi\theta_0$ fails due to either (a) treatment effect heterogeneity, when the equality holds for $\theta \ne \theta_0$, or (b) invalidity of the instruments, i.e., the failure of the overidentifying restrictions, when there is no value of $\theta$ that satisfies the inequality (as in Andrews et al., 2019).

4 Empirical Applications: Shift-Share IVs

In this section, we showcase the applicability of our panel IV DML method.¹⁵ For a comprehensive illustration, we revisit three empirical studies that examine whether and how immigration affects the political views of native voters towards immigration in the United States (Tabellini, 2020) and Europe (Moriconi et al., 2019, 2022). As is typical in the migration literature, an instrumental variable strategy (2SLS) is employed to account for the potentially endogenous decision of immigrants to settle across areas: on the one hand, they may be more likely to settle in areas where the economy or attitudes towards immigration are more favourable; on the other hand, they may decide to settle in places where housing is more affordable and which are economically declining.

These three studies adopt modified versions of the widely used shift-share instrument (Card, 2001), which exploits the tendency of immigrants to settle in areas with larger pre-existing communities from the same country of origin or ethnic group.¹⁶ The version of the shift-share instrument used in these articles is constructed as the weighted average of new national inflows from each immigrant's country of origin (the 'shifts'), using fixed pre-existing local settlement patterns (the 'shares') as weights.
¹⁵ The panel IV DML estimation is conducted in R using the latest version of the xtivdml package available at https://github.com/POLSEAN/xtivdml at the time of writing, which is built on the R packages DoubleML (Bach et al., 2024a) and xtdml (Polselli, 2025). The conventional 2SLS estimation is implemented in Stata 19.5 using the community-contributed commands xtivreg2 (Schaffer, 2005) and twostepweakiv (Sun, 2018).

¹⁶ The focus on articles that exclusively use a shift-share IV is motivated by the fact that the instrument is not randomly assigned, by construction, and a rich set of covariates is typically required to make the instrument approximately exogenous, namely as good as randomly assigned. This setting is well suited to our panel IV DML approach, which flexibly adjusts for high-dimensional confounding.

In formulae,
\[
Z_{rt} = \frac{1}{Pop_{rT_0}} \sum_{c \in C} Sh_{crT_0} \sum_{r \in R} M_{crt}, \tag{4.1}
\]
where $Pop_{rT_0}$ is the total population in city/region $r$ in the pre-settlement period $T_0$, $Sh_{crT_0} = M_{crT_0} / \sum_{r} M_{crT_0}$ is a fixed share of pre-existing immigrants, $M_{crt}$ is the national stock (inflows) of immigrants, $C$ is the set of countries of origin, and $R$ is the set of receiving cities/regions.

Regarding the validity of the assumptions underlying the shift-share instrument, historical settlement patterns (the 'shares') are generally strong predictors of current immigrant inflows and therefore are often found to satisfy the relevance condition in conventional 2SLS settings. The exogeneity of these instruments and the validity of the exclusion restriction have recently been at the center of a number of papers, which highlight that one of the two components (either the 'shifts' or the 'shares') must be exogenous (e.g., Adão et al., 2019; Borusyak et al., 2022; Goldsmith-Pinkham et al., 2020).
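The construction in (4.1) can be illustrated numerically for a single period $t$. The sketch below is a minimal illustration with made-up numbers; the function name `shift_share`, the dictionary layout and the country and region labels are hypothetical. It implements the plain formula only and none of the study-specific modifications.

```python
def shift_share(base_stock, inflows, base_pop):
    """Shift-share instrument Z_rt of (4.1) for one period t.

    base_stock[c][r]: immigrants from origin c living in region r at T0.
    inflows[c][r]: immigrants from origin c arriving in region r in period t.
    base_pop[r]: total population of region r at T0.
    """
    z = {}
    for r in base_pop:
        total = 0.0
        for c in base_stock:
            # 'Share': region r's fraction of origin c's immigrants at T0.
            share = base_stock[c][r] / sum(base_stock[c].values())
            # 'Shift': national inflow from origin c in period t.
            national_inflow = sum(inflows[c].values())
            total += share * national_inflow
        z[r] = total / base_pop[r]
    return z


# Hypothetical data: two origins, two receiving regions.
base_stock = {"IT": {"A": 30, "B": 10}, "DE": {"A": 20, "B": 20}}
inflows = {"IT": {"A": 6, "B": 2}, "DE": {"A": 1, "B": 3}}
base_pop = {"A": 1000, "B": 500}
z_t = shift_share(base_stock, inflows, base_pop)
```

In this toy example region A holds 75% of the first origin's and 50% of the second origin's initial stock, so its predicted inflow is 0.75 × 8 + 0.5 × 4 = 8, or 0.008 per initial resident.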
In our empirical applications, we abstract from this debate and take as given the arguments provided in the original papers in support of these assumptions. Instead, we focus on how our panel IV DML method can address the potential presence of a large number of possibly nonlinear confounders. Depending on the setting, the exogeneity of the instrument may only be valid once conditioning on certain controls (see, for instance, Borusyak et al., 2025). These controls may enter the true nuisance functions nonlinearly, and ignoring such nonlinearities can affect the estimation of treatment effects due to model misspecification.

In our re-analysis, we compare the results from our panel IV DML estimator to those obtained from conventional 2SLS with FD, which is the closest panel data estimation method to our estimation approach.¹⁷ As the original estimation approaches are either 2SLS with FE (Tabellini, 2020) or pooled ordinary least squares with region and year fixed effects (Moriconi et al., 2019, 2022) due to the structure of the data, we additionally contrast the estimates obtained from a 2SLS regression with FE with 2SLS with FD in each empirical application. This preliminary step allows us to verify that any differences with the panel IV DML method arise only from the use of a different estimation methodology and not from different specification choices (see Online Appendices C and D for an exhaustive discussion of conventional panel data estimation results).

¹⁷ As previously discussed in Section 2.3, the FD (exact) approach allows us to approximate the transformed unknown nuisance functions in nonlinear settings without imposing many constraints on the fixed effects, unlike the within-group transformation (or fixed effects) and the correlated random effects device. The other two approaches are possible but may lead to inconsistent estimates when the true functional form is highly nonlinear (for further details, see Clarke and Polselli, 2025).

In our panel IV DML regressions, the functional form of the confounding variables is learned flexibly using machine learning algorithms from each major class (i.e., Lasso for $L_1$-norm regularized linear models, gradient boosting with 1000 trees for tree-based methods, and a single-hidden-layer neural network for deep learning), allowing the model to capture a broad range of nonlinearities in the data. The hyperparameters of the base learners are tuned with grid search (Bergstra and Bengio, 2012) (see Table F.1 in the Online Appendix F). The set of covariates employed as inputs in panel IV DML estimation with neural network and gradient boosting includes raw covariates only; no interaction terms or polynomials are included because these base learners are designed to automatically capture nonlinearities in the data. Conversely, panel IV DML with Lasso uses an extended dictionary of nonlinear terms of the raw covariates (i.e., polynomials up to order three and interaction terms of all covariates) to satisfy the weak sparsity assumption. The inputs used in panel IV DML estimation also include one-period lags of all control variables, following the FD (exact) approach discussed in Section 2.3. In addition, the sample in panel IV DML regressions is divided into two folds, where the number of folds is chosen to account for the small cross-sectional dimension ($N$) in all empirical applications.¹⁸

¹⁸ Dividing the sample into a higher number of folds considerably reduces the size of the estimation sample, which may then differ in terms of observable characteristics from the prediction sample. In such cases, the learner would need to extrapolate, which is not desirable for flexible tree-based learners and neural networks. We allow for cross-fitting, as is desirable to restore efficiency.

4.1 Empirical Example: Tabellini (2020)

The first empirical application revisits the study by Tabellini (2020), which examines the political and economic effects of restrictive immigration policies induced by World War I and the Immigration Acts of the 1920s in the United States (U.S.) that reduced the quota of European immigrants. The analysis exploits the exogenous variation in immigration from European countries to 180 U.S. cities over three census years: 1910, 1920 and 1930. Data are collected from various sources: U.S. Census of Population, Voteview, Census of Manufactures.

The author employs an instrumental variable strategy to address the potentially endogenous settlement patterns of European immigrants across U.S. cities. The instrument used is a leave-out version of the shift-share measure defined in equation (4.1), which interacts historical settlement patterns of different ethnic groups in each city in 1900 with contemporary immigration flows of the same group, excluding immigrants who ultimately settle in the same city. The identifying assumption for the validity of the instrument is verified by the author with a pre-trend test, which confirms that, prior to 1900 (before any network formation), European immigrants did not disproportionately settle in cities experiencing economic growth or political change. The original analysis is conducted using conventional 2SLS estimation with fixed effects (FE). The main findings of the article suggest that immigration induced hostile political reactions, such as the election of more conservative legislators, stronger support for anti-immigration legislation, and lower redistribution.
We re-analyze the baseline specifications from Table 3 (Column 4, Panel B) and Table 5 (Column 1, Panel B) in Tabellini (2020) using conventional 2SLS with FD and our panel IV DML method with the FD approach. We focus on two outcomes: for the political effects, we look at the Poole-Rosenthal DW Nominate Score, which ranks congressmen on an ideological scale from liberal to conservative using voting behaviour on previous roll-calls; for the economic effects, we consider the (log) occupational score, which is a proxy for natives' income and does not capture within-occupation changes in earnings.¹⁹ The baseline specifications include only interaction terms of region and year fixed effects.

We then depart from the original analysis by augmenting the baseline specifications with additional control variables: predicted industrialization, immigrant and city population, value added manufacturing, skill ratios, fraction of blacks, value of products, and employment share in manufacturing. In the original analysis, each of these control variables is added one at a time to the baseline specification to evaluate the robustness of the results.²⁰ In practice, the identification of the effect with the inclusion of many irrelevant controls in the estimating equation may not be possible with conventional estimation methods, such as least squares, due to matrix singularity and multicollinearity issues. In contrast, DML methods (for both cross-sectional and panel data) can handle high-dimensional covariate spaces.

Before discussing panel IV DML results, we briefly compare conventional 2SLS estimates obtained using fixed effects (FE) and first-differences (FD) (see the Online Appendix C for a detailed discussion). This preliminary step ensures that any differences observed against our panel IV DML estimator reflect methodological rather than specification choices.
¹⁹ As explained by the author, wage data are not available until 1940. Occupational scores, which are commonly used in the literature to proxy lifetime earnings, are calculated by assigning to individuals the median income of their job category in 1950.

²⁰ For instance, see Tables D2-D3 in the Online Appendix of Tabellini (2020). We thank Marco Tabellini and Francesco Maria Toti Ognibene for providing us with the material to replicate Tables D2-D3 in the Online Appendix of Tabellini (2020).

Table 1. The political and economic effects of immigration

Dependent variable: DW Nominate Score in Columns (1)-(4); Log Occupational Score in Columns (5)-(8).

| | (1) 2SLS | (2) DML-Lasso | (3) DML-NNet | (4) DML-Boosting | (5) 2SLS | (6) DML-Lasso | (7) DML-NNet | (8) DML-Boosting |
|---|---|---|---|---|---|---|---|---|
| Panel A: Baseline specification with fixed-effects interactions | | | | | | | | |
| Second-stage results | | | | | | | | |
| Fr. Immigrants | 1.772** | 2.738*** | 2.359*** | 2.84*** | 0.095** | 0.042 | 0.01 | 0.065 |
| | (0.829) | (0.777) | (0.79) | (0.874) | (0.042) | (0.041) | (0.056) | (0.053) |
| AR 95% CS | [0.049, 3.495] | [0.382, 4.462] | [-0.05, 4.559] | [0.008, 4.921] | [0.017, 0.173] | [-0.103, 0.166] | [-0.121, 0.144] | [-0.133, 0.175] |
| First-stage results | | | | | | | | |
| Shift Share IV | 0.965*** | 1.091*** | 1.051*** | 0.967*** | 0.933*** | 0.982*** | 0.98*** | 0.957*** |
| | (0.226) | (0.154) | (0.158) | (0.156) | (0.094) | (0.11) | (0.112) | (0.123) |
| F stat | 18.23 | 49.284 | 43.838 | 38.101 | 99.45 | 78.489 | 74.733 | 59.899 |
| AR χ² | 3.84* | 11.221*** | 7.869*** | 9.439*** | 5.40** | 0.858 | 0.029 | 1.031 |
| Quality of learners | | | | | | | | |
| Model RMSE | 0.232 | 0.406 | 0.414 | 0.419 | 0.019 | 0.039 | 0.039 | 0.044 |
| MSE of l | | 0.291 | 0.297 | 0.3 | | 0.024 | 0.023 | 0.027 |
| MSE of r | | 0.037 | 0.035 | 0.034 | | 0.035 | 0.035 | 0.035 |
| MSE of m | | 0.021 | 0.02 | 0.021 | | 0.024 | 0.023 | 0.023 |
| Observations | 303 | 303 | 303 | 303 | 342 | 342 | 342 | 342 |
| No. clusters | 157 | 157 | 157 | 157 | 125 | 125 | 125 | 125 |
| Panel B: Specification with all controls (not originally implemented) | | | | | | | | |
| Second-stage results | | | | | | | | |
| Fr. Immigrants | 0.928 | 2.184 | 1.807 | 0.465 | 0.094 | 0.05 | 0.001 | -0.039 |
| | (2.103) | (1.507) | (1.85) | (1.52) | (0.060) | (0.08) | (0.081) | (0.125) |
| AR 95% CS | [-4.276, 4.883] | [-2.252, 9.237] | [-4.938, 10.666] | [-4.131, 6.268] | [-0.032, 0.207] | [-0.15, 0.241] | [-0.158, 0.183] | [-0.271, 0.255] |
| First-stage results | | | | | | | | |
| Shift Share IV | 0.541*** | 0.528*** | 0.47*** | 0.657*** | 0.745*** | 0.792*** | 0.85*** | 0.589*** |
| | (0.178) | (0.112) | (0.115) | (0.156) | (0.117) | (0.06) | (0.116) | (0.139) |
| F stat | 9.26 | 21.987 | 16.531 | 17.587 | 40.89 | 173.047 | 52.899 | 17.575 |
| AR χ² | 0.18 | 2.003 | 1.011 | 0.148 | 2.2 | 0.44 | 0.001 | 0.066 |
| Quality of learners | | | | | | | | |
| Model RMSE | 0.218 | 0.406 | 0.447 | 0.425 | 0.019 | 0.039 | 0.039 | 0.044 |
| MSE of l | | 0.292 | 0.319 | 0.306 | | 0.024 | 0.024 | 0.027 |
| MSE of r | | 0.028 | 0.027 | 0.031 | | 0.031 | 0.031 | 0.031 |
| MSE of m | | 0.018 | 0.017 | 0.016 | | 0.022 | 0.021 | 0.023 |
| Observations | 297 | 297 | 297 | 297 | 338 | 338 | 338 | 338 |
| No. clusters | 154 | 154 | 154 | 154 | 125 | 125 | 125 | 125 |

Note: Panel A reports our estimates based on the baseline specifications in Table 3 (Column 4, Panel B) and Table 5 (Column 2, Panel B) of Tabellini (2020). Columns 1 and 5 display our estimates from conventional 2SLS regression with FD transformation, and Columns 2-4 and 6-8 our estimates from panel IV DML estimation with different base learners. Panel B is not implemented in the original article, and adds several controls to the baseline specifications in Panel A, including: predicted industrialization, log 1900 city and immigrant population, log of value added per establishment in 1904, natives' 1900 skill ratios, 1900 fraction of blacks, 1904 log value of products per establishment, and the 1904 employment share in manufacturing. The number of raw control variables from the original analysis is 77 in Panel A (Columns 1-4), 79 in Panel A (Columns 5-8), 96 in Panel B (Columns 1-4), and 92 in Panel B (Columns 5-8). Slight differences in the number of covariates in the panel IV DML estimates are due to the type of learner and the FD (exact) approach. Specifically, the set of confounding variables in panel IV DML estimations with NNet and Boosting (Columns 3-4, 7-8) does not include any interaction terms, unlike the original analysis, because these base learners are designed to capture possible nonlinearities in the data. Panel IV DML estimation with Lasso (Columns 2 and 6) uses an extended dictionary of the raw variables, which includes polynomials up to order three and interaction terms between all the covariates, to satisfy Lasso's weak sparsity assumption. The number of covariates used in panel IV DML estimation (Columns 2-4, 6-8) also includes the lags of all covariates, following the FD (exact) approach. Observations for which data are unavailable are dropped from the final estimation sample; therefore, when there are only two time periods, a unit with at least one missing case is dropped after first-differencing. The difference in the number of observations between estimation methods is explained by missing values generated after transforming the covariates to use as inputs in the machine learning algorithms. Panel IV DML technical note: 2 folds, cross-fitting, hyperparameters tuned as per Table F.1. Standard errors in parentheses are clustered at the metropolitan area level in Columns 1-4 and at the city code level in Columns 5-8. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

In the baseline specifications, the shift-share instruments appear strong with both FE and FD estimators for the political outcome (DW Nominate Score) and the economic outcome (Log Occupational Score). The second-stage coefficients are positive and significant at least at the 5% level. When additional controls are included, an extension not implemented in the original analysis, the shift-share instrument becomes weak in the specification of the political outcome while remaining strong in the specification of the economic outcome with both panel data estimators. Regardless of the strength of the instrument, both estimators produce statistically insignificant effects, in contrast to the results found in the baseline regression.
In general, the two estimators produce very similar results in terms of sign, magnitude and statistical significance, ensuring that subsequent differences between 2SLS and panel IV DML can be attributed to differences in methodology.

We now proceed to evaluate how our panel IV DML estimates compare to the corresponding 2SLS estimates with FD. Table 1 reports the results of the baseline specifications in Panel A, and of the augmented specifications with all control variables in Panel B, for the two outcome variables of interest. Columns 1 and 5 of Table 1 report the 2SLS estimates with FD, and Columns 2-4 and 6-8 the panel IV DML estimates with FD using Lasso, Neural Network (NNet), and Gradient Boosting (Boosting). For each specification, we need to verify whether the shift-share instruments still predict immigrant shares across cities (i.e., instrument relevance) within the DML framework: because our estimator controls for covariates more flexibly, the effective strength of the instrument may differ.

We start by discussing the baseline regression estimates (Panel A of Table 1), focusing on one outcome at a time. For the political outcome (Columns 2-4), the first-stage F-statistics from panel IV DML regressions are much larger than the 2SLS one, which only slightly exceeds the Stock and Yogo (2005) cut-off of 16.30, but never exceed the Lee et al. (2022) threshold of 104.70. In this case, panel IV DML strengthens the relevance of the shift-share instrument. The AR test, whose null hypothesis is the absence of a second-stage effect and which is valid regardless of the strength of the instrument, strongly rejects the null with panel IV DML, suggesting some effect of immigration on the political outcome.
This is confirmed by the 95% AR confidence sets (CS), which are bounded, include the estimated second-stage coefficient and exclude zero, with the exception of the neural network specification, owing to the small sample size (further reduced by sample splitting). Given the strength of the instrument (based on the Stock and Yogo (2005) threshold), we can comment on the statistical significance of the second-stage coefficient and make inference about the effect of immigration. All second-stage coefficients are positive and significant (at the 5% level with 2SLS and at the 1% level with panel IV DML); specifically, the panel IV DML coefficients are larger in magnitude than the 2SLS one, suggesting a stronger effect of the fraction of immigrants on the political outcome than originally found. Overall, panel IV DML findings indicate that a greater inflow of immigrants leads to the election of more conservative congressmen; the direction of the effect is aligned with the original 2SLS findings, whose magnitude appears to be underestimated.

Moving to the economic outcome (Columns 5-8 of Panel A), the first-stage F-statistics from panel IV DML are smaller than those from 2SLS, but always well above the conventional critical value of 16.30 by Stock and Yogo (2005) and below the 104.70 threshold of Lee et al. (2022). Therefore, the instrument can still be considered strong with panel IV DML. Unlike the 2SLS results, the AR test statistics and AR CS from panel IV DML regressions point to the absence of any effect on the economic outcome. That is, in Columns 6-8, the AR test never rejects the null hypothesis of no treatment effect, and the AR CS always include zero as a possible value of the treatment effect. Given the strength of the IV and the results of the AR diagnostics, we can interpret the second-stage coefficients as not statistically different from zero with panel IV DML.
By contrast, 2SLS finds a positive and statistically significant (at the 5% level) second-stage coefficient. Overall, panel IV DML findings contradict the conventional 2SLS results: they provide no evidence that employment gains for natives induced by immigration were accompanied by occupational or skill upgrading, as found in the original article.

We now focus on Panel B of Table 1, where all controls are included simultaneously (not originally implemented). In both specifications (for the political and economic outcomes), the first-stage F-statistics generally become smaller than those in Panel A. In Columns 2-4 of Panel B, the shift-share instrument in the panel IV DML regressions produces stronger first stages (F > 16.30) than the corresponding 2SLS regression, where the instrument is clearly weak (F = 9.26). In Columns 5-8 of Panel B, the first-stage F-statistics from both 2SLS and panel IV DML are above the Stock and Yogo (2005) threshold, and in the case of panel IV DML with Lasso even above the Lee et al. (2022) threshold of 104.70. Overall, results for Columns 2-8 provide supporting evidence of a strong instrument. This pattern suggests that allowing for flexible and potentially nonlinear control adjustment in the instrument equation strengthens the predictive power of the instrument with panel IV DML. The AR test statistics are insignificant with both estimators in both the political and economic specifications, so we cannot reject the null of no second-stage effect. The absence of a causal effect for either outcome variable is supported by the AR CS, which include zero as a plausible value of the treatment effect, as well as by the statistical insignificance of the second-stage coefficients with both the 2SLS and panel IV DML estimators.
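The weak-identification diagnostics reported in Table 1 can be illustrated with a minimal sketch for the just-identified case. It assumes that the outcome, treatment and instrument have already been first-differenced and residualised on the covariates; the names `y_res`, `d_res`, `z_res` and both helper functions are hypothetical and are not the authors' code. The variance is cluster-robust without finite-sample corrections, so the numbers need not match any particular software. The first-stage F is the squared cluster-robust t-ratio of the instrument, and the AR χ² statistic at a hypothesised value `theta0` is the analogous statistic from regressing `y_res - theta0 * d_res` on `z_res`.

```python
from collections import defaultdict


def cluster_robust_slope(x, y, clusters):
    """No-intercept OLS slope of y on x and its cluster-robust variance."""
    sxx = sum(xi * xi for xi in x)
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    # Sum the scores x_i * residual_i within each cluster.
    scores = defaultdict(float)
    for xi, yi, g in zip(x, y, clusters):
        scores[g] += xi * (yi - slope * xi)
    variance = sum(s * s for s in scores.values()) / sxx ** 2
    return slope, variance


def first_stage_F(d_res, z_res, clusters):
    """Squared t-ratio of the single instrument in the first stage."""
    pi_hat, variance = cluster_robust_slope(z_res, d_res, clusters)
    return pi_hat ** 2 / variance


def ar_chi2(y_res, d_res, z_res, clusters, theta0=0.0):
    """Anderson-Rubin statistic for H0: theta = theta0 (chi-square, 1 df)."""
    e0 = [y - theta0 * d for y, d in zip(y_res, d_res)]
    gamma, variance = cluster_robust_slope(z_res, e0, clusters)
    return gamma ** 2 / variance
```

Under these assumptions, an AR statistic above 3.841 rejects the hypothesised value at the 5% level, while the 16.30 and 104.70 thresholds discussed above apply to the F statistic.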
This first empirical application highlights that our method can enhance the plausibility of the instrument validity assumption by controlling for more, and potentially nonlinear, confounding variables. We show that our panel IV DML estimator generally confirms the conventional 2SLS results, with the only exception of the economic outcome in the baseline specifications (Panel A of Table 1). Our robustness analysis with additional control variables (Panel B of Table 1) illustrates how the panel IV DML method can be applied in shift-share designs, where conventional estimators may fail. We find that the main effect of the original article (Columns 1 and 5 in Panel A of Table 1) for both outcomes disappears in both 2SLS and panel IV DML regressions.

4.2 Empirical Examples: Moriconi et al. (2019) and Moriconi et al. (2022)

In this section, we revisit two related studies by Moriconi et al. (2019, 2022) on the impact of high-skilled (HS) and low-skilled (LS) immigration on support for redistribution policies and perceived attitudes in European countries. As in Tabellini (2020), endogeneity concerns arise because immigrants may decide to settle in areas with more favourable policies and attitudes towards immigration. To address these issues, an IV approach is employed with a modified version of the shift-share instrumental variable (4.1), defined by skill-specific group (i.e., HS and LS). The instruments are constructed by interacting the aggregate immigrant flows by skill group and country of origin (the 'shift') with the initial distribution of immigrants by nationality across regions (the 'share'), as in Mayda et al. (2022). The authors show that the skill-specific shift-share instruments are uncorrelated with economic and demographic regional trends in the period before the analysis, which supports the exclusion restriction.
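A generic shift-share construction of this kind can be sketched as follows. The data layout (nested dictionaries keyed by region, origin country and period) and the function name are illustrative assumptions; the exact definition in equation (4.1), including any normalisation by population or leave-one-out adjustment, is not reproduced here. For a skill-specific instrument, one would pass that skill group's aggregate flows as the shifts.

```python
def shift_share(shares, shifts):
    """Interact initial shares with aggregate shifts.

    shares[region][origin]: initial share of origin-country immigrants
        settled in the region (the 'share').
    shifts[origin][period]: aggregate inflow from that origin country
        in the period (the 'shift').
    Returns instrument[region][period] = sum over origins of share * shift.
    """
    periods = sorted({t for by_period in shifts.values() for t in by_period})
    instrument = {}
    for region, by_origin in shares.items():
        instrument[region] = {
            t: sum(
                share * shifts.get(origin, {}).get(t, 0.0)
                for origin, share in by_origin.items()
            )
            for t in periods
        }
    return instrument


# Hypothetical toy inputs: two regions, two origin countries, two periods.
example_shares = {"A": {"IT": 0.6, "PL": 0.1}, "B": {"IT": 0.4, "PL": 0.9}}
example_shifts = {"IT": {2010: 100, 2014: 50}, "PL": {2010: 200, 2014: 400}}
example_iv = shift_share(example_shares, example_shifts)
```

Because the shares are fixed at their initial values, all time variation in the instrument comes from the aggregate shifts, which is what the identification argument relies on.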
The data for the analyses in both articles are obtained from the European Social Survey (ESS), the European Labor Force Survey (EULFS), the Manifesto Project Database and Eurostat. The main analyses are implemented on a dataset of individual voters from twelve European countries observed over the election years between 2007 and 2016. The final dataset is not a conventional panel dataset, in which the same subject is observed over multiple time periods, as it contains information on randomly sampled native voters across regions of the same twelve EU countries in each election year. By contrast, our panel IV DML method requires (balanced or unbalanced) panel data in which the same subject is followed over at least two consecutive periods. To satisfy this requirement, we construct an aggregated (unbalanced) panel dataset from the individual data by averaging variables at the NUTS2 region-year level.²¹

²¹ In the original analyses, both the endogenous variables and the instruments are at the NUTS2 region-year level. Therefore, the level of aggregation of these variables does not change in our aggregated data; aggregation affects only the outcome variables and the individual-level controls.

Both studies estimate the main specifications through pooled ordinary least-squares (POLS) regressions with regional and election-year fixed effects, using individual voters' data. Therefore, we first re-estimate the key specifications using 2SLS with fixed effects (FE) and 2SLS with first differences (FD) on our aggregated (unbalanced) panel dataset to establish a consistent comparison with our panel IV DML method.

4.2.1 Empirical Application: Moriconi et al. (2019)

The article investigates the effect of skilled and unskilled immigration on individual preferences for the expansion of the welfare state and public education in twelve EU countries.
While the overall effect of migration varies, their main findings highlight a pro-redistribution effect of HS immigration among native voters.

In this second empirical application, we re-examine the main results of Moriconi et al. (2019) in Columns 2, 4 and 6 (Panels A and B) of their Table 4 (on individual voters' data).²² The dependent variables are Net Welfare State and Net Public Education. The control variables are the same as in the original article: the share of women, average age, share of tertiary/post-tertiary education, average GDP per capita (in log), share of tertiary sector (in log), average unemployment rate, and election year dummies. The regressions are estimated separately by skill-specific group of immigrants.

²² In the Online Appendix D.1, we revisit the specifications in Table 5 of Moriconi et al. (2019), estimated using parties' data. We thank the authors for providing us with the entire replication package to reproduce the main analysis.

Before discussing the panel IV DML results, we briefly compare the results from conventional 2SLS estimators with FE and FD (see Online Appendix D.1 for more details). In general, both FE and FD estimators confirm that the instrument is strong in the HS immigration sample (F ≫ 16.30), but borderline weak in the LS immigration sample (10 < F < 16.30) for both specifications. AR diagnostics, not computed in the original study, indicate the presence of a second-stage effect of HS immigrants and, unlike the original article, of LS immigrants on both outcomes. The AR CS from both panel estimators agree on the direction of the effect of HS immigrants on welfare expansion, and of LS immigrants on education expansion. Therefore, any difference observed later between 2SLS with FD and panel IV DML can be attributed to the methodology adopted, i.e. 2SLS versus panel IV DML.

Table 2. Political preferences over 2007–2016 – Aggregated individual voters

Sample: HS immigrants (Columns 1-4); LS immigrants (Columns 5-8).

| | 2SLS (1) | DML-Lasso (2) | DML-NNet (3) | DML-Boosting (4) | 2SLS (5) | DML-Lasso (6) | DML-NNet (7) | DML-Boosting (8) |
|---|---|---|---|---|---|---|---|---|
| Panel A: Net Welfare State | | | | | | | | |
| Second-stage results | | | | | | | | |
| Fr. Immigrants | 0.054*** | 0.368* | 0.59** | 0.34 | 0.05 | 0.057 | 0.327 | -0.777 |
| | (0.014) | (0.189) | (0.285) | (0.276) | (0.033) | (0.09) | (0.526) | (1.082) |
| Robust Weak IV Tests | | | | | | | | |
| AR χ² stat | 15.36*** | 13.114*** | 15.398*** | 7.692*** | 3.6* | 1.091 | 1.743 | 7.036*** |
| AR 95% CS | [0.029, 0.081] | (-∞, -0.154] ∪ [-0.008, +∞) | (-∞, -0.288] ∪ [-0.019, +∞) | (-∞, +∞) | [0.003, 0.159] | (-∞, +∞) | (-∞, +∞) | (-∞, +∞) |
| Quality of learners | | | | | | | | |
| Model RMSE | 0.135 | 0.374 | 0.518 | 0.323 | 0.145 | 0.213 | 0.608 | 1.444 |
| MSE of l | | 0.145 | 0.147 | 0.139 | | 0.145 | 0.147 | 0.139 |
| MSE of r | | 1.009 | 0.768 | 0.743 | | 1.369 | 1.779 | 1.711 |
| MSE of m | | 0.3 | 0.295 | 0.338 | | 0.674 | 0.645 | 0.695 |
| Panel B: Net Public Education | | | | | | | | |
| Second-stage results | | | | | | | | |
| Fr. Immigrants | 0.024 | 0.265** | 0.333** | 0.19 | -0.088* | -0.125 | -0.279 | -0.07 |
| | (0.015) | (0.118) | (0.133) | (0.145) | (0.046) | (0.135) | (0.52) | (0.134) |
| Robust Weak IV Tests | | | | | | | | |
| AR χ² stat | 2.78* | 8.414*** | 7.194*** | 2.364 | 6.65*** | 3.003* | 0.608 | 0.024 |
| AR 95% CS | [-0.004, 0.055] | (-∞, +∞) | (-∞, +∞) | (-∞, +∞) | [-0.25, -0.022] | (-∞, +∞) | (-∞, +∞) | (-∞, +∞) |
| Quality of learners | | | | | | | | |
| Model RMSE | 0.139 | 0.306 | 0.314 | 0.227 | 0.181 | 0.276 | 0.571 | 0.264 |
| MSE of l | | 0.149 | 0.14 | 0.146 | | 0.149 | 0.14 | 0.146 |
| MSE of r | | 1.009 | 0.768 | 0.743 | | 1.369 | 1.779 | 1.711 |
| MSE of m | | 0.3 | 0.295 | 0.338 | | 0.674 | 0.645 | 0.695 |
| Panels A and B | | | | | | | | |
| First-stage results | | | | | | | | |
| Shift Share IV | 1.655*** | 0.487** | 0.491** | 0.341 | 0.522*** | 0.259 | 0.209 | -0.204 |
| | (0.263) | (0.199) | (0.215) | (0.236) | (0.162) | (0.198) | (0.335) | (0.293) |
| F stat | 39.51 | 5.904 | 5.127 | 2.056 | 10.37 | 1.681 | 0.383 | 0.478 |
| Observations | 146 | 146 | 146 | 146 | 146 | 146 | 146 | 146 |
| No. clusters | 113 | 113 | 113 | 113 | 113 | 113 | 113 | 113 |

Note: The table displays our estimates based on Specifications (2), (4) and (6) of Table 4 (Panels A and B) in Moriconi et al. (2019), obtained from conventional 2SLS regression with FD transformation (Columns 1 and 5) and from our panel IV DML estimation with different base learners (Columns 2-4 and 6-8). The sample is aggregated at the regional (NUTS2) level to construct an unbalanced panel dataset. The treatment and instrumental variables refer to the fraction of high-skilled immigrants in Columns 1-4 and of low-skilled immigrants in Columns 5-8. The dependent variable is 'Net Welfare State' in Panel A and 'Net Public Education' in Panel B. Raw control variables in all panels are: the share of women, average age, share of tertiary/post-tertiary education, average GDP per capita (in log), share of tertiary sector (in log), average unemployment rate, and election year dummies. The set of control variables in the panel IV DML estimation with NNet and Boosting does not include interaction terms because these base learners are designed to capture nonlinearities in the data. Panel IV DML estimation with Lasso (Columns 2 and 6) uses an extended dictionary of the raw variables, including polynomials up to order three and interaction terms between all the covariates, to satisfy Lasso's weak sparsity assumption. The number of covariates used for panel IV DML estimation doubles due to the inclusion of the lags of all included covariates, following the FD (exact) approach. Panel IV DML technical note: 2 folds, cross-fitting, hyperparameters are tuned as per Table F.1. Standard errors in parentheses are clustered at the regional level. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

Table 2 reports our estimates obtained from 2SLS regressions with FD (Columns 1 and 5) and panel IV DML regressions using the FD approach (Columns 2-4 and 6-8), based on the aggregated panel dataset. The first-stage regression results in Panels A and B are identical for the same skill group, as they use the same subsample. In both panels, the first-stage F-statistics for HS and LS immigrants are substantially smaller in the panel IV DML specifications than in the corresponding 2SLS regressions.

Focusing on HS immigrants (Columns 1-4 of Panels A and B in Table 2), the first-stage F-statistics from panel IV DML regressions fall well below the conventional rule-of-thumb value of 10, whereas the corresponding 2SLS F-statistic is above the Stock and Yogo (2005) threshold (F = 39.5). Therefore, under panel IV DML, the shift-share instrument for HS immigrants appears extremely weak, unlike with conventional 2SLS. The AR test statistics, robust to weak instruments by construction, strongly reject the null hypothesis of no effect at the 1% level with panel IV DML in both panels (with the exception of gradient boosting in Panel B), indicating the presence of an effect of HS immigration on both outcomes. By contrast, the AR test statistic from 2SLS regressions rejects the null hypothesis at the 1% level in Panel A only, and at the 10% level in Panel B.
The corresponding AR CS from the panel IV DML regressions are either disjoint sets that include zero or unbounded, reflecting limited information to identify the effect due to weak instruments, even when a causal effect may be present. More generally, all second-stage estimates of the treatment parameter in these specifications should be interpreted cautiously with both estimators, as the weakness of the shift-share instrument makes standard inference unreliable.

For LS immigrants (Columns 5-8 of Panels A and B in Table 2), the first-stage F-statistics from panel IV DML estimators are well below the rule-of-thumb threshold of 10, while the 2SLS F-statistic barely exceeds 10. This raises serious concerns about the relevance of the shift-share instrument for LS immigration. Consistent with this, the AR test statistics generally fail to reject the null hypothesis with panel IV DML in both panels, with a few exceptions. In Panel A, the AR test statistic rejects the null hypothesis at the 1% level for the boosting specification; however, given the known instability of boosting in small samples, this result should be interpreted with caution. In Panel B, the AR test statistic is borderline significant (at the 10% level) for the Lasso specification, but broadly consistent with the absence of a treatment effect with the other learners. The AR CS obtained from panel IV DML are unbounded (the entire real line) in all cases, indicating insufficient information to identify the range of causal effects, if any. By contrast, the AR confidence sets from the 2SLS specifications are always bounded and contain the corresponding second-stage point estimates, a pattern consistent with the tendency of 2SLS to overstate identification strength under weak instruments.

Overall, our panel IV DML results do not fully support the 2SLS findings, mainly due to concerns about the strength of the shift-share instruments.
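The three shapes of AR confidence set discussed above (bounded, disjoint, entire real line) can be reproduced in a minimal sketch for a single instrument. It assumes an AR statistic with the cluster-robust variance evaluated under the null, which may differ from the implementation behind the tables; the function and argument names are hypothetical. With cluster score sums a_g = Σ z·y and b_g = Σ z·d, the set {θ : AR(θ) ≤ c} solves the quadratic inequality (Σa − θΣb)² − c·Σ(a_g − θ·b_g)² ≤ 0, whose leading coefficient is positive only when the first-stage statistic (Σb)²/Σb² exceeds c.

```python
import math
from collections import defaultdict


def ar_confidence_set(y_res, d_res, z_res, clusters, crit=3.841):
    """Shape and end points of the AR confidence set (one instrument)."""
    a = defaultdict(float)  # cluster sums of z * y
    b = defaultdict(float)  # cluster sums of z * d
    for y, d, z, g in zip(y_res, d_res, z_res, clusters):
        a[g] += z * y
        b[g] += z * d
    s_zy, s_zd = sum(a.values()), sum(b.values())
    # Coefficients of qa * theta^2 + qb * theta + qc <= 0.
    qa = s_zd ** 2 - crit * sum(v * v for v in b.values())
    qb = -2.0 * (s_zy * s_zd - crit * sum(a[g] * b[g] for g in a))
    qc = s_zy ** 2 - crit * sum(v * v for v in a.values())
    if qa == 0.0:
        raise ValueError("degenerate case: the inequality is linear in theta")
    disc = qb * qb - 4.0 * qa * qc
    if qa < 0.0 and disc < 0.0:
        return "real line", None  # every theta is accepted
    root = math.sqrt(max(disc, 0.0))
    lo, hi = sorted(((-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)))
    if qa > 0.0:
        return "bounded", (lo, hi)  # the set is [lo, hi]
    return "disjoint", (lo, hi)  # the set is (-inf, lo] U [hi, +inf)
```

In this formulation a weak first stage makes the leading coefficient negative, which is why unbounded or disjoint sets accompany the small F-statistics in the panel IV DML columns.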
While the instruments appear moderately strong under standard 2SLS specifications, they are found to be weak once high-dimensional and potentially nonlinear confounding is flexibly accounted for using the panel IV DML estimator. The AR diagnostics further indicate that, although some effects may be present, the weak IV prevents the identification of the causal effect. While panel IV DML makes the identifying assumptions more plausible under the included controls, in this setting this comes at the cost of reduced effective relevance, limiting the reliability of second-stage point estimates.

4.2.2 Empirical Application: Moriconi et al. (2022)

In line with the article discussed in the previous section, Moriconi et al. (2022) examine the impact of high-skilled (HS) and low-skilled (LS) immigration on individual voting patterns towards parties with a nationalist agenda, and on perceived attitudes towards politics and immigration. The authors find that a higher share of HS immigrants decreases the intensity of nationalist preferences of native voters, while LS immigration has the opposite effect.

In this third empirical application, we revisit the baseline specifications of Moriconi et al. (2022) on the effect of HS and LS immigrants on nationalism (Columns 2-3 from their Table 6), and on the change in attitudes towards politics and immigrants (Columns 1, 3, 4 and 6 from their Table 10). The dependent variables of interest for the reanalysis are: an index of the nationalism intensity of parties (Nationalism), constructed by matching individual-level party votes in national elections with the content of each party's political manifesto; an index of trust in the country's parliament, as a political attitude; and a 'better place to live' measure, as an attitude towards migrants. The set of control variables employed in the analysis is the same as in Table 2.
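The data preparation shared by this and the previous application (averaging individual records into region-period cells, then first-differencing units observed in consecutive periods) can be sketched as follows. The field names `region` and `wave`, the integer coding of consecutive election waves, and both function names are assumptions made for illustration, not the authors' replication code.

```python
from collections import defaultdict


def region_wave_means(records, variables):
    """Average individual-level records within each (region, wave) cell."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for rec in records:
        cell = (rec["region"], rec["wave"])
        counts[cell] += 1
        for var in variables:
            sums[cell][var] += rec[var]
    return {
        cell: {var: sums[cell][var] / counts[cell] for var in variables}
        for cell in counts
    }


def first_differences(panel):
    """Difference each cell against the same region's previous wave.

    Cells without an observed previous wave are dropped, so the first
    wave and non-consecutive observations do not enter the FD sample.
    """
    diffs = {}
    for (region, wave), row in panel.items():
        previous = panel.get((region, wave - 1))
        if previous is None:
            continue
        diffs[(region, wave)] = {var: row[var] - previous[var] for var in row}
    return diffs


# Hypothetical toy records: one region observed in consecutive waves,
# another observed with a gap.
toy_records = [
    {"region": "ITC1", "wave": 1, "y": 1.0},
    {"region": "ITC1", "wave": 1, "y": 3.0},
    {"region": "ITC1", "wave": 2, "y": 5.0},
    {"region": "FR10", "wave": 1, "y": 2.0},
    {"region": "FR10", "wave": 3, "y": 4.0},
]
toy_panel = region_wave_means(toy_records, ["y"])
toy_fd = first_differences(toy_panel)
```

The second function makes explicit why the FD sample is smaller than the FE sample: any region-wave cell without an immediately preceding cell contributes no differenced observation.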
We first discuss the key insights from the comparison between conventional 2SLS estimates with FE and with FD transformation (see the Online Appendix D.2 for more details). First-stage strength is consistently higher under FE than FD, mainly because FD drops the first time period and subjects without consecutive observations. The shift-share instruments in both FE and FD regressions are not always strong, in contrast with the original study, where 16.30 < F < 104.70 in all first-stage specifications. While the instruments can be considered moderately strong in all specifications using the sample of HS immigrants, this is not always the case for the LS immigrant samples. Specifically, for the nationalism outcome, the instrument is essentially irrelevant (F ≈ 0) for LS immigration under both estimators, undermining the credibility of the corresponding estimates. The second-stage results appear to be estimator-sensitive. The effect of HS immigration on political attitudes (originally only marginally significant at the 10% level) survives only with FE, and the effect of LS immigration on immigration attitudes disappears entirely.

Table 3. Nationalism intensity, attitudes towards politics and immigration

Sample: HS immigrants (Columns 1-4); LS immigrants (Columns 5-8).

| | 2SLS (1) | DML-Lasso (2) | DML-NNet (3) | DML-Boosting (4) | 2SLS (5) | DML-Lasso (6) | DML-NNet (7) | DML-Boosting (8) |
|---|---|---|---|---|---|---|---|---|
| Panel A: Nationalism intensity of parties | | | | | | | | |
| Second-stage results | | | | | | | | |
| Fr. Immigrants | -0.049 | -0.100 | -0.054 | -0.017 | 0.084 | -2.561 | -0.086 | 1.41 |
| | (0.044) | (0.089) | (0.079) | (0.159) | (0.067) | (3.186) | (0.297) | (8.882) |
| AR 95% CS | [-0.140, 0.025] | (-∞, +∞) | (-∞, +∞) | (-∞, +∞) | [-0.028, 0.316] | (-∞, +∞) | (-∞, +∞) | (-∞, +∞) |
| First-stage results | | | | | | | | |
| Shift Share IV | 1.476*** | 0.601** | 0.742** | 0.406* | 0.602*** | 0.167 | 0.189 | -0.051 |
| | (0.241) | (0.257) | (0.308) | (0.217) | (0.189) | (0.208) | (0.356) | (0.262) |
| F stat | 37.54 | 5.381 | 5.699 | 3.433 | 8.31 | 0.629 | 0.277 | 0.037 |
| AR χ² stat | 1.39 | 0.307 | 0.401 | 0.142 | 2.13 | 11.632*** | 0.066 | 3.456* |
| Quality of learners | | | | | | | | |
| Model RMSE | 0.244 | 0.365 | 0.297 | 0.337 | 0.274 | 4.536 | 0.353 | 4.714 |
| MSE of l | | 0.27 | 0.248 | 0.285 | | 0.27 | 0.248 | 0.285 |
| MSE of r | | 0.961 | 0.872 | 0.813 | | 1.365 | 1.521 | 1.85 |
| MSE of m | | 0.28 | 0.278 | 0.318 | | 0.713 | 0.72 | 0.671 |
| Observations | 147 | 147 | 147 | 147 | 147 | 147 | 147 | 147 |
| No. clusters | 114 | 114 | 114 | 114 | 114 | 114 | 114 | 114 |
| Panel B: Political attitudes – Trust in country parliament | | | | | | | | |
| Second-stage results | | | | | | | | |
| Fr. Immigrants | 0.067 | 0.014 | -0.035 | 0.045 | 0.05 | 0.137 | 0.344 | 0.205 |
| | (0.044) | (0.066) | (0.053) | (0.058) | (0.051) | (0.142) | (0.5) | (0.179) |
| AR 95% CS | [-0.015, 0.174] | [-0.139, 0.146] | [-0.154, 0.077] | [-0.082, 0.151] | [-0.030, 0.181] | [-0.016, 0.513] | (-∞, +∞) | (-∞, +∞) |
| First-stage results | | | | | | | | |
| Shift Share IV | 1.174*** | 1.062*** | 1.268*** | 1.15*** | 0.647*** | 0.559** | 0.166 | 0.291 |
| | (0.277) | (0.397) | (0.444) | (0.394) | (0.155) | (0.244) | (0.206) | (0.207) |
| F stat | 19.09 | 7.016 | 8.002 | 8.353 | 16.39 | 5.139 | 0.639 | 1.941 |
| AR χ² stat | 3.3* | 0.001 | 0.507 | 0.449 | 1.17 | 1.936 | 2.992* | 1.558 |
| Quality of learners | | | | | | | | |
| Model RMSE | | 0.423 | 0.383 | 0.413 | | 0.506 | 0.782 | 0.63 |
| MSE of l | | 0.249 | 0.228 | 0.241 | | 0.249 | 0.228 | 0.241 |
| MSE of r | | 0.739 | 0.851 | 0.811 | | 1.172 | 1.345 | 1.248 |
| MSE of m | | 0.239 | 0.227 | 0.255 | | 0.409 | 0.398 | 0.408 |
| Observations | 327 | 327 | 327 | 327 | 327 | 327 | 327 | 327 |
| No. clusters | 114 | 114 | 114 | 114 | 114 | 114 | 114 | 114 |
| Panel C: Migration attitudes – Better place to live | | | | | | | | |
| Second-stage results | | | | | | | | |
| Fr. Immigrants | -0.023 | -0.027 | -0.109 | -0.074 | -0.100* | 0.065 | -0.311 | -0.254 |
| | (0.048) | (0.07) | (0.089) | (0.071) | (0.051) | (0.082) | (0.471) | (0.227) |
| AR 95% CS | [-0.168, 0.041] | [-0.193, 0.099] | [-0.232, 0.008] | [-0.208, 0.033] | [-0.233, -0.026] | [-0.106, 0.339] | (-∞, +∞) | (-∞, 0.106] ∪ [0.434, +∞) |
| First-stage results | | | | | | | | |
| Shift Share IV | 1.174*** | 1.062*** | 1.268*** | 1.15*** | 0.647*** | 0.559** | 0.166 | 0.291 |
| | (0.277) | (0.397) | (0.444) | (0.394) | (0.155) | (0.244) | (0.206) | (0.207) |
| F stat | 19.09 | 7.016 | 8.002 | 8.353 | 16.39 | 5.139 | 0.639 | 1.941 |
| AR χ² stat | 0.28 | 0.381 | 2.683 | 1.745 | 7.99*** | 0.78 | 3.514* | 3.307* |
| Quality of learners | | | | | | | | |
| Model RMSE | | 0.434 | 0.403 | 0.424 | | 0.44 | 0.819 | 0.739 |
| MSE of l | | 0.253 | 0.22 | 0.241 | | 0.253 | 0.22 | 0.241 |
| MSE of r | | 0.739 | 0.851 | 0.811 | | 1.172 | 1.345 | 1.248 |
| MSE of m | | 0.239 | 0.227 | 0.255 | | 0.409 | 0.398 | 0.408 |
| Observations | 327 | 327 | 327 | 327 | 327 | 327 | 327 | 327 |
| No. clusters | 114 | 114 | 114 | 114 | 114 | 114 | 114 | 114 |

Note: The table reports our estimates based on Specifications (2) and (3) of Table 6 in Moriconi et al. (2022) (our Panel A), and Specifications (3) and (5) (Panels A and B) of Table 10 in Moriconi et al. (2022) (our Panels B-C). The table displays our estimates from conventional 2SLS regression with FD transformation (Columns 1 and 5), and from our panel IV DML estimation with different base learners (Columns 2-4 and 6-8). The sample is aggregated at the regional (NUTS2) level to construct an unbalanced panel dataset. The treatment and instrumental variables refer to the fraction of high-skilled immigrants in Columns 1-4 and of low-skilled immigrants in Columns 5-8. Each panel uses a different dependent variable. The raw control variables in all panels are: the share of women, average age, share of tertiary/post-tertiary education, average GDP per capita (in log), share of tertiary sector (in log), average unemployment rate, and year dummies. The set of control variables in the panel IV DML estimation with NNet and Boosting does not include interaction terms because these base learners are designed to capture nonlinearities in the data. Panel IV DML with Lasso (Columns 2 and 6) uses an extended dictionary of the raw variables, including polynomials up to order three and interaction terms between all the covariates, to satisfy Lasso's weak sparsity assumption. The number of covariates used for panel IV DML estimation doubles due to the inclusion of the lags of all included covariates, following the FD (exact) approach. Panel IV DML technical note: 2 folds, cross-fitting, hyperparameters are tuned as per Table F.1. Standard errors in parentheses are clustered at the regional level. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

Moving to the panel IV DML results, Table 3 displays conventional 2SLS with FD estimates (Columns 1 and 5) and our panel IV DML estimation results (Columns 2-4 and 6-8) by dependent variable. For HS immigrants (Columns 1-4 of Panels A-C), the first-stage F-statistics obtained with panel IV DML never exceed even the rule-of-thumb threshold of 10 and are always substantially smaller than those from 2SLS (always F > 16.30). This provides initial evidence of weak instruments in these specifications for HS immigrants with panel IV DML. In all panels, the AR tests with 2SLS and panel IV DML fail to reject the null hypothesis that the treatment has no effect. The associated AR CS are either bounded and include zero, suggesting no second-stage effect, or unbounded (the real line), indicating that the causal parameter cannot be identified from the available variation in the data due to weak IV. Both estimators therefore point to the absence of a causal effect, although only panel IV DML explicitly reveals the weak-identification problem.

For LS immigrants, we comment on the results for the three outcomes (panels) separately. In Columns 5-8 of Panel A, the first-stage F-statistics are below 10 for both 2SLS and panel IV DML, clearly indicating the presence of weak instruments. Although the AR test statistics from panel IV DML suggest the possible presence of a second-stage effect with Lasso (at the 1% level) and boosting (at the 10% level), not detected by 2SLS, the associated AR confidence sets span the entire real line. This reflects the lack of information in the instrument, which prevents identification of the causal parameter. In this case, without AR diagnostics, the researcher might draw the same conclusion from 2SLS and panel IV DML, i.e.
that the instrument is weak and no inference on the second stage can be made. AR diagnostics, however, can indicate whether there is a second-stage effect regardless of the strength of the instrument. Here, only the AR diagnostics from panel IV DML allow us to conclude that there might be an effect, which cannot be identified due to the weakness of the instrument.

In Columns 5-8 of Panels B-C, where the first-stage regressions are identical across outcomes, panel IV DML again produces substantially smaller F-statistics than 2SLS, whose value now only marginally exceeds the Stock and Yogo (2005) threshold. In Panel B, AR tests fail to reject the null hypothesis at the 5% level with both estimators (the neural network specification rejects only at the 10% level), indicating no evidence of a causal effect on political attitudes. In Panel C on migration attitudes, results are mixed: the AR test from panel IV DML with Lasso fails to reject the null of no effect, while neural networks and boosting reject at the 10% level only. By contrast, the 2SLS AR test points to a strong second-stage effect. Consistent with weak identification, AR confidence sets from panel IV DML either cover the entire real line (NNet) or consist of disjoint segments (boosting), whereas the 2SLS confidence set is bounded and excludes zero. The AR CS from Lasso is instead bounded but includes zero, in line with the AR test result. In general, given the weakness of the shift-share instrument, statistical inference based on the conventional 2SLS estimator is unreliable, and the second-stage coefficient cannot be interpreted as a credible causal effect.

Overall, panel IV DML reveals that once instrument exogeneity is strengthened through flexible control for confounding, in this case the shift-share instruments lose substantial relevance.
As a result, the available variation is insufficient to support reliable inference on second-stage effects, underscoring the importance of weak-identification diagnostics in IV applications.

5 Monte Carlo Simulations

For the Monte Carlo simulations, we consider a data generating process (DGP) inspired by the estimating equations of the empirical applications discussed in Section 4. The data are generated from a static panel data model with high-dimensional covariates and individual fixed effects, which are strongly correlated with the included variables. The outcome depends on an endogenous treatment; treatment endogeneity arises from two sources: (a) the correlation between the structural and first-stage error terms, and (b) the presence of the fixed effect in both the structural and treatment equations.²³ The instrument is not randomly assigned, being affected by covariates, but is exogenous to the structural error. The nuisance functions linking controls to the outcome, treatment, and instrument are sparse (only a few of the thirty variables are relevant) and nonlinear (interactions are included). We consider both a strong-instrument design and a weak-instrument design. More details on the DGP are provided in Online Appendix E.

As in the empirical applications, we employ our panel IV DML estimator with different learners (Lasso, a single-layer neural network (NNet), and gradient boosting with 100 trees (Boosting)) to predict the nuisance functions of the covariates in a flexible way, and use the conventional 2SLS estimator as a benchmark to assess the finite-sample properties of the proposed estimator. The set of confounding variables used in conventional 2SLS and in panel IV DML with NNet and Boosting includes all raw variables and no nonlinear terms because, in practice, the analyst is agnostic about the true functional form of the covariates.²⁴ By contrast, panel IV DML with Lasso requires the analyst to specify a rich set of variables, including polynomials and interactions of the raw covariates (extended dictionary), to satisfy the sparsity assumption.²⁵ The hyperparameters of the base learners are tuned via grid search (Bergstra and Bengio, 2012) as explained in Online Appendix F.

²³ The first source of endogeneity can be addressed with an IV approach, while the second with a panel data estimation approach (i.e., FE, FD and correlated random effects).
²⁴ Neural network and gradient boosting should be able to capture those nonlinearities, by construction, even if unspecified, unlike 2SLS.
²⁵ The constructed extended dictionary does not include the interaction terms present in Equations (E.8)-(E.10) of the DGP, allowing us to assess how effectively Lasso can flexibly recover similar covariate complexity.

Table 4. MC Results, FD Approach with Strong IV

Column groups: target parameter $\hat{\theta}$ (Bias, RMSE, SE/SD); nuisance parameters (RMSE_l, RMSE_r, RMSE_m); first-stage F statistic (F, F>16.3, F>104.7); AR $\chi^2$ test (p<0.05); Anderson-Rubin confidence set (Bounded, Real line, Disjoint, Includes 0).

Estimator   Bias    RMSE      SE/SD   RMSE_l  RMSE_r  RMSE_m  F        F>16.3  F>104.7  p<0.05  Bounded  Real line  Disjoint  Includes 0
Panel A: N=100, T=10
2SLS        0.508   0.260     1.213   –       –       –       747.6    1.00    1.00     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     0.005   0.331     3.420   1.982   1.480   0.389   44.6     0.99    0        0.9     1.00     0          0         0.07
  NNet      0.097   0.132     1.711   2.296   1.727   0.424   33.8     0.87    0.01     0.74    1.00     0          0         0.23
  Boosting  0.448   0.209     1.232   2.415   1.903   0.661   136.3    1.00    0.68     1.00    1.00     0          0         0
Panel B: N=500, T=10
2SLS        0.505   0.255     1.112   –       –       –       3760.9   1.00    1.00     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     0.055   0.008     1.037   1.953   1.456   0.374   179.7    1.00    1.00     1.00    1.00     0          0         0
  NNet      0.062   0.013     1.244   2.021   1.529   0.387   158.8    0.99    0.96     1.00    1.00     0          0         0
  Boosting  0.280   0.084     1.330   2.125   1.625   0.491   243.1    1.00    1.00     1.00    1.00     0          0         0
Panel C: N=1000, T=10
2SLS        0.505   0.256     1.177   –       –       –       7561.6   1.00    1.00     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     0.048   0.005     1.072   1.932   1.443   0.372   352.5    1.00    1.00     1.00    1.00     0          0         0
  NNet      0.052   0.007     1.206   1.985   1.498   0.385   309.1    0.99    0.97     1.00    1.00     0          0         0
  Boosting  0.193   0.041     1.258   2.055   1.562   0.455   394.8    1.00    1.00     1.00    1.00     0          0         0
Panel D: N=5000, T=10
2SLS        0.505   0.255     0.835   –       –       –       37612.8  1.00    1.00     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     0.065   0.004     0.697   1.880   1.406   0.367   1808.4   1.00    1.00     1.00    1.00     0          0         0
  NNet      0.073   0.008     1.979   1.954   1.462   0.393   1509.3   1.00    0.98     1.00    1.00     0          0         0
  Boosting  0.050   0.004     1.215   1.986   1.497   0.415   1390.3   1.00    1.00     1.00    1.00     0          0         0

Note: The figures in the table are average values and frequencies over 100 bootstrapped replications by estimation method (conventional 2SLS and panel IV DML) and learner (Lasso, neural network, and gradient boosting). The true structural parameter $\theta$ is 0.50, and the true first-stage coefficient is $\pi = 0.8$. Panel IV DML estimation details: cross-fitting with 3 folds.

Tables 4 and 5 report the average results across 100 bootstrap samples for the strong- and weak-instrument cases, respectively. Each table summarizes the averages (for numerical quantities) or proportions (for binary indicators) over all Monte Carlo replications.
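As a rough illustration of how averages and proportions of this kind could be assembled from replication-level output, the sketch below computes the bias, RMSE, SE/SD ratio and threshold frequencies for a single estimator. The function name `summarize_replications`, its argument layout and the toy inputs are hypothetical and are not taken from the paper's code; only the thresholds 16.3 and 104.7 and the 5% level come from the tables.

```python
import numpy as np


def summarize_replications(theta_hat, se_hat, f_stat, ar_pvalue, theta_true):
    """Summarize Monte Carlo output for one estimator (hypothetical helper).

    Each array holds one entry per replication.
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    se_hat = np.asarray(se_hat, dtype=float)
    f_stat = np.asarray(f_stat, dtype=float)
    ar_pvalue = np.asarray(ar_pvalue, dtype=float)

    errors = theta_hat - theta_true
    return {
        # Average estimation error across replications.
        "bias": errors.mean(),
        # Root mean squared error around the true parameter.
        "rmse": np.sqrt(np.mean(errors ** 2)),
        # Mean estimated SE relative to the Monte Carlo standard deviation.
        "se_sd": se_hat.mean() / theta_hat.std(ddof=1),
        "mean_F": f_stat.mean(),
        # Proportions of replications passing each threshold rule.
        "share_F_gt_16.3": np.mean(f_stat > 16.3),
        "share_F_gt_104.7": np.mean(f_stat > 104.7),
        "share_AR_p_lt_0.05": np.mean(ar_pvalue < 0.05),
    }


# Toy, made-up replication output (four replications) just to show the call.
summary = summarize_replications(
    theta_hat=[0.4, 0.6, 0.5, 0.7],
    se_hat=[0.1, 0.1, 0.1, 0.1],
    f_stat=[5.0, 20.0, 120.0, 200.0],
    ar_pvalue=[0.01, 0.2, 0.04, 0.5],
    theta_true=0.5,
)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

An SE/SD ratio close to one would indicate that the estimated standard errors track the actual sampling variability of the estimator.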
The metrics reported include: (a) the bias and root mean squared error (RMSE) of the estimated target parameter $\hat{\theta}_n$, together with the ratio of its estimated standard error to its Monte Carlo standard deviation (SE/SD); (b) the RMSE of the three nuisance functions; (c) the first-stage F-statistic and threshold rules for assessing instrument strength; and (d) the p-value of the Anderson-Rubin (AR) test statistic together with the corresponding type of AR confidence set. Each panel displays the results by cross-sectional sample size, $N \in \{100, 500, 1000, 5000\}$, with $T = 10$ fixed, where each row represents a different estimator.

Table 5. MC Results, FD Approach with Weak IV

Column groups: target parameter $\hat{\theta}$ (Bias, RMSE, SE/SD); nuisance parameters (RMSE_l, RMSE_r, RMSE_m); first-stage F statistic (F, F>16.3, F>104.7); AR $\chi^2$ test (p<0.05); Anderson-Rubin confidence set (Bounded, Real line, Disjoint, Includes 0).

Estimator   Bias    RMSE      SE/SD   RMSE_l  RMSE_r  RMSE_m  F        F>16.3  F>104.7  p<0.05  Bounded  Real line  Disjoint  Includes 0
Panel A: N=100, T=10
2SLS        1.007   1.020     1.174   –       –       –       192.9    1.00    0.99     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     1.249   20.434    1.240   1.969   1.429   0.389   2.8      0.01    0        0.26    0.36     0.53       0.11      0.71
  NNet      0.495   12.131    0.378   2.304   1.666   0.424   2.1      0.01    0        0.2     0.25     0.6        0.15      0.78
  Boosting  1.136   2.094     4.201   2.322   1.682   0.661   37.1     0.93    0        1.00    1.00     0          0         0
Panel B: N=500, T=10
2SLS        1.001   1.002     1.108   –       –       –       958.5    1.00    1.00     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     1.334   106.700   2.848   1.942   1.415   0.374   4.2      0       0        0.27    0.53     0.32       0.15      0.63
  NNet      0.417   147.068   2.659   2.004   1.457   0.387   3.9      0       0        0.31    0.46     0.4        0.14      0.64
  Boosting  1.364   7.888     8.154   2.086   1.519   0.491   26.5     0.82    0        1.00    1.00     0          0         0
Panel C: N=1000, T=10
2SLS        1.001   1.003     1.082   –       –       –       1929.5   1.00    1.00     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     0.651   19.555    4.002   1.923   1.403   0.372   6.9      0.04    0        0.46    0.79     0.14       0.07      0.51
  NNet      0.408   23.187    2.721   1.972   1.435   0.385   7.1      0.08    0        0.46    0.7      0.2        0.1       0.48
  Boosting  0.792   1.078     3.339   2.025   1.473   0.455   27.8     0.84    0        1.00    1.00     0          0         0
Panel D: N=5000, T=10
2SLS        1.002   1.004     0.826   –       –       –       9575.5   1.00    1.00     1.00    1.00     0          0         0
Panel IV DML with:
  Lasso     0.506   0.333     1.932   1.869   1.361   0.367   34.3     1.00    0        1.00    1.00     0          0         0
  NNet      0.096   14.171    11.232  1.931   1.400   0.393   32.1     0.85    0.03     0.96    1.00     0          0         0.03
  Boosting  -5.328  3370.120  31.939  1.964   1.428   0.415   21.5     0.72    0        0.95    1.00     0          0         0.03

Note: The figures in the table are average values and frequencies over 100 bootstrapped replications by estimation method (conventional 2SLS and panel IV DML) and learner (Lasso, neural network, and gradient boosting). The true structural parameter $\theta$ is 0.50, and the true first-stage coefficient is $\pi = 0.001$. Panel IV DML estimation details: cross-fitting with 3 folds.

We begin the discussion of the Monte Carlo results with Table 4, which corresponds to the strong-instrument design. In this setting, inference is reliable and, hence, the focus is primarily on the performance of the panel IV DML estimator for the target parameter. Panel IV DML consistently outperforms conventional 2SLS across all sample sizes, reflecting its ability to capture nonlinearities in the covariates that 2SLS cannot accommodate, by construction. In particular, the panel IV DML estimator recovers the structural parameter with high accuracy and precision when using Lasso and neural networks: the bias remains small (below 0.10) even in small samples ($N = 100$). Gradient boosting exhibits larger bias in small samples, likely due to overfitting when the training sample is small, but still performs slightly better than 2SLS even in small samples. As the sample size increases, its bias converges toward that of Lasso and neural networks.
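The mechanism behind this comparison (flexible partialling-out of covariates before instrumenting) can be illustrated with a deliberately simplified sketch. Everything below is an assumption made for illustration: the data are cross-sectional (no fixed effects or first differencing), the DGP is a made-up one-covariate design rather than the design of Online Appendix E, and a cubic polynomial fitted by least squares stands in for the Lasso, neural-network and boosting learners. Only the 3-fold cross-fitting mirrors the tables. The function names are hypothetical.

```python
import numpy as np


def poly_design(x, degree):
    """Columns 1, x, ..., x**degree (toy stand-in for a flexible learner)."""
    return np.vander(x, N=degree + 1, increasing=True)


def cross_fit_residual(target, x, degree, folds):
    """Out-of-fold residuals: target minus its prediction from x."""
    resid = np.empty_like(target, dtype=float)
    for test_idx in folds:
        train = np.ones(len(x), dtype=bool)
        train[test_idx] = False
        coef, *_ = np.linalg.lstsq(
            poly_design(x[train], degree), target[train], rcond=None
        )
        resid[test_idx] = target[test_idx] - poly_design(x[test_idx], degree) @ coef
    return resid


def partialled_out_iv(y, d, z, x, degree, n_folds=3, seed=0):
    """IV estimate and homoskedastic first-stage F on cross-fitted residuals."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    y_res, d_res, z_res = (cross_fit_residual(v, x, degree, folds) for v in (y, d, z))
    theta = (z_res @ y_res) / (z_res @ d_res)
    pi = (z_res @ d_res) / (z_res @ z_res)
    fs_resid = d_res - pi * z_res
    var_pi = (fs_resid @ fs_resid) / (len(x) - 1) / (z_res @ z_res)
    return theta, pi ** 2 / var_pi


def simulate(n=4000, theta=0.5, seed=1):
    """Made-up design: x enters outcome, treatment and instrument through x**2."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, n)
    g = x ** 2
    u_d = rng.normal(size=n)
    eps = 0.6 * u_d + 0.8 * rng.normal(size=n)  # endogeneity via u_d
    z = g + rng.normal(size=n)                  # instrument depends on x
    d = 0.8 * z + g + u_d
    y = theta * d + 2.0 * g + eps
    return y, d, z, x


y, d, z, x = simulate()
for degree in (1, 3):  # 1: linear controls only; 3: flexible adjustment
    theta_hat, f_stat = partialled_out_iv(y, d, z, x, degree)
    print(f"degree={degree}: theta_hat={theta_hat:.3f}, first-stage F={f_stat:.1f}")
```

With linear controls only, the residualized instrument still carries the omitted nonlinear term, which also drives the outcome, so the IV estimate is pulled away from the true value of 0.5; the flexible fit removes that channel. Flexible adjustment also lowers the first-stage F in this toy design, since part of the apparent instrument strength under linear controls came from the omitted covariate term.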
The RMSE of the panel IV DML estimator declines markedly with sample size, consistent with root-N convergence, whereas the 2SLS estimator does not display the same improvement. Consistent with the design, first-stage F-statistics are large and frequently exceed the Lee et al. (2022) threshold of 104.70, indicating strong identification. The main exceptions are Lasso and neural networks in very small samples ($N = 100$), where the average F-statistics exceed the Stock and Yogo (2005) critical value of 16.30 but remain below 104.70. This does not indicate genuine weak identification, but rather reflects the limited information available in small samples. The Anderson-Rubin (AR) diagnostics support this interpretation: both the AR test statistic and the AR confidence set (CS) provide evidence of a treatment effect under this DGP across estimators and sample sizes. This pattern mirrors our empirical reanalysis of Tabellini (2020), where F-statistics similarly fell below 104.70, but the AR test statistic rejected the null hypothesis of no effect and the AR CS remained bounded. Exceptions arise again for Lasso and neural networks in very small samples ($N = 100$), where limited information implies that the absence of a second-stage effect cannot be fully ruled out.

When the instrument is designed to be weak (Table 5), the second-stage estimates and statistical inference are unreliable for all estimators, so the Monte Carlo metrics for $\hat{\theta}_n$ (bias, RMSE, SE/SD) are largely uninformative.²⁶ The key consideration, therefore, is whether the weak-identification diagnostics correctly detect the lack of instrument relevance when using conventional 2SLS and panel IV DML estimators. Conventional 2SLS systematically fails to detect weak instruments.
That is, the first-stage F-statistics exceed 104.70 in at least 99% of replications, the AR test always rejects the null hypothesis of no effect at the 5% level in all samples, and the AR 95% CS are always bounded and exclude zero. These diagnostics incorrectly suggest strong identification and a well-identified treatment effect, despite the DGP being designed with weak instruments. By contrast, panel IV DML with Lasso and NNet correctly detects weak identification in most cases, especially in small samples. For these learners, the average first-stage F-statistic does not exceed the rule-of-thumb threshold of 10 for $N \le 1000$, is almost always below the Stock and Yogo (2005) critical value of 16.30 in small samples, and never approaches the Lee et al. (2022) threshold of 104.70, with a few exceptions in large samples ($N = 5{,}000$). Boosting never reaches the Lee et al. (2022) threshold, although its F-statistics are systematically larger (around 25 on average) than those of the other learners.

The AR test further highlights the contrast between panel IV DML and conventional 2SLS. For Lasso and NNet, the AR rejection rates are low in small samples (on average around 30% at $N \le 500$ and 46% at $N = 1{,}000$), appropriately reflecting weak identification, and increase to about 95% at $N = 5{,}000$ since larger samples provide more information. The corresponding AR CS are often unbounded (either disjoint or the real line) in small samples, signalling weak instruments. As $N$ increases, the AR CS become bounded more often, but still include zero in roughly half of the replications at $N = 1{,}000$, correctly indicating that the causal effect cannot be precisely identified under weak instruments. Boosting behaves similarly to 2SLS in small samples, but converges toward the neural-network patterns as the sample size grows. In this regard, panel IV DML provides inference that is more reliable in finite samples than 2SLS, following the patterns observed in the empirical applications of Moriconi et al. (2019, 2022), where the instrument was found weak by panel IV DML but not by 2SLS and the AR CS were mostly bounded with zero excluded.

In conclusion, the Monte Carlo simulations indicate that conventional 2SLS produces biased and inconsistent estimates under both weak and strong instruments, with the magnitude of the bias remaining similar across sample sizes and failing to exhibit root-N convergence. Conversely, panel IV DML not only improves estimation and statistical inference of the treatment effect under strong identification, regardless of the sample size, but is also able to provide more robust and informative inference when identification is weak and conventional 2SLS can be severely misleading.

²⁶ As expected, both conventional 2SLS and panel IV DML produce biased estimates of the target parameter across all sample sizes. Surprisingly, the neural network exhibits comparatively smaller average bias in large samples ($N = 5{,}000$), though at the cost of reduced precision ($SE/SD \gg 1$).

6 Conclusion

This paper introduces novel DML estimation procedures for partially linear panel regression models with fixed effects under endogenous treatment and potentially nonlinear covariate effects. We show that panel IV DML is a powerful alternative to 2SLS, capturing nonlinearities in the data while delivering more precise and reliable finite-sample inference. The panel IV DML toolkit allows researchers to complement traditional estimation techniques, providing a flexible, theoretically grounded, and empirically practical approach to causal estimation in panel settings where conventional methods fall short.

Another key contribution of this paper is the integration of weak-instrument diagnostics (i.e., the first-stage F-statistic and the Anderson-Rubin test and confidence sets) into the panel IV DML framework.
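The three confidence-set shapes reported throughout (bounded, real line, disjoint) arise from inverting the AR statistic over candidate parameter values. The sketch below illustrates this under strong simplifying assumptions that are ours and not the paper's: a single instrument, homoskedastic errors with the variance estimated by $e'e/n$ under the null, and variables that have already been partialled out. The function names and the toy data are hypothetical; a panel implementation with cluster-robust variances would differ.

```python
import numpy as np

CHI2_1_95 = 3.841458820694124  # 95% quantile of the chi-square(1) distribution


def ar_statistic(theta, y, d, z):
    """AR statistic for H0: theta, with e = y - theta*d and variance e'e/n."""
    e = y - theta * d
    return len(y) * (z @ e) ** 2 / ((z @ z) * (e @ e))


def ar_confidence_set(y, d, z, crit=CHI2_1_95):
    """Invert AR(theta) <= crit, which is a quadratic inequality in theta.

    Returns (kind, lower, upper):
      "bounded"   -> [lower, upper]
      "disjoint"  -> (-inf, lower] union [upper, inf)
      "real line" -> every theta is accepted
    """
    n = len(y)
    zz, zy, zd = z @ z, z @ y, z @ d
    # AR(theta) <= crit  <=>  a*theta**2 + b*theta + c <= 0
    a = n * zd ** 2 - crit * zz * (d @ d)
    b = -2.0 * (n * zy * zd - crit * zz * (y @ d))
    c = n * zy ** 2 - crit * zz * (y @ y)
    disc = b ** 2 - 4.0 * a * c
    if a > 0:
        # Parabola opens upward and is negative at the IV estimate: interval.
        root = np.sqrt(disc)
        return "bounded", (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)
    if a < 0 and disc > 0:
        # Parabola opens downward: accepted values lie outside the two roots.
        root = np.sqrt(disc)
        lower, upper = sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)))
        return "disjoint", lower, upper
    # No theta is rejected (the knife-edge case a == 0 is lumped in here).
    return "real line", -np.inf, np.inf


# Deterministic toy data built from mutually orthogonal +/-1 patterns.
z = np.tile([1.0, -1.0], 50)
w = np.tile([1.0, 1.0, -1.0, -1.0], 25)
v = z * w
designs = {
    "strong first stage": (0.5 * (z + 0.5 * w) + 0.3 * v, z + 0.5 * w),
    "very weak first stage": (0.5 * (0.05 * z + w) + 0.3 * v, 0.05 * z + w),
    "weak first stage, z correlated with outcome noise": (
        0.5 * (0.15 * z + w) + 0.3 * v + 0.1 * z,
        0.15 * z + w,
    ),
}
for label, (y, d) in designs.items():
    print(label, ar_confidence_set(y, d, z))
```

In this simplified setting the leading coefficient `a` is positive exactly when $n (z'd)^2 / (z'z \cdot d'd)$ exceeds the critical value, so the set is bounded only when a measure of first-stage strength is large enough; this is one way to see why weak first stages and unbounded AR sets tend to appear together.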
Our results show that AR inference is essential for reliable conclusions, as it remains valid under weak identification and can uncover treatment effects even when the F-statistic suggests limited instrument strength. Therefore, we advocate reporting AR diagnostics alongside conventional measures in panel IV DML applications.

References

Abadie, A. (2003). Semiparametric instrumental variable estimation of treatment response models. Journal of Econometrics, 113(2):231–263.
Abayasekara, A., Kim, J. S., and Wang, L. C. (2025). Impacts of housing costs on health and satisfaction with life circumstances: Evidence from Australia. Health Economics, 34(4):741–757.
Adao, R., Kolesár, M., and Morales, E. (2019). Shift-share designs: Theory and inference. The Quarterly Journal of Economics, 134(4):1949–2010.
Anatolyev, S. and Gospodinov, N. (2011). Specification testing in models with many instruments. Econometric Theory, 27(2):427–441.
Anderson, T. W. and Rubin, H. (1949). Estimation of the parameters of a single equation in a complete system of stochastic equations. The Annals of Mathematical Statistics, 20(1):46–63.
Andrews, I., Stock, J. H., and Sun, L. (2019). Weak instruments in instrumental variables regression. Annual Review of Economics, 11:727–753.
Angrist, J. D. and Imbens, G. W. (1995). Two-stage least squares estimation of average causal effects in models with variable treatment intensity. Journal of the American Statistical Association, 90(430):431–442.
Argañaraz, F. and Escanciano, J. C. (2025). Debiased machine learning for unobserved heterogeneity: High-dimensional panels and measurement error models. arXiv preprint arXiv:2507.13788.
Avila Marquez, M. (2025). Weak instrumental variables due to nonlinearities in panel data: A super learner control function estimator. arXiv preprint arXiv:2504.03228. Uploaded version 12 November 2025.
Bach, P., Kurz, M. S., Chernozhukov, V., Spindler, M., and Klaassen, S. (2024a). DoubleML: Double Machine Learning in R. R package version 1.0.2.
Bach, P., Schacht, O., Chernozhukov, V., Klaassen, S., and Spindler, M. (2024b). Hyperparameter tuning for causal inference with double machine learning: A simulation study. In Proceedings of the Third Conference on Causal Learning and Reasoning, volume 236 of Proceedings of Machine Learning Research, pages 1065–1117. PMLR.
Baiardi, A. and Naghi, A. A. (2024a). The effect of plough agriculture on gender roles: A machine learning approach. Journal of Applied Econometrics.
Baiardi, A. and Naghi, A. A. (2024b). The value added of machine learning to causal inference: Evidence from revisited studies. The Econometrics Journal, page utae004.
Belloni, A., Chernozhukov, V., and Hansen, C. (2014a). High-dimensional methods and inference on structural and treatment effects. Journal of Economic Perspectives, 28(2):29–50.
Belloni, A., Chernozhukov, V., and Hansen, C. (2014b). Inference on treatment effects after selection among high-dimensional controls. Review of Economic Studies, 81(2):608–650.
Belloni, A., Chernozhukov, V., Hansen, C., and Kozbur, D. (2016). Inference in high-dimensional panel models with an application to gun control. Journal of Business & Economic Statistics, 34(4):590–605.
Bergstra, J. and Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(2).
Bia, M., Huber, M., and Lafférs, L. (2023). Double machine learning for sample selection models. Journal of Business & Economic Statistics, pages 1–12.
Borusyak, K., Hull, P., and Jaravel, X. (2022). Quasi-experimental shift-share research designs. The Review of Economic Studies, 89(1):181–213.
Borusyak, K., Hull, P., and Jaravel, X. (2025). A practical guide to shift-share instruments. Journal of Economic Perspectives, 39(1):181–204.
Bound, J., Jaeger, D. A., and Baker, R. M. (1995). Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association, 90(430):443–450.
Card, D. (2001). Immigrant inflows, native outflows, and the local labor market impacts of higher immigration. Journal of Labor Economics, 19(1):22–64.
Carrasco, M. and Tchuente, G. (2016). Efficient estimation with many weak instruments using regularization techniques. Econometric Reviews, 35(8-10):1609–1637.
Chamberlain, G. (1984). Panel data. Handbook of Econometrics, 2:1247–1318.
Chen, C. Y.-H., Lioui, A., and Scaillet, O. (2025). Green silence: Double machine learning carbon emissions under sample selection bias. Available at SSRN: https://ssrn.com/abstract=5512980.
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., and Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1):C1–C68.
Chernozhukov, V., Demirer, M., Duflo, E., and Fernández-Val, I. (2025). Fisher–Schultz lecture: Generic machine learning inference on heterogeneous treatment effects in randomized experiments, with an application to immunization in India. Econometrica, 93(4):1121–1164.
Chernozhukov, V., Fernández-Val, I., Huang, C., and Wang, W. (2024). Arellano-Bond Lasso estimator for dynamic linear panel models. arXiv preprint arXiv:2402.00584.
Chernozhukov, V., Newey, W. K., and Singh, R. (2022). Automatic debiased machine learning of causal and structural effects. Econometrica, 90(3):967–1027.
Clark, A. E. and Zhu, R. (2024). Taking back control? Quasi-experimental evidence on the impact of retirement on locus of control. The Economic Journal, 134(660):1465–1493.
Clarke, P. S. and Polselli, A. (2025). Double machine learning for static panel models with fixed effects. The Econometrics Journal, 29(1):69–86.
Cragg, J. G. and Donald, S. G. (1993). Testing identifiability and specification in instrumental variable models. Econometric Theory, 9(2):222–240.
Crudu, F., Mellace, G., and Sándor, Z. (2021). Inference in instrumental variable models with heteroskedasticity and many instruments. Econometric Theory, 37(2):281–310.
Cruz, L. M. and Moreira, M. J. (2005). On the validity of econometric techniques with weak instruments: Inference on returns to education using compulsory school attendance laws. Journal of Human Resources, 40(2):393–410.
Davidson, R. and MacKinnon, J. G. (2014). Confidence sets based on inverting Anderson–Rubin tests. The Econometrics Journal, 17(2):S39–S58.
Deryugina, T., Heutel, G., Miller, N. H., Molitor, D., and Reif, J. (2019). The mortality and medical costs of air pollution: Evidence from changes in wind direction. American Economic Review, 109(12):4178–4219.
Dovì, M.-S., Kock, A. B., and Mavroeidis, S. (2024). A Ridge-regularized jackknifed Anderson-Rubin test. Journal of Business & Economic Statistics, 42(3):1083–1094.
Fumagalli, L., Lynn, P., and Muñoz-Bugarin, J. (2021). Investigating the role of debt advice on borrowers' well-being: An encouragement study on a new sample of over-indebted people in Britain. Technical report, ISER Working Paper Series.
Goldsmith-Pinkham, P., Sorkin, I., and Swift, H. (2020). Bartik instruments: What, when, why, and how. American Economic Review, 110(8):2586–2624.
Keane, M. and Neal, T. (2023). Instrument strength in IV estimation and inference: A guide to theory and practice. Journal of Econometrics, 235(2):1625–1653.
Keane, M. P. and Neal, T. (2024). A practical guide to weak instruments. Annual Review of Economics, 16.
Khammo, F., Kim, J. S., and Wang, L. C. (2024). Do housing costs affect transportation? Longitudinal evidence from Australia. Cities, 155:105469.
Kleibergen, F. and Paap, R. (2006). Generalized reduced rank tests using the singular value decomposition. Journal of Econometrics, 133(1):97–126.
Klosin, S. and Vilgalys, M. (2022). Estimating continuous treatment effects in panel data using machine learning with an agricultural application. arXiv preprint arXiv:2207.08789.
Knaus, M. C. (2022). Double machine learning-based programme evaluation under unconfoundedness. The Econometrics Journal, 25(3):602–627.
Langen, H. and Huber, M. (2023). How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign. Plos One, 18(1):e0278937.
Lee, D. S., McCrary, J., Moreira, M. J., and Porter, J. (2022). Valid t-ratio inference for IV. American Economic Review, 112(10):3260–3290.
Machlanski, D., Samothrakis, S., and Clarke, P. (2023). Hyperparameter tuning and model evaluation in causal effect estimation. arXiv preprint arXiv:2303.01412.
Machlanski, D., Samothrakis, S., and Clarke, P. S. (2024). Robustness of algorithms for causal structure learning to hyperparameter choice. In Proceedings of the Third Conference on Causal Learning and Reasoning, volume 236 of Proceedings of Machine Learning Research, pages 703–739. PMLR.
Mayda, A. M., Peri, G., and Steingress, W. (2022). The political impact of immigration: Evidence from the United States. American Economic Journal: Applied Economics, 14(1):358–389.
Mikusheva, A. and Sun, L. (2022). Inference with many weak instruments. The Review of Economic Studies, 89(5):2663–2686.
Montiel Olea, J. L. and Pflueger, C. (2013). A robust test for weak instruments. Journal of Business & Economic Statistics, 31(3):358–369.
Moreira, H. and Moreira, M. J. (2019). Optimal two-sided tests for instrumental variables regression with heteroskedastic and autocorrelated errors. Journal of Econometrics, 213(2):398–433.
Moriconi, S., Peri, G., and Turati, R. (2019). Immigration and voting for redistribution: Evidence from European elections. Labour Economics, 61:101765.
Moriconi, S., Peri, G., and Turati, R. (2022). Skill of the immigrants and vote of the natives: Immigration and nationalism in European elections 2007–2016. European Economic Review, 141:103986.
Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, pages 69–85.
Polselli, A. (2025). xtdml: Double Machine Learning for Static Panel Models with Fixed Effects. R package version 0.1.12.
Robinson, P. M. (1988). Root-n-consistent semiparametric regression. Econometrica: Journal of the Econometric Society, pages 931–954.
Ronconi, L., Brown, T. T., and Scheffler, R. M. (2012). Social capital and self-rated health in Argentina. Health Economics, 21(2):201–208.
Schaffer, M. E. (2005). XTIVREG2: Stata module to perform extended IV/2SLS, GMM and AC/HAC, LIML and k-class regression for panel data models. Statistical Software Components, Boston College Department of Economics.
Semenova, V., Goldman, M., Chernozhukov, V., and Taddy, M. (2023). Inference on heterogeneous treatment effects in high-dimensional dynamic panels under weak dependence. Quantitative Economics, 14(2):471–510.
Stock, J. and Yogo, M. (2005). Asymptotic distributions of instrumental variables statistics with many instruments. Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg, 6:109–120.
Stock, J. H. and Watson, M. W. (2019). Introduction to Econometrics, Global Edition. Harlow: Pearson Education, Limited, 4th edition. Print.
Strittmatter, A. (2023). What is the value added by using causal machine learning methods in a welfare experiment evaluation? Labour Economics, 84:102412.
Sun, L. (2018). TWOSTEPWEAKIV: Stata module to implement two-step weak-instrument-robust confidence sets for linear instrumental-variable (IV) models. Statistical Software Components, Boston College Department of Economics.
Tabellini, M. (2020). Gifts of the immigrants, woes of the natives: Lessons from the age of mass migration. The Review of Economic Studies, 87(1):454–486.
Windmeijer, F. (2025). The robust F-statistic as a test for weak instruments. Journal of Econometrics, 247:105951.

A Proofs

A.1 Proof of Proposition 3.1

The derivation of the Neyman orthogonal (NO) score function for IV panel is based on the following conditional moment restrictions
$$E[U_{it} \mid Z_{it}, X_{it}, \xi_i] = E[R_{it} \mid Z_{it}, X_{it}, \xi_i] = E[U^*_{it} \mid Z_{it}, X_{it}, \xi_i] = 0,$$
where $U^*_{it} = Y_{it} - l_0(X_{it}) - V_{it}'\delta - \alpha_i$ is the reduced-form regression residual, which induce the conditional moment restrictions
$$E[\tilde U_{it} \mid \tilde Z_{it}, X_{it}, X_{it-1}, \xi_i] = E[\tilde R_{it} \mid \tilde Z_{it}, X_{it}, X_{it-1}, \xi_i] = E[\tilde U^*_{it} \mid \tilde Z_{it}, X_{it}, X_{it-1}, \xi_i] = 0$$
under Assumptions 2.1 and 2.2. From Chernozhukov et al.
(2018, Lemma 2.6), the semiparametrically efficient NO score for $b_0 = (\pi_0, \theta_0)$ or $b_0^{rf} = (\delta_0, \pi_0)$ based on the moment restrictions above is
$$\psi^{\perp}(\tilde Y_i, \tilde D_i, \tilde Z_i, X_i) = \mu(\tilde Z_i, X_i)\, r_i,$$
where
$$\mu(\tilde Z_i, X_i) = A'(\tilde Z_i, X_i)\,\Omega^{-1}(\tilde Z_i, X_i) - G(X)\,\Gamma(\tilde Z_i, X_i)\,\Omega^{-1}(\tilde Z_i, X_i)$$
and the components of $\mu(\tilde Z_i, X_i)$ are given as follows:
$$r_i' = \big(\tilde R_i' \;\;\; \tilde U_i' \text{ or } \tilde U_i^{*\prime}\big) \tag{A.1}$$
$$A(\tilde V, X) \equiv E\big[\partial_{\pi,\theta \text{ or } \delta}\, r_i \mid \tilde V_i, X_i\big] = -\left(\begin{array}{cc} \tilde V_i' & 0 \\ 0 & \tilde V_i'\pi_0 \\ \hline 0 & \tilde V_i' \end{array}\right) \tag{A.2}$$
$$\Omega(\tilde V, X) \equiv E\left[\left.\begin{array}{cc|c} \tilde R_i\tilde R_i' & \tilde R_i\tilde U_i' & \tilde R_i\tilde U_i^{*\prime} \\ \tilde U_i\tilde R_i' & \tilde U_i\tilde U_i' & \\ \hline \tilde U_i^{*}\tilde R_i' & & \tilde U_i^{*}\tilde U_i^{*\prime} \end{array}\,\right|\, \tilde V_i, X_i\right] = \left(\begin{array}{cc|c} \Omega_{\pi\pi} & \Omega_{\pi\theta} & \Omega_{\pi\delta} \\ \Omega_{\pi\theta}' & \Omega_{\theta\theta} & \\ \hline \Omega_{\pi\delta}' & & \Omega_{\delta\delta} \end{array}\right) \tag{A.3}$$
$$\Gamma(\tilde V, X) \equiv E\big[\partial_{\eta}\, r_i \mid \tilde V_i, X_i\big] = \left(\begin{array}{ccc} 0 & -1_{T-1} & \pi_0' \otimes 1_{T-1} \\ -1_{T-1} & 1_{T-1}\theta_0 & 0 \\ \hline & 0 & \delta_0' \otimes 1_{T-1} \end{array}\right), \tag{A.4}$$
where $0$ indicates the conformable matrix or vector of zeros, and the within-matrix lines indicate (above and left) the entries associated with $\theta_0$ and the others (below and right) associated with $\delta_0$. The lower-diagonal element of (A.2) follows under (2.4)-(2.5):
$$E^*[\tilde D_{it} - \tilde r_0(X_{it})] = E^*[\tilde V_{it}'\pi_0 + \tilde r_0(X_{it}) + \tilde R_{it} - \tilde r_0(X_{it})] = \tilde V_{it}'\pi_0,$$
where $E^*[\,\cdot\,] = E[\,\cdot \mid \tilde V_{it}, X_{it-1}, X_{it}]$. Finally,
$$G(X) \equiv E\big[A'(\tilde Z_i, X_i)\,\Omega^{-1}(\tilde Z_i, X_i)\,\Gamma(\tilde Z_i, X_i) \mid X_i\big] = 0$$
under the locally efficient condition that $\Omega(\tilde Z_i, X_i) = \Omega(X_i)$ and because $\Gamma(\tilde Z_i, X_i) = \Gamma$ and $E[A(\tilde Z_i, X_i) \mid X_i] = 0$ under both models. The final step in the derivation of (3.1) (and that in the footnote for it) is the locally efficient condition that $E[\tilde U_i \tilde R_i' \mid X_i] = E[\tilde U_i^{*} \tilde R_i' \mid X_i] = 0$.

It then remains to verify the existence of the finite moments required for the above expressions to be valid, as per Chernozhukov et al.
(2018, Lemma 2.6): these conditions are that $E[\|\Gamma(R)\|^4]$, $E[\|A(R)\|^4]$, $E[\|G(X)\|^4]$ and $E[\|\Omega(R)\|^{-2}]$ are finite (noting $R = (\tilde Z_i, X_i)$). Clearly, these conditions are satisfied by $\Gamma$ and $G$ if $\|\theta_0\|$, $\|\pi_0\|$ and $\|\delta_0\| = O(1)$ (see proof of Proposition 3.2 below), and also by $\Omega$ if $\Omega_{\theta\theta}$ (or $\Omega_{\delta\delta}$) and $\Omega_{\pi\pi}$ have positive and finite singular values (implying both are positive-definite and non-singular, see also the Proposition 3.2 proof). For the structural model,
$$\|A(R)\| = \|\tilde V_i'\pi_0\| + \|\tilde V_i\| \le (C+1)\|\tilde Z_i - \widetilde M_{0i}\| \le (C+1)\big(\|\tilde Z_i\| + \|\widetilde M_{0i}\|\big) \le 2(C+1)\|\tilde Z_i\|,$$
first because $\pi = O(1)$ (see Proposition 3.2 proof) and then because Jensen's inequality gives $\|\tilde Z_i\| \ge \|\widetilde M_{0i}\|$. Similar arguments for the reduced-form model give
$$\|A(R)\| = \|\tilde V_i\| + \|\tilde V_i\| \le 2\|\tilde Z_i - \widetilde M_{0i}\| \le 2\big(\|\tilde Z_i\| + \|\widetilde M_{0i}\|\big) \le 4\|\tilde Z_i\|.$$
Hence, all of the necessary conditions are satisfied.

A.2 Proof of Proposition 3.2

Preliminaries. For vectors $v = (v_1, \ldots, v_{r+1})'$, we use the vector norm $\|v\|_q = \big(\sum_{i=1}^{r+1} |v_i|^q\big)^{1/q}$, where the norm satisfies the Hölder inequality such that $\|ab\|_s \le \|a\|_p \|b\|_q$ for $1/p + 1/q = 1/s$ for any conformable $a$ and $b$ whose product is a vector. For real square matrices $M = (m_1, \ldots, m_p)$, where $m_k = (m_{1k}, \ldots, m_{rk})'$ is column $k$ of $M$, we use the Schatten norm $\|M\|_q = \big(\sum_{k=1}^{p} |\sigma_k(M)|^q\big)^{1/q}$, that is, the vector norm of the singular values $\sigma_1(M), \ldots, \sigma_p(M)$ of $M$. The Hölder inequality for conformable real matrices $A$ and $B$ is $\|AB\|_2^2 \le \|A\|_p \|B\|_q$ for $1/p + 1/q = 1$. Generalising to $L^q(P)$ for non-counting measure $P \in \mathcal{P}$, the norm of measurable $f(W)$ for random vector $W$ is $\|f(W)\|_{P,q} = \big(\int |f(w)|^q \,\partial P(w)\big)^{1/q}$, and for vector functional $f(W) = (f_1(W), \ldots, f_l(W))$ it is $\|f(W)\|_{P,q} = \max_k \big(\|f_k(W)\|_{P,q}\big)$, and Hölder's inequality is $\|f_i(W) f_k(W)\|_{P,s} \le \|f_i(W)\|_{P,p} \|f_k(W)\|_{P,q}$ if $1/p + 1/q = 1/s$. Regularity Condition (e) sets $\mathcal{T}_N$ to be defined by the following conditions:
• $\|\eta - \eta_0\|_{P,q} \le C$ for $q > 4$,
• $\|\eta - \eta_0\|_{P,2} \le \delta_N$,
• $\|\widetilde M - \widetilde M_0\|_{P,2}\big(\|\tilde l - \tilde l_0\|_{P,2} + \|\tilde r - \tilde r_0\|_{P,2} + \|\widetilde M - \widetilde M_0\|_{P,2}\big) \le \delta_N N^{-1/2}$.
Crudely put, these conditions are that, overall, $\mathcal{T}_N$ can shrink towards $\eta_0$ in MSE at a rate slower than $\sqrt{N}$, but $\widetilde M \tilde l$, $\widetilde M \tilde r$ and $\widetilde M \widetilde M'$ must converge at a rate at least as fast as $\sqrt{N}$. Note that throughout the labelling of bounding constants is informal but this does not affect the conclusions because these are arbitrary.

Throughout, note that we use italicised Theorem, Assumption etc. to refer to the associated results in Chernozhukov et al. (2018). The justification follows the same outline as their proof of Theorem 4.2 and, as such, ignores contributions from the residual moments $\Omega$ for $\psi(W; \theta_0, \pi_0, \eta)$ defined in Proposition 3.1.

Step 1: Verify Assumptions 3.1(a)-3.1(c). The locally efficient linear score from Proposition 3.2 can be written
$$\psi(W; \theta_0, \pi_0, \eta) = -\begin{pmatrix} v_0^*(\tilde D - \tilde r) & 0 \\ 0 & \tilde V'\tilde V \end{pmatrix} \begin{pmatrix} \theta_0 \\ \pi_0 \end{pmatrix} + \begin{pmatrix} v_0^*(\tilde Y - \tilde l) \\ \tilde V'(\tilde D - \tilde r) \end{pmatrix} \equiv \psi^a(W; \eta) \begin{pmatrix} \theta_0 \\ \pi_0 \end{pmatrix} + \psi^b(W; \eta), \tag{A.5}$$
where row vector $v_0^* = \pi_0'\tilde V' = \pi_0'(\tilde Z - \widetilde M)'$ is the combined effect of the $r$ instrumental variables on $\tilde D$. This is linear, satisfies $E_P[\psi(W; \theta_0, \pi_0, \eta_0)] = 0$ and, if it exists, is twice continuously Gateaux differentiable.

Step 2: Verify Assumption 3.1(d). That Neyman orthogonality of (A.1) with $\lambda_N = 0$ follows from Lemma 2.6 has already been verified by Proposition 3.1.

Step 3: Verify Assumptions 3.1(e) and 3.2(d).
First, consider decomposing the norm of $\psi^a_0 = \psi^a(W; \eta_0)$ in terms of the singular values of $E_P\big(\psi^a_0(W; \eta_0)\big)$:
$$\|E_P(\psi^a_0)\| = \Big| E_P\big(\pi_0'(\tilde Z - \widetilde M_0)(\tilde D - \tilde r_0)\big) \Big| + \sum_{k=1}^{r} \sigma_k\big(E_P(\tilde V_0'\tilde V_0)\big),$$
where $\sigma_k\big(E_P(\tilde V_0'\tilde V_0)\big)$ is singular value $k$ of $E_P\big[\psi^a_0(W; \eta_0)\big]$. From Regularity Condition (c), all $r + 1$ components are greater than some positive constant and so $\|E_P(\psi^a)\| \ge c$ follows. Similarly,
$$\|E_P(\psi^a)\| \le \big| E_P\{\pi_0'(\tilde Z - \widetilde M_0)(\tilde D - \tilde r_0)\} \big| + r\,\big| \rho\big(E_P(\tilde V_0'\tilde V_0)\big) \big| \le (1 + r)C,$$
where $\rho\big(E_P(\tilde V_0'\tilde V_0)\big) = \max_k \sigma_k\big(E_P(\tilde V_0'\tilde V_0)\big)$ is the maximum singular value of $E_P(\tilde V_0'\tilde V_0)$. Regularity Condition (c) also ensures that $\|E_P(\psi^a)\|$ is bounded above, as required.

Second, because $E_P(\psi_0) = 0$ it follows under (A.5) that $E_P(\psi^b_0) = E_P(\psi^a_0)\, b_0$, where $b_0 = (\theta_0, \pi_0)'$, and so $\psi^b_0 = \psi^a_0 b_0 + \epsilon$, where $b_0 = E_P^{-1}(\psi^a_0)\, E_P(\psi^b_0)$ is the coefficient of the linear projection of $\psi^b_0$ onto $\psi^a_0$ for the just-identified case, so $E_P(\epsilon) = E_P(\psi^a_0 \epsilon) = 0$. Then, using the singular value decomposition,
$$E_P(\psi^b_0 \psi^{b\prime}_0) = E_P\big(\psi^a_0 b_0 b_0' \psi^a_0 + \epsilon\epsilon'\big) = E_P\big(A\Sigma A' b_0 b_0' A'\Sigma A + \epsilon\epsilon'\big),$$
where $\Sigma$ is a diagonal matrix with diagonal elements the singular values of normal matrix $\psi^a_0$, and $A$ is a unitary matrix satisfying $A'A = AA' = I$ so that $A' b_0 b_0' A = b_0' b_0 I$. Furthermore,
$$E_P\big[A\Sigma A' b_0 b_0' A\Sigma A'\big] = E_P\big[A (b_0' b_0 \Sigma^2) A'\big],$$
which shows $E_P(\psi^b_0 \psi^{b\prime}_0)$ to be a matrix with all-positive singular values because $b_0' b_0 > 0$. Finally,
$$E_P(\psi_0 \psi_0') = 2 E_P(\psi^a_0 b_0 b_0' \psi^a_0) + E_P(\epsilon\epsilon')$$
has positive singular values because it is the sum of two matrices with positive singular values; noting that positive-definite $E_P(\epsilon\epsilon') = \mathrm{Var}_P(\epsilon)$ has positive eigenvalues.
Step 4: Assumption 3.2(a). That the following results hold, with the following limits lying in $T_N$ with probability at least $1 - \Delta_N$, is asserted to be true by Regularity Condition (e).

Step 5: Verify Assumption 3.2(b) that $m^*_N = \sup \big(E_P(\|\psi^a\|^q)\big)^{1/q} \le c_2$. From the block-diagonal structure of $\psi^a$, $\|\psi^a\| = |v_0^*(\tilde D - \tilde r)| + \mathrm{Tr}(\tilde V'\tilde V)$. Now we follow Theorem 4.2 by specifying $q/2$ and requiring that $q > 4$. Then, applying the Minkowski inequality (for the sum of two scalars) gives

$$\big(E_P\|\psi^a\|^{q/2}\big)^{2/q} = \big\|\,|v_0^*(\tilde D - \tilde r)| + \mathrm{Tr}(\tilde V'\tilde V)\big\|_{P,q/2} \le \|v_0^*(\tilde D - \tilde r)\|_{P,q/2} + \|\mathrm{Tr}(\tilde V'\tilde V)\|_{P,q/2},$$

and a further application of Minkowski gives

$$\|\mathrm{Tr}(\tilde V'\tilde V)\|_{P,q/2} = \Big\|\sum_{k=1}^{r} \sigma_k(\tilde V'\tilde V)\Big\|_{P,q/2} \le \sum_{k=1}^{r} \|\sigma_k(\tilde V'\tilde V)\|_{P,q/2} \le r\,\|\rho(\tilde V'\tilde V)\|_{P,q/2} = r\,\|\rho(\tilde V)\|^2_{P,q},$$

where $\rho(\tilde V) = \max_k \sigma_k(\tilde V)$ is the maximum singular value of $\tilde V$ and, being a linear combination of all $r$ mean-centred instrumental variables, respects the assumptions satisfied by the other instrumental variables so that $\rho(\tilde V) = \tilde Z_\rho - \tilde m_\rho$ and $\rho_0(\tilde V) = \tilde Z_\rho - \tilde m_{\rho 0}$ is its true mean-centred value. Successive applications of Minkowski's inequality to $\|\rho(\tilde V)\|_{P,q}$ give

$$\|\rho(\tilde V)\|_{P,q} = \|\tilde Z_\rho - \tilde m_{\rho 0} - (\tilde m_\rho - \tilde m_{\rho 0})\|_{P,q} \le \|\rho_0(\tilde V)\|_{P,q} + \|\tilde m_\rho - \tilde m_{\rho 0}\|_{P,q} \le \|\tilde Z_\rho\|_{P,q} + \|\tilde m_{\rho 0}\|_{P,q} + \|\tilde m_\rho - \tilde m_{\rho 0}\|_{P,q},$$

and Jensen's inequality gives $\|\tilde m_{\rho 0}\|_{P,q} \le \|\tilde Z_\rho\|_{P,q}$ so that

$$\|\tilde Z_\rho\|_{P,q} + \|\tilde m_{\rho 0}\|_{P,q} + \|\tilde m_\rho - \tilde m_{\rho 0}\|_{P,q} \le 2\|\tilde Z_\rho\|_{P,q} + \|\tilde m_\rho - \tilde m_{\rho 0}\|_{P,q} \le 2C + C.$$
Before moving on to the second component, note that $\pi_0$ and $\theta_0$ can be bounded empirically as follows:

$$\|\pi_0\| = \frac{\|\tilde V_0'(\tilde D - \tilde r_0)\|_{P,1}}{\|\tilde V_0'\tilde V_0\|_{P,1}} \le r^{-1} c_0^{-1}\,\|\tilde V_0'(\tilde D - \tilde r_0)\|_{P,1},$$

because from Regularity Condition (c) it follows that $\|\tilde V_0'\tilde V_0\|_{P,1} \ge r c_0$. Then Hölder's inequality followed by Minkowski's and then Jensen's inequalities further gives

$$\begin{aligned}
\|\pi_0\| &\le r^{-1} c_0^{-1}\,\|\tilde V_0\|_{P,2}\,\|\tilde D - \tilde r_0\|_{P,2} \\
&\le r^{-1} c_0^{-1}\,\big(\|\tilde Z\|_{P,2} + \|\tilde M_0\|_{P,2}\big)\big(\|\tilde D\|_{P,2} + \|\tilde r_0\|_{P,2}\big) \\
&\le r^{-1} c_0^{-1}\,\big(\|\tilde Z\|_{P,2} + \|\tilde Z\|_{P,2}\big)\big(\|\tilde D\|_{P,2} + \|\tilde D\|_{P,2}\big) \le 4 r^{-1} c_0^{-1} C^2 \equiv \alpha,
\end{aligned}$$

with the final inequality following from Regularity Condition (b) that $\|\tilde Z\|_{P,2}$ and $\|\tilde D\|_{P,2} \le C$. Similarly, $\theta_0 = E_P[v_0^*(\tilde Y - \tilde l_0)] / E_P[v_0^*(\tilde D - \tilde r_0)]$ and so applying Minkowski's inequality

$$|\theta_0| = \frac{\|v_0^*(\tilde Y - \tilde l_0)\|_{P,1}}{\|v_0^*(\tilde D - \tilde r_0)\|_{P,1}} \le c_0^{-1}\,\|\pi_0\|\,\|\tilde Z - \tilde M_0\|_{P,2}\,\|\tilde Y - \tilde l_0\|_{P,2} \le r\alpha^2,$$

with the first inequality following from the lower bound on $\pi_0$, the second from Hölder's inequality combined with the norm property $\|\pi_0\|_2 \le \|\pi_0\|$, and the final one from Regularity Condition (b).
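The bounds above rest on three tools applied to the $\|\cdot\|_{P,q}$ norms: Minkowski's inequality, Hölder's inequality with $1/p + 1/q = 1/s$, and the Jensen-type contraction $\|E(Z \mid X)\|_{P,q} \le \|Z\|_{P,q}$. The following is a minimal numerical sketch of these three inequalities, assuming $P$ is replaced by an empirical distribution and the conditional mean by a group mean; the simulated data and the helper name `lp_norm` are illustrative and not part of the paper.

```python
import random


def lp_norm(xs, q):
    """Empirical L_{P,q} norm: (mean of |x|^q)^(1/q)."""
    return (sum(abs(x) ** q for x in xs) / len(xs)) ** (1.0 / q)


random.seed(0)
n = 1000
f = [random.gauss(0.0, 1.0) for _ in range(n)]
g = [random.gauss(1.0, 2.0) for _ in range(n)]

# Minkowski: ||f + g||_q <= ||f||_q + ||g||_q (here q = 4).
mink_lhs = lp_norm([a + b for a, b in zip(f, g)], 4.0)
mink_rhs = lp_norm(f, 4.0) + lp_norm(g, 4.0)

# Hoelder with 1/p + 1/q = 1/s: ||f g||_2 <= ||f||_4 ||g||_4.
hold_lhs = lp_norm([a * b for a, b in zip(f, g)], 2.0)
hold_rhs = lp_norm(f, 4.0) * lp_norm(g, 4.0)

# Jensen: replacing each value by its group mean (a stand-in for a
# conditional expectation given the group) cannot increase the norm.
group_size = 10
group_means = []
for start in range(0, n, group_size):
    block = g[start:start + group_size]
    group_means.extend([sum(block) / len(block)] * len(block))
jens_lhs = lp_norm(group_means, 4.0)
jens_rhs = lp_norm(g, 4.0)

print(mink_lhs <= mink_rhs, hold_lhs <= hold_rhs, jens_lhs <= jens_rhs)
```

All three comparisons hold for any sample, because an empirical distribution is itself a probability measure; the group-mean step mirrors the argument that bounds the norm of a conditional mean such as $\tilde M_0$ or $\tilde r_0$ by the norm of $\tilde Z$ or $\tilde D$.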
Now returning to the second component $|v_0^*(\tilde D - \tilde r)|$, successively applying Minkowski's, Hölder's, Minkowski's and Jensen's inequalities gives

$$\begin{aligned}
\big[E_P\big(|v_0^*(\tilde D - \tilde r)|^{q/2}\big)\big]^{2/q} &= \|v_0^*(\tilde D - \tilde r_0) - v_0^*(\tilde r - \tilde r_0)\|_{P,q/2} \\
&\le \|v_0^*(\tilde D - \tilde r_0)\|_{P,q/2} + \|v_0^*(\tilde r - \tilde r_0)\|_{P,q/2} \\
&\le \|\pi_0\|_q\,\|\tilde Z - \tilde M_0 - (\tilde M - \tilde M_0)\|_{P,q}\,\big(\|\tilde D - \tilde r_0\|_{P,q} + \|\tilde r - \tilde r_0\|_{P,q}\big) \\
&\le \|\pi_0\|_q\,\big(\|\tilde Z\|_{P,q} + \|\tilde M_0\|_{P,q} + \|\tilde M - \tilde M_0\|_{P,q}\big)\big(\|\tilde D\|_{P,q} + \|\tilde r_0\|_{P,q} + \|\tilde r - \tilde r_0\|_{P,q}\big) \\
&\le \|\pi_0\|_q\,\big(2\|\tilde Z\|_{P,q} + \|\tilde M - \tilde M_0\|_{P,q}\big)\big(2\|\tilde D\|_{P,q} + \|\tilde r - \tilde r_0\|_{P,q}\big) \\
&\le \alpha\,(3C)(3C) = \alpha\beta^2,
\end{aligned}$$

where the final inequality follows from Regularity Conditions (c) and (e) together with Jensen's inequality to give $\|\tilde Z\|_{P,q} + \|\tilde M_0\|_{P,q} \le 2\|\tilde Z\|_{P,q} \le 2C$ and $\|\tilde D\|_{P,q} + \|\tilde r_0\|_{P,q} \le 2\|\tilde D\|_{P,q} \le 2C$, and $3C \equiv \beta$. Therefore, because $r$ is fixed, $\big(E_P\|\psi^a\|^{q/2}\big)^{2/q} \le \beta^2(r + \alpha) < c_2$, noting that both $\alpha$ and $\beta^2$ are $O(C^2)$.

Now we show that $m_N = \sup\,(E_P\|\psi\|^q)^{1/q} \le c$, where

$$\psi = \begin{pmatrix} v_0^*\tilde U \\ \tilde V'\tilde R \end{pmatrix} = \begin{pmatrix} v_0^*\tilde U_0 \\ \tilde V'\tilde R_0 \end{pmatrix} - \begin{pmatrix} v_0^*\big(\tilde l - \tilde l_0 + (\tilde r - \tilde r_0)\theta_0\big) \\ \tilde V'\big(\tilde r - \tilde r_0 + (\tilde V - \tilde V_0)\pi_0\big) \end{pmatrix} \equiv \psi_0 - \psi_1.$$

Here $\tilde U_0 = \tilde Y - \tilde l_0 - (\tilde D - \tilde r_0)\theta_0$, $\tilde R_0 = \tilde D - \tilde r_0 - \tilde V_0\pi_0$ and $\tilde V_0 = \tilde Z - \tilde M_0$ are the true model residuals, and $\tilde U = \tilde Y - \tilde l - (\tilde D - \tilde r)\theta_0$, $\tilde R = \tilde D - \tilde r - \tilde V\pi_0$ and $\tilde V = \tilde Z - \tilde M$.
Applying the Minkowski inequality gives

$$\big(E_P\|\psi\|^{q/2}\big)^{2/q} = \|\psi\|_{P,q/2} \le \|\psi_0\|_{P,q/2} + \|\psi_1\|_{P,q/2},$$

and then applying Minkowski's followed by Hölder's inequalities followed by Regularity Condition (c) to $\|\psi_0\|_{P,q/2}$ gives

$$\begin{aligned}
\|\psi_0\|_{P,q/2} &= \Big\{E_P\Big(|v_0^*\tilde U_0|^{q/2} + \sum_{k=1}^{r} |\tilde v_k'\tilde R_0|^{q/2}\Big)\Big\}^{2/q} \le \|v_0^*\tilde U_0\|_{P,q/2} + r\,\|\breve v'\tilde R_0\|_{P,q/2} \\
&\le \|\pi_0\|_q\,\|\tilde U_0\|_{P,q} + r\,\|\breve v\|_{P,q}\,\|\tilde R_0\|_{P,q} \le \alpha\beta\,\|\tilde U_0\|_{P,q} + r\beta\,\|\tilde R_0\|_{P,q},
\end{aligned}$$

where $\|\pi_0\|_q \le \alpha$, $\breve v = \arg\max_{\tilde v_k} \|\tilde v_k\|_{P,q}$, $\|\breve v\|_{P,q} \le \beta$ and $\|\tilde v_0^*\|_{P,q} \le \|\pi_0\|\,\|\tilde Z - \tilde M_0 - (\tilde M - \tilde M_0)\|_{P,q} \le \alpha\beta$. Successive applications of these inequalities along the lines above further give

$$\begin{aligned}
\|\psi_1\|_{P,q/2} &\le \big\|v_0^*\big(\tilde l - \tilde l_0 + (\tilde r - \tilde r_0)\theta_0\big)\big\|_{P,q/2} + r\,\big\|\breve v'\big((\tilde r - \tilde r_0) + (\tilde V - \tilde V_0)\pi_0\big)\big\|_{P,q/2} \\
&\le \alpha\beta\,\|\tilde l - \tilde l_0\|_{P,q} + \alpha\,\|\tilde r - \tilde r_0\|_{P,q}\,r\alpha^2 + r\beta\,\|\tilde r - \tilde r_0\|_{P,q} + r\beta\,\|\tilde r - \tilde r_0\|_{P,q} + r\beta\,\|\tilde V - \tilde V_0\|_{P,q}\,\alpha \\
&\le C\alpha\beta + rC\alpha^3 + 2rC\beta + rC\alpha\beta,
\end{aligned}$$

because $\|\theta_0\|_{P,q} \le r\alpha^2$, so that

$$\|\psi\|_{P,q/2} \le \alpha\beta\,\|\tilde U_0\|_{P,q} + r\beta\,\|\tilde R_0\|_{P,q} + C\alpha\beta + rC\alpha^3 + 2rC\beta + rC\alpha\beta.$$

Finally, $\|\tilde U_0\|_{P,q} \le \|\tilde Y - \tilde l_0\|_{P,q} + \|(\tilde D - \tilde r_0)\theta_0\|_{P,q} \le 2C + 2C\|\theta_0\|_q \le 2C(1 + r\alpha^2)$, and $\|\tilde R_0\|_{P,q} \le \|\tilde D - \tilde r_0\|_{P,q} + \|(\tilde Z - \tilde M_0)\pi_0\|_{P,q} \le 2C + 2C\|\pi_0\|_q \le 2C(1 + \alpha)$. Therefore,

$$\begin{aligned}
\|\psi\|_{P,q/2} &\le \alpha\beta\,2C(1 + r\alpha^2) + r\beta\,2C(1 + \alpha) + C\alpha\beta + rC\alpha^3 + 2rC\beta + rC\alpha\beta \\
&= C\beta\big(2\alpha + 2r\alpha^3 + 2r(1 + \alpha) + (1 + r)\alpha + (r\alpha^3)/\beta + 2r\big) < c_2.
\end{aligned}$$

This implies that $c_2 = O(C^7)$ because $\alpha^3/\beta = O(C^5)$.

Step 6: Verify Assumption 3.2(c).
Now we show that $\|E_P(\psi^a) - E_P(\psi^a_0)\|$ converges at rate $\delta_N$, where

$$\psi(W; \theta_0, \pi_0, \eta) = -\begin{pmatrix} v_0^*(\tilde D - \tilde r) & 0_r' \\ 0_r & \tilde V'\tilde V \end{pmatrix} \begin{pmatrix} \theta_0 \\ \pi_0 \end{pmatrix} + \begin{pmatrix} v_0^*(\tilde Y - \tilde l) \\ \tilde V'(\tilde D - \tilde r) \end{pmatrix} \equiv \psi^a(W; \eta) \begin{pmatrix} \theta_0 \\ \pi_0 \end{pmatrix} + \psi^b(W; \eta),$$

and, taking the top-left and bottom-right diagonals in turn,

$$\begin{aligned}
&\big\|E_P[v_0^*(\tilde D - \tilde r)] - E_P[\pi_0'(\tilde Z - \tilde M_0)'(\tilde D - \tilde r_0)]\big\| \\
&\quad\le \big\|{-\pi_0'(\tilde M - \tilde M_0)'(\tilde D - \tilde r_0)} - \pi_0'(\tilde Z - \tilde M_0)'(\tilde r - \tilde r_0) + \pi_0'(\tilde M - \tilde M_0)'(\tilde r - \tilde r_0)\big\|_{P,1} \\
&\quad\le \|\pi_0'(\tilde M - \tilde M_0)'(\tilde D - \tilde r_0)\|_{P,1} + \|\pi_0'(\tilde Z - \tilde M_0)'(\tilde r - \tilde r_0)\|_{P,1} + \|\pi_0'(\tilde M - \tilde M_0)'(\tilde r - \tilde r_0)\|_{P,1} \\
&\quad\le \|\pi_0\|_2\,\big(\|\tilde M - \tilde M_0\|_{P,2}\,\|\tilde D - \tilde r_0\|_{P,2} + \|\tilde Z - \tilde M_0\|_{P,2}\,\|\tilde r - \tilde r_0\|_{P,2} + \|\tilde M - \tilde M_0\|_{P,2}\,\|\tilde r - \tilde r_0\|_{P,2}\big) \\
&\quad\le \alpha\,\big(2C\delta_N + 2C\delta_N + \delta_N N^{-1/2}\big) \le \alpha\,\big(2C\delta_N + 2C\delta_N + \delta_N\big) = \delta_N(4C + 1)\alpha,
\end{aligned}$$

and

$$\begin{aligned}
&\big\|E_P[(\tilde Z - \tilde M)'(\tilde Z - \tilde M)] - E_P[(\tilde Z - \tilde M_0)'(\tilde Z - \tilde M_0)]\big\| \\
&\quad\le \big\|{-2}(\tilde M - \tilde M_0)'(\tilde Z - \tilde M_0) + (\tilde M - \tilde M_0)'(\tilde M - \tilde M_0)\big\|_{P,1} \\
&\quad\le 2\,\|(\tilde M - \tilde M_0)'(\tilde Z - \tilde M_0)\|_{P,1} + \|(\tilde M - \tilde M_0)'(\tilde M - \tilde M_0)\|_{P,1} \\
&\quad\le 2\,\|\tilde M - \tilde M_0\|_{P,2}\,\|\tilde Z - \tilde M_0\|_{P,2} + \|\tilde M - \tilde M_0\|^2_{P,2} \\
&\quad\le \delta_N(4C + N^{-1/2}) \le \delta_N(4C + 1) \le \delta_N(4C + 1)\alpha,
\end{aligned}$$

by successive applications of Minkowski's and Hölder's inequalities and Regularity Conditions (c) and (e).
Now we show $\big[E_P\|\psi - \psi_0\|^2\big]^{1/2}$ converges at the same rate:

$$\begin{aligned}
\|\psi - \psi_0\|^2_{P,2} &= \left\|\begin{pmatrix} v_0^*\tilde U - \pi_0'\tilde V_0'\tilde U_0 \\ \tilde V'\tilde R - \tilde V_0'\tilde R_0 \end{pmatrix}\right\|^2_{P,2} \\
&= \|v_0^*\tilde U - \pi_0'\tilde V_0'\tilde U_0\|^2_{P,2} + \sum_{k=1}^{r} \big\|(\tilde z_k - \tilde m_k)'\tilde R - (\tilde z_k - \tilde m_{k0})'\tilde R_0\big\|^2_{P,2} \\
&= \Big\|\pi_0'\big(\tilde V_0 - (\tilde M - \tilde M_0)\big)'\big(\tilde U_0 - (\tilde l - \tilde l_0) + (\tilde r - \tilde r_0)\theta_0\big) - \pi_0'\tilde V_0'\tilde U_0\Big\|^2_{P,2} \\
&\quad + \sum_{k=1}^{r} \Big\|\big(\tilde v_{k0} - (\tilde m_k - \tilde m_{k0})\big)'\big(\tilde R_0 - (\tilde r - \tilde r_0) + (\tilde M - \tilde M_0)\pi_0\big) - \tilde v_{k0}'\tilde R_0\Big\|^2_{P,2} \\
&= \Big\|\pi_0'\Big\{(\tilde M - \tilde M_0)'\tilde U_0 + \big(\tilde V_0 - (\tilde M - \tilde M_0)\big)'\big(\tilde l - \tilde l_0 - (\tilde r - \tilde r_0)\theta_0\big)\Big\}\Big\|^2_{P,2} \\
&\quad + \sum_{k=1}^{r} \Big\|(\tilde m_k - \tilde m_{k0})'\tilde R_0 + \big(\tilde v_{k0} - (\tilde m_k - \tilde m_{k0})\big)'\big((\tilde r - \tilde r_0) - (\tilde M - \tilde M_0)\pi_0\big)\Big\|^2_{P,2} \\
&\le \Big\|\pi_0'\Big\{(\tilde M - \tilde M_0)'\tilde U_0 + \big(\tilde V_0 - (\tilde M - \tilde M_0)\big)'\big(\tilde l - \tilde l_0 - (\tilde r - \tilde r_0)\theta_0\big)\Big\}\Big\|^2_{P,2} \\
&\quad + r \max_k \Big\|(\tilde m_k - \tilde m_{k0})'\tilde R_0 + \big(\tilde v_{k0} - (\tilde m_k - \tilde m_{k0})\big)'\big((\tilde r - \tilde r_0) - (\tilde M - \tilde M_0)\pi_0\big)\Big\|^2_{P,2}.
\end{aligned}$$

Then, applying Hölder's inequality gives an expression of the form

$$\|\psi - \psi_0\|^2_{P,2} \le \|\pi_0\|^2_1\,\big(\|A_1\|^2_{P,2} + \ldots + \|A_6\|^2_{P,2}\big) + r\,\big(\|B_1\|^2_{P,2} + \ldots + \|B_6\|^2_{P,2}\big),$$

based on the six cross-products of $\tilde M - \tilde M_0$, $\tilde V_0$, $\tilde l - \tilde l_0$ and $\tilde r - \tilde r_0$ in the first expression and similarly for the second.
For example, the following cross-product involving two nuisance functions satisfies

$$\|A_1\|^2_{P,2} \le \|\tilde M - \tilde M_0\|^2_{P,2}\,\|\tilde U_0\|^2_{P,2}\,\|\tilde l - \tilde l_0\|^2_{P,2} \le (C\delta_N N^{-1/2})^2,$$

as do the other seven, and the cross-product involving only one nuisance parameter satisfies

$$\|B_5\|^2_{P,2} \le \|\tilde v_0\|^2_{P,2}\,\|\tilde r - \tilde r_0\|^2_{P,2} \le (2C\delta_N)^2,$$

as do the other three (these follow from Hölder's inequality and Regularity Conditions (c) and (e)). Regularity Condition (d) gives $\|\tilde U_0\|_{P,2} = \big(E_P(\tilde U_0^2)\big)^{1/2} \le \sqrt{C}$ (and ditto for $\tilde R_0$, $\tilde V_0$ and $\breve v_0$) and so the convergence rate of the product of the two nuisance parameters is $\delta_N N^{-1/2}$. Hence,

$$\|\psi - \psi_0\|^2_{P,2} \le \sqrt{O(C^2)\,\Big(O(C^2)\big(1 + o(N^{-1})\big)\delta_N^2\Big)\Big(O(C^2)\big(1 + o(N^{-1})\big)\delta_N^2\Big)} = O(C^2\delta_N)$$

because $\|\pi_0\|^2_1 \le \alpha^2 = O(C^4)$. Finally, it remains to show

$$\sup_{d \in (0,1),\ \eta \in T_N} \big\|\partial_d^2\,E_P\big[\psi\big(W; \theta_0, \eta_0 + (\eta - \eta_0)d\big)\big]\big\| \le \delta_N N^{-1/2}.$$

Here,

$$f(d) = E_P\begin{pmatrix} \pi_0'\{\tilde V_0 - d(\tilde M - \tilde M_0)\}'\{\tilde U_0 - d(\tilde l - \tilde l_0) + d(\tilde r - \tilde r_0)\theta_0\} \\ \{\tilde V_0 - d(\tilde M - \tilde M_0)\}'\{\tilde R_0 - d(\tilde r - \tilde r_0) + d(\tilde M - \tilde M_0)\pi_0\} \end{pmatrix},$$

and its second derivative is

$$\partial_d^2 f(d) = E_P\begin{pmatrix} 2\pi_0'(\tilde M - \tilde M_0)'\{(\tilde l - \tilde l_0) - (\tilde r - \tilde r_0)\theta_0\} \\ 2(\tilde M - \tilde M_0)'\{(\tilde r - \tilde r_0) - (\tilde M - \tilde M_0)\pi_0\} \end{pmatrix}.$$

Straightforward application of Minkowski's and Hölder's inequalities and Regularity Condition (e) gives

$$\Big|E_P\big[2\pi_0'(\tilde M - \tilde M_0)'\{(\tilde l - \tilde l_0) - (\tilde r - \tilde r_0)\theta_0\}\big]\Big| \le 2\alpha(1 + r\alpha)\,\delta_N N^{-1/2},$$

and

$$\Big\|E_P\big[2(\tilde M - \tilde M_0)'\{(\tilde r - \tilde r_0) - (\tilde M - \tilde M_0)\pi_0\}\big]\Big\| \le 2(1 + \alpha)\,\delta_N N^{-1/2} \le 2\alpha(1 + r\alpha)\,\delta_N N^{-1/2},$$

so that $\delta_N \equiv 2\alpha(1 + r\alpha)\,\delta_N N^{-1/2}$. The conditions set out in Assumption 3.1 and Assumption 3.2 are all satisfied, and so Theorem 3.1 holds for the DML estimator.
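The second-derivative expression for the first score component can be checked numerically in a scalar analogue. The sketch below assumes scalar stand-ins for $\tilde V_0$, $\tilde U_0$ and the nuisance errors, and uses the path $\tilde U(d) = \tilde U_0 - d(\tilde l - \tilde l_0) + d(\tilde r - \tilde r_0)\theta_0$ implied by $\tilde U = \tilde Y - \tilde l - (\tilde D - \tilde r)\theta_0$; all numerical values and function names are hypothetical.

```python
def score_path(d, v0, dm, u0, dl, dr, theta0):
    """Scalar analogue of the first component of f(d):
    {V0 - d*(M - M0)} * {U0 - d*(l - l0) + d*(r - r0)*theta0}."""
    return (v0 - d * dm) * (u0 - d * dl + d * dr * theta0)


def second_difference(fun, d, h=1e-3):
    """Central finite-difference approximation of the second derivative."""
    return (fun(d + h) - 2.0 * fun(d) + fun(d - h)) / (h * h)


# Hypothetical values: dm, dl, dr play the role of the nuisance errors
# M - M0, l - l0 and r - r0.
v0, dm, u0, dl, dr, theta0 = 0.7, 0.2, -1.3, 0.05, -0.1, 1.5

numeric = second_difference(
    lambda d: score_path(d, v0, dm, u0, dl, dr, theta0), 0.4)
analytic = 2.0 * dm * (dl - dr * theta0)

print(numeric, analytic)
```

Because the path is quadratic in $d$, the finite difference agrees with the closed form up to rounding. The closed form is a product of two nuisance errors, which is why the bound is of order $\delta_N N^{-1/2}$ rather than $\delta_N$; setting `dm = 0.0` makes the second derivative vanish.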
B Algorithm for Panel IV DML

The panel IV DML algorithm for the estimation and statistical inference of $b_0 = (\theta_0, \pi_0)'$ and of $b^{rf}_0 = (\delta_0, \pi_0)'$ follows the steps below. A summary is provided in Algorithm 1.

Stage 1 (Panel dataset) Consider a dataset with $N$ cross-sectional units (e.g., households, firms, countries, regions) observed repeatedly over multiple time periods such that $T \ge 2$. Apply the FD (exact) approach to remove the unobserved individual heterogeneity from the model equations. This consists in taking the first difference of the variables $\{Y_{it}, D_{it}, Z_{it}\}$, and generating the first-order lag of $X_{it}$.

Stage 2 (Block-k sample splitting and cross-fitting) Randomly partition the $N$ cross-sectional units into $K$ folds of the same size. For each fold $k = 1, \ldots, K$ denote $W_k \subset W$ as the estimation sample, and $W^c_k$ as its complement, where $N_k \equiv |W_k| = N/K$ and $|W^c_k| = N - N_k$. The folds are mutually exclusive and exhaustive such that $W_k \cap W_j = W_k \cap W^c_k = \emptyset$ and $W_k \cup W^c_k = W_1 \cup \ldots \cup W_K = W$. For $K > 2$, the larger complementary sample $W^c_k$ is used to learn the potentially complex nuisance parameters $\eta_0$, and $W_k$ for the relatively simple task of estimating the parameters $b_0$ and $b^{rf}_0$.

Stage 3 (Learning the nuisance parameters) For each fold $k$, use ML algorithms to predict the vector of nuisance parameters $\eta$ from the complementary data $\{W_i : i \in W^c_k\}$.

Stage 4 (Construction of Neyman-orthogonal score) Use the predicted nuisances $\hat\eta_k$ from Stage 3 to construct Neyman orthogonal scores $s(W_k, b_k; \hat\eta_k)$ for the structural model and $s^{rf}(W_k, b^{rf}_k; \hat\eta_k)$ for the reduced-form model using the estimation sample in fold $k$ ($W_k$).
Stage 5 (Estimation and inference on the parameters) The panel IV DML estimators $\hat b_k$ and $\hat b^{rf}_k$ are the unique solutions to $(K N_k)^{-1} \sum_{k=1}^{K} \sum_{i \in W_k} \sum_{t \in T} s(W_{it}, b_k; \hat\eta_k) = 0$ and $(K N_k)^{-1} \sum_{k=1}^{K} \sum_{i \in W_k} \sum_{t \in T} s^{rf}(W_{it}, b^{rf}_k; \hat\eta_k) = 0$. A finite-sample correction of $(\hat b_k - K^{-1} \sum_k \hat b_k)^2$ is applied to the average variance across the $k$ folds, $\hat\Sigma^2$, weighted by the number of units in the cluster, to account for the variation introduced by sample splitting (Chernozhukov et al., 2018, p. C30).

Stage 6 (Iteration) Repeat Stages 3-5 for each of the $K$ folds and average the intermediate results.

Algorithm 1: Panel IV DML Algorithm
Require: Panel dataset with $N$ subjects, and $T \ge 2$ time periods.
Input: Data $\{Y_{it}, D_{it}, Z_{it}, X_{it}\}$, panel and time identifiers, and cluster variable if different from panel identifier.
Output: $\hat b = (\hat\theta, \hat\pi)'$, $\hat b^{rf} = (\hat\delta, \hat\pi)'$, $\hat\Sigma$, $\hat\Sigma^{rf}$, $F_{DML}$, $AR(\hat\theta)_{DML}$, $CS_{AR}(\hat\theta)$, model RMSE, RMSE of $l, r, m$.
Initialize: Set number of folds $K \ge 2$; unit $i$ randomly assigned to fold $k$. Assign learners for nuisance parameters $\eta = \{l, r, m\}$.
1 Divide the sample into $K$ folds, and randomly assign unit $i$ and its time series to fold $k$. Define the estimation sample in fold $k$ ($W_k$) with size $|N_k|$, and the complementary sample in folds $-k$ ($W^c_k$).
2 for $k \leftarrow 1$ to $K$ do
3     Predict $\hat\eta_k$ using base learners on data $W^c_k$.
4     Construct $s(W_k, b_k; \hat\eta_k)$ and $s^{rf}(W_k, b^{rf}_k; \hat\eta_k)$.
5     Solve $\frac{1}{|N_k|} \sum_{i \in W_k} s(W_k, b_k; \hat\eta_k) = 0$ and $\frac{1}{|N_k|} \sum_{i \in W_k} s^{rf}(W_k, b^{rf}_k; \hat\eta_k) = 0$ for $\hat b_k$ and $\hat b^{rf}_k$, respectively.
6 end
7 Average $\{\hat b_k, \hat b^{rf}_k\}_{k=1}^{K}$ and model RMSE, compute finite-sample variance-covariance matrix $\hat\Sigma$, $\hat\Sigma^{rf}$ and RMSE of $l, r, m$.
8 Compute $F_{DML}$, $AR(\hat\theta)_{DML}$, $CS_{AR}(\hat\theta)$.
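The estimation loop in Stages 1 to 6 can be sketched in a few lines. This is a minimal illustration under strong simplifying assumptions, not the authors' implementation: there is a single instrument, the nuisance learner is a placeholder that predicts the training-sample mean (any ML learner with the same call signature could be swapped in), fold-level estimates are averaged as in lines 5 and 7 of Algorithm 1, and variance estimation and the weak-IV diagnostics are omitted. All function names and the toy data are hypothetical.

```python
import random


def first_difference(panel):
    """Stage 1. panel maps unit -> time-ordered list of (y, d, z, x).
    Returns rows (unit, dy, dd, dz, x_t, x_lag): first differences of
    Y, D, Z plus the current and lagged covariate."""
    rows = []
    for unit, obs in panel.items():
        for prev, cur in zip(obs, obs[1:]):
            rows.append((unit, cur[0] - prev[0], cur[1] - prev[1],
                         cur[2] - prev[2], cur[3], prev[3]))
    return rows


def assign_folds(units, n_folds, seed=0):
    """Stage 2. Assign whole units (all their periods) to folds at random."""
    shuffled = sorted(units)
    random.Random(seed).shuffle(shuffled)
    return {u: j % n_folds for j, u in enumerate(shuffled)}


def mean_learner(values):
    """Placeholder learner (assumption): predicts the training mean."""
    mean = sum(values) / len(values)
    return lambda x_t, x_lag: mean


def solve_theta(v_res, y_res, d_res):
    """Single-instrument solution of the linear score for theta:
    sum(v * (y - l)) / sum(v * (d - r))."""
    num = sum(v * y for v, y in zip(v_res, y_res))
    den = sum(v * d for v, d in zip(v_res, d_res))
    return num / den


def panel_iv_dml(panel, n_folds=2, seed=0):
    rows = first_difference(panel)
    fold_of = assign_folds(panel.keys(), n_folds, seed)
    thetas = []
    for k in range(n_folds):
        train = [r for r in rows if fold_of[r[0]] != k]  # complement W_k^c
        test = [r for r in rows if fold_of[r[0]] == k]   # estimation sample W_k
        l_hat = mean_learner([r[1] for r in train])       # Stage 3
        r_hat = mean_learner([r[2] for r in train])
        m_hat = mean_learner([r[3] for r in train])
        y_res = [r[1] - l_hat(r[4], r[5]) for r in test]  # Stage 4
        d_res = [r[2] - r_hat(r[4], r[5]) for r in test]
        v_res = [r[3] - m_hat(r[4], r[5]) for r in test]
        thetas.append(solve_theta(v_res, y_res, d_res))   # Stage 5
    return sum(thetas) / n_folds, fold_of                 # Stage 6


# Toy panel: 6 units, 3 periods, outcome equal to exactly 2 * treatment.
rng = random.Random(1)
toy = {}
for unit in range(6):
    obs = []
    for _ in range(3):
        z = rng.gauss(0.0, 1.0)
        d = z + rng.gauss(0.0, 0.5)
        obs.append((2.0 * d, d, z, rng.gauss(0.0, 1.0)))
    toy[unit] = obs

theta_hat, folds = panel_iv_dml(toy)
print(theta_hat)
```

Folds are assigned at the unit level so that a unit's whole time series sits either in the estimation sample or in its complement, which is the point of block-k splitting. In the toy panel the outcome is an exact multiple of the treatment, so the sketch recovers the coefficient regardless of how weak the placeholder learner is.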
Stage 7 (Robust weak-IV diagnostics) Calculate the first-stage F-statistic $F_{DML}$, the AR test statistic $AR(\hat\theta)_{DML}$, and the AR confidence set $CS_{AR}(\hat\theta)$. Compare the test statistics with the appropriate critical values at the desired levels of significance to make inferences about the causal parameter of interest.

C Additional Tables for Tabellini (2020)

In this section, we discuss additional results based on conventional 2SLS estimators using fixed effects (FE) and first-differences (FD) transformations in the reanalysis of Tabellini (2020). This comparison, briefly discussed in Section 4.1, is a necessary robustness step to verify that any differences found with panel IV DML estimates mainly reflect methodological rather than specification choices. Table C.1 presents the results of this supplementary check. Panel A reports baseline specifications while Panel B, not originally implemented, displays the augmented specifications with additional control variables included simultaneously. The dependent variables are DW Nominate Score (Columns 1-2) for the political outcome, and Log Occupational Score (Columns 3-4) for the economic outcome.

Panel A of Table C.1 reports the original 2SLS estimates with FE in Columns 1 and 3 (originally from Table 3, Column 4, Panel B, and Table 5, Column 1, Panel B of Tabellini, 2020) alongside our 2SLS estimates with FD in Columns 2 and 4. The first-stage F-statistics from the 2SLS specifications with FD for both outcomes, although somewhat smaller than those from FE, always exceed Stock and Yogo (2005)'s critical value of 16.30, although only slightly for the political outcome with the FD estimator.
The results from Anderson-Rubin (AR) test statistics and AR confidence sets (bounded without zero included), not computed in the original study, support the existence of a strong positive effect of immigration on both outcome variables with either 2SLS estimator. Given the strength of the instruments, the second-stage coefficients (close in magnitude for both panel estimators) suggest that higher immigration shares led to the election of more conservative representatives, while employment gains for natives, induced by European immigrants, were accompanied by occupational or skill upgrading.

Footnote 27 (on the first-stage F-statistics in Panel A): Using a more conservative threshold of 104.70 for the strength of the instrument (Lee et al., 2022), only the specification for the economic outcome estimated with 2SLS with FE (Column 3 in Panel A) passes the test for instrumental relevance.

Table C.1. The political and economic effects of immigration

| | (1) DW Nominate Score, FE | (2) DW Nominate Score, FD | (3) Log Occupational Score, FE | (4) Log Occupational Score, FD |
|---|---|---|---|---|
| Panel A: Baseline specification with fixed-effects interactions | | | | |
| Second-stage results | | | | |
| Fr. Immigrants | 1.658** | 1.772** | 0.097*** | 0.095*** |
| | (0.808) | (0.829) | (0.036) | (0.042) |
| AR CS 95% | [0.298, 3.499] | [0.049, 3.495] | [0.030, 0.162] | [0.017, 0.173] |
| First-stage results | | | | |
| Shift-Share IV | 1.007*** | 0.965*** | 0.993*** | 0.933*** |
| | (0.209) | (0.226) | (0.061) | (0.094) |
| F stat | 23.11 | 18.23 | 251.31 | 99.45 |
| AR χ² stat | 4.71 | 3.84 | 8.27 | 5.40 |
| AR χ² p-value | 0.030 | 0.050 | 0.004 | 0.020 |
| Model RMSE | 0.015 | 0.232 | 0.013 | 0.015 |
| Observations | 460 | 303 | 538 | 342 |
| No. clusters | 157 | 157 | 127 | 125 |
| Panel B: Specification with additional controls (not originally implemented) | | | | |
| Second-stage results | | | | |
| Fr. Immigrants | 1.736 | 0.928 | 0.091* | 0.094 |
| | (1.902) | (2.103) | (0.049) | (0.060) |
| AR CS 95% | [-2.143, 6.070] | [-4.276, 4.883] | [-0.004, 0.181] | [-0.032, 0.207] |
| First-stage results | | | | |
| Shift-Share IV | 0.544*** | 0.541*** | 0.794*** | 0.745*** |
| | (0.143) | (0.178) | (0.083) | (0.117) |
| F stat | 14.62 | 9.26 | 85.79 | 40.89 |
| AR χ² stat | 0.82 | 0.18 | 3.38 | 2.20 |
| AR χ² p-value | 0.366 | 0.668 | 0.066 | 0.138 |
| Model RMSE | 0.013 | 0.218 | 0.011 | 0.019 |
| Observations | 451 | 297 | 526 | 338 |
| No. clusters | 154 | 154 | 126 | 125 |

Note: The table displays the estimates of baseline specifications from Table 3 (column 4, Panel B) and Table 5 (column 2, Panel B) in Tabellini (2020), obtained with conventional 2SLS with fixed effects, and our estimates with conventional 2SLS regression with FD transformation (Columns 2 and 4). The raw control variables only include interactions between state and year fixed effects, for a total of 77 variables in Panel A and 96 in Panel B; some variables are omitted due to multicollinearity. The number of clusters differs between Columns 3 and 4 because the FD estimator requires at least two consecutive non-missing observations per group, while the FE estimator only requires two non-missing observations, not necessarily consecutive. In this case, two groups lack consecutive data and are hence excluded from the estimation. The Anderson-Rubin test statistic and confidence sets are not originally implemented in the analysis. Standard errors in parentheses are clustered at the metropolitan area in Columns 1-4 and at the city code level in Columns 5-8. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

In Panel B of Table C.1 with the augmented specifications, we find much smaller first-stage F-statistics from FD than those from FE regressions. The instrument appears weak with both estimators in the political outcome specification (Columns 1-2). Specifically, the FE F-statistic is below Stock and Yogo (2005)'s threshold, while the FD F-statistic does not even exceed the conventional rule-of-thumb value of 10. By contrast, the shift-share instrument in the economic outcome specification (Columns 3-4) can be considered strong, as both first-stage F-statistics exceed Stock and Yogo (2005)'s threshold, but never Lee et al. (2022)'s threshold of 104.70. The AR test statistics either fail to reject the null hypothesis of no reduced-form effect, when assuming no second-stage effect, or reject it only at the 10% level (in Column 3). The AR CS always include zero as a possible value of the treatment effect, suggesting no second-stage effect in both outcomes. Given the moderate strength of the instruments, we can comment on the second-stage estimates. The second-stage coefficients are statistically insignificant (or borderline significant in Column 3) with both panel estimators and in both specifications, confirming the AR test results.

In conclusion, FE and FD estimators yield very similar results in many dimensions across both outcomes and, therefore, they may identify the same underlying parameter under standard assumptions. The baseline findings of the original study are confirmed with both FE and FD estimators. However, after controlling for additional confounders in the regression, neither estimator provides evidence of any impact of immigration on political or economic outcomes, with the exception of the FE regression of log occupational score.

D Additional Tables for Moriconi et al. (2019, 2022)

In the following sections, we discuss the results on individual voters' behaviour from conventional 2SLS estimators using FE and FD transformations for Moriconi et al. (2019, 2022), which were briefly discussed in Section 4.2.
In our re-analysis, the estimates are obtained from a conventional unbalanced panel dataset, where we aggregated the individual voters' information at the regional (either NUTS2 or country) and year levels such that the same regional units from each of the twelve European countries are observed over consecutive election years. For completeness of the analysis, we also report the original estimates obtained with the individual-level data without any aggregation.

In revisiting the study conducted by Moriconi et al. (2019), we also discuss the results for the political parties to complement the analysis. The party platforms data are constructed as a conventional balanced panel dataset, where information on the same political parties from twelve European countries is collected over up to three election years. Unlike the voters' data, no aggregation was required. The set of covariates used in the political party analysis differs from the aggregated individual voters' analysis and includes: the average GDP per capita (in log), share of tertiary sector (in log), average unemployment rate, and election year fixed effects.

D.1 Moriconi et al. (2019)

Table D.1 reports our estimates obtained from conventional 2SLS with FE and 2SLS with FD using aggregated individual data (Columns 1-4) and political party data (Columns 5-8). The results are reported by skill group, as in the original analysis: HS immigrants (Columns 1-2 and 5-6) and LS immigrants (Columns 3-4 and 7-8). The dependent variables are Net Welfare State in Panel A, and Net Public Education in Panel B. For completeness, Table D.2 displays the corresponding original results from Moriconi et al. (2019)'s Table 4 (individual voters' data) and Table 5 (parties' data) relative to their Specifications (2), (4) and (6) reported in their Panels A and B.

Table D.1. Political preferences over 2007–2016: 2SLS with FD and with FE

| | (1) Voters, HS, FE | (2) Voters, HS, FD | (3) Voters, LS, FE | (4) Voters, LS, FD | (5) Parties, HS, FE | (6) Parties, HS, FD | (7) Parties, LS, FE | (8) Parties, LS, FD |
|---|---|---|---|---|---|---|---|---|
| Panel A: Net Welfare State | | | | | | | | |
| Second-stage results | | | | | | | | |
| Share | 0.062*** | 0.054*** | 0.040* | 0.050 | 0.157 | 0.076 | -0.331* | -0.448* |
| | (0.011) | (0.014) | (0.024) | (0.033) | (0.477) | (0.350) | (0.168) | (0.237) |
| AR CS 95% | [0.043, 0.087] | [0.029, 0.080] | [-0.004, 0.101] | [0.003, 0.159] | [-0.940, +∞) | [-0.471, 1.329] | (-∞, -0.121] | (-∞, -0.209] |
| Robust Weak IV Tests | | | | | | | | |
| AR χ² stat | 27.93 | 15.36 | 3.24 | 3.60 | 0.14 | 0.06 | 11.38 | 20.06 |
| AR χ² p-value | 0.000 | 0.000 | 0.072 | 0.058 | 0.710 | 0.810 | 0.001 | 0.000 |
| Panel B: Net Public Education | | | | | | | | |
| Second-stage results | | | | | | | | |
| Share | 0.026** | 0.024 | -0.070** | -0.088* | 0.356 | 0.229 | 0.239 | 0.333 |
| | (0.013) | (0.015) | (0.032) | (0.046) | (0.743) | (0.551) | (0.195) | (0.284) |
| AR CS 95% | [0.005, 0.055] | [-0.004, 0.055] | [-0.160, -0.017] | [-0.250, -0.022] | [-2.493, +∞) | [-0.731, 1.998] | [0.007, +∞) | [-0.005, +∞) |
| Robust Weak IV Tests | | | | | | | | |
| AR χ² stat | 5.24 | 2.78 | 6.70 | 6.65 | 0.28 | 0.22 | 3.09 | 3.66 |
| AR χ² p-value | 0.022 | 0.096 | 0.010 | 0.010 | 0.598 | 0.639 | 0.079 | 0.056 |
| Panels A and B: First-stage results | | | | | | | | |
| Shift-share IV | 1.995*** | 1.655*** | 0.636*** | 0.522*** | 0.739* | 0.915** | 0.779* | 0.731* |
| | (0.305) | (0.263) | (0.167) | (0.162) | (0.402) | (0.340) | (0.366) | (0.401) |
| F-stat | 42.90 | 39.51 | 14.61 | 10.37 | 3.30 | 7.25 | 4.60 | 3.32 |
| Observations | 259 | 146 | 259 | 146 | 179 | 97 | 179 | 97 |
| No. clusters | 113 | 113 | 113 | 113 | 12 | 12 | 12 | 12 |

Note: The table displays our estimates of Specifications (2), (4) and (6) (Panels A and B) in Moriconi et al. (2019)'s Tables 4 (individual voters' data) and 5 (parties' data), obtained from conventional 2SLS regression with FE (Columns 1, 3, 5 and 7) and FD transformation (Columns 2, 4, 6 and 8). The original sample consists of different individual voters from twelve European countries sampled each year, which we aggregate at the regional (NUTS2) level to obtain an unbalanced panel data set. The treatment and instrumental variables in Columns (1)-(2) and (5)-(6) refer to the fraction of high-skilled workers, and in Columns (3)-(4) and (7)-(8) to the fraction of low-skilled workers. The number of observations differs between the 2SLS with FE and with FD regressions because the first time period is removed after the first-difference transformation. Raw control variables in all panels are: the share of women, average age, share of tertiary/post-tertiary education, average GDP per capita (in log), share of tertiary sector (in log), average unemployment rate, and election year dummies. Dependent variables are 'Net Welfare State' in Panel A, and 'Net Public Education' in Panel B. Standard errors in parentheses are clustered at the regional level in Columns 1-4 and at the country level in Columns 5-8. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

We begin by discussing the results for individual voters based on the aggregated sample (Columns 1-4 of Table D.1) for both panels, as the first-stage regressions are identical across specifications. In Columns 1-2, the shift-share instruments for HS immigration appear strong under both estimators. The corresponding F-statistics are commonly viewed as sufficiently large based on Stock and Yogo (2005)'s critical value of 16.30 and, hence, strong. However, their strength may be considered quite moderate relative to more conservative thresholds proposed in the literature (e.g., 104.70 in Lee et al., 2022). The same consideration applies to the original study, where the reported F-statistic (F = 32.87 in Column 1 of Table D.2) is also below Lee et al. (2022)'s threshold and smaller than those from the panel 2SLS specifications.
The AR test statistics, not computed in the original study, reject the null hypothesis of no reduced-form effect with both estimators (albeit only at the 10% level for FD in Panel B), providing evidence of a second-stage effect in both Panels A and B. This conclusion is supported by the corresponding 95% AR confidence sets, which are always bounded and exclude zero, with the exception of Column 2 in Panel B. Overall, the FE estimates align closely with the original cross-sectional findings about the effect of HS immigration on both welfare and public education expansion, whereas the FD results only support the presence of an effect for welfare expansion.

Table D.2. Political preferences over 2007–2016 (Moriconi et al. (2019)'s results)

| | (1) Individual voters, HS | (2) Individual voters, LS | (3) Political parties, HS | (4) Political parties, LS |
|---|---|---|---|---|
| Panel A: Net Welfare State | | | | |
| Share of immigrants | 0.049** | 0.009 | 0.062 | -0.357* |
| | (0.022) | (0.012) | (0.494) | (0.183) |
| Observations | 50,304 | 50,304 | 177 | 177 |
| K-P rk Wald F-stat | 32.87 | 32.58 | 21.47 | 17.39 |
| Adj. R-Square | 0.70 | 0.70 | 0.57 | 0.54 |
| Panel B: Net Public Education | | | | |
| Share of immigrants | 0.058*** | -0.028 | 0.378 | 0.261 |
| | (0.020) | (0.022) | (0.458) | (0.181) |
| Observations | 50,304 | 50,304 | 177 | 177 |
| K-P rk Wald F-stat | 32.87 | 32.58 | 21.47 | 17.39 |
| Adj. R-Square | 0.56 | 0.56 | 0.62 | 0.59 |
| Election FE | Yes | Yes | Yes | Yes |
| NUTS2 Controls | Yes | Yes | Yes | Yes |
| Individual Controls | Yes | Yes | Yes | Yes |

Note: The table displays the original estimates for Specifications (2), (4) and (6) in Panels A and B from Moriconi et al. (2019)'s Table 4 (individual voters' data) and Table 5 (parties' data), obtained from conventional IV estimation. Standard errors in parentheses are clustered at the regional level. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

In Columns 3-4, the instruments appear much weaker than in the original study under both FE and FD estimators.
The corresponding first-stage F-statistics lie between the conventional cut-off values of 10 and 16.30, raising concerns about the strength of the shift-share instrument in these specifications. By contrast, the original cross-sectional analysis reports considerably larger values (F = 32.58 in Column 2 of Table D.2), showing no evidence of weak instruments under pooled 2SLS. The lower F-statistics obtained from FE and FD regressions may partly reflect a loss of statistical power after the region-year aggregation, which substantially reduced the sample size (from around 50,000 observations to roughly 200 units in the estimation sample), thereby attenuating the apparent relevance of the instrument. The AR test statistics suggest the presence of some effect of LS immigration on both outcomes among individual voters (at the 10% level for Net Welfare State, and at the 5% level for Net Public Education). The corresponding AR CS are bounded and, except for Column 3 of Panel A, do not include zero. Given the apparent weakness of the shift-share instruments in panel data regressions, second-stage results from both estimators should be interpreted with caution. Nevertheless, the AR diagnostic results indicate that both panel estimators agree on a negative effect of LS immigration on public education, whereas only the FD specification points to a potential positive effect on welfare expansion. The original study, by contrast, finds no significant effect for LS immigrants.

Looking at the political parties in Columns 5-8 of Table D.1, the shift-share instruments for both HS and LS immigrants are weak in all panel regressions, with F-statistics below 10, compared to F = 21.47 for HS and F = 17.39 for LS in the original study.
The AR test statistics indicate the absence of any effect in HS specifications (Columns 5-6) and the presence of some second-stage effect only in LS specifications (Columns 7-8) on both outcomes (positive on welfare and negative on education). The corresponding AR confidence sets are either unbounded (Columns 5, 7 and 8), indicating that the data contain limited information to precisely identify the causal effect, or bounded with zero included (Column 6), indicating no second-stage effect. Unlike us, the original study finds only a borderline anti-redistribution (negative) effect of LS immigration on welfare expansion (Column 4, Panel A of Table D.2). Although the AR results show that in some cases an effect may exist, the AR CS show that the weakness of the instrument prevents precise identification. On the basis of the F-statistics and AR diagnostics, the second-stage coefficients from the FE and FD estimators cannot be regarded as consistent causal estimates.

Extending the discussion on political parties, Table D.3 reports our estimates from 2SLS regressions with FD alongside panel IV DML regressions using the FD approach. For both HS and LS immigration, the shift-share instruments appear weak under both conventional 2SLS and panel IV DML: the corresponding first-stage F-statistics are below the rule-of-thumb threshold of 10. The instruments are even weaker under panel IV DML, with first-stage F-statistics much lower than those from conventional 2SLS specifications. Consequently, the second-stage estimates should be interpreted with caution, as weak identification undermines the consistency of the estimator and the validity of statistical inference.

Table D.3. Political preferences – Political Parties

Columns (1)-(4): sample of HS immigrants. Columns (5)-(8): sample of LS immigrants.

Panel A: Net Welfare State

| | (1) 2SLS | (2) DML-Lasso | (3) DML-NNet | (4) DML-Boosting | (5) 2SLS | (6) DML-Lasso | (7) DML-NNet | (8) DML-Boosting |
|---|---|---|---|---|---|---|---|---|
| Second-stage results | | | | | | | | |
| Share | 0.076 | -0.359 | 0.951*** | 1.497 | -0.448* | 0.189 | 1.264 | 1.782 |
| | (0.35) | (0.912) | (0.352) | (0.99) | (0.237) | (0.288) | (1.066) | (3.762) |
| AR 95% CS | [-0.471, 1.329] | [-2.844, 0.627] | [0.322, 0.87] | [0.357, 8.611] | (-∞, -0.209] | [-0.212, -0.028] | [-0.177, 0.14] | (-∞, +∞) |
| First-stage results | | | | | | | | |
| Shift-Share IV | 0.915** | 0.754** | 1.256*** | 0.626 | 0.731* | 0.165 | 0.465 | -0.098 |
| | (0.34) | (0.367) | (0.369) | (0.527) | (0.401) | (1.073) | (0.611) | (0.49) |
| F stat | 7.25 | 3.525 | 9.677 | 1.177 | 3.32 | 0.02 | 0.483 | 0.033 |
| AR χ² stat | 0.06 | 1.034 | 3.059* | 6.457** | 20.06*** | 4.546** | 0.875 | 0.14 |
| Quality of learners | | | | | | | | |
| Model RMSE | 0.964 | 3.691 | 4.018 | 3.844 | 1.019 | 3.298 | 8.41 | 6.825 |
| MSE of l | | 1.152 | 1.279 | 1.142 | | 1.152 | 1.279 | 1.142 |
| MSE of r | | 0.801 | 1.428 | 0.754 | | 3.977 | 3.09 | 1.353 |
| MSE of m | | 1.076 | 1.078 | 0.322 | | 2.582 | 1.853 | 0.675 |

Panel B: Net Public Education

| | (1) 2SLS | (2) DML-Lasso | (3) DML-NNet | (4) DML-Boosting | (5) 2SLS | (6) DML-Lasso | (7) DML-NNet | (8) DML-Boosting |
|---|---|---|---|---|---|---|---|---|
| Second-stage results | | | | | | | | |
| Share | 0.229 | -0.491 | 0.025 | 1.043 | 0.333 | -0.269 | 0.031 | -0.75 |
| | (0.551) | (0.473) | (0.588) | (1.361) | (0.284) | (0.285) | (0.306) | (2.557) |
| AR 95% CS | [-0.731, 1.998] | [-1.343, 1.068] | [0.261, 0.979] | [-0.417, 4.898] | [0.005, +∞) | [-0.034, 0.159] | [0.281, 0.646] | (-∞, +∞) |
| First-stage results | | | | | | | | |
| Shift-Share IV | 0.915** | 0.754** | 1.256*** | 0.626 | 0.731* | 0.165 | 0.465 | -0.098 |
| | (0.34) | (0.367) | (0.369) | (0.527) | (0.401) | (1.073) | (0.611) | (0.49) |
| F stat | 7.25 | 3.525 | 9.677 | 1.177 | 3.32 | 0.02 | 0.483 | 0.033 |
| AR χ² stat | 0.22 | 2.464 | 0.059 | 0.319 | 3.66* | 1.499 | 4.292** | 0.991 |
| Quality of learners | | | | | | | | |
| Model RMSE | 0.895 | 2.698 | 3.386 | 4.973 | 0.959 | 2.687 | 3.647 | 6.035 |
| MSE of l | | 1.128 | 1.321 | 0.982 | | 1.128 | 1.321 | 0.982 |
| MSE of r | | 0.801 | 1.428 | 0.754 | | 3.977 | 3.09 | 1.353 |
| MSE of m | | 1.076 | 1.078 | 0.322 | | 2.582 | 1.853 | 0.675 |
| Observations | 97 | 97 | 97 | 97 | 97 | 97 | 97 | 97 |
| No. clusters | 12 | 12 | 12 | 12 | 12 | 12 | 12 | 12 |

Note: The table displays our estimates based on Specifications (2), (4) and (6) of Table 5 (Panels A and B) in Moriconi et al. (2019), obtained from conventional 2SLS regression with FD transformation (Columns 1 and 5) and from our panel IV DML estimation with different base learners (Columns 2-4 and 6-8). The sample is aggregated at the regional (NUTS2) level to construct an unbalanced panel data set. The treatment and instrumental variables in Columns (1)-(4) refer to the fraction of high-skilled workers, and in Columns (5)-(8) to the fraction of low-skilled workers. The dependent variable in Panel A is 'Net Welfare State', and in Panel B 'Net Public Education'. Raw control variables in all panels are: the share of tertiary sector (in log), average unemployment rate, and election year dummies. The set of control variables in the panel IV DML estimation with NNet and Boosting does not include interaction terms because these base learners are designed to capture nonlinearities in the data. Panel IV DML estimation with Lasso (Columns 2 and 6) uses an extended dictionary of the raw variables, including polynomials up to order three and interaction terms between all the covariates, to satisfy Lasso's weak sparsity assumption. The number of covariates used for panel IV DML estimation doubles due to the inclusion of the lags of all included covariates, following the FD (exact) approach. Panel IV DML technical note: 2 folds, cross-fitting, hyperparameters are tuned as per Table F.1. Standard errors in parentheses are clustered at the country level. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

The AR diagnostic results with panel IV DML are mixed: the specification using Lasso predictions supports the presence of some negative effect of LS immigrants on welfare expansion (Column 6 in Panel A), while the specification using neural-network predictions provides evidence for some positive effect of LS immigrants on public education (Column 7 in Panel B).
However, in both cases the second-stage coefficients are outside the AR CS, reflecting the limited information in the data to identify the causal effect precisely and the unreliability of the estimates under weak identification. Overall, the negative effect of LS immigration on welfare expansion with political parties, found with conventional 2SLS, disappears in panel IV DML regressions. AR diagnostics from panel IV DML regression with Lasso reveal the possible presence of some negative effect of HS immigration on welfare expansion.

D.2 Moriconi et al. (2022)

Table D.4 reports our estimates obtained from conventional 2SLS regressions with FE and with FD using aggregated voters' data. The analysis is divided by skill group: HS immigrants (Columns 1-2) and LS immigrants (Columns 3-4). The dependent variables are: Nationalism in Panel A, Trust in own country parliament for political attitudes in Panel B, and Better place for immigration attitudes in Panel C. For completeness, our Table D.5 displays the original results for Specifications (2) and (3) in Moriconi et al. (2022)'s Table 6, and Specifications (3) and (5) in their Table 10, estimated with pooled 2SLS.

We start by comparing the effect of immigration on nationalism under the FE and FD estimators (Panel A of Table D.4). Regarding instrument strength, the shift-share instrument for HS immigrants (Columns 1-2) is strong with F > 16.30, while the instrument for LS immigrants (Columns 3-4) is weak with F < 16.30 under both panel estimators. Therefore, the regression results for the LS immigration specification should be interpreted with extreme caution due to weak identification, unlike those for HS immigrants. By contrast, the original first-stage F-statistics (Columns 1-2 of Table D.5) are sufficiently large to consider both shift-share instruments strong (according to Stock and Yogo, 2005).
AR test statistics, not implemented in the original study, show that there is no reduced-form effect when the second-stage effect is assumed to be zero (the null hypothesis is not rejected) with both estimators, and the AR 95% CS are bounded with zero included. The original study instead finds a borderline significant effect (at the 10% level) of HS immigrants on nationalism (Column 1 of Table D.5).

We now discuss the effect of immigration on political and immigration attitudes (Panels B and C of Table D.4). The first-stage regressions are identical across the two specifications because they use the same sample, endogenous treatment and instrumental variables. Both shift-share instruments appear strong across all specifications and panel estimators, with first-stage F-statistics exceeding the threshold of 16.30, although the values are somewhat lower under the FD estimator and remain only marginally above Stock and Yogo (2005)'s threshold. By contrast, the first-stage F-statistics in the original analysis are always above 35 in both HS and LS specifications (Columns 3-6 of Table D.5).

In our re-analysis, there are only two cases in which the AR diagnostics suggest the possible presence of a second-stage effect. First, in Columns 1-2 of Panel B (HS specifications), the AR tests with FE and FD reject the null hypothesis of no effect at the 10% significance level; the corresponding AR CS are bounded, but include zero. Therefore, we can conclude that there is no effect of HS immigration on increased trust in own country parliament, unlike the original article, which finds a positive but borderline effect. Second, in Column 4 of Panel C (LS specification estimated with the FD estimator), the AR test statistic is significant at the 1% level; the bounded AR confidence set lies entirely below zero and contains the point estimate. The original study finds a significant negative effect, but their shift-share instrument is much stronger (F = 45.24) than ours (F = 16.39). In all other cases, the AR tests fail to reject the null hypothesis and the AR CS are bounded with zero included, suggesting the absence of a statistically significant causal effect.

In general, the FE and FD estimators largely agree on the results, with the exception of LS immigrants in the specification for Better place to live. Therefore, any difference observed in panel IV DML regressions is due to methodological rather than specification choices.

Table D.4. Nationalism intensity, immigration and attitudes towards politics and immigration – 2SLS with FD and with FE

Columns (1)-(2): HS immigrants. Columns (3)-(4): LS immigrants.

| | (1) FE | (2) FD | (3) FE | (4) FD |
|---|---|---|---|---|
| Panel A: Nationalism intensity of parties | | | | |
| Second-stage results | | | | |
| Fr. Immigrants | 0.003 | -0.049 | 0.070 | 0.084 |
| | (0.037) | (0.044) | (0.048) | (0.067) |
| AR 95% CS | [-0.08, 0.062] | [-0.140, 0.025] | [-0.014, 0.230] | [-0.028, 0.316] |
| First-stage results | | | | |
| Shift-Share IV | 1.775*** | 1.476*** | 0.721*** | 0.602*** |
| | (0.267) | (0.241) | (0.193) | (0.189) |
| F stat | 38.26 | 37.54 | 11.40 | 8.31 |
| Robust Weak IV Tests | | | | |
| AR χ² | 0.01 | 1.39 | 2.47 | 2.13 |
| AR χ² p-value | 0.927 | 0.239 | 0.116 | 0.144 |
| Observations | 261 | 147 | 261 | 147 |
| No. groups | 114 | 114 | 114 | 114 |
| Panel B: Political attitudes – Trust country parliament | | | | |
| Second-stage results | | | | |
| Fr. Immigrants | 0.052* | 0.067 | 0.048 | 0.050 |
| | (0.031) | (0.044) | (0.042) | (0.051) |
| AR 95% CS | [0.000, 0.116] | [-0.015, 0.174] | [-0.030, 0.143] | [-0.030, 0.181] |
| Robust Weak IV Tests | | | | |
| AR χ² | 3.27 | 3.30 | 1.42 | 1.17 |
| AR χ² p-value | 0.071 | 0.069 | 0.234 | 0.279 |
| Panel C: Migration attitudes – Better place to live | | | | |
| Second-stage results | | | | |
| Fr. Immigrants | -0.036 | -0.023 | -0.040 | -0.100* |
| | (0.028) | (0.048) | (0.035) | (0.051) |
| AR 95% CS | [-0.094, 0.016] | [-0.168, 0.041] | [-0.125, 0.0186] | [-0.233, -0.026] |
| Robust Weak IV Tests | | | | |
| AR χ² | 1.73 | 0.28 | 1.62 | 7.99 |
| AR χ² p-value | 0.189 | 0.595 | 0.203 | 0.005 |
| Panels B and C: First-stage results | | | | |
| Shift-Share IV | 1.480*** | 1.174*** | 0.716*** | 0.647*** |
| | (0.144) | (0.277) | (0.125) | (0.155) |
| F stat | 106.48 | 19.09 | 32.77 | 16.39 |
| Observations | 441 | 327 | 441 | 327 |
| No. clusters | 114 | 114 | 114 | 114 |

Note: The table reports our estimates based on Specifications (2) and (3) of Table 6 in Moriconi et al. (2022) (our Panel A), and Specifications (3) and (5) (Panels A and B) of Table 10 in Moriconi et al. (2022) (our Panels B-C). The estimates are obtained from conventional 2SLS regression with FE (Columns 1 and 3) and FD transformation (Columns 2 and 4). The original sample consists of different individual voters from twelve European countries sampled each year, which we aggregate at the regional (NUTS2) level to obtain an unbalanced panel data set. The treatment and instrumental variables in Columns (1)-(2) refer to the fraction of high-skilled workers, and in Columns (3)-(4) to the fraction of low-skilled workers. The number of observations differs between 2SLS with FE and with FD regressions because the first time period is removed after the first-difference transformation. Each panel uses a different dependent variable. The raw control variables in all panels are: the share of women, average age, share of tertiary/post-tertiary education, average GDP per capita (in log), share of tertiary sector (in log), average unemployment rate, and year dummies. Raw variables in the panel IV DML estimation with NNet and Boosting do not include interaction terms because these base learners are designed to capture nonlinearities in the data. Panel IV DML with Lasso (Columns 2 and 6) uses an extended dictionary of the raw variables, including polynomials up to order three and interaction terms between all the covariates, to satisfy Lasso's weak sparsity assumption. The number of covariates used for panel IV DML estimation doubles due to the inclusion of the lags of all included covariates, following the FD (exact) approach. Standard errors in parentheses are clustered at the regional level. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

Table D.5. Outcomes and immigrant share (Moriconi et al. (2022)'s results)

Dependent variable: Nationalism intensity in Columns (1)-(2), Trust country parliament in Columns (3)-(4), Better place to live in Columns (5)-(6).

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Share HS | -0.14* | | 0.07* | | -0.05 | |
| | (0.07) | | (0.04) | | (0.03) | |
| Share LS | | 0.05 | | 0.04 | | -0.05** |
| | | (0.05) | | (0.04) | | (0.02) |
| Observations | 48,303 | 48,303 | 78,058 | 78,058 | 77,862 | 77,862 |
| K-P rk Wald F-stat | 32.24 | 38.72 | 35.77 | 45.07 | 35.77 | 45.24 |
| Adj. R-Square | 0.13 | 0.13 | 0.10 | 0.10 | 0.11 | 0.11 |
| NUTS2 FE | Yes | Yes | Yes | Yes | Yes | Yes |
| Year FE | Yes | Yes | Yes | Yes | Yes | Yes |
| NUTS2 Controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Individual Controls | Yes | Yes | Yes | Yes | Yes | Yes |

Note: The table displays the original results for Specifications (2) and (3) from Moriconi et al. (2022)'s Table 6, and Specifications (3) and (5) from their Table 10. The estimation method the authors used is conventional IV estimation. Standard errors in parentheses are clustered at the regional level. Significance levels: * p < 0.10, ** p < 0.05, *** p < 0.01.

E Monte Carlo Data Generating Process

We consider a data generating process (DGP) that resembles the specification of the re-analysed empirical application.
We generate the following simulated regression model from Model (2.1)-(2.3):

\begin{align}
Y_{it} &= D_{it}\theta + l_0(X_{it}) + \alpha_i + U_{it} \tag{E.1} \\
D_{it} &= Z_{it}\pi + r_0(X_{it}) + 0.5\,\alpha_i + R_{it} \tag{E.2} \\
Z_{it} &= m_0(X_{it}) + \gamma_i + V_{it} \tag{E.3} \\
X_{it} &= (X_{it,1}, \dots, X_{it,p})' \sim \Gamma_i + N(0,1) \tag{E.4} \\
\alpha_i &= \rho\,\Gamma_i + \sqrt{(1-\rho^2)}\,A_i, \tag{E.5} \\
\Gamma_i &\sim N(3,9), \quad A_i \sim N(0,1), \quad \gamma_i \sim N(0,25) \tag{E.6} \\
\begin{pmatrix} U_{it} \\ R_{it} \\ V_{it} \end{pmatrix}
&\sim N\left[ \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix} 1 & 0.6 & 0 \\ 0.6 & 1 & 0 \\ 0 & 0 & 0.25 \end{pmatrix} \right], \tag{E.7}
\end{align}

where the target parameter is $\theta = 0.5$; the parameter $\rho = 0.9$ controls the extent of the influence of the fixed effect over the random effect; and $p = 30$ is the number of control variables. We set up two Monte Carlo simulation designs: a setting in which the instrument is strong with $\pi = 0.8$, and one where the instrument is weak with $\pi = 0.001$. Treatment endogeneity is induced both by the correlation between the error terms $U_{it}$ and $R_{it}$, $\sigma_{ur} = 0.6$, and by the fixed effects $\alpha_i$ through $\Gamma_i$, which captures the dependence of unobserved heterogeneity on the included exogenous covariates.

The nuisance functions $(l_0, m_0, r_0)$ are modelled as follows:

\begin{align}
l_0(X_{it}) &= a_1 X_{it,1} + a_2 X_{it,3} + a_3 X_{it,1} \cdot 1\{X_{it,1} > 0\} \tag{E.8} \\
r_0(X_{it}) &= b_1 X_{it,1} + b_2 X_{it,3} + b_3 X_{it,1} \cdot 1\{X_{it,1} > 0\} \tag{E.9} \\
m_0(X_{it}) &= c_1 X_{it,1} + c_2 X_{it,3} + c_3 X_{it,1} \cdot 1\{X_{it,1} > 0\} \tag{E.10}
\end{align}

where $a_j = b_j = c_j = 0.5$ for $j \in \{1, 2, 3\}$, and $1\{\cdot\}$ is an indicator operator that transforms the continuous variable $X_{it,1}$ into a binary variable by assigning a value of one to positive realisations of $X_{it,1}$ and zero otherwise.

In this design, nonlinearity enters the nuisance functions via the interaction term $X_{it,1} \cdot 1\{X_{it,1} > 0\}$.
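The design in (E.1)-(E.10) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' simulation code: the function name `simulate_panel_iv`, the default numbers of units and periods, and the seed are hypothetical, and the second argument of each $N(\cdot,\cdot)$ is read as a variance (so $N(3,9)$ is drawn with standard deviation 3), which is an assumption consistent with the covariance matrix in (E.7).

```python
import numpy as np


def simulate_panel_iv(n_units=100, n_periods=5, p=30,
                      theta=0.5, pi=0.8, rho=0.9, seed=0):
    """Draw one panel from the DGP in (E.1)-(E.10).

    n_units, n_periods and seed are hypothetical defaults chosen for
    illustration; they are not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    N, T = n_units, n_periods

    # Unit-level heterogeneity, (E.5)-(E.6); N(mean, variance) is assumed,
    # so the standard deviations are 3, 1 and 5 respectively.
    Gamma = rng.normal(3.0, 3.0, size=N)
    A = rng.normal(0.0, 1.0, size=N)
    gamma = rng.normal(0.0, 5.0, size=N)
    alpha = rho * Gamma + np.sqrt(1.0 - rho**2) * A

    # Covariates, (E.4): unit-specific mean plus standard normal noise.
    X = Gamma[:, None, None] + rng.normal(size=(N, T, p))

    # Idiosyncratic errors (U, R, V), (E.7): U and R are correlated (0.6).
    cov = np.array([[1.0, 0.6, 0.0],
                    [0.6, 1.0, 0.0],
                    [0.0, 0.0, 0.25]])
    errors = rng.multivariate_normal(np.zeros(3), cov, size=(N, T))
    U, R, V = errors[..., 0], errors[..., 1], errors[..., 2]

    # Nuisance functions, (E.8)-(E.10): all coefficients equal 0.5, so
    # l_0, r_0 and m_0 share the same functional form.
    def nuisance(X):
        x1, x3 = X[..., 0], X[..., 2]  # X_{it,1} and X_{it,3}
        return 0.5 * x1 + 0.5 * x3 + 0.5 * x1 * (x1 > 0)

    g = nuisance(X)
    Z = g + gamma[:, None] + V                      # (E.3)
    D = pi * Z + g + 0.5 * alpha[:, None] + R       # (E.2)
    Y = theta * D + g + alpha[:, None] + U          # (E.1)
    return Y, D, Z, X
```

Calling `simulate_panel_iv(pi=0.8)` corresponds to the strong-instrument design and `simulate_panel_iv(pi=0.001)` to the weak-instrument design; the outputs `Y`, `D` and `Z` have shape `(n_units, n_periods)` and `X` has shape `(n_units, n_periods, p)`.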
Practically, an example of the chosen form of nonlinearity is as follows: suppose that $X_{it,1}$ is the continuous variable $Age_{it}$, and $1\{X_{it,1} > 0\}$ is a binary indicator that equals one when a specific age threshold is surpassed, e.g. $1\{Age_{it} > 25\}$. In this example, the interaction term allows the slope of Age to change when Age is greater than 25, effectively modelling a kink (or slope shift) at the cut-off.

To allow for sparsity ($s \ll p$), we include many control variables ($p = 30$), but only a subset of them ($s = 2$) are relevant and enter the model both linearly and through nonlinear interaction terms. In practice, the number of variables $p$ doubles with the FD approach described in the panel IV DML Algorithm 1 because the first-order lags of the included variables are included in the regression. Panel IV DML estimates the nuisance functions flexibly using Lasso, a single-layer neural network (NNet), and gradient boosting with 100 trees (Boosting). Conventional 2SLS, by contrast, estimates these functions linearly. Overall, this design allows us to investigate the performance of our panel IV DML estimator relative to conventional 2SLS when the true functional form of the covariates is flexible.

F Hyperparameter Tuning

Hyperparameter tuning is critical for achieving accurate effect estimation in machine learning, regardless of the choice of base learners or estimators. Using default values, whether suggested by software packages or the literature, can substantially limit learner performance and introduce bias into the causal estimand (Machlanski et al., 2023, 2024; Bach et al., 2024b). In the empirical applications (Section 4) and Monte Carlo simulations (Section 5), hyperparameters were tuned according to the procedures summarized in Table F.1.
Specifically, the Lasso penalty parameter is selected to minimize the mean cross-validated error, whereas the hyperparameters for gradient boosting and neural networks are tuned using grid search (Bergstra and Bengio, 2012). We set the hyperparameter optimizer to try five distinct values per hyperparameter, randomly selected from the intervals specified in Table F.1, within each evaluation, and terminate the optimization at the fifth evaluation.

Table F.1. Hyperparameter tuning

| Learner | Hyperparameter | Value of parameter in set | Description |
|---|---|---|---|
| Lasso | lambda.min | cross-validated | λ equivalent to minimum mean cross-validated error. |
| Gradient Boosting | lambda | real value in {0,2} | L2 regularization term on weights. |
| | maxdepth | integer in {2,10} | Maximum depth of any node of the final tree. |
| | nrounds | 1000 (empirical applications); 100 (Monte Carlo simulations) | Number of decision trees in the final model. |
| NNET | maxit | 100 | Maximum number of iterations. |
| | MaxNWts | 2000 | The maximum allowable number of weights. |
| | trace | FALSE | Switch for tracing optimization. |
| | size | integer in {2,10} | Number of units in the hidden layer. |
| | decay | double in {0,0.5} | Parameter for weight decay. |

Note: The hyperparameters of the base learners chosen to model the nuisance functions are tuned in each Monte Carlo replication via grid search, which evaluates each possible combination of hyperparameter values in the grid (Bergstra and Bengio, 2012). We set the hyperparameter optimizer to try five distinct values per hyperparameter, randomly selected from the specified intervals, within each evaluation, and terminate the optimization at the fifth evaluation.

G Data Collection for Survey of AER Articles

The survey is conducted by manually collecting information on 477 empirical articles published in the American Economic Review (AER) between 2011 and 2018, without the use of automated text-mining techniques.[28] Articles classified as purely theoretical (i.e., without any econometric methods) were excluded from the dataset.

For each article, we recorded key information such as the author names, publication date (year and month), type of data analyzed (cross-sectional, panel, time series, or a combination), and the estimation methods employed. We grouped the estimation methods into broad categories: Least Squares (OLS, WLS, and GLS), Maximum Likelihood (probit- and logit-type regressions), IV methods (GMM, IV, 2SLS, 3SLS), and other non-linear techniques. Any remaining specialized estimators were placed in the other non-linear techniques category.

When the analysis in the article used multiple estimation techniques, we recorded both the primary and secondary estimation methods. The primary method is the main technique used (typically for the key regression), while the secondary method is the second-most important. Often the secondary method serves as an alternative estimator for the main relationship or is used in a supplementary analysis. We did not count methods used only for robustness checks as secondary, since those serve solely to support the main analysis.

[28] This dataset was originally constructed for different projects several years ago, and is now reused for the present study.
