Disentangled Instrumental Variables for Causal Inference with Networked Observational Data

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Instrumental variables (IVs) are crucial for addressing unobservable confounders, yet their stringent exogeneity assumptions pose significant challenges in networked data. Existing methods typically rely on modeling neighbor information when recovering IVs, thereby inevitably mixing endogenous correlations induced by the shared environment with individual-specific exogenous variation; the resulting IVs inherit dependence on unobserved confounders and violate exogeneity. To overcome this challenge, we propose the $\underline{Dis}$entangled $\underline{I}$nstrumental $\underline{V}$ariables (DisIV) framework, a novel method for causal inference from networked observational data with latent confounders. DisIV exploits network homogeneity as an inductive bias and employs a structural disentanglement mechanism to extract individual-specific components that serve as latent IVs. The causal validity of the extracted IVs is enforced through explicit orthogonality and exclusion conditions. Extensive semi-synthetic experiments on real-world datasets demonstrate that DisIV consistently outperforms state-of-the-art baselines in causal effect estimation under network-induced confounding.


💡 Research Summary

The paper tackles a fundamental obstacle in causal inference on networked observational data: the difficulty of finding valid instrumental variables (IVs) that satisfy relevance, exclusion, and unconfoundedness when individuals are linked by a graph. Existing approaches either rely on pre‑specified exogenous variables or construct IVs directly from neighbor information (e.g., NetIV). The latter inevitably mixes shared environmental confounders—arising from homophily and common exposures—with the individual‑specific variation needed for a true IV, thereby violating the exogeneity assumption.

DisIV (Disentangled Instrumental Variables) proposes a principled solution by viewing each observed feature vector x_i as a composition of two latent factors: a network‑induced confounder u_i (environment) and an individual‑specific factor z_i (specificity). Under the assumption of network homogeneity, the local neighborhood provides a reliable proxy for u_i. The method therefore proceeds in two stages. First, a Graph Convolutional Network (GCN) aggregates node features across first‑order neighbors to produce an environment embedding e_i, which serves as a learned proxy for the hidden confounder. Second, a variational auto‑encoder‑like architecture with an asymmetric inference‑generation design extracts z_i. The encoder q_φ(z|x) maps x_i to a Gaussian distribution (μ_φ(x_i), σ_φ(x_i)), from which z_i is sampled. The decoder p_θ(x|z,e) reconstructs x_i conditioned on both the candidate IV z_i and the environment embedding e_i. By forcing the decoder to explain as much of x_i as possible using e_i, the residual information captured in z_i must be orthogonal to the environment, thereby encouraging statistical independence between z_i and u_i.
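The two-stage pipeline described above can be sketched with plain numpy. This is a minimal illustration, not the authors' implementation: the graph, all dimensions, the single GCN layer, and the linear encoder/decoder weights are hypothetical stand-ins for the learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all sizes hypothetical): n nodes with d-dimensional features.
n, d, d_e, d_z = 5, 4, 3, 2
X = rng.normal(size=(n, d))
A = np.array([[0, 1, 1, 0, 0],          # symmetric adjacency of a small graph
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)

# Stage 1: one GCN-style layer over first-order neighbors -> environment embedding e_i.
A_hat = A + np.eye(n)                                    # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))   # symmetric normalization
W_e = rng.normal(size=(d, d_e))
E = np.tanh(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W_e)   # one row e_i per node

# Stage 2: Gaussian encoder q_phi(z|x) with the reparameterization trick.
W_mu, W_logvar = rng.normal(size=(d, d_z)), rng.normal(size=(d, d_z))
mu, logvar = X @ W_mu, X @ W_logvar
Z = mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)  # sampled candidate IVs z_i

# Decoder p_theta(x|z,e): reconstruct x from the candidate IV and the environment.
W_dec = rng.normal(size=(d_z + d_e, d))
X_hat = np.concatenate([Z, E], axis=1) @ W_dec
recon_loss = np.mean((X - X_hat) ** 2)
print(E.shape, Z.shape, recon_loss)
```

In a trained model the decoder is pushed to explain x_i through e_i wherever possible, so only the residual, individual-specific signal survives in z_i.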

To guarantee that the recovered latent variables satisfy the classic IV criteria, DisIV adds explicit regularization terms: (1) an orthogonality penalty between z and e to enforce the unconfoundedness condition; (2) a relevance term that trains a treatment model t_i ← f(z_i) ensuring that z_i is predictive of the treatment; and (3) an exclusion constraint implemented by modeling the outcome y_i as a function of the treatment and the environment only, y_i ← g(t_i, e_i), explicitly omitting any direct dependence on z_i. The overall objective combines reconstruction loss, KL divergence, and the three regularizers, allowing end‑to‑end gradient‑based optimization.
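The three regularizers can be illustrated numerically. The sketch below uses random placeholder tensors and simple linear models for f and g; the cross-covariance orthogonality penalty, the least-squares fits, and the weights are assumptions for illustration, not the paper's exact choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d_z, d_e = 64, 2, 3
Z = rng.normal(size=(n, d_z))            # candidate IVs z_i
E = rng.normal(size=(n, d_e))            # environment embeddings e_i
t = rng.normal(size=n)                   # treatments t_i
y = rng.normal(size=n)                   # outcomes y_i
mu, logvar = rng.normal(size=(n, d_z)), 0.1 * rng.normal(size=(n, d_z))

# (1) Orthogonality: penalize the empirical cross-covariance between z and e.
Zc, Ec = Z - Z.mean(axis=0), E - E.mean(axis=0)
l_orth = np.sum((Zc.T @ Ec / n) ** 2)

# (2) Relevance: a treatment model t ~ f(z), here a least-squares linear fit.
w_t, *_ = np.linalg.lstsq(Z, t, rcond=None)
l_rel = np.mean((t - Z @ w_t) ** 2)

# (3) Exclusion: the outcome model y ~ g(t, e) sees treatment and environment only,
#     never z, so z can affect y solely through t.
TE = np.column_stack([t, E])
w_y, *_ = np.linalg.lstsq(TE, y, rcond=None)
l_excl = np.mean((y - TE @ w_y) ** 2)

# KL divergence of the Gaussian encoder against a standard-normal prior.
l_kl = 0.5 * np.mean(np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1))

lam_orth, lam_rel, lam_excl = 0.3, 1.0, 1.0   # hypothetical weights
total = lam_orth * l_orth + lam_rel * l_rel + lam_excl * l_excl + l_kl
print(total)
```

In the actual framework all four terms are combined with the reconstruction loss and minimized jointly by gradient descent, so the encoder, decoder, and the treatment/outcome heads are trained end to end.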

The authors evaluate DisIV on two semi‑synthetic datasets derived from real social networks (e.g., Facebook, Twitter). They inject latent confounders u_i and generate outcomes with known causal effects, enabling precise measurement of estimation error. Baselines include NetIV, VIV, DeepIV, and GCN‑based confounder adjustment methods. Across metrics such as ATE RMSE and ITE PEHE, DisIV consistently outperforms competitors, achieving 15–30 % lower error. Performance gains are especially pronounced in dense networks where neighbor‑based IVs suffer severe bias. Sensitivity analyses explore the dimensionality of the environment embedding and the strength of the orthogonality regularizer λ, revealing a stable region (λ≈0.1–0.5) where the trade‑off between independence and information retention is optimal.
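For readers unfamiliar with the reported metrics, here is a minimal sketch of how ATE error and PEHE are typically computed from known (semi-synthetic) and estimated effects; the toy effect vectors are made up for illustration.

```python
import numpy as np

def ate_error(tau_true, tau_hat):
    """Absolute error of the estimated average treatment effect
    (RMSE when averaged over repeated simulation runs)."""
    return abs(np.mean(tau_hat) - np.mean(tau_true))

def pehe(tau_true, tau_hat):
    """Precision in Estimation of Heterogeneous Effects:
    RMSE over per-individual treatment effects (ITEs)."""
    diff = np.asarray(tau_hat) - np.asarray(tau_true)
    return np.sqrt(np.mean(diff ** 2))

# Hypothetical ground-truth and estimated individual effects.
tau_true = np.array([1.0, 2.0, 0.5, 1.5])
tau_hat = np.array([1.1, 1.8, 0.7, 1.4])
print(ate_error(tau_true, tau_hat), pehe(tau_true, tau_hat))
```

Note that per-individual errors can cancel in the ATE while still inflating PEHE, which is why both metrics are reported.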

The paper concludes by highlighting several avenues for future work: extending the framework to dynamic graphs, handling multiple simultaneous treatments, and incorporating multimodal data (text, images) into the disentanglement process. In sum, DisIV offers a novel, theoretically grounded, and empirically validated approach to extracting latent, individually‑specific instrumental variables from entangled networked data, thereby overcoming the exogeneity violations that have limited prior IV‑based causal methods in graph settings.

