Bayesian Estimation of Causal Effects Using Proxies of a Latent Interference Network
Network interference occurs when treatments assigned to some units affect the outcomes of others. Traditional approaches often assume that the observed network correctly specifies the interference structure. However, in practice, researchers frequently only have access to proxy measurements of the interference network due to limitations in data collection or potential mismatches between measured networks and actual interference pathways. In this paper, we introduce a framework for estimating causal effects when only proxy networks are available. Our approach leverages a structural causal model that accommodates diverse proxy types, including noisy measurements, multiple data sources, and multilayer networks, and defines causal effects as interventions on population-level treatments. The latent nature of the true interference network poses significant challenges. To overcome them, we develop a Bayesian inference framework. We propose a Block Gibbs sampler with Locally Informed Proposals to update the latent network, thereby efficiently exploring the high-dimensional posterior space composed of both discrete and continuous parameters. The latent network updates are driven by information from the proxy networks, treatments, and outcomes. We illustrate the performance of our method through numerical experiments, demonstrating its accuracy in recovering causal effects even when only proxies of the interference network are available.
💡 Research Summary
The paper tackles the problem of causal inference under network interference when the true interference network is unobserved and only proxy measurements are available. Recognizing that observed social or relational networks often contain measurement error or may represent a different set of ties than those through which treatment spillovers actually occur, the authors develop a comprehensive Bayesian framework that treats the true interference network as a latent random graph.
The methodological core is a structural causal model (SCM) that jointly generates baseline covariates, latent variables, the true adjacency matrix (A^{*}), multiple proxy adjacency matrices (\{A^{(b)}\}_{b=1}^{B}), treatment assignments, and outcomes. Two families of proxies are distinguished: (i) causal proxies that are noisy observations of (A^{*}) (direct descendants in the DAG) and (ii) non‑causal proxies that share higher‑level latent factors with (A^{*}) but are not direct descendants (e.g., multilayer networks). Treatment assignment may depend on the latent network or on the proxies, yielding four possible DAG configurations.
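To make the generative structure concrete, the following is a minimal simulation sketch of one such configuration: a latent network, causal proxies obtained by flipping dyads at random, covariate-dependent treatment, and outcomes that depend on treated neighbors in the latent network. The function name `simulate_scm`, the Erdős–Rényi prior, the symmetric flip noise, and the linear outcome model with its coefficients are all illustrative assumptions, not the paper's specification.

```python
import numpy as np

def simulate_scm(n=50, edge_prob=0.1, flip_prob=0.05, n_proxies=2, seed=0):
    """Simulate a toy SCM with a latent network and noisy causal proxies.

    All distributional choices here are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)

    # Baseline covariate for each unit.
    x = rng.normal(size=n)

    # Latent true network A*: undirected, no self-loops (assumed Erdos-Renyi).
    upper = np.triu(rng.random((n, n)) < edge_prob, k=1)
    a_star = (upper | upper.T).astype(int)

    # Causal proxies: each dyad of A* is flipped independently with flip_prob.
    proxies = []
    for _ in range(n_proxies):
        flips = np.triu(rng.random((n, n)) < flip_prob, k=1)
        flips = flips | flips.T
        proxies.append(np.where(flips, 1 - a_star, a_star))

    # Treatment depends on the covariate only (one of several possible DAGs).
    z = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))

    # Outcome depends on own treatment and the number of treated neighbors in A*.
    exposure = a_star @ z
    y = 1.0 + 2.0 * z + 0.5 * exposure + 0.3 * x + rng.normal(size=n)

    return {"x": x, "a_star": a_star, "proxies": proxies, "z": z, "y": y}
```

In an analysis, only `x`, `proxies`, `z`, and `y` would be observed; `a_star` is retained here so that a fitted posterior could be checked against the truth.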
Causal estimands are defined as population‑level policy interventions (static, dynamic, or stochastic) on the treatment vector, rather than unit‑level exposure functions, which allows the authors to sidestep the need for exact exposure calculations when the network is uncertain. Identification results show that, under reasonable overlap and positivity conditions, the joint distribution of the latent network and the causal parameters is identifiable from the observed outcomes, treatments, and multiple proxies.
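A population-level estimand of this kind can be approximated by averaging over posterior draws of the network and the outcome parameters. The sketch below assumes a hypothetical linear outcome model with an intercept, an own-treatment effect, and a spillover effect per treated neighbor, and a stochastic policy that treats each unit independently with a fixed probability; the function name `policy_value` and the format of `draws` are illustrative, not taken from the paper.

```python
import numpy as np

def policy_value(draws, treat_prob, n_mc=200, seed=0):
    """Posterior draws of the mean outcome under a stochastic treatment policy.

    draws: iterable of (adjacency, (b0, b_own, b_spill)) pairs, one per
           posterior sample. The assumed (hypothetical) outcome model is
           E[Y_i] = b0 + b_own * z_i + b_spill * sum_j A_ij * z_j.
    treat_prob: probability that each unit is treated under the policy;
                0.0 or 1.0 gives a static policy.
    """
    rng = np.random.default_rng(seed)
    values = []
    for adj, (b0, b_own, b_spill) in draws:
        n = adj.shape[0]
        # Monte Carlo draws of the population treatment vector under the policy.
        z = rng.binomial(1, treat_prob, size=(n_mc, n))
        # Entry [m, i] of z @ adj.T is the number of treated neighbors of unit i.
        mean_y = b0 + b_own * z + b_spill * (z @ adj.T)
        values.append(mean_y.mean())
    return np.array(values)
```

Because each posterior draw carries its own network, the spread of the returned values reflects uncertainty about both the interference structure and the outcome parameters. A contrast between two policies can be formed by differencing the two returned arrays draw by draw.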
Because the latent network lives in a high‑dimensional discrete space, standard MCMC with uninformed proposals mixes too slowly to be practical. The authors propose a block Gibbs sampler that alternates between updating continuous parameters (e.g., regression coefficients, noise variances) and the discrete network. For the network updates they adapt Zanella’s Locally Informed Proposals (LIP): the proposal distribution over edge flips (additions or deletions) is guided by a gradient‑based approximation of the resulting change in the log‑posterior. This local information dramatically improves acceptance rates and mixing in the combinatorial space. The paper also provides a theoretical bound on the approximation error of the LIP and shows how network sparsity can be exploited to tighten this bound.
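The idea behind a locally informed proposal can be sketched on a toy target. The example below is not the paper's sampler: it assumes an independent-edge posterior in which the log-posterior change for every single-dyad flip is available exactly, so no gradient approximation is needed, and it uses the balancing function g(t) = sqrt(t). The names `flip_deltas`, `lip_step`, and `run_chain` are hypothetical.

```python
import numpy as np

def flip_deltas(a, logits):
    """Change in log-posterior from flipping each dyad, for the toy
    independent-edge target log p(a) = sum_k a_k * logits_k + const."""
    return (1 - 2 * a) * logits

def lip_step(a, logits, rng):
    """One Metropolis-Hastings step with a locally informed proposal.

    Each candidate flip is weighted by g(pi(new) / pi(old)) with
    g(t) = sqrt(t), so flips that raise the posterior are proposed more often.
    """
    w_cur = np.exp(0.5 * flip_deltas(a, logits))
    z_cur = w_cur.sum()
    k = rng.choice(a.size, p=w_cur / z_cur)

    prop = a.copy()
    prop[k] = 1 - prop[k]
    z_prop = np.exp(0.5 * flip_deltas(prop, logits)).sum()

    # With g = sqrt, the MH acceptance ratio reduces to z_cur / z_prop.
    if rng.random() < min(1.0, z_cur / z_prop):
        return prop, True
    return a, False

def run_chain(logits, n_steps=5000, seed=0):
    """Run the chain from the empty network and return the per-dyad
    fraction of steps in which the edge was present."""
    rng = np.random.default_rng(seed)
    a = np.zeros(logits.size, dtype=int)
    total = np.zeros(logits.size)
    for _ in range(n_steps):
        a, _ = lip_step(a, logits, rng)
        total += a
    return total / n_steps
```

For this toy target the exact posterior edge probabilities are the logistic transform of `logits`, which gives a simple check that the chain targets the right distribution. In a block Gibbs scheme, a step like `lip_step` would alternate with updates of the continuous parameters, and the exact flip deltas would be replaced by an approximation when computing them for every dyad is too expensive.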
Empirical evaluation includes fully synthetic data and semi‑synthetic data derived from real social networks with injected noise and missing edges. The proposed joint Bayesian approach is compared against a naïve two‑stage procedure (first infer the network, then treat it as fixed). Results demonstrate lower bias, reduced mean‑squared error, and proper coverage of credible intervals for the causal estimands, especially when multiple heterogeneous proxies are available.
In summary, the contributions are threefold: (1) a flexible SCM that accommodates latent interference networks and diverse proxy structures; (2) a unified Bayesian inference scheme that propagates uncertainty from network reconstruction to causal effect estimation; and (3) an efficient MCMC algorithm based on block Gibbs sampling with locally informed proposals for high‑dimensional discrete network spaces. The work opens avenues for more reliable policy evaluation in settings where only imperfect relational data are available, such as vaccination campaigns, information diffusion studies, and social program assessments. Limitations include scalability to very large networks and reliance on sufficient proxy information for identification, which the authors suggest as directions for future research.