Network Structural Equation Models for Causal Mediation and Spillover Effects
Social network interference induces complex dependencies where a unit’s outcome is influenced not only by its own exposure and mediator but also by those of connected neighbors. In such settings, a significant challenge lies in distinguishing direct exposure effects from interference-driven spillover effects, and further separating these from indirect effects mediated by intermediate variables. To address this, we propose a theoretical framework utilizing structural graphical models. Central to our approach is the Random Effects Network Structural Equation Model (REN-SEM), which extends the exposure mapping paradigm to capture these multifaceted spillover and mediation mechanisms while accounting for latent dependencies within mediators and outcomes. We establish general identification conditions and derive decomposition formulas for six distinct mechanistic estimands. Furthermore, for the class of Linear REN-SEMs, we develop a maximum likelihood estimation framework and establish a rigorous asymptotic theory tailored to non-i.i.d. network data, proving the consistency of our estimators and the validity of the variance estimates. The robustness and practical utility of our methodology are demonstrated through simulation experiments and an analysis of the Twitch Gamers Network, underscoring its effectiveness in quantifying intricate network-mediated exposure effects.
💡 Research Summary
This paper addresses the challenging problem of causal inference in observational studies where units are embedded in a social network, and both interference (spill‑over) and mediation are present. Classical mediation analysis assumes independent units and focuses on natural direct and indirect effects of a binary exposure on an outcome through a single mediator. In a network, however, a unit’s outcome can be affected by its own exposure, its own mediator, the exposures of its first‑degree neighbors, the mediators of its first‑degree neighbors, and even the exposures of second‑degree neighbors (which act only through the mediators of first‑degree neighbors). To capture these complex pathways, the authors introduce the Random Effects Network Structural Equation Model (REN‑SEM).
REN‑SEM specifies, for each node i, a system of structural equations:
Y_i = f_y(A_i, A_{N†i}, M_i, M_{N†i}, C_i, C_{N†i}, b_{yi}, b_{y,N†i}, ε_{yi}, E)
M_i = f_m(A_i, A_{N†i}, C_i, C_{N†i}, b_{mi}, b_{m,N†i}, ε_{mi}, E)
A_i = f_a(C_i, C_{N†i}, ε_{ai}, E)
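where N†i denotes the first‑degree neighbors of unit i, N‡i denotes its second‑degree neighbors, C_i are observed confounders, and b_{yi}, b_{mi} are unit‑specific random effects that capture latent heterogeneity and correlation across the network. Second‑degree neighbors do not appear explicitly in these equations: their exposures reach Y_i only through M_{N†i}, because each first‑degree neighbor's mediator depends on that neighbor's own first‑degree neighborhood. The exposure assignment is assumed conditionally independent across units given their own and neighbors’ covariates (Assumption 1).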
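The neighbor sets N†i and N‡i can be read off an adjacency matrix. The following is a minimal sketch, assuming an undirected, unweighted network stored as a NumPy array; the function name `neighbor_sets` and the toy path graph are illustrative and do not come from the paper.

```python
import numpy as np

def neighbor_sets(adj):
    """Return (first, second): per-unit sets of first- and second-degree neighbors.

    Second-degree neighbors are units reachable in exactly two steps that are
    neither the unit itself nor one of its first-degree neighbors.
    """
    adj = np.asarray(adj, dtype=bool)
    n = adj.shape[0]
    # First-degree neighbors: nonzero entries of row i, ignoring self-loops.
    first = [set(np.flatnonzero(adj[i]).tolist()) - {i} for i in range(n)]
    second = []
    for i in range(n):
        two_step = set()
        for j in first[i]:
            two_step |= first[j]  # neighbors of neighbors
        second.append(two_step - first[i] - {i})
    return first, second

# Toy path graph 0 - 1 - 2 - 3 (an assumption for illustration).
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
first, second = neighbor_sets(adj)
print(first[1], second[1])
```

In this toy graph, unit 1 has first-degree neighbors 0 and 2, and unit 3 is its only second-degree neighbor, so an exposure at unit 3 could reach Y_1 only through M_2.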
The model reveals seven distinct causal routes: (1) A_i → Y_i, (2) A_i → M_i → Y_i, (3) A_i → M_{N†i} → Y_i, (4) A_{N†i} → Y_i, (5) A_{N†i} → M_i → Y_i, (6) A_{N†i} → M_{N†i} → Y_i, and (7) A_{N‡i} → M_{N†i} → Y_i. Based on these routes, the authors define six mechanistic estimands that extend the classic natural direct and indirect effects to incorporate neighbor‑induced spill‑over and mediated spill‑over components.
Identification relies on standard causal assumptions (no unmeasured confounding, positivity, consistency) extended to the network setting, together with the exposure‑mapping framework. The high‑dimensional exposure vector A is reduced to low‑dimensional summaries T_{a1i}, T_{a2i}, T_{a3i} for the mediator model, and analogous summaries for the outcome model. For example, the mediator depends on A_i and a summary of first‑degree neighbor exposures, M_i(a)=f_{Mi}(a_i,T_{a1i}(A_{N†i})). The outcome depends on A_i, the same neighbor‑exposure summary, the unit’s own mediator, and a summary of neighbor mediators, which are themselves functions of neighbor exposures. This dimension reduction preserves the conditional independence needed for identification while keeping the problem tractable.
The authors then focus on a linear subclass, the Linear REN‑SEM (LREN‑SEM). In this case, the outcome and mediator equations become mixed‑effects linear regressions:
Y_i = β_0 + β_1 A_i + β_2 S_{1i}(A_{N†i}) + β_3 M_i + β_4 S_{1i}(M_{N†i}) + β_5^T C_i + β_6^T S_{1i}(C_{N†i}) + b_{yi} + ε_{yi}
M_i = γ_0 + γ_1 A_i + γ_2 S_{1i}(A_{N†i}) + γ_3^T C_i + γ_4^T S_{1i}(C_{N†i}) + b_{mi} + ε_{mi}
where S_{1i}(·) denotes a weighted average of the values of a variable over i’s first‑degree neighbors. The exposure model is a Bernoulli regression with probability given by a link function of C_i and neighbor covariates.
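To see how the two linear equations combine, the mediator equation can be substituted into the outcome equation. The following is a worked sketch, not the paper's decomposition formulas for the six estimands. It assumes the weighted average is written as S_{1i}(X_{N†i}) = Σ_{j∈N†i} w_{ij} X_j, where the weights w_{ij} are notation introduced here for illustration.

```latex
% Assumed notation: S_{1i}(X_{N^\dagger_i}) = \sum_{j \in N^\dagger_i} w_{ij} X_j.
% The dots collect intercept, covariate, random-effect, and error terms.
\begin{align*}
Y_i ={}& \dots
  + (\beta_1 + \beta_3\gamma_1)\, A_i
  + (\beta_2 + \beta_3\gamma_2 + \beta_4\gamma_1) \sum_{j \in N^\dagger_i} w_{ij} A_j \\
 & + \beta_4\gamma_2 \sum_{j \in N^\dagger_i} w_{ij} \sum_{k \in N^\dagger_j} w_{jk} A_k
  + \dots
\end{align*}
```

Read against the seven routes listed earlier, β_1 corresponds to route (1), β_3γ_1 to route (2), β_2 to route (4), β_3γ_2 to route (5), and β_4γ_1 to route (6). The double sum with coefficient β_4γ_2 covers the remaining neighbor‑mediator paths: terms with k = i correspond to route (3), terms where k is another first‑degree neighbor of i add to route (6), and terms where k is a second‑degree neighbor correspond to route (7).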
Maximum‑likelihood estimation (MLE) for LREN‑SEM is derived by treating the random effects as latent Gaussian variables. The likelihood factorizes into a product of normal densities conditional on the random effects, allowing the use of EM algorithms or direct closed‑form solutions for the fixed‑effect parameters. Crucially, the authors develop an asymptotic theory that accommodates the non‑i.i.d. nature of network data. Under regularity conditions (independent random effects, bounded degree, and mixing conditions on the network), the MLEs are shown to be consistent and asymptotically normal, converging at the √N rate. A novel variance estimator that accounts for network dependence is provided, so that the resulting confidence intervals are asymptotically valid.
Simulation studies explore a range of scenarios: varying network density, average degree, sample size (N=200–800), and strength of mediation and spill‑over effects. Results demonstrate that the proposed MLE substantially reduces bias relative to naïve independent‑unit SEMs, achieves lower mean‑squared error, and yields confidence intervals with empirical coverage close to the nominal 95 % level.
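A data‑generating process of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's simulation design: the ring network, the coefficient values, the logistic exposure model, and the variance components are all hypothetical, and ordinary least squares stands in for the paper's maximum‑likelihood estimator. It fits the outcome equation once with the neighbor summaries and once without them (a naïve independent‑unit regression) so the two sets of estimates can be compared.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical network: each unit is linked to the next two units on a ring,
# so every unit has exactly four first-degree neighbors.
idx = np.arange(n)
adj = np.zeros((n, n))
for step in (1, 2):
    adj[idx, (idx + step) % n] = 1
    adj[(idx + step) % n, idx] = 1
W = adj / adj.sum(axis=1, keepdims=True)  # row-normalized: W @ x is the neighbor average S_1

# Illustrative coefficients (assumptions, not values from the paper).
b0, b1, b2, b3, b4, b5, b6 = 0.5, 1.0, 0.5, 0.8, 0.4, 0.3, 0.2
g0, g1, g2, g3, g4 = 0.2, 0.7, 0.3, 0.5, 0.1

# Confounder, then Bernoulli exposure depending on own and neighbor covariates.
C = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(0.3 * C + 0.3 * (W @ C))))
A = rng.binomial(1, p).astype(float)

# Mediator and outcome equations with independent Gaussian random effects and errors.
M = (g0 + g1 * A + g2 * (W @ A) + g3 * C + g4 * (W @ C)
     + rng.normal(scale=0.5, size=n) + rng.normal(size=n))
Y = (b0 + b1 * A + b2 * (W @ A) + b3 * M + b4 * (W @ M) + b5 * C + b6 * (W @ C)
     + rng.normal(scale=0.5, size=n) + rng.normal(size=n))

# OLS with neighbor summaries (stand-in for the MLE) versus a naive fit without them.
ones = np.ones(n)
X_full = np.column_stack([ones, A, W @ A, M, W @ M, C, W @ C])
X_naive = np.column_stack([ones, A, M, C])
coef_full = np.linalg.lstsq(X_full, Y, rcond=None)[0]
coef_naive = np.linalg.lstsq(X_naive, Y, rcond=None)[0]

print("full  (b1, b3):", coef_full[1], coef_full[3])
print("naive (b1, b3):", coef_naive[1], coef_naive[2])
```

Because the random effects here are independent across units, least squares on the correctly specified outcome equation is consistent for the fixed effects. It does not separate the random‑effect variance from the error variance, and it does not provide the network‑aware variance estimator described above.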
The methodology is illustrated with data from the Twitch gamers’ social network. Each streamer’s binary exposure indicates whether they promoted a particular game. The mediator is the average watch time of the streamer’s audience, and the outcome is the change in follower count. Applying the REN‑SEM reveals a modest but statistically significant direct effect of self‑promotion, while the indirect effect transmitted through neighbors’ promotion (via increased watch time) is larger. This empirical finding underscores the importance of accounting for both mediation and network spill‑over when evaluating marketing or public‑health interventions on social platforms.
In summary, the paper makes four major contributions: (1) a unified structural‑equation framework that simultaneously handles interference and mediation, (2) clear identification conditions and six novel mechanistic estimands, (3) a tractable linear implementation with maximum‑likelihood estimation and rigorous asymptotic results for dependent network data, and (4) empirical validation through simulations and a real‑world Twitch network analysis. The work substantially advances causal inference methodology for observational network data, providing researchers with practical tools to disentangle direct, indirect, and spill‑over pathways in complex social systems.