Detecting Information Channels in Congressional Trading via Temporal Graph Learning

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Congressional stock trading has raised concerns about potential information asymmetries and conflicts of interest in financial markets. We introduce a temporal graph network (TGN) framework to identify information channels through which members of Congress may possess advantageous knowledge when trading company stocks. We construct a multimodal dynamic graph integrating diverse publicly available datasets, including congressional stock transactions, lobbying relationships, campaign finance contributions, and geographical connections between legislators and corporations. Our approach formulates the detection problem as a dynamic edge classification task, where we identify trades that exhibit statistically significant outperformance relative to the S&P 500 across long time horizons. To handle the temporal nature of these relationships, we develop a two-step walk-forward validation architecture that respects information availability constraints and prevents look-ahead bias. We evaluate several labeling strategies based on risk-adjusted returns and demonstrate that the TGN successfully captures complex temporal dependencies between congressional-corporate interactions and subsequent trading performance.


💡 Research Summary

The paper tackles the controversial issue of whether members of the United States Congress exploit privileged information when trading corporate stocks. Rather than treating each trade as an isolated time‑series signal, the authors construct a multimodal dynamic bipartite graph—named Capitol Gains—that integrates publicly available data spanning 2013‑2025: legislator biographies, committee assignments, roll‑call votes (W‑NOMINATE scores), lobbying records, campaign contributions, daily OHLCV and dark‑pool volumes for all equities, 64 macro‑economic indicators, and quarterly 10‑Q filings. All features are aligned to their public release timestamps (point‑in‑time construction) to guarantee that at any moment the model only accesses information that market participants could have known, thereby eliminating look‑ahead bias.
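A minimal sketch of the point‑in‑time idea described above, assuming each feature is stored as a list of (release date, value) pairs sorted by release date. The function name `point_in_time_value` and the sample data are hypothetical illustrations, not part of the Capitol Gains pipeline.

```python
from bisect import bisect_right
from datetime import date


def point_in_time_value(releases, as_of):
    """Return the latest value whose release date is on or before `as_of`.

    `releases` is a list of (release_date, value) tuples sorted by date.
    Returns None when nothing had been published yet.
    """
    release_dates = [released for released, _ in releases]
    # Index of the first release strictly after `as_of`.
    idx = bisect_right(release_dates, as_of)
    return releases[idx - 1][1] if idx > 0 else None


# Hypothetical quarterly filings: the value only becomes visible on release.
filings = [(date(2020, 1, 15), 1.0), (date(2020, 4, 15), 2.0)]
print(point_in_time_value(filings, date(2020, 4, 14)))  # day before 2nd release
```

Looking up by release date rather than by the period a figure describes is what keeps a later filing from leaking into an earlier trade's features.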

The detection task is formalized as a dynamic edge‑classification problem on a continuous‑time heterogeneous graph G(t). The target edge type is a stock trade between a legislator and a company. For each trade, the authors compute the excess return over the S&P 500 across a long horizon Δt (e.g., 18 months). If the excess return exceeds a threshold τ, the edge receives a positive label; if it falls below τ, a negative label; and if the horizon has not yet elapsed, so that the outcome is still unresolved, a “latent” label of 0.5. This delayed labeling mirrors the real‑world latency between trade disclosure and performance measurement and prevents the model from relying on future information.
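The labeling rule can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the function name `label_trade`, the use of simple returns, the approximation of 18 months as 548 days, and the choice to treat an excess return exactly equal to τ as negative are all assumptions.

```python
from datetime import date, timedelta
from typing import Optional

LATENT = 0.5  # label for trades whose horizon has not yet elapsed


def label_trade(
    trade_date: date,
    as_of: date,
    stock_return: Optional[float],
    benchmark_return: Optional[float],
    tau: float = 0.0,
    horizon_days: int = 548,  # assumed approximation of 18 months
) -> float:
    """Label a trade edge as seen from `as_of`."""
    # The outcome is unknown until the full horizon has passed.
    if as_of < trade_date + timedelta(days=horizon_days):
        return LATENT
    # Excess return relative to the benchmark (S&P 500 in the summary).
    excess = stock_return - benchmark_return
    # Assumption: ties (excess == tau) count as negative.
    return 1.0 if excess > tau else 0.0


print(label_trade(date(2020, 1, 1), date(2020, 6, 1), None, None))
```

The same trade can therefore carry the latent label at one evaluation date and a resolved label at a later one, which is what makes the supervision delayed.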

To handle the latency problem, the paper introduces GAP‑TGN (Gated Asynchronous Propagation Temporal Graph Network), an extension of the standard Temporal Graph Network (TGN). GAP‑TGN adds two key components: (1) a Gated Multi‑Modal Fusion layer that combines the node’s dynamic memory embedding, static attributes (party, state), and a market signal vector derived from systematic features; and (2) an Asynchronous Propagation mechanism that updates node memories even when the label for the current event is still latent. The edge‑level attention mechanism incorporates the latent label into the key/value vectors, allowing the network to weigh historical interactions according to whether they later proved to be high‑alpha, low‑alpha, or pending. This design keeps node representations “fresh” despite the long feedback loop.
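To make the fusion step concrete, here is one generic form of gated fusion in plain Python. The summary does not give the exact equations of GAP‑TGN, so everything here is an assumption: the three inputs are taken to be already projected to the same length, the gates are scalar and softmax‑normalised, and `gated_fusion` and `gate_weights` are hypothetical names.

```python
import math


def gated_fusion(memory, static, market, gate_weights):
    """Fuse three equal-length vectors using softmax-normalised scalar gates.

    `gate_weights` has three rows, one per modality, each as long as the
    concatenation of the three inputs. Returns (fused_vector, gates).
    """
    concat = memory + static + market
    # One score per modality, computed from the full concatenated context.
    scores = [sum(w * x for w, x in zip(row, concat)) for row in gate_weights]
    # Numerically stable softmax so the three gates sum to one.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    gates = [e / total for e in exps]
    fused = [
        gates[0] * m + gates[1] * s + gates[2] * x
        for m, s, x in zip(memory, static, market)
    ]
    return fused, gates


# With all-zero gate weights every modality gets an equal share.
zero_weights = [[0.0] * 6 for _ in range(3)]
fused, gates = gated_fusion([3.0, 0.0], [0.0, 3.0], [3.0, 3.0], zero_weights)
print(fused, gates)
```

In a trained model the gate weights would be learned, letting the network lean on the memory embedding for active traders and on static or market features when little interaction history exists.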

Training and evaluation follow a two‑step walk‑forward validation. In the training phase the model learns from all events up to time t, using only information that would have been publicly available at that moment. In the testing phase the model makes a prediction at time t and then waits for the true label to become resolved at t + Δt, exactly replicating the information constraints an investor would face. This strict protocol eliminates any look‑ahead bias.
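The protocol can be sketched as a fold generator over event timestamps. This is a simplified illustration, not the paper's implementation: timestamps are plain numbers, and `walk_forward_folds`, `cutoffs`, and `horizon` are hypothetical names. It also separates training events whose labels have already resolved from those still latent at the cutoff.

```python
def walk_forward_folds(event_times, cutoffs, horizon):
    """Yield (train_idx, resolved_idx, test_idx) for consecutive cutoffs.

    train_idx:    events observed on or before the cutoff
    resolved_idx: training events whose label horizon ended by the cutoff
    test_idx:     events after the cutoff, up to the next cutoff
    """
    for start, end in zip(cutoffs, cutoffs[1:]):
        train = [i for i, t in enumerate(event_times) if t <= start]
        # Only these labels are known at `start`; the rest are still latent.
        resolved = [i for i in train if event_times[i] + horizon <= start]
        test = [i for i, t in enumerate(event_times) if start < t <= end]
        yield train, resolved, test


# Hypothetical trade times, two folds, and a label horizon of 6 time units.
folds = list(walk_forward_folds([1, 5, 9, 14, 20], cutoffs=[10, 15, 25], horizon=6))
for train, resolved, test in folds:
    print(train, resolved, test)
```

Each test event is scored only once its own horizon has elapsed, so no fold ever uses a label that was not yet observable at its cutoff.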

Empirical results show that GAP‑TGN outperforms baseline tabular models (logistic regression, random forest) and a vanilla TGN on several metrics: long‑term excess return, Sharpe ratio, and AUC for the binary classification. The improvement is most pronounced when the latent label is used, indicating that the asynchronous update successfully mitigates the staleness issue. Attention visualizations reveal that edges involving strong committee ties, intensive lobbying connections, and sizable campaign contributions receive higher weights, supporting the hypothesis that political influence channels can generate measurable “Congressional alpha”.

The authors contribute (1) the Capitol Gains dataset, (2) the GAP‑TGN architecture tailored for financial domains with delayed supervision, and (3) a rigorous walk‑forward evaluation framework. Limitations include reliance on the S&P 500 as a single benchmark, potential omission of undisclosed trades or informal lobbying, and limited interpretability beyond attention scores. Future work is suggested to incorporate sector‑specific benchmarks, textual analysis of bills and news, and multi‑step forecasting to further uncover the mechanisms behind political information advantages.

