Causal Claims in Economics

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the [Original Paper Viewer] below or the original arXiv source.

As economics scales, a key bottleneck is representing what papers claim in a comparable, aggregable form. We introduce evidence-annotated claim graphs that map each paper into a directed network of standardized economic concepts (nodes) and stated relationships (edges), with each edge labeled by evidentiary basis, including whether it is supported by causal inference designs or by non-causal evidence. Using a structured multi-stage AI workflow, we construct claim graphs for 44,852 economics papers from 1980-2023. The share of causal edges rises from 7.7% in 1990 to 31.7% in 2020. Measures of causal narrative structure and causal novelty are positively associated with top-five publication and long-run citations, whereas non-causal counterparts are weakly related or negative.


💡 Research Summary

Paper Overview
The authors introduce “evidence‑annotated claim graphs” as a scalable, machine‑readable representation of economic research papers. Each paper is transformed into a directed network where nodes correspond to standardized economic concepts (mapped to JEL codes) and edges capture the relationships explicitly stated by the authors. Crucially, every edge is labeled with its evidentiary basis, distinguishing claims supported by canonical causal inference designs (Difference‑in‑Differences, Instrumental Variables, Randomized Controlled Trials, Regression Discontinuity, Synthetic Control) from those based on theory, description, or simple correlation.
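
The structure described above can be sketched as a small data model. This is an illustrative reconstruction, not the paper's actual schema: the class names, the evidence labels, and the `causal_share` helper are assumptions chosen to mirror the description of nodes, labeled edges, and causal versus non-causal evidence.

```python
from dataclasses import dataclass, field

# Evidence labels treated as causal designs in this sketch (names are
# shorthand for the five designs listed in the summary).
CAUSAL_DESIGNS = {"DiD", "IV", "RCT", "RDD", "SyntheticControl"}

@dataclass(frozen=True)
class Edge:
    source: str   # standardized economic concept (e.g., a JEL code)
    target: str
    evidence: str  # e.g., "IV", "RCT", "theory", "correlation"

    @property
    def is_causal(self) -> bool:
        return self.evidence in CAUSAL_DESIGNS

@dataclass
class ClaimGraph:
    paper_id: str
    edges: set[Edge] = field(default_factory=set)

    def causal_share(self) -> float:
        """Fraction of edges backed by a causal inference design."""
        if not self.edges:
            return 0.0
        return sum(e.is_causal for e in self.edges) / len(self.edges)

g = ClaimGraph("w12345")
g.edges.add(Edge("J24", "I26", "IV"))      # causal claim
g.edges.add(Edge("E52", "E31", "theory"))  # non-causal claim
print(g.causal_share())  # -> 0.5
```

Representing each paper this way makes the paper-level statistics in the summary (e.g., the share of causal edges over time) simple aggregations over edge labels.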

Data and Corpus Construction
The study processes 44,852 working papers from NBER and CEPR spanning 1980‑2023. Duplicate records are removed via normalized title‑year keys. Papers shorter than 1,000 characters are excluded, and only the first 30 pages are analyzed to capture the sections where authors most often discuss research questions, identification strategies, and core results. PDFs are converted to clean text, and all concepts are later mapped to JEL codes.
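
The deduplication and length filters can be sketched as follows. The record fields (`title`, `year`, `text`) and the exact normalization rule are assumptions; the paper specifies only normalized title-year keys and a 1,000-character minimum.

```python
import re

def normalize_key(title: str, year: int) -> tuple[str, int]:
    """Build a dedup key: lowercase the title and collapse punctuation."""
    slug = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return (slug, year)

def filter_corpus(records: list[dict]) -> list[dict]:
    seen, kept = set(), []
    for rec in records:
        key = normalize_key(rec["title"], rec["year"])
        if key in seen:
            continue                    # duplicate title-year record
        if len(rec["text"]) < 1000:
            continue                    # paper shorter than 1,000 characters
        seen.add(key)
        kept.append(rec)
    return kept

records = [
    {"title": "A Study!", "year": 2000, "text": "x" * 1500},
    {"title": "a study",  "year": 2000, "text": "y" * 1500},  # duplicate
    {"title": "Short",    "year": 2001, "text": "z" * 10},    # too short
]
print(len(filter_corpus(records)))  # -> 1
```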

Multi‑Stage LLM Extraction Pipeline
A three‑stage workflow uses GPT‑4o‑mini as a constrained information‑retrieval engine:

  1. Stage 1 – Structured Summaries: The model extracts research questions, designs, and metadata from the first 30 pages, producing three independent runs per paper.
  2. Stage 2 – Edge Extraction: For each Stage 1 output, the model extracts candidate source‑target pairs and assigns an evidentiary label, again in three independent runs, yielding a 3 × 3 matrix (nine edge lists).
  3. Stage 3 – Concept Standardization: Free‑text entities are matched to JEL codes using embedding‑based similarity and a curated ontology.

Edges that appear in at least four of the nine lists (EO ≥ 4) are retained, providing a transparent precision‑recall trade‑off. Validation combines internal repeatability diagnostics, snippet‑based human verification, and external benchmarks (e.g., existing meta‑analysis datasets). Reported labeling accuracy exceeds 90% for causal designs and 88% for non‑causal categories.
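
The EO ≥ 4 consensus rule is a simple vote over the nine edge lists. A minimal sketch, assuming edges are hashable identifiers and each list counts an edge at most once:

```python
from collections import Counter

def consensus_edges(edge_lists, threshold=4):
    """Keep edges appearing in at least `threshold` of the extracted lists.

    Each inner list is deduplicated first, so one list contributes at
    most one vote per edge.
    """
    counts = Counter(edge for lst in edge_lists for edge in set(lst))
    return {edge for edge, n in counts.items() if n >= threshold}

# Nine lists from the 3 x 3 Stage 1 / Stage 2 runs.
nine_lists = [["a", "b"], ["a"], ["a"], ["a"],
              ["b"], ["b"], ["b"], ["c"], []]
print(consensus_edges(nine_lists))  # -> {'a', 'b'}
```

Raising `threshold` trades recall for precision, which is the knob the authors expose.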

Key Empirical Findings

  1. Rise of Causal Claims – The share of causal edges grows from 7.7% in 1990 to 31.7% in 2020, with notable heterogeneity across sub‑fields (e.g., labor and development economics show the steepest increases).

  2. Impact on Publication and Citations – Doubling the volume of causal edges raises the probability of top‑five journal placement by roughly 1.36 percentage points (≈12% relative to an 11.35% baseline) and boosts long‑run citations by about 11.2%. Doubling the number of new causal edges (edges that have not appeared in prior literature) yields an even larger effect: +1.71 pp in top‑five placement probability and +10.7% in citations. By contrast, expanding non‑causal edges shows weak or negative associations.

  3. Causal Novelty and “Gap‑Filling” – Introducing genuinely new causal pathways (connecting previously unlinked concept pairs) is positively associated with editorial success and citation impact, but only when the link is backed by a credible design. Purely descriptive or correlational gap‑filling does not exhibit a consistent citation advantage.
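
The relative effect reported in point 2 is a straightforward percentage-point-to-percent conversion, which can be checked directly:

```python
# Back-of-envelope check: a 1.36 pp gain on an 11.35% baseline
# corresponds to roughly a 12% relative increase.
baseline_pct = 11.35   # baseline top-five placement probability, in percent
gain_pp = 1.36         # estimated gain, in percentage points
relative = gain_pp / baseline_pct
print(f"{relative:.1%}")  # -> 12.0%
```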

Interpretation and Implications
These results demonstrate that the “credibility revolution” is not merely a shift in methodological tags at the paper level; it permeates the very argumentative structure of research. Journals and peer reviewers appear to reward depth of causal inference and originality of causal mechanisms, while non‑causal narrative complexity offers limited payoff. The claim‑graph framework thus provides a new quantitative lens for studying the economics of science, allowing researchers to separate “mechanistic depth” from “conceptual breadth” and to track how incentives shape methodological adoption.

Methodological Contributions

  1. A scalable pipeline that converts unstructured economic manuscripts into standardized, evidence‑annotated graphs.
  2. An open dataset (GitHub) and tooling for claim‑level queries, enabling replication and extension.
  3. A compact suite of paper‑level graph metrics (causal volume, causal novelty, narrative complexity, conceptual centrality) that are validated against multiple robustness checks.
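
Two of these metrics can be sketched with minimal assumed definitions: causal volume as the count of causal edges, and causal novelty as the count of causal edges whose concept pair has not appeared in prior literature. These definitions are illustrative simplifications, not the paper's exact constructions.

```python
def causal_volume(edges):
    """edges: iterable of (source, target, is_causal) triples."""
    return sum(1 for _, _, causal in edges if causal)

def causal_novelty(edges, prior_pairs):
    """Count causal edges whose (source, target) pair is new to the
    literature, represented here by the set `prior_pairs`."""
    return sum(
        1 for s, t, causal in edges
        if causal and (s, t) not in prior_pairs
    )

paper = [("J24", "I26", True), ("E52", "E31", False), ("O15", "F22", True)]
prior = {("J24", "I26")}
print(causal_volume(paper), causal_novelty(paper, prior))  # -> 2 1
```

In the paper's analysis, metrics like these enter as paper-level regressors for publication and citation outcomes.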

Limitations and Future Work
The approach relies on LLM extraction, which, despite extensive validation, may still miss nuanced claims or misclassify ambiguous designs. The 30‑page truncation could omit late‑stage robustness checks or discussion of limitations. Future research should explore full‑paper processing, ensemble LLM strategies, and automated verification of causal design descriptions (e.g., checklist‑based parsing). Extending claim graphs to other disciplines could illuminate cross‑field diffusion of causal reasoning and support meta‑analytic synthesis, policy translation, and media framing studies.

In sum, the paper offers both a novel methodological infrastructure for representing scholarly claims and substantive evidence that causal depth and novelty are key drivers of academic impact in economics.

