Achieving Sustainable Development Goal 7 (Affordable and Clean Energy) requires not only technological innovation but also a deeper understanding of the socioeconomic factors influencing energy access and carbon emissions. While these factors are gaining attention, critical questions remain, particularly regarding how to quantify their impacts on energy systems, model their cross-domain interactions, and capture feedback dynamics in the broader context of energy transitions. To address these gaps, this study introduces ClimateAgents, an AI-based framework that combines large language models with domain-specialized agents to support hypothesis generation and scenario exploration. Leveraging 20 years of socioeconomic and emissions data covering 265 economies (countries and regions) and 98 indicators drawn from the World Bank database, the framework applies a machine-learning-based causal inference approach to identify key determinants of carbon emissions in an evidence-based, data-driven manner. The analysis highlights three primary drivers: access to clean cooking fuels in rural areas, access to clean cooking fuels in urban areas, and the percentage of the population living in urban areas. These findings underscore the critical role of clean cooking technologies and urbanization patterns in shaping emission outcomes. In line with growing calls for evidence-based AI policy, ClimateAgents offers a modular and reflexive learning system that supports the generation of credible and actionable insights for policy. By integrating heterogeneous data modalities, including structured indicators, policy documents, and semantic reasoning, the framework contributes to adaptive policymaking infrastructures that can evolve with complex socio-technical challenges. This approach aims to support a shift from siloed modeling to reflexive, modular systems designed for dynamic, context-aware climate action.
support causal inquiry in LLM prompts [22,23]. Critically, no dataset addresses socio-climate-related causal questions, leaving a major gap for advanced applications [24].
To ensure reliable and standardized evidence-based analysis, this research adopts the World Bank Development Indicators, a widely recognized, high-quality, and publicly available data framework. This data-driven foundation enhances the credibility and precision of the study's outputs. By applying causal inference techniques with machine learning algorithms, the analysis moves beyond simple correlation to uncover deeper, more robust relationships. This enables more grounded, interpretable reasoning for policy-making. Furthermore, the integration of large language models (LLMs) supports evidence-based analysis by generating outputs that aim to be credible and actionable, as their interpretability can facilitate context-aware and informed decision-making [25,26,27].
The orchestration of these three modules reflects a system-level design philosophy rooted in modularity, specialization, and agent-based coordination. To address these challenges and extend the utility of LLMs, this work proposes a multi-agent architecture grounded in Minsky's philosophy of modular, emergent intelligence. Rather than treating LLMs as monolithic tools, the proposed system distributes reasoning and task execution across a set of interacting agents, each specialized for a distinct function. The resulting framework, ClimateAgents, is a reflexive causal modeling system powered by GPT family models [28] and accessed via the OpenAI Application Programming Interface [29]. It moves beyond static prompt-response paradigms, enabling adaptive reasoning within complex socio-environmental systems. Central to this architecture is the concept of Reflexive Machine Learning, defined here as a process through which agents iteratively adjust their prompts, inference strategies, or actions in response to environmental feedback and task complexity, thereby supporting context-aware and adaptive decision-making.
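The reflexive adjustment described above can be illustrated with a minimal sketch. It is not the ClimateAgents implementation: the function `reflexive_loop` and the two stub callables `respond` and `critique` are hypothetical names introduced here, and the stubs stand in for an LLM call and an environment or reviewer agent so that the example runs without any API access.

```python
from typing import Callable, List, Optional, Tuple

Round = Tuple[str, str, Optional[str]]  # (prompt, answer, feedback)


def reflexive_loop(
    prompt: str,
    respond: Callable[[str], str],
    critique: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> List[Round]:
    """Revise the prompt until the critic has no feedback or rounds run out."""
    history: List[Round] = []
    for _ in range(max_rounds):
        answer = respond(prompt)        # in a real system: an LLM call
        feedback = critique(answer)     # in a real system: environment/agent feedback
        history.append((prompt, answer, feedback))
        if feedback is None:            # nothing left to fix, stop early
            break
        # Fold the feedback back into the next prompt.
        prompt = f"{prompt}\nRevise to address: {feedback}"
    return history


# Hypothetical stubs, used only to make the sketch executable.
def stub_respond(prompt: str) -> str:
    if "urbanization" in prompt.lower():
        return "Emissions are linked to urbanization and clean fuel access."
    return "Emissions depend on many factors."


def stub_critique(answer: str) -> Optional[str]:
    return None if "urbanization" in answer.lower() else "consider urbanization"


if __name__ == "__main__":
    for step in reflexive_loop("What drives CO2 emissions?", stub_respond, stub_critique):
        print(step)
```

Keeping the responder and the critic as plain callables means either one could be swapped for a model-backed agent without changing the loop itself.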
The contributions of this work are as follows: (i) introduction of a reflexive multi-agent architecture for causal analysis and policy simulation in socio-environmental contexts; (ii) integration of multimodal data with LLM-driven agents to complement statistical models through simulation, reasoning, and hypothesis generation; and (iii) proposal of Reflexive Machine Learning as a natural-language interface for interpretable modeling of complex systems.
ClimateAgents consists of three components: (i) a perception layer that structures multimodal inputs into formal representations (e.g., indicators, semantic frames); (ii) a reasoning layer for planning, inference, and adaptive decision-making; and (iii) an operation layer that performs causal inference, modeling, and policy simulations, with outputs interpreted via LLMs. A continuous agent feedback loop enables real-time refinement and contextual adaptation for evidence-based policy support (Figures 1 and 3).
Building on earlier work [8], this study further introduces a three-stage comparative framework for investigating causal relationships in the context of social science and climate change, aimed at supporting evidence-based reasoning. The pipeline combines (i) correlation analysis to identify initial statistical associations, (ii) machine-learning-based causal discovery to estimate structural dependencies, and (iii) LLM-guided prompt exploration to surface contextual explanations and generate policy-relevant hypotheses. Each stage contributes distinct but complementary evidence toward causal interpretation, facilitating more transparent and informed downstream analysis. This approach is designed to support the development of empirical insights that can inform decision-making in complex socio-environmental systems.
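As an illustration of stage (i) only, the following sketch ranks candidate indicators by the absolute value of their Pearson correlation with an emissions series. The data are synthetic, and the indicator names, the helper `make_demo_data`, and the choice of Pearson correlation are assumptions made for the example; the source does not specify which correlation measure is used.

```python
import numpy as np


def rank_by_correlation(indicators, target):
    """Return (name, Pearson r) pairs sorted by |r|, strongest first."""
    scores = [
        (name, float(np.corrcoef(values, target)[0, 1]))
        for name, values in indicators.items()
    ]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


def make_demo_data(n=200, seed=0):
    """Synthetic stand-in for indicator data (hypothetical names and values)."""
    rng = np.random.default_rng(seed)
    urban = rng.normal(size=n)
    unrelated = rng.normal(size=n)
    # Emissions are built to depend on the first indicator only.
    emissions = 2.0 * urban + 0.1 * rng.normal(size=n)
    indicators = {
        "urban_population_pct": urban,
        "unrelated_indicator": unrelated,
    }
    return indicators, emissions


if __name__ == "__main__":
    indicators, emissions = make_demo_data()
    for name, r in rank_by_correlation(indicators, emissions):
        print(f"{name}: r = {r:+.3f}")
```

A ranking of this kind only flags associations; it says nothing about direction or confounding, which is why the pipeline follows it with causal discovery.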
Evidence retrieval is demonstrated through text classification of agent-generated prompts (Figure 2), which revealed themes related to carbon emission prediction, including model diversity, geographic specificity, and environmental justice. Using Biopython and the NCBI Entrez database, the system retrieves and synthesizes relevant literature, supporting large-scale climate and air quality research. Causal effects are estimated following Rolland et al. [30], modeling each variable as a function of its causal parents with additive noise. Leaf nodes are identified using derivatives of the score function, and a topological ordering is obtained by sequential leaf removal, with the Jacobian approximated by the Stein gradient estimator and the resulting graph refined through the CAM procedure [30]. Validation and interpretation involve domain expertise and standard metrics, with results highlighting key drivers such as rural and urban access to clean fuels and urbanization growth. To validate the contribution of LLMs to causal inference, the framework applies the World Bank Development Indicators and employs a taxonomy of causality [31,32].
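The ordering step can be sketched in NumPy as follows. The model assumed is the additive noise form X_i = f_i(parents of X_i) + noise_i, and the sketch follows the general idea of score-based leaf detection: the variable whose estimated diagonal entry of the Hessian of log p(x) varies least across samples is treated as a leaf, removed, and the procedure repeats. This is a simplified illustration, not the authors' code or the reference implementation of [30]; the regularization values `eta_g` and `eta_h`, the median-heuristic bandwidth, and the synthetic demo data are assumptions, and the CAM refinement step is omitted.

```python
import numpy as np


def stein_hessian_diag(X, eta_g=0.001, eta_h=0.001):
    """Kernel (Stein-type) estimate of diag(Hessian of log p) at each sample."""
    n, _ = X.shape
    diff = X[:, None, :] - X[None, :, :]      # diff[k, i, j] = X[k, j] - X[i, j]
    dist = np.linalg.norm(diff, axis=2)
    s = np.median(dist)                       # bandwidth (assumed median heuristic)
    K = np.exp(-dist**2 / (2 * s**2)) / s     # RBF kernel matrix
    nabla_k = -np.einsum("kij,ik->kj", diff, K) / s**2
    G = np.linalg.solve(K + eta_g * np.eye(n), nabla_k)   # score estimate, up to sign
    nabla2_k = np.einsum("kij,ik->kj", -1 / s**2 + diff**2 / s**4, K)
    return -G**2 + np.linalg.solve(K + eta_h * np.eye(n), nabla2_k)


def topological_order(X, eta_g=0.001, eta_h=0.001):
    """Return column indices ordered from estimated root to estimated leaf."""
    active = list(range(X.shape[1]))
    leaves_first = []
    while len(active) > 1:
        H = stein_hessian_diag(X[:, active], eta_g, eta_h)
        leaf_pos = int(np.argmin(H.var(axis=0)))   # least-varying entry -> leaf
        leaves_first.append(active.pop(leaf_pos))
    leaves_first.append(active[0])
    return leaves_first[::-1]


def make_chain_data(n=300, seed=0):
    """Synthetic nonlinear chain x0 -> x1 -> x2 (illustrative only)."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(size=n)
    x1 = np.sin(2 * x0) + 0.5 * rng.normal(size=n)
    x2 = x1**2 + 0.5 * rng.normal(size=n)
    return np.column_stack([x0, x1, x2])


if __name__ == "__main__":
    print(topological_order(make_chain_data()))
```

The kernel matrices are n by n, so the cost grows quickly with the number of samples; the sketch is meant for small illustrative datasets, and the estimated order on finite data is not guaranteed to match the true one.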