TxSum Dataset and MATEX Multi-Agent System for Understanding the Economic Intent of Ethereum Transactions

📝 Abstract

Understanding the economic intent of Ethereum transactions is critical for user safety, yet current tools expose only raw on-chain data, leading to widespread “blind signing” (approving transactions without understanding them). Through interviews with 16 Web3 users, we find that effective explanations should be structured, risk-aware, and grounded at the token-flow level. Based on these interviews, we propose TxSum, a new task and dataset of 100 complex Ethereum transactions annotated with natural-language summaries and step-wise semantic labels (intent, mechanism, etc.). We then introduce MATEX, a multi-agent system that emulates human experts’ dual-process reasoning. MATEX achieves the highest faithfulness and intent clarity among strong baselines. It boosts user comprehension by 23.6% on complex transactions and doubles users’ ability to find real attacks, significantly reducing blind signing.

📄 Content

MATEX: A Multi-Agent Framework for Explaining Ethereum Transactions

Zifan Peng¹, Jingyi Zheng¹, Yule Liu¹, Huaiyu Jia¹, Qiming Ye¹, Jingyu Liu¹, Xufeng Yang², Mingchen Li³, Qingyuan Gong⁴, Xuechao Wang¹, Xinlei He¹*
¹Hong Kong University of Science and Technology (Guangzhou), ²Independent Researcher, ³University of North Texas, ⁴Fudan University
*Corresponding author.

Abstract

Understanding the economic intent of Ethereum transactions is critical for user safety, yet current tools expose only raw on-chain data, leading to widespread “blind signing” (approving transactions without understanding them). Through interviews with 16 Web3 users, we find that effective explanations should be structured, risk-aware, and grounded at the token-flow level. Based on interviews, we propose TxSum, a new task and dataset of 100 complex Ethereum transactions annotated with natural-language summaries and step-wise semantic labels (intent, mechanism, etc.). We then introduce MATEX, a multi-agent system that emulates human experts’ dual-process reasoning. MATEX achieves the highest faithfulness and intent clarity among strong baselines. It boosts user comprehension by 23.6% on complex transactions and doubles users’ ability to find real attacks, significantly reducing blind signing.

1 Introduction

Understanding the economic intent of an Ethereum transaction is crucial for both proactive risk mitigation and post-incident analysis. It enables users to preview outcomes and avoid irreversible errors before signing, and supports regulatory compliance, anti-money laundering, and forensic diagnosis after exploits, such as the TraceLLM [1] case, where on-chain analysis uncovered attack vectors in major DeFi incidents.

However, this understanding is critically lacking in practice. Modern DeFi transactions are highly compositional, leading to widespread “blind signing” [2, 3], a phenomenon with severe consequences. In early 2025, attackers stole $1.5B from Bybit by tricking operators into approving a malicious transaction that appeared innocuous in their interface [4]. Even in routine interactions, users struggle: new users are confused by dual-sign patterns (approve + swap), while experienced users rely on fragile heuristics like “reject unlimited approvals,” which fail in multi-contract scenarios.

These incidents reveal a fundamental gap: current tools expose raw on-chain data but fail to translate it into human-understandable economic narratives. While tools like MetaSuites [5], EigenPhi [6], and Tenderly [7] expose detailed token flows and contract interactions, they stop at the syntactic level. They do not interpret what these low-level events mean economically [8].

As Figure 1 shows, a user’s simple intent (e.g., “swap ETH for DAI”) is often executed as a fragmented, multi-contract transaction. Raw traces reveal what happened but not why, leaving users unable to distinguish a routine swap from a high-risk operation.

This gap between user mental models and on-chain reality motivates our research questions (RQs):
• RQ1: How do Web3 users understand transaction details when signing?
• RQ2: What do users expect from transaction explanations in terms of content and format?
• RQ3: How can an explanation framework be designed, and how do the explanations it generates affect users’ comprehension and signing behavior?

To address this gap, we first conduct a user study with 16 diverse Web3 participants to uncover how real users interpret transactions and what explanatory elements they find most valuable. Guided by these findings, we propose TxSum: a new task and schema for transaction understanding that operates at the token-flow level. TxSum defines five semantic attributes (see Section 3) that ground explanations in step-wise, verifiable actions aligned with user mental models.
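
As a rough illustration of what grounding an explanation at the token-flow level could look like, the following Python sketch models one annotated flow and derives per-address net balances from a list of flows. It is not the paper's schema: only `intent` and `mechanism` are named in the text above, the other schema attributes are not listed in this excerpt and are left out, and the class name `FlowLabel`, the function `net_balances`, the address labels, and the label strings are all hypothetical. The example amounts loosely follow the 10 ETH split into 6 ETH and 4 ETH shown in Figure 1; the exact routing is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FlowLabel:
    """One annotated token flow (hypothetical structure, not the paper's schema)."""
    sender: str
    receiver: str
    token: str
    amount: int      # whole token units, for readability
    intent: str      # attribute named in the source text
    mechanism: str   # attribute named in the source text


def net_balances(flows: List[FlowLabel]) -> Dict[Tuple[str, str], int]:
    """Sum inflows minus outflows for every (address, token) pair."""
    net: Dict[Tuple[str, str], int] = defaultdict(int)
    for flow in flows:
        net[(flow.sender, flow.token)] -= flow.amount
        net[(flow.receiver, flow.token)] += flow.amount
    return dict(net)


# Illustrative flows: a router receives 10 ETH and splits it across two pools.
flows = [
    FlowLabel("User", "Router", "ETH", 10, "swap ETH for DAI", "send to aggregator"),
    FlowLabel("Router", "PoolA", "ETH", 6, "swap ETH for DAI", "split route"),
    FlowLabel("Router", "PoolB", "ETH", 4, "swap ETH for DAI", "split route"),
]
balances = net_balances(flows)
```

A per-address net view like this is one way to make each step verifiable: the router's ETH balance nets to zero, so any residue it kept would stand out as a fee or surplus.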
We then construct a high-quality dataset of 100 complex Ethereum transactions, each annotated with (1) a 3–4 sentence natural-language summary and (2) fine-grained flow-level labels following our pre-defined schema.

Finally, we introduce MATEX, a cognitive multi-agent framework motivated by how human experts analyze transactions: it implements a dual-process reasoning workflow [9, 10], where fast pattern recognition (System 1) flags uncertainty and triggers slow, evidence-based investigation (System 2). The system decomposes traces, retrieves live protocol context to resolve ambiguity, and fuses evidence into attributed narratives.

We evaluate MATEX through automatic and expert assessment, a user study, and risk auditing on real transactions.

arXiv:2512.06933v2 [cs.CE] 5 Jan 2026

[Figure 1 placeholder. Panel A, “User’s Simple Intent”: 10 ETH for 25,000 DAI; the user expects a best-price swap via an aggregator. Panel B, “Actual Aggregated Execution”: step labels (1) 10 ETH, (2a) 6 ETH, (2b) 4 ETH, (3) Net DAI (post-fees); nodes: User, Aggregator Router, Uniswap V3 (ETH/USDC), Uniswap V3 (USDC/DAI), SushiSwap (ETH/DAI), Referral Proxy, Surplus Collector; several edges labeled Fee.]
Figure 1: Mismatch between user intent and on-chain exec
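
The dual-process workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MATEX's implementation: the fast path (System 1) is reduced to a lookup of two well-known ERC-20 function selectors, and the slow path (System 2) is a caller-supplied function standing in for context retrieval. The names `Step`, `Finding`, `system1`, `explain`, and `stub_investigate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Well-known ERC-20 function selectors (first 4 bytes of calldata).
KNOWN_PATTERNS: Dict[str, str] = {
    "a9059cbb": "ERC-20 transfer",  # transfer(address,uint256)
    "095ea7b3": "ERC-20 approve",   # approve(address,uint256)
}


@dataclass
class Step:
    target: str      # contract called in this step
    calldata: bytes  # raw input data of the call


@dataclass
class Finding:
    label: str
    evidence: str    # where the label came from, so the narrative stays attributable


def system1(step: Step) -> Optional[Finding]:
    """Fast path: label a step by selector, or return None to flag uncertainty."""
    label = KNOWN_PATTERNS.get(step.calldata[:4].hex())
    return Finding(label, "selector match") if label is not None else None


def explain(steps: List[Step], investigate: Callable[[Step], Finding]) -> List[Finding]:
    """Run the fast path on every step and escalate only the uncertain ones."""
    findings = []
    for step in steps:
        quick = system1(step)
        findings.append(quick if quick is not None else investigate(step))
    return findings


def stub_investigate(step: Step) -> Finding:
    """Placeholder for the slow path; a real system would retrieve protocol context."""
    return Finding("unknown call", f"needs protocol context for {step.target}")


steps = [
    Step("0xToken", bytes.fromhex("095ea7b3") + b"\x00" * 64),
    Step("0xRouter", bytes.fromhex("deadbeef")),  # made-up selector, not a real function
]
result = explain(steps, stub_investigate)
```

Keeping an `evidence` field on every finding reflects the attributed-narrative idea: each label records whether it came from a cheap pattern match or from a deeper investigation.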

This content is AI-processed based on ArXiv data.
