Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents
📝 Original Info
- Title: Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents
- ArXiv ID: 2510.24702
- Date: 2025-10-28
- Authors: Not provided (author information is not listed in the source; to be confirmed)
📝 Abstract
Public research results on large-scale supervised fine-tuning of AI agents remain relatively rare, since the collection of agent training data presents unique challenges. In this work, we argue that the bottleneck is not a lack of underlying data sources, but that a large variety of data is fragmented across heterogeneous formats, tools, and interfaces. To this end, we introduce the agent data protocol (ADP), a lightweight representation language that serves as an "interlingua" between agent datasets in diverse formats and unified agent training pipelines downstream. The design of ADP is expressive enough to capture a large variety of tasks, including API/tool use, browsing, coding, software engineering, and general agentic workflows, while remaining simple to parse and train on without engineering at a per-dataset level. In experiments, we unified a broad collection of 13 existing agent training datasets into ADP format and converted the standardized ADP data into training-ready formats for multiple agent frameworks. We performed SFT on these data and demonstrated an average performance gain of ~20% over the corresponding base models, delivering state-of-the-art or near-SOTA performance on standard coding, browsing, tool-use, and research benchmarks without domain-specific tuning. All code and data are released publicly, in the hope that ADP can help lower the barrier to standardized, scalable, and reproducible agent training.
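The abstract describes ADP as an "interlingua": heterogeneous agent datasets are converted into one standardized trajectory representation, which is then rendered into training-ready formats for different agent frameworks. The sketch below illustrates that idea with a minimal unified trajectory schema and one downstream renderer. All field names, roles, and the `to_chat_sft` function are illustrative assumptions for this summary, not the paper's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical unified trajectory record in the spirit of ADP.
# Real ADP field names and step types may differ.

@dataclass
class Step:
    role: str      # "agent" or "environment" (assumed roles)
    kind: str      # e.g. "message", "tool_call", "observation"
    content: str   # text payload or serialized tool arguments

@dataclass
class Trajectory:
    task: str             # natural-language task description
    source_dataset: str   # provenance of the original sample
    steps: list[Step] = field(default_factory=list)

def to_chat_sft(traj: Trajectory) -> list[dict[str, str]]:
    """Render a unified trajectory as chat-style SFT messages
    (one possible downstream training format)."""
    role_map = {"agent": "assistant", "environment": "tool"}
    msgs = [{"role": "user", "content": traj.task}]
    msgs += [{"role": role_map[s.role], "content": s.content}
             for s in traj.steps]
    return msgs

# A tool-use sample from some source dataset, after conversion:
traj = Trajectory(
    task="Find the weather in Paris.",
    source_dataset="example-tool-use",
    steps=[
        Step("agent", "tool_call", 'get_weather(city="Paris")'),
        Step("environment", "observation", '{"temp_c": 18}'),
        Step("agent", "message", "It is 18 degrees C in Paris."),
    ],
)
```

Because every source dataset is first mapped into `Trajectory`, only one renderer per training framework is needed, rather than one converter per dataset-framework pair.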
Reference
This content is AI-processed based on open access ArXiv data.