Doc2Spec: Synthesizing Formal Programming Specifications from Natural Language via Grammar Induction

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Ensuring that API implementations and usage comply with natural language programming rules is critical for software correctness, security, and reliability. Formal verification can provide strong guarantees but requires precise specifications, which are difficult and costly to write manually. To address this challenge, we present Doc2Spec, a multi-agent framework that uses LLMs to automatically induce a specification grammar from natural-language rules and then generates formal specifications guided by the induced grammar. The grammar captures essential domain knowledge, constrains the specification space, and enforces consistent representations, thereby improving the reliability and quality of generated specifications. Evaluated on seven benchmarks across three programming languages, Doc2Spec outperforms a baseline without grammar induction and achieves competitive results against a technique with a manually crafted grammar, demonstrating the effectiveness of automated grammar induction for formalizing natural-language rules.


💡 Research Summary

Doc2Spec addresses the challenge of converting natural‑language API rules into formal specifications without costly manual effort or large labeled datasets. The system operates as a five‑stage multi‑agent pipeline:

1. An EntityLocalizer combines LLM‑generated regular expressions with deterministic regex matching and a sliding‑window strategy to reliably locate code entities in potentially long documents.
2. An AttributeAgent extracts a comprehensive schema of each entity’s attributes (name, type, visibility, etc.) using LLM prompts guided by one‑shot examples.
3. The system isolates rule‑bearing sentences from the surrounding documentation.
4. A GrammarInduction module prompts the LLM with a template that asks it to infer domain‑specific sorts and predicates from the extracted rules; these are assembled into an EBNF grammar that captures the essential domain concepts.
5. A SpecificationGenerator uses the induced grammar as a strict scaffold to translate the natural‑language rules into formal specifications suitable for verification tools such as symbolic execution engines.

All LLM outputs are validated against JSON schemas and re‑queried on failure, improving robustness. Evaluation on seven benchmarks spanning Solidity, Rust, and Java (covering memory allocation, smart contracts, and database usage) shows that Doc2Spec outperforms a baseline without grammar induction by +0.27 precision and +0.11 recall, and performs competitively with or better than SymGPT, which relies on a manually crafted grammar. When integrated with SymGPT’s verification backend, specifications generated by Doc2Spec uncovered 120 bugs in real‑world smart contracts, demonstrating practical impact.
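The sliding‑window localization in the pipeline's first stage can be sketched as follows. This is a minimal illustration, not the paper's implementation: the regex, window sizes, and sample document are invented for the example, and the deduplication assumes a match fits inside the window overlap.

```python
import re

def locate_entities(document, pattern, window=2000, overlap=200):
    """Scan a long document in overlapping windows and collect every
    match of an entity-naming regex, keyed by absolute offset.
    Overlapping windows re-scan boundary regions, so a match truncated
    at the end of one window is overwritten by the full match found in
    the next window (matches longer than `overlap` may still be cut)."""
    regex = re.compile(pattern)
    hits = {}
    step = window - overlap
    for start in range(0, max(len(document), 1), step):
        chunk = document[start:start + window]
        for m in regex.finditer(chunk):
            hits[start + m.start()] = m.group(0)
    return sorted(hits.items())

# Hypothetical documentation snippet and entity pattern, for illustration.
doc = "The function safe_add(uint a, uint b) reverts on overflow; safe_sub(...) mirrors it."
print(locate_entities(doc, r"\bsafe_\w+", window=40, overlap=10))
# [(13, 'safe_add'), (59, 'safe_sub')]
```

In the actual framework the pattern itself is proposed by the LLM; the deterministic matching step shown here is what keeps localization reliable on long inputs.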
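The grammar‑induction stage can be illustrated with a toy example. The grammar below is hypothetical (the paper's actual sorts and predicates are not reproduced here), and the conformance check is approximated with a regular expression rather than a real EBNF parser; it only shows how an induced grammar constrains the space of specifications the generator may emit.

```python
import re

# Hypothetical induced EBNF grammar for ERC-style transfer rules
# (illustrative only; not taken from the paper).
INDUCED_GRAMMAR = """
spec      ::= "emit" "(" event ")" | "revert_if" "(" condition ")"
condition ::= term cmp term
term      ::= "balance" "(" ident ")" | "amount" | number
cmp       ::= "<" | ">" | "=="
event     ::= "Transfer" | "Approval"
"""

# Regex approximation of the `spec` production above.
SPEC_RE = re.compile(
    r"^(emit\((Transfer|Approval)\)"
    r"|revert_if\((balance\(\w+\)|amount|\d+)\s*(<|>|==)\s*(balance\(\w+\)|amount|\d+)\))$"
)

def conforms(spec: str) -> bool:
    """Accept only strings derivable from the induced grammar."""
    return bool(SPEC_RE.match(spec))

print(conforms("revert_if(balance(sender) < amount)"))  # True: in the grammar
print(conforms("ensure balance nonnegative"))           # False: free-form text rejected
```

Constraining generation this way is what gives the pipeline consistent representations: every specification the generator produces must parse under the induced grammar, so downstream verifiers see a fixed vocabulary of sorts and predicates.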
The paper’s contributions are (1) an automated grammar‑induction technique from natural‑language rules, (2) the Doc2Spec multi‑agent framework that couples grammar induction with formal specification synthesis, and (3) extensive empirical validation across multiple languages and domains, establishing that automatically induced grammars can effectively constrain LLM output, improve consistency, and enable reliable formal verification without extensive domain engineering.
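The validate‑and‑re‑query loop applied to all LLM outputs can be sketched as below. This is a simplified stand‑in: the paper validates against JSON schemas, while this sketch checks only for required keys, and `ask_llm` is a placeholder for a real model call.

```python
import json

def validate(obj, required):
    """Minimal schema check: obj must be a dict containing each required key."""
    return isinstance(obj, dict) and all(k in obj for k in required)

def query_with_retry(ask_llm, prompt, required, max_tries=3):
    """Call the model, parse its JSON output, and re-query until the
    output passes validation or the retry budget is exhausted."""
    for attempt in range(max_tries):
        raw = ask_llm(prompt if attempt == 0 else prompt + "\nReturn valid JSON only.")
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON: re-query
        if validate(obj, required):
            return obj
    raise ValueError("no valid output after retries")

# Fake model for demonstration: fails once with truncated JSON, then succeeds.
replies = iter(['{"name": "transfer"', '{"name": "transfer", "type": "function"}'])
result = query_with_retry(lambda p: next(replies), "Extract attributes.", ["name", "type"])
print(result["type"])  # function
```

In practice this retry loop is what lets the multi‑agent pipeline tolerate occasional malformed model outputs without human intervention.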

