Monotonic Reference-Free Refinement for Autoformalization

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

While statement autoformalization has advanced rapidly, full-theorem autoformalization remains largely unexplored. Existing iterative refinement methods in statement autoformalization typically improve isolated aspects of formalization, such as syntactic correctness, but struggle to jointly optimize multiple quality dimensions, which is critical for full-theorem autoformalization. We introduce a reference-free iterative monotonic process for full-theorem autoformalization that leverages complementary feedback from theorem provers and LLM-based judges, without access to ground-truth proofs or existing formalizations at inference time. Our approach optimizes a masked composite objective over Formal Validity, Logical Preservation, Mathematical Consistency, and Formal Quality, guided by a responsiveness map that indicates how different LLMs acting in different roles preferentially improve each dimension. We further propose an acceptance policy that guarantees certified monotonic improvement, and provide conditions ensuring convergence and termination. Empirical experiments demonstrate that the proposed process enables simultaneous improvement across multiple dimensions, achieving 93.44% formal validity and a 78.22% overall score on miniF2F, and 44.09% formal validity and a 29.79% overall score on ProofNet.


💡 Research Summary

The paper tackles the largely unexplored problem of full‑theorem autoformalization, which requires generating both a formal statement and a mechanically verified proof from a natural‑language theorem and its informal proof. The authors argue that four quality dimensions must be jointly optimized: (1) Formal Validity (FV) – the formalization must be accepted by a theorem prover; (2) Logical Preservation (LP) – the logical structure of the informal theorem must be retained; (3) Mathematical Consistency (MC) – the mathematical objects and operations must be faithfully represented; and (4) Formal Quality (FQ) – the resulting code should be clear, concise, and reusable.

To address this multi‑objective optimization, the authors introduce a masked composite objective
\(J_{OA}(x)=\frac{1}{3}\,\pi_{FV}(x)\bigl(\pi_{LP}(x)+\pi_{MC}(x)+\pi_{FQ}(x)\bigr)\).
Here \(\pi_{FV}(x)\in\{0,1\}\) is a hard mask supplied by a theorem prover, while the other three scores lie in (
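
As a rough illustration of how the masked composite objective behaves, the following Python sketch computes the overall score from the four dimension scores and shows that a candidate rejected by the theorem prover scores zero regardless of its other scores. The `Scores` container, the function names, and the numeric values are hypothetical and chosen only for this example. The `accept` function is a plain strict-improvement check assumed here for illustration; it is not the paper's certified acceptance policy, whose details are not given in this summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Per-dimension scores for one candidate formalization (hypothetical container)."""
    fv: int    # Formal Validity: 1 if the theorem prover accepts, else 0
    lp: float  # Logical Preservation
    mc: float  # Mathematical Consistency
    fq: float  # Formal Quality


def overall_score(s: Scores) -> float:
    """Compute (1/3) * FV * (LP + MC + FQ); FV acts as a hard mask."""
    if s.fv not in (0, 1):
        raise ValueError("fv must be 0 or 1")
    return s.fv * (s.lp + s.mc + s.fq) / 3.0


def accept(current: Scores, candidate: Scores) -> bool:
    """Assumed rule for illustration: keep a candidate only if the overall score strictly increases."""
    return overall_score(candidate) > overall_score(current)


# Illustrative values only; they are not results from the paper.
valid = Scores(fv=1, lp=0.5, mc=0.75, fq=0.25)
invalid = Scores(fv=0, lp=0.5, mc=0.75, fq=0.25)

print(overall_score(valid))    # average of the three soft scores
print(overall_score(invalid))  # masked to zero by the failed prover check
print(accept(invalid, valid))  # moving to a prover-accepted candidate is an improvement
```

Because Formal Validity multiplies the sum rather than being added to it, no amount of improvement in the other three dimensions can compensate for a formalization the prover rejects.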

