Automated QoR improvement in OpenROAD with coding agents
EDA development and innovation have been constrained by a scarcity of expert engineering resources. While leading LLMs have demonstrated excellent performance in coding and scientific reasoning tasks, their capacity to advance EDA technology itself has been largely untested. We present AuDoPEDA, an autonomous, repository-grounded coding system built atop OpenAI models and a Codex-class agent that reads OpenROAD, proposes research directions, expands them into implementation steps, and submits executable diffs. Our contributions include (i) a closed-loop LLM framework for EDA code changes; (ii) a task suite and evaluation protocol on OpenROAD for PPA-oriented improvements; and (iii) end-to-end demonstrations with minimal human oversight. Experiments in OpenROAD achieve routed wirelength reductions of up to 5.9%, effective clock period reductions of up to 10.0%, and power reductions of up to 19.4%.
💡 Research Summary
The paper introduces AuDoPEDA, an autonomous, repository‑grounded coding system that leverages large language models (LLMs) to improve quality‑of‑results (QoR) in the OpenROAD physical design stack. The authors argue that the scarcity of senior EDA engineers limits innovation, and they explore whether an LLM‑driven pipeline can make meaningful code contributions at the scale of a production‑grade EDA repository.
System Overview
AuDoPEDA is organized into four tightly coupled stages, each producing versioned artifacts that enable auditability and deterministic replay:
S0 – Repository Graphing & Documentation
- Uses tree‑sitter to parse C++, Tcl, Python, and Verilog files into language‑agnostic abstract syntax trees (ASTs).
- Constructs a property graph G where nodes represent files, declarations, definitions, and call sites, and typed edges capture calls, includes, bindings, and script‑to‑core links.
- Applies condensation and sparsification to keep the graph tractable.
- A “Docmaker” traverses G bottom‑up, extracts signatures, default parameters, invariants, and configuration flags, then employs a code‑specialized LLM to generate concise documentation cards (role, I/O, pre/post‑conditions, tunable knobs).
- Cards are validated against G and indexed with both sparse (BM25) and dense embeddings for fast retrieval.
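The S0 graph and the bottom-up Docmaker traversal can be illustrated with a minimal sketch. It assumes a plain in-memory property graph and uses Python's standard `graphlib` to order nodes so that callees are documented before their callers. The class `PropertyGraph`, the node names (`run_flow.tcl`, `route_net`, `edge_cost`), and the edge labels are hypothetical and are not taken from AuDoPEDA or OpenROAD. The sketch also assumes the call graph is already acyclic, which is what the condensation step would provide.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class PropertyGraph:
    """Toy property graph: nodes carry properties, edges carry a type label."""
    nodes: dict = field(default_factory=dict)  # node id -> property dict
    edges: list = field(default_factory=list)  # (src, edge_type, dst) triples

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props

    def add_edge(self, src, edge_type, dst):
        self.edges.append((src, edge_type, dst))

    def bottom_up(self, edge_type="calls"):
        """Return node ids so that every callee precedes its callers.

        Assumes the subgraph of `edge_type` edges is acyclic.
        """
        deps = {node_id: set() for node_id in self.nodes}
        for src, etype, dst in self.edges:
            if etype == edge_type:
                deps[src].add(dst)  # src depends on dst being documented first
        return list(TopologicalSorter(deps).static_order())


# Hypothetical three-node repository: a Tcl script bound to a C++ routine
# that calls a cost helper.
g = PropertyGraph()
g.add_node("run_flow.tcl", kind="file", language="tcl")
g.add_node("route_net", kind="definition", language="cpp")
g.add_node("edge_cost", kind="definition", language="cpp")
g.add_edge("run_flow.tcl", "binds", "route_net")  # script-to-core link
g.add_edge("route_net", "calls", "edge_cost")

order = g.bottom_up("calls")
print(order)
```

Visiting leaves first means that the card for a caller can cite the already generated cards of the routines it depends on.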
S1 – Literature‑Grounded Planning (DSPy)
- Maintains two corpora: C_repo (the documentation cards) and C_lit (EDA papers, tutorials, and wikis).
- Implements a declarative LLM program using DSPy, which compiles a Retrieve‑Synthesize‑Validate workflow.
- Retrieval combines RAG techniques with a re‑ranker to pull relevant literature and repository context for a given QoR objective (e.g., reduce routed wirelength).
- Synthesis produces a high‑level research plan that includes hypotheses, algorithmic interventions (cost‑model tweaks, congestion penalties), telemetry hooks, and suggested code locations.
- Validation uses LM Assertions to ensure feasibility: referenced APIs exist, parameters lie within acceptable ranges, and invariants are respected.
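The feasibility checks in the validation step can be approximated in plain Python. The sketch below is a stand-in for the idea only and does not use DSPy's assertion API. The card layout (`param_ranges`), the plan-step fields (`api`, `params`), and the function name `validate_plan` are assumptions made for illustration.

```python
def validate_plan(plan, cards):
    """Return a list of human-readable violations; an empty list means feasible.

    plan  : list of steps, each {"api": str, "params": {name: value}}
    cards : documentation cards, {api_name: {"param_ranges": {name: (lo, hi)}}}
    """
    violations = []
    for step in plan:
        card = cards.get(step["api"])
        if card is None:
            # The plan references something the repository index does not contain.
            violations.append(f"unknown API: {step['api']}")
            continue
        for name, value in step.get("params", {}).items():
            bounds = card["param_ranges"].get(name)
            if bounds is None:
                violations.append(f"{step['api']}: unknown parameter {name}")
            elif not (bounds[0] <= value <= bounds[1]):
                violations.append(
                    f"{step['api']}: {name}={value} outside [{bounds[0]}, {bounds[1]}]"
                )
    return violations


# Hypothetical card index and plan.
cards = {"edge_cost": {"param_ranges": {"congestion_penalty": (0.0, 10.0)}}}
plan = [
    {"api": "edge_cost", "params": {"congestion_penalty": 2.5}},
    {"api": "edge_cost", "params": {"congestion_penalty": 42.0}},
    {"api": "missing_routine", "params": {}},
]

violations = validate_plan(plan, cards)
for v in violations:
    print(v)
```

In a loop like the one described, a non-empty violation list would be handed back to the synthesis step as feedback instead of letting the plan proceed.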
S2 – Plan Localization & Granular Execution Plan
- Maps the high‑level plan onto concrete edit surfaces by aligning objectives with neighborhoods in G.
- Generates an ordered list of diffs annotated with pre‑flight checks (build, unit, smoke tests), runtime probes (wirelength, WNS, TNS, via count, power), and post‑conditions tied to QoR deltas.
- Each step includes rollback conditions, enabling safe iteration.
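One way to picture a granular plan step is as a small record that bundles its checks, its target, and its rollback condition. The sketch below is an assumption about how such a record could look, not the paper's data model. The field names, the thresholds, and the restriction to lower-is-better metrics (wirelength, via count, power) are illustrative choices. Slack metrics such as WNS and TNS, where higher is better, would need the sign flipped.

```python
from dataclasses import dataclass, field


def pct_change(baseline, measured):
    """Signed percentage change; negative means the metric went down."""
    return (measured - baseline) / baseline * 100.0


@dataclass
class PlanStep:
    name: str
    preflight: tuple              # checks to run before measuring QoR
    target_metric: str            # lower-is-better metric this step should improve
    min_improvement_pct: float    # post-condition on the target metric
    max_regression_pct: dict = field(default_factory=dict)  # metric -> tolerated worsening

    def should_rollback(self, baseline, measured):
        """True if the post-condition fails or a guarded metric regresses too far."""
        improvement = -pct_change(baseline[self.target_metric],
                                  measured[self.target_metric])
        if improvement < self.min_improvement_pct:
            return True
        for metric, tolerated in self.max_regression_pct.items():
            if pct_change(baseline[metric], measured[metric]) > tolerated:
                return True
        return False


# Hypothetical step and measurements.
step = PlanStep(
    name="tune-congestion-penalty",
    preflight=("build", "unit", "smoke"),
    target_metric="wirelength",
    min_improvement_pct=1.0,
    max_regression_pct={"via_count": 2.0},
)
baseline = {"wirelength": 1000.0, "via_count": 200.0, "power": 50.0}
good_run = {"wirelength": 950.0, "via_count": 202.0, "power": 50.0}
bad_run = {"wirelength": 950.0, "via_count": 210.0, "power": 50.0}

keep_good = not step.should_rollback(baseline, good_run)
keep_bad = not step.should_rollback(baseline, bad_run)
print(keep_good, keep_bad)
```

Keeping the rollback rule inside the step makes each diff self-describing: the executor does not need global knowledge to decide whether a change stays.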
S3 – Autonomous Execution with QoR Feedback
- A Codex‑class coding agent acts as a planner‑executor: it applies diffs, compiles the modified code, and runs instrumented OpenROAD flows on designated benchmarks.
- The agent monitors self‑measured signals (routed WL, effective clock period, power) and employs guardrails such as revert‑on‑regression and bisect‑on‑failure.
- Failures (compile errors, runtime crashes, QoR regressions) are transformed into counterexamples that feed back into the granular plan for repair, forming a self‑correcting loop.
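The revert-on-regression guardrail and the collection of counterexamples can be sketched as a simple accept-or-revert loop. This is a toy model under stated assumptions: `evaluate` stands in for building the code and running an instrumented flow, the score is a single lower-is-better number, and the patch names and deltas are invented. Bisect-on-failure and the repair step that consumes the counterexamples are not modeled.

```python
def run_steps(steps, evaluate, baseline_score):
    """Apply steps one at a time, keeping only those that improve the score.

    evaluate(applied_steps) returns a lower-is-better score, or raises
    RuntimeError to model a compile error or runtime crash.
    """
    accepted = []
    counterexamples = []
    best = baseline_score
    for step in steps:
        candidate = accepted + [step]
        try:
            score = evaluate(candidate)
        except RuntimeError as err:
            # Failure: discard the step and record why.
            counterexamples.append((step, f"failure: {err}"))
            continue
        if score < best:
            accepted, best = candidate, score
        else:
            # Regression (or no gain): revert by not adopting the candidate.
            counterexamples.append((step, f"regression: {score} >= {best}"))
    return accepted, best, counterexamples


# Hypothetical patches with made-up effects on a wirelength-like score.
DELTAS = {"a": -30, "b": 10, "c": 0, "d": -20}


def toy_evaluate(applied):
    if "c" in applied:
        raise RuntimeError("patch c does not compile")
    return 1000 + sum(DELTAS[p] for p in applied)


accepted, best, counterexamples = run_steps(["a", "b", "c", "d"], toy_evaluate, 1000)
print(accepted, best)
print(counterexamples)
```

The counterexample list is the part that would feed back into the granular plan, so that a failed step can be repaired instead of silently dropped.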
Experimental Evaluation
The authors evaluate AuDoPEDA on a set of publicly available benchmarks (e.g., ISPD, OpenCores) using the standard OpenROAD flow as a baseline. Results show:
- Routed wirelength reductions up to 5.9 %.
- Effective clock period (ECP) improvements up to 10.0 %.
- Total power consumption reductions up to 19.4 %.
All generated patches were submitted as pull requests to the main OpenROAD repository, passed automated tests and QoR verification, and were merged, demonstrating that the system can produce production‑ready code changes with minimal human oversight.
Key Contributions
- A graph‑structured documentation pipeline that automatically extracts and validates cross‑language API contracts from a large EDA codebase.
- A DSPy‑based, literature‑grounded planner that fuses repository knowledge with domain research to synthesize executable, testable research directions.
- An autonomous executor that closes the loop between code edits and physical‑design metrics, extending verification beyond unit tests to DRC, timing, and power.
Limitations & Future Work
- The current implementation is confined to the OpenROAD stack; integration with commercial tools (Synopsys, Cadence) remains unexplored.
- High‑level planning still relies on human‑provided QoR objectives and manually curated literature tags, limiting full autonomy.
- Benchmarks are limited in diversity; broader industrial designs are needed to assess generalization.
Future research directions include extending the framework to multi‑tool environments, automating objective formulation (e.g., cost‑performance trade‑offs), and employing meta‑learning so that the LLM improves its planning quality over time.
Overall, AuDoPEDA demonstrates that LLM‑driven autonomous agents can move beyond code suggestion to actual algorithmic improvement in a complex, safety‑critical domain, opening a promising path for AI‑augmented EDA development.