DUET: Agentic Design Understanding via Experimentation and Testing

Reading time: 5 minute
...

📝 Original Info

  • Title: DUET: Agentic Design Understanding via Experimentation and Testing
  • ArXiv ID: 2512.06247
  • Date: 2025-12-06
  • Authors: ** - Gus Henry Smith¹* - Sandesh Adhikary² - Vineet Thumuluri² - Karthik Suresh² - Vivek Pandit² - Kartik Hegde² - Hamid Shojaei² - Chandra Bhagavatula² ¹ Southmountain Research, ² ChipStack (현재 Cadence Design Systems) **

📝 Abstract

AI agents powered by large language models (LLMs) are being used to solve increasingly complex software engineering challenges, but struggle with hardware design tasks. Register Transfer Level (RTL) code presents a unique challenge for LLMs, as it encodes complex, dynamic, time-evolving behaviors using the low-level language features of SystemVerilog. LLMs struggle to infer these complex behaviors from the syntax of RTL alone, which limits their ability to complete all downstream tasks like code completion, documentation, or verification. In response to this issue, we present DUET: a general methodology for developing Design Understanding via Experimentation and Testing. DUET mimics how hardware design experts develop an understanding of complex designs: not just via a one-off readthrough of the RTL, but via iterative experimentation using a number of tools. DUET iteratively generates hypotheses, tests them with EDA tools (e.g., simulation, waveform inspection, and formal verification), and integrates the results to build a bottom-up understanding of the design. In our evaluations, we show that DUET improves AI agent performance on formal verification, when compared to a baseline flow without experimentation.

💡 Deep Analysis

Figure 1

📄 Full Content

DUET: Agentic Design Understanding via Experimentation and Testing Gus Henry Smith1*, Sandesh Adhikary2, Vineet Thumuluri2, Karthik Suresh2, Vivek Pandit2, Kartik Hegde2, Hamid Shojaei2, Chandra Bhagavatula2 1Southmountain Research, 2ChipStack** gus@southmountain.ai, {sandesha,vineett,sureshk,vivekp, kartikv,hamids,bchandra}@cadence.com Abstract–AI agents powered by large language models (LLMs) are being used to solve increasingly complex software engineering challenges, but struggle with hardware design tasks. Register Transfer Level (RTL) code presents a unique challenge for LLMs, as it encodes complex, dynamic, time-evolving behaviors using the low-level language features of SystemVerilog. LLMs struggle to infer these complex behaviors from the syntax of RTL alone, which limits their ability to complete all downstream tasks like code completion, documentation, or verification. In response to this issue, we present DUET: a general methodology for developing Design Understanding via Experimentation and Testing. DUET mimics how hardware design experts develop an understanding of complex designs: not just via a one-off readthrough of the RTL, but via iterative experimentation using a number of tools. DUET iteratively generates hypotheses, tests them with EDA tools (e.g., simulation, waveform inspection, and formal verification), and integrates the results to build a bottom-up understanding of the design. In our evaluations, we show that DUET improves AI agent performance on formal verification, when compared to a baseline flow without experimentation. I. INTRODUCTION AI agents are quickly taking over many tasks in software engineering and beyond. Powered by large language models (LLMs), AI agents can be deployed as powerful autonomous software workflows. An AI agent takes a text prompt as input, and is generally also equipped with tools it can call (e.g., Python functions or command-line utilities to read and write files, execute computations, or run installed tools). The agent then iteratively queries the LLM (starting with the prompt) to get the next action. When the LLM requests an action, the function or command-line utility is called and the results are sent back to the LLM. Eventually, the LLM decides to stop (or is stopped externally), and sends a final result. Throughout the process, the agent may have also created a number of artifacts, such as files in the filesystem. Using this general structure, AI agents have been able to replicate many human tasks. While AI agents have shown human-level performance in software engineering tasks, they continue to struggle with similar tasks in hardware design. Among many reasons, we hypothesize that this is because the underlying LLMs powering AI-assisted tools struggle with Register Transfer Level (RTL) code. RTL is inherently complex, often capturing dynamic, time-evolving behaviors using the low-level, heavily implicit syntax of languages like SystemVerilog. This is in stark contrast to software languages like Python or Java which contain more structure in the syntax of the code itself; for example, sequential lines in imperative programming languages generally correspond to instructions which will execute over time, and function names can be used to understand when control flow jumps across files in the codebase. On the other hand, RTL does not have such structure. For example, sequential states in a finite state machine (FSM) might be separated by tens or hundreds of lines in a case statement, and the control flow between these states may be much less explicit. Thus, when LLMs are given only the RTL, they struggle to understand the deeper behaviors of hardware designs and build design understanding. As a result, AI agents perform worse on downstream hardware tasks like verification or debugging. However, hardware designers themselves do not build their understanding of a design simply by reading the RTL. Instead, the process is more dynamic; designers will use tools like simulation, waveform viewers, and formal tools to understand the design. Even if the designer builds their understanding of the design purely from documentation, that documentation contains descriptions of the dynamic behavior of the design such as timing diagrams and waveforms. We present DUET: a general methodology for developing Design Understanding via Experimentation and Testing. DUET that mimics how hardware design experts develop an understanding of complex designs: not just through *work done while at ChipStack; **work done before ChipStack acquisition by Cadence Design Systems arXiv:2512.06247v2 [cs.SE] 22 Jan 2026 reading the RTL, but via iterative experimentation. DUET enables AI agents to generate hypotheses about a design, and equips it with tools to test and confirm or refine these hypotheses. Let’s consider a simple example. Imagine we task an AI agent with describing a design with a reasonably complex finite state machine—for example, an implementation of I2C. Specifically

📸 Image Gallery

DUET.drawio.png

Reference

This content is AI-processed based on open access ArXiv data.

Start searching

Enter keywords to search articles

↑↓
ESC
⌘K Shortcut