Understanding the Nature of Depth-1 Equivariant Quantum Circuit

Reading time: 6 minutes

📝 Abstract

The Equivariant Quantum Circuit (EQC) for the Travelling Salesman Problem (TSP) has been shown to achieve near-optimal performance on small TSP instances (up to 20 nodes) using only two parameters at depth 1. However, extending EQCs to larger TSP instances remains challenging due to the exponential time and memory required for quantum circuit simulation, as well as increasing noise and decoherence when running on actual quantum hardware. In this work, we propose the Size-Invariant Grid Search (SIGS), an efficient training optimization for Quantum Reinforcement Learning (QRL), and use it to simulate the outputs of a trained Depth-1 EQC on TSP instances of up to 350 nodes, well beyond previously tractable limits. On TSP with 100 nodes, we reduce total simulation time by 96.4% compared to RL simulations with the analytical expression (151 minutes using RL versus under 6 minutes using SIGS on TSP-100), while achieving a mean optimality gap within 0.005 of the RL-trained model on the test set. SIGS provides a practical benchmarking tool for the QRL community, allowing efficient analysis of QRL algorithms on larger problem sizes. We provide a theoretical explanation for SIGS, called the Size-Invariant Properties, that goes beyond the concept of equivariance discussed in prior literature.

📄 Content

Quantum reinforcement learning (QRL) studies parameterized quantum circuits (PQCs) as function approximators within deep reinforcement learning (DRL) pipelines. A common instantiation replaces the neural network in a Deep Q-Network (DQN) with a PQC trained end to end by a classical optimizer, yielding a hybrid agent. Early demonstrations show that PQC-based DQNs can learn near-optimal policies on small benchmarks (e.g., FrozenLake, Cognitive Radio), establishing feasibility and hinting at alternative inductive biases relative to classical value approximators [1].

This paper is concerned with solving combinatorial optimization problems with QRL. In the classical domain, Neural Combinatorial Optimization (NCO) develops DRL/attention architectures that construct high-quality solutions to discrete problems, achieving near-optimal tours for symmetric TSP up to 100 nodes [2]; more recent diffusion/backbone advances further improve the quality-speed trade-off [3].

Jonathan TEO 1: jrteo.2022@smu.edu.sg
Xin Wei LEE 1: xwlee@smu.edu.sg
Hoong Chuin LAU 1: hclau@smu.edu.sg (Corresponding Author)

For QRL to match or even surpass NCO counterparts, two gaps remain. Interpretability: PQC architectures with far fewer parameters than classical networks can perform well, yet the role of each parameter is often opaque for concrete combinatorial tasks. Scalability: prevalent encodings map one decision variable to one qubit and rely on highly entangling layers; both simulation and hardware execution grow costly with problem size, even with efficient tensor-network simulators. These gaps motivate the developments in this paper.

1.1 Overview of the Literature

QRL for Combinatorial Optimization. QRL replaces the value function approximator in Deep Reinforcement Learning (DRL) with a Parameterized Quantum Circuit (PQC). Skolik et al. (2023) marked the first systematic application of QRL to combinatorial optimization via the Equivariant Quantum Circuit (EQC) for TSP [4]. Using a Symmetry Preserving Ansatz (SPA), they demonstrated that respecting graph isomorphisms can drastically reduce the number of trainable parameters of the circuit, down to two values, while producing near-optimal tours for instances of up to 20 nodes. This also serves as the starting point of our work. More recently, Kruse et al. (2024) generalized this framework to Quadratic Unconstrained Binary Optimization (QUBO) formulations with an sge-sgv ansatz that is closely related to the EQC (also having two parameters per layer), demonstrating the ability to learn near-optimal policies on other combinatorial optimization problems such as Weighted-MaxCut, Knapsack, and Unit Commitment [5].

Ansatz design and trainability. An ansatz specifies the structure of a PQC used to prepare a variational state or wavefunction [6]. The structure of the ansatz can be chosen with or without prior knowledge of the problem. In QRL, the most popular ansatz is the Hardware Efficient Ansatz (HEA) [1,5,7] due to its device-friendly layout. However, owing to its high expressivity and problem independence, such ansatzes can trigger Barren Plateaus (BPs), making gradients vanish as problem sizes increase [8,9]. This has fueled a shift toward problem-tailored, symmetry-aware ansatzes for QRL in order to improve trainability at shallow depths.
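To illustrate the grid-search idea behind a two-parameter, size-invariant policy: since a depth-1 EQC has only two trainable values (γ, β), one can sweep a small 2D grid and evaluate each pair on a batch of instances instead of training by RL. The sketch below is a minimal, hypothetical stand-in; the policy is a simple two-parameter greedy tour constructor, not the actual EQC, and the function names are illustrative only.

```python
import itertools
import math
import random

def tour_length(coords, tour):
    """Total length of a closed tour over 2D city coordinates."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def two_param_policy_tour(coords, gamma, beta):
    """Hypothetical two-parameter construction policy (a stand-in for
    the depth-1 EQC): gamma weights the step distance, beta weights the
    distance back to the starting city."""
    n = len(coords)
    tour, remaining = [0], set(range(1, n))
    while remaining:
        cur = tour[-1]
        # Higher score = more attractive next city under (gamma, beta).
        scores = {
            j: -gamma * math.dist(coords[cur], coords[j])
               - beta * math.dist(coords[j], coords[0])
            for j in remaining
        }
        nxt = max(scores, key=scores.get)
        tour.append(nxt)
        remaining.remove(nxt)
    return tour

def grid_search(instances, grid):
    """Evaluate every (gamma, beta) pair on a batch of instances and
    return (mean tour length, gamma, beta) for the best pair. The grid
    itself does not depend on instance size, so the same sweep applies
    to any number of nodes."""
    best = None
    for gamma, beta in grid:
        mean_len = sum(
            tour_length(c, two_param_policy_tour(c, gamma, beta))
            for c in instances
        ) / len(instances)
        if best is None or mean_len < best[0]:
            best = (mean_len, gamma, beta)
    return best

random.seed(0)
instances = [[(random.random(), random.random()) for _ in range(20)]
             for _ in range(5)]
grid = list(itertools.product([0.5, 1.0, 2.0], [0.0, 0.5, 1.0]))
mean_len, gamma, beta = grid_search(instances, grid)
print(f"best (gamma, beta) = ({gamma}, {beta}), mean length = {mean_len:.3f}")
```

The key point mirrored here is cost: evaluating a fixed grid of parameter pairs scales with the number of grid points and instances, not with an RL training loop over episodes.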

We can interpret the work of Skolik et al. (2023) in three ways. Firstly, from a Geometric Quantum Machine Learning (GQML) perspective, Schatzki et al. (2024) formalize SPAs that are equivariant under the symmetry group S_n, and prove that S_n-equivariant ansatzes do not suffer from barren plateaus, quickly reach over-parameterization, and generalize well from small amounts of data [10,11]. Both the EQC and the sge-sgv ansatzes are examples of S_n-equivariant Quantum Neural Networks (QNNs). Secondly, Skolik et al. (2023) further empirically demonstrate that SPAs outperform their non-symmetry-preserving counterparts through ablation simulations that gradually break the equivariance of the EQC [4]. An alternate lens for interpreting the EQC is via the Quantum Approximate Optimization Algorithm (QAOA): the structure of the EQC mirrors that of the QAOA ansatz, with the parameters γ and β playing roles analogous to the cost and mixer Hamiltonians [12]. QAOA is well known for solving combinatorial optimization problems, prominently the MaxCut problem, with proven bounds [13,14] and empirical benchmarks [15,16]. Lastly, the idea of ensuring that graph embeddings are invariant under node permutations closely follows the design principles of classical NCO algorithms that respect graph structure through message passing and attention [17-19].
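To make the QAOA analogy concrete, a depth-1 QAOA-style circuit prepares a state of the standard form (the EQC's specific cost and mixer operators are given in [4,12]; this is the generic template, not the exact EQC unitary):

```latex
|\psi(\gamma, \beta)\rangle \,=\, e^{-i \beta H_M} \, e^{-i \gamma H_C} \, |+\rangle^{\otimes n},
```

where $H_C$ is a cost Hamiltonian encoding the problem (e.g., edge weights) and $H_M$ is a mixer Hamiltonian; γ and β are the two trainable parameters, matching the two-parameter depth-1 structure discussed above.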

Our work makes the following contributions:

• A size-invariant understanding of parameters: We provide a new explanation for the roles of the trainable parameters (γ and β) at Depth 1 of the EQC, extending beyond the concept of equivariance in [4]. By analy
