📝 Original Info
- Title: QoS-Aware Dynamic CU Selection in O-RAN with Graph-Based Reinforcement Learning
- ArXiv ID: 2512.19696
- Date: 2025-11-21
- Authors: Sebastian Racedo, Brigitte Jaumard, Oscar Delgado, Meysam Masoudi
📝 Abstract
Open Radio Access Network (O-RAN) disaggregates conventional RAN into interoperable components, enabling flexible resource allocation, energy savings, and agile architectural design. In legacy deployments, the binding between logical functions and physical locations is static, which leads to inefficiencies under time-varying traffic and resource conditions. We address this limitation by relaxing the fixed mapping and performing dynamic service function chain (SFC) provisioning with on-the-fly O-CU selection. We formulate the problem as a Markov decision process and solve it with GRL-DyP, a graph neural network (GNN)–assisted deep reinforcement learning (DRL) framework. The proposed agent jointly selects routes and the O-CU location (from candidate sites) for each incoming service flow to minimize network energy consumption while satisfying quality-of-service (QoS) constraints. The GNN encodes the instantaneous network topology and resource utilization (e.g., CPU and bandwidth), and the DRL policy learns to balance grade of service, latency, and energy. We evaluate GRL-DyP on a dataset with 24-hour traffic traces from the city of Montreal, showing that dynamic O-CU selection and routing significantly reduce energy consumption compared to a static-mapping baseline, without violating QoS. The results highlight DRL-based SFC provisioning as a practical control primitive for energy-aware, resource-adaptive O-RAN deployments.
📄 Full Content
QoS-Aware Dynamic CU Selection in O-RAN with
Graph-Based Reinforcement Learning
Sebastian Racedo and Brigitte Jaumard
Computer Science and Software Engineering
Concordia University
Montreal (Qc) Canada
brigitte.jaumard@concordia.ca
Oscar Delgado
Systems Engineering
École de Technologie Supérieure (ÉTS)
Montreal (Qc) Canada
Meysam Masoudi
Ericsson
Kista, Sweden
Abstract—Open Radio Access Network (O-RAN) disaggregates conventional RAN into interoperable components, enabling flexible resource allocation, energy savings, and agile architectural design. In legacy deployments, the binding between logical functions and physical locations is static, which leads to inefficiencies under time-varying traffic and resource conditions. We address this limitation by relaxing the fixed mapping and performing dynamic service function chain (SFC) provisioning with on-the-fly O-CU selection. We formulate the problem as a Markov decision process and solve it using GRL-DyP, i.e., a graph neural network (GNN)–assisted deep reinforcement learning (DRL). The proposed agent jointly selects routes and the O-CU location (from candidate sites) for each incoming service flow to minimize network energy consumption while satisfying quality-of-service (QoS) constraints. The GNN encodes the instantaneous network topology and resource utilization (e.g., CPU and bandwidth), and the DRL policy learns to balance grade of service, latency, and energy. We perform the evaluation of GRL-DyP on a data set with 24-hour traffic traces from the city of Montreal, showing that dynamic O-CU selection and routing significantly reduce energy consumption compared to a static mapping baseline, without violating QoS. The results highlight DRL-based SFC provisioning as a practical control primitive for energy-aware, resource-adaptive O-RAN deployments.
Index Terms—O-RAN, Deep Reinforcement Learning, Graph Neural Networks, SFC Provisioning, Energy Efficiency.
I. INTRODUCTION
The transition to 5G and the trajectory toward 6G are accelerating adoption of the O-RAN architecture, shifting networks from proprietary, monolithic stacks to disaggregated, virtualized, and intelligent networks [1]. By decoupling the Radio Unit (O-RU), Distributed Unit (O-DU), and Centralized Unit (O-CU), O-RAN enables flexible placement and scaling of functions while fostering a multi-vendor ecosystem. These capabilities are underpinned by Software-Defined Networking (SDN) and Network Function Virtualization (NFV), which also support resource partitioning via network slicing [2]. As the O-RAN Alliance advances specifications and deployment profiles, the resulting design space offers greater agility but also introduces substantial orchestration complexity across heterogeneous hardware, fronthaul constraints, and time-varying traffic [1].

This work was supported by NSERC (under project ALLRP 566589-21) and InnovÉÉ (INNOV-R program) through the partnership with Ericsson. We are grateful to Adel Larabi at GAIA, Ericsson Montréal for clarifying some concepts of the current 5G technology.
However, the same flexibility complicates resource allocation and control. Service function chains (SFCs) must be placed, scaled, and steered across heterogeneous compute and transport resources while meeting slice-specific QoS targets, ranging from high-throughput enhanced Mobile Broadband (eMBB) to Ultra-Reliable Low Latency Communication (URLLC) [3]. In practice, many deployments still rely on static deployments and enforce rigid 1:1 bindings among O-RAN components. Such configurations are often derived from offline capacity planning for peak demand, leading to underutilized hardware and significant energy waste during non-peak hours.

The principles of NFV enable resource and routing decisions to be managed by a centralized controller that maintains a global view of the network's state. This opens the door for more intelligent orchestration methods [3]. Although traditional optimization techniques, such as integer linear programs (ILPs), can find optimal solutions, they often struggle to cope with the scale and dynamism of real-world networks [2]. This complex, dynamic trade-off space is an ideal application for Deep Reinforcement Learning (DRL). In contrast, a DRL agent can learn complex, non-obvious strategies directly from data patterns, managing the multi-objective problem of maximizing service success while minimizing both latency and energy consumption. Beyond DRL itself, the natural graph structure of network-related problems makes graph-based data representations beneficial; to work directly with such data, we utilize Graph Neural Networks (GNNs) [4].
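The multi-objective balance described above (service success, latency, energy) is typically collapsed into a scalar per-flow reward. The paper's exact reward is not given in this excerpt, so the function, weights, and names below are purely illustrative assumptions:

```python
# Hypothetical per-flow reward sketch: reward admitting a flow within its
# latency budget, penalize energy use and latency, and heavily penalize
# blocked flows or QoS violations. All weights are illustrative.

def reward(admitted: bool, latency_ms: float, latency_budget_ms: float,
           energy_kwh: float, w_energy: float = 1.0, w_latency: float = 0.1,
           reject_penalty: float = 10.0) -> float:
    """Combine grade of service, latency, and energy into one scalar."""
    if not admitted or latency_ms > latency_budget_ms:
        return -reject_penalty  # blocked flow or QoS violation
    # Feasible placement: cost grows with energy and latency.
    return -(w_energy * energy_kwh + w_latency * latency_ms)

# A feasible, low-energy placement scores higher (less negative) than a
# feasible but energy-hungry one for the same flow.
good = reward(True, 5.0, 10.0, energy_kwh=0.2)
bad = reward(True, 5.0, 10.0, energy_kwh=1.0)
```

Under such a shaping, an agent maximizing expected return is pushed toward low-energy routes and O-CU sites, but never at the cost of admitting a flow outside its QoS budget.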
In this paper, we propose an RL framework that leverages a GNN [4] to learn a joint routing and O-CU selection policy. Our agent's architecture is explicitly based on Graph Convolutional Networks (GCNs) [5], enabling it to learn effectively from the underlying network topology and real-time state.
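To make the GCN building block concrete, here is a minimal NumPy sketch of one graph-convolutional propagation step in the style of Kipf and Welling's GCN [5]. The topology, per-node features (e.g., CPU and bandwidth utilization), and layer width are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def gcn_layer(A: np.ndarray, H: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One GCN step: ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)                    # node degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Toy 3-node line topology; 2 features per node (e.g., CPU %, bandwidth %).
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.array([[0.2, 0.5],
              [0.9, 0.1],
              [0.4, 0.4]])
rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4))              # learned weights (random here)
Z = gcn_layer(A, H, W)                       # shape (3, 4): one embedding per node
```

Stacking a few such layers lets each node's embedding summarize resource state several hops away, which is what allows the policy head to score candidate routes and O-CU sites from the current network state.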
…(Full text truncated)…
This content is AI-processed based on ArXiv data.