Near-Real-Time Resource Slicing for QoS Optimization in 5G O-RAN using Deep Reinforcement Learning

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv source.

Open Radio Access Network (O-RAN) has become an important paradigm for 5G and beyond radio access networks. This paper presents an xApp called xSlice for the Near-Real-Time (Near-RT) RAN Intelligent Controller (RIC) of 5G O-RANs. xSlice is an online learning algorithm that adaptively adjusts MAC-layer resource allocation in response to dynamic network states, including time-varying wireless channel conditions, user mobility, traffic fluctuations, and changes in user demand. To address these network dynamics, we first formulate the Quality-of-Service (QoS) optimization problem as a regret minimization problem by quantifying the QoS demands of all traffic sessions through weighting their throughput, latency, and reliability. We then develop a deep reinforcement learning (DRL) framework that utilizes an actor-critic model to combine the advantages of both value-based and policy-based updating methods. A graph convolutional network (GCN) is incorporated as a component of the DRL framework for graph embedding of RAN data, enabling xSlice to handle a dynamic number of traffic sessions. We have implemented xSlice on an O-RAN testbed with 10 smartphones and conducted extensive experiments to evaluate its performance in realistic scenarios. Experimental results show that xSlice can reduce performance regret by 67% compared to state-of-the-art solutions. Source code is available at https://github.com/xslice-5G/code.


💡 Research Summary

The paper introduces “xSlice,” a novel xApp designed for the Near‑Real‑Time (Near‑RT) RAN Intelligent Controller (RIC) in a 5G Open Radio Access Network (O‑RAN). The authors address the challenge of dynamically allocating MAC‑layer resources to satisfy heterogeneous Quality‑of‑Service (QoS) requirements—throughput, latency, and reliability—under constantly changing wireless channel conditions, user mobility, and traffic fluctuations.

First, they formulate the QoS optimization problem as a regret‑minimization objective. Each traffic session is assigned weights for the three QoS dimensions, allowing operators to prioritize services (e.g., eMBB, URLLC, mMTC) via a simple weighting scheme. Minimizing regret directly corresponds to meeting the weighted QoS targets as closely as possible.
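The summary does not spell out the exact regret formula, but the idea of weighting per-session shortfalls against QoS targets can be sketched as follows. The dictionary keys, the normalization by target values, and the shortfall definitions are illustrative assumptions, not the paper's exact objective:

```python
def session_regret(kpm, target, weights):
    """Per-session regret: weighted, normalized shortfall of achieved QoS
    versus target (one plausible instantiation of the paper's objective).

    kpm, target: dicts with 'throughput' (Mbps), 'latency' (ms),
    'reliability' (delivery ratio). Shortfalls are normalized by the
    target so the three dimensions are comparable.
    """
    # Throughput and reliability regret grow when achieved < target;
    # latency regret grows when achieved > target.
    r_tput = max(0.0, (target['throughput'] - kpm['throughput']) / target['throughput'])
    r_lat = max(0.0, (kpm['latency'] - target['latency']) / target['latency'])
    r_rel = max(0.0, (target['reliability'] - kpm['reliability']) / target['reliability'])
    return (weights['throughput'] * r_tput
            + weights['latency'] * r_lat
            + weights['reliability'] * r_rel)

def total_regret(sessions):
    """Sum of per-session regrets; the quantity xSlice tries to minimize."""
    return sum(session_regret(s['kpm'], s['target'], s['weights'])
               for s in sessions)
```

Under this scheme a URLLC session would carry a large latency weight while an eMBB session would weight throughput heavily, which is how the weighting lets operators prioritize service classes.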

To solve this online, the authors develop a deep reinforcement learning (DRL) framework based on an actor‑critic architecture. The policy (actor) is trained with Proximal Policy Optimization (PPO), which stabilizes updates through clipping, while the value (critic) network uses temporal‑difference learning to quickly estimate expected returns. This combination leverages the fast policy adaptation of policy‑gradient methods and the sample efficiency of value‑based learning.
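The two update rules described above can be sketched in a framework-free form: PPO's clipped surrogate for the actor, and a one-step temporal-difference error for the critic. This is a minimal sketch assuming one-step TD advantages; the paper's actual loss (e.g., GAE, entropy bonus, batch handling) may differ:

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate loss (to be minimized).

    logp_new/logp_old: log-probabilities of the taken actions under the
    current and behavior policies; advantages: critic-based estimates.
    Clipping the probability ratio to [1-eps, 1+eps] prevents
    destructively large policy updates.
    """
    losses = []
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the pessimistic (smaller) of the two surrogates.
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)

def td_error(reward, v_s, v_next, gamma=0.99):
    """One-step temporal-difference error: trains the critic and serves
    as a simple advantage estimate for the actor."""
    return reward + gamma * v_next - v_s
```

The clipping term is why the ratio `exp(logp_new - logp_old)` cannot push a single update arbitrarily far from the behavior policy, which is the stabilization property the summary refers to.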

A key technical contribution is the integration of a Graph Convolutional Network (GCN) as an “adapter” that converts a variable‑size set of MAC‑layer Key Performance Metrics (KPMs) into a fixed‑dimensional embedding. Each active traffic session becomes a node in a graph; edges encode logical relationships such as sharing the same cell, frequency band, or time slot. The GCN aggregates neighboring node features, producing compact node embeddings that capture both per‑session statistics and inter‑session interactions. These embeddings serve as the state input to the DRL agent, enabling the system to handle any number of sessions without redesigning the neural network architecture.
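The scalability argument hinges on weight sharing: because one linear map is applied to every node, the same layer works for any number of sessions. A minimal dependency-free sketch of one such mean-aggregation layer follows; the feature layout, aggregation rule, and activation are assumptions, not the paper's exact GCN design:

```python
def gcn_layer(features, adj, weight):
    """One GCN-style layer: each node averages its own and its
    neighbors' feature vectors, then applies a shared linear map
    followed by ReLU. The shared weight matrix is what lets the same
    layer handle any number of sessions.

    features: list of N per-session feature vectors (each of length F)
    adj: list of neighbor-index lists (edges between sessions that share
         a cell, band, or slot, per the paper's graph construction)
    weight: F x H matrix as a list of rows
    """
    out = []
    for i, feat in enumerate(features):
        neigh = [features[j] for j in adj[i]] + [feat]  # include self-loop
        # Mean-aggregate each feature dimension over the neighborhood.
        agg = [sum(col) / len(neigh) for col in zip(*neigh)]
        # Shared linear map + ReLU, producing a fixed H-dim embedding.
        h = [max(0.0, sum(a * w for a, w in zip(agg, col)))
             for col in zip(*weight)]
        out.append(h)
    return out
```

Each output row is a fixed-dimensional node embedding regardless of how many sessions are active, which is exactly the property that lets the DRL agent consume a variable-size KPM set.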

Implementation is carried out on an OpenAirInterface‑based O‑RAN testbed. A gNB connects to the Near‑RT RIC via the standard E2 interface, and ten commercial smartphones act as user equipment (UE) generating realistic traffic mixes. xSlice operates with a decision interval ranging from 10 ms to 1 s, continuously ingesting MAC‑layer KPMs (throughput, latency, block error rate, etc.), passing them through the GCN, and producing slice‑allocation actions (e.g., PRB distribution among sessions).
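The last step of that loop, turning the agent's per-session action outputs into a physical resource block (PRB) split, could look like the following. The softmax mapping and function name are illustrative assumptions; the paper's exact action encoding is not given in this summary:

```python
import math

def prbs_from_action(logits, total_prbs):
    """Map per-session action logits to an integer PRB split via a
    softmax, so every active session gets a share of the cell's PRBs.
    (Illustrative mapping; the paper's action space may differ.)
    """
    # Numerically stable softmax over the action logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    shares = [e / total for e in exps]
    prbs = [int(s * total_prbs) for s in shares]
    # Give the rounding remainder to the largest allocation so the
    # split always sums to the cell's PRB budget.
    prbs[prbs.index(max(prbs))] += total_prbs - sum(prbs)
    return prbs
```

In the control loop, this function would run once per decision interval (10 ms to 1 s), after the GCN embedding and actor network have produced the logits for the currently active sessions.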

Experimental results demonstrate that xSlice consistently outperforms a broad set of baselines, including simulation‑only DRL approaches, multi‑armed bandit schedulers, and recent GNN‑enhanced methods. The primary metric, performance regret, is reduced by an average of 67% compared with state‑of‑the‑art solutions. Moreover, xSlice maintains low latency in decision making, adapts quickly to sudden channel degradations or user handovers, and exhibits stable convergence even when the number of active sessions changes dynamically.

The authors highlight three main contributions: (1) a practical, online DRL framework for resource slicing that operates on real‑time KPM data rather than synthetic traces; (2) a GCN‑based adapter that provides scalable, graph‑structured representations of dynamic traffic sessions; and (3) a full end‑to‑end implementation and over‑the‑air evaluation on a realistic 5G O‑RAN testbed.

Limitations include the relatively small scale of the experimental setup (10 UEs) and the simplistic graph construction that only captures immediate adjacency relationships. Future work is proposed to extend the graph model to multi‑cell cooperation, incorporate meta‑learning for faster policy warm‑starts, and develop attention‑based visualizations to improve interpretability of the learned policies.

In summary, xSlice demonstrates that combining regret‑driven objectives, actor‑critic DRL, and graph neural network embeddings can deliver near‑real‑time, QoS‑aware resource slicing in practical O‑RAN deployments, bridging the gap between theoretical AI‑driven RAN control and real‑world 5G network operation.

