Continual uncertainty learning

Reading time: 5 minutes

📝 Original Info

  • Title: Continual uncertainty learning
  • ArXiv ID: 2602.17174
  • Date: 2026-02-19
  • Authors: Author information was not provided for this paper. (Not available)

📝 Abstract

Robust control of mechanical systems with multiple uncertainties remains a fundamental challenge, particularly when nonlinear dynamics and operating-condition variations are intricately intertwined. While deep reinforcement learning (DRL) combined with domain randomization has shown promise in mitigating the sim-to-real gap, simultaneously handling all sources of uncertainty often leads to sub-optimal policies and poor learning efficiency. This study formulates a new curriculum-based continual learning framework for robust control problems involving nonlinear dynamical systems in which multiple sources of uncertainty are simultaneously superimposed. The key idea is to decompose a complex control problem with multiple uncertainties into a sequence of continual learning tasks, in which strategies for handling each uncertainty are acquired sequentially. The original system is extended into a finite set of plants whose dynamic uncertainties are gradually expanded and diversified as learning progresses. The policy is stably updated, without catastrophic forgetting, across all plant sets associated with tasks defined by different uncertainty configurations. To ensure learning efficiency, we jointly incorporate a model-based controller (MBC), which guarantees a shared baseline performance across the plant sets, into the learning process to accelerate convergence. This residual learning scheme facilitates task-specific optimization of the DRL agent for each uncertainty, thereby enhancing sample efficiency. As a practical industrial application, this study applies the proposed method to designing an active vibration controller for automotive powertrains. We verified that the resulting controller is robust against structural nonlinearities and dynamic variations, realizing successful sim-to-real transfer.
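
As a rough illustration of the residual learning scheme mentioned in the abstract, the sketch below composes the control input from an MBC baseline plus a learned DRL correction. The names (`mbc_action`, `policy`, the gain values) are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mbc_action(state, mbc_gain):
    """Baseline model-based controller, e.g. a simple state-feedback law."""
    return -mbc_gain @ state

def residual_action(state, policy, mbc_gain):
    """Total input = MBC baseline + residual correction from the DRL policy."""
    u_base = mbc_action(state, mbc_gain)
    u_res = policy(state)  # the DRL agent only learns the correction term
    return u_base + u_res

# Example: a zero policy reduces to the pure MBC baseline.
gain = np.array([[2.0, 0.5]])
state = np.array([0.1, -0.3])
print(residual_action(state, lambda s: np.zeros(1), gain))
```

Because the MBC already provides acceptable baseline behavior on every plant in the set, the residual policy only has to learn the task-specific correction, which is the source of the sample-efficiency gain claimed above.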

💡 Deep Analysis

📄 Full Content

In modern industrial applications, including automotive powertrain systems [1][2] and robotic platforms [3][4], performance demands have become progressively more stringent, resulting in a marked increase in system complexity. Such mechanical systems commonly exhibit nonlinear behaviors [5], communication delays, and uncertainties induced by variations in system parameters [6]. As a consequence, control strategies must simultaneously address these multiple sources of uncertainty in an integrated manner to achieve reliable performance.

Model-based control has achieved numerous successes across a wide range of mechanical systems; however, it fundamentally relies on the assumption that accurate and complete models of real-world systems are available. In practice, this assumption is rarely satisfied, and the performance degradation caused by discrepancies between plant models and real systems is widely recognized as the robust control problem in control theory and as the sim-to-real gap in the machine learning community. For robotic and automotive systems characterized by intertwined parameter variations and strong nonlinearities, conventional robust control approaches, such as H∞ control [7], are increasingly reaching their practical limitations. Meanwhile, rapid advances in computational resources have driven remarkable progress in artificial intelligence, which has recently demonstrated strong potential as an alternative control paradigm through numerous industrial applications. In particular, deep reinforcement learning (DRL) [8][9], which emerged from the integration of deep neural networks (DNNs) and reinforcement learning (RL) [10], has attracted significant attention. A growing body of related work has shown that DRL can learn practically effective control policies for nonlinear, complex, large-scale, and high-dimensional plants, such as robotic systems [11], powertrain control [12], and complex vibration control problems [13][14][15], without relying on explicit system models. However, since DRL is based on trial-and-error interactions with the environment, learning directly on real-world systems is inherently dangerous [16]. Moreover, collecting the massive amounts of training data required through repeated real-world experiments is often impractical [17].

The success of DRL is largely attributed to its model-free nature and the generalization capability of DNNs. In recent years, increasing attention has been paid to training in simulation environments using domain randomization (DR) [16][18][19]. In simulation, where safety concerns are eliminated and virtually unlimited training data can be generated, DR intentionally injects random variations into the parameters of the simulation dynamics during training. Intuitively, by exposing an agent to a wide range of plant dynamics and encouraging it to learn policies that perform well across such variations, robustness on the real system can be enhanced. As a result, DR has been widely adopted as an effective sim-to-real transfer technique for complex systems in which modeling the state transition dynamics is challenging, such as robotic control [16], locomotion tasks [20][21], and humanoid robots [22][23].
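
To make the DR mechanism concrete, the minimal sketch below resamples a few plant parameters at the start of each training episode. The parameter names and ranges are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Illustrative randomization ranges (relative to the nominal plant).
PARAM_RANGES = {
    "stiffness":   (0.8, 1.2),
    "damping":     (0.5, 1.5),
    "delay_steps": (0, 3),     # communication delay in control steps
}

def sample_domain(rng):
    """Draw one randomized plant configuration for the next episode."""
    return {
        "stiffness":   rng.uniform(*PARAM_RANGES["stiffness"]),
        "damping":     rng.uniform(*PARAM_RANGES["damping"]),
        "delay_steps": int(rng.integers(PARAM_RANGES["delay_steps"][0],
                                        PARAM_RANGES["delay_steps"][1] + 1)),
    }

rng = np.random.default_rng(0)
for episode in range(3):
    params = sample_domain(rng)
    # env.reset(physics=params)  # then run one training episode on this plant
    print(episode, params)
```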

Nevertheless, when the training environment simultaneously involves multiple nonlinear characteristics and parameter variations, DR is known to produce overly conservative and sub-optimal policies [19]. This is because excessive randomization across many dynamic factors increases the uncertainty perceived by the agent and exacerbates task complexity, thereby making learning more difficult and time-consuming.

Several approaches have been proposed to address this challenge. Active domain randomization [24] aims to identify the most informative regions of the parameter space by exploiting discrepancies between policy rollouts in randomized and reference environments, thereby addressing the limitations of uniform parameter sampling. Automatic domain randomization [25] adaptively adjusts the range of parameter randomization. Specifically, a curriculum strategy is employed in which the randomization strength is gradually increased as long as the policy successfully learns under the current environment.
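
The curriculum idea behind automatic domain randomization can be sketched as follows: the width of each randomization interval is enlarged only while the current policy keeps succeeding. The threshold, growth factor, and success metric here are illustrative assumptions.

```python
def update_range(half_width, success_rate,
                 success_threshold=0.9, growth=1.2, max_half_width=0.5):
    """Widen the randomization interval only if the policy handles the current one."""
    if success_rate >= success_threshold:
        half_width = min(half_width * growth, max_half_width)
    return half_width

half_width = 0.05  # start close to the nominal plant
for epoch, success in enumerate([0.95, 0.93, 0.70, 0.92]):
    half_width = update_range(half_width, success)
    print(f"epoch {epoch}: randomize within ±{half_width:.3f} of nominal")
```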

However, when a plant simultaneously exhibits multiple and diverse sources of uncertainty, the aforementioned approaches alone may not be sufficient to achieve satisfactory performance.

In recent years, continual learning (CL) has attracted substantial attention in the machine learning community [26][27][28]. When neural networks are trained on new tasks, they overwrite previously acquired knowledge, a phenomenon commonly referred to as catastrophic forgetting. CL aims to alleviate this difficulty by enabling the accumulation of knowledge across a sequence of distinct tasks.

Existing CL approaches can be broadly categorized into several classes. Regularization-based methods [26][27] introduce additional constraints into the objective function to discourage parameters that are important for previously learned tasks from deviating from their earlier values, thereby mitigating forgetting.
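
A minimal sketch of such a regularization penalty, in the spirit of elastic-weight-consolidation-style methods, is shown below; the variable names and the way per-parameter importances are obtained are assumptions for illustration only.

```python
import torch

def cl_penalty(model, old_params, importances, strength=1.0):
    """Quadratic penalty: sum_i importance_i * (theta_i - theta_i_old)^2."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (importances[name] * (p - old_params[name]) ** 2).sum()
    return strength * penalty

# During training on a new task (sketch):
#   loss = task_loss + cl_penalty(model, old_params, importances)
#   loss.backward(); optimizer.step()
```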

This content is AI-processed based on open access ArXiv data.
