This paper provides statistical guarantees on the accuracy of dynamical models learned from dependent data sequences. Specifically, we develop uniform error bounds that apply to quantized models and imperfect optimization algorithms commonly used in practical contexts for system identification, in particular hybrid system identification. Two families of bounds are obtained: slow-rate bounds via a block decomposition and fast-rate, variance-adaptive bounds via a novel spaced-point strategy. The bounds scale with the number of bits required to encode the model and thus translate hardware constraints into interpretable statistical complexities.
This paper lies at the intersection of system identification (Ljung, 1999), which aims at learning models of dynamical systems, and learning theory (Vapnik, 1998), which provides statistical guarantees on the accuracy of models learned from data. More specifically, we concentrate on obtaining high-probability bounds on the accuracy of models learned from sequences of possibly dependent data, while taking practical considerations into account. In particular, the proposed analysis holds for imperfect algorithms (for instance, those that cannot guarantee to minimize the empirical risk) and for models implemented on computing devices with finite precision. Indeed, local optimization or heuristic methods are often used for learning in practice, for instance when training neural networks (Goodfellow et al., 2016) or estimating hybrid dynamical systems (Lauer and Bloch, 2019), and models are increasingly implemented in low precision due to limited hardware capabilities (as with microcontrollers) or latency and power consumption restrictions (Jacob et al., 2018).
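To make these two practical limitations concrete, the following minimal sketch (with hypothetical names and values, not the paper's actual setup) quantizes the parameters of a linear predictor to a fixed number of bits per coordinate and fits it with a crude random-search heuristic that does not guarantee empirical risk minimization. The size of the resulting quantized model class, 2^(d*bits) for d parameters, is precisely the kind of bit budget that the bounds below convert into a statistical complexity term.

```python
import numpy as np

def quantize(theta, bits=8, lo=-1.0, hi=1.0):
    """Round each parameter to one of 2**bits levels on [lo, hi]
    (a stand-in for fixed-point hardware arithmetic)."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1)
    return lo + step * np.round((np.clip(theta, lo, hi) - lo) / step)

def empirical_risk(theta, X, y):
    """Mean squared prediction error over the observed trajectory."""
    return np.mean((X @ theta - y) ** 2)

def heuristic_fit(X, y, bits=8, n_trials=200, seed=0):
    """Imperfect learner: random search over quantized parameters.
    It returns *some* quantized model, not the empirical risk minimizer."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    best, best_risk = quantize(np.zeros(d), bits), np.inf
    for _ in range(n_trials):
        cand = quantize(rng.uniform(-1.0, 1.0, size=d), bits)
        r = empirical_risk(cand, X, y)
        if r < best_risk:
            best, best_risk = cand, r
    return best, best_risk

# The quantized class contains 2**(d*bits) models, so d*bits is the
# "number of bits required to encode the model" referred to above.
```

Whatever model such an imperfect procedure returns, uniform bounds of the kind derived in this paper apply to it, since they hold simultaneously for every model in the quantized class.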
The main difficulty for providing nonasymptotic risk guarantees for system identification stems from the dependence between data points that are collected at subsequent time steps along a single trajectory of the modeled system. This issue has been addressed from two complementary perspectives in the literature.
• Mixing-based learning theory: A first family of approaches extends classical tools from learning theory to dependent data via mixing arguments (Yu, 1994; Meir, 2000; Weyer, 2000; Vidyasagar and Karandikar, 2004; Mohri and Rostamizadeh, 2009; Massucci et al., 2022). These analyses quantify the temporal dependence through coefficients (e.g., β- or θ-mixing ones) that capture the decay of correlations across time. Then, a decomposition technique due to the seminal work of Yu (1994) yields bounds that apply to a subsample of the data (see the illustration after this list). However, the loss in terms of effective sample size is compensated by the versatility of the approach, which can yield widely applicable and uniform error bounds, i.e., results that are algorithm-independent. Yet, these approaches typically rely on rather involved measures of the complexity of the model class that must be accurately analyzed before applying the bounds, such as Rademacher complexities for Mohri and Rostamizadeh (2009), growth functions for McDonald et al. (2011), weak-dependence metrics for Alquier and Wintenberger (2012), or an information-theoretic divergence for Eringis et al. (2024).
• Algorithm-specific finite-sample analyses without mixing: A second line of work provides sharp, problem-tailored guarantees for specific estimators and model classes, typically linear or well-structured ones, using self-normalized martingale tools and related techniques (Simchowitz et al., 2018; Faradonbeh et al., 2018; Jedra and Proutière, 2023), as surveyed in (Tsiamis et al., 2023). When a closed-form expression of the estimator is available, these results yield precise finite-sample rates. Their specialization to a given algorithm-model pair makes them complementary to the uniform, algorithm-agnostic perspective adopted below.
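To fix ideas on how such mixing-based arguments trade data for independence, a standard form of the blocking argument (paraphrased informally here, not quoted from the paper) reads as follows: if (Z_t) is a stationary β-mixing sequence with coefficients β(a) and one retains only the μ points Z_a, Z_{2a}, ..., Z_{μa} spaced a steps apart, then for any bounded function h of these points,

```latex
\left| \, \mathbb{E}\big[ h(Z_a, Z_{2a}, \dots, Z_{\mu a}) \big]
  \;-\; \widetilde{\mathbb{E}}\big[ h(\widetilde{Z}_a, \widetilde{Z}_{2a}, \dots, \widetilde{Z}_{\mu a}) \big] \right|
\;\le\; \|h\|_{\infty} \, (\mu - 1) \, \beta(a),
```

where the tilde variables are independent copies with the same marginal distributions. Any concentration bound valid for μ ≈ N/a independent points therefore carries over to the dependent trajectory up to an additive (μ-1)β(a) term in the failure probability, which is the loss in effective sample size mentioned in the first item above.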
Notably, other works also consider a mid-point between these two types of approaches: Ziemann and Tu (2022) derive strong guarantees for the specific case of the least-squares estimator under mixing conditions.
An important area of application for the proposed approach is hybrid system identification, as defined in Lauer and Bloch (2019), where the data-generating system switches between different subsystems in an unobserved and unknown manner. Besides the issue of dependence, this raises additional algorithmic difficulties that prevent the application of the algorithm-specific approaches mentioned above. Statistical guarantees for hybrid systems were derived in Chen and Poor (2022), but in a slightly different and simplified setting where the data is collected as multiple short and independent trajectories, each generated by a single subsystem, thus alleviating some algorithmic issues and reducing the dependence issue. Other works, like Sattar et al. (2021), propose error bounds for Markov jump systems, but under the simplifying assumption that the switching sequence is observed or known, in which case the problem becomes more closely related to the identification of multiple independent linear systems and algorithm-specific approaches can be more easily developed.
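To make the hybrid setting concrete, the following toy sketch (with hypothetical dynamics and parameter values, not taken from the paper) simulates a switched ARX system in which an unobserved Markovian mode selects, at each time step, which of two linear subsystems generates the next output. The learner observes the single dependent trajectory of inputs and outputs but never the mode sequence, which is what creates the combinatorial, algorithmic difficulty mentioned above.

```python
import numpy as np

def simulate_switched_arx(T=500, noise_std=0.1, seed=0):
    """Simulate a toy switched ARX system: at each step an unobserved
    mode s_t in {0, 1} selects which linear subsystem generates y_t.
    (Illustrative only; not the data-generating model of the paper.)"""
    rng = np.random.default_rng(seed)
    thetas = [np.array([0.8, 0.5]),   # subsystem 0: y_t = 0.8 y_{t-1} + 0.5 u_t + noise
              np.array([-0.6, 1.0])]  # subsystem 1: y_t = -0.6 y_{t-1} + u_t + noise
    y = np.zeros(T)
    u = rng.uniform(-1.0, 1.0, size=T)
    modes = np.zeros(T, dtype=int)
    for t in range(1, T):
        # Markovian switching: stay in the current mode with probability 0.95
        modes[t] = modes[t - 1] if rng.random() < 0.95 else 1 - modes[t - 1]
        x_t = np.array([y[t - 1], u[t]])  # regressor built from past output and input
        y[t] = thetas[modes[t]] @ x_t + noise_std * rng.normal()
    return y, u, modes  # the learner sees (y, u) only, never `modes`
```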
This paper focuses on deriving widely applicable guarantees that account for limitations often encountered in practice, following the line of work based on mixing arguments. The proposed results take the form of probabilistic error bounds that enjoy the following properties.
• Uniform over the model class. The bounds hold for any model within the predefined class, and thus remain independent of the identification procedure and insensitive to algorithmic or optimization errors.