An Approximate Bayesian Approach to Optimal Input Signal Design for System Identification
The design of informative input signals is essential for accurate system identification, yet classical Fisher-information-based methods are inherently local and often inadequate in the presence of significant model uncertainty and nonlinearity. This paper develops a Bayesian approach that uses the mutual information (MI) between observations and parameters as the utility function. To address the computational intractability of the MI, we maximize a tractable MI lower bound. The method is then applied to the design of input signals for the identification of quasi-linear stochastic dynamical systems. Evaluating the MI lower bound requires inversion of large covariance matrices whose dimensions scale with the number of data points $N$. To overcome this problem, an algorithm that reduces the dimension of the matrices to be inverted by a factor of $N$ is developed, making the approach feasible for long experiments. The proposed Bayesian method is compared with the average D-optimal design method, a semi-Bayesian approach, and its advantages are demonstrated. The effectiveness of the proposed method is further illustrated through four examples, including atomic sensor models, where input signals that generate large MI are especially important for reducing the estimation error.
💡 Research Summary
This paper presents a novel Bayesian methodology for designing optimal input signals in system identification, moving beyond the limitations of classical Fisher-information-based approaches. The core challenge in system identification is to design input signals that are maximally informative about the unknown system parameters. Traditional methods, which optimize criteria like D-optimality based on the Fisher Information Matrix (FIM), are inherently local, relying on linear approximations and asymptotic normality. They often fail in the presence of significant prior uncertainty and strong nonlinearities.
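For concreteness, the locally D-optimal criterion referenced above, and the averaged (semi-Bayesian) variant used as a baseline later in the paper, are typically written as follows (standard formulations; the notation $M$, $\theta_0$ is mine, not taken from the paper):

```latex
U^{\star}_{\mathrm{D}} = \arg\max_{U}\, \log\det M(\theta_0, U),
\qquad
U^{\star}_{\mathrm{avg\text{-}D}} = \arg\max_{U}\, \mathbb{E}_{\theta}\!\left[\log\det M(\theta, U)\right],
```

where $M(\theta, U)$ is the Fisher Information Matrix and $\theta_0$ a nominal parameter value. The dependence on a single nominal $\theta_0$, or on the prior only through an average of a local criterion, is exactly what makes these designs local or semi-Bayesian.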
To address this, the authors propose a fully Bayesian framework where the quality of an experimental design is measured by the Mutual Information (MI) between the model parameters θ and the observed data Y. Maximizing MI is a fundamental criterion, as an information-theoretic lower bound (ITB) directly links higher MI to lower minimum mean squared estimation error. However, computing and optimizing MI is notoriously intractable, requiring high-dimensional integration over both parameters and observations.
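One common form of such an information-theoretic bound, stated here for a continuous scalar parameter with differential entropy h(θ) (an illustrative standard result; the paper's exact statement may differ, e.g., for discrete priors), follows from the entropy form of the minimum mean squared error bound:

```latex
\mathbb{E}\big[(\theta - \hat{\theta}(Y))^2\big]
\;\ge\; \frac{1}{2\pi e}\, e^{\,2 h(\theta \mid Y)}
\;=\; \frac{1}{2\pi e}\, e^{\,2\left(h(\theta) - I(\theta;\, Y)\right)},
```

so increasing the mutual information $I(\theta; Y)$ through input design directly lowers the floor on achievable estimation error.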
The paper’s key innovation is to circumvent this intractability. The authors focus on a specific but widely applicable model class: Y = F(θ, U) + Z, where Z is conditionally Gaussian noise and U is the design variable (input signal). When the prior distribution for θ is discrete over a finite set {θ_1, …, θ_r}, the marginal distribution of Y becomes a finite Gaussian mixture. For such mixtures, the authors adopt a tractable lower bound I_l(U) on the MI, derived by Kolchinsky and Tracey. This bound is expressed in terms of pairwise distances d_{i,j}(U) between the Gaussian components, which depend on the mean differences F(θ_i, U) − F(θ_j, U) and the covariances S(θ_i, U). Thus, the original intractable problem of maximizing MI is approximated by the tractable problem of maximizing the lower bound I_l(U).
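The mixture lower bound described above can be sketched numerically. The snippet below is an illustrative sketch, not the paper's code: function names are mine, and the Bhattacharyya distance is used as one admissible choice of pairwise distance in the Kolchinsky–Tracey family. It evaluates a bound of the form I_l = −∑_i w_i log ∑_j w_j exp(−d_{i,j}) for a finite Gaussian mixture with weights w_i, means F(θ_i, U), and covariances S(θ_i, U):

```python
import numpy as np

def bhattacharyya(mu_i, S_i, mu_j, S_j):
    """Closed-form Bhattacharyya distance between two Gaussians."""
    S = 0.5 * (S_i + S_j)
    dmu = mu_i - mu_j
    quad = 0.125 * dmu @ np.linalg.solve(S, dmu)
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_i = np.linalg.slogdet(S_i)
    _, logdet_j = np.linalg.slogdet(S_j)
    return quad + 0.5 * (logdet_S - 0.5 * (logdet_i + logdet_j))

def mi_lower_bound(weights, means, covs):
    """Kolchinsky-Tracey-style MI lower bound for a finite Gaussian mixture:
    I_l = -sum_i w_i log sum_j w_j exp(-d_ij)."""
    r = len(weights)
    d = np.zeros((r, r))
    for i in range(r):
        for j in range(r):
            if i != j:
                d[i, j] = bhattacharyya(means[i], covs[i], means[j], covs[j])
    inner = np.exp(-d) @ weights  # inner_i = sum_j w_j exp(-d_ij)
    return -weights @ np.log(inner)
```

The bound behaves as expected: it is zero when all components coincide (the data carry no information about θ) and approaches the prior entropy −∑ w_i log w_i when the components are well separated, which is the most information a discrete prior over r points can yield. In the paper's setting, the design U enters through the means and covariances, and the optimizer searches over U to maximize this quantity.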
The method is then specialized for identifying quasi-linear stochastic dynamical systems—systems linear in state but nonlinear in control input, common in quantum control, chemical processes, and thermal dynamics. For such systems, a finite observation sequence can always be cast in the Y = F(θ, U) + Z form. A significant practical hurdle arises because evaluating I_l(U) requires inverting covariance matrices S(θ, U) whose dimension scales with the number of data points N, making long experiments (N ~ 10^3-10^6) computationally prohibitive. A major algorithmic contribution of the paper is the development of a method that reduces the dimension of the matrices needing inversion by a factor of N, rendering the approach feasible for large-scale problems.
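The paper's specific dimensionality-reduction algorithm is not reproduced here, but a standard way to achieve this kind of saving for state-space models is the innovations (Kalman-filter) decomposition: the N·m-dimensional Gaussian log-likelihood (equivalently, the log-determinant and quadratic form of the full covariance S) is accumulated by inverting only m×m innovation covariances, one per time step. A minimal sketch under assumed notation (A, C, Q, R are generic state-space matrices of my choosing; this illustrates the general technique, not the authors' algorithm):

```python
import numpy as np

def gaussian_loglike_kalman(y, A, C, Q, R, x0, P0):
    """Evaluate log N(y; mean, S) for a linear state-space model via the
    Kalman filter: only m x m innovation covariances are inverted
    (m = output dimension), never the full (N*m) x (N*m) covariance S."""
    x, P = x0.copy(), P0.copy()
    m = C.shape[0]
    ll = 0.0
    for yk in y:
        # innovation and its covariance at this step
        e = yk - C @ x
        Sk = C @ P @ C.T + R
        _, logdet = np.linalg.slogdet(Sk)
        ll += -0.5 * (logdet + e @ np.linalg.solve(Sk, e) + m * np.log(2 * np.pi))
        # measurement update
        K = P @ C.T @ np.linalg.inv(Sk)
        x = x + K @ e
        P = P - K @ Sk @ K.T
        # time update
        x = A @ x
        P = A @ P @ A.T + Q
    return ll
```

For long records this replaces one O((Nm)^3) inversion with N inversions of m×m matrices, which is the same order of saving (a factor of N in the dimension of the matrices inverted) that the paper's algorithm achieves for evaluating I_l(U).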
The proposed Bayesian method is compared against the average D-optimal design, a semi-Bayesian method. The comparison demonstrates the advantages of the fully Bayesian, information-theoretic approach, particularly in nonlinear scenarios. The effectiveness of the method is thoroughly illustrated through four examples. The first two are simple linear examples for clarity. The third and fourth are more complex, drawn from atomic sensor models where precise parameter estimation is critical. The fourth example involves a nonlinear optically pumped magnetometer model. Here, the optimal input signal generated by the proposed method significantly outperforms a simple harmonic signal, and the resulting estimation error of the Maximum a Posteriori (MAP) estimator is shown to approach the theoretical ITB, validating the practical optimality of the design.
In summary, this work provides a principled, computationally feasible Bayesian strategy for optimal input design. It successfully bridges the gap between the theoretical ideal of maximizing mutual information and practical application, especially for quasi-linear systems, by introducing a tractable lower-bound optimization and a crucial dimensionality-reduction algorithm for handling long data records.