Time-Delayed Transformers for Data-Driven Modeling of Low-Dimensional Dynamics
We propose the time-delayed transformer (TD-TF), a simplified transformer architecture for data-driven modeling of unsteady spatio-temporal dynamics. TD-TF bridges linear operator-based methods and deep sequence models by showing that a single-layer, single-head transformer can be interpreted as a nonlinear generalization of time-delayed dynamic mode decomposition (TD-DMD). The architecture is deliberately minimal, consisting of one self-attention layer with a single query per prediction and one feedforward layer, resulting in linear computational complexity in sequence length and a small parameter count. Numerical experiments demonstrate that TD-TF matches the performance of strong linear baselines on near-linear systems, while significantly outperforming them in nonlinear and chaotic regimes, where it accurately captures long-term dynamics. Validation studies on synthetic signals, unsteady aerodynamics, the Lorenz '63 system, and a reaction-diffusion model show that TD-TF preserves the interpretability and efficiency of linear models while providing substantially enhanced expressive power for complex dynamics.
💡 Research Summary
The paper introduces the Time‑Delayed Transformer (TD‑TF), a highly compact transformer architecture designed for data‑driven modeling of low‑dimensional, unsteady spatio‑temporal dynamics. The authors motivate the need for a model that bridges the interpretability and efficiency of linear operator‑based methods (such as Dynamic Mode Decomposition and its time‑delayed variant, TD‑DMD) with the expressive power of modern deep sequence models (transformers). By restricting the architecture to a single self‑attention layer, a single query per prediction, and one shallow feed‑forward network, TD‑TF retains linear‑time computational complexity with respect to sequence length and uses only a few thousand parameters.
The construction proceeds as follows: each snapshot (w_k) is augmented with a normalized time index, forming a positionally encoded vector (y_k = [w_k; k/m]), i.e., the snapshot concatenated with its index scaled by the delay-window length (m).
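The single-query attention step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the dimensions, the concatenation form of the positional encoding, the tanh feed-forward layer, and the random weight matrices (`Wq`, `Wk`, `Wv`, `W1`, `W2`) are all illustrative assumptions standing in for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper): state dim d, window m.
d, m, d_model = 3, 16, 8

# Delay window of snapshots w_{k-m+1}, ..., w_k, each in R^d.
W = rng.standard_normal((m, d))

# Positional encoding: append the normalized time index to each snapshot.
t = np.arange(m) / m                          # normalized indices k/m
Y = np.concatenate([W, t[:, None]], axis=1)   # augmented y_k, shape (m, d+1)

# Placeholder projection matrices (random, standing in for trained weights).
Wq = rng.standard_normal((d_model, d + 1))
Wk = rng.standard_normal((d_model, d + 1))
Wv = rng.standard_normal((d_model, d + 1))

# Single query: only the most recent augmented snapshot y_k attends over
# the window, so the cost per prediction is linear in m.
q = Wq @ Y[-1]                 # query vector, shape (d_model,)
K = Y @ Wk.T                   # keys,   shape (m, d_model)
V = Y @ Wv.T                   # values, shape (m, d_model)

scores = K @ q / np.sqrt(d_model)
a = np.exp(scores - scores.max())
a /= a.sum()                   # softmax attention weights, shape (m,)
z = a @ V                      # attended context vector, shape (d_model,)

# One shallow feed-forward layer maps the context to the next snapshot.
W1 = rng.standard_normal((d_model, d_model))
W2 = rng.standard_normal((d, d_model))
w_next = W2 @ np.tanh(W1 @ z)  # predicted w_{k+1}, shape (d,)
```

With attention weights fixed to a learned linear form instead of a softmax, this reduces to a TD-DMD-style linear map on the delay window, which is the correspondence the paper exploits.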