TIMEPERCEIVER: An Encoder-Decoder Framework
for Generalized Time-Series Forecasting
Jaebin Lee
Sungkyunkwan University
jaebin.lee@skku.edu
Hankook Lee
Sungkyunkwan University
hankook.lee@skku.edu
Abstract
In machine learning, effective modeling requires a holistic consideration of how to encode inputs, make predictions (i.e., decoding), and train the model. However, in time-series forecasting, prior work has predominantly focused on encoder design, often treating prediction and training as separate or secondary concerns. In this paper, we propose TIMEPERCEIVER, a unified encoder-decoder forecasting framework that is tightly aligned with an effective training strategy. To be specific, we first generalize the forecasting task to include diverse temporal prediction objectives such as extrapolation, interpolation, and imputation. Since this generalization requires handling input and target segments that are arbitrarily positioned along the temporal axis, we design a novel encoder-decoder architecture that can flexibly perceive and adapt to these varying positions. For encoding, we introduce a set of latent bottleneck representations that can interact with all input segments to jointly capture temporal and cross-channel dependencies. For decoding, we leverage learnable queries corresponding to target timestamps to effectively retrieve relevant information. Extensive experiments demonstrate that our framework consistently and significantly outperforms prior state-of-the-art baselines across a wide range of benchmark datasets. The code is available at https://github.com/efficient-learning-lab/TimePerceiver.
1 Introduction
Time-series forecasting is a fundamental task in machine learning, aiming to predict future events
based on past observations. It is of practical importance, as it plays a crucial role in many real-world
applications, including weather forecasting [1], electricity consumption forecasting [2], and traffic
flow prediction [3]. Despite decades of rapid advances in machine learning, time-series forecasting
remains a challenging problem due to complex temporal dependencies, non-linear patterns, domain
variability, and other factors. In recent years, numerous deep learning approaches [4–18] have been
proposed to improve forecasting accuracy, and it continues to be an active area of research.
One promising and popular research direction is to design a new neural network architecture for
time-series data, such as Transformers [4–9], convolutional neural networks (CNNs) [11, 12], multi-
layer perceptrons (MLPs) [13–15], and state space models (SSMs) [17, 18]. These architectures
primarily focus on capturing temporal and channel (i.e., variate) dependencies within input signals,
and on how to encode the input into a meaningful representation. The encoder architectures are often
categorized into two groups: channel-independent encoders, which treat each variate separately
and apply the same encoder across all variates, and channel-dependent encoders, which explicitly
model interactions among variates. The channel-independent encoders are considered simple yet
robust [19]; however, they fundamentally overlook cross-channel interactions, which can be critical
for multivariate time-series forecasting. In contrast, the channel-dependent encoders [5, 6, 8] can
inherently capture such cross-channel dependencies, but they often suffer from high computational
cost and do not consistently yield significant improvements in forecasting accuracy over channel-independent baselines.

[Figure 1: two panels over a window X = [x_1, x_2, . . . , x_10]. (a) Standard formulation: f_ω maps the past X_past = [x_1, . . . , x_6] to the predicted future X̂_future = [x̂_7, . . . , x̂_10]. (b) Generalized formulation (ours): g_ω maps the input segments X_I with I = {2, 3, 4, 5, 7, 8} to the predicted targets X̂_J with J = {1, 6, 9, 10}.]
Figure 1: (a) The standard time-series forecasting task aims to predict only the future values from past observations. In contrast, (b) our generalized task formulation aims to predict not only the future, but also the past and missing values based on arbitrary contextual information.
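As a concrete, non-authoritative illustration of this generalized formulation, the Python sketch below samples a target index set J at arbitrary temporal positions and uses the complementary set I as the observed input, mirroring Figure 1(b). The helper name sample_generalized_pair and the target_ratio argument are assumptions made for illustration, not part of the released TimePerceiver code.

```python
import numpy as np

def sample_generalized_pair(X, target_ratio=0.4, rng=None):
    """Split a window X of shape (T, C) into input/target segments.

    Target indices J are drawn from arbitrary positions along the temporal
    axis; the remaining indices I serve as the observed input, as in
    Figure 1(b). Names and the sampling scheme are illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    T = X.shape[0]
    J = np.sort(rng.choice(T, size=int(T * target_ratio), replace=False))
    I = np.setdiff1d(np.arange(T), J)   # observed timestamps
    return (I, X[I]), (J, X[J])         # (input indices, values), (target indices, values)

# Example: T = 10 timestamps, C = 3 channels, as in Figure 1.
X = np.random.randn(10, 3)
(I, X_I), (J, X_J) = sample_generalized_pair(X)
# A model g receives (I, X_I) and must predict X_J at timestamps J,
# covering extrapolation, interpolation, and imputation as special cases.
```

Depending on how J is sampled, this single recipe reduces to standard forecasting (J is a contiguous future block), interpolation (J lies between observed segments), or imputation (J is scattered).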
While the encoder architecture is undoubtedly a core component of time-series forecasting models,
it is equally important to consider (i) how to accurately predict (i.e., decode) future signals from
the encoded representations of past signals, and (ii) how to effectively train the entire forecasting
model. However, these two aspects have often been studied independently, and little attention has been paid to how
to effectively integrate them. For decoding, most prior works rely on a simple linear projection that
directly predicts the future from the encoded representations. This design offers advantages in terms
of simplicity and training efficiency, but may struggle to fully capture complex temporal structures.
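To make this point concrete, the following minimal sketch shows the common design pattern described above: a shared, channel-independent encoder followed by a single linear projection that maps the encoded past directly to the entire future horizon. The class name, dimensions, and layer choices are illustrative assumptions, not the implementation of any specific baseline or of TIMEPERCEIVER.

```python
import torch
import torch.nn as nn

class LinearHeadForecaster(nn.Module):
    """Typical prior design: encode each channel independently with a shared
    encoder, then predict the whole future horizon with one linear projection."""

    def __init__(self, lookback: int, horizon: int, d_model: int = 128):
        super().__init__()
        # Shared (channel-independent) encoder applied to every variate.
        self.encoder = nn.Sequential(nn.Linear(lookback, d_model), nn.GELU())
        # Simple linear decoder: representation -> all future timestamps at once.
        self.head = nn.Linear(d_model, horizon)

    def forward(self, x_past: torch.Tensor) -> torch.Tensor:
        # x_past: (batch, lookback, channels) -> (batch, channels, lookback)
        z = self.encoder(x_past.transpose(1, 2))
        y = self.head(z)                  # (batch, channels, horizon)
        return y.transpose(1, 2)          # (batch, horizon, channels)

# Usage: forecast 96 future steps of a 7-channel series from 336 past steps.
model = LinearHeadForecaster(lookback=336, horizon=96)
y_hat = model(torch.randn(8, 336, 7))     # -> shape (8, 96, 7)
```

Such a head is simple and efficient to train, but every future timestamp is produced by one fixed projection of the past representation, which is precisely the limitation that motivates a more expressive, query-based decoder.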
For training, inspired by BERT [20], masking-and