Linear Detrending Subsequence Matching in Time-Series Databases

Each time-series has its own linear trend, that is, its overall directionality, and removing this trend is crucial for obtaining more intuitive matching results. Supporting linear detrending in subsequence matching is challenging because of the huge number of possible subsequences. In this paper, we define this problem as linear detrending subsequence matching and propose an efficient index-based solution. To this end, we first present the notion of LD-windows (LD stands for linear detrending), which are obtained as follows: we eliminate the linear trend from a whole subsequence rather than from each window individually, and then obtain LD-windows by dividing the detrended subsequence into windows. Using LD-windows, we then present a lower bounding theorem for the index-based matching solution and formally prove its correctness. Based on the lower bounding theorem, we next propose index building and subsequence matching algorithms for linear detrending subsequence matching. We finally show the superiority of our index-based solution through extensive experiments.

💡 Research Summary

The paper tackles a fundamental yet under‑explored problem in time‑series databases: how to perform subsequence similarity search when each candidate subsequence may contain its own linear trend. While global detrending is a common preprocessing step, applying linear detrending at the subsequence level is computationally prohibitive because the number of possible subsequences grows quadratically with the length of the series. The authors therefore define the “linear detrending subsequence matching” problem and propose a complete index‑based solution that scales to large databases.

The key technical contribution is the introduction of LD‑windows (Linear Detrending windows). For any candidate subsequence, a least‑squares linear regression line is first fitted to the entire subsequence. The regression line is subtracted, yielding a detrended version of the subsequence. This detrended subsequence is then partitioned into fixed‑size windows; each window is an LD‑window. Because the detrending operation is applied once per subsequence rather than per window, the resulting windows share a common reference (the same regression line) and can be compared directly using ordinary Euclidean distance.
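
The construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `linear_detrend` and `ld_windows` are hypothetical, the windows are assumed to be disjoint, and dropping a trailing remainder shorter than one window is an assumption made only to keep the example short.

```python
def linear_detrend(subseq):
    """Subtract the least-squares regression line fitted to the whole subsequence."""
    n = len(subseq)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    t_mean = (n - 1) / 2                      # mean of time indices 0..n-1
    y_mean = sum(subseq) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(subseq))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [y - (slope * t + intercept) for t, y in enumerate(subseq)]


def ld_windows(subseq, window_size):
    """Detrend the subsequence once, then cut it into disjoint fixed-size windows.

    Assumption for this sketch: a trailing remainder shorter than
    window_size is dropped.
    """
    detrended = linear_detrend(subseq)
    usable = len(detrended) - len(detrended) % window_size
    return [detrended[i:i + window_size] for i in range(0, usable, window_size)]


# A purely linear subsequence has no residual after detrending,
# so every LD-window is (numerically) all zeros.
windows = ld_windows([2 * t + 1 for t in range(8)], window_size=4)
```

Because the line is fitted once over the whole subsequence, all windows in `windows` are residuals against the same regression line, which is the property the summary relies on when comparing them with Euclidean distance.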

Building on this representation, the authors prove a lower‑bounding theorem: the sum of Euclidean distances between corresponding LD‑windows of two subsequences is a lower bound on the true distance between the original (non‑detrended) subsequences. Formally, if A and B are two original subsequences and {a_i}, {b_i} are their respective LD‑window sets, then

📜 Original Paper Content