SMKC: Sketch Based Kernel Correlation Images for Variable Cardinality Time Series Anomaly Detection
Conventional anomaly detection in multivariate time series relies on the assumption that the set of observed variables remains static. In operational environments, however, monitoring systems frequently experience sensor churn. Signals may appear, disappear, or be renamed, creating data windows where the cardinality varies and may include values unseen during training. To address this challenge, we propose SMKC, a framework that decouples the dynamic input structure from the anomaly detector. We first employ permutation-invariant feature hashing to sketch raw inputs into a fixed size state sequence. We then construct a hybrid kernel image to capture global temporal structure through pairwise comparisons of the sequence and its derivatives. The model learns normal patterns using masked reconstruction and a teacher-student prediction objective. Our evaluation reveals that robust log-distance channels provide the primary discriminative signal, whereas cosine representations often fail to capture sufficient contrast. Notably, we find that a detector using random projections and nearest neighbors on the SMKC representation performs competitively with fully trained baselines without requiring gradient updates. This highlights the effectiveness of the representation itself and offers a practical cold-start solution for resource-constrained deployments.
💡 Research Summary
The paper tackles a practical yet under‑explored problem: anomaly detection in multivariate time‑series when the set of observed variables changes over time. In many operational monitoring systems, sensors are added, removed, or renamed, leading to windows with different cardinalities and even previously unseen variables. Traditional deep‑learning based detectors (e.g., USAD, TranAD, Anomaly Transformer) assume a fixed‑dimensional input and often require costly retraining or manual engineering to handle such churn.
SMKC (Sketch‑Based Kernel Correlation) proposes a two‑stage pipeline that separates representation from detection, making variable cardinality a property of the representation rather than an architectural constraint.
Stage 1 – Fixed‑width sketching.
Each window provides a value matrix X∈ℝ^{L×C} and a binary mask M∈{0,1}^{L×C}, where L is the window length and C is the number of variables present in that window. Using signed feature hashing, the method maps each variable identifier to a deterministic bucket index and sign, for both the value stream and the presence stream. For a chosen number of buckets m (default 128), the signed values and presence indicators are accumulated into two m‑dimensional vectors and concatenated into a 2m‑dimensional state g_t for each timestep t. The state is normalized by the per‑timestep observed count n_t together with a saturating factor λ(n_t)=min{0.2 n_t, 1}. The result is a fixed‑width sequence g_{1:L} that is permutation‑invariant, independent of C, and automatically accommodates unseen identifiers.
Stage 2 – Hybrid kernel image.
From g_{1:L} three derived sequences are built: the raw sequence g, its first difference Δg, and the element‑wise absolute difference |Δg|. For each sequence z, two pairwise matrices are computed across all timesteps: (i) cosine similarity after ℓ₂‑normalization, yielding values in