Sequential anomaly detection in the presence of noise and limited feedback
This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: (1) {\em filtering}, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations, and (2) {\em hedging}, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models which can be used in a wide variety of contexts and applications, and which yield methods that achieve sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over binary vectors of high dimensionality, as well as on the Enron email dataset.
Research Summary
The paper tackles the problem of detecting anomalies in a sequential data stream when observations are noisy and user feedback is scarce. It proposes a two‑stage framework—filtering and hedging—that integrates universal prediction, exponential‑family modeling, and online convex optimization to produce an adaptive, low‑complexity anomaly detector with provable performance guarantees.
Filtering stage. The authors model each observation \(x_t\) as drawn from an exponential-family distribution parameterized by a natural parameter \(\theta_t\). Rather than computing full posterior distributions, they treat the negative log-likelihood \(\ell_t(\theta) = -\log p_\theta(x_t)\) as a convex loss and apply an online convex programming (OCP) algorithm. Using a primal-dual update, the natural parameter is adjusted at each round based only on the current noisy observation and the previous parameter estimate. This yields a belief (or likelihood) \(\hat p_t = p_{\theta_t}(x_t)\) for the current measurement. The authors prove that, for a static underlying distribution, the cumulative regret of this online predictor scales as \(\tilde O(\sqrt{T})\); for slowly varying product distributions the regret remains sublinear (e.g., \(O(T^{2/3})\)). Moreover, the regret matches the minimax value of the corresponding strongly convex online game, establishing optimality.
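The filtering step can be illustrated for a Bernoulli product model over binary vectors, matching the paper's synthetic experiments. The sketch below is a minimal online-gradient version, not the paper's exact primal-dual update; the function name, step-size schedule \(\eta_0/t\), and choice of Bernoulli marginals are assumptions made here for concreteness.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def filtering_stream(xs, eta0=1.0):
    """Online gradient descent on the per-round negative log-likelihood
    ell_t(theta) = -log p_theta(x_t) of a Bernoulli product model with
    natural (logit) parameters theta. Returns the sequence of beliefs
    p_hat_t = p_{theta_t}(x_t) assigned to each incoming measurement.
    Illustrative sketch only; the paper uses a primal-dual update."""
    d = xs.shape[1]
    theta = np.zeros(d)              # natural-parameter estimate
    beliefs = []
    for t, x in enumerate(xs, start=1):
        mu = sigmoid(theta)          # mean parameter E[x] under theta
        # belief assigned to the current measurement under the model
        p_hat = np.prod(np.where(x == 1, mu, 1.0 - mu))
        beliefs.append(p_hat)
        # gradient of -log p_theta(x) w.r.t. theta is (mu - x)
        theta -= (eta0 / t) * (mu - x)   # 1/t step, as for strongly convex losses
    return np.array(beliefs)
```

As the parameter estimate adapts, the beliefs assigned to typical observations from a static source rise toward the likelihood under the true marginals, which is the behavior the \(\tilde O(\sqrt{T})\) regret bound formalizes.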
Hedging stage. The belief \(\hat p_t\) is compared against a time-varying threshold \(\tau_t\). When the user provides feedback \(y_t\in\{0,1\}\) on whether an instance was anomalous, the algorithm updates \(\tau_t\) using an online subgradient step on the loss \(\ell_t^h(\tau) = \mathbf{1}\{\hat p_t>\tau\}(1-y_t) + \mathbf{1}\{\hat p_t\le\tau\}y_t\), and the updated threshold is projected back onto the feasible range.
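A minimal sketch of the threshold update follows. It assumes the conventions that a low belief (\(\hat p_t \le \tau\)) flags an anomaly and that \(y_t = 1\) marks a true anomaly; it uses a mistake-driven perceptron-style step (the 0-1 mistake loss is flat almost everywhere, so a fixed-size step on mistakes stands in for the paper's subgradient step on a surrogate), and the projection interval \([\tau_{\min}, \tau_{\max}]\) is an assumption.

```python
def hedging_update(tau, p_hat, y, eta, tau_min=0.0, tau_max=1.0):
    """One round of the hedging step: adjust the detection threshold tau
    from feedback y (1 = anomaly) on the current belief p_hat.
    Mistake-driven sketch of the paper's online subgradient update."""
    flagged = p_hat <= tau           # low belief -> flag as anomaly
    if y == 1 and not flagged:
        tau += eta                   # missed anomaly: raise threshold
    elif y == 0 and flagged:
        tau -= eta                   # false alarm: lower threshold
    # project back onto the feasible range of thresholds
    return min(max(tau, tau_min), tau_max)
```

Feedback need not arrive every round; when \(y_t\) is unavailable the threshold is simply left unchanged, which is consistent with the limited-feedback setting in the title.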