We consider the problem of event detection based upon a (typically multivariate) data stream characterizing some system. Most of the time the system is quiescent - nothing of interest is happening - but occasionally events of interest occur. The goal of event detection is to raise an alarm as soon as possible after the onset of an event. A simple way of addressing the event detection problem is to look for changes in the data stream and equate “change” with “onset of event”. However, there might be many kinds of changes in the stream that are uninteresting. We assume that we are given a segment of the stream where interesting events have been marked. We propose a method for using these training data to construct a “targeted” detector that is specifically sensitive to changes signaling the onset of interesting events.
We consider the problem of event detection based upon a (typically multivariate) data stream characterizing some system. Examples include sensor readings for a patient in an intensive care unit, video images of a scene, and sales records of pharmacies. Most of the time the system is quiescent - nothing of interest is happening - but occasionally events of interest occur: a patient goes into shock, an intruder appears, or pharmacies in some geographic area experience increased demand for some medications. The goal of event detection is to raise an alarm as soon as possible after the onset of an event.
A simple way of addressing the event detection problem is to look for changes in the data stream and equate “change” with “onset of event”. The assumption is that, once an alarm rings, a human will enter the loop and decide whether an event of interest did in fact occur. If not, then the system issued a false alarm. If an event is in progress, then the human will monitor the system until the event ends. Under this assumption, the second alarm, caused by the change from “event” back to “quiescent period”, would not count as a false alarm.
Changes in the data stream can be detected by comparing the distribution of the most recent observations (the current set) with the distribution of previous observations (the reference set). Let $T$ denote the current time. A simple approach is to choose window sizes $C$ and $R$, and use a two-sample test $S$ to compare the observations in the current set $C_T = \{x_{T-C+1}, \ldots, x_T\}$ with the observations in the reference set $R_T = \{x_{T-C-R+1}, \ldots, x_{T-C}\}$. When the test statistic $S(R_T, C_T)$ exceeds a chosen threshold $\tau$, we ring the alarm. The threshold controls the tradeoff between false alarms and missed detections. Abstracting away details, a change detector can be defined as a combination of a detection algorithm mapping the multivariate input stream $x_T$ into a univariate detection stream $d_T$, and an alarm threshold $\tau$. The only restriction is that $d_T$ can depend only on input observed up to time $T$.
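To make the windowed two-sample scheme concrete, here is a minimal sketch in Python. The Kolmogorov-Smirnov statistic is used as one illustrative (univariate) choice of the test $S$, and the window sizes and threshold are placeholder values, not recommendations from the paper.

```python
import numpy as np
from scipy.stats import ks_2samp

def detection_stream(x, R=200, C=50):
    """Map a univariate stream x to a detection stream d_T.

    At each time T (0-indexed), compare the current set
    {x_{T-C+1}, ..., x_T} with the reference set
    {x_{T-C-R+1}, ..., x_{T-C}} using the two-sample KS statistic,
    which here plays the role of the generic test S.
    """
    x = np.asarray(x, dtype=float)
    d = np.full(len(x), np.nan)
    for T in range(R + C - 1, len(x)):
        reference = x[T - C - R + 1 : T - C + 1]  # R observations
        current = x[T - C + 1 : T + 1]            # C observations
        d[T] = ks_2samp(reference, current).statistic
    return d

# Ring the alarm whenever d_T exceeds a chosen threshold tau;
# tau trades off false alarms against missed detections.
stream = np.concatenate([np.random.normal(0, 1, 500),
                         np.random.normal(1, 1, 100)])  # shift at t = 500
d = detection_stream(stream)
tau = 0.4  # illustrative threshold
alarm_times = np.where(d > tau)[0]
```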
A weakness of the approach to event detection outlined above is the equating of “onset of event” with “change”: there might be many kinds of changes in the stream that do not signal the onset of an event of interest. If we detect changes by running two-sample tests, the weakness can be expressed in terms of the power characteristics of the test $S$. We want $S$ to have high power for discriminating between data observed during quiescent periods and data observed at the onset of an interesting event, and low power against all other alternatives. The difficulty is that it can be hard to “manually” design such a test, especially in a multivariate setting.
In a previous paper [6] we argued that realistically assessing the performance of a change detector and choosing the threshold $\tau$ for a desired false alarm rate requires labeled data. By this we mean a segment $x_1, \ldots, x_n$ of the data stream with labels $y_1, \ldots, y_n$, where $y_i = 1$ if $x_i$ is observed during an event and $y_i = 0$ if $x_i$ is observed during a quiescent period. The assumption that we have labeled training data raises a question: shouldn’t we use these data for designing rather than merely evaluating a detector? In this paper we propose a way of injecting labeled data into the design phase of an event detector. We refer to this process as training or “targeting” the detector.
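As a rough illustration of how labeled data can be used to set the threshold, the sketch below picks the smallest $\tau$ whose empirical false alarm rate on a labeled segment stays below a target. The per-observation rate estimate is our simplifying assumption; the actual evaluation methodology of [6] may differ.

```python
import numpy as np

def threshold_for_false_alarm_rate(d, y, target_rate=0.01):
    """Choose the smallest tau whose empirical false alarm rate,
    naively estimated as the fraction of quiescent observations
    (y == 0) at which d exceeds tau, is at most target_rate."""
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=int)
    quiescent = d[(y == 0) & ~np.isnan(d)]
    for tau in np.sort(quiescent):
        if np.mean(quiescent > tau) <= target_rate:
            return tau
    return np.inf  # no finite threshold meets the target
```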
The remainder of this paper is organized as follows: In Section 2 we describe the basic idea behind targeted event detection and contrast it with untargeted event detection. Targeting converts the problem of detecting a change in the data stream signaling the onset of an event to the problem of detecting a positive level shift in a univariate stream; we address this problem in Section 3. In Section 4 we briefly sketch an adaptation of ROC curves to event detection proposed in [6]. In Section 5 we illustrate the effect of targeting in a simple situation where the data stream is univariate and the observations are independent. A more realistic multivariate example is presented in Section 6. Section 7 concludes the paper with a summary and discussion.
We assume we are given a segment $x_1, \ldots, x_n$ of a (possibly multivariate) data stream together with class labels $y_1, \ldots, y_n$, where $y_i = 1$ if $x_i$ was observed during an event of interest, and $y_i = 0$ otherwise. We use these training data to target the event detector.
The key step in our targeting method is to train a classifier on the labeled data. The classifier produces a classification score $s_i$ for each $x_i$, with large values indicating $y_i = 1$; i.e., $x_i$ was observed during an event.
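A minimal sketch of this step follows. Logistic regression, the synthetic placeholder data, and the use of predicted class-1 probabilities as scores are all our assumptions for illustration; the paper does not fix a particular classifier here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder labeled segment: X has one row per observation x_i.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 5))
y_train = (X_train[:, 0] + X_train[:, 1] > 1).astype(int)  # synthetic labels

# Train the classifier on the labeled data (targeting step).
clf = LogisticRegression().fit(X_train, y_train)

def score(x_T):
    """Map a new observation x_T to a classification score s_T;
    large scores indicate 'event'. Here we use the predicted
    probability of class 1, one convenient choice."""
    return clf.predict_proba(np.asarray(x_T).reshape(1, -1))[0, 1]
```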
By construction, the onset of an event is signaled by a positive shift in the score stream. We are now left with the simpler problem of detecting a positive level shift in a univariate stream; two univariate change detectors mapping scores into a detection stream $d_T$ are described in Section 3. We raise the alarm when $d_T$ exceeds a threshold $\tau$.
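For concreteness, here is one generic detector for a positive level shift: a one-sided CUSUM applied to the score stream. It is a textbook choice included only to illustrate the reduction, not necessarily either of the two detectors described in Section 3; the quiescent level mu0 and allowance k are assumed to be set from training data.

```python
import numpy as np

def cusum_detection_stream(scores, mu0=0.0, k=0.5):
    """One-sided CUSUM: accumulate positive deviations of the scores
    above the quiescent level mu0, less an allowance k, resetting at
    zero. The result is a detection stream d_T; the alarm rings when
    d_T exceeds a threshold tau, as before."""
    d = np.empty(len(scores))
    g = 0.0
    for t, s_t in enumerate(scores):
        g = max(0.0, g + (s_t - mu0) - k)
        d[t] = g
    return d
```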