Unsupervised Anomalous Trajectory Detection for Crowded Scenes
We present an improved clustering based, unsupervised anomalous trajectory detection algorithm for crowded scenes. The proposed work is based on four major steps, namely, extraction of trajectories from crowded scene video, extraction of several feat…
Authors: Deepan Das, Deepak Mishra
Deepan Das
Dept. of Electronics and Telecommunication Engineering, Indian Institute of Engineering Science and Technology, Shibpur, WB, India
ddas27@wisc.edu

Deepak Mishra
Dept. of Avionics, Indian Institute of Space Science and Technology, Thiruvananthapuram, India
deepak.mishra@iist.ac.in

Abstract: We present an improved clustering based, unsupervised anomalous trajectory detection algorithm for crowded scenes. The proposed work is based on four major steps, namely, extraction of trajectories from crowded scene video, extraction of several features from these trajectories, independent mean-shift clustering and anomaly detection. First, the trajectories of all moving objects in a crowd are extracted using a multi-feature video object tracker. These trajectories are then transformed into a set of feature spaces. Mean shift clustering is applied on these feature matrices to obtain distinct clusters, while a Shannon Entropy based anomaly detector identifies corresponding anomalies. In the final step, a voting mechanism identifies the trajectories that exhibit anomalous characteristics. The algorithm is tested on crowd scene videos from datasets. The videos represent various possible crowd scenes with different motion patterns and the method performs well to detect the expected anomalous trajectories from the scene.

Index Terms: Crowd, Anomaly Detection, Trajectory, Clustering, Entropy.

I. INTRODUCTION

Computer Vision research aims to converge at human-like abilities to interpret and extract useful information regarding behavioural patterns and anomalies from a descriptive set of visual data. However, human abilities have glaring limitations when it comes to analyzing simultaneously changing signals [1].
A crowd presents itself as a considerably large collection of simultaneously changing parameters, characterized by usual dominant patterns and some observable abnormalities. Safety is the primary reason to understand crowd dynamics and isolate anomalous patterns. With crowd-related violent incidents on the rise, it is paramount that we expand our studies to analyze the intricate and complex nature of crowds. Understanding anomalies in a crowded scene enables better public space design and also allows better surveillance systems to be built.

Fig. 1. Typical Crowded Scenes. (a) Snapshot from the Pilgrim sequence of the UCF Database of crowded scenes. (b) Snapshot from the Crowded Intersection sequence of the UCF Database of crowded scenes.

Earlier works like those of Kim et al. [2] used a Mixture of Probabilistic Principal Component Analyzers to learn patterns of local optical flow and then validated the consistency with a Markov Random Field. Cong et al. [3] used a multi-scale histogram of Optical Flow as the feature descriptor and used it as the basis for a sparse reconstruction. Ali et al. [4] used Lagrangian Particle Dynamics to model coherent crowd flow as fluid flow. In general, supervised methods require a considerable amount of labeled data, which is directly utilized to build the connection between video features and video labels. Therefore, developing unsupervised anomaly detection systems proves to be more challenging than developing supervised ones.

An anomaly in a crowded scene can be determined from the motion patterns of its constituent pedestrians and objects. Analyzing trajectory data enables one to predict and identify anomalies with an excellent degree of accuracy. Early work on trajectory analysis includes that of Fu et al. [5], which proposed a hierarchical clustering framework to classify vehicle motion trajectories based on pairwise similarities, but with the limitation of using only a single feature for clustering.
Progressing further, Anjum and Cavallaro [6] proposed the use of multiple features in a Mean shift clustering based framework. They could identify outliers using a basic mean trajectory location based measure. Antonini et al. [7] transformed the input trajectories using Independent Components Analysis and then used Euclidean distance to find similarities between various trajectories. The Shannon Entropy measure has presented itself as an excellent tool for many applications including video key selection [8], network anomaly detection [9] and worm detection [10].

The principal contributions of this paper include the incorporation of a multi-feature object tracker that performs well in crowded scenes [11] and the use of multiple features for independent clustering. Furthermore, an information theory based Shannon entropy measure is proposed to detect anomalies for each cluster and then identify overall anomalous trajectories for the entire scene using a voting mechanism.

The paper is organized as follows: Section II discusses the trajectory estimation and feature extraction procedure. Section III discusses the clustering task, Section IV focuses on the anomaly detection mechanism, and Section V sheds light on the results obtained using the algorithm.

II. TRAJECTORY AND FEATURE EXTRACTION

The first task is to evaluate trajectory paths for all moving objects.

A. Trajectory Extraction

The estimation of trajectories in crowded scenes is a challenging task due to various factors like a high degree of occlusion, difficulty in tracking individual objects and arbitrary changes in the nature of the motion. To tackle this problem, we incorporate a multi-object tracker that works exceedingly well in crowded scenes, as demonstrated by Sharma et al. [11]. Using this approach, each frame is divided into non-overlapping boxes and low-level features are detected inside each box.
Following this, the centroids of all the detected feature points in each box are tracked using the standard Kanade-Lucas tracking algorithm. Fresh boxes are introduced periodically to track newly introduced objects.

B. Feature Extraction

Most trajectory clustering and anomaly classifiers used a single feature descriptor for the task. We propose the use of multiple features, namely:

1) Density: A trajectory can have varying densities around it, depending on the size of its neighbourhood. The density feature is thus computed using varying sizes of neighbourhood $\epsilon$. We have considered three varying sizes as proposed by Sharma et al. [11].

$$n_{T,j,\epsilon} = \left|\{\, T_i \mid \forall i \neq j,\ d(f_j, f_i) < \epsilon \,\}\right| \tag{1}$$

$$F_j = [\, n_{j,\epsilon_1},\ n_{j,\epsilon_2},\ n_{j,\epsilon_3} \,]$$

In this work, we are also interested in distances that describe the similarity of objects along time and are therefore computed by analysing the way the distance between the objects varies over time. This gives us a measure of the spatio-temporal density in the most natural way possible:

$$D(\tau_1, \tau_2)\big|_T = \frac{\int_T d(\tau_1(t), \tau_2(t))\, dt}{|T|} \tag{2}$$

where $d(\tau_1(t), \tau_2(t))$ represents the pairwise distance between two trajectories at the instant $t$.

2) Shape: Every trajectory sketches a particular shape across the spatio-temporal scene, and this is represented as a polynomial function. The coefficients are calculated separately for the x and y coordinates, yielding the $f_s$ feature vector.

$$x(t) = a_0 + a_1 t + a_2 t^2 + a_3 t^3 \tag{3}$$

$$y(t) = b_0 + b_1 t + b_2 t^2 + b_3 t^3 \tag{4}$$

$$f_s = [\, a_0, \ldots, a_3,\ b_0, \ldots, b_3 \,]$$

Fig. 2. Crowded scene and extracted trajectories. (a) Snapshot from the Crowded Subway exit sequence. (b) Extracted trajectories. Green dots are starting points.

3) Mean Position: It may be possible that trajectories separated over large distances have similar velocities, directions and density features and consequently get clustered in the same group.
To avoid this, a location measure is needed as $f_l = [\mathrm{mean}_x, \mathrm{mean}_y]$.

4) Standard Deviation: Standard deviation is an extremely popular measure that quantifies the amount of variation or dispersion in time-series data.

$$\sigma = \sqrt{E[(X - \mu)^2]} \tag{5}$$

The trajectories extracted from each surveillance video give rise to a distinct feature space for each of the features mentioned above. These distinct feature spaces are used for identifying anomalies for that particular feature and, thereafter, for the detection of overall anomalies.

III. CLUSTERING

Clustering methods have gained immense popularity as a data analysis tool ever since Clements [12] introduced them in 1954. It is observed that significantly dominant and usual features correspond to the denser regions of the probability density function of the data points. Using a Kernel Density Estimate, the modes of the probability density function can be found using either the Mean Shift [13], [14] or the Mountain method [15]. We use the Mean Shift method here, as proposed by Fukunaga and Hostetler [13]. Moreover, since the anomaly detection algorithm proposed here revolves around clustering similar data points, the clustering algorithm has to be highly effective, as demonstrated by the Mean Shift clustering algorithm.

A. Mean Shift Clustering

Mean shift is a non-parametric, versatile, iterative algorithm with applications in varied fields like object tracking, texture segmentation and data mining. After an estimate of the probability density of the data points is learned using a Kernel Density Estimate, a gradient ascent procedure associates each data point with the nearby peak of the data set's density function. The procedure defines a window around each data point, computes the mean of all the data points within the window, and shifts the centre of the window to the new mean until the process converges.
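The window-shifting iteration described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian weighting inside the window, the function names `mean_shift_modes` and `group_modes`, and the parameters `bandwidth`, `tol` and `merge_radius` are all introduced here for the example.

```python
import numpy as np

def mean_shift_modes(points, bandwidth, max_iter=200, tol=1e-5):
    """Shift a copy of every point towards the local weighted mean until it stops moving."""
    points = np.asarray(points, dtype=float)
    shifted = points.copy()
    for _ in range(max_iter):
        # Squared distances, scaled by the bandwidth, from each current
        # window centre to every original data point.
        diff = shifted[:, None, :] - points[None, :, :]
        sq = np.sum(diff ** 2, axis=2) / bandwidth ** 2
        weights = np.exp(-0.5 * sq)  # Gaussian weighting (an assumption)
        # Weighted mean of the data seen from each window centre.
        new = weights @ points / weights.sum(axis=1, keepdims=True)
        largest_move = np.linalg.norm(new - shifted, axis=1).max()
        shifted = new
        if largest_move < tol:
            break
    return shifted

def group_modes(shifted, merge_radius):
    """Give the same label to points whose converged positions nearly coincide."""
    centres = []
    labels = np.empty(len(shifted), dtype=int)
    for idx, p in enumerate(shifted):
        for c_idx, c in enumerate(centres):
            if np.linalg.norm(p - c) < merge_radius:
                labels[idx] = c_idx
                break
        else:
            # No existing centre is close enough, so start a new cluster.
            centres.append(p)
            labels[idx] = len(centres) - 1
    return np.array(centres), labels
```

In the setting of this paper, `points` would be the rows of one feature matrix (for example the shape vectors $f_s$), and the routine would be run once per feature so that each feature space gets its own set of cluster centres. The bandwidth controls how many modes survive and would need tuning per feature.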
When the process converges, we obtain the modes of the density estimate, which serve as the centre points of the clusters in the data. Suppose there are $n$ data points in the $d$-dimensional space $R^d$; then the density estimate with kernel $K(x)$ and bandwidth $h$ can be denoted as $f_{h,k}(x)$. If we define $g(x) = -K'(x)$ as a shadow function [16] of $K(x)$, with the assumption that the derivative of the kernel $K$ exists for all $x \in [0, \infty)$, then the gradient of the density estimate can be written as:

$$\nabla f_{h,k}(x) = \frac{2 c_{k,d}}{n h^{d+2}} \sum_{i=1}^{n} (x - x_i)\, K'\!\left(\left\|\frac{x - x_i}{h}\right\|^2\right) \tag{6}$$

$$\nabla f_{h,k}(x) = \frac{2 c_{k,d}}{n h^{d+2}} \left[\sum_{i=1}^{n} g\!\left(\left\|\frac{x - x_i}{h}\right\|^2\right)\right] m_{h,G}(x) \tag{7}$$

The modes of the density function are obtained among the zeros of the gradient of the density function. The first term in the product is proportional to the density estimate at $x$ computed with kernel $G$, while the second term, the mean shift, is defined as the difference between the weighted mean and the centre of the kernel window.

$$m_{h,G}(x) = \frac{\sum_{i=1}^{n} x_i\, g\!\left(\left\|\frac{x - x_i}{h}\right\|^2\right)}{\sum_{i=1}^{n} g\!\left(\left\|\frac{x - x_i}{h}\right\|^2\right)} - x \tag{8}$$

It can be observed that the mean shift vector always points towards the direction of maximum increase in the density [17]. These modes, or cluster centres, are found for each independent feature, giving us a non-overlapping set of trajectories that are characteristic of the cluster they belong to.

IV. ANOMALY DETECTION

The entire crowd is often characterized by some dominant patterns, based on which the entire set of trajectories is clustered. The anomalous trajectories present throughout the crowded scene may belong to any one of these clusters but, as a general property, will not have a substantial degree of belongingness to any of the clusters.
The entire mechanism depends on two major tasks: detecting anomalies for each independent feature space, followed by the selection of those trajectories that exhibit anomalous behaviour in most of the cases, using a voting mechanism.

Shannon Entropy has found widespread applications in numerous domains, with anomaly detection being one. The greatest advantage of this technique is that it allows the summarization of the feature distributions in the form of a single number. Our approach is based on the simple idea that an anomalous trajectory would exhibit higher levels of entropy when compared to normal trajectories. Instead of comparing the distances between the means of the cluster centres and trajectories as in previous work [6], we build a probability distribution using the distances between a trajectory and all of the cluster centres. The entropy of this probability distribution is evaluated and, if it exceeds a threshold, the trajectory is classified as an anomaly. The threshold should be data adaptive and must adapt itself to the changing properties of the data.

Algorithm 1: Overall Algorithm
Input: Crowded video sequence
Output: Anomalous trajectories in the sequence

Extract trajectories from video
Compute features F_1, F_2, ..., F_n
for i = 1 to n do
    Compute cluster centres for F_i
    C_i = [c_{i,1}, c_{i,2}, ..., c_{i,k}]
    for l = 1 to numTrajec do
        for j = 1 to k do
            distvec(l, j) = dist(trajec_l, c_{i,j})
        end
    end
    for l = 1 to numTrajec do
        for j = 1 to k do
            P(l, j) = distvec(l, j) / sum_{m=1}^{k} distvec(l, m)
        end
    end
    for l = 1 to numTrajec do
        H(l) = - sum_{j=1}^{k} P(l, j) log P(l, j)
        if H(l) > thresh then
            Vote(l)++
        end
    end
end
for l = 1 to numTrajec do
    if Vote(l) > n/2 then
        Anom(l) = 1
    end
end
return Anom

$$C_i = [\, c_{1,i},\ c_{2,i},\ \ldots,\ c_{n,i} \,]$$

$$distvec_j = [\, distance(c_{1,i}, f_j),\ \ldots,\ distance(c_{n,i}, f_j) \,]$$

$C_i$ represents the cluster centres for a specific feature $i$, and the $distvec$ vector contains the distance measures between each of the cluster centres and trajectory $f_j$. In these equations $n$ denotes the number of cluster centres (written $k$ in Algorithm 1, where $n$ is the number of features). We further build the probability distribution $P_j = [\, p_{j,1}, p_{j,2}, \ldots, p_{j,n} \,]$, where

$$p_{j,k} = \frac{distance(c_{k,i}, f_j)}{\sum_{m=1}^{n} distance(c_{m,i}, f_j)} \tag{9}$$

An entropy measure is computed for each trajectory:

$$H_i = -\sum_{k=1}^{n} p_{i,k} \log_a p_{i,k} \tag{10}$$

$$H = [\, H_1, H_2, \ldots, H_{numTraj} \,] \tag{11}$$

Trajectories with an entropy value exceeding a threshold are marked anomalous for that feature. In a crowded scene, changes in its attributes are certain to occur, and they occur randomly. A particular section of the crowd can exhibit spatio-temporal changes in density and may also suddenly slow down or speed up, thereby affecting individual feature parameters of the trajectories. Moreover, new trajectories that are introduced after a fixed interval of time may have features similar to those of a particular cluster but may exhibit one or more abnormalities due to their late introduction. Therefore, we cannot treat all trajectories marked as anomalous by the above procedure as our desired set of abnormalities. A simple voting mechanism sieves out those trajectories that are marked anomalous in the majority of the cases.

V. RESULTS

The method is tested on videos from two datasets, namely the Crowded scenes dataset used by Cheriyadat et al. [18] to detect dominant motions in crowds and the UCF crowd dataset, first used by Ali et al. [4]. To measure the performance of the method, we first identify all possible anomalous trajectories from the video and then compare them with the classification test results. Since the method involves the use of videos directly, we had to mark the anomalous trajectories in the actual video for the evaluation procedure.
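The per-feature entropy test and the majority vote of Section IV can be sketched as follows. This is an illustrative sketch, not the authors' code. The function names are hypothetical, the natural logarithm stands in for the unspecified base $a$ of Eq. (10), and, because the paper only requires the threshold to be data adaptive, the rule used here (mean plus a multiple of the standard deviation of the entropy scores) is an assumption.

```python
import numpy as np

def entropy_scores(features, centres, eps=1e-12):
    """Entropy of each trajectory's normalised distances to all cluster centres."""
    features = np.asarray(features, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # dist[l, j] is the distance from trajectory l to cluster centre j.
    dist = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)
    # Eq. (9): normalise each row so that it sums to one.
    prob = dist / (dist.sum(axis=1, keepdims=True) + eps)
    # Eq. (10) with the natural logarithm; eps guards against log(0).
    return -np.sum(prob * np.log(prob + eps), axis=1)

def flag_anomalies(scores, num_std=1.0):
    """Mark trajectories whose entropy exceeds a data-adaptive threshold (assumed rule)."""
    thresh = scores.mean() + num_std * scores.std()
    return scores > thresh

def majority_vote(flags_per_feature):
    """Keep trajectories flagged in more than half of the feature spaces."""
    flags_per_feature = np.asarray(flags_per_feature, dtype=bool)
    votes = flags_per_feature.sum(axis=0)
    return votes > len(flags_per_feature) / 2
```

A trajectory lying on one cluster centre has one distance equal to zero, so its distribution is far from uniform and its entropy is low. A trajectory that is equally far from every centre yields a uniform distribution and the maximum entropy, which matches the intuition that anomalous trajectories do not belong strongly to any cluster.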
The results for three standard crowded videos from the mentioned datasets are tabulated as follows:

TABLE I. RESULTS ON SEVERAL CROWDED SCENE VIDEOS

Video                    Precision   Recall   F-score   Accuracy
Crowded Subway Exit      0.8258      0.9944   0.9023    96.31%
Pilgrim Sequence         0.8221      0.9965   0.9009    98.15%
Intersection Sequence    0.7287      0.9971   0.842     98.68%

The results indicate that this method exhibits excellent Specificity, i.e. the probability of classifying a normal trajectory as anomalous is extremely low. However, improvement can be achieved in the Sensitivity of the approach by improving the True Positive rate. It is to be noted that the method indicates almost all anomalous trajectories in the expected regions of interest with commendable accuracy.

The graphical plots reveal the effective nature of the results produced. The plots depicted in Figure 3 are from the Crowded Subway Exit sequence. The trajectories have been detected from the entire video sequence and, thereafter, clustering has been done on the several feature spaces as shown in Figures 3(a), 3(c), 3(e) and 3(g). The anomalous trajectories detected in each such feature space have been plotted in Figures 3(b), 3(d), 3(f) and 3(h). Following the voting mechanism, the final anomalous trajectories have been displayed as red curves with their origin points shown as blue dots in Figure 3(i). Figure 3(j) shows the overall crowded scene as being composed of the anomalous trajectories shown in red and the normal trajectories shown in blue. If the video is analyzed properly, one can find that the trajectories responsible for slowing down the crowd exiting the subway are closely represented by the ones detected as anomalous by the algorithm. These are, in essence, the peripheral trajectories present together with the principal crowd flow that has been represented closely by the blue section in Figure 3(j).

The method performs well when compared with different state-of-the-art methods.
The overall accuracy has been used as the metric for comparison here. The Information Bottleneck based approach [19] only extracts a speed based feature to improve the shape analysis of trajectory data. This method shows an accuracy of about 96% on their task-specific datasets. The other unsupervised methods, like the one based on hierarchical pattern discovery [20], although using a completely different approach, exhibit an accuracy of around 87%. Other abnormality detection methods that use the property of sparsity in abnormal events [21] exhibit an accuracy in the range of 88.71% to 96.7%.

TABLE II. COMPARISON OF RESULTS

Method           Accuracy
Guo et al.       96%
Xu et al.        87%
Biswas et al.    96.7%
Proposed         98.68%

VI. CONCLUSION

This paper stresses the need for understanding crowd dynamics better and presents an unsupervised mechanism to detect anomalous trajectories. The method is application-ready: it generates trajectories from a video using a multi-object tracker and then clusters them based on multiple independent features. The use of multiple features for determining the clusters and the anomalies is based on the fact that an anomalous trajectory may possess similarity with a dominant pattern in one aspect but differ significantly in a majority of aspects. A trajectory that may be similar to most trajectories in terms of mean location and position may cause disturbance in the scene due to its unnatural speed. This has been taken care of by using multiple features to detect the overall anomalies. The use of Shannon Entropy provides a novel approach to determine the anomalies, considering the fact that a probability distribution is developed using the distances from all cluster centres and not only the specific cluster with which the trajectory is associated. An anomalous trajectory is unlikely to belong to any specific cluster to a significant degree, thereby maximizing entropy in the probability distribution.
The proposed approach yields excellent results on the chosen crowd videos. This work can be made efficient by developing a substantially large dataset that demarcates abnormal trajectories where the trajectories are represented as a time series as used here. Trajectory representation such as this has been used to evaluate crowd flow segmentation, but here it has been put to use for abnormality detection. This lends this approach the added advantage of detecting the specific areas in the scene that contribute majorly to disturbance. Finally, this work may find extensive use in improving surveillance methods, better public space design, efficient event organization and possibly even in tracking rogue naval and air routes. This work can be improved by making it real-time and also by generalizing the Entropy measure so that it could classify the anomalies optimally.

Fig. 3. Different Clusterings and Anomalous Trajectory Classification. (a) Clustering based on the Density feature. (b) Anomalous trajectories based on Density feature. (c) Clustering based on the Shape feature. (d) Anomalous trajectories based on Shape feature. (e) Clustering based on the Mean Position. (f) Anomalous trajectories based on Mean Position. (g) Clustering based on the Standard Deviation. (h) Anomalous trajectories based on Standard Deviation. (i) Overall Anomalies. (j) Overall Scene.

REFERENCES

[1] N. Sulman, T. Sanocki, D. Goldgof, and R. Kasturi, "How effective is human video surveillance performance?" in Pattern Recognition, 2008. ICPR 2008. 19th International Conference on. IEEE, 2008, pp. 1–3.
[2] J. Kim and K. Grauman, "Observe locally, infer globally: a space-time MRF for detecting abnormal activities with incremental updates," in Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009, pp. 2921–2928.
[3] Y. Cong, J. Yuan, and J. Liu, "Abnormal event detection in crowded scenes using sparse representation," Pattern Recognition, vol. 46, no. 7, pp. 1851–1864, 2013.
[4] S. Ali and M. Shah, "A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis," in Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on. IEEE, 2007, pp. 1–6.
[5] Z. Fu, W. Hu, and T. Tan, "Similarity based vehicle trajectory clustering and anomaly detection," in Image Processing, 2005. ICIP 2005. IEEE International Conference on, vol. 2. IEEE, 2005, pp. II–602.
[6] N. Anjum and A. Cavallaro, "Multifeature object trajectory clustering for video analysis," IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 11, pp. 1555–1564, 2008.
[7] G. Antonini and J.-P. Thiran, "Counting pedestrians in video sequences using trajectory clustering," IEEE Transactions on Circuits and Systems for Video Technology, vol. 16, no. 8, pp. 1008–1020, 2006.
[8] Q. Xu, Y. Liu, X. Li, Z. Yang, J. Wang, M. Sbert, and R. Scopigno, "Browsing and exploration of video sequences: A new scheme for key frame extraction and 3D visualization using entropy based Jensen divergence," Information Sciences, vol. 278, pp. 736–756, 2014.
[9] J. Santiago-Paz and D. Torres-Roman, "On entropy in network traffic anomaly detection," Entropy, vol. 20, p. 2, 2015.
[10] S. Ranjan, S. Shah, A. Nucci, M. Munafo, R. Cruz, and S. Muthukrishnan, "Dowitcher: Effective worm detection and containment in the internet core," in INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE. IEEE, 2007, pp. 2541–2545.
[11] R. Sharma and T. Guha, "A trajectory clustering approach to crowd flow segmentation in videos," 2016 IEEE International Conference on Image Processing (ICIP), pp. 1200–1204, 2016.
[12] F. E. Clements, "Use of cluster analysis with anthropological data," American Anthropologist, vol. 56, no. 2, pp. 180–199, 1954.
[13] K. Fukunaga and L. Hostetler, "The estimation of the gradient of a density function, with applications in pattern recognition," IEEE Transactions on Information Theory, vol. 21, no. 1, pp. 32–40, 1975.
[14] Y. Cheng, "Mean shift, mode seeking, and clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, no. 8, pp. 790–799, 1995.
[15] R. R. Yager and D. P. Filev, "Approximate clustering via the mountain method," IEEE Transactions on Systems, Man, and Cybernetics, vol. 24, no. 8, pp. 1279–1284, 1994.
[16] K.-L. Wu and M.-S. Yang, "Mean shift-based clustering," Pattern Recognition, vol. 40, no. 11, pp. 3035–3052, 2007.
[17] D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
[18] A. M. Cheriyadat and R. J. Radke, "Detecting dominant motions in dense crowds," IEEE Journal of Selected Topics in Signal Processing, vol. 2, no. 4, pp. 568–581, 2008.
[19] Y. Guo, Q. Xu, Y. Yang, S. Liang, Y. Liu, and M. Sbert, "Anomaly detection based on trajectory analysis using kernel density estimation and information bottleneck techniques," in Tech. Rep., Technical Report 108. University of Girona, 2014.
[20] D. Xu, R. Song, X. Wu, N. Li, W. Feng, and H. Qian, "Video anomaly detection based on a hierarchical activity discovery within spatio-temporal contexts," Neurocomputing, vol. 143, pp. 144–152, 2014.
[21] S. Biswas and V. Gupta, "Abnormality detection in crowd videos by tracking sparse components," Machine Vision and Applications, vol. 28, no. 1-2, pp. 35–48, 2017.