Robust Time-Series Retrieval Using Probabilistic Adaptive Segmental Alignment

Traditional pairwise sequence alignment is based on matching individual samples from two sequences, under time monotonicity constraints. However, in many application settings matching subsequences (segments) instead of individual samples may bring in…

Authors: Shahriar Shariat, Vladimir Pavlovic

Received: Jun 25, 2014 / Revised: Aug 21, 2015 / Accepted: Sep 26, 2015

Abstract Traditional pairwise sequence alignment is based on matching individual samples from two sequences, under time monotonicity constraints. However, in many application settings matching subsequences (segments) instead of individual samples may bring in additional robustness to noise or local non-causal perturbations. This paper presents an approach to segmental sequence alignment that jointly segments and aligns two sequences, generalizing the traditional per-sample alignment. To accomplish this task, we introduce a distance metric between segments based on average pairwise distances and then present a modified pair-HMM (PHMM) that incorporates the proposed distance metric to solve the joint segmentation and alignment task. We also propose a relaxation to our model that improves the computational efficiency of the generic segmental PHMM. Our results demonstrate that this new measure of sequence similarity can lead to improved classification performance, while being resilient to noise, on a variety of sequence retrieval problems, from EEG to motion sequence classification.

Keywords Time-Series, Alignment, Segmentation, Distance metric, Classification

1 Introduction

Many problems in data analytics today critically depend on comparison and retrieval of time-series data, such as stock market prices, medical signals, or moving object trajectories.
Author affiliations: Shahriar Shariat, Rutgers University, NJ 08854, e-mail: sshariat@cs.rutgers.edu (current affiliation: Turn Inc., e-mail: sshariat@turn.com); Vladimir Pavlovic, Rutgers University, NJ 08854, e-mail: Vladimir@cs.rutgers.edu.

The non-Euclidean nature of the space of sequences has given rise to domain-specific approaches and algorithms for general analytics tasks, including indexing, classification and clustering of time-series or sequences (Keogh 2006; Aghabozorgi et al 2015). Assessing the pairwise sequence similarity is at the core of these tasks (Morse and Patel 2007; Pree et al 2014). A family of alignment algorithms accomplishes this by measuring similarities between pairs of samples across two sequences and matching them under monotonicity (i.e., temporal ordering) constraints. Dynamic time warping (DTW) (c.f., Berndt and Clifford 1994) is a common computational technique to tackle the problem of measuring the pairwise sequence similarity (Ding et al 2008).

DTW alignment algorithms are based on pairing of individual sequence samples. That is, a sample at time $i$ in sequence $X$ is typically matched with only one other sample at time $j$ in sequence $Y$, while guaranteeing monotonic ordering, i.e., that a subsequent sample $X_{i+1}$ in one sequence cannot be simultaneously matched with a preceding sample $Y_{j-1}$ in the second sequence. Another category of similarity measurement methods is designed around the edit distance algorithm (Atallah and Fox 1998). Examples of such approaches include Longest Common Sub-Sequence (LCSS) (Andre-Jonsson and Badal 1997; Vlachos et al 2002), Edit distance with Real Penalty (ERP) (Chen and Ng 2004) and Edit Distance for Real Sequences (EDR), proposed in Chen and Özsu (2005). These algorithms compare the pairwise distance of two points against a threshold (pre-defined or variable) and revert the problem back to the original edit distance problem.
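As a point of reference for the per-sample pairing described above, the following is a minimal sketch of classical DTW, not the segmental method of this paper. It assumes one-dimensional sequences and an absolute-difference cost; the function name `dtw_distance` is illustrative only.

```python
import math


def dtw_distance(x, y):
    """Classical per-sample DTW cost between two 1-D sequences.

    Assumption: the local cost is |x_i - y_j|; other costs are possible.
    """
    n, m = len(x), len(y)
    # cost[i][j] = best cumulative cost of aligning x[:i] with y[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # Monotonic moves only: diagonal, advance in x, or advance in y.
            cost[i][j] = step + min(cost[i - 1][j - 1],
                                    cost[i - 1][j],
                                    cost[i][j - 1])
    return cost[n][m]
```

Because every sample must be paired with some sample of the other sequence, a single outlier contributes its full cost: `dtw_distance([0, 0, 0], [0, 5, 0])` evaluates to `5.0`, whereas a pure time stretch such as `[1, 2, 3]` against `[1, 2, 2, 3]` costs `0.0`.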
A comprehensive review that evaluates many similarity measures across a range of benchmark tasks (Ding et al 2008) concludes that no single algorithm consistently outperforms the others. Nevertheless, DTW itself was demonstrated to remain a competitive baseline, particularly in instances of noise-free or low-noise time-series.

One consequence of DTW's essential reliance on comparison of pairs of individual time-series samples is its sensitivity to noise, which many of its derivatives share (Shariat and Pavlovic 2011; Ye and Keogh 2009; Vlachos et al 2002; Zakaria et al 2015). We have observed that in the presence of significant noise, edit-distance based methods outperform DTW. If such noise is removed by means of preprocessing, DTW-based comparison could again become a stable measure of sequence similarity. However, effective noise removal is often domain-specific, may require adaptation to follow the changing sequence dynamics, and, most critically, typically considers denoising of one sequence outside the context of the sequence it is being compared to. As a consequence, the denoising becomes decoupled from the process of measuring sequence similarity and, in turn, from the retrieval or classification end-goals.

DTW-family algorithms are also constrained to preserve time monotonicity. In the case of non-causal signals where the local ordering of samples can change, such as EEG time series (de Munck et al 2007) or signals with general random time delays (Blaum and Bruck 1994), DTW is no longer able to yield reliable pairwise similarity measures. Finally, in many applications, such as video segmentation, one might be interested in not only calculating the similarity but also retrieving the locally similar segments of the contrasting sequences (Shariat and Pavlovic 2013), which may constitute meaningful units of local similarity.
With its focus on per-sample alignments, DTW cannot inherently produce such delineation.

As a consequence, to both achieve resilience against multiple types of noise and recover similar segments, it is reasonable to establish pairing between groups of points in contrasting sequences. That is, one may seek to match a temporal segment (contiguous subsequence) $X_{i:i+m} = [x_i, \ldots, x_{i+m}]$ to another segment of the contrasting sequence, $Y_{j:j+n} = [y_j, \ldots, y_{j+n}]$, as the basic units employed in full matching of the two sequences. In other words, the process of establishing pairwise sequence similarity needs to involve simultaneous segmentation of the two sequences being compared as well as their comparison that depends on the identified segments, while now satisfying the monotonicity in the order of paired segments rather than individual paired samples. We call this the adaptive segmental alignment task.

In Shariat and Pavlovic (2011) the authors proposed an approach, based on canonical correlation analysis (CCA), to handle this segmental alignment. The objective function (IsoCCA) is constrained properly to impose time monotonicity over segments. Although the results show strong resilience to noise, the objective does not provide a proper metric between the segments. This can cause the resulting segments to be unnecessarily short. Furthermore, the non-convexity of the IsoCCA objective makes it increasingly sensitive to initial segmentation and model parameter choices. Another recent work, Ryoo (2011), proposes to find the best matching segments of the two sequences based on a probabilistic model. However, the algorithm does not handle gaps/insertions and, hence, does not consider a complete alignment model.
Moreover, the author suggests empirically fixing all segment lengths, with the approach lacking clear means to handle data-driven segments. In practice, however, variable and data-adapted segments result in more robust alignments. In Ye and Keogh (2009), L. Ye and E. Keogh propose a method (shapelet) to discover a common subsequence between a class of time-series and take that as a class representative. This way, they overcome possible scattered noise processes that could contaminate the classification procedure. In contrast, our approach is not a motif discovery algorithm and is essentially an alignment algorithm that enhances the pairwise similarity of two sequences through discovery and matching of similar segments.

In this paper we propose a complete segmental alignment framework to address the deficiencies of prior segmental sequence comparison approaches. Specifically, the new contributions of this work are:

– We propose a distance metric based on average pair-wise distances suitable for measuring similarity between two segments, and aimed at segmental sequence alignment.
– Based on the proposed distance metric we develop a probabilistic alignment model by extending the traditional pair-HMM formalism.
– We propose a relaxation to the original model and use bounding techniques to reduce the computation time necessary to optimize the model and, hence, evaluate the pairwise segmental alignments.

Since the order of points is ignored within a segment, the algorithm is able to handle non-causal signals. Segment matching is particularly interesting in action recognition scenarios considering that actions can be easily divided into sub-actions (for example walking with long and short strides). Furthermore, the direction of the progress is not important within each segment and thus two actions that are performed in different directions might still, as desired, exhibit high similarity.
The properties of the new similarity metric make it very resilient to noise and thus applicable to situations where the conventional noise removal techniques combined with traditional alignment algorithms fail to produce a reliable similarity measure. In such cases, our method combines the properties of an adaptive filter and an alignment algorithm, leading to a more robust estimate of the similarity of contrasting sequences.

Through extensive experiments we show that the proposed segmental sequence alignment and similarity measure can lead to improved classification results on benchmark sequence classification tasks, classification of non-causal EEG signals, and recognition of activities from human motion data. This contrasts with the often inconsistent performance of the competing approaches that either lack the ability to match segments instead of individual samples, or assume fixed, non-adaptive segmentation.

The paper is organized as follows: in Section 2 we discuss the metric property of IsoCCA and construct our segmental metric. In Section 3 the proposed model is discussed in detail. Section 4 introduces the relaxed model for reduced computational time. In Section 5 experimental results are presented, followed by Section 6, which concludes the paper with a discussion of our findings and some suggestions for future work.

2 Segment Matching Metric

Central to any alignment algorithm is the distance metric between two points (or segments in our case). DTW typically assumes Euclidean distance between contrasting entities. Edit-distance-based methods, such as LCSS and EDR, measure the Euclidean or L1 distance of two points and test it against a threshold. The aforementioned algorithms are still based on the point-wise comparison of the sequences. In Shariat and Pavlovic (2011) the authors proposed a segmental alignment method based on CCA, i.e., IsoCCA.
Despite promising results, the proposed framework does not provide a proper metric between the segments. The reason lies in the fact that IsoCCA works by effectively finding the closest points of the convex hulls of the two segments of points. This results in a non-metric because the triangle inequality does not hold. Moreover, in the case of overlapping convex hulls, their distance is zero even though the size of the common area can be very small, resulting in unnecessarily small segments.

In some applications, as illustrated in Section 1, one is interested in matching unordered small segments of points where permutation of the points is not a matter of concern. In addition to insensitivity to the permutation, we seek a distance metric that suppresses the noise and is efficient to compute. Many distance metrics have been proposed to measure the distance between sets (c.f., Woznica et al 2006). Often the proposed distances are based on non-linear functions (Hausdorff, for instance), which are computationally intensive. Moreover, Hausdorff-type distances can be highly insensitive to the content of the contrasting sets, focusing instead on the boundary cases. Kernels proposed on sets (Kondor 2003) are also unsuitable when the set of points is small, because in practice the estimated distribution is then inaccurate. In the following we propose a distance based on average pair-wise distances.

Formally, for two sets of points $\mathcal{X}$ and $\mathcal{Y}$, we consider

$$ d(\mathcal{X}, \mathcal{Y}) = \frac{1}{|\mathcal{X}||\mathcal{Y}|} \sum_{x_i \in \mathcal{X}} \sum_{y_j \in \mathcal{Y}} \| x_i - y_j \|_n, \tag{1} $$

where $\|\cdot\|_n$ is a convex norm between two points. It is trivial to show $d(\mathcal{X}, \mathcal{Y}) \geq 0$ and $d(\mathcal{X}, \mathcal{Y}) = d(\mathcal{Y}, \mathcal{X})$. It is also straightforward to prove that (1) satisfies the triangle inequality given the convexity of the norms.
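Equation (1) can be sketched in a few lines. This is an illustrative sketch only: it assumes points are given as equal-length tuples and uses the Euclidean norm as the convex norm; the names `pair_norm` and `avg_pairwise_distance` are not from the paper.

```python
import math


def pair_norm(a, b):
    """Euclidean norm between two points given as equal-length tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def avg_pairwise_distance(X, Y, norm=pair_norm):
    """Equation (1): mean of norm(x, y) over all pairs x in X, y in Y."""
    total = sum(norm(x, y) for x in X for y in Y)
    return total / (len(X) * len(Y))


# Two small 1-D segments treated as unordered collections of points.
X = [(0.0,), (2.0,)]
Y = [(1.0,), (3.0,)]
d_xy = avg_pairwise_distance(X, Y)  # (1 + 3 + 1 + 1) / 4
```

The value does not change if the points inside either segment are reordered, which is the permutation insensitivity sought above. For this particular `X`, `avg_pairwise_distance(X, X)` is 1.0 rather than 0, so Equation (1) on its own does not vanish on identical sets.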
Equation (1) needs to be slightly modified to have the definiteness property (i.e., $d(x, y) = 0 \iff x = y$):

$$ D(\mathcal{X}, \mathcal{Y}) = \frac{1}{|\mathcal{X} \cup \mathcal{Y}|} \left( \frac{1}{|\mathcal{X}|} \sum_{x_i \in \mathcal{X}} \sum_{y_j \in (\mathcal{Y} \setminus \mathcal{X})} \| x_i - y_j \|_n + \frac{1}{|\mathcal{Y}|} \sum_{x_i \in (\mathcal{X} \setminus \mathcal{Y})} \sum_{y_j \in \mathcal{Y}} \| x_i - y_j \|_n \right). \tag{2} $$

Equation (2) is symmetric, non-negative and definite due to empty sums in case of equality of $\mathcal{X}$ and $\mathcal{Y}$. To prove that (2) satisfies the triangle inequality, one can partition $D(\mathcal{X}, \mathcal{Y}) + D(\mathcal{Y}, \mathcal{Z}) - D(\mathcal{X}, \mathcal{Z}) \geq 0$ into disjoint sets and observe that, given the triangle inequality for (1), the required inequality holds for (2). Note that in case of $\mathcal{X} \cap \mathcal{Y} = \emptyset$, (2) reduces to (1).

In practice, any sampling is prone to measurement error and one needs to compare all pair-wise distances against that error. This emphasizes the importance of the definiteness property imposed by (2) even for real-valued signals. We will show in the experimental results that even though the ordering of samples is not preserved within a short segment when modeled as a set, the proposed metric can be used for general purpose alignment. The metric also exhibits invariance to arbitrary temporal permutations. This can be beneficial for non-causal sequences that arise from random delays (e.g., EEG). It can also be desirable in video retrieval settings when, for instance, the direction of an activity is not a concern. In the experiments we will demonstrate that this metric is resilient to noise when incorporated into an alignment algorithm. In Section 3 we demonstrate how it can be computed efficiently.

3 Segmental Pair-HMM (SPHMM)

In this section we describe the details of our alignment models and algorithm. We first describe the basis of our model, a variation of Pair-HMM and its formalism. The inference algorithm works by, essentially, fixing the segment size in each step and then dynamically adjusting it to recover the best segments.
We reveal the computational techniques, based on Viterbi decoding, that make this task efficient. We also propose a forward algorithm whose primary aim is to yield the similarity measure of interest without explicitly determining the segments, an approach sufficient for, e.g., classification tasks. Finally, we present the SPHMM learning methodology, based on the proposed inference algorithm.

The Pair HMM, introduced by Durbin et al (1997), can be seen as a probabilistic model defined on pairs of sequences $(X, Y)$ that aims to describe their joint likelihood, $P(X, Y \mid \text{alignment})$. As shown in Figure 1, PHMM has three states: M for match, I for insertion and D for deletion.

Fig. 1 Segmental Pair-HMM state-transition diagram

Given two sequences of observations $X$ and $Y$ with $n$ and $m$ samples, respectively, the match state emits a pair of samples $(x, y)$, $x \in X$, $y \in Y$. Insertion and deletion states emit $(x, -)$ and $(-, y)$ respectively, where $-$ stands for a gap. This model implements an affine gap penalty, which is more general than the constant gap penalty typically used in DTW.

In the following we add the notion of segmentation to the pair-HMM formalism. To define the segmentation structure consider a sequence $X = (x_1, x_2, \ldots, x_n)$ of length $n$. A segment $X_{b:e}$, a contiguous subsequence of $X$, is defined such that $X_{b:e} = (x_b, x_{b+1}, \ldots, x_e)$. Equivalently, the segment is defined by segment indexes $s = (b, b+1, \ldots, e)$. We consider non-overlapping and tight segments over $X$. That is, a complete segmentation of $X$ is defined as $S = (s_1, s_2, \ldots, s_L)$ such that $b_1 = 1$, $e_L = n$, $b_{i+1} = e_i + 1$. This $S(X) = (X_1, X_2, \ldots, X_L)$ now defines the segmentation of sequence $X = (x_1 \ldots x_n)$ into segments $\left( (x_1 \ldots x_{e_1}), (x_{b_2} \ldots x_{e_2}), \ldots, (x_{b_L} \ldots x_{e_L}) \right)$. Likewise, we define $S(Y)$ for $Y$.
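The definition of a tight, non-overlapping segmentation can be made concrete with a short sketch. The helper names below are illustrative, and the maximum segment length `max_len` is an assumption added here to keep the enumeration small; segments are written as 1-based inclusive pairs `(b, e)` to match the notation above.

```python
def is_tight_segmentation(segments, n):
    """Check b_1 = 1, e_L = n and b_{i+1} = e_i + 1 for 1-based (b, e) pairs."""
    if not segments or segments[0][0] != 1 or segments[-1][1] != n:
        return False
    if any(b > e for b, e in segments):
        return False
    return all(next_b == e + 1
               for (_, e), (next_b, _) in zip(segments, segments[1:]))


def enumerate_segmentations(n, max_len):
    """Yield every tight segmentation of 1..n with segment length <= max_len."""
    def extend(start):
        if start > n:
            yield []
            return
        last_end = min(start + max_len - 1, n)
        for end in range(start, last_end + 1):
            for rest in extend(end + 1):
                yield [(start, end)] + rest
    yield from extend(1)


all_segmentations = list(enumerate_segmentations(4, 2))
```

For `n = 4` and `max_len = 2` this yields five segmentations, for example `[(1, 2), (3, 4)]`. The count grows quickly with `n`, which is why an explicit search over segmentations needs dynamic programming rather than enumeration.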
From this point forward we represent the segmentation of both sequences, $X$ and $Y$, with $\mathcal{S} = (S(X), S(Y)) = \left( (X_1, X_2, \ldots, X_{L_X}), (Y_1, Y_2, \ldots, Y_{L_Y}) \right)$.

Given the segments defined by $\mathcal{S}$, a segmental alignment is a sequence of correspondences $Q = (q_1, q_2, \ldots, q_T)$ where $q_t = (i_t, j_t)$, $i_t \in \{1, \ldots, L_X\}$, $j_t \in \{1, \ldots, L_Y\}$, indicating the matching of segments, such that the following monotonic constraints hold:

$$ i_t \in \{ i_{t-1}, i_{t-1} + 1 \}, \quad j_t \in \{ j_{t-1}, j_{t-1} + 1 \}. \tag{3} $$

The likelihood of one such fixed alignment $Q$ is defined as

$$ P(X, Y \mid \mathcal{S}, Q, \lambda) = \prod_{t=1}^{T} b_{q_t q_{t-1}}(X, Y) \tag{4} $$

where $\lambda$ encompasses the HMM parameters. Here the likelihood of a match $b_{q_t q_{t-1}}(X, Y)$ is

$$ b_{q_t q_{t-1}}(X, Y) = \begin{cases} \exp\left(-D(X_{i_t}, Y_{j_t})\right) \cdot \Psi(|X_{i_t}|, |Y_{j_t}|), & i_t = i_{t-1} + 1,\ j_t = j_{t-1} + 1 \\ \exp(-\sigma_g |X_{i_t}|), & i_t = i_{t-1} + 1,\ j_t = j_{t-1} \\ \exp(-\sigma_g |Y_{j_t}|), & i_t = i_{t-1},\ j_t = j_{t-1} + 1 \end{cases} \tag{5} $$

where $D(X_{i_t}, Y_{j_t})$ is the distance between two segments, defined in (2), $\Psi$ specifies the distribution of the corresponding segment lengths, and $\sigma_g$ is a scaling factor. The transition probabilities in the match sequence are defined by the state transition graph in Figure 1 and are denoted by $a$. For instance,

$$ a_{q_t q_{t-1} q_{t-2}} = \begin{cases} \delta, & i_{t-1} = i_{t-2} + 1,\ i_t = i_{t-1},\ j_{t-1} = j_{t-2} + 1,\ j_t = j_{t-1} + 1 \\ \epsilon, & i_{t-1} = i_{t-2} + 1,\ i_t = i_{t-1} + 1,\ j_{t-1} = j_{t-2},\ j_t = j_{t-1} \\ \tau, & i_{t-1} = i_{t-2} + 1,\ i_t = T,\ j_{t-1} = j_{t-2} + 1,\ j_t = T \\ \text{etc.} \end{cases} \tag{6} $$

with initial transitions, e.g.,

$$ a^{(0)}_{q_1} = \begin{cases} \delta, & i_1 = 0,\ j_1 = 1, \text{ or } i_1 = 1,\ j_1 = 0 \\ 1 - 2\delta - \tau, & i_1 = 1,\ j_1 = 1 \\ \tau, & i_1 = 0,\ j_1 = 0 \end{cases} \tag{7} $$

where $i_1 = 0$ stands for deleting the first segment of $X$ and similarly $j_1 = 0$ denotes deleting the first segment of $Y$. $\Psi$ in (5) can be learned from the data or given as a prior distribution, e.g., uniform. Note that the first case of (5) defines the observation probability of matching two segments (associated with state M in Figure 1) while the other cases correspond to gap operations (states I and D).

3.1 Inference in SPHMM

An optimal alignment for a fixed segmentation $\mathcal{S}$ can be found as

$$ Q^* = \arg\max_{Q} P(Q \mid X, Y, \mathcal{S}, \lambda) = \arg\max_{Q} P(X, Y \mid Q, \mathcal{S}, \lambda) P(Q). \tag{8} $$

The prior on $Q$ in (8) can encode traditional band-priors such as the Sakoe-Chiba band. Equations (4)-(8) show that the optimal alignment is the Viterbi path for observing segmented sequences $(X, Y)$. It is possible to find an optimal segmentation $\mathcal{S}^*$, together with the optimal alignment, as

$$ Q^*, \mathcal{S}^* = \arg\max_{Q, \mathcal{S}} P(\mathcal{S}, Q \mid X, Y, \lambda) = \arg\max_{Q, \mathcal{S}} P(X, Y \mid \mathcal{S}, Q, \lambda) P(\mathcal{S}) P(Q). \tag{9} $$

Fig. 2 Pair-HMM null model.

Although any informed prior could be used, without loss of generality, we specify a uniform prior on $\mathcal{S}$. To assert that the alignment likelihood indicates a relationship between the contrasting sequences rather than a random match, one needs to compare the generative sequence likelihood to that of a null model. This null model deletes all segments of one sequence and inserts segments of the contrasting sequence (Figure 2). Therefore, the likelihood of the null model is

$$ P(X, Y \mid \mathcal{S}, R) = \left( \eta (1 - \eta)^{L_X} \prod_{i=1}^{L_X} \exp(-\sigma_g |X_i|) \right) \left( \eta (1 - \eta)^{L_Y} \prod_{j=1}^{L_Y} \exp(-\sigma_g |Y_j|) \right) \tag{10} $$

where $R$ is the null HMM model with transitions depicted in Figure 2 and an observation model similar to (5) (except for the first case, which is the likelihood of observing a match between two segments). Thus, assuming that the segmentation priors of the null model and the alternative model are the same, we intend to evaluate

$$ Q^*, \mathcal{S}^* = \arg\max_{Q, \mathcal{S}} \frac{P(X, Y \mid \mathcal{S}, Q, \lambda) P(Q)}{P(X, Y \mid \mathcal{S}, R)}. \tag{11} $$

It is possible to evaluate both the SPHMM and the null model in a single pass over the sequences. In particular, one can assign every match in the SPHMM model to a pair of an insertion and a deletion and likewise assign every gap operation to its corresponding insertion or deletion in the null model. Thus, it is straightforward to formulate the reward for a match and the penalties for opening and extending a gap by expanding (11) with respect to (4) and (10). It helps to observe this formulation in the context of a dynamic programming algorithm for alignment with an affine gap penalty. In particular, for two segments $X_i$ and $Y_j$ the matching reward is

$$ r_{mm}(X_i, Y_j) = \frac{1 - 2\delta - \tau}{(1 - \eta)^2} \tag{12} $$

for staying in the match state or

$$ r_{gm}(X_i, Y_j) = \frac{1 - \epsilon - \tau}{(1 - \eta)^2} \tag{13} $$

for transitioning from a gap state to match. Consequently, the gap opening penalty for $X_i$ is

$$ r_{op}(X_i) = \frac{\delta}{(1 - \eta)} \tag{14} $$

and the gap extension penalty is

$$ r_{ex}(X_i) = \frac{\epsilon}{(1 - \eta)}. \tag{15} $$

By transferring into log-odds ratios, the relationship between a Viterbi algorithm and dynamic programming for alignment becomes evident. The resulting algorithm is an extension of the best-path algorithm described in Durbin et al (1997) to the segmental model, obtained by searching over all permissible segment lengths at each step of the recursion while considering the match rewards and gap penalties in (12)-(15).
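The transition ratios in (12)-(15) and their log-odds form can be sketched as follows. The parameter values are made up purely for illustration and the function name `log_odds_scores` is hypothetical; only the four formulas come from the text.

```python
import math


def log_odds_scores(delta, eps, tau, eta):
    """Log of the transition ratios in Equations (12)-(15)."""
    ratios = {
        "match_stay": (1 - 2 * delta - tau) / (1 - eta) ** 2,   # (12)
        "gap_to_match": (1 - eps - tau) / (1 - eta) ** 2,       # (13)
        "gap_open": delta / (1 - eta),                          # (14)
        "gap_extend": eps / (1 - eta),                          # (15)
    }
    return {name: math.log(value) for name, value in ratios.items()}


# Illustrative values only (assumed, not taken from the experiments).
scores = log_odds_scores(delta=0.1, eps=0.3, tau=0.05, eta=0.6)
```

With these assumed values the two match ratios exceed 1, so their logs act as positive rewards, while the gap ratios are below 1 and act as negative penalties, which is the form an affine-gap dynamic program expects.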
In every state, all possible segments are considered and the segmentation that leads to the highest ratio of posteriors (11) is chosen. To make this procedure computationally tractable one may impose a maximum constraint on the segment length.

Complexity: The time complexity of (11) depends both on the lengths of segments in each sequence and on the lengths of the sequences themselves. Given that the number of states is fixed and small, one can prove that the time complexity of the dynamic programming algorithm (or of the marginal matching discussed in Section 3.2) is $O(l_X l_Y m n)$, where $l_X$ and $l_Y$ are the maximum segment lengths and $n$ and $m$ are the lengths of sequences $X$ and $Y$, respectively. To compute the distance between two segments, one can employ the summed area table technique (Crow 1984) to improve the performance. That is, the pairwise distances of all pairs of samples are pre-calculated and the summed area table is constructed. Then within the matching procedure only a few additions are required to compute the distance. Usually, $l_X$ and $l_Y$ are small relative to the sequence lengths. Thus, the overall time complexity is typically a small constant factor $l_X l_Y$ away from that of regular DTW.

3.2 Marginal matching likelihood

This subsection introduces an approximation to the forward algorithm for the segmental pair-HMM. Let us define $\Gamma$ to be the set of all possible segmentations of two sequences $X$ and $Y$ with $m$ and $n$ samples, respectively. Also assume that $\Pi$ is the set of all segmental alignments between $X$ and $Y$. Using the forward algorithm one can estimate the following:

$$ P(X, Y \mid \lambda) = \sum_{\mathcal{S} \in \Gamma} \sum_{Q \in \Pi} P(X, Y \mid Q, \mathcal{S}, \lambda) P(\mathcal{S}) P(Q). \tag{16} $$

We will, again, assume $P(\mathcal{S})$ to be uniform. Computing (16) is not tractable for every possible segmentation. Therefore, we approximate the joint probability of $X$ and $Y$ by explicitly marginalizing over all alignments.
That is, we approximate (16) by estimating $P(X, Y \mid \mathcal{S}^*)$ at each step, where $\mathcal{S}^*$ is a partially optimal segmentation. Specifically, $\mathcal{S}^*$ denotes the segments that are optimal only for a partial alignment of the sequences $X$ and $Y$ up to the current step of the algorithm. We use the following recursion to define this approximation:

$$ P\left(X_{1:i}, Y_{1:j} \mid q_t q_{t-1}, \left(S^*(X_{1:(i-k)}), S^*(Y_{1:(j-l)})\right), \lambda\right) = b_{q_t q_{t-1}} \cdot \max_{S' \in \left(\Gamma(X_{1:(i-k)}), \Gamma(Y_{1:(j-l)})\right)} \sum_{Q' \in \Pi_{(i-k),(j-l)}} P\left(X_{1:(i-k)}, Y_{1:(j-l)} \mid Q', \lambda, S'\right) \tag{17} $$

where

$$ \left(S^*(X_{1:i}), S^*(Y_{1:j})\right) = \arg\max_{S' \in \left(\Gamma(X_{1:i}), \Gamma(Y_{1:j})\right)} \sum_{Q' \in \Pi_{i,j}} P(X_{1:i}, Y_{1:j} \mid Q', \lambda, S'). \tag{18} $$

In (17) and (18), $k$ and $l$ are permissible segment lengths for $X$ and $Y$. $\Gamma(\cdot)$ is the set of all segmentations while $S^*(\cdot)$ denotes the approximated segmentation of the given input sequence. $\Pi_{i,j}$ is the set of all possible alignments of $X$ and $Y$ up to $x_i$ and $y_j$. In (17), $q_t q_{t-1}$ defines the current state the same way we defined it in (5). The second term on the right hand side of (17) finds the maximum marginalized likelihood over aligning partial sequences given all possible segmentations up to $x_{i-k}$, $y_{j-l}$. The result of applying this recursive algorithm is the approximated marginalized likelihood of $X$ and $Y$. This is useful in classification problems where one is not necessarily interested in the alignment path or optimal segmentation but a reliable likelihood is more desirable. In this paper, however, we mainly show the result of the dynamic programming algorithm that arises from (11). The dynamic programming algorithm not only provides us with a likelihood that later can be used as a measure of similarity, but also yields the optimal alignment path and segmentation, which is essential to our analysis.
We observed superior classification accuracy using the marginal matching algorithm in EEG classification (Section 5).

3.3 Learning SPHMM parameters

Algorithm 1 Learning algorithm for SPHMM. $\#(A \to B)$ denotes the number of transitions from state A to state B decoded by the Viterbi algorithm.
  Initialization: Randomly initialize $\delta$, $\epsilon$ and $\tau$. Set $\Psi(i, j)$ to uniform.
  repeat
    E-step: Align training sequences using the Viterbi algorithm described in Section 3.
    M-step:
      1. Re-estimate transition parameters: $\delta = \frac{\#(M \to I) + \#(M \to D)}{2\,\#(M \to *)}$, $\epsilon = \frac{\#(I \to I) + \#(D \to D)}{\#(I \to *) + \#(D \to *)}$ and $\tau = 1 - 2\delta - \epsilon$.
      2. Re-estimate segment length distribution: $\Psi(i, j) = \frac{\#(|X_{t_X}| = i,\ |Y_{t_Y}| = j)}{\#\text{segments}}$, $\forall\, t_X \in \{1 \ldots L_X\}$, $t_Y \in \{1 \ldots L_Y\}$.
      3. Tune the parameters using (22) with ($\delta$, $\epsilon$ and $\tau$) as the initial values (project back if needed to respect the feasibility of the starting point).
  until convergence.

To learn the parameters of SPHMM one can use a standard expectation maximization algorithm typically used to train HMM parameters (Rabiner 1989). The parameter of the null model cannot be trained using the EM algorithm and must remain constant during training in order to have a consistent reference model. An attractive choice for $\eta$ is the maximum likelihood estimate of (10). That is,

$$ \eta = \frac{2}{L_X + L_Y + 2} \tag{19} $$

where $L_X$ and $L_Y$ are the numbers of segments (based on the prior) in each sequence. In our experiments we noticed that choosing $\eta$ according to (19) may result in overfitting to the training set in a classification problem, and we therefore suggest choosing $\eta > 0.5$ in that case.

The standard EM algorithm does not respect certain constraints that must hold when one designs an alignment algorithm. Those constraints are designed to keep the matching reward and gap penalties (Eqs. 13-15) within certain bounds.
In particular, one would like to have

$$ 1 < r_{mm},\ r_{gm} < z_m, \tag{20} $$

$$ z_g < r_{op},\ r_{ex} < 1, \tag{21} $$

where $z_m > 1$ and $0 < z_g < 1$ are real numbers. In our experiments we have set $z_m = \exp(5)$ and $z_g = \exp(-10)$, which provide a reasonable range for learning the parameters.

Maximizing the contribution of matching rewards and gap penalties while satisfying the above constraints leads to solving

$$ (\delta^*, \epsilon^*, \tau^*) = \arg\max_{\delta, \epsilon, \tau} \left( \hat{c}_{mm} \log(1 - 2\delta - \tau) + \hat{c}_{gm} \log(1 - \epsilon - \tau) + \hat{c}_{op} \log(\delta) + \hat{c}_{ex} \log(\epsilon) \right) \tag{22} $$

subject to

$$ 2 \log(1 - \eta) < \log(1 - 2\delta - \tau) < \log(z_m) + 2 \log(1 - \eta) \tag{23} $$

$$ 2 \log(1 - \eta) < \log(1 - \epsilon - \tau) < \log(z_m) + 2 \log(1 - \eta) \tag{24} $$

$$ \log(z_g) + \log(1 - \eta) < \log(\delta),\ \log(\epsilon) < \log(1 - \eta) \tag{25} $$

$$ \log(\tau) < 0 \tag{26} $$

where, for $N$ alignments in the training set,

$$ \hat{c}_{mm} = \frac{\#(M \to M)}{N} \tag{27} $$

$$ \hat{c}_{gm} = \frac{\#((I \text{ or } D) \to M)}{N} \tag{28} $$

$$ \hat{c}_{op} = \frac{\#(M \to (I \text{ or } D))}{N} \tag{29} $$

$$ \hat{c}_{ex} = \frac{\#(I \to I) + \#(D \to D)}{N} \tag{30} $$

and $\#(A \to B)$ stands for the number of transitions from state A to B. In (22), we have transferred to log-space for numerical stability and used the fact that the parameter of the null model ($\eta$) will not be updated. One can transfer (22) into a linear programming problem by adding $\log(\tau)$ to the objective function and effectively maximize the likelihood of the average Markov model (transitions) under the mentioned constraints.

Finally, one can consider the algorithm in Alg. 1 for learning the parameters of SPHMM. Note that the inference step is approximated with the dynamic programming resulting from (11). One can incorporate the method described in Section 3.2 to approximate the forward algorithm and use it in a forward-backward learning task (the backward algorithm can also be approximated similarly) for estimating the posterior and finally learn the parameters, including the distribution of segment lengths.
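Step 1 of the M-step in Alg. 1 and the bound checks (20)-(21) can be sketched as below. This is a minimal sketch under stated assumptions: the transition counts are invented for illustration, the helper names are hypothetical, and the code only tests feasibility; it does not implement the constrained tuning of (22) or the projection step.

```python
import math


def m_step_transitions(counts):
    """Step 1 of Alg. 1; counts[(a, b)] = number of decoded a -> b transitions."""
    def count(a, b):
        return counts.get((a, b), 0)

    def out_of(state):
        return sum(c for (a, _), c in counts.items() if a == state)

    delta = (count("M", "I") + count("M", "D")) / (2 * out_of("M"))
    eps = (count("I", "I") + count("D", "D")) / (out_of("I") + out_of("D"))
    tau = 1 - 2 * delta - eps
    return delta, eps, tau


def is_feasible(delta, eps, tau, eta, z_m=math.exp(5), z_g=math.exp(-10)):
    """Check constraints (20) and (21) on the ratios of Equations (12)-(15)."""
    r_mm = (1 - 2 * delta - tau) / (1 - eta) ** 2
    r_gm = (1 - eps - tau) / (1 - eta) ** 2
    r_op = delta / (1 - eta)
    r_ex = eps / (1 - eta)
    return (1 < r_mm < z_m and 1 < r_gm < z_m
            and z_g < r_op < 1 and z_g < r_ex < 1)


# Invented Viterbi transition counts, for illustration only.
counts = {("M", "M"): 80, ("M", "I"): 5, ("M", "D"): 5,
          ("I", "I"): 6, ("I", "M"): 4, ("D", "D"): 3, ("D", "M"): 7}
delta, eps, tau = m_step_transitions(counts)
needs_tuning = not is_feasible(delta, eps, tau, eta=0.6)
```

For these invented counts the plain count-based estimates violate (20), since the gap-to-match ratio falls below 1, so `needs_tuning` is `True`. This illustrates why Alg. 1 follows the re-estimation with the constrained step based on (22).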
The convergence of the learning algorithm is obvious and provable through the convergence of the EM algorithm. In practice the learning algorithm converges quickly, after a few iterations.

4 Segmental Matching

In our experiments we observed that during learning of the SPHMM, the probability of transitioning from the match state to the gap states can be decreased substantially without significantly affecting the likelihood or alignment path. Given this observation, it is reasonable to expect a single match operation coupled with adaptive segmentation to be able to approximate the alignment. Let $\Gamma_m \subset \Gamma$ be the collection of all possible segmentations of $X$ and $Y$ such that: 1) the number of segments is equal in each segmentation, $L = L_X = L_Y$; 2) corresponding segments are then matched, i.e., the alignment path is $Q = (q_1, q_2, \ldots, q_L)$ where $q_i = (i, i)$. In other words, the alignment is recovered through segmentation. That is,

$$ P(X, Y) = \sum_{\mathcal{S} \in \Gamma_m} P(X, Y \mid \mathcal{S}) P(\mathcal{S}) \tag{31} $$

where

$$ P(X, Y \mid \mathcal{S}) = \prod_{t=1}^{L} \exp\left( -\frac{1}{\sigma} D(X_t, Y_t) \right) \Psi(|X_t|, |Y_t|) \tag{32} $$

which is the likelihood of matching two segments in the original SPHMM model. $D(\cdot, \cdot)$ can be any distance metric on sets. Therefore, the joint likelihood of $X$ and $Y$ is maximized by searching over all possible segmentations. That is,

$$ P^*(X, Y) = \max_{\mathcal{S} \in \Gamma_m} P(X, Y \mid \mathcal{S}) P(\mathcal{S}) \tag{33} $$

and consequently one may obtain the optimal segmentation as

$$ \mathcal{S}^* = \arg\max_{\mathcal{S} \in \Gamma_m} P(X, Y \mid \mathcal{S}) P(\mathcal{S}) \tag{34} $$

where we assume a uniform prior on segmentation. A non-uniform prior on segmentation can result in different alignments by favoring longer or shorter segments on different intervals of the sequences. It is possible to compare this model with a random model similar to (10). In that case the prior on segmentation will again cancel out and each matching will be compared to a pair of deletion and insertion.
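The search in (33)-(34) can be sketched as a dynamic program over segment end points. The sketch below makes several assumptions that are not fixed by the text: one-dimensional samples with absolute-difference pairwise distances, the average pairwise distance (1) used as $D$, a uniform $\Psi$ over `max_len**2` length pairs, a uniform segmentation prior that is dropped as a constant, and segment distances read from a summed area table as suggested in Section 3.1. All function names are illustrative.

```python
import math


def summed_area_table(x, y):
    """sat[i][j] = sum of |x[a] - y[b]| over a < i, b < j."""
    n, m = len(x), len(y)
    sat = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sat[i][j] = (abs(x[i - 1] - y[j - 1])
                         + sat[i - 1][j] + sat[i][j - 1] - sat[i - 1][j - 1])
    return sat


def segment_distance(sat, i0, i1, j0, j1):
    """Average pairwise distance between x[i0:i1] and y[j0:j1] (half-open)."""
    total = sat[i1][j1] - sat[i0][j1] - sat[i1][j0] + sat[i0][j0]
    return total / ((i1 - i0) * (j1 - j0))


def segmental_match(x, y, max_len, sigma=1.0):
    """Maximize the log of (32) over matched segmentations; O(n m max_len^2)."""
    n, m = len(x), len(y)
    sat = summed_area_table(x, y)
    log_psi = -2.0 * math.log(max_len)  # assumed uniform Psi
    best = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for k in range(1, min(max_len, i) + 1):
                for l in range(1, min(max_len, j) + 1):
                    prev = best[i - k][j - l]
                    if prev == -math.inf:
                        continue
                    d = segment_distance(sat, i - k, i, j - l, j)
                    score = prev - d / sigma + log_psi
                    if score > best[i][j]:
                        best[i][j] = score
                        back[i][j] = (k, l)
    # Recover the matched segments as half-open index ranges.
    segments, i, j = [], n, m
    while back[i][j] is not None:
        k, l = back[i][j]
        segments.append(((i - k, i), (j - l, j)))
        i, j = i - k, j - l
    segments.reverse()
    return best[n][m], segments


score, segments = segmental_match([0, 0, 5, 5], [0, 5], max_len=2)
```

In this toy case the only admissible solution pairs `x[0:2]` with `y[0:1]` and `x[2:4]` with `y[1:2]`, each at distance zero. If no matched segmentation exists under the length limit, the function returns `-inf` and an empty list.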
Removing the two gap operations not only reduces the computational effort incurred by joint segmentation and alignment but also enables one to use bounding methods for particular representations of time-series, to further prune unnecessary computation and speed up the matching. For instance, if the time-series can be locally represented using Bag-of-Words histograms, a representation often found in documents or complex video signals, Lampert et al (2009) have designed bounds on the distance between two segments given a minimum and maximum segment length and their corresponding histograms. We leverage this fact to reduce the computational time of the method proposed in Section 3.

4.1 Bounding Histogram Distances

Bag-of-Words (BoW) is a popular representation that has been successfully used by researchers (Riemenschneider et al (2009); Chu et al (2012)). In this representation, extracted features are clustered into several codewords using a clustering method such as k-means. Similar features described by the same codeword are then counted together and form a histogram for a single frame or a collection of frames. Therefore, given a histogram map $\phi_{b_i:e_i}(\cdot)$, we denote an $H$-bin histogram of a contiguous segment $b_i{:}e_i = (b_i, b_i + 1, \ldots, e_i - 1, e_i)$ as $X_{b_i:e_i} = \phi_{b_i:e_i}(V)$, or $X_i$ for short.

Given the maximum segment length $l_{\max}$, the minimum segment length $l_{\min}$, and two segments of sequences $X$ and $Y$ starting from $b_i$ and $b_j$, respectively, we denote the maximum-length segments by $\overline{X}_{b_i} = X_{b_i:b_i+l_{\max}}$ and $\overline{Y}_{b_j} = Y_{b_j:b_j+l_{\max}}$. Likewise, the minimum-length segments are denoted by $\underline{X}_{b_i} = X_{b_i:b_i+l_{\min}}$ and $\underline{Y}_{b_j} = Y_{b_j:b_j+l_{\min}}$.
We aim to bound the distance between the histogram features of any possible pair of segments, one starting from $X_{b_i}$ and extending at most to $X_{b_i+l_{\max}}$, the other starting from $Y_{b_j}$ and extending at most to $Y_{b_j+l_{\max}}$. Note that even though we use the same $l_{\min}$ and $l_{\max}$ for both sequences, this is not a requirement of our method and is used only to simplify the notation. Writing $\underline{X}_{b_i}$, $\underline{Y}_{b_j}$ for the histograms of the minimum-length segments and $\overline{X}_{b_i}$, $\overline{Y}_{b_j}$ for those of the maximum-length segments, the bin counts are bounded as

$$\underline{X}^h_{b_i} \le X^h_{b_i:b_i+k} \le \overline{X}^h_{b_i}, \quad (l_{\min} \le k \le l_{\max}) \tag{35}$$

$$\underline{Y}^h_{b_j} \le Y^h_{b_j:b_j+z} \le \overline{Y}^h_{b_j}, \quad (l_{\min} \le z \le l_{\max}) \tag{36}$$

where $X^h_{\cdot}$ and $Y^h_{\cdot}$ denote histogram bin $h$. One can easily extend (35) and (36) to normalized histograms by noting that $|\underline{X}_{b_i}| \le |X_{b_i:b_i+k}| \le |\overline{X}_{b_i}|$. That is,

$$\frac{\underline{X}^h_{b_i}}{|\overline{X}_{b_i}|} \le \hat{X}^h_{b_i:b_i+k} \le \frac{\overline{X}^h_{b_i}}{|\underline{X}_{b_i}|}, \quad (l_{\min} \le k \le l_{\max}) \tag{37}$$

$$\frac{\underline{Y}^h_{b_j}}{|\overline{Y}_{b_j}|} \le \hat{Y}^h_{b_j:b_j+z} \le \frac{\overline{Y}^h_{b_j}}{|\underline{Y}_{b_j}|}, \quad (l_{\min} \le z \le l_{\max}) \tag{38}$$

It is straightforward to observe that

$$\min(\underline{X}^h_{b_i}, \underline{Y}^h_{b_j}) \le \min(X^h_{b_i:b_i+k}, Y^h_{b_j:b_j+z}) \le \min(\overline{X}^h_{b_i}, \overline{Y}^h_{b_j}) \tag{39}$$

$$\max(\underline{X}^h_{b_i}, \underline{Y}^h_{b_j}) \le \max(X^h_{b_i:b_i+k}, Y^h_{b_j:b_j+z}) \le \max(\overline{X}^h_{b_i}, \overline{Y}^h_{b_j}) \tag{40}$$

for $l_{\min} \le k, z \le l_{\max}$. Following Chu et al (2012), one may construct bounds on popular histogram distances. For completeness of presentation these bounds are included below.

Bounding $l_1$ distance: Noting that $|a - b| = \max(a, b) - \min(a, b)$, a simple reordering of (39) and (40) shows that

$$\max(\underline{X}^h_{b_i}, \underline{Y}^h_{b_j}) - \min(\overline{X}^h_{b_i}, \overline{Y}^h_{b_j}) \le |X^h_{b_i:b_i+k} - Y^h_{b_j:b_j+z}| \le \max(\overline{X}^h_{b_i}, \overline{Y}^h_{b_j}) - \min(\underline{X}^h_{b_i}, \underline{Y}^h_{b_j}) \tag{41}$$

for $l_{\min} \le k, z \le l_{\max}$. The bounds on the $l_1$ distance are then obtained by summing over all bins.
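The per-bin inequality (41) can be checked numerically with a short Python sketch. It assumes each frame has already been quantized to a single codeword index, treats a segment of length $k$ as $k$ consecutive frames (a simplification of the inclusive indexing used in the text), and uses hypothetical helper names. The sketch sums the per-bin bounds over all bins and compares them with a brute-force enumeration of every admissible segment pair.

```python
def histogram(labels, start, length, bins):
    """Codeword counts of labels[start:start + length]."""
    counts = [0] * bins
    for codeword in labels[start:start + length]:
        counts[codeword] += 1
    return counts


def l1(h1, h2):
    """l1 distance between two unnormalized histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def l1_bounds(x_lo, x_hi, y_lo, y_hi):
    """Sum over bins of the lower and upper bounds in (41).

    x_lo, y_lo: histograms of the minimum-length segments.
    x_hi, y_hi: histograms of the maximum-length segments.
    The lower bound is not clamped at zero, so it may be negative.
    """
    rows = list(zip(x_lo, x_hi, y_lo, y_hi))
    lower = sum(max(xl, yl) - min(xh, yh) for xl, xh, yl, yh in rows)
    upper = sum(max(xh, yh) - min(xl, yl) for xl, xh, yl, yh in rows)
    return lower, upper


# Toy codeword sequences, both segments start at frame 0.
x = [0, 1, 1, 2, 0, 2, 1, 0]
y = [2, 2, 0, 1, 1, 0, 0, 2]
BINS, L_MIN, L_MAX = 3, 2, 5

lower, upper = l1_bounds(
    histogram(x, 0, L_MIN, BINS), histogram(x, 0, L_MAX, BINS),
    histogram(y, 0, L_MIN, BINS), histogram(y, 0, L_MAX, BINS),
)
distances = [
    l1(histogram(x, 0, k, BINS), histogram(y, 0, z, BINS))
    for k in range(L_MIN, L_MAX + 1)
    for z in range(L_MIN, L_MAX + 1)
]
```

Only four histograms are needed to bound all $(l_{\max} - l_{\min} + 1)^2$ segment pairs, which is what makes pruning cheap relative to evaluating each pair.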
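Concretely, with $\underline{X}_{b_i}$, $\underline{Y}_{b_j}$ denoting the histograms of the minimum-length segments and $\overline{X}_{b_i}$, $\overline{Y}_{b_j}$ those of the maximum-length segments, the lower and upper bounds on the $l_1$ distance are

$$l^{l_1}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = \sum_{h=1}^{H} \Big( \max(\underline{X}^h_{b_i}, \underline{Y}^h_{b_j}) - \min(\overline{X}^h_{b_i}, \overline{Y}^h_{b_j}) \Big) \tag{42}$$

$$u^{l_1}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = \sum_{h=1}^{H} \Big( \max(\overline{X}^h_{b_i}, \overline{Y}^h_{b_j}) - \min(\underline{X}^h_{b_i}, \underline{Y}^h_{b_j}) \Big) \tag{43}$$

and for normalized histograms

$$\hat{l}^{l_1}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = \sum_{h=1}^{H} \left( \max\left( \frac{\underline{X}^h_{b_i}}{|\overline{X}_{b_i}|}, \frac{\underline{Y}^h_{b_j}}{|\overline{Y}_{b_j}|} \right) - \min\left( \frac{\overline{X}^h_{b_i}}{|\underline{X}_{b_i}|}, \frac{\overline{Y}^h_{b_j}}{|\underline{Y}_{b_j}|} \right) \right) \tag{44}$$

$$\hat{u}^{l_1}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = \sum_{h=1}^{H} \left( \max\left( \frac{\overline{X}^h_{b_i}}{|\underline{X}_{b_i}|}, \frac{\overline{Y}^h_{b_j}}{|\underline{Y}_{b_j}|} \right) - \min\left( \frac{\underline{X}^h_{b_i}}{|\overline{X}_{b_i}|}, \frac{\underline{Y}^h_{b_j}}{|\overline{Y}_{b_j}|} \right) \right). \tag{45}$$

Bounds on the histogram intersection and $\chi^2$ distances can be derived in the same way.

Bounding histogram intersection distance: The histogram intersection distance is defined as

$$d_{\cap}(\phi^H_X, \phi^H_Y) = -\sum_{h=1}^{H} \min(\hat{X}^h, \hat{Y}^h). \tag{46}$$

Using (37) and (38), the corresponding lower and upper bounds are

$$\hat{l}^{\cap}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = -\sum_{h=1}^{H} \min\left( \frac{\overline{X}^h_{b_i}}{|\underline{X}_{b_i}|}, \frac{\overline{Y}^h_{b_j}}{|\underline{Y}_{b_j}|} \right) \tag{47}$$

$$\hat{u}^{\cap}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = -\sum_{h=1}^{H} \min\left( \frac{\underline{X}^h_{b_i}}{|\overline{X}_{b_i}|}, \frac{\underline{Y}^h_{b_j}}{|\overline{Y}_{b_j}|} \right) \tag{48}$$

Bounding $\chi^2$ distance: The $\chi^2$ distance is defined as

$$d_{\chi^2}(\phi^H_X, \phi^H_Y) = \sum_{h=1}^{H} \frac{\left( \hat{X}^h - \hat{Y}^h \right)^2}{\hat{X}^h + \hat{Y}^h}. \tag{49}$$

Using the normalized bounds on the $l_1$ distance, i.e., (44) and (45), one can show that

$$\hat{l}^{\chi^2}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = \sum_{h=1}^{H} \frac{\left( \max(0, \hat{l}^{l_1}_b) \right)^2}{\dfrac{\overline{X}^h_{b_i}}{|\underline{X}_{b_i}|} + \dfrac{\overline{Y}^h_{b_j}}{|\underline{Y}_{b_j}|}} \tag{50}$$

$$\hat{u}^{\chi^2}_b(X_{b_i}, Y_{b_j}, l_{\min}, l_{\max}) = \sum_{h=1}^{H} \frac{\left( \hat{u}^{l_1}_b \right)^2}{\dfrac{\underline{X}^h_{b_i}}{|\overline{X}_{b_i}|} + \dfrac{\underline{Y}^h_{b_j}}{|\overline{Y}_{b_j}|}} \tag{51}$$

where, inside the sums, $\hat{l}^{l_1}_b$ and $\hat{u}^{l_1}_b$ denote the $h$-th summands of (44) and (45).

4.2 Fast Segmental Matching (Fast-SM)

We propose a recursive algorithm that starts matching from the end of the two sequences. Each segmental match effectively finds the joint likelihood of $X_i$ and $Y_i$. Within each match we search over all possible segmentations up to the maximum segment length. That is, given $l_{\max}$ and $l_{\min}$, for $i = L, \ldots, 1$, $j = L, \ldots,$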
$1$ and assuming a uniform prior on segments, the likelihood of matching is

$$P(X_{b_i}, Y_{b_j}) = \max_{l_{\min} \le k, z \le l_{\max}} \exp\big(-D(X_{b_i-k:b_i}, Y_{b_j-z:b_j})\big)\, P(X_{b_i-k-1}, Y_{b_j-z-1}). \tag{52}$$

In other words, (52) is the optimal (maximum) likelihood of matching segments, obtained by searching over the likelihood of the last pair of segments in both sequences and all possible segmentations starting from the current point.

We assume that the likelihood of correspondences in the local neighbourhood is approximately constant. Therefore, before executing a recursion to calculate $P(X_{b_i-k-1}, Y_{b_j-z-1})$, we examine the approximated likelihood of the alignment path passing through $(X_{b_i-k}, Y_{b_j-z})$ against the best path found so far. Defining $P^*$ as the maximal likelihood calculated for the immediately preceding segment ending in $(X_{b_i-k-1}, Y_{b_j-z-1})$, we have $P^* = \max_{l_{\min} \le k}$ [...] $100$. (57)

To introduce non-causality we add noise to (57) within four intervals such that

$$f_n(t) = \begin{cases} f(t) + \mathcal{N}(0, 10) & B_i \le t \le E_i,\ \forall i \\ f(t) & \text{otherwise,} \end{cases} \tag{58}$$

where $B_i$ and $E_i$ indicate the starting and ending time points of the $i$-th non-causal interval. The non-causal time intervals are $[50, 100]$, $[125, 150]$, $[250, 350]$ and $[400, 425]$. For every time-series the contrasting sequence is generated by nearest neighbour interpolation at time points given by (58). A sample of a sequence and its non-causal warped version are shown in Figure 4.

SPHMM parameters are learned using Alg. 1 for aligning every sequence and its warped (causal or non-causal) version. We tried segment lengths $l_x = l_y \in \{50, 100, 150, 200\}$. For a fair comparison with DTW we tried 10 different constant gap penalties from 0 to 100, each applied to every gap operation. A zero gap penalty yielded the best result for DTW. Six such alignments are depicted in Figure 5. The background is the distance between each pair of samples.
The ground truth given by (58) is plotted in red, while the resulting alignment from DTW is drawn in white and that of SPHMM in green. Both axes indicate time and the plots are overlaid on the pairwise distance of the two sequences.

Fig. 5 Samples of aligning two sequences with non-causal intervals, panels (a)-(f). Each plot depicts the comparison of the ground truth alignment (red) with DTW (white) and SPHMM (green). The plots show the result for SPHMM with $l_x = l_y = 150$.

It is evident from Figure 5 that SPHMM outperforms DTW in aligning the non-causal time-series. To give a quantitative assessment of the goodness of the alignment, the ground truth is compared with the correspondences reported by each algorithm. It should be noted that while DTW gives a correspondence for every time point of the sequence, SPHMM produces segments, which are indicated by their starting and ending points. To compare the sequence of segments with the ground truth we used linear interpolation. The goodness measure is the $L_1$ distance of every correspondence from the ground truth. The average $L_1$ distance for DTW over 100 alignments is 8258.8. For SPHMM this value depends on the segment length: the average distance is 7625.5, 5487.1, 5458.5 and 5356.0 for $l_x = l_y = 50, 100, 150, 200$, respectively. It is interesting to note that the distance does not change much for $l_X, l_Y > 100$.
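The evaluation just described, interpolating segment boundaries to per-sample correspondences and accumulating the $L_1$ deviation from the ground truth, might look as follows in Python. The representation of the SPHMM output as a list of boundary points `(t_x, t_y)` is an assumption made for this sketch, and both function names are hypothetical.

```python
def interpolate_alignment(knots, length):
    """Per-sample correspondence from segment boundary points.

    `knots` is an increasing list of (t_x, t_y) pairs marking segment
    boundaries (assumed format), covering t_x = 0 .. length - 1.
    """
    path, seg = [], 0
    for t in range(length):
        # Advance to the segment whose x-range contains t.
        while seg + 2 < len(knots) and t > knots[seg + 1][0]:
            seg += 1
        (x0, y0), (x1, y1) = knots[seg], knots[seg + 1]
        path.append(y0 + (y1 - y0) * (t - x0) / (x1 - x0))
    return path


def l1_alignment_error(path, truth):
    """Sum of absolute deviations between two per-sample alignments."""
    return sum(abs(p - q) for p, q in zip(path, truth))


single = interpolate_alignment([(0, 0), (4, 8)], length=5)
double = interpolate_alignment([(0, 0), (2, 2), (4, 8)], length=5)
error = l1_alignment_error(single, [0, 2, 5, 6, 7])
```

A DTW path can be scored with the same `l1_alignment_error` directly, since it already provides one correspondence per time point.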
The reason is that the largest non-causal interval is 100 time points long. In many cases the correct segments are extracted, except for the second time interval, which is located on the valley of the warping function where decoding the correct alignment is difficult for both algorithms.

5.2 Synthetic Data II

We also consider the dataset proposed in Shariat and Pavlovic (2011), where the authors developed an alternative approach to segmental alignment. The dataset consists of sinusoidal and rectangular signals that are embedded into Gaussian noise such that the placement of the signal is also random. Two samples of this dataset are shown in Figure 6. In our original IsoCCA paper we generated 10 samples from each class and used a 1-NN classifier in a leave-one-out setting. We showed that IsoCCA can achieve 90% accuracy while DTW cannot perform better than 60%.

Fig. 6 Synthetic data from Shariat and Pavlovic (2011): (a) sinusoidal signal, (b) rectangular signal.

We, however, need to train the SPHMM parameters, which is not feasible using a training set derived from 20 sequences. Therefore, we generate 20 more sequences for training the parameters. SPHMM can classify the 20 sequences in the test set with 100% accuracy. To ensure that the small size of the dataset does not affect the result, we generated 100 sequences and used a 5-fold cross-validation setting. We observed that SPHMM is still able to classify all sequences perfectly. This dataset was also used in Zakaria et al (2015) and Ye and Keogh (2009), where the authors report perfect classification accuracy. Note, however, that their model is not an alignment algorithm and relies on discovering a single motif within each class.
5.3 Benchmark Data

In order to compare our proposed approach to DTW and demonstrate the applicability of our method to general sequences, we tested SPHMM on the entire set of time-series from the UC Riverside time-series repository, which contains 45 datasets. The length of the time-series in this repository varies from 60 to 1882. To test the noise resilience of SPHMM, we added two types of noise to all sequences. The first noise model is impulse noise, which is well known in the signal processing community and can model abrupt sensor failure or other rapid change effects (Abreu et al (1996)). In particular, the additive noise process is Gaussian, $\mathcal{N}(0, \omega\sigma_i)$, where $\sigma_i$ is the standard deviation of feature $i$ and $\omega$ is the power degree of the noise. We added the noise to time points chosen uniformly at random, such that the noise does not cover more than 20% of the sequence duration (Figure 7). We conducted the experiment on the original data and on a noisy version of the data with $\omega = 1$. For every sequence, we generated three noisy samples (three noisy sequences) of the corresponding time-series. The algorithms (DTW, PHMM and SPHMM) are then applied to each noisy version of the data and the resulting recognition accuracies are averaged and reported. The results are shown in Table 2.

We compared the proposed approach to DTW and pair-HMM (where no segmentation is applied) with the warping band. To investigate whether DTW with noise-removal pre-processing is superior to SPHMM, we removed the noise using a median filter with two fixed window sizes, 5 and 3, and report the better recognition rate for each dataset in the DTW-NR column. We applied the Sakoe-Chiba band suggested by the UCR time-series page to DTW and PHMM.
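The impulse-noise corruption and the median-filter baseline described above can be sketched for a univariate sequence as follows. Several details are assumptions made for illustration: $\omega\sigma_i$ is interpreted as the standard deviation of the noise, the noisy time points are drawn without replacement, the filter window shrinks at the sequence borders, and the function names are hypothetical.

```python
import math
import random
import statistics


def add_impulse_noise(seq, omega=1.0, coverage=0.2, seed=0):
    """Add zero-mean Gaussian noise at randomly chosen time points.

    At most `coverage` of the samples are perturbed; the noise standard
    deviation is omega times the standard deviation of the sequence.
    """
    rng = random.Random(seed)
    sigma = statistics.pstdev(seq)
    n_noisy = int(coverage * len(seq))
    noisy = list(seq)
    for t in rng.sample(range(len(seq)), n_noisy):
        noisy[t] += rng.gauss(0.0, omega * sigma)
    return noisy


def median_filter(seq, window=3):
    """Sliding median with an odd window, truncated at the borders."""
    half = window // 2
    return [statistics.median(seq[max(0, t - half):t + half + 1])
            for t in range(len(seq))]


clean = [math.sin(t / 5) for t in range(50)]
noisy = add_impulse_noise(clean, omega=1.0)
denoised = median_filter(noisy, window=3)
```

A fixed window such as 3 or 5 removes isolated spikes but cannot adapt when several neighbouring samples are corrupted, which is the limitation of the DTW-NR baseline that the experiments probe.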
For SPHMM the maximum of the aforementioned band and twice the maximum segment length is chosen as the band, to allow SPHMM to accommodate up to two segments away from the diagonal of the alignment matrix. The parameters of SPHMM are learned using the method defined in Alg. 1. The segment length distribution, however, is not learned and is assumed to be uniform. In our experiments we noticed that the model is sensitive to the segment length distribution and that introducing a non-uniform prior can quickly lead to overfitting. This is due to the fact that the longer segments behave more like outliers. Therefore, it makes sense to use a uniform segment length distribution. The parameters are not changed for the noisy data experiments.

One can see in Table 2 that SPHMM is superior to or on par with PHMM and DTW on most datasets in the original, noise-free setting. However, as soon as noise is introduced, SPHMM shows a much stronger performance than both DTW and PHMM, even though PHMM outperforms DTW. One may also notice that even though median-filter noise removal has raised the recognition rates of DTW (DTW-NR column of the impulse noise section in Table 2), it still falls behind SPHMM except for a few cases. The superior performance of DTW-NR in those cases is due to the fact that the window size of the median filter accidentally matches the noise spread in one or two noisy versions of those datasets. However, there is no clear way of guessing the correct window size in advance.

To investigate whether the reported results indeed indicate the significance of SPHMM, we performed the Wilcoxon signed rank test (Demsar (2006)). In our case, for a two-tailed Wilcoxon signed rank test on 45 datasets and $\alpha = 0.05$, $T = \min(R^+, R^-)$ and

$$z = \frac{T - \frac{1}{4} \cdot 45 \cdot 46}{\sqrt{\frac{1}{24} \cdot 45 \cdot 46 \cdot 91}} < -1.95$$

was used to assert the significance of the proposed classifier$^1$. Table 1 summarizes the results of significance testing. As one can observe, SPHMM performs significantly better than the other methods in all cases. In the original, noise-free setting, PHMM's performance is not significantly (for $\alpha = 0.05$) superior to that of DTW, and both trail the performance of SPHMM. Since the significance of DTW-NR over DTW in the case of noisy data is very much evident, we have not reported it in Table 1. A standard two-tailed Student t-test for asserting the significance of SPHMM leads to the same conclusion at the 1% significance level for the original and the 0.1% level for the noisy experiments.

Table 1 Wilcoxon signed rank test for Table 2. ">" stands for "significantly better". Boldface indicates statistically significant relationships.

| | Original: PHMM ≈ DTW | Original: SPHMM > PHMM | Impulse Noise: DTW-NR > PHMM | Impulse Noise: SPHMM > DTW-NR |
|---|---|---|---|---|
| $R^+$ | 469 | 590 | 696 | 762 |
| $R^-$ | 396 | 15 | 90 | 228 |
| $z$ | -1.37 | -5.67 | -4.83 | -3.27 |

$^1$ $R^+$ ($R^-$) denotes the total rank of the datasets where the accuracy of method A is higher (lower) than the accuracy of method B. See Demsar (2006) for details.

Fig. 7 Sample of a sequence from the UCR dataset (Coffee) with and without noise: (a) original, (b) $\omega = 100\%$.

The average length of the extracted matching segments is approximately 1.08 with a standard deviation of 0.37 for noise-free data. For the noisy version of the dataset the average length of the matching segments rises to 1.97 with a standard deviation of 1.78, indicating that many segments are detected. One has to note that since the chosen data does not result from random delay processes, detecting many segments of length 1, i.e., a sample-to-sample matching, is not unexpected. On
the other hand, owing to noise (inherent or artificial), it is advantageous to have intermittently extended segments, as is evident from the reported standard deviation.

To demonstrate that our approach is resilient to additive Gaussian noise, we repeated the same experiment with the noise spread over the whole span of the signal. Since the noise is more dominant in this case, the maximum segment length is increased to 10. We performed noise removal using an average filter before applying DTW, to make sure that noise removal with a constant window size cannot improve the performance of DTW beyond that of SPHMM. The average filter window sizes are 10 and 5 and, as in the previous experiment, the higher recognition rate is reported. The learned parameters are not changed from the original case. The result is again reported in Table 2. The significance of SPHMM is confirmed by the Wilcoxon signed rank test reported in Table 3. It is interesting to note that noise removal was not able to improve the performance of DTW and, furthermore, caused a degradation of the performance in 15 cases. This is due to the constant window size, which does not adapt to the data; such adaptation is crucial under excessive noise. To verify this conclusion we picked the "Trace" and "Adiac" datasets and tried different window sizes for filtering. The result showed significant improvement when the window size is set to 18 for "Trace" and 4 for "Adiac". In particular, the accuracy improved to 82.31 and 12.12 for "Trace" and "Adiac", respectively. Another surprising point is that the accuracy for the Beef dataset is higher in the noisy case, putting the quality of this dataset in doubt (normalization removes this odd behaviour).

We also applied the LCSS and EDR algorithms to the noisy data in both the impulse and the wide-spread Gaussian noise experiments. For brevity, we have not shown those results.
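The $z$ statistics reported for the Wilcoxon signed rank tests in this section follow from the rank sums by the usual normal approximation, which the short Python sketch below reproduces. The function name is hypothetical, and the number of datasets is fixed at 45 as in the text, without any correction for ties.

```python
import math


def wilcoxon_z(r_plus, r_minus, n):
    """Normal approximation of the Wilcoxon signed rank statistic.

    T = min(R+, R-), with mean n(n+1)/4 and variance n(n+1)(2n+1)/24.
    """
    t = min(r_plus, r_minus)
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mean) / std


# Rank sums for the additive Gaussian noise comparisons (Table 3).
z_dtw_vs_dtwnr = wilcoxon_z(649, 341, 45)
z_phmm_vs_dtw = wilcoxon_z(810, 225, 45)
z_sphmm_vs_phmm = wilcoxon_z(674, 37, 45)
```

Rounded to two decimals, these values match the $z$ row reported for the Gaussian noise experiment, and each falls below the $-1.95$ threshold used in the text.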
Edit-distance based algorithms work significantly better than DTW in the case of impulse noise (according to a Wilcoxon signed-rank test) but fall behind DTW with noise-removal pre-processing in that setting. In the case of wide-spread additive Gaussian noise they show performance similar to that of DTW. If one applies noise-removal pre-processing before LCSS or EDR, they perform better but, again, not better than PHMM.

Running Time: Figure 8 compares the average per-alignment computation time of DTW and SPHMM when applied to the original noiseless data. For short time-series the overhead of computing the summed area table is dominant. For longer time-series the computation time is roughly 4 times that of DTW, which is much better than the worst case. This is due to the fact that when the algorithm investigates all segmentations for a correspondence for the first time, it has to find the score of a full alignment for every particular segment. This results in storing the score for every correspondence within all segments originating from that correspondence. Therefore, it is not necessary to recompute those values later when investigating the segmentations for neighbouring correspondences (the neighbourhood is defined by the maximum segment length). The algorithm is thus computationally efficient, given that it significantly outperforms all the rival methods and shows strong robustness against multiple types of noise, in addition to producing the joint segmentation and alignment.

Table 2 UCR time-series classification accuracy in the presence of additive Gaussian and impulse noise models. Columns 2-4: original data; columns 5-8: Gaussian noise; columns 9-12: impulse noise.

| Dataset | DTW | PHMM | SPHMM | DTW | DTW-NR | PHMM | SPHMM | DTW | DTW-NR | PHMM | SPHMM |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Lighting7 | 71.23 | 75.34 | 79.45 | 57.99 | 53.42 | 63.01 | 68.03 | 43.51 | 59.10 | 55.01 | 73.94 |
| OSULeaf | 61.98 | 65.7 | 66.12 | 41.32 | 53.99 | 63.77 | 65.01 | 47.11 | 55.55 | 55.15 | 67.18 |
| OliveOil | 83.33 | 86.67 | 86.67 | 37.78 | 27.78 | 35.56 | 35.56 | 28.89 | 51.12 | 28.89 | 32.22 |
| SwedishLeaf | 84.8 | 80.64 | 85.28 | 29.12 | 27.52 | 41.28 | 53.01 | 27.84 | 52.02 | 46.43 | 57.81 |
| Trace | 98 | 100 | 100 | 76.67 | 66.33 | 80.67 | 81.67 | 73.67 | 89.93 | 75.83 | 88.67 |
| Two_Patterns | 99.33 | 100 | 100 | 94.66 | 96.24 | 95.36 | 96.85 | 88.22 | 99.85 | 89.86 | 99.96 |
| fish | 82.86 | 86.86 | 86.86 | 35.43 | 33.91 | 36.38 | 38.47 | 34.18 | 60.13 | 60.70 | 71.21 |
| synthetic_control | 98.67 | 96.67 | 97.33 | 82.33 | 60.78 | 83.55 | 85.33 | 92.78 | 98.33 | 92.89 | 93.2 |
| wafer | 99.56 | 99.76 | 99.79 | 99.44 | 95.38 | 99.03 | 99.79 | 84.01 | 97.21 | 89.60 | 99.39 |
| yoga | 84.17 | 84.2 | 84.23 | 78.19 | 76.86 | 72.06 | 72.87 | 63.00 | 68.18 | 65.77 | 77.18 |
| 50words | 77.14 | 80 | 80.44 | 29.08 | 70.18 | 70.48 | 71.14 | 57.21 | 74.12 | 74.12 | 77.87 |
| Adiac | 60.61 | 60.87 | 60.87 | 10.66 | 7.33 | 10.91 | 14.41 | 10.20 | 28.17 | 14.59 | 40.04 |
| Beef | 53.33 | 53.33 | 53.33 | 53.33 | 54.44 | 55.55 | 55.55 | 40.00 | 50.00 | 50.00 | 53.33 |
| CBF | 99.67 | 99.89 | 99.89 | 85.78 | 64.71 | 88.11 | 88.74 | 74.35 | 97.33 | 85.93 | 98.01 |
| Coffee | 82.14 | 78.57 | 87 | 65.47 | 70.24 | 60.71 | 87 | 57.14 | 73.81 | 63.22 | 76.78 |
| ECG200 | 88 | 91 | 91 | 85 | 72.67 | 84.33 | 86 | 77.00 | 78.00 | 81.00 | 85.00 |
| FaceAll | 81.72 | 77.51 | 79.59 | 63.89 | 27.97 | 66.31 | 72.25 | 67.89 | 66.84 | 69.05 | 77.20 |
| FaceFour | 89.77 | 89.77 | 92.05 | 84.47 | 73.48 | 87.5 | 90.15 | 52.65 | 80.04 | 68.88 | 89.07 |
| Gun_Point | 92 | 98 | 98 | 76.22 | 70.67 | 66.22 | 68.45 | 71.33 | 83.31 | 75.80 | 84.65 |
| Lighting2 | 86.89 | 86.89 | 85.25 | 75.96 | 71.04 | 81.42 | 83.61 | 61.97 | 87.43 | 76.89 | 86.89 |
| ChlorineConcentration | 64.9 | 65 | 66.95 | 42.37 | 38.37 | 45.04 | 52.44 | 38.32 | 50.12 | 41.28 | 53.29 |
| CinC_ECG_torso | 92.9 | 97.83 | 97.83 | 93.31 | 91.31 | 92.9 | 92.9 | 85.39 | 92.9 | 92.9 | 97.83 |
| Cricket_X | 76.15 | 67.69 | 76.15 | 65.3 | 69.3 | 71.41 | 75.11 | 59.06 | 76.15 | 66.24 | 76.05 |
| Cricket_Y | 80.51 | 77.95 | 82.82 | 47.52 | 46.52 | 51.51 | 58.21 | 41.98 | 53.82 | 48.51 | 54.19 |
| Cricket_Z | 81.79 | 73.33 | 81.54 | 68.37 | 64.37 | 70.67 | 81.55 | 63.83 | 81.79 | 76.76 | 81.54 |
| DiatomSizeReduction | 95.42 | 93.46 | 95.75 | 78.32 | 75.32 | 85.25 | 92.69 | 69.83 | 90.52 | 81.69 | 92.47 |
| ECGFiveDays | 79.67 | 93.73 | 94.08 | 77.93 | 73.93 | 79.67 | 79.67 | 71.48 | 79.67 | 79.67 | 94.08 |
| FacesUCR | 91.27 | 91.85 | 96.88 | 36.94 | 33.94 | 39.27 | 45.36 | 31.41 | 41.27 | 36.29 | 45.04 |
| Haptics | 41.56 | 35.06 | 36.66 | 37.77 | 41.56 | 40.15 | 41.56 | 34.16 | 41.56 | 39.47 | 36.66 |
| InlineSkate | 38.73 | 41.45 | 44.37 | 28.97 | 26.97 | 29.79 | 35.44 | 27.2 | 35.87 | 33.43 | 33.41 |
| ItalyPowerDemand | 95.53 | 95.53 | 95.53 | 78.82 | 80.82 | 85.78 | 89.25 | 72.28 | 95.53 | 82.52 | 95.53 |
| MALLAT | 93.26 | 91.47 | 97.14 | 80.44 | 82.44 | 82.5 | 93.07 | 73.75 | 92.54 | 88.22 | 97.14 |
| MedicalImages | 74.61 | 69.87 | 74.68 | 31.97 | 34.97 | 32.98 | 36.79 | 30.91 | 42.63 | 37.72 | 44.37 |
| MoteStrain | 87.86 | 86.34 | 93.3 | 84.82 | 86.82 | 87.86 | 87.86 | 74.71 | 87.86 | 86.33 | 93.3 |
| NonInvasiveFatalECG_Thorax1 | 81.48 | 82.9 | 87.84 | 11.82 | 8.82 | 13.56 | 13.23 | 10.69 | 13.7 | 13.35 | 17.31 |
| NonInvasiveFatalECG_Thorax2 | 87.02 | 88.04 | 93.4 | 21.63 | 17.63 | 19.99 | 29.22 | 16.56 | 21.23 | 17.13 | 22.17 |
| SonyAIBORobot Surface | 69.55 | 75.87 | 79.61 | 84.23 | 69.55 | 69.55 | 69.55 | 69.55 | 69.55 | 69.55 | 79.61 |
| SonyAIBORobot SurfaceII | 85.94 | 85.52 | 91.28 | 84.23 | 85.94 | 85.94 | 85.94 | 73.18 | 85.94 | 84.56 | 91.28 |
| StarLightCurves | 86.07 | 84.91 | 90.13 | 89.7 | 86.07 | 86.07 | 86.07 | 82.13 | 86.07 | 86.07 | 90.13 |
| Symbols | 93.77 | 92.06 | 97.88 | 78.39 | 80.39 | 82.32 | 90.77 | 67.9 | 84.04 | 79.46 | 93.89 |
| TwoLeadECG | 85.51 | 74.89 | 79.35 | 62.84 | 62.84 | 65.8 | 73.36 | 59.83 | 74.7 | 67.13 | 79.35 |
| WordsSynonyms | 74.29 | 68.97 | 76.66 | 15 | 12 | 17.94 | 19.79 | 15.57 | 20.96 | 18.99 | 21.84 |
| uWaveGestureLibrary_X | 77.44 | 75.1 | 81.18 | 60.6 | 56.6 | 62.42 | 72.85 | 54.81 | 68.26 | 62.33 | 72.36 |
| uWaveGestureLibrary_Y | 69.68 | 66.72 | 72.24 | 59.91 | 64.91 | 63.68 | 67.08 | 53.18 | 69.17 | 58.45 | 72.24 |
| uWaveGestureLibrary_Z | 67.9 | 65.41 | 70.78 | 48.96 | 47.96 | 55.04 | 56.82 | 43.28 | 56.48 | 49.01 | 58.93 |

Table 3 Wilcoxon signed rank test for the additive Gaussian noise section of Table 2. ">" stands for "significantly better". Boldface indicates statistically significant relationships.

| | DTW > DTW-NR | PHMM > DTW | SPHMM > PHMM |
|---|---|---|---|
| $R^+$ | 649 | 810 | 674 |
| $R^-$ | 341 | 225 | 37 |
| $z$ | -1.99 | -3.30 | -5.42 |

Fig. 8 Comparison of the average per-alignment computation time of SPHMM and DTW on 20 datasets of the UCR repository. The horizontal axis shows the time-series length and the vertical axis shows the time in seconds.

5.4 EEG Signal Classification

We repeat the experiment on EEG signal classification reported in Shariat and Pavlovic (2012) to compare the marginal matching algorithm resulting from (17) with the dynamic programming suggested by (11). This experiment also demonstrates the effectiveness of SPHMM for non-causal and noisy real-world time-series. We used the P300 dataset described in Hoffmann et al (2005). Four sessions are held for each subject. In each session six runs are conducted such that the set of all 6 images is shown at least 20 times to each subject, where one of the images is the target in each run. We chose subject 1 and target 2 for our experiment. In each fold of cross-validation we keep one session as training and the remaining three are used as the test set, such that every session is used as training once. 1-NN is used as the classifier within a 5-fold cross-validation. We applied the default pre-processing on the data except that we increased the sub-sampling rate to 128 from 32 to acquire longer signals (129 samples). As recommended in the original paper, we kept only 8 channels. The maximum segment length is 20 for both marginal matching and the dynamic programming. Using the dynamic
Using the dynamic Robust Time-Series Retriev al Using Probabilistic Adaptive Segmental Alignment 25 T able 4 Confusion matrix of action recognition for SPHMM(in p ercentage p oints) DFR JJack KRF KRS PLF PRF Sq W2S DepositFlo orR 65.6 0 0 0 6.3 3.1 0 25 JumpingJack 0 98 0 0 0 0 0 2 KickRF ront 0 0 75.9 20.1 0 0 0 3.5 KickRSide 0 0 21.40 71.4 0 3.6 0 3.6 PunchLF ront 0 0 3.6 3.6 82.1 10.7 0 0 PunchRF ront 0 6.7 0 6.7 6.7 80 0 0 Squat 0 0 0 0 0 0 100 0 W alk2Steps 0 3.5 3.5 0 0 0 0 93.1 programming an accuracy of 82 . 64( ± 1 . 35) is ac hiev ed while using the forw ard algorithm yielded 84 . 1( ± 1 . 64) whic h shows a marginal adv an tage for the marginal matc hing algorithm. 5.5 Motion Capture Data In order to show the effectiveness of our mo del in a challenging real-world ap- plication we p erformed exp eriment on HDM05 motion-capture (MoCap) dataset M ¨ uller et al (2007)). The actions are usually comprised of sev eral sub-actions. Ev en plain actions such as walking can b e divided into w alking with larger or shorter strides at different ends of the line. Therefore, an algorithm that can potentially reco v er and lev erage the subaction segmen ts can outperform alternate approaches. W e examine that hypothesis in this exp eriment. HDM05 contains MoCap data which consists of 2-3 rotation angles of 29 skele- tal join ts, resulting in 62 joint angle time series. HDM05 includes 100 classes of action p erformed by 5 sub jects. W e choose 8 action classes which are Dep osit- Flo orR, JumpingJack, KickRF r ont, KickRSide, PunchLF ront, PunchRF r ont, Squat, Walk2Steps . Sequences are around 300 time-points long and the whole dataset con tains 276 sequences in total. W e p erform 5-fold cross v alidation and 1-NN is our classifier. Maximum segment length is set to 10. W e compare our metho d against DTW, canonical time warping (CTW) Zhou and de la T orre (2009)) and IsoCCA Shariat and Pa vlo vic (2011)). SPHMM ac hieved the highest accu- racy , 85 . 5( ± 6 . 18). 
DTW, CTW and IsoCCA yield 70.1 (±5.09), 60.2 (±5.1) and 75.1 (±6.8), respectively. The advantage of SPHMM is evident from the reported results. The confusion matrix for this experiment is shown in Table 4. One can notice that DepositFloorR is confused with Walk2Steps and KickRFront with KickRSide. It should be noted that DepositFloorR contains the action of walking (one or two steps) right before the actual depositing. KickRFront and KickRSide are also very much alike. PunchRFront is also sometimes confused with KickRFront, KickRSide and PunchLFront; these actions have a lot in common, which makes it difficult to distinguish them correctly in some instances.

5.6 UT-Interaction

For typical activities that consist of elementary actions, it may often be the case that the ordering of time points inside a segment ought not to affect the action similarity. For instance, elementary actions performed in opposite directions should still be deemed as similar as actions performed in the same direction.

Fig. 9 Sample frames from UT-interaction dataset #1.

Fig. 10 Accuracy and speedup (SM computation time / Fast-SM computation time) results for $l_1$ and $\chi^2$ distances as a function of maximum segment length: (a) accuracy, (b) speedup. $l_1$ is depicted in green and $\chi^2$ in blue. The accuracy of Fast-SM is identical to that of SM for each distance metric.

Table 5 Recognition rates on UT-interaction dataset #1.
| Method | Accuracy |
|---|---|
| Segmental Match | 91.57% |
| Dynamic BOW (Ryoo (2011)) | 85.0% |
| SVM | 85.0% |
| Voting (Waltisberg et al (2010)) | 88.0% |

Therefore, we expect to observe improved performance by applying the segmental matching algorithm to an activity recognition problem.

To apply segmental matching we needed a dataset of reasonable length and complexity, so that we could try different segment lengths and observe how the recognition rate is affected. Popular action recognition datasets such as KTH (Schuldt et al (2004)) or Weizmann (Gorelick et al (2007)) were therefore not suitable for our setting, because they contain short periodic actions and only a few frames are sufficient for reliable recognition. Instead, we use the first subset of the publicly available UT-interaction dataset, containing 10 sequences (60 after segmentation of actions). Within each sequence, six actions (hand shaking, hugging, kicking, pointing, punching and pushing) are performed by 10 different actors. The videos involve camera jitter. Pedestrians are present in the video, which makes the recognition more difficult (Figure 9). We used spatio-temporal interest points (Cuboids) (Dollar et al (2005)) as the descriptors. Then k-means is applied to the resulting features to produce an 800-element codebook.

We use a nearest neighbour classifier to compare with Ryoo (2011). We use leave-one-sequence-out cross-validation, holding out one sequence for testing and using the remaining nine for training. Each action in the test set is matched with all training sequences. As a baseline we report the results of an SVM using the same feature set, as well as the results reported in Ryoo (2011). We used the $l_1$ and $\chi^2$ histogram distances. The results for the $l_1$ distance metric are reported in Table 5.
It is evident from the results that our approach significantly outperforms the other methods. Using either the ℓ1 or the χ² distance metric, SM and Fast-SM achieved their best result when the maximum segment length was 30; χ² achieved the best result even with a maximum segment length of 20. We tried maximum segment lengths of 10, 15, 20, 25 and 30. Figure 10 illustrates how the accuracy, and the speedup gained by bounding the distance (Fast-SM), change as the maximum segment length increases for the ℓ1 and χ² histogram distance metrics. It is interesting to note that the recognition rates of Fast-SM and SM are identical in all cases, which indicates that the bounding technique and the smoothness assumption on the local likelihoods are in fact effective. In addition, Fast-SM achieves at least a 2-fold speedup compared to SM. As shown in Figure 10(a), χ² achieves better results at smaller maximum segment lengths, pointing to it as a more suitable measure of distance on segment histograms. Unfortunately, as the maximum segment length increases the bounds on the histogram distances become looser, resulting in reduced speedup. However, one should notice that the shortest sequence is 24 frames long, so our final maximum segment length (30) already exceeds this limit. This implies that the model has the option to effectively consider a single BOTW representation as an alternative.

We also applied SPHMM to observe whether a complete alignment model is able to achieve better performance than SM and Fast-SM. The results showed that SPHMM cannot advance the recognition rate beyond 91.57%, yet it is at least three times slower than SM and four times slower than Fast-SM.

Samples of the discovered segments are depicted in Figure 11. Five activities are illustrated, and the segments are separated by red bars. Only a few frames from each segment are shown.
The number of frames shown in each segment is proportional to the length of the segment, so that a longer segment is shown with more frames than a shorter segment in the same segmental alignment. An important observation is that the algorithm tends to encapsulate similar relative motions within each segment. For instance, in the 'Hugging' activity (Figure 11(a)), the second and third segments, which both had the maximum length, encompass the action of hugging. The next segment, shorter in length, contains the pause when the two actors do not move substantially, while the last segment collects the frames corresponding to the actors separating from each other. One can speculate that the second and third segments would merge if the maximum segment length were large enough. However, a larger maximum segment length results in a longer running time.

Fig. 11 Samples of discovered segments: (a) Hugging, (b) Pushing, (c) Hand Shaking, (d) Kicking, (e) Punching. Segments are separated by red bars. Only a few frames from each segment are shown. The segments and sequences are not necessarily of the same length. The number of frames shown for each segment is increased or decreased for better illustration.

6 Conclusion

In this paper we presented a probabilistic model for segmental sequence alignment. We showed that a modified pair-HMM, in conjunction with a proper segment metric, can lead to effective joint segmentation and segmental alignment. Our experimental results showed high accuracy, particularly in settings with high levels of noise, where DTW loses robustness and hence underperforms even after noise-removal pre-processing. Additionally, the invariance to local permutation has enabled our algorithm to perform well on non-causal signals. We also proposed a relaxation of the original model that reduced the computational time. In the particular but common case when histograms are used to represent the time-series, we were able to prune unnecessary computation using bounds on histogram distance metrics.

References

Abreu E, Lightstone M, Mitra S, Arakawa K (1996) A new efficient approach for the removal of impulse noise from highly corrupted images. Image Processing, IEEE Transactions on 5(6):1012–1025
Aghabozorgi S, Shirkhorshidi AS, Wah TY (2015) Time-series clustering - a decade review. Information Systems
Andre-Jonsson H, Badal DZ (1997) Using signature files for querying time-series data. In: First European Symposium on Principles of Data Mining and Knowledge Discovery, pp 211–220
Atallah MJ, Fox S (1998) Algorithms and Theory of Computation Handbook, 1st edn. CRC Press, Inc., Boca Raton, FL, USA
Berndt D, Clifford J (1994) Using dynamic time warping to find patterns in time series. In: KDD Workshop, pp 359–370
Blaum M, Bruck J (1994) Coding for delay-insensitive communication with partial synchronization. Information Theory, IEEE Transactions on 40(3):941–945
Chen L, Ng R (2004) On the marriage of lp-norms and edit distance. In: Proceedings of the Thirtieth international conference on Very large data bases - Volume 30, VLDB Endowment, VLDB '04, pp 792–803
Chen L, Özsu MT (2005) Robust and fast similarity search for moving object trajectories. In: SIGMOD, pp 491–502
Chu WS, Zhou F, de la Torre F (2012) Unsupervised temporal commonality discovery. European Conference on Computer Vision (ECCV) pp 373–387
Crow FC (1984) Summed area tables for texture mapping. Proceedings of the 11th annual conference on Computer Graphics and Interactive Techniques pp 207–211
Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research 7:1–30
Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. Proceedings of the VLDB Endowment 1(2):1542–1552
Dollar P, Rabaud V, Cottrell G, Belongie S (2005) Behavior recognition via sparse spatio-temporal features. In: Proceedings of the 14th International Conference on Computer Communications and Networks (ICCCN), IEEE Computer Society, pp 65–72
Durbin R, Eddy S, Krogh A, Mitchison G (1997) Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press
Gorelick L, Blank M, Shechtman E, Irani M, Basri R (2007) Actions as space-time shapes. Transactions on Pattern Analysis and Machine Intelligence 29(12):2247–2253
Hoffmann U, Garcia G, Vesin J, Diserens K, Ebrahimi T (2005) A Boosting Approach to P300 Detection with Application to Brain-Computer Interfaces. In: Proceedings of the IEEE EMBS Conference on Neural Engineering, SPIE, pp 97–100
Keogh E (2006) A decade of progress in indexing and mining large time series databases. In: Proceedings of the 32nd international conference on Very large data bases, VLDB Endowment, pp 1268–1268
Keogh E, Zhu Q, Hu B, Hao Y, Xi X, Wei L, Ratanamahatana CA (2011) The UCR time series classification/clustering homepage. URL www.cs.ucr.edu/~eamonn/time_series_data/
Kondor R (2003) A kernel between sets of vectors. In: Proceedings of International Conference on Machine Learning
Lampert CH, Blaschko MB, Hofmann T (2009) Efficient subwindow search: A branch and bound framework for object localization. IEEE Trans Pattern Anal Mach Intell 31(12):2129–2142
Morse MD, Patel JM (2007) An efficient and accurate method for evaluating time series similarity. In: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, ACM, pp 569–580
Müller M, Röder T, Clausen M, Eberhardt B, Krüger B, Weber A (2007) Documentation MoCap database HDM05. Tech. Rep. CG-2007-2, Universität Bonn
de Munck J, Gonçalves S, Huijboom L, Kuijer J, Pouwels P, Heethaar R, da Silva FL (2007) The hemodynamic response of the alpha rhythm: An EEG/fMRI study. NeuroImage 35(3):1142–1151
Pree H, Herwig B, Gruber T, Sick B, David K, Lukowicz P (2014) On general purpose time series similarity measures and their use as kernel functions in support vector machines. Information Sciences 281:478–495
Rabiner LR (1989) A tutorial on hidden Markov models and selected applications in speech recognition. In: Proceedings of the IEEE 77 (2), pp 257–286
Riemenschneider H, Donoser M, Bischof H (2009) Bag of optical flow volumes for image sequence recognition. In: British Machine Vision Conference
Ryoo MS (2011) Human activity prediction: Early recognition of ongoing activities from streaming videos. Proceeding of IEEE Conference on Computer Vision pp 1036–1043
Schuldt C, Laptev I, Caputo B (2004) Recognizing human actions: A local SVM approach. In: Proceedings of the Pattern Recognition, 17th International Conference on (ICPR), IEEE Computer Society, pp 32–36
Shariat S, Pavlovic V (2011) Isotonic CCA for Sequence Alignment and Activity Recognition. Proceeding of IEEE Conference on Computer Vision pp 2572–2578
Shariat S, Pavlovic V (2012) Improved sequence classification using adaptive segmental sequence alignment. Journal of Machine Learning Research - Proceedings Track 25:379–394
Shariat S, Pavlovic V (2013) A new adaptive segmental matching measure for human activity recognition. In: Computer Vision (ICCV), 2013 IEEE International Conference on, IEEE, pp 3583–3590
Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vision 57(2):137–154
Vlachos M, Kollios G, Gunopulos D (2002) Discovering similar multidimensional trajectories. In: Data Engineering, 2002. Proceedings. 18th International Conference on, pp 673–684
Waltisberg D, Yao A, Gall J, Van Gool L (2010) Variations of a hough-voting action recognition system. In: Proceedings of the 20th International conference on Recognizing patterns in signals, speech, images, and videos, Springer-Verlag, pp 306–312
Woznica A, Kalousis A, Hilario M (2006) Distances and (indefinite) kernels for sets of objects. In: International Conference on Data Mining, pp 1151–1156
Ye L, Keogh E (2009) Time series shapelets: A new primitive for data mining. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, pp 947–956
Zakaria J, Mueen A, Keogh E, Young N (2015) Accelerating the discovery of unsupervised-shapelets. Data Mining and Knowledge Discovery pp 1–39
Zhou F, de la Torre F (2009) Canonical time warping for alignment of human behavior. Advances in Neural Information Processing Systems (NIPS) pp 1–9

Author Biographies

Shahriar Shariat received his PhD in computer science from Rutgers University in 2013 and his MSc from Sharif University in 2008. Since 2013, he has been with the applied science team of Turn Inc. as a senior scientist. Shahriar's research interests include time-series analysis and alignment, computer vision, and large-scale predictive and statistical learning models. He has many publications in major peer-reviewed venues and has served as a reviewer for several top-tier conferences and journals.

Vladimir Pavlovic is an Associate Professor in the Computer Science Department at Rutgers University.
He received the PhD in electrical engineering from the University of Illinois in Urbana-Champaign in 1999. From 1999 until 2001 he was a member of research staff at the Cambridge Research Laboratory, Cambridge, MA. Before joining Rutgers in 2002, he held a research professor position in the Bioinformatics Program at Boston University. Vladimir's research interests include probabilistic system modeling, time-series analysis, statistical computer vision and bioinformatics. He has published over 130 peer-reviewed papers in major computer vision, machine learning and pattern recognition journals and conferences.
