A Riemannian Framework for Statistical Analysis of Topological Persistence Diagrams

A Riemannian Framework f or Statistical Analysis of T opological P ersistence Diagrams Rushil Anirudh*, V inay V enkataraman*, Karthik eyan Natesan Ramamurthy + , Pa van T uraga* *Arizona State Uni versity , + IBM T . J. W atson Research Center, NY Abstract T opolo gical data analysis is becoming a popular way to study high dimensional featur e spaces without any con- textual clues or assumptions. This paper concerns itself with one popular topological featur e, which is the number of d − dimensional holes in the dataset, also known as the Betti − d number . The persistence of the Betti numbers over various scales is encoded into a persistence diagram (PD), which indicates the birth and death times of these holes as scale varies. A common way to compar e PDs is by a point- to-point matching, whic h is given by the n -W asser stein met- ric. However , a big drawback of this appr oach is the need to solve correspondence between points befor e computing the distance; for n points, the complexity gr ows accord- ing to O ( n 3 ) . Instead, we pr opose to use an entir ely new frame work built on Riemannian geometry , that models PDs as 2D pr obability density functions that ar e r epresented in the squar e-root framework on a Hilbert Spher e. The r e- sulting space is much mor e intuitive with closed form ex- pr essions for common operations. The distance metric is 1) corr espondence-free and also 2) independent of the number of points in the dataset. The complexity of computing dis- tance between PDs now gr ows accor ding to O ( K 2 ) , for a K × K discr etization of [0 , 1] 2 . This also enables the use of existing mac hinery in differ ential geometry towar ds statisti- cal analysis of PDs suc h as computing the mean, geodesics, classiﬁcation etc. W e report competitive r esults with the W asserstein metric, at a much lower computational load, in- dicating the favorable pr operties of the proposed appr oach. 1. Intr oduction T opological data analysis (TD A) has emerged as a use- ful tool to analyze underlying properties of data before any contextual modeling assumptions kick in. Generally speak- ing, TDA seeks to characterize the shape of high dimen- sional data (viewed as a point-cloud in some metric space), by quantifying various topological constructs such as con- nected components, high-dimensional holes, le vel-sets and monotonic regions of functions deﬁned on the data [9]. In particular , the number of d -dimensional holes in a data, also known as the Betti - d number , corresponds to the rank of the d − dimensional homology group. A commonly used and powerful topological feature, the persistence diagram (PD) summarizes the persistence of the Betti numbers with re- spect to the scale parameter used to analyze the data. A typ- ical machine learning pipeline using TDA features would ﬁrst estimate the PDs from the gi ven point cloud, and de- ﬁne a metric on them to compare different point clouds. The W asserstein distance measure has become ubiquitous as a metric between such PDs, as it is stable and has a well- deﬁned metric space associated with it [21, 8]. Howe ver , computation of the W asserstein distance inv olves ﬁnding a one-to-one map between the points in one persistence dia- gram to those in the other , which is a computationally e x- pensiv e operation. In this paper, we propose a novel representation for per- sistence diagrams as points on a hypersphere, by approx- imating them as 2D probability density functions (pdf). W e perform a square-root transform of the constructed pdf, wherein the Hilbert sphere becomes the appropriate space for deﬁning metrics [19]. This insight is used to construct closed form metrics – geodesics on the Hilbert sphere – which can be computed very efﬁciently , bypassing the cor- respondence problem entirely . The overall pipeline of op- erations for computing the proposed representation is gi ven in Figure 1. The biggest adv antage of the proposed representation is that it completely w orks around the computationally ex- pensiv e step of obtaining one-to-one correspondences be- tween points in persistence diagrams, thereby making the distance computation between PDs extremely efﬁcient. W e show that approximating PDs as pdfs results in compara- ble performances to the popular and best performing L 1 - W asserstein metric, while at the same time being orders of magnitude faster . W e also provide a theoretically well- grounded understanding of the geometry of the pdf repre- sentation. Additionally , the av ailability of closed form ex- pressions for many important tools such as the geodesics, This work was supported by NSF CAREER grant 1452163. 1 Statistical analysis on the Hilbert Sphere Square-root representation Joint Trajectories Estimated 2D PDF Reconstructed Phase space Persistence Diagram Figure 1. The o verall sequence of operations leading to the proposed representation, illustrated for the application of acti vity analysis. Joint position data from motion capture systems are collected as 1D time series. The phase space is reconstructed from these time series via T akens’ embedding theorem. W e represent the topological properties of this phase space using the persistence diagram (PD). Next, we use kernel density estimation to represent the PD itself as a 2D probability density function (pdf). Finally , we use the square-root frame work to interpret these pdfs as points on a Hilbert sphere. exponential maps etc. enables us to adapt powerful statis- tical tools – such as clustering, PCA, sample mean etc. – opening ne w possibilities for applications in volving large datasets. T o the best of our knowledge we are the ﬁrst to propose this representation for persistence diagrams. Contributions 1. W e present the ﬁrst representation of persistence di- agrams that are approximated as points on a Hilbert sphere resulting in closed-form expressions to com- pare two diagrams. This also completely avoids the correspondence problem, which is typically a compu- tational bottleneck. 2. W e demonstrate the ability of the proposed represen- tation for statistical analyses, such as computing the mean persistence diagram, principal geodesic analysis (PGA), and classiﬁcation using SVMs. 3. W e sho w promising results for supervised tasks such as action recognition, and assessment of mov ement qual- ity in stroke rehabilitation. 4. The space of the representation – the Hilbert sphere – is a geometrically well-understood and intuitive space, which may further promote its use in topological ma- chine learning applications. The rest of the paper is organized as follo ws. Section 2 discusses related work in more detail. Section 3 provides the necessary background on persistent homology , the space of persistence diagrams, and the square-root framework on the Hilbert space. Section 4 pro vides details about the pro- posed framew ork for using the new representation of per- sistence diagrams in a functional Hilbert space for statisti- cal learning tasks. Section 5 describes the experiments and results. Section 6 concludes the paper . 2. Related W ork Persistence diagrams elegantly summarize the topolog- ical properties of point clouds, and the ﬁrst algorithm for computing topological persistence was proposed in [9]. Ever since, there has been an e xplosion in understanding the properties of PDs, comparing PDs, and computing statis- tics on them. Since a PD is a multiset of off-diagonal points along with inﬁnite number of points on the diago- nal, the cost of optimal bijecti ve assignment, also kno wn as the W asserstein distance, between the individual points in the PDs is a valid distance measure. The time complexity of computing the distance between two PDs with n points each is O ( n 3 ) [4]. It has been shown in [8] that the per- sistence diagrams of Lipschitz functions are stable with re- spect to p -W asserstein distance. Ho wever , the bottleneck and p -W asserstein distance do not allo w for easy computa- tion of statistics. Hence, L p -W asserstein metrics have been used to dev elop approaches for computing statistical sum- maries such as the Fr ´ echet mean [21]. Computing Fr ´ echet mean of PDs in volves iterativ ely assigning points from the individual diagrams to the candidate mean diagram, recom- puting the mean, and repeating till con ver gence. Since the mean PD obtained is not guaranteed to be unique, the au- thors in [16] proposed the probabilistic Fr ´ echet mean which itself is a probability measure on the space of PDs. An in- vestigation of the properties of the space of PDs that allow for deﬁnition of various probability measures is reported in [15], and a deriv ation of conﬁdence sets that allo w the sepa- ration of topological signal from noise is presented in [10]. Since operations with PDs can be computationally ex- pensiv e, Bubenik proposed a new topological summary - known as the persistence landscape - deriv ed from the PD [5]. It can be thought of as a collection of en velope func- tions on the points in PD based on the order of their im- portance. Persistence landscapes allow for fast computa- tion of statistics since they lie in a vector space. Howe ver their practical utility has been limited since the y pro vide decreasing importance to secondary and tertiary features in PDs that are usually useful in terms of discriminating be- tween data from different classes. Recently , an approach that deﬁnes a stable multi-scale kernel between persistence diagrams has been proposed [17]. The kernel is obtained by creating a surface with a Gaussian centered at each point in the PD along with a ne gativ e Gaussian centered at its reﬂection across the diagonal. In addition, the authors in [2] propose to compute a vector representation - the persis- tence image - by computing a kernel density estimate of the PD and integrating it locally . The kernel density estimate is weighted such that the points will hav e increasing weights as they move aw ay from the diagonal. A similar weighting approach is used in [14] to create a persistence-weighted Gaussian kernel. The square-root representation for 1D probability den- sity functions (pdfs) was proposed in [19] and was used for shape classiﬁcation. It has since been used in sev eral ap- plications, including activity recognition [22] - where the square-root velocity representation is used to model the space of time warping functions. This representation ex- tends quite easily to arbitrary high-dimensional pdfs as well. 3. Mathematical Preliminaries In this section we will introduce the concepts of a) per- sistence homology , b) persistent diagrams, c) our proposed representation of PDs as a 2D pdf, and the square-root framew ork resulting in the Hilbert sphere geometry . 3.1. Persistent Homology The homology of point cloud X can be computed by ﬁrst constructing a shape out of the point cloud and estimating its homology . A simplex is imposed on e very set of points in the point cloud that satisfy a neighborhood condition based on a scale parameter  . The collection of such simplices is known as the simplicial complex S . S can be thought of as a representation of the underlying shape of the data, and its homology can be inferred. Ho wever , homology groups ob- tained from S depend on the scale or time parameter  based on which the complexes are constructed [9]. The homolog- ical features of the simplicial complex constructed from the point cloud that are stable across scales, i.e., that are per- sistent , are the ones that provide information about the un- derlying shape. T opological features that do not persist are considered to be noise. This information is represented us- ing persistence diagrams as a 2D plot of birth versus death times of each homology cycle corresponding to the homol- ogy group H k , k = { 0 , 1 , . . . } . The birth time is the scale at which the homological feature is born and the death time is the scale at which it ceases to exist. The homology cycle of dimension d is also referred to as a d − dimensional hole. Therefore, the PD can be considered as an object that repre- sents the number of holes in terms of the range of scales at which the y appear . T ypically , PDs of the point cloud data are obtained using ﬁltration of simplicial complexes. A well-known ﬁltration is the V ietoris-Rips (VR) ﬁltration, where a simplex is induced between a group of points when- ev er the distance between each pair of points is less than the giv en scale  [26]. An example of point clouds and their corresponding persistence diagrams for homology groups 0 and 1 are provided in Figure 2. Point cloud data Persistence diagrams Figure 2. The above example illustrates the topological features extracted from the point cloud data with two and one one- dimensional holes. These properties are reﬂected well in their cor- responding persistence diagrams on the right. 3.2. The Space of Persistence Diagrams Every PD is a multiset of 2D points, where each point denotes the birth and death time of a homology group. Fur- thermore, the diagonal is assumed to contain an inﬁnite number of points with the same birth and death times. For any two PDs X and Y , the distance between the diagrams is usually quantiﬁed using the bottleneck distance or the L q - W asserstein metric [13]. In this paper , we consider only the L 2 - and L 1 -W asserstein ( d L 2 and d L 1 ) metrics giv en re- spectiv ely as, d L 2 ( X, Y ) = inf η : X → Y X x ∈ X || x − η ( x ) || 2 2 ! 1 2 , (1) and d L 1 ( X, Y ) = inf η : X → Y X x ∈ X || x − η ( x ) || 1 . (2) Since each diagram contains an inﬁnite number of points in the diagonal, this distance is computed by pairing each point in one diagram uniquely to another non-diagonal or diagonal point in the other diagram, and then computing the distance. This correspondence can be obtained via the Hungarian algorithm or its more ef ﬁcient variants [13]. The space of PDs with respect to the L 2 -W asserstein metric is giv en by D L 2 = { X | d L 2 ( X, ∅ ) < ∞} . (3) T urner et. al. [21] show this is a non-negati vely curved Alexandro v space. Furthermore, the diagram on the geodesic between the PDs X and Y in this space is gi ven as γ ( s ) = (1 − s ) x + sφ ( x ) , (4) where x is a point in the diagram X , φ ( x ) is a correspond- ing point in the diagram Y and s ∈ [0 , 1] parametrizes the geodesic. Clearly , the points in the diagram on the geodesic can be simply obtained by linearly interpolating between the corresponding points on the candidate points X and Y . Furthermore, the Riemannian mean between two PDs is easily computed as γ (0 . 5) . 3.3. Square-r oot Framework and the Hilbert Sphere W e treat the points in a persistence diagram as samples from an underlying probability distribution function. This representation is inspired from recent work dealing with deﬁning Mercer kernels on PDs [17], where the feature em- bedding obtained from the Mercer kernel closely resembles a kernel-density estimator from the gi ven points, with an additional reﬂection term about the diagonal to account for boundary effects while solving a heat-equation. The fact that the feature embedding resembles a kernel density esti- mate is not further exploited in [17]. In our work, we more directly exploit this pdf interpre- tation of PDs, further endowing it with a square-root form [19] – and then utilizing the Hilbert spherical geometry that results. W ithout loss of generality , we will assume that the sup- ports for each 2D pdf is [0 , 1] 2 . The space of pdfs that we will consider is P = { p : [0 , 1] × [0 , 1] → R ∀ x, y | p ( x, y ) ≥ 0 , and Z 1 0 Z 1 0 p ( x, y ) dxdy = 1 } (5) It is well-kno wn that P is not a vector space [19]. Instead, it is a Riemannian manifold with the Fisher-Rao metric as the natural Riemannian metric. Geodesics under the Fisher- Rao metric are quite dif ﬁcult to compute. Instead, we adopt a square-root form proposed by Sriv astav a et. al. [19] which simpliﬁes further analysis enormously . In other words we consider the space, Ψ = { ψ : [0 , 1] × [0 , 1] → R | ψ ≥ 0 , Z 1 0 Z 1 0 ψ 2 ( x, y ) dxdy = 1 } . (6) For any two tangent vectors v 1 , v 2 ∈ T ψ (Ψ) , the Fisher-Rao metric is giv en by , h v 1 , v 2 i = Z 1 0 Z 1 0 v 1 ( x, y ) v 2 ( x, y ) dxdy . (7) The abov e two pieces of information imply that the square-root form ψ = √ p results in the space becom- ing a unit Hilbert-sphere, endowed with the usual inner - product metric. Geodesics on the unit-Hilbert sphere un- der the above Riemannian metric are kno wn in closed form. In fact, the dif ferential geometry of the Hilbert sphere results in closed form expressions for computing geodesics, exponential and in verse exponential maps [19]. Further , the square-root form with the above metric has ad- ditional fav orable properties such as in variance to domain re-parameterization. Giv en two points ψ 1 , ψ 2 the geodesic distance between them is giv en by d H ( ψ 1 , ψ 2 ) = cos − 1 ( h ψ 1 , ψ 2 i ) , (8) where h ψ 1 , ψ 2 i ψ = R 1 0 ψ 1 ( t ) ψ 2 ( t ) dt . The exponential map is deﬁned as, exp ψ i ( υ ) = cos( || υ || ψ i ) ψ i + sin( || υ || ψ i ) υ || υ || ψ i , (9) where υ ∈ T ψ i (Ψ) is a tangent vector at ψ i and || υ || ψ i = q h υ , υ i ψ i . The log arithmic map from ψ i to ψ j is gi ven by: exp − 1 ψ i ( ψ j ) = u p h u, u i cos − 1 h ψ i , ψ j i , (10) with u = ψ i − h ψ i , ψ j i ψ j . The geodesic on the sphere is giv en in closed form as π ( s ) = (1 − s ) ψ 1 + sψ 2 s 2 + (1 − s ) 2 + 2 s (1 − s )( h ψ 1 , ψ 2 i ) , (11) or equiv alently as π ( s ) = cos( s || v || ) ψ + sin( s || v || ) v || v || . (12) A comparison of sampling of the geodesics in the Alexandro v space induced by the L 2 -W asserstein metric and the proposed Hilbert sphere are provided in Figure 3. 4. Algorithmic Details Our proposed framework consists of reconstructing the phase space of the time series data, computing the PDs of the reconstructed phase space, transforming each PD to a 2D pdf by kernel density estimation, using the 2D pdfs in the square-root framework to estimate distances or obtain statistical summaries. The distances computed and the fea- tures obtained will be used in inference tasks. Additional details for some of the abov e steps are given belo w . Figure 3. Comparing geodesics in the Alexandrov space (top) and the proposed Hilbert sphere (bottom), for a simple persistence diagram containing just 3 points. The Alexandro v space requires computing correspondence between points, and the geodesic in volves linearly interpolating between corresponding points. The persistence pdf a voids correspondence estimation, hence the geodesics correspond to new local modes appearing and gradually increasing in intensity as the original modes decrease in intensity . Phase-space Reconstruction from Activity Data In dy- namical system understanding, we aim to estimate the un- derlying states, but we measure functions – usually pro- jections – of the original state-space. e.g. while human mov ement is inﬂuenced by many joints, lig aments, muscles of the body , we might measure the location of only a few joints. One way to deal with this issue is to reconstruct- ing the ‘phase space’ by the method of delays used com- monly in non-linear dynamical analysis [20]. Reconstruct- ing the phase-space in this manner only preserves important topological properties of the original dynamical system, and does not recov er the true state-space. F or a discrete dynami- cal system with a multidimensional phase-space, time-delay vectors (or embedding vectors) are obtained by the concate- nation of delayed versions of the data points, x t = [ x ( t ) , x ( t + τ ) , · · · , x ( t + ( m − 1) τ )] T . (13) where m is the embedding dimension, τ is the embed- ding delay . For a sufﬁciently lar ge m , the important topo- logical properties of the unknown multidimensional sys- tem are reproduced in the reconstructed phase-space [1]. The collection of n time-delay vectors is the point cloud data under further consideration, and this is represented as X = [ x t ] n t =0 . Estimating the PDs After estimating the point cloud in the reconstructed phase space, we will construct a simpli- cial complex S using the point cloud data X to compute the persistence diagrams for the Betti numbers using the VR ﬁltration. Howe ver , this approach considers only spa- tial adjacency between the points and ignores the temporal adjacency . W e improv e upon the existing VR ﬁltration ap- proach by explicitly creating temporal links between x t − 1 , x t , and x t +1 in the one-skeleton of S , thereby creating a metric space which encodes adjacency in both space and time [23]. The persistence diagrams for homology groups of dimensions 0 and 1 are then estimated. Square-r oot Framework for Distance Estimation be- tween PDs Since each PD is a multiset of points in the 2D space, we start by constructing a 2D pdf from each of them using kernel density estimation using a Gaussian kernel of zero mean and variance σ 2 . For each PD, we compute the square-root representation ψ using the approach provided in Section 3.3, and the distance between two PDs can be computed using (8). Dimensionality Reduction with Principal Geodesic Analysis (PGA) W e are able to extract principal com- ponents using PGA [11], adapted to our Hilbert sphere – which essentially performs classical PCA on the tan- gent space of the mean-point on the sphere. Given a dataset of persistence diagrams in their square-root form { ψ 1 , ψ 2 , . . . , ψ N } , we ﬁrst compute the Riemannian cen- ter of mass [12]. W e use a simple approximation using an extrinsic algorithm that computes ﬁrst the Euclidean mean and maps it on to the closest point on the sphere. Next, we represent each ψ i by its tangent vector from the mean. W e then compute the principal components on the tangent space and represent a persistence diagram as a low-dimensional vector . 5. Experiments W e perform experiments on two datasets for human ac- tion analysis. First we perform action recognition on the MoCap dataset [3], next we show the use of the framework in quality assessment of mov ements in stroke rehabilitation [7]. W e will describe the datasets next, followed by the ev al- uation settings and parameters used. In all our experiments, we performed a discretization of the 2D pdf into a K × K grid. The choice of K indirectly affects classiﬁcation per - formance as expected, i.e. – a smaller value of K results in a reduced ability to identify between nearby points in the PD due to lower resolution. On the other hand, a larger value of K improves resolution, but also increases computational re- quirements. T ypical v alues of K used in experiments range from 50 to 100 at most, whereas typical v alues of n – the number of points in a PD – ranges from 20 to 50 . 5.1. Motion Capture Data W e e valuate the performance of the proposed frame work using 3 -dimensional motion capture sequences of body joints [3]. The dataset is a collection of ﬁv e actions: dance, jump, run, sit and walk with 31 , 14 , 30 , 35 and 48 instances respectiv ely . W e generate 100 random splits having 5 test- Method Accuracy (%) Time (sec) Chaos [3] 52.44 - VR-Complex [26] 93.68 - DT2 [24] 93.92 - T -VR Complex (L1) [23] 96.48 (1 . 2 ± 1 . 23) × 10 3 Proposed - 1NN 89.87 ( 0 . 059 ± 0 . 044 ) Proposed - PGA +SVM 91.68 - T able 1. Comparison of classiﬁcation rates for different methods using nearest neighbor classiﬁer on the motion capture dataset. It is observed that on an av erage, the proposed metric is 10 5 times faster than the traditional W asserstein metric, while achieving a comparable recognition accuracy . ing examples from each action class and use an SVM clas- siﬁer on the vector features computed with PGA, we get a performance of 91 . 68% . The mean recognition rates for the different methods are giv en in T able 1. W e also com- pare with a 1-nearest neighbor classiﬁer computed on the Hilbert sphere, which gi ves a performance of 89 . 87% . This is clearly competitiv e, when compared to approaches that use the L 1 -W asserstein metric as shown in table 1. Clearly , the proposed metric for persistence diagrams is able to cap- ture most of the important information, with the added ben- eﬁt being free from expensiv e correspondence estimation. T o compute the times taken using the W asserstein metric and the proposed metric, we a verage across 3 samples from 5 action classes, across 57 (19 joint trajectories along 3 axes) trajectories – a total of 855 persistence diagrams. W e computed a pairwise distance matrix for all these PDs, and computed the av erage time taken in Matlab 2014 on a stan- dard Intel i7 PC. 0.1 0.2 1 2 4 8 16 60 65 70 75 80 85 90 95 100 σ (Std. Dev.) Recognition Accuracy Figure 4. Recognition accurac y vs σ , the standard de viation of the 2D Gaussians used in the kernel density estimates. In Figure 4, we compare the recognition accuracy with the choice of σ , the standard deviation used during kernel density estimation. The accuracy is generally higher for a small σ and drops as the σ increases. W e note that a similar trend is also reported in [17]. 5.2. Strok e Rehabilitation Dataset Our aim in this experiment is to quantitati vely assess the quality of movement performed by the impaired subjects during repetitiv e task therapy . The experimental data was collected using a heavy marker-based system ( 14 markers on the right hand, arm and torso) in a hospital setting from 15 impaired subjects performing reach and grasp mov e- ments to a deﬁned target. The stroke surviv ors were also ev aluated by the W olf Motor Function T est (WMFT) [25] on the day of recording, which ev aluates a subject’ s func- tional ability on a scale of 1 − 5 (with 5 being least impaired and 1 being most impaired) based on predeﬁned functional tasks. In our experiments, we only use the data corre- 0 10 20 30 40 50 0 20 40 0 0.005 0.01 0.015 0.02 0.025 0.03 Unimpaired Subjects (a) Unimpaired Subjects 0 10 20 30 40 50 0 20 40 0 0.005 0.01 0.015 0.02 0.025 0.03 Impaired Subjects (b) Impaired Subjects Figure 5. The mean persistence pdfs visualized as heat maps for unimpaired subjects (left) and stroke surviv ors (right), for the reach-and- grasp action, are shown. These means are computed as the extrinsic means (we choose extrinsic mean here for simplicity) on the Hilbert sphere using the proposed representation. The locations of the peaks are exactly the same, since the subjects perform the same general mov ement, howev er the intensity differs signiﬁcantly indicating that the quality of mov ement is captured using topological tools well. sponding to the single marker on the wrist from the heavy marker -based hospital system, which allows us to ev aluate the framew ork in a home-based rehabilitation system set- ting. W e utilize the WMFT scores as an approximate high- lev el quantitativ e measure for movement quality (ground- truth labels) of impaired subjects performing the reach and grasp mov ements. W e compute explicit features as described earlier using PGA. This allo ws us to represent each movement by a lo w- dimensional feature vector . W e use the WMFT scores and split the dataset into train and test to perform regression. Since the dataset is small, we use a leave-one-out approach, where we train on all b ut one sample, which is used for test- ing, and repeat this over all the samples in the dataset. The correlation performance with the WMFT score is sho wn in T able 2, and it can be seen that we are comparable to the state of the art, and much better than traditional features. The predicted scores are sho wn in Figure 6 compared to the original scores. The dynamical features proposed in [24] and [3], de- pend on describing the phase space for each mov ement. W e compute the topological features, which are expected to be much weak er than other characteristics such as shape. Howe ver , we are still able to capture subtle information re- garding movement quality . This is illustrated in Figure 5, where we see the 3D peaks associated with the av erage of persistence diagrams across subjects impaired with stroke and those who are unimpaired. It is clearly seen that since they are performing the same kind of mo vements, the peaks occur at the same location. Howe ver , the intensity is signif- icantly different across these diagrams, perhaps indicating information regarding mo vement quality . Method Correlation with WMFT KIM (14 markers) [6] 0.85 L yapunov exponent (1 marker) [18] 0.60 Proposed method (1 marker) 0.80 T able 2. Comparison of correlation coefﬁcient for dif ferent meth- ods using leave-one-subject-out cross-validation scheme and SVM regressor on the stroke rehabilitation dataset. Even with just a sin- gle marker , W e obtain comparable results to a clinical 14-marker system. 0 5 10 15 2.5 3 3.5 4 Subject Quality Score Expert Score Predicted Figure 6. Predicting movement quality scores for reach-and-grasp in stroke surviv ors using topological features. W e obtain a 0.8 correlation score with a p-value of 5 . 46 × 10 − 4 6. Discussion and Conclusion Based on the theory and experiments presented so far , it is instructiv e to compare the space of PDs with respect to the different distance metrics. W e will consider only the L 2 -W asserstein metric ( d L 2 ) and the proposed Hilbert sphere metric d H . The interpretation of PDs with respect to these two metrics is very different. d L 2 ( X, Y ) is the Earth Mov er’ s Distance between the PDs X and Y considered as scaled discrete probability mass functions (pmfs). Howe ver , d H ( ψ X , ψ Y ) is the geodesic distance between the square root of kernel density estimates of X and Y . Furthermore, the “length” of the persistence diagrams induced by these metrics are very different. For the W asserstein metric, this is giv en by d L 2 ( X, ∅ ) which can be arbitrarily high, whereas with the proposed metric it is constrained as the persistence diagrams li ve on a unit sphere. The most important dis- tinction arises when the pairwise distances between all the points in the two PDs are sufﬁciently high. When the 2- dimensional pdfs obtained from kernel density estimates of the two PDs do not o verlap anywhere, d H ( X, Y ) → 1 . This implies that if the variance of the kernel is sufﬁciently small, two PDs with non-overlapping points will always have d H close to 1 . This problem can be alleviated by using kernels with multiple scales for smoothing the PDs to obtain and combining the distances obtained at each scale. The comparison between the geodesics in the persis- tence space is also illuminating (Figure 3). Sampling of the geodesic in the Ale xandrov space shows that the points in one diagram move towards the corresponding points in the other, as we mo ve along the geodesic. Whereas, a sim- ilar sampling in the Hilbert sphere shows that the PDs in the middle of the geodesic contain the modes from pdfs corresponding to both the candidate PDs. This results in markedly different interpretations of the means computed from d L 2 and d H ev en for the case of two persistence dia- grams. Sev eral directions of future work emerge from this en- deav or . On the theoretical side a more formal understand- ing of the variety of metrics and their relationship to the proposed one can be considered. On the applied side, the fa vorable computational load reduction should open ne w applications of TD A for massiv e datasets. References [1] H. D. Abarbanel. Analysis of observed chaotic data . New Y ork: Springer-V erlag, 1996. [2] H. Adams, S. Chepushtanov a, T . Emerson, E. Hanson, M. Kirby , F . Motta, R. Neville, C. Peterson, P . Shipman, and L. Ziegelmeier . Persistent images: A stable vec- tor representation of persistent homology . ArXiv preprint arXiv:1507.06217 , 2015. [3] S. Ali, A. Basharat, and M. Shah. Chaotic in variants for human action recognition. In IEEE International Conference on Computer V ision , pages 1–8, Oct. 2007. [4] D. P . Bertsekas. A new algorithm for the assignment prob- lem. Mathematical Pr ogramming , 21(1):152–171, 1981. [5] P . Bubenik. Statistical topological data analysis using per- sistence landscapes. The Journal of Machine Learning Re- sear ch , 16(1):77–102, 2015. [6] Y . Chen, M. Duf f, N. Lehrer , H. Sundaram, J. He, S. L. W olf, and T . Rikakis. A computational frame work for quantitati ve ev aluation of movement during rehabilitation. In AIP Con- fer ence Pr oceedings-American Institute of Physics , volume 1371, pages 317–326, 2011. [7] Y . Chen, M. Duf f, N. Lehrer , H. Sundaram, J. He, S. L. W olf, T . Rikakis, T . D. Pham, X. Zhou, H. T anaka, et al. A com- putational framework for quantitative ev aluation of move- ment during rehabilitation. In AIP Conference Pr oceedings- American Institute of Physics , v olume 1371, page 317, 2011. [8] D. Cohen-Steiner, H. Edelsbrunner , J. Harer, and Y . Mileyko. Lipschitz functions ha ve l p-stable persistence. F oundations of Computational Mathematics , 10(2):127–139, 2010. [9] H. Edelsbrunner , D. Letscher , and A. Zomorodian. T opolog- ical persistence and simpliﬁcation. Discrete and Computa- tional Geometry , 28(4):511–533, 2002. [10] B. T . F asy , F . Lecci, A. Rinaldo, L. W asserman, S. Balakr- ishnan, A. Singh, et al. Conﬁdence sets for persistence dia- grams. The Annals of Statistics , 42(6):2301–2339, 2014. [11] P . T . Fletcher, C. Lu, S. M. Pizer , and S. C. Joshi. Princi- pal geodesic analysis for the study of nonlinear statistics of shape. IEEE T ransactions on Medical Imaging , 23(8):995– 1005, August 2004. [12] K. Grove and H. Karcher . How to conjugate C 1 -close group actions. Math.Z , 132:11–20, 1973. [13] M. K erber, D. Morozov , and A. Nigmetov . Geometry helps to compare persistence diagrams. In Pr oceedings of the Eighteenth W orkshop on Algorithm Engineering and Exper- iments (ALENEX) , pages 103–112, 2016. [14] G. Kusano, K. Fukumizu, and Y . Hiraoka. Persistence weighted gaussian kernel for topological data analysis. arXiv pr eprint arXiv:1601.01741 , 2016. [15] Y . Mileyko, S. Mukherjee, and J. Harer . Probability mea- sures on the space of persistence diagrams. In verse Pr ob- lems , 27(12):124007, 2011. [16] E. Munch, P . Bendich, K. Turner , S. Mukherjee, J. Mattingly , and J. Harer . Probabilistic Fr ´ echet means and statistics on vineyards. ArXiv pr eprint arXiv:1307.6530 , 2013. [17] J. Reininghaus, S. Huber , U. Bauer , and R. Kwitt. A stable multi-scale kernel for topological machine learning. In Pr o- ceedings of the IEEE Conference on Computer V ision and P attern Recognition , pages 4741–4748, 2015. [18] M. Rosenstein, J. Collins, and C. De Luca. A prac- tical method for calculating largest lyapunov exponents from small data sets. Physica D: Nonlinear Phenomena , 65(1):117–134, 1993. [19] A. Sri vasta va, I. Jermyn, and S. Joshi. Riemannian analysis of probability density functions with applications in vision. In IEEE Conference on Computer V ision and P attern Recog- nition , pages 1–8, 2007. [20] F . T akens. Detecting strange attractors in turbulence. Dy- namical Systems and T urbulence , 898:366–381, 1981. [21] K. T urner , Y . Mileyko, S. Mukherjee, and J. Harer . Fr ´ echet means for distributions of persistence diagrams. Discrete & Computational Geometry , 52(1):44–70, 2014. [22] A. V eeraraghavan, A. Srivasta va, A. K. Ro y-Chowdhury , and R. Chellappa. Rate-in variant recognition of humans and their activities. Image Processing , IEEE T ransactions on , 18(6):1326–1339, 2009. [23] V . V enkataraman, K. N. Ramamurthy , and P . T uraga. Per- sistent homology of attractors for action recognition. arXiv pr eprint arXiv:1603.05310 , 2016. [24] V . V enkataraman and P . T uraga. Shape distrib utions of non- linear dynamical systems for video-based inference. IEEE T ransactions on P attern Analysis and Machine Intelligence , 2016 (accepted) ; [25] S. L. W olf, P . A. Catlin, M. Ellis, A. L. Archer , B. Morgan, and A. Piacentino. Assessing wolf motor function test as outcome measure for research in patients after stroke. Stroke , 32(7):1635–1639, 2001. [26] A. Zomorodian. Fast construction of the V ietoris-Rips com- plex. Computers & Graphics , 34(3):263–271, 2010.

A Riemannian Framework for Statistical Analysis of Topological Persistence Diagrams

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment