Topological Data Analysis with ε-net Induced Lazy Witness Complex
Authors: Naheed Anjum Arafat, Debabrota Basu, Stéphane Bressan
Topological Data Analysis with ε-net Induced Lazy Witness Complex

Naheed Anjum Arafat¹, Debabrota Basu², Stéphane Bressan¹
¹ School of Computing, National University of Singapore, Singapore
² Department of Computer Science and Engineering, Chalmers University of Technology, Göteborg, Sweden

Abstract. Topological data analysis computes and analyses topological features of point clouds by constructing and studying a simplicial representation of the underlying topological structure. The enthusiasm that followed the initial successes of topological data analysis was curbed by the computational cost of constructing such simplicial representations. The lazy witness complex is a computationally feasible approximation of the underlying topological structure of a point cloud. It is built in reference to a subset of points, called landmarks, rather than considering all the points as in the Čech and Vietoris-Rips complexes. The choice and the number of landmarks dictate the effectiveness and efficiency of the approximation. We adopt the notion of ε-cover to define the ε-net. We prove that an ε-net, as a choice of landmarks, is an ε-approximate representation of the point cloud and that the induced lazy witness complex is a 3-approximation of the induced Vietoris-Rips complex. Furthermore, we propose three algorithms to construct ε-net landmarks. We establish the relationship of these algorithms with the existing landmark selection algorithms. We empirically validate our theoretical claims. We empirically and comparatively evaluate the effectiveness, efficiency, and stability of the proposed algorithms on synthetic and real datasets.

1 Introduction

Topological data analysis computes and analyses topological features of generally high-dimensional and possibly noisy datasets.
Topological data analysis is applied to eclectic domains, namely shape analysis [6], images [2,22], sensor network analysis [8], social network analysis [3,25,27,24], computational neuroscience [21], and protein structure study [30,19].

The enthusiasm that followed the initial successes of topological data analysis was curbed by the computational challenges posed by the construction of an exact simplicial representation, the Čech complex, of the point cloud. A simplicial representation facilitates the computation of basic topological objects such as simplicial complexes, filtrations, and persistent homologies. Thus, researchers devised approximations of the Čech complex as well as of its best possible approximation, the Vietoris-Rips complex [7,26,10]. One such computationally feasible approximate simplicial representation is the lazy witness complex [7]. The lazy witness complex is built in reference to a subset of points, called landmarks. The choice and the number of landmarks dictate the effectiveness and efficiency of the approximation.

We adopt the notion of ε-cover [18] from analysis to define and present the notions of ε-sample, ε-sparsity, and ε-net (Section 4) to capture bounds on the loss of topological features induced by the choice of landmarks. We prove that an ε-net is an ε-approximate representation of the point cloud with respect to the Hausdorff distance. We prove that the lazy witness complex induced by an ε-net, as a choice of landmarks, is a 3-approximation of the induced Vietoris-Rips complex. The ε-net allows us to provide such approximation guarantees for the lazy witness complex (Section 4.2), which were absent in the literature. Furthermore, we propose three algorithms to construct ε-nets as landmarks for point clouds (Section 5).
We establish their relationship with the existing landmark selection algorithms, namely random and maxmin [7]. We empirically and comparatively show that the size of the ε-net landmarks constructed by the algorithms varies inversely with ε and agrees with the known bound on the size of an ε-net [20].

We empirically and comparatively validate our claim on the topological approximation quality of the lazy witness complex induced by the ε-net landmarks (Section 6). Furthermore, we empirically and comparatively validate the effectiveness, efficiency, and stability of the proposed algorithms on representative synthetic point clouds as well as on a real dataset. Experiments confirm our claims by showing that the algorithms constructing ε-net landmarks are as effective as the existing maxmin algorithm. We also show that the ε-net landmarks are more stable than those selected by the maxmin and random algorithms, as the ε-net incurs a narrower confidence band in the persistence landscape topological descriptor. We conclude (Section 7) with the theoretical and experimental pieces of evidence that validate ε-nets as a stable and effective way to construct landmarks and to induce lazy witness complexes.

2 Related Works

Applications of TDA. TDA is applied in different domains, mostly on relatively small datasets and up to dimension 2, due to the computational intractability of the popular Čech and Vietoris-Rips complexes. [6] computed homology classes at dimension 0 for their proposed tangential filtration of point clouds of handwritten digits for image classification (dataset size ∼ 69–294). [28] used the persistence pairs at dimension 0 for segmenting meshes on benchmark mesh segmentation datasets (size ∼ 50000). Researchers applying TDA to network analysis focus on characterising networks using features computed from persistent homology classes.
[3] and [25] computed persistent homology at dimensions 0, 1, and 2 of the clique filtration to study weighted collaboration networks (size ∼ 36000) and weighted networks from different domains (size ∼ 54000), respectively. In biological networks, [11] clustered gene co-expression networks (size ∼ 400) based on distances between the Vietoris-Rips persistence diagrams computed on each network. In molecular biology, persistent homology reveals different conformations of proteins [30,19] based on the strength of the bonds of the molecules.

Fig. 1: Components of topological data analysis (input data, filtration, computation of topological features as persistent homology classes, representation of topological features as topological descriptors, aggregation of representations, and applications).

Topological Approximation. The computational infeasibility of the Čech complex and the Vietoris-Rips complex motivates the development of approximate simplicial representations such as the lazy witness complex, the sparse-Rips complex [26], and the graph induced complex (GIC) [10]. The sparse-Rips complex [26] perturbs the distance metric in such a way that, when the region covered by a point can be covered by its neighbouring points, that point can be deleted without changing the topology. Given a graph constructed on a point cloud as input, the graph induced complex is a simplicial complex built on a subset of vertices, where the vertices of a k-simplex are the nearest neighbours of the vertices of a k-clique [10]. Due to their computational benefits, the lazy witness complex and the graph induced complex have found applications in studying natural image statistics [7] and image classification [9].

Applications of ε-net.
The ε-net is a standard concept in analysis and topology [18], originating from the idea of (δ, ε)-limits formulated by Cauchy. Nets have been used in nearest-neighbour search [20]. [15] proposed the net-tree data structure to represent ε-nets at all scales of ε. The net-tree is used to construct approximate well-separated pair decompositions [15] and approximate geometric spanners [15]. The simplicial complexes in the graph induced complex are nets. The sparse-Rips filtration constructs a net-tree on the point cloud to decide which neighbouring points to delete. [14] used ε-nets for manifold reconstruction.

3 Topological Data Analysis

Topological data analysis is the study of computational models for the efficient and effective computation of topological features, such as persistent homology classes, from different datasets, and for the representation of those features using different topological descriptors, such as persistence barcodes, for further analysis and application [12,23]. In this section, we present the computational blocks of topological data analysis in Figure 1 and further describe each of the blocks.

Topological data analysis computes the topological features, such as persistent homology classes, by computing topological objects called simplicial complexes for a given dataset. A simplicial complex is constructed using simplices. Formally, a k-simplex is the convex hull of (k + 1) data points. For instance, a 0-simplex [v0] is a single point, a 1-simplex [v0v1] an edge, and a 2-simplex [v0v1v2] a filled triangle. A k-homology class is an equivalence class of such k-simplicial complexes that cannot be reduced to a lower-dimensional simplicial complex [12].
In order to compute the k-homology classes, a practitioner does not have direct access to the underlying space of the point cloud, and it is combinatorially hard to compute the exact simplicial representation, the Čech complex [31]. Thus, different approximations of the exact simplicial representation have been proposed: the Vietoris-Rips complex [17] and the lazy witness complex [7].

The Vietoris-Rips complex R_α(D), for a given dataset D and real number α > 0, is an abstract simplicial complex representation consisting of those k-simplices in which any two points v_i, v_j are at distance at most α. The Vietoris-Rips complex is the best possible (√2-)approximation of the Čech complex computable with present computational resources, and it is extensively used in the topological data analysis literature [23]. Thus, we use the Vietoris-Rips complex as the baseline representation in this paper. In the worst case, the number of simplices in the Vietoris-Rips complex grows exponentially with the number of data points [23,31]. The lazy witness complex [7] approximates the Vietoris-Rips complex by constructing the simplicial complexes over a subset of data points L, referred to as the landmarks.

Definition 1 (Lazy Witness Complex [7]). Given a positive integer ν and real number α > 0, the lazy witness complex LW_α(D, L, ν) of a dataset D is a simplicial complex over a landmark set L where, for any two points v_i, v_j of a k-simplex [v0 v1 ··· vk], there is a point w whose (d_ν(w) + α)-neighbourhood³ contains v_i and v_j.

In the worst case, the size of the lazy witness complex grows exponentially with the number of landmarks. A smaller number of landmarks facilitates computational acceleration but produces a worse approximation of the Vietoris-Rips complex, with loss of topological features.
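The edge rule of Definition 1 can be illustrated with a small brute-force sketch. The helper below and its signature are ours for illustration only (the lazy witness complex itself is the clique complex of this edge set); it assumes points given as Euclidean coordinate tuples.

```python
import math
from itertools import combinations

def lazy_witness_edges(points, landmarks, alpha, nu=1):
    """Edges of LW_alpha(D, L, nu): landmarks l_i, l_j form an edge iff
    some witness w in the point cloud has both l_i and l_j inside its
    (d_nu(w) + alpha)-neighbourhood, where d_nu(w) is the distance
    from w to its nu-th nearest landmark."""
    edges = set()
    for i, j in combinations(range(len(landmarks)), 2):
        for w in points:
            # d_nu(w): distance from witness w to its nu-th nearest landmark
            d_nu = sorted(math.dist(w, l) for l in landmarks)[nu - 1]
            if (math.dist(w, landmarks[i]) <= d_nu + alpha
                    and math.dist(w, landmarks[j]) <= d_nu + alpha):
                edges.add((i, j))
                break
    return edges
```

For instance, on the collinear cloud {(0,0), (1,0), (2,0), (10,0)} with landmarks (0,0), (2,0), (10,0) and α = 1, only the two nearby landmarks are joined: the point (1,0) witnesses that edge, while no point witnesses an edge to (10,0).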
Thus, the trade-off between the approximation of topological features and the available computational resources dictates the choice of landmarks.

As the value of the filtration parameter α increases, new simplices arrive and the topological features, i.e. the homology classes, start to appear. Some of the homology classes merge with the existing classes in a subsequent simplicial complex, and some of them persist indefinitely [12]. In order to capture the evolution of the topological structure with scale, topological data analysis techniques construct a sequence of simplicial complex representations, called a filtration [12], for an increasing sequence of α's. In a given filtration, the persistence interval of a homology class is denoted by [α_b, α_d), where α_b and α_d are the filtration values of its appearance and merging, respectively. The persistence interval of an indefinitely persisting homology class is denoted as [α_b, ∞).

Topological descriptors, such as the persistence diagram [12] and persistence landscapes [1], represent persistence intervals as points and functions, respectively, in order to draw qualitative and quantitative inferences about the topological features. Distance measures between persistence diagrams, such as the bottleneck and Wasserstein distances [12], are often used to draw quantitative inferences. The bottleneck distance between two diagrams is the smallest distance d for which there is a perfect matching between the points of the two diagrams such that any pair of matched points is at distance at most d [12]. The Wasserstein distance between two diagrams is the cost of the optimal matching between the points of the two diagrams [12].

³ d_ν(w) is the distance from the point w ∈ D to its ν-th nearest point in L.
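For intuition, the bottleneck distance admits a brute-force computation on very small diagrams: pad each diagram with the diagonal projections of the other diagram's points, then minimise, over perfect matchings, the maximum L∞ cost. The sketch below (our illustrative code, exponential in diagram size; not the tooling used in the experiments) follows that reduction.

```python
from itertools import permutations

def _diag(p):
    """Closest diagonal point (in L-infinity) to the diagram point p."""
    m = (p[0] + p[1]) / 2.0
    return (m, m)

def _linf(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def bottleneck(X, Y):
    """Brute-force bottleneck distance between two small persistence
    diagrams, each a list of (birth, death) pairs; points may be matched
    to the diagonal."""
    A = list(X) + [_diag(q) for q in Y]  # X padded with projections of Y
    B = list(Y) + [_diag(p) for p in X]  # Y padded with projections of X
    best = float("inf")
    for perm in permutations(range(len(B))):
        cost = 0.0
        for i, j in enumerate(perm):
            if i >= len(X) and j >= len(Y):
                continue  # diagonal matched to diagonal: free
            cost = max(cost, _linf(A[i], B[j]))
        best = min(best, cost)
    return best
```

For example, a lone class with interval [0, 2) compared against an empty diagram is matched to the diagonal at cost 1, its persistence halved; matched against [0, 3), pairing the two points directly costs 1 as well.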
4 ε-net

As we discussed in Section 3, topological data analysis of a dataset begins with the computation of simplicial complex representations. Though the Vietoris-Rips complex is the best possible approximation of the Čech complex, it incurs an exponential computational cost with respect to the size of the point cloud. Thus, the lazy witness complex is often used as a practical solution for scalable topological data analysis. The computation of the lazy witness complex depends on the selection of landmarks. The selection of landmarks dictates the trade-off between effectiveness, i.e. the quality of the approximation of topological features, and efficiency, i.e. the computational cost of computing the lazy witness complex.

The ε-cover is a construction used in topology to compute the inherent properties of a given space [18]. In this paper, we import the concept of ε-cover to define the ε-net of a point cloud. We use the ε-net of the point cloud as the landmarks for constructing the lazy witness complex. We show that an ε-net, as a choice of landmarks, has guarantees such as being an ε-approximate representation of the point cloud, its induced lazy witness complex being a 3-approximation of its induced Vietoris-Rips complex, and a bound on the number of landmarks for a given ε. These guarantees are absent for the other existing landmark selection algorithms (Section 5), such as the random and maxmin algorithms.

4.1 ε-net of a Point Cloud

In this section, we define the ε-net of a point cloud in reference to the ε-cover used in topology. An ε-cover is a set of subsets of a point cloud in a Euclidean space such that these subsets together cover the point cloud, but none of the subsets has a diameter of more than ε.

Definition 2 (ε-cover [18]). An ε-cover of a point cloud P is a set of P_i's such that P_i ⊆ P, P = ∪_i P_i, and the diameter⁴ of P_i is at most ε ≥ 0 for all i.
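Definition 2 can be checked directly on small point clouds. The sketch below (illustrative function names, Euclidean distance assumed) verifies that the candidate parts are subsets of P, jointly cover P, and each has diameter at most ε.

```python
import math
from itertools import combinations

def diameter(part):
    """Largest pairwise distance within a part (0 for a singleton)."""
    return max((math.dist(x, y) for x, y in combinations(part, 2)), default=0.0)

def is_eps_cover(P, parts, eps):
    """Definition 2: every part is a subset of P, the parts jointly
    cover P, and no part has diameter exceeding eps."""
    P = set(P)
    covered = set().union(*parts) if parts else set()
    return (all(set(part) <= P for part in parts)
            and covered == P
            and all(diameter(part) <= eps for part in parts))
```

On the four collinear points {0, 1, 5, 6} on the x-axis, the two pairs {0, 1} and {5, 6} form a 1-cover but not a 0.5-cover, and either pair alone fails because it does not cover P.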
When the sets in the 2ε-cover of P are Euclidean balls of radius ε, the set of centres of the balls is termed an ε-sample of the set P.

Definition 3 (ε-sample [14]). A set L ⊆ P is an ε-sample of P if the collection {B_ε(x) : x ∈ L} of balls of radius ε covers P, i.e. P = ∪_{x∈L} B_ε(x).

⁴ The diameter diam(P_i) of a set P_i ⊆ P is defined as the largest distance d(x, y) between any two points x, y ∈ P_i.

According to the definition of ε-sample, P is an ε-sample of itself for ε = 0. To decrease the computational expense further, it is desirable for an ε-sample to be sparse, that is, to contain as few points as possible. An ε-sparse subset of P is a subset in which any two points are at least ε apart from each other.

Definition 4 (ε-sparse). A set L ⊂ P is ε-sparse if for all x, y ∈ L, d(x, y) > ε.

An ε-net of a set P is an ε-sparse subset of P which is also an ε-sample of P.

Definition 5 (ε-net [18]). Let (P, d) be a metric space and ε ≥ 0. A subset L ⊂ P is called an ε-net of P if L is ε-sparse and an ε-sample of P.

4.2 Properties of ε-nets

An ε-net of a point cloud comes with approximation guarantees irrespective of its algorithmic construction. An ε-net of a point cloud of diameter Δ in the Euclidean space R^D is an ε-approximation of the point cloud in Hausdorff distance. The lazy witness complex induced by an ε-net is a 3-approximation of the Vietoris-Rips complex on that ε-net. Furthermore, the size of an ε-net is at most (Δ/ε)^θ(D) [20]. Here, we establish the first two approximation guarantees of ε-nets theoretically. In Section 6, we validate the last two properties empirically for the proposed algorithms for constructing ε-nets.

Point-cloud Approximation Guarantee of an ε-net. We use Lemma 1 to prove that the ε-net of a point cloud P is an ε-approximate representation of that point cloud in Hausdorff distance.

Lemma 1.
Let L be an ε-net of a point cloud P. For any point p ∈ P, there exists a point q ∈ L such that the distance d(p, q) ≤ ε.

Theorem 1. The Hausdorff distance between the point cloud P and its ε-net L ⊆ P is at most ε.

Proof. For any p ∈ P, there exists a landmark l ∈ L such that d(p, l) ≤ ε, by Lemma 1. Hence min_{l∈L} d(p, l) ≤ ε, and thus max_{p∈P} min_{l∈L} d(p, l) ≤ ε. Conversely, for any l ∈ L, there exists a point p ∈ P such that d(l, p) ≤ ε, since l itself belongs to P. Thus, max_{l∈L} min_{p∈P} d(l, p) ≤ ε. Hence the Hausdorff distance d_H(P, L) between P and L, defined as the maximum of max_{l∈L} min_{p∈P} d(l, p) and max_{p∈P} min_{l∈L} d(p, l), is bounded by ε.

Topological Approximation Guarantee of an ε-net Induced Lazy Witness Complex. In addition to an ε-net being an ε-approximation of the point cloud, we prove that the lazy witness complex induced by the ε-net landmarks is a good approximation (Theorem 2) of the Vietoris-Rips complex on the landmarks. This approximation ratio is independent of the algorithm constructing the ε-net. As a step towards Theorem 2, we state Lemma 2, which follows from the definitions of the lazy witness complex and the ε-sample. Lemma 2 bounds the distance from any point to its 1-nearest landmark in an ε-net.

Lemma 2. If L is an ε-net landmark set of a point cloud P, then the distance d(p, p′) from any point p ∈ P to its 1-nearest neighbour p′ ∈ L is at most ε.

Theorem 2 shows that the lazy witness complex induced by the landmarks in an ε-net is a 3-approximation of the Vietoris-Rips complex on those landmarks above the value 2ε of the filtration parameter.

Theorem 2.
If L is an ε-net of the point cloud P for ε ∈ R⁺, LW_α(P, L, ν = 1) is the lazy witness complex of L at filtration value α, and R_α(L) is the Vietoris-Rips complex of L at filtration value α, then R_{α/3}(L) ⊆ LW_α(P, L, 1) ⊆ R_{3α}(L) for α ≥ 2ε.

Proof. In order to prove the first inclusion, consider a k-simplex σ_k = [x_0 x_1 ··· x_k] ∈ R_{α/3}(L). For any edge [x_i x_j] ∈ σ_k, let w_t be the point in P that is nearest to the vertices of [x_i x_j], and, without loss of generality, let the point corresponding to that vertex be x_j. Since w_t is the nearest neighbour of x_j, by Lemma 2, d(w_t, x_j) ≤ ε ≤ α/2. Since [x_i x_j] ∈ R_{α/3}(L), d(x_i, x_j) ≤ α/3 < α/2. By the triangle inequality, d(w_t, x_i) ≤ α/2 + α/2 ≤ α. Hence, x_i is within distance α from w_t. The α-neighbourhood of the point w_t contains both x_i and x_j. Since d_1(w_t) > 0, the (d_1(w_t) + α)-neighbourhood of w_t also contains x_i and x_j. Therefore, [x_i x_j] is an edge in LW_α(P, L, 1). Since the argument is true for any x_i, x_j ∈ σ_k, the k-simplex σ_k ∈ LW_α(P, L, 1).

In order to prove the second inclusion, consider a k-simplex σ_k = [x_0 x_1 ··· x_k] ∈ LW_α(P, L, 1). By the definition of the lazy witness complex, for any edge [x_i x_j] of σ_k there is a witness w ∈ P such that the (d_1(w) + α)-neighbourhood of w contains both x_i and x_j. Hence, d(w, x_i) ≤ d_1(w) + α ≤ ε + α (by Lemma 2) ≤ 3α/2. Similarly, d(w, x_j) ≤ 3α/2. By the triangle inequality, d(x_i, x_j) ≤ 3α. Therefore, [x_i x_j] is an edge in R_{3α}(L). Since the argument is true for any x_i, x_j ∈ σ_k, the k-simplex σ_k ∈ R_{3α}(L).

Discussion. Theorem 2 implies that the interleaving of the lazy witness filtration LW = {LW_α(L)} and the Vietoris-Rips filtration R = {R_α(L)} occurs for α > 2ε.
As a consequence, the weak-stability theorem [4] implies that the bottleneck distance between the partial persistence diagrams Dgm_{>2ε}(LW) and Dgm_{>2ε}(R) is upper bounded by 3 log 3. In Section 6, we empirically validate this bound.

Size of an ε-net. The size of an ε-net depends on ε, the diameter of the point cloud, and the dimension of the underlying space [20,15]. If a point cloud P ⊂ R^D has diameter Δ, the size of an ε-net of P is (Δ/ε)^θ(D) [20]. The size of an ε-net does not depend on the size of the point cloud. In Section 6, we empirically validate this bound for the ε-net landmarks generated by the proposed algorithms.

The framework of ε-nets, along with its approximation guarantees, leads to the question of its algorithmic construction as landmarks.

5 Construction of an ε-net

The naïve algorithm [16] to construct an ε-net selects the first point l_1 uniformly at random. In the i-th iteration, it marks the points at a distance less than ε from the previously selected landmark l_{i−1} as covered, and selects a new point l_i from the unmarked points arbitrarily, until all points are marked [15]. The fundamental principle is to choose, at each iteration, a new landmark from the set of yet-to-be-covered points so that it retains the ε-net property. We propose three algorithms in which this choice determines the algorithm.

5.1 Three Algorithms: ε-net-rand, ε-net-maxmin, and (ε, 2ε)-net

The algorithm ε-net-rand, at each iteration, marks the points at a distance less than ε from the previously chosen landmark as covered and chooses a new landmark uniformly at random from the unmarked points. The algorithm ε-net-maxmin, at each iteration, marks the points at a distance less than ε from the previously chosen landmark as covered and chooses the unmarked point farthest from the already chosen landmarks.
It terminates when the distance to the farthest unmarked point is no more than ε. The algorithm (ε, 2ε)-net, at each iteration, marks the points at a distance less than ε from the previously chosen landmark as covered, and chooses a landmark uniformly at random from those unmarked points whose distance to the previously chosen landmark is at most 2ε. If there is no unmarked point at a distance in between ε and 2ε from the previous landmark, it searches for unmarked points at a distance between 2ε and 4ε, 4ε and 8ε, and so on, until it either finds one to continue as before or all points are marked. The pseudocode of (ε, 2ε)-net is in Algorithm 1.

(ε, 2ε)-net attempts to cover the point cloud with intersecting balls of radius ε, whereas ε-net-maxmin attempts to cover the point cloud with non-intersecting balls of radius ε. ε-net-rand does not maintain any invariant. ε-net-rand and (ε, 2ε)-net have time complexities of O(1/ε^D) and O((1/ε^D) log(1/ε)), respectively. Their run-time does not depend on the size of the input point cloud. On the other hand, the run-time of ε-net-maxmin depends on the size of the point cloud, as it has to search for the point farthest from the landmarks at each iteration. On a point cloud of size n, ε-net-maxmin has O(n/ε^D) time complexity.

5.2 Connecting ε-net to Random and Maxmin Algorithms

De Silva et al. [7] proposed the random and maxmin algorithms for point clouds.

Random. Given a point cloud P, the algorithm random selects |L| points uniformly at random from the set of points P. This algorithm is closely related to ε-nets. Given the number of landmarks K > 1, the set of landmarks selected by random is δ-sparse, where δ is the minimum of the pairwise distances among the landmarks. However, the same choice of K may not necessarily make the landmarks a δ-sample of the point cloud.
The ε-net-rand algorithm is a modification of random that takes ε as a parameter instead of K and uses ε to put a constraint on the domain of random choices. It continues to select landmarks until all points are marked, to ensure the ε-sample property. The proof sketch of the fact that the constructed landmarks are ε-sparse and an ε-sample is as follows:

Proof of Correctness. The ε-net-rand algorithm does not terminate until all points are marked as covered. Hence the set of landmarks selected by ε-net-rand is an ε-sample, since otherwise there would have been unmarked points. The pairwise distance between any two landmarks cannot be less than ε; otherwise, one of them would have been marked by the other, and a marked point would not be a landmark. Hence the set of landmarks selected by ε-net-rand is ε-sparse.

Algorithm 1 Algorithm (ε, 2ε)-net
Input: Point cloud P = {p_1, p_2, ··· , p_n}, n × n distance matrix D, parameter ε.
Output: Set of landmarks L.
1: Select the initial landmark l_1 uniformly at random from P.
2: Initialize L = {l_1}.
3: Let N_1(ε, 2ε) be the set of points at a distance between ε and 2ε from l_1.
4: Initialize the candidate landmarks C_1 = N_1(ε, 2ε).
5: i = 1.
6: repeat
7:   Let N_i^{≤ε} be the set of points at a distance less than ε from l_i.
8:   Mark all the points in N_i^{≤ε} as covered.
9:   Let C_i^u be the set of unmarked points in C_i.
10:  if C_i^u is empty then
11:    Find the first δ ∈ {1, 2, ··· , log⌈Δ/2ε⌉} for which N_i(2^δ ε, 2^{δ+1} ε) contains an unmarked point.
12:    Set C_i = C_i ∪ N_i(2^δ ε, 2^{δ+1} ε) and update C_i^u accordingly.
13:  end if
14:  Select l_{i+1} uniformly at random from C_i^u.
15:  Insert l_{i+1} into L.
16:  C_{i+1} = C_i ∪ N_{i+1}(ε, 2ε).
17:  i = i + 1.
18: until all points are marked

Maxmin. The maxmin algorithm selects the first landmark l_1 uniformly at random from the set of points P.
Following that, it selects, at each step, the point farthest from the present set of landmarks, until a given number of landmarks, say |L|, are chosen. If L_{i−1} = {l_1, l_2, ..., l_{i−1}} is the set of already chosen landmarks, it selects as the i-th landmark the point u ∈ P \ L_{i−1} that maximises the minimum distance to the present set of landmarks L_{i−1}. Mathematically,

l_i ≜ arg max_{u ∈ P \ L_{i−1}} min_{v ∈ L_{i−1}} d(u, v).

The maxmin algorithm selects landmarks such that the point cloud is covered as vastly as possible.

The maxmin algorithm is closely related to the ε-net. Given the number of landmarks K > 1, the set of landmarks selected by maxmin is δ-sparse, where δ is the minimum of the pairwise distances among the chosen landmarks. However, that choice of K may not necessarily make the landmarks a δ-sample of the point cloud. The ε-net-maxmin algorithm is a modification of maxmin that takes ε as a parameter instead of K and uses ε to control sparsity among the landmarks. It terminates when the minimum of the pairwise distances among the landmarks drops below ε, to ensure the ε-sample property of the chosen landmarks. The proof sketch of the resulting landmarks being ε-sparse and an ε-sample is as follows:

Proof of Correctness. The ε-net-maxmin algorithm, at each iteration, selects as a landmark only the point whose minimum distance to the other landmark points is the largest among all unmarked points. If such a point's minimum distance to the other landmark points is no more than ε, the algorithm terminates. Hence the set of landmarks selected by ε-net-maxmin must be ε-sparse. A point that is not a landmark must already be covered by some landmark point.
Otherwise, its minimum distance to the landmark set would have been at least ε, and hence it would have been the only unmarked point available to be selected as a new landmark by ε-net-maxmin. Therefore, the set of landmarks selected by ε-net-maxmin is also an ε-sample of the point cloud.

6 Empirical Performance Evaluation

We implement the pipeline illustrated in Figure 1 to empirically validate our theoretical claims, as well as the effectiveness, efficiency, and stability of the algorithms that construct ε-net landmarks, compared to those of the random and maxmin algorithms. We test and evaluate these algorithms on two synthetic point cloud datasets, namely Torus and Tangled-torus, and a real-world point cloud dataset, namely 1grm. On each input point cloud, we compute the lazy witness filtration and the Vietoris-Rips filtration induced by the landmarks, as well as the Vietoris-Rips filtration induced by the point cloud.

On each dataset, as we vary the parameter ε of the algorithms constructing ε-nets, we study the relationship between ε and the number of landmarks, the quality of the topological features approximated by the lazy witness filtration induced by those landmarks, as well as the stability of those approximated features. As the algorithms maxmin and random require the number of landmarks a priori, for the sake of comparison, we use the same number of landmarks as that of the corresponding ε-net algorithm for a given ε.

We compute the quality of the features approximated by an algorithm in terms of the 1-Wasserstein distance between the lazy witness filtration induced by the landmarks selected by that algorithm and a Vietoris-Rips filtration on the same dataset. As there are elements of randomness in the algorithms, we run each experiment 10 times and compute distances averaged over the runs.
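The ε-net-maxmin selection described in Section 5 can be sketched as follows. This is a minimal Python sketch assuming a Euclidean point cloud given as coordinate tuples; the function name and structure are ours for illustration, not the Matlab implementation used in the experiments.

```python
import math
import random

def eps_net_maxmin(points, eps, seed=0):
    """Sketch of eps-net-maxmin: start from a random point, then
    repeatedly add the point farthest from the current landmarks;
    stop once every point lies within eps of some landmark."""
    rng = random.Random(seed)
    landmarks = [points[rng.randrange(len(points))]]
    # nearest[i]: distance from points[i] to its closest landmark so far
    nearest = [math.dist(p, landmarks[0]) for p in points]
    while True:
        far = max(range(len(points)), key=nearest.__getitem__)
        if nearest[far] <= eps:          # farthest point already covered
            return landmarks
        landmarks.append(points[far])    # new landmark is > eps from all others
        nearest = [min(d, math.dist(p, points[far])) for d, p in zip(nearest, points)]
```

Every accepted landmark lies more than ε from all earlier landmarks, so the output is ε-sparse, and the stopping rule guarantees the ε-sample property, matching the correctness argument of Section 5.2.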
We compute the stability of the features approximated by the algorithms in terms of the 95% confidence band corresponding to the rank-1 persistence landscape, using bootstrap [5]. We use the persistence landscape to validate the stability of the filtrations because, unlike persistence diagrams and barcodes, two sets of persistence landscapes always have a unique mean, and, by the strong law of large numbers, the empirical mean landscape of a sufficiently large collection converges to its expected landscape [1].

6.1 Datasets and Experimental Setup

Datasets. We use the datasets illustrated in Figure 2 for experimentation. The dataset Torus is a point cloud of size 500 sampled uniformly at random from the surface of a torus in R³. The torus has a major radius of 2.5 and a minor radius of 0.5. The dataset Tangled-torus is a point cloud of size 1000 sampled uniformly at random from two tori tangled with each other in R³. Both tori have a major radius of 2.5 and a minor radius of 0.5. The dataset 1grm is the conformation of the gramicidin-A protein. It has a helical shape. Gramicidin-A has two disconnected chains of monomers consisting of 272 atoms.

Fig. 2: (left) Torus, (middle) Tangled-torus, and (right) 1grm datasets.

Experimental Setup. We implement the experimental workflow in Matlab 2018a (with an 80GB memory limit). All experiments are run on a machine with an Intel(R) Xeon(R) @ 2.20GHz CPU and 196 GB of memory. We use the Javaplex library [29] to construct lazy witness filtrations and to compute their persistence intervals. We use the Ripser library to construct the Vietoris-Rips filtrations and to compute their persistence intervals. We use the R-TDA package [13] to compute bottleneck and Wasserstein distances, and the 95% confidence bands for the landscapes. We set the lazy witness parameter ν = 1 in all computations.
6.2 Validation of Theoretical Claims

Number of Landmarks Generated by the ε-net Algorithms. In Figure 3, we illustrate the relation between the number of landmarks generated by the ε-net algorithms and ε on the Torus dataset. Each algorithm is run 10 times for each ε, and the mean and standard deviation are plotted. We observe that the number of landmarks decreases as ε increases. We also observe that, for a fixed ε, the average number of landmarks selected by the ε-net algorithms is more or less stable across different algorithms. We use the number of landmarks of ε-net-maxmin to fit a curve with the values Δ = 5.9 (the diameter of Torus) and coefficient θ(D) = 1.73 (found from fitting with 95% confidence). This observation supports the theoretical bound of (Δ/ε)^θ(D).

Topological Approximation Guarantee. In order to validate Theorem 2 on the dataset Torus, we compute the bottleneck distance between the persistence diagram of the lazy witness filtration and that of the Vietoris-Rips filtration induced by the ε-net landmarks for different values of ε. For each ε and algorithm, we generate 10 sets of ε-net landmarks, compute their corresponding persistence diagrams, and plot the mean and standard deviation of the bottleneck distances in Figure 4. Since the theorem is valid for α ≥ 2ε, we exclude the homology classes born below 2ε before the distance computation. The algorithms satisfy the bound, as the distances are always less than the theoretical bound of 3 log 3.

Fig. 3: Number of landmarks generated by the ε-net algorithms (ε-net-maxmin, ε-net-rand, (ε,2ε)-net) vs. ε on the Torus dataset, with the fitted curve for θ(D) = 1.73.
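The fit reported above can be reproduced by ordinary least squares on the log-log scale, since N(ε) ≈ (Δ/ε)^θ(D) is linear in log(Δ/ε). A minimal sketch (our own, using synthetic counts in place of the measured ones):

```python
import numpy as np

def fit_theta(eps_values, landmark_counts, diameter):
    """Fit theta in N(eps) ~ (diameter/eps)**theta by least squares
    on the log-log scale; returns the fitted exponent (the slope)."""
    x = np.log(diameter / np.asarray(eps_values, dtype=float))
    y = np.log(np.asarray(landmark_counts, dtype=float))
    theta, _intercept = np.polyfit(x, y, 1)
    return theta
```

With the measured landmark counts of ε-net-maxmin and Δ = 5.9, this procedure corresponds to the fit that yields θ(D) = 1.73 in Figure 3.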
Fig. 4: Topological approximation guarantee of the ε-nets constructed by the algorithms (ε-net-maxmin, ε-net-rand, (ε,2ε)-net; bottleneck distances in dimensions 0, 1, and 2) on the Torus dataset.

Since the plots on the other datasets support these claims, for the sake of brevity, we omit them.

6.3 Effectiveness and Efficiency of Algorithms Constructing ε-nets

For each ε, we compute the 1-Wasserstein distance between the persistence diagrams of the lazy witness filtration induced by each set of ε-net landmarks and that of the Vietoris-Rips filtration induced by the whole point cloud. We compute the mean distance and mean CPU-time across 10 runs. Unlike the ε-net algorithms, the existing landmark selection algorithms take the number of landmarks as input. Since the average number of landmarks selected by the ε-net algorithms does not vary much across different algorithms (Figure 3), for each ε, we take the same number of ε-net-maxmin landmarks as the parameter to select the random and maxmin landmarks.

Figure 5 illustrates the results on the Torus dataset. We observe that maxmin performs well in dimensions 0 and 2, whereas (ε,2ε)-net has competitive effectiveness. In dimension 1, we observe that ε-net-maxmin achieves the lowest minimum, whereas random achieves the highest minimum. All the ε-net algorithms have two local minima, the first at around ε = 0.5 and the second between ε = 2 and ε = 4. The first local minimum is due to the minor radius. As for the second local minimum, it suffices to cover either the inner diameter of 5 or the outer diameter of 6 to capture the cycle, and a 2.5- to 3-sparse sample suffices to do so. The performance of the maxmin and random landmarks is not as explainable as that of the ε-net landmarks.
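The 1-Wasserstein distance between two persistence diagrams used above is a minimum-cost matching in which unmatched points may be sent to the diagonal. The sketch below (our own, with the L-infinity ground metric and scipy's Hungarian solver) illustrates the quantity being reported; it is not the R-TDA implementation used in the experiments.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def wasserstein_1(diag_a, diag_b):
    """1-Wasserstein distance between persistence diagrams, given as
    arrays of (birth, death) pairs, with the L-infinity ground metric.
    Each diagram is augmented with diagonal slots so that any point
    may instead be matched to the diagonal at half its persistence."""
    A, B = np.atleast_2d(diag_a), np.atleast_2d(diag_b)
    n, m = len(A), len(B)
    C = np.zeros((n + m, n + m))
    # point-to-point block: L-infinity distances
    C[:n, :m] = np.abs(A[:, None, :] - B[None, :, :]).max(axis=2)
    # point-to-diagonal blocks: half the persistence of the point
    C[:n, m:] = ((A[:, 1] - A[:, 0]) / 2.0)[:, None]
    C[n:, :m] = ((B[:, 1] - B[:, 0]) / 2.0)[None, :]
    # diagonal-to-diagonal block stays zero
    rows, cols = linear_sum_assignment(C)
    return C[rows, cols].sum()
```

For example, matching the single interval (0, 2) to (0, 2.5) directly costs 0.5, which is cheaper than sending both to the diagonal (1.0 + 1.25), so the distance is 0.5.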
In terms of efficiency, we observe that the (ε,2ε)-net algorithm has the lowest run-time among all the ε-net algorithms. The (ε,2ε)-net algorithm thus offers competitive effectiveness and the best efficiency among the proposed algorithms. Figure 6 illustrates the results for the 1grm dataset. We observe that (ε,2ε)-net achieves the smallest loss in dimensions 0 and 1. In dimension 2, maxmin achieves the smallest loss. ε-net-rand takes the smallest CPU-time among all the ε-net algorithms. We observe the same effectiveness of (ε,2ε)-net and efficiency of ε-net-rand in the results on the Tangled-torus dataset. We omit these plots due to space limitations.

Fig. 5: Effectiveness (mean 1-Wasserstein distance in dimensions 0, 1, and 2) and efficiency (mean CPU-time) of (a) the ε-net algorithms and (b) the existing algorithms on the Torus dataset.
Fig. 6: Effectiveness (mean 1-Wasserstein distance in dimensions 0, 1, and 2) and efficiency (mean CPU-time) of (a) the ε-net algorithms and (b) the existing algorithms on the 1grm dataset.
Fig. 7: 95% confidence band of the rank 1 persistence landscape at dimension 1 of the lazy witness filtrations induced by the landmark selection algorithms on the Tangled-torus dataset, for ε = 0.8, 0.9, 1, and 1.1 (with median numbers of landmarks 53, 41, 31, and 25, respectively, for maxmin and random).

Despite providing better efficiency and equivalent effectiveness on the datasets under study, the performance of the maxmin algorithm is less predictable and less explainable than that of the ε-net algorithms. Among the ε-net algorithms, (ε,2ε)-net has better effectiveness at the expense of a little loss in efficiency, whereas ε-net-rand has better efficiency than the others, with effectiveness comparable to ε-net-maxmin.

6.4 Stability of the ε-net Landmarks

In Figure 7, we vary ε and plot the rank 1 persistence landscape at dimension 1 and its 95% confidence band corresponding to the lazy witness filtration induced by the different landmark selection algorithms. For maxmin and random, we take the same number of landmarks as in the corresponding ε-net-maxmin landmarks.
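The rank k persistence landscape evaluated here maps a diagram to the k-th largest of the tent functions min(t - b, d - t), clipped at zero; rank 1 therefore tracks the single most persistent class. A minimal evaluation sketch (our own illustration, not the R-TDA implementation):

```python
def landscape(diagram, t, rank=1):
    """Evaluate the rank-k persistence landscape of a diagram
    (iterable of (birth, death) pairs) at filtration value t:
    the k-th largest of max(0, min(t - b, d - t)) over the intervals."""
    tents = sorted(
        (max(0.0, min(t - b, d - t)) for b, d in diagram),
        reverse=True,
    )
    return tents[rank - 1] if rank <= len(tents) else 0.0
```

Unlike diagrams or barcodes, these functions live in a vector space, which is why their pointwise means and bootstrap confidence bands are well defined.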
The rank 1 persistence landscape is a functional representation of the most persistent homology class, which we observe in Figure 7 in the form of a peak for all the algorithms. The x-axis represents the value of the filtration parameter and the y-axis represents the function values. We observe that ε-net-maxmin has confidence bands similar to those of maxmin, whereas the confidence bands of ε-net-rand and (ε,2ε)-net are often narrower than those of both maxmin and random. Random has the widest confidence band of all; the confidence bands of maxmin lie between these two extremes. This observation implies that the ε-net algorithms are more stable than the existing algorithms. We observe similar stability results on the other datasets, which we omit due to space limitations.

7 Conclusion

We use the notion of ε-net to capture bounds on the loss of topological features in the induced lazy witness complex. We prove that an ε-net is an ε-approximation of the original point cloud and that the lazy witness complex induced by an ε-net is a 3-approximation of the Vietoris-Rips complex on the landmarks for values of the filtration parameter beyond 2ε. Such a quantification of the approximation for the lazy witness complex was absent in the literature and is not derivable for algorithms limiting the number of landmarks. We propose three algorithms to construct ε-net landmarks. We show that the proposed ε-net-rand and ε-net-maxmin algorithms are variants of the algorithms random and maxmin, respectively, which ensures that the constructed landmarks are an ε-sample of the point cloud. We empirically and comparatively show that the sizes of the landmark sets that our algorithms construct agree with the bound on the size of an ε-net. We empirically validate our claim on the topological approximation guarantee by showing that, beyond the filtration value 2ε, the bottleneck distances are bounded by 3 log 3.
Furthermore, we empirically and comparatively validate the effectiveness, efficiency, and stability of the proposed algorithms on representative synthetic point clouds as well as a real dataset. The experiments confirm our claims by showing equivalent effectiveness of the algorithms constructing ε-net landmarks, at the cost of a small decrease in efficiency, while offering better stability.

Acknowledgement

This work is supported by the National University of Singapore Institute for Data Science project WATCHA: WATer CHallenges Analytics.

References

1. Bubenik, P.: Statistical topological data analysis using persistence landscapes. The Journal of Machine Learning Research 16(1), 77–102 (2015)
2. Carlsson, G., Ishkhanov, T., de Silva, V., Zomorodian, A.: On the local behavior of spaces of natural images. International Journal of Computer Vision 76(1), 1–12 (2008)
3. Carstens, C.J., Horadam, K.J.: Persistent homology of collaboration networks. Mathematical Problems in Engineering 2013(6), 1–7 (2013)
4. Chazal, F., Cohen-Steiner, D., Glisse, M., Guibas, L.J., Oudot, S.Y.: Proximity of persistence modules and their diagrams. In: Proceedings of the twenty-fifth annual symposium on Computational geometry. pp. 237–246. ACM (2009)
5. Chazal, F., Fasy, B., Lecci, F., Michel, B., Rinaldo, A., Wasserman, L.: Subsampling methods for persistent homology. In: International Conference on Machine Learning. pp. 2143–2151 (2015)
6. Collins, A., Zomorodian, A., Carlsson, G., Guibas, L.J.: A barcode shape descriptor for curve point cloud data. Computers & Graphics 28(6), 881–894 (2004)
7. De Silva, V., Carlsson, G.: Topological estimation using witness complexes. In: Proceedings of the First Eurographics conference on Point-Based Graphics. pp. 157–166. Eurographics Association (2004)
8.
De Silva, V., Ghrist, R., et al.: Coverage in sensor networks via persistent homology. Algebraic & Geometric Topology 7(1), 339–358 (2007)
9. Dey, T., Mandal, S., Varcho, W.: Improved image classification using topological persistence. In: Proceedings of the conference on Vision, Modeling and Visualization. pp. 161–168. Eurographics Association (2017)
10. Dey, T.K., Fan, F., Wang, Y.: Graph induced complex on point data. Computational Geometry 48(8), 575–588 (2015)
11. Duman, A.N., Pirim, H.: Gene coexpression network comparison via persistent homology. International Journal of Genomics 2018 (2018)
12. Edelsbrunner, H., Harer, J.: Computational topology: an introduction. American Mathematical Society (2010)
13. Fasy, B.T., Kim, J., Lecci, F., Maria, C.: Introduction to the R package TDA. arXiv preprint arXiv:1411.1830 (2014)
14. Guibas, L.J., Oudot, S.Y.: Reconstruction using witness complexes. Discrete & Computational Geometry 40(3), 325–356 (2008)
15. Har-Peled, S., Mendel, M.: Fast construction of nets in low-dimensional metrics and their applications. SIAM Journal on Computing 35(5), 1148–1184 (2006)
16. Har-Peled, S., Raichel, B.: Net and prune: A linear time algorithm for Euclidean distance problems. Journal of the ACM (JACM) 62(6), 44 (2015)
17. Hausmann, J.C., et al.: On the Vietoris-Rips complexes and a cohomology theory for metric spaces. Ann. Math. Studies 138, 175–188 (1995)
18. Heinonen, J.: Lectures on analysis on metric spaces. Springer Science & Business Media (2012)
19. Kovacev-Nikolic, V., Bubenik, P., Nikolić, D., Heo, G.: Using persistent homology and dynamical distances to analyze protein binding. Statistical Applications in Genetics and Molecular Biology 15(1), 19–38 (2016)
20. Krauthgamer, R., Lee, J.R.: Navigating nets: simple algorithms for proximity search. In: Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms. pp. 798–807 (2004)
21.
Lee, H., Chung, M.K., Kang, H., Kim, B.N., Lee, D.S.: Discriminative persistent homology of brain networks. In: 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro. pp. 841–844. IEEE (2011)
22. Letscher, D., Fritts, J.: Image segmentation using topological persistence. In: International Conference on Computer Analysis of Images and Patterns. pp. 587–595. Springer (2007)
23. Otter, N., Porter, M.A., Tillmann, U., Grindrod, P., Harrington, H.A.: A roadmap for the computation of persistent homology. EPJ Data Science 6(1), 17 (2017)
24. Patania, A., Petri, G., Vaccarino, F.: The shape of collaborations. EPJ Data Science 6(1), 18 (2017)
25. Petri, G., Scolamiero, M., Donato, I., Vaccarino, F.: Topological strata of weighted complex networks. PloS one 8(6), e66506 (2013)
26. Sheehy, D.R.: Linear-size approximations to the Vietoris-Rips filtration. Discrete & Computational Geometry 49(4), 778–796 (2013)
27. Sizemore, A., Giusti, C., Bassett, D.S.: Classification of weighted networks through mesoscale homological features. Journal of Complex Networks 5(2), 245–273 (2017)
28. Skraba, P., Ovsjanikov, M., Chazal, F., Guibas, L.: Persistence-based segmentation of deformable shapes. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-Workshops. pp. 45–52. IEEE (2010)
29. Tausz, A., Vejdemo-Johansson, M., Adams, H.: JavaPlex: A research software package for persistent (co)homology. In: Hong, H., Yap, C. (eds.) Proceedings of ICMS 2014. pp. 129–136. Lecture Notes in Computer Science 8592 (2014)
30. Xia, K., Wei, G.W.: Persistent homology analysis of protein structure, flexibility, and folding. International Journal for Numerical Methods in Biomedical Engineering 30(8), 814–844 (2014)
31. Zomorodian, A.: Fast construction of the Vietoris-Rips complex.
Computers & Graphics 34(3), 263–271 (2010)