A spectral sequence for parallelized persistence

A SPECTRAL SEQUENCE F OR P ARALLELIZED PERSISTENCE DA VID LIPSKY, PRIMOZ SKRABA, AND MIKAEL VEJDEMO-JOHANSSON Abstract. W e approach the problem of the computation of p ersistent homology for large datasets b y a divide-and-conquer strategy . Dividing the total space into separate but o verlapping components, we are able to limit the total memory residency for any part of the computation, while not degrading the ov erall complexity muc h. Lo cally computed p ersistence information is then merged from the comp onen ts and their intersections using a sp ectral sequence generalizing the May er-Vietoris long exact sequence. W e describ e the May er-Vietoris sp ectral sequence and give details on how to compute with it. This allows us to merge local homological data into the global p ersisten t homology . F urthermore, w e detail how the classical topology constructions inherent in the sp ectral sequence adapt to a p ersistence p ersp ectiv e, as well as describ e the tec hniques from computational commutativ e algebra necessary for this extension. The resulting computational sc heme suggests a parallelization sc heme, and w e discuss the communication steps in v olved in this scheme. F urthermore, the computational sc heme can also serv e as a guideline for whic h parts of the b oundary matrix manipulation need to co-exist in primary memory at any given time allowing for stratiﬁed memory access in single-core computation. The sp ectral sequence viewp oin t also provides easy proofs of a homology nerve lemma as well as a p ersisten t homology nerve lemma. In addition, the algebraic tools we dev elop to approch persistent homology provide a purely algebraic form ulation of kernel, image and cokernel p ersistence [7]. E 0 0 , 0 E 0 0 , 1 E 0 0 , 2 E 0 0 , 3 E 0 1 , 0 E 0 1 , 1 E 0 1 , 2 E 0 1 , 3 E 0 2 , 0 E 0 2 , 1 E 0 2 , 2 E 0 2 , 3 E 0 3 , 0 E 0 3 , 1 E 0 3 , 2 E 0 3 , 3 ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ ∂ d d d d d d d d d d d d C n ( X ) C n ( X ) C n ( X ) C n ( X ) (a) Double complex (b) Sphere with a bump E ∞ 0 , 0 E ∞ 0 , 1 E ∞ 0 , 2 E ∞ 0 , 3 E ∞ 1 , 0 E ∞ 1 , 1 E ∞ 1 , 2 E ∞ 1 , 3 E ∞ 2 , 0 E ∞ 2 , 1 E ∞ 2 , 2 E ∞ 2 , 3 E ∞ 3 , 0 E ∞ 3 , 1 E ∞ 3 , 2 E ∞ 3 , 3 H 0 ( X ) H 1 ( X ) H 2 ( X ) H 3 ( X ) (c) Final computation Dep ar tment of Ma thema tics, University of Pennsyl v ania, 209 South 33rd Street. Philadelphia, P A 19104, USA Ar tificial Intelligence Labora tor y, Jo ˇ zef Stef an Institute, Jamov a 39, 1000 Ljubljana, Slovenia Corresponding author, School of Computer Science, University of St Andrews, Jack Cole Building, Nor th Haugh, St Andrews KY16 9SX, Scotland, UK E-mail addr esses : dlipsky@math.upenn.edu, primoz.skraba@ijs.si, mvvj@st- andrews.ac.uk . 1 1. Introduction T op ological data analysis [3, 15] lo oks to quan tify the qualitative asp ects of the geometry of a data set. The existence and lo cation of voids in an otherwise densely present ﬁgure are among the top ological features w e can try detect. The formalism of homolo gy allows us to do this. It has h ad a large impact: both in geometry and mathematics in general, and more recently in a computational top ology and geometry setting. The key to the success of homology in computational geometry , as well as in other application ﬁelds – is the introduction in [13] of p ersistent homolo gy , a multi-scale approac h to the computation of homological features giv en a p oint cloud of samples. The original p ersistence algorithm [13] has a w orst case, a O ( n 3 ) running time . There has b een con- siderable eﬀort looking for more eﬃcient algorithms for computing persistence. F or sp ecial cases, such as 2-manifolds [13], v ery eﬃcient, nearly linear algorithms exist. In the general case, recent progress has shown that p ersistence can be computed in matrix m ultiplication time [22] or alternatively in an output sensitiv e w ay [5]. These running times are given in terms of simplices, so another activ e line of research is into us- ing discrete Morse-theoretic metho ds or homotopical collapses [11, 17, 20, 23 – 25] to reduce the size of the underlying complex while ensuring the resulting homology computation giv es the correct result. These are particularly useful when the underlying complex is described by a simpler ob ject such as its graph [9, 29, 30] (i.e. for ﬂag complexes). The standard algorithm is most often used, since in practice, it has almost linear runtime. The algorithm dep ends on ha ving random access to a represen tation of the boundary operator. The standard represen tation is as a sparse matrix and during the course of the standard algorithm in higher dimensions, the matrix quic kly ﬁlls-in requiring O ( n 2 ) storage space. In practice, the requirement of random access to such a large data structure has pro v en to b e the limiting factor in computation. There has b een work on distributed computation, primarily fo cusing on ordinary homology in sensor netw orks and using either coreduction tec hniques as ab o v e [10] or via com binatorial Laplacians [26]. The latter approach does not readily extend to persistent homology and is m uch slo wer in practice than centralized metho ds. In this pap er, we present an algorithm whic h uses a divide-and-conquer approach to computing b oth homology and persistent homology . Rather than computing on the en tire complex, w e break the computation up into smaller pieces. The b eneﬁt b eing that we can compute p ersistence indep enden tly on the each of the pieces or patches. The key con tirbution of this pap er is to show ho w to merge the results from the individual patc hes into a global quantit y 1 . Our approach is based on a to ol from algebraic top ology called sp e ctr al se quenc es . Dividing the space up based on a c over U , the sp ectral sequence gives the machinery necessary to iterativ ely compute b etter approximations of the p ersistent homology of a total space X from the lo cal p ersistence computations in eac h U i ∈ U . W e fo cus our attention on the underlying ideas based on sp ectral sequences and the resulting algorithms rather than on the analysis of these algorithms, which dep end hea vily on the cov er (i.e. ho w we break up the underlying complex). W e partially address the problem of computing co vers by presenting some straigh tforward approaches which could b e used. Using the dev elop ed techniques also gives a more con venien t form of the Nerve [19] and Persisten t Nerve [4] lemmas which are widely used in p ersistent homology , as well as an algebraic description of the image, kernel and cok ernel p ersistence introduced in [7]. The pap er is structured as follows: we ﬁrst in tro duce the Ma yer-Vietoris sp ectral sequence, which is the main algebraic to ol that will b e used in the pap er. W e adapt the treatement of [16] to a p ersistent homology con text, and pro vide mathematical descriptions of the spectral sequence computation. Using these structural results, in Section 3 we giv e explicit descriptions of all the needed algorithms; from basic algebraic primitives computing kernels, images and cok ernels of maps b et w een ﬁnitely presented k [ t ]-mo dules to the concrete computation of the sp ectral sequence. Finally , in Section 5 we discuss several applications and corollaries to the theory and algorithmics in this pap er. In particular, w e discuss strategies for parallelizing p ersistence and for computing persistent homology with a stratiﬁed memory residence approach; the relationship b etw een our algebraic primitives and the corresp onding geometric primitiv es in [7]; and we give pro ofs of the homological Nerv e lemma and of the p ersisten t homological Nerve lemma that av oid the requirement of con tractibility of all the cov ering patches and patc h intersections. 1 The merging step is what has made this kind of approach diﬃcult in the past, particularly for persistence 2 1.1. Bac kground. F or an introduction to p ersistent homology w e refer the reader to [12, 28] along with [19] for an introduction to homology within algebraic top ology . The main ob ject which we lo ok to compute is the p ersistenc e mo dule . This is an algebraic ob ject introduced in [31], which enco des the homology of a ﬁltr ation , a sequence of increasing simplicial complexes ∅ = X 0 ⊆ X 1 ⊆ . . . ⊆ X n − 1 ⊆ X N = X The mo dule enco des the information in all pairwise maps b et w een these spaces. It turns out that, this information can b e represented by a b ar c o de or p ersistenc e diagr am which are a set of co ordinates which represen t the birth and de ath times of non-trivial homology classes in the ﬁltration. These hav e geometrical meaning if the ﬁltration is derived from some geometric quantit y . Common examples include: the sub-level set ﬁltration with resp ect to some function deﬁned on the space and the ﬁltration based on distances b etw een p oints in an input p oint cloud. In this pap er, we compute homology ov er co eﬃcien ts in a ﬁeld k . In this case, the homology ov er a space is a vector space. Ho wev er, as shown in [31], rather than computing the homology for eac h space indep enden tly , we can build graded k [ t ]-linear chain complexes where the degree of each generator enco des the en try of a simplex into the ﬁltration. Refering to this as p ersistenc e c omplex , the authors sho w that p ersisten t homology is equiv alent to ordinary homology on the p ersistence complex with co eﬃcients in k [ t ]. The division of our space will b e in terms of a co ver. A cov er of a space X is an indexed family of sets U i suc h that X ⊆ ∪ i U i . The metho ds we dev elop work with arbitrary cov ers of the space, making the metho d suitable for an y division of the underlying space (of course, the p erformance will dep end on the prop erties of the co v er). Muc h of the exp osition is given in the language of commu tative algebra. This not only simpliﬁes the exp osition, but gives imp ortant insight into how the algorithms should work. Whereever p ossible, we try to put things into the context of previous approaches. 2. The Ma yer-Vietoris spectral sequence Sp ectral sequences hav e seen n umerous uses in classical algebraic top ology [21]. In the ﬁeld of p ersistent homology , an aw areness of the role of sp ectral sequences has b een present from the start [31], but the diﬃculty in computing with sp ectral sequences has consistently made alternative routes attractive. W e do not give a complete bac kground on sp ectral sequences here but rather, we focus our atten tion on one particular sp ectral sequence, which we use for the parallelization of p ersistent homology . F or reference, sp ectral sequences are discussed extensively in [21] although more accessible introductions can b e found in [6] and [18]. The sp ectral sequence we use can b e viewed as generalization of the Ma yer-Vietoris long exact sequence whic h given tw o pieces, relates the homology of their union with the homology of the pieces and their in tersection. The same wa y that the long exact sequence enco des an inclusion-exclusion principle on homology classes, the corresp onding sp ectral sequence provides the mec hanics of an inclusion-exclusion principle for deep er in tersections than the pairwise aﬀorded by the long exact sequence. F requently men tioned in the literature (inter alia: [21]), we found the most detailed exp ositions in [1, 2, 16]. The sp ectral sequence builds up a space very similar to the blowup complex introduced b y Carlsson and Zomoro dian in [32], but uses more of the av ailable information in order to knit together lo cal information to a global homology computation. Also Carlsson and Zomoro dian do not use the same ﬁltration as we dep end on here. The following is based v ery lo osely on the exp osition in [16]. As in [32], w e consider a ﬁltered simplicial complex X , with a cov er U by ﬁltered sub complexes X = S i ∈ I U i (see Figure 1(b)). The cov er can of course b e anything, but in practice w e wish to divide the full ﬁltered complex in to almost disjoin t pieces so as to maximally parallelize the computation. The condition of co v ering with ﬁltered sub complexes is trivially fulﬁlled if we construct the cov er on the ﬁnal space of the ﬁltration. W e recall the deﬁnition of the blowup complex from [32] X U = [ ∅6 = J ⊆ I   \ j ∈ J U j   × ∆ J where ∆ J is the | J | -simplex with vertices num b ered from J . 3 (a) An underlying top olog- ical space co vered by three sets. (b) The sets shown as dis- joint spaces. (c) Blowup complex. Figure 1. Construction of the blo wup for a triple cov er. In the blo wup (c), an ything in a double in tersection is multiplied with an edge, and anything in a triple in tersection is multiplied by a 2- simplex. While only visible in the top right one, all three parts of the curve going through a double in tersection will b e replaced by a triangulated quadrilateral. E 0 0 , 0 E 0 0 , 1 E 0 0 , 2 E 0 0 , 3 E 0 1 , 0 E 0 1 , 1 E 0 1 , 2 E 0 1 , 3 E 0 2 , 0 E 0 2 , 1 E 0 2 , 2 E 0 2 , 3 E 0 3 , 0 E 0 3 , 1 E 0 3 , 2 E 0 3 , 3 E 0 4 , 0 E 0 4 , 1 E 0 4 , 2 E 0 4 , 3 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 1 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 d 0 C n ( X ) C n ( X ) C n ( X ) C n ( X ) (a) Double complex expression of the E 0 page. α α 0 α 1 α 2 α 3 α 4 ω 0 ω 1 ω 2 ω 3 (b) Typical element of the total complex. Figure 2. Double complex setup. A double complex is a grid as depicted in (a), with mo dules at eac h grid p oin t, and t wo families of maps d 0 and d 1 b et w een the mo dules. F or a double complex, w e require d 0 d 1 + d 1 d 0 = 0 as well as d 0 d 0 = 0 and d 1 d 1 = 0. A typical element of the total complex is some tuple ( α 0 , . . . , α n ), and will map under the total diﬀerential to some other elemen t ( ω 0 , . . . , ω n − 1 ) given by summing up the contributions from eac h direction. F undamentally , the question is how we can compute the homology of the blowup complex. W e approach this by separating out the problem into computational comp onen ts. Whenev er n , p and q o ccur simultane- ously , we shall assume that n = p + q . The q -chains in a p -fold intersection of the original cov er generate p + q -chains in the chain complex of the blowup complex, and thus p otentially p + q -homology . W e use this to organize our computation. Let us write E 0 p,q for the q -chains in a p -fold intersection of the original cov er 2 . The individual chain complexes E 0 p, ∗ ha ve b oundary maps inherent from their simplicial complex structure, and we shall write d 0 p,q for the map E 0 p,q → E 0 p,q − 1 , and d 0 for the aggregate map E 0 ∗ , ∗ → E 0 ∗ , ∗ . F or an y particular J ⊆ I , and any particular j 0 ∈ J , there is an inclusion map T j ∈ J U j → T j ∈ J \ j 0 U j . If w e multiply eac h such inclusion map with the sign of the corresp onding term J → J \ j 0 of the nerve complex b oundary map, and sum the maps up for all j 0 , we get a map d 1 p,q : E 0 p,q → E 0 p − 1 ,q . Aggregating o v er all c hains, we hav e a map d 1 : E 0 ∗ , ∗ → E 0 ∗ , ∗ . A structure lik e this, with bi-indexed mo dules E 0 p,q and horizon tal maps d 0 and v ertical maps d 1 suc h that d 0 ◦ d 0 = 0, d 1 ◦ d 1 = 0 and d 0 ◦ d 1 + d 1 ◦ d 0 = 0 is called a double c omplex . F or double complexes, we can 2 The E 0 p,q notation will b e familiar to any one who has work ed with sp ectral sequences as the 0-page of the sp ectral sequence 4 generate a straightforw ard chain complex called the total c omplex b y writing T ot( E 0 ∗ , ∗ ) n = L p + q = n E 0 p,q , and introduce as a b oundary map the aggregate map d 0 + d 1 . F or our computation of the homology of the blowup complex, we rely on a fundamental theorem from algebraic top ology [16]: Theorem 1. H ∗ (T ot( E 0 ∗ , ∗ )) = H ∗ ( X ) . While w e shall not prov e the theorem in its full exten t here, w e still need the statement of the isomorphism for the developmen t of our algorithm. A chain in E 0 1 ,q is a collection of q -chains in the cov ering patches. Suc h a chain can b e mapp ed into C ∗ X by simply summing up the comp onent chains, included into the full complex. Just summing up these comp onents may b oth in tro duce and remov e homological features present in the comp osite chain – and the exact b ehaviour of these introductions and remo v als is enco ded in to the other summands of the direct sum. Thus, the isomorphism iden tiﬁes a class [( α 0 , . . . , α n )] with a class [ α ] precisely if α is the result of summing up the comp onents of α 0 . The chains α 1 , . . . , α n capture, in a wa y reminiscen t of the inclusion-exclusion principle, the exact wa ys in which α diﬀers from merely summing up α 0 . The columns in the double complex provide a ﬁltration on T ot( E 0 p,q ). W e write E 0 ≤ p, ∗ for the sub complex consisting of the p leftmost columns, and write L n,p = im( H n (T ot( E 0 ≤ p, ∗ )) → H n (T ot( E 0 ∗ , ∗ ))) for the image of the map induced by the inclusion of a particular ﬁltration step into the entire double complex. Since the E 0 ≤ p, ∗ ﬁlter E 0 ∗ , ∗ , the L n,p ﬁlter H n (T ot( E 0 ∗ , ∗ )). F urthermore, a single class [( α 0 , . . . , α n )] is in L n,p if and only if the class has a representativ e on the form ( α 0 0 , . . . , α 0 p , 0 , . . . , 0). While computing L n,p completely ends up b eing similar in diﬃcult y to the original homology computation, w e can consider the computation of just the elements that get added at any particular step of the ﬁltration. In other words, if we compute L n,p /L n,p − 1 , this tells us ab out only those elements of H ∗ ( X ) that show up at precisely the p -fold in tersections. Clearly , taking the direct sum o ver all p of these quotien ts, we can compute a complete description of the entire homology H ∗ ( X ). Giv en some elemen t [( α 0 , . . . , α p , 0 , . . . , 0)] ∈ L n,p , the equiv alence class it b elongs to in L n,p /L n,p − 1 is completely determined by the v alue of α p . The particular α p represen ting such an element is not uniquely determined, nor is every eleme n t of E 0 p,q a v alid choice for some element determining an equiv alence class of L n,p /L n,p − 1 . W e shall write Z ∞ p,q for the α p that do represent some equiv alence class of L n,p /L n,p − 1 , and we shall write B ∞ p,q for the elements of E 0 p,q that represent the 0 class of L n,p /L n,p − 1 . The sp ectral sequence here aims to provide increasingly accurate approximations to these Z ∞ p,q and B ∞ p,q . These approximations are computed by introducing the conditions deﬁning Z ∞ p,q and B ∞ p,q one at a time. The case Z r p,q . Fix some α p ∈ E 0 p,q . F or α p to b e in Z ∞ p,q , there has to b e a sequence α p − 1 , . . . , α 0 as in Figure 3 (a) such that (1) d 0 α p = 0 d 0 α p − 1 = d 1 α p . . . d 0 α 0 = d 1 α 1 These equations express the condition d ( α 0 , . . . , α p , 0 , . . . , 0) = 0 in T ot( E 0 ∗ . ∗ ). W e deﬁne Z r p,q to b e the set of all α p that fulﬁll the conditions (2) d 0 α p = 0 d 0 α p − 1 = d 1 α p . . . d 0 α p − r = d 1 α p − r +1 and in particular, we can compute Z r p,q recursiv ely as those elements α p of Z r − 1 p,q for which there in addition to the α p − 1 , . . . , α p − r +1 elemen ts guaranteed to exist since α p ∈ Z r − 1 p,q , there is also an α p − r ∈ E 0 p − r,q + r suc h that d 0 α p − r = d 1 α p − r +1 . It is clear from these deﬁnitions that Z p p,q = Z ∞ p,q . The case B r p,q . As ab ov e, w e ﬁx α p ∈ E 0 p,q . In order for α p to b e in B ∞ p,q ⊆ Z ∞ p,q , we additionally need a collection of elements β p , . . . , β n +1 as in Figure 3 (b) such that (3) d 0 β p + d 1 β p +1 = α p d 0 β p +1 + d 1 β p +2 = 0 . . . d 0 β n + d 1 β n +1 = 0 The simplest case when this happ ens is if there is some β p suc h that d 0 β p = α p . In this case w e can c ho ose β j = 0 for all other β j . W e write B 0 p,q for the vector space of all such α p . In general, we shall deﬁne B r p,q to b e the set of α p for whic h there are β p , . . . , β p + r satisfying the equations d 0 β p + d 1 β p +1 = α p d 0 β p +1 + d 1 β p +2 = 0 . . . d 0 β p + r − 1 + d 1 β p + r = 0 d 0 β p + r = 0 In particular α p ∈ B r p,q if α p ∈ B r − 1 p,q or if there are β p , . . . , β p + r satsifying all these equations. 5 α 0 α 1 α 2 0 0 ω 0 ω 1 0 0 (a) Condition for a cycle: ( α 0 , . . . , α 2 , 0 , . . . , 0) is a cycle if there are ω 0 , ω 1 such that d 0 α 0 = ω 0 = d 1 α 1 , d 0 α 1 = ω 1 = d 1 α 2 and d 0 α 2 = 0. α 0 α 1 α 2 0 0 β 2 β 3 β 4 β 5 ω 0 ω 1 0 0 (b) Condition for a boundary: ( α 0 , . . . , α 2 , 0 , . . . , 0) is a b oundary if in addition to the ω 0 , ω 1 in (a), there are also β 2 , . . . , β 5 such that d 0 β 2 + d 1 β 3 = α 2 , d 0 β 3 + d 1 β 4 = 0, d 0 β 4 + d 1 β 5 = 0 and (trivially) d 0 β 5 = 0. Figure 3. Cycles and boundaries in E r p,q . Z r p,q and B r p,q as k ernel and image of a diﬀeren tial. Supp ose α p ∈ Z r p,q . Then there exist α p − 1 , . . . , α p − r as ab ov e. Let ω p − r − 1 = d 1 α p − r ∈ Z r p − r − 1 ,q + r . This element is w ell-deﬁned, dep ending only on α p , up to a sequence of error terms for all the α p − i , eac h of which is guaran teed to b e in the appropriate B r p − i,q + i . Hence, the map [ α p ] 7→ [ ω p − r − 1 ] is a w ell-deﬁned map d r : Z r p,q /B r p,q → Z r p − r − 1 ,q + r /B r p − r − 1 ,q + r . This collection of quotien t mo dules is often denoted E r p,q = Z r p,q /B r p,q and called the E r -page. The elements α p , . . . , α p − r are an appropriate set of β j to witness ω p − r − 1 ∈ B r p − r − 1 ,q + r . Re-indexing, we can conclude that α p ∈ B r +1 p,q if and only if [ α p ] = d r [ β p + r +1 ] for some β p + r +1 ∈ Z r p + r +1 ,q − r . Finally , if w p − r − 1 ∈ B r p − r − 1 ,q − r , or equiv alently if d r [ α p ] = [0], then it follows that α p ∈ Z r p,q . All these statements can b e easily prov en by diagram c hases. If at any stage of the computation, all the maps d r : E r ∗ , ∗ → E r ∗ , ∗ are the zero-map, then the corresponding Z r ∗ , ∗ and B r ∗ , ∗ already satisfy all their conditions, and so for that r , it already holds that Z r ∗ , ∗ = Z ∞ ∗ , ∗ and B r ∗ , ∗ = B ∞ ∗ , ∗ . Many pro ofs in algebraic top ology work by proving that for a low v alue of r , the map d r v anishes, eliminating the need to construct the higher index maps. In Section 5.3 we shall see an application of this principle. 3. Algorithms W e shall state algorithms here b oth for the underlying algebraic op erations, as well as for the resulting sp ectral sequence and p ersistence computations. 3.1. Algebraic op erations. In algebra textb ooks written with a suﬃciently general approach to linear algebra, the dev elopment of matrix arithmetic and its uses in solving linear equations will b e done in a manner appropriate for a generic Euclide an domain [27] 3 . It is a well-kno wn fact that in addition to any ﬁeld, further Euclidean domains include Z , as well as k [ t ] for any ﬁeld k . In this setting, a few useful facts hold: any submo dule of a free mo dule is free, and Gaussian elimination is an appropriate algorithm to compute matrix ranks, function kernels and images, reducing bases to remov e redundancies and to form appropriate reduction systems. In light of this, we dra w the reader’s attention to the implications for p ersistence. A persistence chain complex is a free graded k [ t ]-mo dule with a generator in degree k for each simplex that enters the ﬁltered c hain complex in the k th ﬁltration step. Persisten t homology is a ﬁnitely presented graded k [ t ]-mo dule, with generators given by a cycle basis and relations given by a b oundary sub-basis of the cycle basis. W e can assume – esp ecially in the topological case – that if some t k P λ j t j m j is a boundary basis elemen t, then the cycle was a cycle already k time-steps earlier, represen ted by P λ j t j m j . W e shall represent ﬁnitely presen ted mo dules precisely as a segregated pair of bases: a generators basis and a separate relations basis. Because of this nice b ehaviour of the generators corresp onding to sp eciﬁc relations, we can further allo w 3 This is a slightly more restrictive class than principle ideal domains. 6 ourselv es to keep the relations basis in a row-reduced form – to ensure that the division algorithm pro duces nice normal forms of all elements – and the generators basis reduced with resp ect to the relations. As the reader recalls, Gaussian elimination puts a m atrix into a normal form by clearing out row by row b y using the leftmost column in the matrix with no non-zero entries ab ov e of the current row to cancel out an y entries o ccurring later. When working with graded k [ t ]-mo dules, some care has to b e tak en to av oid reducing any columns using columns that do not exist in an early enough grading. This can b e accomplished b y sorting the columns in the matrix in order of rising degree, and only reducing to the right. T racking the op erations p erformed during a reduction allows for the computation of a kernel b y reading oﬀ the op erations that pro duced all-zero columns in the reduced matrix. Giv en a function f : M /Q → N/P , where M /Q and N /P are b oth represented by lists of generators and lists of relations as discussed ab ov e, the follo wing three constructions come naturally: Image mo dules. T o compute im f , w e compute the list f ( m ) for m ∈ M . Since f ( Q ) ⊆ P , the relations on im f are the relations listed in P , and so im f = f ( M ) /P . Cok ernel mo dules. T o compute coker f , w e need to compute ( N /P ) / ( f ( M /Q )). Again, f ( Q ) ⊆ P , and th us we get the resulting mo dule by imp osing all the images f ( m ) of the generators of M as additional relations. Hence, coker f = N / ( f ( M ) ∪ Q ). Kernel modules. The kernel is, just as in [7], more complex than the image and cok ernel. The computation of the kernel mo dule as well as its em b edding into M /Q is computed in a tw o-step pro cess. An element of k er f is going to b e represented by some element of the free mo dule M suc h that f ( m ) ∈ P . W e can compute this by computing the kernel F K of the map M ⊕ P → N given by ( m, p ) 7→ f ( m ) − p . This is a k ernel of a map b etw een free k [ t ]-mo dules, and thus computable with the matrix algorithm abov e. The actual generators in M are given b y the pro jection on to the ﬁrst summand π M : ( m, q ) 7→ m , and we write K = π M ( F K ). The resulting elements form a free submo dule of M giving generators of the kernel mo dule. The kernel mo dule will hav e relations, how ev er, and not all relations in Q are going to b e in the submo dule K . Hence, for a full presentation, we need to ﬁnd the relations that are actually in K , which we can do with another k ernel computation. W e consider the map K ⊕ Q → M given by ( k , q ) 7→ k − q and compute its kernel. The pro jection π K : ( k , q ) 7→ k gives us a set of relations. The constructions ab ov e, as well as Gr¨ obner basis approaches to handling free mo dules o ver p olynomial rings are describ ed in [14]. 3.2. Sp ectral sequence diﬀerentials. Using the algebraic op erations describ ed ab ov e, we describ e an algorithm for computing p ersisten t homology via the May er-Vietoris sp ectral sequence. This algorithm is iterativ e and c onver ges to the correct cycles Z ∞ and b oundaries B ∞ so that the isomorphism in Theorem 1 holds. Con verge o ccurs when the sp ectral sequence collapses. In particular, if the diﬀerential d r is the zero-map ev erywhere, then Z r +1 = Z r , and B r +1 = B r , and we can verify that d r +1 is also going to b e the zero-map ev erywhere, and this contin ues all the wa y up. One p o w erful use of this is that once the maps d r tra verse the double complex in such a wa y that they only hit trivial mo dules, the corresp onding d r has to b e the zero-map. This happ ens, for instance, if the double complex is contained in a p × q b ounding b ox where p and q are maximal dimensions of the complex and nerve resp ectively . In this case, the sp ectral sequence will collapse for r > min( p, q ). At that p oin t all the maps will map outside the b ox and so can only hit trivial mo dules. W e ﬁrst describ e the initialization of the algorithm. Beginning with the double complex of Figure 2(a), w e ﬁrst compute homology vertically , replacing each E 0 p,q with the quotien t mo dule Z 1 p,q /B 1 p,q where Z 1 p,q = k er d 0 p,q and B 1 p,q = im d 0 p,q +1 . This computes, in eﬀect, the p ersisten t homology of the chain complex o ver each simplex in the nerve of the cov er. W e represent E 1 p,q b y its segregated basis, as describ ed in Section 3.1, main taining the free mo dules Z 1 p,q and B 1 p,q as well as the preb oundary I 1 p,q deﬁned by the equation d 0 p,q I 1 p,q = B 0 p,q − 1 . Next, we describ e the iterativ e step. W e will, for now, assume that w e are given the diﬀerentials and describ e their com putation afterwards. 7 (a) (b) β 0 β 2 (c) Figure 4. (a) A sphere with constant function everywhere except one hotsp ot. The partition of the space is along the axes. Each partition is then thic k ened slighlt y as describ ed in Section 5.1 to give the cov ering. (b) The nerve of the cov ering. Note that the faces ha ve tetrahedra as all 4 v ertices of the square intersect. (c) The p ersistent barco de of the sphere, H 1 is trivial so it is not sho wn. k [ t ] 6 k [ t ] 24 k [ t ] 24 k [ t ] 8 k [ t ] /t m 0 0 0 (a) E 1 page. 0 k [ t ] 0 k [ t ] k [ t ] /t m (b) Computing d 2 on E 2 . 0 t m k [ t ] 0 k [ t ] 0 (c) E 3 page, collapsed sp ectral se- quence. Figure 5. Example computation for the sphere. A t the r -th iteration, at each no de ( p, q ), w e compute the kernel mo dule of d r p,q , storing it as Z r p,q and the image mo dule of d r p + r − 1 ,q − r , storing this as B r p,q . Since d r is a diﬀeren tial, we know that B r +1 p,q ⊆ Z r +1 p,q , and b y reducing b oth generating sets we can ﬁnd a go o d presentation of the quotient Z r +1 p,q /B r +1 p,q = E r +1 p,q . As for the concrete maps d r , we compute them as we go along. In particular, d 1 is the induced map on E 1 b y giving the inclusion maps of deep er in tersections in to more shallow the co eﬃcien ts from the nerve b oundary map. The higher diﬀeren tials d r are computed according to the machinery of Section 2. In particular, the images of d r − 1 p,q are retained. F or an element x ∈ E r p,q in the kernel of d r − 1 p,q w e can ﬁnd a corresp onding image element d r − 1 p,q x ∈ B 0 p − r +2 ,q + r − 1 . Therefore, we can ﬁnd some y ∈ I 1 p − r +2 ,q + r suc h that d 0 p − r +2 ,q + r y = d r − 1 p,q x . W e deﬁne d r x = d 1 y . This corresp onds precisely to the description in Section 2, where the theoretical motiv ation can b e found explaining exactly why these op erations are all p ossible to p erform and appropriate. In each iteration, w e use the image and k ernel mo dule computations b et w een ﬁnitely presented mo dules and maps that are computed in earlier stages. F or a low er level description of the algorithm see App endix B. 4. Explicit example Consider the sphere in Figure 4(a). Its function will introduce the hot sp ot and its surrounding area signiﬁcan tly later than the remaining sphere. Computing p ersistence ov er the entire sphere, w e would exp ect to see a signature somewhat like the one in Figure 4(c). W e can divide the computation on the sphere into o ctants divided by the axis planes, as in Figure 4(a) and (b). By growing eac h o ctant by the tubular neighbourho o d metho d in Section 5.1, we get separate computational units, with contractible intersections of all higher orders. Hence, after computing p ersisten t homology on all patches, and on pairwise, triple and quadruple intersections, we get a double complex of p ersistence mo dules shown in Figure 5(b). The exp onen ts come from the 6 vertices of the o ctahedron generating 6 quadruple intersection. In eac h of these, there are 4 diﬀeren t triple intersections, generating a total of 24 triple intersections. Each of the 6 tetrahedra has 6 edges, corresp onding to all the double in tersections at each vertex – how ev er, this 8 o vercoun ts each the 12 double intersections corresp onding to edges of the o ctahedron. Th us, out of the 36 double intersections, only 24 remain after comp ensating for the ov ercoun ting. Finally , each of the 8 faces of the o ctahedron corresp onds to a single cov ering patch. After computing the corresp onding E 2 -page, the redundancies in the duplicate representations of all comp onen ts hav e largely resolved, and w e are left with a double complex of p ersistence mo dules shown in Figure 5(b). Here, the map k [ t ] → k [ t ] /t m maps the generator onto the generator. The kernel of this map is t m k [ t ] ⊂ k [ t ], and the image is the entire mo dule. This giv es our ﬁnal page of the sp ectral sequence shown in Figure 5(c). Reading oﬀ the mo dules diagonally , we recov er the p ersistent homology groups H 0 = k [ t ] and H 2 = t m k [ t ], whic h is precisely what we computed b efore. 5. Applica tions 5.1. Computing partitions. The input to use the May er-Vietoris sp ectral sequence is a ﬁltered simplicial complex is cov ered by sub complexes. Giv en a p oint cloud X , and a metho d C ∗ (suc h as Vietoris-Rips or Witness complexes) to construct ﬁltered a simplicial complex C ∗ X from X , we shall describ e a few schemes to divide C ∗ X into a cov ering b y appropriate sub complexes with small intersections. Minimizing the size of the intersections, we minimize the amount of work which must b e done in higher pages. Recall that the close d star in a simplicial complex of a subset of its simplices is the sub complex consisting of all simplices that intersect any simplex in the subset. W e now state tw o lemmas and refer the reader to App endix A for the pro ofs. Lemma 2. Supp ose C ∗ has the pr op erty that any simplex has diameter at most ε/ 2 , and that the pr esenc e of any simplex is de cide d using p oints of a distanc e at most ε/ 2 fr om any of the vertic es of the simplex. Then the close d star of a subset S ∈ C ∗ X is going to b e c ontaine d in C ∗ T ε S for T ε S the ε -tubular neighb ourho o d of S in X , the c ol le ction of p oints x ∈ X such that d ( x, S ) < ε . Lemma 3. Supp ose X is p artitione d into subsets U j . Then a c overing of C ∗ X by sub c omplexes is given by the c ol le ction of sub c omplexes C ∗ T ε U j . W e now describ e some concrete metho ds for computing the co vers. Cub e partitions. Supp ose X is a subset of some Euclidean space R n . Co v er X by a disjoint collection of axis-parallel cub es. Membership in a particular cell is quickly computed b y comparing the co ordinates of the p oints in X to b oundaries for the cub es. A cov ering of C ∗ X is given by the closed stars of the cells, easily computable as C ∗ T ε U j for p oin ts in the tubular neighborho o ds of the cells. V oronoi partitions. Pick some collection of landmark p oints L ⊂ X . Compute V oronoi regions V ` for ` ∈ L . As ab o v e, we construct a cov ering of C ∗ X b y closed stars of the V oronoi cells, computable as C ∗ T ε U j . Geo desic V oronoi partitions. Pick some collection of landmark p oin ts L ⊂ X . Let C ∗ b e a metho d that constructs a ﬂag complex on some graph induced by the geometry of X . W rite d Γ for the geo desic graph distance, given by d Γ ( x, y ) equal to the least n umber of edges from x to y in Γ. W rite U ` for the set { y ∈ X : d Γ ( x, ` ) < d Γ ( x, ` 0 ) ∀ ` 0 ∈ L \ ` } , or in other words the set of p oin ts closer to ` than to an y other p oin t in L . As a cov ering by sub complexes, we grow each U ` to U 0 ` b y including all immediate neighbours of the p oin ts in U ` . A cov ering of X b y sub complexes is given by the ﬂag complexes on U 0 ` . 5.2. P arallelism and stratiﬁed memory . Parallelisation with a cov ering is p erformed b y splitting re- sp onsibilit y for the c hain complex of a single cov ering patch or patch intersection to a single computational task. These tasks communicate with eac h other to up date the global state in the horizon tal diﬀerential parts of the computation, and p erform b oundary map preimage computations internally . If a computational no de is resp onsible for chains ov er a single simplex of the nerve complex, it will only communicate with no des represen ting simplices in its closure and its star. This can b e seen since in the ﬁrst iteration, that no de will only nee d to communicate with the no des resp onsible for its faces and cofaces. In the second iteration, it m ust communicate with the faces of its faces and cofaces of its cofaces. T o compute the total num b er of comm unications pro duced, we only consider communication at the lev el of the nerve of the cov ering rather than the size of the underlying simplicial complex. T o av oid double coun ting, we only consider communication from a simplex into its closure. 9 ε Figure 6. Regions for Vietoris-Rips in an ambien t V oronoi decomp osition. F rom a pointcloud (far left) we pick three landmark p oin ts, and get a V oronoi decomp osition of the ambien t space (middle left). By growing eac h V oronoi region by ε , we get a co vering of the point cloud satisfying the conclusion in Lemma 3 (middle right). The ε -Vietoris-Rips complexes in each of the cov ering patc hes and their intersections (far right) will pro vide a cov ering of the ε -Vietoris-Rips complex of the entire p oint cloud b y sub complexes. T o get a w orst-case b ound, we assume that the n um b er of iterations required is equal to the maximal dimension of the nerv e 4 whic h we denote with r . Over all iterations, a d -simplex may pro duce no more than O (2 d ) communications. This follows from the assumption that r > d and the fact that the size of a d -simplex’s closure is 2 d . If we denote the num b er of i -dimensional simplices in the nerv e with K i and assume r is the maximal dimension of the nerve, the total n umber of communications is P r i =1 K i 2 i . This is v ery rough and p essimistic b ound. W e intend to do a more detailed analysis in future work. W e note that this approach will also b e applicable in a single-pro cessor context. Instead of distributing computational tasks on separate no des, w e can consider eac h sub complex or sub complex intersection a memory block. This wa y , the sp ectral sequence approach structures a computation, grouping accesses to related memory blo cks into cohesiv e computational tasks, which induces sw ap sc hemes and schemes for stratiﬁed memory access. 5.3. Nerv e lemma. The Nerve lemma [19, Corollary 4G.3] is a p ow erful result from algebraic top ology that has seen rep eated use in computational top ology – all the wa y from motiv ating why the commonly used ﬁltered simplicial complex constructions capture the original top ology [3] to generating particular new results. Quite often, when computational top ologists use the Nerve lemma, quite a lot of eﬀort is exp ended to pro vide the contractibilit y conditions required in the classical statemen t of the Nerve lemma. This is due to the most common form of the Nerv e lemma guarantees b oth homological and homotopical equiv alence b et w een a cov ering and its nerve. As we are usually only in terested in homology , w e observ e that the Ma yer-Vietoris sp ectral sequence yields a pro of of a homological Nerve lemma: Theorem 4 (Homology Nerve lemma) . Supp ose a sp ac e X is c over e d by a c ol le ction U of subsp ac es X = S i U i , such that e ach T j U j has trivial r e duc e d homolo gy in al l de gr e es. Then H ∗ ( X ) = H ∗ ( N ( U )) . Pr o of. On the E 1 -page, the non-reduced homology is concen trated in degree 0. Since by assumption, the reduced homology is trivial for each intersection, the non-reduced homology is just a copy of the ground ﬁeld. The diﬀerential on the E 1 -page is the diﬀerential of the nerv e complex itself, and the result follows immediately .  While this is a w ell-kno wn fact in classical algebraic top ology , we p oint this out to emphasize that the in tersections need not b e equipp ed with explicit contracting homotopies for the Nerve Lemma to work in homological computations. 4 As mentioned in Section 3.2, the n umber of iterations is bounded by the minimum of the maximal dimensions of the nerve and the simplical complex 10 P ersisten t Nerve Lemma: The Nerve Lemma was extended to the p ersistence setting in [4] using similiar homotopical argumen ts. Here, we give the sp ectral sequence version of it which relies on a p ersistent acyclicity condition. F rom our framework also follows a p ersisten t Nerv e lemma. Deﬁnition 5. A sp ac e is p ersisten tly acyclic if it has p ersistent homolo gy c onsisting of one fr e e gener ator of H 0 , and no homolo gy in higher de gr e es. Theorem 6 (P ersisten t homology Nerve lemma) . Supp ose a ﬁlter e d sp ac e X is c over e d by a c ol le ction U of ﬁlter e d subsp ac es X = S i U i , such that e ach T j U j is p ersistently acyclic. Then H ∗ ( X ) = H ∗ ( N ( U )) . Pr o of. W e note that p ersistent homology is homology with co eﬃcients in k [ t ]. The p ersisten t acyclicity condition ensures that homology has just one generator with the appropriate grading concentrated in the 0-homology . As in Theorem 4, the sequence collapses after the E 1 page and the d 1 diﬀeren tial is equiv alent to the b oundary op erator of the nerve. The result follows.  5.4. Algebraic k ernel, cok ernel and image persistence. In [7], the authors construct top ological spaces that compute image , kernel and c okernel p ersistence mo dules for the map induced in p ersisten t homology b y a map b etw een ﬁltered top ological spaces. W e observe that the constructions in section 3.1 provide corresp onding op erations on a purely algebraic level, computing p ersistence mo dules for image , kernel and c okernel given a map of p ersistence mo dules, such as the one induced from a top ological map. The notion of compatible ﬁltration included as a condition in [7] is reﬂected in the requirement that maps b et w een p ersistence mo dules b e gr ade d module homomorphisms of degree 0, implying that the ﬁltration inclusions on b oth sides commute with the map itself. 6. Discussion W e hav e constructed algorithms to compute with the sp ectral sequence generated by a double complex. In particular, we describ e how to compute a sp ectral sequence generalizing the May er-Vietoris long exact sequence to compute b oth classical and p ersisten t homology using a cov er of the space and restricting computation to each of the cov ered regions and their intersections. This la yout enables us to formulate parallelization schemes, with parallelism and independency guaran tees presen t b oth along the Nerve complex axis of our computational scheme and along separate comp onen ts of the cov ering. As an added b on us, the formalism of the sp ectral sequence approach suggests simpliﬁed pro ofs for several classical interesting results in the p ersistence communit y , and op ens up a wide sp ectrum of interesting areas of research. In particular, we are interested in pursuing: Complexit y b ounds for parallelism: This pap er presents the formalism, but do es not attempt a complete description of the complexity b ehaviour of the prop osed algorithms. HPC implemen tation: While the work was driven b y a wish to take p ersisten t homology to the w orld of computing clusters, our implemen tation so far is a serial one. A parallel implemen tation, and actual p erformance measurements would b e v ery interesting. Co v er generation sc hemes: While we give a few suggestions here, the construction of an adequate co ver that uses or highlights inheren t features of the dataset under analysis is going to b e an imp ortant factor of these metho ds. A more complete catalogue of cov er generation schemes is going to b e imp ortan t for the practical application of our techniques. Theoretical implications: W e ha v e already seen that sev eral results ha v e algebraic pro ofs that v astly simplify the arguments inv olved. W e w ould b e very in terested in exploring the space of new or classical results that b ecome accessible with a sp ectral sequence language. 11 References [1] R. Bott and L. T u. Diﬀer ential forms in algebr aic top ology , volume 82. Springer, 1982. [2] K. Bro wn. Cohomology of groups , volume 87 of Gr aduate T exts in Mathematics . Springer, 1982. [3] G. Carlsson. T op ology and data. Americ an Mathematic al So ciety , 46(2):255–308, 2009. [4] F. Chazal and S. Y. Oudot. T ow ards p ersistence-based reconstruction in euclidean spaces. In Pr oc. 24th ACM Symp os. Comput. Ge om. , pages 232–241, 2008. [5] C. Chen and M. Kerb er. An output-sensitive algorithm for p ersistent homology . In Pr o ce e dings of the 27th annual symp o- sium on Computational ge ometry , 2011. [6] T. Cho w. Y ou could hav e inv ented spectral sequences. Notices of the AMS , 53:15–19, 2006. [7] D. Cohen-Steiner, H. Edelsbrunner, J. Harer, and D. Morozov. Persisten t homology for kernels, images, and cokernels. In Pro c e e dings of the twentieth Annual ACM-SIAM Symp osium on Discr ete Algorithms , pages 1011–1020. So ciety for Industrial and Applied Mathematics, 2009. [8] D. Cohen-Steiner, H. Edelsbrunner, and D. Morozov. Vines and vineyards by up dating p ersistence in linear time. In Pr oc e e dings of the twenty-se c ond annual symp osium on Computational ge ometry , SCG ’06, pages 119–126, New Y ork, NY, USA, 2006. ACM. [9] A. L. D. Attali and D. Salinas. Eﬃcien t data structure for representing and simplifying simplicial complexes in high dimension. In Pr oc. 27th ACM Symp os. Comput. Ge om. , 2011. [10] P . Dlotko, M. Juda, M. Mrozek, and R. Ghrist. Di stributed computation of cov erage in sensor netw orks by homological methods, 2011. [11] P . Dlotko, T. Kaczynski, M. Mrozek, and T. W anner. Coreduction homology algorithm for regular cw-complexes. Discr ete and Computational Ge ometry , 46(2):361–388, 2011. [12] H. Edelsbrunner and J. Harer. Computational T opolo gy: an Intr o duction . AMS Press, 2009. [13] H. Edelsbrunner, D. Letscher, and A. Zomoro dian. T opological p ersistence and simpliﬁcation. In F oundations of Computer Scienc e, 2000. Pro c ee dings. 41st Annual Symp osium on , pages 454–463. IEEE, 2000. [14] D. Eisen bud. Commutative algebra with a view towar d algebr aic ge ometry , v olume 150. Springer, 1995. [15] R. Ghrist. Barco des: the p ersisten t top ology of data. Bul letin-Americ an Mathematic al Society , 45(1):61, 2008. [16] R. Godement. T op olo gie alg´ ebrique et th´ eorie des faisc e aux , volume 1. Hermann Paris, 1958. [17] S. Harker, K. Mischaik ow, M. Mrozek, V. Nanda, H. W agner, M. Juda, and P . Dlotko. The eﬃciency of a homology algorithm based on discrete morse theory and coreductions. In 3r d International Workshop on Computational T opolo gy in Image Context , Novem b er 2010. [18] A. Hatcher. Sp ectral sequences in algebraic top ology . First three chapters of b ook in progress published online on http: //www.math.cornell.edu/ ~ hatcher/SSAT/SSATpage.html . [19] A. Hatc her. Algebr aic topolo gy . Cambridge Universit y Press, 2005. [20] P . P . K. Misc haiko w, M. Mrozek. Graph approach to the computation of the homology of contin uous maps. F oundations of Computational Mathematics , 5:199–229, 2005. [21] J. McCleary . A user’s guide to sp ectr al se quences , volume 58. Cambridge Univ Pr, 2001. [22] N. Milosavljevic, D. Morozov, P . Skraba, et al. Zigzag p ersistent homology in matrix multiplication time. In Annual Symp osium on Computational Ge ometry , pages 247–256, 2009. [23] M. Mrozek and B. Batko. Coreduction homology algorithm. Discr ete and Computational Ge ometry , 41:96–118, 2009. [24] M. Mrozek, P . Pilarczyk, and N. Zelazna. Homology algorithm based on acyclic subspace. Computers and Mathematics with Applic ations, , 55:2395–2412, 2008. [25] M. Mrozek and T. W anner. Coreduction homology algorithm for inclusions and p ersisten t homology . Computers and Mathematics with Applic ations , accepted. [26] A. Muhammad and A. Jadbabaie. Distributed computation of homology groups by gossip. In Americ an Contr ol Confer enc e, ACC 2007 , 2007. [27] C. Sims. Abstr act algebr a: a c omputational appr oach . John Wiley & Sons, Inc., 1984. [28] A. Zomorodian. T opolo gy for c omputing . Cambridge Univ Pr, 2005. [29] A. Zomorodian. F ast construction of the vietoris-rips complex. Computers & Graphics , 34(3):263–271, 2010. [30] A. Zomorodian. The tidy set: a minimal simplicial set for computing homology of clique complexes. In Pro c ee dings of the 2010 annual symp osium on Computational ge ometry , pages 257–266. ACM, 2010. [31] A. Zomorodian and G. Carlsson. Computing persistent homology . Discr ete and Computational Geometry , 33(2):249–274, 2005. [32] A. Zomorodian and G. Carlsson. Lo calized homology . In Shap e Modeling and Applic ations, 2007. SMI’07. IEEE Interna- tional Confer ence on , pages 189–198. IEEE, 2007. 12 R D 0 p,q R C D 0 p,q B 1 p,q − 1 I 1 p,q Z 1 p,q ⇒ C Figure 7. The basic reduction: As in [8], we reduce the matrix R left to right, alwa ys choosing the low est p ossible pivot with shaded areas representing non-zero entries. In contrast with [8], we represen t each dimension separately , so the matrices are not upp er triangular. As we reduce R , w e p erform the same op erations on C . At the end, we perform a p erm utation to put the columns with piv ots on the left (this is not done in practice). These represent the b oundary basis and the corresp onding columns in C are the preb oundary basis. The columns in C corresp onding to the empt y columns in R are the cycle basis. Appendix A. Proofs of selected resul ts Pr o of of L emma 2. Supp ose C ∗ determines the presence of a simplex using p oints at most ε/ 2 aw ay from an y p oin t in a simplex, and eac h simplex has diameter at most ε/ 2. A simplex σ in C ∗ X is in the closed star of S if at least one vertex of σ is in S . By the condition on the diameter, all other p oin ts of σ are contained in T ε/ 2 S . In addition, an y p oin t that can inﬂuence the decision on whether σ exists or not will b e within ε/ 2 of the furthest p oin t from the v ertex of σ in S , and th us the simplex σ is entirely determined by p oints at most ε aw ay from S , and thus by T ε S .  Pr o of of L emma 3. If X is partitioned into subsets U j , we may certainly partition C ∗ X by the closed stars of the subsets. Indeed, since the U j partition X , we are guaranteed that any vertex of a simplex lies in at least one of the U j . Hence, in particular, the closed stars of all the U j will contain all simplices in C ∗ X . By Lemma 2, eac h suc h closed star is contained in C ∗ T ε U j . Hence, the sub complexes C ∗ T ε U j co ver X .  Appendix B. Implement a tion In this section, we recount the algorithm in Section 3.2 in greater detail. In particular, the sections follow eac h other closely . The basic element of computation is the chain which is represented by a column vector with eac h row corresp onding to a simplex and the en tries corresp onding to a coeﬃcient k . Since the chains are homogeneous, it is suﬃcient to annotate each chain with its degree. Bases are a collection of linearly indep endent c hains, and are represen ted as a matrix, where the vectors are sorted from left to righ t in order of increasing degree 5 . A p ersisten t homology class is represented as a cycle chain and a b oundary chain which represent the birth and death times of the class resp ectively . In terms of the presentation of the mo dule, an all classes ha ve a representativ e in the cycle basis where the degree corresp onds to birth time. Inessential classes hav e a copy of this with higher degree in the b oundary basis. In the quotient space, this corresp onds precisely to the algebraic form describ ed in [7]. In practice, we do not c hoose quite this representation. Rather than keep a separate b oundary basis and cycle basis, inessen tial classes are annotated with tw o degrees to represent the b oundary and cycle basis degrees. Initialization. : The initialization corresp onds to running the standard p ersistence algorithm on each col- umn in the double complex with d 0 . As in [8], by reducing the b oundary matrix and keeping track of column op erations we can extract all the required bases at each no de ( p, q )(Figure 7). W e now p erform one additional op eration: at each no de, we reduce the cycle basis with resp ect to the b oundary basis as sho wn in Figure 8. W e do this additional step to make op erations at later stages simpler. T o keep trac k of the preb oundary we trac k the op erations p erformed to pro duce the non-zero reduced columns of the b oundary matrix. This w ay , we get a minimal spanning set of I 1 p,q suc h that d 0 yields an isomorphism I 1 p,q → B 1 p,q − 1 . A t the end of this step, we hav e matrices representing Z 1 p,q , B 1 p,q and I 1 p,q for all ( p, q ). Z 1 p,q and part of B 1 p,q together form a basis for E 1 p,q . W e omit the b oundary elemen ts b ecause E r is a sub quotient of all earlier E j , so an y non-trivial class in E r is represen ted earlier, and an y relation from earlier remains a relation. Hence, any null-persistent class may show up as a relation, but only a relation 5 This corresp onds to an increasing ﬁltration index. 13 D 0 p,q +1 D 0 p,q B 1 p,q I 1 p,q +1 Z 1 p,q +1 B 1 p,q − 1 I 1 p,q Z 1 p,q Z 1 p,q B 1 p,q H 0 p,q = { E 1 p,q 1. 2. 3. 1. Z 1 p,q B 1 p,q Figure 8. The reduction pro cedure: 1. The standard reduction into cycles and b oundaries 2. Reduction of cycle basis with resp ect to cycle basis, the reduction proceeds left to right. Note that certain chains (non-essen tial classes) in Z 1 pq will b e completely reduced. Since these are already in the span of the chosen basis we drop those columns. 3. The input to the next iteration ignores n ull p ersistent classes (in the picture we assume they happ en in a left blo c k, but this is only for visualization). killing its corresp onding cycle, which also will no longer b e relev an t. Therefore they are still v alid relations for the basis but cannot map to any element in E r p − r +1 ,q + r . Iteration. : The general iteration computes the basis for Z r , B r , I r , from whic h w e compute E r +1 . Note that these do not need to b e stored ov er multiple iterations. W e store the 0-th iteration from the initialization and then only the previous iteration: F or the r -th iteration, we only ha ve ( r − 1)-th iteration. Once completed w e can forget the ( r − 1)-th iteration. As b efore, we ﬁrst assume that all the r -diﬀeren tials are given as linear maps. Practically , we c ompute the diﬀerential from the information in each subsequent iteration. This can b e thought of as a subroutine we call at the b eginning of each iteration and is describ ed b elo w. The rest of the iteration follows m uch as the initialization, we apply d r to E r p,q . Eac h element gives a c hain x ∈ E r p − r +1 ,q + r . W e ﬁrst reduce x with resp ect to the existing b oundary c hains. In Figure 9, we show ho w we p erform a reduction step. W e reduce the chain x with resp ect to y . The underlying op eration is the same as for regular reduction in Gaussian e limination: w e multiply the pivot column with the appropriate ﬁeld co eﬃcient and add it to the chain w e are reducing. The only additional b o ok-k eeping we must do is to tak e care that the degrees of the cycle and b oundary chains are appropriately up dated. Assume that b oth x and y are inessen tial classes, so each has a generator and a b oundary relation. T o illustrate this, Figure 9 shows the mapping b etw e en tw o p ersistence interv al, representing the degrees of the generators and relations. Such a mapping would generically map some in terv al [ c, d ] to an interv al [ a, b ], with a ≤ c ≤ b ≤ d . This corresp onds to the requiremen t that the maps b e graded so generators map to multiples of generators and relations map to relations. The kernel of the map is represen ted b y x shifted to the degree of the relation of y . The cok ernel is represented by y with the relation at the degree of the generator of x . The image is represented by y , with a degree shift up to the generator of x . The transformation of the in terv als is shown in Figure 9. This corresp onds to only one reduction step, we rep eat this pro cedure completely reducing x in terms of the basis in E r p − r +1 ,q + r and w e do this for all elements in E r p,q . If in the reduction of an elemen t, an interv al is reduced to zero (i.e. [ c, c ]), further reduction is unnecessary , since it cannot b e in the k ernel, nor can can it map to anything further in the b oundary . The kernel of the map is given by all the elemen ts which hav e non-n ull p ersistence and form a basis for Z r +1 p,q . Lik ewise, in the target space (assuming Z r +1 p − r +1 ,q + r had already b een computed), the remaining non-null p ersisten t classes form a basis E r +1 p − r +1 ,q + r . W e note that w e can compute the basis for the b oundary and k ernel indep enden tly . How ev er, then we must reduce the k ernel basis with rep ect to the b oundary basis (just as in the initialization). Computing the r -diﬀeren tial. : F or r = 1, the diﬀerential is given by d 1 , where the map on homology is induced directly from the chain map. F or r > 1, w e use the following pro cedure to compute d r sho wn in Figure 10. In the previous iteration, w e store the image of d r − 1 E r − 1 p,q . More precisely , w e store the represen tation of the image in the basis of B 0 p − r +2 ,q + r − 1 . In this basis, p erforming the lift to I 0 p − r +2 ,q + r can b e done through the stored relation. The lift exists b ecause from Section 2, we know that d r − 1 E r p,q ∈ B 0 p − r +2 ,q + r − 1 . 14 [ a, b ] [ c, d ] c d a b c d a b k er cok er ⇒ im x + t c − a k y t c − a Figure 9. T o p erform a reduction of the x ∈ d r E r p + r − 1 ,q − r with an element of the basis of y ∈ E r p,q , we multiply by the appropriate co eﬃcient t c − a k ∈ k [ t ] and add the tw o. The condition a ≤ c ≤ b ≤ d follo ws from the grading of the maps. W e update the degrees of d r E r p + r − 1 ,q − r to [ b, d ] and E r p,q to [ a, c ]. Note that if we get [ d, d ] or [ a, a ], these elemen ts hav e b ecome null persistent. B 0 p − r +2 ,q + r − 1 I 0 p − r +2 ,q + r E r p,q d r − 1 E r p,q d 0 d 1 d r − 1 E r p − r +1 ,q + r Figure 10. T o compute d r , we ﬁrst ﬁnd the represen tation of the d r − 1 E r p,q in terms of B 0 p − r +1 ,q + r − 1 . Since we know this is in the image of d 0 , we can lift this vector to I 0 p − r +1 ,q + r and then compute the chain in terms of I 0 p − r +1 ,q + r . Applying d 1 , we get a chain w e can reduce in terms of E r p − r +1 ,q + r , thereby computing d r The last step is applying the d 1 map, which giv es the c hain we then reduce with respect to the E r p − r +1 ,q + r basis. By construction this satisﬁes equations 2 and 3. This is precisely one step of the staircase sho wn in Figure 3. 15

A spectral sequence for parallelized persistence

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment