Manifold reconstruction has been extensively studied for the last decade or so, especially in two and three dimensions. Recently, significant improvements were made in higher dimensions, leading to new methods to reconstruct large classes of compact subsets of Euclidean space $\R^d$. However, the complexities of these methods scale up exponentially with $d$, which makes them impractical in medium or high dimensions, even for handling low-dimensional submanifolds. In this paper, we introduce a novel approach that stands in between classical reconstruction and topological estimation, and whose complexity scales up with the intrinsic dimension of the data. Specifically, when the data points are sufficiently densely sampled from a smooth $m$-submanifold of $\R^d$, our method retrieves the homology of the submanifold in time at most $c(m)n^5$, where $n$ is the size of the input and $c(m)$ is a constant depending solely on $m$. It also provably handles a wide range of compact subsets of $\R^d$, though with worse complexities. Along the way to proving the correctness of our algorithm, we obtain new results on \v{C}ech, Rips, and witness complex filtrations in Euclidean spaces.
Deep Dive into Towards Persistence-Based Reconstruction in Euclidean Spaces.
The problem of reconstructing unknown structures from finite collections of data samples is ubiquitous in the sciences, where it has many different variants, depending on the nature of the data and on the targeted application. In the last decade or so, the computational geometry community has taken a strong interest in manifold reconstruction, where the goal is to reconstruct submanifolds of Euclidean spaces from point clouds. In particular, efficient solutions have been proposed in dimensions two and three, based on the use of the Delaunay triangulation (see [8] for a survey). In these methods, the unknown manifold is approximated by a simplicial complex that is extracted from the full-dimensional Delaunay triangulation of the input point cloud. The success of this approach is explained by the fact that, not only does it behave well on practical examples, but the quality of its output is guaranteed by a sound theoretical framework. Indeed, the extracted complex is usually shown to be equal, or at least close, to the so-called restricted Delaunay triangulation, a particular subset of the Delaunay triangulation whose approximation power is well understood on smooth or Lipschitz curves and surfaces [1,2,6]. Unfortunately, the size of the Delaunay triangulation grows too fast with the dimension of the ambient space for the approach to remain tractable in high-dimensional spaces [33].
Recently, significant steps were made towards a full understanding of the potential and limitations of the restricted Delaunay triangulation on smooth manifolds [14,35]. In parallel, new sampling theories were developed, such as the critical point theory for distance functions [9], which provides sufficient conditions for the topology of a shape $X \subset \R^d$ to be captured by the offsets of a point cloud $L$ lying at small Hausdorff distance. These advances lay the foundations of a new theoretical framework for the reconstruction of smooth submanifolds [11,34], and more generally of large classes of compact subsets of $\R^d$ [9,10,12]. Combined with the introduction of more lightweight data structures, such as the witness complex [16], they have led to new reconstruction techniques in arbitrary Euclidean spaces [4], whose outputs can be guaranteed under mild sampling conditions, and whose complexities can be orders of magnitude below that of the classical Delaunay-based approach. For instance, on a data set with $n$ points in $\R^d$, the algorithm of [4] runs in time $2^{O(d^2)} n^2$, whereas the size of the Delaunay triangulation can be of the order of $n^{\lceil d/2 \rceil}$. Unfortunately, $2^{O(d^2)} n^2$ still remains too large for these new methods to be practical, even when the data points lie on or near a very low-dimensional submanifold.
A weaker yet similarly difficult version of the reconstruction paradigm is topological estimation, where the goal is not to exhibit a data structure that faithfully approximates the underlying shape $X$, but simply to infer the topological invariants of $X$ from an input point cloud $L$. This problem has received a lot of attention in recent years, and it finds applications in a number of areas of science, such as sensor networks [19], statistical analysis [7], or dynamical systems [32,36]. A classical approach to learning the homology of $X$ consists in building a nested sequence of spaces
$$K_0 \subseteq K_1 \subseteq \cdots \subseteq K_m,$$
and in studying the persistence of homology classes throughout this sequence. In particular, it has been independently proved in [12] and [15] that the persistent homology of the sequence defined by the $\alpha$-offsets of a point cloud $L$ coincides with the homology of the underlying shape $X$, under sampling conditions that are milder than the ones of [9]. Specifically, if the Hausdorff distance between $L$ and $X$ is less than $\varepsilon$, for some small enough $\varepsilon$, then, for all $\alpha \geq \varepsilon$, the canonical inclusion map $L^\alpha \hookrightarrow L^{\alpha+2\varepsilon}$ induces homomorphisms between homology groups, whose images are isomorphic to the homology groups of $X$. Combined with the structure theorem of [38], which states that the persistent homology of the sequence $\{L^\alpha\}_{\alpha \geq 0}$ is fully described by a finite set of intervals, called a persistence barcode or a persistence diagram (see Figure 1, left), the above result means that the homology of $X$ can be deduced from this barcode, simply by removing the intervals of length less than $2\varepsilon$, which are therefore viewed as topological noise.
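To make the barcode reading concrete, here is a minimal Python sketch of how Betti numbers could be read off a barcode that has already been computed. It counts, in each dimension, the intervals that are alive at $\alpha$ and still alive at $\alpha + 2\varepsilon$, which is the rank of the homomorphism induced by $L^\alpha \hookrightarrow L^{\alpha+2\varepsilon}$. The function name `betti_from_barcode`, the `(dimension, birth, death)` tuple layout, and the half-open interval convention `[birth, death)` are assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def betti_from_barcode(barcode, alpha, eps):
    """Count, per dimension, the intervals spanning [alpha, alpha + 2*eps].

    barcode: iterable of (dimension, birth, death) tuples; death may be
             math.inf for classes that never die.
    Assumption: intervals are half-open, [birth, death), so a class is
    alive at time t exactly when birth <= t < death.
    """
    upper = alpha + 2 * eps
    betti = Counter()
    for dim, birth, death in barcode:
        # The class exists in L^alpha and survives into L^(alpha + 2*eps).
        if birth <= alpha and death > upper:
            betti[dim] += 1
    return dict(betti)


# Hypothetical barcode: one long interval per dimension plus short noise.
example_barcode = [
    (0, 0.0, math.inf),  # connected component that persists forever
    (0, 0.0, 0.05),      # short-lived component (noise)
    (1, 0.1, 0.9),       # long-lived 1-cycle
    (1, 0.2, 0.25),      # short-lived 1-cycle (noise)
]
print(betti_from_barcode(example_barcode, alpha=0.2, eps=0.1))
```

On this made-up input, only the two long intervals span the window $[0.2, 0.4]$, so the sketch reports one class in dimension 0 and one in dimension 1. Intervals shorter than $2\varepsilon$ can never span such a window, which matches their treatment as topological noise.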
From an algorithmic point of view, the persistent homology of a nested sequence of simplicial complexes (called a filtration) can be efficiently computed using the persistence algorithm [22,38]. Among the many filtrations that can be built on top of a point set $L$, the $\alpha$-shape makes it possible to reliably recover the homology of the underlying space $X$, since it is known to be a deformation retract of $L^\alpha$ [21]. However, this property is useless in high dimensions, since computing the $\alpha$-shape requires building the full-dimensional Delaunay triangulation. It is therefore appealing to consider other filtrations that are
…(Full text truncated)…