Geometric Learning and Topological Inference with Biobotic Networks: Convergence Analysis

1 Geometric Learning and T opological Inference with Biobotic Networks: Con v er gence Analysis Alireza Dirafzoon 1 , Alper Bozkurt 1 , and Edgar Lobaton 1 Abstract —In this study , we pr esent and analyze a framework for geometric and topological estimation f or mapping of unknown en vironments. W e consider agents mimicking motion behaviors of cyborg insects, known as biobots , and exploit coordinate- free local interactions among them to infer geometric and topological information about the envir onment, under minimal sensing and localization constraints. Local interactions are used to create a graphical r epresentation r eferred to as the encounter graph. A metric is estimated over the encounter graph of the agents in order to construct a geometric point cloud using manifold learning techniques. T opological data analysis (TD A), in particular persistent homology , is used in order to extract topological features of the space and a classiﬁcation method is proposed to infer robust features of interest (e.g. existence of obstacles). W e examine the asymptotic behavior of the proposed metric in terms of the conv ergence to the geodesic distances in the underlying manifold of the domain, and provide stability analysis results for the topological persistence. The proposed framework and its con vergences and stability analysis are demonstrated through numerical simulations and experiments. I . I N T R O DU C T I O N R ECENT dev elopments in neural engineering ha ve em- powered us to directly control insect locomotion us- ing wireless neuro-stimulators to enable remotely controlled cybor g insects, known as insect biobots [1]. Each agent is equipped with a system-on-chip based ZigBee enabled wireless neuro-stimulation backpack system and remote nav- igational control circuits. In particular , we have been able to control Gr omphadorhina portentosa (Madagascar hissing) cockroaches in open and mazed en vironments [1]–[3]. Exploit- ing unmatched natural abilities of insects in navigation and exploration, a c yber-physical network of such biobots, which we call a biobotic network , could prov e useful for search and rescue applications in uncertain disaster environments (e.g. when an earthquake occurs). Figure 1 illustrates a realization of a biobotic network, a snapshot of one of our cockroach biobots, and a summary of their natural mobility and sensing model. Exploration and mapping are essential tasks in such sce- narios, for which exploiting mobile robotic networks is ex- tremely beneﬁcial over using human rescuers in terms of safety and cov erage. T eams of small autonomous agents (e.g. biologically inspired milli-robots [4] or biobots), in particular , hav e certain advantages ov er traditional platforms since larger This work was supported by the National Science Foundation under award CNS-1239243. 1 A. Dirafzoon, A. Bozkurt, and E. Lobaton are with the Department of Electrical and Computer Engineering, North Carolina State University , Raleigh, NC, 27606 USA. e-mail: adirafz@ncsu.edu, aybozkur@ncsu.edu, ejlobato@ncsu.edu. (a) CRW ߠ ଵ ݈ ଶ (b) (c) (d) Fig. 1. (a) A physical en vironment with a biobotic (cyborg insect) network, (b) Gromphadorhina portentosa “biobot” with stimulation backpack and battery [1], (c) natural motion model for the biobots in terms of correlated random walk (CR W), (d) encounter region for two biobots. robotic systems may not be able to reach certain locations under the collapsed buildings. Furthermore, biobotic platforms can be particularly versatile at locomotion and na vigation in unstructured and dynamic scenes due to their natural ability to crawl through small spaces. Closely related to our work are the techniques in simulta- neous localization and mapping (SLAM) literature [5], [6]. Howe ver , the majority of recent approaches assume that the exploring agents are equipped with high-bandwidth sensing devices such as cameras or depth sensors [7]. On the other hand, mapping and localization in under-rubble en vironments using such platforms becomes extremely challenging due to hardware limitations and the unstructured nature of the en vironment. Po wer and computational resource constraints prohibit us from using traditional on-board imaging techniques for their localization (e.g., visual SLAM [7], [8]). Furthermore, traditional signal propagation based localization (e.g., GPS, or computing signal strength or time of ﬂight [9], [10]) may be unreliable for indoor or underground locations in cluttered en vironments, and odometry and inertial-based approaches (e.g., [11]) may be unreliable due to irregular conditions of the terrain. Hence, in this work, we focus on minimal sensing strategies for robotic networks in which such sensing modalities are unav ailable. Since obtaining an accurate map of the en vironment may not be possible in such scenarios, we propose an approach to construct a map that captures robust topological features and 2 approximates geometric information of the en vironment using only coordinate-free local interactions among the agents in a biobotic network. Such maps can be used for identifying local and global homology (i.e., connected regions and obstacles) of the space, identifying homotopic trajectories connecting regions in the space, and providing an approximate and in- tuitiv e visualization of the space. Additionally , the topological information can be used as a constraint to ensure that a metric reconstruction properly captures connecti vity in the space as it is done via loop-closure in traditional SLAM approaches. Moreov er , instead of providing continuous control feedback to the agents, we exploit natural stochastic motion models of the biobots and their encounter information. These strategies for motion and sensing accommodate for the hardware limitations of the biobotic platform under consideration. Methods from computational topology , on the other hand, can provide tools to extract topological features from point cloud data without requiring coordinate information, which makes them more suitable for scenarios in which weak or no localization is provided. In particular , topological data analysis (TD A), introduced in [12], employs tools from persis- tent homology theory [13] to obtain a qualitati ve description of the topological attributes of data sets sampled from high dimensional point clouds. T opological frame works have been emplo yed in applications on sensor and robotic networks, e.g. for cov erage and hole detection in such networks, under noisy and limited sensing with neither metric nor localization information a vailable [14]– [18]. Howe v er , these studies focus on stationary networks, are mainly concerned about the topology of the network rather than the characterization of the en vironment itself, and bypass the geometric estimation requirement by considering a purely topological algorithm. In [19] persistent homology was used in order to compute topological in variants and infer global topological information of spaces from interacting nodes in mobile Ad-Hoc networks; yet the nodes are assumed to follow a simple mobility model restricted to a graph. T opological localization and mapping has been also considered lately in [20]–[22]. Howev er , the approach in [20] is mostly focused on the localization problem, and requires the existence of trans- mitters in the environment pro viding signals of opportunity . The methodology of [21], [22], assume the existence of an IMU or a motion model for prediction of the agents’ states, as well as ﬁx ed landmarks on the boundaries of structures (e.g. buildings) used for observ ation. In this paper , howe ver , the mapping is intended to be performed using local encounters among the agents and not relying on e xternal resources (sig- nals of opportunity or kno wn ﬁxed landmarks), as they may be una vailable as a consequence of the emer gency response scenarios under consideration. This work is a formal mathematical framework for the approach introduced in [23] and extended in [24] for mapping of unkno wn en vironments under limited sensing constraints, where a notion of encounter metric is introduced based on local interactions among stochastic agents, and tools from TD A were exploited to extract topologically persistent spatial information such as connected components and obstacles of the en vironment. W e implemented the approach of [23] on a multi-robot network, the W olfBots [24]. In this paper , we expand the metric estimation procedure of [23] in order to construct more accurate geometric point clouds of the en vi- ronment, and analyze the precision of proposed methodology from a manifold learning [25] perspecti ve. W e deri ve bounds on the metric estimation error , and present conditions for which asymptotic con v ergence to geodesic distances holds. Furthermore, we present a topological framew ork to infer persistent features in the map of the en vironment, and analyze its stability under geometric uncertainty . This work, can be used as building blocks for mapping of larger scenarios by herding the network using an aerial vehicle sweeping the en vironment. The remainder of the paper is organized as follows: A concise background on the mathematical tools used throughout the paper is provided in section II. Section III describes the problem under study including models and assumptions for the agents in a biobotic network, followed by an ov erview of our approach in section IV. Geometric estimation methodology including asymptotic con vergence analysis is studied in section V, and topological estimation and its stability analysis are dis- cussed in section VI. Section VII illustrates and v alidates the approach via numerical simulation experiments, and ﬁnally , conclusion and future w ork are pro vided in section VIII. I I . M A T H E M A T I C A L B A C K G RO U N D A. Manifold learning Consider a manifold M whose structure and features are of interest (e.g. an unknown en vironment to be mapped). M usually is not directly accessible but a ﬁnite sampled version of it, X = { x 1 , . . . , x n } , can be used for computations. X can be represented as a point cloud , a set of points equipped with a metric , deﬁned by pairwise distances between the points. The metric can be obtained directly using the coordinates of the points, or by estimation of pairwise distances independently from their (possibly unkno wn) coordinates (see Fig. 2 (a)). Giv en a ﬁnite set of sampled data points X , manifold learn- ing is referred to the set of algorithms that try to discov er the intrinsic low dimensional structure of the embedding manifold M of the data set. A revie w of different manifold learning approaches can be found in [26]. Isomap [27], in particular , learns the underlying global geometry of the manifold captured giv en only sampled data points by preserving the geodesic distances . The geodesic distance between two points is the length of the shortest curve γ (geodesic) that connects the two points on M , i.e. d M ( x i , x j ) = inf γ { length ( γ ) } . As an example, on a sphere, the geodesic distances are obtained from the great circles. The main idea behind Isomap is to approximate geodesic distances on the manifold by the shortest path distance on the neighborhood graph of the samples taken from the manifold, using measured/estimated local metric information. Isomap is guaranteed to asymptotically recover geometric structure of a manifold giv en a dense enough sampled data set [27]. B. T opological Data Analysis (TD A) TD A, on the other hand, provides a framework for descrip- tion of topological features in a point cloud in a coordinate 3 Fig. 2. T opological persistence: (a) An example of a space M with a sampled point cloud X , (b) illustration of simplicial complexes, (c) a ﬁltration of simplicial complexes over X , (d) the corresponding persistence diagram with PD 0 ( X ) in green, and PD 1 ( X ) in red, highlighting persistence of connected components and holes over the scale. Plots (e) and (f) illustrate the stability of persistence: (e) a circular space M with noisy samples X , (f) persistence diagram of M and X and their bottleneck distance. free manner , while being robust to metric perturbations and noise. It employs persistent homology theory to extract the prominence of such features in terms of a compact multi-scale representation called persistence diagrams [28]. 1) Simplicial Comple xes: A standard method to analyze the topological structure of a point cloud is to map it into combinatorial objects called simplicial complexes . Given a set of samples X , a k -simplex is deﬁned as a set { x 1 , x 2 , . . . , x k +1 } , x i ∈ X , ∀ i and i 6 = j, ∀ i, j. For example, a 0 -simplex is a vertex, a 1 -simplex is an edge, and a 2 - simplex is a triangle. A ﬁnite collection of simplices is called a simplicial complex if for each simplex each of its faces are also included in the collection. Fig. 2(b) shows an example of 0,1,2-simplices constructed on points drawn from M . One way to b uild these complexes is to select a scale  , place balls of radius  on each verte x, and construct simplices based on their pairwise distance relativ e to  . A computationally efﬁcient complex, called the V ietoris- Rips complex R ( X ,  ) , consists of simplices for which the distances between each pair of its vertices are at most  . In other words, gi ven pairwise distances d over X , any subset X 0 ⊂ X is a simplex in R ( X ,  ) if and only if d ( x i , x j ) ≤ , ∀ x i , x j ∈ X 0 . Note that the construction of V ietoris-Rips complex only requires the knowledge of pairwise distances among the sampled points. 2) P ersistent Homology: T opological inv ariants of a space M are mappings which aim to classify equiv alent topological objects into the same classes. Homology (a type of topological in v ariance) can be summarized as a compact representation in the form of so-called Betti numbers, which are the ranks of the homology gr oups . The n -th Betti number , β n is a measure of the number of n -dimensional cycles in the space (e.g. β 0 is the number of connected components and β 1 is the number of holes in the space). For a dense enough sample, one can ﬁnd an interval for the values of  , for which constructed simplicial complex es belong to the same class of topological in v ariants as the space M . Ho wev er , ﬁnding the proper v alues for  is a strenuous job. On the other hand, persistent homolo gy theory considers construction of a sequence of such complex es for the sampled point cloud over multiple scales  , called a ﬁltration , and extracts geometrical and topological features of the dataset at speciﬁc scales. A ﬁltration X (  ) can be obtained by increasing  over a range of interest, with the property that if t < s, then X ( t ) ⊂ X ( s ) . Persistent homology computes the values of  for which the classes of topological features appear ( b i n ) and disappear ( d i n ) during ﬁltration, referred to as the birth and death values of the i -th class of features in dimension n . This information is encoded into persistence intervals [ b i n , d i n ] , or a compact representation called persistence dia gram [12]. The persistent diagram PD n ( X ) , is a multi-set of points ( b i n , d i n ) that summarizes ho w topological features of the point cloud vary ov er the scale  . Algorithms for computation of persistent homology can be found in [12]. An example of a topological space M together with sampled data X and corresponding ﬁltration over  is shown in Fig. 2 (a-c). In the persistence diagram in Fig. 2(d), features corresponding to dimension 0 and dimension 1 are sho wn in green and red colors, respectiv ely . The space M can be described topologically as ha ving 1 connected component and 1 hole, meaning that β 0 = 1 and β 1 = 1 for this example. Although there exist sev eral topological features in the PD , only 2 of them can be distinguished from the rest based on their distant from the diagonal. These points correspond to the persistent featur es with longer persistence interv als, referred to as topological signals , while the rest are short-lived features that are referred to as topological noise [29]. Hence one can infer the existence of one connected component and one persistent hole from the PD . 3) Stability of P ersistence Diagrams: One of the fundamen- tal properties of persistence diagram is its stability [30], which means a sufﬁciently small change in the metric results in a small change in the persistence diagram. There are v ariations of stability results based on different metric deﬁnitions. For a metric space ( M , d ) (a set M together with a metric d ), the Hausdorff distance between these two of its subsets X and Y , 4 d H ( X , Y ) , is deﬁned as the maximum distance from a point in one set to the closest point in the other set: d H ( X , Y ) = max { max x ∈X min y ∈Y d ( x, y ) , max y ∈Y min x ∈X d ( x, y ) } (1) Moreov er , for tw o metric spaces ( X , d 1 ) and ( Y , d 2 ) their Gr omov-Hausdorf f distance d GH (( X , d 1 ) , ( Y , d 2 )) is deﬁned as the inﬁmum of their Hausdorff distance over all possible isometric embeddings of these two spaces into a common metric space [31]. In order to compare persistence diagrams, a distance metric called the Bottlenec k distance , is deﬁned for two diagrams PD 1 and PD 2 as [30]: d B ( PD 1 , PD 2 ) = inf φ sup a ∈ PD 1 k a − φ ( a ) k ∞ , (2) where the inﬁmum is taken over all bijections φ : PD 1 → PD 2 and k . k ∞ denotes supremumnorm over sets. For two compact metric spaces ( X , d 1 ) and ( Y , d 2 ) , the persistence diagram is stable in the sense that [31]: d B ( PD X , PD Y ) ≤ 2 . d GH (( X , d 1 ) , ( Y , d 2 )) (3) Furthermore, if X and Y are embedded in the same space ( M , d M ) then (3) holds for the Hausdorff distance d H instead of 2 d GH . Fig. 2 (e) shows a circular space M along with noisy samples X drawn from it. The corresponding persistence diagrams PD M , PD X are presented in Fig. 2(f), which are not much different. Here the bottleneck distance can be described as the L ∞ norm between the points corresponding to the holes in each space as sho wn in the plot. I I I . P R O B L EM F O R M U L A T I O N Consider a network of biobotic agents mo ving stochastically within a bounded domain of interest M ⊂ R 2 . W e restrict our analysis to a 2D en vironment for simplicity while the techniques introduced do not rely on this assumption. Each agent is distinguished by its unique ID belonging to the set I = { 1 , . . . , n } , and its motion dynamics in the bounded space mimics the mo vement model of a biobotic insect. The mov ement model of the biobotic insect can be described by two types of motions: natural motion and contr olled motion , which will be described in the following subsection. A. Motion Model of the Biobotic Networks Natural Motion (NM) : The natural motion model of the biobotic agents is adopted from the probabilistic motion model of cockroaches in bounded spaces described in [32], de- scribed by their individual and gr oup behaviors. The individual mov ement of cockroaches can be characterized mainly by two behaviors, namely corr elated random walk (CRW) and wall following (WF). Cockroaches, in their natural mode, are known to move according to a CR W model when they are far enough from the boundaries of the domain, and perform WF beha vior when they detect the edges of the environment using their antennas [32]. During WF , they switch back again to CR W motion tow ards the interior of the domain after some stochastic amount of time. Moreo ver , it is known that during their CR W or WF motion, the insects probabilistically stop for some period of time and then continue their movement [32]. W e refer to the insects being in a static (S) mode when they stop. The CR W is modeled as piecewise linear mov ements with ﬁxed orientation, characterized by line segments l i , interrupted by isotropic changes in direction θ i , and constant average ve- locity of v m . The lengths of line se gments l has an exponential distribution with characteristic length l ∗ : p ( l ) = l ∗ − 1 e − l / l ∗ , (4) Changes in orientation also triggered by collision detection with other agents or obstacles in the scene. Fig. 1(c) sho ws an example of CR W motion with few line segments l i , with angular re-orientations of θ i . Cockroaches also manifest a group behavior , aggr egation , when they interact with each other [33]. In this paper , howe ver , we focus on the CR W behavior for simplicity . Hence our model of natural motion will be reduced to a two state graphical model switching between CR W and S. A. Dirafzoon et al [23] explored the effect of WF behavior on the exploration and mapping process. Note that although this model is particularly de veloped and studied for German cockroach Blattella germanica , many animals (in particular other species of cockroaches) are known to display similar motion behaviors with possibly dif ferent parameters for CR W or Levy W alk [34]. Contr olled Motion (CM) : The remotely controlled biobots are equipped with system-on-chip based commands of the following types: turn left , turn right , start or stop motion. In this paper , we only consider start and stop commands in order to switch the mode of the agents from CR W to S and vice versa. B. Sensing and Pr ocessing The sensing and communication model for the agents is inspired by limited sensing capabilities of the biobotic insects, consisting of a wireless transmitter and receiv er provided by the system-on-chip based ZigBee enabled backpacks attached to their bodies [1]. Each agent is assumed to have a limited proximity sensing capability , and can identify and communi- cate with other agents within a detection radius r d , as sho wn in Fig. 1(d). The agents are able to record their encounters with each other and the corresponding times as encounter events . Furthermore, the nodes are able to report their status as being in a CR W or S state. Local encounters are the only piece of information provided, and no coordinates or other localization information is av ailable. Note that odometry and inertial mea- surements are considered to be too unreliable due to the une ven and unstructured terrains present in our application scenario. W e do not consider an y external obstacle detection/collision mechanism as we rely on the insects’ “built-in” instinct for collision detection and their natural change of direction in case of collision. C. Objective Giv en this minimal sensing scenario, our goal is to learn the underlying geometry and infer the topology of the unknown en vironment as the swarm of agents ef ﬁciently e xplore the domain. As the swarm of agents share information about 5 Event Detection Dimension Reduction Filtration Persistent Homology Classification (c) (d) (f) λ Death Dim 0 Dim 1 Birth (a) (b) Agent - CRW Agents - S Encounter points Metric Estimation (e) Persistent Diagram (PD) Fig. 3. Overvie w of our methodology: (a) an en vironment with two obstacles and a snapshot of the location of 100 agents (gray circles), (b) a subsample of collected coordinate-free encounter nodes (pink circles) ov er which a metric is estimated, (c) the point cloud obtained from projection of the encounter metric (d) a simplicial Rips-complex representation, (e) persistence diagram of the complex highlighting connected components in blue and holes in red dots classiﬁed as signals and noisy features, and (f) a topological sketch of the en vironment. their local encounters during exploration, a coordinate-free map of the scene is built, that consists of geometric point clouds representing estimates of distances between the sam- pled points, and a topological summary of the environment (e.g. connected components and obstacles). W e refer to such a map as Geometric-T opological P ersistent Map (GTP-Map) of the environment. W e measure the quality of geometric estimation via computation of errors in geodesic distance estimated and quality of topological inference via errors in the topological features in the persistence diagrams and the inferred ranks of the homology of the space. I V . M E T H O D O L O G Y A high le vel ov erview of our approach is illustrated in Fig. 3. A physical en vironment with 2 holes (corresponding to obstacles) is shown in Fig. 3(a) with agents whose motion mimics the biobots movement model. The objectiv e is to estimate a 2D embedding which best represents intrinsic geometry and topological features of the environment using local interactions among the agents. This task is accomplished by the follo wing stages: (1) exploration and e vent (data) col- lection, (2) metric reconstruction, and (3) topological feature extraction. Exploration and Event Collection : The e xploration is initi- ated with a dispersion stage, where all nodes start their motion in a CR W status until dispersed throughout the en vironment, while maintaining their connectivity . This is follo wed by a landmark selection stage, where a small percentage of the nodes are selected as static landmarks using a modiﬁed maxmin landmark selection approach [35], and are com- manded to switch to S mode. Given the adjacency graph of the communication, we use its hop distance to select landmarks iterativ ely by maximizing the distance from the previous set of landmarks. If the graph is not connected, the landmark agents are selected randomly from each cluster with probability equal to the percentage of agents in each cluster . The encounters among the agents (Fig. 3(b)) are recorded as data points, over which a distance metric is estimated after it is pre-processed. The estimated point cloud is projected to 2D space using a dimension reduction technique (Fig. 3(c)). Persistent homology extracts the persistence diagram for the simplicial ﬁltrations of the point cloud (Fig. 3(d)) which will be used for topological inference. Details on metric and topological estimation are provided in the follo wing sections. V . G E O M E T R I C E S T I M A T I O N A N D A NA LY S I S A. Metric Estimation An encounter event e i takes place between tw o agents if their corresponding positions fall within a distance of r d , and is recorded by an agent as a tuple: e i = ( t 1 i , t 2 i , ID 1 i , ID 2 i ) , (5) where [ t 1 i , t 2 i ] denotes the time interval of the encounter , and ID 1 i and ID 2 i represent the IDs of the two agents encountering. Deﬁne t i = ( t 1 i + t 2 i ) / 2 as the center of the time interval, and ID( e i ) = { ID 1 i , ID 2 i } as the ID set of the ev ent e i . Assigned to each ev ent e i is a coordinate p i ∈ M , that represents the center of mass of the set of location at which the agents with ID 1 i and ID 2 i met. W e denote the set of all e vents as E = { e i } , and the set of corresponding coordinates as P E = { p i } ⊂ M . Furthermore, deﬁne the landmark set L = { l k } ⊂ I as the set of IDs of the agents that are assigned to be static (in S mode). Given the set L , then the set C k ⊂ E of all landmark events corresponding to the k -th landmark is deﬁned as C k = { e | l k ∈ ID( e ) } . (6) Note that C k represents a cluster of e vents, which is referred to as a community in a graph; so we refer to the set C k as the k -th landmark community . For each landmark community C k , deﬁne its coordinate q k as the center of mass of the positions of all the events in the cluster . W e denote the set of all landmark communities as C = {C k } , and the corresponding coordinates as P L = { q k } ⊂ M . The sets E and C can be considered as unlocalized samples gathered from the environment, which can be projected into sampled subspace P E and P L drawn from M . W e construct distance metrics on E and C that approximate pairwise dis- tances close to the geodesic distances in P E and P L induced by the manifold distances d M : M × M → R + . The idea is to approximate geodesic distances by the shortest path distance 6 on the neighborhood graph of the samples taken from the manifold. Construct an undirected weighted graph G with vertices corresponding to the e vents e i , denoted as the encounter graph . Gi ven G , for any two v ertices e i and e j , include an edge e ij if they have a common agent in volv ed in both ev ents, i.e. ID( e i ) ∩ ID( e j ) 6 = ∅ . The corresponding weight w ij will represent pairwise distances to be calculated as follo ws. For e i and e j , deﬁne the pairwise encounter time metric d E as d E ( e i , e j ) = | t i − t j | if e i and e j are connected. Note that, for connected nodes e i and e j , the Euclidean distance between the corresponding coordinates p i and p j is upper bounded as k p i − p j k 2 ≤ v m . | t i − t j | . Provided that the agents were moving only on straight lines and there were no obstacles in the ﬁeld, this upper bound could ha ve precisely approximated the Euclidean distance. Ho wev er , due to CR W motion of the agents and existence of the obstacles, the accumulated errors of the pairwise distances o ver time will cause large drifts in metric estimation. T o reduce the amount of such drifts, we furthermore im- prov e our graphical model by incorporating the fact that landmark nodes are static. Consider again the set L of the landmark nodes. Then, for two ev ents e i and e j , if there exists a landmark community such that e i , e j ∈ C k , they had to occur geometrically at nearby locations. Therefore, the corresponding weight w ij is set to 0 in this case, and we deﬁne a pairwise encounter metric d E L as d E L ( e i , e j ) = ( 0 , e i , e j ∈ C k for some k , d E ( e i , e j ) , otherwise . (7) T o proceed with the construction of geometric point cloud, we perform a graph shortest path algorithm on the graph G , and deﬁne our encounter metric on the nodes in G as: d G E L ( e i , e j ) = min P { d E L ( x 0 , x 1 ) + . . . + d E L ( x p − 1 , x p ) } , (8) where P = ( x 0 = e i , x 1 , . . . , x p = e j ) denotes a path along the edges of graph G connecting e i to e j . The shortest paths and their corresponding lengths can be found from Dijkstra’ s or Floyd’ s shortest path algorithms [36]. The graph distances d G E are deﬁned similarly . Graph G consisting of nodes corresponding to the events e i and edge weights set as w ij = d G E L ( e i , e j ) , is referred to as the encounter metric graph , which approximates the geodesic distances o ver M from the sample points. W e refer to the set of nodes in G together with the metric d G E L as the encounter point cloud of M . The estimated point cloud lies in a higher dimensional space although its intrinsic geometry can be embedded in a lower dimensional spaces. W e make use of classical Multi-dimensional Scaling (MDS) [37] for dimension reduction in order to ﬁnd a 2D embedding of the point cloud data. MDS, is typically used for visualization of dissimilarity matrices by calculating a set of of coordinate whose conﬁguration minimizes a loss function [37]. B. Metric Analysis In the follo wing, we analyze and discuss con vergence and asymptotic behavior of the proposed metric for approximation of geodesic distances, and in vestigate assumptions and condi- tions on sampling and graph connectivity under which d G E L is guaranteed to recover the true geometry of M . For events e i and e j , deﬁne d M as the distances between their corresponding coordinates, p i and p j , i.e. d M ( e i , e j ) = d M ( p i , p j ) . Moreov er , for a point p ∈ M and an e vent e i , we deﬁne the pairwise distance d M ( p, e i ) = d M ( p, p i ) . For an ev ent e i and a landmark community C k , deﬁne their pairwise distance as d G E L ( e i , C j ) = min e ∈C j d G E L ( e i , e ) . Accordingly , for two landmark communities C i , C j , set the encounter metric for communities d G E L ( C i , C j ) as the shortest path distance from one community to another . In the following we will show that if there exist enough number of encounters among the agents and the encounter graph is suf ﬁciently dense, the encounter metric for commu- nities will con ver ge to the manifold distances between their corresponding positions in M . Furthermore, the encounter metric for all pair of nodes in G will con v erge to the manifold distance as the number of landmarks increases. In our analysis, we make an assumption of unit linear velocity for CR W motion, i.e. v m = 1 , for the sake of simplicity . The results can be easily extended for a general case with a scaling factor of 1 /v m . Moreover , we neglect the effect of the detection radius r d on the obtained bounds; yet it can be shown that it will introduce a bias term proportional to r d in the errors. Under these assumptions, we can identify an event e i with a single point p i and time t i . For simplicity , we ab use our notation and redeﬁne e i = ( p i , t i ) ∈ M T := M × T , where T represent the time domain, and deﬁne the projections Π M : M T → M and Π T : M T → T . W e further deﬁne a metric on the product space M T as d M T ( e , e 0 ) = inf Γ ∈ A ( e , e 0 ) S Γ , (9) where A ( e , e 0 ) is the set of continuous functions of the form Γ( s ) = ( γ M ( s ) , γ T ( s )) such that Γ(0) = e and Γ( S Γ ) = e 0 , |∇ M γ M ( s ) | = 1 and | ∂ γ T ∂ t ( s ) | = 1 . This set of trajectories correspond to curves in M tra veled at constant velocity that hav e been lifted to M T . In order to analyze the asymptotic behavior of our approach, we assume that the set of trajectories and events become dense enough such that we can approximate any of the geodesics trajectories in M T . Let δ e and ε be positiv e real numbers. W e require the following conditions to hold: Assumption 1 (Sampling Conditions): (i) For e very point e ∈ M T , there e xists a sampled e vent e i such that d M T ( e , e i ) ≤ δ e . (ii) For any two neighboring e vents e i and e j such that d M T ( e i , e j ) ≤ ε , there exists a τ such that d E ( e i , e j ) ≤ τ , where τ →  as the number of agents increases to inﬁnity . Assumption 2 (Gr aph Connectivity Condition): The graph G contains all edges for which d M T ( e i , e j ) ≤ ε . Condition (i) corresponds to the δ e -dense encounter sam- pling condition for the ev ents, which is guaranteed as the number of agents or the running time increases. Condition (ii) can be guaranteed by the motion models considered. As the number of agents increases, each path connecting nearby 7 ev ents will look more like a straight path (due to the CR W model under consideration). Lemma 1: For any two encounter ev ents e i and e j we have: d M T ( e i , e j ) ≤ d G E ( e i , e j ) . (10) Pr oof: First, we note that gi ven neighboring e vents e i and e j in G then d M T ( e i , e j ) ≤ d E ( e i , e j ) since d M T is deﬁned as the inﬁmum over all trajectories of constant velocity , and the trajectory between e i and e j is one particular realization. Furthermore, the trajectory deﬁned by the path P = ( x 0 = e i , x 1 , . . . , x p = e j ) associated with the shortest path in ( G , d E ) corresponds to one particular realization of the set of trajectories considered in the deﬁnition of d M T ; hence, we have the desired inequality . Lemma 2: For two encounters e i and e j , under sampling Assumption 1 and graph connecti vity Assumption 2, if 4 δ e < ε then: d G E ( e i , e j ) ≤ (1 + λ 1 ) · λ 2 · d M T ( e i , e j ) , (11) where λ 1 = 4 δ e /ε and λ 2 = τ /ε . Pr oof: Let Γ( s ) := ( γ M ( s ) , γ T ( s )) for s ∈ [0 , S Γ ] be an arbitrary curve in A ( e i , e j ) as deﬁned in Equation (9). W e decompose this curv e into arcs with lengths s k by deﬁning a sequence of points Γ 0 = e i , Γ 1 , . . . , Γ n , Γ n +1 = e j along Γ such that S Γ = P n k =0 s k with s k = ε − 2 δ e for k = 1 , . . . , n − 1 , and ( ε − 2 δ ) / 2 ≤ s 0 , s n ≤ ε − 2 δ e . Based on condition (i) of Assumption 1, for each point Γ k there exist an encounter e vent x k within distance δ e from Γ k . Additionally , for each pair ( x k , x k +1 ) in the path P = ( e i , x 1 , . . . , x n , e j ) we ha ve d M T ( x k , x k +1 ) ≤ d M T ( x k , Γ k ) + d M T (Γ k , Γ k +1 ) + d M T (Γ k +1 , x k +1 ) ≤ s k + 2 δ e = ε for k = 1 , . . . , n − 1 , which implies that there is an edge between x k and x k +1 in G by Assumption 2. Then, by condition (ii) in Assumption 1, d E ( x k , x k +1 ) ≤ τ = λ 2 · ε = λ 2 · ε · s k / ( ε − 2 δ e ) , where the last equality follows from the deﬁnition of s k . Similarly , one can show that d E ( e i , x 1 ) ≤ λ 2 · ε · s 0 / (  − 2 δ e ) and d E ( x n , e j ) ≤ λ 2 · ε · s n / (  − 2 δ e ) . By summing all distances along the path P , we ha ve that d G E ( e i , e j ) ≤ d E ( e i , x 1 ) + d E ( x 1 , x 2 ) + · · · + d E ( x n , e j ) ≤ λ 2 · ε · S Γ / ( ε − 2 δ e ) . Using the fact that 1 / (1 − x ) < 1 + 2 x for positive x < 1 / 2 , one concludes d G E ( e i , e j ) ≤ λ 2 · S Γ · (1 + 4 δ e /ε ) for 4 δ e < ε . T aking the shortest path among all paths between e i and e j we have the desired result. Combining the abov e lower and upper bounds on d G E L ( e 1 , e 2 ) , we can conclude the main result for the con ver- gence of distances as follo ws. Theor em 1: For any pair of e vents e i , e j , provided that Assumptions 1 and 2 hold, and 4 δ e < ε then d M T ( e i , e j ) ≤ d G E ( e i , e j ) ≤ (1 + λ 3 ) · d M T ( e i , e j ) , (12) where λ 3 = τ /ε − 1 + 4 δ e τ /ε 2 . Landmark communities are the set of events associated with a particular landmark at a ﬁxed location, so we can represent them in M T as the set S k = Π − 1 M ( p k ) where p k ∈ M is the physical location of landmark l k . Hence, we can deﬁne a ne w metric space M L T which is M T with the proper identiﬁcation along the set of landmark communities {C k } . W e can show that this ne w space has a similar metric to the original space. Lemma 3: Gi ven tw o landmark ev ents e k ∈ S k and e k 0 ∈ S k 0 , where p k and p k 0 are the location of landmarks l k and l k 0 ∈ L , then d M L T ( e k , e k 0 ) = d M ( e k , e k 0 ) . (13) Pr oof: Note that the deﬁnitions of d M and d M L T both require curves γ M ( s ) parameterized by arc-length in M . Howe ver , d M L T would also impose some additional condi- tions on γ T ( s ) based on the structure in M L T . Therefore, d M L T ( e k , e k 0 ) ≥ d M ( e k , e k 0 ) since the inﬁmum for d M is ov er a bigger set. Furthermore, giv en that γ ∗ M ( s ) is a geodesic in M with arc- length S ∗ Γ joining p k and p k 0 , then Γ( s ) = ( γ ∗ M ( s ) , t k + s ) is a valid trajectory in M L T joining e k and ˆ e k 0 = ( p k 0 , t k + S ∗ Γ ) . This means that d M L T ( e k , ˆ e k 0 ) ≤ d M ( e k , ˆ e k 0 ) . Ho wev er , since the e k 0 and ˆ e k 0 ∈ S k 0 then they are both identiﬁed as the same point in M L T . Hence, d M L T ( e k , e k 0 ) ≤ d M ( e k , e k 0 ) , which concludes the proof. Lemma 4: Given two e vents e i and e j , then d M L T ( e i , e j ) = min ( d M T ( e i , e j ) , inf l k ,l k 0 ∈ L { d M ( e i , S k ) + d M ( S k , S k 0 ) + d M ( S k 0 , e j ) }  . (14) Pr oof: The geodesic associated with the distance metric d M L T either passes through the position of a landmark or it does not. Assuming that it does not pass through one of the landmarks that the distance metric will be the same as d M T since we are not using the identiﬁcation in M L T . If it passes through a landmark then the distance will be equal to d M L T ( e i , S k ) + d M L T ( S k , S k 0 ) + d M L T ( S k 0 , e j ) . Ho we ver , we know that d M L T ( S k , S k 0 ) = d M ( S k , S k 0 ) by Lemma 3, and we can similarly show that d M L T ( e i , S k ) = d M ( e i , S k ) and d M L T ( S k 0 , e j ) = d M ( S k 0 , e j ) . Hence, taking the minimum of both possible scenarios we get the desired result. Finally , combining the result of Lemma 4 with Theorem 1, we obtain the desired bound on landmark communities. Theor em 2: For any pair of landmark communities C k , C k 0 , provided that Assumptions 1 and 2 hold, and 4 δ e < ε then d M ( C k , C k 0 ) ≤ d G E L ( C k , C k 0 ) ≤ (1 + λ 3 ) · d M ( C k , C k 0 ) , (15) where λ 3 = τ /ε − 1 + 4 δ e τ /ε 2 . Pr oof: The inequality d M L T ( E k , E k 0 ) ≤ d G E L ( E k , E k 0 ) can be shown by following the same arguments as in Lemma 1. Hence, d M T ( E k , E k 0 ) ≤ d G E L ( E k , E k 0 ) follows by Lemma 3. Similarly , the second inequality follows by using the argument in Lemma 2 and the results in Lemma 3. Note that we can also generalize the results of this theorem to include bounds on the conv ergence of distances between an y arbitrary e vents by using Lemma 4. Howe ver , this result does not provide conv ergence to d M for an arbitrary pair of e vents, and will depend on an additional constraint on the density of the conﬁguration of landmarks in M . Assumption 3 (Landmark Density Condition): For any landmark l k , there exists another landmark l k 0 such that d M ( C k , C 0 k ) ≤ δ l . Moreov er , for any two distinct landmark communities d M ( C k , C k 0 ) > 0 . 8 Cor ollary 1: Given two ev ents e i and e j , provided that Assumptions 1-3 hold, and 4 δ e < ε then d M ( e i , e j ) ≤ d G E L ( e i , e j ) ≤ (1 + λ 3 ) · ( d M ( e i , e j ) + 2 δ l ) . (16) Pr oof: The left hand side inequality can be obtained by following the same ar guments as in Lemma 1 to show that d M L T ( e i , e j ) ≤ d G E L ( e i , e j ) , and using the fact that d M ( e i , e j ) ≤ d M L T ( e i , e j ) from the ar gument in Lemma 3. For the second inequality , follo wing the same argument as in Theorem 1 and Lemma 4 we hav e d G E ( e i , e j ) ≤ (1 + λ 3 ) · d M L T ( e i , e j ) and d M L T ( e i , e j ) ≤ inf l k ,l k 0 ∈ L { d M ( e i , S k ) + d M ( S k , S k 0 ) + d M ( S k 0 , e j ) } . In particular , if we let k 1 and k 2 be the landmarks closest to S k and S k 0 respectiv ely , then d M L T ( e i , e j ) ≤ δ l / 2 + d M ( S k 1 , S k 2 ) + δ l / 2 ≤ δ l + d M ( S k 1 , e i ) + d M ( e i , e j ) + d M ( e j , S k 2 ) ≤ 2 δ l + d M ( e i , e j ) . Theorem 2 asserts that gi ven enough number of encounter samples from the en vironment and graph connectivity con- dition, if the CR W motion of the biobots is close enough to straight line segments for small enough neighborhoods, the proposed encounter metric asymptotically recovers the geodesic distances between landmarks. In particular, for a ﬁxed size and shape of the en vironment, and the speed of agents, the encounter sampling density δ e is inv ersely proportional to |E | , which is proportional to the number of moving agents N m . Hence, with increased number of moving agents, δ e will ev entually con verge to 0 and λ 1 can be chosen arbitrary small. τ , on the other hand, is dependent on CR W parameters, the existence of obstacles in the space, and the encounter sampling as well. Consider a local subdomain (patch) of M , where there are no obstacles within. For this patch, τ is mainly determined by the de viation of the trajectories agents tra verse between their encounters from straight line segments. If the agents mo ve on straight lines and change direction only if they detect other agents or obstacles, then: for any two neighboring encounters d E ( e i , e j ) = d M T ( e i , e j ) , so δ e ≤ ε and λ 2 → 0 . This depends on the probability of each line se gment in CR W model being larger than encounter sampling density . Considering the CR W model of (4), p [ l ≥ δ e ] = e δ e / l ∗ , which will con verge to 1 if δ e / l ∗ → 0 . Hence, for a ﬁxed l ∗ , in an obstacle free space if δ e → 0 , then λ 2 → 0 . Conv ergence of λ 2 also can realized by increasing l ∗ for a ﬁx ed sampling density , although smaller l ∗ will result in better dispersion performance, which is representing a trade-off between exploration and mapping performance. When there exist obstacles, the change of direc- tion due to obstacle av oidance introduces an extra source of error , which can also be decreased by having enough number samples close to the boundaries of obstacles. V I . T O P O L O G I C A L I N F E R E N C E So far , we hav e constructed point clouds for geometric estimation of the domain M . Howe ver , such models are not precise and is subject to errors due to uncertainty on the distances, which may lead to deformations in the shape of the space. In order to address this issue, we exploit persistent homology , that is stable to bounded perturbations of the dis- similarity measure [30], along with a statistical classiﬁcation approach in order to infer topological features of M from coordinate-free encounter point clouds. Consider again the point cloud E with the encounter metric d G E L . W e construct ﬁltrations of V ietoris-Rips simplicial com- plex es R ( E ,  ) , denoted as encounter complexes , and compute its persistent homology to extract topological features in terms of persistence diagrams. An example of a simplicial comple x constructed o ver E is sho wn in Fig. 3(d). Similarly , one can calculate the persistence over the ﬁltrations of simplicial complex es R ( C ,  ) . W e restrict our computations to the di- mensions 0 and 1 of persistence (corresponding to connected components and holes/obstacles in the point cloud) as the higher dimension are not applicable in our e xperiments. This process is, ho wev er , preceded by a subsampling step for cleaning up the data. Subsampling and Outlier Removal Extracting topological information from the whole point cloud would be computa- tionally expensiv e due to the large number of e vents that are created ov er time, and it can be prone to error due to outliers in the estimated metric; hence using a subsampling algorithm would be beneﬁcial. A common subsampling method in per - sistent homology which preserves topology is the maxmin algorithm, b ut it is very sensiti ve to outliers. Hence, we employ a modiﬁed version of maxmin by adding a pre-processing layer for outlier remov al using a K-nearest neighbor (KNN) density estimation [38]. For a node e i in graph G , let d k ( e i ) denote the dis- tance between e i and it’ s k-nearest neighbor , and d k ( e i ) be the a verage distance to its k-nearest neighbors (in versely proportional to the local density of the point cloud around e i ). Consider the density of all average distances over E as ρ k ( E ) and select the threshold τ q ,k to be q -quantile of ρ k ( E ) . Then we select a subset V ⊂ E of points v i such that d k ( v i ) < τ q ,k , ∀ v i ∈ V , which will remov e outliers of E to a lar ge extent. Finally , we apply the maxmin algorithm on the pre-ﬁltered point cloud V for further subsampling. The ﬁrst sample x 1 is chosen randomly from the point cloud V . Iterativ ely , if X = { x 1 , ..., x k − 1 } is the set of previously selected samples, ﬁnd the next sample as a point x k ∈ V s.t. maximizes d ( x k , X ) . A. T opological Stability Analysis Here we will analyze topological stability of our metric estimation, and show that our geometrically estimated point cloud will provide estimations of features of environment that are topologically stable. Consider the metric spaces ( M , d M ) , ( E , d G E L ) , and ( C , d G E L ) , and let PD M , PD E , PD C represent the corresponding persistence diagrams obtained from ﬁltrations of simplicial complexes R ( M ,  ) , R ( E ,  ) , R ( C ,  ) , respec- tiv ely . W e will exploit the bounds obtained in section V -B in order to show stability of persistence diagrams over E and C . Theor em 3 (Stability of persistence): Under assumptions and conditions of Theorem (1) and Theorem (2), the proposed encounter metric is topologically stable i.e. the persistence 9 diagrams PD E and PD C are not much dif ferent form PD M in terms of the bottleneck distance: d B ( PD M , PD C ) ≤ λ 3 · δ l + δ l , (17) d B ( PD M , PD E ) ≤ (1 + λ 3 ) · 2 δ l + (1 + λ 3 ) · δ e . (18) The bottleneck distances between the persistence diagrams in (17) and (18) are bounded by the errors that correspond to the summation of geometric estimation err or and sam- pling error . In (17) the term λ 3 δ l corresponds to the metric estimation error between landmarks, which will v anish if λ 3 → 0 as discussed in the pre vious section. The term δ l , on the other hand, is purely the density of landmarks in the space, and determines the topological estimation error using only landmarks as the samples of the M if we had the true metric ov er the space C . Likewise in (18), the ﬁrst term corresponds to the metric estimation error between all encounters which will v anish if both δ e , δ l → 0 . The term (1 + λ 3 ) δ e corresponds to the topological perturbation using the sampled set of encounters which will con ver ge to zero if δ e → 0 . B. T opological Classiﬁcation Persistent diagrams pro vide a means for extracting topo- logical features out of the point cloud, in terms of birth and death of features as the value of ε varies, encoded in persistence interv als. Howe ver , one further needs to classify persistent features from noisy detections that emerge due to subsampling and metric estimation errors. For a dense enough sampled data, ho we ver , one can ﬁnd a range of v alues for which the homology of simplicial complex es is equal to the one for M [39]. Here, we propose a robust and scale-in variant classiﬁcation algorithm for persistence interv als by learning such thresholds based on a non-parametric density estimation of noise in order to separate persistent features form noisy ones and infer the correct Betti numbers of the underlying space. For our statistical analysis, we consider a collection of sampled point clouds {E ( p ) } N p =1 of a topological space M . For each E ( p ) , we can extract from its persistence diagram a collection of n -th Betti number persistent interval lengths { r ( p ) ,k n } k , where r ( p ) ,k n = | d ( p ) ,k n − b ( p ) ,k n | . Denote the q n quan- tile for this collection by r ( p ) q n , and deﬁne the n -th estimated Betti number for the p -th point cloud as ˆ β ( p ) n ( θ n ) = X k 1 R + r ( p ) ,k n r ( p ) q n + ∆ n − τ n ! , (19) where θ n = ( q n , ∆ n , τ n ) is the estimation parameter vector , and 1 R + ( . ) is the indicator function of R + . This function speciﬁes that only those persistent intervals for which the normalized v alue of r ( p ) ,k n is greater than τ n are considered as “true” persistent features. Furthermore, deﬁne the n -th Betti number error as e ( p ) n ( θ n ) =    ˆ β ( p ) n ( θ n ) − β n    , (20) where β n is the n -th Betti number of M . Hence, the average error for the collection of sampled points can be expressed as e n ( θ n ) = 1 N N X p =1 e ( p ) n ( θ n ) . (21) In order to perform classiﬁcation with this metric, we generate a set of samples from the same topological space and train our classiﬁer by ﬁnding the v alue of θ n that minimizes the error function e n . In order to make the cost function smooth for optimization purposes, one can replace the indicator func- tion 1 ( x ) with a sigmoid function σ α ( x ) = 1 1+ e − αx , where α is a scaling parameter . W e denote by θ ∗ n the optimal parameter value found using brute force optimization. Finally , we can compute the error e n ( θ ∗ n ) for a ne w collection of sampled data for testing purposes. V I I . N U M E R I C A L A N A L Y S I S R E S U LT S In this section, we present numerical simulation experiments to demonstrate the v alidity and performance of our approach. W e ha ve considered two-dimensional en vironments with ba- sic geometric shapes for simplicity . Howe ver , the approach makes no use of these assumption and is directly applicable to more unstructured en vironments. Additionally , videos of the simulations for more complex en vironments as well as experiments with W olfbots are av ailable in the project website 1 . The software is composed of a biobotic swarm simulator and a GTP-mapping package. The sw arm simulator (in C++) reproduces the behavior of biobots by combining natural as well as control motion patterns. The mapping package (in C++ and Matlab) performs geometric estimation and topological inference. In the swarm simulator , CR W parameters are set to be C = 1 , l ∗ = 0 . 30 m, v m = 0 . 1 m/s . The change of direction is isotropic e verywhere except the boundaries of the en vironment. The detection radius of each agent for encounter and boundary recognition is adopted to be r d = 1 cm . For computation of persistence intervals and persistent diagrams, we exploit Dionysus C++ library [40]. A. Example Scenarios Here we illustrate our methodology presenting the results for three different example scenarios as sho wn in Fig. 4. W e consider square shaped physical environments with dimen- sions 50 m × 50 m and three scenarios as (1) no obstacles, (2) one obstacle in the middle, and (3) two obstacles. A total number of 150 agents are considered to start their CR W motion from random initial conﬁguration among which N l = 20 are selected as landmarks after the initial dispersion and are commanded to switch to S mode, while the others ( N m = 130 ) keep moving as CR W motion, and exploring the area for 200 seconds. Fig 4(a) sketches snapshots of the en vironments together with moving agents (gray circles) at t = 0 and the set of landmark nodes (sho wn as pink dots). W e constructed estimated geometric point clouds and persis- tence diagrams for each scenario. Simplicial representations of each point cloud in the form of Rips comple xes R ( E ,  ) , where 1 https://research.ece.ncsu.edu/aros/project/ 10 ( b) Ge om e t ri c poi nt c l ouds ( c) P er si st en ce d i ag r am s ( a) P h y si cal en v i r o n m en t s S c e na ri o 1 S c e na ri o 2 S c e na ri o 3 Maxmin KNN - maxmin 0 2 4 6 8 10 Performance measure 0 20 40 60 80 100 P S P N SNR ( d) ( e) Birth Death 0 28 0 28 d b = 2.81 Birth Death 0 35 0 35 d b = 2.938 Birth Death 0 25 0 25 d b = 3.731 Death Death dim 0 dim 1 Birth Birth 0 0 25 25 M a x m i n K N N - ma x min M a x m i n K NN - ma x min 28 28 28 P er f o r m an ce m easu r e Fig. 4. Simulation results for e xample scenarios (1) - (3): (a) a snapshot of the physical en vironments with moving agents (gray circles) and landmarks (green circles), (b) simplicial complexes built ov er learned geometric point cloud, (c) persistence diagrams PD E PD M , their bottleneck distances, and topological classiﬁcation hyperplanes H 0 and H 1 . Subsampling performance: (d) comparison of subsampled point clouds from standard maxmin and KNN-maxmin, and their persistence diagrams, and (e) their SNR performance for 100 independent runs. # of holes 0 1 2 3 SE 0.5 0.75 1 Robustness to change in # of features Scale 0.75 1 1.25 SE 0.5 0.75 1 Robustness to Scaling 0 3 6 9 0 20 40 Birth Death 0 25 0 ∞ l 1 τ 1 Density of PD 1 PD (b) (c) (e) (d) (a) ρ ( r 1 ) Fig. 5. Robust Classiﬁcation of persistence intervals: (a) A 2-hole en vironment, (b) the extracted persistence diagram, (c) density of persistence intervals with the appropriate threshold, (d) rob ustness of classiﬁcation algorithm to scaling, and (e) to change in the number of features.  = 0 . 5 max( d G E L ) , are shown in Fig. 4(b). The obstacles in the physical en vironments are represented as holes in these representations. The persistent diagrams for the ﬁltrations over the point clouds are sketched on top of the true persistence diagrams of the spaces in Fig. (4). On each persistence diagram, 0-dimensional and 1-dimensional topological features (con- nected components and holes) are plotted as “blue” and “red” points, respecti vely , and the class hyperplanes H 0 and H 1 separating topological features from noise are drawn as dashed lines. The classiﬁcation thresholds are obtained after 50 independent runs as τ 0 = 25 , τ 1 = 30 . The features classiﬁed as signals are marked with green rectangular patches. On each diagram, we calculated the bottleneck distance between the persistence intervals obtained o ver the ﬁltration of the true manifold distances and the encounter metric d B ( PD M , PD E ) . The maximum topological estimation errors are in the order of 4 . 2% of the largest pairwise distance v alue in the en vi- ronments. It can be observed that the point clouds capture the correct topological features of the domains, which can be veriﬁed by the features classiﬁed as signals in the persistence diagrams. Moreo ver , the estimated point clouds also contain some geometric information, which could be exploited to improv e the mapping accuracy for approaches that make use 11 of additional sensing and localization information. In order to explore the effect of point cloud subsampling on the topological inference, we further studied the signal to noise ratio (SNR) performance over N = 100 independent runs for the one hole case, comparing the standard maxmin subsam- pling with the proposed KNN pre-ﬁltered maxmin approach. Fig. 4(e)-(f) sho ws the subsampled point clouds consisting of 150 points along with the corresponding persistence diagrams. KNN parameters are selected as q = 0 . 9 and k = 10 . The signal to noise ratio is calculated as S N R = P S /P N where P S and P N represent the average of persistence sig- nal po wer P ( p ) S and noise po wer P ( p ) N ov er N runs, re- spectiv ely . W e deﬁne persistence signal power for the p - th data set as P ( p ) S = P S p | r ( p ) ,k 1 | 2 / | S p | , where S p is the signal set for the p -th run consisting of all persistence intervals of dimension 1 classiﬁed as signals, and r ( p ) ,k 1 rep- resents the corresponding lengths of such intervals. Similarly , P ( p ) N = P N p | r ( p ) ,k 1 | 2 / | N p | , where N p denotes the p -th noise set. An in vestigation of the plots in this ﬁgure clariﬁes ho w the pre-ﬁltering step cleans up the data from outliers that ﬁll out the topological holes, and improv es the SNR to a great extent. B. Classiﬁcation and Robustness Additionally , we assessed the performance of our topologi- cal classiﬁcation in terms of robustness to scaling and number of holes in the space. W e used a training data set consisting of 100 v ariations of the space sho wn in ﬁgure 5(a) with random placements of the two square holes in the space. Persistence diagram and the corresponding density of interval lengths for one of the environments is shown in ﬁgure 5(b) and (c), respectiv ely . The optimization process resulted in a minimum of F (Θ ∗ 1 ) = 0 . 03 with Θ ∗ 1 = (0 . 5 , 0 . 7 , 37) . This result is not surprising as the 0 . 5 -quantile of the density , which is the median of the dataset, is known to be robust to outliers. W e Birth Death 0 36.1 0 36.1 d E - d i m0 d E - d i m1 d M - d i m0 d M - d i m1 d b = 2.938 d Birth Death 0 40.1 0 40.1 d b = 1 4 . 3 7 0 d b = 14.370 d E - d i m0 d E - d i m1 d M - d i m0 d M - d i m1 0 50 0 20 40 60 80 Landmark Communities 0 50 0 20 40 60 80 Encounter Events 3 = 0.139 Fig. 6. Estimated geodesic distances d G E L vs. d M and the corre- sponding PD ’ s over the set of landmark communities (left), and the set of encounters (right) for the environment in scenario 2. ev aluated F (Θ ∗ 1 ) for a variety of test sets, each containing 100 point clouds with random feature placements, with each test set differing in either scale of the en vironment and features or the number of features (holes). T o in vestigate the performance, we used the sensitivity mea- sure for classiﬁcation, S E = T P / ( T P + F N ) , where T P and F N denote the number of true positives and false negativ es, respectiv ely . The sensitivity performance for these scenarios is summarized in Fig. 5 (d) and (e) for scaled en vironments, and en vironments with different numbers of holes but the same scale, respectiv ely . Note that we hav e ev aluated this measure for the whole mapping algorithm and not just for the classiﬁcation part, which can justify the degradation in the performance for the 3-hole case, as we observed that in 15% of the cases, the third hole could not be retrie ved from the point cloud data. The reason is that adding more topological holes in a ﬁxed space will degrade the encounter graph connecti vity , which which in turn deteriorates the geometric estimation accuracy and topological connectivity . C. Con ver gence Analysis W e performed a set of experiments in order to ev alu- ate the geometric estimation error and topological stability bounds deriv ed in sections VI and V. Fig. 6 sketches the estimated geodesic distances versus the manifold distances ov er the sets P L and P E for a single run with N l = 20 landmarks and N m = 100 moving agents, using the same parameters described as in example scenarios. One can observe how closely d G E L estimates geodesic distances over the set of landmarks, while it is prone to more error for pairs of non-landmark events, which is consistent with the results in Theorem 2. Ho wev er , looking at the bottleneck distance d B ( PD M , PD E ) in the right plot, we realize that the rob ust- ness of persistent homology to perturbations compensates for the relati ve lar ge geometric errors yielding to a bottleneck 5 10 20 50 0 0.1 0.2 0.3 3 N m = 50 N m = 100 N m = 150 N m = 200 Scenario 1 Scenario 2 50 100 150 200 0 0.1 0.2 0.3 3 N l = 5 N l = 10 N l = 20 N l = 50 5 10 20 50 0 0.2 0.4 0.6 3 N m = 50 N m = 100 N m = 150 N m = 200 N 50 100 150 200 0 0.2 0.4 0.6 3 N l = 5 N l = 10 N l = 20 N l = 50 # of landmarks # of moving agents Fig. 7. V ariation of the geometric estimation con ver gence parameter λ 3 with N l and N m . Increasing N l for different ﬁxed values of N m (left), and increasing N m with ﬁxed values of N l (right). 12 6 Scenario 1 Scenario 2 # of landmarks # of moving agents N l 5 10 20 50 0 10 20 30 N m 50 100 150 200 0 10 20 30 N l 5 10 20 50 0 10 20 30 N m 50 100 150 200 0 10 20 30 N l 5 10 20 50 0 10 20 30 40 N m 50 100 150 200 0 10 20 30 40 N l 5 10 20 50 0 10 20 30 40 N m 50 100 150 200 0 10 20 30 40 Fig. 8. V ariation of the bottleneck distances d B ( PD M , PD E ) and d B ( PD M , PD C ) (dashed lines) and estimated stability bounds (solid lines) with N l and N m . Increasing N l for ﬁxed values of N m (left), and increasing N m with ﬁxed values of N l (right). distance of only 2.938 in topological estimation. The relativ ely larger topological error d B ( PD M , PD C ) comes from the fact that although the metric d G E L among the landmarks is more precisely estimated, the density of the landmarks (in versely proportional to δ l = 7 . 27 ) is not sufﬁcient to capture the topological information with a high precision if the simplicial complex es are constructed on a subset of M consisting of only Π M ( C ) . Note that the density of Π M ( C ) is in f act equal to the density of position of the landmarks themselves as they do not mo ve over time, whereas for moving agents, densities of their encounter locations Π M ( E ) in M increases o ver time. Hence although the metric d G E L learned over C is estimated more precisely , topological inference is better to be carried out ov er the set of E . The larger density of samples Π M ( E ) with topological stability of persistence homology will compensate for the larger geometric estimation errors of d G E L ov er E . This conv eys a trade-of f between learning the geometry and topology in terms the densities of landmarks and encounter samples. T o better illustrate this, we studied the v ariations of the geometric con vergence parameter λ 3 as well as the bottleneck distances d B ( PD M , PD E ) and d B ( PD M , PD C ) in terms of N m and N l for scenarios 1 and 2. Particularly , we considered two case studies: (1) increasing N m for ﬁxed values of N l , and (2) increasing N l for ﬁxed N m . For each case, we considered 20 independent runs, and four dif ferent values for the controlled parameter . The trends for independent parameters λ 3 and bottleneck distances are shown in the diagrams plotted in Fig. 7 and Fig. 8, respecti vely . A quick ov ervie w of these diagrams v eriﬁes that the metric estimation errors and bottleneck distances will both generally decrease with increasing values of N l and N m . Looking more carefully at Fig. 7 one can easily observe the monotonously decreasing pattern of the v alues of λ 3 as N m increases, and the higher sensiti vity of λ 3 with respect to N m than N l . This con ve ys the fact that although landmarks are essential for precise geometric estimation, increasing their number would not improv e the accuracy of d G E L metric ov er C as much as increasing N m will do. Ho we ver , increasing N l will improve the accuracy of d G E L ov er E to a greater extent as a consequence of Theorem 2 and Lemma 3. On the other hand, from Fig. 8, one can observe that N l has a greater impact on topological estimation over landmarks than N m . This follo ws the same discussion mentioned for bottleneck distances in Fig. 6, as gen- erally d B ( PD M , PD C ) is dominated by δ l . Moreover , looking into diagrams for d B ( PD M , PD E ) in Fig. 8, one notices that the topological stability bounds in this case are dominated by N l rather than N m , although the experimental trends for the two cases do not v ary that much. Furthermore, it is e vident that the bottleneck distance d B ( PD M , PD E ) are in general smaller than d B ( PD M , PD C ) due to the same ar gument as mentioned for Fig. 6. Hence, having a ﬁxed number of agents at hand, a reasonable strategy would be to select a small percentage (10 - 20 % ) of them as landmarks, in order to improv e geometric estimation ov er the set of encounters, and infer topological structures ov er the set E . Then the ratios of N m and N l can further be adjusted based on the needs of the problem and the priorities of geometric or topological learning. Note that in our analysis above, we discussed N m as the deﬁning parameter for density of events, while other parameters are ﬁxed. Howe v er , one can also increase the running time of the exploration stage or the speed of the agents depending on the problem constraints in order to acquire more encounter samples so as to increase the accurac y with smaller number of agents. V I I I . C O N C L U S I O N A N D F U T U R E W O R K W e presented a framework for geometric-topological learn- ing of unknown en vironments using local interactions amongst the agents in biobotic networks. Geodesic metric estimation and topological stability con ver gence analysis was pro vided, and it was shown that under dense sampling of the en vi- ronment, connecti vity of the network encounter graph, and existence of enough landmarks, one can learn the intrinsic geometry of the en vironment with arbitrary small errors, extract topological features of the space with bounded error , and infer the true homology of the space with the proposed classiﬁer . Our future work will include the results of experiments for implementation of the proposed approach on Madagascar hissing biobots in order to realize exploration and mapping with cyborg insect networks for search and rescue operations. R E F E R E N C E S [1] T . Latif and A. Bozkurt, “Line following terrestrial insect biobots, ” in Engineering in Medicine and Biology Society (EMBC), 2012 Annual International Conference of the IEEE . IEEE, 2012, pp. 972–975. [2] E. Whitmire, T . Latif, and A. Bozkurt, “Kinect-based system for automated control of terrestrial insect biobots, ” in Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Confer ence of the IEEE . IEEE, 2013, pp. 1470–1473. 13 [3] T . Latif, E. Whitmire, T . Novak, and A. Bozkurt, “T ow ards fenceless boundaries for solar po wered insect biobots, ” in Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Confer - ence of the IEEE . IEEE, 2014, pp. 1670–1673. [4] D. Haldane, K. Peterson, F . Garcia Bermudez, and R. Fearing, “ Animal- inspired design and aerodynamic stabilization of a hexapedal millirobot, ” in Robotics and Automation (ICRA), 2013 IEEE International Confer- ence on , May 2013, pp. 3279–3286. [5] S. Thrun et al. , “Robotic mapping: A survey , ” Exploring artiﬁcial intelligence in the new millennium , pp. 1–35, 2002. [6] H. Durrant-Whyte and T . Bailey , “Simultaneous localization and map- ping: part i, ” Robotics Automation Magazine, IEEE , vol. 13, no. 2, pp. 99–110, Jun. 2006. [7] P . Henry , M. Krainin, E. Herbst, X. Ren, and D. Fox, “Rgb-d mapping: Using depth cameras for dense 3d modeling of indoor environments, ” in In the 12th International Symposium on Experimental Robotics (ISER . Citeseer , 2010. [8] A. S. Huang, A. Bachrach, P . Henry , M. Krainin, D. Maturana, D. Fox, and N. Roy , “Visual odometry and mapping for autonomous ﬂight using an rgb-d camera, ” in International Symposium on Robotics Resear ch (ISRR) , 2011, pp. 1–16. [9] B. Ferris, D. Fox, and N. Lawrence, “W iFi-SLAM Using Gaussian Process Latent V ariable Models.” IJCAI , pp. 2480–2485, 2007. [10] D. Kurth, G. Kantor, and S. Singh, “Experimental results in range-only localization with radio, ” in Intelligent Robots and Systems, 2003.(IR OS 2003). Pr oceedings. 2003 IEEE/RSJ International Confer ence on , vol. 1. IEEE, 2003, pp. 974–979. [11] H. M. Choset, Principles of r obot motion: theory, algorithms, and implementation . MIT press, 2005. [12] H. Edelsbrunner, D. Letscher , and A. Zomorodian, “T opological per- sistence and simpliﬁcation, ” Discrete and Computational Geometry , vol. 28, no. 4, pp. 511–533, 2002. [13] H. Edelsbrunner and J. Harer , “Persistent homology - a survey , ” Con- temporary mathematics , pp. 1–26, 2008. [14] A. T ahbaz-Salehi and A. Jadbabaie, “Distributed coverage veriﬁcation in sensor networks without location information, ” A utomatic Contr ol, IEEE T ransactions on , vol. 55, no. 8, pp. 1837–1849, Aug. 2010. [15] R. Ghrist and A. Muhammad, “Cov erage and hole-detection in sensor networks via homology , ” in Pr oceedings of the 4th International Symposium on Information Pr ocessing in Sensor Networks , ser . IPSN ’05. Piscataway , NJ, USA: IEEE Press, 2005. [16] V . de Silv a and R. Ghrist, “Coordinate-free Cov erage in Sensor Networks with Controlled Boundaries via Homology, ” The International Journal of Robotics Resear ch , vol. 25, no. 12, pp. 1205–1222, Dec. 2006. [17] E. Munch, M. Shapiro, and J. Harer , “Failure ﬁltrations for fenced sensor networks, ” The International Journal of Robotics Researc h , v ol. 31, no. 9, pp. 1044–1056, 2012. [18] R. Ramaithitima, M. Whitzer , S. Bhattacharya, and V . Kumar , “Sensor coverage robot swarms using local sensing without metric information, ” in Robotics and Automation (ICRA), 2015 IEEE International Confer- ence on . IEEE, 2015, pp. 3408–3415. [19] B. W alker and C. Park, “Using persistent homology to recover spatial information from encounter traces, ” in Pr oceedings of the 9 th ACM international symposium on Mobile ad hoc networking and computing - MobiHoc ’08 . A CM Press, 2008, p. 371. [20] M. Robinson and R. Ghrist, “T opological localization via signals of opportunity , ” Signal Processing, IEEE T ransactions on , vol. 60, no. 5, pp. 2362–2373, May 2012. [21] J. Derenick, A. Speranzon, and R. Ghrist, “Homological sensing for mobile robot localization, ” in Robotics and Automation (ICRA), 2013 IEEE International Confer ence on , May 2013, pp. 572–579. [22] R. Ghrist, D. Lipsky , J. Derenick, and A. Speranzon, “T opological landmark-based navig ation and mapping, ” Univ ersity of Pennsylv ania, Department of Mathematics, T ech. Rep., Aug. 2012. [23] A. Dirafzoon and E. Lobaton, “T opological mapping of unkno wn en vironments using an unlocalized robotic swarm, ” in Intelligent Robots and Systems (IR OS), 2013 IEEE/RSJ International Conference on , Nov . 2013, pp. 5545–5551. [24] A. Dirafzoon, J. Betthauser, J. S. D. Benavides, and E. Lobaton, “Mapping of Unknown Environments using Minimal Sensing from a Stochastic Swarm, ” in Intelligent Robots and Systems (IR OS), 2014 IEEE/RSJ International Confer ence on , Sep. 2014, pp. 3842–3849. [25] J. Zhang, H. Huang, and J. W ang, “Manifold learning for visualizing and analyzing high-dimensional data, ” Intelligent Systems, IEEE , vol. 25, no. 4, pp. 54–61, Jul. 2010. [26] T . Lin and H. Zha, “Riemannian manifold learning, ” P attern Analysis and Machine Intelligence, IEEE Tr ansactions on , v ol. 30, no. 5, pp. 796–809, 2008. [27] J. B. T enenbaum, V . de Silva, and J. C. Langford, “A global geometric framew ork for nonlinear dimensionality reduction.” Science (New Y ork, N.Y .) , vol. 290, no. 5500, pp. 2319–23, Dec. 2000. [28] H. Edelsbrunner and J. Harer , Computational T opology: an Introduction . American Mathematical Society , 2010. [29] D. Horak, S. Maleti ´ c, and M. Rajkovi ´ c, “Persistent homology of complex networks, ” Journal of Statistical Mechanics: Theory and Experiment , vol. 2009, no. 03, p. P03034, Mar . 2009. [30] D. Cohen-Steiner , H. Edelsbrunner , and J. Harer , “Stability of persis- tence diagrams, ” Discr ete & Computational Geometry , vol. 37, no. 1, pp. 103–120, 2007. [31] F . Chazal, D. Cohen-Steiner, L. J. Guibas, F . M ´ emoli, and S. Y . Oudot, “Gromov-hausdorf f stable signatures for shapes using persistence, ” in Computer Graphics F orum , vol. 28, no. 5. W iley Online Library , 2009, pp. 1393–1403. [32] R. Jeanson, S. Blanco, R. Fournier , J.-L. Deneubourg, V . Fourcassi ´ e, and G. Theraulaz, “A model of animal movements in a bounded space, ” Journal of Theoretical Biology , v ol. 225, no. 4, pp. 443–451, Dec. 2003. [33] R. Jeanson, C. Ri vault, J.-L. Deneubourg, S. Blanco, R. Fournier , C. Jost, and G. Theraulaz, “Self-organized aggregation in cockroaches, ” Animal Behaviour , vol. 69, no. 1, pp. 169–180, Jan 2005. [34] M. J. Plank, M. Auger-M ´ eth ´ e, and E. A. Codling, “L ´ evy or not? analysing positional data from animal movement paths, ” in Dispersal, individual movement and spatial ecology . Springer, 2013, pp. 33–52. [35] V . D. Silva and G. Carlsson, “T opological estimation using witness complexes, ” Pr oc. Sympos. P oint-Based Graphics , 2004. [36] T . H. Cormen, C. Stein, R. L. Riv est, and C. E. Leiserson, Intr oduction to Algorithms , 2nd ed. McGraw-Hill Higher Education, 2001. [37] T . F . Cox and M. A. A. Cox, Multidimensional Scaling , 2nd ed. Chapman and Hall / CRC, 2000. [38] G. Carlsson, T . Ishkhanov , V . Silva, and A. Zomorodian, “On the Local Behavior of Spaces of Natural Images, ” International Journal of Computer V ision , vol. 76, no. 1, pp. 1–12, Jun. 2007. [39] S. Balakrishnan, B. Fasy , and F . Lecci, “Statistical Inference For Persistent Homology, ” arXiv preprint , 2013. [40] D. Morozov , “Dionysus library for persistent homology , ” A vailable at http://mrzv .org/software/dionysus. [41] D. Burago, Y . Burago, and S. Ivano v , A course in metric geometry . American Mathematical Society Providence, 2001, vol. 33.

Geometric Learning and Topological Inference with Biobotic Networks: Convergence Analysis

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment