A polynomial-time relaxation of the Gromov-Hausdorff distance


Authors: Soledad Villar, Afonso S. Bandeira, Andrew J. Blumberg, and Rachel Ward

Abstract. The Gromov-Hausdorff distance provides a metric on the set of isometry classes of compact metric spaces. Unfortunately, computing this metric directly is believed to be computationally intractable. Motivated by applications in shape matching and point-cloud comparison, we study a semidefinite programming relaxation of the Gromov-Hausdorff metric. This relaxation can be computed in polynomial time, and somewhat surprisingly is itself a pseudometric. We describe the induced topology on the set of compact metric spaces. Finally, we demonstrate the numerical performance of various algorithms for computing the relaxed distance and apply these algorithms to several relevant data sets. In particular, we propose a greedy algorithm for finding the best correspondence between finite metric spaces that can handle hundreds of points.

1. Introduction

In order to study the convergence of sequences of metric spaces, Gromov introduced what is now called the Gromov-Hausdorff metric [Gro81]. Roughly speaking, this metric generalizes the classical Hausdorff distance between subsets of an ambient metric space to a pair of arbitrary metric spaces by taking the infimum over all embeddings. The Gromov-Hausdorff metric has been of theoretical importance in geometric group theory and is at the heart of the subject of "metric geometry". More recently, the Gromov-Hausdorff distance has been proposed as a basic method for comparing point clouds [MS05]. A point cloud is simply a finite metric space (often presented as a subset of $\mathbb{R}^n$); this is a fundamental and ubiquitous representation of data. Geometric examples, where the point cloud represents samples from some smooth geometric object, arise from various kinds of shape acquisition devices.
Examples with less obvious intrinsic geometric structure are frequently generated by biological data (e.g., collections of gene expression vectors). Given two point clouds, a natural question is to determine whether they are related by some isometric transformation; if not, one might wish to know a quantitative measure of their difference. Another version of this sort of problem is what is known as the point registration problem (also sometimes referred to as point matching and network alignment). Point registration consists in finding a correspondence between point sets or graphs such that a certain cost function is minimized. It appears in computer vision problems like shape matching, in computational biology [CK14], and in general pattern recognition problems. In some applications, registering or aligning is particularly challenging since there is no explicit correspondence between the sets, often because deformation has occurred or the sets have different numbers of points. In such cases it is natural to consider a metric on point clouds that is defined in terms of correspondences between the point clouds; the Gromov-Hausdorff distance can be described in terms of a minimax expression over correspondences between the metric spaces, and so is potentially suitable for this purpose.

Unfortunately, exact computation of the Gromov-Hausdorff distance is essentially intractable; it involves the solution of an NP-hard optimization problem. As a consequence, it is natural to consider relaxations. In [Mém11], Mémoli studied a relaxation referred to as the Gromov-Wasserstein distance; this distance is closely related to distances motivated by optimal transport problems [LV09, Stu06], to a "distance distribution" metric defined by Gromov, and also to the cut distance of graphons.
Unfortunately, computing the Gromov-Wasserstein distance still requires solving a non-convex optimization problem which does not appear to have attractive performance characteristics in practice.

In this paper, we study a semidefinite programming (SDP) relaxation of the Gromov-Hausdorff distance; this yields a tractable convex optimization problem. We prove that this relaxation defines a pseudometric on point clouds that can be computed in polynomial time. Our pseudometric also provides a relaxed correspondence between point clouds and a lower bound for the Gromov-Hausdorff distance. A similar version of this SDP was recently introduced in [KKBL15] and further studied in [MDK+16] and [DL16]; our work provides theoretical validation for some of the computational phenomena observed therein and complements their theoretical framework. We describe the performance of optimized solvers for this SDP. We also propose a non-convex optimization algorithm to approach the registration problem efficiently. The output of this algorithm not only provides a local optimum for the registration problem, but also an upper bound for the Gromov-Hausdorff distance.

(A. J. Blumberg and S. Villar were supported in part by AFOSR grant FA9550-15-1-0302. R. Ward and S. Villar were supported in part by NSF CAREER grant #1255631. Part of this work was done while A. S. Bandeira was with the Mathematics Department at MIT and supported by NSF Grant DMS-1317308.)

2. Background

We begin by recalling the definition of the Gromov-Hausdorff distance; for this, we start with the Hausdorff distance. Let $(Z, d)$ be a compact metric space and $\mathcal{C}(Z)$ the collection of all compact sets in $Z$.
If $A, B \in \mathcal{C}(Z)$, the Hausdorff distance between $A$ and $B$ can be expressed as
$$d^Z_H(A,B) = \inf_{R \in \mathcal{R}(A,B)} \sup_{(a,b) \in R} d(a,b),$$
where $\mathcal{R}(A,B)$ is the set of correspondences $R \subset A \times B$ such that every element $a \in A$ is related to at least one element of $B$ and every element $b \in B$ is related to at least one element of $A$.

For many theoretical and practical applications it is common to relax this distance to the Wasserstein distance [Vil03]. In that setting, one endows $\mathcal{C}(Z)$ with a measure,
$$\mathcal{C}_w(Z) = \{(A, \mu_A) : A \in \mathcal{C}(Z),\ \mathrm{supp}(\mu_A) = A\},$$
and relaxes the set of correspondences $\mathcal{R}(A,B)$ to the set of transportation plans
$$\mathcal{M}(\mu_A, \mu_B) = \{\mu : \mu(A' \times B) = \mu_A(A'),\ \mu(A \times B') = \mu_B(B') \text{ for all Borel sets } A' \subset A,\ B' \subset B\}.$$
Then, for $A, B \in \mathcal{C}_w(Z)$, the Wasserstein distance is defined for $1 \le p \le \infty$ as
$$d^Z_{W,p}(A,B) = \inf_{\mu \in \mathcal{M}(\mu_A,\mu_B)} \left( \int_{A \times B} d^p(a,b)\, d\mu(a,b) \right)^{1/p} \quad \text{for } 1 \le p < \infty,$$
$$d^Z_{W,\infty}(A,B) = \inf_{\mu \in \mathcal{M}(\mu_A,\mu_B)} \sup_{(a,b) \in \mathrm{supp}(\mu)} d(a,b).$$
For $A, B$ finite sets this distance can be efficiently computed by a linear program.

The Hausdorff distance suffices to compare metric spaces embedded in a common ambient metric space; Gromov's idea to extend this to compare arbitrary metric spaces is simply to consider the infimum over all isometric embeddings into a common metric space [Gro01]. Specifically, if $X, Y$ are compact metric spaces, the Gromov-Hausdorff distance is defined as
$$d_{GH}(X,Y) = \inf_{Z, f, g} d^Z_H(f(X), g(Y)),$$
where $f: X \to Z$ and $g: Y \to Z$ are isometric embeddings into a metric space $Z$. Unfortunately, it is an NP-hard problem to compute this distance.

Since the Hausdorff distance becomes computationally tractable when relaxed to the Wasserstein distance, one might consider a transport-based relaxation of the Gromov-Hausdorff distance that works in the setting of metric measure spaces.
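For finite sets, the correspondence formulation of the Hausdorff distance above agrees with the classical sup-inf formula, and this is easy to check by brute force. The sketch below (our own illustration, not from the paper; exponential in the number of pairs, so tiny inputs only) compares the two:

```python
from itertools import product

def hausdorff(A, B, d):
    """Classical Hausdorff distance: max of the two directed sup-inf distances."""
    return max(
        max(min(d(a, b) for b in B) for a in A),
        max(min(d(a, b) for a in A) for b in B),
    )

def hausdorff_via_correspondences(A, B, d):
    """Same distance via the minimax over correspondences R ⊆ A×B in which
    every element of A and of B is covered. Brute-force over all subsets."""
    pairs = list(product(range(len(A)), range(len(B))))
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [pairs[k] for k in range(len(pairs)) if mask >> k & 1]
        covers_A = {i for i, _ in R} == set(range(len(A)))
        covers_B = {j for _, j in R} == set(range(len(B)))
        if covers_A and covers_B:
            best = min(best, max(d(A[i], B[j]) for i, j in R))
    return best

A, B = [0.0, 1.0], [0.2, 0.9, 2.0]  # two point clouds on the real line
d = lambda a, b: abs(a - b)
assert hausdorff(A, B, d) == hausdorff_via_correspondences(A, B, d)
```

Both formulas give 1.0 here, driven by the point 2.0 of B, whose nearest neighbor in A is at distance 1.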
In a series of papers [Mém11, Mém07, MS05], Mémoli considers different equivalent expressions for the Gromov-Hausdorff distance, and by relaxing them and considering them in the metric measure space setting, he obtains different gromovizations of the Wasserstein distance, called Gromov-Wasserstein distances. A particularly natural relaxation is based on the observation that the Gromov-Hausdorff distance can be expressed as
$$(2.1)\quad d_{GH}(X,Y) = \frac12 \inf_{R \in \mathcal{R}(X,Y)} \sup_{\substack{x,x' \in X,\ y,y' \in Y \\ (x,y),(x',y') \in R}} \Gamma_{X,Y}(x,y,x',y'),$$
where $\Gamma_{X,Y}(x,y,x',y') = |d_X(x,x') - d_Y(y,y')|$. For $1 \le p \le \infty$, Mémoli then defines Gromov-Wasserstein relaxations of the Gromov-Hausdorff distance as
$$(2.2)\quad \mathcal{D}_p(X,Y) = \frac12 \inf_{\mu \in \mathcal{M}(\mu_X,\mu_Y)} \left( \int_{X\times Y}\int_{X\times Y} \Gamma_{X,Y}(x,y,x',y')^p\, \mu(dx \times dy)\, \mu(dx' \times dy') \right)^{1/p},$$
$$(2.3)\quad \mathcal{D}_\infty(X,Y) = \frac12 \inf_{\mu \in \mathcal{M}(\mu_X,\mu_Y)} \sup_{\substack{x,x' \in X,\ y,y' \in Y \\ (x,y),(x',y') \in \mathrm{supp}(\mu)}} \Gamma_{X,Y}(x,y,x',y').$$
In his work, Mémoli studies topological properties of the different distance relaxations and how they compare with each other and with the Gromov-Hausdorff distance. He also proposes an algorithm to approximate $\mathcal{D}_p$, but due to the non-convexity of its objective function, no performance guarantees are provided. Recent work [PCS16] has provided efficient heuristic algorithms based on optimal transportation to approximate the Gromov-Wasserstein distance for alignment applications.

Remark 2.4. Gromov considered another metric on the set of metric measure spaces defined in terms of the convergence of all distance matrix distributions (i.e., the distributions induced by taking the pushforward of the measure to a collection of $n$ points and applying the metric to all pairs). It turns out that this metric is closely related to the Gromov-Wasserstein distance [Stu06, 3.7].
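For finite spaces, evaluating the objective of (2.2) at any fixed transportation plan gives an upper bound on $\mathcal{D}_p$, since the distance is an infimum over plans. A minimal sketch (our own, not from the paper), using the uniform product coupling, which is always feasible for uniform marginals:

```python
def gw_objective(DX, DY, mu, p=1):
    """Gromov-Wasserstein objective for a fixed plan mu:
    (1/2) * (sum_{i,i',j,j'} mu_ij mu_i'j' |DX[i][i'] - DY[j][j']|^p)^(1/p)."""
    n, m = len(DX), len(DY)
    s = sum(
        mu[i][j] * mu[i2][j2] * abs(DX[i][i2] - DY[j][j2]) ** p
        for i in range(n) for j in range(m)
        for i2 in range(n) for j2 in range(m)
    )
    return 0.5 * s ** (1.0 / p)

# Two 2-point spaces with diameters 1 and 2, uniform product coupling
# mu_ij = 1/(n*m): its marginals are uniform, so it is a feasible plan
# and the value below upper-bounds D_1.
DX = [[0, 1], [1, 0]]
DY = [[0, 2], [2, 0]]
mu = [[0.25, 0.25], [0.25, 0.25]]
assert gw_objective(DX, DY, mu, p=1) == 0.5
```

Finding the infimum over plans is the hard (non-convex) part that the rest of the paper addresses.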
Moreover, these metrics induce the same notion of convergence as arises in the theory of dense graph sequences and graphons. Specifically, we can regard a graph as a metric measure space: the underlying metric space has as points the set of vertices, with pairwise distance $\frac12$ if the points are connected and $1$ otherwise, and the measure assigns equal mass to each point. (See [Ele12] for further discussion of this point.)

3. Semidefinite relaxations of Gromov-Wasserstein and Gromov-Hausdorff distances

Consider the setting where $X$ and $Y$ are finite metric spaces (or metric measure spaces), say $X = \{x_1,\ldots,x_n\}$ and $Y = \{y_1,\ldots,y_m\}$ (with measures $\mu_X(x_i) = \nu_i$ and $\mu_Y(y_j) = \lambda_j$). Let us abbreviate $\Gamma_{X,Y}(x_i, y_j, x_{i'}, y_{j'})$ as $\Gamma_{ij,i'j'}$ for $i,i' = 1,\ldots,n$ and $j,j' = 1,\ldots,m$. The formulation of the Gromov-Hausdorff distance given in equation (2.1) can be expressed as a quadratic assignment problem (Remark 3 in [Mém07]):
$$(3.1)\quad d_{GH}(X,Y) = \frac12 \min_{\mu} \max_{ij,i'j'} \Gamma_{ij,i'j'}\, \mu_{ij}\, \mu_{i'j'} \quad \text{subject to } \mu_{ij} \in \{0,1\},\ \sum_{j=1}^m \mu_{ij} \ge 1,\ \sum_{i=1}^n \mu_{ij} \ge 1,$$
and the expressions (2.2) and (2.3) can be written as (3.2) and (3.3) respectively:
$$(3.2)\quad \mathcal{D}_p(X,Y) = \frac12 \min_{\mu \in \mathbb{R}^{n\times m}} \left( \sum_{i,i'=1}^n \sum_{j,j'=1}^m \mu_{ij}\, \mu_{i'j'}\, \Gamma^p_{ij,i'j'} \right)^{1/p} \quad \text{subject to } 0 \le \mu_{ij} \le 1,\ \sum_{i=1}^n \mu_{ij} = \lambda_j,\ \sum_{j=1}^m \mu_{ij} = \nu_i,$$
$$(3.3)\quad \mathcal{D}_\infty(X,Y) = \frac12 \min_{\mu \in \mathbb{R}^{n\times m}} \max_{i,i',j,j' :\ \mu_{ij}\mu_{i'j'} \ne 0} \Gamma_{ij,i'j'} \quad \text{subject to } 0 \le \mu_{ij} \le 1,\ \sum_{i=1}^n \mu_{ij} = \lambda_j,\ \sum_{j=1}^m \mu_{ij} = \nu_i.$$
In order to approach non-convex optimization problems like (3.1), (3.2) or (3.3), one standard technique is to linearize the objective by lifting $\mu_{ij}\mu_{i'j'}$ and $\mu_{ij}$ to a symmetric variable $Z \in \mathbb{R}^{(nm+1)\times(nm+1)}$ whose entries are indexed by pairs $(ij, i'j')$, $(ij, nm+1)$, $(nm+1, i'j')$ and $(nm+1, nm+1)$, with $i,i' = 1,\ldots,n$ and $j,j' = 1,\ldots,m$:
$$(3.4)\quad Z = \begin{pmatrix} \hat Z & z \\ z^\top & 1 \end{pmatrix}.$$
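For very small spaces, the quadratic assignment formulation (3.1) can be solved exactly by enumerating all correspondences. The sketch below (our own brute-force illustration of (2.1)/(3.1), exponential in $nm$, so toy inputs only) takes two distance matrices and returns the exact Gromov-Hausdorff distance:

```python
from itertools import product

def gromov_hausdorff(DX, DY):
    """Exact d_GH for tiny finite metric spaces, via the minimax over
    correspondences in (2.1)/(3.1). DX, DY are distance matrices."""
    n, m = len(DX), len(DY)
    pairs = [(i, j) for i in range(n) for j in range(m)]
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [pairs[k] for k in range(len(pairs)) if mask >> k & 1]
        if {i for i, _ in R} != set(range(n)) or {j for _, j in R} != set(range(m)):
            continue  # not a correspondence: both spaces must be covered
        worst = max(abs(DX[i][i2] - DY[j][j2]) for (i, j), (i2, j2) in product(R, R))
        best = min(best, worst)
    return best / 2

# A 3-point path metric and a slightly stretched copy of it.
DX = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
DY = [[0, 1, 2.5], [1, 0, 1.5], [2.5, 1.5, 0]]
assert gromov_hausdorff(DX, DX) == 0.0
assert gromov_hausdorff(DX, DY) == 0.25  # half the worst distortion, 0.5/2
```

The exponential enumeration is exactly what the SDP lifting that follows is designed to avoid.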
Then note that, for instance, the problems (3.2) and (3.3) are equivalent to problems (3.5) and (3.6) respectively:
$$(3.5)\quad \mathcal{D}_p(X,Y) = \frac12 \left( \min_{Z} \mathrm{Trace}(\Gamma^{(p)} \hat Z) \right)^{1/p} \quad \text{subject to } Z \in S,$$
$$(3.6)\quad \mathcal{D}_\infty(X,Y) = \frac12 \min_{Z} \max_{i,i',j,j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'} \quad \text{subject to } Z \in S,$$
where
$$S = \{Z \in \mathbb{R}^{(nm+1)\times(nm+1)} : \textstyle\sum_i Z_{ij,nm+1} = \lambda_j,\ \sum_j Z_{ij,nm+1} = \nu_i,\ Z = Z^\top,\ Z \ge 0,\ \mathrm{rank}(Z) = 1\}$$
and $\Gamma^{(p)}$ denotes the $p$-th power of the matrix $\Gamma$ entrywise. The constraint $\mathrm{rank}(Z) = 1$ can be relaxed to the convex constraint $Z \succeq 0$ (which means $Z$ is symmetric and positive semidefinite), and additional linear constraints satisfied by the rank-1 matrix can be added to make the relaxation tighter. Using this recipe we can construct the following family of semidefinite programming relaxations of the Gromov-Wasserstein and Gromov-Hausdorff distances:
$$(3.7)\quad \tilde d_{\mathcal{A},p}(X,Y) = \frac12 \left( \frac{1}{n^2} \min_{Z} \mathrm{Trace}(\Gamma^{(p)} \hat Z) \right)^{1/p} \quad \text{subject to } Z \in \mathcal{A},$$
$$(3.8)\quad \tilde d_{\mathcal{A},\infty}(X,Y) = \frac12 \min_{Z} \max_{i,j,i',j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'} \quad \text{subject to } Z \in \mathcal{A},$$
where we can consider different convex sets $\mathcal{A}$ as relaxing to different distances.

(a) For a relaxation of the Gromov-Hausdorff distance (or Gromov-Wasserstein for uniform weights $\lambda_j = \nu_i = 1/\max\{n,m\}$ for all $j = 1,\ldots,m$ and $i = 1,\ldots,n$),¹ consider
$$\mathcal{A} = \mathrm{GH} := \{Z \in \mathbb{R}^{(nm+1)\times(nm+1)} : \textstyle\sum_i Z_{ij,nm+1} \ge 1,\ \sum_j Z_{ij,nm+1} \ge 1,\ \sum_{i,i'} Z_{ij,i'j'} \ge 1,\ \sum_{j,j'} Z_{ij,i'j'} \ge 1,\ \hat Z \mathbf{1} = \max\{n,m\}\, z,\ 0 \le Z \le 1,\ Z \succeq 0\}.$$
Relaxation (3.8) provides a lower bound for the Gromov-Hausdorff distance, since every element of $\mathcal{R}(X,Y)$ induces, up to normalization, a feasible $Z$. In fact, if the optimal solution of equation (2.1) corresponds to $R \in \mathcal{R}(X,Y)$ such that $(x_i, y_j), (x_i, y_{j'}) \in R$ for some $j \ne j'$, the solution of equation (3.7) may split the mass in a way such that $Z_{ij,nm+1} + Z_{ij',nm+1} = 1$ instead of having $Z_{ij,nm+1} = Z_{ij',nm+1} = 1$.
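The rank-1 lifting (3.4) is easy to carry out concretely: stack the entries of $\mu$ into a vector, append a trailing 1, and take the outer product. The sketch below (our own, checking only a subset of the linear constraints of the set GH, with the PSD constraint holding automatically for any outer product) verifies that a correspondence matrix lifts to a feasible point:

```python
def lift(mu):
    """Rank-1 lifting of (3.4): v = (vec(mu), 1), Z = v v^T, so that
    Z[ij, i'j'] = mu_ij * mu_i'j' and the last column z equals vec(mu)."""
    v = [x for row in mu for x in row] + [1.0]
    return [[a * b for b in v] for a in v]

def in_GH(Z, n, m, tol=1e-9):
    """Check some linear constraints of the set GH: entries in [0,1],
    row/column mass of z at least 1, and Zhat·1 = max(n,m)·z."""
    N = n * m
    idx = lambda i, j: i * m + j
    ok = all(-tol <= Z[a][b] <= 1 + tol for a in range(N + 1) for b in range(N + 1))
    for j in range(m):
        ok &= sum(Z[idx(i, j)][N] for i in range(n)) >= 1 - tol
    for i in range(n):
        ok &= sum(Z[idx(i, j)][N] for j in range(m)) >= 1 - tol
    for a in range(N):
        ok &= abs(sum(Z[a][b] for b in range(N)) - max(n, m) * Z[a][N]) <= tol
    return ok

mu = [[1.0, 0.0], [0.0, 1.0]]  # identity correspondence on two points
assert in_GH(lift(mu), 2, 2)
```

An SDP solver optimizes over the convex set obtained by dropping the rank-1 structure while keeping these linear constraints plus $Z \succeq 0$.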
(b) If $|X| = |Y|$ we may want to restrict the set of all correspondences between $X$ and $Y$ (where every element of $X$ is related to at least one element of $Y$ and vice versa) to the set of all bijective correspondences. In that case we can consider a tighter relaxation, which relaxes the registration problem and is similar to the one in [KKBL15]:
$$\mathcal{A} = \mathrm{Reg} := \{Z \in \mathbb{R}^{(n^2+1)\times(n^2+1)} : \textstyle\sum_{i=1}^n Z_{ij,n^2+1} = 1,\ \sum_{j=1}^n Z_{ij,n^2+1} = 1,\ Z_{n^2+1,n^2+1} = 1,\ \sum_{i,i'=1}^n Z_{ij,i'j'} = 1,\ \sum_{j,j'=1}^n Z_{ij,i'j'} = 1,\ Z_{ij,ij'} = 0 \text{ if } j \ne j',\ Z_{ij,i'j} = 0 \text{ if } i \ne i',\ \mathrm{Trace}(Z) = n+1,\ \hat Z \mathbf{1} = n z,\ 0 \le Z \le 1,\ Z \succeq 0\}.$$

(c) The registration relaxation can be extended to finite metric spaces with different numbers of points. Let, as before, $|X| = n$ and $|Y| = m$. First, without loss of generality, assume that $n \ge m$. Now consider the problem (2.1) where the set $\mathcal{R}(X,Y)$ is restricted to surjective functions $X \to Y$. Then relax the feasible set to the convex set
$$\mathcal{A} = \mathrm{Sur} := \{Z \in \mathbb{R}^{(nm+1)\times(nm+1)} : \textstyle\sum_{i=1}^n Z_{ij,nm+1} \ge 1,\ \sum_{j=1}^m Z_{ij,nm+1} = 1,\ Z_{nm+1,nm+1} = 1,\ \sum_{i,i'=1}^n Z_{ij,i'j'} \ge 1,\ \sum_{j,j'=1}^m Z_{ij,i'j'} = 1,\ Z_{ij,ij'} = 0 \text{ if } j \ne j',\ \mathrm{Trace}(Z) = n+1,\ \hat Z \mathbf{1} = n z,\ 0 \le Z \le 1,\ Z \succeq 0\}.$$
Note that this set of constraints assumes $i,i' = 1,\ldots,n$, $j,j' = 1,\ldots,m$ and $n \ge m$, and it is not symmetric with respect to $i$ and $j$. Also note that under this relaxation there exist spaces such that $d_{GH}(X,Y) = 0$ but the relaxed distance (3.7) with $\mathcal{A} = \mathrm{Sur}$ satisfies $\tilde d_{\mathrm{Sur},p}(X,Y) \ne 0$ (and the same phenomenon occurs for $\mathcal{A} = \mathrm{Reg}$); for instance $X = \{x, x, y\}$ and $Y = \{x, y, y\}$. This is an artifact of allowing only surjective functions instead of all possible relations in $\mathcal{R}(X,Y)$.

¹ By appropriately choosing the right hand side of the equality constraints in GH one can obtain a semidefinite relaxation of the Gromov-Wasserstein distance for any weights.

Remark 3.9.
Even though the max objective in equation (3.8) is convex, it is not smooth, which we observe to significantly hurt the performance of the numerical implementations. This is one reason to consider the $p$-norm approach to this relaxation and define the family of SDP relaxations (3.7) for $1 \le p < \infty$.

Remark 3.10. Note that the linear constraints in the sets $\mathcal{A}$ are not linearly independent and the extra variable $z$ is redundant. However, one can easily choose alternative sets of constraints (even with fewer constraints, or where the extra variable is not redundant) with the same objectives in mind: (i) the relaxation is tight when the spaces are isometric; (ii) the corresponding objective value satisfies the triangle inequality when $|X| = |Y|$.

We are primarily interested in the semidefinite programming relaxations of the Gromov-Hausdorff distance for finite metric spaces, namely $\tilde d_{\mathcal{A},p}$, $1 \le p \le \infty$, for $\mathcal{A} = \mathrm{GH}, \mathrm{Reg}, \mathrm{Sur}$. In Figure 1 we extend a diagram of Mémoli's to situate the SDP relaxations we study in this paper.

Figure 1. [Diagram relating $(\mathcal{C}(Z), d^Z_H)$, $(\mathcal{C}_w(Z), d^Z_{W,p})$, $(\mathcal{G}, d_{GH})$, $(\mathcal{G}_w, \mathcal{D}_p)$, and $(\mathcal{G} \text{ or } \mathcal{G}_w, \tilde d_{GH,p})$ via gromovization, relaxation, and SDP-relaxation arrows.] Here $\mathcal{G}$ is the collection of all compact metric spaces and $\mathcal{G}_w$ the collection of all metric measure spaces. The horizontal arrows represent the relaxation of the notion of correspondences. The gromovization arrows represent the process of getting rid of the ambient space.

4. Topological properties of the relaxed distances

In this section, we prove the main theoretical results of the paper.
We begin by showing that the distances obtained by semidefinite relaxation are in fact pseudometrics on suitable subsets of the set of isometry classes of finite metric spaces; i.e., these distances satisfy all the axioms of a metric except that there exist distinct finite metric spaces whose relaxed distance is 0. We then study various properties of the induced topology, proving analogues of the standard results about the topology induced on the set of isometry classes of compact metric spaces by the Gromov-Hausdorff distance.

4.1. Pseudometrics. Let $\mathcal{G}_{<\infty}$ be the set of all finite metric spaces. First, we observe that $\tilde d_{\mathcal{A},\infty}$ is a pseudometric on $\mathcal{G}_{<\infty}$. However, if the spaces $X, Y, Z$ have different numbers of points we cannot expect the triangle inequality to hold for $\tilde d_{\mathcal{A},p}$. That is because the triangle inequality does not even hold for tight solutions of equation (3.7) (i.e., rank-1 solutions, corresponding to elements of $\mathcal{R}(X,Y)$). This is an artifact of replacing the max with a sum. To illustrate this fact we consider a simple example. Let $d_{GH,1}$ be the optimum of (3.7) for $p = 1$ and $\mathcal{A}$ the domain of (3.1) (i.e., the solutions corresponding to elements of $\mathcal{R}(X,Y)$). Then consider $X = \{x, y\}$, $Y = \{x, x, y\}$, $Z = \{y\}$, and observe that the triangle inequality is not satisfied, since $d_{GH,1}(X,Y) = 0$, $d_{GH,1}(Y,Z) = 2\, d(x,y)$ and $d_{GH,1}(X,Z) = d(x,y)$.

Nonetheless, if we consider the set of metric spaces with $n$ points, which we denote by $\mathcal{G}_n$, we will show that $\tilde d_{\mathcal{A},p}$ for $1 \le p < \infty$ is a pseudometric on $\mathcal{G}_n$. The most interesting part of this verification is the triangle inequality, which we prove in Theorem 4.2 below. In contrast to the situation with the Gromov-Hausdorff distance, passing to isometry classes of finite metric spaces does not suffice to produce an actual metric. Of course, if $X$ and $Y$ are isometric spaces then $\tilde d_{\mathcal{A},p}(X,Y) = 0.$
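The counterexample above can be checked by brute force. The sketch below (our own; it uses the sum objective over ordered pairs with a factor $\frac12$ and no $1/n^2$ normalization, which is the convention that reproduces the three values quoted in the text) encodes the repeated-point spaces as pseudo-distance matrices:

```python
from itertools import product

def d_GH1(DX, DY):
    """p = 1 objective of (3.7) restricted to rank-1 solutions:
    (1/2) * min over correspondences R of sum over ordered pairs in R
    of |d_X - d_Y|. Brute force over subsets of the product; tiny inputs."""
    n, m = len(DX), len(DY)
    pairs = [(i, j) for i in range(n) for j in range(m)]
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [pairs[k] for k in range(len(pairs)) if mask >> k & 1]
        if {i for i, _ in R} != set(range(n)) or {j for _, j in R} != set(range(m)):
            continue
        cost = sum(abs(DX[i][i2] - DY[j][j2]) for (i, j), (i2, j2) in product(R, R))
        best = min(best, cost)
    return best / 2

one = 1.0  # d(x, y)
DX = [[0, one], [one, 0]]                       # X = {x, y}
DY = [[0, 0, one], [0, 0, one], [one, one, 0]]  # Y = {x, x, y}
DZ = [[0]]                                      # Z = {y}
assert d_GH1(DX, DY) == 0.0
assert d_GH1(DY, DZ) == 2 * one
assert d_GH1(DX, DZ) == one
assert d_GH1(DY, DZ) > d_GH1(DY, DX) + d_GH1(DX, DZ)  # triangle inequality fails
```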
By construction $\tilde d_{\mathcal{A},p}(\cdot,\cdot) \ge 0$, and the isometry between $X$ and $Y$ induces a feasible solution for equation (3.7) with objective value 0. However, there exist non-isometric spaces $X, Y$ such that $\tilde d_{\mathcal{A},p}(X,Y) = 0$. Examples of this phenomenon can be constructed by observing that the graph isomorphism problem can be reduced to deciding whether the Gromov-Hausdorff distance is zero. Given a graph $G = (V, E)$, one constructs a metric space $X(G)$ on the vertex set where
$$(4.1)\quad d(v, v') = \begin{cases} 1 & \text{if } (v, v') \in E, \\ K & \text{otherwise,} \end{cases}$$
with $K$ a fixed constant chosen so that $d$ is a metric. Therefore, given two graphs $G, G'$, we have that $G$ and $G'$ are isomorphic if and only if $d_{GH}(X(G), X(G')) = 0$. There exist explicit examples in the literature of graphs for which no SDP relaxation on $|V|^2 \times |V|^2$ matrices can distinguish between two non-isomorphic graphs [OWWZ14]. For such examples, $\tilde d_{\mathcal{A},p}(X(G), X(G')) = 0$ (see Figure 2).

Figure 2. We consider 3XOR instances with 5 variables and 4 equations, and we construct the reduction from 3XOR to graph isomorphism from [OWWZ14]. The left and middle figures represent the corresponding adjacency graphs obtained after the reduction from the following system of equations in $\mathbb{Z}_2$: $x_1 \oplus x_2 \oplus x_5 = b_1$, $x_1 \oplus x_2 \oplus x_5 = b_2$, $x_1 \oplus x_3 \oplus x_4 = b_3$, $x_2 \oplus x_3 \oplus x_4 = b_4$, with $(b_1, b_2, b_3, b_4) = (1, 1, 0, 1)$ (left) and $(b_1, b_2, b_3, b_4) = (0, 0, 0, 0)$ (middle). Each graph has 26 vertices. We construct finite metric spaces $X$ and $Y$ according to (4.1) and use SDPNAL+ [YST15] to compute the relaxed distance, obtaining $\tilde d_{\mathrm{Reg},1}(X,Y) = 0$. The minimizer $Z$ of (3.7) has rank 16. The figure on the right shows a soft assignment between $X$ and $Y$ obtained from $Z$ by computing $\hat Z \mathbf{1}$ and rearranging accordingly.

Theorem 4.2. Consider $\tilde d_{\mathcal{A},p}$ and $\tilde d_{\mathcal{A},\infty}$ defined in equations (3.7) and (3.8) respectively, for $\mathcal{A} = \mathrm{GH}, \mathrm{Reg}, \mathrm{Sur}$.
Then we have:
(a) For $X, Y, W \in \mathcal{G}_n$ and $1 \le p < \infty$, $\tilde d_{\mathcal{A},p}$ satisfies the triangle inequality.
(b) For $X, Y, W \in \mathcal{G}_{<\infty}$, $\tilde d_{\mathcal{A},\infty}$ satisfies the triangle inequality.

Proof. We begin by proving part (a). Note that it suffices to show that, for $p \ge 1$,
$$(4.3)\quad \tilde d_{\mathcal{A},p}(X,W)^p \le \tilde d_{\mathcal{A},p}(X,Y)^p + \tilde d_{\mathcal{A},p}(Y,W)^p.$$
This follows from the fact that for $a, b > 0$ and $p \ge 1$ we have $a^p + b^p \le (a+b)^p$, and therefore if equation (4.3) holds we have
$$\tilde d_{\mathcal{A},p}(X,W) \le \sqrt[p]{\tilde d_{\mathcal{A},p}(X,Y)^p + \tilde d_{\mathcal{A},p}(Y,W)^p} \le \tilde d_{\mathcal{A},p}(X,Y) + \tilde d_{\mathcal{A},p}(Y,W).$$
Now let $Z$ and $V$ be the minimizers in equation (3.7) for $X, Y$ and for $Y, W$ respectively in $\mathcal{A}$. From $Z$ and $V$ we construct $T$ feasible for $X, W$ in equation (3.7), and we show the objective value at $T$ is smaller than or equal to $\tilde d_{\mathcal{A},p}(X,Y) + \tilde d_{\mathcal{A},p}(Y,W)$. If $x_i, x_{i'} \in X$, $y_j, y_{j'} \in Y$ and $w_k, w_{k'} \in W$, let $T$ be the unique feasible matrix in $\mathcal{A}$ that satisfies
$$(4.4)\quad \hat T_{ik,i'k'} := \sum_{j,j'} \hat Z_{ij,i'j'} \hat V_{jk,j'k'}.$$
To see that $T$ is well-defined, observe that it is straightforward to check that $T$ satisfies the linear and inequality constraints of $\mathcal{A}$, using the fact that $Z$ and $V$ belong to $\mathcal{A}$. In order to verify that $T$ is positive semidefinite, consider the Cholesky decompositions of $Z$ and $V$. Then $\hat Z_{ij,i'j'} = z_{ij}^\top z_{i'j'}$ and $\hat V_{jk,j'k'} = v_{jk}^\top v_{j'k'}$, where $z$ and $v$ do not necessarily correspond to the last column in equation (3.4). In fact, $z$ is an $r \times n^2$ matrix, where $r$ is the rank of $\hat Z$, and $z_{ij}$ is the column of $z$ indexed by $i = 1,\ldots,n$ and $j = 1,\ldots,n$. Then note
$$\hat T_{ik,i'k'} = \sum_{j,j'} \hat Z_{ij,i'j'} \hat V_{jk,j'k'} = \left\langle \sum_j z_{ij} \otimes v_{jk},\ \sum_{j'} z_{i'j'} \otimes v_{j'k'} \right\rangle;$$
therefore $\hat T$ is PSD since it is a Gram matrix, and $T$ is PSD since it has the same rank as $\hat T$.
For the triangle inequality we need to show
$$(4.5)\quad \sum_{i,i'} \sum_{k,k'} T_{ik,i'k'} |d_X(x_i,x_{i'}) - d_W(w_k,w_{k'})| \le \sum_{i,i'} \sum_{j,j'} Z_{ij,i'j'} |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| + \sum_{j,j'} \sum_{k,k'} V_{jk,j'k'} |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})|.$$
In the case we are dealing with, where $|X| = |Y| = |W|$, the constraints $\sum_{i,i'} Z_{ij,i'j'} \ge 1$ and $\sum_{k,k'} V_{jk,j'k'} \ge 1$ are tight, meaning that equality holds for all $j, j'$, so we can multiply by 1 and rewrite the right-hand side of equation (4.5) as
$$\sum_{i,i'} \sum_{j,j'} Z_{ij,i'j'} |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| \sum_{k,k'} V_{jk,j'k'} + \sum_{j,j'} \sum_{k,k'} V_{jk,j'k'} |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})| \sum_{i,i'} Z_{ij,i'j'}$$
$$= \sum_{i,i'} \sum_{k,k'} \sum_{j,j'} Z_{ij,i'j'} V_{jk,j'k'} \left( |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| + |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})| \right).$$
Now it is clear that equation (4.5) follows from the triangle inequality in $\mathbb{R}$, which completes the verification of part (a).

In order to prove part (b), we now let $X, Y, W$ be finite metric spaces with arbitrary numbers of points, and we let $Z$ and $V$ be the minimizers in equation (3.8) for $X, Y$ and for $Y, W$ respectively, as before. We define $T$ as in equation (4.4). We know $T$ is feasible for equation (3.8), so the remaining step is to prove
$$(4.6)\quad \max_{T \ne 0} \Gamma_{ik,i'k'} \le \max_{Z \ne 0} \Gamma_{ij,i'j'} + \max_{V \ne 0} \Gamma_{jk,j'k'}.$$
Let $(ik, i'k')$ be the argmax of the left-hand side of equation (4.6). Since $T_{ik,i'k'} \ne 0$ and $T_{ik,i'k'} = \sum_{j,j'} Z_{ij,i'j'} V_{jk,j'k'}$, there exist $j, j'$ such that $Z_{ij,i'j'} \ne 0$ and $V_{jk,j'k'} \ne 0$. Then we have
$$(4.7)\quad \tilde d_{\mathcal{A},\infty}(X,W) \le \max_{T \ne 0} \Gamma_{ik,i'k'} = |d_X(x_i,x_{i'}) - d_W(w_k,w_{k'})| \le |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| + |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})| \le \max_{Z \ne 0} \Gamma_{ij,i'j'} + \max_{V \ne 0} \Gamma_{jk,j'k'} = \tilde d_{\mathcal{A},\infty}(X,Y) + \tilde d_{\mathcal{A},\infty}(Y,W). \qquad\square$$

Remark 4.8.
The same argument shows that $\tilde d_{GW,p}$ satisfies the triangle inequality as long as we add the constraint $\hat Z \mathbf{1} = n z$, where $n = |X| = |Y| = |W|$ and the measure of each point equals $1/n$.

4.2. Monotonicity and continuity properties. The following lemma shows the monotonicity of $\tilde d_{\mathcal{A},p}$ with respect to $p$. The second part of the lemma proves continuity of $\tilde d_{\mathcal{A},p}$ at infinity.

Proposition 4.9. For any finite metric spaces $X, Y$ we have:
(a) If $1 \le p \le q < \infty$ then $\tilde d_{\mathcal{A},p}(X,Y) \le \tilde d_{\mathcal{A},q}(X,Y) \le \tilde d_{\mathcal{A},\infty}(X,Y)$.
(b) $\lim_{p \to \infty} \tilde d_{\mathcal{A},p}(X,Y) = \min_{Z \in \mathcal{A}} \max_{i,j,i',j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'} = \tilde d_{\mathcal{A},\infty}(X,Y)$.

Proof. Let $Z \in \mathcal{A}$ be optimal for equation (3.8) or (3.7) for some value of $p$. Then $\mathbf{1}^\top \hat Z \mathbf{1} = n^2$. The weighted power mean inequality implies
$$\left( \frac{1}{n^2} \sum_{ij,i'j'} \Gamma^p_{ij,i'j'} Z_{ij,i'j'} \right)^{1/p} \le \left( \frac{1}{n^2} \sum_{ij,i'j'} \Gamma^q_{ij,i'j'} Z_{ij,i'j'} \right)^{1/q} \le \max_{i,j,i',j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'},$$
and taking the infimum over $Z$ we obtain (a). Now, for fixed $Z$, let $\Gamma^*_Z = \max\{\Gamma_{ij,i'j'} : Z_{ij,i'j'} \ne 0\}$; then, using the standard calculus argument,
$$\lim_{p\to\infty} \left( \frac{1}{n^2} \sum_{ij,i'j'} \Gamma^p_{ij,i'j'} Z_{ij,i'j'} \right)^{1/p} = \Gamma^*_Z \lim_{p\to\infty} \left( \sum_{ij,i'j'} \left( \frac{\Gamma_{ij,i'j'}}{\Gamma^*_Z} \right)^p \frac{Z_{ij,i'j'}}{n^2} \right)^{1/p} = \Gamma^*_Z,$$
and taking the infimum over $Z$ we obtain (b). $\square$

Proposition 4.9 holds for metric spaces $X$ and $Y$ with possibly different numbers of points, and it says that even though $\tilde d_{\mathcal{A},p}$ may not satisfy the triangle inequality, it does in the limit $p \to \infty$.

4.3. Extension of the distance to compact infinite sets. Every compact metric space $X$ is the limit of a sequence of finite metric spaces in the Gromov-Hausdorff topology, denoted here by $\tau_{GH}$ (see for instance [BBI01, Example 7.4.9]). In fact, by taking $\epsilon_n \to 0$ and choosing a finite $\epsilon_n$-net $S_n$ in $X$ for every $n$, we get $S_n \xrightarrow{GH} X$ because $d_{GH}(X, S_n) \le d_H(X, S_n) \le \epsilon_n$.
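For a finite ambient space given as a distance matrix, an $\epsilon$-net can be constructed greedily by farthest-point sampling, a standard technique (our own sketch, not an algorithm from the paper): keep adding the point farthest from the current net until every point is within $\epsilon$ of it.

```python
def epsilon_net(D, eps):
    """Greedy eps-net of a finite metric space with distance matrix D:
    farthest-point sampling until all points lie within eps of the net.
    Returns the indices of the net points."""
    net = [0]
    while True:
        far = max(range(len(D)), key=lambda x: min(D[x][s] for s in net))
        if min(D[far][s] for s in net) <= eps:
            return net
        net.append(far)

# Five points on a line, clustered near 0 and near 5: a 1.0-net only
# needs one representative per cluster.
pts = [0.0, 0.1, 0.2, 5.0, 5.1]
D = [[abs(a - b) for b in pts] for a in pts]
net = epsilon_net(D, 1.0)
assert all(min(D[x][s] for s in net) <= 1.0 for x in range(len(pts)))
```

By construction every point is within $\epsilon$ of the returned net, which is exactly the property $d_H(X, S_n) \le \epsilon_n$ used above.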
This property inspires the following definition of an actual distance between compact metric spaces.

Definition 4.10. Let $X, Y$ be compact metric spaces. Given $\epsilon_n \to 0$, let $X_n, Y_n$ be respective $\epsilon_n$-nets of $X$ and $Y$ with the same number of points $N$. Define
$$(4.11)\quad \hat d_{\mathcal{A},p}(X,Y) = \inf_{\epsilon_n,\, X_n,\, Y_n} \limsup_{n\to\infty} \tilde d_{\mathcal{A},p}(X_n, Y_n).$$

Note that the limsup exists because $\tilde d_{\mathcal{A},p}(X_n, Y_n) \le \tilde d_{\mathcal{A},\infty}(X_n, Y_n) \le \frac12 \max(\mathrm{diam}(X), \mathrm{diam}(Y))$ for all $n$. Also note that the triangle inequality holds for this limit, which also implies that $\hat d_{\mathcal{A},p}$ and $\tilde d_{\mathcal{A},p}$ may not agree.

To illustrate how the right-hand side of (4.11) behaves, suppose $|X| < |Y|$ and for some $n$ the $\epsilon_n$-net $Y_n$ of $Y$ has at least $N$ points with $|X| < N$; then take $X_n$ to be $X$ with some repeated points, so that $|Y_n| = |X_n| = N$, and run the SDP (3.7) or (3.8) to compute $\tilde d_{\mathcal{A},p}(X_n, Y_n)$. Note that this is well defined, and when $\tilde d_{\mathcal{A},p}(X_n, X)$ exists (i.e., $\mathcal{A} = \mathrm{GH}, \mathrm{Sur}$) we have $\tilde d_{\mathcal{A},p}(X_n, X) = 0$, because the matrix $Z$ corresponding to the surjective function $X_n \to X$ is in $\mathcal{A}$ and has objective value 0.

4.4. Comparison with the Gromov-Hausdorff distance. Let $\mathcal{X}$ denote the set of isometry classes of compact metric spaces. Definition 4.10 extends the relaxed distances (3.7) and (3.8) to $\mathcal{X}$, yielding a function $\hat d_{\mathcal{A},p} : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$.

Lemma 4.12. For $X, Y \in \mathcal{X}$ and $1 \le p \le \infty$ we have
$$\hat d_{\mathcal{A},p}(X,Y) \le d_{GH}(X,Y).$$

Proof. First assume $X, Y$ are finite, and let $R \in \mathcal{R}(X,Y)$ be the minimizer in (2.1). If $|R| = N$, let $X_N, Y_N$ be the $\epsilon$-nets such that every element of $X$ appears in $X_N$ as many times as it appears in $R$ (and the same for $Y_N$). Then the bijective function between $X_N$ and $Y_N$ corresponding to $R$ induces a feasible $Z$, proving the result in the finite case. For the remaining case consider an $\epsilon_n$-net with $\epsilon_n \to 0$. $\square$

Now consider $X, Y$ finite metric spaces.
First observe that $\mathcal{A} = \mathrm{GH}$ includes the $Z$ induced by all elements of $\mathcal{R}(X,Y)$, which together with Proposition 4.9 implies
$$\tilde d_{GH,p}(X,Y) \le \tilde d_{GH,\infty}(X,Y) \le d_{GH}(X,Y).$$
Since $\mathrm{Sur} \subset \mathrm{GH}$ we have
$$\tilde d_{GH,p}(X,Y) \le \tilde d_{\mathrm{Sur},p}(X,Y),$$
and if $|X| = |Y|$ we can consider that $\mathrm{Reg} \subset \mathrm{Sur}$, therefore
$$\tilde d_{GH,p}(X,Y) \le \tilde d_{\mathrm{Sur},p}(X,Y) \le \tilde d_{\mathrm{Reg},p}(X,Y).$$
Also, the smaller the set $\mathcal{A}$, the more likely the relaxation is to produce a tight solution (a rank-1 solution, corresponding to an element of $\mathcal{R}(X,Y)$). Note that neither $\tilde d_{\mathrm{Sur},p}$ nor $\tilde d_{\mathrm{Reg},p}$ is comparable with $d_{GH}$.

4.5. Topologies induced by relaxed distances. Any metric or pseudometric $d$ defines a topology $\tau$ characterized by the property that, given a sequence $X_n$, we have convergence $X_n \xrightarrow{\tau} X$ if and only if $d(X_n, X) \to 0$. In particular, the Gromov-Hausdorff distance induces a topology on the set of isometry classes of compact metric spaces. The Gromov-Hausdorff topology is a fairly weak topology; for example, there are many compact sets. Proposition 7.4.12 in [BBI01] characterizes Gromov-Hausdorff convergence in terms of $\epsilon$-nets, implying that if a sequence $\{X_n\}$ converges in the Gromov-Hausdorff topology, then for all $\epsilon > 0$ the cardinality of $\epsilon$-nets is uniformly bounded over $X_n$. Therefore, if a class $\mathcal{X}$ of metric spaces is pre-compact (i.e., any sequence of elements of $\mathcal{X}$ has a convergent subsequence) in the Gromov-Hausdorff topology, then for every $\epsilon > 0$ the size of a minimal $\epsilon$-net is uniformly bounded over all elements of $\mathcal{X}$. The analysis in [BBI01] shows that this property of $\mathcal{X}$, along with the fact that the diameters of its members are uniformly bounded (together, what is called total boundedness, Definition 4.14), is sufficient for pre-compactness (Theorem 4.15).

Let $\hat\tau_{\mathcal{A},p}$ ($1 \le p \le \infty$) and $\tilde\tau_{\mathcal{A},\infty}$ denote the topologies induced by the pseudometrics $\hat d_{\mathcal{A},p}$ ($1 \le p \le \infty$) and $\tilde d_{\mathcal{A},\infty}$, respectively.
We obtain an analogous characterization of pre-compact sets in the topology $\hat\tau_{\mathcal{A},p}$ for $1 \le p \le \infty$ in Corollary 4.16 below.

Proposition 4.13 ([BBI01, 7.4.12]). For compact metric spaces $X$ and $\{X_n\}_{n=1}^\infty$, $X_n \xrightarrow{\tau_{GH}} X$ if and only if the following holds: for every $\epsilon > 0$ there exist a finite $\epsilon$-net $S$ in $X$ and an $\epsilon$-net $S_n$ in each $X_n$ such that $S_n \xrightarrow{\tau_{GH}} S$. Moreover, these $\epsilon$-nets can be chosen so that, for all sufficiently large $n$, $S_n$ has the same cardinality as $S$.

Note that by construction (Definition 4.10) the characterization of convergence by $\epsilon$-nets from Proposition 4.13 is also true when substituting $\tau_{GH}$ by $\hat\tau_{\mathcal{A},p}$, $1 \le p \le \infty$.

Definition 4.14 ([BBI01, 7.4.14]). A class $\mathcal{X}$ of compact metric spaces is totally bounded if
(a) there exists a constant $D$ such that for all $X \in \mathcal{X}$, $\mathrm{diam}(X) < D$;
(b) for every $\epsilon > 0$ there exists a number $N_\epsilon$ such that every $X \in \mathcal{X}$ contains an $\epsilon$-net consisting of at most $N_\epsilon$ points.

Theorem 4.15 ([BBI01, 7.4.15]). Any uniformly totally bounded class $\mathcal{X}$ of compact metric spaces is pre-compact in the Gromov-Hausdorff topology.

By Theorem 4.15 we know that if $\mathcal{X}$ is totally bounded and $\{X_n\}_{n=1}^\infty$ is a sequence in $\mathcal{X}$, then it contains a subsequence convergent in $\tau_{GH}$. Since $\hat d_{\mathcal{A},p} \le d_{GH}$, that subsequence is also convergent in $\hat\tau_{\mathcal{A},p}$, which immediately implies the following corollary.

Corollary 4.16. Any uniformly totally bounded class $\mathcal{X}$ of compact metric spaces is pre-compact in the topology $\hat\tau_{\mathcal{A},p}$ for $1 \le p \le \infty$.

4.6. Local topological properties. In the space of compact metric spaces we know that $d_{GH}(X,Y) = 0$ if and only if $X$ and $Y$ are isometric. The example at the beginning of Section 4.1 shows that this is not true for $\tilde d_{\mathcal{A},p}$ in general. However, in this section we show that it is true for most finite $X$.

Definition 4.17. Let $X$ be a finite metric space.
We say that X is generic if X ∈ G_{<∞} and all pairwise distances in X are different and non-zero. The name generic is justified in the following sense: if X ∈ G_n is not generic, then for all ε > 0 there exists Y ∈ G_n such that d_GH(X, Y) < ε and Y is generic. Also, if X ∈ G_n is generic, then there exists ε > 0 such that every Y ∈ G_n satisfying d_GH(X, Y) < ε is also generic. This proves the following remark:

Remark 4.18. The set of generic metric spaces is dense in τ_GH|_{G_{<∞}} and open in τ_GH|_{G_n}.

Lemma 4.19. If X and Y are generic and d̃_{A,p}(X, Y) = 0, then X and Y are isometric.

Proof. Assume without loss of generality |X| ≥ |Y| and d̃_{A,p}(X, Y) = 0. Let Z be the solution of (3.7) for X, Y with objective value 0. The constraint Σ_{j,j'} Z_{ij,i'j'} = 1 for all i, i' implies that, given i ≠ i', there exist j, j' such that Z_{ij,i'j'} > 0. Since the objective value is 0, this implies that d_X(x_i, x_{i'}) = d_Y(y_j, y_{j'}). Since all pairwise distances in X are different, this completely determines all distances in Y; in particular it implies |X| = |Y|, X and Y are isometric, and the unique solution of (3.7) corresponds to the isometry between X and Y. □

Corollary 4.20. If X ∈ G_n is generic, there exists a neighborhood of X in G_n such that for all Y in that neighborhood

d_GH(X, Y) = (1/2) max_{i,j=1,...,n} |d_X(x_i, x_j) − d_Y(y_i, y_j)|

(Y is a small enough perturbation of a metric space isometric with X, where we label the points such that the isometry is x_i ↦ y_i for all i). In particular, we can think of the neighborhood where that property holds as the neighborhood of X with radius Δ/n, where Δ is the smallest non-zero entry of the matrix Γ(X, X). In this setting we have that

d̃_{A,p}(X, Y) = (1/2) ( (1/n²) Σ_{i,j} |d_X(x_i, x_j) − d_Y(y_i, y_j)|^p )^{1/p}

in the neighborhood of X of radius Δ_p/n.
And, in the neighborhood of X of radius Δ, we have d̃_{A,∞}(X, Y) = d_GH(X, Y). This implies that the topologies τ_GH|_{G_n} and τ̃_{GH,p}|_{G_n} are equivalent for all finite n and p, and that d̃_{A,∞} and d_GH are generically locally the same.

5. GHMatch: a rank-1 augmented Lagrangian approach towards the registration problem

In the previous sections, we have studied an approach to the problem of computing the Gromov-Hausdorff distance (equation (3.1)) via semidefinite optimization. There we first lift the variable µ ∈ R^{nm} to a symmetric variable Z ∈ R^{(nm+1)×(nm+1)} such that rank(Z) = 1, and then relax the non-convex rank constraint to the convex constraint Z ⪰ 0.

There are many attractive properties of semidefinite relaxations. For one thing, there are many software packages that efficiently provide global solutions to semidefinite programs (e.g., SDPNAL+ [YST15]). Moreover, there is a great deal of research energy directed at producing more efficient SDP solvers; the field is rapidly evolving and solvers are getting more efficient every day. Furthermore, SDP relaxations have the advantage of often being tight: in our situation, we have observed numerically that the solution Z frequently has rank 1. In this case, the semidefinite optimization finds the global solution of the original problem, and also provides a certificate of its optimality. This property has recently been exploited to efficiently produce certificates of optimality for solutions found by fast non-convex algorithms that typically may converge to local optima [Ban15, IMPV15]. On the other hand, semidefinite relaxations have the disadvantage that they square the number of variables of the original problem: even with efficient solvers, this expansion makes these problems intractable for large sets of points.
Also, when the SDP is not tight, it may produce a high-rank solution Z that cannot be easily rounded to a feasible µ. Motivated by these concerns, in this section we propose a non-convex optimization approach for the registration problem. Here we trade the global optimality guarantee and the pseudometric that the semidefinite optimization provides for computational efficiency and a guarantee of a true (albeit not necessarily globally optimal) correspondence.

We will assume that |X| = |Y| = n. By restricting equation (3.1) to this case and replacing the infinity norm by the p-norm formulation, we obtain the following non-convex optimization problem, where y ∈ R^{n²} is indexed by a pair of variables ij with i, j = 1, ..., n:

(5.1)  min_y ⟨Γ^(p), y y^⊤⟩  subject to  Σ_{i=1}^n y_ij = 1,  Σ_{j=1}^n y_ij = 1,  0 ≤ y_ij ≤ 1.

Now, instead of considering a semidefinite relaxation as we did previously, we propose a greedy method to directly solve (5.1). Let A ∈ R^{2n×n²} be such that Ay = 1 if and only if Σ_{i=1}^n y_ij = 1 and Σ_{j=1}^n y_ij = 1, and let b = 1 ∈ R^{2n}. Then equation (5.1) is equivalent to the following quadratic optimization problem with linear and box constraints:

(5.2)  min_y y^⊤ Γ^(p) y  subject to  Ay = b,  0 ≤ y ≤ 1.

In order to solve problem (5.2), we use a projected augmented Lagrangian approach (e.g., see [NW06, Algorithm 17.4]):

(5.3)  L(y, λ, σ) = y^⊤ Γ^(p) y − λ^⊤(Ay − b) + (σ/2) ‖Ay − b‖²₂.

We propose the algorithm GHMatch (see Algorithm 1). Theoretical convergence analysis for GHMatch is left for future work. In the next section, we describe numerical experiments that indicate the performance of the algorithm. In the experiments we conducted, we found that this procedure converges to a local minimum of equation (5.1).
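The ingredients of (5.2) and the iteration based on (5.3) can be sketched in a few lines of Python. The following is our own illustration, not the paper's Matlab code: it assumes Γ^(p) has entries |d_X(x_i, x_{i'}) − d_Y(y_j, y_{j'})|^p indexed by the pairs ij, i'j', and it substitutes scipy's bound-constrained L-BFGS-B solver for the reflective trust-region method used in the experiments.

```python
import numpy as np
from scipy.optimize import minimize

def build_problem(DX, DY, p=1):
    """Assemble Gamma^(p), A and b for problem (5.2); y is indexed by
    the pair ij in row-major order, i.e. index ij = i*n + j."""
    n = DX.shape[0]
    # Gamma^(p)[ij, i'j'] = |d_X(x_i, x_i') - d_Y(y_j, y_j')|^p
    gamma = np.abs(DX[:, None, :, None] - DY[None, :, None, :]) ** p
    gamma = gamma.reshape(n * n, n * n)
    A = np.zeros((2 * n, n * n))       # A y = b encodes the sum constraints
    for i in range(n):
        for j in range(n):
            A[i, i * n + j] = 1.0      # sum over j of y_ij = 1
            A[n + j, i * n + j] = 1.0  # sum over i of y_ij = 1
    return gamma, A, np.ones(2 * n)

def gh_match(gamma, A, b, n, n_iters=5, sigma0=5.0, mu=10.0):
    """Sketch of GHMatch: minimize the augmented Lagrangian (5.3) over
    the box, update the multipliers, grow the penalty, then round."""
    y = np.full(n * n, 1.0 / n)        # y_0 = (1/n) 1
    lam = np.ones(2 * n)               # lambda_0 = 1
    sigma = sigma0
    for _ in range(n_iters):
        def L(yv, lam=lam, sigma=sigma):
            r = A @ yv - b
            return yv @ (gamma @ yv) - lam @ r + 0.5 * sigma * (r @ r)
        y = minimize(L, y, method="L-BFGS-B",
                     bounds=[(0.0, 1.0)] * (n * n)).x
        lam = lam - sigma * (A @ y - b)   # multiplier update
        sigma *= mu                       # sigma_{k+1} = mu * sigma_k
    return y, np.argmax(y.reshape(n, n), axis=1)  # map(i) = argmax_j y_ij
```

For two isometric spaces whose points are matched by a permutation, the flattened permutation vector is feasible (Ay = b) and attains objective value 0, and the growing penalty σ_k drives the iterates toward feasibility, consistent with the discussion of Algorithm 1 below.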
The solution found by this procedure may be rounded to an actual correspondence ȳ between the point sets, and therefore the value ⟨Γ, ȳ ȳ^⊤⟩ is an upper bound for the Gromov-Hausdorff distance.

6. Numerical performance

In this section, we describe the results of a number of numerical experiments that explore the applications of our new distance and the performance of our augmented Lagrangian approach.

Algorithm 1 GHMatch
1: y_0 ← (1/n) 1 ∈ R^{n²}
2: λ_0 ← 1 ∈ R^{2n}
3: σ_0 ← 5
4: µ ← 10
5: for k = 0, 1, 2, ... do
6:   y_{k+1} ← argmin_{0 ≤ y ≤ 1} L(y, λ_k, σ_k)   ▷ use y_k as initial point for this minimization
7:   λ_{k+1} ← λ_k − σ_k (A y_{k+1} − b)
8:   σ_{k+1} ← µ σ_k
9: end for
10: for i = 1, ..., n do   ▷ to find the map corresponding with y_T
11:   map(i) ← argmax_{j=1,...,n} y_T(1 + (i−1)n : in)
12: end for

Figure 3. (Left) Distance matrix obtained from computing the best rigid transformation that maps the corresponding labeled landmarks from the teeth dataset described in Section 6.1. (Right) Distance matrix obtained from computing the SDP distance d̃_{GH,p} with p = 1. Darker color corresponds to smaller distance. We observe the same distance patterns in both matrices even though the scales are different.

6.1. Classification using the distance d̃_GH. In order to validate our distance numerically, we compare with the numerical experiments described in [BLSC+11], using data and algorithms available on Yaron Lipman's personal website [Lip11]. As we describe below, we find that our procedure produces results that are competitive with this procedure. In [BLSC+11] the authors propose an algorithm to automatically quantify the geometric similarity of anatomical surfaces, based on a distance inspired by the Gromov-Wasserstein distance which is invariant under conformal maps.
They experiment with a real dataset coming from surfaces of teeth of different species; they compare the results of their algorithm with a method based on an expert selecting 16 landmarks on each tooth and then finding the best rigid transformation to match the labeled landmarks. Specifically, they work with a dataset consisting of 116 teeth. For each tooth, they find the closest tooth according to each distance, and then check whether the two are in the same category. We perform the same experiment on 115 of the teeth (since one of them seems to be at a different scale), but without providing our algorithm with the correspondence between the landmarks. To be precise, we consider the metric spaces X_i = {p_1^i, ..., p_16^i}, i = 1, ..., 115. The points of these metric spaces are the landmarks chosen by the expert, and the metric is given by the Euclidean distance between the landmarks. We compare the distance matrix d̃_{GH,p}(X_i, X_j) with the distance obtained by the software from [Lip11] that finds the best rigid transformation sending the n-th point of X_i to the n-th point of X_j for n = 1, ..., 16. See Figure 3 for a visual comparison of the distance matrices.

Figure 4. The value to which y_k converges for different parameters σ_0 and µ, for n = 30. (The six panels plot the entries y_i, 1 ≤ i ≤ n, on a [0, 1] scale, for σ_0 ∈ {2, 5, 10} and µ ∈ {5, 10}.) All these y satisfy the linear constraint Ay = b, but the choice of the parameters determines how close the vector y is to a feasible vector with entries in {0, 1}.

When running the nearest-neighbor classification test as described above, we obtain very similar performance: 0.85 frequency of success for our distance, against 0.91 for the conformal Wasserstein distance proposed in [BLSC+11] and 0.92 for the landmark comparison algorithm that uses the a priori known correspondence. We find this result very encouraging given that our algorithm does not make any geometric assumptions about the teeth (e.g., we do not assume they are smooth surfaces), in contrast to [BLSC+11].

6.2. Performance of GHMatch. In order to evaluate our non-convex optimization formulation of the registration problem, we consider the 3D models from [BBK08] and sample random points from each model. We run the rank-1 augmented Lagrangian optimization from Algorithm 1, using Matlab's implementation of the reflective trust-region algorithm to run the step y ← argmin_{0 ≤ y ≤ 1} L(y, λ_k, σ_k). In Figure 5 we depict the resulting map between corresponding figures.

By design we know σ_k → ∞ as k increases, which guarantees that ‖A y_k − b‖ → 0. However, there is no theoretical guarantee that y_k will converge to a sparse vector with entries in {0, 1}. Nevertheless, we have observed in our numerical simulations that y_k converges to a fairly sparse vector where most of the entries are close to 0 or 1, provided a good choice of the parameters µ and σ_0 (see Figure 4). Moreover, regardless of the choice of the parameters, we find that the thresholding step in the algorithm often obtains a map that is bijective.

Remark 6.1. As an alternative to the selection of parameters σ_0 and λ, a thresholding step could be introduced inside the main iteration (lines 5 to 9) to enforce sparsity of the resulting y.

Figure 5. We run GHMatch on 50 points sampled at random from the surfaces, and 60 points for the dogs. The pairwise distances we consider in each figure are the geodesic distances in the mesh. Images that are horizontally aligned correspond to different angles of the same correspondence between the 3D models. Note that for the dogs, the correspondence matches the left legs of one dog with the right legs of the other (this is a consequence of the symmetry). Also note that there are small imperfections, like the tail of one dog matching a leg of the other (a consequence of randomly sampling and obtaining different numbers of points from the dogs' tails). The algorithm with 50 sample points runs in less than 6 minutes on a standard 2013 MacBook Air.

Acknowledgements

The authors would like to thank Facundo Mémoli for insightful discussions about the Gromov-Wasserstein distance, Tingran Gao for pointing out useful references, Yaron Lipman for having his dataset and code freely available, Stephen Wright for his advice on the implementation of our non-convex algorithm, and Marcos Cossarini for pointing out a misstatement in a previous version of this manuscript.

References

[Ban15] Afonso S. Bandeira. A note on probably certifiably correct algorithms. Comptes Rendus Mathematique, to appear, 2015.
[BBI01] Dmitri Burago, Yuri Burago, and Sergei Ivanov. A Course in Metric Geometry, volume 33. American Mathematical Society, Providence, 2001.
[BBK08] Alexander Bronstein, Michael Bronstein, and Ron Kimmel. Numerical Geometry of Non-Rigid Shapes. Springer Publishing Company, Incorporated, 1st edition, 2008.
[BLSC+11] Doug M. Boyer, Yaron Lipman, Elizabeth St. Clair, Jesus Puente, Biren A. Patel, Thomas Funkhouser, Jukka Jernvall, and Ingrid Daubechies. Algorithms to automatically quantify the geometric similarity of anatomical surfaces. Proceedings of the National Academy of Sciences, 108(45):18221–18226, 2011.
[CK14] Connor Clark and Jugal Kalita. A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics, 2014.
[DL16] Nadav Dym and Yaron Lipman. Exact recovery with symmetries for Procrustes matching. arXiv preprint arXiv:1606.01548, 2016.
[Ele12] Gábor Elek. Samplings and observables. Limits of metric measure spaces. arXiv preprint arXiv:1205.6936, 2012.
[Gro81] Mikhail Gromov. Groups of polynomial growth and expanding maps. Inst. Hautes Études Sci. Publ. Math., (53):53–73, 1981.
[Gro01] Mikhail Gromov. Metric Structures for Riemannian and Non-Riemannian Spaces. Progress in Mathematics. Birkhäuser, 2001.
[IMPV15] Takayuki Iguchi, Dustin G. Mixon, Jesse Peterson, and Soledad Villar. Probably certifiably correct k-means clustering. 2015.
[KKBL15] Itay Kezurer, Shahar Z. Kovalsky, Ronen Basri, and Yaron Lipman. Tight relaxation of quadratic matching. Computer Graphics Forum, 34(5):115–128, 2015.
[Lip11] Yaron Lipman. Software and teeth dataset, 2011.
[LV09] John Lott and Cédric Villani. Ricci curvature for metric-measure spaces via optimal transport. Ann. of Math. (2), 169(3):903–991, 2009.
[MDK+16] Haggai Maron, Nadav Dym, Itay Kezurer, Shahar Kovalsky, and Yaron Lipman. Point registration via efficient convex relaxation. ACM Trans. Graph., 35(4):73:1–73:12, July 2016.
[Mém07] Facundo Mémoli. On the use of Gromov-Hausdorff distances for shape comparison. Pages 81–90, Prague, Czech Republic, 2007. Eurographics Association.
[Mém11] Facundo Mémoli. Gromov-Wasserstein distances and the metric approach to object matching. Foundations of Computational Mathematics, pages 1–71, 2011. 10.1007/s10208-011-9093-5.
[MS05] Facundo Mémoli and Guillermo Sapiro. A theoretical and computational framework for isometry invariant recognition of point cloud data. Found. Comput. Math., 5(3):313–347, 2005.
[NW06] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, New York, 2nd edition, 2006.
[OWWZ14] Ryan O'Donnell, John Wright, Chenggang Wu, and Yuan Zhou. Hardness of robust graph isomorphism, Lasserre gaps, and asymmetry of random graphs. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1659–1677. SIAM, 2014.
[PCS16] Gabriel Peyré, Marco Cuturi, and Justin Solomon. Gromov-Wasserstein averaging of kernel and distance matrices. In Proc. ICML'16, pages 2664–2672, 2016.
[Stu06] Karl-Theodor Sturm. On the geometry of metric measure spaces. I. Acta Math., 196(1):65–131, 2006.
[Vil03] Cédric Villani. Topics in Optimal Transportation. Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2003.
[YST15] Liuqin Yang, Defeng Sun, and Kim-Chuan Toh. SDPNAL+: a majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Mathematical Programming Computation, 7(3):331–366, 2015.

Department of Mathematics, University of Texas, Austin, TX 78703
E-mail address: mvillar@math.utexas.edu

Department of Mathematics and Center for Data Science, Courant Institute of Mathematical Sciences, New York University, New York, NY 10012
E-mail address: bandeira@cims.nyu.edu

Department of Mathematics, University of Texas, Austin, TX 78703
E-mail address: blumberg@math.utexas.edu

Department of Mathematics, University of Texas, Austin, TX 78703
E-mail address: rward@math.utexas.edu
