A polynomial-time relaxation of the Gromov-Hausdorff distance


Authors: Soledad Villar, Afonso S. Bandeira, Andrew J. Blumberg, and Rachel Ward

Abstract. The Gromov-Hausdorff distance provides a metric on the set of isometry classes of compact metric spaces. Unfortunately, computing this metric directly is believed to be computationally intractable. Motivated by applications in shape matching and point-cloud comparison, we study a semidefinite programming relaxation of the Gromov-Hausdorff metric. This relaxation can be computed in polynomial time, and somewhat surprisingly is itself a pseudometric. We describe the induced topology on the set of compact metric spaces. Finally, we demonstrate the numerical performance of various algorithms for computing the relaxed distance and apply these algorithms to several relevant data sets. In particular, we propose a greedy algorithm for finding the best correspondence between finite metric spaces that can handle hundreds of points.

1. Introduction

In order to study the convergence of sequences of metric spaces, Gromov introduced what is now called the Gromov-Hausdorff metric [Gro81]. Roughly speaking, this metric generalizes the classical Hausdorff distance between subsets of an ambient metric space to a pair of arbitrary metric spaces by taking the infimum over all embeddings. The Gromov-Hausdorff metric has been of theoretical importance in geometric group theory and is at the heart of the subject of "metric geometry". More recently, the Gromov-Hausdorff distance has been proposed as a basic method for comparing point clouds [MS05]. A point cloud is simply a finite metric space (often presented as a subset of $\mathbb{R}^n$); this is a fundamental and ubiquitous representation of data. Geometric examples, where the point cloud represents samples from some smooth geometric object, arise from various kinds of shape acquisition devices.
Examples with less obvious intrinsic geometric structure are frequently generated by biological data (e.g., collections of gene expression vectors). Given two point clouds, a natural question is to determine whether they are related by some isometric transformation; if not, one might wish to know a quantitative measure of their difference. Another version of this sort of problem is what is known as the point registration problem (also sometimes referred to as point matching and network alignment). Point registration consists in finding a correspondence between point sets or graphs such that a certain cost function is minimized. It appears in computer vision problems like shape matching, in computational biology [CK14], and in general pattern recognition problems. In some applications, registering or aligning is particularly challenging since there is no explicit correspondence between the sets, often because deformation has occurred or the sets have different numbers of points. In such cases it is natural to consider a metric on point clouds that is defined in terms of correspondences between the point clouds; the Gromov-Hausdorff distance can be described in terms of a minimax expression over correspondences between the metric spaces, and so is potentially suitable for this purpose.

Unfortunately, exact computation of the Gromov-Hausdorff distance is essentially intractable; it involves the solution of an NP-hard optimization problem. As a consequence, it is natural to consider relaxations. In [Mém11], Mémoli studied a relaxation referred to as the Gromov-Wasserstein distance; this distance is closely related to distances motivated by optimal transport problems [LV09, Stu06], to a "distance distribution" metric defined by Gromov, and also to the cut distance of graphons.
Unfortunately, computing the Gromov-Wasserstein distance still requires solving a non-convex optimization problem which does not appear to have attractive performance characteristics in practice.

In this paper, we study a semidefinite programming (SDP) relaxation of the Gromov-Hausdorff distance; this yields a tractable convex optimization problem. We prove that this relaxation defines a pseudometric on point clouds that can be computed in polynomial time. Our pseudometric also provides a relaxed correspondence between point clouds and a lower bound for the Gromov-Hausdorff distance. A similar version of this SDP was recently introduced in [KKBL15] and further studied in [MDK+16] and [DL16]; our work provides theoretical validation for some of the computational phenomena observed therein and complements their theoretical framework. We describe the performance of optimized solvers for this SDP. We also propose a non-convex optimization algorithm to approach the registration problem efficiently. The output of this algorithm not only provides a local optimum for the registration problem, but also an upper bound for the Gromov-Hausdorff distance.

(A. J. Blumberg and S. Villar were supported in part by AFOSR grant FA9550-15-1-0302. R. Ward and S. Villar were supported in part by NSF CAREER grant #1255631. Part of this work was done while A. S. Bandeira was with the Mathematics Department at MIT and supported by NSF Grant DMS-1317308.)

2. Background

We begin by recalling the definition of the Gromov-Hausdorff distance; for this, we start with the Hausdorff distance. Let $(Z, d)$ be a compact metric space and $\mathcal{C}(Z)$ the collection of all compact sets in $Z$.
If $A, B \in \mathcal{C}(Z)$, the Hausdorff distance between $A$ and $B$ can be expressed as
$$d^Z_H(A,B) = \inf_{R \in \mathcal{R}(A,B)} \sup_{(a,b) \in R} d(a,b),$$
where $\mathcal{R}(A,B)$ is the set of correspondences $R \subset A \times B$ such that every element $a \in A$ is related to at least one element of $B$ and every element $b \in B$ is related to at least one element of $A$.

For many theoretical and practical applications it is common to relax this distance to the Wasserstein distance [Vil03]. In that setting, one endows $\mathcal{C}(Z)$ with a measure,
$$\mathcal{C}_w(Z) = \{(A, \mu_A) : A \in \mathcal{C}(Z),\ \mathrm{supp}(\mu_A) = A\},$$
and relaxes the set of correspondences $\mathcal{R}(A,B)$ to the set of transportation plans
$$\mathcal{M}(\mu_A, \mu_B) = \{\mu : \mu(A' \times B) = \mu_A(A'),\ \mu(A \times B') = \mu_B(B') \text{ for all Borel sets } A' \subset A,\ B' \subset B\}.$$
Then, for $A, B \in \mathcal{C}_w(Z)$, the Wasserstein distance is defined for $1 \le p \le \infty$ as
$$d^Z_{W,p}(A,B) = \inf_{\mu \in \mathcal{M}(\mu_A,\mu_B)} \left( \int_{A \times B} d^p(a,b)\, d\mu(a,b) \right)^{1/p} \quad \text{for } 1 \le p < \infty,$$
$$d^Z_{W,\infty}(A,B) = \inf_{\mu \in \mathcal{M}(\mu_A,\mu_B)} \sup_{(a,b) \in \mathrm{supp}(\mu)} d(a,b).$$
For $A, B$ finite sets this distance can be efficiently computed by a linear program.

The Hausdorff distance suffices to compare metric spaces embedded in a common ambient metric space; Gromov's idea to extend this to compare arbitrary metric spaces is simply to consider the infimum over all isometric embeddings into a common metric space [Gro01]. Specifically, if $X, Y$ are compact metric spaces, the Gromov-Hausdorff distance is defined as
$$d_{GH}(X,Y) = \inf_{Z, f, g} d^Z_H(f(X), g(Y)),$$
where $f: X \to Z$ and $g: Y \to Z$ are isometric embeddings into a metric space $Z$. Unfortunately, it is an NP-hard problem to compute this distance.

Since the Hausdorff distance becomes computationally tractable when relaxed to the Wasserstein distance, one might consider a transport-based relaxation of the Gromov-Hausdorff distance that works in the setting of metric measure spaces.
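For finite sets, the correspondence formulation of the Hausdorff distance above agrees with the classical sup-inf formula, and this is easy to check by brute force. The sketch below (our own illustration, not from the paper; exponential in the number of pairs, so tiny inputs only) compares the two:

```python
from itertools import product

def hausdorff(A, B, d):
    """Classical Hausdorff distance: max of the two directed sup-inf distances."""
    return max(
        max(min(d(a, b) for b in B) for a in A),
        max(min(d(a, b) for a in A) for b in B),
    )

def hausdorff_via_correspondences(A, B, d):
    """Same distance via the minimax over correspondences R ⊆ A×B in which
    every element of A and of B is covered. Brute-force over all subsets."""
    pairs = list(product(range(len(A)), range(len(B))))
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [pairs[k] for k in range(len(pairs)) if mask >> k & 1]
        covers_A = {i for i, _ in R} == set(range(len(A)))
        covers_B = {j for _, j in R} == set(range(len(B)))
        if covers_A and covers_B:
            best = min(best, max(d(A[i], B[j]) for i, j in R))
    return best

A, B = [0.0, 1.0], [0.2, 0.9, 2.0]  # two point clouds on the real line
d = lambda a, b: abs(a - b)
assert hausdorff(A, B, d) == hausdorff_via_correspondences(A, B, d)
```

Both formulas give 1.0 here, driven by the point 2.0 of B, whose nearest neighbor in A is at distance 1.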
In a series of papers [Mém11, Mém07, MS05], Mémoli considers different equivalent expressions for the Gromov-Hausdorff distance, and by relaxing them and considering them in the metric measure space setting, he obtains different gromovizations of the Wasserstein distance, called Gromov-Wasserstein distances. A particularly natural relaxation is based on the observation that the Gromov-Hausdorff distance can be expressed as
$$(2.1)\quad d_{GH}(X,Y) = \frac12 \inf_{R \in \mathcal{R}(X,Y)} \sup_{\substack{x,x' \in X,\ y,y' \in Y \\ (x,y),(x',y') \in R}} \Gamma_{X,Y}(x,y,x',y'),$$
where $\Gamma_{X,Y}(x,y,x',y') = |d_X(x,x') - d_Y(y,y')|$. For $1 \le p \le \infty$, Mémoli then defines Gromov-Wasserstein relaxations of the Gromov-Hausdorff distance as
$$(2.2)\quad \mathcal{D}_p(X,Y) = \frac12 \inf_{\mu \in \mathcal{M}(\mu_X,\mu_Y)} \left( \int_{X\times Y}\int_{X\times Y} \Gamma_{X,Y}(x,y,x',y')^p\, \mu(dx \times dy)\, \mu(dx' \times dy') \right)^{1/p},$$
$$(2.3)\quad \mathcal{D}_\infty(X,Y) = \frac12 \inf_{\mu \in \mathcal{M}(\mu_X,\mu_Y)} \sup_{\substack{x,x' \in X,\ y,y' \in Y \\ (x,y),(x',y') \in \mathrm{supp}(\mu)}} \Gamma_{X,Y}(x,y,x',y').$$
In his work, Mémoli studies topological properties of the different distance relaxations and how they compare with each other and with the Gromov-Hausdorff distance. He also proposes an algorithm to approximate $\mathcal{D}_p$, but due to the non-convexity of its objective function, no performance guarantees are provided. Recent work [PCS16] has provided efficient heuristic algorithms based on optimal transportation to approximate the Gromov-Wasserstein distance for alignment applications.

Remark 2.4. Gromov considered another metric on the set of metric measure spaces defined in terms of the convergence of all distance matrix distributions (i.e., the distributions induced by taking the pushforward of the measure to a collection of $n$ points and applying the metric to all pairs). It turns out that this metric is closely related to the Gromov-Wasserstein distance [Stu06, 3.7].
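For finite spaces, evaluating the objective of (2.2) at any fixed transportation plan gives an upper bound on $\mathcal{D}_p$, since the distance is an infimum over plans. A minimal sketch (our own, not from the paper), using the uniform product coupling, which is always feasible for uniform marginals:

```python
def gw_objective(DX, DY, mu, p=1):
    """Gromov-Wasserstein objective for a fixed plan mu:
    (1/2) * (sum_{i,i',j,j'} mu_ij mu_i'j' |DX[i][i'] - DY[j][j']|^p)^(1/p)."""
    n, m = len(DX), len(DY)
    s = sum(
        mu[i][j] * mu[i2][j2] * abs(DX[i][i2] - DY[j][j2]) ** p
        for i in range(n) for j in range(m)
        for i2 in range(n) for j2 in range(m)
    )
    return 0.5 * s ** (1.0 / p)

# Two 2-point spaces with diameters 1 and 2, uniform product coupling
# mu_ij = 1/(n*m): its marginals are uniform, so it is a feasible plan
# and the value below upper-bounds D_1.
DX = [[0, 1], [1, 0]]
DY = [[0, 2], [2, 0]]
mu = [[0.25, 0.25], [0.25, 0.25]]
assert gw_objective(DX, DY, mu, p=1) == 0.5
```

Finding the infimum over plans is the hard (non-convex) part that the rest of the paper addresses.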
Moreover, these metrics induce the same notion of convergence as arises in the theory of dense graph sequences and graphons. Specifically, we can regard a graph as a metric measure space: the underlying metric space has as points the set of vertices, with pairwise distance $\frac12$ if the points are connected and $1$ otherwise, and the measure assigns equal mass to each point. (See [Ele12] for further discussion of this point.)

3. Semidefinite relaxations of Gromov-Wasserstein and Gromov-Hausdorff distances

Consider the setting where $X$ and $Y$ are finite metric spaces (or metric measure spaces), say $X = \{x_1,\ldots,x_n\}$ and $Y = \{y_1,\ldots,y_m\}$ (with measures $\mu_X(x_i) = \nu_i$ and $\mu_Y(y_j) = \lambda_j$). Let us abbreviate $\Gamma_{X,Y}(x_i, y_j, x_{i'}, y_{j'})$ as $\Gamma_{ij,i'j'}$ for $i,i' = 1,\ldots,n$ and $j,j' = 1,\ldots,m$. The formulation of the Gromov-Hausdorff distance given in equation (2.1) can be expressed as a quadratic assignment problem (Remark 3 in [Mém07]):
$$(3.1)\quad d_{GH}(X,Y) = \frac12 \min_{\mu} \max_{ij,i'j'} \Gamma_{ij,i'j'}\, \mu_{ij}\, \mu_{i'j'} \quad \text{subject to } \mu_{ij} \in \{0,1\},\ \sum_{j=1}^m \mu_{ij} \ge 1,\ \sum_{i=1}^n \mu_{ij} \ge 1,$$
and the expressions (2.2) and (2.3) can be written as (3.2) and (3.3) respectively:
$$(3.2)\quad \mathcal{D}_p(X,Y) = \frac12 \min_{\mu \in \mathbb{R}^{n\times m}} \left( \sum_{i,i'=1}^n \sum_{j,j'=1}^m \mu_{ij}\, \mu_{i'j'}\, \Gamma^p_{ij,i'j'} \right)^{1/p} \quad \text{subject to } 0 \le \mu_{ij} \le 1,\ \sum_{i=1}^n \mu_{ij} = \lambda_j,\ \sum_{j=1}^m \mu_{ij} = \nu_i,$$
$$(3.3)\quad \mathcal{D}_\infty(X,Y) = \frac12 \min_{\mu \in \mathbb{R}^{n\times m}} \max_{i,i',j,j' :\ \mu_{ij}\mu_{i'j'} \ne 0} \Gamma_{ij,i'j'} \quad \text{subject to } 0 \le \mu_{ij} \le 1,\ \sum_{i=1}^n \mu_{ij} = \lambda_j,\ \sum_{j=1}^m \mu_{ij} = \nu_i.$$
In order to approach non-convex optimization problems like (3.1), (3.2) or (3.3), one standard technique is to linearize the objective by lifting $\mu_{ij}\mu_{i'j'}$ and $\mu_{ij}$ to a symmetric variable $Z \in \mathbb{R}^{(nm+1)\times(nm+1)}$ whose entries are indexed by pairs $(ij, i'j')$, $(ij, nm+1)$, $(nm+1, i'j')$ and $(nm+1, nm+1)$, with $i,i' = 1,\ldots,n$ and $j,j' = 1,\ldots,m$:
$$(3.4)\quad Z = \begin{pmatrix} \hat Z & z \\ z^\top & 1 \end{pmatrix}.$$
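For very small spaces, the quadratic assignment formulation (3.1) can be solved exactly by enumerating all correspondences. The sketch below (our own brute-force illustration of (2.1)/(3.1), exponential in $nm$, so toy inputs only) takes two distance matrices and returns the exact Gromov-Hausdorff distance:

```python
from itertools import product

def gromov_hausdorff(DX, DY):
    """Exact d_GH for tiny finite metric spaces, via the minimax over
    correspondences in (2.1)/(3.1). DX, DY are distance matrices."""
    n, m = len(DX), len(DY)
    pairs = [(i, j) for i in range(n) for j in range(m)]
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [pairs[k] for k in range(len(pairs)) if mask >> k & 1]
        if {i for i, _ in R} != set(range(n)) or {j for _, j in R} != set(range(m)):
            continue  # not a correspondence: both spaces must be covered
        worst = max(abs(DX[i][i2] - DY[j][j2]) for (i, j), (i2, j2) in product(R, R))
        best = min(best, worst)
    return best / 2

# A 3-point path metric and a slightly stretched copy of it.
DX = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
DY = [[0, 1, 2.5], [1, 0, 1.5], [2.5, 1.5, 0]]
assert gromov_hausdorff(DX, DX) == 0.0
assert gromov_hausdorff(DX, DY) == 0.25  # half the worst distortion, 0.5/2
```

The exponential enumeration is exactly what the SDP lifting that follows is designed to avoid.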
Then note that, for instance, the problems (3.2) and (3.3) are equivalent to problems (3.5) and (3.6) respectively:
$$(3.5)\quad \mathcal{D}_p(X,Y) = \frac12 \left( \min_{Z} \mathrm{Trace}(\Gamma^{(p)} \hat Z) \right)^{1/p} \quad \text{subject to } Z \in S,$$
$$(3.6)\quad \mathcal{D}_\infty(X,Y) = \frac12 \min_{Z} \max_{i,i',j,j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'} \quad \text{subject to } Z \in S,$$
where
$$S = \{Z \in \mathbb{R}^{(nm+1)\times(nm+1)} : \textstyle\sum_i Z_{ij,nm+1} = \lambda_j,\ \sum_j Z_{ij,nm+1} = \nu_i,\ Z = Z^\top,\ Z \ge 0,\ \mathrm{rank}(Z) = 1\}$$
and $\Gamma^{(p)}$ denotes the $p$-th power of the matrix $\Gamma$ entrywise. The constraint $\mathrm{rank}(Z) = 1$ can be relaxed to the convex constraint $Z \succeq 0$ (which means $Z$ is symmetric and positive semidefinite), and additional linear constraints satisfied by the rank-1 matrix can be added to make the relaxation tighter. Using this recipe we can construct the following family of semidefinite programming relaxations of the Gromov-Wasserstein and Gromov-Hausdorff distances:
$$(3.7)\quad \tilde d_{\mathcal{A},p}(X,Y) = \frac12 \left( \frac{1}{n^2} \min_{Z} \mathrm{Trace}(\Gamma^{(p)} \hat Z) \right)^{1/p} \quad \text{subject to } Z \in \mathcal{A},$$
$$(3.8)\quad \tilde d_{\mathcal{A},\infty}(X,Y) = \frac12 \min_{Z} \max_{i,j,i',j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'} \quad \text{subject to } Z \in \mathcal{A},$$
where we can consider different convex sets $\mathcal{A}$ as relaxing to different distances.

(a) For a relaxation of the Gromov-Hausdorff distance (or Gromov-Wasserstein for uniform weights $\lambda_j = \nu_i = 1/\max\{n,m\}$ for all $j = 1,\ldots,m$ and $i = 1,\ldots,n$),¹ consider
$$\mathcal{A} = \mathrm{GH} := \{Z \in \mathbb{R}^{(nm+1)\times(nm+1)} : \textstyle\sum_i Z_{ij,nm+1} \ge 1,\ \sum_j Z_{ij,nm+1} \ge 1,\ \sum_{i,i'} Z_{ij,i'j'} \ge 1,\ \sum_{j,j'} Z_{ij,i'j'} \ge 1,\ \hat Z \mathbf{1} = \max\{n,m\}\, z,\ 0 \le Z \le 1,\ Z \succeq 0\}.$$
Relaxation (3.8) provides a lower bound for the Gromov-Hausdorff distance, since every element of $\mathcal{R}(X,Y)$ induces, up to normalization, a feasible $Z$. In fact, if the optimal solution of equation (2.1) corresponds to $R \in \mathcal{R}(X,Y)$ such that $(x_i, y_j), (x_i, y_{j'}) \in R$ for some $j \ne j'$, the solution of equation (3.7) may split the mass in a way such that $Z_{ij,nm+1} + Z_{ij',nm+1} = 1$ instead of having $Z_{ij,nm+1} = Z_{ij',nm+1} = 1$.
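The rank-1 lifting (3.4) is easy to carry out concretely: stack the entries of $\mu$ into a vector, append a trailing 1, and take the outer product. The sketch below (our own, checking only a subset of the linear constraints of the set GH, with the PSD constraint holding automatically for any outer product) verifies that a correspondence matrix lifts to a feasible point:

```python
def lift(mu):
    """Rank-1 lifting of (3.4): v = (vec(mu), 1), Z = v v^T, so that
    Z[ij, i'j'] = mu_ij * mu_i'j' and the last column z equals vec(mu)."""
    v = [x for row in mu for x in row] + [1.0]
    return [[a * b for b in v] for a in v]

def in_GH(Z, n, m, tol=1e-9):
    """Check some linear constraints of the set GH: entries in [0,1],
    row/column mass of z at least 1, and Zhat·1 = max(n,m)·z."""
    N = n * m
    idx = lambda i, j: i * m + j
    ok = all(-tol <= Z[a][b] <= 1 + tol for a in range(N + 1) for b in range(N + 1))
    for j in range(m):
        ok &= sum(Z[idx(i, j)][N] for i in range(n)) >= 1 - tol
    for i in range(n):
        ok &= sum(Z[idx(i, j)][N] for j in range(m)) >= 1 - tol
    for a in range(N):
        ok &= abs(sum(Z[a][b] for b in range(N)) - max(n, m) * Z[a][N]) <= tol
    return ok

mu = [[1.0, 0.0], [0.0, 1.0]]  # identity correspondence on two points
assert in_GH(lift(mu), 2, 2)
```

An SDP solver optimizes over the convex set obtained by dropping the rank-1 structure while keeping these linear constraints plus $Z \succeq 0$.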
(b) If $|X| = |Y|$ we may want to restrict the set of all correspondences between $X$ and $Y$ (where every element of $X$ is related to at least one element of $Y$ and vice versa) to the set of all bijective correspondences. In that case we can consider a tighter relaxation, which relaxes the registration problem and is similar to the one in [KKBL15]:
$$\mathcal{A} = \mathrm{Reg} := \{Z \in \mathbb{R}^{(n^2+1)\times(n^2+1)} : \textstyle\sum_{i=1}^n Z_{ij,n^2+1} = 1,\ \sum_{j=1}^n Z_{ij,n^2+1} = 1,\ Z_{n^2+1,n^2+1} = 1,\ \sum_{i,i'=1}^n Z_{ij,i'j'} = 1,\ \sum_{j,j'=1}^n Z_{ij,i'j'} = 1,\ Z_{ij,ij'} = 0 \text{ if } j \ne j',\ Z_{ij,i'j} = 0 \text{ if } i \ne i',\ \mathrm{Trace}(Z) = n+1,\ \hat Z \mathbf{1} = n z,\ 0 \le Z \le 1,\ Z \succeq 0\}.$$

(c) The registration relaxation can be extended to finite metric spaces with different numbers of points. Let, as before, $|X| = n$ and $|Y| = m$. First, without loss of generality, assume that $n \ge m$. Now consider the problem (2.1) where the set $\mathcal{R}(X,Y)$ is restricted to surjective functions $X \to Y$. Then relax the feasible set to the convex set
$$\mathcal{A} = \mathrm{Sur} := \{Z \in \mathbb{R}^{(nm+1)\times(nm+1)} : \textstyle\sum_{i=1}^n Z_{ij,nm+1} \ge 1,\ \sum_{j=1}^m Z_{ij,nm+1} = 1,\ Z_{nm+1,nm+1} = 1,\ \sum_{i,i'=1}^n Z_{ij,i'j'} \ge 1,\ \sum_{j,j'=1}^m Z_{ij,i'j'} = 1,\ Z_{ij,ij'} = 0 \text{ if } j \ne j',\ \mathrm{Trace}(Z) = n+1,\ \hat Z \mathbf{1} = n z,\ 0 \le Z \le 1,\ Z \succeq 0\}.$$
Note that this set of constraints assumes $i,i' = 1,\ldots,n$, $j,j' = 1,\ldots,m$ and $n \ge m$, and it is not symmetric with respect to $i$ and $j$. Also note that under this relaxation there exist spaces such that $d_{GH}(X,Y) = 0$ but the relaxed distance (3.7) with $\mathcal{A} = \mathrm{Sur}$ satisfies $\tilde d_{\mathrm{Sur},p}(X,Y) \ne 0$ (and the same phenomenon occurs for $\mathcal{A} = \mathrm{Reg}$); for instance $X = \{x, x, y\}$ and $Y = \{x, y, y\}$. This is an artifact of allowing only surjective functions instead of all possible relations in $\mathcal{R}(X,Y)$.

¹ By appropriately choosing the right hand side of the equality constraints in GH one can obtain a semidefinite relaxation of the Gromov-Wasserstein distance for any weights.

Remark 3.9.
Even though the max objective in equation (3.8) is convex, it is not smooth, which we observe to significantly hurt the performance of the numerical implementations. This is one reason to consider the $p$-norm approach to this relaxation and define the family of SDP relaxations (3.7) for $1 \le p < \infty$.

Remark 3.10. Note that the linear constraints in the sets $\mathcal{A}$ are not linearly independent and the extra variable $z$ is redundant. However, one can easily choose alternative sets of constraints (even with fewer constraints, or where the extra variable is not redundant) with the same objectives in mind: (i) the relaxation is tight when the spaces are isometric; (ii) the corresponding objective value satisfies the triangle inequality when $|X| = |Y|$.

We are primarily interested in the semidefinite programming relaxations of the Gromov-Hausdorff distance for finite metric spaces, namely $\tilde d_{\mathcal{A},p}$, $1 \le p \le \infty$, for $\mathcal{A} = \mathrm{GH}, \mathrm{Reg}, \mathrm{Sur}$. In Figure 1 we extend a diagram of Mémoli's to situate the SDP relaxations we study in this paper.

Figure 1. [Diagram relating $(\mathcal{C}(Z), d^Z_H)$, $(\mathcal{C}_w(Z), d^Z_{W,p})$, $(\mathcal{G}, d_{GH})$, $(\mathcal{G}_w, \mathcal{D}_p)$, and $(\mathcal{G} \text{ or } \mathcal{G}_w, \tilde d_{GH,p})$ via gromovization, relaxation, and SDP-relaxation arrows.] Here $\mathcal{G}$ is the collection of all compact metric spaces and $\mathcal{G}_w$ the collection of all metric measure spaces. The horizontal arrows represent the relaxation of the notion of correspondences. The gromovization arrows represent the process of getting rid of the ambient space.

4. Topological properties of the relaxed distances

In this section, we prove the main theoretical results of the paper.
We begin by showing that the distances obtained by semidefinite relaxation are in fact pseudometrics on suitable subsets of the set of isometry classes of finite metric spaces; i.e., these distances satisfy all the axioms of a metric except that there exist distinct finite metric spaces whose relaxed distance is 0. We then study various properties of the induced topology, proving analogues of the standard results about the topology induced on the set of isometry classes of compact metric spaces by the Gromov-Hausdorff distance.

4.1. Pseudometrics. Let $\mathcal{G}_{<\infty}$ be the set of all finite metric spaces. First, we observe that $\tilde d_{\mathcal{A},\infty}$ is a pseudometric on $\mathcal{G}_{<\infty}$. However, if the spaces $X, Y, Z$ have different numbers of points we cannot expect the triangle inequality to hold for $\tilde d_{\mathcal{A},p}$. That is because the triangle inequality does not even hold for tight solutions of equation (3.7) (i.e., rank-1 solutions, corresponding to elements of $\mathcal{R}(X,Y)$). This is an artifact of replacing the max with a sum. To illustrate this fact we consider a simple example. Let $d_{GH,1}$ be the optimum of (3.7) for $p = 1$ and $\mathcal{A}$ the domain of (3.1) (i.e., the solutions corresponding to elements of $\mathcal{R}(X,Y)$). Then consider $X = \{x, y\}$, $Y = \{x, x, y\}$, $Z = \{y\}$, and observe that the triangle inequality is not satisfied, since $d_{GH,1}(X,Y) = 0$, $d_{GH,1}(Y,Z) = 2\, d(x,y)$ and $d_{GH,1}(X,Z) = d(x,y)$.

Nonetheless, if we consider the set of metric spaces with $n$ points, which we denote by $\mathcal{G}_n$, we will show that $\tilde d_{\mathcal{A},p}$ for $1 \le p < \infty$ is a pseudometric on $\mathcal{G}_n$. The most interesting part of this verification is the triangle inequality, which we prove in Theorem 4.2 below. In contrast to the situation with the Gromov-Hausdorff distance, passing to isometry classes of finite metric spaces does not suffice to produce an actual metric. Of course, if $X$ and $Y$ are isometric spaces then $\tilde d_{\mathcal{A},p}(X,Y) = 0.$
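The counterexample above can be checked by brute force. The sketch below (our own; it uses the sum objective over ordered pairs with a factor $\frac12$ and no $1/n^2$ normalization, which is the convention that reproduces the three values quoted in the text) encodes the repeated-point spaces as pseudo-distance matrices:

```python
from itertools import product

def d_GH1(DX, DY):
    """p = 1 objective of (3.7) restricted to rank-1 solutions:
    (1/2) * min over correspondences R of sum over ordered pairs in R
    of |d_X - d_Y|. Brute force over subsets of the product; tiny inputs."""
    n, m = len(DX), len(DY)
    pairs = [(i, j) for i in range(n) for j in range(m)]
    best = float("inf")
    for mask in range(1, 2 ** len(pairs)):
        R = [pairs[k] for k in range(len(pairs)) if mask >> k & 1]
        if {i for i, _ in R} != set(range(n)) or {j for _, j in R} != set(range(m)):
            continue
        cost = sum(abs(DX[i][i2] - DY[j][j2]) for (i, j), (i2, j2) in product(R, R))
        best = min(best, cost)
    return best / 2

one = 1.0  # d(x, y)
DX = [[0, one], [one, 0]]                       # X = {x, y}
DY = [[0, 0, one], [0, 0, one], [one, one, 0]]  # Y = {x, x, y}
DZ = [[0]]                                      # Z = {y}
assert d_GH1(DX, DY) == 0.0
assert d_GH1(DY, DZ) == 2 * one
assert d_GH1(DX, DZ) == one
assert d_GH1(DY, DZ) > d_GH1(DY, DX) + d_GH1(DX, DZ)  # triangle inequality fails
```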
By construction $\tilde d_{\mathcal{A},p}(\cdot,\cdot) \ge 0$, and the isometry between $X$ and $Y$ induces a feasible solution for equation (3.7) with objective value 0. However, there exist non-isometric spaces $X, Y$ such that $\tilde d_{\mathcal{A},p}(X,Y) = 0$. Examples of this phenomenon can be constructed by observing that the graph isomorphism problem can be reduced to deciding whether the Gromov-Hausdorff distance is zero. Given a graph $G = (V, E)$, one constructs a metric space $X(G)$ on the vertex set where
$$(4.1)\quad d(v, v') = \begin{cases} 1 & \text{if } (v, v') \in E, \\ K & \text{otherwise,} \end{cases}$$
with $K$ a fixed constant chosen so that $d$ is a metric. Therefore, given two graphs $G, G'$, we have that $G$ and $G'$ are isomorphic if and only if $d_{GH}(X(G), X(G')) = 0$. There exist explicit examples in the literature of graphs for which no SDP relaxation on $|V|^2 \times |V|^2$ matrices can distinguish between two non-isomorphic graphs [OWWZ14]. For such examples, $\tilde d_{\mathcal{A},p}(X(G), X(G')) = 0$ (see Figure 2).

Figure 2. We consider 3XOR instances with 5 variables and 4 equations, and we construct the reduction from 3XOR to graph isomorphism from [OWWZ14]. The left and middle figures represent the corresponding adjacency graphs obtained after the reduction from the following system of equations in $\mathbb{Z}_2$: $x_1 \oplus x_2 \oplus x_5 = b_1$, $x_1 \oplus x_2 \oplus x_5 = b_2$, $x_1 \oplus x_3 \oplus x_4 = b_3$, $x_2 \oplus x_3 \oplus x_4 = b_4$, with $(b_1, b_2, b_3, b_4) = (1, 1, 0, 1)$ (left) and $(b_1, b_2, b_3, b_4) = (0, 0, 0, 0)$ (middle). Each graph has 26 vertices. We construct finite metric spaces $X$ and $Y$ according to (4.1) and use SDPNAL+ [YST15] to compute the relaxed distance, obtaining $\tilde d_{\mathrm{Reg},1}(X,Y) = 0$. The minimizer $Z$ of (3.7) has rank 16. The figure on the right shows a soft assignment between $X$ and $Y$ obtained from $Z$ by computing $\hat Z \mathbf{1}$ and rearranging accordingly.

Theorem 4.2. Consider $\tilde d_{\mathcal{A},p}$ and $\tilde d_{\mathcal{A},\infty}$ defined in equations (3.7) and (3.8) respectively, for $\mathcal{A} = \mathrm{GH}, \mathrm{Reg}, \mathrm{Sur}$.
Then we have:
(a) For $X, Y, W \in \mathcal{G}_n$ and $1 \le p < \infty$, $\tilde d_{\mathcal{A},p}$ satisfies the triangle inequality.
(b) For $X, Y, W \in \mathcal{G}_{<\infty}$, $\tilde d_{\mathcal{A},\infty}$ satisfies the triangle inequality.

Proof. We begin by proving part (a). Note that it suffices to show that, for $p \ge 1$,
$$(4.3)\quad \tilde d_{\mathcal{A},p}(X,W)^p \le \tilde d_{\mathcal{A},p}(X,Y)^p + \tilde d_{\mathcal{A},p}(Y,W)^p.$$
This follows from the fact that for $a, b > 0$ and $p \ge 1$ we have $a^p + b^p \le (a+b)^p$, and therefore if equation (4.3) holds we have
$$\tilde d_{\mathcal{A},p}(X,W) \le \sqrt[p]{\tilde d_{\mathcal{A},p}(X,Y)^p + \tilde d_{\mathcal{A},p}(Y,W)^p} \le \tilde d_{\mathcal{A},p}(X,Y) + \tilde d_{\mathcal{A},p}(Y,W).$$
Now let $Z$ and $V$ be the minimizers in equation (3.7) for $X, Y$ and for $Y, W$ respectively in $\mathcal{A}$. From $Z$ and $V$ we construct $T$ feasible for $X, W$ in equation (3.7), and we show the objective value at $T$ is smaller than or equal to $\tilde d_{\mathcal{A},p}(X,Y) + \tilde d_{\mathcal{A},p}(Y,W)$. If $x_i, x_{i'} \in X$, $y_j, y_{j'} \in Y$ and $w_k, w_{k'} \in W$, let $T$ be the unique feasible matrix in $\mathcal{A}$ that satisfies
$$(4.4)\quad \hat T_{ik,i'k'} := \sum_{j,j'} \hat Z_{ij,i'j'} \hat V_{jk,j'k'}.$$
To see that $T$ is well-defined, observe that it is straightforward to check that $T$ satisfies the linear and inequality constraints of $\mathcal{A}$, using the fact that $Z$ and $V$ belong to $\mathcal{A}$. In order to verify that $T$ is positive semidefinite, consider the Cholesky decompositions of $Z$ and $V$. Then $\hat Z_{ij,i'j'} = z_{ij}^\top z_{i'j'}$ and $\hat V_{jk,j'k'} = v_{jk}^\top v_{j'k'}$, where $z$ and $v$ do not necessarily correspond to the last column in equation (3.4). In fact, $z$ is an $r \times n^2$ matrix, where $r$ is the rank of $\hat Z$, and $z_{ij}$ is the column of $z$ indexed by $i = 1,\ldots,n$ and $j = 1,\ldots,n$. Then note
$$\hat T_{ik,i'k'} = \sum_{j,j'} \hat Z_{ij,i'j'} \hat V_{jk,j'k'} = \left\langle \sum_j z_{ij} \otimes v_{jk},\ \sum_{j'} z_{i'j'} \otimes v_{j'k'} \right\rangle;$$
therefore $\hat T$ is PSD since it is a Gram matrix, and $T$ is PSD since it has the same rank as $\hat T$.
For the triangle inequality we need to show
$$(4.5)\quad \sum_{i,i'} \sum_{k,k'} T_{ik,i'k'} |d_X(x_i,x_{i'}) - d_W(w_k,w_{k'})| \le \sum_{i,i'} \sum_{j,j'} Z_{ij,i'j'} |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| + \sum_{j,j'} \sum_{k,k'} V_{jk,j'k'} |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})|.$$
In the case we are dealing with, where $|X| = |Y| = |W|$, the constraints $\sum_{i,i'} Z_{ij,i'j'} \ge 1$ and $\sum_{k,k'} V_{jk,j'k'} \ge 1$ are tight, meaning that equality holds for all $j, j'$, so we can multiply by 1 and rewrite the right-hand side of equation (4.5) as
$$\sum_{i,i'} \sum_{j,j'} Z_{ij,i'j'} |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| \sum_{k,k'} V_{jk,j'k'} + \sum_{j,j'} \sum_{k,k'} V_{jk,j'k'} |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})| \sum_{i,i'} Z_{ij,i'j'}$$
$$= \sum_{i,i'} \sum_{k,k'} \sum_{j,j'} Z_{ij,i'j'} V_{jk,j'k'} \left( |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| + |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})| \right).$$
Now it is clear that equation (4.5) follows from the triangle inequality in $\mathbb{R}$, which completes the verification of part (a).

In order to prove part (b), we now let $X, Y, W$ be finite metric spaces with arbitrary numbers of points, and we let $Z$ and $V$ be the minimizers in equation (3.8) for $X, Y$ and for $Y, W$ respectively, as before. We define $T$ as in equation (4.4). We know $T$ is feasible for equation (3.8), so the remaining step is to prove
$$(4.6)\quad \max_{T \ne 0} \Gamma_{ik,i'k'} \le \max_{Z \ne 0} \Gamma_{ij,i'j'} + \max_{V \ne 0} \Gamma_{jk,j'k'}.$$
Let $(ik, i'k')$ be the argmax of the left-hand side of equation (4.6). Since $T_{ik,i'k'} \ne 0$ and $T_{ik,i'k'} = \sum_{j,j'} Z_{ij,i'j'} V_{jk,j'k'}$, there exist $j, j'$ such that $Z_{ij,i'j'} \ne 0$ and $V_{jk,j'k'} \ne 0$. Then we have
$$(4.7)\quad \tilde d_{\mathcal{A},\infty}(X,W) \le \max_{T \ne 0} \Gamma_{ik,i'k'} = |d_X(x_i,x_{i'}) - d_W(w_k,w_{k'})| \le |d_X(x_i,x_{i'}) - d_Y(y_j,y_{j'})| + |d_Y(y_j,y_{j'}) - d_W(w_k,w_{k'})| \le \max_{Z \ne 0} \Gamma_{ij,i'j'} + \max_{V \ne 0} \Gamma_{jk,j'k'} = \tilde d_{\mathcal{A},\infty}(X,Y) + \tilde d_{\mathcal{A},\infty}(Y,W). \qquad\square$$

Remark 4.8.
The same argument shows that $\tilde d_{GW,p}$ satisfies the triangle inequality as long as we add the constraint $\hat Z \mathbf{1} = n z$, where $n = |X| = |Y| = |W|$ and the measure of each point equals $1/n$.

4.2. Monotonicity and continuity properties. The following lemma shows the monotonicity of $\tilde d_{\mathcal{A},p}$ with respect to $p$. The second part of the lemma proves continuity of $\tilde d_{\mathcal{A},p}$ at infinity.

Proposition 4.9. For any finite metric spaces $X, Y$ we have:
(a) If $1 \le p \le q < \infty$ then $\tilde d_{\mathcal{A},p}(X,Y) \le \tilde d_{\mathcal{A},q}(X,Y) \le \tilde d_{\mathcal{A},\infty}(X,Y)$.
(b) $\lim_{p \to \infty} \tilde d_{\mathcal{A},p}(X,Y) = \min_{Z \in \mathcal{A}} \max_{i,j,i',j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'} = \tilde d_{\mathcal{A},\infty}(X,Y)$.

Proof. Let $Z \in \mathcal{A}$ be optimal for equation (3.8) or (3.7) for some value of $p$. Then $\mathbf{1}^\top \hat Z \mathbf{1} = n^2$. The weighted power mean inequality implies
$$\left( \frac{1}{n^2} \sum_{ij,i'j'} \Gamma^p_{ij,i'j'} Z_{ij,i'j'} \right)^{1/p} \le \left( \frac{1}{n^2} \sum_{ij,i'j'} \Gamma^q_{ij,i'j'} Z_{ij,i'j'} \right)^{1/q} \le \max_{i,j,i',j' :\ Z_{ij,i'j'} \ne 0} \Gamma_{ij,i'j'},$$
and taking the infimum over $Z$ we obtain (a). Now, for fixed $Z$, let $\Gamma^*_Z = \max\{\Gamma_{ij,i'j'} : Z_{ij,i'j'} \ne 0\}$; then, using the standard calculus argument,
$$\lim_{p\to\infty} \left( \frac{1}{n^2} \sum_{ij,i'j'} \Gamma^p_{ij,i'j'} Z_{ij,i'j'} \right)^{1/p} = \Gamma^*_Z \lim_{p\to\infty} \left( \sum_{ij,i'j'} \left( \frac{\Gamma_{ij,i'j'}}{\Gamma^*_Z} \right)^p \frac{Z_{ij,i'j'}}{n^2} \right)^{1/p} = \Gamma^*_Z,$$
and taking the infimum over $Z$ we obtain (b). $\square$

Proposition 4.9 holds for metric spaces $X$ and $Y$ with possibly different numbers of points, and it says that even though $\tilde d_{\mathcal{A},p}$ may not satisfy the triangle inequality, it does in the limit $p \to \infty$.

4.3. Extension of the distance to compact infinite sets. Every compact metric space $X$ is the limit of a sequence of finite metric spaces in the Gromov-Hausdorff topology, denoted here by $\tau_{GH}$ (see for instance [BBI01, Example 7.4.9]). In fact, by taking $\epsilon_n \to 0$ and choosing a finite $\epsilon_n$-net $S_n$ in $X$ for every $n$, we get $S_n \xrightarrow{GH} X$ because $d_{GH}(X, S_n) \le d_H(X, S_n) \le \epsilon_n$.
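For a finite ambient space given as a distance matrix, an $\epsilon$-net can be constructed greedily by farthest-point sampling, a standard technique (our own sketch, not an algorithm from the paper): keep adding the point farthest from the current net until every point is within $\epsilon$ of it.

```python
def epsilon_net(D, eps):
    """Greedy eps-net of a finite metric space with distance matrix D:
    farthest-point sampling until all points lie within eps of the net.
    Returns the indices of the net points."""
    net = [0]
    while True:
        far = max(range(len(D)), key=lambda x: min(D[x][s] for s in net))
        if min(D[far][s] for s in net) <= eps:
            return net
        net.append(far)

# Five points on a line, clustered near 0 and near 5: a 1.0-net only
# needs one representative per cluster.
pts = [0.0, 0.1, 0.2, 5.0, 5.1]
D = [[abs(a - b) for b in pts] for a in pts]
net = epsilon_net(D, 1.0)
assert all(min(D[x][s] for s in net) <= 1.0 for x in range(len(pts)))
```

By construction every point is within $\epsilon$ of the returned net, which is exactly the property $d_H(X, S_n) \le \epsilon_n$ used above.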
This property inspires the following definition of an actual distance between compact metric spaces.

Definition 4.10. Let $X, Y$ be compact metric spaces. Given $\epsilon_n \to 0$, let $X_n, Y_n$ be respective $\epsilon_n$-nets of $X$ and $Y$ with the same number of points $N$. Define
$$(4.11)\quad \hat d_{\mathcal{A},p}(X,Y) = \inf_{\epsilon_n,\, X_n,\, Y_n} \limsup_{n\to\infty} \tilde d_{\mathcal{A},p}(X_n, Y_n).$$

Note that the limsup exists because $\tilde d_{\mathcal{A},p}(X_n, Y_n) \le \tilde d_{\mathcal{A},\infty}(X_n, Y_n) \le \frac12 \max(\mathrm{diam}(X), \mathrm{diam}(Y))$ for all $n$. Also note that the triangle inequality holds for this limit, which also implies that $\hat d_{\mathcal{A},p}$ and $\tilde d_{\mathcal{A},p}$ may not agree.

To illustrate how the right-hand side of (4.11) behaves, suppose $|X| < |Y|$ and for some $n$ the $\epsilon_n$-net $Y_n$ of $Y$ has at least $N$ points with $|X| < N$; then take $X_n$ to be $X$ with some repeated points, so that $|Y_n| = |X_n| = N$, and run the SDP (3.7) or (3.8) to compute $\tilde d_{\mathcal{A},p}(X_n, Y_n)$. Note that this is well defined, and when $\tilde d_{\mathcal{A},p}(X_n, X)$ exists (i.e., $\mathcal{A} = \mathrm{GH}, \mathrm{Sur}$) we have $\tilde d_{\mathcal{A},p}(X_n, X) = 0$, because the matrix $Z$ corresponding to the surjective function $X_n \to X$ is in $\mathcal{A}$ and has objective value 0.

4.4. Comparison with the Gromov-Hausdorff distance. Let $\mathcal{X}$ denote the set of isometry classes of compact metric spaces. Definition 4.10 extends the relaxed distances (3.7) and (3.8) to $\mathcal{X}$, yielding a function $\hat d_{\mathcal{A},p} : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$.

Lemma 4.12. For $X, Y \in \mathcal{X}$ and $1 \le p \le \infty$ we have
$$\hat d_{\mathcal{A},p}(X,Y) \le d_{GH}(X,Y).$$

Proof. First assume $X, Y$ are finite, and let $R \in \mathcal{R}(X,Y)$ be the minimizer in (2.1). If $|R| = N$, let $X_N, Y_N$ be the $\epsilon$-nets such that every element of $X$ appears in $X_N$ as many times as it appears in $R$ (and the same for $Y_N$). Then the bijective function between $X_N$ and $Y_N$ corresponding to $R$ induces a feasible $Z$, proving the result in the finite case. For the remaining case consider an $\epsilon_n$-net with $\epsilon_n \to 0$. $\square$

Now consider $X, Y$ finite metric spaces.
First observe that $\mathcal{A} = \mathrm{GH}$ includes the $Z$ induced by all elements of $\mathcal{R}(X,Y)$, which together with Proposition 4.9 implies
$$\tilde d_{GH,p}(X,Y) \le \tilde d_{GH,\infty}(X,Y) \le d_{GH}(X,Y).$$
Since $\mathrm{Sur} \subset \mathrm{GH}$ we have
$$\tilde d_{GH,p}(X,Y) \le \tilde d_{\mathrm{Sur},p}(X,Y),$$
and if $|X| = |Y|$ we can consider that $\mathrm{Reg} \subset \mathrm{Sur}$, therefore
$$\tilde d_{GH,p}(X,Y) \le \tilde d_{\mathrm{Sur},p}(X,Y) \le \tilde d_{\mathrm{Reg},p}(X,Y).$$
Also, the smaller the set $\mathcal{A}$, the more likely the relaxation is to produce a tight solution (a rank-1 solution, corresponding to an element of $\mathcal{R}(X,Y)$). Note that neither $\tilde d_{\mathrm{Sur},p}$ nor $\tilde d_{\mathrm{Reg},p}$ is comparable with $d_{GH}$.

4.5. Topologies induced by relaxed distances. Any metric or pseudometric $d$ defines a topology $\tau$ characterized by the property that, given a sequence $X_n$, we have convergence $X_n \xrightarrow{\tau} X$ if and only if $d(X_n, X) \to 0$. In particular, the Gromov-Hausdorff distance induces a topology on the set of isometry classes of compact metric spaces. The Gromov-Hausdorff topology is a fairly weak topology; for example, there are many compact sets. Proposition 7.4.12 in [BBI01] characterizes Gromov-Hausdorff convergence in terms of $\epsilon$-nets, implying that if a sequence $\{X_n\}$ converges in the Gromov-Hausdorff topology, then for all $\epsilon > 0$ the cardinality of $\epsilon$-nets is uniformly bounded over $X_n$. Therefore, if a class $\mathcal{X}$ of metric spaces is pre-compact (i.e., any sequence of elements of $\mathcal{X}$ has a convergent subsequence) in the Gromov-Hausdorff topology, then for every $\epsilon > 0$ the size of a minimal $\epsilon$-net is uniformly bounded over all elements of $\mathcal{X}$. The analysis in [BBI01] shows that this property of $\mathcal{X}$, along with the fact that the diameters of its members are uniformly bounded (together, what is called total boundedness, Definition 4.14), is sufficient for pre-compactness (Theorem 4.15).

Let $\hat\tau_{\mathcal{A},p}$ ($1 \le p \le \infty$) and $\tilde\tau_{\mathcal{A},\infty}$ denote the topologies induced by the pseudometrics $\hat d_{\mathcal{A},p}$ ($1 \le p \le \infty$) and $\tilde d_{\mathcal{A},\infty}$, respectively.
We obtain an analogous characterization of pre-compact sets in the topology $\hat\tau_{\mathcal{A},p}$ for $1 \le p \le \infty$ in Corollary 4.16 below.

Proposition 4.13 ([BBI01, 7.4.12]). For compact metric spaces $X$ and $\{X_n\}_{n=1}^\infty$, $X_n \xrightarrow{\tau_{GH}} X$ if and only if the following holds: for every $\epsilon > 0$ there exist a finite $\epsilon$-net $S$ in $X$ and an $\epsilon$-net $S_n$ in each $X_n$ such that $S_n \xrightarrow{\tau_{GH}} S$. Moreover, these $\epsilon$-nets can be chosen so that, for all sufficiently large $n$, $S_n$ has the same cardinality as $S$.

Note that by construction (Definition 4.10) the characterization of convergence by $\epsilon$-nets from Proposition 4.13 is also true when substituting $\tau_{GH}$ by $\hat\tau_{\mathcal{A},p}$, $1 \le p \le \infty$.

Definition 4.14 ([BBI01, 7.4.14]). A class $\mathcal{X}$ of compact metric spaces is totally bounded if
(a) there exists a constant $D$ such that for all $X \in \mathcal{X}$, $\mathrm{diam}(X) < D$;
(b) for every $\epsilon > 0$ there exists a number $N_\epsilon$ such that every $X \in \mathcal{X}$ contains an $\epsilon$-net consisting of at most $N_\epsilon$ points.

Theorem 4.15 ([BBI01, 7.4.15]). Any uniformly totally bounded class $\mathcal{X}$ of compact metric spaces is pre-compact in the Gromov-Hausdorff topology.

By Theorem 4.15 we know that if $\mathcal{X}$ is totally bounded and $\{X_n\}_{n=1}^\infty$ is a sequence in $\mathcal{X}$, then it contains a subsequence convergent in $\tau_{GH}$. Since $\hat d_{\mathcal{A},p} \le d_{GH}$, that subsequence is also convergent in $\hat\tau_{\mathcal{A},p}$, which immediately implies the following corollary.

Corollary 4.16. Any uniformly totally bounded class $\mathcal{X}$ of compact metric spaces is pre-compact in the topology $\hat\tau_{\mathcal{A},p}$ for $1 \le p \le \infty$.

4.6. Local topological properties. In the space of compact metric spaces we know that $d_{GH}(X,Y) = 0$ if and only if $X$ and $Y$ are isometric. The example at the beginning of Section 4.1 shows that this is not true for $\tilde d_{\mathcal{A},p}$ in general. However, in this section we show that it is true for most finite $X$.

Definition 4.17. Let $X$ be a finite metric space.
We say that X is generic if X ∈ G_{<∞} and all pairwise distances in X are different and non-zero. The name generic is justified in the following sense: if X ∈ G_n is not generic, then for all ε > 0 there exists Y ∈ G_n such that d_GH(X, Y) < ε and Y is generic. Also, if X ∈ G_n is generic, then there exists ε > 0 such that every Y ∈ G_n satisfying d_GH(X, Y) < ε is also generic. This proves the following remark:

Remark 4.18. The set of generic metric spaces is dense in τ_GH|_{G_{<∞}} and open in τ_GH|_{G_n}.

Lemma 4.19. If X and Y are generic and d̃_{A,p}(X, Y) = 0, then X and Y are isometric.

Proof. Assume without loss of generality |X| ≥ |Y| and d̃_{A,p}(X, Y) = 0. Let Z be the solution of (3.7) for X, Y with objective value 0. The constraint Σ_{j,j'} Z_{ij,i'j'} = 1 for all i, i' implies that, given i ≠ i', there exist j, j' such that Z_{ij,i'j'} > 0. Since the objective value is 0, this implies that d_X(x_i, x_{i'}) = d_Y(y_j, y_{j'}). Since all pairwise distances in X are different, this completely determines all distances in Y; in particular it implies |X| = |Y|, X and Y are isometric, and the unique solution of (3.7) corresponds to the isometry between X and Y. □

Corollary 4.20. If X ∈ G_n is generic, there exists a neighborhood of X in G_n such that for all Y in that neighborhood

d_GH(X, Y) = (1/2) max_{i,j=1,...,n} |d_X(x_i, x_j) − d_Y(y_i, y_j)|

(Y is a small enough perturbation of a metric space isometric with X, where we label the points such that the isometry is x_i ↦ y_i for all i). In particular, we can think of the neighborhood where that property holds as the neighborhood of X with radius Δ/n, where Δ is the smallest non-zero entry of the matrix Γ(X, X). In this setting we have that

d̃_{A,p}(X, Y) = (1/2) ( (1/n²) Σ_{i,j} |d_X(x_i, x_j) − d_Y(y_i, y_j)|^p )^{1/p}

in the neighborhood of X of radius Δ_p/n.
And, in the neighborhood of X of radius Δ, we have d̃_{A,∞}(X, Y) = d_GH(X, Y). This implies that the topologies τ_GH|_{G_n} and τ̃_{GH,p}|_{G_n} are equivalent for all finite n and p, and that d̃_{A,∞} and d_GH are generically locally the same.

5. GHMatch: a rank-1 augmented Lagrangian approach towards the registration problem

In the previous sections, we have studied an approach to the problem of computing the Gromov-Hausdorff distance (equation (3.1)) via semidefinite optimization. There we first lift the variable µ ∈ R^{nm} to a symmetric variable Z ∈ R^{(nm+1)×(nm+1)} such that rank(Z) = 1, and then relax the non-convex rank constraint to the convex constraint Z ⪰ 0.

There are many attractive properties of semidefinite relaxations. For one thing, there are many software packages that efficiently provide global solutions to semidefinite programs (e.g., SDPNAL+ [YST15]). Moreover, there is a great deal of research energy directed at producing more efficient SDP solvers; the field is rapidly evolving and solvers are getting more efficient every day. Furthermore, SDP relaxations have the advantage of often being tight: in our situation, we have observed numerically that the solution Z frequently has rank 1. In this case, the semidefinite optimization finds the global solution of the original problem, and also provides a certificate of its optimality. This property has recently been exploited to efficiently produce certificates of optimality for solutions found by fast non-convex algorithms that typically may converge to local optima [Ban15, IMPV15]. On the other hand, semidefinite relaxations have the disadvantage that they square the number of variables of the original problem: even with efficient solvers, this expansion makes these problems intractable for large sets of points.
Also, when the SDP is not tight, it may produce a high-rank solution Z that cannot be easily rounded to a feasible µ. Motivated by these concerns, in this section we propose a non-convex optimization approach for the registration problem. Here we trade the global optimality guarantee and the pseudometric that the semidefinite optimization provides for computational efficiency and a guarantee of a true (albeit not necessarily globally optimal) correspondence.

We will assume that |X| = |Y| = n. By restricting equation (3.1) to this case and replacing the infinity norm by the p-norm formulation, we obtain the following non-convex optimization problem, where y ∈ R^{n²} is indexed by a pair of variables ij with i, j = 1, ..., n:

(5.1)  min_y ⟨Γ^(p), y y^⊤⟩  subject to  Σ_{i=1}^n y_ij = 1,  Σ_{j=1}^n y_ij = 1,  0 ≤ y_ij ≤ 1.

Now, instead of considering a semidefinite relaxation as we did previously, we propose a greedy method to directly solve (5.1). Let A ∈ R^{2n×n²} be such that Ay = 1 if and only if Σ_{i=1}^n y_ij = 1 and Σ_{j=1}^n y_ij = 1, and let b = 1 ∈ R^{2n}. Then equation (5.1) is equivalent to the following quadratic optimization problem with linear and box constraints:

(5.2)  min_y y^⊤ Γ^(p) y  subject to  Ay = b,  0 ≤ y ≤ 1.

In order to solve problem (5.2), we use a projected augmented Lagrangian approach (e.g., see [NW06, Algorithm 17.4]):

(5.3)  L(y, λ, σ) = y^⊤ Γ^(p) y − λ^⊤(Ay − b) + (σ/2) ‖Ay − b‖²₂.

We propose the algorithm GHMatch (see Algorithm 1). Theoretical convergence analysis for GHMatch is left for future work. In the next section, we describe numerical experiments that indicate the performance of the algorithm. In the experiments we conducted, we found that this procedure converges to a local minimum of equation (5.1).
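The ingredients of (5.2) and the iteration based on (5.3) can be sketched in a few lines of Python. The following is our own illustration, not the paper's Matlab code: it assumes Γ^(p) has entries |d_X(x_i, x_{i'}) − d_Y(y_j, y_{j'})|^p indexed by the pairs ij, i'j', and it substitutes scipy's bound-constrained L-BFGS-B solver for the reflective trust-region method used in the experiments.

```python
import numpy as np
from scipy.optimize import minimize

def build_problem(DX, DY, p=1):
    """Assemble Gamma^(p), A and b for problem (5.2); y is indexed by
    the pair ij in row-major order, i.e. index ij = i*n + j."""
    n = DX.shape[0]
    # Gamma^(p)[ij, i'j'] = |d_X(x_i, x_i') - d_Y(y_j, y_j')|^p
    gamma = np.abs(DX[:, None, :, None] - DY[None, :, None, :]) ** p
    gamma = gamma.reshape(n * n, n * n)
    A = np.zeros((2 * n, n * n))       # A y = b encodes the sum constraints
    for i in range(n):
        for j in range(n):
            A[i, i * n + j] = 1.0      # sum over j of y_ij = 1
            A[n + j, i * n + j] = 1.0  # sum over i of y_ij = 1
    return gamma, A, np.ones(2 * n)

def gh_match(gamma, A, b, n, n_iters=5, sigma0=5.0, mu=10.0):
    """Sketch of GHMatch: minimize the augmented Lagrangian (5.3) over
    the box, update the multipliers, grow the penalty, then round."""
    y = np.full(n * n, 1.0 / n)        # y_0 = (1/n) 1
    lam = np.ones(2 * n)               # lambda_0 = 1
    sigma = sigma0
    for _ in range(n_iters):
        def L(yv, lam=lam, sigma=sigma):
            r = A @ yv - b
            return yv @ (gamma @ yv) - lam @ r + 0.5 * sigma * (r @ r)
        y = minimize(L, y, method="L-BFGS-B",
                     bounds=[(0.0, 1.0)] * (n * n)).x
        lam = lam - sigma * (A @ y - b)   # multiplier update
        sigma *= mu                       # sigma_{k+1} = mu * sigma_k
    return y, np.argmax(y.reshape(n, n), axis=1)  # map(i) = argmax_j y_ij
```

For two isometric spaces whose points are matched by a permutation, the flattened permutation vector is feasible (Ay = b) and attains objective value 0, and the growing penalty σ_k drives the iterates toward feasibility, consistent with the discussion of Algorithm 1 below.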
The solution found by this procedure may be rounded to an actual correspondence ȳ between the point sets, and therefore the value ⟨Γ, ȳ ȳ^⊤⟩ is an upper bound for the Gromov-Hausdorff distance.

6. Numerical performance

In this section, we describe the results of a number of numerical experiments that explore the applications of our new distance and the performance of our augmented Lagrangian approach.

Algorithm 1 GHMatch
1: y_0 ← (1/n) 1 ∈ R^{n²}
2: λ_0 ← 1 ∈ R^{2n}
3: σ_0 ← 5
4: µ ← 10
5: for k = 0, 1, 2, ... do
6:   y_{k+1} ← argmin_{0 ≤ y ≤ 1} L(y, λ_k, σ_k)   ▷ use y_k as initial point for this minimization
7:   λ_{k+1} ← λ_k − σ_k (A y_{k+1} − b)
8:   σ_{k+1} ← µ σ_k
9: end for
10: for i = 1, ..., n do   ▷ to find the map corresponding with y_T
11:   map(i) ← argmax_{j=1,...,n} y_T(1 + (i−1)n : in)
12: end for

Figure 3. (Left) Distance matrix obtained from computing the best rigid transformation that maps the corresponding labeled landmarks from the teeth dataset described in Section 6.1. (Right) Distance matrix obtained from computing the SDP distance d̃_{GH,p} with p = 1. Darker color corresponds to smaller distance. We observe the same distance patterns in both matrices even though the scales are different.

6.1. Classification using the distance d̃_GH. In order to validate our distance numerically, we compare with the numerical experiments described in [BLSC+11], using data and algorithms available on Yaron Lipman's personal website [Lip11]. As we describe below, we find that our procedure produces results that are competitive with this procedure. In [BLSC+11] the authors propose an algorithm to automatically quantify the geometric similarity of anatomical surfaces, based on a distance inspired by the Gromov-Wasserstein distance which is invariant under conformal maps.
They experiment with a real dataset coming from surfaces of teeth of different species; they compare the results of their algorithm with a method based on an expert selecting 16 landmarks on each tooth and then finding the best rigid transformation to match the labeled landmarks. Specifically, they work with a dataset consisting of 116 teeth. For each tooth, they find the closest tooth according to each distance, and then check whether the two are in the same category. We perform the same experiment on 115 of the teeth (since one of them seems to be at a different scale), but without providing our algorithm with the correspondence between the landmarks. To be precise, we consider the metric spaces X_i = {p_1^i, ..., p_16^i}, i = 1, ..., 115. The points of these metric spaces are the landmarks chosen by the expert, and the metric is given by the Euclidean distance between the landmarks. We compare the distance matrix d̃_{GH,p}(X_i, X_j) with the distance obtained by the software from [Lip11] that finds the best rigid transformation sending the n-th point of X_i to the n-th point of X_j for n = 1, ..., 16. See Figure 3 for a visual comparison of the distance matrices.

Figure 4. The value to which y_k converges for different parameters σ_0 and µ, for n = 30. (The six panels plot the entries y_i, 1 ≤ i ≤ n, on a [0, 1] scale, for σ_0 ∈ {2, 5, 10} and µ ∈ {5, 10}.) All these y satisfy the linear constraint Ay = b, but the choice of the parameters determines how close the vector y is to a feasible vector with entries in {0, 1}.

When running the nearest-neighbor classification test as described above, we obtain very similar performance: 0.85 frequency of success for our distance, against 0.91 for the conformal Wasserstein distance proposed in [BLSC+11] and 0.92 for the landmark comparison algorithm that uses the a priori known correspondence. We find this result very encouraging given that our algorithm does not make any geometric assumptions about the teeth (e.g., we do not assume they are smooth surfaces), in contrast to [BLSC+11].

6.2. Performance of GHMatch. In order to evaluate our non-convex optimization formulation of the registration problem, we consider the 3D models from [BBK08] and sample random points from each model. We run the rank-1 augmented Lagrangian optimization from Algorithm 1, using Matlab's implementation of the reflective trust-region algorithm to run the step y ← argmin_{0 ≤ y ≤ 1} L(y, λ_k, σ_k). In Figure 5 we depict the resulting map between corresponding figures.

By design we know σ_k → ∞ as k increases, which guarantees that ‖A y_k − b‖ → 0. However, there is no theoretical guarantee that y_k will converge to a sparse vector with entries in {0, 1}. Nevertheless, we have observed in our numerical simulations that y_k converges to a fairly sparse vector where most of the entries are close to 0 or 1, provided a good choice of the parameters µ and σ_0 (see Figure 4). Moreover, regardless of the choice of the parameters, we find that the thresholding step in the algorithm often obtains a map that is bijective.

Remark 6.1. As an alternative to the selection of parameters σ_0 and λ, a thresholding step could be introduced inside the main iteration (lines 5 to 9) to enforce sparsity of the resulting y.

Figure 5. We run GHMatch on 50 points sampled at random from the surfaces, and 60 points for the dogs. The pairwise distances we consider in each figure are the geodesic distances in the mesh. Images that are horizontally aligned correspond to different angles of the same correspondence between the 3D models. Note that for the dogs, the correspondence matches the left legs of one dog with the right legs of the other (this is a consequence of the symmetry). Also note that there are small imperfections, like the tail of one dog matching a leg of the other (a consequence of randomly sampling and obtaining different numbers of points from the dogs' tails). The algorithm with 50 sample points runs in less than 6 minutes on a standard 2013 MacBook Air.

Acknowledgements

The authors would like to thank Facundo Mémoli for insightful discussions about the Gromov-Wasserstein distance, Tingran Gao for pointing out useful references, Yaron Lipman for having his dataset and code freely available, Stephen Wright for his advice on the implementation of our non-convex algorithm, and Marcos Cossarini for pointing out a misstatement in a previous version of this manuscript.

References

[Ban15] Afonso S. Bandeira. A note on probably certifiably correct algorithms. Comptes Rendus Mathematique, to appear, 2015.
[BBI01] Dmitri Burago, Yuri Burago, and Sergei Ivanov. A Course in Metric Geometry, volume 33. American Mathematical Society, Providence, 2001.
[BBK08] Alexander Bronstein, Michael Bronstein, and Ron Kimmel. Numerical Geometry of Non-Rigid Shapes. Springer Publishing Company, Incorporated, 1st edition, 2008.
[BLSC+11] Doug M. Boyer, Yaron Lipman, Elizabeth St. Clair, Jesus Puente, Biren A. Patel, Thomas Funkhouser, Jukka Jernvall, and Ingrid Daubechies. Algorithms to automatically quantify the geometric similarity of anatomical surfaces. Proceedings of the National Academy of Sciences, 108(45):18221–18226, 2011.
[CK14] Connor Clark and Jugal Kalita. A comparison of algorithms for the pairwise alignment of biological networks. Bioinformatics, 2014.
[DL16] Nadav Dym and Yaron Lipman. Exact recovery with symmetries for Procrustes matching. arXiv preprint arXiv:1606.01548, 2016.
[Ele12] Gábor Elek. Samplings and observables. Limits of metric measure spaces. arXiv preprint arXiv:1205.6936, 2012.
[Gro81] Mikhail Gromov. Groups of polynomial growth and expanding maps. Inst. Hautes Études Sci. Publ. Math., (53):53–73, 1981.
[Gro01] Mikhail Gromov. Metric Structures for Riemannian and Non-Riemannian Spaces. Progress in Mathematics. Birkhäuser, 2001.
[IMPV15] Takayuki Iguchi, Dustin G. Mixon, Jesse Peterson, and Soledad Villar. Probably certifiably correct k-means clustering. 2015.
[KKBL15] Itay Kezurer, Shahar Z. Kovalsky, Ronen Basri, and Yaron Lipman. Tight relaxation of quadratic matching. Computer Graphics Forum, 34(5):115–128, 2015.
[Lip11] Yaron Lipman. Software and teeth dataset, 2011.
[LV09] John Lott and Cédric Villani. Ricci curvature for metric-measure spaces via optimal transport. Ann. of Math. (2), 169(3):903–991, 2009.
[MDK+16] Haggai Maron, Nadav Dym, Itay Kezurer, Shahar Kovalsky, and Yaron Lipman. Point registration via efficient convex relaxation. ACM Trans. Graph., 35(4):73:1–73:12, July 2016.
[Mém07] Facundo Mémoli. On the use of Gromov-Hausdorff distances for shape comparison. Pages 81–90, Prague, Czech Republic, 2007. Eurographics Association.
[Mém11] Facundo Mémoli. Gromov-Wasserstein distances and the metric approach to object matching. Foundations of Computational Mathematics, pages 1–71, 2011. 10.1007/s10208-011-9093-5.
[MS05] Facundo Mémoli and Guillermo Sapiro. A theoretical and computational framework for isometry invariant recognition of point cloud data. Found. Comput. Math., 5(3):313–347, 2005.
[NW06] Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer, New York, 2nd edition, 2006.
[OWWZ14] Ryan O'Donnell, John Wright, Chenggang Wu, and Yuan Zhou. Hardness of robust graph isomorphism, Lasserre gaps, and asymmetry of random graphs. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1659–1677. SIAM, 2014.
[PCS16] Gabriel Peyré, Marco Cuturi, and Justin Solomon. Gromov-Wasserstein averaging of kernel and distance matrices. In Proc. ICML'16, pages 2664–2672, 2016.
[Stu06] Karl-Theodor Sturm. On the geometry of metric measure spaces. I. Acta Math., 196(1):65–131, 2006.
[Vil03] Cédric Villani. Topics in Optimal Transportation. Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2003.
[YST15] Liuqin Yang, Defeng Sun, and Kim-Chuan Toh. SDPNAL+: a majorized semismooth Newton-CG augmented Lagrangian method for semidefinite programming with nonnegative constraints. Mathematical Programming Computation, 7(3):331–366, 2015.

Department of Mathematics, University of Texas, Austin, TX 78703
E-mail address: mvillar@math.utexas.edu

Department of Mathematics and Center for Data Science, Courant Institute of Mathematical Sciences, New York University, New York, NY 10012
E-mail address: bandeira@cims.nyu.edu

Department of Mathematics, University of Texas, Austin, TX 78703
E-mail address: blumberg@math.utexas.edu

Department of Mathematics, University of Texas, Austin, TX 78703
E-mail address: rward@math.utexas.edu
