A path following algorithm for the graph matching problem

We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convex-concave programming formulation is obtained by rewriting the weighted graph matching problem as a least-square problem on the set of permutat…

Authors: Mikhail Zaslavskiy, Francis Bach, Jean-Philippe Vert

A Path Following Algorithm for the Graph Matching Problem
Mikhail Zaslavskiy, Francis Bach, and Jean-Philippe Vert ∗†‡§
February 11, 2013

Abstract

We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convex-concave programming formulation is obtained by rewriting the weighted graph matching problem as a least-squares problem on the set of permutation matrices and relaxing it to two different optimization problems: a quadratic convex and a quadratic concave optimization problem on the set of doubly stochastic matrices. The concave relaxation has the same global minimum as the initial graph matching problem, but the search for its global minimum is also a hard combinatorial problem. We therefore construct an approximation of the concave problem solution by following a solution path of a convex-concave problem obtained by linear interpolation of the convex and concave formulations, starting from the convex relaxation. This method allows us to easily integrate information on graph label similarities into the optimization problem, and therefore to perform labeled weighted graph matching. The algorithm is compared with some of the best performing graph matching methods on four datasets: simulated graphs, QAPLib, retina vessel images and handwritten Chinese characters. In all cases, the results are competitive with the state-of-the-art.

Keywords: graph algorithms, graph matching, convex programming, gradient methods, machine learning, classification, image processing

1 Introduction

The graph matching problem is among the most important challenges of graph processing, and plays a central role in various fields of pattern recognition. Roughly speaking, the problem consists in finding a correspondence between the vertices of two given graphs which is optimal in some sense.
Usually, optimality refers to the alignment of graph structures and, when available, of vertex labels, although other criteria are possible as well. A non-exhaustive list of graph matching applications includes document processing tasks like optical character recognition [1, 2], image analysis (2D and 3D) [3-6], and bioinformatics [7-9].

During the last decades, many different algorithms for graph matching have been proposed. Because of the combinatorial nature of this problem, it is very hard to solve exactly for large graphs; however, some methods based on incomplete enumeration may be applied to search for an exact optimal solution in the case of small or sparse graphs. Some examples of such algorithms may be found in [10-12].

Another group of methods consists of approximate algorithms, which are supposed to be more scalable. The price to pay for scalability is that the solution found is usually only an approximation of the optimal matching. Approximate methods may be divided into two groups of algorithms. The first group is composed of methods which use spectral representations of adjacency matrices, or equivalently embed the vertices into a Euclidean space where linear or nonlinear matching algorithms can be deployed. This approach was pioneered by Umeyama [13], and further methods based on spectral representations were proposed in [3-5, 14, 15]. The second group of approximate algorithms is composed of algorithms which work directly with graph adjacency matrices, and typically involve a relaxation of the discrete optimization problem. The most effective algorithms were proposed in [6, 16-18].
∗ Mikhail Zaslavskiy is with the Centre for Computational Biology and the Centre for Mathematical Morphology; Jean-Philippe Vert is with the Centre for Computational Biology, Mines ParisTech, 35 rue Saint-Honoré, 77305 Fontainebleau, France. They are also with the Institut Curie, Paris, F-75248 France, and with INSERM U900, Paris, F-75248 France.
† E-mail: mikhail.zaslavskiy@ensmp.fr, jean-philippe.vert@ensmp.fr
‡ Francis Bach is with the INRIA-Willow Project Team, Laboratoire d'Informatique de l'Ecole Normale Supérieure (CNRS/ENS/INRIA UMR 8548), 45 rue d'Ulm, 75230 Paris, France
§ E-mail: francis.bach@mines.org

An interesting instance of the graph matching problem is the matching of labeled graphs. In that case, graph vertices have associated labels, which may be numbers, categorical variables, etc. The important point is that there is also a similarity measure between these labels. Therefore, when we search for the optimal correspondence between vertices, we search for a correspondence which matches not only the structures of the graphs but also vertices with similar labels. Some widely used approaches for this application only use the information about similarities between graph labels. In vision, one such algorithm is the shape context algorithm proposed in [19], which involves an efficient algorithm for node label construction. Another example is the BLAST-based algorithms in bioinformatics, such as the Inparanoid algorithm [20], where the correspondence between different protein networks is established on the basis of BLAST scores between pairs of proteins. The main advantages of all these methods are their speed and simplicity. However, their main drawback is that they do not take into account information about the graph structure.
Some graph matching methods try to combine information on graph structures and vertex similarities; examples of such methods are presented in [7, 18]. In this article we propose an approximate method for labeled weighted graph matching, based on a convex-concave programming approach which can be applied to the matching of graphs of large sizes. Our method is based on a formulation of the labeled weighted graph matching problem as a quadratic assignment problem (QAP) over the set of permutation matrices, where the quadratic term encodes the structural compatibility and the linear term encodes local compatibilities. We propose two relaxations of this problem, resulting in one quadratic convex and one quadratic concave minimization problem on the set of doubly stochastic matrices. While the concave relaxation has the same global minimum as the initial QAP, solving it is also a hard combinatorial problem. We find a local minimum of this problem by following a solution path of a family of convex-concave minimization problems, obtained by linearly interpolating between the convex and concave relaxations. Starting from the convex formulation with a unique local (and global) minimum, the solution path leads to a local optimum of the concave relaxation. We refer to this procedure as the PATH algorithm¹. We perform an extensive comparison of this PATH algorithm with several state-of-the-art matching methods on small simulated graphs and various QAP benchmarks, and show that it consistently provides state-of-the-art performance while scaling to graphs of up to a few thousand vertices on a modern desktop computer. We further illustrate the use of the algorithm on two applications in image processing, namely the matching of retina images based on vessel organization, and the matching of handwritten Chinese characters.
The rest of the paper is organized as follows: Section 2 presents the mathematical formulation of the graph matching problem. In Section 3, we present our new approach. Then, in Section 4, we present the comparison of our method with Umeyama's algorithm [13] and the linear programming approach [16] on the example of artificially simulated graphs. In Section 5, we test our algorithm on the QAP benchmark library, and we compare the obtained results with the results published in [18] for the QBP algorithm and graduated assignment algorithms. Finally, in Section 6 we present two examples of applications to real-world image processing tasks.

2 Problem description

A graph $G = (V, E)$ of size $N$ is defined by a finite set of vertices $V = \{1, \dots, N\}$ and a set of edges $E \subset V \times V$. We consider only undirected graphs with no self-loop, i.e., such that $(i,j) \in E \Rightarrow (j,i) \in E$ and $(i,i) \notin E$ for any vertices $i, j \in V$. Each such graph can be equivalently represented by a symmetric adjacency matrix $A$ of size $|V| \times |V|$, where $A_{ij}$ is equal to one if there is an edge between vertex $i$ and vertex $j$, and zero otherwise. An interesting generalization is a weighted graph, which is defined by the association of nonnegative real values $w_{ij}$ (weights) to all edges of the graph $G$. Such graphs are represented by real-valued adjacency matrices $A$ with $A_{ij} = w_{ij}$. This generalization is important because in many applications the graphs of interest have associated weights for all their edges, and taking these weights into account may be crucial in the construction of efficient methods. In the following, when we speak of an "adjacency matrix" we mean a real-valued "weighted" adjacency matrix.
Given two graphs $G$ and $H$ with the same number of vertices $N$, the problem of matching $G$ and $H$ consists in finding a correspondence between the vertices of $G$ and the vertices of $H$ which aligns $G$ and $H$ in some optimal way. We will consider in Section 3.8 an extension of the problem to graphs of different sizes. For graphs of the same size $N$, the correspondence between vertices is a permutation of $N$ vertices, which can be defined by a permutation matrix $P$, i.e., a $\{0,1\}$-valued $N \times N$ matrix with exactly one entry equal to 1 in each column and each row. The matrix $P$ entirely defines the mapping between the vertices of $G$ and the vertices of $H$, $P_{ij}$ being equal to 1 if the $i$-th vertex of $G$ is matched to the $j$-th vertex of $H$, and 0 otherwise.

¹The PATH algorithm as well as other referenced approximate methods are implemented in the software GraphM available at http://cbio.ensmp.fr/graphm

After applying the permutation defined by $P$ to the vertices of $H$ we obtain a new graph isomorphic to $H$, which we denote by $P(H)$. The adjacency matrix of the permuted graph, $A_{P(H)}$, is simply obtained from $A_H$ by the equality $A_{P(H)} = P A_H P^T$. In order to assess whether a permutation $P$ defines a good matching between the vertices of $G$ and those of $H$, a quality criterion must be defined. Although other choices are possible, we focus in this paper on measuring the discrepancy between the graphs after matching, by the number of edges (in the case of weighted graphs, the total weight of edges) which are present in one graph and not in the other. In terms of adjacency matrices, this number can be computed as:

$$F_0(P) = \|A_G - A_{P(H)}\|_F^2 = \|A_G - P A_H P^T\|_F^2, \quad (1)$$

where $\|\cdot\|_F$ is the Frobenius matrix norm defined by $\|A\|_F^2 = \mathrm{tr}(A^T A) = \sum_i \sum_j A_{ij}^2$.
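As a minimal illustration of criterion (1), the following sketch evaluates the discrepancy $F_0(P)$ in NumPy (this is an illustrative example, not the paper's GraphM implementation; the toy matrices are hypothetical):

```python
import numpy as np

def f0(A_G, A_H, P):
    """Squared-Frobenius matching discrepancy F0(P) = ||A_G - P A_H P^T||_F^2."""
    return np.linalg.norm(A_G - P @ A_H @ P.T, ord="fro") ** 2

# Toy example: H is G with vertices 0 and 1 swapped, so the swap permutation
# matches the graphs perfectly while the identity does not.
A_G = np.array([[0., 2., 0.],
                [2., 0., 1.],
                [0., 1., 0.]])
P_swap = np.array([[0., 1., 0.],
                   [1., 0., 0.],
                   [0., 0., 1.]])
A_H = P_swap.T @ A_G @ P_swap   # adjacency matrix of the relabeled graph

print(f0(A_G, A_H, np.eye(3)))  # > 0: the identity is a poor matching
print(f0(A_G, A_H, P_swap))     # 0.0: the exact matching is recovered
```

A perfect matching exists exactly when the two weighted graphs are isomorphic, in which case the minimum of $F_0$ over permutations is zero.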
A popular alternative to the Frobenius norm formulation (1) is the 1-norm formulation obtained by replacing the Frobenius norm by the 1-norm $\|A\|_1 = \sum_i \sum_j |A_{ij}|$, which is equal to the square of the Frobenius norm $\|A\|_F^2$ when comparing $\{0,1\}$-valued matrices, but may differ in the case of general matrices.

The problem of graph matching can therefore be reformulated as the problem of minimizing $F_0(P)$ over the set of permutation matrices. This problem has a combinatorial nature and there is no known polynomial algorithm to solve it [21]. It is therefore very hard to solve in the case of large graphs, and numerous approximate methods have been developed.

An interesting generalization of the graph matching problem is the problem of labeled graph matching. Here, each graph has labels associated to all its vertices, and the objective is to find an alignment that fits well graph labels and graph structures at the same time. If we let $C_{ij}$ denote the cost of fitness between the $i$-th vertex of $G$ and the $j$-th vertex of $H$, then the matching problem based only on label comparison can be formulated as follows:

$$\min_{P \in \mathcal{P}} \mathrm{tr}(C^T P) = \sum_{i=1}^N \sum_{j=1}^N C_{ij} P_{ij}, \quad (2)$$

where $\mathcal{P}$ denotes the set of permutation matrices. A natural way of unifying (2) and (1) to match both the graph structure and the vertices' labels is then to minimize a convex combination [18]:

$$\min_{P \in \mathcal{P}} \; (1-\alpha) F_0(P) + \alpha \, \mathrm{tr}(C^T P), \quad (3)$$

which makes explicit, through the parameter $\alpha \in [0,1]$, the trade-off between the cost of individual matchings and faithfulness to the graph structure. A small $\alpha$ value emphasizes the matching of structures, while a large $\alpha$ value gives more importance to the matching of labels.
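The combined objective (3) is easy to evaluate directly; this is a small sketch with hypothetical toy data (the label cost matrix `C` and the adjacency matrices are made-up examples):

```python
import numpy as np

def labeled_objective(A_G, A_H, C, P, alpha):
    """Objective (3): (1 - alpha) * F0(P) + alpha * tr(C^T P)."""
    f0 = np.linalg.norm(A_G - P @ A_H @ P.T, "fro") ** 2
    return (1 - alpha) * f0 + alpha * np.trace(C.T @ P)

# Hypothetical toy data: alpha = 0 recovers the structural criterion (1),
# alpha = 1 recovers the pure label-matching criterion (2).
A_G = np.array([[0., 1.], [1., 0.]])
A_H = np.zeros((2, 2))
C = np.array([[0., 1.], [1., 0.]])
P = np.eye(2)
print(labeled_objective(A_G, A_H, C, P, alpha=0.0))  # structural term only
print(labeled_objective(A_G, A_H, C, P, alpha=1.0))  # label term only
```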
2.1 Permutation matrices

Before describing how we propose to solve (1) and (3), we first introduce some notation and point out some important characteristics of these optimization problems. They are defined on the set of permutation matrices, which we denote by $\mathcal{P}$. The set $\mathcal{P}$ is the set of extreme points of the set of doubly stochastic matrices, that is, matrices with nonnegative entries and with row sums and column sums equal to one:

$$\mathcal{D} = \{A : A 1_N = 1_N,\; A^T 1_N = 1_N,\; A \geq 0\},$$

where $1_N$ denotes the $N$-dimensional vector of all ones [22]. This shows that when a linear function is minimized over the set of doubly stochastic matrices $\mathcal{D}$, a solution can always be found in the set of permutation matrices. Consequently, minimizing a linear function over $\mathcal{P}$ is in fact equivalent to a linear program and can thus be solved in polynomial time by, e.g., interior point methods [23]. In fact, one of the most efficient methods to solve this problem is the Hungarian algorithm, which uses a specific primal-dual strategy to solve it in $O(N^3)$ [24]. Note that the Hungarian algorithm avoids the generic $O(N^7)$ complexity associated with a linear program with $N^2$ variables. At the same time, $\mathcal{P}$ may be considered as a subset of the set of orthonormal matrices $\mathcal{O} = \{A : A^T A = I\}$, and in fact $\mathcal{P} = \mathcal{D} \cap \mathcal{O}$. An (idealized) illustration of these sets is presented in Figure 1: the discrete set $\mathcal{P}$ of permutation matrices is the intersection of the convex set $\mathcal{D}$ of doubly stochastic matrices and the manifold $\mathcal{O}$ of orthogonal matrices.

Figure 1: Relation between the three matrix sets. $\mathcal{O}$ — set of orthogonal matrices, $\mathcal{D}$ — set of doubly stochastic matrices, $\mathcal{P} = \mathcal{D} \cap \mathcal{O}$ — set of permutation matrices.

2.2 Approximate methods: existing works

A good review of graph matching algorithms may be found in [25]. Here, we only present a brief description of some approximate methods which illustrate well the ideas behind the two subgroups of these algorithms. As mentioned in the introduction, one popular approach to finding approximate solutions to the graph matching problem is based on the spectral decomposition of the adjacency matrices of the graphs to be matched. In this approach, the eigenvalue decompositions of the graph adjacency matrices are used:

$$A_G = U_G \Lambda_G U_G^T, \quad A_H = U_H \Lambda_H U_H^T,$$

where the columns of the orthogonal matrices $U_G$ and $U_H$ consist of eigenvectors of $A_G$ and $A_H$ respectively, and $\Lambda_G$ and $\Lambda_H$ are diagonal matrices of eigenvalues. If we consider the rows of the eigenvector matrices $U_G$ and $U_H$ as graph node coordinates in eigenspaces, then we can match the vertices with similar coordinates through a variety of methods [5, 13, 15]. However, these methods suffer from the fact that the spectral embedding of graph vertices is not uniquely defined. First, the unit-norm eigenvectors are at most defined up to a sign flip, and we have to choose their signs synchronously. Although it is possible to use some normalization convention, such as choosing the sign of each eigenvector in such a way that its biggest component is always positive, this usually does not guarantee a perfect sign synchronization, in particular in the presence of noise. Second, if the adjacency matrix has multiple eigenvalues, then the choice of eigenvectors becomes arbitrary within the corresponding eigen-subspace, as they are defined only up to rotations [26]. One of the first spectral approximate algorithms was presented by Umeyama [13]. To avoid the ambiguity of eigenvector selection, Umeyama proposed to consider the absolute values of the eigenvectors.
According to this approach, the correspondence between graph nodes is established by matching the rows of $|U_G|$ and $|U_H|$ (which are defined as the matrices of absolute values). The criterion of optimal matching is the total distance between matched rows, leading to the optimization problem:

$$\min_{P \in \mathcal{P}} \| |U_G| - P |U_H| \|_F, \quad \text{or equivalently:} \quad \max_{P \in \mathcal{P}} \mathrm{tr}(|U_H| |U_G|^T P). \quad (4)$$

The optimization problem (4) is a linear program on the set of permutation matrices which can be solved by the Hungarian algorithm in $O(N^3)$ [27, 28].

The second group of approximate methods consists of algorithms which work directly with the objective function in (1), and typically involve various relaxations to optimization problems that can be solved efficiently. An example of such an approach is the linear programming method proposed by Almohamad and Duffuaa in [16]. They considered the 1-norm as the matching criterion for a permutation matrix $P \in \mathcal{P}$:

$$F'_0(P) = \|A_G - P A_H P^T\|_1 = \|A_G P - P A_H\|_1.$$

The linear program relaxation is obtained by optimizing $F'_0(P)$ on the set of doubly stochastic matrices $\mathcal{D}$ instead of $\mathcal{P}$:

$$\min_{P \in \mathcal{D}} F'_0(P), \quad (5)$$

where the 1-norm of a matrix is defined as the sum of the absolute values of all its elements. A priori the solution of (5) is an arbitrary doubly stochastic matrix $X \in \mathcal{D}$, so the final step is a projection of $X$ on the set of permutation matrices (we let $\Pi_{\mathcal{P}} X$ denote the projection of $X$ onto $\mathcal{P}$):

$$P^* = \Pi_{\mathcal{P}} X = \arg\min_{P \in \mathcal{P}} \|P - X\|_F^2, \quad \text{or equivalently:} \quad P^* = \arg\max_{P \in \mathcal{P}} \mathrm{tr}(X^T P). \quad (6)$$

The projection (6) can be performed with the Hungarian algorithm, with a complexity cubic in the dimension of the problem.
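Umeyama's criterion (4) can be sketched in a few lines; the linear program over $\mathcal{P}$ is solved here by the Hungarian algorithm via `scipy.optimize.linear_sum_assignment` (the same $O(N^3)$ assignment step as in Section 2.1). This is a simplified illustration under the assumption of distinct eigenvalues, not the authors' implementation, and the toy graphs are hypothetical:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def umeyama_match(A_G, A_H):
    """Spectral matching with absolute eigenvectors, as in criterion (4)."""
    _, U_G = np.linalg.eigh(A_G)            # eigenvectors as columns
    _, U_H = np.linalg.eigh(A_H)
    S = np.abs(U_G) @ np.abs(U_H).T         # S[i, j]: affinity of vertex i of G to j of H
    rows, cols = linear_sum_assignment(-S)  # maximize total affinity (Hungarian)
    P = np.zeros_like(S)
    P[rows, cols] = 1.0
    return P

# Toy test: H is a relabeled copy of G (a weighted path with distinct weights,
# hence distinct eigenvalues), so an exact matching exists.
A_G = np.array([[0., 2., 0.],
                [2., 0., 1.],
                [0., 1., 0.]])
Q = np.eye(3)[[2, 1, 0]]        # relabeling permutation
A_H = Q.T @ A_G @ Q
P = umeyama_match(A_G, A_H)
print(np.allclose(P @ A_H @ P.T, A_G))   # True: the exact matching is recovered
```

Taking absolute values removes the sign ambiguity of the eigenvectors; with repeated eigenvalues the embedding would no longer be well defined, as discussed above.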
The main disadvantage of this method is that the dimensionality (i.e., the number of variables and the number of constraints) of the linear program (5) is $O(N^2)$, and it is therefore quite hard to process graphs of more than about one hundred nodes. Other convex relaxations of (1) can be found in [18] and [17]. In the next section we describe our new algorithm, which is based on the technique of convex-concave relaxations of the initial problems (1) and (3).

3 Convex-concave relaxation

Let us start the description of our algorithm with unlabeled weighted graphs. The generalization to labeled weighted graphs is presented in Section 3.7. The graph matching criterion we consider for unlabeled graphs is the square of the Frobenius norm of the difference between adjacency matrices (1). Since permutation matrices are also orthogonal matrices (i.e., $P P^T = I$ and $P^T P = I$), we can rewrite $F_0(P)$ on $\mathcal{P}$ as follows:

$$F_0(P) = \|A_G - P A_H P^T\|_F^2 = \|(A_G - P A_H P^T)P\|_F^2 = \|A_G P - P A_H\|_F^2.$$

The graph matching problem is then the problem of minimizing $F_0(P)$ over $\mathcal{P}$, which we call GM:

$$\mathrm{GM}: \quad \min_{P \in \mathcal{P}} F_0(P). \quad (7)$$

3.1 Convex relaxation

A first relaxation of GM is obtained by minimizing the convex quadratic function $F_0(P)$ on the set of doubly stochastic matrices $\mathcal{D}$:

$$\mathrm{QCV}: \quad \min_{P \in \mathcal{D}} F_0(P). \quad (8)$$

The QCV problem is a convex quadratic program that can be solved in polynomial time, e.g., by the Frank-Wolfe algorithm [29] (see Section 3.5 for more details). However, the optimum is usually not an extreme point of $\mathcal{D}$, and therefore not a permutation matrix. If we want to use only QCV for the graph matching problem, we therefore have to project its solution on the set of permutation matrices, making, e.g., the following approximation:

$$\arg\min_{\mathcal{P}} F_0(P) \approx \Pi_{\mathcal{P}} \arg\min_{\mathcal{D}} F_0(P). \quad (9)$$

Although the projection $\Pi_{\mathcal{P}}$ can be made efficiently in $O(N^3)$ by the Hungarian algorithm, the difficulty with this approach is that if $\arg\min_{\mathcal{D}} F_0(P)$ is far from $\mathcal{P}$, then the quality of the approximation (9) may be poor: in other words, the work performed to optimize $F_0(P)$ is partly lost by the projection step, which is independent of the cost function. The PATH algorithm that we present later can be thought of as an improved projection step that takes the cost function into account in the projection.

3.2 Concave relaxation

We now present a second relaxation of GM, which results in a concave minimization problem. For that purpose, let us introduce the diagonal degree matrix $D$ of an adjacency matrix $A$, which is the diagonal matrix with entries $D_{ii} = d(i) = \sum_{j=1}^N A_{ij}$ for $i = 1, \dots, N$, as well as the Laplacian matrix $L = D - A$. Since $A$ has only nonnegative entries, it is well known that the Laplacian matrix is positive semidefinite [30]. We can now rewrite $F_0(P)$ as follows:

$$F_0(P) = \|A_G P - P A_H\|_F^2 = \|(D_G P - P D_H) - (L_G P - P L_H)\|_F^2 = \|D_G P - P D_H\|_F^2 - 2\,\mathrm{tr}\left[(D_G P - P D_H)^T (L_G P - P L_H)\right] + \|L_G P - P L_H\|_F^2. \quad (10)$$

Let us now consider more precisely the second term in this last expression. For $P \in \mathcal{P}$:

$$\mathrm{tr}\left[(D_G P - P D_H)^T (L_G P - P L_H)\right] = \mathrm{tr}(P P^T D_G^T L_G) + \mathrm{tr}(L_H D_H^T P^T P) - \mathrm{tr}(P^T D_G^T P L_H) - \mathrm{tr}(D_H^T P^T L_G P) = \sum_i d_G^2(i) + \sum_i d_H^2(i) - 2\sum_i d_G(i)\, d_{P(H)}(i) = \sum_i \left(d_G(i) - d_{P(H)}(i)\right)^2 = \|D_G - D_{P(H)}\|_F^2 = \|D_G P - P D_H\|_F^2. \quad (11)$$
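The key consequence of the derivation above is that $F_1(P)$ differs from $F_0(P)$ only by the constant $\kappa = \mathrm{tr}(L_G^2) + \mathrm{tr}(L_H^2)$ on the set of permutation matrices. This can be checked numerically; the sketch below uses hypothetical random weighted graphs, and evaluates $F_1$ through the equivalent form $-\|D_G P - P D_H\|_F^2 - 2\,\mathrm{tr}(P^T L_G P L_H)$ valid on $\mathcal{P}$ (the quadratic term follows from the Kronecker identity $(L_H^T \otimes L_G^T)\mathrm{vec}(P) = \mathrm{vec}(L_G^T P L_H)$):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5

def random_weighted_adjacency(n):
    """Symmetric nonnegative adjacency matrix with zero diagonal (no self-loops)."""
    W = rng.random((n, n))
    W = (W + W.T) / 2
    np.fill_diagonal(W, 0.0)
    return W

A_G, A_H = random_weighted_adjacency(N), random_weighted_adjacency(N)
D_G, D_H = np.diag(A_G.sum(1)), np.diag(A_H.sum(1))   # degree matrices
L_G, L_H = D_G - A_G, D_H - A_H                       # Laplacian matrices

P = np.eye(N)[rng.permutation(N)]                     # random permutation matrix

F0 = np.linalg.norm(A_G @ P - P @ A_H, "fro") ** 2
# On permutation matrices, -tr(Delta P) = -||D_G P - P D_H||_F^2 and the
# Kronecker quadratic term equals tr(P^T L_G P L_H):
F1 = (-np.linalg.norm(D_G @ P - P @ D_H, "fro") ** 2
      - 2 * np.trace(P.T @ L_G @ P @ L_H))
kappa = np.trace(L_G @ L_G) + np.trace(L_H @ L_H)
print(np.isclose(F0, F1 + kappa))   # True: F1 = F0 - kappa on permutations
```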
Plugging (11) into (10), we obtain, for $P \in \mathcal{P}$:

$$F_0(P) = \|D_G P - P D_H\|_F^2 - 2\|D_G P - P D_H\|_F^2 + \|L_G P - P L_H\|_F^2 = -\|D_G P - P D_H\|_F^2 + \|L_G P - P L_H\|_F^2 = -\mathrm{tr}(\Delta P) + \mathrm{tr}(L_G^2) + \mathrm{tr}(L_H^2) - 2\,\mathrm{vec}(P)^T (L_H^T \otimes L_G^T)\,\mathrm{vec}(P), \quad (12)$$

where we used $P P^T = P^T P = I$, introduced the matrix $\Delta$ with entries $\Delta_{ij} = (D_H(i,i) - D_G(j,j))^2$, so that $\mathrm{tr}(\Delta P) = \sum_{i,j} P_{ij} (D_G(i,i) - D_H(j,j))^2 = \|D_G P - P D_H\|_F^2$ on $\mathcal{P}$, and used $\otimes$ to denote the Kronecker product of two matrices (the definition of the Kronecker product and some important properties may be found in Appendix B). Let us denote by $F_1(P)$ the part of (12) which depends on $P$:

$$F_1(P) = -\mathrm{tr}(\Delta P) - 2\,\mathrm{vec}(P)^T (L_H^T \otimes L_G^T)\,\mathrm{vec}(P).$$

From (12) we see that the permutation matrix which minimizes $F_1$ over $\mathcal{P}$ is the solution of the graph matching problem. Now, minimizing $F_1(P)$ over $\mathcal{D}$ gives us a relaxation of (7) on the set of doubly stochastic matrices. Since graph Laplacian matrices are positive semidefinite, the matrix $L_H \otimes L_G$ is also positive semidefinite, as the Kronecker product of two symmetric positive semidefinite matrices [26]. Therefore the function $F_1(P)$ is concave on $\mathcal{D}$, and we obtain a concave relaxation of the graph matching problem:

$$\mathrm{QCC}: \quad \min_{P \in \mathcal{D}} F_1(P). \quad (13)$$

Interestingly, the global minimum of a concave function is necessarily attained at an extreme point of the convex set where it is minimized [31], so the minimum of $F_1(P)$ on $\mathcal{D}$ is in fact in $\mathcal{P}$.

At this point, we have obtained two relaxations of GM as nonlinear optimization problems on $\mathcal{D}$: the first one is the convex minimization problem QCV (8), which can be solved efficiently but leads to a solution in $\mathcal{D}$ that must then be projected onto $\mathcal{P}$; the other is the concave minimization problem QCC (13), which does not have an efficient (polynomial) optimization algorithm but has the same solution as the initial problem GM. We note that these convex and concave relaxations of the graph matching problem can be more generally derived for any quadratic assignment problem [32].

3.3 PATH algorithm

We propose to approximately solve QCC by tracking a path of local minima over $\mathcal{D}$ of a series of functions that linearly interpolate between $F_0(P)$ and $F_1(P)$, namely:

$$F_\lambda(P) = (1-\lambda) F_0(P) + \lambda F_1(P),$$

for $0 \leq \lambda \leq 1$. For all $\lambda \in [0,1]$, $F_\lambda$ is a quadratic function (which is in general neither convex nor concave for $\lambda$ away from zero or one). We recover the convex function $F_0$ for $\lambda = 0$, and the concave function $F_1$ for $\lambda = 1$. Our method searches sequentially for local minima of $F_\lambda$, where $\lambda$ moves from 0 to 1. More precisely, we start at $\lambda = 0$, and find the unique local minimum of $F_0$ (which is in this case its unique global minimum) by any classical QP solver. Then, iteratively, we find a local minimum of $F_{\lambda + d\lambda}$ given a local minimum of $F_\lambda$, by performing a local optimization of $F_{\lambda + d\lambda}$ starting from the local minimum of $F_\lambda$, using for example the Frank-Wolfe algorithm. Repeating this iterative process for $d\lambda$ small enough, we obtain a path of solutions $P^*(\lambda)$, where $P^*(0) = \arg\min_{P \in \mathcal{D}} F_0(P)$ and $P^*(\lambda)$ is a local minimum of $F_\lambda$ for all $\lambda \in [0,1]$. Noting that any local minimum of the concave function $F_1$ on $\mathcal{D}$ is in $\mathcal{P}$, we finally output $P^*(1) \in \mathcal{P}$ as an approximate solution of GM.
Our approach is similar to graduated non-convexity [33], an approach often used to approximate the global minimum of a non-convex objective function. There, the objective function consists of two parts, a convex component and a non-convex component, and the graduated non-convexity framework proposes to track the minimum of a linear combination of the convex and non-convex parts (from the convex relaxation to the true objective function) to approximate the minimum of the non-convex function. The PATH algorithm may indeed be considered as an example of such an approach. However, the main difference is the construction of the objective function. Unlike [33], we construct two relaxations of the initial optimization problem which agree (up to a constant) on the set of interest ($\mathcal{P}$), the goal being to choose convex/concave relaxations which best approximate the objective function on the set of permutation matrices.

The pseudo-code for the PATH algorithm is presented in Figure 2. The rationale behind it is that among the local minima of $F_1(P)$ on $\mathcal{D}$, we expect the one connected to the global minimum of $F_0$ through a path of local minima to be a good approximation of the global minimum. Such a situation is shown for example in Figure 3, where in one dimension the global minimum of a concave quadratic function on an interval (among two candidate points) can be found by following the path of local minima connected to the unique global minimum of a convex function.

1. Initialization:
   (a) $\lambda := 0$
   (b) $P^*(0) = \arg\min F_0$ — a convex optimization problem; the global minimum is found by the Frank-Wolfe algorithm.
2. Cycle over $\lambda$: while $\lambda < 1$
   (a) $\lambda_{\mathrm{new}} := \lambda + d\lambda$
   (b) if $|F_{\lambda_{\mathrm{new}}}(P^*(\lambda)) - F_\lambda(P^*(\lambda))| < \epsilon_\lambda$ then $\lambda = \lambda_{\mathrm{new}}$;
       else $P^*(\lambda_{\mathrm{new}}) = \arg\min F_{\lambda_{\mathrm{new}}}$ is found by Frank-Wolfe starting from $P^*(\lambda)$, and $\lambda = \lambda_{\mathrm{new}}$
3. Output: $P_{\mathrm{out}} := P^*(1)$

Figure 2: Schema of the PATH algorithm.

More precisely, and although we do not have any formal result about the optimality of the PATH optimization method (beyond the lack of global optimality, see Appendix A), we can mention a few interesting properties of this method:

• We know from (12) that for $P \in \mathcal{P}$, $F_1(P) = F_0(P) - \kappa$, where $\kappa = \mathrm{tr}(L_G^2) + \mathrm{tr}(L_H^2)$ is a constant independent of $P$. As a result, it holds for all $\lambda \in [0,1]$ that, for $P \in \mathcal{P}$: $F_\lambda(P) = F_0(P) - \lambda\kappa$.

Figure 3: Illustration of the path optimization approach. $F_0$ ($\lambda = 0$) — the initial convex function, $F_1$ ($\lambda = 1$) — the final concave function; the bold black line is the path of function minima $P^*(\lambda)$ for $\lambda = 0, 0.1, 0.2, 0.3, 0.75, 1$.

This shows that if for some $\lambda$ the global minimum of $F_\lambda(P)$ over $\mathcal{D}$ lies in $\mathcal{P}$, then this minimum is also the global minimum of $F_0(P)$ over $\mathcal{P}$, and therefore the optimal solution of the initial problem. Hence, if for example the global minimum of $F_\lambda$ is found on $\mathcal{P}$ by the PATH algorithm (for instance, if $F_\lambda$ is still convex), then the PATH algorithm leads to the global optimum of $F_1$. This situation can be seen in the toy example in Figure 3 where, for $\lambda = 0.3$, $F_\lambda$ has its unique minimum at the boundary of the domain.

• The sub-optimality of the PATH algorithm comes from the fact that, when $\lambda$ increases, the number of local minima of $F_\lambda$ may increase, and the sequence of local minima tracked by PATH may not be global minima. However, we can expect the local minima followed by the PATH algorithm to be interesting approximations, for the following reason.
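The scheme of Figure 2 can be sketched in code. The following is a simplified, illustrative implementation, not the authors' GraphM software: it steps $\lambda$ from 0 to 1 on a fixed grid, uses a basic Frank-Wolfe inner loop (Section 3.5) whose linear subproblem is solved by the Hungarian algorithm, and evaluates $F_1$ and its gradient with $N \times N$ matrix products only; the step size, tolerances, and final rounding are simplifications:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

def frank_wolfe(f, grad, P, iters=300):
    """Minimize a quadratic f over doubly stochastic matrices by Frank-Wolfe."""
    for _ in range(iters):
        G = grad(P)
        r, c = linear_sum_assignment(G)       # LP step solved by Hungarian
        S = np.zeros_like(P); S[r, c] = 1.0
        Dir = S - P
        d0 = np.sum(G * Dir)                  # directional derivative at t = 0
        if d0 > -1e-10:                       # Frank-Wolfe gap ~ 0: stationary
            break
        a = f(S) - f(P) - d0                  # curvature of the quadratic on [P, S]
        t = np.clip(-d0 / (2 * a), 0.0, 1.0) if a > 1e-12 else 1.0
        P = P + t * Dir                       # exact line search step
    return P

def path_match(A_G, A_H, n_steps=11):
    """Simplified PATH: follow local minima of F_lambda from lambda = 0 to 1."""
    N = A_G.shape[0]
    L_G, L_H = laplacian(A_G), laplacian(A_H)
    d_G, d_H = A_G.sum(1), A_H.sum(1)
    Delta = (d_G[:, None] - d_H[None, :]) ** 2   # Delta[i, j] = (d_G(i) - d_H(j))^2

    f0 = lambda P: np.linalg.norm(A_G @ P - P @ A_H, "fro") ** 2
    g0 = lambda P: 2 * (A_G @ (A_G @ P - P @ A_H) - (A_G @ P - P @ A_H) @ A_H)
    f1 = lambda P: -np.sum(Delta * P) - 2 * np.trace(P.T @ L_G @ P @ L_H)
    g1 = lambda P: -Delta - 4 * L_G @ P @ L_H

    P = np.full((N, N), 1.0 / N)              # barycenter of D as starting point
    for lam in np.linspace(0.0, 1.0, n_steps):
        P = frank_wolfe(lambda X, l=lam: (1 - l) * f0(X) + l * f1(X),
                        lambda X, l=lam: (1 - l) * g0(X) + l * g1(X), P)
    r, c = linear_sum_assignment(-P)          # round to the nearest permutation
    P_out = np.zeros_like(P); P_out[r, c] = 1.0
    return P_out

# Toy usage: H is a relabeled copy of a small weighted graph G.
A_G = np.array([[0., 2., 0.],
                [2., 0., 1.],
                [0., 1., 0.]])
Q = np.eye(3)[[1, 2, 0]]
A_H = Q.T @ A_G @ Q
P_found = path_match(A_G, A_H)
```

At $\lambda = 1$ the objective is concave, so the Frank-Wolfe iterates end at a vertex of $\mathcal{D}$, i.e., a permutation matrix; the final Hungarian rounding is only a safeguard.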
First, observe that if $P_1$ and $P_2$ are two local minima of $F_\lambda$ for some $\lambda \in [0,1]$, then the restriction of $F_\lambda$ to the segment $(P_1, P_2)$, being a quadratic function, has to be concave, and $P_1$ and $P_2$ must be on the boundary of $\mathcal{D}$. Now, let $\lambda_1$ be the smallest $\lambda$ such that $F_\lambda$ has several local minima on $\mathcal{D}$. If $P_1$ denotes the local minimum followed by the PATH algorithm, and $P_2$ denotes the "new" local minimum of $F_{\lambda_1}$, then necessarily the restriction of $F_{\lambda_1}$ to $(P_1, P_2)$ must be concave and have a vanishing derivative at $P_2$ (otherwise, by continuity of $F_\lambda$ in $\lambda$, there would be a local minimum of $F_\lambda$ near $P_2$ for $\lambda$ slightly smaller than $\lambda_1$). Consequently, we necessarily have $F_{\lambda_1}(P_1) < F_{\lambda_1}(P_2)$. This situation is illustrated in Figure 3 where, when the second local minimum appears at $\lambda = 0.75$, it is worse than the one tracked by the PATH algorithm. More generally, when "new" local minima appear, they are strictly worse than the one tracked by the PATH algorithm. Of course, they may become better than the PATH solution as $\lambda$ continues to increase. In spite of these justifications, the PATH algorithm only gives an approximation of the global minimum in the general case. In Appendix A, we provide two simple examples where the PATH algorithm respectively succeeds and fails to find the global minimum of the graph matching problem.

3.4 Numerical continuation method interpretation

Our path following algorithm may be considered as a particular case of numerical continuation methods (sometimes called path following methods) [34]. These allow one to estimate curves given in the following implicit form:

$$T(u) = 0, \quad \text{where } T \text{ is a mapping } T : \mathbb{R}^{K+1} \to \mathbb{R}^K. \quad (14)$$

In fact, our PATH algorithm corresponds to a particular implementation of the so-called Generic Predictor-Corrector Approach [34], widely used in numerical continuation methods.
In our case, we have a set of problems $\min_{P \in \mathcal{D}} (1-\lambda) F_0(P) + \lambda F_1(P)$ parametrized by $\lambda \in [0,1]$. In other words, for each $\lambda$ we have to solve the following system of Karush-Kuhn-Tucker (KKT) equations:

$$(1-\lambda)\nabla_P F_0(P) + \lambda \nabla_P F_1(P) + B^T\nu + \mu_S = 0, \quad B\,\mathrm{vec}(P) - 1_{2N} = 0, \quad P_S = 0,$$

where $S$ is a set of active constraints, i.e., of pairs of indices $(i,j)$ that satisfy $P_{ij} = 0$; $B\,\mathrm{vec}(P) - 1_{2N} = 0$ codes the conditions $\sum_j P_{ij} = 1\ \forall i$ and $\sum_i P_{ij} = 1\ \forall j$; and $\nu$ and $\mu_S$ are dual variables. We have to solve this system for all possible sets of active constraints $S$ on the open set of matrices $P$ that satisfy $P_{ij} > 0$ for $(i,j) \notin S$, in order to define the set of stationary points of the functions $F_\lambda$. Now, if we let $T(P, \nu, \mu, \lambda)$ denote the left-hand side of the KKT equation system, then we have exactly (14) with $K = N^2 + 2N + \#S$. From the implicit function theorem [35], we know that for each set of constraints $S$,

$$W_S = \{(P, \nu, \mu_S, \lambda) : T(P, \nu, \mu_S, \lambda) = 0 \text{ and } T'(P, \nu, \mu_S, \lambda) \text{ has the maximal possible rank}\}$$

is a smooth one-dimensional curve or the empty set, and can be parametrized by $\lambda$. In terms of the objective function $F_\lambda(P)$, the condition on $T'(P, \nu, \mu_S, \lambda)$ may be interpreted as prohibiting the projection of $F_\lambda(P)$ on any feasible direction from being constant. Therefore, the whole set of stationary points of $F_\lambda(P)$ as $\lambda$ varies from 0 to 1 may be represented as a union $W(\lambda) = \cup_S W_S(\lambda)$, where each $W_S(\lambda)$ is homotopic to a one-dimensional segment. The set $W(\lambda)$ may have quite a complicated form. Some of the $W_S(\lambda)$ may intersect each other, in which case we observe a bifurcation point; some of the $W_S(\lambda)$ may connect to each other, in which case we have a transformation point of one path into another; and some of the $W_S(\lambda)$ may appear only for $\lambda > 0$ and/or disappear before $\lambda$ reaches 1.
At the beginning, the PATH algorithm starts from W_∅(0); it then follows W_∅(λ) until the border of D (or a bifurcation point). If such an event occurs before λ = 1, then PATH moves to another segment of solutions corresponding to a different set of constraints S, and keeps moving along segments, sometimes jumping between them, until λ = 1. As we said in the previous section, one of the interesting properties of the PATH algorithm is that if W_S*(λ) appears only when λ = λ1 and W_S*(λ1) is a local minimum, then the value of the objective function F_λ1 at W_S*(λ1) is greater than at the point traced by the PATH algorithm.

3.5 Some implementation details

In this section we provide a few details relevant for an efficient implementation of the PATH algorithm.

Frank-Wolfe. Among the different techniques available for the optimization of F_λ(P) starting from the current local minimum tracked by the PATH algorithm, we use in our experiments the Frank-Wolfe algorithm, which is particularly suited to optimization over doubly stochastic matrices [36]. The idea of this algorithm is to sequentially minimize linear approximations of F_λ(P). Each step includes three operations:

1. estimation of the gradient ∇F_λ(P_n),
2. resolution of the linear program P*_n = arg min_{P ∈ D} ⟨∇F_λ(P_n), P⟩,
3. line search: finding the minimum of F_λ(P) on the segment [P_n, P*_n].

An important property of this method is that the second operation can be done efficiently by the Hungarian algorithm, in O(N³).
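The three operations above can be sketched in a few lines. The following is a minimal illustration (not the authors' implementation) of one Frank-Wolfe iteration for the convex term F_0(P) = ||A_G P − P A_H||²_F, using SciPy's `linear_sum_assignment` as the Hungarian solver; all function and variable names are ours.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def frank_wolfe_step(P, A_G, A_H):
    """One Frank-Wolfe iteration for F0(P) = ||A_G P - P A_H||_F^2
    over the doubly stochastic polytope (illustrative sketch)."""
    # Gradient of F0 via N x N matrix products (no N^2 x N^2 matrices).
    R = A_G @ P - P @ A_H
    grad = 2 * (A_G.T @ R - R @ A_H.T)
    # LP step: min_{P in D} <grad, P> is attained at a permutation matrix,
    # found by the Hungarian algorithm in O(N^3).
    rows, cols = linear_sum_assignment(grad)
    P_star = np.zeros_like(P)
    P_star[rows, cols] = 1.0
    # Exact line search on [P, P_star]: F0 is quadratic along the segment,
    # F0(P + t D) = F0(P) + 2 t <R, S> + t^2 ||S||^2 with S = A_G D - D A_H.
    D = P_star - P
    S = A_G @ D - D @ A_H
    denom = float(np.sum(S * S))
    t = 1.0 if denom == 0 else min(1.0, max(0.0, -float(np.sum(R * S)) / denom))
    return P + t * D

# Usage: start from the barycenter of the polytope.
N = 5
rng = np.random.default_rng(0)
A = (rng.random((N, N)) < 0.4).astype(float)
A_G = np.triu(A, 1) + np.triu(A, 1).T           # symmetric adjacency
perm = rng.permutation(N)
A_H = A_G[np.ix_(perm, perm)]                   # isomorphic copy
P = np.full((N, N), 1.0 / N)
for _ in range(50):
    P = frank_wolfe_step(P, A_G, A_H)
```

The line search exploits the fact that F_0 is quadratic along the segment, so the optimal step size has a closed form clipped to [0, 1].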
Efficient gradient computations. Another essential point is that we do not need to store matrices of size N² × N² to compute ∇F_1(P), because the tensor product in ∇F_1(P) = −vec(Δ^T) − 2(L_H^T ⊗ L_G^T)vec(P) can be expressed in terms of N × N matrix multiplications:

∇F_1(P) = −vec(Δ^T) − 2(L_H^T ⊗ L_G^T)vec(P) = −vec(Δ^T) − 2 vec(L_G^T P L_H).

The same can be done for the gradient of the convex component ∇F_0(P) = ∇[vec(P)^T Q vec(P)], where Q = (I ⊗ A_G − A_H^T ⊗ I)^T (I ⊗ A_G − A_H^T ⊗ I):

∇F_0(P) = 2Q vec(P) = 2 vec(A_G² P − A_G^T P A_H^T − A_G P A_H + P A_H²).

Initialization. The proposed algorithm can be accelerated by applying a Newton step as the first step of QCV (minimization of F_0(P)). First, let us rewrite the QCV problem as follows:

min_{P ∈ D} ||A_G P − P A_H||²_F ⇔ min_{P ∈ D} vec(P)^T Q vec(P) ⇔ min_P vec(P)^T Q vec(P) such that B vec(P) = 1_{2N} and vec(P) ≥ 0_{N²},   (15)

where B is the matrix which codes the conditions Σ_j P_ij = 1 and Σ_i P_ij = 1. The Lagrangian has the following form:

L(P, ν, μ) = vec(P)^T Q vec(P) + ν^T (B vec(P) − 1_{2N}) + μ^T vec(P),

where ν and μ are Lagrange multipliers. We would now like to use the Newton method for constrained optimization [36] to solve (15). Let μ_a denote the set of variables associated to the set of active constraints vec(P) = 0 at the current point; the Newton step then consists in solving the following system of equations:

[ 2Q   B^T  I_a ]   [ vec(P) ]   [ 0 ]   (N² elements)
[ B    0    0   ] × [ ν      ] = [ 1 ]   (2N elements)                 (16)
[ I_a  0    0   ]   [ μ_a    ]   [ 0 ]   (# of active ineq. constraints)

More precisely, we have to solve (16) for P. The problem is that in general this is computationally demanding, because it involves the inversion of matrices of size O(N²) × O(N²).
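The identity behind these reductions, (A^T ⊗ B)vec(X) = vec(BXA) (see Appendix B), and the resulting closed-form gradient can be checked numerically on small random symmetric matrices. This is an illustrative check with names of our own choosing, not part of the algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 4
# Random symmetric "adjacency" matrices and an arbitrary P.
A_G = rng.random((N, N)); A_G = (A_G + A_G.T) / 2
A_H = rng.random((N, N)); A_H = (A_H + A_H.T) / 2
P = rng.random((N, N))

vec = lambda X: X.reshape(-1, order='F')   # column-stacking vec operator

# Kronecker identity: (A^T kron B) vec(X) = vec(B X A).
A = rng.random((N, N)); B = rng.random((N, N)); X = rng.random((N, N))
lhs = np.kron(A.T, B) @ vec(X)
rhs = vec(B @ X @ A)

# Gradient of F0 without forming the N^2 x N^2 matrix Q explicitly:
M = np.kron(np.eye(N), A_G) - np.kron(A_H.T, np.eye(N))
Q = M.T @ M
g_big = 2 * Q @ vec(P)
g_small = 2 * vec(A_G @ A_G @ P - A_G.T @ P @ A_H.T
                  - A_G @ P @ A_H + P @ A_H @ A_H)
# lhs == rhs and g_big == g_small up to rounding (A_G, A_H symmetric).
```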
In some particular cases, however, the Newton step becomes feasible. Typically, if none of the constraints vec(P) ≥ 0 is active, then (16) takes the following form²:

[ 2Q  B^T ]   [ vec(P) ]   [ 0 ]   (N² elements)
[ B   0   ] × [ ν      ] = [ 1 ]   (2N elements)                       (17)

The solution is then obtained as follows:

vec(P)_KKT = Q^{−1} B^T (B Q^{−1} B^T)^{−1} 1_{2N}.   (18)

Because of the particular form of the matrices Q and B, expression (18) may be computed very simply with the help of Kronecker product properties, in O(N³) instead of O(N⁶). More precisely, the first step is the calculation of M = B Q^{−1} B^T, where Q = (I ⊗ A_G − A_H^T ⊗ I)². The matrix Q^{−1} may be represented as follows:

Q^{−1} = (U_H ⊗ U_G)(I ⊗ Λ_G − Λ_H ⊗ I)^{−2}(U_H ⊗ U_G)^T.   (19)

Therefore the (i, j)-th element of M is the following product:

B_i Q^{−1} B_j^T = vec(U_G^T B̃_i U_H)^T (I ⊗ Λ_G − Λ_H ⊗ I)^{−2} vec(U_G^T B̃_j U_H),   (20)

where B_i is the i-th row of B and B̃_i is B_i reshaped into an N × N matrix. The second step is the inversion of the 2N × 2N matrix M, and a sum over columns M_s = M^{−1} 1_{2N}. The last step is the multiplication of Q^{−1} by B^T M_s, which can be done with the same tricks as the first step. The result is the value of the matrix P_KKT. We then have two possible scenarios:

1. If P_KKT ∈ D, then we have found the solution of (15).
2. Otherwise, we take the point of intersection of the line (P_0, P_KKT) and the border ∂D as the next point, and we continue with the Frank-Wolfe algorithm.

Unfortunately, we can do the Newton step only once; afterwards some of the P ≥ 0 constraints become active and the efficient calculations are no longer feasible. But even in this case the Newton step is generally very useful, because it decreases the value of the objective function considerably.

² This is true if we start our algorithm, for example, from the constant matrix P_0 = (1/N) 1_N 1_N^T.
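The spectral factorization (19) can likewise be verified numerically for small symmetric adjacency matrices. The sketch below assumes, as the derivation does, that A_G and A_H share no eigenvalue, so that the diagonal factor is invertible; all names are ours:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
A_G = rng.random((N, N)); A_G = (A_G + A_G.T) / 2
A_H = rng.random((N, N)); A_H = (A_H + A_H.T) / 2

# Eigendecompositions A_G = U_G Lam_G U_G^T and A_H = U_H Lam_H U_H^T.
lam_G, U_G = np.linalg.eigh(A_G)
lam_H, U_H = np.linalg.eigh(A_H)

I = np.eye(N)
M = np.kron(I, A_G) - np.kron(A_H.T, I)
Q = M.T @ M

# Spectral inverse: Q^{-1} = (U_H kron U_G) D^{-2} (U_H kron U_G)^T,
# with D = I kron Lam_G - Lam_H kron I (diagonal; generically nonzero
# when A_G and A_H share no eigenvalue).
d = np.kron(np.ones(N), lam_G) - np.kron(lam_H, np.ones(N))
U = np.kron(U_H, U_G)
Q_inv = U @ np.diag(d ** -2.0) @ U.T
# Q_inv @ Q should be the identity up to rounding.
```

In the algorithm this inverse is never formed explicitly; expressions (19)-(20) apply it through N × N multiplications, which is what brings the cost down to O(N³).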
dλ-adaptation strategy. In practice, we found it useful to let the parameter dλ in the algorithm of Figure 2 vary between iterations. Intuitively, dλ should depend on the shape of the objective function: if F_λ(P) is smooth and increasing the parameter λ does not change the form of the function much, then we can afford large steps; in contrast, we should take many small steps when the objective function is very sensitive to changes in the parameter λ. The adaptive scheme we propose is the following. First, we fix a constant dλ_min = 10^{−5}, which represents the lower limit for dλ. When the PATH algorithm starts, dλ is set to dλ_min. If we see after an update λ_new = λ + dλ that |F_λnew(P*(λ)) − F_λ(P*(λ))| ≤ ε_λ, then we double dλ, and we keep multiplying dλ by 2 as long as this inequality holds. On the contrary, if dλ is too large in the sense that |F_λnew(P*(λ)) − F_λ(P*(λ))| > ε_λ, then we divide dλ by 2 until the criterion |F_λnew(P*(λ)) − F_λ(P*(λ))| ≤ ε_λ is met or dλ = dλ_min. Once the update of dλ is done, we run the optimization (Frank-Wolfe) for the new value λ + dλ. The idea behind this simple adaptation scheme is to choose the dλ that keeps |F_λnew(P*(λ)) − F_λ(P*(λ))| just below ε_λ.

Stopping criterion. The choice of the update criterion |F_λnew(P*(λ)) − F_λ(P*(λ))| is not unique. Here we check whether the function value has changed much at the given point. In fact, it may be more interesting to trace the minimum of the objective function: to compare the new minimum with the current one, we need to check both the distance between these minima and the difference between the function values.
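The dλ-adaptation rule above (double while the objective change stays below ε_λ, halve while it exceeds ε_λ, never below dλ_min) can be sketched as follows; the function and variable names are hypothetical:

```python
def adapt_dlam(F, lam, P_star, dlam, eps_lam, dlam_min=1e-5, dlam_max=1.0):
    """Sketch of the dlambda-adaptation rule. F(lam, P) evaluates the
    interpolated objective F_lambda at the point P; P_star is the current
    minimizer P*(lam)."""
    def change(d):
        # Objective change at the current minimizer when lam is incremented.
        return abs(F(min(lam + d, 1.0), P_star) - F(lam, P_star))
    while change(dlam) <= eps_lam and dlam < dlam_max:
        dlam *= 2          # change is small: afford a larger step
    while change(dlam) > eps_lam and dlam > dlam_min:
        dlam /= 2          # change is too large: refine the step
    return max(dlam_min, min(dlam, dlam_max, 1.0 - lam))

# Toy usage with a scalar stand-in for P: F_lambda grows quickly in lambda,
# so the adapted step stays small.
F = lambda lam, p: (1 - lam) * p + lam * (10 * p)
step = adapt_dlam(F, lam=0.5, P_star=1.0, dlam=1e-5, eps_lam=1e-3)
```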
This means that we use the following condition as the stopping criterion:

|F_λnew(P*(λ_new)) − F_λ(P*(λ))| < ε_λ^F and ||P*(λ_new) − P*(λ)|| < ε_λ^P.

Although this approach takes somewhat more computation (we need to run Frank-Wolfe on each update of dλ), it is quite efficient if we use the adaptation scheme for dλ. To fix the values ε_λ^F and ε_λ^P, we use a parameter M which defines the ratio between these parameters and the parameters of the stopping criterion used in the Frank-Wolfe algorithm, ε_FW^F (limit value of the function decrement) and ε_FW^P (limit value of the argument change): ε_λ^F = M ε_FW^F and ε_λ^P = M ε_FW^P. The parameter M represents an authorized level of stopping criterion relaxation when we increment λ. In practice, it means that when we start to increment λ we may move away from the local minimum, and the extent of this move is defined by the parameter M. The larger the value of M, the further we can move away and the larger the dλ that may be used. In other words, the parameter M controls the width of the tube around the path of optimal solutions.

3.6 Algorithm complexity

Here we present the complexity of the algorithms discussed in the paper.

• Umeyama's algorithm has three components: matrix multiplication, calculation of eigenvectors, and application of the Hungarian algorithm for (4). The complexity of each component is O(N³), so Umeyama's algorithm has complexity O(N³).
• The LP approach (5) has complexity O(N⁷) (worst case), because it may be rewritten as a linear optimization problem with 3N² variables [23].

In the PATH algorithm, there are four principal parameters which have a big impact on the algorithm complexity: ε_FW^F, ε_FW^P, M and N.
The parameters ε_FW^F and ε_FW^P define the precision of the Frank-Wolfe algorithm; in some cases its speed may be sublinear [36], though it works much better when the optimization polytope has a "smooth" border [36]. The influence of the ratio parameter M is more complicated. In practice, in order to ensure that the objective function takes values between 0 and 1, we usually use the normalized version of the objective function:

F_norm(P) = ||A_G P − P A_H||²_F / (||A_G||²_F + ||A_H||²_F).

In this case, if we use the simple stopping criterion based on the value of the objective function, then the number of iterations over λ (the number of Frank-Wolfe runs) is at least equal to C / (M ε_FW^F), where C = min_P F_norm − min_D F_norm. The most important question is how the algorithm complexity depends on the graph size N. In general, the number of iterations of the Frank-Wolfe algorithm scales as O(κ / ε_FW^F), where κ is the condition number of the Hessian matrix of the objective function near a local minimum [36]. This means that in terms of the number of iterations, the parameter N is not crucial: N defines the dimensionality of the minimization problem, while κ depends on the graph structures rather than explicitly on their size. On the other hand, N has a big influence on the cost of one iteration. Indeed, in each iteration step we need to calculate the gradient and to minimize a linear function over the polytope of doubly stochastic matrices; both the gradient estimation and the minimization can be done in O(N³). In Section 4.2 we present empirical results on how algorithm complexity and optimization precision depend on M (Figure 7b) and N (Figure 8).
3.7 Vertex pairwise similarities

If we match two labeled graphs, then we may increase the performance of our method by using information on pairwise similarities between their nodes. In fact, one method of image matching, namely shape context matching [19], uses only this type of information. To integrate the information on vertex similarities we use the approach proposed in (3), but with F_λ(P) instead of F_0(P):

min_P F_λ^α(P) = min_P (1 − α) F_λ(P) + α tr(C^T P).   (21)

The advantage of this formulation is that F_λ^α(P) is just F_λ(P) with an additional linear term. Therefore we can use the same algorithm for the minimization of F_λ^α(P) as the one we presented for the minimization of F_λ(P).

3.8 Matching graphs of different sizes

Often in practice we have to match graphs of different sizes N_G and N_H (suppose, for example, that N_G > N_H). In this case we have to match all vertices of graph H to a subset of vertices of graph G. In the usual case, when N_G = N_H, the error (1) corresponds to the number of mismatched edges (edges which exist in one graph and do not exist in the other). When we match graphs of different sizes, the situation is a bit more complicated. Let V_G^+ ⊂ V_G denote the set of vertices of graph G that are selected for matching to vertices of graph H, and let V_G^− = V_G \ V_G^+ denote all the rest. All edges of the graph G are then divided into four parts, E_G = E_G^++ ∪ E_G^+− ∪ E_G^−+ ∪ E_G^−−, where E_G^++ are edges between vertices from V_G^+, E_G^−− are edges between vertices from V_G^−, and E_G^+− and E_G^−+ are edges from V_G^+ to V_G^− and from V_G^− to V_G^+, respectively. For undirected graphs the sets E_G^+− and E_G^−+ are the same (but for directed graphs, which we do not consider here, they would be different).
The edges from E_G^−−, E_G^+− and E_G^−+ are always mismatched, and the question is whether we should take them into account in the objective function or not. Depending on the answer, we have three types of matching error (four for directed graphs), with interesting interpretations.

1. We count only the number of mismatched edges between H and the chosen subgraph G^+ ⊂ G. This corresponds to the case where the matrix P from (1) is of size N_G × N_H and N_G − N_H rows of P contain only zeros.
2. We count the number of mismatched edges between H and the chosen subgraph G^+ ⊂ G, and we also count all edges from E_G^−−, E_G^+− and E_G^−+. In this case P from (1) is of size N_G × N_G, and we transform the matrix A_H into a matrix of size N_G × N_G by adding N_G − N_H zero rows and zero columns. This means that we add dummy isolated vertices to the smaller graph, and then match graphs of the same size.
3. We count the number of mismatched edges between H and the chosen subgraph G^+ ⊂ G, and we also count all edges from E_G^+− (or E_G^−+). This means that we count the matching error between H and G^+, plus the number of edges which connect G^+ and G^−. In other words, we are looking for a subgraph G^+ which is similar to H and which is maximally isolated in the graph G.

Each type of error may be useful depending on the context and interpretation, but a priori the best choice seems to be the second one, where we add dummy nodes to the smaller graph. The main reason is the following. Suppose that graph H is quite sparse, and suppose that graph G has two candidate subgraphs, G_s^+ (also quite sparse) and G_d^+ (dense). The upper bound for the matching error between H and G_s^+ is #E_H + #E_{G_s^+}, while the lower bound for the matching error between H and G_d^+ is #E_{G_d^+} − #E_H.
So if #E_H + #E_{G_s^+} < #E_{G_d^+} − #E_H, then with the first strategy we will always choose the subgraph G_s^+, even if it is not similar at all to the graph H. The explanation of this effect is that the algorithm tries to minimize the number of mismatched edges, not to maximize the number of well-matched edges. In contrast, when we use dummy nodes we do not have this problem, because taking a very sparse subgraph G^+ increases the number of edges in G^− (the combined number of edges in G^+ and G^− is constant and equal to the number of edges in G) and thereby decreases the quality of the matching.

4 Simulations

4.1 Synthetic examples

In this section we compare the proposed algorithm with some classical methods on artificially generated graphs. Our choice of random graph types is based on [37], where the authors discuss the types of random graphs most frequently observed in various real-world applications (world wide web, collaboration networks, social networks, etc.). Each type of random graph is defined by the distribution function of node degree, Prob(node degree = k) = VD(k). The vector of node degrees of each graph is supposed to be an i.i.d. sample from VD(k). In our experiments we have used the following types of random graphs:

Binomial law: VD(k) = C_N^k p^k (1 − p)^(N−k)
Geometric law: VD(k) = (1 − e^(−μ)) e^(−μk)
Power law: VD(k) = C_τ k^(−τ)

The schema of graph generation is the following:

1. generate a sample d = (d_1, ..., d_N) from VD(k);
2. if Σ_i d_i is odd, then go to step 1;
3. while Σ_i d_i > 0:
   (a) choose randomly two non-zero elements of d, say d_n1 and d_n2,
   (b) add the edge (n1, n2) to the graph,
   (c) d_n1 ← d_n1 − 1, d_n2 ← d_n2 − 1.

If we are interested in isomorphic graph matching, then we compare just the initial graph and its randomly permuted copy.
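The generation schema can be sketched as follows; `sample_graph` and `geo_degree` are hypothetical names, and the handling of a leftover degree stranded on a single node is our simplification:

```python
import math
import random

def sample_graph(degree_sampler, N, seed=0):
    """Sketch of the generation schema: draw a degree sequence from VD,
    reject odd totals, then pair off remaining degree "stubs" at random.
    degree_sampler(rng) returns one draw from VD(k)."""
    rng = random.Random(seed)
    while True:
        d = [degree_sampler(rng) for _ in range(N)]
        if sum(d) % 2 == 0 and sum(d) > 0:
            break                       # step 2: resample while the total is odd
    edges = []
    while sum(d) > 0:                   # step 3: pair off degree stubs
        nonzero = [i for i, di in enumerate(d) if di > 0]
        if len(nonzero) < 2:            # leftover stub on a single node: drop it
            break
        n1, n2 = rng.sample(nonzero, 2)
        edges.append((n1, n2))
        d[n1] -= 1
        d[n2] -= 1
    return edges

# Example: geometric-law degrees VD(k) proportional to exp(-mu k),
# truncated at kmax for sampling purposes.
def geo_degree(rng, mu=0.7, kmax=9):
    weights = [math.exp(-mu * k) for k in range(1, kmax + 1)]
    return rng.choices(range(1, kmax + 1), weights=weights)[0]

edges = sample_graph(geo_degree, N=20)
```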
To test the matching of non-isomorphic graphs, we add σN_E random edges to the initial graph and to its permuted copy, where N_E is the number of edges in the original graph and σ is the noise level.

4.2 Results

The first series of experiments concerns small graphs (N = 8); here we are interested in a comparison of the PATH algorithm (see Figure 2), the QCV approach (8), Umeyama's spectral algorithm (4), the linear programming approach (5), and exhaustive search, which is feasible for graphs of this size. The algorithms were tested on the three types of random graphs (binomial, exponential and power). The results are presented in Figure 4.

Figure 4: Matching error (mean value over a sample of size 100) as a function of noise, for binomial (a), exponential (b) and power (c) graphs. Graph size N = 8. U — Umeyama's algorithm, LP — linear programming algorithm, QCV — convex optimization, PATH — path minimization algorithm, OPT — exhaustive search (the global minimum). The range of the error bars is the standard deviation of matching errors.

Figure 5: Matching error (mean value over a sample of size 100) as a function of noise, for binomial (a), exponential (b) and power (c) graphs. Graph size N = 20. U — Umeyama's algorithm, LP — linear programming algorithm, QCV — convex optimization, PATH — path minimization algorithm.
Figure 6: Matching error (mean value over a sample of size 100) as a function of noise, for binomial (a), exponential (b) and power (c) graphs. Graph size N = 100. U — Umeyama's algorithm, QCV — convex optimization, PATH — path minimization algorithm.

The same experiment was repeated for middle-sized graphs (N = 20, Figure 5) and for large graphs (N = 100, Figure 6). In all cases, the PATH algorithm works much better than all the other approximate algorithms. There are some important things to note here. First, the choice of norm in (1) is not very important: the results of QCV and LP are about the same. Second, following the solution path is very useful compared to just minimizing the convex relaxation and projecting the solution on the set of permutation matrices (the PATH algorithm works much better than QCV). Another noteworthy observation is that the performance of PATH is very close to the optimal solution when the latter can be evaluated. We note that sometimes the matching error decreases as the noise level increases (e.g., in Figures 5c and 6c), which can be explained as follows. The matching error is upper bounded by the minimum of the total number of zeros in the adjacency matrices A_G and A_H, so in general this upper bound decreases when the edge density increases. When the noise level increases, it makes the graphs denser, and consequently the upper bound on the matching error decreases. The general behavior of graph matching algorithms as a function of graph density is presented in Figure 7a; here again, the matching error decreases when the graph density becomes very large. The parameter M (see Section 3.5) defines how precisely the PATH algorithm tries to follow the path of local minima.
The larger M, the faster the PATH algorithm. At the extreme, when M is close to 1/ε_FW^F, we jump directly from the convex function (λ = 0) to the concave one (λ = 1). Figure 7b shows in more detail how algorithm speed and precision depend on M. Another important aspect when comparing the different algorithms is their run-time complexity as a function of N. Figure 8 shows the time needed to obtain the matching between two graphs as a function of the number of vertices N (for N between 10 and 100), for the different methods. These curves are coherent with the theoretical algorithm complexities summarized in Section 3.6. In particular, we observe that Umeyama's algorithm is the fastest method, but that QCV and PATH have the same complexity in N. The LP method is competitive with QCV and PATH for small graphs, but has a worse complexity in N.

Figure 7: (a) Algorithm performance as a function of graph density. (b) Precision and speed of the PATH algorithm as a function of M, the relaxation constant used in the PATH algorithm (see Section 3.5). In both cases, graph size N = 100, noise level σ = 0.3, and sample size equal to 30. Error bars represent the standard deviation of the matching error (not averaged).

Figure 8: Timing of the U, LP, QCV and PATH algorithms as a function of graph size, for the different random graph models: binomial (a), exponential (b) and power (c). LP slope ≈ 6.7; U, QCV and PATH slope ≈ 3.4.

Table 1: Experimental results for QAPLIB benchmark datasets.
QAP      MIN      PATH     QPB      GRAD     U
chr12c   11156    18048    20306    19014    40370
chr15a   9896     19086    26132    30370    60986
chr15c   9504     16206    29862    23686    76318
chr20b   2298     5560     6674     6290     10022
chr22b   6194     8500     9942     9658     13118
esc16b   292      300      296      298      306
rou12    235528   256320   278834   273438   295752
rou15    354210   391270   381016   457908   480352
rou20    725522   778284   804676   840120   905246
tai10a   135028   152534   165364   168096   189852
tai15a   388214   419224   455778   451164   483596
tai17a   491812   530978   550852   589814   620964
tai20a   703482   753712   799790   871480   915144
tai30a   1818146  1903872  1996442  2077958  2213846
tai35a   2422002  2555110  2720986  2803456  2925390
tai40a   3139370  3281830  3529402  3668044  3727478

5 QAP benchmark library

The problem of graph matching may be considered as a particular case of the quadratic assignment problem (QAP). The minimization of the loss function (1) is equivalent to the maximization of the following function:

max_P tr(P^T A_G^T P A_H).

It is therefore interesting to compare our method with other approximate methods proposed for QAP. [18] proposed the QPB algorithm for this purpose and tested it on matrices from the QAP benchmark library [38]; the QPB results were compared to those of the graduated assignment algorithm GRAD [17] and of Umeyama's algorithm. The results of applying PATH to the same matrices are presented in Table 1; the scores for QPB and the graduated assignment algorithm are taken directly from [18]. We observe that on 14 out of 16 benchmarks, PATH is the best of the tested optimization methods.

6 Image processing

In this section, we present two applications in image processing. The first one (Section 6.1) illustrates how taking into account information on graph structure may increase image alignment quality. The second one (Section 6.2) shows that the structure of contour graphs may be very important in classification tasks.
In both examples we compare the performance of our method with the shape context approach [19], a state-of-the-art method for image matching.

6.1 Alignment of vessel images

The first example is dedicated to the problem of image alignment. We consider two photos of vessels in human eyes. The original photos and the images of extracted vessel contours (obtained with the method of [39]) are presented in Figure 9. To align the vessel images, the shape context algorithm uses the radial context histograms of the contour points (see [19]); in other words, the shape context algorithm aligns points which have similar context histograms. The PATH algorithm also uses information about the graph structure. When we use the PATH algorithm we have to tune the parameter α (21); we tested several possible values and took the one which produced the best result. To construct the graph, we use all points of the vessel contours as graph nodes, and we connect all nodes within a circle of radius r (in our case r = 50). Finally, to each edge (i, j) we associate the weight w_ij = exp(−|x_i − y_j|). A graph matching algorithm produces an alignment of image contours; to align the two images we then have to extend this alignment to the rest of the image. For this purpose, we use a smooth spline-based transformation [40]: we estimate the parameters of the spline transformation from the known alignment of contour points, and then apply this transformation to the whole image. The results of image matching based on the shape context algorithm and on the PATH algorithm are presented in Figure 10, where black lines designate connections between associated points.

Figure 9: Eye photos (top) and vessel contour extraction (bottom).
We observe that the shape context method creates many unwanted matchings, while PATH produces a matching that visually corresponds to a correct alignment of the vessel structure. The main reason why graph matching works better than shape context matching is that shape context does not take into account the relative positions of matched points, and may lead to totally incoherent graph structures. In contrast, graph matching tries to match pairs of nearby points in one image to pairs of nearby points in the other. Among graph matching methods, different results are obtained with different optimization algorithms. Table 2 shows the matching errors produced by the different algorithms on this vessel alignment problem. The PATH algorithm has the smallest matching error, with the alignment shown in Figure 10. QCV comes next, with an alignment that is also visually correct. On the other hand, the Umeyama algorithm has a much larger matching error and, like the shape context method, visually fails to find a correct alignment.

Table 2: Alignment of vessel images, algorithm performance.

Method           Shape context   Umeyama   QCV      PATH
Matching error   870.61          764.83    654.42   625.75

6.2 Recognition of handwritten chinese characters

Another example that we consider in this paper is the problem of chinese character recognition on the ETL9B dataset [41]. The main idea is to use the score of an optimal matching as a similarity measure between two images of characters. This similarity measure can then be used in machine learning algorithms, for instance K-nearest neighbors (KNN), for character classification.
Here we compare the performance of four methods: linear support vector machine (SVM), SVM with Gaussian kernel, KNN based on the score of shape context matching, and KNN based on scores from graph matching combining structural and shape context information. As the score, we use the value of the objective function (21) at the (locally) optimal point. We have selected three chinese characters known to be difficult to distinguish by automatic methods. Examples of these characters, as well as examples of the extracted graphs (obtained by thinning and uniformly subsampling the images), are presented in Figure 11. For the SVM-based algorithms, we directly use the values of the image pixels (so each image is represented by a binary vector); in the graph matching algorithms we use the binary adjacency matrices of the extracted graphs and the shape context matrices (see [19]).

Figure 10: Comparison of alignment based on shape context (top) and alignment based on the PATH optimization algorithm (bottom). For each algorithm we present two alignments: image '1' on image '2' and the inverse. Each alignment is a spline-based transformation (see text).

Figure 11: Chinese characters from the ETL9B dataset (character 1, character 2, character 3).

Our dataset consists of 50 examples (images) of each class. Each image is represented by a 63 × 64 binary matrix. To compare the different methods we use the cross-validation error (five folds). The dependence of the classification error on the two algorithm parameters (α, the coefficient of the linear combination (21), and k, the number of nearest neighbors used in KNN) is shown in Figure 12.

Figure 12: (a) Classification error as a function of α (for k = 3, 4, 9). (b) Classification error as a function of k.
The classification error is estimated as the cross-validation error (five folds, 50 repetitions); the range of the error bars is the standard deviation of the test error over one fold (not averaged over folds and repetitions). The two extreme choices α = 1 and α = 0 correspond respectively to pure shape context matching, i.e., when only node label information is used, and to pure unlabeled graph matching. It is worth observing that KNN based only on the score of unlabeled graph matching does not work very well, the classification error being about 60%. An explanation of this phenomenon is that the patterns have a very unstable graph structure within each class. The pure shape context method has a classification error of about 39%. The combination of shape context and graph structure information decreases the classification error down to 25%. Besides the PATH algorithm, we also tested the QCV and Umeyama algorithms: the Umeyama algorithm barely decreases the classification error, while the QCV algorithm works better than the Umeyama algorithm but still worse than the PATH algorithm. Complete results can be found in Table 3.

Table 3: Classification of chinese characters. CV and STD are the mean and standard deviation of the test error over cross-validation runs (five folds, 50 repetitions).

Method                                      CV      STD
Linear SVM                                  0.377   ±0.090
SVM with Gaussian kernel                    0.359   ±0.076
KNN (PATH) (α = 1): shape context           0.399   ±0.081
KNN (PATH) (α = 0.4)                        0.248   ±0.075
KNN (PATH) (α = 0): pure graph matching     0.607   ±0.072
KNN (U) (α = 0.9): best choice of α         0.382   ±0.077
KNN (QCV) (α = 0.3): best choice of α       0.295   ±0.061

7 Conclusion

We have presented the PATH algorithm, a new technique for graph matching based on convex-concave relaxations of the initial integer programming problem.
PATH makes it possible to integrate the alignment of graph structural elements with the matching of vertices having similar labels. Its results are competitive with state-of-the-art methods in several graph matching and QAP benchmark experiments. Moreover, PATH has a theoretical and empirical complexity competitive with the fastest available graph matching algorithms. Two points can be mentioned as interesting directions for further research. First, the quality of the convex-concave approximation is determined by the choice of the convex and concave relaxation functions; better performance may be achieved by more appropriate choices of these functions. Second, another interesting point concerns the construction of a good concave relaxation for the problem of directed graph matching, i.e., for asymmetric adjacency matrices. Such generalizations would also be interesting as possible polynomial-time approximate solutions for the general QAP problem.

A A toy example

The PATH algorithm does not generally find the global optimum of this NP-complete optimization problem. In this appendix we illustrate with two examples how the set of local optima tracked by PATH may or may not lead to the global optimum. More precisely, we consider two simple graphs with the following adjacency matrices:

G = [ 0 1 1 ]        H = [ 0 1 0 ]
    [ 1 0 0 ]            [ 1 0 0 ]
    [ 1 0 0 ]            [ 0 0 0 ]

Let C denote the cost matrix of vertex association:

C = [ 0.1691  0.0364  1.0509 ]
    [ 0.6288  0.5879  0.8231 ]
    [ 0.8826  0.5483  0.6100 ]

Let us suppose that we have fixed the tradeoff α = 0.5, and that our objective is then to find the global minimum of the following function:

F_0(P) = 0.5 ||GP − PH||²_F + 0.5 tr(C^T P),   P ∈ P.   (22)

As explained earlier, the main idea underlying the PATH algorithm is to try to follow the path of global minima of F_λ^α(P) (21).
Following this path is possible if all global minima P*_λ form a continuous path, which is not true in general. In the case of small graphs we can find the exact global minimum of F^{\alpha}_{\lambda}(P) for all λ. The trace of the global minima as functions of λ is presented in Figure 13(a) (i.e., we plot the values of the nine parameters of the doubly stochastic matrix, which are, as expected, all equal to zero or one when λ = 1). When λ is near 0.2 there is a jump of the global minimum from one face to another. However, if we change the linear term C to

C' = \begin{pmatrix} 0.4376 & 0.3827 & 0.1798 \\ 0.3979 & 0.3520 & 0.2500 \\ 0.1645 & 0.2653 & 0.5702 \end{pmatrix},

then the trace becomes smooth (see Figure 13(b)) and the PATH algorithm finds the globally optimal point. Characterizing the cases where the path is indeed smooth is the subject of ongoing research.

Figure 13: Nine coordinates of the global minimum of F^{\alpha}_{\lambda} as a function of λ: (a) with the original cost matrix C, (b) with the modified cost matrix C'.

B Kronecker product

The Kronecker product of two matrices A ⊗ B is defined as follows:

A \otimes B = \begin{pmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{pmatrix}.

Two important properties of the Kronecker product that we use in this paper are:

(A^T \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(BXA) \quad \text{and} \quad \mathrm{tr}(X^T A X B^T) = \mathrm{vec}(X)^T (B \otimes A)\,\mathrm{vec}(X).
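Both Kronecker product identities above can be checked numerically; a minimal sketch with NumPy, where vec denotes column-major (column-stacking) vectorization:

```python
import numpy as np

rng = np.random.default_rng(0)

def vec(M):
    # Column-major (column-stacking) vectorization, as used in the identities.
    return M.flatten(order='F')

# Identity 1: (A^T kron B) vec(X) = vec(B X A), for compatible rectangular matrices.
A = rng.standard_normal((4, 3))   # m x n
B = rng.standard_normal((5, 2))   # p x q
X = rng.standard_normal((2, 4))   # q x m, so that B X A is defined
lhs1 = np.kron(A.T, B) @ vec(X)
rhs1 = vec(B @ X @ A)
print(np.allclose(lhs1, rhs1))    # True

# Identity 2: tr(X^T A X B^T) = vec(X)^T (B kron A) vec(X), for square matrices.
n = 3
A2 = rng.standard_normal((n, n))
B2 = rng.standard_normal((n, n))
X2 = rng.standard_normal((n, n))
lhs2 = np.trace(X2.T @ A2 @ X2 @ B2.T)
rhs2 = vec(X2) @ np.kron(B2, A2) @ vec(X2)
print(np.allclose(lhs2, rhs2))    # True
```

The column-major convention matters: with row-major flattening the first identity would instead read (B ⊗ Aᵀ) on the left-hand side.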