The set of solutions of random XORSAT formulae

The XOR-satisfiability (XORSAT) problem requires finding an assignment of $n$ Boolean variables that satisfy $m$ exclusive OR (XOR) clauses, whereby each clause constrains a subset of the variables. We consider random XORSAT instances, drawn uniforml…

Authors: Morteza Ibrahimi, Yash Kanoria, Matt Kraning, Andrea Montanari

The Annals of Applied Probability, 2015, Vol. 25, No. 5, 2743–2808. DOI: 10.1214/14-AAP1060. © Institute of Mathematical Statistics, 2015.

By Morteza Ibrahimi, Yash Kanoria, Matt Kraning and Andrea Montanari
Urban Engines, Columbia Business School, Qadium, Inc. and Stanford University

Received February 2012; revised August 2014.
Footnote 1 (to the title). Supported by NSF Grants CCF-0743978, CCF-0915145, DMS-08-06211 and a Terman fellowship.
Footnote 2 (to Y. Kanoria). Supported by a 3Com Corporation Stanford Graduate Fellowship.
AMS 2000 subject classifications. Primary 68Q87; secondary 82B20.
Key words and phrases. Random constraint satisfaction problem, clustering of solutions, phase transition, random graph, local weak convergence, belief propagation.

The XOR-satisfiability (XORSAT) problem requires finding an assignment of $n$ Boolean variables that satisfy $m$ exclusive OR (XOR) clauses, whereby each clause constrains a subset of the variables. We consider random XORSAT instances, drawn uniformly at random from the ensemble of formulae containing $n$ variables and $m$ clauses of size $k$. This model presents several structural similarities to other ensembles of constraint satisfaction problems, such as $k$-satisfiability ($k$-SAT), hypergraph bicoloring and graph coloring. For many of these ensembles, as the number of constraints per variable grows, the set of solutions shatters into an exponential number of well-separated components. This phenomenon appears to be related to the difficulty of solving random instances of such problems.

We prove a complete characterization of this clustering phase transition for random $k$-XORSAT. In particular, we prove that the clustering threshold is sharp and determine its exact location. We prove that the set of solutions has large conductance below this threshold and that each of the clusters has large conductance above the same threshold.

Our proof constructs a very sparse basis for the set of solutions (or the subset within a cluster). This construction is intimately tied to the construction of specific subgraphs of the hypergraph associated with an instance of $k$-XORSAT. In order to study such subgraphs, we establish novel local weak convergence results for them.

1. Introduction. An instance of XOR-satisfiability (XORSAT) is specified by an integer $n$ (the number of variables) and by a set of $m$ clauses of the form $x_{i_a(1)} \oplus \cdots \oplus x_{i_a(k)} = b_a$ for $a \in [m] \equiv \{1, \dots, m\}$. Here, $\oplus$ denotes modulo-2 sum, $b = (b_1, \dots, b_m)$ is a Boolean vector, $b_a \in \{0,1\}$, specified by the problem instance, and $x = (x_1, \dots, x_n)$ is a vector of Boolean variables $x_i \in \{0,1\}$ that must be chosen to satisfy the clauses.

Standard linear algebra methods allow us to determine whether a given XORSAT instance admits a solution, to find a solution, and even to count the number of solutions, all in polynomial time. In this paper, we shall be interested in the structural properties of the set of solutions $\mathcal{S} \subseteq \{0,1\}^n$ of a random $k$-XORSAT formula. More explicitly, we consider a random XORSAT instance $I$ that is drawn uniformly at random within the set $\mathbb{G}(n,k,m)$ of instances with $m$ clauses over $n$ variables, whereby each clause involves exactly $k$ variables. The set of solutions $\mathcal{S} = \mathcal{S}(I)$ is then defined as the set of binary vectors $x$ that satisfy all $m$ clauses.
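To make the remark about standard linear algebra methods concrete, here is a minimal sketch of Gaussian elimination over $\mathrm{GF}[2]$ on a XORSAT instance. The function name `solve_xorsat` and the bitmask row encoding are illustrative assumptions, not part of the paper; when the instance is satisfiable, the number of solutions is $2^{n - \mathrm{rank}}$.

```python
def solve_xorsat(n, clauses):
    """Solve a XORSAT instance by Gaussian elimination over GF(2).

    clauses: iterable of (variables, b), meaning XOR of x[i] for i in
    variables equals b. (Hypothetical encoding chosen for this sketch.)
    Returns None if unsatisfiable, else (x, rank), where x is one solution
    with all free variables set to 0; there are 2 ** (n - rank) solutions.
    """
    pivots = {}                      # highest variable bit -> reduced row
    var_mask = (1 << n) - 1          # bits 0..n-1 hold variables, bit n the rhs
    for variables, b in clauses:
        row = (b & 1) << n
        for i in variables:
            row ^= 1 << i
        while row & var_mask:
            p = (row & var_mask).bit_length() - 1
            if p not in pivots:
                pivots[p] = row      # new independent equation
                break
            row ^= pivots[p]         # eliminate the leading variable
        else:
            # Variable part reduced to zero: the equation reads 0 = rhs.
            if row >> n:
                return None
    x = [0] * n
    for p in sorted(pivots):         # back-substitute, lowest pivot first
        row = pivots[p]
        val = (row >> n) & 1
        lower = row & ((1 << p) - 1)
        while lower:
            j = lower.bit_length() - 1
            val ^= x[j]
            lower ^= 1 << j
        x[p] = val
    return x, len(pivots)
```

Each equation is reduced against at most $n$ pivot rows, so the running time is polynomial in $n$ and $m$, in line with the statement above.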
Since $I$ is a random formula, $\mathcal{S}$ is a random subset of the Hamming hypercube. The structural properties of $\mathcal{S}$ are of interest for several reasons. First of all, linear systems over finite fields are combinatorial objects that emerge naturally in a number of fields. Dietzfelbinger and collaborators [18] use a mapping between XORSAT and the matching problem to establish tight thresholds for the performance of Cuckoo Hashing, an archetypal load balancing scheme. Such thresholds are computed by determining thresholds above which the set of solutions $\mathcal{S}$ of a random XORSAT formula becomes empty. The existence of solutions is in turn related to the existence of an even-degree subgraph in a random hypergraph. Random sparse linear systems over finite fields are used to construct capacity-achieving error correcting codes [27, 28, 36]. The decodability of such codes is related to the emergence of a nontrivial 2-core in the same random hypergraph, a phenomenon that will play a crucial role in the following. Finally, structured linear systems over finite fields are generated by popular factoring algorithms [25].

In the present paper, we are also motivated by the close analogy between random $k$-XORSAT and other random ensembles of constraint satisfaction problems (CSPs). The prototypical example of this family is random $k$-satisfiability ($k$-SAT). The random $k$-SAT ensemble can be described in complete analogy to random $k$-XORSAT with the modification of replacing exclusive OR clauses by OR clauses among variables or their negations. Namely, in $k$-SAT each clause takes the form $(x'_{i_a(1)} \vee \cdots \vee x'_{i_a(k)})$, whereby $x'_{i_a(\ell)} = x_{i_a(\ell)}$ or $x'_{i_a(\ell)} = \overline{x}_{i_a(\ell)}$.
An extensive literature [1, 3, 26, 29, 30, 33] provides strong support for the existence of two sharp thresholds in random $k$-SAT, as the number of clauses per variable $\alpha = m/n$ grows. First, as $\alpha$ crosses a "satisfiability threshold" $\alpha_s(k)$, random $k$-SAT formulae pass from being with high probability (w.h.p., i.e., with probability converging to 1 as $n \to \infty$) satisfiable [for $\alpha < \alpha_s(k)$] to being w.h.p. unsatisfiable [for $\alpha > \alpha_s(k)$]. For any $\alpha < \alpha_s(k)$ the set of solutions is therefore nonempty. However, it undergoes a dramatic structural change as $\alpha$ crosses a second threshold $\alpha_d(k) < \alpha_s(k)$. While for $\alpha < \alpha_d(k)$, $\mathcal{S}$ is w.h.p. "well connected" (more precise definitions will be given below), for $\alpha \in [\alpha_d(k), \alpha_s(k)]$ it shatters into an exponential number of clusters. It has been argued that such a "clustered" structure of the space of solutions can have an intimate relation with the failure of standard polynomial time algorithms when applied to random formulae in this regime. The same scenario is thought to hold for a number of random constraint satisfaction problems (including, e.g., proper coloring of random graphs, bicoloring random hypergraphs, Not All Equal-SAT, etc.).

Unfortunately, this fascinating picture is so far only conjectural. Even the best understood element, namely the existence of a satisfiability threshold $\alpha_s(k)$, has not been established (with the exception of the special case $k = 2$). In an early breakthrough, Friedgut [21] used Fourier-analytic methods to prove the existence of a (possibly $n$-dependent) sequence of thresholds $\alpha_s(k; n)$. Proving that in fact this sequence can be taken to be $n$-independent is one of the most challenging open problems in probabilistic combinatorics and random graph theory.
Understanding the precise connection between clustering of the space of solutions and computational complexity is an even more daunting task.

Given such outstanding challenges, a fruitful line of research has pursued the analysis of somewhat simpler models. A very interesting possibility is to study $k$-SAT formulae for large but still bounded values of $k$. As explained in [1], each SAT clause eliminates only one binary assignment of its $k$ variables, out of $2^k$ possible assignments of the same variables. Hence, for $k$ large, a single clause has a small effect on the set of solutions, and most binary vectors are satisfying unless the formula includes about $2^k$ clauses per variable. This results in an "averaging" effect and suitable moment methods provide asymptotically sharp results for large $k$. In particular, Achlioptas and Peres [4] proved upper and lower bounds on $\alpha_s(k)$ that become asymptotically equivalent (i.e., whose ratio converges to 1) as $k$ gets large. Achlioptas and Coja-Oghlan [1, 2] proved that clustering indeed takes place in an interval of values of $\alpha$ below the satisfiability threshold and obtained upper and lower bounds on the corresponding threshold $\alpha_d(k)$ that are asymptotically equivalent for large $k$. Finally, Coja-Oghlan [13] proved that solutions can be found w.h.p. in polynomial time for any $\alpha < \alpha_{d,\mathrm{alg}}(k)$, whereby $\alpha_{d,\mathrm{alg}}(k)$ is asymptotically equivalent to $\alpha_d(k)$ for large $k$. Intriguingly, no algorithm is known that can provably find solutions in polynomial time for $\alpha \in ((1+\delta)\alpha_d(k), \alpha_s(k))$, for any $\delta > 0$, and all $k \ge 3$.

XORSAT is a very different example on which rigorous mathematical analysis proved possible, thus providing precious complementary insights.
The key simplification is that the set of solutions $\mathcal{S}$ is, in this case, an affine subspace of the Hamming hypercube $\{0,1\}^n$ (viewed as a vector space over $\mathrm{GF}[2]$). This implies a high degree of symmetry that can be exploited to obtain very sharp characterizations for large $n$, and any $k$ (we assume throughout that $k \ge 3$, since 2-XORSAT is significantly simpler).

It was proved in [20] that, for $k = 3$, there exists an $n$-independent threshold $\alpha_s(k)$ such that a random $k$-XORSAT instance is w.h.p. satisfiable if $\alpha < \alpha_s(k)$ and unsatisfiable if $\alpha > \alpha_s(k)$. The proof constructs a subformula, by considering the 2-core of the hypergraph associated with the XORSAT instance. One can then prove that the original formula is satisfiable if and only if the 2-core subformula is. The threshold for the latter can be determined exactly using the second moment method. The proof was extended to all $k \ge 4$ in [18].

The existence of a 2-core in a random XORSAT formula has a sharp threshold when the number of clauses per variable $\alpha$ crosses a value $\alpha_{\mathrm{core}}(k)$. This was argued to be intimately related to the appearance of clusters. In particular, [12, 31] give an argument (see footnote 3) showing that, above $\alpha_{\mathrm{core}}(k)$, the space of solutions shatters into exponentially many clusters. In other words, $\alpha_{\mathrm{core}}(k)$ is an upper bound on the clustering threshold. [31] further shows that, for $\alpha < \alpha_{\mathrm{core}}(k)$, a particular coordinate of a solution can be changed by changing $O(1)$ other variables on average, without leaving the space of solutions. If this argument is pushed a step further, one can show that, w.h.p., any coordinate can be changed by flipping at most $O(\log n)$ other coordinates.
This suggests that it may be possible to concatenate a sequence of such flips to connect any two solutions via a path through the solution space, with $O(\log n)$ steps. However, the analysis of [31] does not imply that this is the case, as it does not address the main challenge, namely to construct a path from any solution to any other solution. In this work, we solve this problem and provide the first proof of a lower bound of $\alpha_{\mathrm{core}}(k)$ on the clustering threshold $\alpha_d$, thus establishing that indeed $\alpha_d(k) = \alpha_{\mathrm{core}}(k)$. For $\alpha > \alpha_d(k)$ we prove a sharp characterization of the decomposition into clusters.

As mentioned above, random $k$-XORSAT formulae can be solved in polynomial time using linear algebra methods, and this appears to be insensitive to the clustering threshold. Nevertheless, an intriguing algorithmic phase transition might take place exactly at the clustering threshold $\alpha_d(k)$. For any $\alpha < \alpha_d(k)$, solutions can be found in time linear in the number of variables (the algorithm is in fact an important component of our proof). On the other hand, no algorithm is known that finds a solution in linear time for $\alpha \in (\alpha_d(k), \alpha_s(k))$. We think that our proof sheds some light on this phenomenon.

Footnote 3. The argument of [12, 31] is essentially rigorous, but does not deal with several technical steps.

1.1. Main result. In this paper, we obtain two sharp results characterizing the clustering phase transition for random $k$-XORSAT:

(i) We exactly determine the clustering threshold $\alpha_d(k)$, proving that the space of solutions is w.h.p. well connected for $\alpha < \alpha_d(k)$, and instead shatters into exponentially many clusters for $\alpha \in (\alpha_d(k), \alpha_s(k))$.
(ii) We determine the exponential growth rate of the number of clusters, that is, we show that this is w.h.p. $\exp\{n\Sigma(\alpha;k) + o(n)\}$ where $\Sigma(\alpha;k)$ is a nonrandom function which is explicitly given. We prove that each of the clusters is itself "well connected."

This is therefore the first random CSP ensemble for which a sharp threshold for clustering is proved. Earlier literature fell short of establishing (i) since it did not provide any argument for connectedness below $\alpha_d(k)$. Also, informal calculations only suggested a lower bound on the number of clusters, but did not establish (ii) since they did not prove connectedness of each cluster by itself. The situation is akin to the analysis of Markov Chain Monte Carlo methods: It is often significantly more challenging to prove rapid mixing (connectedness of the space of configurations) than the opposite (i.e., to find bottlenecks).

One important novelty is that the notion of connectedness used here is very strong and goes beyond path connectivity, which was used earlier for $k$-SAT [1, 2]. We use a properly defined notion of conductance which we think can be applied to a broader set of CSPs, and has the advantage of being closely related to important algorithmic notions (fast mixing for MCMC and expansion). Given a subset of the hypercube $\mathcal{S} \subseteq \{0,1\}^n$, and a positive integer $\ell$, we define the conductance of $\mathcal{S}$ as follows. Construct the graph $G(\mathcal{S},\ell)$ with vertex set $\mathcal{S}$ and an edge connecting $x, x' \in \mathcal{S}$ if and only if $d(x,x') \le \ell$ [here and below, $d(\cdot,\cdot)$ denotes the Hamming distance, i.e., $d(x,x') = |\{i : 1 \le i \le n,\ x_i \ne x'_i\}|$, where $x = (x_1, x_2, \dots, x_n)$ and similarly for $x'$, and $|B|$ denotes the cardinality of the set $B$].
Then we define the $\ell$th conductance of $\mathcal{S}$ as the graph conductance of $G(\mathcal{S},\ell)$, namely

$$\Phi(\mathcal{S};\ell) \equiv \min_{A \subseteq \mathcal{S}} \frac{\mathrm{cut}_{G(\mathcal{S},\ell)}(A, \mathcal{S} \setminus A)}{\min(|A|, |\mathcal{S} \setminus A|)}, \qquad (1)$$

where, for a graph $G = (V,E)$, and any $B \subseteq V$, we define $\mathrm{cut}_G(B, V \setminus B) \equiv |\{e \in E : \text{exactly one of the two endpoints of } e \text{ is in } B\}|$. Notice that we measure the volume of a set by the number of its vertices instead of the sum of its degrees (see footnote 4).

Footnote 4. This difference is irrelevant for $\alpha < \alpha_d(k)$ since in this case $\mathcal{S}$ will be taken to be an affine subspace of $\{0,1\}^n$, and hence $G(\mathcal{S},\ell)$ will be a regular graph. For $\alpha \in (\alpha_d(k), \alpha_s(k))$, $\mathcal{S}$ will be constructed as the union of a "small" number of affine spaces, and hence $G(\mathcal{S},\ell)$ should be approximately regular. We keep the definition (1) since it simplifies our statements.

We define the distance between two subsets of the hypercube $\mathcal{S}_1, \mathcal{S}_2 \subseteq \{0,1\}^n$ as

$$d(\mathcal{S}_1, \mathcal{S}_2) \equiv \min_{x \in \mathcal{S}_1,\ x' \in \mathcal{S}_2} d(x, x').$$

For our statements, $k \ge 3$ is always fixed, together with a sequence $m(n) = \alpha n$. We say that a sequence of events $(E_n)_{n>0}$ occurs with high probability (w.h.p.) if $\lim_{n\to\infty} \mathbb{P}(E_n) = 1$. (We refer to Section 2 for a formal definition of the underlying probability space.)

Theorem 1. Let $\mathcal{S}$ be the set of solutions of a random $k$-XORSAT formula with $n$ variables and $m = n\alpha$ clauses. For any $k \ge 3$, let $\alpha_d(k)$ be defined as

$$\alpha_d(k) \equiv \sup\{\alpha \in [0,1] : z > 1 - e^{-k\alpha z^{k-1}},\ \forall z \in (0,1)\}. \qquad (2)$$

1. If $\alpha < \alpha_d(k)$, there exists $C = C(\alpha,k) < \infty$ such that, w.h.p., $\Phi(\mathcal{S}; (\log n)^C) \ge 1/2$.
2. If $\alpha \in (\alpha_d(k), \alpha_s(k))$, then there exists $\varepsilon = \varepsilon(k;\alpha) > 0$ such that, w.h.p., $\Phi(\mathcal{S}; n\varepsilon) = 0$.
3. If $\alpha \in (\alpha_d(k), \alpha_s(k))$, and $\delta > 0$ is arbitrary, then there exist constants $C = C(\alpha,k) < \infty$, $\varepsilon = \varepsilon(\alpha,k) > 0$, $\Sigma = \Sigma(\alpha,k) > 0$, and a partition of the set of solutions $\mathcal{S} = \mathcal{S}_1 \cup \cdots \cup \mathcal{S}_N$, such that, w.h.p., the following properties hold:
(a) For each $a \in [N]$, we have $\Phi(\mathcal{S}_a; (\log n)^C) \ge 1/2$.
(b) For each $a \ne b \in [N]$, we have $d(\mathcal{S}_a, \mathcal{S}_b) \ge n\varepsilon$.
(c) $\exp\{n(\Sigma - \delta)\} \le N \le \exp\{n(\Sigma + \delta)\}$.

Further, letting $Q$ be the largest positive solution of $Q = 1 - \exp\{-k\alpha Q^{k-1}\}$ and $\hat{Q} \equiv Q^{k-1}$, we have

$$\Sigma(\alpha,k) = Q - k\alpha\hat{Q} + (k-1)\alpha Q\hat{Q}.$$

1.2. Conductance and sparse basis. We will prove Theorem 1 by obtaining a fairly complete description of the set $\mathcal{S}$ both above and below $\alpha_d(k)$. In a nutshell, for $\alpha < \alpha_d(k)$, $\mathcal{S}$ admits a sparse basis, while for $\alpha > \alpha_d(k)$ each of the clusters $\mathcal{S}_1, \dots, \mathcal{S}_N$ admits a sparse basis but their union does not. This is particularly suggestive of the connection between the clustering phase transitions and algorithm performance. Below $\alpha_d(k)$ the space of solutions admits a succinct explicit representation [in $O(n(\log n)^C)$ bits]. Above $\alpha_d(k)$, we can produce a representation that is succinct but implicit (as solutions of a given formula), or explicit but prolix [no basis is known that can be encoded in $o(n^2)$ bits].

Given a linear subspace $\mathcal{S} \subseteq \{0,1\}^n$, we say that it admits an $s$-sparse basis if there exist vectors $x^{(l)} \in \mathcal{S}$ for $l \in \{1, \dots, D\}$ such that $d(x^{(l)}, 0) \le s$ and $(x^{(l)})_{l=1}^{D}$ form a basis for $\mathcal{S}$. The latter means that the vectors are linearly independent and $\mathcal{S} = \{\sum_{l=1}^{D} a_l x^{(l)} : (a_l)_{l=1}^{D} \in \{0,1\}^D\}$.

We say that an affine space $\mathcal{S} \subseteq \{0,1\}^n$ admits an $s$-sparse basis if, for $x^{(0)} \in \mathcal{S}$, the linear subspace $\mathcal{S} - x^{(0)}$ admits an $s$-sparse basis. The property of having a sparse basis indeed implies large conductance. The proof is immediate.

Lemma 1.1.
If the affine subspace $\mathcal{S} \subseteq \{0,1\}^n$ admits an $s$-sparse basis, then $\Phi(\mathcal{S}; s) \ge 1/2$. Vice versa, assume that $\Phi(\mathcal{S}; s) = 0$. Then $\mathcal{S}$ does not admit an $s$-sparse basis.

Proof. We can assume, without loss of generality, that $\mathcal{S}$ is a linear space. Let $d$ be its dimension. Further, given a graph $G$, let, with a slight abuse of notation,

$$\Phi(G) \equiv \min_{A \subseteq \mathcal{S}} \frac{\mathrm{cut}_G(A, \mathcal{S} \setminus A)}{\min(|A|, |\mathcal{S} \setminus A|)}, \qquad (3)$$

so that $\Phi(\mathcal{S};\ell) = \Phi(G(\mathcal{S},\ell))$.

Assume that $\mathcal{S}$ admits an $s$-sparse basis. This immediately implies that the graph $G(\mathcal{S}, s)$ contains a spanning subgraph that is isomorphic to the $d$-dimensional hypercube $H_d$. Further, $G \mapsto \Phi(G)$ is monotone increasing in the edge set of $G$. Therefore, $\Phi(\mathcal{S}; s) \ge \Phi(H_d) \ge 1/2$, where the last inequality follows from the standard isoperimetric inequality on the hypercube [23]. □

The characterization of the solution space in terms of sparsity of its basis is given below.

Theorem 2. Let $\mathcal{S}$ be the set of solutions of a random $k$-XORSAT formula with $n$ variables and $m = n\alpha$ clauses. For any $k \ge 3$, let $\alpha_d(k)$ be defined as per equation (2). Then the following hold:

1. If $\alpha < \alpha_d(k)$, there exists $C = C(\alpha,k) < \infty$ such that, w.h.p., $\mathcal{S}$ admits a $(\log n)^C$-sparse basis.
2. If $\alpha \in (\alpha_d(k), \alpha_s(k))$, and $\delta > 0$ is arbitrary, then there exist constants $C = C(\alpha,k) < \infty$, $\varepsilon = \varepsilon(\alpha,k) > 0$, $\Sigma = \Sigma(\alpha,k) > 0$, and a partition of the set of solutions $\mathcal{S} = \mathcal{S}_1 \cup \cdots \cup \mathcal{S}_N$, such that, w.h.p., the following properties hold:
(a) For each $a \in [N]$, $\mathcal{S}_a$ admits a $(\log n)^C$-sparse basis.
(b) For each $a \ne b \in [N]$, we have $d(\mathcal{S}_a, \mathcal{S}_b) \ge n\varepsilon$.
(c) $\exp\{n(\Sigma - \delta)\} \le N \le \exp\{n(\Sigma + \delta)\}$.

Further, $\Sigma$ is given by the same expression given in Theorem 1.

Clearly, this theorem immediately implies Theorem 1 by applying Lemma 1.1.
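As a numerical companion to equation (2) and to the expression for $\Sigma(\alpha,k)$ in Theorem 1, the sketch below evaluates both quantities. It relies on an elementary rewriting that is not spelled out in the text: the condition $z > 1 - e^{-k\alpha z^{k-1}}$ for all $z \in (0,1)$ is equivalent to $\alpha < -\log(1-z)/(k z^{k-1})$ for all such $z$, so $\alpha_d(k)$ is the infimum of that ratio, capped at 1. The function names, the grid search and the fixed-point iteration started at $Q = 1$ are illustrative choices, intended for moderate values of $k$.

```python
import math


def alpha_d(k, grid=200_000):
    """Approximate the clustering threshold of equation (2).

    Uses alpha_d(k) = inf over z in (0,1) of -log(1-z) / (k * z**(k-1)),
    evaluated on a uniform grid of midpoints (an assumption of this sketch).
    """
    best = min(
        -math.log1p(-z) / (k * z ** (k - 1))
        for z in ((i + 0.5) / grid for i in range(grid))
    )
    return min(1.0, best)            # equation (2) restricts alpha to [0, 1]


def largest_Q(alpha, k, iters=100_000, tol=1e-14):
    """Largest solution in [0,1] of Q = 1 - exp(-k*alpha*Q**(k-1)).

    Iterating the monotone map from Q = 1 decreases to the largest fixed point.
    """
    q = 1.0
    for _ in range(iters):
        q_new = 1.0 - math.exp(-k * alpha * q ** (k - 1))
        if abs(q_new - q) < tol:
            return q_new
        q = q_new
    return q


def sigma(alpha, k):
    """Growth rate Sigma(alpha, k) = Q - k*alpha*Qhat + (k-1)*alpha*Q*Qhat."""
    q = largest_Q(alpha, k)
    q_hat = q ** (k - 1)
    return q - k * alpha * q_hat + (k - 1) * alpha * q * q_hat
```

For instance, `alpha_d(3)` evaluates to roughly 0.818, and `sigma(0.9, 3)` is a small positive number, consistent with an exponential number of clusters just above the threshold.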
The rest of this paper is devoted to the proof of Theorem 2.

1.3. Further technical contributions. To a given XORSAT instance $I$, we can associate a bipartite graph ("factor graph") with vertex sets $F$ (factor or check nodes) corresponding to equations, and $V$ (variable nodes) corresponding to variables. The edge set $E$ includes those pairs $(a,i) \in F \times V$ such that variable $x_i$ participates in the $a$th equation.

The construction of the sparse basis in Theorem 2 relies heavily on a characterization of the random factor graph associated to a random XORSAT instance. This could be gleaned from the proofs of [18, 20] that construct the 2-core of $G$. In order to prove Theorem 2, we characterize a larger subgraph that we refer to as the backbone of $G$. This subgraph has the following interpretation: if two solutions $x$ and $x'$ coincide on the core, then they coincide on every vertex of the backbone.

The 2-core of the random graph $G$ was studied in a number of papers [14, 27, 32, 35]. The key tool in these works is the analysis of an iterative procedure that constructs the 2-core in $\Theta(n)$ iterations. This procedure has an important property: At each step, the resulting graph remains uniformly random, given a small number of parameters (essentially, its degree distribution). Thanks to this property, the analysis of [14, 27, 32, 35] is reduced to the study of a Markov chain in $\mathbb{Z}^2$. This is done by showing that sample paths of this chain concentrate around solutions of a certain ordinary differential equation.

Our analysis of the backbone has a similar starting point, namely the study of an iterative procedure that constructs the backbone (indeed we define formally the backbone as the fixed point of this procedure).
Unfortunately, the graphs generated by this procedure are not uniformly random, conditional on a small number of parameters. Hence, the techniques of [14, 27, 32, 35] do not apply. We overcome this difficulty by characterizing the large-$n$ limit of its fixed point using the theory of local weak convergence. This is in turn challenging because the fixed point is not, a priori, a local function of $G$. We consider this characterization of the backbone, and its proof, to be a contribution of independent interest.

For describing the iterative procedure, we use the language of message passing algorithms, and will refer to it as "belief propagation" (BP), as the same algorithm is also of interest in iterative coding; see [29, 36]. Given a factor graph $G = (F,V,E)$, the algorithm updates $2|E|$ messages indexed by directed edges in $G$. In other words, for each $(a,v) \in E$, $a \in F$ and $v \in V$, and any iteration number $t \in \mathbb{N}$, we have two messages $\nu^t_{v \to a}$ and $\hat{\nu}^t_{a \to v}$, taking values in $\{0, *\}$. For $t \ge 1$, messages are computed following the update rules:

$$\nu^t_{v \to a} = \begin{cases} *, & \text{if } \hat{\nu}^{t-1}_{b \to v} = * \text{ for all } b \in \partial v \setminus a, \\ 0, & \text{otherwise,} \end{cases} \qquad (4)$$

and

$$\hat{\nu}^t_{a \to v} = \begin{cases} 0, & \text{if } \nu^t_{u \to a} = 0 \text{ for all } u \in \partial a \setminus v, \\ *, & \text{otherwise.} \end{cases} \qquad (5)$$

We call this algorithm $\mathrm{BP}_0$ when all messages are initialized to 0: $\nu^0_{v \to a} = \hat{\nu}^0_{a \to v} = 0$ for all $(a,v) \in E$. It is not hard to see that $\mathrm{BP}_0$ is monotone (see footnote 5), in the sense that messages only change from 0 to $*$, and hence converges to a fixed point $\nu^*_{v \to a}$. It is easy to check (see Lemma 4.3 below) that the core of $G$ coincides with the subgraph induced by the factor nodes that receive no $*$ message at the fixed point of $\mathrm{BP}_0$. The backbone is instead the subgraph induced by factor nodes that receive at most one $*$ message at the fixed point.
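The update rules (4) and (5) and the resulting description of the core and backbone can be sketched as follows. This is only an illustration on a small factor graph: the function name `bp0_fixed_point`, the representation of the graph as a list of clauses (tuples of distinct variable indices) and the encoding of $*$ as `True` are assumptions made here, and the loop is a direct synchronous iteration rather than an optimized implementation.

```python
def bp0_fixed_point(clauses):
    """Iterate BP_0 (rules (4), (5), all messages initialised to 0) to a fixed point.

    clauses[a] is the tuple of distinct variable indices in equation a.
    Messages are booleans: True encodes '*', False encodes 0.
    Returns (core, backbone): the sets of factor indices receiving no '*'
    message and at most one '*' message, respectively.
    """
    nbrs = {}                                    # variable -> adjacent factors
    for a, vs in enumerate(clauses):
        for v in vs:
            nbrs.setdefault(v, []).append(a)
    nu = {(v, a): False for a, vs in enumerate(clauses) for v in vs}      # v -> a
    nu_hat = {(a, v): False for a, vs in enumerate(clauses) for v in vs}  # a -> v
    while True:
        # Rule (4): v -> a is '*' iff every other factor b sends '*' to v
        # (vacuously true when v has no other neighbour).
        new_nu = {(v, a): all(nu_hat[(b, v)] for b in nbrs[v] if b != a)
                  for (v, a) in nu}
        # Rule (5): a -> v is 0 iff every other variable u sends 0 to a.
        new_hat = {(a, v): any(new_nu[(u, a)] for u in clauses[a] if u != v)
                   for (a, v) in nu_hat}
        if new_nu == nu and new_hat == nu_hat:
            break
        nu, nu_hat = new_nu, new_hat
    stars = [sum(nu[(v, a)] for v in vs) for a, vs in enumerate(clauses)]
    core = {a for a, s in enumerate(stars) if s == 0}
    backbone = {a for a, s in enumerate(stars) if s <= 1}
    return core, backbone
```

On the toy instance `[(0,1,2), (0,1,3), (0,2,3), (1,2,3), (3,4,5), (0,1,6)]`, the first four equations form the core; the equation `(0,1,6)` joins the backbone because variable 6 is determined by core variables, while `(3,4,5)` receives two $*$ messages and belongs to neither.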
Footnote 5. This can be established by induction: Since we start with all 0s, clearly messages can only change from 0 to $*$ in the first iteration. Thereafter, this holds inductively for each subsequent iteration since each of the update rules is monotone in the sense that if the incoming messages only change from 0 to $*$, then the same holds for the outgoing messages.

Denote by $\tilde{\mu}^*_n$ the probability distribution on rooted factor graphs with marks on the edges constructed as follows. Draw a graph uniformly at random from $\mathbb{G}(n,k,m)$. Choose a uniformly random variable node $i \in V$ as the root. Mark the edges (in each direction) with the messages corresponding to the $\mathrm{BP}_0$ fixed point $\nu^*_{v \to a}$.

We next construct a random tree $\tilde{T}^*(\alpha,k)$ with marks on the directed edges as follows. Marks take values in $\{0, *\}$ and to each undirected edge we associate a mark for each of the two directions. We will refer to the direction toward the root as the "upward" direction, and to the opposite one as the "downward" direction. The marks correspond to fixed-point BP messages, and we will call them messages as well in what follows.

First, consider only edges directed upward. This is a multitype Galton–Watson (GW) tree. At the root generate $\mathrm{Poisson}(k\alpha)$ offspring, and mark each of the edges 0 independently with probability $\hat{Q}$, and $*$ otherwise. At a nonroot variable node, if the parent edge is marked 0, generate $\mathrm{Poisson}(k\alpha(1-\hat{Q}))$ descendant edges marked $*$ and $\mathrm{Poisson}_{\ge 1}(k\alpha\hat{Q})$ descendant edges marked 0 [here $\mathrm{Poisson}_E(\lambda)$ denotes a Poisson random variable with parameter $\lambda$ conditional on $E$]. If the parent edge is marked $*$, generate $\mathrm{Poisson}(k\alpha(1-\hat{Q}))$ descendant edges marked $*$ and no descendant edges marked 0.
At a factor node, if the parent edge is marked 0, generate $k-1$ descendant edges marked 0. If the parent edge is marked $*$, generate $M \sim \mathrm{Binom}_{\le k-2}(k-1, Q)$ descendants marked 0, and $k-1-M$ descendants marked $*$.

For edges directed downward, marks are generated recursively following the usual BP rules, cf. equations (4), (5), starting from the top to the bottom. It is easy to check that with this construction, the marks in $\tilde{T}^*(\alpha,k)$ correspond to a BP fixed point.

Given a factor graph $G = (F,V,E)$, we use $B_G(v,t)$ to denote the ball of radius $t$ centered at node $v \in V$. This ball is defined inductively as follows: The ball $B_G(v,0)$ consists of node $v$ alone and no edges. For $t > 0$, the ball $B_G(v,t)$ includes $B_G(v,t-1)$. In addition, it includes all factor nodes connected to variable nodes in $B_G(v,t-1)$ and associated edges, and all variable nodes connected to those factor nodes and associated edges. [Thus, $B_G(v,t)$ includes nodes and edges up to a distance $t$ from $v$, where variable nodes are said to be separated by distance 1 if they are connected to the same factor node.]

Definition 1.2. Let $\{G_n\}$, $G_n = (F_n, V_n, E_n)$ be a sequence of (random) factor graphs. Let $\mu^{(t)}_n$ denote the empirical probability distribution of $B_{G_n}(v,t)$ when $v \in V_n$ is uniformly random. Explicitly, for any locally finite rooted graph $T_0$ of depth at most $t$,

$$\mu^{(t)}_n(T_0) \equiv \frac{1}{n} \sum_{v \in V_n} \mathbb{I}(B_{G_n}(v,t) \simeq T_0), \qquad (6)$$

(with $\simeq$ denoting equality up to graph vertex relabeling). We say that $\{G_n\}$ converges locally almost surely to the measure $\mu$ on rooted graphs if, for any finite $t$, and any locally finite rooted graph $T_0$ of depth at most $t$, the limit

$$\lim_{n \to \infty} \mu^{(t)}_n(T_0) = \mu^{(t)}(T_0) \qquad (7)$$

holds almost surely with respect to the graph law.
Here, $\mu^{(t)}$ denotes the marginal of $\mu$ with respect to a ball of radius $t$ around the root.

The same notion of local graph convergence was used earlier in the literature, for instance, in [15–17]. Given a random graph distribution, we first draw a sequence $\{G_n\}_{n \ge 1}$, and then check that $\mu^{(t)}_n(T_0) \to \mu^{(t)}(T_0)$ with probability one. It is worth emphasizing the difference from a weaker notion (that we never use below), whereby we only check $\mathbb{E}_{G_n} \mu^{(t)}_n(T_0) \to \mu^{(t)}(T_0)$, with $\mathbb{E}_{G_n}$ denoting expectation with respect to the graph distribution. In particular, establishing almost sure local graph convergence is more challenging than proving convergence of the expectation $\mathbb{E}_{G_n} \mu^{(t)}_n(T_0)$ since it requires controlling the deviations of the subgraph counts $\mu^{(t)}_n(T_0)$. With this clarification, we shall occasionally drop the "almost surely" in "converges locally almost surely."

As part of our proof of Theorem 2, we obtain the following result, which may be of independent interest. (We refer to the next section for a complete definition of the underlying probability space.)

Theorem 3. The sequence $\{\tilde{\mu}^*_n\}_{n \ge 0}$ converges locally almost surely to the probability distribution of $\tilde{T}^*$.

Theorem 3 is proved in Section 4. Besides this, our proof uses several other ideas:

• We show that Theorem 3 can be used to extend the low weight core solutions to low weight solutions of the whole XORSAT instance (see Section 8).
• We show that the periphery (the complement of the core in $G$) is uniformly random with a given degree sequence, conditioned on being "peelable." We estimate precisely this degree distribution, and show that the periphery is indeed peelable with positive probability for that degree sequence (see Section 6).
• In addition to the fixed point characterization, we obtain a precise characterization of the convergence rate of $\mathrm{BP}_0$ (see Section 4), which allows us to bound the sparsity of the basis constructed.
• For $\alpha > \alpha_d$, convergence to the BP fixed point is geometric rather than quadratic. In this regime, we show that in fact there are "strings" of degree 2 variable nodes that slow down convergence but do not prevent the construction of a sparse basis. We bound the sparsity by defining a certain "collapse" operator on such strings (see Section 5).

1.4. Outline of the paper. In Section 2, we define some basic concepts and notation. Section 3 describes the construction of clusters and sparse bases, and uses this construction to prove Theorem 2. Several basic lemmas necessary for the proof are stated in this section.

Section 4 introduces a certain belief propagation (BP) algorithm and a technical tool called density evolution, that play a key role in our analysis: The BP algorithm naturally decomposes the linear system into a "backbone" (consisting roughly of the 2-core and the variables implied by it) and a "periphery." Density evolution allows us to track the progress of BP, eventually facilitating a tight characterization of basic parameters (like number of nodes) of the backbone and periphery.

Section 5 bounds the number of iterations of a "peeling" algorithm (related to BP) that plays a key role in our construction of a sparse basis. Section 6 proves a sharp characterization of the periphery. Together, this yields the first (large) set of basis vectors.

Section 7 shows the 2-core has very few sparse solutions, leading to well separated, small, "core-clusters." Section 8 shows how to produce a sparse solution of the linear system corresponding to each sparse solution of the 2-core subsystem.
This yields the second (small) set of basis vectors in our construction. Several technical lemmas are deferred to the Appendices. A short version of this paper was presented at the ACM-SIAM Symposium on Discrete Algorithms (SODA) 2012.

2. Random $k$-XORSAT: Definitions and notation. As described in the Introduction, each $k$-XORSAT clause is actually a linear equation over GF[2]: $x_{i_a(1)} \oplus \cdots \oplus x_{i_a(k)} = b_a$, for $a \in [m] \equiv \{1,\dots,m\}$. Introducing a vector $h_a \in \{0,1\}^n$, with nonzero entries only at positions $i_a(1),\dots,i_a(k)$, this can be written as $h_a^{T} x = b_a$. Hence, an instance is completely specified by the pair $(H, b)$ where $H \in \{0,1\}^{m\times n}$ is a matrix with rows $h_1^{T},\dots,h_m^{T}$ and $b = (b_1,\dots,b_m)^{T} \in \{0,1\}^m$. The space of solutions is therefore $\mathcal{S} \equiv \{x \in \{0,1\}^n : Hx = b \bmod 2\}$. If $\mathcal{S}$ has at least one element $x^{(0)}$, then $\mathcal{S} \oplus x^{(0)}$ is just the set of solutions of the homogeneous linear system corresponding to $b = 0$ (the kernel of $H$).

In the following we shall always assume $\alpha < \alpha_s(k)$, so that $\mathcal{S}$ is nonempty w.h.p. [18]. Note that, if $\mathcal{S}$ is nonempty, then $\mathcal{S} = \mathcal{S}_0 \oplus x_0$ where $x_0 \in \mathcal{S}$ is any solution of the original system and $\mathcal{S}_0$ is the set of solutions of the homogeneous linear system $Hx = 0$. Since we are only interested in geometric properties of the set of solutions that are invariant under translation, we will assume hereafter that $b = 0$, and hence $\mathcal{S} = \mathcal{S}_0$.

An XORSAT instance is therefore completely specified by a binary matrix $H$, or equivalently by the corresponding factor graph $G = (F, V, E)$. This is a bipartite graph with two sets of nodes: $F$ (factor or check nodes) corresponding to rows of $H$, and $V$ (variable nodes) corresponding to columns of $H$. The edge set $E$ includes those pairs $(a, i)$, $a \in F$, $i \in V$ such that $H_{ai} = 1$.
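To make the correspondence between the matrix $H$ and the factor graph concrete, the following is a minimal sketch, not taken from the paper. It assumes that each row of $H$ has exactly $k$ ones at positions chosen uniformly among distinct variables; the function names and the seed parameter are hypothetical.

```python
import random


def random_xorsat_matrix(n, m, k, seed=0):
    """Sample an m x n binary matrix with exactly k ones per row.

    Assumption for illustration: each row picks k distinct columns
    uniformly at random, independently of the other rows.
    """
    rng = random.Random(seed)
    H = [[0] * n for _ in range(m)]
    for a in range(m):
        for i in rng.sample(range(n), k):
            H[a][i] = 1
    return H


def factor_graph(H):
    """Return (F, V, E): check nodes, variable nodes, edges (a, i) with H[a][i] == 1."""
    m, n = len(H), len(H[0])
    F = list(range(m))
    V = list(range(n))
    E = [(a, i) for a in F for i in V if H[a][i] == 1]
    return F, V, E


def is_solution(H, x):
    """Check H x = 0 mod 2, i.e. x is a solution of the homogeneous system."""
    return all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)
```

With $b = 0$ the all-zero vector is always a solution, so `is_solution(H, [0] * n)` holds for any sampled `H`.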
We denote by $\mathbb{G}(n,k,m)$ the set of all factor graphs with $n$ labeled variable nodes and $m$ labeled check nodes, each having degree exactly $k$ (with no double edges). Note that $|\mathbb{G}(n,k,m)| = \binom{n}{k}^{m}$. With a slight abuse of notation, we will denote by $\mathbb{G}(n,k,m)$ also the uniform distribution over this set, and write $G \sim \mathbb{G}(n,k,m)$ for a uniformly random such graph.

For $v \in V$ or $v \in F$, we denote by $\deg_G(v)$ the degree of node $v$ in graph $G$ (omitting the subscript when clear from the context) and we let $\partial v$ denote the set of neighbors of $v$. We define the distance with respect to $G$ between two variable nodes $i, j \in V$, denoted by $d_G(i,j)$, as the length of the shortest path from $i$ to $j$ in $G$, whereby the length of a path is the number of check nodes encountered along the path.

Given a vector $x$, we denote by $x_A = (x_i)_{i\in A}$ its restriction to $A$. The cardinality of set $A$ is denoted by $|A|$.

We only consider the "interesting" case $k \ge 3$, and the asymptotics $m, n \to \infty$ with $m/n \to \alpha$ and $\alpha \in [0, \alpha_s(k))$, where $\alpha_s(k)$ is the satisfiability threshold. Hence, $H$ has w.h.p. maximum rank, that is, $\operatorname{rank}(H) = m$ [29].

Definition 2.1. Let $F_0 \subseteq F$. The subgraph induced by $F_0$ is defined as $(F_0, V_0, E_0)$ where $V_0 \equiv \{i \in V : \partial i \cap F_0 \ne \emptyset\}$ and $E_0 \equiv \{(a,i) \in E : a \in F_0, i \in V_0\}$. A check-induced subgraph is the subgraph $(F_0, V_0, E_0)$ induced by some $F_0 \subseteq F$. Similarly, we can define the subgraph induced by $V_0 \subseteq V$, and variable-induced subgraphs. Let $F_0 \subseteq F$, $V_0 \subseteq V$. The subgraph induced by $(F_0, V_0)$ is defined as $(F_0, V_0, E_0)$ where $E_0 \equiv \{(a,i) \in E : a \in F_0, i \in V_0\}$.

Definition 2.2. A stopping set is a check-induced subgraph with the property that every variable node has degree larger than one with respect to the subgraph.
The 2-core of $G$ is its maximal stopping set. Notice that the maximal stopping set of $G$ is uniquely defined because the union of two stopping sets is a stopping set.

All of our statements are with respect to the following probability space, for a fixed $k \ge 3$, and an integer sequence $\{m(n)\}_{n\in\mathbb{N}}$. For each $n$, we let $m = m(n)$ and consider the finite set $\Omega_n = \mathbb{G}(n,k,m)$ of $k$-XORSAT instances with $n$ variables and $m$ clauses. Formally, each element of $\mathbb{G}(n,k,m)$ is given by a pair $(H, b)$ where $H \in \{0,1\}^{m\times n}$ is a matrix with $k$ nonzero elements per row and $b \in \{0,1\}^m$. [In the proofs, we shall occasionally replace $\mathbb{G}(n,k,m)$ by slightly different sets, defined therein, for technical convenience. The connection will be made clear.] Since $\Omega_n$ is finite, it is straightforward to endow it with the uniform probability measure $\mathbb{P}_n$ over the complete $\sigma$-algebra $2^{\Omega_n}$. The probability space underlying all of our statements is the product space $\Omega = \times_{n\in\mathbb{N}} \Omega_n$, with product probability measure $\mathbb{P} = \times_{n\in\mathbb{N}} \mathbb{P}_n$. An event $E \subseteq \Omega$ is an element of the product $\sigma$-algebra.

As a special example, let $f_n : \Omega_n \to \mathbb{R}$ be a sequence of functions, and $\omega = (\omega_i)_{i\in\mathbb{N}} \in \Omega$. Then existence of the limit $\lim_{n\to\infty} f_n(\omega_n)$ is a well defined event in $\Omega$. With a slight abuse of language, we will identify any set $E_n \subseteq \Omega_n$ with an event, namely with the cylindrical set $C(E_n) \equiv \{\omega = (\omega_i)_{i\in\mathbb{N}} \in \times_{i\in\mathbb{N}} \Omega_i : \omega_n \in E_n\}$. We will typically write $E_n$ for $C(E_n)$ and $\mathbb{P}(E_n) = \mathbb{P}(\{\omega_n \in E_n\})$ for the probability of such an event. We say that $E_n$ occurs with high probability (w.h.p.) if $\lim_{n\to\infty} \mathbb{P}(E_n) = 1$. We say that a sequence of events $(E_n)_{n>0}$ occurs eventually almost surely if $\lim_{n_0\to\infty} \mathbb{P}\bigl(\bigcap_{n\ge n_0} C(E_n)\bigr) = 1$.
Note that, with this probability space, the notion of local almost sure convergence in Definition 1.2 is well defined. Note that our main results (Theorems 1 and 2) are "with high probability results," and hence do not require the definition of a common probability space for different graph sizes. This is indeed mainly a matter of technical convenience (and is of course needed for Theorem 3).

A key fact to be used in the following is that a giant 2-core appears abruptly at $\alpha_d(k)$. Forms of the following statement appear in [14, 27, 32].

Theorem 4 ([14, 27, 32]). Assume $\alpha < \alpha_d(k)$. Then w.h.p., a graph $G \sim \mathbb{G}(n,k,m)$ does not contain any stopping set. Vice versa, assume $\alpha > \alpha_d(k)$. Then there exists $C(k) > 0$ such that, w.h.p., a graph $G$ drawn uniformly at random from $\mathbb{G}(n,k,m)$ contains a 2-core of size larger than $C(k)n$.

We will often refer to the depth-$t$ neighborhood of a node $v$ in $G$.

Definition 2.3. Given a node $v \in V$ and an integer $t$, let $V' = \{u : u \in V, d_G(u,v) \le t\}$. Then the ball of radius $t$ around node $v$ is defined as the (variable-induced) subgraph $B_G(v,t)$ induced by $V'$. With an abuse of notation, we will use the same notation for the set of variable nodes in $B_G(v,t)$. Lastly, we define $|B_G(v,t)|$ to be the number of variable nodes in the subgraph $B_G(v,t)$.

We will occasionally work with certain random infinite rooted factor graphs, with marks on the edges or vertices. (Note that a factor graph can be regarded as an ordinary graph, with additional marks on the vertices to distinguish "variable nodes" from "factor nodes.") A useful concept in this context is the one of "unimodular" random rooted graphs, that we briefly recall next. For a more complete presentation, we refer to the overview paper by Aldous and Lyons [5].
Informally, a random rooted (marked) graph is unimodular if it looks the same (in distribution), when the root is moved to any other vertex. In order to formalize this notion, we denote by $\mathcal{G}_*$ the space of locally finite rooted graphs, with marks on the vertices or edges (we assume marks to belong to some fixed finite set for simplicity). We view two graphs that differ by an isomorphism as identical. This space can be endowed with a metric that metrizes local convergence, and hence with a Borel $\sigma$-algebra. Analogously, we denote by $\mathcal{G}_{**}$ the space of doubly rooted graphs [a doubly rooted graph is a graph with two distinguished vertices, i.e., a triple $(G, u, v)$ where $G = (V, E)$ is a graph, and $u, v \in V$]. As for the simply rooted case, $\mathcal{G}_{**}$ can be made into a complete metric space; we regard it as a measurable space endowed with the Borel $\sigma$-algebra.

Table 1. Synchronous peeling algorithm

Synchronous Peeling (Graph $G = (F, V, E)$)
  $F' \leftarrow F$, $V' \leftarrow V$, $E' \leftarrow E$
  $J_0 \leftarrow (F, V, E)$, $t = 0$
  While $J_t$ has a variable node of degree $\le 1$ do
    $t \leftarrow t + 1$
    $V_t \leftarrow \{v \in V' : \deg_{J_{t-1}}(v) \le 1\}$
    $F_t \leftarrow \{a \in F' : (v,a) \in E' \text{ for some } v \in V_t\}$
    $E_t \leftarrow \{(v,a) \in E' : a \in F_t, v \in V'\}$
    $F' \leftarrow F' \setminus F_t$
    $V' \leftarrow V' \setminus V_t$
    $E' \leftarrow E' \setminus E_t$
    $J_t \leftarrow (F', V', E')$
  End While
  $T_C \leftarrow t$
  $G_C \leftarrow (F', V', E')$
  Return $(G_C, T_C, (F_t)_{t=1}^{T_C}, (V_t)_{t=1}^{T_C}, (J_t)_{t=1}^{T_C})$

Definition 2.4. Let $(G, \emptyset)$ be a random rooted graph with root $\emptyset$. We say that $(G, \emptyset)$ is unimodular if, for any measurable function $f : \mathcal{G}_{**} \to \mathbb{R}_{\ge 0}$, $(G, u, v) \mapsto f(G, u, v)$, we have

$$\mathbb{E}\Bigl[\sum_{v\in V(G)} f(G, \emptyset, v)\Bigr] = \mathbb{E}\Bigl[\sum_{v\in V(G)} f(G, v, \emptyset)\Bigr]. \tag{8}$$

Consequences, and equivalent versions of unimodularity can be found in [5, 34].

3. Proof of Theorem 2.
In this section, we describe the construction of clusters and sparse bases within the clusters [or for the whole space of solutions for $\alpha \in [0, \alpha_d(k))$]. The analysis of this construction is given in Section 3.3 in terms of a few technical lemmas. Finally, the formal proof of Theorem 2 is given in Section 3.4.

3.1. Construction of the sparse basis. The construction of a sparse basis, which is at the heart of Theorem 2, is based on the following algorithm, formally stated in Table 1. The algorithm constructs a sequence of residual factor graphs $(J_t)_{t\ge 0}$, starting with the instance under consideration $J_0 = G$. At each step, the new graph is constructed by removing all variable nodes of degree one or zero, their adjacent factor nodes, and all the edges adjacent to these factor nodes. We refer to the algorithm as synchronous peeling or simply peeling. We denote the sets of nodes and edges removed at step (or round) $t \ge 1$ by $(F_t, V_t, E_t)$, so that $J_{t-1} = (F_t, V_t, E_t) \cup J_t$. Notice that, at each step, the residual graph $J_t$ is check-induced. The algorithm halts when the residual graph does not contain any variable node of degree smaller than two. We let the total number of iterations be $T_C(G)$, where we will drop the explicit dependence on $G$ when it is clear from context. The final residual graph is then $J_{T_C} \equiv G_C$. The following elementary fact is used in several papers on this topic [14, 27, 32].

Remark 3.1. The residual graph $G_C$ resulting at the end of synchronous peeling is the 2-core of $G$.

It is convenient to reorder the factors (from 1 to $m$) and variables (from 1 to $n$) as follows. We index the factors in increasing order according to $F_1, F_2, \dots, F_{T_C}$, choosing an arbitrary order within each $F_t$ for $1 \le t \le T_C$.
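The peeling rounds that define this ordering can be sketched as follows. This is a minimal illustration under an assumed representation (a dict mapping each check to its variables), not the authors' implementation; the function name and return layout are hypothetical.

```python
def synchronous_peeling(checks, n):
    """Synchronous peeling on a factor graph.

    checks: dict mapping check id -> iterable of variable ids in range(n).
    Returns (core_checks, T, F_rounds, V_rounds), where F_rounds[t-1] and
    V_rounds[t-1] list the checks and variables removed in round t.
    """
    live_checks = {a: set(vs) for a, vs in checks.items()}
    live_vars = set(range(n))
    F_rounds, V_rounds = [], []
    while True:
        # Degree of every remaining variable in the current residual graph.
        deg = {v: 0 for v in live_vars}
        for vs in live_checks.values():
            for v in vs:
                deg[v] += 1
        V_t = {v for v in live_vars if deg[v] <= 1}
        if not V_t:
            break  # every remaining variable has degree at least two
        # Checks adjacent to a removed variable are removed in the same round.
        F_t = {a for a, vs in live_checks.items() if vs & V_t}
        for a in F_t:
            del live_checks[a]
        live_vars -= V_t
        F_rounds.append(sorted(F_t))
        V_rounds.append(sorted(V_t))
    return live_checks, len(F_rounds), F_rounds, V_rounds
```

Concatenating the lists in `F_rounds` gives one admissible ordering of the factors; whatever remains in `core_checks` when the loop stops is the residual graph of Remark 3.1.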
For the variable nodes, we first index nodes in $V_1$, then nodes in $V_2$ and so on. Within each set $V_t$, the ordering is chosen in such a way that nodes that have degree 0 in $J_{t-1}$ have lower index than those with degree 1 [notice that, by definition, for any $v \in V_t$, $\deg_{J_{t-1}}(v) \le 1$]. Finally, for variable nodes in $V_t$ that have degree 1 in $J_{t-1}$, we use the following ordering. Each such node $v \in V_t$ is connected to a unique factor node in $F_t$. Call this the associated factor, and denote it by $f_v$. We order the nodes with $\deg_{J_{t-1}}(v) = 1$ according to the order of their associated factor, choosing an arbitrary internal order for variable nodes with the same associated factor.

For $A \subseteq F$, $B \subseteq V$, we denote by $H_{A,B}$ the submatrix of $H$ consisting of rows with index $a \in A$ and columns $i \in B$. The following structural lemma is immediate, and we omit its proof.

Lemma 3.2. Let $G$ be any factor graph [not necessarily in $\mathbb{G}(n,k,m)$] with no 2-core. With the order of factors and variable nodes defined through synchronous peeling, the matrix $H$ is partitioned in $T_C \times T_C$ blocks $\{H_{F_s,V_t}\}_{1\le s\le T_C,\, 1\le t\le T_C}$ with the following structure:
1. For any $s > t$, $H_{F_s,V_t} = 0$.
2. The diagonal blocks $H_{F_s,V_s}$ have a staircase structure, namely for each such block the columns can be partitioned into consecutive groups $(\mathcal{C}_l)_{l=0}^{\ell}$, for $\ell = |F_s|$, such that columns in $\mathcal{C}_0$ are equal to 0, columns in $\mathcal{C}_1$ have only the first entry equal to 1, columns in $\mathcal{C}_2$ have only the second entry equal to 1, etc. See below for an example.

An example of a staircase matrix:

$$\begin{bmatrix} 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}. \tag{9}$$

Note that $V_t$ is not empty and $F_t$ is not empty for all $t < T_C$.
On the other hand, $F_{T_C}$ may be empty, in which case, we adopt the convention that all columns corresponding to $V_{T_C}$ are included in $\mathcal{C}_0$.

The above ordering reduces $H$ to an essentially upper triangular matrix. It is then immediate to construct a basis for its kernel. We will do this by partitioning the set of variable nodes as the disjoint union $V = U \cup W$ in such a way that $H_{F,U} \in \{0,1\}^{m\times m}$ is square with full rank, and $H_{F,W} \in \{0,1\}^{m\times(n-m)}$. We then treat $x_W$ as independent variables and $x_U$ as dependent ones. The partition is then constructed by letting $W = W_1 \cup \cdots \cup W_{T_C}$ and $U = U_1 \cup \cdots \cup U_{T_C}$, whereby for each $t \in \{1,\dots,T_C\}$, $W_t \subseteq V_t$ is chosen by considering the staircase structure of block $H_{F_t,V_t}$ and the corresponding partition over columns $V_t = \mathcal{C}_0 \cup \mathcal{C}_1 \cup \cdots \cup \mathcal{C}_\ell$. We let $W_t = \mathcal{C}_0 \cup \mathcal{C}'_1 \cup \cdots \cup \mathcal{C}'_\ell$, where $\mathcal{C}'_i$ includes all the elements of $\mathcal{C}_i$ except the first (and is empty if $|\mathcal{C}_i| = 1$). Finally, $U_t \equiv V_t \setminus W_t$. With these definitions, $H_{F,U}$ is an $m \times m$ binary matrix with full rank. In addition, it is upper triangular with diagonal blocks $H_{F_t,U_t} = I_{|U_t|}$ for $t = 1,\dots,T_C$, where $I_r$ is the $r \times r$ identity matrix.

In order to construct a sparse basis for the clusters when $\alpha > \alpha_d(k)$ [and hence prove Theorem 2, point 2(a)], we will have to consider matrices $H$ (without a 2-core) that contain rows with exactly 2 nonzero entries (i.e., check nodes of degree 2). Whenever this happens, the construction must be modified, by introducing the notion of collapsed graph. The basic idea is that a factor node of degree 2 constrains the adjacent variables to be identical and hence we can replace each set of variables that are thus constrained to be equal by a single proxy variable (a "super-node").
This proxy variable node will have an edge with each factor that was previously connected to a replaced variable node, with a small modification: Since we are operating in GF(2), we retain a single edge for edges with odd multiplicity, and drop edges with even multiplicity.

Definition 3.3. The collapsed graph $G_* = (F_*, V_*, E_*)$ of a graph $G = (F, V, E)$ is the graph of connected components in the subgraph induced by factor nodes of degree 2. Formally,

$$F_* \equiv \{f \in F : |\partial f| \ge 3\},$$
$$V_* \equiv \{S \subseteq V : d_{G^{(2)}}(i,j) < \infty,\ \forall i, j \in S\},$$
$$E_* \equiv \{(S, a) : S \in V_*,\ a \in F_*,\ |\{i \in S \text{ s.t. } (i,a) \in E\}| \text{ is odd}\},$$

where $G^{(2)}$ is the subgraph of $G$ induced by factor nodes of degree 2. We let $n_* \equiv |V_*|$, $m_* \equiv |F_*|$. An element of $V_*$ is referred to as a super-node.

Note that for a graph $G$ with no 2-core, the collapsed graph $G_*$ also has no 2-core. We let $Q$ denote the corresponding adjacency matrix of $G_*$. Finally, we construct a binary matrix $L$ with rows indexed by $V$, and columns indexed by $V_*$, and such that $L_{i,v} = 1$ if and only if $i$ belongs to connected component $v$. We apply peeling to $G_*$, thus obtaining the decomposition of $V_*$ into $U_* \cup W_*$ as described for the original graph $G$ above.

The following is the key deterministic lemma on the construction of the basis. We denote the size of the component of $v \in V_*$ in $G^{(2)}$ by $S(v)$, and for $v \in V_*$, $t \ge 0$ we let $S(v,t) = \sum_{w \in B_{G_*}(v,t)} S(w)$ be the sum of sizes of vertices within distance $t$ from $v$.

Lemma 3.4. Assume that $G_*$ has no 2-core, then the columns of

$$L \begin{bmatrix} (Q_{F_*,U_*})^{-1} Q_{F_*,W_*} \\ I_{(n_*-m_*)\times(n_*-m_*)} \end{bmatrix}$$

form an $s$-sparse basis of the kernel of $H$, with $s = \max_{v_* \in V_*} S(v_*, T_C(G_*))$. Here, we have ordered the super-nodes $v_* \in V_*$ as $U_*$ followed by $W_*$, and the matrix inverse is taken over GF[2].
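The block form in Lemma 3.4 is the usual "dependent versus independent variables" description of a kernel over GF(2). The sketch below illustrates that idea only: it uses generic Gaussian elimination to pick the dependent set, rather than the peeling order and collapse step of the paper, and therefore makes no claim about sparsity. The function name is hypothetical.

```python
def gf2_kernel_basis(H):
    """Return a basis (list of 0/1 lists) of {x : H x = 0 mod 2}.

    Pivot columns play the role of the dependent set U, the remaining
    columns the role of the independent set W.
    """
    m, n = len(H), len(H[0])
    A = [row[:] for row in H]
    pivots = []  # pivots[r] is the pivot column of reduced row r
    r = 0
    for c in range(n):
        if r == m:
            break
        p = next((i for i in range(r, m) if A[i][c]), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        # Clear column c in every other row (reduced row echelon form).
        for i in range(m):
            if i != r and A[i][c]:
                A[i] = [u ^ v for u, v in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for w in free:
        x = [0] * n
        x[w] = 1
        # Each dependent variable is fixed by its own reduced row.
        for row_idx, u in enumerate(pivots):
            x[u] = A[row_idx][w]
        basis.append(x)
    return basis
```

Applied to the staircase matrix (9), the independent columns found this way are the zero columns together with every non-first column of each group, which agrees with the choice of $W_t$ described above.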
The proof of Lemma 3.4 is presented in Appendix A.

3.2. Construction of the cluster decomposition. When $G$ does not contain a 2-core [which happens w.h.p. for $\alpha < \alpha_d(k)$] the above lemma is sufficient to characterize the space of solutions $\mathcal{S}$. When $G$ contains a 2-core [w.h.p. for $\alpha > \alpha_d(k)$] we need to construct the partition of the space of solutions $\mathcal{S}_1 \cup \cdots \cup \mathcal{S}_N$. We let $G_C = (F_C, V_C, E_C)$ denote the 2-core of $G$, and $P_G : \{0,1\}^{V} \to \{0,1\}^{V_C}$ be the projector that maps a vector $x$ to its restriction $x_{V_C}$. Next, we let $H_C \equiv H_{F_C,V_C}$ be the restriction of $H$ to the 2-core, and denote its kernel by $\mathcal{S}_C$. Obviously, for any $x \in \mathcal{S}$, we have $P_G x \in \mathcal{S}_C$. Further,

$$\mathcal{S} = \bigcup_{x_C \in \mathcal{S}_C} \mathcal{S}(x_C), \qquad \mathcal{S}(x_C) \equiv \{x \in \mathcal{S} : P_G x = x_C\}, \tag{10}$$

with $\{\mathcal{S}(x_C)\}_{x_C\in\mathcal{S}_C}$ forming a partition of $\mathcal{S}$. It is easy to check that $H_{F\setminus F_C, V\setminus V_C}$ has full row rank. For instance, this follows from the fact that the subgraph induced by $(F\setminus F_C, V\setminus V_C)$ is annihilated by peeling (cf. Remark 3.1). Thus, $\mathcal{S}(x_C)$ is nonempty for all $x_C \in \mathcal{S}_C$, and the sets $\mathcal{S}(x_C)$ are simply translations of each other.

It turns out that $\{\mathcal{S}(x_C)\}_{x_C\in\mathcal{S}_C}$ is not exactly the partition of $\mathcal{S}$ that we seek. In our next lemma, we show that the set of solutions of the core $\mathcal{S}_C$ can be partitioned in well-separated core-clusters. Moreover, the core-clusters are small and have a high conductance. We will form sets in our partition of $\mathcal{S}$ by taking the union of $\mathcal{S}(x_C)$ over $x_C$ that lie in a particular core-cluster.

We write $x' \preceq x$ for binary vectors $x'$, $x$ if $x'_i \le x_i$ for all $i$. We write $x' \prec x$ if $x' \preceq x$ and $x' \ne x$. We need the following definition:

$$\mathcal{L}_C(\ell) \equiv \{x : x \in \mathcal{S}_C(G),\ d(x, 0) \le \ell,\ \nexists\, x' \in \mathcal{S}_C(G)\setminus\{0\} \text{ s.t. } x' \prec x\}. \tag{11}$$

The set $\mathcal{L}_C(\ell)$ consists of minimal nonzero solutions of the 2-core having weight at most $\ell$. (Here, the support of a binary vector $x$ is the subset of its coordinates that are nonzero, and the weight of $x$ is the size of its support.)

Lemma 3.5. For any $\alpha \in (\alpha_d(k), \alpha_s(k))$, there exists $\varepsilon = \varepsilon(\alpha,k) > 0$ such that the following holds. Take any sequence $(s_n)_{n\ge 1}$ such that $\lim_{n\to\infty} s_n = \infty$ and $s_n \le \varepsilon n$. Let $G \sim \mathbb{G}(n,k,\alpha n)$. Then w.h.p., we have:
(i) $\mathcal{L}_C(\varepsilon n) = \mathcal{L}_C(s_n)$;
(ii) $|\mathcal{L}_C(\varepsilon n)| < s_n$;
(iii) For any distinct $x, x' \in \mathcal{L}_C(\varepsilon n)$, we have $x \wedge x' = 0$, where $\wedge$ denotes bitwise AND. In other words, different elements of $\mathcal{L}_C(\varepsilon n)$ have disjoint supports.

Lemma 3.5 is proved in Section 7.

Remark 3.6. Let $E_n$ be the event that points (i), (ii) and (iii) in Lemma 3.5 hold. Assume $E_n$ and $s_n^2 < \varepsilon n$. Let $\mathcal{S}_{C,1}$ be the set of core solutions with weight less than $\varepsilon n$. Then $\mathcal{S}_{C,1}$ forms a linear space over GF(2) of dimension $|\mathcal{L}_C(\varepsilon n)|$, with $\mathcal{L}_C(\varepsilon n)$ being an $s_n$-sparse basis for $\mathcal{S}_{C,1}$. Moreover, every element of $\mathcal{S}_{C,1}$ is $s_n^2$-sparse.

Let

$$g \equiv 2^{|\mathcal{L}_C(\varepsilon n)|}. \tag{12}$$

We partition the set $\mathcal{S}_C$ of core solutions in disjoint core-clusters, as follows. For $x, x' \in \mathcal{S}_C$, we write $x \simeq x'$ if $x \oplus x' \in \operatorname{span}(\mathcal{L}_C(\varepsilon n))$. It is immediate to see that $\simeq$ is an equivalence relation. We define the core-clusters to be the equivalence classes of $\simeq$. Obviously, the core-clusters are affine spaces that differ by a translation, each containing $g \le 2^{s_n}$ solutions. Their number is to be denoted by $N$. Denote the core-clusters by $\mathcal{S}_{C,1}, \mathcal{S}_{C,2}, \dots, \mathcal{S}_{C,N}$. Note that for any $x, x' \in \mathcal{S}_C$ belonging to different core-clusters, we have $d(x, x') > n\varepsilon$, that is, the core-clusters are well separated.
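The equivalence relation $\simeq$ can be made computational by reducing each solution modulo the span of the low-weight generators. The following is a minimal sketch under stated assumptions: binary vectors are encoded as Python integers (bitmasks), the list of solutions and the list of generators are supplied by the caller, and the function names are hypothetical.

```python
def build_xor_basis(vectors):
    """Echelon basis of the GF(2) span of integer bitmasks.

    Returns a dict mapping a leading-bit position to a basis vector
    whose highest set bit is at that position.
    """
    basis = {}
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in basis:
                basis[top] = v
                break
            v ^= basis[top]
    return basis


def reduce_mod_span(x, basis):
    """Canonical representative of the coset x + span(basis)."""
    for top in sorted(basis, reverse=True):
        if (x >> top) & 1:
            x ^= basis[top]
    return x


def core_clusters(solutions, generators):
    """Group solutions into classes of x ~ x' iff x XOR x' lies in span(generators)."""
    basis = build_xor_basis(generators)
    clusters = {}
    for x in solutions:
        clusters.setdefault(reduce_mod_span(x, basis), []).append(x)
    return list(clusters.values())
```

Two solutions receive the same reduced representative exactly when their XOR lies in the span, so each returned group is contained in one equivalence class of $\simeq$, and distinct groups belong to distinct classes.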
We use the following partition of the solution space (including noncore variables) $\mathcal{S}$ into clusters, based on the core-clusters defined above:

$$\mathcal{S} = \bigcup_{i=1}^{N} \mathcal{S}_i, \qquad \mathcal{S}_i \equiv \{x \in \mathcal{S} : P_G x \in \mathcal{S}_{C,i}\}. \tag{13}$$

A version of Lemma 3.5 was claimed in [12, 29, 31]. These papers capture the essence of the proof but miss some technical details, and make the erroneous claim that, w.h.p. each pair of core solutions is separated by Hamming distance $\Omega(n)$.

We next want to study the internal structure of clusters. By linearity, it is sufficient to consider only one of them, say $\mathcal{S}_1$, which we can take to contain the origin $0$. For any $x \in \mathcal{S}_1$, we have $P_G x \in \mathcal{S}_{C,1} = \operatorname{span}(\mathcal{L}_C(\varepsilon n))$, and $\mathcal{L}_C(\varepsilon n)$ forms an $s_n$-sparse basis for $\mathcal{S}_{C,1}$, which coincides with the projection of $\mathcal{S}_1$ onto the core.

Consider the subset of solutions $x \in \mathcal{S}$, such that $P_G x = x_C$ for some $x_C \in \mathcal{S}_{C,1}$. The set of variables that take the same value for all solutions in this set is strictly larger than the 2-core. In order to capture this remark, we define the backbone (variables that are uniquely determined by the core assignment) and periphery (other variables) of a graph $G$.

Definition 3.7. Define the backbone augmentation procedure on $G$ with the initial check-induced subgraph $G_b^{(0)}$ as follows. Start with $G_b^{(0)}$. For any $t \ge 0$, pick all check nodes which are not in $G_b^{(t)}$ and have at most one neighbor outside $G_b^{(t)}$. Build $G_b^{(t+1)}$ by adding all these check nodes and their incident edges and neighbors to $G_b^{(t)}$. If no such check nodes exist, terminate and output $G_b = G_b^{(t)}$. The backbone $G_B = (F_B, V_B, E_B)$ of a graph $G = (F, V, E)$ is the output of the backbone augmentation procedure on $G$ with the initial subgraph $G_C$, the 2-core of the graph $G$.
The periphery $G_P$ of a graph $G = (F, V, E)$ is the subgraph induced by the factor nodes $F_P = F \setminus F_B$ and variable nodes $V_P = V \setminus V_B$ that are not in the backbone.$^6$

$^6$ Notice that there may be a few variables (w.h.p. at most a constant number) in the periphery that also are uniquely determined by the core assignment.

We can now define our basis for $\mathcal{S}_1$. This is formed by two sets of vectors. The first set has a vector corresponding to each element of $\mathcal{L}_C(\varepsilon n)$. For each $x_C \in \mathcal{L}_C(\varepsilon n)$, we construct a sparse solution $x \in \mathcal{S}_1$ such that $P_G x = x_C$ (Lemma 3.8 below guarantees the existence of such a vector, and bounds its sparsity). This set of vectors forms a basis for the projection of $\mathcal{S}_1$ onto the backbone.

For the second set of vectors, let $H_P \equiv H_{F_P,V_P}$ be the matrix corresponding to the periphery graph. We construct a sparse basis for the kernel of the matrix $H_P$, following the general procedure described in Section 3.1. Namely, we first collapse the graph and then peel it to order the nodes. Note that this second set of basis vectors vanishes on the backbone variables. Lemma 3.4 is used to bound its sparsity.

The first set of vectors is characterized as below (see Section 8 for a proof).

Lemma 3.8. Consider any $\alpha \in (\alpha_d(k), \alpha_s(k))$. Let $G$ be drawn uniformly from $\mathbb{G}(n,k,m)$. Take $\varepsilon(\alpha,k) > 0$ from Lemma 3.5, and consider any sequence $(c_n)_{n\ge 1}$ such that $\lim_{n\to\infty} c_n = \infty$. Then, with high probability, the following is true. For every $x_C \in \mathcal{L}_C(\varepsilon n)$, there exists a $c_n$-sparse $x \in \mathcal{S}_1$ such that $P_G x = x_C$.

3.3. Analysis of the construction. The main challenge in proving Theorem 2 is bounding the sparsity of the bases constructed (either for the full set of solutions, when $G$ does not have a core, or for the cluster $\mathcal{S}_1$, when $G$ has a core).
This involves two types of estimates: the first one uses Lemma 3.4, while the second is stated as Lemma 3.8. In the first estimate, we need to bound all the quantities involved in the sparsity upper bound: the number of iterations $T$ after which peeling (on the collapsed graph $G_*$) halts, and the maximum size $\max_{v\in V_*} S(v,T)$ of any ball of radius $T$ in the collapsed graph. In particular, we will show that, w.h.p., we have $T = O(\log\log n)$, and that $\max_{v\in V_*} S(v,T) \le (\log n)^{C}$ w.h.p., which gives sparsity $s \le (\log n)^{C}$.

Proving these bounds turns out to be a relatively simpler task when $G$ does not have a 2-core, partly because the graph in question has no factor nodes of degree 2, and thus the collapse procedure is not needed. A second reason is that when $G$ has a 2-core, we need to apply Lemma 3.4 to the periphery subgraph as discussed above. Remarkably, the periphery graph admits a relatively explicit probabilistic characterization. We say that a graph is peelable if its core is empty, and hence the peeling procedure halts with the empty graph. It turns out that, conditional on the degree distribution, the periphery is uniformly random among all peelable graphs.

Such an explicit characterization is not available, however, when we consider the subgraph obtained by removing the core (the periphery is obtained by removing the entire backbone). Nevertheless, the proof of Lemma 3.8 requires the study of this more complex subgraph. We overcome this problem by using tools from the theory of local weak convergence [5, 6, 9].

Given a graph $G = (F, V, E)$, its check-node degree profile $R = (R_l)_{l\in\mathbb{N}}$ is a probability distribution such that, for any $l \in \mathbb{N}$, $mR_l$ is the number of check nodes of degree $l$. A degree profile $R$ can conveniently be represented by its generating polynomial $R(x) \equiv \sum_{l\ge 0} R_l x^{l}$.
The derivative of this polynomial is denoted by $R'(x)$. In particular, $R'(1) = \sum_{l\ge 0} l R_l$ is the average degree.

Given integers $m$, $n$ and a probability distribution $R = (R_l)_{l\le k}$ over $\{0, 1, \dots, k\}$, we denote by $\mathbb{D}(n,R,m)$ the set of check-node-degree-constrained graphs, that is, the set of bipartite graphs with $m$ labeled check nodes, $n$ labeled variable nodes and check node degree profile $R$. As for the model $\mathbb{G}(n,k,m)$, we will write $G \sim \mathbb{D}(n,R,m)$ to denote a graph drawn uniformly at random from this set. Note that we have restricted the checks to have degree no more than $k$. Further, we will only be interested in cases with $R_0 = R_1 = 0$.

Lemma 3.9. Let $G = (F, V, E) \sim \mathbb{G}(n,k,m)$ and let $G_P$ be its periphery. Suppose that with positive probability, $G_P$ has $n_p$ variable nodes, $m_p$ check nodes, and check degree profile $R_p$. Then, conditioned on $G_P \in \mathbb{D}(n_p, R_p, m_p)$, the periphery $G_P$ is distributed uniformly over the set $\mathbb{D}(n_p, R_p, m_p) \cap \mathcal{P}$, where $\mathcal{P}$ is the set of peelable graphs.

There is a small technical issue here in that if $G' \in \mathbb{D}(n_p, R_p, m_p)$, then variable nodes in $G'$ have labels from 1 to $n_p$, whereas $G_P$ has variable node labels that form a subset of $\{1, 2, \dots, n\}$, and similarly for check nodes. We adopt the convention that the variable and check nodes in $G_P$ are relabeled sequentially, respecting the original order, before comparing with elements of $\mathbb{D}(n_p, R_p, m_p)$.

The above lemma establishes that the periphery is roughly uniform, conditional on being peelable. Its proof is in Section 6.1.
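The check-node degree profile that parametrizes the ensemble in Lemma 3.9, together with its generating polynomial, is easy to compute from a graph. The sketch below assumes the same hypothetical dict representation of checks used for illustration elsewhere (check id mapped to its variables) and is not part of the paper.

```python
from collections import Counter


def check_degree_profile(checks):
    """Return R as a dict {degree l: fraction of checks with degree l}."""
    m = len(checks)
    counts = Counter(len(set(vs)) for vs in checks.values())
    return {l: c / m for l, c in counts.items()}


def R_poly(R, x):
    """Generating polynomial R(x) = sum_l R_l x^l."""
    return sum(p * x ** l for l, p in R.items())


def R_prime(R, x):
    """Derivative R'(x) = sum_l l R_l x^(l-1); R'(1) is the average check degree."""
    return sum(l * p * x ** (l - 1) for l, p in R.items() if l >= 1)
```

For example, a graph with two checks of degree 3 and two of degree 2 has $R_2 = R_3 = 1/2$ and average check degree $R'(1) = 2.5$.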
Conceptually, we will bound the sparsity, as estimated in Lemma 3.4, by proceeding in three steps: (1) Bound the estimated basis sparsity $\max_v S(v,T)$ for check-node-degree-constrained graphs $\mathbb{D}(n,R,m)$, in terms of the degree distribution; (2) Estimate the "typical" degree distribution for the periphery, and prove concentration around this estimate; (3) Prove that, if $R$ is close to the typical degree distribution, then $G \sim \mathbb{D}(n,R,m)$ is peelable with uniformly positive probability. The latter allows us to transfer the sparsity estimates from the uniform model $\mathbb{D}(n,R,m)$ to the actual distribution of the periphery. Lemma 3.11 below accomplishes steps (1) and (3), while Lemma 3.12 takes care of step (2).

In order to state these lemmas, it is convenient to introduce density evolution (the terminology comes from the analysis of sparse graph codes [27, 28, 36]).

Definition 3.10. Given $\alpha > 0$, a degree profile $R$, and an initial condition $z_0 \in [0,1]$, we define the density evolution sequence $\{z_t\}_{t\ge 0}$ by letting, for any $t \ge 1$,

$$z_t = 1 - \exp\{-\alpha R'(z_{t-1})\}. \tag{14}$$

Whenever not specified, the initial condition will be assumed to be $z_0 = 1$. The one-dimensional recursion (14) will be also called density evolution recursion.

We say the pair $(\alpha, R)$ is peelable at rate $\eta$ for $\eta > 0$ if $z_t \le (1-\eta)^{t}/\eta$ for all $t \ge 0$. We say that the pair $(\alpha, R)$ is exponentially peelable (for short peelable) if there exists $\eta > 0$ such that it is peelable at rate $\eta$.

The density evolution recursion (14) describes the large graph asymptotics of a certain belief propagation algorithm that captures the peeling process, and will be described in Section 4. The next lemma is proved in Section 5.

Lemma 3.11.
Consider the set $\mathbb{D}(n,R,\alpha n)$, where $R = (R_l)_{l\le k}$ is a check degree profile such that $R_0 = R_1 = 0$. Assume that the pair $(\alpha, R)$ is peelable at rate $\eta$. Then there exist constants $N_0 = N_0(\eta,k) < \infty$, $\delta = \delta(\eta,k) > 0$, $C_1 = C_1(\eta,k) < \infty$, $C_2 = C_2(\eta,k) < \infty$ such that the following hold, for $G$ a random graph drawn from $\mathbb{D}(n,R,m)$ with $n > N_0$:
(i) The graph $G$ is peelable with probability at least $\delta$. Further, if $R_2 = 0$, one can take $\delta$ arbitrarily close to 1 (in other words $G$ is peelable w.h.p.).
(ii) Conditional on $G$ being peelable, peeling on the collapsed graph $G_*$ terminates after $T \le C_1 \log\log n$ iterations, with probability at least $1 - n^{-1/2}$.
(iii) Letting $T_{\mathrm{ub}} = \lfloor C_1 \log\log n \rfloor$, we have $\max_{v\in V_*} S(v, T_{\mathrm{ub}}) \le (\log n)^{C_2}$, with probability at least $1 - n^{-1/2}$.

Our final lemma is proved in Section 6.2 and establishes the peelability condition for the periphery.

Lemma 3.12. For any $\alpha > \alpha_d$ there exist constants $\eta = \eta(k,\alpha) > 0$, $\gamma_* = \gamma_*(k,\alpha) > 0$ such that the following holds. Let $G = (F, V, E)$ be a graph drawn uniformly at random from the ensemble $\mathbb{G}(n,k,m)$, $m = n\alpha$, and let $G_P = (F_P, V_P, E_P)$ be its periphery. Let $m_P \equiv |F_P|$, $n_P \equiv |V_P|$, $\alpha_P \equiv m_P/n_P$ and denote by $R^{P}$ the random check degree profile of $G_P$. Then, for any $\varepsilon > 0$, w.h.p. we have:
(i) The pair $(\alpha_P, R^{P})$ is peelable at rate $\eta$;
(ii) $n(\gamma_* - \varepsilon) \le n_P \le n(\gamma_* + \varepsilon)$.

3.4. Putting everything together. At this point, we can formally summarize the proof of our main result, Theorem 2, that builds on the construction and analysis provided so far.

Proof of Theorem 2. 1. For $\alpha < \alpha_d(k)$, w.h.p., the graph $G$ does not contain a 2-core (cf. Theorem 4), hence peeling returns an empty graph.
Using the construction in Lemma 3.4, we obtain an $s$-sparse basis, with $s = \max_{v \in V} |B(v, T_C)|$ (notice that in this case there is no factor node of degree 2, and hence the collapsed graph coincides with the original graph). The number of peeling iterations $T_C$ is bounded by Lemma 3.11(ii), using the fact that, by definition of $\alpha_d(k)$, the pair $(\alpha, R)$ with $R_k = 1$ is peelable at rate $\eta = \eta(\alpha,k) > 0$ for $\alpha < \alpha_d(k)$. Hence, $T_C \le C_1 \log\log n$ w.h.p., for some $C_1 = C_1(\alpha,k) < \infty$. Finally, by applying Lemma 3.11(iii) we obtain the thesis.

Next, consider point 2. The partition into clusters is constructed as per equation (13), and in particular the number of clusters $N$ is equal to the number of solutions of the core linear system $H_C x = 0$ divided by $g$ given by equation (12). Let us consider the various claims concerning this partition:

2(a). By construction, it is sufficient to construct a basis of the cluster $\mathcal{S}_1$ containing the origin, cf. Section 3.2. The basis has two sets of vectors. The first set of vectors is given by Lemma 3.8. Their projection onto the core spans the core solutions in $\mathcal{S}_{C,1}$. Since variables in the backbone are uniquely determined by those on the core, their projection onto the backbone spans the backbone projection of $\mathcal{S}_1$. By Lemma 3.8, these vectors are, w.h.p., $c_n$-sparse for any $c_n \to \infty$. Lemma 3.4 provides the second set of vectors. These span the kernel of the adjacency matrix of the periphery, $H_P$, and vanish identically in the backbone. In particular, they are independent from the first set. It is easy to check that the two sets of vectors together form a basis for the cluster $\mathcal{S}_1$. We are left with the task of proving that the second set of basis vectors is sparse.
The construction in Lemma 3.4 proceeds by collapsing the periphery graph $G_P$, and applying peeling. We thus need to bound the sparsity $s = \max_{v \in V} S(v, T_C)$. Define the event (implicitly indexed by $n$)
$$ E_1 \equiv \{ (\alpha_P, R_P) \text{ is peelable at rate } \eta > 0 \text{ and } n_P \ge n\gamma_*/2 \}. $$
By Lemma 3.12, we know that $E_1$ holds with high probability for suitable choices of $\eta = \eta(k,\alpha) > 0$ and $\gamma_* = \gamma_*(\alpha,k) > 0$. Further, $(R_P)_0 = (R_P)_1 = 0$ with probability 1. From Lemma 3.9, we know that $G_P$ is drawn uniformly from the set $\mathbb{D}(n_P, R_P, m_P) \cap \mathcal{P}$. Let $G'$ be drawn uniformly from $\mathbb{D}(n_P, R_P, m_P)$, with $(n_P, R_P, m_P)$ distributed as for $G_P$, conditional on $(\alpha_P, R_P) \in E_1$. We can then apply Lemma 3.11 to $G'$. From point (i), it follows that $G'$ is peelable with probability at least $\delta = \delta(\alpha,k) > 0$. Let $G'_*$ be the result of collapsing $G'$. From points (ii) and (iii) it follows that, with probability at least $1 - n_P^{-0.5} \ge 1 - (n\gamma_*/2)^{-0.5} \to 1$ as $n \to \infty$, we have
$$ \max_{v \in V'_*} S_{G'}(v, T_C) \le \max_{v \in V'_*} S_{G'}(v, T_{ub}) \le (\log n)^C, $$
for some $C = C(\alpha,k) < \infty$. (We use the subscript on $S$ to indicate the graph under consideration.) Since $E_1$ holds for $G_P$ w.h.p., and since $G'$ is peelable with probability uniformly bounded away from zero, it follows that the same bound on the sparsity holds for $G_P$ as well. In other words, w.h.p., we have $\max_{v \in V_{P,*}} S_{G_P}(v, T_C) \le (\log n)^C$. Here, $V_{P,*}$ is the set of super-nodes resulting from the collapse of $G_P$. Finally, using Lemma 3.4, we deduce that the second set of basis vectors obtained from this construction is $s$-sparse for $s = (\log n)^C$.

2(b). By Lemma 3.5, w.h.p., for any two core solutions $x_C \in \mathcal{S}_{C,1}$, $x'_C \in \mathcal{S}_{C,b}$, $b \ne 1$, we have $d(x_C, x'_C) \ge n\varepsilon$.
This immediately implies $d(x, x') \ge n\varepsilon$ for any two solutions $x \in \mathcal{S}_1$, $x' \in \mathcal{S} \setminus \mathcal{S}_1$. By linearity, we conclude $d(\mathcal{S}_a, \mathcal{S}_b) \ge n\varepsilon$ for all $a \ne b$.

2(c). Let $N_C$ be the number of solutions of the core linear system $H_C x = 0$. This was proved to concentrate on the exponential scale in [18, 20], with $n(\Sigma - \varepsilon) \le \log N_C \le n(\Sigma + \varepsilon)$ with high probability, and $\Sigma$ given as in the statement (cf. also [29]). The number of clusters is $N = N_C/g$ for $g = 2^{|\mathcal{L}_C(\varepsilon n)|}$, cf. equation (12). Using the bound $|\mathcal{L}_C(\varepsilon n)| \le s_n$ from Lemma 3.5(ii) and choosing $s_n$ to diverge sufficiently slowly with $n$, we deduce that $N$ also concentrates on the exponential scale with the same exponent as $N_C$. □

4. A belief propagation algorithm and density evolution. A useful analysis tool is provided by a belief propagation algorithm [cf. equations (4) and (5)] that refines the peeling algorithm introduced in Section 3.1. The same algorithm is also of interest in iterative coding; see [29, 36]. We restate the BP update rules for the convenience of the reader:
$$ \nu^t_{v \to a} = \begin{cases} *, & \text{if } \hat\nu^{t-1}_{b \to v} = * \text{ for all } b \in \partial v \setminus a, \\ 0, & \text{otherwise,} \end{cases} $$
and
$$ \hat\nu^t_{a \to v} = \begin{cases} 0, & \text{if } \nu^t_{u \to a} = 0 \text{ for all } u \in \partial a \setminus v, \\ *, & \text{otherwise.} \end{cases} $$
The initialization at $t = 0$ depends on the context, but it is convenient to single out two special cases. In the first case, all messages are initialized to 0: $\nu^0_{v \to a} = \hat\nu^0_{a \to v} = 0$ for all $(a,v) \in E$. In the second, they are all initialized to $*$: $\nu^0_{v \to a} = \hat\nu^0_{a \to v} = *$ for all $(a,v) \in E$. We will refer to these two cases (resp.) as $\mathrm{BP}_0$ and $\mathrm{BP}_*$. We let $\nu^t \equiv (\nu^t_{v \to a})_{(a,v) \in E}$ and $\hat\nu^t \equiv (\hat\nu^t_{a \to v})_{(a,v) \in E}$ denote the vectors of messages.

We mention here that $\mathrm{BP}_*$ on a graph $G \in \mathbb{G}(n,k,m)$ turns out to be trivial (all messages remain $*$).
However, we find it useful to run $\mathrm{BP}_*$ on the subgraph induced by variable and check nodes outside the core. We describe this in detail in Section 4.2.

The belief propagation algorithm introduced here enjoys an important monotonicity property. More precisely, define a partial ordering between message vectors by letting $0 \succ *$ and $\nu \succeq \nu'$ if $\nu_{v \to a} \succeq \nu'_{v \to a}$ and $\hat\nu_{a \to v} \succeq \hat\nu'_{a \to v}$ for all $(a,v) \in E$.

Lemma 4.1 ([29, 36]). Given two states $\nu^t_1 \succeq \nu^t_2$, we have $\nu^{t'}_1 \succeq \nu^{t'}_2$ and $\hat\nu^{t'}_1 \succeq \hat\nu^{t'}_2$ at all $t' \ge t$. As a consequence, the iteration $\mathrm{BP}_0$ is monotone decreasing (i.e., $\nu^{t+1} \preceq \nu^t$) and $\mathrm{BP}_*$ is monotone increasing (i.e., $\nu^{t+1} \succeq \nu^t$). In particular, both converge to a fixed point in at most $|E|$ iterations.

It is not hard to check by induction over $t$ that $\mathrm{BP}_0$ corresponds closely to the peeling process.

Lemma 4.2. A variable node $v$ is eliminated in round $t$ of peeling, that is, $v \in V_t$, if there is at most one incoming 0 message to $v$ in iteration $t-1$ of $\mathrm{BP}_0$ but this was not true in previous rounds. A factor node $a$ is eliminated in round $t$ of peeling (i.e., $a \in F_t$), along with all its incident edges, if it receives a $*$ message for the first time in iteration $t$ of $\mathrm{BP}_0$.

Further, the fixed point of $\mathrm{BP}_0$ captures the decomposition of $G$ into core, backbone and periphery as follows.

Lemma 4.3. Let $(\nu^\infty, \hat\nu^\infty)$ denote the fixed point of $\mathrm{BP}_0$. For $v \in V$, we have:
• $v \in V_C$ if and only if $v$ receives two or more incoming 0 messages under $\hat\nu^\infty$,
• $v \in V_B \setminus V_C$ if and only if $v$ receives exactly one incoming 0 message under $\hat\nu^\infty$,
• $v \in V_P$ if and only if $v$ receives no incoming 0 messages under $\hat\nu^\infty$.
For $a \in F$, we have:
• $a \in F_C$ if and only if $a$ receives no incoming $*$ message under $\nu^\infty$,
• $a \in F_B \setminus F_C$ if and only if $a$ receives one incoming $*$ message under $\nu^\infty$,
• $a \in F_P$ if and only if $a$ receives two or more incoming $*$ messages under $\nu^\infty$.
Finally, $G_C$ is the subgraph induced by $(F_C, V_C)$ and similarly for $G_B$ and $G_P$.

The proofs of the last two lemmas are based on a straightforward case-by-case analysis, and we omit them. (In fact, this correspondence is well known in iterative coding, albeit in a somewhat different language [36].)

4.1. Density evolution. It turns out that the distribution of BP messages is closely tracked by density evolution, in the large graph limit. Before stating this fact formally, it is useful to introduce a different ensemble $\mathbb{C}(n,R,m)$ that will be used in some of the proofs. A graph $G$ in $\mathbb{C}(n,R,m)$ is constructed as follows. We label variable nodes 1 through $n$ and check nodes 1 through $m$. We choose an arbitrary partition of the $m$ check nodes into $k+1$ sets with the $l$th set consisting of $mR_l$ check nodes with degree $l$ each, for $l = 0, 1, \ldots, k$. For each check node of degree $l$, we draw $l$ half-edges distinct from each other. Each of these half-edges is connected to an arbitrary variable node.

There is a close relationship between the sets $\mathbb{D}(n,R,m)$ and $\mathbb{C}(n,R,m)$. Any element of $\mathbb{D}(n,R,m)$ corresponds to $\prod_{l=2}^{k} (l!)^{mR_l}$ elements of $\mathbb{C}(n,R,m)$, with the ambiguity arising due to the ordering of the neighborhood of a check node in $\mathbb{C}(n,R,m)$. Conversely, any element of $\mathbb{C}(n,R,m)$ with no double edges [two or more edges between the same (variable, check) pair] corresponds to a unique element of $\mathbb{D}(n,R,m)$.
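The half-edge construction above can be sketched in Python. This is only an illustration under stated assumptions: the "arbitrary" attachments are taken to be independent and uniformly random, variable nodes are labeled from 0 rather than 1, and the names `sample_C`, `has_double_edge` and `sample_D` are hypothetical. Rejecting samples with double edges and forgetting the half-edge order should give a uniform graph without double edges for the chosen assignment of degrees to checks, because each such graph has the same number of ordered preimages.

```python
import random

def sample_C(n, check_degrees, rng):
    """Attach every half-edge of every check node to a uniformly random variable.

    check_degrees[a] is the degree of check a (a fixed partition of the checks
    by degree). The order of the half-edges of each check is kept.
    """
    return [[rng.randrange(n) for _ in range(l)] for l in check_degrees]

def has_double_edge(graph):
    """True if some check is joined to the same variable by two or more edges."""
    return any(len(set(nbrs)) < len(nbrs) for nbrs in graph)

def sample_D(n, check_degrees, rng, max_tries=10_000):
    """Rejection sampling: redraw until no double edge occurs, then drop the order."""
    for _ in range(max_tries):
        graph = sample_C(n, check_degrees, rng)
        if not has_double_edge(graph):
            return [frozenset(nbrs) for nbrs in graph]
    raise RuntimeError("no graph without double edges found; n may be too small")

rng = random.Random(0)
degrees = [3] * 20                  # R_3 = 1 with m = 20 checks (illustrative sizes)
G = sample_D(50, degrees, rng)      # n = 50 variable nodes
```

The rejection loop is cheap in the regime of interest, since the acceptance probability stays bounded away from zero as the graph grows at fixed density.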
Moreover, the fraction of elements of $\mathbb{C}(n,R,m)$ that have no double edges is uniformly bounded away from zero as $n \to \infty$ [10]. This leads to Lemma 4.4 below.

Lemma 4.4. Let $E$ be a graph property that does not depend on edge labels [e.g., $E(G) \equiv \{G \text{ is a tree}\}$]. There exists $C = C(k, \alpha_{\max}) < \infty$ such that the following is true for any $\alpha \in [0, \alpha_{\max}]$. Suppose $E$ holds with probability $1 - \varepsilon$ for $G$ drawn uniformly at random from $\mathbb{C}(n,R,\alpha n)$, for some $\varepsilon \in [0,1]$. Then $E$ holds with probability at least $1 - C\varepsilon$ for $G'$ drawn uniformly at random from $\mathbb{D}(n,R,\alpha n)$.

An important tool in the following will be the notion of almost sure local convergence of graph sequences. We made this notion precise in Definition 1.2, following [15]. We now return to the distribution of BP messages and density evolution.

Lemma 4.5. Let $\{z_t\}$ be the density evolution sequence defined by (14), for a given polynomial $R$, with $z_0 = 1$, and define $\hat z_t \equiv R'(z_t)/R'(1)$. Assume $G_n \sim \mathbb{D}(n,R,m)$ or $G_n \sim \mathbb{C}(n,R,m)$ with $m = n\alpha$. Let $R^{(t)}_{l_0,l_*}$ be the fraction of check nodes receiving $l_0$ incoming 0 messages and $l_*$ incoming $*$ messages after $t$ iterations of $\mathrm{BP}_0$ in $G_n$. Similarly, let $L^{(t)}_{l_0,l_*}$ be the fraction of variable nodes receiving $l_0$ incoming 0 messages and $l_*$ incoming $*$ messages after $t$ iterations of $\mathrm{BP}_0$. Then for any fixed $t \ge 0$, the following occurs almost surely:
$$ \lim_{n \to \infty} R^{(t)}_{l_0,l_*} = R_{l_0+l_*} \binom{l_0+l_*}{l_0} z_t^{l_0} (1 - z_t)^{l_*} \quad \text{for } l_0, l_* \in \{0, 1, \ldots, k\}, \tag{15} $$
$$ \lim_{n \to \infty} L^{(t)}_{l_0,l_*} = P\{X_0 = l_0, X_* = l_*\} \quad \text{for all } l_0, l_* \in \mathbb{N}, \tag{16} $$
where $X_0 \sim \mathrm{Poisson}(R'(1)\alpha \hat z_t)$, $X_* \sim \mathrm{Poisson}(R'(1)\alpha(1 - \hat z_t))$ are two independent Poisson random variables.

Proof. Notice that both $\mathbb{D}(n,R,m)$ and $\mathbb{C}(n,R,m)$, $m = n\alpha$, converge locally to unimodular bipartite trees.
More precisely, if rooted at random variable nodes, they converge to Galton–Watson trees with root offspring distribution $\mathrm{Poisson}(R'(1)\alpha)$ at variable nodes, and equal to the size-biased version of $R$ at check nodes. The proof of the analogous statement in the case of nonbipartite graphs can be found in [15], Proposition 2.6. It uses an explicit calculation to show that the empirical distribution of local neighborhoods converges in expectation, and a martingale concentration argument to verify the assumptions of Borel–Cantelli, and hence deduce almost sure convergence. The same proof extends, with minimal changes, to bipartite (factor) graphs.

Messages are local functions of the graph, hence their distribution converges to the one on the limit tree. In particular, incoming messages on the same node are asymptotically independent because they depend on distinct subtrees. The message distribution can be computed through a standard tree recursion (see [29, 36]) that coincides with the density evolution recursion (14). □

Using the correspondence in Lemma 4.2 between $\mathrm{BP}_0$ and the peeling algorithm, we can use density evolution to track the peeling algorithm.

Lemma 4.6. Given a factor graph $H$, let $n_1(H)$ denote the number of variable nodes of degree 1, and $n_{2+}(H)$ the number of variable nodes of degree 2 or larger in $H$. For $l \in \mathbb{N}$, let $m_l(H)$ be the number of factor nodes of degree $l$ in $H$. Consider synchronous peeling for $t \ge 1$ rounds on a graph $G \sim \mathbb{D}(n,R,\alpha n)$ or $G \sim \mathbb{C}(n,R,\alpha n)$, with $R_0 = R_1 = 0$, and let $J_t$ denote the residual graph after $t$ iterations. Let $\omega \equiv \alpha R'(1)$. Then for any $\delta > 0$, there exists $N_0 = N_0(\delta,k,t,\alpha)$ such that, for $n > N_0$, with probability at least $1 - 1/n^2$,
$$ \left| \frac{m_l(J_t)}{n} - \alpha R_l z_t^l \right| \le \delta \quad \text{for } l \in \{2, 3, \ldots, k\}, \tag{17} $$
$$ \left| \frac{n_1(J_t)}{n} - \omega \hat z_t \exp(-\omega \hat z_t)\bigl(1 - \exp(-\omega(\hat z_{t-1} - \hat z_t))\bigr) \right| \le \delta, \tag{18} $$
$$ \left| \frac{n_{2+}(J_t)}{n} - 1 + \exp(-\omega \hat z_t)(1 + \omega \hat z_t) \right| \le \delta. \tag{19} $$

Proof. For the sake of simplicity, let us consider $n_1(J_t)$. By Lemma 4.2, a node $v$ has degree 1 in the residual graph $J_t$ if and only if there is one incoming 0 message to $v$ at time $t$, and there were two or more incoming 0 messages to $v$ at time $t-1$. By Lemma 4.5, the number of incoming 0 messages to $v$ at time $t$ converges in distribution to $Z_1 \sim \mathrm{Poisson}(\omega \hat z_t)$. Using monotonicity of the algorithm, and again Lemma 4.5, the number of incident edges such that the message incoming to $v$ at time $t-1$ is 0 but changes to $*$ at time $t$ converges to $Z_2 \sim \mathrm{Poisson}(\omega(\hat z_{t-1} - \hat z_t))$, and is asymptotically independent of the number of 0 messages (converging to $Z_1$). Therefore, $n_1(J_t)/n$ converges as $n \to \infty$ to
$$ P[Z_1 = 1]\, P[Z_2 \ge 1] = \omega \hat z_t \exp(-\omega \hat z_t)\bigl(1 - \exp(-\omega(\hat z_{t-1} - \hat z_t))\bigr). $$
This establishes that the estimate (18) holds with high probability. In order to obtain the desired probability bound, one can use a standard concentration of measure argument [19, 36]. Namely, we first condition on the degrees of the check nodes. Since the unconditional distributions $\mathbb{D}(n,R,m)$ and $\mathbb{C}(n,R,m)$ are recovered by a random relabeling of the check nodes, such conditioning is irrelevant. We then regard $n_1(J_t)$ as a function of the independent random variables $X_1, \ldots, X_m$, whereby $X_a$ is the neighborhood of the $a$th check node. We denote by $E_n$ the event that all the balls $B_G(v, 2t)$ of radius $2t$ in $G$ have size smaller than $(\log n)^C$. We have
$$ \bigl| E\{n_1(J_t) \mid X_1, \ldots, X_{a-1}, X_a; E_n\} - E\{n_1(J_t) \mid X_1, \ldots, X_{a-1}, X'_a; E_n\} \bigr| \le (\log n)^C. $$
The desired probability estimate then follows by applying Azuma's inequality (in a form that allows for exceptional events; see, e.g., [19], Theorem 7.7) and bounding $P(E_n^c)$ (see, e.g., Section 5.2). □

4.2. BP fixed points. For our purposes, it is important to characterize the fixed point of the $\mathrm{BP}_0$ algorithm introduced above. Indeed, the structure of this fixed point is directly related to the decomposition of $G$ into core, backbone and periphery (cf. Lemma 4.3), which is in turn crucial for our definition of clusters. Let us start from an easy remark on density evolution.

Lemma 4.7. Let $\{z_t\}_{t \ge 0}$ be the density evolution sequence defined by equation (14) with initial condition $z_0 = 1$. Then $t \mapsto z_t$ is monotone decreasing, and hence has a limit $Q \equiv \lim_{t \to \infty} z_t$ which is given by
$$ Q = \sup\{ z \text{ s.t. } z = 1 - \exp\{-\alpha R'(z)\} \}. \tag{20} $$

Proof. Monotonicity follows from the fact that $z \mapsto f(z) \equiv 1 - \exp\{-\alpha R'(z)\}$ is monotone increasing, and that $z_1 = 1 - \exp\{-\alpha R'(1)\} < z_0$, whence $z_2 = f(z_1) \le f(z_0) = z_1$, and so on. □

Notice that the definition of $Q$ given in this lemma is consistent with the one in Theorem 1, which corresponds to the special case of regular, degree-$k$ check nodes, that is, $R(x) = x^k$. We further let $\hat Q \equiv R'(Q)/R'(1)$.

We know that both $\mathrm{BP}_0$ and density evolution converge to a fixed point. Since density evolution tracks $\mathrm{BP}_0$ for any bounded number of iterations, it would be tempting to conclude that a description of the $\mathrm{BP}_0$ fixed point is obtained by replacing $z_t$ by $Q$ and $\hat z_t$ by $\hat Q$ in Lemma 4.5. This is, of course, far from obvious because it requires an inversion of the limits $n \to \infty$ and $t \to \infty$. Despite this caveat, this substitution is essentially correct.

Lemma 4.8. Assume $G_n \sim \mathbb{G}(n,k,m)$ with $m = n\alpha$, and $\alpha \in [0, \alpha_d(k)) \cup (\alpha_d(k), \infty)$.
Let $R^{(\infty)}_{l_0,l_*}$ be the fraction of check nodes receiving $l_0$ incoming 0 messages and $l_*$ incoming $*$ messages at the fixed point of $\mathrm{BP}_0$. Similarly, let $L^{(\infty)}_{l_0,l_*}$ be the fraction of variable nodes receiving $l_0$ incoming 0 messages and $l_*$ incoming $*$ messages at the fixed point of $\mathrm{BP}_0$. The following occurs with probability 1:
$$ \lim_{n \to \infty} R^{(\infty)}_{l_0,l_*} = \binom{k}{l_0} Q^{l_0} (1 - Q)^{l_*} \quad \text{for } l_0 \in \{0, 1, \ldots, k\},\ l_* = k - l_0, \tag{21} $$
$$ \lim_{n \to \infty} L^{(\infty)}_{l_0,l_*} = P\{X_0 = l_0, X_* = l_*\} \quad \text{for all } l_0, l_* \in \mathbb{N}, \tag{22} $$
where $X_0 \sim \mathrm{Poisson}(k\alpha \hat Q)$, $X_* \sim \mathrm{Poisson}(k\alpha(1 - \hat Q))$ are two independent Poisson random variables.

Given Lemma 4.5 above, Lemma 4.8 says that the messages change very little beyond a large constant number of iterations. A hint at the fact that Lemma 4.8 is significantly more challenging than Lemma 4.5 is given by the assumption in the former that $\alpha \ne \alpha_d(k)$. In fact, this turns out to be a necessary assumption, because it implies an important correlation decay property.

Molloy [32] established the analog of equation (22) for $\sum_{l_0 \ge 2,\, l_* \ge 0} L^{(\infty)}_{l_0,l_*}$, which corresponds to the relative size of the core. We find that the complete theorem presents new challenges: keeping track of the backbone turns out to be hard. One hurdle is that the "estimated backbone" after $t$ iterations of $\mathrm{BP}_0$ (i.e., the subset of variable nodes that receive exactly one 0 message) does not evolve monotonically in $t$. In contrast, the "estimated core" (i.e., the subset of variable nodes that receive two or more 0 messages) can only shrink. Another hurdle is that, unlike the periphery (cf. Section 6), it turns out that the backbone is not uniformly random conditioned on the degree sequence.

The proof of Lemma 4.8 is quite long and will be presented in Section 4.3.
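The limits in Lemmas 4.7 and 4.8 are easy to evaluate numerically for regular check degree, $R(x) = x^k$. The following Python sketch iterates recursion (14) from $z_0 = 1$ and reads off $Q$, $\hat Q$ and the predicted fractions of variable nodes with two or more, exactly one, and no incoming 0 messages. The function names, tolerance and iteration cap are illustrative choices, and the two densities in the loop are intended to lie on either side of the clustering threshold for $k = 3$ (taken here to be approximately 0.818, a figure not stated in this section).

```python
import math

def density_evolution(alpha, k, z0=1.0, tol=1e-12, max_iter=100_000):
    """Iterate z_t = 1 - exp(-alpha * R'(z_{t-1})) for R(x) = x**k until it stalls."""
    z = z0
    for _ in range(max_iter):
        # -expm1(-x) equals 1 - exp(-x) and stays accurate when x is tiny
        z_new = -math.expm1(-alpha * k * z ** (k - 1))
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def fixed_point_predictions(alpha, k):
    """Q, Q_hat and the Poisson(k * alpha * Q_hat) law of the number of incoming 0s."""
    Q = density_evolution(alpha, k)
    Q_hat = Q ** (k - 1)                  # R'(Q) / R'(1) for R(x) = x**k
    lam = k * alpha * Q_hat               # mean number of incoming 0 messages
    no_zero = math.exp(-lam)              # no incoming 0: periphery (Lemma 4.3)
    one_zero = lam * math.exp(-lam)       # exactly one incoming 0: backbone minus core
    core = max(0.0, 1.0 - no_zero - one_zero)   # two or more incoming 0s: core
    return {"Q": Q, "Q_hat": Q_hat, "core": core,
            "one_zero": one_zero, "no_zero": no_zero}

for alpha in (0.7, 0.9):
    print(alpha, fixed_point_predictions(alpha, k=3))
```

Convergence of the iteration slows down as $\alpha$ approaches the threshold, so the iteration cap may need to be raised there; the predictions themselves are only claimed by Lemma 4.8 for $\alpha \ne \alpha_d(k)$.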
The basic idea is to run BP starting from the initialization with 0 messages coming from vertices in the core and $*$ messages everywhere else. This corresponds to $\mathrm{BP}_*$ on the noncore $G_{NC}$ [i.e., the subgraph induced by $(F \setminus F_C, V \setminus V_C)$], since messages outside the noncore do not change: messages within the core and from core variables to noncore checks stay fixed to 0, while messages from noncore checks to core variables stay fixed to $*$. We refer to this algorithm simply as $\mathrm{BP}_*$, with the understanding that $\mathrm{BP}_*$ is actually run on $G_{NC}$.

It is not hard to check by induction over $t$ that $\mathrm{BP}_*$ corresponds to the backbone augmentation procedure.

Lemma 4.9. Consider the backbone augmentation procedure with the initial subgraph $G_C$. A factor node $a$ is added to the backbone in round $t$ of backbone augmentation, that is, $a \in G^{(t)}_b \setminus G^{(t-1)}_b$ (cf. Definition 3.7), if all but one incoming message to $a$ in iteration $t$ of $\mathrm{BP}_*$ are 0, but this was not the case in previous iterations. A variable node $v$ is added to the backbone in round $t$ of backbone augmentation, that is, $v \in G^{(t)}_b \setminus G^{(t-1)}_b$, if there is one incoming 0 message to $v$ in iteration $t$ of $\mathrm{BP}_*$ but this was not true in previous iterations.

It then follows immediately from Lemma 4.3 that $\mathrm{BP}_0$ and $\mathrm{BP}_*$ converge to the same fixed point. Denote the messages at this fixed point by $\nu^{0,\infty}_{v \to a}$. Denote by $\nu^{*,t}_{v \to a}$ the messages produced in iteration $t$ of $\mathrm{BP}_*$, and by $\nu^{0,t}_{v \to a}$ the messages produced by $\mathrm{BP}_0$. Monotonicity of the BP update implies $\nu^{0,t}_{v \to a} \succeq \nu^{0,\infty}_{v \to a} \succeq \nu^{*,t}_{v \to a}$. The proof consists in showing that the fraction of 0 messages in $\{\nu^{0,t}_{v \to a}\}_{(a,v) \in E}$ is, for large fixed $t$, close to the fraction of 0 messages in $\{\nu^{*,t}_{v \to a}\}_{(a,v) \in E}$. The challenge is that no analog of Lemma 4.5 is available for $\mathrm{BP}_*$.
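The two message-passing schedules can be compared on a toy instance with a short Python sketch. It is an illustration under stated assumptions rather than the procedure used in the proofs: messages are stored per edge as the strings "0" and "*", each check lists distinct variables, the helper names are hypothetical, and the small graph is hand-built so that variables 0 to 3 form the core, variable 6 is determined by the core, and variables 4 and 5 are in the periphery. The second run starts from 0 messages out of core checks and $*$ elsewhere, which is the initialization described above.

```python
def bp_fixed_point(checks, hat):
    """Iterate the BP updates from check-to-variable messages `hat` to a fixed point.

    checks[a] is a tuple of distinct variables; hat maps each edge (a, v) to "0" or "*".
    Returns the pair (nu, hat) of variable-to-check and check-to-variable messages.
    """
    var_nbrs = {}
    for a, members in enumerate(checks):
        for v in members:
            var_nbrs.setdefault(v, []).append(a)
    for _ in range(len(hat) + 2):   # monotone runs need at most |E| updates
        # variable -> check: "*" iff every other incoming check message is "*"
        nu = {(a, v): "*" if all(hat[(b, v)] == "*" for b in var_nbrs[v] if b != a)
              else "0" for (a, v) in hat}
        # check -> variable: "0" iff every other incoming variable message is "0"
        new_hat = {(a, v): "0" if all(nu[(a, u)] == "0" for u in checks[a] if u != v)
                   else "*" for (a, v) in hat}
        if new_hat == hat:
            return nu, hat
        hat = new_hat
    raise RuntimeError("no fixed point reached; the initialization may not be monotone")

def uniform_init(checks, value):
    return {(a, v): value for a, members in enumerate(checks) for v in members}

def classify_variables(n, hat):
    """Lemma 4.3: two or more incoming 0s, exactly one, or none."""
    zeros = [0] * n
    for (a, v), message in hat.items():
        zeros[v] += (message == "0")
    core = [v for v in range(n) if zeros[v] >= 2]
    backbone_only = [v for v in range(n) if zeros[v] == 1]
    periphery = [v for v in range(n) if zeros[v] == 0]
    return core, backbone_only, periphery

# Toy instance with k = 3 (hypothetical example, not taken from the paper)
checks = [(0, 1, 2), (1, 2, 3), (0, 2, 3), (0, 1, 3), (3, 4, 5), (2, 3, 6)]
n = 7
nu0, hat0 = bp_fixed_point(checks, uniform_init(checks, "0"))      # BP_0
core, backbone_only, periphery = classify_variables(n, hat0)

# BP_* on the noncore: 0 out of core checks (no incoming "*"), "*" elsewhere
core_checks = {a for a, members in enumerate(checks)
               if all(nu0[(a, u)] == "0" for u in members)}
init = {(a, v): "0" if a in core_checks else "*" for (a, v) in hat0}
nu_star, hat_star = bp_fixed_point(checks, init)
```

On this instance both runs stop at the same messages, in line with the remark following Lemma 4.9; the iteration bound relies on the monotonicity stated in Lemma 4.1 and is not guaranteed for arbitrary initializations.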
Our final lemma is a straightforward consequence of Lemmas 4.5 and 4.8 above.

Lemma 4.10. Consider any $k \ge 3$, any $\alpha \in (0, \alpha_d) \cup (\alpha_d, \alpha_s)$ and any $\delta > 0$. There exists $T < \infty$ such that the following occurs. Let $G_n \sim \mathbb{G}(n,k,\alpha n)$. Then, eventually (in $n$) almost surely, the fraction of (check-to-variable or variable-to-check) messages that change after iteration $T$ of $\mathrm{BP}_0$ is smaller than $\delta$.

Proof. Let $N_t(n)$ be the fraction of variable-to-check messages that are equal to 0 after $t$ iterations on $G_n$ (with $t = \infty$ corresponding to the fixed point). Then equations (15) and (21) imply that
$$ |N_t(n) - z_t| \le \frac{\delta}{3k}, \qquad |N_\infty(n) - Q| \le \frac{\delta}{3k} $$
holds eventually almost surely. Using Lemma 4.7, there exists $T$ large enough so that, for $t \ge T$, $|z_t - Q| \le \delta/(3k)$. By the triangle inequality, $|N_t(n) - N_\infty(n)| \le \delta/k$. The thesis for variable-to-check messages follows since, by monotonicity of $\mathrm{BP}_0$, $N_t(n) - N_\infty(n)$ is exactly equal to the fraction of messages that change value from iteration $t$ to the fixed point. Each change in a variable-to-check message can lead to a change in at most $k-1$ check-to-variable messages. Thus, the fraction of check-to-variable messages that change after iteration $T$ is smaller than $\delta$. □

4.3. Proof of Lemma 4.8. Throughout this section, the notion of convergence adopted is convergence locally (cf. Definition 1.2).

For $n \ge 0$, draw a graph $G_n$ uniformly at random from $\mathbb{G}(n,k,\alpha n)$. Consider equation (22). Since the total number of incoming messages is equal to the vertex degree, which is $\mathrm{Poisson}(k\alpha)$, it is sufficient to control the distribution of 0 incoming messages. In particular, we define
$$ L^{(t)}_{\ell+} \equiv \sum_{l_*=0}^{\infty} \sum_{l_0=\ell}^{\infty} L^{(t)}_{l_0,l_*}, $$
that is, the fraction of nodes that receive $\ell$ or more 0 incoming messages.
We prove a series of lemmas, leading to the desired estimate for $L^{(t)}_{\ell+}$. An upper bound on $L^{(\infty)}_{\ell+}$ is relatively easy to obtain.

Lemma 4.11. With probability 1 with respect to the choice of $(G_n)_{n \ge 0}$, we have for all $\ell \ge 0$,
$$ \limsup_{n \to \infty} L^{(\infty)}_{\ell+} \le P\{\mathrm{Poisson}(k\alpha \hat Q) \ge \ell\}. $$

Proof. Using Lemma 4.5 (and using the fact that $L_l \le C \exp(-l/C)$ for all $l$ holds eventually almost surely, for some $C < \infty$) we have that
$$ \lim_{n \to \infty} L^{(t)}_{\ell+} = P\{\mathrm{Poisson}(k\alpha \hat z_t) \ge \ell\} $$
holds w.p. 1. From Lemma 4.1, it follows that $L^{(t)}_{\ell+}$ is monotone decreasing in $t$. Thus, we have
$$ \limsup_{n \to \infty} L^{(\infty)}_{\ell+} \le P\{\mathrm{Poisson}(k\alpha \hat z_t) \ge \ell\} $$
w.p. 1. Fix an arbitrary $\delta > 0$. Lemma 4.7 implies that, for $t$ large enough,
$$ \bigl[ P\{\mathrm{Poisson}(k\alpha \hat z_t) \ge \ell\} - P\{\mathrm{Poisson}(k\alpha \hat Q) \ge \ell\} \bigr] \le \delta, $$
which implies that
$$ \limsup_{n \to \infty} L^{(\infty)}_{\ell+} \le P\{\mathrm{Poisson}(k\alpha \hat Q) \ge \ell\} + \delta $$
holds almost surely. Since $\delta$ is arbitrary, we obtain the claimed result. □

The lower bound on $L^{(\infty)}_{\ell+}$ cannot be obtained by the same approach. We therefore go through a detour.

Let $\mu_n \equiv \mu(G_n)$ be the measure on rooted factor graphs with marks (called "networks" in [5]), constructed as follows: Choose a uniformly random variable node $i \in V_n$ as root. Mark variable nodes with mark $c$ if they are in the 2-core of $G_n$.

Lemma 4.12. The sequence $\{\mu_n\}_{n \ge 0}$ converges locally to the measure on random rooted trees with marks, $T_*(\alpha,k)$, defined as follows. Construct a random bipartite Galton–Watson tree rooted at $\emptyset$ with offspring distribution $\mathrm{Poisson}(k\alpha)$ at variable nodes and deterministic $k-1$ at factor nodes. Let $V_C(T_*)$ be the maximal subset of its vertices such that each variable node has degree at least 2 and each factor node has degree $k$ in the induced subgraph. Mark with $c$ all vertices in $V_C(T_*)$.

Proof.
It is immediate to see that the sequence $\{\mu_n\}_{n \ge 0}$ is tight almost surely with respect to the choice of $(G_n)_{n \ge 0}$, that is, that for any $\varepsilon > 0$ there exists a compact set $K$ such that $P\{H_*(n) \in K\} \ge 1 - \varepsilon$. (E.g., take $K$ to be the set of graphs that have maximum degree $\Delta_t$ at distance $t$ for a suitable sequence $t \mapsto \Delta_t$.) Therefore [5], any subsequence of $\{\mu_n\}$ admits a further subsequence that converges locally weakly to a limiting measure on rooted networks. This subsequence can be constructed through a diagonal argument: First, construct a subsequence $\{\mu_{n^t_s}\}_{s \ge 0}$ such that the depth-$t$ subtree converges. Refine it to get a subsequence $\{\mu_{n^{t+1}_s}\}_{s \ge 0}$ such that the depth-$(t+1)$ subtree converges, and so on. Finally, extract the diagonal subsequence $\{\mu_{n^s_s}\}_{s \ge 0}$.

We will prove the thesis by a standard weak convergence argument [24]: We will show that for any subsequence of $\{\mu_n\}_{n \ge 0}$, there is a sub-subsequence that converges locally weakly to the measure on $T_*(\alpha,k)$.

Consider indeed any sub-subsequence that converges locally weakly to a limiting random rooted graph with marks, which we denote by $O_*$. Define the unmarking operator $U$ that maps a marked rooted graph to the corresponding unmarked rooted graph. We have that $U(O_*) \stackrel{d}{=} U(T_*)$ (here $\stackrel{d}{=}$ denotes equality in distribution) from local weak convergence of random graphs to Galton–Watson trees (see, e.g., [6, 15]). We will hereafter couple the two trees in such a way that $U(O_*) = U(T_*)$.

Recall that a stopping set is any subset of variable nodes of a factor graph, such that each variable node has degree at least 2 in the induced subgraph. The 2-core of the factor graph is the maximal stopping set and is a superset of any stopping set. These notions are well defined for infinite graphs as well.
Now, the marks in $T_*$ correspond to the core by definition. The marks in $O_*$ form a stopping set, since the measure on $O_*$ is the local weak limit of $\mu_n$, and in any graph drawn from $\mu_n$, w.p. 1 a vertex is marked only if at least two of its neighboring checks have all marked neighboring variable nodes. Moreover, one can show that both $T_*$ and $O_*$ are unimodular. Indeed $T_*$ is unimodular since the unmarked tree is clearly unimodular, and the marking process does not make any reference to the root. Unimodularity of $O_*$ is clear since it is the local weak limit of a marked random graph [5]. Thus, in order to prove our thesis it suffices to show that the density of marks is the same in $T_*$ and $O_*$. (Because the subset of nodes that is marked in $T_*$ contains the subset marked in $O_*$ and the density of their difference is equal to the difference of the densities. Finally, for a unimodular network, if a mark type has density 0, then the set of marked nodes is empty by union bounds.)

Let
$$ \mathcal{E} \equiv \Bigl\{ \lim_{n \to \infty} |V_C(G_n)|/n = P\{\mathrm{Poisson}(k\alpha \hat Q) \ge 2\} \Bigr\}, $$
where $Q$ and $\hat Q$ are defined as at the beginning of Section 4. It was proved in [32] that $|V_C(G_n)|/n \xrightarrow{\text{a.s.}} P\{\mathrm{Poisson}(k\alpha \hat Q) \ge 2\}$, that is, the event $\mathcal{E}$ occurs with probability 1. Now let the set of marked vertices in $O_*$ be denoted by $\hat V_C(O_*)$. It is easy to see that if $\mathcal{E}$ holds, the density of marks in $O_*$ is given by
$$ P\{\emptyset \in \hat V_C(O_*)\} = P\{\mathrm{Poisson}(k\alpha \hat Q) \ge 2\}. \tag{23} $$
Proceeding analogously to the proof of [8], Proposition 1.2, we obtain
$$ P\{\emptyset \in V_C(T_*)\} = P\{\mathrm{Poisson}(k\alpha \hat Q) \ge 2\}. \tag{24} $$
The sketch of this step is the following. Let $E_t$ be the event that $\emptyset$ belongs to a "depth $t$ core," where the requirement of "degree at least 2 in the subgraph" applies only to variables up to depth $t-1$.
The probability on the left-hand side is just $P\{E\}$ for $E = \bigcap_{t \ge 1} E_t$. Since $E_t$ is a decreasing sequence, $P\{E\} = \lim_{t \to \infty} P\{E_t\}$. On the other hand, $P\{E_t\}$ can be computed explicitly through a tree calculation and converges to $P\{\mathrm{Poisson}(\alpha k \hat Q) \ge 2\}$ as $t \to \infty$, yielding (24). Finally, the thesis follows by comparing equations (23) and (24), and recalling that the event $\mathcal{E}$ under which (23) holds has probability 1. □

We next construct a random tree $\tilde T_*(\alpha,k)$ with marks on the directed edges as follows. Marks take values in $\{0, *\}$ and to each undirected edge we associate a mark for each of the two directions. We will refer to the direction toward the root as the "upward" direction, and to the opposite one as the "downward" direction. The marks correspond to fixed point BP messages, and we will call them messages as well in what follows.

First, consider only edges directed upward. This is a multitype GW tree. At the root generate $\mathrm{Poisson}(k\alpha)$ offspring, and mark each of the edges 0 independently with probability $\hat Q$, and $*$ otherwise. At a nonroot variable node, if the parent edge is marked 0, generate $\mathrm{Poisson}(k\alpha(1 - \hat Q))$ descendant edges marked $*$ and $\mathrm{Poisson}_{\ge 1}(k\alpha \hat Q)$ descendant edges marked 0 [here $\mathrm{Poisson}_{E}(\lambda)$ denotes a Poisson random variable with parameter $\lambda$ conditional on $E$]. If the parent edge is marked $*$, generate $\mathrm{Poisson}(k\alpha(1 - \hat Q))$ descendant edges marked $*$ and no descendant edges marked 0. At a factor node, if the parent edge is marked 0, generate $k-1$ descendant edges marked 0. If the parent edge is marked $*$, generate $M \sim \mathrm{Binom}_{\le k-2}(k-1, Q)$ descendants marked 0, and $k-1-M$ descendants marked $*$.

For edges directed downward, marks are generated recursively following the usual BP rules, cf. equations (4), (5), starting from the top to the bottom.
It is easy to check that with this construction, the marks in $\tilde T_*(\alpha,k)$ correspond to a BP fixed point. We extend the unmarking operator $U$ by allowing it to act on graphs with marks on edges (and removing the marks).

Lemma 4.13. $U(\tilde T_*)$ and $U(T_*)$ have the same distribution.

Proof. For this, we construct $U(\tilde T_*)$ (which is $\tilde T_*$ without the marks revealed) in a "breadth first" manner as follows: First, we draw a $\mathrm{Poisson}(\alpha k)$ number of factor descendants for the root node. Let $a$ be a factor descendant of the root. Then $a$ has $k-1$ variable node descendants. The message $\hat\nu_{a \to \emptyset}$ is 0 with probability $\hat Q$. It is immediate to check from our construction and $\hat Q = Q^{k-1}$ that:

Fact 1: Conditional on the degree of the root $\deg(\emptyset) = d_1$, the $d_1(k-1)$ upward messages incoming to the check nodes $a \in \partial\emptyset$ are independent, with $P\{\nu_{v \to a} = 0\} = Q$.

Now, we draw the number of descendants for each neighbor of $a$. Using Fact 1, together with the definition of $\tilde T_*$, one can check that:

Fact 2: Conditional on the degree of the root $\deg(\emptyset) = d_1$, the number of descendants of each of the $d_1(k-1)$ variable nodes $v$ at the first generation is an independent $\mathrm{Poisson}(k\alpha)$ random variable. Further, the upward messages toward these variable nodes are independent with $P\{\hat\nu_{b \to v} = 0\} = \hat Q$.

This argument (outlined for simplicity for the first generation) can be repeated almost verbatim at any generation. Denote by $\tilde T_{*,d}$ the first $d$ generations of $\tilde T_*$ (with variable nodes at the leaves). One then proves by induction that at any $d$, conditional on $U(\tilde T_{*,d})$, the numbers of descendants of the variable nodes in the last generation are i.i.d. $\mathrm{Poisson}(k\alpha)$, and given these, the corresponding upward messages are i.i.d. with $P\{\hat\nu_{b \to v} = 0\} = \hat Q$. This implies the thesis. □

Lemma 4.14. $\tilde T_*$ is unimodular.
Proof. We already established unimodularity of $U(\widetilde{T}_*)$ [since $U(\widetilde{T}_*) = U(T_*)$ is a unimodular Galton–Watson tree]. To establish the claim, let $\widetilde{T}'_*$ be the random tree whose distribution has Radon–Nikodym derivative $\deg(\emptyset)/\mathbb{E}\{\deg(\emptyset)\}$ with respect to that of $\widetilde{T}_*$. We need to show that moving the root to a uniformly random descendant variable node of the root (via one check) in $\widetilde{T}'_*$ leaves the distribution of $\widetilde{T}'_*$ unchanged (cf. [5], Section 4).

Draw $\widetilde{T}'_*$ at random, weighted by the degree of the root $\emptyset$. In this argument, we make the root explicit by denoting the tree by $(\widetilde{T}'_*, \emptyset)$. Reveal the degree $d_1 = \deg(\emptyset)$ of the root. We have $d_1 > 0$ almost surely. Take a uniformly random neighboring check $a \in \partial\emptyset$, and a uniformly random descendant $i$ of $a$ (we know that $a$ has $k-1$ descendants). Reveal the number of descendants of $i$. Let this number be $d_2 - 1$, so that $i$ has $d_2$ neighbors in total. Note that we do not reveal any of the messages in $\widetilde{T}'_*$.

At this point, consider the incoming messages to the variable nodes $\emptyset$ and $i$ except for $\widehat{\nu}_{a\to\emptyset}$ and $\widehat{\nu}_{a\to i}$, and the incoming messages to the check $a$ except for $\nu_{\emptyset\to a}$ and $\nu_{i\to a}$. Call this vector of messages $M$. The messages in $M$ are independent, with probability $\widehat{Q}$ for each incoming message to variable nodes to be $0$, and probability $Q$ for incoming messages to $a$ to be $0$.$^{7}$ The messages $\widehat{\nu}_{a\to\emptyset}$, $\widehat{\nu}_{a\to i}$, $\nu_{\emptyset\to a}$ and $\nu_{i\to a}$ are deterministic functions of $M$. Finally, notice that $d_1$ and $d_2$ are independent, and identically distributed as $1 + \mathrm{Poisson}(\alpha k)$. At this point, it is clear that $(\widetilde{T}'_*, i)$ is distributed identically to $(\widetilde{T}'_*, \emptyset)$, which establishes unimodularity. $\square$

Lemma 4.15.
Let $F$ be a map from "trees with marked edges" to "trees with marked variable nodes" defined as follows: $F(T)$ is obtained from $T$ by putting a mark $c$ on vertex $i$ if and only if at least two incoming edges have a $0$ mark. Then $F(\widetilde{T}_*(\alpha,k)) \stackrel{d}{=} T_*(\alpha,k)$.

Proof. It is easy to check that the subset of variable nodes in $\widetilde{T}_*$ that receive two or more incoming $0$'s forms a stopping set (since the set of messages is at a BP fixed point). But the density of marked nodes in $F(\widetilde{T}_*)$ (i.e., the probability of the root being marked) is $\mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge 2\}$, which is exactly the same as the density of marked nodes in $T_*$ (recall that $T_*$ is also unimodular, cf. proof of Lemma 4.12). On the other hand, the set of marked nodes in $T_*$ is the core by definition and hence includes the marked nodes in $F(\widetilde{T}_*)$. We deduce that the set of vertices that are marked in $T_*$ but not in $F(\widetilde{T}_*)$ has vanishing density and, therefore, $F(\widetilde{T}_*(\alpha,k)) \stackrel{d}{=} T_*(\alpha,k)$. $\square$

We let $B$ be the subset of variable nodes $v$ of $\widetilde{T}_*(\alpha,k)$ such that at least one message incoming to $v$ is equal to $0$. Then this set has density
$$
\mathbb{P}\{\emptyset \in B\} = \mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge 1\} = Q. \tag{25}
$$

[Footnote 7: The argument establishing this is essentially the one above, where we showed that $U(\widetilde{T}_*) = U(T_*)$.]

In light of Lemma 4.15, we further denote the set of variable nodes in $\widetilde{T}_*$ having two or more incoming $0$ messages by $V_C(\widetilde{T}_*)$. Consider running BP$_*$ on $U(\widetilde{T}_*)$ [this is BP starting with zeros from the variable nodes in $V_C(\widetilde{T}_*)$ and $*$ elsewhere]. Let the trees with marks on edges obtained after $t$ iterations be denoted by $\widetilde{T}^t_*$. Denote by $\tilde{\mu}^t_n$ the measure on the rooted factor graph with marks on the edges constructed as follows: Choose a uniformly random variable node $i \in V(G_n)$.
Mark the edges (in each direction) with the messages corresponding to BP$_*$ run for $t$ iterations.

Lemma 4.16. The measures $(\tilde{\mu}^t_n)_{n \ge 0}$ converge locally to the measure on $\widetilde{T}^t_*$.

Proof. This result is immediate from Lemmas 4.12 and 4.15. $\square$

The following is immediate from the construction of $\widetilde{T}_*$.

Remark 4.17. If $\emptyset \in B$, then there exists a subtree of $\widetilde{T}_*$ rooted at $\emptyset$ with the following properties: (i) If $j$ is a variable node in the subtree, either $j \in V_C(\widetilde{T}_*)$ or at least one descendant factor node is in the subtree; (ii) If $a$ is a factor node in the subtree, all its descendants are also in the subtree. We call the subtree just defined a witness for $\emptyset$ (there might be more than one in principle). Notice that a priori a witness can be finite [if it ends up with nodes in $V_C(\widetilde{T}_*)$], or infinite.

Lemma 4.18. Almost surely any node $i \in B$ has a finite witness. Thus, $\lim_{t\to\infty} \widetilde{T}^t_* = \widetilde{T}_*$.

Proof. It is sufficient to prove that the following event has zero probability: $\emptyset \in B$ and $\emptyset$ only has infinite witnesses. Suppose $\emptyset \in B$. We will look for a minimal witness for $\emptyset$. If $\emptyset \in V_C(\widetilde{T}_*)$, then it is itself a witness and we are done. If not, then there is exactly one incoming $0$ message, say from factor $a$. Then factor $a$ has $k-1$ incoming $0$ messages from descendants. The subtrees corresponding to these descendants are independent. Consider a descendant $i$ of $a$. We have
$$
\mathbb{P}\big(i \in B \setminus V_C(\widetilde{T}_*)\big) = \mathbb{P}\{\mathrm{Poisson}_{\ge 1}(\alpha k \widehat{Q}) = 1\} = \frac{\exp(-\alpha k\widehat{Q})\,\alpha k \widehat{Q}}{1 - \exp(-\alpha k \widehat{Q})} = \exp(-\alpha k \widehat{Q})\,\alpha k\, Q^{k-2}.
$$
Conditioned on $i \in B \setminus V_C(\widetilde{T}_*)$, the node $i$ has exactly $k-1$ descendant variable nodes (via one check node).
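The last equality in the display uses $1 - \exp(-\alpha k\widehat{Q}) = Q$ and $\widehat{Q} = Q^{k-1}$. As a sanity check, the following Python sketch evaluates both sides numerically, together with $(k-1)$ times this probability, which is the mean offspring number of the minimal witness. The function names and the sample values of $(\alpha, k)$ are illustrative assumptions; $Q$ is taken to be the largest fixed point of $Q = 1 - \exp\{-k\alpha Q^{k-1}\}$.

```python
import math


def largest_fixed_point(alpha, k, iters=2000):
    """Largest solution of Q = 1 - exp(-k*alpha*Q^(k-1)), by iteration from Q = 1."""
    q = 1.0
    for _ in range(iters):
        q = 1.0 - math.exp(-k * alpha * q ** (k - 1))
    return q


def single_zero_probability(alpha, k):
    """Both sides of P{Poisson_{>=1}(lam) = 1} = exp(-lam) * alpha * k * Q^(k-2)."""
    q = largest_fixed_point(alpha, k)
    lam = alpha * k * q ** (k - 1)  # lam = alpha * k * Q_hat
    conditional = math.exp(-lam) * lam / (1.0 - math.exp(-lam))
    closed_form = math.exp(-lam) * alpha * k * q ** (k - 2)
    return conditional, closed_form


def witness_branching_factor(alpha, k):
    """Mean offspring (k-1) * P{Z = k-1} of the minimal-witness tree."""
    _, p = single_zero_probability(alpha, k)
    return (k - 1) * p
```

For parameter values at which the fixed point is nonzero (for example $k = 3$, $\alpha = 0.85$), the two expressions agree and the resulting mean offspring number is below one.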
Thus, conditioned on $\emptyset \in B$, the minimal witness is a Galton–Watson tree with offspring distributed as $Z$, whereby $Z = k-1$ with probability $\exp(-\alpha k\widehat{Q})\,\alpha k\, Q^{k-2}$, and $Z = 0$ otherwise. The branching factor of this tree is $\exp(-\alpha k \widehat{Q})\,\alpha k (k-1) Q^{k-2} < 1$ (cf. Lemma 6.6 below). The lemma follows. $\square$

Lemma 4.19. Consider the setting of Lemma 4.8. We have
$$
\liminf_{n\to\infty} L^{(\infty)}_{1+} \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge 1\},
$$
almost surely with respect to the choice of $G_n$.

Proof. Let $B_t$ be the subset of variable nodes in $\widetilde{T}^t_*$ that receive at least one $0$ message. Let $y_t$ be the density of nodes in $B_t$. From Lemma 4.18, we have immediately
$$
\lim_{t\to\infty} y_t = \mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge 1\}. \tag{26}
$$
Let $B_t(n) \subseteq V(G_n)$ be the subset of nodes having at least one incoming $0$ after $t$ iterations of BP$_*$. Let $y_t(n)$ be the fraction of these nodes, that is, $y_t(n) \equiv |B_t(n)|/n$. From Lemma 4.16, we have
$$
\lim_{n\to\infty} y_t(n) = y_t \tag{27}
$$
almost surely. By equation (26), we have $\lim_{n\to\infty} y_t(n) \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge 1\} - \delta$ for all $t \ge T(\delta)$. By monotonicity of BP$_*$, we have
$$
\liminf_{n\to\infty} L^{(\infty)}_{1+} \ge \lim_{n\to\infty} y_t(n) \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge 1\} - \delta,
$$
which implies the thesis. $\square$

Lemma 4.20. Consider the setting of Lemma 4.8. We have, for all $\ell \ge 2$,
$$
\liminf_{n\to\infty} L^{(\infty)}_{\ell+} \ge \mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge \ell\},
$$
almost surely with respect to the choice of $G_n$.

Proof. The proof is very similar to that of the previous lemma. Let $C(\ell;n) \subseteq V$ be the subset of variable nodes in $G_n$ that are in the core and have at least $\ell$ neighboring check nodes in the core. Then we have (by monotonicity of BP$_*$)
$$L^{(\infty)}_{\ell+} \ge \frac{|C(\ell;n)|}{n}.$$
(28)

On the other hand, let $y(\ell)$ be the density of variable nodes in $\widetilde{T}_*$ that receive two or more $0$ messages and have at least $\ell$ neighboring check nodes in the set $\{a : \text{for each } i \in \partial a, \text{ node } i \text{ receives two or more } 0 \text{ messages}\}$. It follows from Lemmas 4.12 and 4.15 that
$$
\liminf_{n\to\infty} \frac{1}{n}|C(\ell;n)| = y(\ell). \tag{29}
$$
On the other hand, it is easy to check that the construction of $\widetilde{T}_*$ implies that $y(\ell)$ coincides with the density of nodes receiving $\ell$ or more $0$ messages (here the assumption $\ell \ge 2$ is crucial). Hence, $y(\ell) = \mathbb{P}\{\mathrm{Poisson}(k\alpha\widehat{Q}) \ge \ell\}$, which together with equations (28), (29) yields the thesis. $\square$

Proof of Lemma 4.8. Equation (22) follows from Lemmas 4.11, 4.19 and 4.20. Equation (21) follows from a completely analogous argument. $\square$

Recall that $\tilde{\mu}^t_n$ is the measure on the rooted factor graph with marks on the edges constructed as follows: Choose a uniformly random variable node $i \in V(G_n)$. Mark the edges (in each direction) with the messages corresponding to BP$_*$ run for $t$ iterations. Recall that $\tilde{\mu}^*_n$ is defined similarly with marks corresponding to the BP fixed point. Denote by $\tilde{\mu}^t_n(d)$ the measure obtained from $\tilde{\mu}^t_n$ by restricting the depth of the rooted graph to $d$.

Lemma 4.21. For any $d \ge 0$ and any $\delta > 0$, there exists $t < \infty$ such that almost surely,
$$
\limsup_{n\to\infty} \|\tilde{\mu}^t_n(d) - \tilde{\mu}^*_n(d)\|_{\mathrm{TV}} < \delta.
$$

Proof. Consider running BP$_*$ on $G_n$. From Lemma 4.16, we know $\tilde{\mu}^t_n$ converges locally to the measure on $\widetilde{T}^t_*$. From Lemma 4.18, we know $\lim_{t\to\infty} \widetilde{T}^{t}_* = \widetilde{T}^{\infty}_*$. In particular, the fraction of $0$ variable-to-check messages in $\widetilde{T}^{t}_*$ converges to $Q$ (i.e., the fraction of $0$ variable-to-check messages in $\widetilde{T}^{\infty}_*$).
But from Lemma 4.8, the fraction of $0$ variable-to-check messages in $\tilde{\mu}^*_n$ converges eventually almost surely to the same value, and similarly for check-to-variable messages the fraction of $0$ messages converges to $\widehat{Q}$ [using the fact that $L_l \le C\exp(-l/C)$ for all $l$ holds eventually almost surely, for some $C < \infty$]. Using monotonicity of BP$_*$, we deduce that for any $\varepsilon > 0$, there exists $t$ large enough such that
$$
\limsup_{n\to\infty}\, \{\text{Number of message changes after iteration } t \text{ in } G_n\}/n \le \varepsilon \tag{30}
$$
holds almost surely. Now, we can choose $\varepsilon$ small enough such that eventually (in $n$) almost surely, for any set of $\varepsilon n$ edges in $G_n$, the union of balls of radius $d$ around these edges contains no more than $\delta n$ nodes. Combining with equation (30), at least a $(1-\delta)$ fraction of nodes have all messages in a ball of radius $d$ unchanged after iteration $t$, almost surely. This yields the result. $\square$

Proof of Theorem 3. From Lemma 4.16, we know $\tilde{\mu}^t_n$ converges locally to the measure on $\widetilde{T}^t_*$. From Lemma 4.18, we know $\lim_{t\to\infty}\widetilde{T}^t_* = \widetilde{T}^\infty_*$. Combining with Lemma 4.21, we obtain that
$$
\limsup_{n\to\infty} \|\tilde{\mu}^*_n(d) - \mu(\widetilde{T}^\infty_*(d))\|_{\mathrm{TV}} < \delta \tag{31}
$$
almost surely. Since $\delta$ is arbitrary, we obtain, for every $d$, that
$$
\limsup_{n\to\infty} \|\tilde{\mu}^*_n(d) - \mu(\widetilde{T}^\infty_*(d))\|_{\mathrm{TV}} = 0 \tag{32}
$$
holds almost surely. The result follows. $\square$

5. Proof of Lemma 3.11: Peelability implies a sparse basis.

5.1. Proof of Lemma 3.11(i) and (ii). Let us begin by describing the proof strategy. Instead of analyzing peeling on the collapsed graph $G_*$, we analyze a different peeling process. We first run synchronous peeling on $G$ for a large constant number $\tau$ of iterations. We then collapse the resulting graph, as discussed in Section 3.1, that is, coalescing variables connected to each other via degree 2 factors (cf. Definition 3.3).
Finally, we run synchronous peeling on the collapsed graph until it gets annihilated. We show that this process takes at least as many iterations as synchronous peeling on $G_*$ (Lemma 5.1 below). In order to bound the number of iterations under this new two-stage process, we proceed as follows. We choose the constant $\tau$ such that the residual graph $J_\tau$ is subcritical, and hence consists of trees and unicyclic components of size $O(\log n)$ w.h.p. As a consequence, the collapsed graph, to be denoted by $\mathsf{T}(J_\tau)$, contains only checks of degree 3 or more, and consists of trees and unicyclic components of size $O(\log n)$. It is not hard to show that it takes only $O(\log\log n)$ additional rounds of peeling to annihilate $\mathsf{T}(J_\tau)$ under this condition (see Lemma 5.4 below).

Several technical lemmas follow, which are proved in Appendix B, except Lemma 5.1, which we prove below. At the end of the subsection, we provide a proof of Lemma 3.11, parts (i) and (ii).

Consider the peeling algorithm and define $\mathsf{J}$ to be the peeling operator corresponding to one round of synchronous peeling (cf. Table 1). Thus, for a bipartite graph $G$, the residual graph after $t$ rounds of peeling is $\mathsf{J}^t(G)$. Denote by $\mathsf{J}^\infty(G)$ the graph produced by the peeling procedure after it halts: this is the empty graph if $G$ is peelable, and the core of $G$ otherwise. Recall that $T_C(G)$ denotes the number of rounds of peeling performed before halting at $\mathsf{J}^\infty(G)$. Further, define $\mathsf{T}$ to be the collapse operator as per Definition 3.3. For instance, $G_* = \mathsf{T}(G)$. The next lemma bounds from above the number of rounds of peeling required to annihilate $G_*$, in terms of the modified peeling process (consisting of $\tau$ rounds of peeling, followed by collapse, and then peeling until annihilation).

Lemma 5.1.
For any constant $\tau \ge 0$ and any peelable bipartite graph $G$, $T_C(\mathsf{T}(G)) \le T_C(\mathsf{T}(\mathsf{J}^\tau(G))) + \tau$.

Peelability of a pair $(\alpha, R)$ immediately implies some useful properties.

Lemma 5.2. For a factor degree profile $(\alpha, R)$ that is peelable at rate $\eta > 0$, we have: (i) $2\alpha R_2 \le 1 - \eta$. (ii) $\alpha \le 1$.

Notice that the factor graph induced by degree 2 check nodes is in natural correspondence with an ordinary graph (replace every check node by an edge) which is uniformly random given the number of edges. The average degree of this graph is $2\alpha R_2$, and Lemma 5.2(i) implies that it is subcritical, as we would expect for a peelable degree distribution.

Lemma 3.11 is stated for the ensemble $\mathbb{D}(n,R,m)$, $m = n\alpha$. However, in parts of the proof of this lemma, we find it convenient to work instead with the ensemble $\mathbb{C}(n,R,m)$ introduced in Section 4.1.

We need to characterize the residual graph $J_t$ after $t$ rounds of peeling. Lemmas 5.3 and 4.6 achieve this for $G \sim \mathbb{C}(n,R,m)$. Together, they show essentially that density evolution provides an accurate characterization of $J_t$. Using these lemmas, we are able to deduce [see proof of Lemma 3.11(i) and (ii) below] that $J_\tau$ consists of small trees and unicyclic components w.h.p., for large enough $\tau$. Finally, using Lemma 4.4, we apply the same results to $G \sim \mathbb{D}(n,R,m)$.

Recall that $n_1(G)$ denotes the number of variable nodes of degree 1 in $G$, and $n_{2+}(G)$ denotes the number of variable nodes of degree 2 or more in $G$. Let
$$
\mathbb{C}(n,R,m;n'_1,n'_2) \equiv \{G : G \in \mathbb{C}(n,R,m),\ n_1(G) = n'_1,\ n_{2+}(G) = n'_2\}. \tag{33}
$$
In the lemma below, we slightly modify the peeling process, choosing to retain all variable nodes $V$ in the residual graph (check nodes are eliminated as usual).
With a slight abuse of notation, we keep denoting by $J_t$ the residual graph, although this is obtained from $J_t$ by adding a certain number of isolated variable nodes.

Lemma 5.3. Consider a graph $G$ drawn uniformly at random from $\mathbb{C}(n,R,m)$. For any $t \in \mathbb{N}$, consider synchronous peeling for $t$ rounds on $G$, resulting in the residual graph $J_t$. Suppose that for some $(\widetilde{R}, \tilde{m}, \tilde{n}_1, \tilde{n}_2)$, we have $J_t \in \mathbb{C}(n, \widetilde{R}, \tilde{m}; \tilde{n}_1, \tilde{n}_2)$ with positive probability. Then, conditioned on $J_t \in \mathbb{C}(n, \widetilde{R}, \tilde{m}; \tilde{n}_1, \tilde{n}_2)$, the residual graph $J_t$ is uniformly random within $\mathbb{C}(n, \widetilde{R}, \tilde{m}; \tilde{n}_1, \tilde{n}_2)$.

Our final technical lemma bounds the number of peeling rounds needed to annihilate a tree or unicyclic component.

Lemma 5.4. Consider a factor graph $G = (F,V,E)$ with no check nodes of degree 1 or 2, and that is a tree or unicyclic. Then $G$ is peelable and $T_C(G) \le 2\lceil \log_2 |V| \rceil$.

Proof of Lemma 3.11(i) and (ii). A standard calculation (see, e.g., [14], or Section 7.1 which carries through a similar calculation) shows that, for a uniformly random graph in $\mathbb{C}(n, \widetilde{R}, \tilde{m}; \tilde{n}_1, \tilde{n}_2)$, with $\tilde{n}_1, \tilde{n}_2 \ge n\varepsilon$ and with $\tilde{m}\widetilde{R}'(1) \ge \tilde{n}_1 + 2\tilde{n}_2 + n\varepsilon$ for some $\varepsilon > 0$, the asymptotic degree distribution of variable nodes is $\mathbb{P}\{D = 0\} = q_0$, $\mathbb{P}\{D = 1\} = q_1$, $\mathbb{P}\{D = \ell\} = (1 - q_0 - q_1)\,\mathbb{P}\{\mathrm{Poisson}_{\ge 2}(\lambda) = \ell\}$ for all $\ell \ge 2$, for suitable choices of $q_0$, $q_1$, $\lambda$ depending on the ensemble parameters. Further, by a standard breadth-first search argument, the neighborhood of a vertex $v$ is dominated stochastically by a (bipartite) Galton–Watson tree, with offspring distribution equal to the size-biased version of $\widetilde{R}$ at check nodes, and to that of $\mathbb{P}\{D = \cdot\}$ at variable nodes.

Consider $G \sim \mathbb{C}(n,R,m)$.
Using Lemmas 4.6 and 5.3, it is possible to estimate the degree distribution of $J_t$. A lengthy but straightforward calculation shows that the corresponding branching factor is $\theta(J_t) = \alpha R''(z_t)$. Now, notice that
$$
R''(z) = 2R_2 + \sum_{l=3}^{k} l(l-1) R_l z^{l-2} \le 2R_2 + k(k-1)z
$$
for $z \le 1$. Choose $\tau = \tau(\eta,k) < \infty$ such that $z_\tau \le \eta/(3\alpha k(k-1))$. Then we have $\alpha R'(1)\rho'(z_\tau) \le 2\alpha R_2 + \eta/3$. But Lemma 5.2 tells us that $2\alpha R_2 \le 1 - \eta$. It follows that $\alpha R''(z_\tau) \le 1 - 2\eta/3$. In particular, the branching factor $\theta = \theta(J_\tau)$ associated with the random graph $J_\tau$ satisfies $\theta \le 1 - \eta/3$, with probability at least $1 - 1/n^2$. Following a standard argument [11] where we explore the neighborhood of $v$ by breadth-first search, we obtain that with probability at least $1 - 1/n^{1.7}$ for $n \ge N_1(\eta,k)$, the connected component containing $v$ is a tree or unicyclic, with size less than $C_4\log n$, for some $C_4 = C_4(\eta,k) < \infty$. Applying a union bound, we obtain that for $n \ge N_2 = N_2(\eta,k)$, with probability at least $1 - 1/n^{0.7}$, the event $E_n$ occurs, where
$$
E_n \equiv \{\text{All connected components in } J_\tau \text{ are trees or unicyclic and have size at most } C_4 \log n\}. \tag{34}
$$
Then, from Lemma 4.4, we infer that $E_n$ occurs with probability at least $1 - 1/n^{0.6}$ for $G \sim \mathbb{D}(n,R,m)$ provided $n \ge N_3$, where $N_3 = N_3(k) < \infty$. We stick to $G \sim \mathbb{D}(n,R,m)$ for the rest of this proof.

We now analyze the peeling process starting with $J_\tau$ and consider only what happens on $E_n$ since it occurs with sufficiently large probability.

Let us consider first point (i). Clearly, tree components are peelable. If $R_2 = 0$, then there are no factors of degree 2, and unicyclic components are also peelable (Lemma 5.4). Thus, the entire graph is annihilated by peeling w.h.p., as claimed.
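The peelability of small tree and unicyclic components invoked here (Lemma 5.4) can be illustrated with a short Python sketch of synchronous peeling. Since Table 1 is not reproduced in this section, the sketch assumes that one round removes every check node adjacent to a variable node of degree one; the function name and the list-of-checks representation are illustrative choices, and isolated variable nodes are simply ignored.

```python
import math


def synchronous_peeling(checks):
    """Run synchronous peeling on a factor graph given as a list of checks.

    Each check is a collection of variable ids. Returns (rounds, core), where
    `core` lists the surviving checks; the graph is peelable iff it is empty.
    """
    remaining = {a: set(vs) for a, vs in enumerate(checks)}
    rounds = 0
    while remaining:
        # Degree of each variable node in the current residual graph.
        degree = {}
        for vs in remaining.values():
            for v in vs:
                degree[v] = degree.get(v, 0) + 1
        leaves = {v for v, d in degree.items() if d == 1}
        peeled = [a for a, vs in remaining.items() if vs & leaves]
        if not peeled:
            break  # every remaining variable has degree >= 2: a stopping set
        for a in peeled:
            del remaining[a]
        rounds += 1
    return rounds, [sorted(remaining[a]) for a in sorted(remaining)]
```

For instance, the tree with checks `{0,1,2}`, `{0,3,4}`, `{1,5,6}`, `{2,7,8}` is annihilated in two rounds, within the bound $2\lceil\log_2 9\rceil = 8$ of Lemma 5.4, whereas a cycle of three degree-2 checks is left untouched, which is why the case $R_2 > 0$ needs a separate argument.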
If $R_2 > 0$, then the number of unicyclic components of size smaller than $M$ is asymptotically Poisson with parameter $C_5 < \infty$ uniformly bounded in $M$ (this follows, e.g., by [37]; see also [11, 38]). It follows that with probability at least $\exp(-C_5)/2$ for $n \ge N_4$, there are no unicyclic components of size smaller than $M$. The expected number of unicyclic components of size $M$ or larger is upper bounded by $\sum_{\ell \ge M} \theta^\ell/(2\ell) \le \theta^M/(1-\theta)$, and for $M$ large enough no unicyclic component of this size exists, with probability at least $1 - \exp(-C_5)/4$. Considering these two contributions, the graph contains no cycle with probability at least $\exp(-C_5)/4$ for $n \ge N_4$, and hence it is peelable. This completes part (i).

For (ii), notice that in collapsing a connected component of $J_\tau$, the number of variable nodes does not increase. Further, a tree component collapses to a tree and a unicyclic component collapses either to a tree or to a unicyclic component. Thus, we can use Lemma 5.4 with $|V| = N \le C_4\log n$ to obtain a bound of $(C_1/2)\log\log n \le C_1\log\log n - \tau$ on the number of additional peeling rounds needed, with probability at least $1 - 1/n^{0.6}$. Since the probability of peelability is uniformly bounded away from zero as $n \to \infty$, the probability that the same bound on the number of peeling rounds holds conditioned on peelability is at least (for some $\delta > 0$) $1 - 1/(\delta n^{0.6}) \ge 1 - 1/n^{0.5}$ for $n \ge N_5$, as required. $\square$

5.2. Proof of Lemma 3.11(iii). The following lemma bounds the size of a supercritical Galton–Watson tree, observed up to finite depth. The proof is in Appendix B.

Lemma 5.5. Consider a Galton–Watson branching process $\{Z_t\}_{t=0}^{\infty}$ with $Z_0 = 1$ and with offspring distribution $\mathbb{P}\{Z_1 = j\} = b_j$, $j \ge 0$. Suppose $b_r \le (1-\delta)^r/\delta$ for all $r \ge 0$, for some $\delta > 0$.
Also, assume that the branching factor satisfies $\theta \equiv \sum_{j=1}^{\infty} j b_j = \mathbb{E}[Z_1] > 1$. Then there exists $C = C(\delta) > 0$ such that the following happens. For any $\beta > 3$ and $T \in \mathbb{N}$, we have
$$
\mathbb{P}\left[\sum_{t=0}^{T} Z_t > (\beta\theta)^T\right] \le 2\exp\big(-C(\beta/3)^T\big). \tag{35}
$$

Proof of Lemma 3.11(iii). From Lemma 5.2(ii), we know that $\alpha \le 1$. The following occurs in the collapse process: Let $G^{(2)} = (F^{(2)}, V, E^{(2)})$ be the subgraph of $G$ induced by the degree 2 factor nodes (with isolated vertices retained). We have $F_* = F \setminus F^{(2)}$. All variable nodes that belong to a single connected component of $G^{(2)}$ coalesce into a single supernode $v' \in V_*$ in $G_*$, with a neighborhood that consists of the union of the individual neighborhoods restricted to $F_*$ (cf. Definition 3.3). As mentioned above, $G^{(2)}$ is a random factor graph with $\alpha R_2 n$ factor nodes of degree 2, and is in one-to-one correspondence with a uniformly random graph. For $v' \in V_*$, we denote by $S(v')$ the number of variable nodes in $V$ in the component $v'$. Lemma 5.2(i) implies that the branching factor of $G^{(2)}$ obeys $2\alpha R_2 \le 1 - \eta$, that is, $G^{(2)}$ is subcritical. This leads to the following claim that follows immediately from a well-known result on the size of the largest connected component in a subcritical random graph [11].

Claim 1: There exist $C_2 = C_2(\eta) < \infty$, $N_2 = N_2(\eta) < \infty$ such that the following occurs for all $n > N_2$. No component $v' \in V_*$ is composed of more than $C_2\log n$ variable nodes, that is, $\max_{v' \in V_*} S(v') \le C_2\log n$, with probability at least $1 - 1/n$.

Let $G_{\sim 2} \equiv (F_*, V, E \setminus E^{(2)})$, that is, $G_{\sim 2}$ is the subgraph of $G$ induced by factors of degree greater than 2 (with isolated vertices retained). From Poisson estimates on the node degree distribution, we get the following.
Claim 2: There exist $C_3 = C_3(\eta,k) < \infty$, $N_3 = N_3(\eta,k) < \infty$ such that the following occurs. For all $n > N_3$, no variable node $v \in V$ has degree larger than $C_3\log n$ in $G_{\sim 2}$, that is, $\deg_{G_{\sim 2}}(v) \le C_3\log n$ for all $v \in V$, with probability at least $1 - 1/n$.

Note that we used $\alpha \le 1$ [from Lemma 5.2(ii)] to avoid dependence on $\alpha$ in the above claim. Let
$$
E_n \equiv \{S(v') \le C_2\log n \text{ for all } v' \in V_*\} \cap \{\deg_{G_{\sim 2}}(v) \le C_3\log n \text{ for all } v \in V\}.
$$
Using Claims 1 and 2 above and a union bound, we deduce that $E_n$ holds with probability at least $1 - 2/n$ for $n > N_4$, for some $N_4 = N_4(\eta,k) < \infty$.

Clearly, $G_{\sim 2}$ is independent of $G^{(2)}$. In particular, for $v \in V$ that is part of supernode $v' \in V_*$, we know that $S(v')$ is independent of $G_{\sim 2}$. There is a slight dependence between the degrees of different variable nodes, but assuming $E_n$, the effect of this is small if we only condition on $\mathrm{polylog}(n)$ nodes in $G_*$. This enables our bound on the size of balls in $G_*$.

Recall that the distribution of a random variable $X_1$ is dominated by the distribution of $X_2$ if there exists a coupling between $X_1$ and $X_2$ such that $X_1 \le X_2$ with probability 1. In bounding the size of a ball of radius $T_{\mathrm{ub}}$, we are justified in replacing degree distributions by dominating distributions, and in assuming that there are no loops. Fixing a vertex $v \in V_*$, we construct the ball $B_{G_*}(v, T_{\mathrm{ub}})$ sequentially through a breadth-first search. Choose $\varepsilon = \eta/2$. For $n$ large enough, the distribution of $S(v')$ is dominated by the distribution of the number of nodes in a Galton–Watson tree with offspring distribution $\mathrm{Poisson}(2\alpha R_2 + \varepsilon)$. The distribution of $\deg_{G_{\sim 2}}(v)$ is dominated by $\mathrm{Poisson}\big(\alpha\big(\sum_{l=3}^{k} l R_l\big) + \varepsilon\big)$.
In particular, the degree distribution of $G_*$ is dominated by a geometric distribution $b_r \le (1-\delta)^r/\delta$ for some $\delta = \delta(\eta,k) > 0$. Assuming $E_n$, this also holds conditionally on the nodes revealed so far, as long as the number of these is, say, $\mathrm{polylog}(n)$. Thus, assuming $E_n$, the number of nodes in a ball of radius $T_{\mathrm{ub}} = C_1\log\log n$ is dominated by the number of nodes in a Galton–Watson tree of depth $T_{\mathrm{ub}}$ with offspring distribution $(b_r)_{r \ge 0}$ satisfying $b_r \le (1-\delta)^r/\delta$ for some $\delta > 0$ and $\theta \equiv \sum_{j=1}^{\infty} j b_j < C_5$, for $n \ge N_5$. We deduce from Lemma 5.5 that
$$
\mathbb{P}\Big[\max_{v' \in V_*} |B_{G_*}(v', T_{\mathrm{ub}})| \le (\log n)^{C_6} \,\Big|\, E_n\Big] \ge 1 - 1/n \tag{36}
$$
for some $C_6 = C_6(\eta,k) < \infty$, where $|B_{G_*}(v', T_{\mathrm{ub}})|$ denotes the number of supernodes in $B_{G_*}(v', T_{\mathrm{ub}})$. But given $E_n$, the size of components $v' \in V_*$ is uniformly bounded by $C_2\log n$. Thus, conditioned on $E_n$, we have $\max_{v' \in V_*} S(v', T_{\mathrm{ub}}) \le C_2(\log n)^{C_6+1}$ with probability at least $1 - 1/n$. At this point, we recall that $\mathbb{P}[E_n] > 1 - 2/n$, and the result follows. $\square$

6. Characterizing the periphery. Consider a factor graph $G$ when it has a nontrivial 2-core. Recall the definitions of the 2-core, backbone and periphery of a graph from Section 3.2. First, we note some of the properties of these subgraphs that will be useful in the proof of the main lemmas of this section. As a matter of notation, for a bipartite graph $G$ chosen uniformly at random from the set $\mathbb{G}(n,k,m)$ we denote by $G_P$ the periphery of $G$ and by $G_p$ (lower case subscript) a subgraph of $G$ that is a potential candidate for being the periphery of $G$. Similarly, we denote by $G_B$ the backbone of $G$ and by $G_b$ a subgraph of $G$ that is a potential candidate for being the backbone of $G$.

6.1. Proof of Lemma 3.9: Periphery is conditionally a uniform random graph.
Lemma 3.9 states that if we fix the number of nodes and the check degree profile of the periphery of a graph $G$ chosen uniformly at random from the set $\mathbb{G}(n,k,m)$, then the periphery $G_P$ is distributed uniformly at random conditioned on being peelable. Since the original graph $G$ is chosen uniformly at random, in order to prove this lemma it is enough to count, for each possible choice of the periphery $G_P$, the number of graphs $G$ that have the periphery $G_P$.

Before proving Lemma 3.9, we first introduce the concept of a "rigid" graph and establish a monotonicity property for the backbone augmentation procedure which was defined in Section 3.2. We use the notation $G \subseteq G'$ if $G$ is a subgraph of $G'$.

Lemma 6.1. Let $G = (F,V,E)$ be a bipartite graph and let $G_s$ be the subgraph of $G$ induced by some $F_s \subseteq F$. Let $F_l$ and $F_u$ be subsets of $F$ such that $F_l \subseteq F_u$ and $F_l \subseteq F_s$. Let $B_l^{(0)}$ be the subgraph induced by $F_l$ (so $B_l^{(0)} \subseteq G_s$) and let $B_u^{(0)}$ be the subgraph induced by $F_u$. Denote by $B_l^{(\infty)}$ the output of the backbone augmentation process on $G_s$ with the initial graph $B_l^{(0)}$ and by $B_u^{(\infty)}$ the output of the backbone augmentation process on $G$ with the initial graph $B_u^{(0)}$. Then $B_l^{(\infty)} \subseteq B_u^{(\infty)}$.

The proof of Lemma 6.1 can be found in Appendix C.

Definition 6.2. Define a graph to be rigid if its backbone is the whole graph. We denote by $\mathcal{R}(n,k,m)$ the class of rigid graphs with $n$ variable nodes, and $m$ check nodes each of degree $k$.

Lemma 6.3. Consider a bipartite graph $G = (F,V,E)$ from the ensemble $\mathbb{G}(n,k,m)$. For some set of check nodes $F_b \subseteq F$ denote by $G_b = (F_b, V_b, E_b)$ the subgraph induced by $F_b$, and denote by $G_p = (F_p, V_p, E_p)$ the subgraph of $G$ induced by the pair $(F_p \equiv F \setminus F_b,\ V_p \equiv V \setminus V_b)$.
Assume $G_b$ and $G_p$ satisfy the following conditions:
• $G_p$ is peelable,
• $G_b$ is rigid,
• $|\partial a| \ge 2$, $\forall a \in F_p$.
Then $G_b$ is the backbone of $G$ (and $G_p$ is the periphery).

Proof. If $G_b$ is empty, the lemma is trivially true. Assume $G_b$ is nonempty. We prove this lemma in two steps. In the first step we prove that $G_b$ is a subgraph of $G_B$, the backbone of $G$. In the second step we show that $G_B$ cannot contain anything outside $G_b$.

Since $G_b$ is rigid, it contains a nonempty 2-core $(G_b)_C$ and the output of the backbone augmentation procedure with initial graph $(G_b)_C$ is $G_b$ itself. Furthermore, $(G_b)_C$ is part of $G_C$, the 2-core of the original graph $G$, since by definition a 2-core is the maximal stopping set (cf. Definition 2.2) and $(G_b)_C$ is a stopping set in $G$. Hence, the monotonicity of the backbone augmentation procedure implies that $G_b \subseteq G_B$.

In the second step, we prove that $G_B$ cannot contain any node outside $G_b$. First, note that $G_p$ cannot contain any check node from the 2-core of the original graph $G$. We prove this by contradiction. Suppose instead that $\widetilde{F}$ is the nonempty set of all the check nodes from the 2-core of $G$ that are in $G_p$. Let $\widetilde{V}$ be the set of neighbors of $\widetilde{F}$ in $G_p$. The nodes in $\widetilde{V}$ are also part of the 2-core of $G$ and have degree at least 2 in the 2-core of $G$. Furthermore, there is no edge incident from $F_b$ to $V_p$ because, by definition, $G_b$ is check-induced. In particular, in the 2-core of $G$, there is no other edge incident on variables in $\widetilde{V}$ beyond the ones coming from $\widetilde{F}$. Hence, in the nonempty subgraph $\widetilde{G} \subseteq G_p$ induced by the check nodes in $\widetilde{F}$ and all their neighbors, every variable node has degree at least 2. This subgraph is then, by definition, a stopping set in $G_p$. But by assumption $G_p$ is peelable and cannot contain a stopping set.
This is a contradiction that rules out the existence of a nonempty set $\widetilde{F}$. Hence, the 2-core of $G$ is contained entirely in $G_b$ (recall that both $G_b$ and the 2-core are check-induced).

Let $B(G_C)$ and $B(G_b)$ be the outputs of the backbone augmentation procedure on $G$, once with initial subgraph given by the 2-core of $G$ and once with the initial subgraph given by $G_b$ (which contains the 2-core of $G$). By monotonicity, $B(G_C) \subseteq B(G_b)$. But the process with the initial subgraph $G_b$ terminates immediately since, by assumption, all check nodes outside $G_b$ have at least two neighbors in $G_p$. Therefore, $G_B = B(G_C) \subseteq B(G_b) = G_b$. This completes our proof. $\square$

It is easy to see that the converse of Lemma 6.3 is also true, as stated below.

Remark 6.4. If $G_b = G_B$ is the backbone of $G$, then the subgraphs $G_b$ and $G_p = G \setminus G_b = G_P$ satisfy the conditions of Lemma 6.3. Here, $G \setminus G_b$ denotes the subgraph of $G$ induced by $(F \setminus F_b, V \setminus V_b)$.

Notice that the fact that the graph $G \setminus G_b$ is peelable follows from the connection between the peeling algorithm and BP$_0$ stated in Lemmas 4.2 and 4.3. We stated that the messages coming out of the backbone are always $0$. From the check node update rule, an incoming $0$ message to a check node can be dropped without changing any of the outgoing messages as long as there is at least one other incoming message. By definition, there is no edge between variable nodes in the periphery and check nodes in the backbone. Furthermore, all the check nodes in the periphery have at least two neighbors in the periphery. Therefore, BP$_0$ on the periphery has the same messages as the corresponding messages of BP$_0$ on the whole graph. In particular, the fixed point of BP$_0$ on the periphery is all $*$ messages, which shows that the periphery subgraph is peelable.

We now prove Lemma 3.9.
Proof of Lemma 3.9. Our goal is to characterize the probability of observing the periphery of $G$ to be $G_p = (F_p, V_p, E_p)$. We use the shorthand notation $G \setminus G_p$ to denote the subgraph of $G$ induced by the check-variable nodes pair $(F \setminus F_p, V \setminus V_p)$. Let $G_b = (F \setminus F_p, V \setminus V_p, E_b) = G \setminus G_p$ and $E_{pb} = \{(i, a) \mid i \in V \setminus V_p, a \in F_p\}$ be a set of edges that satisfy the condition $\deg_{E_p}(a) + \deg_{E_{pb}}(a) = k$ for all $a \in F_p$. As before, we denote by $G_P$ and $G_B$ the actual periphery and backbone of the graph $G$. Define the set of rigid graphs on $n_b$ variable nodes, $m_b$ check nodes and check degree $k$, $\mathcal{R}(n_b, k, m_b)$, as
\[
\mathcal{R}(n_b, k, m_b) = \{ G_b = (F_b, V_b, E_b) : |F_b| = m_b,\ |V_b| = n_b,\ |\partial a| = k\ \forall a \in F_b,\ G_b \text{ is rigid} \}. \tag{37}
\]
By Lemma 6.3,
\[
\{ G \in \mathbb{G}(n, k, m) : G_P = G_p,\ G_B = G_b \} = \{ G \in \mathbb{G}(n, k, m) : G_p \subseteq G,\ G \setminus G_p = G_b,\ G_p \in \mathcal{P},\ G_b \in \mathcal{R} \}, \tag{38}
\]
and in particular,
\[
\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \} = \{ G \in \mathbb{G}(n, k, m) : G_p \subseteq G,\ G_p \in \mathcal{P},\ G \setminus G_p \in \mathcal{R} \}. \tag{39}
\]
From equation (39), and counting all the choices for the subgraph $G_b = G \setminus G_p$, and the edges that connect $G_p$ and $G_b$,
\[
|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = \sum_{G_b} \sum_{E_{pb}} |\{ G \in \mathbb{G}(n, k, m) : G_p \subseteq G,\ G_p \in \mathcal{P},\ G \setminus G_p = G_b,\ G_b \in \mathcal{R},\ E \setminus (E_p \cup E_b) = E_{pb} \}|. \tag{40}
\]
For fixed $G_p$ and $G_b$, we can count the number of ways these two subgraphs can be connected to each other. Letting $\bar R$ be the degree profile of $G_p$, we have
\[
|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = \sum_{G \setminus G_p} \prod_{l=2}^{k} \binom{n - |V_p|}{k - l}^{|F_p| \bar R_l}\, \mathbb{I}(G \setminus G_p \in \mathcal{R})\, \mathbb{I}(G_p \in \mathcal{P}). \tag{41}
\]
We can rewrite this as
\[
|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = \prod_{l=2}^{k} \binom{n - |V_p|}{k - l}^{|F_p| \bar R_l}\, |\mathcal{R}(n - |V_p|, k, m - |F_p|)|\, \mathbb{I}(G_p \in \mathcal{P}). \tag{42}
\]
It is clear that the cardinality of the set $\mathcal{R}(n_b, k, m_b)$ is a function of only $n_b$ and $m_b$. Hence,
\[
|\{ G \in \mathbb{G}(n, k, m) : G_P = G_p \}| = Z(n_p, k, R_p, m_p)\, \mathbb{I}(G_p \in \mathcal{P}), \tag{43}
\]
for some function $Z(\cdot, \cdot, \cdot, \cdot)$. Since the graph $G$ itself was chosen uniformly at random from the set $\mathbb{G}(n, k, m)$, this shows that conditioned on $(n_p, R_p, m_p)$, all graphs $G_p \in \mathcal{P}$ with $n_p$ variable nodes, $m_p$ check nodes, and check degree profile $R_p$ are equally likely to be observed. □

6.2. Proof of Lemma 3.12: Periphery is exponentially peelable.

Let $G = (F, V, E)$ be a graph drawn uniformly at random from $\mathbb{G}(n, k, \alpha n)$, and let $G_P = (F_P, V_P, E_P)$ be its periphery. Recall the connection between $\mathrm{BP}_0$ and the peeling algorithm from Section 4. Let $Q$ be defined as in Theorem 1, that is, $Q$ is the largest positive solution of $Q = 1 - \exp\{-k \alpha Q^{k-1}\}$. In light of Lemma 4.8, we define the asymptotic degree profile pair of the periphery, $(\bar\alpha, \bar R(x))$, as follows (recall that, from Lemma 4.3, the periphery does include check nodes receiving at most $k - 2$ messages of type 0).

Definition 6.5.
\[
\bar R(x) \equiv \frac{1}{1 - Q^k - k(1 - Q)Q^{k-1}} \cdot \sum_{l=2}^{k} \binom{k}{l} (1 - Q)^l Q^{k-l} x^l, \tag{44}
\]
\[
\bar\alpha \equiv \alpha\, \frac{1 - Q^k - k(1 - Q)Q^{k-1}}{1 - Q}. \tag{45}
\]

Unlike the backbone, where all check nodes are of degree $k$, the periphery can have check nodes of degrees between 2 and $k$. Among these, check nodes of degree 2 are of importance to us since they can potentially form long strings. Strings are particularly unfriendly structures for the peeling algorithm; peeling takes linear time to peel such structures. In the next lemma, we define a parameter $\theta$ as a function of $Q$, which is the estimated branching factor of the subgraph of the periphery induced by check nodes of degree 2.
Lemma 6.6 proves that this branching factor is less than one for all $\alpha \in (\alpha_d(k), 1]$.

Lemma 6.6. Let $\theta \equiv \alpha k (k - 1)(1 - Q) Q^{k-2}$ with $Q$ as defined in Theorem 1. Then $\theta < 1$ for all $\alpha \in (\alpha_d(k), 1]$.

The proof of this lemma can be found in Appendix C.

Lemma 6.7. Let $Q$ be defined as in Theorem 1. Then there exists $\eta_1 = \eta_1(\alpha, k) > 0$ such that the pair $(\bar\alpha, \bar R)$ defined in Definition 6.5 is peelable at rate $\eta_1$. Further, $0 \le f(z, \bar\alpha, \bar R) \le (1 - \eta_1) z$ for all $z \in (0, 1]$.

Proof. In view of the density evolution recursion (Definition 14), define $f(z) = 1 - \exp(-\bar\alpha \bar R'(z))$. We prove the lemma by showing that $f'(0) = \theta < 1$ and that $f(z) < z$ strictly for $z \in (0, 1]$.

Using the definitions of $\bar\alpha$ and $\bar R(z)$, the function $f(z)$ can be written as
\[
f(z) = 1 - \exp\bigl(-\alpha k \bigl( (Q + (1 - Q) z)^{k-1} - Q^{k-1} \bigr)\bigr). \tag{46}
\]
By a straightforward calculation, and using Lemma 6.6, we get
\[
f'(0) = \bar\alpha \bar R''(0) \exp(-\bar\alpha \bar R'(0)) = \alpha k (k - 1)(1 - Q) Q^{k-2} = \theta < 1. \tag{47}
\]
Assume $0 \le y \le 1$ to be a fixed point of $f$, that is,
\[
y = 1 - \exp\bigl(-\alpha k \bigl( (Q + (1 - Q) y)^{k-1} - Q^{k-1} \bigr)\bigr). \tag{48}
\]
Using the identity $Q = 1 - \exp(-\alpha k Q^{k-1})$, which gives $(1 - Q)(1 - y) = \exp(-\alpha k (Q + (1 - Q) y)^{k-1})$, we get
\[
Q + (1 - Q) y = 1 - \exp\bigl(-\alpha k (Q + (1 - Q) y)^{k-1}\bigr). \tag{49}
\]
Equation (49) shows that $Q + (1 - Q) y$ is a fixed point of the original density evolution recursion (14) with $R(x) = x^k$. Since, by definition, $Q$ is the largest fixed point of that recursion, $y = 0$ is the only fixed point of $f(z) = 1 - \exp(-\bar\alpha \bar R'(z))$ in the interval $[0, 1]$. Since $f'(0) < 1$, we have $f(z) < z$ for all $z \in (0, 1]$ and, therefore, $f(z)/z < 1$ for all $z \in [0, 1]$. The claim follows by taking $\eta_1 = 1 - \sup_{z \in [0, 1]} f(z)/z$, with $\eta_1 > 0$ by continuity of $z \mapsto f(z)/z$ over the compact $[0, 1]$. □

We can now prove Lemma 3.12.
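The quantities in Lemmas 6.6 and 6.7 are easy to evaluate numerically, which gives a quick sanity check of equations (46) and (47). The sketch below is not part of the proof. It assumes the illustrative values $k = 3$ and $\alpha = 0.85$ (chosen here on the assumption that they lie in $(\alpha_d(3), 1]$), finds $Q$ by iterating the map $Q \mapsto 1 - \exp(-k\alpha Q^{k-1})$ downward from $Q = 1$, and then evaluates $\theta$ and the ratio $f(z)/z$ on a grid. All function names are hypothetical.

```python
import math

def largest_fixed_point(alpha, k, iters=2000):
    """Largest solution in [0, 1] of Q = 1 - exp(-k*alpha*Q**(k-1)).

    The map is increasing, so iterating it from Q = 1 decreases
    monotonically to the largest fixed point.
    """
    q = 1.0
    for _ in range(iters):
        q = 1.0 - math.exp(-k * alpha * q ** (k - 1))
    return q

def theta(alpha, k, q):
    """Branching factor of Lemma 6.6."""
    return alpha * k * (k - 1) * (1.0 - q) * q ** (k - 2)

def f(z, alpha, k, q):
    """Density evolution map of the periphery, equation (46)."""
    w = q + (1.0 - q) * z
    return 1.0 - math.exp(-alpha * k * (w ** (k - 1) - q ** (k - 1)))

alpha, k = 0.85, 3  # illustrative parameters, assumed to satisfy alpha > alpha_d(k)
Q = largest_fixed_point(alpha, k)
th = theta(alpha, k, Q)
# Grid estimate of sup f(z)/z over (0, 1]; 1 minus this value estimates eta_1.
sup_ratio = max(f(j / 1000, alpha, k, Q) / (j / 1000) for j in range(1, 1001))
print(Q, th, sup_ratio)
```

For parameters below the clustering threshold the iteration collapses to $Q = 0$, so a strictly positive `Q` is itself a check that the chosen $\alpha$ is in the intended range.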
Proof of Lemma 3.12. For any $\varepsilon > 0$, by Lemmas 4.3 and 4.8, we know that
\[
|\alpha_P - \bar\alpha| < \varepsilon, \qquad |R^P_l - \bar R_l| < \varepsilon \ \text{ for } l \in \{2, \ldots, k\}, \tag{50}
\]
hold w.h.p. As before, let $f(z, \alpha, R) = 1 - \exp\{-\alpha R'(z)\}$. Using $R^P_0 = R^P_1 = 0$ we obtain that the function $f(z, \alpha, R)/z$ is an analytic function over the set $[0, 1]^{k+2}$. By Lemma 6.7, $f(z, \bar\alpha, \bar R)/z \le 1 - \eta_1$. It follows that, for $\varepsilon > 0$ small enough, $\partial f(z, \bar\alpha, \bar R)/\partial z \le 1 - (\eta_1/2)$, using continuity of $\partial f/\partial z$ with respect to the other arguments of $f$. We infer that the periphery is w.h.p. peelable at rate $\eta = \eta_1/2$. This proves part (i). Part (ii) follows immediately from Lemma 4.8. □

7. Proof of Lemma 3.5.

We find it convenient to work within the configuration model: we assume here that $G$ is drawn uniformly at random from $\mathbb{C}(n, k, m)$. The following fact is an immediate consequence of Lemma 5.3.

Fact 7.1. Assume $G$ is drawn uniformly at random from $\mathbb{C}(n, k, m)$, and denote by $n_C$, $m_C$ the number of variable and check nodes in the core of $G$. Suppose $(n_C = n_c, m_C = m_c)$ occurs with positive probability. Then conditioned on $(n_C = n_c, m_C = m_c)$, the core is drawn uniformly from $\mathbb{C}(n_c, k, m_c; 0, n_c)$ [recall the definition of this ensemble in equation (33)]. In words, the core is drawn uniformly from $\mathbb{C}(n_C, k, m_C)$ conditioned on all variable nodes having degree 2 or more.

Now, it has been proved [14] that, w.h.p.
\[
|n_C/n - (1 - \exp(-\alpha k \hat Q)(1 + \alpha k \hat Q))| = o(1), \tag{51}
\]
\[
|m_C/n - \alpha Q^k| = o(1), \tag{52}
\]
where $(Q, \hat Q)$ is as defined in Theorem 1. The above bounds also follow from Lemmas 4.3 and 4.5.

The kernel of the core system $\mathcal{S}_C$ contains all vectors $x$ with the following property. Let $V^{(1)} \subseteq V_C$ be the subset of variables taking value 1 in $x$ (i.e., the support of $x$).
Then the subgraph of $G_C$ induced by $V^{(1)}$ has no check node with odd degree. We will refer to such subgraphs as even subgraphs. Explicitly, even subgraphs are variable-induced subgraphs such that no check node has odd degree. We want to characterize the even subgraphs of $G_C$ having no more than $n\varepsilon$ variable nodes, in terms of their size and number. Lemma 7.4 in Section 7.1 below allows us to do this provided certain conditions are met. Our next lemma tells us that the core meets these conditions w.h.p.

Lemma 7.2. Fix $k$ and consider any $\alpha \in (\alpha_d(k), \alpha_s(k))$. There exists $\delta = \delta(\alpha, k) > 0$ such that the following happens. Let $G$ be drawn uniformly from $\mathbb{C}(n, k, \alpha n)$. Let $n_C$ be the (random) number of variable nodes in the core, $m_C$ be the number of check nodes in the core and $\alpha_C \equiv m_C/n_C$. Let $\eta_C$ be the unique positive solution of
\[
\frac{\eta_C (e^{\eta_C} - 1)}{e^{\eta_C} - 1 - \eta_C} = \alpha_C k \tag{53}
\]
and let $\theta_{2C} \equiv \eta_C (k - 1)/(e^{\eta_C} - 1)$. For any $\delta' > 0$, we have, w.h.p.:
(i) $\theta_{2C} \le 1 - \delta$.
(ii) $\alpha_C \in [2/k + \delta, 1]$.
(iii) $n_C/n \ge (1 - \exp(-\alpha k \hat Q)(1 + \alpha k \hat Q)) - \delta'$.

The discussion in Section 7.1 throws light on the definitions of $\eta_C$ and $\theta_{2C}$ used.

Proof of Lemma 7.2. From equations (51), (52), we deduce that $\eta_C = \alpha k \hat Q + o(1)$ w.h.p., leading to $\theta_{2C} = \alpha k (k - 1) Q^{k-2} (1 - Q) + o(1) \le 1 - \delta$ for sufficiently small $\delta$, using Lemma 6.6. Thus, we have established point (i). Point (iii) and the lower bound in point (ii) are easy consequences of equations (51), (52). The upper bound in point (ii), $\alpha_C \le 1$ w.h.p., follows directly from the fact that for $\alpha < \alpha_s$, the system $H x = b$ has a solution for all $b \in \{0, 1\}^m$ w.h.p. □

Proof of Lemma 3.5. Consider first $G \sim \mathbb{C}(n, k, m)$.
Applying Fact 7.1 and Lemma 7.2, we deduce that, conditional on the number of nodes, the core is $G_C \sim \mathbb{C}(n_c, k, m_c; 0, n_c)$ and satisfies the conditions of Lemma 7.4 proved below. By Lemma 7.4, the elements of $\mathcal{L}_C(\varepsilon n)$ are in correspondence with simple loops in the subgraph of $G_C$ induced by degree-2 variable nodes. The sparsity bounds follow from Lemma 7.4. The claim that they are, with high probability, disjoint follows instead from the fact that this random subgraph is subcritical (since $2\alpha R_2 < 1$), and hence decomposes into trees and unicyclic components. Using Lemma 4.4, we deduce that the result holds also for $G \sim \mathbb{G}(n, k, m)$, as required. □

7.1. Characterizing even subgraphs of the core.

This section aims at characterizing the small even subgraphs of the core $G_C$. For the sake of simplicity, we shall drop the subscript C throughout the subsection.

Fix $k$. Consider some $\alpha > 2/k$. Let $\eta_* > 0$ be defined implicitly by
\[
\frac{\eta_* (e^{\eta_*} - 1)}{e^{\eta_*} - 1 - \eta_*} = \alpha k. \tag{54}
\]
For $\alpha \in (2/k, \infty)$, we have $\eta_*(\alpha) > 0$ and $\eta_*$ is an increasing function of $\alpha$ at fixed $k$ [14]. Consider a graph $G = (F, V, E)$ drawn uniformly at random from $\mathbb{C}(n, k, \alpha n; 0, n)$. The rationale for this definition of $\eta_*$ is that the asymptotic degree distribution of variable nodes in $G$ is Poisson($\eta_*$) conditioned on the outcome being greater than or equal to 2 [to be denoted below Poisson$_{\ge 2}(\eta_*)$]. We are interested in even subgraphs of $G$.

Consider the subgraph $G_2 = (F, V^{(2)}, E^{(2)})$ of $G$ induced by variable nodes of degree 2 (with all factor nodes retained). The asymptotic branching factor of this subgraph turns out to be $\theta_2 \equiv \eta_* (k - 1)/(e^{\eta_*} - 1)$. We impose the condition $\theta_2 \le 1 - \delta$ for some $\delta > 0$ (since this is true of the core).
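Equation (54) has no closed-form solution, but $\eta_*$ and $\theta_2$ are straightforward to compute. The following sketch solves (54) by bisection, using the fact that the left-hand side is the mean of a Poisson variable conditioned to be at least 2, which increases from 2 as $\eta_*$ grows. It then checks, for illustrative parameters $k = 3$, $\alpha = 0.85$, that the core ratio $\alpha_C$ predicted by (51) and (52) yields $\eta_* = \alpha k \hat Q$ and $\theta_2 = \theta$, as claimed in the proof of Lemma 7.2. The function names are hypothetical, and the relation $\hat Q = Q^{k-1}$ is an assumption made here about the definition in Theorem 1.

```python
import math

def truncated_poisson_mean(eta):
    """Mean of Poisson(eta) conditioned on being >= 2: left-hand side of (54)."""
    em1 = math.expm1(eta)
    return eta * em1 / (em1 - eta)

def eta_star(alpha, k, iters=200):
    """Solve (54) for eta > 0 by bisection; requires alpha * k > 2."""
    target = alpha * k
    lo, hi = 1e-6, target  # the mean exceeds eta, so the root lies below target
    if truncated_poisson_mean(lo) >= target:
        raise ValueError("need alpha * k > 2")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if truncated_poisson_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def theta2(eta, k):
    """Asymptotic branching factor of the degree-2 subgraph."""
    return eta * (k - 1) / math.expm1(eta)

def core_alpha(alpha, k, iters=2000):
    """Core ratio alpha_C = m_C / n_C from (51)-(52), assuming Q-hat = Q**(k-1).

    Only meaningful when the fixed-point iteration ends at Q > 0.
    """
    q = 1.0
    for _ in range(iters):
        q = 1.0 - math.exp(-k * alpha * q ** (k - 1))
    q_hat = q ** (k - 1)
    n_c = 1.0 - math.exp(-alpha * k * q_hat) * (1.0 + alpha * k * q_hat)
    m_c = alpha * q ** k
    return m_c / n_c, q

alpha, k = 0.85, 3  # illustrative parameters
alpha_c, Q = core_alpha(alpha, k)
eta = eta_star(alpha_c, k)
print(alpha_c, eta, alpha * k * Q ** (k - 1))
print(theta2(eta, k), alpha * k * (k - 1) * (1.0 - Q) * Q ** (k - 2))
```

Bisection is used rather than Newton's method because the left-hand side of (54) is monotone and the bracket $[10^{-6}, \alpha k]$ is known in advance, so no derivative or starting guess is needed.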
Note that $\theta_2$ is a decreasing function of $\eta_*$, and hence a decreasing function of $\alpha$, for fixed $k$.

First, we state a technical lemma that we find useful.

Lemma 7.3. Consider any $k$, any $\alpha \in (2/k, 1]$ and $\varepsilon \in (0, 1]$. Then there exists $N_0 \equiv N_0(k, \varepsilon) < \infty$ and $C = C(k) < \infty$ such that the following occurs for all $n > N_0$. Consider a graph $G = (F, V, E)$ drawn uniformly at random from $\mathbb{C}(n, k, m; 0, n)$, $m = n\alpha$. With probability at least $1 - 1/n$, there is no subset of variable nodes $V' \subseteq V$ such that $|V'| \le \varepsilon n$ and the sum of the degrees of nodes in $V'$ exceeds $C \varepsilon \log(1/\varepsilon) n$.

Proof. Let $\deg(i)$ be the degree of variable node $i \in V$. Let $X_i \sim$ Poisson$_{\ge 2}(\eta_*)$ be i.i.d. for $i \in V$. Then $(\deg(i))_{i=1}^{n}$ is distributed as $(X_i)_{i=1}^{n}$, conditioned on $\sum_{i=1}^{n} X_i = mk$. Consider $V' = \{1, 2, \ldots, l\}$. We have
\[
\mathbb{P}\Bigl\{ \sum_{i=1}^{l} \deg(i) \ge \gamma l \Bigr\} = \mathbb{P}\Bigl\{ \sum_{i=1}^{l} X_i \ge \gamma l \Bigm| \sum_{i=1}^{n} X_i = mk \Bigr\} \le \frac{\mathbb{P}\{ \sum_{i=1}^{l} X_i \ge \gamma l \}}{\mathbb{P}\{ \sum_{i=1}^{n} X_i = mk \}}.
\]
Now, $n \mathbb{E}[X_i] = n \alpha k = mk$, by our choice of $\eta_*$ in equation (54). Since $\alpha \le 1$, we deduce that $\eta_* \le C_1 = C_1(k) < \infty$. Using a local central limit theorem (CLT) for lattice random variables (Theorem 5.4 of [22]) we obtain $\mathbb{P}\{ \sum_{i=1}^{n} X_i = mk \} \ge C_2 n^{-1/2}$ for some $C_2 = C_2(k) > 0$. A standard Chernoff bound yields $\mathbb{P}\{ \sum_{i=1}^{l} X_i \ge \gamma l \} \le \exp\{-l \gamma C_3\}$, for some $C_3(k) \in (0, 1]$, provided $\gamma > 2\alpha k$. Thus, we obtain
\[
\mathbb{P}\Bigl\{ \sum_{i=1}^{l} \deg(i) \ge \gamma l \Bigr\} \le n^{1/2} \exp\{-l \gamma C_3\}/C_2, \tag{55}
\]
provided $\gamma > 2\alpha k$. We use $\gamma = C'(1 + \log(1/\varepsilon))$ with $C' = 2\alpha k/C_3$. Take $l = \varepsilon n$. The number of different subsets of variable nodes of size $l$ is $\binom{n}{l} \le (e/\varepsilon)^l$ for $n \ge N_1$ for some $N_1 = N_1(\varepsilon) < \infty$. A union bound gives the desired result. □

Lemma 7.4. Fix $k \ge 3$, and $\delta > 0$ so that for any $\alpha \in [2/k + \delta, 1]$, we have $\theta_2(\alpha, k) \le 1 - \delta$.
Then, for any $\delta' > 0$, there exists $\varepsilon = \varepsilon(\delta, k) > 0$, $C = C(\delta, \delta', k) < \infty$ and $N_0 = N_0(\delta, \delta', k) < \infty$ such that the following occurs for every $n > N_0$. Consider a graph $G = (F, V, E)$ drawn uniformly at random from $\mathbb{C}(n, k, \alpha n; 0, n)$. With probability at least $1 - \delta'$, both the following hold:
(i) Consider minimal even subgraphs consisting of only degree 2 variable nodes. There are no more than $C$ such subgraphs. Each of them is a simple cycle consisting of no more than $C$ variable nodes.
(ii) Every even subgraph of $G$ with less than $\varepsilon n$ variable nodes contains only degree 2 variable nodes.

Proof. Part (i): Reveal the $mk$ edges of $G$ sequentially. The expected number of nodes in $V^{(2)}$, conditioned on the first $t$ edges revealed, forms a martingale with differences bounded by 2. Then, from the Azuma–Hoeffding inequality [19], we deduce that $|V^{(2)}|$ concentrates around its expectation:
\[
\mathbb{P}\bigl( \bigl| |V^{(2)}| - \mathbb{E}[|V^{(2)}|] \bigr| \ge \zeta \sqrt{n} \bigr) \le \exp(-\hat C_1 \zeta^2)
\]
for all $\zeta > 0$, where $\hat C_1 = \hat C_1(k) > 0$. The expectation can be computed for instance using the Poisson representation as in the proof of Lemma 7.3, yielding
\[
\bigl| \mathbb{E}|V^{(2)}| - n \eta_*^2/(2(e^{\eta_*} - 1 - \eta_*)) \bigr| \le n^{3/4},
\]
for all $\alpha < 1$, $n \ge \hat N_0(k)$. We deduce that for any $\delta_1 = \delta_1(\delta, k) > 0$, we have
\[
\mathbb{P}\bigl( \bigl| |V^{(2)}|/n - \eta_*^2/(2(e^{\eta_*} - 1 - \eta_*)) \bigr| \ge \delta_1 \bigr) \le 1/n \tag{56}
\]
for all $n > \hat N_1$, where $\hat N_1 = \hat N_1(\delta, k) < \infty$.

Now, condition on $|V^{(2)}| = n^{(2)}$, for some $n^{(2)}$ such that
\[
\bigl| n^{(2)}/n - \eta_*^2/(2(e^{\eta_*} - 1 - \eta_*)) \bigr| < \delta_1. \tag{57}
\]
Note that by choosing $\delta_1$ small enough, we can ensure $n^{(2)} = \Omega(n)$. We are now interested in the check degree distribution $R^{(2)}$ in $G_2$. Reveal the $2 n^{(2)}$ edges of $G_2$ sequentially. Consider $l \in \{0, 1, \ldots, k\}$.
The expected number of check nodes with degree $l$ in $G_2$, conditioned on the edges revealed thus far, forms a martingale with differences bounded by 2. Let $Z \sim \mathrm{Binom}(k, 2 n^{(2)}/(mk))$. We have $\mathbb{E}[R^{(2)}_l] = \mathbb{P}(Z = l) + O(1/n)$. Arguing as above for each $l \le k$, we finally obtain
\[
\mathbb{P}\Bigl( \sum_{l=0}^{k} |R^{(2)}_l - \mathbb{P}(Z = l)| \ge \delta_1 \Bigr) \le 1/n \tag{58}
\]
for all $n > \hat N_2$, where $\hat N_2 = \hat N_2(\delta, k) < \infty$.

Now condition on both $n^{(2)}$ satisfying equation (57) and $R^{(2)}$ satisfying
\[
\sum_{l=0}^{k} |R^{(2)}_l - \mathbb{P}(Z = l)| < \delta_1.
\]
Let $\zeta$ be the branching factor of $G_2$ (i.e., of a graph that is uniformly random conditional on the degree profile $R^{(2)}$). Under the above conditions on $n^{(2)}$ and $R^{(2)}$, a straightforward calculation implies that $\zeta$ is bounded above by $\theta_2 + \delta_2$, for some $\delta_2 = \delta_2(\delta_1, k)$ such that $\delta_2 \to 0$ as $\delta_1 \to 0$. Thus, by selecting appropriately small $\delta_1$, we can ensure that $\delta_2 \le \delta/2$, leading to a bound of $1 - \delta/2$ on the branching factor for all $n^{(2)}$, $R^{(2)}$ within the range specified above.

Now we condition also on the degree sequence, that is, the sequence of check node degrees in $G_2$. The factor graph $G_2$ can be naturally associated to a graph, by replacing each variable node by an edge and each check node by a vertex. This graph is distributed according to the standard (nonbipartite) configuration model. Using [37], Theorem 4, we obtain that the numbers of cycles of length $l \in \{1, 2, \ldots, l_0\}$, for a constant $l_0$, are asymptotically independent Poisson random variables, with parameters$^{8}$ $\lambda_l = \zeta^l/(2l)$ for
\[
\zeta = \Bigl[ \sum_{d=1}^{k} d(d - 1) R^{(2)}(d) \Bigr] \Big/ \Bigl[ \sum_{d=1}^{k} d R^{(2)}(d) \Bigr].
\]
More precisely, for any constants $c_1, c_2, \ldots$
, $c_{l_0} \in \mathbb{N} \cup \{0\}$, we have
\[
\mathbb{P}[\mathsf{E}_n(c)] = \prod_{l=1}^{l_0} \mathbb{P}(\mathrm{Poisson}(\lambda_l) = c_l) + o(1),
\]
where $\mathsf{E}_n(c)$ is the event that there are $c_l$ cycles of length $l$ for $l \in \{1, 2, \ldots, l_0\}$ with all cycles disjoint from each other, and $c = (c_l)_{l=1}^{l_0}$. Choosing $l_0$ large enough, we have
\[
\sum_{c \in \mathcal{N}} \mathbb{P}[\mathsf{E}_n(c)] \ge 1 - \exp\Bigl( -\sum_{l=1}^{\infty} \lambda_l \Bigr) - \delta'/4 = 1 - (1 - \zeta)^{1/2} - \delta'/4,
\]
where $\mathcal{N} = \{ c : c \ne 0,\ c_l \le l_0 \text{ for } l \in \{1, 2, \ldots, l_0\} \}$, for $n$ large enough. Here we used $\sum_{l \ge 1} \zeta^l/(2l) = -\tfrac{1}{2} \log(1 - \zeta)$. On the other hand, we know that the probability of having no cycles in $G_2$ is $(1 - \zeta)^{1/2} + o(1)$ under our assumption of $\zeta \le 1 - \delta/2$. The argument for this was already outlined in the proof of Lemma 3.11, cf. Section 5.1: the Poisson approximation of [37] is used to estimate the probability of having no cycles of length smaller than $M$, while a simple first moment bound is sufficient for cycles of length $M$ or larger. Thus, with probability at least $1 - \delta'/3$, we have no more than $l_0^2$ cycles, disjoint and each of length no more than $l_0$. Choosing $C = l_0^2$, we obtain part (i) with probability at least $1 - \delta'/2$ for large enough $n$.

$^{8}$The model in [37] is slightly different from the configuration model in its treatment of self-loops and double edges. However, the results and proof can be adapted to the configuration model.

Part (ii): Let $m \equiv \alpha n$. Let $N(G; l, j)$ be the number of even subgraphs of $G$ induced by $l$ variable nodes such that the sum of the degrees of the $l$ variable nodes is $2(l + j)$. We are interested in $l \le \varepsilon n$ (we will choose $\varepsilon$ later) and $j > 0$. In particular, we want to show that, for any $\delta' > 0$,
\[
\mathbb{P}\Bigl\{ \sum_{l=1}^{\varepsilon n} \sum_{j=1}^{mk/2} N(G; l, j) > 0 \Bigr\} \le \delta'/2. \tag{59}
\]
This immediately implies the desired result from linearity of expectation and Markov inequality.
From Lemma 7.3, we deduce that
\[
\mathbb{P}\Bigl\{ \sum_{l=1}^{\varepsilon n} \sum_{j=\varepsilon' n}^{mk/2} N(G; l, j) > 0 \Bigr\} \le 1/n, \tag{60}
\]
for some $\varepsilon'(\varepsilon, k)$ with the property that $\varepsilon' \to 0$ as $\varepsilon \to 0$. Thus, we only need to establish
\[
\sum_{l=1}^{\varepsilon n} \sum_{j=1}^{\varepsilon' n} \mathbb{E}[N(G; l, j)] \le \delta'/3, \tag{61}
\]
for all $n$ large enough, since the claim then follows from Markov inequality. A straightforward calculation [29, 36] yields
\[
\mathbb{E}[N(G; l, j)] = \binom{n}{l} \frac{T_1 T_2 T_3}{\binom{mk}{2(l+j)} T_4},
\]
where
\[
T_1 = \mathrm{coeff}[(e^y - 1 - y)^l; y^{2(l+j)}],
\]
\[
T_2 = \mathrm{coeff}[(e^y - 1 - y)^{n-l}; y^{mk - 2(l+j)}],
\]
\[
T_3 = \mathrm{coeff}\Bigl[ \Bigl( \frac{(1 + y)^k + (1 - y)^k}{2} \Bigr)^m; y^{2(l+j)} \Bigr],
\]
\[
T_4 = \mathrm{coeff}[(e^y - 1 - y)^n; y^{mk}].
\]
It is useful to recall the following probabilistic representation of combinatorial coefficients.

Fact 7.5. For any $\eta > 0$, we have
\[
\mathrm{coeff}[(e^y - 1 - y)^N; y^M] = \eta^{-M} (e^{\eta} - 1 - \eta)^N\, \mathbb{P}\Bigl[ \sum_{i=1}^{N} X_i = M \Bigr], \tag{62}
\]
where $X_i \sim$ Poisson$_{\ge 2}(\eta)$ are i.i.d. for $i \in \{1, \ldots, N\}$.

Consider $T_4$. By definition, cf. equation (54), $\eta_*$ is such that for $X_i \sim$ Poisson$_{\ge 2}(\eta_*)$ we have $\mathbb{E}[X_i] = \alpha k = mk/n$. Moreover, $\alpha \in [2/k + \delta, 1]$ implies $\eta_* \in [C_1, C_2]$ for some $C_1 = C_1(\delta, k) > 0$ and $C_2 = C_2(k) < \infty$. From $\eta_* \le C_2$ and using a local CLT for lattice random variables [22], it follows that $\mathbb{P}[\sum_{i=1}^{n} X_i = mk] \ge C_3/\sqrt{n}$ for some $C_3 = C_3(\delta, k) > 0$. Thus, using Fact 7.5, we have
\[
T_4 \ge \eta_*^{-mk} (e^{\eta_*} - 1 - \eta_*)^n\, C_3 n^{-1/2}. \tag{63}
\]
Now, consider $T_2$. Again use $\eta = \eta_*$ in Fact 7.5. From $\eta_* \ge C_1$ and again using a local CLT for lattice random variables [22], we obtain $\mathbb{P}[\sum_{i=1}^{n-l} X_i = mk - 2(l + j)] \le C_4/(2\sqrt{n - l}) \le C_4/\sqrt{n}$ for some $C_4 = C_4(\delta, k) < \infty$, since $l \le \varepsilon n$. Thus, Fact 7.5 yields
\[
T_2 \le \eta_*^{-mk + 2(l+j)} (e^{\eta_*} - 1 - \eta_*)^{n-l}\, C_4 n^{-1/2}. \tag{64}
\]
Fact 7.5 yields that $T_1$ can be bounded above as
\[
T_1 \le \eta^{-2(l+j)} (e^{\eta} - 1 - \eta)^l \tag{65}
\]
for any $\eta > 0$. We will choose a suitable $\eta$ later. Finally, for $T_3$, similar to Fact 7.5, we can deduce that
\[
T_3 \le \Bigl( \frac{(1 + \xi)^k + (1 - \xi)^k}{2} \Bigr)^m \xi^{-2(l+j)}
\]
for all $\xi > 0$. Now, it is easy to check that
\[
\frac{(1 + \xi)^k + (1 - \xi)^k}{2} \le \exp\Bigl\{ \binom{k}{2} \xi^2 \Bigr\},
\]
by comparing coefficients in the series expansions of both sides. Choosing $\xi = \sqrt{(l + j)/(m \binom{k}{2})}$, we obtain
\[
T_3 \le \Bigl( \frac{e m \binom{k}{2}}{l + j} \Bigr)^{l+j}. \tag{66}
\]
Finally, we have
\[
\binom{n}{l} \le \frac{n^l}{l!}, \qquad \binom{mk}{2(l+j)} \ge \frac{(mk - 2(l + j))^{2(l+j)}}{(2(l + j))!}. \tag{67}
\]
Putting together equations (63), (64), (65), (66) and (67), we obtain
\[
\mathbb{E}[N(G; l, j)] \le C_6 \cdot \frac{(e^{\eta} - 1 - \eta)^l}{\eta^{2(l+j)}} \cdot \frac{\eta_*^{2(l+j)}}{(e^{\eta_*} - 1 - \eta_*)^l} \times \Bigl( \frac{e (k - 1)(1 + C_5((l + j)/n))}{2(l + j) k} \Bigr)^{l+j} \cdot \frac{(2(l + j))!}{l!\, \alpha^l m^j},
\]
for some $C_6 = C_6(k, \delta) < \infty$. Now, $N! \ge C_7 \sqrt{N} (N/e)^N$ for all $N \in \mathbb{N}$, for some $C_7 > 0$. Using this with $N = l + j$, we obtain
\[
\Bigl( \frac{e}{l + j} \Bigr)^{l+j} \cdot \frac{(2(l + j))!}{l!} \le \frac{\sqrt{l + j}}{C_7} \cdot \frac{(2(l + j))!}{l!\,(l + j)!} \le \frac{\sqrt{l + j}}{C_7\, l^j} \cdot \binom{2(l + j)}{l + j} \le \frac{C_8\, 2^{2(l+j)}}{l^j},
\]
for some $C_8 < \infty$. Plugging back, we get
\[
\mathbb{E}[N(G; l, j)] \le C_9 (T_5)^l (T_6)^j,
\]
where
\[
T_5 = \frac{2 \theta_2 (e^{\eta} - 1 - \eta)}{\eta^2} (1 + C_5((l + j)/n)), \qquad T_6 = \frac{4 (k - 1) \eta_*^2}{m l \eta^2}.
\]
Without loss of generality, assume $\delta \le 0.1$. Now, we choose $\varepsilon = \varepsilon(\delta, k) > 0$ such that $\varepsilon + \varepsilon' \le \delta/(10 C_5)$. We choose $\eta = \eta(k) > 0$ such that $(e^{\eta} - 1 - \eta) \eta^{-2} \le (1 + \delta/10)/2$ [note that $(e^{\eta} - 1 - \eta) \eta^{-2} \to 1/2$ as $\eta \to 0$]. This leads to $T_5 \le 1 - \delta/2$ for all $l \le \varepsilon n$ and $j \le \varepsilon' n$, when we use $\theta_2 \le 1 - \delta$. Also, $T_6 \le C_{10}/n$ for all $l, j$, for some $C_{10} = C_{10}(k) < \infty$. Thus,
\[
\mathbb{E}[N(G; l, j)] \le C_9 (1 - \delta/2)^l \Bigl( \frac{C_{10}}{n} \Bigr)^j.
\]
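Fact 7.5, on which the bounds for $T_1$, $T_2$ and $T_4$ rely, can be checked exactly for small parameters. The sketch below computes the coefficient on the left-hand side of (62) with exact rational arithmetic and compares it with the right-hand side, where the probability is obtained by convolving the Poisson$_{\ge 2}(\eta)$ probability mass function. The helper names and the test values $N = 4$, $M = 11$ are illustrative assumptions; the point is that the identity holds for every $\eta > 0$, which is what allows $\eta$ to be tuned separately in (63), (64) and (65).

```python
from fractions import Fraction
from math import exp, factorial, isclose

def conv_power(weights, N, M):
    """Coefficient of y^M in (sum_{d>=2} weights[d] * y^d)^N, by N convolutions."""
    poly = [1] + [0] * M  # the constant polynomial 1, truncated at degree M
    for _ in range(N):
        new = [0] * (M + 1)
        for i, a in enumerate(poly):
            if a:
                for d in range(2, M + 1 - i):
                    new[i + d] += a * weights[d]
        poly = new
    return poly[M]

def coeff(N, M):
    """Exact coeff[(e^y - 1 - y)^N ; y^M], returned as a Fraction."""
    w = [Fraction(0), Fraction(0)]
    w += [Fraction(1, factorial(d)) for d in range(2, M + 1)]
    return conv_power(w, N, M)

def prob_sum(N, M, eta):
    """P[X_1 + ... + X_N = M] for i.i.d. X_i ~ Poisson(eta) conditioned on X_i >= 2."""
    norm = exp(eta) - 1 - eta
    pmf = [0.0, 0.0]
    pmf += [eta ** d / (factorial(d) * norm) for d in range(2, M + 1)]
    return conv_power(pmf, N, M)

def fact_7_5_rhs(N, M, eta):
    """Right-hand side of equation (62)."""
    return eta ** (-M) * (exp(eta) - 1 - eta) ** N * prob_sum(N, M, eta)

lhs = float(coeff(4, 11))
for eta in (0.5, 1.3, 3.0):
    print(eta, isclose(lhs, fact_7_5_rhs(4, 11, eta), rel_tol=1e-9))
```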
Summing over $j$ and $l$, we obtain
\[
\sum_{l=1}^{\varepsilon n} \sum_{j=1}^{\varepsilon' n} \mathbb{E}[N(G; l, j)] \le \frac{C_{11}}{n} \tag{68}
\]
for some $C_{11} = C_{11}(k, \delta) < \infty$. This implies equation (61) for large enough $n$, as required. □

8. Proof of Lemma 3.8: A sparse basis for low-weight core solutions.

For each $x_C \in \mathcal{L}_C(\varepsilon n)$, we need to find a sparse solution $x \in \mathcal{S}_1$ that matches $x_C$ on the core. From Lemma 3.5, we know that w.h.p., $x_C$ consists of all zeros except for a small subset of variables. Indeed, we know from Lemma 7.4 that these variables correspond to a cycle of degree-2 variable nodes. Although this is not used in the following, we shall nevertheless refer to the set of variable nodes corresponding to an element of $\mathcal{L}_C(\varepsilon n)$ as a cycle. Denote by $L_1$ the cycle corresponding to $x_C$.

Recall that the noncore $G_{NC} = (F_{NC}, V_{NC}, E_{NC})$ is the subgraph of $G$ induced by $F_{NC} = F \setminus F_C$ and $V_{NC} = V \setminus V_C$. Suppose we set all noncore variables to 0. The set of violated checks consists of those checks in $F_{NC}$ that have an odd number of neighbors in $L_1$. We show that w.h.p., each such check can be satisfied by changing a small number of noncore variables in its neighborhood to 1. To show that this is possible, we make use of the belief propagation algorithm described in Section 4.

Our strategy is roughly the following. Consider a violated check $a$. We wish to set an odd number of its noncore neighboring variables to 1. But then, this may cause further checks to be violated, and so on. A key fact comes to our rescue. If check node $a$ receives an incoming $*$ message in round $T$, then we can find a subset of noncore variable nodes in a $T$-neighborhood of $a$ such that if we set those variables to 1, check $a$ will be satisfied (with an odd number of neighboring ones in the noncore) without causing any new violations.
We do this for each violated check. Now w.h.p., for suitable $T$, all violated checks will receive at least one incoming $*$ by time $T$ (note that each noncore check receives an incoming $*$ at the BP fixed point). Thus, we can satisfy them all by setting a small number of noncore variables to 1.

Lemma 8.1. Consider $G$ drawn uniformly from $\mathbb{G}(n, k, m)$. Denote by $F^{(l)} \subseteq F_{NC}$ the checks in the noncore having degree $l$ with respect to the noncore, for $l \in \{1, 2, \ldots, k\}$. Condition on the core $G_C$, and $F^{(l)}$ for $l \in \{1, 2, \ldots, k\}$.
• Then $E_{C,NC}$ and $G_{NC}$ are independent of each other. Here $E_{C,NC}$ denotes the edges between core variables $V_C$ and noncore checks $F_{NC}$.
• The edges in $E_{C,NC}$ are distributed as follows: For each $a \in F_{NC}$, if $a \in F^{(l)}$, its neighborhood in $G_C$ is a uniformly random subset of $V_C$ of size $k - l$, independent of the others.
• Clearly, $(G_C, (F^{(l)})_{l=1}^{k})$ uniquely determine the parameters $(n_{NC}, R_{NC}, m_{NC})$ of the noncore. The noncore $G_{NC}$ is drawn uniformly at random from $\mathbb{D}(n_{NC}, R_{NC}, m_{NC})$ conditioned on being peelable, that is, $G_{NC}$ is drawn uniformly at random from $\mathbb{D}(n_{NC}, R_{NC}, m_{NC}) \cap \mathcal{P}$.

Proof. Each $G \in \mathbb{G}(n, k, m)$ with the given $(G_C, (F^{(l)})_{l=1}^{k})$ has a $G_{NC}$ corresponding to a unique element of $\mathbb{D}(n_{NC}, R_{NC}, m_{NC}) \cap \mathcal{P}$ and $E_{C,NC}$ corresponding to a subset of $V_C$ of size $k - l$ for each $a \in F^{(l)}$, for $l \in \{1, \ldots, k\}$. The converse is also true. This yields the result. □

Proof of Lemma 3.8. Take any sequence $(s_n)_{n \ge 1}$ such that $\lim_{n \to \infty} s_n = \infty$ and $s_n \le \varepsilon n$. If points (i), (ii) and (iii) in Lemma 3.5 hold, let $V_{\mathrm{cycle}}$ denote the union of the supports of the solutions in $\mathcal{L}_C(s_n)$.
Let
\[
\mathsf{E}_1 \equiv \mathsf{E}_{1,a} \cap \mathsf{E}_{1,b} \cap \mathsf{E}_{1,c},
\]
\[
\mathsf{E}_{1,a} \equiv \{ \text{Points (i), (ii) and (iii) in Lemma 3.5 hold} \},
\]
\[
\mathsf{E}_{1,b} \equiv \{ |F^{(l)}| \ge n/C_2 \text{ for all } l \in \{1, 2, \ldots, k\} \},
\]
\[
\mathsf{E}_{1,c} \equiv \{ \text{No variable in } V_{\mathrm{cycle}} \text{ has degree exceeding } \log s_n \}.
\]
(Note that these events are implicitly indexed by $n$.) We argue that $\mathsf{E}_1$ holds w.h.p. for an appropriate choice of $C_2 = C_2(k, \alpha) < \infty$. Indeed, Lemma 3.5 implies that $\mathsf{E}_{1,a}$ holds w.h.p. Lemma 4.8 implies that $\mathsf{E}_{1,b}$ holds w.h.p. for sufficiently large $C_2$. Finally, Lemma 8.1 and a subexponential tail bound on the Poisson distribution ensure $\mathsf{E}_{1,c}$ holds w.h.p.

Assume that $\mathsf{E}_1$ holds. Let sets of variable nodes on the disjoint cycles corresponding to elements of $\mathcal{L}_C(\varepsilon n)$ be denoted by $L_i$ for $i \in \{1, 2, \ldots, |\mathcal{L}_C(\varepsilon n)|\}$. Consider a cycle $L_i$. Denote by $a_{ij}$, $j \in \{1, 2, \ldots, Z_i\}$, the checks in the noncore having an odd number of neighbors in $L_i$. (Thus, $Z_i$ is the number of such checks.) Call these marked checks. Given $\mathsf{E}_1$, we know that $Z_i \le s_n \log s_n$, and that there are no more than $s_n^2 \log s_n$ marked checks in total:
\[
\sum_{i=1}^{|\mathcal{L}_C(\varepsilon n)|} Z_i \le s_n^2 \log s_n.
\]
Define
\[
\mathsf{E}_2 \equiv \{ \text{No more than } n/s_n^3 \text{ messages change after } T_n \text{ iterations of } \mathrm{BP}_0 \}.
\]
By Lemma 4.10, the event $\mathsf{E}_2$ holds w.h.p. provided $\lim_{n \to \infty} T_n = \infty$ and $s_n$ grows sufficiently slowly with $n$ [for the given choice of $(T_n)_{n \ge 1}$]. Let
\[
\mathsf{B}_{ij} \equiv \{ \text{Not all messages incoming to check } a_{ij} \text{ have converged to their fixed-point value in } T_n \text{ iterations} \}.
\]
We wish to show that
\[
\bigcap_{i,j} \mathsf{B}_{ij}^c \tag{69}
\]
holds w.h.p. We have
\[
\mathbb{P}\Bigl[ \bigcup_{i,j} \mathsf{B}_{ij} \Bigr] \le \mathbb{E}_{(G_C, E_{C,NC})} \Bigl[ \mathbb{E}\Bigl[ \mathbb{I}[\mathsf{E}_1, \mathsf{E}_2] \sum_{i,j} \mathbb{I}[\mathsf{B}_{ij}] \Bigm| G_C, E_{C,NC} \Bigr] \Bigr] + \mathbb{P}[\mathsf{E}_1^c] + \mathbb{P}[\mathsf{E}_2^c].
\]
Given $\mathsf{E}_2$, we know that the number of checks for which an incoming message changes after $T_n$ is no more than $n/s_n^3$. Suppose $a_{ij} \in F^{(l)}$ is a marked check.
Then we have
\[
\mathbb{E}\bigl[ \mathbb{I}[\mathsf{E}_1, \mathsf{E}_2]\, \mathbb{I}[\mathsf{B}_{ij}] \bigm| G_C, E_{C,NC} \bigr] \le \frac{n}{s_n^3 |F^{(l)}|} \le \frac{C_2}{s_n^3},
\]
since all check nodes in $F^{(l)}$ are equivalent with respect to the noncore, from Lemma 8.1, and $|F^{(l)}| \ge n/C_2$ under $\mathsf{E}_1$. We already know that under $\mathsf{E}_1$, the number of marked checks is bounded by $s_n^2 \log s_n$. This leads to
\[
\mathbb{P}\Bigl[ \bigcup_{i,j} \mathsf{B}_{ij} \Bigr] \le \frac{C_2 \log s_n}{s_n} + \mathbb{P}[\mathsf{E}_1^c] + \mathbb{P}[\mathsf{E}_2^c] \xrightarrow{n \to \infty} 0,
\]
implying equation (69) holds w.h.p.

Condition on $G_C$ and $E_{C,NC}$. This identifies the marked checks. Lemma 8.1 guarantees us that all checks in $F^{(l)}$ are equivalent with respect to $G_{NC}$. Suppose $\mathsf{E}_1$ holds. Define a ball of radius $t$ around a check node as consisting of the neighboring variable nodes, and the balls of radius $t$ around each of those variables. Similar to the proof of Lemma 3.11(iii), we can show that
\[
|B_{G_{NC}}(a_{ij}, T_n)| \le C_3^{T_n} \tag{70}
\]
holds with probability at least $1 - C_4 \exp(-2^{T_n}/C_4)$, for some $C_3 = C_3(\alpha, k) < \infty$ and $C_4 = C_4(\alpha, k) < \infty$, for all marked checks $a_{ij}$. Thus, the probability that this bound on ball size holds simultaneously for all marked checks, by union bound, is at least $1 - s_n^2 \log s_n\, C_4 \exp(-2^{T_n}/C_4) \to 1$ as $n \to \infty$, provided $T_n \to \infty$ and $s_n$ grows sufficiently slowly with $n$.

Suppose equation (69) and $\mathsf{E}_1$ hold. Consider any marked check $a_{ij}$ adjacent to $v \in L_i$ for any $L_i$. It receives at least one incoming $*$ message at the $\mathrm{BP}_0$ fixed point and, since $\mathsf{B}_{ij}$ does not occur, this is also true after $T_n$ iterations of $\mathrm{BP}_0$. Hence, there is a subset of variables $V^{(ij)} \subseteq B_{G_{NC}}(a_{ij}, T_n)$, such that setting variables in $V^{(ij)}$ to 1 satisfies $a_{ij}$ without violating any other checks. Define
\[
V^{(i)} \equiv \{ v : v \text{ occurs an odd number of times in the sets } (V^{(ij)})_{j=1}^{Z_i} \}.
\]
It is not hard to verify that the vector $x_{c,i}$ with variables in $L_i \cup V^{(i)}$ set to one and all other variables set to zero, is a member of $\mathcal{S}_1$. If equation (70) holds for all marked checks, then we deduce that $|V^{(i)}| \le C_3^{T_n} s_n \log s_n \le c_n$ for $T_n$ and $s_n$ growing sufficiently slowly with $n$. Thus, $x_{c,i} \in \mathcal{S}_1$ is $c_n$-sparse assuming these events, each of which occurs w.h.p. We repeat this construction for every $L_i$. □

APPENDIX A: PROOF OF LEMMA 3.4

Lemma A.1. Assume that $G$ has no 2-core, and let
\[
K \equiv \begin{bmatrix} H_{F,U}^{-1} H_{F,W} \\ I_{(n-m) \times (n-m)} \end{bmatrix},
\]
where $U$ and $W$ are constructed as in Lemma 3.2, we order the variables as $U$ followed by $W$, and the matrix inverse is taken over $\mathrm{GF}[2]$. Then the columns of $K$ form a basis of the kernel of $\mathcal{S}$, which is also the kernel of $H$. In addition, if $K_{i,j} = 1$, then $d_G(i, j) \le T_C$.

Proof. A standard linear algebra result shows that $K$ is a basis for the kernel of $H$. The bottom identity block of $K$ corresponds to the $(n - m)$ independent variables $w \in W$, and in this block a 1 only occurs if the row and column correspond to the same variable, that is, for $i, j \in W$, $K_{i,j} = 1$ implies $i = j$, and thus $d_G(i, j) = 0$.

To prove the distance claim for the upper block of $K$, we proceed by induction on $T_C$. For a variable $u \in U$ that is peeled along with factor node $a \in F$, we will reference $u$ via the factor node it was peeled with as $u_a$.

• Induction base: For $T_C = 1$, $H_{F,U} = I_m$, and thus
\[
K = \begin{bmatrix} H_{F,W} \\ I_{(n-m) \times (n-m)} \end{bmatrix}.
\]
Since $T_C = 1$, note that every variable node must be connected to no more than 1 factor node. Thus, $(H_{F,W})_{a,i} = 1$ implies that factor node $a$ was connected to independent variable node $i$. Thus, variables $i$ and $u_a$ are both adjacent to factor $a$, and consequently $d_G(u_a, i) = 1$.
• Inductive step: Assume that $T_C = T + 1$ and consider the graph $J(G) = (F_J, V_J, E_J)$ (recall that $J$ denotes the peeling operator). By construction $T_C(J(G)) = T$, and thus by the inductive hypothesis the columns of
$$K_{J(G)} \equiv \begin{bmatrix} \tilde{K} \\ I_{((n-n_1)-(m-m_1))\times((n-n_1)-(m-m_1))} \end{bmatrix} \equiv \begin{bmatrix} H_{F_J,U_J}^{-1} H_{F_J,W_J} \\ I_{((n-n_1)-(m-m_1))\times((n-n_1)-(m-m_1))} \end{bmatrix}$$
form a basis for the kernel of $H_{J(G)}$, where $F_J$, $U_J$, and $W_J$ refer to the set of factor nodes of the factor graph $J(G)$, and their corresponding partition, respectively. In addition, $(K_{J(G)})_{a,i} = 1$ only if $d_{J(G)}(u_a, i) \le T$. To extend this basis to a basis for the kernel of $H$, note that
$$K \equiv \begin{bmatrix} H_{F,U}^{-1} H_{F,W} \\ I_{(n-m)\times(n-m)} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} H_{F_1,U_1} & H_{F_1,U_J} \\ 0 & H_{F_J,U_J} \end{bmatrix}^{-1} \begin{bmatrix} H_{F_1,W_1} & H_{F_1,W_J} \\ 0 & H_{F_J,W_J} \end{bmatrix} \\ I_{(n-m)\times(n-m)} \end{bmatrix}$$
$$= \begin{bmatrix} \begin{bmatrix} I_{|U_1|} & -H_{F_1,U_J} H_{F_J,U_J}^{-1} \\ 0 & H_{F_J,U_J}^{-1} \end{bmatrix} \begin{bmatrix} H_{F_1,W_1} & H_{F_1,W_J} \\ 0 & H_{F_J,W_J} \end{bmatrix} \\ I_{(n-m)\times(n-m)} \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} H_{F_1,W_1} & H_{F_1,W_J} + H_{F_1,U_J}\tilde{K} \\ 0 & \tilde{K} \end{bmatrix} \\ I_{(n-m)\times(n-m)} \end{bmatrix}.$$
By construction if $(H_{F_1,W_1})_{a,i} = 1$, then $d_G(u_a, i) = 1 \le T$. Consider the $(a,i)$ entry of the matrix $B \equiv H_{F_1,W_J} + H_{F_1,U_J}\tilde{K}$. A necessary condition for $B_{a,i} = 1$ is the existence of an edge between check node $a \in F_1$ and independent variable node $i \in W \setminus W_1 = W_J$ [i.e., $(H_{F_1,W_J})_{a,i} = 1$], or the existence of both an edge between $a \in F_1$ and dependent variable node $j \in U_J$ that is in the basis for independent variable $i$ [i.e., $(H_{F_1,U_J})_{a,j} = 1$, $\tilde{K}_{j,i} = 1$]. We note that if $d_{J(G)}(u_a, i) \le T$, then $d_G(u_a, i) \le T$ also, since $E_J \subset E$. Thus, if $(H_{F_1,U_J})_{a,j} = 1$, $\tilde{K}_{j,i} = 1$, then $d_G(u_a, i) \le T + 1$. Similarly, if $(H_{F_1,W_J})_{a,i} = 1$, then $d_G(u_a, i) = 1$ as in the base case.
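The form of $K$ in Lemma A.1, with $H_{F,U}^{-1} H_{F,W}$ stacked on an identity block, can be checked numerically on a toy instance. The sketch below is an illustration only: the helper names (`gf2_matmul`, `gf2_inverse`, `kernel_basis`) and the small matrices are hypothetical, and the inverse is computed by plain Gauss-Jordan elimination rather than by following the peeling order. Over GF(2) the sign in the block inverse is immaterial, since $-1 = 1$.

```python
def gf2_matmul(a, b):
    """Multiply two 0/1 matrices (lists of rows) over GF(2)."""
    cols = list(zip(*b))
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in cols]
            for row in a]

def gf2_inverse(a):
    """Invert a square 0/1 matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(a)
    # Augment with the identity: [A | I].
    aug = [list(row) + [int(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                # Row addition over GF(2) is a bitwise XOR.
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kernel_basis(h_fu, h_fw):
    """Stack H_{F,U}^{-1} H_{F,W} on top of an identity block."""
    top = gf2_matmul(gf2_inverse(h_fu), h_fw)
    k = len(h_fw[0])
    return top + [[int(i == j) for j in range(k)] for i in range(k)]

# Hypothetical instance: 3 checks, dependent variables U (3 columns),
# independent variables W (2 columns); H = [H_{F,U} | H_{F,W}].
h_fu = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
h_fw = [[1, 0], [0, 1], [1, 1]]
k_mat = kernel_basis(h_fu, h_fw)
h = [u + w for u, w in zip(h_fu, h_fw)]
product = gf2_matmul(h, k_mat)   # should be the all-zero 3 x 2 matrix
```

Each column of `k_mat` is a solution of $Hx = 0$ that sets exactly one independent variable to 1, which is why the bottom block is the identity.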
Thus, if $K_{i,j} = 1$, then $d_G(i,j) \le T + 1 = T_C$. □

A direct result of this is the sparsity bound given below.

Lemma A.2. For $K$ constructed as in Lemma A.1, the columns of $K$ form an $s$-sparse basis for the kernel of $H$, with
$$s \le \max_{i \in V} |B_G(i, T_C)|.$$

Proof. By Lemma A.1, $d_G(a,i) \le T_C$ is a necessary condition for $K_{a,i} = 1$. Thus, for all $i \in W$, the $i$th column of $K$ can only contain 1's on the entries that correspond to variables at distance at most $T_C$ from $i$. The result follows by taking the maximum over all $i \in W$. □

Proof of Lemma 3.4. Let
$$\hat{K} = L \begin{bmatrix} Q_{F_*,U_*}^{-1} Q_{F_*,W_*} \\ I_{(n-m)\times(n-m)} \end{bmatrix},$$
where the matrix inverse is taken over $\mathrm{GF}[2]$. If $G_* \ne G$, then all degree 2 check nodes constrain their adjacent variable nodes to the same value. Therefore, all variables in the same connected component take on the same value in a satisfying solution, that is, for all $v_* \in V_*$, if $Hx = 0$, then either $x_i = 0$ for all $i \in v_*$ or $x_i = 1$ for all $i \in v_*$. Consequently, $Hx = 0$ if and only if $x = L x_*$ for some $x_*$ such that $Q x_* = 0$. Thus, $\{x^{(1)}, \ldots, x^{(N)}\}$ is a basis for the kernel of $H$ if and only if $x^{(i)} = L x_*^{(i)}$ and $\{x_*^{(1)}, \ldots, x_*^{(N)}\}$ is a basis for the kernel of $Q$. Finally notice that $L x_*$ has $|v_*|$ nonzero entries for each $v_* \in V_*$ such that $x_{*,v_*} \ne 0$. Thus, the sparsity bound follows as a direct extension of the bound from Lemma A.2, and the columns of $\hat{K}$ form an $s$-sparse basis for the kernel of $H$, with
$$s \le \max_{v_* \in V_*} S(v_*, T_C(G_*)).$$
□

APPENDIX B: PROOFS OF TECHNICAL LEMMAS IN SECTION 5

Proof of Lemma 5.2. Let $\omega \equiv \alpha R'(1)$. Define $f(z) \equiv 1 - \lambda(1 - \rho(z)) = 1 - \exp(-\alpha R'(1)\rho(z))$. We obtain
$$f'(0) = 2\alpha R_2. \tag{71}$$
Since we know that $z_t \to 0$ as $t \to \infty$, it follows that $\lim_{t\to\infty} z_{t+1}/z_t = f'(0)$.
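The convergence of the ratio $z_{t+1}/z_t$ to $f'(0) = 2\alpha R_2$ can be observed numerically. The sketch below is a minimal illustration under stated assumptions: it takes $R(z) = \sum_l R_l z^l$ and $\rho(z) = R'(z)/R'(1)$, so that $f(z) = 1 - \exp(-\alpha R'(z))$, and it uses a hypothetical degree profile ($R_2 = 0.3$, $R_3 = 0.7$) with $\alpha = 0.5$, values chosen only so that the iteration contracts to zero.

```python
import math

def f(z, alpha, profile):
    """Density-evolution map f(z) = 1 - exp(-alpha * R'(z)).

    `profile` maps a check degree l to the fraction R_l of checks of that
    degree, so that R(z) = sum_l R_l z^l (an assumption of this sketch).
    """
    r_prime = sum(l * r_l * z ** (l - 1) for l, r_l in profile.items())
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-alpha * r_prime)

def ratios(alpha, profile, steps=40, z0=1.0):
    """Iterate z_{t+1} = f(z_t) and return the ratios z_{t+1} / z_t."""
    z, out = z0, []
    for _ in range(steps):
        z_next = f(z, alpha, profile)
        out.append(z_next / z)
        z = z_next
    return out

alpha = 0.5
profile = {2: 0.3, 3: 0.7}          # hypothetical check degree profile
limit = 2 * alpha * profile[2]      # f'(0) = 2 * alpha * R_2, as in (71)
last_ratio = ratios(alpha, profile)[-1]
```

Using `math.expm1` matters here: once $z_t$ is very small, evaluating `1 - math.exp(-x)` directly would round to zero and the ratio would be lost.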
We then deduce from peelability at rate $\eta$ that
$$f'(0) \le 1 - \eta. \tag{72}$$
Combining equations (71) and (72), we obtain the desired result (i).

In order to prove (ii) notice that, for the pair to be peelable, we need $z \ge 1 - \exp(-\alpha R'(z))$ for all $z \in [0,1]$, that is,
$$R'^{-1}(x) \ge 1 - e^{-\alpha x} \quad \text{for all } x \in [0, R'(1)], \tag{73}$$
where $R'^{-1}$ is the inverse mapping of $z \mapsto R'(z)$. We next integrate the above over $[0, R'(1)]$, using
$$\int_0^{R'(1)} R'^{-1}(x)\,\mathrm{d}x = \int_0^1 w R''(w)\,\mathrm{d}w = R'(1) - 1, \tag{74}$$
$$\int_0^{R'(1)} (1 - e^{-\alpha x})\,\mathrm{d}x = R'(1) - \frac{1}{\alpha}\big(1 - e^{-\alpha R'(1)}\big). \tag{75}$$
We thus obtain
$$1 \le \frac{1}{\alpha}\big(1 - e^{-\alpha R'(1)}\big), \tag{76}$$
which yields $\alpha \le 1 - e^{-\alpha R'(1)} < 1$. □

Proof of Lemma 5.3. We use the notation $\mathbf{m}(G) = (m_l(G))_{l=2}^k$ whereby $m_l(G)$ is the number of check nodes of degree $l$ in $G$. Let
$$n_1^{(t)} \equiv n_1(J_t), \quad n_2^{(t)} \equiv n_2(J_t), \quad \mathbf{m}^{(t)} \equiv \mathbf{m}(J_t),$$
$$\alpha^{(t)} \equiv \Big(\sum_{l=2}^k m_l^{(t)}\Big)\Big/ n, \qquad R_l^{(t)} \equiv m_l^{(t)} \Big/ \Big(\sum_{l'=2}^k m_{l'}^{(t)}\Big) \quad \text{for } l \in \{2, 3, \ldots, k\}.$$
Note that $R^{(t)}$ defined above is, in fact, the check degree profile of $J_t$. As above, let $J(\cdot)$ denote the operator corresponding to one round of synchronous peeling [so that $J_t = J^t(G)$]. Define the set
$$S(G; \hat{\mathbf{m}}, \hat{n}_1, \hat{n}_2) \equiv \{\hat{G} : n_1(\hat{G}) = \hat{n}_1,\ n_2(\hat{G}) = \hat{n}_2,\ \mathbf{m}(\hat{G}) = \hat{\mathbf{m}},\ J(\hat{G}) = G\}.$$
We prove the result by induction. By definition, we know that $J_0 = G$ is drawn uniformly from $\mathbb{C}(n, R, \alpha n)$. Suppose, conditioned on $\mathbf{m}^{(t)}, n_1^{(t)}, n_2^{(t)}$, the graph $J_t$ is drawn uniformly from $\mathbb{C}(n, R^{(t)}, \alpha^{(t)} n)$. Let the probability of each possible $J_t$ [with parameters $(\mathbf{m}^{(t)}, n_1^{(t)}, n_2^{(t)})$] be denoted by $q(\mathbf{m}^{(t)}, n_1^{(t)}, n_2^{(t)})$. Consider a candidate graph $G'$ with parameters $(\mathbf{m}', n_1', n_2')$.
We have
$$\mathbb{P}[J_{t+1} = G'] = \sum_{J_t : J(J_t) = G'} \mathbb{P}[J_t] = \sum_{\hat{\mathbf{m}}, \hat{n}_1, \hat{n}_2} \ \sum_{J_t \in S(G'; \hat{\mathbf{m}}, \hat{n}_1, \hat{n}_2)} \mathbb{P}[J_t] = \sum_{\hat{\mathbf{m}}, \hat{n}_1, \hat{n}_2} q(\hat{\mathbf{m}}, \hat{n}_1, \hat{n}_2)\, |S(G'; \hat{\mathbf{m}}, \hat{n}_1, \hat{n}_2)|.$$
A straightforward count yields
$$|S(G'; \hat{\mathbf{m}}, \hat{n}_1, \hat{n}_2)| = \binom{n - n_1' - n_2'}{\hat{n}_1} \cdot \Delta! \cdot \mathrm{coeff}\big[(e^z - 1)^{n_1'} (e^z)^{n_2'};\ z^{\Delta - \hat{n}_1}\big] \cdot \mathbb{I}[\hat{n}_2 = n_1' + n_2'],$$
where $\Delta \equiv \sum_{l=1}^k (\hat{m}_l - m_l')\, l$. Thus, $\mathbb{P}[J_{t+1} = G']$ depends on $G'$ only through $(\mathbf{m}', n_1', n_2')$. □

To simplify the proof of Lemma 5.4, we first prove a simple technical lemma.

Lemma B.1. Let $G = (F, V, E)$ be a factor graph that is a tree with no check node of degree 1 or 2, rooted at a variable node $v$, with $|V| > 1$. Then
$$|\{u \in V : \deg(u) \le 1,\ u \ne v\}| \ge |V|/2,$$
that is, at least half of all variable nodes are leaves. (Here, a leaf is defined as a variable node that is distinct from the root and has degree at most 1.)

Proof. We proceed by induction on the maximum depth $t$ of the tree $G$ rooted at $v$.

• Induction base: For a tree of depth 1, let $c = \deg(v) > 0$. Since all check nodes have degree 3 or more, $G$ has $N_l \ge 2c$ leaves and $|V| = N_l + 1$. Clearly, $N_l \ge |V|/2$.

• Inductive step: Consider $G$ having depth $t + 1$ and perform 1 round of synchronous peeling, resulting in $J(G) = G' = (F', V', E')$. Let $N_l'$ be the number of leaves in $V'$. The inductive hypothesis implies $|V'| \le 2 N_l'$, since $G'$ is also a tree. Since, by construction, every factor node has degree at least 3 in $G$, every leaf in $G'$ must have at least 2 leaves in $G$ as descendants, that is, $2 N_l' \le N_l$, where $N_l$ is the number of leaves in $G$. Combining these two inequalities yields $|V| = |V'| + N_l \le 2 N_l' + N_l \le 2 N_l$, as desired. □

Proof of Lemma 5.4.
By Lemma B.1, if $G$ is a tree, at least one-half of all variable nodes are leaves at every stage of peeling. Thus, $G$ is peelable and $T_C(G) \le \lceil \log_2 |V| \rceil$. (After $\lceil \log_2 |V| \rceil - 1$ rounds of peeling, we have at most 2 variable nodes remaining, and hence no checks. At most one more round of peeling leads to annihilation.)

Now suppose $G$ is unicyclic. Each factor in the cycle has degree at least 3, hence it has a neighbor outside the cycle and must eventually get peeled. Breaking ties arbitrarily, let $a$ be the first factor in the cycle to be peeled, and let $u \in \partial a$ be the variable node that "causes" it to get peeled (clearly $u$ is not in the cycle). Let $t_u \le T_C(G)$ be the peeling round in which $u$ and $a$ are peeled. Consider the subtree $G_u = (F_u, V_u, E_u)$ rooted at $u$ defined as follows: $G_u$ is the maximal connected subgraph of $G$ that includes $u$, but not $a$. Using Lemma B.1 on this subtree and reasoning as above, we have $t_u \le \lceil \log_2 |V_u| \rceil \le \lceil \log_2 |V| \rceil$.

As at least one factor node in the cycle is peeled in round $t_u$, we must have that $J_{t_u}$ is a tree or forest, which by Lemma B.1 can be peeled in at most $\lceil \log_2 |V| \rceil$ additional iterations, since the number of variable nodes in $J_{t_u}$ is at most $|V|$. Thus, $T_C(G) \le t_u + \lceil \log_2 |V| \rceil$. Combining these two inequalities yields
$$T_C(G) \le t_u + \lceil \log_2 |V| \rceil \le 2 \lceil \log_2 |V| \rceil.$$
□

Proof of Lemma 5.5. The lemma can be derived from known results (see, e.g., [7]), but we find it easier to provide an independent proof. We use a generating function approach to prove the bound
$$\mathbb{P}[Z_T > (\beta\theta)^T] \le 2 \exp(-C(\beta/2)^T). \tag{77}$$
Equation (35) follows (possibly with a different constant $C$) via the union bound.

Define $f(s) \equiv \mathbb{E}[s^{Z_1}] = \sum_{j=0}^\infty s^j b_j$. By assumption, it is clear that $f(s)$ is finite for $s \in (0, 1/(1-\delta))$.
Define $f^{(t)}(s) \equiv \mathbb{E}[s^{Z_t}]$ for $t \ge 1$ [so that $f(s) = f^{(1)}(s)$]. It is well known that
$$f^{(t)}(s) = f(f^{(t-1)}(s)) \tag{78}$$
for $t \ge 2$. It follows that $f^{(t)}(s)$ is finite for $s \in (0, 1/(1-\delta))$, and all $t \ge 2$.

By dominated convergence $f$ is differentiable at 1 with $f'(1) = \theta$. Hence, there exists $\varepsilon_0 > 0$ such that, for all $\varepsilon \in [0, \varepsilon_0]$,
$$f(1 + \varepsilon) \le 1 + 2\theta\varepsilon. \tag{79}$$
By applying the recursion (78) and the fact that $f$ is monotone increasing, we obtain, for all $\varepsilon \in [0, \varepsilon_0]$,
$$f^{(T)}(1 + \varepsilon) \le 1 + (2\theta)^T \varepsilon. \tag{80}$$
In particular, setting $\varepsilon = \varepsilon_0/(2\theta)^T$, we get $f^{(T)}(1 + \varepsilon) \le 1 + \varepsilon_0 \le 2$. Finally, by the Markov inequality,
$$\mathbb{P}\{Z_T \ge (\beta\theta)^T\} \le (1 + \varepsilon)^{-(\beta\theta)^T} f^{(T)}(1 + \varepsilon) \le 2\Big(1 - \frac{\varepsilon}{2}\Big)^{(\beta\theta)^T} \le 2 e^{-\varepsilon(\beta\theta)^T/2} = 2\exp\Big(-\frac{\varepsilon_0}{2}(\beta/2)^T\Big),$$
which completes the proof. □

APPENDIX C: PROOF OF TECHNICAL LEMMAS OF SECTION 6

Proof of Lemma 6.1. We prove this lemma by induction. Let $B_l^{(t)}$ and $B_u^{(t)}$ be the result of $t$ steps of backbone augmentation on graphs $G_s$ and $G$ with initial graphs $B_l^{(0)}$ and $B_u^{(0)}$, respectively. By assumption $B_l^{(0)} \subseteq B_u^{(0)}$. Now assume $B_l^{(t)} \subseteq B_u^{(t)}$. It is enough to show that if $a \in B_l^{(t+1)} \setminus B_l^{(t)}$ then $a \in B_u^{(t+1)}$. Since $a \in B_l^{(t+1)} \setminus B_l^{(t)}$, we know that $a \in G$ and has at most one neighbor outside of $B_l^{(t)}$. By the induction assumption $B_l^{(t)} \subseteq B_u^{(t)}$ and, therefore, $a$ has at most one neighbor outside $B_u^{(t)}$. Hence, either $a \in B_u^{(t)}$ or it is added to $B_u^{(\infty)}$ at step $t + 1$. □

Proof of Lemma 6.6. Define $f(x) = 1 - \exp\{-k\alpha x^{k-1}\}$. It follows immediately from the definition of $\alpha_d(k)$ that, for $\alpha > \alpha_d(k)$, we have $Q > 0$ and $f'(Q) \le 1$. Furthermore, a straightforward calculation yields
$$f'(Q) = k(k-1)\alpha Q^{k-2} \exp\{-k\alpha Q^{k-1}\}. \tag{81}$$
It is therefore sufficient to exclude the case $f'(Q) = 1$.
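The quantities in this argument are easy to evaluate numerically. The sketch below assumes that $Q$ is the largest fixed point of $f$ in $[0,1]$ and obtains it by iterating $x \leftarrow f(x)$ from $x = 1$. The helper names and the two sample values of $\alpha$ are illustrative choices: $\alpha = 0.85$ lies above the commonly quoted value $\alpha_d(3) \approx 0.818$, and $\alpha = 0.80$ lies below it.

```python
import math

def f(x, k, alpha):
    """f(x) = 1 - exp(-k * alpha * x^(k-1)), written with expm1 for accuracy."""
    return -math.expm1(-k * alpha * x ** (k - 1))

def f_prime(x, k, alpha):
    """Derivative of f; at x = Q this is the right-hand side of (81)."""
    return (k * (k - 1) * alpha * x ** (k - 2)
            * math.exp(-k * alpha * x ** (k - 1)))

def largest_fixed_point(k, alpha, iterations=10_000):
    """Iterate x <- f(x) from x = 1.

    Since f is increasing and f(1) < 1, the iterates decrease monotonically
    to the largest fixed point of f in [0, 1].
    """
    x = 1.0
    for _ in range(iterations):
        x = f(x, k, alpha)
    return x

k = 3
q_above = largest_fixed_point(k, alpha=0.85)   # positive fixed point
q_below = largest_fixed_point(k, alpha=0.80)   # collapses to zero
slope_above = f_prime(q_above, k, 0.85)
```

For $\alpha = 0.85$ the iteration settles at a strictly positive $Q$ with $f'(Q) < 1$, in line with the statement being proved, while for $\alpha = 0.80$ the only fixed point reached is $0$.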
Solving the equations $f(Q) = Q$ and $f'(Q) = 1$, we get the following equation for $Q$:
$$-(1 - Q)\log(1 - Q) = \frac{Q}{k-1}, \tag{82}$$
which has a unique positive solution $Q_*(k)$ due to the concavity of the left-hand side. We can then solve for $\alpha$, yielding the unique value $\alpha = \alpha_*(k)$ such that $f(Q) = Q$ and $f'(Q) = 1$ admit a solution. On the other hand, these two equations are satisfied at $\alpha_d(k)$ by a continuity argument. It follows that $\alpha_d(k) = \alpha_*(k)$, and hence $f'(Q) < 1$ for all $\alpha > \alpha_d(k)$. □

Acknowledgements. While this paper was being finished, we became aware that Dimitris Achlioptas and Michael Molloy concurrently obtained related results on the same problem. The two papers are independent. Further, they use different techniques and establish somewhat different results.

REFERENCES

[1] Achlioptas, D. and Coja-Oghlan, A. (2008). Algorithmic barriers from phase transitions. In Proc. of the 49th IEEE Symposium on Foundations of Computer Science, FOCS 793–802. IEEE Computer Society, Los Alamitos, CA.
[2] Achlioptas, D., Coja-Oghlan, A. and Ricci-Tersenghi, F. (2011). On the solution-space geometry of random constraint satisfaction problems. Random Structures Algorithms 38 251–268. MR2663730
[3] Achlioptas, D., Naor, A. and Peres, Y. (2005). Rigorous location of phase transitions in hard optimization problems. Nature 435 759–764.
[4] Achlioptas, D. and Peres, Y. (2004). The threshold for random $k$-SAT is $2^k \log 2 - O(k)$. J. Amer. Math. Soc. 17 947–973 (electronic). MR2083472
[5] Aldous, D. and Lyons, R. (2007). Processes on unimodular random networks. Electron. J. Probab. 12 1454–1508. MR2354165
[6] Aldous, D. and Steele, J. M. (2004). The objective method: Probabilistic combinatorial optimization and local weak convergence. In Probability on Discrete Structures (H. Kesten, ed.). Encyclopaedia Math. Sci. 110 1–72. Springer, Berlin. MR2023650
[7] Athreya, K. B. and Ney, P. E. (1972). Branching Processes. Springer, New York. MR0373040
[8] Balogh, J., Peres, Y. and Pete, G. (2006). Bootstrap percolation on infinite trees and non-amenable groups. Combin. Probab. Comput. 15 715–730. MR2248323
[9] Benjamini, I. and Schramm, O. (1996). Percolation beyond $\mathbb{Z}^d$, many questions and a few answers. Electron. Commun. Probab. 1 71–82 (electronic). MR1423907
[10] Bollobás, B. (1980). A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. European J. Combin. 1 311–316. MR0595929
[11] Bollobás, B. (2001). Random Graphs, 2nd ed. Cambridge Studies in Advanced Mathematics 73. Cambridge Univ. Press, Cambridge. MR1864966
[12] Cocco, S., Dubois, O., Mandler, J. and Monasson, R. (2003). Rigorous decimation-based construction of ground pure states for spin-glass models on random lattices. Phys. Rev. Lett. 90 047205.
[13] Coja-Oghlan, A. (2010). A better algorithm for random $k$-SAT. SIAM J. Comput. 39 2823–2864. MR2645890
[14] Dembo, A. and Montanari, A. (2008). Finite size scaling for the core of large random hypergraphs. Ann. Appl. Probab. 18 1993–2040. MR2462557
[15] Dembo, A. and Montanari, A. (2010). Gibbs measures and phase transitions on sparse random graphs. Braz. J. Probab. Stat. 24 137–211. MR2643563
[16] Dembo, A. and Montanari, A. (2010). Ising models on locally tree-like graphs. Ann. Appl. Probab. 20 565–592. MR2650042
[17] Dembo, A., Montanari, A. and Sun, N. (2013). Factor models on locally tree-like graphs. Ann. Probab. 41 4162–4213. MR3161472
[18] Dietzfelbinger, M., Goerdt, A., Mitzenmacher, M., Montanari, A., Pagh, R. and Rink, M. (2010). Tight thresholds for Cuckoo Hashing via XORSAT. In Proc. of the 37th International Colloquium on Automata, Languages and Programming, ICALP. Lecture Notes in Computer Science 6198 213–225. Springer, Berlin.
[19] Dubhashi, D. P. and Panconesi, A. (2009). Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge Univ. Press, Cambridge. MR2547432
[20] Dubois, O. and Mandler, J. (2002). The 3-XORSAT threshold. In Proc. of the 43rd IEEE Symposium on Foundations of Computer Science, FOCS 769–778. IEEE Computer Society, Los Alamitos, CA.
[21] Friedgut, E. (1999). Sharp thresholds of graph properties, and the $k$-sat problem. J. Amer. Math. Soc. 12 1017–1054. MR1678031
[22] Hall, P. (1982). Rates of Convergence in the Central Limit Theorem. Research Notes in Mathematics 62. Pitman, London. MR0668197
[23] Hoory, S., Linial, N. and Wigderson, A. (2006). Expander graphs and their applications. Bull. Amer. Math. Soc. (N.S.) 43 439–561 (electronic). MR2247919
[24] Kallenberg, O. (2002). Foundations of Modern Probability, 2nd ed. Springer, New York. MR1876169
[25] Kleinjung, T., Aoki, K., Franke, J., Lenstra, A. K., Thomé, E., Bos, J. W., Gaudry, P., Kruppa, A., Montgomery, P. L., Osvik, D. A., te Riele, H., Timofeev, A. and Zimmermann, P. (2010). Factorization of a 768-bit RSA modulus. In Advances in Cryptology – CRYPTO 2010. Lecture Notes in Computer Science 6223 333–350. Springer, Berlin. MR2725602
[26] Krzakala, F., Montanari, A., Ricci-Tersenghi, F., Semerjian, G. and Zdeborová, L. (2007). Gibbs states and the set of solutions of random constraint satisfaction problems. Proc. Natl. Acad. Sci. USA 104 10318–10323 (electronic). MR2317690
[27] Luby, M., Mitzenmacher, M., Shokrollahi, A. and Spielman, D. A. (1998). Analysis of low density codes and improved designs using irregular graphs. In Proc. of the 30th ACM Symposium on Theory of Computing, STOC 249–258. ACM, New York.
[28] Luby, M. G., Mitzenmacher, M., Shokrollahi, M. A. and Spielman, D. A. (2001). Efficient erasure correcting codes. IEEE Trans. Inform. Theory 47 569–584. MR1820477
[29] Mézard, M. and Montanari, A. (2009). Information, Physics, and Computation. Oxford Univ. Press, Oxford. MR2518205
[30] Mézard, M., Parisi, G. and Zecchina, R. (2003). Analytic and algorithmic solution of random satisfiability problems. Science 297 812–815.
[31] Mézard, M., Ricci-Tersenghi, F. and Zecchina, R. (2003). Two solutions to diluted $p$-spin models and XORSAT problems. J. Stat. Phys. 111 505–533. MR1972120
[32] Molloy, M. (2005). Cores in random hypergraphs and Boolean formulas. Random Structures Algorithms 27 124–135. MR2150018
[33] Monasson, R., Zecchina, R., Kirkpatrick, S., Selman, B. and Troyansky, L. (1999). Determining computational complexity from characteristic "phase transitions". Nature 400 133–137. MR1704845
[34] Montanari, A. (2013). Statistical mechanics and algorithms on sparse and random graphs. In Lectures on Probability Theory and Statistics. Saint-Flour.
[35] Pittel, B., Spencer, J. and Wormald, N. (1996). Sudden emergence of a giant $k$-core in a random graph. J. Combin. Theory Ser. B 67 111–151. MR1385386
[36] Richardson, T. and Urbanke, R. (2008). Modern Coding Theory. Cambridge Univ. Press, Cambridge. MR2494807
[37] Wormald, N. C. (1981). The asymptotic distribution of short cycles in random regular graphs. J. Combin. Theory Ser. B 31 168–182. MR0630980
[38] Wormald, N. C. (1999). Models of random regular graphs. In Surveys in Combinatorics, 1999 (Canterbury) (J. D. Lamb and D. A. Preece, eds.). London Mathematical Society Lecture Note Series 267 239–298. Cambridge Univ. Press, Cambridge. MR1725006

M. Ibrahimi
Urban Engines
Los Altos, California 94022
USA
E-mail: ibrahimi@stanford.edu

Y. Kanoria
Decision, Risk and Operations Division
Graduate School of Business
Columbia University
New York, New York 10027
USA
E-mail: ykanoria@columbia.edu

M. Kraning
Qadium, Inc.
San Francisco, California 94107
USA
E-mail: matt@qadium.com

A. Montanari
Department of Electrical Engineering and Department of Statistics
Stanford University
Stanford, California 94305
USA
E-mail: montanari@stanford.edu
