How Well Do Local Algorithms Solve Semidefinite Programs?
Several probabilistic models from high-dimensional statistics and machine learning reveal an intriguing --and yet poorly understood-- dichotomy. Either simple local algorithms succeed in estimating the object of interest, or even sophisticated semi-d…
Authors: Zhou Fan, Andrea Montanari
Zhou Fan∗   Andrea Montanari†

October 19, 2016

Abstract

Several probabilistic models from high-dimensional statistics and machine learning reveal an intriguing (and yet poorly understood) dichotomy. Either simple local algorithms succeed in estimating the object of interest, or even sophisticated semi-definite programming (SDP) relaxations fail. In order to explore this phenomenon, we study a classical SDP relaxation of the minimum graph bisection problem, when applied to Erdős-Rényi random graphs with bounded average degree $d > 1$, and obtain several types of results. First, we use a dual witness construction (using the so-called non-backtracking matrix of the graph) to upper bound the SDP value. Second, we prove that a simple local algorithm approximately solves the SDP to within a factor $2d^2/(2d^2 + d - 1)$ of the upper bound. In particular, the local algorithm is at most $8/9$ suboptimal, and $1 + O(1/d)$ suboptimal for large degree.

We then analyze a more sophisticated local algorithm, which aggregates information according to the harmonic measure on the limiting Galton-Watson (GW) tree. The resulting lower bound is expressed in terms of the conductance of the GW tree and matches surprisingly well the empirically determined SDP values on large-scale Erdős-Rényi graphs.

We finally consider the planted partition model. In this case, purely local algorithms are known to fail, but they do succeed if a small amount of side information is available. Our results imply quantitative bounds on the threshold for partial recovery using SDP in this model.

1 Introduction

Semi-definite programming (SDP) relaxations are among the most powerful tools available to the algorithm designer.
However, while efficient specialized solvers exist for several important applications [BM03, WS08, NN13], generic SDP algorithms are not well suited for large-scale problems. At the other end of the spectrum, local algorithms attempt to solve graph-structured problems by taking, at each vertex of the graph, a decision that is only based on a bounded-radius neighborhood of that vertex [Suo13]. As such, they can be implemented in linear time, or constant time on a distributed platform. On the flip side, their power is obviously limited.

Given these fundamental differences, it is surprising that these two classes of algorithms behave similarly on a number of probabilistic models arising from statistics and machine learning. Let us briefly review two well-studied examples of this phenomenon.

∗ Department of Statistics, Stanford University
† Department of Electrical Engineering and Department of Statistics, Stanford University

In the (generalized) hidden clique problem, a random graph $G$ over $n$ vertices is generated as follows: a subset $S$ of $k$ vertices is chosen uniformly at random among all $\binom{n}{k}$ sets of that size. Conditional on $S$, any two vertices $i$, $j$ are connected by an edge independently with probability $p$ if $\{i, j\} \subseteq S$ and probability $q < p$ otherwise. Given a single realization of this random graph $G$, we are requested to find the set $S$. (The original formulation [Jer92] of the problem uses $p = 1$, $q = 1/2$, but it is useful to consider the case of general $p$, $q$.)

SDP relaxations for the hidden clique problem were studied in a number of papers, beginning with the seminal work of Feige and Krauthgamer [FK00, AV11, MPW15, DM15b, BHK+16].
Remarkably, even the most powerful among these relaxations (which are constructed through the sum-of-squares (SOS) hierarchy) fail unless $k \gtrsim \sqrt{n}$ [BHK+16], while exhaustive search succeeds with high probability as soon as $k \ge C_0 \log n$ for $C_0 = C_0(p, q)$ a constant. Local algorithms can be formally defined only for a sparse version of this model, whereby $p, q = \Theta(1/n)$ and hence each node has bounded average degree [Mon15]. In this regime, there exists an optimal local algorithm for this problem that is related to the 'belief propagation' heuristic in graphical models. The very same algorithm can be applied to dense graphs (i.e. $p, q = \Theta(1)$), and was proven to succeed if and only if $k \ge C_1 \sqrt{n}$ [DM15a]. Summarizing, the full power of the SOS hierarchy, despite having a much larger computational burden, does not qualitatively improve upon the performance of simple local heuristics.

As a second example, we consider the two-groups symmetric stochastic block model (also known as the planted partition problem) that has attracted considerable attention in recent years as a toy model for community detection in networks [DKMZ11, KMM+13, MNS13, Mas14, BLM15, GV15]. A random graph $G$ over $n$ vertices is generated by partitioning the vertex set into two subsets¹ $S_+ \cup S_-$ of size $n/2$ uniformly at random. Conditional on this partition, any two vertices $i$, $j$ are connected by an edge independently with probability $a/n$ if $\{i, j\} \subseteq S_+$ or $\{i, j\} \subseteq S_-$ (the two vertices are on the same side of the partition), and with probability $b/n$ otherwise (the two vertices are on different sides). Given a single realization of the random graph $G$, we are requested to identify the partition.

While several 'success' criteria have been studied for this model, for the sake of simplicity we will focus on weak recovery (also referred to as 'detection' or 'partial recovery').
Namely, we want to attribute $\{+, -\}$ labels to the vertices so that, with high probability, at least $(1/2 + \varepsilon) n$ vertices are labeled correctly (up to a global sign flip that cannot be identified). It was conjectured in [DKMZ11] that this is possible if and only if $\lambda > 1$, where $\lambda \equiv (a - b)/\sqrt{2(a + b)}$ is an effective 'signal-to-noise ratio' parameter. This conjecture followed from the heuristic analysis of a local algorithm based, once again, on belief propagation. The conjecture was subsequently proven in [MNS12, MNS13, Mas14] through the analysis of carefully constructed spectral algorithms. While these algorithms are, strictly speaking, not local, they are related to the linearization of belief propagation around a 'non-informative fixed point'.

Convex optimization approaches for this problem are based on the classical SDP relaxation of the minimum-bisection problem. Denoting by $A = A_G$ the adjacency matrix of $G$, the minimum bisection problem is written as

$$\text{maximize} \quad \langle \sigma, A\sigma \rangle, \tag{1}$$
$$\text{subject to} \quad \sigma \in \{+1, -1\}^n, \quad \langle \sigma, \mathbf{1} \rangle = 0. \tag{2}$$

The following SDP relaxes the above problem, where $d = (a + b)/2$ is the average degree:

$$\text{maximize} \quad \Big\langle A - \frac{d}{n} \mathbf{1}\mathbf{1}^T, X \Big\rangle, \qquad \text{subject to} \quad X \succeq 0, \quad X_{ii} = 1 \ \ \forall i. \tag{3}$$

(Here, the term $-(d/n)\mathbf{1}\mathbf{1}^T$ can be thought of as a relaxation of the hard constraint $\langle \sigma, \mathbf{1} \rangle = 0$.)

¹ To avoid notational nuisances, we assume $n$ even.

This SDP relaxation has a weak recovery threshold $\lambda^{\mathrm{SDP}}$ that appears to be very close to the ideal one $\lambda = 1$. Namely, Guédon and Vershynin [GV15] proved $\lambda^{\mathrm{SDP}} \le C$ for $C$ a universal constant, while [MS16] established $\lambda^{\mathrm{SDP}} = 1 + o_d(1)$ for large average degree $d$.

Summarizing, also for the planted partition problem local algorithms (belief propagation) and SDP relaxations behave in strikingly similar ways².
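To make the relation between (1)-(2) and (3) concrete, the following is a minimal sketch (not the authors' code) that samples a small two-groups block model, computes $d$ and $\lambda$, and checks that the rank-one matrix $X = \sigma\sigma^T$ built from a balanced labeling is feasible for (3) with the same objective value as (1). The function name `sample_sbm` and the parameter values are illustrative assumptions.

```python
import numpy as np

def sample_sbm(n, a, b, rng):
    """Two-groups symmetric block model (hypothetical helper):
    balanced labels, edge probability a/n within a group, b/n across."""
    sigma = np.ones(n)
    sigma[rng.permutation(n)[: n // 2]] = -1.0      # exactly n/2 labels of each sign
    same_side = np.equal.outer(sigma, sigma)
    probs = np.where(same_side, a / n, b / n)
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # independent edges, i < j
    A = (upper | upper.T).astype(float)               # symmetric, zero diagonal
    return A, sigma

rng = np.random.default_rng(0)
n, a, b = 200, 7.0, 1.0                 # illustrative values, n even
A, sigma = sample_sbm(n, a, b, rng)

d = (a + b) / 2                         # average degree
snr = (a - b) / np.sqrt(2 * (a + b))    # signal-to-noise ratio lambda

# Rank-one feasible point of (3): unit diagonal, positive semidefinite.
X = np.outer(sigma, sigma)
M = A - (d / n) * np.ones((n, n))
sdp_objective = np.sum(M * X)           # <A - (d/n) 1 1^T, X>
bisection_objective = sigma @ A @ sigma # <sigma, A sigma>
```

Because $\langle \sigma, \mathbf{1} \rangle = 0$, the penalty term $(d/n)\langle \sigma, \mathbf{1}\rangle^2$ vanishes and the two objectives coincide, which is why the SDP value is always at least the bisection value.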
In addition to the above rigorous results, numerical evidence suggests that the two thresholds are very close for all degrees $d$, and that the reconstruction accuracy above these thresholds is also very similar [JMRT16].

The conjectural picture emerging from these and similar examples can be described as follows. For statistical inference problems on sparse random graphs, SDP relaxations are no more powerful than local algorithms (eventually supplemented with a small amount of side information to break symmetries). On the other hand, any information that is genuinely non-local is not exploited even by sophisticated SDP hierarchies. Of course, formalizing this picture is of utmost practical interest, since it would entail a dramatic simplification of algorithmic options.

With this general picture in mind, it is natural to ask: Can semidefinite programs be (approximately) solved by local algorithms for a large class of random graph models? A positive answer to this question would clarify the equivalence between local algorithms and SDP relaxations.

Here, we address this problem by considering the semidefinite program (3), for two simple graph models, the Erdős-Rényi random graph with average degree $d$, $G \sim G(n, d/n)$, and the two-groups symmetric block model, $G \sim G(n, a/n, b/n)$. We establish the following results (denoting by $\mathrm{SDP}(A_G)$ the value of (3)).

Approximation ratio of local algorithms. We prove that there exists a simple local algorithm that approximates $\mathrm{SDP}(A_G)$ (when $G \sim G(n, d/n)$) within a factor $2d^2/(2d^2 + d - 1)$, asymptotically for large $n$. In particular, the local algorithm is at most a factor $8/9$ suboptimal, and $1 + O(1/d)$ suboptimal for large degree. Note that $\mathrm{SDP}(A_G)$ concentrates tightly around its expected value.
When we write that an algorithm approximates $\mathrm{SDP}(A_G)$, we mean that it returns a feasible solution whose value satisfies the claimed approximation guarantee.

Typical SDP value. Our proof provides upper and lower bounds on $\mathrm{SDP}(A_G)$ for $G \sim G(n, d/n)$, implying in particular $\mathrm{SDP}(A_G)/n = 2\sqrt{d}\,(1 - \Theta(1/d)) + o_n(1)$, where the term $\Theta(1/d)$ has explicit upper and lower bounds. While the lower bound is based on the analysis of a local algorithm, the upper bound follows from a dual witness construction which is of independent interest. Our upper and lower bounds are plotted in Fig. 1 together with the results of numerical simulations.

A local algorithm based on harmonic measures. The simple local algorithm above uses randomness available at each vertex of $G$ and aggregates it uniformly within a neighborhood of each vertex. We analyze a different local algorithm that aggregates information in proportion to the harmonic measure of each vertex. We characterize the value achieved by this algorithm in the large $n$ limit in terms of the conductance of a random Galton-Watson tree. Numerical data (obtained by evaluating this value and also solving the SDP (3) on large random graphs), as well as a large-$d$ asymptotic expansion, suggest that this lower bound is very accurate, cf. Fig. 1.

[Figure 1 plot: horizontal axis $d$ (0 to 16); vertical axis $\mathrm{SDP}(A)/(2n\sqrt{d})$ (0.5 to 1.1).]

Figure 1: Typical value $\mathrm{SDP}(A_G)$ of the min-bisection SDP for large Erdős-Rényi random graphs with average degree $d$, normalized by the large degree formula $2n\sqrt{d}$. Circles: numerical simulations with graphs of size $n = 10^6$. Solid lines: upper bound from Theorem 2.2 and local algorithm lower bound (evaluated numerically) from Theorem 2.4. Lower dashed line: explicit local-algorithm lower bound from Theorem 2.2. (The small inconsistency between numerical SDP values and the lower bound at large $d$ is due to non-asymptotic effects that appear to vanish as $n \to \infty$.)

SDP detection threshold for the stochastic block model. We then turn to the weak recovery problem in the two-groups symmetric stochastic block model $G \sim G(n, a/n, b/n)$. As above, it is more convenient to parametrize this model by the average degree $d = (a + b)/2$ and the signal-to-noise ratio $\lambda = (a - b)/\sqrt{2(a + b)}$. It was known from [GV15] that the threshold for SDP to achieve weak recovery is $\lambda^{\mathrm{SDP}}(d) \le 10^4$, and in [MS16] that $\lambda^{\mathrm{SDP}}(d) \le 1 + o_d(1)$ for large degree. Our results provide more precise information, implying in particular $\lambda^{\mathrm{SDP}}(d) \le \min(2 - d^{-1}, 1 + C d^{-1/4})$ for $C$ a universal constant.

² An important remark is that strictly local algorithms are ineffective in the planted partition problem. This peculiarity is related to the symmetry of the model, and can be resolved in several ways, for instance by an oracle that reveals an arbitrarily small fraction of the true vertex labels, or running belief propagation a logarithmic (rather than constant) number of iterations. We refer to Section 2.3 for further discussion of this point.

2 Main results

In this section we recall the notion of local algorithms, as specialized to solving the problem (3). We then state formally our main results. For general background on local algorithms, we refer to [HLS14, GS14, Lyo14]: this line of work is briefly discussed in Section 3.

Note that the application of local algorithms to solve SDPs is not entirely obvious, since local algorithms are normally defined to return a quantity for each vertex in $G$, instead of a matrix $X$ whose rows and columns are indexed by those vertices.
Our point of view will be that a local algorithm can solve the SDP (3) by returning, for each vertex $i$, a random variable $\xi_i$, and the SDP solution $X$ associated to this local algorithm is the covariance matrix of $\xi = (\xi_1, \dots, \xi_n)$ with respect to the randomness of the algorithm execution. An arbitrarily good approximation of this solution $X$ can be obtained by repeatedly sampling the vector $\xi \in \mathbb{R}^n$ (i.e. by repeatedly running the algorithm with independent randomness).

Formally, let $\mathcal{G}$ be the space of (finite or) locally finite rooted graphs, i.e. of pairs $(G, ø)$ where $G = (V, E)$ is a locally finite graph and $ø \in V$ is a distinguished root vertex. We denote by $\mathcal{G}_*$ the space of tuples $(G, ø, z)$ where $(G, ø) \in \mathcal{G}$ and $z : V \to \mathbb{R}$ associates a real-valued mark to each vertex of $G$. Given a graph $G = (V, E)$ and a vertex $i \in V$, we denote by $B_\ell(i; G)$ the subgraph induced by vertices $j$ whose graph distance from $i$ is at most $\ell$, rooted at $i$. If $G$ carries marks $z : V \to \mathbb{R}$, it is understood that $B_\ell(i; G)$ inherits the 'same' marks. We will write in this case $(B_\ell(i; G), z)$ instead of the cumbersome (but more explicit) notation $(B_\ell(i; G), i, z_{B_\ell(i; G)})$.

Definition 2.1. A radius-$\ell$ local algorithm for the semidefinite program (3) is any measurable function $F : \mathcal{G}_* \to \mathbb{R}$ such that

1. $F(G_1, ø_1, z_1) = F(G_2, ø_2, z_2)$ if $(B_\ell(ø_1; G_1), z_1) \simeq (B_\ell(ø_2; G_2), z_2)$, where $\simeq$ denotes graph isomorphism that preserves the root vertex and vertex marks.

2. Letting $z = (z(i))_{i \in V}$ be i.i.d. with $z(i) \sim \mathrm{Normal}(0, 1)$, we have $\mathbb{E}_z\{F(G, ø, z)^2\} = 1$. (Here and below $\mathbb{E}_z$ denotes expectation with respect to the random variables $z$.)

We denote the set of such functions $F$ by $\mathcal{F}_*(\ell)$. A local algorithm is a radius-$\ell$ local algorithm for some fixed $\ell$ (independent of the graph). The set of such functions is denoted by $\mathcal{F}_* \equiv \cup_{\ell \ge 1} \mathcal{F}_*(\ell)$.
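As a concrete instance of Definition 2.1, the sketch below implements a linear radius-$\ell$ function $F(G, i, z) = \sum_{j \in B_\ell(i;G)} w_{ij} z(j)$, with weights proportional to $d^{-\mathrm{dist}(i,j)/2}$ (the weighting discussed in Section 2.1) and normalized so that $\mathbb{E}_z F^2 = 1$. For a linear $F$ the covariance of $\xi$ is available in closed form as $X = W W^T$, so no sampling is needed. The helper names, the graph size, and the parameter values are illustrative assumptions, not part of the paper.

```python
import numpy as np
from collections import deque

def ball_distances(adj, root, radius):
    """Breadth-first search: graph distance from root to every vertex within radius."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        if dist[v] == radius:
            continue                      # do not explore beyond the ball
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def linear_local_weights(adj, radius, d):
    """Row i holds the weights of F(G, i, z); each row has unit Euclidean norm."""
    n = len(adj)
    W = np.zeros((n, n))
    for i in range(n):
        for j, k in ball_distances(adj, i, radius).items():
            W[i, j] = d ** (-k / 2)
        W[i] /= np.linalg.norm(W[i])      # enforces E_z F(G, i, z)^2 = 1
    return W

rng = np.random.default_rng(0)
n, d, radius = 300, 3.0, 3                # illustrative values
upper = np.triu(rng.random((n, n)) < d / n, k=1)
A = (upper | upper.T).astype(float)       # Erdos-Renyi adjacency matrix
adj = [np.flatnonzero(A[i]) for i in range(n)]

W = linear_local_weights(adj, radius, d)
X = W @ W.T                               # covariance of xi = W z with z ~ N(0, Id)
value_per_vertex = np.sum((A - d / n) * X) / n
```

One draw of the algorithm's output is `xi = W @ rng.standard_normal(n)`; averaging `np.outer(xi, xi)` over many independent draws approximates `X`, which is the sampling procedure described in the text.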
We apply a local algorithm to a fixed graph $G$ by generating i.i.d. marks $z = (z(i))_{i \in V}$ as $z(i) \sim \mathrm{Normal}(0, 1)$, and producing the random variable $\xi_i = F(G, i, z)$ for each vertex $i \in V$. In other words, we use the radius-$\ell$ local algorithm to compute, for each vertex of $G$, a function of the ball of radius $\ell$ around that vertex that depends on the additional randomness provided by the $z(i)$'s in this ball. The covariance matrix $X = \mathbb{E}_z\{\xi \xi^T\}$ is a feasible point for the SDP (3), achieving the value $n\,\mathcal{E}(F; G)$ where

$$\mathcal{E}(F; G) \equiv \frac{1}{n}\, \mathbb{E}_z \Big\{ \sum_{i, j \in V} \Big( (A_G)_{ij} - \frac{d}{n} \Big) F(G, i, z)\, F(G, j, z) \Big\}. \tag{4}$$

We are now in position to state our main results.

2.1 Erdős-Rényi random graphs $G \sim G(n, d/n)$

We first prove an optimality guarantee, in the large $n$ limit, for the value achieved by a simple local algorithm (or more precisely, a sequence of simple local algorithms) solving (3) on the Erdős-Rényi graph.

Theorem 2.2. Fix $d \ge 0$ and let $A = A_{G_n}$ be the adjacency matrix of the Erdős-Rényi random graph $G_n \sim G(n, d/n)$. Then for $d < 1$, almost surely, $\lim_{n \to \infty} \mathrm{SDP}(A)/n = d$. For $d \ge 1$, almost surely,

$$2\sqrt{d}\,\Big(1 - \frac{1}{d + 1}\Big) \le \liminf_{n \to \infty} \frac{1}{n} \mathrm{SDP}(A) \le \limsup_{n \to \infty} \frac{1}{n} \mathrm{SDP}(A) \le 2\sqrt{d}\,\Big(1 - \frac{1}{2d}\Big). \tag{5}$$

Furthermore, there exists a sequence of local algorithms that achieves the lower bound. Namely, for each $\varepsilon > 0$, there exist $\ell(\varepsilon) > 0$ and $F \in \mathcal{F}_*(\ell(\varepsilon))$ such that, almost surely,

$$\lim_{n \to \infty} \mathcal{E}(F; G_n) \ge 2\sqrt{d}\,\Big(1 - \frac{1}{d + 1}\Big) - \varepsilon. \tag{6}$$

As anticipated in the introduction, the upper and lower bounds of (5) approach each other for large $d$, implying in particular $\mathrm{SDP}(A)/n = 2\sqrt{d}\,(1 - \Theta(1/d)) + o_n(1)$. This should be compared with the result of [MS16] yielding $\mathrm{SDP}(A)/n = 2\sqrt{d}\,(1 + o_d(1)) + o_n(1)$. Also, by simple calculus, the upper and lower bounds stay within a ratio bounded by $8/9$ for all $d$, with the worst-case ratio $8/9$ being achieved at $d = 2$.
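The 'simple calculus' behind the $8/9$ figure can be checked numerically. The ratio of the lower to the upper bound in (5) simplifies to $2d^2/(2d^2 + d - 1)$, which is minimized at $d = 2$. The short sketch below evaluates both bounds on a grid; the grid range and the function names are illustrative choices.

```python
import numpy as np

def lower_bound(d):
    """Lower bound of (5): 2 sqrt(d) (1 - 1/(d+1))."""
    return 2 * np.sqrt(d) * (1 - 1 / (d + 1))

def upper_bound(d):
    """Upper bound of (5): 2 sqrt(d) (1 - 1/(2d))."""
    return 2 * np.sqrt(d) * (1 - 1 / (2 * d))

d_grid = np.linspace(1.0, 50.0, 4901)                  # grid step 0.01 on [1, 50]
ratio = lower_bound(d_grid) / upper_bound(d_grid)
closed_form = 2 * d_grid**2 / (2 * d_grid**2 + d_grid - 1)

worst_d = d_grid[np.argmin(ratio)]                     # location of the smallest ratio
worst_ratio = ratio.min()
```

On this grid the smallest ratio is attained near $d = 2$ and equals $8/9$ up to rounding, while the ratio tends to 1 both as $d \to 1$ and as $d \to \infty$.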
Finally, they again converge as $d \to 1$, implying in particular $\lim_{n \to \infty} \mathrm{SDP}(A)/n = 1$ for $d = 1$.

Remark 2.3. The result $\lim_{n \to \infty} \mathrm{SDP}(A)/n = d$ for $d < 1$ is elementary and only stated for completeness. Indeed, for $d < 1$, the graph $G \sim G(n, d/n)$ decomposes with high probability into disconnected components of size $O(\log n)$, which are all trees or unicyclic [JLR11]. As a consequence, the vertex set can be partitioned into two subsets of size $n/2$ so that at most one connected component of $G$ has vertices on both sides of the partition, and hence at most two edges cross the partition. By using the feasible point $X = \sigma \sigma^T$ with $\sigma \in \{+1, -1\}^n$ the indicator vector of the partition, we get $2|E| - 8 \le \mathrm{SDP}(A) \le 2|E|$, whence the claim follows immediately.

In the proof of Theorem 2.2, we will assume $d > 1$. Note that the case $d = 1$ follows as well, since $\mathrm{SDP}(A)$ is a Lipschitz continuous function of $A$, with Lipschitz constant equal to one. This implies that $\limsup_{n \to \infty} \mathrm{SDP}(A)/n$ and $\liminf_{n \to \infty} \mathrm{SDP}(A)/n$ are continuous functions of $d$.

The local algorithm achieving the lower bound of Theorem 2.2 is extremely simple. At each vertex $i \in V$, it outputs a weighted sum of the random variables $z(j)$ with $j \in B_\ell(i; G)$, with weight proportional to $d^{-\mathrm{dist}(i, j)/2}$ (here $\mathrm{dist}(i, j)$ is the graph distance between vertices $i$ and $j$). When applied to random $d$-regular graphs, this approach is related to the Gaussian wave function of [CGHV15] and is known to achieve the SDP value $\mathrm{SDP}(A)$ in the large $n$ limit [MS16].

2.2 A local algorithm based on harmonic measures

A natural question arising from the previous section is whether a better local algorithm can be constructed by summing the random variables $z(j)$ with different weights, to account for the graph geometry.
It turns out that indeed this is possible by using a certain harmonic weighting scheme that we next describe, deferring some technical details to Section 6. Throughout we assume $d > 1$.

Recall that the random graph $G \sim G(n, d/n)$ converges locally to a Galton-Watson tree (see Section 6 for background on local weak convergence). This can be shown to imply that it is sufficient to define the function $F \in \mathcal{F}_*$ for trees.

Let $(T, ø)$ be an infinite rooted tree and consider the simple random walk on $T$ started at $ø$, which we assume to be transient. The harmonic measure assigns to vertex $v \in V(T)$, with $\mathrm{dist}(ø, v) = k$, a weight $h^{(ø)}(v)$ which is the probability³ that the walk exits for the last time $B_k(ø; T)$ at $v$ [LPP95]. We then define

$$\tilde{F}(T, ø, z) \equiv \frac{1}{\sqrt{\ell + 1}} \sum_{v \in B_\ell(ø; T)} \sqrt{h^{(ø)}(v)}\; z(v). \tag{7}$$

Technically speaking, this is not a local function because the weights $h^{(ø)}(v)$ depend on the whole tree $T$. However, a local approximation to these weights can be constructed by truncating $T$ at a depth $L \ge \ell$: details are provided in Section 6.

Given the well-understood relationship between random walks and electrical networks, it is not surprising that the value achieved by this local algorithm can be expressed in terms of conductance. The conductance $c(T, ø)$ of a rooted tree $(T, ø)$ is the intensity of current flowing out of the root when a unit potential difference is imposed between the root and the boundary ('at infinity'). It is understood that $c(T, ø) = 0$ if $T$ is finite.

Theorem 2.4. For $(T, ø)$ a Galton-Watson tree with offspring distribution $\mathrm{Poisson}(d)$, let $c_1, c_2 \stackrel{d}{=} c(T, ø)$ be two independent and identically distributed copies of the conductance of $T$. Let $A = A_{G_n}$ be the adjacency matrix of the Erdős-Rényi random graph $G_n \sim G(n, d/n)$.
Then for $d > 1$, almost surely,

$$\liminf_{n \to \infty} \frac{1}{n} \mathrm{SDP}(A) \ge d\, \mathbb{E}\, \Psi(c_1, c_2), \tag{8}$$

$$\Psi(c_1, c_2) \equiv \begin{cases} \dfrac{c_1 \sqrt{1 + c_2} + c_2 \sqrt{1 + c_1}}{c_1 + c_2 + c_1 c_2} & \text{if } c_1 > 0 \text{ or } c_2 > 0, \\[2mm] 1 & \text{otherwise.} \end{cases} \tag{9}$$

Furthermore, for each $\varepsilon > 0$, there exist $\ell(\varepsilon) > 0$ and $F \in \mathcal{F}_*(\ell(\varepsilon))$ such that, almost surely,

$$\lim_{n \to \infty} \mathcal{E}(F; G_n) \ge d\, \mathbb{E}\, \Psi(c_1, c_2) - \varepsilon. \tag{10}$$

Finally, for large $d$, this lower bound behaves as

$$d\, \mathbb{E}\, \Psi(c_1, c_2) = 2\sqrt{d}\, \Big( 1 - \frac{5}{8d} + O\Big( \frac{\log d}{d^{3/2}} \Big) \Big). \tag{11}$$

The lower bound $d\, \mathbb{E}\, \Psi(c_1, c_2)$ is not explicit but can be efficiently evaluated numerically, by sampling the distributional recursion satisfied by $c$. This numerical technique was used in [JMRT16], to which we refer for further details. The result of such a numerical evaluation is plotted as the lower solid line in Figure 1. This harmonic lower bound seems to capture extremely well our numerical data for $\mathrm{SDP}(A)$ (red circles).

Note that Theorem 2.4 implies that the lower bound in Theorem 2.2 is not tight (see in particular Eq. (11)) and it provides a tighter lower bound (at least for large $d$).

³ For each distance $k$, the weights $h^{(ø)}(v)$ form a probability distribution over vertices at distance $k$ from the root. These distributions can be derived from a unique probability measure over the boundary of $T$ at infinity, as is done in [LPP95], but this is not necessary here.

2.3 Stochastic block model $G \sim G(n, a/n, b/n)$

As discussed in the previous sections, local algorithms can approximately solve the SDP (3) for $A = A_G$ the adjacency matrix of $G \sim G(n, d/n)$. The stochastic block model $G \sim G(n, a/n, b/n)$ provides a simple example in which they are bound to fail, although they can succeed with a small amount of additional side information.

As stated in the introduction, a random graph $G \sim G(n, a/n, b/n)$ over $n$ vertices is generated as follows. Let $\sigma \in \{+1, -1\}^n$ be distributed uniformly at random, conditional on $\langle \sigma, \mathbf{1} \rangle = 0$.
Conditional on $\sigma$, any two vertices $i$, $j$ are connected by an edge independently with probability $a/n$ if $\sigma(i) = \sigma(j)$ and with probability $b/n$ otherwise. We will assume $a > b$: in the social sciences parlance, the graph is assortative.

The average vertex degree of such a graph is $d = (a + b)/2$. We assume $d > 1$ to ensure that $G$ has a giant component with high probability. The signal-to-noise ratio parameter $\lambda = (a - b)/\sqrt{2(a + b)}$ plays a special role in the model's behavior. If $\lambda < 1$, then the total variation distance between $G(n, a/n, b/n)$ and the Erdős-Rényi graph $G(n, d/n)$ is bounded away from 1. On the other hand, if $\lambda > 1$, then we can test whether $G \sim G(n, a/n, b/n)$ or $G \sim G(n, d/n)$ with probability of error converging to 0 as $n \to \infty$ [MNS12].

The next theorem lower-bounds the SDP value for the stochastic block model.

Theorem 2.5. Let $A = A_{G_n}$ be the adjacency matrix of the random graph $G_n \sim G(n, a/n, b/n)$. If $d = (a + b)/2 > 1$ and $\lambda = (a - b)/\sqrt{2(a + b)} > 1$, then for a universal constant $C > 0$ (independent of $a$ and $b$), almost surely,

$$\liminf_{n \to \infty} \frac{1}{n} \mathrm{SDP}(A) \ge \sqrt{d}\, \max\Big( \lambda,\; 2 + \frac{(\lambda - 1)^2}{\lambda \sqrt{d}} - \frac{C}{d} \Big). \tag{12}$$

(The first bound in (12) dominates for large $\lambda$, whereas the second dominates near the information-theoretic threshold $\lambda = 1$ for large $d$.)

On one hand, this theorem implies that local algorithms fail to approximately solve the SDP (3) for the stochastic block model, for the following reason: the local structures of $G \sim G(n, a/n, b/n)$ and $G \sim G(n, d/n)$ are the same asymptotically, in the sense that they both converge locally to the Galton-Watson tree with $\mathrm{Poisson}(d)$ offspring distribution. This and the upper bound of Theorem 2.2 immediately imply that for any $F \in \mathcal{F}_*$,

$$\limsup_{n \to \infty} \mathcal{E}(F; G_n) \le 2\sqrt{d}\, \Big( 1 - \frac{1}{2d} \Big). \tag{13}$$

In particular, the gap between this upper bound and the lower bound (12) for the SDP value is unbounded for large $\lambda$.

This problem is related to the symmetry between $+1$ and $-1$ labels in this model. It can be resolved if we allow the local algorithm to depend on $B_{\ell_n}(ø; G)$ where $\ell_n$ grows logarithmically in $n$, or alternatively if we provide a small amount of side information about the hidden partition. Here we explore the latter scenario (see also [MX16] for related work).

Suppose that for each vertex $i \in V$, the label $\sigma(i) \in \{+1, -1\}$ is revealed independently with probability $\delta$ for some fixed $\delta > 0$, and that the radius-$\ell$ local algorithm has access to the revealed labels in $B_\ell(ø; G)$. More formally, let $\mathcal{M} = \{+1, -1, u\}$ be the set of possible vertex labels, where $u$ codes for 'unrevealed', let $\sigma : V \to \mathcal{M}$ be any assignment of labels to vertices, and let $\mathcal{G}_*(\mathcal{M})$ be the space of tuples $(G, ø, \sigma, z)$ (where $(G, ø, z) \in \mathcal{G}_*$ as before).

Definition 2.6. A radius-$\ell$ local algorithm using partially revealed labels for the semidefinite program (3) is any measurable function $F : \mathcal{G}_*(\mathcal{M}) \to \mathbb{R}$ such that

1. $F(G_1, ø_1, \sigma_1, z_1) = F(G_2, ø_2, \sigma_2, z_2)$ if $(B_\ell(ø_1; G_1), \sigma_1, z_1) \simeq (B_\ell(ø_2; G_2), \sigma_2, z_2)$, where $\simeq$ denotes isomorphism that preserves the root vertex, vertex marks, and vertex labels in $\mathcal{M}$.

2. Letting $z = (z(i))_{i \in V}$ be i.i.d. with $z(i) \sim \mathrm{Normal}(0, 1)$, we have $\mathbb{E}_z\{F(G, ø, \sigma, z)^2\} = 1$, where $\mathbb{E}_z$ denotes expectation only over $z$.

We denote the set of such functions $F$ by $\mathcal{F}_*^{\mathcal{M}}(\ell)$, and we denote $\mathcal{F}_*^{\mathcal{M}} \equiv \cup_{\ell \ge 1} \mathcal{F}_*^{\mathcal{M}}(\ell)$.

For any $F \in \mathcal{F}_*^{\mathcal{M}}$, we denote

$$\mathcal{E}(F; G, \sigma) \equiv \frac{1}{n}\, \mathbb{E}_z \Big\{ \sum_{i, j \in V} \Big( (A_G)_{ij} - \frac{d}{n} \Big) F(G, i, \sigma, z)\, F(G, j, \sigma, z) \Big\}, \tag{14}$$

so that $F$ yields a solution to the SDP (3) achieving value $n\, \mathcal{E}(F; G, \sigma)$. Then we have the following result:

Theorem 2.7.
Let $A = A_{G_n}$ be the adjacency matrix of the random graph $G_n \sim G(n, a/n, b/n)$. For any fixed $\delta > 0$, let $\sigma_n = (\sigma_n(i))_{i \in V(G_n)}$ be random and such that, independently for each $i \in V(G_n)$, with probability $1 - \delta$ we have $\sigma_n(i) = u$, and with probability $\delta$ we have that $\sigma_n(i) \in \{+1, -1\}$ identifies the component of the hidden partition containing $i$. If $d = (a + b)/2 \ge 2$ and $\lambda = (a - b)/\sqrt{2(a + b)} > 1$, then for any $\varepsilon > 0$, there exist $\ell(\varepsilon) > 0$ and $F \in \mathcal{F}_*^{\mathcal{M}}(\ell(\varepsilon))$ for which, almost surely,

$$\lim_{n \to \infty} \mathcal{E}(F; G_n, \sigma_n) \ge \sqrt{d}\, \Big( 2 + \frac{(\lambda - 1)^2}{\lambda \sqrt{d}} - \frac{C}{d} \Big) - \varepsilon. \tag{15}$$

The restriction to $d \ge 2$ above is arbitrary; our proof is valid if this constraint is relaxed to $d \ge 1 + \eta$ for any $\eta > 0$, at the expense of a larger constant $C := C_\eta$. This theorem implies the second lower bound of (12) when $d \ge 2$; the first lower bound of (12) is trivial and proven in Section 6.

2.4 Testing in the stochastic block model

Semidefinite programming can be used as follows to test whether $G \sim G(n, a/n, b/n)$ or $G \sim G(n, d/n)$:

1. Given the graph $G$, compute the value $\mathrm{SDP}(A_G)$ of the program (3).

2. If $\mathrm{SDP}(A_G)/n \ge 2\sqrt{d}\,(1 - (2d)^{-1}) + \varepsilon$, reject the null hypothesis $G \sim G(n, d/n)$.

(Here $\varepsilon$ is a small constant independent of $n$.)

The rationale for this procedure is provided by Theorem 2.2, implying that, if $G \sim G(n, d/n)$, then the probability of false discovery (i.e. rejecting the null when $G \sim G(n, d/n)$) converges to 0 as $n \to \infty$.

We have the following immediate consequence of Theorem 2.2 and Theorem 2.5 (here error probability refers to the probability of false discovery plus the probability of miss-detection, i.e. not rejecting the null when $G \sim G(n, a/n, b/n)$):

Corollary 2.8. The SDP-based test has error probability converging to 0 provided $\lambda > \lambda^{\mathrm{SDP}}(d)$, where

$$\lambda^{\mathrm{SDP}}(d) \le \min\Big( 2 - \frac{1}{d},\; 1 + \frac{C}{d^{1/4}} \Big). \tag{16}$$

For comparison⁴, [MS16] proved $\lambda^{\mathrm{SDP}}(d) = 1 + o_d(1)$ for large $d$, while the last result gives a quantitative bound for all $d$.

3 Further related work

The SDP relaxation (3) has attracted a significant amount of work since Goemans-Williamson's seminal work on the MAX CUT problem [GW95]. In the last few years, several authors used this approach for clustering or community detection and derived optimality or near-optimality guarantees. An incomplete list includes [BCSZ14, ABH16, HWX16, HWX15, ABC+15]. Under the assumption that $G$ is generated according to the stochastic block model (whose two-groups version was introduced in Section 2.3), these papers provide conditions under which the SDP approach recovers exactly the vertex labels. This can be regarded as a 'high signal-to-noise ratio' regime, in which (with high probability) the SDP solution has rank one and is deterministic (i.e. independent of the graph realization). In contrast, we focus on the 'pure noise' scenario in which $G \sim G(n, d/n)$ is an Erdős-Rényi random graph, or on the two-groups stochastic block model $G \sim G(n, a/n, b/n)$ close to the detection threshold. In this regime, the SDP optimum has rank larger than one and is non-deterministic. The only papers that have addressed this regime using SDP are [GV15, MS16], discussed previously.

Several papers applied sophisticated spectral methods to the stochastic block model near the detection threshold [Mas14, MNS13, BLM15]. Our upper bound in Theorem 2.2 is based on a duality argument, where we establish feasibility of a certain dual witness construction using an argument similar to [BLM15].

Several papers studied the use of local algorithms to solve combinatorial optimization problems on graphs.
Hatami, Lovász and Szegedy [HLS14] investigated several notions of graph convergence, and put forward a conjecture implying, in particular, that local algorithms are able to find (nearly) optimal solutions of a broad class of combinatorial problems on random $d$-regular graphs. This conjecture was disproved by Gamarnik and Sudan [GS14] by considering maximum size independent sets on random $d$-regular graphs. In particular, they proved that the size of an independent set produced by a local algorithm is at most $(1/2) + (1/\sqrt{8}) + \varepsilon$ times the maximum independent set, for large enough $d$. Rahman and Virag [RV14] improved this result by establishing that no local algorithm can produce independent sets of size larger than $(1/2) + \varepsilon$ times the maximum independent set, for large enough $d$. This approximation ratio is essentially optimal, since known local algorithms can achieve $(1/2) - \varepsilon$ of the maximum independent set.

It is unknown whether a similar gap is present for small degree $d$. In particular, Csóka et al. [CGHV15] establish a lower bound on the max-size independent set on random 3-regular graphs. A similar technique is used by Lyons [Lyo14] to lower bound the max-cut on random 3-regular graphs. In summary, the question of which graph-structured optimization problems can be approximated by local algorithms is broadly open.

⁴ Note [MS16] also proves guarantees on the estimation error. We believe it should be possible to improve those results using the methods in the present paper, but we defer it to future work.

By construction, local algorithms can be applied to infinite random graphs, and have a well defined value provided the graph distribution is unimodular (see below). Asymptotic results for graph sequences can be 'read off' these infinite-graph settings (our proofs will use this device multiple times).
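In this context, the (random) solutions generated by local algorithms, together with their limits in the weak topology, are referred to as 'factors of i.i.d. processes' [Lyo14].

4 Notation

We use upper case boldface for matrices (e.g. $\mathbf{A}, \mathbf{B}, \dots$), lower case boldface for vectors (e.g. $\mathbf{u}, \mathbf{v}$, etc.) and lower case plain for scalars (e.g. $x, y, \dots$). The scalar product of vectors $\mathbf{u}, \mathbf{v} \in \mathbb{R}^m$ is denoted by $\langle \mathbf{u}, \mathbf{v}\rangle = \sum_{i=1}^m u_i v_i$, and the scalar product between matrices is indicated in the same way: $\langle \mathbf{A}, \mathbf{B}\rangle = \mathrm{Tr}(\mathbf{A}\mathbf{B}^{\mathsf T})$. Given a matrix $\mathbf{A} \in \mathbb{R}^{m\times m}$, $\mathrm{diag}(\mathbf{A}) \in \mathbb{R}^m$ is the vector that contains its diagonal entries. Conversely, given $\mathbf{v} \in \mathbb{R}^m$, $\mathrm{diag}(\mathbf{v}) \in \mathbb{R}^{m\times m}$ is the diagonal matrix with entries $\mathrm{diag}(\mathbf{v})_{ii} = v_i$. We denote by $\mathbf{1}$ the all-ones vector and by $\mathrm{Id}$ the identity matrix. We follow the standard big-Oh notation.

5 Upper bound: Theorem 2.2

In this section, we prove the upper bound in Theorem 2.2. Denote $\mathrm{PSD}_1 := \{\mathbf{X} : \mathbf{X} \succeq 0,\ X_{ii} = 1\ \forall i\}$. Introducing dual variables $\boldsymbol{\Lambda} \succeq 0$ and $\boldsymbol{\nu} \in \mathbb{R}^n$ and invoking strong duality, we have

$$\mathrm{SDP}(\mathbf{A}) = \max_{\mathbf{X} \in \mathrm{PSD}_1}\ \min_{\boldsymbol{\nu},\, \boldsymbol{\Lambda} \succeq 0}\ \Big\langle \mathbf{A} - \frac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T}, \mathbf{X}\Big\rangle + \langle \boldsymbol{\Lambda}, \mathbf{X}\rangle - \langle \boldsymbol{\nu}, \mathrm{diag}(\mathbf{X}) - \mathbf{1}\rangle \tag{17}$$

$$= \min_{\boldsymbol{\nu},\, \boldsymbol{\Lambda} \succeq 0}\ \max_{\mathbf{X} \in \mathrm{PSD}_1}\ \langle \boldsymbol{\nu}, \mathbf{1}\rangle + \Big\langle \mathbf{A} - \frac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T} + \boldsymbol{\Lambda} - \mathrm{diag}(\boldsymbol{\nu}), \mathbf{X}\Big\rangle. \tag{18}$$

The minimum over $\boldsymbol{\Lambda} \succeq 0$ occurs at $\boldsymbol{\Lambda} = 0$, hence $\mathrm{SDP}(\mathbf{A})$ is equivalently given by the value of the dual minimization problem over $\boldsymbol{\nu} \in \mathbb{R}^n$:

$$\text{minimize}\quad \langle \boldsymbol{\nu}, \mathbf{1}\rangle \qquad \text{subject to}\quad \mathbf{A} - \frac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T} \preceq \mathrm{diag}(\boldsymbol{\nu}). \tag{19}$$

We prove the upper bound in Theorem 2.2 by constructing a dual-feasible solution $\boldsymbol{\nu}$, parametrized by a small positive constant $\delta \in (0, 1/\sqrt{d})$. Denote the diagonal degree matrix of $\mathbf{A}$ as $\mathbf{D} := \mathrm{diag}(\mathbf{A}\mathbf{1})$ and set

$$u := \frac{1}{\sqrt{d}} - \delta, \qquad \boldsymbol{\nu} := \begin{cases} \mathrm{diag}\Big(\dfrac{1+\delta-u^2}{u}\,\mathrm{Id} + u\,\mathbf{D}\Big) & \text{if } \dfrac{1+\delta-u^2}{u}\,\mathrm{Id} + u\,\mathbf{D} - \mathbf{A} + \dfrac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T} \succeq 0, \\[2mm] \mathrm{diag}(\mathbf{D}) & \text{otherwise.} \end{cases} \tag{20}$$

The following is the main result of this section, which ensures that the first case in the definition of $\boldsymbol{\nu}$ in (20) holds with high probability.

Theorem 5.1.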
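For fixed $d > 1$, let $\mathbf{A}$ be the adjacency matrix of the Erdős-Rényi random graph $G \sim G(n, d/n)$, and let $\mathbf{D} := \mathrm{diag}(\mathbf{A}\mathbf{1})$ be the diagonal degree matrix. Then for any $\delta \in (0, 1/\sqrt{d})$ and for $u = 1/\sqrt{d} - \delta$, with probability approaching 1 as $n \to \infty$,

$$\frac{1+\delta-u^2}{u}\,\mathrm{Id} + u\,\mathbf{D} - \mathbf{A} + (1-u^2)\frac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T} \succeq 0.$$

Let us first show that this implies the desired upper bound:

Proof of Theorem 2.2 (upper bound). By construction, $\boldsymbol{\nu}$ as defined in (20) is a feasible solution for the dual problem (19). Let $\mathcal{E}$ be the event where

$$\frac{1+\delta-u^2}{u}\,\mathrm{Id} + u\,\mathbf{D} - \mathbf{A} + \frac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T} \succeq 0.$$

Then

$$\boldsymbol{\nu}^{\mathsf T}\mathbf{1} = \Big(\frac{1+\delta-u^2}{u}\,n + u\,\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1}\Big)\mathbb{1}\{\mathcal{E}\} + (\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1})\,\mathbb{1}\{\mathcal{E}^c\}.$$

As $\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1} \sim 2\,\mathrm{Binom}\big(\binom{n}{2}, d/n\big)$, this implies $\mathbb{E}[\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1}] \le dn$ and $\mathbb{E}[(\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1})^2] \le d^2(n^2+1)$. Then

$$\begin{aligned}
\mathbb{E}\Big[\frac{1}{n}\mathrm{SDP}(\mathbf{A})\Big] \le \frac{1}{n}\mathbb{E}[\boldsymbol{\nu}^{\mathsf T}\mathbf{1}] &\le \frac{1+\delta-u^2}{u} + \frac{u}{n}\mathbb{E}[\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1}] + \frac{1}{n}\mathbb{E}\big[(\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1})\,\mathbb{1}\{\mathcal{E}^c\}\big] \\
&\le \frac{1+\delta-u^2}{u} + ud + \frac{1}{n}\sqrt{\mathbb{E}[(\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1})^2]\,\mathbb{P}[\mathcal{E}^c]} \\
&\le \frac{1+\delta-u^2}{u} + ud + d\sqrt{\mathbb{P}[\mathcal{E}^c]}.
\end{aligned}$$

By Theorem 5.1, $\mathbb{P}[\mathcal{E}^c] \to 0$ as $n \to \infty$. Taking $n \to \infty$ and then $\delta \to 0$,

$$\limsup_{n\to\infty} \mathbb{E}\Big[\frac{1}{n}\mathrm{SDP}(\mathbf{A})\Big] \le 2\sqrt{d} - \frac{1}{\sqrt{d}} = 2\sqrt{d}\Big(1 - \frac{1}{2d}\Big).$$

To obtain the bound almost surely rather than in expectation, note that if $G$ and $G'$ are two fixed graphs that differ in one edge, with adjacency matrices $\mathbf{A}$ and $\mathbf{A}'$, then

$$\Big|\mathrm{Tr}\Big(\Big(\mathbf{A} - \frac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T}\Big)\mathbf{X}\Big) - \mathrm{Tr}\Big(\Big(\mathbf{A}' - \frac{d}{n}\mathbf{1}\mathbf{1}^{\mathsf T}\Big)\mathbf{X}\Big)\Big| \le 2$$

for any feasible point $\mathbf{X}$ of (3), so $|\frac{1}{n}\mathrm{SDP}(\mathbf{A}) - \frac{1}{n}\mathrm{SDP}(\mathbf{A}')| \le \frac{2}{n}$. Let $e_1, \dots, e_{\binom{n}{2}}$ be any ordering of the edges $\{(i,j) : 1 \le i < j \le n\}$, and denote by $\mathcal{F}_0, \mathcal{F}_1, \dots, \mathcal{F}_{\binom{n}{2}}$ the filtration where $\mathcal{F}_l$ is generated by $A_{e_1}, \dots, A_{e_l}$. Then by coupling, this implies for each $l = 1, \dots, \binom{n}{2}$

$$|d_l| := \Big|\mathbb{E}\Big[\frac{1}{n}\mathrm{SDP}(\mathbf{A}) \,\Big|\, \mathcal{F}_l\Big] - \mathbb{E}\Big[\frac{1}{n}\mathrm{SDP}(\mathbf{A}) \,\Big|\, \mathcal{F}_{l-1}\Big]\Big| \le \mathbb{1}\{A_{e_l} = 0\}\,\frac{d}{n}\cdot\frac{2}{n} + \mathbb{1}\{A_{e_l} = 1\}\Big(1 - \frac{d}{n}\Big)\frac{2}{n}.$$

Hence $|d_l| \le 2/n$ for each $l$, and

$$V_n := \sum_{l=1}^{\binom{n}{2}} |d_l|^2 \le \frac{\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1}}{2}\Big(\Big(1 - \frac{d}{n}\Big)\frac{2}{n}\Big)^2 + \Big(\binom{n}{2} - \frac{\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1}}{2}\Big)\Big(\frac{2d}{n^2}\Big)^2 \le \frac{2d^2}{n^2} + \frac{2}{n^2}(\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1}).$$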
Bernstein's inequality yields $\mathbb{P}[\mathbf{1}^{\mathsf T}\mathbf{A}\mathbf{1} > 3dn] \le \exp(-C_d n)$ for a constant $C_d > 0$. Then, applying the martingale tail bound of [dlP99, Theorem 1.2A], for any $\varepsilon > 0$,

$$\mathbb{P}\Big[\frac{1}{n}\mathrm{SDP}(\mathbf{A}) - \mathbb{E}\Big[\frac{1}{n}\mathrm{SDP}(\mathbf{A})\Big] \ge \varepsilon\Big] \le \exp\Bigg(-\frac{\varepsilon^2}{2\big(\frac{2d^2}{n^2} + \frac{6d}{n} + \frac{2\varepsilon}{n}\big)}\Bigg) + \mathbb{P}\Big[V_n \ge \frac{2d^2}{n^2} + \frac{6d}{n}\Big] \le 2\exp(-C_{d,\varepsilon}\, n)$$

for a constant $C_{d,\varepsilon} > 0$. Then the Borel-Cantelli lemma implies $|\frac{1}{n}\mathrm{SDP}(\mathbf{A}) - \mathbb{E}[\frac{1}{n}\mathrm{SDP}(\mathbf{A})]| < \varepsilon$ almost surely for all large $n$, and the result follows by taking $\varepsilon \to 0$.

In the remainder of this section, we prove Theorem 5.1. Heuristically, we might expect that Theorem 5.1 is true by the following reasoning: The matrix $\mathbf{H}(u) := (1-u^2)\,\mathrm{Id} + u^2\mathbf{D} - u\mathbf{A}$ is the deformed Laplacian, or Bethe Hessian, of the graph. By a relation in graph theory known as the Ihara-Bass formula [Bas92, KS00], the values of $u$ for which this matrix is singular are the inverses of the non-trivial eigenvalues of a certain "non-backtracking matrix" [KMM+13, SKZ14]. Theorem 3 of [BLM15] shows that this non-backtracking matrix has, with high probability, the bulk of its spectrum supported on the complex disk of radius approximately $\sqrt{d}$, with a single outlier eigenvalue at $d$. From this, the observation that $\mathbf{H}(0) \succ 0$, and a continuity argument in $u$, one deduces that $\mathbf{H}(u)$ has, with high probability for large $n$, only a single negative eigenvalue when $u \in (1/d, 1/\sqrt{d})$. If the eigenvector corresponding to this eigenvalue has positive alignment with $\mathbf{1} \in \mathbb{R}^n$, then adding a certain multiple of the rank-one matrix $\mathbf{1}\mathbf{1}^{\mathsf T}$ should eliminate this negative eigenvalue.

Direct analysis of the rank-one perturbation of $\mathbf{H}(u)$ is hindered by the fact that the spectrum and eigenvectors of $\mathbf{H}(u)$ are difficult to characterize. Instead, we will study a certain perturbation of the non-backtracking matrix. We prove Theorem 5.1 via the following two steps: First, we prove a generalization of the Ihara-Bass relation to edge-weighted graphs.
For any graph $H = (V, E)$, let $c : E \to \mathbb{R}$ be a set of possibly negative edge weights. For each $u \in \mathbb{R}$ such that $|u| \notin \{|c(i,j)|^{-1} : \{i,j\} \in E\}$, define the $n \times n$ symmetric matrix $\mathbf{A}_{c,u}$ and diagonal matrix $\mathbf{D}_{c,u}$ by

$$\mathbf{A}_{c,u}(i,j) = \begin{cases} \dfrac{u\,c(i,j)}{1 - u^2 c(i,j)^2} & \{i,j\} \in E, \\ 0 & \text{otherwise,} \end{cases} \qquad \mathbf{D}_{c,u}(i,j) = \begin{cases} \displaystyle\sum_{k : \{i,k\} \in E} \dfrac{u^2 c(i,k)^2}{1 - u^2 c(i,k)^2} & i = j, \\ 0 & \text{otherwise.} \end{cases}$$

Let $E^o$ denote the set of directed edges $E^o := \{(i,j), (j,i) : \{i,j\} \in E\}$, and define the $|E^o| \times |E^o|$ weighted non-backtracking matrix $\mathbf{B}_c$, with rows and columns indexed by $E^o$, as

$$\mathbf{B}_c((i,j),(i',j')) = \begin{cases} c(i',j') & i' = j,\ j' \ne i, \\ 0 & \text{otherwise.} \end{cases}$$

The following result relates $\mathbf{B}_c$ with a generalized deformed Laplacian defined by $\mathbf{A}_{c,u}$ and $\mathbf{D}_{c,u}$:

Lemma 5.2 (Generalized Ihara-Bass formula). For any graph $H = (V, E)$, edge weights $c : E \to \mathbb{R}$, $u \in \mathbb{R}$ with $|u| \notin \{|c(i,j)|^{-1} : \{i,j\} \in E\}$, and the matrices $\mathbf{B}_c$, $\mathbf{A}_{c,u}$, and $\mathbf{D}_{c,u}$ as defined above,

$$\det(\mathrm{Id} - u\mathbf{B}_c) = \det(\mathrm{Id} + \mathbf{D}_{c,u} - \mathbf{A}_{c,u}) \prod_{\{i,j\} \in E} \big(1 - u^2 c^2(i,j)\big). \tag{21}$$

When $c \equiv 1$, this recovers the standard Ihara-Bass identity. The proof is a direct adaptation of the proof for the unweighted case in [KS00]; for the reader's convenience we provide it in Section 5.1.

Second, we consider a weighted non-backtracking matrix $\mathbf{B} \in \mathbb{R}^{n(n-1) \times n(n-1)}$ of the above form for the complete graph with $n$ vertices, with rows and columns indexed by all ordered pairs $(i,j)$ of distinct indices $i, j \in \{1, \dots, n\}$, and defined as

$$\mathbf{B}((i,j),(i',j')) = \begin{cases} A_{i'j'} - \dfrac{d}{n} & i' = j,\ j' \ne i, \\ 0 & \text{otherwise.} \end{cases} \tag{22}$$

We prove in Section 5.2 that $\mathbf{B}$ no longer has an outlier eigenvalue at $d$, but instead has all of its eigenvalues contained within the complex disk of radius approximately $\sqrt{d}$:

Lemma 5.3.
Fix $d > 1$, let $\mathbf{A}$ be the adjacency matrix of the Erdős-Rényi random graph $G(n, d/n)$, and define $\mathbf{B} \in \mathbb{R}^{n(n-1) \times n(n-1)}$ by (22). Let $\rho(\mathbf{B})$ denote the spectral radius of $\mathbf{B}$. Then for any $\varepsilon > 0$, with probability approaching 1 as $n \to \infty$,

$$\rho(\mathbf{B}) \le \sqrt{d}\,(1 + \varepsilon).$$

Using these results, we prove Theorem 5.1:

Proof of Theorem 5.1. Denote by $\mathbf{D}$ the diagonal degree matrix of $\mathbf{A}$. Let $\mathcal{E}$ be the event on which $\rho(\mathbf{B}) \le (1/\sqrt{d} - \delta/2)^{-1}$ and $\|\mathbf{D}\| \le 2\log n$. Each diagonal entry of $\mathbf{D}$ is distributed as $\mathrm{Binom}(n-1, d/n)$, hence $\mathbb{P}[\|\mathbf{D}\| > 2\log n] \le c_d/n^2$ for a constant $c_d > 0$ by Bernstein's inequality and a union bound. This and Lemma 5.3 imply that $\mathcal{E}$ holds with probability approaching 1.

On $\mathcal{E}$, $\det(\mathrm{Id} - u\mathbf{B}) \ne 0$ for all $u \in (0, 1/\sqrt{d} - \delta/2)$. Applying Lemma 5.2 for the complete graph $H$ with edge weights $c(i,j) = A_{ij} - d/n$, and noting $|u| \ne |c(i,j)|^{-1}$ for any $\{i,j\}$ and any $u \in (0, 1/\sqrt{d} - \delta/2)$ when $n$ is sufficiently large, we have $\det(\mathrm{Id} + \mathbf{D}_u - \mathbf{A}_u) \ne 0$ for

$$\mathbf{A}_u = \Big(\frac{u(1-d/n)}{1 - u^2(1-d/n)^2} + \frac{ud/n}{1 - u^2d^2/n^2}\Big)\mathbf{A} - \frac{ud/n}{1 - u^2d^2/n^2}\,\mathbf{1}\mathbf{1}^{\mathsf T} + \frac{ud/n}{1 - u^2d^2/n^2}\,\mathrm{Id},$$

$$\mathbf{D}_u = \Big(\frac{u^2(1-d/n)^2}{1 - u^2(1-d/n)^2} - \frac{u^2d^2/n^2}{1 - u^2d^2/n^2}\Big)\mathbf{D} + (n-1)\frac{u^2d^2/n^2}{1 - u^2d^2/n^2}\,\mathrm{Id}.$$

Note that at $u = 0$, $\mathrm{Id} + \mathbf{D}_u - \mathbf{A}_u = \mathrm{Id} \succ 0$. Then by continuity in $u$, $\mathrm{Id} + \mathbf{D}_u - \mathbf{A}_u \succ 0$ for all $u \in (0, 1/\sqrt{d} - \delta/2)$ on the event $\mathcal{E}$. Choosing $u = 1/\sqrt{d} - \delta$, it is easily verified that

$$\mathrm{Id} + \mathbf{D}_u - \mathbf{A}_u = \mathrm{Id} + \frac{u^2}{1-u^2}\mathbf{D} - \frac{u}{1-u^2}\mathbf{A} + \frac{ud}{n}\mathbf{1}\mathbf{1}^{\mathsf T} + \mathbf{R}$$

for a remainder matrix $\mathbf{R}$ satisfying $\|\mathbf{R}\| \le \frac{C_{d,\delta}}{n}(\|\mathbf{A}\| + \|\mathbf{D}\| + 1) \le \frac{C_{d,\delta}}{n}(2\|\mathbf{D}\| + 1)$ for a constant $C_{d,\delta} > 0$. Hence on $\mathcal{E}$, $\mathbf{R} \preceq \frac{\delta}{1-u^2}\,\mathrm{Id}$ for all large $n$, and rearranging yields the desired result.

In the following two subsections, we complete the proof by proving Lemmas 5.2 and 5.3.

5.1 Proof of Lemma 5.2

We follow the argument of [KS00, Sections 4 and 5].
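Before the proof, identity (21) can be sanity-checked numerically on small graphs. The following is only an illustrative sketch, not part of the original argument: the function names `det` and `ihara_bass_sides`, the vertex labelling 0, ..., n-1, and the test graphs are our own assumptions. It builds $\mathbf{B}_c$, $\mathbf{A}_{c,u}$ and $\mathbf{D}_{c,u}$ exactly as defined above and compares both sides of (21) in exact rational arithmetic.

```python
from fractions import Fraction


def det(mat):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [row[:] for row in mat]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return result


def ihara_bass_sides(num_vertices, weights, u):
    """Return (lhs, rhs) of (21).

    weights: dict {(i, j): c(i, j)} with one entry per undirected edge, i < j
    (hypothetical input format chosen for this sketch).
    """
    u = Fraction(u)
    c = {}
    for (i, j), w in weights.items():
        c[(i, j)] = c[(j, i)] = Fraction(w)
    directed = sorted(c)  # the directed edge set E^o
    index = {e: k for k, e in enumerate(directed)}
    m = len(directed)

    # Left side: Id - u * B_c, with B_c((i,j),(i',j')) = c(i',j') if i'=j, j'!=i.
    lhs_mat = [[Fraction(int(r == s)) for s in range(m)] for r in range(m)]
    for (i, j) in directed:
        for (i2, j2) in directed:
            if i2 == j and j2 != i:
                lhs_mat[index[(i, j)]][index[(i2, j2)]] -= u * c[(i2, j2)]

    # Right side: Id + D_{c,u} - A_{c,u}, times the product over undirected edges.
    rhs_mat = [[Fraction(int(r == s)) for s in range(num_vertices)]
               for r in range(num_vertices)]
    prod = Fraction(1)
    for (i, j), w in c.items():
        denom = 1 - u * u * w * w
        rhs_mat[i][j] -= u * w / denom          # entry of A_{c,u}
        rhs_mat[i][i] += u * u * w * w / denom  # contribution to D_{c,u}(i,i)
        if i < j:
            prod *= denom
    return det(lhs_mat), det(rhs_mat) * prod
```

For the unweighted triangle the non-backtracking matrix is a permutation matrix made of two directed 3-cycles, so both sides should equal $(1-u^3)^2$; the weighted example with a negative weight and a pendant edge exercises the general statement.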
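Assume without loss of generality $c(i,j) \ne 0$ for all $\{i,j\} \in E$. (Otherwise, remove $\{i,j\}$ from $E$.) Identify $\mathbb{R}^{|E^o|}$ with the space of linear functionals $\omega : E^o \to \mathbb{R}$ and $\mathbb{R}^{|V|}$ with the space of linear functionals $f : V \to \mathbb{R}$. Consider the orthogonal decomposition $\mathbb{R}^{|E^o|} = C_- \oplus C_+$, where

$$C_\pm := \{\omega : \omega(i,j) = \pm\omega(j,i)\ \ \forall (i,j) \in E^o\}.$$

Any $\omega \in \mathbb{R}^{|E^o|}$ has the decomposition $\omega = \omega_- + \omega_+$ where $\omega_\pm(i,j) = \frac{1}{2}(\omega(i,j) \pm \omega(j,i)) \in C_\pm$. Define $\mathrm{d}_\pm : \mathbb{R}^{|V|} \to C_\pm$ by

$$(\mathrm{d}_\pm f)(i,j) := f(j) \pm f(i),$$

and $\delta_\pm : C_\pm \to \mathbb{R}^{|V|}$ by

$$(\delta_\pm \omega)(k) := \pm \sum_{(i,j) \in E^o :\ i = k} c(i,j)\,\omega(i,j).$$

With a slight abuse of notation, denote by $c : C_\pm \to C_\pm$ the diagonal operators $(c\omega)(i,j) := c(i,j)\,\omega(i,j)$ (which are well-defined as $c(i,j) = c(j,i)$). For $\omega \in C_+$,

$$\begin{aligned}
\tfrac{1}{2}\big((\mathbf{B}_c\omega)(i,j) + (\mathbf{B}_c\omega)(j,i)\big) &= \tfrac{1}{2}\Bigg(\sum_{(i',j') \in E^o :\ i' = j,\ j' \ne i} c(i',j')\,\omega(i',j') + \sum_{(i',j') \in E^o :\ i' = i,\ j' \ne j} c(i',j')\,\omega(i',j')\Bigg) \\
&= \tfrac{1}{2}\big((\delta_+\omega)(j) - c(j,i)\,\omega(j,i) + (\delta_+\omega)(i) - c(i,j)\,\omega(i,j)\big) \\
&= \big(\tfrac{1}{2}\mathrm{d}_+\delta_+\omega - c\,\omega\big)(i,j).
\end{aligned}$$

Similar computations for $\frac{1}{2}((\mathbf{B}_c\omega)(i,j) - (\mathbf{B}_c\omega)(j,i))$ and for $\omega \in C_-$ verify that $\mathbf{B}_c$ has the following block decomposition with respect to $\mathbb{R}^{|E^o|} = C_- \oplus C_+$:

$$\mathbf{B}_c = \begin{pmatrix} -\tfrac{1}{2}\mathrm{d}_-\delta_- + c & \tfrac{1}{2}\mathrm{d}_-\delta_+ \\ -\tfrac{1}{2}\mathrm{d}_+\delta_- & \tfrac{1}{2}\mathrm{d}_+\delta_+ - c \end{pmatrix}.$$

Define the matrices $\mathbf{T} \in \mathbb{R}^{|V| \times |E^o|}$, $\mathbf{S} \in \mathbb{R}^{|E^o| \times |V|}$, and $\boldsymbol{\tau} \in \mathbb{R}^{|E^o| \times |E^o|}$ (with respect to the decomposition $\mathbb{R}^{|E^o|} = C_- \oplus C_+$) by

$$\mathbf{T} := \begin{pmatrix} \delta_- & -\delta_+ \end{pmatrix}, \qquad \mathbf{S} := \begin{pmatrix} \mathrm{d}_- \\ \mathrm{d}_+ \end{pmatrix}, \qquad \boldsymbol{\tau} := \begin{pmatrix} -c & 0 \\ 0 & c \end{pmatrix}.$$

Then $\mathbf{B}_c = -\boldsymbol{\tau} - \frac{1}{2}\mathbf{S}\mathbf{T}$, and hence

$$(\mathrm{Id} - u\mathbf{B}_c)(\mathrm{Id} + u\boldsymbol{\tau})^{-1} = \mathrm{Id} + \tfrac{1}{2}u\,\mathbf{S}\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}. \tag{23}$$

($\mathrm{Id} + u\boldsymbol{\tau}$ is invertible by the assumption $|u| \notin \{|c(i,j)|^{-1} : \{i,j\} \in E\}$.) In particular, this implies that $(\mathrm{Id} - u\mathbf{B}_c)(\mathrm{Id} + u\boldsymbol{\tau})^{-1}$ preserves $\mathrm{Ker}\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}$ and $\mathrm{Im}\,\mathbf{S}$.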
For $f \in \mathbb{R}^{|V|}$ and $k \in V$, we compute

$$\begin{aligned}
\big(u\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}\mathbf{S}f\big)(k) &= -\sum_{(i,j) \in E^o :\ i = k} \frac{u\,c(i,j)}{1 - u\,c(i,j)}\big(f(j) - f(i)\big) - \sum_{(i,j) \in E^o :\ i = k} \frac{u\,c(i,j)}{1 + u\,c(i,j)}\big(f(j) + f(i)\big) \\
&= -\sum_{j : \{k,j\} \in E} \frac{2u\,c(k,j)}{1 - u^2c(k,j)^2}\,f(j) + \sum_{j : \{k,j\} \in E} \frac{2u^2c(k,j)^2}{1 - u^2c(k,j)^2}\,f(k) \\
&= \big(2\mathbf{D}_{c,u}f - 2\mathbf{A}_{c,u}f\big)(k).
\end{aligned}$$

Hence $u\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}\mathbf{S} = 2\mathbf{D}_{c,u} - 2\mathbf{A}_{c,u}$. The determinant of this matrix is a rational function of $u$ and non-zero for large $|u|$, so $u\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}\mathbf{S}$ is invertible for generic $u \in \mathbb{R}$. For any such $u$, $\mathrm{Ker}\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}$ and $\mathrm{Im}\,\mathbf{S}$ are linearly independent. Furthermore, one may verify $\delta_\pm c^{-1} = \mathrm{d}_\pm^*$, so $-\mathbf{T}\boldsymbol{\tau}^{-1} = \mathbf{S}^*$ and

$$|E^o| = \dim \mathrm{Im}\,\mathbf{S} + \dim \mathrm{Ker}\,\mathbf{S}^* = \dim \mathrm{Im}\,\mathbf{S} + \dim \mathrm{Ker}\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}.$$

Hence for generic $u$, $\mathrm{Ker}\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}$ and $\mathrm{Im}\,\mathbf{S}$ span all of $\mathbb{R}^{|E^o|}$. By (23), we may write the block decomposition of $(\mathrm{Id} - u\mathbf{B}_c)(\mathrm{Id} + u\boldsymbol{\tau})^{-1}$ with respect to $\mathbb{R}^{|E^o|} = \mathrm{Im}\,\mathbf{S} \oplus \mathrm{Ker}\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}$ as

$$(\mathrm{Id} - u\mathbf{B}_c)(\mathrm{Id} + u\boldsymbol{\tau})^{-1} = \begin{pmatrix} \mathbf{S}\big(\mathrm{Id} + \tfrac{1}{2}u\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}\mathbf{S}\big)\mathbf{S}^{-1} & 0 \\ 0 & \mathrm{Id} \end{pmatrix}.$$

Then, computing the determinant of both sides and rearranging,

$$\det(\mathrm{Id} - u\mathbf{B}_c) = \prod_{\{i,j\} \in E} \big(1 - u^2c^2(i,j)\big) \times \det\big(\mathrm{Id} + \tfrac{1}{2}u\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}\mathbf{S}\big).$$

Recalling $u\,\mathbf{T}(\mathrm{Id} + u\boldsymbol{\tau})^{-1}\mathbf{S} = 2\mathbf{D}_{c,u} - 2\mathbf{A}_{c,u}$, this establishes the result for generic $u$. The conclusion for all $|u| \notin \{|c(i,j)|^{-1} : \{i,j\} \in E\}$ then follows by continuity.

5.2 Proof of Lemma 5.3

We bound $\rho(\mathbf{B}^{2m}) \le \mathrm{Tr}\,\mathbf{B}^m(\mathbf{B}^{\mathsf T})^m$ for some $m := m(n)$, and apply the moment method to bound the latter quantity. Throughout this section, "edge" and "graph" mean undirected edge and undirected graph. We begin with some combinatorial definitions:

Definition 5.4. The cycle number of a graph $H$, denoted $\#_c(H)$, is the minimum number of edges that must be removed from $H$ so that the resulting graph has no cycles.
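As a small illustration of Definition 5.4 (a sketch under our own assumptions, not taken from the paper), the cycle number of a simple graph given as an edge list can be computed with a union-find pass: every edge whose endpoints are already connected closes a cycle and must be removed, so the count of such edges equals $e + k - v$ for a graph with $e$ edges, $v$ vertices and $k$ connected components. The function name `cycle_number` and the edge-list input format are hypothetical.

```python
def cycle_number(edges):
    """Cycle number of the simple undirected graph spanned by `edges`.

    `edges` is an iterable of pairs (a, b) with a != b; repeated or
    reversed copies of the same edge are counted once.
    """
    parent = {}

    def find(x):
        # Union-find root lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    distinct_edges = {frozenset(e) for e in edges}
    closing_edges = 0
    for edge in distinct_edges:
        a, b = tuple(edge)
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            closing_edges += 1  # endpoints already connected: this edge closes a cycle
        else:
            parent[root_a] = root_b  # merge the two components
    return closing_edges
```

For instance, a path has cycle number 0, a triangle 1, and two triangles sharing a vertex 2, so the latter is the edge set of a graph with "at least two cycles" in the sense used for coils below.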
(If $H$ has $k$ connected components, $v$ vertices, and $e$ edges, then $\#_c(H) = e + k - v$.)

Definition 5.5. For any $l \ge 1$, an $l$-coil is any connected graph with at most $l$ edges and at least two cycles. A graph $H$ is $l$-coil-free if $H$ contains no $l$-coils, i.e. every connected subset of at most $l$ edges in $H$ contains at most one cycle.

Definition 5.6. A sequence of vertices $\gamma_0, \dots, \gamma_m \in \{1, \dots, n\}$ is a non-backtracking path of length $m$ (on the complete graph) if $\gamma_1 \ne \gamma_0$ and $\gamma_{j+1} \notin \{\gamma_{j-1}, \gamma_j\}$ for each $j = 1, \dots, m-1$. $\gamma$ visits the vertices $\gamma_0, \dots, \gamma_m$ and the edges $\{\gamma_0, \gamma_1\}, \dots, \{\gamma_{m-1}, \gamma_m\}$. $\gamma$ is $l$-coil-free if the subgraph of edges visited by $\gamma$ is $l$-coil-free. For any $K \subseteq \{0, \dots, m-1\}$, $\gamma$ visits on $K$ the edges $\{\gamma_j, \gamma_{j+1}\}$, $j \in K$, and $\gamma$ is $l$-coil-free on $K$ if the subgraph formed by these edges is $l$-coil-free.

The definition of $l$-coil-free is similar to (and more convenient for our proof than) that of $l$-tangle-free in [BLM15] and [MNS13], which states that every ball of radius $l$ in $H$ contains at most one cycle. Clearly if $H$ has an $l$-coil, then the ball of radius $l$ around any vertex in this $l$-coil has two cycles. Hence if $H$ is $l$-tangle-free in the sense of [BLM15], then it is also $l$-coil-free, which yields the following lemma:

Lemma 5.7. Let $d > 1$ and consider the Erdős-Rényi graph $G \sim G(n, d/n)$. For some absolute constant $C > 0$ and any $l \ge 1$,

$$\mathbb{P}[G \text{ is } l\text{-coil-free}] \ge 1 - C d^{2l}/n.$$

Proof. This is proven for $G$ being $l$-tangle-free in [BLM15, Lemma 30]; hence the result follows from the above remark.

Our moment method computation will draw on the following two technical lemmas, whose proofs we defer to Appendix A.

Lemma 5.8. Let $d > 1$ and let $\mathbf{A}$ be the adjacency matrix of the Erdős-Rényi graph $G \sim G(n, d/n)$. Let $E := \{\{v,w\} : v, w \in \{1, \dots, n\},\ v \ne w\}$ be the edges of the complete graph on $n$ vertices, let $S, T \subseteq E$ be any disjoint edge sets such that $|S|, |T| \le (\log n)^2$, and let $\#_c(S \cup T)$ be the cycle number of the subgraph formed by the edges $S \cup T$. Let $l$ be a positive integer with $l \le 0.1 \log_d n$. Then for some $C := C(d) > 0$, $N_0 := N_0(d) > 0$, and all $n \ge N_0$,

$$\Bigg|\mathbb{E}\Bigg[\prod_{\{v,w\} \in S}\Big(A_{vw} - \frac{d}{n}\Big) \prod_{\{v,w\} \in T} A_{vw}\ \mathbb{1}\{G \text{ is } l\text{-coil-free}\}\Bigg]\Bigg| \le C(\log n)^2 \Big(\frac{d}{n}\Big)^{|S| + |T|}\, 2^{|S|}\, n^{-0.7\left(\frac{|S|}{l} - \#_c(S \cup T)\right)}.$$

Lemma 5.9. For $m, l, v, e \ge 1$ and $K_1, K_2 \subseteq \{0, \dots, m-1\}$, let $W(m, l, v, e, K_1, K_2)$ denote the set of all ordered pairs $(\gamma^{(1)}, \gamma^{(2)})$ of non-backtracking paths of length $m$ (on the complete graph with $n$ vertices), such that each $\gamma^{(i)}$ is $l$-coil-free on $K_i$, $\gamma^{(1)}_0 = \gamma^{(2)}_0$, $\gamma^{(1)}_m = \gamma^{(2)}_m$, and $(\gamma^{(1)}, \gamma^{(2)})$ visit a total of $v$ distinct vertices and $e$ distinct edges. Call two such pairs of paths equivalent if they are the same up to a relabeling of the vertices, and let $\mathcal{W}(m, l, v, e, K_1, K_2)$ be the set of all equivalence classes under this relation. Then the number of distinct equivalence classes satisfies the bound

$$|\mathcal{W}(m, l, v, e, K_1, K_2)| \le \big(l\,(3v^2)^{2l+2}\big)^{2m - |K_1| - |K_2|}\,\big(l\,(3v^2)^{2e - 2v + 4}\big)^{\frac{2m}{l} + 2}.$$

Using the above results, we prove Lemma 5.3:

Proof of Lemma 5.3. Let $m := m(n) \ge 1$ with $m = o((\log n)^2)$, to be specified later. Denote

$$T_m := \mathrm{Tr}\,\mathbf{B}^m(\mathbf{B}^{\mathsf T})^m = \sum_{e_1, \dots, e_{2m}} \prod_{j=1}^m B_{e_j e_{j+1}} \prod_{j=1}^m (\mathbf{B}^{\mathsf T})_{e_{m+j} e_{m+j+1}} = \sum_{e_1, \dots, e_{2m}} \prod_{j=1}^m B_{e_j e_{j+1}} \prod_{j=1}^m B_{e_{m+j+1} e_{m+j}},$$

where the summation runs over all possible tuples of ordered vertex pairs $e_j \in \{(v,w) : v, w \in \{1, \dots, n\},\ v \ne w\}$, and where $e_{2m+1} := e_1$. By the definition of $\mathbf{B}$, a term of the above sum corresponding to $e_1, \dots, e_{2m}$ is 0 unless

$$e_1, \dots, e_{m+1} = (\tilde\gamma^{(1)}_0, \tilde\gamma^{(1)}_1), \dots, (\tilde\gamma^{(1)}_m, \tilde\gamma^{(1)}_{m+1}),$$

$$e_{2m+1}, \dots, e_{m+1} = (\tilde\gamma^{(2)}_0, \tilde\gamma^{(2)}_1), \dots, (\tilde\gamma^{(2)}_m, \tilde\gamma^{(2)}_{m+1})$$

for two non-backtracking paths $\tilde\gamma^{(1)}$ and $\tilde\gamma^{(2)}$ of length $m+1$ on the complete graph, such that $(\tilde\gamma^{(1)}_0, \tilde\gamma^{(1)}_1) = (\tilde\gamma^{(2)}_0, \tilde\gamma^{(2)}_1)$ and $(\tilde\gamma^{(1)}_m, \tilde\gamma^{(1)}_{m+1}) = (\tilde\gamma^{(2)}_m, \tilde\gamma^{(2)}_{m+1})$. Letting $\tilde\Gamma_m$ denote the set of all pairs of such paths,

$$T_m = \sum_{(\tilde\gamma^{(1)}, \tilde\gamma^{(2)}) \in \tilde\Gamma_m}\ \prod_{j=1}^m \Big(A_{\tilde\gamma^{(1)}_j \tilde\gamma^{(1)}_{j+1}} - \frac{d}{n}\Big) \prod_{j=1}^m \Big(A_{\tilde\gamma^{(2)}_j \tilde\gamma^{(2)}_{j+1}} - \frac{d}{n}\Big).$$

(The above products do not include the $j = 0$ terms corresponding to the first edge of each path.) Writing $(\gamma^{(i)}_0, \dots, \gamma^{(i)}_m) := (\tilde\gamma^{(i)}_1, \dots, \tilde\gamma^{(i)}_{m+1})$ to remove the first vertex of each path, and letting $\Gamma_m$ denote the set of pairs $(\gamma^{(1)}, \gamma^{(2)})$ where each $\gamma^{(i)}$ is a non-backtracking path of length $m$ and such that $\gamma^{(1)}_0 = \gamma^{(2)}_0$ and $(\gamma^{(1)}_{m-1}, \gamma^{(1)}_m) = (\gamma^{(2)}_{m-1}, \gamma^{(2)}_m)$, the above may be written (by summing over $\tilde\gamma^{(1)}_0 = \tilde\gamma^{(2)}_0$) as

$$T_m = \sum_{(\gamma^{(1)}, \gamma^{(2)}) \in \Gamma_m} \Big(n - \big|\{\gamma^{(1)}_0, \gamma^{(1)}_1, \gamma^{(2)}_1\}\big|\Big) \prod_{j=0}^{m-1} \Big(A_{\gamma^{(1)}_j \gamma^{(1)}_{j+1}} - \frac{d}{n}\Big) \prod_{j=0}^{m-1} \Big(A_{\gamma^{(2)}_j \gamma^{(2)}_{j+1}} - \frac{d}{n}\Big).$$

For each edge $\{v,w\}$ in the complete graph, call $\{v,w\}$ single if it is visited exactly once by the pair of paths $(\gamma^{(1)}, \gamma^{(2)})$. For each $i = 1, 2$, denote $J_i := J_i(\gamma^{(1)}, \gamma^{(2)}) = \{j \in \{0, \dots, m-1\} : \{\gamma^{(i)}_j, \gamma^{(i)}_{j+1}\} \text{ is single}\}$, and write $J_i^c = \{0, \dots, m-1\} \setminus J_i$. Distributing the products over $J_1^c$ and $J_2^c$, the above may be written as

$$\begin{aligned}
T_m = \sum_{(\gamma^{(1)}, \gamma^{(2)}) \in \Gamma_m}\ \sum_{K_1 \subseteq J_1^c}\ \sum_{K_2 \subseteq J_2^c}\ &\Big(n - \big|\{\gamma^{(1)}_0, \gamma^{(1)}_1, \gamma^{(2)}_1\}\big|\Big) \Big(-\frac{d}{n}\Big)^{|J_1^c| - |K_1| + |J_2^c| - |K_2|} \\
&\times \prod_{j \in J_1} \Big(A_{\gamma^{(1)}_j \gamma^{(1)}_{j+1}} - \frac{d}{n}\Big) \prod_{j \in K_1} A_{\gamma^{(1)}_j \gamma^{(1)}_{j+1}} \prod_{j \in J_2} \Big(A_{\gamma^{(2)}_j \gamma^{(2)}_{j+1}} - \frac{d}{n}\Big) \prod_{j \in K_2} A_{\gamma^{(2)}_j \gamma^{(2)}_{j+1}}.
\end{aligned} \tag{24}$$

Let $l := l(n) \ge 1$ with $l \ll \log n$, to be specified later, and let $\mathcal{E}$ be the event that $G$ is $l$-coil-free.
We multiply (24) on both sides by $\mathbb{1}\{\mathcal{E}\}$ and apply Lemma 5.8. On $\mathcal{E}$, the only nonzero terms of the sum in (24) are those where $\gamma^{(1)}$ is $l$-coil-free on $K_1$ and $\gamma^{(2)}$ is $l$-coil-free on $K_2$. For each such term, denote by $S$ the set of single edges and by $T$ the set of all edges visited by $\gamma^{(1)}$ on $K_1$ and by $\gamma^{(2)}$ on $K_2$. Then

$$\prod_{j \in J_1} \Big(A_{\gamma^{(1)}_j \gamma^{(1)}_{j+1}} - \frac{d}{n}\Big) \prod_{j \in K_1} A_{\gamma^{(1)}_j \gamma^{(1)}_{j+1}} \prod_{j \in J_2} \Big(A_{\gamma^{(2)}_j \gamma^{(2)}_{j+1}} - \frac{d}{n}\Big) \prod_{j \in K_2} A_{\gamma^{(2)}_j \gamma^{(2)}_{j+1}} = \prod_{\{v,w\} \in S} \Big(A_{vw} - \frac{d}{n}\Big) \prod_{\{v,w\} \in T} A_{vw}.$$

Note that $|S| = |J_1| + |J_2|$. If $(\gamma^{(1)}, \gamma^{(2)})$ visit $e := e(\gamma^{(1)}, \gamma^{(2)})$ total distinct edges, then $e - |J_1| - |J_2|$ of these are non-single and hence potentially visited by $\gamma^{(1)}$ on $K_1$ and $\gamma^{(2)}$ on $K_2$. For each such edge not visited by $\gamma^{(1)}$ on $K_1$ and $\gamma^{(2)}$ on $K_2$, there must be at least two indices corresponding to this edge in $J_1^c \setminus K_1$ and $J_2^c \setminus K_2$. Hence $|T| \ge e - |J_1| - |J_2| - \frac{|J_1^c| - |K_1| + |J_2^c| - |K_2|}{2}$. The cycle number $\#_c(S \cup T)$ is at most the cycle number of the graph formed by all edges visited by $(\gamma^{(1)}, \gamma^{(2)})$, which is $e - v + 1$ if $(\gamma^{(1)}, \gamma^{(2)})$ visit $v := v(\gamma^{(1)}, \gamma^{(2)})$ total distinct vertices. Combining these observations and applying Lemma 5.8, for a constant $C > 0$ and all large $n$,

$$\mathbb{E}[T_m \mathbb{1}\{\mathcal{E}\}] \le C n (\log n)^2 \sum_{\gamma^{(1)}, \gamma^{(2)}, K_1, K_2} \Big(\frac{d}{n}\Big)^{e + \frac{|J_1^c| - |K_1| + |J_2^c| - |K_2|}{2}}\, n^{0.7(e - v + 1)} \Big(\frac{2}{n^{0.7/l}}\Big)^{|J_1| + |J_2|},$$

where the summation is over all $(\gamma^{(1)}, \gamma^{(2)}) \in \Gamma_m$ and $K_1 \subseteq J_1^c$ and $K_2 \subseteq J_2^c$ such that $\gamma^{(i)}$ is $l$-coil-free on $K_i$ for $i = 1, 2$, and where $e, v, J_1, J_2$ all depend on the paths $\gamma^{(1)}$ and $\gamma^{(2)}$. As $\gamma^{(1)}$ and $\gamma^{(2)}$ are of length $m$, this implies $2(e - |J_1| - |J_2|) + |J_1| + |J_2| \le 2m$, so $e \le m + (|J_1| + |J_2|)/2$. Then, since $d > 1$,

$$\mathbb{E}[T_m \mathbb{1}\{\mathcal{E}\}] \le C n (\log n)^2 d^m \sum_{\gamma^{(1)}, \gamma^{(2)}, K_1, K_2} n^{-e} \Big(\frac{d}{n}\Big)^{\frac{|J_1^c| - |K_1| + |J_2^c| - |K_2|}{2}}\, n^{0.7(e - v + 1)} \Big(\frac{2\sqrt{d}}{n^{0.7/l}}\Big)^{|J_1| + |J_2|}.$$

Let us now drop the condition that $J_i$ corresponds to indices where $\{\gamma^{(i)}_j, \gamma^{(i)}_{j+1}\}$ is single, and instead sum over all subsets $J_1, J_2 \subseteq \{0, \dots, m-1\}$, all subsets $K_1 \subseteq J_1^c$ and $K_2 \subseteq J_2^c$, and all paths $(\gamma^{(1)}, \gamma^{(2)}) \in \Gamma_m$ such that $\gamma^{(i)}$ is $l$-coil-free on $K_i$ for $i = 1, 2$. Letting $\mathcal{W}(m, l, v, e, K_1, K_2)$ be as in Lemma 5.9, and noting that each class in $\mathcal{W}(m, l, v, e, K_1, K_2)$ represents at most $n^v$ distinct pairs of paths, this yields

$$\begin{aligned}
\mathbb{E}[T_m \mathbb{1}\{\mathcal{E}\}] &\le C n (\log n)^2 d^m \sum_{J_1, J_2, K_1, K_2}\ \sum_{v=2}^{2m}\ \sum_{e=v-1}^{2m} n^{-e} \Big(\frac{d}{n}\Big)^{\frac{|J_1^c| - |K_1| + |J_2^c| - |K_2|}{2}} n^{0.7(e - v + 1)} \Big(\frac{2\sqrt{d}}{n^{0.7/l}}\Big)^{|J_1| + |J_2|} n^v\, |\mathcal{W}(m, l, v, e, K_1, K_2)| \\
&\le C n^2 (\log n)^2 d^m \sum_{v=2}^{2m}\ \sum_{e=v-1}^{2m}\ \sum_{J_1, J_2, K_1, K_2} n^{-0.3(e - v + 1)} \Big(\frac{d}{n}\Big)^{\frac{|J_1^c| - |K_1| + |J_2^c| - |K_2|}{2}} \Big(\frac{2\sqrt{d}}{n^{0.7/l}}\Big)^{|J_1| + |J_2|} \\
&\qquad\qquad \times \big(l\,(3v^2)^{2l+2}\big)^{2m - |K_1| - |K_2|}\, \big(l\,(3v^2)^{2e - 2v + 4}\big)^{\frac{2m}{l} + 2} \\
&= C n^2 (\log n)^2 d^m \sum_{v=2}^{2m}\ \sum_{e=v-1}^{2m}\ \sum_{J_1, J_2, K_1, K_2} \big(l\,(3v^2)^2\big)^{\frac{2m}{l} + 2} \Bigg(\frac{(3v^2)^{\frac{4m}{l} + 4}}{n^{0.3}}\Bigg)^{e - v + 1} \\
&\qquad\qquad \times \Bigg(\frac{\sqrt{d}\, l\, (3v^2)^{2l+2}}{\sqrt{n}}\Bigg)^{|J_1^c| - |K_1| + |J_2^c| - |K_2|} \Bigg(\frac{2\sqrt{d}\, l\, (3v^2)^{2l+2}}{n^{0.7/l}}\Bigg)^{|J_1| + |J_2|},
\end{aligned}$$

where the summations are over $J_1, J_2 \subseteq \{0, \dots, m-1\}$ and $K_i \subseteq J_i^c$ for $i = 1, 2$, and in the last line above we have written $2m = |J_1| + |J_2| + |J_1^c| + |J_2^c|$ and collected terms with common exponents. Factoring the summations over $J_1, J_2, K_1, K_2$, the above is equivalent to

$$\mathbb{E}[T_m \mathbb{1}\{\mathcal{E}\}] \le C n^2 (\log n)^2 d^m \sum_{v=2}^{2m}\ \sum_{e=v-1}^{2m} \big(l\,(3v^2)^2\big)^{\frac{2m}{l} + 2} \Bigg(\frac{(3v^2)^{\frac{4m}{l} + 4}}{n^{0.3}}\Bigg)^{e - v + 1} \Bigg(1 + \frac{\sqrt{d}\, l\, (3v^2)^{2l+2}}{\sqrt{n}} + \frac{2\sqrt{d}\, l\, (3v^2)^{2l+2}}{n^{0.7/l}}\Bigg)^{2m}.$$

Finally, let us take $l \sim (\log\log n)^3$ and $m \sim (\log n)(\log\log n)$. Then for $v \le 2m$, we may verify

$$m\sqrt{d}\, l\, (3v^2)^{2l+2} \ll n^{0.7/l} \ll \sqrt{n}, \qquad (3v^2)^{\frac{4m}{l} + 4} \ll n^{0.2}.$$
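The two comparisons above are asymptotic statements and only take hold for astronomically large $n$. The sketch below is an informal illustration, not part of the proof; it assumes the exact choices $l = (\log\log n)^3$, $m = (\log n)(\log\log n)$ and $v = 2m$ (the text only fixes $l$ and $m$ up to asymptotic equivalence), and the name `log_ratios` and the default $d = 2$ are hypothetical. To avoid overflow it works with $t = \log\log n$ and returns ratios of logarithms, so a ratio below 1 means the left-hand side is smaller than the right-hand side at that $n$.

```python
import math


def log_ratios(t, d=2.0):
    """Compare both sides of the two asymptotic claims on a log scale.

    t = log log n.  Returns (r1, r2) where
      r1 = log(m sqrt(d) l (3v^2)^(2l+2)) / log(n^(0.7/l)),
      r2 = log((3v^2)^(4m/l+4))          / log(n^0.2),
    with l = t^3, m = (log n) * t and v = 2m (assumed exact choices).
    """
    log_n = math.exp(t)
    l = t ** 3
    m = log_n * t
    log_3v2 = math.log(12.0) + 2.0 * math.log(m)  # log(3 v^2) at v = 2m
    lhs1 = math.log(m) + 0.5 * math.log(d) + math.log(l) + (2 * l + 2) * log_3v2
    r1 = lhs1 / (0.7 * log_n / l)
    r2 = (4 * m / l + 4) * log_3v2 / (0.2 * log_n)
    return r1, r2
```

Under these assumptions both ratios are below 1 at $\log\log n = 60$, while the second comparison still fails at $\log\log n = 40$ and the first fails at $\log\log n = 10$, which is consistent with the statements being used only in the limit $n \to \infty$.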
So for some $C' > 0$ and all large $n$,

$$\mathbb{E}[T_m \mathbb{1}\{\mathcal{E}\}] \le C' n^2 (\log n)^2 d^m \sum_{v=2}^{2m} \big(l\,(3v^2)^2\big)^{\frac{2m}{l} + 2} \le C' n^2 (\log n)^2 d^m \cdot 2m\,(144\, l\, m^4)^{\frac{2m}{l} + 2}.$$

We may verify, for any $\varepsilon > 0$,

$$n^2 (\log n)^2 \cdot 2m\,(144\, l\, m^4)^{\frac{2m}{l} + 2} \ll (1 + \varepsilon)^{2m}.$$

Noting that $\rho(\mathbf{B})^{2m} \le T_m$, we obtain by Markov's inequality

$$\mathbb{P}\big[\rho(\mathbf{B}) \ge \sqrt{d}\,(1 + \varepsilon),\ G \text{ is } l\text{-coil-free}\big] \le \frac{\mathbb{E}[T_m \mathbb{1}\{\mathcal{E}\}]}{d^m (1 + \varepsilon)^{2m}} \to 0.$$

Lemma 5.7 yields $\mathbb{P}[G \text{ is not } l\text{-coil-free}] \to 0$, establishing the desired result.

6 Lower bounds: Theorems 2.2, 2.4, 2.5, and 2.7

In this section, we prove the lower bound in Theorem 2.2, and also Theorems 2.4, 2.5, and 2.7. We will begin by reviewing some definitions and facts on local weak convergence in Section 6.1, describing the sense in which the graphs $G(n, d/n)$ and $G(n, a/n, b/n)$ converge locally to Galton-Watson trees. We will then reduce the lower bounds to the construction of local algorithms on such trees, stated as a sequence of lemmas in Section 6.2, and finally turn to the proofs of these lemmas.

6.1 Local weak convergence: Definitions

To accommodate notationally both the Erdős-Rényi model and the stochastic block model, let $\mathcal{M}$ denote a general finite set of possible vertex labels. In the Erdős-Rényi case, we will simply take the trivial set $\mathcal{M} = \{1\}$; for the stochastic block model with partially revealed labels, we will take $\mathcal{M} = \{+1, -1, \mathrm{u}\}$ as described in Section 2. Then Definitions 2.1 and 2.6 coincide. Let $\mathcal{G}(\mathcal{M})$ denote the space of tuples $(G, \text{ø}, \sigma)$ where $G = (V, E)$ is a (finite or) locally-finite graph, $\text{ø} \in V$ is a distinguished root vertex, and $\sigma : V \to \mathcal{M}$ associates a label to each vertex.
Define a corresponding edge-perspective set $\mathcal{G}_e(\mathcal{M})$ as the space of tuples $(G, \{\text{ø}, \text{ø}'\}, \sigma)$ where $G = (V, E)$ is a (finite or) locally-finite graph, $\sigma : V \to \mathcal{M}$ associates a label to each vertex, and $\{\text{ø}, \text{ø}'\} \in E$ is a distinguished (undirected) root edge. The subspaces of trees in $\mathcal{G}(\mathcal{M})$, $\mathcal{G}_e(\mathcal{M})$ are denoted by $\mathcal{T}(\mathcal{M})$, $\mathcal{T}_e(\mathcal{M})$. For any graph $H$, integer $\ell \ge 0$, and set of vertices $S$ in $H$, let $B_\ell(S; H)$ denote the subgraph induced by all vertices at distance at most $\ell$ from $S$ in $H$ (including $S$ itself). To make contact with previously introduced notation, for a vertex $v$, we write $B_\ell(v; H)$ for $B_\ell(\{v\}; H)$.

Local weak convergence was initially introduced in [BS01] and further developed in [AL07, AS04]. We define it here in a somewhat restricted setting that is relevant for our proofs.

Definition 6.1. Let $\nu$ be a law over $\mathcal{T}(\mathcal{M})$ and let $\{G_n = (V_n, E_n)\}_{n \ge 1}$ be a sequence of (deterministic) graphs with (deterministic) vertex labels $\sigma_n : V_n \to \mathcal{M}$. We say that $(G_n, \sigma_n)$ converges locally to $\nu$ if, for any $\ell \ge 0$, any $\tau \in \mathcal{T}(\mathcal{M})$, and a vertex $v \in V_n$ chosen uniformly at random$^5$,

$$\lim_{n \to \infty} \mathbb{P}\big\{(B_\ell(v; G_n), v, \sigma_n) \simeq \tau\big\} = \mathbb{P}_\nu\big\{(B_\ell(\text{ø}; T), \text{ø}, \sigma) \simeq \tau\big\}, \tag{25}$$

where $\simeq$ denotes graph isomorphism that preserves the root vertex and vertex labels. We write $(G_n, \sigma_n) \Rightarrow \nu$.

A law $\nu$ over $\mathcal{T}(\mathcal{M})$ is the limit of some graph sequence if and only if $\nu$ is unimodular [BS01, Ele10, BLS15]. Roughly speaking, this means that the law $\nu$ does not change if the root is changed. Corresponding to any unimodular law $\nu$ is an associated edge-perspective law $\nu_e$ over $\mathcal{T}_e(\mathcal{M})$. This is obtained from $\nu$ as follows. First define a law $\tilde\nu$ over $\mathcal{T}(\mathcal{M})$ whose Radon-Nikodym derivative with respect to $\nu$ is $\frac{\mathrm{d}\tilde\nu}{\mathrm{d}\nu}(T, \text{ø}, \sigma) = \deg_T(\text{ø}) / \mathbb{E}_\nu \deg_T(\text{ø})$.
Then, letting $(T, \text{ø}, \sigma) \sim \tilde\nu$, define $\nu_e$ to be the law of $(T, \{\text{ø}, v\}, \sigma)$ where $v$ is a uniformly random neighbor of $\text{ø}$ in $T$.

The above definition is clarified by the following fact. Its proof is an immediate consequence of the definitions, once we notice that, in order to sample a uniformly random edge in $G_n$, it is sufficient to sample a random vertex $v$ with probability proportional to $\deg(v)$, and then sample one of its neighbors uniformly at random.

Lemma 6.2. For $\nu$ a unimodular law over $\mathcal{T}(\mathcal{M})$, denote by $\nu_e$ the corresponding edge-perspective law. Let $\{G_n = (V_n, E_n)\}_{n \ge 1}$ be a graph sequence with vertex labels $\sigma_n : V_n \to \mathcal{M}$ such that $(G_n, \sigma_n) \Rightarrow \nu$. Then for any $\ell \ge 0$ and any $\tau_e \in \mathcal{T}_e(\mathcal{M})$, if an edge $\{u, v\} \in E_n$ is chosen uniformly at random, we have

$$\lim_{n \to \infty} \mathbb{P}\big\{(B_\ell(\{u,v\}; G_n), \{u,v\}, \sigma_n) \simeq \tau_e\big\} = \mathbb{P}_{\nu_e}\big\{(B_\ell(\{\text{ø}, \text{ø}'\}; T), \{\text{ø}, \text{ø}'\}, \sigma) \simeq \tau_e\big\}, \tag{26}$$

where $\simeq$ denotes graph isomorphism that preserves the root edge and vertex labels.

Both the Erdős-Rényi random graph and the planted partition random graph with partially observed labels (revealed independently at random) satisfy the above definitions, where the laws $\nu$ and $\nu_e$ are the laws of Galton-Watson trees.

Definition 6.3. A Galton-Watson tree with offspring distribution $\mu$ is a random tree rooted at a vertex $\text{ø}$, such that each vertex $v$ has $N_v \sim \mu$ children independently of the other vertices. A two-type Galton-Watson tree with offspring distributions $\mu_=$ and $\mu_{\ne}$ is a random tree with binary vertex labels $\{+1, -1\}$ rooted at $\text{ø}$, such that $\text{ø}$ has label $\pm 1$ with equal probability, and each vertex has $N^=_v \sim \mu_=$ children with the same label as itself and $N^{\ne}_v \sim \mu_{\ne}$ children with the opposite label from itself, independently of each other and of the other vertices.
The labels of a two-type Galton-Watson tree are partially revealed with probability $\delta$ if the label set is augmented to $\{+1, -1, \mathrm{u}\}$ and, conditional on the tree, the label of each vertex is replaced by $\mathrm{u}$ independently with probability $1 - \delta$.

Example 6.4. Fix $d > 0$, let $G_n = (V_n, E_n) \sim G(n, d/n)$ be an Erdős-Rényi random graph, let $\mathcal{M} = \{1\}$, and let $\sigma_n \equiv 1$ be the trivial labeling. Then almost surely (over the realization of $G_n$), $(G_n, \sigma_n) \Rightarrow \nu$ where $\nu$ is the law of a Galton-Watson tree rooted at $\text{ø}$ with offspring distribution $\mathrm{Poisson}(d)$ (and labels $\sigma \equiv 1$). The associated edge-perspective law $\nu_e$ is the law of two independent such trees rooted at $\text{ø}$ and $\text{ø}'$ and connected by the single edge $\{\text{ø}, \text{ø}'\}$.

Footnote 5: Here $\sigma_n$ and $\sigma$ really denote the restrictions $\sigma_n|_{B_\ell(v; G_n)}$ and $\sigma|_{B_\ell(\text{ø}; T)}$ of the labels to the balls of radius $\ell$. We avoid this type of cumbersome notation when the meaning is clear.

Example 6.5. Fix $a, b > 0$ and $\delta \in (0, 1]$, let $G_n = (V_n, E_n) \sim G(n, a/n, b/n)$ be the planted partition random graph, and let $\sigma_n : V_n \to \{+1, -1, \mathrm{u}\}$ be such that, independently for each vertex $i$, with probability $1 - \delta$ we have $\sigma_n(i) = \mathrm{u}$, and with probability $\delta$ we have that $\sigma_n(i)$ equals the vertex label ($+1$ or $-1$) of the hidden partition to which $i$ belongs. Then almost surely (over the realization of $G_n$ and $\sigma_n$), $(G_n, \sigma_n) \Rightarrow \nu$ where $\nu$ is the law of a two-type Galton-Watson tree rooted at $\text{ø}$, with offspring distributions $\mathrm{Poisson}(a/2)$ and $\mathrm{Poisson}(b/2)$ and with vertices partially revealed with probability $\delta$.
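To make the limiting object concrete, the following sketch samples the first few generations of the two-type Galton-Watson tree with partially revealed labels described above. It is an illustration under our own assumptions rather than anything used in the proofs: the names `poisson` and `sample_two_type_gw`, the truncation at a finite `depth`, the list-of-dicts representation and the string `"u"` for an unrevealed label are all hypothetical choices, and the Poisson sampler is Knuth's simple multiplication method.

```python
import math
import random


def poisson(rng, mean):
    """Sample Poisson(mean) by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_two_type_gw(a, b, delta, depth, seed=0):
    """Sample generations 0..depth of a two-type Galton-Watson tree.

    Each vertex has Poisson(a/2) children with its own label and
    Poisson(b/2) children with the opposite label; afterwards each label
    is revealed independently with probability delta, else shown as "u".
    Returns a list of dicts with keys parent, label, depth, observed.
    """
    rng = random.Random(seed)
    vertices = [{"parent": None, "label": rng.choice((+1, -1)), "depth": 0}]
    frontier = [0]
    for generation in range(depth):
        next_frontier = []
        for v in frontier:
            same = poisson(rng, a / 2)
            opposite = poisson(rng, b / 2)
            for sign, count in ((+1, same), (-1, opposite)):
                for _ in range(count):
                    vertices.append({
                        "parent": v,
                        "label": sign * vertices[v]["label"],
                        "depth": generation + 1,
                    })
                    next_frontier.append(len(vertices) - 1)
        frontier = next_frontier
    for vertex in vertices:
        vertex["observed"] = vertex["label"] if rng.random() < delta else "u"
    return vertices
```

Taking `b = a` and ignoring the labels gives a sample from the truncated single-type tree with offspring distribution Poisson(a), as in Example 6.4 with d = a.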
The associated edge-perspective law $\nu_e$ is the law of two such trees rooted at $\text{ø}$ and $\text{ø}'$ and connected by the single edge $\{\text{ø}, \text{ø}'\}$, where $\text{ø}$ and $\text{ø}'$ belong to the same side of the partition with probability $a/(a+b)$ and to opposite sides of the partition with probability $b/(a+b)$, and the trees are independent conditional on the partition memberships of $\text{ø}$ and $\text{ø}'$.

As in Section 2, to define local algorithms that solve the SDP (3), we extend our definitions to include additional random real-valued marks. Namely, we denote by $\mathcal{G}^*(\mathcal{M})$ the space of tuples $(G, \text{ø}, \sigma, z)$ where $(G, \text{ø}, \sigma) \in \mathcal{G}(\mathcal{M})$ and $z : V(G) \to \mathbb{R}$ associates a real-valued mark to each vertex of $G$. The spaces $\mathcal{G}^*_e(\mathcal{M})$, $\mathcal{T}^*(\mathcal{M})$, $\mathcal{T}^*_e(\mathcal{M})$ are defined analogously. For a unimodular law $\nu$ over $\mathcal{T}(\mathcal{M})$, we let $\nu^*$ be the law over $\mathcal{T}^*(\mathcal{M})$ such that $(T, \text{ø}, \sigma, z) \sim \nu^*$ if $(T, \text{ø}, \sigma) \sim \nu$ and, conditional on $(T, \text{ø}, \sigma)$, $z(i) \overset{\text{iid}}{\sim} \mathrm{Normal}(0, 1)$ for all vertices $i \in V(T)$. $\nu^*_e$ is defined analogously.

Remark 6.6. Since we are interested in graph sequences that converge locally to trees, it will turn out to be sufficient to define local algorithms $F : \mathcal{G}^*(\mathcal{M}) \to \mathbb{R}$ on trees and, for instance, extend them arbitrarily to other graphs. With a slight abuse of notation, we will therefore write $F : \mathcal{T}^*(\mathcal{M}) \to \mathbb{R}$.

Finally, given a local algorithm $F \in \mathcal{F}^{\mathcal{M}}_*(\ell)$ and a unimodular probability measure $\nu$ on $\mathcal{T}(\mathcal{M})$, we define the value of $F$ with respect to $\nu$ as

$$\mathcal{E}(F, \nu) := d\, \mathbb{E}_{\nu^*_e}\big\{F(B_\ell(\text{ø}; T), \text{ø}, \sigma, z)\, F(B_\ell(\text{ø}'; T), \text{ø}', \sigma, z)\big\}, \tag{27}$$

where $d = \mathbb{E}_\nu\{\deg(\text{ø})\}$ is the expected degree of the root under $\nu$. (This is a slight abuse of notation, given the definitions of $\mathcal{E}(F; G)$ and $\mathcal{E}(F; G, \sigma)$ in (4) and (14).)

6.2 Key lemmas

Using the above framework, the desired lower bounds are now consequences of the following results.

Lemma 6.7.
Let $\{G_n = (V_n, E_n)\}_{n \ge 1}$ be a deterministic sequence of graphs with deterministic vertex marks $\sigma_n : V_n \to M$, such that $|V_n| = n$, $|E_n|/n \to d/2$ for a constant $d > 0$, and $(G_n, \sigma_n) \Rightarrow \nu$ for a law $\nu$ on $\mathcal{T}(M)$. For fixed $\ell \ge 0$, let $F \in \mathcal{F}^M_*(\ell)$ be any radius-$\ell$ local algorithm such that the following two conditions hold (see footnote 6):
\[ \mathbb{E}_{\nu^*}\left[F(T, ø, \sigma, z)\right] = 0, \qquad \mathbb{E}_{\nu^*}\left[F(T, ø, \sigma, z)^2 \mid T, ø, \sigma\right] \equiv 1. \tag{28} \]
Then we have
\[ \lim_{n \to \infty} \mathcal{E}(F; G_n, \sigma_n) \ge \mathcal{E}(F, \nu). \tag{29} \]

(Footnote 6) The second of these is the same as condition 2 of Definition 2.6 and condition 2 of Definition 2.1 in the case of trivial markings $M = \{1\}$; we restate it here for convenience.

Lemma 6.8. Fix $d > 1$, $M = \{1\}$, and let $\nu$ be the law of the Galton-Watson tree with offspring distribution Poisson($d$) (and trivial marking $\sigma \equiv 1$). Then there exist local algorithms $F_\ell \in \mathcal{F}^M_*(\ell)$ for $\ell \ge 1$ satisfying (28) and such that
\[ \lim_{\ell \to \infty} \mathcal{E}(F_\ell, \nu) \ge 2\sqrt{d}\left(1 - \frac{1}{d+1}\right). \tag{30} \]

Lemma 6.9. In the same setup as Lemma 6.8, there exist local algorithms $F_{\ell,L} \in \mathcal{F}^M_*(L)$ for $L \ge \ell \ge 1$ satisfying (28) and such that
\[ \lim_{\ell \to \infty} \lim_{L \to \infty} \mathcal{E}(F_{\ell,L}, \nu) \ge d\, \mathbb{E}\,\Psi(c_1, c_2), \tag{31} \]
where $\Psi(c_1, c_2)$ is as in (9).

Lemma 6.10. Fix $a, b > 0$ such that $d := (a+b)/2 \ge 2$ and $\lambda := (a-b)/\sqrt{2(a+b)} > 1$. Fix $\delta \in (0, 1]$, let $M = \{+1, -1, u\}$, and let $\nu$ be the law of the two-type Galton-Watson tree with offspring distributions Poisson($a/2$) and Poisson($b/2$) and with vertex labels partially revealed with probability $\delta$. Then for a universal constant $C > 0$, there exist local algorithms $F_\ell \in \mathcal{F}^M_*(\ell)$ for $\ell \ge 1$ satisfying (28) and such that
\[ \lim_{\ell \to \infty} \mathcal{E}(F_\ell, \nu) \ge \sqrt{d}\left(2 + \frac{(\lambda - 1)^2}{\lambda\sqrt{d}} - \frac{C}{d}\right). \tag{32} \]

Proofs of the above four lemmas are contained in the next four subsections. Let us first show that these lemmas imply the desired lower bounds.

Proof of Theorem 2.2 (lower bound) and Theorems 2.4, 2.5, and 2.7.
Consider the models $G(n, d/n)$ and $G(n, a/n, b/n)$ with $d := (a+b)/2$. Then (6) and (10) follow from Example 6.4, Lemma 6.7, Lemma 6.8, and Lemma 6.9, while (15) follows from Example 6.5, Lemma 6.7, and Lemma 6.10. The bounds (5), (8), and the second bound of (12) in the case $d \ge 2$ follow in turn from (6), (10), and (15), as any local algorithm defines a feasible solution $X$ for the SDP (3) which achieves the SDP value $n\,\mathcal{E}(F; G_n, \sigma_n)$, as discussed in Section 2. For the second bound of (12) in the case $d \in (1, 2)$, we may take $C > 4$ so that $\lambda > 2 + (\lambda - 1)^2/(\lambda\sqrt{d}) - C/d$ always when $d \in (1, 2)$, and hence the first bound dominates in (12).

For the first bound of (12) and any $d > 1$, let us simply consider the feasible point $X = \sigma_n \sigma_n^T$ for (3), where $\sigma_n \in \{+1, -1\}^n$ is the indicator vector of the hidden partition. Then
\[ \frac{1}{n}\mathrm{SDP}(A) \ge \frac{1}{n}\left\langle A - \frac{d}{n}\mathbf{1}\mathbf{1}^T, X \right\rangle = \frac{1}{n}\sum_{i,j \in V_n} A_{ij}\,\sigma_n(i)\,\sigma_n(j). \]
From the definition of $G(n, a/n, b/n)$, we obtain almost surely
\[ \liminf_{n \to \infty} \frac{1}{n}\mathrm{SDP}(A) \ge \frac{1}{n}\left(\frac{n^2}{2}\cdot\frac{a}{n} - \frac{n^2}{2}\cdot\frac{b}{n}\right) = \frac{a - b}{2} = \lambda\sqrt{d}. \]
Finally, the large $d$ expansion (11) in Theorem 2.4 is proven in Appendix C.

In the remainder of this section, we establish Lemmas 6.7, 6.8, 6.9, and 6.10.

6.3 Proof of Lemma 6.7

We first recall some well-known properties of locally convergent graphs. (Short proofs are provided for the reader's convenience.)

Lemma 6.11. Let $(G_n, \sigma_n) \Rightarrow \nu$ for any law $\nu$, where $G_n = (V_n, E_n)$ and $|V_n| = n$. Denote by $|B_\ell(v; G_n)|$ the number of vertices in $B_\ell(v; G_n)$. Then for any fixed $\ell \ge 0$,
\[ \lim_{n \to \infty} \frac{1}{n}\max_{v \in V_n} |B_\ell(v; G_n)| = 0. \]

Proof. Suppose by contradiction that the claim is false. Then there exist $\varepsilon > 0$, a sequence of graph sizes $\{n_i\}$, and vertices $v_i \in G_{n_i}$ for which $|B_\ell(v_i; G_{n_i})| \ge n\varepsilon$ (writing $n = n_i$). In particular the maximum degree of any vertex in $B_\ell(v_i; G_{n_i})$ is at least $n^\delta$ for some $\delta > 0$.
Hence, for any $w \in B_\ell(v_i; G_{n_i})$, the maximum degree of any vertex in $B_{2\ell}(w; G_{n_i})$ is at least $n^\delta$. Since there are at least $n\varepsilon$ such vertices $w$, we have, for $w$ a vertex of $G_n$ chosen uniformly at random,
\[ \limsup_{n \to \infty} \mathbb{P}\left[\max\{\deg(v) : v \in B_{2\ell}(w; G_n)\} \ge n^\delta\right] \ge \varepsilon. \tag{33} \]
This contradicts the hypothesis that $(B_{2\ell}(w; G_n), w, \sigma_n)$ converges in law to $(B_{2\ell}(ø; T), ø, \sigma)$ where $(T, ø, \sigma) \sim \nu$.

Lemma 6.12. Suppose $(G_n, \sigma_n) \Rightarrow \nu$, where $G_n = (V_n, E_n)$. For any fixed $\ell \ge 0$, let $f(H_\ell, ø, \sigma)$ be any bounded function of a graph $H_\ell$ with root vertex ø and vertex marks $\sigma$, such that each vertex of $H_\ell$ is at distance at most $\ell$ from ø. Then as $n \to \infty$,
\[ \frac{1}{|V_n|}\sum_{v \in V_n} f(B_\ell(v; G_n), v, \sigma_n) \to \mathbb{E}_\nu\left[f(B_\ell(ø; T), ø, \sigma)\right]. \tag{34} \]

Proof. Let $v \in V_n$ be a vertex chosen uniformly at random. Then by the assumption of local weak convergence, $f(B_\ell(v; G_n), v, \sigma_n)$ is a random variable that converges in law to $f(B_\ell(ø; T), ø, \sigma)$ where $(T, ø, \sigma) \sim \nu$, and the conclusion follows from the bounded convergence theorem.

Lemma 6.13. Suppose $(G_n, \sigma_n) \Rightarrow \nu$, where $G_n = (V_n, E_n)$. Let $\nu_e$ be the edge-perspective law associated to $\nu$. For any fixed $\ell \ge 0$, let $f(H_\ell, \{ø, ø'\}, \sigma)$ be any bounded function of a graph $H_\ell$ with root edge $\{ø, ø'\}$ and vertex marks $\sigma$, such that each vertex of $H_\ell$ is at distance at most $\ell$ from ø or ø′. Then as $n \to \infty$,
\[ \frac{1}{|E_n|}\sum_{\{v,w\} \in E_n} f(B_\ell(\{v, w\}; G_n), \{v, w\}, \sigma_n) \to \mathbb{E}_{\nu_e}\left[f(B_\ell(\{ø, ø'\}; T), \{ø, ø'\}, \sigma)\right]. \tag{35} \]

Proof. The proof is the same as that of Lemma 6.12; we let $\{v, w\} \in E_n$ be an edge chosen uniformly at random, and apply Lemma 6.2 and the bounded convergence theorem.

Proof of Lemma 6.7. For notational convenience, let us denote $\mathbb{E}_z$ simply by $\mathbb{E}$ (so expectations are understood to be with respect to $z$ only).
Given a local algorithm $F : \mathcal{T}^M_*(\ell) \to \mathbb{R}$ defined on trees, augment it to $F : \mathcal{G}^M_*(\ell) \to \mathbb{R}$ defined on all graphs by setting $F(B_\ell(ø; G), ø, \sigma, z) = z(ø)$ if $B_\ell(ø; G)$ is not a tree. Note that this satisfies the conditions of Definitions 2.1 and 2.6.

Define $\xi(i) = F(G_n, i, \sigma_n, z_n)$. To bound the value $\mathcal{E}(F; G_n, \sigma_n)$, let us write
\[ \mathbb{E}\left[\sum_{i,j \in V_n} (A_{G_n})_{ij}\,\xi(i)\,\xi(j)\right] = 2\sum_{\{i,j\} \in E_n} \mathbb{E}[\xi(i)\xi(j)]. \]
For any vertices $i, j \in V_n$, $|\mathbb{E}[\xi(i)\xi(j)]| \le 1$ by condition (28). Furthermore, for each edge $\{i, j\} \in E_n$, $\mathbb{E}[\xi(i)\xi(j)]$ is a function only of $\{i, j\}$, the local neighborhood $B_\ell(\{i, j\}; G_n)$, and the marks $\sigma_n(v)$ of vertices $v$ in this neighborhood. Then Lemma 6.13, the assumption $(G_n, \sigma_n) \Rightarrow \nu$, and the definition (27) imply
\[ \frac{1}{2|E_n|}\,\mathbb{E}\left[\sum_{i,j \in V_n} (A_{G_n})_{ij}\,\xi(i)\,\xi(j)\right] \to \frac{1}{d}\,\mathcal{E}(F, \nu) \]
as $n \to \infty$. Next, note that if $i \notin B_{2\ell}(j; G_n)$, then $\xi(i)$ and $\xi(j)$ are independent by construction. Hence
\[ \mathbb{E}\left[\sum_{i,j \in V_n} \xi(i)\xi(j)\right] = \sum_{i,j \in V_n} \mathbb{E}[\xi(i)]\,\mathbb{E}[\xi(j)] + \sum_{i \in V_n}\sum_{j \in B_{2\ell}(i; G_n)} \left(\mathbb{E}[\xi(i)\xi(j)] - \mathbb{E}[\xi(i)]\,\mathbb{E}[\xi(j)]\right) \le \left(\sum_{i \in V_n} \mathbb{E}[\xi(i)]\right)^2 + 2n\max_{i \in V_n} |B_{2\ell}(i; G_n)|. \]
For each vertex $i \in V_n$, $|\mathbb{E}[\xi(i)]| \le 1$ and $\mathbb{E}[\xi(i)]$ is a function only of $i$, the ball $B_\ell(i; G_n)$, and the marks of vertices in this ball. Then Lemma 6.12 and the first condition in (28) imply $\frac{1}{n}\sum_{i \in V_n} \mathbb{E}[\xi(i)] \to 0$. Together with Lemma 6.11, this implies
\[ \frac{d}{n^2}\,\mathbb{E}\left[\sum_{i,j \in V_n} \xi(i)\xi(j)\right] \to 0. \]
Combining the above and applying $|E_n|/n \to d/2$ yields the desired result.

6.4 Proof of Lemma 6.8

For any rooted tree $T$ and each vertex $v$ of $T$, let $k(v) := \mathrm{dist}(v, ø)$ denote the distance from $v$ to the root ø. For each $\ell \ge 0$, denote
\[ N_\ell := |\{v : k(v) = \ell\}|, \qquad X_\ell := d^{-\ell} N_\ell. \tag{36} \]
In the case where $T$ is a random Galton-Watson tree with offspring distribution Poisson($d$), let $\mathcal{F}_\ell$ be the $\sigma$-field generated by $N_0, \ldots, N_\ell$. Then for each $\ell \ge 1$, conditional on $\mathcal{F}_{\ell-1}$, $X_\ell \sim d^{-\ell}\,\mathrm{Poisson}(dN_{\ell-1})$, so $\mathbb{E}[X_\ell \mid \mathcal{F}_{\ell-1}] = X_{\ell-1}$. Hence $\{X_\ell\}_{\ell \ge 0}$ is a nonnegative martingale with respect to the filtration $\{\mathcal{F}_\ell\}_{\ell \ge 0}$, and
\[ X := \lim_{\ell \to \infty} X_\ell \tag{37} \]
exists almost surely by the martingale convergence theorem, with $X \ge 0$. We first establish the following lemma.

Lemma 6.14. Let $X, X'$ be independent random variables with law defined by (37), where $X_\ell$ are defined by (36) for the Galton-Watson tree with offspring distribution Poisson($d$). In the setup of Lemma 6.8, for each $\ell \ge 1$, there exists $F_\ell \in \mathcal{F}^M_*(\ell)$ satisfying (28) such that $\lim_{\ell \to \infty} \mathcal{E}(F_\ell, \nu) = \mathbb{E}[s(X, X')]$, where
\[ s(X, X') = \begin{cases} d & X = X' = 0, \\[4pt] \dfrac{\sqrt{d}\,(X + X')}{\sqrt{X + \frac{1}{d}X'}\,\sqrt{X' + \frac{1}{d}X}} & \text{otherwise.} \end{cases} \tag{38} \]

Proof. As the vertex marking $\sigma \equiv 1$ is trivial, for notational clarity we omit it from all expressions below. Define the local algorithm
\[ F_\ell(T, ø, z) := \begin{cases} \dfrac{\sum_{v \in B_\ell(ø; T)} d^{-k(v)/2}\, z(v)}{\sqrt{\sum_{v \in B_\ell(ø; T)} d^{-k(v)}}} & X_\ell > 0, \\[8pt] \mathrm{sign}\left(\sum_{v \in B_\ell(ø; T)} z(v)\right) & X_\ell = 0, \end{cases} \tag{39} \]
where $k(v)$ and $X_\ell$ are defined for the tree $T$. When $z(v) \overset{iid}{\sim} \mathrm{Normal}(0, 1)$ conditional on $(T, \sigma)$, the conditions of (28) hold by construction. It remains to compute $\mathcal{E}(F_\ell, \nu)$.

For $(T, \{ø, ø'\}) \in \mathcal{T}_e$, denote by $T_ø$ the subtree rooted at ø of vertices connected to ø by a path not including ø′, and by $T_{ø'}$ the subtree of remaining vertices rooted at ø′ (i.e. connected to ø′ by a path not including ø). Recall from Example 6.4 that if $(T, \{ø, ø'\}) \sim \nu^*_e$, then $T_ø$ and $T_{ø'}$ are independent Galton-Watson trees with offspring distribution Poisson($d$). For each vertex $v \in T_ø$, denote by $k(v)$ its distance to ø, and for each vertex $v' \in T_{ø'}$, denote by $k'(v')$ its distance to ø′. Let $X_\ell$ be as defined in (36) for the subtree $T_ø$ and $X'_\ell$ be as defined in (36) for the subtree $T_{ø'}$.
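As a concrete reading of the definition (39), the sketch below evaluates $F_\ell$ on a finite rooted tree that is described only by the list of root distances $k(v)$ of its vertices together with the Gaussian marks $z(v)$. The encoding and the function names are illustrative assumptions; the value of $\mathrm{sign}(0)$ is not specified in the text (it is a probability-zero event) and is arbitrarily taken to be $+1$ here.

```python
import math


def f_ell_weights(levels, d, ell):
    """Coefficients multiplying z(v) in the first case of (39), for vertices with k(v) <= ell."""
    ball = [i for i, k in enumerate(levels) if k <= ell]
    norm = math.sqrt(sum(d ** (-levels[i]) for i in ball))
    return {i: d ** (-levels[i] / 2.0) / norm for i in ball}


def f_ell(levels, z, d, ell):
    """Evaluate the local algorithm (39) given root distances `levels` and marks `z`."""
    ball = [i for i, k in enumerate(levels) if k <= ell]
    survives = any(levels[i] == ell for i in ball)  # equivalent to X_ell > 0
    if survives:
        weights = f_ell_weights(levels, d, ell)
        return sum(weights[i] * z[i] for i in ball)
    total = sum(z[i] for i in ball)
    return 1.0 if total >= 0 else -1.0  # sign(0) := +1 is an arbitrary convention
```

In the surviving case the squared coefficients sum to one, which is exactly the second condition of (28): conditional on the tree, $F_\ell$ is a unit-variance Gaussian linear statistic of the marks.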
For any $k \ge 0$, write as shorthand $B_k := B_k(ø; T_ø)$ and $B'_k := B_k(ø'; T_{ø'})$. Note that $B_\ell(ø; T)$ consists of $B_\ell$ and $B'_{\ell-1}$ connected by the edge $\{ø, ø'\}$, and similarly $B_\ell(ø'; T)$ consists of $B'_\ell$ and $B_{\ell-1}$ connected by this edge. We consider three cases:

(I) If $X_{\ell-1} = 0$ and $X'_{\ell-1} = 0$, then $B_\ell(ø; T) = B_\ell(ø'; T)$ and the second case of (39) holds for both balls. In this case
\[ F_\ell(T, ø, z)\, F_\ell(T, ø', z) = 1. \]

(II) If $X_\ell = 0$ and $X_{\ell-1} > 0$ and $X'_{\ell-1} = 0$, or if $X'_\ell = 0$ and $X'_{\ell-1} > 0$ and $X_{\ell-1} = 0$, then the first case of (39) holds for one of the balls $B_\ell(ø; T)$ or $B_\ell(ø'; T)$ and the second case holds for the other ball. In this case we simply bound, using Cauchy-Schwarz,
\[ \left|\mathbb{E}_z\left[F_\ell(T, ø, z)\, F_\ell(T, ø', z)\right]\right| \le 1. \]

(III) Otherwise, the first case of (39) holds for both balls $B_\ell(ø; T)$ and $B_\ell(ø'; T)$. Then we have
\[ \mathbb{E}_z\left[F_\ell(T, ø, z)\, F_\ell(T, ø', z)\right] = \frac{\sum_{v \in B_{\ell-1}} d^{-k(v)/2}\, d^{-(k(v)+1)/2} + \sum_{v' \in B'_{\ell-1}} d^{-k'(v')/2}\, d^{-(k'(v')+1)/2}}{\sqrt{\sum_{v \in B_\ell} d^{-k(v)} + \sum_{v' \in B'_{\ell-1}} d^{-(k'(v')+1)}}\;\sqrt{\sum_{v' \in B'_\ell} d^{-k'(v')} + \sum_{v \in B_{\ell-1}} d^{-(k(v)+1)}}}. \]
Letting $S_\ell = \sum_{j=0}^{\ell} X_j$ and $S'_\ell = \sum_{j=0}^{\ell} X'_j$, the above may be written as
\[ \mathbb{E}_z\left[F_\ell(T, ø, z)\, F_\ell(T, ø', z)\right] = \frac{\frac{1}{\sqrt{d}}\left(S_{\ell-1} + S'_{\ell-1}\right)}{\sqrt{S_\ell + \frac{1}{d}S'_{\ell-1}}\;\sqrt{S'_\ell + \frac{1}{d}S_{\ell-1}}}. \]

Combining the above three cases, taking the full expectation with respect to $\nu^*_e$, and recalling the definition (27),
\[ \mathcal{E}(F_\ell, \nu) = d\,\mathbb{E}\left[\mathbb{1}\{\mathrm{I}\} + \mathbb{1}\{\mathrm{II}\}\,\mathbb{E}_z\left[F_\ell(T, ø, z)\, F_\ell(T, ø', z)\right] + \mathbb{1}\{\mathrm{III}\}\,\frac{\frac{1}{\sqrt{d}}\left(S_{\ell-1} + S'_{\ell-1}\right)}{\sqrt{S_\ell + \frac{1}{d}S'_{\ell-1}}\;\sqrt{S'_\ell + \frac{1}{d}S_{\ell-1}}}\right], \]
where $\mathbb{1}\{\mathrm{I}\}$, $\mathbb{1}\{\mathrm{II}\}$, and $\mathbb{1}\{\mathrm{III}\}$ indicate which of the above three cases occur.
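The passage from the vertex sums to the closed form in terms of $S_\ell$ and $S'_\ell$ in case (III) can be checked numerically. The sketch below assumes the two subtrees are summarized by their generation sizes $N_j$ and $N'_j$ (hypothetical inputs `n` and `n_prime`), computes the correlation directly from the coefficients of (39) on the two overlapping balls, and compares it with the closed form.

```python
import math


def direct_correlation(n, n_prime, d, ell):
    """E_z[F_ell(root) F_ell(root')] computed vertex by vertex in case (III).

    n[j] and n_prime[j] are the numbers of vertices at distance j from the
    respective root inside each subtree, for j = 0..ell.
    """
    # (distance to root, distance to root') for every vertex of the two subtrees
    verts = [(k, k + 1) for k in range(ell + 1) for _ in range(n[k])]
    verts += [(k + 1, k) for k in range(ell + 1) for _ in range(n_prime[k])]
    shared = sum(d ** (-(p + q) / 2.0) for p, q in verts if p <= ell and q <= ell)
    norm_root = sum(d ** (-p) for p, q in verts if p <= ell)
    norm_other = sum(d ** (-q) for p, q in verts if q <= ell)
    return shared / math.sqrt(norm_root * norm_other)


def closed_form(n, n_prime, d, ell):
    """The same quantity written with S_ell = sum_{j <= ell} d^{-j} N_j."""
    x = [d ** (-j) * n[j] for j in range(ell + 1)]
    xp = [d ** (-j) * n_prime[j] for j in range(ell + 1)]
    s_prev, s_full = sum(x[:ell]), sum(x)
    sp_prev, sp_full = sum(xp[:ell]), sum(xp)
    numerator = (s_prev + sp_prev) / math.sqrt(d)
    return numerator / math.sqrt((s_full + sp_prev / d) * (sp_full + s_prev / d))
```

Both functions require that each ball contains at least one vertex at distance $\ell$ from its own root, which is what places the configuration in case (III).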
By convergence of Cesàro sums,
\[ \lim_{\ell \to \infty} \frac{1}{\ell}S_{\ell-1} = \lim_{\ell \to \infty} \frac{1}{\ell}S_\ell = \lim_{\ell \to \infty} X_\ell = X, \qquad \lim_{\ell \to \infty} \frac{1}{\ell}S'_{\ell-1} = \lim_{\ell \to \infty} \frac{1}{\ell}S'_\ell = \lim_{\ell \to \infty} X'_\ell = X' \]
almost surely, where $X$ and $X'$ are independent random variables with the law defined by (37) for the Galton-Watson tree. The events where $T_ø$ has maximal depth exactly $\ell - 1$ for $\ell = 1, 2, 3, \ldots$ are disjoint, and similarly for $T_{ø'}$, so $\mathbb{1}\{\mathrm{II}\} \to 0$ a.s. Recall that for the Galton-Watson tree, the extinction event $\lim_{\ell \to \infty} \mathbb{1}\{X_\ell = 0\}$ equals $\mathbb{1}\{X = 0\}$ a.s. Then $\mathbb{1}\{\mathrm{I}\} \to \mathbb{1}\{X = 0, X' = 0\}$ a.s., and hence $\mathbb{1}\{\mathrm{III}\} \to \mathbb{1}\{X > 0 \text{ or } X' > 0\}$ a.s. Taking $\ell \to \infty$ and applying the bounded convergence theorem yields the desired result.

Proof of Lemma 6.8. Let $X, X'$ and $s(X, X')$ be as in Lemma 6.14, and write $W = \sqrt{X + \frac{1}{d}X'}$ and $W' = \sqrt{X' + \frac{1}{d}X}$. Then
\[ s(X, X') = \begin{cases} d & W = W' = 0, \\[4pt] \dfrac{d^{3/2}}{d+1}\cdot\dfrac{W^2 + W'^2}{W W'} & W > 0 \text{ and } W' > 0. \end{cases} \]
Conditional on the event $E := \{W > 0 \text{ and } W' > 0\}$, the bivariate law of $(W, W')$ is exchangeable in $W$ and $W'$ by symmetry. Then applying Jensen's inequality,
\[ \log \mathbb{E}\left[\frac{W}{W'} \,\Big|\, E\right] \ge \mathbb{E}\left[\log\frac{W}{W'} \,\Big|\, E\right] = \mathbb{E}\left[\log W - \log W' \mid E\right] = 0, \]
so $\mathbb{E}[W/W' \mid E] \ge 1$. Similarly $\mathbb{E}[W'/W \mid E] \ge 1$, so
\[ \mathbb{E}[s(X, X')] \ge \frac{2d^{3/2}}{d+1}\,\mathbb{P}[E] + d\,\mathbb{P}[E^c] \ge \frac{2d^{3/2}}{d+1} = 2\sqrt{d}\left(1 - \frac{1}{d+1}\right). \]
The result then follows from Lemma 6.14.

6.5 Proof of Lemma 6.9

The proof is similar to that of Lemma 6.14. For a rooted tree $(T, ø)$ and any vertex $v$ of $T$, let $k(v) := \mathrm{dist}(v, ø)$ denote the distance to the root. For any $k > 0$, let us call vertices $v$ for which $k(v) = k$ the 'leaf vertices' of the ball $B_k(ø; T)$. For $L \ge \ell \ge 1$, if $B_L(ø; T)$ has at least one leaf vertex (i.e. $\{v : k(v) = L\}$ is non-empty), then let us define a depth-$L$ approximation $h^{(ø,L)}(v; T)$ to the harmonic measure introduced in Section 2.2, as follows: For any vertex $v \in B_L(ø; T)$, let $k = k(v)$ and consider a simple random walk on $T$ starting at ø and ending when it visits the first leaf vertex of $B_L(ø; T)$. Then let $h^{(ø,L)}(v; T)$ be the probability that $v$ is the last vertex at distance $k$ from ø that is visited by this walk. Clearly, for each $k = 0, \ldots, L$,
\[ \sum_{v \in T : k(v) = k} h^{(ø,L)}(v; T) = 1. \tag{40} \]
We may then define a local algorithm $F_{\ell,L} \in \mathcal{F}^M_*(L)$ by
\[ F_{\ell,L}(T, ø, z) := \begin{cases} \dfrac{1}{\sqrt{\ell+1}}\displaystyle\sum_{v \in B_\ell(ø; T)} \sqrt{h^{(ø,L)}(v; T)}\; z(v) & \text{if } \{v : k(v) = L\} \text{ is non-empty}, \\[8pt] \mathrm{sign}\left(\sum_{v \in B_\ell(ø; T)} z(v)\right) & \text{otherwise.} \end{cases} \tag{41} \]
When $z(v) \overset{iid}{\sim} \mathrm{Normal}(0, 1)$ conditional on $(T, \sigma)$, the conditions of (28) hold by (40). So it remains to compute $\mathcal{E}(F_{\ell,L}, \nu)$.

For $(T, \{ø, ø'\}) \in \mathcal{T}_e$, define $T_ø$, $T_{ø'}$, $k(v)$, $k'(v')$, $B_k$, and $B'_k$ as in the proof of Lemma 6.14, and recall that $B_L(ø; T)$ consists of $B_L$ and $B'_{L-1}$ connected by the edge $\{ø, ø'\}$ and that $B_L(ø'; T)$ consists of $B'_L$ and $B_{L-1}$ connected by this same edge. We consider the same three cases as in the proof of Lemma 6.14:

(I) If $k(v) \le L - 2$ for all $v \in T_ø$ and $k'(v') \le L - 2$ for all $v' \in T_{ø'}$, then the second case of (41) holds for both $B_L(ø; T)$ and $B_L(ø'; T)$, and
\[ F_{\ell,L}(T, ø, z)\, F_{\ell,L}(T, ø', z) = 1. \tag{42} \]

(II) If $\max_{v \in T_ø} k(v) = L - 1$ and $k'(v') \le L - 2$ for all $v' \in T_{ø'}$, or vice versa, then we simply bound by Cauchy-Schwarz
\[ \left|\mathbb{E}_z\left[F_{\ell,L}(T, ø, z)\, F_{\ell,L}(T, ø', z)\right]\right| \le 1. \tag{43} \]

(III) Otherwise, the first case of (41) holds for both $B_L(ø; T)$ and $B_L(ø'; T)$, and we have
\[ \mathbb{E}_z\left[F_{\ell,L}(T, ø, z)\, F_{\ell,L}(T, ø', z)\right] = \frac{1}{\ell+1}\left(\sum_{v \in B_{\ell-1}} \sqrt{h^{(ø,L)}(v; T)\, h^{(ø',L)}(v; T)} + \sum_{v' \in B'_{\ell-1}} \sqrt{h^{(ø,L)}(v'; T)\, h^{(ø',L)}(v'; T)}\right). \]
For $v \in B_{\ell-1}$ and $v' \in B'_{\ell-1}$, let us write as shorthand $h^{(L)}(v) := h^{(ø,L)}(v; T_ø)$ and $h^{(L)\prime}(v') := h^{(ø',L)}(v'; T_{ø'})$ for the depth-$L$ harmonic measures in the subtrees $T_ø$ and $T_{ø'}$. To relate these quantities to the harmonic measure in the full tree $T$, consider a simple random walk on $T$ starting at ø and ending when it hits the first leaf vertex of $B_L(ø; T)$, i.e. when it hits the first vertex $v \in T_ø$ for which $k(v) = L$ or the first vertex $v' \in T_{ø'}$ for which $k'(v') = L - 1$. Let $A$ be the event that the last vertex in $\{ø, ø'\}$ visited by this walk is ø. Then the Markov property of the walk implies that for any $v \in B_L$ with $k(v) = k$, $v$ can be the last vertex at distance $k$ from ø that is visited by this walk only if $A$ holds, and the probability of this occurring conditional on $A$ is $h^{(L)}(v)$. Similarly, for any $v' \in B'_{L-1}$ with $k'(v') = k - 1$, $v'$ can be the last vertex at distance $k$ from ø that is visited by this walk only if $A^c$ holds, and the probability of this occurring conditional on $A^c$ is $h^{(L-1)\prime}(v')$. Hence for any $v \in B_{\ell-1}$ and $v' \in B'_{\ell-1}$, and letting $\mathbb{P}_ø$ denote the probability distribution of the simple random walk started at ø, we have
\[ h^{(ø,L)}(v; T) = \mathbb{P}_ø[A]\, h^{(L)}(v), \qquad h^{(ø,L)}(v'; T) = \mathbb{P}_ø[A^c]\, h^{(L-1)\prime}(v'). \]
Considering analogously a walk on $T$ starting at ø′ and ending at the first visited leaf vertex of $B_L(ø'; T)$, we have
\[ h^{(ø',L)}(v; T) = \mathbb{P}_{ø'}[A]\, h^{(L-1)}(v), \qquad h^{(ø',L)}(v'; T) = \mathbb{P}_{ø'}[A^c]\, h^{(L)\prime}(v'). \]

Denote by $c^{(L)}$ the conductance between ø and the leaves of $B_L$ in the subtree $T_ø$, with $c^{(L)} = 0$ if $T_ø$ has no vertices $v$ with $k(v) = L$. Similarly, denote by $c^{(L)\prime}$ the conductance between ø′ and the leaves of $B'_L$ in the subtree $T_{ø'}$.
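For a finite tree the depth-$L$ conductance can be computed by the standard series and parallel reduction rules for electrical networks. The sketch below assumes unit conductance on every edge and treats all depth-$L$ vertices as a single (wired) boundary; the dictionary-of-children encoding and the function name are illustrative, not the paper's notation.

```python
import math


def conductance_to_depth(children, v, depth):
    """Effective conductance between v and the vertices `depth` generations below v.

    `children` maps each vertex to the list of its children. Every edge has unit
    conductance. Returns math.inf when depth == 0 (v itself is on the boundary)
    and 0.0 when the subtree of v does not reach the requested depth.
    """
    if depth == 0:
        return math.inf
    total = 0.0
    for u in children[v]:
        c_u = conductance_to_depth(children, u, depth - 1)
        # Unit edge (v, u) in series with the subtree below u; branches add in parallel.
        total += 1.0 if math.isinf(c_u) else c_u / (1.0 + c_u)
    return total
```

With this convention a path of length $L$ has conductance $1/L$, and a subtree that dies out before depth $L$ contributes $0$, matching the convention $c^{(L)} = 0$ stated above.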
Recall [LPP97] that if $T_ø$ is augmented with the vertex ø′ connected by an edge to ø, then $c^{(L)}/(1 + c^{(L)})$ gives the probability that a simple random walk on $T_ø$ started at ø hits a leaf vertex of $B_L$ before hitting ø′, and the analogous statement holds for $c^{(L)\prime}$. Then letting $i$ count the number of visits of the random walk to ø′,
\[ \mathbb{P}_ø[A] = \sum_{i=0}^{\infty} \frac{c^{(L)}}{1 + c^{(L)}}\left(\frac{1}{1 + c^{(L)}}\cdot\frac{1}{1 + c^{(L-1)\prime}}\right)^i = \frac{c^{(L)}\left(1 + c^{(L-1)\prime}\right)}{c^{(L)} + c^{(L-1)\prime} + c^{(L)} c^{(L-1)\prime}}. \]
The analogous formula holds for $\mathbb{P}_{ø'}[A^c]$. Recall [LPP97] that as $L \to \infty$, $c^{(L)} \to c$ and $c^{(L)\prime} \to c'$ where $c$ and $c'$ are the conductances of the infinite trees $T$ and $T'$, and $h^{(L)}(v) \to h(v)$ and $h^{(L)\prime}(v') \to h'(v')$ for any fixed $v \in T$ and $v' \in T'$, where $h$ and $h'$ are the harmonic measures of the infinite trees $T$ and $T'$, as defined in Section 2.2. This implies that in case (III),
\[ \lim_{L \to \infty} \mathbb{E}_z\left[F_{\ell,L}(T, ø, z)\, F_{\ell,L}(T, ø', z)\right] = \frac{1}{\ell+1}\left(\sum_{v \in B_{\ell-1}} h(v)\,\frac{c\sqrt{1 + c'}}{c + c' + cc'} + \sum_{v' \in B'_{\ell-1}} h'(v')\,\frac{c'\sqrt{1 + c}}{c + c' + cc'}\right) = \frac{\ell}{\ell+1}\cdot\frac{c\sqrt{1 + c'} + c'\sqrt{1 + c}}{c + c' + cc'}, \]
where the second equality follows from $\sum_{v : k(v) = k} h(v) = 1$ and $\sum_{v' : k'(v') = k} h'(v') = 1$ for each $k = 0, \ldots, \ell - 1$.

Let $\mathbb{1}\{\mathrm{I}\}$, $\mathbb{1}\{\mathrm{II}\}$, and $\mathbb{1}\{\mathrm{III}\}$ indicate which of the above three cases occur. The event that $T$ goes extinct equals the event $\{c = 0\}$ a.s., and similarly for $T'$ and $c'$, so by the same argument as in the proof of Lemma 6.14, as $L \to \infty$, $\mathbb{1}\{\mathrm{I}\} \to \mathbb{1}\{c = 0, c' = 0\}$, $\mathbb{1}\{\mathrm{II}\} \to 0$, and $\mathbb{1}\{\mathrm{III}\} \to \mathbb{1}\{c > 0 \text{ or } c' > 0\}$. Combining the three cases above, taking the full expectation with respect to $\nu^*_e$, letting $L \to \infty$, and applying the bounded convergence theorem,
\[ \lim_{L \to \infty} \mathcal{E}(F_{\ell,L}, \nu) = d\,\mathbb{E}\,\Psi^{(\ell)}(c, c') \quad\text{where}\quad \Psi^{(\ell)}(c, c') \equiv \begin{cases} \dfrac{\ell}{\ell+1}\cdot\dfrac{c\sqrt{1 + c'} + c'\sqrt{1 + c}}{c + c' + cc'} & \text{if } c > 0 \text{ or } c' > 0, \\[6pt] 1 & \text{otherwise.} \end{cases} \]
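The two closed forms used in this step can be sanity-checked numerically. The sketch below (function names are illustrative) sums the geometric series for $\mathbb{P}_ø[A]$ and compares it with the closed form, and evaluates $\Psi^{(\ell)}$ directly from its definition; it takes the conductances as given numbers rather than computing them from a tree.

```python
import math


def prob_a_series(c, c_other, terms=200):
    """Truncated series for P[A]: each excursion to the other root and back
    happens with probability 1 / ((1 + c) * (1 + c_other))."""
    ratio = 1.0 / ((1.0 + c) * (1.0 + c_other))
    return sum(c / (1.0 + c) * ratio ** i for i in range(terms))


def prob_a_closed(c, c_other):
    """Closed form c (1 + c') / (c + c' + c c') of the same probability."""
    return c * (1.0 + c_other) / (c + c_other + c * c_other)


def psi_ell(c, c_prime, ell):
    """Psi^(ell)(c, c') as defined in the text, including the extinct case."""
    if c == 0 and c_prime == 0:
        return 1.0
    numerator = c * math.sqrt(1.0 + c_prime) + c_prime * math.sqrt(1.0 + c)
    return ell / (ell + 1.0) * numerator / (c + c_prime + c * c_prime)
```

When one of the two conductances vanishes, the ratio in $\Psi^{(\ell)}$ equals one, so $\Psi^{(\ell)}$ reduces to $\ell/(\ell+1)$.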
Then taking $\ell \to \infty$ and applying again the bounded convergence theorem yields the desired result.

6.6 Proof of Lemma 6.10

We use throughout the fixed values $\delta \in (0, 1]$, $d = (a+b)/2$, $\mu = (a-b)/2$, and $\lambda = \mu/\sqrt{d}$. Recall that by assumption, $d \ge 2$ and $\lambda > 1$.

For any rooted tree $(T, ø, \sigma)$ with vertex labels $\sigma : V(T) \to \{+1, -1, u\}$, denote by $k(v) := \mathrm{dist}(v, ø)$ the distance from $v$ to ø. For $\ell \ge 0$, define $N_\ell$ and $X_\ell$ as in (36), and define
\[ N^+_\ell := |\{v : k(v) = \ell, \sigma(v) = +1\}|, \qquad N^-_\ell := |\{v : k(v) = \ell, \sigma(v) = -1\}|, \qquad D_\ell := \delta^{-1}\mu^{-\ell}\left(N^+_\ell - N^-_\ell\right). \tag{44} \]
Note that $(X_\ell, D_\ell)$ is computable from the observed labels in $B_\ell(ø; T)$. For $(T, ø, \sigma)$ a random two-type Galton-Watson tree with offspring distributions Poisson($a/2$) and Poisson($b/2$) and vertex labels partially revealed with probability $\delta$, we denote by $\sigma_{\mathrm{true}}(ø) \in \{+1, -1\}$ the vertex label of the hidden partition that contains ø. (So $\sigma_{\mathrm{true}}(ø) = \sigma(ø)$ if the label of ø is revealed.) Define
\[ (X, Y) := \lim_{\ell \to \infty} \left(X_\ell,\; \sigma_{\mathrm{true}}(ø)\, D_\ell\right), \tag{45} \]
where the limit exists by the following lemma.

Lemma 6.15. Let $\delta \in (0, 1]$ and let $a, b > 0$ be such that $d > 1$ and $\lambda > 1$. Let $(X_\ell, D_\ell)$ be defined by (36) and (44) for the two-type Galton-Watson tree with offspring distributions Poisson($a/2$) and Poisson($b/2$) and vertex labels partially revealed with probability $\delta$. Then the limit $(X, Y)$ in (45) exists almost surely, and $X$ and $Y$ satisfy
\[ \mathbb{E}[X] = 1, \qquad \mathrm{Var}[X] = \frac{1}{d-1}, \qquad \mathbb{E}[Y] = 1, \qquad \mathrm{Var}[Y] = \frac{d}{\mu^2 - d}. \tag{46} \]
Furthermore, if $d \ge 2$, then for some universal constants $C, c > 0$ and any $\gamma > 0$,
\[ \mathbb{P}\left[|X - 1| \ge \frac{\gamma}{\sqrt{d-1}}\right] \le C\exp(-c\gamma), \qquad \mathbb{P}\left[|Y - 1| \ge \gamma\sqrt{\frac{d}{\mu^2 - d}}\right] \le C\exp(-c\gamma). \tag{47} \]

The proof of this lemma is deferred to Appendix B. Analogous to our proof of Lemma 6.8 in the Erdős-Rényi case, to establish Lemma 6.10, we first prove the following intermediary result.

Lemma 6.16.
Let $(X, Y)$, $(X', Y')$ be independent pairs of random variables with law defined by (45), where $(X_\ell, D_\ell)$ are defined by (36) and (44) for the two-type Galton-Watson tree with offspring distributions Poisson($a/2$) and Poisson($b/2$) and vertex labels partially revealed with probability $\delta$. Then in the setup of Lemma 6.10, for any fixed $\alpha > 0$ and for each $\ell \ge 1$, there exists a local algorithm $F_{\ell,\alpha} \in \mathcal{F}^M_*(\ell)$ satisfying (28) such that $\lim_{\ell \to \infty} \mathcal{E}(F_{\ell,\alpha}, \nu) = \mathbb{E}[s(X, Y, X', Y'; \alpha)]$, where
\[ s(X, Y, X', Y'; \alpha) := \frac{a}{2}\cdot\frac{\frac{1}{\sqrt{d}}(X + X') + \alpha\left(Y + \frac{Y'}{\mu}\right)\left(Y' + \frac{Y}{\mu}\right)}{\sqrt{X + \frac{X'}{d} + \alpha\left(Y + \frac{Y'}{\mu}\right)^2}\;\sqrt{X' + \frac{X}{d} + \alpha\left(Y' + \frac{Y}{\mu}\right)^2}} + \frac{b}{2}\cdot\frac{\frac{1}{\sqrt{d}}(X + X') - \alpha\left(Y - \frac{Y'}{\mu}\right)\left(Y' - \frac{Y}{\mu}\right)}{\sqrt{X + \frac{X'}{d} + \alpha\left(Y - \frac{Y'}{\mu}\right)^2}\;\sqrt{X' + \frac{X}{d} + \alpha\left(Y' - \frac{Y}{\mu}\right)^2}} \tag{48} \]
if $X > 0$ or $X' > 0$, and $s(X, Y, X', Y'; \alpha) := d$ if $X = X' = 0$.

Proof. The proof is similar to that of Lemma 6.14, and we explain only the key differences. Define the local algorithm
\[ F_{\ell,\alpha}(T, ø, \sigma, z) := \begin{cases} \dfrac{\sum_{v \in B_\ell(ø; T)} d^{-k(v)/2}\, z(v) + D_\ell\sqrt{\alpha\ell}}{\sqrt{\sum_{v \in B_\ell(ø; T)} d^{-k(v)} + D_\ell^2\,\alpha\ell}} & X_\ell > 0, \\[8pt] \mathrm{sign}\left(\sum_{v \in B_\ell(ø; T)} z(v)\right) & X_\ell = 0, \end{cases} \]
where $k(v)$ and $(X_\ell, D_\ell)$ are defined as above for the labeled tree $(T, ø, \sigma)$. When $z(v) \overset{iid}{\sim} \mathrm{Normal}(0, 1)$ conditional on $(T, ø, \sigma)$, the conditions of (28) hold by construction, where the first condition follows from noting that $\mathbb{E}[z(v) \mid T, ø, \sigma] = 0$ and that $\mathbb{E}[D_\ell] = 0$. It remains to compute $\mathcal{E}(F_{\ell,\alpha}, \nu)$.

For $(T, \{ø, ø'\}, \sigma) \in \mathcal{T}_e(M)$, define $T_ø$ and $T_{ø'}$ as in the proof of Lemma 6.14. Recall from Example 6.5 that $T_ø$ and $T_{ø'}$ (with the marks $\sigma|_{T_ø}$ and $\sigma|_{T_{ø'}}$) each have the law of a two-type Galton-Watson tree with offspring distributions Poisson($a/2$) and Poisson($b/2$) and labels partially revealed with probability $\delta$, and they are conditionally independent given $\sigma_{\mathrm{true}}(ø)$ and $\sigma_{\mathrm{true}}(ø')$.
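Define $(X_\ell, D_\ell)$ by (36) and (44) for the subtree $T_ø$, and $(X'_\ell, D'_\ell)$ by (36) and (44) for the subtree $T_{ø'}$. Define also $S_\ell = \sum_{j=0}^{\ell} X_j$, $S'_\ell = \sum_{j=0}^{\ell} X'_j$, $Y_\ell = \sigma_{\mathrm{true}}(ø)\, D_\ell$, and $Y'_\ell = \sigma_{\mathrm{true}}(ø')\, D'_\ell$. Considering the same three cases (I), (II), and (III) as in the proof of Lemma 6.14, the same argument shows for case (I)
\[ F_{\ell,\alpha}(B_\ell(ø; T), ø, \sigma, z)\, F_{\ell,\alpha}(B_\ell(ø'; T), ø', \sigma, z) = 1, \]
for case (II)
\[ \left|\mathbb{E}_z\left[F_{\ell,\alpha}(B_\ell(ø; T), ø, \sigma, z)\, F_{\ell,\alpha}(B_\ell(ø'; T), ø', \sigma, z)\right]\right| \le 1, \]
and for case (III)
\[ \mathbb{E}_z\left[F_{\ell,\alpha}(B_\ell(ø; T), ø, \sigma, z)\, F_{\ell,\alpha}(B_\ell(ø'; T), ø', \sigma, z)\right] = \begin{cases} \dfrac{\frac{1}{\sqrt{d}}\left(S_{\ell-1} + S'_{\ell-1}\right) + \alpha\ell\left(Y_\ell + \frac{1}{\mu}Y'_{\ell-1}\right)\left(Y'_\ell + \frac{1}{\mu}Y_{\ell-1}\right)}{\sqrt{S_\ell + \frac{1}{d}S'_{\ell-1} + \alpha\ell\left(Y_\ell + \frac{1}{\mu}Y'_{\ell-1}\right)^2}\;\sqrt{S'_\ell + \frac{1}{d}S_{\ell-1} + \alpha\ell\left(Y'_\ell + \frac{1}{\mu}Y_{\ell-1}\right)^2}} & \sigma_{\mathrm{true}}(ø) = \sigma_{\mathrm{true}}(ø'), \\[12pt] \dfrac{\frac{1}{\sqrt{d}}\left(S_{\ell-1} + S'_{\ell-1}\right) - \alpha\ell\left(Y_\ell - \frac{1}{\mu}Y'_{\ell-1}\right)\left(Y'_\ell - \frac{1}{\mu}Y_{\ell-1}\right)}{\sqrt{S_\ell + \frac{1}{d}S'_{\ell-1} + \alpha\ell\left(Y_\ell - \frac{1}{\mu}Y'_{\ell-1}\right)^2}\;\sqrt{S'_\ell + \frac{1}{d}S_{\ell-1} + \alpha\ell\left(Y'_\ell - \frac{1}{\mu}Y_{\ell-1}\right)^2}} & \sigma_{\mathrm{true}}(ø) = -\sigma_{\mathrm{true}}(ø'). \end{cases} \]

Note that by the symmetry of $+1$ and $-1$ labels in the definition of the two-type Galton-Watson tree, $\{(X_\ell, Y_\ell)\}_{\ell=1}^{\infty}$ is independent of $\sigma_{\mathrm{true}}(ø)$, and similarly $\{(X'_\ell, Y'_\ell)\}_{\ell=1}^{\infty}$ is independent of $\sigma_{\mathrm{true}}(ø')$. Hence by the characterization of $\nu_e$ in Example 6.5, $\{(X_\ell, Y_\ell)\}_{\ell=1}^{\infty}$ is independent of $\{(X'_\ell, Y'_\ell)\}_{\ell=1}^{\infty}$ under $\nu_e$. Then convergence of Cesàro sums implies, almost surely,
\[ \lim_{\ell \to \infty} \frac{1}{\ell}S_\ell = \lim_{\ell \to \infty} \frac{1}{\ell}S_{\ell-1} = \lim_{\ell \to \infty} X_\ell = X, \qquad \lim_{\ell \to \infty} \frac{1}{\ell}S'_\ell = \lim_{\ell \to \infty} \frac{1}{\ell}S'_{\ell-1} = \lim_{\ell \to \infty} X'_\ell = X', \]
\[ \lim_{\ell \to \infty} Y_\ell = \lim_{\ell \to \infty} Y_{\ell-1} = Y, \qquad \lim_{\ell \to \infty} Y'_\ell = \lim_{\ell \to \infty} Y'_{\ell-1} = Y', \]
where $(X, Y)$ and $(X', Y')$ are independent pairs of random variables with law defined by (45). As in the proof of Lemma 6.14, as $\ell \to \infty$, $\mathbb{1}\{\mathrm{I}\} \to \mathbb{1}\{X = 0, X' = 0\}$, $\mathbb{1}\{\mathrm{II}\} \to 0$, and $\mathbb{1}\{\mathrm{III}\} \to \mathbb{1}\{X > 0 \text{ or } X' > 0\}$.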
Then combining these three cases, taking the full expectation with respect to $\nu^*_e$, recalling from Example 6.5 that under $\nu^*_e$ we have $\sigma_{\mathrm{true}}(ø) = \sigma_{\mathrm{true}}(ø')$ with probability $a/(a+b) = a/(2d)$ and $\sigma_{\mathrm{true}}(ø) = -\sigma_{\mathrm{true}}(ø')$ with probability $b/(a+b) = b/(2d)$, letting $\ell \to \infty$, and applying the bounded convergence theorem, we obtain the desired result.

Proof of Lemma 6.10. We compute a lower bound for the quantity $\mathbb{E}[s(X, Y, X', Y'; \alpha)]$ in Lemma 6.16, for the choice
\[ \alpha = \frac{\mu^2 - d}{\mu^2}\cdot\frac{1}{\sqrt{d}}. \]
For any positive function $f(\mu, d)$ and any random variable $Z := Z(\mu, d)$ whose law depends on $\mu$ and $d$, we write $Z = O(f(\mu, d))$ if $\mathbb{E}[|Z|^k] \le C_k f(\mu, d)^k$ for some constants $\{C_k\}_{k \ge 0}$ independent of $\mu$ and $d$ and for all $k \ge 1$, $d \ge 2$, and $\mu > \sqrt{d}$. By the Minkowski, Cauchy-Schwarz, and Jensen inequalities, if $Z = O(f(\mu, d))$ and $Z' = O(g(\mu, d))$, then $Z + Z' = O(f(\mu, d) + g(\mu, d))$, $ZZ' = O(f(\mu, d)\, g(\mu, d))$, and $\sqrt{|Z|} = O(\sqrt{f(\mu, d)})$. Note that if $\mathbb{P}[|Z| \ge \gamma f(\mu, d)] \le C\exp(-c\gamma)$ for some constants $C, c > 0$ and all $\gamma > 0$, then for each $k \ge 1$
\[ \frac{\mathbb{E}\left[|Z|^k\right]}{f(\mu, d)^k} = \int_0^\infty k\gamma^{k-1}\,\mathbb{P}\left[\frac{|Z|}{f(\mu, d)} > \gamma\right] d\gamma \le C_k \]
for a constant $C_k > 0$, and hence $Z = O(f(\mu, d))$. Then Lemma 6.15 implies $X - 1 = O(1/\sqrt{d})$ and $Y - 1 = O(\sqrt{d/(\mu^2 - d)})$. This then implies $X = O(1)$ and $Y^2 \le 2(Y - 1)^2 + 2 = O(\mu^2/(\mu^2 - d))$, so $\alpha Y^2, \alpha Y'^2, \alpha YY' = O(1/\sqrt{d})$.

Let us write
\[ W_\pm = X + \frac{X'}{d} + \alpha\left(Y \pm \frac{Y'}{\mu}\right)^2 \quad\text{and}\quad W'_\pm = X' + \frac{X}{d} + \alpha\left(Y' \pm \frac{Y}{\mu}\right)^2. \]
Define the event $E = \{X > 1/2 \text{ and } X' > 1/2\}$. On $E$, we have $W_\pm, W'_\pm > 1/2$. Then applying the bound $|(1 + x)^{-1/2} - 1 + x/2| \le x^2$ for all $x > -1/2$ and noting $1/\mu < 1/\sqrt{d}$,
\[ \frac{1}{\sqrt{W_\pm W'_\pm}}\,\mathbb{1}\{E\} = \left(1 - \frac{X - 1}{2} - \frac{\alpha Y^2}{2} + O(1/d)\right)\left(1 - \frac{X' - 1}{2} - \frac{\alpha Y'^2}{2} + O(1/d)\right)\mathbb{1}\{E\} = \left(1 - \frac{X - 1}{2} - \frac{X' - 1}{2} - \frac{\alpha Y^2}{2} - \frac{\alpha Y'^2}{2} + O(1/d)\right)\mathbb{1}\{E\}. \]
Then, recalling $d = (a+b)/2$ and $\mu = (a-b)/2$ and noting $a, b \le 2d$,
\[ s(X, Y, X', Y'; \alpha)\,\mathbb{1}\{E\} = \left[d\left(\frac{1}{\sqrt{d}}(X + X') + \frac{\alpha}{\mu}\left(Y^2 + Y'^2\right)\right) + \alpha\mu\left(1 + \frac{1}{\mu^2}\right)YY'\right]\left(1 - \frac{X - 1}{2} - \frac{X' - 1}{2} - \frac{\alpha Y^2}{2} - \frac{\alpha Y'^2}{2} + O(1/d)\right)\mathbb{1}\{E\} \]
\[ = \left[2\sqrt{d} + \sqrt{d}\,(X - 1 + X' - 1) + \frac{\alpha d}{\mu}\left(Y^2 + Y'^2\right) + \alpha\mu YY' - \sqrt{d}\,(X - 1 + X' - 1) - \alpha\sqrt{d}\left(Y^2 + Y'^2\right) + O(1/\sqrt{d})\right]\mathbb{1}\{E\} \]
\[ = \left[2\sqrt{d} + \alpha\left(\frac{d}{\mu} - \sqrt{d}\right)\left(Y^2 + Y'^2\right) + \alpha\mu YY' + O(1/\sqrt{d})\right]\mathbb{1}\{E\}. \]
Writing
\[ R := s(X, Y, X', Y'; \alpha) - \left[2\sqrt{d} + \alpha\left(\frac{d}{\mu} - \sqrt{d}\right)\left(Y^2 + Y'^2\right) + \alpha\mu YY'\right], \tag{49} \]
the above implies $\mathbb{E}[|R|\,\mathbb{1}\{E\}] \le C/\sqrt{d}$ for an absolute constant $C > 0$ and all $d \ge 2$ and $\mu > \sqrt{d}$. On the other hand, $\mathbb{E}[|R|\,\mathbb{1}\{E^c\}] \le \mathbb{E}[R^2]\,\mathbb{P}[E^c] \le \mathbb{E}[R^2]\,(Ce^{-cd})$ for some constants $C, c > 0$ by (47). Noting that $s(X, Y, X', Y'; \alpha)$ satisfies the deterministic bound
\[ s(X, Y, X', Y'; \alpha) \le \frac{\sqrt{d}\,(X + X')}{\sqrt{X + \frac{X'}{d}}\;\sqrt{X' + \frac{X}{d}}} + \frac{a}{2} + \frac{b}{2} \le 2d \]
(which holds also if $X = X' = 0$, by definition of $s$), we have $R = O(d)$, so $\mathbb{E}[R^2] \le Cd^2$. As $d^2 e^{-cd} \le C'/\sqrt{d}$ for all $d \ge 2$ and some constant $C' > 0$, this yields $\mathbb{E}[|R|] \le C/\sqrt{d}$. Finally, applying (49) and (46), for any $d \ge 2$ and $\mu > \sqrt{d}$,
\[ \mathbb{E}[s(X, Y, X', Y'; \alpha)] \ge 2\sqrt{d} + 2\alpha\left(\frac{d}{\mu} - \sqrt{d}\right)\frac{\mu^2}{\mu^2 - d} + \alpha\mu - \frac{C}{\sqrt{d}} = 2\sqrt{d} + \frac{(\mu - \sqrt{d})^2}{\mu\sqrt{d}} - \frac{C}{\sqrt{d}}. \]
Combining with Lemma 6.16 yields the desired result.

Acknowledgements

Z.F. was partially supported by a Hertz Foundation Fellowship and an NDSEG Fellowship (DoD AFOSR 32 CFR 168a). A.M. was partially supported by the NSF grant CCF-1319979.

A Combinatorial lemmas

In this appendix, we prove Lemmas 5.8 and 5.9. For any graph $H$, $l \ge 1$, and set of vertices $S$ in $H$, let $B_l(S; H)$ denote the subgraph consisting of all vertices at distance at most $l$ from $S$ in $H$ (including $S$ itself) and all edges between pairs of such vertices. Let $|B_l(S; H)|$ denote the number of such vertices.
For a single vertex $v$, we write $B_l(v; H) := B_l(\{v\}; H)$.

Lemma A.1. Fix $d > 1$ and consider the Erdős-Rényi graph $G \sim G(n, d/n)$. Then there exist $C, c > 0$ such that for any $s \ge 0$, $l \ge 1$, and $v \in \{1, \ldots, n\}$,
\[ \mathbb{P}\left[|B_l(v; G)| > s d^l\right] \le Ce^{-cs}. \]

Proof. See [BLM15, Lemma 29].

Recall Definition 5.4 of the cycle number $\#_c(H)$, and also Definition 5.5 of an $l$-coil. Let us say an $l$-coil is irreducible if no proper subset of its edges forms an $l$-coil.

Lemma A.2. For any $l \ge 1$ and any graph $H$, the number of distinct edges in $H$ that belong to any irreducible $l$-coil of $H$ is at most $l \cdot \#_c(H)$.

Proof. Let $H'$ denote the subgraph of $H$ formed by the union of all irreducible $l$-coils in $H$. It suffices to show $l \cdot \#_c(H') \ge e(H')$ where $e(H')$ is the number of edges in $H'$. As the cycle number and number of edges are additive over connected components, it suffices to show this separately for each connected component of $H'$; hence assume without loss of generality $H'$ is connected (and non-empty, otherwise the result is trivial).

Let us construct $H'$ by starting with a single irreducible $l$-coil $H'_1$ and, for each $t \ge 2$, letting $H'_t$ be the union of $H'_{t-1}$ and an irreducible $l$-coil sharing at least one vertex with $H'_{t-1}$ but not entirely contained in $H'_{t-1}$. (Such an $l$-coil exists until $H'_t = H'$.) Denote by $e(H'_t)$ and $v(H'_t)$ the number of distinct edges and vertices in $H'_t$. Clearly $e(H'_t) - e(H'_{t-1}) \le l$ for each $t \ge 1$, hence $e(H'_t) \le tl$. Note $\#_c(H'_1) \ge 1$. For each $t \ge 1$, $e(H'_t) - e(H'_{t-1}) \ge v(H'_t) - v(H'_{t-1})$, since adding each new vertex requires adding at least one new edge as $H'_t$ remains connected. Furthermore, equality can only hold if the newly added vertices and edges form a forest, where each tree in the forest intersects $H'_{t-1}$ only at its root node.
But this would imply that there exists a vertex in $H'_t$ with degree one, contradicting that this vertex is part of any irreducible $l$-coil. Hence in fact $e(H'_t) - e(H'_{t-1}) \ge v(H'_t) - v(H'_{t-1}) + 1$, implying $\#_c(H'_t) \ge \#_c(H'_{t-1}) + 1$ for all $t$. Then $\#_c(H'_t) \ge t$, and so $l \cdot \#_c(H') \ge lt \ge e(H')$ for the value of $t$ such that $H'_t = H'$.

Lemma A.3. Fix $d > 1$. Let $S$ be a subset of edges in the complete graph on $n$ vertices such that $|S| \le 2(\log n)^2$, and let $\#_c(S)$ denote the cycle number of the subgraph formed by the edges in $S$. Let $l$ be a positive integer with $l \le 0.1\log_d n$. Let $G^o \cup S$ denote the random subgraph of the complete graph in which each edge outside of $S$ is present independently with probability $d/n$ and each edge in $S$ is present with probability 1. Let $V \subseteq \{1, \ldots, n\}$ denote the set of vertices incident to at least one edge in $S$. Then for some $C := C(d) > 0$, $N_0 := N_0(d) > 0$, all $n \ge N_0$, and all $0 < t \le (\log n)^2$,
\[ \mathbb{P}\left[\#_c\left(B_l(V; G^o \cup S)\right) \ge \#_c(S) + t\right] \le C(\log n)^2\, n^{-0.7t}. \]

Proof. Let $G^o$ denote the graph $G^o \cup S$ with all edges in $S$ removed. Construct a growing breadth-first-search forest in $G^o$, "rooted" at $V$, in the following manner: Initialize $F_0$ as the graph with the vertices $V$ and no edges, and mark each vertex in $V$ as unexplored. Iteratively for each $t \ge 1$, consider the set of unexplored vertices in $F_{t-1}$ having minimal distance from $V$, and let $v_t$ be the one with smallest index. Mark $v_t$ as explored, let $N_{v_t}$ be the set of neighbors of $v_t$ in $G^o$ which are not in $F_{t-1}$, and let $F_t$ be $F_{t-1}$ with all vertices $v \in N_{v_t}$ and edges $\{v_t, v\} : v \in N_{v_t}$ added. (Hence each $F_t$ is a forest of $|V|$ disjoint trees, with one tree rooted at each vertex $v \in V$ and with all of its edges in $G^o$.) Let $\tau$ be the first time for which all vertices in $B_l(V; G^o \cup S)$ are in $F_\tau$.
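The breadth-first-search forest described above can be sketched as follows. The graph encoding (a dictionary from integer vertices to neighbor sets) and the function names are illustrative assumptions; the exploration order, smallest distance from the root set first and smallest index among ties, follows the construction in the proof, and exploration stops once every vertex within the given radius has been added.

```python
import heapq


def bfs_forest(adj, roots, radius):
    """Grow a BFS forest from `roots` in the graph `adj` up to distance `radius`.

    Returns (parent, dist): parent[v] is None for roots, and dist[v] is the
    distance from the root set. Vertices are explored in increasing order of
    (distance, index), as in the construction of F_t.
    """
    dist = {v: 0 for v in roots}
    parent = {v: None for v in roots}
    heap = [(0, v) for v in roots]
    heapq.heapify(heap)
    while heap:
        r, v = heapq.heappop(heap)
        if r >= radius:
            break  # all remaining unexplored vertices sit at the boundary
        for w in sorted(adj[v]):
            if w not in dist:
                dist[w] = r + 1
                parent[w] = v
                heapq.heappush(heap, (r + 1, w))
    return parent, dist


def non_forest_edges(adj, parent, dist):
    """Count edges between vertices of the explored ball that are not forest edges."""
    ball_edges = sum(1 for v in dist for w in adj[v] if w in dist and v < w)
    forest_edges = sum(1 for v in parent if parent[v] is not None)
    return ball_edges - forest_edges
```

The count returned by `non_forest_edges` is the quantity that the remainder of the proof controls: each such edge closes a cycle in the ball, so it bounds the excess of the cycle number over that of the root edges.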
Note that for any $l \ge 1$, $B_l(V; G_o \cup S) = S \cup B_l(V; G_o)$ (i.e. $B_l(V; G_o)$ with the edges in $S$ added), as $S$ is contained in $B_1(V; G_o \cup S)$ and also any vertex at distance at most $l$ from $V$ in $G_o \cup S$ is at distance at most $l$ from $V$ in $G_o$, by definition of $V$. Then the cycle number $\#c(B_l(V; G_o \cup S))$ is at most $\#c(S)$ plus the number of edges in $B_l(V; G_o)$ that are not in $F_\tau$ (as removing these edges and $\#c(S)$ edges from $S$ yields a graph with no cycles).

Each edge in $B_l(V; G_o)$ that is not in $F_\tau$ must either be between two vertices $v_1$ and $v_2$ at the same distance $r \in [0, l]$ from $V$, or between a vertex $v_1$ at some distance $r \in [0, l-1]$ from $V$ and a vertex $v_2$ at distance $r+1$ from $V$, where $v_2$ is a child (in $F_\tau$) of a different vertex $v'$ at distance $r$ from $V$ and having smaller index than $v_1$. Given $F_\tau$, let $\mathcal{S}_{F_\tau}$ denote the set of all such pairs of vertices $\{v_1, v_2\}$. Then the event that $F_\tau$ is the breadth-first-search forest as constructed above is exactly the event that the vertices of $B_l(V; G_o)$ are those of $F_\tau$ and the edges of $B_l(V; G_o)$ are those of $F_\tau$ together with some subset of the edges corresponding to the vertex pairs in $\mathcal{S}_{F_\tau}$.

Hence, conditional on $F_\tau$, each edge $\{v_1, v_2\} \in \mathcal{S}_{F_\tau}$ is present in $G_o$ independently with probability $d/n$. Then the number of such edges has conditional law $\mathrm{Binom}(|\mathcal{S}_{F_\tau}|, d/n)$, which is stochastically dominated by $\mathrm{Binom}(|F_\tau|^2, d/n)$, where $|F_\tau|$ is the number of vertices in $F_\tau$. Then for any $t > 0$, letting $c$ be the constant in Lemma A.1,
\[ P\big[\#c(B_l(V; G_o \cup S)) \ge \#c(S) + t\big] \le P\big[\,|F_\tau| > c^{-1} |V|\, t (\log n) d^l\,\big] + P\Big[\mathrm{Binom}\Big(\big\lfloor c^{-1} |V|\, t (\log n) d^l \big\rfloor^2, \tfrac{d}{n}\Big) \ge t\Big]. \]
To bound the first term, note $|F_\tau| \le \sum_{v \in V} |B_l(v; G_o)| \le \sum_{v \in V} |B_l(v; G)|$, where $G \sim G(n, d/n)$ denotes the full Erdős-Rényi graph on $n$ vertices.
Then by Lemma A.1,
\[ P\big[\,|F_\tau| > c^{-1} |V|\, t (\log n) d^l\,\big] \le \sum_{v \in V} P\big[\,|B_l(v; G)| > c^{-1} t (\log n) d^l\,\big] \le C |V|\, n^{-t} \le 4C (\log n)^2 n^{-t}, \]
where the last bound uses $|V| \le 2|S| \le 4(\log n)^2$. For the second term, let $N := \lfloor c^{-1} |V|\, t (\log n) d^l \rfloor^2$ and assume $N > 0$, otherwise the probability is 0. Then for any $\lambda > 0$, by the Chernoff bound,
\[ P[\mathrm{Binom}(N, d/n) \ge t] \le e^{-\lambda t} \Big(1 - \frac{d}{n} + \frac{d}{n} e^{\lambda}\Big)^N \le \exp\Big(-\lambda t + \Big(-\frac{d}{n} + \frac{d}{n} e^{\lambda}\Big) N\Big). \]
Using $t \le (\log n)^2$, $l \le 0.1 \log_d n$ and $|V| \le 2|S| \le 4(\log n)^2$, and setting $\lambda = -\log(dN/nt)$ (which is positive for large $n$), the above is at most
\[ \exp\Big( t \log \frac{d \lfloor c^{-1} |V|\, t (\log n) d^l \rfloor^2}{nt} + t - \frac{d \lfloor c^{-1} |V|\, t (\log n) d^l \rfloor^2}{n} \Big) \le \exp\big( t \log(n^{-0.7}) \big) \]
for all large $n$. Combining these bounds yields the desired result.

Proof of Lemma 5.8. For any edge set $S$ and $c \in \{0, 1\}$, write $A_S = c$ as shorthand for the condition $\forall \{v,w\} \in S : A_{vw} = c$. Then
\[ \Big| E\Big[ \prod_{\{v,w\} \in S} \Big(A_{vw} - \frac{d}{n}\Big) \prod_{\{v,w\} \in T} A_{vw}\, 1\{G \text{ is } l\text{-coil-free}\} \Big] \Big| \]
\[ = \Big| \sum_{J \subseteq S} \Big(1 - \frac{d}{n}\Big)^{|J|} \Big(-\frac{d}{n}\Big)^{|S| - |J|} P\big[A_{S \setminus J} = 0,\ A_{J \cup T} = 1,\ G \text{ is } l\text{-coil-free}\big] \Big| \]
\[ = \Big| \sum_{J \subseteq S} \Big(1 - \frac{d}{n}\Big)^{|S|} \Big(-\frac{d}{n}\Big)^{|S| - |J|} \Big(\frac{d}{n}\Big)^{|J| + |T|} P\big[G \text{ is } l\text{-coil-free} \,\big|\, A_{S \setminus J} = 0,\ A_{J \cup T} = 1\big] \Big| \]
\[ \le \Big(\frac{d}{n}\Big)^{|S| + |T|} \Big| \sum_{J \subseteq S} (-1)^{|J|} P\big[G \text{ is } l\text{-coil-free} \,\big|\, A_{S \setminus J} = 0,\ A_{J \cup T} = 1\big] \Big|. \]
Let $G_o$ denote the random graph on $n$ vertices in which each edge outside of $S \cup T$ is present independently with probability $d/n$, and having no edges in $S \cup T$. Then for any $J \subseteq S$, the distribution of the graph $G_o \cup J \cup T$ (i.e. $G_o$ with the edges in $J \cup T$ added) is equal to the conditional distribution of $G \mid A_{S \setminus J} = 0, A_{J \cup T} = 1$. Thus
\[ \sum_{J \subseteq S} (-1)^{|J|} P\big[G \text{ is } l\text{-coil-free} \,\big|\, A_{S \setminus J} = 0,\ A_{J \cup T} = 1\big] = E[f(G_o)] \]
for the function
\[ f(g_o) = \sum_{J \subseteq S} (-1)^{|J|}\, 1\{g_o \cup J \cup T \text{ is } l\text{-coil-free}\}, \]
where $g_o$ denotes any fixed realization of $G_o$.
If $|S|/l \le \#c(S \cup T)$, then the desired result follows from the trivial bound $|E[f(G_o)]| \le \max_{g_o} |f(g_o)| \le 2^{|S|}$. For $|S|/l > \#c(S \cup T)$, note that if $g_o$ is such that there exists $e \in S$ for which $g_o \cup S \cup T$ does not contain any irreducible $l$-coil containing $e$, then for each $J \subseteq S \setminus \{e\}$,
\[ 1\{g_o \cup J \cup T \text{ is } l\text{-coil-free}\} = 1\{g_o \cup J \cup \{e\} \cup T \text{ is } l\text{-coil-free}\}, \]
so $f(g_o) = 0$. Hence if $f(g_o) \ne 0$, then each edge in $S$ must be contained in an irreducible $l$-coil of $g_o \cup S \cup T$, which must be an irreducible $l$-coil in $B_l(V; g_o \cup S \cup T)$, where $V$ is the set of vertices incident to at least one edge in $S \cup T$. This implies that $B_l(V; g_o \cup S \cup T)$ has at least $|S|$ distinct edges contained in irreducible $l$-coils, so by Lemma A.2, $\#c(B_l(V; g_o \cup S \cup T)) \ge |S|/l$. Applying the bound $|f(g_o)| \le 2^{|S|}$, this yields
\[ |E[f(G_o)]| \le 2^{|S|} P[f(G_o) \ne 0] \le 2^{|S|} P\big[\#c(B_l(V; G_o \cup S \cup T)) \ge |S|/l\big]. \]
Note $|S| + |T| \le 2(\log n)^2$ and $|S|/l - \#c(S \cup T) \le |S| \le (\log n)^2$, so applying Lemma A.3, for some $C := C(d) > 0$ and all $n \ge N_0 := N_0(d) > 0$,
\[ P\big[\#c(B_l(V; G_o \cup S \cup T)) \ge |S|/l\big] \le C (\log n)^2\, n^{-0.7 \left( \frac{|S|}{l} - \#c(S \cup T) \right)}. \]
Combining the above yields the desired result.

Proof of Lemma 5.9. The proof idea is similar to that of [BLM15, Lemma 17]. We order the steps taken by $\gamma^{(1)}$ and $\gamma^{(2)}$ as
\[ (\gamma^{(1)}_0, \gamma^{(1)}_1), \ldots, (\gamma^{(1)}_{m-1}, \gamma^{(1)}_m),\ (\gamma^{(2)}_0, \gamma^{(2)}_1), \ldots, (\gamma^{(2)}_{m-1}, \gamma^{(2)}_m). \]
(We will use "step" to refer to an ordered pair of consecutive vertices in one of the paths $\gamma^{(1)}, \gamma^{(2)}$ and "edge" to refer to an undirected edge $\{v, w\}$ in the complete graph.) Corresponding to each equivalence class in $\mathcal{W}(m, l, v, e, K_1, K_2)$ is a unique canonical element in which $\gamma^{(1)}_0 = \gamma^{(2)}_0 = 1$, and the successive new vertices visited in the above ordering are $2, 3, \ldots, v$.
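The passage to a canonical element can be illustrated with a short sketch. It assumes, as the canonical form suggests, that two pairs of paths are equivalent when they differ by a relabeling of vertices and that both paths start at the same vertex; the function name `canonicalize` and the list representation of paths are ours, introduced only for illustration.

```python
def canonicalize(gamma1, gamma2):
    """Relabel vertices by order of first visit, reading gamma1 then gamma2.

    The first vertex becomes 1 and each newly visited vertex receives the
    next unused label, so the new vertices appear as 2, 3, ... in order.
    """
    relabel = {}

    def label(vertex):
        if vertex not in relabel:
            relabel[vertex] = len(relabel) + 1
        return relabel[vertex]

    return [label(u) for u in gamma1], [label(u) for u in gamma2]
```

Two pairs of paths that differ only by a relabeling of vertices are mapped to the same canonical pair, so counting canonical elements counts equivalence classes.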
We bound $|\mathcal{W}(m, l, v, e, K_1, K_2)|$ by constructing an injective encoding of these canonical elements and bounding the number of possible codes.

Call a step $(\gamma^{(i)}_j, \gamma^{(i)}_{j+1})$ an "innovation" if $\gamma^{(i)}_{j+1}$ is a vertex not previously visited (in the above ordering). Edges corresponding to innovations form a tree $T$; call an edge $\{v, w\}$ in the complete graph a "tree edge" if it belongs to $T$. Note that as $\gamma^{(i)}$ is non-backtracking, if $(\gamma^{(i)}_j, \gamma^{(i)}_{j+1})$ is an innovation and $(\gamma^{(i)}_t, \gamma^{(i)}_{t+1})$ is the first step with $t \ge j$ that is not an innovation, then $\{\gamma^{(i)}_t, \gamma^{(i)}_{t+1}\}$ cannot be a tree edge.

The set $K_1$ uniquely partitions into maximal intervals of consecutive indices. For instance, if $K_1 = \{1, 2, 3, 5, 7, 9, 10\}$ then these intervals are $\{1, 2, 3\}, \{5\}, \{7\}, \{9, 10\}$. If any such interval is of size $L \ge l$, let us remove every $l$th element of the interval and replace the interval by the resulting sub-intervals (of which there are at most $\lfloor L/l \rfloor + 1$, each of size at most $l - 1$). Call the final collection of intervals $\mathcal{I}_1$. In the same manner, we may obtain a collection of intervals $\mathcal{I}_2$ for $K_2$. Then
\[ \Big| \Big\{ j \in \{0, \ldots, m-1\} : j \notin \bigcup_{I \in \mathcal{I}_1} I \Big\} \Big| \le m - |K_1| + \frac{m}{l}. \]
As each pair of consecutive intervals in $\mathcal{I}_1$ is separated by at least one index $j \in \{0, \ldots, m-1\}$, this also implies that the total number of intervals in $\mathcal{I}_1$ is bounded as
\[ |\mathcal{I}_1| \le m - |K_1| + \frac{m}{l} + 1. \]
Analogous bounds hold for $\mathcal{I}_2$.

For each $i \in \{1, 2\}$, the collection of intervals $\mathcal{I}_i$ corresponds to a collection of sub-paths of $\gamma^{(i)}$, where the interval $I = \{j, j+1, \ldots, j'\} \in \mathcal{I}_i$ corresponds to $\gamma^{(i)}_I := (\gamma^{(i)}_j, \gamma^{(i)}_{j+1}, \ldots, \gamma^{(i)}_{j'}, \gamma^{(i)}_{j'+1})$. Each sub-path $\gamma^{(i)}_I$, $I \in \mathcal{I}_i$, is a non-backtracking path of length at most $l$.
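The interval construction above lends itself to a small sketch. It is written under one reading of "remove every $l$th element", namely that the elements in positions $l, 2l, \ldots$ of a long interval are dropped; the function names `maximal_intervals` and `split_interval` are hypothetical.

```python
def maximal_intervals(indices):
    """Partition a set of integers into maximal runs of consecutive indices."""
    runs = []
    for j in sorted(indices):
        if runs and j == runs[-1][-1] + 1:
            runs[-1].append(j)  # j extends the current run
        else:
            runs.append([j])  # j starts a new run
    return runs


def split_interval(interval, l):
    """Split an interval of size >= l by dropping the elements at positions l, 2l, ...

    Returns the surviving sub-intervals (each of size at most l - 1) and the
    list of removed indices. Shorter intervals are returned unchanged.
    """
    if len(interval) < l:
        return [list(interval)], []
    pieces, removed, current = [], [], []
    for position, j in enumerate(interval, start=1):
        if position % l == 0:
            removed.append(j)
            if current:
                pieces.append(current)
                current = []
        else:
            current.append(j)
    if current:
        pieces.append(current)
    return pieces, removed
```

Applied to the example $K_1 = \{1, 2, 3, 5, 7, 9, 10\}$, the first function returns the four intervals listed in the text. An interval of size $L = 10$ with $l = 4$ loses two indices and splits into three pieces, within the bound $\lfloor L/l \rfloor + 1$.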
As $\gamma^{(i)}$ is $l$-coil-free on $K_i$, this implies that the graph $G(\gamma^{(i)}_I)$ of distinct edges visited by each such sub-path $\gamma^{(i)}_I$ contains at most one cycle.

For each sub-path $\gamma^{(i)}_I$ corresponding to $I = \{j, j+1, \ldots, j'\}$ and any innovation $(\gamma^{(i)}_k, \gamma^{(i)}_{k+1})$ in the sub-path, call it a "leading innovation" if $k = j$ or if its preceding step $(\gamma^{(i)}_{k-1}, \gamma^{(i)}_k)$ is not an innovation. Also, call each step $(\gamma^{(i)}_k, \gamma^{(i)}_{k+1})$ that does not coincide with a tree edge a "non-tree-edge step" (where "tree edge" is as previously defined by the tree $T$ traversed by all innovations in the two paths $\gamma^{(1)}, \gamma^{(2)}$). Note that the non-tree-edge steps are disjoint from the innovations, as innovations (by definition) coincide with edges of the tree $T$.

Call each non-tree-edge step either a "short-cycling step", a "long-cycling step", or a "superfluous step", as follows: If $G(\gamma^{(i)}_I)$ does not contain any cycles, then all non-tree-edge steps are long-cycling steps. If $G(\gamma^{(i)}_I)$ contains a cycle $C$, then for each non-tree edge in $C$, the first step $(\gamma^{(i)}_k, \gamma^{(i)}_{k+1})$ that traverses that edge is a short-cycling step. All non-tree-edge steps preceding the first short-cycling step are long-cycling steps. Letting $\tau$ be such that $(\gamma^{(i)}_\tau, \gamma^{(i)}_{\tau+1})$ is the first step after the last short-cycling step that does not belong to $C$ (if such a step exists), all non-tree-edge steps $(\gamma^{(i)}_k, \gamma^{(i)}_{k+1})$ with $k \ge \tau$ are also long-cycling steps. The remaining non-tree-edge steps (which must belong to $C$) are superfluous steps.

Our injective encoding of canonical elements consists of:

(1) For each $i \in \{1, 2\}$ and each $j \notin \bigcup_{I \in \mathcal{I}_i} I$: The vertex indices $\gamma^{(i)}_j$ and $\gamma^{(i)}_{j+1}$.

(2) For each sub-path $\gamma^{(i)}_I$: (a) The sequence of vertex index pairs $(\gamma^{(i)}_{k_1}, \gamma^{(i)}_{k_1+1}), \ldots, (\gamma^{(i)}_{k_P}, \gamma^{(i)}_{k_P+1})$ corresponding to leading innovations, long-cycling steps, and short-cycling steps, (b) for each of these $P$ steps, whether it is a leading innovation, short-cycling, or long-cycling, and (c) the total number of superfluous non-tree-edge steps.

To see that this encoding is injective, note that item (1) above specifies the start and end vertex of each sub-path $\gamma^{(i)}_I$. Between the start of each sub-path and the first leading innovation, non-tree-edge step, or end of the sub-path (whichever comes first), $\gamma^{(i)}_I$ is a non-backtracking walk on the sub-tree of $T$ corresponding to already-visited vertices and hence is uniquely determined by the start and end vertices of this walk. Similarly, between each non-tree-edge step and the next leading innovation, non-tree-edge step, or end of the sub-path (whichever comes first), $\gamma^{(i)}_I$ is also a non-backtracking walk uniquely determined by its start and end vertices. Following a leading innovation, all further steps must be (non-leading) innovations until the next non-tree-edge step, and hence are uniquely determined for the canonical element of the equivalence class. Hence, given the sub-tree of $T$ already visited before $\gamma^{(i)}_I$, as well as the vertex index pair for each leading innovation and non-tree-edge step in $\gamma^{(i)}_I$ and the first and last vertices of $\gamma^{(i)}_I$, we may uniquely reconstruct $\gamma^{(i)}_I$.

The above encoding does not explicitly specify the vertex index pairs of superfluous non-tree-edge steps, but note that if $G(\gamma^{(i)}_I)$ contains a cycle $C$, then $\gamma^{(i)}_I$ cannot leave and return to $C$ because $G(\gamma^{(i)}_I)$ contains only one cycle.
That is, the structure of $\gamma^{(i)}_I$ must be such that it enters the cycle $C$ at some step, traverses all of its short-cycling steps in the first loop around $C$, then traverses all of its superfluous steps in additional loops around $C$, and then exits the cycle $C$ and does not return. So in fact, given the total number of superfluous steps, the vertex index pairs of these superfluous steps are uniquely determined by repeating those of the short-cycling steps in order. Hence the above encoding specifies the vertex index pairs of all non-tree-edge steps, so the encoding is injective.

Finally, we bound the number of different codes under this encoding. For item (1), there are at most $v^2$ vertex pairs for each $j \notin \bigcup_{I \in \mathcal{I}_i} I$, so there are at most
\[ (v^2)^{\,m - |K_1| + \frac{m}{l} + m - |K_2| + \frac{m}{l}} \]
possible codes for item (1). For each sub-path $\gamma^{(i)}_I$ of item (2), note that the edges corresponding to long-cycling steps (which do not belong to the cycle $C$) are distinct from those corresponding to short-cycling steps (which belong to $C$), and each edge corresponding to a long-cycling or short-cycling step is visited exactly once since $\gamma^{(i)}_I$ contains at most one cycle. As there are at most $e - v + 1$ distinct non-tree edges and $\gamma^{(i)}_I$ has length at most $l$, $\gamma^{(i)}_I$ has at most $\min(e - v + 1, l)$ total long-cycling steps and short-cycling steps. Recall that the first non-innovation step after each leading innovation must be a non-tree-edge step, and note that there cannot be an innovation between a superfluous step and the next long-cycling step. Hence between each pair of successive long-cycling or short-cycling steps, and before the first such step and after the last such step, there is at most one leading innovation. So the total number $P$ of leading innovations, short-cycling steps, and long-cycling steps in $\gamma^{(i)}_I$ is at most $2 \min(e - v + 1, l) + 1$.
The number of superfluous non-tree-edge steps is at most $l$. Hence for each sub-path $\gamma^{(i)}_I$ of item (2), there are at most $l (3v^2)^{2 \min(e - v + 1, l) + 1}$ possible codes, yielding at most
\[ \Big( l (3v^2)^{2 \min(e - v + 1, l) + 1} \Big)^{m - |K_1| + \frac{m}{l} + 1 + m - |K_2| + \frac{m}{l} + 1} \le \Big( l (3v^2)^{2l + 1} \Big)^{2m - |K_1| - |K_2|} \Big( l (3v^2)^{2e - 2v + 3} \Big)^{\frac{2m}{l} + 2} \]
possible codes for item (2) by the above bounds on $|\mathcal{I}_1|$ and $|\mathcal{I}_2|$. Combining these bounds gives the desired result.

B Galton-Watson martingales

In this appendix, we prove Lemma 6.15.

Let us first establish the lemma in the case $\delta = 1$, i.e. all labels are revealed (so $\sigma_{\rm true}(ø) = \sigma(ø)$). Let $Y_\ell = \sigma(ø) D_\ell$, and let $\{\mathcal{F}_\ell\}_{\ell=0}^{\infty}$ be the filtration such that $\mathcal{F}_\ell$ is the sigma-field generated by $B_\ell(ø; T)$ and the labels in this ball. For each vertex $v \ne ø$ in $T$, denote by $p(v)$ its parent vertex in $T$ and by $k(v)$ its distance to ø. Fix $\ell \ge 1$, let $J_1, J_2$ be the numbers of vertices $v$ at generation $\ell - 1$ with $\sigma(v) = \sigma(ø)$ and $\sigma(v) = -\sigma(ø)$, respectively, and let $K_1, K_2, K_3, K_4$ be the numbers of vertices $v$ at generation $\ell$ with $\sigma(v) = \sigma(p(v)) = \sigma(ø)$, $-\sigma(v) = \sigma(p(v)) = \sigma(ø)$, $\sigma(v) = -\sigma(p(v)) = \sigma(ø)$, and $-\sigma(v) = -\sigma(p(v)) = \sigma(ø)$, respectively. Then $X_{\ell-1} = d^{-(\ell-1)} (J_1 + J_2)$, $X_\ell = d^{-\ell} (K_1 + K_2 + K_3 + K_4)$, $Y_{\ell-1} = \mu^{-(\ell-1)} (J_1 - J_2)$, and $Y_\ell = \mu^{-\ell} (K_1 - K_2 + K_3 - K_4)$. Conditional on $\mathcal{F}_{\ell-1}$, $K_1, K_2, K_3, K_4$ are independent and distributed as
\[ K_1 \sim \mathrm{Poisson}\Big(\frac{a}{2} J_1\Big), \quad K_2 \sim \mathrm{Poisson}\Big(\frac{b}{2} J_1\Big), \quad K_3 \sim \mathrm{Poisson}\Big(\frac{b}{2} J_2\Big), \quad K_4 \sim \mathrm{Poisson}\Big(\frac{a}{2} J_2\Big). \]
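As a numerical sanity check of these conditional laws, one can simulate the labelled tree generation by generation. The sketch below is ours and rests on assumptions: it takes $d = (a+b)/2$ and $\mu = (a-b)/2$, under which the Poisson means above give $E[X_\ell \mid \mathcal{F}_{\ell-1}] = X_{\ell-1}$ and $E[Y_\ell \mid \mathcal{F}_{\ell-1}] = Y_{\ell-1}$, and it uses a basic Poisson sampler (Knuth's multiplication method) that is adequate only for small means. The names `poisson`, `step` and `simulate` are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Sample Poisson(lam) by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def step(j1, j2, a, b, rng):
    """One generation: (J1, J2) -> (K1 + K3, K2 + K4).

    Each vertex has Poisson(a/2) children with its own label and Poisson(b/2)
    children with the opposite label, so that K1 ~ Poisson(a/2 * J1), etc.
    """
    k1 = sum(poisson(a / 2, rng) for _ in range(j1))  # same label as root, parent too
    k2 = sum(poisson(b / 2, rng) for _ in range(j1))  # opposite label, parent agrees with root
    k3 = sum(poisson(b / 2, rng) for _ in range(j2))  # same label as root, parent opposite
    k4 = sum(poisson(a / 2, rng) for _ in range(j2))  # opposite label, parent opposite
    return k1 + k3, k2 + k4


def simulate(a, b, depth, rng):
    """Return (X_depth, Y_depth) for one tree, assuming d=(a+b)/2, mu=(a-b)/2."""
    d, mu = (a + b) / 2, (a - b) / 2
    j1, j2 = 1, 0  # generation 0 is the root, which agrees with itself
    for _ in range(depth):
        j1, j2 = step(j1, j2, a, b, rng)
    return (j1 + j2) / d**depth, (j1 - j2) / mu**depth
```

Averaging the output of `simulate` over many independent trees (for instance with $a = 5$, $b = 1$ and a few generations) should give empirical means of both coordinates close to 1, as expected for martingales started at $X_0 = Y_0 = 1$.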
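Hence for any $t, s \in \mathbb{R}$,
\[ E[\exp(t X_\ell + s Y_\ell) \mid \mathcal{F}_{\ell-1}] \]
\[ = E\Big[ \exp\Big( \Big(\frac{t}{d^\ell} + \frac{s}{\mu^\ell}\Big) K_1 + \Big(\frac{t}{d^\ell} - \frac{s}{\mu^\ell}\Big) K_2 + \Big(\frac{t}{d^\ell} + \frac{s}{\mu^\ell}\Big) K_3 + \Big(\frac{t}{d^\ell} - \frac{s}{\mu^\ell}\Big) K_4 \Big) \,\Big|\, \mathcal{F}_{\ell-1} \Big] \]
\[ = \exp\Big( \frac{a}{2} J_1 \big(e^{\frac{t}{d^\ell} + \frac{s}{\mu^\ell}} - 1\big) + \frac{b}{2} J_1 \big(e^{\frac{t}{d^\ell} - \frac{s}{\mu^\ell}} - 1\big) + \frac{b}{2} J_2 \big(e^{\frac{t}{d^\ell} + \frac{s}{\mu^\ell}} - 1\big) + \frac{a}{2} J_2 \big(e^{\frac{t}{d^\ell} - \frac{s}{\mu^\ell}} - 1\big) \Big) \]
\[ = \exp\Big( \frac{d}{2} \big(e^{\frac{t}{d^\ell} + \frac{s}{\mu^\ell}} + e^{\frac{t}{d^\ell} - \frac{s}{\mu^\ell}} - 2\big)(J_1 + J_2) + \frac{\mu}{2} \big(e^{\frac{t}{d^\ell} + \frac{s}{\mu^\ell}} - e^{\frac{t}{d^\ell} - \frac{s}{\mu^\ell}}\big)(J_1 - J_2) \Big) \]
\[ = \exp\Big( d^\ell \Big( \exp\Big(\frac{t}{d^\ell}\Big) \cosh\Big(\frac{s}{\mu^\ell}\Big) - 1 \Big) X_{\ell-1} + \mu^\ell \exp\Big(\frac{t}{d^\ell}\Big) \sinh\Big(\frac{s}{\mu^\ell}\Big) Y_{\ell-1} \Big). \tag{50} \]
Denote the joint moment generating function of $(X_\ell, Y_\ell)$ by $M_\ell(t, s) = E[\exp(t X_\ell + s Y_\ell)]$. Taking the full expectation of (50), for each $\ell \ge 1$,
\[ M_\ell(t, s) = M_{\ell-1}\Big( d^\ell \Big( \exp\Big(\frac{t}{d^\ell}\Big) \cosh\Big(\frac{s}{\mu^\ell}\Big) - 1 \Big),\ \mu^\ell \exp\Big(\frac{t}{d^\ell}\Big) \sinh\Big(\frac{s}{\mu^\ell}\Big) \Big). \tag{51} \]
In particular, induction on $\ell$ shows $M_\ell(t, s)$ is finite for all $t, s \in \mathbb{R}$ and $\ell \ge 0$. Differentiating (50) at $(0, 0)$,
\[ E[X_\ell \mid \mathcal{F}_{\ell-1}] = \frac{\partial}{\partial t} E[\exp(t X_\ell + s Y_\ell) \mid \mathcal{F}_{\ell-1}] \Big|_{t=0, s=0} = X_{\ell-1}, \]
\[ E[Y_\ell \mid \mathcal{F}_{\ell-1}] = \frac{\partial}{\partial s} E[\exp(t X_\ell + s Y_\ell) \mid \mathcal{F}_{\ell-1}] \Big|_{t=0, s=0} = Y_{\ell-1}, \]
\[ E[X_\ell^2 \mid \mathcal{F}_{\ell-1}] = \frac{\partial^2}{\partial t^2} E[\exp(t X_\ell + s Y_\ell) \mid \mathcal{F}_{\ell-1}] \Big|_{t=0, s=0} = X_{\ell-1}^2 + d^{-\ell} X_{\ell-1}, \]
\[ E[Y_\ell^2 \mid \mathcal{F}_{\ell-1}] = \frac{\partial^2}{\partial s^2} E[\exp(t X_\ell + s Y_\ell) \mid \mathcal{F}_{\ell-1}] \Big|_{t=0, s=0} = Y_{\ell-1}^2 + d^{\ell} \mu^{-2\ell} X_{\ell-1}. \]
Then $\{X_\ell\}$ and $\{Y_\ell\}$ are martingales with respect to $\{\mathcal{F}_\ell\}$, with $E[X_\ell] = X_0 = 1$, $E[Y_\ell] = Y_0 = 1$,
\[ E[X_\ell^2] = E[X_{\ell-1}^2] + d^{-\ell} = \ldots = \sum_{k=0}^{\ell} d^{-k}, \qquad E[Y_\ell^2] = E[Y_{\ell-1}^2] + d^{\ell} \mu^{-2\ell} = \ldots = \sum_{k=0}^{\ell} d^{k} \mu^{-2k}. \]
Hence $\{X_\ell\}$, $\{Y_\ell\}$ are bounded in $L^2$, so they converge a.s. and in $L^2$ to some $(X, Y) \in \mathcal{F}_\infty$ by the martingale convergence theorem. As $E[X_\ell^2] \to d/(d-1)$ and $E[Y_\ell^2] \to \mu^2/(\mu^2 - d)$, (46) follows.

For the tail bounds (47), set $\alpha = 1/6$ and define
\[ T_0 = \alpha \sqrt{d-1}, \qquad S_0 = \alpha \sqrt{\frac{\mu^2 - d}{d}}, \]
\[ T_\ell = T_0 - T_0^2 \sum_{k=1}^{\ell} d^{-k} - S_0^2 \sum_{k=1}^{\ell} d^{k} \mu^{-2k}, \qquad S_\ell = S_0 - 2 S_0 T_0 \sum_{k=1}^{\ell} d^{-k} - S_0^3 \sum_{k=1}^{\ell} \mu^{-2k}. \]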
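Note that for $d \ge 2$, $\alpha \le T_0 \le \alpha(d-1)$ and $S_0^2 \le \alpha^2 (\mu^2 - 1)/2$. Then as $\ell \to \infty$,
\[ T_\ell \downarrow T_\infty := T_0 - \frac{T_0^2}{d-1} - \frac{S_0^2 d}{\mu^2 - d} = T_0 - 2\alpha^2 \ge (1 - 2\alpha) T_0, \]
\[ S_\ell \downarrow S_\infty := S_0 - \frac{2 S_0 T_0}{d-1} - \frac{S_0^3}{\mu^2 - 1} \ge (1 - 2\alpha - \alpha^2/2) S_0. \]
We claim by induction on $\ell$ that for all $\ell \ge 0$ and $t, s \in \mathbb{R}$ with $|t| \le T_\ell$ and $|s| \le S_\ell$,
\[ M_\ell(t, s) \le \exp\Big( t + s + |t| \sqrt{d-1} \sum_{k=1}^{\ell} d^{-k} + |s| \sqrt{\frac{\mu^2 - d}{d}} \sum_{k=1}^{\ell} d^{k} \mu^{-2k} \Big). \tag{52} \]
As $X_0 = Y_0 = 1$, (52) holds with equality for $\ell = 0$. Let $\ell \ge 1$ and assume inductively that (52) holds for $\ell - 1$. Let $t_\ell, s_\ell \in \mathbb{R}$ with $|t_\ell| \le T_\ell$ and $|s_\ell| \le S_\ell$, and denote
\[ t_{\ell-1} = d^\ell \Big( \exp\Big(\frac{t_\ell}{d^\ell}\Big) \cosh\Big(\frac{s_\ell}{\mu^\ell}\Big) - 1 \Big), \qquad s_{\ell-1} = \mu^\ell \exp\Big(\frac{t_\ell}{d^\ell}\Big) \sinh\Big(\frac{s_\ell}{\mu^\ell}\Big). \]
As $|t_\ell| \le T_0 \le \alpha d^\ell$ and $|s_\ell| \le S_0 \le \alpha \mu \le \alpha \mu^\ell$, and $|\exp(x) \cosh(y) - 1 - x| \le x^2 + y^2$ and $|\exp(x) \sinh(y) - y| \le 2|xy| + |y|^3$ for all $|x|, |y| \le \alpha$, we obtain
\[ |t_{\ell-1} - t_\ell| \le \frac{t_\ell^2}{d^\ell} + \frac{s_\ell^2 d^\ell}{\mu^{2\ell}} \le \frac{T_0^2}{d^\ell} + \frac{S_0^2 d^\ell}{\mu^{2\ell}}, \qquad |s_{\ell-1} - s_\ell| \le \frac{2 |s_\ell t_\ell|}{d^\ell} + \frac{|s_\ell|^3}{\mu^{2\ell}} \le \frac{2 S_0 T_0}{d^\ell} + \frac{S_0^3}{\mu^{2\ell}}. \]
This implies $|t_{\ell-1}| \le T_{\ell-1}$ and $|s_{\ell-1}| \le S_{\ell-1}$. Then (51) and the induction hypothesis imply
\[ M_\ell(t_\ell, s_\ell) \le \exp\Big( t_{\ell-1} + s_{\ell-1} + |t_{\ell-1}| \sqrt{d-1} \sum_{k=1}^{\ell-1} d^{-k} + |s_{\ell-1}| \sqrt{\frac{\mu^2 - d}{d}} \sum_{k=1}^{\ell-1} d^{k} \mu^{-2k} \Big). \]
To complete the induction, it suffices to show
\[ |t_{\ell-1} - t_\ell| \Big( 1 + \sqrt{d-1} \sum_{k=1}^{\ell-1} d^{-k} \Big) + |s_{\ell-1} - s_\ell| \Big( 1 + \sqrt{\frac{\mu^2 - d}{d}} \sum_{k=1}^{\ell-1} d^{k} \mu^{-2k} \Big) \le \sqrt{d-1}\, \frac{|t_\ell|}{d^\ell} + \sqrt{\frac{\mu^2 - d}{d}}\, \frac{|s_\ell| d^\ell}{\mu^{2\ell}}. \tag{53} \]
Recalling $d \ge 2$, we may bound
\[ |t_{\ell-1} - t_\ell| \le \frac{t_\ell^2}{d^\ell} + \frac{s_\ell^2 d^\ell}{\mu^{2\ell}} \le \alpha \sqrt{d-1}\, \frac{|t_\ell|}{d^\ell} + \alpha \sqrt{\frac{\mu^2 - d}{d}}\, \frac{|s_\ell| d^\ell}{\mu^{2\ell}}, \]
\[ 1 + \sqrt{d-1} \sum_{k=1}^{\ell-1} d^{-k} \le 1 + \frac{1}{\sqrt{d-1}} \le 2, \]
\[ |s_{\ell-1} - s_\ell| \le \frac{2 |s_\ell t_\ell|}{d^\ell} + \frac{|s_\ell|^3}{\mu^{2\ell}} \le 2\alpha \sqrt{\frac{\mu^2 - d}{d}}\, \frac{|t_\ell|}{d^\ell} + \alpha^2\, \frac{\mu^2 - d}{d}\, \frac{|s_\ell|}{\mu^{2\ell}}, \]
\[ 1 + \sqrt{\frac{\mu^2 - d}{d}} \sum_{k=1}^{\ell-1} d^{k} \mu^{-2k} \le 1 + \sqrt{\frac{d}{\mu^2 - d}}. \]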
Combining the above, applying the bounds $1 \le \sqrt{d-1} \le d^\ell$ and $\sqrt{(\mu^2 - d)/d} \le \sqrt{d-1} \le d^\ell$, and recalling $\alpha = 1/6$, we obtain (53). This completes the induction and our proof of (52).

Finally, (52) implies, in particular,
\[ E[\exp(t X_\ell)] = M_\ell(t, 0) \le \exp\Big( t + \frac{|t|}{\sqrt{d-1}} \Big) \quad \forall\, |t| \le \frac{\sqrt{d-1}}{9} \le T_\infty, \]
\[ E[\exp(s Y_\ell)] = M_\ell(0, s) \le \exp\Big( s + |s| \sqrt{\frac{d}{\mu^2 - d}} \Big) \quad \forall\, |s| \le \frac{1}{10} \sqrt{\frac{\mu^2 - d}{d}} \le S_\infty. \]
By Fatou's lemma, the same bounds hold for $E[\exp(tX)]$ and $E[\exp(sY)]$. Choosing $t = \sqrt{d-1}/9$,
\[ P\Big[ |X - 1| \ge \frac{\gamma}{\sqrt{d-1}} \Big] \le e^{-t \frac{\gamma}{\sqrt{d-1}}} E[e^{t|X-1|}] \le e^{-\gamma/9} \big( E[e^{t(X-1)}] + E[e^{t(1-X)}] \big) \le 2 e^{1/9} e^{-\gamma/9}, \]
yielding the bound for $X$ in (47). The same argument yields the bound for $Y$, and this completes the proof of Lemma 6.15 in the case of $\delta = 1$.

For $\delta \in (0, 1)$, first note that the definition of $X_\ell$ does not depend on the revealed labels, and hence $X_\ell \to X$ a.s. for the same limit $X$ as in the case $\delta = 1$. To show $Y_\ell \to Y$ a.s., denote
\[ N^+_{\ell, \rm true} = |\{v : k(v) = \ell,\ \sigma_{\rm true}(v) = +1\}|, \qquad N^-_{\ell, \rm true} = |\{v : k(v) = \ell,\ \sigma_{\rm true}(v) = -1\}|, \]
\[ Y_{\ell, \rm true} = \sigma_{\rm true}(ø)\, \mu^{-\ell} (N^+_{\ell, \rm true} - N^-_{\ell, \rm true}), \]
where $\sigma_{\rm true}(v) \in \{+1, -1\}$ denotes the "true label" of each vertex $v$ (i.e. the vertex set of the hidden partition containing $v$). Then the $\delta = 1$ case implies $Y_{\ell, \rm true} \to Y$ a.s. On the event that the tree $T$ goes extinct, we have $Y_\ell = Y_{\ell, \rm true} = 0$ for all sufficiently large $\ell$, so $Y_\ell \to Y = 0$. On the event that $T$ does not go extinct, we have $X > 0$ a.s., so in particular $N^+_{\ell, \rm true} + N^-_{\ell, \rm true} \to \infty$. As $d > \mu$ and as $Y_{\ell, \rm true} \to Y < \infty$, this also implies $N^+_{\ell, \rm true} / N^-_{\ell, \rm true} \to 1$, so in fact $d^{-\ell} N^+_{\ell, \rm true} \to X/2$ and $d^{-\ell} N^-_{\ell, \rm true} \to X/2$. Conditional on $T$ and the true labels, $N^+_\ell \sim \mathrm{Binom}(N^+_{\ell, \rm true}, \delta)$, so Hoeffding's inequality implies
\[ |\delta^{-1} N^+_\ell - N^+_{\ell, \rm true}| \le (\log N^+_{\ell, \rm true}) \sqrt{N^+_{\ell, \rm true}} \]
almost surely for all large $\ell$.
A similar bound holds for $N^-_\ell$, implying that
\[ |Y_\ell - Y_{\ell, \rm true}| \le \mu^{-\ell} \Big( (\log N^+_{\ell, \rm true}) \sqrt{N^+_{\ell, \rm true}} + (\log N^-_{\ell, \rm true}) \sqrt{N^-_{\ell, \rm true}} \Big) \]
almost surely for all large $\ell$. As $\mu > \sqrt{d}$ and $d^{-\ell} N^+_{\ell, \rm true} \to X/2$ and $d^{-\ell} N^-_{\ell, \rm true} \to X/2$, this implies $|Y_\ell - Y_{\ell, \rm true}| \to 0$. Hence $Y_\ell \to Y$ a.s. also on the event that $T$ does not go extinct.

C Proof of Theorem 2.4, Eq. (11)

Recall that, given an infinite rooted tree $(T, ø)$, we denote by $\mathsf{c}(T, ø)$ its conductance. We start by recalling some basic notions that can be found in [LPP97]. Let $\mathsf{c}^{(\ell)}(T, ø)$ be the conductance of the first $\ell$ generations of $(T, ø)$, i.e. the intensity of current flowing through the tree when the root has potential equal to one, and the vertices at generation $\ell$ have potential equal to 0. By definition, $\mathsf{c}^{(\ell)}(T, ø)$ is monotone non-increasing, and $\mathsf{c}(T, ø) = \mathsf{c}^{(\infty)}(T, ø) \equiv \lim_{\ell \to \infty} \mathsf{c}^{(\ell)}(T, ø)$.

We omit the argument and write $\mathsf{c}^{(\ell)}$, $\mathsf{c}$ when $(T, ø)$ is a Galton-Watson tree with Poisson($d$) offspring distribution. By the standard rules for series/parallel combinations of resistances, we get the distributional recursion
\[ \mathsf{c}^{(\ell+1)} \stackrel{d}{=} \sum_{i=1}^{L} \frac{\mathsf{c}^{(\ell)}_i}{1 + \mathsf{c}^{(\ell)}_i}, \tag{54} \]
where $L \sim \mathrm{Poisson}(d)$ is independent of the i.i.d. collection $(\mathsf{c}^{(\ell)}_i)_{i \ge 1}$, $\mathsf{c}^{(\ell)}_i \stackrel{d}{=} \mathsf{c}^{(\ell)}$. This is to be complemented with the boundary condition $\mathsf{c}^{(0)} = \infty$ (with the convention that $\infty/(1 + \infty) = 1$). The limit conductance $\mathsf{c}$ is a fixed point of the above recursion.

We start with a simple concentration estimate.

Lemma C.1. Let $h(x) \equiv (1 + x) \log(1 + x) - x$. Then for any $\ell \ge 1$, we have
\[ P\big\{ \big| \mathsf{c}^{(\ell)} - E\mathsf{c}^{(\ell)} \big| \ge t \big\} \le 2 e^{-d\, h(t/d)}. \tag{55} \]
In particular, for any $M > 0$ there exists $d_0(M)$ such that, for all $d \ge d_0(M)$,
\[ P\big\{ \big| \mathsf{c}^{(\ell)} - E\mathsf{c}^{(\ell)} \big| \ge \sqrt{4 M d \log d} \big\} \le \frac{2}{d^M}. \tag{56} \]

Proof.
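Consider a modified random variable $\tilde{\mathsf{c}}^{(\ell)}$ defined by
\[ \tilde{\mathsf{c}}^{(\ell)} \stackrel{d}{=} \sum_{i=1}^{m} \frac{\mathsf{c}^{(\ell-1)}_i}{1 + \mathsf{c}^{(\ell-1)}_i}\, B_i, \tag{57} \]
where $m$ is an integer and $B_i \sim_{\rm iid} \mathrm{Bernoulli}(d/m)$. By Bennett's inequality [BLM13, Theorem 2.9], we have
\[ P\big\{ \big| \tilde{\mathsf{c}}^{(\ell)} - E\tilde{\mathsf{c}}^{(\ell)} \big| \ge t \big\} \le 2 e^{-\nu\, h(t/\nu)}, \tag{58} \]
where $\nu = \sum_{i=1}^{m} E(X_i^2)$, $X_i \equiv B_i\, \mathsf{c}^{(\ell-1)}_i / (1 + \mathsf{c}^{(\ell-1)}_i)$. Claim (55) simply follows because $\tilde{\mathsf{c}}^{(\ell)}$ converges to $\mathsf{c}^{(\ell)}$ in distribution and in expectation (for instance by coupling Poisson($d$) and Binom($m, d/m$)), and using $\nu \le d$ together with the fact that $\nu \mapsto \nu\, h(t/\nu)$ is monotone non-increasing for all $t \ge 0$. Claim (56) follows by using $h(x) \ge x^2/4$ for $x \le 1$.

We next estimate the mean and variance of $\mathsf{c}$.

Lemma C.2. With the above definitions, we have $E\mathsf{c} \le d$, $\mathrm{Var}(\mathsf{c}) \le d$ and, for large $d$,
\[ E\mathsf{c} = d - 1 + O_d\Big( \sqrt{\frac{\log d}{d}} \Big), \tag{59} \]
\[ \mathrm{Var}(\mathsf{c}) = d - O_d(1). \tag{60} \]

Proof. First note that, by the recursion (54), we have
\[ E\mathsf{c}^{(\ell+1)} = d\, E\Big\{ \frac{\mathsf{c}^{(\ell)}}{\mathsf{c}^{(\ell)} + 1} \Big\}, \qquad \mathrm{Var}(\mathsf{c}^{(\ell+1)}) = d\, E\Big\{ \Big( \frac{\mathsf{c}^{(\ell)}}{\mathsf{c}^{(\ell)} + 1} \Big)^2 \Big\}, \tag{61} \]
whence we get immediately $E\mathsf{c} \le d$ and $\mathrm{Var}(\mathsf{c}) \le d$ (note that $\lim_{\ell \to \infty} E(\mathsf{c}^{(\ell)}) = E(\mathsf{c})$, $\lim_{\ell \to \infty} \mathrm{Var}(\mathsf{c}^{(\ell)}) = \mathrm{Var}(\mathsf{c})$ by dominated convergence, since $\mathsf{c}^{(\ell)}$ is dominated by a Poisson($d$) random variable). Also, by Jensen's inequality and using the notation $\bar{\mathsf{c}}^{(\ell)} \equiv E\mathsf{c}^{(\ell)}$, we get
\[ \bar{\mathsf{c}}^{(\ell+1)} \le d\, \frac{\bar{\mathsf{c}}^{(\ell)}}{\bar{\mathsf{c}}^{(\ell)} + 1}. \tag{62} \]
Iterating this bound from $\bar{\mathsf{c}}^{(0)} = \infty$, we obtain $\bar{\mathsf{c}} \le d - 1$.

In order to prove a lower bound, define $B_\ell = [\bar{\mathsf{c}}^{(\ell)} - \sqrt{4 M d \log d},\ \bar{\mathsf{c}}^{(\ell)} + \sqrt{4 M d \log d}] \equiv [c^{(\ell)}_1, c^{(\ell)}_2]$, with $M$ to be fixed below. We have
\[ E\Big\{ \frac{\mathsf{c}^{(\ell)}}{\mathsf{c}^{(\ell)} + 1} \Big\} \ge E\Big\{ \frac{\mathsf{c}^{(\ell)}}{\mathsf{c}^{(\ell)} + 1}\, 1\{\mathsf{c}^{(\ell)} \in B_\ell\} \Big\} \tag{63} \]
\[ \ge \frac{c^{(\ell)}_1}{c^{(\ell)}_1 + 1}\, P\big( \mathsf{c}^{(\ell)} \in B_\ell \big) \tag{64} \]
\[ \ge \frac{\bar{\mathsf{c}}^{(\ell)} - \Delta}{\bar{\mathsf{c}}^{(\ell)} - \Delta + 1} \Big( 1 - \frac{2}{d^M} \Big), \tag{65} \]
where $\Delta = \sqrt{4 M d \log d}$ and we used (56). Defining $\tilde{d} = d(1 - 2 d^{-M})$, and the function $F(x) = \tilde{d}(x - \Delta)/(x - \Delta + 1)$, we obtain the lower bounds
\[ \bar{\mathsf{c}}^{(\ell)} \ge F(\bar{\mathsf{c}}^{(\ell-1)}). \tag{66} \]
By calculus, we obtain that the fixed point equation $x = F(x)$ has two positive solutions $0 < x_0(d) < x_1(d) < \infty$ for all $d$ large enough, with
\[ x_1(d) = \frac{\tilde{d} + \Delta - 1}{2} + \sqrt{ \frac{(\tilde{d} + \Delta - 1)^2}{4} - \tilde{d} \Delta } \tag{67} \]
\[ = d - 1 + O\Big( \sqrt{\frac{\log d}{d}} \Big), \tag{68} \]
where, for the second equality, we used $M \ge 3/2$. Furthermore $x \mapsto F(x)$ is non-decreasing for $x \ge x_1(d)$. Using the initial condition $\bar{\mathsf{c}}^{(0)} = \infty$, this implies $\bar{\mathsf{c}}^{(\ell)} \ge x_1(d)$ for all $\ell$, thus finishing the proof of (59).

In order to prove (60), recall that we already proved $\mathrm{Var}(\mathsf{c}) \le d$. Using Jensen's inequality in (61), we get
\[ \mathrm{Var}(\mathsf{c}) \ge d \Big( E\Big\{ \frac{\mathsf{c}}{\mathsf{c} + 1} \Big\} \Big)^2 = d \Big( \frac{\bar{\mathsf{c}}}{d} \Big)^2, \tag{69} \]
and the claim follows from our estimate of $\bar{\mathsf{c}}$.

We are now in position to prove the Taylor expansion in (11).

Proof of Theorem 2.4, Eq. (11). First we claim that $\Psi(x, y) \in [0, 1]$ for all $x, y \ge 0$. Indeed, it is clear that $\Psi(x, y) \ge 0$. Furthermore, $\lim_{x \to 0} \Psi(x, y) = 1$ for any $y > 0$ and
\[ \frac{\partial \Psi}{\partial x}(x, y) = - \frac{y (1 - \sqrt{K})^2}{2 (x + y + xy)^2 \sqrt{1 + x}} \le 0, \tag{70} \]
where $K \equiv (1 + x)(1 + y)$. This proves the claim $\Psi(x, y) \in [0, 1]$.

Next, let $\bar{\mathsf{c}} = E\mathsf{c}$ and $B = [\bar{\mathsf{c}} - \sqrt{4 M d \log d},\ \bar{\mathsf{c}} + \sqrt{4 M d \log d}] \equiv [c_1, c_2]$, with $M$ to be fixed below. By the above calculation and Lemma C.1, we get
\[ d\, E\Psi(\mathsf{c}_1, \mathsf{c}_2) = 2d\, E\Big\{ \frac{\mathsf{c}_1 \sqrt{\mathsf{c}_2 + 1}}{\mathsf{c}_1 + \mathsf{c}_2 + \mathsf{c}_1 \mathsf{c}_2}\, 1\{\mathsf{c}_1 \in B\}\, 1\{\mathsf{c}_2 \in B\} \Big\} + O(d^{-M+1}) \tag{71} \]
\[ = 2d\, E\Big\{ \frac{\mathsf{c}_1 \sqrt{\mathsf{c}_2 + 1}}{(\mathsf{c}_1 + 1)(\mathsf{c}_2 + 1)}\, 1\{\mathsf{c}_1 \in B\}\, 1\{\mathsf{c}_2 \in B\} \Big\} + O(d^{-3/2}) \tag{72} \]
\[ = 2d\, E\Big\{ \frac{\mathsf{c}}{\mathsf{c} + 1}\, 1\{\mathsf{c} \in B\} \Big\}\, E\Big\{ \frac{1}{\sqrt{\mathsf{c} + 1}}\, 1\{\mathsf{c} \in B\} \Big\} + O(d^{-3/2}). \tag{73} \]
In the second equality we took $M \ge 5/2$ and used the fact that there exists a constant $C = C(M)$ such that
\[ \frac{\mathsf{c}_1 \sqrt{\mathsf{c}_2 + 1}}{\mathsf{c}_1 + \mathsf{c}_2 + \mathsf{c}_1 \mathsf{c}_2} - \frac{\mathsf{c}_1 \sqrt{\mathsf{c}_2 + 1}}{(\mathsf{c}_1 + 1)(\mathsf{c}_2 + 1)} = \frac{\mathsf{c}_1 \sqrt{\mathsf{c}_2 + 1}}{(\mathsf{c}_1 + \mathsf{c}_2 + \mathsf{c}_1 \mathsf{c}_2)(\mathsf{c}_1 + 1)(\mathsf{c}_2 + 1)} \le C d^{-5/2}, \tag{74} \]
for all $\mathsf{c}_1, \mathsf{c}_2 \in B$.

We are left with the task of evaluating the two expectations in (73). Consider the first one.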
We have
\[ E\Big\{ \frac{\mathsf{c}}{\mathsf{c} + 1}\, 1\{\mathsf{c} \in B\} \Big\} = E\Big\{ \frac{\mathsf{c}}{\mathsf{c} + 1} \Big\} + O(d^{-M}) \tag{75} \]
\[ = \frac{\bar{\mathsf{c}}}{d} + O(d^{-M}) \tag{76} \]
\[ = 1 - \frac{1}{d} + O_d\Big( \frac{(\log d)^{1/2}}{d^{3/2}} \Big), \tag{77} \]
where the first equality follows from Lemma C.1, the second from (61), and the last from Lemma C.2.

Next let $f(x) = (1 + x)^{-1/2}$. Note that $\sup_{x \in B} |f'''(x)| = O(d^{-7/2})$. Hence by Taylor's theorem with the Lagrange form of the remainder (for $\xi$ a point in $B$), we have
\[ E\big\{ f(\mathsf{c})\, 1\{\mathsf{c} \in B\} \big\} = E\Big\{ \Big[ f(\bar{\mathsf{c}}) + f'(\bar{\mathsf{c}}) (\mathsf{c} - \bar{\mathsf{c}}) + \frac{1}{2} f''(\bar{\mathsf{c}}) (\mathsf{c} - \bar{\mathsf{c}})^2 + \frac{1}{6} f'''(\xi) (\mathsf{c} - \bar{\mathsf{c}})^3 \Big] 1\{\mathsf{c} \in B\} \Big\} \tag{78} \]
\[ \stackrel{(a)}{=} E\Big\{ f(\bar{\mathsf{c}}) + f'(\bar{\mathsf{c}}) (\mathsf{c} - \bar{\mathsf{c}}) + \frac{1}{2} f''(\bar{\mathsf{c}}) (\mathsf{c} - \bar{\mathsf{c}})^2 \Big\} + O(d^{-M+1}) + O\Big( \frac{(\log d)^{3/2}}{d^2} \Big) \tag{79} \]
\[ \stackrel{(b)}{=} \frac{1}{(1 + \bar{\mathsf{c}})^{1/2}} + \frac{3}{8} \frac{1}{(1 + \bar{\mathsf{c}})^{5/2}} \mathrm{Var}(\mathsf{c}) + O\Big( \frac{(\log d)^{3/2}}{d^2} \Big) \tag{80} \]
\[ \stackrel{(c)}{=} \frac{1}{d^{1/2}} + \frac{3}{8 d^{3/2}} + O\Big( \frac{(\log d)^{3/2}}{d^2} \Big). \tag{81} \]
Here (a) follows from Lemma C.1 and the above upper bound on $|f'''(x)|$, (b) by taking $M \ge 3$, and (c) from Lemma C.2.

The proof is completed by substituting the estimates (77) and (81) in (73).

References

[ABC+15] Pranjal Awasthi, Afonso S Bandeira, Moses Charikar, Ravishankar Krishnaswamy, Soledad Villar, and Rachel Ward. Relax, no need to round: Integrality of clustering formulations. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, pages 191–200. ACM, 2015.
[ABH16] Emmanuel Abbe, Afonso S Bandeira, and Georgina Hall. Exact recovery in the stochastic block model. IEEE Transactions on Information Theory, 62(1):471–487, 2016.
[AL07] David Aldous and Russell Lyons. Processes on unimodular random networks. Electron. J. Probab, 12(54):1454–1508, 2007.
[AS04] David Aldous and J Michael Steele. The objective method: Probabilistic combinatorial optimization and local weak convergence. In Probability on discrete structures, pages 1–72. Springer, 2004.
[AV11] Brendan PW Ames and Stephen A Vavasis. Nuclear norm minimization for the planted clique and biclique problems.
Mathematical programming, 129(1):69–89, 2011.
[Bas92] Hyman Bass. The Ihara-Selberg zeta function of a tree lattice. International Journal of Mathematics, 3(06):717–797, 1992.
[BCSZ14] Afonso S Bandeira, Moses Charikar, Amit Singer, and Andy Zhu. Multireference alignment using semidefinite programming. In Proceedings of the 5th conference on Innovations in theoretical computer science, pages 459–470. ACM, 2014.
[BHK+16] Boaz Barak, Samuel B Hopkins, Jonathan Kelner, Pravesh K Kothari, Ankur Moitra, and Aaron Potechin. A nearly tight sum-of-squares lower bound for the planted clique problem. 2016.
[BLM13] Stéphane Boucheron, Gábor Lugosi, and Pascal Massart. Concentration inequalities: A nonasymptotic theory of independence. Oxford University Press, 2013.
[BLM15] Charles Bordenave, Marc Lelarge, and Laurent Massoulié. Non-backtracking spectrum of random graphs: community detection and non-regular Ramanujan graphs. In Foundations of Computer Science (FOCS), 2015 IEEE 56th Annual Symposium on, pages 1347–1357. IEEE, 2015.
[BLS15] Itai Benjamini, Russell Lyons, and Oded Schramm. Unimodular random trees. Ergodic Theory and Dynamical Systems, 35(02):359–373, 2015.
[BM03] Samuel Burer and Renato DC Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, 2003.
[BS01] Itai Benjamini and Oded Schramm. Recurrence of distributional limits of finite planar graphs. Electron. J. Probab., 6:13 pp., 2001.
[CGHV15] Endre Csóka, Balázs Gerencsér, Viktor Harangi, and Bálint Virág. Invariant gaussian processes and independent sets on regular graphs of large girth. Random Structures & Algorithms, 47:284–303, 2015.
[DKMZ11] Aurelien Decelle, Florent Krzakala, Cristopher Moore, and Lenka Zdeborová. Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E, 84(6):066106, 2011.
[dlP99] Victor H de la Pena. A general class of exponential inequalities for martingales and ratios. The Annals of Probability, 27(1):537–564, 1999.
[DM15a] Yash Deshpande and Andrea Montanari. Finding hidden cliques of size $\sqrt{N/e}$ in nearly linear time. Foundations of Computational Mathematics, 15(4):1069–1128, 2015.
[DM15b] Yash Deshpande and Andrea Montanari. Improved sum-of-squares lower bounds for hidden clique and hidden submatrix problems. In Proceedings of The 28th Conference on Learning Theory, pages 523–562, 2015.
[Ele10] Gábor Elek. On the limit of large girth graph sequences. Combinatorica, 30(5):553–563, 2010.
[FK00] Uriel Feige and Robert Krauthgamer. Finding and certifying a large hidden clique in a semirandom graph. Random Structures and Algorithms, 16(2):195–208, 2000.
[GS14] David Gamarnik and Madhu Sudan. Limits of local algorithms over sparse random graphs. In Proceedings of the 5th conference on Innovations in theoretical computer science, pages 369–376. ACM, 2014.
[GV15] Olivier Guédon and Roman Vershynin. Community detection in sparse networks via Grothendieck's inequality. Probability Theory and Related Fields, pages 1–25, 2015.
[GW95] Michel X Goemans and David P Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM (JACM), 42(6):1115–1145, 1995.
[HLS14] Hamed Hatami, László Lovász, and Balázs Szegedy. Limits of locally–globally convergent graph sequences. Geometric and Functional Analysis, 24(1):269–296, 2014.
[HWX15] Bruce Hajek, Yihong Wu, and Jiaming Xu. Achieving exact cluster recovery threshold via semidefinite programming: Extensions. 2015.
[HWX16] Bruce Hajek, Yihong Wu, and Jiaming Xu.
Achieving exact cluster recov ery threshold via semidefinite programming. IEEE T r ansactions on Information The ory , 62(5):2788– 2797, 2016. [Jer92] Mark Jerrum. Large cliques elude the metrop olis pro cess. R andom Structur es & Algo- rithms , 3(4):347–359, 1992. 46 [JLR11] Sv ante Janson, T omasz Luczak, and Andrzej Rucinski. R andom gr aphs , volume 45. John Wiley & Sons, 2011. [JMR T16] Adel Jav anmard, Andrea Montanari, and F ederico Ricci-T ersenghi. Phase transi- tions in semidefinite relaxations. Pr o c e e dings of the National A c ademy of Scienc es , 113(16):E2218–E2223, 2016. [KMM + 13] Floren t Krzak ala, Cristopher Mo ore, Elchan an Mossel, Jo e Neeman, Allan Sly , Lenk a Zdeb oro v´ a, and Pan Zhang. Sp ectral redemption in clustering sparse net works. Pr o- c e e dings of the National A c ademy of Scienc es , 110(52):20935–20940, 2013. [KS00] Motoko Kotani and T oshik azu Sunada. Zeta functions of finite graphs. J. Math. Sci. Univ. T okyo , 7:7–25, 2000. [LPP95] Russell Lyons, Robin P emantle, and Y uv al Peres. Ergo dic theory on galtonwatson trees: Sp eed of random walk and dimension of harmonic measure. Er go dic The ory and Dynamic al Systems , 15(03):593–619, 1995. [LPP97] Russell Lyons, Robin Peman tle, and Y uv al Peres. Unsolv ed problems concerning ran- dom walks on trees. In Classic al and mo dern br anching pr o c esses , pages 223–237. Springer, 1997. [Ly o14] Russell Lyons. F actors of iid on trees. , 2014. [Mas14] Lauren t Massouli ´ e. Communit y detection thresholds and the weak Ramanujan prop- ert y . In Pr o c e e dings of the 46th Annual ACM Symp osium on The ory of Computing , pages 694–703. A CM, 2014. [MNS12] Elc hanan Mossel, Jo e Neeman, and Allan Sly . Stochastic blo c k mo dels and reconstruc- tion. , 2012. [MNS13] Elc hanan Mossel, Jo e Neeman, and Allan Sly . A pro of of the blo c k mo del threshold conjecture. , 2013. [Mon15] Andrea Mon tanari. Finding one communit y in a sparse graph. 
Journal of Statistic al Physics , 161(2):273–299, 2015. [MPW15] Ragh u Mek a, Aaron Potec hin, and Avi Wigderson. Sum-of-squares low er b ounds for plan ted clique. In Pr o c e e dings of the F orty-Seventh Annual A CM on Symp osium on The ory of Computing , pages 87–96. ACM, 2015. [MS16] Andrea Mon tanari and Subhabrata Sen. Semidefinite programs on sparse random graphs and their application to comm unity detection. In Pr o c e e dings of the 48th Annual A CM SIGACT Symp osium on The ory of Computing , pages 814–827. ACM, 2016. [MX16] Elc hanan Mossel and Jiaming Xu. Lo cal algorithms for blo c k mo dels with side infor- mation. In Pr o c e e dings of the 2016 ACM Confer enc e on Innovations in The or etic al Computer Scienc e , pages 71–80. ACM, 2016. [NN13] Y urii Nesterov and Ark adi Nemirovski. On first-order algorithms for l 1/n uclear norm minimization. A cta Numeric a , 22:509–575, 2013. 47 [R V14] Mustazee Rahman and Balint Virag. Local algorithms for independent sets are half- optimal. , 2014. [SKZ14] Alaa Saade, Florent Krzak ala, and Lenk a Zdeb oro v´ a. Spectral clustering of graphs with the Bethe Hessian. In A dvanc es in Neur al Information Pr o c essing Systems , pages 406–414, 2014. [Suo13] Jukk a Suomela. Surv ey of lo cal algorithms. A CM Computing Surveys (CSUR) , 45(2):24, 2013. [WS08] Kilian Q W einberger and Lawrence K Saul. F ast solvers and efficient implementations for distance metric learning. In Pr o c e e dings of the 25th international c onfer enc e on Machine le arning , pages 1160–1167. ACM, 2008. 48