Almost-Uniform Edge Sampling: Leveraging Independent-Set and Local Graph Queries

Tomer Adar*   Amit Levi†

March 18, 2026

Abstract

A central theme in sublinear graph algorithms is the relationship between counting and sampling: can the ability to approximately count a combinatorial structure be leveraged to sample it nearly uniformly at essentially the same cost? We study (i) independent-set (IS) queries, which return whether a vertex set S is edge-free, and (ii) two standard local queries: degree and neighbor queries. Eden and Rosenbaum (SOSA '18) proved that in the local-query model, uniform edge sampling is no harder than approximate edge counting. We extend this phenomenon to new settings. We establish a sampling–counting equivalence for the hybrid model that combines IS and local queries, matching the complexity of edge-count estimation achieved by Adar, Hotam and Levi (2026), and an analogous equivalence for IS queries, matching the complexity of edge-count estimation achieved by Chen, Levi and Waingarten (SODA '20). For each query model, we show lower bounds for uniform edge sampling that essentially coincide with the known bounds for approximate edge counting.

* Technion - Israel Institute of Technology, Israel. Email: tomer-adar@campus.technion.ac.il.
† University of Haifa, Israel. Email: alevi@cs.haifa.ac.il.

Contents

1 Introduction
  1.1 Our results
  1.2 Related work
2 Conceptual overview
  2.1 Additional details
3 Preliminaries
  3.1 Notation scheme
  3.2 A few technical lemmas and tools
4 Common samplers
  4.1 Lone-edge sampler
  4.2 Star-vertex sampler
  4.3 Factor lower bounds
5 Elementary procedures
  5.1 Estimating the inverse of an indicator
  5.2 Extracting an edge
  5.3 Brute-force sampler
6 Using both IS and local queries
  6.1 Sampling L-L edges
  6.2 Sampling medium-high vertices
  6.3 Sampling L-MH edges (for low m̃)
  6.4 Sampling L-M edges (for high m̃)
  6.5 The tininess factor
  6.6 Sampling L-H edges (for high m̃)
  6.7 Sampling MH-MH edges
  6.8 Sampling all edges
7 Using the IS oracle
  7.1 Sampling a neighbor
  7.2 Categorizing the degree
  7.3 Sampling low-low edges
  7.4 Sampling high-low and high-high edges
  7.5 Sampling all edges
8 Lower bounds
  8.1 The IS lower-bound construction
References

1 Introduction

A fundamental question in the field of sublinear-time graph algorithms concerns the relationship between approximate counting and uniform sampling (see e.g., [ER18, ERR19, AKK19, FGP20, EMR21, DLM22, AMM22, ENT23, ELRR25]). Can the ability to estimate the size of a combinatorial structure, such as the number of edges of a graph G = (V, E), be leveraged to sample an element of that structure nearly uniformly at essentially the same cost?

The earliest investigation of this relationship focused on the case where the combinatorial structure is the set of edges E of an unknown graph G. In this setting, the relationship was first explored within the local query model. In this model, the algorithm is granted access to the graph's adjacency list, allowing it to perform degree queries (returning the degree of a vertex v) and neighbor queries (returning the i-th neighbor of a vertex v). For a given ε ∈ (0, 1), the tasks are formally defined as follows:

• Approximate counting: With high probability the algorithm outputs a value m̃ satisfying (1 − ε)|E| ≤ m̃ ≤ (1 + ε)|E|.

• Almost-uniform sampling: With high probability the algorithm outputs ē such that for every edge e ∈ E, (1 − ε)/|E| ≤ Pr[ē = e] ≤ (1 + ε)/|E|.

While early edge-sampling methods often required significantly higher complexity than counting [KKR04], a sequence of works [ER18, TT22, ENT23] established that the query complexity of almost-uniform edge sampling is essentially equivalent to the query complexity of approximate edge counting (see [GR08]). Both tasks share query complexity Θ̃(n/√m), where n is the number of nodes and m is the number of edges in G, and the tilde notation suppresses polynomial factors in log n and 1/ε.
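For intuition about the local-query model, the classical warm-up is rejection sampling over (vertex, index) pairs. The sketch below (with our own toy graph and oracle names; this is not the Θ̃(n/√m) algorithm of [ER18], whose refinements are precisely what avoid the poor success rate of this approach on skewed degree sequences) returns each edge with equal probability per attempt using one degree query and at most one neighbor query.

```python
import random

def sample_edge_local(degree, neighbor, n, d_max, rng):
    """Textbook rejection sampler in the local-query model: pick a uniform
    vertex v and a uniform index j in [1, d_max]; succeed only if
    j <= deg(v).  Conditioned on success, every ordered pair
    (v, j-th neighbor of v) is equally likely, so every undirected edge is
    returned with probability exactly 2/(n*d_max) per attempt."""
    v = rng.randrange(n)
    j = rng.randrange(1, d_max + 1)
    if j > degree(v):
        return None  # reject; the caller simply retries
    return (v, neighbor(v, j))

# Toy graph: a triangle on vertices 0,1,2 plus an isolated vertex 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}
degree = lambda v: len(adj[v])
neighbor = lambda v, j: adj[v][j - 1]  # 1-indexed neighbor query

rng = random.Random(0)
samples = [sample_edge_local(degree, neighbor, 4, 2, rng) for _ in range(5000)]
edges_seen = {frozenset(e) for e in samples if e is not None}
```

Conditioned on success the sample is exactly uniform here; the cost of this naive scheme is that the rejection rate grows with n·d_max/m, which is what the works cited above improve on.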
Beame, Har-Peled, Ramamoorthy, Rashtchian, and Sinha [BHPR+20] introduced a notably different, global form of access: the independent-set (IS) oracle. In this model, an algorithm may query any subset S ⊆ V, and the oracle responds with a single bit indicating whether S induces an edge-free subgraph. Unlike local inspection, IS queries do not reveal the identity or specific location of edges. Instead, they provide coarse, global information about the presence or absence of edges within a queried set. This model can be viewed as a form of combinatorial group testing, where a single query probes a large portion of the graph simultaneously. As such, IS queries expose structural information that is fundamentally inaccessible through purely local inspection. The IS model and its variants have received significant attention in the past years [BBGM19, BGMP19, BBGM21, DL21, AMM22, DLM22, DLM24] (see more information in Section 1.2).

In this model, the complexity of approximate edge counting was settled by Chen, Levi and Waingarten [CLW20] at Θ̃(min{√m, n/√m}). Most recently, Adar, Hotam and Levi [AHL26] considered a hybrid model which integrates global IS queries with local queries. They showed that the query complexity of approximate edge counting is essentially Θ̃(min{√m, √(n/√m)}). Nonetheless, the corresponding sampling problems in both the IS and the hybrid models remained open.

1.1 Our results

In this paper, we bridge this gap by establishing that the sampling–counting equivalence extends to both the IS and hybrid query models. In particular, we first present an efficient algorithm for edge sampling in the hybrid model.

Theorem 1.1 (Rephrase of Lemma 6.11).
There exists an algorithm Sample-Edge-Hybrid(G, ε), where G is a graph over n vertices (known to the algorithm) and m edges (unknown to the algorithm) and ε > 0, that makes O(R log n log(n/ε)) degree, neighbor and independent-set queries, where R = min{√m, √(n/√m)}, and whose random output X satisfies:

• Pr[X ≠ reject] ≥ 2/3.

• For every edge e in G, Pr[X = e | X ≠ reject] ∈ (1 ± ε)/m.

We complement this algorithmic result by showing that the query complexity of Sample-Edge-Hybrid is optimal within the hybrid model, up to logarithmic factors.

Theorem 1.2 (Rephrase of Lemma 8.5). Any algorithm whose behavior matches the guarantees of Theorem 1.1 must make Ω(min{√m, √(n/√m)}) queries in expectation.

Next, we shift our focus to the more restricted independent-set query model.

Theorem 1.3 (Rephrase of Lemma 7.10). There exists an algorithm Sample-Edge-IS(G, ε), where G is a graph over n vertices (known to the algorithm) and m edges (unknown to the algorithm) and ε > 0, that makes O(R · poly(log(n/ε))) independent-set queries, where R = min{√m, n/√m}, and whose random output X satisfies:

• Pr[X ≠ reject] ≥ 2/3.

• For every edge e in G, Pr[X = e | X ≠ reject] ∈ (1 ± ε)/m.

Finally, we establish a corresponding lower bound for the IS model, confirming that the dependence on m and n in Theorem 1.3 is essentially tight.

Theorem 1.4 (Rephrase of Lemma 8.7). Any algorithm whose behavior matches the guarantees of Theorem 1.3 must make Ω(min{√m, n/√m}) queries in expectation.

1.2 Related work

Since the initial work in the local model, the problem of almost-uniform edge sampling has been extended to various settings. Eden, Ron, and Rosenbaum [ERR19] showed that if a graph has bounded arboricity¹ α, the query complexity for almost-uniform sampling can be improved to Õ(αn/m).
Eden, Mossel and Rubinfeld [EMR21] gave an algorithm which samples t edges using Õ(√t · n/√m) queries – a notable improvement compared to naively using t invocations of the edge sampler. This query complexity was later proved to be optimal by [TT22].

¹ Note that since α ≤ √m for every G with m edges, this result is at least as good as the result in [ER18, TT22, ENT23].

Beyond the specific focus on edge estimation, several recent works have explored the broader problem of counting and sampling arbitrary subgraphs H within a variety of expanded query models. [ERR22] extended the results of [ERR19] to sampling k-cliques in the local model. In particular, their results show that for k-cliques, sampling is harder than counting. In contrast, [ELRR25] showed that for Hamiltonian subgraphs the complexity of counting and sampling coincides.

Recent research has expanded the investigation of the sampling–counting relationship to arbitrary subgraphs H, utilizing various combinations of degree, neighbor, and sampling oracles to establish complexity bounds based on structural parameters like fractional edge covers and decomposition costs [ABG+18, AKK19, FGP20, BER21]. Notably, these works emphasize that the cost of uniform subgraph sampling can often be made to match the complexity of approximate counting, even as the target structures become more complex.

The introduction of the independent-set oracle has spurred a broader investigation into global query primitives. [BHPR+20] introduced the Bipartite Independent Set (BIS) oracle, which queries whether an edge exists between two disjoint vertex subsets, and used it to provide bounds for edge estimation. These bounds were later strengthened by [DLM22], and [AMM22] considered non-adaptive settings for both counting and sampling.
This model has since been extended to more general combinatorial frameworks [BBGM19, BGMP19, BBGM21]. Furthermore, [DL21, DLM22, DLM24] investigated the IS model alongside the Colorful Independence Oracle – a generalization of the Bipartite Independent Set (BIS) oracle – and developed efficient algorithms for approximate edge counting in hypergraphs. Their approach establishes fine-grained reductions from approximate counting to decision problems with minimal overhead. In particular, this work provides a framework for converting decision oracles into approximate counting and sampling algorithms for k-uniform hypergraphs and small witnesses, such as k-cliques.

Closely related is the work of Beretta, Chakrabarty, and Seshadhri [BCS26], who investigate edge counting in a model similar to our hybrid setting but with a more powerful "global" oracle.

2 Conceptual overview

In this work, we present two algorithms for edge sampling: a hybrid algorithm, which leverages both local (degree and neighbor) and independent-set (IS) queries, and a pure IS algorithm, which relies exclusively on independent-set queries. We begin by presenting the hybrid algorithm. While it can be viewed as a synthesis of the local sampling techniques from [ER18] and our IS-based approach, the pure IS algorithm is arguably more technically demanding. Although the IS model is conceptually simpler, it requires us to statistically estimate vertex degree categories rather than retrieving them deterministically through local queries.

The core strategy, following the framework in [ER18], involves partitioning vertices into three degree categories – Low (L), Medium (M), and High (H) – and sampling edges within each category independently. We classify edges based on the degree profiles of their endpoints.
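Throughout the overview it helps to keep a concrete picture of the IS oracle in mind. The following minimal simulator (our own code, used purely for illustration; a real oracle would of course not enumerate pairs) answers a single bit per query: whether the queried vertex set induces any edge.

```python
def make_is_oracle(edges):
    """Simulated independent-set (IS) oracle: given a graph's edge list,
    return a function that answers True iff the queried vertex set
    induces no edge of the graph."""
    edge_set = {frozenset(e) for e in edges}
    def is_free(S):
        S = sorted(set(S))
        return all(frozenset((S[i], S[j])) not in edge_set
                   for i in range(len(S)) for j in range(i + 1, len(S)))
    return is_free

# Path graph 0-1-2: {0, 2} is independent, {0, 1} is not.
is_free = make_is_oracle([(0, 1), (1, 2)])
```

Every IS-based procedure in this paper can be prototyped against such a simulator before the query count is analyzed.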
To facilitate this, we consider three "natural" edge-sampling primitives, each of which may fail to return an edge in a single execution:

• Sparse sampler: A sparse subset of vertices is drawn; if the induced subgraph contains exactly one edge, that edge is returned as the sample.

• Path sampler: A vertex is selected as a starting point for a short random walk, with the final edge traversed serving as the sample.

• Link sampler: Two vertices are drawn independently, and an IS query is used to determine if they are adjacent.

In both the path and link samplers, we have an additional degree of freedom: the initial vertex distribution. We leverage the global reach of independent-set queries to prioritize MH-vertices.

Each edge category is associated with the sampling primitive best suited to it. In the general case – specifically, when the graph is not excessively sparse – we employ the following strategy.

• For L–L edges: We utilize the sparse sampler, as the low density of these edges allows a sparse vertex subset to isolate a single edge with sufficient probability.

• For L–M edges: We employ a one-step path sampler. To ensure efficiency, the starting vertex is sampled from a distribution that prioritizes MH-vertices, thereby increasing the likelihood that the single-edge walk terminates at an L-vertex.

• For L–H edges: We use a two-step path sampler starting from a uniform vertex distribution, following the approach in [ER18]. This length-2 walk is necessary to "reach" the relatively rare L–H edges from a random starting point.

• For MH–MH edges: We use the link sampler. By biasing the vertex selection toward MH-vertices, we significantly increase the probability that two independently drawn vertices are adjacent.
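Of the three primitives, the link sampler is the easiest to state in code. The sketch below (our own toy code with a uniform vertex distribution and a brute-force stand-in for the IS oracle, whereas the algorithm biases the draw toward MH-vertices) shows the single-IS-query adjacency check.

```python
import random

def link_sampler(is_free, vertices, rng):
    """Link sampler: draw two vertices independently and spend one IS
    query on the pair; the pair fails to be independent iff it forms an
    edge, which is then returned."""
    u, v = rng.choice(vertices), rng.choice(vertices)
    if u != v and not is_free({u, v}):
        return frozenset((u, v))
    return None  # reject; a single execution may fail

# Brute-force IS oracle for the path graph 0-1-2 (vertex 3 is isolated).
edge_set = {frozenset(e) for e in [(0, 1), (1, 2)]}
is_free = lambda S: all(frozenset((a, b)) not in edge_set
                        for a in S for b in S if a < b)
rng = random.Random(1)
found = {e for e in (link_sampler(is_free, [0, 1, 2, 3], rng)
                     for _ in range(3000)) if e is not None}
```

With uniform draws each edge is hit with probability 2/n² per attempt; biasing both draws toward MH-vertices is what makes this primitive efficient for MH–MH edges.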
If the graph is sparse, m = O(n^{2/3}), the optimal sampler for L–H edges is no longer the two-step walk from [ER18], but rather the targeted one-step sampler used for L–M edges. The threshold m ≈ n^{2/3} is a natural transition point, as it represents the "break-even" equilibrium between √m and √(n/√m), the two terms that govern our query complexity.

Factors. We introduce four factors that characterize the bias of the natural samplers described above.

The first factor is the loneliness factor, arising in the analysis of the sparse sampler. If we draw a set of density p, then we expect to have p²m edges. If p²m is smaller than one but still Ω(1), then the induced subgraph has exactly one edge with constant probability. While the probability of a specific edge e being in the induced subgraph is clearly bounded by p² (which is the probability that both of e's vertices belong to the vertex set), the probability that it is the only edge is lower.² We denote this reduced probability by L_e · p², where 0 ≤ L_e ≤ 1 is the loneliness factor of e. We can derive the following definition for L_e (depending on p), assuming that e = (u, v), G = (V, E) and S ∼ Bin(V, p):

    L_e = Pr[ G_{S∖{u}} has no edges ∧ G_{S∖{v}} has no edges | u, v ∈ S ]

² Unless e is the only edge in the graph.

The second factor is the starness factor, arising in the attempt to prioritize the drawing of MH-vertices. Again, assume that we draw a set of density p where p²m is smaller than one but still Ω(1). We consider the event that there is at least one edge in the induced subgraph, and all these edges are adjacent to a common vertex (note that in the case of a single edge, there are two such "common" vertices). This event contains the high-probability event of having exactly one edge.
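Both factors of this kind are probabilities of simple events over S ∼ Bin(V, p), so they can be estimated empirically by direct Monte Carlo simulation. The sketch below (our own code, with a brute-force edge check standing in for the IS oracle) draws one Bernoulli trial of the loneliness event for a pair {u, v}, conditioning on u, v ∈ S by forcing them into the set.

```python
import random

def loneliness_event(edges, vertices, p, u, v, rng):
    """One Bernoulli trial whose mean is the loneliness factor L_{u,v}:
    draw S ~ Bin(V, p) conditioned on u, v in S, and accept iff neither
    S\\{u} nor S\\{v} induces an edge."""
    edge_set = {frozenset(e) for e in edges}
    def edge_free(T):
        T = sorted(T)
        return all(frozenset((T[i], T[j])) not in edge_set
                   for i in range(len(T)) for j in range(i + 1, len(T)))
    S = {u, v} | {w for w in vertices if rng.random() < p}
    return edge_free(S - {u}) and edge_free(S - {v})

# If uv is the only edge of the graph, the pair {u, v} is lonely in every
# draw, so every trial accepts and L_{u,v} = 1.
rng = random.Random(2)
trials = [loneliness_event([(0, 1)], range(10), 0.3, 0, 1, rng)
          for _ in range(200)]
```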
Consider the following process: first, we draw a sparse vertex set S as before, and then we uniformly choose a vertex adjacent to all edges (if such a vertex exists and there is at least one edge). If the induced subgraph contains at least two edges, then such a vertex (if it exists) is unique, and therefore we choose it with probability 1. We refer to such a vertex as a full-star vertex. If the induced subgraph contains a single edge, then each of its endpoint vertices is chosen with probability 1/2. These vertices are referred to as semi-star vertices.

The probability of a vertex u being chosen according to this process is bounded by p (which is the probability that u belongs to the vertex set), and can be denoted by S_u · p, where 0 ≤ S_u ≤ 1 is the starness factor of u. We can derive the following definition for S_u (depending on p), assuming that G = (V, E) and S ∼ Bin(V, p):

    S_u = (1/2) · Pr[ G_{S∖{u}} has no edges ∧ |S ∩ N_u| = 1 ] + Pr[ G_{S∖{u}} has no edges ∧ |S ∩ N_u| ≥ 2 ].

We have two additional ad-hoc factors: the "tininess factor" of a vertex, which is only used in the analysis of the hybrid algorithm, is the fraction of tiny-degree neighbors; and the "neighborhood factor" of a vertex, which is only used in the analysis of the IS algorithm, is the probability of drawing a neighbor under specific constraints.

Canceling the factors. To achieve a uniform edge sample, we must normalize the output probabilities by counteracting the biases introduced by our low-level sampling primitives. Specifically, we must "cancel out" the loneliness, starness, and ad-hoc factors that skew the sampling distribution of certain edges. To this end, we introduce Estimate-Indicator-Inverse, a procedure that enables "division by E[X]" given sampling access to a Bernoulli random variable X.
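One plausible realization of such a division by E[X], consistent in shape with the guarantees later stated in Lemma 2.3 (the concrete scheme below is our own sketch, not necessarily the paper's procedure from Section 5): count the calls to A until the first success, cap the count at T ≈ ln(5/ε)/ρ, and accept with probability proportional to the count.

```python
import math
import random

def estimate_indicator_inverse(A, eps, rho, rng):
    """Sketch of 'division by E[X]': if A() succeeds with probability p_A,
    the number N of calls until the first success has mean 1/p_A.  Capping
    N at T and accepting with probability rho*N/ln(5/eps) yields a
    Bernoulli output whose mean is close to rho/(p_A*ln(5/eps)) whenever
    p_A >= rho, using at most T calls and O(1/p_A) calls in expectation."""
    T = math.ceil(math.log(5 / eps) / rho)  # worst-case call budget
    n_calls = 0
    for _ in range(T):
        n_calls += 1
        if A():
            break
    return 1 if rng.random() < rho * n_calls / math.log(5 / eps) else 0

# With an A that never succeeds, exactly T calls are made.
calls = {"n": 0}
def never():
    calls["n"] += 1
    return False
out_never = estimate_indicator_inverse(never, 1.0, 0.5, random.Random(3))

# With an A that always succeeds (p_A = 1), the empirical acceptance
# rate approaches rho / ln(5/eps).
rng = random.Random(4)
rate = sum(estimate_indicator_inverse(lambda: True, 1.0, 0.5, rng)
           for _ in range(4000)) / 4000
```

The truncation at T is what bounds the worst-case call count even when p_A < ρ, at the price of the small multiplicative error absorbed into the e^{±ε} range.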
While seemingly simple in concept, this procedure allows us to re-weight our samples to compensate for the factors.

Combining the category-specific samplers. In the hybrid algorithm, each category-specific sampler is close to uniform when conditioned on sampling an edge of its prioritized category, and cannot sample edges of the wrong category (thanks to the degree oracle, which allows us to determine the exact category of a given edge). More precisely, for every category A, we compute a mass λ_A so that every edge in the prioritized category is sampled with probability in the range e^{±ε} · λ_A. To uniformly sample an edge from the graph, we compute a mass λ, and randomly choose a category (or an immediate failure) so that each category A is chosen with probability λ/λ_A. Clearly, λ must be sufficiently small so that e^ε · λ · Σ_A (1/λ_A) ≤ 1 (to match the worst-case sampling bound). This way, each edge is sampled with probability e^{±ε} · λ, which means that the resulting sample (when conditioned on success) is O(ε)-uniform.

The IS algorithm. Conceptually, the IS algorithm can be viewed as an adaptation of the hybrid framework where the medium-degree category is subsumed into the high-degree category, and local oracle calls are simulated using independent-set queries.

From an analytical perspective, the primary challenge lies in the uncertainty of vertex categorization. Without access to a degree oracle, the algorithm cannot deterministically assign a vertex to a category. It can only perform statistical tests to estimate its degree. This necessitates a careful analysis of the algorithm's behavior for "boundary" vertices – those whose degrees lie near the threshold between the Low (L) and High (H) categories.

At the top level, the algorithm partitions edges into two primary classes:

• For L–L edges: We continue to utilize the sparse-vertex-set sampler.
• For LH–H edges: We employ the path sampler (analogous to the L–M strategy in the hybrid algorithm), where the neighbor oracle is replaced by a simulation based on independent-set queries.

In both cases, once a candidate edge is obtained, the algorithm verifies the degree categories of its endpoints. If the estimated degrees do not match the target profiles, the sample is rejected, and the algorithm declares a failure.

2.1 Additional details

Both the hybrid algorithm and the IS algorithm start by obtaining an estimation m̃ of the number of edges.

Lemma 2.1 (Theorem 1 in [AHL26] - Rephrased). There exists an algorithm whose input is a graph G = (V, E) over n vertices (known to the algorithm) and m edges (unknown to the algorithm), accessible through the degree oracle, the neighbor oracle and the independent-set oracle, whose output m̃ is in the range e^{±ε} · m with probability at least 2/3. Moreover, the expected query complexity of this algorithm is

    O( (log n / ε^{5/2}) · min{√m, √(n/√m)} ).

Lemma 2.2 (Theorem 1 in [CLW20] - Rephrased). There exists an algorithm whose input is a graph G = (V, E) over n vertices (known to the algorithm) and m edges (unknown to the algorithm), accessible through the independent-set oracle, whose output m̃ is in the range e^{±ε} · m with probability at least 2/3. Moreover, the query complexity of this algorithm is O(min{√m, n/√m} · poly(log n, 1/ε)).

We use a constant accuracy parameter, so that m̃ ∈ e^{±1/10} · m with high probability. We amplify the success probability from 2/3 to 1 − r, for r = poly(ε, 1/n), by taking the median of O(log(1/r)) = O(log(n/ε)) independent estimations. In total, the expected cost of obtaining m̃ is:

    O( min{√m, √(n/√m)} · log n · log(n/ε) )  for the hybrid algorithm,
    O( min{√m, n/√m} · poly(log n) · log(n/ε) )  for the IS algorithm.
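The amplification step above is the standard median trick; a minimal sketch (the repetition constant and the toy estimator are our own):

```python
import math
import random
import statistics

def median_amplify(estimator, r, rng):
    """Boost a constant-confidence estimator to confidence 1 - r by taking
    the median of O(log(1/r)) independent runs: the median is wrong only
    if about half of the runs are wrong, which a Chernoff bound makes
    exponentially unlikely in the number of runs."""
    k = 2 * math.ceil(9 * math.log(1 / r)) + 1  # odd number of runs
    return statistics.median(estimator(rng) for _ in range(k))

# Toy estimator: returns the true value 100 with probability 3/4
# (comfortably above the 2/3 guarantee), and garbage otherwise.
def noisy(rng):
    return 100 if rng.random() < 0.75 else rng.choice([0, 10 ** 6])

amplified = median_amplify(noisy, 1e-6, random.Random(5))
```

Note that the median, unlike the mean, is immune to the unbounded garbage values the failing runs may produce.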
Based on the estimation m̃ we define the threshold degrees:

• k₁ = ∛(m̃^{3/2}/n) – an upper bound for tiny degrees in the non-sparse case of the hybrid algorithm.

• k₂ = √m̃ – an upper bound for low degrees.

• k₃ = √(n·√m̃) – an upper bound for medium degrees in the hybrid algorithm.

More precisely, the degree ranges in the hybrid algorithm are [0, k₂] for low, (k₂, k₃] for medium and (k₃, n − 1] for high. In the IS algorithm, the high range is (k₂, n − 1].

Note that, up to poly-logarithmic factors, k₂ corresponds to the threshold degree in [ER18] (sampling, local queries) and [CLW20] (counting, IS queries), and k₃ corresponds to the threshold degree in [AHL26] (counting, local+IS queries).

Estimating an indicator inverse. The following lemma formalizes the mechanism of "division by E[X]" when X is a Bernoulli variable accessible through samples.

Lemma 2.3. Consider a black-box procedure A that results in an indicator. Let p_A be the expected value of this indicator. There exists a procedure Estimate-Indicator-Inverse(A, ε, ρ) that makes O(1/p_A) calls in expectation to A and:

• The outcome is a Bernoulli variable.

• If p_A ≥ ρ, then the expected output is in the range e^{±ε} · ρ/(p_A · ln(5/ε)).

Moreover, the worst-case number of calls is O(log(1/ε)/ρ) (even if p_A < ρ).

We prove Lemma 2.3 in Section 5.

Finding an edge in a non-independent set. Some of our procedures require the ability to obtain an arbitrary edge from a given non-independent vertex set.

Lemma 2.4. There exists a deterministic procedure Extract-Edge-IS(G; S) whose input is a non-empty vertex set S of a graph G, that makes O(1 + log |S|) independent-set queries (worst-case) and returns an arbitrary edge e between two S-vertices, if such an edge exists, or reject otherwise.

Note that this procedure also appears in [BHPR+20, CLW20].
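A sketch of the well-known halving strategy behind Lemma 2.4 (our own reconstruction in the spirit of [BHPR+20, CLW20]; the paper's exact version appears in Section 5): if S is not independent, either one half of S already induces an edge and we recurse into it, or both halves are independent and every edge crosses them, in which case two binary searches pin down one endpoint on each side.

```python
def extract_edge_is(is_free, S):
    """Return an edge inside S using O(1 + log|S|) IS queries, or None."""
    S = list(S)
    if is_free(S):
        return None
    A, B = S[:len(S) // 2], S[len(S) // 2:]
    if not is_free(A):
        return extract_edge_is(is_free, A)
    if not is_free(B):
        return extract_edge_is(is_free, B)
    # Both halves are independent, so some edge crosses A and B.
    u = _find_endpoint(is_free, A, B)    # u in A with a neighbor in B
    v = _find_endpoint(is_free, B, [u])  # its neighbor in B
    return (u, v)

def _find_endpoint(is_free, A, context):
    """Invariant: A and context are edge-free but A+context is not, so
    some vertex of A has a neighbor in context; halve A to find it."""
    while len(A) > 1:
        half = A[:len(A) // 2]
        if not is_free(half + list(context)):
            A = half
        else:
            A = A[len(A) // 2:]
    return A[0]

# Brute-force IS oracle for a graph with the single edge {2, 5}.
edge_set = {frozenset((2, 5))}
is_free = lambda S: all(frozenset((a, b)) not in edge_set
                        for a in S for b in S if a < b)
found = extract_edge_is(is_free, range(8))
```

Each halving step costs O(1) IS queries and the two endpoint searches cost O(log|S|) each, matching the lemma's bound.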
For the sake of self-containment, we provide a version of the algorithm in Section 5.

3 Preliminaries

3.1 Notation scheme

Graph notations. The input graph is usually denoted by G = (V, E). It has n = |V| vertices and m = |E| edges. At some point we define sets categorizing vertices by degree: L (low), M (medium) and H (high). We use E_{A,B} to denote the set of edges between category A and category B. For a vertex set S, we use E_S to denote the set of edges in the subgraph induced on the vertices of S, and G_S to denote this induced subgraph.

Set of neighbors. We use N_u = {v ∈ V : the edge uv exists} to denote the set of neighbor vertices of u.

Random subsets. When drawing a random subset S ⊆ U with density p, every element in U belongs to S with probability p, independently. This drawing is also denoted by Bin(U, p).

Range notations. An expression of the form ±α indicates the range between −α and α. Composite ranges are defined elementwise (for example, the implicit range notation exp([min, max]) indicates the range [exp(min), exp(max)]). Multiple occurrences of ± in the same expression are parsed individually.

Multiplicative errors. We denote bounded multiplicative errors by an e^{±ε}-factor. Note that this is slightly different from the usual notation of a (1 ± ε)-factor. Observe that e^{±ε/2} ⊆ 1 ± ε ⊆ e^{±2ε} for a sufficiently small ε, and therefore both notations are essentially the same up to a constant factor.

Pseudo-code conventions. In a few algorithms we use the following phrases:

• Filter by [calling a procedure whose output is a Bernoulli variable].

• Proceed with probability p.

The first form requires calling the specified procedure and terminating if the result is 0 (or reject), and the second form terminates with probability 1 − p. In both cases, termination is done by returning a value indicating failure (usually reject).
We believe that these declarative phrases improve readability over the purely imperative if-then scheme.

Query model. Across the paper, every procedure indicates the graph oracles it may access as part of its name (IS, Local, Hybrid).

Definition 3.1 (An ε-uniform distribution). A distribution μ over a discrete domain Ω is ε-uniform if μ(x) ∈ e^{±ε}/|Ω| for every x ∈ Ω.

Definition 3.2 (Edge-sampling algorithm). An algorithm is a (λ, ε)-edge-sampling algorithm if, for every input graph G that has at least one edge:

• With probability at least λ, the output is not a declaration of failure.

• When conditioned on success, the output of the algorithm is ε-uniform over the edge set of G.

3.2 A few technical lemmas and tools

The following technical lemmas are well known and easily verifiable (either explicitly or using automated tools such as Wolfram Alpha).

Lemma 3.3 (Technical lemma). For every x, 1 + x ≤ e^x.

Lemma 3.4 (Technical lemma). For every x ≥ 0, e^{−x} ≤ 1 − x + x²/2.

Lemma 3.5 (Technical lemma). For every x ≥ −1 and y ≥ 1, (1 + x)^y ≥ (1 − x²y)·e^{xy}.

Lemma 3.6 (Chernoff bounds). Let X be a random variable distributed binomially or according to a Poisson distribution. Then:

    [0 < δ ≤ 1]  Pr[X ≤ (1 − δ)E[X]] ≤ e^{−δ²E[X]/2}

    [δ > 0]      Pr[X ≥ (1 + δ)E[X]] ≤ e^{−δ²E[X]/(2+δ)} ≤ e^{−δ²E[X]/3} if δ ≤ 1, and ≤ e^{−δE[X]/3} if δ ≥ 1.

4 Common samplers

In this section we provide two samplers that are used in both the hybrid algorithm and the IS algorithm. The other samplers mentioned in the overview are described within each algorithm's detail section.

4.1 Lone-edge sampler

The lone-edge sampler, which corresponds to the "sparse-vertex-set sampler" in the conceptual overview, draws a vertex set S of density p = 1/(10√m̃). If the induced subgraph G_S contains exactly one edge, the algorithm returns it. Otherwise it returns reject.
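In executable form, the sampler just described can be simulated directly (our own Python rendering with a brute-force edge count standing in for the IS queries; the paper's IS-query version is given as Algorithm 1 below):

```python
import random

def sample_lone_edge(edges, vertices, m_est, rng):
    """Draw S with density p = 1/(10*sqrt(m_est)); return the induced edge
    iff G_S contains exactly one edge, and reject (None) otherwise."""
    p = 1 / (10 * m_est ** 0.5)
    S = {v for v in vertices if rng.random() < p}
    induced = [e for e in edges if e[0] in S and e[1] in S]
    return frozenset(induced[0]) if len(induced) == 1 else None

# A star at vertex 0 plus a disjoint edge, on 40 vertices (m = 4, p = 1/20).
edges = [(0, 1), (0, 2), (0, 3), (10, 11)]
rng = random.Random(6)
hits = [e for e in (sample_lone_edge(edges, range(40), 4, rng)
                    for _ in range(40000)) if e is not None]
```

As the analysis below makes precise, each edge uv is returned with probability p²·L_{u,v}, so rejection is the common outcome and the per-edge bias is exactly the loneliness factor.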
Algorithm 1 (Sample-Lone-Edge-IS) provides the pseudo code for the sparse-set sampler. In the following we define L_e(m̃), the loneliness factor of e (as a function of m̃), and show that the probability of sampling e using this algorithm is exactly (1/(100m̃))·L_e(m̃).

Algorithm 1: Procedure Sample-Lone-Edge-IS(G, n; m̃)
Output: Every edge uv ∈ E is returned with probability exactly (1/(100m̃))·L_{u,v}(m̃).
Complexity: O(log n) worst-case.
1. Let p ← 1/(10√m̃).
2. Let S ⊆ V be a set that every vertex belongs to with probability p, i.i.d.
3. If IS(S) accepts: (no edges)
   (a) Return reject.
4. Let uv ← Extract-Edge-IS(G; S).
5. If Test-Loneliness-IS(S; u, v) accepts: (uv is the only edge in S)
   (a) Return uv.
6. Return reject. (more than one edge in S)

We say that an unordered pair {u, v} (regardless of whether it forms an edge) is lonely in a vertex set S if there are no edges in S ∪ {u, v}, except possibly the edge uv itself. Algorithm 2 (Test-Loneliness-IS) determines whether the pair {u, v} is lonely in a vertex set S, directly by definition.

Algorithm 2: Procedure Test-Loneliness-IS(S; u, v)
Complexity: O(1) worst-case.
1. If IS(S ∪ {v} ∖ {u}) rejects: (an edge avoiding u exists)
   (a) Return reject.
2. If IS(S ∪ {u} ∖ {v}) rejects: (an edge avoiding v exists)
   (a) Return reject.
3. Return accept.

Observation 4.1. Procedure Test-Loneliness-IS makes O(1) IS queries worst-case.

For an edge e = uv, we define the loneliness factor of e as the ratio between the probability that E_S = {e} and the probability that e ∈ E_S, where S ∼ Bin(V, p) and p = 1/(10√m̃). Alternatively, this is the probability that the pair {u, v} is lonely in the drawn set S.

To draw an indicator whose expected value is L_{u,v}(m̃), we draw a set S ⊆ V of density p = 1/(10√m̃) and test whether or not {u, v} is lonely in S. Algorithm 3 (Loneliness-Event-IS) provides the pseudo code for this logic.
Algorithm 3: Procedure Loneliness-Event-IS(G, n; m̃, u, v)
Output: Binary answer; accept probability L_{u,v}(m̃).
Complexity: O(1) worst-case.
1. Let p ← 1/(10√m̃).
2. Let S ⊆ V be a set that every vertex belongs to with probability p, i.i.d.
3. Accept if and only if Test-Loneliness-IS(S; u, v) accepts.

Observation 4.2. A call to Loneliness-Event-IS(G, n; m̃, u, v) accepts the input with probability exactly L_{u,v}(m̃), at the cost of O(1) IS queries worst-case.

At this point we have all the required notation to state the exact behavior of the lone-edge sampler.

Lemma 4.3. For every edge u₀v₀ ∈ E, the probability that Sample-Lone-Edge-IS returns u₀v₀ is exactly (1/(100m̃))·L_{u₀,v₀}(m̃). Moreover, the query complexity is O(log n) worst-case.

Proof. For the complexity, observe that there is one explicit independent-set query, one call to Extract-Edge-IS at the cost of O(log n) IS queries (Lemma 2.4), and one call to Test-Loneliness-IS at the cost of O(1) IS queries (Observation 4.1). The probability of returning the edge u₀v₀ is exactly:

    Pr[return u₀v₀] = Pr[u₀ ∈ S] · Pr[v₀ ∈ S] · Pr[the pair {u₀, v₀} is lonely in S]
                    = p · p · L_{u₀,v₀}(m̃)
                    = (1/(100m̃))·L_{u₀,v₀}(m̃). ∎

4.2 Star-vertex sampler

The star-vertex sampler, which corresponds to the vertex sampler in the conceptual overview that favors medium-degree and high-degree vertices, starts by drawing a set S of density p = 1/(10√m̃), and looks for the following cases: (1) the induced subgraph G_S has a single edge, in which case we uniformly choose one of its endpoint vertices as the sample, and (2) the induced subgraph G_S has two or more edges, all adjacent to a single vertex (a "star"), which we choose as our sample. In every other case, the sampler returns reject. Algorithm 4 (Sample-Star-Vertex-IS) provides the pseudo code for the star-vertex sampler.
In the following we define S_u(m̃), the starness factor of u (as a function of m̃), and show that the probability to sample u using this algorithm is exactly (1/(10√m̃))·S_u(m̃). Algorithm 5 (Test-Starness-IS) gets an input (S; u) and accepts with probability equal to the probability to sample u (by Sample-Star-Vertex-IS) when drawing the set S ∪ {u}.

Observation 4.4. Procedure Test-Starness-IS makes O(log n) IS queries at worst-case.

Algorithm 4: Procedure Sample-Star-Vertex-IS(G, n; m̃)
Output: Every vertex u ∈ V is returned with probability exactly (1/(10√m̃))·S_u(m̃).
Complexity: O(log n).
1. Let p ← 1/(10√m̃).
2. Let S ⊆ V be a set that every vertex belongs to with probability p, i.i.d.
3. If IS(S) accepts: (no edges)
   (a) Return reject.
4. Let zw ← Extract-Edge-IS(G; S).
5. Exchange w and z with probability 1/2.
6. If IS(S \ {w}) accepts: (w is adjacent to all edges)
   (a) Return w.
7. Else if IS(S \ {z}) accepts: (z is adjacent to all edges, while w is not)
   (a) Return z.
8. Return reject.

Algorithm 5: Procedure Test-Starness-IS(S; u)
Complexity: O(log n) worst-case.
1. If IS(S) rejects: (non-u edges exist)
   (a) Return reject.
2. If IS(S ∪ {u}) accepts: (no edges at all)
   (a) Return reject.
3. Let wz ← Extract-Edge-IS(G; S ∪ {u}).
4. If Test-Loneliness-IS(S; w, z) accepts: (a single edge)
   (a) Return accept with probability 1/2, otherwise reject.
5. Return accept. (two or more edges)

To draw an indicator whose expected value is S_u(m̃), we draw a set S ⊆ V of density p = 1/(10√m̃) and then accept according to the starness test of u with respect to S. Algorithm 6 (Starness-Event-IS) provides the pseudocode for this logic.

Algorithm 6: Procedure Starness-Event-IS(G, n; m̃, u)
Output: Binary answer. Accept probability S_u(m̃).
Complexity: O(log n) worst-case.
1. Let p ← 1/(10√m̃).
2.
Let S ⊆ V be a set that every vertex belongs to with probability p, i.i.d.
3. Accept if and only if Test-Starness-IS(S; u) accepts.

Observation 4.5. A call to Starness-Event-IS(G, n; m̃, u) accepts the input with probability S_u(m̃) at the cost of O(log n) IS queries worst-case.

At this point we have all the required notation to state the exact behavior of the star-vertex sampler.

Lemma 4.6. Algorithm 4 (Sample-Star-Vertex-IS) samples every vertex u with probability (1/(10√m̃))·S_u(m̃), and otherwise rejects. Also, the query complexity is O(log n) worst-case.

Proof. For complexity, observe that there are three independent-set queries, in addition to a single call to Extract-Edge-IS, at the cost of O(log n) independent-set queries at worst-case (Lemma 2.4). For a vertex u, the probability to return it is:

Pr[return u] = Pr[u ∈ S] · ((1/2)·Pr[single edge, adjacent to u | u ∈ S] + Pr[more than one edge, all adjacent to u | u ∈ S])
            = Pr[u ∈ S] · Pr[Starness-Event-IS(G, n; m̃, u) accepts]
            = p · S_u(m̃) = (1/(10√m̃))·S_u(m̃).

4.3 Factor lower bounds

We show settings in which the factors are lower-bounded by a constant. The lone-edge sampler favors low-degree vertices, as stated in the following lemma. To be applicable to the IS algorithm, the lemma statement also includes degrees slightly higher than the threshold between low and medium degrees.

Lemma 4.7. Recall that k₂ = √m̃, and assume that m̃ ≥ e^{−1/10}·m. If deg(u) ≤ 2k₂ and deg(v) ≤ 2k₂, then L_{u,v}(m̃) ≥ 1/2.

Proof. The loneliness event is the negation of the union of the following events:
• S has a u-neighbor which is not v.
• S has a v-neighbor which is not u.
• S has an edge adjacent to neither u nor v.

The probability to have a u-neighbor (which is not v) is bounded by p·(deg(u) − 1) ≤ p·deg(u) ≤ (1/(10√m̃))·2√m̃ = 1/5.
Analogously, this is also a bound for the probability to have a v-neighbor (which is not u). For every edge that is adjacent to neither u nor v, the probability that both its vertices belong to S is p² = 1/(100m̃). There are at most m such edges, and therefore, the expected number of these edges in S is bounded by (1/100)·(m/m̃) ≤ e^{1/10}/100.

By linearity of expectation, the expected number of "bad" events is bounded by 1/5 + 1/5 + e^{1/10}/100 < 1/2. By Markov's inequality, the probability to have any "bad" event is at most 1/2, and hence L_{u,v}(m̃) ≥ 1/2.

The star-vertex sampler favors medium-degree and high-degree vertices, as stated in the following lemma.

Lemma 4.8. Recall that k₂ = √m̃, and assume that m̃ ≥ max{4, e^{−1/10}·m}. If deg(u) ≥ k₂ then S_u(m̃) ≥ 1/30.

Proof. The starness event is the negation of the union of the following three events:
• S has an edge non-adjacent to u.
• S has no u-neighbors.
• S has a single u-neighbor, and an additional uniform bit is 1. (The "uniform bit" represents the symmetry described in the overview in the case where there is only one edge.)

The number of edges not adjacent to u is at most m, and each of them belongs to S with probability p² = 1/(100m̃). The expected number of these "bad" edges is bounded by p²·m = (1/100)·(m/m̃) ≤ e^{1/10}/100. By Markov's inequality, this is also an upper bound for the probability to have any of these edges in S.
For the other two events, we use an explicit bound:

Pr[Bin(deg(u), p) = 0] + (1/2)·Pr[Bin(deg(u), p) = 1] = (1 − p)^{deg(u)} + (1/2)·p·deg(u)·(1 − p)^{deg(u)−1}

Since the function x ↦ (1 − p)^x + (1/2)·p·x·(1 − p)^{x−1} is monotonically decreasing, we can use deg(u) ≥ √m̃ to obtain:

[···] ≤ (1 − p)^{√m̃} + (1/2)·p·√m̃·(1 − p)^{√m̃−1}
     = (1 − p)^{√m̃}·(1 + p√m̃/(2(1 − p)))
     ≤ e^{−1/10}·(1 + 1/(20(1 − p)))    [since m̃ ≥ 4 implies p ≤ 1/20]
     ≤ e^{−1/10}·(1 + 1/(20(1 − 1/20))) ≤ 0.953

Therefore, S_u(m̃) ≥ 1 − e^{1/10}/100 − 0.953 ≥ 1/30.

5 Elementary procedures

In this section we describe elementary procedures which we use in our algorithms. This section is mostly provided for technical completeness (since the front-end interface is already stated in the overview), and has little to do with our main contribution.

5.1 Estimating the inverse of an indicator

Algorithm 7 (Estimate-Indicator-Inverse) gets a black-box A for drawing an indicator whose expected value is denoted by p_A, an accuracy parameter ε and a saturation parameter ρ. Its output is 1 with probability min{X·ρ/ln(5/ε), 1} (and 0 otherwise), where X ∼ Geo(p_A) is obtained by repeatedly calling the black-box procedure A until success.

Lemma 2.3. Consider a black-box procedure A that results in an indicator. Let p_A be the expected value of this indicator. There exists a procedure Estimate-Indicator-Inverse(A, ε, ρ) that makes O(1/p_A) calls in expectation to A and:
• The outcome is a Bernoulli variable.
• If p_A ≥ ρ, then the expected output is in the range e^{±ε}·ρ/(p_A·ln(5/ε)).

Algorithm 7: Procedure Estimate-Indicator-Inverse(A, ε, ρ)
1. Let C ← ln(5/ε).
2. Let N_max ← ⌊C/ρ⌋.
3. Set N ← 0.
4. While N < N_max:
   (a) Set N ← N + 1.
   (b) Call A, giving ans.
   (c) If ans is accept:
       i. Break loop.
5. Return a bit distributed like Ber(N·(ρ/C)).
Moreover, the worst-case number of calls is O((log ε^{−1})/ρ) (even if p_A < ρ).

Proof. Let X ∼ Geo(p_A). Clearly, N is distributed as min{X, N_max}. Since N_max ≤ C/ρ, the output is always bounded between 0 and 1. The expected value of N is:

E[N] = E[min{X, N_max}]
     = E[X] − Pr[X ≥ N_max + 1]·E[X − N_max | X ≥ N_max + 1]
 (∗) = E[X] − Pr[X ≥ N_max + 1]·E[X]
     = (1 − Pr[X ≥ N_max + 1])·E[X]

where the (∗)-transition is correct since a geometric variable is memoryless. If p_A ≥ ρ, then:

0 ≤ Pr[X ≥ N_max + 1] = (1 − p_A)^{N_max} ≤ e^{−p_A·⌊C/ρ⌋}
  ≤ e^{−p_A·(C/p_A − 1)}    [since p_A ≥ ρ]
  = e^{−C + p_A} ≤ e^{−ln(5/ε)+1} = (e/5)·ε ≤ 1 − e^{−ε}

Therefore, if p_A ≥ ρ, then:

E[output] = e^{±ε}·E[X]·ρ/C = e^{±ε}·(1/p_A)·ρ/C = e^{±ε}·ρ/(p_A·ln(5/ε)).

The expected cost is bounded by E[X] = O(1/p_A). The worst-case cost is N_max = O(C/ρ) = O((log ε^{−1})/ρ).

5.2 Extracting an edge

Here we provide pseudocode and proof for the Extract-Edge-IS procedure. Note that this tool appears in [BHPR+20] (implicit) and in [CLW20] (explicit randomized version). For self-containment, we provide it as Algorithm 8 and then prove its correctness as an implementation for Lemma 2.4. Given a non-independent set S, the algorithm repeatedly chooses a proper non-independent subset of S until reaching a non-independent set of exactly two vertices, which explicitly describes an edge.

Algorithm 8: Procedure Extract-Edge-IS(G; S)
Input: A graph G, a vertex set S.
Output: An arbitrary edge between two S-vertices, or reject if S is independent.
Complexity: O(log |S|).
1. If |S| ≤ 1:
   (a) Return reject.
2. If IS(S) accepts:
   (a) Return reject.
3. Let S₁ ∪ S₂ be an arbitrary partition of S of sizes ⌈|S|/2⌉ and ⌊|S|/2⌋.
4. If IS(S₁) rejects:
   (a) Return Extract-Edge-IS(G; S₁).
5. If IS(S₂) rejects:
   (a) Return Extract-Edge-IS(G; S₂).
6.
While |S₁| > 1:
   (a) Let S₁₁ ∪ S₁₂ be an arbitrary partition of S₁ of sizes ⌈|S₁|/2⌉ and ⌊|S₁|/2⌋.
   (b) If IS(S₁₁ ∪ S₂) accepts:
       i. Set S₁ ← S₁₂.
   (c) Else:
       i. Set S₁ ← S₁₁.
7. While |S₂| > 1:
   (a) Let S₂₁ ∪ S₂₂ be an arbitrary partition of S₂ of sizes ⌈|S₂|/2⌉ and ⌊|S₂|/2⌋.
   (b) If IS(S₁ ∪ S₂₂) accepts:
       i. Set S₂ ← S₂₁.
   (c) Else:
       i. Set S₂ ← S₂₂.
8. Let u ∈ S₁, v ∈ S₂.
9. Return uv.

We recall Lemma 2.4 and prove it.

Lemma 2.4. There exists a deterministic procedure Extract-Edge-IS(G; S) whose input is a non-empty vertex set S of a graph G, that makes O(1 + log |S|) independent-set queries (worst-case) and returns an arbitrary edge e between two S-vertices, if such an edge exists, or reject otherwise.

Proof. For correctness: if |S| ≤ 1, then there is no edge to extract. For a partition S₁ ∪ S₂, there are three cases (possibly overlapping): there is an edge in G_{S₁}, there is an edge in G_{S₂}, or there is an edge crossing the cut between S₁ and S₂. In the first two cases, the (tail-)recursive call is correct. If only the third case holds, then we deduce that both S₁ and S₂ are independent, but there exists an edge between them. In every additional step, we partition S₁ (or S₂) and focus on a subset that has an edge to S₂ (or S₁), until both S₁ and S₂ are singletons that have an edge between them.

For complexity: observe that the sequence n_{i+1} = ⌈n_i/2⌉ converges to 1 after O(log n₀) steps (for an integer n₀ ≥ 1). Therefore, after O(log |S|) recursive steps, the complexity of the loop part of the algorithm is O(log |S₁| + log |S₂|) = O(log |S|).

5.3 Brute-force sampler

Some of our category-specific samplers assume that m̃ is sufficiently large. Therefore, we have to handle separately the case where there are only a few edges in the graph (m = O(1)).
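The binary-search extraction of Section 5.2 can be mirrored in a short sketch, with a brute-force stand-in for the IS oracle; the helper names below are ours, for illustration only.

```python
def is_independent(edges, s):
    """Brute-force stand-in for an IS query over an explicit edge set."""
    s = set(s)
    return not any(e <= s for e in edges)

def _split(xs):
    """Arbitrary (here: sorted, for determinism) halving partition."""
    xs = sorted(xs)
    h = (len(xs) + 1) // 2
    return set(xs[:h]), set(xs[h:])

def extract_edge(edges, s):
    """Sketch of Extract-Edge-IS: return some edge inside s, or None."""
    s = set(s)
    if len(s) <= 1 or is_independent(edges, s):
        return None
    s1, s2 = _split(s)
    if not is_independent(edges, s1):
        return extract_edge(edges, s1)      # edge inside the first half
    if not is_independent(edges, s2):
        return extract_edge(edges, s2)      # edge inside the second half
    # both halves are independent, so an edge crosses the cut;
    # binary-search each side while keeping a cross edge alive
    while len(s1) > 1:
        s11, s12 = _split(s1)
        s1 = s12 if is_independent(edges, s11 | s2) else s11
    while len(s2) > 1:
        s21, s22 = _split(s2)
        s2 = s21 if is_independent(edges, s1 | s22) else s22
    (u,), (v,) = s1, s2
    return frozenset((u, v))
```

For instance, on the edge set {{0,3}, {4,5}} and S = {0,1,2,3}, both halves {0,1} and {2,3} are independent, and the two search loops shrink them to {0} and {3}, recovering the cross edge.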
We recall the following lemma:

Lemma 5.1 (Exact count in [BHPR+20], Enumerate-Edges in [AHL26]). There exists a deterministic algorithm that enumerates all edges in the given graph at the cost of O(1 + m log n) independent-set queries.

The brute-force algorithm enumerates all edges and then uniformly chooses one of them.

Observation 5.2. There exists an algorithm Sample-Edge-Bruteforce-IS(G) that uniformly samples an edge in a given graph G at the cost of O(1 + m log n) independent-set queries.

6 Using both IS and local queries

In this section we provide the hybrid estimation algorithm. Conceptually, it can be seen as a mixture of the IS algorithm (in Section 7) and the local algorithm (see [ER18]).

For a small error r = min{ε²/6, 1/(n^{1/3}·log n·log²(1/ε))}, we first estimate m̃ = e^{±1/10}·m with probability 1 − r. Since we can use both independent-set queries and local queries, the cost of this preprocess is O(min{√m, √(n/√m)}·log n·log(1/r)) [AHL26]. We use this small error to bound the contribution of the wrong-m̃ case to the sampling probability of each individual edge.

We recall the threshold degrees:
• k₁ = 3·√(m̃√m̃/n).
• k₂ = √m̃.
• k₃ = √(n√m̃).

Note that k₂ ≤ k₃ (if we enforce the trivial bound m̃ ≤ n²), but the inequality k₁ ≤ k₂ does not always hold. We classify degrees as low (0 ≤ deg(u) ≤ k₂), medium (k₂ < deg(u) ≤ k₃) and high (deg(u) > k₃). Using L, M, H to indicate the degree categories, our edge categories are:
• L-L: between vertices of degree at most k₂ (low).
• L-M: between vertices of degree at most k₂ (low) and vertices of degree between k₂ and k₃ (medium).
• L-H: between vertices of degree at most k₂ (low) and vertices of degree greater than k₃ (high).
• MH-MH: between vertices of degree greater than k₂ (medium or high).
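The thresholds and the resulting edge categories can be written out directly; the helper below is ours, for illustration only (it assumes the degree oracle, i.e. exact degrees).

```python
# Degree thresholds and edge categories as defined above; all names here
# are our own illustrative choices.

def thresholds(m_est, n):
    """k1 = 3·sqrt(m̃·√m̃/n), k2 = √m̃, k3 = sqrt(n·√m̃)."""
    k1 = 3 * (m_est * m_est ** 0.5 / n) ** 0.5
    k2 = m_est ** 0.5
    k3 = (n * m_est ** 0.5) ** 0.5
    return k1, k2, k3

_ORDER = {"L": 0, "M": 1, "H": 2}

def degree_class(deg, k2, k3):
    """Low / medium / high according to the thresholds k2 ≤ k3."""
    if deg <= k2:
        return "L"
    return "M" if deg <= k3 else "H"

def edge_category(deg_u, deg_v, k2, k3):
    """Category of an edge from its endpoint degrees (exact, via DEG)."""
    a, b = sorted((degree_class(deg_u, k2, k3), degree_class(deg_v, k2, k3)),
                  key=_ORDER.get)
    if a == "L":
        return "L-" + b         # "L-L", "L-M" or "L-H"
    return "MH-MH"
```

Note that, e.g., thresholds(16, 4) yields k1 = 12 > k2 = 4, illustrating that k1 ≤ k2 need not hold.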
When m̃ ≤ n^{2/3}, which corresponds to the anomaly case in [AHL26], we use an alternative logic that treats the categories L-M and L-H together as an "L-MH" category. Note that, using the degree oracle, we can determine with certainty the degree category of a given vertex.

6.1 Sampling L-L edges

Algorithm 9 (Sample-L-L-Edge-Hybrid) merely samples a lone edge, verifies that its vertices have low degree, and normalizes the probability to return each edge by canceling the loneliness factor.

Algorithm 9: Procedure Sample-L-L-Edge-Hybrid(G, n, ε; m̃)
Input: m̃ ≥ e^{−1/10}·m.
Output: For every edge e ∈ E_{L,L}, the probability to return e is e^{±ε}/(200m̃·ln(5/ε)).
Complexity: O(log n) (expected).
Complexity: O(log n + log ε^{−1}) (worst-case).
1. Let k₂ ← √m̃.
2. Let uv ← Sample-Lone-Edge-IS(G, n; m̃).
3. If uv is reject:
   (a) Return reject.
4. If DEG(u) > k₂ or DEG(v) > k₂:
   (a) Return reject.
5. Filter by Estimate-Indicator-Inverse(Lonliness-Event-IS(G, n; m̃, u, v), ε, 1/2).
6. Return uv.

Lemma 6.1. If m̃ ≥ e^{−1/10}·m, then Algorithm 9 (Sample-L-L-Edge-Hybrid) samples every edge e ∈ E_{L,L} with probability e^{±ε}/(200m̃·ln(5/ε)), and otherwise rejects. Also, regardless of m̃, the query complexity is O(log n) in expectation and O(log n + log ε^{−1}) in worst-case.

Proof. For complexity, observe that there are at most two DEG-calls and one call to Sample-Lone-Edge-IS, which costs O(log n) IS-calls at worst-case (Lemma 4.3). If we reach the call to Estimate-Indicator-Inverse, then deg(u) ≤ k₂ and deg(v) ≤ k₂. In this case, L_{u,v}(m̃) ≥ 1/2 (Lemma 4.7), and therefore, this call costs O(1) calls to Lonliness-Event-IS in expectation and O(log ε^{−1}) at worst-case (Lemma 2.3). The cost of every such call is O(1) (Observation 4.2). Clearly, only L-L edges can be returned.
By Lemma 4.3, every L-L edge e = uv is returned with probability (1/(100m̃))·L_e(m̃). The probability to pass the filter is in the range e^{±ε}·(1/2)/(L_e(m̃)·ln(5/ε)) (Lemma 2.3). Combined, the probability to return any individual edge e = {u, v} for which deg(u), deg(v) ≤ k₂ is in the range

(1/(100m̃))·L_{u,v}(m̃) · e^{±ε}·(1/2)/(L_{u,v}(m̃)·ln(5/ε)) = e^{±ε}/(200m̃·ln(5/ε)).

6.2 Sampling medium-high vertices

In Algorithm 10 (Sample-MH-Vertex-Hybrid) we use Sample-Star-Vertex-IS to sample a "star" vertex and then use the degree oracle to make sure its degree is greater than k₂ = √m̃. Then we normalize the return probability by canceling the starness factor of u. Note that in one procedure (Sample-MH-MH-Edge-Hybrid) we sample two MH-vertices without using this sampler, to reduce the penalty from square-logarithmic to logarithmic.

Algorithm 10: Procedure Sample-MH-Vertex-Hybrid(G, n, ε; m̃)
Input: m̃ ≥ max{4, e^{−1/10}·m}.
Output: For every vertex u ∈ V with deg(u) > k₂, the probability to return u is e^{±ε}/(300√m̃·ln(5/ε)).
Complexity: O(log n) (expected).
Complexity: O(log n·log ε^{−1}) (worst-case).
1. Let k₂ ← √m̃.
2. Let u ← Sample-Star-Vertex-IS(G, n; m̃).
3. If u is reject:
   (a) Return reject.
4. If DEG(u) ≤ k₂:
   (a) Return reject.
5. Filter by Estimate-Indicator-Inverse(Starness-Event-IS(G, n; m̃, u), ε, 1/30).
6. Return u.

Lemma 6.2. If m̃ ≥ max{4, e^{−1/10}·m}, then Algorithm 10 (Sample-MH-Vertex-Hybrid) samples every vertex whose degree is greater than k₂ with probability e^{±ε}/(300√m̃·ln(5/ε)), and otherwise rejects. Also, regardless of m̃, the query complexity is O(log n) in expectation and O(log n·log ε^{−1}) in worst-case.

Proof.
For complexity, observe that there is at most one DEG-call and one call to Sample-Star-Vertex-IS, which costs O(log n) IS-calls at worst-case (Lemma 4.6). If we reach the call to Estimate-Indicator-Inverse, then deg(u) > k₂ = √m̃. In this case, S_u(m̃) ≥ 1/30 (Lemma 4.8), and therefore, this call costs O(1) calls to Starness-Event-IS in expectation and O(log ε^{−1}) at worst-case (Lemma 2.3). The cost of every such call is O(log n) (Observation 4.5). Clearly, only MH-vertices can be returned.

By Lemma 4.6, every MH-vertex u is returned with probability (1/(10√m̃))·S_u(m̃). The probability to pass the filter is in the range e^{±ε}·(1/30)/(S_u(m̃)·ln(5/ε)) (Lemma 2.3). Combined, the probability to return any individual MH-vertex u is in the range

(1/(10√m̃))·S_u(m̃) · e^{±ε}·(1/30)/(S_u(m̃)·ln(5/ε)) = e^{±ε}/(300√m̃·ln(5/ε)).

6.3 Sampling L-MH edges (for low m̃)

This logic applies when m̃ is small enough to have √m̃ < √(n/√m̃). In Algorithm 11 (Sample-L-MH-Edge-Hybrid) we draw an MH-vertex u and then a uniform neighbor v. To normalize the return probability for different choices of u, we filter by a deg(u)/(2m̃) factor. When m̃ is higher, the resulting probability is too small (and too expensive to amplify), and we use different logic for medium-degree vertices and for high-degree vertices.

Lemma 6.3. Assume that m̃ ≥ max{4, e^{−1/10}·m}. Algorithm 11 (Sample-L-MH-Edge-Hybrid) samples every edge between vertices of degree at most k₂ = √m̃ and vertices of degree greater than k₂ with probability e^{±ε}/(600m̃√m̃·ln(5/ε)). Moreover, regardless of m̃, the query complexity is O(log n) in expectation and O(log n·log ε^{−1}) worst-case.

Algorithm 11: Procedure Sample-L-MH-Edge-Hybrid(G, n, ε; m̃)
Input: m̃ ≥ max{4, e^{−1/10}·m}.
Output: For every edge e ∈ E_{L,MH}, the probability to return e is e^{±ε}/(600m̃√m̃·ln(5/ε)).
Complexity: O(log n) (expected).
Complexity: O(log n·log ε^{−1}) (worst-case).
1. Let k₂ ← √m̃.
2. Let u ← Sample-MH-Vertex-Hybrid(G, n, ε; m̃).
3. Let v ← NEIGHBOR(u).
4. Let d_u ← DEG(u).
5. If d_u > k₂ and DEG(v) ≤ k₂:
   (a) Return uv with probability min{1, d_u/(2m̃)} (otherwise reject).
6. Return reject.

Proof. For complexity, observe that there are at most two DEG-calls, one NEIGHBOR-call, and one call to Sample-MH-Vertex-Hybrid, which costs O(log n) in expectation and O(log n·log ε^{−1}) at worst-case (Lemma 6.2).

Note that if m̃ ≥ e^{−1/10}·m, then d_u/(2m̃) ≤ (1/2)·e^{1/10}·d_u/m ≤ 1, where the last transition is correct since the number of edges m cannot be smaller than the degree of u. Clearly, only L-MH edges can be returned. Let u₀v₀ be such an edge, for which deg(u₀) > k₂ and deg(v₀) ≤ k₂:

Pr[return u₀v₀] = Pr[u = u₀] · Pr[v = v₀ | u = u₀] · Pr[pass filter | u = u₀, v = v₀]
              = e^{±ε}/(300√m̃·ln(5/ε)) · (1/deg(u₀)) · (deg(u₀)/(2m̃))    [Lemma 6.2]
              = e^{±ε}/(600m̃√m̃·ln(5/ε)).

6.4 Sampling L-M edges (for high m̃)

This logic applies when m̃ is large enough to have √m̃ ≥ √(n/√m̃). In Algorithm 12 (Sample-L-M-Edge-Hybrid) we use the same logic as in Sample-L-MH-Edge-Hybrid, but the normalization denominator for deg(u) is k₃ rather than 2m̃.

Lemma 6.4. If m̃ ≥ max{4, e^{−1/10}·m}, then Algorithm 12 (Sample-L-M-Edge-Hybrid) samples every edge between vertices of degree at most k₂ and vertices of degree between k₂ and k₃ with probability in the range e^{±ε}/(300m̃·√(n/√m̃)·ln(5/ε)). Also, regardless of m̃, the query complexity is O(log n) in expectation and O(log n·log ε^{−1}) in worst-case.

Proof.
For complexity, observe that there are at most two DEG-calls, one NEIGHBOR-call, and one call to Sample-MH-Vertex-Hybrid, which costs O(log n) in expectation and O(log n·log ε^{−1}) at worst-case (Lemma 6.2).

Algorithm 12: Procedure Sample-L-M-Edge-Hybrid(G, n, ε; m̃)
Input: m̃ ≥ max{4, e^{−1/10}·m}.
Output: For every edge e ∈ E_{L,M}, the probability to return e is e^{±ε}/(300m̃·√(n/√m̃)·ln(5/ε)).
Complexity: O(log n) (expected).
Complexity: O(log n·log ε^{−1}) (worst-case).
1. Let k₂ ← √m̃.
2. Let k₃ ← √(n√m̃).
3. Let u ← Sample-MH-Vertex-Hybrid(G, n, ε; m̃).
4. Let v ← NEIGHBOR(u).
5. Let d_u ← DEG(u).
6. If k₂ < d_u ≤ k₃ and DEG(v) ≤ k₂:
   (a) Return uv with probability d_u/k₃.
7. Return reject.

Clearly, only L-M edges can be returned. Let u₀v₀ be such an edge, for which k₂ < deg(u₀) ≤ k₃ and deg(v₀) ≤ k₂:

Pr[return u₀v₀] = Pr[u = u₀] · Pr[v = v₀ | u = u₀] · Pr[pass filter | u = u₀, v = v₀]
              = e^{±ε}/(300√m̃·ln(5/ε)) · (1/deg(u₀)) · (deg(u₀)/k₃)    [Lemma 6.2]
              = e^{±ε}/(300√m̃·k₃·ln(5/ε)) = e^{±ε}/(300m̃·√(n/√m̃)·ln(5/ε)).

6.5 The tininess factor

In this subsection we define the tininess factor, which we use in the analysis of the L-H edge sampling logic (presented in the following subsection). We define the tininess factor of a vertex u with respect to the threshold degree k₁(m̃) = 3·√(m̃^{3/2}/n) as the fraction of its neighbors whose degree is bounded by k₁. Formally,

T_u(m̃) = |{v ∈ N_u : deg(v) ≤ k₁(m̃)}| / deg(u).

Algorithm 13 (Tininess-Event-LOCAL) draws a uniform neighbor of u and compares its degree to k₁(m̃).

Observation 6.5. Algorithm 13 (Tininess-Event-LOCAL) accepts the input (G, n; m̃, u) with probability exactly T_u(m̃) at the cost of O(1) local queries.

The tininess factor of a high-degree vertex is Ω(1), as stated in the following lemma.
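For concreteness, the tininess factor and its one-query Monte Carlo event can be sketched as follows; adjacency lists stand in for the NEIGHBOR/DEG oracles, and the names are ours, not the paper's.

```python
import random

def tininess(adj, u, k1):
    """Exact T_u: fraction of u's neighbors whose degree is at most k1."""
    return sum(1 for v in adj[u] if len(adj[v]) <= k1) / len(adj[u])

def tininess_event(adj, u, k1, rng):
    """One run of the local event: draw a uniform neighbor v of u and
    accept iff deg(v) ≤ k1; accepts with probability exactly T_u."""
    v = rng.choice(adj[u])      # NEIGHBOR(u)
    return len(adj[v]) <= k1    # DEG(v) ≤ k1 ?
```

On the path 0-1-2-3, for instance, vertex 1 has one neighbor of degree 1 and one of degree 2, so tininess(adj, 1, 1) is exactly 1/2.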
Algorithm 13: Procedure Tininess-Event-LOCAL(G, n; m̃, u)
Output: Binary answer. Accept probability T_u(m̃).
Complexity: O(1) worst-case.
1. Let k₁ ← 3·√(m̃^{3/2}/n).
2. Let v ← NEIGHBOR(u).
3. Let d_v ← DEG(v).
4. Accept if and only if d_v ≤ k₁.

Lemma 6.6. For a graph G over n vertices, recall that k₁(m̃) = 3·√(m̃√m̃/n) and k₃(m̃) = √(n√m̃). If m̃ ≥ e^{−1/10}·m, then for every vertex u for which deg(u) > k₃(m̃), T_u(m̃) ≥ 1/4.

Proof. The number of vertices of degree greater than k₁ in the graph is bounded by

2m/k₁ = 2m/(3·√(m̃√m̃/n)) = (2/3)·(m/m̃)·√(n√m̃) ≤ (2e^{1/10}/3)·k₃ ≤ (3/4)·deg(u).

Hence, the number of u-neighbors whose degree is at most k₁ is at least (1/4)·deg(u), so T_u(m̃) ≥ 1/4.

6.6 Sampling L-H edges (for high m̃)

This logic applies when m̃ is large enough to have √m̃ ≥ √(n/√m̃). Algorithm 14 (Sample-L-H-Edge-Local) for sampling L-H edges is based on the concept of [ER18]. The main difference is the use of the indicator-inverse estimation to cancel the tininess factor, reducing the dependency of the query complexity on 1/ε from linear to logarithmic.

We uniformly choose a vertex w. Then, we draw a uniform neighbor u (of w) and a uniform neighbor v (of u). We reject immediately if deg(w) > k₁, or if deg(u) ≤ k₃, or if deg(v) > k₂. Finally, we toss a coin to proceed with probability deg(w)/k₁ and reject otherwise. We normalize the return probability by canceling the tininess factor, which is the fraction of u-neighbors whose degree is bounded by k₁.

We consider vertices with degree at most k₁ as tiny-degree. Vertices of high degree have an Ω(1)-fraction of tiny-degree neighbors, as stated in Lemma 6.6 above. Recall that N_u denotes the set of u-neighbors.

Lemma 6.7. Algorithm 14 (Sample-L-H-Edge-Local) returns every edge e ∈ E_{L,H} with probability e^{±ε}/(12m̃·√(n/√m̃)·ln(5/ε)), and otherwise rejects.
Moreover, the expected complexity is O(1) and the worst-case complexity is O(log ε^{−1}).

Proof. For complexity, observe that there are two NEIGHBOR-calls and three DEG-calls. If we reach the call to Estimate-Indicator-Inverse, then deg(u) > k₃. In this case, T_u(m̃) ≥ 1/4 (Lemma 6.6), and therefore, this call costs O(1) evaluations of Tininess-Event-LOCAL in expectation, and O(log ε^{−1}) at worst-case (Lemma 2.3). The cost of every such evaluation is O(1) local queries (Observation 6.5).

Algorithm 14: Procedure Sample-L-H-Edge-Local(G, n, ε; m̃)
Input: m̃ ≥ e^{−1/10}·m.
Output: For every edge e ∈ E_{L,H}, the probability to return e is e^{±ε}/(12m̃·√(n/√m̃)·ln(5/ε)).
Complexity: O(1) (expected).
Complexity: O(log ε^{−1}) (worst-case).
1. Let k₁ ← 3·√(m̃√m̃/n).
2. Let k₂ ← √m̃.
3. Let k₃ ← √(n√m̃).
4. Draw w ∈ V uniformly.
5. Let u ← NEIGHBOR(w).
6. Let v ← NEIGHBOR(u).
7. Let d_w ← DEG(w), d_u ← DEG(u), d_v ← DEG(v).
8. If d_w ≤ k₁ and d_u > k₃ and d_v ≤ k₂:
   (a) Proceed with probability d_w/k₁.
   (b) Filter by Estimate-Indicator-Inverse(Tininess-Event-LOCAL(G, n; m̃, u), ε, 1/4).
   (c) Return uv.
9. Return reject.

Consider an edge u₀v₀ where deg(u₀) > k₃ and deg(v₀) ≤ k₂. The probability to return the edge u₀v₀ is:

Pr[return u₀v₀] = Σ_{w₀ ∈ V} Pr[w = w₀] · Pr[u = u₀ | w = w₀] · Pr[v = v₀ | u = u₀] · Pr[proceed | w = w₀, u = u₀] · Pr[pass filter | u = u₀]
  = Σ_{w₀ ∈ N_{u₀} : deg(w₀) ≤ k₁} (1/n) · (1/deg(w₀)) · (1/deg(u₀)) · (deg(w₀)/k₁) · e^{±ε}·(1/4)/(T_{u₀}(m̃)·ln(5/ε))
  = (|{w₀ ∈ N_{u₀} : deg(w₀) ≤ k₁}| / (n·k₁·deg(u₀))) · e^{±ε}·(1/4)/(T_{u₀}(m̃)·ln(5/ε))

Recall that T_{u₀}(m̃) is the fraction of tiny-degree neighbors of u₀ (according to the threshold k₁), that is, |{w₀ ∈ N_{u₀} : deg(w₀) ≤ k₁}| = T_{u₀}(m̃)·deg(u₀).
Therefore,

Pr[return u₀v₀] = e^{±ε}/(4n·k₁·ln(5/ε)) = e^{±ε}/(12m̃·√(n/√m̃)·ln(5/ε)).

6.7 Sampling MH-MH edges

Algorithm 15 (Sample-MH-MH-Edge-Hybrid) merely samples two star vertices (plus normalization) and, if they both exist, tests whether there is an edge between them.

Algorithm 15: Procedure Sample-MH-MH-Edge-Hybrid(G, n, ε; m̃)
Input: m̃ ≥ e^{−1/10}·m.
Output: For every edge e ∈ E_{MH,MH}, the probability to return e is e^{±ε}/(45000m̃·ln(5/ε)).
Complexity: O(log n) (expected).
Complexity: O(log n·log ε^{−1}) (worst-case).
1. Let k₂ ← √m̃.
2. Let u ← Sample-Star-Vertex-IS(G, n; m̃).
3. Let v ← Sample-Star-Vertex-IS(G, n; m̃).
4. If u, v ≠ reject and IS({u, v}) rejects and DEG(u) > k₂ and DEG(v) > k₂:
   (a) Filter by Estimate-Indicator-Inverse(A, ε, 1/900) with the following procedure A:
       • Let b₁ ← Starness-Event-IS(G, n; m̃, u).
       • Let b₂ ← Starness-Event-IS(G, n; m̃, v).
       • Accept if and only if both b₁ and b₂ are accept.
   (b) Return uv.
5. Return reject.

Lemma 6.8. If m̃ ≥ max{4, e^{−1/10}·m}, then Algorithm 15 (Sample-MH-MH-Edge-Hybrid) returns every edge e ∈ E_{MH,MH} with probability e^{±ε}/(45000m̃·ln(5/ε)), and otherwise rejects. Moreover, the complexity is O(log n) in expectation if m̃ ≥ e^{−1/10}·m, and O(log n·log ε^{−1}) in worst-case regardless of m̃.

Proof. For complexity, observe that there are three explicit oracle calls, in addition to two calls to Sample-Star-Vertex-IS, each costing O(log n) (Lemma 4.6). If we reach the call to Estimate-Indicator-Inverse, then deg(u), deg(v) > k₂ = √m̃. In this case, S_u(m̃)·S_v(m̃) ≥ (1/30)² = 1/900 (Lemma 4.8), and therefore, this call costs O(1) calls to Starness-Event-IS in expectation and O(log ε^{−1}) at worst-case (Lemma 2.3). The cost of every such call is O(log n) (Observation 4.5).
Consider a directed edge u₀ → v₀ for which deg(u₀), deg(v₀) > k₂. The probability to return u₀ → v₀ is

Pr[return u₀ → v₀] = Pr[u = u₀] · Pr[v = v₀] · Pr[pass filter | u = u₀, v = v₀]
  = (S_{u₀}(m̃)/(10√m̃)) · (S_{v₀}(m̃)/(10√m̃)) · e^{±ε}·(1/900)/(S_{u₀}(m̃)·S_{v₀}(m̃)·ln(5/ε))
  = e^{±ε}/(90000m̃·ln(5/ε)).

Note that the probability to return the edge u₀v₀ is twice this result, since it can also be represented as v₀u₀.

6.8 Sampling all edges

We combine the category-specific edge sampling procedures by unifying their coefficients.

Algorithm 16: Procedure Sample-Edge-Core-Hybrid(G, n, ε; m̃)
Input: m̃ ≥ max{4, e^{−1/10}·m}.
Output: For every edge e ∈ E, the probability to return e is e^{±2ε}/(45000m̃·R̃·ln(5/ε)), where R̃ = min{√m̃, √(n/√m̃)}.
1. Let R̃ ← min{√m̃, √(n/√m̃)}.
2. Let e ← N/A.
3. If √m̃ ≤ √(n/√m̃):
   (a) Toss a four-way coin.
       • With probability 1/(225R̃): let e ← Sample-L-L-Edge-Hybrid(G, n, ε; m̃).
       • With probability 1/75: let e ← Sample-L-MH-Edge-Hybrid(G, n, ε; m̃).
       • With probability 1/R̃: let e ← Sample-MH-MH-Edge-Hybrid(G, n, ε; m̃).
       • Otherwise: (do nothing).
4. Else:
   (a) Toss a five-way coin.
       • With probability 1/(225R̃): let e ← Sample-L-L-Edge-Hybrid(G, n, ε; m̃).
       • With probability 1/150: let e ← Sample-L-M-Edge-Hybrid(G, n, ε; m̃).
       • With probability 1/3750: let e ← Sample-L-H-Edge-Local(G, n, ε; m̃).
       • With probability 1/R̃: let e ← Sample-MH-MH-Edge-Hybrid(G, n, ε; m̃).
       • Otherwise: (do nothing).
5. If e ≠ N/A:
   (a) Return e.
6. Return reject.

Lemma 6.9. For every graph G = (V, E) (over n vertices and m edges), if m̃ ≥ max{4, e^{−1/10}·m}, then Algorithm 16 (Sample-Edge-Core-Hybrid) returns every edge uv ∈ E with probability in the range e^{±2ε}/(45000m̃·R̃·ln(5/ε)) for R̃ = min{√m̃, √(n/√m̃)}, and otherwise rejects.
The query complexity is O(log n) in expectation and O(log n·log ε^{−1}) in worst-case.

Proof. For query complexity, observe that all subroutines require at most O(log n) queries in expectation and O(log n·log ε^{−1}) queries at worst-case. If √m̃ ≤ √(n/√m̃) (that is, R̃ = √m̃), then the probability to return every individual edge uv, based on its vertices' degree classes, is:

Edge class | Lemma | Base probability              | Coin      | Combined
L-L        | 6.1   | e^{±ε}/(200m̃·ln(5/ε))        | 1/(225R̃) | e^{±ε}/(45000m̃·R̃·ln(5/ε))
L-MH       | 6.3   | e^{±ε}/(600m̃·R̃·ln(5/ε))     | 1/75      | e^{±ε}/(45000m̃·R̃·ln(5/ε))
MH-MH      | 6.8   | e^{±ε}/(45000m̃·ln(5/ε))      | 1/R̃      | e^{±ε}/(45000m̃·R̃·ln(5/ε))

If √m̃ > √(n/√m̃) (that is, R̃ = √(n/√m̃)), then the probability to return every individual edge uv, based on its vertices' degree classes, is:

Edge class | Lemma | Base probability              | Coin      | Combined
L-L        | 6.1   | e^{±ε}/(200m̃·ln(5/ε))        | 1/(225R̃) | e^{±ε}/(45000m̃·R̃·ln(5/ε))
L-M        | 6.4   | e^{±ε}/(300m̃·R̃·ln(5/ε))     | 1/150     | e^{±ε}/(45000m̃·R̃·ln(5/ε))
L-H        | 6.7   | e^{±ε}/(12m̃·R̃·ln(5/ε))      | 1/3750    | e^{±ε}/(45000m̃·R̃·ln(5/ε))
MH-MH      | 6.8   | e^{±ε}/(45000m̃·ln(5/ε))      | 1/R̃      | e^{±ε}/(45000m̃·R̃·ln(5/ε))

The core sampler is a (λ, ε)-uniform sampler, but λ can be very small. Algorithm 17 (Sample-Edge-Amplified-Hybrid) amplifies the success probability to 3/4 by making O(R·log(1/ε)) sampling tries, for R = min{√m, √(n/√m)}, and choosing the first successful one. This algorithm requires an advice m̃, and its correctness and expected query complexity statements only hold if m̃ ∈ e^{±1/10}·m.

Algorithm 17: Procedure Sample-Edge-Amplified-Hybrid(G, n, m̃, ε)
Input: m̃ ≥ max{4, e^{−1/10}·m}.
Output: For every edge e ∈ E, the probability to return e is e^{±ε}·λ(G, m̃)/m, where λ(G, m̃) ≥ 3/4 if m̃ ∈ e^{±1/10}·m.
Complexity: O(min{√m, √(n/√m)}·log n·log(1/ε)) expected, if m̃ ∈ e^{±1/10}·m.
Complexity: O(n^{1/3}·log n·log²(1/ε)) worst-case.
1.
Let R̃ ← min{√m̃, √(n/√m̃)}.
2. For ⌈10⁶ R̃ ln(5/ε)⌉ times:
   (a) Let e ← Sample-Edge-Core-Hybrid(G, n, ε/5; m̃).
   (b) If e is not reject:
       i. Return e.
3. Return reject.

Lemma 6.10. For every graph G = (V, E) (over n vertices and m edges), if m̃ ≥ max{4, e^{−1/10} m}, then Algorithm 17 (Sample-Edge-Amplified-Hybrid) returns every edge uv ∈ E with probability in the range e^{±ε} λ(G, m̃), where λ(G, m̃) ≥ 3/4, and otherwise rejects. The query complexity is O(R log n log(1/ε)) in expectation (based on the assumption about m̃) for R = min{√m̃, √(n/√m̃)}, and O(n^{1/3} log n · log²(1/ε)) in the worst case (for every m̃).

Proof. Let R = min{√m, √(n/√m)} and R̃ = min{√m̃, √(n/√m̃)}. The success probability of a single iteration is at least m · e^{−ε/5} / (45000 m̃ R̃ ln(5/ε)) (Lemma 6.9). Therefore, the success probability of the loop is:

λ(G, m̃) ≥ 1 − (1 − m e^{−ε/5} / (45000 m̃ R̃ ln(5/ε)))^{10⁶ R̃ ln(5/ε)} ≥ 1 − e^{−e^{−1/10} · e^{−1/5} · 200/9} ≥ 3/4

For every edge e ∈ E, the probability to sample it is in the range e^{±2(ε/5)} λ(G, m̃). For the complexity:
• Expected: O(R̃ log(1/ε)) rounds times O(log n) per round (Lemma 6.9), which is O(R log n · log(1/ε)) if m̃ ∈ e^{±1/10} m.
• Worst-case: O(n^{1/3} log(1/ε)) rounds times O(log n log(1/ε)) per round (Lemma 6.9).

The entry-point sampling algorithm, Sample-Edge-Hybrid (Algorithm 18), estimates m̃ ∈ e^{±1/10} m with success probability 1 − r, where r = min{ε²/6, 1/(n² log n log²(1/ε))}.

Algorithm 18: Procedure Sample-Edge-Hybrid(G, n, ε)
1. If ε > 1/3:
   (a) Set ε ← 1/3.
2. Let r ← min{ε²/6, 1/(n² log n log²(1/ε))}.
3. Compute m̃ ← e^{±1/10} m with probability 1 − r.
4. If m̃ < 4:
   (a) Return Sample-Edge-Bruteforce-IS(G).
5. Return Sample-Edge-Amplified-Hybrid(G, n, m̃, ε/5).

Lemma 6.11.
For an input graph G over n vertices (known to the algorithm) and m edges (unknown to the algorithm), Algorithm 18 (Sample-Edge-Hybrid) is a (2/3, ε)-uniform sampler of the edges of G. The expected query complexity is O(R log n log(n/ε) + log²(1/ε)), where R = min{√m, √(n/√m)}.

Proof. Estimating m̃ in the range e^{±1/10} m with probability 2/3 requires O(R log n) queries worst-case (Lemma 2.1). To increase this probability to 1 − r, for r = min{ε²/6, 1/(n² log n log²(1/ε))}, we take the median of O(log r⁻¹) = O(log(n/ε)) independent runs, at the cost of O(R log(n/ε) log n) queries.

The contribution of the brute-force sampler in the case where m̃ is wrongly small is Pr[m̃ < 4] · O(1 + m log n) = O(r m log n) = O(1) (the last transition holds since m ≤ n² and r ≤ 1/(n² log n)).

If m̃ ≥ 4, then the amplified sampler costs:
• O(R log n) in expectation, if m̃ ∈ e^{±1/10} m.
• O(n^{1/3} log n log²(1/ε)), if m̃ ∉ e^{±1/10} m.

Since m̃ is outside the correct range with probability at most r ≤ 1/(n² log n log²(1/ε)), the expected query complexity of the sampling logic is O(R log n log(1/ε)).

Let

λ′(G) = Σ_{a ∈ e^{±1/10} m} (Pr[m̃ = a] / Pr[m̃ ∈ e^{±1/10} m]) · λ(G, a) ≥ 3/4

(last transition: Lemma 6.10). By Lemma 6.10, the probability to sample an individual edge e ∈ E is at least e^{−ε/5} λ′(G) and at most e^{ε/5} λ′(G) + r. For 0 < ε ≤ 1/3, the ratio between the maximum and the minimum probability is at most:

(e^{ε/5} λ′(G) + r) / (e^{−ε/5} λ′(G)) ≤ e^{(2/5)ε} + (4/3) e^{ε/5} · (1/6)ε² ≤ e^{ε/2}

Therefore, the output is ε-uniform when conditioned on success.

7 Using the IS oracle

The main conceptual difference between the IS-only algorithm and the IS+LOCAL algorithm is the inability to determine the category of each vertex with certainty, since we have to estimate the degree rather than read it from the degree oracle.
Besides this, the algorithms differ in the need to simulate the local oracles using the IS oracle.

In the IS-only part, we use the following parameters:
• k = √m̃, a threshold degree. This is the same as k₂ of the technical overview.
• n* = min{n, 2m}, an upper bound on the number of effective vertices.
• ñ* = min{n, 2m̃}, an estimate of n*.

Degrees bounded by k are considered low, and degrees greater than k are considered high. Some statements refer to an extended range for low degrees, between 0 and 2k (rather than k). This follows from the need to test the degree of a vertex, instead of determining it exactly using the degree oracle.

Let r = poly(ε, 1/n) be a reasonable additive error. The algorithm uses O(min{√m, n/√m} · poly(log n) · log(1/r)) queries to obtain an approximation m̃, which is in the range e^{±1/10} m with probability at least 1 − r [CLW20, AHL26].

7.1 Sampling a neighbor

The neighbor sampler for parameters n, u and m̃ has two branches, depending on the choice of ñ*.
• If ñ* = n, then we choose a vertex v uniformly and use a single IS query to determine whether v ∈ N_u, in which case we return it, or not, in which case we reject.
• If ñ* ≠ n, then we draw a set S of density 1/m̃. If there is exactly one edge in S ∪ {u}, and this edge is adjacent to u, then we return its other endpoint v. Otherwise we reject.

Algorithm 19 (Sample-Neighbor-IS) provides the pseudocode for the neighbor sampler. In the following we define N_{u,v}(m̃), the neighborhood factor of u and v, and show that the probability to sample v (when u is given) is exactly N_{u,v}(m̃)/m̃. This can be seen as an alternative to the loneliness factor, with a different density. To draw an indicator whose expected value is N_{u,v}(m̃), we draw a set S ⊆ V of density p = 1/m̃ and test whether or not {u, v} is lonely in S.
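The loneliness event just described can be simulated directly. The following is a minimal Python sketch, not the paper's procedure: `is_independent` plays the role of the IS oracle, and `neighborhood_event` paraphrases the loneliness test (whose formal pseudocode, Test-Loneliness-IS, appears earlier in the paper) as three IS queries, on S, S ∪ {u} and S ∪ {v}.

```python
import random

def make_is_oracle(edges):
    """IS oracle: accepts (returns True) iff the queried vertex set spans no edge."""
    return lambda S: not any(a in S and b in S for a, b in edges)

def neighborhood_event(is_query, vertices, u, v, m_tilde, rng):
    """One draw of an indicator whose expectation is the neighborhood factor
    N_{u,v}(m~): draw S of density 1/m~ over the remaining vertices and accept
    iff uv would be the only edge spanned by S together with u and v."""
    S = {w for w in vertices if w not in (u, v) and rng.random() < 1.0 / m_tilde}
    # "uv is lonely": no edge inside S, no u-neighbor in S, no v-neighbor in S.
    return is_query(S) and is_query(S | {u}) and is_query(S | {v})
```

Averaging this indicator over independent draws estimates N_{u,v}(m̃); Lemma 7.2 guarantees that the expectation is at least 1/25 whenever m̃ ≥ max{20, e^{−1/10} m}.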
Algorithm 20 (Neighborhood-Event-IS) provides the pseudocode for this logic.

Observation 7.1. Algorithm 20 (Neighborhood-Event-IS) accepts the input (G, n, m̃; u, v) with probability exactly N_{u,v}(m̃). The query complexity is O(1) worst-case.

Algorithm 19: Procedure Sample-Neighbor-IS(G, n, m̃; u)
Input: m̃ ≥ max{5, e^{−1/10} m}.
Input: A vertex u.
Output: If ñ* = n: for every v ∈ N_u, the probability to return v is 1/n.
Output: If ñ* ≠ n: for every v ∈ N_u, the probability to return v is (1/m̃) N_{u,v}(m̃).
Complexity: 1 if ñ* = n (worst-case).
Complexity: O(log n) if ñ* ≠ n (worst-case).
1. Let ñ* ← min{n, 2m̃}.
2. If ñ* = n:
   (a) Draw v ∈ V uniformly.
   (b) If IS({u, v}) accepts: (no edge)
       i. Return reject.
   (c) Return v.
3. Else:
   (a) Let S ⊆ V be a set to which every non-u vertex belongs with probability 1/m̃, independently.
   (b) Let e ← Extract-Edge-IS(G; S ∪ {u}).
   (c) If e ≠ reject and u ∈ e:
       i. Let v be the other vertex of e.
       ii. If Test-Loneliness-IS(S; u, v):
           A. Return v.
   (d) Return reject.

Algorithm 20: Procedure Neighborhood-Event-IS(G, n, m̃; u, v)
Input: m̃ ≥ max{5, e^{−1/10} m}.
Input: A vertex u, a vertex v ∈ N_u.
Output: Expected value: N_{u,v}(m̃).
Complexity: O(1) (worst-case).
1. Draw a set S ⊆ V of density 1/m̃.
2. Return Test-Loneliness-IS(S; u, v).

The neighborhood factor of a high-degree vertex is Ω(1), as stated in the following lemma.

Lemma 7.2. Recall that k = √m̃ and assume that m̃ ≥ max{20, e^{−1/10} m}. For every vertex u and a neighbor v ∈ N_u, N_{u,v}(m̃) ≥ 1/25.

Proof. The following proof is identical to the proof of Lemma 4.7, up to the different density parameter. The neighborhood event is the negation of the union of the following events:
• S has a u-neighbor which is not v.
• S has a v-neighbor which is not u.
• S has an edge adjacent to neither u nor v.

The probability to have a u-neighbor (which is not v) or a v-neighbor (which is not u) is bounded by:

1 − (1 − p)^{(deg(u) − 1) + (deg(v) − 1)} ≤ 1 − (1 − p)^{2m}
≤ 1 − (1 − p² · 2m) e^{−p · 2m}   [Lemma 3.5]
= 1 − (1 − 2m/m̃²) e^{−2m/m̃}
≤ 1 − (1 − 2e^{1/10}/m̃) e^{−2e^{1/10}}
≤ 1 − (1 − 2e^{1/10}/20) e^{−2e^{1/10}}
≤ 0.903

The probability to have a non-u, non-v edge is bounded by p² m = m/m̃² ≤ e^{1/10}/m̃ ≤ e^{1/10}/20 ≤ 0.056. Combined, N_{u,v}(m̃) ≥ 1 − 0.903 − 0.056 = 0.041 > 1/25.

Lemma 7.3. A call to Sample-Neighbor-IS(G, n, m̃; u) samples every v ∈ N_u with probability exactly 1/n if ñ* = n and otherwise (1/m̃) N_{u,v}(m̃), at the cost of O(log n) IS queries worst-case.

Proof. For the complexity: if ñ* = n, then we make a single IS query. Otherwise, the call to Extract-Edge-IS costs O(log n) worst-case (Lemma 2.4) and the call to Test-Loneliness-IS costs O(1) worst-case (Observation 4.1). Correctness is trivial if ñ* = n. If ñ* ≠ n, then:

Pr[sample v | u] = Pr[v ∈ S] · Pr[uv is lonely in S | v ∈ S] = (1/m̃) · N_{u,v}(m̃)

7.2 Categorizing the degree

Recall the threshold degree k = √m̃ and the tolerable additive error r. For every vertex u, we test whether its degree is low or high. If deg(u) ≤ k then we say "low" with probability at least 1 − r, and if deg(u) ≥ 2k then we say "high" with probability at least 1 − r. Degrees in the range between k and 2k may be wrongly classified as "low" with any probability. The query complexity of the low-high test (Algorithm 21) is O(log(1/r)) worst-case.

To distinguish between the degree thresholds, we draw a set S of density 1/8k. The probability of the event "S ∪ {u} has edges but S \ {u} is independent" is bounded by a constant if deg(u) ≤ k, and is greater than (another) constant if deg(u) ≥ 2k.
We use O(log(1/r)) = O(log(n/ε)) rounds to amplify the success probability of the test to 1 − r.

Lemma 7.4. Assume that m̃ ≥ max{36, e^{−1/10} m}. Procedure Test-High-Degree-IS(G, n, m̃; u, r) (Algorithm 21) accepts with probability at least 1 − r if deg(u) ≥ 2k, and rejects with probability at least 1 − r if deg(u) ≤ k. Moreover, the query complexity is O(log(1/r)) worst-case.

Algorithm 21: Procedure Test-High-Degree-IS(G, n, m̃; u, r)
Input: m̃ ≥ max{36, e^{−1/10} m}.
Complexity: O(log(1/r)) worst-case.
Completeness: If deg(u) ≥ 2k, then the accept probability is at least 1 − r.
Soundness: If deg(u) ≤ k, then the reject probability is at least 1 − r.
1. Let k ← √m̃.
2. Let N ← ⌈400 ln(1/r)⌉.
3. Set M ← 0.
4. For N times:
   (a) Draw a set S ⊆ V to which every non-u element belongs with probability 1/8k, independently.
   (b) If IS(S ∪ {u}) rejects and IS(S \ {u}) accepts:
       i. Set M ← M + 1.
5. If M ≥ (13/80)N:
   (a) Return accept.
6. Else:
   (a) Return reject.

Proof. For the complexity, observe that we make O(log(1/r)) rounds, each costing two IS queries.

If deg(u) ≤ k, then the probability that S ∪ {u} has edges but S \ {u} is independent is bounded by the expected number of u-edges, which is (1/8k) · deg(u) ≤ 1/8. Therefore, M is distributed as Bin(N, p) for some p ≤ 1/8. The probability to accept is bounded by:

Pr[Bin(N, 1/8) ≥ (13/80)N] = Pr[Bin(N, 1/8) − (1/8)N ≥ (3/80)N] ≤ e^{−2((3/80)N)²/N} ≤ e^{−(9/3200)N} ≤ e^{−ln r⁻¹} = r

If deg(u) ≥ 2k, then the probability that S ∪ {u} has edges but S \ {u} is independent is at least:

Pr[S ∩ N_u ≠ ∅] − E[number of edges within S] ≥ (1 − (1 − 1/8k)^{deg(u)}) − m/(64k²)
≥ (1 − e^{−(1/8k)·(2k)}) − m/(64m̃)
≥ 1 − e^{−1/4} − e^{1/10}/64
≥ 1/5

Therefore, M is distributed as Bin(N, p) for some p > 1/5.
The probability to reject is bounded by:

Pr[Bin(N, 1/5) < (13/80)N] = Pr[Bin(N, 1/5) − (1/5)N < −(3/80)N] ≤ e^{−2((3/80)N)²/N} ≤ e^{−(9/3200)N} ≤ e^{−ln r⁻¹} = r

7.3 Sampling low-low edges

Algorithm 22 (Sample-L-L-Edge-IS) samples "low-low" edges. Formally, for every edge uv ∈ E, the probability that the output is "uv" (undirected) is in the range e^{±ε}/(200 m̃ ln(5/ε)) · Pr[T_u = 0] Pr[T_v = 0] ± r, where T_u and T_v are the indicators of acceptance by the high-degree test (Test-High-Degree-IS) with error parameter r. When no edge is returned, the algorithm explicitly declares failure by rejecting.

Algorithm 22: Procedure Sample-L-L-Edge-IS(G, n, ε; m̃, r)
Input: m̃ ≥ max{36, e^{−1/10} m}.
Output: For every edge e ∈ E, the probability to return e is e^{±ε}/(200 m̃ ln(5/ε)) · Pr[T_u = T_v = 0] ± r.
Complexity: O(log(n/ε) + log(1/r)) (worst-case).
1. Let e ← Sample-Lone-Edge-IS(G, n; m̃).
2. If e is reject:
   (a) Return reject.
3. Let uv = e (in an arbitrary order).
4. Filter by Estimate-Indicator-Inverse(Loneliness-Event-IS(G, n; m̃, u, v), ε, 1/2).
5. Let T_u ← Test-High-Degree-IS(G, n, m̃; u, r).
6. Let T_v ← Test-High-Degree-IS(G, n, m̃; v, r).
7. If T_u ≠ 0 or T_v ≠ 0:
   (a) Return reject.
8. Return e.

Lemma 7.5. Assume that m̃ ≥ max{36, e^{−1/10} m}. For every edge u₀v₀ ∈ E, the probability that Sample-L-L-Edge-IS(G, n, ε; m̃, r) returns u₀v₀ is in the range e^{±ε}/(200 m̃ ln(5/ε)) · Pr[T_{u₀} = 0 ∧ T_{v₀} = 0] ± r. Also, the query complexity is O(log(n/ε) + log(1/r)) worst-case.

Proof. For the query complexity, observe that there is a single call to Sample-Lone-Edge-IS at the cost of O(log n) queries, a single call to Estimate-Indicator-Inverse with ρ = 1/2 = Ω(1) at the cost of O(log(1/ε)) queries worst-case (Lemma 2.3), and two calls to Test-High-Degree-IS at the cost of O(log(1/r)) queries worst-case.

Let u₀v₀ ∈ E.
If deg(u₀) > 2k, then the probability to return the edge u₀v₀ is bounded by Pr[T_{u₀} = 0] ≤ r (Lemma 7.4). The same applies when deg(v₀) > 2k. We proceed assuming that deg(u₀) and deg(v₀) are both bounded by 2k = 2√m̃.

By Lemma 4.3, the probability that Sample-Lone-Edge-IS returns u₀v₀ is L_{u₀,v₀}(m̃)/100m̃. By Lemma 4.7, L_{u₀,v₀}(m̃) ≥ 1/2, and therefore the probability to pass the filter is in the range e^{±ε}/(2 L_{u₀,v₀}(m̃) ln(5/ε)). The probability to return the edge e₀ = u₀v₀ is:

Pr[e = e₀] · E[pass filter | e = e₀] · Pr[T_{u₀} = T_{v₀} = 0]
= (L_{u₀,v₀}(m̃)/100m̃) · e^{±ε}/(2 L_{u₀,v₀}(m̃) ln(5/ε)) · Pr[T_{u₀} = T_{v₀} = 0]
= e^{±ε}/(200 m̃ ln(5/ε)) · Pr[T_{u₀} = T_{v₀} = 0]

7.4 Sampling high-low and high-high edges

In Algorithm 23, implementing Sample-H-Edge-IS, we sample a star vertex and then sample a neighbor. After normalizing the starness factor and the neighborhood factor, we test the degree of each vertex. If the star vertex's degree is classified as low, we reject. Otherwise, we return the edge with probability 1 if the neighbor's degree is classified as low, and with probability 1/2 otherwise. This eliminates the double-counting when both vertices are classified as high-degree.

Algorithm 23: Procedure Sample-H-Edge-IS(G, n, ε; m̃, r)
Input: m̃ ≥ max{36, e^{−1/10} m}.
Output: For every edge uv ∈ E, the probability to return uv is e^{±ε}/(3750 ñ* √m̃ ln(5/ε)) · Pr[T_u = 1 ∨ T_v = 1] ± 2r.
Complexity: O(log n log(1/ε)) (worst-case).
1. Let k ← √m̃.
2. Let u ← Sample-Star-Vertex-IS(G, n, ε; m̃).
3. If u is reject:
   (a) Return reject.
4. Let v ← Sample-Neighbor-IS(G, n, m̃; u).
5. If v is reject:
   (a) Return reject.
6. Let T_u ← Test-High-Degree-IS(G, n, m̃; u, r).
7. If T_u = 0:
   (a) Return reject.
8. Let T_v ← Test-High-Degree-IS(G, n, m̃; v, r).
9.
If T_v = 1:
   (a) Return reject with probability 1/2.
10. If ñ* = n:
   (a) Filter by Estimate-Indicator-Inverse(Starness-Event-IS(G, n; m̃, u), ε, 1/375).
11. Else:
   (a) Filter by Estimate-Indicator-Inverse(A, ε, 1/750) with the following procedure A:
       • Let b₁ ← Starness-Event-IS(G, n; m̃, u).
       • Let b₂ ← Neighborhood-Event-IS(G, n; m̃, u, v).
       • Accept if and only if both b₁ and b₂ are "accept".
12. Return uv.

Lemma 7.6. Assume that m̃ ≥ max{36, e^{−1/10} m}, and consider the directed edge u₀ → v₀. The probability of Algorithm 23 (Sample-H-Edge-IS) to sample the directed edge u₀ → v₀ (through u = u₀ and v = v₀) is in the range

e^{±ε}/(3750 ñ* √m̃ ln(5/ε)) · Pr[T_{u₀} = 1] (Pr[T_{v₀} = 0] + (1/2) Pr[T_{v₀} = 1]) ± r

Proof. For brevity, let α_{u₀,v₀} = Pr[T_{u₀} = 1] (Pr[T_{v₀} = 0] + (1/2) Pr[T_{v₀} = 1]). If deg(u₀) ≤ k, then the probability to pass the (T_{u₀} = 1)-test is bounded by r. Since e^{ε}/(3750 ñ* √m̃ ln(5/ε)) < 1, the probability to sample the directed edge u₀ → v₀ is indeed in the range e^{±ε}/(3750 ñ* √m̃ ln(5/ε)) · α_{u₀,v₀} ± r.

If deg(u₀) > k, then:
• The probability of u = u₀ is S_{u₀}(m̃)/10√m̃ (Lemma 4.6).
• The probability of v = v₀ is 1/n if ñ* = n, and N_{u₀,v₀}(m̃)/m̃ otherwise (Lemma 7.3).
• The probability to pass the test for T_{u₀} is Pr[T_{u₀} = 1].
• The probability to pass the test for T_{v₀} is Pr[T_{v₀} = 0] + (1/2) Pr[T_{v₀} = 1].
• If ñ* = n: the probability to pass the last filter is e^{±ε}/(375 S_{u₀}(m̃) ln(5/ε)) (Lemma 2.3, Lemma 4.8).
• If ñ* ≠ n: the probability to pass the last filter is e^{±ε}/(750 S_{u₀}(m̃) N_{u₀,v₀}(m̃) ln(5/ε)) (Lemma 2.3, Lemma 4.8, Lemma 7.2).
If ñ* = n, then the probability to return the edge with u = u₀ and v = v₀ is:

(S_{u₀}(m̃)/10√m̃) · (1/n) · α_{u₀,v₀} · e^{±ε}/(375 S_{u₀}(m̃) ln(5/ε)) = e^{±ε} α_{u₀,v₀}/(3750 n √m̃ ln(5/ε)) = e^{±ε} α_{u₀,v₀}/(3750 ñ* √m̃ ln(5/ε))

If ñ* = 2m̃, then the probability to return the edge with u = u₀ and v = v₀ is:

(S_{u₀}(m̃)/10√m̃) · (N_{u₀,v₀}(m̃)/m̃) · α_{u₀,v₀} · e^{±ε}/(750 S_{u₀}(m̃) N_{u₀,v₀}(m̃) ln(5/ε)) = e^{±ε} α_{u₀,v₀}/(7500 m̃ √m̃ ln(5/ε)) = e^{±ε} α_{u₀,v₀}/(3750 ñ* √m̃ ln(5/ε))

Lemma 7.7. Assume that m̃ ≥ max{36, e^{−1/10} m}. For every edge u₀v₀ ∈ E, the probability that Sample-H-Edge-IS(G, n, ε; m̃, r) returns u₀v₀ is in the range e^{±ε}/(3750 ñ* √m̃ ln(5/ε)) · Pr[T_{u₀} = 1 ∨ T_{v₀} = 1] ± 2r. Also, regardless of m̃, the query complexity is O(log n log(1/ε) + log(1/r)) worst-case.

Proof. For the query complexity, observe that we have:
• A single Sample-Star-Vertex-IS call, at the cost of O(log n) queries (worst-case, Lemma 4.6).
• A single Sample-Neighbor-IS call, at the cost of O(log n) queries (worst-case, Lemma 7.3).
• Two calls to Test-High-Degree-IS, at the cost of O(log(1/r)) queries (worst-case, Lemma 7.4).
• A single call to Estimate-Indicator-Inverse with parameter ρ = Ω(1), at the cost of O(log(1/ε)) calls to Starness-Event-IS and Neighborhood-Event-IS (worst-case, Lemma 2.3), each costing O(log n) (worst-case, Observation 4.5, Observation 7.1).

For brevity, let α_{u₀,v₀} = Pr[T_{u₀} = 1] (Pr[T_{v₀} = 0] + (1/2) Pr[T_{v₀} = 1]). Observe that α_{u₀,v₀} + α_{v₀,u₀} = Pr[T_{u₀} = 1 ∨ T_{v₀} = 1], since T_{u₀} and T_{v₀} are independent. For the undirected edge, we consider both directions.
Pr[sample u₀v₀] = Pr[u₀ → v₀] + Pr[v₀ → u₀]
= (e^{±ε} α_{u₀,v₀}/(3750 ñ* √m̃ ln(5/ε)) ± r) + (e^{±ε} α_{v₀,u₀}/(3750 ñ* √m̃ ln(5/ε)) ± r)   [Lemma 7.6]
= e^{±ε}/(3750 ñ* √m̃ ln(5/ε)) · (α_{u₀,v₀} + α_{v₀,u₀}) ± 2r
= e^{±ε}/(3750 ñ* √m̃ ln(5/ε)) · Pr[T_{u₀} = 1 ∨ T_{v₀} = 1] ± 2r

7.5 Sampling all edges

We combine the category-specific edge-sampling procedures by unifying their coefficients.

Algorithm 24: Procedure Sample-Edge-Core-IS(G, n, ε; m̃)
Input: 0 < ε ≤ 1/3.
Input: m̃ ≥ max{36, e^{−1/10} m}.
Output: For every edge uv ∈ E, the probability to return uv is e^{±2ε}/(15000 ñ* √m̃ ln(5/ε)).
Complexity: O(log n log(1/ε)) worst-case.
1. Let r ← (ε²/ln(5/ε))/15000n².
2. Toss a three-headed coin:
   • With probability √m̃/75ñ*: Return Sample-L-L-Edge-IS(G, n, ε; m̃, r).
   • With probability 1/4: Return Sample-H-Edge-IS(G, n, ε; m̃, r).
   • Otherwise: reject.

Lemma 7.8. For every graph G = (V, E) (over n vertices and m edges), if m̃ ≥ max{36, e^{−1/10} m} and 0 < ε ≤ 1/3, then Algorithm 24 (Sample-Edge-Core-IS) returns every edge uv ∈ E with probability in the range e^{±2ε}/(15000 ñ* √m̃ ln(5/ε)) for ñ* = min{n, 2m̃}, and otherwise rejects. The query complexity is O(log n log(1/ε)) worst-case.

Proof. For the complexity, using log(1/r) = O(log(n/ε)):
• The cost of Sample-L-L-Edge-IS is O(log(n/ε)) (Lemma 7.5).
• The cost of Sample-H-Edge-IS is O(log n log(1/ε) + log(n/ε)) (Lemma 7.7).

Note that √m̃/ñ* = √m̃/min{n, 2m̃} = max{√m̃/n, 1/2√m̃} ≤ max{√(n²/2)/n, 1/2} ≤ 1.
The probability to return every individual edge uv is:

Pr[sample uv] = (√m̃/75ñ*) · (e^{±ε}/(200 m̃ ln(5/ε)) Pr[T_u = T_v = 0] ± r) + (1/4) · (e^{±ε}/(3750 ñ* √m̃ ln(5/ε)) Pr[T_u = 1 ∨ T_v = 1] ± 2r)
= e^{±ε}/(15000 ñ* √m̃ ln(5/ε)) · (Pr[T_u = T_v = 0] + Pr[T_u = 1 ∨ T_v = 1]) ± (1/75 + 2 · 1/4) r
= e^{±ε}/(15000 ñ* √m̃ ln(5/ε)) ± r

Observe that, for ε ≤ 1/3:

n² ≥ ñ* · √m̃
ε² ≤ min{e^{2ε} − e^{ε}, e^{−ε} − e^{−2ε}}
r ≤ min{e^{2ε} − e^{ε}, e^{−ε} − e^{−2ε}} / (15000 ñ* √m̃ ln(5/ε))

Therefore,

Pr[sample uv] = e^{±ε}/(15000 ñ* √m̃ ln(5/ε)) ± min{e^{2ε} − e^{ε}, e^{−ε} − e^{−2ε}}/(15000 ñ* √m̃ ln(5/ε)) = e^{±2ε}/(15000 ñ* √m̃ ln(5/ε))

The core sampler is a (λ, ε)-uniform sampler, but λ can be very small. Algorithm 25 (Sample-Edge-Amplified-IS) amplifies the success probability to 3/4 by making O(R log(1/ε)) sampling attempts, for R = min{√m, n/√m}, and returning the first successful one. This algorithm requires an advice value m̃, and its correctness and expected query complexity statements only hold if m̃ ∈ e^{±1/10} m.

Algorithm 25: Procedure Sample-Edge-Amplified-IS(G, n, m̃, ε)
Input: m̃ ≥ max{36, e^{−1/10} m}.
Output: For every edge e ∈ E, the probability to return e is e^{±ε} λ(G, m̃) (where λ(G, m̃) ≥ 3/4 if m̃ ≥ e^{−1/10} m).
Complexity: O(min{√m, n/√m} · log n log²(1/ε)) expected, if m̃ ≥ e^{−1/10} m.
Complexity: O(√n log n log²(1/ε)) worst-case.
1. Let R̃ ← min{√m̃, n/√m̃}.
2. For ⌈10⁶ R̃ ln(5/ε)⌉ times:
   (a) Let e ← Sample-Edge-Core-IS(G, n, ε/5; m̃).
   (b) If e is not reject:
       i. Return e.
3. Return reject.

Lemma 7.9.
For every graph G = (V, E) (over n vertices and m edges), if m̃ ≥ max{36, e^{−1/10} m} and 0 < ε ≤ 1/3, then Algorithm 25 (Sample-Edge-Amplified-IS) returns every edge uv ∈ E with probability in the range e^{±ε} λ(G, m̃), where λ(G, m̃) ≥ 3/4, and otherwise rejects. The query complexity is O(R log n log²(1/ε)) if m̃ ∈ e^{±1/10} m, and otherwise O(√n log n log²(1/ε)).

Proof. Let R = min{√m, n/√m}. The success probability of an iteration is at least:

e^{−(2/5)ε} m/(15000 ñ* √m̃ ln(5/ε)) ≥ m/(30000 e^{2/15} · R̃ · e^{1/10} m ln(5/ε)) ≥ 1/(40000 R̃ ln(5/ε))

Therefore, the success probability of the loop is:

λ(G, m̃) ≥ 1 − (1 − 1/(40000 R̃ ln(5/ε)))^{10⁶ R̃ ln(5/ε)} ≥ 1 − e^{−25} ≥ 3/4

For every edge e ∈ E, the probability to sample it is in the range e^{±2(ε/5)} λ(G, m̃). The query complexity is O(R̃ log(1/ε)) rounds at O(log n log(1/ε)) per round (Lemma 7.8). If m̃ ≥ 36 and m̃ ∈ e^{±1/10} m, then R̃ = O(R); otherwise the worst case is R̃ = O(√n).

The entry-point sampling algorithm, Sample-Edge-IS (Algorithm 26), estimates m̃ ∈ e^{±1/10} m with success probability 1 − r, where r = min{ε²/6, 1/(n² log n)}.

Algorithm 26: Procedure Sample-Edge-IS(G, n, ε)
1. If ε > 1/3:
   (a) Set ε ← 1/3.
2. Let r ← min{ε²/6, 1/(n² log n)}.
3. Compute m̃ ← e^{±1/10} m with probability 1 − r.
4. If m̃ < 36:
   (a) Return Sample-Edge-Bruteforce-IS(G).
5. Return Sample-Edge-Amplified-IS(G, n, m̃, ε/5).

Lemma 7.10. For every graph G = (V, E) (over n vertices and m edges), Algorithm 26 (Sample-Edge-IS) returns every edge uv ∈ E with probability in the range e^{±4ε}/m, and otherwise rejects. The expected query complexity is O(R poly(log n) log²(1/ε)) for R = min{√m, n/√m}.

Proof.
Estimating m̃ in the range e^{±1/10} m with probability 2/3 requires O(R · poly(log n)) queries worst-case (Lemma 2.2). To increase this probability to 1 − r, we take the median of O(log r⁻¹) = O(log(n/ε)) independent runs, at the cost of O(R poly(log n) log(n/ε)) queries.

The contribution of the brute-force sampler in the case where m̃ is wrongly small is Pr[m̃ < 36] · O(1 + m log n) = O(r m log n) = O(1) (the last transition holds since m ≤ n² and r ≤ 1/(n² log n)).

The amplified sampler costs O(R · log n log²(1/ε)) if m̃ ∈ e^{±1/10} m, and O(√n log n log²(1/ε)) if m̃ ∉ e^{±1/10} m, which happens with probability at most r ≤ 1/(n² log n). Therefore, the expected query complexity of the sampling logic is O(R log n log²(1/ε)) (in addition to the estimation of m̃).

Let

λ′(G) = Σ_{a ∈ e^{±1/10} m} (Pr[m̃ = a] / Pr[m̃ ∈ e^{±1/10} m]) · λ(G, a) ≥ 3/4

(last transition: Lemma 7.9). By Lemma 7.9 and the union bound, the probability to sample an individual edge e ∈ E is at least e^{−ε/5} λ′(G) and at most e^{ε/5} λ′(G) + r. For 0 < ε ≤ 1/3, the ratio between the maximum and the minimum probability is at most:

(e^{ε/5} λ′(G) + r) / (e^{−ε/5} λ′(G)) ≤ e^{(2/5)ε} + (4/3) e^{ε/5} · (1/6)ε² ≤ e^{ε/2}

Therefore, the output is ε-uniform when conditioned on success.

8 Lower bounds

We describe a paradigm for showing lower bounds for sampling algorithms. Consider two graphs G₁ = (V, E₁) and G₂ = (V, E₂) for which E₁ ⊆ E₂. For every relabeling π: V → V, let E^π_b be the set of edges of π(G_b) (b ∈ {1, 2}).

We formally define the behavior of an algorithm over a random input as a distribution.

Definition 8.1. Let A be a query-making algorithm, and let 𝒢 be a distribution over inputs.
We define the distribution of runs of A, given an input drawn from 𝒢, as the distribution for which, for every query-answer sequence σ,

Pr_{A(𝒢)}[σ] = Pr_{G∼𝒢}[A executes the query-answer sequence σ when given the input G].

Consider the model of double inputs: the algorithm's input consists of two graphs, and every graph query is performed on both of them. If a single-input algorithm cannot distinguish between π(G₁) and π(G₂) using a specific sequence of queries, then it cannot distinguish between them even if π(G₁) is given in addition to the actual input G ∈ {π(G₁), π(G₂)}. This is stated in the following observation:

Observation 8.2. Let G₁ and G₂ be two graphs over the same vertex set V, and let π: V → V be a random relabeling of the vertices. Assume that for every single-graph algorithm A that makes q queries in expectation when given an input of the form π(G₁),

d_TV(A(π(G₁)), A(π(G₂))) ≤ d(q).

In this setting, for every double-graph algorithm A′ that makes q queries in expectation when given an input of the form (π(G₁), π(G₁)),

d_TV(A′(π(G₁), π(G₁)), A′(π(G₁), π(G₂))) ≤ d(q).

Lemma 8.3. Let G₁ and G₂ be two graphs over the same vertex set V, and let π: V → V be a random relabeling of the vertices. Assume that an algorithm A, in a query model that allows determining the existence of a given edge³, is a (λ, ε)-uniform sampler that makes q queries in expectation when given an input of the form π(G₁). In this setting, there exists a double-input algorithm A′ that makes q + 1 queries in expectation when given an input of the form (π(G₁), π(G₁)), for which

d_TV(A(π(G₁)), A(π(G₂))) = d_TV(A′(π(G₁), π(G₁)), A′(π(G₁), π(G₂))) ≥ λ e^{−ε} |E₂ \ E₁| / |E₂|.

Proof.
We define A′(H₁, H₂) as the algorithm that runs A as-is on the graph H₂, and then makes one additional IS query to determine whether the resulting edge belongs to H₁. Since A is a (λ, ε)-uniform sampler, the probability to sample an edge of E₂ \ E₁ from H₁ is zero, and the probability to sample such an edge from H₂ is at least λ · e^{−ε} · |E₂ \ E₁| / |E₂|. That is, the additional query forces a (λ e^{−ε} |E₂ \ E₁| / |E₂|)-difference between the behavior of the algorithm on inputs of the form (π(G₁), π(G₁)) and on inputs of the form (π(G₁), π(G₂)). Observe that the total-variation distance between the behaviors of the two algorithms is the same, since the last query of A′ differs only if the execution of A has already forked into paths resulting in different outputs.

Lemma 8.4 (Lower-bound mechanism). Let n₀ ≥ 1, m₀ ≥ 1, 0 < ρ < 1, 0 < ε₀ < 1 and some R(n, m) > 0. Assume that there exists a globally fixed constant C such that for every n ≥ n₀ and every m₀ ≤ m ≤ ρn², there exist two graphs G₁ and G₂ on the same vertex set V for which:
• Considering the edge sets, E₁ ⊆ E₂.
• |E₁| ≤ m.
• |E₂| − |E₁| ≥ ε₀ m.
• |E₂| − |E₁| ≤ m.
• Let π: V → V be a uniformly chosen relabeling of the vertices. For every algorithm A that makes q queries in expectation when given an input of the form π(G₁), d_TV(A(π(G₁)), A(π(G₂))) ≤ (C/R(n, m)) · q.

In this setting, every (λ, ε₀)-uniform edge sampler must make (1/2C) λ e^{−ε₀} ε₀ R(n, m) queries in expectation when given an input of the form π(G₁).

³ Such as every model that allows IS queries.

Proof. By Lemma 8.3, if A is a (λ, ε₀)-uniform sampler, then for an appropriate double-input algorithm A′ that makes one more query:

d_TV(A′(π(G₁), π(G₁)), A′(π(G₁), π(G₂))) ≥ λ e^{−ε₀} |E₂ \ E₁| / |E₂| ≥ λ e^{−ε₀} · ε₀ m / 2m = (1/2) λ e^{−ε₀} ε₀.
Since the total-variation distance between the behaviors of the modified algorithms is bounded by the total-variation distance between the behaviors of the sampling algorithm,

(C/R(n, m)) · q ≥ (1/2) λ e^{−ε₀} ε₀.

This results in q ≥ (1/2C) λ e^{−ε₀} ε₀ · R(n, m).

For hybrid algorithms:

Lemma 8.5 (Lower bound for hybrid algorithms). For a sufficiently small, fixed ε, every ε-uniform edge-sampling algorithm that uses IS queries and local queries must make at least Ω(min{√m, √(n/√m)}) queries in expectation.

Proof. For some globally fixed constants n₀ ≥ 1, m₀ ≥ 1 and 0 < ε₀ < 1, let n ≥ n₀, m ≥ m₀ and 0 < ε ≤ ε₀. Also, let R(n, m) = min{√m, √(n/ε√m)}. By [AHL26], for a globally fixed constant C (independent of n, m and ε), there exist two graphs named "G_{n,m/n,0}" (denoted here by G₁) and "G_{n,m/n,ε}" (denoted here by G₂) so that:
• Considering the edge sets, E₁ ⊆ E₂.
• |E₁| ≤ m.
• |E₂| − |E₁| ≥ εm.
• |E₂| − |E₁| ≤ m (stated in [AHL26] as Θ(εm)).
• For every algorithm A that makes q queries in expectation, d_TV(A(π(G₁)), A(π(G₂))) ≤ (C/R(n, m)) · q.

This matches the constraints of Lemma 8.4 for R(n, m) = min{√m, √(n/√m)}, since ε is a fixed constant.

For the IS algorithm, the lower bound of [CLW20] for degree estimation is Ω̃(min{√m, n/√m}), and the edge-containment constraint (E₁ ⊆ E₂) is present in the text but not explicitly stated. Below we attach an alternative proof, based on the construction ideas of [AHL26], of an Ω(min{√m, n/√m}) lower bound for a fixed ε.

Lemma 8.6. For every n ≥ 16 and 9 ≤ m ≤ n²/16, there exist two graphs G_{n,m} and H_{n,m} over {1, ..., n} for which:
• G_{n,m} is a subgraph of H_{n,m}.
• G_{n,m} contains at most m edges.
• H_{n,m} contains at least (1/2)m additional edges.
• H_{n,m} contains at most m additional edges.
• There exists a globally fixed constant C so that, for every algorithm that only uses IS queries and makes q queries in expectation, d_TV(A(π(G_{n,m})), A(π(H_{n,m}))) ≤ (C / R(n, m)) · q, for R(n, m) = min{√m, n/√m}.

We prove each sub-statement of Lemma 8.6 in Lemmas 8.8, 8.9, 8.10, 8.11 and 8.14.

Lemma 8.7 (Lower bound for IS algorithms). For a sufficiently small, fixed ε, every ε-uniform edge-sampling algorithm that uses IS queries must make at least Ω(min{√m, n/√m}) queries.

Proof. Apply Lemma 8.4 with the construction stated in Lemma 8.6.

8.1 The IS lower-bound construction

Our construction is based on the clique+biclique idea of the lower bound presented in [AHL26]. The following analysis is shorter, since it refers to a hard-coded choice of ε and only considers IS queries.

For n ≥ 16 and 9 ≤ m ≤ (1/16)n², we define k = ⌊√m⌋, m′ = m − binom(k, 2), h = ⌈2m′/n⌉, ℓ = ⌈m′/h⌉, and two graphs:

• G_{n,m} is the union of a k-clique and n − k singleton vertices.
• H_{n,m} is the union of a k-clique, an (ℓ, h)-biclique and n − (k + h + ℓ) singleton vertices.

Note that ℓ ≈ (1/2)n is the number of low-degree vertices and h ≈ m/n is the number of high-degree vertices (note that h ≤ ℓ).

Lemma 8.8. The graphs G_{n,m} and H_{n,m} are well-defined.

Proof. It suffices to show that the number of singletons is non-negative in the H-graph:

  k + h + ℓ ≤ √m + (2m′/n + 1) + (m′ / ⌈2m′/n⌉ + 1) ≤ √m + 2m/n + (1/2)n + 2 ≤ (1/4)n + (1/8)n + (1/2)n + 2 = (7/8)n + 2 ≤ n.

Lemma 8.9. The number of edges in G_{n,m} is at most (1/2)m.

Proof. The number of edges in G_{n,m} is

  binom(k, 2) ≤ (1/2)k² ≤ (1/2)⌊√m⌋² ≤ (1/2)m.

Lemma 8.10. The number of edges in H_{n,m} not belonging to G_{n,m} is at least (1/2)m.

Proof. The number of additional edges in H_{n,m} is:

  h · ℓ = h · ⌈m′/h⌉ ≥ m′ = m − binom(k, 2) ≥ m − (1/2)m = (1/2)m.

Lemma 8.11. The number of edges in H_{n,m} not belonging to G_{n,m} is at most m.

Proof.
For m ≥ 9:

  binom(k, 2) = binom(⌊√m⌋, 2) ≥ (1/5)m

(the minimum is achieved where m + 1 is a square; this can be validated explicitly for m = 15 and algebraically for m ≥ 24 through ⌊x⌋ ≥ x − 1). For m′:

  m′ = m − binom(k, 2) ≤ m − (1/5)m = (4/5)m.

Since n ≥ 16, the number of additional edges in H_{n,m} is:

  h · ℓ ≤ h · (m′/h + 1) = m′ + h ≤ m′ + (2m′/n + 1) = (1 + 2/n)m′ + 1 ≤ (9/8)m′ + 1 ≤ (9/8) · (4/5)m + 1.

For m ≥ 10 we directly obtain that h · ℓ ≤ m. For m = 9 we obtain that h · ℓ ≤ 9.1, but since it must be an integer, it is bounded by m.

We color the vertices using four colors: A (singletons), K (clique), L (low-degree part, ℓ ≈ n/2 vertices) and H (high-degree part, h ≈ m/n vertices). Observe that the result of an independent-set query depends only on the colors of its vertices.

To analyze the lower bound, we consider a deterministic adaptive algorithm; the lower bound extends to probabilistic algorithms through Yao's principle [Yao77]. We use a simulation argument similar to [AHL26]: we translate every IS query into a sequence of "color queries", each of which reveals the color of a single vertex (of π(G) or π(H)). Note that all non-K-colored vertices of π(H) have color A in π(G). Since the additional edges of H must be between an L-colored vertex and an H-colored vertex, the behavior can differ only when an A-colored vertex of π(G) has color H in π(H). This is stated in the following observation.

Observation 8.12. Let A′ be a double-input algorithm. The distance between runs of A′ when given an input (π(G), π(G)) and runs of A′ when given an input (π(G), π(H)), conditioned on the number of color queries Q made when given the input (π(G), π(G)), is bounded by (h / (n − k)) · Q.
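Because the construction of Section 8.1 is fully explicit, the counting claims of Lemmas 8.8–8.11 can also be double-checked numerically. The sketch below is our own illustration, not part of the paper; the helper name `construct` and the parameter grid are ours. It builds the edge sets of G_{n,m} and H_{n,m} and asserts each bound over a range of valid (n, m) pairs.

```python
import math
from itertools import combinations

def construct(n, m):
    """Build the edge sets of G_{n,m} and H_{n,m} from Section 8.1.

    G_{n,m}: a k-clique plus n - k singleton vertices.
    H_{n,m}: the same k-clique plus an (l, h)-biclique on fresh
    vertices; remaining vertices are singletons (not materialized,
    since only the edge sets matter here).
    """
    k = math.isqrt(m)                 # k = floor(sqrt(m))
    m_prime = m - k * (k - 1) // 2    # m' = m - binom(k, 2)
    h = -(-2 * m_prime // n)          # h = ceil(2m'/n), via negated floor div
    l = -(-m_prime // h)              # l = ceil(m'/h)
    clique = set(combinations(range(k), 2))
    left = range(k, k + l)            # low-degree side (color L)
    right = range(k + l, k + l + h)   # high-degree side (color H)
    biclique = {(u, v) for u in left for v in right}
    return clique, clique | biclique, k, h, l

# Check Lemmas 8.8-8.11 over a grid of valid (n, m) pairs.
for n in [16, 64, 200]:
    for m in range(9, n * n // 16 + 1, 7):
        E_G, E_H, k, h, l = construct(n, m)
        assert E_G <= E_H                  # G_{n,m} is a subgraph of H_{n,m}
        assert k + h + l <= n              # Lemma 8.8: well-defined
        assert len(E_G) <= m / 2           # Lemma 8.9: at most m/2 edges
        extra = len(E_H) - len(E_G)
        assert extra >= m / 2              # Lemma 8.10: at least m/2 extra
        assert extra <= m                  # Lemma 8.11: at most m extra
print("all construction checks passed")
```

The assertions mirror the proofs: h · ℓ ≥ m′ yields Lemma 8.10, and (9/8)m′ + 1 ≤ m for m ≥ 10 yields Lemma 8.11.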
For a uniformly chosen relabeling π : V → V, we consider a run of a deterministic adaptive algorithm on the input π(G_{n,m}), and then examine this run to bound the probability that the algorithm behaves differently on the input π(H_{n,m}).

Lemma 8.13. Assume that we simulate every IS query in the graph G_{n,m} by a sequence of color queries: given a query set S, we sequentially query the vertices of S until we obtain two K-vertices, in which case we return "not independent", or until we exhaust all vertices, in which case we return "independent". In this setting, for q ≤ (1/4)k, the first q query simulations reveal at most (4n/k)q vertex colors in expectation.

Proof. For 1 ≤ i ≤ q, assume that (at most) 2(i − 1) K-colored vertices have already been found. The probability that a "new" vertex is K-colored is at least U_K / U_V ≥ (k − 2(i − 1)) / n, where U_K is the number of unseen K-colored vertices and U_V is the number of unrevealed vertices; this probability cannot decrease as we find A-colored vertices (since each such reveal decreases the denominator while keeping the numerator unchanged). Therefore, the expected number of color queries until finding a new K-colored vertex is bounded by n / (k − 2(i − 1)). By the same analysis, the expected number of color queries until finding the second K-colored vertex is bounded by n / (k − (2(i − 1) + 1)). Note that this search may terminate even earlier if we exhaust all the vertices of S_i. Combined, the expected number of color queries in the first q queries is bounded by:

  Σ_{i=1}^{q} [ n / (k − 2(i − 1)) + n / (k − (2(i − 1) + 1)) ] ≤ Σ_{i=1}^{q} 2n / (k − 2q) ≤ Σ_{i=1}^{q} 2n / (k − 2 · (k/4)) = (4n/k) · q.

Lemma 8.14. Consider an algorithm that, when given an input of the form π(G_{n,m}), where π is a uniformly chosen relabeling, makes Q IS queries.
The total-variation distance between the behavior of the algorithm on the input π(G_{n,m}) and its behavior on the input π(H_{n,m}) is bounded by 64 E[Q] / R, for R = min{√m, n/√m}.

Proof. As observed before (Observation 8.12), the total-variation distance between the runs is bounded by the product of (1) the ratio between h and the number of singletons in G_{n,m}, and (2) the number of color queries. By Lemma 8.13, the expected number of color queries is bounded by:

  Σ_{r=1}^{⌊k/4⌋} Pr[Q = r] · (4n/k) r + n · Pr[Q > k/4] ≤ (4n/k) Σ_{r=1}^{⌊k/4⌋} Pr[Q = r] · r + n · E[Q] / (k/4) ≤ (8n/k) E[Q].

This bounds the total-variation distance:

  d_TV(A(π(G_{n,m})), A(π(H_{n,m}))) ≤ (h / (n − k)) · E[color queries]
    ≤ (⌈2m′/n⌉ / (n − ⌊√m⌋)) · (8n/k) E[Q]
    ≤ ((2m/n + 1) / ((3/4)n)) · (8n / (√m/2)) E[Q]
    = ((128/3) · √m/n + (64/3) · 1/√m) E[Q]
    ≤ 64 max{√m/n, 1/√m} E[Q]
    = 64 E[Q] / min{√m, n/√m}.

References

[ABG+18] Maryam Aliakbarpour, Amartya Shankha Biswas, Themis Gouleakis, John Peebles, Ronitt Rubinfeld, and Anak Yodpinyanee. Sublinear-time algorithms for counting star subgraphs via edge sampling. Algorithmica, 80(2):668–697, 2018.

[AHL26] Tomer Adar, Yahel Hotam, and Amit Levi. When local and non-local meet: Quadratic improvement for edge estimation with independent set queries. arXiv preprint arXiv:2601.21457, 2026.

[AKK19] Sepehr Assadi, Michael Kapralov, and Sanjeev Khanna. A simple sublinear-time algorithm for counting arbitrary subgraphs via edge sampling. In 10th Innovations in Theoretical Computer Science, ITCS 2019, page 6, 2019.

[AMM22] Raghavendra Addanki, Andrew McGregor, and Cameron Musco. Non-adaptive edge counting and sampling via bipartite independent set queries. In 30th Annual European Symposium on Algorithms (ESA 2022), pages 2–1. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2022.

[BBGM19] Anup Bhattacharya, Arijit Bishnu, Arijit Ghosh, and Gopinath Mishra.
Hyperedge estimation using polylogarithmic subset queries. arXiv preprint arXiv:1908.04196, 2019.

[BBGM21] Anup Bhattacharya, Arijit Bishnu, Arijit Ghosh, and Gopinath Mishra. On triangle estimation using tripartite independent set queries. Theory of Computing Systems, 65(8):1165–1192, 2021.

[BCS26] Lorenzo Beretta, Deeparnab Chakrabarty, and C. Seshadhri. Faster estimation of the average degree of a graph using random edges and structural queries. In Proceedings of the 2026 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 939–971. SIAM, 2026.

[BER21] Amartya Shankha Biswas, Talya Eden, and Ronitt Rubinfeld. Towards a decomposition-optimal algorithm for counting and sampling arbitrary motifs in sublinear time. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021), pages 55–1, 2021.

[BGMP19] Arijit Bishnu, Arijit Ghosh, Gopinath Mishra, and Manaswi Paraashar. Efficiently sampling and estimating from substructures using linear algebraic queries. arXiv preprint arXiv:1906.07398, 2019.

[BHPR+20] Paul Beame, Sariel Har-Peled, Sivaramakrishnan Natarajan Ramamoorthy, Cyrus Rashtchian, and Makrand Sinha. Edge estimation with independent set oracles. ACM Transactions on Algorithms (TALG), 16(4):1–27, 2020.

[CLW20] Xi Chen, Amit Levi, and Erik Waingarten. Nearly optimal edge estimation with independent set queries. In Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2916–2935. SIAM, 2020.

[DL21] Holger Dell and John Lapinskas. Fine-grained reductions from approximate counting to decision. ACM Transactions on Computation Theory (TOCT), 13(2):1–24, 2021.

[DLM22] Holger Dell, John Lapinskas, and Kitty Meeks. Approximately counting and sampling small witnesses using a colorful decision oracle. SIAM Journal on Computing, 51(4):849–899, 2022.
[DLM24] Holger Dell, John Lapinskas, and Kitty Meeks. Nearly optimal independence oracle algorithms for edge estimation in hypergraphs. In 51st International Colloquium on Automata, Languages, and Programming (ICALP 2024), pages 54–1, 2024.

[ELRR25] Talya Eden, Reut Levi, Dana Ron, and Ronitt Rubinfeld. Approximately counting and sampling Hamiltonian motifs in sublinear time. In Proceedings of the 57th Annual ACM Symposium on Theory of Computing, pages 1043–1054, 2025.

[EMR21] Talya Eden, Saleet Mossel, and Ronitt Rubinfeld. Sampling multiple edges efficiently. In Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2021), pages 51–1, 2021.

[ENT23] Talya Eden, Shyam Narayanan, and Jakub Tětek. Sampling an edge in sublinear time exactly and optimally. In Symposium on Simplicity in Algorithms (SOSA), pages 253–260. SIAM, 2023.

[ER18] Talya Eden and Will Rosenbaum. On sampling edges almost uniformly. In 1st Symposium on Simplicity in Algorithms, 2018.

[ERR19] Talya Eden, Dana Ron, and Will Rosenbaum. The arboricity captures the complexity of sampling edges. In 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), pages 52–1, 2019.

[ERR22] Talya Eden, Dana Ron, and Will Rosenbaum. Almost optimal bounds for sublinear-time sampling of k-cliques in bounded arboricity graphs. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022), pages 56–1. Schloss Dagstuhl–Leibniz-Zentrum für Informatik, 2022.

[FGP20] H. Fichtenberger, M. Gao, and P. Peng. Sampling arbitrary subgraphs exactly uniformly in sublinear time. In The 47th International Colloquium on Automata, Languages and Programming (ICALP 2020), volume 168, page 45, 2020.

[GR08] Oded Goldreich and Dana Ron. Approximating average parameters of graphs. Random Structures & Algorithms, 32(4):473–493, 2008.
[KKR04] Tali Kaufman, Michael Krivelevich, and Dana Ron. Tight bounds for testing bipartiteness in general graphs. SIAM Journal on Computing, 33(6):1441–1483, 2004.

[TT22] Jakub Tětek and Mikkel Thorup. Edge sampling and graph parameter estimation via vertex neighborhood accesses. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 1116–1129, 2022.

[Yao77] Andrew Chi-Chih Yao. Probabilistic computations: Toward a unified measure of complexity. In 18th Annual Symposium on Foundations of Computer Science (FOCS), pages 222–227. IEEE Computer Society, 1977.
