The structure of topology underpins much of the research on performance and robustness, yet available topology data are typically scarce, necessitating the generation of synthetic graphs with desired properties for testing or release. Prior diffusion-based approaches either embed conditions into the diffusion model, requiring retraining for each attribute and hindering real-time applicability, or use classifier-based guidance post-training, which does not account for topology scale and practical constraints. In this paper, we show from a discrete perspective that gradients from a pre-trained graph-level classifier can be incorporated into the discrete reverse diffusion posterior to steer generation toward specified structural properties. Based on this insight, we propose Classifier-guided Conditional Topology Generation with Persistent Homology (CoPHo), which builds a persistent homology filtration over intermediate graphs and interprets persistence features as guidance signals that steer generation toward the desired properties at each denoising step. Experiments on four generic/network datasets demonstrate that CoPHo outperforms existing methods at matching target metrics, and we further verify its transferability to molecular graph generation.
Topology lies at the core of many applications (e.g., communication networks and protocol design) [35,36]. The arrangement of nodes and links fundamentally impacts routing efficiency, latency, throughput, fault tolerance, and other performance metrics. Prior studies have shown that protocol behavior (e.g., multicast scaling laws, routing state overhead) can vary dramatically across different topological configurations [2]. As a result, network designers must carefully consider topology when optimizing network performance. However, obtaining real network topologies is often difficult. Carrier and enterprise network graphs are typically proprietary, and even research testbeds reveal only limited or anonymized structural information due to security and privacy concerns [48]. This motivates the use of synthetic graph topology generators for network planning, simulation, benchmarking, and data sharing [1,7]. Indeed, realistic synthetic topologies serve as important tools to evaluate new protocols and algorithms in lieu of real networks. Such tools allow researchers to simulate network behavior on large-scale, representative topologies and to perform "what-if" analyses for new designs. Synthetic topology generation also facilitates anonymization: one can release a plausible network graph for public use without disclosing the true network, as long as the synthetic graph preserves key structural patterns of the original [65,69]. In general, the ability to algorithmically generate graphs with specified characteristics is critical for both networking research and practical network engineering. This motivates the problem of conditional topology generation: automatically creating graphs that satisfy user-defined constraints.
Early network topology generators were largely rule-based or random [19,54,59]. These approaches have limited ability to capture complex structural patterns observed in real-world networks. In the past decade, data-driven approaches [14,46,67] have emerged to learn generative models of graphs [67]. Deep graph generative models such as variational autoencoders [56] and generative adversarial networks [8] have been applied to produce graphs that emulate training examples. More recently, diffusion-based generative models have achieved remarkable success in image and audio synthesis [32,47,51], prompting their adoption for graph data. Diffusion models generate data by iteratively denoising random noise, effectively learning the reverse of a gradual noising process. Several researchers [6, 15-18, 26, 27, 37-39, 45, 55, 60, 68] have extended this paradigm to graph data, demonstrating its promise for graph generation.
Despite these advances, significant challenges remain in controlling the topologies generated by diffusion models, which is an essential requirement for practical use in network topology design. Existing conditional diffusion-based approaches introduce constraints either during the forward noising process [42] or by applying classifier guidance in the reverse sampling phase [34]. Methods that integrate conditions into the forward stochastic differential equation train a distinct conditional score network for each target attribute, but this strategy imposes heavy computational burdens and requires extensive training data for every new specification, rendering it impractical when confronted with unseen conditions. Current classifier guidance methods [9], by contrast, apply classifier gradients at every denoising step. However, real-world network graphs are highly sensitive to even minor link changes: small edge perturbations can fracture connectivity or undermine resilience [5,43]. Moreover, practical topologies tend to be extremely sparse (edge density below 10%) [4,41], and training data are limited. Consequently, practitioners favor message-passing GNNs, such as GCN [29], GraphSAGE [21], and GIN [61], whose cost scales with the number of edges O(E), rather than fully-connected architectures [31,33,66,70] with O(N^2) complexity. In these message-passing GNNs, gradients only propagate along existing edges; as shown in Fig. 1, the sparsity of the gradient signals can cause the generated graph to collapse.
To address the limitations of existing methods by unifying classifier guidance and topological optimization under the diffusion paradigm, we propose Classifier-guided Conditional Topology Generation with Persistent Homology (CoPHo). In theory, we first establish that the gradient of a pretrained graph-property predictor can be rigorously integrated into the reverse-time denoising steps to enforce constraints without retraining the base diffusion model. Building on this foundation, CoPHo treats conditioning as an optimization problem in the space of persistent homology, leveraging the formal correspondence between classifier gradients and topological feature distances.
In practice, at each denoising step CoPHo computes the gradient of the classifier with respect to the current unconditioned graph, interprets its magnitude as a point-wise distance measure, and then constructs a persistent homology filtration [12] over the sequence of intermediate graphs. This filtration framework naturally aligns with the edge-localized gradient flow of message-passing graph neural networks by modeling an initial contraction of nonessential links followed by guided expansion to satisfy the target properties, and can be extended to fully connected architectures.
Finally, CoPHo leverages persistent homology to guide classifier-based edits in an adaptive schedule, enabling conditioning over target properties at each diffusion step. Ablation studies on diverse network datasets show that CoPHo consistently outperforms baseline conditional diffusion models in matching specified properties and maintaining robust local structure. Additional trials on molecular graph benchmarks confirm that this topology-aware guidance generalizes effectively across graph domains.
Our contributions are summarized as follows:
• (Conceptual & Methodological) Adopting a fully discrete perspective, we model each reverse-time update as a Markov chain and show that gradients from a graph-property classifier can directly drive these discrete node/edge edits to satisfy the desired properties at every diffusion step.

• (Technical) We incorporate persistent homology into the discrete diffusion process by applying a filtration on each denoised graph to extract persistence scores, then use these scores to guide node/edge edits in stages. This structured, topology-aware scheduling counteracts the sparsity-induced under-connectivity common in GNNs.

• (Empirical) Extensive evaluation on the Planar, Enzymes, Community-small, and Topology Zoo network benchmarks, as well as on molecular graphs, demonstrates superior fidelity to target degree distributions, clustering coefficients, and diameter specifications, while preserving robust local connectivity and validating cross-domain transferability.
Graph Diffusion and Training with Conditional Inputs. In conditional diffusion training, the target properties are embedded into the denoising network so that each reverse step predicts the previous graph conditioned on desired attributes [42,63]. Conditions are encoded and concatenated with graph and timestep representations, enabling end-to-end multi-property generation [24,25,62]. Each time a new property or property combination is required, the entire model must be retrained or extensively fine-tuned.
Classifier-based Conditioning in Graph Diffusion. Existing methods decouple diffusion training from conditioning by first learning an unconditional graph diffusion model and then deriving gradient-based corrections from an external property predictor [3,60]. Extensions like MOOD further incorporate out-of-distribution guidance signals during reverse sampling [34]. However, by treating these gradient signals as generic continuous perturbations, they overlook (i) the inherent message-passing constraints of GNN architectures, where gradients flow only along existing edges, and (ii) the discrete, combinatorial nature of graph topology.
Persistent Homology in GNNs. A few recent works have explored persistent homology (PH) to preserve topological features in GNN models [23]. PH-induced graph ensembles use Vietoris-Rips filtrations on time-series sensor networks to construct multiple graph views, but focus on representation rather than conditional generation [44]. Besides, PH summaries have been used in GNN pooling layers to retain homological invariants in classification tasks, yet these methods do not address graph synthesis [64]. To our knowledge, CoPHo is the first to integrate PH into the denoising and conditioning loop with both theoretical justification and scalable implementation.
Diffusion Models. Existing diffusion models fall into two main architectures: the Denoising Diffusion Probabilistic Model (DDPM) and the Score-based Stochastic Differential Equation (SDE) framework. In the DDPM framework, the forward noising process is defined as a discrete-time Markov chain

q(x_{t+1} | x_t) = N(x_{t+1}; √(1 − β_{t+1}) x_t, β_{t+1} I),

and the learned reverse denoising model takes the form

p_θ(x_t | x_{t+1}) = N(x_t; μ_θ(x_{t+1}, t+1), Σ_θ(x_{t+1}, t+1)),

which can be trained by predicting the noise ε_θ(x_{t+1}, t) [22,57]. In contrast, the continuous-time SDE formulation [58] perturbs data through

dx = f(x, t) dt + g(t) dw,

and the reverse-time dynamics depend on the score ∇_x log p_t(x), which is learned via a GNN to approximate the true posterior [49,60].
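To make the two formulations concrete, the following minimal NumPy sketch implements one forward and one reverse DDPM step under the paper's x_{t+1} → x_t indexing; the noise predictor eps_theta is a placeholder for a trained network, and the linear beta schedule is an assumption borrowed from the DDPM literature [22], not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear schedule as in [22]
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_theta(x, t):
    """Placeholder noise predictor; a trained network would be called here."""
    return np.zeros_like(x)

def forward_step(x_t, t):
    """Sample x_{t+1} ~ q(x_{t+1} | x_t) = N(sqrt(1 - beta_{t+1}) x_t, beta_{t+1} I)."""
    b = betas[t + 1]
    return np.sqrt(1.0 - b) * x_t + np.sqrt(b) * rng.standard_normal(x_t.shape)

def reverse_step(x_next, t):
    """Sample x_t ~ p_theta(x_t | x_{t+1}) via the noise-prediction parameterization."""
    b, ab = betas[t + 1], alpha_bars[t + 1]
    mean = (x_next - b / np.sqrt(1.0 - ab) * eps_theta(x_next, t + 1)) / np.sqrt(1.0 - b)
    return mean + np.sqrt(b) * rng.standard_normal(x_next.shape)

x = rng.standard_normal(8)   # toy 8-dimensional sample
x = forward_step(x, 10)      # one noising step
x = reverse_step(x, 10)      # one denoising step
```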
Persistent Homology (PH). Given a weighted graph G = (V, E, w), define a decreasing filtration {F_α} by

F_α = (V, { e ∈ E : w(e) ≤ α }),

so that as α decreases, edges are monotonically removed. Persistent homology tracks the birth b_i and death d_i of homological features, producing a persistence diagram {(b_i, d_i)} whose lifetimes d_i − b_i measure feature significance [11,71]. These diagrams yield differentiable topological summaries for gradient-based graph optimization.
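For intuition, the sketch below computes 0-dimensional persistence (connected-component births and deaths) for an edge-weight filtration with a standard union-find pass. It is illustrative only; a production implementation would likely use a dedicated PH library such as GUDHI or giotto-tda (an assumption, as the paper does not name its PH backend).

```python
import networkx as nx

def persistence_0d(G, weight="weight"):
    """Track connected-component births/deaths as edges enter in
    increasing weight order (0-dim persistence of the edge filtration)."""
    parent = {v: v for v in G}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v

    pairs = []   # (birth, death) of merged components
    # Every vertex component is born at weight 0 in this toy convention.
    for u, v, w in sorted(G.edges(data=weight), key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            pairs.append((0.0, w))   # one component dies when two merge
    return pairs

G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 0.2), (1, 2, 0.5), (0, 2, 0.9), (2, 3, 1.0)])
print(persistence_0d(G))   # lifetimes (death - birth) measure significance
```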
Motivation for Integrating Persistent Homology. First, message-passing GNNs inherently propagate gradients only along existing edges, leading to monotonic edge removals during classifier-guided updates; this behavior aligns intuitively with the decreasing filtration in persistent homology, and the associated edge-weight ordering naturally extends to fully-connected or multi-type GNN architectures. Second, direct gradient injections on sparse graphs can cause structural collapse (see Fig. 1), as even small continuous perturbations may sever critical connections. By filtering edges through a homology-based threshold, CoPHo enforces a controlled, multi-scale simplification that preserves global connectivity while steering toward desired properties.
Classifier-Based Conditioning. Following the derivation of Dhariwal and Nichol [9], let q(· | y) denote the conditional denoising kernel and p(·) the unconditional one. In the ideal conditional diffusion model, at each reverse step one samples

x_t ∼ q(x_t | x_{t+1}, y).

This sampling procedure can be shown to satisfy

q(x_t | x_{t+1}, y) = q(y | x_t) p(x_t | x_{t+1}) / q(y | x_{t+1}).

Here q(y | x_{t+1}) is independent of x_t and can be treated as a constant. The term p(x_t | x_{t+1}) is approximated by a pretrained denoising network p_θ(x_t | x_{t+1}), while q(y | x_t) can be estimated by an auxiliary classifier Φ.
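A minimal sketch of this reweighting in the discrete setting, assuming placeholder callables p_theta (the pretrained denoising network) and phi (the auxiliary classifier) and a finite candidate set:

```python
import numpy as np

def guided_step(candidates, x_next, y, p_theta, phi, rng):
    """Sample x_t ~ q(x_t | x_{t+1}, y), proportional to phi(y, x_t) * p_theta(x_t, x_next)."""
    probs = np.array([p_theta(x, x_next) * phi(y, x) for x in candidates])
    probs = probs / probs.sum()   # the constant q(y | x_{t+1}) cancels in normalization
    return candidates[rng.choice(len(probs), p=probs)]
```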
DiGress [60] extends this framework to graphs by treating G as a continuous variable. However, for discrete graph properties y, such as the length of the unique shortest path y = |{v_1, . . . , v_k}| between two nodes, G is not differentiable. If one removes an edge e_δ on this path, then

y(G \ {e_δ}) − y(G) ∈ {1, 2, . . .} ∪ {+∞},

i.e., the property jumps by a discrete (possibly unbounded) amount rather than changing infinitesimally, so ∂y/∂e_δ is undefined.
Consequently, injecting classifier gradients directly into the unconditional denoising process yields only suboptimal conditioning.
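A concrete NetworkX example of this discrete jump: deleting a single edge changes the shortest-path length by an integer amount, or disconnects the pair entirely, so no informative derivative exists at that edge.

```python
import networkx as nx

G = nx.cycle_graph(6)                      # 6-node ring
print(nx.shortest_path_length(G, 0, 3))    # 3
G.remove_edge(0, 1)
print(nx.shortest_path_length(G, 0, 3))    # still 3, via the other direction
G.remove_edge(0, 5)
# Node 0 is now isolated from node 3: the property jumps to "infinity".
print(nx.has_path(G, 0, 3))                # False
```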
The key symbols used by CoPHo are listed in Table 1.
We reframe conditional graph generation without relying on continuity. As shown in Figure 2, let G_{t+1} be the noisy graph at step t + 1, G_t its unconditional denoising sample, and Ĝ_t the desired conditional sample. We define each sampling step of the conditional diffusion as a two-stage Markov chain:

G_{t+1} → G_t → Ĝ_t, with kernels p(G_t | G_{t+1}) and q(Ĝ_t | G_t).
From Equation 7, we model the sampling process from G_{t+1} to Ĝ_t. By the law of total probability, the chain rule, and the Markov property (derived in Appendix Section B.1), we obtain:

q(Ĝ_t | G_{t+1}) = Σ_{G_t} q(Ĝ_t | G_t) p(G_t | G_{t+1}).
The key term is q(Ĝ_t | G_t), which we decompose via Bayes' rule by introducing the condition y:

q(Ĝ_t | G_t) = q(Ĝ_t | G_t, y) ∝ q(y | Ĝ_t) p(Ĝ_t | G_t).

Since this kernel is intractable to sample directly, we draw candidates from a proposal distribution P(Ĝ_t | G_t) and reweight them with the importance weight w(Ĝ_t) = q(Ĝ_t | G_t, y) / P(Ĝ_t | G_t). We decompose w into two interpretable factors: 1) a property factor q(y | Ĝ_t), estimated by the auxiliary classifier Φ, which measures how well a candidate satisfies the target; and 2) a proximity factor p(Ĝ_t | G_t), which favors candidates that minimally modify the unconditional sample G_t (see Appendix B).
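A minimal sketch of this decomposition as self-normalized importance sampling, with hypothetical helpers proposal, phi, and edit_distance standing in for the proposal distribution P, the classifier Φ, and the graph edit distance:

```python
import numpy as np

def conditional_refine(G_t, y, proposal, phi, edit_distance, n_cand, rng):
    """Approximate sampling from q(Ĝ_t | G_t, y) by self-normalized importance sampling."""
    cands = [proposal(G_t) for _ in range(n_cand)]
    # Property factor (classifier score) times proximity factor (minimal edits).
    w = np.array([phi(y, g) * np.exp(-edit_distance(G_t, g)) for g in cands])
    w = w / w.sum()
    return cands[rng.choice(n_cand, p=w)]
```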
Building on the theoretical foundation established in Section 3.2, we now address the practical challenges of sparse graph data. In this section, we leverage message-passing GNNs to encode graph edits as a monotonic persistent homology filtration: at each diffusion step, we apply a homology-driven removal of features in descending order of persistence, ensuring that the evolving graph both remains realistic and increasingly satisfies y. Note that, as discussed in Section 5, our framework extends to both fully-connected GNNs (e.g., graph transformers) and to multi-class node/edge-type GNNs. In our QM9 experiments, we employ a fully-connected Graph Transformer backbone and model atoms and bonds as multi-class features. Table 8 demonstrates CoPHo's transferability to these settings.
Integration into the Persistent Homology Framework. Given a weighted graph G = (V, E, w), we first define the usual decreasing filtration

F_α = (V, { e ∈ E : w(e) ≤ α }),

so that edges are monotonically removed as α decreases. To incorporate our gradient-guided conditioning, we modify this filtration at each denoising step t as follows: we replace the edge weights with classifier-gradient scores

g_t(e) = ∂L(Φ(G_t), y) / ∂E_t[e],   g_t(v) = ∂L(Φ(G_t), y) / ∂V_t[v],

then sort nodes and edges by descending g_t(v) and g_t(e). We select the top-k nodes and edges (those with the most positive gradients, with the count determined by N_homo) to form the initial subcomplex S^0_t ⊆ F_{α_max} = G_t.
- Monotonic simplification. Starting from S^0_t, we introduce a fixed number of homology steps N_homo. At step i ∈ {1, . . . , N_homo}, we choose a threshold α_i halfway between the (k_i)-th and (k_i + 1)-th largest g_t values, and define (note that Ĝ^i_t = G_t ∨ S^i_t):

S^i_t = { σ ∈ G_t : g_t(σ) ≥ α_i },   w(Ĝ^i_t) ∝ 1[ f(Ĝ^i_t) ≥ τ ] · q(y | Ĝ^i_t) p(Ĝ^i_t | G_t) / P(Ĝ^i_t | G_t).

Here, 1[·] denotes the indicator function, which immediately rejects any sample whose properties fall below τ, thereby preventing poor-quality samples.
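The following sketch illustrates the threshold schedule and the resulting nested subcomplexes under the interpretation above (that α_i lies halfway between the k_i-th and (k_i + 1)-th largest scores); the helper names are hypothetical, not the paper's API.

```python
import numpy as np

def homology_thresholds(scores, ks):
    """alpha_i halfway between the (k_i)-th and (k_i + 1)-th largest scores."""
    s = np.sort(np.asarray(scores))[::-1]            # descending order
    return [(s[k - 1] + s[k]) / 2.0 for k in ks]     # requires k < len(scores)

def subcomplexes(elements, scores, alphas):
    """S^i = elements whose gradient score meets or exceeds alpha_i."""
    return [[e for e, g in zip(elements, scores) if g >= a] for a in alphas]

scores = [0.9, 0.7, 0.5, 0.2, 0.1]                   # toy edge scores g_t(e)
alphas = homology_thresholds(scores, ks=[1, 2, 3])   # e.g., cut sizes 1, 2, 3
print(subcomplexes(list("abcde"), scores, alphas))   # nested removal sets
```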
In this way, CoPHo embeds persistent homology filtration directly into the denoising process: nodes/edges are pruned in a conditioned, monotonic sequence guided by classifier gradients, and each intermediate complex is evaluated against the desired properties. This yields an end-to-end conditioning mechanism that (i) minimally perturbs the original reverse kernel, (ii) provides interpretable topological updates via the filtration {F ๐ผ }, and (iii) drives the generated graph toward matching both global and fine-grained constraints.
Datasets. We evaluate on four graph benchmarks with varying structural complexity and an additional molecular benchmark for transfer testing. Community-small and Enzymes evaluate global property conditioning (specifically density, clustering, and assortativity) because they encompass intricate global structures, such as modular community organization and functional motifs, which pose substantial challenges for conditional generation methods. Planar and Topology-Zoo test fine-grained shortest-path conditioning, and QM9 demonstrates cross-domain generalizability.

Implementation Details. For continuous and discrete target properties, we train separate regressors and classifiers, respectively. On Community-small, Enzymes, Planar, and Topology-Zoo we use a six-layer message-passing GNN for property evaluation, while on QM9 we employ a Graph Transformer variant. The diffusion models build on the unconditional DiGress backbone [60].
Baselines. We compare to four state-of-the-art conditional diffusion frameworks. GDSS is a continuous-time score-based model that diffuses node and edge features via SDEs; we employ DiGress's classifier guidance for property conditioning [27]. DiGress implements discrete graph diffusion and steers sampling with gradients from a pretrained graph-level regressor [60]. MOOD extends score-based diffusion with an out-of-distribution guidance mechanism, biasing reverse-time sampling toward high-property regions [34]. Diffusion Twigs introduces parallel "trunk" and "stem" diffusion flows for properties, coordinated via loop guidance to enrich conditional flexibility [42].
Evaluation Metrics. All experiments follow the protocol of Mercatali et al. [42]. For each target we perform 5 independent runs and report the mean and standard deviation. Metrics are divided into two groups. First, generation quality metrics (Deg, Clus, Orb) quantify the statistical difference between generated and real graphs using Maximum Mean Discrepancy (MMD) [58]. Second, conditioning quality metrics assess how closely each generated sample matches its prescribed properties via mean squared error (MSE), as in Mercatali et al. [42].
Properties Extraction. We extract four global properties with NetworkX [20]. Density is defined as Density = 2|E| / (|V|(|V| − 1)). The clustering coefficient averages c_v = 2T(v) / (d_v(d_v − 1)) over all nodes, where T(v) counts triangles incident to v and d_v is its degree. Assortativity is the Pearson correlation of degrees at the ends of each edge. Transitivity is defined as Transitivity = 3 × (number of triangles) / (number of connected triples). All settings align with Mercatali et al. [42] and Song et al. [58].
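All four quantities correspond to standard NetworkX calls, so the extraction step reduces to the following (the karate-club graph is used purely as a stand-in input):

```python
import networkx as nx

def global_properties(G):
    return {
        "density": nx.density(G),                               # 2|E| / (|V|(|V|-1))
        "clustering": nx.average_clustering(G),                 # mean of 2T(v)/(d_v(d_v-1))
        "assortativity": nx.degree_assortativity_coefficient(G),
        "transitivity": nx.transitivity(G),                     # 3*triangles / triples
    }

print(global_properties(nx.karate_club_graph()))
```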
Single Property Conditioning. Table 2 reports the mean absolute error (MAE) of three runs for Community-Small and Enzymes. The best errors are bolded and the second-best shaded. CoPHo achieves the lowest MAE on clustering, assortativity, and transitivity, demonstrating superior conditional control over complex global structures.
Density conditioning leverages the fact that, for fixed |V|, density decreases monotonically as edges are removed. Simple edge-removal strategies can thus achieve a target density, but they may not fully exercise the capabilities of conditional generation methods. We therefore treat density as a complementary constraint that informs but does not dominate our evaluation.
Multi-properties Conditioning. We further evaluate CoPHo for simultaneous conditioning of multiple properties. Table 3 shows that CoPHo achieves the lowest MAE across all tested combinations. To illustrate CoPHo's per-sample behavior under triplet conditioning, Figure 3 presents generated examples. Samples #1, #2 and #5 demonstrate that the remaining target properties are still matched even when edge density differs from the ground truth.

Properties Extraction. We target shortest paths as our fine-grained properties because controlling them requires accounting for the entire subgraph induced by all paths between each source and target pair. We compute all-pairs shortest paths using Dijkstra's algorithm [10] and train a simple multi-layer GNN to predict these path lengths.
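A minimal sketch of the target extraction, using NetworkX's Dijkstra routine on an arbitrary random graph (the GNN that regresses these lengths is assumed and not shown):

```python
import networkx as nx

G = nx.gnm_random_graph(20, 40, seed=0)        # stand-in topology
nx.set_edge_attributes(G, 1, "weight")         # unit weights, i.e., hop counts
lengths = dict(nx.all_pairs_dijkstra_path_length(G, weight="weight"))
print(lengths[0].get(5, float("inf")))         # shortest-path length, nodes 0 -> 5
```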
Motivation for Conditioning. In practice one may wish to release a usable topology while preserving privacy of the true structure. A generated graph should make the shortest paths among critical node pairs closely match those of the original graph so that routing remains accurate. A naive approach that extracts the subgraph on key nodes and adds noise (referred to as Node Subgraph Perturbation, NSP in Table 4, which overlaps with the original graph on key topological structures) still risks privacy breaches [28].
Experimental Results. We begin by conditioning on a single shortest path and then increase the number of conditioned paths to 5 and 50. When all pairs are conditioned, the generated graph becomes identical to the original. Table 4 reports the mean absolute error (MAE), Kullback-Leibler divergence, and overlapping rate (OL) of the conditioned subgraph across three runs. A full discussion of generation quality effects appears in Section A.2.
We introduce two key components in CoPHo: persistent homology and the gradient-based proposal distribution. Our ablations cover the use of persistent homology, the number of homology steps N_homo, the timing of homology introduction, and the choice of proposal distribution. In this section, we report results for three factors: N_homo, the timing of homology introduction, and different proposal distributions. Additional ablation results appear in Appendix Section A.1.
N_homo and the timing of homology introduction. In our persistent-homology design, we introduce two key hyperparameters: the maximum number of homology steps N_homo and the timing of homology introduction ph_timing. The corresponding ablations are reported in Table 5 and Table 6. For hyperparameter selection, we balance compute and accuracy: in some cases (e.g., comparing N_homo = 5 vs. 10 and ph_timing = 0.4 vs. 0.6), additional computation yields only marginal gains, so we adopt the lower-cost settings, N_homo = 5 (fewer homology steps) and ph_timing = 0.6 (later PH introduction).

Proposal Distribution. To assess the effectiveness of gradient-based proposals, we evaluate five proposal distributions on Enzymes: rand, which uses a random vector of the same shape as the true gradient; EBC, which ranks edges by betweenness centrality, and neg EBC, its inverse ranking; Loop gradient, which uses the density predictor's gradient to condition clustering, the clustering predictor's gradient to condition assortativity, and so on; and finally gradient, the persistent-homology-derived guidance at the heart of CoPHo. Table 7 reports MAE over four global properties; the gradient-based distribution achieves the lowest error on clustering, assortativity, and transitivity, confirming its superior conditioning effectiveness.
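For reference, two of the non-gradient proposals above can be sketched directly with NetworkX and NumPy; the gradient proposal itself requires the trained classifier and is not shown.

```python
import networkx as nx
import numpy as np

G = nx.gnm_random_graph(15, 30, seed=1)            # stand-in graph
ebc = nx.edge_betweenness_centrality(G)            # EBC proposal: score each edge
ranked = sorted(ebc, key=ebc.get, reverse=True)    # neg EBC would use reverse=False
rand_scores = np.random.default_rng(1).random(len(G.edges))   # rand proposal
```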
QM9 Generation. We sample 100 molecules from the QM9 test set and retrieve their dipole moment μ and highest occupied molecular orbital (HOMO) energy. We then apply CoPHo to condition on μ, HOMO, and both simultaneously. To explore hybrid strategies, we combine DiGress and CoPHo in three ways: DiGress+CoPHo uses DiGress guidance for the first 50% of diffusion steps and CoPHo for the remaining steps; CoPHo+DiGress applies CoPHo first and DiGress second; CoPHo*DiGress applies both controllers at every step. Table 8 reports the MAE for each target and the average validity rate.

Training Time. We compare the training time required to support n properties when using DiGress as the diffusion backbone. Let T_d denote the time to train the diffusion model and T_c the time to train a classifier or regressor. Classifier-guided methods thus require T_d + nT_c, whereas other methods require nT_d. Table 9 reports T_d and T_c for baselines on Enzymes and Community-Small.

Cross-Paradigm Transfer to an SDE-Based Diffusion Model. In Tables 2 and 3, CoPHo is evaluated on the DDPM-based DiGress backbone: persistent homology with N_homo = 5 is applied at every reverse diffusion step. By contrast, Tables 11 and 10 report CoPHo's performance when transferred to the SDE-based GDSS model, where we inject persistent homology with N_homo = 1 only once every ten steps (i.e., 2% of the PH interventions used with DiGress). Despite this minimal intervention, GDSS+CoPHo significantly outperforms the DDPM-based variants. This improvement arises because GDSS perturbs node and edge features with continuous Gaussian noise and explicitly constructs a vector field governing distributional transport; within this SDE framework, even sparse classifier-gradient drift terms can effectively steer samples into the desired region. In contrast, DiGress's DDPM formulation models discrete probability transitions over nodes and edges without an explicit vector field, which limits the influence of gradient-based corrections. These results underscore CoPHo's strong potential to enhance conditional control across diverse diffusion paradigms.

For multi-class node/edge-type GNNs, we rank all (i, j) pairs by m(i, j) (defined below) and apply the corresponding positive or negative update to the graph topology or node labels.
We implemented this scheme on QM9 (treating atom and bond types as categories) and achieved superior performance over prior classifier-based conditional generation models, validating the efficacy of gradient-difference conditioning.
For fully connected GNN architectures, where every node pair (i, j) is assumed to be connected, CoPHo adapts by interpreting the classifier gradient tensor ∇G_t[i, j] (shape N × N) as both a tendency and an action signal. At each denoising step we compute

m(i, j) = | ∇G_t[i, j] |,   s(i, j) = sign( ∇G_t[i, j] ).
Here m(i, j) ranks edges by the magnitude of their influence on the target property, and s(i, j) ∈ {+1, −1} indicates whether to remove (+1) or add (−1) the edge. We then:

(1) Sort all pairs (i, j) in descending order of m(i, j).
(2) For the top-k pairs, apply the update E_{ij} ← E_{ij} − s(i, j), clipping to {0, 1} to maintain binary adjacency (similar to gradient descent).
(3) Proceed with the next diffusion reverse step on the updated graph.
This procedure leverages the dense initial connectivity to flexibly sculpt the graph toward the desired property, without any retraining of the underlying backbone.
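A minimal NumPy sketch of this dense update, assuming a binary (0/1, float) adjacency matrix and a classifier gradient of the same shape:

```python
import numpy as np

def dense_update(E, grad, top_k):
    """One dense gradient-guided edit: E is a binary N x N adjacency,
    grad is the classifier gradient with respect to E."""
    m = np.abs(grad)                       # tendency m(i, j)
    s = np.sign(grad)                      # action s(i, j) in {+1, -1}
    iu = np.triu_indices_from(E, k=1)      # undirected graph: upper triangle only
    order = np.argsort(m[iu])[::-1][:top_k]
    rows, cols = iu[0][order], iu[1][order]
    E[rows, cols] = np.clip(E[rows, cols] - s[rows, cols], 0, 1)
    E[cols, rows] = E[rows, cols]          # keep the adjacency symmetric
    return E
```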
We have introduced CoPHo, a novel framework for conditional graph diffusion that combines classifier-gradient guidance with persistent homology filtrations. By proving that classifier gradients implement the correct density-ratio reweighting and embedding them into a decreasing filtration of the graph, CoPHo achieves precise conditioning over both global and fine-grained properties without retraining the diffusion backbone. Extensive experiments on synthetic and real-world graph benchmarks, as well as transfer to molecular generation on QM9, demonstrate that CoPHo substantially improves conditional accuracy while preserving or even enhancing sample quality and maintaining competitive efficiency.
Limitations. Scalability with Shortest-Path Constraints. While CoPHo scales gracefully to many graph sizes and conditions, its reliance on multiple shortest-path constraints can become burdensome as the number of conditioned pairs grows. Conditioning on an increasing fraction of node-pairs requires constructing and filtering ever-larger subgraphs, and may demand more sophisticated combinatorial or graph-search algorithms to maintain fidelity. In practice, enforcing many shortest-path conditions in a single run leads to a trade-off: stronger conditioning accuracy but significantly slower inference. Future work must explore adaptive strategies to prune or group constraints intelligently, and to accelerate the persistent homology steps, in order to sustain efficiency in scenarios with extensive fine-grained requirements.
Fixed Homology Steps and PH Introduction Timing. In the current design, the number of homology steps ๐ homo and the PH introduction timing are selected a priori and not adapted during sampling. This scheduling may miss opportunities for more efficient or accurate conditioning. Future work could explore reinforcement learning or other adaptive strategies to jointly decide ๐ homo and PH timing on the fly, balancing inference speed and conditioning fidelity.
Persistent Homology. Conditional control without persistent homology is practically equivalent to DiGress [60]. We evaluate the impact of adding persistent homology on both global and fine-grained conditional control tasks using Community-Small and Planar. As shown in Table 12, introducing persistent homology yields substantial improvements in mean absolute error for the density, clustering, assortativity, transitivity, and shortest-path conditions across all tasks.
We also evaluate generation quality under global property conditioning and report average performance across all conditional targets. Table 13 presents the mean values of the three metrics described in Sec. 4.1.

We visualize the parameterized posterior distribution of the unconditional discrete graph diffusion model, alongside the gradients on the adjacency-matrix edges with respect to the shortest-path property as predicted by a GNN. Using the DiGress framework on the Planar dataset, we render these in Figure 5. At every step, the posterior concentrates sharply, while the edge gradients exhibit a very peaky distribution (ranging from −0.2 × 10^7 to 1 × 10^8, with a single outlier near 10^8). Consequently, when we normalize these gradients and inject them into the posterior, it becomes impossible to choose a guidance strength that preserves the (approximate) continuity assumption on graph data. After normalization, nearly all guidance scores collapse to zero, causing the sampled graphs to become progressively sparser at each diffusion step. For example, in Figure 5-b, only one edge retains a gradient on the order of 10^8, and the rest are effectively zero, producing the degenerate sample shown in Figure 5-c.

- Law of Total Probability. Marginalizing over the intermediate graph G_t, applying this to our chain gives

q(Ĝ_t | G_{t+1}) = Σ_{G_t} q(Ĝ_t, G_t | G_{t+1}).
- Chain Rule (Product Rule). The joint conditional can be factorized via the chain rule:

q(Ĝ_t, G_t | G_{t+1}) = q(Ĝ_t | G_t, G_{t+1}) p(G_t | G_{t+1}),

and by the Markov property, q(Ĝ_t | G_t, G_{t+1}) = q(Ĝ_t | G_t).
We now show that, for any test function f, the weighted estimator is unbiased, as defined in Eq. A1:

E_{Ĝ_t ∼ P(·|G_t)}[ w(Ĝ_t) f(Ĝ_t) ] = Σ_{Ĝ_t} P(Ĝ_t | G_t) · ( q(Ĝ_t | G_t, y) / P(Ĝ_t | G_t) ) · f(Ĝ_t) = E_{Ĝ_t ∼ q(·|G_t, y)}[ f(Ĝ_t) ].

In particular, setting f(Ĝ_t) ≡ 1 gives

E_{Ĝ_t ∼ P(·|G_t)}[ w(Ĝ_t) ] = 1,

showing that w(Ĝ_t) is indeed an unbiased importance weight.
To approximate sampling from the intractable conditional kernel q(Ĝ_t | G_t, y), we therefore draw candidate graphs from the proposal P(Ĝ_t | G_t) and resample them in proportion to their importance weights w(Ĝ_t).

B.4 Discussion on the Conditional Markov Chain

1. Memorylessness. The reverse diffusion process is naturally formulated as a Markov chain because each state transition depends only on the immediately preceding graph, not on the full history. This "memoryless" property ensures that the distribution of G_t conditioned on G_{t+1} is independent of all earlier states G_{t+2}, G_{t+3}, . . . .
2. Two-stage decomposition. Each conditional sampling step is defined as a two-stage chain: first, an unconditional denoising from G_{t+1} to G_t via the learned kernel p(G_t | G_{t+1}); second, a conditional refinement to Ĝ_t that incorporates the target y. The composition of these two stages still depends only on G_{t+1}, so the process can be regarded as a Markov chain.
Table 1: Key symbols used by CoPHo.

G_t: Graph at step t, with nodes V_t and edges E_t.
Ĝ_t: Conditioned graph at denoising step t.
p(·): Unconditional denoising kernel.
q(·): Conditional denoising kernel.
y: Desired target property (e.g., clustering coefficient).
Φ: Auxiliary classifier/regressor for predicting y.
N_homo: Number of PH steps per diffusion update.
Ĝ^i_t: Candidate graph at diffusion step t and PH step i.
f(Ĝ^i_t): True property value of candidate graph Ĝ^i_t.
w(Ĝ^i_t): Importance weight for candidate graph Ĝ^i_t.
P(Ĝ_t | G_t): Proposal distribution for importance sampling.
g_t(e), g_t(v): Edge/node gradient scores w.r.t. the classifier (Φ) loss.
F_α: Decreasing filtration at threshold α.
α_i: Threshold at homology step i.
The proposal P(Ĝ^i_t | G_t) is inversely proportional to the edit distance ‖G_t − Ĝ^i_t‖. In the model, this means that while Ĝ^i_t must satisfy the target property y, we seek to minimally modify the unconditional diffusion sample G_t.
Fine-grained conditioning of shortest paths, averaged over Topology-Zoo and Planar.
Model   | Community-Small           | Enzymes
        | Deg.   Clus.   Orb.       | Spec.   Clus.   Orb.
DiGress | 0.02   11.7    1.16       | 1.87    3.10    1.58
Ours    | 0.02   10.3    1.17       | 2.59    3.32    2.07