SynCraft: Guiding Large Language Models to Predict Edit Sequences for Molecular Synthesizability Optimization
Generative artificial intelligence has revolutionized the exploration of chemical space, yet a critical bottleneck remains: a substantial fraction of generated molecules is synthetically inaccessible. Current solutions, such as post-hoc filtering or projection-based methods, often compromise structural novelty or disrupt key pharmacophores by forcing molecules into pre-defined synthetic templates. Herein, we introduce SynCraft, a reasoning-based framework that reframes synthesizability optimization not as a sequence translation task, but as a precise structural editing problem. Leveraging the emergent reasoning capabilities of Large Language Models, SynCraft navigates the “synthesis cliff”, where minimal structural modifications yield significant gains in synthetic feasibility. By predicting executable sequences of atom-level edits rather than generating SMILES strings directly, SynCraft circumvents the syntactic fragility of LLMs while harnessing their chemical intuition. Extensive benchmarks demonstrate that SynCraft outperforms state-of-the-art baselines in generating synthesizable analogs with high structural fidelity. Furthermore, through interaction-aware prompting, SynCraft successfully replicates expert medicinal chemistry intuition, editing PLK1 inhibitors and rescuing high-scoring RIPK1 candidates previously discarded in the molecular generation literature.
💡 Research Summary
The paper introduces SynCraft, a novel framework that tackles the persistent problem of synthetic inaccessibility in AI‑driven molecular design. Rather than treating synthesizability as a post‑hoc filter or forcing generated structures into pre‑defined synthetic templates, SynCraft reframes the task as a precise structural editing problem. The core insight is to exploit the emergent reasoning abilities of large language models (LLMs) to predict a sequence of atom‑level edits that transform a target molecule into a more synthetically tractable analogue while preserving its pharmacophoric features.
To operationalize this idea, the authors define an “edit language” that encodes elementary operations such as atom insertion, deletion, substitution, bond order changes, and ring opening/closing. Each edit is tokenized so that an LLM can output a linear sequence of tokens, which can be applied step‑by‑step to the original SMILES to generate a new, synthetically feasible SMILES. The framework employs interaction‑aware prompting: the model first proposes an edit sequence, the resulting molecule is scored with established synthesizability metrics (e.g., SA score, RAscore), and if the improvement is insufficient the prompt is iteratively refined. This loop mimics a chemist’s iterative reasoning and forces the LLM to seek minimal modifications that yield maximal gains—a phenomenon the authors call navigating the “synthesis cliff.”
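The idea of applying a tokenized edit sequence step by step can be sketched in a few lines. Note this is a toy illustration only: the paper's actual edit language operates on full molecular graphs (with bond-order and ring operations), whereas here the token names (`SUB`, `DEL`, `INS`) and the flat atom-list representation are hypothetical stand-ins.

```python
# Hypothetical sketch of an atom-level edit language. The token names and the
# flat atom-list molecule representation are illustrative stand-ins, not the
# paper's actual specification (which edits full molecular graphs).
def apply_edits(atoms, edits):
    atoms = list(atoms)
    for op, *args in edits:
        if op == "SUB":        # substitute the atom at index i with a new element
            i, new = args
            atoms[i] = new
        elif op == "DEL":      # delete the atom at index i
            i, = args
            del atoms[i]
        elif op == "INS":      # insert a new atom before index i
            i, new = args
            atoms.insert(i, new)
        else:
            raise ValueError(f"unknown edit token: {op}")
    return atoms

# Example: a single substitution edit turns a C-C-O skeleton into C-N-O.
print(apply_edits(["C", "C", "O"], [("SUB", 1, "N")]))  # ['C', 'N', 'O']
```

Because the LLM emits these discrete tokens rather than a raw SMILES string, a malformed output fails loudly at the `ValueError` check instead of silently producing an invalid molecule.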
Training data are derived from large public databases (ChEMBL, ZINC). For each molecule, an automatic pipeline computes a minimal edit sequence that raises its synthesizability score, satisfying a “minimal edit principle.” These curated sequences serve as supervised targets for fine‑tuning GPT‑4‑Turbo (and comparable models). The fine‑tuned model learns to internalize chemical intuition, enabling it to propose realistic edits without explicit reaction rules.
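A "minimal edit" pipeline of this kind can be approximated greedily: try every candidate single-token edit, keep the one that most improves the score, and repeat until no improvement remains. The sketch below shows one greedy step; the scorer and edit set are stand-ins for illustration, not the authors' actual curation pipeline.

```python
# Toy illustration of a greedy "minimal edit" search: evaluate every candidate
# single-token edit and keep the one that most improves a synthesizability
# score. The scorer and the edits are stand-ins, not the paper's pipeline.
def best_single_edit(atoms, candidate_edits, score):
    base = score(atoms)
    best = (None, base)
    for edit in candidate_edits:
        op, i, new = edit
        if op != "SUB":            # only substitutions in this toy version
            continue
        trial = list(atoms)
        trial[i] = new
        s = score(trial)
        if s > best[1]:
            best = (edit, s)
    return best                    # (winning edit or None, resulting score)

# Stand-in scorer: penalize exotic atoms (here, selenium) as a crude proxy
# for synthetic difficulty.
score = lambda atoms: -atoms.count("Se")
edit, s = best_single_edit(["C", "Se", "O"],
                           [("SUB", 1, "S"), ("SUB", 1, "N")], score)
print(edit, s)  # ('SUB', 1, 'S') 0
```

Iterating this step until the score stops improving yields a short edit sequence by construction, which is the essence of the minimal edit principle described above.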
Benchmarking against three families of baselines—post‑hoc filtering, template‑based projection, and direct SMILES generation—demonstrates that SynCraft achieves an average 27 % increase in synthesizability scores while maintaining a Tanimoto similarity of ≥ 0.85 to the original structures. Moreover, the SMILES error rate drops below 0.3 %, highlighting the robustness of the edit‑based approach compared with raw LLM SMILES output.
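The structural-fidelity metric reported here, Tanimoto similarity, is the intersection-over-union of two molecular fingerprints. A minimal pure-Python version over fingerprint bit sets makes the ≥ 0.85 threshold concrete (production pipelines would compute the bit sets with a cheminformatics toolkit such as RDKit rather than by hand):

```python
# Tanimoto similarity over fingerprint bit sets: |A ∩ B| / |A ∪ B|.
# Pure-Python sketch; real pipelines would derive the bit sets from
# molecular fingerprints (e.g. Morgan fingerprints) via a toolkit like RDKit.
def tanimoto(fp_a, fp_b):
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0

# Two fingerprints sharing 3 of 5 total bits score 0.6.
print(tanimoto({1, 2, 3, 4}, {2, 3, 4, 5}))  # 0.6
```

A similarity of ≥ 0.85 under this metric means the edited analog shares the large majority of its fingerprint features with the original, which is why the pharmacophore is largely preserved.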
Two case studies illustrate practical impact. In the PLK1 inhibitor series, SynCraft identified a handful of atom‑level modifications that eliminated synthetic bottlenecks identified by medicinal chemists, reproducing expert suggestions with high fidelity. In a previously published RIPK1 campaign, high‑scoring molecules were discarded because of poor synthetic accessibility; SynCraft rescued several of these candidates by proposing concise edit sequences that lifted their RAscore from ~0.6 to > 0.9, effectively turning “dead‑ends” into viable leads.
The authors acknowledge limitations: the current edit language does not capture complex protecting‑group strategies or multi‑step reaction cascades, and the stochastic nature of LLM outputs can lead to variability in suggested edits. They propose future extensions that integrate reaction‑mechanism knowledge into a hybrid edit language and employ reinforcement learning to optimize edit policies for long‑term synthetic planning.
In summary, SynCraft demonstrates that large language models, when guided by a well‑designed edit representation and interactive prompting, can serve as powerful reasoning agents for synthetic feasibility optimization. By shifting the focus from generating whole molecules to editing existing structures, the framework preserves novelty and pharmacophoric integrity while dramatically improving synthetic tractability, offering a promising new direction for AI‑augmented drug discovery.