Predicting how genetic perturbations change cellular state is a core problem for building controllable models of gene regulation. Perturbations targeting the same gene can produce different transcriptional responses depending on their genomic locus, including different transcription start sites and regulatory elements. Gene-level perturbation models collapse these distinct interventions into the same representation. We introduce STRAND, a generative model that predicts single-cell transcriptional responses by conditioning on regulatory DNA sequence. STRAND represents a perturbation by encoding the sequence at its genomic locus and uses this representation to parameterize a conditional transport process from control to perturbed cell states. Representing perturbations by sequence, rather than by a fixed set of gene identifiers, supports zero-shot inference at loci not seen during training and expands inference-time genomic coverage from ~1.5% for gene-level single-cell foundation models to ~95% of the genome. We evaluate STRAND on CRISPR perturbation datasets in K562, Jurkat, and RPE1 cells. STRAND improves discrimination scores by up to 33% in low-sample regimes, achieves the best average rank on unseen gene perturbation benchmarks, and improves transfer to novel cell lines by up to 0.14 in Pearson correlation. Ablations isolate the gains to sequence conditioning and transport, and case studies show that STRAND resolves functionally alternative transcription start sites missed by gene-level models.
Predicting how genetic perturbations change cellular state is a core problem for building controllable models of gene regulation (Roohani et al., 2023; Lotfollahi et al., 2023b; Ahlmann-Eltze et al., 2025; Adduri et al., 2025b; Park & Li, 2026; Lorch et al., 2026; Dong et al., 2026). In practice, regulatory effects are mediated by specific DNA sequences, such as promoters, enhancers, and alternative transcription start sites (TSS), rather than by genes as indivisible units (Nasser et al., 2021; Avsec et al., 2021; Linder et al., 2025; Pampari et al., 2025; Avsec et al., 2026). However, most existing single-cell perturbation models, including recent single-cell foundation models, represent perturbations at the gene level (Cui et al., 2024; Wenkel et al., 2025; Passigan et al., 2025; Zhu & Li, 2025; Dong et al., 2026), either as discrete identifiers or as static nodes in a graph. As a result, perturbations that target different genomic loci within the same gene are mapped to the same representation. This creates a resolution gap: experimental technologies such as CRISPR-i/a and base editing can intervene at precise genomic locations, while predictive models treat all interventions within a gene body as equivalent.
This resolution gap matters because perturbations sharing the same gene label can produce vastly different cellular outcomes depending on which regulatory sequence is targeted. For example, disrupting the coding region of gene BCL11A is lethal to cells, whereas targeting a specific enhancer sequence induces therapeutic effects without harming the cell (Smith et al., 2017; Frangoul et al., 2021). Even within a single enhancer, only a small subset of nucleotides actually controls gene regulation; perturbing nearby positions often has no measurable effect (Canver et al., 2015). These observations motivate modeling perturbations at sequence-level resolution. Such resolution is also essential for modeling the 98% of the genome that lies outside protein-coding regions, where most disease-associated genetic variants reside but remain inaccessible to gene-level perturbation models (Maurano et al., 2012; Nasser et al., 2021).
Modeling perturbation responses at nucleotide resolution is difficult for several reasons. (1) First, the input space is large: regulatory effects can depend on hundreds of thousands of nucleotides, and the mapping from raw DNA sequence to transcriptomic change is highly non-linear (Cheng et al., 2025; Fu et al., 2023). Recent DNA foundation models learn general regulatory grammar from sequence (Linder et al., 2025; Avsec et al., 2026; Patel et al., 2024), but they are trained to predict regulatory signals, not perturbation-induced state changes. As a result, perturbation effects must be inferred indirectly through post-hoc analyses and remain limited to local sequence windows, without modeling how perturbations propagate through cell-type-specific regulatory programs (Wei et al., 2025). (2) Second, regulatory effects are context-dependent. The impact of a sequence motif depends on chromatin state, long-range interactions, and cellular context, all of which vary across cell types (Song et al., 2025). Capturing this dependence requires linking sequence-level changes to transcriptional (RNA) responses conditioned on cellular state. (3) Third, the space of possible sequence perturbations is combinatorial, while available single-cell perturbation data is sparse across perturbations, samples, and contexts (Peidli et al., 2024; Huang et al., 2025), which makes direct supervision from DNA sequence to perturbation response infeasible.
Most existing perturbation predictors avoid modeling upstream regulatory mechanisms and instead condition on proxy descriptors of the perturbed gene, including discrete covariates (Bereket & Karaletsos, 2023; Tu et al., 2024; Gaudelet et al., 2024), protein function (Adduri et al., 2025b; Dong et al., 2026), gene regulatory structure (Wu et al., 2022; Roohani et al., 2023; Wenkel et al., 2025), or text-derived embeddings (Wu et al., 2025; Zhu & Li, 2025). These gene-conditioned designs can interpolate within a fixed set of genes, but they impose a closed perturbation vocabulary defined by gene identifiers. Generalization to unseen perturbations then relies on transferring gene-level correlations, which often yields limited gains over simple baselines (Wu et al., 2024). More importantly, gene-identifier conditioning cannot represent locus-defined perturbations (Table 1). DNA and genome language models, in turn, learn general sequence representations, but they are not trained to predict perturbation-induced transcriptomic state changes, so they do not provide a direct perturbation predictor.
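To make the closed-vocabulary limitation concrete, the toy sketch below contrasts gene-identifier conditioning with a sequence-level encoding for two perturbations that share a gene label but target different loci. Everything in it is a hypothetical placeholder chosen for illustration: the gene names (`GENE_A`, `GENE_B`), the two 8-bp sequences, and the helpers `gene_level_repr` and `one_hot`. A plain one-hot encoding stands in for a learned sequence encoder and should not be read as STRAND's architecture.

```python
# Toy illustration only: gene names, sequences, and helpers are hypothetical.
BASES = "ACGT"

# Closed, gene-level vocabulary: one index per gene seen in training.
GENE_VOCAB = {"GENE_A": 0, "GENE_B": 1}


def gene_level_repr(gene):
    """Gene-identifier conditioning: every locus in a gene shares one index.

    Raises KeyError for any gene outside the fixed vocabulary.
    """
    return GENE_VOCAB[gene]


def one_hot(seq):
    """Sequence-level conditioning: one-hot encode the DNA at the target locus.

    Bases outside A/C/G/T (e.g. N) map to an all-zero row.
    """
    return [[int(b == base) for base in BASES] for b in seq.upper()]


# Two perturbations with the same gene label but different target loci,
# e.g. two alternative transcription start sites of the same gene.
perturbations = [
    {"gene": "GENE_A", "locus_seq": "ACGTTGCA"},
    {"gene": "GENE_A", "locus_seq": "TTGACCGA"},
]

gene_reprs = [gene_level_repr(p["gene"]) for p in perturbations]
seq_reprs = [one_hot(p["locus_seq"]) for p in perturbations]

# Gene-level conditioning collapses the two interventions ...
same_gene_repr = gene_reprs[0] == gene_reprs[1]
# ... while the sequence-level encoding keeps them distinct.
same_seq_repr = seq_reprs[0] == seq_reprs[1]

print(same_gene_repr, same_seq_repr)  # prints: True False
```

In this sketch the sequence encoder accepts any locus with a known sequence, whereas `gene_level_repr` fails on any gene outside `GENE_VOCAB`, which mirrors the open versus closed perturbation vocabulary discussed above.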
Present Work. We introduce STRAND, a generative model that formulates perturbation prediction as a sequence-conditioned transport problem (Figure 1). The model takes as input an unperturbed (control) cell state and the DNA sequence at a target genomic locus, and predicts the distribution of cellular states after perturbation a