ELISA: An Interpretable Hybrid Generative AI Agent for Expression-Grounded Discovery in Single-Cell Genomics

E L I S A : A N I N T E R P R E T A B L E H Y B R I D G E N E R A T I V E A I A G E N T F O R E X P R E S S I O N - G RO U N D E D D I S C OV E RY I N S I N G L E - C E L L G E N O M I C S A P R E P R I N T Omar Coser ∗ No Affiliation, omarcoser10@gmail.com March 18, 2026 A B S T R AC T T ranslating single-cell RN A sequencing (scRN A-seq) data into mechanistic biological h ypotheses remains a critical bottleneck, as agentic AI systems lack direct access to transcriptomic representations while expression foundation models remain opaque to natural language. Here we introduce ELISA (Embedding-Linked Interacti ve Single-cell Agent), an interpretable framework that uniﬁes scGPT expression embeddings with BioBER T -based semantic retrie val and LLM-mediated interpretation for interactive single-cell discov ery . An automatic query classiﬁer routes inputs to gene marker scoring, semantic matching, or reciprocal rank fusion pipelines depending on whether the query is a gene signature, natural language concept, or mixture of both. Integrated analytical modules perform pathway activity scoringacross 60+ gene sets, ligand–receptor interaction prediction using 280+ curated pairs, condition-aw are comparati ve analysis, and cell-type proportion estimation all operating directly on embedded data without access to the original count matrix. Benchmarked across six di verse scRN A-seq datasets spanning inﬂammatory lung disease, pediatric and adult cancers, or ganoid models, healthy tissue, and neurode velopment, ELISA signiﬁcantly outperforms CellWhisperer in cell type retrie val (combined permutation test, p < 0 . 001 ), with particularly lar ge gains on gene-signature queries (Cohen’ s d = 5 . 98 for MRR). ELISA replicates published biological ﬁndings (mean compos- ite score 0.90) with near-perfect pathway alignment and theme cov erage (0.98 each), and generates candidate hypotheses through grounded LLM reasoning, bridging the gap between transcriptomic data exploration and biological discovery . Code av ailable at: https://github .com/omaruno/ELISA-An- AI-Agent-for-Expression-Grounded-Disco very-in-Single-Cell-Genomics.git (If you use ELISA in your research, please cite this work). Keyw ords AI Agents, Single Cell Genomics, AI Discov ery 1 Introduction Single-cell RN A sequencing (scRNA-seq) has transformed our understanding of cellular heterogeneity by enabling genome-wide transcriptional proﬁling at single-cell resolution T ang et al. [2009]. Standardized analytical pipelines support quality control, normalization, clustering, dif ferential expression, and trajectory inference Lueck en and Theis [2019], catalyzing the construction of comprehensi ve cell atlases across tissues, developmental stages, and disease contexts. Ho wever a critical bottleneck persists: translating statistical outputs of differentially expressed gene lists, enriched pathways, and predicted ligand receptor interactions into mechanistic biological h ypotheses remains labor- intensiv e, context-dependent, and dif ﬁcult to scale or reproduce. Large-language models (LLMs) offer a potential solution to this problem. LLMs encode substantial biomedical knowledge and perform competitiv ely on clinical reasoning benchmarks Singhal et al. [2023], whereas retriev al- augmented generation (RA G) improves factual accurac y by grounding outputs in external kno wledge at inference ∗ Correspondence author: Omar Coser . This manuscript has been submitted for peer review . arXiv T emplate A P R E P R I N T T able 1: Comparison of existing AI systems for biomedical and single-cell analysis. Expr . Emb . : uses expression- deriv ed embeddings from foundation models; Sem. Ret. : semantic retriev al ov er biological annotations; L–R / Pathway : ligand–receptor interaction and pathw ay scoring from data; Cond. Comp. : condition-aware comparati ve analysis; Interp. Report : automated interpretiv e report generation with LLM. System Expr . Sem. L–R / Cond. Interp. Primary Emb . Ret. Pathway Comp. Report Scope AI Co-Scientist Gottweis et al. [2025] – – – – ✓ Hypothesis generation Biomni Huang et al. [2025] – ✓ – – – General biomedical GeneAgent W ang et al. [2025] – ✓ – – – Gene-set analysis V irtual Lab Swanson et al. [2025] – – – – ✓ Multi-agent discov ery CellAgent Xiao et al. [2024] – – – – – scRN A-seq pipelines AutoB A Zhou et al. [2023] – – – – – Pipeline generation BRAD Pickard et al. [2025] – ✓ – – – Biomarker ID GeneGPT Jin et al. [2025] – ✓ – – – Database querying CRISPR-GPT Qu et al. [2025] – – – – – Experiment design scGPT Cui et al. [2024] ✓ – – – – Cell embeddings CellWhisperer Schaefer et al. [2025] ✓ ✓ – – – Multimodal embedding ELISA (ours) ✓ ✓ ✓ ✓ ✓ Interactive sc disco very time Le wis et al. [2020]. These capabilities ha ve motiv ated agentic AI architectures that are capable of autonomous planning, tool usage, and iterativ e reasoning within closed-loop workﬂows. Recent agentic systems span a broad range of biomedical applications (T able 1). T owards an AI Co-Scientist Gottweis et al. [2025] introduces multi-agent hypothesis generation through structured debate and ev olutionary reﬁnement, though it operates over textual knowledge without interfacing with experimental data. Biomni Huang et al. [2025] constructs a uniﬁed action space from biomedical tools and databases, enabling dynamic task orchestration including gene prioritization. GeneAgent W ang et al. [2025] and related systems Gao et al. [2024] extend LLM reasoning to gene-set analysis, whereas V irtual Lab Swanson et al. [2025] demonstrates collaborati ve multi-agent discov ery . Within single-cell analysis, CellAgent Xiao et al. [2024] decomposes scRN A-seq workﬂo ws into agent-handled subtasks, A utoBA Zhou et al. [2023] generates executable pipelines from natural language, and BRAD Pickard et al. [2025] integrates LLMs with enrichment analysis for biomark er identiﬁcation. In retriev al-augmented space, GeneGPT Jin et al. [2025] provides structured access to NCBI databases, and systems for deep phenotyping Garcia et al. [2025] and biomedical data extraction Cinquin [2024], Niyonkuru et al. [2025] hav e demonstrated the utility of RA G for factual grounding. CRISPR-GPT Qu et al. [2025] further illustrates agentic automation for gene-editing experiment design. Howe ver , these systems are primarily responsible for curated text and structured databases and lack the capacity to operate directly on high-dimensional transcriptomic representations. Concurrently , foundation models for single-cell biology ha ve achie ved remarkable progress in the learning of expressi ve latent representations from transcriptomic data. scGPT Cui et al. [2024] employs generati ve pre-training over millions of single-cell transcriptomes, capturing gene-gene dependencies for cell embedding, annotation transfer, and perturbation prediction. Extensions such as scWGBS-GPT Liang et al. [2025] and T okensome Zhang et al. [2024] broaden learned representations to methylomics and multimodal settings. Howe ver , these expression embeddings are not designed for semantic querying; they capture transcriptional similarity in latent spaces that lack alignment with the natural language concepts that biologists use to formulate hypotheses. Notably , the CellWhisperer Schaefer et al. [2025] addressed part of this gap by learning joint embeddings of transcriptomes and te xtual annotations via contrastiv e training, enabling chat-based interrogation of scRN A-seq data within CELLxGENE Schaefer et al. [2025]. While this establishes a compelling proof of concept for natural-language e xploration, it does not incorporate b uilt-in analytical modules for pathway scoring, interaction prediction, or condition-aw are comparison. This landscape re veals a fundamental disconnect: agentic systems and LLM-based tools excel at reasoning o ver te xt and generating interpretations but lack direct access to transcriptional data structure, while expression foundation models learn rich cellular representations that remain opaque to natural language interfaces. No existing system has uniﬁed expression-deri ved embeddings with semantic language representations within a single interacti ve frame work for single-cell discov ery . ELISA ( E mbedding- L inked I nteracti ve S ingle-cell A gent) addresses this gap by integrating scGPT expression em- beddings with semantic retriev al (sr) and LLM-based biological interpretation in a uniﬁed discovery platform (Fig. 1). Rather than retraining the expression foundation models, ELISA treats scGPT cluster embeddings as an e xpression-side representation that is explicitly combined with BioBER T -deri ved semantic embeddings through an automatic hybrid routing mechanism. A query classiﬁer detects whether the input is a gene signature, a natural language concept, or a mixture of both, and routes it to the appropriate retrie val pipeline gene mark er scoring, semantic cosine similarity , 2 arXiv T emplate A P R E P R I N T Figure 1: Overview of the ELISA architecture. The frame work comprises three stages. In data pr eparation (left), a single-cell dataset undergoes standard preprocessing (normalization, log-transform, highly v ariable gene selection, PCA, neighbor graph construction, and Leiden clustering), after which per-cluster differential e xpression statistics are computed, enriched with Gene Ontology (GO) and Reactome terms, and encoded into 768-dimensional semantic embeddings via BioBER T . In parallel, cell-le vel expression embeddings are generated through scGPT . Both representations are fused into a single serialized embedding ﬁle (.pt). In the r etrieval and analysis stage (center), a query classiﬁer routes user input—gene signatures, natural language concepts, or mixed queries—to the appropriate pipeline: gene marker scoring, semantic retriev al, or hybrid retrie val via reciprocal rank fusion (RRF). Additional analytical modules perform pathway scoring, ligand–receptor interaction prediction, comparative analysis, and proportion estimation directly on the embedded data. In the interpr etation stage (right), all retrie val and analysis outputs are passed to a Groq-hosted LLM (LLaMA 3.1-8B) that generates grounded biological interpretations and structured reports. or reciprocal rank fusion of both enabling ﬂexible na vigation across the full spectrum of biological queries. Built-in analytical modules for condition-aw are comparati ve analysis, ligand-receptor interaction prediction, pathw ay activity scoring, and cell-type proportion analysis operate directly on the embedded data, while an LLM reasoning layer translates statistical outputs into structured biological interpretations. Critically , ELISA enforces strict separation between dataset-deri ved e vidence and LLM-generated knowledge, enabling transparent h ypothesis generation. The system produces comprehensi ve, publication-ready reports with Nature-style visualizations, supporting the full arc from exploratory query to structured scientiﬁc output. W e validated the ELISA on ﬁ ve di verse scRN A-seq datasets spanning distinct tissues, disease conte xts, and experimental designs. Through a systematic comparison with published ﬁndings, we demonstrate that ELISA recovers ke y biological signals dif ferentially e xpressed genes, altered cell-type proportions, pathw ay acti vities, and cell cell interaction networks with high ﬁdelity . A quantitativ e ev aluation frame work comprising ﬁv e complementary metrics (gene coverage, interaction recov ery , pathway alignment, proportion consistency , and qualitative theme co verage) provides a principled assessment of the capacity of the system to replicate established biological conclusions. T o the best of our knowledge, scGPT embeddings ha ve not been inte grated with semantic language representations in a query-conditioned retrie val framew ork for single-cell genomics. In summary , this work makes the follo wing contributions: • Multimodal discov ery agent for single-cell genomics. W e introduce ELISA, an interpretable AI frame work that integrates transcriptomic embeddings, semantic knowledge retrie val, and large language model reasoning to enable natural-language–driv en exploration and biological discovery from single-cell RN A sequencing data. 3 arXiv T emplate A P R E P R I N T • Query-adaptiv e hybrid retriev al architecture. ELISA employs automatic query classiﬁcation and dynamic pipeline routing to combine complementary retrie val strategies including gene mark er scoring, semantic similarity search, and reciprocal rank fusion allowing ﬂexible, query-conditioned navigation of comple x cellular landscapes. • Integrated biological analysis modules for expression-grounded reasoning. The system incorporates analytical components for comparativ e expression analysis, ligand–receptor interaction scoring, pathway activity estima- tion, and cell-type proportion proﬁling, enabling automated interpretation and contextualization of discovered signals. • Benchmarking framew ork for e valuating AI-assisted biological discov ery . W e propose a quantitativ e e valuation strategy that measures the ability of AI agents to reco ver biologically meaningful ﬁndings reported in reference studies, and apply this framew ork across six diverse scRN A-seq datasets. • Empirical validation of discovery performance. Across multiple datasets and ev aluation metrics, ELISA consistently recov ers the majority of key biological signals reported in the corresponding studies, demonstrating its potential to support interpretable and reproducible AI-assisted discov ery in single-cell genomics. 2 Materials and Methods Detail about parameters and hyperparameters and software are speciﬁed in appendix 6,F .8. Detail about dataset are in E,5. Detail about the method are in F. 2.1 Datasets ELISA w as validated on six publicly a vailable scRN A-seq datasets from CZ CELLxGENE Discover (T able 5), spanning lung (cystic ﬁbrosis)Berg et al. [2025], adrenal tumor (neuroblastoma)Y u et al. [2025], multi-cancer immune checkpoint blockadeGondal et al. [2025], lung or ganoid Lim et al. [2025], healthy breast tissueBhat-Nakshatri et al. [2024], and ﬁrst-trimester brainMannens et al. [2025]. Datasets were downloaded in AnnData format and preprocessed into a standardized embedding format. Cell type annotat ions from the original publications were retained without modiﬁcation. 2.2 System architectur e ELISA integrates four modules a hybrid retriev al engine, an analytical suite, a visualization toolkit, and an LLM chat interface operating on a shared serialized PyT orch embedding ﬁle per dataset. Each embedding ﬁle stores cluster identiﬁers, BioBER T semantic embeddings (768-d), optional scGPT expression embeddings, per-cluster differential e xpression statistics, gene ontology (GO) and Reactome enrichment terms, and metadata. This cluster-le vel representation eliminates the need for access to the original count matrix at query time. 2.3 Hybrid retrie val An automatic query classiﬁer routes each input to one of the three pipeli nes based on token-le vel heuristics. Gene queries ( ≥ 60% gene-symbol tokens) were scored against per-cluster Dif ferential Expression (DE) proﬁles using a weighted function of | log 2 F C | and expression speciﬁcity ( p ct in − p ct out ). Ontology queries are encoded with BioBER TLee et al. [2020] and matched to precomputed cluster description embeddings via cosine similarity , augmented by Cell Ontology name boosting ( α = 0 . 15 ) and synonym e xpansion ( β = 0 . 10 ). Mixed queries are resolv ed through reciprocal rank fusion (RRF) of both pipelines ( k = 60 ). For benchmarking, an additi ve union strate gy selects the higher -recall modality as primary and appends unique results from the secondary pipeline. 2.4 Analytical modules The four b uilt-in modules operate directly on the embedded data. Ligand–r eceptor inter action pr ediction scores source–target cluster pairs using a curated database of 280+ pairs compiled from CellChatJin et al. [2025], Cell- PhoneDBEfremov a et al. [2020], and NicheNetBrowae ys et al. [2020]. P athway activity scoring quantiﬁes 60+ curated gene sets across ﬁ ve cate gories (immune signaling, cell biology , neuroscience, metabolism and tissue-speciﬁc). Com- parative analysis stratiﬁes clusters by condition metadata and identiﬁes condition-biased gene e xpression. Pr oportion analysis computes per-cluster cell fractions and condition-speciﬁc fold changes. Detailed description in F .3. 4 arXiv T emplate A P R E P R I N T 2.5 LLM interpr etation Retriev al and analysis outputs are interpreted by LLaMA-3.1-8B-Instant Grattaﬁori et al. [2024] via the Groq API (temperature 0.2)(free to use with token limit, API of chatGPT Achiam et al. [2023], gemini T eam et al. [2023] and claude Anthropic [2024] are inte grated and ready to use). Prompts enforce strict grounding in dataset e vidence, with explicit instructions to av oid hallucination and causal claims. A discov ery mode generates structured outputs comprising dataset evidence, established biology , consistency analysis, and candidate h ypotheses. 2.6 Benchmarking Retriev al was e valuated using 100 queries (50 ontology , 50 expression) with curated e xpected clusters, assessed using Cluster Recall@ k and Mean Reciprocal Rank (MRR). ELISA was compared against a CellWhisperer Schaefer et al. [2025]. Analytical modules were e valuated against ground truth from source publications using interaction recov ery rate, pathway alignment, proportion consistency , and gene recall. A combined permutation test (50,000 permutations) assessed ov erall signiﬁcance across all metrics simultaneously . 3 Results 3.1 ELISA ’ s hybrid retriev al outperforms CellWhisperer acr oss datasets and query types T o ev aluate the ability of ELISA to retrie ve biologically relev ant cell types from single-cell atlases, we benchmarked its retriev al performance against CellWhisperer Schaefer et al. [2025], a state-of-the-art multimodal framework for natural-language interrogation of scRN A-seq data. For each of the six datasets (T able 5), we designed paired sets of ontology queries (concept-lev el, e.g., “macrophage inﬁltration in CF (Cystic Fibrosis) airways”) and expression queries (gene-signature-based, e.g., “MARCO F ABP4 APOC1 C1QB C1QC MSR1”), with curated expected cluster sets deriv ed from the corresponding reference publications. W e e valuated four retrie val modes: CellWhisperer , Semantic ELISA, scGPT ELISA (gene marker scoring pipeline), and ELISA Union (additiv e fusion of semantic and gene pipelines via adaptiv e routing). Performance was assessed using Cluster Recall@ k and Mean Reciprocal Rank (MRR) across both query categories (Fig. 2; formal deﬁnitions of all retriev al and analytical ev aluation metrics are provided in Supplementary Section C). Across all six datasets, the ELISA mode consistently achieved the highest or near-highest performance on e very metric, en veloping or matching the CellWhisperer proﬁle on all axes of the radar plots (Fig. 2). T o quantify this advantage, we performed paired statistical tests across the six datasets for each retriev al metric (T able 2). A combined permutation test aggregating all 12 metrics simultaneously conﬁrmed that ELISA Union signiﬁcantly outperformed CellWhisperer ( p < 0 . 001 ; 50,000 permutations). This overall advantage w as driv en by large improvements on expression queries (mean ∆ MRR = +0.41, paired t -test p < 0 . 001 , Cohen’ s d = 5.98; mean ∆ Recall@5 = +0.29, p = 0 . 006 , d = 1.57) and consistent gains on ontology queries (mean ∆ MRR = +0.15, p = 0 . 028 , d = 1.02; mean ∆ Recall@5 = +0.08, p = 0 . 047 , d = 0.84). Across all six datasets, the ELISA Union won 46 of 54 indi vidual metric comparisons against CellWhisperer , with no dataset in which CellWhisperer held an overall adv antage. The Semantic ELISA pipeline alone also signiﬁcantly outperformed CellWhisperer (combined permutation test, p = 0 . 003 ), as did the scGPT pipeline ( p = 0 . 023 ), conﬁrming that both modalities independently contribute retriev al v alue be yond the CellWhisperer baseline. A key observation is that no single retriev al modality dominated across both query types. The Semantic pipeline consistently excelled on ontology queries, where biological concept matching beneﬁts from BioBER T’ s language understanding, synonym e xpansion, and Cell Ontology name boosting. In contrast, the gene marker scoring pipeline sho wed its strongest performance on e xpression queries, where matching transcriptomic signatures to cluster DE proﬁles is essential. This complementarity was particularly pronounced in the CF Airw ays dataset, where the Semantic pipeline achiev ed high ontology Recall@10 ( ∼ 0.95) but lower e xpression recall, while the gene pipeline sho wed the in verse pattern. Similar modality-speciﬁc advantages were visible across all datasets: in the Breast Tissue Atlas, Semantic and Union nearly ov erlapped on ontology metrics while the gene pipeline lagged; in Immune Checkpoint Blockade (ICB) Multi-Cancer , the gene pipeline outperformed Semantic on expression MRR while underperforming on ontology axes. CellWhisperer showed competitiv e performance on ontology queries in sev eral datasets, particularly CF Airways and High-Risk Neuroblastoma, where its ontology MRR approached that of the ELISA Semantic pipeline. Ho wever , CellWhisperer’ s performance dropped substantially on expression queries across all six datasets, with a mean MRR of 0.397 ± 0.049 compared to 0.806 ± 0.061 for ELISA Union a twofold dif ference (T able 2) Cohen [2013], Casella and Berger [2024]. This gap was most sev ere in the ICB Multi-Cancer and First-T rimester Brain datasets, where CellWhisperer’ s expression recall fell well below both ELISA pipelines. The expression query deﬁcit reﬂects a 5 arXiv T emplate A P R E P R I N T Figure 2: ELISA outperforms CellWhisperer acr oss six datasets and both query types. Radar plots showing retriev al performance on ontology (Ont) and expression (Exp) queries for each dataset. Each plot displays six ax es: Cluster Recall@ k at two dataset-adapted cutof fs and Mean Reciprocal Rank (MRR), ev aluated separately on ontology and expression queries (see Supplementary Section C for metric deﬁnitions). Higher values (further from center) indicate better performance. Four retriev al modes are compared: CellWhisperer (pink dashed), ELISA Semantic (blue), ELISA scGPT (orange), and ELISA Union (green). The Union mode consistently achiev es the largest radar footprint, matching or exceeding CellWhisperer on ontology metrics while substantially outperforming it on e xpression metrics. ELISA Union signiﬁcantly outperformed CellWhisperer across all datasets and metrics (combined permutation test, p < 0 . 001 ; see T able 2). fundamental architectural difference: CellWhisperer’ s contrastiv e text transcriptome alignment is optimized for natural- language cell type descriptions but does not incorporate a dedicated gene marker scoring mechanism for queries formulated as gene signatures, a query type that is common in exploratory single-cell analysis. The ELISA Union mode resolv es the tension between ontology and e xpression retriev al through its adaptiv e routing mechanism. For each query , the automatic classiﬁer identiﬁes whether the input is a gene list, a natural-language concept, or a mixture, and routes it to the appropriate pipeline. The additive union strate gy then combines the full ranked output of the primary pipeline with unique clusters from the secondary pipeline, ensuring that rele vant cell types captured by either modality are not lost. This yielded consistent gains: in the CF Airways dataset, Union achie ved a larger and more balanced radar footprint than any single modality; in the Breast T issue Atlas, Union matched the near-perfect ontology performance of Semantic while substantially impro ving expression recall; and in the First-T rimester Brain, Union compensated for Semantic’ s lo wer expression scores by incorporating the gene pipeline’ s matching strength. Notably , the performance advantage of ELISA was robust across datasets with v ery different structural properties. The CF Airways dataset (30 cell types, casecontrol design) and the First-Trimester Brain atlas (160 clusters, dev elopmental trajectory without disease contrast) represent opposite ends of the complexity spectrum, ho wever the ELISA Union outperformed CellWhisperer in both settings. Similarly , the ICB Multi-Cancer dataset, which inte grates nine cancer types across 223 patients, poses a challenging retriev al scenario owing due to its heterogeneous cell type nomenclature, yet ELISA maintains its performance advantage. In summary , ELISA ’ s hybrid retriev al architecture combining semantic language matching, gene marker scoring, and adaptiv e fusion provides a signiﬁcantly superior retriev al framew ork compared to text-only multimodal approaches (combined permutation test, p < 0 . 001 ). The systematic advantage on expression queries, where dedicated gene scoring compensates for the limitations of language-only embeddings (Cohen’ s d = 5.98 for MRR), establishes that both retrie val modalities contrib ute essential and non-redundant information for comprehensi ve single-cell atlas interrogation. 6 arXiv T emplate A P R E P R I N T T able 2: Statistical comparison of ELISA Union vs. CellWhisperer retriev al perf ormance. For each metric, ∆ mean reports the average improv ement of Union over CellWhisperer across datasets. Cohen’ s d is the paired effect size. p -values are from one-sided paired t -tests ( H 1 : Union > CellWhisperer). Sign indicates datasets where Union outperformed CellWhisperer . Metrics with fewer than 6 datasets reﬂect different Recall@ k cutof fs used per dataset (see Supplementary Section B). The combined permutation test ( p < 0 . 001 ) aggregates all metrics simultaneously . Category Metric ∆ mean Cohen’s d p (pair ed t ) Sign (W/L) n Expression MRR +0.409 5.98 < 0.001 6/6 6 Expression Recall@5 +0.287 1.57 0.006 5/5 5 Expression Recall@3 +0.428 5.38 0.006 3/3 3 Expression Recall@2 +0.492 3.43 0.014 3/3 3 Expression Recall@1 +0.442 1.84 0.043 3/3 3 Expression Recall@10 +0.284 1.43 0.065 3/3 3 Ontology MRR +0.152 1.02 0.028 5/6 6 Ontology Recall@5 +0.078 0.84 0.047 4/5 5 Ontology Recall@10 +0.113 2.46 0.025 3/3 3 Ontology Recall@1 +0.086 0.61 0.199 2/3 3 Ontology Recall@2 +0.046 0.73 0.166 2/3 3 Ontology Recall@3 +0.032 0.80 0.150 2/3 3 Combined (all 12 metrics) +0.237 — < 0.001 † 46/54 ‡ 6 † Combined permutation test (50,000 permutations). ‡ T otal metric-lev el wins across all datasets. 3.2 ELISA replicates k ey biological ﬁndings across six di verse datasets T o e valuate whether ELISA could recov er published biological conclusions through automated analysis alone, we compared ELISA-generated reports with the main-text results of six reference publications (T able 5). For each dataset, ELISA was pro vided only with the preprocessed embedding ﬁle and no prior kno wledge of the expected ﬁndings. W e assessed replication across ﬁ ve quantitativ e metrics: gene cov erage, pathway alignment, interaction recov ery , proportion consistency , and theme cov erage, and obtained an independent domain expert e valuation score (T able 3). Across all six datasets, ELISA achiev ed a mean composite score of 0.90 (range 0.82–0.96). Pathw ay alignment and theme coverage were near-perfect (mean 0.98 each), while gene coverage av eraged 0.85 and interaction recovery 0.77. Independent biological ev aluation scores (mean 0.88) conﬁrmed strong agreement with published ﬁndings. The computation of these metrics is presented in the appendix B. Airways with Cystic ﬁbr osis. ELISA was used to recov er the major epithelial and immune cell populations, as described by Ber g et al. Berg et al. [2025], including correct proportion shifts and IFN- γ /type I interferon programs (pathway alignment: 1.0). Gene cov erage reached 0.80, capturing markers such as IFNG , CD69 , and HLA-E . Interaction recov ery was 0.20, reﬂecting partial detection of the HLA-E/NKG2A and CALR–LRP1 axes (composite: 0.82). High-risk neuroblastoma. ELISA identiﬁed all major cellular compartments and correctly detected the HB- EGF/ERBB4 paracrine axis (interaction recovery: 1.00) as described by Y u et al. Y u et al. [2025]. Pathway alignment was perfect and with mTOR, MAPK, and ErbB programs identiﬁed. Gene cov erage was 0.84, with partial recovery of therapy-induced markers (composite: 0.95). Immune checkpoint blockade acr oss cancers. Using the ICB dataset, Gondal et al. Gondal et al. [2025], ELISA captured checkpoint molecules ( CD274 , PDCD1 , CTLA4 ), e xhaustion markers, and all major ligand–receptor axes including PD-L1/PD-1 and TIGIT/NECTIN2 (gene coverage: 0.77; pathway and interaction recovery: 1.00; composite: 0.93). Healthy breast tissue atlas. ELISA achie ved its highest composite score (0.96) on the dataset of Bhat-Nakshatri et al. Bhat-Nakshatri et al. [2024], accurately resolving the epithelial hierarchy with a gene cov erage of 0.96, perfect pathway alignment, and interaction reco very of 0.80. Ancestry-related transcriptional programs were not captured, reﬂecting a limitation of ELISA ’ s pathway-centric frame work. Fetal lung Alv eolar T ype (A T2) organoids. ELISA achiev ed perfect gene coverage (1.00) on the dataset of Lim et al. Lim et al. [2025], detecting all canonical surfactant genes and correctly identifying surfactant metabolism, Wnt, and Fibroblast Gro wth Factor (FGF) programs. Interaction reco very was lo wer (0.40), as SFTPC trafﬁcking mechanisms were outside transcriptomic scope (composite: 0.91). 7 arXiv T emplate A P R E P R I N T T able 3: Quantitativ e comparison between ELISA reports and reference single-cell studies. Scores reﬂect agreement between ELISA-generated biological interpretations and ﬁndings described in the main text of the corresponding publications. Gene coverage, pathway alignment, interaction recov ery , and proportion consistency were computed programmatically; theme cov erage was assessed independently by a domain expert as described in Section D. Dataset Gene Path. Int. Prop. Theme Comp. Cov . Align. Rec. Cons. Cov . score CF airway 0.80 1.0 0.20 Y es 0.85 0.82 Neuroblastoma 0.84 1.00 1.00 Y es 0.88 0.95 ICB Multi-Cancer 0.77 1.00 1.00 Y es 0.91 0.93 Breast Atlas 0.96 1.00 0.80 Y es 0.89 0.96 Fetal Lung A T2 1.00 1.00 0.40 Y es 0.88 0.91 Brain Atlas 0.85 1.00 1.00 Y es 0.90 0.95 Mean 0.85 1.00 0.77 6/6 0.88 0.90 Gene Cov .: gene co verage; Path. Align.: pathway alignment; Int. Rec.: interaction recov ery; Prop. Cons.: proportion consistenc y; Theme Cov .: theme cov erage; Biol. Eval.: independent domain expert e valuation score (0–1). Comp. score: unweighted mean of all preceding metrics (Prop. Cons. coded as 1.0 when consistent). First-trimester human brain. Despite operating solely on the transcriptomic component of this multimodal at- las Mannens et al. [2025], ELISA identiﬁed major neuronal populations with gene co verage of 0.85 and perfect pathw ay and interaction recovery . Chromatin accessibility analyses were correctly identiﬁed as outside scope (composite: 0.95). Summary . ELISA demonstrated robust replication across all six datasets (mean composite 0.90), with the strongest performance for pathway-le vel and thematic interpretation ( ≥ 0.98 mean). Gene coverage was high b ut not exhausti ve (0.85), with missed genes primarily in rare cell states and non-transcriptomic modalities. 3.3 Discovery of candidate r egulatory signals across tissue atlases Beyond reproducing the k ey biological signals described in the original studies, ELISA ’ s discovery mode highlighted sev eral candidate regulatory signals that were not explicitly emphasized in the reference publications (T able 4). These signals represent transcriptome-deri ved hypotheses emer ging from systematic cross-cell-type analysis of single-cell atlases. In the cystic ﬁbrosis airway dataset, ELISA identiﬁed enrichment of the CALR–LRP1 phagocytic signaling axis within the macrophage populations. Calreticulin–LRP1 signaling has previously been implicated in apoptotic cell recogni- tion and clearance, suggesting that altered macrophage-mediated phagocytosis may contribute to the inﬂammatory microen vironment characteristic of the CF lung. W ithin the fetal lung atlas, ELISA detected increased expression of the ubiquitin-associated re gulators TRIM21 and TRIM65 in alveolar type II (A T2) cells alongside the known E3 ubiquitin ligase ITCH . Although ITCH has been implicated in regulating surfactant protein C (SFTPC) maturation, the enrichment of these additional TRIM-family ligases suggests that cooperati ve ubiquitin-dependent pathw ays may participate in surfactant protein processing and A T2 cell proteostasis. In the healthy breast tissue atlas, ELISA highlighted strong enrichment of the Kelch-family gene KLHL29 within basal–myoepithelial cell populations. Although not emphasized in the original study , this pattern suggests that KLHL29 may represent a previously unrecognized mark er or structural regulator of basal epithelial identity . Analysis of the immune checkpoint blockade dataset revealed ele vated e xpression of macrophage markers CD163 and MRC1 within tumor -associated macrophage populations follo wing therapy . This expression pattern is consistent with an M2-like macrophage polarization state, potentially reﬂecting remodeling of the immune microen vironment in response to checkpoint blockade treatment. In the neuroblastoma dataset, ELISA identiﬁed differential usage of AP-1 transcription f actors across treatment states. Speciﬁcally , JUND expression was enriched at diagnosis, whereas JUNB and FOS were more strongly expressed after therapy . This shift suggests dynamic remodeling of AP-1–mediated stress-response programs during therapy-induced tumor state transitions. Finally , analysis of the developing brain atlas re vealed a shared transcription factor module composed of TF AP2B , LHX5 , and LHX1 across Purkinje neurons and midbrain GAB Aergic neuronal populations. This co-occurring re gulatory 8 arXiv T emplate A P R E P R I N T signature suggests the existence of a conserved transcriptional program underlying inhibitory neuron speciﬁcation in anatomically distinct brain regions. T aken toget her , these ﬁndings illustrate ho w ELISA can surf ace candidate re gulatory programs across di verse single-cell atlases. While these signals should be interpreted as transcriptome-derived hypotheses, the y provide potential starting points for targeted functional v alidation. These signals should be interpreted as transcriptome-deri ved h ypotheses and may serve as the starting points for tar geted experimental v alidation. T able 4: Candidate regulatory signals identiﬁed by ELISA across six reference single-cell atlases. These signals were not explicitly highlighted in the original publications and represent transcriptome-deri ved hypotheses generated through ELISA ’ s discov ery mode. Dataset Primary ﬁnding in reference study ELISA candidate discovery / h ypothesis CF airway Altered immune–structural cell crosstalk and inﬂammatory sig- naling in cystic ﬁbrosis airway tissue Detection of the macrophage CALR–LRP1 sig- naling axis, suggesting altered apoptotic cell recognition or phagocytic clearance pathw ays contributing to the CF lung inﬂammatory mi- croen vironment Breast Atlas Ancestry-associated epithelial lineage variation and luminal progenitor states in healthy breast tissue Enrichment of the Kelch-f amily gene KLHL29 in basal–myoepithelial cells, suggesting a po- tential additional marker or re gulator of basal epithelial structural identity Fetal Lung A T2 ITCH-mediated ubiquitin- dependent regulation of sur - factant protein C (SFTPC) maturation in alveolar type II cells Upregulation of TRIM21 and TRIM65 in ma- ture A T2 cells, suggesting additional TRIM- family ubiquitin ligases may participate in sur - factant protein processing and proteostasis ICB Multi-Cancer T umor and immune transcrip- tional responses associated with immune checkpoint blockade therapy Elev ated CD163 and MRC1 expression in tumor-associated macrophages, consistent with an M2-like polarization state potentially associ- ated with therapy-induced immune remodeling Neuroblastoma Therapy-induced transcriptional rewiring of tumor cell states and microen vironment interactions Differential AP-1 transcription f actor us- age, with JUND enriched at diagnosis and JUNB/FOS enriched post-treatment, suggest- ing stress-response remodeling during therapy- induced state transitions Brain Dev elopment Atlas Chromatin accessibility pro- grams deﬁning early neuronal lineage speciﬁcation Shared transcription factor module ( TF AP2B , LHX5 , LHX1 ) across Purkinje neurons and mid- brain GAB Aergic populations, suggesting a conserved regulatory program for inhibitory neuron speciﬁcation 4 Discussion In this study we introduced ELISA, an agent-based framew ork that uniﬁes semantic language retrie val, gene marker scoring, and LLM-mediated biological interpretation for interactive single-cell atlas interrogation. Systematic e valuation across six diverse datasets demonstrated that ELISA signiﬁcantly outperforms CellWhisperer in cell type retriev al (combined permutation test, p < 0 . 001 ) and faithfully replicated published biological ﬁndings with a mean composite score of 0.90. Here we discuss the implications of these results for the design of retriev al systems in single-cell genomics, the limitations of contrastiv e multimodal alignment, and broader role of agentic AI in biological discovery . Contrastive alignment produces text-dominated embeddings. A central ﬁnding of this study is the striking asymmetry in CellWhisperer performance across query types. In ontology queries natural language descriptions of cell types and biological processes CellWhisperer performed competitively with ELISA ’ s Semantic pipeline, achieving mean ontology MRR values within 0.15 of ELISA Union across most datasets (T able 2, Fig. 2). This is expected: 9 arXiv T emplate A P R E P R I N T CellWhisperer’ s CLIP-style contrasti ve training aligns transcriptome embeddings with textual descriptions, and ontology queries directly exploit this te xt-side alignment. Howe ver , on expression queries where users provide gene signatures rather than natural language CellWhisperer’ s performance collapsed, with expression MRR av eraging 0.397 compared to 0.806 for ELISA Union, a twofold deﬁcit (Cohen’ s d = 5.98). This asymmetry re veals a fundamental limitation of contrasti ve multimodal alignment for single-cell retrie val. CLIP- style training optimizes for text transcriptome correspondence by learning a shared embedding space where matching text cell pairs are close and mismatched pairs are distant. The resulting embeddings are, by construction, shaped primarily by the textual supervision signal: the model learns to position transcriptomes near their text descriptions, but the ﬁne-grained transcriptomic structure which genes are dif ferentially expressed, at what fold changes, in what fraction of cells is compressed into a representation optimized for te xt matching rather than gene-lev el querying. When a user submits a gene signature such as “MARCO F ABP4 APOC1 C1QB C1QC MSR1”, these gene names are processed as text tokens rather than matched against dif ferential expression statistics, resulting in a retriev al signal that is weaker and less speciﬁc than direct marker scoring. This observ ation has broader implications than those of ELISA and CellWhisperer . As foundation models for single-cell biology increasingly adopt contrasti ve or multimodal pretraining objecti ves, our results caution that te xt-supervised alignment may inadvertently sacriﬁce e xpression-lev el speciﬁcity . The dual-query e valuation frame work introduced here requiring systems to perform well on both ontology and expression queries pro vides a principled diagnostic for detecting such modality imbalances. Explicit routing outperformed implicit fusion. ELISA ’ s architectural response to this challenge was to a void implicit embedding fusion altogether . Rather than learning a single shared space that must simultaneously serve te xt and expression queries, ELISA maintains two separate representation spaces BioBER T semantic embeddings and gene-lev el DE statistics, and routes queries to the appropriate pipeline through e xplicit classiﬁcation. The query classiﬁer , operating on simple tok en-lev el heuristics (gene name patterns, known vocab ulary membership, natural language indicators), achiev ed reliable routing across all six datasets without requiring any training data. This design choice is supported empirically by complementarity analysis: the semantic pipeline won ontology queries, while the gene marker scoring pipeline won on e xpression queries in every dataset, with minimal o verlap in their error proﬁles. The additiv e union strategy , which selects the better-performing modality as the primary and appends unique results from the secondary , captures the strengths of both pipelines without the compression artifacts inherent in learned fusion. The result was a system that matched or exceeded the best single modality on every metric across e very dataset a property that no implicit fusion method could guarantee. Analytical modules bridge r etrieval and inter pretation. A distinguishing feature of ELISA relati ve to prior retriev al-focused systems is the integration of downstream analytical modules pathway scoring, ligand receptor interaction prediction, comparative analysis, and proportion estimation that operate directly on the same embedded data representation used for retrie val. This design enables a seamless transition from “which cell types are rele vant?” (retrie val) to “what biological programs are acti ve in these cell types?” (analysis) to “what does this mean biologically?” (LLM interpretation), all within a single interactiv e session. The near -perfect pathway alignment (mean 0.98) and theme coverage (mean 0.88) scores across all six datasets demonstrated that this integrated architecture effecti vely connects gene-level e vidence to biological programs. In contrast, systems that perform retriev al alone including CellWhisperer, require users to manually extract gene lists from retriev ed clusters and perform separate pathway and interaction analyses using external tools, introducing friction and potential inconsistencies. The interaction recovery metric (mean 0.77) was the most variable across datasets, with perfect recov ery in neurob- lastoma, ICB, and brain datasets b ut lower recov ery in cystic ﬁbrosis (0.40) and fetal lung (0.40). These lo wer scores primarily reﬂect the inherent dif ﬁculty of predicting speciﬁc ligand–receptor pairs from expression data when the ligand or receptor is expressed at moderate lev els across multiple cell types, making the interaction statistically detectable but not highly ranked. Future work could address this by incorporating spatial proximity information or protein-le vel data to improv e the interaction speciﬁcity . LLM grounding and the discov ery hallucination boundary . ELISA ’ s discovery mode, which prompts the LLM to separate dataset e vidence from established biology and to propose hypotheses with probabilistic language, generated biologically plausible candidate signals in all six datasets (T able 4). These include the CALR, LRP1 phagocytic axis in cystic ﬁbrosis macrophages, dif ferential AP-1 family member usage in neuroblastoma therapy response, and a shared TF AP2B/LHX5/LHX1 re gulatory module across inhibitory neuron subtypes in the dev eloping brain. While 10 arXiv T emplate A P R E P R I N T these hypotheses require e xperimental v alidation, they illustrate the potential of grounded LLM reasoning to surface non-obvious patterns in comple x datasets. Howe ver , a strict separation between data-derived e vidence and LLM-generated interpretation is essential. Without it, the LLM would ine vitably introduce plausible-sounding but unsupported claims a risk that is particularly acute in biology , where prior knowledge is vast and contextual. ELISA ’ s prompt architecture addresses this by providing the LLM only with retriev ed cluster data, gene statistics, and pathway results as context, with explicit instructions to av oid external literature and causal claims. Future directions. Se veral extensions can strengthen and broaden ELISA ’ s capabilities. Integration with spatial transcriptomics data would enable spatially resolved interaction prediction, addressing the current limitation of expression-only interaction scoring. Incorporation of trajectory inference methods would allow ELISA to reason about dynamic processes such as dif ferentiation and therapy response. Expansion of the retrie v al engine to support cross- dataset queries comparing cell types across tissues or disease states would enable the kind of meta-analytical reasoning that was outside ELISA ’ s scope in the ICB dataset ev aluation. Finally , replacing the ﬁxed LLM with a ﬁne-tuned model trained on single-cell biological reasoning can improv e the speciﬁcity and depth of automated interpretations. 5 Conclusion. ELISA demonstrates that explicit modality routing, rather than implicit contrastiv e fusion, provides a more robust foundation for multimodal single-cell retrieval. By maintaining separate semantic and expression pipelines and combining them through adaptiv e query classiﬁcation, ELISA achieves consistently superior performance across both natural language and gene-signature queries. The integration of analytical modules and grounded LLM interpretation within a single interactive framework bridges the gap between data exploration and biological disco very , enabling researchers to move from raw atlas data to structured biological hypotheses within a single session. As single-cell datasets continue to grow in scale and complexity , systems that combine the complementary strengths of language models and expression-aw are retriev al will be essential for translating transcriptomic data into biological understanding. 6 Conﬂicts of interest The authors declare that they ha ve no competing interests. 7 Funding Computational resources are furnished by Dr . Antonio Orvieto, PI at Max Planck Institute for Intelligent Systems. The rest of the work is self-ﬁnanced 8 Data a vailability All six single-cell RN A sequencing datasets used in this study are publicly available through CZ CELLxGENE Discov er ( https://cellxgene.cziscience.com ): cystic ﬁbrosis airw ays Berg et al. [2025], high-risk neuroblas- toma Y u et al. [2025], immune checkpoint blockade multi-cancer Gondal et al. [2025], fetal lung A T2 organoids Lim et al. [2025], healthy breast tissue Bhat-Nakshatri et al. [2024], and ﬁrst-trimester brain Mannens et al. [2025]. Datasets were downloaded in AnnData (.h5ad) format. Source code available at https://github.com/omaruno/ ELISA- An- AI- Agent- for- Expression- Grounded- Discovery- in- Single- Cell- Genomics . 9 A uthor contributions statement Omar Coser performed ev erything present in this manuscript. A preliminary version of this work appeared at the ICLR 2025 W orkshop on Generati ve AI for Genomics, and MLGenX Coser [2026a,b]. If you intend to use the script of ELISA cite this work. 10 Acknowledgments The authors acknowledge Dr . Antonio Orvieto for allowing to use computational resources of his Lab . 11 arXiv T emplate A P R E P R I N T References Fuchou T ang, Catalin Barbacioru, Y angzhou W ang, Ellen Nordman, Clarence Lee, Nanlan Xu, Xiaohui W ang, John Bodeau, Brian B T uch, Asim Siddiqui, et al. mrna-seq whole-transcriptome analysis of a single cell. Natur e methods , 6(5):377–382, 2009. Malte D Lueck en and F abian J Theis. Current best practices in single-cell rna-seq analysis: a tutorial. Molecular systems biology , 15(6):e8746, 2019. Karan Singhal, Shekoofeh Azizi, T ao T u, S Sara Mahdavi, Jason W ei, Hyung W on Chung, Nathan Scales, Ajay T anwani, Heather Cole-Le wis, Stephen Pfohl, et al. Lar ge language models encode clinical kno wledge. Natur e , 620(7972): 172–180, 2023. Patrick Le wis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler , Mike Le wis, W en-tau Y ih, T im Rocktäschel, et al. Retriev al-augmented generation for knowledge-intensi ve nlp tasks. Advances in neural information pr ocessing systems , 33:9459–9474, 2020. Juraj Gottweis, W ei-Hung W eng, Alexander Daryin, T ao T u, Anil Palepu, Petar Sirko vic, Artiom Myaskovsky , Felix W eissenberger , Keran Rong, Ryutaro T anno, et al. T owards an ai co-scientist. arXiv pr eprint arXiv:2502.18864 , 2025. Ke xin Huang, Serena Zhang, Hanchen W ang, Y uanhao Qu, Y ingzhou Lu, Y usuf Roohani, Ryan Li, Lin Qiu, Gavin Li, Junze Zhang, et al. Biomni: A general-purpose biomedical ai agent. biorxiv , 2025. Zhizheng W ang, Qiao Jin, Chih-Hsuan W ei, Shubo Tian, Po-Ting Lai, Qingqing Zhu, Chi-Ping Day , Christina Ross, Robert Leaman, and Zhiyong Lu. Geneagent: self-veriﬁcation language agent for gene-set analysis using domain databases. Natur e Methods , 22(8):1677–1685, 2025. Shanghua Gao, Ada Fang, Y epeng Huang, V alentina Giunchiglia, A yush Noori, Jonathan Richard Schw arz, Y asha Ektefaie, Jo vana K ondic, and Marinka Zitnik. Empowering biomedical discovery with ai agents. Cell , 187(22): 6125–6151, 2024. Kyle Swanson, W esley W u, Nash L Bulaong, John E Pak, and James Zou. The virtual lab of ai agents designs new sars-cov-2 nanobodies. Nature , 646(8085):716–723, 2025. Y ihang Xiao, Jinyi Liu, Y an Zheng, Xiaohan Xie, Jianye Hao, Mingzhi Li, Ruitao W ang, Fei Ni, Y uxiao Li, Jintian Luo, et al. Cellagent: An llm-dri ven multi-agent frame work for automated single-cell data analysis. arXiv pr eprint arXiv:2407.09811 , 2024. J Zhou, B Zhang, X Chen, et al. Automated bioinformatics analysis via autoba. arxi v , 2023. Joshua Pickard, Ram Prakash, Marc Andrew Choi, Natalie Oliven, Cooper Stansb ury , Jillian Cwycyshyn, Nicholas Galioto, Alex Gorodetsky , Alvaro V elasquez, and Indika Rajapakse. Automatic biomarker discov ery and enrichment with brad. Bioinformatics , 41(5):btaf159, 2025. Suoqin Jin, Maksim V Plikus, and Qing Nie. Cellchat for systematic analysis of cell–cell communication from single-cell transcriptomics. Natur e pr otocols , 20(1):180–219, 2025. Brandon T Garcia, Lauren W esterﬁeld, Priya Y elemali, Nikhita Gogate, E Andres Riv era-Munoz, Haowei Du, Moez Dawood, Ang ad Jolly , James R Lupski, and Jennifer E Posey . Improving automated deep phenotyping through large language models using retriev al-augmented generation. Genome Medicine , 17(1):91, 2025. Olivier Cinquin. Chip-gpt: a managed large language model for robust data extraction from biomedical database records. Brieﬁngs in bioinformatics , 25(2):bbad535, 2024. Enock Niyonkuru, J Harry Cauﬁeld, Leigh C Carmody , Michael A Gargano, Sabrina T oro, Patricia L Whetzel, Hannah Blau, Mauricio Soto Gomez, Elena Casiraghi, Leonardo Chimirri, et al. Lev eraging generativ e ai to assist biocuration of medical actions for rare disease. Bioinformatics advances , 5(1):vbaf141, 2025. Y uanhao Qu, Kaixuan Huang, Ming Y in, Kanghong Zhan, Dyllan Liu, Di Y in, Henry C Cousins, W illiam A Johnson, Xiaotong W ang, Mihir Shah, et al. Crispr-gpt for agentic automation of gene-editing e xperiments. Natur e Biomedical Engineering , pages 1–14, 2025. Haotian Cui, Chloe W ang, Hassaan Maan, Kuan Pang, Fengning Luo, Nan Duan, and Bo W ang. scgpt: tow ard building a foundation model for single-cell multi-omics using generativ e ai. Natur e methods , 21(8):1470–1480, 2024. Chaoqi Liang, Peng Y e, Hongliang Y an, Peng Zheng, Jianle Sun, Y anni W ang, Y u Li, Y uchen Ren, Y uanpei Jiang, Junjia Xiang, et al. scwgbs-gpt: a foundation model for capturing long-range cpg dependencies in single-cell whole-genome bisulﬁte sequencing to enhance epigenetic analysis. bioRxiv , pages 2025–02, 2025. Haoxi Zhang, Xinxu Zhang, Y uanxin Lin, Maiqi W ang, Y i Lai, Y u W ang, Linfeng Y u, Y ufeng Xu, Ran Cheng, and Edward Szczerbicki. T okensome: T ow ards a genetic vision-language gpt for explainable and cognitiv e karyotyping. arXiv pr eprint arXiv:2403.11073 , 2024. 12 arXiv T emplate A P R E P R I N T Moritz Schaefer, Peter Peneder , Daniel Malzl, Salvo Danilo Lombardo, Mihaela Peychev a, Jake Burton, Anna Hakobyan, V arun Sharma, Thomas Krausgruber , Celine Sin, et al. Multimodal learning enables chat-based exploration of single-cell data. Natur e Biotechnology , pages 1–11, 2025. Marijn Berg, Lisette Krabbendam, Esmee K v an der Ploe g, Menno van Nimwegen, Tjeerd van der V eer, Martin Banchero, Orestes A Carpaij, Remco Hoogenboezem, Maarten van den Ber ge, Eric Bindels, et al. Evidence for altered immune-structural cell crosstalk in cystic ﬁbrosis re vealed by single cell transcriptomics. Journal of Cystic F ibr osis , 2025. W enbao Y u, Rumeysa Biyik-Sit, Y asin Uzun, Chia-Hui Chen, Anusha Thadi, Jonathan H Sussman, Minxing Pang, Chi- Y un W u, Liron D Grossmann, Peng Gao, et al. Longitudinal single-cell multiomic atlas of high-risk neuroblastoma rev eals chemotherapy-induced tumor microen vironment rewiring. Nature Genetics , 57(5):1142–1154, 2025. Mahnoor N Gondal, Marcin Cieslik, and Arul M Chinnaiyan. Integrated cancer cell-speciﬁc single-cell rna-seq datasets of immune checkpoint blockade-treated patients. Scientiﬁc Data , 12(1):139, 2025. Kyungtae Lim, Eimear N Rutherford, Li via Delpiano, Peng He, W eimin Lin, Dawei Sun, Dick JH V an den Boomen, James R Edgar , Jae Hak Bang, Alexander Predeus, et al. A nov el human fetal lung-derived alv eolar organoid model re veals mechanisms of surfactant protein c maturation relev ant to interstitial lung disease. The EMBO Journal , 44(3): 639, 2025. Poornima Bhat-Nakshatri, Hongyu Gao, Aditi S Khatpe, Adedeji K Adebayo, Patrick C McGuire, Cihat Erdogan, Duojiao Chen, Guanglong Jiang, Felicia New , Rana German, et al. Single-nucleus chromatin accessibility and transcriptomic map of breast tissues of women of di verse genetic ancestry . Natur e medicine , 30(12):3482–3494, 2024. Camiel CA Mannens, Lijuan Hu, Peter Lönnerberg, Marijn Schipper , Caleb C Reagor , Xiaofei Li, Xiaoling He, Roger A Barker , Erik Sundström, Danielle Posthuma, et al. Chromatin accessibility during human ﬁrst-trimester neurodev elopment. Natur e , 647(8088):179–186, 2025. Jinhyuk Lee, W onjin Y oon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. Biobert: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics , 36(4):1234–1240, 2020. Mirjana Efremo va, Miquel V ento-T ormo, Sarah A T eichmann, and Roser V ento-T ormo. Cellphonedb: inferring cell–cell communication from combined expression of multi-subunit ligand–receptor complex es. Nature pr otocols , 15(4): 1484–1506, 2020. Robin Brow aeys, W outer Saelens, and Yvan Saeys. Nichenet: modeling intercellular communication by linking ligands to target genes. Nature methods , 17(2):159–162, 2020. Aaron Grattaﬁori, Abhiman yu Dubey , Abhinav Jauhri, Abhina v Pande y , Abhishek Kadian, Ahmad Al-Dahle, Aiesha Let- man, Akhil Mathur, Alan Schelten, Alex V aughan, et al. The llama 3 herd of models. arXiv preprint , 2024. Josh Achiam, Ste ven Adler , Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. Gpt-4 technical report. arXiv preprint , 2023. Gemini T eam, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Y u, Radu Soricut, Johan Schalkwyk, Andrew M Dai, Anja Hauth, Katie Millican, et al. Gemini: a family of highly capable multimodal models. arXiv pr eprint arXiv:2312.11805 , 2023. Anthropic. The claude 3 model family: Opus, sonnet, haiku, 2024. URL https://www- cdn.anthropic.com/ de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf . Jacob Cohen. Statistical power analysis for the behavior al sciences . routledge, 2013. George Casella and Roger Ber ger . Statistical infer ence . Chapman and Hall/CRC, 2024. Omar Coser . Elisa: A generativ e ai agent for expression-grounded disco very in single-cell genomics. In ICLR 2026 W orkshop on Generative AI for Genomics , 2026a. Omar Coser . Elisa: An interpretable hybrid agent for expression-grounded disco very in single-cell genomics. In ICLR 2026 W orkshop on Machine Learning for Genomics Explorations , 2026b. Nils Reimers and Iryna Gure vych. Sentence-bert: Sentence embeddings using siamese bert-networks. In Pr oceedings of the 2019 confer ence on empirical methods in natural language pr ocessing and the 9th international joint conference on natural languag e pr ocessing (EMNLP-IJCNLP) , pages 3982–3992, 2019. F Alexander W olf, Philipp Angerer, and F abian J Theis. Scanpy: large-scale single-cell gene expression data analysis. Genome biology , 19(1):15, 2018. 13 arXiv T emplate A P R E P R I N T Leland McInnes, John Healy , and James Melville. Umap: Uniform manifold approximation and projection for dimension reduction. arXiv pr eprint arXiv:1802.03426 , 2018. Laurens v an der Maaten and Geof frey Hinton. V isualizing data using t-sne. J ournal of machine learning r esearc h , 9 (Nov):2579–2605, 2008. A Software and r eproducibility ELISA was implemented in Python 3.10+ using PyT orch, sentence-transformersReimers and Gurevych [2019], scanpyW olf et al. [2018], scikit-learn, and UMAP-learnMcInnes et al. [2018]. All analyses were performed on a standard workstation without GPU requirements for retrie val and analysis. Source code, benchmark queries, and ev aluation scripts are a vailable at [repository URL]. Use of an LLM (LLaMA-3.1-8B) for automated interpretation is documented in accordance with journal polic y . T opical subheadings are allowed. Authors must ensure that their Methods section includes adequate experimental and characterization data necessary for others in the ﬁeld to reproduce their work. All experiment has been performed on a GPU A100 with 80 gb of RAM B Replication ev aluation metrics T able 3 reports six metrics quantifying the agreement between ELISA-generated reports and the ﬁndings of the corresponding reference publications. Each metric is deﬁned below . Gene coverage. Gene co verage measures the fraction of ke y genes highlighted in the reference publication in which ELISA was identiﬁed in the correct cell type context. For each dataset, the ev aluator compiled a set of key genes from the paper’ s main text, ﬁgures, and supplementary tables (e.g., dif ferentially expressed genes, cell type markers and signaling molecules). A gene was scored as “recovered” if it appeared in ELISA ’ s output for a biologically appropriate cluster . The gene coverage is computed as: Gene cov erage = | key genes reco vered by ELISA | | key genes reported in reference | (1) Pathway alignment. Pathw ay alignment quantiﬁes whether ELISA ’ s pathway scoring module detects the biological programs reported in the reference study . For each dataset, the ev aluator identiﬁed the pathways discussed in this paper (e.g., IFN- γ signaling, mTOR and ErbB). A pathway was scored as “aligned” if ELISA ’ s module returned it with a positiv e score in at least one biologically appropriate cluster . Pathway alignment is computed as: Pathway alignment = | pathways found by ELISA | | pathways reported in reference | (2) Interaction recov ery . Interaction recovery assesses whether ELISA ’ s ligand–receptor prediction module detected the cell–cell communication axes described in the reference publication. For each dataset, the e valuator compiled ground truth interactions from the paper (e.g., HB-EGF/ERBB4 between macrophages and neuroblasts, HLA-E/NKG2A between epithelial and CD8 + T cells). Recov ery was scored at the pair le vel: a ligand–receptor pair was counted as “recov ered” if ELISA detected it with a non-zero score, re gardless of whether the source–tar get cell type assignment exactly matched: Interaction recov ery = | LR pairs detected by ELISA | | LR pairs reported in reference | (3) Proportion consistency . Proportion consistency is a binary (Y es/No) criterion that e valuates whether ELISA ’ s proportion analysis correctly identiﬁed the direction of cell type composition changes for datasets with condition contrasts. For each cell type reported in the reference as increased or decreased in the disease or treatment condition, the e valuator checked whether ELISA ’ s fold change pointed in the same direction. A dataset recei ved “Y es” if the majority of reported changes were directionally consistent. Theme coverage. Theme cov erage captures whether an ELISA ’ s interpretiv e summary reproduced the major biological conclusions of the reference study . Unlike gene and pathw ay-le vel metrics, that assess individual molecular entities, theme cov erage ev aluates high-level biological narrativ es. For each dataset, the ev aluator identiﬁed the main themes from the paper’ s abstract and results (e.g., “aberrant adapti ve immunity with upregulated IFN- γ signaling” for the CF dataset; “therapy-induced macrophage polarization toward immunosuppressiv e phenotypes” for the neuroblastoma 14 arXiv T emplate A P R E P R I N T dataset). A theme was scored as “covered” if ELISA ’ s LLM-generated interpretation mentioned and correctly described the corresponding biological ﬁnding: Theme cov erage = | themes captured by ELISA | | major themes in reference | (4) Biological ev aluation score. The biological Evaluation Score pro vides an independent assessment of overall report quality . Composite score. The composite score summarizes o verall replication performance as the unweighted mean of the four continuous metrics: Composite = Gene cov . + P ath. align. + Int. rec. + Theme cov . 4 (5) Proportion consistency is excluded from the composite av erage because it is binary rather than continuous, but is reported separately as a quality check. C Retriev al and analytical evaluation metrics T o ensure reproducible and interpretable ev aluation of ELISA ’ s retriev al and analytical modules, we deﬁned the full set of metrics used throughout the benchmark (see also the benchmark scripts in the supplementary code repository for complete implementations). Retriev al metrics quantify how ef fectiv ely each mode recovers the expected cell types for a giv en query , while analytical metrics assess the accuracy of ELISA ’ s downstream whereas biological interpretation modules interaction discov ery , pathway enrichment, proportion analysis, and comparati ve dif ferential expression. An ov erview of the six e valuation datasets and their properties is pro vided in T able 5. C.1 Retrieval metrics Each radar plot in Fig. 2 displays six axes corresponding to three retrie val metrics e valuated separately on the tw o query categories (ontology and e xpression). The three metrics are: 1. Cluster Recall@ k (two ax es per plot: Ont R@ k , Exp R@ k ). This metric measures the fraction of expected cell types that appear within the top- k positions of the ranked retriev al list. The v alue of k is adapted to each dataset’ s number of clusters: R@5 and R@10 for large-cluster datasets (CF Airways with 30 clus- ters, ICB Multi-Cancer with 31, First-T rimester Brain with 28), R@1 and R@2 for small-cluster datasets (Breast T issue Atlas with 8 clusters, fdA T2 Or ganoids with 5, High-Risk Neuroblastoma with 11). A Recall@ k of 1.0 indicates that all e xpected clusters were retrie ved within the top- k ; a v alue of 0.0 indicates that none were found. T wo Recall cutoffs are sho wn per plot to capture both stringent (lower k ) and permissiv e (higher k ) retrie val accurac y . 2. Mean Reciprocal Rank (two ax es: Ont MRR, Exp MRR). MRR quantiﬁes the rank position of the ﬁrst correctly retriev ed cluster . An MRR of 1.0 means the top-ranked result is relev ant; 0.5 means the ﬁrst relev ant result appears at rank 2; 0.33 at rank 3, and so on. MRR captures top-of-list precision, which is critical for interactiv e use where researchers typically inspect only the ﬁrst few results. T ogether , the six axes capture complementary aspects of retriev al quality: Recall@ k measures cover age (how many expected clusters are found), whereas MRR measures pr ecision at rank 1 (how quickly the ﬁrst rele vant cluster appears). Evaluating both metrics on ontology queries (natural-language, concept-le vel) and expression queries (gene-signature- based) separately rev eals modality-speciﬁc strengths: a system may excel at one query type while underperforming the other . Thus, the radar footprint thus provides an at-a-glance summary of each retrie val mode’ s ov erall coverage, precision, and balance across query types. A larger , more symmetric footprint indicates stronger and more balanced retriev al performance. Four retrie val modes compared are: CellWhisperer (pink dashed line), which uses contrastiv e text transcriptome CLIP embeddings; ELISA Semantic (blue), which performs BioBER T -based cosine similarity matching against cluster descriptions enriched with GO and Reactome terms; ELISA scGPT (orange), which scores clusters by matching query genes ag ainst per -cluster dif ferential expression proﬁles; and ELISA Union (green), which adaptiv ely fuses both ELISA pipelines by routing each query to the better-performing modality and appending unique results from the secondary pipeline. 15 arXiv T emplate A P R E P R I N T C.1.1 Statistical testing T o assess whether performance differences between retriev al modes are statistically signiﬁcant across datasets, we employed one-sided paired t -tests (with the alternativ e hypothesis that ELISA Union outperforms CellWhisperer) and reported Cohen’ s d as the paired effect size. Because different datasets use dif ferent Recall@ k cutof fs, indi vidual metric comparisons hav e varying sample sizes ( n = 3 to n = 6 datasets). T o obtain a single omnibus test, we performed a combined permutation test: the sign of the difference (Union minus CellWhisperer) w as computed for every metric dataset pair simultaneously , and dataset labels were permuted 50,000 times to construct the null distribution of the aggregate adv antage. All p -values and ef fect sizes are reported in T able 2. D Human ev aluation protocol T o obtain the biological e valuation scores shown in T able 3, a domain expert with training in molecular biology and single-cell genomics independently re viewed each ELISA-generated report against the corresponding reference publication. The ev aluation followed a structured ﬁ ve-step protocol: 1. Gene veriﬁcation. Each gene reported by ELISA as differentially e xpressed or as a marker of a speciﬁc cell type was cross-checked against the main te xt, ﬁgures, and supplementary tables of the reference publications. A gene was scored as “recovered” if it appeared in the paper’ s reported DE gene lists, marker panels, or ﬁgure annotations for the corresponding cell type. The gene cov erage score was computed as the fraction of paper-reported k ey genes that ELISA identiﬁed in the correct cluster context. 2. Pathway assessment. Each pathway identiﬁed by ELISA ’ s pathway scoring module (e.g., “IFN-gamma signaling, ” “mTOR signaling”) was compared against pathway-le vel ﬁndings described in the reference study . A pathway was scored as “aligned” if the reference publication reported activ ation or enrichment of that pathway in a consistent cell type context. Pathw ay alignment was computed as the fraction of paper-reported pathways that ELISA correctly detected as acti ve (score > 0 ) in at least one biologically appropriate cluster . 3. Interaction validation. Each ligand-receptor interaction predicted by ELISA was v eriﬁed against the cell- cell communication analyses reported in a previous pubblication. V alidation was performed at two le vels: (i) whether the ligand–receptor pair itself was reported in the paper , regardless of the cell type context (LR recov ery rate), and (ii) whether both the pair and the source target cell type assignment matched the paper’ s ﬁndings (full match rate). 4. Proportion and condition consistency . For datasets with condition contrasts (e.g., CF vs. healthy), the ev alu- ator veriﬁed whether ELISA ’ s proportion analysis correctly identiﬁed the direction of cell type composition changes reported in the reference study . Each cell type with a known expected change (increased or decreased in the disease/treatment condition) was checked for directional agreement. 5. Theme coverage and h ypothesis assessment. The e v aluator assessed whether ELISA ’ s interpretiv e summaries captured the major biological themes and conclusions of the reference study (e.g., “aberrant adaptiv e immunity with upregulated IFN- γ signaling” for the CF dataset). Additionally , candidate hypotheses generated by ELISA ’ s discov ery mode were e valuated for biological plausibility through targeted literature re view: the ev aluator searched PubMed for prior e vidence supporting or contradicting each proposed mechanism (e.g., CALR–LRP1 in macrophage phagocytosis, TRIM-family ligases in surfactant processing). Hypotheses were classiﬁed as “plausible” if supporting literature existed, “no vel” if no prior reports were found but the mechanism was biologically coherent, or “unsupported” if contradicted by existing e vidence. The composite score for each dataset was computed as the unweighted mean of gene coverage, pathway alignment, interaction recov ery , and theme coverage, with proportion consistenc y treated as a binary (pass/fail) criterion. E Materials E.1 Datasets ELISA was v alidated on six publicly available scRN A-seq datasets deposited in the CZ CELLxGENE Discover portal, spanning ﬁve distinct tissues, four disease contexts, and both case–control and longitudinal experimental designs (T able 5). Datasets were selected to cover a broad range of biological complexity , cell type div ersity , and analytical challenges, including inﬂammatory lung disease, pediatric and adult cancers, drug-resistant epilepsy , immune checkpoint therapy response, and normal tissue homeostasis. 16 arXiv T emplate A P R E P R I N T Dataset 1 (D1): cystic ﬁbrosis br onchial epithelium. Berg et al. Berg et al. [2025] generated the ﬁrst single-cell transcriptome atlas of the cystic ﬁbrosis (CF) lung comprising both structural and immune cells. Droplet-based scRN A-seq was performed on bronchial wall biopsies from patients with CF ( n = 8 ) and healthy controls ( n = 19 ) and inte grated using the fastMNN batch correction framework with the Human Lung Cell Atlas as reference. The dataset encompasses approximately 96,000 cells across 30 annotated cell types, including epithelial (basal, ciliated, secretory , goblet, ionocyte), immune (CD8 + T cells, CD4 + T cells, B cells, plasma cells, macrophages, monocytes, NK cells, dendritic cells, mast cells), stromal (ﬁbroblasts, peric ytes), and endothelial populations. K ey ﬁndings include dysregulated basal cell function, aberrant adapti ve immunity with upregulated IFN- γ signaling, a novel HLA-E/NKG2A immune checkpoint axis, and altered structural–immune cell crosstalk persisting despite CFTR modulator therapy . Dataset 2 (D2): High-risk neuroblastoma. Y u et al. Y u et al. [2025] longitudinally proﬁled 22 patients with high-risk neuroblastoma before and after induction chemotherapy using single-nucleus RNA and A T A C sequencing combined with whole-genome sequencing. The dataset captures profound therapy-induced shifts in tumor and immune cell subpopulations, identifying enhancer-dri ven transcriptional re gulators of neoplastic states (adrenergic, mesench ymal, proliferativ e) and macrophage polarization tow ard pro-angiogenic, immunosuppressive phenotypes. A central ﬁnding was the validation of the HB-EGF/ERBB4 paracrine signaling axis between macrophages and neoplastic cells promoting tumor growth through ERK signaling induction. Dataset 3 (D3): Immune checkpoint blockade across cancers. Gondal et al. Gondal et al. [2025] compiled and standardized eight scRNA-seq studies from nine cancer types encompassing 223 patients and over 350,000 cancer cells treated with immune checkpoint blockade (ICB). Cancer types include melanoma, basal cell carcinoma, melanoma brain metastases, triple-negati ve/HER2-positi ve/ER-positiv e breast cancer, clear cell renal carcinoma, hepatocellular carcinoma, and intrahepatic cholangiocarcinoma. The integrated resource enables cross-cancer in vestigation of cancer cell-speciﬁc ICB responses, with annotations of treatment status, response outcome, and malignant vs. non-malignant cell identity . Dataset 4 (D4): Fetal lung A T2 organoids. Lim et al. Lim et al. [2025] de veloped expandable alv eolar type 2 (A T2) organoids deriv ed from human fetal lungs at 16–22 post-conception weeks (pcw). Single-cell RN A sequencing of four independent organoid lines (passage 11–16) yielded approx 9.6k cells across eight annotated cell types, including A T2-like, cycling A T2-like, CXCL + A T2-like, dif ferentiating basal-like, dif ferentiating pulmonary neuroendocrine, intermediate, neuroendocrine progenitor , and ciliated-like populations. The or ganoids express mature surfactant proteins (SFTPC, SFTPB, SFTP A1) and markers of surfactant processing (LAMP3, ABCA3, N APSA), and can differentiate into A T1-like cells. A forward genetic screen identiﬁed the E3 ligase ITCH as a k ey effector of SFTPC maturation, with its depletion phenocopying the pathological SFTPC-I73T v ariant associated with interstitial lung disease. Dataset 5 (D5): Healthy breast tissue. Bhat-Nakshatri et al. Bhat-Nakshatri et al. [2024] constructed a single-cell atlas of healthy breast tissues collected from volunteer donors from the Komen Normal T issue Bank. Using a rapid procurement and processing protocol, the study proﬁled breast epithelial and stromal cells, identifying 13 epithelial cell clusters with 23 subclusters exhibiting distinct gene expression signatures. Overlap analysis of subcluster -enriched signatures with breast tumor transcriptomes rev ealed dominant representation of differentiated luminal subcluster signatures in breast cancers, providing insights into putati ve cells of origin. Dataset 6 (D6): First-trimester human brain neurodevelopment. Mannens et al. Mannens et al. [2025] generated a high-resolution multiomic atlas of chromatin accessibility and gene expression across the entire dev eloping human brain during the ﬁrst trimester (6-13 weeks post-conception). Using scA T A C-seq and paired multiome (scA T A C-seq + scRN A-seq) sequencing, the study proﬁled 166k nuclei from 76 biological samples dissected into ﬁve antero-posterior segments (telencephalon, diencephalon, mesencephalon, metencephalon, and cerebellum), of which 166,785 nuclei included paired gene expression. The atlas deﬁnes 135 clusters spanning neurons (GABAer gic, glutamatergic, Purkinje, granule), radial glia, glioblasts, oligodendrocyte progenitor cells, ﬁbroblasts, vascular , and immune cell types. Key ﬁndings include o ver 100 cell-type- and region-speciﬁc candidate cis -regulatory elements, CNN-predicted enhancer syntax for neuronal speciﬁcation, elucidation of the ESRRB acti vation mechanism in the Purkinje cell lineage, and linkage of disease-associated GW AS SNPs to speciﬁc neuronal subtypes identifying midbrain-deriv ed GABAer gic neurons as particularly vulnerable to major depressive disorder -related mutations. All datasets were downloaded from CZ CELLxGENE Discov er ( https://cellxgene.cziscience.com ) in AnnData (.h5ad) format and preprocessed into ELISA ’ s standardized embedding format (.pt ﬁles) as described in the Data Representation section. Cell type annotations from the original publications were retained without modiﬁcation. F or datasets with condition metadata (D1, D2, D3, D4), condition columns were mapped to ELISA ’ s comparative analysis frame work. Dataset D5 was used to ev aluate ELISA ’ s performance on a single-condition atlas without disease contrast, testing the system’ s capacity for cell type identiﬁcation and pathway characterization in the absence of dif ferential signals. 17 arXiv T emplate A P R E P R I N T T able 5: Summary of scRN A-seq datasets used for ELISA v alidation. Appr ox. cells : approximate number of cells or nuclei proﬁled after quality control. Cell types : number of annotated major cell types. Conditions : experimental groups or treatment arms. ID Tissue Disease context Reference Appr ox. Cell Conditions cells types D1 Lung (bronchial) Cystic ﬁbrosis Berg et al. Berg et al. [2025] ∼ 96k 30 CF vs. Ctrl D2 Adrenal / tumor Neuroblastoma Y u et al. Y u et al. [2025] ∼ 372k 20+ Pre- vs. post-chemo D3 Multi-cancer ICB response Gondal et al. Gondal et al. [2025] ∼ 356k 25+ R vs. NR; 9 cancers D4 Lung (fetal) A T2 organoid model Lim et al. Lim et al. [2025] ∼ 9.6k 8 fdA T2 organoid lines D5 Breast Healthy tissue at- las Bhat-Nakshatri et al. Bhat-Nakshatri et al. [2024] ∼ 51k 13 Healthy only D6 Brain (whole) Neurodev elopment Mannens et al. Mannens et al. [2025] ∼ 166k 160 6–13 PCW; 5 re gions F Methods F .1 ELISA: architectur e and design principles ELISA (Embedding-Linked Interacti ve Single-cell Agent) is an agent-based computational framew ork for interactive interrogation of single-cell RN A-seq atlases. The system inte grates four core modules a hybrid retrie val engine, an analytical suite, a visualization toolkit, and a lar ge language model (LLM) chat interface to enable biologists to query scRN A-seq datasets using natural language, gene signatures, or a combination of both. The architecture follo ws a modular design in which each component operates on a shared data representation (a serialized PyT orch embedding ﬁle) and communicates through standardized data structures, enabling extensibility to ne w datasets without retraining. The system w as implemented in Python 3.10+ and e valuated on a 6 dataset took from cellxgene. All source code, benchmark queries, and ev aluation scripts are provided in the accompan ying repository . F .2 Hybrid retrie val engine F .2.1 Query classiﬁcation and routing A central design challenge in single-cell atlas retriev al is that user queries span a spectrum from pure natural language (“macrophage inﬁltration in CF airways”) to pure gene signatures (“MARCO F ABP4 APOC1 C1QB”) and mixed queries combining both. ELISA addresses this through an explicit query classiﬁcation module that routes each query to the optimal retriev al pipeline. The classiﬁer operates by tok enizing the input and scoring each token against three criteria: (i) whether it matches a gene name pattern (uppercase alphanumeric, 2–15 characters, with optional hyphenated suf ﬁx), (ii) whether it appears in the dataset’ s known gene v ocabulary , and (iii) whether it belongs to a curated set of natural language indicator terms (e.g., “cell”, “activ ation”, “signaling”). Queries where ≥ 60% of tokens are classiﬁed as gene symbols are routed to the gene pipeline; queries where ≥ 20% of tokens are genes and ≥ 20% are natural language terms are routed to the mixed pipeline; all other queries are routed to the semantic (ontology) pipeline. F .2.2 Gene marker scoring pipeline For gene-list queries, ELISA scores each cluster by ev aluating how well its dif ferential expression (DE) proﬁle matches the query genes. For each query gene g found in cluster c ’ s DE statistics, a per-gene score is computed as: score( g , c ) =  0 . 5 + | log 2 F C |  ×  0 . 3 + max(p ct in − p ct out , 0)  (6) where log 2 F C is the log-fold change of gene g in cluster c , and p ct in and p ct out represent the fraction of cells expressing the gene inside and outside the cluster , respectiv ely . The speciﬁcity term (p ct in − p ct out ) rew ards genes that are selectively enriched in the cluster rather than ubiquitously expressed. A multiplicativ e bonus of 1 . 3 × is applied when p ct in > 0 . 5 . The aggregate cluster score is the sum of per -gene scores, modulated by a coverage f actor (0 . 5 + 0 . 5 × n found /n query ) that re wards clusters matching more query genes. Three scoring modes are av ailable: ‘simple’ (binary hit counting), ‘weighted’ (described above), and ‘full’ (incorporating adjusted p -value signiﬁcance via − log 10 ( p adj ) , capped at 10). 18 arXiv T emplate A P R E P R I N T F .2.3 Semantic matching pipeline For ontology and text-based queries, ELISA employs BioBER TLee et al. [2020] (pritamdeka/BioBER T -mnli-snli- scinli-scitail-mednli-stsb) to encode both query te xt and precomputed cluster descriptions into a shared embedding space. Each cluster’ s description is constructed during dataset preparation by concatenating its Cell Ontology name, top marker genes (ranked by | log 2 F C | ), enriched Gene Ontology terms, and Reactome pathway annotations, producing a dual-representation embedding that captures both identity and functional context. At query time, the input text is encoded with BioBER T and cosine similarity is computed against all cluster embeddings. T wo augmentation strategies improv e retriev al accuracy . First, a name-boosting mechanism adds a score bonus ( α = 0 . 15 , scaled by word-overlap ratio) when signiﬁcant substrings ( ≥ 4 characters) of a cluster’ s name appear in the query . Second, a synonym expansion module maps common cell type aliases (e.g., “endothelial” → “endocardial cell”; “NK” → “natural killer cell”) to their Cell Ontology equiv alents and applies a score boost ( β = 0 . 10 ) to matching clusters, addressing vocab ulary gaps between colloquial and formal ontology terminology . F .2.4 Reciprocal rank fusion f or mixed queries Mixed queries containing both gene names and biological text are handled through reciprocal rank fusion (RRF). Both the gene and semantic pipelines are ex ecuted independently , and their ranked outputs are combined using: RRF( d ) = X r w r k + rank r ( d ) + 1 (7) where k = 60 is the RRF constant, w r are per-pipeline weights (def ault: 1.0 for both), and rank r ( d ) is the 0-index ed rank of cluster d in pipeline r . For gene-dominated queries routed through the gene pipeline, a light fusion with the semantic pipeline at a 3:1 weight ratio is applied as a safety mechanism to capture semantically related clusters that lack direct marker gene ov erlap. F .2.5 Additive union e valuation strategy For benchmarking, we introduce an additi ve union strategy that maximizes complementarity between modalities. For each query , the modality achieving higher recall@5 against expected clusters is designated as the primary pipeline. The union output begins with the primary pipeline’ s full ranked list, follo wed by unique clusters from the secondary pipeline appended in their original rank order . This produces an untruncated result list (up to 2 × top- k ), e valuated at recall@5, @10, @15, and @20. Ties at recall@5 are brok en by mean reciprocal rank (MRR). F .3 Analytical modules F .3.1 Cell–cell interaction prediction ELISA predicts ligand–receptor (LR) interactions between cell types using a curated database of 280+ LR pairs spanning 25 signaling pathway categories. The database was compiled from established resources (CellChatJin et al. [2025], CellPhoneDBEfremov a et al. [2020], NicheNetBrow aeys et al. [2020]) and augmented with context-speciﬁc pairs for cystic ﬁbrosis, neurodegeneration, neuroblastoma, and immune checkpoint biology . Each interaction is represented as a (ligand, receptor , pathway) tuple. For each source–tar get cluster pair, the interaction score is computed as: s ij = p ct in ( ligand , c i ) × p ct in ( receptor , c j ) (8) where p ct in denotes the fraction of cells expressing the gene above detection threshold. Interactions are ﬁltered by minimum expression thresholds (ligand ≥ 10%, receptor ≥ 5% by default) and ranked by score. The module outputs per-interaction statistics, pathw ay-level summaries, and directional pair summaries. F .3.2 Pathway activity scoring Pathway acti vity across clusters is quantiﬁed using curated gene sets encompassing 60+ pathways or ganized into ﬁve categories: immune signaling (IFN- γ , T ype I IFN, TNF/NF- κ B, J AK-ST A T , complement, TLR, chemokine), cell biology (mTOR, PI3K-Akt, Wnt, Notch, Hippo, Hedgehog, cell cycle, apoptosis), neuroscience (glutamatergic/GAB Aergic synapse, neurodegeneration, FCD progenitor markers), metabolism (oxidati ve phosphorylation, glycolysis, lipid metabolism, fatty acid metabolism), and tissue-speciﬁc programs (surf actant metabolism, epithelial defense, ﬁbrosis, angiogenesis). 19 arXiv T emplate A P R E P R I N T For each pathway–cluster combination, the score is computed as the mean p ct in (or alternati ve metric: log 2 F C , p ct out ) across pathway genes detected in the cluster’ s DE proﬁle, requiring a minimum of 3 genes for a non-zero score. Cov erage (fraction of pathway genes detected) is reported alongside scores. Pathway query matching uses word-o verlap fuzzy matching to accommodate variant pathw ay names. F .3.3 Comparative analysis When dataset metadata includes a condition column (e.g., “patient_group” with values “CF” and “Ctrl”), ELISA enables condition-stratiﬁed analysis. The module detects condition columns through ke yword matching against a curated list (patient_group, condition, disease, treatment, genotype, etc. ) and validates that the column contains 2–10 distinct values. For each cluster , the condition distribution is estimated from metadata ﬁeld weights, and a condition bias label is assigned ( > 60% of cells from one condition). Per-gene statistics ( log 2 F C , p ct in , p ct out ) are reported within condition-biased clusters, and condition-enriched gene lists are compiled across all clusters. F .3.4 Proportion analysis The cell type proportion analysis computes the per -cluster cell counts and fractions relati ve to the total dataset size. When a condition column is a vailable, the module additionally computes the condition-speciﬁc proportions and fold changes. For binary conditions (e.g., CF vs. Control), fold changes were calculated as the fraction of condition A cells in a cluster divided by the fraction of condition B cells, enabling the identiﬁcation of cell types enriched or depleted in disease states. F .3.5 Additional analytical functions Supplementary analytical functions include: (i) marker speciﬁcity scoring, which ranks genes by a weighted score combining speciﬁcity  p ct in / (p ct in + pct out )  and ef fect size ( | log 2 F C | ); (ii) co-expression analysis, computing Pearson correlations of p ct in proﬁles across clusters; (iii) cell cycle scoring using established S-phase and G2M-phase gene signatures (43 and 54 genes, respectiv ely); and (iv) gene set enrichment against 10 MSigDB Hallmark gene sets. F .4 V isualization module ELISA includes a comprehensive visualization module that generates publication-quality ﬁgures in two categories. Retriev al-lev el visualizations include: embedding landscape projections (UMAPMcInnes et al. [2018], t-SNEMaaten and Hinton [2008], or PCA fallback) of cluster-le vel semantic and expression embeddings, with optional highlighting of retrie ved clusters; inter -cluster cosine similarity heatmaps; retriev al score waterfall plots; gene e vidence bar charts ( log 2 F C or p ct in ); gene-by-cluster heatmaps; radar charts for multi-metric cluster proﬁles; semantic vs. expression similarity scatter plots for hybrid retrie val diagnostics; and lambda sweep curves for fusion weight optimization. When an AnnData (.h5ad) ﬁle is provided, cell-le vel visualizations are generated in a style consistent with Nature and Cell journals: cell-lev el UMAP plots with Cell Ontology labels placed using a centroid-offset algorithm with iterati ve repulsion to minimize label overlap; single-gene expression UMAPs with non-expressing cells shown in grey and expression on a purple gradient (capped at the 98th percentile); multi-gene expression grids; and dot plots sho wing percentage expression (dot size) and z -scored mean expression (dot color) across clusters. All plots used a 40-color colorblind-friendly palette and rasterized cell-lev el rendering for efﬁcient ﬁle sizes. F .5 LLM-mediated chat interface The interactive chat interface wraps all modules behind a command-driv en interface that routes user queries to the appropriate pipeline and generates LLM-interpreted summaries. The interface supports six retriev al and analysis modes (semantic, hybrid, discovery , compare, interactions, pathway , proportions) and 15 visualization commands. Each analysis result is automatically accumulated into a session-lev el report builder . LLM interpretation is performed via the Groq API using the LLaMA-3.1-8B-Instant modelGrattaﬁori et al. [2024] at temperature 0.2. Prompts are constructed with mode-speciﬁc templates that enforce strict grounding in dataset evidence: the LLM recei ves only the retrie ved cluster data, gene statistics, and pathway/interaction results as conte xt, with explicit instructions to av oid hallucination, external literature, and causal claims. Context payloads are trimmed to ﬁt within the model’ s token limits ( ∼ 4,500 tokens for user content), with priority given to top-ranked clusters and highest-ef fect-size genes. A discovery mode extends standard retrie val by prompting the LLM to produce four structured sections: (i) dataset evidence, (ii) established biology , (iii) consistency analysis identifying matches and mismatches with kno wn biology , 20 arXiv T emplate A P R E P R I N T and (i v) candidate novel hypotheses stated with probabilistic language. This mode is designed to surface unexpected ﬁndings that may represent context-shifted gene functions or no vel cell–cell interactions. F .6 Benchmarking framework F .6.1 Query design The benchmark comprises 100 queries divided into two cate gories: 50 ontology queries (concept-lev el, testing semantic understanding) and 50 expression queries (gene-signature-based, testing transcriptomic matching). Queries were deri ved from the ﬁndings of Ber g et al. Berg et al. [2025], co vering all major cell types identiﬁed in the study (macrophages, monocytes, CD8 + T cells, CD4 + T cells, B cells, basal cells, ciliated cells, NK cells, ionocytes, endothelial cells, dendritic cells, mast cells, secretory/goblet cells, ﬁbroblasts, and neuroendocrine cells). Each query has a curated set of expected clusters and e xpected genes, enabling ev aluation at both the cluster retriev al and gene deliv ery levels. F .6.2 Baseline comparisons ELISA ’ s retriev al performance was ev aluated against: the progression CellWhisperer, Semantic ELISA, scGPT ELISA, Additiv e Union. F .6.3 Metrics Retriev al performance was assessed using three metrics. Cluster Recall@ k measures the fraction of expected clusters appearing in the top- k retrie ved results, using fuzzy matching (substring containment or w ord-overlap Jaccard similarity ≥ 0 . 5 ) to accommodate Cell Ontology naming variations. Mean Recipr ocal Rank (MRR) captures the rank position of the ﬁrst relev ant cluster . Gene Recall measures the fraction of expected genes recov erable from the DE proﬁles of the top-5 retrie ved clusters, assessing whether retriev ed clusters collectiv ely provide the gene e vidence needed for biological interpretation. F .6.4 Analytical module evaluation Analytical modules were ev aluated against ground truth derived from the source publication. Interaction prediction was assessed by lig and–receptor pair reco very rate (whether the correct LR pair was detected regardless of cell type) and full match rate (correct LR pair between the correct source and target cell types, using fuzzy cell type matching). Pathway scoring was e valuated by alignment: the fraction of path activities reported on paper that ELISA correctly identiﬁed as activ e (score > 0 ) in at least one group. The proportion analysis was e valuated by the consistency rate whether cell types reported as increased or decreased in CF show fold changes in the e xpected direction. Comparativ e analysis was ev aluated by gene recall, the fraction of differentially expressed genes reported on paper that can be recovered from the condition-stratiﬁed analysis of ELISA ’ s. F .7 Data repr esentation and preprocessing Each dataset is preprocessed into a single serialized PyT orch ﬁle ( .pt ) containing: cluster identiﬁers, precomputed BioBER T semantic embeddings (768-dimensional, L2-normalized), optional scGPT e xpression embeddings, per-cluster DE gene statistics ( log 2 F C , p ct in , p ct out , adjusted p -value), per-cluster GO and Reactome enrichment terms, per- cluster metadata (cell counts, condition distrib utions, categorical ﬁeld frequencies), cluster te xt descriptions, and the complete gene vocab ulary . This representation enables ELISA to operate entirely at the cluster level without requiring access to the original count matrix, substantially reducing memory requirements and enabling deployment on standard hardware. F .8 Software dependencies and r eproducibility ELISA depends on: PyT orch ( ≥ 1.12) for tensor operations and data serialization, NumPy for numerical computation, sentence-transformersReimers and Gurevych [2019] for BioBER T encoding, scikit-learn for t-SNE projections, UMAP- learnMcInnes et al. [2018] for UMAP projections, matplotlib for visualization, scanpyW olf et al. [2018] for AnnData- backed cell-le vel plots, SciPy for hierarchical clustering and sparse matrix operations, and the Groq Python SDK for LLM access. All analyses were performed on a standard workstation without GPU requirements for the retriev al and analytical modules; BioBER T Lee et al. [2020] encoding beneﬁts from but does not require GPU acceleration. 21 arXiv T emplate A P R E P R I N T F .9 ELISA parameters and hyperparameters T ables 6 – 9 report all parameters and hyperparameters used in the ELISA framew ork. Default v alues were used throughout all experiments; no dataset-speciﬁc tuning was performed. T able 6: Data preprocessing and embedding generation parameters. Parameter V alue Description Pr eprocessing (Scanpy) target_sum 10,000 Library-size normalization target n_top_genes 3,000 HVGs selected (Seurat v3) max_value 10 Z-score clipping threshold n_comps 50 PCA components Leiden resolution 1.0 Used only if no annotations exist Differ ential expr ession Method W ilcoxon V ia scanpy.tl.rank_genes_groups DE_PVAL 0.10 Adjusted p -value cutof f TOP_K_MARKERS_STATS 10,000 Max genes stored per cluster TOP_K_MARKERS_TEXT 400 Genes in cluster text summaries Enrichment (gseapy) Gene sets GO_Biological_Process_2023, Reactome_2022 TOP_K_GO 15 GO terms retained per cluster TOP_K_REACTOME 15 Reactome terms per cluster Enrichment cutoff 0.05 Adjusted p -value threshold Input genes 200 T op DE genes per enrichment call Semantic embedding (BioBERT) Model pritamdeka/BioBERT-mnli-snli-scinli-scitail-mednli-stsb Embedding dim 768 Output dimensionality α ( IDENTITY_ALPHA ) 0.6 Identity vs. conte xt weight Normalization L2 Final combined embeddings Batch size 16 Sentences per encoding batch scGPT expr ession embedding Model scGPT whole-human Pre-trained foundation model Embedding dim 512 CLS token dimensionality N_BINS 51 Expression binning resolution MAX_TOKENS 3,000 Max gene tokens per cell Batch size 64 Cells per inference batch Aggregation Mean pooling Cell → cluster centroids Normalization L2 Cluster-le vel centroids 22 arXiv T emplate A P R E P R I N T T able 7: Hybrid retriev al engine parameters. Parameter V alue Description Query classiﬁcation Gene threshold ≥ 60% T oken fraction to route as gene query Mixed threshold ≥ 20% each Gene + NL tokens for mixed routing Gene pattern A–Z, 2–15 chars Regex for gene symbol detection Gene marker scoring Score function (0 . 5 + | log 2 FC | ) × (0 . 3 + max( pct in − pct out , 0)) High-expr bonus × 1.3 When pct in > 0 . 5 Cov erage factor 0 . 5 + 0 . 5 × n found /n query Semantic matching Similarity Cosine Query vs. cluster embeddings Name boost ( α ) 0.15 Bonus for ontology name overlap Min substring 4 chars For name boost acti vation Synonym boost ( β ) 0.10 Bonus for synonym match Recipr ocal rank fusion RRF constant ( k ) 60 Smoothing constant W eights 1.0 : 1.0 Gene : semantic Additive union (benchmarking) Primary selection Recall@5 Higher-recall modality is primary T iebreaker MRR When Recall@5 is tied Default settings top_k 5 Clusters returned per query pre_k 40 Candidates before reranking γ 2.5 Reranking sharpness λ sem (scGPT) 0.0 Pure gene scoring mode λ sem (discov ery) 0.5 Balanced mode 23 arXiv T emplate A P R E P R I N T T able 8: Analytical module parameters. Parameter V alue Description Ligand–r eceptor interactions Database size 280+ pairs From CellChat, CellPhoneDB, NicheNet Pathway cate gories 25 Signaling annotations min_ligand_pct 0.10 Min ligand expr . in source min_receptor_pct 0.05 Min receptor expr . in target Score pct in ( L ) × pct in ( R ) Expression fraction product Self-interactions Excluded Source  = target P athway activity scoring Number of pathways 60+ Across 5 categories Metric Mean pct in A vg. expression of pathway genes min_genes 3 Min for non-zero score Categories Immune, Cell biology , Neuroscience, Metabolism, Tissue-speciﬁc Comparative analysis Condition bias > 60% Fraction to assign bias label min_pct 0.05 Min expr . for gene inclusion top_n 20 Genes per cluster Enriched genes 30 Per-condition summary limit Pr oportion analysis Fold change frac A / frac B Condition ratio Min denominator 0.001 Belo w: reported as ∞ Cell cycle scoring S-phase genes 43 Seurat S-phase markers G2M-phase genes 54 Seurat G2M markers Cycling threshold S > 0 . 3 and G2M > 0 . 3 Both abo ve threshold Gene set enrichment Default gene sets 10 MSigDB Hallmark Curated pathways min_genes 3 Min for non-zero score T able 9: LLM interpretation parameters. Parameter V alue Description LLM conﬁguration Default provider Groq Free tier, 500K tokens/day Default model LLaMA-3.1-8B V ia Groq Cloud API Supported 4 providers Groq, Gemini, OpenAI, Claude T emperature 0.2 Low for reproducibility Prompt limit 18,000 chars ≈ 4,500 tokens Context limit 12,000 chars ≈ 3,000 tokens Safety and rate limiting Spending cap C1.00 Hard cap, conﬁgurable Max retries 5 On rate-limit errors Initial wait 10 s Backof f start Backof f Exponential Max 120 s Context trimming Clusters T op 10 In compare mode Gene evidence T op 5 Per cluster Pathway scores T op 10 Entries to LLM Interactions T op 20 Entries to LLM Discov ery sections 4 Evidence, Biology , Consistency , Hypotheses 24 arXiv T emplate A P R E P R I N T G D1: Cystic Fibrosis Airways (Ber g et al. [2025] et al. ) G.1 Ontology Queries 1. Macrophage and monoc yte inﬁltration in cystic ﬁbrosis airways 2. Recruited monoc ytes and pro-inﬂammatory macrophages in CF lung tissue 3. Macrophage sca venging receptor expression and phagoc ytosis in CF 4. Non-classical monoc yte patrol function in CF bronchial wall 5. CD8 T cell acti vation and c ytotoxicity in CF lung inﬂammation 6. CD8 T cell inﬂammatory c ytokine production and IFNG signaling in CF 7. HLA-E CD94 NKG2A immune checkpoint inhibiting CD8 T cell acti vity 8. Dysfunctional CD8 T cell response to chronic Pseudomonas infection in CF 9. CALR LRP1 interaction between T cells and macrophages promoting inﬂammation 10. CD4 helper T cell immune acti vation in c ystic ﬁbrosis 11. CD4 T cell VEGF receptor signaling and hypoxia response in CF 12. Aberrant Th2 and Th17 T cell responses in Pseudomonas -infected CF lungs 13. Chronic adapti ve immune acti vation of T lymphocytes in CF despite modulator therap y 14. B cell acti vation and immunoglob ulin response in CF airways 15. B cell receptor do wnregulation and reduced plasma cell markers in CF 16. Interferon gamma signaling and HLA-DP e xpression in B cells of CF patients 17. PDGFRB signaling pathw ay activ ated in B cells from CF lungs 18. Basal cell dysfunction and reduced stemness in c ystic ﬁbrosis epithelium 19. Impaired basal cell dif ferentiation and pathogenic basal cell variants in CF 20. Basal cell DN A damage repair and chromatin remodeling in CF airways 21. Reduced k eratinization gene expression CST A HSPB1 in CF basal cells 22. Basal cell altered cell–cell communication and increased interactions in CF 23. Ciliated cell ciliogenesis and increased ab undance in CF bronchial epithelium 24. Ciliated cell HLA class II e xpression and immune-linked transcriptional changes in CF 25. Sk ewed basal cell dif ferentiation tow ards ciliated cells in CF epithelium 26. Natural killer cell c ytotoxicity and NKG2A immune checkpoint in CF 27. NKG2A blockade to restore NK and CD8 T cell function in CF lung 28. Innate lymphoid cell dysfunction and impaired antimicrobial defense in CF 29. Pulmonary ionoc yte CFTR expression in cystic ﬁbrosis 30. Ionoc yte unique cell–cell interactions with adaptive lymphoc ytes in CF 31. Endothelial cell remodeling and VEGF signaling in CF lung 32. Reduced endothelial cell proportions and altered dif ferentiation in CF airways 33. Hypoxia-induced VEGF upre gulation and vascular remodeling in CF lungs 34. Dendritic cell antigen presentation in CF airw ays 35. IFNG IFNGR2 interaction between CD8 T cells and dendritic cells in CF 36. Mast cell de granulation and allergic inﬂammation in CF 37. Secretory cell mucus o verproduction and inﬂammatory signaling in CF epithelium 38. Goblet cell hyperplasia and mucin gene e xpression in cystic ﬁbrosis 39. Submucosal gland epithelial cell changes in c ystic ﬁbrosis 40. Reduced submucosal gland cell proportions and gland de velopment dysfunction in CF 41. T ype I interferon response and inﬂammatory signaling in CF epithelial cells 25 arXiv T emplate A P R E P R I N T 42. Interferon responsi ve gene upregulation across epithelial subsets in CF 43. VEGF receptor signaling and hypoxia response across cell types in CF 44. TXNIP-mediated NLRP3 inﬂammasome activ ation in CF lymphocytes and epithelial cells 45. GN AI2 immunomodulatory signaling in CD8 T cells and B cells in CF 46. GN AI2 adenylate c yclase regulation and CFTR function in lymphocytes 47. Stromal cell and ﬁbroblast remodeling in CF airw ay tissue 48. Peric yte and stromal cell contribution to airway ﬁbrosis in CF 49. IFNG–IFNGR1 interaction between CD8 T cells and basal cells, macrophages, and endothelial cells in CF 50. Altered structural–immune cell crosstalk in CF in volving lymphoc ytes, ionocytes, and macrophages G.2 Expression Queries 1. MARCO FABP4 APOC1 C1QB C1QC MSR1 2. CD68 CD14 CSF1R CSF2RA LGALS2 3. GOS2 FABP4 PPARG APOC1 C1QB 4. FCGR3A CX3CR1 CD14 CDKN1C LILRB2 5. CD8A CD8B GZMB PRF1 IFNG NKG7 6. IFNG GNAI2 CD69 CD81 CD3G FOS JUND 7. GZMB PRF1 NKG7 GNLY KLRD1 CD8A 8. TXNIP MAP2K2 IFNG CD81 CD3G CD69 9. KLF2 IL7R CD48 TXNIP ETS1 10. CD3D CD4 IL7R CD3E CD3G 11. TRAJ52 TRBV22-1 TRDJ2 CD3E CD3G 12. CD3G CD3E CD69 IL7R CD81 FOS 13. IGLJ3 IGKJ1 IGHJ5 JCHAIN MZB1 XBP1 14. CD79A IGHG3 IGLC2 SYK CD81 JCHAIN 15. SYK CSK CD9 CD81 JUND LTB HLA-DPA1 16. IGHG3 IGLC2 IGHD IGHA1 IGLC1 IGLC3 17. KRT5 KRT14 KRT15 TP63 IL33 CSTA 18. CSTA HSPB1 KRT5 KRT14 TP63 19. KRT5 IL33 TP63 KRT15 LAMB3 COL17A1 20. FOXJ1 DNAH5 CAPS PIFO RSPH1 DNAI1 21. DNAH5 SYNE1 SYNE2 CAPS PIFO 22. GNLY KLRD1 KLRK1 NKG7 PRF1 GZMB 23. GNLY NKG7 KLRD1 KLRK1 KLRC1 24. ATP6V1G3 FOXI1 BSND CLCNKB ASCL3 25. FOXI1 CFTR ATP6V1G3 BSND RARRES2 26. PLVAP ACKR1 ERG VWF PECAM1 CDH5 27. VIM PLVAP ACKR1 MGP PTGDS CXCL14 28. CPA3 TPSAB1 TPSB2 MS4A2 HDC GATA2 29. TPSAB1 TPSB2 KIT CPA3 MS4A2 30. HLA-DPA1 HLA-DRB1 CD74 GPR183 LGALS2 31. HLA-DPA1 HLA-DPB1 HLA-DRB1 CD80 CD86 CD74 32. SCGB1A1 SCGB3A1 MUC5AC MUC5B LYPD2 PRR4 33. SCGB1A1 MUC5AC SCGB3A1 LYPD2 26 arXiv T emplate A P R E P R I N T 34. MUC5AC MUC5B LYZ SCGB1A1 SCGB3A1 35. COL1A2 LUM DCN SFRP2 COL3A1 PDGFRA 36. PDGFRA COL1A2 COL3A1 VCAN DCN LUM 37. PDGFRB VIM COL1A2 MGP CXCL14 38. SST CHGA ASCL1 GRP CALCA SYP 39. GRP ASCL1 SYT1 CHGA SYP CALCA 40. HLA-E KLRC1 KLRD1 KLRC2 KLRC3 KLRK1 41. HLA-E KLRC1 KLRD1 CD8A CD8B 42. CALR LRP1 GNAI2 FOS JUND MAP2K2 43. GNAI2 CXCR3 F2R S1PR4 CD69 44. IFIT1 MX1 OAS2 ISG15 IFITM3 IFIT3 45. IFIT1 MX1 OAS2 IFIT3 IFI6 46. KDM1A KMT5A RAD50 ERCC6 ERCC8 47. TXNIP MAP2K2 ETS1 VEGFA KLF2 48. IFNG IFNGR1 IFNGR2 CALR LRP1 49. CCL5 CCR5 CXCL10 CXCR3 F2R 50. CFTR FOXI1 SCGB1A1 KRT5 FOXJ1 MUC5AC H D5: Healthy Br east Tissue Atlas (Bhat-Nakshatri et al. [2024] et al. ) H.1 Ontology Queries 1. Luminal hormone sensing cells with estrogen receptor e xpression in the healthy breast 2. FO XA1 pioneer transcription factor activity in luminal hormone responsi ve breast epithelial cells 3. ER α –FO XA1–GA T A3 transcription factor network in hormone responsiv e breast cells 4. Mature luminal cells with hormone receptor positi ve identity in breast tissue 5. Hormone sensing alpha v ersus beta cell states in breast epithelium 6. LHS cell-enriched f ate factor D A CH1 and PI3K pathway regulator INPP4B in breast 7. Lob ular epithelial cells expressing APOD and immunoglobulin genes in breast 8. Luminal adapti ve secretory precursor cells and progenitor identity in breast 9. ELF5 and EHF transcription f actor expression in luminal progenitor breast cells 10. Alveolar progenitor cell state enriched in Indigenous American breast tissue 11. BRCA1 associated breast cancer originating from luminal progenitor cells 12. KIT receptor e xpression and chromatin accessibility in luminal progenitor cells 13. MFGE8 and SHANK2 e xpression in luminal progenitor cells of the breast 14. LASP basal–luminal intermediate progenitor cell identity in the breast 15. Basal-myoepithelial cells with TP63 and KR T14 expression in breast 16. Basal cell chromatin accessibility and TP63 binding site enrichment 17. Basal alpha and basal beta cell states in breast myoepithelium 18. SO X10 motif enrichment in basal-myoepithelial cells of the breast 19. KR T14 KR T17 expression in ductal epithelial and basal cells of breast tissue 20. Fibroblast heterogeneity and cell states in healthy breast stroma 21. Genetic ancestry-dependent v ariability in breast ﬁbroblast cell states 22. Fibro-prematrix state enrichment in African ancestry breast tissue ﬁbroblasts 23. PR OCR ZEB1 PDGFR α multipotent stromal cells enriched in African ancestry breast 27 arXiv T emplate A P R E P R I N T 24. Myoﬁbroblast and inﬂammatory ﬁbroblast subtypes in breast cancer stroma 25. SFRP4 and Wnt pathw ay modulation in breast ﬁbroblasts 26. Endothelial cell subtypes and v ascular markers in breast tissue 27. L ymphatic endothelial cells expressing L YVE1 in breast stroma 28. A CKR1 stalk-like endothelial cell subtype in breast v asculature 29. V ascular endothelial cell heterogeneity in mammary gland microv asculature 30. Breast tissue angiogenesis and endothelial cell MECOM e xpression 31. T lymphocyte markers and immune cell identity in breast tissue 32. CD4 T cell IL7R e xpression and chromatin accessibility in breast 33. CD8 T cell GZMK c ytotoxic activity and IFNG signaling in breast tissue 34. T issue-resident memory T lymphocyte populations in healthy breast 35. Adapti ve immune surveillance by T cells in mammary gland stroma 36. Macrophage identity and FCGR3A e xpression in breast tissue stroma 37. Macrophage subtypes and tissue-resident immune cells in healthy breast 38. Breast tissue-resident macrophage phagoc ytic function and complement expression 39. Myeloid lineage immune cells and monoc yte-derived macrophages in mammary gland 40. Adipocyte subtypes and lipid metabolism in breast tissue 41. Adipocyte PLIN1 and F ABP4 expression in healthy breast stroma 42. PLIN1 lipid droplet biology and adipoc yte identity in mammary fat pad 43. Mammary gland adipose tissue and f atty acid binding protein expression 44. Epithelial cell hierarchy from basal to luminal hormone sensing in breast 45. CXCL12 chemokine e xpression in endothelial cells and ﬁbroblasts of breast 46. VEGF A angiogenic signaling from luminal cells to endothelium in breast 47. IGF1 paracrine signaling from ﬁbroblasts to luminal cells in breast stroma 48. Breast tissue microen vironment with stromal and immune cell interactions 49. Ancestry dif ferences in breast tissue cellular composition and cancer risk 50. Gene e xpression differences between ductal and lob ular epithelial cells of the breast H.1.1 Expression Queries 1. FOXA1 ESR1 GATA3 ERBB4 ANKRD30A AFF3 TTC6 2. MYBPC1 THSD4 CTNND2 DACH1 INPP4B NEK10 3. ESR1 FOXA1 GATA3 ELOVL5 ANKRD30A 4. AFF3 TTC6 ERBB4 MYBPC1 THSD4 5. DACH1 NEK10 CTNND2 INPP4B ELOVL5 6. APOD IGHA1 IGKC ESR1 FOXA1 GATA3 7. DUSP1 DPM3 RPL36 IGHA1 IGKC APOD 8. ELF5 EHF KIT CCL28 KRT15 BARX2 NCALD 9. MFGE8 SHANK2 SORBS2 AGAP1 ELF5 10. KRT15 CCL28 KIT INPP4B ELF5 11. RBMS3 EHF BARX2 NCALD ELF5 12. ESR1 ELF5 EHF KIT CCL28 13. ELF5 KIT CCL28 EHF KRT15 BARX2 14. NCALD BARX2 SHANK2 SORBS2 MFGE8 ELF5 15. TP63 KRT14 KLHL29 FHOD3 SEMA5A 28 arXiv T emplate A P R E P R I N T 16. KLHL13 KLHL29 TP63 KRT14 PTPRT 17. TP63 KRT14 KRT17 FHOD3 ABLIM3 18. ST6GALNAC3 PTPRM SEMA5A KLHL29 19. KRT14 KRT17 TP63 KLHL29 KLHL13 FHOD3 20. LAMA2 SLIT2 RUNX1T1 COL1A1 COL3A1 21. COL3A1 POSTN COL1A1 IGF1 ADAM12 22. CFD MGST1 MFAP5 COL3A1 POSTN 23. PROCR ZEB1 PDGFRA COL1A1 LAMA2 24. SFRP4 COL1A1 POSTN LAMA2 SLIT2 25. COL1A1 PDPN CD34 CXCL12 LAMA2 26. MECOM LDB2 MMRN1 CXCL12 ACKR1 27. LYVE1 MECOM LDB2 MMRN1 28. ACKR1 CXCL12 MECOM LDB2 29. MECOM LDB2 MMRN1 LYVE1 ACKR1 30. CXCL12 MECOM LDB2 ACKR1 MMRN1 31. PTPRC SKAP1 ARHGAP15 THEMIS IL7R 32. IL7R GZMK PTPRC SKAP1 33. IFNG GZMK IL7R THEMIS PTPRC 34. THEMIS ARHGAP15 SKAP1 PTPRC IL7R 35. PTPRC SKAP1 GZMK IFNG THEMIS ARHGAP15 36. FCGR3A ALCAM LYVE1 CD163 37. ALCAM FCGR3A LYVE1 CD14 38. FCGR3A ALCAM CD163 MERTK 39. ALCAM LYVE1 FCGR3A CD163 MARCO 40. PLIN1 FABP4 KIT ADIPOQ LEP 41. FABP4 PLIN1 ADIPOQ LEP LPL 42. PLIN1 FABP4 LPL PPARG ADIPOQ 43. FABP4 PLIN1 KIT ADIPOQ 44. FOXA1 ELF5 TP63 KRT14 GATA3 ESR1 45. GATA3 EHF ELF5 FOXA1 KRT15 KRT14 TP63 46. MECOM PTPRC FCGR3A PLIN1 LAMA2 TP63 FOXA1 47. CXCL12 LAMA2 MECOM LDB2 COL1A1 48. ESR1 FOXA1 ELF5 EHF KIT TP63 KRT14 49. PTPRC FCGR3A FABP4 PLIN1 MECOM 50. VEGFA LDB2 IGF1 LAMA2 FOXA1 ELF5 H.2 D3: Fetal Lung A T2 Organoids (Lim et al. [2025] et al. ) H.2.1 Ontology Queries 1. Alveolar type 2 cell identity and surf actant protein production in fetal lung organoids 2. Mature A T2 cell markers and lamellar body formation in fdA T2 or ganoids 3. Surf actant protein C maturation and intracellular trafﬁcking in alveolar epithelium 4. SFTPC processing through endosomal compartments and multi vesicular bodies 5. Surf actant secretion and lamellar body exocytosis in human A T2 cells 6. ITCH E3 ubiquitin ligase role in SFTPC traf ﬁcking and ubiquitination 29 arXiv T emplate A P R E P R I N T 7. K63 ubiquitination of surf actant protein C for ESCR T recognition and MVB entry 8. HECT domain E3 ligase ITCH depletion phenocop ying SFTPC-I73T pathogenic variant 9. Ubiquitome forw ard genetic screen for SFTPC trafﬁcking ef fectors 10. SFTPC relocalisation to plasma membrane and rec ycling endosomes upon ITCH loss 11. A T2 stem cell self-renew al and proliferation in fetal lung organoids 12. FGF7-dri ven A T2 cell proliferation and surfactant processing balance 13. Expandable fetal-deri ved A T2 organoids maintaining identity o ver passaging 14. Alveolar type 1 cell dif ferentiation from A T2 organoids via Y AP acti vation 15. A T2 to A T1 lineage transition through Wnt withdrawal and LA TS inhibition 16. A T1 cell fate markers A QP5 CA V1 A GER in differentiated fdA T2 or ganoids 17. CXCL chemokine e xpressing A T2 subpopulation in fetal lung organoids 18. Immune response gene e xpression in alveolar type 2 cells 19. Chemokine-mediated innate immune signaling in A T2 or ganoid subsets 20. Aberrant basal cell dif ferentiation from A T2 cells in org anoid culture 21. Hypoxia-induced airw ay differentiation of alveolar type 2 cells 22. Pulmonary neuroendocrine cell dif ferentiation in A T2 organoids 23. Neuroendocrine progenitor cells co-e xpressing SFTPC and NE markers 24. Ciliated cell-lik e differentiation in fetal A T2 organoid culture 25. Intermediate transitional cell state between A T2 and dif ferentiated lineages 26. Surf actant metabolism and lipid transport in fetal alveolar epithelium 27. V esicle-mediated transport and lysosome localization in A T2 surfactant processing 28. Lipid storage membrane transport and v esicle cytoskeleton traf ﬁcking in A T2 cells 29. Wnt signaling pathway maintaining A T2 identity and inhibiting A T1 differentiation 30. SFTPC-I73T pathogenic v ariant causing interstitial lung disease and A T2 dysfunction 31. T oxic gain-of-function ef fect of misfolded surfactant protein C v ariants 32. T ranscriptional maturity of fdA T2 organoids compared to adult A T2 and PSC-iA T2 33. Missing immune response MHC class II genes in fetal v ersus adult A T2 cells 34. CRISPRi-mediated depletion of ITCH and UBE2N in fdA T2 or ganoids 35. Re versible SFTPC mislocalization after CRISPRi recov ery in A T2 organoids 36. ESCR T complex components HRS VPS28 required for SFTPC MVB entry 37. Endosomal rec ycling of SFTPC to plasma membrane upon ubiquitination failure 38. SUMOylation pathw ay components UBE2I UBA2 PIAS1 and SFTPC e xpression regulation 39. Fetal lung tip progenitor dif ferentiation into mature A T2 cells 40. EpCAM positi ve tip epithelial cell isolation and A T2 organoid deri vation 41. SFTPC C-terminal clea vage and proprotein processing in endosomal compartments 42. proSFTPC plasma membrane transit before endoc ytosis and maturation 43. Interstitial lung disease caused by SFTPC v ariants and A T2 cell dysfunction 44. Heritable pulmonary ﬁbrosis from SFTPC mistraf ﬁcking and toxic accumulation 45. A T2 medium components dexamethasone cAMP IBMX D APT for alveolar dif ferentiation 46. fdA T2 or ganoid engraftment in mouse precision-cut lung slices and A T1 differentiation 47. NEDD4-2 HECT domain ligase role in SFTPC ubiquitination and maturation 48. Cell type heterogeneity and proportions across fdA T2 or ganoid lines 49. fdA T2 or ganoid stability over long-term passaging and cryopreserv ation 50. Genetic manipulation of fetal A T2 or ganoids using lentiviral CRISPRi system 30 arXiv T emplate A P R E P R I N T H.3 Expression Queries 1. SFTPC SFTPB SFTPA1 SFTPA2 NAPSA LAMP3 2. SFTPC SFTPB ABCA3 LAMP3 HOPX NKX2-1 3. NKX2-1 SLC34A2 LPCAT1 HOPX CEACAM6 4. SFTPC SFTPD SFTA3 CD36 CAV1 SLC34A2 5. SFTPA1 SFTPA2 SFTPB SFTPC SFTPD 6. ITCH UBE2N HRS VPS28 RABGEF1 EEA1 7. ITCH NEDD4 NEDD4L UBE2N UBE2I 8. EEA1 MICALL1 LAMP3 HRS VPS28 9. UBE2I UBA2 PIAS1 ITCH RABGEF1 10. ABCA3 LAMP3 NAPSA CKAP4 ZDHHC2 CTSH 11. ABCA3 SFTPB SFTPC LAMP3 P2RY2 LMCD1 12. MKI67 PCNA TOP2A SFTPC NKX2-1 13. MKI67 PCNA CDK1 CCNB1 SFTPC 14. CXCL1 CXCL2 CXCL3 CCL2 SFTPC 15. CXCL1 CXCL3 CCL2 CCL4 CCL4L1 16. CXCL1 CXCL2 HLA-DPA1 HLA-DPB1 CCL2 17. HLA-DQB1 HLA-DMA HLA-DMB HLA-DRA HLA-DOA 18. HLA-DPA1 HLA-DPB1 HLA-DRA CD86 TNF 19. AQP5 CAV1 AGER HOPX 20. CAV1 AGER AQP5 PDPN 21. TP63 KRT5 KRT14 KRT15 SOX2 22. KRT5 KRT14 TP63 LAMB3 COL17A1 23. ASCL1 NEUROD1 GRP CHGA SYP CALCA 24. GRP ASCL1 SYT1 CHGA SYP 25. ASCL1 GRP SFTPC NKX2-1 26. FOXJ1 DNAH5 CAPS PIFO RSPH1 27. FOXJ1 DNAH5 DNAI1 RSPH1 CAPS 28. SOX2 SOX9 NKX2-1 SFTPC TP63 29. SOX2 NKX2-1 HOPX CAV1 30. CTNNB1 TCF7L2 AXIN2 WNT3A LGR5 31. SFTPC NKX2-1 HOPX SFTPB ABCA3 MKI67 32. NAPSA ABCA3 SFTA3 SFTPD LAMP3 HOPX 33. SFTPC ITCH EEA1 LAMP3 MICALL1 ABCA3 34. SFTPC NAPSA CTSH LAMP3 ITCH UBE2N 35. SFTPC CXCL1 CXCL2 NKX2-1 LAMP3 36. CDH1 TJP1 EPCAM SFTPC NKX2-1 37. ITCH HRS VPS28 UBE2N RABGEF1 PIAS1 UBE2I UBA2 38. ITCH NEDD4 NEDD4L HRS UBAP1 USP8 39. MKI67 TOP2A PCNA CDK1 CCNB1 CCNA2 40. SFTPC TP63 ASCL1 FOXJ1 NKX2-1 41. SFTPC SFTPB ASCL1 GRP TP63 KRT5 42. SFTPC CAV1 AGER AQP5 HOPX NKX2-1 43. LAMP3 ABCA3 SFTPB SFTPC NAPSA CD36 31 arXiv T emplate A P R E P R I N T 44. CKAP4 ZDHHC2 SLC34A2 CTSH SFTPC 45. CXCL1 CXCL2 CXCL3 CCL2 CCL4 TNF 46. SOX9 NKX2-1 SFTPC SFTPB LAMP3 47. SFTPC NKX2-1 ASCL1 NEUROD1 GRP MKI67 48. SFTA3 SFTPD NAPSA NKX2-1 CKAP4 ZDHHC2 SLC34A2 CTSH SFTPA1 SFTPA2 SFTPC SFTPB 49. ITCH SFTPC LAMP3 ABCA3 UBE2N NAPSA 50. SFTPC CXCL1 MKI67 TP63 ASCL1 FOXJ1 SOX2 CAV1 I D2: High-Risk Neuroblastoma (Y u et al. [2025] et al. ) I.1 Ontology Queries 1. Neuroblast neoplastic cell of sympathetic nerv ous system expressing PHO X2B and ISL1 2. Neuroblastoma tumor cell with MYCN ampliﬁcation and proliferati ve phenotype 3. Adrenergic neuroblast e xpressing catecholamine biosynthesis enzymes tyrosine hydroxylase 4. Neuroblastoma cell with calcium and synaptic signaling pathw ay enrichment 5. Dopaminer gic neuroblast expressing dopamine transporter and metabolic genes 6. Proliferating neuroblastoma cell with cell c ycle and DNA replication mark ers 7. Mesenchymal neuroblastoma cell state e xpressing extracellular matrix genes and Y AP1 8. Intermediate O XPHOS neuroblast with ribosomal gene expression and oxidativ e phosphorylation 9. EZH2 e xpressing neuroblastoma cell PRC2 polycomb repressive comple x chromatin regulation 10. Neuroblastoma cell ERBB4 receptor e xpressing epidermal growth factor signaling 11. Neuroblast with adrener gic transcription factor PHO X2A PHOX2B GA T A3 expression 12. Neural crest deri ved neoplastic cell in pediatric tumor expressing chromogranin 13. Neuroblastoma cell immune e vasion NECTIN2 and checkpoint ligand e xpression 14. Mesenchymal transition state in neuroblastoma with AP-1 transcription f actors 15. T umor associated macrophage in neuroblastoma microen vironment CD68 CD163 expressing 16. Pro-inﬂammatory macrophage IL18 e xpressing anti-tumor immune response 17. Pro-angiogenic macrophage VCAN e xpressing promoting tumor vascularization 18. Immunosuppressi ve macrophage C1QC SPP1 complement expressing in tumor 19. T issue resident macrophage F13A1 expressing phagocytic function in neuroblastoma 20. Lipid associated macrophage HS3ST2 with metabolic phenotype in tumor 21. Macrophage secreting HB-EGF ligand for ERBB4 receptor acti vation on neuroblasts 22. CCL4 e xpressing pro-angiogenic macrophage chemokine signaling in tumor 23. Proliferating macrophage MKI67 T OP2A expanding after chemotherapy 24. THY1 positiv e macrophage undeﬁned myeloid phenotype in neuroblastoma 25. T cell lymphocyte inﬁltrating neuroblastoma tumor e xpressing CD247 CD96 26. Cytotoxic T cell with granzyme perforin mediated tumor cell killing 27. T umor inﬁltrating T lymphocyte immune response to neuroblastoma 28. B cell lymphoc yte P AX5 MS4A1 in neuroblastoma tumor immune microen vironment 29. B lymphoc yte humoral immunity and antigen presentation in pediatric tumor 30. Dendritic cell IRF8 FL T3 antigen presentation priming T cell responses in tumor 31. Professional antigen presenting dendritic cell MHC class II e xpression 32. Fibroblast stromal cell PDGFRB DCN e xtracellular matrix production in neuroblastoma 33. Cancer associated ﬁbroblast F AP ACT A2 expressing in tumor stroma 32 arXiv T emplate A P R E P R I N T 34. Neural crest deri ved endoneurial ﬁbroblast in neuroblastoma tissue 35. Schw ann cell PLP1 CDH19 myelinating glial cell in neuroblastoma microenvironment 36. Schw ann cell precursor neural crest lineage expanding after therapy 37. Endothelial cell PECAM1 PTPRB v ascular marker in neuroblastoma tumor v asculature 38. T umor endothelium blood vessel lining cell expressing v ascular endothelial markers 39. Adrenal cortex cell steroidogenesis CYP11A1 CYP11B1 adjacent normal tissue 40. Cortical cell of adrenal gland steroid hormone biosynthesis normal adjacent tissue 41. Hepatoc yte ALB expressing liv er cell from adjacent normal tissue in neuroblastoma biopsy 42. Kidne y cell renal tissue PKHD1 from adjacent normal tissue in neuroblastoma specimen 43. Chemotherap y induced tumor microenvironment re wiring macrophage expansion after therapy 44. HB-EGF ERBB4 paracrine signaling axis between macrophage and neuroblast promoting ERK 45. T umor immune ev asion and antigen presentation in neuroblastoma 46. VEGF A angiogenesis signaling in neuroblastoma tumor microen vironment 47. Immune cell inﬁltration in high-risk neuroblastoma T cell B cell macrophage 48. THBS1 CD47 don’t eat me signal between macrophage and neuroblastoma cell 49. Neuroblastoma cell e xpressing ALK receptor tyrosine kinase oncogenic driver 50. T umor microen vironment cell diversity neuroblasts ﬁbroblasts Schw ann endothelial macrophages I.2 Expression Queries Q51. PHOX2B ISL1 HAND2 TH DBH DDC CHGA Q52. MYCN MKI67 TOP2A EZH2 SMC4 BIRC5 Q53. PHOX2A PHOX2B GATA3 ASCL1 ISL1 HAND2 Q54. CACNA1B SYN2 KCNMA1 KCNQ3 GPC5 CREB5 Q55. SLC18A2 TH DDC AGTR2 ATP2A2 PHOX2B Q56. MKI67 TOP2A EZH2 SMC4 BIRC5 BUB1B ASPM KIF11 Q57. YAP1 FN1 VIM COL1A1 SERPINE1 SPARC THBS2 Q58. ERBB4 EGFR HBEGF TGFA EREG AREG Q59. NECTIN2 CD274 B2M HLA-A HLA-B PHOX2B Q60. JUN FOS JUNB JUND FOSL2 BACH1 BACH2 Q61. CHGA CHGB PHOX2B ISL1 NTRK1 RET Q62. ETS1 ETV6 ELF1 KLF6 KLF7 RUNX1 ZNF148 Q63. ALK MYCN NTRK2 PHOX2B TH Q64. CD68 CD163 CD86 CSF1R MRC1 SPP1 Q65. IL18 CD68 CD163 CD86 HLA-DRA CSF1R Q66. VCAN VEGFA CD68 CD163 SPP1 EGFR Q67. C1QC SPP1 CD68 CD163 APOE TREM2 Q68. F13A1 CD68 CD163 MRC1 LYVE1 CSF1R Q69. HS3ST2 CYP27A1 CD68 CD163 APOE LPL Q70. HBEGF TGFA EREG AREG CD68 CD163 Q71. CCL4 CD68 CD163 VEGFA CSF1R CCL3 Q72. THY1 CD68 CD163 MRC1 CSF1R CD86 Q73. CD247 CD96 CD3D CD3E CD8A CD4 Q74. GZMA GZMB PRF1 IFNG CD8A CD3D Q75. PAX5 MS4A1 CD19 CD79A HLA-DRA HLA-DRB1 33 arXiv T emplate A P R E P R I N T Q76. IRF8 FLT3 CLEC9A CD1C CD80 HLA-DRA Q77. PDGFRB DCN LUM COL1A1 COL1A2 VIM Q78. FAP ACTA2 COL1A1 PDGFRA DCN LUM Q79. PLP1 CDH19 SOX10 MPZ MBP S100B Q80. PECAM1 PTPRB CDH5 VWF KDR FLT1 Q81. CYP11A1 CYP11B1 CYP17A1 STAR NR5A1 Q82. ALB DCDC2 HNF4A APOB Q83. PKHD1 PAX2 WT1 SLC12A1 Q84. PHOX2B CD68 CD3D MS4A1 PECAM1 DCN PLP1 Q85. HBEGF ERBB4 CD68 PHOX2B MAPK1 Q86. VCAN THBS1 CD47 ITGB1 CD68 PHOX2B Q87. HLA-A HLA-B HLA-C B2M HLA-DRA HLA-DRB1 Q88. VEGFA KDR FLT1 NRP1 GPC1 PECAM1 Q89. CD68 IL18 VCAN C1QC SPP1 F13A1 HS3ST2 CCL4 THY1 Q90. PHOX2B MKI67 TOP2A YAP1 CACNA1B SLC18A2 Q91. APOE LDLR VLDLR LPL HS3ST2 CD68 Q92. THBS1 ITGB1 ITGA3 LRP5 CD47 FN1 Q93. COL1A1 COL1A2 COL4A1 COL4A2 FN1 VIM SPARC Q94. MAPK1 MAPK3 AKT1 ERBB4 EGFR HBEGF Q95. CD274 PDCD1 CTLA4 TIGIT LAG3 NECTIN2 Q96. PHOX2B CD68 PLP1 PECAM1 DCN IRF8 PAX5 CD247 Q97. CYP11A1 ALB PKHD1 PHOX2B CD68 Q98. PHOX2B HBEGF ERBB4 VCAN SPP1 CD163 VEGFA Q99. MKI67 TOP2A PCNA CDK1 CCNB1 EZH2 MELK Q100. PHOX2B ISL1 CD68 CD163 CD3D MS4A1 PLP1 PECAM1 DCN CYP11A1 ALB J D3: Immune Checkpoint Blockade Multi-Cancer (Gondal et al. [2025] et al. ) J.1 Ontology Queries 1. Malignant cancer cell e xpressing immune checkpoint ligand PD-L1 for immune ev asion 2. T umor cell immune ev asion through HLA downre gulation and B2M loss 3. Melanoma cancer cell e xpressing MITF MLANA PMEL lineage mark ers 4. Breast cancer epithelial cell mark ers EPCAM KR T8 KR T18 KR T19 in ICB treated tumors 5. T umor cell proliferation and cell cycle markers in malignant cells 6. Cancer cell VEGF A and TGFB1 immunosuppressive signaling in tumor microen vironment 7. Epithelial mesenchymal transition EMT mark ers in cancer cells during ICB treatment 8. Ef fector CD8 T cell cytotoxic function with granzyme and perforin expression 9. Acti vated CD8 T cell expressing IFNG and TNF anti-tumor c ytokines 10. CD8 T cell e xhaustion with PD-1 LAG3 TIM3 TIGIT checkpoint receptor co-e xpression 11. TO X transcription factor dri ving T cell exhaustion program in chronic antigen stimulation 12. Central memory CD8 T cell with TCF7 and IL7R e xpression for long-lived immunity 13. Nai ve CD8 T cell expressing CCR7 SELL before antigen encounter 14. CD8-positi ve T cell co-stimulatory receptor 4-1BB ICOS upon acti vation 15. CD4 positi ve helper T cell TCR signaling and cytokine production 34 arXiv T emplate A P R E P R I N T 16. Re gulatory T cell FOXP3 e xpressing immunosuppressive function in tumor 17. T follicular helper cell CXCR5 BCL6 supporting B cell responses in tertiary lymphoid structures 18. Th17 helper T cell IL17A R ORC inﬂammatory response in tumor microen vironment 19. CD8-positi ve CD28-neg ativ e regulatory T cell with suppressi ve function 20. Natural killer T cell NKT innate c ytotoxicity with KLRD1 and NKG7 expression 21. NK cell mediated tumor killing through NCR1 and KLRB1 receptor acti vation 22. B cell CD19 MS4A1 CD79A antigen presentation and humoral immunity in tumor 23. Plasma cell antibody secreting immunoglob ulin production SDC1 MZB1 24. T ertiary lymphoid structure B cell and plasma cell formation in ICB-responsiv e tumors 25. T umor associated macrophage M2 polarization CD163 MRC1 immunosuppressiv e function 26. Macrophage complement e xpression C1QA C1QB and TREM2 in tumor microenvironment 27. Classical monoc yte CD14 L YZ inﬁltration into tumor during checkpoint blockade 28. Dendritic cell antigen presentation CD80 CD86 priming T cell responses 29. Plasmac ytoid dendritic cell IRF7 LILRA4 type I interferon production 30. Myeloid cell general CSF1R ITGAM e xpressing innate immune population 31. Mast cell KIT TPSB2 CP A3 in allergic and inﬂammatory tumor responses 32. Microglial cell brain resident macrophage in melanoma brain metastasis 33. Cancer associated ﬁbroblast F AP ACT A2 COL1A1 producing extracellular matrix 34. Myoﬁbroblast A CT A2 T A GLN contractile smooth muscle actin expression in tumor stroma 35. T umor endothelial cell PECAM1 CDH5 VWF vascular mark er expression 36. Melanoc yte pigmentation pathway MITF TYR TYRP1 DCT lineage genes 37. Hematopoietic multipotent progenitor cell stem cell mark er expression 38. PD-1 blockade restoring ef fector CD8 T cell anti-tumor cytotoxicity 39. CTLA-4 blockade enhancing CD4 helper T cell and reducing T reg suppression 40. T cell clonal replacement and expansion follo wing PD-1 checkpoint inhibition 41. TCF4 dependent resistance program in mesenchymal-like melanoma cells 42. T cell exclusion program in tumor cells resisting checkpoint blockade therapy 43. Antigen processing and MHC class I presentation in tumor cells 44. MHC class II antigen presentation by professional antigen presenting cells 45. Interferon gamma response dri ving PD-L1 upregulation on tumor cells 46. T umor inﬁltrating lymphocyte di versity including T B and NK cells 47. Li ver cancer hepatocellular carcinoma markers ALB AFP GPC3 in ICB dataset 48. Clear cell renal carcinoma CA9 P AX8 markers in kidney cancer patients 49. Basal cell carcinoma Hedgehog pathw ay PTCH1 GLI1 GLI2 SHH signaling 50. L ymphocyte general population in tumor immune microen vironment J.2 Expr ession Queries 1. CD274 PDCD1LG2 B2M HLA-A CD47 IDO1 VEGFA 2. MITF MLANA PMEL TYR DCT SOX10 TYRP1 3. EPCAM KRT8 KRT18 KRT19 MUC1 CDH1 ESR1 4. MKI67 TOP2A PCNA CD274 B2M TGFB1 5. PRF1 GZMA GZMB GZMK GNLY NKG7 IFNG 6. GZMB PRF1 IFNG TNF FASLG NKG7 CD8A 7. CD69 ICOS TNFRSF9 IFNG GZMB CD8A 35 arXiv T emplate A P R E P R I N T 8. PDCD1 LAG3 HAVCR2 TIGIT TOX ENTPD1 9. TOX TOX2 PDCD1 HAVCR2 LAG3 TIGIT BTLA 10. TCF7 LEF1 CCR7 SELL IL7R CD8A CD8B 11. CCR7 SELL TCF7 LEF1 IL7R CD3D 12. CD4 CD3D CD3E IL7R CD28 ICOS TCF7 13. FOXP3 IL2RA CTLA4 IKZF2 TNFRSF18 TIGIT 14. CXCR5 BCL6 ICOS PDCD1 CD4 CD3D 15. RORC IL17A IL23R CCR6 CD4 CD3E 16. CD8A GZMB PRF1 LAG3 CTLA4 PDCD1 17. KLRD1 KLRK1 NKG7 GNLY PRF1 GZMB NCAM1 18. NCAM1 NCR1 KLRB1 KLRC1 GZMB IFNG 19. CD19 MS4A1 CD79A CD79B HLA-DRA HLA-DRB1 20. SDC1 MZB1 JCHAIN IGHG1 IGKC CD79A 21. CD163 MRC1 MSR1 MARCO CD68 APOE TREM2 22. C1QA C1QB APOE TREM2 CD68 SPP1 23. CD14 FCGR3A S100A8 S100A9 LYZ CSF1R 24. CD80 CD86 CD83 CCR7 HLA-DRA CLEC9A 25. LILRA4 IRF7 IRF8 IL3RA NRP1 26. ITGAM CSF1R CD68 LYZ S100A8 S100A9 27. KIT TPSB2 TPSAB1 CPA3 HPGDS HDC 28. P2RY12 TMEM119 CX3CR1 CSF1R AIF1 29. FAP ACTA2 COL1A1 COL1A2 PDGFRA DCN LUM 30. ACTA2 TAGLN MYH11 COL1A1 PDGFRB VIM 31. PECAM1 CDH5 VWF KDR FLT1 ENG 32. MITF TYR TYRP1 DCT MLANA PMEL SOX10 33. CD34 KIT FLT3 PROM1 THY1 PTPRC 34. CD3D CD3E CD8A CD4 TRAC TRBC1 35. HLA-DRA HLA-DRB1 HLA-DPA1 HLA-DPB1 CD74 CIITA 36. HLA-A HLA-B HLA-C B2M TAP1 TAP2 37. PDCD1 CD274 CTLA4 CD80 CD86 LAG3 HAVCR2 38. CD274 CD47 IDO1 GZMB PRF1 IFNG 39. CD8A CD4 MS4A1 CD68 PECAM1 FAP EPCAM NCAM1 40. GZMB IFNG FOXP3 CD163 CD274 MS4A1 PECAM1 41. ALB AFP GPC3 EPCAM KRT19 42. CA9 PAX8 MME EPCAM VEGFA 43. PTCH1 GLI1 GLI2 EPCAM KRT14 44. ERBB2 ESR1 EPCAM KRT8 KRT18 MUC1 45. CCR7 SELL TCF7 PDCD1 TOX GZMB PRF1 46. IFNG CD274 STAT1 IRF1 B2M HLA-A 47. CD8A CD4 FOXP3 CXCR5 RORC CCR7 KLRD1 CD3D 48. CD68 CD163 CD14 S100A8 CD80 KIT LILRA4 ITGAM 49. FAP ACTA2 PECAM1 CDH5 COL1A1 PDGFRA VWF 50. CD274 GZMB CD68 MS4A1 FAP PECAM1 MITF FOXP3 CD8A KIT LILRA4 36 arXiv T emplate A P R E P R I N T K D6: First-T rimester Human Brain (Mannens et al. [2025] et al. ) K.1 Ontology Queries 1. GAB Aergic inhibitory neuron dif ferentiation in dev eloping human midbrain 2. Midbrain GAB Aergic neuron O TX2 GA T A2 T AL2 transcription factor expression 3. Cortical interneuron deri ved from medial ganglionic eminence LHX6 DLX2 4. Interneuron di versity parv albumin somatostatin VIP subtypes dev eloping cortex 5. T AL2 expressing midbrain GAB Aergic neurons link ed to major depressive disorder 6. Lateral and caudal ganglionic eminence interneuron migration in telencephalon 7. Medial ganglionic eminence deri ved parv albumin somatostatin interneuron 8. SO X14 expressing midbrain GAB Aergic neuron thalamic migration 9. Glutamater gic excitatory neuron in dev eloping human telencephalon cortex 10. T elencephalic glutamatergic neuron LHX2 BHLHE22 cortical layer speciﬁcation 11. Hindbrain glutamater gic neuron A TOH1 MEIS1 cerebellar granule cell 12. Deep layer cortical neuron FEZF2 BCL11B corticospinal projection 13. SA TB2 e xpressing telencephalic excitatory neuron callosal projection 14. Upper layer cortical neuron CUX1 CUX2 R ORB intracortical connectivity 15. EMX2 transcription f actor dorsal telencephalon glutamatergic identity 16. Purkinje cell dif ferentiation in dev eloping cerebellum PTF1A ESRRB lineage 17. Purkinje neuron ESRRB oestrogen-related nuclear receptor cerebellum speciﬁc 18. Cerebellar Purkinje progenitor PTF1A ASCL1 NEUR OG2 ventricular zone 19. TF AP2B LHX5 activ ation of ESRRB enhancer in Purkinje neuroblast 20. R ORA FOXP2 EBF3 late Purkinje maturation gene re gulatory network 21. Cerebellar granule neuron A T OH1 MEIS1 external granular layer 22. Radial glial cell neural stem cell SO X2 P AX6 NES in dev eloping brain 23. Radial glia to glioblast transition NFI f actor maturation NFIA NFIB NFIX 24. Neural progenitor cell proliferation and neurogenesis in v entricular zone 25. Loss of stemness and glial f ate restriction by NFI transcription factors 26. Progenitor cell di viding in dev eloping human brain VIM HES1 proliferating 27. Notch signaling DLL1 J A G1 NOTCH1 lateral inhibition neurogenesis 28. Glioblast astroc yte precursor GF AP S100B A QP4 BCAN TNC fetal brain 29. Astrocyte maturation and glial scar mark ers in developing brain 30. Oligodendroc yte precursor cell OLIG2 PDGFRA SOX10 speciﬁcation 31. Oligodendroc yte differentiation MBP MOG PLP1 myelination fetal brain 32. Committed oligodendroc yte precursor SOX10 lineage commitment 33. Dopaminer gic neuron midbrain TH NR4A2 substantia nigra ventral tegmental area 34. Serotoner gic neuron raphe nucleus TPH2 SLC6A4 FEV brainstem 35. FO XA2 LMX1A ﬂoor plate derived dopaminer gic neuron speciﬁcation 36. Endothelial cell blood–brain barrier CLDN5 PECAM1 CDH5 fetal brain 37. Peric yte PDGFRB RGS5 FOXF2 cerebral v asculature dev eloping brain 38. V ascular leptomeningeal cell FO XC1 meningeal ﬁbroblast DCN COL1A1 39. V ascular smooth muscle cell A CT A2 MYH11 cerebral artery 40. Microglial cell CX3CR1 P2R Y12 TMEM119 brain resident macrophage 41. Border -associated macrophage R UNX1 haematopoietic origin fetal brain 37 arXiv T emplate A P R E P R I N T 42. Immature T cell and leuk ocyte inﬁltration in dev eloping fetal brain 43. Schw ann cell MPZ CDH19 SOX10 neural crest deri ved myelinating peripheral glial 44. Sensory neuron dorsal root ganglion NTRK1 ISL1 peripheral nerv ous system 45. Glyciner gic neuron SLC6A5 GLRA1 inhibitory spinal cord hindbrain 46. Neuroblast immature migrating neuron fetal corte x RBFOX3 NEFM 47. Major depressi ve disorder MDD midbrain GAB Aergic neuron NEGR1 LRFN5 48. Schizophrenia cortical interneuron medial ganglionic eminence SA TB2 49. Attention deﬁcit hyperacti vity disorder ADHD cerebellar Purkinje 50. Autism spectrum disorder hindbrain neuroblast brainstem in volv ement K.2 Expression Queries 1. GAD1 GAD2 SLC32A1 DLX2 DLX5 LHX6 2. OTX2 GATA2 TAL2 SOX14 GAD2 SLC32A1 3. PVALB SST VIP LAMP5 SNCG ADARB2 4. DLX1 DLX2 DLX5 DLX6 MEIS2 LHX6 5. GAD1 GAD2 SLC32A1 TFAP2B OTX2 6. TAL2 SOX14 GAD2 OTX2 GATA2 7. SLC17A7 SLC17A6 SATB2 TBR1 FEZF2 BCL11B 8. EMX2 LHX2 BHLHE22 CUX1 CUX2 RORB 9. ATOH1 MEIS1 MEIS2 SLC17A6 RBFOX3 10. FEZF2 BCL11B TBR1 SATB2 SLC17A7 11. CUX1 CUX2 RORB LHX2 BHLHE22 EMX2 12. PTF1A ASCL1 NEUROG2 NHLH1 NHLH2 TFAP2B 13. ESRRB RORA PCP4 FOXP2 EBF3 LHX5 14. LHX5 LHX1 PAX2 TFAP2B DMBX1 NHLH2 15. ESRRB PCP4 RORA EBF1 EBF3 FOXP2 LHX1 16. SOX2 PAX6 NES VIM HES1 HES5 FABP7 17. NFIA NFIB NFIX SOX9 FABP7 18. SOX2 HES1 HES5 PAX6 NES VIM 19. NOTCH1 NOTCH2 DLL1 JAG1 HES1 HES5 20. GFAP S100B AQP4 ALDH1L1 BCAN TNC 21. OLIG1 OLIG2 SOX10 PDGFRA CSPG4 22. MBP MOG PLP1 MAG SOX10 23. OLIG2 SOX10 PDGFRA NKX2-2 OLIG1 24. TH DDC SLC6A3 SLC18A2 NR4A2 LMX1A FOXA2 25. FOXA2 LMX1A NR4A2 TH DDC SLC18A2 26. TPH2 SLC6A4 FEV DDC SLC18A2 27. SLC6A5 GLRA1 SLC32A1 GAD1 28. RBFOX3 SNAP25 SYT1 NEFM NEFL TUBB3 29. NEFM NEFL MAP2 TUBB3 SYT1 30. CLDN5 PECAM1 CDH5 ERG FLT1 VWF 31. PDGFRB RGS5 ACTA2 MYH11 COL1A2 32. ACTA2 MYH11 PDGFRB TAGLN 33. DCN LUM COL1A1 COL1A2 FOXC1 COL3A1 38 arXiv T emplate A P R E P R I N T 34. FOXC1 FOXF2 DCN COL1A2 LUM 35. AIF1 CX3CR1 P2RY12 TMEM119 HEXB CSF1R 36. RUNX1 SPI1 CSF1R AIF1 CD68 37. AIF1 HEXB P2RY12 TMEM119 CX3CR1 38. CD3D CD3E CD3G PTPRC CD2 39. MPZ CDH19 SOX10 MBP PLP1 40. NTRK1 NTRK2 ISL1 PRPH SNAP25 41. RBFOX3 SLC17A6 GAD2 NEFM SNAP25 42. NEFM NEFL RBFOX3 TUBB3 DCX 43. NEGR1 BTN3A2 LRFN5 SCN8A RGS6 MYCN 44. OTX2 GATA2 MEIS2 PRDM10 MYCN 45. CTCF MECP2 YY1 RAD21 SMC3 46. SHH PTCH1 GLI1 GLI2 FOXA2 NKX2-1 47. WNT5A CTNNB1 LEF1 TCF7L2 AXIN2 48. BMP4 BMPR1A SMAD1 ID1 ID3 49. VEGFA KDR FLT1 PDGFB PDGFRB CLDN5 50. SOX2 PAX6 OLIG2 GFAP RBFOX3 GAD2 SLC17A7 39 arXiv T emplate A P R E P R I N T L Example of plot on Cystic Fibrosis Dataset Figure 3: Cell-lev el UMAP of the cystic ﬁbrosis airway dataset (D1) colored by Cell Ontology annotation. Approxi- mately 96,000 cells are sho wn across 30 annotated cell types spanning immune (T cells, B cells, NK cells, macrophages, monocytes, dendritic cells, mast cells), epithelial (basal, suprabasal, multiciliated, secretory , goblet, club, ionocyte, neuroendocrine), and stromal (ﬁbroblasts, pericytes, endocardial cells) compartments. Labels are placed at cluster centroids with iterativ e repulsion to minimize overlap. 40 arXiv T emplate A P R E P R I N T Figure 4: Expression of HLA-E projected onto the cell-lev el UMAP of the cystic ﬁbrosis airw ay dataset (D1). Color intensity (purple gradient) indicates normalized e xpression level, with non-expressing cells sho wn in grey . HLA-E is most highly expressed in immune cell clusters, particularly CD8 + T cells and NK cells, consistent with its role as a ligand for the NKG2A inhibitory receptor . Moderate expression is observed across epithelial populations including basal cells, supporting the HLA-E/NKG2A immune checkpoint axis identiﬁed by Berg et al. 41

ELISA: An Interpretable Hybrid Generative AI Agent for Expression-Grounded Discovery in Single-Cell Genomics

Original Paper

Comments & Academic Discussion

Leave a Comment

Original Paper

Related Papers

Comments & Academic Discussion

Leave a Comment