Towards a foundation model for astrophysical source detection: An End-to-End Gamma-Ray Data Analysis Pipeline Using Deep Learning


The increasing volume of gamma-ray data demands new analysis approaches that can handle large-scale datasets while providing robustness for source detection. We present a Deep Learning (DL) based pipeline for detection, localization, and characterization of gamma-ray sources. We extend our AutoSourceID (ASID) method, initially tested with Fermi-LAT simulated data and optical data (MeerLICHT), to Cherenkov Telescope Array Observatory (CTAO) simulated data. This end-to-end pipeline demonstrates a versatile framework for future application to other surveys and potentially serves as a building block for a foundational model for astrophysical source detection.


💡 Research Summary

The paper presents AutoSourceID (ASID), a deep‑learning (DL) pipeline for automated detection, localization, and characterization of astrophysical gamma‑ray sources, and demonstrates it on simulated data for both Fermi‑LAT and the upcoming Cherenkov Telescope Array Observatory (CTAO). The authors extend the original ASID framework, previously validated on simulated Fermi‑LAT data and optical MeerLICHT images, by adapting it to CTAO Galactic Plane Survey (GPS) simulations. The core of the pipeline is a multi‑input U‑Net that ingests count maps in several energy bins and produces pixel‑wise segmentation masks. A Laplacian‑of‑Gaussian (LoG) blob‑detection step then extracts candidate source coordinates from these masks. A downstream VGG‑style convolutional neural network classifies each candidate as a true or false detection, while deep ensemble networks estimate fluxes and refine positions, providing both point estimates and uncertainties.
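
The step from segmentation mask to candidate coordinates can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy mask, the function names `toy_mask` and `log_peaks`, and the `sigma` and `threshold` values are assumptions chosen for illustration, and SciPy's `gaussian_laplace` stands in for whatever LoG implementation ASID uses.

```python
import numpy as np
from scipy import ndimage


def log_peaks(mask, sigma=2.0, threshold=0.05):
    """Return (row, col) positions of local maxima in the negated LoG response.

    `sigma` and `threshold` are illustrative values, not taken from the paper.
    """
    # Negate so that bright blobs produce positive peaks; the sigma**2 factor
    # is the usual scale normalisation of the LoG response.
    response = -sigma**2 * ndimage.gaussian_laplace(mask.astype(float), sigma=sigma)
    # Keep pixels that are the maximum of their 3x3 neighbourhood
    # and whose response exceeds the threshold.
    is_local_max = ndimage.maximum_filter(response, size=3) == response
    rows, cols = np.nonzero(is_local_max & (response > threshold))
    return list(zip(rows.tolist(), cols.tolist()))


def toy_mask(shape, centres, sigma=2.0):
    """Build a hypothetical segmentation mask with Gaussian blobs at `centres`."""
    yy, xx = np.indices(shape)
    mask = np.zeros(shape)
    for r, c in centres:
        mask += np.exp(-((yy - r) ** 2 + (xx - c) ** 2) / (2 * sigma**2))
    return np.clip(mask, 0.0, 1.0)


# Two well-separated toy sources in a 64 x 64 pixel patch.
mask = toy_mask((64, 64), [(16, 20), (45, 40)])
peaks = log_peaks(mask)
print(peaks)
```

In a full pipeline the mask would come from the U‑Net rather than from `toy_mask`, and the resulting pixel coordinates would be passed on to the candidate classifier.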

Performance is evaluated using recall (completeness) as a function of true source flux. On ten‑year simulated Fermi‑LAT data (six energy bins from 300 MeV to 1 TeV, 10° × 10° patches), ASID reaches a detection threshold of ≈2 × 10⁻¹⁰ cm⁻² s⁻¹, comparable to the 4FGL‑DR2 catalog. When trained on one interstellar emission model (B1‑IEM) and tested on another, the pipeline shows consistent true‑source recovery, indicating robustness against background model variations. In the high‑latitude region (|b| > 20°) the method recovers 98 % of the high‑significance (σ > 20) catalog sources.
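
Recall as a function of true flux can be computed along the following lines. This is a hedged sketch with made-up numbers: the function name `recall_per_flux_bin`, the bin edges, and the toy fluxes are assumptions, and the cross-matching that decides whether each true source counts as detected is assumed to have been done beforehand.

```python
import numpy as np


def recall_per_flux_bin(true_flux, detected, bin_edges):
    """Fraction of true sources recovered in each flux bin (NaN for empty bins)."""
    true_flux = np.asarray(true_flux, dtype=float)
    detected = np.asarray(detected, dtype=bool)
    # Count all true sources and the recovered subset in the same bins.
    n_true, _ = np.histogram(true_flux, bins=bin_edges)
    n_found, _ = np.histogram(true_flux[detected], bins=bin_edges)
    recall = np.full(len(n_true), np.nan)
    # Divide only where the bin contains at least one true source.
    np.divide(n_found, n_true, out=recall, where=n_true > 0)
    return recall


# Toy catalogue: fluxes in cm^-2 s^-1 and a flag per true source (illustrative values).
flux = [2e-11, 3e-11, 2e-10, 5e-10, 3e-9, 8e-9]
found = [False, False, True, False, True, True]
edges = [1e-11, 1e-10, 1e-9, 1e-8]

recall = recall_per_flux_bin(flux, found, edges)
print(recall)
```

A detection threshold can then be read off as the flux bin in which recall first exceeds a chosen completeness level; that level is itself a convention and is not specified in this summary.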

For CTAO, the authors use Gammapy to simulate GPS data containing only point‑like sources in three energy bins (70 GeV–100 TeV). They compare ASID with CeDiRNet, a CNN that directly regresses source directions. Both methods achieve ~90 % recall for sources with flux ≳2 × 10⁻¹⁴ cm⁻² s⁻¹ (>1 TeV), matching the sensitivity of standard likelihood analyses while offering full automation and faster processing. The similar performance of two distinct CNN architectures underscores the viability of DL for next‑generation imaging atmospheric Cherenkov telescope (IACT) data, which will feature higher background rates, superior angular resolution, and many overlapping extended sources.
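
To make the notion of multi-energy count maps concrete, the sketch below bins a toy event list into a three-channel count cube with plain NumPy. It does not use Gammapy and is not the authors' simulation code: the function name `count_cube`, the log-spaced energy edges between 70 GeV and 100 TeV, the 4° field of view, and the 32 × 32 pixel grid are all assumptions made for illustration.

```python
import numpy as np


def count_cube(x_deg, y_deg, energy_tev, energy_edges_tev, extent_deg, n_pix):
    """Bin events into a (n_energy, n_pix, n_pix) cube of counts.

    Events outside the energy range or the field of view are dropped.
    """
    # Square grid centred on the patch centre.
    spatial_edges = np.linspace(-extent_deg / 2, extent_deg / 2, n_pix + 1)
    # Bin in log10(energy) so that the edges are evenly spaced.
    sample = np.column_stack([np.log10(energy_tev), y_deg, x_deg])
    cube, _ = np.histogramdd(
        sample,
        bins=[np.log10(energy_edges_tev), spatial_edges, spatial_edges],
    )
    return cube


# Three energy bins between 70 GeV and 100 TeV (log-spaced edges are an assumption).
energy_edges = np.geomspace(0.07, 100.0, 4)

# Toy event list: offsets from the patch centre in degrees and energies in TeV.
x = np.array([0.1, -0.4, 0.1])
y = np.array([0.2, 0.3, 0.2])
energy = np.array([0.1, 5.0, 50.0])

cube = count_cube(x, y, energy, energy_edges, extent_deg=4.0, n_pix=32)
print(cube.shape, cube.sum(axis=(1, 2)))
```

Each slice along the first axis plays the role of one input channel for a segmentation network of the kind described above.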

Beyond gamma rays, the pipeline is tested on optical and infrared surveys. On MeerLICHT images (256 × 256 pixel patches), the same U‑Net+LoG architecture outperforms traditional tools such as SExtractor, especially in rejecting artifacts. Additional experiments on Hubble and WISE data confirm that the model can adapt to different resolutions and wavelengths without changes to the core architecture. A latent‑space analysis of the bottleneck representations reveals two well‑separated clusters corresponding to the background and source classes, with points from both Fermi‑LAT and CTAO data intermixing within each cluster. This suggests that a single, modality‑agnostic latent space can be learned, a key requirement for a “foundation model” that can operate across heterogeneous telescopes and wavebands.

The authors conclude that ASID provides a robust, end‑to‑end solution for gamma‑ray source detection, achieving performance on par with established catalog pipelines while delivering the speed and scalability needed for upcoming large datasets. Future work will focus on mitigating interstellar emission model uncertainties, extending the framework to handle extended sources, incorporating denoising stages for CTAO, and integrating multi‑wavelength data to realize a true foundation model for astrophysical source detection and classification.

