CaloHadronic: a diffusion model for the generation of hadronic showers

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Simulating showers of particles in highly granular calorimeters is a key frontier in the application of machine learning to particle physics. Achieving high accuracy and speed with generative machine learning models can enable them to augment traditional simulations and alleviate a major computing constraint. Recent developments have shown that diffusion-based generative shower simulation approaches that do not rely on a fixed structure, but instead generate geometry-independent point clouds, are very efficient. We present a transformer-based extension to previous architectures, which were developed for simulating electromagnetic showers in the highly granular electromagnetic calorimeter of the International Large Detector (ILD). The attention mechanism now allows us to generate complex hadronic showers with more pronounced substructure across both the electromagnetic and hadronic calorimeters. This is the first time that machine learning methods have been used to holistically generate showers across the electromagnetic and hadronic calorimeters in highly granular imaging calorimeter systems.


💡 Research Summary

CaloHadronic introduces a novel diffusion‑based generative framework for simulating full‑detector hadronic showers in highly granular calorimeters, specifically targeting the International Large Detector (ILD) concept for a future electron‑positron collider. The authors address two major limitations of previous machine‑learning approaches: (1) reliance on fixed grid representations that struggle with differing cell sizes and materials between electromagnetic (ECal) and hadronic (HCal) sections, and (2) the inability to model the full longitudinal development of a hadronic shower, which often starts in the ECal and continues into the HCal.

The proposed solution consists of three tightly coupled components. First, a PointCountFM module predicts the number of hits (points) per calorimeter layer conditioned on the incident pion energy. This module is a continuous normalizing flow (CNF) trained via flow matching, which avoids costly ODE integration during training: the network regresses a velocity field with a simple mean‑squared‑error loss, and an ODE is solved only when sampling. Second, two separate EDM‑diffusion models generate the actual point clouds for the ECal and HCal. Both diffusion models are built around transformer encoder‑decoder blocks with multi‑head attention, allowing the network to capture long‑range dependencies across the 78 layers (30 ECal + 48 HCal). Crucially, the HCal diffusion model receives the ECal point cloud as an additional conditioning input, thereby learning the physical transition of the shower from the electromagnetic to the hadronic section.
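The flow-matching objective behind PointCountFM can be illustrated with a minimal sketch. It assumes the common linear interpolation path between a noise sample and a data sample; the summary does not say which path the authors use. The names `flow_matching_loss` and `toy_velocity_model` are hypothetical, and plain Python is used so the sketch runs without dependencies.

```python
import random

def flow_matching_loss(velocity_model, data_batch, cond_batch, rng):
    """Mean-squared-error flow-matching loss for one batch.

    Assumes the linear path x_t = (1 - t) * x0 + t * x1, whose target
    velocity is x1 - x0. The path choice is an assumption of this sketch.
    """
    total, count = 0.0, 0
    for x1, cond in zip(data_batch, cond_batch):
        t = rng.random()                          # time sampled uniformly in [0, 1)
        x0 = [rng.gauss(0.0, 1.0) for _ in x1]    # noise sample, same size as x1
        x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
        target = [b - a for a, b in zip(x0, x1)]  # velocity of the linear path
        pred = velocity_model(x_t, t, cond)
        total += sum((p - v) ** 2 for p, v in zip(pred, target))
        count += len(x1)
    return total / count

def toy_velocity_model(x_t, t, cond):
    """Hypothetical stand-in for a trained network: always predicts zero."""
    return [0.0 for _ in x_t]

rng = random.Random(0)
# Two toy "hits per layer" vectors (3 layers each) with an energy condition.
data = [[12.0, 30.0, 7.0], [5.0, 18.0, 2.0]]
energies = [50.0, 20.0]
loss = flow_matching_loss(toy_velocity_model, data, energies, rng)
```

No ODE is solved anywhere in this loss, which is why training is cheap; integration is needed only when drawing samples from the trained velocity field.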

To improve learning of the highly non‑uniform spatial features typical of hadronic showers, the authors incorporate two technical innovations. A monotonic weighting scheme replaces the original non‑monotonic EDM weighting, giving more emphasis to low‑noise diffusion steps where the model is most accurate. Additionally, a Fourier embedding of the input coordinates (sinusoidal mapping with high frequencies) enriches the representation, helping the network capture fine‑grained variations in hit density.
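A sinusoidal coordinate embedding of the kind described above might look like the following sketch. The power-of-two frequency ladder, the number of frequencies, and the function name `fourier_embed` are assumptions for illustration; the summary does not give the authors' exact choices.

```python
import math

def fourier_embed(coords, num_freqs=4):
    """Map each coordinate c to sin and cos of (2**k * pi * c) for k < num_freqs.

    The power-of-two frequency ladder is an assumption of this sketch.
    """
    features = []
    for c in coords:
        for k in range(num_freqs):
            angle = (2.0 ** k) * math.pi * c
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features

point = [0.25, -0.5, 0.1]          # standardized (x, y, z) of one hit
embedded = fourier_embed(point)    # 3 coords * 4 freqs * 2 functions = 24 features
```

Higher frequencies make nearby coordinates map to clearly different feature vectors, which is what lets a network resolve fine spatial structure.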

The dataset comprises 200 k single‑π⁺ showers simulated with Geant4 (QGSP_BERT) in the ILD geometry. Energy ranges from 10 GeV to 90 GeV, with particles injected perpendicularly to the ECal front face. After extracting Geant4 steps, the authors construct 3‑D point clouds (x, y, z, energy) for each layer, resampling to a virtual grid of 1.7 mm × 1.7 mm for ECal and 10 mm × 10 mm for HCal. A noise threshold of 10⁻⁵ MeV discards low‑energy deposits, yielding an average of ~1 700 points per shower and a maximum of ~5 000 points. Coordinates and log‑energy are independently standardized, ensuring stable training.
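The thresholding and standardization steps can be sketched as follows. This is an illustration under stated assumptions: deposits at or below the threshold are dropped, and the statistics are computed over the hits passed in, whereas a real pipeline would presumably compute them once over the whole training set. The function name `preprocess` and the toy hits are hypothetical.

```python
import math
import statistics

def preprocess(hits, threshold_mev=1e-5):
    """Drop sub-threshold hits, take log-energy, and standardize each column.

    `hits` is a list of (x, y, z, energy_mev) tuples. Returns the standardized
    rows plus the per-column (mean, std) needed to invert the transform.
    """
    kept = [(x, y, z, math.log(e)) for x, y, z, e in hits if e > threshold_mev]
    columns = list(zip(*kept))
    stats = [(statistics.fmean(c), statistics.pstdev(c)) for c in columns]
    rows = [
        # A constant column has zero spread; map it to 0.0 instead of dividing by 0.
        tuple((v - m) / s if s > 0 else 0.0 for v, (m, s) in zip(row, stats))
        for row in kept
    ]
    return rows, stats

# Toy hits: the third deposit is below the 1e-5 MeV threshold and is dropped.
hits = [(0.0, 0.0, 1.0, 2.0), (1.7, 0.0, 2.0, 0.5), (3.4, 1.7, 3.0, 1e-6)]
rows, stats = preprocess(hits)
```

Keeping the per-column mean and standard deviation matters because generated points must be mapped back to physical coordinates and energies after sampling.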

Training is performed in PyTorch on NVIDIA A100 GPUs. PointCountFM is trained first; it uses a 5‑layer MLP, and samples are drawn from it with Heun’s second‑order ODE solver using 200 integration steps. The two diffusion models are then trained jointly with AdamW (learning rate 2 × 10⁻⁴) for 500 epochs, batch size 256.
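Heun's second-order method, named above as the ODE solver, is a predictor-corrector scheme. The sketch below applies it to a toy scalar velocity field rather than a trained network; `heun_integrate` and the test field are hypothetical, and only the step count of 200 comes from the summary.

```python
import math

def heun_integrate(velocity, x0, num_steps=200, t0=0.0, t1=1.0):
    """Integrate dx/dt = velocity(x, t) from t0 to t1 with Heun's method."""
    x, t = x0, t0
    h = (t1 - t0) / num_steps
    for _ in range(num_steps):
        k1 = velocity(x, t)                # slope at the start of the step
        x_euler = x + h * k1               # Euler predictor
        k2 = velocity(x_euler, t + h)      # slope at the predicted end point
        x = x + 0.5 * h * (k1 + k2)        # trapezoidal corrector
        t += h
    return x

# Toy field dx/dt = x, whose exact solution at t = 1 is x0 * e.
x_end = heun_integrate(lambda x, t: x, x0=1.0)
error = abs(x_end - math.e)
```

Each step costs two velocity evaluations, so 200 steps correspond to 400 network calls per sample in a flow-matching sampler built this way.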

Performance is evaluated on three fronts. Physics‑level metrics (total deposited energy, shower radius, longitudinal profile, electromagnetic‑to‑hadronic energy fraction) show agreement with full Geant4 within 3–5 % across the full energy range, and the correlation between layer‑wise hit counts and total energy reaches a Pearson coefficient of 0.98, indicating that the global shower structure is faithfully reproduced. Inference speed is a standout result: a single shower is generated in ~0.8 ms on an A100, more than 2 000× faster than a traditional Geant4 simulation, which typically takes on the order of seconds per shower.
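For reference, the Pearson coefficient quoted above is the covariance of two quantities divided by the product of their standard deviations. The sketch below computes it on made-up numbers chosen only to exercise the function; they are not results from the paper, and `pearson` is a hypothetical helper name.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative toy values only (not taken from the paper).
toy_energies = [10.0, 30.0, 50.0, 70.0, 90.0]
toy_hit_counts = [400.0, 1100.0, 1750.0, 2500.0, 3100.0]
r = pearson(toy_energies, toy_hit_counts)
```

A value near 1 indicates an almost linear relationship between the two quantities; it says nothing about whether the absolute scale is right, which is why the summary reports it alongside the distribution-level comparisons.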

The authors also integrate the generated point clouds into the DDML library to map hits back onto the actual detector geometry, then feed them into the PandoraPFA reconstruction chain. Reconstruction efficiency, particle‑flow resolution, and energy linearity are virtually indistinguishable from those obtained with the original Geant4 samples, demonstrating that the synthetic data are suitable for downstream physics analyses.

Limitations and future work are candidly discussed. The current study focuses on single‑particle π⁺ showers; extending the framework to multi‑particle events, mixed electromagnetic/hadronic showers, and other particle species (e.g., neutrons, kaons) is a natural next step. Model compression via knowledge distillation and reduction of diffusion steps (targeting ≤50 steps) are under investigation to further lower computational cost for large‑scale Monte Carlo production. Finally, the authors plan to test generalization on other high‑granularity calorimeters such as the CMS HGCAL and ATLAS LAr, and to incorporate additional conditioning variables like incident angle and impact position.

In summary, CaloHadronic delivers a unified, geometry‑agnostic, and ultra‑fast generative model that accurately reproduces the complex sub‑structure of hadronic showers across both electromagnetic and hadronic calorimeters. By combining CNF‑based point‑count prediction, transformer‑enhanced EDM‑diffusion, and novel weighting/embedding strategies, the work sets a new benchmark for ML‑driven detector simulation and opens a realistic pathway toward replacing computationally intensive full‑simulation pipelines in upcoming high‑luminosity collider experiments.

