GPU-Accelerated Analytic Simulation of Sparse Signals in Pixelated Time Projection Detector


This paper presents a GPU-accelerated simulation package, TRED, for next-generation neutrino detectors with pixelated charge readout, leveraging community-driven software ecosystems to ensure sustainability and extensibility. We introduce two generic contributions: (i) an effective-charge calculation based on Gaussian quadrature rules for numerical integration, and (ii) a sparse, block-binned tensor representation that enables efficient FFT-based computation of induced signals on readout electrodes for sparsely activated detector volumes. The former captures sub-grid structure without requiring dense sampling, while the latter achieves low memory usage and scalable runtime, as demonstrated in benchmark studies. The underlying data representation is applicable to large-scale detectors and to other computational problems involving sparse activity.


💡 Research Summary

The paper introduces TRED, a GPU‑accelerated simulation framework designed for the pixelated charge readout of the DUNE Near‑Detector Liquid Argon Time Projection Chamber (ND‑LAr). Traditional CPU‑based C++ tools struggle with the massive channel count (≈5 × 10⁵ per module) and the highly sparse activity typical of neutrino interactions in the near detector. Recent GPU attempts have shown promise but still demand large memory footprints because they treat the data as dense tensors. TRED addresses these challenges through two generic technical contributions that are broadly applicable to any large‑volume, sparsely‑active detector.

First, the authors develop an “effective‑charge” representation based on Gauss‑Legendre quadrature. The continuous ionization charge distribution, after recombination, attachment, and diffusion, is partitioned into cuboidal elements that match the granularity of the pre‑computed field‑response grid. Within each cuboid, a small set of Gauss‑Legendre nodes and associated weights is used to evaluate the integral of the product of the charge density and the Green’s function. This yields a discrete effective charge Q_eff defined on the same grid as the field response. The method preserves sub‑grid spatial fidelity without requiring a dense sampling of the whole detector volume, and the number of quadrature points can be tuned to trade accuracy for speed.
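A minimal sketch of the quadrature idea, using NumPy rather than the paper's actual TRED/PyTorch code. For simplicity it integrates only the charge density over one cuboidal element (the paper additionally folds in the Green's function); the density, cell bounds, and quadrature order below are illustrative assumptions, not values from the paper.

```python
import numpy as np
from math import erf, sqrt

def gauss_legendre_cell_charge(lo, hi, rho, order=8):
    """Tensor-product Gauss-Legendre integral of rho(x, y, z) over the cuboid [lo, hi]."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    axes, wts = [], []
    for a, b in zip(lo, hi):
        # Map the reference nodes/weights from [-1, 1] onto [a, b].
        axes.append(0.5 * (b - a) * nodes + 0.5 * (b + a))
        wts.append(0.5 * (b - a) * weights)
    X, Y, Z = np.meshgrid(axes[0], axes[1], axes[2], indexing="ij")
    W = wts[0][:, None, None] * wts[1][None, :, None] * wts[2][None, None, :]
    return np.sum(W * rho(X, Y, Z))

# Illustrative charge cloud: unit-charge isotropic Gaussian (sigma in grid units).
sigma = 0.5
def rho(x, y, z):
    norm = (2.0 * np.pi * sigma**2) ** -1.5
    return norm * np.exp(-(x**2 + y**2 + z**2) / (2.0 * sigma**2))

q_eff = gauss_legendre_cell_charge([-1, -1, -1], [1, 1, 1], rho)
# Analytic cross-check: the 3-D integral factorizes into 1-D Gaussian integrals.
exact = erf(1.0 / (sigma * sqrt(2.0))) ** 3
```

A handful of nodes per axis captures the smooth sub-grid charge profile to high accuracy, which is the trade-off the paper exposes through the quadrature order.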

Second, the authors introduce a block‑sparse tensor data structure. The detector volume is tiled into fixed‑size blocks; blocks that share the same spatial‑temporal coordinates are summed to produce a set of unique blocks. Each block carries its effective charge values and a list of neighboring blocks needed for trilinear interpolation of the field response. Because the charge is sparse—only a tiny fraction of blocks contain non‑zero Q_eff—the convolution of charge with the field response can be performed using FFTs on a per‑block basis rather than on a global dense grid. This dramatically reduces memory consumption (to a few percent of the full dense grid) and enables batched execution on GPUs.
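The block coalescing and per-block convolution can be sketched as follows, again in NumPy as a portable stand-in for the paper's GPU tensors. Block size, kernel, and grid shapes are made up for illustration; the key property exercised is that zero-padded per-block convolutions, scattered back at their block origins, sum to the same result as one convolution over the dense grid.

```python
import numpy as np

BLOCK = 4  # block edge length (illustrative)

def coalesce(blocks):
    """Sum charge blocks that share the same block-grid coordinate."""
    unique = {}
    for coord, q in blocks:
        unique[coord] = unique.get(coord, 0) + q
    return unique

def convolve_block(q, kernel):
    """Linear (zero-padded) FFT convolution of one charge block with the response."""
    shape = tuple(s + k - 1 for s, k in zip(q.shape, kernel.shape))
    return np.real(np.fft.ifftn(np.fft.fftn(q, shape) * np.fft.fftn(kernel, shape)))

rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 3))                    # stand-in field response
blocks = [((0, 0, 0), rng.normal(size=(BLOCK,) * 3)),
          ((0, 0, 0), rng.normal(size=(BLOCK,) * 3)),  # duplicate coordinate -> summed
          ((2, 0, 0), rng.normal(size=(BLOCK,) * 3))]

dense_q = np.zeros((16, 16, 16))       # dense reference grid
out_sparse = np.zeros((18, 18, 18))    # "full" convolution output size
for (bx, by, bz), q in coalesce(blocks).items():
    ox, oy, oz = bx * BLOCK, by * BLOCK, bz * BLOCK
    dense_q[ox:ox + BLOCK, oy:oy + BLOCK, oz:oz + BLOCK] += q
    # Scatter the per-block convolution at the block origin.
    out_sparse[ox:ox + BLOCK + 2, oy:oy + BLOCK + 2, oz:oz + BLOCK + 2] += \
        convolve_block(q, kernel)

# Reference: one big zero-padded FFT convolution over the dense grid.
matches = np.allclose(out_sparse, convolve_block(dense_q, kernel), atol=1e-8)
```

Since convolution is linear, only the occupied blocks ever need to be transformed, which is where the memory savings for sparse events come from.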

Implementation leverages the PyTorch ecosystem, using its tensor abstractions, automatic differentiation, and CUDA stream support. Custom kernels handle block indexing, weight application, and batched FFTs, overlapping data movement and computation to keep the GPU fully occupied. Benchmarks on modern NVIDIA GPUs show an average 12× speed‑up over a highly optimized CPU reference, while memory usage drops by a factor of 8–10 compared with a naïve dense‑GPU implementation. The framework scales linearly with the number of active blocks, and the quadrature order can be increased with only linear cost growth, allowing users to adjust precision for specific physics studies.
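The batching step can be illustrated with a small sketch: stack the unique charge blocks into one `(N, B, B, B)` tensor and run a single zero-padded real FFT over the trailing axes, transforming the response kernel once. This uses NumPy's FFT for portability, where the paper would use `torch.fft` on CUDA; all shapes are illustrative.

```python
import numpy as np

blocks = np.random.default_rng(1).normal(size=(32, 4, 4, 4))  # 32 active blocks
kernel = np.random.default_rng(2).normal(size=(3, 3, 3))      # stand-in response

# Zero-pad to the linear-convolution size, then transform all blocks at once.
pad = tuple(b + k - 1 for b, k in zip(blocks.shape[1:], kernel.shape))
Fq = np.fft.rfftn(blocks, s=pad, axes=(1, 2, 3))  # one batched transform
Fk = np.fft.rfftn(kernel, s=pad)                  # kernel transformed once
signals = np.fft.irfftn(Fq * Fk, s=pad, axes=(1, 2, 3))

# Cross-check one block against a direct per-block convolution.
ref = np.real(np.fft.ifftn(np.fft.fftn(blocks[0], pad) * np.fft.fftn(kernel, pad)))
matches = np.allclose(signals[0], ref, atol=1e-8)
```

Because every block has the same padded shape, the whole workload is one batched FFT, one broadcast multiply, and one batched inverse FFT, which maps directly onto GPU execution.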

The paper also discusses limitations and future work. Selecting the optimal block size and sparsity threshold currently requires manual tuning; an adaptive or auto‑tuning scheme would improve portability across hardware. The present model assumes a static effective‑charge distribution; incorporating dynamic charge redistribution (e.g., after recombination or space‑charge effects) would require additional physics modules. FFT‑based convolution assumes periodic boundary conditions, so edge effects at detector boundaries need separate correction.
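The periodic-boundary caveat is easy to demonstrate in one dimension: a plain FFT convolution is circular, so signal induced near one edge of the grid wraps around to the opposite edge, whereas zero-padding to length `len(q) + len(h) - 1` recovers the linear (non-periodic) result. The toy signal and kernel below are illustrative.

```python
import numpy as np

q = np.zeros(8)
q[7] = 1.0                      # charge at the last grid point
h = np.array([1.0, 0.5, 0.25])  # toy response kernel

# Circular convolution on the original grid: the tail wraps to index 0.
circular = np.real(np.fft.ifft(np.fft.fft(q) * np.fft.fft(h, 8)))
# Zero-padded to length 10: a true linear convolution, no wrap-around.
padded = np.real(np.fft.ifft(np.fft.fft(q, 10) * np.fft.fft(h, 10)))

wraps = abs(circular[0]) > 1e-12                               # artifact present
clean = abs(padded[0]) < 1e-12 and abs(padded[8] - 0.5) < 1e-12  # tail preserved
```

Zero-padding handles wrap-around within a block, but physical boundary effects at the detector edges still need the separate correction the authors mention.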

In summary, TRED demonstrates that a combination of high‑order quadrature for accurate sub‑grid charge representation and a block‑sparse tensor layout for memory‑efficient FFT‑based convolution can deliver high‑fidelity, scalable simulations of large LArTPC detectors on GPUs. The methodology is not limited to DUNE ND‑LAr; it can be extended to the far detector, other liquid‑argon experiments, and any computational problem involving sparse activity, such as sparse sub‑manifold convolutions in machine learning.

