Optimised neural networks for online processing of ATLAS calorimeter data on FPGAs
A study of neural network architectures for the reconstruction of the energy deposited in the cells of the ATLAS liquid-argon calorimeters under high pile-up conditions expected at the HL-LHC is presented. These networks are designed to run on the FPGA-based readout hardware of the calorimeters under strict size and latency constraints. Several architectures, including Dense, Recurrent (RNN), and Convolutional (CNN) neural networks, are optimised using a Bayesian procedure that balances energy resolution against network size. The optimised Dense, CNN, and combined Dense+RNN architectures achieve a transverse energy resolution of approximately 80 MeV, outperforming both the optimal filtering (OF) method currently in use and RNNs of similar complexity. A detailed comparison across the full dynamic range shows that Dense, CNN, and Dense+RNN accurately reproduce the energy scale, while OF and RNNs underestimate the energy. Deep Evidential Regression is implemented within the Dense architecture to address the need for reliable per-event energy uncertainties. This approach provides predictive uncertainty estimates with minimal increase in network size. The predicted uncertainty is found to be consistent, on average, with the difference between the true deposited energy and the predicted energy.
💡 Research Summary
This paper addresses the challenge of reconstructing the energy deposited in individual cells of the ATLAS liquid‑argon (LAr) calorimeter under the extreme pile‑up conditions expected at the High‑Luminosity LHC (HL‑LHC). In the upgraded Phase‑II read‑out architecture, each LAr Signal Processor (LASP) board will host two Intel Agilex 7 FPGAs, each responsible for computing the energy of 384 calorimeter cells within a strict latency budget of 125 ns. The authors therefore develop lightweight neural‑network (NN) models that can be deployed on these FPGAs while meeting stringent size (≤ 500 multiply‑accumulate (MAC) operations) and latency constraints.
A realistic simulation framework is built using the AREUS toolkit. A representative cell (η = 0, φ = 0) is populated with hard‑scatter energy deposits uniformly distributed between 0 and 130 GeV, overlaid on a background of 200 simultaneous proton‑proton interactions (⟨µ⟩ = 200) to emulate the worst‑case pile‑up. Digitised waveforms are sampled at 40 MHz, and up to 28 samples (pre‑deposit and post‑deposit) are provided as inputs to the networks. The optimal‑filtering (OF) algorithm, currently used in ATLAS, relies on only five post‑deposit samples and serves as a baseline.
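As a concrete point of reference, the OF baseline can be sketched in a few lines. This is a minimal, single‑constraint variant (amplitude normalisation only) with an illustrative pulse shape and white noise; the production ATLAS filter additionally subtracts pedestals and constrains the pulse timing.

```python
import numpy as np

def of_coefficients(g, C):
    """Optimal-filtering weights: minimise the noise variance a^T C a
    subject to the amplitude constraint a . g = 1. (The ATLAS version
    adds a second constraint on the pulse-shape derivative for timing.)"""
    Cinv_g = np.linalg.solve(C, g)
    return Cinv_g / (g @ Cinv_g)

# Toy example: a 5-sample pulse shape and white (identity) noise.
g = np.array([0.2, 0.6, 1.0, 0.7, 0.3])   # illustrative normalised pulse
C = np.eye(5)                              # noise autocorrelation matrix
a = of_coefficients(g, C)

samples = 3.0 * g                          # noiseless deposit of amplitude 3
amplitude = a @ samples                    # linear OF estimate
```

The linearity of the estimate is what keeps OF cheap (five MACs per cell) but also what limits it when overlapping pile‑up pulses distort the sampled shape.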
Four NN architectures are investigated:
- Staged Dense (Dense) – a purely fully‑connected design split into three stages (pre‑dense, main‑dense, final‑dense). Pre‑deposit samples are first processed to correct for pulse distortion, then combined with post‑deposit samples to predict the transverse energy. This model uses 89 trainable parameters and 368 MACs.
- Convolutional Neural Network (CNN) – consists of two convolutional layers (five 1‑D filters of size 7, followed by six 2‑D filters of size 11 × 5) and a final 9 × 6 filter that is mathematically equivalent to a single dense neuron. ReLU activations are applied throughout. By re‑using intermediate convolution results across successive bunch crossings, the effective MAC count is reduced to 419.
- Recurrent Neural Network (RNN) – follows the architecture of a previous ATLAS study, employing five vanilla RNN cells (each with an 8‑dimensional hidden state) followed by a dense output layer. Although compact (161 parameters, 240 MACs), the RNN suffers from a computational cost that scales with the square of the hidden dimension and the number of input samples, leading to poorer energy resolution.
- Dense + RNN – a hybrid approach where a dense block first processes pre‑deposit samples to initialise the hidden state of an RNN that then processes the post‑deposit samples. A final dense neuron maps the last RNN output to the energy estimate. This design retains the temporal modelling benefits of RNNs while dramatically cutting the number of required MACs (392) and parameters (392).
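The Dense + RNN hybrid can be sketched as a plain NumPy forward pass. All dimensions and weights below are illustrative placeholders (random, untrained), not the published configuration; the point is the data flow: pre‑deposit samples seed the hidden state, post‑deposit samples are consumed recurrently, and a single neuron reads out the energy.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PRE, N_POST, H = 8, 20, 8   # illustrative sample split and hidden size

# Dense block: pre-deposit samples -> initial hidden state (ReLU).
W_pre, b_pre = rng.normal(size=(H, N_PRE)), np.zeros(H)
# Vanilla RNN cell over the post-deposit samples (one scalar per step).
W_in, W_h, b_h = rng.normal(size=H), 0.1 * rng.normal(size=(H, H)), np.zeros(H)
# Final dense neuron: last hidden state -> transverse-energy estimate.
w_out, b_out = rng.normal(size=H), 0.0

def dense_rnn(pre, post):
    h = np.maximum(W_pre @ pre + b_pre, 0.0)   # dense block initialises state
    for s in post:                              # recurrent pass over samples
        h = np.tanh(s * W_in + W_h @ h + b_h)
    return w_out @ h + b_out

e_t = dense_rnn(rng.normal(size=N_PRE), rng.normal(size=N_POST))
n_params = W_pre.size + b_pre.size + W_in.size + W_h.size + b_h.size + w_out.size + 1
```

Because the dense block runs once per deposit while the RNN cell is re‑used across samples, the MAC count stays far below that of a fully unrolled recurrent stack.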
Hyper‑parameter optimisation is performed via Bayesian optimisation. The objective function balances the energy resolution (σ_E/E) against network size (parameter count or MACs). A Gaussian‑process surrogate with a 5/2 Matérn kernel is iteratively refined over 30–100 evaluations, using Expected Improvement to select new trial points. Optimised hyper‑parameters include the number of pre‑deposit samples, hidden dimensions, and, for the CNN, filter counts and kernel sizes.
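The optimisation loop described above can be sketched with scikit‑learn's Gaussian process and a hand‑rolled Expected‑Improvement acquisition. The objective here is a hypothetical smooth stand‑in for the real resolution‑vs‑size trade‑off (which requires training a network per trial point); everything else mirrors the procedure: Matérn‑5/2 surrogate, EI‑driven selection, iterative refinement.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    """Hypothetical stand-in objective with its minimum near x = 0.275."""
    return (x - 0.3) ** 2 + 0.05 * x

X = np.array([[0.0], [0.5], [1.0]])                 # initial design points
y = np.array([objective(x[0]) for x in X])
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
cand = np.linspace(0.0, 1.0, 201).reshape(-1, 1)    # candidate trial points

for _ in range(20):                                  # BO iterations
    gp.fit(X, y)
    mu, sigma = gp.predict(cand, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sigma, 1e-12)
    ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)  # Expected Improvement
    x_next = cand[np.argmax(ei)]                     # most promising trial
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

x_best = X[np.argmin(y)][0]
```

In the paper's setting each `objective` evaluation is expensive (a full training run), which is exactly why a sample‑efficient surrogate is preferred over grid or random search.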
Performance is evaluated on an independent set of 13 million events. The Dense, CNN, and Dense + RNN models all achieve a transverse‑energy resolution of ≈ 80 MeV across the full 0–130 GeV range, outperforming the OF algorithm (≈ 120 MeV) and the RNN of comparable size. Moreover, the three best models reproduce the true energy scale without systematic bias, whereas OF and the plain RNN systematically underestimate the energy, especially at low energies (< 5 GeV). The latency of each design, measured after synthesis for the Agilex 7, stays below 110 ns, comfortably within the 125 ns budget.
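The two headline metrics, resolution and energy‑scale bias per energy bin, amount to simple binned statistics. The sketch below uses toy Gaussian smearing of 80 MeV in place of actual network predictions; bin edges and the spectrum are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
e_true = rng.uniform(0.13, 130.0, size=100_000)             # GeV, toy spectrum
e_pred = e_true + rng.normal(0.0, 0.080, size=e_true.size)  # 80 MeV toy smearing

def resolution_and_scale(e_true, e_pred, bins):
    """Per-bin resolution (std of E_pred - E_true) and scale (mean E_pred/E_true)."""
    idx = np.digitize(e_true, bins) - 1
    res = np.array([np.std((e_pred - e_true)[idx == i]) for i in range(len(bins) - 1)])
    scale = np.array([np.mean((e_pred / e_true)[idx == i]) for i in range(len(bins) - 1)])
    return res, scale

bins = np.array([0.13, 1.0, 5.0, 20.0, 60.0, 130.0])
res, scale = resolution_and_scale(e_true, e_pred, bins)
```

A scale consistently below 1 in the low‑energy bins is the signature of the underestimation reported for OF and the plain RNN.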
To provide per‑event uncertainty estimates, the authors embed Deep Evidential Regression (DER) into the Dense architecture. The network outputs four evidential parameters (α, β, ν, λ) that define a Normal‑Inverse‑Gamma distribution over the predicted energy. Training combines a negative‑log‑likelihood term with a regularisation encouraging calibrated uncertainties. The resulting predictive uncertainties correlate well with the absolute prediction error, and the additional parameter overhead is less than 5 % of the base model.
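The evidential loss has a closed form. The sketch below follows the common Normal‑Inverse‑Gamma parameterisation of Deep Evidential Regression (mean γ, evidence ν, shape α, scale β; the summary's λ corresponds to one of these evidence parameters), with the standard evidence‑weighted error regulariser and the resulting predictive uncertainty.

```python
import numpy as np
from scipy.special import gammaln

def nig_nll(y, gamma, nu, alpha, beta):
    """Negative log-likelihood of the Normal-Inverse-Gamma evidential
    distribution used in Deep Evidential Regression."""
    omega = 2.0 * beta * (1.0 + nu)
    return (0.5 * np.log(np.pi / nu)
            - alpha * np.log(omega)
            + (alpha + 0.5) * np.log(nu * (y - gamma) ** 2 + omega)
            + gammaln(alpha) - gammaln(alpha + 0.5))

def nig_regulariser(y, gamma, nu, alpha):
    """Penalise confident (high-evidence) predictions that miss the target."""
    return np.abs(y - gamma) * (2.0 * nu + alpha)

def predictive_uncertainty(nu, alpha, beta):
    """Sqrt of the total predictive variance beta * (1 + nu) / (nu * (alpha - 1))."""
    return np.sqrt(beta * (1.0 + nu) / (nu * (alpha - 1.0)))
```

Only the output head changes relative to the plain Dense network (one neuron becomes four, with positivity constraints on ν, β and α − 1), which is why the parameter overhead stays below 5 %.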
Implementation details show that the CNN’s sliding‑window approach and the Dense + RNN’s pre‑computed dense block enable efficient pipelining on the FPGA, with resource utilisation (DSPs, BRAM) well below the device limits. The authors estimate that each LASP FPGA can host 384 independent instances of the chosen model, satisfying the required throughput.
In summary, the paper demonstrates that carefully optimised, lightweight neural networks can replace the traditional optimal‑filtering algorithm for LAr calorimeter energy reconstruction in the HL‑LHC era. By leveraging Bayesian hyper‑parameter optimisation, convolutional feature extraction, and hybrid dense‑recurrent designs, the authors achieve superior resolution, unbiased energy scale, and real‑time operation on FPGA hardware. The integration of evidential regression further equips the trigger system with reliable per‑event uncertainty information, opening the door to more sophisticated, uncertainty‑aware trigger decisions in future ATLAS data‑taking.