Machine learning in top quark physics at ATLAS and CMS
This note presents an overview of current and potential future applications of machine-learning-based techniques in the study of the top quark. The research community has developed a diverse set of ideas and tools, including algorithms for the efficient reconstruction of recorded collision events and innovative methods for statistical inference. Recent applications of some techniques by the ATLAS and CMS collaborations are also highlighted.
Research Summary
This paper provides a comprehensive overview of how machine-learning (ML) techniques are currently employed, and could be employed in the future, in top-quark physics at the ATLAS and CMS experiments. The introduction emphasizes that ML has been a driving force for over a decade, enabling key milestones such as the single-top discovery, improvements in b-jet identification, and the observation of four-top production.
The reconstruction section is split into two tasks. First, the inference of the neutrino momentum in semileptonic top decays is tackled by the ν-Flow method, which uses a conditional normalizing-flow neural network to learn an invertible mapping between a three-dimensional Gaussian base distribution and the true neutrino vector, conditioned on the observed event. Sampling from this learned density yields a likelihood over possible neutrino solutions, outperforming traditional feed-forward regressors and simple W-mass constraints. Second, the assignment of the remaining decay products to the correct top quark is addressed by several approaches. SPANET employs a transformer-based architecture with more than 10 million parameters and auxiliary targets (neutrino regression and signal-background discrimination) to achieve state-of-the-art performance. The HYPER method represents decay products as hypergraphs, whose edges can connect more than two nodes; despite using only 345k parameters, its performance rivals that of SPANET.
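The sketch below illustrates the conditional normalizing-flow idea in the spirit of ν-Flow, but it is not the authors' implementation: a stack of conditional affine coupling layers maps a 3D standard Gaussian to the neutrino momentum, conditioned on event-level observables. The network sizes, the number of conditioning features (`N_COND`), and the training snippet in the final comment are illustrative assumptions.

```python
# Minimal conditional normalizing flow for neutrino-momentum inference (sketch only).
import torch
import torch.nn as nn

N_COND = 16   # assumed number of conditioning features (lepton, jets, MET, ...)
DIM = 3       # neutrino (px, py, pz)

class ConditionalCoupling(nn.Module):
    """Affine coupling layer: transforms the unmasked components of x given the
    masked components and the event-level conditioning vector."""
    def __init__(self, mask):
        super().__init__()
        self.register_buffer("mask", mask)           # 1 = pass-through, 0 = transform
        self.net = nn.Sequential(
            nn.Linear(DIM + N_COND, 64), nn.ReLU(),
            nn.Linear(64, 2 * DIM),                   # predicts shift and log-scale
        )

    def forward(self, x, cond):
        """x -> z direction (used for the log-likelihood)."""
        h = self.net(torch.cat([x * self.mask, cond], dim=-1))
        shift, log_scale = h.chunk(2, dim=-1)
        log_scale = torch.tanh(log_scale)             # keep scales numerically tame
        z = x * self.mask + (1 - self.mask) * (x * torch.exp(log_scale) + shift)
        log_det = ((1 - self.mask) * log_scale).sum(dim=-1)
        return z, log_det

    def inverse(self, z, cond):
        """z -> x direction (used for sampling neutrino candidates)."""
        h = self.net(torch.cat([z * self.mask, cond], dim=-1))
        shift, log_scale = h.chunk(2, dim=-1)
        log_scale = torch.tanh(log_scale)
        return z * self.mask + (1 - self.mask) * (z - shift) * torch.exp(-log_scale)

class NeutrinoFlow(nn.Module):
    def __init__(self, n_layers=4):
        super().__init__()
        masks = [torch.tensor([1., 0., 1.]) if i % 2 else torch.tensor([0., 1., 0.])
                 for i in range(n_layers)]
        self.layers = nn.ModuleList(ConditionalCoupling(m) for m in masks)
        self.base = torch.distributions.Normal(0., 1.)

    def log_prob(self, nu, cond):
        z, total = nu, torch.zeros(nu.shape[0], device=nu.device)
        for layer in self.layers:
            z, log_det = layer(z, cond)
            total = total + log_det
        return self.base.log_prob(z).sum(dim=-1) + total

    def sample(self, cond, n_samples=100):
        """Draw candidate neutrino momenta for each event."""
        cond = cond.repeat_interleave(n_samples, dim=0)
        z = self.base.sample((cond.shape[0], DIM))
        for layer in reversed(self.layers):
            z = layer.inverse(z, cond)
        return z.reshape(-1, n_samples, DIM)

# Training would minimize the negative log-likelihood of the generator-level
# neutrino momentum given the reconstructed event:
#   loss = -flow.log_prob(nu_true, event_features).mean()
```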
In the analysis-strategy part, the paper discusses the difficulty of estimating QCD multijet backgrounds. The classic ABCD matrix method is automated by the DISC technique, which trains a classifier with an added penalty term that suppresses correlations between the classifier score and the two independent observables, thereby preserving the ABCD assumption. A recent CMS all-hadronic four-top search employed an autoregressive normalizing flow to map events from a background-enriched region into the signal region, providing a data-driven background estimate.
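As a reminder of what the decorrelation penalty is meant to protect, here is a minimal toy of the classic ABCD estimate (not taken from the paper): two observables split the plane into four regions, and if they are independent for the background, the signal-region yield A is predicted from the control regions as A = B·C/D. The cut values and toy distributions below are purely illustrative.

```python
# Toy ABCD background estimate; the DISC-style penalty keeps this factorization
# valid after cutting on a learned classifier score.
import numpy as np

rng = np.random.default_rng(0)
# Toy background with x and y independent by construction (the ABCD assumption).
x = rng.exponential(scale=1.0, size=100_000)   # e.g. classifier-score proxy
y = rng.exponential(scale=1.0, size=100_000)   # e.g. second, uncorrelated observable

x_cut, y_cut = 2.0, 2.0
A = np.sum((x > x_cut) & (y > y_cut))          # signal region (kept blind in data)
B = np.sum((x > x_cut) & (y < y_cut))
C = np.sum((x < x_cut) & (y > y_cut))
D = np.sum((x < x_cut) & (y < y_cut))

A_pred = B * C / D
print(f"observed A = {A}, ABCD prediction = {A_pred:.1f}")
```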
The statistical-inference section moves beyond binned likelihoods. Likelihood-free (simulation-based) inference uses the classifier output $s$ as a direct test statistic, exploiting the relation $L_1/L_0 = s/(1-s)$. Tools such as INFERNO and SALLY implement this idea while propagating systematic uncertainties. Unfolding, an intrinsically ill-posed inverse problem, is tackled by OMNIFOLD, which iteratively trains a classifier to reweight simulated events toward data, allowing unbinned, multidimensional unfolding. The method has been demonstrated by ATLAS (average jet mass versus jet $p_T$ in Drell-Yan events) and CMS (charged-constituent multiplicity in minimum-bias events).
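The following toy (not INFERNO, SALLY, or OMNIFOLD themselves) illustrates the likelihood-ratio trick quoted above: a classifier trained with balanced samples approaches $s = L_1/(L_0+L_1)$, so $L_1/L_0$ is recovered as $s/(1-s)$ and can serve as a per-event test statistic. The one-dimensional Gaussian hypotheses and network settings are illustrative assumptions.

```python
# Likelihood-ratio trick on a 1D toy problem.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
x0 = rng.normal(0.0, 1.0, size=(50_000, 1))    # hypothesis H0
x1 = rng.normal(0.5, 1.0, size=(50_000, 1))    # hypothesis H1
X = np.vstack([x0, x1])
y = np.concatenate([np.zeros(len(x0)), np.ones(len(x1))])

clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=50).fit(X, y)

x_test = np.array([[0.0], [1.0], [2.0]])
s = clf.predict_proba(x_test)[:, 1]
ratio_ml = s / (1.0 - s)                        # learned L1/L0

# Exact ratio of the two unit-width Gaussians for comparison:
ratio_true = np.exp(-0.5 * ((x_test - 0.5) ** 2 - x_test ** 2)).ravel()
print(np.column_stack([ratio_ml, ratio_true]))
```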
Looking toward the High-Luminosity LHC, the paper highlights the growing computational burden of large data sets. CMS introduced the DCTR (neural reweighting) technique to emulate parameter variations, such as the POWHEG hdamp parameter, or to upgrade NLO samples to NNLO accuracy without generating new Monte Carlo samples. This approach promises substantial sustainability gains by reducing the need for full detector simulations.
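A DCTR-style reweighting reuses the same classifier-ratio trick as the previous sketch: train a classifier between the nominal sample and a sample generated with an alternative parameter value, then weight each nominal event by $s/(1-s)$ to emulate the variation without regenerating or re-simulating events. This is a toy illustration, not the CMS implementation; the 1D "observable" and the two toy distributions stand in for the full event-level feature vector and for nominal versus varied generator settings.

```python
# Toy DCTR-style neural reweighting between two generator settings.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
nominal = rng.normal(0.0, 1.0, size=(100_000, 1))   # stand-in for nominal hdamp
varied  = rng.normal(0.0, 1.2, size=(100_000, 1))   # stand-in for alternative hdamp

X = np.vstack([nominal, varied])
y = np.concatenate([np.zeros(len(nominal)), np.ones(len(varied))])
clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=50).fit(X, y)

s = clf.predict_proba(nominal)[:, 1]
weights = s / (1.0 - s)                              # per-event reweighting factors

# The weighted nominal sample should now reproduce the varied distribution:
mean_w = np.average(nominal.ravel(), weights=weights)
std_w = np.sqrt(np.average(nominal.ravel() ** 2, weights=weights) - mean_w ** 2)
print("varied std     :", varied.std())
print("reweighted std :", std_w)
```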
The conclusion reiterates that ML has become indispensable across top-quark reconstruction, background estimation, and statistical inference, and that continued development of efficient, uncertainty-aware algorithms will be crucial for the precision era of the HL-LHC. The reference list provides a solid bibliography covering b-jet tagging, ν-Flow, SPANET, HYPER, DISC, INFERNO, SALLY, OMNIFOLD, and DCTR.