Optimising the FRB Search Pipeline for the Northern Cross Radio Telescope

Hayley Camilleri^a, Alessio Magro^a, Andrea Geminardi^b,c,d, Giovanni Naldi^e, Gianni Bernardi^e, Luca Bruno^e, Valentina Cesare^e, Francesco Fiori^e, Davide Pelliciari^e, Maura Pilia^d, Matteo Trudu^d

^a Institute of Space Sciences and Astronomy (ISSA), University of Malta, Msida, Malta
^b Scuola Universitaria Superiore IUSS of Pavia, Pavia, Italy
^c University of Trento, Department of Physics, Povo, Italy
^d Istituto Nazionale di Astrofisica (INAF), Osservatorio Astronomico di Cagliari, I-09047 Selargius (Cagliari), Italy
^e Istituto Nazionale di Astrofisica (INAF), Istituto di Radio Astronomia, I-40129 Bologna, Italy

Abstract

Fast Radio Burst (FRB) search pipelines are being developed to operate under strict real-time constraints while maintaining sensitivity to short-duration transient signals. In pipelines based on incoherent dedispersion, such as Heimdall, detection performance and computational throughput depend strongly not only on the observation bandwidth and number of beams but also on the choice of processing parameters, which are often selected heuristically. In this work, we present a systematic evaluation of key dedispersion and matched-filtering parameters and quantify their impact on both detection accuracy and runtime performance. A controlled synthetic injection framework is developed in which artificial FRB pulses with known dispersion measures (DMs), signal-to-noise ratios (SNRs), and pulse widths are embedded into realistic filterbank data containing instrumental noise representative of observations from the Northern Cross radio telescope. Using this framework, a grid of Heimdall configurations is explored, spanning DM tolerance, boxcar filter width, and processing gulp size. Detection performance is assessed by comparing recovered and injected signal properties, while computational performance is evaluated through end-to-end processing time measurements.
The results reveal clear trade-offs between sensitivity and throughput across parameter choices. We identify an empirically optimal configuration that provides reliable burst recovery while maintaining processing speeds exceeding real-time requirements. While the specific optimal parameters are derived for the Northern Cross, the methodology and findings are broadly applicable to any real-time transient detection pipeline employing matched filtering and dedispersion, and are particularly relevant for low-frequency radio telescopes with similar observing configurations. These findings demonstrate the value of data-driven parameter evaluation for improving the performance of real-time transient detection pipelines.

Keywords: Radio Astronomy, Radio Telescopes, Fast Radio Bursts (FRBs)

1. Introduction

Fast Radio Bursts (FRBs) are bright, millisecond-duration radio transients, first identified as a distinct phenomenon through the discovery of a highly dispersed burst in archival pulsar survey data [1] and subsequently established as a population through multiple detections at cosmological distances [2]. Their dispersion measures (DMs) frequently exceed the expected Galactic contribution, indicating an extragalactic origin and enabling the use of FRBs as probes of ionised baryons along the line of sight [3]. Beyond their utility as probes of cosmology and the intergalactic medium, FRBs exhibit a wide diversity in temporal and spectral structure, including complex sub-burst morphology, strong polarisation, and, in some cases, repeated bursts, motivating extensive observational campaigns and rapid follow-up strategies across the electromagnetic spectrum [4].

Within this landscape, the Northern Cross radio telescope represents a compelling case study for pipeline development and optimisation [5].
The Northern Cross operates in the 400–416 MHz band and has a large collecting area and wide field of view, making it well suited to FRB searches at low frequencies [6, 7]. At these frequencies, dispersion delays across the band are substantial for high-DM events (a burst with DM = 500 pc cm⁻³ is swept across the band in about 1 s), and pulse broadening effects can be significant, increasing the importance of carefully tuned dedispersion and matched-filtering settings. Currently, the instrument is undergoing a major refurbishment and digital upgrade, including the deployment of a modern infrastructure for acquisition and processing and the installation of High-Performance Computing (HPC) resources designed to support full-instrument FRB searches in real time [7, 8]. This upgrade motivates the need for principled, data-driven evaluation of transient search pipeline configurations, ensuring that the upgraded system can achieve reliable burst recovery while meeting real-time throughput constraints.

Heimdall is a widely used GPU-accelerated single-pulse search tool that implements brute-force incoherent dedispersion and boxcar-based matched filtering [9]. It is used in several transient search backends due to its flexibility and suitability for real-time processing [10, 11]. However, Heimdall's performance is strongly influenced by user-defined parameters such as DM tolerance and boxcar filter widths. In practice, these parameters are often selected heuristically, or inherited from previous surveys, with limited quantitative assessment of their impact on detection accuracy or computational efficiency.

The lack of systematic evaluation of pipeline parameter choices presents a challenge for real-time transient searches, particularly for telescopes operating under hardware or latency constraints. Suboptimal configurations may lead to unnecessary computational overhead, reduced sensitivity to specific classes of bursts, or increased false-positive rates.
Previous studies have examined search sensitivities and aspects of algorithmic performance in FRB surveys [12, 13]. Further work has also explored alternative dedispersion strategies and candidate classification methods, including machine-learning-based approaches [14, 15, 16]. However, relatively little attention has been devoted to comprehensive, multi-metric evaluation of parameter-level trade-offs within established incoherent dedispersion pipelines, particularly in the context of jointly optimising detection fidelity and real-time computational performance for specific instrument configurations.

In this work, we present a systematic, data-driven evaluation of key processing parameters in an incoherent-dedispersion-based FRB search pipeline using Heimdall for the Northern Cross telescope. We employ a controlled synthetic injection framework in which artificial FRB signals with known properties are embedded into real filterbank data. This approach enables direct, quantitative comparison between injected and recovered burst properties, allowing both detection accuracy and runtime performance to be assessed across a grid of parameter configurations.

The goals of this study are twofold: first, to characterise the trade-offs between sensitivity and computational throughput associated with commonly used Heimdall parameters; and second, to identify empirically optimal configurations that satisfy real-time processing requirements while maintaining transient recovery. Although the experiments are motivated by the Northern Cross telescope, the methodology and conclusions are broadly applicable to incoherent dedispersion pipelines used in contemporary radio transient searches. The results presented here should therefore be interpreted both as instrument-specific recommendations and as a general demonstration of how systematic parameter evaluation can inform pipeline optimisation for transient searches based on incoherent dedispersion.
The remainder of this paper is organised as follows. Section 2 provides theoretical background on signal dispersion and propagation effects relevant to FRB detection, along with an overview of contemporary real-time FRB detection systems. Section 3 describes the Northern Cross radio telescope and its observing configuration. Section 4 introduces the Heimdall pipeline and defines the parameter space explored in this study, covering the dedispersion and matched-filtering framework, the specific parameter combinations evaluated, and the computational environment used. Section 5 presents the synthetic injection framework and evaluation methodology, including the signal generation procedure, detection accuracy and runtime metrics, and the statistical analysis approach comprising dimensionality reduction, unsupervised clustering, and non-parametric hypothesis testing. Section 6 reports the results across the full parameter grid, covering detection accuracy, runtime scaling, statistical comparisons, identification of the empirically optimal configuration, and the cluster structure of the performance space. Section 7 discusses the broader implications of the findings, and Section 8 summarises the conclusions and outlines directions for future work.

2. Theoretical Background

2.1. Dispersion and Propagation Effects

Free electrons in the interstellar medium (ISM) and the intergalactic medium disperse radio signals, and this dispersion is a distinguishing property of FRBs. The resulting frequency-dependent delay is the defining characteristic used in detection. The delay between two frequencies is given by

\[
\Delta t = 4.15 \times 10^{6}\,\mathrm{ms} \times \left[ \left(\frac{\nu_1}{\mathrm{MHz}}\right)^{-2} - \left(\frac{\nu_2}{\mathrm{MHz}}\right)^{-2} \right] \times \left( \frac{\mathrm{DM}}{\mathrm{pc\,cm^{-3}}} \right), \tag{1}
\]

where ν₁ and ν₂ are frequencies in MHz and DM is the integrated column density of free electrons along the line of sight [17].
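As a quick numerical check of the dispersion delay, the following minimal Python sketch evaluates the cold-plasma delay for the Northern Cross band. It assumes the standard dispersion constant for frequencies in MHz, 4.15 × 10⁶ ms (equivalently 4.15 × 10³ s); the function and variable names are illustrative and do not come from any pipeline code.

```python
# Dispersion constant for frequencies in MHz and DM in pc cm^-3.
K_DM_MS = 4.15e6  # ms MHz^2 pc^-1 cm^3 (equivalently 4.15e3 s)


def dispersion_delay_ms(dm, f_lo_mhz, f_hi_mhz):
    """Arrival-time delay (ms) at f_lo_mhz relative to f_hi_mhz."""
    return K_DM_MS * (f_lo_mhz ** -2 - f_hi_mhz ** -2) * dm


# Band edges quoted for the Northern Cross, with DM = 500 pc cm^-3.
sweep_ms = dispersion_delay_ms(500.0, 400.0, 416.0)
print(f"Sweep across 400-416 MHz: {sweep_ms:.0f} ms")
```

The result is just under one second, consistent with the roughly 1 s sweep quoted in the Introduction for a DM = 500 pc cm⁻³ burst.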
The dispersion measure itself is given by

\[
\mathrm{DM} = \int_{0}^{d} n_e(l)\,\mathrm{d}l, \tag{2}
\]

where n_e(l) is the electron density at a distance l along the line of sight and d is the distance from the Earth to the source [18].

In addition to dispersion, FRBs often show propagation effects such as scattering, illustrated in Figure 1, which results in asymmetric pulse broadening, especially at lower frequencies. The scattering measure is the path integral of C_n² [3],

\[
\mathrm{SM} = \int_{0}^{D} C_n^{2}\,\mathrm{d}s, \tag{3}
\]

where D is the independently measured distance to the source and C_n² is the spectral coefficient (the "level of turbulence"). Scintillation, generated by small-scale turbulence in the plasma, introduces variation in signal intensity. Faraday rotation, which rotates the polarisation angle with frequency, offers information on the magnetic field strength and electron density of the intervening medium. Some bursts exhibit indications of plasma lensing, which occurs when inhomogeneities in the plasma cause the signal to focus or defocus. These effects limit detection while also carrying a great deal of information about the cosmic environment [19].

Figure 1: Pulse profiles for PSR B1831-03 observed at five different frequencies with the Lovell telescope and the GMRT, clearly showing the increasing effect of scattering at lower frequencies. The solid lines show exponential fits to the data. Figure extracted from [20].

2.2. FRB Detection Systems

Real-time detection has become central to FRB science because it enables prompt alerts and triggered capture of high-time-resolution data, which are essential for studying burst microstructure and polarisation and for achieving improved localisation. In addition, real-time processing is crucial for managing the large data volumes produced by modern radio telescopes operating at high time and frequency resolution, allowing transient events to be identified and retained while reducing the need to store and process prohibitively large raw data streams.
As summarised in recent pipeline-focused work [21, 22], multiple observatories have deployed low-latency FRB detection systems with varying architectures, including image-plane or beamformed searches and voltage-buffer triggering [7]. Representative examples include the VLA realfast system, which performs commensal transient searching with rapid processing of interferometric data [23], as well as ASKAP's CRAFT program, which conducts commensal real-time searches and supports voltage capture for localisation and high-time-resolution studies [24]. Complementary approaches have been demonstrated at other facilities, including UTMOST real-time detections with voltage capture [25] and real-time FRB searching systems developed for FAST [26]. Collectively, these systems illustrate the state of the art: modern FRB surveys increasingly rely on GPU-accelerated pipelines capable of sustained high-throughput processing, low-latency candidate generation, and robust event triggering.

Figure 2: Top view of the Northern Cross radio telescope with the two perpendicular arms along the East-West and North-South directions. Figure extracted from [7].

Over the past decade, the field has transitioned from isolated discoveries to systematic surveys that detect FRBs at high rates and publish large, uniform samples. A major milestone was the release of the first CHIME/FRB catalog, comprising 536 FRBs detected between 400 and 800 MHz in a uniform survey with calibrated selection effects [27]. Follow-up analyses have leveraged channelised raw voltage data for subsets of events to refine burst properties and enable higher-fidelity characterisation [28]. More recently, large-sample catalogues have expanded dramatically in size, enabling population-level studies of repetition, energetics, and selection biases with unprecedented statistical power [29].
These developments have established FRBs as a mature time-domain field in which discovery rates and scientific yield are increasingly limited not by telescope sensitivity alone, but by the capability of real-time processing systems to detect, classify, and trigger on events reliably.

3. The Northern Cross Radio Telescope

The experiments presented in this work are motivated by observations from the Northern Cross radio telescope, a transit radio interferometer located near Bologna, Italy (see Figure 2). Originally designed for wide-area radio surveys, the Northern Cross is now being repositioned as a competitive instrument for time-domain astronomy, with a particular focus on FRB detection at low radio frequencies.

The telescope operates at a central observing frequency of approximately 408 MHz with a bandwidth of 16 MHz, typically channelised into 1024 frequency channels [6]. It should be noted that these parameters are subject to change following future instrument upgrades [30], which would directly affect the dispersive smearing timescale and consequently the optimal DM trial spacing and parameter selection discussed in this work. This fine spectral resolution enables accurate tracking of dispersion delays across the band and facilitates the detection of highly dispersed transient signals [7].

At these frequencies, dispersion delays are significant for highly dispersed extragalactic FRBs, though not extreme in absolute terms given the modest bandwidth. The primary computational challenge instead arises from the high time and spectral resolution of the backend, which produces large data rates and millions of time samples per minute of observation. Performing brute-force incoherent dedispersion across wide DM ranges under these conditions places stringent demands on GPU throughput and memory bandwidth in real-time operation [3].
Consequently, the Northern Cross provides a representative and challenging test case for evaluating FRB search pipelines based on incoherent dedispersion.

As described by [7], the recent upgrade of the Northern Cross includes a new digital acquisition and processing chain designed to support real-time FRB searches, with GPU-accelerated pipelines enabling low-latency transient detection across wide DM ranges [9, 7]. At present, observations are stored as filterbank files and analysed offline using software such as Heimdall, providing a convenient framework for controlled performance evaluation prior to full real-time deployment. A key challenge of the full-instrument configuration is that all simultaneously formed beams must be processed in parallel, generating substantial data rates and computational loads that scale rapidly with the number of beams, DM trials, and time samples. In this context, conservative or poorly tuned parameter choices can lead to unnecessary overhead, reduced sensitivity, or failure to meet real-time processing requirements.

The transition from a legacy backend to a modern real-time FRB search system therefore motivates the need for systematic, quantitative evaluation of pipeline parameter configurations. Rather than adopting parameter settings by analogy with other telescopes or surveys operating at different frequencies and bandwidths, the upgraded Northern Cross requires tuning that explicitly accounts for its observing band, dispersion regime, and available computational resources. The work presented here addresses this requirement by using controlled synthetic injections to evaluate the impact of dedispersion granularity, matched-filter coverage, and buffering strategy on both detection fidelity and runtime performance.
Data acquired by the Northern Cross are recorded as frequency-time filterbank files with fixed time and frequency resolution, typically ∼80 µs time sampling and ∼200 kHz channel widths across the 400–416 MHz band. This high time and spectral resolution preserves sensitivity to narrow, highly dispersed bursts, but substantially increases data volume and computational cost for real-time dedispersion and matched filtering. These data products are compatible with standard transient search software, including Heimdall, and retain the instrumental noise and system characteristics present in real observations. While the telescope's observing strategy and backend architecture impose specific constraints on data rates and processing latency, the core signal processing challenges, namely dedispersion across wide DM ranges and matched filtering for short-duration pulses, are common to many contemporary FRB search pipelines.

4. Heimdall Pipeline and Parameter Space

This section describes the FRB search pipeline evaluated in this study and outlines the key processing parameters explored. We first summarise the core signal-processing stages implemented by the Heimdall single-pulse search software, focusing on incoherent dedispersion and matched filtering for transient detection. We then define the parameter space considered in this work, highlighting how choices related to DM sampling, filter widths, and data buffering directly influence both detection accuracy and computational performance. Together, these elements establish the framework within which the systematic evaluation presented in subsequent sections is conducted.

4.1. Incoherent Dedispersion and Single Pulse Search

The FRB search evaluated in this work is performed with Heimdall, which applies brute-force dedispersion followed by matched filtering to identify transient signals [9]. Input data are provided as frequency-time filterbank files, which are dedispersed across a user-defined grid of DMs.
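To make the brute-force dedispersion step concrete, the sketch below shifts each frequency channel by its dispersion delay at one trial DM and sums the channels into a time series. It is a toy pure-Python illustration and not Heimdall's GPU implementation: the function names, the five-channel band, and the 1 ms sampling are assumptions chosen for readability, and the constant 4.15 × 10³ s assumes frequencies in MHz.

```python
K_DM_S = 4.15e3  # dispersion constant in s MHz^2 pc^-1 cm^3


def channel_shifts(freqs_mhz, dm, tsamp_s):
    """Delay of each channel relative to the highest frequency, in samples."""
    f_ref = max(freqs_mhz)
    return [round(K_DM_S * dm * (f ** -2 - f_ref ** -2) / tsamp_s)
            for f in freqs_mhz]


def dedisperse(data, freqs_mhz, dm, tsamp_s):
    """Shift-and-sum data[channel][sample] at a single trial DM."""
    shifts = channel_shifts(freqs_mhz, dm, tsamp_s)
    n_out = len(data[0]) - max(shifts)
    return [sum(chan[t + s] for chan, s in zip(data, shifts))
            for t in range(n_out)]


# Toy data (hypothetical values): 5 channels, 1 ms sampling, one pulse.
freqs = [416.0, 412.0, 408.0, 404.0, 400.0]
tsamp, true_dm, t0, n_samp = 1e-3, 100.0, 50, 400
data = [[0.0] * n_samp for _ in freqs]
for chan, s in zip(data, channel_shifts(freqs, true_dm, tsamp)):
    chan[t0 + s] = 1.0  # dispersed pulse: later arrival at lower frequency

aligned = dedisperse(data, freqs, true_dm, tsamp)  # trial DM matches the pulse
smeared = dedisperse(data, freqs, 0.0, tsamp)      # no dispersion correction
print(max(aligned), aligned.index(max(aligned)), max(smeared))
```

At the correct trial DM the five channels add coherently at the injected sample, whereas at DM = 0 the pulse stays spread over time and the peak is no higher than a single channel. A brute-force search repeats this operation for every DM in the trial grid.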
For each trial DM, Heimdall applies frequency-dependent time shifts to correct for dispersion delays introduced by the ionised interstellar medium, producing a one-dimensional dedispersed time series [3].

[Figure 3 stages, in processing order. FPGA: digitise, polyphase filterbank. CPU: add polarisations, candidate classification, multibeam coincidence (with other beams), candidate display. GPU (Heimdall): clean RFI, dedisperse, extract time series, remove baseline, normalise, matched filter, detect events, with loops over filter trials and DM trials, then merge events.]

Figure 3: Flow chart of the key processing operations in the pipeline. Heimdall is the name of the main GPU-based pipeline implementation. Adapted from [9].

Following dedispersion, Heimdall performs a single-pulse search by convolving each dedispersed time series with a set of boxcar filters of increasing width. These boxcar filters act as matched filters for pulses of varying temporal extent, enhancing the SNR when the filter width approximately matches the intrinsic pulse width [31]. Candidate events are identified as statistically significant peaks in the filtered time series that exceed a predefined SNR threshold. For each detected candidate, Heimdall records properties including Time of Arrival (ToA), DM, pulse width, and SNR.

This brute-force approach is computationally intensive but offers flexibility and robustness across a wide range of pulse morphologies and DMs. However, the overall performance of the pipeline depends on the configuration of several user-defined parameters that control the granularity of the dedispersion, the filtering strategy, and the buffering of the data. A flow chart showing all the steps computed by Heimdall, as well as the additional ones performed in an FRB search pipeline, can be found in Figure 3.

4.2. Parameter Space Explored

This study focuses on three key Heimdall parameters that strongly influence both detection performance and computational cost: DM tolerance, boxcar filter width, and gulp size.

The DM tolerance parameter (dm_tol) controls the spacing of trial DMs and effectively determines the maximum allowable fractional SNR loss due to dedispersion mismatch between adjacent DM trials [9]. This parameter governs the adaptive spacing of the DM trial grid; each trial is placed such that the effective pulse width grows by a factor of dm_tol from one trial to the next. The effective pulse width at any given DM trial is given by

\[
W_{\mathrm{eff}} = \sqrt{t_{\mathrm{int}}^{2} + t_{\mathrm{samp}}^{2} + t_{\mathrm{DM}}^{2} + t_{\delta\mathrm{DM}}^{2} + \tau_{s}^{2}}, \tag{4}
\]

where t_int is the intrinsic pulse width, t_samp is the sampling time, t_DM is the dispersive smearing across a single frequency channel, t_δDM is the smearing introduced by the offset between the true DM and the nearest trial DM, and τ_s is the scattering timescale. Lower DM tolerance values result in finer DM grids and improved sensitivity at the cost of increased computational load, while higher values reduce the number of trials but may degrade pulse recovery for signals whose true DM lies between grid points. As highlighted by [32], the relationship between dm_tol and actual survey sensitivity is not entirely straightforward because of the scalloped response between trials, meaning that the true worst-case SNR loss is not simply 1/dm_tol.

Boxcar filtering is performed using a predefined set of filter widths, spaced in powers of two and expressed in samples. Wider boxcars improve sensitivity to broader pulses but increase computational complexity and susceptibility to noise integration. Conversely, narrow boxcars favour short-duration pulses but may underperform for temporally broadened signals.
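The interplay between the effective pulse width and the boxcar set can be illustrated with a short Python sketch. It evaluates the quadrature sum of Equation 4, estimates the intra-channel smearing with the standard narrow-channel approximation (8.3 × 10⁶ ms × DM × Δν / ν³, with frequencies in MHz), and picks the power-of-two boxcar closest to the resulting width. The helper names, the choice of 1 sample as the smallest boxcar, and the example burst are assumptions made for illustration.

```python
import math


def effective_width(t_int, t_samp, t_dm, t_ddm, tau_s):
    """Quadrature sum of the broadening terms (all in the same time unit)."""
    return math.sqrt(t_int**2 + t_samp**2 + t_dm**2 + t_ddm**2 + tau_s**2)


def intra_channel_smearing_ms(dm, chan_bw_mhz, freq_mhz):
    """Dispersive smearing across one channel (narrow-channel approximation)."""
    return 8.3e6 * dm * chan_bw_mhz / freq_mhz**3


def boxcar_widths(max_width):
    """Power-of-two widths in samples: 1, 2, 4, ..., up to max_width."""
    widths, w = [], 1
    while w <= max_width:
        widths.append(w)
        w *= 2
    return widths


def closest_boxcar(width_ms, t_samp_ms, max_width):
    """Boxcar (in samples) closest to the pulse width on a log2 scale."""
    target = width_ms / t_samp_ms
    return min(boxcar_widths(max_width),
               key=lambda w: abs(math.log2(w / target)))


# Illustrative burst: 1 ms intrinsic width, DM = 500 pc cm^-3, with the
# ~80 us sampling and ~200 kHz channels quoted for the Northern Cross data.
t_samp = 0.08
t_dm = intra_channel_smearing_ms(500.0, 0.2, 408.0)
w_eff = effective_width(1.0, t_samp, t_dm, 0.0, 0.0)
for max_width in (32, 512):
    print(max_width, closest_boxcar(w_eff, t_samp, max_width))
```

For this example the intra-channel smearing dominates the effective width, so a maximum boxcar of 32 samples cannot match the pulse while a maximum of 512 can. This is the kind of sensitivity difference that varying the upper bound of the boxcar range is intended to probe.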
In this work, we explore multiple boxcar configurations to assess how the upper bound of the filter width range affects detection accuracy and runtime.

The gulp size parameter specifies the duration of data processed in each iteration. Larger gulp sizes can improve GPU utilisation and reduce kernel launch overhead, but they also increase memory requirements and may introduce additional latency. Smaller gulp sizes reduce buffering latency but may lead to suboptimal throughput. We therefore investigate the effect of varying gulp size on real-time processing performance.

Table 1: Parameter combination values.

Parameter                         Values
DM tolerance (dm_tol)             1.001, 1.01, 1.05, 1.1, 1.2
Maximum boxcar width (samples)    32, 64, 128, 256, 512

A grid of parameter combinations was constructed by varying DM tolerance and maximum boxcar width across ranges representative of practical FRB search configurations, as listed in Table 1. This parameter space was chosen to reflect both commonly used settings and more aggressive configurations that trade sensitivity for computational efficiency.

4.3. Computational Environment

All experiments were conducted on a GPU-accelerated computing system representative of the operational environment used for FRB searches at the Northern Cross radio telescope. The pipeline was executed on a single NVIDIA RTX 6000 Ada GPU, and runtime measurements were obtained using end-to-end processing times reported by Heimdall. Heimdall¹ was used in its standard, unmodified form, ensuring that all observed performance differences arise solely from pipeline parameter selection rather than changes to the underlying implementation.

¹ https://sourceforge.net/p/heimdall-astro/wiki/Home/

While the numerical values of optimal parameters may depend on specific hardware characteristics, the relative performance trends and trade-offs identified in this study are expected to be broadly applicable to similar GPU-based incoherent dedispersion pipelines.

5. Synthetic Injection Framework and Evaluation Metrics

5.1. Synthetic FRB Signal Generation

The synthetic signals were injected into real observations from the Northern Cross radio telescope. The filterbank files were selected so that no real astrophysical signals are expected in them: we used Northern Cross observations of extragalactic FRBs that had already been analysed with tested pipelines, which excluded the presence of radio bursts, and that were taken at high sky declination to avoid signals from the Galactic plane. No RFI cleaning tool was applied, so that the data reproduce real observing conditions.

To enable controlled and reproducible evaluation of pipeline performance, we employ a synthetic signal injection framework in which artificial FRB-like pulses with known properties are embedded into filterbank data using the published package FRB Faker [33]. Synthetic injections provide direct ground truth, allowing quantitative assessment of detection accuracy and computational performance without the ambiguities inherent in real observational data [4].

Each synthetic burst is generated with a Gaussian temporal profile and injected into the dynamic spectrum prior to dedispersion. The dispersion delay across frequency channels is applied using the cold plasma dispersion relation [3], ensuring that injected signals exhibit realistic frequency-dependent arrival times. DMs are sampled from a log-uniform distribution spanning 20 to 3000 pc cm⁻³, reflecting the wide dynamic range of DMs observed in FRB populations while avoiding over-representation of low-DM events.

The injected pulses were generated with input SNR values sampled uniformly from the range [3, 13]. It is important to note that the input SNR defined by FRB Faker does not correspond directly to the SNR reported by Heimdall, as SNR estimation in radio filterbank data is not standardised and varies with pulse width, observing parameters, and the detection algorithm employed.
As noted by the FRB Faker developers, a scaling coefficient is generally required to map between the two definitions. Given this ambiguity, and following the recommendation of the dataset authors, the SNR values used throughout this analysis are those reported directly by Heimdall, as these are the quantities directly relevant to the detection pipeline under evaluation. The near-complete recovery of all injected pulses across all settings is consistent with the effective Heimdall SNR of all injections exceeding the pipeline detection threshold, despite some injections having low input SNR values. This reflects the empirical nature of the input SNR range, which was chosen to produce a representative distribution of weak and strong bursts in the filterbank rather than to correspond to specific Heimdall detection thresholds.

Pulse widths are drawn from a uniform distribution between 0.5 ms and 130 ms, covering both narrow, unresolved pulses and broader events potentially affected by temporal scattering, where multi-path propagation through turbulent plasma broadens the signal and produces an asymmetric pulse profile with an extended exponential tail. While real FRBs often exhibit more complex temporal and spectral structure, including asymmetric scattering tails and sub-burst components, the use of Gaussian pulses provides a consistent and interpretable basis for comparative evaluation of pipeline parameters.

Each filterbank file contains multiple injected bursts at random arrival times, ensuring that the pipeline is evaluated across a diverse set of signal properties and temporal contexts. The underlying data preserve realistic noise characteristics, channelisation, and time resolution, ensuring that injected signals are evaluated under conditions comparable to operational FRB searches.
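The sampling scheme for the injected burst parameters can be sketched with the Python standard library as follows. This is an illustration of the stated distributions only, not the FRB Faker interface: the function name, the dictionary keys, and the choice of uniformly distributed arrival times over the full file are assumptions, while the 13 bursts per file and 140 s file duration follow the dataset description.

```python
import math
import random

N_BURSTS_PER_FILE = 13   # bursts per file, from the dataset description
FILE_DURATION_S = 140.0  # file length in seconds, from the dataset description


def sample_injections(n, rng, duration_s=FILE_DURATION_S):
    """Draw n sets of burst parameters from the distributions in the text."""
    bursts = []
    for _ in range(n):
        bursts.append({
            # Log-uniform DM between 20 and 3000 pc cm^-3.
            "dm": math.exp(rng.uniform(math.log(20.0), math.log(3000.0))),
            # Input SNR uniform in [3, 13] (injection-tool definition).
            "snr_in": rng.uniform(3.0, 13.0),
            # Gaussian pulse width uniform between 0.5 and 130 ms.
            "width_ms": rng.uniform(0.5, 130.0),
            # Arrival time: uniform over the file is an assumption here.
            "toa_s": rng.uniform(0.0, duration_s),
        })
    return bursts


bursts = sample_injections(N_BURSTS_PER_FILE, random.Random(42))
for burst in bursts[:3]:
    print({key: round(value, 2) for key, value in burst.items()})
```

Sampling the DM uniformly in log space gives each decade of DM equal weight, which is what prevents the lowest DMs from being under-sampled relative to a linear draw over 20 to 3000 pc cm⁻³. A seeded generator keeps the injected set reproducible across pipeline configurations.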
Injection is performed prior to dedispersion and single-pulse searching, allowing the full pipeline, including dedispersion, boxcar filtering, and candidate selection, to operate on the modified data without additional intervention. This approach ensures that recovered signal properties can be directly compared to known injection parameters, and that runtime measurements reflect end-to-end pipeline behaviour. A total of 950 filterbank files, each 140 seconds long, are generated, each containing 13 injected bursts, resulting in a dataset comprising 12,350 synthetic FRB events.

5.2. Evaluation Metrics

Performance is evaluated using a combination of detection accuracy and computational efficiency metrics. Detection accuracy is assessed by matching detected candidates to injected bursts based on temporal proximity and DM consistency. For matched events, we compute relative errors between injected and recovered values of DM, arrival time, pulse width, and SNR.

Rather than adopting a binary detection metric, this approach enables a nuanced assessment of how parameter choices affect the fidelity with which signal properties are recovered. This is particularly important for evaluating dedispersion and matched-filtering performance, where incorrect parameter settings may still produce detections but with degraded accuracy.

As illustrated in Figure 4, the dm_tol parameter directly controls the spacing between DM trials (ΔDM): larger values produce coarser trial grids while requiring fewer total trials, whereas smaller values sample the DM space more finely at greater computational cost (left panel). Notably, all dm_tol values tested successfully recovered all injected pulses across the DM range, demonstrating that detections occur regardless of the parameter setting. However, the mean absolute DM offset between the detected and injected DM values, which is a direct measure of recovery accuracy, varies across settings, with dm_tol = 1.1 exhibiting the largest spread in recovery accuracy (right panel). This confirms that while all settings produce detections, the fidelity with which the DM, and by extension other derived signal properties, is recovered depends strongly on the choice of dm_tol, motivating careful parameter selection rather than reliance on default values.

[Figure 4, left panel, "DM Trial Step Size (ΔDM) vs Dispersion Measure": step size ΔDM (pc cm⁻³) against dispersion measure (pc cm⁻³, logarithmic axis), with one curve per setting: dm_tol = 1.001 (40,792 trials), 1.01 (12,871 trials), 1.05 (5,700 trials), 1.1 (3,982 trials), 1.2 (2,751 trials); injected DMs are marked. Right panel, "DM Recovery Accuracy": mean DM offset (pc cm⁻³) against dm_tol value, with bar annotations, in the order extracted, of 5.04, 4.20, 4.88, 5.14, and 4.54 and the best value flagged.]

Figure 4: Impact of the dm_tol parameter on DM trial spacing and recovery accuracy for Heimdall dedispersion. The left panel shows the local DM trial step size (ΔDM) as a function of dispersion measure, with red dashed lines marking the injected pulse DM positions. The right panel shows the mean absolute offset between detected and injected DM values (±1σ) for each dm_tol setting, where all values successfully recover injections but with varying accuracy, demonstrating that parameter choice affects signal recovery fidelity rather than detection alone. Results shown are from a representative file drawn from the test dataset; the trends are consistent across the full dataset.

5.3. Runtime Performance

Computational efficiency is assessed using the total processing time required to analyse each filterbank file.
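A minimal sketch of how per-file processing time can be turned into a real-time feasibility check is given below. The ratio of processing time to data duration is a common convention rather than a quantity defined by Heimdall, and the function names and the runtimes in the example are hypothetical values chosen for illustration; they are not measurements from this study.

```python
def realtime_factor(processing_s, data_s):
    """Processing time divided by data duration; below 1 is faster than real time."""
    if data_s <= 0:
        raise ValueError("data duration must be positive")
    return processing_s / data_s


def viable_configs(runtimes_s, data_s):
    """Return the configurations that process the data faster than real time."""
    factors = {cfg: realtime_factor(t, data_s) for cfg, t in runtimes_s.items()}
    return {cfg: f for cfg, f in factors.items() if f < 1.0}


# Hypothetical runtimes in seconds for one 140 s file, keyed by
# (dm_tol, maximum boxcar width). These are NOT results from the paper.
example_runtimes = {(1.001, 512): 210.0, (1.05, 128): 35.0, (1.2, 32): 14.0}
viable = viable_configs(example_runtimes, data_s=140.0)
for cfg, factor in sorted(viable.items()):
    print(cfg, f"{factor:.2f}x of real time")
```

In a multi-beam deployment the same check would need to account for the number of beams sharing each GPU, which this single-file sketch deliberately leaves out.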
Runtime measurements are obtained directly from Heimdall's internal timing reports and include dedispersion, boxcar filtering, and candidate generation. Input/output operations are excluded to ensure that runtime comparisons reflect pipeline performance rather than storage system characteristics.

Runtime is reported both in absolute terms and relative to the duration of the input data, allowing direct assessment of real-time feasibility. Configurations achieving processing speeds exceeding real-time requirements are considered operationally viable, while slower configurations are deemed unsuitable for real-time deployment despite potential gains in sensitivity.

5.4. Statistical Analysis

To analyse the multi-dimensional space arising from the evaluated parameter configurations, we employ a combination of exploratory visualisation, unsupervised clustering, and non-parametric statistical testing. This layered approach allows qualitative structure in the data to be identified prior to formal hypothesis testing, and ensures that statistically significant differences are interpreted in the context of overall performance trends.

5.4.1. Dimensionality Reduction with t-SNE

As an initial exploratory step, we apply t-distributed stochastic neighbour embedding (t-SNE) to project the high-dimensional performance metrics into a two-dimensional space for visual analysis [34], as represented in Figure 5. The input feature space includes detection accuracy metrics (DM error, SNR error, and ToA error) together with runtime performance, enabling joint assessment of sensitivity and computational efficiency. Prior to dimensionality reduction, all features are standardised to ensure comparable scaling and to prevent dominance by any single metric.

t-SNE is a non-linear dimensionality reduction technique that seeks to preserve local neighbourhood structure when mapping data from a high-dimensional space into a lower-dimensional embedding.
In the original feature space, t-SNE models pairwise similarities between points using conditional probability distributions derived from Gaussian kernels, with the kernel bandwidth determined by a user-defined perplexity parameter. In the low-dimensional embedding, similarities are modelled using a heavy-tailed Student t-distribution, which mitigates the crowding problem and allows moderately distant points to be separated more effectively.

Figure 5: Two-dimensional t-SNE projection of the data. The top panel is overlaid with a heatmap of percentage SNR accuracy (cool colours = lower accuracy; warm colours = higher accuracy, up to 100%); the bottom panel is coloured by the cluster labels returned by HDBSCAN, where label –1 marks points classified as noise/outliers. Each point represents a parameter-file outcome embedded into two dimensions by t-SNE (axes are unitless and not directly interpretable). High-density regions indicate many outcomes with very similar feature profiles (locally preserved neighbourhoods), i.e., parameter combinations that obtained similar performance and results.

The optimisation objective of t-SNE minimises the Kullback-Leibler divergence between the high-dimensional and low-dimensional similarity distributions. As a result, points that are close neighbours in the original feature space are encouraged to remain close in the embedding, while large pairwise distances are not preserved in a metric sense. For this reason, t-SNE embeddings should not be interpreted as preserving global geometry or absolute distances, but rather as providing a faithful representation of local relationships among configurations.

In this study, t-SNE is used solely as a visualisation tool, allowing intuitive inspection of whether parameter configurations form natural groupings or exhibit trade-off structures in the combined accuracy–runtime space.
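For reference, the quantities described in this subsection can be written out explicitly. The expressions below are the standard t-SNE formulation from the literature [34] and are not specific to this pipeline; the symbols ($x_i$ for a standardised feature vector, $y_i$ for its two-dimensional embedding, $\sigma_i$ for the per-point kernel bandwidth, $N$ for the number of points) are notation introduced here purely for illustration.

```latex
\begin{align}
p_{j|i} &= \frac{\exp\!\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
                {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}, \\
q_{ij} &= \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
               {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}, \\
C &= \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}, \\
\mathrm{Perp}(P_i) &= 2^{H(P_i)},
\qquad
H(P_i) = -\sum_{j \neq i} p_{j|i} \log_2 p_{j|i}.
\end{align}
```

Each $\sigma_i$ is tuned so that $\mathrm{Perp}(P_i)$ matches the user-defined perplexity, which is how that parameter sets the kernel bandwidth. The asymmetry of the Kullback-Leibler cost explains the emphasis on local structure: representing a large $p_{ij}$ by a small $q_{ij}$ is penalised heavily, whereas the reverse costs little, so nearby points stay together while large distances are left largely unconstrained.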
As a first step towards identifying performance trends in the t-SNE projection, the data points were overlaid with a heatmap of the accuracy scores of the SNR feature, as can be seen in Figure 5. This visualisation highlights regions in which SNR recovery changes and reveals potential clusters associated with higher or lower SNR accuracy values.

5.4.2. Unsupervised Clustering with HDBSCAN

To objectively identify groups of parameter configurations with similar performance, we apply the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm to the reduced performance space [35]. HDBSCAN extends the DBSCAN framework by constructing a hierarchy of density-based clusters and extracting the most stable groupings.

Unlike partition-based clustering methods such as k-means, HDBSCAN does not require specification of the number of clusters and is capable of identifying clusters of varying density while explicitly labelling outliers as noise. This is particularly advantageous in the present context, where performance distributions are heterogeneous and some parameter configurations may represent suboptimal or extreme cases rather than belonging to well-defined groups.

The clustering results, depicted in Figure 5, are used to support interpretation of the performance landscape by highlighting sets of configurations that exhibit similar accuracy-runtime trade-offs, and by identifying configurations that consistently underperform or behave anomalously.

5.4.3. Friedman Test for Global Performance Differences

Following exploratory analysis, we apply the Friedman test to formally assess whether statistically significant performance differences exist among the evaluated parameter configurations [36].
The Friedman test is a non-parametric alternative to repeated-measures ANOVA (analysis of variance) and is well suited to this study, as all configurations are evaluated on the same set of injected signals, and the performance metrics do not satisfy normality assumptions.

Each parameter configuration is ranked according to a global performance metric, and the Friedman statistic evaluates whether the observed ranking differences across configurations are greater than would be expected by chance. This rank-based approach provides a robust global test of whether parameter choice has a statistically significant effect on detection accuracy or runtime performance. The results for the ten best configurations are shown as box plots (see Figure 7); the ToA box plots are omitted because the most influential results came from the DM and SNR performance.

5.4.4. Nemenyi Post-hoc Pairwise Comparisons

When the Friedman test indicates statistically significant differences, post-hoc pairwise comparisons are conducted using the Nemenyi test [37]. The Nemenyi test compares the average ranks of all pairs of configurations and determines whether their differences exceed a critical difference threshold.

This procedure enables the identification of specific parameter configurations that perform significantly better or worse than others across the full dataset. Results are conveniently visualised using critical difference diagrams, which group configurations that are statistically indistinguishable and highlight those that exhibit superior overall performance. Together with the cluster analysis, these results provide an interpretable basis for identifying optimal configurations.

6. Results

6.1. Detection Accuracy Across Parameter Configurations

Detection accuracy was evaluated across the full grid of Heimdall parameter configurations using the metrics defined in Section 5.2.
For each configuration, recovered candidate properties were matched to injected synthetic bursts, and relative errors in DM, SNR, and ToA were computed. Performance was summarised using several statistical error metrics, including the mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE). The results, as presented in Table A.2, summarise the behaviour of each configuration across the entire dataset.

Across all tested configurations, detection accuracy exhibited a strong dependence on both DM tolerance and boxcar filter width. Configurations employing lower DM tolerance values consistently achieved improved DM recovery, as expected from finer sampling of the DM space. However, this improvement was not uniform across all injected DMs: at higher DMs, coarse DM grids produced noticeably larger relative DM errors, indicating increased susceptibility to dedispersion mismatch for highly dispersed signals.

SNR recovery showed a pronounced dependence on boxcar filter configuration. Configurations with limited maximum boxcar widths tended to underestimate the SNR of broader pulses, particularly for injected bursts with full-width at half-maximum durations exceeding several tens of milliseconds. Conversely, configurations allowing excessively large boxcar widths exhibited increased variance in SNR error for narrow pulses, reflecting the integration of additional noise when filter widths significantly exceeded the intrinsic pulse duration.

ToA accuracy was generally robust across most parameter configurations, with median relative timing errors remaining small compared to the intrinsic pulse widths. Nevertheless, configurations with coarse DM tolerance or mismatched boxcar ranges showed indications of systematic timing offsets, particularly for low-SNR bursts where imperfect dedispersion led to asymmetric pulse recovery.
These effects were most pronounced for bursts near the detection threshold, highlighting the interaction between dedispersion precision and matched-filter alignment.

When considered jointly, as shown in Figure 5, the three accuracy metrics reveal clear trade-offs between sensitivity and robustness. Configurations optimised for fine DM resolution improved DM and ToA accuracy but showed diminishing returns in SNR recovery relative to their increased computational cost. Conversely, configurations prioritising reduced computational complexity exhibited degraded recovery of burst properties, particularly for broad or highly dispersed signals.

These results demonstrate that detection accuracy cannot be optimised independently of parameter interactions. Instead, optimal performance emerges from configurations that balance dedispersion granularity with matched-filter coverage, motivating the multi-dimensional analysis and statistical comparison presented in the following sections.

[Figure 6. Four panels: "Find Giants", "Dedispersion", "Filtering", and "Total"; x-axis: DM Tolerance (1.001, 1.01, 1.05, 1.1, 1.2), y-axis: Time (s); one curve per boxcar width, BW = 32, 64, 128, 256, 512.]

Figure 6: Execution time (s) for selected steps as a function of DM tolerance, shown for five boxcar widths (BW = 32–512). Panels show the three most time-dominant steps alongside total runtime.

6.2. Runtime Scaling and Real-Time Performance

Runtime performance was evaluated for all parameter configurations by measuring the total processing time required to analyse each filterbank file, as described in Section 5.3. Processing times were compared against the duration of the input data to assess real-time feasibility under different parameter choices.
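The comparison against the input duration can be illustrated with a short calculation. The sketch below is not part of the pipeline: it takes the total Heimdall times listed in Table A.3 for a maximum boxcar width of 256 (one representative file, I/O excluded) and the 140 s file duration stated in the injection description, and expresses each runtime as a fraction of real time. The function and variable names are hypothetical.

```python
# Minimal sketch: express Heimdall runtimes relative to the data duration.
# Runtimes are the "Total" values for boxcar width 256 in Table A.3
# (single representative file); names below are illustrative only.

FILE_DURATION_S = 140.0  # duration of each synthetic filterbank file

# dm_tol -> total processing time in seconds
total_runtime_s = {
    1.001: 42.47,
    1.01: 13.65,
    1.05: 6.94,
    1.1: 5.44,
    1.2: 4.15,
}


def realtime_fraction(runtime_s, duration_s=FILE_DURATION_S):
    """Processing time divided by data duration (< 1 is faster than real time)."""
    return runtime_s / duration_s


def speedup(runtime_s, duration_s=FILE_DURATION_S):
    """How many seconds of data are processed per second of compute."""
    return duration_s / runtime_s


if __name__ == "__main__":
    for dm_tol, runtime in total_runtime_s.items():
        print(
            f"dm_tol={dm_tol}: {realtime_fraction(runtime):.3f} of real time, "
            f"{speedup(runtime):.1f}x faster than the data rate"
        )
```

These figures refer to a single file processed in isolation; the headroom available in operation would also depend on the number of beams and on the additional pipeline stages that share the same hardware.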
Across the explored parameter space, runtime exhibited a strong dependence on both DM tolerance and boxcar filter configuration, as can be observed in Figure 6 (see also Table A.3 for a more detailed numerical comparison). Configurations employing finer DM tolerance values incurred substantially higher computational cost due to the increased number of trial DMs. This effect was approximately linear within the tested range, reflecting the brute-force nature of incoherent dedispersion. Conversely, configurations with coarser DM tolerance reduced processing time at the expense of reduced dedispersion fidelity.

The maximum boxcar filter width also contributed significantly to runtime variability. Increasing the upper bound of the boxcar range increased the number of convolution operations applied to each dedispersed time series, resulting in longer processing times. While the impact of boxcar width on runtime was less pronounced than that of DM tolerance, configurations with large boxcar ranges consistently showed higher processing overhead.

Gulp size played a secondary but non-negligible role in determining throughput. Larger gulp sizes improved GPU utilisation by reducing kernel launch overhead and enabling more efficient memory access patterns. However, beyond a certain threshold, increasing the gulp size yielded diminishing returns, indicating that memory constraints and data transfer overheads began to dominate. In addition, large gulp sizes are not desirable in operational scenarios where low-latency triggering is required, such as saving raw voltage data or issuing alerts to external facilities for rapid follow-up observations. Larger gulps also increase memory usage, which can limit the degree of parallelisation achievable when multiple Heimdall instances are executed concurrently on the same GPU, or when Heimdall is extended to process multiple beams simultaneously.
Within the tested range, a gulp size of approximately 40 s provided an effective balance between throughput, memory usage, and triggering latency.

Importantly, several parameter configurations achieved processing speeds comfortably exceeding real-time requirements. These configurations maintained processing rates significantly faster than the input data rate, leaving sufficient headroom for additional components such as candidate selection, normalisation, filtering, and alert generation.

6.3. Statistical Comparison of Parameter Configurations

To formally assess whether observed performance differences across parameter configurations were statistically significant, we applied the Friedman test as described in Section 5.4.3. The test was performed separately for each evaluation metric, treating each injected burst as a repeated measure across configurations.

For all detection accuracy metrics and runtime performance, the Friedman test rejected the null hypothesis of equivalent performance across configurations at the chosen significance level. This result confirms that parameter selection has a statistically significant impact on both detection fidelity and computational efficiency.

Figure 7: Boxplots of Friedman test results for DM (top) and SNR (bottom). The numerical values represent outlier counts per combination. The labels are set according to the dm_tol and boxcar width parameter values.

Following this global assessment, pairwise comparisons were conducted using the Nemenyi post-hoc test. Average ranks were computed for each configuration across all injected bursts, and critical difference thresholds were used to identify statistically distinguishable groups. The resulting critical difference diagrams shown in Figure 8 revealed clusters of configurations with comparable performance, as well as configurations that consistently outperformed or underperformed the rest.
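To make the ranking procedure concrete, the following is a minimal sketch of the textbook Friedman statistic and Nemenyi critical difference, written in plain Python. It is not the analysis code used in this work: the error values are invented toy numbers, the function names are hypothetical, the statistic omits the correction for ties, and the critical value 2.343 is the commonly tabulated Studentised-range-based constant for three configurations at a significance level of 0.05.

```python
import math


def rank_row(values):
    """Rank one block (smallest error gets rank 1), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        i = j + 1
    return ranks


def friedman_statistic(errors):
    """errors[b][c] is the error of configuration c on injected burst b.

    Returns (chi-square statistic without tie correction, mean rank per configuration).
    """
    n = len(errors)        # number of bursts (blocks)
    k = len(errors[0])     # number of configurations (treatments)
    rank_sums = [0.0] * k
    for row in errors:
        for c, rank in enumerate(rank_row(row)):
            rank_sums[c] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return chi2, [s / n for s in rank_sums]


def nemenyi_cd(k, n, q_alpha):
    """Critical difference in mean rank for k configurations and n blocks."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


# Toy data (hypothetical): 4 bursts x 3 configurations, lower error is better.
toy_errors = [
    [0.1, 0.3, 0.5],
    [0.2, 0.4, 0.6],
    [0.1, 0.2, 0.9],
    [0.3, 0.5, 0.4],
]

if __name__ == "__main__":
    chi2, mean_ranks = friedman_statistic(toy_errors)
    cd = nemenyi_cd(k=3, n=4, q_alpha=2.343)
    print("Friedman chi-square:", chi2)
    print("Mean ranks:", mean_ranks)
    print("Critical difference:", round(cd, 3))
    for a in range(3):
        for b in range(a + 1, 3):
            differs = abs(mean_ranks[a] - mean_ranks[b]) > cd
            print(f"config {a} vs {b}: {'distinguishable' if differs else 'not distinguishable'}")
```

Two configurations are treated as statistically distinguishable when their mean ranks differ by more than the critical difference, which is the grouping that a critical difference diagram draws. In practice a library routine such as `scipy.stats.friedmanchisquare` would normally be preferred for the global test, since it handles ties and returns a p-value.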
In particular, configurations combining moderate DM tolerance with intermediate boxcar ranges achieved favourable ranks across multiple metrics, while configurations at the extremes of the parameter space, such as very fine DM grids or excessively large boxcar widths, exhibited statistically significant performance degradation when accounting for both accuracy and runtime.

These results provide quantitative confirmation of the qualitative trends observed in Sections 6.1 and 6.2, and support the identification of parameter configurations that achieve balanced, statistically robust performance.

Figure 8: Critical difference diagrams for SNR (top), DM (middle), and ToA (bottom).

6.4. Identification of an Empirically Optimal Configuration

Based on the combined evaluation of detection accuracy, runtime performance, and statistical significance, an empirically optimal parameter configuration was identified. This configuration achieved consistently strong performance across all accuracy metrics while maintaining processing speeds well above real-time requirements.

The selected configuration employed a DM tolerance of 1.01, a maximum boxcar width of 256 samples, and a gulp size of 40 s. Relative to other tested configurations, this setting exhibited low median errors in DM, SNR, and time-of-arrival recovery, while avoiding the substantial runtime penalties associated with finer DM grids or larger boxcar ranges.

Importantly, this configuration was not necessarily the best-performing option for any single metric in isolation. Rather, it provided the most favourable overall trade-off when considering all performance dimensions jointly. Statistical testing confirmed that its performance was statistically indistinguishable from the best-performing configurations for individual metrics, while being significantly more efficient than several higher-cost alternatives.
This result highlights the importance of multi-metric evaluation when optimising real-time transient pipelines. Parameter choices that maximise sensitivity alone may impose unnecessary computational overhead, while overly aggressive performance optimisation can degrade detection fidelity. The identified configuration represents a balanced compromise suitable for operational deployment.

6.5. Cluster Structure in Performance Space

To further explore relationships among parameter configurations, cluster analysis was performed using the method described in Section 5.4.2. Dimensionality reduction with t-SNE revealed a clear structure in the combined accuracy-runtime performance space, with configurations forming distinct groups corresponding to different trade-off regimes.

HDBSCAN clustering identified several stable clusters, each characterised by similar performance behaviour. One cluster comprised configurations with fine DM tolerance and large boxcar ranges, which achieved high detection accuracy but incurred substantial computational cost. Another cluster contained configurations prioritising computational efficiency, characterised by coarse DM grids and limited boxcar coverage, but exhibiting degraded recovery of injected burst properties.

Notably, the empirically optimal configuration identified in Section 6.4 resided within a cluster that balanced accuracy and runtime, distinct from both the high-cost, high-sensitivity cluster and the low-cost, low-accuracy cluster. This clustering result provides additional support for the robustness of the selected configuration and demonstrates that it occupies a stable region of the performance landscape rather than representing an isolated or anomalous case.
Together, the clustering and statistical analyses reinforce the conclusion that systematic, data-driven evaluation of pipeline parameters can reveal structured performance regimes and guide informed optimisation decisions for real-time FRB search pipelines.

7. Discussion

The results presented in Section 6 demonstrate that the performance of incoherent dedispersion-based FRB search pipelines is strongly influenced by parameter-level choices, and that these effects extend beyond simple sensitivity considerations to include computational efficiency and operational feasibility. By systematically evaluating detection accuracy and runtime across a controlled parameter space, this study provides quantitative insight into how dedispersion granularity, matched-filter coverage, and buffering strategies interact to shape overall pipeline behaviour.

A key outcome of this work is the identification of an empirically optimal configuration that balances detection fidelity with real-time processing requirements. Rather than maximising performance along a single metric, the selected configuration represents a compromise that achieves robust recovery of injected burst properties while maintaining substantial computational headroom. This finding underscores the importance of multi-dimensional optimisation in real-time transient searches, where sensitivity gains achieved through finer parameter sampling may be offset by disproportionate increases in computational cost.

The observed trade-offs between DM tolerance and detection accuracy are consistent with expectations from incoherent dedispersion theory. Finer DM grids reduce temporal smearing and improve parameter recovery, particularly at high DMs, but incur a near-linear increase in computational load due to the brute-force nature of the algorithm.
Similarly, the influence of boxcar filter configuration reflects the role of matched filtering in SNR recovery: insufficient filter coverage degrades sensitivity to broad pulses, while excessively large filters increase noise integration and runtime without commensurate gains in detection fidelity. These results highlight that commonly adopted parameter choices may not be optimal when evaluated in a holistic performance framework.

From a computational perspective, the runtime analysis confirms that real-time processing is achievable on a single GPU for a wide range of parameter configurations, provided that buffering and filter ranges are chosen judiciously. The observed saturation of throughput gains at larger gulp sizes suggests that memory access and data transfer overheads become limiting factors beyond a certain scale, emphasising the need to consider hardware characteristics when tuning pipeline parameters. While the absolute runtime values reported here are specific to the tested hardware, the relative trends are expected to generalise to similar GPU-based implementations.

Although this study is motivated by observations from the Northern Cross radio telescope, the methodology and conclusions are broadly applicable to other FRB search pipelines employing incoherent dedispersion and matched filtering. The use of synthetic injections enables controlled comparison across parameter configurations and avoids confounding effects introduced by radio frequency interference (RFI) or unknown source properties. However, this approach also introduces limitations. The injected pulses adopt simplified Gaussian temporal profiles and do not capture the full complexity of real FRB signals, such as scattering tails, spectral structure, or sub-burst morphology. As a result, the absolute performance metrics reported here should be interpreted as relative indicators rather than definitive sensitivity limits.
Future extensions of this work could address these limitations by incorporating more realistic injection models, including scattered pulse profiles and frequency-dependent structure, as well as by validating the identified parameter configurations on real observational data. In addition, the systematic evaluation framework presented here could be extended to assess other pipeline components, such as RFI mitigation strategies or machine-learning-based candidate classification stages. More ambitiously, adaptive pipelines that dynamically adjust processing parameters in response to data quality or observing conditions could further improve real-time performance while preserving sensitivity.

Overall, this study demonstrates that systematic, data-driven evaluation of pipeline parameters can yield meaningful improvements in both detection accuracy and computational efficiency. As FRB search efforts continue to scale in data volume and complexity, such approaches will be increasingly important for ensuring that real-time transient pipelines operate at their full potential.

8. Conclusion

In this work, we have presented a systematic, data-driven evaluation of key processing parameters in an incoherent dedispersion-based FRB search pipeline using the Heimdall single-pulse detection software. By employing a controlled synthetic injection framework, we quantitatively assessed how parameter-level choices influence both detection accuracy and computational performance under realistic observing conditions.

Our results demonstrate that commonly used pipeline parameters can exhibit substantial trade-offs between sensitivity and runtime, and that optimal performance cannot be achieved by maximising individual metrics in isolation.
Through joint analysis of detection fidelity, processing throughput, and statistical significance, we identified an empirically optimal configuration that achieves robust recovery of injected burst properties while maintaining processing speeds comfortably exceeding real-time requirements on a single GPU. This configuration balances dedispersion granularity, matched-filter coverage, and buffering strategy, and avoids the computational overhead associated with more aggressive parameter choices.

Future work will focus on extending this framework to incorporate more realistic signal models, validation on real observational data, and integration with machine-learning-based candidate classification and adaptive pipeline strategies. As real-time radio transient searches continue to expand in scale and complexity, systematic parameter evaluation will play a key role in ensuring efficient and reliable detection performance.

Acknowledgements

Part of the research activities described in this paper were carried out with the contribution of the NextGenerationEU funds within the National Recovery and Resilience Plan (PNRR), Mission 4 - Education and Research, Component 2 - From Research to Business (M4C2), Investment Line 3.1 - Strengthening and creation of Research Infrastructures, Project IR0000026 – Next Generation Croce del Nord.

Appendix A. Tables of Results

Boxcar Width   Metric       DM Tolerance
                            1.001     1.01      1.05      1.1       1.2
                            ±0.01     ±0.01     ±0.01     ±0.01     ±0.01
32             MAE          17.70     20.02     19.57     19.46     20.13
               MSE          660.83    736.26    729.47    736.34    749.35
               RMSE         25.71     27.13     27.01     27.14     27.37
               MAPE         62.42     68.61     69.83     70.03     72.61
               Accuracy     37.58     31.39     30.17     29.97     27.39
64             MAE          13.41     13.80     13.90     13.67     16.77
               MSE          412.18    404.80    418.11    414.91    548.53
               RMSE         20.30     20.12     20.45     20.37     23.42
               MAPE         36.56     36.54     37.85     37.17     47.74
               Accuracy     63.44     63.46     62.15     62.83     52.26
128            MAE          8.18      7.09      7.57      7.48      8.22
               MSE          186.40    157.77    165.80    162.03    179.29
               RMSE         13.65     12.56     12.88     12.73     13.39
               MAPE         17.25     15.07     16.29     15.89     17.97
               Accuracy     82.75     84.93     83.71     84.11     82.03
256            MAE          2.11      2.53      2.86      2.95      5.25
               MSE          26.06     29.94     32.99     35.18     82.41
               RMSE         5.11      5.47      5.74      5.93      9.08
               MAPE         3.55      4.32      4.99      5.15      12.38
               Accuracy     96.45     95.68     95.01     94.85     87.62
512            MAE          0.00      0.41      0.79      0.48      1.57
               MSE          0.00      0.30      1.41      0.42      6.54
               RMSE         0.00      0.55      1.19      0.65      2.56
               MAPE         0.00      0.75      1.55      0.92      3.10
               Accuracy     100       99.25     98.45     99.08     96.90

Table A.2: Sample of the statistical metrics obtained from the analyses. This table is obtained from a random file from the dataset; the results show the SNR analysis.

Boxcar Width   Time Execution (s)   DM Tolerance
                                    1.001     1.01      1.05      1.1       1.2
                                    ±0.01     ±0.01     ±0.01     ±0.01     ±0.01
32             0-DM Cleaning        1.78      1.65      1.81      1.67      1.79
               Dedispersion         3.75      0.85      0.47      0.35      0.22
               Baselining           2.01      0.60      0.26      0.18      0.12
               Normalisation        1.76      0.52      0.23      0.15      0.11
               Filtering            1.75      0.52      0.22      0.15      0.10
               Find Giants          11.95     3.57      1.54      1.06      0.72
               Total                31.72     10.09     5.55      4.27      3.60
64             0-DM Cleaning        1.77      1.80      1.66      1.66      1.66
               Dedispersion         3.79      0.89      0.42      0.30      0.22
               Baselining           2.01      0.60      0.26      0.18      0.12
               Normalisation        1.76      0.52      0.23      0.16      0.11
               Filtering            2.07      0.62      0.27      0.18      0.12
               Find Giants          15.11     4.51      1.93      1.33      0.92
               Total                35.13     11.35     5.87      4.59      3.70
128            0-DM Cleaning        1.80      1.64      1.83      1.67      1.80
               Dedispersion         3.95      0.89      0.44      0.32      0.22
               Baselining           2.01      0.60      0.26      0.18      0.12
               Normalisation        1.77      0.52      0.23      0.16      0.11
               Filtering            2.38      0.70      0.31      0.21      0.14
               Find Giants          18.41     5.50      2.38      1.63      1.11
               Total                39.11     12.32     6.56      4.96      4.04
256            0-DM Cleaning        1.86      1.80      1.69      1.81      1.67
               Dedispersion         3.78      0.93      0.47      0.32      0.22
               Baselining           2.02      0.60      0.26      0.18      0.12
               Normalisation        1.77      0.52      0.23      0.16      0.11
               Filtering            2.68      0.80      0.35      0.24      0.16
               Find Giants          21.55     6.51      2.82      1.94      1.31
               Total                42.47     13.65     6.94      5.44      4.15
512            0-DM Cleaning        1.77      1.66      1.83      1.66      1.81
               Dedispersion         3.93      0.93      0.40      0.35      0.24
               Baselining           1.99      0.60      0.26      0.18      0.12
               Normalisation        1.75      0.53      0.23      0.16      0.11
               Filtering            2.98      0.90      0.39      0.27      0.18
               Find Giants          25.46     7.67      3.29      2.29      1.56
               Total                46.96     14.86     7.43      5.70      4.58

Table A.3: Sample of the performance timings (in seconds) obtained from Heimdall. This table is obtained from a random file from the dataset; the results show the timing metrics that contributed most to performance, for all parameter combinations.

References

[1] D. R. Lorimer et al., A bright millisecond radio burst of extragalactic origin, Science, 318, 777–780, 2007.
[2] D. Thornton et al., A population of fast radio bursts at cosmological distances, Science, 341, 53–56, 2013.
[3] J. M. Cordes and T. J. W. Lazio, NE2001. I. A new model for the Galactic distribution of free electrons and its fluctuations, arXiv:astro-ph/0207156, 2003.
[4] E.
Petroff et al., FRBCAT: The Fast Radio Burst Catalogue, MNRAS, 482, 3109–3204, 2019.
[5] R. A. Perley et al., The role of legacy radio telescopes in the era of fast radio bursts, Astronomy & Astrophysics Review, 27, 1–45, 2019.
[6] N. T. Locatelli, G. Bernardi, G. Bianchi, R. Chiello, A. Magro, G. Naldi, M. Pilia, G. Pupillo, A. Ridolfi, G. Setti, et al., The Northern Cross fast radio burst project–I. Overview and pilot observations at 408 MHz, Monthly Notices of the Royal Astronomical Society, 494(1), 1229–1236, 2020.
[7] A. De Barro, A. Magro, K. Bugeja, G. Naldi, N. Ragno, F. Fiori, and V. Cesare, Towards a realtime FRB pipeline at the Northern Cross radio telescope, CEUR Workshop Proceedings, Vol. 4130, Paper 111, 2025.
[8] G. Naldi et al., The design of a new digital signal processing system for the upgraded Northern Cross Radio Telescope, URSI AP-RASC, 2025.
[9] B. R. Barsdell, M. Bailes, D. G. Barnes, and C. J. Fluke, Accelerating incoherent dedispersion, Monthly Notices of the Royal Astronomical Society, 422(1), 379–392, 2012.
[10] E. F. Keane, E. D. Barr, A. Jameson, V. Morello, M. Caleb, S. Bhandari, E. Petroff, A. Possenti, M. Burgay, C. Tiburzi, et al., The survey for pulsars and extragalactic radio bursts–I. Survey description and overview, Monthly Notices of the Royal Astronomical Society, 473(1), 116–135, 2018.
[11] E. F. Keane, The future of fast radio burst science, Nature Astronomy, 2, 865–867, 2018.
[12] E. F. Keane and E. Petroff, Fast radio bursts: search sensitivities and completeness, Monthly Notices of the Royal Astronomical Society, 447(3), 2852–2856, 2015.
[13] H. Qiu, E. F. Keane, K. W. Bannister, C. W. James, and R. M. Shannon, Systematic performance of the ASKAP fast radio burst search algorithm, Monthly Notices of the Royal Astronomical Society, 523(4), 5109–5119, 2023.
[14] L. Connor, A. van Leeuwen, and J. M. Cordes, Non-cosmological FRBs from young supernova remnants, MNRAS, 458, L19–L23, 2016.
[15] D. Agarwal, A. Aggarwal, M. M. Anderson, et al., FETCH: A deep-learning based classifier for fast transient classification, Monthly Notices of the Royal Astronomical Society, 497(2), 1661–1674, 2020.
[16] L. Connor and J. van Leeuwen, Applying Deep Learning to Fast Radio Burst Classification, The Astronomical Journal, 156, 256, 2018.
[17] D. R. Lorimer, M. A. McLaughlin, and M. Bailes, The discovery and significance of fast radio bursts, Astrophysics and Space Science, 369(6), 59, 2024.
[18] B. T. Draine, Physics of the Interstellar and Intergalactic Medium, Princeton University Press, Princeton, 2011.
[19] J. M. Cordes and I. Wasserman, Supergiant pulses from extragalactic neutron stars, Monthly Notices of the Royal Astronomical Society, 457(1), 232–257, 2016.
[20] D. R. Lorimer and M. Kramer, Handbook of Pulsar Astronomy, Cambridge University Press, Cambridge, 2005.
[21] A. Geminardi, P. Esposito, G. Bernardi, M. Pilia, D. Pelliciari, G. Naldi, D. Dallacasa, R. Turolla, L. Stella, F. Perini, F. Verrecchia, C. Casentini, M. Trudu, R. Lulli, A. Maccaferri, A. Magro, A. Mattana, G. Bianchi, G. Pupillo, C. Bortolotti, M. Tavani, M. Roma, M. Schiaffino, and G. Setti, The Northern Cross Fast Radio Burst project: V. Search for transient radio emission from Galactic magnetars, Astronomy & Astrophysics, 700, A19, 2025.
[22] D. Pelliciari, G. Bernardi, M. Pilia, G. Naldi, G. Maccaferri, F. Verrecchia, C. Casentini, M. Perri, F. Kirsten, G. Bianchi, C. Bortolotti, L. Bruno, D. Dallacasa, P. Esposito, A. Geminardi, S. Giarratana, M. Giroletti, R. Lulli, A. Maccaferri, A. Magro, A. Mattana, F. Perini, G. Pupillo, M. Roma, M. Schiaffino, G. Setti, M. Tavani, M. Trudu, and A. Zanichelli, The Northern Cross Fast Radio Burst project: IV. Multi-wavelength study of the actively repeating FRB 20220912A, Astronomy & Astrophysics, 690, A219, 2024.
[23] C. J. Law et al., realfast: Real-time, commensal fast transient surveys with the Very Large Array, ApJS, 236, 8, 2018.
[24] J.-P. Macquart et al., The commensal real-time ASKAP fast transient incoherent-sum survey, Publications of the Astronomical Society of Australia, 2025.
[25] W. Farah et al., Five new real-time detections of fast radio bursts with UTMOST, MNRAS, 488, 2989–3001, 2019.
[26] X. X. Zhang et al., An overview of FAST real-time fast radio burst searching system, Research in Astronomy and Astrophysics, 23, 095023, 2023.
[27] CHIME/FRB Collaboration (M. Amiri et al.), The First CHIME/FRB Fast Radio Burst Catalog, ApJS, 257, 59, 2021.
[28] CHIME/FRB Collaboration (M. Amiri et al.), Updating the First CHIME/FRB Catalog of Fast Radio Bursts with Baseband Data, ApJ, 969, 145, 2024.
[29] CHIME/FRB Collaboration, The Second CHIME/FRB Catalog of Fast Radio Bursts, arXiv:2601.09399, 2026.
[30] M. Pilia et al., The Northern Cross radio telescope: current status and future perspectives, Proceedings of Science (PoS), ICRC2019, 687, 2020.
[31] J. M. Cordes and M. A. McLaughlin, Searches for fast radio transients, ApJ, 596, 1142–1154, 2003.
[32] E. F. Keane and D. J. McKenna, Trial dispersion measure spacing in fast radio burst searches with HEIMDALL, Research Notes of the American Astronomical Society, 10(3), 43, 2026.
[33] L. J. M. Houben, H. Falcke, L. G. Spitler, E. D. Barr, M. Berezina, D. J. Champion, R. Karuppusamy, and M. Kramer, The Northern High Time Resolution Universe pulsar survey – II. Single-pulse search set-up and simulations, arXiv:2511.17797, 2025.
[34] L. van der Maaten and G. Hinton, Visualizing data using t-SNE, Journal of Machine Learning Research, 9, 2579–2605, 2008.
[35] R. J. G. B. Campello, D. Moulavi, and J. Sander, Density-based clustering based on hierarchical density estimates, PAKDD, 160–172, 2013.
[36] M. Friedman, The use of ranks to avoid the assumption of normality implicit in the analysis of variance, Journal of the American Statistical Association, 32, 675–701, 1937.
[37] P. Nemenyi, Distribution-free multiple comparisons, PhD Thesis, Princeton University, 1963.
