Visual inspection of potential exocomet transits identified through machine learning and statistical methods
In this work, we explore several ways to detect possible exocomet transits in TESS (Transiting Exoplanet Survey Satellite) light curves. The first, presented in our previous work, is a machine learning approach based on the Random Forest algorithm. It was trained on asymmetric transit profiles obtained by modelling a comet transit and then applied to real stellar light curves from TESS Sector 1. This allowed us to detect 32 candidates with weak, non-periodic brightness dips that may correspond to comet-like events. One aim of this work is to analyse these events by visual inspection, to make sure that the detected features were not caused by instrumental effects. The second approach, proposed here, is an independent statistical method that tests the results of the machine learning algorithm and searches for asymmetric minima directly in the light curves. It was applied to β Pictoris light curves from TESS Sectors 5, 6, 32, and 33. The algorithm reproduced nearly all previously known events deeper than 0.03% of the stellar flux, showing that it can detect shallow, irregular flux changes in different TESS sectors and at different noise levels. The combination of machine learning, visual inspection, and statistical analysis facilitates the identification of faint, short-lived asymmetric transits in photometric data. Although the number of confirmed exocomet transits is still small, the growing body of observations points to their likely presence in many young planetary systems.
💡 Research Summary
This paper presents a comprehensive strategy for detecting exocomet transits in TESS photometry by combining machine‑learning classification, human visual inspection, and an independent statistical search. In the first stage, the authors trained a Random Forest classifier on 20 000 synthetic light‑curve segments. The synthetic transits were generated by Monte‑Carlo modelling of a dusty comet tail crossing a stellar disk, producing the characteristic asymmetric dip with a steep ingress and gradual egress. Feature extraction was performed with the TSFresh library, yielding a large set of time‑domain and frequency‑domain descriptors. Two precision subsets were defined based on the Combined Differential Photometric Precision (CDPP) – one with CDPP < 40 ppm and another with CDPP < 150 ppm – to test the model under different noise conditions. Cross‑validation showed an overall accuracy of 96 % and precision, recall, and F1‑score all above 95 %, indicating that the classifier can reliably separate asymmetric comet‑like events from symmetric planetary transits and random noise.
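The asymmetric training profile described above (steep ingress, gradual egress) can be illustrated with a toy model. The sketch below is not the authors' Monte-Carlo tail model: the function names, the Gaussian-fall and exponential-recovery shape, and all parameter values are assumptions chosen for illustration, and the per-sample Gaussian noise is only a crude stand-in for a CDPP level such as the 40 ppm and 150 ppm subsets.

```python
import math
import random


def comet_dip(t, t0, depth, ingress, egress):
    """Relative flux at time t for a toy asymmetric dip with minimum at t0.

    Hypothetical shape: a Gaussian fall of width `ingress` before t0 and an
    exponential recovery with e-folding time `egress` after it.
    """
    if t < t0:
        shape = math.exp(-((t - t0) / ingress) ** 2)
    else:
        shape = math.exp(-(t - t0) / egress)
    return 1.0 - depth * shape


def synthetic_segment(n=200, span=2.0, depth=1e-3, noise_ppm=40.0, seed=0):
    """Return (times, flux) for one noisy segment with a dip at mid-span.

    `noise_ppm` is the standard deviation of white noise added to each
    sample, in parts per million (an assumed proxy for photometric precision).
    """
    rng = random.Random(seed)
    times = [span * i / (n - 1) for i in range(n)]
    sigma = noise_ppm * 1e-6
    flux = [
        comet_dip(t, span / 2.0, depth, ingress=0.05, egress=0.3)
        + rng.gauss(0.0, sigma)
        for t in times
    ]
    return times, flux


# Two noise regimes, loosely mirroring the two precision subsets.
quiet_times, quiet_flux = synthetic_segment(noise_ppm=40.0, seed=1)
noisy_times, noisy_flux = synthetic_segment(noise_ppm=150.0, seed=2)
```

Segments like these, labelled as comet-like, together with symmetric or dip-free segments labelled otherwise, would form the kind of training set a feature extractor and classifier could be fitted on.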
Applying this classifier to the 2-minute cadence PDCSAP light curves from TESS Sector 1 yielded 32 candidate events. The authors then visually inspected each candidate, taking into account a known instrumental artifact that appears between BTJD 1347 and 1349 in all Sector 1 data. They identified three candidates that actually display periodic, symmetric dips consistent with planetary transits, four single deep events that could be caused by long-period planets or even Solar-system asteroids, and seven cases where the apparent asymmetry is produced by the artifact or by edge-of-sector noise. In total, fifteen candidates were dismissed as spurious, leaving only two light curves that retain the hallmarks of an exocomet transit: a shallow (≲0.1% of stellar flux), non-repeating, asymmetric dip.
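Part of this vetting step can be automated before a human looks at the light curve. The following is a minimal sketch, assuming time stamps in BTJD and using the 1347 to 1349 window quoted above; the helper names are hypothetical and do not come from the paper.

```python
# Sector 1 artifact window quoted in the summary, in BTJD.
ARTIFACT_WINDOW = (1347.0, 1349.0)


def mask_window(times, flux, window=ARTIFACT_WINDOW):
    """Drop every sample whose time stamp falls inside the window."""
    lo, hi = window
    kept = [(t, f) for t, f in zip(times, flux) if not (lo <= t <= hi)]
    return [t for t, _ in kept], [f for _, f in kept]


def overlaps_window(t_start, t_end, window=ARTIFACT_WINDOW):
    """True if a candidate event spanning [t_start, t_end] touches the window."""
    lo, hi = window
    return t_start <= hi and t_end >= lo


# A candidate that lies inside the window would be flagged for extra scrutiny.
flagged = overlaps_window(1348.2, 1348.6)
clean = overlaps_window(1330.0, 1331.0)
```

Flagging rather than silently discarding keeps the final decision with the inspector, which matches the role visual vetting plays in the study.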
To validate the machine‑learning results independently, the authors developed a statistical algorithm that searches directly for asymmetric minima in the light curves. This algorithm was applied to β Pictoris observations from TESS Sectors 5, 6, 32, and 33. By setting a detection threshold at a depth of 0.03 % of the stellar flux, the method recovered virtually all previously reported exocomet events in β Pictoris and demonstrated the ability to detect very shallow, irregular flux drops across sectors with varying noise levels. The successful reproduction of known events confirms the robustness of the statistical approach and its complementarity to the machine‑learning pipeline.
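The summary does not give the details of the statistical algorithm, so the sketch below shows only one simple way to search for asymmetric minima. It assumes a detrended, smoothed light curve; the edge level at half the depth threshold, the asymmetry index, and the `min_asymmetry` cut are all illustrative assumptions, and only the 0.03% depth threshold is taken from the text.

```python
import math
import statistics


def find_asymmetric_dips(times, flux, min_depth=3e-4, min_asymmetry=0.3):
    """Return dips deeper than min_depth whose egress is slower than ingress.

    Asymmetry index = (egress - ingress) / (egress + ingress), in [-1, 1];
    positive values mean a steep ingress followed by a gradual egress.
    """
    baseline = statistics.median(flux)
    edge_level = baseline * (1.0 - 0.5 * min_depth)  # assumed dip boundary
    events = []
    i, n = 0, len(flux)
    while i < n:
        if flux[i] >= edge_level:
            i += 1
            continue
        start = i
        while i < n and flux[i] < edge_level:
            i += 1
        end = i - 1  # last sample below the edge level
        k = min(range(start, end + 1), key=lambda j: flux[j])
        depth = 1.0 - flux[k] / baseline
        ingress = times[k] - times[start]
        egress = times[end] - times[k]
        total = ingress + egress
        if depth < min_depth or total <= 0.0:
            continue
        asymmetry = (egress - ingress) / total
        if asymmetry >= min_asymmetry:
            events.append({"t_min": times[k], "depth": depth,
                           "asymmetry": asymmetry})
    return events


def toy_curve(ingress, egress, depth=1e-3, t0=2.0, n=400, step=0.01):
    """Noise-free test curve: a two-sided Gaussian dip with unequal widths."""
    times = [i * step for i in range(n)]
    flux = []
    for t in times:
        width = ingress if t < t0 else egress
        flux.append(1.0 - depth * math.exp(-((t - t0) / width) ** 2))
    return times, flux


comet_like = find_asymmetric_dips(*toy_curve(ingress=0.05, egress=0.3))
symmetric = find_asymmetric_dips(*toy_curve(ingress=0.2, egress=0.2))
```

On the toy curves, the dip with a slow egress is reported while the symmetric dip is rejected. On real data the search would have to follow detrending and smoothing, since unbinned noise can cross a 0.03% threshold on its own.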
The paper argues that the three-pronged approach (large-scale candidate generation via Random Forest, expert visual vetting to eliminate false positives from instrumental effects and stellar variability, and an independent statistical search for asymmetric dips) provides a powerful framework for uncovering the elusive, low-amplitude, aperiodic signatures of exocomets. The authors note that while the current number of confirmed exocomet transits remains small, the methodology scales well to the ever-growing TESS data set and can be extended to other missions (e.g., PLATO, Roman). They also stress that definitive confirmation will require complementary observations, such as high-resolution spectroscopy to detect associated Ca II or Na I absorption, multi-wavelength photometry, or long-baseline monitoring to establish repeatability. Overall, the study demonstrates that combining modern data-driven techniques with careful human scrutiny can significantly advance the census of exocomet activity in young planetary systems.