ExoDNN: Boosting exoplanet detection with artificial intelligence. Application to Gaia Data Release 3

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv paper.

We combine Gaia Data Release 3 and artificial intelligence to enhance the current statistics of substellar companions, particularly within regions of the orbital period vs. mass parameter space that remain poorly constrained by the radial velocity and transit detection methods. Using supervised learning, we train a deep neural network to recognise the characteristic distribution of the fit quality statistics corresponding to a Gaia DR3 astrometric solution for a non-single star. We generate a deep learning model, ExoDNN, which predicts the probability that a DR3 source hosts unresolved companions based on those fit quality statistics. Applying the predictive capability of ExoDNN to a volume-limited sample of F, G, K, and M stars from Gaia DR3, we have produced a list of 7414 candidate stars hosting companions. The stellar properties of these candidates, such as their mass and metallicity, are similar to those of the Gaia DR3 non-single star sample. We also identify synergies with future observatories, such as PLATO, and we propose a follow-up strategy to investigate the most promising candidates among those samples.

💡 Research Summary

The paper presents a novel application of deep learning to the Gaia Data Release 3 (DR3) astrometric catalogue in order to identify stars that likely host unresolved substellar companions, a regime that is poorly sampled by traditional transit and radial‑velocity (RV) surveys. The authors develop a deep neural network called ExoDNN, which takes as input the astrometric fit‑quality statistics provided by Gaia’s AGIS pipeline (renormalised unit weight error, goodness‑of‑fit, excess noise, chi‑square, and the longest semi‑major axis of the 5‑D error ellipsoid) together with a set of photometric and spectroscopic parameters. Because a clean, large sample of confirmed single stars is not available, the training set is constructed synthetically: 100 000 simulated sources are generated, half of which are single‑star models and half are binary systems with companion masses ranging from 10 M_Jup to 150 M_Jup, orbital periods uniformly distributed between 0.1 and 10 yr, and distances uniformly distributed between 1 and 100 pc. The authors use the Astromet library to produce Gaia‑like one‑dimensional along‑scan observations for each simulated system, add Gaussian measurement noise (σ = 50 µas, representative of the median DR3 uncertainty at G = 12), and fit a five‑parameter single‑star model to obtain the same set of fit‑quality statistics that appear in the DR3 main source table. To make the synthetic data more realistic, 16 photometric and 4 spectroscopic parameters are sampled conditionally from the distributions observed in real DR3 sources.
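The sampling step described above can be illustrated with a minimal sketch. It is not the authors' pipeline and does not use the Astromet library. It only draws parameters from the quoted ranges and estimates the size of the astrometric wobble that the single-star fit would fail to absorb. The function names, the uniform companion-mass distribution, the 1 M_Sun host, and the circular orbit are all assumptions made for illustration.

```python
import random

M_JUP_IN_MSUN = 9.546e-4  # approximate Jupiter mass in solar masses


def draw_source(rng, is_binary):
    """Draw one simulated source from the parameter ranges quoted in the summary."""
    source = {"label": int(is_binary), "distance_pc": rng.uniform(1.0, 100.0)}
    if is_binary:
        # Assumption: uniform in mass; the summary only gives the range.
        source["companion_mjup"] = rng.uniform(10.0, 150.0)
        source["period_yr"] = rng.uniform(0.1, 10.0)
    return source


def make_population(n_sources, seed=0):
    """Build a balanced list: first half single stars, second half binaries."""
    rng = random.Random(seed)
    half = n_sources // 2
    return [draw_source(rng, is_binary=(i >= half)) for i in range(n_sources)]


def astrometric_signature_uas(host_msun, companion_mjup, period_yr, distance_pc):
    """Angular semi-major axis of the host's orbit about the barycentre, in µas."""
    m_c = companion_mjup * M_JUP_IN_MSUN
    total = host_msun + m_c
    # Kepler's third law in units of AU, solar masses, and years.
    a_rel_au = (total * period_yr ** 2) ** (1.0 / 3.0)
    # The host moves on the smaller orbit, scaled by the mass fraction.
    a_host_au = a_rel_au * m_c / total
    return a_host_au / distance_pc * 1e6  # AU / pc gives arcsec; convert to µas


population = make_population(100_000)
n_binaries = sum(s["label"] for s in population)

# Hypothetical case: 10 M_Jup companion, 1 yr period, 1 M_Sun host at 10 pc.
signal_uas = astrometric_signature_uas(1.0, 10.0, 1.0, 10.0)
print(n_binaries, signal_uas)
```

Under these assumptions the hypothetical case gives a wobble of roughly 0.95 mas, well above the 50 µas per-observation noise quoted above, which is why such a companion should leave a visible imprint on the fit-quality statistics. The signal shrinks linearly with distance and with companion mass.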

The neural network architecture consists of an input layer (≈25 features), three fully‑connected hidden layers (128, 64, and 32 neurons), batch normalisation, a dropout rate of 0.2, and a sigmoid output that yields the probability p that a given source is non‑single. Training uses binary cross‑entropy loss and the Adam optimiser. A 20 % hold‑out validation set achieves an ROC‑AUC of 0.94 and an accuracy of about 0.91, demonstrating strong discriminative power. The model is then applied to a volume‑limited sample of F, G, K, and M dwarfs within 100 pc in Gaia DR3 (≈1.2 million stars). By selecting sources with p > 0.7, the authors identify 7 414 candidate stars that likely host unresolved companions. The stellar mass and metallicity distributions of these candidates closely match those of the Gaia non‑single‑star (NSS) catalogue, indicating that the network is not biased toward any particular stellar population.
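To make the layer sizes and the p > 0.7 cut concrete, here is a minimal inference-time sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the ReLU activation, the exact input width of 25, and the random weights are assumed, batch normalisation is omitted for brevity, and dropout is inactive at inference so it does not appear. All function names are hypothetical.

```python
import math
import random

# About 25 input features, hidden layers of 128, 64, and 32 units, one output.
LAYER_SIZES = [25, 128, 64, 32, 1]


def n_parameters():
    """Count weights and biases of the dense layers (batch norm excluded)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:]))


def init_params(rng):
    """Random weights for illustration only; a trained model would load its own."""
    params = []
    for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:]):
        scale = math.sqrt(2.0 / n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        params.append((weights, [0.0] * n_out))
    return params


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(params, features):
    """Return p(non-single) for one source; ReLU hidden units are an assumption."""
    h = features
    for weights, biases in params[:-1]:
        h = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    out_weights, out_biases = params[-1]
    logit = sum(w * x for w, x in zip(out_weights[0], h)) + out_biases[0]
    return sigmoid(logit)


def select_candidates(params, sources, threshold=0.7):
    """Indices of sources whose predicted probability exceeds the threshold."""
    return [i for i, x in enumerate(sources) if forward(params, x) > threshold]


rng = random.Random(0)
params = init_params(rng)
sources = [[rng.gauss(0.0, 1.0) for _ in range(LAYER_SIZES[0])] for _ in range(5)]
print(n_parameters(), select_candidates(params, sources))
```

With untrained random weights the selected indices carry no meaning; the sketch only shows how the sigmoid output is turned into a candidate list. Raising the threshold trades completeness for purity of the candidate sample.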

The paper discusses the scientific implications of this candidate list. Many of the identified systems occupy the long‑period (≳5 yr) and low‑mass (20–80 M_Jup) region of the period–mass diagram, a domain where transit and RV methods are intrinsically insensitive. Consequently, these candidates provide a valuable target list for follow‑up with upcoming facilities such as PLATO, JWST, and the ELT, enabling direct imaging, high‑precision RV, or astrometric monitoring with future Gaia releases (DR4) to confirm the companions and refine their orbital parameters. The authors also outline a follow‑up strategy that prioritises candidates with the highest probability scores, favourable brightness, and minimal crowding.
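One way to picture the prioritisation criteria above is a simple filter followed by a sort. This is only a sketch: the field names (`p`, `g_mag`, `n_neighbours`), the magnitude limit, and the neighbour cut are hypothetical and are not taken from the paper.

```python
def prioritise(candidates, g_mag_limit=12.0, max_neighbours=0):
    """Keep bright, uncrowded candidates and rank them by descending probability.

    The thresholds are placeholders; ties in probability go to the brighter star.
    """
    kept = [c for c in candidates
            if c["g_mag"] <= g_mag_limit and c["n_neighbours"] <= max_neighbours]
    return sorted(kept, key=lambda c: (-c["p"], c["g_mag"]))


# Made-up example rows, not real catalogue entries.
candidates = [
    {"source_id": "A", "p": 0.95, "g_mag": 9.5, "n_neighbours": 0},
    {"source_id": "B", "p": 0.99, "g_mag": 14.2, "n_neighbours": 0},  # too faint
    {"source_id": "C", "p": 0.80, "g_mag": 8.1, "n_neighbours": 0},
    {"source_id": "D", "p": 0.97, "g_mag": 10.0, "n_neighbours": 3},  # crowded
]

ranked = prioritise(candidates)
print([c["source_id"] for c in ranked])
```

A hard filter is the simplest choice here; a weighted score combining the three criteria would be an equally reasonable alternative if faint but high-probability sources should not be discarded outright.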

Limitations are acknowledged. The reliance on synthetic training data means that subtle systematics present in the real DR3 catalogue (e.g., colour‑dependent calibration errors, crowding effects, or variability‑induced astrometric noise) may not be fully captured, potentially leading to false positives. Moreover, the model only uses the five‑parameter single‑star fit statistics, not the full time‑series astrometric data, which constrains the sensitivity, especially for very low‑mass planets. The authors anticipate that the forthcoming Gaia DR4, with improved astrometric precision and full orbital solutions, will allow retraining of ExoDNN or the development of more sophisticated models that can push detection limits into the true planetary regime (<20 M_Jup).

In summary, the study demonstrates that deep learning can effectively exploit Gaia’s astrometric fit‑quality metrics to flag hidden companions, dramatically expanding the pool of potential exoplanet hosts beyond the reach of traditional methods. The ExoDNN framework, the publicly released candidate catalogue, and the proposed synergy with next‑generation observatories together represent a significant step toward a more complete census of planetary systems in the solar neighbourhood.
