Title: 287,872 Supermassive Black Holes Masses: Deep Learning Approaching Reverberation Mapping Accuracy
ArXiv ID: 2512.04803
Date: 2025-12-04
Authors: Yuhao Lu, HengJian SiTu, Jie Li, Yixuan Li, Yang Liu, Wenbin Lin, Yu Wang
📝 Abstract
We present a population-scale catalogue of 287,872 supermassive black hole masses with high accuracy. Using a deep encoder-decoder network trained on optical spectra with reverberation-mapping (RM) based labels of 849 quasars and applied to all SDSS quasars up to $z=4$, our method achieves a root-mean-square error of $0.058$\,dex, a relative uncertainty of $\approx 14\%$, and coefficient of determination $R^{2}\approx0.91$ with respect to RM-based masses, far surpassing traditional single-line virial estimators. Notably, the high accuracy is maintained for both low ($<10^{7.5}\,M_\odot$) and high ($>10^{9}\,M_\odot$) mass quasars, where empirical relations are unreliable.
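To make the quoted metrics concrete, the following is a minimal Python sketch, not the authors' evaluation code, of how an RMSE in dex, the corresponding fractional uncertainty, and $R^{2}$ can be computed from pairs of log-masses. The function names and the toy values are hypothetical, and the conversion $10^{\sigma}-1$ is one common convention assumed here; the abstract does not state which conversion it uses.

```python
import math


def dex_to_fractional(sigma_dex):
    """Fractional (linear) error for a scatter given in dex.

    Assumed convention: a scatter of sigma dex is a factor of 10**sigma.
    """
    return 10.0 ** sigma_dex - 1.0


def rmse_dex(log_true, log_pred):
    """Root-mean-square error between two lists of log10 masses (in dex)."""
    residuals = [p - t for t, p in zip(log_true, log_pred)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def r_squared(log_true, log_pred):
    """Coefficient of determination of predictions against reference values."""
    mean_true = sum(log_true) / len(log_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(log_true, log_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in log_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy log10(M / M_sun) values, invented purely for illustration.
    reference = [7.0, 8.0, 9.0]
    predicted = [7.1, 7.9, 9.0]
    print("RMSE [dex]:", rmse_dex(reference, predicted))
    print("R^2:", r_squared(reference, predicted))
    print("0.058 dex as a fraction:", dex_to_fractional(0.058))
```

Under this assumed convention, a scatter of 0.058 dex corresponds to a factor of $10^{0.058}\approx 1.14$, which is consistent with the quoted relative uncertainty of $\approx 14\%$.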
📄 Full Content
Yuhao Lu (a,b), HengJian SiTu (c), Jie Li (c), Yixuan Li (c,b), Yang Liu (a,b,d), Wenbin Lin (a,c,e) and Yu Wang (f,b,e,g,∗)
(a) School of Computer Science, University of South China, Hengyang, 421001, China
(b) ICRANet-AI, Brickell Avenue 701, Miami, FL 33131, USA
(c) School of Mathematics and Physics, University of South China, Hengyang, 421001, China
(d) Department of Physics E. Pancini, University Federico II, Naples, 80126, Italy
(e) ICRANet, Piazza della Repubblica 10, Pescara, 65122, Italy
(f) ICRA and Dipartimento di Fisica, Sapienza Università di Roma, P.le Aldo Moro 5, Rome, 00185, Italy
(g) INAF – Osservatorio Astronomico d’Abruzzo, Via M. Maggini snc, Teramo, I-64100, Italy
ARTICLE INFO
Keywords:
supermassive black holes
quasars
machine learning
black hole mass estimation
SDSS-RM
1. Introduction
Supermassive black holes (SMBHs) with masses spanning from roughly $10^{5}\,M_\odot$ to $10^{10}\,M_\odot$ are commonly observed at the centers of most massive galaxies (Kormendy and Richstone, 1995; Ferrarese and Ford, 2005; Kormendy and Ho, 2013). Recent breakthroughs, particularly the Event Horizon Telescope’s imaging of the SMBH at the core of the elliptical galaxy M 87, have provided unprecedented direct observational evidence (Event Horizon Telescope Collaboration et al., 2019). It is now firmly established that SMBH masses are strongly correlated with the characteristics of their host galaxies, including bulge mass, stellar velocity dispersion, surface brightness, and luminosity (Ferrarese and Merritt, 2000; Merritt and Ferrarese, 2001; Häring and Rix, 2004; Saglia et al., 2016). These correlations appear to persist across both local and high-redshift galaxies (Graham and Scott, 2013; Schramm and Silverman, 2013; Izumi et al., 2019), suggesting a fundamental co-evolutionary link despite the vast difference in physical scales between SMBHs and their hosts (Hopkins et al., 2008; Schawinski et al., 2010; Izumi et al., 2019).
Nevertheless, significant challenges remain, particularly in understanding how such massive black holes could have formed within the universe’s first billion years (Wu et al., 2015; Inayoshi et al., 2020). Current models suggest that SMBHs grow predominantly through gas accretion and galaxy mergers, releasing substantial energy that profoundly affects host galaxy evolution (Alexander and Hickox, 2012; Ciotti and Ostriker, 2007; Sijacki et al., 2007).
∗Corresponding author
lwb@usc.edu.cn (W. Lin); yu.wang@icranet.org (Y. Wang)
Observationally, SMBHs manifest as active galactic nuclei (AGNs) or quasars, whose extreme luminosities offer key insights into accretion processes and black hole growth (Soltan, 1982). However, directly measuring SMBH masses remains challenging. Spectroscopic techniques, though widely used, are labor-intensive and have yielded only about one million estimates over the past two decades (Shen et al., 2011; Kelly and Shen, 2013). The advent of large-scale surveys such as the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST), expected to detect nearly $10^{8}$ quasars (Ivezić et al., 2019), will require far more efficient and scalable mass estimation methods. Alternative approaches based on variability measurements in optical and X-ray bands show promise (McHardy et al., 2006; Burke et al., 2021), but are complicated by nonlinear physical dependencies and the massive data volumes involved. Meanwhile, direct dynamical measurements of SMBH masses remain limited to only a small sample of nearby galaxies (Kuo et al., 2011; Kormendy and Ho, 2013; McConnell and Ma, 2013; Shankar et al., 2016, 2019).
While classical machine learning techniques, such as symbolic regression, random forests, and photometric regressions, have contributed to early progress in black hole mass estimation by extending traditional scaling relations (Jin and Davis, 2023; He et al., 2022), their performance remains fundamentally constrained by shallow architectures and limited feature representations. In contrast, deep learning approaches have shown greater promise in capturing the non-linear dependencies inherent in high-dimensional astrophysical data. For instance, variability-based models such as AGNet achieve a scatter of approximately 0.37 dex
Lu et al.: Preprint sub