Interpretable Analytic Calabi-Yau Metrics via Symbolic Distillation

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Calabi–Yau manifolds are essential for string theory, but their Ricci-flat metrics are analytically intractable to compute. Here we show that symbolic regression can distill neural approximations into simple, interpretable formulas. Our five-term expression matches neural accuracy ($R^2 = 0.9994$) with 3,000-fold fewer parameters. Multi-seed validation confirms that geometric constraints select essential features, specifically power sums and symmetric polynomials, while permitting structural diversity. The functional form is maintained across the studied moduli range ($\psi \in [0, 0.8]$) with coefficients varying smoothly; we interpret these trends as empirical hypotheses within the accuracy regime of the locally trained teachers ($\sigma \approx 8\text{--}9\%$ at $\psi \neq 0$). The formula reproduces physical observables, namely volume integrals and Yukawa couplings, validating that symbolic distillation recovers compact, interpretable models for quantities previously accessible only to black-box networks.


💡 Research Summary

The paper tackles the long‑standing computational bottleneck of obtaining explicit Ricci‑flat metrics on Calabi‑Yau (CY) threefolds, which are essential for string‑theoretic phenomenology. Traditional approaches to solving the Monge–Ampère equation require on the order of 10²–10³ CPU‑hours per metric point, while recent neural‑network surrogates (e.g., Donaldson’s balanced metrics learned by deep nets) can evaluate the metric in milliseconds but remain opaque “black‑box” models. The authors ask whether the underlying geometric information can be distilled into a compact, human‑interpretable analytic expression that retains the surrogate’s accuracy across a range of complex‑structure moduli.

To answer this, they first construct a high‑fidelity teacher model: a Donaldson balanced metric at polynomial degree k = 10 on the Fermat quintic, represented by an H‑matrix with 875 parameters. This teacher achieves a Ricci‑flatness indicator σ ≈ 0.0065 (0.65 % deviation) and serves as the ground truth for symbolic regression. They generate 10⁵ uniformly distributed points on the projective hypersurface and define the regression target as the logarithm of the determinant ratio log(det g_alg / det g_FS), where g_alg is the teacher metric and g_FS the Fubini–Study metric.
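The regression target described above is a log-ratio of metric determinants, which is numerically delicate if computed naively. A minimal sketch of how such a target could be evaluated stably for batches of Hermitian 3×3 metrics, using `slogdet` rather than `det` (the function name and array layout are illustrative assumptions, not the paper's code):

```python
import numpy as np

def log_det_ratio(g_alg, g_fs):
    """Regression target log(det g_alg / det g_FS) for batches of
    Hermitian positive-definite 3x3 metrics with shape (N, 3, 3).
    slogdet avoids overflow/underflow in the determinant itself."""
    _, logdet_alg = np.linalg.slogdet(g_alg)  # log|det g_alg|, real for HPD input
    _, logdet_fs = np.linalg.slogdet(g_fs)
    return logdet_alg - logdet_fs
```

For Hermitian positive-definite matrices the determinants are real and positive, so the difference of log-determinants equals the log of the ratio exactly.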

Guided by the symmetry group U(1)⁵ ⋊ S₅, they select two gauge‑invariant features: the power sum p₂ = ∑|z_i|⁴ and a third‑order elementary symmetric combination σ₃ = (1 − 3p₂ + 2p₃)/6 (with p₃ = ∑|z_i|⁶). Newton’s identities guarantee that (p₂, σ₃) capture essentially all variation of the determinant ratio; an ablation study confirms that p₂ alone yields R² = 0.9981, while adding σ₃ raises R² to 0.9994.
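The two features are simple symmetric functions of the |z_i|². A minimal sketch of their computation, assuming the common normalization Σ|z_i|² = 1 (so the first power sum p₁ = 1, and σ₃ is the third elementary symmetric polynomial via Newton's identities); the function name is illustrative:

```python
import numpy as np

def symmetric_features(z):
    """Gauge-invariant features from homogeneous coordinates z, shape (N, 5).
    Normalizes so sum_i |z_i|^2 = 1, then computes power sums
    p2 = sum |z_i|^4, p3 = sum |z_i|^6, and sigma3 = (1 - 3 p2 + 2 p3)/6,
    which is the elementary symmetric e3 by Newton's identities (p1 = 1)."""
    x = np.abs(z) ** 2
    x = x / x.sum(axis=1, keepdims=True)   # fix the scaling gauge: p1 = 1
    p2 = (x ** 2).sum(axis=1)
    p3 = (x ** 3).sum(axis=1)
    sigma3 = (1.0 - 3.0 * p2 + 2.0 * p3) / 6.0
    return p2, p3, sigma3
```

Both features are invariant under the U(1)⁵ phase rotations (they depend only on |z_i|²) and under S₅ permutations of the coordinates, matching the symmetry of the Fermat quintic.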

Using the PySR symbolic-regression framework, they run a search of 200 generations with population size 60 and maximum tree depth 30, optimizing a Pareto objective that balances loss against expression complexity. The best model is a 15-node expression (≈ 54 % of the allowed complexity) consisting of five terms:

log(det g_alg / det g_FS) = c₀ + c₁ p₂² + c₂ σ₃ / p₃² + c₃ p₂ + c₄ σ₃.

At the Fermat point (ψ = 0) the fitted coefficients are c₀ ≈ 0, c₁ = +0.0022, c₂ = −0.0011, c₃ = +0.1245, c₄ = +0.050. The expression captures bulk behavior (the linear p₂ term), angular anisotropy (the σ₃ term), and a near-singular correction (the inverse-square σ₃/p₃² term) that is essential for high-fidelity approximation. On the full 10⁵-point dataset the formula achieves R² = 0.9994 and RMSE = 0.0116 relative to the neural surrogate, outperforming a cubic polynomial baseline (R² = 0.9981) while using only five parameters versus ten for the polynomial and ~15 000 for a typical deep net.
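A useful property of the discovered scaffold, not emphasized above, is that it is linear in the coefficients c₀…c₄, so once PySR has fixed the functional form, the coefficients can be (re)fit by ordinary least squares. The following sketch illustrates this (the helper name is an assumption; this is not the paper's pipeline):

```python
import numpy as np

def fit_five_term(p2, p3, sigma3, target):
    """Least-squares fit of the five-term ansatz
        target = c0 + c1*p2^2 + c2*sigma3/p3^2 + c3*p2 + c4*sigma3.
    The ansatz is linear in (c0..c4), so ordinary least squares
    suffices once the functional scaffold is fixed."""
    A = np.stack(
        [np.ones_like(p2), p2 ** 2, sigma3 / p3 ** 2, p2, sigma3],
        axis=1,
    )
    coeffs, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coeffs
```

This separation (symbolic search for the form, linear algebra for the constants) is also what makes tracking the coefficients across moduli cheap once the scaffold is known.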

Crucially, the same functional scaffold persists across the Dwork family of quintics for ψ ∈ [0, 0.8], the moduli range studied, with the coefficients varying smoothly as the complex structure is deformed.

