TiCard: Deployable EXPLAIN-only Residual Learning for Cardinality Estimation

Reading time: 5 minutes

📝 Original Info

  • Title: TiCard: Deployable EXPLAIN-only Residual Learning for Cardinality Estimation
  • ArXiv ID: 2512.14358
  • Date: 2025-12-16
  • Authors: Qizhi Wang

📝 Abstract

Cardinality estimation is a key bottleneck for cost-based query optimization, yet deployable improvements remain difficult: classical estimators miss correlations, while learned estimators often require workload-specific training pipelines and invasive integration into the optimizer. This paper presents TiCard, a low-intrusion, correction-based framework that augments (rather than replaces) a database's native estimator. TiCard learns multiplicative residual corrections using EXPLAIN-only features, and uses EXPLAIN ANALYZE only for offline labels. We study two practical instantiations: (i) a Gradient Boosting Regressor for sub-millisecond inference, and (ii) TabPFN, an in-context tabular foundation model that adapts by refreshing a small reference set without gradient retraining. On TiDB with TPC-H and the Join Order Benchmark, in a low-trace setting (263 executions total; 157 used for learning), TiCard improves operator-level tail accuracy substantially: P90 Q-error drops from 312.85 (native) to 13.69 (TiCard-GBR), and P99 drops from 37,974.37 to 3,416.50 (TiCard-TabPFN), while a join-only policy preserves near-perfect median behavior. We position TiCard as an AI4DB building block focused on deployability: explicit scope, conservative integration policies, and an integration roadmap from offline correction to in-optimizer use.

💡 Deep Analysis

📄 Full Content

TiCard: Deployable EXPLAIN-only Residual Learning for Cardinality Estimation

Qizhi Wang (0009-0004-1346-5066)
PingCAP, Data & AI-Innovation Lab, Beijing, China
qizhi.wang@pingcap.com

Keywords. Cardinality estimation; Query optimization; ML-for-DB; AI4DB; In-context learning; TiDB; EXPLAIN.

1 Introduction

Cardinality estimation (CE)—predicting the number of rows produced by each operator—is central to cost-based query optimization, affecting join ordering, physical operator choice, and memory management [1, 2].
Despite decades of work, CE remains brittle in modern analytical workloads, primarily because independence assumptions and limited statistics struggle with multi-column predicates and cross-table correlations [3, 4]. From an AI4DB perspective, the challenge is not only improving accuracy but doing so in a way that is deployable: learned estimators can be accurate, yet are often costly to train, sensitive to workload drift, and require deep integration into the optimizer's enumeration loop [5–7]. In practice, database teams frequently prefer incremental, low-risk changes that preserve existing optimizer behavior and can be rolled out conservatively.

This paper proposes a pragmatic framing: treat the native optimizer as a strong prior and learn only its residual error. We introduce TiCard, a correction-based CE framework that learns multiplicative adjustments on top of the optimizer estimate. Crucially, TiCard's feature pipeline is derived from EXPLAIN only, enabling a low-intrusion path that leverages existing database interfaces. EXPLAIN ANALYZE is used solely for offline label collection.

1.1 Scope and deployability goals

We explicitly scope this work to the setting that is most actionable for deployment teams:

  • Low intrusion: learn from existing interfaces (EXPLAIN / EXPLAIN ANALYZE) without requiring a new optimizer or deep runtime instrumentation.
  • Data efficiency: operate under a low-trace regime where executed-query labels are expensive (hundreds of executions, not thousands).
  • Safety controls: support conservative policies (e.g., join-only correction, blending with fallback) to preserve strong baseline behavior.
  • Evaluation focus: we report offline, operator-level CE accuracy on collected plans; we do not claim end-to-end plan-quality or latency gains without full integration into the optimizer.

This scope is not a limitation to hide; it is a design choice motivated by deployability.
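The multiplicative residual formulation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the paper does not publish its exact feature set or hyperparameters, so the synthetic EXPLAIN-style features, the log-space target, and the model settings below are illustrative, not TiCard's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def make_target(native_est, true_card):
    # Multiplicative residual in log space: the model predicts
    # log(true / native), so corrected = native * exp(prediction).
    return np.log(np.maximum(true_card, 1.0) / np.maximum(native_est, 1.0))

# Synthetic stand-in for EXPLAIN-derived features (e.g. operator type,
# native row estimate, plan depth); real features would be parsed from
# EXPLAIN output, with EXPLAIN ANALYZE supplying offline labels.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))
native = np.exp(rng.uniform(0, 10, size=n))   # native optimizer estimates
true = native * np.exp(X[:, 0])               # error correlated with a feature

model = GradientBoostingRegressor(n_estimators=100, max_depth=3)
model.fit(X, make_target(native, true))

def corrected_estimate(model, features, native_est):
    # Apply the learned multiplicative correction to the native estimate.
    return native_est * np.exp(model.predict(features))

corr = corrected_estimate(model, X, native)
```

Working in log space keeps the correction symmetric between over- and under-estimation, which matches how Q-error penalizes both directions.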
We therefore also provide an integration roadmap that describes how to use TiCard-style corrections inside a live optimizer, and where the engineering risks and overheads arise.

1.2 Contributions

Our main contributions are:

1. EXPLAIN-only correction formulation: We frame CE as learning multiplicative residual corrections using a leakage-free feature pipeline derived from EXPLAIN plans.
2. Deployable model choices: We study two complementary instantiations—TabPFN in-context learning (fast refresh without gradient retraining) and Gradient Boosting Regression (very fast inference).
3. Conservative integration policies: We define and evaluate practical policies (join-only correction, blending, and a two-stage design for zero-cardinality cases) aimed at controlling regressions.
4. Empirical evaluation in a low-trace regime: On TiDB with TPC-H and JOB, we show large tail improvements at the operator level using only 157 training executions, and quantify setup/training and inference costs.
5. Integration roadmap: We outline a path from offline correction to online use, with overhead and risk considerations.
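The conservative policies in contribution 3 can be sketched as a small gating function, together with the Q-error metric reported in the abstract. This is an assumption-laden illustration: the operator names are TiDB-style examples, and the geometric blending form is one plausible realization of "blending with fallback", not necessarily the paper's exact rule.

```python
import numpy as np

def q_error(est, true):
    # Q-error: symmetric ratio between estimate and truth (>= 1, lower is
    # better); clamped at 1 row to avoid division by zero.
    est, true = np.maximum(est, 1.0), np.maximum(true, 1.0)
    return np.maximum(est / true, true / est)

# Hypothetical join operator names; a real deployment would match the
# engine's own plan node labels.
JOIN_OPS = {"HashJoin", "IndexJoin", "MergeJoin"}

def apply_policy(native_est, model_factor, op_type, alpha=0.5):
    # Join-only policy: correct only join operators, where native errors
    # compound; leave scans and other operators at the native estimate.
    if op_type not in JOIN_OPS:
        return native_est
    # Blending: geometric interpolation between native and corrected
    # estimates; alpha=0 falls back to native, alpha=1 trusts the model.
    corrected = native_est * model_factor
    return native_est ** (1 - alpha) * corrected ** alpha
```

Keeping non-join operators untouched is what preserves the near-perfect median behavior the abstract reports: most operators keep their (already good) native estimates, and the model only intervenes where tail errors live.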

Reference

This content is AI-processed based on open access ArXiv data.
