Learning with the p-adics
André F. T. Martins1,2
1Instituto Superior Técnico, Universidade de Lisboa, Portugal
2Instituto de Telecomunicações, Lisboa, Portugal
andre.t.martins@tecnico.ulisboa.pt
Abstract
Existing machine learning frameworks operate over the field of real numbers (R) and learn representations in real (Euclidean or Hilbert) vector spaces (e.g., R^d). Their underlying geometric properties align well with intuitive concepts such as linear separability, minimum enclosing balls, and subspace projection; and basic calculus provides a toolbox for learning through gradient-based optimization.

But is this the only possible choice? In this paper, we study the suitability of a radically different field as an alternative to R: the ultrametric and non-archimedean space of p-adic numbers, Qp. The hierarchical structure of the p-adics and their interpretation as infinite strings make them an appealing tool for code theory and hierarchical representation learning. Our exploratory theoretical work establishes the building blocks for classification, regression, and representation learning with the p-adics, providing learning models and algorithms. We illustrate how simple Quillian semantic networks can be represented as a compact p-adic linear network, a construction which is not possible with the field of reals. We finish by discussing open problems and opportunities for future research enabled by this new framework.
Figure 1: Hierarchical structure of Qp for p = 2.
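The tree structure depicted in Figure 1 can be made concrete: 2-adic integers whose base-2 digit expansions share their first k digits lie in the same ball of radius 2^(-k), i.e., they belong to the same depth-k branch of the tree. The following sketch (helper names are our own illustration, not from the paper) checks this for a small example:

```python
def bits(x, k):
    """First k base-2 digits of a nonnegative integer x, least significant first."""
    return [(x >> i) & 1 for i in range(k)]

def v2(x):
    """2-adic valuation: the largest e such that 2^e divides x (x != 0)."""
    e = 0
    while x % 2 == 0:
        x //= 2
        e += 1
    return e

# 6 = ...0110 and 14 = ...1110 share their first 3 binary digits,
# so |6 - 14|_2 = 2^(-3): both sit in the same depth-3 branch of the tree.
assert bits(6, 3) == bits(14, 3)
assert v2(14 - 6) == 3  # |6 - 14|_2 = 2^(-v2) = 1/8
```

Sharing a longer digit prefix thus corresponds exactly to being 2-adically closer, which is the hierarchical structure the figure illustrates.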
arXiv:2512.22692v1 [cs.LG] 27 Dec 2025
Contents

1 Introduction
2 Background
  2.1 Non-Archimedean absolute values and ultrametrics
  2.2 The field Qp of p-adic numbers and the ring Zp of p-adic integers
  2.3 Properties of Qp
3 p-adic Classification
  3.1 Linear classifiers: Unidimensional case
  3.2 Nonlinear classifiers: Unidimensional case
  3.3 Linear classifiers: Multidimensional inputs
  3.4 Learning a p-adic linear classifier
4 p-adic Regression
  4.1 Linear regression: Unidimensional case
  4.2 Linear regression: Multidimensional case
5 p-adic Representations
6 Open Problems
  6.1 Linear multi-class p-adic classifiers
  6.2 Gradient-based optimization and p-adic root finding
  6.3 Multi-layer p-adic networks
  6.4 Using all primes: adelic predictors
7 Conclusions
A In Ultrametric Spaces Every Triangle Is Isosceles
B Additional Properties of p-adic Balls
C A p-adic Proof of Kraft's Inequality
D Proofs
  D.1 Proof of Proposition 2
  D.2 Proof of Proposition 3
  D.3 Proof of Proposition 4
  D.4 Proof of Proposition 5
  D.5 Proof of Proposition 6
  D.6 Proof of Proposition 7
1 Introduction
Since their introduction by Hensel (1897), p-adic numbers have seen numerous applications in number theory, algebraic geometry, physics, and other fields (Koblitz, 1984; Robert, 2000; Gouvêa, 2020). They differ from the real numbers in many important ways, leading to fascinating and surprising results, such as the identity
1 + 2 + 4 + 8 + ⋯ = −1,
which holds in Q2, or the fun fact that in the p-adic world all triangles are isosceles and any point in a ball is a center of that ball.
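The identity 1 + 2 + 4 + 8 + ⋯ = −1 may look paradoxical, but it simply expresses 2-adic convergence: the partial sum of the first n terms is 2^n − 1, which agrees with −1 modulo 2^n, so the distance |partial sum − (−1)|_2 = 2^(-n) shrinks to zero. A minimal sketch (our own illustration, not code from the paper):

```python
def partial_sum(n):
    """Sum of the first n terms of 1 + 2 + 4 + 8 + ..."""
    return sum(2**k for k in range(n))

# Each partial sum equals 2^n - 1, hence is congruent to -1 modulo 2^n:
# the sequence converges 2-adically to -1.
for n in (4, 8, 16):
    s = partial_sum(n)
    assert (s + 1) % 2**n == 0  # s ≡ -1 (mod 2^n)
```

In the real metric these partial sums diverge; convergence to −1 is a property of the 2-adic absolute value, under which powers of 2 are small rather than large.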
Yet, with only a few exceptions, very little work has investigated the potential of p-adic numbers in machine learning. Bradley (2009) studies clustering of p-adic data and proposes suboptimal algorithms for minimizing cluster energies. Murtagh (2004, 2009) analyzes dendrograms and ultrametricity in data. Chierchia and Perret (2019) and Cohen-Addad et al. (2020) develop procedures to fit ultrametrics to data. Khrennikov and Tirozzi (1999) propose a "p-adic neural network" (similar to our unidimensional linear classifier in §3). Baker and Molla-Aliod (2022) use a variant of p-adic regression for a small sequence-to-sequence problem in natural language processing, bearing some similarity with our formulation in §4.
This paper is an attempt to establish the foundations for p-adic machine learning by
developing building blocks for classification (§3) and regression p