Mumford dendrograms and discrete p-adic symmetries


In this article, we present an effective encoding of dendrograms by embedding them into the Bruhat-Tits trees associated to $p$-adic number fields. As an application, we show how strings over a finite alphabet can be encoded in cyclotomic extensions of $\mathbb{Q}_p$ and discuss $p$-adic DNA encoding. The application leads to fast $p$-adic agglomerative hierarchic algorithms similar to the ones recently used e.g. by A. Khrennikov and others. From the viewpoint of $p$-adic geometry, to encode a dendrogram $X$ in a $p$-adic field $K$ means to fix a set $S$ of $K$-rational punctures on the $p$-adic projective line $\mathbb{P}^1$. To $\mathbb{P}^1\setminus S$ is associated in a natural way a subtree inside the Bruhat-Tits tree which recovers $X$, a method first used by F. Kato in 1999 in the classification of discrete subgroups of $\textrm{PGL}_2(K)$. Next, we show how the $p$-adic moduli space $\mathfrak{M}_{0,n}$ of $\mathbb{P}^1$ with $n$ punctures can be applied to the study of time series of dendrograms and those symmetries arising from hyperbolic actions on $\mathbb{P}^1$. In this way, we can associate to certain classes of dynamical systems a Mumford curve, i.e. a $p$-adic algebraic curve with totally degenerate reduction modulo $p$. Finally, we indicate some of our results in the study of general discrete actions on $\mathbb{P}^1$, and their relation to $p$-adic Hurwitz spaces.


💡 Research Summary

The paper introduces a framework for encoding hierarchical data structures, specifically dendrograms, by embedding them into the Bruhat‑Tits trees associated with p‑adic number fields. The authors begin by recalling that the Bruhat‑Tits tree 𝒯_K of a p‑adic field K (e.g., ℚ_p or a cyclotomic extension) is a regular infinite tree whose vertices correspond to closed balls in K and whose edges join each ball to its maximal proper sub‑balls, so that the depth at which two points separate in the tree records their p‑adic distance. By selecting a suitable finite set S of K‑rational points (punctures) on the projective line ℙ¹(K), one obtains a naturally associated subtree of 𝒯_K that recovers a given dendrogram X. This construction mirrors a technique originally used by F. Kato in 1999 for classifying discrete subgroups of PGL₂(K) and provides a geometric realization in which internal vertices represent clusters and leaves represent individual data points.
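
As a rough illustration of how a finite point set determines a dendrogram, the sketch below works over ℚ_p with ordinary integers as punctures and groups them by residue classes modulo increasing powers of p. The function names `valuation` and `dendrogram` are hypothetical, and this is only a toy version of the subtree construction, not the authors' implementation.

```python
def valuation(n, p):
    """p-adic valuation of a nonzero integer n."""
    if n == 0:
        raise ValueError("valuation of 0 is infinite")
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def dendrogram(points, p, level=0):
    """Nested clusters of distinct integers, split by residue mod p**(level + 1).

    A leaf is the integer itself. An inner node is a pair (k, children), where
    p**(-k) is the radius of the smallest p-adic ball containing its members.
    (Illustrative representation, assumed here for simplicity.)
    """
    points = sorted(set(points))
    if len(points) == 1:
        return points[0]
    # Group the points by the ball of radius p**(-(level + 1)) they lie in.
    classes = {}
    for x in points:
        classes.setdefault(x % p ** (level + 1), []).append(x)
    if len(classes) == 1:
        # No branching at this radius: descend to the next smaller ball.
        return dendrogram(points, p, level + 1)
    return (level, [dendrogram(classes[r], p, level + 1) for r in sorted(classes)])


tree = dendrogram([0, 1, 2, 4], 2)
```

For the points 0, 1, 2, 4 and p = 2, the points 0 and 4 merge at level 2, the point 2 joins them at level 1, and 1 joins at level 0, which matches the valuations of the pairwise differences.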

The second major contribution is an explicit method for encoding strings over a finite alphabet Σ into p‑adic numbers. Each symbol of Σ is mapped to a distinct root of unity in a suitable cyclotomic extension K = ℚ_p(ζ_m). A word w = a₁a₂…a_n is then represented by the p‑adic expansion x_w = ∑_{i=1}^{n} ζ_m^{σ(a_i)} p^{i}, where σ assigns an integer exponent to each symbol. This encoding yields a p‑adic metric that mirrors the usual prefix‑based similarity of strings: the p‑adic distance |x−y|_p between two encoded words is determined by the length of their longest common prefix, and it shrinks as that prefix grows. Consequently, agglomerative hierarchical clustering can be performed directly on the p‑adic numbers using the tree structure of 𝒯_K, reducing the classic O(n²) cost to O(n log n) while requiring only linear memory. The authors illustrate the approach with DNA sequences (alphabet {A,C,G,T}) encoded in extensions of ℚ₂ or ℚ₅.
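
The relation between the p‑adic distance and the longest common prefix can be checked on a small example. The sketch below makes a simplifying assumption: instead of roots of unity it uses the nonzero digits 1, 2, 3, 4 in ℚ₅ (the residues modulo 5 of the fourth roots of unity), which gives the same valuations of differences. The mapping `DIGIT` and the helper names are hypothetical and chosen only for illustration.

```python
P = 5
# Assumed symbol-to-digit mapping; any injective map into {1, ..., 4} would do.
DIGIT = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode(word, p=P):
    """Encode a word a_1...a_n as the integer sum of digit(a_i) * p**i, i >= 1."""
    return sum(DIGIT[a] * p ** i for i, a in enumerate(word, start=1))


def valuation(n, p=P):
    """p-adic valuation of a nonzero integer n."""
    if n == 0:
        raise ValueError("valuation of 0 is infinite")
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v


def common_prefix_length(u, v):
    """Length of the longest common prefix of two strings."""
    k = 0
    for a, b in zip(u, v):
        if a != b:
            break
        k += 1
    return k


def padic_distance_exponent(u, v, p=P):
    """Return k such that |x_u - x_v|_p = p**(-k), for distinct words u, v."""
    return valuation(encode(u, p) - encode(v, p), p)
```

With exponents starting at i = 1, two distinct words sharing a prefix of length ℓ satisfy |x_u − x_v|₅ = 5^{−(ℓ+1)}. Because all digits are nonzero, this also holds when one word is a proper prefix of the other.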

The paper then extends the framework to time‑varying dendrograms {X_t}. By interpreting each puncture set S_t as a point in the p‑adic moduli space 𝔐_{0,n}(K) (the space of ℙ¹ with n marked points), the evolution of the dendrogram becomes a path in this moduli space. When the evolution is driven by a hyperbolic element γ ∈ PGL₂(K), the orbit of the punctures defines a Mumford curve C = ℍ_K/Γ, where Γ is the discrete subgroup generated by γ. Such curves have totally degenerate reduction modulo p, providing a p‑adic algebraic model for dynamical systems whose hierarchical structure changes in a regular, self‑similar fashion. The authors show that many natural dynamical processes (e.g., iterated function systems on ℙ¹) produce exactly this kind of Mumford curve, linking the combinatorial dynamics of dendrograms to deep objects in non‑Archimedean geometry.
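
To make the notion of a hyperbolic element concrete, the sketch below uses the standard criterion that a matrix γ ∈ GL₂(ℚ_p) acts hyperbolically exactly when |tr(γ)²/det(γ)|_p > 1, that is, when its two eigenvalues have different absolute values. It works with rational entries only, and the function names are hypothetical; it is an illustration of the criterion, not code from the paper.

```python
from fractions import Fraction


def valuation(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("valuation of 0 is infinite")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v


def translation_length(a, b, c, d, p):
    """Return v(det) - 2*v(tr) if positive, else 0.

    A positive value means the matrix [[a, b], [c, d]] is hyperbolic; the value
    is then the difference of the valuations of its two eigenvalues.
    """
    tr = Fraction(a) + Fraction(d)
    det = Fraction(a) * Fraction(d) - Fraction(b) * Fraction(c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    if tr == 0:
        return 0  # |tr^2 / det| = 0, so not hyperbolic
    return max(valuation(det, p) - 2 * valuation(tr, p), 0)


def mobius(a, b, c, d, z):
    """Apply z -> (a*z + b) / (c*z + d) to a rational point z (finite image assumed)."""
    z = Fraction(z)
    return (a * z + b) / (c * z + d)
```

For p = 3, the matrix with rows (3, 0) and (0, 1) gives a positive translation length, and the map z ↦ 3z raises the valuation of every nonzero puncture by one at each step, so the orbit of a puncture marches along a fixed line in the tree. The translation matrix with rows (1, 1) and (0, 1) gives 0 and is therefore not hyperbolic.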

Finally, the authors explore general discrete actions of subgroups G ⊂ PGL₂(K) on ℙ¹ and their relationship with p‑adic Hurwitz spaces ℋ_{g,n}. When G possesses fixed points on ℙ¹, the corresponding branch data define points in ℋ_{g,n} that encode the ramification structure of the quotient curve. In the special case where G is a Schottky group, the quotient is a Mumford curve that sits as a component of the Hurwitz space, revealing a precise correspondence between hierarchical clustering symmetries and classical covering theory. This connection opens the door to using Hurwitz‑theoretic invariants (e.g., monodromy representations) to classify families of dendrograms and to detect hidden symmetries in data.

Overall, the paper weaves together p‑adic geometry, group theory, and data analysis to produce a coherent theory of dendrogram encoding. By grounding hierarchical clustering in the intrinsic metric and combinatorial structure of Bruhat‑Tits trees, it provides fast algorithms for string and DNA data, a geometric language for time‑dependent hierarchies via moduli spaces, and a bridge to Mumford curves and Hurwitz spaces for understanding discrete symmetries. The work promises practical impact in large‑scale hierarchical clustering, computational genomics, and the mathematical study of non‑Archimedean dynamical systems.

