Spectral Geometry and Heat Kernels on Phylogenetic Trees

We develop a unified spectral framework for finite ultrametric phylogenetic trees, grounding the analysis of phylogenetic structure in operator theory and stochastic dynamics in the finite setting. For a given finite ultrametric measure space $(X,d,m…

Authors: Ángel Alfredo Morán Ledezma

Ángel Alfredo Morán Ledezma 1*

1* Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology, Englerstr. 7, 76131 Karlsruhe, Germany.

* Corresponding author. E-mail: angel.ledezma@kit.edu

Abstract

We develop a unified spectral framework for finite ultrametric phylogenetic trees, grounding the analysis of phylogenetic structure in operator theory and stochastic dynamics in the finite setting. For a given finite ultrametric measure space $(X, d, m)$, we introduce the ultrametric Laplacian $L_X$ as the generator of a continuous-time Markov chain with transition rate $q(x, y) = k(d(x, y))\,m(y)$. We establish its complete spectral theory, obtaining explicit closed-form eigenvalues and an eigenbasis supported on the clades of the tree. For phylogenetic applications, we associate to any ultrametric phylogenetic tree $\mathcal{T}$ a canonical operator $L_{\mathcal{T}}$, the ultrametric phylogenetic Laplacian, whose jump rates encode the temporal structure of evolutionary divergence. We show that the geometry and topology of the tree are explicitly encoded in the spectrum and eigenvectors of $L_{\mathcal{T}}$: eigenvalues aggregate branch lengths weighted by clade mass along ancestral paths, while eigenvectors are supported on the clades, with one eigenspace attached to each internal node.
From this we derive four main contributions: a spectral reconstruction theorem with linear complexity $O(|X|)$; a rigorous geometric interpretation of the spectral gaps of $L_{\mathcal{T}}$ as detectors of distinct evolutionary modes, validated on an empirical primate phylogeny; an eigenmode decomposition of biological traits that resolves trait variance into contributions from individual splits of the phylogeny; and a closed-form centrality index for continuous-time Markov chains on ultrametric spaces, which we propose as a mathematically grounded measure of evolutionary distinctiveness. All results are exact and biologically interpretable, and are supported by numerical experiments on empirical primate data.

Keywords: phylogenetic trees, ultrametric analysis, spectral geometry, heat kernels, evolutionary distinctiveness, evolutionary modes

Contents

1 Introduction
2 Ultrametric spaces and trees.
2.1 Ultrametric trees and Phylogenetic trees.
3 Spectral geometry for phylogenetic trees
3.1 The spectrum
3.2 The eigenprojectors
3.3 The heat kernel of an ultrametric Laplacian
4 Dynamic centrality and ultrametric spaces.
4.1 A state-centrality index for CTMC.
4.2 Dynamic centrality
4.3 Continuous time Markov Chain in ultrametric spaces
4.4 Dynamic centrality for Ultrametric spaces as a topological descriptor
5 Conclusions and outlook.

1 Introduction

Nature organizes complexity through hierarchy.
From the nested basins of a river catchment to the genealogical structure of a language family, hierarchical structure emerges wherever a complex system evolves through successive branching events, leaving behind a record of its own history [1]. Yet perhaps the most ancient and compelling instance of this principle is found in life itself. All living organisms carry within their DNA a signature of their evolutionary heritage, and by recognizing and studying the patterns of these signatures, biologists are able to reconstruct a common origin [2]. This is the idea behind the Tree of Life, an image Darwin already sketched in 1837, more than two decades before On the Origin of Species [3], organizing the diversity of life through branching and ramification.

The mathematical object that captures this structure is the ultrametric phylogenetic tree: a rooted, weighted tree in which the distance from the root to every leaf is the same, encoding the hierarchical and temporal structure of evolutionary divergence. Beyond the reconstruction of evolutionary history, phylogenetic trees have become indispensable tools across the life sciences. Their applications span from population genetics and phylogeography, where they enable inference of past demography and historical migration events, to epidemiology, where they have proven essential for tracing the spread of infectious diseases across hosts and geographies. In microbiology, they provide one of the most natural and powerful measures of diversity, while in ecology they shed light on community assembly, interspecific interactions, and species responses to environmental change [4]. In medicine, the "tree of cells" is used in the study of the evolution of tumors [2].
And in conservation biology, the shape and branch lengths of a phylogenetic tree encode the evolutionary distinctiveness of species, a quantity central to modern prioritization frameworks [5, 6].

Despite a plethora of applications, a unified mathematical framework for the spectral analysis of ultrametric phylogenetic trees, one that yields explicit formulas, exact reconstruction theorems, and biologically interpretable operators, has remained absent from the literature. This paper develops such a framework, encompassing spectral theory, stochastic dynamics, and trait analysis, with applications ranging from phylogenetic reconstruction to evolutionary conservation.

The Laplacian operator appears across a remarkably broad range of physical phenomena: from the propagation of waves and the diffusion of heat to the quantum mechanical description of electron motion and the oscillatory dynamics of fluids. That a single mathematical object governs such diverse phenomena reflects a deep connection between the geometry of the underlying space and the spectral properties of the operator defined on it. This connection is the central subject of spectral geometry [7]. The field traces its origins to Chladni's eighteenth-century experiments with vibrating plates and Rayleigh's investigations into acoustics, and was further catalyzed by Kac's celebrated question: can one hear the shape of a drum? [8]. This question was ultimately answered in the negative: Gordon, Webb, and Wolpert constructed two distinct planar domains that are isospectral yet non-isometric [9], showing that the spectrum of the Laplacian alone does not always suffice to reconstruct the underlying geometry. This problem has since been extended to discrete structures such as graphs, where analogous questions of spectral reconstruction and geometric inference arise naturally.
In this direction, ultrametric analysis has proven particularly effective: Bradley and Morán showed that graphs can be reconstructed from the spectrum of an associated $p$-adic Laplacian [10]. More broadly, the idea of extracting geometric information from spectral data has become central across many areas of science, and is today a driving force in machine learning and geometric deep learning, where eigenvalues, geometric priors, and graph kernels are essential ingredients in the analysis of structured data [11, 12]. Ultrametric trees occupy a privileged position in this landscape: their discrete and hierarchical structure makes the spectrum and eigenvectors fully explicit and, moreover, the geometry of the tree is encoded directly in the spectral data, enabling, as we show in this paper, a suite of analytical tools for the study of phylogenetic structure.

The spectral analysis of phylogenetic trees was pioneered by Lewitus and Morlon [4], who introduced the modified graph Laplacian $\Delta = D - W$, where $D_{ii} = \sum_j w_{ij}$ is the degree matrix and $W$ is the full pairwise distance matrix between all $2n - 1$ nodes of the tree. In that framework, the tree is analyzed as a network and the eigenvalues of $\Delta$ are used to construct a spectral density profile; eigengaps are employed heuristically to identify modes of diversification, including the separation of distinct evolutionary lineages within empirical phylogenies. On the operator side, Bendikov, Cygan, and Woess developed a general framework for isotropic Markov generators on ultrametric spaces [13, 14], defining hierarchical Laplacians via a choice function on the balls of the space, with pure point spectrum and compactly supported eigenfunctions.
In a related but independent direction, Kozyrev developed a theory of ultrametric pseudodifferential operators on infinite ultrametric spaces, establishing diagonalization results in bases of ultrametric wavelets [15]. Both frameworks are developed in the infinite setting and without reference to phylogenetic applications. The ultrametric analysis approach has been successfully applied to many other biological problems, see for example [16–21] and, more recently, the study of branching coral growth and calcification dynamics [22]. Both lines of work leave open the same fundamental question: is there a spectral framework for ultrametric phylogenetic trees that is simultaneously operator-theoretic, fully explicit, and biologically interpretable?

The present paper answers this question by introducing the ultrametric Laplacian $L_X$ as the central object of study. A continuous-time Markov chain on $X$ describes the dynamics of a particle jumping between states of $X$ (in the phylogenetic setting, $X$ is the set of taxa), waiting at each state an exponentially distributed random time before moving to the next. The rate at which the particle jumps from $x$ to $y$ is defined by a function $q(x, y) \geq 0$. In an ultrametric space, where the hierarchical topology is encoded in the distance function $d$, it is natural for this rate to depend on $d(x, y)$: taxa sharing a more recent common ancestor should communicate more readily than those whose common ancestor is more ancient. This motivates the choice $q(x, y) = k(d(x, y))\,m(y)$ for some positive kernel $k$, where $m$ is a probability measure on $X$, leading to the Markov generator

$$L_X u(x) = \sum_{y \in X} k(d(x, y))\,\bigl(u(y) - u(x)\bigr)\,m(y).$$

The framework developed here applies to ultrametric phylogenetic trees with arbitrary branching, not only to binary trees.

We start Section 2 by introducing basic definitions.
In particular, we introduce the concept of topological tree, a rooted tree which encodes the branching topology of a given finite ultrametric space. In Section 2.1, we show how any phylogenetic tree naturally carries an ultrametric space; through Lemmas 2.4 and 2.5 we establish the well-known bijection between ultrametric trees and finite ultrametric spaces [1], creating a dictionary between the phylogenetic terminology and the theory of ultrametric Laplacians.

In Section 3, we introduce $L_X$, an operator acting on functions defined on the leaves of an ultrametric tree, and establish its complete spectral theory. The eigenvalues admit explicit closed-form expressions in terms of the geometry of the tree, and the eigenvectors are supported on the clades, with one eigenspace attached to each internal node, making the spectral structure fully transparent without numerical diagonalization. The first main result is a spectral reconstruction theorem (see Theorem 3.10): a labeled ultrametric phylogenetic tree can be recovered, up to realization, from the spectral encoding $\sigma_e(X)$, a planar sequence of pairs $(\lambda_n, m_n)$ ordered by a breadth-first traversal of the tree, with linear complexity $O(|X|)$. To prove this result we use an explicit probability measure referred to as the Lebesgue measure of the tree. This provides an optimal tool for the storage, access, and simulation of an ultrametric phylogenetic tree.

Following the strategy of [4], we then study the spectral gaps in this setting. Given a phylogenetic tree $\mathcal{T}$, we associate to it what we call the ultrametric phylogenetic Laplacian of $\mathcal{T}$, denoted by $L_{\mathcal{T}}$: an ultrametric Laplacian attached to the underlying ultrametric space with jump rates depending on $h_0 - h(x \wedge y)$, where $h_0$ is a fixed reference height and $h(x \wedge y)$ denotes the height of the divergence event between taxa $x$ and $y$.
Since in this case the jump rate follows

$$F\bigl(h_0 - h(x \wedge y)\bigr) = \frac{d}{dh}\, P(X_{t+h} = y \mid X_t = x)\Big|_{h=0},$$

we have a compatible picture between the biological information of the phylogenetic tree and the random process: two species separated by a very old common ancestor (large height of their LCA) have a small jump rate; that is, a jump between phylogenetically distant taxa is a rare event. On the other hand, recently diverged taxa (small height of their LCA) have large jump rates and are better connected. This leads to a geometrical interpretation of the spectrum and, in particular, of the spectral gaps. The eigenvalues of the ultrametric Laplacian encode the hierarchical structure of the tree. Each eigenvalue

$$\lambda(u) = \sum_{n \in \gamma_r(u)} m(n)\, l(n, \mathrm{Father}(n))$$

aggregates the branch lengths $l(n, \mathrm{Father}(n))$ along the ancestral path $\gamma_r(u)$ from a clade $u$ to the root, weighted by the mass $m(n)$ of each intermediate ancestor. This mass reflects the relative species richness of the corresponding subtree. Large eigenvalues thus arise from clades whose ancestral paths traverse long branches or pass through taxonomically rich intermediate nodes, linking spectral magnitude to both evolutionary depth and clade diversity. Consequently, the spectral gaps reflect distinct evolutionary modes. To test the theory, we construct the phylogenetic Laplacian $L_{\mathcal{T}}$ on the tree of Primate genera with 109 leaves obtained from the TimeTree dataset [23]; the spectral gap separates distinct evolutionary modes, isolating Strepsirrhini from Simiiformes without supervision. Moreover, the choice of kernel $F$ acts as a contrast function at different scales of the hierarchy: a sigmoid kernel achieves complete spectral separation between the parvorders Platyrrhini and Catarrhini, a separation not visible under the baseline kernel.

A further result concerns the decomposition of trait variance along the phylogenetic tree.
We provide an explicit and natural orthonormal basis $\{\psi_P\}$ for the ultrametric Laplacian, leading to an eigenmode decomposition of any function $f : X \to \mathbb{R}$ representing a biological trait. Moreover, if $c_P = \langle f, \psi_P \rangle$, we can decompose the variance as $\mathrm{Var}_m(f) = \sum_P c_P^2$, where each coefficient $c_P$ measures the contrast between trait averages in the two clades generated by the split $P$, weighted by their relative mass, while also admitting a geometrical interpretation as the projection of $f$ onto $\psi_P$. This decomposition is in the spirit of the phylogenetic independent contrasts of Felsenstein [24] and the orthonormal variance decomposition of Ollier, Couteron, and Chessel [25], but derived directly from the eigenbasis of $L_X$, which generalizes the Haar-like wavelets of [26], rather than from an ad hoc topological construction. Large coefficients signal splits where substantial trait divergence occurred between clades of comparable size, providing a natural and interpretable basis for phylogenetic comparison. As an illustration, we analyze this decomposition for three traits on the phylogenetic tree of Primate genera, where trait data were obtained from the PanTHERIA dataset [27]. The cumulative variance profiles reveal that body mass accumulates most of its variance at low eigenvalues, longevity concentrates it in an intermediate spectral band, and litter size rises gradually across the entire spectrum. These distinct profiles illustrate that the decomposition not only summarizes variance but locates it along the eigenvalue axis, distinguishing traits whose variation is captured by a few deep splits from those distributed more uniformly across the tree.

In Section 4, we extend the random walk centrality of Noh and Rieger [28] to the continuous-time setting, obtaining a closed-form expression for the CTMC centrality $C_{\mathrm{CTMC}}(i)$ on ultrametric spaces.
The index admits a dual interpretation: dynamically, it quantifies the accessibility of a state under the stochastic evolution; topologically, it reflects the ramification of the path connecting the state to its ancestors in the ultrametric tree. As an application to phylogenetic conservation, a leaf with low centrality corresponds to a species with few close relatives, occupying a long and poorly branched lineage, precisely the signature of a phylogenetically distinct species whose evolutionary history is disproportionately large. Such species are central to conservation prioritization frameworks such as the EDGE of Existence program [5, 6]. The CTMC centrality thus provides a mathematically grounded index of evolutionary distinctiveness, derived not from ad hoc branch-length summation but from the stochastic geometry of the ultrametric space. Compared with existing measures of phylogenetic isolation such as evolutionary distinctiveness, $C_{\mathrm{CTMC}}$ offers several structural advantages: it incorporates information from the entire tree topology rather than only the root-to-leaf path; the kernel function provides an interpretable and mathematically justified mechanism to tune the trade-off between sensitivity to recent and ancient divergences; and the formulation extends naturally to non-ultrametric trees and phylogenetic networks, addressing a limitation noted in [5] for most existing metrics. We close this introduction by noting that the entire framework can be extended to non-ultrametric phylogenetic trees, though in most cases at the cost of losing the explicit formulas.

2 Ultrametric spaces and trees.

The concept of ultrametric spaces and ultrametric trees is central in this work. In this section we introduce the notions of ultrametric space, ultrametric tree and ultrametric phylogenetic tree.
We describe the relationship between these objects, creating a dictionary that will be used in the rest of this work.

Definition 2.1 A metric space is a pair $(X, d)$, where $X$ is a non-empty set and $d : X \times X \to [0, \infty)$ is a function such that, for all $x, y, z \in X$:
1. $d(x, y) = 0 \iff x = y$,
2. $d(x, y) = d(y, x)$,
3. $d(x, z) \leq d(x, y) + d(y, z)$.
An ultrametric space is a metric space $(X, d)$ in which the metric $d$ satisfies the strong triangle inequality:

$$d(x, z) \leq \max\{d(x, y), d(y, z)\} \quad \text{for all } x, y, z \in X.$$

Example 2.2 Consider the set $X = \{a, b, c, d\}$, and define $d : X \times X \to \mathbb{R}$ as follows:

$$d(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 2 & \text{if } \{x, y\} \subset \{a, b\} \text{ or } \{x, y\} \subset \{c, d\}, \\ 3 & \text{otherwise.} \end{cases}$$

It is easy to verify that $d$ is an ultrametric on $X$. The ultrametric structure can be represented by the following tree: the root $B_3 = X$ has two children, $B_2^1 = \{a, b\}$ (with leaves $a$ and $b$) and $B_2^2 = \{c, d\}$ (with leaves $c$ and $d$).

The concept of a tree is intimately related to that of an ultrametric space. The topology of any finite ultrametric space can be described by a tree, as shown in the example above. In order to introduce terminology and fix notation we continue with some definitions. A graph is a pair $(V, E)$, where $V$ is a finite set of vertices and $E \subset V \times V$ is a set of edges. A (combinatorial) tree $T$ is a connected, acyclic graph. A tree $T$ with a distinguished node called the root, denoted by $r$, is called a rooted tree. For each node $n \in T$ we define the history of $n$ as the unique sequence of nodes connecting the root $r$ with the node $n$, including the endpoints. We denote this set by $\gamma_r(n)$. The number $|\gamma_r(n)| - 1$ is called the level of $n$. We say that $m \in T$ is an ancestor of $n \in T$ if $m \in \gamma_r(n)$. Given two vertices $n$ and $m$, their least common ancestor is the unique vertex of maximal level that is an ancestor of both $n$ and $m$, and is denoted by $n \wedge m$.
In general, a path connecting two different nodes $n, m \in T \setminus \{r\}$ is denoted by $\gamma(n, m)$. If $m \in T$ is an ancestor of $n \in T$ such that $\gamma(n, m) = \{n, m\}$, then $m$ is referred to as the father of $n$ and we use the notation $F(n) = m$. In this case we also say that $m$ and $n$ are consecutive, and that $n$ is a child of $m$. A vertex with no children is called a leaf, while a vertex that is not a leaf is called an internal node.

The topology induced by the ultrametric can be described by a rooted tree. Each internal node corresponds to a closed ball in $X$ with respect to $d$, and each leaf represents an element of $X$. We refer to this tree as the topological tree associated with the ultrametric space $(X, d)$.

Definition 2.3 Let $(X, d)$ be a finite ultrametric space. The topological tree associated to $(X, d)$ is a rooted tree $T$ with the following properties:
1. The set of leaves of $T$ is in bijection with $X$.
2. Each internal node of $T$ is associated with a ball in $X$ of the form $B(x, r) = \{y \in X : d(x, y) \leq r\}$ for some $x \in X$ and $r > 0$, and every ball in $X$ arises in this way.
3. For any two leaves $x, y \in X$, the smallest ball containing both corresponds to their least common ancestor in $T$.

Therefore there is a one-to-one correspondence between the balls generated by the ultrametric $d$ and the nodes $n \in T$. Henceforth, we use both terms interchangeably when $T$ is the topological tree; a ball will be referred to as $n$, $B$ or $B_n$ as needed. We also introduce the notation $B_n^+ = B_{F(n)}$. In particular $B_r = X$.

2.1 Ultrametric trees and Phylogenetic trees.

For a given tree $T$, define a branch as the edge connecting two consecutive internal nodes $u, v \in T$. We can associate to the set of branches a length function, denoted by $l(u, v) > 0$. In general, a tree equipped with a function on its edges is called a weighted tree.
The pair $\mathcal{T} = (T, l)$ is called an ultrametric tree if the sum of the lengths of the branches connecting the root and any leaf is constant.

A species is one of the most fundamental units of biology. Over time, species are shaped by evolutionary forces such as mutation and natural selection, which drive changes at both the molecular and morphological level. A key outcome of these processes is that a species may either give rise to two distinct lineages that evolve independently (a speciation event) or disappear entirely through extinction. The cumulative result of such events is a branching pattern that can be represented as a tree structure, commonly referred to as the tree of life [2].

A phylogenetic tree is a tree equipped with a length function, $\mathcal{T} = (T, l)$, that represents the evolutionary history of a set of entities; it describes a hypothetical pattern of speciation events that occurred in the past. Each internal node represents a speciation event, and the leaves represent the analyzed "present time" species. The length between two consecutive internal nodes represents the time between those two speciation events. A rooted phylogenetic tree is a phylogenetic tree $\mathcal{T} = (T, l)$ such that $T$ is a rooted tree. If $\mathcal{T}$ is an ultrametric tree, then $\mathcal{T}$ is called an ultrametric phylogenetic tree. Let $\varphi : X \to t(X)$ be a labeling, that is, a bijective function from the set of leaves to the set of taxa $t(X)$. An ultrametric phylogenetic tree $\mathcal{T}$ equipped with a labeling will be referred to as a labeled ultrametric phylogenetic tree. An internal node of a labeled phylogenetic tree will be called a clade of $\mathcal{T}$.

Every such tree carries a natural ultrametric: if $\mathcal{T}$ is an ultrametric tree, then for any pair of leaves $x, y \in X$ we define

$$d_l(x, y) := 2 \sum_{\substack{u \to v \\ u, v \in \gamma(x \wedge y,\, x)}} l(u, v). \qquad (1)$$

That is, the distance between two leaves is the sum of the lengths of the branches along the path connecting them, which necessarily passes through their least common ancestor. We also define the height of an internal node $n$ as

$$h(n) := \frac{d_l(x, y)}{2},$$

where $n = x \wedge y$ for some $x, y \in X$; this is well defined by the ultrametric inequality. The distance may not have a direct biological interpretation; nevertheless, the height $h$ of an internal node in a phylogenetic tree represents the time between the present and the split event. The underlying tree $T$ of a given phylogenetic tree $\mathcal{T} = (T, l)$ need not be binary; that is, a single node may give rise to more than two descendant lineages simultaneously.

Fig. 1: Phylogenetic tree (leaves labeled 1 to 6). Here $d(1, 2) = 1$ and $d(3, 5) = 3.6$.

By construction we conclude that every ultrametric (phylogenetic) tree has attached a natural ultrametric space $(X_{\mathcal{T}}, d)$, where $X_{\mathcal{T}}$ is the set of leaves of $\mathcal{T}$. Moreover, from the definitions above the next lemma follows.

Lemma 2.4 The topological tree $T^*$ of the ultrametric space $(X_{\mathcal{T}}, d)$ is the underlying tree of $\mathcal{T}$, that is, $\mathcal{T} = (T^*, l)$.

This lemma serves as a dictionary between the theory of ultrametric spaces and ultrametric (phylogenetic) trees. In other words, the skeleton of the ultrametric (phylogenetic) tree $\mathcal{T}$ is the topological tree of the ultrametric space attached to it.

Given a finite ultrametric space $(X, d)$ with topological tree $T$, it is possible to construct an ultrametric tree from it. For a given internal node $n \in T$, define $h(n) := d(x, y)/2$, where $n = x \wedge y$ for some leaves $x, y \in X$, and set $h(x) := 0$ for every leaf $x$. Define $l(n, m) := |h(n) - h(m)|$ for consecutive nodes $n, m$. Then the tree $(T, l)$ is ultrametric, and we obtain the following result.
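The construction just described can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: the helper names (`balls`, `ultrametric_tree`, `root_path_length`) are hypothetical, nodes are encoded as frozensets of leaves, and the data are those of Example 2.2. The code enumerates the balls of the space, sets $h(n) = \mathrm{diam}(B_n)/2$ (so leaves get height 0), assigns each edge the length $|h(n) - h(F(n))|$, and then checks that every root-to-leaf path has the same total length.

```python
from itertools import combinations

def dist(x, y):
    """Ultrametric of Example 2.2 on X = {a, b, c, d}."""
    if x == y:
        return 0.0
    return 2.0 if {x, y} in ({"a", "b"}, {"c", "d"}) else 3.0

def balls(points, d):
    """Closed balls with more than one point (the internal nodes), as frozensets."""
    radii = {d(x, y) for x, y in combinations(points, 2)}
    found = {frozenset(y for y in points if d(x, y) <= r)
             for x in points for r in radii}
    return {b for b in found if len(b) > 1}

def ultrametric_tree(points, d):
    """Map each non-root node to (father, branch length |h(father) - h(node)|)."""
    internal = balls(points, d)
    nodes = [frozenset([x]) for x in points] + list(internal)
    # h(n) = diam(B_n) / 2; a leaf has diameter 0, hence height 0.
    h = {n: max(d(x, y) for x in n for y in n) / 2 for n in nodes}
    edges = {}
    for n in nodes:
        above = [b for b in internal if n < b]   # balls strictly containing n
        if above:
            father = min(above, key=len)         # balls containing n are nested
            edges[n] = (father, abs(h[father] - h[n]))
    return edges

def root_path_length(leaf, edges):
    """Sum of branch lengths from a leaf up to the root."""
    node, total = frozenset([leaf]), 0.0
    while node in edges:
        father, length = edges[node]
        total += length
        node = father
    return total

points = ["a", "b", "c", "d"]
edges = ultrametric_tree(points, dist)
lengths = {x: root_path_length(x, edges) for x in points}
print(lengths)
```

For Example 2.2 every leaf sits at distance 1.5 from the root, which is half the diameter of $X$, in agreement with $d_l(x, y) = 2\,h(x \wedge y)$.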
Lemma 2.5 Let $(X, d)$ be a finite ultrametric space. Then there is an ultrametric tree $\mathcal{T} = (T, l)$ such that $(X, d) = (X_{\mathcal{T}}, d_l)$.

From these two lemmas we conclude that there is a bijection between ultrametric trees and finite ultrametric spaces. Henceforth, we use the term ultrametric tree to refer to the mathematical object $\mathcal{T} = (T, l)$, and we use the name ultrametric phylogenetic tree when we want to refer to the biological interpretations arising from this structure.

3 Spectral geometry for phylogenetic trees

An ultrametric space equipped with a measure $m$ is called a measurable ultrametric space. In the finite setting, a measure is just a positive function on the leaves. A probability measure is a measure satisfying

$$\sum_{x \in X} m(x) = 1.$$

We will develop the theory for any $m$. Given a finite ultrametric space $(X, d, m)$ with topological tree $T$ and measure $m$, consider the operator

$$L_X u(x) = \sum_{y \in X} k(d(x, y))\,\bigl(u(y) - u(x)\bigr)\,m(y),$$

where $k : [0, \infty) \to \mathbb{R}$ is a given function. Although $L_X$ will be interpreted in Section 4 as the generator of a continuous-time Markov chain with transition rates $k(d(x, y))\,m(y)$, we first develop its spectral theory in full generality, as this provides the analytical foundation for the probabilistic results that follow. This operator acts on functions $u : X \to \mathbb{R}$. We refer to $L_X$ as the ultrametric Laplacian operator attached to the triple $(X, d, m)$. Consider the (canonical) basis $\{e_y\}_{y \in X}$ of functions on $X$ equipped with the canonical inner product, where $e_y(x) = \delta_{x,y}$ and $\delta_{x,y}$ is the Kronecker delta, defined by

$$\delta_{x,y} = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{otherwise.} \end{cases}$$

With respect to this basis, the operator $L_X$ can be represented as a matrix, called the ultrametric Laplacian matrix.
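As a concrete illustration of the operator $L_X$, the following sketch assembles its matrix in the canonical basis for the space of Example 2.2. It is a minimal sketch under stated assumptions: the function names, the kernel $k(t) = 1/t$, and the weights `m` are hypothetical choices made only for this example and are not taken from the paper. Off-diagonal entries are $k(d(x, y))\,m(y)$ and each diagonal entry is minus the sum of the rest of its row, so constants are sent to zero.

```python
def dist(x, y):
    """Ultrametric of Example 2.2 on X = {a, b, c, d}."""
    if x == y:
        return 0.0
    return 2.0 if {x, y} in ({"a", "b"}, {"c", "d"}) else 3.0

def ultrametric_laplacian(points, d, m, k):
    """Matrix of L_X u(x) = sum_y k(d(x, y)) (u(y) - u(x)) m(y) in the basis {e_y}."""
    L = []
    for i, x in enumerate(points):
        # k is never evaluated at d(x, x) = 0, so kernels singular at 0 are allowed.
        row = [0.0 if y == x else k(d(x, y)) * m[y] for y in points]
        row[i] = -sum(row)   # diagonal entry: minus the total jump rate out of x
        L.append(row)
    return L

def apply(L, u):
    """Matrix-vector product L u."""
    return [sum(a * b for a, b in zip(row, u)) for row in L]

points = ["a", "b", "c", "d"]
m = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}   # hypothetical probability weights

def kernel(t):
    """Hypothetical positive, decreasing kernel k(t) = 1 / t."""
    return 1.0 / t

L = ultrametric_laplacian(points, dist, m, kernel)
for row in L:
    print([round(v, 4) for v in row])
```

Two quick sanity checks are that `apply(L, [1.0] * 4)` vanishes up to rounding, and that `m[x] * L[x][y]` is symmetric in `x` and `y`, which is the matrix form of the self-adjointness of $L_X$ on $L^2(X, m)$.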
Definition 3.1 The ultrametric Laplacian matrix associated to the operator $L_X$ is the $|X| \times |X|$ matrix $L = (L_{x,y})_{x,y \in X}$ whose entries are given by

$$L_{x,y} = \begin{cases} k(d(x, y))\,m(y) & \text{if } y \neq x, \\ -\sum_{z \neq x} k(d(x, z))\,m(z) & \text{if } y = x. \end{cases}$$

This matrix corresponds to the matrix representation of $L_X$ with respect to the canonical basis $\{e_y\}_{y \in X}$.

When the kernel satisfies $k(\mathrm{diam}(B_n)) - k(\mathrm{diam}(B_{F(n)})) > 0$ for any $n \in T \setminus X$ different from the root, the ultrametric Laplacian $L_X$ coincides with the hierarchical Laplacian (see [14, 29]). In order to see this connection we introduce this operator in the setting of finite ultrametric spaces. For a ball $B$, define the operator

$$(P_B f)(x) := 1_{\{x \in B\}}\, \frac{1}{m(B)} \sum_{y \in B} f(y)\,m(y).$$

Definition 3.2 Let $\{a(B_n)\}_{n \in T \setminus X} \subset [0, \infty)$ be weights on balls. The hierarchical Laplacian associated with $m$ and $a$ is defined as

$$(L_a f)(x) = -\sum_{n \in T \setminus X :\, x \in B_n} a(B_n)\,\bigl(f(x) - (P_{B_n} f)(x)\bigr).$$

Proposition 3.3 (Hierarchical decomposition of $L_X$) Assume that for each $n \in T \setminus X$ different from the root the kernel $k$ satisfies $k(\mathrm{diam}(B_n)) - k(\mathrm{diam}(B_{F(n)})) > 0$. For $n \in T \setminus X$ different from the root define

$$a(B_n) := m(B_n)\,\bigl[k(\mathrm{diam}(B_n)) - k(\mathrm{diam}(B_{F(n)}))\bigr] \geq 0,$$

and $a(X) := m(X)\,k(\mathrm{diam}(X))$. Then $L_X f = L_a f$.

Proof. Let $\delta_z$ be the indicator function of the point $z \in X$. Let $y_0 \in X$ with $y_0 \neq z$. It is clear that $(P_{B_n} \delta_z)(y_0) = \delta_{z \in B_n}\, \frac{m(z)}{m(B_n)}$ for $y_0 \in B_n$, where $\delta_{z \in B_n}$ is 1 if $z \in B_n$ and zero otherwise. Therefore, we have

$$L_a \delta_z(y_0) = -\sum_{n \in \gamma_r(y_0) \setminus \{y_0\}} a(B_n)\left(0 - \delta_{z \in B_n}\, \frac{m(z)}{m(B_n)}\right) = \sum_{n \in \gamma_r(y_0 \wedge z)} a(B_n)\, \frac{m(z)}{m(B_n)}.$$

Note that the sum telescopes along the path from $y_0 \wedge z$ to the root (with the root term read as $k(\mathrm{diam}(X))$, in agreement with the definition of $a(X)$):

$$\sum_{n \in \gamma_r(y_0 \wedge z)} \frac{a(B_n)}{m(B_n)} = \sum_{n \in \gamma_r(y_0 \wedge z)} \bigl[k(\mathrm{diam}(B_n)) - k(\mathrm{diam}(B_{F(n)}))\bigr] = k(d(y_0, z)).$$

Therefore, $L_a \delta_z(y_0) = k(d(y_0, z))\,m(z) = L_X \delta_z(y_0)$.
On the other hand,

$$L_a \delta_z(z) = -\sum_{n \in \gamma_r(z) \setminus \{z\}} a(B_n)\left(1 - \frac{m(z)}{m(B_n)}\right) = -\sum_{n \in \gamma_r(z) \setminus \{z\}} \frac{a(B_n)}{m(B_n)}\,\bigl(m(B_n) - m(z)\bigr)$$
$$= -\sum_{n \in \gamma_r(z) \setminus \{z\}} \frac{a(B_n)}{m(B_n)} \sum_{y \in B_n :\, y \neq z} m(y) = -\sum_{y \neq z} k(d(z, y))\,m(y) = L_X \delta_z(z).$$

Therefore $L_X = L_a$. □

Remark 3.4 When the kernel satisfies the strict monotonicity condition, $L_X$ coincides with the hierarchical Laplacian $L_C$ of Bendikov et al. [13], restricted to the finite setting. Outside this regime, $L_X$ is strictly more general. Note that the framework of Kozyrev [15], while formally related, is developed exclusively for infinite ultrametric spaces and does not directly apply here.

3.1 The spectrum

Inspired by the theory of spectral geometry and manifold learning, one of the goals of this section is to show the explicit connection between the topology of a finite ultrametric space and the eigenvalues and eigenvectors of the operator $L_X$. We are particularly interested in seeing how this connection leads to alternative ways to analyze a phylogenetic tree. Therefore, the description of the spectral nature of this operator is central for the theory.

We initiate this study by constructing an orthonormal basis of eigenvectors of $L_X$. Following [14], given an internal node $n \in T \setminus X$, the functions

$$\phi_{B_n, l} = \frac{1_{B_l}}{m(B_l)} - \frac{1_{B_n}}{m(B_n)}, \quad \text{for each child } l \in C(n), \qquad (2)$$

are eigenfunctions of the hierarchical Laplacian $L_a$, and therefore, in the special case when $k(\mathrm{diam}(B_n)) - k(\mathrm{diam}(B_{F(n)})) > 0$ for any $n \in T \setminus \{X\}$, they are also eigenfunctions of $L_X$. Moreover, the function $\phi_0 = \mathbf{1} \equiv 1$, the trivial eigenvector with eigenvalue 0, together with the set of all these functions forms a complete system of the space $L^2(X, m)$ [14].
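The eigenfunction claim for the functions in equation (2) can be checked numerically on the space of Example 2.2. The snippet below is a hedged sketch rather than the paper's code: the kernel $k(t) = 1/t$, the weights `m`, and all function names are assumptions introduced for illustration. For each pair (node $n$, child $l$) it builds $\phi_{B_n,l}$, applies the matrix of $L_X$, and compares the result with $\lambda_n \phi_{B_n,l}$, where $\lambda_n = -\sum_{y \in X \setminus B_n} k(d(x_0, y))\,m(y) - k(\mathrm{diam}(B_n))\,m(B_n)$ is the closed form stated in Proposition 3.5.

```python
def dist(x, y):
    """Ultrametric of Example 2.2 on X = {a, b, c, d}."""
    if x == y:
        return 0.0
    return 2.0 if {x, y} in ({"a", "b"}, {"c", "d"}) else 3.0

points = ["a", "b", "c", "d"]
m = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4}   # hypothetical probability weights

def kernel(t):
    """Hypothetical positive kernel k(t) = 1 / t."""
    return 1.0 / t

def laplacian():
    """Matrix of L_X in the canonical basis."""
    L = []
    for i, x in enumerate(points):
        row = [0.0 if y == x else kernel(dist(x, y)) * m[y] for y in points]
        row[i] = -sum(row)
        L.append(row)
    return L

def apply(L, u):
    return [sum(a * b for a, b in zip(row, u)) for row in L]

def mass(ball):
    return sum(m[y] for y in ball)

def phi(ball_n, ball_l):
    """phi_{B_n,l} = 1_{B_l} / m(B_l) - 1_{B_n} / m(B_n), as in equation (2)."""
    return [(x in ball_l) / mass(ball_l) - (x in ball_n) / mass(ball_n)
            for x in points]

def closed_form_eigenvalue(ball_n):
    """-sum_{y outside B_n} k(d(x0, y)) m(y) - k(diam(B_n)) m(B_n), any x0 in B_n."""
    x0 = sorted(ball_n)[0]
    diam = max(dist(x, y) for x in ball_n for y in ball_n)
    outside = sum(kernel(dist(x0, y)) * m[y] for y in points if y not in ball_n)
    return -outside - kernel(diam) * mass(ball_n)

L = laplacian()
X = frozenset(points)
pairs = [(X, frozenset("ab")), (X, frozenset("cd")),
         (frozenset("ab"), frozenset("a")), (frozenset("cd"), frozenset("d"))]

residuals = []
for ball_n, ball_l in pairs:
    f = phi(ball_n, ball_l)
    lam = closed_form_eigenvalue(ball_n)
    # Largest entry of |L f - lam f|; zero up to rounding for an eigenpair.
    residuals.append(max(abs(a - lam * b) for a, b in zip(apply(L, f), f)))
print(residuals)
```

All residuals vanish up to floating-point rounding. The two functions attached to the root are proportional to each other, consistent with the root eigenspace having dimension $|C(r)| - 1 = 1$ in this example.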
In particular, notice that for a node $n$ the functions $\phi_{B_n, l}$ with $l \in C(n)$ span the space
$$V_n := \Big\{ \psi :\ \psi|_{B_l} \text{ is constant for all } l \in C(n),\ \sum_{l \in C(n)} m(B_l)\, \psi|_{B_l} = 0,\ \mathrm{Supp}\, \psi \subset B_n \Big\}.$$
The dimension of this space is $|C(n)| - 1$. This result extends to any ultrametric Laplacian $L_X$ with positive kernel $k$.

Proposition 3.5 Let $\phi_{B_n, l}$ be defined as in equation (2). Then $\phi_{B_n, l}$ is an eigenfunction of $L_X$ with eigenvalue
$$\lambda_n = -\sum_{l \in \gamma_r(n)} m(B_l) \big[ k(\mathrm{diam}(B_l)) - k(\mathrm{diam}(B_{F(l)})) \big] = -\sum_{y \in X \setminus B_n} k(d(x_0, y))\, m(y) - k(\mathrm{diam}(B_n))\, m(B_n), \tag{3}$$
where $x_0 \in B_n$ and we use the convention $k(\mathrm{diam}(B_{F(r)})) := 0$ at the root $r$. The eigenvalue $\lambda_n$ has multiplicity $m_n := |C(n)| - 1$.

Proof Let $n \in T \setminus X$ be an internal node and let $\phi_{B_n, l} = \frac{\mathbf{1}_{B_l}}{m(B_l)} - \frac{\mathbf{1}_{B_n}}{m(B_n)}$ for a given child $l \in C(n)$. Define the real number
$$\lambda_n := -\sum_{y \in X \setminus B_n} k(d(x_0, y))\, m(y) - k(\mathrm{diam}(B_n))\, m(B_n),$$
and set $a = \frac{1}{m(B_l)} - \frac{1}{m(B_n)}$ and $b = -\frac{1}{m(B_n)}$. Let $x \in B_l$, so that $\phi_{B_n, l}(x) = a$ and
$$\sum_{y \in B_l} k(d(x, y)) \big( \phi_{B_n, l}(y) - \phi_{B_n, l}(x) \big) m(y) = 0.$$
Therefore
$$L_X \phi_{B_n, l}(x) = \sum_{y \in B_n \setminus B_l} k(d(x, y)) (b - a)\, m(y) + \sum_{y \in X \setminus B_n} k(d(x, y)) (0 - a)\, m(y).$$
Since $x \in B_l$, for any $y \in B_n \setminus B_l$ we have $k(d(x, y)) = k(\mathrm{diam}(B_n))$, and hence
$$L_X \phi_{B_n, l}(x) = k(\mathrm{diam}(B_n)) (b - a)\, m(B_n \setminus B_l) - a \sum_{y \in X \setminus B_n} k(d(x, y))\, m(y) = a \Big[ -k(\mathrm{diam}(B_n)) \Big( 1 - \frac{b}{a} \Big) m(B_n \setminus B_l) - \sum_{y \in X \setminus B_n} k(d(x, y))\, m(y) \Big].$$
In a similar way, for $x \in B_n \setminus B_l$, where $\phi_{B_n, l}(x) = b$, the following holds:
$$L_X \phi_{B_n, l}(x) = b \Big[ -k(\mathrm{diam}(B_n)) \Big( 1 - \frac{a}{b} \Big) m(B_l) - \sum_{y \in X \setminus B_n} k(d(x, y))\, m(y) \Big].$$
A direct computation leads to the equalities
$$\Big( 1 - \frac{b}{a} \Big) m(B_n \setminus B_l) = m(B_n), \quad \text{and} \quad \Big( 1 - \frac{a}{b} \Big) m(B_l) = m(B_n).$$
Hence, in both cases $L_X \phi_{B_n, l}(x) = \lambda_n\, \phi_{B_n, l}(x)$.
Lastly, if $x \in X \setminus B_n$, then $\phi_{B_n, l}(x) = 0$ and $d(x, y)$ takes the same value, say $d(x, B_n)$, for every $y \in B_n$. Hence
$$L_X \phi_{B_n, l}(x) = \sum_{y \in X} k(d(x, y))\, \phi_{B_n, l}(y)\, m(y) = k(d(x, B_n)) \sum_{y \in B_n} \phi_{B_n, l}(y)\, m(y) = 0,$$
since $\phi_{B_n, l}$ has mean zero. □

Extracting geometrical information about an object from the spectrum of an operator is the core idea behind spectral geometry. It turns out that, for a certain measure $\mu$, the spectrum of the operator $L_X$ allows us to recover the diameters of the balls of the ultrametric space $(X, d)$ when the kernel $k$ is a bijection. Moreover, after a suitable ordering and padding, all the geometrical and topological information can be recovered from the modified sequence. To prove this assertion we need to define a decoration of a tree and the Lebesgue measure of a tree.

Definition 3.6 Let $T$ be the topological tree associated to a finite ultrametric space $(X, d)$, with root $r$.
1. A decoration of $T$ is a map $\phi : T \to A$, where $A$ is a set.
2. For each node $v$ in $T$, denote by $C(v)$ the set of its children. The Lebesgue measure $\mu$ on the tree is defined recursively as follows:
(a) $\mu(r) = 1$.
(b) For any node $v$ with children $C(v) = \{v_1, \ldots, v_k\}$, set $\mu(v_i) = \mu(v)/|C(v)|$ for each $i = 1, \ldots, k$, where $|C(v)|$ denotes the cardinality of $C(v)$.
(c) For any leaf $x \in X$, $\mu(x)$ is determined by the product of the inverses of the cardinalities along the unique path from $r$ to $x$:
$$\mu(x) = \prod_{v \in \gamma_r(x) \setminus \{x\}} \frac{1}{|C(v)|}.$$

For example, an ultrametric tree $\mathcal{T} = (T, l)$ can be viewed as the tree $T$ decorated with the diameters of the balls attached to its ultrametric space via the function $l$, adopting the convention that each leaf $x \in X$ is decorated with a zero. The spectrum leads to another decoration of the form $(\lambda_n, m_n)$, where we label the leaves by the pairs $(0, 0)$.

Proposition 3.7 Let $(X, d, \mu)$ be a finite ultrametric space equipped with its Lebesgue measure and $T$ its topological tree.
Then the decoration $(\lambda_n, m_n)_{n \in T \setminus X}$, consisting of the eigenvalues of the operator $L_X$ with their multiplicities, together with $(0, 0)$ on the leaves, reconstructs the decoration $(k(\mathrm{diam}(B_n)))_{n \in T \setminus X}$, together with $(0, 0)$ on the leaves.

Proof Let $r$ be the root of $T$. Then $\lambda_r = -k(\mathrm{diam}(B_r))$. Since $|C(r)| = m_r + 1$, we know that $B_r$ decomposes into $m_r + 1$ balls of measure $\frac{1}{m_r + 1}$. Let $n \in C(r)$. Then
$$\lambda_n = -k(\mathrm{diam}(B_r)) \frac{m_r}{m_r + 1} - \frac{1}{m_r + 1} k(\mathrm{diam}(B_n)).$$
Therefore $k(\mathrm{diam}(B_n))$ is completely determined by $\lambda_n$. We now proceed by induction on the levels of $T$. Assume that the nodes at levels $1, \ldots, l - 1$ have been decorated. Then for $n$ at level $l$, equation (3) gives
$$\lambda_n = -\sum_{j \in \gamma_r(n) \setminus \{n\}} k(\mathrm{diam}(B_j))\, \mu(B_j) \frac{m_j}{m_j + 1} - \mu(B_n)\, k(\mathrm{diam}(B_n)), \qquad \mu(B_n) = \frac{\mu(B_{F(n)})}{m_{F(n)} + 1},$$
where all the measures $\mu(B_j)$ are determined by the multiplicities along the path. Therefore $k(\mathrm{diam}(B_n))$ is determined by $\lambda_n$. This ends the proof. □

Definition 3.8 (Spectral encoding) Let $T$ be a rooted tree with each node $n$ decorated by a pair $(\lambda_n, m_n)$, with $(0, 0)$ assigned to all leaves. The canonical ordering of these pairs is defined inductively as follows:
1. Base step: Start with the root node $r$. The first element of the sequence is $(\lambda_r, m_r)$.
2. Inductive step: Assume that the sequence has been constructed up to some level, and let $S$ be the list of pairs added at the previous step. For each pair in $S$ that is not $(0, 0)$ (i.e., for each internal node), add to the sequence the pairs corresponding to its children, in the order determined by the tree, that is, visiting the nodes in breadth-first traversal order. For each leaf child, add $(0, 0)$ in the respective position.
3. Termination: Repeat the inductive step until no new internal nodes remain to expand; that is, until all subsequent pairs correspond to leaves and are $(0, 0)$.
The spectral encoding will be denoted by $\sigma_e(X)$.
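The canonical ordering of Definition 3.8 can be sketched as a plain breadth-first traversal. The snippet below is only an illustration under stated assumptions: the nested-dictionary representation, the key names `eig` and `children`, the helper names, and the numerical eigenvalues are all hypothetical and are not taken from the paper.

```python
from collections import deque


def leaf():
    """A leaf is represented by a dictionary without children (assumed layout)."""
    return {}


def internal(eig, *children):
    """An internal node stores a placeholder eigenvalue and its ordered children."""
    return {"eig": eig, "children": list(children)}


def spectral_encoding(root):
    """Breadth-first list of (eigenvalue, multiplicity) pairs; leaves give (0, 0)."""
    sequence, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        children = node.get("children", [])
        if children:
            # Multiplicity of an internal node is its number of children minus one.
            sequence.append((node["eig"], len(children) - 1))
            queue.extend(children)
        else:
            sequence.append((0, 0))
    return sequence


# Hypothetical tree: the root has children n1 (three leaves) and n2 (a leaf and n3),
# and n3 has two leaves. The eigenvalues are made-up placeholders.
n3 = internal(-4.0, leaf(), leaf())
tree = internal(-1.0, internal(-2.0, leaf(), leaf(), leaf()), internal(-3.0, leaf(), n3))
encoding = spectral_encoding(tree)
```

Each node is visited exactly once, so the cost of the traversal is linear in the number of nodes of the tree.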
Two labeled ultrametric phylogenetic trees $\mathcal{T}_1 = (T_1, l_1, \varphi_1)$ and $\mathcal{T}_2 = (T_2, l_2, \varphi_2)$ are said to be isomorphic if there exists a bijection $\Psi : V(T_1) \to V(T_2)$ such that
$$(u, v) \in E(T_1) \iff (\Psi(u), \Psi(v)) \in E(T_2), \qquad l_1(u, v) = l_2(\Psi(u), \Psi(v)),$$
and, for every leaf $x \in X_1$, $\varphi_1(x) = \varphi_2(\Psi(x))$. Two labeled ultrametric phylogenetic trees are called equivalent realizations if they are isomorphic. An example of two realizations of a labeled phylogenetic tree is shown in Figure 2.

Fig. 2: Equivalent realizations of the same labeled ultrametric phylogenetic tree (the leaf orderings $a, b, c$ and $b, c, a$).

It is therefore clear that the spectral encoding determines a realization of the labeled ultrametric phylogenetic tree. Different realizations, corresponding to permutations of children at internal nodes, produce different spectral encodings but represent equivalent labeled trees.

Example 3.9 Consider the following decorated rooted tree: the root $r$ has children $n_1$ and $n_2$; the node $n_1$ has the leaves $a, b, c$ as children; the node $n_2$ has the leaf $d$ and the internal node $n_3$ as children; and $n_3$ has the leaves $e, f$ as children. The canonical ordered sequence of pairs is:
$$(\lambda_r, m_r),\ (\lambda_{n_1}, m_{n_1}),\ (\lambda_{n_2}, m_{n_2}),\ (0, 0),\ (0, 0),\ (0, 0),\ (0, 0),\ (\lambda_{n_3}, m_{n_3}),\ (0, 0),\ (0, 0),$$
where each $(0, 0)$ corresponds to a leaf, and each $(\lambda_v, m_v)$ corresponds to an internal node.

Theorem 3.10 (Spectral reconstruction theorem for ultrametric trees.) Let $(\mathcal{T}, \varphi)$ be a labeled phylogenetic ultrametric tree. Then $(\mathcal{T}, \varphi)$ can be reconstructed up to realization from the spectral encoding of the ultrametric Laplacian $L_X$ associated with the underlying ultrametric space $(X, d, \mu)$, where $\mu$ is the Lebesgue measure.

Proof Let $(\mathcal{T}, \varphi)$ be a labeled phylogenetic ultrametric tree. It is clear that the spectral encoding completely reconstructs a realization of the topological tree $T$ decorated with the pairs $(\lambda_n, m_n)$. By Proposition 3.7, and since the kernel $k$ is a bijection, this decoration is equivalent to the diameter decoration $\mathrm{diam}(B_n)$.
Since the finite ultrametric space $(X, d)$ is characterized by the topological tree $T$ decorated in this way, the result follows. □

Remark 3.11 Notice that in the case of a binary tree, the decoration $(\lambda_n, m_n)$ can be replaced by the decoration $\lambda_n$, since in this case $m_n = 1$ for every internal node $n$.

Example 3.12 (Counterexample, isospectral spaces.) In this example we show the existence of two ultrametric spaces which are not isomorphic but have the same eigenvalues. Consider the following decorated rooted tree: the root $\hat{r}$ has children $\hat{n}_1$ and $\hat{n}_2$; the node $\hat{n}_1$ has the leaves $\hat{a}, \hat{b}$ and the internal node $\hat{n}_3$ as children; $\hat{n}_3$ has the leaves $\hat{c}, \hat{d}$ as children; and $\hat{n}_2$ has the leaves $\hat{e}, \hat{f}$ as children. The canonical ordered sequence of pairs is:
$$(\lambda_{\hat{r}}, m_{\hat{r}}),\ (\lambda_{\hat{n}_1}, m_{\hat{n}_1}),\ (\lambda_{\hat{n}_2}, m_{\hat{n}_2}),\ (0, 0),\ (0, 0),\ (\lambda_{\hat{n}_3}, m_{\hat{n}_3}),\ (0, 0),\ (0, 0),\ (0, 0),\ (0, 0).$$
We will show that with this tree we can construct an ultrametric space that is isospectral to the one given in Example 3.9. Assume that $\mathrm{diam}(B_{n_i}) = \mathrm{diam}(B_{\hat{n}_i})$ for $i = 1, 2$ and $\mathrm{diam}(B_r) = \mathrm{diam}(B_{\hat{r}})$. Then it is easy to see that the first three pairs of eigenvalues are equal. Assume now that $k(d(x, y)) = d(x, y)$. Then, imposing $\lambda_{n_3} = \lambda_{\hat{n}_3}$, we obtain the equation
$$\mathrm{diam}(B_{n_3}) = \frac{1}{3} \mathrm{diam}(B_{n_2}) + \frac{2}{3} \mathrm{diam}(B_{\hat{n}_3}).$$
Therefore, if we define the diameter of $B_{n_3}$ in this way, we can construct two non-isometric but isospectral spaces.

Example 3.13 Consider the phylogenetic tree of Figure 1. Consider the gravitational kernel $k(r) = \frac{1}{r^2}$ and take $m(y) = 1/|X|$, the normalized counting measure. The respective eigenvalues are shown in Figure 3.

Fig. 3: Eigenvalues of a phylogenetic tree.

Fig. 4: Phylogenetic tree and spectral encoding. For illustration, each eigenvalue was mapped to a corresponding frequency within the visible spectrum. The notation $\times 5$ symbolizes 5 leaves.
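As a numerical sanity check of the closed-form eigenvalues of Proposition 3.5 with the gravitational kernel, the following sketch compares them with a direct diagonalization of the Laplacian matrix. The five-leaf distance matrix is a hypothetical toy example and is not the tree of Figure 1; all variable names are illustrative.

```python
import numpy as np


def kernel(r):
    """Gravitational kernel k(r) = 1 / r^2."""
    return 1.0 / r**2


# Hypothetical ultrametric distances on 5 leaves with clades
# {0,1} (diam 1), {0,1,2} (diam 2), {3,4} (diam 1.5) and the root (diam 4).
D = np.array([[0.0, 1.0, 2.0, 4.0, 4.0],
              [1.0, 0.0, 2.0, 4.0, 4.0],
              [2.0, 2.0, 0.0, 4.0, 4.0],
              [4.0, 4.0, 4.0, 0.0, 1.5],
              [4.0, 4.0, 4.0, 1.5, 0.0]])
n = len(D)
m = np.full(n, 1.0 / n)  # normalized counting measure

# Laplacian matrix: L[x, y] = k(d(x, y)) m(y) off the diagonal, rows summing to zero.
off = ~np.eye(n, dtype=bool)
L = np.zeros((n, n))
L[off] = kernel(D[off]) * np.broadcast_to(m, (n, n))[off]
np.fill_diagonal(L, -L.sum(axis=1))

# Internal nodes are the distinct balls B(x, d(x, y)) with x != y.
balls = {frozenset(np.flatnonzero(D[x] <= D[x, y]).tolist())
         for x in range(n) for y in range(n) if x != y}

closed_form = [0.0]  # trivial eigenvalue of the constant function
for ball in balls:
    inside = sorted(ball)
    outside = [y for y in range(n) if y not in ball]
    x0 = inside[0]
    diam = D[np.ix_(inside, inside)].max()
    # Second expression in equation (3).
    lam = -sum(kernel(D[x0, y]) * m[y] for y in outside) - kernel(diam) * m[inside].sum()
    # Children of the ball are the classes of the relation d(x, z) < diam.
    children = {frozenset(z for z in inside if D[x, z] < diam) for x in inside}
    closed_form += [lam] * (len(children) - 1)

numeric = np.linalg.eigvals(L).real
```

In this toy case the two spectra agree up to floating-point error, which is what the proposition predicts; the check does not replace the proof.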
The extended spectrum is an efficient tool to reconstruct the ultrametric space from the spectrum of $L_X$ by means of a Breadth-First Search (BFS), which is known to have complexity $O(|X|)$. It follows that, for a given finite ultrametric space, the construction of the spectrum and the measure also has linear complexity. Therefore, the extended spectrum already serves as an efficient tool for the storage, access, and simulation of an ultrametric space.

Lewitus and Morlon [4] introduced a spectral framework for phylogenetic trees based on the Modified Graph Laplacian (MGL), defined as $\Delta = D - W$, where $D_{ii} = \sum_j w_{ij}$ is the degree matrix and $W$ is the full pairwise distance matrix between all $2n - 1$ nodes of the tree. The eigenvalues of $\Delta$ are used to construct a spectral density profile encoding global properties of the tree shape. To identify modes of diversification within a tree, they employ the eigen-gap heuristic: if the largest gap in the ranked spectrum falls between $\mu_i$ and $\mu_{i+1}$, the tree is said to contain $i$ clusters of distinct evolutionary dynamics [30]. This criterion is explicitly presented as a heuristic, with no formal proof [4].

Our framework differs in three key aspects. First, our operator $L$ acts on the leaves of the tree and carries a natural probabilistic interpretation as the generator of a continuous-time Markov chain with jump rates $q(x, y) = k(d(x, y))\, m(y)$, whereas the MGL has no such stochastic interpretation. Second, the eigenvalues of $L$ admit an explicit closed-form expression, making the spectral structure fully transparent without numerical diagonalization.
Third, in our framework, for a particular kernel $k$ and measure $m$, the eigen-gaps have a rigorous geometric interpretation: a gap between $\lambda_i$ and $\lambda_{i+1}$ corresponds to a level $h^*$ in the ultrametric hierarchy where the accumulated mass-weighted branch length undergoes a significant jump, providing a formal interpretation in the context of phylogenetic trees.

Let us examine the last observation in detail. For a phylogenetic ultrametric tree $\mathcal{T}$, fix a reference height $h_0 > h(X)$, where $h(X)$ is the height of the root, and define the kernel
$$k(d(x, y)) = F(h_0 - h(x \wedge y)).$$
Together with the normalized counting measure $m(x) = 1/|X|$ for all $x \in X$, we define the ultrametric phylogenetic Laplacian of $\mathcal{T}$ with kernel $F$ as the ultrametric Laplacian attached to this measure and kernel:
$$L_{\mathcal{T}} f(x) = \sum_{y \in X} F(h_0 - h(x \wedge y)) \big( f(y) - f(x) \big) m(y).$$
Let us first assume that $F(x) = x$. In this case, the jump rates are $h_0 - h(x \wedge y)$. As discussed in the introduction, the biological interpretation is as follows: two species separated by a very old common ancestor (large height of their LCA) have a small jump rate, that is, a jump between phylogenetically distant taxa is a rare event. On the other hand, sister taxa (small height of their LCA) have large jump rates. By Proposition 3.5, the eigenvalues of this operator are $-\lambda(u)$, where
$$\lambda(u) = \sum_{n \in \gamma_r(u)} m(n)\, l(n, n^+),$$
$l(n, n^+)$ is the branch length connecting $n$ with its immediate ancestor $n^+$, and for the root we set $l(r, r^+) := h_0 - h(r)$. In what follows we work with the magnitudes $\lambda(u)$ and, by a slight abuse of language, refer to them as the eigenvalues. Hence each eigenvalue accumulates, along the ancestral path from the clade $u$ to the root, the branch lengths $l(n, n^+)$ weighted by the mass $m(n)$ of each ancestral clade, where $m(n)$ reflects the relative species richness of $n$. Large eigenvalues correspond to a possible combination of two factors: deeper branches (from the root to the clade) along the path, and rich clades with high taxon diversity. Now let us analyze the gaps.
First, let us assume that the clade $v$ is an ancestor of the clade $u$. In this case
$$\lambda(u) - \lambda(v) = \sum_{n \in \gamma(u, v)} m(n)\, l(n, n^+).$$
Notice that within a subtree characterized by short internal branches (such as a radiation event) and more or less homogeneous mass splits, the eigenvalues tend to be similar and collapse to a limit as we go down the tree. In this case a large gap would represent a strong asymmetry in the mass distribution or a long branch separating two events. Nevertheless, the deeper the clades are, the less this is expected. In the case when $u$ and $v$ belong to different lineages we have
$$\lambda(u) - \lambda(v) = \sum_{n \in \gamma(u, u \wedge v)} m(n)\, l(n, n^+) - \sum_{n \in \gamma(v, u \wedge v)} m(n)\, l(n, n^+).$$
Therefore, a strong eigenvalue gap may reflect a deep divergence event, characterized by a long internal branch, a highly asymmetric mass distribution, or both. From these observations, it is expected that two clusters with different rates of diversification will produce large gaps. As an example, we compute the eigenvalue distribution of the phylogenetic ultrametric Laplacian attached to the phylogenetic tree of Primate genera with 109 leaves, obtained from Timetree [23].

Fig. 5: Eigenvalue distribution. Each eigenvalue is attached to an internal node. The first plot (left) shows the magnitude of each eigenvalue as a function of the height. The second one (right) shows the eigenvalue distribution. A noticeable gap is visible in both plots. Each internal point has a color indicating whether it belongs to the class Simiiformes, Tarsiidae (infraorders) or Strepsirrhini (order).

As shown in Figure 5, the gap successfully separates different evolutionary modes. First, the left plot shows how similar the eigenvalues are inside an order (and infraorder), as expected from the theory: several similar eigenvalues appear at different heights within each class of clades.
Moreover, we see from this plot that the Simiiformes infraorder has higher-frequency eigenvalues attached, which suggests a higher diversification in recent times and a higher mass in this clade, whereas the lower values of the nodes in the Strepsirrhini order and the Tarsiidae infraorder reflect older divergences and smaller masses, as we can see in Figure 6. In the right plot of Figure 5, we see how the eigen-gap separates the two most dominant splits of the phylogenetic tree: Strepsirrhini vs Simiiformes.

Fig. 6: Phylogenetic tree of Primate genera from Timetree.

Fig. 7: Gap obtained using a sigmoid kernel.

The choice of the kernel $F$ in the phylogenetic ultrametric Laplacian is not unique, and different kernels emphasize different scales of the phylogenetic hierarchy. As an illustration, consider the sigmoid kernel
$$F(x) = \frac{1}{1 + e^{-\beta (x - c)}}, \tag{4}$$
with $\beta = 0.3$ and $c = 35$. This kernel acts as a local contrast function, saturating for nodes far from the threshold $c$ and amplifying differences in the neighbourhood of $h_0 - h \approx c$. Figure 7 shows that under this kernel the eigenvalues associated with the two parvorders of Simiiformes, Platyrrhini and Catarrhini, become completely separated, with no overlap between the two spectral bands. This separation is not as evident with the kernel $F(x) = x$. The systematic study of how to select an optimal kernel for a given phylogenetic question, whether to maximize separation at a particular taxonomic scale or to recover a target spectral structure, is a natural direction for future research.

3.2 The eigenprojectors

While the spectrum "hears" the geometry of the metric, the eigenprojectors encode the underlying topology of the ultrametric space. In short, the topological tree can be reconstructed from the projectors/eigenvectors.
Let us start by fixing an orthonormal basis $\mathcal{B}_n$ of $V_n$ and denote by
$$\mathcal{B}_X := \bigcup_{n \in T \setminus X} \mathcal{B}_n \cup \{\psi_0\},$$
where $\psi_0 \equiv 1$ is the trivial eigenvector of a given ultrametric Laplacian $L_X$. Thus, the set $\mathcal{B}_X$ is a fixed eigenvector basis of $L_X$. As we already proved, the support of any eigenvector $\psi \in V_n$ is contained in the ball $B_n$ of the node $n$. That is, for an ultrametric phylogenetic tree, each clade has an eigenspace associated with it. We can construct an explicit eigenbasis by applying the Gram-Schmidt process to the functions $\phi_{B_n, l}$.

Proposition 3.14 Fix an internal node $P$ with disjoint children $C_1, \ldots, C_k$. Define
$$m_r := m(C_r), \qquad s_j := \sum_{r=1}^{j} m_r.$$
For $j = 1, \ldots, k - 1$, we define
$$\psi_{P, j}(x) = \begin{cases} a_j := \sqrt{\dfrac{m_{j+1}}{s_j\, s_{j+1}}}, & x \in C_1 \cup \cdots \cup C_j, \\[2mm] b_j := -\sqrt{\dfrac{s_j}{m_{j+1}\, s_{j+1}}}, & x \in C_{j+1}, \\[2mm] 0, & x \in C_r \text{ with } r \geq j + 2 \text{ or } x \notin P. \end{cases}$$
Then the set of all functions $\psi_{P, j}$ spans the space $V_P$.

Proof For each $j = 1, \ldots, k - 1$, the function $\psi_{P, j}$ is supported in $P$ and is constant on each child $C_r$ of $P$. It remains to check that it has zero mean on $P$. Since $\psi_{P, j}$ takes the value $a_j$ on $C_1 \cup \cdots \cup C_j$, the value $b_j$ on $C_{j+1}$, and vanishes on the remaining children, we get
$$\sum_{y \in P} \psi_{P, j}(y)\, m(y) = a_j \sum_{r=1}^{j} m_r + b_j\, m_{j+1} = a_j s_j + b_j m_{j+1}.$$
Using the definitions of $a_j$ and $b_j$,
$$a_j s_j = s_j \sqrt{\frac{m_{j+1}}{s_j\, s_{j+1}}} = \sqrt{\frac{s_j\, m_{j+1}}{s_{j+1}}}, \quad \text{and} \quad b_j m_{j+1} = -m_{j+1} \sqrt{\frac{s_j}{m_{j+1}\, s_{j+1}}} = -\sqrt{\frac{s_j\, m_{j+1}}{s_{j+1}}}.$$
Hence $\sum_{y \in P} \psi_{P, j}(y)\, m(y) = 0$, and therefore $\psi_{P, j} \in V_P$.

Next we prove orthogonality. Let $1 \leq i < j \leq k - 1$. Then $\psi_{P, i}$ is constant on $C_1 \cup \cdots \cup C_i$, constant on $C_{i+1}$, and zero on $C_r$ for $r \geq i + 2$. Since $j \geq i + 1$, the function $\psi_{P, j}$ takes the constant value $a_j$ on every child $C_1, \ldots, C_{i+1}$. Therefore,
$$\langle \psi_{P, i}, \psi_{P, j} \rangle = a_j \sum_{y \in P} \psi_{P, i}(y)\, m(y) = 0,$$
because $\sum_{y \in P} \psi_{P, i}(y)\, m(y) = 0$.
Thus the family $\{\psi_{P, j}\}_{j=1}^{k-1}$ is orthogonal. We now compute the norm of each $\psi_{P, j}$:
$$\|\psi_{P, j}\|^2_{L^2(m)} = a_j^2 \sum_{r=1}^{j} m_r + b_j^2\, m_{j+1} = a_j^2 s_j + b_j^2 m_{j+1}.$$
Substituting the values of $a_j$ and $b_j$ gives
$$a_j^2 s_j = \frac{m_{j+1}}{s_j\, s_{j+1}} s_j = \frac{m_{j+1}}{s_{j+1}}, \quad \text{and} \quad b_j^2 m_{j+1} = \frac{s_j}{m_{j+1}\, s_{j+1}} m_{j+1} = \frac{s_j}{s_{j+1}}.$$
Hence
$$\|\psi_{P, j}\|^2_{L^2(m)} = \frac{m_{j+1}}{s_{j+1}} + \frac{s_j}{s_{j+1}} = \frac{s_{j+1}}{s_{j+1}} = 1,$$
so the family is orthonormal. Finally, any function in $V_P$ is determined by its constant values $c_1, \ldots, c_k$ on the $k$ children $C_1, \ldots, C_k$, subject to the single linear relation
$$\sum_{r=1}^{k} c_r m_r = 0.$$
Therefore, $\dim V_P = k - 1$. Since $\{\psi_{P, j}\}_{j=1}^{k-1}$ is an orthonormal family of $k - 1$ elements contained in $V_P$, it is an orthonormal basis of $V_P$. In particular, it spans $V_P$. □

This basis is a generalization of the basis presented in [26], which is recovered in the case of a binary tree with the counting measure. In that work, the eigenbasis is introduced as an adaptation of a more general framework for wavelets on trees. Here, in contrast, these functions appear naturally as eigenvectors of the ultrametric Laplacian. In [26] this basis is used for the sparsification of huge ultrametric matrices and is therefore proposed as a method for storing big ultrametric trees such as the tree of life.

We now specialize to the binary case. We assume that $m$ is a multiple of the counting measure and the tree is binary. Let $f : X \to \mathbb{R}$ be a function defined on the leaves of the tree, which in biological applications corresponds to a trait measured across species. Then this function can be decomposed into eigenmodes via the ultrametric basis derived in Proposition 3.14:
$$f = f_0 + \sum_P c_P\, \psi_P,$$
where $f_0 = \sum_{x \in X} f(x)\, m(x)$ and $c_P = \langle f, \psi_P \rangle$. First, notice that the coefficients $c_P$ are closely related to the variance of $f$ with respect to the measure $m$. Writing $\bar{f} = f_0$ for the mean of $f$,
$$\mathrm{Var}_m(f) := \|f - \bar{f}\|^2 = \sum_x m(x) \big( f(x) - \bar{f} \big)^2 = \sum_P c_P^2.$$
Explicitly,
$$c_P = \sqrt{\frac{m(C_1)\, m(C_2)}{m(P)}} \big( \bar{f}_{C_1} - \bar{f}_{C_2} \big), \quad \text{where} \quad \bar{f}_{C_i} = \frac{1}{m(C_i)} \sum_{x \in C_i} m(x) f(x), \quad i \in \{1, 2\}.$$
Therefore, each eigenmode coefficient is proportional to the contrast associated with the divergence $P$, that is, the difference between the averages of the trait in the split generated by the subclades $C_1$ and $C_2$. The proportionality factor is $\sqrt{\frac{m(C_1)\, m(C_2)}{m(P)}}$, which penalizes asymmetries in the split. Substituting into the variance identity we obtain
$$\mathrm{Var}_m(f) = \sum_P \frac{m(C_1)\, m(C_2)}{m(P)} \big( \bar{f}_{C_1} - \bar{f}_{C_2} \big)^2.$$
Moreover, each summand can be expressed as
$$c_P^2 = m(C_1) \big( \bar{f}_{C_1} - \bar{f}_P \big)^2 + m(C_2) \big( \bar{f}_{C_2} - \bar{f}_P \big)^2, \quad \text{where} \quad \bar{f}_P = \frac{m(C_1) \bar{f}_{C_1} + m(C_2) \bar{f}_{C_2}}{m(P)}.$$
Hence $c_P^2$ measures the between-group variance, i.e. how much the two groups differ from the average in the clade $P$, and the total variance $\mathrm{Var}_m(f)$ is decomposed into the variances between the splits generated by the phylogenetic tree. Thus, we expect the largest contributions to $\mathrm{Var}_m(f)$ to arise from splits where substantial differences of the trait occur between two clades of comparable mass. This suggests using the coefficients $c_P$ as a natural framework for phylogenetic comparison, where differences between traits are analyzed through the orthogonal contrasts associated with the divergence events of the tree.

As an example, let us analyze three traits on the phylogenetic tree of Primate genera. The information for these traits was obtained from the PanTHERIA dataset [27].

Fig. 8: Phylogenetic cumulative variance function of the traits: body mass, longevity and litter size.

Figure 8 illustrates the phylogenetic cumulative fraction of variance of three life-history traits measured across primate genera: log body mass, log maximum longevity, and log litter size.
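The variance identity $\mathrm{Var}_m(f) = \sum_P c_P^2$ for binary trees can be checked numerically with a short sketch. Everything in it is an assumption made for illustration: the five-leaf tree written as nested pairs, the trait values, and the function names are hypothetical and unrelated to the primate data.

```python
import numpy as np

# Hypothetical binary tree on 5 leaves, written as nested pairs of leaf indices.
tree = (((0, 1), 2), (3, 4))
f = np.array([2.0, 3.5, 1.0, 7.0, 6.0])  # made-up trait values
m = np.full(5, 1.0 / 5)                  # normalized counting measure


def leaves(node):
    """Leaf indices below a node."""
    return [node] if isinstance(node, int) else leaves(node[0]) + leaves(node[1])


def contrasts(node):
    """Yield (leaf set of P, c_P) for every internal node P at or below `node`."""
    if isinstance(node, int):
        return
    c1, c2 = leaves(node[0]), leaves(node[1])
    m1, m2 = m[c1].sum(), m[c2].sum()
    mean1 = (m[c1] * f[c1]).sum() / m1
    mean2 = (m[c2] * f[c2]).sum() / m2
    # c_P = sqrt(m(C1) m(C2) / m(P)) * (mean over C1 - mean over C2)
    yield tuple(c1 + c2), np.sqrt(m1 * m2 / (m1 + m2)) * (mean1 - mean2)
    yield from contrasts(node[0])
    yield from contrasts(node[1])


coeffs = dict(contrasts(tree))
mean = (m * f).sum()
variance = (m * (f - mean) ** 2).sum()
# Fraction of the total variance attached to each split.
explained = {clade: c**2 / variance for clade, c in coeffs.items()}
```

Sorting `explained` by value would rank the splits by their contribution to the trait variance, which is the quantity discussed for the primate traits.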
For each internal node $B$, the squared Fourier coefficient $c_B^2 = \langle f, \psi_B \rangle_m^2$ quantifies the contribution to the total trait variance explained by the contrast between the two child clades of $B$. The cumulative function
$$F(t) = \frac{\sum_{\lambda(B) \leq t} c_B^2}{\|f\|_m^2} \tag{5}$$
aggregates these contributions by eigenvalue.

The three traits exhibit different profiles. For log body mass, approximately 18.5% of its variance is explained by the root node alone ($\lambda \approx 10$), reflecting the large difference in mean body size between Strepsirrhini and Haplorhini. A secondary contribution arises within the Strepsirrhini band ($\lambda \in [18, 20]$), accounting for a further 22.7% of variance through internal contrasts. However, the dominant single contributor to body mass variance is the node separating Platyrrhini from Catarrhini ($\lambda \approx 30.5$, $h \approx 43$ Ma), which alone explains 21.9% of total variance. This split distinguishes the New World monkeys from the Old World monkeys and apes. The contrast between the mean body mass of these two assemblages, weighted by the mass distribution $m$, is the single largest source of body mass variation across the primate tree. By $\lambda = 35$, approximately 71% of body mass variance is accounted for.

Log maximum longevity presents a qualitatively different pattern, with 49% of its variance concentrated in the Simiiformes band ($\lambda \in [35, 40]$). The dominant node is the split between Hominoidea and Cercopithecidae ($\lambda = 35.4$, $h \approx 29$ Ma), which alone explains 20.5% of total longevity variance, reflecting the substantially greater maximum lifespan of great apes relative to Old World monkeys. The root contributes only 12.2%, indicating that the Strepsirrhini–Haplorhini divergence is less informative for longevity than for body mass.

Log litter size exhibits the flattest cumulative profile. The root contributes only 2.1% of variance, meaning that clade membership at the deepest level is nearly uninformative about litter size. Instead, variance is spread across multiple bands, with the largest single contributor being the Platyrrhini–Catarrhini split ($\lambda \approx 30.5$, 13.8%). Overall, litter size variance is distributed more uniformly across eigenvalue bands than either body mass or longevity.

Taken together, the three spectral profiles illustrate that the ultrametric Laplacian decomposition recovers biologically meaningful structure: traits with strong ancient signal (body mass) accumulate variance rapidly at low $\lambda$, while traits shaped by more recent or lineage-specific evolution (longevity, litter size) show flatter profiles with variance concentrated at higher eigenvalues.

3.3 The heat kernel of an ultrametric Laplacian

In this section we study the heat kernel associated with an ultrametric Laplacian. We approach this concept from the perspective of the system of differential equations naturally generated by the Laplacian matrix, that is, from the point of view of a master equation or a continuous-time Markov chain (CTMC). We are therefore interested in the following initial value problem, or Cauchy problem. Let $(X, d)$ be a finite ultrametric space, and let $L$ and $L_X$ denote the associated ultrametric Laplacian matrix and operator, respectively. For a function $u : X \to \mathbb{R}$, the evolution $u(\cdot, t)$ is governed by
$$\frac{d}{dt} u(t, x) = (L_X u(t, \cdot))(x),$$
with initial condition $u(0, x) = u_0(x)$. Equivalently, in matrix form this can be written as
$$\frac{d}{dt} u(t) = L u(t), \qquad u(0) = u_0.$$
By a straightforward computation, one can obtain the general solution of the Cauchy problem by using the spectral structure of the operator $L_X$. In particular, let $\{\lambda_n\}$ denote the spectrum of $L_X$ and let $\{\psi_n\}$ be an eigenbasis of $L^2(X, m)$ as introduced previously.
Expanding the initial condition in this basis, we write
$$u_0 = \sum_n \langle u_0, \psi_n \rangle_{L^2(X, m)}\, \psi_n.$$
The evolution equation above is also called the heat equation or the diffusion equation. Substituting this expansion into the evolution equation yields the explicit solution
$$u(t) = \sum_n e^{\lambda_n t} \langle u_0, \psi_n \rangle_{L^2(X, m)}\, \psi_n.$$
Equivalently, for the initial data $u(0) = u_0$, one has $u(t) = e^{tL} u_0$, where $\{e^{tL}\}_{t \geq 0}$ denotes the semigroup generated by the ultrametric Laplacian matrix $L$. In terms of the operator $L_X$ this reads
$$u(x, t) = e^{t L_X} u_0(x).$$
This semigroup formulation naturally leads to the notion of the heat kernel associated with $L_X$.

Definition 3.15 Let $(X, d, m)$ be a finite ultrametric measure space and $L_X$ the corresponding ultrametric Laplacian operator. The function $H : (0, \infty) \times X \times X \to \mathbb{R}$, $(t, x, y) \mapsto H_t(x, y)$, satisfying that for every function $u : X \to \mathbb{R}$ and every $t > 0$,
$$(e^{t L_X} u)(x) = \sum_{y \in X} H_t(x, y)\, u(y)\, m(y),$$
is called the heat kernel of $L_X$.

From the definition it follows that the heat kernel is completely determined by the spectrum of $L_X$ and its associated orthonormal eigenbasis. Indeed, if $\{\lambda_n\}$ denotes the spectrum of $L_X$ and $\{\psi_n\}$ an orthonormal eigenbasis of $L^2(X, m)$, then
$$H_t(x, y) = \sum_n e^{\lambda_n t}\, \psi_n(x)\, \psi_n(y). \tag{6}$$
Every nonzero eigenvalue $\lambda_n$ is associated to an internal node $n \in T$. Therefore, we can write the contribution of this eigenvalue and of the corresponding orthonormal eigenfunctions $\mathcal{B}_n$ as follows:
$$H_t^{(n)}(x, y) := e^{\lambda_n t} \sum_{\psi \in \mathcal{B}_n} \psi(x)\, \psi(y).$$
Let $E_n(x, y)$ be the projection kernel onto the eigenspace $V_n$, that is, for each $f \in L^2(m)$,
$$\sum_{y \in X} E_n(x, y) f(y) m(y) - f(x) \perp V_n.$$
We have
$$E_n(x, y) = \sum_{\psi \in \mathcal{B}_n} \psi(x)\, \psi(y).$$
In particular, for $\delta_z$ with $z \in B_j \subset B_n$ and $j \in C(n)$,
$$E_n(x, z)\, m(z) = \sum_{y \in X} E_n(x, y)\, \delta_z(y)\, m(y) = m(z) \Big( \frac{\mathbf{1}_{B_j}(x)}{m(B_j)} - \frac{\mathbf{1}_{B_n}(x)}{m(B_n)} \Big) = m(z)\, \phi_{B_n, j}(x),$$
where the second equality follows from the fact that $m(z)\, \phi_{B_n, j} - \delta_z \perp \phi_{B_n, l}$ for all $l \in C(n)$. Consequently,
$$E_n(x, z) = \sum_{\psi \in \mathcal{B}_n} \psi(x)\, \psi(z) = \phi_{B_n, j}(x), \quad \text{where } j \in C(n) \text{ is the child with } z \in B_j. \tag{7}$$
We now show that the matrix $H_t = (H_t(x, y))_{x, y \in X}$ inherits a block structure from the topological tree of $X$. More precisely, each internal node of the topological tree corresponds to a block of $H_t$, and the entries within each block are constant for fixed $t > 0$.

Theorem 3.16 Let $(X, d, m)$ be a finite ultrametric measure space with associated topological tree $T$ and heat kernel $H_t = (H_t(x, y))_{x, y \in X}$. Then for each internal node $n \in T$ the subset of indices corresponding to the children of $n$ determines a block of the matrix $H_t$. Moreover, each such block has constant entries, that is, for a given $t > 0$,
$$H_t(x, y) = 1 + e^{\lambda(B_n) t} \Big( \frac{\delta_{xy}}{m(\{x\})} - \frac{1}{m(B_n)} \Big) + \sum_{T :\ B_n \subseteq T \subsetneq X} e^{\lambda(T^+) t} \Big( \frac{1}{m(T)} - \frac{1}{m(T^+)} \Big), \tag{8}$$
where $B_n = x \wedge y$.

Proof Let $n \in T$ be an internal node and let $x, y \in X$ be such that $x \wedge y = n$. Then there exist two disjoint balls $B_i, B_j$ with $i, j \in C(n)$ such that $x \in B_i$ and $y \in B_j$. By equation (7),
$$E_n(x, y) = \phi_{B_n, j}(x) = -\frac{1}{m(B_n)}.$$
On the other hand, let $j \in \gamma_r(n) \setminus \{r\}$. Then $x, y \in B_j$, and therefore
$$E_{F(j)}(x, y) = \phi_{B_{F(j)}, j}(x) = \frac{1}{m(B_j)} - \frac{1}{m(B_{F(j)})}.$$
By summing along the history of $n$ and including the contribution of the trivial eigenvector, we obtain the desired equality. □

From the semigroup structure we have the following asymptotic formula for $x, y \in X$:
$$H_t(x, y)\, m(y) = \delta_{xy} + t L_{xy} + \frac{t^2}{2} (L^2)_{xy} + O(t^3) \quad (t \downarrow 0).$$
If $x \in B_l$ and $y \in B_m$ with $l \neq m$ and $n = x \wedge y$, then by expanding the exponential functions in expansion (8) and comparing with the above equation, we can write the matrix $k(d(x, y))$ in terms of the eigenvalues and the measure:
$$k(d(x, y)) = -\frac{\lambda(B_n)}{m(B_n)} + \sum_{T :\ B_n \subseteq T \subsetneq X} \Big( \frac{1}{m(T)} - \frac{1}{m(T^+)} \Big) \lambda(T^+). \tag{9}$$
We now focus on the long-time behavior of the heat kernel. Let $\pi : X \times X \to \mathbb{R}$ be defined as $\pi(x, y) \equiv 1$. This kernel has an attached operator of the form
$$\Pi u(x) = \sum_{y \in X} \pi(x, y)\, u(y)\, m(y) = \sum_{y \in X} u(y)\, m(y).$$
Therefore the heat kernel can be written as
$$H_t(x, y) = \pi(x, y) + \sum_n e^{\lambda_n t}\, \psi_n(x)\, \psi_n(y),$$
where $\psi_n$ is an eigenvector attached to the internal node $n$. Let $f \in L^2(X, m)$ with the expansion $f = f_0 \mathbf{1}_X + \sum_n f_n \psi_n$. Then
$$\|(e^{t L_X} - \Pi) f\|_{L^2}^2 = \sum_n e^{2 \lambda_n t} f_n^2 \leq e^{2 \lambda_{gap} t}\, \|f\|_{L^2}^2,$$
where $\lambda_{gap}$ is the nonzero eigenvalue with minimal absolute value. Therefore, $\|e^{t L_X} - \Pi\|_{L^2} \leq e^{\lambda_{gap} t}$. Moreover, if $\psi_n$ is a Kozyrev wavelet attached to $\lambda_{gap}$, then $\|(e^{t L_X} - \Pi) \psi_n\|_{L^2} = e^{\lambda_{gap} t}$, hence
$$\|e^{t L_X} - \Pi\|_{L^2} = e^{\lambda_{gap} t}. \tag{10}$$
Therefore, the decay rate of the heat kernel is controlled by the spectral gap $|\lambda_{gap}|$. Moreover, since for every function $u : X \to \mathbb{R}$ and every $t > 0$,
$$(e^{t L_X} u)(x) = \sum_{y \in X} H_t(x, y)\, u(y)\, m(y),$$
we have that
$$\lim_{t \to \infty} e^{t L_X} u(x) = \sum_{y \in X} u(y)\, m(y). \tag{11}$$
Although the matrix of $L_X$ with respect to the canonical basis is not symmetric unless $m$ is uniform, $L_X$ is a self-adjoint operator in $L^2(X, m)$. Equivalently, the ultrametric Laplacian matrix $L$ satisfies the detailed balance condition, namely $m(x) L_{xy} = m(y) L_{yx}$. The self-adjointness of $L_X$ and equation (11) imply the following identity.
$$\sum_{y\in X}e^{tL_X}u(y)\,m(y) = \sum_{y\in X}u(y)\,m(y), \qquad \forall t > 0. \tag{12}$$

Indeed, taking the derivative of the left-hand side with respect to time, we obtain

$$\sum_{y\in X}L_X(e^{tL_X}u)(y)\,m(y) = \langle L_X(e^{tL_X}u), 1\rangle_{L^2(m)} = \langle e^{tL_X}u, L_X 1\rangle_{L^2(m)} = 0,$$

where the last equality holds since $1$ is the trivial eigenvector of $L_X$ with eigenvalue zero. These identities will have a clear probabilistic meaning in the next section.

Since the rate of convergence of the semigroup to its limit as $t \to \infty$ depends on the spectral gap, that is, on the first nonzero eigenvalue, we close this section with the following proposition, which characterizes the spectral gap.

Proposition 3.17 If $k$ is decreasing, then $|\lambda_{gap}| = k(\mathrm{diam}(X))$.

Proof Let $n \in T$ be an internal node and let $m \in T$ be a child of $n$. Then $X\setminus B_n \subset X\setminus B_m$. Let $x_0 \in B_m$; then

$$\sum_{y\in X\setminus B_m}k(d(x_0,y))\,\mu(y) = \sum_{y\in X\setminus B_n}k(d(x_0,y))\,\mu(y) + k(\mathrm{diam}(B_n))\,\mu(B_n\setminus B_m).$$

Therefore, by equation 3, we obtain

$$\lambda_m - \lambda_n = \mu(B_m)\big(k(\mathrm{diam}(B_n)) - k(\mathrm{diam}(B_m))\big).$$

If $k$ is a decreasing function, then since $\mathrm{diam}(B_m) \le \mathrm{diam}(B_n)$ we get $\lambda_m \le \lambda_n < 0$. Iterating this inequality along the path to the root, the same holds for every proper descendant $m$ of $n$. As a consequence, $\lambda_{gap} = \lambda_r$, where $r \in T$ is the root. □

4 Dynamic centrality and ultrametric spaces.

4.1 A state-centrality index for CTMC.

Several complex systems can be described through stochastic dynamics evolving on a large configuration space. Such systems typically explore a vast number of states before approaching equilibrium. In statistical physics, this behavior is often interpreted through the notion of an underlying energy landscape, where each configuration is associated with a potential energy and the dynamics describe random transitions or jumps between metastable states. Under suitable assumptions, the resulting dynamics satisfy the Markov property and can be modeled as a CTMC [31].
While this framework arises naturally in non-equilibrium statistical mechanics, similar Markovian descriptions appear in broader contexts where the state space exhibits an intrinsic hierarchical organization. In such situations, the geometry of the configuration space plays a fundamental role in shaping the stochastic evolution. Rather than focusing on a specific physical model, we concentrate here on the structural properties of ultrametric state spaces and investigate how continuous-time Markov dynamics reflect their hierarchical organization.

In particular, this perspective motivates the study of geometric quantities associated with the generator of the dynamics, such as state-centrality indices, which quantify the structural relevance of individual states independently of a specific physical interpretation. Accordingly, the continuous-time stochastic process $X_t$, where $t > 0$ and $X_t \in S$, with $S$ the finite state space (the space of configurations), is assumed to satisfy the Markov property (and therefore to generate a CTMC):

$$P(X_t \mid X_{t_n}) = P(X_t \mid X_{t_n}, X_{t_{n-1}}, \dots, X_{t_0}).$$

During the evolution of the dynamics, given any two states $i, j \in S$, we are interested in the transition probabilities $p_{ij}(t)$, that is, the probability of finding the system in state $j$ at time $t$ given that $X_0 = i$. Using the Markov property and the law of total probability, it is possible to describe this probability function in terms of the so-called master equation

$$\frac{d}{dt}P(t) = QP(t), \tag{13}$$

where $p_{ij}(t) = (P(t))_{ij}$ and $q_{ij} = (Q)_{ij} := \frac{d}{dt}p_{ij}(t)\big|_{t=0}$. The numbers $q_{ij}$ are called the transition rates of the Markov process. They determine the infinitesimal behavior of the conditional probabilities through $P(X_{t+h} = j \mid X_t = i) = \delta_{ij} + q_{ij}h + o(h)$. The general solution of equation 13 is given by the attached semigroup $P(t) = \exp(tQ)$.
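For a finite state space the semigroup $P(t) = \exp(tQ)$ can be evaluated numerically. The minimal Python sketch below uses uniformization, a standard technique that is not taken from the paper: it writes $P(t) = \sum_{n\ge 0} e^{-\Lambda t}\frac{(\Lambda t)^n}{n!}\,U^n$ with $U = I + Q/\Lambda$ and $\Lambda \ge \max_i |q_{ii}|$. The three-state rate matrix `Q` and the function names are hypothetical.

```python
import math


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(Q, t, tol=1e-14, max_terms=10_000):
    """P(t) = exp(tQ) by uniformization; rows of Q are assumed to sum to zero."""
    n = len(Q)
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    rate = max(-Q[i][i] for i in range(n))        # uniformization rate Lambda
    if rate == 0.0 or t == 0.0:
        return identity
    # Transition matrix of the uniformized discrete-time jump chain.
    U = [[identity[i][j] + Q[i][j] / rate for j in range(n)] for i in range(n)]
    P = [[0.0] * n for _ in range(n)]
    power = identity                              # U^0
    weight = math.exp(-rate * t)                  # Poisson(rate * t) mass at 0
    covered = 0.0                                 # Poisson mass summed so far
    for step in range(1, max_terms + 1):
        for i in range(n):
            for j in range(n):
                P[i][j] += weight * power[i][j]
        covered += weight
        if covered >= 1.0 - tol:
            break
        weight *= rate * t / step                 # next Poisson weight
        power = mat_mul(power, U)                 # next power of U
    return P


# Hypothetical rate matrix: q_ij is the jump rate from state i to state j.
Q = [[-3.0, 2.0, 1.0],
     [1.0, -1.0, 0.0],
     [0.5, 0.5, -1.0]]
P = transition_matrix(Q, 0.7)
```

Each row of `P` sums to one up to the truncation tolerance, and `transition_matrix(Q, 0.3)` times `transition_matrix(Q, 0.4)` reproduces `P`, which is the semigroup property behind equation 13. The plain exponential weight `math.exp(-rate * t)` underflows for very large `rate * t`, so this sketch is only meant for moderate time horizons.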
One important conclusion from this description is that when the state space $S$ is finite, the CTMC is determined by its transition rate matrix $Q$. It is therefore usually convenient to describe the process via its Transition Network (or Jump Network), see [31]. This network has the state space $S$ as its set of vertices, and its directed edges are weighted by the corresponding rates $q_{ij}$ whenever these are positive; no directed edge is drawn when the transition rate is zero. We say that a transition network is connected if any state $j$ can be reached from any other state $i$. This assumption is usually made in transition rate phenomena (chemical reactions, biomolecules, etc.), since disconnected master equations represent multiple non-interacting physical systems. Connectedness always implies the relaxation of the system toward an equilibrium state, that is, a stationary distribution exists. We denote this stationary distribution by $\pi = (\pi(i))_{i\in S}$, so that $\lim_{t\to\infty}p_{ij}(t) = \pi(j)$. In many thermodynamic systems, it is common to assume a stronger condition, called the detailed balance condition, given by the relation

$$\pi(i)\,q_{ij} = q_{ji}\,\pi(j). \tag{14}$$

It can be shown that condition 14 holds if and only if

$$\pi(i)\,p_{ij}(t) = p_{ji}(t)\,\pi(j) \tag{15}$$

for every $t \ge 0$.

Random walk centrality

In [28], Noh and Rieger introduce the concept of random walk centrality, which measures the capacity of a node of a network to receive and redistribute information. Next, we briefly review the definition of this index, based on a discrete-time random walk in a network setting.

Let $G = (V,E)$ be a finite, connected, undirected graph with adjacency matrix $A$. Denote by

$$k_i = \sum_{j\in V}A_{ij}$$

the degree of node $i$. A discrete-time random walk on $G$ is a Markov chain $(X_t)_{t\ge 0}$ with state space $V$ and transition probabilities

$$P_{ij} = P(X_{t+1} = j \mid X_t = i) = \frac{A_{ij}}{k_i}.$$
That is, at each time step, a walker located at node $i$ chooses uniformly at random one of its neighbors and moves to it. Since the graph is connected and undirected, the Markov chain admits a unique stationary distribution given by

$$\pi_i = \frac{k_i}{\sum_{j\in V}k_j}.$$

The characteristic relaxation time of node $i$ is defined as

$$\tau_i = \sum_{t=0}^{\infty}\big(P_{ii}(t) - \pi_i\big),$$

where $P_{ii}(t)$ denotes the probability that a walker starting at $i$ is at $i$ at time $t$. The random walk centrality of node $i$ is then defined as

$$C_i = \frac{\pi_i}{\tau_i}.$$

The quotient $C_i$ captures the balance between two effects: the visitation probability of the node, encoded in the stationary weight $\pi_i$, and the local relaxation behavior of the walk around that node, encoded in $\tau_i$. Hence, $C_i$ quantifies how efficiently node $i$ can receive and redistribute information under random walk dynamics on the network. Moreover, the mean first passage times satisfy the relation

$$E[T_j \mid X_0 = i] - E[T_i \mid X_0 = j] = C_j^{-1} - C_i^{-1},$$

where $T_j := \inf\{t \ge 0 : X_t = j\}$ denotes the first passage time (or hitting time) of node $j$, that is, the first time at which the random walker visits $j$. This shows that nodes with larger random walk centrality are, on average, reached more rapidly by the random walk.

4.2 Dynamic centrality

We now extend the result of Noh and Rieger to the continuous-time case. To the best of our knowledge, these results have not appeared explicitly in previous literature. In a similar way, for a CTMC with transition probabilities $p_{ij}(t)$ we let

$$T_{ij} := \inf\{t \ge 0 : X_t = j\}$$

denote the first passage (hitting) time of node $j$. The mean first passage time (MFPT) from $i$ to $j$ is defined as

$$m_{ij} := E_i[T_{ij}],$$

where $E_i[\cdot]$ denotes expectation conditioned on $X_0 = i$. Following [32], the following equation holds:
$$\hat f_{ij}(s) = \frac{\hat p_{ij}(s)}{\hat p_{jj}(s)}, \tag{16}$$

where $\hat f_{ij}(s) = \int_0^\infty e^{-st}f_{ij}(t)\,dt$ is the Laplace transform of the first-passage probability density, and $\hat p_{ij}(s) = \int_0^\infty e^{-st}p_{ij}(t)\,dt$ is the Laplace transform of the transition probability $p_{ij}(t)$. Using equation 16 we obtain

$$m_{ij} = -\frac{d}{ds}\hat f_{ij}(s)\Big|_{s=0}.$$

Define $R^{(m)}_{ij} := \int_0^\infty t^m\,(p_{ij}(t) - \pi(j))\,dt$. For $s > 0$, by the dominated convergence theorem and the uniform convergence of the series expansion of the function $x \mapsto e^{-x}$ on compact sets, we have

$$S\big[R^{(m)}_{ij}\big](s) = \sum_{m=0}^{\infty}R^{(m)}_{ij}\frac{(-1)^m s^m}{m!} = \int_0^\infty\left(\sum_{m=0}^{\infty}\frac{t^m s^m}{m!}(-1)^m\right)(p_{ij}(t) - \pi(j))\,dt = \int_0^\infty e^{-st}(p_{ij}(t) - \pi(j))\,dt = \hat p_{ij}(s) - \frac{\pi(j)}{s}.$$

Therefore,

$$\hat f_{ij}(s) = \frac{\hat p_{ij}(s)}{\hat p_{jj}(s)} = \frac{\pi(j) + s\,S\big[R^{(m)}_{ij}\big](s)}{\pi(j) + s\,S\big[R^{(m)}_{jj}\big](s)}.$$

For $i \neq j$ the above equality leads to

$$m_{ij} = -\frac{d}{ds}\hat f_{ij}(s)\Big|_{s=0} = \frac{R^{(0)}_{jj}}{\pi(j)} - \frac{R^{(0)}_{ij}}{\pi(j)}. \tag{17}$$

Note that equation 15 implies $\pi(j)R^{(0)}_{ji} = \pi(i)R^{(0)}_{ij}$, hence

$$m_{ij} - m_{ji} = \left(\frac{R^{(0)}_{ji}}{\pi(i)} - \frac{R^{(0)}_{ij}}{\pi(j)}\right) + \left(\frac{R^{(0)}_{jj}}{\pi(j)} - \frac{R^{(0)}_{ii}}{\pi(i)}\right) = \frac{R^{(0)}_{jj}}{\pi(j)} - \frac{R^{(0)}_{ii}}{\pi(i)}.$$

Definition 4.1 For a state $i \in S$ we define its CTMC centrality as the number

$$C_{CTMC}(i) = \frac{\pi(i)}{R^{(0)}_{ii}}.$$

Hence in the CTMC case we can draw the same conclusion as in [28]: the CTMC centrality $C_{CTMC}(i)$ quantifies how central the state $i$ is regarding its potential to be accessible from other states. That is, the following equivalence holds for two states $i, j \in S$:

$$C_{CTMC}(i) > C_{CTMC}(j) \iff m_{ij} > m_{ji}.$$

Therefore, on average, the system reaches the state $i$ from the state $j$ faster than it reaches the state $j$ from $i$. This is the continuous-time analog of the result and conclusion obtained by Noh and Rieger in [28].
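The relations of this section can be checked numerically on a small reversible chain. The sketch below, in Python, is an illustration under stated assumptions: the three-state chain (`pi`, `K`) is hypothetical, and the matrix $R^{(0)}$ is computed through the identity $R^{(0)} = (\Pi - Q)^{-1} - \Pi$ with $\Pi_{ij} = \pi(j)$, a standard fact about irreducible finite chains that is not stated in the paper. Mean first passage times are obtained independently by solving the first-step linear system $\sum_l q_{il}\,m_{lj} = -1$ for $i \neq j$, with $m_{jj} = 0$.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def reversible_generator(pi, K):
    """Rates q_ij = K_ij * pi_j with K symmetric, so detailed balance (14) holds."""
    n = len(pi)
    Q = [[K[i][j] * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])          # diagonal makes each row sum to zero
    return Q


def deviation_matrix(Q, pi):
    """R_ij = integral over t of (p_ij(t) - pi_j), via (Pi - Q)^{-1} - Pi."""
    n = len(pi)
    A = [[pi[j] - Q[i][j] for j in range(n)] for i in range(n)]
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    Z = solve(A, identity)
    return [[Z[i][j] - pi[j] for j in range(n)] for i in range(n)]


def mfpt_to(Q, j):
    """Mean first passage times to state j from every state (0 at j itself)."""
    n = len(Q)
    others = [i for i in range(n) if i != j]
    A = [[Q[i][l] for l in others] for i in others]
    sol = solve(A, [[-1.0] for _ in others])
    times = [0.0] * n
    for idx, i in enumerate(others):
        times[i] = sol[idx][0]
    return times


# Hypothetical example chain.
pi = [0.5, 0.3, 0.2]
K = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 0.5],
     [2.0, 0.5, 0.0]]
Q = reversible_generator(pi, K)
R = deviation_matrix(Q, pi)
C = [pi[i] / R[i][i] for i in range(3)]                   # Definition 4.1
columns = [mfpt_to(Q, j) for j in range(3)]
m = [[columns[j][i] for j in range(3)] for i in range(3)]  # m[i][j]: from i to j
```

With these quantities one can compare `m[i][j]` against `(R[j][j] - R[i][j]) / pi[j]` (equation 17) and `m[i][j] - m[j][i]` against `1 / C[j] - 1 / C[i]`. Pure-Python elimination is used only to keep the sketch dependency-free; a linear algebra library would be the natural choice for larger state spaces.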
4.3 Continuous time Markov Chain in ultrametric spaces

The matrix representation of the operator $L_X$ is a $Q$-matrix, and therefore the operator has an attached stochastic process whose transition probability matrix is given by $P_t = \exp(tL_X)$. The main properties of this process are described in the theorem below.

Theorem 4.2 Let $(X,d,m)$ be a finite measurable ultrametric space with ultrametric Laplacian $L_X$, where $m$ is a probability measure. Then $L_X$ has an attached irreducible continuous-time Markov chain with transition function

$$P_t(x,y) = e^{tL_X}\delta_y(x) = H_t(x,y)\,m(y).$$

The process is reversible, and the measure $m$ is the only stationary probability measure.

Proof Since $L$ is a $Q$-matrix, it has an attached continuous-time Markov chain, and since $L_{xy} > 0$ for all $x, y \in X$, $x \neq y$, the Markov chain is irreducible. The equality in the theorem follows from the relation $e^{tL_X}\delta_y(x) = (e^{tL})_{xy}$, for all $x, y \in X$. The fact that $m$ is a stationary measure for the process follows from equation 12. □

From this result and the last section, we now have a clear probabilistic interpretation of the off-diagonal entries of $L$:

$$k(d(x,y))\,m(y) = \frac{d}{dh}P(X_{t+h} = y \mid X_t = x)\Big|_{h=0},$$

that is, $k(d(x,y))\,m(y)$ is the instantaneous transition rate, or in other words, the transition probability density per unit of time from $x$ to $y$. Since $d$ is ultrametric and $k$ is a non-increasing function, the process is compatible with the topology in the following sense: if $d(x,y) \le d(x,z)$ and $m(y) = m(z)$, then $P(X_{t+h} = y \mid X_t = x) \ge P(X_{t+h} = z \mid X_t = x)$, i.e., the stochastic system at $X_t = x$ is more likely to occupy the nearest states.
It is clear that the measure $m$ can bias the jump rates; nevertheless, in the applications below $m$ will be the normalized counting measure, hence the property $m(y) = m(z)$ is satisfied trivially for every pair of points.

On the other hand, notice that for the ultrametric phylogenetic Laplacian $L_T$ the following holds:

$$F(h_0 - h(x\wedge y)) = \frac{d}{dh}P(X_{t+h} = y \mid X_t = x)\Big|_{h=0},$$

hence the random process attached to this generator is compatible with the phylogenetic structure of the tree; in particular, for $F(x) = x$ the rates depend linearly on the divergence time.

4.4 Dynamic centrality for Ultrametric spaces as a topological descriptor

We now proceed to study the CTMC centrality on ultrametric spaces. Denote $\tau_i := R^{(0)}_{ii}$. Let $(X,d)$ be a finite ultrametric space, and let $P_t(x,y)$ be the transition probability function attached to the operator $L_X$ with kernel $k$. In order to study the centrality $C_{CTMC}(i)$, we need to investigate the quantities $\tau_i$ and $\pi(i)$; for the latter we have the following result.

Lemma 4.3 Let $(X,d)$ be a finite ultrametric space equipped with a probability measure $m$, and let $P_t(x,y)$ be the transition probability function attached to the operator $L_X$. Then $\pi(i) = m(i)$ for all $i \in X$, that is,

$$\lim_{t\to\infty}P_t(x,y) = m(y), \qquad x, y \in X.$$

Proof The result follows from equation 10, since $P_t(x,y) = H_t(x,y)\,m(y)$.
□

This allows us to simplify the expression of $\tau_i$:

$$\tau_i = \int_0^\infty(P_t(i,i) - \pi(i))\,dt = m(i)\int_0^\infty(H_t(i,i) - 1)\,dt,$$

therefore

$$C_{CTMC}(i) = \frac{1}{\int_0^\infty(H_t(i,i) - 1)\,dt}.$$

Recall that

$$H_t(i,i) - 1 = e^{\lambda(B_n)t}\left(\frac{1}{m(\{i\})} - \frac{1}{m(B_n)}\right) + \sum_{T:\,B_n\subseteq T\subsetneq X}e^{\lambda(T^+)t}\left(\frac{1}{m(T)} - \frac{1}{m(T^+)}\right),$$

therefore

$$\int_0^\infty(H_t(i,i) - 1)\,dt = -\frac{1}{\lambda(B_n)}\left(\frac{1}{m(\{i\})} - \frac{1}{m(B_n)}\right) + \sum_{T:\,B_n\subseteq T\subsetneq X}-\frac{1}{\lambda(T^+)}\left(\frac{1}{m(T)} - \frac{1}{m(T^+)}\right),$$

where $n = F(i)$ is the father of node $i$. This leads to the following result for finite ultrametric spaces.

Theorem 4.4 Given a finite ultrametric space $(X,d)$ with probability measure $m$, the CTMC centrality attached to the CTMC generated by $L_X$ is given by

$$C_{CTMC}(i) = \left[-\frac{1}{\lambda(B_n)}\left(\frac{1}{m(\{i\})} - \frac{1}{m(B_n)}\right) + \sum_{T:\,B_n\subseteq T\subsetneq X}-\frac{1}{\lambda(T^+)}\left(\frac{1}{m(T)} - \frac{1}{m(T^+)}\right)\right]^{-1}. \tag{18}$$

It is clear that the topology of the ultrametric space affects the accessibility of the states during the random process. This effect can be studied using equation 18. Since $k$ is non-increasing, the absolute value of each eigenvalue is a non-increasing function of the radius of any given ball. Therefore, increasing the radius of one of the balls, say $B_n$, decreases the index $C_{CTMC}(i)$ for all states $i \in X$ such that $n \in \gamma_r(i)$; hence a state is more accessible when the balls in its history $\gamma_r$ have smaller radii.

Fig. 9: Decreasing the radius of a ball makes the states inside it more accessible. (Two rooted trees, with leaf $a$ on the left and leaf $a'$ on the right: $C_{CTMC}(a) < C_{CTMC}(a')$.)

The other parameter which may affect the centrality is the measure of a given ball. When $m(B_n)$ increases, the absolute values of the affected eigenvalues increase, and we conclude from equation 18 that increasing the measure of a ball increases the centrality of its elements.
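Equation 18 is straightforward to evaluate once the nested balls around a leaf are known. The following Python sketch does so directly from a distance matrix; it is a minimal illustration under stated assumptions, not the author's implementation. The two toy spaces, the kernel `k(r) = 1/r` and all function names are hypothetical, and `eigenvalue` assumes $\lambda(B) = -\big(k(\mathrm{diam}\,B)\,m(B) + \sum_{y\notin B}k(d(x,y))\,m(y)\big)$, our reading of the eigenvalue formula from the spectral section.

```python
def eigenvalue(D, M, k, x, r):
    """Eigenvalue attached to the ball B(x, r), r > 0 (assumed form, see lead-in)."""
    return -sum(k(max(D[x][y], r)) * M[y] for y in range(len(D)))


def ball_chain(D, M, x):
    """(radius, mass) of the nested balls {x}, B_F(x), ..., X around the leaf x."""
    n = len(D)
    radii = sorted({D[x][y] for y in range(n)})   # radius 0 gives the leaf {x}
    return [(r, sum(M[y] for y in range(n) if D[x][y] <= r)) for r in radii]


def ctmc_centrality(D, M, k, i):
    """Closed form (18): inverse of the integral of H_t(i, i) - 1 over t >= 0."""
    balls = ball_chain(D, M, i)
    integral = sum(
        (-1.0 / eigenvalue(D, M, k, i, r_up)) * (1.0 / m_t - 1.0 / m_up)
        for (_, m_t), (r_up, m_up) in zip(balls, balls[1:]))   # pairs (T, T+)
    return 1.0 / integral


def k(r):
    """Decreasing kernel, picked only for this demo."""
    return 1.0 / r


# Caterpillar tree: leaves 0 and 1 form a cherry, leaf 2 hangs off the root.
D3 = [[0, 1, 2],
      [1, 0, 2],
      [2, 2, 0]]
M3 = [1 / 3, 1 / 3, 1 / 3]
C3 = [ctmc_centrality(D3, M3, k, i) for i in range(3)]

# Balanced tree: two cherries joined at the root, uniform measure.
D4 = [[0, 1, 2, 2],
      [1, 0, 2, 2],
      [2, 2, 0, 1],
      [2, 2, 1, 0]]
M4 = [0.25] * 4
C4 = [ctmc_centrality(D4, M4, k, i) for i in range(4)]
```

For the caterpillar tree a hand computation gives $\lambda(\{0,1\}) = -5/6$ and $\lambda(X) = -1/2$, so the cherry leaves have centrality $1/2.8 \approx 0.357$ while the isolated leaf 2 has centrality $1/4$. In the balanced tree all four leaves receive the same value, since every leaf sees the same sequence of radii and masses.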
Fig. 10: For the counting measure, increasing the number of leaves of an internal node increases their accessibility. (Two rooted trees, with leaves $a, \dots, f$ on the left and $a', \dots, f'$ on the right: $C_{CTMC}(a) > C_{CTMC}(a')$ and $C_{CTMC}(d) < C_{CTMC}(d')$.)

Since

$$C_{CTMC}(i) > C_{CTMC}(j) \iff m_{ij} > m_{ji},$$

we can make the inequality on the right-hand side explicit. According to the last section, for a general CTMC,

$$m_{ij} = \frac{R^{(0)}_{jj}}{\pi(j)} - \frac{R^{(0)}_{ij}}{\pi(j)},$$

therefore, with $n = LCA(i,j)$,

$$\begin{aligned} m_{ij} &= \int_0^\infty(H_t(j,j) - H_t(i,j))\,dt \\ &= -\frac{1}{\lambda(B_n)}\cdot\frac{1}{m(B_n)} - \frac{1}{\lambda(B_{F(j)})}\left(\frac{1}{m(j)} - \frac{1}{m(B_{F(j)})}\right) + \sum_{T:\,B_{F(j)}\subseteq T\subsetneq B_n}-\frac{1}{\lambda(T^+)}\left(\frac{1}{m(T)} - \frac{1}{m(T^+)}\right) \\ &= -\frac{1}{\lambda(B_{F(j)})}\cdot\frac{1}{m(j)} + \sum_{T:\,B_{F(j)}\subseteq T\subsetneq B_n}\frac{1}{m(T)}\left[\left(-\frac{1}{\lambda(T^+)}\right) - \left(-\frac{1}{\lambda(T)}\right)\right]. \end{aligned} \tag{19}$$

As we already established, $m_{ij}$ is the average time at which the system first occupies the state $j$, given that the process started at $i$. Hence, dynamically, the accessibility of the leaf $j$ from the node $i$ depends only on the history of $j$ up to the node $n = LCA(i,j)$. Topologically, the quantity $m_{ij}$ captures the ramification of this path. Indeed, first, the behavior of $m_{ij}$ depends on the differences between the radii of two consecutive balls: if $\Delta r = |r(a) - r(b)|$, for $a, b \in \gamma_n(j)$, increases, then $m_{ij}$ increases as well, since the expression $\left(-\frac{1}{\lambda(T^+)}\right) - \left(-\frac{1}{\lambda(T)}\right)$ is an increasing function of $\Delta r$. That is, once the system enters the ball/cluster $B_n$, it accesses the state $j$ more easily if the radii of the nested balls containing $j$ are smaller. Secondly, if the measure of one of those balls increases, then the system has more accessible states in the cluster $B_n$, decreasing the time $m_{ij}$.
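The last line of equation 19 can be evaluated with a few lines of code. The Python sketch below is a hedged illustration: the three-leaf space, the kernel `k(r) = 1/r` and the function names are hypothetical, and `eigenvalue` again assumes $\lambda(B) = -\big(k(\mathrm{diam}\,B)\,m(B) + \sum_{y\notin B}k(d(x,y))\,m(y)\big)$, our reading of the eigenvalue formula from the spectral section.

```python
def eigenvalue(D, M, k, x, r):
    """Eigenvalue attached to the ball B(x, r), r > 0 (assumed form, see lead-in)."""
    return -sum(k(max(D[x][y], r)) * M[y] for y in range(len(D)))


def mean_first_passage(D, M, k, i, j):
    """Closed form (19): mean first passage time from leaf i to leaf j != i."""
    n = len(D)
    # Radii of the balls around j, from the parent ball B_F(j) up to B_n = LCA(i, j).
    radii = sorted({D[j][y] for y in range(n) if 0 < D[j][y] <= D[i][j]})
    masses = [sum(M[y] for y in range(n) if D[j][y] <= r) for r in radii]
    inv = [-1.0 / eigenvalue(D, M, k, j, r) for r in radii]   # -1/lambda per ball
    total = inv[0] / M[j]                        # term attached to B_F(j)
    for l in range(len(radii) - 1):              # T runs from B_F(j) to below B_n
        total += (inv[l + 1] - inv[l]) / masses[l]
    return total


def k(r):
    """Decreasing kernel, picked only for this demo."""
    return 1.0 / r


# Caterpillar tree: leaves 0 and 1 form a cherry, leaf 2 hangs off the root.
D3 = [[0, 1, 2],
      [1, 0, 2],
      [2, 2, 0]]
M3 = [1 / 3, 1 / 3, 1 / 3]
m_02 = mean_first_passage(D3, M3, k, 0, 2)   # from the cherry to the isolated leaf
m_20 = mean_first_passage(D3, M3, k, 2, 0)   # from the isolated leaf into the cherry
```

In this example the values can be confirmed by elementary first-step analysis. From either cherry leaf the rate into leaf 2 is $k(2)\,m(2) = 1/6$, so $m_{02} = 6$; solving the two first-step equations for the target leaf 0 gives $m_{20} = 4.8$. The isolated leaf is therefore harder to reach than the cherry leaves, in line with the discussion above.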
Equation 19 shows that $m_{ij}$ reflects the ramification and topology of the path $j \to LCA(i,j)$: a smaller $m_{ij}$ means a more ramified (in the topological tree) path $j \to LCA(i,j)$ or more compactly nested balls containing $j$.

Therefore, $C(i) > C(j)$ has not only a dynamic meaning but also a topological one: the inequality can be interpreted as "the path $i \to LCA(i,j)$ is more ramified or connects more quickly to other leaves than the path $j \to LCA(i,j)$". A more dynamically isolated state corresponds to a more isolated leaf in the ultrametric space. This will be central for many of the applications later on. As an example of how the indices reflect the richness of the topology, we have the following corollary.

Corollary 4.5 If the finite ultrametric space is level-regular, then $C(a) = C(b)$ for all $a, b \in X$.

Let us close this section with an application to phylogenetic trees. As before, we use the phylogenetic tree of Primate genera, and we compute the CTMC centrality for two different kernels of the ultrametric phylogenetic Laplacian, $h_0 - h$ and $1/d$, where $d = 2h$.

Fig. 11: Dynamic centrality for the Primate genus tree. The first plot (from left to right) shows the centrality for the kernel $k = h_0 - h$, the second for the kernel $k = 1/(2h)$. The last plot shows the high but not full correlation between the classic ED rank and the rank given by the $C_{CTMC}$ index.

Figure 11 shows the CTMC centrality index $C_{CTMC}(i)$ computed for all 109 primate genera under two kernel choices, alongside a rank-rank comparison with the Evolutionary Distinctiveness (ED) score, see [5]. In the first two panels, genera are ordered by ascending centrality, so the leftmost points correspond to the most phylogenetically isolated species.
Under the linear kernel $k(h) = h_0 - h$ (left panel), the least central genera are the three Tarsiidae (Tarsius, Carlito, Cephalopachus) and Daubentonia, the only representative of the family Daubentoniidae. Lorisidae (Loris, Nycticebus, Arctocebus, Perodicticus) follow immediately, consistent with their position as an ancient and species-poor clade. At the opposite extreme, the most central genera belong to the Cercopithecinae, a dense and species-rich radiation with many close relatives sharing long internal branches. Under the kernel $k(h) = 1/(2h)$ (middle panel), which down-weights ancient splits and up-weights recent divergences, the global ordering is preserved at the extremes but differs in the middle ranks, where recently diversified clades gain centrality relative to the linear kernel.

The principal discrepancies occur for genera such as Megaladapis, Lepilemur, and Phaner, which ED ranks among the five most isolated due to their long pendant edges, while $C_{CTMC}$ assigns them substantially lower isolation ranks because their parent nodes carry relatively high spectral weight, reflecting the presence of multiple close relatives within Strepsirrhini. Conversely, Daubentonia is ranked as the most isolated by ED but only fourth by $C_{CTMC}$, where the three Tarsiidae displace it at the extreme. This illustrates a fundamental difference between the two indices. ED captures local uniqueness along the path to the root,

$$ED(i) = \frac{1}{N}\sum_{T:\,\{i\}\subseteq T\subseteq X}\frac{h(T^+) - h(T)}{m(T)},$$

whereas $C_{CTMC}$ incorporates the global spectral structure of the tree, weighting each split by its eigenvalue $\lambda(B)$, which depends on the mass distribution of the entire phylogeny.

The CTMC centrality index $C_{CTMC}$ offers several advantages over existing measures of evolutionary isolation reviewed in [5].
First, and most fundamentally, it is not a heuristic: it emerges directly from the spectral theory of the ultrametric Laplacian operator $L_T$ and admits a precise probabilistic interpretation in terms of the dynamics of a continuous-time random walk on the leaves of the phylogenetic tree. This grounds the index in a mathematical framework rather than an ad hoc scheme. Second, the kernel function $k$ provides an interpretable resolution tool with which to tune the trade-off between uniqueness (sensitivity to recent, tip-near divergences) and originality (sensitivity to ancient, root-near divergences), identified by [5] as the key dimension separating existing metrics; each kernel choice is mathematically justified within the CTMC framework rather than chosen arbitrarily. Third, because $C_{CTMC}$ is defined through the eigenvalues of a global operator, it incorporates information from the entire tree topology rather than only the path from a species to the root. Finally, the formulation extends naturally to non-ultrametric trees and phylogenetic networks by redefining the underlying operator, addressing a limitation explicitly noted by [5] for most existing metrics.

5 Conclusions and outlook.

We have developed a unified spectral framework for finite ultrametric phylogenetic trees, grounding the analysis of phylogenetic structure in operator theory and stochastic dynamics. The results presented here (spectral reconstruction, eigenmode trait decomposition, and CTMC centrality) are exact, computationally efficient, and biologically interpretable, and they are supported by numerical experiments on empirical primate data.

Several directions remain open.
The eigenbasis introduced here provides a natural interface with Geometric Deep Learning: ultrametric Laplacians admit explicit diagonalization, making their spectral parameters directly interpretable within graph neural network architectures, and we conjecture that this could prove fundamental for the development of phylogenetic comparative methods in that framework. A second direction concerns the stochastic independence of the contrasts: while the eigenmode decomposition is orthogonal by construction, it remains to identify a stochastic process under which the coefficients $c_P$ are statistically independent, in the spirit of Felsenstein's independent contrasts. Finally, the systematic study of kernel selection (which taxonomic scale to emphasize, and how to recover a target spectral structure) and a deeper comparison of $C_{CTMC}$ with modern conservation indices beyond ED represent natural next steps.

References

[1] Rammal, R., Toulouse, G., Virasoro, M.A.: Ultrametricity for physicists. Reviews of Modern Physics 58(3), 765–788 (1986)

[2] Steel, M.: Phylogeny. Society for Industrial and Applied Mathematics, Philadelphia, PA (2016). https://doi.org/10.1137/1.9781611974485 . https://epubs.siam.org/doi/abs/10.1137/1.9781611974485

[3] Darwin, C.: On the Origin of Species by Means of Natural Selection. John Murray, London (1859)

[4] Lewitus, E., Morlon, H.: Characterizing and comparing phylogenies from their laplacian spectrum. Systematic Biology 65(3), 495–507 (2016)

[5] Redding, D.W., Mazel, F., Mooers, A.Ø.: Measuring evolutionary isolation for conservation. PLOS ONE 9(12), 113490 (2014) https://doi.org/10.1371/journal.pone.0113490

[6] Isaac, N.J.B., Turvey, S.T., Collen, B., Waterman, C., Baillie, J.E.M.: Mammals on the EDGE: Conservation priorities based on threat and phylogeny.
PLoS ONE 2(3), 296 (2007) https://doi.org/10.1371/journal.pone.0000296

[7] Bérard, P.H.: Spectral Geometry: Direct and Inverse Problems. Lecture Notes in Mathematics, vol. 1207. Springer, Berlin, Heidelberg (1986). https://doi.org/10.1007/BFb0076330

[8] Kac, M.: Can one hear the shape of a drum? American Mathematical Monthly 73(4, part 2), 1–23 (1966)

[9] Gordon, C., Webb, D.L., Wolpert, S.: One cannot hear the shape of a drum. Bull. Amer. Math. Soc. 27, 134–138 (1992)

[10] Bradley, P.E., Ledezma, A.M.: Hearing shapes via p-adic laplacians. Journal of Mathematical Physics 64(11), 113502 (2023) https://doi.org/10.1063/5.0152374

[11] Chung, F.R.K.: Spectral Graph Theory. CBMS Regional Conference Series in Mathematics, number 92. AMS, Providence, RI (1997)

[12] Bronstein, M.M., Bruna, J., Cohen, T., Veličković, P.: Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges (2021)

[13] Bendikov, A., Cygan, W., Woess, W.: Oscillating heat kernels on ultrametric spaces. Journal of Spectral Theory 9(1), 195–226 (2019) https://doi.org/10.4171/jst/245

[14] Bendikov, A.: Heat kernels for isotropic-like markov generators on ultrametric spaces: A survey. p-Adic Numbers, Ultrametric Analysis and Applications 10(1), 1–11 (2018) https://doi.org/10.1134/S2070046618010025

[15] Kozyrev, S.V.: Ultrametric pseudodifferential operators and wavelets for the case of non homogeneous measure. arXiv preprint (2005). arXiv:math-ph/0412082v3

[16] Zúñiga-Galindo, W.A.: Ultrametric diffusion, rugged energy landscapes and transition networks. Physica A: Statistical Mechanics and its Applications 597, 127221 (2022)

[17] Morán Ledezma, A.: Time-varying energy landscapes and temperature paths: dynamic transition rates in locally ultrametric complex systems. Journal of Statistical Mechanics: Theory and Experiment 2025, 113501 (2025) https://doi.org/10.1088/1742-5468/ae120f .
Open Access

[18] Avetisov, V.A., Bikulov, A.K., Zubarev, A.P.: Ultrametric random walk and dynamics of protein molecules. Proc. Steklov Inst. Math. 285, 3–25 (2014)

[19] Dragovich, B., Khrennikov, A.Y., Kozyrev, S.V., et al.: p-adic mathematical physics: the first 30 years. P-Adic Num Ultrametr Anal Appl 9, 87–121 (2017)

[20] Khrennikov, A.: Ultrametric diffusion equation on energy landscape to model disease spread in hierarchic socially clustered population. Physica A: Statistical Mechanics and its Applications 583, 126284 (2021)

[21] Dragovich, B., Khrennikov, A.Y., Kozyrev, S.V., Misic, N.Z.: p-adic mathematics and theoretical biology. Biosystems 199, 104288 (2021)

[22] Fuquen-Tibatá, A., Cortés-Poza, Y., Pérez-Buendía, J.R.: A p-adic reaction-diffusion model of branching coral growth and calcification dynamics. Journal of Mathematical Biology 92(27) (2026) https://doi.org/10.1007/s00285-025-02340-8

[23] Kumar, S., Suleski, M., Craig, J.M., Kasprowicz, A.E., Sanderford, M., Li, M., Stecher, G., Hedges, S.B.: TimeTree 5: An Expanded Resource for Species Divergence Times. Molecular Biology and Evolution 39(8), 174 (2022) https://doi.org/10.1093/molbev/msac174

[24] Felsenstein, J.: Phylogenies and the comparative method. The American Naturalist 125(1), 1–15 (1985) https://doi.org/10.1086/284325

[25] Ollier, S., Couteron, P., Chessel, D.: Orthonormal transform to decompose the variance of a life-history trait across a phylogenetic tree. Biometrics 62(2), 471–477 (2006) https://doi.org/10.1111/j.1541-0420.2005.00497.x

[26] Gorman, E., Lladser, M.E.: Sparsification of large ultrametric matrices: insights into the microbial tree of life.
Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 479(2278), 20220847 (2023) https://doi.org/10.1098/rspa.2022.0847

[27] Jones, K.E., Bielby, J., Cardillo, M., Fritz, S.A., O'Dell, J., Orme, C.D.L., Safi, K., Sechrest, W., Boakes, E.H., Carbone, C., Connolly, C., Cutts, M.J., Foster, J.K., Grenyer, R., Habib, M., Plaster, C.A., Price, S.A., Rigby, E.A., Rist, J., Teacher, A., Bininda-Emonds, O.R.P., Gittleman, J.L., Mace, G.M., Purvis, A.: PanTHERIA: a species-level database of life history, ecology, and geography of extant and recently extinct mammals. Ecology 90(9), 2648–2648 (2009) https://doi.org/10.1890/08-1494.1 https://esajournals.onlinelibrary.wiley.com/doi/pdf/10.1890/08-1494.1

[28] Noh, J.D., Rieger, H.: Random walks on complex networks. Phys. Rev. Lett. 92, 118701 (2004) https://doi.org/10.1103/PhysRevLett.92.118701

[29] Angulo, J.: Hierarchical Laplacian and its spectrum in ultrametric image processing. In: al., B.B. (ed.) ISMM 2019. LNCS 11564, pp. 29–40. Springer, Switzerland AG (2019)

[30] Luxburg, U.: A tutorial on spectral clustering. Statistics and Computing 17(4), 395–416 (2007)

[31] Peliti, L., Pigolotti, S.: Stochastic Thermodynamics: An Introduction, p. 272. Princeton University Press (2021)

[32] Kampen, N.G.: Stochastic Processes in Physics and Chemistry, 3rd edn. North-Holland Personal Library. North-Holland, Amsterdam (2007). https://doi.org/10.1016/B978-0-444-52965-7.X5000-4
