Geometric framework for biological evolution

Vitaly Vanchurin^{1,2}

^1 Artificial Neural Computing, Weston, Florida, 33332, USA
^2 Duluth Institute for Advanced Study, Duluth, Minnesota, 55804, USA

E-mail: vitaly.vanchurin@gmail.com

Abstract. We develop a generally covariant description of evolutionary dynamics that operates consistently in both genotype and phenotype spaces. We show that the maximum entropy principle yields a fundamental identification between the inverse metric tensor and the covariance matrix, revealing the Lande equation as a covariant gradient ascent equation. This demonstrates that evolution can be modeled as a learning process on the fitness landscape, with the specific learning algorithm determined by the functional relation between the metric tensor and the noise covariance arising from microscopic dynamics. While the metric (or the inverse genotypic covariance matrix) has been extensively characterized empirically, the noise covariance and its associated observable (the covariance of evolutionary changes) have never been directly measured. This poses the experimental challenge of determining the functional form relating metric to noise covariance.

Contents

1 Introduction
2 Genotype and phenotype
3 Geometric structures
4 Statistical structures
5 Lande equation
6 Learning dynamics
7 Genotype statistics
8 Discussion
A Higher-order corrections
B Fitness Hessian

1 Introduction

Theoretical modeling of evolutionary dynamics has a rich history spanning more than a century, from the foundational work in population genetics [1-3] to modern quantitative frameworks that describe how populations change under selection [4-7]. Traditional approaches include population genetics models that track allele frequencies statistically [1-3] and quantitative genetics models that describe phenotypic response to selection through dynamical equations [8-10]. Despite their successes, these frameworks do not provide a coordinate-independent description of evolutionary dynamics, limiting their ability to capture the geometric and statistical structures underlying evolutionary processes. Even approaches that explicitly invoke geometric metaphors, such as fitness landscape theory [11, 12], lack the differential geometric structure needed for a generally covariant formulation.

A recent paradigm shift has emerged from recognizing deep connections between evolutionary dynamics and learning theory [13]. This framework builds on earlier work proposing that the universe at its most fundamental level can be understood as a neural network [14, 15]. Under the evolution-as-learning framework, natural selection and replication arise naturally from the existence of a loss function (the negative of fitness) that is minimized during learning, and in sufficiently complex systems the same learning phenomena occur on multiple scales. Subsequent work [16] extended these ideas to the thermodynamics of evolution and the origin of life, revealing deep connections between learning theory, statistical mechanics, and evolutionary dynamics. The emergence of scale invariance in learning systems [17, 18] further supports the claim that learning dynamics may be responsible for the multi-level structures observed across physical and biological systems.
More recently, a covariant description of learning in machine learning systems was developed [19] and then generalized [20] as a unified geometric framework for describing learning in physical, biological, and machine learning systems. This work revealed three fundamental regimes characterized by the power-law relationship g ∝ κ^α between the metric tensor g and the noise covariance matrix κ: the quantum regime (α = 1), the efficient learning regime (α = 1/2), and the equilibration regime (α = 0). The emergence of the intermediate regime α = 1/2 was conjectured to be a key mechanism underlying biological complexity, and possibly the origin of life, suggesting that evolution may have discovered optimization algorithms more sophisticated than simple gradient descent. The present paper develops these ideas further by showing that the metric tensor on genotype space is naturally identified with the inverse genotypic covariance matrix through the maximum entropy principle [21, 22], transforming the Lande equation into a covariant gradient ascent equation of evolutionary dynamics (or a covariant gradient descent equation of learning dynamics). This geometric perspective reveals that the specific learning algorithm implemented by biological evolution is determined by the functional relation g(κ) between the metric and the noise covariance. While the metric has been extensively characterized empirically, the noise covariance remains unmeasured, posing an open challenge for evolutionary biology.

The paper is organized as follows. Section 2 introduces the genotype and phenotype spaces and the maps between them. Section 3 develops the geometric structures, including the pullback metric from phenotype to genotype space. Section 4 applies the maximum entropy principle to establish the identification between the inverse metric tensor and the genotypic covariance. Section 5 derives the Lande equation from the Price equation. Section 6 connects evolution to learning algorithms and introduces possible functional relations g(κ). Section 7 discusses empirical observations of genotypic covariance spectra and the challenge of measuring the noise covariance. The appendices provide detailed derivations of the higher-order corrections and of the relation between the fitness Hessian and the noise covariance.

2 Genotype and phenotype

Consider a population of organisms, each characterized by its genetic sequence:

$$ s = (s_1, s_2, \ldots, s_K), \qquad (2.1) $$

where s_α ∈ A represents the allelic state at locus α, and A is a finite set of possible alleles with |A| = D. For nucleotide sequences, A = {A, C, G, T} and D = 4. To enable the use of geometric and statistical structures, we define an embedding map:

$$ \hat q : \mathcal{A} \to \mathbb{R}^d, \qquad (2.2) $$

where d < D is the embedding dimension. The dimension d can be chosen based on biological considerations, such as the number of independent chemical properties of nucleotides, or through dimensionality reduction techniques applied to genomic data. The embedding map (2.2) allows us to assign to each discrete genotype s a continuous coordinate:

$$ q(s) = \left( q^{\alpha 1}(s), q^{\alpha 2}(s), \ldots, q^{\alpha d}(s) \right) = \left( \hat q^{1}(s_\alpha), \hat q^{2}(s_\alpha), \ldots, \hat q^{d}(s_\alpha) \right), \qquad (2.3) $$

where for each locus α the embedding coordinates are given by the map q̂ applied to the allelic state s_α. The embedding is applied independently to each locus, which is why we use the same notation for all loci and coordinates.
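To make the embedding concrete, here is a minimal numpy sketch of Eqs. (2.2)-(2.3). The particular d = 2 assignment below (one axis for purine/pyrimidine, one for strong/weak base pairing) is an illustrative assumption, not a choice prescribed by the text:

```python
import numpy as np

# Hypothetical d = 2 embedding of the alphabet A = {A, C, G, T}:
# one axis separates purines (A, G) from pyrimidines (C, T), the other
# separates strong (C, G) from weak (A, T) base pairing.
Q_HAT = {
    "A": np.array([+1.0, -1.0]),
    "G": np.array([+1.0, +1.0]),
    "C": np.array([-1.0, +1.0]),
    "T": np.array([-1.0, -1.0]),
}

def embed(sequence: str) -> np.ndarray:
    """Map a discrete genotype s in A^K to continuous coordinates q(s)
    in R^{dK}, applying the map q-hat independently at each locus (2.3)."""
    return np.concatenate([Q_HAT[allele] for allele in sequence])

q = embed("GATTACA")   # K = 7 loci, d = 2, so q(s) lives in R^14
print(q.shape)         # (14,)
```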
Each discrete genotype gives rise to observable phenotypic traits through a phenotype map:

$$ \hat x : \mathcal{A}^K \to \mathbb{R}^N, \qquad (2.4) $$

defined on the discrete space:

$$ \hat x(s) = \left( \hat x^1(s), \hat x^2(s), \ldots, \hat x^N(s) \right), \qquad (2.5) $$

where N is the number of phenotypic traits considered. This map represents the biological processes through which genetic information produces observable characteristics; it also depends on the environment, which will be modeled later in the paper.

To leverage the continuous structure introduced by the embedding (2.2) and phenotype (2.4) maps, we assume that x̂ can be interpolated to define a smooth map from a continuous genotype space to a continuous phenotype space,

$$ x : \mathbb{R}^{dK} \to \mathbb{R}^N, \qquad (2.6) $$

such that for all discrete genotypes s ∈ A^K,

$$ x(\hat q(s)) = \hat x(s). \qquad (2.7) $$

This interpolation is not unique and must be chosen based on biological considerations, but its existence allows us to extend the phenotype map to continuous genotype values and compute derivatives such as ∂x^i/∂q^{αr}, which capture how continuous phenotype coordinates respond to infinitesimal changes in the continuous genotype coordinates.

Throughout this paper, we use the Einstein summation convention over repeated indices, with Latin indices i, j, ... running over phenotypic traits 1, ..., N, Greek indices α, β, ... running over loci 1, ..., K, and Latin indices r, s, ... running over embedding coordinates 1, ..., d.

3 Geometric structures

The phenotype space R^N is equipped with a metric G_{ij}(x) that quantifies distances between phenotypes. The squared distance between two infinitesimally close phenotypes x and x + dx is:

$$ ds^2 = G_{ij}(x)\, dx^i dx^j. \qquad (3.1) $$

The inverse of this metric tensor is denoted with upper indices G^{ij}(x), satisfying G^{ik} G_{kj} = \delta^i_j. A key feature of the geometric formulation is that the metric tensor G_{ij}(x) is generally position-dependent, meaning that the geometry of phenotype space can vary with the phenotype itself. This allows the framework to capture phenomena where the relationships between traits change across different regions of phenotype space.

The interpolated phenotype map x : R^{dK} → R^N allows us to pull back the metric from phenotype space to genotype space. For a genotypic displacement dq^{αr}, the corresponding phenotypic displacement is

$$ dx^i = \frac{\partial x^i}{\partial q^{\alpha r}}\, dq^{\alpha r}. \qquad (3.2) $$

The squared distance in genotype space, as measured by the induced metric, becomes:

$$ ds^2 = G_{ij}(x(q))\, dx^i dx^j = \frac{\partial x^i}{\partial q^{\alpha r}}\, G_{ij}(x(q))\, \frac{\partial x^j}{\partial q^{\beta s}}\, dq^{\alpha r} dq^{\beta s} = g_{\alpha r,\beta s}(q)\, dq^{\alpha r} dq^{\beta s}, \qquad (3.3) $$

where the pullback metric tensor on genotype space is:

$$ g_{\alpha r,\beta s}(q) = \frac{\partial x^i}{\partial q^{\alpha r}}\, G_{ij}(x(q))\, \frac{\partial x^j}{\partial q^{\beta s}}. \qquad (3.4) $$

This metric g is the fundamental geometric structure on genotype space: it defines distances between genotypes not in terms of their genetic composition per se, but in terms of how different their phenotypes are, as measured by the phenotype metric G_{ij}. Two genotypes are considered close if they produce similar phenotypes, regardless of their molecular differences. This captures the biological intuition that what matters for evolution is phenotypic variation, not genetic variation for its own sake.
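Computationally, the pullback (3.4) is a Jacobian sandwich, g = J^T G J. The following sketch uses a toy phenotype map and a finite-difference Jacobian in place of a biologically motivated interpolation; all specific functions are illustrative assumptions:

```python
import numpy as np

def jacobian(x_map, q, eps=1e-6):
    """Finite-difference Jacobian J[i, k] = dx^i/dq^k of the interpolated
    phenotype map x : R^{dK} -> R^N, evaluated at genotype coordinates q."""
    x0 = x_map(q)
    J = np.zeros((x0.size, q.size))
    for k in range(q.size):
        dq = np.zeros_like(q)
        dq[k] = eps
        J[:, k] = (x_map(q + dq) - x0) / eps
    return J

def pullback_metric(x_map, G_of_x, q):
    """Pullback metric g(q) = J^T G(x(q)) J of Eq. (3.4)."""
    J = jacobian(x_map, q)
    return J.T @ G_of_x(x_map(q)) @ J

# Toy example: N = 2 traits from dK = 3 genotype coordinates (both assumed).
x_map = lambda q: np.array([q[0] + 0.5 * q[1] ** 2, np.tanh(q[2])])
G_of_x = lambda x: np.eye(2)   # flat phenotype metric, for simplicity
g = pullback_metric(x_map, G_of_x, np.array([0.1, 0.2, 0.3]))
print(g.shape)                 # (3, 3)
```

Since the Jacobian has rank at most N, the pullback metric is degenerate whenever dK > N: genotypic directions with no phenotypic effect have zero length, which is exactly the phenotype-based notion of distance described above.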
Importantly, the pullback metric g_{αr,βs}(q) inherits the position-dependence of the phenotype metric G_{ij}(x(q)), making it a function of the genotype q. This means that the geometry of genotype space is generally curved and varies across the space, reflecting how the same genetic change can have different phenotypic consequences depending on the genetic background. This geometric perspective complements classical quantitative genetics approaches to studying the G-matrix [9, 23], which describes the patterns of genetic covariance among traits. The inverse of this pullback metric is denoted with upper indices g^{αr,βs}(q), satisfying g^{αr,γt} g_{γt,βs} = δ^{αr}_{βs}.

4 Statistical structures

The population can be described by an ensemble of genotypes, represented by a probability distribution over the finite set of discrete genotypes. Let ρ(s) denote the probability of observing genotype s ∈ A^K, satisfying Σ_{s ∈ A^K} ρ(s) = 1. Each genotype s corresponds to a point q̂(s) in the continuous genotype space via the embedding map q̂ : A^K → R^{dK}, with coordinates q^{αr}(s). Each genotype also maps to a phenotype x(q̂(s)) in phenotype space.

To apply the maximum entropy principle [21, 22], we specify a local reference frame in genotype space by choosing a locally flat reference metric. At any point of interest, such as the mean genotype q̄, we can construct Riemannian normal coordinates q̃^{αr} in which the reference metric becomes Euclidean and the volume element locally simplifies to the ordinary Lebesgue measure:

$$ dV = d^{dK}\tilde q. \qquad (4.1) $$

The entropy of a genotypic distribution that is concentrated near the mean,

$$ S[\tilde\rho] = -\int \tilde\rho(\tilde q)\, \ln \tilde\rho(\tilde q)\, d^{dK}\tilde q, \qquad (4.2) $$

is to be maximized subject to constraints on the mean genotype,

$$ \int \tilde q^{\alpha r}\, \tilde\rho(\tilde q)\, d^{dK}\tilde q = \bar q^{\alpha r}, \qquad (4.3) $$

and on the expected squared distance:

$$ \int \delta_{\alpha r,\beta s}\, (\tilde q^{\alpha r} - \bar q^{\alpha r})(\tilde q^{\beta s} - \bar q^{\beta s})\, \tilde\rho(\tilde q)\, d^{dK}\tilde q = \sigma^2. \qquad (4.4) $$

This isotropic constraint, combined with the maximum entropy principle, yields a distribution that is spherically symmetric in the local coordinates:

$$ \tilde\rho(\tilde q) = \frac{1}{Z} \exp\left( -\frac{\lambda}{2}\, \delta_{\alpha r,\beta s}\, (\tilde q^{\alpha r} - \bar q^{\alpha r})(\tilde q^{\beta s} - \bar q^{\beta s}) \right), \qquad (4.5) $$

where λ is a Lagrange multiplier determined by σ². The covariance matrix of the maximum entropy distribution in the local reference frame is proportional to the identity:

$$ c^{\alpha r,\beta s} = \int (\tilde q^{\alpha r} - \bar q^{\alpha r})(\tilde q^{\beta s} - \bar q^{\beta s})\, \tilde\rho(\tilde q)\, d^{dK}\tilde q = \frac{\sigma^2}{dK}\, \delta^{\alpha r,\beta s}. \qquad (4.6) $$

By appropriately rescaling the coordinates, the covariance matrix can be set to the identity. In any other coordinate system it is given by the inverse metric tensor, leading to the fundamental identification:

$$ g^{\alpha r,\beta s} = c^{\alpha r,\beta s}. \qquad (4.7) $$

The mean phenotype x̄ is then given by the phenotype map applied to the mean genotype:

$$ \bar x = x(\bar q). \qquad (4.8) $$

Assuming the phenotype map is sufficiently smooth, i.e. locally linear,

$$ x^i(q) - \bar x^i = \frac{\partial x^i}{\partial q^{\alpha r}}\, (q^{\alpha r} - \bar q^{\alpha r}), \qquad (4.9) $$

we obtain a relationship between the phenotypic covariance C^{ij} and the genotypic covariance c^{αr,βs}:

$$ C^{ij} = \frac{\partial x^i}{\partial q^{\alpha r}} \frac{\partial x^j}{\partial q^{\beta s}}\, c^{\alpha r,\beta s}, \qquad (4.10) $$

where the derivatives are evaluated at the mean genotype q̄. At the same time, the pushforward of the inverse metric from genotype space to phenotype space is given by:

$$ G^{ij} = \frac{\partial x^i}{\partial q^{\alpha r}} \frac{\partial x^j}{\partial q^{\beta s}}\, g^{\alpha r,\beta s}. \qquad (4.11) $$
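A quick Monte Carlo check of the pushforward relation (4.10) under the locally linear assumption (4.9); the Jacobian and covariance below are arbitrary toy values:

```python
import numpy as np

rng = np.random.default_rng(0)

dK, N = 4, 2
A = rng.normal(size=(N, dK))           # Jacobian dx^i/dq of an assumed linear map
c = np.diag([1.0, 0.5, 0.25, 0.125])   # toy genotypic covariance c

# Sample genotypes from a Gaussian (maximum-entropy) ensemble, as in (4.5)
q = rng.multivariate_normal(np.zeros(dK), c, size=200_000)
x = q @ A.T                            # locally linear phenotype map (4.9)

C_empirical = np.cov(x, rowvar=False)  # phenotypic covariance C
C_predicted = A @ c @ A.T              # pushforward of Eq. (4.10)
print(np.allclose(C_empirical, C_predicted, atol=0.02))  # True
```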
Combining Eqs. (4.7), (4.10) and (4.11), we obtain a relation between the inverse metric and the covariance matrix in phenotype space:

$$ G^{ij} = C^{ij}. \qquad (4.12) $$

5 Lande equation

Consider a population with distribution ρ(x) in phenotype space that undergoes evolution from one generation with state x to the next generation with state x′(x), described by Wrightian fitness W(x). The expected change in the average state can be expressed as

$$ \frac{\langle W(x)\, x'(x) \rangle}{\langle W(x) \rangle} - \langle x \rangle = \frac{\langle W(x)\, x \rangle - \langle W(x) \rangle \langle x \rangle}{\langle W(x) \rangle} + \frac{\langle W(x)\, (x'(x) - x) \rangle}{\langle W(x) \rangle}, \qquad (5.1) $$

where expectations are defined with respect to the invariant volume element:

$$ \langle \cdots \rangle = \int \cdots\, \rho(x)\, \sqrt{\det G}\, d^N x. \qquad (5.2) $$

Equation (5.1) is a continuous form of the Price equation [24, 25], usually written as

$$ \Delta \langle x \rangle = \frac{\mathrm{Cov}(W(x), x)}{\langle W(x) \rangle} + \frac{\mathrm{E}(W(x)\, \Delta x)}{\langle W(x) \rangle}, \qquad (5.3) $$

where Δ⟨x⟩ ≡ ⟨W(x) x′(x)⟩/⟨W(x)⟩ − ⟨x⟩ is the change in the mean phenotype from one generation to the next, and Δx ≡ x′(x) − x is the individual-level phenotypic change between parent and offspring. Assuming perfect transmission, meaning that offspring resemble their parents on average (E(W(x) Δx) = 0), the Price equation simplifies to the selection response:

$$ \Delta \langle x \rangle = \frac{\mathrm{Cov}(W(x), x)}{\langle W(x) \rangle}. \qquad (5.4) $$

The Wrightian fitness W(x) can be expanded around x̄:

$$ W(x) = W(\bar x) + \frac{\partial W}{\partial \bar x^i}\, (x^i - \bar x^i) + O(|x - \bar x|^2), \qquad (5.5) $$

where derivatives are evaluated at the mean phenotype x̄. The mean fitness is then:

$$ \langle W \rangle = W(\bar x) + \frac{\partial W}{\partial \bar x^i}\, \langle x^i - \bar x^i \rangle + O(|x - \bar x|^2) = W(\bar x) + O(|x - \bar x|^2). \qquad (5.6) $$

Substituting the linear approximations (5.5) and (5.6) into the selection response (5.4), we get:

$$ \Delta \langle x^i \rangle = \frac{\mathrm{Cov}(W(x), x^i)}{\langle W \rangle} \approx \frac{1}{W(\bar x)}\, \mathrm{Cov}\!\left( W(\bar x) + \frac{\partial W}{\partial \bar x^j}\, (x^j - \bar x^j),\; x^i \right) = \frac{1}{W(\bar x)} \frac{\partial W}{\partial \bar x^j}\, \mathrm{Cov}(x^j, x^i) = \frac{\partial \log W}{\partial \bar x^j}\, C^{ij}, \qquad (5.7) $$

where the phenotypic covariance matrix is

$$ C^{ij} = \mathrm{Cov}(x^i, x^j) = \langle (x^i - \bar x^i)(x^j - \bar x^j) \rangle. \qquad (5.8) $$

Equation (5.7) is the discrete form of the Lande equation [8, 9], usually written in continuous time as

$$ \frac{d\bar x^i}{dt} = C^{ij}(\bar x)\, \frac{\partial F(\bar x)}{\partial \bar x^j}, \qquad (5.9) $$

where F(x̄) = log W(x̄) is the Malthusian fitness evaluated at the mean phenotype. (See also Ref. [20] for an alternative derivation of the Lande equation.) The resulting equation describes the change in mean phenotype as a product of the genetic covariance matrix and the selection gradient, providing a foundation for understanding evolutionary dynamics in quantitative genetics. Higher-order nonlinear corrections to the Lande equation are derived in Appendix A.

6 Learning dynamics

In the previous sections, we established two types of structures: the geometric structures G and g, and the statistical structures c and C. By applying the maximum entropy principle, the phenotypic covariance matrix C was identified with the inverse metric tensor G⁻¹ (4.12), and thus the Lande equation (5.9) can be viewed as learning dynamics described by covariant gradient ascent [19]:

$$ \frac{d\bar x^i}{dt} = G^{ij}(\bar x)\, \frac{\partial F(\bar x)}{\partial \bar x^j}, \qquad (6.1) $$

with the loss function taken to be the negative of the Malthusian fitness.
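The identification can be illustrated by simulating a single generation of selection and comparing the Price-equation response (5.4) with the Lande prediction (5.7). The Gaussian population, the fitness peak x_opt, and the selection strength s below are illustrative assumptions; agreement holds to leading order in s:

```python
import numpy as np

rng = np.random.default_rng(1)

C = np.array([[1.0, 0.3], [0.3, 0.5]])   # phenotypic covariance C^{ij}
x_bar = np.zeros(2)                      # mean phenotype
x_opt = np.array([2.0, -1.0])            # assumed fitness peak
s = 0.05                                 # weak selection strength

def W(x):
    """Wrightian fitness: a Gaussian peak (illustrative assumption)."""
    return np.exp(-0.5 * s * np.sum((x - x_opt) ** 2, axis=-1))

x = rng.multivariate_normal(x_bar, C, size=1_000_000)
w = W(x)

# Selection response from the Price equation (5.4): Cov(W, x) / <W>
response = (w[:, None] * x).mean(axis=0) / w.mean() - x.mean(axis=0)

# Lande prediction (5.7): C^{ij} d(log W)/dx^j, evaluated at the mean
grad_log_W = s * (x_opt - x_bar)
print(response)          # ~ C @ grad_log_W, up to O(s^2) corrections
print(C @ grad_log_W)
```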
The dynamics can be pulled back to genotype space:

$$ \frac{d\bar q^{\alpha r}}{dt} = g^{\alpha r,\beta s}(\bar q)\, \frac{\partial F(x(\bar q))}{\partial \bar q^{\beta s}} = g^{\alpha r,\beta s}(\bar q)\, \frac{\partial F(x)}{\partial x^i} \frac{\partial x^i}{\partial \bar q^{\beta s}}, \qquad (6.2) $$

where the inverse genotype metric g⁻¹ is identified with the genotype covariance matrix c (4.7).

Note that fitness depends on the genotype coordinates q only implicitly, through the phenotype map x(q), i.e., F(x̄(q̄)) = F(x̄). This establishes a direct analogy with machine learning systems: the genotype q corresponds to trainable variables (weights and biases), while the phenotype x corresponds to non-trainable variables (neuron states). The fitness function depends on the phenotypes, which are determined by the genotypes through the phenotype map. Evolution thus optimizes the fitness function by adjusting genotypic variables, just as learning algorithms optimize a loss function by updating weights and biases.

The microscopic counterpart of (6.2) can be modeled as [20]:

$$ \frac{dq^{\alpha r}}{dt} = g^{\alpha r,\beta s}(q)\, \frac{\partial H(x(q), \hat x)}{\partial q^{\beta s}}, \qquad (6.3) $$

where x̂ represents environmental degrees of freedom. The microscopic fitness is

$$ H(x(q), \hat x) \approx F(x(q)) + \phi(q, t), \qquad (6.4) $$

with the stochastic component satisfying

$$ \langle \phi(q, t) \rangle_\tau = 0, \qquad (6.5) $$

$$ \langle \phi(q, t)\, \phi(q', t') \rangle_\tau = C(q, q')\, \delta(t - t'), \qquad (6.6) $$

and the noise covariance given by

$$ \kappa_{\alpha r,\beta s}(q) = \left\langle \frac{\partial \phi}{\partial q^{\alpha r}} \frac{\partial \phi}{\partial q^{\beta s}} \right\rangle = \left. \frac{\partial}{\partial q^{\alpha r}} \frac{\partial}{\partial q'^{\beta s}}\, C(q, q') \right|_{q' = q}. \qquad (6.7) $$

A potential relation between the metric tensor, which is related to the genotypic covariance (see Section 4), and the noise covariance, which is related to the fitness Hessian (see Appendix B), can be motivated by learning theory, where many efficient learning algorithms arise by selecting the metric tensor as a specific function of the noise covariance. A particularly simple class of such functions is the power-law dependence [19]:

$$ g(\kappa) = \kappa^a, \qquad (6.8) $$

where a = 0 corresponds to stochastic gradient descent (or ascent) and a = 1 corresponds to natural gradient descent (or ascent). More generally, efficient learning algorithms such as Adam [26] and AdaBelief [27] effectively implement metrics of the form

$$ g(\kappa) = \kappa^{1/2} + \epsilon I, \qquad (6.9) $$

where ε is some constant. These algorithms belong to the broader class of covariant gradient descent methods, where the metric tensor adapts based on statistical information extracted from the gradient noise.

7 Genotype statistics

To develop a phenomenological model of the geometry of genotype space, we recall that the inverse metric tensor g⁻¹ is given by the genotypic covariance of the population of organisms (4.7):

$$ g^{\alpha r,\beta s} = \langle q^{\alpha r} q^{\beta s} \rangle - \langle q^{\alpha r} \rangle \langle q^{\beta s} \rangle. \qquad (7.1) $$

Empirical studies of genotypic covariance matrices have revealed that the eigenvalue spectrum exhibits a rapid decay, with the first few eigenvalues capturing a large fraction of the total genetic variance [28]. The decay of the eigenvalues is often well approximated by a power law:

$$ \lambda_i \propto i^{-\alpha}, \qquad (7.2) $$

with exponent α typically in the range 1.0 to 2.0. For such a power law, the effective rank of the covariance matrix can be expressed in terms of the Riemann zeta function:

$$ r_{\mathrm{eff}}(\alpha) = \frac{\left( \sum_i \lambda_i \right)^2}{\sum_i \lambda_i^2} \approx \frac{\left( \sum_i i^{-\alpha} \right)^2}{\sum_i i^{-2\alpha}} = \frac{\zeta(\alpha)^2}{\zeta(2\alpha)}, \qquad (7.3) $$

although the power-law dependence is usually modified at both large and small eigenvalues.
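The zeta-function formula (7.3) is straightforward to verify numerically against a truncated power-law spectrum (the truncation point below is an arbitrary illustrative choice):

```python
import numpy as np
from scipy.special import zeta

def effective_rank(eigenvalues):
    """Participation ratio r_eff = (sum l_i)^2 / sum l_i^2, Eq. (7.3)."""
    return eigenvalues.sum() ** 2 / (eigenvalues ** 2).sum()

alpha = 1.5                        # spectral exponent, within the observed 1.0-2.0 range
i = np.arange(1, 1_000_001)        # truncated spectrum (illustrative cutoff)
lam = i ** (-alpha)                # power-law eigenvalues, Eq. (7.2)

print(effective_rank(lam))             # ~ 5.67
print(zeta(alpha) ** 2 / zeta(2 * alpha))  # zeta(1.5)^2 / zeta(3.0) ~ 5.68
```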
Empirical values of the effective rank are often on the order of ~10², which is much smaller than the total genomic sequence length, dim(g) ~ 10⁹. In practice, this means that to obtain the actual metric tensor (or the inverse of the genotype covariance matrix) we may need to use a pseudo-inverse to avoid the problems associated with inverting zero or near-zero eigenvalues.

While substantial empirical data exist for the genotypic covariance matrix g⁻¹, direct empirical estimates of the noise covariance matrix κ remain unavailable. Using the microscopic equation (6.3), the noise covariance (6.7) with raised indices can be expressed as the covariance of temporal changes:

$$ \kappa^{\alpha r,\beta s} = g^{\alpha r,\gamma t}\, \kappa_{\gamma t,\delta u}\, g^{\delta u,\beta s} = \left\langle \frac{dq^{\alpha r}}{dt} \frac{dq^{\beta s}}{dt} \right\rangle - \left\langle \frac{dq^{\alpha r}}{dt} \right\rangle \left\langle \frac{dq^{\beta s}}{dt} \right\rangle. \qquad (7.4) $$

Estimating this matrix would require time-series data tracking evolutionary trajectories across many generations, combined with the ability to separate deterministic selection from stochastic fluctuations, a formidable empirical challenge that has yet to be met.

Given the observational data for the genotype covariance g⁻¹ (7.1) and the raised-index noise covariance g⁻¹κg⁻¹ (7.4), the metric tensor g and the noise covariance κ should be related through some function g(κ) that describes the learning algorithm. For example, if the functional dependence is a power law (6.8), then

$$ g^{-1} = \left( g^{-1} \kappa g^{-1} \right)^{\frac{a}{2a-1}}. \qquad (7.5) $$

For stochastic gradient (a = 0) and natural gradient (a = 1), we obtain g⁻¹ = I and g⁻¹ = g⁻¹κg⁻¹, respectively. For efficient learning algorithms such as those described by (6.9), we obtain

$$ g^{-1} = \epsilon^{-1} \left( I - \sqrt{g^{-1} \kappa g^{-1}} \right), \qquad (7.6) $$

which also shows the significance of the ε parameter. However, the precise functional form of g(κ), and therefore the specific learning algorithm implemented in biological evolution, remains an open question. This disconnect between theory and available data underscores the need for new experimental approaches aimed at directly characterizing not only the genotype covariance matrix g⁻¹, but also the covariance matrix of evolutionary changes of genotypes g⁻¹κg⁻¹.
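As a consistency check that could in principle be run on data, the following sketch builds the two observables (the genotype covariance g⁻¹ and the covariance of changes g⁻¹κg⁻¹) from a synthetic noise covariance and recovers g⁻¹ through the power-law relation (7.5). Note that (7.5) is singular at a = 1/2, which is one way to see the significance of the ε term in (6.9). All matrices are assumed toy inputs, and fractional matrix powers use the pseudo-inverse-style regularization advocated above:

```python
import numpy as np

rng = np.random.default_rng(2)

def matrix_power(M, p, rcond=1e-10):
    """Fractional power of a symmetric PSD matrix via eigendecomposition,
    zeroing near-null eigenvalues (the pseudo-inverse strategy noted above)."""
    w, V = np.linalg.eigh(M)
    w_p = np.zeros_like(w)
    keep = w > rcond * w.max()
    w_p[keep] = w[keep] ** p
    return (V * w_p) @ V.T

# Synthetic noise covariance kappa (shifted to be well-conditioned)
B = rng.normal(size=(6, 6))
kappa = B @ B.T + np.eye(6)
a = 2.0 / 3.0                      # any a != 1/2; (7.5) is singular at a = 1/2
g = matrix_power(kappa, a)         # power-law learning algorithm g = kappa^a

# The two observables of Section 7:
g_inv = matrix_power(g, -1.0)      # genotype covariance, Eq. (7.1)
change_cov = g_inv @ kappa @ g_inv # covariance of changes, Eq. (7.4)

# Power-law ansatz (7.5): recover g^{-1} from the covariance of changes
g_inv_pred = matrix_power(change_cov, a / (2 * a - 1))
print(np.allclose(g_inv, g_inv_pred))  # True
```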
8 Discussion

In this paper, we developed a geometric framework for biological evolution that unifies concepts from differential geometry, statistical mechanics, and learning theory. Three main results emerged from this analysis.

Firstly, we constructed a generally covariant formulation of evolutionary dynamics that operates consistently in both genotype and phenotype spaces. This is accomplished by embedding the discrete genotype space A^K into a continuous space R^{dK} via an embedding map q̂ : A → R^d. The phenotype map x̂ : A^K → R^N is then interpolated to a smooth function x : R^{dK} → R^N, allowing geometric structures to be pushed forward and pulled back between the two spaces. This framework provides a coordinate-invariant language for describing evolutionary processes, where all equations maintain their form under arbitrary reparameterizations of either space.

Secondly, we applied the maximum entropy principle to derive the fundamental identifications between the inverse metric and the covariance matrix in both genotype space, g^{αr,βs} = c^{αr,βs}, and phenotype space, G^{ij} = C^{ij}. These identifications are fully consistent with the Lande equation, which emerges as the leading-order contribution from an expansion of the Price equation. Higher-order corrections to the Lande equation, derived in Appendix A, introduce contributions from third cumulants (skewness) and reveal the conditions under which the linear approximation breaks down.

Thirdly, the identification of the metric with the covariance shows that the Lande equation is precisely a covariant gradient ascent equation, implying that biological evolution can be understood as a learning process on the (negative of) fitness landscape. The specific learning algorithm implemented by evolution is determined by the functional relation g(κ) between the metric tensor g and the noise covariance κ that appears in the microscopic dynamics. While the inverse metric (or the genotypic covariance matrix) g⁻¹ has been extensively characterized empirically, the noise covariance κ and its associated observable g⁻¹κg⁻¹ (the covariance of evolutionary changes of genotypes) have never been directly reconstructed from observations.

The functional form of g(κ) determines whether evolution implements simple algorithms such as stochastic gradient descent (g = I), or natural gradient descent (g = κ), or more sophisticated algorithms like those used in modern machine learning optimizers [19]. Without empirical access to κ, the true learning algorithm of evolution cannot be determined. This gap between theory and data defines a clear research program for evolutionary biology. Future experimental work should aim to characterize not only the static covariance of genotypes but also the dynamical covariance of genotypic changes. Such measurements would allow us to infer the functional relation g(κ) and thereby identify the specific learning algorithm that nature has discovered.

Acknowledgments. The author is grateful to Mikhail Katsnelson and Eugene Koonin for many stimulating discussions and comments on the manuscript.

References

[1] Ronald A. Fisher. The Genetical Theory of Natural Selection. Oxford University Press, 1930.
[2] Sewall Wright. Evolution in Mendelian populations. Genetics, 16(2):97–159, 1931.
[3] J. B. S. Haldane. A mathematical theory of natural and artificial selection. Part V: Selection and mutation. Mathematical Proceedings of the Cambridge Philosophical Society, 23(7):838–844, 1927.
[4] Daniel L. Hartl and Andrew G. Clark. Principles of Population Genetics. Sinauer Associates, Sunderland, MA, 4th edition, 2007.
[5] James F. Crow and Motoo Kimura. An Introduction to Population Genetics Theory. Harper & Row, New York, 1970.
[6] Michael Lynch. The Origins of Genome Architecture. Sinauer Associates, Sunderland, MA, 2007.
[7] Eugene V. Koonin. The Logic of Chance: The Nature and Origin of Biological Evolution. FT Press, Upper Saddle River, NJ, 2011.
[8] Russell Lande. Natural selection and random genetic drift in phenotypic evolution. Evolution, 30(2):314–334, 1976.
[9] Russell Lande. Quantitative genetic analysis of multivariate evolution, applied to brain:body size allometry. Evolution, 33(1):402–416, 1979.
[10] Stevan J. Arnold. Morphology, performance and fitness. American Zoologist, 23(2):347–361, 1983.
[11] Sewall Wright. The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of the Sixth International Congress of Genetics, 1:356–366, 1932.
[12] Sergey Gavrilets. Fitness Landscapes and the Origin of Species. Princeton University Press, 2004.
[13] Vitaly Vanchurin, Yuri I. Wolf, Mikhail I. Katsnelson, and Eugene V. Koonin. Toward a theory of evolution as multilevel learning. Proceedings of the National Academy of Sciences of the United States of America, 119(6):e2120037119, 2022.
[14] Vitaly Vanchurin. The world as a neural network. Entropy, 22(11):1210, 2020.
[15] Vitaly Vanchurin. Towards a theory of machine learning. Machine Learning: Science and Technology, 2(3):035012, 2021.
[16] Vitaly Vanchurin, Yuri I. Wolf, Eugene V. Koonin, and Mikhail I. Katsnelson. Thermodynamics of evolution and the origin of life. Proceedings of the National Academy of Sciences of the United States of America, 119(6):e2120042119, 2022.
[17] M. I. Katsnelson, V. Vanchurin, and T. Westerhout. Emergent scale invariance in neural networks. Physica A: Statistical Mechanics and its Applications, 610:128401, 2023.
[18] Ekaterina Kukleva and Vitaly Vanchurin. Dataset-learning duality and emergent criticality. Entropy, 27(9):989, 2025.
[19] Dmitry Guskov and Vitaly Vanchurin. Covariant gradient descent. arXiv preprint arXiv:2504.05279, 2025.
[20] Vitaly Vanchurin. Geometric learning dynamics. arXiv preprint arXiv:2504.14728, 2025.
[21] Edwin T. Jaynes. Information theory and statistical mechanics. Physical Review, 106(4):620–630, 1957.
[22] Edwin T. Jaynes. Information theory and statistical mechanics. II. Physical Review, 108(2):171–190, 1957.
[23] Stevan J. Arnold, Reinhard Bürger, Paul A. Hohenlohe, Beverly C. Ajie, and Adam G. Jones. Understanding the evolution and stability of the G-matrix. Evolution, 62(10):2451–2461, 2008.
[24] George R. Price. Selection and covariance. Nature, 227:520–521, 1970.
[25] George R. Price. Extension of covariance selection mathematics. Annals of Human Genetics, 35(4):485–490, 1972.
[26] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[27] Juntang Zhuang, Tommy Tang, Yifan Ding, Sekhar Tatikonda, Nicha Dvornek, Xenophon Papademetris, and James Duncan. AdaBelief optimizer: Adapting stepsizes by the belief in observed gradients. Advances in Neural Information Processing Systems, 33:18795–18806, 2020.
[28] Chongli Qin and Lucy J. Colwell. Power law tails in phylogenetic systems. Proceedings of the National Academy of Sciences, 115(4):690–695, 2018.

A Higher-order corrections

The Lande equation (5.9) is a first-order approximation derived by linearizing the fitness function around the mean phenotype. To understand its limitations and obtain more accurate dynamics, we extend the expansion to second order.

Expanding the fitness function to second order around the mean phenotype gives:

$$ W(x) = W(\bar x) + \frac{\partial W}{\partial \bar x^i}\, (x^i - \bar x^i) + \frac{1}{2} \frac{\partial^2 W}{\partial \bar x^i \partial \bar x^j}\, (x^i - \bar x^i)(x^j - \bar x^j) + O(|x - \bar x|^3), \qquad (A.1) $$

and then the mean fitness is:

$$ \langle W \rangle = W(\bar x) + \frac{1}{2} \frac{\partial^2 W}{\partial \bar x^i \partial \bar x^j}\, C^{ij} + O(|x - \bar x|^3). \qquad (A.2) $$

The covariance between fitness and phenotype becomes:

$$ \mathrm{Cov}(W(x), x^i) = \frac{\partial W}{\partial \bar x^j}\, C^{ji} + \frac{1}{2} \frac{\partial^2 W}{\partial \bar x^j \partial \bar x^k}\, S^{jki} + O(|x - \bar x|^4), \qquad (A.3) $$

where the third cumulant (skewness) is:

$$ S^{jki} = \langle (x^j - \bar x^j)(x^k - \bar x^k)(x^i - \bar x^i) \rangle. \qquad (A.4) $$

Substituting into (5.4) gives:

$$ \Delta \langle x^i \rangle = \frac{1}{W(\bar x)} \frac{\partial W}{\partial \bar x^j}\, C^{ji} + \frac{1}{2 W(\bar x)} \frac{\partial^2 W}{\partial \bar x^j \partial \bar x^k}\, S^{jki} + O(|x - \bar x|^4). \qquad (A.5) $$
Converting to Malthusian fitness F = log W using ∂W/∂x̄^i = W ∂F/∂x̄^i and keeping only the leading correction:

$$ \frac{d\bar x^i}{dt} = C^{ij}\, \frac{\partial F}{\partial \bar x^j} + \frac{1}{2}\, S^{ijk}\, \frac{\partial^2 F}{\partial \bar x^j \partial \bar x^k} + O(|x - \bar x|^4). \qquad (A.6) $$

For symmetric distributions (zero skewness), and in particular for Gaussian distributions, the third cumulant vanishes (S^{ijk} = 0), the correction term drops out, and the Lande equation (5.9) remains accurate to higher order.

B Fitness Hessian

The metric tensor, which is also the genotypic covariance (4.7),

$$ g^{\alpha r,\beta s} = \langle (q^{\alpha r} - \bar q^{\alpha r})(q^{\beta s} - \bar q^{\beta s}) \rangle, \qquad (B.1) $$

can be differentiated to obtain:

$$ \frac{dg^{\alpha r,\beta s}}{dt} = 2\, g^{\beta s,\gamma t}\, \frac{\partial^2 F}{\partial \bar q^{\gamma t} \partial \bar q^{\delta u}}\, g^{\alpha r,\delta u} + g^{\alpha r,\gamma t} \left\langle \frac{\partial \phi}{\partial q^{\gamma t}}\, (q^{\beta s} - \bar q^{\beta s}) \right\rangle + \left\langle (q^{\alpha r} - \bar q^{\alpha r})\, \frac{\partial \phi}{\partial q^{\gamma t}} \right\rangle g^{\beta s,\gamma t}, \qquad (B.2) $$

where we used (6.3) and kept only terms to second order in the expansion of F. Using the Furutsu-Novikov formula for Gaussian white noise, the correlation between the noise gradient and the deviation gives

$$ \left\langle \frac{\partial \phi}{\partial q^{\gamma t}}\, (q^{\beta s} - \bar q^{\beta s}) \right\rangle = \frac{1}{2}\, g^{\beta s,\delta u}\, \kappa_{\gamma t,\delta u}. \qquad (B.3) $$

Combining all contributions, we obtain the evolution equation for the inverse metric:

$$ \frac{dg^{\alpha r,\beta s}}{dt} = g^{\alpha r,\gamma t} \left( 2\, \frac{\partial^2 F}{\partial \bar q^{\gamma t} \partial \bar q^{\delta u}} + \kappa_{\gamma t,\delta u} \right) g^{\delta u,\beta s}, \qquad (B.4) $$

or, for the metric itself,

$$ \frac{dg_{\alpha r,\beta s}}{dt} = -2\, \frac{\partial^2 F}{\partial \bar q^{\alpha r} \partial \bar q^{\beta s}} - \kappa_{\alpha r,\beta s}. \qquad (B.5) $$

In the stationary limit where the metric (or the genotype covariance (B.1)) is static, we obtain the balance condition

$$ \frac{\partial^2 F}{\partial \bar q^{\gamma t} \partial \bar q^{\delta u}} = -\frac{1}{2}\, \kappa_{\gamma t,\delta u}. \qquad (B.6) $$
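As a minimal numerical illustration of the balance condition (B.6), consider flat local coordinates (g = 1) in one dimension: the microscopic dynamics (6.3) with quadratic fitness F(q) = -(h/2)q² and gradient noise of covariance κ reduces to an Ornstein-Uhlenbeck process, whose variance (the inverse metric) is static precisely when h = κ/2, i.e. F″ = -κ/2. The step size, horizon, and ensemble size below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)

# 1-D sketch in flat local coordinates (g = 1): Eq. (6.3) with
# F(q) = -(h/2) q^2 and gradient noise of covariance kappa becomes
# the Ornstein-Uhlenbeck process  dq = -h q dt + sqrt(kappa) dW.
kappa = 0.8
h = kappa / 2.0            # balance condition (B.6): F'' = -h = -kappa/2
dt, steps, n = 1e-3, 2000, 50_000

q = rng.normal(0.0, 1.0, size=n)   # start at unit variance (normal coordinates)
for _ in range(steps):
    q += -h * q * dt + np.sqrt(kappa * dt) * rng.normal(size=n)

# The genotype variance (inverse metric) should remain static at 1.
print(q.var())   # ~ 1.0; with h != kappa/2 it would drift toward kappa/(2h)
```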